
Identifying Clusters of Rare and Novel Words in Emergency Department Chief Complaints


A goal of biosurveillance is to identify incidents that require a public health response. The challenge is defining such incidents specifically enough that they can be detected. In syndromic surveillance, this is done by classifying emergency department chief complaints, nurse triage calls, and other prediagnostic data into categories, and then looking for increases in visits related to those categories. This approach can find only incidents that match the predefined categories. It is well suited to common diseases: data from prior years provide information not only on which symptoms correlate with the disease, but also on how patients report them and how they appear in prediagnostic data streams. For unique or rare events, it is hard to know in advance how they will be described or recorded. Another approach is to look for similarities in the timing of healthcare encounters alone. This method can detect events that syndrome-oriented surveillance misses, but encounters that have only their time of occurrence in common are not necessarily related. To address this limitation, we propose a set of similarity criteria that incorporates both the timing of and the reason for each encounter.


The objective is to develop a method for detecting groups of related healthcare encounters without having to specify details of the reasons for those encounters in advance.
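
As a rough illustration of how timing and reason might be combined, the Python sketch below groups recent visits that share a word that is rare or absent in a baseline set of chief complaints and that fall within a common time window. It is not the method proposed here, only one possible reading of the idea. The function names (`tokenize`, `rare_word_clusters`), the parameters (`max_baseline_count`, `window`, `min_visits`), and their default values are all hypothetical.

```python
from collections import Counter, defaultdict
from datetime import datetime, timedelta
import re


def tokenize(text):
    """Return the set of lowercase alphabetic words in a chief complaint."""
    return set(re.findall(r"[a-z]+", text.lower()))


def rare_word_clusters(baseline, recent, max_baseline_count=0,
                       window=timedelta(hours=24), min_visits=3):
    """Group recent visits that share a rare or novel word within a time window.

    baseline: iterable of historical chief-complaint strings.
    recent:   iterable of (datetime, chief-complaint string) pairs.
    All thresholds are illustrative assumptions, not values from the source.
    """
    # Number of baseline complaints in which each word appears.
    baseline_counts = Counter()
    for complaint in baseline:
        baseline_counts.update(tokenize(complaint))

    # Collect visit times for every word that is rare or unseen in the baseline.
    visits_by_word = defaultdict(list)
    for when, complaint in recent:
        for word in tokenize(complaint):
            if baseline_counts[word] <= max_baseline_count:
                visits_by_word[word].append(when)

    clusters = {}
    for word, times in visits_by_word.items():
        times.sort()
        # Sliding window: find the largest run of visits spanning at most `window`.
        start = 0
        best = []
        for end in range(len(times)):
            while times[end] - times[start] > window:
                start += 1
            if end - start + 1 > len(best):
                best = times[start:end + 1]
        if len(best) >= min_visits:
            clusters[word] = best
    return clusters
```

In this sketch a word counts as rare purely by its baseline frequency, and a cluster is simply any sufficiently large group of visits mentioning that word within the window. A real system would presumably need a statistical test rather than fixed thresholds, along with handling for misspellings and abbreviations in chief-complaint text.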

Submitted by knowledge_repo… on