


A number of different methods are currently used to classify patients into syndromic groups based on the patient’s chief complaint (CC). We previously reported results using an “Ngram” text processing program for building classifiers (adapted from business research technology at AT&T Labs). The method applies the ICD9 classifier to a training set of ED visits for which both the CC and ICD9 code are known. A computerized method is used to automatically generate a collection of CC substrings (or Ngrams), with associated probabilities, from the training data. We then generate a CC classifier from the collection of Ngrams and use it to find a classification probability for each patient. Previously, we presented data showing good correlation between daily volumes as measured by the Ngram and ICD9 classifiers.
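The training step described above can be illustrated with a minimal sketch. It is not the authors' program: the substring length `n`, the `min_count` filter, the mean-based scoring rule, and the toy visits are all assumptions made for illustration. The label attached to each chief complaint stands in for the syndrome assignment produced by the ICD9 classifier.

```python
from collections import defaultdict

def char_ngrams(text, n=4):
    """Return the set of length-n substrings of a normalized chief complaint."""
    s = " ".join(text.lower().split())
    return {s[i:i + n] for i in range(len(s) - n + 1)}

def train(visits, n=4, min_count=2):
    """Estimate P(syndrome | n-gram) from (chief_complaint, icd9_label) pairs."""
    seen = defaultdict(int)
    positive = defaultdict(int)
    for cc, label in visits:
        for gram in char_ngrams(cc, n):
            seen[gram] += 1
            positive[gram] += int(label)
    # Keep only n-grams seen at least min_count times (assumed filter).
    return {g: positive[g] / seen[g] for g in seen if seen[g] >= min_count}

def score(cc, model, n=4):
    """Assumed scoring rule: mean probability over n-grams known to the model."""
    probs = [model[g] for g in char_ngrams(cc, n) if g in model]
    return sum(probs) / len(probs) if probs else 0.0

# Hypothetical training visits: (chief complaint, in syndrome according to ICD9).
training = [
    ("vomiting and diarrhea", True),
    ("diarrhea x3 days", True),
    ("vomiting", True),
    ("chest pain", False),
    ("chest pain radiating to arm", False),
    ("ankle pain", False),
]
model = train(training)
```

With this toy data, `score("diarrhea", model)` yields a high classification probability and `score("chest pain", model)` a low one; a real classifier would need a far larger training set and a scoring rule chosen by validation.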



Our objective was to determine the optimized values for the sensitivity and specificity of the Ngram CC classifier for individual visits using a ROC curve analysis. Points on the ROC curve correspond to different classification probability cutoffs.
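As a rough sketch of how points on such a ROC curve arise, the code below computes sensitivity and specificity at several classification probability cutoffs, treating the ICD9 classification as the reference. The probabilities, labels, cutoffs, and the use of Youden's J to pick an "optimized" cutoff are assumptions for illustration; the abstract does not state which optimization criterion was used.

```python
def roc_points(probs, labels, cutoffs):
    """Return (cutoff, sensitivity, specificity) for each cutoff.

    A visit is called positive when its probability is >= the cutoff.
    Assumes the labels contain at least one positive and one negative.
    """
    points = []
    for c in cutoffs:
        tp = sum(1 for p, y in zip(probs, labels) if p >= c and y)
        fn = sum(1 for p, y in zip(probs, labels) if p < c and y)
        tn = sum(1 for p, y in zip(probs, labels) if p < c and not y)
        fp = sum(1 for p, y in zip(probs, labels) if p >= c and not y)
        points.append((c, tp / (tp + fn), tn / (tn + fp)))
    return points

def best_cutoff(points):
    """Assumed criterion: maximize Youden's J = sensitivity + specificity - 1."""
    return max(points, key=lambda t: t[1] + t[2] - 1)

# Hypothetical classifier outputs and ICD9-based reference labels.
probs = [0.9, 0.8, 0.6, 0.4, 0.3, 0.1]
labels = [True, True, False, True, False, False]

points = roc_points(probs, labels, cutoffs=[0.2, 0.5, 0.7])
best = best_cutoff(points)
```

Lowering the cutoff raises sensitivity at the expense of specificity, which is the trade-off the ROC curve traces out.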

Submitted by elamb on

Free-text emergency department triage chief complaints (CCs) are a popular data source used by many syndromic surveillance systems because of their timeliness, availability, and relevance. The lack of standardization of CC vocabulary poses a major technical challenge to any automatic CC classification approach. This challenge can be partially addressed by several methods, for example, medical thesauri, spell checking, manually created synonym lists, and supervised machine learning techniques that operate directly on free text. Current approaches, however, ignore the fact that medical terms appearing in CCs are often semantically related. Our research exploits such semantic relations through a medical ontology in the context of automatic CC classification for syndromic surveillance.



This paper presents a novel approach of using a medical ontology to classify free-text CCs into syndrome categories.

Submitted by elamb on

There exists no standard set of syndromes for syndromic surveillance, and available syndromic case definitions demonstrate substantial heterogeneity of findings constituting the definition. Many syndromic case definitions require the presence of a syndromic finding (e.g., cough or diarrhea) and a fever.



Automated syndromic surveillance systems often use chief complaints as input. Our objective was to determine whether chief complaints accurately represent whether a patient has any of the following febrile syndromes: febrile respiratory, febrile gastrointestinal, febrile rash, febrile neurological, or febrile hemorrhagic.

Submitted by elamb on

Text-based syndrome case definitions published by the Centers for Disease Control and Prevention (CDC)1 form the basis for the syndrome queries used by the North Carolina Disease Event Tracking and Epidemiologic Collection Tool (NC DETECT). Public health epidemiologists identified keywords within these case definitions for use as search terms, with the goal of capturing symptom complexes from free-text chief complaint and triage note data for early event detection and situational awareness. Initial attempts at developing SQL queries incorporating these search terms returned many unwanted records because the queries could not exclude terms embedded within unrelated free-text strings. For example, a query containing the search term “h/a”, a common abbreviation for headache, also returns false positives such as “cough/asthma”, “skin rash/allergic reaction”, or “psych/anxiety”. Simple abbreviations without punctuation, such as “ha”, were even more problematic. Global wildcards ('%') indicate that zero or more characters of any type may substitute for the wildcard.2 The term “ha” as a synonym for “headache” appears frequently in the data, but searching for this term bracketed by global wildcards returns any instance where the two letters appear together (e.g., pharyngitis, hand, hallucinations, toothache). Using global wildcards to search for common symptoms such as headache by simple abbreviations, with or without specialized punctuation, therefore returns many false positive records. We describe here the advanced application of SQL character set wildcards to address this problem.


This paper describes a novel approach to the construction of syndrome queries written in Structured Query Language (SQL). Through the advanced application of character set wildcards, we are able to increase the number of valid records identified by our queries while simultaneously decreasing the number of false positives.
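The difference between global and character set wildcards can be sketched with a small, self-contained example. The abstract does not give the actual NC DETECT queries or name the SQL dialect, so everything here is an assumption for illustration: the table name `visits`, the sample complaints, and the use of SQLite's `GLOB` operator (where `*` plays the role of `%` and `[^a-z]` is a negated character set) as a stand-in for bracketed character sets in a `LIKE` pattern, as supported by SQL Server.

```python
import sqlite3

# Hypothetical chief complaints, including the false positives named in the text.
complaints = [
    "c/o ha x2 days",
    "h/a and fever",
    "n/v, ha",
    "cough/asthma",
    "skin rash/allergic reaction",
    "pharyngitis",
    "hand pain",
    "toothache",
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (cc TEXT)")
conn.executemany("INSERT INTO visits VALUES (?)", [(c,) for c in complaints])

def search(pattern):
    """Match a GLOB pattern against the lowercased, space-padded complaint.

    Padding with spaces lets [^a-z] also match at the start or end of the text.
    """
    sql = "SELECT cc FROM visits WHERE ' ' || lower(cc) || ' ' GLOB ? ORDER BY rowid"
    return [row[0] for row in conn.execute(sql, (pattern,))]

# Global wildcards: any text containing the letters "ha".
global_hits = search("*ha*")

# Character set wildcards: "ha" or "h/a" not adjacent to another letter.
bounded_hits = search("*[^a-z]ha[^a-z]*") + search("*[^a-z]h/a[^a-z]*")
```

In this toy data the global pattern also returns "pharyngitis", "hand pain", and "toothache", while the character set patterns return only the three complaints in which "ha" or "h/a" stands alone.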

Submitted by elamb on

The inception of syndromic surveillance has spawned a great deal of research into emergency department chief complaint data. In addition to using these data as an early warning system for a bioterror or outbreak event, many health departments are attempting to maximize the utility of the information to augment chronic and communicable disease surveillance. Syndromic data can thus enhance traditional methods of surveillance. Using syndromic data to describe what is normal for a geographic area may be useful in monitoring a population for disease trends, and prevention efforts could be concentrated during a particular time of year. In addition, geospatial shifts in directional trends may indicate an unusual occurrence related to the utilization of emergency department services.


To describe the geographical mean as well as the directional trends of syndromes for the District of Columbia using temporal and geospatial analyses.

Submitted by elamb on

In North Carolina, select hospital emergency departments have been submitting data since 2003 for use in syndromic surveillance. These data are collected, stored, and parsed into syndrome categories by the North Carolina Emergency Department Database. The fever with rash illness syndrome is designed to capture smallpox cases. This syndrome was created as a combination of the separate fever and rash syndromes proposed by the consensus recommendations of the CDC’s Working Group on Syndrome Groups.



This paper describes the construction of a syndromic surveillance case definition and a test for its ability to capture the appropriate syndromic cases.

Submitted by elamb on

The purpose of syndromic surveillance is the early identification of disease outbreaks. Classification of chief complaints into syndromes and the type of statistics used for aberration detection can affect outbreak detection sensitivity and specificity. Few data are available on the relationship between chief complaints and demographics such as gender, age, or race. For example, myocardial infarction in women would be misclassified using definitions based solely on “male” symptoms such as chest pain because women more commonly report neck, jaw, and back pain.



We evaluated the sensitivity and specificity of a gastrointestinal syndrome group using the Boston Public Health Commission syndromic surveillance system.

Submitted by elamb on

The goal of this paper is to describe a methodology used to create a gold standard set of emergency department (ED) data that can subsequently be used to evaluate the sensitivity and specificity of syndrome definitions.

Submitted by elamb on