
Forbach Cory

Description

Text-based syndrome case definitions published by the Centers for Disease Control and Prevention (CDC) [1] form the basis for the syndrome queries used by the North Carolina Disease Event Tracking and Epidemiologic Collection Tool (NC DETECT). Public health epidemiologists identified keywords within these case definitions for use as search terms, with the goal of capturing symptom complexes from free-text chief complaint and triage note data for early event detection and situational awareness. Initial attempts at developing SQL queries incorporating these search terms returned many unwanted records because the queries could not exclude terms embedded within unrelated free-text strings. For example, a query containing the search term “h/a”, a common abbreviation for headache, also returns false positives such as “cough/asthma”, “skin rash/allergic reaction” or “psych/anxiety”. Simple abbreviations without punctuation, such as “ha”, were even more problematic. A global wildcard ('%') matches zero or more characters of any type [2]. The term “ha” as a synonym for “headache” appears frequently in the data, but searching for this term bracketed by global wildcards returns every record in which the two letters appear together (e.g. pharyngitis, hand, hallucinations, toothache). Using global wildcards to search for common symptoms such as headache by simple abbreviation, with or without specialized punctuation, therefore returns many false positive records. We describe here the advanced application of SQL character set wildcards to address this problem.
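The contrast between global wildcards and character set wildcards can be sketched as follows. This is a minimal illustration under stated assumptions, not the NC DETECT queries themselves: the patterns, the sample complaints, the helper names `like_to_regex` and `like_search`, and the space padding are all hypothetical. SQL Server-style LIKE syntax (`%`, `_`, `[...]`, `[^...]`) is emulated with Python regular expressions so that the example runs without a database.

```python
import re


def like_to_regex(pattern):
    """Translate a SQL Server-style LIKE pattern into a compiled regex.

    Handles % (any run of characters), _ (any single character) and
    [...] / [^...] character sets. ESCAPE clauses are not supported,
    and an unclosed '[' raises ValueError.
    """
    parts = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "%":
            parts.append(".*")
        elif ch == "_":
            parts.append(".")
        elif ch == "[":
            end = pattern.index("]", i + 1)
            body = pattern[i + 1:end]
            negate = body.startswith("^")
            if negate:
                body = body[1:]
            # Escape the set contents but keep '-' as a range operator.
            body = re.escape(body).replace("\\-", "-")
            parts.append("[" + ("^" if negate else "") + body + "]")
            i = end
        else:
            parts.append(re.escape(ch))
        i += 1
    return re.compile("".join(parts), re.IGNORECASE | re.DOTALL)


def like_search(pattern, complaints):
    """Return the complaints that match the LIKE pattern as a whole string.

    Assumption: each complaint is padded with one space on both sides so
    that a term at the very start or end still has a neighbouring
    character for the [^a-z] sets to match.
    """
    rx = like_to_regex(pattern)
    return [c for c in complaints if rx.fullmatch(" " + c + " ")]


# Hypothetical chief complaints built from the examples in the text.
complaints = [
    "h/a",
    "ha x 3 days",
    "fever, ha",
    "cough/asthma",
    "skin rash/allergic reaction",
    "psych/anxiety",
    "pharyngitis",
    "hand injury",
    "hallucinations",
    "toothache",
]

# Global wildcards: the term may sit inside any longer string.
global_slash = like_search("%h/a%", complaints)
global_plain = like_search("%ha%", complaints)

# Character set wildcards: require a non-letter on both sides of the term.
bounded_slash = like_search("%[^a-z]h/a[^a-z]%", complaints)
bounded_plain = like_search("%[^a-z]ha[^a-z]%", complaints)

for name, rows in [
    ("%h/a%", global_slash),
    ("%[^a-z]h/a[^a-z]%", bounded_slash),
    ("%ha%", global_plain),
    ("%[^a-z]ha[^a-z]%", bounded_plain),
]:
    print(name, "->", rows)
```

In this sketch the negated set `[^a-z]` acts as a word boundary, so “h/a” and “ha” match only when they stand alone rather than inside a longer word. In a real query the same effect would need the searched column to be padded or the boundary cases handled separately; that detail is an assumption here, not something stated in the abstract.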

Objective

This paper describes a novel approach to the construction of syndrome queries written in Structured Query Language (SQL). Through the advanced application of character set wildcards, we are able to increase the number of valid records identified by our queries while simultaneously decreasing the number of false positives.

Submitted by elamb on