
The NGram CC Classifier: A Novel Method of Automatically Creating CC Classifiers Based on ICD9 Groupings

Description

Syndromic surveillance of emergency department (ED) visit data is often based on computer algorithms that assign patient chief complaints (CC) to syndromes. ICD9 code data may also be used to develop visit classifiers for syndromic surveillance, but the ICD9 code is generally not available immediately, which limits its utility. However, ICD9 classifiers have two advantages: they can be created rapidly and precisely as a subset of existing ICD9 codes, and ICD9 codes are independent of the spoken language. If a classifier based on ICD9 codes could be used to automatically create the code for a chief-complaint assignment algorithm, then CC algorithms could be created and updated more rapidly and with less labor. They could also be created in multiple spoken languages. We developed a method for doing this based on an “ngram” text processing program adapted from business research technology (AT&T Labs). The method applies the ICD9 classifier to a training set of ED visits for which both the CC and the ICD9 code are known. It then automatically generates a collection of CC substrings with associated probabilities and, from these, a CC classifier program. The method includes specialized selection techniques and model pruning to automatically create a compact and efficient classifier.
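
The general idea of the training step can be illustrated with a minimal Python sketch. This is not the AT&T-derived program described above: the substring length, the probability estimate (the share of training visits containing a substring whose ICD9 code falls in the syndrome), the count-based pruning, the averaging rule, the function names, and the toy visits are all assumptions made for illustration.

```python
from collections import defaultdict


def char_ngrams(text, n=4):
    """Return the set of character n-grams in a normalized chief complaint."""
    s = " ".join(text.lower().split())
    if len(s) < n:
        return {s} if s else set()
    return {s[i:i + n] for i in range(len(s) - n + 1)}


def train_ngram_model(visits, syndrome_codes, n=4, min_count=2):
    """Estimate P(syndrome | substring) from (chief_complaint, icd9_code) pairs.

    The ICD9 classifier is assumed to be a plain set of codes. Substrings
    seen in fewer than min_count visits are dropped (a crude stand-in for
    the model pruning mentioned in the abstract).
    """
    positive = defaultdict(int)
    total = defaultdict(int)
    for cc, code in visits:
        in_syndrome = code in syndrome_codes
        for gram in char_ngrams(cc, n):
            total[gram] += 1
            if in_syndrome:
                positive[gram] += 1
    return {g: positive[g] / total[g] for g in total if total[g] >= min_count}


def classify(cc, model, n=4, threshold=0.5):
    """Assign a chief complaint to the syndrome if the mean probability of
    its known substrings reaches the threshold (an assumed decision rule)."""
    probs = [model[g] for g in char_ngrams(cc, n) if g in model]
    if not probs:
        return False
    return sum(probs) / len(probs) >= threshold


# Toy training set; the visits and code list are illustrative only.
visits = [
    ("vomiting", "787.03"),
    ("vomiting and diarrhea", "787.03"),
    ("diarrhea", "787.91"),
    ("chest pain", "786.50"),
    ("chest pain", "786.50"),
]
gi_codes = {"787.03", "787.91"}
model = train_ngram_model(visits, gi_codes)
```

Because the model is built only from substrings and code membership, the same procedure could in principle be run on chief complaints written in any language, which is the property the description emphasizes.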

 

Objective

Our objective was to determine how closely the performance of an ngram CC classifier for the gastrointestinal syndrome matched the performance of the ICD9 classifier.
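
One way such a comparison could be set up is to treat the ICD9 classifier's assignment as the reference label for each visit and tabulate how often the CC classifier agrees. The abstract does not state which measures were used; sensitivity and specificity are assumed here, and the function name and sample labels are hypothetical.

```python
def agreement_with_reference(cc_labels, icd9_labels):
    """Compare CC-classifier labels with ICD9-classifier labels per visit.

    Both arguments are equal-length sequences of booleans (True means the
    visit was assigned to the syndrome). The ICD9 label is the reference.
    """
    tp = fp = tn = fn = 0
    for cc, ref in zip(cc_labels, icd9_labels):
        if cc and ref:
            tp += 1
        elif cc and not ref:
            fp += 1
        elif not cc and ref:
            fn += 1
        else:
            tn += 1
    return {
        "tp": tp, "fp": fp, "tn": tn, "fn": fn,
        # None when the reference contains no visits of that class.
        "sensitivity": tp / (tp + fn) if (tp + fn) else None,
        "specificity": tn / (tn + fp) if (tn + fp) else None,
    }


# Made-up labels for five visits, for illustration only.
result = agreement_with_reference(
    [True, True, False, False, True],
    [True, False, False, False, True],
)
```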

Submitted by elamb on