
Kilicaslan Isa

Description

Previously we used an “N-gram” classifier for syndromic surveillance of emergency department (ED) chief complaints (CC) in English for bioterrorism. The classifier is trained on a set of ED visits for which both the ICD diagnosis code and the CC are available; training measures the association of text fragments within the CC (e.g., 3 characters for a “3-gram”) with a syndromic group of ICD codes. Because the ICD system is language independent, the technique has the potential advantage of rapid automated deployment in multiple languages. Our objective was to apply the N-gram method to a training set of Turkish ED data to create a Turkish CC classifier for the respiratory syndrome (RESP) and to determine its performance in a test set.
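The training step described above can be illustrated with a minimal sketch. It is not the authors' program: the function names, the normalization, and the association measure (the fraction of visits containing an n-gram whose ICD code falls in the syndrome group) are assumptions chosen for illustration, and the four visits are invented toy data.

```python
from collections import Counter


def char_ngrams(text, n=3):
    """Return the set of character n-grams in a chief complaint.

    Assumption: the text is lowercased and runs of whitespace are collapsed.
    """
    s = " ".join(text.lower().split())
    return {s[i:i + n] for i in range(len(s) - n + 1)}


def train_ngram_associations(visits, n=3):
    """Estimate, for each n-gram, how often visits containing it are in the syndrome.

    visits: iterable of (chief_complaint, in_syndrome) pairs, where
    in_syndrome is True when the visit's ICD code is in the syndromic group.
    Returns {ngram: positive_visits / all_visits_containing_ngram}.
    """
    total = Counter()
    positive = Counter()
    for cc, in_syndrome in visits:
        for gram in char_ngrams(cc, n):
            total[gram] += 1
            if in_syndrome:
                positive[gram] += 1
    return {gram: positive[gram] / total[gram] for gram in total}


# Invented toy training set (hypothetical, not study data).
training = [
    ("cough and fever", True),
    ("cough", True),
    ("ankle pain", False),
    ("chest pain", False),
]
assoc = train_ngram_associations(training)
print(assoc["cou"], assoc["pai"])
```

Because the labels come only from ICD codes, the same code would run unchanged on chief complaints in another language, which is the portability argument made in the description.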

 

Objective

To determine how closely the performance of an N-gram CC classifier for the RESP syndrome matched that of the ICD-9 classifier.

Submitted by elamb on
Description

Previously we developed an “N-gram” classifier for syndromic surveillance of emergency department (ED) chief complaints (CC) in Turkish for bioterrorism. The classifier is developed from a set of ED visits for which both the ICD diagnosis code and the CC are available. A computer program calculates the associations of text fragments within the CC (e.g., 3 characters for a “3-gram”) with a syndromic group of ICD codes. The program then generates an algorithm that can be deployed to evaluate chief complaint data in real time. However, the N-gram method differs from most other classifiers in that it assigns a probability that each visit falls within the syndrome rather than ruling the visit “in” or “out” of the syndrome. It is possible to dichotomize visits as “in” or “out” using N-grams by choosing a cut-off sensitivity for the N-grams used, but this affects the specificity of the method. The effect of this trade-off is best measured by a receiver operating characteristic (ROC) curve.
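The sensitivity and specificity trade-off mentioned above can be sketched as follows. This is a simplification under stated assumptions: it applies the cut-off to a per-visit probability score rather than to the N-grams themselves, the function name `roc_points` is hypothetical, and the scores and labels are invented toy values. Each returned triple is one point on a ROC curve.

```python
def roc_points(scores, labels):
    """Return (cutoff, sensitivity, specificity) for each distinct cutoff.

    A visit is called "in" the syndrome when its score >= cutoff.
    labels are True when the ICD code places the visit in the syndrome.
    Assumes at least one positive and one negative visit.
    """
    n_pos = sum(1 for y in labels if y)
    n_neg = len(labels) - n_pos
    points = []
    for cutoff in sorted(set(scores), reverse=True):
        true_pos = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y)
        true_neg = sum(1 for s, y in zip(scores, labels) if s < cutoff and not y)
        points.append((cutoff, true_pos / n_pos, true_neg / n_neg))
    return points


# Invented toy scores (hypothetical, not study data).
scores = [0.9, 0.7, 0.4, 0.2]
labels = [True, False, True, False]
points = roc_points(scores, labels)
for cutoff, sensitivity, specificity in points:
    print(cutoff, sensitivity, specificity)
```

Lowering the cut-off can only keep sensitivity the same or raise it, and can only keep specificity the same or lower it, which is why a single dichotomized result does not summarize the classifier as well as the full curve.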

 

Objective

Our objective was to determine the sensitivity and specificity of the N-gram CC classifier for individual ED visits. We also wished to compare these results with those obtained when anglicized characters were substituted for six problematic Turkish characters.
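The character substitution could be done as a preprocessing step before the N-grams are extracted. The abstract does not name the six characters, so the mapping below is an assumption: it uses the six Turkish letters that have no counterpart in the English alphabet (ç, ğ, ı, ö, ş, ü) plus their capitals. The names `TURKISH_TO_ASCII` and `anglicize` are hypothetical.

```python
# Assumed mapping: the abstract does not list the six characters.
TURKISH_TO_ASCII = str.maketrans("çğıöşüÇĞİÖŞÜ", "cgiosuCGIOSU")


def anglicize(text):
    """Replace Turkish-specific letters with their closest ASCII letters."""
    return text.translate(TURKISH_TO_ASCII)


print(anglicize("öksürük"))       # Turkish for "cough"
print(anglicize("göğüs ağrısı"))  # Turkish for "chest pain"
```

The substitution is applied before any lowercasing because Python's `str.lower()` turns the capital dotted İ into an "i" followed by a combining dot, which would otherwise create spurious N-grams.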

Submitted by elamb on