
Improvement in Performance of Ngram Classifiers with Frequent Updates

Description

 

Syndromic surveillance of emergency department (ED) visit data is often based on computerized classifiers which assign patient chief complaints (CC) to syndromes. These classifiers may need to be updated periodically to account for changes over time in the way the CC is recorded, or because new data sources are added. Little information is available as to whether more frequent updates would actually improve classifier performance significantly. Updating classifiers that are developed and maintained manually can be burdensome. We had available to us an automated method for creating classifiers that allowed us to address this question more easily. The “Ngram” method, described previously, creates a CC classifier automatically from a training set of patient visits for which both the CC and the ICD9 code are available. This method measures the associations of text fragments within the CC (e.g. 3 characters for a “3-gram”) with a syndromic group of ICD9 codes. It then automatically creates a new CC classifier based on these associations. The CC classifier thus created can then be deployed for daily syndromic surveillance.
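
The general idea of an n-gram classifier of this kind can be illustrated with a minimal Python sketch. It is not the authors' implementation. The association measure (the fraction of training visits containing a 3-gram whose ICD9 code falls in the syndromic group), the `min_count` and `threshold` parameters, the function names, and the toy visits are all assumptions made for illustration.

```python
from collections import Counter


def ngrams(text, n=3):
    """Return the set of character n-grams in a lower-cased, whitespace-normalized CC."""
    s = " ".join(text.lower().split())
    return {s[i:i + n] for i in range(len(s) - n + 1)}


def train(visits, n=3, min_count=2):
    """Build n-gram weights from (chief_complaint, in_syndrome) pairs.

    in_syndrome is True when the visit's ICD9 code belongs to the syndromic group.
    The weight of an n-gram is the fraction of visits containing it that are in
    the syndrome (an assumed association measure, not the published one).
    N-grams seen in fewer than min_count visits are dropped.
    """
    total, hits = Counter(), Counter()
    for cc, in_syndrome in visits:
        for g in ngrams(cc, n):
            total[g] += 1
            if in_syndrome:
                hits[g] += 1
    return {g: hits[g] / total[g] for g in total if total[g] >= min_count}


def classify(cc, weights, n=3, threshold=0.8):
    """Flag a CC when any of its known n-grams has a weight at or above threshold."""
    scores = [weights[g] for g in ngrams(cc, n) if g in weights]
    return bool(scores) and max(scores) >= threshold


# Toy training data, invented for illustration only (not from the study).
visits = [
    ("vomiting", True),
    ("vomiting and diarrhea", True),
    ("diarrhea", True),
    ("chest pain", False),
    ("ankle pain", False),
]
weights = train(visits)

print(classify("vomitting x2", weights))  # misspelled CC still shares 3-grams such as "vom"
print(classify("knee pain", weights))
```

Because the classifier is rebuilt entirely from labeled visits, updating it amounts to calling `train` again on a more recent training set, which is what makes frequent updates inexpensive to evaluate with an automated method.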

Objective

Our objective was to determine if performance of the Ngram classifier for the GI syndrome was improved significantly by updating the classifier more frequently.

Submitted by elamb on