Syndromic Prediction Power: Comparing Covariates and Baselines

Description

The eleven syndrome classifications for clinical data records monitored by BioSense include rare events such as death or lymphadenitis as well as common occurrences such as respiratory infections. BioSense currently uses two statistical methods for prediction and alerting across the eleven syndromes: a modified CUSUM, and small area regression and testing (SMART), described by Ken Kleinman. At the inception of BioSense, these prediction methods were implemented as one-model-fits-all, and they remain largely unmodified. An evaluation of the predictive value of these methods is required. The SMART method, as used in BioSense, relies on long-term data. Its covariate predictors are day of week, a holiday indicator, a day-after-holiday indicator, and sine/cosine seasonality variables. Lengthy, stable historical data are not always available from BioSense data sources, and this obstacle is expected to grow as data sources are added. We wish to test regression-based surveillance methods that use shorter baseline periods and different sets of predictors.
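
As an illustration of the covariate set named above (day of week, holiday indicator, day after holiday, sine/cosine seasonality), the following minimal sketch builds one design-matrix row per calendar day. It is not the BioSense or SMART implementation: the function name `covariate_row`, the choice of Monday as the reference level, the single annual harmonic, and the 365.25-day period are all assumptions made for the example.

```python
import math
from datetime import date, timedelta


def covariate_row(day, holidays):
    """Return one illustrative design-matrix row for a calendar day.

    Layout (an assumption for this sketch, 11 columns):
    [intercept, Tue, Wed, Thu, Fri, Sat, Sun,
     holiday, day_after_holiday, sin, cos]
    """
    # Six day-of-week indicators; Monday (weekday() == 0) is the reference level.
    dow = [1.0 if day.weekday() == k else 0.0 for k in range(1, 7)]

    # Holiday and day-after-holiday indicators from a caller-supplied set of dates.
    is_holiday = 1.0 if day in holidays else 0.0
    after_holiday = 1.0 if (day - timedelta(days=1)) in holidays else 0.0

    # One annual harmonic; the 365.25-day period is an assumed choice.
    angle = 2.0 * math.pi * day.timetuple().tm_yday / 365.25

    return [1.0] + dow + [is_holiday, after_holiday,
                          math.sin(angle), math.cos(angle)]


# Hypothetical usage: 5 July 2007 with 4 July listed as a holiday.
row = covariate_row(date(2007, 7, 5), {date(2007, 7, 4)})
```

Rows built this way could be stacked over a baseline period and passed to whatever regression routine is being evaluated; dropping or adding columns is one way to express "different sets of predictors."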

Objective

This paper compares the prediction accuracy of regression models with different covariates and baseline periods, using a subset of data from CDC’s BioSense initiative. Accurate predictions are needed to achieve sensitivity at practical false alarm rates in anomaly detection for biosurveillance.
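
To make the idea of comparing baseline periods concrete, here is a hedged sketch of a one-day-ahead evaluation loop. It deliberately swaps in a much simpler predictor (the mean of same-weekday counts within the baseline window) as a stand-in for a fitted regression model, and it scores predictions by mean absolute error. The function names, the error metric, and the synthetic series are assumptions for illustration, not the paper's method or data.

```python
def dow_mean_forecast(counts, t, baseline_days):
    """Predict counts[t] as the mean of same-weekday counts in the baseline window."""
    start = max(0, t - baseline_days)
    same_weekday = [counts[i] for i in range(start, t) if (t - i) % 7 == 0]
    if not same_weekday:
        return None  # baseline too short to contain a matching weekday
    return sum(same_weekday) / len(same_weekday)


def mean_abs_error(counts, baseline_days):
    """Mean absolute one-day-ahead error for a given baseline length."""
    errors = []
    for t in range(baseline_days, len(counts)):
        pred = dow_mean_forecast(counts, t, baseline_days)
        if pred is not None:
            errors.append(abs(counts[t] - pred))
    if not errors:
        raise ValueError("series is too short for this baseline length")
    return sum(errors) / len(errors)


# Synthetic daily counts with a steady upward trend (not BioSense data).
trend = list(range(56))
short_baseline_error = mean_abs_error(trend, 7)
long_baseline_error = mean_abs_error(trend, 28)
```

On this synthetic trending series the 7-day baseline gives the smaller error because it tracks recent levels more closely; on real data the trade-off also depends on noise, since longer baselines average over more observations.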

Submitted by elamb on