Implementation and Comparison of Preprocessing Methods for Biosurveillance Data

Description

Modern biosurveillance relies on multiple sources of both prediagnostic and diagnostic data, updated daily, to discover disease outbreaks. Intrinsic to this effort are two assumptions: (1) the data being analyzed contain early indicators of a disease outbreak and (2) the outbreaks to be detected are not known a priori. However, in addition to outbreak indicators, syndromic data streams include such factors as day-of-week effects, seasonal effects, autocorrelation, and global trends. These explainable factors obscure unexplained outbreak events, and their presence in the data violates standard control-chart assumptions. Monitoring tools such as Shewhart, cumulative sum, and exponentially weighted moving average control charts will alert based largely on these explainable factors instead of on outbreaks. The goal of this paper is twofold: first, to describe a set of tools for identifying explainable patterns such as temporal dependence and, second, to survey and examine several data preconditioning methods that significantly reduce these explainable factors, yielding data better suited for monitoring with these popular control charts.
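The effect described above can be illustrated with a minimal sketch. It is not the paper's method: it assumes a very simple preconditioning step (subtracting each weekday's mean), uses entirely synthetic counts, and estimates the Shewhart limit from the same series it monitors, which a real deployment would likely avoid. All function names and values are hypothetical.

```python
from statistics import mean, pstdev


def autocorr(series, lag):
    """Sample autocorrelation of `series` at the given lag."""
    m = mean(series)
    denom = sum((x - m) ** 2 for x in series)
    num = sum((series[t] - m) * (series[t + lag] - m)
              for t in range(len(series) - lag))
    return num / denom


def remove_day_of_week(series):
    """Subtract each weekday's mean (position mod 7) from the series."""
    dow_means = [mean(series[d::7]) for d in range(7)]
    return [x - dow_means[i % 7] for i, x in enumerate(series)]


def shewhart_alerts(series, k=3.0):
    """Indices whose value exceeds mean + k * standard deviation."""
    limit = mean(series) + k * pstdev(series)
    return [i for i, x in enumerate(series) if x > limit]


# Synthetic data (an assumption for illustration): eight weeks of daily
# counts with low weekend volume and small deterministic noise.
weekday_level = [100, 95, 90, 90, 85, 40, 35]
counts = [weekday_level[i % 7] + (i * 7919) % 11 - 5 for i in range(56)]
counts[40] += 30  # simulated outbreak signal on a low-volume weekend day

residuals = remove_day_of_week(counts)

raw_alerts = shewhart_alerts(counts)          # chart applied to raw counts
residual_alerts = shewhart_alerts(residuals)  # chart applied after preconditioning

print("lag-7 autocorrelation, raw:      ", round(autocorr(counts, 7), 2))
print("lag-7 autocorrelation, residuals:", round(autocorr(residuals, 7), 2))
print("alerts on raw counts:", raw_alerts)
print("alerts on residuals: ", residual_alerts)
```

In this toy series the weekly pattern inflates the raw variance, so the simulated weekend spike stays below the raw control limit, while the weekday-adjusted residuals expose it. The lag-7 autocorrelation is one simple diagnostic for the kind of temporal dependence the abstract mentions.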

Submitted by elamb on