
Statistical Methods

Description

Current syndromic surveillance systems run multiple simultaneous univariate procedures, each focused on detecting an outbreak in a single data stream. Multivariate procedures have the potential to better detect some types of outbreaks, but most of the existing methods are directionally invariant and are thus less relevant to the problem of syndromic surveillance. This article develops two directionally sensitive multivariate procedures and compares the performance of these procedures both with the original directionally invariant procedures and with the application of multiple univariate procedures using both simulated and real syndromic surveillance data. The performance comparison is conducted using metrics and terminology from the statistical process control (SPC) literature with the intention of helping to bridge the SPC and syndromic surveillance literatures. This article also introduces a new metric, the average overlapping run length, developed to compare the performance of various procedures on limited actual syndromic surveillance data. Among the procedures compared, in the simulations the directionally sensitive multivariate cumulative sum (MCUSUM) procedure was preferred, whereas in the real data the multiple univariate CUSUMs and the MCUSUM performed similarly. This article concludes with a brief discussion of the choice of performance metrics used herein versus the metrics more commonly used in the syndromic surveillance literature (sensitivity, specificity, and timeliness), as well as some recommendations for future research.
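
As a concrete illustration of the kind of procedure compared here, below is a minimal sketch of a directionally sensitive MCUSUM in the style of Crosier's recursion, with negative components of the cumulative vector reset to zero so that only upward shifts accumulate. The parameter values (k, h) and the simulated shift are illustrative assumptions, not those of the article.

```python
import numpy as np

def directional_mcusum(X, mu, sigma, k=0.5, h=5.0):
    """Crosier-style MCUSUM made directionally sensitive by zeroing
    negative components of the cumulative vector S, so the statistic
    accumulates only increases (the case of interest in syndromic
    surveillance). Returns alarm times and the statistic path."""
    sigma_inv = np.linalg.inv(sigma)
    S = np.zeros(len(mu))
    alarms, stats = [], []
    for t, x in enumerate(X):
        d = S + x - mu
        c = np.sqrt(d @ sigma_inv @ d)
        S = np.zeros(len(mu)) if c <= k else np.maximum(d * (1 - k / c), 0.0)
        y = np.sqrt(S @ sigma_inv @ S)
        stats.append(y)
        if y > h:
            alarms.append(t)
    return alarms, np.array(stats)

# Demo: two correlated streams with an upward mean shift at t = 100.
rng = np.random.default_rng(0)
mu, sigma = np.zeros(2), np.array([[1.0, 0.3], [0.3, 1.0]])
X = rng.multivariate_normal(mu, sigma, size=200)
X[100:] += 0.75
alarms, _ = directional_mcusum(X, mu, sigma)
print("first alarm at t =", alarms[0] if alarms else None)
```

Running the demo repeatedly with no shift gives an empirical in-control average run length, the SPC metric the article uses for calibration.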

Description

After the SARS outbreak in 2003, Beijing established Fever Clinics in major hospitals for the early detection of potential respiratory disease outbreaks. The data collected in Fever Clinics include basic patient information, body temperature, cough, and breathing condition, as well as a primary diagnosis. Because the symptoms and diagnosis are recorded mainly as free text, the data are very difficult to use for analysis, and as a result of these data processing problems, data collection has declined.

 

Objective

This paper describes the methodology used in the development of an Integrated Surveillance System for Beijing, China.

Description

Benchmarking of temporal surveillance techniques is a critical step in the development of an effective syndromic surveillance system. Unfortunately, holding “bakeoffs” to blindly compare approaches is a difficult and often fruitless enterprise, in part because of the parameters left to the end user for tuning. In this paper, we demonstrate how common analytical development and analysis may be coupled with realistic data sets to provide insight and robustness when selecting a surveillance technique.

 

Objective

This paper compares the robustness and performance of three temporal surveillance techniques using a twofold approach: 1) a unifying statistical analysis to establish their common features and differences, and 2) benchmarking on respiratory, influenza-like illness, upper GI, and lower GI complaint time series from Harvard Pilgrim Health Care (HPHC).

Description

Syndromic surveillance needs to be (1) transparent, (2) actionable, and (3) flexible. Traditional frequentist approaches to syndromic surveillance, such as CUSUM charts and scan statistics, tend to fail on all three criteria. First, the validity of the assumptions is generally difficult to check and the methods are hard to modify; second, the false positive rate makes it impossible to be both sensitive to true signals and resistant to spurious ones; and third, implementation usually requires significant hand-tinkering to adjust background rates for known seasonal effects and other identifiable influences.

 

Objective

This paper describes a Bayesian approach to syndromic surveillance. The method provides more interpretable inference than traditional frequentist approaches. Bayesian methods avoid many of the problems associated with alpha levels and multiple comparisons, and make better use of prior information. The technique is illustrated on simulated data.
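
The abstract does not spell out the model, so the sketch below is only a minimal example of the Bayesian style it advocates: a conjugate Gamma prior on a Poisson daily rate, updated on baseline counts, with an alarm raised when the posterior predictive probability of a count at least as large as today's is small. The prior parameters and alarm threshold are illustrative assumptions.

```python
import numpy as np
from scipy import stats

def bayes_alert(baseline, today, a0=1.0, b0=1.0, threshold=0.01):
    """Conjugate Poisson-Gamma alerting sketch. A Gamma(a0, b0) prior on
    the daily rate is updated with the baseline counts; the posterior
    predictive for a new day's count is then negative binomial."""
    a = a0 + np.sum(baseline)       # posterior shape
    b = b0 + len(baseline)          # posterior rate
    tail = stats.nbinom.sf(today - 1, a, b / (b + 1.0))  # P(Y >= today)
    return tail < threshold, tail

alarm, p = bayes_alert(baseline=[12, 9, 14, 11, 10, 13, 12], today=25)
print(alarm, round(p, 4))           # posterior predictive tail probability
```

Unlike a frequentist p-value, the tail probability here is a direct posterior predictive statement, which is what makes the inference more interpretable for an analyst.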

Description

The statistical process control (SPC) community has developed a wealth of robust, sensitive monitoring methods in the form of control charts [1]. Although such charts have been implemented for a wide variety of health monitoring purposes [2], some implementations monitor data that violate basic assumptions required by the control charts [3], yielding alerting methods with uncertain detection performance. This problem highlights an inherent obstacle to the use of traditional SPC methods for syndromic surveillance: the nature of the data. Syndromic data streams are based not on physical science, as manufacturing processes are, but on changing population behavior and evolving data acquisition and classification procedures. To overcome this obstacle, either more sophisticated detection algorithms must be developed or the data must be preconditioned so that they are appropriate for traditional monitoring tools.

 

Objective

For robust detection performance, alerting algorithms for biosurveillance require input data free of trends, day-of-week effects, and other systematic behavior. Time series forecasting methods may be used to remove this behavior by subtracting forecasts from observations to form residuals for algorithmic input. This abstract examines and compares methods for the automatic preconditioning of health indicator data to enable the timely prospective monitoring required for effective syndromic surveillance.
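
As one hedged illustration of such preconditioning, the sketch below removes a linear trend and day-of-week effects via an ordinary least-squares forecast and returns the residuals; the covariate set is an assumption for illustration, and a prospective system would refit on a sliding baseline rather than the full series.

```python
import numpy as np

def precondition(counts, first_dow=0):
    """Fit a linear trend plus day-of-week indicator terms to a daily
    count series and return observed-minus-forecast residuals suitable
    as input to an alerting algorithm (e.g., a CUSUM or EWMA chart).
    `first_dow` is the day-of-week index of the first observation."""
    counts = np.asarray(counts, dtype=float)
    t = np.arange(len(counts))
    dow = (t + first_dow) % 7
    X = np.column_stack(
        [np.ones_like(t, dtype=float), t]
        + [(dow == d).astype(float) for d in range(1, 7)]  # day 0 is baseline
    )
    beta, *_ = np.linalg.lstsq(X, counts, rcond=None)
    return counts - X @ beta        # residuals: trend and dow removed
```

If the model is adequate, the residuals are approximately exchangeable, which is exactly the property the downstream control chart assumes.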

Description

The Los Angeles County (LAC) Bioterrorism Preparedness and Response Unit has made significant progress in automating its syndromic surveillance system. The system receives electronic data daily from different hospital information systems, then standardizes the data and generates analytical results.

 

Objective

This article describes the architecture, analytical methods, and software applications used to automate the LAC syndromic surveillance system.


Objective

Several authors have described ways to introduce artificial outbreaks into time series for the purpose of developing, testing, and evaluating the effectiveness and timeliness of anomaly detection algorithms and, more generally, early event detection systems. While statistical anomaly detection methods take into account baseline characteristics of the time series, these simulated outbreaks are introduced on an ad hoc basis and do not reflect those baseline characteristics. Our objective was to develop statistically based procedures for introducing artificial anomalies into time series, which would thus have wide applicability for evaluating anomaly detection algorithms against widely differing data streams.
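
A minimal sketch of the idea, under the assumption (ours, not the authors') that "taking baseline characteristics into account" means scaling the injected signal to the baseline variability of the series: a lognormal-shaped epidemic curve whose peak is a fixed multiple of the baseline standard deviation, with daily case counts drawn stochastically.

```python
import numpy as np

def inject_outbreak(series, start, duration=7, peak_sigmas=3.0, rng=None):
    """Add a stochastic, lognormal-shaped outbreak to a count series.
    The peak of the epidemic curve is scaled to `peak_sigmas` baseline
    standard deviations, so the injected anomaly is calibrated to the
    series it enters rather than being a fixed ad hoc bump."""
    rng = rng or np.random.default_rng()
    out = np.asarray(series, dtype=float).copy()
    base_sd = np.std(out[:start])                    # baseline variability
    days = np.arange(1, duration + 1)
    curve = np.exp(-0.5 * ((np.log(days) - np.log(duration / 3)) / 0.5) ** 2)
    lam = peak_sigmas * base_sd * curve / curve.max()  # unit-peak, rescaled
    out[start:start + duration] += rng.poisson(lam)    # stochastic cases
    return out
```

Because the injected magnitude adapts to each series, the same procedure can be applied across data streams with very different scales and noise levels.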

Description

The traditional SaTScan algorithm [1],[2] uses the Euclidean distance between centroids of the regions in a map to assemble connected sets of regions (connected in the sense that two connected regions share a physical border). According to the value of the corresponding logarithm of the likelihood ratio (LLR), a connected set of regions can be classified as a statistically significant detected cluster. For the study of events such as contagious diseases or homicides, we consider using the flow of people between two regions to build up a set of regions (zone) with a high incidence of cases of the event. In this sense, two regions are treated as closer the greater the flow of people between them. In a cluster of regions formed according to this flow-based proximity criterion, the regions are not necessarily connected to each other.

 

Objective

We present a new approach to the circular scan method [1] that uses the flow of people to detect and infer clusters of regions with a high incidence of some event randomly distributed in a map. We use a real database of homicide cases in Minas Gerais state, in southeast Brazil, to compare our proposed method with the original circular scan method on simulated clusters and in the real situation.
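
The sketch below is a hypothetical rendering of the idea: zones are grown greedily by adding, at each step, the outside region with the greatest aggregate flow of people into the current zone, and each candidate zone is scored with the usual Kulldorff Poisson log-likelihood ratio. The greedy growth rule and the `max_size` bound are our assumptions; significance would be assessed by Monte Carlo replication, as in the standard scan.

```python
import numpy as np

def poisson_llr(c, e, C):
    """Kulldorff-style Poisson LLR for a zone with observed count c and
    expected count e, out of C total cases on the map."""
    if c <= e or c >= C:
        return 0.0
    return c * np.log(c / e) + (C - c) * np.log((C - c) / (C - e))

def flow_scan(cases, expected, flow, max_size=10):
    """Grow zones by flow proximity instead of Euclidean distance:
    `flow[i, j]` is the flow of people between regions i and j, and the
    next region added is the one with the largest total flow into the
    current zone. Regions in a zone need not share a border."""
    C = cases.sum()
    best_llr, best_zone = 0.0, None
    for seed in range(len(cases)):
        zone = [seed]
        while len(zone) < min(max_size, len(cases)):
            outside = [j for j in range(len(cases)) if j not in zone]
            nxt = max(outside, key=lambda r: flow[r, zone].sum())
            zone.append(nxt)
            llr = poisson_llr(cases[zone].sum(), expected[zone].sum(), C)
            if llr > best_llr:
                best_llr, best_zone = llr, list(zone)
    return best_llr, best_zone   # significance via Monte Carlo replication
```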

Description

There has been much research on statistical methods of prospective outbreak detection aimed at identifying unusual clusters of one syndrome or disease, and some work on multivariate surveillance methods. In England and Wales, automated laboratory surveillance of infectious diseases has been undertaken since the early 1990s. The statistical methodology of this automated system has been described elsewhere. However, there has been little research on outbreak detection methods suited to large, multiple surveillance systems involving thousands of different organisms.

 

Objective

To look at the diversity of the patterns displayed by a range of organisms, and to seek a simple family of models that adequately describes all organisms, rather than a well-fitting model for any particular organism.
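
By way of illustration only (the abstract does not commit to a particular family), one candidate "simple family" fitted identically to every organism might be a log-linear Poisson regression with a secular trend and annual harmonics; overdispersed organisms could swap in a negative binomial family without changing the covariate structure.

```python
import numpy as np
import statsmodels.api as sm

def fit_organism(weekly_counts, period=52.18):
    """Fit one member of a common model family to a single organism's
    weekly count series: log-linear Poisson regression with a secular
    trend plus annual harmonics. The same call is applied, unchanged,
    to every organism in the surveillance system."""
    t = np.arange(len(weekly_counts))
    X = sm.add_constant(np.column_stack([
        t,                                   # secular trend
        np.sin(2 * np.pi * t / period),      # annual seasonality
        np.cos(2 * np.pi * t / period),
    ]))
    return sm.GLM(weekly_counts, X, family=sm.families.Poisson()).fit()
```

Fitting the same specification to thousands of series, and then inspecting where it fits poorly, is one practical way to probe the diversity of patterns the objective describes.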

Description

Analyses produced by epidemiologists and public health practitioners are susceptible to bias from a number of sources, including missing data, confounding variables, and statistical model selection. Understanding and applying the multitude of tests, corrections, and selection rules often requires a great deal of expertise, and these tasks can be time-consuming and burdensome. To address this challenge, Aptima began development of CARRECT, the Collaborative Automation Reliably Remediating Erroneous Conclusion Threats system. When complete, CARRECT will provide an expert system that can be embedded in an analyst’s workflow. CARRECT will support statistical bias reduction and improved analyses and decision making by engaging the user in a collaborative process in which the technology is transparent to the analyst.

Objective

The objective of the CARRECT software is to make cutting-edge statistical methods for reducing bias in epidemiological studies easy to use and useful for both novice and expert users.
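
CARRECT's internals are not described here, so the snippet below only illustrates one of the bias sources it targets: confounding. On simulated data with no true exposure effect, the crude estimate is biased while the confounder-adjusted logistic regression recovers the null; an expert system of this kind would aim to automate such checks and corrections.

```python
import numpy as np
import statsmodels.api as sm

# Simulated data: the confounder drives both exposure and outcome,
# while the true effect of exposure on outcome is zero.
rng = np.random.default_rng(1)
n = 5000
confounder = rng.binomial(1, 0.5, n)                # e.g., an age group
exposure = rng.binomial(1, 0.2 + 0.4 * confounder)  # exposure depends on it
p = 1 / (1 + np.exp(-(-2.0 + 1.5 * confounder)))    # ...and so does outcome
outcome = rng.binomial(1, p)

crude = sm.Logit(outcome, sm.add_constant(exposure.astype(float))).fit(disp=0)
adj = sm.Logit(
    outcome,
    sm.add_constant(np.column_stack([exposure, confounder]).astype(float)),
).fit(disp=0)
print("crude log-OR:", round(crude.params[1], 3))    # biased away from 0
print("adjusted log-OR:", round(adj.params[1], 3))   # near the true 0
```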

 
