Simulation

Description

Developing and evaluating outbreak detection algorithms is challenging for many reasons. A central difficulty is that the data on which detection algorithms are “trained” are often relatively short historical samples and thus do not represent the full range of possible background scenarios. Once an algorithm has been developed, the same dearth of historical data complicates its evaluation. In systems where only a count of cases is provided, plausible synthetic data are relatively easy to generate. When precise location data are available, generating hypothetical cases with simple approaches is more difficult.

Advances in epidemiological modeling have allowed for increasingly realistic simulations of infectious disease spread in highly detailed synthetic populations. These agent-based simulations can better represent real-world stochastic disease transmission processes and thus show highly variable results even under identical initial conditions. Because they can mimic a wide range of outcomes and more fully represent the unknowns in a system, models of this class are increasingly used to help inform public policy decisions about hypothetical situations (e.g., pandemic influenza [1]). This characteristic also makes them a powerful tool for representing the processes that create surveillance information.
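The variability described above can be illustrated with a toy example. The following is a minimal sketch, not the model used in the paper: it assumes a hypothetical static contact network (`contacts`), a single transmission probability (`p_transmit`), and agents that are infectious for exactly one day.

```python
import random

def run_outbreak(contacts, seed_agent, p_transmit, days, rng):
    """Toy agent-based outbreak; returns the number of new infections per day.

    contacts: dict mapping each agent to a list of its contacts (assumed input).
    Simplification: an agent is infectious only on the day after infection.
    """
    infected = {seed_agent}
    infectious = {seed_agent}
    daily_new = []
    for _ in range(days):
        newly = set()
        for agent in infectious:
            for other in contacts[agent]:
                # Each contact is an independent chance of transmission.
                if other not in infected and rng.random() < p_transmit:
                    newly.add(other)
        infected |= newly
        daily_new.append(len(newly))
        infectious = newly
    return daily_new

# Hypothetical population: 30 agents arranged in a ring, two contacts each.
n = 30
ring = {a: [(a - 1) % n, (a + 1) % n] for a in range(n)}

# Identical initial conditions, different random streams.
for run in range(3):
    print(run_outbreak(ring, seed_agent=0, p_transmit=0.6, days=8,
                       rng=random.Random(run)))
```

Each run starts from the same seed case yet may produce a different epidemic curve, which is the property that makes such models candidates for generating varied background data.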

Objective

Developing and evaluating detection algorithms for noisy surveillance data is complicated by a lack of realistic noise, that is, the surveillance data stream observed when nothing of public health interest is happening. These tasks are even more complex when data on the precise location of cases are available. This paper describes a methodology for generating plausible noise of this kind using agent-based models of infectious disease transmission based on highly resolved dynamic social networks.

Submitted by elamb on
Description

Immediately following September 11, 2001, the District of Columbia Department of Health began a syndromic surveillance program based on emergency room (ER) visits. ER logs are faxed on a daily basis to the health department, where health department staff code them on the basis of chief complaint, recording the number of patients in each of the following syndromic categories: death, sepsis, rash, respiratory complaints, gastrointestinal complaints, unspecified infection, neurological, or other complaints. These data are analyzed daily and when a syndromic category shows an unusually high occurrence, a patient chart review is initiated to determine if the irregularity is a real threat. 

A time series analysis of the data from this system has shown that, with the application of a variety of detection algorithms, the syndromic surveillance data perform well in identifying the onset of the flu season. In addition, simulation studies using the same data have shown that, over a range of simulated outbreak types, the univariate and multivariate CUSUM algorithms performed more effectively than other algorithms. The multivariate CUSUM was preferred to the univariate CUSUM for some but not all outbreak types.
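For readers unfamiliar with CUSUM, the following is a minimal sketch of a one-sided univariate CUSUM on standardized daily counts. The reference value `k`, the threshold `h`, the reset-after-alarm rule, and the assumption of a known baseline mean and standard deviation are illustrative choices, not the settings used in the study.

```python
def cusum_alarms(counts, mean, sd, k=0.5, h=4.0):
    """One-sided upper CUSUM; returns the indices of days that raise an alarm."""
    s = 0.0
    alarms = []
    for day, count in enumerate(counts):
        z = (count - mean) / sd          # standardize against the baseline
        s = max(0.0, s + z - k)          # accumulate only excesses above k
        if s > h:
            alarms.append(day)
            s = 0.0                      # reset after signalling (one common choice)
    return alarms

# Hypothetical daily counts for one syndromic category.
counts = [10, 11, 9, 10, 12, 16, 18, 20, 11, 10]
print(cusum_alarms(counts, mean=10.0, sd=2.0))  # [6, 7]
```

Because the statistic accumulates small excesses over several days, it can signal on a sustained moderate rise that a single-day threshold would miss.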

 

Objective

This paper evaluates an ER syndromic surveillance system based on simulation studies and comparisons with other surveillance systems.

Submitted by elamb on
Description

Research evaluating the use of spatial data for surveillance purposes is ongoing and evolving. As spatial methods evolve, it is important to characterize their effectiveness in real-world settings. Assessing the performance of surveillance systems has been difficult because there has been a paucity of data from real bioterrorism events. Recent efforts to assess surveillance system performance have focused on injecting synthetic outbreak data (signal) into actual background visit data. These studies focused on either temporal data, a single syndrome category, or a single bioterrorism agent. We are unaware of prior studies evaluating the performance of spatial outbreak detection for multiple syndrome categories in an operational surveillance system.
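The injection approach mentioned above can be sketched in a few lines. This is an illustration under simple assumptions (daily counts for a single stream, an additive outbreak signal); the function name and the example numbers are hypothetical.

```python
def inject_outbreak(background, signal, start_day):
    """Return a copy of the background counts with outbreak cases added.

    background: list of real daily visit counts (the "noise").
    signal: list of synthetic outbreak cases per outbreak day.
    start_day: index in background where the outbreak begins.
    """
    injected = list(background)
    for offset, extra in enumerate(signal):
        day = start_day + offset
        if day < len(injected):          # ignore signal past the end of the series
            injected[day] += extra
    return injected

background = [12, 9, 11, 10, 13, 8, 10]
signal = [1, 3, 5, 2]
print(inject_outbreak(background, signal, start_day=2))
# [12, 9, 12, 13, 18, 10, 10]
```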

 

Objective

To characterize the performance of a spatial scan statistic, we used SaTScan to measure the sensitivity and positive predictive value for detecting simulated outbreaks having varying size, case density, and syndrome type.
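As a reminder of how the two measures are defined, the sketch below computes sensitivity and positive predictive value from sets of identifiers. It assumes that each detector signal has already been matched to an injected outbreak (or to none), which is itself a design decision in any such evaluation; the identifiers are hypothetical.

```python
def sensitivity_and_ppv(injected_ids, signalled_ids):
    """Sensitivity = detected outbreaks / injected outbreaks.
    PPV = signals matching an outbreak / all signals."""
    injected = set(injected_ids)
    signalled = set(signalled_ids)
    true_positives = len(injected & signalled)
    sensitivity = true_positives / len(injected) if injected else float("nan")
    ppv = true_positives / len(signalled) if signalled else float("nan")
    return sensitivity, ppv

# Four injected outbreaks; the detector signalled on 2 and 3, plus a false alarm (5).
print(sensitivity_and_ppv({1, 2, 3, 4}, {2, 3, 5}))
```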

Submitted by elamb on
Description

There are many proposed methods of identifying outbreaks of disease in surveillance data. However, there is little agreement about appropriate ways to choose among them. One common basis for comparison is simulating outbreaks and adding the simulated cases to real data streams (‘injected outbreaks’); competing statistical methods then attempt to detect the outbreak. The receiver operating characteristic (ROC) curve and the area beneath it are well-known approaches to evaluation. The ROC curve plots the sensitivity against 1 minus the specificity for a range of decision thresholds. Unfortunately, defining ROC curves in this context is not straightforward. In the usual setting of screening, ROC curves are constructed based on individuals, not populations, and it is unclear how to extend the concept to surveillance. In addition, the sensitivity and specificity need to be supplemented by the timeliness: a method with perfect sensitivity and specificity that detects outbreaks too late is useless.
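For reference, the conventional construction can be sketched as follows. This example treats each day as the unit of analysis, which is exactly the kind of assumption the abstract identifies as problematic for surveillance; the scores and labels are made up, and both outbreak and non-outbreak days are assumed to be present.

```python
def roc_points(scores, is_outbreak, thresholds):
    """Return (1 - specificity, sensitivity) for each decision threshold."""
    points = []
    for t in thresholds:
        tp = fp = tn = fn = 0
        for score, outbreak in zip(scores, is_outbreak):
            alarm = score >= t
            if alarm and outbreak:
                tp += 1
            elif alarm:
                fp += 1
            elif outbreak:
                fn += 1
            else:
                tn += 1
        points.append((1 - tn / (tn + fp), tp / (tp + fn)))
    return points

def area_under_curve(points):
    """Trapezoidal area under the ROC points."""
    pts = sorted(points)
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(pts, pts[1:]))

scores = [0.1, 0.4, 0.35, 0.8]               # hypothetical daily alarm scores
is_outbreak = [False, False, True, True]     # hypothetical day labels
thresholds = sorted(set(scores)) + [float("inf")]
points = roc_points(scores, is_outbreak, thresholds)
print(area_under_curve(points))  # 0.75
```

Note that nothing in this calculation reflects how quickly an outbreak is detected, which is why timeliness has to be reported separately.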

 

Objective

We developed metrics for evaluating tools used for outbreak detection, assuming simulated outbreaks.

Submitted by elamb on
Description

In previous work, we described a non-disease-specific outbreak simulator for the evaluation of outbreak detection algorithms. This Template-Driven Simulator generates disease patterns using user-defined template functions. Estimation of a template function from real outbreak data would enable researchers to repetitively simulate outbreaks that resemble a single real outbreak. These simulated outbreaks can then be used to evaluate outbreak detection algorithms. To demonstrate template estimation, we employ BARD, a disease-specific outbreak model for outdoor aerosol release of B. anthracis. It uses epidemiological and atmospheric dispersion models in conjunction with geographical and meteorological data to generate anthrax cases. The home census block group and time of visit to an emergency department are available for each simulated case.
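To make the idea of a template concrete, the sketch below draws case onset days from a purely temporal template. It is a simplified illustration, not the authors' Template-Driven Simulator: the template is assumed to be a list of non-negative relative weights per day, and the function name and parameters are hypothetical.

```python
import random
from collections import Counter

def simulate_from_template(template, n_cases, seed=None):
    """Distribute n_cases over days in proportion to the template weights.

    Returns a list of daily case counts with the same length as the template.
    """
    rng = random.Random(seed)
    days = rng.choices(range(len(template)), weights=template, k=n_cases)
    per_day = Counter(days)
    return [per_day.get(day, 0) for day in range(len(template))]

# Hypothetical template: no cases on day 0, a peak on day 3, then a decline.
template = [0.0, 1.0, 3.0, 5.0, 3.0, 1.0, 0.5]

# Repeated draws resemble the same outbreak shape but differ in detail.
for seed in range(3):
    print(simulate_from_template(template, n_cases=40, seed=seed))
```

A template estimated from one real outbreak could be substituted for the hand-written weights, giving repeated simulated outbreaks with a similar shape.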

 

Objective

In previous work, we developed a Template-Driven Simulator, a non-disease-specific outbreak simulator that uses templates to describe the temporal or spatial-temporal pattern of an outbreak. Here we address the problem of estimating the template from outbreak data. We then conduct a limited validation of the outbreak simulation model by estimating the template using outbreak data generated from BARD, a sophisticated state-of-the-art anthrax outbreak simulator and detector. This limited validation confirms that the outbreak simulator is capable of generating complicated disease outbreak patterns for evaluating outbreak detection algorithms.

Submitted by elamb on
Description

Evidence suggests that transmission within the workplace contributes significantly to the magnitude of a pandemic flu epidemic. A significant number of large organizations have a pandemic plan in place which may help in controlling this manner of transmission. These plans typically include telecommuting and other measures to reduce the need to physically commute to the workplace. Good data are needed in order to obtain valid results from simulation models and to be able to assess the effect of reductions in commuting.

 

Objective

The objective of this study was to explore employment and commuting data from different sources, using statistical analytic techniques together with geographical experts, in order to give modelers information that helps them improve the employment and commuting components of their models, to determine potential issues related to these data, and to identify problem areas where further investigation is needed.

Submitted by elamb on
Description

The effectiveness of public health interventions during a disease outbreak depends on rapid, accurate characterization of the initial outbreak and spread of the pathogen. Computer-based simulation using mathematical models provides a means to characterize both and enables practitioners to test intervention strategies. While compartmental differential equation models can be used to represent epidemics, they are unsuitable for early time simulations (first few days) when a small number of people are infected (and even fewer symptomatic), nor are they capable of representing spatial disease spread. Numerous models for disease propagation have been explored, including national scale network models for influenza and social network-based and probabilistic models for smallpox. To be useful in a public health context, a model for disease propagation should be efficient (e.g., simulating several weeks of real time in an hour) and flexible enough to simultaneously represent multiple diseases and attack scenarios.
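The limitation of deterministic compartmental models at early times can be seen by comparing them with an event-driven stochastic simulation. The following is a minimal sketch of a Gillespie-style SIR simulation, offered as an illustration rather than as the paper's method; the parameter values are hypothetical.

```python
import random

def gillespie_sir(s, i, r, beta, gamma, t_max, seed=None):
    """Stochastic SIR with individual infection and recovery events.

    Returns a list of (time, S, I, R) states, one per event.
    """
    rng = random.Random(seed)
    n = s + i + r
    t = 0.0
    history = [(t, s, i, r)]
    while i > 0:
        infection_rate = beta * s * i / n
        recovery_rate = gamma * i
        total_rate = infection_rate + recovery_rate
        t += rng.expovariate(total_rate)     # waiting time to the next event
        if t > t_max:
            break
        if rng.random() < infection_rate / total_rate:
            s, i = s - 1, i + 1              # one new infection
        else:
            i, r = i - 1, r + 1              # one recovery
        history.append((t, s, i, r))
    return history

# One initial case in a population of 1000; some runs die out immediately.
for seed in range(5):
    final = gillespie_sir(999, 1, 0, beta=0.5, gamma=0.25, t_max=10.0, seed=seed)[-1]
    print(seed, final)
```

With a single initial case, a differential equation model would always predict growth when `beta > gamma`, whereas individual runs of the stochastic model can go extinct by chance.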

 

Objective

This paper describes biologically based mathematical models and efficient methods for early-epoch simulation of disease outbreaks and bioterror attacks.

Submitted by elamb on
Description

Spatial scan finds the most anomalous region, that is, the region showing the largest increase in observed counts compared to the expected baseline. Because there can be infinitely many regions to search, most state-of-the-art algorithms assume a specific shape for the attack region (circles for Kulldorff and rectangles for Ultra-Fast Spatial Scan Statistics). This assumption might reduce detection power, as real-world attacks do not follow standard geometric shapes.
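To show what a shape-constrained search looks like, the sketch below scans every axis-aligned rectangle on a small grid and scores each with Kulldorff's Poisson log-likelihood ratio. It is a brute-force illustration under stated assumptions (strictly positive baselines, a tiny grid), not the fast algorithms cited above; the grid values are invented.

```python
import math

def poisson_llr(observed, expected, total):
    """Kulldorff's Poisson log-likelihood ratio for one region (0 if no excess)."""
    if observed <= expected:
        return 0.0
    llr = observed * math.log(observed / expected)
    if total > observed:
        llr += (total - observed) * math.log((total - observed) / (total - expected))
    return llr

def best_rectangle(counts, baselines):
    """Exhaustively score all rectangles; returns (score, (r0, c0, r1, c1))."""
    rows, cols = len(counts), len(counts[0])
    total_c = sum(map(sum, counts))
    total_b = sum(map(sum, baselines))
    best = (0.0, None)
    for r0 in range(rows):
        for r1 in range(r0, rows):
            for c0 in range(cols):
                for c1 in range(c0, cols):
                    cells = [(r, c) for r in range(r0, r1 + 1)
                             for c in range(c0, c1 + 1)]
                    obs = sum(counts[r][c] for r, c in cells)
                    base = sum(baselines[r][c] for r, c in cells)
                    expected = total_c * base / total_b
                    score = poisson_llr(obs, expected, total_c)
                    if score > best[0]:
                        best = (score, (r0, c0, r1, c1))
    return best

# Hypothetical 4x4 grid with an elevated 2x2 block in the middle.
counts = [[5, 5, 5, 5],
          [5, 15, 15, 5],
          [5, 15, 15, 5],
          [5, 5, 5, 5]]
baselines = [[1.0] * 4 for _ in range(4)]
print(best_rectangle(counts, baselines))
```

An L-shaped or otherwise irregular cluster would be only partly covered by any single rectangle, which illustrates the loss of power that motivates shape-free approaches.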

 

Objective

We propose a discriminative random field approach for detecting a disease outbreak. Given observed data on a spatial grid, the goal is to label each node as either attack or non-attack.

Submitted by elamb on
Description

Current syndromic surveillance systems run multiple simultaneous univariate procedures, each focused on detecting an outbreak in a single data stream. Multivariate procedures have the potential to better detect some types of outbreaks, but most of the existing methods are directionally invariant and are thus less relevant to the problem of syndromic surveillance. This article develops two directionally sensitive multivariate procedures and compares the performance of these procedures both with the original directionally invariant procedures and with the application of multiple univariate procedures using both simulated and real syndromic surveillance data. The performance comparison is conducted using metrics and terminology from the statistical process control (SPC) literature with the intention of helping to bridge the SPC and syndromic surveillance literatures. This article also introduces a new metric, the average overlapping run length, developed to compare the performance of various procedures on limited actual syndromic surveillance data. Among the procedures compared, in the simulations the directionally sensitive multivariate cumulative sum (MCUSUM) procedure was preferred, whereas in the real data the multiple univariate CUSUMs and the MCUSUM performed similarly. This article concludes with a brief discussion of the choice of performance metrics used herein versus the metrics more commonly used in the syndromic surveillance literature (sensitivity, specificity, and timeliness), as well as some recommendations for future research.

Submitted by elamb on
Description

The Public Health Agency of Canada is currently using a syndromic surveillance prototype called the Canadian Early Warning System (CEWS). This system monitors several live data feeds, including emergency room chief complaint records from all seven local hospitals, Telehealth (24/7 nurse hotline) calls, and over-the-counter drug sales from a number of the large chain drug stores. Data trends are analysed for aberrations as early indicators of outbreak events. Collaborators on this Winnipeg, Manitoba-based pilot include the Winnipeg Regional Health Authority and IBM Business Solutions. Algorithms currently in CEWS include the 3-, 5- and 7-day moving averages, CUSUM and the CDC’s EARS. We seek to investigate the performance of these algorithms, given that their detection ability may depend on the data source and/or the type of outbreak event.
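A moving-average detector of the kind listed above can be sketched as follows. This is a generic illustration, not the CEWS implementation: it assumes an alarm is raised when the current count exceeds the mean of the preceding `window` days by more than `z` standard deviations, and both parameters are hypothetical.

```python
from statistics import mean, stdev

def moving_average_alarms(counts, window=7, z=3.0):
    """Return indices of days whose count exceeds mean + z * sd of the prior window."""
    alarms = []
    for day in range(window, len(counts)):
        baseline = counts[day - window:day]      # the preceding `window` days
        threshold = mean(baseline) + z * stdev(baseline)
        if counts[day] > threshold:
            alarms.append(day)
    return alarms

# Hypothetical daily counts with a one-day spike on day 8.
counts = [10, 12, 11, 9, 10, 13, 11, 12, 30, 11]
print(moving_average_alarms(counts, window=7))  # [8]
```

Shorter windows (3 or 5 days) adapt faster to recent changes but give noisier baselines, which is one reason performance may vary by data source and outbreak type.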

 

Objective

To determine the sensitivity, specificity and days to detection of several commonly used algorithms in syndromic surveillance systems.

Submitted by elamb on