
Modeling

Description

Bordetella pertussis outbreaks cause morbidity in all age groups, but the infection is most dangerous for young infants. Pertussis is difficult to diagnose, especially in its early stages, and definitive test results are not available for several days. Because of the temporal and geographic variability of pertussis outbreaks, the delay in diagnostic test results, and the ramifications of incorrect management decisions at the point of care, pertussis is a prototypical disease for which real-time public health surveillance data might inform, guide, and improve medical decision making. Previously, we showed that diagnostic accuracy for meningitis can be improved when information about recent, local disease incidence is taken into account. Here, we quantify the contribution of epidemiologic context to a clinical prediction model for pertussis using a state public health data stream.

 

Objective

To explore the integration of epidemiological context – current population-level disease incidence data – into a clinical prediction model for pertussis.
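One simple way to see how population-level incidence can shift a clinical prediction is Bayes' rule in odds form. The sketch below is only an illustration under stated assumptions: the abstract does not say how the model combines the two sources, and the function name, the pretest probabilities, and the likelihood ratio are all hypothetical values chosen for the example.

```python
def posterior_probability(pretest_prob, likelihood_ratio):
    """Update a pretest probability with a likelihood ratio (Bayes' rule, odds form).

    pretest_prob stands in for a prior informed by recent local incidence;
    likelihood_ratio stands in for the strength of the clinical findings.
    Both are hypothetical inputs, not quantities reported in the abstract.
    """
    prior_odds = pretest_prob / (1.0 - pretest_prob)
    post_odds = prior_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)

# Same clinical findings, two different epidemiologic contexts.
low_incidence = posterior_probability(0.01, 5.0)
high_incidence = posterior_probability(0.10, 5.0)
```

With identical clinical findings, the estimated probability of disease is higher when recent local incidence is higher, which is the kind of contribution the study sets out to quantify.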

Submitted by elamb on
Description

Modern surveillance systems use statistical process control (SPC) charts such as Cumulative Sum and Exponentially Weighted Moving Average charts for monitoring daily counts of such quantities as ICD-9 codes from ED visits, sales of medications, and doctors’ office visits. The working assumption is that such pre-clinical data contain an early signature of disease outbreaks, manifested as an increase in the count levels. However, the direct application of SPC charts to the raw counts leads to unreliable performance. A popular statistical solution is to precondition the data before applying the charts by modeling or removing explainable patterns from the data and then monitoring the residuals. Although the general idea is common practice, the specifics of how to identify the existing explainable components and how to account for them are domain-specific. Therefore, we seek to present a set of modeling and data-driven tools that are useful for syndromic data.

 

Objective

SPC charts are widely used in disease surveillance. The charts are very effective when monitored data meet the requirements of temporal independence, stationarity, and normality. However, when these assumptions are violated, the SPC charts will either fail to detect special cause variations or will alert frequently even in the absence of anomalies. Currently collected biosurveillance data contain predictable factors such as day-of-week effects, seasonal effects, holidays, autocorrelation, and global trends that cause the data to violate these assumptions. This work (1) describes a set of tools for identifying such explainable patterns and (2) examines several data preconditioning methods that account for these factors, yielding data better suited for monitoring by traditional SPC charts.
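As a minimal sketch of preconditioning followed by charting, the example below removes a day-of-week effect by subtracting each weekday's baseline mean and then runs a one-sided CUSUM on the standardized residuals. It is not the authors' implementation: the synthetic counts, the function names, and the CUSUM parameters `k` and `h` are assumptions made for illustration.

```python
from statistics import mean, stdev

def day_of_week_residuals(counts, baseline_len):
    """Subtract each weekday's baseline mean (day i is treated as weekday i % 7)."""
    baseline = counts[:baseline_len]
    dow_means = [mean(baseline[d::7]) for d in range(7)]
    return [c - dow_means[i % 7] for i, c in enumerate(counts)]

def cusum_alerts(residuals, baseline_len, k=0.5, h=4.0):
    """One-sided upper CUSUM on residuals standardized by the baseline window.

    k (slack) and h (threshold) are in standard-deviation units and are
    illustrative defaults, not values taken from the source.
    """
    base = residuals[:baseline_len]
    mu, sigma = mean(base), stdev(base)
    s, alerts = 0.0, []
    for t, r in enumerate(residuals):
        z = (r - mu) / sigma
        s = max(0.0, s + z - k)
        if s > h:
            alerts.append(t)
            s = 0.0  # restart accumulation after an alert
    return alerts

# Synthetic six-week series: weekday/weekend pattern plus small deterministic noise.
pattern = [50, 52, 51, 49, 48, 30, 28]
counts = [pattern[i % 7] + (3 * i) % 5 - 2 for i in range(42)]
for t in (38, 39, 40):
    counts[t] += 15  # injected three-day increase

residuals = day_of_week_residuals(counts, baseline_len=28)
alerts = cusum_alerts(residuals, baseline_len=28)
```

Applied to the raw counts, the same chart would react to the weekly drop and rebound. On the residuals, the weekday pattern is gone and the injected increase stands out.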

Submitted by elamb on
Description

Modern biosurveillance relies on multiple sources of both prediagnostic and diagnostic data, updated daily, to discover disease outbreaks. Intrinsic to this effort are two assumptions: (1) the data being analyzed contain early indicators of a disease outbreak and (2) the outbreaks to be detected are not known a priori. However, in addition to outbreak indicators, syndromic data streams include such factors as day-of-week effects, seasonal effects, autocorrelation, and global trends. These explainable factors obscure unexplained outbreak events, and their presence in the data violates standard control-chart assumptions. Monitoring tools such as Shewhart, cumulative sum, and exponentially weighted moving average control charts will alert based largely on these explainable factors instead of on outbreaks. The goal of this paper is twofold: first, to describe a set of tools for identifying explainable patterns such as temporal dependence and, second, to survey and examine several data preconditioning methods that significantly reduce these explainable factors, yielding data better suited for monitoring with the popular control charts.
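One common tool for identifying temporal dependence is the sample autocorrelation function. The sketch below is an illustration, not necessarily one of the tools the paper describes. It computes the lag-k autocorrelation of a synthetic series with a weekly pattern, and the series and function name are assumptions.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of a series at the given non-negative lag."""
    n = len(series)
    m = sum(series) / n
    denom = sum((x - m) ** 2 for x in series)
    num = sum((series[t] - m) * (series[t + lag] - m) for t in range(n - lag))
    return num / denom

# Eight weeks of a purely periodic weekday/weekend pattern (synthetic).
weekly = [50, 52, 51, 49, 48, 30, 28] * 8

lag7 = autocorrelation(weekly, 7)  # strong: same weekday one week apart
lag3 = autocorrelation(weekly, 3)  # weaker: pairs weekdays with weekend days
```

A pronounced peak at lag 7 indicates a day-of-week effect that should be modeled or removed before a control chart is applied.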

Submitted by elamb on
Description

Existing statistical methods can perform well in detecting simulated bioterrorism events. However, these methods have not been well evaluated for detecting the types of respiratory and gastrointestinal events of greatest interest for routine public health practice. To assess whether a syndromic surveillance system can detect these outbreaks, we constructed simulated outbreaks based on public health interest and experience. We then inserted these outbreaks into real data. We assessed whether the simulated outbreaks could be detected using a battery of detection methods, including model-adjusted scan statistics and space-time permutation scan statistics.

 

Objective

We used simulation methods to assess the performance of two distinct anomaly-detection approaches, each under a variety of parameter settings, with respect to their ability to detect outbreaks of commonly occurring events of public health importance.
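The evaluation design, inserting a simulated outbreak into baseline data and checking whether and when it is detected, can be sketched as follows. A fixed-threshold detector stands in for the scan statistics used in the study, which are considerably more involved. The baseline counts, the outbreak shape, and the threshold are hypothetical.

```python
def inject_outbreak(baseline, start, extra_cases):
    """Return a copy of baseline with extra_cases[i] added on day start + i."""
    out = list(baseline)
    for i, extra in enumerate(extra_cases):
        if start + i < len(out):
            out[start + i] += extra
    return out

def detection_delay(series, start, threshold):
    """Days from outbreak start to the first count above threshold, or None."""
    for t in range(start, len(series)):
        if series[t] > threshold:
            return t - start
    return None

# Hypothetical baseline daily counts and outbreak signal.
baseline = [20, 22, 19, 21, 23, 20, 18, 22, 21, 19, 20, 22, 21, 20]
outbreak = [2, 5, 9, 12, 8, 4]

injected = inject_outbreak(baseline, start=6, extra_cases=outbreak)
delay = detection_delay(injected, start=6, threshold=28)
```

Repeating this over many start days, outbreak shapes, and detector settings yields the detection rates and delays used to compare methods.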

Submitted by elamb on
Description

We developed a probabilistic model of how clinicians are expected to detect a disease outbreak due to an outdoor release of anthrax spores, when the clinicians only have access to traditional clinical information (e.g., no computer-based alerts). We used this model to estimate an upper bound on the amount of time expected for clinicians to detect such an outbreak. Such estimates may be useful in planning for outbreaks and in assessing the usefulness of various computer-based outbreak detection algorithms.

Submitted by elamb on
Description

In epidemiology, contact tracing is a process used to control the spread of an infectious disease and to identify individuals who were previously exposed to patients with the disease. After the emergence of AIDS, social network analysis (SNA) was demonstrated to be a good supplementary tool for contact tracing [1]. Traditionally, social networks for disease investigation are constructed only from personal contacts, since personal contacts are the most identifiable paths for disease transmission. However, for diseases that are transmitted not only through personal contact, incorporating geographical contacts into SNA has been demonstrated to reveal potential contacts among patients [2][3].

Objective

In this research, we aim to investigate the necessity of incorporating geographical contacts into Social Network Analysis (SNA) for contact tracing in epidemiology and explore the strengths of multi-mode networks with patients and geographical locations in network visualization for disease spread investigation.
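A two-mode network of patients and locations can be represented as a mapping from each location to the patients who visited it. Patient pairs that share a location are then candidate geographical contacts. The sketch below illustrates this idea only. The patient identifiers, location names, and function names are hypothetical, and timing of visits is ignored.

```python
from collections import defaultdict
from itertools import combinations

def build_two_mode(visits):
    """Map each location to the set of patients who visited it."""
    by_location = defaultdict(set)
    for patient, location in visits:
        by_location[location].add(patient)
    return by_location

def geographic_contacts(by_location):
    """Project the two-mode network onto patients: pairs sharing a location."""
    pairs = set()
    for patients in by_location.values():
        for a, b in combinations(sorted(patients), 2):
            pairs.add((a, b))
    return pairs

# Hypothetical data: known personal contacts and patient-location visits.
personal = {("P1", "P2")}
visits = [
    ("P1", "hospital_A"), ("P3", "hospital_A"),
    ("P2", "market_B"), ("P4", "market_B"),
    ("P5", "school_C"),
]

geo = geographic_contacts(build_two_mode(visits))
new_links = geo - personal  # links visible only through shared locations
```

Here P3 and P4 have no recorded personal contacts, yet each is linked to the known cluster through a shared location.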

Submitted by elamb on
Description

The performance of even the most advanced syndromic surveillance systems can be undermined if the monitored data are delayed before they arrive in the system. In such cases, an outbreak may be detected only after it is too late for an appropriate public health response. Surveillance systems can experience delays in data availability for a number of reasons. The process of transmitting data from data sources to the surveillance system can involve delays, especially in large systems where data are first aggregated across a national network of data sources before being transmitted to the surveillance system. Delays can also arise in the course of care, for example when a diagnosis is not available until a few days after the healthcare encounter. It is important to minimize delays in data availability in order to maintain timeliness of detection [1]. When this is not possible, it is desirable to compensate for these data delays to minimize their effects.

Objective

This paper describes an approach to improving the detection timeliness of real-time health surveillance systems by modeling and correcting for delays in data availability.
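One simple way to compensate for reporting delays, which is not necessarily the method used in the paper, is to estimate from historical records the fraction of data that has typically arrived after a given number of days and to scale up recent, still-incomplete counts accordingly. The delay history, the observed counts, and the function names below are hypothetical.

```python
def completeness_by_lag(delays, max_lag):
    """Empirical fraction of records that arrived within 0..max_lag days."""
    n = len(delays)
    return [sum(1 for d in delays if d <= lag) / n for lag in range(max_lag + 1)]

def adjust_counts(observed, completeness):
    """Scale up recent counts by expected completeness; observed[-1] is today (lag 0)."""
    adjusted = []
    last = len(observed) - 1
    for t, count in enumerate(observed):
        lag = last - t
        # Days older than the longest modeled lag are assumed complete.
        frac = completeness[lag] if lag < len(completeness) else 1.0
        adjusted.append(count / frac if frac > 0 else float("nan"))
    return adjusted

# Hypothetical history of reporting delays (in days) for past records.
historical_delays = [0, 0, 1, 1, 1, 2, 2, 3, 1, 0]
completeness = completeness_by_lag(historical_delays, max_lag=3)

# Counts as currently visible; the last entries are still incomplete.
observed = [40, 42, 39, 30, 12]
adjusted = adjust_counts(observed, completeness)
```

The adjusted series removes the artificial drop in the most recent days, so a detector is not misled by counts that are low only because data have not yet arrived. Estimates for the newest days remain noisy because they are scaled from few records.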

Submitted by elamb on
Description

In previous work, we described a non-disease-specific outbreak simulator for the evaluation of outbreak detection algorithms. This Template-Driven Simulator generates disease patterns using user-defined template functions. Estimating a template function from real outbreak data would enable researchers to repeatedly simulate outbreaks that resemble a single real outbreak. These simulated outbreaks can then be used to evaluate outbreak detection algorithms. To demonstrate template estimation, we employ BARD, a disease-specific outbreak model for outdoor aerosol release of B. anthracis. It uses epidemiological and atmospheric dispersion models in conjunction with geographical and meteorological data to generate anthrax cases. The home census block group and time of visit to an emergency department are available for each simulated case.

 

Objective

In previous work, we developed a Template-Driven Simulator, a non-disease-specific outbreak simulator that uses templates to describe the temporal or spatial-temporal pattern of an outbreak. Here we address the problem of estimating the template from outbreak data. We then conduct a limited validation of the outbreak simulation model by estimating the template from outbreak data generated by BARD, a sophisticated state-of-the-art anthrax outbreak simulator and detector. This limited validation confirms that the outbreak simulator is capable of generating complicated disease outbreak patterns for evaluating outbreak detection algorithms.
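The idea of estimating a temporal template from one outbreak and then drawing new outbreaks from it can be sketched as below. This is a simplified illustration under assumptions: the abstract does not specify the form of the simulator's template functions, so the template here is just a normalized daily case curve, and the observed counts and function names are hypothetical.

```python
import random

def estimate_template(counts):
    """Normalize daily outbreak counts into a temporal template summing to 1."""
    total = sum(counts)
    return [c / total for c in counts]

def simulate_outbreak(template, total_cases, seed=0):
    """Draw total_cases onset days with probability proportional to the template."""
    rng = random.Random(seed)
    onsets = rng.choices(range(len(template)), weights=template, k=total_cases)
    counts = [0] * len(template)
    for day in onsets:
        counts[day] += 1
    return counts

# Hypothetical daily case counts from a single observed outbreak.
observed = [1, 4, 9, 6, 3, 1]
template = estimate_template(observed)

# Two simulated outbreaks of different size that follow the same shape.
small = simulate_outbreak(template, total_cases=50, seed=1)
large = simulate_outbreak(template, total_cases=500, seed=2)
```

Varying the seed and the total number of cases produces many outbreaks that resemble the original, which can then be inserted into baseline data to evaluate detection algorithms.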

Submitted by elamb on
Description

Evidence suggests that transmission within the workplace contributes significantly to the magnitude of a pandemic flu epidemic. A significant number of large organizations have a pandemic plan in place which may help in controlling this manner of transmission. These plans typically include telecommuting and other measures to reduce the need to physically commute to the workplace. Good data are needed in order to obtain valid results from simulation models and to be able to assess the effect of reductions in commuting.

 

Objective

The objective of this study was to explore data on employment and commuting from different sources, using statistical analytic techniques in collaboration with geographical experts. The resulting information is intended to help modelers improve the employment and commuting components of their models, to reveal potential issues with these data, and to identify problem areas where further investigation is needed.

Submitted by elamb on