Data Analytics

Description

This paper evaluates the operating characteristics of limited-baseline aberration detection methods across baseline periods of different lengths (7-28 days) and end dates (1-7 days before the current day). The evaluation uses simulated outbreaks added to real data and to simulated data representative of real data.
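To make the two tuning parameters concrete, here is a minimal sketch of a limited-baseline detector. It is not the paper's method: the z-score statistic, the function name `baseline_z_score`, the `min_sd` floor, and the example counts are all assumptions chosen for illustration. The baseline covers `baseline_len` days and ends `lag` days before the day being tested.

```python
from statistics import mean, stdev


def baseline_z_score(counts, day, baseline_len=7, lag=2, min_sd=0.5):
    """Z-score of counts[day] against a short baseline window.

    The window holds `baseline_len` days and ends `lag` days before `day`
    (lag=1 means the baseline ends yesterday). `min_sd` is a hypothetical
    floor that avoids division by zero on flat baselines.
    """
    if lag < 1 or baseline_len < 2:
        raise ValueError("lag must be >= 1 and baseline_len >= 2")
    end = day - lag + 1            # exclusive end index of the baseline window
    start = end - baseline_len     # inclusive start index
    if start < 0 or day >= len(counts):
        raise ValueError("not enough history for this baseline")
    window = counts[start:end]
    sd = max(stdev(window), min_sd)
    return (counts[day] - mean(window)) / sd


# Hypothetical daily visit counts with a spike on the last day.
visits = [10, 12, 11, 9, 10, 11, 10, 12, 11, 30]
score = baseline_z_score(visits, day=9, baseline_len=7, lag=2)
alarm = score > 3.0                # the threshold of 3 is also an assumption
```

Sweeping `baseline_len` over 7 to 28 and `lag` over 1 to 7 on the same series would reproduce the kind of parameter grid the abstract describes, although the statistic evaluated in the paper may differ.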

Submitted by elamb on
Description

Benchmarking of temporal surveillance techniques is a critical step in the development of an effective syndromic surveillance system. Unfortunately, holding “bakeoffs” to blindly compare approaches is a difficult and often fruitless enterprise, in part due to the parameters left to the final user for tuning. In this paper, we demonstrate how common analytical development and analysis may be coupled with realistic data sets to provide insight and robustness when selecting a surveillance technique.

 

OBJECTIVE

This paper compares the robustness and performance of three temporal surveillance techniques using a twofold approach: 1) a unifying statistical analysis to establish their common features and differences, and 2) benchmarking on respiratory, influenza-like illness, upper GI, and lower GI complaint time series from Harvard Pilgrim Health Care (HPHC).

Submitted by elamb on
Description

Historical data are essential for the development of detection algorithms. Spatio-temporal data, however, are difficult to come by due to a variety of issues concerning patient confidentiality. Several approaches have been used to generate benchmark data using statistical methods. Here, we demonstrate how to generate benchmark data using a discrete event model that simulates the inter- and intra-contact network transmission dynamics of infectious diseases in space and time, based on publicly available population data.

 

OBJECTIVE

The objective of this study is to generate benchmark data from a discrete event model simulating the transmission dynamics of an infectious disease within and between contact networks in urban settings using real population data. Such data can be used to test the performance of various temporal and spatio-temporal detection algorithms when real data are scarce or cannot be shared.
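As a rough illustration of the idea, the sketch below runs a tiny discrete event simulation over a contact network and turns the result into a daily case series. It is not the authors' model: the function names, the per-edge transmission probabilities (higher values could stand in for within-network contacts, lower ones for between-network contacts), and the fixed infectious period are all assumptions.

```python
import heapq
import random


def simulate_outbreak(contacts, seed_case, infectious_days=3, rng=None):
    """Return {person: infection_day} from a simple event-driven simulation.

    `contacts` maps each person to a list of (neighbor, daily_probability)
    pairs. Events are (day, person) infections processed in time order.
    """
    rng = rng or random.Random(0)
    infected_on = {}
    events = [(0, seed_case)]              # priority queue ordered by day
    while events:
        day, person = heapq.heappop(events)
        if person in infected_on:
            continue                       # an earlier event already infected them
        infected_on[person] = day
        for neighbor, prob in contacts.get(person, []):
            for offset in range(1, infectious_days + 1):
                if rng.random() < prob:
                    heapq.heappush(events, (day + offset, neighbor))
                    break                  # one successful transmission is enough
    return infected_on


def daily_counts(infected_on):
    """Collapse infection days into a list of new cases per day."""
    counts = [0] * (max(infected_on.values()) + 1)
    for day in infected_on.values():
        counts[day] += 1
    return counts


# Hypothetical network: two households linked by one weaker community contact.
network = {
    "a1": [("a2", 0.6), ("b1", 0.1)],
    "a2": [("a1", 0.6)],
    "b1": [("b2", 0.6), ("a1", 0.1)],
    "b2": [("b1", 0.6)],
}
series = daily_counts(simulate_outbreak(network, "a1", rng=random.Random(42)))
```

A series produced this way could be added to a background signal to test a detection algorithm. A realistic model would also need locations, incubation periods, and population data, none of which are shown here.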

Submitted by elamb on
Description

A Bayesian Network (BN) is a probabilistic graphical model representing dependencies and relationships. The structure of the network and its conditional probabilities capture an expert’s view of a system. BNs have been applied to the public health domain for research purposes, but have not been used directly by the end users of public health systems. As BN technology becomes increasingly accepted in the public health domain, data fusion visualization becomes a critical component of the overall system design. The tools developed apply computer-assisted analysis to BNs in the public health domain, provide a concise view of the data for better decision support, and shorten the decision-making phase, allowing rapid dissemination of information to public health.

 

OBJECTIVE

This paper describes the use and visualization of BNs to better assist public health users. The Data Fusion Visualization (DFV) provides an intuitive graphical interface that supports users in three ways. The first is by providing a seamless drill-down interpretation of a dataset. The second is by providing an intuitive interpretation of BNs. Finally, by abstracting the visualization from the underlying model, the DFV is capable of masking inter-operating BNs into a single visualization. The DFV provides a graphical representation of BN data fusion.
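For readers unfamiliar with BNs, the following minimal sketch shows how conditional probabilities combine evidence from several data streams into one posterior. The network (a single "outbreak" parent with conditionally independent child signals), the function name, and every probability are hypothetical; none of them come from the DFV system.

```python
def posterior_outbreak(prior, likelihoods, evidence):
    """P(outbreak | evidence) for a two-layer network with one parent node.

    likelihoods[name] = (P(signal | outbreak), P(signal | no outbreak)).
    evidence[name] is True if the signal fired and False if it did not.
    Child signals are assumed independent given the parent.
    """
    p_yes, p_no = prior, 1.0 - prior
    for name, observed in evidence.items():
        p_sig_yes, p_sig_no = likelihoods[name]
        p_yes *= p_sig_yes if observed else 1.0 - p_sig_yes
        p_no *= p_sig_no if observed else 1.0 - p_sig_no
    return p_yes / (p_yes + p_no)      # normalize over both parent states


# Hypothetical conditional probability tables for two data sources.
tables = {"ed_visits": (0.8, 0.1), "otc_sales": (0.6, 0.2)}
belief = posterior_outbreak(0.01, tables, {"ed_visits": True, "otc_sales": True})
```

A visualization layer like the one described would present a value such as `belief` together with the contribution of each source, rather than exposing the tables themselves.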

Submitted by elamb on
Description

Bio-surveillance is an area providing real-time or near-real-time data sets with a rich structure. In this area, the new wave of interest lies in incorporating medical data, such as the percentage of influenza-like illness (ILI) or the count of ILI cases observed during emergency room visits, as an intelligence function, since many different bioterrorist agents present with flu-like symptoms. Developing a control technique for ILI, however, is a complex process that must contend with the unpredictable timing of influenza emergence, the severity of the outbreak, and the effectiveness of influenza epidemic interventions. Furthermore, the need to detect the beginning of an epidemic online, as data are received sequentially one at a time, makes the problems surrounding ILI even more challenging. Statistical tools for analyzing these data currently fall well short of capturing all their important structural details. Tools from statistical process control are, on the face of it, ideally suited to the task, since they address the exact problem of detecting a sudden shift against a background of random variability. Bayesian statistical methods are ideally suited to settings with partial but imperfect information on the statistical parameters describing time series data, such as those gathered in BioSense and Sentinel settings.

 

Objective

This paper presents a Bayesian approach to quality control that uses a sequential update technique to build a fast detection method for influenza outbreaks and potential intentional releases of biological agents. The objective is to find evidence of outbreaks against a background in which markers of possible intentional release are non-stationary and serially dependent. This work uses the US Sentinel ILI data to find this evidence and to address some issues related to the control of infectious diseases. A sensitivity analysis is conducted through simulation to assess the timeliness, correct alarm rate, and missed alarm rate of our technique.
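The sequential update idea can be sketched with a conjugate Gamma-Poisson model, which is a deliberate simplification and not the model used in the paper. Each new count is compared with the posterior predictive distribution before the posterior is updated. The function name, prior values, alarm threshold, and the `discount` factor (a crude allowance for non-stationarity) are all assumptions.

```python
from math import sqrt


def sequential_alarms(counts, prior_shape=1.0, prior_rate=0.1,
                      threshold=3.0, discount=1.0):
    """Return the indices at which a count exceeds the predictive limit.

    The rate has a Gamma(shape, rate) posterior, so the one-step predictive
    distribution is negative binomial with mean shape/rate and variance
    shape * (rate + 1) / rate**2. Alarmed counts are not used for updating,
    which keeps the baseline free of outbreak data.
    """
    shape, rate = prior_shape, prior_rate
    alarms = []
    for t, y in enumerate(counts):
        pred_mean = shape / rate
        pred_sd = sqrt(shape * (rate + 1.0)) / rate
        if y > pred_mean + threshold * pred_sd:
            alarms.append(t)
            continue
        # discount < 1 down-weights old data; discount == 1 is the plain update
        shape = discount * shape + y
        rate = discount * rate + 1.0
    return alarms


# Hypothetical weekly ILI counts with one injected spike.
weekly_ili = [10, 11, 9, 10, 12, 10, 11, 40, 10]
flagged = sequential_alarms(weekly_ili)
```

Running the same function on many simulated series with injected outbreaks would give timeliness, correct alarm, and missed alarm rates of the kind the abstract mentions. The serial dependence it describes is not modeled in this sketch.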

Submitted by elamb on
Description

Computational and statistical methods for detecting disease clusters, such as the spatial scan statistic, have become frequently used tools in epidemiology. However, they simply tell the user where a cluster is and leave the analysis task to the user. Multivariate visualization tools provide one way to carry out this analysis. The approach developed in this research is computational in nature, using computer vision techniques to analyze the shape of the cluster. Shapes are used here because different spatial processes that cause clusters, such as pollution along a river, create clusters with different shapes. Thus, it may be possible to categorize clusters by their respective spatial processes by analyzing the cluster shapes.

 

OBJECTIVE

Many computational and statistical methods exist for detecting spatial clusters, but the interpretation of these clusters is left to the user. This research develops computational methods that not only detect a cluster but also analyze it to hypothesize one or more potential causes.
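One simple shape descriptor that could separate elongated clusters (for example, along a river) from compact ones is the isoperimetric quotient. The sketch below computes it for a cluster given as grid cells. The abstract does not say which computer vision techniques were used, so the descriptor, the grid representation, and the function name are assumptions.

```python
from math import pi


def compactness(cells):
    """Isoperimetric quotient 4*pi*area / perimeter**2 for a set of grid cells.

    `cells` is an iterable of (row, col) pairs. Area is the number of cells
    and perimeter is the number of cell edges not shared with another cell
    in the cluster. Values near 1 indicate round shapes; values near 0
    indicate thin or ragged shapes.
    """
    cells = set(cells)
    if not cells:
        raise ValueError("cluster has no cells")
    perimeter = 0
    for r, c in cells:
        for neighbor in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if neighbor not in cells:
                perimeter += 1
    return 4 * pi * len(cells) / perimeter ** 2


# Two hypothetical clusters with the same area but different shapes.
blob = {(r, c) for r in range(3) for c in range(3)}    # 3 x 3 block
ribbon = {(0, c) for c in range(9)}                    # 1 x 9 strip
more_compact = compactness(blob) > compactness(ribbon)
```

A classifier that maps descriptors like this to candidate spatial processes would be a separate step and is not shown.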

Submitted by elamb on
Description

New York City ED syndromic surveillance uses SaTScan to detect spatial signals. SaTScan analysis has been integrated into SAS since 2002, and signal maps have been generated from SAS since 2003. Signal maps are created occasionally to investigate a severe outbreak based on the SaTScan results. Previously, additional GIS analysis in ArcGIS was integrated manually, which required more time and risked being less consistent than an automated method. This script now integrates SAS, SaTScan, and spatial analysis from ArcGIS to create high-quality maps in an automated procedure.

 

Objective

The objective was to minimize the amount of time spent on routine, daily analysis of syndromic data, integrate additional spatial analysis, create better maps, and cut response times to outbreaks.

Submitted by elamb on