Data Analysis

Description

Public health officials are now receiving more data than ever in electronic formats, and also stand to benefit more than ever from ongoing advances in the medical and epidemiological sciences. At the same time, this growing body of knowledge as well as volatile world events present an increasingly complex set of threats to population health. As a consequence, public health officials are finding that they need to ask many more, and more complex, questions of their data in order to keep sight of the state of the public’s health. Most current disease surveillance systems enable users to ask many different questions of health data, but are limited in that users can only extract results one question, or query, at a time.



Objective

Develop an Automated Data Query tool to allow public health officials to easily extract batches of raw medical encounter data using custom queries that the officials themselves set up. Additionally, the tool shall be capable of running anomaly detection algorithms against the raw data and returning the statistics. Users shall be able to perform their own analyses on the data and/or the statistical results after using the tool to collect the information efficiently. The tool will help them spot trends of interest that may be specific to their own jurisdictions.

Submitted by elamb on
Description



SaTScan is freely available software that uses the scan statistic to detect clusters in space, time, or space-time. SaTScan uses Monte Carlo hypothesis testing to produce a p-value for the null hypothesis that no clusters are present. Monte Carlo hypothesis testing can be a powerful tool when asymptotic theoretical distributions are inconvenient or impossible to derive; the main drawback of this approach is that precision for small p-values can only be obtained by greatly increasing the number of Monte Carlo replications, which is both computer-intensive and time-consuming. Depending on the type of analysis being done, the number of geographical areas included, the amount of historical data, and the number of Monte Carlo replications, SaTScan can take anywhere from seconds to hours to run. In daily surveillance of many syndromes, we need to limit the time it takes to generate each p-value while still retaining enough precision in the p-value to determine how unusual a cluster is. Since the type of analysis and the geographic regions used cannot be changed in most cases, we focus here on reducing the number of Monte Carlo replicates needed.
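To make the precision trade-off concrete, the following is a minimal sketch of a standard rank-based Monte Carlo p-value. It is not SaTScan's code and not the method proposed in this abstract; the function names, the toy null statistic, and the observed value of 3.5 are all assumptions chosen for illustration. With R replicates, the smallest p-value that can be reported is 1/(R + 1), which is why finer precision normally requires more replicates.

```python
import random


def monte_carlo_p_value(observed_stat, simulate_stat, n_replicates, seed=0):
    """Rank-based Monte Carlo p-value: (1 + #{simulated >= observed}) / (R + 1)."""
    rng = random.Random(seed)
    exceed = sum(
        1 for _ in range(n_replicates) if simulate_stat(rng) >= observed_stat
    )
    return (exceed + 1) / (n_replicates + 1)


def smallest_p_value(n_replicates):
    """Smallest p-value that R replicates can resolve."""
    return 1 / (n_replicates + 1)


def toy_null_stat(rng):
    """Hypothetical null statistic: maximum of 10 standard normal draws.

    This stands in for a scan statistic computed on one simulated data set.
    """
    return max(rng.gauss(0.0, 1.0) for _ in range(10))


# Resolution improves only as the number of replicates grows.
for r in (99, 999, 9999):
    p = monte_carlo_p_value(3.5, toy_null_stat, r)
    print(f"R={r:5d}  smallest possible p={smallest_p_value(r):.4f}  estimated p={p:.4f}")
```

Each tenfold gain in resolution costs roughly ten times as many simulated data sets, which is the cost the abstract aims to reduce.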

 

Objective

Our goal was to increase the precision of the p-value produced from SaTScan while reducing the amount of CPU time needed by decreasing the number of Monte Carlo replicates.

Submitted by elamb on
Description

The 2003/04 influenza season involved a more pathogenic organism and had an earlier onset. There were noticeably more deaths in otherwise healthy children than in previous seasons. Following this season, states were asked by the Centers for Disease Control and Prevention to increase their surveillance efforts for influenza illness.

 

Objective 

This paper describes data that were available in Ohio for analysis and considered valuable for determining the occurrence of influenza-like illness (ILI). These data sources were studied to determine their value to ILI surveillance and to develop an improved method of establishing influenza activity levels.

Submitted by elamb on
Description

Surveillance strategies following major natural disasters have varied widely with respect to methods used to collect and analyze data. Following Hurricane Katrina, public health concerns included infectious disease outbreaks, injuries, mental health and exacerbation of preexisting chronic conditions resulting from unprecedented population displacement and disruption of public health services and health-care infrastructure.

 

Objective

This paper describes the public health surveillance response to Hurricane Katrina in New Orleans and surrounding parishes, particularly illustrating the methods, results, and lessons learned for implementing passive, active, and electronic syndromic surveillance systems during a major disaster.

Submitted by elamb on
Description

Temporally localized outbreaks occur in the presence of a complex background, greatly complicating both retrospective and real-time detection. Numerous techniques have been proposed for adjusting thresholds to account for this variable background. In this paper, we apply wavelet transforms to detect localized structures in health care time series, using a generalization of many of these viewpoints. A rigorous, nonparametric approach is applied in a general setting to identify coherent outbreaks.

Submitted by elamb on
Description

This paper investigates the use of data-adaptive multivariate statistical process control (MSPC) charts for outbreak detection using real-world syndromic data. The widely used EARS [1] methods and other adaptive implementations assume implicitly that nonstationarity and/or the lack of historic data preclude the conventional Phase I/Phase II approach of SPC. This work examines that assumption formally by evaluating and comparing the false alarm rates and sensitivity of adaptive and non-adaptive MSPC charts applied to simulated outbreaks injected into both deseasonalized and raw data.

Submitted by elamb on
Description

Objective

This paper describes a series of data mining techniques used to gather, analyze, and disseminate large amounts of data from numerous sources in English as well as Chinese. The objective of the analysis is to identify locations where the data may indicate a current or future outbreak of the A-H5N1 strain of the flu virus.

Submitted by elamb on
Description

The University of Washington's Center for Public Health Informatics, in collaboration with the Kitsap County Health District and the UW Clinical Informatics Research Group, has developed the Peninsula Syndromic Surveillance Information Collection System (SSIC), a complex second-generation [1,2] distributed database system that collects heterogeneous computerized admission and discharge diagnosis data from three emergency department / urgent care facilities. We transform the heterogeneous institution-specific data to a standardized XML (eXtensible Markup Language) format, which is then transmitted to and integrated into a central database. Aberration detection algorithms are used to analyze these data so that public health officials can detect higher-than-usual incidences of the clinical syndromes under surveillance.
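The transformation step can be pictured with a small sketch. The abstract does not give the SSIC schema, so the facility names, source field names, and XML element names below are all hypothetical; the sketch only shows the general idea of mapping institution-specific records onto one shared XML layout before loading them into a central database.

```python
import xml.etree.ElementTree as ET

# Hypothetical mappings from each facility's field names to shared element names.
FIELD_MAPS = {
    "facility_a": {"visit_dt": "visit_date", "dx": "diagnosis", "pt_zip": "zip"},
    "facility_b": {"AdmitDate": "visit_date", "DischargeDx": "diagnosis", "Zip": "zip"},
}


def to_standard_xml(facility, record):
    """Render one institution-specific record as a standardized <encounter> element."""
    mapping = FIELD_MAPS[facility]
    encounter = ET.Element("encounter", attrib={"facility": facility})
    for source_field, standard_name in mapping.items():
        child = ET.SubElement(encounter, standard_name)
        # Missing source fields become empty elements rather than errors.
        child.text = str(record.get(source_field, ""))
    return ET.tostring(encounter, encoding="unicode")


# Two records with different layouts end up in the same XML shape.
print(to_standard_xml("facility_a", {"visit_dt": "2005-01-03", "dx": "487.1", "pt_zip": "98310"}))
print(to_standard_xml("facility_b", {"AdmitDate": "2005-01-03", "DischargeDx": "486", "Zip": "98366"}))
```

Keeping the per-facility differences in a mapping table means that adding a new data source in this sketch requires only a new mapping entry, not new transformation code.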

Submitted by elamb on
Description

This paper will use CDC's EARS-X to examine Tele-health's potential as an early warning system specifically for influenza-like illness compared to NACRS, as well as qualitatively comparing the resultant EARS flags to peaks in influenza activity identified by the Public Health Agency of Canada's (PHAC) Federal Influenza surveillance system (Fluwatch).
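For readers unfamiliar with how EARS-style flags arise, the sketch below implements a C1-style rule as it is commonly described: each day's count is compared with the mean and standard deviation of the preceding seven days, and a flag is raised when the standardized difference exceeds three. The abstract does not state which EARS-X methods or settings were used, so the baseline length, the threshold, the `min_sd` floor, and the sample counts are assumptions for illustration only.

```python
from statistics import mean, stdev


def c1_style_flags(counts, baseline_days=7, threshold=3.0, min_sd=0.5):
    """Return the indices of days whose count exceeds a moving-baseline threshold.

    min_sd is an assumed floor that avoids division by zero on flat baselines.
    """
    flags = []
    for t in range(baseline_days, len(counts)):
        baseline = counts[t - baseline_days:t]  # the days immediately before day t
        mu = mean(baseline)
        sd = max(stdev(baseline), min_sd)
        if (counts[t] - mu) / sd > threshold:
            flags.append(t)
    return flags


# Hypothetical daily call counts with one spike on day index 7.
daily_calls = [10, 12, 11, 9, 10, 11, 12, 30, 11]
print(c1_style_flags(daily_calls))
```

Because the baseline moves with the data, a large spike inflates the next few days' baseline variance, which can hide a sustained rise; that is one reason flags are usually compared against an independent signal, as in this abstract.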

Submitted by elamb on