
Cluster Detection

Description

This study uses data on births in New York City between 2000 and 2005 to investigate the spatial pattern of birthweight and gestation, two primary risk factors for infant mortality. The analysis uses SaTScan to perform normal-distribution cluster detection after controlling for individual-level demographic variables. While previous research has investigated neighborhood effects and spatial patterns of low birth weight and infant mortality, few studies have done so with individual-level information and continuous outcomes. The overarching goal is to develop a framework to better understand demographic and spatial patterns of infant mortality, birthweight, and gestation to inform public health practice.

Submitted by elamb on
Description

The traditional SaTScan algorithm [1],[2] uses the Euclidean distance between the centroids of the regions in a map to assemble connected sets of regions (connected in the sense that two connected regions share a physical border). According to the value of its logarithm of the likelihood ratio (LLR), a connected set of regions can be classified as a statistically significant detected cluster. For events such as contagious diseases or homicides, we consider using the flow of people between two regions to build up a set of regions (zone) with a high incidence of cases of the event. In this sense, two regions are treated as closer the greater the flow of people between them. In a cluster of regions formed according to this criterion of proximity by flow of people, the regions are not necessarily connected to each other.
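The idea described above can be illustrated with a minimal Python sketch. It assumes a Poisson model with the usual scan-statistic log-likelihood ratio, and it builds candidate zones by adding the regions with the largest flow to a center region instead of the nearest centroids. The function names, the `flow` dictionary layout, and the toy numbers are hypothetical and are not taken from the abstract; significance testing by Monte Carlo replication is omitted.

```python
import math

def poisson_llr(cases_in, expected_in, total_cases):
    """Poisson scan-statistic log-likelihood ratio for one candidate zone."""
    c, e, C = cases_in, expected_in, total_cases
    if c <= e:
        return 0.0  # only zones with more cases than expected are of interest
    inside = c * math.log(c / e)
    outside = (C - c) * math.log((C - c) / (C - e)) if C > c else 0.0
    return inside + outside

def flow_zones(center, flow, max_size):
    """Candidate zones: the center plus its strongest flow partners (sizes 1..max_size)."""
    partners = sorted((r for r in flow[center] if r != center),
                      key=lambda r: flow[center][r], reverse=True)
    return [[center] + partners[:k] for k in range(max_size)]

def best_zone(cases, expected, flow, max_size):
    """Return (LLR, zone) for the highest-scoring flow-based zone."""
    total = sum(cases.values())
    best = (0.0, [])
    for center in cases:
        for zone in flow_zones(center, flow, max_size):
            c = sum(cases[r] for r in zone)
            e = sum(expected[r] for r in zone)
            score = poisson_llr(c, e, total)
            if score > best[0]:
                best = (score, zone)
    return best

# Toy data (assumed): A and B exchange many people but need not share a border.
cases = {"A": 30, "B": 25, "C": 5, "D": 4}
expected = {"A": 16, "B": 16, "C": 16, "D": 16}  # sums to the total case count
flow = {
    "A": {"B": 100, "C": 5, "D": 1},
    "B": {"A": 100, "C": 2, "D": 3},
    "C": {"A": 5, "B": 2, "D": 50},
    "D": {"A": 1, "B": 3, "C": 50},
}
score, zone = best_zone(cases, expected, flow, max_size=2)
```

In this toy setting the zone formed by A and B scores higher than either region alone, even though nothing in the data requires them to be adjacent.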

 

Objective

We present a new approach to the circular scan method [1] that uses the flow of people to detect and infer clusters of regions with high incidence of some event randomly distributed in a map. We use a real database of homicide cases in Minas Gerais state, in southeast Brazil, to compare our proposed method with the original circular scan method in a study of simulated clusters and the real situation.

Submitted by dbedford on
Description

To develop and implement an effective rabies eradication program in Ukraine, a unique collection of pathological material samples confirmed as rabies-positive at the regional veterinary laboratories of Ukraine was founded in 2008. The collection is constantly updated and currently includes 1389 samples from all regions of Ukraine, taken from 17 animal species and humans.

Objective:

To identify the presence of genetic clusters of rabies virus in the territory of Ukraine and to determine the degree of activity of rabies vaccines against these genetic clusters.

 

Submitted by Magou on
Description

Multiple data sources are essential to provide reliable information regarding the emergence of potential health threats, compared to single source methods [1,2]. Spatial Scan Statistics have been adapted to analyze multivariate data sources [1]. In this context, only ad hoc procedures have been devised to address the problem of selecting the most likely cluster and computing its significance. A multi-objective scan was proposed to detect clusters for a single data source [3].

Objective:

To incorporate information from multiple data streams of disease surveillance to achieve more coherent spatial cluster detection using statistical tools from multi-criteria analysis.

Submitted by Magou on
Description

In July 2012, 54 children infected with enterovirus 71 (EV71) died in Cambodia. The media called it a mystery illness, which worried parents across Asia. In fact, severe enterovirus epidemics have occurred frequently in Asia, including in Malaysia, Singapore, Taiwan and China. Clinical severity ranges from asymptomatic through mild (hand-foot-mouth disease and herpangina) to severe (pulmonary edema/hemorrhage and encephalitis). The development of an EV71 vaccine and of more effective antiviral drugs is still ongoing. Therefore, surveillance to monitor enterovirus activity and to understand the epidemiological characteristics of mild and severe enterovirus cases is crucial.

Objective

This study aimed to elucidate the spatio-temporal correlations between mild and severe enterovirus cases by integrating three enterovirus-related surveillance systems in Taiwan. With a full understanding of these epidemiological characteristics, we hope to develop better measures and indicators from mild cases that provide early warning signals and thus minimize the subsequent number of severe cases.

Submitted by teresa.hamby@d… on
Description

The Joint VA/DoD BioSurveillance System for Emerging Biological Threats project seeks to improve situational awareness of the health of VA/DoD populations by combining their respective data. Each system uses a version of the Electronic Surveillance System for Early Notification of Community-Based Epidemics (ESSENCE); a combined version is being tested. The current effort investigated combining the datasets for disease cluster detection. We compared results of retrospective cluster detection studies using both separate and joined data.

- Does combining datasets worsen the rate of background cluster determination?

- Does combining datasets mask clusters detected on the separate datasets?

- Does combining datasets find clusters that the separate datasets alone would miss?

Objective:

We examined the utility of combining surveillance data from the Departments of Defense (DoD) and Veterans Affairs (VA) for spatial cluster detection.

 

Submitted by Magou on
Description

Early detection of heroin overdose clusters is important in the current battle against the opioid crisis to effectively implement prevention and control measures. The New York State syndromic surveillance system collects hospital emergency department (ED) visit data, including visit time, chief complaint, and patient zip code. These data can be used to identify potential heroin overdose outbreaks in a timely manner by detecting spatio-temporal case clusters with a scan statistic.
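As a rough illustration of how visit time and patient zip code can feed a space-time scan, the Python sketch below scores each zip code over the most recent days. The abstract does not say which scan statistic was used; this sketch assumes a space-time permutation style expectation (zip total times recent-window total, divided by all visits) and scans only single-zip zones with one fixed window, which is a deliberate simplification. The function name and the toy visits are hypothetical.

```python
import math
from collections import Counter

def space_time_scores(visits, window_days):
    """Score each zip code over the last `window_days` days.

    visits: list of (zip_code, day_index) pairs, one per ED visit.
    Returns {zip_code: log-likelihood ratio}, 0.0 where no excess is seen.
    """
    total = len(visits)
    by_zip = Counter(z for z, _ in visits)
    by_day = Counter(d for _, d in visits)
    last_day = max(by_day)
    recent_days = range(last_day - window_days + 1, last_day + 1)
    recent_total = sum(by_day[d] for d in recent_days)
    recent_by_zip = Counter(z for z, d in visits if d in recent_days)

    scores = {}
    for z, n_zip in by_zip.items():
        observed = recent_by_zip[z]
        # Expected count if space and time were independent (assumption).
        expected = n_zip * recent_total / total
        if observed <= expected:
            scores[z] = 0.0
            continue
        llr = observed * math.log(observed / expected)
        if total > observed:
            llr += (total - observed) * math.log(
                (total - observed) / (total - expected))
        scores[z] = llr
    return scores

# Toy data (assumed): zip "B" has four extra visits on the last day.
visits = ([("A", d) for d in range(10)]
          + [("B", d) for d in range(10)]
          + [("B", 9)] * 4)
scores = space_time_scores(visits, window_days=2)
```

A full analysis would also scan groups of neighboring zip codes and several window lengths, and would assess significance by Monte Carlo replication.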

Objective:

To use syndromic surveillance data for timely detection of heroin overdose outbreaks in the community.

Submitted by elamb on
Description

At the Governor’s Opioid Addiction Crisis Datathon in September 2017, a team of Booz Allen data scientists participated in a two-day hackathon to develop a prototype surveillance system for business users to locate areas of high risk across multiple indicators in the State of Virginia. We addressed 1) how different geographic regions experience the opioid overdose epidemic differently, by clustering similar counties by socioeconomic indicators, and 2) how to facilitate better data sharing between health care providers and law enforcement. We believe this inexpensive, open source surveillance approach could be applied in states across the nation, particularly those with high rates of death due to drug overdoses and those with significant increases in death.

Objective:

A team of data scientists from Booz Allen competed in an opioid hackathon and developed a prototype opioid surveillance system using data science methods. This presentation intends to 1) describe the positives and negatives of our data science approach, 2) demo the prototype applications built, and 3) discuss next steps for local implementation of a similar capability.

Submitted by elamb on
Description

Influenza is a highly contagious, acute respiratory disease that causes periodic seasonal epidemics and global pandemics [1]. Yunnan Province is characterized by poverty, ethnic diversity, and cross-border movement, which may make it susceptible to influenza (Fig-1). Findings on the spatial pattern of influenza-like illness (ILI) will help to control and prevent epidemics of respiratory disease.

Objective

The purpose of the study was to determine spatial clustering of the spread of the influenza-like illness (ILI) epidemic in Yunnan province, China, with the aim of producing useful information for prevention and control measures.

 

Submitted by Magou on
Description

Typical approaches to monitoring ED data classify cases into pre-defined syndromes and then monitor syndrome counts for anomalies. However, syndromes cannot be created to identify every possible cluster of cases of relevance to public health. To address this limitation, NC DETECT’s approach clusters cases by arrival times and monitors the textual chief complaint data associated with each identified cluster for relevant similarities [1]. This approach is time consuming and limited in its ability to detect emerging outbreaks that are dispersed across time. A new method is needed to automatically identify clusters of interest that would not be detected by existing syndromes. Clusters may be based on symptoms, events, place names, arrival time, or hospital location. The NC DPH dataset describes 198,511 de-identified ED visits over one year at 3 North Carolina hospitals. The data include chief complaint, altered date and time of arrival, hospital A/B/C, and age group. About 40 simulated outbreaks were injected into the data set by the NC DETECT team. For example, an injected cluster might consist of 4 patients who report getting sick after eating at a particular restaurant.
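To make the arrival-time idea concrete, here is a minimal Python sketch that groups visits arriving within a fixed time window and reports chief-complaint terms shared by several of them. It is only an illustration of grouping by arrival time and comparing free text; it is not NC DETECT's implementation and not the semantic scan statistic itself. The stopword list, function name, thresholds, and example complaints are all hypothetical.

```python
from collections import Counter

# Illustrative stopword list (assumption); a real system would need a richer one.
STOPWORDS = {"and", "the", "of", "after", "at", "with"}

def shared_terms(visits, window_minutes, min_visits):
    """Find terms shared by at least `min_visits` visits arriving close together.

    visits: list of (arrival_minute, chief_complaint) pairs.
    Returns a list of (window_start, {term: number_of_visits}).
    """
    visits = sorted(visits)
    flagged = []
    for i, (start, _) in enumerate(visits):
        counts = Counter()
        for arrival, text in visits[i:]:
            if arrival - start > window_minutes:
                break
            # Count each term once per visit, ignoring case and stopwords.
            counts.update(set(text.lower().split()) - STOPWORDS)
        common = {term: n for term, n in counts.items() if n >= min_visits}
        if common:
            flagged.append((start, common))
    return flagged

# Toy data (assumed): four patients mention the same restaurant within an hour.
visits = [
    (0, "vomiting after eating at Luigis"),
    (20, "nausea ate at Luigis"),
    (45, "diarrhea Luigis restaurant"),
    (50, "vomiting luigis"),
    (400, "ankle injury"),
]
flagged = shared_terms(visits, window_minutes=60, min_visits=4)
```

The sketch also shows the limitation noted above: a cluster whose cases are spread over more than one window would not reach the threshold in any single window.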

Objective

We apply a novel semantic scan statistic approach to solve a problem posed by the NC DETECT team, North Carolina Division of Public Health (NC DPH) and UNC Department of Emergency Medicine Carolina Center for Health Informatics, and facilitated by the ISDS Technical Conventions Committee. This use case identifies a need for methodology that detects emerging, potentially novel outbreaks in free-text emergency department (ED) chief complaint data.

 

Submitted by Magou on