
Cluster Detection

Description

Many cities in the US and the Centers for Disease Control and Prevention have deployed biosurveillance systems to monitor regional health status. Biosurveillance systems rely on algorithms that analyze data in the temporal domain (e.g., CUSUM) and/or the spatial domain (e.g., SaTScan). Spatial algorithms often require population information to normalize the counts (e.g., emergency department visits) within a geographic region. This paper presents a new algorithm, Ellipse-based Clustering Analysis (ECA), that analyzes data in both the temporal and spatial domains: it applies time series analysis to each zip code to flag abnormal counts, and pattern recognition methods to identify spatial clusters.
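As a rough illustration of the temporal side, the following is a minimal sketch of a one-sided CUSUM detector on daily counts. It is a generic textbook form, not the time series method used in the paper. The function name `cusum_alarms`, the reference value `k`, the threshold `h`, and the reset-after-alarm rule are all assumptions made for this example.

```python
def cusum_alarms(counts, baseline_mean, baseline_sd, k=0.5, h=4.0):
    """Return the indices at which an upper one-sided CUSUM signals.

    counts        -- sequence of daily counts for one region (hypothetical data)
    baseline_mean -- expected count under normal conditions
    baseline_sd   -- standard deviation of the count under normal conditions
    k             -- reference value (allowance), in standard deviations
    h             -- decision threshold, in standard deviations
    """
    if baseline_sd <= 0:
        raise ValueError("baseline_sd must be positive")
    s = 0.0
    alarms = []
    for t, x in enumerate(counts):
        z = (x - baseline_mean) / baseline_sd  # standardized deviation
        s = max(0.0, s + z - k)                # accumulate only upward drift
        if s > h:
            alarms.append(t)
            s = 0.0                            # assumption: reset after each alarm
    return alarms


if __name__ == "__main__":
    daily = [10, 11, 9, 10, 20, 22, 10]
    print(cusum_alarms(daily, baseline_mean=10, baseline_sd=2))
```

In practice the baseline mean and standard deviation would be estimated from historical data for each region, and `k` and `h` would be tuned to an acceptable false alarm rate.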

 

Objective

This paper describes a new clustering algorithm, ECA, which uses a time series algorithm to identify zip codes with abnormal counts and a pattern recognition method to identify elliptical spatial clusters. Using ellipses could help detect elongated clusters resulting from wind dispersion of bio-agents. We applied ECA to over-the-counter medicine sales. The pilot study demonstrated the potential of the algorithm to detect clustered outbreak regions that could be associated with an aerosol release of bio-agents.
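One simple way to see how an ellipse can summarize an elongated group of flagged zip codes is to compute the covariance ellipse of their centroids. The sketch below does only that. It is not the ECA pattern recognition step, whose details are not given here, and the function name `fit_ellipse` and the one-standard-deviation axis lengths are assumptions for illustration.

```python
import math


def fit_ellipse(points):
    """Summarize 2D points by a covariance ellipse.

    Returns (center, semi_major, semi_minor, angle), where the semi-axes are
    one standard deviation long along the principal directions and angle is
    the orientation of the major axis in radians.
    """
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / n
    syy = sum((y - my) ** 2 for _, y in points) / n
    sxy = sum((x - mx) * (y - my) for x, y in points) / n
    # Closed-form eigenvalues of the covariance matrix [[sxx, sxy], [sxy, syy]].
    half_trace = (sxx + syy) / 2
    spread = math.hypot((sxx - syy) / 2, sxy)
    lam_major = half_trace + spread
    lam_minor = max(half_trace - spread, 0.0)
    angle = 0.5 * math.atan2(2 * sxy, sxx - syy)
    return (mx, my), math.sqrt(lam_major), math.sqrt(lam_minor), angle


if __name__ == "__main__":
    # Hypothetical centroids of flagged zip codes lying along a diagonal.
    flagged = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (3.0, 3.0)]
    center, a, b, theta = fit_ellipse(flagged)
    print(center, a, b, math.degrees(theta))
```

A large ratio of the major to the minor axis indicates an elongated cluster, and the angle gives a direction that could be compared with, for example, the prevailing wind.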

Submitted by elamb on
Description

Syndromic surveillance focuses on organizing data into categories to detect medium- to large-scale clusters of illness. Detection often requires that a critical threshold be surpassed. Data mining, by contrast, searches through data to identify records containing keywords. New Hampshire has combined data mining with syndromic surveillance since January 2003 to improve detection capacity.

 

Objective

1. Understand the principles behind the use of syndromic surveillance and data mining.
2. Understand how New Hampshire's unique approach combining data mining with syndromic surveillance has enhanced disease surveillance efforts.
3. Describe the steps and code necessary to implement and enhance data mining.
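The keyword-based data mining described above can be sketched as a whole-word, case-insensitive search over free-text fields. This is a minimal illustration and not New Hampshire's actual code. The function name `find_keyword_records`, the record layout, and the sample keywords are hypothetical.

```python
import re


def find_keyword_records(records, keywords):
    """Return (record_id, matched_keywords) for records mentioning any keyword.

    records  -- iterable of (record_id, free_text) pairs (hypothetical layout)
    keywords -- list of words or phrases to search for, matched as whole words
    """
    if not keywords:
        return []
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(k) for k in keywords) + r")\b",
        re.IGNORECASE,
    )
    hits = []
    for record_id, text in records:
        # Deduplicate and normalize case so each keyword is reported once.
        matched = sorted({m.lower() for m in pattern.findall(text)})
        if matched:
            hits.append((record_id, matched))
    return hits


if __name__ == "__main__":
    visits = [
        (1, "Fever and RASH x2 days"),
        (2, "ankle sprain"),
        (3, "possible anthrax exposure"),
    ]
    print(find_keyword_records(visits, ["rash", "anthrax"]))
```

Unlike a syndrome category count, a search of this kind can flag a single record of interest without waiting for a threshold to be exceeded.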

Submitted by elamb on
Description

Early warning systems cannot always rely on geographical proximity to model the spread of contagious diseases. In such situations, graph structures such as air routes or social networks are more adequate. Nodes, which represent cities, are linked by edges, which represent routes between cities. Scan statistics are highly successful at evaluating clusters in maps based on geographical proximity. However, the more flexible neighborhood structure of graphs makes it difficult to apply scan statistics directly, because the structures involved are highly irregular. In addition, the traffic intensity between connected nodes plays a significant role that scan statistic based models do not usually account for.

 

Objective

We describe a model for cluster detection and inference on networks based on the scan statistic. Our aim is to detect as early as possible the appearance of an emerging cluster of syndromes due to a real outbreak (signal) amidst unrelated syndromes (noise).
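To make the idea of scanning a network concrete, the sketch below scores candidate zones with the standard Poisson log-likelihood ratio of the spatial scan statistic, taking as zones the sets of nodes within a fixed number of hops of each node. This is an assumed simplification and not the authors' model: it ignores edge traffic intensity, assumes the expected counts sum to the total case count, and omits the Monte Carlo step needed for inference. The names `poisson_llr`, `ball`, and `best_zone` are hypothetical.

```python
import math
from collections import deque


def poisson_llr(c, e, total):
    """Poisson scan log-likelihood ratio for a zone with c cases and e expected."""
    if e <= 0 or c <= e:
        return 0.0  # only zones with an excess of cases are of interest
    llr = c * math.log(c / e)
    if total > c:
        llr += (total - c) * math.log((total - c) / (total - e))
    return llr


def ball(graph, center, radius):
    """Nodes within `radius` hops of `center`, found by breadth-first search."""
    dist = {center: 0}
    queue = deque([center])
    while queue:
        node = queue.popleft()
        if dist[node] == radius:
            continue
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return frozenset(dist)


def best_zone(graph, cases, expected, max_radius=2):
    """Return (score, zone) for the highest-scoring hop-ball in the graph."""
    total = sum(cases.values())
    best = (0.0, frozenset())
    for center in graph:
        for radius in range(max_radius + 1):
            zone = ball(graph, center, radius)
            c = sum(cases[n] for n in zone)
            e = sum(expected[n] for n in zone)
            score = poisson_llr(c, e, total)
            if score > best[0]:
                best = (score, zone)
    return best


if __name__ == "__main__":
    # Hypothetical route network: five cities on a line.
    graph = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C", "E"], "E": ["D"]}
    cases = {"A": 1, "B": 1, "C": 1, "D": 8, "E": 9}
    expected = {n: 4.0 for n in graph}
    print(best_zone(graph, cases, expected))
```

Significance would normally be assessed by recomputing the maximum score on many data sets simulated under the null hypothesis and ranking the observed score among them.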

Submitted by elamb on
Description

Outbreaks of infectious diseases are identified in a variety of ways by clinicians and public health practitioners but not usually by analytic methods typically employed in syndromic surveillance. Systematic spatial-temporal analysis of statewide data may enable earlier detection of outbreaks and identification of multi-jurisdictional outbreaks.

 

Objective

Clusters of cases of individually-reportable infectious diseases were identified by a spatial-temporal retrospective analysis. Clusters were examined to determine association with previously reported outbreaks.

Submitted by elamb on
Description

Multiple or irregularly shaped spatial clusters are often found in disease or syndromic surveillance maps. We develop a novel method, based on artificial neural networks, to delineate the contours of spatial clusters, especially when there is no clearly dominant primary cluster. The method may be applied either to maps divided into regions or to point data sets.

Submitted by elamb on
Description

I examine the nature and expression of the null hypothesis often used in spatial surveillance. I also show an example of how incorrect specification of the null can lead to excess signals without interesting outbreaks, and argue that this may be a cause of excess signals when using spatial surveillance in public health applications.

Submitted by elamb on
Description

Using New York City's dead bird surveillance for West Nile Virus (WNV), this paper presents two explorations of the spatial cluster detection problem in which lagged test results are available for a random subset of observations. First, we establish a framework for the direct evaluation of methods and identify the optimal parameterization over a large family of models. We then investigate ways in which the lagged test results and other covariates might be used prospectively to extend the family of models by refining the baseline.

Submitted by elamb on
Description

Computational and statistical methods for detecting disease clusters, such as the spatial scan statistic, have become frequently used tools in epidemiology. However, they simply tell the user where a cluster is and leave the task of analyzing it to the user. Multivariate visualization tools provide one way to carry out this analysis. The approach developed in this research is instead computational, using computer vision techniques to analyze the shape of the cluster. Shape is used because different spatial processes that cause clusters, such as pollution along a river, create clusters with different shapes. Thus, it may be possible to categorize clusters by their underlying spatial processes by analyzing their shapes.

 

Objective

Many computational and statistical methods exist for detecting spatial clusters, but interpreting those clusters is left to the user. This research develops computational methods that not only detect a cluster but also analyze it to hypothesize one or more potential causes.
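As a small example of what a shape descriptor for a detected cluster might look like, the sketch below computes the isoperimetric quotient 4πA/P² of a cluster outline given as a polygon. The value is 1 for a circle and approaches 0 for long, thin shapes such as a cluster following a river. This is a generic descriptor chosen for illustration, not the computer vision technique used in this research, and the function name `polygon_compactness` is hypothetical.

```python
import math


def polygon_compactness(vertices):
    """Isoperimetric quotient 4*pi*A / P**2 of a simple polygon.

    vertices -- list of (x, y) corners in order around the boundary
    """
    n = len(vertices)
    if n < 3:
        raise ValueError("a polygon needs at least three vertices")
    twice_area = 0.0
    perimeter = 0.0
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]          # wrap around to close the polygon
        twice_area += x1 * y2 - x2 * y1          # shoelace formula term
        perimeter += math.hypot(x2 - x1, y2 - y1)
    area = abs(twice_area) / 2
    return 4 * math.pi * area / perimeter ** 2


if __name__ == "__main__":
    square = [(0, 0), (1, 0), (1, 1), (0, 1)]
    strip = [(0, 0), (10, 0), (10, 1), (0, 1)]
    print(polygon_compactness(square), polygon_compactness(strip))
```

A descriptor like this could serve as one input when sorting clusters into compact and elongated groups before any hypothesis about their cause is formed.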

Submitted by elamb on