Cluster Detection

Description

Geographic visualization methods allow analysts to visually discover clusters in multivariate, spatially referenced data. Computational and statistical cluster detection techniques can automatically detect spatial clusters of high values of a variable of interest. The authors propose that the two approaches can be complementary, and present an integration of the GeoViz Toolkit and Proclude software suites as a proof of concept.

Submitted by elamb on
Description

Many heuristics have recently been developed to find arbitrarily shaped clusters (see the review in [1]). The most popular statistic is the spatial scan statistic [2]. Nevertheless, even if all cluster solutions could be known, the problem of selecting the best cluster is ill-posed, because other measures, such as geometric regularity [3-5] or topology [6], must be taken into consideration. Most cluster-finding methods do not address this last problem. A genetic multi-objective algorithm was developed elsewhere to identify irregularly shaped clusters [5]. That method conducts a search aiming to maximize two objectives, namely the scan statistic and the regularity of shape (using the compactness concept). The solution presented is a Pareto-set, consisting of all the clusters found that are not simultaneously worse in both objectives. The significance evaluation is conducted in parallel for all the clusters in the Pareto-set through a Monte Carlo simulation, determining the best cluster solution.
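
The Pareto-set selection described above can be illustrated with a minimal sketch. It is not the authors' genetic algorithm: it only filters a list of already-found candidates, it assumes the standard dominance rule (a cluster is dropped when another is at least as good in both objectives and strictly better in one), and the candidate labels and scores are invented for illustration.

```python
from typing import List, Tuple

# (label, scan statistic, compactness); both objectives are maximized.
Candidate = Tuple[str, float, float]


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is at least as good as b in both objectives and better in one."""
    return a[1] >= b[1] and a[2] >= b[2] and (a[1] > b[1] or a[2] > b[2])


def pareto_set(candidates: List[Candidate]) -> List[Candidate]:
    """Keep every candidate that no other candidate dominates."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]


# Hypothetical candidate clusters (values are illustrative only).
candidates = [
    ("A", 12.0, 0.30),  # strongest scan statistic, least compact
    ("B", 10.0, 0.55),
    ("C", 9.0, 0.50),   # dominated by B
    ("D", 6.0, 0.90),   # most compact, weakest scan statistic
    ("E", 5.0, 0.40),   # dominated by B and D
]

front = pareto_set(candidates)
print([label for label, _, _ in front])
```

With these values the sketch keeps A, B and D: each trades scan statistic against compactness, so none can be discarded without a further criterion such as the significance evaluation.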

Objective

Irregularly shaped clusters occur naturally in disease surveillance, but they are not well defined. The number of possible clusters increases exponentially with the number of regions in a map. This reduces the power of detection, motivating the use of some kind of penalty function to avoid excessive freedom of shape. We introduce a weak-link-based correction that penalizes inconsistent clusters without forbidding the geographically interesting irregularly shaped ones.

Submitted by elamb on
Description

Electronic Health Record (EHR) data offer the researcher a potentially rich source of data for tracking disease syndromes. Procedures performed on the patient, medications prescribed (not necessarily filled by the patient), and reason for visit are just some characteristics of the patient encounter that are available through an EHR and can be used to define surveillance syndromes. Since procedures have not been used frequently in defining syndromes, encounter-level procedures data, extracted from the EHR of a large local primary care practice with about 200,000 patient encounters per year, were used to identify procedures associated with an established respiratory syndrome.

Objective

To investigate the utility of different sources of patient encounter information, particularly in the primary care setting, that can be used to characterize surveillance syndromes, such as respiratory or flu.

Submitted by elamb on
Description

The spatial scan statistic is the usual measure of the strength of a cluster [1]. Another important measure is its geometric regularity [2]. A genetic multiobjective algorithm was developed elsewhere to identify irregularly shaped clusters [3]. A search is executed aiming to maximize two objectives, namely the scan statistic and the regularity of shape (using the compactness concept). The solution presented is a Pareto-set, consisting of all the clusters found that are not simultaneously worse in both objectives. A significance evaluation is conducted in parallel for all clusters in the Pareto-set through Monte Carlo simulation, which then identifies the most likely cluster.
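
A minimal sketch of the scan statistic and its Monte Carlo significance test may help here. It assumes the Poisson form of the spatial scan statistic and a small, fixed list of candidate zones; it does not reproduce the parallel evaluation over the Pareto-set that the abstract describes. The case counts, zones, and function names are hypothetical.

```python
import math
import random


def poisson_llr(c, e, total):
    """Log-likelihood ratio for a zone with c observed and e expected cases.

    Assumes expected counts are scaled so they sum to `total` over the map.
    Only excesses (c > e) count as clusters.
    """
    if c <= e:
        return 0.0
    inside = c * math.log(c / e)
    outside = 0.0 if c == total else (total - c) * math.log((total - c) / (total - e))
    return inside + outside


def max_llr(cases, expected, zones):
    """Largest LLR over the candidate zones (each zone is a set of region indices)."""
    total = sum(cases)
    return max(
        poisson_llr(sum(cases[i] for i in z), sum(expected[i] for i in z), total)
        for z in zones
    )


def monte_carlo_p(cases, expected, zones, n_sims=999, seed=0):
    """Rank the observed maximum LLR among maxima from null simulations."""
    rng = random.Random(seed)
    total = sum(cases)
    observed = max_llr(cases, expected, zones)
    regions = list(range(len(cases)))
    exceed = 0
    for _ in range(n_sims):
        # Null hypothesis: each case falls in a region in proportion to its expected count.
        sim = [0] * len(cases)
        for r in rng.choices(regions, weights=expected, k=total):
            sim[r] += 1
        if max_llr(sim, expected, zones) >= observed:
            exceed += 1
    return observed, (exceed + 1) / (n_sims + 1)


# Hypothetical map: six regions with equal expected counts, excess in region 0.
cases = [30, 8, 10, 12, 9, 11]
expected = [sum(cases) / len(cases)] * len(cases)
zones = [{0}, {0, 1}, {1, 2}, {3, 4, 5}]

observed, p_value = monte_carlo_p(cases, expected, zones)
print(f"max LLR = {observed:.2f}, Monte Carlo p = {p_value:.3f}")
```

Taking the maximum over all zones inside every simulation is what adjusts the p-value for the many zones examined.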

Objective

Situations where a disease cluster does not have a regular shape are fairly common. Moreover, maps with multiple clusters, where there is no clearly dominant primary cluster, also occur frequently. We would like to develop a method to analyze more thoroughly the several levels of clustering that arise naturally in a disease map divided into m regions.

Submitted by elamb on
Description

The City of Atlanta, volunteer organizations, and the faith community operate several homeless shelters throughout the city. Services available at these shelters vary, ranging from day services, such as meals, mail collection, and medical clinics, to overnight shelter accommodations. In addition to the medical clinics available at these facilities, the Atlanta homeless population also utilizes emergency departments in Fulton County for their health care needs.

Objective

This paper describes a cluster of Streptococcus pneumoniae infections identified through emergency department syndromic surveillance.

Submitted by elamb on
Description

CDC’s BioSense system provides near-real-time situational awareness for public health monitoring through analysis of electronic health data. Determination of anomalous spatial and temporal disease clusters is a crucial part of the daily disease monitoring task. Spatial approaches depend strongly on having reliable estimated values for counts among the geographic sub-regions. If estimates are poor, algorithms will find irrelevant clusters, and clusters of importance may be missed. While many studies have focused on improved computation time and more general cluster shapes, our effort focused on finding anomalies that are correct according to available BioSense data history.

Objective

We applied spatial scan statistics to data from CDC’s BioSense system and examined the effect of the spatial prediction method on determination of anomalous disease clusters. The objectives were to decide on a reliable spatial estimation method for one BioSense data source and to establish criteria for making this decision using other sources.
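
The dependence of the scan statistic on the expected-count estimate can be shown with a small sketch. The two baseline estimators below (mean of all history versus the most recent period only) are hypothetical stand-ins, not the estimation methods evaluated in the study. The counts are invented, and the Poisson form of the scan statistic is an assumption.

```python
import math


def poisson_llr(c, e, total):
    """Poisson scan log-likelihood ratio for c observed versus e expected cases."""
    if c <= e:
        return 0.0
    inside = c * math.log(c / e)
    outside = 0.0 if c == total else (total - c) * math.log((total - c) / (total - e))
    return inside + outside


def scaled_expected(baseline, total):
    """Rescale a per-region baseline so expected counts sum to today's total."""
    s = sum(baseline)
    return [total * b / s for b in baseline]


# Hypothetical history: rows are past periods, columns are three sub-regions.
history = [
    [20, 10, 10],
    [22, 9, 11],
    [40, 10, 10],  # most recent period, region 0 already elevated
]
today = [45, 10, 10]
total = sum(today)

# Estimator 1: mean of all history. Estimator 2: most recent period only.
mean_baseline = [sum(col) / len(history) for col in zip(*history)]
last_baseline = history[-1]

llr_mean = poisson_llr(today[0], scaled_expected(mean_baseline, total)[0], total)
llr_last = poisson_llr(today[0], scaled_expected(last_baseline, total)[0], total)
print(f"LLR with mean baseline: {llr_mean:.2f}")
print(f"LLR with last-period baseline: {llr_last:.2f}")
```

In this toy case the last-period baseline has already absorbed the rise in region 0, so the same observed counts score a much lower statistic. This is one way a poor estimate can hide a cluster of importance.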

Submitted by elamb on
Description

In 2004, the Indiana State Department of Health (ISDH) contracted with the Regenstrief Institute to build an information exchange infrastructure to support the collection of surveillance data. This pilot program involves implementation of electronic reporting in 46 of the state’s 114 emergency departments. Chief complaint data are collected and analyzed to identify clusters of disease earlier than a diagnosis can be confirmed or the disease reported to the ISDH. The system utilized the chief complaint coder CoCo to map the chief complaints into one of eight syndromes. This evaluation was completed after one-third of the pilot facilities were operational.

Objective

This evaluation was conducted to determine if any pilot hospitals have operational practices that may affect the ability of the Public Health Emergency Surveillance System to accurately and efficiently identify clusters of infectious disease in Indiana.

Submitted by elamb on
Description

The abattoir and the fallen stock surveys constitute the active surveillance component aimed at improving the detection of scrapie across the EU. Previous studies have suggested that the operation of the surveys differs significantly across Europe. Del Rio Vilas et al. assessed the presence of heterogeneity between the observed prevalence estimates of 18 EU countries by means of a meta-analysis and showed a large residual variability, indicating an inconsistent approach to the surveys across the EU. These differences merit study because they point to discrepancies in the performance of the surveys between countries. In the absence of sufficient covariate information to explain the observed variability across countries, we can model, still under the general context of the meta-analysis, the unobserved heterogeneity in our data. Countries could be grouped into clusters representing the underlying subpopulations relative to the risk of scrapie between the two surveys in each country.
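
Between-country heterogeneity in a meta-analysis is commonly summarized with Cochran's Q and the I² statistic. The sketch below assumes those two measures and uses invented per-country estimates. The abstract does not say which heterogeneity statistic the cited study used, so this illustrates the general idea only.

```python
def heterogeneity(estimates, variances):
    """Return the fixed-effect pooled estimate, Cochran's Q, and I-squared.

    estimates: per-country effect estimates (e.g. log risk ratios)
    variances: their sampling variances
    """
    weights = [1.0 / v for v in variances]  # inverse-variance weights
    pooled = sum(w * y for w, y in zip(weights, estimates)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, estimates))
    df = len(estimates) - 1
    # I-squared: share of total variation attributed to heterogeneity, not chance.
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0
    return pooled, q, i2


# Hypothetical log risk ratios (fallen stock vs. abattoir) for four countries.
estimates = [0.1, 0.9, -0.5, 1.4]
variances = [0.04, 0.04, 0.04, 0.04]

pooled, q, i2 = heterogeneity(estimates, variances)
print(f"pooled = {pooled:.3f}, Q = {q:.2f}, I^2 = {i2:.1%}")
```

A large I² indicates that the countries do not share a single underlying effect, which is the situation that motivates grouping them into clusters.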

Objective

In the present study we assessed the standardisation of the active surveillance of scrapie over time across the EU and identified countries with similar underlying characteristics, allowing comparisons between them.

Submitted by elamb on
Description

The utility of syndromic surveillance systems to augment health departments’ traditional surveillance for naturally occurring disease has not been prospectively evaluated.

Objective

In this interim report we describe the signals detected by a real-time ambulatory care-based syndromic surveillance system and discuss their relationship to true outbreaks of illness.

Submitted by elamb on
Description

Capital Health is a regional health care organization, which provides services for over one million inhabitants in the Edmonton area of Alberta, Canada. Traditionally, disease surveillance under its jurisdiction has been paper-based and records maintained by different departments in several locations. Before the Alberta Real Time Syndromic Surveillance Net (ARTSSN), there was no centralized database or unified approach to surveillance and automated reporting despite rich electronic health data in the region. The existing labor-intensive manual surveillance process is inefficient and inherently susceptible to human error. Its effectiveness is sub-optimal in detecting outbreaks of emerging infectious diseases, and clusters of injuries or toxic exposures. The ultimate objective of ARTSSN is to enhance public health surveillance through earlier and more sensitive detection of clusters and trends, with subsequent tracking and response through an integrated, automated surveillance and reporting system.

Objective

ARTSSN is a pilot public health surveillance project developed for the Capital Health region of Alberta, Canada and funded by Alberta Health and Wellness. This paper describes the advantages of using ARTSSN and comparing information derived from multiple electronic data sources simultaneously for real time syndromic surveillance.

Submitted by elamb on