Algorithm

Description

By capturing the spatio-temporal organization of the data using a graph, GraphScan avoids the challenges associated with trying to “fit” incoming data into moving windows of predefined shapes and sizes. Whereas the popular space-time permutation scan statistic [1] attempts to find clusters within space-time volumes of predefined shape, GraphScan employs no such preconceptions about the form of the clusters. Instead, clusters are allowed to “evolve” freely to better reflect the structural properties of the data. Moreover, GraphScan is capable of tracking possible causal relationships between spatio-temporal events.

Objective

This paper proposes an efficient and flexible algorithm applicable to spatio-temporal aberration detection in public health data.

Submitted by elamb on
Description

Many heuristics have been developed recently to find arbitrarily shaped clusters (see review [1]). The most popular statistic is the spatial scan [2]. Nevertheless, even if all cluster solutions could be known, the problem of selecting the best cluster is ill posed, because other measures, such as geometric regularity [3-5] or topology [6], must be taken into consideration. Most cluster-finding methods do not address this last problem. A genetic multi-objective algorithm was developed elsewhere to identify irregularly shaped clusters [5]. That method conducts a search aiming to maximize two objectives, namely the scan statistic and the regularity of shape (using the compactness concept). The solution presented is a Pareto set, consisting of all the clusters found that are not simultaneously worse in both objectives. The significance evaluation is conducted in parallel for all the clusters in the Pareto set through a Monte Carlo simulation, determining the best cluster solution.
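The Pareto set described above can be illustrated with a minimal sketch. It assumes the usual non-dominated definition (a cluster is dropped when another candidate is at least as good in both objectives and strictly better in one) and that higher values are better for both objectives. The `Candidate` type, the field names, and the numeric values are hypothetical and are not taken from the cited method [5].

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    """Hypothetical candidate cluster with its two objective values."""
    name: str
    scan_statistic: float  # assumed: higher is better
    compactness: float     # assumed: higher means a more regular shape


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is at least as good as b in both objectives and better in one."""
    at_least_as_good = (a.scan_statistic >= b.scan_statistic
                        and a.compactness >= b.compactness)
    strictly_better = (a.scan_statistic > b.scan_statistic
                       or a.compactness > b.compactness)
    return at_least_as_good and strictly_better


def pareto_set(candidates: List[Candidate]) -> List[Candidate]:
    """Keep only the candidates that no other candidate dominates."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]


# Illustrative values only.
candidates = [
    Candidate("A", 12.0, 0.30),
    Candidate("B", 10.0, 0.55),
    Candidate("C", 9.0, 0.50),   # dominated by B
    Candidate("D", 7.0, 0.80),
    Candidate("E", 11.0, 0.25),  # dominated by A
]
front = pareto_set(candidates)
print([c.name for c in front])
```

In this toy example, clusters C and E are discarded and the remaining clusters would each go on to the Monte Carlo significance evaluation.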

Objective

Irregularly shaped clusters occur naturally in disease surveillance, but they are not well defined. The number of possible clusters increases exponentially with the number of regions in a map. This reduces the power of detection, motivating the use of some kind of penalty function to avoid excessive freedom of shape. We introduce a weak-link-based correction that penalizes inconsistent clusters without forbidding the geographically interesting irregularly shaped ones.

Description

The Public Health Agency of Canada is currently utilizing a syndromic surveillance prototype called the Canadian Early Warning System (CEWS). This system monitors several live data feeds, including emergency room chief complaint records from all seven local hospitals, Telehealth (24/7 nurse hotline) calls, and over-the-counter drug sales from a number of the large chain drug stores. Data trends are analysed for aberrations as early indicators of outbreak events. Collaborators on this Winnipeg, Manitoba-based pilot include the Winnipeg Regional Health Authority and IBM Business Solutions. Algorithms currently in CEWS include the 3-, 5- and 7-day moving averages, CUSUM and the CDC’s EARS. We seek to investigate the performance of these algorithms, since their detection ability may depend upon the data source and/or the type of outbreak event.
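As a rough illustration of the kind of detectors named above, the following sketch shows a generic moving-average rule and a generic one-sided CUSUM on daily counts. It is not the CEWS or EARS implementation; the function names, the threshold values (`threshold`, `k`, `h`), the `min_sd` floor, and the sample counts are all assumptions chosen for illustration.

```python
from statistics import mean, stdev


def moving_average_alerts(counts, window=7, threshold=3.0, min_sd=1.0):
    """Flag day t when its count exceeds the mean of the previous `window`
    days by more than `threshold` standard deviations (assumed rule)."""
    alerts = []
    for t in range(window, len(counts)):
        baseline = counts[t - window:t]
        sd = max(stdev(baseline), min_sd)  # floor avoids division by ~0
        if (counts[t] - mean(baseline)) / sd > threshold:
            alerts.append(t)
    return alerts


def cusum_alerts(counts, baseline_days=7, k=0.5, h=4.0, min_sd=1.0):
    """One-sided CUSUM on standardized counts: S_t = max(0, S_{t-1} + z_t - k),
    alert when S_t > h. Baseline mean and sd come from the first days."""
    baseline = counts[:baseline_days]
    mu = mean(baseline)
    sd = max(stdev(baseline), min_sd)
    s = 0.0
    alerts = []
    for t in range(baseline_days, len(counts)):
        z = (counts[t] - mu) / sd
        s = max(0.0, s + z - k)
        if s > h:
            alerts.append(t)
    return alerts


# Hypothetical daily counts with a jump starting on day index 10.
counts = [10, 12, 11, 9, 10, 11, 10, 12, 10, 11, 25, 27, 30]
ma_alerts = moving_average_alerts(counts)
cs_alerts = cusum_alerts(counts)
print(ma_alerts, cs_alerts)
```

Note that the moving-average rule absorbs the elevated counts into its baseline after the first alert, while the CUSUM with a fixed baseline keeps accumulating; differences of this kind are one reason detection ability can vary by data source and outbreak type.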

Objective

To determine the sensitivity, specificity and days to detection of several commonly used algorithms in syndromic surveillance systems.
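One simple way to compute these three measures is sketched below, assuming a day-level definition: each day is either an outbreak day or not, and each day either raised an alert or did not. The abstract does not state how the measures were defined, so the function name, the day-level scoring, and the example numbers are assumptions.

```python
def evaluate(alert_days, outbreak_days, n_days):
    """Day-level sensitivity, specificity, and days to detection.

    Assumes all day indices lie in range(n_days) and that there is a single
    outbreak period given by `outbreak_days`.
    """
    alerts = set(alert_days)
    outbreak = set(outbreak_days)
    tp = len(alerts & outbreak)   # outbreak days with an alert
    fn = len(outbreak - alerts)   # outbreak days without an alert
    fp = len(alerts - outbreak)   # alerts on non-outbreak days
    tn = n_days - tp - fn - fp    # quiet non-outbreak days
    sensitivity = tp / (tp + fn) if (tp + fn) else None
    specificity = tn / (tn + fp) if (tn + fp) else None
    hits = sorted(alerts & outbreak)
    # Days from outbreak start to the first alert inside the outbreak.
    days_to_detection = hits[0] - min(outbreak) if hits else None
    return sensitivity, specificity, days_to_detection


# Hypothetical scenario: 30 days, outbreak on days 10-14, alerts on 3, 12, 13.
sens, spec, delay = evaluate([3, 12, 13], range(10, 15), 30)
print(sens, spec, delay)
```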

Description

Objective: This paper describes a series of data mining techniques used to gather, analyze, and disseminate large amounts of data from numerous sources in English as well as Chinese. The analysis aims to identify locations where the data may indicate a current or future outbreak of the A-H5N1 strain of the flu virus.

Description

Most research in syndromic surveillance has emphasized early detection, but clinical diagnosis of the index case will tend to occur before detection by syndromic surveillance for certain types of outbreaks [1]. Syndromic surveillance may, however, still play an important role in rapidly characterizing the outbreak size because there will be additional non-diagnosed symptomatic cases in the medical system when the index case is identified. Other authors have shown that the temporal pattern of symptomatic cases could be used to project the total outbreak size, but their approach requires a priori knowledge of the incubation curve for the specific anthrax strain and exposure level [2]. In this paper, we focus on estimating the number of non-diagnosed symptomatic cases at the time of detection without making assumptions about the exposure level or disease course.

Objective 

Upon detection of an inhalational anthrax attack, a critical priority for the public health response would be to characterize the size and extent of the outbreak. Our objective is to assess the potential role of syndromic surveillance in estimating the outbreak size.

Description

Benchmarking of temporal surveillance techniques is a critical step in the development of an effective syndromic surveillance system. Unfortunately, holding “bakeoffs” to blindly compare approaches is a difficult and often fruitless enterprise, in part due to the parameters left to the final user for tuning. In this paper, we demonstrate how common analytical development and analysis may be coupled with realistic data sets to provide insight and robustness when selecting a surveillance technique.

Objective

This paper compares the robustness and performance of three temporal surveillance techniques using a twofold approach: 1) a unifying statistical analysis to establish their common features and differences, and 2) a benchmarking on respiratory, influenza-like illness, upper GI, and lower GI complaint time series from Harvard Pilgrim Health Care (HPHC).

Description

Approximately one quarter of people treated for tuberculosis (TB) have no supporting microbiology, and thus are not detectable through laboratory reporting systems. Health departments depend upon clinicians to report these cases, but underreporting is substantial. We previously described the performance of several algorithms for TB detection using electronic medical record (EMR) and claims data, and noted good sensitivity when screening for >2 anti-TB drugs; however, the positive predictive value was only 30%. We re-evaluated this and other algorithms in light of evolving TB treatment practices and enhanced ability to apply complex decision rules to EMR data in real time.
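A drug-count screening rule of this kind, and the positive predictive value (PPV) used to judge it, can be sketched as follows. This is only an illustration: the drug list, the patient records, and the function names are hypothetical, and the threshold is read literally from the text as "more than 2" distinct anti-TB drugs.

```python
# Illustrative list of anti-TB drugs (assumption, not the study's list).
ANTI_TB_DRUGS = {"isoniazid", "rifampin", "pyrazinamide", "ethambutol"}


def flag_patients(prescriptions, more_than=2):
    """Return IDs of patients prescribed more than `more_than` distinct
    anti-TB drugs. `prescriptions` maps patient ID -> set of drug names."""
    return {pid for pid, drugs in prescriptions.items()
            if len(drugs & ANTI_TB_DRUGS) > more_than}


def positive_predictive_value(flagged, confirmed):
    """PPV = true positives / all flagged patients."""
    if not flagged:
        return None
    return len(flagged & confirmed) / len(flagged)


# Hypothetical records.
prescriptions = {
    "p1": {"isoniazid", "rifampin", "pyrazinamide", "ethambutol"},
    "p2": {"isoniazid"},
    "p3": {"rifampin", "isoniazid", "ethambutol"},
    "p4": {"amoxicillin"},
}
confirmed_tb = {"p1"}

flagged = flag_patients(prescriptions)
ppv = positive_predictive_value(flagged, confirmed_tb)
print(sorted(flagged), ppv)
```

A low PPV means that many flagged patients are not true cases, which is the weakness that more complex decision rules are meant to address.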

Objective

To develop algorithms for case detection of TB using EMR data to improve notifiable disease reporting.
