Statistical Methods

Description

Recently published studies evaluate statistical alerting methods for disease surveillance by their ability to detect modeled signals against a background of either authentic historical data or randomized samples. Differences in regional and jurisdictional data, collection and filtering methods, investigation resources, monitoring objectives, and system requirements have hindered acceptance of a standard monitoring methodology. The signature of a disease outbreak and the baseline data behavior depend on various factors, including population coverage, quality and timeliness of data, symptomatology, and the care-seeking behavior of the monitored population. For this reason, statistical process control methods based on standard data distributions or stylized signals may not alert as desired. Practical algorithm evaluation and adjustment may be possible by judging algorithm performance according to the preferences of experienced human monitors.

 

Objective

This presentation gives a method of monitoring surveillance time series on the basis of human expert preference. The method does not require a detailed history for the current series, modeling expertise, or a well-defined data signal. It is designed for application to many data types, without the need for a sophisticated environment or historical data analysis.

Submitted by hparton on
Description

Scan statistics are highly successful for the evaluation of space-time clusters. Recently, concepts from graph theory have been applied to evaluate the set of potential clusters. Wieland et al. introduced a graph-theoretical method for detecting arbitrarily shaped clusters on the basis of the Euclidean minimum spanning tree of cartogram-transformed case locations. The method is quite effective, but its cartogram construction step is computationally expensive and complicated.
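As a rough illustration of the spanning-tree idea mentioned above, the following sketch builds a Euclidean minimum spanning tree with Prim's algorithm. It works on raw point coordinates; the cartogram transformation used by Wieland et al. is not reproduced here, and the function name `euclidean_mst` is a hypothetical choice for this example.

```python
import math

def euclidean_mst(points):
    """Return MST edges (parent, child, length) for 2-D points via Prim's algorithm."""
    n = len(points)
    if n == 0:
        return []
    in_tree = [False] * n
    best_dist = [math.inf] * n   # cheapest known link from the tree to each point
    best_parent = [-1] * n       # tree vertex providing that cheapest link
    best_dist[0] = 0.0           # start the tree at point 0
    edges = []
    for _ in range(n):
        # Pick the point outside the tree that is closest to it.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best_dist[i])
        in_tree[u] = True
        if best_parent[u] >= 0:
            edges.append((best_parent[u], u, best_dist[u]))
        # Update the cheapest links for the remaining points.
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best_dist[v]:
                    best_dist[v] = d
                    best_parent[v] = u
    return edges

# Illustrative case locations (made up): three close points and one distant point.
cases = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (10.0, 0.0)]
tree = euclidean_mst(cases)
```

Candidate clusters could then be formed, for example, by removing unusually long edges from the tree; how the candidates are scored is left to the scan statistic.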

 

Objective

We describe a method for prospective space-time cluster detection of point event data based on the scan statistic. Our aim is to detect, as early as possible, an emerging cluster of syndromic individuals caused by a real outbreak of disease amid a heterogeneous population at risk.

Submitted by hparton on
Description

The research reported in this paper is part of a larger effort to achieve better signal-to-noise ratio, hence accuracy, in pharmacovigilance applications. The relatively low frequency of occurrence of adverse drug reactions leads to weak causal relations between the reaction and any measured signal. We hypothesize that by grouping related signals, we can enhance detection rate and suppress false alarm rate.

 

Objective

ICD-9 codes are commonly used to identify disease cohorts but are often found to be less than adequate. Data available in structured databases (lab test results, medications, etc.) can supplement the diagnosis codes. In this study, we describe an automated method that uses these related data items, with no additional manual annotations, to identify patient cohorts more accurately.

Submitted by hparton on
Description

Time series data involving counts are frequently encountered in many biomedical and public health applications. For example, in disease surveillance, the occurrence of rare infections over time is often monitored by public health officials, and the time series data collected can be used for the purpose of monitoring changes in disease activity. For rare diseases with low infection rates, the observed counts typically contain a high frequency of zeros (zero-inflated), but the counts can also be very large (overdispersed) during an outbreak period. Failure to account for zero-inflation and overdispersion in the data may result in misleading inference and the detection of spurious associations.
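The two features described above can be checked with simple summary statistics before any model is fitted. The sketch below is an assumption-laden illustration, not the authors' method: it compares the variance-to-mean ratio with 1 (the Poisson value) and the observed fraction of zeros with the fraction a Poisson distribution with the same mean would predict. The function name and the weekly counts are made up for the example.

```python
import math

def count_diagnostics(counts):
    """Summarize overdispersion and zero-inflation relative to a Poisson model."""
    n = len(counts)
    mean = sum(counts) / n
    if mean == 0:
        raise ValueError("all counts are zero; diagnostics are undefined")
    # Sample variance (n - 1 denominator).
    var = sum((c - mean) ** 2 for c in counts) / (n - 1)
    return {
        "mean": mean,
        # Equals 1 in expectation for Poisson data; values well above 1 suggest overdispersion.
        "dispersion_index": var / mean,
        "observed_zero_frac": sum(1 for c in counts if c == 0) / n,
        # Fraction of zeros a Poisson distribution with this mean would give.
        "poisson_zero_frac": math.exp(-mean),
    }

# Hypothetical weekly counts of a rare infection with a short outbreak.
weekly = [0, 0, 0, 1, 0, 0, 0, 0, 12, 9, 0, 0]
summary = count_diagnostics(weekly)
```

In this made-up series, three quarters of the weeks are zero while the outbreak weeks push the variance far above the mean, which is the pattern a plain Poisson model cannot represent.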

 

Objective

The purpose of this study is to develop novel statistical methods to analyze zero-inflated and overdispersed time series consisting of count data.

Submitted by elamb on
Description

Syndromic surveillance monitors syndrome data (a syndrome being a specific collection of clinical symptoms) as indicators of a potential disease outbreak. Advanced surveillance systems have been implemented globally for early detection of infectious disease outbreaks and bioterrorist attacks. However, such systems often face challenges such as (i) incorporating situation-specific characteristics, such as covariate information for certain diseases; (ii) accommodating the spatial and temporal dynamics of the disease; and (iii) providing analysis and visualization tools to help detect unexpected patterns. New methods that improve the overall detection capabilities of these systems while also minimizing the number of false positives can have a broad social impact.

Submitted by elamb on
Description

The Veterans Health Administration (VHA) uses the Electronic Surveillance System for the Early Notification of Community-based Epidemics (ESSENCE) to detect disease outbreaks and other health-related events earlier than other forms of surveillance. Although Veterans may use any VHA facility in the world, the strongest predictor of which health care facility is accessed is geographic proximity to the patient's residence. A number of outbreaks have occurred in the Veteran population when geographically separate groups convened in a single location for professional or social events. One classic example was the initial Legionnaires' disease outbreak, identified among participants at the American Legion convention in Philadelphia in 1976. Numerous events involving travel by large Veteran (and employee) populations are scheduled each year.

 

Objective

To develop an algorithm to identify disease outbreaks by detecting aberrantly large proportions of patient residential ZIP codes outside a health care facility catchment area.
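One simple way to flag an aberrantly large out-of-catchment proportion is an upper-tail binomial test against a baseline proportion. The sketch below is only an illustration under stated assumptions: the abstract does not say which test the authors use, the baseline proportion is taken as known, and the function names, ZIP codes, and threshold are hypothetical.

```python
from math import comb

def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

def out_of_catchment_alert(visit_zips, catchment_zips, baseline_prop, alpha=0.01):
    """Flag a day whose share of out-of-catchment ZIP codes is unusually high.

    visit_zips     -- residential ZIP code of each patient visit (strings)
    catchment_zips -- set of ZIP codes inside the facility catchment area
    baseline_prop  -- assumed usual proportion of out-of-catchment visits
    alpha          -- assumed alerting threshold on the p-value
    """
    n = len(visit_zips)
    k = sum(1 for z in visit_zips if z not in catchment_zips)
    p_value = binomial_upper_tail(k, n, baseline_prop)
    return k, n, p_value, p_value < alpha

# Made-up example: 8 of 20 visits come from outside the catchment area,
# against an assumed baseline of 10%.
catchment = {"11111", "22222"}
visits = ["11111"] * 12 + ["99999"] * 8
k, n, p_value, alert = out_of_catchment_alert(visits, catchment, baseline_prop=0.1)
```

A production algorithm would also need to estimate the baseline from history and account for day-of-week effects, which this sketch ignores.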

Submitted by elamb on
Description

Parallel surveillance, the separate monitoring of each continuous series, has been widely used for multivariate surveillance; however, it has severe limitations. First, it faces the problem of multiplicity from multiple testing. Second, ignoring correlation between series (CBS) reduces the performance of outbreak detection if the data are truly correlated. Finally, since health data are normally dependent over time, correlation within series (CWS) is another issue that should be taken into account. Sufficient reduction methods reduce the dimensionality of a simple multivariate series to a univariate series that has been proved to be sufficient for monitoring a mean shift in multivariate surveillance (1, 2). Having considered the sufficiency property and the nature of health data, we propose a sufficient reduction method for detecting a mean shift in multivariate series where CWS and CBS are taken into account.

Objective

To reduce the dimensionality of a p-dimensional multivariate series to a univariate series by deriving sufficient statistics that take into account all the information in the original data, including correlation within series (CWS) and correlation between series (CBS).

Submitted by elamb on
Description

The spatial scan statistic proposed by Kulldorff has been widely used in spatial disease surveillance and other spatial cluster detection applications. One version of this scan statistic was developed for an inhomogeneous Poisson process. However, the underlying Poisson process may not model the data properly. In particular, for diseases with very low prevalence, the number of cases may be very low, and excess zeros may bias the inferences.

Lambert introduced the zero-inflated Poisson (ZIP) regression model to account for excess zeros in counts of manufacturing defects. This model has since been applied in numerous settings. Count data, like contingency tables, often contain cells having zero counts. If a given cell has a positive probability associated with it, a zero count is called a sampling zero. However, a zero for a cell in which it is theoretically impossible to have observations is called a structural zero.
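The ZIP distribution can be written as a mixture: with probability pi the count is a structural zero, and otherwise it is drawn from a Poisson distribution with mean lambda, which can itself produce sampling zeros. The sketch below evaluates that probability mass function without covariates; the regression part of Lambert's model is omitted, and the function name is chosen for this example.

```python
import math

def zip_pmf(k, pi, lam):
    """P(Y = k) under a zero-inflated Poisson with mixing weight pi and Poisson mean lam."""
    poisson = math.exp(-lam) * lam**k / math.factorial(k)
    if k == 0:
        # A zero is either structural (weight pi) or a sampling zero from the Poisson part.
        return pi + (1 - pi) * poisson
    return (1 - pi) * poisson

# With pi = 0.3 and lam = 2, zeros are far more likely than under Poisson(2) alone.
p_zero_zip = zip_pmf(0, 0.3, 2.0)
p_zero_poisson = zip_pmf(0, 0.0, 2.0)
```

Setting pi to 0 recovers the ordinary Poisson distribution, so the Poisson model is a special case of the ZIP model.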

 

Objective

The scan statistic is widely used in spatial cluster detection applications of inhomogeneous Poisson processes. However, real data may depart substantially from the underlying Poisson process. One possible departure is an excess of zeros. Some studies point out that, when applied to zero-inflated data, the spatial scan statistic may produce biased inferences. In particular, Gómez-Rubio and López-Quílez argue that Kulldorff's scan statistic may not be suitable for very rare diseases. In this work we develop a closed-form scan statistic for cluster detection of spatial count data with zero excess.
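For reference, the standard Poisson form of Kulldorff's statistic scores each candidate zone with a log-likelihood ratio that compares the observed and expected case counts inside and outside the zone. The sketch below shows that standard form only; it is not the zero-inflated closed form developed in this work. It assumes the expected counts are scaled so that they sum to the total number of cases, and the zone labels are made up.

```python
import math

def poisson_scan_llr(cases_in, expected_in, total_cases):
    """Log-likelihood ratio of one candidate zone under the Poisson scan model.

    Only zones with more cases than expected (high-rate clusters) score above zero.
    Assumes 0 < expected_in < total_cases.
    """
    c, e, C = cases_in, expected_in, total_cases
    if c <= e:
        return 0.0
    llr = c * math.log(c / e)
    if C > c:
        # Contribution of the area outside the zone.
        llr += (C - c) * math.log((C - c) / (C - e))
    return llr

def best_zone(zones, total_cases):
    """Return the (label, cases_in, expected_in) tuple with the largest LLR."""
    return max(zones, key=lambda z: poisson_scan_llr(z[1], z[2], total_cases))

# Hypothetical candidate zones: (label, observed cases, expected cases).
zones = [("A", 10, 10.0), ("B", 20, 10.0), ("C", 5, 8.0)]
most_likely = best_zone(zones, total_cases=100)
```

In practice the significance of the most likely zone is usually assessed by Monte Carlo replication under the null hypothesis, which this sketch leaves out.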

Submitted by elamb on
Description

Ordering-based approaches [1,2] and quadtrees [3] have been introduced recently to detect multiple spatial clusters in point event datasets. The Autonomous Leaves Graph (ALG) [4] is an efficient graph-based data structure to handle the communication of cells in discrete domains. This adaptive data structure has been compared favorably with common tree-based data structures (quadtrees). An additional feature of the ALG data structure is the total ordering of the component cells through a modified adaptive Hilbert curve, which sequentially links the cells (the orange curve in the example of Figure 1).
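To illustrate how a Hilbert curve imposes a total order on cells, the sketch below computes the position of a cell along the standard Hilbert curve on a fixed n-by-n grid, where n is a power of two. This is the textbook, non-adaptive curve; the modified adaptive variant used by the ALG is not reproduced, and the function name is chosen for this example.

```python
def hilbert_index(n, x, y):
    """Position of cell (x, y) along the Hilbert curve on an n-by-n grid (n a power of two)."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        # Each quadrant at this scale contributes a block of s*s positions.
        d += s * s * ((3 * rx) ^ ry)
        # Rotate or flip the quadrant so the sub-curve is oriented consistently.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s //= 2
    return d

# Order the cells of a 4-by-4 grid along the curve.
cells = [(x, y) for x in range(4) for y in range(4)]
ordered = sorted(cells, key=lambda c: hilbert_index(4, c[0], c[1]))
```

Consecutive cells in `ordered` are always grid neighbors, which is why a window sliding along this ordering tends to cover spatially compact sets of cells.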

Objective

To detect multiple significant spatial clusters of disease in case-control point event data using the Autonomous Leaves Graph and the spatial scan statistic.

Submitted by elamb on
Description

Data obtained through public health surveillance systems are used to detect and locate clusters of cases of diseases in space-time, which may indicate the occurrence of an outbreak or an epidemic. We present a methodology based on adaptive likelihood ratios to compare the null hypothesis (no outbreaks) against the alternative hypothesis (presence of an emerging disease cluster).

 

Objective

Disease surveillance is based on methodologies to detect outbreaks as soon as possible, given an acceptable false alarm rate. We present an adaptive likelihood ratio method whose martingale structure allows an upper limit on the false alarm rate to be determined.
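The link between a martingale and a false alarm bound can be illustrated with a generic sequential likelihood ratio for Poisson counts. In the sketch below, the alternative rate at each step is estimated from past observations only, so under the null hypothesis the running likelihood ratio is a nonnegative martingale starting at 1. Ville's inequality then bounds the probability that it ever reaches 1/alpha by alpha. This is a generic construction under those assumptions and not necessarily the authors' method; the function names, baseline rate, and counts are hypothetical.

```python
import math

def poisson_logpmf(k, lam):
    """Log of the Poisson probability of count k with mean lam."""
    return k * math.log(lam) - lam - math.lgamma(k + 1)

def adaptive_lr_alarm(counts, baseline_rate, alpha=0.05):
    """Return the first time index at which the adaptive likelihood ratio reaches 1/alpha.

    Returns None if no alarm is raised. The alternative rate used at time t is the
    mean of the counts seen before t (never below the baseline), so it does not
    depend on the current observation.
    """
    threshold = math.log(1.0 / alpha)
    log_lr = 0.0
    total = 0
    for t, k in enumerate(counts):
        est = max(baseline_rate, total / t) if t > 0 else baseline_rate
        log_lr += poisson_logpmf(k, est) - poisson_logpmf(k, baseline_rate)
        total += k
        if log_lr >= threshold:
            return t
    return None

# Made-up daily counts: baseline level for three days, then a sustained rise.
alarm_time = adaptive_lr_alarm([2, 2, 2, 10, 10, 10], baseline_rate=2.0, alpha=0.05)
quiet = adaptive_lr_alarm([2, 2, 2, 2, 2], baseline_rate=2.0, alpha=0.05)
```

Because the estimate lags the data by one step, the first elevated count only updates the estimated rate; the alarm can be raised from the following observation onward.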

Submitted by elamb on