Time Series

Description

Time series analysis is widely used in syndromic surveillance. Public health officials typically track on the order of hundreds of disease models, or univariate time series, each day, looking for signals of disease outbreaks. These time series can be aggregated counts of various syndromes, possibly broken down by gender and age group. More recently, spatial scan algorithms have been used to find anomalous regions by aggregating zipcode-level counts [1]. Public health officials usually maintain a set of disease models (e.g., fever or headache symptoms in male adults being indicative of a particular disease). Based on past experience, they track these disease models daily to find anomalies that might indicate disease outbreaks. A typical syndromic surveillance system today tracks on the order of 100-200 time series daily using univariate algorithms such as CUSUM, moving average, and EWMA.
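As a rough illustration of the univariate algorithms named above, the following Python sketch implements a one-sided CUSUM and an EWMA over a list of daily counts. The function names, the 7-day baseline, and the parameter values (`k`, `h`, `lam`) are assumptions chosen for illustration; they are not taken from any particular surveillance system.

```python
from statistics import mean, stdev

def cusum_alerts(counts, baseline_days=7, k=0.5, h=4.0):
    """One-sided upper CUSUM on standardized daily counts.

    The baseline mean and standard deviation come from the first
    `baseline_days` values (an illustrative choice). k is the allowance
    and h the decision threshold, both in standard-deviation units.
    Returns the indices of days on which the statistic exceeds h.
    """
    baseline = counts[:baseline_days]
    mu = mean(baseline)
    sigma = stdev(baseline) or 1.0  # guard against a flat baseline
    s = 0.0
    alerts = []
    for day in range(baseline_days, len(counts)):
        z = (counts[day] - mu) / sigma
        s = max(0.0, s + z - k)  # accumulate only upward deviations
        if s > h:
            alerts.append(day)
            s = 0.0  # reset after signalling
    return alerts

def ewma(counts, lam=0.3):
    """Exponentially weighted moving average, seeded with the first value."""
    smoothed = [float(counts[0])]
    for x in counts[1:]:
        smoothed.append(lam * x + (1 - lam) * smoothed[-1])
    return smoothed

# Hypothetical daily counts with a jump in the last three days.
daily = [10, 12, 11, 9, 10, 11, 10, 10, 11, 25, 30, 28]
print(cusum_alerts(daily))
print(ewma(daily)[-1])
```

In practice each monitored series would need its own baseline window and thresholds, which is part of why tracking more than a few hundred series by hand becomes difficult.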

Consider a representative dataset for a state with 100 zipcodes that monitors 10 syndromes across 3 age groups and 2 genders in emergency rooms. This yields a total of 6,000 (100 x 10 x 3 x 2) distinct time series, one for each combination of zipcode, syndrome, age group, and gender. This number is already too high to monitor daily. Hence most syndromic systems monitor only state-level aggregates for all syndromes, or a few combinations of syndrome, gender, and age group.
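The arithmetic above can be checked with a few lines of Python. The two extra counts are illustrative assumptions about how aggregation might be defined (a single "all values" wildcard per attribute, or any non-empty subset of values per attribute); the text itself only states the 6,000 figure.

```python
from math import prod

# Attribute cardinalities taken from the example in the text.
cardinalities = {"zipcode": 100, "syndrome": 10, "age_group": 3, "gender": 2}

# One series per fully specified combination of attribute values.
fully_specified = prod(cardinalities.values())

# Assumption: each attribute may also be left unspecified ("all values"),
# which adds one extra choice per attribute.
with_wildcards = prod(n + 1 for n in cardinalities.values())

# Assumption: each attribute may take any non-empty subset of its values.
any_subset = prod(2**n - 1 for n in cardinalities.values())

print(fully_specified)  # 6000
print(with_wildcards)   # 13332
print(any_subset)       # far too many series to enumerate one by one
```

Even the modest wildcard scheme more than doubles the number of series, and subset-style aggregation grows exponentially with the number of values per attribute.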

Most real-world disease models, however, are more complex and affect multiple syndromes or multiple age groups. We need to analyze more complex streams that aggregate multiple attribute values in order to mine interesting patterns that would otherwise go unseen. As an example, a massive search could reveal a recent increase in senior female patients with fever and nausea in the northeastern part of the state.

Objective

This paper shows how T-Cubes, a data structure that makes tracking millions of disease models simultaneously feasible, can be used to perform multivariate time series analysis using primitive univariate algorithms. Hence, the use of T-Cube in brute-force search helps identify stronger disease outbreak signals currently missed by the surveillance systems.

Submitted by elamb on
Description

Temporal anomaly detection is a key component of real-time surveillance. Today, despite the abundance of temporal information on multiple syndromes, multivariate investigation of temporal anomalies remains under-explored. Traditionally, an outbreak is thought of as disease localization in time. That is, for an event to qualify as an outbreak, a significant deviation from the observed distribution of the disease must occur. However, the underlying processes that govern the health-seeking behavior of a population with respect to one disease can potentially impact multiple syndromes, leading to observable correlation patterns in the daily rates of those syndromes. Thus, a deviation from the observed correlation pattern between different syndromes can be an early indicator of potential anomalies when the rise in the daily rates of one or more syndromes is not sufficiently discernible to be identified by standard univariate techniques.
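To make the idea of a deviation from an observed correlation pattern concrete, here is a minimal Python sketch that compares the Pearson correlation of two syndrome series over a baseline period with the correlation over the most recent window. This is an illustrative stand-in, not the framework developed in the study; the function names, the window sizes, and the data are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)

def correlation_shift(a, b, baseline_days, window):
    """Absolute change between the baseline correlation and the
    correlation over the most recent `window` days."""
    base = pearson(a[:baseline_days], b[:baseline_days])
    recent = pearson(a[-window:], b[-window:])
    return abs(recent - base)

# Hypothetical daily rates: the two syndromes move together during the
# baseline period and in opposite directions in the recent window.
fever = [1, 2, 3, 4, 5, 1, 2, 3, 4, 5]
nausea = [2, 4, 6, 8, 10, 10, 8, 6, 4, 2]
print(correlation_shift(fever, nausea, baseline_days=5, window=5))
```

A large shift would flag the pair of syndromes for review even if neither series on its own crosses a univariate threshold; how large is "large" would have to be calibrated on historical data.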

Objective

The objectives of this study are to develop a mathematical multi-syndrome framework for early detection of temporal anomalies, to demonstrate improvement in detection sensitivity and timeliness of the multivariate technique compared with those of standard uni-syndrome analysis, and to put forward a new practical concept for timely outbreak investigation.

Submitted by elamb on
Description

Objective: Ideal anomaly detection algorithms should detect both sudden and gradual changes while keeping the background false positive alert rate at a tolerable level. Our objective was to develop an anomaly detection algorithm that adapts to the time series being analyzed and reduces false positive signals. Background: We previously presented studies with HWR in which alerts were generated using a logical OR of several different criteria [1]. The anomaly detection contest required a continuous score for each day of the time series, which gave the impetus to develop a new version of our algorithm.

Submitted by elamb on
Description

In the last decade, time series analysis has become one of the most important tools of surveillance systems. Understanding the nature of temporal fluctuations is essential for developing outbreak detection algorithms, assessing aberrations, and controlling for seasonal variations. Typically, when time series methods are applied to health outcomes collected over an extended period of time, it is assumed that population profiles remain constant. In practice, such assumptions have rarely been tested. At best, the temporal analysis is stratified by age or other discriminating factors if heterogeneity is suspected. Any community can experience population changes in various forms. Long-term trends of inflow/outflow migration and rapid transient fluctuations associated with specific events are typical examples of changes in population profile. Seasonality, as an intrinsic property of how infectious diseases manifest in a community, is typically attributed to periodic changes in the transmissibility of pathogens. To some extent, seasonal fluctuations in the incidence of infectious diseases could also be associated with changes in population profiles. The ability to detect and describe such changes would provide valuable clues about seasonally changing factors associated with an infection.

 

Objective

The objective of this communication is two-fold: 1) to introduce an analytical approach for assessing temporal changes in the surveillance reporting with respect to population profile; and 2) to demonstrate the utility of this method using laboratory-confirmed cases for four reportable enteric infections (cryptosporidiosis, giardiasis, shigellosis, and salmonellosis) recorded by the Massachusetts Department of Public Health over the last 12 years. This new approach for assessing seasonal changes is based on comparison of gender-specific single-year age distributions, which constitute population profiles.

Submitted by elamb on
Description

Many cities in the US and the Centers for Disease Control and Prevention have deployed biosurveillance systems to monitor regional health status. Biosurveillance systems rely on algorithms that analyze data in the temporal domain (e.g., CuSUM) and/or the spatial domain (e.g., SaTScan). Spatial domain-based algorithms often require population information to normalize the counts (e.g., emergency department visits) within a geographic region. This paper presents a new algorithm, Ellipse-based Clustering Analysis (ECA), that analyzes data in both the temporal and spatial domains, using time series analysis for each of the zip codes with abnormal counts and pattern recognition methods for spatial clusters.

 

Objective

This paper describes a new clustering algorithm, ECA, which uses a time series algorithm to identify zip codes with abnormal counts and a pattern recognition method to identify elliptical spatial clusters. Using ellipses could help detect elongated clusters resulting from wind dispersion of bio-agents. We applied ECA to over-the-counter medicine sales. The pilot study demonstrated the potential use of the algorithm in detecting clustered outbreak regions that could be associated with aerosol release of bio-agents.

Submitted by elamb on
Description

The 2003/04 influenza season involved a more pathogenic organism and had an earlier onset. There were noticeably more deaths in otherwise healthy children than in previous seasons. Following this season, the Centers for Disease Control and Prevention asked states to increase their surveillance efforts for influenza illness.

 

Objective 

This paper describes data that were available in Ohio for analysis and considered valuable for determining the occurrence of influenza-like illness (ILI). These data sources were studied to determine their value to ILI surveillance and to develop an improved method of establishing influenza activity levels.

Submitted by elamb on
Description

Numerous recent papers have evaluated algorithms for biosurveillance anomaly detection. Common essential problems in the disparate, evolving data environment include trends, day-of-week effects, and other systematic behavior. Public health monitors have expressed the need for modifiable case definitions, which requires monitoring time series that cannot be modeled in advance. Thus, automated algorithm selection is required. Recent research showed superior predictive performance of the H-W forecasting method compared to regression-based predictors applied to syndromic data. This effort discusses its extension to a practical monitoring tool, including selection of parameter and initialization settings based on limited data history, selection criteria for routine updating, specification of confidence limits, and validation of the resulting algorithm.

 

Objective

The objective is to develop and evaluate an operational alerting algorithm appropriate for the variety of time series behavior observed in biosurveillance data. The Holt-Winters (H-W) implementation of generalized exponential smoothing, comparable to complex regression models in predictive capability and far easier to specify and adapt, is built into a robust detection method.
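For readers unfamiliar with Holt-Winters smoothing, the sketch below shows the additive form with a weekly season, producing one-step-ahead forecasts that a detector could compare against observed counts. It is a minimal illustration only: the initialization from the first two weeks, the smoothing constants, and the function name are assumptions, and the sketch does not reproduce the parameter selection, updating, or confidence limits of the algorithm described here.

```python
def holt_winters_additive(series, season_len=7, alpha=0.4, beta=0.0, gamma=0.15):
    """One-step-ahead forecasts from additive Holt-Winters smoothing.

    Assumed initialization (illustrative): the level is the mean of the
    first season, the trend is the per-day change between the means of
    the first two seasons, and the seasonal terms are the first season's
    deviations from its mean.
    Returns forecasts aligned with series[season_len:].
    """
    m = season_len
    if len(series) < 2 * m:
        raise ValueError("need at least two full seasons of data")
    first, second = series[:m], series[m:2 * m]
    level = sum(first) / m
    trend = (sum(second) / m - level) / m
    seasonal = [x - level for x in first]

    forecasts = []
    for t in range(m, len(series)):
        s = seasonal[t % m]
        forecasts.append(level + trend + s)  # forecast made before seeing series[t]
        x = series[t]
        new_level = alpha * (x - s) + (1 - alpha) * (level + trend)
        trend = beta * (new_level - level) + (1 - beta) * trend
        seasonal[t % m] = gamma * (x - new_level) + (1 - gamma) * s
        level = new_level
    return forecasts

# Hypothetical counts with a strong day-of-week pattern and no trend.
week = [10, 12, 14, 13, 11, 5, 4]
history = week * 4
residuals = [x - f for x, f in zip(history[7:], holt_winters_additive(history))]
print(max(abs(r) for r in residuals))
```

An alerting rule would then be applied to the forecast residuals, for example by flagging days whose residual exceeds some multiple of the recent residual spread; that threshold is a design choice left open here.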

Submitted by elamb on
Description

The Utah Department of Health documented a single epidemic of cryptosporidiosis in Utah during 2007. Seven hundred eleven laboratory-confirmed cases were reported in Salt Lake County, Utah from July 27 through December 18. Illness onset date was available for 86% (611 of 711) of patients and ranged from May 30 through November 11. Approximately 32% (224 of 691) of patients sought care in area emergency departments or urgent care facilities, and 8.5% (50 of 590 with data available) of patients required hospitalization. Sixty-one percent (432 of 711) of patients were less than 13 years of age. Of 381 patients with data available on symptoms, nearly all (99%, 378) reported diarrhea. Other commonly reported symptoms included vomiting (57%, 218), abdominal pain (51%, 196), and nausea (44%, 168).

 

Objective

The objective of this study was to evaluate the potential for improved detection of enteric disease epidemics using a classification category based on variations of diarrhea appearing in the chief complaints from emergency department and urgent care facility visits.

Submitted by elamb on