
Regression Analysis

Description

Accurately assigning causes or contributing causes to deaths remains a universal challenge, especially in the elderly with underlying disease. Cause-of-death statistics commonly record the underlying cause of death, and influenza deaths in winter are often attributed to underlying circulatory disorders. The number of deaths attributable to influenza is therefore usually estimated with statistical models. These regression models (usually linear or Poisson regression) are flexible and can incorporate, in addition to influenza virus activity, covariates such as surveillance data on other viruses and bacteria, pure seasonal trends, and temperature trends.
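A minimal sketch of this kind of model, assuming weekly death counts, a single influenza activity series, a linear trend, and annual sine/cosine terms. All data are simulated, and the names fit_poisson and flu_activity are hypothetical. Deaths attributable to influenza are taken as the difference between the fitted counts and the counts predicted with influenza activity set to zero.

```python
import numpy as np

def fit_poisson(X, y, n_iter=50):
    """Fit a log-link Poisson regression by iteratively reweighted least squares."""
    beta = np.zeros(X.shape[1])
    beta[0] = np.log(y.mean())              # column 0 is the intercept
    for _ in range(n_iter):
        eta = X @ beta
        mu = np.exp(eta)
        z = eta + (y - mu) / mu             # working response
        w = mu                              # working weights
        beta_new = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * z))
        converged = np.max(np.abs(beta_new - beta)) < 1e-10
        beta = beta_new
        if converged:
            break
    return beta

# Simulated data (illustrative only): five years of weekly counts.
rng = np.random.default_rng(0)
weeks = np.arange(260)
season_sin = np.sin(2 * np.pi * weeks / 52.18)
season_cos = np.cos(2 * np.pi * weeks / 52.18)
trend = weeks / 260.0
# Influenza activity peaks in winter and is zero for the rest of the year.
flu_activity = np.clip(season_cos - 0.6, 0, None) * rng.uniform(0.5, 1.5, weeks.size)

X = np.column_stack([np.ones(weeks.size), trend, season_sin, season_cos, flu_activity])
true_beta = np.array([np.log(1000.0), 0.02, 0.05, 0.08, 0.3])
deaths = rng.poisson(np.exp(X @ true_beta))

beta = fit_poisson(X, deaths)

# Counterfactual: same weeks, influenza activity set to zero.
X_no_flu = X.copy()
X_no_flu[:, 4] = 0.0
attributable = np.exp(X @ beta) - np.exp(X_no_flu @ beta)
print("Estimated influenza coefficient:", round(float(beta[4]), 3))
print("Estimated attributable deaths:", round(float(attributable.sum())))
```

Further pathogens or temperature would enter as extra columns of X; the counterfactual step then zeroes one column at a time.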

Objective

Mortality exhibits clear seasonality mainly caused by an increase in deaths in the elderly in winter. As there may be substantial hidden mortality for a number of common pathogens, we estimated the number of elderly deaths attributable to common seasonal viruses and bacteria for which robust weekly laboratory surveillance data were available.

Submitted by hparton on
Description

The eleven syndrome classifications for clinical data records monitored by BioSense include rare events such as death or lymphadenitis and also common occurrences such as respiratory infections. BioSense currently uses two statistical methods for prediction and alerting with respect to the eleven syndromes: a modified CUSUM, and small area regression and testing (SMART), described by Ken Kleinman. At the inception of BioSense, these prediction methods were implemented as one-model-fits-all, and they remain largely unmodified. An evaluation of the predictive value of these methods is required. The SMART method, as used in BioSense, uses long-term data. Its covariate predictors are day of week, a holiday indicator, a day-after-holiday indicator, and sine/cosine seasonality variables. Lengthy, stable historical data is not always available in BioSense data sources, and this obstacle is expected to grow as data sources are added. We wish to test regression methods of surveillance that use shorter time periods and different sets of predictors.
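The covariates listed above can be sketched as a design matrix. This is an illustration only, not the BioSense implementation: the holiday list, the simulated counts, and the helper name design_row are hypothetical, and an ordinary least-squares fit stands in for whatever estimation SMART actually uses.

```python
import datetime as dt
import math
import numpy as np

def design_row(day, holidays):
    """Covariates for one day: intercept, day-of-week dummies, holiday,
    day-after-holiday, and annual sine/cosine terms."""
    # Monday is the reference level; columns cover Tuesday through Sunday.
    dow = [1.0 if day.weekday() == k else 0.0 for k in range(1, 7)]
    is_holiday = 1.0 if day in holidays else 0.0
    after_holiday = 1.0 if (day - dt.timedelta(days=1)) in holidays else 0.0
    angle = 2 * math.pi * day.timetuple().tm_yday / 365.25
    return [1.0, *dow, is_holiday, after_holiday, math.sin(angle), math.cos(angle)]

# Hypothetical holiday list, for illustration only.
holidays = {dt.date(2007, 1, 1), dt.date(2007, 7, 4), dt.date(2007, 12, 25)}
start = dt.date(2007, 1, 1)
days = [start + dt.timedelta(days=i) for i in range(365)]
X = np.array([design_row(d, holidays) for d in days])

# Simulated daily counts with known effects plus noise.
rng = np.random.default_rng(1)
true_coef = np.array([40, 2, 1, 0, -1, -6, -8, -10, 5, 6, 3], dtype=float)
counts = X @ true_coef + rng.normal(0, 2, size=len(days))

coef, *_ = np.linalg.lstsq(X, counts, rcond=None)
fitted = X @ coef
residual_sd = float(np.std(counts - fitted))
print("Residual standard deviation:", round(residual_sd, 2))
```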

Objective

This paper compares the prediction accuracy of regression models with different covariates and baseline periods, using a subset of data from CDC’s BioSense initiative. Accurate predictions are needed to achieve sensitivity at practical false alarm rates in anomaly detection for biosurveillance.
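One way to set up such a comparison is sketched here, assuming one-day-ahead forecasts from a least-squares fit on a sliding baseline window and mean absolute error as the accuracy measure. The data are simulated, the baseline lengths (28, 56, and 112 days) are arbitrary, and the function name mean_abs_error is hypothetical.

```python
import numpy as np

def mean_abs_error(y, X, baseline, first_day):
    """Mean absolute one-day-ahead error when each forecast is fit on the
    preceding `baseline` days only."""
    errors = []
    for t in range(first_day, len(y)):
        window = slice(t - baseline, t)
        coef, *_ = np.linalg.lstsq(X[window], y[window], rcond=None)
        errors.append(abs(y[t] - X[t] @ coef))
    return float(np.mean(errors))

# Simulated daily counts with a weekend dip and a slowly drifting level.
rng = np.random.default_rng(2)
n = 200
dow = np.arange(n) % 7
level = 30 + 0.05 * np.arange(n)
weekend_effect = np.where(dow >= 5, -8.0, 0.0)
y = rng.poisson(level + weekend_effect).astype(float)

# Covariates: intercept plus six day-of-week dummies.
X = np.column_stack([np.ones(n)] + [(dow == k).astype(float) for k in range(1, 7)])

# Score every baseline length on the same forecast days.
results = {b: mean_abs_error(y, X, b, first_day=112) for b in (28, 56, 112)}
for baseline, mae in results.items():
    print(f"baseline {baseline:3d} days: MAE = {mae:.2f}")
```

Starting all forecasts on the same day keeps the comparison fair: a shorter baseline would otherwise be scored on more days than a longer one.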

Submitted by elamb on
Description

Numerous methods have been applied to the problem of modeling temporal properties of disease surveillance data; the ESSENCE system contains a widely used approach (1). STL (2) is a flexible, well-proven method for temporal modeling that decomposes the series into frequency components. A periodic component like day of week (DW) can be exactly periodic or evolve through time. STL is based on loess (3), which can model a numeric response as a function of any explanatory variables. After the STL modeling of the counts, we will add patient address and produce a time-space modeling using both STL and more general loess methods.
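The loess building block can be sketched as local linear regression with tricube weights. The decomposition that follows it is a deliberately simplified single pass (loess trend, then day-of-week means of the detrended series), not the full STL procedure, which iterates and can let the periodic component evolve. The data are simulated and the names are hypothetical.

```python
import numpy as np

def loess(x, y, frac=0.3):
    """Local linear regression with tricube weights (no robustness iterations)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    k = max(3, int(np.ceil(frac * n)))      # points used in each local fit
    fitted = np.empty(n)
    for i in range(n):
        dist = np.abs(x - x[i])
        h = np.sort(dist)[k - 1]            # distance to the k-th nearest point
        w = np.clip(1 - (dist / h) ** 3, 0, None) ** 3
        A = np.column_stack([np.ones(n), x - x[i]])
        # Weighted least squares; the intercept is the fitted value at x[i].
        coef = np.linalg.solve(A.T @ (w[:, None] * A), A.T @ (w * y))
        fitted[i] = coef[0]
    return fitted

# Simulated daily counts: linear trend, day-of-week effect, and noise.
rng = np.random.default_rng(4)
t = np.arange(140)
dow = t % 7
dow_effect = np.array([3.0, 2.0, 1.0, 0.0, -1.0, -2.0, -3.0])
y = 20 + 0.1 * t + dow_effect[dow] + rng.normal(0, 1, t.size)

trend = loess(t, y, frac=0.3)
detrended = y - trend
dow_means = np.array([detrended[dow == k].mean() for k in range(7)])
seasonal = dow_means[dow]                   # exactly periodic DW component
remainder = y - trend - seasonal
print("Estimated day-of-week component:", np.round(dow_means, 2))
```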

Objective

Use the STL local-regression (loess) decomposition procedure and transformation to model the univariate time-series characteristics of chief-complaint daily counts as a first step toward time and spatial modeling. Develop visualization tools for model display and checking.

Submitted by elamb on
Description

One of the most important goals of disease surveillance is to identify the "what" and "when" of an epidemic. Influenza surveillance is made difficult by inconsistent laboratory testing, deficiencies in testing techniques, and coding subjectivity in hospital records. We hypothesized that respiratory diseases other than influenza may serve as a useful proxy for this infection in pediatric populations, due to similarities in the seasonal characteristics of these illnesses.

Submitted by elamb on
Description

A significant research topic in biosurveillance is how to group individual events, such as single emergency department (ED) visits and sales of over-the-counter (OTC) healthcare products, into counts of “similar” events. For OTC products, the goal is to find categories of individual products that have superior outbreak detection performance relative to categories that biosurveillance systems currently monitor. We have described a method to identify OTC categories that correlate more highly with disease activity than existing categories (1). However, it is an open question whether a category that correlates more highly with disease activity than an existing category (or, according to some other model, has a higher ‘association’ with it) necessarily has superior detection performance. Here, we evaluate whether a linear regression procedure that clusters OTC products based on how well they ‘explain’ ED visits for influenza-like illness (ILI) can find categories with superior outbreak-detection performance for influenza.

Objective

To develop a procedure that identifies product categories with superior outbreak detection performance.

Submitted by elamb on
Description

Accurate and precise estimation of disease rates for a given population during a specified time frame is a major concern for public health practitioners and researchers in biosurveillance. Many diseases follow distinct patterns: the incidence and prevalence of many cancers, respiratory infections, and gastroenteritis, among others, increase approximately exponentially with age. As biosurveillance systems capture more demographic information and disease databases grow more complex and comprehensive, concise and informative summary measures of disease burden over space and time are becoming more critical for public health surveillance. In this paper we present two summary measures of disease burden in the elderly that simultaneously reflect disease dynamics and population characteristics.

Objective

To better estimate disease burden in the elderly population, we illustrate an approach, the Slope Intercept Modeling for Population Linear Estimation (SIMPLE) method, that summarizes age-specific disease rates in the 65+ population using the observed exponential increase in disease rates with age in this dynamic and rapidly growing population subgroup.
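The abstract does not give the exact specification of SIMPLE. As a sketch under that caveat, an exponential increase with age becomes a straight line on the log scale, so a least-squares fit of log(rate) on age yields an intercept and a slope as two summary measures. Centering age at 65 and the rates themselves are assumptions made here for illustration, and the function name log_linear_fit is hypothetical.

```python
import math

def log_linear_fit(ages, rates, reference_age=65):
    """Least-squares fit of log(rate) = intercept + slope * (age - reference_age)."""
    xs = [a - reference_age for a in ages]
    logs = [math.log(r) for r in rates]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_log = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (v - mean_log) for x, v in zip(xs, logs))
    slope = sxy / sxx
    intercept = mean_log - slope * mean_x
    return intercept, slope

# Hypothetical rates per 100,000 that double every 7 years of age.
ages = list(range(65, 90))
rates = [200.0 * 2 ** ((a - 65) / 7) for a in ages]

intercept, slope = log_linear_fit(ages, rates)
rate_at_65 = math.exp(intercept)        # fitted rate at the reference age
doubling_years = math.log(2) / slope    # years of age over which the rate doubles
print(f"rate at 65: {rate_at_65:.1f}, doubling every {doubling_years:.1f} years")
```

With this centering, the intercept summarizes the level of disease at age 65 and the slope summarizes how quickly rates rise with age.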

Submitted by elamb on
Description

The frequency of disease outbreaks varies as a result of complex biological processes. Analysis of these frequencies can reveal patterns that can serve as a basis for predictions.

Objective

The goal of this study was to identify the periodicity of seven zooanthroponoses in humans and to set epidemic thresholds for future occurrences.
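The abstract does not state how periodicity or thresholds were derived. One common approach, sketched here purely as an assumption, reads the dominant period off the periodogram of a monthly case series and sets a threshold at the mean plus two standard deviations. The data are simulated and the function name dominant_period is hypothetical.

```python
import numpy as np

def dominant_period(counts):
    """Period, in samples, of the largest periodogram peak (zero frequency excluded)."""
    counts = np.asarray(counts, dtype=float)
    power = np.abs(np.fft.rfft(counts - counts.mean())) ** 2
    freqs = np.fft.rfftfreq(counts.size, d=1.0)
    peak = int(np.argmax(power[1:])) + 1    # skip the zero-frequency bin
    return 1.0 / freqs[peak]

# Simulated monthly case counts over ten years with an annual cycle.
rng = np.random.default_rng(3)
months = np.arange(120)
cases = rng.poisson(20 + 10 * np.sin(2 * np.pi * months / 12))

period = dominant_period(cases)
threshold = cases.mean() + 2 * cases.std(ddof=1)
print(f"dominant period: {period:.1f} months, threshold: {threshold:.1f} cases")
```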

Submitted by elamb on