
Bayesian Methods

Description

The evolution of a communicable disease in a human population is not entirely predictable. However, the spreading process can be assumed to vary smoothly in time. The time-dependent infection process can be linked to observations of the epidemic’s evolution by convolving it with a stochastic delay model. In retrospective analyses of epidemics, when the observations are the dates on which patients first exhibited symptoms, the delay is the incubation period. In the case of biosurveillance data, the delay is caused by incubation and a (hospital) visit delay, modeled as independent random variables. A model for observational error is also required. The time-dependent infection/spread rate may be inferred from observations by a deconvolution process. The smooth temporal variation of the infection rate allows its representation using a low-dimensional parametric model, so the inference may be performed with relatively little data. For large outbreaks, the data may be available early in the epidemic, allowing timely modeling of the outbreak. Short-term forecasts using the model could thereafter be used for medical planning.
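The convolution step described above can be illustrated with a minimal sketch. It shows only the forward model (expected symptom onsets given an infection curve), not the Bayesian deconvolution itself. The function name `expected_onsets`, the infection counts, and the incubation distribution are all hypothetical values chosen for illustration, not taken from the work summarized here.

```python
def expected_onsets(infections, incubation_pmf):
    """Convolve daily new infections with a discrete incubation-period pmf.

    infections[t]      -- expected number of people infected on day t
    incubation_pmf[d]  -- probability that symptoms appear d days after infection
    Returns the expected number of symptom onsets per day, truncated to the
    same horizon as `infections`.
    """
    horizon = len(infections)
    onsets = [0.0] * horizon
    for t_inf, count in enumerate(infections):
        for delay, prob in enumerate(incubation_pmf):
            t_onset = t_inf + delay
            if t_onset < horizon:
                onsets[t_onset] += count * prob
    return onsets


# Hypothetical smooth infection curve and incubation distribution.
infections = [10.0, 20.0, 30.0, 20.0, 10.0, 0.0, 0.0, 0.0]
incubation_pmf = [0.0, 0.5, 0.3, 0.2]  # delays of 0, 1, 2, 3 days

onsets = expected_onsets(infections, incubation_pmf)
for day, value in enumerate(onsets):
    print(f"day {day}: expected onsets = {value:.1f}")
```

Inference runs in the opposite direction: given observed onsets and a delay model, one seeks the infection curve whose convolution best explains the data, with an observation-error model supplying the likelihood.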

 

Objective

We present a statistical method to characterize an epidemic of a communicable disease from a time series of patients exhibiting symptoms. Characterization is defined as estimating an unobserved, time-dependent infection rate and associated parameters that completely define the evolution of an epidemic. The problem is posed as one of Bayesian inference, where parameters are inferred with quantified uncertainty. The method is demonstrated on synthetic and historical epidemic data. 

Submitted by hparton on
Description

We are developing a Bayesian system for real-time surveillance and characterization of outbreaks that incorporates a variety of data elements, including free-text clinical reports. An existing natural language processing (NLP) system called Topaz is being used to extract clinical data from the reports. Moving the NLP system from a research project to a real-time service has presented many challenges.

 

Objective

Adapt an existing NLP system to be a useful component in a system performing real-time surveillance.

Submitted by hparton on
Description

Current practices of automated case detection fall at the extremes of diagnostic accuracy and timeliness. With regard to diagnostic accuracy, electronic laboratory reporting (ELR) is at one extreme and syndromic surveillance is at the other. With regard to timeliness, syndromic surveillance can be immediate, whereas ELR is delayed 7 days from the initial patient visit. A plausible middle way between these extremes is an automated Bayesian diagnostic system that uses all available data types, for example, free-text ED reports, radiology reports, and laboratory reports. We have built such a solution: Bayesian case detection (BCD). As a probabilistic system, BCD operates across the spectrum of diagnostic accuracy; that is, it outputs the degree of certainty for every diagnosis. In addition, BCD incorporates multiple data types as they appear during the course of a patient encounter or lifetime, with no degradation in its ability to perform diagnosis.
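The idea of updating a diagnosis as each data type arrives can be sketched with repeated applications of Bayes' rule. This is an illustration only: it assumes the pieces of evidence are conditionally independent given disease status, which may not match the actual BCD model, and the prior, the evidence descriptions, and every probability below are hypothetical.

```python
def bayes_update(prior, p_given_disease, p_given_no_disease):
    """Return P(disease | evidence) from P(disease) and the two likelihoods."""
    numerator = prior * p_given_disease
    denominator = numerator + (1.0 - prior) * p_given_no_disease
    return numerator / denominator


# Hypothetical evidence, in the order it becomes available during an encounter:
# (description, P(evidence | disease), P(evidence | no disease))
evidence_stream = [
    ("free-text ED report mentions fever and cough", 0.80, 0.10),
    ("radiology report notes an infiltrate", 0.30, 0.05),
    ("laboratory test is positive", 0.90, 0.01),
]

prior = 0.01  # hypothetical prevalence among ED visits
posterior = prior
posteriors = []
for description, p_d, p_not_d in evidence_stream:
    posterior = bayes_update(posterior, p_d, p_not_d)
    posteriors.append(posterior)
    print(f"{description}: P(disease) = {posterior:.3f}")
```

A probability is available after every step, so a downstream system can act on early, less certain evidence or wait for the laboratory result.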

 

Objective

This paper describes the architecture and evaluation of our recently developed automated BCD system.

Submitted by hparton on
Description

The ability to rapidly detect any substantial change in disease incidence is of critical importance to facilitate timely public health response and, consequently, to reduce undue morbidity and mortality. Unlike testing methods (1, 2), modeling for spatio-temporal disease surveillance is relatively recent and is a very active area of statistical research (3). Models describing the behavior of diseases in space and time allow covariate effects to be estimated and provide better insight into etiology, spread, prediction and control. Most spatio-temporal models have been developed for retrospective analyses of complete data sets (4). However, data in public health registries accumulate over time, and sequential analysis of all the data collected so far is key to early detection of disease outbreaks. When the analysis of spatially aggregated data on multiple diseases is of interest, the use of multivariate models accounting for correlations across both diseases and locations may provide a better description of the data and enhance the comprehension of disease dynamics.

Objective

This study deals with the development of statistical methodology for on-line surveillance of small area disease data in the form of counts. As surveillance systems are often focused on more than one disease within a predefined area, we extend the surveillance procedure to the analysis of multiple diseases. The multivariate approach allows for inclusion of correlation across diseases and, consequently, increases the outbreak detection capability of the methodology.

Submitted by elamb on
Description

Block 3 of the US Military Electronic Surveillance System for Early Notification of Community-Based Epidemics (ESSENCE) affords routine access to multiple sources of data. These include administrative clinical encounter records in the Comprehensive Ambulatory Patient Encounter Record (CAPER); records of filled prescription orders in the Pharmacy Data Transaction Service, developed at the Department of Defense (DoD) Pharmacoeconomic Center; laboratory test orders and results in HL7 format; and others. CAPER records include a free-text Reason for Visit field, analogous to chief complaint text in civilian records, which is entered by screening personnel rather than the treating healthcare provider. Other CAPER data fields are related to case severity. DoD ESSENCE treats the multiple, recently available data sources separately, requiring users to integrate algorithm results from the various evidence types themselves. This project used a Bayes Network approach to create an ESSENCE module for analytic integration, combining medical expertise with analysis of 4 years of data using documented outbreaks.

 

Objective

The project objective was to develop and test a decision support module using the multiple data sources available in the U.S. DoD version of ESSENCE.

Submitted by elamb on
Description

Cardiovascular event prediction has long been of interest in the practice of intensive care. It has been approached using signal-processing of vital signs [1-4], including the use of graphical models [3,4]. Our approach is novel in making data segmentation as well as hidden state segmentation an unsupervised process, and in simultaneously tracking evolution of multiple vital signs. The proposed models are adaptable to the individual patient's vitals online and in real time, without requiring patient-specific training data if the patient-specific feedback signal is available. Additionally, they can incorporate expert interventions, produce explanations for alarm predictions, and consider effects of medication on state changes to reduce false alert probability.

Objective

To enable prediction of clinical alerts via joint monitoring of multiple vital signs, while enabling timely adaptation of the model to particulars of a given patient.

Submitted by elamb on
Description

Optimal sequential management of disease outbreaks has been shown to dramatically reduce realized outbreak costs when the number of newly infected and recovered individuals is assumed to be known (1,2). This assumption has been relaxed so that infected and recovered individuals are sampled, and therefore the rate of information gain about the infectiousness and morbidity of a particular outbreak is proportional to the sampling rate (3). We study how two features common to surveillance systems, the absence of sampling of recovered individuals and signal delay, affect the costs associated with an outbreak.

Objective

Development of general methodology for optimal decisions during disease outbreaks that incorporate uncertainty in both parameters governing the outbreak and the current outbreak state in terms of the number of current infected, immune, and susceptible individuals.

Submitted by elamb on
Description

With the increase in GPS-enabled devices, pinpoint spatial data is an obvious future growth area for cluster detection research. The frequentist Bernoulli spatial scan statistic (FBSSS) handles binary labelled point data, but requires Monte Carlo testing to obtain inference [1]. In the Bayesian Poisson SSS [2], Monte Carlo testing is replaced by the use of historic data, which speeds up processing many times over. Following [2], [3] derived the Bayesian Bernoulli spatial scan statistic (BBSSS), replacing historic data with expert knowledge on cluster relative risk. This paper compares the spatial accuracy of BBSSS and FBSSS using a new measure [4] which, being independent of inference level, permits direct comparison between Bayesian and frequentist methods.

Objective

To compare the spatial accuracy of the BBSSS and the FBSSS using benchmark trials.
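For readers unfamiliar with the frequentist side of this comparison, the following sketch computes the log-likelihood ratio commonly used for a Bernoulli spatial scan statistic on a single candidate zone. It assumes the FBSSS referred to here uses this standard form. The function names and the counts are hypothetical, and the Monte Carlo step that turns the maximum ratio into a p-value is omitted.

```python
import math


def xlogy(x, y):
    """Return x * log(y), treating 0 * log(0) as 0."""
    return 0.0 if x == 0 else x * math.log(y)


def bernoulli_llr(c, n, C, N):
    """Log-likelihood ratio for one zone under the Bernoulli model.

    c, n -- cases and total points inside the zone
    C, N -- cases and total points in the whole study region
    Only zones with a higher case proportion inside than outside score > 0.
    """
    if n == 0 or n == N:
        return 0.0
    p_in = c / n
    p_out = (C - c) / (N - n)
    if p_in <= p_out:
        return 0.0
    inside = xlogy(c, p_in) + xlogy(n - c, 1.0 - p_in)
    outside = xlogy(C - c, p_out) + xlogy((N - n) - (C - c), 1.0 - p_out)
    null = xlogy(C, C / N) + xlogy(N - C, 1.0 - C / N)
    return inside + outside - null


# Hypothetical region: 100 labelled points, 50 of them cases.
for cases_in_zone in (5, 7, 9):
    llr = bernoulli_llr(cases_in_zone, 10, 50, 100)
    print(f"{cases_in_zone}/10 cases in zone: LLR = {llr:.3f}")
```

In a full scan, this score is evaluated over many candidate zones and the zone with the largest value is reported as the most likely cluster.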

Submitted by elamb on
Description

A number of syndromic surveillance systems include tools that quickly identify potentially large disease outbreak events. However, the high false-positive rate continues to be a problem in all of these systems. Our earlier work has shown that multi-source information fusion can improve the specificity of syndromic surveillance systems. However, an anomalous health event that presents as only a few cases may remain undetected because the chief complaint data do not contain enough detail. New linked data sources need to be used to enhance detection capabilities. This project examines the incorporation of laboratory, prescription medication, and radiology data linked to the patient encounter within syndromic surveillance systems. Linking these data sources may enhance the sensitivity of syndromic surveillance.

Submitted by elamb on
Description

Although drinking-water-related disease outbreaks are rare in the US, the CDC reports 13-14 per year, affecting an average of about 1000 people. The US EPA has determined that the distribution system is the most vulnerable component of a drinking water system. Recognizing this vulnerability, water utilities are increasingly measuring disinfectant levels and other parameters in their distribution systems. The US EPA is sponsoring an initiative to fuse these distribution system water quality data with health data to improve surveillance by providing an assessment of the likelihood that a waterborne disease outbreak is occurring. This fused analysis capability will be available via a prototype water security module within a population-based public health syndromic surveillance system.

 

Objective

The objective of this paper is to illustrate a technique for combining water quality and population-based health data to monitor for water-borne disease outbreaks.

Submitted by elamb on