Bayesian Methods

Description

Influenza is a contagious disease that causes epidemics in many parts of the world. The World Health Organization estimates that influenza causes three to five million severe illnesses and 250,000-500,000 deaths each year. Predicting and characterizing influenza outbreaks is an important public health problem, and significant progress has been made in predicting single outbreaks. However, multiple temporally overlapping outbreaks are also common. These may be caused by different subtypes or by outbreaks in different demographic groups. We describe our Multiple Outbreak Detection System (MODS) and its performance on two actual outbreaks. This work extends previous work by our group by using model averaging and a new method to estimate non-influenza influenza-like illness (NI-ILI). We also apply MODS to a real dataset with a double outbreak.

Submitted by teresa.hamby@d… on
Description

Taking reporting delays into account in surveillance systems is not methodologically trivial. Consequently, most systems use the date on which the data were received rather than the (often unknown) date of the health event itself. The main drawback of this approach is the resulting reduction in sensitivity and specificity [1]. Because most health events may leave a “signature” in multiple data sources, syndromic data from multiple data streams can be combined in a Bayesian framework, where the result is presented as a posterior probability for a disease [2].
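As a minimal sketch of how several data streams can be combined into a posterior probability for a disease, the example below applies Bayes' rule in odds form. It assumes the streams are conditionally independent given disease status, which is a simplification and not necessarily the framework used in the work described here. The function name, the prior, and the likelihood ratios are all hypothetical values chosen for illustration.

```python
from math import prod


def posterior_probability(prior, likelihood_ratios):
    """Combine data streams into a posterior probability of disease.

    prior: prior probability that the disease is present (0 < prior < 1).
    likelihood_ratios: one value per data stream, each equal to
        P(observed signal | disease) / P(observed signal | no disease).
    Assumes the streams are conditionally independent given disease status.
    """
    prior_odds = prior / (1.0 - prior)
    # Under independence, the per-stream likelihood ratios multiply.
    posterior_odds = prior_odds * prod(likelihood_ratios)
    return posterior_odds / (1.0 + posterior_odds)


# Hypothetical example: a 1% prior and three streams that each
# favor the presence of disease to a different degree.
p = posterior_probability(prior=0.01, likelihood_ratios=[4.0, 2.5, 3.0])
print(round(p, 4))
```

Each stream on its own is weak evidence here, but together they raise the probability well above the prior, which is the motivation for combining streams.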

Objective

We apply an empirical Bayesian framework to perform change point analysis on multiple cattle mortality data streams, accounting for delayed reporting of syndromes.

Submitted by Magou on
Description

Traditional influenza surveillance relies on reports of influenza-like illness (ILI) by healthcare providers, capturing individuals who seek medical care and missing those who may search, post, and tweet about their illnesses instead. Existing research has shown some promise in using data from Google, Twitter, and Wikipedia for influenza surveillance, but findings conflict, and studies have evaluated these web-based sources only individually or in pairs, without comparing all three [1-5]. A comparative analysis of all three is needed to determine which web-based source performs best and should be considered as a complement to traditional methods.

Objective

To comparatively analyze Google, Twitter, and Wikipedia by evaluating how well change points detected in each web-based source correspond to change points detected in CDC ILI data.
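One way to score how well change points from a web-based source correspond to those in a reference series is to match them within a tolerance window. The sketch below is an illustration only: the greedy matching rule, the two-week tolerance, the function name, and the week indices are assumptions, not the evaluation procedure used in the study.

```python
def match_change_points(reference, candidate, tolerance_weeks=2):
    """Greedily match candidate change points to reference change points.

    reference: week indices of change points in the reference series
        (for example, CDC ILI data).
    candidate: week indices of change points in a web-based source.
    A candidate can be matched to at most one reference change point,
    and only if it lies within tolerance_weeks of it.
    Returns (matches, sensitivity, positive predictive value).
    """
    unmatched = sorted(candidate)
    matches = []
    for ref in sorted(reference):
        # Pick the closest still-unmatched candidate inside the window.
        in_window = [c for c in unmatched if abs(c - ref) <= tolerance_weeks]
        if in_window:
            best = min(in_window, key=lambda c: abs(c - ref))
            matches.append((ref, best))
            unmatched.remove(best)
    sensitivity = len(matches) / len(reference) if reference else 0.0
    ppv = len(matches) / len(candidate) if candidate else 0.0
    return matches, sensitivity, ppv


# Hypothetical week indices, for illustration only.
matches, sensitivity, ppv = match_change_points(
    reference=[5, 20, 40], candidate=[6, 23, 41, 50]
)
print(matches, sensitivity, ppv)
```

Reporting both sensitivity and positive predictive value separates a source that misses reference change points from one that raises many spurious ones.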

Submitted by Magou on
Description

Traditional infectious disease epidemiology is built on the foundation of high quality and high accuracy data on disease and behavior. Digital infectious disease epidemiology, on the other hand, uses existing digital traces, re-purposing them to identify patterns in health-related processes. Medical claims are an emerging digital data source in surveillance; they capture patient-level data across an entire population of healthcare seekers, and have the benefits of medical accuracy through physician diagnoses, and fine spatial and temporal resolution in near real-time. Our work harnesses the large volume and high specificity of diagnosis codes in medical claims to improve our understanding of the mechanisms driving spatial variation in reported influenza activity each year. The mechanisms hypothesized to drive these patterns are as varied as: environmental factors affecting transmission or virus survival, travel flows between different populations, population age structure, and socioeconomic factors linked to healthcare access and quality of life. Beyond process mechanisms, the nature of surveillance data collection may affect our interpretation of spatial epidemiological patterns, particularly since influenza is a non-reportable disease with non-specific symptoms ranging from asymptomatic to severe. Considering the ways in which medical claims are generated, biases may arise from healthcare-seeking behavior, insurance coverage, and medical claims database coverage in study populations.

Objective

To assess the use of medical claims records for surveillance and epidemiological inference through a case study that examines how ecological and social determinants and measurement error contribute to spatial heterogeneity in reports of influenza-like illness across the United States.

Submitted by Magou on
Description

We describe an automated system that can detect multiple outbreaks of infectious diseases from emergency department reports. A case detection system obtains data from electronic medical records, extracts features using natural language processing, then infers a probability distribution over the diseases each patient may have. Then, a multiple outbreak detection system (MODS) searches for models of multiple outbreaks to explain the data. MODS detects outbreaks of influenza and non-influenza influenza-like illnesses (NI-ILI).

Submitted by teresa.hamby@d… on
Description

Timely monitoring and prediction of the trajectory of seasonal influenza epidemics allows hospitals and medical centers to prepare for, and provide better service to, patients with influenza. The CDC’s ILINet system collects data on influenza-like illnesses from over 3,300 health care providers, and uses this data to produce accurate indicators of current influenza epidemic severity. However, ILINet indicators are typically reported at a lag of 1-2 weeks. Another source of severity data, Google Flu Trends, is calculated by aggregating Google searches for certain influenza-related terms. Google Flu Trends data is provided in near-real time, but is a less direct measurement of severity than ILINet indicators, and is likely to suffer from bias. We create a hierarchical model to estimate epidemic severity for the 2014-2015 epidemic season which incorporates current and historical data from both ILINet and Google Flu Trends, allowing our model to benefit both from the recency of Google Flu Trends data and the accuracy of ILINet data.

Objective

To use multiple data sources of influenza epidemic severity to inform a model which can estimate and forecast severity for the current influenza epidemic season by accounting for the bias from each source.
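The idea of correcting a timely but biased source against a lagged but accurate one can be illustrated with a much simpler stand-in than a full hierarchical model. The sketch below estimates an additive bias for the search-based signal from historical weeks where both sources are available, then combines the bias-corrected current value with a prior through a normal-normal precision-weighted update. The additive-bias assumption, the function name, and all numbers are hypothetical.

```python
from statistics import mean, variance


def nowcast(ili_hist, search_hist, search_now, prior_mean, prior_var):
    """Precision-weighted estimate of current severity.

    ili_hist, search_hist: paired historical values from the accurate
        (lagged) source and the timely (biased) source.
    search_now: current value of the timely source.
    prior_mean, prior_var: prior belief about current severity, for
        example based on the most recent value from the accurate source.
    Assumes an additive, constant bias and normally distributed noise.
    """
    diffs = [s - i for s, i in zip(search_hist, ili_hist)]
    bias = mean(diffs)            # estimated systematic offset
    noise_var = variance(diffs)   # sample variance of the offset
    if noise_var <= 0 or prior_var <= 0:
        raise ValueError("variances must be positive")
    corrected = search_now - bias
    w_prior, w_obs = 1.0 / prior_var, 1.0 / noise_var
    post_var = 1.0 / (w_prior + w_obs)
    post_mean = post_var * (w_prior * prior_mean + w_obs * corrected)
    return post_mean, post_var


# Hypothetical weekly severity values, for illustration only.
post_mean, post_var = nowcast(
    ili_hist=[2.0, 2.5, 3.0, 3.5],
    search_hist=[2.6, 2.9, 3.6, 3.9],
    search_now=4.5,
    prior_mean=3.5,
    prior_var=0.04,
)
print(round(post_mean, 3), round(post_var, 3))
```

The estimate lands between the prior and the bias-corrected current signal, weighted toward whichever has the smaller variance.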

Submitted by teresa.hamby@d… on
Description

Early detection of disease outbreaks is one of the most challenging objectives of epidemiological surveillance systems. Achieving it depends on using large surveillance datasets to understand and control the spatiotemporal variability of disease across populations. Public health surveillance systems typically generate data with the big-data characteristics of high volume, velocity, and variety. A common problem in analyzing such data is that most of them have a multilevel, or hierarchical, structure; in other words, the observations are not independent. Traditional multilevel or hierarchical models can handle only two or three levels of hierarchy, which limits further use of health big data for modeling, forecasting, and early warning in public health surveillance, particularly where infectious diseases show complex spatial and temporal variability.

Objective

The purpose of this article was to quantitatively analyze the spatial and temporal variability of influenza-like illness (ILI) using a three-level Poisson model, which explains spatial- and temporal-level effects by introducing random effects.
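For orientation, a generic three-level random-intercept Poisson model can be written as below. This is a textbook form given under stated assumptions, not the specification used in the article: the indexing (observation $i$ within temporal unit $j$ within spatial unit $k$), the population offset $n_{ijk}$, the covariate vector $\mathbf{x}_{ijk}$, and the normal random effects are all illustrative choices.

```latex
\begin{align}
  y_{ijk} \mid \mu_{ijk} &\sim \mathrm{Poisson}(\mu_{ijk}), \\
  \log \mu_{ijk} &= \log n_{ijk} + \beta_0
      + \mathbf{x}_{ijk}^{\top}\boldsymbol{\beta} + u_{jk} + v_{k}, \\
  u_{jk} &\sim \mathcal{N}(0, \sigma_u^2), \qquad
  v_{k} \sim \mathcal{N}(0, \sigma_v^2).
\end{align}
```

Here $y_{ijk}$ is the ILI count, $v_k$ captures variation between spatial units, and $u_{jk}$ captures variation between temporal units within a spatial unit. The relative sizes of $\sigma_v^2$ and $\sigma_u^2$ indicate how much of the variability is attributable to each level.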

Submitted by Magou on
Description

An increasing number of geo-coded information streams are available with possible use in disease surveillance applications. In this setting, multivariate modeling of health and non-health data allows assessment of concurrent patterns among data streams and conditioning on one another. Therefore it is appropriate to consider the analysis of their spatial distributions together. Specifically for vector-borne diseases, knowledge of spatial and temporal patterns of vector distribution could inform incidence in humans. Tularemia is an infectious disease endemic in North America and parts of Europe. In Finland tularemia is typically mosquito-transmitted with rodents serving as a host; however, a country-wide understanding of the relationship between rodents and the disease in humans is still lacking. We propose a methodology to help understand the association between human tularemia incidence and rodent population levels. 

Objective

We seek to integrate multiple streams of geo-coded information with the aim of improving the accuracy and efficiency of public health surveillance. Specifically for vector-borne diseases, knowledge of spatial and temporal patterns of vector distribution can support early prediction of human incidence. To this end, we develop joint modeling approaches to evaluate the contribution of vector or reservoir information to early prediction of human cases. A case study of spatiotemporal modeling of human tularemia incidence and rodent population data from Finnish health care districts during the period 1995-2013 is provided. Results suggest that spatial and temporal information on rodent abundance is useful in predicting human cases.

Submitted by Magou on
Description

Most surveillance methods in the literature focus on temporal aberration detection using data aggregated to fixed geographical boundaries. SaTScan has been widely used for spatiotemporal aberration detection because of its user-friendly software interface. However, the software is limited to spatial scan statistics and suffers from location imprecision and population heterogeneity. The R surveillance package has a collection of spatiotemporal methods, but these focus more on research than on routine surveillance.

Objective

To build an open-source spatiotemporal system that integrates analysis and visualization for disease surveillance.

Submitted by Magou on
Description

In the United States, surveillance of vaccine uptake for childhood infections is limited in scope and spatial resolution. The National Immunization Survey (NIS) - the gold standard tool for monitoring vaccine uptake among children aged 19-35 months - is typically constrained to producing coarse state-level estimates. In recent years, vaccine hesitancy (i.e., a desire to delay or refuse vaccination, despite availability of vaccination services) has resurged in the United States, challenging the maintenance of herd immunity. In December 2014, foreign importation of the measles virus to Disney theme parks in Orange County, California resulted in an outbreak of 111 measles cases, 45% of which were among unvaccinated individuals. Digital health data offer new opportunities to study the social determinants of vaccine hesitancy in the United States and identify finer spatial resolution clusters of under-immunization using data with greater clinical accuracy and rationale for hesitancy.

Objective

The purpose of this study was to investigate the use of large-scale medical claims data for local surveillance of under-immunization for childhood infections in the United States, to develop a statistical framework for integrating disparate data sources on surveillance of vaccination behavior, and to identify the determinants of vaccine hesitancy behavior. 

Submitted by Magou on