
Data Analytics

Description

In public health, there has been a growing trend toward the exploration of social network analyses (SNAs). SNAs are methodological and theoretical tools that describe the connections of people, partnerships, disease transmission, the interorganizational structure of health systems, the role of social support, and social capital [1]. The Florida Department of Health in Orange County (DOH-Orange) developed a reproducible baseline social network analysis of patient movement across healthcare entities to gain a county-wide perspective of all actors and influences in our healthcare system. Recognizing the role each healthcare entity plays in Orange County, Florida, can assist DOH-Orange in developing facility-specific implementations such as increased usage of personal protective equipment, environmental assessments, and enhanced surveillance.

Objective

To create a baseline social network analysis to assess connectivity of healthcare entities through patient movement in Orange County, Florida.
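As a minimal sketch of what such a baseline might involve, assuming patient transfers are available as (sending facility, receiving facility, count) records, the following builds a weighted directed network and ranks facilities by total patient flow. The facility names, counts, and the `flow_strength` helper are hypothetical illustrations and are not drawn from DOH-Orange data or methods.

```python
from collections import defaultdict

# Hypothetical transfer records: (sending facility, receiving facility, patients moved).
transfers = [
    ("Hospital A", "Nursing Home B", 12),
    ("Nursing Home B", "Hospital A", 7),
    ("Hospital A", "Rehab Center C", 5),
    ("Hospital D", "Nursing Home B", 3),
]

def flow_strength(edges):
    """Return {facility: (patients_sent, patients_received)} for a weighted directed network."""
    sent = defaultdict(int)
    received = defaultdict(int)
    for src, dst, n in edges:
        sent[src] += n
        received[dst] += n
    facilities = set(sent) | set(received)
    return {f: (sent[f], received[f]) for f in facilities}

strength = flow_strength(transfers)

# Rank facilities by total flow (sent + received), highest first.
ranked = sorted(strength, key=lambda f: sum(strength[f]), reverse=True)
for facility in ranked:
    out_flow, in_flow = strength[facility]
    print(f"{facility}: sent={out_flow}, received={in_flow}")
```

Facilities with high total flow are the most connected actors in this toy network; a fuller analysis would likely add measures such as betweenness or community structure.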

Submitted by elamb on
Description

Syndromic surveillance has been widely used in influenza surveillance worldwide. However, despite the potential benefits created by the large volume of data, biases due to changes in healthcare-seeking behavior and physicians’ reporting behavior, as well as the background noise caused by seasonal flu epidemics, contribute to the complexity of the surveillance system and may limit its utility as a tool for early detection. Since most current analysis methods are developed for outbreak detection, there are few tools to characterize influenza surveillance data quantitatively for situational awareness purposes. The Hong Kong Centre for Health Protection has a comprehensive influenza surveillance system based on healthcare providers, laboratories, schools, daycare centers and residential care homes for the elderly. Hong Kong usually experiences a summer peak in July and August, which potentially doubles the data volume and constitutes a natural experiment to assess the effect of school-age children on influenza transmission dynamics. The richness of the available data and the unique epidemiological characteristics make Hong Kong an ideal setting in which to develop and evaluate our model.

Objective

Our goal is to develop a statistical model for characterizing influenza surveillance systems that will be helpful in interpreting multiple streams of influenza surveillance data in future outbreaks.

Submitted by rmathes on
Description

Vast amounts of free, real-time, localizable Twitter data offer new possibilities for public health workers to identify trends and attitudes that more traditional surveillance methods may not capture, particularly in emerging areas of public health concern where reliable statistical evidence is not readily accessible. Existing applications include tracking public informedness during disease outbreaks. Twitter-based surveillance is particularly suited to new challenges in tobacco control. Hookah and e-cigarettes have surged in popularity, yet regulation and public information remain sparse, despite controversial health effects. Ubiquitous online marketing of these products and their popularity among new and younger users make Twitter a key resource for tobacco surveillance.

Objective

We present results of a content analysis of tobacco-related Twitter posts (tweets), focusing on tweets referencing e-cigarettes and hookah. 

Submitted by jababrad@indiana.edu on
Description

Dengue is a major cause of morbidity in Thailand. Annual outbreaks of varying sizes pose a particular challenge to the public health system because treatment of severe cases requires significant resources. Advance warning of increases in incidence could help public health authorities allocate resources more effectively and mitigate the impact of epidemics.

Objective

To develop a statistical model for dengue fever surveillance that uses data from across Thailand to give early warning of developing epidemics.

Submitted by teresa.hamby@d… on
Description

A seroprevalence survey carried out in four counties in the Tampa Bay area of Florida (Hillsborough, Pinellas, Manatee and Pasco) provided an estimate of the cumulative incidence of infection due to the 2009 influenza A (H1N1) virus as of the end of that year’s pandemic. During the pandemic, high-level decision-makers wanted timely, credible forecasts of the likely near-term course of the pandemic. The cumulative percentage of people who will be infected by the end of the epidemic can be estimated from the intrinsic reproductive number of the viral strain, its R0, which can be measured early in the epidemic. If the current cumulative number of infections can be estimated, then one can determine what fraction of the eventual total number of infected people has already been infected.
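One common way to relate R0 to the eventual cumulative percentage infected is the final-size relation of the simple homogeneous-mixing SIR model, z = 1 - exp(-R0 * z), where z is the final attack rate in a fully susceptible population. The sketch below solves that relation numerically; it is offered as an illustration under those assumptions, not as the method used in this study, and the function names are hypothetical.

```python
import math

def final_attack_rate(r0, tol=1e-10):
    """Solve z = 1 - exp(-r0 * z) for the nonzero root z in (0, 1) by bisection."""
    if r0 <= 1.0:
        return 0.0  # below the epidemic threshold the model predicts no large outbreak
    lo, hi = 1e-9, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        # The function 1 - exp(-r0*z) - z is positive below the root and negative above it.
        if 1.0 - math.exp(-r0 * mid) - mid > 0.0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def fraction_of_epidemic_elapsed(current_cumulative_incidence, r0):
    """Share of the eventual infections that has already occurred."""
    z = final_attack_rate(r0)
    if z == 0.0:
        raise ValueError("r0 must exceed 1 for a nonzero final attack rate")
    return current_cumulative_incidence / z

z = final_attack_rate(2.0)
print(f"Final attack rate for R0 = 2.0: {z:.3f}")
print(f"Elapsed fraction at 20% infected: {fraction_of_epidemic_elapsed(0.20, 2.0):.3f}")
```

Real populations depart from these assumptions (prior immunity, age structure, interventions), so the result is best read as a rough upper-level guide rather than a forecast.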

Objective

To estimate the number of infections due to the novel 2009 influenza A/H1N1 virus corresponding to each ED visit for ILI in a four-county area of Florida. Knowing such ratios, one could (in future similar situations) estimate the cumulative number of infections due to a novel influenza virus in a population.
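The ratio described here is simple arithmetic, sketched below under the assumption that the seroprevalence survey gives a cumulative incidence proportion for a population of known size. All numbers and function names are hypothetical and chosen only to show the calculation; they are not results from the Tampa Bay survey.

```python
def infections_per_ili_visit(cumulative_incidence, population, ili_ed_visits):
    """Estimated infections in the population per ED visit for ILI."""
    if ili_ed_visits <= 0:
        raise ValueError("ili_ed_visits must be positive")
    return cumulative_incidence * population / ili_ed_visits

def estimate_infections(ratio, observed_ili_ed_visits):
    """Apply a previously estimated ratio to ED ILI visits seen in a later, similar situation."""
    return ratio * observed_ili_ed_visits

# Hypothetical inputs: 20% cumulative incidence, 1,000,000 residents, 10,000 ED ILI visits.
ratio = infections_per_ili_visit(0.20, 1_000_000, 10_000)
print(f"Infections per ED ILI visit: {ratio:.1f}")
print(f"Estimated infections after 2,500 visits: {estimate_infections(ratio, 2_500):.0f}")
```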

Submitted by rmathes on
Description

Public health is at a precipice of increasing demand for the consumption and analysis of large amounts of disparate data, the centralization of local and state IT offices, and the compartmentalization of programmatic technology solutions. Public health informatics needs differ across programmatic areas, but may have commonalities across jurisdictions. Initial development of the Public Health Community Platform (PHCP) was launched with the goal of providing a shared infrastructure for state and local jurisdictions, enabling the development of interoperable systems and distributed analytical methods with common sources of data. The PHCP is being designed to leverage recent successes with cloud-based technology in public health.

Success of the PHCP is dependent on the involvement of state and local public health jurisdictions in the transparent development and future direction of the platform. Equally critical to success is the selection of appropriate technology, consideration of various governance structures, and full understanding of the legal implications of a shared platform model.

Objective

To update the public health practice community on the continuing development of the Public Health Community Platform (PHCP).

Submitted by teresa.hamby@d… on
Description

Weather events such as a heat wave or a cold snap can cause a change to the number of patients and types of symptoms seen at a healthcare facility. Understanding the impact of weather patterns on ILI surveillance may be useful for early detection and trend analysis. In addition, weather patterns limit our ability to extrapolate data collected in one region to a different region, which may not share the same weather or periodic trends. By modeling these sources of variation, we can factor out their effects and increase the sensitivity of our overall surveillance system.

Objective

To develop a statistical model to account for weather variation in influenza-like illness (ILI) surveillance.
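To illustrate the idea of factoring out a weather effect, the sketch below fits an ordinary least-squares line of ILI counts on temperature and returns the residuals, which represent activity not explained by temperature. This is a deliberately simplified stand-in, not the model developed in this work; the data and function names are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x; returns (slope, intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def weather_adjusted_residuals(temps, ili_counts):
    """ILI counts minus the level predicted from temperature alone."""
    slope, intercept = fit_line(temps, ili_counts)
    return [c - (intercept + slope * t) for t, c in zip(temps, ili_counts)]

# Hypothetical weekly mean temperatures (degrees C) and ILI visit counts.
temps = [30, 25, 20, 15, 10]
ili_counts = [10, 20, 30, 40, 80]

residuals = weather_adjusted_residuals(temps, ili_counts)
for t, r in zip(temps, residuals):
    print(f"temp={t}: residual={r:+.1f}")
```

Detection thresholds can then be applied to the residuals rather than to the raw counts, so that a cold snap alone is less likely to trigger an alert.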

Submitted by teresa.hamby@d… on
Description

Syndromic surveillance is the real-time collection and interpretation of data to allow the early identification of public health threats and their impact, enabling public health action. Statistical methods are used in syndromic surveillance to identify when the activity of indicator ‘signals’ has significantly increased. A wide range of techniques have been applied to syndromic data internationally. As part of the preparation for the 2012 Olympics, Public Health England expanded its syndromic surveillance service. As new syndromic systems were introduced, statistical methods were developed and applied for each system, tailored to the particular challenges of that system at the time, e.g. a lack of historical data and regular changes to geographical coverage.

Objective

This paper describes the design and application of a new statistical method for real-time syndromic surveillance, used by Public Health England.

Submitted by teresa.hamby@d… on
Description

Kulldorff’s spatial scan statistic [1] detects significant spatial clusters of disease by maximizing a likelihood ratio statistic over circular spatial regions. The fast localized subset scan [2] enables scalable detection of proximity-constrained subsets and increases power to detect irregularly shaped clusters. However, unconstrained subset scanning within each circular neighborhood [2] may not necessarily capture the pattern of interest, and is too under-constrained for use with case/control point data. Thus we propose the star-shaped scan statistic (StarScan), a novel method that efficiently maximizes the log-likelihood ratio over irregularly shaped clusters while incorporating soft constraints on smoothness. More precisely, we allow the radius of the cluster around a center location to vary with angle, and apply a penalty proportional to the total change in radius.

Objective

We present StarScan, a novel scan statistic for accurately detecting irregularly-shaped disease outbreaks. StarScan maximizes a penalized log-likelihood ratio statistic, allowing the radius around a central location to vary as a function of the angle and applying a penalty proportional to the total change in radius.
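The penalized score can be sketched as follows for case/control point data. This is a hedged illustration only: it assumes a Kulldorff-style Bernoulli log-likelihood ratio, a radius defined per equal-width angular sector, and a penalty equal to a constant times the summed absolute change in radius around the circle. The exact likelihood model, discretization, and efficient maximization used by StarScan are not given here, and the function names are hypothetical.

```python
import math

def bernoulli_llr(c, n, C, N):
    """Bernoulli log-likelihood ratio for c cases among n points inside a region,
    with C cases among N points overall; 0 if the inside rate is not elevated."""
    if n == 0 or n == N or c / n <= (C - c) / (N - n):
        return 0.0

    def xlogy(x, y):
        # Treat 0 * log(0) as 0; log is only evaluated when x > 0.
        return x * math.log(y) if x > 0 else 0.0

    p, q, p0 = c / n, (C - c) / (N - n), C / N
    inside = xlogy(c, p) + xlogy(n - c, 1 - p)
    outside = xlogy(C - c, q) + xlogy(N - n - (C - c), 1 - q)
    null = xlogy(C, p0) + xlogy(N - C, 1 - p0)
    return inside + outside - null

def star_score(points, center, radii, penalty):
    """Penalized score of a star-shaped region.
    points: list of (x, y, is_case); radii: one radius per equal-width angular sector."""
    k = len(radii)
    cx, cy = center
    total_cases = sum(1 for _, _, is_case in points if is_case)
    c = n = 0
    for x, y, is_case in points:
        angle = math.atan2(y - cy, x - cx) % (2 * math.pi)
        sector = min(int(angle / (2 * math.pi) * k), k - 1)
        if math.hypot(x - cx, y - cy) <= radii[sector]:
            n += 1
            c += 1 if is_case else 0
    # Total change in radius, summed cyclically around the center.
    total_change = sum(abs(radii[i] - radii[(i + 1) % k]) for i in range(k))
    return bernoulli_llr(c, n, total_cases, len(points)) - penalty * total_change

# Hypothetical data: two cases near the origin, two controls farther away.
points = [(1, 0, True), (0, 1, True), (-3, 0, False), (0, -3, False)]
print(star_score(points, (0, 0), [1.5, 1.5, 1.5, 1.5], penalty=0.1))
print(star_score(points, (0, 0), [1.5, 1.5, 0.5, 0.5], penalty=0.1))
```

Both regions contain the same points, so the irregular one scores lower purely because of the smoothness penalty.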

 

Submitted by Magou on
Description

Production animal health syndromic surveillance (PAHSyS) data are varied: there may be standardized ratios, proportions, counts of adverse events, categorical data and even qualitative ‘intelligence’ that may need to be aggregated up a hierarchy. PAHSyS poses some unique challenges for event detection. Livestock populations are made up of many subpopulations that are constantly moving between farms and markets on the way to slaughter. Pathogen expression often varies across production types and rearing-intensity levels. The complexity of animal production systems necessitates monitoring many time series, and makes the investigation of statistical signals imperative yet difficult and resource intensive. Multivariate surveillance methods that can work across multiple data streams to increase both sensitivity and specificity are much needed.

Objective

The question of how to aggregate animal health information derived from multiple data streams that vary in their specificity, scale, and behaviour is not trivial. Our view is that outbreak detection in a multivariate context should be viewed as a probabilistic prediction problem.
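One simple way to frame detection as probabilistic prediction is to update a prior outbreak probability with the alarm status of each data stream using Bayes' rule. The sketch below assumes the streams are conditionally independent given outbreak status (a naive Bayes simplification) and that each stream's sensitivity and false-alarm rate are known. These assumptions, the example values, and the function name are hypothetical and are not taken from the authors' approach.

```python
def posterior_outbreak_probability(prior, streams):
    """Posterior probability of an outbreak given alarm states from several streams.
    streams: list of (alarm, sensitivity, false_alarm_rate) tuples, where
    sensitivity = P(alarm | outbreak) and false_alarm_rate = P(alarm | no outbreak)."""
    like_outbreak = 1.0
    like_no_outbreak = 1.0
    for alarm, sensitivity, false_alarm_rate in streams:
        like_outbreak *= sensitivity if alarm else (1.0 - sensitivity)
        like_no_outbreak *= false_alarm_rate if alarm else (1.0 - false_alarm_rate)
    numerator = prior * like_outbreak
    return numerator / (numerator + (1.0 - prior) * like_no_outbreak)

# Hypothetical streams: two alarming, one silent.
streams = [
    (True, 0.8, 0.10),   # e.g. a sensitive but noisy stream
    (True, 0.6, 0.05),   # e.g. a more specific stream
    (False, 0.5, 0.02),  # e.g. a stream that has not signalled
]
posterior = posterior_outbreak_probability(0.01, streams)
print(f"Posterior outbreak probability: {posterior:.3f}")
```

Streams that differ in specificity and scale then contribute through their own error rates rather than through a shared alarm threshold.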

Submitted by teresa.hamby@d… on