Data Analytics

Description

Most public health surveillance systems in the United States do not capture individual-level measures of socioeconomic position. Without this information, socioeconomic disparities in health outcomes can be hidden. However, US Census data can be used to describe neighborhood-level socioeconomic conditions like poverty and crowding. Place matters. Neighborhood affects health independently of personal characteristics. Thus, important trends may be elucidated by linking geocoded public health surveillance data to area-based measures of socioeconomic position, such as the percentage of residents with incomes below the federal poverty level.

Objective

The panel will describe applying the methods of Harvard’s Public Health Disparities Geocoding Project to a diverse collection of infectious disease surveillance data from 14 US states and New York City. This session will demonstrate the feasibility and utility of using US Census data to reveal sub-populations vulnerable to infectious diseases.
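
The linkage described above can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual code: the function names, the input layout (one census tract ID per geocoded case, plus a tract lookup of population and percent below poverty), and the four poverty cut points are all assumptions made for the example.

```python
from collections import defaultdict

# Assumed cut points (percent of residents below the federal poverty level).
# These bands are illustrative; use whatever strata the analysis plan specifies.
POVERTY_BANDS = [
    (5.0, "<5%"),
    (10.0, "5-9.9%"),
    (20.0, "10-19.9%"),
    (float("inf"), ">=20%"),
]


def poverty_band(pct_below_poverty):
    """Return the label of the first band whose upper bound exceeds the value."""
    for upper, label in POVERTY_BANDS:
        if pct_below_poverty < upper:
            return label


def rates_by_poverty_band(case_tracts, tracts):
    """Compute cases per 100,000 residents for each poverty band.

    case_tracts: iterable of census tract IDs, one per geocoded case.
    tracts: dict mapping tract ID -> (population, pct_below_poverty).
    Cases whose tract is not in the lookup are skipped.
    """
    population = defaultdict(int)
    case_counts = defaultdict(int)
    for pop, pct in tracts.values():
        population[poverty_band(pct)] += pop
    for tract_id in case_tracts:
        if tract_id in tracts:
            case_counts[poverty_band(tracts[tract_id][1])] += 1
    return {
        band: 100000 * case_counts[band] / pop
        for band, pop in population.items()
        if pop > 0
    }
```

With two hypothetical tracts, `rates_by_poverty_band(["A", "B", "B", "B"], {"A": (10000, 3.0), "B": (10000, 25.0)})` gives a rate three times higher in the high-poverty band than in the low-poverty band. A real analysis would also age-standardize the rates.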

Submitted by teresa.hamby@d… on
Description

Early detection of outbreaks is crucial in public health surveillance because it enables rapid control measures. Statistical methods are widely used for outbreak detection, but no study has thoroughly evaluated and compared the performance of these methods.

Objective

Evaluate the performance of 8 statistical methods for outbreak detection in health surveillance with historical data.
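
The abstract does not name the eight methods evaluated. As a rough illustration of what a statistical outbreak detector looks like, the sketch below flags a day when its count exceeds the mean of a preceding baseline window by more than a set number of standard deviations. The function name, window length, threshold, and standard deviation floor are assumptions chosen for the example, not values from the study.

```python
from statistics import mean, stdev


def flag_outbreaks(counts, baseline=7, threshold=3.0, min_sigma=0.5):
    """Return the indices of time points flagged as possible outbreaks.

    counts: list of case counts, one per time step.
    baseline: number of preceding time steps used as the reference window.
    threshold: number of standard deviations above the baseline mean
        required to raise an alarm.
    min_sigma: floor on the standard deviation so that a perfectly flat
        baseline does not cause a division by zero (illustrative value).
    """
    alarms = []
    for t in range(baseline, len(counts)):
        window = counts[t - baseline:t]
        mu = mean(window)
        sigma = max(stdev(window), min_sigma)
        if (counts[t] - mu) / sigma > threshold:
            alarms.append(t)
    return alarms
```

Evaluating a detector of this kind against historical data typically means comparing the flagged indices with known outbreak periods to estimate sensitivity, false alarm rate, and detection delay.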

Submitted by teresa.hamby@d… on
Description

The success of public health campaigns in decreasing or eliminating the burden of vaccine-preventable diseases can be undermined by media content influencing vaccine hesitancy in the population. A tool for tracking and describing the ever-growing platforms for such media content can help decide how and where to invest in campaigns to increase public confidence in vaccines. The Vaccine Sentimeter, developed from the Healthmap project, aims to assist public health practitioners in maintaining or improving vaccine coverage through a real-time, online visualization tool of global media content on vaccines.

Objective

The current analysis describes the scope and trends in United States content from the Vaccine Sentimeter’s results, while seeking to examine any possible links between media content, vaccine coverage, and reported vaccine adverse events in the country.

Submitted by teresa.hamby@d… on
Description

The EpiCenter syndromic surveillance platform currently uses Java libraries for time series analysis. Expanding the data quality capabilities of EpiCenter requires new analysis methods. While the Java ecosystem has a number of resources for general software engineering, it has lagged behind on numerical tools. As a result, including additional analytics requires implementing the methods de novo.

The R language and ecosystem have emerged as one of the leading platforms for statistical analysis. A wide range of standard time series analysis methods are available in either the base system or contributed packages, and new techniques are regularly implemented in R. Previous attempts to integrate R with EpiCenter were hampered by the limitations of the available R/Java interfaces, which had not been actively developed for a long time.

An alternative bridge is via the PostgreSQL database used by EpiCenter on the backend. An R extension for PostgreSQL exists, which can expose the entire R ecosystem to EpiCenter with minimal development effort.

Objective

To demonstrate the broader analytical capabilities gained by making the R language available to EpiCenter reporting.

Submitted by teresa.hamby@d… on
Description

The basic reproduction number represents the number of secondary infections expected to be caused by an infectious individual introduced into an entirely susceptible population. It is a fundamental measure used to characterize infectious disease outbreaks and is essential in developing mathematical models to determine appropriate interventions. Much work has been done to investigate methods for estimating the basic reproduction number during the early stages of infectious disease outbreaks. However, these methods often require data that may not be readily available at the beginning of an outbreak. An approach developed by Becker has been widely used to estimate the basic reproduction number using only the final case count and size of the at-risk population. A modification to this approach is proposed that allows estimates to be obtained earlier in an outbreak using only the current case count, number currently ill, and the size of the at-risk population.

Objective

To present a modification to an established approach to estimating the basic reproduction number to allow estimates to be obtained at any point during an outbreak using only the current case count, number currently ill, and the size of the at-risk population.
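
For orientation, the textbook deterministic final-size relation shows how a reproduction number can be recovered from only a final case count and the size of the at-risk population. The sketch below is neither Becker's estimator nor the modification proposed in this abstract, since neither formula is given here. It assumes a closed, homogeneously mixing population that is entirely susceptible at the start, and the function name is hypothetical.

```python
import math


def r0_from_final_size(total_cases, population):
    """Estimate R0 from the final attack rate of a completed outbreak.

    Uses the standard final-size relation R0 = -ln(1 - z) / z, where
    z = total_cases / population is the final attack rate. This assumes a
    closed, fully susceptible, homogeneously mixing population.
    """
    if not 0 < total_cases < population:
        raise ValueError("total_cases must be between 0 and population (exclusive)")
    attack_rate = total_cases / population
    return -math.log(1.0 - attack_rate) / attack_rate
```

For example, 500 cases in an at-risk population of 1,000 gives an attack rate of 0.5 and an estimate of about 1.39. Because the relation needs the final count, it cannot be applied mid-outbreak, which is the gap the proposed modification aims to fill.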

Submitted by teresa.hamby@d… on
Description

Advanced cancer treatments and research have helped reduce cancer mortality nationally and in Wisconsin. However, persistent health disparities in cancer remain a major public health concern because not all population subgroups have equal access to these healthcare benefits. Previous studies showing that cancer health disparities persist among racial populations have focused primarily on the state of Wisconsin as a whole. The southeastern region of Wisconsin, the greater Milwaukee metropolitan area, is home to 83% of Wisconsin’s African American population and includes one of the most segregated metropolitan areas in the United States. A better understanding of cancer trends in southeastern Wisconsin can therefore help target resources more effectively to eliminate health disparities in Wisconsin.

Objective

To assess health disparities in all-site cancer incidence and mortality rates, and in stage at diagnosis for specific cancers (female breast cancer and colorectal cancer), between African American and white populations of southeastern Wisconsin during 2007-2011.

Submitted by teresa.hamby@d… on
Description

Internet-based technologies have become prominent among today’s generation because they are easily accessible through computers and phones. The Internet’s relative anonymity makes it easier for high-risk groups to meet sexual partners with similar characteristics through dating sites such as Grindr, Jack’D, and Adam4Adam, and through mainstream social networking sites such as Facebook, Twitter, and Instagram. According to various studies, young MSMs use dating sites and social networking sites as a source for meeting sexual partners more than older MSMs do.

Objective

To assess the use of dating sites and social networking sites for finding sexual partners among newly diagnosed HIV-positive MSMs in Harris County in 2014.

Submitted by teresa.hamby@d… on
Description

The French syndromic surveillance system SurSaUD® was set up in 2004 by Santé publique France, the national public health agency (formerly the French institute for public health, InVS). As of 2016, the system is based on three main data sources: attendances at about 650 emergency departments (ED), consultations with the 62 emergency general practitioners’ (GPs) associations of SOS Médecins, and mortality data from 3,000 civil status offices [1]. Each day, about 60,000 ED attendances (88% of national attendances), 8,000 visits to SOS Médecins associations (95% of national visits), and 1,200 deaths (80% of national mortality) are recorded across the country and transmitted to Santé publique France. About 100 syndromic groupings of interest are constructed from the reported diagnostic codes and monitored daily or weekly, for different age groups and geographical scales, to characterize trends, detect expected or unexpected events (outbreaks), and assess the potential impact of both environmental and infectious events. All-cause mortality is monitored with similar objectives. Two user-friendly interactive web applications have been developed using the R shiny package [2] to provide a homogeneous framework for all the epidemiologists involved in syndromic surveillance at the national and regional levels.

Objective

The presentation describes the design and the main functionalities of two user-friendly applications developed using R-shiny to support the statistical analysis of morbidity and mortality data from the French syndromic surveillance system SurSaUD.

Submitted by Magou on
Description

State HIV offices routinely produce fact sheets, epidemiologic profiles, and other reports from the eHARS (Enhanced HIV/AIDS Reporting System) database which was created and is maintained by the CDC. The eHARS software is used throughout the United States to monitor the HIV epidemic and evaluate HIV prevention programs and policies. Due to limited variability of eHARS throughout the United States, software developed to analyze and visualize data using the eHARS database schema may be useful to many state HIV offices. Software developed based on the eHARS database schema could reduce the time required for analysis and production of reports.

The R software environment for statistical computing is an open source project with a thriving community of users who continue to expand R’s analysis capacity through the addition of packages. A package is “a standardized collection of material extending R, e.g. providing code, data, or documentation”. Shiny is one example of a user-developed package which easily allows R users to create interactive web applications from analytical software.

Objective

Describe the development process and function of a data dashboard for state HIV surveillance and discuss the benefits of creating interactive data dashboards in the R software environment.

Submitted by teresa.hamby@d… on
Description

The Biosurveillance Ecosystem (BSVE) is a biological and chemical threat surveillance system sponsored by the Defense Threat Reduction Agency (DTRA). BSVE is intended to be a user-friendly, multi-agency, cooperative, modular, and threat-agnostic platform for biosurveillance [2]. In BSVE, a web-based workbench presents the analyst with applications (apps) developed by various DTRA-funded researchers, which are deployed on demand in the cloud (e.g., Amazon Web Services). These apps aim to address emerging needs and refine capabilities to enable early warning of chemical and biological threats for multiple users across local, state, and federal agencies. Soda Pop is an app developed by Pacific Northwest National Laboratory (PNNL) to meet the current needs of the BSVE for early warning and detection of disease outbreaks. Aimed at a diverse set of analysts, the application is agnostic to data source and spatial scale, which makes it generalizable across many diseases and locations. To achieve this, we placed particular emphasis on clustering and alerting of disease signals within Soda Pop without strong prior assumptions about the nature of observed disease counts.

Objective

To introduce Soda Pop, an R/Shiny application designed to be a disease-agnostic time-series clustering, alarming, and forecasting tool to assist in disease surveillance “triage, analysis and reporting” workflows within the Biosurveillance Ecosystem (BSVE). In this poster, we highlight the new capabilities that Soda Pop brings to the BSVE, with an emphasis on the impact of methodological decisions.

Submitted by Magou on