Data Visualization

Presented October 28, 2016.

We will briefly explore the tidytext, widyr, and flexdashboard packages to analyze word co-occurrence, look at n-grams, and then visualize the results in word network graphs. Looking at your data in this way can help you gain an understanding of the underlying data.
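As a rough illustration of the two ideas named above, the following sketch counts bigrams (two-word n-grams) and document-level word co-occurrence in plain Python. It does not use or reproduce the tidytext or widyr APIs; the function names, the sample sentences, and the cut-off of two documents are all assumptions made for the example.

```python
import re
from collections import Counter
from itertools import combinations


def tokenize(text):
    """Lowercase a string and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def bigram_counts(docs):
    """Count adjacent word pairs (bigrams) across all documents."""
    counts = Counter()
    for doc in docs:
        tokens = tokenize(doc)
        counts.update(zip(tokens, tokens[1:]))
    return counts


def cooccurrence_counts(docs):
    """Count how many documents each unordered word pair appears in together."""
    counts = Counter()
    for doc in docs:
        words = sorted(set(tokenize(doc)))
        counts.update(combinations(words, 2))
    return counts


# Hypothetical free-text records, invented for this example.
docs = [
    "fever and cough reported",
    "cough and sore throat",
    "fever and rash",
]

bigrams = bigram_counts(docs)
pairs = cooccurrence_counts(docs)

# Candidate edges for a word network: pairs seen together in at least two documents.
edges = sorted(pair for pair, n in pairs.items() if n >= 2)
```

Each retained pair could then be drawn as an edge between two word nodes, with the count as the edge weight.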

Presented May 31, 2017.

Eric Bakota will go over the results from the survey and then I’ll show a report that we generate at HHD using RMarkdown, SQL, and RODBC. The report uses RODBC to connect to our Electronic Disease Surveillance System (MAVEN) and query the data it needs. The data are imported into R, where they are processed into the tables, graphs, and charts that make up the report. Automating this report has saved 8-10 hours each month.

Presented May 26, 2016.

Eric Bakota will go over Hadley Wickham’s ‘ggplot2’ package using the same grammar of graphics framework outlined in Wickham’s 2010 paper on the subject. This webinar will discuss how to create a plot by looking at the components that make up its overall structure. It will also go into how these graphics can be integrated into RMarkdown to create an automated report that is visually appealing.

Description

Real-time syndromic surveillance requires daily surveillance of a range of health data sources. Most real-time data sources from health care systems exhibit large day-of-the-week fluctuations, as service provision and patient behaviour vary by day of the week. Regular day-of-the-week effects are further complicated by public holidays (usually 8 per year in England), which can limit the availability of certain services and affect patient behaviour. Simple seven-day moving averages fail to provide a smoothed trend around public holidays and can lead to false alarms or potential delays in the detection of outbreaks.

Objective

To develop smoothing techniques for daily syndromic surveillance data that allow for the easier identification of trends and unusual activity independent of day of the week and holiday effects.
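The holiday problem described above can be shown with a small sketch. It applies a centred seven-day moving average to invented daily counts that follow a fixed weekly pattern, then repeats the calculation with one Monday replaced by a low, holiday-like count. The data, the function name, and the choice of a centred window are assumptions for illustration; this is not the smoothing technique developed by the authors.

```python
def centred_moving_average(values, window=7):
    """Centred moving average; positions without a full window are None."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        if i < half or i >= len(values) - half:
            smoothed.append(None)  # not enough neighbours on one side
        else:
            chunk = values[i - half : i + half + 1]
            smoothed.append(sum(chunk) / window)
    return smoothed


# Hypothetical weekly pattern, Monday to Sunday, with no underlying trend.
weekly_pattern = [130, 100, 100, 100, 100, 90, 80]
counts = weekly_pattern * 3

# An ordinary run: the weekly pattern averages out to a flat line.
baseline = centred_moving_average(counts)

# Assume the Monday of week two (index 7) is a public holiday with few attendances.
holiday_counts = list(counts)
holiday_counts[7] = 60
with_holiday = centred_moving_average(holiday_counts)
```

In the ordinary run every full window contains one of each weekday, so the smoothed value stays at 100.0. With the holiday, each of the seven windows that includes that Monday drops to 90.0, so the smoothed series shows a week-long dip although the underlying activity has not changed.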

Submitted by teresa.hamby@d… on
Description

The CDC provides data on the incidence of diseases on its website (https://data.cdc.gov/). Data are available at national, regional, and state levels, and are uploaded to the CDC’s website on a weekly basis. The CDCPlot web application (available at https://michaud.shinyapps.io/CDCPlot/), built using the Shiny package in R, provides a quick and user-friendly method of visualizing these data. Users select the timeframes, locations, and diseases they wish to view, and the corresponding plots are produced. An optional alert threshold notifies users when a disease increases significantly from one week to the next. In addition, CDCPlot provides visualizations of CDC data on Pneumonia and Influenza mortality.

Objective

To demonstrate the current features and functionality of the CDCPlot application, and to introduce potential new features of the application. 
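The description does not say how the alert threshold is calculated. As one possible reading, the sketch below flags a week when its count exceeds the previous week's count by more than a chosen relative threshold. The function name, the 50% default, the handling of a zero count in the previous week, and the sample counts are all assumptions, not CDCPlot's actual rule.

```python
def weekly_alerts(counts, threshold=0.5):
    """Return indices of weeks whose count rose by more than `threshold`
    (as a fraction of the previous week's count)."""
    alerts = []
    for week in range(1, len(counts)):
        previous, current = counts[week - 1], counts[week]
        if previous == 0:
            # Relative increase is undefined; assume any new case is flagged.
            flagged = current > 0
        else:
            flagged = (current - previous) / previous > threshold
        if flagged:
            alerts.append(week)
    return alerts


# Hypothetical weekly case counts for one disease in one location.
counts = [10, 12, 11, 20, 22, 0, 3]
alerts = weekly_alerts(counts, threshold=0.5)
```

With these counts, week 3 is flagged because 20 is more than 50% above 11, and week 6 is flagged because cases reappear after a week with none.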

Submitted by rmathes on
Description

The French syndromic surveillance system SurSaUD® was set up in 2004 by Santé publique France, the national public health agency (formerly the French institute for public health, InVS). In 2016, the system is based on three main data sources: attendances in about 650 emergency departments (ED), consultations at 62 emergency general practitioners’ (GPs) associations SOS Médecins, and mortality data from 3,000 civil status offices [1]. Each day, about 60,000 ED attendances (88% of national attendances), 8,000 visits to SOS Médecins associations (95% of national visits) and 1,200 deaths (80% of national mortality) are recorded across the territory and transmitted to Santé publique France. About 100 syndromic groupings of interest are constructed from the reported diagnostic codes and monitored daily or weekly, for different age groups and geographical scales, to characterize trends, detect expected or unexpected events (outbreaks) and assess the potential impact of both environmental and infectious events. All-cause mortality is monitored with similar objectives. Two user-friendly interactive web applications have been developed using the R shiny package [2] to provide a homogeneous framework for all the epidemiologists involved in syndromic surveillance at the national and regional levels.

Objective

The presentation describes the design and the main functionalities of two user-friendly applications developed using R-shiny to support the statistical analysis of morbidity and mortality data from the French syndromic surveillance system SurSaUD.

Submitted by Magou on
Description

State HIV offices routinely produce fact sheets, epidemiologic profiles, and other reports from the eHARS (Enhanced HIV/AIDS Reporting System) database, which was created and is maintained by the CDC. The eHARS software is used throughout the United States to monitor the HIV epidemic and evaluate HIV prevention programs and policies. Because eHARS varies little across the United States, software developed to analyze and visualize data using the eHARS database schema may be useful to many state HIV offices and could reduce the time required for analysis and production of reports.

The R software environment for statistical computing is an open source project with a thriving community of users who continue to expand R’s analysis capacity through the addition of packages. A package is “a standardized collection of material extending R, e.g. providing code, data, or documentation”. Shiny is one example of a user-developed package; it allows R users to easily create interactive web applications from analytical software.

Objective

Describe the development process and function of a data dashboard for state HIV surveillance and discuss the benefits of creating interactive data dashboards in the R software environment.

Submitted by teresa.hamby@d… on
Description

MERS-CoV was discovered in 2012 in the Middle East, and human cases around the world have been carefully reported by the WHO. MERS-CoV is a novel betacoronavirus closely related to a virus (NeoCoV) hosted by a bat, Neoromicia capensis. MERS-CoV infects humans and camels. In 2015, MERS-CoV spread from the Middle East to South Korea, which sustained an outbreak. Thus, it is clear that the virus can spread among humans in areas in which camels are not husbanded.

Objective

Here we use novel methods of phylogenetic transmission graph analysis to reconstruct the geographic spread of MERS-CoV. We compare these results to those derived from text mining and visualization of the World Health Organization’s (WHO) Disease Outbreak News.

 

Submitted by Magou on
Description

The Carolinas Poison Control Center (CPC) provides the 24/7/365 poison hotline for the entire state of North Carolina and currently handles approximately 80,000 calls per year. CPC consultation services, which assist callers with poison exposure, diagnosis, optimal patient management, therapy, and patient disposition guidance, remain indispensable to the public and health care providers. Poison control center data have been used for years in syndromic surveillance practice as a reliable data source for early event detection. This information has been useful for a variety of public health issues, including environmental exposures, foodborne diseases, overdoses, medication errors, drug identification, drug abuse trends and other information needs. The North Carolina Department of Health and Human Services started formal integration of CPC information into surveillance activities in 2004. CPC call data are uploaded in real time (hourly), 24/7/365, to the NC DETECT state database.

Objective

To describe the Carolinas Poison Control Center (CPC) call data collected in the NC DETECT syndromic surveillance system.

Submitted by teresa.hamby@d… on