
R Language

Presented October 28, 2016.

We are going to briefly explore the tidytext, widyr, and flexdashboard packages to analyze word co-occurrence, look at n-grams, and then visualize the results in word network graphs. Examining text data in this way can help users understand its underlying structure.
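
A minimal sketch of this kind of analysis, assuming a data frame named docs with an id column and a text column (both names are placeholders, and the count threshold is illustrative):

    library(dplyr)
    library(tidytext)
    library(widyr)
    library(igraph)
    library(ggraph)

    # Tokenize into bigrams and count them
    bigrams <- docs %>%
      unnest_tokens(bigram, text, token = "ngrams", n = 2) %>%
      count(bigram, sort = TRUE)

    # Count word co-occurrence within each document, dropping stop words
    word_pairs <- docs %>%
      unnest_tokens(word, text) %>%
      anti_join(stop_words, by = "word") %>%
      pairwise_count(word, id, sort = TRUE)

    # Plot the most frequent pairs as a word network graph
    word_pairs %>%
      filter(n >= 5) %>%
      graph_from_data_frame() %>%
      ggraph(layout = "fr") +
        geom_edge_link() +
        geom_node_point() +
        geom_node_text(aes(label = name), repel = TRUE)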

Presented July 27, 2017.

The inferences we make from data can only be as good as the data themselves, so making sure that we receive timely, high-quality data is essential. In this presentation, Mark White will describe a number of functions that he has written to perform data quality checks on Kansas emergency department records from NSSP’s BioSense Platform.
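
As a hedged illustration (not the actual functions from the presentation), checks of this kind often cover completeness of required fields and timeliness of record arrival; the column names below are placeholders:

    # Percent of non-missing, non-blank values for each required field
    check_completeness <- function(df, fields) {
      sapply(df[fields], function(x) 100 * mean(!is.na(x) & as.character(x) != ""))
    }

    # Lag, in days, between the visit date and the date the record arrived
    # (assumes both columns are Date vectors)
    check_timeliness <- function(df, visit_col = "visit_date", arrival_col = "arrival_date") {
      summary(as.numeric(df[[arrival_col]] - df[[visit_col]]))
    }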

Presented May 31, 2017.

Eric Bakota will go over the results from the survey, and then I’ll show a report that we generate at HHD using RMarkdown, SQL, and RODBC. The report uses RODBC to connect to our Electronic Disease Surveillance System (MAVEN) and query the data it needs. The data are imported into R, where they are processed into the tables, graphs, and charts used to generate the report. Automating this report has saved 8-10 hours of work each month.
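
A minimal sketch of the querying step, assuming an ODBC data source named "MAVEN" has been configured; the DSN, query, and column names are placeholders rather than the actual MAVEN schema:

    library(RODBC)
    library(knitr)

    # Pull the records needed for the report from the surveillance database
    con <- odbcConnect("MAVEN")
    cases <- sqlQuery(con, "SELECT disease, onset_date, county FROM cases
                            WHERE onset_date >= '2017-01-01'")
    odbcClose(con)

    # Summarize by month; kable() renders the table in the RMarkdown output
    cases$month <- format(as.Date(cases$onset_date), "%Y-%m")
    monthly <- aggregate(disease ~ month, data = cases, FUN = length)
    kable(monthly, col.names = c("Month", "Case count"))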

Presented May 26, 2016.

Eric Bakota will walk through Hadley Wickham’s ‘ggplot2’ package using the grammar of graphics framework outlined in Wickham’s 2010 paper on the subject. This webinar will discuss how to build a plot from the components that make up its overall structure. It will also show how these graphics can be integrated into RMarkdown to create an automated, visually appealing report.
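
A minimal sketch of that layered approach, using the built-in mtcars data; the variables and labels are illustrative only:

    library(ggplot2)

    ggplot(mtcars, aes(x = wt, y = mpg)) +          # data and aesthetic mappings
      geom_point(aes(colour = factor(cyl))) +       # a point layer
      geom_smooth(method = "lm", se = FALSE) +      # a statistical summary layer
      facet_wrap(~ cyl) +                           # faceting specification
      labs(x = "Weight (1000 lbs)", y = "Miles per gallon",
           colour = "Cylinders")                    # labels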

Presented January 19, 2016.

This presentation will briefly introduce concepts related to effective visual display and give a “big picture” view of why and how R is an excellent tool for producing such displays. Through examples, it will cover the overall mechanics of producing visuals in R as well as some “nuts and bolts” details (e.g., the use of color). Methods for creating reproducible displays (e.g., with user-defined functions) and interactive displays (e.g., with the Shiny package) will also be demonstrated.
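
As a hedged sketch of the interactive side, a minimal Shiny app using the built-in faithful data set; the input and plot are illustrative only:

    library(shiny)

    ui <- fluidPage(
      selectInput("var", "Variable", choices = names(faithful)),
      plotOutput("hist")
    )

    server <- function(input, output) {
      output$hist <- renderPlot({
        hist(faithful[[input$var]], main = input$var, xlab = input$var)
      })
    }

    shinyApp(ui, server)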

Description

The EpiCenter syndromic surveillance platform currently uses Java libraries for time series analysis. Expanding the data quality capabilities of EpiCenter requires new analysis methods. While the Java ecosystem has a number of resources for general software engineering, it has lagged behind on numerical tools. As a result, including additional analytics requires implementing the methods de novo.

The R language and its ecosystem have emerged as one of the leading platforms for statistical analysis. A wide range of standard time series analysis methods is available in either the base system or contributed packages, and new techniques are regularly implemented in R. Previous attempts to integrate R with EpiCenter were hampered by the limitations of the available R/Java interfaces, which had gone without active development for a long time.

An alternative bridge is via the PostgreSQL database used by EpiCenter on the backend. An R extension for PostgreSQL exists, which can expose the entire R ecosystem to EpiCenter with minimal development effort.
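
A hedged sketch of how such a bridge might look, assuming the PL/R extension is the R-for-PostgreSQL extension in question; the wrapper is SQL with an R function body, and the function and sample call are illustrative rather than part of EpiCenter's schema:

    -- Enable the PL/R extension in the database
    CREATE EXTENSION IF NOT EXISTS plr;

    -- Expose a standard R routine to SQL: a simple anomaly score for the most
    -- recent value of a count series, based on the median absolute deviation
    CREATE OR REPLACE FUNCTION mad_score(float8[]) RETURNS float8 AS $$
      x <- arg1                       # PL/R passes arguments as arg1, arg2, ...
      (tail(x, 1) - median(x)) / mad(x)
    $$ LANGUAGE 'plr';

    -- Example call against an aggregated daily count series
    SELECT mad_score(ARRAY[10, 12, 9, 11, 10, 35]);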

Objective

To demonstrate the broader analytical capabilities made available by exposing the R language to EpiCenter reporting.
