Our objective in this research is to take advantage of a supercomputer grid (TeraGrid) to develop a distributed-memory, national-scale agent-based model (ABM) for studying disease outbreaks at the micro level. This requires data at both the national surveillance level and the local level of community structure and outbreaks.
Data Analytics
This paper develops a new method for multivariate spatial cluster detection, the "multivariate Bayesian scan statistic" (MBSS). MBSS combines information from multiple data streams in a Bayesian framework, enabling faster and more accurate outbreak detection.
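As a rough illustration of the fusion step, the sketch below computes a posterior over outbreak hypotheses from per-stream likelihood ratios. The function and variable names are our own, and the uniform prior is a placeholder assumption rather than a value from the paper.

```python
def mbss_posterior(lr, prior_outbreak=0.01):
    """Toy Bayesian fusion over outbreak hypotheses.

    lr[h] is the likelihood ratio for hypothesis h (e.g. a region and
    outbreak-type pair), already multiplied across all monitored data
    streams m: lr[h] = prod_m P(D_m | h) / P(D_m | null).
    Returns the posterior over the hypotheses plus the null (key None).
    """
    prior = {h: prior_outbreak / len(lr) for h in lr}  # uniform over outbreaks
    unnorm = {h: lr[h] * prior[h] for h in lr}
    unnorm[None] = 1.0 - prior_outbreak  # null hypothesis: LR = 1 by definition
    z = sum(unnorm.values())
    return {h: v / z for h, v in unnorm.items()}

# Example: two candidate hypotheses, one with much stronger evidence.
print(mbss_posterior({"ILI@regionA": 40.0, "GI@regionB": 2.5}))
```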
Real-time syndromic surveillance systems require an adapted dataflow organization and tools to support real-time data processing, from data acquisition through to the counter-measure building process. This work explores the capabilities of a specific model-based architecture to fulfill these requirements, and reports its results during a full-scale international disease surveillance exercise.
The Activity Monitoring Operating Characteristic (AMOC) curve is a useful and popular method for assessing the performance of algorithms that detect outbreaks of disease [1]. As it is typically applied in biosurveillance, the AMOC curve plots the expected time to detection (since the outbreak began) as a function of the false alert rate. An ideal algorithm has zero false alerts and a detection time of zero. An alternative, conceptually equivalent version of the AMOC curve plots (T – detection_time) as a function of the false alert rate, where T is a maximum meaningful detection time. We focus on this version.
Objective
We introduce a new measure for evaluating alerting algorithms, which is a generalization of the AMOC curve [1]. For a given false alert rate, the new measure estimates the time between when an alert is raised and when clinicians are expected to detect the outbreak on their own. We call this measure the Expected Warning Time (EWT).
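A minimal sketch of both quantities, under our own conventions (thresholds set from scores on outbreak-free data; all names are illustrative, not from the cited work):

```python
import numpy as np

def amoc_point(null_scores, outbreak_scores, outbreak_start, far, T):
    """One point on the alternative AMOC curve: (T - detection_time)
    at false alert rate `far`. `null_scores` are the algorithm's daily
    scores on outbreak-free data and set the alert threshold."""
    threshold = np.quantile(null_scores, 1.0 - far)
    for day, score in enumerate(outbreak_scores):
        if day >= outbreak_start and score > threshold:
            return max(T - (day - outbreak_start), 0)
    return 0  # never detected within the maximum meaningful time T

def expected_warning_time(alert_day, clinician_day):
    """EWT for one outbreak: how much earlier the alert comes than the
    day clinicians are expected to detect the outbreak on their own."""
    return max(clinician_day - alert_day, 0)
```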
A time-periodic geographic disease surveillance system based on the cylindrical space-time scan statistic proposed by Kulldorff [1] has been used extensively for disease surveillance with the SaTScan software. This statistic is based on a circular spatial scan statistic. Many different tests have also been proposed to detect purely spatial disease clusters. In particular, spatial scan statistics such as those developed by Duczmal and Assuncao (2004), Patil and Taillie (2004), and Tango and Takahashi (2005) aim to detect irregularly shaped clusters that the circular spatial scan statistic may miss. However, because of the unlimited geometric freedom of cluster shapes, these statistics run the risk of detecting very large clusters of implausibly peculiar shape. The flexible spatial scan statistic proposed by Tango and Takahashi [2], which has been used with the FleXScan software [3], has a parameter K, the pre-set maximum number of neighbors to be scanned, to avoid detecting clusters of unlikely peculiar shape. The flexible spatial scan statistic can easily be extended to space-time alerting methods in syndromic surveillance.
Objective
This paper proposes a flexible space-time scan statistic for early detection of disease outbreaks.
This paper describes a comparison between the two statistics, SaTScan and FleXScan, applied to data on absentees in primary schools in Japan.
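Both the cylindrical and the flexible scan score each candidate zone with the same Poisson log-likelihood ratio; they differ in how candidate zones are enumerated (circles or cylinders versus connected subsets of the K nearest neighbors). A minimal sketch of that score, with illustrative names:

```python
from math import log

def poisson_llr(c, mu, C):
    """Kulldorff-style Poisson log-likelihood ratio for one candidate zone.

    c  : observed cases inside the zone
    mu : expected cases inside the zone under the null (scaled so the
         expectations over the whole map sum to C)
    C  : total observed cases in the study area
    Only zones with elevated rates (c > mu) are of interest.
    """
    if c <= mu:
        return 0.0
    if c == C:  # avoid log(0) when all cases fall inside the zone
        return c * log(c / mu)
    return c * log(c / mu) + (C - c) * log((C - c) / (C - mu))
```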
Effective anomaly detection depends on the timely, asynchronous generation of anomalies from multiple data streams using multiple algorithms. Our objective is to describe the use of a case manager tool for combining anomalies into cases, and for collaborative investigation and disposition of cases, including data visualization.
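As a toy sketch of the grouping step only (not the actual tool's logic; the `Case` structure and `same_event` predicate are our assumptions), incoming anomalies might be attached to open cases like this:

```python
from dataclasses import dataclass, field

@dataclass
class Case:
    """A case groups related anomalies for collaborative investigation."""
    anomalies: list = field(default_factory=list)
    status: str = "open"  # open -> investigating -> closed

def assign_to_cases(anomalies, cases, same_event):
    """Attach each incoming anomaly to an existing open case when
    `same_event` says they plausibly describe the same signal (e.g. same
    syndrome and region within a time window); otherwise open a new case."""
    for a in anomalies:
        for case in cases:
            if case.status == "open" and any(same_event(a, b) for b in case.anomalies):
                case.anomalies.append(a)
                break
        else:
            cases.append(Case(anomalies=[a]))
    return cases
```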
The traditional SaTScan algorithm [1],[2] uses the Euclidean distance between centroids of the regions in a map to assemble connected sets of regions (connected in the sense that two adjoining regions share a physical border). According to the value of the corresponding logarithm of the likelihood ratio (LLR), a connected set of regions can be classified as a statistically significant detected cluster. For the study of events like contagious diseases or homicides, we consider using the flow of people between two regions to build up a set of regions (a zone) with a high incidence of the event. In this sense, the greater the flow of people between two regions, the closer the regions are. In a cluster of regions formed by this flow-based proximity criterion, the regions will not necessarily be connected to each other.
Objective
We present a new approach to the circular scan method [1] that uses the flow of people to detect and infer clusters of regions with high incidence of some event randomly distributed on a map. We use a real database of homicide cases in Minas Gerais state, in southeast Brazil, to compare our proposed method with the original circular scan method on both simulated clusters and the real data.
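One way to read the proposal is that the circular scan's nearest-by-distance ordering is replaced by a nearest-by-flow ordering when growing candidate zones. The sketch below is a hedged reconstruction of that step, not the authors' code; `flow[i][j]` (symmetrized flow of people between regions i and j) and `max_size` are assumptions.

```python
def grow_flow_zone(seed, flow, max_size):
    """Grow a candidate zone from `seed` by repeatedly adding the region
    with the largest total flow of people to or from the current zone,
    in place of the Euclidean nearest-centroid ordering."""
    zone = [seed]
    remaining = set(flow) - {seed}
    while remaining and len(zone) < max_size:
        nxt = max(remaining, key=lambda r: sum(flow[r].get(z, 0) for z in zone))
        zone.append(nxt)
        remaining.remove(nxt)
    return zone  # each prefix of `zone` is a candidate for the LLR score
```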
Data consisting of counts or indicators aggregated from multiple sources pose particular problems for data quality monitoring when the users of the aggregate data are blind to the individual sources. This arises when agencies wish to share data but, for privacy or contractual reasons, are only able to share data at an aggregate level. If the aggregators of the data are unable to guarantee the quality of either the sources of the data or the aggregation process, then the quality of the aggregate data may be compromised. This situation arose in the Distribute surveillance system (1). Distribute was a national emergency department syndromic surveillance project developed by the International Society for Disease Surveillance for influenza-like illness (ILI) that integrated data from existing state and local public health department surveillance systems, and operated from 2006 until mid-2012. Distribute was designed to work solely with aggregated data, with sites providing data aggregated from sources within their jurisdiction, and for which detailed information on the un-aggregated ‘raw’ data was unavailable. Previous work (2) on Distribute data quality identified several issues caused in part by the nature of the system: transient problems due to inconsistent uploads, problems associated with transient or long-term changes in the source makeup of the reporting sites, and a lack of data timeliness due to individual site data accruing over time rather than in batch. Data timeliness was addressed using prediction intervals to assess the reliability of the partially accrued data (3). The types of data quality issues present in the Distribute data are likely to appear to some extent in any aggregate data surveillance system where direct control over the quality of the source data is not possible.
Objective
In this work we present methods for detecting both transient and long-term changes in the source data makeup.
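A minimal sketch of one such check, flagging level shifts in a site's aggregate counts that are consistent with a source dropping out of or joining the aggregate; the window and thresholds here are illustrative assumptions, not values from the Distribute analyses:

```python
import numpy as np

def source_makeup_flags(counts, window=28, drop=0.5, rise=2.0):
    """Flag days where a site's aggregate count shifts enough, relative
    to a rolling median baseline, to suggest a change in source makeup."""
    counts = np.asarray(counts, dtype=float)
    flags = []
    for t in range(window, len(counts)):
        baseline = np.median(counts[t - window:t])
        if baseline > 0 and (counts[t] < drop * baseline
                             or counts[t] > rise * baseline):
            flags.append(t)
    return flags
```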
Aberration detection methods are essential for analyzing and interpreting the large quantities of nonspecific real-time data collected in syndromic surveillance systems. However, the challenge lies in distinguishing true outbreak signals from a large number of false alarms (1). The joint use of surveillance algorithms may help guide decision making when warning signals are uncertain.
Objective
To develop and test a method that combines different control bars for outbreak detection in a syndromic surveillance system.
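One simple form of such a combination is to alarm only when two standard control bars signal together, which damps false alarms at some cost in sensitivity. The sketch below joins an EWMA chart and a one-sided CUSUM; the parameters are textbook defaults and the baseline is estimated from the whole series for simplicity, both assumptions rather than choices from the abstract.

```python
import numpy as np

def combined_alarm(series, ewma_lambda=0.3, ewma_k=3.0,
                   cusum_k=0.5, cusum_h=4.0):
    """Return the days on which EWMA and CUSUM control bars both signal."""
    x = np.asarray(series, dtype=float)
    z = (x - x.mean()) / x.std(ddof=1)  # standardized counts
    # Asymptotic control limit for an EWMA of a unit-variance series.
    ewma_limit = ewma_k * np.sqrt(ewma_lambda / (2 - ewma_lambda))
    ewma, cusum, alarms = 0.0, 0.0, []
    for t, zt in enumerate(z):
        ewma = ewma_lambda * zt + (1 - ewma_lambda) * ewma
        cusum = max(0.0, cusum + zt - cusum_k)  # one-sided upper CUSUM
        if ewma > ewma_limit and cusum > cusum_h:
            alarms.append(t)
    return alarms
```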