Machine Learning

Description

In the traditional surveillance of notifiable infectious diseases in Germany, the Robert Koch Institute receives reports not only of individual cases but also of outbreaks: epidemiologists assign each case a label indicating whether it is part of an outbreak and, if so, which one. In the language of machine learning, this expert knowledge provides a "ground truth" for the algorithmic task of detecting outbreaks in a stream of surveillance data. Integrating this kind of information into the design and evaluation of algorithms is called supervised learning.

Objective: To evaluate and improve the performance of automated infectious-disease-outbreak detection by systematically scoring algorithms and integrating outbreak data through statistical learning. The improvements should be directly relevant to epidemiological practice. A broader objective is to explore the usefulness of machine-learning approaches in epidemiology.
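As a minimal sketch of what "systematically scoring algorithms" against expert labels can look like, the Python snippet below compares per-week alarm flags from a detection algorithm with per-week outbreak labels. The function name `score_alarms`, the weekly granularity, and the example data are assumptions made for illustration; the abstract does not specify the scoring scheme actually used.

```python
from typing import Dict, Sequence


def score_alarms(alarms: Sequence[bool], outbreak: Sequence[bool]) -> Dict[str, float]:
    """Compare per-week alarm flags with expert outbreak labels (ground truth)."""
    if len(alarms) != len(outbreak):
        raise ValueError("alarms and outbreak labels must cover the same weeks")
    pairs = list(zip(alarms, outbreak))
    tp = sum(1 for a, o in pairs if a and o)          # alarm during a labeled outbreak
    fp = sum(1 for a, o in pairs if a and not o)      # alarm without an outbreak
    fn = sum(1 for a, o in pairs if not a and o)      # missed outbreak week
    tn = sum(1 for a, o in pairs if not a and not o)  # correctly silent week
    return {
        "tp": tp, "fp": fp, "fn": fn, "tn": tn,
        "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
        "specificity": tn / (tn + fp) if tn + fp else float("nan"),
    }


# Hypothetical six-week example, not real surveillance data.
alarms = [False, True, True, False, False, True]
labels = [False, False, True, True, False, True]
scores = score_alarms(alarms, labels)
```

Scoring per week is only one possible choice; scoring per outbreak (was each labeled outbreak detected at least once, and how early) may match epidemiological practice more closely.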

Submitted by elamb on
Description

Mortality is an indicator of the severity of an event's impact on the population. In France, mortality surveillance is part of the syndromic surveillance system SurSaUD and is carried out by Santé publique France, the French public health agency. The set-up of an Electronic Death Registration System (EDRS) in 2007 made it possible to receive medical causes of death in free-text format in real time. This data source was considered timely and valuable enough to support a reactive mortality surveillance system based on medical causes of death (1). The system relies on the monitoring of Mortality Syndromic Groups (MSGs). An MSG is defined as a cluster of medical causes of death (pathologies, syndromes, symptoms) that meets the objectives of early detection and impact assessment of events (2). Since causes of death are entered in free-text format, their automatic classification into MSGs requires natural language processing methods. The use of these methods to classify medical information and to support health surveillance has increased steadily over the last two decades (3).

Objective: This study aims to implement and evaluate two methods for automatically classifying free-text medical causes of death into Mortality Syndromic Groups (MSGs) for use in reactive mortality surveillance.
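The abstract does not name the two classification methods, so the following is only an illustrative sketch of one common baseline: rule-based keyword matching on normalized text. The group names and keyword lists in `MSG_KEYWORDS` are hypothetical and are not the MSG definitions used by Santé publique France.

```python
import re
import unicodedata
from typing import List

# Hypothetical groups and keywords, for illustration only.
MSG_KEYWORDS = {
    "influenza": ["grippe", "influenza", "flu"],
    "heat_related": ["hyperthermia", "heat stroke", "dehydration"],
}


def normalize(text: str) -> str:
    """Lowercase, strip accents, and collapse punctuation to single spaces."""
    decomposed = unicodedata.normalize("NFKD", text)
    no_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    return re.sub(r"[^a-z0-9]+", " ", no_accents.lower()).strip()


def classify(cause_text: str) -> List[str]:
    """Return every group with at least one keyword present as a whole word or phrase."""
    padded = f" {normalize(cause_text)} "
    return sorted(
        group
        for group, keywords in MSG_KEYWORDS.items()
        if any(f" {normalize(k)} " in padded for k in keywords)
    )
```

For example, `classify("Severe dehydration; heat stroke")` returns `["heat_related"]`, while a cause with no matching keyword returns an empty list. A rule-based approach like this is easy to audit but does not handle misspellings or abbreviations, which are frequent in free-text certificates.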

Submitted by elamb on
Description

Within the last five to ten years, opioid overdoses have emerged as a major public health concern. The high potential for fatal events, disease transmission, and addiction all contribute to negative outcomes. However, what is currently known about opioid use and overdose is generally gathered from emergency room data, public surveys, and mortality data. In addition, opioid overdose is a non-reportable condition. As a result, standardized state or national procedures for surveillance or reporting have not been developed, and local government monitoring is frequently not specific enough to capture and track all opioid overdoses. Lastly, traditional means of collecting data on conditions such as heart disease, through hospital networks or insurance companies, are not necessarily applicable to opioid overdoses, because the course of addiction is often short and health care visits are inconsistent. Overdose patients are also reluctant to follow up or provide contact information because of law enforcement concerns or personal reasons. Furthermore, overdose data collected several months or years after the fact are of little use for short-term outreach. Therefore, given the potentially brief timeline from addiction or use to negative outcome, the current project set out to create a near real-time surveillance and treatment/outreach system for opioid overdoses using an existing EMS data collection framework.

Objective: To develop and implement a classification algorithm to identify likely acute opioid overdoses from text fields in emergency medical services (EMS) records.
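The abstract does not describe the algorithm itself. As a hedged illustration of how text fields can be screened, the sketch below sums weights for indicator terms found in an EMS narrative, ignores terms that are directly negated, and flags the record when the total reaches a threshold. The terms, weights, negation cues, and threshold are all assumptions chosen for the example.

```python
import re

# Hypothetical indicator patterns and weights, for illustration only.
INDICATORS = {
    r"\bnarcan\b|\bnaloxone\b": 2.0,
    r"\boverdose\b|\bod\b": 1.5,
    r"\bheroin\b|\bfentanyl\b|\bopioid\w*\b": 1.5,
    r"\bpinpoint pupils\b": 1.0,
    r"\bunresponsive\b": 0.5,
}

# A negation cue followed by at most two words, ending right before the matched term.
NEGATION = re.compile(r"\b(?:no|denies|denied|negative for)\s+(?:\w+\s+){0,2}$")


def overdose_score(narrative: str) -> float:
    """Sum the weights of indicators that appear at least once without negation."""
    text = narrative.lower()
    score = 0.0
    for pattern, weight in INDICATORS.items():
        for match in re.finditer(pattern, text):
            if not NEGATION.search(text[: match.start()]):
                score += weight
                break  # count each indicator at most once
    return score


def is_likely_overdose(narrative: str, threshold: float = 3.0) -> bool:
    """Flag the record when the evidence score reaches the (assumed) threshold."""
    return overdose_score(narrative) >= threshold
```

In practice the weights and threshold would need to be tuned against manually reviewed records rather than set by hand as they are here.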

Submitted by elamb on
Description

Infectious diseases involve many interacting factors, and detecting them, preventing them, and breaking the chain of transmission require several kinds of effort. Recently, machine learning has shown promise both for automated surveillance that leads to rapid, early interventions and for the extraction of phenotypic features from human faces. In addition, mobile devices have become a promising tool for on-the-ground surveillance and geolocation mapping, especially in remote areas. Pacific Northwest National Laboratory (PNNL) combines machine learning with mobile technology in a prototype for disease surveillance that needs no internet connection, only a camera. In this Android application, VisionDx, a machine learning algorithm analyzes images of human faces and, within milliseconds, tells the user whether or not the person is sick, together with a confidence level. VisionDx has two modes, photo and video, and additional history, map, and statistics features. This application is the first of its kind and provides a new way to think about the future of syndromic surveillance.

Objective: Automated syndromic surveillance using mobile devices is an emerging public health focus with high potential for enhanced disease tracking and prevention in areas with poor infrastructure. Pacific Northwest National Laboratory sought to develop an Android mobile application for syndromic biosurveillance that would i) use the phone camera to take images of human faces and detect individuals who are sick through a machine learning (ML) model and ii) collect image data to increase the training data available for ML models. The initial prototype use case is screening and tracking the health of soldiers, for use by the Department of Defense's Defense Threat Reduction Agency.

Submitted by elamb on
Description

Unlike other health threats of recent concern for which widespread mortality was hypothetical, the high fatality burden of the opioid overdose crisis is present, steadily growing, and affecting young and old, rural and urban, military and civilian subpopulations. While the background of many public health monitors is mainly infectious disease surveillance, these epidemiologists seek to collaborate with behavioral health and injury prevention programs and with law enforcement and emergency medical services to combat the opioid crisis. Recent efforts have produced key terms and phrases for searching available data sources, along with numerous user-friendly dashboards that allow inspection of hundreds of plots. The current effort seeks to distill the numerous stratified data outputs into combined fusion alerts of greatest concern. Near-term plans are to implement the best-performing fusion methods as an ESSENCE module for the benefit of OHA staff and other user groups.

Objective: In a partnership between the Public Health Division of the Oregon Health Authority (OHA) and the Johns Hopkins Applied Physics Laboratory (APL), our objective was to develop an analytic fusion tool that uses streaming data and report-based evidence to improve the targeting and timing of evidence-based interventions in the ongoing opioid overdose epidemic. The tool is intended to enable practical situational awareness in the ESSENCE biosurveillance system to target response programs at the county and state levels. Threats to be monitored include emerging events and gradual trends of overdoses in three categories: all prescription and illicit opioids, heroin, and especially high-mortality synthetic drugs such as fentanyl and its analogues. Traditional sources included emergency department (ED) visits and emergency medical services (EMS) call records. Novel sources included poison center calls, death records, and report-based information such as bad batch warnings on social media. Using the data and requirements analyses available thus far, we applied and compared Bayesian networks, decision trees, and other machine learning approaches to derive robust tools that reveal emerging overdose threats and identify at-risk subpopulations.
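To make the idea of fusing alerts from several sources concrete, here is a minimal sketch of the simplest Bayesian network for this purpose: a single "event" node whose alert streams are assumed conditionally independent given the event (a naive Bayes structure). It is not the model developed in the project; the stream names, the prior, and the per-stream sensitivity and false-alarm rates are hypothetical values chosen for illustration.

```python
from typing import Dict, Tuple


def fuse_alerts(
    prior: float,
    streams: Dict[str, Tuple[float, float]],
    alerts: Dict[str, bool],
) -> float:
    """Posterior probability of an event given per-stream alert states.

    streams maps a stream name to (P(alert | event), P(alert | no event)).
    Streams are assumed conditionally independent given the event state.
    """
    odds = prior / (1.0 - prior)
    for name, fired in alerts.items():
        sensitivity, false_alarm_rate = streams[name]
        if fired:
            odds *= sensitivity / false_alarm_rate
        else:
            odds *= (1.0 - sensitivity) / (1.0 - false_alarm_rate)
    return odds / (1.0 + odds)


# Hypothetical stream characteristics, not estimates from OHA data.
STREAMS = {
    "ed_visits": (0.7, 0.10),
    "ems_calls": (0.6, 0.10),
    "poison_center": (0.3, 0.05),
}

posterior = fuse_alerts(
    prior=0.05,
    streams=STREAMS,
    alerts={"ed_visits": True, "ems_calls": True, "poison_center": False},
)
```

With these assumed numbers, two concurrent alerts raise the probability of an event well above the 5% prior even though the third stream is silent. A full Bayesian network would relax the independence assumption, for example by letting ED and EMS alerts depend on each other.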

Submitted by elamb on
Description

In 2004, Santé publique France, the French public health agency, set up a reactive all-cause mortality surveillance system based on the administrative part of the death certificate, with two objectives: 1) to detect unexpected or unusual variations in mortality and 2) to provide a first evaluation of the mortality impact of events. In 2007, an Electronic Death Registration System (EDRS) was implemented, enabling electronic transmission of the medical causes of death to the agency in real time. To date, 12% of deaths are registered electronically. A pilot study demonstrated that these data were valuable for a reactive mortality surveillance system based on causes of death. A strategy has thus been developed for the routine analysis of the medical causes of death, with the objectives of early detection of expected and unexpected outbreaks and reactive evaluation of their impact. This system will make it possible to assess which causes account for an excess of deaths when one is observed.

Objective: The aim of this study is to present the syndromic groups that will be routinely monitored for the reactive mortality surveillance based on free-text medical causes of death.

Submitted by elamb on
Description

Patient consultations recorded as voice dictations are frequently stored electronically as free-text transcriptions. Information stored as free text is not computer tractable. Advances in artificial intelligence permit the conversion of free text into structured information that can be analyzed statistically.

 

Objective

This paper describes DMReporter, a medical language processing system that automatically extracts information pertaining to diabetes (demography, numerical measurement values, medication list, and diagnoses) from the free text in physicians' notes and stores it in a structured format in a MySQL database.
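The abstract does not say how DMReporter performs extraction. As a rough sketch of one piece of such a pipeline, the snippet below pulls numerical measurement values out of a free-text note with regular expressions and returns them as a dictionary that could then be written to a database row. The field names and patterns are hypothetical and cover only two measurements.

```python
import re
from typing import Dict

# Hypothetical patterns: a measurement name, a short gap of non-digits,
# a number, and the expected unit.
PATTERNS = {
    "hba1c_percent": re.compile(
        r"\b(?:hba1c|a1c)\b\D{0,15}?(\d+(?:\.\d+)?)\s*%", re.IGNORECASE
    ),
    "glucose_mg_dl": re.compile(
        r"\bglucose\b\D{0,20}?(\d+(?:\.\d+)?)\s*mg/dl", re.IGNORECASE
    ),
}


def extract_measurements(note: str) -> Dict[str, float]:
    """Return the first value found for each known measurement in the note."""
    found = {}
    for field, pattern in PATTERNS.items():
        match = pattern.search(note)
        if match:
            found[field] = float(match.group(1))
    return found


note = "HbA1c was 7.2% today. Fasting glucose 145 mg/dL. Continue metformin."
values = extract_measurements(note)
```

Requiring the unit after the number is a deliberate choice here: it reduces the risk of picking up dates or dosages, at the cost of missing values written without units.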

Submitted by hparton on
Description

Event-based biosurveillance is the practice of monitoring diverse information sources to detect events pertaining to human health. Online documents, such as news articles on the Internet, have commonly been the primary information sources in event-based biosurveillance. Because of the large number of online publications and the diversity of their languages, thorough monitoring of online documents is challenging. Automated document classification is an important step toward efficient event-based biosurveillance. In Project Argus, a biosurveillance program hosted at Georgetown University Medical Center, supervised and unsupervised approaches to document classification are considered for event-based biosurveillance.

 

Objective

This paper describes ongoing efforts in enhancing automated document classification toward efficient event-based biosurveillance. 
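For readers unfamiliar with supervised document classification, the sketch below trains a small multinomial naive Bayes classifier with add-one smoothing on labeled example texts and uses it to label a new document as relevant or irrelevant. It illustrates the general technique only and is not the classifier used in Project Argus; the class name, labels, and training sentences are invented for the example.

```python
import math
import re
from collections import Counter, defaultdict
from typing import List


def tokenize(text: str) -> List[str]:
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, docs: List[str], labels: List[str]) -> "NaiveBayes":
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for doc, label in zip(docs, labels):
            self.word_counts[label].update(tokenize(doc))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, doc: str) -> str:
        n_docs = sum(self.class_counts.values())
        # Words never seen in training carry no information and are skipped.
        tokens = [w for w in tokenize(doc) if w in self.vocab]
        best_label, best_logp = None, -math.inf
        for label, count in self.class_counts.items():
            total = sum(self.word_counts[label].values())
            logp = math.log(count / n_docs)
            for w in tokens:
                logp += math.log(
                    (self.word_counts[label][w] + 1) / (total + len(self.vocab))
                )
            if logp > best_logp:
                best_label, best_logp = label, logp
        return best_label


# Invented training examples, for illustration only.
model = NaiveBayes().fit(
    [
        "cholera outbreak reported in coastal district",
        "hospital reports cluster of unexplained fever cases",
        "football team wins national championship",
        "stock market closes higher after rally",
    ],
    ["relevant", "relevant", "irrelevant", "irrelevant"],
)
```

Here `model.predict("new fever outbreak in the district")` returns `"relevant"`. A real system would need far more training data and some handling of multiple languages, which this sketch ignores.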

Submitted by hparton on
Description

Event-based biosurveillance is the practice of monitoring diverse information sources to detect events pertaining to human, plant, and animal health. Online documents, such as news articles, newsletters, and (micro-)blog entries, are its primary information sources. Document classification is an important step in filtering this information, and machine learning methods have been applied successfully to the task.

 

Objective

The objective of this literature review is to identify current challenges in document classification for event-based biosurveillance and to consider the efforts needed and the research opportunities they present.

Submitted by elamb on
Description

Disease surveillance data often has an underlying network structure (e.g. for outbreaks which spread by person-to-person contact). If the underlying graph structure is known, detection methods such as GraphScan (1) can be used to identify an anomalous subgraph which might be indicative of an emerging event. Typically, however, the network structure is unknown, and must be learned from unlabeled data, given only the time series of observed counts (e.g. daily hospital visits for each zip code).

Objective

Our goal is to learn the underlying network structure along which a disease outbreak might spread, and use the learned network to improve the timeliness and accuracy of outbreak detection.
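The abstract does not state how the network is learned. One simple heuristic, shown here purely as an illustration and not as the authors' method, is to propose a directed edge from location A to location B whenever A's counts correlate strongly with B's counts one time step later. The location names, lag, threshold, and count series below are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; returns 0.0 when either series is constant."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def learn_edges(
    series: Dict[str, List[float]], lag: int = 1, threshold: float = 0.8
) -> List[Tuple[str, str]]:
    """Propose edge (a, b) when a's counts predict b's counts `lag` steps later."""
    edges = []
    for a in series:
        for b in series:
            if a != b and pearson(series[a][:-lag], series[b][lag:]) >= threshold:
                edges.append((a, b))
    return edges


# Hypothetical daily counts per zip code; B repeats A's pattern one day later.
counts = {
    "A": [1, 2, 8, 3, 1, 2, 9, 2, 1],
    "B": [0, 1, 2, 8, 3, 1, 2, 9, 2],
    "C": [5, 4, 5, 4, 5, 4, 5, 4, 5],
}
edges = learn_edges(counts)
```

The learned edge list could then be passed as the graph to a subgraph-scanning detector. Lagged correlation is sensitive to shared seasonality, so a practical approach would need to remove common trends first or learn the structure from labeled or simulated outbreaks.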

Submitted by elamb on