Dowling John

We are developing a Bayesian system for real-time surveillance and characterization of outbreaks that incorporates a variety of data elements, including free-text clinical reports. An existing natural language processing (NLP) system called Topaz is being used to extract clinical data from the reports. Moving the NLP system from a research project to a real-time service has presented many challenges.

Objective

Adapt an existing NLP system to be a useful component in a system performing real-time surveillance.

Referenced File

Challenges_In_Adapting_An_Natural_Language_Processing_System_For_Real_Time_Surveillance.pdf

Submitted by hparton on Tue, 06/18/2019 - 13:47

Current practices of automated case detection fall at the extremes of diagnostic accuracy and timeliness. With regard to diagnostic accuracy, electronic laboratory reporting (ELR) is at one extreme and syndromic surveillance is at the other. With regard to timeliness, syndromic surveillance can be immediate, whereas ELR is delayed 7 days from the initial patient visit. A plausible middle way between these extremes is an automated Bayesian diagnostic system that uses all available data types, for example, free-text ED reports, radiology reports, and laboratory reports. We have built such a solution: Bayesian case detection (BCD). As a probabilistic system, BCD operates across the spectrum of diagnostic accuracy; that is, it outputs the degree of certainty for every diagnosis. In addition, BCD incorporates multiple data types as they appear during the course of a patient encounter or lifetime, with no degradation in its ability to perform diagnosis.
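The idea of revising a diagnostic probability as each data type arrives can be illustrated with a minimal sketch. It is not the BCD implementation: it assumes binary findings that are conditionally independent given disease status (a naive Bayes simplification), and the finding names and likelihood values are hypothetical placeholders chosen only for illustration.

```python
def update_posterior(prior, p_finding_given_disease, p_finding_given_no_disease):
    """Apply Bayes' rule for one observed binary finding."""
    numerator = prior * p_finding_given_disease
    denominator = numerator + (1.0 - prior) * p_finding_given_no_disease
    return numerator / denominator


# Hypothetical likelihoods: finding -> (P(finding | disease), P(finding | no disease)).
# These numbers are illustrative assumptions, not values from the BCD system.
LIKELIHOODS = {
    "ed_report_mentions_fever": (0.80, 0.10),
    "radiology_shows_infiltrate": (0.30, 0.05),
    "lab_test_positive": (0.95, 0.01),
}


def posterior_over_encounter(prior, findings):
    """Return the posterior after each finding, in order of arrival."""
    trajectory = []
    probability = prior
    for finding in findings:
        p_disease, p_no_disease = LIKELIHOODS[finding]
        probability = update_posterior(probability, p_disease, p_no_disease)
        trajectory.append(probability)
    return trajectory


if __name__ == "__main__":
    steps = posterior_over_encounter(
        0.01,
        ["ed_report_mentions_fever", "radiology_shows_infiltrate", "lab_test_positive"],
    )
    for step, probability in enumerate(steps, start=1):
        print(f"after finding {step}: P(disease) = {probability:.3f}")
```

In this sketch a posterior is available after the first free-text ED finding and is refined when radiology and laboratory results appear later, which mirrors the trade-off between timeliness and accuracy described above.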

Objective

This paper describes the architecture and evaluation of our recently developed automated BCD system.

Referenced File

Building_An_Automated_Bayesian_Case_Detection_System.pdf

Submitted by hparton on Tue, 06/18/2019 - 13:40

Ontologies representing knowledge from the public health and surveillance domains currently exist. However, they focus on infectious diseases (infectious disease ontology), reportable diseases (PHSkb, retired) and internet surveillance from news text (BioCaster ontology), or are commercial products (OntoReason public health ontology). From the perspective of biosurveillance text mining, these ontologies do not adequately represent the kind of knowledge found in clinical reports. Our project aims to fill this gap by developing a stand-alone ontology for the public health/biosurveillance domain, which (1) provides a starting point for standard development, (2) is straightforward for public health professionals to use for text analysis, and (3) can be easily plugged into existing syndromic surveillance systems.

Objective

To develop an application ontology - the extended syndromic surveillance ontology - to support text mining of ER and radiology reports for public health surveillance. The ontology encodes syndromes, diagnoses, symptoms, signs and radiology results relevant to syndromic surveillance (with a special focus on bioterrorism).

Referenced File

Developing_An_Application_Ontology_For_Mining_Clinical_Reports_The_Extended_Syndromic_Surveillance_Ontology.pdf

Submitted by hparton on Fri, 06/14/2019 - 10:58

Current methods for influenza surveillance include laboratory-confirmed case reporting, sentinel physician reporting of influenza-like illness (ILI), and chief-complaint monitoring from emergency departments (EDs).

The current methods for monitoring influenza have drawbacks. Testing for the presence of the influenza virus is costly and delayed. Sentinel physician reporting, although specific, is often incomplete and delayed. Chief complaint (CC) based surveillance is limited in that a chief complaint does not capture all of a patient's signs and symptoms.

A possible solution to the cost, delays, incompleteness and low specificity (for CC) in current methods of influenza surveillance is automated surveillance of ILI using clinician-provided free-text ED reports.

Objective

This paper describes an automated ILI reporting system based on natural language processing of transcribed ED notes and its impact on public health practice at the Allegheny County Health Department.

Referenced File

An_Automated_Influenza_Like_Illness_Reporting_System_Using_Freetext_Emergency_Department_Reports.pdf

Submitted by hparton on Mon, 06/10/2019 - 09:09

The Extended Syndromic Surveillance Ontology (ESSO) is an open source terminological ontology designed to facilitate the text mining of clinical reports in English [1,2]. At the core of ESSO are 279 clinical concepts (for example, fever, confusion, headache, hallucination, fatigue) grouped into eight syndrome categories (rash, hemorrhagic, botulism, neurological, constitutional, influenza-like-illness, respiratory, and gastrointestinal). In addition to syndrome groupings, each concept is linked to synonyms, variant spellings and UMLS Concept Unique Identifiers. ESSO builds on the Syndromic Surveillance Ontology [3], a resource developed by a working group of eighteen researchers representing ten syndromic surveillance systems in North America. ESSO encodes almost three times as many clinical concepts as the Syndromic Surveillance Ontology, and incorporates eight syndrome categories, in contrast to the Syndromic Surveillance Ontology's four (influenza-like-illness, constitutional, respiratory and gastrointestinal). The new clinical concepts and syndrome groupings in ESSO were developed by a board-certified infectious disease physician (author JD) in conjunction with an informaticist (author MC).
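The structure described above (a concept linked to synonyms, UMLS identifiers, and syndrome categories) can be sketched as a small data structure. This is an illustrative assumption about layout, not ESSO's actual schema: the field names are hypothetical, the CUI strings are placeholders rather than real identifiers, and the concept-to-syndrome assignments shown are examples rather than ESSO's definitions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    """One clinical concept with its lexical variants and groupings."""
    name: str
    synonyms: tuple
    cuis: tuple        # UMLS Concept Unique Identifiers (placeholders below)
    syndromes: tuple   # syndrome categories the concept is grouped under


# Illustrative entries only; the CUIs and syndrome assignments are assumptions.
CONCEPTS = [
    Concept("fever", ("febrile", "pyrexia"), ("CUI-PLACEHOLDER-1",), ("constitutional",)),
    Concept("headache", ("cephalgia",), ("CUI-PLACEHOLDER-2",), ("neurological",)),
]


def build_index(concepts):
    """Map every lower-cased name and synonym to its concept."""
    index = {}
    for concept in concepts:
        for term in (concept.name, *concept.synonyms):
            index[term.lower()] = concept
    return index


def syndromes_in_text(text, index):
    """Return syndrome categories whose concept terms occur in the text.

    Uses naive substring matching; it does not handle negation or context.
    """
    lowered = text.lower()
    found = set()
    for term, concept in index.items():
        if term in lowered:
            found.update(concept.syndromes)
    return found


if __name__ == "__main__":
    index = build_index(CONCEPTS)
    print(sorted(syndromes_in_text("Patient is febrile with cephalgia", index)))
```

Because matching here is a plain substring test, a phrase such as "no fever" would still be counted; a real text-mining pipeline would need negation and context handling on top of the terminology.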

Objective

In order to evaluate and audit these new syndrome definitions, we initiated a survey of syndromic surveillance practitioners. We present the results of an online survey designed to evaluate syndrome definitions encoded in the Extended Syndromic Surveillance Ontology.

Referenced File

Evaluating_Syndromic_Definitions_In_The_Extended_Syndromic_Surveillance_Ontology.pdf

Submitted by elamb on Thu, 05/02/2019 - 08:52

Scientists have used many chief complaint (CC) classification techniques in biosurveillance, including keyword search, weighted keyword search, and naïve Bayes. These techniques may use CC-to-syndrome or CC-to-symptom-to-syndrome classification approaches. In the former approach, we classify a CC directly into syndrome categories. In the latter approach, we first classify a CC into symptom categories. Then, we use a syndrome definition, a combination of one or more symptoms, to determine whether or not a chief complaint belongs in a particular syndrome category. One approach to CC-to-symptom-to-syndrome classification uses manually weighted keyword search and Boolean operations to build syndrome classifiers. The limitations of this approach are that it does not address uncertainty in the data and that the system is manually parameterized. A CC-to-symptom-to-syndrome approach that is both probabilistic and uses machine learning addresses these limitations.
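The two-stage CC-to-symptom-to-syndrome idea can be sketched as follows. This is a simplified illustration, not the classifier described in the referenced paper: the keyword table, its probabilities, and the syndrome definitions are hypothetical, syndromes are modeled only as "any of these symptoms", and symptoms are assumed independent when they are combined. In a machine-learned system the probabilities would be estimated from labeled chief complaints rather than set by hand.

```python
import re
from math import prod

# Hypothetical table: keyword -> (symptom, P(symptom present | keyword seen)).
KEYWORD_TO_SYMPTOM = {
    "cough": ("cough", 0.95),
    "fever": ("fever", 0.95),
    "chills": ("fever", 0.60),
    "vomiting": ("vomiting", 0.95),
}

# Hypothetical syndrome definitions: a syndrome holds if ANY listed symptom is present.
SYNDROME_DEFINITIONS = {
    "respiratory": ["cough"],
    "gastrointestinal": ["vomiting"],
    "respiratory_or_febrile": ["cough", "fever"],
}


def symptom_probabilities(chief_complaint):
    """Stage 1: map a chief complaint to per-symptom probabilities."""
    tokens = re.findall(r"[a-z]+", chief_complaint.lower())
    probabilities = {}
    for token in tokens:
        if token in KEYWORD_TO_SYMPTOM:
            symptom, p = KEYWORD_TO_SYMPTOM[token]
            # Keep the strongest evidence seen for each symptom.
            probabilities[symptom] = max(p, probabilities.get(symptom, 0.0))
    return probabilities


def syndrome_probabilities(chief_complaint):
    """Stage 2: combine symptom probabilities through each syndrome definition."""
    symptoms = symptom_probabilities(chief_complaint)
    result = {}
    for syndrome, any_of in SYNDROME_DEFINITIONS.items():
        # P(at least one symptom) = 1 - P(none), assuming independence.
        p_none = prod(1.0 - symptoms.get(s, 0.0) for s in any_of)
        result[syndrome] = 1.0 - p_none
    return result


if __name__ == "__main__":
    for syndrome, p in syndrome_probabilities("COUGH, chills").items():
        print(f"{syndrome}: {p:.2f}")
```

Unlike a Boolean keyword rule, each syndrome receives a probability rather than a yes/no answer, so uncertain evidence such as "chills" contributes partially instead of being either counted fully or ignored.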

Objective

Design, build and evaluate a symptom-based probabilistic chief complaint classifier for the Real-time Outbreak and Disease Surveillance System.

Referenced File

Syco_A_Probabilistic_Machine_Learning_Method_For_Classifying_Chief_Complaints_Into_Symptom_And_Syndrome_Categories.pdf