Free Text

Description

We are developing a Bayesian system for real-time surveillance and characterization of outbreaks that incorporates a variety of data elements, including free-text clinical reports. An existing natural language processing (NLP) system called Topaz is being used to extract clinical data from the reports. Moving the NLP system from a research project to a real-time service has presented many challenges.

 

Objective

Adapt an existing NLP system to be a useful component in a system performing real-time surveillance.

Submitted by hparton on
Description

Patient consultations recorded as voice dictations are frequently stored electronically as transcriptions in free text format. The information stored in free text is not computer tractable. Advances in artificial intelligence permit the conversion of free text into structured information that allows statistical analysis.

 

Objective

This paper describes DMReporter, a medical language processing system that automatically extracts information pertaining to diabetes (demography, numerical measurement values, medication list, and diagnoses) from the free text in physicians’ notes and stores it in a structured format in a MySQL database.

Submitted by hparton on
Description

Security threats and the recent emergence of avian influenza in Europe have heightened the profile of, and the need for, a good surveillance strategy during mass events. The two main rationales for enhanced infectious disease surveillance at mass events are a perceived increased risk of infectious disease events and a need to detect and respond to events more quickly. Moreover, the requirements of the International Health Regulations (IHR) issued by the World Health Organization (WHO), which take effect in mid-2007, define the need for timely reporting of infectious diseases during international mass events [1]. Therefore, enhanced surveillance, based on Germany’s pre-existing system of mandatory notifications, was conducted in the 12 World Cup cities.

Objective

In this abstract, we describe the major findings of an evaluation of our enhanced infectious disease surveillance activities during the FIFA Soccer World Cup 2006 in Germany.

Submitted by elamb on

ESSENCE users perform free text queries very often. Increasingly, those queries incorporate negation terms that allow users to match a case definition only when certain terms are not present. This video attempts to explain some of the special cases where negation in free text queries may be confusing for users. The video will mention some features in ESSENCE that you may not be familiar with, such as the Explain Query button and the Advanced Query Tool (AQT).

Submitted by elamb on
Description

The North Carolina Bioterrorism and Emerging Infection Prevention System (NC BEIPS) receives daily emergency department (ED) data from 33 (29%) of the 114 EDs in North Carolina. These data are available via a Web-based portal and the Early Aberration Reporting System to authorized NC public health users for the purpose of syndromic surveillance (SS). Users currently monitor several syndromes, including gastrointestinal severe, fever/rash illness, and influenza-like illness. The syndrome definitions are based on the infection-related syndrome definitions of the CDC and search the chief complaint (CC) and, when available, the triage note (TN) and initial temperature fields. Some EDs record a TN, which is a brief text passage that describes the CC in more detail. Most research on the utility of ED data for SS has focused on the use of CC. The goal of this study was to determine the sensitivity, specificity, and positive and negative predictive values of including TN in the syndrome queries.
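The four measures named above follow from a standard 2x2 comparison of query results against a reference classification. The Python sketch below is illustrative only: the function and parameter names are hypothetical, the counts in the usage note are made up, and nothing here is taken from the NC BEIPS study itself.

```python
import math


def ratio(numerator, denominator):
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else math.nan


def diagnostic_metrics(true_pos, false_pos, false_neg, true_neg):
    """Compute standard diagnostic accuracy measures from 2x2 counts.

    true_pos:  visits the query flagged that truly belong to the syndrome
    false_pos: visits the query flagged that do not belong
    false_neg: syndrome visits the query missed
    true_neg:  non-syndrome visits the query correctly left unflagged
    """
    return {
        "sensitivity": ratio(true_pos, true_pos + false_neg),
        "specificity": ratio(true_neg, true_neg + false_pos),
        "ppv": ratio(true_pos, true_pos + false_pos),
        "npv": ratio(true_neg, true_neg + false_neg),
    }
```

For example, `diagnostic_metrics(8, 2, 2, 88)` uses invented counts in which the query flags 10 visits, 8 of them correctly, and misses 2 true cases. Computing the same measures once for a CC-only query and once for a CC-plus-TN query would allow the two to be compared side by side.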

 

Objective

This study evaluates the addition of TN to syndrome queries used in the NC BEIPS.

Submitted by elamb on
Description

Syndromic surveillance of emergency department (ED) visit data is often based on computer algorithms which assign patient chief complaints (CC) and ICD code data to syndromes. The triage nurse note (NN) has also been used for surveillance. Previously we developed an “NGram” classifier for syndromic surveillance of ED CC in Italian for detection of natural outbreaks and bioterrorism. The classifier is developed from a set of ED visits for which both the ICD diagnosis code and CC are available by measuring the associations of text fragments within the CC (e.g. 3 characters for a “3-gram”) with a syndromic group of ICD codes. We found good correlation between the daily volumes assigned by the ICD10 classifier and those estimated by NGrams. However, because the CC was limited to 23 options based on the pick list, it might be possible to obtain results as good as the NGram method or better using a simpler probabilistic approach. Also, in addition to the CC, the Italian data included a free-text NN. We might be able to achieve improved performance by applying the NGram method to the NN or to the CC supplemented by the NN.
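The idea of associating character fragments with a syndromic group of ICD codes can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' classifier: the association measure used here (the fraction of training visits containing a fragment that fall in the syndrome group) and the averaging rule for scoring are assumptions, and all function names are hypothetical.

```python
from collections import Counter


def char_ngrams(text, n=3):
    """Return all overlapping character n-grams of the lowercased text."""
    t = text.lower()
    return [t[i:i + n] for i in range(len(t) - n + 1)]


def train(visits, n=3):
    """Estimate, per n-gram, the share of visits containing it that are in the syndrome.

    visits: iterable of (chief_complaint, in_syndrome) pairs, where in_syndrome
    is True when the visit's ICD code belongs to the syndromic group.
    """
    total = Counter()
    positive = Counter()
    for complaint, in_syndrome in visits:
        grams = set(char_ngrams(complaint, n))  # count each n-gram once per visit
        total.update(grams)
        if in_syndrome:
            positive.update(grams)
    return {gram: positive[gram] / total[gram] for gram in total}


def score(complaint, weights, n=3):
    """Average the learned weights over the known n-grams of a new complaint."""
    known = [g for g in set(char_ngrams(complaint, n)) if g in weights]
    if not known:
        return 0.0
    return sum(weights[g] for g in known) / len(known)
```

With a toy training set such as `[("febbre e tosse", True), ("tosse", True), ("trauma caviglia", False), ("dolore addominale", False)]` (invented examples), a new complaint of "tosse" scores high and "trauma" scores low. Summing scores over a day's visits gives an estimated daily syndrome volume, and the same functions can be applied to NN text or to CC and NN concatenated.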

 

Objective

Our objective was to compare the performance of the NGram CC classifier to two discrete classifiers based on probabilistic associations with the CC pick list items. Also, we wished to determine the performance of the NGram method applied to CC alone, NN alone, and CC plus NN.

Submitted by elamb on
Description

The syndromic surveillance system in Scotland was implemented in response to Gleneagles hosting the G8 summit in July 2005. Part of this surveillance system used data from NHS24, a nurse-led telephone helpline that is the means of access to out-of-hours general practice services for the Scottish population. These data were processed by the ERS, and reports were generated for 10 syndromes considered relevant to possible bio-terrorism or disease outbreaks. These syndromes are: colds and flu, difficulty breathing, fever, diarrhoea, coughs, double vision, eye problems, rash, lumps and vomiting. Following the G8 summit, the ERS has been updated weekly using data pre-categorised into syndromes at NHS24 (known as protocolled data). The proportion of calls processed by the protocol at NHS24 over this time has, however, fallen to around 40%. This change has given the impetus to create a free text searching algorithm which can classify all calls received by NHS24 into one of the 10 syndromes or “other”. This allows all calls to be analysed by the ERS.
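A free text search of this kind can be pictured as a keyword lookup that assigns each call to the first syndrome whose terms appear in the call text, and to “other” when none do. The Python sketch below is a hypothetical illustration: the 10 syndrome names come from the description above, but the keyword lists, the matching rule, and the first-match priority order are assumptions and do not describe the algorithm used at HPS.

```python
import re

# Hypothetical keyword lists; a real system would need clinically reviewed terms.
SYNDROME_KEYWORDS = {
    "colds and flu": ["cold", "flu", "influenza"],
    "difficulty breathing": ["breathless", "short of breath", "wheez"],
    "fever": ["fever", "temperature", "pyrexia"],
    "diarrhoea": ["diarrhoea", "diarrhea", "loose stool"],
    "coughs": ["cough"],
    "double vision": ["double vision", "diplopia"],
    "eye problems": ["eye"],
    "rash": ["rash"],
    "lumps": ["lump", "swelling"],
    "vomiting": ["vomit", "being sick"],
}

# One case-insensitive pattern per syndrome; \b anchors each keyword at a word start.
PATTERNS = {
    syndrome: re.compile(
        "|".join(r"\b" + re.escape(keyword) for keyword in keywords),
        re.IGNORECASE,
    )
    for syndrome, keywords in SYNDROME_KEYWORDS.items()
}


def classify(call_text):
    """Return the first syndrome whose keywords match the call text, else 'other'."""
    for syndrome, pattern in PATTERNS.items():
        if pattern.search(call_text):
            return syndrome
    return "other"
```

Because the first match wins, the order of the dictionary acts as a priority list, so a call mentioning both a high temperature and a rash is labelled “fever” here. Prefix matching is crude (the keyword "flu" would also match "fluid"), and negated mentions such as "no fever" are not handled, so a production classifier would need more careful rules.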

 

Objective

Public Health consultants at Health Protection Scotland (HPS) monitor routine data from the NHS24 telephone helpline to provide information on possible epidemics of flu or other infectious diseases in Scotland. This paper describes the exception reporting system run at HPS and the adaptations made to the classification system in response to the change in data recording patterns at NHS24.

Submitted by elamb on
Description

After the SARS outbreak in 2003, Beijing established Fever Clinics in major hospitals for the early detection of potential respiratory disease outbreaks. The data collected in Fever Clinics include basic patient information, body temperature, cough, and breathing condition, as well as a primary diagnosis. Since the symptoms and diagnosis are mainly recorded in free text format, they are very difficult to use for data analysis. Because of the problems in data processing, the data collection has decreased.

 

Objective

This paper describes the methodology in the development of an Integrated Surveillance System for Beijing, China.

Submitted by elamb on
Description

Timely surveillance of disease outbreak events of public health concern currently requires detailed and time-consuming manual analysis by experts. Recently, in addition to traditional information sources, the World Wide Web has offered a new modality for surveillance, but the massive collection of multilingual texts which must be processed in real time presents an enormous challenge.

 

Objective

In this paper we present a summary of the BioCaster system architecture for Web rumour surveillance, the rationale for the choices made in the system design and an empirical evaluation of topic classification accuracy for a gold-standard of English and Vietnamese news.

Submitted by elamb on
Description

BioSense is a national automated surveillance system designed to enhance the nation's capability to rapidly detect and quantify public health emergencies by accessing and analyzing diagnostic and prediagnostic health data. The BioSense system currently receives near real-time data from more than 540 civilian hospitals, as well as national daily batched data from over 1100 Department of Defense and Veterans Affairs medical facilities. BioSense maps chief complaint and diagnosis data to 11 syndromes and 78 sub-syndromes. This project was spurred by the recent detection of several clusters with chief complaints containing the term “exposure”, only some of which map to current BioSense sub-syndromes. BioSense currently does not have a generic “exposure” sub-syndrome.

 

Objective

To identify hospital visits with chief complaints concerning exposures, characterize them, and develop methods for detecting exposure clusters.

Submitted by elamb on