Algorithm

Description

Chronic kidney disease (CKD) is currently the ninth leading cause of death in the United States. The prevalence of end-stage renal disease, the most severe form of CKD, has doubled in the last decade [1]. Early detection and treatment of CKD are critical to slow the progression of the disease and to decrease the risk of other chronic conditions, such as cardiovascular disease [2]. One accessible and cost-effective method for health research involves the use of medical administrative databases, such as insurance claims databases and institutional medical record systems. Individuals with diabetes, for example, have been accurately identified in Medicare and Veterans’ Health Administration databases using clearly defined and highly valid search algorithms [3]. However, little is known about the validity of administrative databases for identifying CKD. A systematic review of the literature was conducted to assess the validity of published methods for searching administrative databases for cases of CKD.

Objective

This poster summarizes a systematic literature review conducted to (1) describe published methods for researching chronic kidney disease (CKD) in administrative databases and (2) summarize the reported validity of methods of searching for CKD in administrative databases.

Submitted by Magou on
Description

The Real-Time Biosurveillance Program (RTBP) introduces modern surveillance technology to health departments in Sri Lanka and Tamil Nadu, India. Triage data from each patient visit (basic demographics, signs, symptoms, preliminary diagnoses) are recorded on paper at health facilities. Case records are transmitted daily to a central database using the RTBP mobile phone application. In India, data entry is done by medical professionals, but in Sri Lanka, due to staffing constraints, the same duty is performed by lower-cost personnel with limited domain knowledge. This results in noticeable differences in data entry error rates between the two locations. Most such errors are due to systematic and subjective misinterpretations of the doctors' handwritten notes by the data entry personnel. If not identified and remedied quickly, these errors can adversely affect the accuracy and timeliness of health event detection. There is a need to support system managers in their efforts to maintain high reliability of data used for public health surveillance.

 

Objective

We present a method for automated detection of systematic data entry errors in real time biosurveillance.

Submitted by hparton on
Description

In Montreal, notifiable diseases are reported to the Public Health Department (PHD). Of 44,250 disease notifications received in 2009, up to 25% had potential address errors. These can be introduced during transcription, handwriting interpretation and typing at various stages of the process, by patients, labs and/or physicians, and at the PHD. Reports received by the PHD are entered manually (initial entry) into a database. Archive personnel attempt to correct omissions by calling reporting laboratories or physicians. Investigators verify real addresses with patients or physicians for investigated episodes (40–60%).

The Dracones qualite (DQ) address verification algorithm compares the number, street and postal code against the 2009 Canada Post database. If the reported address is not consistent with a valid address in the Canada Post database, DQ suggests a valid alternative address.
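The comparison described above can be illustrated with a minimal sketch. It is not the DQ implementation: the reference table, the addresses in it, the function names and the use of fuzzy string matching to propose an alternative are all assumptions made for illustration.

```python
from difflib import get_close_matches

# Hypothetical reference table standing in for a postal address database.
# The entries are invented for this example.
REFERENCE = {
    ("123", "RUE EXEMPLE", "H1A1A1"),
    ("45", "AVENUE FICTIVE", "H2B2B2"),
}


def normalize(number, street, postal_code):
    """Uppercase, collapse whitespace and drop spaces in the postal code."""
    return (
        number.strip(),
        " ".join(street.upper().split()),
        postal_code.upper().replace(" ", ""),
    )


def verify_address(number, street, postal_code, reference=REFERENCE):
    """Return (is_valid, suggestion); suggestion is None when nothing is close."""
    addr = normalize(number, street, postal_code)
    if addr in reference:
        return True, None
    # Only consider reference entries that share the civic number or postal code.
    candidates = [r for r in reference if r[0] == addr[0] or r[2] == addr[2]]
    keys = {" ".join(c): c for c in candidates}
    match = get_close_matches(" ".join(addr), list(keys), n=1, cutoff=0.6)
    return False, (keys[match[0]] if match else None)
```

For example, `verify_address("123", "rue Exmple", "H1A 1A1")` is reported as invalid and the sketch suggests the reference entry with the correctly spelled street name.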

 

Objective

To (1) validate DQ, an algorithm developed to improve data quality for public health mapping, and (2) identify the origin of address errors.

Submitted by hparton on
Description

Influenza-like illness (ILI) data is collected by an Influenza Sentinel Provider Surveillance Network at the state (Iowa, USA) level. Historically, the Iowa Department of Public Health has maintained 19 different influenza sentinel surveillance sites. Because participation is voluntary, locations of the sentinel providers may not reflect optimal geographic placement. This study analyzes two different geographic placement algorithms - a maximal coverage model (MCM) and a K-median model. The MCM operates as follows: given a specified radius of coverage for each of the n candidate surveillance sites, we greedily choose the m sites that result in the highest population coverage. In previous work, we showed that the MCM can be used for site placement. In this paper, we introduce an alternative to the MCM - the K-median model. The K-median model, often called the P-median model in geographic literature, operates by greedily choosing the m sites which minimize the sum of the distances from each person in a population to that person’s nearest site. In other words, it minimizes the average travel distance for a population.
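The two greedy procedures described above can be sketched as follows. This is an illustrative sketch rather than the authors' code: it assumes planar coordinates with Euclidean distance, one point per person, and hypothetical function names; a real analysis would likely use population-weighted locations and geographic distances.

```python
import math


def dist(a, b):
    """Euclidean distance between two (x, y) points (an assumption of this sketch)."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def greedy_max_coverage(people, candidates, m, radius):
    """Greedily pick up to m sites, each time adding the most newly covered people."""
    chosen, covered = [], set()
    for _ in range(min(m, len(candidates))):
        best, best_gain = None, -1
        for j, site in enumerate(candidates):
            if j in chosen:
                continue
            gain = sum(
                1 for i, p in enumerate(people)
                if i not in covered and dist(p, site) <= radius
            )
            if gain > best_gain:
                best, best_gain = j, gain
        chosen.append(best)
        covered |= {
            i for i, p in enumerate(people) if dist(p, candidates[best]) <= radius
        }
    return chosen, len(covered)


def greedy_k_median(people, candidates, m):
    """Greedily pick up to m sites, each time minimizing total distance to the nearest site."""
    chosen = []
    nearest = [math.inf] * len(people)
    for _ in range(min(m, len(candidates))):
        best, best_cost = None, math.inf
        for j, site in enumerate(candidates):
            if j in chosen:
                continue
            cost = sum(min(nearest[i], dist(p, site)) for i, p in enumerate(people))
            if cost < best_cost:
                best, best_cost = j, cost
        chosen.append(best)
        nearest = [
            min(nearest[i], dist(p, candidates[best])) for i, p in enumerate(people)
        ]
    return chosen, sum(nearest)
```

Dividing the total returned by `greedy_k_median` by the number of people gives the average travel distance. Both routines are greedy heuristics, so they are not guaranteed to find the globally optimal set of sites.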

 

Objective

This paper describes an experiment to evaluate the performance of several alternative surveillance site placement algorithms with respect to the standard ILI surveillance system in Iowa.

Submitted by hparton on
Description

The Veterans Health Administration (VHA) uses the Electronic Surveillance System for the Early Notification of Community-based Epidemics (ESSENCE) to detect disease outbreaks and other health-related events earlier than other forms of surveillance. Although Veterans may use any VHA facility in the world, the strongest predictor of which health care facility is accessed is geographic proximity to the patient's residence. A number of outbreaks have occurred in the Veteran population when geographically separate groups convened in a single location for professional or social events. One classic example was the initial Legionnaires' disease outbreak, identified among participants at the Legionnaires' convention in Philadelphia in the late 1970s. Numerous events involving travel by large Veteran (and employee) populations are scheduled each year.

 

Objective

To develop an algorithm to identify disease outbreaks by detecting aberrantly large proportions of patient residential ZIP codes outside a health care facility catchment area.
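One way to make this idea concrete is sketched below. The abstract does not say which statistic is used, so the one-sided z-test for a proportion, the baseline rate, the threshold and all names here are assumptions made purely for illustration.

```python
import math


def out_of_catchment_alert(visit_zips, catchment_zips, baseline_rate, z_threshold=3.0):
    """Flag a period whose share of out-of-catchment ZIP codes is unusually high.

    baseline_rate is the historical proportion of out-of-catchment visits
    (hypothetical input); z_threshold is an arbitrary cutoff for this sketch.
    Returns (alert, z_score).
    """
    n = len(visit_zips)
    if n == 0:
        return False, 0.0
    outside = sum(1 for z in visit_zips if z not in catchment_zips)
    p_hat = outside / n
    # Standard error of a proportion under the baseline rate.
    se = math.sqrt(baseline_rate * (1 - baseline_rate) / n)
    z = (p_hat - baseline_rate) / se if se > 0 else 0.0
    return z > z_threshold, z
```

With a baseline of 10% out-of-catchment visits, a day on which half of 20 visits come from outside the catchment area would be flagged, while a day with 2 of 20 would not.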

Submitted by elamb on
Description

One of the significant challenges that multi-user biosurveillance systems face is alarm management. Currently deployed syndromic surveillance systems [1–3] have a single user interface. However, different users have different objectives; the alarms that are important for one category of user are irrelevant to the objectives of another category of user. For example, a physician wants to identify disease at the individual-patient level, a county health authority is interested in identifying a disease outbreak as early as possible within their local region, while an epidemiologist at the national level is interested in global situational awareness. The objective of a multi-agent decision support system is not only to recognize patterns of epidemiologically significant events but also to indicate their relevance to particular user groups’ objectives. Thus, instead of simply providing alerts of anomaly detections, the system architecture needs to provide analyzed information supporting multiple users’ decisions.

Submitted by elamb on
Description

There is a clear need for improved surveillance of chronic diseases to guide public health practice and policy. Chronic disease surveillance has tended to use administrative data, due to the need to link encounters for an individual over time and to have complete capture of all encounters. Case-detection algorithms generally combine variables found in the data using Boolean operators (i.e., AND, OR, NOT). For example, a commonly used algorithm for diabetes mellitus (DM) surveillance requires a patient to have one hospitalization or two physician visits within two years with a diagnostic code for DM. While this approach to defining case-detection algorithms is straightforward, it has limitations. For example, if more than simple combinations of one or two variables are used, then it becomes unwieldy to represent the algorithm and it can be difficult to identify how different variables in the definition contribute to detection accuracy. A multivariable probabilistic case-detection algorithm can address these problems and facilitate exploration of how the multiple variables available from different data sources might improve case-detection accuracy [1]. In this research, we develop an approach for probabilistic multivariable case-detection and apply the method to a cohort of older adults with known DM status to demonstrate and evaluate the method.
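The Boolean rule quoted above (one hospitalization, or two physician visits within two years, with a DM diagnostic code) can be written as a short function. This is a minimal sketch under stated assumptions: the inputs are lists of dates of encounters that already carry a DM code, "two years" is approximated as 730 days, and the function name is hypothetical.

```python
from datetime import date


def is_dm_case(hospitalizations, physician_visits, window_days=730):
    """Apply the Boolean rule: any hospitalization OR two visits within the window.

    Both arguments are lists of datetime.date values for encounters that
    carry a DM diagnostic code (an assumption of this sketch).
    """
    if hospitalizations:
        return True
    visits = sorted(physician_visits)
    # If any two visits fall within the window, some consecutive pair does too.
    return any((b - a).days <= window_days for a, b in zip(visits, visits[1:]))
```

Even this simple rule hides several choices (window length, which codes count, how encounters are linked), which illustrates why rules with more variables become hard to represent and evaluate.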

Objective

To develop and validate a multivariable probabilistic algorithm for detecting cases of diabetes mellitus (DM) using clinical and demographic data.

Submitted by knowledge_repo… on
Description

Time-of-arrival analysis (TOA) identifies clusters of patients arriving at a hospital ED within a short temporal interval. Past implementations have been restricted to records of patients with a specific type of complaint. The Florida Department of Health uses TOA at the county level for multiple subsyndromes (1). In 2011, NC DPH, CCHI and CDC collaborated to enhance and evaluate this capability for NC DETECT, using NC DETECT data in BioSense 1.0 (2). After this successful evaluation based on exposure complaints, discussions were held to determine the best approach to implementing this new algorithm in the production environment for NC DETECT. NC DPH was particularly interested in determining whether TOA could be used for identifying clusters of ED visits not filtered by any syndrome or sub-syndrome. In other words, can TOA detect a cluster of ED visits relating to a public health event, even if symptoms from that event are not characterized by a predefined syndrome grouping? Syndromes are continuously added to NC DETECT, but a syndrome cannot be created for every potential event of public health concern. This TOA approach is the first attempt to address this issue in NC DETECT. The initial goal is to identify clusters of related ED visits whose keywords, signs and/or symptoms are NOT all expressed by a traditional syndrome, e.g. rash, gastrointestinal, and flu-like illnesses. The goal instead is to identify clusters resulting from specific events or exposures regardless of how patients present: event concepts that are too numerous to pre-classify.
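The core idea of grouping arrivals that fall within a short interval can be sketched with a sliding window. This is not the TOA implementation used in NC DETECT: the fixed count threshold, the window length and the function name are assumptions, and a production method would assess statistical significance rather than apply a raw count.

```python
from datetime import datetime, timedelta


def arrival_clusters(arrivals, window_minutes=30, min_count=3):
    """Return (first_arrival, last_arrival, count) for each window with enough arrivals.

    arrivals is an iterable of datetime values; window_minutes and min_count
    are illustrative thresholds, not values from the source.
    """
    times = sorted(arrivals)
    window = timedelta(minutes=window_minutes)
    clusters = []
    start = 0
    for end in range(len(times)):
        # Shrink the window from the left until it spans at most `window`.
        while times[end] - times[start] > window:
            start += 1
        count = end - start + 1
        if count >= min_count:
            clusters.append((times[start], times[end], count))
    return clusters
```

Overlapping windows are reported separately in this sketch; merging them into a single cluster would be a natural refinement.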

Objective

To describe a collaboration among the Johns Hopkins Applied Physics Laboratory (JHU APL), the North Carolina Division of Public Health (NC DPH), and the UNC Department of Emergency Medicine Carolina Center for Health Informatics (CCHI) to implement time-of-arrival analysis (TOA) for hospital emergency department (ED) data in NC DETECT to identify clusters of ED visits for which there is no pre-defined syndrome or sub-syndrome.

 

Submitted by Magou on
Description

The variability of free text emergency department (ED) data is problematic for biosurveillance, and current methods of identifying search terms for symptoms of interest are inefficient as well as time- and labor-intensive. Our ad hoc approach to term identification for the North Carolina Disease and Epidemiologic Collection Tool (NC DETECT) begins with development of clinical case definitions from which we build automated syndrome queries in Structured Query Language (SQL). The queries are used to search free text clinical data from EDs, with the goal of identifying free text terms to match the case definitions. The free text search terms were initially collected from epidemiologists and clinical and technical staff at NC DETECT through informal review of ED data. Over time, we reviewed individual cases missed by our queries and identified additional search terms. We also manually reviewed records to find misspellings, abbreviations and acronyms for known search terms (e.g., dypnea, diff. br. and SHOB for dyspnea), and developed a pre-processor to clean text prior to syndromic classification. The purpose of this project was to develop and test a more standardized approach to search term identification.
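A pre-processing step of the kind described above might look like the following minimal sketch. It is not the NC DETECT pre-processor: the lookup table contains only the three variants mentioned in the text, and the names `VARIANTS` and `clean_complaint` are hypothetical.

```python
import re

# Variants taken from the example in the text; a real table would be much larger.
VARIANTS = {
    "dypnea": "dyspnea",
    "diff. br.": "dyspnea",
    "shob": "dyspnea",
}


def build_pattern(variants):
    """Compile one regex matching any variant as a standalone token."""
    # Longest variants first so multi-word forms win over shorter ones.
    alternatives = sorted(variants, key=len, reverse=True)
    body = "|".join(re.escape(a) for a in alternatives)
    return re.compile(r"(?<!\w)(" + body + r")(?!\w)")


PATTERN = build_pattern(VARIANTS)


def clean_complaint(text):
    """Lowercase the text and replace known variants with the canonical term."""
    return PATTERN.sub(lambda m: VARIANTS[m.group(1)], text.lower())
```

Applied to `"Pt c/o SHOB and fever"`, the sketch returns `"pt c/o dyspnea and fever"`, which a downstream syndrome query for dyspnea can then match.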

 

Objective

This paper describes and applies a new method for identifying biosurveillance search terms using the Semantic Network of the Unified Medical Language System.

Submitted by elamb on
Description

Measures aimed at controlling epidemics of infectious diseases critically benefit from early outbreak recognition. Through a manual electronic medical record (EMR) review of 5,127 outpatient encounters at the Veterans Administration health system (VA), we previously developed single-case detection algorithms (CDAs) aimed at uncovering individuals with influenza-like illness (ILI). In this work, we evaluate the impact of using CDAs of varying statistical performance on the time and workload required to find a community-wide influenza outbreak through a VA-based syndromic surveillance system (SSS). The CDAs utilize various logical arrangements of EMR data, including ICD-9 codes, structured clinical parameters, and/or an automated analysis of the free text of the full clinical note. The 18 ILI CDAs used here are limited to the most successful representatives of ICD-9-only and EMR-based case detectors.

 

Objective

This work uses a mathematical model of a plausible influenza epidemic to begin to test the influence of CDAs on the performance of an SSS.

Submitted by elamb on