
Data Quality

Description

There is an ancient African proverb that states, ‘If you want to travel fast, travel alone; if you want to travel far, travel together.’ This paper examines whether public health can and should ‘go it alone’ in efforts to create linkages between clinical care systems and the public health sector as part of meaningful use requirements. ‘Going it alone’ in this context refers to whether public health should seek to require data flows, through meaningful use requirements, that meet its own workflow needs but add no value to clinical workflows. An alternative would be to look for synergies between the goals of public health and those of the clinical care system, which public health could exploit to achieve its ends through collaborative means.

Objective

The objective of this paper is to review the limitations of current approaches to linking public health through meaningful use reporting requirements and to explore alternatives based on integrating public health data reporting requirements with clinical quality improvement reporting requirements.

Description

When a reportable condition is identified, clinicians and laboratories are required to report the case to public health authorities. These case reports help public health officials make informed decisions and implement appropriate control measures to prevent the spread of disease. Incomplete or delayed case reports can result in new occurrences of disease that could have been prevented. To improve the disease reporting and surveillance processes, the Utah Department of Health is collaborating with Intermountain Healthcare and the University of Utah to electronically transmit case reports from healthcare facilities to public health entities using Health Level Seven (HL7) v2.5, SNOMED CT, and LOINC. As part of the Utah Center of Excellence in Public Health Informatics, we conducted an observational study in 2009 to identify metrics for evaluating the impact of electronic systems. We collected baseline data in 2009, and in this paper we describe preliminary results from a follow-up study conducted in 2010.
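As a hedged illustration of the message standards involved, the Python sketch below assembles a minimal HL7 v2.5 ORU^R01 case report pairing a LOINC-coded test with a SNOMED CT-coded organism; the patient, facility, and field values are invented for the example, not taken from the Utah implementation.

    # Minimal, hypothetical HL7 v2.5 ORU^R01 electronic case report.
    # LOINC 625-4 and SNOMED CT 27268008 are real vocabulary entries,
    # but every other value here is invented for illustration.
    segments = [
        "MSH|^~\\&|LAB^SENDER|FACILITY|UDOH|UT|20100401120000||ORU^R01|MSG0001|P|2.5",
        "PID|1||PATID1234^^^FACILITY||DOE^JANE||19800101|F",
        "OBR|1|||625-4^Bacteria identified in Stool by Culture^LN",
        "OBX|1|CE|625-4^Bacteria identified in Stool by Culture^LN"
        "||27268008^Salmonella^SCT||||||F",
    ]
    print("\r".join(segments))  # HL7 v2 segments are carriage-return delimited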


Objective

This paper describes a comparison study conducted to assess the quality of reportable disease case reports received at the Salt Lake Valley Health Department in 2009 and 2010.

Description

Most, if not all, disease surveillance systems are federated in the sense that hospitals, doctors’ offices, and pharmacies are the source of most surveillance data. Although a health department may request or mandate that these organizations report data, we are not aware of any requirements governing the method of data collection, or of audits or other measures of quality control.

Because of this heterogeneity and the lack of control over the processes by which the data are generated, data sources in a federated disease surveillance system are black boxes whose reliability, completeness, and accuracy are not fully understood by the recipient.

In this paper, we use the variance-to-mean ratio of daily counts of surveillance events as a metric of data quality, taking thermometer sales data as an example of data from a federated disease surveillance system. We test the hypothesis that removing stores with higher baseline variability from pooled surveillance data will improve the signal-to-noise ratio of thermometer sales during an influenza outbreak.
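A minimal sketch of this idea on simulated data (the store counts, outbreak shape, and variance-to-mean threshold below are assumptions for illustration, not the study's actual thermometer sales):

    import numpy as np

    rng = np.random.default_rng(0)
    n_days = 120
    signal = np.zeros(n_days)
    signal[90:] = 8  # injected outbreak in the final 30 days

    # Simulate 10 stable stores (Poisson, variance ~= mean) and 10 noisy,
    # overdispersed stores (negative binomial, variance >> mean).
    stable = [rng.poisson(5 + signal) for _ in range(10)]
    noisy = [rng.negative_binomial(1, 1 / (6 + signal)) for _ in range(10)]
    stores = stable + noisy

    def vmr(series, baseline=slice(0, 90)):
        x = np.asarray(series, dtype=float)[baseline]
        return x.var() / x.mean()  # ~1 for Poisson; larger means noisier

    def snr(pooled):
        base, out = pooled[:90], pooled[90:]
        return (out.mean() - base.mean()) / base.std()

    pooled_all = np.sum(stores, axis=0)
    pooled_low = np.sum([s for s in stores if vmr(s) < 2.0], axis=0)
    print(f"SNR, all stores:     {snr(pooled_all):.2f}")
    print(f"SNR, low-VMR stores: {snr(pooled_low):.2f}")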


Objective

We developed a novel method for monitoring the quality of data in a federated disease surveillance system, which we define as ‘a surveillance system in which a set of organizations that are not owned or controlled by public health provide data.’

Description

The Centers for Disease Control and Prevention's (CDC) Emerging Infections Program (EIP) monitors and studies many infectious diseases, including influenza. In 10 states in the US, information is collected for hospitalized patients with laboratory-confirmed influenza. Data are extracted manually by EIP personnel at each site, stripped of personal identifiers, and sent to the CDC. The anonymized data are received and reviewed for consistency at the CDC before they are incorporated into further analyses; this review includes identifying errors, which form the basis of the classification introduced in this report.


Objective

Data quality checks can be used to generate feedback that remediates and/or reduces error generation at the source. In this report, we introduce a classification of the errors generated as part of the data collection process for the EIP’s Influenza Hospitalization Surveillance Project at the CDC. We also describe a set of mechanisms intended to minimize and correct these errors via feedback to the collection sites.
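As a minimal sketch of what such checks might look like (the field names, ranges, and error categories below are illustrative assumptions, not the project's actual edit rules):

    # Hypothetical data quality checks that classify errors per record,
    # so that feedback reports can be returned to each collection site.
    RULES = [
        ("missing_value", lambda r: not r.get("specimen_date")),
        ("out_of_range", lambda r: not (0 <= r.get("age", -1) <= 120)),
        ("inconsistent", lambda r: r.get("discharge_date", "9999") < r.get("admit_date", "")),
    ]

    records = [
        {"site": "A", "age": 34, "admit_date": "2011-01-05",
         "discharge_date": "2011-01-02", "specimen_date": "2011-01-04"},
        {"site": "B", "age": 140, "admit_date": "2011-01-03",
         "discharge_date": "2011-01-09"},
    ]

    feedback = {}  # site -> list of error categories to send back
    for rec in records:
        errors = [name for name, check in RULES if check(rec)]
        feedback.setdefault(rec["site"], []).extend(errors)
    print(feedback)  # {'A': ['inconsistent'], 'B': ['missing_value', 'out_of_range']}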

Description

Under-ascertainment of severe outcomes of influenza infections in administrative databases has long been recognised. After reviewing registered deaths following an influenza epidemic in 1847, William Farr, of the Registrar-General's Office, London, England, commented: "the epidemic carried off more than 5,000 souls over and above the mortality of the season, the deaths referred to that cause [influenza] are only 1,157" [1]. Even today, studies of the population epidemiology, burden and cost of influenza frequently assume that influenza's impact on severe health outcomes reaches far beyond the number of influenza cases counted in routine clinical and administrative databases. There is little current evidence to justify the assumption that influenza is poorly identified in health databases. Using population-based record linkage, we evaluated whether the assumption remains justified given modern improvements in diagnostic medicine and information systems.
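Farr's figures already imply the scale of the problem; as a quick worked example using only the numbers quoted above:

    # Under-ascertainment implied by Farr's 1847 figures quoted above.
    excess_deaths = 5000  # estimated deaths above the seasonal baseline
    certified = 1157      # deaths actually attributed to influenza
    print(f"fraction certified: {certified / excess_deaths:.0%}")  # 23%
    print(f"undercount factor: {excess_deaths / certified:.1f}x")  # 4.3x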

Objective

To estimate the degree to which illness due to influenza is under-ascertained in administrative databases, to determine factors associated with influenza being coded or certified as a cause of death, and to estimate the proportion of coded or certified influenza deaths that are laboratory confirmed.

Description

The Washington Comprehensive Hospital Abstract Reporting System (CHARS) has collected discharge data from billing systems for every inpatient admitted to every hospital in the state since 1987 [1]. The purpose of the system is to provide data for making informed decisions on health care. The system collects the age, sex, zip code, and billed charges of the patient, as well as hospital name, discharge diagnoses, and procedure codes. The data have potential value for monitoring the severity of outbreaks such as influenza, but not for prospective surveillance: reporting to CHARS is manual, not real-time, and there is roughly a 9-month lag in release of information by the state. In 2005, Public Health - Seattle & King County (PHSKC) requested that hospitals report pneumonia and influenza admissions (based on both admission and discharge codes) directly to the PHSKC biosurveillance system; data elements included hospital name, date/time of admission, age, sex, home zip code, chief complaint, disposition, and diagnoses. In 2008, reporting was revised to collect separate admission and discharge diagnoses, whether the patient was intubated or in the ICU, and a patient/visit key. Hospitals transmit data daily for visits that occurred up to 1 month earlier. Previously, we identified a strong concordance over time between the volumes of influenza diagnoses recorded in the PHSKC and CHARS systems [2]. However, discrepancies were observed, particularly when the data were stratified by hospital. We undertook an evaluation to identify the causes of these discrepancies.
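A minimal sketch of the kind of per-hospital comparison that surfaces such discrepancies (the counts below are invented for illustration; the actual evaluation compared influenza diagnosis volumes between the two systems):

    # Hypothetical influenza admission counts per hospital in each system;
    # flag hospitals whose counts diverge by more than 20%.
    phskc = {"Hospital A": 52, "Hospital B": 31, "Hospital C": 18}
    chars = {"Hospital A": 50, "Hospital B": 44, "Hospital C": 17}

    for hospital in sorted(phskc):
        a, c = phskc[hospital], chars[hospital]
        ratio = a / c
        flag = "  <-- investigate" if abs(ratio - 1) > 0.2 else ""
        print(f"{hospital}: PHSKC={a} CHARS={c} ratio={ratio:.2f}{flag}")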

Objective

We sought to evaluate the quality of the influenza hospitalization data gathered by our biosurveillance system.

Description

The Public Health - Seattle & King County syndromic surveillance system has been collecting emergency department (ED) data since 1999. These data include hospital name, age, sex, zip code, chief complaint, diagnoses (when available), disposition, and a patient and visit key. Data are collected for 19 of 20 King County EDs, for visits that occurred the previous day. Over time, various problems with data quality have been encountered, including data drop-offs, missing data elements, incorrect field values, duplication of data, data delays, and unexpected changes in the files received from hospitals. In spite of close monitoring of the data as part of our routine syndromic surveillance activities, there have occasionally been delays in identifying these problems. Since the validity of syndromic surveillance depends on data quality, we sought to develop a visualization to help monitor data quality over time and thereby improve the timeliness with which data quality problems are addressed.
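A minimal sketch of one such visualization, assuming daily record counts per hospital are available (the data are simulated, and a heatmap via matplotlib is only one of many reasonable renderings):

    import numpy as np
    import matplotlib.pyplot as plt

    rng = np.random.default_rng(1)
    hospitals = [f"ED {i}" for i in range(1, 6)]
    counts = rng.poisson(80, size=(len(hospitals), 60)).astype(float)
    counts[2, 40:50] = 0  # simulate a ten-day data drop-off at one ED

    # Color each hospital-day by its ratio to that hospital's median daily
    # volume, so drop-offs and surges stand out regardless of baseline size.
    ratio = counts / np.median(counts, axis=1, keepdims=True)
    plt.imshow(ratio, aspect="auto", cmap="RdYlGn", vmin=0, vmax=2)
    plt.yticks(range(len(hospitals)), hospitals)
    plt.xlabel("Day")
    plt.colorbar(label="count / hospital median")
    plt.title("Daily ED record volume relative to each hospital's median")
    plt.show()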


Objective 

We sought to develop a method for visualizing data quality over time.

Description

Distribute is a national emergency department syndromic surveillance project developed by the International Society for Disease Surveillance (ISDS) for influenza-like illness (ILI) that integrates data from existing state and local public health department surveillance systems. The Distribute project provides graphic comparisons of ILI-related clinical visits across jurisdictions as well as a national picture of ILI. Unlike other surveillance systems, Distribute is designed to work solely with summarized (aggregated) data, which cannot be traced back to the un-aggregated 'raw' data. This, together with the distributed, voluntary nature of the project, creates some unique data quality issues, with considerable site-to-site variability. Together with the ISDS, the University of Washington has developed processes and tools to address these challenges, mirroring work done by others in the Distribute community.
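One simple illustration of the aggregate-only constraint: with nothing but daily ILI counts per site, data quality must be inferred from the behavior of the aggregates themselves, for example how the count for a given service date fills in as late records arrive. The sketch below uses invented numbers; the actual Distribute tools are more elaborate.

    # For one site and one service date: the aggregate count reported at
    # successive lags. Completeness-by-lag is one aggregate-only quality measure.
    reported = {0: 110, 1: 152, 2: 160, 3: 163, 7: 165}  # lag (days) -> count
    final = max(reported.values())
    for lag, count in sorted(reported.items()):
        print(f"lag {lag}d: {count / final:.0%} of final count")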

Objective

To present exploratory tools and methods developed as part of the data quality monitoring of Distribute data, and to discuss these tools and their applications with other participants.

Description

Electronic laboratory reporting (ELR) was demonstrated just over a decade ago to be an effective method for improving both the timeliness of reporting and the number of reports submitted to public health agencies. The quality of data (including completeness) in information systems across all industries and organizations is often poor, and anecdotal reports in the surveillance literature suggest that ELR may not improve the completeness of the data in the submitted reports.
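A minimal sketch of a field-level completeness measurement over submitted reports (the field names and records are hypothetical):

    # Fraction of notifiable disease reports in which each field is populated.
    fields = ["patient_name", "dob", "address", "phone", "physician", "result"]
    reports = [
        {"patient_name": "X", "dob": "1970-01-01", "result": "positive"},
        {"patient_name": "Y", "dob": "1985-06-12", "address": "12 Main St",
         "physician": "Dr. Z", "result": "positive"},
    ]

    for field in fields:
        filled = sum(1 for r in reports if r.get(field))
        print(f"{field}: {filled / len(reports):.0%} complete")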


Objective 

To examine the completeness of data submitted from clinical information systems to public health agencies as notifiable disease reports.

Description

Distribute is a national emergency department syndromic surveillance project developed by the International Society for Disease Surveillance (ISDS) for influenza-like illness (ILI) that integrates data from existing state and local public health department surveillance systems. The Distribute project provides graphic comparisons of ILI-related clinical visits across jurisdictions as well as a national picture of ILI. Unlike other surveillance systems, Distribute is designed to work solely with summarized (aggregated) data, which cannot be traced back to the un-aggregated 'raw' data. This, together with the distributed, voluntary nature of the project, creates some unique data quality issues, with considerable site-to-site variability. Together with the ISDS, the University of Washington has developed processes and tools to address these challenges, mirroring work done by others in the Distribute community.

Objective

The goal of this session is to briefly present two methods for comparing aggregate data quality, to present the range of data quality results across participating Distribute sites, and to invite continued discussion of data quality with other surveillance practitioners.
