
Epi Archive: Automated Synthesis of Global Notifiable Disease Data

Description

Government reporting of notifiable disease data is common and widespread, though most countries do not report in a machine-readable format. This is despite the WHO International Health Regulations stating that "[e]ach State Party shall notify WHO, by the most efficient means of communication available." Data are often published as a file that contains text, tables, and graphs summarizing weekly or monthly disease counts. This presents a problem when information is needed for more data-intensive approaches to epidemiology, biosurveillance, and public health. While most nations likely store incident data in a machine-readable format, governments can be hesitant to share data openly for technical, political, economic, and motivational reasons. A LANL survey of notifiable disease data reporting in over fifty countries identified only a few websites that report data in a machine-readable format. The majority (>70%) produce reports as PDF files on a regular basis. The bulk of the PDF reports present data in a structured tabular format, while some report in natural language or graphical charts. The structure and format of PDF reports change often, which adds to the complexity of identifying and parsing the desired data. Not all websites publish in English, and typos and clerical errors are common. LANL has developed a tool, Epi Archive, to collect global notifiable disease data automatically and continuously and to make them uniform and readily accessible.
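To make the idea of "uniform and readily accessible" data concrete, the following is a minimal sketch of how rows from one tabular report might be normalized into machine-readable records. It is not Epi Archive's implementation. It assumes the table text has already been extracted from a PDF as pipe-delimited lines, and the names `DiseaseCount`, `parse_count`, `normalize_table`, and the sample country and period are all hypothetical.

```python
import csv
import io
import json
from dataclasses import asdict, dataclass
from typing import List, Optional


@dataclass
class DiseaseCount:
    """One uniform record (hypothetical schema, not Epi Archive's)."""
    country: str
    disease: str
    period: str
    cases: Optional[int]  # None when the report gives no usable number


def parse_count(raw: str) -> Optional[int]:
    """Convert a reported count such as '1,204' to an int; None if missing."""
    cleaned = raw.strip().replace(",", "").replace(" ", "")
    if cleaned.lower() in {"", "-", "na", "n/a"}:
        return None
    try:
        return int(cleaned)
    except ValueError:
        # Clerical errors (e.g. stray letters) are recorded as missing.
        return None


def normalize_table(text: str, country: str, period: str) -> List[DiseaseCount]:
    """Turn pipe-delimited 'disease | cases' lines into uniform records."""
    records = []
    for row in csv.reader(io.StringIO(text), delimiter="|"):
        if len(row) < 2:
            continue  # skip blank or malformed lines
        disease = row[0].strip()
        if not disease or disease.lower() == "disease":
            continue  # skip the header row
        records.append(
            DiseaseCount(country, disease.title(), period, parse_count(row[1]))
        )
    return records


# Assumed sample input standing in for text extracted from a weekly report.
SAMPLE = """Disease | Cases
measles | 1,204
dengue | -
cholera | 37
"""

records = normalize_table(SAMPLE, country="Exampleland", period="2016-W05")
as_json = json.dumps([asdict(r) for r in records], indent=2)
print(as_json)
```

A real pipeline would also need per-source parsers that tolerate layout changes, translation of non-English disease names, and a mapping to standard disease identifiers. None of these are shown here.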

Objective: Automatically collect and synthesize global notifiable disease data and make them available to humans and computers. Provide the data on the web and within the Biosurveillance Ecosystem (BSVE) as a novel data stream. These data have many applications, including improving the prediction and early warning of disease events.

Submitted by elamb on