Incorporating Learning into Disease Surveillance Systems

Description

Current state-of-the-art outbreak detection methods [1-3] combine spatial, temporal, and other covariate information from multiple data streams to detect emerging clusters of disease. However, these approaches use fixed methods and models for analysis and cannot improve their performance over time. Here we consider two methods for overcoming this limitation: learning a prior over outbreak regions, and learning outbreak models from user feedback. Both use the recently proposed multivariate Bayesian scan statistic (MBSS) framework [1]. Given a set of outbreak types {O_k}, a set of space-time regions S, and the multivariate dataset D, MBSS computes the posterior probability Pr(H_1(S, O_k) | D) of each outbreak type in each region, using Bayes' theorem to combine the prior probabilities Pr(H_1(S, O_k)) with the data likelihoods Pr(D | H_1(S, O_k)). Each outbreak type can have a different prior distribution over regions, as well as a different model for its effects on the multiple streams. The set of outbreak types, as well as the region priors and outbreak models for each type, can be learned incrementally from labeled data or user feedback.
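The posterior computation described above can be illustrated with a minimal Python sketch. It is not the MBSS implementation: the function name, the dictionary layout, and the explicit "no outbreak" hypothesis H0 used for normalization are assumptions made for illustration, and the likelihoods are taken as given rather than computed from data streams.

```python
import math


def posterior_probabilities(log_lik_h1, prior_h1, log_lik_h0, prior_h0):
    """Combine priors and log-likelihoods with Bayes' theorem.

    log_lik_h1 and prior_h1 are dicts keyed by (region, outbreak_type).
    log_lik_h0 and prior_h0 describe an assumed "no outbreak" hypothesis.
    All priors are assumed to be strictly positive.
    Returns (posterior of H0, dict of posteriors for each key).
    """
    # Unnormalized log posterior: log Pr(D | H) + log Pr(H).
    log_joint_h1 = {
        key: log_lik_h1[key] + math.log(prior_h1[key]) for key in log_lik_h1
    }
    log_joint_h0 = log_lik_h0 + math.log(prior_h0)

    # Subtract the maximum before exponentiating (log-sum-exp) so that
    # very small likelihoods do not underflow to zero.
    peak = max(log_joint_h0, *log_joint_h1.values())
    weight_h0 = math.exp(log_joint_h0 - peak)
    weights_h1 = {key: math.exp(v - peak) for key, v in log_joint_h1.items()}

    # Pr(D) is the sum over all hypotheses, so the posteriors sum to one.
    total = weight_h0 + sum(weights_h1.values())
    posterior_h0 = weight_h0 / total
    posterior_h1 = {key: w / total for key, w in weights_h1.items()}
    return posterior_h0, posterior_h1
```

As a made-up example, take a prior of 0.9 on no outbreak and 0.05 on each of two regions for a single outbreak type, with likelihoods 0.01, 0.2, and 0.02 respectively. The joint probabilities are 0.009, 0.010, and 0.001, so the posteriors are 0.45, 0.50, and 0.05.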

Objective

We argue that the incorporation of machine learning algorithms is a natural next step in the evolution and improvement of disease surveillance systems. We consider how learning can be incorporated into one recently proposed multivariate detection method, and demonstrate that learning can enable systems to substantially improve detection performance over time.
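To make the idea of improving over time concrete, the sketch below shows one simple way a prior over outbreak regions could be updated as labeled outbreaks arrive. The abstract does not specify the learning procedure, so everything here is an assumption: the class name `RegionPriorLearner` is hypothetical, and the smoothed frequency estimate (a pseudocount added to each region) is a generic technique chosen for illustration, not necessarily the authors' method.

```python
from collections import Counter


class RegionPriorLearner:
    """Hypothetical helper: smoothed estimate of Pr(region | outbreak).

    One instance would be kept per outbreak type. The pseudocount keeps
    every region's probability positive before any labels are seen.
    """

    def __init__(self, regions, pseudocount=1.0):
        self.regions = list(regions)
        self.pseudocount = pseudocount
        self.counts = Counter()

    def update(self, confirmed_region):
        """Record one labeled (for example, user-confirmed) outbreak region."""
        if confirmed_region not in self.regions:
            raise ValueError(f"unknown region: {confirmed_region!r}")
        self.counts[confirmed_region] += 1

    def prior(self):
        """Return the current distribution over regions (sums to one)."""
        total = sum(self.counts.values()) + self.pseudocount * len(self.regions)
        return {
            region: (self.counts[region] + self.pseudocount) / total
            for region in self.regions
        }
```

With three regions and no feedback, this prior is uniform (1/3 each). After two confirmed outbreaks in region A it becomes 3/5 for A and 1/5 for each of the others. Multiplying this conditional distribution by an overall probability that an outbreak of the given type occurs would give a region prior of the form used in the posterior computation.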

Submitted by elamb on