Disease outbreak detection based on traditional surveillance datasets, such as disease cases reported by hospitals, is potentially limited because collecting clinical information is costly and time-consuming. Social media, by contrast, provides a vast amount of data that is available in real time on the internet at almost no cost. Our solution, NPHGS, is a nonparametric statistical approach to outbreak detection that addresses the key technical challenges posed by social media streams.
Objective
We present a new method for disease outbreak detection, the 'Non-Parametric Heterogeneous Graph Scan (NPHGS)'. NPHGS enables fast and accurate detection of emerging space-time clusters using Twitter and other social media streams, where standard parametric model assumptions do not hold.