Using Twitter to Detect and Investigate Disease Outbreaks

Social media is of considerable interest as a sensor into the thoughts, interests and health of a population. We consider three types of health events that an analyst may wish to be made aware of:

- Given a known disease, such as MERS, SARS, or measles, an event corresponds to individuals contracting the disease.

- Given a set of symptoms (fever, stomach pain, etc.), an event is an unusual number of individuals complaining of the symptoms.

- Most generally, an event is an unusually large group of individuals who can be identified as being affected by some personal illness.

Note that to detect an “unusual number” of something, we need to count the indicators of the event and compare the current count with past counts. Further, we are generally interested in geographically constrained events, so for this work we focus on county-based counts. We count the number of items (tweets or individuals) expressing the event indicator: a disease name, a symptom, or a “personal health related” label assigned by our classifier. Our approach to detecting health related events is: filter -> classify -> detect. We first filter out tweets that contain no “health related” terms, then apply a classifier to each remaining tweet. This classifier is designed to flag a tweet as being about “personal health” or not. We then aggregate the positive instances per day at the county level and flag as an event any county/day pair whose count is unusually high compared to the recent past.
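The filter -> classify -> detect steps can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration and is not taken from this work: the term list `HEALTH_TERMS`, the first-person rule standing in for the trained classifier, the 7-day trailing window, and the z-score threshold of 3 are all hypothetical choices. The text above only says that a county/day count is compared with the recent past; a trailing-window z-score is one simple way to do that.

```python
import re
from collections import defaultdict
from statistics import mean, stdev

# Hypothetical keyword list; the real filter vocabulary is not given in the text.
HEALTH_TERMS = {"fever", "flu", "sick", "cough", "measles", "nausea"}
# Hypothetical stand-in for the classifier: first-person wording.
FIRST_PERSON = {"i", "i'm", "my", "me"}


def tokens(text):
    """Lowercase a tweet and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def has_health_term(text):
    """Filter step: keep only tweets that mention a health related term."""
    return any(t in HEALTH_TERMS for t in tokens(text))


def is_personal_health(text):
    """Classify step (placeholder rule, not the trained classifier)."""
    return any(t in FIRST_PERSON for t in tokens(text))


def daily_counts(tweets):
    """Aggregate positive tweets per county and day.

    `tweets` is an iterable of (county, day, text) tuples.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for county, day, text in tweets:
        if has_health_term(text) and is_personal_health(text):
            counts[county][day] += 1
    return counts


def detect_events(counts, days, window=7, threshold=3.0):
    """Detect step: flag county/day pairs far above the trailing window.

    `days` is the ordered list of days to scan; missing days count as zero.
    """
    events = []
    for county, per_day in counts.items():
        series = [per_day.get(d, 0) for d in days]
        for i in range(window, len(series)):
            history = series[i - window:i]
            mu = mean(history)
            # Floor the spread at 1 so a flat history does not divide by zero.
            sigma = max(stdev(history), 1.0)
            if (series[i] - mu) / sigma > threshold:
                events.append((county, days[i]))
    return events
```

Calling `detect_events(daily_counts(tweets), days)` returns a list of `(county, day)` pairs. Flooring the standard deviation keeps low-volume counties from triggering on a single extra tweet, at the cost of reduced sensitivity where counts are small.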

Objective

In this work we investigate the extent to which social media, in particular Twitter, can be used to detect an outbreak of a disease or illness. We term these outbreaks “events”, and we will describe methodologies for detecting events.

Submitted by teresa.hamby@d… on Fri, 12/29/2017 - 12:54