Fast subset scan for multivariate spatial biosurveillance

Description

The spatial scan statistic detects significant spatial clusters of disease by maximizing a likelihood ratio statistic over a large set of spatial regions. Several recent approaches have extended spatial scan to multiple data streams. Burkom aggregates actual and expected counts across streams and applies the univariate scan statistic, thus assuming a constant risk for the affected streams. Kulldorff et al. separately apply the univariate statistic to each stream and then aggregate scores across streams, thus assuming independent risks for each affected stream. Neill proposes a ‘fast subset scan’ approach, which maximizes the scan statistic over proximity-constrained subsets of locations, improving the timeliness of detection for irregularly shaped clusters. In the univariate event detection setting, many commonly used scan statistics satisfy the ‘linear-time subset scanning’ (LTSS) property, enabling exact and efficient detection of the highest-scoring space-time clusters.
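To make the LTSS property concrete, here is a minimal sketch for the univariate case. It assumes the expectation-based Poisson statistic, F(S) = C log(C/B) + B - C when the aggregate count C exceeds the aggregate baseline B and 0 otherwise, with the priority of each location taken as its count-to-baseline ratio. The function names `ebp_score` and `ltss_scan` and the toy data are illustrative assumptions, not taken from the work described here. Under LTSS, the highest-scoring subset is always one of the N "top-k" subsets in priority order, so only N subsets need to be scored rather than 2^N.

```python
import math


def ebp_score(c, b):
    """Expectation-based Poisson log-likelihood ratio for aggregate count c and baseline b."""
    if b <= 0 or c <= b:
        return 0.0
    return c * math.log(c / b) + b - c


def ltss_scan(counts, baselines):
    """Return (best score, sorted indices of the best subset) over all subsets of locations.

    Assumes every baseline is positive. Locations are sorted by the priority
    counts[i] / baselines[i]; only the top-k subsets (k = 1..N) are scored.
    """
    order = sorted(range(len(counts)),
                   key=lambda i: counts[i] / baselines[i],
                   reverse=True)
    best_score, best_subset = 0.0, []
    c_sum = b_sum = 0.0
    for k, i in enumerate(order, start=1):
        c_sum += counts[i]
        b_sum += baselines[i]
        score = ebp_score(c_sum, b_sum)
        if score > best_score:
            best_score, best_subset = score, sorted(order[:k])
    return best_score, best_subset


# Toy data (hypothetical): four locations with equal baselines.
score, subset = ltss_scan([30, 10, 25, 8], [10, 10, 10, 10])
```

With this toy input the two locations with the highest ratios form the best subset. The proximity-constrained variant mentioned above would apply the same top-k search within each local neighborhood instead of over all locations at once.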

Objective

We extend the recently proposed ‘fast subset scan’ framework from univariate to multivariate data, enabling computationally efficient detection of irregular space-time clusters even when the numbers of spatial locations and data streams are large. These fast algorithms enable us to perform a detailed empirical comparison of two variants of the multivariate spatial scan statistic, demonstrating the tradeoffs between detection power and characterization accuracy.
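The difference between the two multivariate approaches described earlier (aggregating counts across streams versus aggregating per-stream scores) can be sketched for a single fixed region. This is an illustration only: it assumes the expectation-based Poisson statistic as the underlying univariate score, and the function names and numbers are hypothetical rather than drawn from the study.

```python
import math


def ebp_score(c, b):
    """Expectation-based Poisson log-likelihood ratio for aggregate count c and baseline b."""
    if b <= 0 or c <= b:
        return 0.0
    return c * math.log(c / b) + b - c


def aggregated_counts_score(stream_counts, stream_baselines):
    """Burkom-style: pool counts and baselines across streams, then score once.

    This corresponds to assuming one common relative risk for all affected streams.
    """
    return ebp_score(sum(stream_counts), sum(stream_baselines))


def aggregated_scores_score(stream_counts, stream_baselines):
    """Kulldorff-style: score each stream on its own, then sum the scores.

    This corresponds to assuming an independent relative risk for each stream.
    """
    return sum(ebp_score(c, b) for c, b in zip(stream_counts, stream_baselines))


# Hypothetical per-stream totals inside one candidate region.
unequal_risk = ([30, 12], [20, 10])   # ratios 1.5 and 1.2
equal_risk = ([30, 15], [20, 10])     # ratio 1.5 in both streams

pooled = aggregated_counts_score(*unequal_risk)
summed = aggregated_scores_score(*unequal_risk)
```

When every stream has the same count-to-baseline ratio the two scores coincide; when the ratios differ, the summed per-stream score is the larger of the two, because it fits a separate risk to each stream.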

Submitted by teresa.hamby@d… on