With the increase in GPS-enabled devices, pin-point spatial data is an obvious future growth area for cluster detection research. The frequentist Bernoulli spatial scan statistic (FBSSS) handles binary-labelled point data but requires Monte Carlo testing to obtain inference [1]. In the Bayesian Poisson spatial scan statistic [2], Monte Carlo testing is replaced by the use of historic data, which speeds up processing many times over. Following [2], [3] derived the Bayesian Bernoulli spatial scan statistic (BBSSS), replacing historic data with expert knowledge on cluster relative risk. This paper compares the spatial accuracy of BBSSS and FBSSS using a new measure [4] which, being independent of inference level, permits direct comparison between Bayesian and frequentist methods. The objective is to compare the spatial accuracy of the BBSSS and the FBSSS using benchmark trials.
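For readers unfamiliar with the frequentist side of this comparison, the following is a minimal sketch of the log likelihood ratio usually associated with a Bernoulli spatial scan statistic, evaluated for a single candidate zone. The abstract gives no formulas, so the function name `bernoulli_llr`, its arguments, and the choice to score only zones with an elevated case proportion are assumptions made for illustration, not the authors' implementation.

```python
import math


def _xlogy(x, y):
    """Return x * log(y), using the convention that 0 * log(0) equals 0."""
    return 0.0 if x == 0 else x * math.log(y)


def bernoulli_llr(c, n, C, N):
    """Log likelihood ratio for one candidate zone under a Bernoulli model.

    c: cases inside the zone
    n: all labelled points (cases plus controls) inside the zone
    C: total cases in the study region
    N: total labelled points in the study region
    """
    if n == 0 or n == N:
        return 0.0  # an empty zone, or the whole region, carries no contrast
    p_in = c / n
    p_out = (C - c) / (N - n)
    if p_in <= p_out:
        return 0.0  # only zones with an elevated case proportion are scored
    p_all = C / N
    alternative = (
        _xlogy(c, p_in)
        + _xlogy(n - c, 1 - p_in)
        + _xlogy(C - c, p_out)
        + _xlogy(N - n - (C - c), 1 - p_out)
    )
    null = _xlogy(C, p_all) + _xlogy(N - C, 1 - p_all)
    return alternative - null
```

In a frequentist scan, the maximum of this score over all candidate zones would then be compared against maxima from Monte Carlo replicates, which is the step the Bayesian variants described above aim to avoid.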
Statistical Methods
Presented December 13, 2018.
For public health surveillance, is machine learning worth the effort? What methods are relevant? Do you need special hardware? This talk was motivated by these and other questions asked by ISDS members. It will focus on providing practical—and slightly opinionated—advice about how to determine whether machine learning could be a useful tool for your problem.
There is limited closed-form statistical theory to indicate how well the prospective space-time permutation scan statistic will perform in the detection of localized excess illness activity. Instead, detection methods can be applied to simulated data to gain insight about detection performance. Such results are dependent on the way outbreaks are simulated and the nature of the background data. As an alternative, we explore an empirical approach in which the membership of a large health plan is used to represent a community and detection performance is assessed in samples from the larger group.
Objective
Our goal was to assess the impact of sentinel sample size and criteria for a signal on performance of daily prospective space-time permutation detection by comparing results in varying size random samples from a large health plan to results found in the full membership.
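As background on the statistic being evaluated, the sketch below shows how expected counts are commonly derived for a space-time permutation scan: each zone-day cell receives the product of its zone total and day total divided by the grand total, so no population-at-risk data are needed. The function name `expected_counts` and the nested-list layout are assumptions for illustration; the abstract does not describe the implementation used in the study.

```python
def expected_counts(counts):
    """Expected cases per zone and day, conditional on the observed margins.

    counts[z][d] is the observed number of cases in zone z on day d.
    Returns mu with mu[z][d] = (zone z total) * (day d total) / (grand total).
    """
    zone_totals = [sum(row) for row in counts]
    day_totals = [sum(col) for col in zip(*counts)]
    grand_total = sum(zone_totals)
    if grand_total == 0:
        raise ValueError("no cases observed; expected counts are undefined")
    return [
        [zone_total * day_total / grand_total for day_total in day_totals]
        for zone_total in zone_totals
    ]
```

Because the expectation depends only on the case counts themselves, drawing a smaller random sample of members changes both the observed and the expected values, which is one reason sample size can affect detection performance.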
Expectation-based scan statistics extend the traditional spatial scan statistic approach by using historical data to infer the expected counts for each spatial location, then detecting regions with higher than expected counts. Here we consider five recently proposed expectation-based statistics: the expectation-based Poisson (EBP), expectation-based Gaussian (EBG), population-based Poisson (PBP), population-based Gaussian (PBG), and robust Bernoulli-Poisson (RBP) methods. We also consider five different time series analysis methods used to predict the expected counts (including the Holt-Winters method and moving averages optionally adjusted for day of week and seasonality), giving a total of 25 methods to compare. All of these methods are detailed in the full paper.
Objective
We present a systematic empirical comparison of five recently proposed expectation-based scan statistics, in order to determine which methods are most successful for which spatial disease surveillance tasks.
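To make the expectation-based idea concrete, here is a minimal sketch that pairs the simplest baseline mentioned above (an unadjusted moving average) with the usual form of the expectation-based Poisson score, C log(C/B) + B - C when the aggregate count C exceeds the aggregate baseline B. The function names, the 28-day window, and the dictionary layout are assumptions for illustration; the full paper should be consulted for the exact definitions compared in the study.

```python
import math


def moving_average_baseline(history, window=28):
    """Expected count for today: mean of the most recent `window` daily counts."""
    recent = history[-window:]
    if not recent:
        raise ValueError("history must contain at least one count")
    return sum(recent) / len(recent)


def ebp_score(count, baseline):
    """Expectation-based Poisson score for aggregate count C and baseline B.

    Returns C * log(C / B) + B - C when C > B, and 0 otherwise.
    """
    if baseline <= 0 or count <= baseline:
        return 0.0
    return count * math.log(count / baseline) + baseline - count


def region_score(counts, baselines, region):
    """Score a region by summing counts and baselines over its locations."""
    total_count = sum(counts[loc] for loc in region)
    total_baseline = sum(baselines[loc] for loc in region)
    return ebp_score(total_count, total_baseline)
```

A scan would evaluate `region_score` over every candidate region and report the highest-scoring one; swapping in a different baseline estimator or score function is what yields the grid of method combinations described above.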
Seasonal influenza accounts for a high proportion of outpatient morbidity during the winter months. However, influenza case counts are greatly underestimated due to frequently undiagnosed influenza. Electronic medical record (EMR) systems provide a very large, complex data source for influenza surveillance at both the patient and population level. It is important to identify influenza patients for specimen collection, respiratory isolation for school age children, prescription of an appropriate influenza drug, or to identify patients at risk for complications. At a population level, public health agencies monitor the tempo and spread of influenza season for resource management, as well as maintain situational awareness for avian influenza.
Objective
The objective of this work was to evaluate the utility of classification tree methods for syndromic surveillance case definition development using an EMR system as a data source.
Modern surveillance systems use statistical process control (SPC) charts such as Cumulative Sum and Exponentially Weighted Moving Average charts for monitoring daily counts of such quantities as ICD-9 codes from ED visits, sales of medications, and doctors’ office visits. The working assumption is that such pre-clinical data contain an early signature of disease outbreaks, manifested as an increase in the count levels. However, the direct application of SPC charts to the raw counts leads to unreliable performance. A popular statistical solution is to precondition the data before applying the charts by modeling or removing explainable patterns from the data and then monitoring the residuals. Although the general idea is common practice, the specifics of how to identify the existing explainable components and how to account for them are domain-specific. Therefore, we seek to present a set of modeling and data-driven tools that are useful for syndromic data.
Objective
SPC charts are widely used in disease surveillance. The charts are very effective when monitored data meet the requirements of temporal independence, stationarity, and normality. However, when these assumptions are violated, the SPC charts will either fail to detect special cause variations or will alert frequently even in the absence of anomalies. Currently collected biosurveillance data contain predictable factors such as day-of-week effects, seasonal effects, holidays, autocorrelation, and global trends that cause the data to violate these assumptions. This work (1) describes a set of tools for identifying such explainable patterns and (2) examines several data preconditioning methods that account for these factors, yielding data better suited for monitoring by traditional SPC charts.
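The two charts named above can be sketched in a few lines. This is a minimal illustration assuming the input has already been standardized (roughly zero mean and unit variance when no outbreak is present); the parameter values `k`, `h`, and `lam`, and the choice to reset the CUSUM after an alert, are assumptions rather than settings taken from the work described.

```python
def cusum_alerts(z, k=0.5, h=4.0):
    """One-sided upper CUSUM on standardized values.

    Accumulates S_t = max(0, S_{t-1} + z_t - k) and records an alert
    whenever S_t exceeds the threshold h.
    """
    s = 0.0
    alerts = []
    for t, value in enumerate(z):
        s = max(0.0, s + value - k)
        if s > h:
            alerts.append(t)
            s = 0.0  # reset after an alert (a design choice, not a requirement)
    return alerts


def ewma(values, lam=0.3, start=0.0):
    """Exponentially weighted moving average: E_t = lam * x_t + (1 - lam) * E_{t-1}."""
    smoothed = []
    level = start
    for value in values:
        level = lam * value + (1 - lam) * level
        smoothed.append(level)
    return smoothed
```

Both charts assume independent, stationary input, so feeding them raw daily counts with weekly or seasonal structure would produce the missed detections and false alerts described above.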
Modern biosurveillance relies on multiple sources of both prediagnostic and diagnostic data, updated daily, to discover disease outbreaks. Intrinsic to this effort are two assumptions: (1) the data being analyzed contain early indicators of a disease outbreak and (2) the outbreaks to be detected are not known a priori. However, in addition to outbreak indicators, syndromic data streams include such factors as day-of-week effects, seasonal effects, autocorrelation, and global trends. These explainable factors obscure unexplained outbreak events, and their presence in the data violates standard control-chart assumptions. Monitoring tools such as Shewhart, cumulative sum, and exponentially weighted moving average control charts will alert based largely on these explainable factors instead of on outbreaks. The goal of this paper is 2-fold: first, to describe a set of tools for identifying explainable patterns such as temporal dependence and, second, to survey and examine several data preconditioning methods that significantly reduce these explainable factors, yielding data better suited for monitoring using the popular control charts.
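As one hedged example of the two steps described here, the sketch below uses the lag-1 autocorrelation as a simple diagnostic for temporal dependence and removes a day-of-week effect by subtracting the mean of each weekday slot. The function names and the decision to estimate the weekday means from the whole series are assumptions for illustration; the paper surveys several preconditioning methods, and this is not claimed to be one of them exactly as implemented there.

```python
def lag1_autocorrelation(x):
    """Sample lag-1 autocorrelation; values far from 0 suggest temporal dependence."""
    n = len(x)
    mean = sum(x) / n
    denominator = sum((v - mean) ** 2 for v in x)
    if denominator == 0:
        return 0.0  # a constant series has no variation to correlate
    numerator = sum((x[t] - mean) * (x[t - 1] - mean) for t in range(1, n))
    return numerator / denominator


def day_of_week_residuals(counts, period=7):
    """Subtract from each count the mean of all counts in the same weekday slot."""
    sums = [0.0] * period
    sizes = [0] * period
    for t, count in enumerate(counts):
        sums[t % period] += count
        sizes[t % period] += 1
    means = [s / n if n else 0.0 for s, n in zip(sums, sizes)]
    return [count - means[t % period] for t, count in enumerate(counts)]
```

Estimating the weekday means from the full series is only suitable for retrospective analysis; a prospective system would estimate them from past days alone before passing the residuals to a control chart.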
Geographic visualization methods allow analysts to visually discover clusters in multivariate, spatially-referenced data. Computational and statistical cluster detection techniques can automatically detect spatial clusters of high values of a variable of interest. The authors propose that the two approaches can be complementary and present an integration of the GeoViz Toolkit and Proclude software suites as proof-of-concept.
Early warning systems cannot always rely on geographical proximity for modeling the spread of contagious diseases. In some situations, graph structures such as airline routes or social networks are more appropriate. Nodes, associated with cities, are linked by edges, which represent routes between cities. Scan statistics are highly successful for the evaluation of clusters in maps based on geographical proximity. The more flexible neighborhood structure of graphs makes the direct use of scan statistics difficult because of the highly irregular structures involved. In addition, the traffic intensity between connected nodes plays a significant role that is not usually represented in scan statistic based models.
Objective
We describe a model for cluster detection and inference on networks based on the scan statistic. Our aim is to detect as early as possible the appearance of an emerging cluster of syndromes due to a real outbreak (signal) amidst unrelated syndromes (noise).
Spatial scan finds the most anomalous region, that is, the one that has shown an increase in observed counts when compared to the expected baseline. As there can be infinitely many regions to search, most state-of-the-art algorithms assume a specific shape for the attack region (circles for Kulldorff and rectangles for Ultra-Fast Spatial Scan Statistics). This assumption might reduce detection power, as real-world attacks don't follow standard geometric shapes.
Objective
We propose a discriminative random field approach for detecting a disease outbreak. Given observed data on a spatial grid, the goal is to label each node as either under attack or not under attack.