Binned Kernels for Anomaly Detection in Multi-timescale Data using Gaussian Processes

Matthew Adelsberg, Christian Schwantes
Proceedings of the KDD 2017: Workshop on Anomaly Detection in Finance, PMLR 71:102-113, 2018.

Abstract

Financial services and technology companies invest significantly in monitoring their complex technology infrastructures to allow for quick responses to technology failures. Because of the volume and velocity of signals monitored (e.g., customer transaction volume, API calls, server CPU utilization, etc.), they require sophisticated models of normal system behavior to determine when a component falls into an anomalous state. Gaussian processes (GPs) are flexible, Bayesian nonparametric models that have successfully been used for time series forecasting, interpolation, and anomaly detection in complex data sets. Despite the growing use of GPs for time series analysis in the literature, these methods scale poorly with the size of the data. In particular, data sets containing multiple timescales can pose a problem for GPs, as they can require a large number of points for training. We describe a novel method for including long and short timescale information without including an impractical number of data points through the use of a binned process, defined as the definite integral over a latent Gaussian process. This results in a binned covariance function for the time series, which we use to fit and forecast data at multiple resolutions. The resulting models achieve higher accuracy with fewer data points than their non-binned counterparts, and are more robust to long tailed noise, heteroskedasticity, and data artifacts.
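The binned covariance described above can be illustrated with a short numerical sketch. If each binned observation is the definite integral of a latent GP over its bin, the covariance between two bins is the double integral of the latent kernel over those bins. The code below is an illustration under stated assumptions, not the authors' implementation: the squared-exponential kernel, the midpoint quadrature, and the names `rbf_kernel` and `binned_covariance` are all hypothetical choices, and the paper may instead use closed-form expressions or normalize by bin width.

```python
import numpy as np

def rbf_kernel(t1, t2, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel k(t, t') on broadcastable inputs (assumed latent kernel)."""
    return variance * np.exp(-0.5 * ((t1 - t2) / lengthscale) ** 2)

def binned_covariance(bins_a, bins_b, kernel=rbf_kernel, n_quad=64):
    """Approximate Cov(F_i, F_j), where F_i is the integral of the latent GP over bin i.

    bins_a, bins_b: array-likes of shape (n, 2) and (m, 2) holding [start, end].
    A midpoint rule with n_quad nodes per bin stands in for the double integral.
    """
    bins_a = np.asarray(bins_a, dtype=float)
    bins_b = np.asarray(bins_b, dtype=float)
    # Midpoint-rule nodes on [0, 1], mapped onto each bin.
    u = (np.arange(n_quad) + 0.5) / n_quad
    width_a = bins_a[:, 1] - bins_a[:, 0]
    width_b = bins_b[:, 1] - bins_b[:, 0]
    nodes_a = bins_a[:, :1] + width_a[:, None] * u  # shape (n, q)
    nodes_b = bins_b[:, :1] + width_b[:, None] * u  # shape (m, q)
    # Kernel evaluated on every pair of nodes: shape (n, q, m, q).
    k = kernel(nodes_a[:, :, None, None], nodes_b[None, None, :, :])
    # Mean over nodes times both bin widths approximates the double integral.
    return k.mean(axis=(1, 3)) * width_a[:, None] * width_b[None, :]

# Two coarse (24-unit) bins for long-timescale history, two fine (1-unit) bins for recent data.
bins = np.array([[0.0, 24.0], [24.0, 48.0], [48.0, 49.0], [49.0, 50.0]])
K = binned_covariance(bins, bins)
```

Here four bins summarize fifty time units of data, so the covariance matrix is 4 x 4 rather than 50 x 50; this is the sense in which binning can carry long-timescale information with few training points.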

Cite this Paper


BibTeX
@InProceedings{pmlr-v71-adelsberg18a,
  title = {Binned Kernels for Anomaly Detection in Multi-timescale Data using Gaussian Processes},
  author = {Adelsberg, Matthew and Schwantes, Christian},
  booktitle = {Proceedings of the KDD 2017: Workshop on Anomaly Detection in Finance},
  pages = {102--113},
  year = {2018},
  editor = {Anandakrishnan, Archana and Kumar, Senthil and Statnikov, Alexander and Faruquie, Tanveer and Xu, Di},
  volume = {71},
  series = {Proceedings of Machine Learning Research},
  month = {14 Aug},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v71/adelsberg18a/adelsberg18a.pdf},
  url = {https://proceedings.mlr.press/v71/adelsberg18a.html},
  abstract = {Financial services and technology companies invest significantly in monitoring their complex technology infrastructures to allow for quick responses to technology failures. Because of the volume and velocity of signals monitored (e.g., customer transaction volume, API calls, server CPU utilization, etc.), they require sophisticated models of normal system behavior to determine when a component falls into an anomalous state. Gaussian processes (GPs) are flexible, Bayesian nonparametric models that have successfully been used for time series forecasting, interpolation, and anomaly detection in complex data sets. Despite the growing use of GPs for time series analysis in the literature, these methods scale poorly with the size of the data. In particular, data sets containing multiple timescales can pose a problem for GPs, as they can require a large number of points for training. We describe a novel method for including long and short timescale information without including an impractical number of data points through the use of a binned process, defined as the definite integral over a latent Gaussian process. This results in a binned covariance function for the time series, which we use to fit and forecast data at multiple resolutions. The resulting models achieve higher accuracy with fewer data points than their non-binned counterparts, and are more robust to long tailed noise, heteroskedasticity, and data artifacts.}
}
Endnote
%0 Conference Paper
%T Binned Kernels for Anomaly Detection in Multi-timescale Data using Gaussian Processes
%A Matthew Adelsberg
%A Christian Schwantes
%B Proceedings of the KDD 2017: Workshop on Anomaly Detection in Finance
%C Proceedings of Machine Learning Research
%D 2018
%E Archana Anandakrishnan
%E Senthil Kumar
%E Alexander Statnikov
%E Tanveer Faruquie
%E Di Xu
%F pmlr-v71-adelsberg18a
%I PMLR
%P 102--113
%U https://proceedings.mlr.press/v71/adelsberg18a.html
%V 71
%X Financial services and technology companies invest significantly in monitoring their complex technology infrastructures to allow for quick responses to technology failures. Because of the volume and velocity of signals monitored (e.g., customer transaction volume, API calls, server CPU utilization, etc.), they require sophisticated models of normal system behavior to determine when a component falls into an anomalous state. Gaussian processes (GPs) are flexible, Bayesian nonparametric models that have successfully been used for time series forecasting, interpolation, and anomaly detection in complex data sets. Despite the growing use of GPs for time series analysis in the literature, these methods scale poorly with the size of the data. In particular, data sets containing multiple timescales can pose a problem for GPs, as they can require a large number of points for training. We describe a novel method for including long and short timescale information without including an impractical number of data points through the use of a binned process, defined as the definite integral over a latent Gaussian process. This results in a binned covariance function for the time series, which we use to fit and forecast data at multiple resolutions. The resulting models achieve higher accuracy with fewer data points than their non-binned counterparts, and are more robust to long tailed noise, heteroskedasticity, and data artifacts.
APA
Adelsberg, M. & Schwantes, C. (2018). Binned Kernels for Anomaly Detection in Multi-timescale Data using Gaussian Processes. Proceedings of the KDD 2017: Workshop on Anomaly Detection in Finance, in Proceedings of Machine Learning Research 71:102-113. Available from https://proceedings.mlr.press/v71/adelsberg18a.html.

Related Material