FITNESS: (Fine Tune on New and Similar Samples) to detect anomalies in streams with drift and outliers

Abishek Sankararaman, Balakrishnan Narayanaswamy, Vikramank Y Singh, Zhao Song
Proceedings of the 39th International Conference on Machine Learning, PMLR 162:19153-19177, 2022.

Abstract

Technology improvements have made it easier than ever to collect diverse telemetry at high resolution from any cyber or physical system, for both monitoring and control. In the domain of monitoring, anomaly detection has become an important problem in many research areas ranging from IoT and sensor networks to DevOps. These systems operate in real, noisy and non-stationary environments. A fundamental question is then, ‘How to quickly spot anomalies in a data-stream, and differentiate them from either sudden or gradual drifts in the normal behaviour?’ Although several heuristics have been proposed for detecting anomalies on streams, no known method has formalized the desiderata and rigorously proven that they can be achieved. We begin by formalizing the problem as a sequential estimation task. We propose FITNESS (Fine Tune on New and Similar Samples), a flexible framework for detecting anomalies on data streams. We show that when the data stream has a Gaussian distribution, FITNESS is provably both robust and adaptive. The core of our method is to fine-tune the anomaly detection system only on recent, similar examples, before predicting an anomaly score. We prove that this is sufficient for robustness and adaptivity. We further experimentally demonstrate that FITNESS is flexible in practice, i.e., it can convert existing offline AD algorithms into robust and adaptive online ones.

Cite this Paper


BibTeX
@InProceedings{pmlr-v162-sankararaman22a,
  title = {{FITNESS}: ({F}ine Tune on New and Similar Samples) to detect anomalies in streams with drift and outliers},
  author = {Sankararaman, Abishek and Narayanaswamy, Balakrishnan and Singh, Vikramank Y and Song, Zhao},
  booktitle = {Proceedings of the 39th International Conference on Machine Learning},
  pages = {19153--19177},
  year = {2022},
  editor = {Chaudhuri, Kamalika and Jegelka, Stefanie and Song, Le and Szepesvari, Csaba and Niu, Gang and Sabato, Sivan},
  volume = {162},
  series = {Proceedings of Machine Learning Research},
  month = {17--23 Jul},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v162/sankararaman22a/sankararaman22a.pdf},
  url = {https://proceedings.mlr.press/v162/sankararaman22a.html},
  abstract = {Technology improvements have made it easier than ever to collect diverse telemetry at high resolution from any cyber or physical system, for both monitoring and control. In the domain of monitoring, anomaly detection has become an important problem in many research areas ranging from IoT and sensor networks to DevOps. These systems operate in real, noisy and non-stationary environments. A fundamental question is then, ‘How to quickly spot anomalies in a data-stream, and differentiate them from either sudden or gradual drifts in the normal behaviour?’ Although several heuristics have been proposed for detecting anomalies on streams, no known method has formalized the desiderata and rigorously proven that they can be achieved. We begin by formalizing the problem as a sequential estimation task. We propose FITNESS (Fine Tune on New and Similar Samples), a flexible framework for detecting anomalies on data streams. We show that when the data stream has a Gaussian distribution, FITNESS is provably both robust and adaptive. The core of our method is to fine-tune the anomaly detection system only on recent, similar examples, before predicting an anomaly score. We prove that this is sufficient for robustness and adaptivity. We further experimentally demonstrate that FITNESS is flexible in practice, i.e., it can convert existing offline AD algorithms into robust and adaptive online ones.}
}
Endnote
%0 Conference Paper
%T FITNESS: (Fine Tune on New and Similar Samples) to detect anomalies in streams with drift and outliers
%A Abishek Sankararaman
%A Balakrishnan Narayanaswamy
%A Vikramank Y Singh
%A Zhao Song
%B Proceedings of the 39th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2022
%E Kamalika Chaudhuri
%E Stefanie Jegelka
%E Le Song
%E Csaba Szepesvari
%E Gang Niu
%E Sivan Sabato
%F pmlr-v162-sankararaman22a
%I PMLR
%P 19153--19177
%U https://proceedings.mlr.press/v162/sankararaman22a.html
%V 162
%X Technology improvements have made it easier than ever to collect diverse telemetry at high resolution from any cyber or physical system, for both monitoring and control. In the domain of monitoring, anomaly detection has become an important problem in many research areas ranging from IoT and sensor networks to DevOps. These systems operate in real, noisy and non-stationary environments. A fundamental question is then, ‘How to quickly spot anomalies in a data-stream, and differentiate them from either sudden or gradual drifts in the normal behaviour?’ Although several heuristics have been proposed for detecting anomalies on streams, no known method has formalized the desiderata and rigorously proven that they can be achieved. We begin by formalizing the problem as a sequential estimation task. We propose FITNESS (Fine Tune on New and Similar Samples), a flexible framework for detecting anomalies on data streams. We show that when the data stream has a Gaussian distribution, FITNESS is provably both robust and adaptive. The core of our method is to fine-tune the anomaly detection system only on recent, similar examples, before predicting an anomaly score. We prove that this is sufficient for robustness and adaptivity. We further experimentally demonstrate that FITNESS is flexible in practice, i.e., it can convert existing offline AD algorithms into robust and adaptive online ones.
APA
Sankararaman, A., Narayanaswamy, B., Singh, V.Y. & Song, Z. (2022). FITNESS: (Fine Tune on New and Similar Samples) to detect anomalies in streams with drift and outliers. Proceedings of the 39th International Conference on Machine Learning, in Proceedings of Machine Learning Research 162:19153-19177. Available from https://proceedings.mlr.press/v162/sankararaman22a.html.
