Online Adaptive Anomaly Thresholding with Confidence Sequences

Sophia Huiwen Sun, Abishek Sankararaman, Balakrishnan Murali Narayanaswamy
Proceedings of the 41st International Conference on Machine Learning, PMLR 235:47105-47132, 2024.

Abstract

Selecting appropriate thresholds for anomaly detection in online, unsupervised settings is a challenging task, especially in the presence of data distribution shifts. Addressing these challenges is critical in many practical large scale systems, such as infrastructure monitoring and network intrusion detection. This paper proposes an algorithm that connects online thresholding with constructing confidence sequences achieving (1) adaptive online threshold selection robust to distribution shifts, (2) statistical guarantees on false positive and false negative rates without any distributional assumptions, and (3) improved performance when given relevant offline data to warm-start the online algorithm, while having bounded degradation if the offline data is irrelevant. We complement our theoretical results by empirical evidence that our method outperforms commonly used baselines across synthetic and real world datasets.

Cite this Paper


BibTeX
@InProceedings{pmlr-v235-sun24h,
  title = {Online Adaptive Anomaly Thresholding with Confidence Sequences},
  author = {Sun, Sophia Huiwen and Sankararaman, Abishek and Narayanaswamy, Balakrishnan Murali},
  booktitle = {Proceedings of the 41st International Conference on Machine Learning},
  pages = {47105--47132},
  year = {2024},
  editor = {Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix},
  volume = {235},
  series = {Proceedings of Machine Learning Research},
  month = {21--27 Jul},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v235/main/assets/sun24h/sun24h.pdf},
  url = {https://proceedings.mlr.press/v235/sun24h.html},
  abstract = {Selecting appropriate thresholds for anomaly detection in online, unsupervised settings is a challenging task, especially in the presence of data distribution shifts. Addressing these challenges is critical in many practical large scale systems, such as infrastructure monitoring and network intrusion detection. This paper proposes an algorithm that connects online thresholding with constructing confidence sequences achieving (1) adaptive online threshold selection robust to distribution shifts, (2) statistical guarantees on false positive and false negative rates without any distributional assumptions, and (3) improved performance when given relevant offline data to warm-start the online algorithm, while having bounded degradation if the offline data is irrelevant. We complement our theoretical results by empirical evidence that our method outperforms commonly used baselines across synthetic and real world datasets.}
}
Endnote
%0 Conference Paper
%T Online Adaptive Anomaly Thresholding with Confidence Sequences
%A Sophia Huiwen Sun
%A Abishek Sankararaman
%A Balakrishnan Murali Narayanaswamy
%B Proceedings of the 41st International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2024
%E Ruslan Salakhutdinov
%E Zico Kolter
%E Katherine Heller
%E Adrian Weller
%E Nuria Oliver
%E Jonathan Scarlett
%E Felix Berkenkamp
%F pmlr-v235-sun24h
%I PMLR
%P 47105--47132
%U https://proceedings.mlr.press/v235/sun24h.html
%V 235
%X Selecting appropriate thresholds for anomaly detection in online, unsupervised settings is a challenging task, especially in the presence of data distribution shifts. Addressing these challenges is critical in many practical large scale systems, such as infrastructure monitoring and network intrusion detection. This paper proposes an algorithm that connects online thresholding with constructing confidence sequences achieving (1) adaptive online threshold selection robust to distribution shifts, (2) statistical guarantees on false positive and false negative rates without any distributional assumptions, and (3) improved performance when given relevant offline data to warm-start the online algorithm, while having bounded degradation if the offline data is irrelevant. We complement our theoretical results by empirical evidence that our method outperforms commonly used baselines across synthetic and real world datasets.
APA
Sun, S.H., Sankararaman, A. & Narayanaswamy, B.M. (2024). Online Adaptive Anomaly Thresholding with Confidence Sequences. Proceedings of the 41st International Conference on Machine Learning, in Proceedings of Machine Learning Research 235:47105-47132. Available from https://proceedings.mlr.press/v235/sun24h.html.
