Deep Anomaly Detection under Labeling Budget Constraints

Aodong Li, Chen Qiu, Marius Kloft, Padhraic Smyth, Stephan Mandt, Maja Rudolph
Proceedings of the 40th International Conference on Machine Learning, PMLR 202:19882-19910, 2023.

Abstract

Selecting informative data points for expert feedback can significantly improve the performance of anomaly detection (AD) in various contexts, such as medical diagnostics or fraud detection. In this paper, we determine a set of theoretical conditions under which anomaly scores generalize from labeled queries to unlabeled data. Motivated by these results, we propose a data labeling strategy with optimal data coverage under labeling budget constraints. In addition, we propose a new learning framework for semi-supervised AD. Extensive experiments on image, tabular, and video data sets show that our approach results in state-of-the-art semi-supervised AD performance under labeling budget constraints.

Cite this Paper


BibTeX
@InProceedings{pmlr-v202-li23x,
  title = {Deep Anomaly Detection under Labeling Budget Constraints},
  author = {Li, Aodong and Qiu, Chen and Kloft, Marius and Smyth, Padhraic and Mandt, Stephan and Rudolph, Maja},
  booktitle = {Proceedings of the 40th International Conference on Machine Learning},
  pages = {19882--19910},
  year = {2023},
  editor = {Krause, Andreas and Brunskill, Emma and Cho, Kyunghyun and Engelhardt, Barbara and Sabato, Sivan and Scarlett, Jonathan},
  volume = {202},
  series = {Proceedings of Machine Learning Research},
  month = {23--29 Jul},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v202/li23x/li23x.pdf},
  url = {https://proceedings.mlr.press/v202/li23x.html},
  abstract = {Selecting informative data points for expert feedback can significantly improve the performance of anomaly detection (AD) in various contexts, such as medical diagnostics or fraud detection. In this paper, we determine a set of theoretical conditions under which anomaly scores generalize from labeled queries to unlabeled data. Motivated by these results, we propose a data labeling strategy with optimal data coverage under labeling budget constraints. In addition, we propose a new learning framework for semi-supervised AD. Extensive experiments on image, tabular, and video data sets show that our approach results in state-of-the-art semi-supervised AD performance under labeling budget constraints.}
}
Endnote
%0 Conference Paper
%T Deep Anomaly Detection under Labeling Budget Constraints
%A Aodong Li
%A Chen Qiu
%A Marius Kloft
%A Padhraic Smyth
%A Stephan Mandt
%A Maja Rudolph
%B Proceedings of the 40th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2023
%E Andreas Krause
%E Emma Brunskill
%E Kyunghyun Cho
%E Barbara Engelhardt
%E Sivan Sabato
%E Jonathan Scarlett
%F pmlr-v202-li23x
%I PMLR
%P 19882--19910
%U https://proceedings.mlr.press/v202/li23x.html
%V 202
%X Selecting informative data points for expert feedback can significantly improve the performance of anomaly detection (AD) in various contexts, such as medical diagnostics or fraud detection. In this paper, we determine a set of theoretical conditions under which anomaly scores generalize from labeled queries to unlabeled data. Motivated by these results, we propose a data labeling strategy with optimal data coverage under labeling budget constraints. In addition, we propose a new learning framework for semi-supervised AD. Extensive experiments on image, tabular, and video data sets show that our approach results in state-of-the-art semi-supervised AD performance under labeling budget constraints.
APA
Li, A., Qiu, C., Kloft, M., Smyth, P., Mandt, S., & Rudolph, M. (2023). Deep Anomaly Detection under Labeling Budget Constraints. Proceedings of the 40th International Conference on Machine Learning, in Proceedings of Machine Learning Research 202:19882-19910. Available from https://proceedings.mlr.press/v202/li23x.html.
