Latent Outlier Exposure for Anomaly Detection with Contaminated Data

Chen Qiu, Aodong Li, Marius Kloft, Maja Rudolph, Stephan Mandt
Proceedings of the 39th International Conference on Machine Learning, PMLR 162:18153-18167, 2022.

Abstract

Anomaly detection aims at identifying data points that show systematic deviations from the majority of data in an unlabeled dataset. A common assumption is that clean training data (free of anomalies) is available, which is often violated in practice. We propose a strategy for training an anomaly detector in the presence of unlabeled anomalies that is compatible with a broad class of models. The idea is to jointly infer binary labels for each datum (normal vs. anomalous) while updating the model parameters. Inspired by outlier exposure (Hendrycks et al., 2018), which considers synthetically created, labeled anomalies, we use a combination of two losses that share parameters: one for the normal and one for the anomalous data. We then iteratively proceed with block coordinate updates on the parameters and the most likely (latent) labels. Our experiments with several backbone models on three image datasets, 30 tabular datasets, and a video anomaly detection benchmark showed consistent and significant improvements over the baselines.
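The alternating scheme described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a hard label assignment in which, given per-sample values of the two losses, the fraction of samples corresponding to an assumed contamination ratio with the largest margin between normal-loss and anomaly-loss is flagged as anomalous, and the joint objective applies the normal loss to inliers and the anomaly loss to flagged points. The function names and the toy loss values are invented for this sketch.

```python
import numpy as np

def assign_latent_labels(loss_normal, loss_anomalous, contamination):
    """Hard assignment step: flag the `contamination` fraction of samples
    whose normal-loss exceeds their anomaly-loss by the largest margin."""
    n = len(loss_normal)
    k = int(contamination * n)  # assumed number of anomalies in the batch
    margin = loss_normal - loss_anomalous
    labels = np.zeros(n, dtype=int)
    if k > 0:
        labels[np.argsort(margin)[-k:]] = 1  # largest margins -> anomalous
    return labels

def joint_loss(loss_normal, loss_anomalous, labels):
    """Combined objective with shared parameters: normal loss on inliers
    (label 0), anomaly loss on flagged points (label 1)."""
    return np.mean((1 - labels) * loss_normal + labels * loss_anomalous)

# Toy per-sample losses from some backbone model (values are made up):
loss_n = np.array([0.1, 0.2, 5.0, 0.3, 4.0])
loss_a = np.array([3.0, 2.5, 0.2, 2.0, 0.5])
labels = assign_latent_labels(loss_n, loss_a, contamination=0.4)
objective = joint_loss(loss_n, loss_a, labels)
```

In a full training loop, one would alternate this label-assignment step with gradient updates of the shared model parameters on the combined loss, which is the block coordinate structure the abstract refers to.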

Cite this Paper


BibTeX
@InProceedings{pmlr-v162-qiu22b,
  title     = {Latent Outlier Exposure for Anomaly Detection with Contaminated Data},
  author    = {Qiu, Chen and Li, Aodong and Kloft, Marius and Rudolph, Maja and Mandt, Stephan},
  booktitle = {Proceedings of the 39th International Conference on Machine Learning},
  pages     = {18153--18167},
  year      = {2022},
  editor    = {Chaudhuri, Kamalika and Jegelka, Stefanie and Song, Le and Szepesvari, Csaba and Niu, Gang and Sabato, Sivan},
  volume    = {162},
  series    = {Proceedings of Machine Learning Research},
  month     = {17--23 Jul},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v162/qiu22b/qiu22b.pdf},
  url       = {https://proceedings.mlr.press/v162/qiu22b.html},
  abstract  = {Anomaly detection aims at identifying data points that show systematic deviations from the majority of data in an unlabeled dataset. A common assumption is that clean training data (free of anomalies) is available, which is often violated in practice. We propose a strategy for training an anomaly detector in the presence of unlabeled anomalies that is compatible with a broad class of models. The idea is to jointly infer binary labels to each datum (normal vs. anomalous) while updating the model parameters. Inspired by outlier exposure (Hendrycks et al., 2018) that considers synthetically created, labeled anomalies, we thereby use a combination of two losses that share parameters: one for the normal and one for the anomalous data. We then iteratively proceed with block coordinate updates on the parameters and the most likely (latent) labels. Our experiments with several backbone models on three image datasets, 30 tabular data sets, and a video anomaly detection benchmark showed consistent and significant improvements over the baselines.}
}
Endnote
%0 Conference Paper
%T Latent Outlier Exposure for Anomaly Detection with Contaminated Data
%A Chen Qiu
%A Aodong Li
%A Marius Kloft
%A Maja Rudolph
%A Stephan Mandt
%B Proceedings of the 39th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2022
%E Kamalika Chaudhuri
%E Stefanie Jegelka
%E Le Song
%E Csaba Szepesvari
%E Gang Niu
%E Sivan Sabato
%F pmlr-v162-qiu22b
%I PMLR
%P 18153--18167
%U https://proceedings.mlr.press/v162/qiu22b.html
%V 162
%X Anomaly detection aims at identifying data points that show systematic deviations from the majority of data in an unlabeled dataset. A common assumption is that clean training data (free of anomalies) is available, which is often violated in practice. We propose a strategy for training an anomaly detector in the presence of unlabeled anomalies that is compatible with a broad class of models. The idea is to jointly infer binary labels to each datum (normal vs. anomalous) while updating the model parameters. Inspired by outlier exposure (Hendrycks et al., 2018) that considers synthetically created, labeled anomalies, we thereby use a combination of two losses that share parameters: one for the normal and one for the anomalous data. We then iteratively proceed with block coordinate updates on the parameters and the most likely (latent) labels. Our experiments with several backbone models on three image datasets, 30 tabular data sets, and a video anomaly detection benchmark showed consistent and significant improvements over the baselines.
APA
Qiu, C., Li, A., Kloft, M., Rudolph, M. & Mandt, S. (2022). Latent Outlier Exposure for Anomaly Detection with Contaminated Data. Proceedings of the 39th International Conference on Machine Learning, in Proceedings of Machine Learning Research 162:18153-18167. Available from https://proceedings.mlr.press/v162/qiu22b.html.