In- or out-of-distribution detection via dual divergence estimation

Sahil Garg, Sanghamitra Dutta, Mina Dalirrooyfard, Anderson Schneider, Yuriy Nevmyvaka
Proceedings of the Thirty-Ninth Conference on Uncertainty in Artificial Intelligence, PMLR 216:635-646, 2023.

Abstract

Detecting out-of-distribution (OOD) samples is a problem of practical importance for the reliable use of deep neural networks (DNNs) in production settings. A corollary to this problem is the detection of in-distribution (ID) samples, which is applicable in domain adaptation, for augmenting a training set with ID samples from other datasets, and in continual learning, for replay from the past. For both ID and OOD detection, we propose a principled yet simple approach: empirically estimating the KL divergence, in its dual form, of a given test set w.r.t. a known set of ID samples, so as to quantify each test sample's individual contribution to the divergence measure and accordingly detect it as OOD or ID. Our approach is compute-efficient and enjoys strong theoretical guarantees. Using WideResnet101 and ViT-L-16 with ImageNet-1k as the ID benchmark, we evaluate the proposed OOD detector on 51 test (OOD) datasets and observe drastically and consistently lower false positive rates than all competitive methods. Moreover, we evaluate the proposed ID detector on ECG and stock-price datasets for data augmentation in domain adaptation and continual learning settings, observing higher efficacy than relevant baselines.
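The dual (Donsker-Varadhan) form of the KL divergence, KL(P||Q) = sup_f E_P[f(X)] - log E_Q[exp f(X)], naturally decomposes over samples: once a critic f is fit to (empirically) maximize the bound with P the test set and Q the ID set, each test sample's value f(x) is its individual contribution to the first expectation and can serve as an OOD score. Below is a minimal PyTorch sketch of this idea; the Critic architecture, optimizer settings, and toy Gaussian features are illustrative assumptions of ours, not the paper's implementation.

import math
import torch
import torch.nn as nn

# Hypothetical critic network; the architecture is an illustrative
# assumption, not one prescribed by the paper.
class Critic(nn.Module):
    def __init__(self, dim, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, hidden), nn.ReLU(), nn.Linear(hidden, 1))

    def forward(self, x):
        return self.net(x).squeeze(-1)

def dv_lower_bound(critic, test_x, id_x):
    # Donsker-Varadhan bound on KL(P_test || P_id):
    #   E_test[f(X)] - log E_id[exp f(X)]
    log_mean_exp_id = torch.logsumexp(critic(id_x), dim=0) - math.log(id_x.shape[0])
    return critic(test_x).mean() - log_mean_exp_id

# Toy stand-ins for DNN embeddings (e.g., penultimate-layer features).
id_x = torch.randn(1024, 32)
test_x = torch.cat([torch.randn(96, 32), torch.randn(32, 32) + 2.0])  # mixed ID/OOD

critic = Critic(dim=32)
opt = torch.optim.Adam(critic.parameters(), lr=1e-3)
for _ in range(300):  # maximize the bound, i.e. minimize its negative
    opt.zero_grad()
    loss = -dv_lower_bound(critic, test_x, id_x)
    loss.backward()
    opt.step()

# Each test sample's critic value is its individual contribution to the
# E_test[f(X)] term; larger values suggest the sample is OOD.
with torch.no_grad():
    ood_scores = critic(test_x)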

Cite this Paper


BibTeX
@InProceedings{pmlr-v216-garg23b,
  title = {In- or out-of-distribution detection via dual divergence estimation},
  author = {Garg, Sahil and Dutta, Sanghamitra and Dalirrooyfard, Mina and Schneider, Anderson and Nevmyvaka, Yuriy},
  booktitle = {Proceedings of the Thirty-Ninth Conference on Uncertainty in Artificial Intelligence},
  pages = {635--646},
  year = {2023},
  editor = {Evans, Robin J. and Shpitser, Ilya},
  volume = {216},
  series = {Proceedings of Machine Learning Research},
  month = {31 Jul--04 Aug},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v216/garg23b/garg23b.pdf},
  url = {https://proceedings.mlr.press/v216/garg23b.html},
  abstract = {Detecting out-of-distribution (OOD) samples is a problem of practical importance for the reliable use of deep neural networks (DNNs) in production settings. A corollary to this problem is the detection of in-distribution (ID) samples, which is applicable in domain adaptation, for augmenting a training set with ID samples from other datasets, and in continual learning, for replay from the past. For both ID and OOD detection, we propose a principled yet simple approach: empirically estimating the KL divergence, in its dual form, of a given test set w.r.t. a known set of ID samples, so as to quantify each test sample's individual contribution to the divergence measure and accordingly detect it as OOD or ID. Our approach is compute-efficient and enjoys strong theoretical guarantees. Using WideResnet101 and ViT-L-16 with ImageNet-1k as the ID benchmark, we evaluate the proposed OOD detector on 51 test (OOD) datasets and observe drastically and consistently lower false positive rates than all competitive methods. Moreover, we evaluate the proposed ID detector on ECG and stock-price datasets for data augmentation in domain adaptation and continual learning settings, observing higher efficacy than relevant baselines.}
}
Endnote
%0 Conference Paper
%T In- or out-of-distribution detection via dual divergence estimation
%A Sahil Garg
%A Sanghamitra Dutta
%A Mina Dalirrooyfard
%A Anderson Schneider
%A Yuriy Nevmyvaka
%B Proceedings of the Thirty-Ninth Conference on Uncertainty in Artificial Intelligence
%C Proceedings of Machine Learning Research
%D 2023
%E Robin J. Evans
%E Ilya Shpitser
%F pmlr-v216-garg23b
%I PMLR
%P 635--646
%U https://proceedings.mlr.press/v216/garg23b.html
%V 216
%X Detecting out-of-distribution (OOD) samples is a problem of practical importance for the reliable use of deep neural networks (DNNs) in production settings. A corollary to this problem is the detection of in-distribution (ID) samples, which is applicable in domain adaptation, for augmenting a training set with ID samples from other datasets, and in continual learning, for replay from the past. For both ID and OOD detection, we propose a principled yet simple approach: empirically estimating the KL divergence, in its dual form, of a given test set w.r.t. a known set of ID samples, so as to quantify each test sample's individual contribution to the divergence measure and accordingly detect it as OOD or ID. Our approach is compute-efficient and enjoys strong theoretical guarantees. Using WideResnet101 and ViT-L-16 with ImageNet-1k as the ID benchmark, we evaluate the proposed OOD detector on 51 test (OOD) datasets and observe drastically and consistently lower false positive rates than all competitive methods. Moreover, we evaluate the proposed ID detector on ECG and stock-price datasets for data augmentation in domain adaptation and continual learning settings, observing higher efficacy than relevant baselines.
APA
Garg, S., Dutta, S., Dalirrooyfard, M., Schneider, A. & Nevmyvaka, Y. (2023). In- or out-of-distribution detection via dual divergence estimation. Proceedings of the Thirty-Ninth Conference on Uncertainty in Artificial Intelligence, in Proceedings of Machine Learning Research 216:635-646. Available from https://proceedings.mlr.press/v216/garg23b.html.