Entropic Issues in Likelihood-Based OOD Detection

Anthony L. Caterini, Gabriel Loaiza-Ganem
Proceedings on "I (Still) Can't Believe It's Not Better!" at NeurIPS 2021 Workshops, PMLR 163:21-26, 2022.

Abstract

Deep generative models trained by maximum likelihood remain very popular methods for reasoning about data probabilistically. However, it has been observed that they can assign higher likelihoods to out-of-distribution (OOD) data than in-distribution data, thus calling into question the meaning of these likelihood values. In this work we provide a novel perspective on this phenomenon, decomposing the average likelihood into a KL divergence term and an entropy term. We argue that the latter can explain the curious OOD behaviour mentioned above, suppressing likelihood values on datasets with higher entropy. Although our idea is simple, we have not seen it explored yet in the literature. This analysis provides further explanation for the success of OOD detection methods based on likelihood ratios, as the problematic entropy term cancels out in expectation. Finally, we discuss how this observation relates to recent success in OOD detection with manifold-supported models, for which the above decomposition does not hold.
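
For concreteness, the decomposition described in the abstract can be sketched as follows; the notation here is assumed for illustration (q denotes the data-generating distribution being evaluated, p_theta the maximum-likelihood model, H differential entropy) and is not taken verbatim from the paper.

\[
\mathbb{E}_{x \sim q}\bigl[\log p_\theta(x)\bigr]
  \;=\; -\,\mathrm{KL}\bigl(q \,\|\, p_\theta\bigr) \;-\; \mathbb{H}(q),
\]

so a dataset with higher entropy \(\mathbb{H}(q)\) has its average log-likelihood suppressed regardless of how well \(p_\theta\) fits it. For a likelihood-ratio method that scores the same data under two models \(p_\theta\) and \(p_\phi\), the entropy term is common to both and cancels in expectation:

\[
\mathbb{E}_{x \sim q}\!\left[\log \frac{p_\theta(x)}{p_\phi(x)}\right]
  \;=\; \mathrm{KL}\bigl(q \,\|\, p_\phi\bigr) \;-\; \mathrm{KL}\bigl(q \,\|\, p_\theta\bigr),
\]

which is consistent with the abstract's remark that the problematic entropy term cancels out for likelihood-ratio approaches.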

Cite this Paper


BibTeX
@InProceedings{pmlr-v163-caterini22a,
  title     = {Entropic Issues in Likelihood-Based {OOD} Detection},
  author    = {Caterini, Anthony L. and Loaiza-Ganem, Gabriel},
  booktitle = {Proceedings on "I (Still) Can't Believe It's Not Better!" at NeurIPS 2021 Workshops},
  pages     = {21--26},
  year      = {2022},
  editor    = {Pradier, Melanie F. and Schein, Aaron and Hyland, Stephanie and Ruiz, Francisco J. R. and Forde, Jessica Z.},
  volume    = {163},
  series    = {Proceedings of Machine Learning Research},
  month     = {13 Dec},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v163/caterini22a/caterini22a.pdf},
  url       = {https://proceedings.mlr.press/v163/caterini22a.html},
  abstract  = {Deep generative models trained by maximum likelihood remain very popular methods for reasoning about data probabilistically. However, it has been observed that they can assign higher likelihoods to out-of-distribution (OOD) data than in-distribution data, thus calling into question the meaning of these likelihood values. In this work we provide a novel perspective on this phenomenon, decomposing the average likelihood into a KL divergence term and an entropy term. We argue that the latter can explain the curious OOD behaviour mentioned above, suppressing likelihood values on datasets with higher entropy. Although our idea is simple, we have not seen it explored yet in the literature. This analysis provides further explanation for the success of OOD detection methods based on likelihood ratios, as the problematic entropy term cancels out in expectation. Finally, we discuss how this observation relates to recent success in OOD detection with manifold-supported models, for which the above decomposition does not hold.}
}
Endnote
%0 Conference Paper
%T Entropic Issues in Likelihood-Based OOD Detection
%A Anthony L. Caterini
%A Gabriel Loaiza-Ganem
%B Proceedings on "I (Still) Can't Believe It's Not Better!" at NeurIPS 2021 Workshops
%C Proceedings of Machine Learning Research
%D 2022
%E Melanie F. Pradier
%E Aaron Schein
%E Stephanie Hyland
%E Francisco J. R. Ruiz
%E Jessica Z. Forde
%F pmlr-v163-caterini22a
%I PMLR
%P 21--26
%U https://proceedings.mlr.press/v163/caterini22a.html
%V 163
%X Deep generative models trained by maximum likelihood remain very popular methods for reasoning about data probabilistically. However, it has been observed that they can assign higher likelihoods to out-of-distribution (OOD) data than in-distribution data, thus calling into question the meaning of these likelihood values. In this work we provide a novel perspective on this phenomenon, decomposing the average likelihood into a KL divergence term and an entropy term. We argue that the latter can explain the curious OOD behaviour mentioned above, suppressing likelihood values on datasets with higher entropy. Although our idea is simple, we have not seen it explored yet in the literature. This analysis provides further explanation for the success of OOD detection methods based on likelihood ratios, as the problematic entropy term cancels out in expectation. Finally, we discuss how this observation relates to recent success in OOD detection with manifold-supported models, for which the above decomposition does not hold.
APA
Caterini, A. L. & Loaiza-Ganem, G. (2022). Entropic Issues in Likelihood-Based OOD Detection. Proceedings on "I (Still) Can't Believe It's Not Better!" at NeurIPS 2021 Workshops, in Proceedings of Machine Learning Research 163:21-26. Available from https://proceedings.mlr.press/v163/caterini22a.html.
