Non-Negative Bregman Divergence Minimization for Deep Direct Density Ratio Estimation

Masahiro Kato, Takeshi Teshima
Proceedings of the 38th International Conference on Machine Learning, PMLR 139:5320-5333, 2021.

Abstract

Density ratio estimation (DRE) is at the core of various machine learning tasks such as anomaly detection and domain adaptation. In the DRE literature, methods based on Bregman divergence (BD) minimization have been studied extensively. However, when BD minimization is applied with highly flexible models such as deep neural networks, it tends to suffer from what we call train-loss hacking, a source of over-fitting caused by a typical characteristic of empirical BD estimators. In this paper, to mitigate train-loss hacking, we propose a non-negative correction for empirical BD estimators. Theoretically, we confirm the soundness of the proposed method through a generalization error bound. In our experiments, the proposed methods show favorable performance in inlier-based outlier detection.
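
To make train-loss hacking and the non-negative correction concrete, the following is a minimal sketch, assuming the squared-loss (LSIF) instance of BD minimization. In that instance the empirical objective contains a term, the negative mean of r(x) over numerator samples, that a sufficiently flexible model can drive toward minus infinity; the sketch clamps that term at zero. The particular decomposition, the offset C, and the name nn_lsif_loss are illustrative assumptions, not the paper's exact corrected estimator.

    import torch

    def nn_lsif_loss(r_nu, r_de, C=1.0, non_negative=True):
        """Illustrative squared-loss (LSIF) BD objective with a non-negative clamp.

        r_nu: model outputs r(x) on samples from the numerator density p_nu.
        r_de: model outputs r(x) on samples from the denominator density p_de.
        C:    assumed offset, chosen so the clamped term stays non-negative at
              the true ratio (e.g., an upper bound on the numerator mean of r*).
        """
        # Empirical LSIF objective (up to an additive constant):
        #   0.5 * E_de[r(x)^2] - E_nu[r(x)]
        quad = 0.5 * (r_de ** 2).mean()   # bounded below by zero
        lin = C - r_nu.mean()             # a flexible model can push this arbitrarily low
        if non_negative:
            lin = torch.clamp(lin, min=0.0)  # non-negative correction against train-loss hacking
        return quad + lin

In training, r_nu and r_de would be mini-batch outputs of a neural density-ratio model on samples from the two distributions, and the returned loss would be minimized as usual.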

Cite this Paper


BibTeX
@InProceedings{pmlr-v139-kato21a,
  title     = {Non-Negative Bregman Divergence Minimization for Deep Direct Density Ratio Estimation},
  author    = {Kato, Masahiro and Teshima, Takeshi},
  booktitle = {Proceedings of the 38th International Conference on Machine Learning},
  pages     = {5320--5333},
  year      = {2021},
  editor    = {Meila, Marina and Zhang, Tong},
  volume    = {139},
  series    = {Proceedings of Machine Learning Research},
  month     = {18--24 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v139/kato21a/kato21a.pdf},
  url       = {https://proceedings.mlr.press/v139/kato21a.html}
}
APA
Kato, M. & Teshima, T. (2021). Non-Negative Bregman Divergence Minimization for Deep Direct Density Ratio Estimation. Proceedings of the 38th International Conference on Machine Learning, in Proceedings of Machine Learning Research 139:5320-5333. Available from https://proceedings.mlr.press/v139/kato21a.html.
