Robust Probabilistic Modeling with Bayesian Data Reweighting

Yixin Wang, Alp Kucukelbir, David M. Blei
Proceedings of the 34th International Conference on Machine Learning, PMLR 70:3646-3655, 2017.

Abstract

Probabilistic models analyze data by relying on a set of assumptions. Data that exhibit deviations from these assumptions can undermine inference and prediction quality. Robust models offer protection against mismatch between a model’s assumptions and reality. We propose a way to systematically detect and mitigate mismatch of a large class of probabilistic models. The idea is to raise the likelihood of each observation to a weight and then to infer both the latent variables and the weights from data. Inferring the weights allows a model to identify observations that match its assumptions and down-weight others. This enables robust inference and improves predictive accuracy. We study four different forms of mismatch with reality, ranging from missing latent groups to structure misspecification. A Poisson factorization analysis of the Movielens 1M dataset shows the benefits of this approach in a practical scenario.
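The reweighting idea in the abstract can be illustrated with a toy sketch. It is a simplified stand-in, not the authors' inference procedure. The assumptions are all ours: a Gaussian likelihood with known unit variance, a flat prior on the mean `mu`, an independent Beta(2, 2) prior on each weight, and MAP estimation by coordinate ascent instead of full posterior inference. The function names `gaussian_loglik`, `map_weight`, and `fit_reweighted_mean` are hypothetical. The objective being maximised is sum_n [ w_n * log p(y_n | mu) + log Beta(w_n; a, b) ].

```python
import math
import random


def gaussian_loglik(y, mu, sigma=1.0):
    """Log density of N(mu, sigma^2) evaluated at y."""
    return -0.5 * math.log(2 * math.pi * sigma**2) - 0.5 * ((y - mu) / sigma) ** 2


def map_weight(loglik, a=2.0, b=2.0):
    """Maximise (a-1)*log(w) + (b-1)*log(1-w) + w*loglik over w in (0, 1).

    Assumes a > 1 and b > 1. Setting the derivative to zero gives a
    quadratic in w; this returns its root in (0, 1) in a numerically
    stable form.
    """
    c = -loglik
    s = a + b - 2.0
    disc = (c + s) ** 2 - 4.0 * c * (a - 1.0)
    return 2.0 * (a - 1.0) / ((c + s) + math.sqrt(disc))


def fit_reweighted_mean(ys, a=2.0, b=2.0, iters=50):
    """Coordinate ascent on the joint MAP objective over (mu, weights)."""
    weights = [1.0] * len(ys)
    mu = 0.0
    for _ in range(iters):
        # With a flat prior on mu, the weighted log-likelihood is
        # maximised by the weighted mean of the observations.
        mu = sum(w * y for w, y in zip(weights, ys)) / sum(weights)
        # Each weight has a closed-form update given the current mu.
        weights = [map_weight(gaussian_loglik(y, mu), a, b) for y in ys]
    return mu, weights


# Synthetic data (an assumption for illustration): 90 draws that match
# the model, plus 10 observations at 10.0 that violate it.
random.seed(0)
inliers = [random.gauss(0.0, 1.0) for _ in range(90)]
outliers = [10.0] * 10
ys = inliers + outliers

mu_hat, weights = fit_reweighted_mean(ys)
naive_mean = sum(ys) / len(ys)

print("naive mean:      ", naive_mean)
print("reweighted mean: ", mu_hat)
print("mean inlier w:   ", sum(weights[:90]) / 90)
print("mean outlier w:  ", sum(weights[90:]) / 10)
```

In this sketch, observations that the model explains poorly receive small weights, so they contribute little to the estimate of `mu`. The Beta(2, 2) prior pulls weights away from zero, which prevents the degenerate solution of down-weighting every observation.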

Cite this Paper


BibTeX
@InProceedings{pmlr-v70-wang17g,
  title = {Robust Probabilistic Modeling with {B}ayesian Data Reweighting},
  author = {Yixin Wang and Alp Kucukelbir and David M. Blei},
  booktitle = {Proceedings of the 34th International Conference on Machine Learning},
  pages = {3646--3655},
  year = {2017},
  editor = {Precup, Doina and Teh, Yee Whye},
  volume = {70},
  series = {Proceedings of Machine Learning Research},
  month = {06--11 Aug},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v70/wang17g/wang17g.pdf},
  url = {https://proceedings.mlr.press/v70/wang17g.html},
  abstract = {Probabilistic models analyze data by relying on a set of assumptions. Data that exhibit deviations from these assumptions can undermine inference and prediction quality. Robust models offer protection against mismatch between a model’s assumptions and reality. We propose a way to systematically detect and mitigate mismatch of a large class of probabilistic models. The idea is to raise the likelihood of each observation to a weight and then to infer both the latent variables and the weights from data. Inferring the weights allows a model to identify observations that match its assumptions and down-weight others. This enables robust inference and improves predictive accuracy. We study four different forms of mismatch with reality, ranging from missing latent groups to structure misspecification. A Poisson factorization analysis of the Movielens 1M dataset shows the benefits of this approach in a practical scenario.}
}
Endnote
%0 Conference Paper
%T Robust Probabilistic Modeling with Bayesian Data Reweighting
%A Yixin Wang
%A Alp Kucukelbir
%A David M. Blei
%B Proceedings of the 34th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2017
%E Doina Precup
%E Yee Whye Teh
%F pmlr-v70-wang17g
%I PMLR
%P 3646--3655
%U https://proceedings.mlr.press/v70/wang17g.html
%V 70
%X Probabilistic models analyze data by relying on a set of assumptions. Data that exhibit deviations from these assumptions can undermine inference and prediction quality. Robust models offer protection against mismatch between a model’s assumptions and reality. We propose a way to systematically detect and mitigate mismatch of a large class of probabilistic models. The idea is to raise the likelihood of each observation to a weight and then to infer both the latent variables and the weights from data. Inferring the weights allows a model to identify observations that match its assumptions and down-weight others. This enables robust inference and improves predictive accuracy. We study four different forms of mismatch with reality, ranging from missing latent groups to structure misspecification. A Poisson factorization analysis of the Movielens 1M dataset shows the benefits of this approach in a practical scenario.
APA
Wang, Y., Kucukelbir, A. & Blei, D. M. (2017). Robust Probabilistic Modeling with Bayesian Data Reweighting. Proceedings of the 34th International Conference on Machine Learning, in Proceedings of Machine Learning Research 70:3646-3655. Available from https://proceedings.mlr.press/v70/wang17g.html.
