Recovering Probability Distributions from Missing Data

Jin Tian
Proceedings of the Ninth Asian Conference on Machine Learning, PMLR 77:574-589, 2017.

Abstract

A probabilistic query may not be estimable from observed data corrupted by missing values if the data are not missing at random (MAR). It is therefore of theoretical interest and practical importance to determine in principle whether a probabilistic query is estimable from missing data when the data are not MAR. We present algorithms that systematically determine whether the joint probability distribution or a target marginal distribution is estimable from observed data with missing values, assuming that the data-generation model is represented as a Bayesian network, known as an m-graph, that not only encodes the dependencies among the variables but also explicitly portrays the mechanisms responsible for the missingness process. The results significantly advance the existing work.
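To make the notion of "estimable from missing data" concrete, here is a minimal sketch of the simplest textbook case, not the algorithms of the paper. It assumes a hypothetical m-graph X -> Y, X -> R_Y, where X is fully observed, Y is partially observed, and R_Y = 1 marks a missing Y (the indicator convention and all numeric parameters are assumptions made for illustration). In that graph Y is independent of R_Y given X, so the joint can be written as P(X, Y) = P(X) P(Y | X, R_Y = 0), and every factor on the right is computable from the observed data. The sketch works with exact probabilities rather than samples and compares the recovered joint with the naive complete-case distribution P(X, Y | R_Y = 0).

```python
from itertools import product

# Hypothetical parameters for the m-graph X -> Y, X -> R_Y (illustration only).
p_x = {0: 0.6, 1: 0.4}                                  # P(X)
p_y_given_x = {0: {0: 0.7, 1: 0.3}, 1: {0: 0.2, 1: 0.8}}  # P(Y | X)
p_missing_given_x = {0: 0.1, 1: 0.5}                    # P(R_Y = 1 | X)


def full_joint():
    """Exact joint P(X, Y, R_Y) implied by the assumed m-graph."""
    joint = {}
    for x, y, r in product((0, 1), repeat=3):
        p_r = p_missing_given_x[x] if r == 1 else 1 - p_missing_given_x[x]
        joint[(x, y, r)] = p_x[x] * p_y_given_x[x][y] * p_r
    return joint


def recover_joint(joint):
    """Recover P(X, Y) as P(X) * P(Y | X, R_Y = 0).

    Only P(X) and P(X, Y, R_Y = 0) are used; both are available from
    data in which Y is hidden whenever R_Y = 1.
    """
    recovered = {}
    for x, y in product((0, 1), repeat=2):
        p_x_marginal = sum(joint[(x, yy, r)] for yy in (0, 1) for r in (0, 1))
        p_x_and_observed = sum(joint[(x, yy, 0)] for yy in (0, 1))
        recovered[(x, y)] = p_x_marginal * joint[(x, y, 0)] / p_x_and_observed
    return recovered


def complete_case(joint):
    """Naive estimate P(X, Y | R_Y = 0) that drops rows with missing Y."""
    p_observed = sum(p for (_, _, r), p in joint.items() if r == 0)
    return {(x, y): joint[(x, y, 0)] / p_observed
            for x, y in product((0, 1), repeat=2)}


if __name__ == "__main__":
    joint = full_joint()
    recovered = recover_joint(joint)
    naive = complete_case(joint)
    for x, y in product((0, 1), repeat=2):
        truth = p_x[x] * p_y_given_x[x][y]
        print((x, y), "true:", round(truth, 4),
              "recovered:", round(recovered[(x, y)], 4),
              "complete-case:", round(naive[(x, y)], 4))
```

Under these assumptions the recovered values coincide with the true joint, while the complete-case values do not, because missingness depends on X. If the edge X -> R_Y were replaced by Y -> R_Y, the factorization used above would no longer be licensed by the graph; deciding what is estimable in such non-MAR models is the question the abstract describes.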

Cite this Paper


BibTeX
@InProceedings{pmlr-v77-tian17a,
  title = {Recovering Probability Distributions from Missing Data},
  author = {Tian, Jin},
  booktitle = {Proceedings of the Ninth Asian Conference on Machine Learning},
  pages = {574--589},
  year = {2017},
  editor = {Zhang, Min-Ling and Noh, Yung-Kyun},
  volume = {77},
  series = {Proceedings of Machine Learning Research},
  address = {Yonsei University, Seoul, Republic of Korea},
  month = {15--17 Nov},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v77/tian17a/tian17a.pdf},
  url = {https://proceedings.mlr.press/v77/tian17a.html},
  abstract = {A probabilistic query may not be estimable from observed data corrupted by missing values if the data are not missing at random (MAR). It is therefore of theoretical interest and practical importance to determine in principle whether a probabilistic query is estimable from missing data or not when the data are not MAR. We present algorithms that systematically determine whether the joint probability distribution or a target marginal distribution is estimable from observed data with missing values, assuming that the data-generation model is represented as a Bayesian network, known as m-graphs, that not only encodes the dependencies among the variables but also explicitly portrays the mechanisms responsible for the missingness process. The results significantly advance the existing work.}
}
Endnote
%0 Conference Paper
%T Recovering Probability Distributions from Missing Data
%A Jin Tian
%B Proceedings of the Ninth Asian Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2017
%E Min-Ling Zhang
%E Yung-Kyun Noh
%F pmlr-v77-tian17a
%I PMLR
%P 574--589
%U https://proceedings.mlr.press/v77/tian17a.html
%V 77
%X A probabilistic query may not be estimable from observed data corrupted by missing values if the data are not missing at random (MAR). It is therefore of theoretical interest and practical importance to determine in principle whether a probabilistic query is estimable from missing data or not when the data are not MAR. We present algorithms that systematically determine whether the joint probability distribution or a target marginal distribution is estimable from observed data with missing values, assuming that the data-generation model is represented as a Bayesian network, known as m-graphs, that not only encodes the dependencies among the variables but also explicitly portrays the mechanisms responsible for the missingness process. The results significantly advance the existing work.
APA
Tian, J. (2017). Recovering Probability Distributions from Missing Data. Proceedings of the Ninth Asian Conference on Machine Learning, in Proceedings of Machine Learning Research 77:574-589. Available from https://proceedings.mlr.press/v77/tian17a.html.

Related Material