Learning from Conditional Distributions via Dual Embeddings

Bo Dai, Niao He, Yunpeng Pan, Byron Boots, Le Song
Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, PMLR 54:1458-1467, 2017.

Abstract

Many machine learning tasks, such as learning with invariance and policy evaluation in reinforcement learning, can be characterized as problems of learning from conditional distributions. In such problems, each instance x is associated with a conditional distribution $p(z|x)$ represented by samples $\{z_i\}_{i=1}^{M}$, and the goal is to learn a function f that links these conditional distributions to target values y. These problems become very challenging when only limited samples, or in the extreme case a single sample, are available from each conditional distribution. Commonly used approaches either assume that z is independent of x, or require an overwhelmingly large number of samples from each conditional distribution. To address these challenges, we propose a novel approach based on a new min-max reformulation of the learning-from-conditional-distributions problem. With this reformulation, we only need access to samples from the joint distribution $p(z,x)$. We also design an efficient learning algorithm, Embedding-SGD, and establish theoretical sample complexity guarantees for such problems. Finally, our numerical experiments, on both synthetic and real-world datasets, show that the proposed approach significantly improves over existing algorithms.
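
To make the reformulation concrete, here is a sketch of the kind of saddle-point derivation the abstract refers to, assuming a loss $\ell(y, \cdot)$ that is convex in its second argument with convex conjugate $\ell^*(y, \cdot)$; the exact function classes and regularization used in the paper may differ. The original objective

$$\min_{f \in \mathcal{F}} \; \mathbb{E}_{x,y}\Big[\ell\big(y, \mathbb{E}_{z|x}[f(z, x)]\big)\Big]$$

is hard to estimate from few samples of $p(z|x)$ because the conditional expectation sits inside the nonlinear loss. Writing $\ell(y, v) = \max_{u} \{uv - \ell^*(y, u)\}$ and interchanging the maximization with the outer expectation, with the scalar dual variable promoted to a dual function $u(x, y)$, gives the min-max form

$$\min_{f \in \mathcal{F}} \max_{u \in \mathcal{U}} \; \mathbb{E}_{x,y,z}\big[u(x, y)\, f(z, x)\big] - \mathbb{E}_{x,y}\big[\ell^*\big(y, u(x, y)\big)\big].$$

The expectation over $z$ now enters linearly, so a single triple $(x, z, y)$ drawn from the joint distribution yields unbiased stochastic gradients with respect to both $f$ and $u$, which is what makes a stochastic primal-dual method such as Embedding-SGD applicable even with one sample per conditional distribution.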

Cite this Paper


BibTeX
@InProceedings{pmlr-v54-dai17a, title = {{Learning from Conditional Distributions via Dual Embeddings}}, author = {Dai, Bo and He, Niao and Pan, Yunpeng and Boots, Byron and Song, Le}, booktitle = {Proceedings of the 20th International Conference on Artificial Intelligence and Statistics}, pages = {1458--1467}, year = {2017}, editor = {Singh, Aarti and Zhu, Jerry}, volume = {54}, series = {Proceedings of Machine Learning Research}, month = {20--22 Apr}, publisher = {PMLR}, pdf = {http://proceedings.mlr.press/v54/dai17a/dai17a.pdf}, url = {https://proceedings.mlr.press/v54/dai17a.html}, abstract = {Many machine learning tasks, such as learning with invariance and policy evaluation in reinforcement learning, can be characterized as problems of learning from conditional distributions. In such problems, each instance x is associated with a conditional distribution $p(z|x)$ represented by samples $\{z_i\}_{i=1}^{M}$, and the goal is to learn a function f that links these conditional distributions to target values y. These problems become very challenging when only limited samples, or in the extreme case a single sample, are available from each conditional distribution. Commonly used approaches either assume that z is independent of x, or require an overwhelmingly large number of samples from each conditional distribution. To address these challenges, we propose a novel approach based on a new min-max reformulation of the learning-from-conditional-distributions problem. With this reformulation, we only need access to samples from the joint distribution $p(z,x)$. We also design an efficient learning algorithm, Embedding-SGD, and establish theoretical sample complexity guarantees for such problems. Finally, our numerical experiments, on both synthetic and real-world datasets, show that the proposed approach significantly improves over existing algorithms.} }
Endnote
%0 Conference Paper %T Learning from Conditional Distributions via Dual Embeddings %A Bo Dai %A Niao He %A Yunpeng Pan %A Byron Boots %A Le Song %B Proceedings of the 20th International Conference on Artificial Intelligence and Statistics %C Proceedings of Machine Learning Research %D 2017 %E Aarti Singh %E Jerry Zhu %F pmlr-v54-dai17a %I PMLR %P 1458--1467 %U https://proceedings.mlr.press/v54/dai17a.html %V 54 %X Many machine learning tasks, such as learning with invariance and policy evaluation in reinforcement learning, can be characterized as problems of learning from conditional distributions. In such problems, each instance x is associated with a conditional distribution $p(z|x)$ represented by samples $\{z_i\}_{i=1}^{M}$, and the goal is to learn a function f that links these conditional distributions to target values y. These problems become very challenging when only limited samples, or in the extreme case a single sample, are available from each conditional distribution. Commonly used approaches either assume that z is independent of x, or require an overwhelmingly large number of samples from each conditional distribution. To address these challenges, we propose a novel approach based on a new min-max reformulation of the learning-from-conditional-distributions problem. With this reformulation, we only need access to samples from the joint distribution $p(z,x)$. We also design an efficient learning algorithm, Embedding-SGD, and establish theoretical sample complexity guarantees for such problems. Finally, our numerical experiments, on both synthetic and real-world datasets, show that the proposed approach significantly improves over existing algorithms.
APA
Dai, B., He, N., Pan, Y., Boots, B. & Song, L. (2017). Learning from Conditional Distributions via Dual Embeddings. Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 54:1458-1467. Available from https://proceedings.mlr.press/v54/dai17a.html.