Mediated Uncoupled Learning: Learning Functions without Direct Input-output Correspondences

Ikko Yamane; Junya Honda; Florian Yger; Masashi Sugiyama

Proceedings of the 38th International Conference on Machine Learning, PMLR 139:11637-11647, 2021.

Abstract

Ordinary supervised learning is useful when we have paired training data of input $X$ and output $Y$. However, such paired data can be difficult to collect in practice. In this paper, we consider the task of predicting $Y$ from $X$ when we have no paired data of them, but we have two separate, independent datasets of $X$ and $Y$ each observed with some mediating variable $U$, that is, we have two datasets $S_X = \{(X_i, U_i)\}$ and $S_Y = \{(U'_j, Y'_j)\}$. A naive approach is to predict $U$ from $X$ using $S_X$ and then $Y$ from $U$ using $S_Y$, but we show that this is not statistically consistent. Moreover, predicting $U$ can be more difficult than predicting $Y$ in practice, e.g., when $U$ has higher dimensionality. To circumvent the difficulty, we propose a new method that avoids predicting $U$ but directly learns $Y = f(X)$ by training $f(X)$ with $S_{X}$ to predict $h(U)$ which is trained with $S_{Y}$ to approximate $Y$. We prove statistical consistency and error bounds of our method and experimentally confirm its practical usefulness.
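
The contrast between the naive pipeline and the two-stage idea described in the abstract can be illustrated with a toy simulation. This is a minimal sketch, not the authors' implementation: the data-generating process ($U = X + \text{noise}$, $Y = U^2 + \text{noise}$), the quadratic model class, the sample size, and all names (`fit_line`, `h`, `f`, `naive`) are assumptions chosen purely for illustration.

```python
import random


def fit_line(z, t):
    """Ordinary least squares for t ~ slope * z + intercept (single feature)."""
    n = len(z)
    mean_z = sum(z) / n
    mean_t = sum(t) / n
    cov = sum((a - mean_z) * (b - mean_t) for a, b in zip(z, t))
    var = sum((a - mean_z) ** 2 for a in z)
    slope = cov / var
    return slope, mean_t - slope * mean_z


rng = random.Random(0)  # fixed seed so the run is reproducible
n = 5000                # hypothetical sample size for both datasets

# S_X = {(X_i, U_i)}: assumed toy model U = X + standard Gaussian noise.
xs = [rng.uniform(-2.0, 2.0) for _ in range(n)]
us = [x + rng.gauss(0.0, 1.0) for x in xs]

# S_Y = {(U'_j, Y'_j)}: drawn independently of S_X, assumed Y = U^2 + small noise.
us_prime = [rng.uniform(-2.0, 2.0) + rng.gauss(0.0, 1.0) for _ in range(n)]
ys_prime = [u * u + rng.gauss(0.0, 0.1) for u in us_prime]

# Stage 1: fit h(u) = a_h * u^2 + b_h on S_Y so that h(U) approximates Y.
a_h, b_h = fit_line([u * u for u in us_prime], ys_prime)


def h(u):
    return a_h * u * u + b_h


# Stage 2: fit f(x) = a_f * x^2 + b_f on S_X, using h(U_i) as regression targets.
a_f, b_f = fit_line([x * x for x in xs], [h(u) for u in us])


def f(x):
    return a_f * x * x + b_f


# Naive baseline: predict U from X with a linear model g, then plug g(x) into h.
a_g, b_g = fit_line(xs, us)


def naive(x):
    return h(a_g * x + b_g)


f_at_0 = f(0.0)
naive_at_0 = naive(0.0)
print("two-stage f(0):", f_at_0)
print("naive h(g(0)): ", naive_at_0)
```

Under these assumed distributions, $E[Y \mid X = 0] = E[U^2 \mid X = 0] = 1$, so the two-stage estimate at $x = 0$ should land near 1, while the naive composition lands near 0 because it discards the variance of $U$ given $X$. This is one simple way the inconsistency mentioned in the abstract can arise.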

Cite this Paper

BibTeX

@InProceedings{pmlr-v139-yamane21a,
  title = 	 {Mediated Uncoupled Learning: Learning Functions without Direct Input-output Correspondences},
  author =       {Yamane, Ikko and Honda, Junya and Yger, Florian and Sugiyama, Masashi},
  booktitle = 	 {Proceedings of the 38th International Conference on Machine Learning},
  pages = 	 {11637--11647},
  year = 	 {2021},
  editor = 	 {Meila, Marina and Zhang, Tong},
  volume = 	 {139},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {18--24 Jul},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v139/yamane21a/yamane21a.pdf},
  url = 	 {https://proceedings.mlr.press/v139/yamane21a.html},
  abstract = 	 {Ordinary supervised learning is useful when we have paired training data of input $X$ and output $Y$. However, such paired data can be difficult to collect in practice. In this paper, we consider the task of predicting $Y$ from $X$ when we have no paired data of them, but we have two separate, independent datasets of $X$ and $Y$ each observed with some mediating variable $U$, that is, we have two datasets $S_X = \{(X_i, U_i)\}$ and $S_Y = \{(U'_j, Y'_j)\}$. A naive approach is to predict $U$ from $X$ using $S_X$ and then $Y$ from $U$ using $S_Y$, but we show that this is not statistically consistent. Moreover, predicting $U$ can be more difficult than predicting $Y$ in practice, e.g., when $U$ has higher dimensionality. To circumvent the difficulty, we propose a new method that avoids predicting $U$ but directly learns $Y = f(X)$ by training $f(X)$ with $S_{X}$ to predict $h(U)$ which is trained with $S_{Y}$ to approximate $Y$. We prove statistical consistency and error bounds of our method and experimentally confirm its practical usefulness.}
}

Endnote

%0 Conference Paper
%T Mediated Uncoupled Learning: Learning Functions without Direct Input-output Correspondences
%A Ikko Yamane
%A Junya Honda
%A Florian Yger
%A Masashi Sugiyama
%B Proceedings of the 38th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2021
%E Marina Meila
%E Tong Zhang
%F pmlr-v139-yamane21a
%I PMLR
%P 11637--11647
%U https://proceedings.mlr.press/v139/yamane21a.html
%V 139
%X Ordinary supervised learning is useful when we have paired training data of input $X$ and output $Y$. However, such paired data can be difficult to collect in practice. In this paper, we consider the task of predicting $Y$ from $X$ when we have no paired data of them, but we have two separate, independent datasets of $X$ and $Y$ each observed with some mediating variable $U$, that is, we have two datasets $S_X = \{(X_i, U_i)\}$ and $S_Y = \{(U'_j, Y'_j)\}$. A naive approach is to predict $U$ from $X$ using $S_X$ and then $Y$ from $U$ using $S_Y$, but we show that this is not statistically consistent. Moreover, predicting $U$ can be more difficult than predicting $Y$ in practice, e.g., when $U$ has higher dimensionality. To circumvent the difficulty, we propose a new method that avoids predicting $U$ but directly learns $Y = f(X)$ by training $f(X)$ with $S_{X}$ to predict $h(U)$ which is trained with $S_{Y}$ to approximate $Y$. We prove statistical consistency and error bounds of our method and experimentally confirm its practical usefulness.

APA

Yamane, I., Honda, J., Yger, F. & Sugiyama, M. (2021). Mediated Uncoupled Learning: Learning Functions without Direct Input-output Correspondences. Proceedings of the 38th International Conference on Machine Learning, in Proceedings of Machine Learning Research 139:11637-11647. Available from https://proceedings.mlr.press/v139/yamane21a.html.
