Conformal prediction in manifold learning

Alexander Kuleshov; Alexander Bernstein; Evgeny Burnaev

Proceedings of the Seventh Workshop on Conformal and Probabilistic Prediction and Applications, PMLR 91:234-253, 2018.

Abstract

The paper presents a geometrically motivated view of conformal prediction applied to nonlinear multi-output regression tasks, with the aim of obtaining a valid measure of accuracy for Manifold Learning Regression algorithms. The regression task considered is to estimate an unknown smooth mapping $\mathbf{f}$ from $q$-dimensional inputs $\mathbf{x}\in \mathbf{X}$ to $m$-dimensional outputs $\mathbf{y} = \mathbf{f}(\mathbf{x})$ based on a training dataset $\mathbf{Z}_{(n)}$ consisting of ``input-output'' pairs $\{Z_i = (\mathbf{x}_i, \mathbf{y}_i = \mathbf{f}(\mathbf{x}_i))^{\mathrm{T}}, i = 1, 2, \ldots , n\}$. The Manifold Learning Regression (MLR) algorithm solves this task using a manifold learning technique. First, the unknown $q$-dimensional regression manifold $\mathbf{M}(\mathbf{f}) = \{(\mathbf{x}, \mathbf{f}(\mathbf{x}))^{\mathrm{T}}\in\mathbb{R}^{q+m}: \mathbf{x}\in \mathbf{X}\subset \mathbb{R}^{q} \}$, embedded in the ambient $(q+m)$-dimensional space, is estimated from the training data $\mathbf{Z}_{(n)}$ sampled from this manifold. The constructed estimator $\mathbf{M}_{MLR}$, which is also a $q$-dimensional manifold embedded in the ambient space $\mathbb{R}^{q+m}$, is close to $\mathbf{M}(\mathbf{f})$ in terms of Hausdorff distance. Then an estimator $\mathbf{f}_{MLR}$ of the unknown function $\mathbf{f}$, mapping an arbitrary input $\mathbf{x}\in \mathbf{X}$ to the output $\mathbf{f}_{MLR}(\mathbf{x})$, is constructed as the solution to the equation $\mathbf{M}(\mathbf{f}_{MLR}) = \mathbf{M}_{MLR}$. Conformal prediction allows constructing a prediction region for an unknown output $\mathbf{y} = \mathbf{f}(\mathbf{x})$ at an out-of-sample input point $\mathbf{x}$ for a given confidence level, using a given nonconformity measure that characterizes the extent to which an example $Z = (\mathbf{x}, \mathbf{y})^{\mathrm{T}}$ differs from the examples in the known dataset $\mathbf{Z}_{(n)}$. The paper proposes a new nonconformity measure based on MLR estimators using an analog of Bregman distance.
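
As a rough illustration of how a nonconformity measure turns into a prediction region, the sketch below implements a generic full conformal predictor over a finite grid of candidate outputs. It is not the method of the paper: the nonconformity score used here (distance from each example to its nearest other example in the joint input-output space) is a simple stand-in for the MLR-based, Bregman-type measure the abstract refers to, and the names `nn_nonconformity` and `conformal_region`, the toy sine data, and the candidate grid are all assumptions made for the example.

```python
import numpy as np


def nn_nonconformity(Z):
    """Stand-in nonconformity score (an assumption, not the paper's measure):
    distance from each row of Z to its nearest other row."""
    d = np.linalg.norm(Z[:, None, :] - Z[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)  # an example is not its own neighbour
    return d.min(axis=1)


def conformal_region(X, Y, x_new, candidates, epsilon):
    """Return the candidate outputs whose conformal p-value exceeds epsilon.

    X: (n, q) inputs, Y: (n, m) outputs, x_new: (q,) test input,
    candidates: (k, m) grid of trial outputs, epsilon: significance level.
    """
    Z_train = np.hstack([X, Y])
    kept = []
    for y in candidates:
        # Tentatively add the example (x_new, y) to the training set.
        Z_aug = np.vstack([Z_train, np.concatenate([x_new, y])])
        alpha = nn_nonconformity(Z_aug)
        # Fraction of examples at least as nonconforming as the new one
        # (the new example itself is included in the count).
        p_value = np.mean(alpha >= alpha[-1])
        if p_value > epsilon:
            kept.append(y)
    return np.array(kept).reshape(-1, candidates.shape[1])


# Toy data: q = 1, m = 1, points sampled from the graph of a sine curve.
X = np.linspace(0.0, 1.0, 50)[:, None]
Y = np.sin(2.0 * np.pi * X)
x_new = np.array([0.5])
candidates = np.linspace(-2.0, 2.0, 81)[:, None]

region = conformal_region(X, Y, x_new, candidates, epsilon=0.1)
```

Under the usual exchangeability assumption, a region built this way is meant to cover the true output with probability at least $1 - \epsilon$; the choice of nonconformity measure affects how tight the region is rather than its validity, which is why a measure adapted to the estimated manifold is of interest.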

Cite this Paper

BibTeX

@InProceedings{pmlr-v91-kuleshov18a,
  title = 	 {Conformal prediction in manifold learning},
  author = 	 {Kuleshov, Alexander and Bernstein, Alexander and Burnaev, Evgeny},
  booktitle = 	 {Proceedings of the Seventh Workshop on Conformal and Probabilistic Prediction and Applications},
  pages = 	 {234--253},
  year = 	 {2018},
  editor = 	 {Gammerman, Alex and Vovk, Vladimir and Luo, Zhiyuan and Smirnov, Evgueni and Peeters, Ralf},
  volume = 	 {91},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {11--13 Jun},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v91/kuleshov18a/kuleshov18a.pdf},
  url = 	 {https://proceedings.mlr.press/v91/kuleshov18a.html},
  abstract = 	 {The paper presents a geometrically motivated view on conformal prediction applied to nonlinear multi-output regression tasks for obtaining valid measure of accuracy of Manifold Learning Regression algorithms. A considered regression task is to estimate an unknown smooth mapping $\mathbf{f}$ from $q$-dimensional inputs $\mathbf{x}\in \mathbf{X}$ to $m$-dimensional outputs $\mathbf{y} = \mathbf{f}(\mathbf{x})$ based on training dataset $\mathbf{Z}_{(n)}$ consisting of ``input-output'' pairs $\{Z_i = (\mathbf{x}_i, \mathbf{y}_i = \mathbf{f}(\mathbf{x}_i))^{\mathrm{T}}, i = 1, 2, \ldots , n\}$. Manifold Learning Regression (MLR) algorithm solves this task using Manifold learning technique. At first, unknown $q$-dimensional Regression manifold $\mathbf{M}(\mathbf{f}) = \{(\mathbf{x}, \mathbf{f}(\mathbf{x}))^{\mathrm{T}}\in\mathbb{R}^{q+m}: \mathbf{x}\in \mathbf{X}\subset \mathbb{R}^{q} \}$, embedded in ambient $(q+m)$-dimensional space, is estimated from the training data $\mathbf{Z}_{(n)}$, sampled from this manifold. The constructed estimator $\mathbf{M}_{MLR}$, which is also $q$-dimensional manifold embedded in ambient space $\mathbb{R}^{q+m}$, is close to $\mathbf{M}$ in terms of Hausdorff distance. After that, an estimator $\mathbf{f}_{MLR}$ of the unknown function $\mathbf{f}$, mapping arbitrary input $\mathbf{x}\in \mathbf{X}$ to output $\mathbf{f}_{MLR}(\mathbf{x})$, is constructed as the solution to the equation $\mathbf{M}(\mathbf{f}_{MLR}) = \mathbf{M}_{MLR}$. Conformal prediction allows constructing a prediction region for an unknown output $\mathbf{y} = \mathbf{f}(\mathbf{x})$ at Out-of-Sample input point $\mathbf{x}$ for a given confidence level using given nonconformity measure, characterizing to which extent an example $Z = (\mathbf{x}, \mathbf{y})^{\mathrm{T}}$ is different from examples in the known dataset $\mathbf{Z}_{(n)}$. The paper proposes a new nonconformity measure based on MLR estimators using an analog of Bregman distance.}
}

Endnote

%0 Conference Paper
%T Conformal prediction in manifold learning
%A Alexander Kuleshov
%A Alexander Bernstein
%A Evgeny Burnaev
%B Proceedings of the Seventh Workshop on Conformal and Probabilistic Prediction and Applications
%C Proceedings of Machine Learning Research
%D 2018
%E Alex Gammerman
%E Vladimir Vovk
%E Zhiyuan Luo
%E Evgueni Smirnov
%E Ralf Peeters
%F pmlr-v91-kuleshov18a
%I PMLR
%P 234--253
%U https://proceedings.mlr.press/v91/kuleshov18a.html
%V 91
%X The paper presents a geometrically motivated view on conformal prediction applied to nonlinear multi-output regression tasks for obtaining valid measure of accuracy of Manifold Learning Regression algorithms. A considered regression task is to estimate an unknown smooth mapping $\mathbf{f}$ from $q$-dimensional inputs $\mathbf{x}\in \mathbf{X}$ to $m$-dimensional outputs $\mathbf{y} = \mathbf{f}(\mathbf{x})$ based on training dataset $\mathbf{Z}_{(n)}$ consisting of ``input-output'' pairs $\{Z_i = (\mathbf{x}_i, \mathbf{y}_i = \mathbf{f}(\mathbf{x}_i))^{\mathrm{T}}, i = 1, 2, \ldots , n\}$. Manifold Learning Regression (MLR) algorithm solves this task using Manifold learning technique. At first, unknown $q$-dimensional Regression manifold $\mathbf{M}(\mathbf{f}) = \{(\mathbf{x}, \mathbf{f}(\mathbf{x}))^{\mathrm{T}}\in\mathbb{R}^{q+m}: \mathbf{x}\in \mathbf{X}\subset \mathbb{R}^{q} \}$, embedded in ambient $(q+m)$-dimensional space, is estimated from the training data $\mathbf{Z}_{(n)}$, sampled from this manifold. The constructed estimator $\mathbf{M}_{MLR}$, which is also $q$-dimensional manifold embedded in ambient space $\mathbb{R}^{q+m}$, is close to $\mathbf{M}$ in terms of Hausdorff distance. After that, an estimator $\mathbf{f}_{MLR}$ of the unknown function $\mathbf{f}$, mapping arbitrary input $\mathbf{x}\in \mathbf{X}$ to output $\mathbf{f}_{MLR}(\mathbf{x})$, is constructed as the solution to the equation $\mathbf{M}(\mathbf{f}_{MLR}) = \mathbf{M}_{MLR}$. Conformal prediction allows constructing a prediction region for an unknown output $\mathbf{y} = \mathbf{f}(\mathbf{x})$ at Out-of-Sample input point $\mathbf{x}$ for a given confidence level using given nonconformity measure, characterizing to which extent an example $Z = (\mathbf{x}, \mathbf{y})^{\mathrm{T}}$ is different from examples in the known dataset $\mathbf{Z}_{(n)}$. The paper proposes a new nonconformity measure based on MLR estimators using an analog of Bregman distance.

APA

Kuleshov, A., Bernstein, A. & Burnaev, E. (2018). Conformal prediction in manifold learning. Proceedings of the Seventh Workshop on Conformal and Probabilistic Prediction and Applications, in Proceedings of Machine Learning Research 91:234-253. Available from https://proceedings.mlr.press/v91/kuleshov18a.html.
