Generalized Optimal Reverse Prediction

Martha White; Dale Schuurmans

Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics, PMLR 22:1305-1313, 2012.

Abstract

Recently it has been shown that classical supervised and unsupervised training methods can be unified as special cases of so-called “optimal reverse prediction”: predicting inputs from target labels while optimizing over both model parameters and missing labels. Although this perspective establishes links between classical training principles, the existing formulation only applies to linear predictors under squared loss, hence is extremely limited. We generalize the formulation of optimal reverse prediction to arbitrary Bregman divergences, and more importantly to nonlinear predictors. This extension is achieved by establishing a new, generalized form of forward-reverse minimization equivalence that holds for arbitrary matching losses. Several benefits follow. First, a new variant of Bregman divergence clustering can be recovered that incorporates a non-linear data reconstruction model. Second, normalized-cut and kernel-based extensions can be formulated coherently. Finally, a new semi-supervised training principle can be recovered for classification problems that demonstrates advantages over the state of the art.
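
For readers unfamiliar with the term, the following is a minimal sketch of a Bregman divergence, using the standard definition D_F(x || y) = F(x) - F(y) - <grad F(y), x - y> for a convex potential F. It is not code from the paper: the function names (`bregman_divergence`, `half_sq_norm`, `neg_entropy`, and so on) and the example vectors are assumptions chosen for illustration. It only shows how squared loss, the case the abstract says the earlier formulation was restricted to, arises from one choice of F, while another choice gives a KL-type divergence.

```python
import numpy as np

def bregman_divergence(F, grad_F, x, y):
    """Standard Bregman divergence D_F(x || y) for a convex potential F."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(F(x) - F(y) - np.dot(grad_F(y), x - y))

# Potential F(z) = 0.5 * ||z||^2 recovers half the squared Euclidean distance.
def half_sq_norm(z):
    return 0.5 * float(np.dot(z, z))

def identity(z):
    return z

# Potential F(z) = sum_i z_i log z_i (negative entropy) recovers the
# generalized KL divergence; inputs must be strictly positive.
def neg_entropy(z):
    return float(np.sum(z * np.log(z)))

def log_plus_one(z):
    return np.log(z) + 1.0

# Illustrative points on the probability simplex (hypothetical values).
x = np.array([0.2, 0.3, 0.5])
y = np.array([0.4, 0.4, 0.2])

sq = bregman_divergence(half_sq_norm, identity, x, y)
kl = bregman_divergence(neg_entropy, log_plus_one, x, y)
print(sq, kl)
```

Because both example vectors sum to one, the second value coincides with the ordinary KL divergence between them.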

Cite this Paper

BibTeX


@InProceedings{pmlr-v22-white12,
  title = 	 {Generalized Optimal Reverse Prediction},
  author = 	 {White, Martha and Schuurmans, Dale},
  booktitle = 	 {Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics},
  pages = 	 {1305--1313},
  year = 	 {2012},
  editor = 	 {Lawrence, Neil D. and Girolami, Mark},
  volume = 	 {22},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {La Palma, Canary Islands},
  month = 	 {21--23 Apr},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v22/white12/white12.pdf},
  url = 	 {https://proceedings.mlr.press/v22/white12.html},
  abstract = 	 {Recently it has been shown that classical supervised and unsupervised training methods can be unified as special cases of so-called ``optimal reverse prediction'': predicting inputs from target labels while optimizing over both model parameters and missing labels. Although this perspective establishes links between classical training principles, the existing formulation only applies to linear predictors under squared loss, hence is extremely limited. We generalize the formulation of optimal reverse prediction to arbitrary Bregman divergences, and more importantly to nonlinear predictors. This extension is achieved by establishing a new, generalized form of forward-reverse minimization equivalence that holds for arbitrary matching losses. Several benefits follow. First, a new variant of Bregman divergence clustering can be recovered that incorporates a non-linear data reconstruction model. Second, normalized-cut and kernel-based extensions can be formulated coherently. Finally, a new semi-supervised training principle can be recovered for classification problems that demonstrates advantages over the state of the art.}
}

Endnote

%0 Conference Paper
%T Generalized Optimal Reverse Prediction
%A Martha White
%A Dale Schuurmans
%B Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2012
%E Neil D. Lawrence
%E Mark Girolami
%F pmlr-v22-white12
%I PMLR
%P 1305--1313
%U https://proceedings.mlr.press/v22/white12.html
%V 22
%X Recently it has been shown that classical supervised and unsupervised training methods can be unified as special cases of so-called “optimal reverse prediction”: predicting inputs from target labels while optimizing over both model parameters and missing labels. Although this perspective establishes links between classical training principles, the existing formulation only applies to linear predictors under squared loss, hence is extremely limited. We generalize the formulation of optimal reverse prediction to arbitrary Bregman divergences, and more importantly to nonlinear predictors. This extension is achieved by establishing a new, generalized form of forward-reverse minimization equivalence that holds for arbitrary matching losses. Several benefits follow. First, a new variant of Bregman divergence clustering can be recovered that incorporates a non-linear data reconstruction model. Second, normalized-cut and kernel-based extensions can be formulated coherently. Finally, a new semi-supervised training principle can be recovered for classification problems that demonstrates advantages over the state of the art.

RIS


TY  - CPAPER
TI  - Generalized Optimal Reverse Prediction
AU  - Martha White
AU  - Dale Schuurmans
BT  - Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics
DA  - 2012/03/21
ED  - Neil D. Lawrence
ED  - Mark Girolami
ID  - pmlr-v22-white12
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 22
SP  - 1305
EP  - 1313
L1  - http://proceedings.mlr.press/v22/white12/white12.pdf
UR  - https://proceedings.mlr.press/v22/white12.html
AB  - Recently it has been shown that classical supervised and unsupervised training methods can be unified as special cases of so-called “optimal reverse prediction”: predicting inputs from target labels while optimizing over both model parameters and missing labels. Although this perspective establishes links between classical training principles, the existing formulation only applies to linear predictors under squared loss, hence is extremely limited. We generalize the formulation of optimal reverse prediction to arbitrary Bregman divergences, and more importantly to nonlinear predictors. This extension is achieved by establishing a new, generalized form of forward-reverse minimization equivalence that holds for arbitrary matching losses. Several benefits follow. First, a new variant of Bregman divergence clustering can be recovered that incorporates a non-linear data reconstruction model. Second, normalized-cut and kernel-based extensions can be formulated coherently. Finally, a new semi-supervised training principle can be recovered for classification problems that demonstrates advantages over the state of the art.
ER  -

APA


White, M. & Schuurmans, D. (2012). Generalized Optimal Reverse Prediction. Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 22:1305-1313. Available from https://proceedings.mlr.press/v22/white12.html.
