Generalized Optimal Reverse Prediction

Martha White, Dale Schuurmans
Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics, PMLR 22:1305-1313, 2012.

Abstract

Recently it has been shown that classical supervised and unsupervised training methods can be unified as special cases of so-called “optimal reverse prediction”: predicting inputs from target labels while optimizing over both model parameters and missing labels. Although this perspective establishes links between classical training principles, the existing formulation only applies to linear predictors under squared loss, hence is extremely limited. We generalize the formulation of optimal reverse prediction to arbitrary Bregman divergences, and more importantly to nonlinear predictors. This extension is achieved by establishing a new, generalized form of forward-reverse minimization equivalence that holds for arbitrary matching losses. Several benefits follow. First, a new variant of Bregman divergence clustering can be recovered that incorporates a non-linear data reconstruction model. Second, normalized-cut and kernel-based extensions can be formulated coherently. Finally, a new semi-supervised training principle can be recovered for classification problems that demonstrates advantages over the state of the art.
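To make the reverse-prediction idea concrete, here is a minimal sketch (not taken from the paper) of the squared-loss linear special case that this work generalizes: with all labels missing, minimizing the reverse reconstruction error ||X - Z U||_F^2 over a cluster-indicator matrix Z and a reverse model U by alternating minimization coincides with k-means. The function name, initialization, and stopping rule below are illustrative assumptions, not the authors' algorithm, and the sketch does not cover the Bregman-divergence or nonlinear extensions contributed by the paper.

# Illustrative sketch: squared-loss linear reverse prediction with missing labels.
# Alternates between (1) fitting the reverse model U that predicts inputs X from
# the current label guesses Z, and (2) re-imputing labels by best reconstruction.
import numpy as np

def reverse_prediction_clustering(X, k, n_iters=50, seed=0):
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    labels = rng.integers(k, size=n)                # initial guess for the missing labels
    U = None
    for _ in range(n_iters):
        Z = np.eye(k)[labels]                       # n x k cluster-indicator matrix
        # Reverse model: least-squares fit predicting inputs X from labels Z;
        # the rows of U are the per-cluster reconstructions (cluster means).
        U, *_ = np.linalg.lstsq(Z, X, rcond=None)
        # Re-impute labels: assign each point to the row of U that reconstructs it best.
        dists = ((X[:, None, :] - U[None, :, :]) ** 2).sum(axis=2)
        new_labels = dists.argmin(axis=1)
        if np.array_equal(new_labels, labels):      # converged: labels stopped changing
            break
        labels = new_labels
    return labels, U

Under these assumptions the update rules are exactly Lloyd's k-means steps, which is the sense in which unsupervised training appears as a special case of optimal reverse prediction.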

Cite this Paper


BibTeX
@InProceedings{pmlr-v22-white12,
  title     = {Generalized Optimal Reverse Prediction},
  author    = {White, Martha and Schuurmans, Dale},
  booktitle = {Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics},
  pages     = {1305--1313},
  year      = {2012},
  editor    = {Lawrence, Neil D. and Girolami, Mark},
  volume    = {22},
  series    = {Proceedings of Machine Learning Research},
  address   = {La Palma, Canary Islands},
  month     = {21--23 Apr},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v22/white12/white12.pdf},
  url       = {https://proceedings.mlr.press/v22/white12.html},
  abstract  = {Recently it has been shown that classical supervised and unsupervised training methods can be unified as special cases of so-called “optimal reverse prediction”: predicting inputs from target labels while optimizing over both model parameters and missing labels. Although this perspective establishes links between classical training principles, the existing formulation only applies to linear predictors under squared loss, hence is extremely limited. We generalize the formulation of optimal reverse prediction to arbitrary Bregman divergences, and more importantly to nonlinear predictors. This extension is achieved by establishing a new, generalized form of forward-reverse minimization equivalence that holds for arbitrary matching losses. Several benefits follow. First, a new variant of Bregman divergence clustering can be recovered that incorporates a non-linear data reconstruction model. Second, normalized-cut and kernel-based extensions can be formulated coherently. Finally, a new semi-supervised training principle can be recovered for classification problems that demonstrates advantages over the state of the art.}
}
Endnote
%0 Conference Paper
%T Generalized Optimal Reverse Prediction
%A Martha White
%A Dale Schuurmans
%B Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2012
%E Neil D. Lawrence
%E Mark Girolami
%F pmlr-v22-white12
%I PMLR
%P 1305--1313
%U https://proceedings.mlr.press/v22/white12.html
%V 22
%X Recently it has been shown that classical supervised and unsupervised training methods can be unified as special cases of so-called “optimal reverse prediction”: predicting inputs from target labels while optimizing over both model parameters and missing labels. Although this perspective establishes links between classical training principles, the existing formulation only applies to linear predictors under squared loss, hence is extremely limited. We generalize the formulation of optimal reverse prediction to arbitrary Bregman divergences, and more importantly to nonlinear predictors. This extension is achieved by establishing a new, generalized form of forward-reverse minimization equivalence that holds for arbitrary matching losses. Several benefits follow. First, a new variant of Bregman divergence clustering can be recovered that incorporates a non-linear data reconstruction model. Second, normalized-cut and kernel-based extensions can be formulated coherently. Finally, a new semi-supervised training principle can be recovered for classification problems that demonstrates advantages over the state of the art.
RIS
TY - CPAPER
TI - Generalized Optimal Reverse Prediction
AU - Martha White
AU - Dale Schuurmans
BT - Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics
DA - 2012/03/21
ED - Neil D. Lawrence
ED - Mark Girolami
ID - pmlr-v22-white12
PB - PMLR
DP - Proceedings of Machine Learning Research
VL - 22
SP - 1305
EP - 1313
L1 - http://proceedings.mlr.press/v22/white12/white12.pdf
UR - https://proceedings.mlr.press/v22/white12.html
AB - Recently it has been shown that classical supervised and unsupervised training methods can be unified as special cases of so-called “optimal reverse prediction”: predicting inputs from target labels while optimizing over both model parameters and missing labels. Although this perspective establishes links between classical training principles, the existing formulation only applies to linear predictors under squared loss, hence is extremely limited. We generalize the formulation of optimal reverse prediction to arbitrary Bregman divergences, and more importantly to nonlinear predictors. This extension is achieved by establishing a new, generalized form of forward-reverse minimization equivalence that holds for arbitrary matching losses. Several benefits follow. First, a new variant of Bregman divergence clustering can be recovered that incorporates a non-linear data reconstruction model. Second, normalized-cut and kernel-based extensions can be formulated coherently. Finally, a new semi-supervised training principle can be recovered for classification problems that demonstrates advantages over the state of the art.
ER -
APA
White, M. & Schuurmans, D. (2012). Generalized Optimal Reverse Prediction. Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 22:1305-1313. Available from https://proceedings.mlr.press/v22/white12.html.
