Rademacher Observations, Private Data, and Boosting
Proceedings of the 32nd International Conference on Machine Learning, PMLR 37:948-956, 2015.
Abstract
The minimization of the logistic loss is a popular approach to batch supervised learning. Our paper starts from the surprising observation that, when fitting linear classifiers, the minimization of the logistic loss is equivalent to the minimization of an exponential rado-loss computed (i) over transformed data that we call Rademacher observations (rados), and (ii) over the same classifier as the one learnt from the logistic loss. Thus, a classifier learnt from rados can be directly used to classify observations. We provide a learning algorithm over rados with boosting-compliant convergence rates on the logistic loss (computed over examples). Experiments on domains with up to millions of examples, backed up by theoretical arguments, show that learning over a small set of random rados can challenge the state of the art that learns over the complete set of examples. We show that rados comply with various privacy requirements that make them good candidates for machine learning in a privacy framework. We give several algebraic, geometric and computational hardness results on reconstructing examples from rados. We also show how it is possible to craft, and efficiently learn from, rados in a differential privacy framework. Tests reveal that learning from differentially private rados brings non-trivial privacy versus accuracy trade-offs.
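To make the rado construction concrete, here is a minimal sketch, assuming the paper's definition of a rado as pi_sigma = (1/2) * sum_i (sigma_i + y_i) x_i for a Rademacher vector sigma in {-1, +1}^m, and an exponential rado-loss taken as the average of exp(-theta . pi_sigma) over a set of rados. The function names and the synthetic data below are ours, for illustration only, not the paper's code.

```python
import numpy as np

def rademacher_observations(X, y, n_rados, seed=0):
    """Sketch: build random Rademacher observations (rados).

    Assumes pi_sigma = (1/2) * sum_i (sigma_i + y_i) * x_i for a random
    Rademacher vector sigma in {-1, +1}^m (one vector per rado).
    """
    rng = np.random.default_rng(seed)
    m, _ = X.shape
    sigmas = rng.choice([-1.0, 1.0], size=(n_rados, m))
    # (sigma_i + y_i) / 2 equals y_i when sigma_i agrees with y_i, and 0
    # otherwise, so each rado is a signed sum over a random subset of examples.
    return 0.5 * (sigmas + y) @ X

def exp_rado_loss(theta, rados):
    """Exponential rado-loss (sketch): average of exp(-theta . pi) over rados."""
    return np.mean(np.exp(-rados @ theta))

# Tiny usage example on synthetic data (illustrative only).
if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.normal(size=(100, 5))
    y = np.sign(X[:, 0] + 0.1 * rng.normal(size=100))
    rados = rademacher_observations(X, y, n_rados=32)
    theta = np.zeros(5)
    print(exp_rado_loss(theta, rados))  # equals 1.0 at theta = 0
```

Note that a classifier theta minimizing this rado-loss can be applied directly to individual observations x via sign(theta . x), which is the sense in which learning from rados substitutes for learning from examples.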