Loss-Proportional Subsampling for Subsequent ERM

Paul Mineiro; Nikos Karampatziakis

Proceedings of the 30th International Conference on Machine Learning, PMLR 28(3):522-530, 2013.

Abstract

We propose a sampling scheme suitable for reducing a data set prior to selecting a hypothesis with minimum empirical risk. The sampling only considers a subset of the ultimate (unknown) hypothesis set, but can nonetheless guarantee that the final excess risk will compare favorably with utilizing the entire original data set. We demonstrate the practical benefits of our approach on a large dataset which we subsample and subsequently fit with boosted trees.
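The general idea of loss-proportional subsampling can be illustrated with a minimal sketch. This is not the authors' exact scheme: it assumes per-example losses from some pilot hypothesis are already available, keeps each example with probability proportional to that loss, and attaches an importance weight of 1/p so that weighted sums over the subsample remain unbiased estimates of full-data sums. The function name, the `rate` parameter, and the `min_prob` floor are hypothetical choices made for this illustration.

```python
import random


def loss_proportional_subsample(losses, rate, min_prob=0.01, seed=0):
    """Keep example i with probability proportional to its pilot loss.

    losses   : non-empty list of nonnegative per-example losses under a
               pilot hypothesis (assumed to be computed beforehand)
    rate     : target fraction of the data to keep, before clipping
    min_prob : floor on the keep probability (an assumption of this sketch)
    Returns a list of (index, importance_weight) pairs.
    """
    rng = random.Random(seed)
    mean_loss = sum(losses) / len(losses)
    kept = []
    for i, loss in enumerate(losses):
        # Probability proportional to the loss, clipped into [min_prob, 1].
        p = rate * loss / mean_loss if mean_loss > 0 else rate
        p = min(1.0, max(min_prob, p))
        if rng.random() < p:
            # Weight 1/p compensates for the non-uniform sampling.
            kept.append((i, 1.0 / p))
    return kept
```

The returned indices and weights would then be passed to any learner that accepts per-example weights, so that weighted empirical risk is minimized on the reduced data set.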

Cite this Paper

BibTeX


@InProceedings{pmlr-v28-mineiro13,
  title = 	 {Loss-Proportional Subsampling for Subsequent ERM},
  author = 	 {Mineiro, Paul and Karampatziakis, Nikos},
  booktitle = 	 {Proceedings of the 30th International Conference on Machine Learning},
  pages = 	 {522--530},
  year = 	 {2013},
  editor = 	 {Dasgupta, Sanjoy and McAllester, David},
  volume = 	 {28},
  number =       {3},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {Atlanta, Georgia, USA},
  month = 	 {17--19 Jun},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v28/mineiro13.pdf},
  url = 	 {https://proceedings.mlr.press/v28/mineiro13.html},
  abstract = 	 {We propose a sampling scheme suitable for reducing a data set prior to  selecting a hypothesis with minimum empirical risk.  The sampling only  considers a subset of the ultimate (unknown) hypothesis set, but can  nonetheless guarantee that the final excess risk will compare favorably  with utilizing the entire original data set. We demonstrate the practical  benefits of our approach on a large dataset which we subsample and  subsequently fit with boosted trees.}
}

Endnote

%0 Conference Paper
%T Loss-Proportional Subsampling for Subsequent ERM
%A Paul Mineiro
%A Nikos Karampatziakis
%B Proceedings of the 30th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2013
%E Sanjoy Dasgupta
%E David McAllester
%F pmlr-v28-mineiro13
%I PMLR
%P 522--530
%U https://proceedings.mlr.press/v28/mineiro13.html
%V 28
%N 3
%X We propose a sampling scheme suitable for reducing a data set prior to  selecting a hypothesis with minimum empirical risk.  The sampling only  considers a subset of the ultimate (unknown) hypothesis set, but can  nonetheless guarantee that the final excess risk will compare favorably  with utilizing the entire original data set. We demonstrate the practical  benefits of our approach on a large dataset which we subsample and  subsequently fit with boosted trees.

RIS


TY  - CPAPER
TI  - Loss-Proportional Subsampling for Subsequent ERM
AU  - Paul Mineiro
AU  - Nikos Karampatziakis
BT  - Proceedings of the 30th International Conference on Machine Learning
DA  - 2013/05/26
ED  - Sanjoy Dasgupta
ED  - David McAllester
ID  - pmlr-v28-mineiro13
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 28
IS  - 3
SP  - 522
EP  - 530
L1  - http://proceedings.mlr.press/v28/mineiro13.pdf
UR  - https://proceedings.mlr.press/v28/mineiro13.html
AB  - We propose a sampling scheme suitable for reducing a data set prior to  selecting a hypothesis with minimum empirical risk.  The sampling only  considers a subset of the ultimate (unknown) hypothesis set, but can  nonetheless guarantee that the final excess risk will compare favorably  with utilizing the entire original data set. We demonstrate the practical  benefits of our approach on a large dataset which we subsample and  subsequently fit with boosted trees.
ER  -

APA


Mineiro, P. & Karampatziakis, N. (2013). Loss-Proportional Subsampling for Subsequent ERM. Proceedings of the 30th International Conference on Machine Learning, in Proceedings of Machine Learning Research 28(3):522-530. Available from https://proceedings.mlr.press/v28/mineiro13.html.
