Importance Sampling Tree for Large-scale Empirical Expectation

Olivier Canevet; Cijo Jose; Francois Fleuret

Importance Sampling Tree for Large-scale Empirical Expectation

Olivier Canevet, Cijo Jose, Francois Fleuret

Proceedings of The 33rd International Conference on Machine Learning, PMLR 48:1454-1462, 2016.

Abstract

We propose a tree-based procedure inspired by the Monte-Carlo Tree Search that dynamically modulates an importance-based sampling to prioritize computation, while getting unbiased estimates of weighted sums. We apply this generic method to learning on very large training sets, and to the evaluation of large-scale SVMs. The core idea is to reformulate the estimation of a score - whether a loss or a prediction estimate - as an empirical expectation, and to use such a tree whose leaves carry the samples to focus efforts over the problematic "heavy weight" ones. We illustrate the potential of this approach on three problems: to improve Adaboost and a multi-layer perceptron on 2D synthetic tasks with several million points, to train a large-scale convolution network on several millions deformations of the CIFAR data-set, and to compute the response of a SVM with several hundreds of thousands of support vectors. In each case, we show how it either cuts down computation by more than one order of magnitude and/or allows to get better loss estimates.

Cite this Paper

BibTeX

@InProceedings{pmlr-v48-canevet16,
  title = 	 {Importance Sampling Tree for Large-scale Empirical Expectation},
  author = 	 {Canevet, Olivier and Jose, Cijo and Fleuret, Francois},
  booktitle = 	 {Proceedings of The 33rd International Conference on Machine Learning},
  pages = 	 {1454--1462},
  year = 	 {2016},
  editor = 	 {Balcan, Maria Florina and Weinberger, Kilian Q.},
  volume = 	 {48},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {New York, New York, USA},
  month = 	 {20--22 Jun},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v48/canevet16.pdf},
  url = 	 {https://proceedings.mlr.press/v48/canevet16.html},
  abstract = 	 {We propose a tree-based procedure inspired by the Monte-Carlo Tree Search that dynamically modulates an importance-based sampling to prioritize computation, while getting unbiased estimates of weighted sums. We apply this generic method to learning on very large training sets, and to the evaluation of large-scale SVMs. The core idea is to reformulate the estimation of a score - whether a loss or a prediction estimate - as an empirical expectation, and to use such a tree whose leaves carry the samples to focus efforts over the problematic "heavy weight" ones. We illustrate the potential of this approach on three problems: to improve Adaboost and a multi-layer perceptron on 2D synthetic tasks with several million points, to train a large-scale convolution network on several millions deformations of the CIFAR data-set, and to compute the response of a SVM with several hundreds of thousands of support vectors. In each case, we show how it either cuts down computation by more than one order of magnitude and/or allows to get better loss estimates.}
}

Endnote

%0 Conference Paper
%T Importance Sampling Tree for Large-scale Empirical Expectation
%A Olivier Canevet
%A Cijo Jose
%A Francois Fleuret
%B Proceedings of The 33rd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2016
%E Maria Florina Balcan
%E Kilian Q. Weinberger	
%F pmlr-v48-canevet16
%I PMLR
%P 1454--1462
%U https://proceedings.mlr.press/v48/canevet16.html
%V 48
%X We propose a tree-based procedure inspired by the Monte-Carlo Tree Search that dynamically modulates an importance-based sampling to prioritize computation, while getting unbiased estimates of weighted sums. We apply this generic method to learning on very large training sets, and to the evaluation of large-scale SVMs. The core idea is to reformulate the estimation of a score - whether a loss or a prediction estimate - as an empirical expectation, and to use such a tree whose leaves carry the samples to focus efforts over the problematic "heavy weight" ones. We illustrate the potential of this approach on three problems: to improve Adaboost and a multi-layer perceptron on 2D synthetic tasks with several million points, to train a large-scale convolution network on several millions deformations of the CIFAR data-set, and to compute the response of a SVM with several hundreds of thousands of support vectors. In each case, we show how it either cuts down computation by more than one order of magnitude and/or allows to get better loss estimates.

RIS

TY  - CPAPER
TI  - Importance Sampling Tree for Large-scale Empirical Expectation
AU  - Olivier Canevet
AU  - Cijo Jose
AU  - Francois Fleuret
BT  - Proceedings of The 33rd International Conference on Machine Learning
DA  - 2016/06/11
ED  - Maria Florina Balcan
ED  - Kilian Q. Weinberger	
ID  - pmlr-v48-canevet16
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 48
SP  - 1454
EP  - 1462
L1  - http://proceedings.mlr.press/v48/canevet16.pdf
UR  - https://proceedings.mlr.press/v48/canevet16.html
AB  - We propose a tree-based procedure inspired by the Monte-Carlo Tree Search that dynamically modulates an importance-based sampling to prioritize computation, while getting unbiased estimates of weighted sums. We apply this generic method to learning on very large training sets, and to the evaluation of large-scale SVMs. The core idea is to reformulate the estimation of a score - whether a loss or a prediction estimate - as an empirical expectation, and to use such a tree whose leaves carry the samples to focus efforts over the problematic "heavy weight" ones. We illustrate the potential of this approach on three problems: to improve Adaboost and a multi-layer perceptron on 2D synthetic tasks with several million points, to train a large-scale convolution network on several millions deformations of the CIFAR data-set, and to compute the response of a SVM with several hundreds of thousands of support vectors. In each case, we show how it either cuts down computation by more than one order of magnitude and/or allows to get better loss estimates.
ER  -

APA

Canevet, O., Jose, C. & Fleuret, F.. (2016). Importance Sampling Tree for Large-scale Empirical Expectation. Proceedings of The 33rd International Conference on Machine Learning, in Proceedings of Machine Learning Research 48:1454-1462 Available from https://proceedings.mlr.press/v48/canevet16.html.

Importance Sampling Tree for Large-scale Empirical Expectation

Abstract

Cite this Paper

Related Material