Batch-Expansion Training: An Efficient Optimization Framework

Michal Derezinski, Dhruv Mahajan, S. Sathiya Keerthi, S. V. N. Vishwanathan, Markus Weimer
Proceedings of the Twenty-First International Conference on Artificial Intelligence and Statistics, PMLR 84:736-744, 2018.

Abstract

We propose Batch-Expansion Training (BET), a framework for running a batch optimizer on a gradually expanding dataset. As opposed to stochastic approaches, batches do not need to be resampled i.i.d. at every iteration, thus making BET more resource efficient in a distributed setting, and when disk access is constrained. Moreover, BET can be easily paired with most batch optimizers, does not require any parameter-tuning, and compares favorably to existing stochastic and batch methods. We show that when the batch size grows exponentially with the number of outer iterations, BET achieves optimal O(1/ε) data-access convergence rate for strongly convex objectives. Experiments in parallel and distributed settings show that BET performs better than standard batch and stochastic approaches.
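As a rough illustration of the expanding-batch idea described in the abstract, the following sketch pairs the outer batch-expansion loop with plain gradient descent as the inner batch optimizer. The growth factor, initial batch fraction, inner-step count, and the logistic-regression gradient are illustrative assumptions, not details taken from the paper.

import numpy as np

def batch_expansion_training(X, y, grad_fn, w0, growth=2.0, init_frac=0.01,
                             inner_steps=20, lr=0.1):
    # Sketch of the Batch-Expansion Training (BET) outer loop.
    # A batch optimizer (here: plain gradient descent) is run on a gradually
    # expanding prefix of the data; the batch grows geometrically after each
    # outer iteration until the full dataset is in use. grad_fn(w, Xb, yb)
    # returns the gradient of the (strongly convex) objective on the batch.
    n = X.shape[0]
    w = w0.copy()
    batch_size = max(1, int(init_frac * n))
    while True:
        Xb, yb = X[:batch_size], y[:batch_size]        # current, fixed batch
        for _ in range(inner_steps):                   # inner batch optimizer
            w -= lr * grad_fn(w, Xb, yb)
        if batch_size >= n:                            # full data reached
            break
        batch_size = min(n, int(growth * batch_size))  # expand the batch
    return w

# Illustrative use with an L2-regularized logistic-regression gradient.
def logistic_grad(w, X, y, lam=1e-3):
    p = 1.0 / (1.0 + np.exp(-y * (X @ w)))             # P(correct label)
    return -(X.T @ ((1.0 - p) * y)) / len(y) + lam * w

rng = np.random.default_rng(0)
X = rng.standard_normal((10000, 20))
y = np.sign(X @ rng.standard_normal(20) + 0.1 * rng.standard_normal(10000))
w = batch_expansion_training(X, y, logistic_grad, np.zeros(20))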

Cite this Paper


BibTeX
@InProceedings{pmlr-v84-derezinski18b,
  title = {Batch-Expansion Training: An Efficient Optimization Framework},
  author = {Derezinski, Michal and Mahajan, Dhruv and Keerthi, S. Sathiya and Vishwanathan, S. V. N. and Weimer, Markus},
  booktitle = {Proceedings of the Twenty-First International Conference on Artificial Intelligence and Statistics},
  pages = {736--744},
  year = {2018},
  editor = {Storkey, Amos and Perez-Cruz, Fernando},
  volume = {84},
  series = {Proceedings of Machine Learning Research},
  month = {09--11 Apr},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v84/derezinski18b/derezinski18b.pdf},
  url = {https://proceedings.mlr.press/v84/derezinski18b.html},
  abstract = {We propose Batch-Expansion Training (BET), a framework for running a batch optimizer on a gradually expanding dataset. As opposed to stochastic approaches, batches do not need to be resampled i.i.d. at every iteration, thus making BET more resource efficient in a distributed setting, and when disk-access is constrained. Moreover, BET can be easily paired with most batch optimizers, does not require any parameter-tuning, and compares favorably to existing stochastic and batch methods. We show that when the batch size grows exponentially with the number of outer iterations, BET achieves optimal O (1/epsilon) data-access convergence rate for strongly convex objectives. Experiments in parallel and distributed settings show that BET performs better than standard batch and stochastic approaches.}
}
Endnote
%0 Conference Paper
%T Batch-Expansion Training: An Efficient Optimization Framework
%A Michal Derezinski
%A Dhruv Mahajan
%A S. Sathiya Keerthi
%A S. V. N. Vishwanathan
%A Markus Weimer
%B Proceedings of the Twenty-First International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2018
%E Amos Storkey
%E Fernando Perez-Cruz
%F pmlr-v84-derezinski18b
%I PMLR
%P 736--744
%U https://proceedings.mlr.press/v84/derezinski18b.html
%V 84
%X We propose Batch-Expansion Training (BET), a framework for running a batch optimizer on a gradually expanding dataset. As opposed to stochastic approaches, batches do not need to be resampled i.i.d. at every iteration, thus making BET more resource efficient in a distributed setting, and when disk-access is constrained. Moreover, BET can be easily paired with most batch optimizers, does not require any parameter-tuning, and compares favorably to existing stochastic and batch methods. We show that when the batch size grows exponentially with the number of outer iterations, BET achieves optimal O (1/epsilon) data-access convergence rate for strongly convex objectives. Experiments in parallel and distributed settings show that BET performs better than standard batch and stochastic approaches.
APA
Derezinski, M., Mahajan, D., Keerthi, S.S., Vishwanathan, S.V.N. & Weimer, M. (2018). Batch-Expansion Training: An Efficient Optimization Framework. Proceedings of the Twenty-First International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 84:736-744. Available from https://proceedings.mlr.press/v84/derezinski18b.html.