Optimal Distributed Learning with Multi-pass Stochastic Gradient Methods

Junhong Lin; Volkan Cevher

Proceedings of the 35th International Conference on Machine Learning, PMLR 80:3092-3101, 2018.

Abstract

We study generalization properties of distributed algorithms in the setting of nonparametric regression over a reproducing kernel Hilbert space (RKHS). We investigate distributed stochastic gradient methods (SGM), with mini-batches and multi-passes over the data. We show that optimal generalization error bounds can be retained for distributed SGM provided that the partition level is not too large. Our results are superior to the state-of-the-art theory, covering the cases that the regression function may not be in the hypothesis spaces. Particularly, our results show that distributed SGM has a smaller theoretical computational complexity, compared with distributed kernel ridge regression (KRR) and classic SGM.
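The scheme the abstract describes can be illustrated with a small sketch: split the sample into subsets, run mini-batch, multi-pass SGM for least squares in an RKHS on each subset, then average the local estimators. This is not the authors' implementation. The Gaussian kernel, its width, the constant step size, sampling indices with replacement, and all function names (`gaussian_kernel`, `local_sgm`, `distributed_sgm`) are assumptions made for illustration only.

```python
import math
import random


def gaussian_kernel(x, z, width=0.2):
    """Gaussian kernel on scalars; the width is an illustrative assumption."""
    return math.exp(-((x - z) ** 2) / (2.0 * width ** 2))


def local_sgm(xs, ys, step_size, batch_size, passes, rng):
    """Mini-batch, multi-pass SGM for least squares on one data subset.

    The iterate is f = sum_i coef[i] * K(xs[i], .), started at zero.
    One "pass" is counted as ceil(n / batch_size) iterations.
    """
    n = len(xs)
    # Precompute the Gram matrix of the local subset.
    gram = [[gaussian_kernel(xs[i], xs[j]) for j in range(n)] for i in range(n)]
    coef = [0.0] * n
    steps = passes * math.ceil(n / batch_size)
    for _ in range(steps):
        # Assumed sampling scheme: indices drawn uniformly with replacement.
        batch = [rng.randrange(n) for _ in range(batch_size)]
        # Residuals use the current iterate, before any update in this step.
        residuals = [
            sum(coef[i] * gram[i][j] for i in range(n)) - ys[j] for j in batch
        ]
        # Stochastic gradient step on the kernel expansion coefficients.
        for j, r in zip(batch, residuals):
            coef[j] -= step_size * r / batch_size
    return coef


def distributed_sgm(xs, ys, num_parts, step_size=0.5, batch_size=5,
                    passes=20, seed=0):
    """Run local SGM on each subset and return the averaged predictor."""
    rng = random.Random(seed)
    order = list(range(len(xs)))
    rng.shuffle(order)
    # Split the shuffled indices into num_parts disjoint subsets.
    parts = [order[k::num_parts] for k in range(num_parts)]
    models = []
    for idx in parts:
        px = [xs[i] for i in idx]
        py = [ys[i] for i in idx]
        coef = local_sgm(px, py, step_size, batch_size, passes, rng)
        models.append((px, coef))

    def predict(x):
        # Average the local estimators evaluated at x.
        total = 0.0
        for px, coef in models:
            total += sum(c * gaussian_kernel(p, x) for p, c in zip(px, coef))
        return total / len(models)

    return predict
```

Calling `distributed_sgm(xs, ys, num_parts=4)` on lists of scalar inputs and targets returns a function that evaluates the averaged estimator at a new point. In this sketch `num_parts` plays the role of the partition level mentioned in the abstract, and the number of passes acts as the stopping parameter.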

Cite this Paper

BibTeX


@InProceedings{pmlr-v80-lin18a,
  title     = {Optimal Distributed Learning with Multi-pass Stochastic Gradient Methods},
  author    = {Lin, Junhong and Cevher, Volkan},
  booktitle = {Proceedings of the 35th International Conference on Machine Learning},
  pages     = {3092--3101},
  year      = {2018},
  editor    = {Dy, Jennifer and Krause, Andreas},
  volume    = {80},
  series    = {Proceedings of Machine Learning Research},
  month     = {10--15 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v80/lin18a/lin18a.pdf},
  url       = {https://proceedings.mlr.press/v80/lin18a.html},
  abstract  = {We study generalization properties of distributed algorithms in the setting of nonparametric regression over a reproducing kernel Hilbert space (RKHS). We investigate distributed stochastic gradient methods (SGM), with mini-batches and multi-passes over the data. We show that optimal generalization error bounds can be retained for distributed SGM provided that the partition level is not too large. Our results are superior to the state-of-the-art theory, covering the cases that the regression function may not be in the hypothesis spaces. Particularly, our results show that distributed SGM has a smaller theoretical computational complexity, compared with distributed kernel ridge regression (KRR) and classic SGM.}
}

Endnote

%0 Conference Paper
%T Optimal Distributed Learning with Multi-pass Stochastic Gradient Methods
%A Junhong Lin
%A Volkan Cevher
%B Proceedings of the 35th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2018
%E Jennifer Dy
%E Andreas Krause
%F pmlr-v80-lin18a
%I PMLR
%P 3092--3101
%U https://proceedings.mlr.press/v80/lin18a.html
%V 80
%X We study generalization properties of distributed algorithms in the setting of nonparametric regression over a reproducing kernel Hilbert space (RKHS). We investigate distributed stochastic gradient methods (SGM), with mini-batches and multi-passes over the data. We show that optimal generalization error bounds can be retained for distributed SGM provided that the partition level is not too large. Our results are superior to the state-of-the-art theory, covering the cases that the regression function may not be in the hypothesis spaces. Particularly, our results show that distributed SGM has a smaller theoretical computational complexity, compared with distributed kernel ridge regression (KRR) and classic SGM.

APA


Lin, J. & Cevher, V. (2018). Optimal Distributed Learning with Multi-pass Stochastic Gradient Methods. Proceedings of the 35th International Conference on Machine Learning, in Proceedings of Machine Learning Research 80:3092-3101. Available from https://proceedings.mlr.press/v80/lin18a.html.
