Distributed Learning with Sublinear Communication

Jayadev Acharya, Chris De Sa, Dylan Foster, Karthik Sridharan
Proceedings of the 36th International Conference on Machine Learning, PMLR 97:40-50, 2019.

Abstract

In distributed statistical learning, $N$ samples are split across $m$ machines and a learner wishes to use minimal communication to learn as well as if the examples were on a single machine. This model has received substantial interest in machine learning due to its scalability and potential for parallel speedup. However, in high-dimensional settings, where the number of examples is smaller than the number of features ("dimension"), the speedup afforded by distributed learning may be overshadowed by the cost of communicating a single example. This paper investigates the following question: When is it possible to learn a $d$-dimensional model in the distributed setting with total communication sublinear in $d$? Starting with a negative result, we observe that for learning $\ell_1$-bounded or sparse linear models, no algorithm can obtain optimal error until communication is linear in dimension. Our main result is that by slightly relaxing the standard boundedness assumptions for linear models, we can obtain distributed algorithms that enjoy optimal error with communication logarithmic in dimension. This result is based on a family of algorithms that combine mirror descent with randomized sparsification/quantization of iterates, and extends to the general stochastic convex optimization model.
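The algorithmic idea named in the abstract, mirror descent whose iterates are randomly sparsified before being communicated, can be illustrated with a small sketch. The snippet below is not the authors' algorithm: it is a minimal, hypothetical example (toy quadratic objective, entropic mirror descent over the simplex, coordinate sampling proportional to mass) showing how an unbiased sparsification of the iterate reduces a message from $d$ dense coordinates to roughly $s$ sampled ones.

```python
# Illustrative sketch only (assumed toy setup, not the paper's algorithm):
# entropic mirror descent over the simplex, where the iterate is replaced by an
# unbiased randomized sparsification before gradients are computed on it.
import numpy as np

rng = np.random.default_rng(0)
d, T, eta, s = 1000, 200, 0.1, 32        # dimension, rounds, step size, sparsity budget

w_star = np.zeros(d); w_star[:5] = 0.2   # sparse target on the simplex
grad = lambda w: 2 * (w - w_star)        # gradient of the toy loss ||w - w_star||^2

def sparsify(w, s):
    """Unbiased randomized sparsification: sample s coordinates with probability
    proportional to their mass and reweight, so E[sparsify(w)] = w."""
    p = w / w.sum()
    idx = rng.choice(d, size=s, p=p)
    out = np.zeros(d)
    np.add.at(out, idx, w.sum() / s)     # duplicates accumulate; expectation matches w
    return out

w = np.full(d, 1.0 / d)                  # uniform start on the simplex
for t in range(T):
    g = grad(sparsify(w, s))             # gradient queried at the cheap-to-send sparse iterate
    w = w * np.exp(-eta * g)             # entropic mirror descent (exponentiated gradient) step
    w /= w.sum()                         # re-normalize back onto the simplex

print("final error:", np.linalg.norm(w - w_star) ** 2)
```

Sending only the sampled indices (plus a shared scale) is what keeps each message far smaller than a dense $d$-dimensional vector; the paper's actual algorithms and guarantees combine this kind of sparsification/quantization with mirror descent in the distributed stochastic convex optimization setting.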

Cite this Paper


BibTeX
@InProceedings{pmlr-v97-acharya19b,
  title     = {Distributed Learning with Sublinear Communication},
  author    = {Acharya, Jayadev and De Sa, Chris and Foster, Dylan and Sridharan, Karthik},
  booktitle = {Proceedings of the 36th International Conference on Machine Learning},
  pages     = {40--50},
  year      = {2019},
  editor    = {Chaudhuri, Kamalika and Salakhutdinov, Ruslan},
  volume    = {97},
  series    = {Proceedings of Machine Learning Research},
  month     = {09--15 Jun},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v97/acharya19b/acharya19b.pdf},
  url       = {https://proceedings.mlr.press/v97/acharya19b.html},
  abstract  = {In distributed statistical learning, $N$ samples are split across $m$ machines and a learner wishes to use minimal communication to learn as well as if the examples were on a single machine. This model has received substantial interest in machine learning due to its scalability and potential for parallel speedup. However, in high-dimensional settings, where the number of examples is smaller than the number of features ("dimension"), the speedup afforded by distributed learning may be overshadowed by the cost of communicating a single example. This paper investigates the following question: When is it possible to learn a $d$-dimensional model in the distributed setting with total communication sublinear in $d$? Starting with a negative result, we observe that for learning $\ell_1$-bounded or sparse linear models, no algorithm can obtain optimal error until communication is linear in dimension. Our main result is that by slightly relaxing the standard boundedness assumptions for linear models, we can obtain distributed algorithms that enjoy optimal error with communication logarithmic in dimension. This result is based on a family of algorithms that combine mirror descent with randomized sparsification/quantization of iterates, and extends to the general stochastic convex optimization model.}
}
APA
Acharya, J., De Sa, C., Foster, D. & Sridharan, K. (2019). Distributed Learning with Sublinear Communication. Proceedings of the 36th International Conference on Machine Learning, in Proceedings of Machine Learning Research 97:40-50. Available from https://proceedings.mlr.press/v97/acharya19b.html.
