Adding vs. Averaging in Distributed Primal-Dual Optimization

Chenxin Ma, Virginia Smith, Martin Jaggi, Michael Jordan, Peter Richtarik, Martin Takac
Proceedings of the 32nd International Conference on Machine Learning, PMLR 37:1973-1982, 2015.

Abstract

Distributed optimization methods for large-scale machine learning suffer from a communication bottleneck. It is difficult to reduce this bottleneck while still efficiently and accurately aggregating partial work from different machines. In this paper, we present a novel generalization of the recent communication-efficient primal-dual framework (COCOA) for distributed optimization. Our framework, COCOA+, allows for additive combination of local updates to the global parameters at each iteration, whereas previous schemes only allow conservative averaging. We give stronger (primal-dual) convergence rate guarantees for both COCOA as well as our new variants, and generalize the theory for both methods to cover non-smooth convex loss functions. We provide an extensive experimental comparison that shows the markedly improved performance of COCOA+ on several real-world distributed datasets, especially when scaling up the number of machines.
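The contrast between adding and averaging can be illustrated with a minimal sketch of the aggregation step alone. It assumes each machine has already computed a local update to a shared parameter vector. The function name `aggregate`, the scaling parameter `gamma`, and the toy numbers are illustrative assumptions, not taken from the paper's code. The sketch leaves out how the local updates are computed and whatever the framework does to keep additive combination safe.

```python
def aggregate(w, local_updates, gamma):
    """Return w + gamma * (sum of all machines' local updates)."""
    # Sum the updates coordinate by coordinate across machines.
    summed = [sum(coords) for coords in zip(*local_updates)]
    return [w_i + gamma * s_i for w_i, s_i in zip(w, summed)]


# Hypothetical toy data: K = 4 machines, each proposing an update
# to a 3-dimensional global parameter vector.
K = 4
w = [0.0, 0.0, 0.0]
local_updates = [
    [1.0, 0.0, 2.0],
    [0.0, 1.0, 2.0],
    [1.0, 1.0, 0.0],
    [2.0, 2.0, 0.0],
]

w_avg = aggregate(w, local_updates, gamma=1.0 / K)  # conservative averaging
w_add = aggregate(w, local_updates, gamma=1.0)      # additive combination
```

With the same local work, the additive step moves the global parameters K times as far as the averaged step, which is why the gap between the two matters more as the number of machines grows.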

Cite this Paper


BibTeX
@InProceedings{pmlr-v37-mab15,
  title = {Adding vs. Averaging in Distributed Primal-Dual Optimization},
  author = {Ma, Chenxin and Smith, Virginia and Jaggi, Martin and Jordan, Michael and Richtarik, Peter and Takac, Martin},
  booktitle = {Proceedings of the 32nd International Conference on Machine Learning},
  pages = {1973--1982},
  year = {2015},
  editor = {Bach, Francis and Blei, David},
  volume = {37},
  series = {Proceedings of Machine Learning Research},
  address = {Lille, France},
  month = {07--09 Jul},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v37/mab15.pdf},
  url = {http://proceedings.mlr.press/v37/mab15.html},
  abstract = {Distributed optimization methods for large-scale machine learning suffer from a communication bottleneck. It is difficult to reduce this bottleneck while still efficiently and accurately aggregating partial work from different machines. In this paper, we present a novel generalization of the recent communication-efficient primal-dual framework (COCOA) for distributed optimization. Our framework, COCOA+, allows for additive combination of local updates to the global parameters at each iteration, whereas previous schemes only allow conservative averaging. We give stronger (primal-dual) convergence rate guarantees for both COCOA as well as our new variants, and generalize the theory for both methods to cover non-smooth convex loss functions. We provide an extensive experimental comparison that shows the markedly improved performance of COCOA+ on several real-world distributed datasets, especially when scaling up the number of machines.}
}
Endnote
%0 Conference Paper
%T Adding vs. Averaging in Distributed Primal-Dual Optimization
%A Chenxin Ma
%A Virginia Smith
%A Martin Jaggi
%A Michael Jordan
%A Peter Richtarik
%A Martin Takac
%B Proceedings of the 32nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2015
%E Francis Bach
%E David Blei
%F pmlr-v37-mab15
%I PMLR
%P 1973--1982
%U http://proceedings.mlr.press/v37/mab15.html
%V 37
%X Distributed optimization methods for large-scale machine learning suffer from a communication bottleneck. It is difficult to reduce this bottleneck while still efficiently and accurately aggregating partial work from different machines. In this paper, we present a novel generalization of the recent communication-efficient primal-dual framework (COCOA) for distributed optimization. Our framework, COCOA+, allows for additive combination of local updates to the global parameters at each iteration, whereas previous schemes only allow conservative averaging. We give stronger (primal-dual) convergence rate guarantees for both COCOA as well as our new variants, and generalize the theory for both methods to cover non-smooth convex loss functions. We provide an extensive experimental comparison that shows the markedly improved performance of COCOA+ on several real-world distributed datasets, especially when scaling up the number of machines.
RIS
TY  - CPAPER
TI  - Adding vs. Averaging in Distributed Primal-Dual Optimization
AU  - Chenxin Ma
AU  - Virginia Smith
AU  - Martin Jaggi
AU  - Michael Jordan
AU  - Peter Richtarik
AU  - Martin Takac
BT  - Proceedings of the 32nd International Conference on Machine Learning
DA  - 2015/06/01
ED  - Francis Bach
ED  - David Blei
ID  - pmlr-v37-mab15
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 37
SP  - 1973
EP  - 1982
L1  - http://proceedings.mlr.press/v37/mab15.pdf
UR  - http://proceedings.mlr.press/v37/mab15.html
AB  - Distributed optimization methods for large-scale machine learning suffer from a communication bottleneck. It is difficult to reduce this bottleneck while still efficiently and accurately aggregating partial work from different machines. In this paper, we present a novel generalization of the recent communication-efficient primal-dual framework (COCOA) for distributed optimization. Our framework, COCOA+, allows for additive combination of local updates to the global parameters at each iteration, whereas previous schemes only allow conservative averaging. We give stronger (primal-dual) convergence rate guarantees for both COCOA as well as our new variants, and generalize the theory for both methods to cover non-smooth convex loss functions. We provide an extensive experimental comparison that shows the markedly improved performance of COCOA+ on several real-world distributed datasets, especially when scaling up the number of machines.
ER  -
APA
Ma, C., Smith, V., Jaggi, M., Jordan, M., Richtarik, P. & Takac, M. (2015). Adding vs. Averaging in Distributed Primal-Dual Optimization. Proceedings of the 32nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 37:1973-1982. Available from http://proceedings.mlr.press/v37/mab15.html.
