Adding vs. Averaging in Distributed Primal-Dual Optimization

Chenxin Ma, Virginia Smith, Martin Jaggi, Michael Jordan, Peter Richtarik, Martin Takac
Proceedings of the 32nd International Conference on Machine Learning, PMLR 37:1973-1982, 2015.

Abstract

Distributed optimization methods for large-scale machine learning suffer from a communication bottleneck. It is difficult to reduce this bottleneck while still efficiently and accurately aggregating partial work from different machines. In this paper, we present a novel generalization of the recent communication-efficient primal-dual framework (COCOA) for distributed optimization. Our framework, COCOA+, allows for additive combination of local updates to the global parameters at each iteration, whereas previous schemes only allow conservative averaging. We give stronger (primal-dual) convergence rate guarantees for both COCOA as well as our new variants, and generalize the theory for both methods to cover non-smooth convex loss functions. We provide an extensive experimental comparison that shows the markedly improved performance of COCOA+ on several real-world distributed datasets, especially when scaling up the number of machines.
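To make the aggregation choice concrete, the following self-contained Python sketch contrasts the two rules on a toy L2-regularized least-squares problem: averaging the local updates (gamma = 1/K) versus adding them (gamma = 1, with the local subproblems made correspondingly more conservative via sigma' = K). This is a minimal illustration under stated assumptions, not the authors' reference implementation; the SDCA-style local solver, the function names (local_sdca_step, run), and all parameter choices are illustrative.

import numpy as np

def local_sdca_step(A_k, y_k, alpha_k, w, lam, n, sigma_prime, inner_iters=5):
    """Approximate local solver on one data partition (SDCA-style coordinate
    ascent for squared loss). sigma_prime scales the quadratic term of the
    local subproblem, which is what keeps additive aggregation safe."""
    d_alpha = np.zeros_like(alpha_k)
    d_w = np.zeros_like(w)
    for _ in range(inner_iters):
        for i in range(len(alpha_k)):
            a_i = A_k[:, i]
            # Closed-form single-coordinate ascent step for the squared loss,
            # using the shared w plus the locally accumulated change.
            resid = y_k[i] - a_i @ (w + sigma_prime * d_w) - (alpha_k[i] + d_alpha[i])
            step = resid / (1.0 + sigma_prime * (a_i @ a_i) / (lam * n))
            d_alpha[i] += step
            d_w += step * a_i / (lam * n)
    return d_alpha, d_w

def run(adding=True, K=4, n=200, d=50, lam=1.0, outer_iters=20, seed=0):
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((d, n))          # columns are data points
    y = rng.standard_normal(n)
    parts = np.array_split(np.arange(n), K)  # disjoint partitions over machines
    alpha = np.zeros(n)                      # dual variables
    w = np.zeros(d)                          # shared primal vector, w = A alpha / (lam n)
    # Adding (CoCoA+-style): gamma = 1, sigma' = K.
    # Averaging (CoCoA-style): gamma = 1/K, sigma' = 1.
    gamma, sigma_prime = (1.0, float(K)) if adding else (1.0 / K, 1.0)
    for _ in range(outer_iters):
        updates = [local_sdca_step(A[:, p], y[p], alpha[p], w, lam, n, sigma_prime)
                   for p in parts]           # embarrassingly parallel in practice
        for p, (d_alpha, d_w) in zip(parts, updates):
            alpha[p] += gamma * d_alpha
            w += gamma * d_w                 # one reduce step per communication round
    # Primal ridge objective, for comparing the two aggregation rules.
    return 0.5 * np.mean((A.T @ w - y) ** 2) + 0.5 * lam * (w @ w)

if __name__ == "__main__":
    print("adding   :", run(adding=True))
    print("averaging:", run(adding=False))

Running both variants for the same number of communication rounds shows the effect the paper studies: the additive rule exploits each round's local work more aggressively, while the averaging rule scales each machine's contribution down by 1/K.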

Cite this Paper


BibTeX
@InProceedings{pmlr-v37-mab15,
  title     = {Adding vs. Averaging in Distributed Primal-Dual Optimization},
  author    = {Chenxin Ma and Virginia Smith and Martin Jaggi and Michael Jordan and Peter Richtarik and Martin Takac},
  booktitle = {Proceedings of the 32nd International Conference on Machine Learning},
  pages     = {1973--1982},
  year      = {2015},
  editor    = {Francis Bach and David Blei},
  volume    = {37},
  series    = {Proceedings of Machine Learning Research},
  address   = {Lille, France},
  month     = {07--09 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v37/mab15.pdf},
  url       = {http://proceedings.mlr.press/v37/mab15.html},
  abstract  = {Distributed optimization methods for large-scale machine learning suffer from a communication bottleneck. It is difficult to reduce this bottleneck while still efficiently and accurately aggregating partial work from different machines. In this paper, we present a novel generalization of the recent communication-efficient primal-dual framework (COCOA) for distributed optimization. Our framework, COCOA+, allows for additive combination of local updates to the global parameters at each iteration, whereas previous schemes only allow conservative averaging. We give stronger (primal-dual) convergence rate guarantees for both COCOA as well as our new variants, and generalize the theory for both methods to cover non-smooth convex loss functions. We provide an extensive experimental comparison that shows the markedly improved performance of COCOA+ on several real-world distributed datasets, especially when scaling up the number of machines.}
}
APA
Ma, C., Smith, V., Jaggi, M., Jordan, M., Richtarik, P. & Takac, M. (2015). Adding vs. Averaging in Distributed Primal-Dual Optimization. Proceedings of the 32nd International Conference on Machine Learning, in PMLR 37:1973-1982.
