DiSCO: Distributed Optimization for Self-Concordant Empirical Loss

Yuchen Zhang, Xiao Lin
Proceedings of the 32nd International Conference on Machine Learning, PMLR 37:362-370, 2015.

Abstract

We propose a new distributed algorithm for empirical risk minimization in machine learning. The algorithm is based on an inexact damped Newton method, where the inexact Newton steps are computed by a distributed preconditioned conjugate gradient method. We analyze its iteration complexity and communication efficiency for minimizing self-concordant empirical loss functions, and discuss the results for distributed ridge regression, logistic regression and binary classification with a smoothed hinge loss. In a standard setting for supervised learning, where the n data points are i.i.d. sampled and the regularization parameter scales as 1/\sqrt{n}, we show that the proposed algorithm is communication efficient: the required number of communication rounds does not increase with the sample size n, and only grows slowly with the number of machines.
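To make the abstract's description concrete, below is a minimal Python sketch of one inexact damped Newton step, where the linear system is solved approximately by conjugate gradient. This is not the authors' DiSCO code: the names grad_fn, hess_vec_fn, cg_tol, and cg_iters are illustrative, plain CG stands in for the paper's distributed preconditioned CG, and the damping rule (step length 1/(1+delta) with delta estimated from the inexact direction) is one standard damped-Newton scheme consistent with the abstract's description.

import numpy as np

def damped_newton_step(grad_fn, hess_vec_fn, w, cg_tol=1e-6, cg_iters=100):
    """One inexact damped Newton step (illustrative sketch).

    grad_fn(w)        -> gradient of the regularized empirical loss at w
    hess_vec_fn(w, v) -> Hessian-vector product H(w) v; in a distributed
                         setting each machine would return its local product
                         and a driver would average them
    """
    g = grad_fn(w)

    # Approximately solve H v = g by conjugate gradient (preconditioning
    # omitted in this sketch), starting from v = 0.
    v = np.zeros_like(w)
    r = g - hess_vec_fn(w, v)
    p = r.copy()
    rs_old = r @ r
    for _ in range(cg_iters):
        Hp = hess_vec_fn(w, p)
        alpha = rs_old / (p @ Hp)
        v += alpha * p
        r -= alpha * Hp
        rs_new = r @ r
        if np.sqrt(rs_new) <= cg_tol:
            break
        p = r + (rs_new / rs_old) * p
        rs_old = rs_new

    # Damped update: delta estimates the Newton decrement from the
    # inexact direction v, and the step length is 1 / (1 + delta).
    delta = np.sqrt(v @ hess_vec_fn(w, v))
    return w - v / (1.0 + delta)

In the distributed setting the communication per CG iteration reduces to aggregating local Hessian-vector products and gradients, which is what makes the round-complexity analysis in the paper the relevant cost measure.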

Cite this Paper


BibTeX
@InProceedings{pmlr-v37-zhangb15, title = {DiSCO: Distributed Optimization for Self-Concordant Empirical Loss}, author = {Zhang, Yuchen and Lin, Xiao}, booktitle = {Proceedings of the 32nd International Conference on Machine Learning}, pages = {362--370}, year = {2015}, editor = {Bach, Francis and Blei, David}, volume = {37}, series = {Proceedings of Machine Learning Research}, address = {Lille, France}, month = {07--09 Jul}, publisher = {PMLR}, pdf = {http://proceedings.mlr.press/v37/zhangb15.pdf}, url = {https://proceedings.mlr.press/v37/zhangb15.html}, abstract = {We propose a new distributed algorithm for empirical risk minimization in machine learning. The algorithm is based on an inexact damped Newton method, where the inexact Newton steps are computed by a distributed preconditioned conjugate gradient method. We analyze its iteration complexity and communication efficiency for minimizing self-concordant empirical loss functions, and discuss the results for distributed ridge regression, logistic regression and binary classification with a smoothed hinge loss. In a standard setting for supervised learning, where the n data points are i.i.d. sampled and the regularization parameter scales as 1/\sqrt{n}, we show that the proposed algorithm is communication efficient: the required number of communication rounds does not increase with the sample size n, and only grows slowly with the number of machines.} }
Endnote
%0 Conference Paper %T DiSCO: Distributed Optimization for Self-Concordant Empirical Loss %A Yuchen Zhang %A Xiao Lin %B Proceedings of the 32nd International Conference on Machine Learning %C Proceedings of Machine Learning Research %D 2015 %E Francis Bach %E David Blei %F pmlr-v37-zhangb15 %I PMLR %P 362--370 %U https://proceedings.mlr.press/v37/zhangb15.html %V 37 %X We propose a new distributed algorithm for empirical risk minimization in machine learning. The algorithm is based on an inexact damped Newton method, where the inexact Newton steps are computed by a distributed preconditioned conjugate gradient method. We analyze its iteration complexity and communication efficiency for minimizing self-concordant empirical loss functions, and discuss the results for distributed ridge regression, logistic regression and binary classification with a smoothed hinge loss. In a standard setting for supervised learning, where the n data points are i.i.d. sampled and the regularization parameter scales as 1/\sqrt{n}, we show that the proposed algorithm is communication efficient: the required number of communication rounds does not increase with the sample size n, and only grows slowly with the number of machines.
RIS
TY - CPAPER TI - DiSCO: Distributed Optimization for Self-Concordant Empirical Loss AU - Yuchen Zhang AU - Xiao Lin BT - Proceedings of the 32nd International Conference on Machine Learning DA - 2015/06/01 ED - Francis Bach ED - David Blei ID - pmlr-v37-zhangb15 PB - PMLR DP - Proceedings of Machine Learning Research VL - 37 SP - 362 EP - 370 L1 - http://proceedings.mlr.press/v37/zhangb15.pdf UR - https://proceedings.mlr.press/v37/zhangb15.html AB - We propose a new distributed algorithm for empirical risk minimization in machine learning. The algorithm is based on an inexact damped Newton method, where the inexact Newton steps are computed by a distributed preconditioned conjugate gradient method. We analyze its iteration complexity and communication efficiency for minimizing self-concordant empirical loss functions, and discuss the results for distributed ridge regression, logistic regression and binary classification with a smoothed hinge loss. In a standard setting for supervised learning, where the n data points are i.i.d. sampled and the regularization parameter scales as 1/\sqrt{n}, we show that the proposed algorithm is communication efficient: the required number of communication rounds does not increase with the sample size n, and only grows slowly with the number of machines. ER -
APA
Zhang, Y. &amp; Lin, X. (2015). DiSCO: Distributed Optimization for Self-Concordant Empirical Loss. Proceedings of the 32nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 37:362-370. Available from https://proceedings.mlr.press/v37/zhangb15.html.