Newton Method over Networks is Fast up to the Statistical Precision

Amir Daneshmand, Gesualdo Scutari, Pavel Dvurechensky, Alexander Gasnikov
Proceedings of the 38th International Conference on Machine Learning, PMLR 139:2398-2409, 2021.

Abstract

We propose a distributed cubic regularization of the Newton method for solving (constrained) empirical risk minimization problems over a network of agents, modeled as an undirected graph. The algorithm employs an inexact, preconditioned Newton step at each agent’s side: the gradient of the centralized loss is iteratively estimated via a gradient-tracking consensus mechanism, and the Hessian is subsampled over the local data sets. No Hessian matrices are exchanged over the network. We derive global complexity bounds for convex and strongly convex losses. Our analysis reveals an interesting interplay between sample and iteration/communication complexity: statistically accurate solutions are achievable in roughly the same number of iterations as the centralized cubic Newton method, with a communication cost per iteration of the order of $\widetilde{\mathcal{O}}\big(1/\sqrt{1-\rho}\big)$, where $\rho$ characterizes the connectivity of the network. This represents a significant improvement over existing, statistically oblivious, distributed Newton-based methods over networks.
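The abstract names two mechanisms that can be made concrete: a gradient-tracking consensus step, by which each agent estimates the gradient of the centralized (average) loss using only neighbor communication, and a local cubic-regularized Newton step built from that tracked gradient and a Hessian subsampled from the agent's own data. The sketch below is a minimal toy illustration of these two ingredients on a decentralized ridge-regression problem; it is not the paper's algorithm, and all names and parameter choices (the mixing matrix W, the regularization constant M_reg, the subsample size, etc.) are assumptions made for the example.

```python
import numpy as np

# Toy decentralized ridge regression: n_agents agents on a ring graph,
# each holding a private data set. All constants here are illustrative.
rng = np.random.default_rng(0)
n_agents, n_local, dim = 5, 50, 10

A = [rng.standard_normal((n_local, dim)) for _ in range(n_agents)]
b = [A[i] @ rng.standard_normal(dim) + 0.1 * rng.standard_normal(n_local)
     for i in range(n_agents)]
lam = 1e-2

def local_grad(i, x):
    # Exact gradient of agent i's local (regularized) loss.
    return A[i].T @ (A[i] @ x - b[i]) / n_local + lam * np.eye(dim) @ x

def subsampled_hessian(i, batch=20):
    # Hessian estimated from a random subsample of agent i's data only;
    # no Hessian matrices are ever communicated over the network.
    idx = rng.choice(n_local, size=batch, replace=False)
    Ai = A[i][idx]
    return Ai.T @ Ai / batch + lam * np.eye(dim)

# Doubly stochastic mixing matrix for a ring graph (lazy Metropolis weights).
W = np.eye(n_agents) * 0.5
for i in range(n_agents):
    W[i, (i - 1) % n_agents] = W[i, (i + 1) % n_agents] = 0.25

def cubic_step(g, H, M_reg, iters=50):
    # Solve min_d g.T d + 0.5 d.T H d + (M_reg / 6) * ||d||^3 by bisection
    # on r = ||d||, using d(r) = -(H + (M_reg * r / 2) I)^{-1} g.
    lo, hi = 0.0, 1e3
    for _ in range(iters):
        r = 0.5 * (lo + hi)
        d = -np.linalg.solve(H + 0.5 * M_reg * r * np.eye(dim), g)
        lo, hi = (r, hi) if np.linalg.norm(d) > r else (lo, r)
    return d

# Gradient tracking: y[i] estimates the gradient of the average loss.
x = np.zeros((n_agents, dim))
y = np.array([local_grad(i, x[i]) for i in range(n_agents)])
g_old = y.copy()
M_reg = 10.0  # cubic regularization parameter (assumed, not tuned)

for k in range(30):
    # Local inexact Newton step: tracked gradient + subsampled local Hessian.
    d = np.array([cubic_step(y[i], subsampled_hessian(i), M_reg)
                  for i in range(n_agents)])
    x = W @ (x + d)                 # consensus on the iterates
    g_new = np.array([local_grad(i, x[i]) for i in range(n_agents)])
    y = W @ y + g_new - g_old       # gradient-tracking update
    g_old = g_new
```

Consistent with the abstract, only d-dimensional vectors (iterates and tracked gradients) are mixed through W at each iteration; the subsampled Hessians never leave the agents.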

Cite this Paper


BibTeX
@InProceedings{pmlr-v139-daneshmand21a,
  title     = {Newton Method over Networks is Fast up to the Statistical Precision},
  author    = {Daneshmand, Amir and Scutari, Gesualdo and Dvurechensky, Pavel and Gasnikov, Alexander},
  booktitle = {Proceedings of the 38th International Conference on Machine Learning},
  pages     = {2398--2409},
  year      = {2021},
  editor    = {Meila, Marina and Zhang, Tong},
  volume    = {139},
  series    = {Proceedings of Machine Learning Research},
  month     = {18--24 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v139/daneshmand21a/daneshmand21a.pdf},
  url       = {https://proceedings.mlr.press/v139/daneshmand21a.html}
}
Endnote
%0 Conference Paper
%T Newton Method over Networks is Fast up to the Statistical Precision
%A Amir Daneshmand
%A Gesualdo Scutari
%A Pavel Dvurechensky
%A Alexander Gasnikov
%B Proceedings of the 38th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2021
%E Marina Meila
%E Tong Zhang
%F pmlr-v139-daneshmand21a
%I PMLR
%P 2398--2409
%U https://proceedings.mlr.press/v139/daneshmand21a.html
%V 139
APA
Daneshmand, A., Scutari, G., Dvurechensky, P. & Gasnikov, A. (2021). Newton Method over Networks is Fast up to the Statistical Precision. Proceedings of the 38th International Conference on Machine Learning, in Proceedings of Machine Learning Research 139:2398-2409. Available from https://proceedings.mlr.press/v139/daneshmand21a.html.
