SILVER: Single-loop variance reduction and application to federated learning

Kazusato Oko; Shunta Akiyama; Denny Wu; Tomoya Murata; Taiji Suzuki

Proceedings of the 41st International Conference on Machine Learning, PMLR 235:38683-38739, 2024.

Abstract

Most variance reduction methods require repeated full gradient computations, which are time-consuming and hence a bottleneck when applied to distributed optimization. We present a single-loop variance-reduced gradient estimator named SILVER (SIngle-Loop VariancE-Reduction) for finite-sum non-convex optimization, which does not require multiple full gradients but nevertheless achieves the optimal gradient complexity. Notably, unlike existing methods, SILVER provably reaches second-order optimality, with exponential convergence in the Polyak-Łojasiewicz (PL) region, and achieves further speedup depending on the data heterogeneity. Owing to these advantages, SILVER serves as a new base method for designing communication-efficient federated learning algorithms: we combine SILVER with local updates, which gives the best number of communication rounds and communicated gradients across the full range of Hessian heterogeneity and, at the same time, guarantees second-order optimality and exponential convergence in the PL region.
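
For readers unfamiliar with the terms used in the abstract, the standard textbook definitions can be sketched as follows. This is not the paper's exact notation: the symbols n, f_i, d, \mu, \varepsilon, \delta, and f^\ast are assumptions introduced here for illustration, and the paper may use different constants or a localized form of the PL condition.

```latex
% Finite-sum non-convex objective over n component functions (assumed notation)
\min_{x \in \mathbb{R}^d} f(x) := \frac{1}{n} \sum_{i=1}^{n} f_i(x)

% (\varepsilon, \delta)-second-order optimality: small gradient, nearly PSD Hessian
\|\nabla f(x)\| \le \varepsilon, \qquad \lambda_{\min}\bigl(\nabla^2 f(x)\bigr) \ge -\delta

% Polyak-Lojasiewicz (PL) condition with constant \mu > 0, where f^\ast is the minimum value
\tfrac{1}{2}\,\|\nabla f(x)\|^2 \ge \mu \bigl(f(x) - f^\ast\bigr)
```

Under a PL-type inequality, gradient-based methods typically shrink the gap f(x) - f^\ast geometrically, which is the usual sense of "exponential convergence" in this setting.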

Cite this Paper

BibTeX


@InProceedings{pmlr-v235-oko24a,
  title = {{SILVER}: Single-loop variance reduction and application to federated learning},
  author = {Oko, Kazusato and Akiyama, Shunta and Wu, Denny and Murata, Tomoya and Suzuki, Taiji},
  booktitle = {Proceedings of the 41st International Conference on Machine Learning},
  pages = {38683--38739},
  year = {2024},
  editor = {Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix},
  volume = {235},
  series = {Proceedings of Machine Learning Research},
  month = {21--27 Jul},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v235/main/assets/oko24a/oko24a.pdf},
  url = {https://proceedings.mlr.press/v235/oko24a.html},
  abstract = 	 {Most variance reduction methods require multiple times of full gradient computation, which is time-consuming and hence a bottleneck in application to distributed optimization. We present a single-loop variance-reduced gradient estimator named SILVER (SIngle-Loop VariancE-Reduction) for the finite-sum non-convex optimization, which does not require multiple full gradients but nevertheless achieves the optimal gradient complexity. Notably, unlike existing methods, SILVER provably reaches second-order optimality, with exponential convergence in the Polyak-Łojasiewicz (PL) region, and achieves further speedup depending on the data heterogeneity. Owing to these advantages, SILVER serves as a new base method to design communication-efficient federated learning algorithms: we combine SILVER with local updates which gives the best communication rounds and number of communicated gradients across all range of Hessian heterogeneity, and, at the same time, guarantees second-order optimality and exponential convergence in the PL region.}
}

Endnote

%0 Conference Paper
%T SILVER: Single-loop variance reduction and application to federated learning
%A Kazusato Oko
%A Shunta Akiyama
%A Denny Wu
%A Tomoya Murata
%A Taiji Suzuki
%B Proceedings of the 41st International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2024
%E Ruslan Salakhutdinov
%E Zico Kolter
%E Katherine Heller
%E Adrian Weller
%E Nuria Oliver
%E Jonathan Scarlett
%E Felix Berkenkamp
%F pmlr-v235-oko24a
%I PMLR
%P 38683--38739
%U https://proceedings.mlr.press/v235/oko24a.html
%V 235
%X Most variance reduction methods require multiple times of full gradient computation, which is time-consuming and hence a bottleneck in application to distributed optimization. We present a single-loop variance-reduced gradient estimator named SILVER (SIngle-Loop VariancE-Reduction) for the finite-sum non-convex optimization, which does not require multiple full gradients but nevertheless achieves the optimal gradient complexity. Notably, unlike existing methods, SILVER provably reaches second-order optimality, with exponential convergence in the Polyak-Łojasiewicz (PL) region, and achieves further speedup depending on the data heterogeneity. Owing to these advantages, SILVER serves as a new base method to design communication-efficient federated learning algorithms: we combine SILVER with local updates which gives the best communication rounds and number of communicated gradients across all range of Hessian heterogeneity, and, at the same time, guarantees second-order optimality and exponential convergence in the PL region.

APA


Oko, K., Akiyama, S., Wu, D., Murata, T. & Suzuki, T. (2024). SILVER: Single-loop variance reduction and application to federated learning. Proceedings of the 41st International Conference on Machine Learning, in Proceedings of Machine Learning Research 235:38683-38739. Available from https://proceedings.mlr.press/v235/oko24a.html.
