On the Convergence of Distributed Stochastic Bilevel Optimization Algorithms over a Network

Hongchang Gao, Bin Gu, My T. Thai
Proceedings of The 26th International Conference on Artificial Intelligence and Statistics, PMLR 206:9238-9281, 2023.

Abstract

Bilevel optimization has been applied to a wide variety of machine learning models, and numerous stochastic bilevel optimization algorithms have been developed in recent years. However, most existing algorithms restrict their focus to the single-machine setting, so they are incapable of handling distributed data. To address this issue, under the setting where all participants form a network and perform peer-to-peer communication over it, we developed two novel decentralized stochastic bilevel optimization algorithms based on the gradient-tracking communication mechanism and two different gradient estimators. Additionally, we established their convergence rates for nonconvex-strongly-convex problems with novel theoretical analysis strategies. To our knowledge, this is the first work to achieve these theoretical results. Finally, we applied our algorithms to practical machine learning models, and the experimental results confirmed their efficacy.
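The gradient-tracking mechanism the abstract refers to can be sketched in isolation. The toy below runs it on a simple single-level quadratic consensus problem over a ring network; the paper's actual algorithms apply this communication pattern to stochastic bilevel hypergradient estimators, so the objective, step size, topology, and mixing matrix here are purely illustrative assumptions, not the authors' method.

```python
import numpy as np

# Sketch of decentralized gradient tracking on a ring of n nodes, applied to
# the illustrative consensus problem min_x (1/n) * sum_i ||x - b_i||^2
# (global minimizer: mean of the b_i). All constants below are assumptions.
n, d, eta, T = 4, 3, 0.1, 200
rng = np.random.default_rng(0)
b = rng.normal(size=(n, d))             # local data held by each node

# Doubly stochastic mixing matrix for a ring: each node averages its own
# state with its two neighbors' states during peer-to-peer communication.
W = np.zeros((n, n))
for i in range(n):
    W[i, i] = 0.5
    W[i, (i - 1) % n] = 0.25
    W[i, (i + 1) % n] = 0.25

grad = lambda x: 2.0 * (x - b)          # stacked local gradients, shape (n, d)

x = np.zeros((n, d))                    # row i = node i's iterate
g = grad(x)
y = g.copy()                            # tracker, initialized at local gradients
for _ in range(T):
    x = W @ x - eta * y                 # consensus step + descent along tracker
    g_new = grad(x)
    y = W @ y + g_new - g               # gradient-tracking update: y tracks
    g = g_new                           # the network-average gradient

# Every node's iterate approaches the global minimizer mean(b).
print(np.max(np.abs(x - b.mean(axis=0))))
```

The key invariant is that, because W is doubly stochastic, the average of the trackers y equals the average of the local gradients at every iteration, so each node descends along an estimate of the *global* gradient despite only communicating with neighbors.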

Cite this Paper


BibTeX
@InProceedings{pmlr-v206-gao23a,
  title     = {On the Convergence of Distributed Stochastic Bilevel Optimization Algorithms over a Network},
  author    = {Gao, Hongchang and Gu, Bin and Thai, My T.},
  booktitle = {Proceedings of The 26th International Conference on Artificial Intelligence and Statistics},
  pages     = {9238--9281},
  year      = {2023},
  editor    = {Ruiz, Francisco and Dy, Jennifer and van de Meent, Jan-Willem},
  volume    = {206},
  series    = {Proceedings of Machine Learning Research},
  month     = {25--27 Apr},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v206/gao23a/gao23a.pdf},
  url       = {https://proceedings.mlr.press/v206/gao23a.html},
  abstract  = {Bilevel optimization has been applied to a wide variety of machine learning models and numerous stochastic bilevel optimization algorithms have been developed in recent years. However, most existing algorithms restrict their focus on the single-machine setting so that they are incapable of handling the distributed data. To address this issue, under the setting where all participants compose a network and perform peer-to-peer communication in this network, we developed two novel decentralized stochastic bilevel optimization algorithms based on the gradient tracking communication mechanism and two different gradient estimators. Additionally, we established their convergence rates for nonconvex-strongly-convex problems with novel theoretical analysis strategies. To our knowledge, this is the first work achieving these theoretical results. Finally, we applied our algorithms to practical machine learning models, and the experimental results confirmed the efficacy of our algorithms.}
}
Endnote
%0 Conference Paper
%T On the Convergence of Distributed Stochastic Bilevel Optimization Algorithms over a Network
%A Hongchang Gao
%A Bin Gu
%A My T. Thai
%B Proceedings of The 26th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2023
%E Francisco Ruiz
%E Jennifer Dy
%E Jan-Willem van de Meent
%F pmlr-v206-gao23a
%I PMLR
%P 9238--9281
%U https://proceedings.mlr.press/v206/gao23a.html
%V 206
%X Bilevel optimization has been applied to a wide variety of machine learning models and numerous stochastic bilevel optimization algorithms have been developed in recent years. However, most existing algorithms restrict their focus on the single-machine setting so that they are incapable of handling the distributed data. To address this issue, under the setting where all participants compose a network and perform peer-to-peer communication in this network, we developed two novel decentralized stochastic bilevel optimization algorithms based on the gradient tracking communication mechanism and two different gradient estimators. Additionally, we established their convergence rates for nonconvex-strongly-convex problems with novel theoretical analysis strategies. To our knowledge, this is the first work achieving these theoretical results. Finally, we applied our algorithms to practical machine learning models, and the experimental results confirmed the efficacy of our algorithms.
APA
Gao, H., Gu, B. & Thai, M.T. (2023). On the Convergence of Distributed Stochastic Bilevel Optimization Algorithms over a Network. Proceedings of The 26th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 206:9238-9281. Available from https://proceedings.mlr.press/v206/gao23a.html.