Convex Analysis of the Mean Field Langevin Dynamics

Atsushi Nitanda, Denny Wu, Taiji Suzuki
Proceedings of The 25th International Conference on Artificial Intelligence and Statistics, PMLR 151:9741-9757, 2022.

Abstract

As an example of the nonlinear Fokker-Planck equation, the mean field Langevin dynamics has recently attracted attention due to its connection to (noisy) gradient descent on infinitely wide neural networks in the mean field regime; hence, the convergence properties of the dynamics are of great theoretical interest. In this work, we give a concise and self-contained convergence rate analysis of the mean field Langevin dynamics with respect to the (regularized) objective function in both continuous- and discrete-time settings. The key ingredient of our proof is a proximal Gibbs distribution $p_q$ associated with the dynamics, which, in combination with techniques in Vempala and Wibisono (2019), allows us to develop a simple convergence theory parallel to classical results in convex optimization. Furthermore, we reveal that $p_q$ is connected to the duality gap in the empirical risk minimization setting, which enables efficient empirical evaluation of the algorithm's convergence.
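For context, the following display sketches the formulation commonly used for the mean field Langevin dynamics and the associated proximal Gibbs distribution; the notation ($F$ for the convex functional, $\lambda$ for the entropy regularization strength, $\delta F(q)/\delta q$ for the first variation) is generic and illustrative, and the paper's exact assumptions and constants may differ. The dynamics can be viewed as minimizing an entropy-regularized objective $\mathcal{L}(q) = F(q) + \lambda\,\mathrm{Ent}(q)$ over probability distributions $q$:

$$
\mathrm{d}X_t = -\nabla \frac{\delta F(q_t)}{\delta q}(X_t)\,\mathrm{d}t + \sqrt{2\lambda}\,\mathrm{d}W_t,
\qquad q_t = \mathrm{Law}(X_t),
\qquad
p_{q}(x) \;\propto\; \exp\!\left(-\frac{1}{\lambda}\,\frac{\delta F(q)}{\delta q}(x)\right).
$$

In this reading, $p_q$ plays the role of a "proximal" target: it is the Gibbs distribution induced by the current first variation, and comparing $q_t$ against $p_{q_t}$ (e.g., via relative entropy) yields the optimization-style convergence bounds described in the abstract.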

Cite this Paper


BibTeX
@InProceedings{pmlr-v151-nitanda22a,
  title     = {Convex Analysis of the Mean Field Langevin Dynamics},
  author    = {Nitanda, Atsushi and Wu, Denny and Suzuki, Taiji},
  booktitle = {Proceedings of The 25th International Conference on Artificial Intelligence and Statistics},
  pages     = {9741--9757},
  year      = {2022},
  editor    = {Camps-Valls, Gustau and Ruiz, Francisco J. R. and Valera, Isabel},
  volume    = {151},
  series    = {Proceedings of Machine Learning Research},
  month     = {28--30 Mar},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v151/nitanda22a/nitanda22a.pdf},
  url       = {https://proceedings.mlr.press/v151/nitanda22a.html},
  abstract  = {As an example of the nonlinear Fokker-Planck equation, the mean field Langevin dynamics recently attracts attention due to its connection to (noisy) gradient descent on infinitely wide neural networks in the mean field regime, and hence the convergence property of the dynamics is of great theoretical interest. In this work, we give a concise and self-contained convergence rate analysis of the mean field Langevin dynamics with respect to the (regularized) objective function in both continuous and discrete time settings. The key ingredient of our proof is a proximal Gibbs distribution $p_q$ associated with the dynamics, which, in combination with techniques in Vempala and Wibisono (2019), allows us to develop a simple convergence theory parallel to classical results in convex optimization. Furthermore, we reveal that $p_q$ connects to the duality gap in the empirical risk minimization setting, which enables efficient empirical evaluation of the algorithm convergence.}
}
Endnote
%0 Conference Paper
%T Convex Analysis of the Mean Field Langevin Dynamics
%A Atsushi Nitanda
%A Denny Wu
%A Taiji Suzuki
%B Proceedings of The 25th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2022
%E Gustau Camps-Valls
%E Francisco J. R. Ruiz
%E Isabel Valera
%F pmlr-v151-nitanda22a
%I PMLR
%P 9741--9757
%U https://proceedings.mlr.press/v151/nitanda22a.html
%V 151
%X As an example of the nonlinear Fokker-Planck equation, the mean field Langevin dynamics recently attracts attention due to its connection to (noisy) gradient descent on infinitely wide neural networks in the mean field regime, and hence the convergence property of the dynamics is of great theoretical interest. In this work, we give a concise and self-contained convergence rate analysis of the mean field Langevin dynamics with respect to the (regularized) objective function in both continuous and discrete time settings. The key ingredient of our proof is a proximal Gibbs distribution $p_q$ associated with the dynamics, which, in combination with techniques in Vempala and Wibisono (2019), allows us to develop a simple convergence theory parallel to classical results in convex optimization. Furthermore, we reveal that $p_q$ connects to the duality gap in the empirical risk minimization setting, which enables efficient empirical evaluation of the algorithm convergence.
APA
Nitanda, A., Wu, D. & Suzuki, T. (2022). Convex Analysis of the Mean Field Langevin Dynamics. Proceedings of The 25th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 151:9741-9757. Available from https://proceedings.mlr.press/v151/nitanda22a.html.
