Generalization Bounds of Nonconvex-(Strongly)-Concave Stochastic Minimax Optimization

Siqi Zhang, Yifan Hu, Liang Zhang, Niao He
Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, PMLR 238:694-702, 2024.

Abstract

This paper studies the generalization performance of algorithms for solving nonconvex-(strongly)-concave (NC-SC/NC-C) stochastic minimax optimization measured by the stationarity of primal functions. We first establish algorithm-agnostic generalization bounds via uniform convergence between the empirical minimax problem and the population minimax problem. The sample complexities for achieving $\epsilon$-generalization are $\tilde{\mathcal{O}}(d\kappa^2\epsilon^{-2})$ and $\tilde{\mathcal{O}}(d\epsilon^{-4})$ for NC-SC and NC-C settings, respectively, where $d$ is the dimension of the primal variable and $\kappa$ is the condition number. We further study the algorithm-dependent generalization bounds via stability arguments of algorithms. In particular, we introduce a novel stability notion for minimax problems and build a connection between stability and generalization. As a result, we establish algorithm-dependent generalization bounds for stochastic gradient descent ascent (SGDA) and the more general sampling-determined algorithms (SDA).

Cite this Paper


BibTeX
@InProceedings{pmlr-v238-zhang24c, title = {Generalization Bounds of Nonconvex-(Strongly)-Concave Stochastic Minimax Optimization}, author = {Zhang, Siqi and Hu, Yifan and Zhang, Liang and He, Niao}, booktitle = {Proceedings of The 27th International Conference on Artificial Intelligence and Statistics}, pages = {694--702}, year = {2024}, editor = {Dasgupta, Sanjoy and Mandt, Stephan and Li, Yingzhen}, volume = {238}, series = {Proceedings of Machine Learning Research}, month = {02--04 May}, publisher = {PMLR}, pdf = {https://proceedings.mlr.press/v238/zhang24c/zhang24c.pdf}, url = {https://proceedings.mlr.press/v238/zhang24c.html}, abstract = {This paper studies the generalization performance of algorithms for solving nonconvex-(strongly)-concave (NC-SC/NC-C) stochastic minimax optimization measured by the stationarity of primal functions. We first establish algorithm-agnostic generalization bounds via uniform convergence between the empirical minimax problem and the population minimax problem. The sample complexities for achieving $\epsilon$-generalization are $\tilde{\mathcal{O}}(d\kappa^2\epsilon^{-2})$ and $\tilde{\mathcal{O}}(d\epsilon^{-4})$ for NC-SC and NC-C settings, respectively, where $d$ is the dimension of the primal variable and $\kappa$ is the condition number. We further study the algorithm-dependent generalization bounds via stability arguments of algorithms. In particular, we introduce a novel stability notion for minimax problems and build a connection between stability and generalization. As a result, we establish algorithm-dependent generalization bounds for stochastic gradient descent ascent (SGDA) and the more general sampling-determined algorithms (SDA).} }
Endnote
%0 Conference Paper %T Generalization Bounds of Nonconvex-(Strongly)-Concave Stochastic Minimax Optimization %A Siqi Zhang %A Yifan Hu %A Liang Zhang %A Niao He %B Proceedings of The 27th International Conference on Artificial Intelligence and Statistics %C Proceedings of Machine Learning Research %D 2024 %E Sanjoy Dasgupta %E Stephan Mandt %E Yingzhen Li %F pmlr-v238-zhang24c %I PMLR %P 694--702 %U https://proceedings.mlr.press/v238/zhang24c.html %V 238 %X This paper studies the generalization performance of algorithms for solving nonconvex-(strongly)-concave (NC-SC/NC-C) stochastic minimax optimization measured by the stationarity of primal functions. We first establish algorithm-agnostic generalization bounds via uniform convergence between the empirical minimax problem and the population minimax problem. The sample complexities for achieving $\epsilon$-generalization are $\tilde{\mathcal{O}}(d\kappa^2\epsilon^{-2})$ and $\tilde{\mathcal{O}}(d\epsilon^{-4})$ for NC-SC and NC-C settings, respectively, where $d$ is the dimension of the primal variable and $\kappa$ is the condition number. We further study the algorithm-dependent generalization bounds via stability arguments of algorithms. In particular, we introduce a novel stability notion for minimax problems and build a connection between stability and generalization. As a result, we establish algorithm-dependent generalization bounds for stochastic gradient descent ascent (SGDA) and the more general sampling-determined algorithms (SDA).
APA
Zhang, S., Hu, Y., Zhang, L. & He, N.. (2024). Generalization Bounds of Nonconvex-(Strongly)-Concave Stochastic Minimax Optimization. Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 238:694-702 Available from https://proceedings.mlr.press/v238/zhang24c.html.

Related Material