Conjugate Gradient Method for Generative Adversarial Networks

Hiroki Naganuma, Hideaki Iiduka
Proceedings of The 26th International Conference on Artificial Intelligence and Statistics, PMLR 206:4381-4408, 2023.

Abstract

One of the training strategies of generative models is to minimize the Jensen–Shannon divergence between the model distribution and the data distribution. Since the data distribution is unknown, generative adversarial networks (GANs) formulate this problem as a game between two models, a generator and a discriminator. The training can be formulated in the context of game theory as the search for a local Nash equilibrium (LNE). This optimization problem is far more challenging than the single-objective setting, and it does not seem feasible to derive guarantees of stability or optimality for existing methods. Here, we use the conjugate gradient method to reliably and efficiently solve the LNE problem in GANs. We give a proof and convergence analysis under mild assumptions showing that the proposed method converges to an LNE with three different learning rate update rules, including a constant learning rate. Finally, we demonstrate that the proposed method outperforms stochastic gradient descent (SGD) and momentum SGD in terms of best Fréchet inception distance (FID) score and outperforms Adam on average.
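
To make the idea concrete, below is a minimal sketch (not the authors' implementation) of how a conjugate gradient direction can replace the plain negative gradient in one player's update. The Fletcher–Reeves formula for the momentum coefficient, the parameter names, and the constant learning rate are illustrative assumptions; the paper analyzes several learning rate update rules.

    import numpy as np

    def conjugate_direction(grad_new, grad_old, dir_old, eps=1e-12):
        """Fletcher-Reeves-type conjugate gradient direction.

        d_{t+1} = -g_{t+1} + beta_t * d_t,  beta_t = ||g_{t+1}||^2 / ||g_t||^2
        """
        beta = float(grad_new @ grad_new) / max(float(grad_old @ grad_old), eps)
        return -grad_new + beta * dir_old

    # Illustrative usage with random vectors standing in for one player's
    # (generator's or discriminator's) stochastic gradients.
    rng = np.random.default_rng(0)
    theta = rng.standard_normal(4)      # one player's parameters
    g_old = rng.standard_normal(4)      # gradient at the previous step
    d_old = -g_old                      # first direction: steepest descent
    lr = 1e-2                           # constant learning rate rule

    g_new = rng.standard_normal(4)      # gradient at the current step
    d_new = conjugate_direction(g_new, g_old, d_old)
    theta = theta + lr * d_new          # step along the conjugate direction
    print(theta)

The only change relative to SGD is that the past search direction is folded into the new one via beta, so each player moves along a direction that combines the current gradient with the previous direction.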

Cite this Paper


BibTeX
@InProceedings{pmlr-v206-naganuma23a,
  title     = {Conjugate Gradient Method for Generative Adversarial Networks},
  author    = {Naganuma, Hiroki and Iiduka, Hideaki},
  booktitle = {Proceedings of The 26th International Conference on Artificial Intelligence and Statistics},
  pages     = {4381--4408},
  year      = {2023},
  editor    = {Ruiz, Francisco and Dy, Jennifer and van de Meent, Jan-Willem},
  volume    = {206},
  series    = {Proceedings of Machine Learning Research},
  month     = {25--27 Apr},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v206/naganuma23a/naganuma23a.pdf},
  url       = {https://proceedings.mlr.press/v206/naganuma23a.html},
  abstract  = {One of the training strategies of generative models is to minimize the Jensen–Shannon divergence between the model distribution and the data distribution. Since the data distribution is unknown, generative adversarial networks (GANs) formulate this problem as a game between two models, a generator and a discriminator. The training can be formulated in the context of game theory as the search for a local Nash equilibrium (LNE). This optimization problem is far more challenging than the single-objective setting, and it does not seem feasible to derive guarantees of stability or optimality for existing methods. Here, we use the conjugate gradient method to reliably and efficiently solve the LNE problem in GANs. We give a proof and convergence analysis under mild assumptions showing that the proposed method converges to an LNE with three different learning rate update rules, including a constant learning rate. Finally, we demonstrate that the proposed method outperforms stochastic gradient descent (SGD) and momentum SGD in terms of best Fréchet inception distance (FID) score and outperforms Adam on average.}
}
Endnote
%0 Conference Paper
%T Conjugate Gradient Method for Generative Adversarial Networks
%A Hiroki Naganuma
%A Hideaki Iiduka
%B Proceedings of The 26th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2023
%E Francisco Ruiz
%E Jennifer Dy
%E Jan-Willem van de Meent
%F pmlr-v206-naganuma23a
%I PMLR
%P 4381--4408
%U https://proceedings.mlr.press/v206/naganuma23a.html
%V 206
%X One of the training strategies of generative models is to minimize the Jensen–Shannon divergence between the model distribution and the data distribution. Since the data distribution is unknown, generative adversarial networks (GANs) formulate this problem as a game between two models, a generator and a discriminator. The training can be formulated in the context of game theory as the search for a local Nash equilibrium (LNE). This optimization problem is far more challenging than the single-objective setting, and it does not seem feasible to derive guarantees of stability or optimality for existing methods. Here, we use the conjugate gradient method to reliably and efficiently solve the LNE problem in GANs. We give a proof and convergence analysis under mild assumptions showing that the proposed method converges to an LNE with three different learning rate update rules, including a constant learning rate. Finally, we demonstrate that the proposed method outperforms stochastic gradient descent (SGD) and momentum SGD in terms of best Fréchet inception distance (FID) score and outperforms Adam on average.
APA
Naganuma, H. & Iiduka, H. (2023). Conjugate Gradient Method for Generative Adversarial Networks. Proceedings of The 26th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 206:4381-4408. Available from https://proceedings.mlr.press/v206/naganuma23a.html.
