First Order Generative Adversarial Networks

Calvin Seward, Thomas Unterthiner, Urs Bergmann, Nikolay Jetchev, Sepp Hochreiter
Proceedings of the 35th International Conference on Machine Learning, PMLR 80:4567-4576, 2018.

Abstract

GANs excel at learning high dimensional distributions, but they can update generator parameters in directions that do not correspond to the steepest descent direction of the objective. Prominent examples of problematic update directions include those used in both Goodfellow’s original GAN and the WGAN-GP. To formally describe an optimal update direction, we introduce a theoretical framework which allows the derivation of requirements on both the divergence and corresponding method for determining an update direction, with these requirements guaranteeing unbiased mini-batch updates in the direction of steepest descent. We propose a novel divergence which approximates the Wasserstein distance while regularizing the critic’s first order information. Together with an accompanying update direction, this divergence fulfills the requirements for unbiased steepest descent updates. We verify our method, the First Order GAN, with image generation on CelebA, LSUN and CIFAR-10 and set a new state of the art on the One Billion Word language generation task.
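The abstract's "regularizing the critic's first order information" is in the same family as the gradient penalty of WGAN-GP, which the paper names as a point of comparison. Below is a minimal PyTorch sketch of that family of critic objectives; the interpolation scheme, the unit-norm penalty form, and the coefficient lambda_gp are assumptions for illustration (the WGAN-GP variant), not the paper's exact divergence.

    # Illustrative sketch (WGAN-GP-style, assumed form; not the paper's exact objective):
    # a Wasserstein critic loss that regularizes the critic's first order
    # information, i.e. the norm of its input gradient, on interpolated samples.
    import torch

    def critic_loss(critic, real, fake, lambda_gp=10.0):
        # Wasserstein estimate: the critic should score real samples high, fake low.
        w_term = critic(fake).mean() - critic(real).mean()

        # Random interpolates between real and fake samples.
        eps = torch.rand(real.size(0), *([1] * (real.dim() - 1)), device=real.device)
        interp = (eps * real + (1 - eps) * fake).requires_grad_(True)

        # First order information of the critic at the interpolates.
        grad = torch.autograd.grad(
            outputs=critic(interp).sum(), inputs=interp, create_graph=True
        )[0]
        grad_norm = grad.flatten(1).norm(2, dim=1)

        # Penalize deviation of the gradient norm from 1 (the WGAN-GP form).
        penalty = ((grad_norm - 1.0) ** 2).mean()
        return w_term + lambda_gp * penalty

Minimizing this loss over the critic, then updating the generator along the resulting gradient, is the baseline setup whose update direction the paper analyzes and amends.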

Cite this Paper

BibTeX
@InProceedings{pmlr-v80-seward18a,
  title     = {First Order Generative Adversarial Networks},
  author    = {Seward, Calvin and Unterthiner, Thomas and Bergmann, Urs and Jetchev, Nikolay and Hochreiter, Sepp},
  booktitle = {Proceedings of the 35th International Conference on Machine Learning},
  pages     = {4567--4576},
  year      = {2018},
  editor    = {Dy, Jennifer and Krause, Andreas},
  volume    = {80},
  series    = {Proceedings of Machine Learning Research},
  month     = {10--15 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v80/seward18a/seward18a.pdf},
  url       = {https://proceedings.mlr.press/v80/seward18a.html},
  abstract  = {GANs excel at learning high dimensional distributions, but they can update generator parameters in directions that do not correspond to the steepest descent direction of the objective. Prominent examples of problematic update directions include those used in both Goodfellow’s original GAN and the WGAN-GP. To formally describe an optimal update direction, we introduce a theoretical framework which allows the derivation of requirements on both the divergence and corresponding method for determining an update direction, with these requirements guaranteeing unbiased mini-batch updates in the direction of steepest descent. We propose a novel divergence which approximates the Wasserstein distance while regularizing the critic’s first order information. Together with an accompanying update direction, this divergence fulfills the requirements for unbiased steepest descent updates. We verify our method, the First Order GAN, with image generation on CelebA, LSUN and CIFAR-10 and set a new state of the art on the One Billion Word language generation task.}
}
Endnote
%0 Conference Paper
%T First Order Generative Adversarial Networks
%A Calvin Seward
%A Thomas Unterthiner
%A Urs Bergmann
%A Nikolay Jetchev
%A Sepp Hochreiter
%B Proceedings of the 35th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2018
%E Jennifer Dy
%E Andreas Krause
%F pmlr-v80-seward18a
%I PMLR
%P 4567--4576
%U https://proceedings.mlr.press/v80/seward18a.html
%V 80
%X GANs excel at learning high dimensional distributions, but they can update generator parameters in directions that do not correspond to the steepest descent direction of the objective. Prominent examples of problematic update directions include those used in both Goodfellow’s original GAN and the WGAN-GP. To formally describe an optimal update direction, we introduce a theoretical framework which allows the derivation of requirements on both the divergence and corresponding method for determining an update direction, with these requirements guaranteeing unbiased mini-batch updates in the direction of steepest descent. We propose a novel divergence which approximates the Wasserstein distance while regularizing the critic’s first order information. Together with an accompanying update direction, this divergence fulfills the requirements for unbiased steepest descent updates. We verify our method, the First Order GAN, with image generation on CelebA, LSUN and CIFAR-10 and set a new state of the art on the One Billion Word language generation task.
APA
Seward, C., Unterthiner, T., Bergmann, U., Jetchev, N. & Hochreiter, S. (2018). First Order Generative Adversarial Networks. Proceedings of the 35th International Conference on Machine Learning, in Proceedings of Machine Learning Research 80:4567-4576. Available from https://proceedings.mlr.press/v80/seward18a.html.