Mode-Seeking Divergences: Theory and Applications to GANs

Cheuk Ting Li; Farzan Farnia

Mode-Seeking Divergences: Theory and Applications to GANs

Cheuk Ting Li, Farzan Farnia

Proceedings of The 26th International Conference on Artificial Intelligence and Statistics, PMLR 206:8321-8350, 2023.

Abstract

Generative adversarial networks (GANs) represent a game between two neural network machines designed to learn the distribution of data. It is commonly observed that different GAN formulations and divergence/distance measures used could lead to considerably different performance results, especially when the data distribution is multi-modal. In this work, we give a theoretical characterization of the mode-seeking behavior of general f-divergences and Wasserstein distances, and prove a performance guarantee for the setting where the underlying model is a mixture of multiple symmetric quasiconcave distributions. This can help us understand the trade-off between the quality and diversity of the trained GANs’ output samples. Our theoretical results show the mode-seeking nature of the Jensen-Shannon (JS) divergence over standard KL-divergence and Wasserstein distance measures. We subsequently demonstrate that a hybrid of JS-divergence and Wasserstein distance measures minimized by Lipschitz GANs mimics the mode-seeking behavior of the JS-divergence. We present numerical results showing the mode-seeking nature of the JS-divergence and its hybrid with the Wasserstein distance while highlighting the mode-covering properties of KL-divergence and Wasserstein distance measures. Our numerical experiments indicate the different behavior of several standard GAN formulations in application to benchmark Gaussian mixture and image datasets.

Cite this Paper

BibTeX

@InProceedings{pmlr-v206-ting-li23a,
  title = 	 {Mode-Seeking Divergences: Theory and Applications to GANs},
  author =       {Li, Cheuk Ting and Farnia, Farzan},
  booktitle = 	 {Proceedings of The 26th International Conference on Artificial Intelligence and Statistics},
  pages = 	 {8321--8350},
  year = 	 {2023},
  editor = 	 {Ruiz, Francisco and Dy, Jennifer and van de Meent, Jan-Willem},
  volume = 	 {206},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {25--27 Apr},
  publisher =    {PMLR},
  pdf = 	 {https://proceedings.mlr.press/v206/ting-li23a/ting-li23a.pdf},
  url = 	 {https://proceedings.mlr.press/v206/ting-li23a.html},
  abstract = 	 {Generative adversarial networks (GANs) represent a game between two neural network machines designed to learn the distribution of data. It is commonly observed that different GAN formulations and divergence/distance measures used could lead to considerably different performance results, especially when the data distribution is multi-modal. In this work, we give a theoretical characterization of the mode-seeking behavior of general f-divergences and Wasserstein distances, and prove a performance guarantee for the setting where the underlying model is a mixture of multiple symmetric quasiconcave distributions. This can help us understand the trade-off between the quality and diversity of the trained GANs’ output samples. Our theoretical results show the mode-seeking nature of the Jensen-Shannon (JS) divergence over standard KL-divergence and Wasserstein distance measures. We subsequently demonstrate that a hybrid of JS-divergence and Wasserstein distance measures minimized by Lipschitz GANs mimics the mode-seeking behavior of the JS-divergence. We present numerical results showing the mode-seeking nature of the JS-divergence and its hybrid with the Wasserstein distance while highlighting the mode-covering properties of KL-divergence and Wasserstein distance measures. Our numerical experiments indicate the different behavior of several standard GAN formulations in application to benchmark Gaussian mixture and image datasets.}
}

Endnote

%0 Conference Paper
%T Mode-Seeking Divergences: Theory and Applications to GANs
%A Cheuk Ting Li
%A Farzan Farnia
%B Proceedings of The 26th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2023
%E Francisco Ruiz
%E Jennifer Dy
%E Jan-Willem van de Meent	
%F pmlr-v206-ting-li23a
%I PMLR
%P 8321--8350
%U https://proceedings.mlr.press/v206/ting-li23a.html
%V 206
%X Generative adversarial networks (GANs) represent a game between two neural network machines designed to learn the distribution of data. It is commonly observed that different GAN formulations and divergence/distance measures used could lead to considerably different performance results, especially when the data distribution is multi-modal. In this work, we give a theoretical characterization of the mode-seeking behavior of general f-divergences and Wasserstein distances, and prove a performance guarantee for the setting where the underlying model is a mixture of multiple symmetric quasiconcave distributions. This can help us understand the trade-off between the quality and diversity of the trained GANs’ output samples. Our theoretical results show the mode-seeking nature of the Jensen-Shannon (JS) divergence over standard KL-divergence and Wasserstein distance measures. We subsequently demonstrate that a hybrid of JS-divergence and Wasserstein distance measures minimized by Lipschitz GANs mimics the mode-seeking behavior of the JS-divergence. We present numerical results showing the mode-seeking nature of the JS-divergence and its hybrid with the Wasserstein distance while highlighting the mode-covering properties of KL-divergence and Wasserstein distance measures. Our numerical experiments indicate the different behavior of several standard GAN formulations in application to benchmark Gaussian mixture and image datasets.

APA

Li, C.T. & Farnia, F.. (2023). Mode-Seeking Divergences: Theory and Applications to GANs. Proceedings of The 26th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 206:8321-8350 Available from https://proceedings.mlr.press/v206/ting-li23a.html.

Related Material

Download PDF