Learning with minibatch Wasserstein : asymptotic and gradient properties

Kilian Fatras, Younes Zine, Rémi Flamary, Remi Gribonval, Nicolas Courty
Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics, PMLR 108:2131-2141, 2020.

Abstract

Optimal transport distances are powerful tools to compare probability distributions and have found many applications in machine learning. Yet their algorithmic complexity prevents their direct use on large-scale datasets. To overcome this challenge, practitioners compute these distances on minibatches, i.e., they average the outcome of several smaller optimal transport problems. We propose in this paper an analysis of this practice, whose effects are not well understood so far. We notably argue that it is equivalent to an implicit regularization of the original problem, with appealing properties such as unbiased estimators, gradients and a concentration bound around the expectation, but also with defects such as the loss of the distance property. Along with this theoretical analysis, we also conduct empirical experiments on gradient flows, GANs and color transfer that highlight the practical interest of this strategy.
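The minibatch averaging described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes one-dimensional samples with uniform weights (where the exact 1-Wasserstein distance between equal-size samples reduces to matching sorted values), batches drawn without replacement, and hypothetical function names chosen for this example.

```python
import random


def wasserstein_1d(xs, ys):
    """Exact 1-Wasserstein distance between two equal-size 1D samples
    with uniform weights: match sorted values and average the gaps."""
    assert len(xs) == len(ys)
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)


def minibatch_wasserstein_1d(xs, ys, batch_size, num_batches, rng):
    """Average the distance over several small random batches
    (sampling without replacement is an assumption of this sketch)."""
    total = 0.0
    for _ in range(num_batches):
        batch_x = rng.sample(xs, batch_size)
        batch_y = rng.sample(ys, batch_size)
        total += wasserstein_1d(batch_x, batch_y)
    return total / num_batches


rng = random.Random(0)
xs = [rng.gauss(0.0, 1.0) for _ in range(200)]

# Comparing a sample with itself: the full distance is exactly zero,
# but the minibatch average is not, since the two batches differ.
full = wasserstein_1d(xs, xs)
mb = minibatch_wasserstein_1d(xs, xs, batch_size=10, num_batches=50, rng=rng)
print(full, mb)
```

Here `full` is zero while `mb` is strictly positive, which is one way to see the loss of the distance property that the abstract mentions: the minibatch quantity does not vanish when both inputs are the same distribution.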

Cite this Paper


BibTeX
@InProceedings{pmlr-v108-fatras20a,
  title = {Learning with minibatch Wasserstein : asymptotic and gradient properties},
  author = {Fatras, Kilian and Zine, Younes and Flamary, R\'emi and Gribonval, Remi and Courty, Nicolas},
  booktitle = {Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics},
  pages = {2131--2141},
  year = {2020},
  editor = {Chiappa, Silvia and Calandra, Roberto},
  volume = {108},
  series = {Proceedings of Machine Learning Research},
  month = {26--28 Aug},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v108/fatras20a/fatras20a.pdf},
  url = {https://proceedings.mlr.press/v108/fatras20a.html},
  abstract = {Optimal transport distances are powerful tools to compare probability distributions and have found many applications in machine learning. Yet their algorithmic complexity prevents their direct use on large scale datasets. To overcome this challenge, practitioners compute these distances on minibatches i.e., they average the outcome of several smaller optimal transport problems. We propose in this paper an analysis of this practice, which effects are not well understood so far. We notably argue that it is equivalent to an implicit regularization of the original problem, with appealing properties such as unbiased estimators, gradients and a concentration bound around the expectation, but also with defects such as loss of distance property. Along with this theoretical analysis, we also conduct empirical experiments on gradient flows, GANs or color transfer that highlight the practical interest of this strategy.}
}
Endnote
%0 Conference Paper
%T Learning with minibatch Wasserstein : asymptotic and gradient properties
%A Kilian Fatras
%A Younes Zine
%A Rémi Flamary
%A Remi Gribonval
%A Nicolas Courty
%B Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2020
%E Silvia Chiappa
%E Roberto Calandra
%F pmlr-v108-fatras20a
%I PMLR
%P 2131--2141
%U https://proceedings.mlr.press/v108/fatras20a.html
%V 108
%X Optimal transport distances are powerful tools to compare probability distributions and have found many applications in machine learning. Yet their algorithmic complexity prevents their direct use on large scale datasets. To overcome this challenge, practitioners compute these distances on minibatches i.e., they average the outcome of several smaller optimal transport problems. We propose in this paper an analysis of this practice, which effects are not well understood so far. We notably argue that it is equivalent to an implicit regularization of the original problem, with appealing properties such as unbiased estimators, gradients and a concentration bound around the expectation, but also with defects such as loss of distance property. Along with this theoretical analysis, we also conduct empirical experiments on gradient flows, GANs or color transfer that highlight the practical interest of this strategy.
APA
Fatras, K., Zine, Y., Flamary, R., Gribonval, R. & Courty, N. (2020). Learning with minibatch Wasserstein : asymptotic and gradient properties. Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 108:2131-2141. Available from https://proceedings.mlr.press/v108/fatras20a.html.
