Unbalanced minibatch Optimal Transport; applications to Domain Adaptation

Kilian Fatras, Thibault Sejourne, Rémi Flamary, Nicolas Courty
Proceedings of the 38th International Conference on Machine Learning, PMLR 139:3186-3197, 2021.

Abstract

Optimal transport distances have found many applications in machine learning for their capacity to compare non-parametric probability distributions. Yet their algorithmic complexity generally prevents their direct use on large-scale datasets. Among the possible strategies to alleviate this issue, practitioners can rely on computing estimates of these distances over subsets of data, i.e. minibatches. While computationally appealing, we highlight in this paper some limits of this strategy, arguing it can lead to undesirable smoothing effects. As an alternative, we suggest that the same minibatch strategy coupled with unbalanced optimal transport can yield more robust behaviors. We discuss the associated theoretical properties, such as unbiased estimators, existence of gradients and concentration bounds. Our experimental study shows that in challenging problems associated with domain adaptation, the use of unbalanced optimal transport leads to significantly better results, competing with or surpassing recent baselines.
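The minibatch strategy described above can be sketched in a few lines of numpy: draw random subsets from the source and target samples, compute an entropy-regularized unbalanced OT cost on each pair of minibatches (here via Sinkhorn scaling with a KL marginal relaxation), and average the results. This is a minimal illustrative sketch, not the authors' implementation; the function names, the regularization values, and the simple averaging estimator are assumptions for the example.

```python
import numpy as np

def sinkhorn_unbalanced(a, b, M, reg=0.5, reg_m=10.0, n_iter=200):
    """Entropy-regularized unbalanced OT cost via Sinkhorn scaling.

    a, b : histograms (nonnegative weights) on the two minibatches.
    M    : pairwise cost matrix between minibatch samples.
    reg  : entropic regularization; reg_m : KL marginal relaxation.
    Returns the transport part of the cost, sum(plan * M).
    """
    K = np.exp(-M / reg)
    u, v = np.ones_like(a), np.ones_like(b)
    fi = reg_m / (reg_m + reg)  # damping exponent from the KL relaxation
    for _ in range(n_iter):
        u = (a / (K @ v)) ** fi
        v = (b / (K.T @ u)) ** fi
    plan = u[:, None] * K * v[None, :]
    return float(np.sum(plan * M))

def minibatch_uot(xs, xt, m=32, k=10, rng=None, **kw):
    """Average the unbalanced OT cost over k random minibatch pairs."""
    rng = np.random.default_rng(rng)
    vals = []
    for _ in range(k):
        i = rng.choice(len(xs), size=m, replace=False)
        j = rng.choice(len(xt), size=m, replace=False)
        # squared Euclidean ground cost between the two minibatches
        M = ((xs[i][:, None] - xt[j][None, :]) ** 2).sum(-1)
        a = np.full(m, 1.0 / m)
        b = np.full(m, 1.0 / m)
        vals.append(sinkhorn_unbalanced(a, b, M, **kw))
    return float(np.mean(vals))
```

In a domain-adaptation setting, `xs` and `xt` would be source and target features; the point of the unbalanced formulation is that the KL relaxation lets mass be created or destroyed, so an unlucky minibatch is not forced to match all of its samples.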

Cite this Paper


BibTeX
@InProceedings{pmlr-v139-fatras21a,
  title     = {Unbalanced minibatch Optimal Transport; applications to Domain Adaptation},
  author    = {Fatras, Kilian and Sejourne, Thibault and Flamary, R{\'e}mi and Courty, Nicolas},
  booktitle = {Proceedings of the 38th International Conference on Machine Learning},
  pages     = {3186--3197},
  year      = {2021},
  editor    = {Meila, Marina and Zhang, Tong},
  volume    = {139},
  series    = {Proceedings of Machine Learning Research},
  month     = {18--24 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v139/fatras21a/fatras21a.pdf},
  url       = {https://proceedings.mlr.press/v139/fatras21a.html},
  abstract  = {Optimal transport distances have found many applications in machine learning for their capacity to compare non-parametric probability distributions. Yet their algorithmic complexity generally prevents their direct use on large scale datasets. Among the possible strategies to alleviate this issue, practitioners can rely on computing estimates of these distances over subsets of data, i.e. minibatches. While computationally appealing, we highlight in this paper some limits of this strategy, arguing it can lead to undesirable smoothing effects. As an alternative, we suggest that the same minibatch strategy coupled with unbalanced optimal transport can yield more robust behaviors. We discuss the associated theoretical properties, such as unbiased estimators, existence of gradients and concentration bounds. Our experimental study shows that in challenging problems associated to domain adaptation, the use of unbalanced optimal transport leads to significantly better results, competing with or surpassing recent baselines.}
}
Endnote
%0 Conference Paper
%T Unbalanced minibatch Optimal Transport; applications to Domain Adaptation
%A Kilian Fatras
%A Thibault Sejourne
%A Rémi Flamary
%A Nicolas Courty
%B Proceedings of the 38th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2021
%E Marina Meila
%E Tong Zhang
%F pmlr-v139-fatras21a
%I PMLR
%P 3186--3197
%U https://proceedings.mlr.press/v139/fatras21a.html
%V 139
%X Optimal transport distances have found many applications in machine learning for their capacity to compare non-parametric probability distributions. Yet their algorithmic complexity generally prevents their direct use on large scale datasets. Among the possible strategies to alleviate this issue, practitioners can rely on computing estimates of these distances over subsets of data, i.e. minibatches. While computationally appealing, we highlight in this paper some limits of this strategy, arguing it can lead to undesirable smoothing effects. As an alternative, we suggest that the same minibatch strategy coupled with unbalanced optimal transport can yield more robust behaviors. We discuss the associated theoretical properties, such as unbiased estimators, existence of gradients and concentration bounds. Our experimental study shows that in challenging problems associated to domain adaptation, the use of unbalanced optimal transport leads to significantly better results, competing with or surpassing recent baselines.
APA
Fatras, K., Sejourne, T., Flamary, R. & Courty, N. (2021). Unbalanced minibatch Optimal Transport; applications to Domain Adaptation. Proceedings of the 38th International Conference on Machine Learning, in Proceedings of Machine Learning Research 139:3186-3197. Available from https://proceedings.mlr.press/v139/fatras21a.html.