InfoOT: Information Maximizing Optimal Transport

Ching-Yao Chuang, Stefanie Jegelka, David Alvarez-Melis
Proceedings of the 40th International Conference on Machine Learning, PMLR 202:6228-6242, 2023.

Abstract

Optimal transport aligns samples across distributions by minimizing the transportation cost between them, e.g., the geometric distances. Yet, it ignores coherence structure in the data such as clusters, does not handle outliers well, and cannot integrate new data points. To address these drawbacks, we propose InfoOT, an information-theoretic extension of optimal transport that maximizes the mutual information between domains while minimizing geometric distances. The resulting objective can still be formulated as a (generalized) optimal transport problem, and can be efficiently solved by projected gradient descent. This formulation yields a new projection method that is robust to outliers and generalizes to unseen samples. Empirically, InfoOT improves the quality of alignments across benchmarks in domain adaptation, cross-domain retrieval, and single-cell alignment.

Cite this Paper


BibTeX
@InProceedings{pmlr-v202-chuang23a,
  title = {{I}nfo{OT}: Information Maximizing Optimal Transport},
  author = {Chuang, Ching-Yao and Jegelka, Stefanie and Alvarez-Melis, David},
  booktitle = {Proceedings of the 40th International Conference on Machine Learning},
  pages = {6228--6242},
  year = {2023},
  editor = {Krause, Andreas and Brunskill, Emma and Cho, Kyunghyun and Engelhardt, Barbara and Sabato, Sivan and Scarlett, Jonathan},
  volume = {202},
  series = {Proceedings of Machine Learning Research},
  month = {23--29 Jul},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v202/chuang23a/chuang23a.pdf},
  url = {https://proceedings.mlr.press/v202/chuang23a.html},
  abstract = {Optimal transport aligns samples across distributions by minimizing the transportation cost between them, e.g., the geometric distances. Yet, it ignores coherence structure in the data such as clusters, does not handle outliers well, and cannot integrate new data points. To address these drawbacks, we propose InfoOT, an information-theoretic extension of optimal transport that maximizes the mutual information between domains while minimizing geometric distances. The resulting objective can still be formulated as a (generalized) optimal transport problem, and can be efficiently solved by projected gradient descent. This formulation yields a new projection method that is robust to outliers and generalizes to unseen samples. Empirically, InfoOT improves the quality of alignments across benchmarks in domain adaptation, cross-domain retrieval, and single-cell alignment.}
}
Endnote
%0 Conference Paper
%T InfoOT: Information Maximizing Optimal Transport
%A Ching-Yao Chuang
%A Stefanie Jegelka
%A David Alvarez-Melis
%B Proceedings of the 40th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2023
%E Andreas Krause
%E Emma Brunskill
%E Kyunghyun Cho
%E Barbara Engelhardt
%E Sivan Sabato
%E Jonathan Scarlett
%F pmlr-v202-chuang23a
%I PMLR
%P 6228--6242
%U https://proceedings.mlr.press/v202/chuang23a.html
%V 202
%X Optimal transport aligns samples across distributions by minimizing the transportation cost between them, e.g., the geometric distances. Yet, it ignores coherence structure in the data such as clusters, does not handle outliers well, and cannot integrate new data points. To address these drawbacks, we propose InfoOT, an information-theoretic extension of optimal transport that maximizes the mutual information between domains while minimizing geometric distances. The resulting objective can still be formulated as a (generalized) optimal transport problem, and can be efficiently solved by projected gradient descent. This formulation yields a new projection method that is robust to outliers and generalizes to unseen samples. Empirically, InfoOT improves the quality of alignments across benchmarks in domain adaptation, cross-domain retrieval, and single-cell alignment.
APA
Chuang, C., Jegelka, S. & Alvarez-Melis, D. (2023). InfoOT: Information Maximizing Optimal Transport. Proceedings of the 40th International Conference on Machine Learning, in Proceedings of Machine Learning Research 202:6228-6242. Available from https://proceedings.mlr.press/v202/chuang23a.html.

Related Material