Wasserstein Wormhole: Scalable Optimal Transport Distance with Transformer

Doron Haviv, Russell Zhang Kunes, Thomas Dougherty, Cassandra Burdziak, Tal Nawy, Anna Gilbert, Dana Pe’Er
Proceedings of the 41st International Conference on Machine Learning, PMLR 235:17697-17718, 2024.

Abstract

Optimal transport (OT) and the related Wasserstein metric ($W$) are powerful and ubiquitous tools for comparing distributions. However, computing pairwise Wasserstein distances rapidly becomes intractable as cohort size grows. An attractive alternative would be to find an embedding space in which pairwise Euclidean distances map to OT distances, akin to standard multidimensional scaling (MDS). We present Wasserstein Wormhole, a transformer-based autoencoder that embeds empirical distributions into a latent space wherein Euclidean distances approximate OT distances. Extending MDS theory, we show that our objective function implies a bound on the error incurred when embedding non-Euclidean distances. Empirically, distances between Wormhole embeddings closely match Wasserstein distances, enabling linear time computation of OT distances. Along with an encoder that maps distributions to embeddings, Wasserstein Wormhole includes a decoder that maps embeddings back to distributions, allowing for operations in the embedding space to generalize to OT spaces, such as Wasserstein barycenter estimation and OT interpolation. By lending scalability and interpretability to OT approaches, Wasserstein Wormhole unlocks new avenues for data analysis in the fields of computational geometry and single-cell biology.

Cite this Paper


BibTeX
@InProceedings{pmlr-v235-haviv24a, title = {{W}asserstein Wormhole: Scalable Optimal Transport Distance with Transformer}, author = {Haviv, Doron and Kunes, Russell Zhang and Dougherty, Thomas and Burdziak, Cassandra and Nawy, Tal and Gilbert, Anna and Pe'Er, Dana}, booktitle = {Proceedings of the 41st International Conference on Machine Learning}, pages = {17697--17718}, year = {2024}, editor = {Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix}, volume = {235}, series = {Proceedings of Machine Learning Research}, month = {21--27 Jul}, publisher = {PMLR}, pdf = {https://raw.githubusercontent.com/mlresearch/v235/main/assets/haviv24a/haviv24a.pdf}, url = {https://proceedings.mlr.press/v235/haviv24a.html}, abstract = {Optimal transport (OT) and the related Wasserstein metric ($W$) are powerful and ubiquitous tools for comparing distributions. However, computing pairwise Wasserstein distances rapidly becomes intractable as cohort size grows. An attractive alternative would be to find an embedding space in which pairwise Euclidean distances map to OT distances, akin to standard multidimensional scaling (MDS). We present Wasserstein Wormhole, a transformer-based autoencoder that embeds empirical distributions into a latent space wherein Euclidean distances approximate OT distances. Extending MDS theory, we show that our objective function implies a bound on the error incurred when embedding non-Euclidean distances. Empirically, distances between Wormhole embeddings closely match Wasserstein distances, enabling linear time computation of OT distances. Along with an encoder that maps distributions to embeddings, Wasserstein Wormhole includes a decoder that maps embeddings back to distributions, allowing for operations in the embedding space to generalize to OT spaces, such as Wasserstein barycenter estimation and OT interpolation. By lending scalability and interpretability to OT approaches, Wasserstein Wormhole unlocks new avenues for data analysis in the fields of computational geometry and single-cell biology.} }
Endnote
%0 Conference Paper %T Wasserstein Wormhole: Scalable Optimal Transport Distance with Transformer %A Doron Haviv %A Russell Zhang Kunes %A Thomas Dougherty %A Cassandra Burdziak %A Tal Nawy %A Anna Gilbert %A Dana Pe’Er %B Proceedings of the 41st International Conference on Machine Learning %C Proceedings of Machine Learning Research %D 2024 %E Ruslan Salakhutdinov %E Zico Kolter %E Katherine Heller %E Adrian Weller %E Nuria Oliver %E Jonathan Scarlett %E Felix Berkenkamp %F pmlr-v235-haviv24a %I PMLR %P 17697--17718 %U https://proceedings.mlr.press/v235/haviv24a.html %V 235 %X Optimal transport (OT) and the related Wasserstein metric ($W$) are powerful and ubiquitous tools for comparing distributions. However, computing pairwise Wasserstein distances rapidly becomes intractable as cohort size grows. An attractive alternative would be to find an embedding space in which pairwise Euclidean distances map to OT distances, akin to standard multidimensional scaling (MDS). We present Wasserstein Wormhole, a transformer-based autoencoder that embeds empirical distributions into a latent space wherein Euclidean distances approximate OT distances. Extending MDS theory, we show that our objective function implies a bound on the error incurred when embedding non-Euclidean distances. Empirically, distances between Wormhole embeddings closely match Wasserstein distances, enabling linear time computation of OT distances. Along with an encoder that maps distributions to embeddings, Wasserstein Wormhole includes a decoder that maps embeddings back to distributions, allowing for operations in the embedding space to generalize to OT spaces, such as Wasserstein barycenter estimation and OT interpolation. By lending scalability and interpretability to OT approaches, Wasserstein Wormhole unlocks new avenues for data analysis in the fields of computational geometry and single-cell biology.
APA
Haviv, D., Kunes, R.Z., Dougherty, T., Burdziak, C., Nawy, T., Gilbert, A. & Pe’Er, D.. (2024). Wasserstein Wormhole: Scalable Optimal Transport Distance with Transformer. Proceedings of the 41st International Conference on Machine Learning, in Proceedings of Machine Learning Research 235:17697-17718 Available from https://proceedings.mlr.press/v235/haviv24a.html.

Related Material