[edit]
Fast 1-Wasserstein distance approximations using greedy strategies
Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, PMLR 238:325-333, 2024.
Abstract
Among numerous linear approximation methods proposed for optimal transport (OT), tree-based methods appear to be fairly reliable, notably for language processing applications. Inspired by these tree methods, we introduce several greedy heuristics aiming to compute even faster approximations of OT. We first explicitly establish the equivalence between greedy matching and optimal transport for tree metrics, and then we show that tree greedy matching can be reduced to greedy matching on a one-dimensional line. Next, we propose two new greedy-based algorithms in one dimension: the $k$-Greedy and 1D-ICT algorithms. This novel approach provides Wasserstein approximations with accuracy similar to the original tree methods on text datasets while being faster in practice. Finally, these algorithms are applicable beyond tree approximations: using sliced projections of the original data still provides fairly good accuracy while eliminating the need for embedding the data in a fixed and rigid tree structure. This property makes these approaches even more versatile than the original tree OT methods.