Computational Optimal Transport: Complexity by Accelerated Gradient Descent Is Better Than by Sinkhorn’s Algorithm

Pavel Dvurechensky, Alexander Gasnikov, Alexey Kroshnin
Proceedings of the 35th International Conference on Machine Learning, PMLR 80:1367-1376, 2018.

Abstract

We analyze two algorithms for approximating the general optimal transport (OT) distance between two discrete distributions of size $n$, up to accuracy $\varepsilon$. For the first algorithm, which is based on the celebrated Sinkhorn’s algorithm, we prove the complexity bound $\widetilde{O}\left(\frac{n^2}{\varepsilon^2}\right)$ arithmetic operations ($\widetilde{O}$ hides polylogarithmic factors $(\ln n)^c$, $c>0$). For the second one, which is based on our novel Adaptive Primal-Dual Accelerated Gradient Descent (APDAGD) algorithm, we prove the complexity bound $\widetilde{O}\left(\min\left\{\frac{n^{9/4}}{\varepsilon}, \frac{n^{2}}{\varepsilon^2}\right\}\right)$ arithmetic operations. Both bounds have better dependence on $\varepsilon$ than the state-of-the-art result given by $\widetilde{O}\left(\frac{n^2}{\varepsilon^3}\right)$. Our second algorithm not only has better dependence on $\varepsilon$ in the complexity bound, but also is not specific to entropic regularization and can solve the OT problem with different regularizers.
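For context, the Sinkhorn-type method the first bound refers to alternately rescales the rows and columns of a Gibbs kernel built from the cost matrix. The sketch below is a minimal textbook version of that iteration, not the paper's exact algorithm; the function name, the regularization parameter `gamma`, and the fixed iteration count are illustrative choices.

```python
import numpy as np

def sinkhorn(cost, r, c, gamma=0.1, n_iters=1000):
    """Approximate the entropic-regularized OT plan between marginals r and c.

    cost: (n, n) cost matrix; r, c: probability vectors (the two
    discrete distributions); gamma: entropic regularization strength.
    Returns a transport plan X whose marginals approach r and c.
    """
    K = np.exp(-cost / gamma)          # Gibbs kernel
    u = np.ones_like(r)
    v = np.ones_like(c)
    for _ in range(n_iters):
        u = r / (K @ v)                # rescale to match row marginals
        v = c / (K.T @ u)              # rescale to match column marginals
    return u[:, None] * K * v[None, :]
```

Smaller `gamma` gives a plan closer to the unregularized OT solution but slows convergence, which is where the $\varepsilon$-dependence in the complexity bound comes from.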

Cite this Paper


BibTeX
@InProceedings{pmlr-v80-dvurechensky18a, title = {Computational Optimal Transport: Complexity by Accelerated Gradient Descent Is Better Than by Sinkhorn’s Algorithm}, author = {Dvurechensky, Pavel and Gasnikov, Alexander and Kroshnin, Alexey}, booktitle = {Proceedings of the 35th International Conference on Machine Learning}, pages = {1367--1376}, year = {2018}, editor = {Dy, Jennifer and Krause, Andreas}, volume = {80}, series = {Proceedings of Machine Learning Research}, month = {10--15 Jul}, publisher = {PMLR}, pdf = {http://proceedings.mlr.press/v80/dvurechensky18a/dvurechensky18a.pdf}, url = {https://proceedings.mlr.press/v80/dvurechensky18a.html}, abstract = {We analyze two algorithms for approximating the general optimal transport (OT) distance between two discrete distributions of size $n$, up to accuracy $\varepsilon$. For the first algorithm, which is based on the celebrated Sinkhorn’s algorithm, we prove the complexity bound $\widetilde{O}\left(\frac{n^2}{\varepsilon^2}\right)$ arithmetic operations ($\widetilde{O}$ hides polylogarithmic factors $(\ln n)^c$, $c>0$). For the second one, which is based on our novel Adaptive Primal-Dual Accelerated Gradient Descent (APDAGD) algorithm, we prove the complexity bound $\widetilde{O}\left(\min\left\{\frac{n^{9/4}}{\varepsilon}, \frac{n^{2}}{\varepsilon^2} \right\}\right)$ arithmetic operations. Both bounds have better dependence on $\varepsilon$ than the state-of-the-art result given by $\widetilde{O}\left(\frac{n^2}{\varepsilon^3}\right)$. Our second algorithm not only has better dependence on $\varepsilon$ in the complexity bound, but also is not specific to entropic regularization and can solve the OT problem with different regularizers.} }
Endnote
%0 Conference Paper %T Computational Optimal Transport: Complexity by Accelerated Gradient Descent Is Better Than by Sinkhorn’s Algorithm %A Pavel Dvurechensky %A Alexander Gasnikov %A Alexey Kroshnin %B Proceedings of the 35th International Conference on Machine Learning %C Proceedings of Machine Learning Research %D 2018 %E Jennifer Dy %E Andreas Krause %F pmlr-v80-dvurechensky18a %I PMLR %P 1367--1376 %U https://proceedings.mlr.press/v80/dvurechensky18a.html %V 80 %X We analyze two algorithms for approximating the general optimal transport (OT) distance between two discrete distributions of size $n$, up to accuracy $\varepsilon$. For the first algorithm, which is based on the celebrated Sinkhorn’s algorithm, we prove the complexity bound $\widetilde{O}\left(\frac{n^2}{\varepsilon^2}\right)$ arithmetic operations ($\widetilde{O}$ hides polylogarithmic factors $(\ln n)^c$, $c>0$). For the second one, which is based on our novel Adaptive Primal-Dual Accelerated Gradient Descent (APDAGD) algorithm, we prove the complexity bound $\widetilde{O}\left(\min\left\{\frac{n^{9/4}}{\varepsilon}, \frac{n^{2}}{\varepsilon^2} \right\}\right)$ arithmetic operations. Both bounds have better dependence on $\varepsilon$ than the state-of-the-art result given by $\widetilde{O}\left(\frac{n^2}{\varepsilon^3}\right)$. Our second algorithm not only has better dependence on $\varepsilon$ in the complexity bound, but also is not specific to entropic regularization and can solve the OT problem with different regularizers.
APA
Dvurechensky, P., Gasnikov, A. & Kroshnin, A. (2018). Computational Optimal Transport: Complexity by Accelerated Gradient Descent Is Better Than by Sinkhorn’s Algorithm. Proceedings of the 35th International Conference on Machine Learning, in Proceedings of Machine Learning Research 80:1367-1376. Available from https://proceedings.mlr.press/v80/dvurechensky18a.html.