A Specialized Semismooth Newton Method for Kernel-Based Optimal Transport

Tianyi Lin, Marco Cuturi, Michael Jordan
Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, PMLR 238:145-153, 2024.

Abstract

Kernel-based optimal transport (OT) estimators offer an alternative, functional estimation procedure to address OT problems from samples. Recent works suggest that these estimators are more statistically efficient than plug-in (linear programming-based) OT estimators when comparing probability measures in high-dimensions (Vacher et al., 2021). Unfortunately,that statistical benefit comes at a very steep computational price: because their computation relies on the short-step interior-point method (SSIPM), which comes with a large iteration count in practice, these estimators quickly become intractable w.r.t. sample size $n$. To scale these estimators to larger $n$, we propose a nonsmooth fixedpoint model for the kernel-based OT problem, and show that it can be efficiently solved via a specialized semismooth Newton (SSN) method: We show, exploring the problem’s structure, that the per-iteration cost of performing one SSN step can be significantly reduced in practice. We prove that our SSN method achieves a global convergence rate of $O(1/\sqrt{k})$, and a local quadratic convergence rate under standard regularity conditions. We show substantial speedups over SSIPM on both synthetic and real datasets.

Cite this Paper


BibTeX
@InProceedings{pmlr-v238-lin24a, title = {A Specialized Semismooth {N}ewton Method for Kernel-Based Optimal Transport}, author = {Lin, Tianyi and Cuturi, Marco and Jordan, Michael}, booktitle = {Proceedings of The 27th International Conference on Artificial Intelligence and Statistics}, pages = {145--153}, year = {2024}, editor = {Dasgupta, Sanjoy and Mandt, Stephan and Li, Yingzhen}, volume = {238}, series = {Proceedings of Machine Learning Research}, month = {02--04 May}, publisher = {PMLR}, pdf = {https://proceedings.mlr.press/v238/lin24a/lin24a.pdf}, url = {https://proceedings.mlr.press/v238/lin24a.html}, abstract = {Kernel-based optimal transport (OT) estimators offer an alternative, functional estimation procedure to address OT problems from samples. Recent works suggest that these estimators are more statistically efficient than plug-in (linear programming-based) OT estimators when comparing probability measures in high-dimensions (Vacher et al., 2021). Unfortunately,that statistical benefit comes at a very steep computational price: because their computation relies on the short-step interior-point method (SSIPM), which comes with a large iteration count in practice, these estimators quickly become intractable w.r.t. sample size $n$. To scale these estimators to larger $n$, we propose a nonsmooth fixedpoint model for the kernel-based OT problem, and show that it can be efficiently solved via a specialized semismooth Newton (SSN) method: We show, exploring the problem’s structure, that the per-iteration cost of performing one SSN step can be significantly reduced in practice. We prove that our SSN method achieves a global convergence rate of $O(1/\sqrt{k})$, and a local quadratic convergence rate under standard regularity conditions. We show substantial speedups over SSIPM on both synthetic and real datasets.} }
Endnote
%0 Conference Paper %T A Specialized Semismooth Newton Method for Kernel-Based Optimal Transport %A Tianyi Lin %A Marco Cuturi %A Michael Jordan %B Proceedings of The 27th International Conference on Artificial Intelligence and Statistics %C Proceedings of Machine Learning Research %D 2024 %E Sanjoy Dasgupta %E Stephan Mandt %E Yingzhen Li %F pmlr-v238-lin24a %I PMLR %P 145--153 %U https://proceedings.mlr.press/v238/lin24a.html %V 238 %X Kernel-based optimal transport (OT) estimators offer an alternative, functional estimation procedure to address OT problems from samples. Recent works suggest that these estimators are more statistically efficient than plug-in (linear programming-based) OT estimators when comparing probability measures in high-dimensions (Vacher et al., 2021). Unfortunately,that statistical benefit comes at a very steep computational price: because their computation relies on the short-step interior-point method (SSIPM), which comes with a large iteration count in practice, these estimators quickly become intractable w.r.t. sample size $n$. To scale these estimators to larger $n$, we propose a nonsmooth fixedpoint model for the kernel-based OT problem, and show that it can be efficiently solved via a specialized semismooth Newton (SSN) method: We show, exploring the problem’s structure, that the per-iteration cost of performing one SSN step can be significantly reduced in practice. We prove that our SSN method achieves a global convergence rate of $O(1/\sqrt{k})$, and a local quadratic convergence rate under standard regularity conditions. We show substantial speedups over SSIPM on both synthetic and real datasets.
APA
Lin, T., Cuturi, M. & Jordan, M.. (2024). A Specialized Semismooth Newton Method for Kernel-Based Optimal Transport. Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 238:145-153 Available from https://proceedings.mlr.press/v238/lin24a.html.

Related Material