Learning Distances from Data with Normalizing Flows and Score Matching

Peter Sorrenson, Daniel Behrend-Uriarte, Christoph Schnoerr, Ullrich Koethe
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:56531-56548, 2025.

Abstract

Density-based distances (DBDs) provide a principled approach to metric learning by defining distances in terms of the underlying data distribution. By employing a Riemannian metric that increases in regions of low probability density, shortest paths naturally follow the data manifold. Fermat distances, a specific type of DBD, have attractive properties, but existing estimators based on nearest neighbor graphs suffer from poor convergence due to inaccurate density estimates. Moreover, graph-based methods scale poorly to high dimensions, as the proposed geodesics are often insufficiently smooth. We address these challenges in two key ways. First, we learn densities using normalizing flows. Second, we refine geodesics through relaxation, guided by a learned score model. Additionally, we introduce a dimension-adapted Fermat distance that scales intuitively to high dimensions and improves numerical stability. Our work paves the way for the practical use of density-based distances, especially in high-dimensional spaces.
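
As a rough illustration of the idea that a metric growing in low-density regions bends shortest paths toward the data, the sketch below approximates the density-weighted length of a discretized path, using the form of Fermat-type length common in the literature: the integral of p(x)^(-beta) along the curve. This is not the paper's estimator or its dimension-adapted variant. The exponent `beta`, the standard Gaussian density, and the helper names `resample`, `density_weighted_length`, and `gaussian_density` are all assumptions made for the example.

```python
import numpy as np


def resample(waypoints, n_per_segment=200):
    """Densify a polyline by linear interpolation between its waypoints."""
    waypoints = np.asarray(waypoints, dtype=float)
    pieces = []
    for a, b in zip(waypoints[:-1], waypoints[1:]):
        t = np.linspace(0.0, 1.0, n_per_segment, endpoint=False)[:, None]
        pieces.append(a + t * (b - a))
    pieces.append(waypoints[-1:])  # close the path with the final waypoint
    return np.vstack(pieces)


def density_weighted_length(path, density, beta=1.0):
    """Midpoint-rule approximation of the integral of density(x)**(-beta)
    along a polyline (hypothetical helper, not from the paper)."""
    path = np.asarray(path, dtype=float)
    seg_lengths = np.linalg.norm(path[1:] - path[:-1], axis=1)
    midpoints = 0.5 * (path[1:] + path[:-1])
    weights = np.array([density(m) for m in midpoints]) ** (-beta)
    return float(np.sum(weights * seg_lengths))


def gaussian_density(x):
    """Standard normal density in d dimensions (assumed toy distribution)."""
    x = np.asarray(x, dtype=float)
    d = x.shape[-1]
    return float(np.exp(-0.5 * np.dot(x, x)) / (2.0 * np.pi) ** (d / 2.0))


# Two candidate paths between the same endpoints in the Gaussian's tail.
straight = resample([(-2.0, 2.0), (2.0, 2.0)])
bent = resample([(-2.0, 2.0), (0.0, 1.0), (2.0, 2.0)])  # dips toward the mode

straight_len = density_weighted_length(straight, gaussian_density)
bent_len = density_weighted_length(bent, gaussian_density)
print("straight:", straight_len, "bent:", bent_len)
```

Under these assumptions, the path that dips toward the mode is longer in Euclidean terms but shorter in the density-weighted sense, which is the behavior the abstract describes. Finding the actual geodesic would still require minimizing this length over all paths.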

Cite this Paper


BibTeX
@InProceedings{pmlr-v267-sorrenson25a,
  title = {Learning Distances from Data with Normalizing Flows and Score Matching},
  author = {Sorrenson, Peter and Behrend-Uriarte, Daniel and Schnoerr, Christoph and Koethe, Ullrich},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages = {56531--56548},
  year = {2025},
  editor = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume = {267},
  series = {Proceedings of Machine Learning Research},
  month = {13--19 Jul},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/sorrenson25a/sorrenson25a.pdf},
  url = {https://proceedings.mlr.press/v267/sorrenson25a.html},
  abstract = {Density-based distances (DBDs) provide a principled approach to metric learning by defining distances in terms of the underlying data distribution. By employing a Riemannian metric that increases in regions of low probability density, shortest paths naturally follow the data manifold. Fermat distances, a specific type of DBD, have attractive properties, but existing estimators based on nearest neighbor graphs suffer from poor convergence due to inaccurate density estimates. Moreover, graph-based methods scale poorly to high dimensions, as the proposed geodesics are often insufficiently smooth. We address these challenges in two key ways. First, we learn densities using normalizing flows. Second, we refine geodesics through relaxation, guided by a learned score model. Additionally, we introduce a dimension-adapted Fermat distance that scales intuitively to high dimensions and improves numerical stability. Our work paves the way for the practical use of density-based distances, especially in high-dimensional spaces.}
}
Endnote
%0 Conference Paper
%T Learning Distances from Data with Normalizing Flows and Score Matching
%A Peter Sorrenson
%A Daniel Behrend-Uriarte
%A Christoph Schnoerr
%A Ullrich Koethe
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-sorrenson25a
%I PMLR
%P 56531--56548
%U https://proceedings.mlr.press/v267/sorrenson25a.html
%V 267
%X Density-based distances (DBDs) provide a principled approach to metric learning by defining distances in terms of the underlying data distribution. By employing a Riemannian metric that increases in regions of low probability density, shortest paths naturally follow the data manifold. Fermat distances, a specific type of DBD, have attractive properties, but existing estimators based on nearest neighbor graphs suffer from poor convergence due to inaccurate density estimates. Moreover, graph-based methods scale poorly to high dimensions, as the proposed geodesics are often insufficiently smooth. We address these challenges in two key ways. First, we learn densities using normalizing flows. Second, we refine geodesics through relaxation, guided by a learned score model. Additionally, we introduce a dimension-adapted Fermat distance that scales intuitively to high dimensions and improves numerical stability. Our work paves the way for the practical use of density-based distances, especially in high-dimensional spaces.
APA
Sorrenson, P., Behrend-Uriarte, D., Schnoerr, C. & Koethe, U. (2025). Learning Distances from Data with Normalizing Flows and Score Matching. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:56531-56548. Available from https://proceedings.mlr.press/v267/sorrenson25a.html.
