Dimensionality estimation without distances

Matthäus Kleindessner, Ulrike Luxburg
Proceedings of the Eighteenth International Conference on Artificial Intelligence and Statistics, PMLR 38:471-479, 2015.

Abstract

While existing methods for estimating the intrinsic dimension of datasets require to know distances between data points, we consider a situation where one has considerably less information. Given a sample of points, all we get to see is who are the k nearest neighbors of every point. In other words, we get the adjacency matrix of the directed, unweighted k-nearest neighbor graph on the sample, but do not know any point coordinates or distances between the points. We provide two estimators for this situation, a naive one and a more elaborate one. Both of them can be proved to be statistically consistent. However, further theoretical and experimental evidence shows that the naive estimator performs rather poorly, whereas the elaborate one achieves results comparable to those of estimators based on distance information.

Cite this Paper


BibTeX
@InProceedings{pmlr-v38-kleindessner15, title = {{Dimensionality estimation without distances}}, author = {Kleindessner, Matthäus and Luxburg, Ulrike}, booktitle = {Proceedings of the Eighteenth International Conference on Artificial Intelligence and Statistics}, pages = {471--479}, year = {2015}, editor = {Lebanon, Guy and Vishwanathan, S. V. N.}, volume = {38}, series = {Proceedings of Machine Learning Research}, address = {San Diego, California, USA}, month = {09--12 May}, publisher = {PMLR}, pdf = {http://proceedings.mlr.press/v38/kleindessner15.pdf}, url = {https://proceedings.mlr.press/v38/kleindessner15.html}, abstract = {While existing methods for estimating the intrinsic dimension of datasets require to know distances between data points, we consider a situation where one has considerably less information. Given a sample of points, all we get to see is who are the k nearest neighbors of every point. In other words, we get the adjacency matrix of the directed, unweighted k-nearest neighbor graph on the sample, but do not know any point coordinates or distances between the points. We provide two estimators for this situation, a naive one and a more elaborate one. Both of them can be proved to be statistically consistent. However, further theoretical and experimental evidence shows that the naive estimator performs rather poorly, whereas the elaborate one achieves results comparable to those of estimators based on distance information.} }
Endnote
%0 Conference Paper %T Dimensionality estimation without distances %A Matthäus Kleindessner %A Ulrike Luxburg %B Proceedings of the Eighteenth International Conference on Artificial Intelligence and Statistics %C Proceedings of Machine Learning Research %D 2015 %E Guy Lebanon %E S. V. N. Vishwanathan %F pmlr-v38-kleindessner15 %I PMLR %P 471--479 %U https://proceedings.mlr.press/v38/kleindessner15.html %V 38 %X While existing methods for estimating the intrinsic dimension of datasets require to know distances between data points, we consider a situation where one has considerably less information. Given a sample of points, all we get to see is who are the k nearest neighbors of every point. In other words, we get the adjacency matrix of the directed, unweighted k-nearest neighbor graph on the sample, but do not know any point coordinates or distances between the points. We provide two estimators for this situation, a naive one and a more elaborate one. Both of them can be proved to be statistically consistent. However, further theoretical and experimental evidence shows that the naive estimator performs rather poorly, whereas the elaborate one achieves results comparable to those of estimators based on distance information.
RIS
TY - CPAPER TI - Dimensionality estimation without distances AU - Matthäus Kleindessner AU - Ulrike Luxburg BT - Proceedings of the Eighteenth International Conference on Artificial Intelligence and Statistics DA - 2015/02/21 ED - Guy Lebanon ED - S. V. N. Vishwanathan ID - pmlr-v38-kleindessner15 PB - PMLR DP - Proceedings of Machine Learning Research VL - 38 SP - 471 EP - 479 L1 - http://proceedings.mlr.press/v38/kleindessner15.pdf UR - https://proceedings.mlr.press/v38/kleindessner15.html AB - While existing methods for estimating the intrinsic dimension of datasets require to know distances between data points, we consider a situation where one has considerably less information. Given a sample of points, all we get to see is who are the k nearest neighbors of every point. In other words, we get the adjacency matrix of the directed, unweighted k-nearest neighbor graph on the sample, but do not know any point coordinates or distances between the points. We provide two estimators for this situation, a naive one and a more elaborate one. Both of them can be proved to be statistically consistent. However, further theoretical and experimental evidence shows that the naive estimator performs rather poorly, whereas the elaborate one achieves results comparable to those of estimators based on distance information. ER -
APA
Kleindessner, M. & Luxburg, U.. (2015). Dimensionality estimation without distances. Proceedings of the Eighteenth International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 38:471-479 Available from https://proceedings.mlr.press/v38/kleindessner15.html.

Related Material