Dimensionality estimation without distances
Proceedings of the Eighteenth International Conference on Artificial Intelligence and Statistics, PMLR 38:471-479, 2015.
While existing methods for estimating the intrinsic dimension of datasets require to know distances between data points, we consider a situation where one has considerably less information. Given a sample of points, all we get to see is who are the k nearest neighbors of every point. In other words, we get the adjacency matrix of the directed, unweighted k-nearest neighbor graph on the sample, but do not know any point coordinates or distances between the points. We provide two estimators for this situation, a naive one and a more elaborate one. Both of them can be proved to be statistically consistent. However, further theoretical and experimental evidence shows that the naive estimator performs rather poorly, whereas the elaborate one achieves results comparable to those of estimators based on distance information.