Compressibility Barriers to Neighborhood-Preserving Data Visualization

Szymon Snoeck, Noah Bergam, Nakul Verma
Proceedings of The 37th International Conference on Algorithmic Learning Theory, PMLR 313:1-30, 2026.

Abstract

To what extent is it possible to visualize high-dimensional data in two- or three-dimensional plots? We reframe this question in terms of embedding $n$-vertex graphs (representing the neighborhood structure of the input points) into metric spaces of low doubling dimension $d$ in such a way that keeps neighbors close and non-neighbors far. This notion of neighbor preservation can be understood as a weaker embedding constraint than near-isometry, yet it is similarly demanding in terms of how the minimum required dimension scales with the number of points. We show that for an overwhelming fraction of graphs, $d = \Theta(\log n)$ is both necessary and sufficient for neighbor preservation. Even sparse regular graphs, which encode a much simpler class of neighborhood structures, typically require $d = \Omega(\log n / \log\log n)$. The landscape changes dramatically when embedding into normed spaces: general graphs become exponentially harder to embed, requiring $d = \Omega(n)$, while sparse regular graphs continue to admit $d = O(\log n)$. Finally, we study the implications of these results for visualizing data with intrinsic cluster structure. We show that graphs produced from a planted partition model with $k$ clusters on $n$ points typically require $d = \Omega(\log n)$ even when the cluster structure is salient. These results challenge the aspiration that constant-dimensional visualizations can faithfully preserve neighborhood structure.
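The central requirement above, an embedding that keeps neighbors close and non-neighbors far, can be made concrete with a small sketch. This is one natural reading of the condition, not the paper's formal definition: the function name and the separation parameter `c` are illustrative choices, and the paper's exact parameterization may differ.

```python
import itertools
import math

def preserves_neighborhoods(adj, coords, c=2.0):
    """Return True if some radius r exists such that every edge of the
    graph has embedded length at most r while every non-edge has length
    greater than c*r. (Illustrative reading of "neighbors close,
    non-neighbors far"; c > 1 is the separation factor.)"""
    def dist(u, v):
        return math.dist(coords[u], coords[v])

    n = len(coords)
    pairs = list(itertools.combinations(range(n), 2))
    edge_dists = [dist(u, v) for u, v in pairs if v in adj[u]]
    non_dists = [dist(u, v) for u, v in pairs if v not in adj[u]]
    if not edge_dists or not non_dists:
        return True  # no constraint of one kind: trivially satisfiable
    # Such an r exists iff scaling the longest edge by c still falls
    # short of the shortest non-edge.
    return c * max(edge_dists) < min(non_dists)

# Toy example: a 4-cycle placed at the corners of a unit square.
# Edges have length 1; the two non-edges (diagonals) have length sqrt(2).
adj = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
```

With these coordinates the placement preserves neighborhoods at a mild separation factor (e.g. `c = 1.2`, since `1.2 < sqrt(2)`) but fails at `c = 2.0`. The paper's results concern how large the ambient dimension must be for such a placement to exist at all, as $n$ grows.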

Cite this Paper


BibTeX
@InProceedings{pmlr-v313-snoeck26a,
  title     = {Compressibility Barriers to Neighborhood-Preserving Data Visualization},
  author    = {Snoeck, Szymon and Bergam, Noah and Verma, Nakul},
  booktitle = {Proceedings of The 37th International Conference on Algorithmic Learning Theory},
  pages     = {1--30},
  year      = {2026},
  editor    = {Telgarsky, Matus and Ullman, Jonathan},
  volume    = {313},
  series    = {Proceedings of Machine Learning Research},
  month     = {23--26 Feb},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v313/main/assets/snoeck26a/snoeck26a.pdf},
  url       = {https://proceedings.mlr.press/v313/snoeck26a.html},
  abstract  = {To what extent is it possible to visualize high-dimensional data in two- or three-dimensional plots? We reframe this question in terms of embedding $n$-vertex graphs (representing the neighborhood structure of the input points) into metric spaces of low doubling dimension $d$ in such a way that keeps neighbors close and non-neighbors far. This notion of neighbor preservation can be understood as a weaker embedding constraint than near-isometry, yet it is similarly demanding in terms of how the minimum required dimension scales with the number of points. We show that for an overwhelming fraction of graphs, $d = \Theta(\log n)$ is both necessary and sufficient for neighbor preservation. Even sparse regular graphs, which encode a much simpler class of neighborhood structures, typically require $d = \Omega(\log n / \log\log n)$. The landscape changes dramatically when embedding into normed spaces: general graphs become exponentially harder to embed, requiring $d = \Omega(n)$, while sparse regular graphs continue to admit $d = O(\log n)$. Finally, we study the implications of these results for visualizing data with intrinsic cluster structure. We show that graphs produced from a planted partition model with $k$ clusters on $n$ points typically require $d = \Omega(\log n)$ even when the cluster structure is salient. These results challenge the aspiration that constant-dimensional visualizations can faithfully preserve neighborhood structure.}
}
APA
Snoeck, S., Bergam, N. & Verma, N. (2026). Compressibility Barriers to Neighborhood-Preserving Data Visualization. Proceedings of The 37th International Conference on Algorithmic Learning Theory, in Proceedings of Machine Learning Research 313:1-30. Available from https://proceedings.mlr.press/v313/snoeck26a.html.
