Nonlinear Dimensionality Reduction as Information Retrieval

Jarkko Venna, Samuel Kaski
Proceedings of the Eleventh International Conference on Artificial Intelligence and Statistics, PMLR 2:572-579, 2007.

Abstract

Nonlinear dimensionality reduction has so far been treated either as a data representation problem or as a search for a lower-dimensional manifold embedded in the data space. A main application for both is in information visualization, to make visible the neighborhood or proximity relationships in the data, but neither approach has been designed to optimize this task. We give such visualization a new conceptualization as an information retrieval problem; a projection is good if neighbors of data points can be retrieved well based on the visualized projected points. This makes it possible to rigorously quantify goodness in terms of precision and recall. A method is introduced to optimize retrieval quality; it turns out to be an extension of Stochastic Neighbor Embedding, one of the earlier nonlinear projection methods, for which we give a new interpretation: it optimizes recall. The new method is shown empirically to outperform existing dimensionality reduction methods.
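As a concrete illustration of the retrieval framing (a minimal sketch, not code from the paper), the snippet below treats a low-dimensional projection as a neighbor-retrieval system: for each query point, the "relevant" items are its nearest neighbors in the original data space and the "retrieved" items are its nearest neighbors on the display, so mean precision and recall can be computed over all query points. The scikit-learn digits dataset, the PCA projection used as the mapping, and the neighborhood sizes are arbitrary choices for the example; any projection method could be substituted.

```python
# Illustrative sketch (assumed setup, not the authors' implementation):
# evaluate a 2-D projection as a neighbor-retrieval system.
import numpy as np
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.neighbors import NearestNeighbors


def knn_sets(points, k):
    """Indices of the k nearest neighbors of every point (excluding the point itself)."""
    nn = NearestNeighbors(n_neighbors=k + 1).fit(points)
    _, idx = nn.kneighbors(points)
    return [set(row[1:]) for row in idx]


def neighborhood_precision_recall(X_high, X_low, k_relevant=10, k_retrieved=20):
    """Mean precision and recall of retrieving data-space neighbors from the projection."""
    relevant = knn_sets(X_high, k_relevant)    # true neighbors in the original space
    retrieved = knn_sets(X_low, k_retrieved)   # neighbors read off the 2-D display
    hits = np.array([len(a & b) for a, b in zip(retrieved, relevant)], dtype=float)
    precision = np.mean(hits / k_retrieved)    # fraction of retrieved points that are true neighbors
    recall = np.mean(hits / k_relevant)        # fraction of true neighbors that are retrieved
    return precision, recall


X, _ = load_digits(return_X_y=True)
X_low = PCA(n_components=2).fit_transform(X)   # stand-in for any projection method
p, r = neighborhood_precision_recall(X, X_low)
print(f"mean precision = {p:.3f}, mean recall = {r:.3f}")
```

In this framing, a projection that pushes unrelated points next to each other on the display lowers precision, while one that pulls true neighbors apart lowers recall, which is the trade-off the abstract's retrieval interpretation makes explicit.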

Cite this Paper


BibTeX
@InProceedings{pmlr-v2-venna07a,
  title     = {Nonlinear Dimensionality Reduction as Information Retrieval},
  author    = {Venna, Jarkko and Kaski, Samuel},
  booktitle = {Proceedings of the Eleventh International Conference on Artificial Intelligence and Statistics},
  pages     = {572--579},
  year      = {2007},
  editor    = {Meila, Marina and Shen, Xiaotong},
  volume    = {2},
  series    = {Proceedings of Machine Learning Research},
  address   = {San Juan, Puerto Rico},
  month     = {21--24 Mar},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v2/venna07a/venna07a.pdf},
  url       = {https://proceedings.mlr.press/v2/venna07a.html},
  abstract  = {Nonlinear dimensionality reduction has so far been treated either as a data representation problem or as a search for a lower-dimensional manifold embedded in the data space. A main application for both is in information visualization, to make visible the neighborhood or proximity relationships in the data, but neither approach has been designed to optimize this task. We give such visualization a new conceptualization as an information retrieval problem; a projection is good if neighbors of data points can be retrieved well based on the visualized projected points. This makes it possible to rigorously quantify goodness in terms of precision and recall. A method is introduced to optimize retrieval quality; it turns out to be an extension of Stochastic Neighbor Embedding, one of the earlier nonlinear projection methods, for which we give a new interpretation: it optimizes recall. The new method is shown empirically to outperform existing dimensionality reduction methods.}
}
Endnote
%0 Conference Paper
%T Nonlinear Dimensionality Reduction as Information Retrieval
%A Jarkko Venna
%A Samuel Kaski
%B Proceedings of the Eleventh International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2007
%E Marina Meila
%E Xiaotong Shen
%F pmlr-v2-venna07a
%I PMLR
%P 572--579
%U https://proceedings.mlr.press/v2/venna07a.html
%V 2
%X Nonlinear dimensionality reduction has so far been treated either as a data representation problem or as a search for a lower-dimensional manifold embedded in the data space. A main application for both is in information visualization, to make visible the neighborhood or proximity relationships in the data, but neither approach has been designed to optimize this task. We give such visualization a new conceptualization as an information retrieval problem; a projection is good if neighbors of data points can be retrieved well based on the visualized projected points. This makes it possible to rigorously quantify goodness in terms of precision and recall. A method is introduced to optimize retrieval quality; it turns out to be an extension of Stochastic Neighbor Embedding, one of the earlier nonlinear projection methods, for which we give a new interpretation: it optimizes recall. The new method is shown empirically to outperform existing dimensionality reduction methods.
RIS
TY - CPAPER
TI - Nonlinear Dimensionality Reduction as Information Retrieval
AU - Jarkko Venna
AU - Samuel Kaski
BT - Proceedings of the Eleventh International Conference on Artificial Intelligence and Statistics
DA - 2007/03/11
ED - Marina Meila
ED - Xiaotong Shen
ID - pmlr-v2-venna07a
PB - PMLR
DP - Proceedings of Machine Learning Research
VL - 2
SP - 572
EP - 579
L1 - http://proceedings.mlr.press/v2/venna07a/venna07a.pdf
UR - https://proceedings.mlr.press/v2/venna07a.html
AB - Nonlinear dimensionality reduction has so far been treated either as a data representation problem or as a search for a lower-dimensional manifold embedded in the data space. A main application for both is in information visualization, to make visible the neighborhood or proximity relationships in the data, but neither approach has been designed to optimize this task. We give such visualization a new conceptualization as an information retrieval problem; a projection is good if neighbors of data points can be retrieved well based on the visualized projected points. This makes it possible to rigorously quantify goodness in terms of precision and recall. A method is introduced to optimize retrieval quality; it turns out to be an extension of Stochastic Neighbor Embedding, one of the earlier nonlinear projection methods, for which we give a new interpretation: it optimizes recall. The new method is shown empirically to outperform existing dimensionality reduction methods.
ER -
APA
Venna, J. & Kaski, S. (2007). Nonlinear Dimensionality Reduction as Information Retrieval. Proceedings of the Eleventh International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 2:572-579. Available from https://proceedings.mlr.press/v2/venna07a.html.
