Quality assessment of nonlinear dimensionality reduction based on $K$-ary neighborhoods

John Lee, Michel Verleysen
Proceedings of the Workshop on New Challenges for Feature Selection in Data Mining and Knowledge Discovery at ECML/PKDD 2008, PMLR 4:21-35, 2008.

Abstract

Nonlinear dimensionality reduction aims at providing low-dimensional representations of high-dimensional data sets. Many new methods have been recently proposed, but the question of their assessment and comparison remains open. This paper reviews some of the existing quality measures that are based on distance ranking and K-ary neighborhoods. In this context, the comparison of the ranks in the high- and low-dimensional spaces leads to the definition of the co-ranking matrix. Rank errors and concepts such as neighborhood intrusions and extrusions can be associated with different blocks of the co-ranking matrix. The considered quality criteria are then cast within this unifying framework and the blocks they involve are identified. The same framework allows us to propose simpler criteria, which quantify two aspects of the embedding, namely its overall quality and its tendency to favor either intrusions or extrusions. Finally, a simple experiment illustrates the soundness of the approach.
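
The co-ranking idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (rank_matrix, coranking_matrix, block_summary), the use of Euclidean distances, the stable tie-breaking, and the 1/(K·N) normalisation are all assumptions made for illustration. The sketch also assumes that the lower triangle of the upper-left K×K block (high-dimensional rank larger than low-dimensional rank) counts intrusions and the upper triangle counts extrusions; this page does not define either term, so treat that convention as an assumption.

```python
import numpy as np


def rank_matrix(X):
    """Rank of every point j in the distance-sorted neighbour list of point i.

    Assumes Euclidean distances; the point itself is forced to rank 0.
    """
    diff = X[:, None, :] - X[None, :, :]
    d = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(d, -1.0)  # guarantees each point ranks first for itself
    order = np.argsort(d, axis=1, kind="stable")
    n = X.shape[0]
    ranks = np.empty_like(order)
    ranks[np.arange(n)[:, None], order] = np.arange(n)[None, :]
    return ranks


def coranking_matrix(X_high, X_low):
    """Entry (k-1, l-1) counts pairs with rank k in X_high and rank l in X_low."""
    n = X_high.shape[0]
    rho = rank_matrix(X_high)  # ranks in the high-dimensional space
    r = rank_matrix(X_low)     # ranks in the low-dimensional space
    off_diag = ~np.eye(n, dtype=bool)
    Q = np.zeros((n - 1, n - 1), dtype=np.int64)
    np.add.at(Q, (rho[off_diag] - 1, r[off_diag] - 1), 1)
    return Q


def block_summary(Q, K):
    """Summarise the upper-left K x K block (hypothetical helper).

    The 1/(K*N) normalisation and the triangle convention are assumptions.
    """
    n = Q.shape[0] + 1
    block = Q[:K, :K]
    norm = K * n
    return {
        "preserved_fraction": block.sum() / norm,
        "intrusions": np.tril(block, -1).sum() / norm,
        "extrusions": np.triu(block, 1).sum() / norm,
    }


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(30, 5))
    Y = X[:, :2]  # crude "embedding": keep the first two coordinates
    Q = coranking_matrix(X, Y)
    print(block_summary(Q, K=5))
```

Comparing a data set with itself yields a diagonal co-ranking matrix, so the preserved fraction is 1 and both triangle counts are 0 for every K; any embedding that reorders neighbours moves mass off the diagonal.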

Cite this Paper


BibTeX
@InProceedings{pmlr-v4-lee08a,
  title = {Quality assessment of nonlinear dimensionality reduction based on K-ary neighborhoods},
  author = {Lee, John and Verleysen, Michel},
  booktitle = {Proceedings of the Workshop on New Challenges for Feature Selection in Data Mining and Knowledge Discovery at ECML/PKDD 2008},
  pages = {21--35},
  year = {2008},
  editor = {Saeys, Yvan and Liu, Huan and Inza, Iñaki and Wehenkel, Louis and Pee, Yves Van de},
  volume = {4},
  series = {Proceedings of Machine Learning Research},
  address = {Antwerp, Belgium},
  month = {15 Sep},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v4/lee08a/lee08a.pdf},
  url = {https://proceedings.mlr.press/v4/lee08a.html},
  abstract = {Nonlinear dimensionality reduction aims at providing low-dimensional representations of high-dimensional data sets. Many new methods have been recently proposed, but the question of their assessment and comparison remains open. This paper reviews some of the existing quality measures that are based on distance ranking and K-ary neighborhoods. In this context, the comparison of the ranks in the high- and low-dimensional spaces leads to the definition of the co-ranking matrix. Rank errors and concepts such as neighborhood intrusions and extrusions can be associated with different blocks of the co-ranking matrix. The considered quality criteria are then cast within this unifying framework and the blocks they involve are identified. The same framework allows us to propose simpler criteria, which quantify two aspects of the embedding, namely its overall quality and its tendency to favor either intrusions or extrusions. Finally, a simple experiment illustrates the soundness of the approach.}
}
Endnote
%0 Conference Paper
%T Quality assessment of nonlinear dimensionality reduction based on $K$-ary neighborhoods
%A John Lee
%A Michel Verleysen
%B Proceedings of the Workshop on New Challenges for Feature Selection in Data Mining and Knowledge Discovery at ECML/PKDD 2008
%C Proceedings of Machine Learning Research
%D 2008
%E Yvan Saeys
%E Huan Liu
%E Iñaki Inza
%E Louis Wehenkel
%E Yves Van de Pee
%F pmlr-v4-lee08a
%I PMLR
%P 21--35
%U https://proceedings.mlr.press/v4/lee08a.html
%V 4
%X Nonlinear dimensionality reduction aims at providing low-dimensional representations of high-dimensional data sets. Many new methods have been recently proposed, but the question of their assessment and comparison remains open. This paper reviews some of the existing quality measures that are based on distance ranking and K-ary neighborhoods. In this context, the comparison of the ranks in the high- and low-dimensional spaces leads to the definition of the co-ranking matrix. Rank errors and concepts such as neighborhood intrusions and extrusions can be associated with different blocks of the co-ranking matrix. The considered quality criteria are then cast within this unifying framework and the blocks they involve are identified. The same framework allows us to propose simpler criteria, which quantify two aspects of the embedding, namely its overall quality and its tendency to favor either intrusions or extrusions. Finally, a simple experiment illustrates the soundness of the approach.
RIS
TY - CPAPER
TI - Quality assessment of nonlinear dimensionality reduction based on $K$-ary neighborhoods
AU - John Lee
AU - Michel Verleysen
BT - Proceedings of the Workshop on New Challenges for Feature Selection in Data Mining and Knowledge Discovery at ECML/PKDD 2008
DA - 2008/09/11
ED - Yvan Saeys
ED - Huan Liu
ED - Iñaki Inza
ED - Louis Wehenkel
ED - Yves Van de Pee
ID - pmlr-v4-lee08a
PB - PMLR
DP - Proceedings of Machine Learning Research
VL - 4
SP - 21
EP - 35
L1 - http://proceedings.mlr.press/v4/lee08a/lee08a.pdf
UR - https://proceedings.mlr.press/v4/lee08a.html
AB - Nonlinear dimensionality reduction aims at providing low-dimensional representations of high-dimensional data sets. Many new methods have been recently proposed, but the question of their assessment and comparison remains open. This paper reviews some of the existing quality measures that are based on distance ranking and K-ary neighborhoods. In this context, the comparison of the ranks in the high- and low-dimensional spaces leads to the definition of the co-ranking matrix. Rank errors and concepts such as neighborhood intrusions and extrusions can be associated with different blocks of the co-ranking matrix. The considered quality criteria are then cast within this unifying framework and the blocks they involve are identified. The same framework allows us to propose simpler criteria, which quantify two aspects of the embedding, namely its overall quality and its tendency to favor either intrusions or extrusions. Finally, a simple experiment illustrates the soundness of the approach.
ER -
APA
Lee, J. & Verleysen, M. (2008). Quality assessment of nonlinear dimensionality reduction based on $K$-ary neighborhoods. Proceedings of the Workshop on New Challenges for Feature Selection in Data Mining and Knowledge Discovery at ECML/PKDD 2008, in Proceedings of Machine Learning Research 4:21-35. Available from https://proceedings.mlr.press/v4/lee08a.html.
