Factorized Diffusion Map Approximation

Saeed Amizadeh, Hamed Valizadegan, Milos Hauskrecht
Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics, PMLR 22:37-46, 2012.

Abstract

Diffusion maps are among the most powerful Machine Learning tools to analyze and work with complex high-dimensional datasets. Unfortunately, the estimation of these maps from a finite sample is known to suffer from the curse of dimensionality. Motivated by other machine learning models for which the existence of structure in the underlying distribution of data can reduce the complexity of estimation, we study and show how the factorization of the underlying distribution into independent subspaces can help us to estimate diffusion maps more accurately. Building upon this result, we propose and develop an algorithm that can automatically factorize a high dimensional data space in order to minimize the error of estimation of its diffusion map, even in the case when the underlying distribution is not decomposable. Experiments on both the synthetic and real-world datasets demonstrate improved estimation performance of our method over the regular diffusion-map framework.

Cite this Paper


BibTeX
@InProceedings{pmlr-v22-amizadeh12,
  title = {Factorized Diffusion Map Approximation},
  author = {Saeed Amizadeh and Hamed Valizadegan and Milos Hauskrecht},
  booktitle = {Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics},
  pages = {37--46},
  year = {2012},
  editor = {Neil D. Lawrence and Mark Girolami},
  volume = {22},
  series = {Proceedings of Machine Learning Research},
  address = {La Palma, Canary Islands},
  month = {21--23 Apr},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v22/amizadeh12/amizadeh12.pdf},
  url = {http://proceedings.mlr.press/v22/amizadeh12.html},
  abstract = {Diffusion maps are among the most powerful Machine Learning tools to analyze and work with complex high-dimensional datasets. Unfortunately, the estimation of these maps from a finite sample is known to suffer from the curse of dimensionality. Motivated by other machine learning models for which the existence of structure in the underlying distribution of data can reduce the complexity of estimation, we study and show how the factorization of the underlying distribution into independent subspaces can help us to estimate diffusion maps more accurately. Building upon this result, we propose and develop an algorithm that can automatically factorize a high dimensional data space in order to minimize the error of estimation of its diffusion map, even in the case when the underlying distribution is not decomposable. Experiments on both the synthetic and real-world datasets demonstrate improved estimation performance of our method over the regular diffusion-map framework.}
}
Endnote
%0 Conference Paper
%T Factorized Diffusion Map Approximation
%A Saeed Amizadeh
%A Hamed Valizadegan
%A Milos Hauskrecht
%B Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2012
%E Neil D. Lawrence
%E Mark Girolami
%F pmlr-v22-amizadeh12
%I PMLR
%J Proceedings of Machine Learning Research
%P 37--46
%U http://proceedings.mlr.press
%V 22
%W PMLR
%X Diffusion maps are among the most powerful Machine Learning tools to analyze and work with complex high-dimensional datasets. Unfortunately, the estimation of these maps from a finite sample is known to suffer from the curse of dimensionality. Motivated by other machine learning models for which the existence of structure in the underlying distribution of data can reduce the complexity of estimation, we study and show how the factorization of the underlying distribution into independent subspaces can help us to estimate diffusion maps more accurately. Building upon this result, we propose and develop an algorithm that can automatically factorize a high dimensional data space in order to minimize the error of estimation of its diffusion map, even in the case when the underlying distribution is not decomposable. Experiments on both the synthetic and real-world datasets demonstrate improved estimation performance of our method over the regular diffusion-map framework.
RIS
TY - CPAPER
TI - Factorized Diffusion Map Approximation
AU - Saeed Amizadeh
AU - Hamed Valizadegan
AU - Milos Hauskrecht
BT - Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics
PY - 2012/03/21
DA - 2012/03/21
ED - Neil D. Lawrence
ED - Mark Girolami
ID - pmlr-v22-amizadeh12
PB - PMLR
SP - 37
DP - PMLR
EP - 46
L1 - http://proceedings.mlr.press/v22/amizadeh12/amizadeh12.pdf
UR - http://proceedings.mlr.press/v22/amizadeh12.html
AB - Diffusion maps are among the most powerful Machine Learning tools to analyze and work with complex high-dimensional datasets. Unfortunately, the estimation of these maps from a finite sample is known to suffer from the curse of dimensionality. Motivated by other machine learning models for which the existence of structure in the underlying distribution of data can reduce the complexity of estimation, we study and show how the factorization of the underlying distribution into independent subspaces can help us to estimate diffusion maps more accurately. Building upon this result, we propose and develop an algorithm that can automatically factorize a high dimensional data space in order to minimize the error of estimation of its diffusion map, even in the case when the underlying distribution is not decomposable. Experiments on both the synthetic and real-world datasets demonstrate improved estimation performance of our method over the regular diffusion-map framework.
ER -
APA
Amizadeh, S., Valizadegan, H. & Hauskrecht, M. (2012). Factorized Diffusion Map Approximation. Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics, in PMLR 22:37-46.
