Alpha-Beta Divergences Discover Micro and Macro Structures in Data

Karthik Narayan; Ali Punjani; Pieter Abbeel

Proceedings of the 32nd International Conference on Machine Learning, PMLR 37:796-804, 2015.

Abstract

Although recent work in non-linear dimensionality reduction investigates multiple choices of divergence measure during optimization \cite{yang2013icml,bunte2012neuro}, little work discusses the direct effects that divergence measures have on visualization. We study this relationship, theoretically and through an empirical analysis over 10 datasets. Our work shows how the α and β parameters of the generalized alpha-beta divergence can be chosen to discover hidden macro-structures (categories, e.g. birds) or micro-structures (fine-grained classes, e.g. toucans). Our method, which generalizes t-SNE \cite{tsne}, allows us to discover such structure without extensive grid searches over (α, β) due to our theoretical analysis: such structure is apparent with particular choices of (α, β) that generalize across datasets. We also discuss efficient parallel CPU and GPU schemes which are non-trivial due to the tree-structures employed in optimization and the large datasets that do not fully fit into GPU memory. Our method runs 20x faster than the fastest published code \cite{fmm}. We conclude with detailed case studies on the following very large datasets: ILSVRC 2012, a standard computer vision dataset with 1.2M images; SUSY, a particle physics dataset with 5M instances; and HIGGS, another particle physics dataset with 11M instances. This represents the largest published visualization attained by SNE methods. We have open-sourced our visualization code: \texttt{http://rll.berkeley.edu/absne/}.
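For readers unfamiliar with the generalized alpha-beta divergence named in the abstract, the following is a minimal sketch of its commonly used definition for non-negative vectors, valid when α, β, and α + β are all nonzero. It assumes the standard form from the AB-divergence literature; the abstract itself does not state the formula, and the function names `ab_divergence` and `generalized_kl` are illustrative, not taken from the authors' released code.

```python
import numpy as np


def ab_divergence(p, q, alpha, beta):
    """Alpha-beta divergence between strictly positive arrays p and q.

    Assumed (standard) definition, valid for alpha, beta, alpha + beta != 0:
        D = -1/(alpha*beta) * sum(p^alpha * q^beta
                                  - alpha/(alpha+beta) * p^(alpha+beta)
                                  - beta/(alpha+beta)  * q^(alpha+beta))
    """
    if alpha == 0 or beta == 0 or alpha + beta == 0:
        raise ValueError("this sketch needs alpha, beta, alpha + beta all nonzero")
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    s = alpha + beta
    terms = p**alpha * q**beta - (alpha / s) * p**s - (beta / s) * q**s
    return -terms.sum() / (alpha * beta)


def generalized_kl(p, q):
    """Generalized KL divergence: the limit of the form above as alpha=1, beta->0."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return float(np.sum(p * np.log(p / q) - p + q))


p = np.array([0.2, 0.5, 0.3])
q = np.array([0.4, 0.4, 0.2])

# With alpha = 1 and a very small beta, the value approaches the KL divergence,
# which is the objective that t-SNE minimizes.
near_kl = ab_divergence(p, q, alpha=1.0, beta=1e-6)
exact_kl = generalized_kl(p, q)
```

Under this definition, α = β = 1 reduces to half the squared Euclidean distance, and α = 1 with β approaching 0 recovers the KL divergence, which is consistent with the abstract's statement that the method generalizes t-SNE.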

Cite this Paper

BibTeX


@InProceedings{pmlr-v37-narayan15,
  title = 	 {Alpha-Beta Divergences Discover Micro and Macro Structures in Data},
  author = 	 {Narayan, Karthik and Punjani, Ali and Abbeel, Pieter},
  booktitle = 	 {Proceedings of the 32nd International Conference on Machine Learning},
  pages = 	 {796--804},
  year = 	 {2015},
  editor = 	 {Bach, Francis and Blei, David},
  volume = 	 {37},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {Lille, France},
  month = 	 {07--09 Jul},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v37/narayan15.pdf},
  url = 	 {https://proceedings.mlr.press/v37/narayan15.html},
  abstract = 	 {Although recent work in non-linear dimensionality reduction investigates multiple choices of divergence measure during optimization \cite{yang2013icml,bunte2012neuro}, little work discusses the direct effects that divergence measures have on visualization. We study this relationship, theoretically and through an empirical analysis over 10 datasets. Our work shows how the α and β parameters of the generalized alpha-beta divergence can be chosen to discover hidden macro-structures (categories, e.g. birds) or micro-structures (fine-grained classes, e.g. toucans). Our method, which generalizes t-SNE \cite{tsne}, allows us to discover such structure without extensive grid searches over (α, β) due to our theoretical analysis: such structure is apparent with particular choices of (α, β) that generalize across datasets. We also discuss efficient parallel CPU and GPU schemes which are non-trivial due to the tree-structures employed in optimization and the large datasets that do not fully fit into GPU memory. Our method runs 20x faster than the fastest published code \cite{fmm}. We conclude with detailed case studies on the following very large datasets: ILSVRC 2012, a standard computer vision dataset with 1.2M images; SUSY, a particle physics dataset with 5M instances; and HIGGS, another particle physics dataset with 11M instances. This represents the largest published visualization attained by SNE methods. We have open-sourced our visualization code: \texttt{http://rll.berkeley.edu/absne/}.}
}

Endnote

%0 Conference Paper
%T Alpha-Beta Divergences Discover Micro and Macro Structures in Data
%A Karthik Narayan
%A Ali Punjani
%A Pieter Abbeel
%B Proceedings of the 32nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2015
%E Francis Bach
%E David Blei
%F pmlr-v37-narayan15
%I PMLR
%P 796--804
%U https://proceedings.mlr.press/v37/narayan15.html
%V 37
%X Although recent work in non-linear dimensionality reduction investigates multiple choices of divergence measure during optimization \cite{yang2013icml,bunte2012neuro}, little work discusses the direct effects that divergence measures have on visualization. We study this relationship, theoretically and through an empirical analysis over 10 datasets. Our work shows how the α and β parameters of the generalized alpha-beta divergence can be chosen to discover hidden macro-structures (categories, e.g. birds) or micro-structures (fine-grained classes, e.g. toucans). Our method, which generalizes t-SNE \cite{tsne}, allows us to discover such structure without extensive grid searches over (α, β) due to our theoretical analysis: such structure is apparent with particular choices of (α, β) that generalize across datasets. We also discuss efficient parallel CPU and GPU schemes which are non-trivial due to the tree-structures employed in optimization and the large datasets that do not fully fit into GPU memory. Our method runs 20x faster than the fastest published code \cite{fmm}. We conclude with detailed case studies on the following very large datasets: ILSVRC 2012, a standard computer vision dataset with 1.2M images; SUSY, a particle physics dataset with 5M instances; and HIGGS, another particle physics dataset with 11M instances. This represents the largest published visualization attained by SNE methods. We have open-sourced our visualization code: \texttt{http://rll.berkeley.edu/absne/}.

RIS


TY  - CPAPER
TI  - Alpha-Beta Divergences Discover Micro and Macro Structures in Data
AU  - Karthik Narayan
AU  - Ali Punjani
AU  - Pieter Abbeel
BT  - Proceedings of the 32nd International Conference on Machine Learning
DA  - 2015/06/01
ED  - Francis Bach
ED  - David Blei
ID  - pmlr-v37-narayan15
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 37
SP  - 796
EP  - 804
L1  - http://proceedings.mlr.press/v37/narayan15.pdf
UR  - https://proceedings.mlr.press/v37/narayan15.html
AB  - Although recent work in non-linear dimensionality reduction investigates multiple choices of divergence measure during optimization \cite{yang2013icml,bunte2012neuro}, little work discusses the direct effects that divergence measures have on visualization. We study this relationship, theoretically and through an empirical analysis over 10 datasets. Our work shows how the α and β parameters of the generalized alpha-beta divergence can be chosen to discover hidden macro-structures (categories, e.g. birds) or micro-structures (fine-grained classes, e.g. toucans). Our method, which generalizes t-SNE \cite{tsne}, allows us to discover such structure without extensive grid searches over (α, β) due to our theoretical analysis: such structure is apparent with particular choices of (α, β) that generalize across datasets. We also discuss efficient parallel CPU and GPU schemes which are non-trivial due to the tree-structures employed in optimization and the large datasets that do not fully fit into GPU memory. Our method runs 20x faster than the fastest published code \cite{fmm}. We conclude with detailed case studies on the following very large datasets: ILSVRC 2012, a standard computer vision dataset with 1.2M images; SUSY, a particle physics dataset with 5M instances; and HIGGS, another particle physics dataset with 11M instances. This represents the largest published visualization attained by SNE methods. We have open-sourced our visualization code: \texttt{http://rll.berkeley.edu/absne/}.
ER  -

APA


Narayan, K., Punjani, A. & Abbeel, P. (2015). Alpha-Beta Divergences Discover Micro and Macro Structures in Data. Proceedings of the 32nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 37:796-804. Available from https://proceedings.mlr.press/v37/narayan15.html.
