Alpha-Beta Divergences Discover Micro and Macro Structures in Data
Proceedings of the 32nd International Conference on Machine Learning, PMLR 37:796-804, 2015.
Although recent work in non-linear dimensionality reduction investigates multiple choices of divergence measure during optimization \cite{yang2013icml,bunte2012neuro}, little work discusses the direct effects that divergence measures have on visualization. We study this relationship both theoretically and through an empirical analysis over 10 datasets. Our work shows how the α and β parameters of the generalized alpha-beta divergence can be chosen to discover hidden macro-structures (categories, e.g., birds) or micro-structures (fine-grained classes, e.g., toucans). Our method, which generalizes t-SNE \cite{tsne}, discovers such structure without extensive grid searches over (α, β): our theoretical analysis shows that the structure becomes apparent for particular choices of (α, β) that generalize across datasets. We also discuss efficient parallel CPU and GPU schemes, which are non-trivial because of the tree structures employed in optimization and because the large datasets do not fully fit into GPU memory. Our method runs 20x faster than the fastest published code \cite{fmm}. We conclude with detailed case studies on the following very large datasets: ILSVRC 2012, a standard computer vision dataset with 1.2M images; SUSY, a particle physics dataset with 5M instances; and HIGGS, another particle physics dataset with 11M instances. This represents the largest published visualization attained by SNE methods. We have open-sourced our visualization code: \texttt{http://rll.berkeley.edu/absne/}.
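To make the role of (α, β) concrete, the following is a minimal sketch of an alpha-beta divergence between two positive vectors. The abstract does not state the formula, so the form used here (the commonly cited AB-divergence definition for α, β, α+β ≠ 0) is an assumption, as are the function names `ab_divergence` and `kl_divergence` and the toy vectors. It is not the released code.

```python
import numpy as np


def ab_divergence(p, q, alpha, beta):
    """AB-divergence between positive arrays p and q (assumed standard form).

    Only valid when alpha, beta, and alpha + beta are all nonzero;
    the limiting cases (e.g. beta -> 0) are not handled in this sketch.
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    s = alpha + beta
    if alpha == 0 or beta == 0 or s == 0:
        raise ValueError("this sketch needs alpha, beta, alpha + beta != 0")
    # Per-element terms; each is <= 0 and equals 0 when p == q.
    terms = p**alpha * q**beta - (alpha / s) * p**s - (beta / s) * q**s
    return -np.sum(terms) / (alpha * beta)


def kl_divergence(p, q):
    """Kullback-Leibler divergence for normalized positive arrays."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return np.sum(p * np.log(p / q))


# Hypothetical toy distributions, chosen only for illustration.
p = np.array([0.7, 0.2, 0.1])
q = np.array([0.5, 0.3, 0.2])

# alpha = 1 with a tiny beta approximates the KL divergence.
near_kl = ab_divergence(p, q, alpha=1.0, beta=1e-6)
# alpha = beta = 1 gives half the squared Euclidean distance.
half_sq_dist = ab_divergence(p, q, alpha=1.0, beta=1.0)
```

Under this assumed form, taking α = 1 and letting β approach 0 recovers the Kullback-Leibler divergence, which is the objective t-SNE minimizes; this is one way to read the claim that the method generalizes t-SNE.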