Mixture of Watson Distributions: A Generative Model for Hyperspherical Embeddings

Avleen S. Bijral, Markus Breitenbach, Greg Grudic
Proceedings of the Eleventh International Conference on Artificial Intelligence and Statistics, PMLR 2:35-42, 2007.

Abstract

Machine learning applications often involve data that can be analyzed as unit vectors on a d-dimensional hypersphere, or equivalently are directional in nature. Spectral clustering techniques generate embeddings that constitute an example of directional data and can result in different shapes on a hypersphere (depending on the original structure). Other examples of directional data include text and some sub-domains of bioinformatics. The Watson distribution for directional data presents a tractable form and has more modeling capability than the simple von Mises-Fisher distribution. In this paper, we present a generative model of mixtures of Watson distributions on a hypersphere and derive numerical approximations of the parameters in an Expectation Maximization (EM) setting. This model also allows us to present an explanation for choosing the right embedding dimension for spectral clustering. We analyze the algorithm on a generated example and demonstrate its superiority over the existing algorithms through results on real datasets.
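
For reference, the two densities contrasted in the abstract can be written in their standard textbook forms. This is a sketch rather than the paper's own notation: the symbols $x$, $\mu$, $\kappa$, $\pi_k$, $r_{ik}$, the component count $K$, and the convention that data are unit vectors in $\mathbb{R}^d$ are assumptions introduced here.

```latex
% Assumed convention: x, \mu \in \mathbb{R}^d with \|x\| = \|\mu\| = 1.

% von Mises-Fisher density: depends on \mu^\top x (linear in x).
f_{\mathrm{vMF}}(x;\mu,\kappa)
  = \frac{\kappa^{d/2-1}}{(2\pi)^{d/2}\, I_{d/2-1}(\kappa)}
    \exp\!\left(\kappa\, \mu^\top x\right)

% Watson density: depends on (\mu^\top x)^2, so x and -x are equally likely.
% M(a, b, \kappa) is Kummer's confluent hypergeometric function.
f_{W}(x;\mu,\kappa)
  = \frac{\Gamma(d/2)}{2\pi^{d/2}\, M\!\left(\tfrac{1}{2}, \tfrac{d}{2}, \kappa\right)}
    \exp\!\left(\kappa\, (\mu^\top x)^2\right)

% Mixture of K Watson components.
p(x) = \sum_{k=1}^{K} \pi_k\, f_{W}(x;\mu_k,\kappa_k),
  \qquad \pi_k \ge 0, \quad \sum_{k=1}^{K} \pi_k = 1

% Generic EM E-step: responsibility of component k for data point x_i.
r_{ik} = \frac{\pi_k\, f_{W}(x_i;\mu_k,\kappa_k)}
              {\sum_{j=1}^{K} \pi_j\, f_{W}(x_i;\mu_j,\kappa_j)}
```

In this form, $\kappa > 0$ concentrates mass around the axis $\pm\mu$, while $\kappa < 0$ concentrates it around the great circle orthogonal to $\mu$. The Kummer function in the normalizer has no elementary closed form, which is presumably why the abstract speaks of numerical approximations of the parameters.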

Cite this Paper


BibTeX
@InProceedings{pmlr-v2-bijral07a,
  title = {Mixture of Watson Distributions: A Generative Model for Hyperspherical Embeddings},
  author = {Bijral, Avleen S. and Breitenbach, Markus and Grudic, Greg},
  booktitle = {Proceedings of the Eleventh International Conference on Artificial Intelligence and Statistics},
  pages = {35--42},
  year = {2007},
  editor = {Meila, Marina and Shen, Xiaotong},
  volume = {2},
  series = {Proceedings of Machine Learning Research},
  address = {San Juan, Puerto Rico},
  month = {21--24 Mar},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v2/bijral07a/bijral07a.pdf},
  url = {https://proceedings.mlr.press/v2/bijral07a.html},
  abstract = {Machine learning applications often involve data that can be analyzed as unit vectors on a d-dimensional hypersphere, or equivalently are directional in nature. Spectral clustering techniques generate embeddings that constitute an example of directional data and can result in different shapes on a hypersphere (depending on the original structure). Other examples of directional data include text and some sub-domains of bioinformatics. The Watson distribution for directional data presents a tractable form and has more modeling capability than the simple von Mises-Fisher distribution. In this paper, we present a generative model of mixtures of Watson distributions on a hypersphere and derive numerical approximations of the parameters in an Expectation Maximization (EM) setting. This model also allows us to present an explanation for choosing the right embedding dimension for spectral clustering. We analyze the algorithm on a generated example and demonstrate its superiority over the existing algorithms through results on real datasets.}
}
Endnote
%0 Conference Paper
%T Mixture of Watson Distributions: A Generative Model for Hyperspherical Embeddings
%A Avleen S. Bijral
%A Markus Breitenbach
%A Greg Grudic
%B Proceedings of the Eleventh International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2007
%E Marina Meila
%E Xiaotong Shen
%F pmlr-v2-bijral07a
%I PMLR
%P 35--42
%U https://proceedings.mlr.press/v2/bijral07a.html
%V 2
%X Machine learning applications often involve data that can be analyzed as unit vectors on a d-dimensional hypersphere, or equivalently are directional in nature. Spectral clustering techniques generate embeddings that constitute an example of directional data and can result in different shapes on a hypersphere (depending on the original structure). Other examples of directional data include text and some sub-domains of bioinformatics. The Watson distribution for directional data presents a tractable form and has more modeling capability than the simple von Mises-Fisher distribution. In this paper, we present a generative model of mixtures of Watson distributions on a hypersphere and derive numerical approximations of the parameters in an Expectation Maximization (EM) setting. This model also allows us to present an explanation for choosing the right embedding dimension for spectral clustering. We analyze the algorithm on a generated example and demonstrate its superiority over the existing algorithms through results on real datasets.
RIS
TY  - CPAPER
TI  - Mixture of Watson Distributions: A Generative Model for Hyperspherical Embeddings
AU  - Avleen S. Bijral
AU  - Markus Breitenbach
AU  - Greg Grudic
BT  - Proceedings of the Eleventh International Conference on Artificial Intelligence and Statistics
DA  - 2007/03/11
ED  - Marina Meila
ED  - Xiaotong Shen
ID  - pmlr-v2-bijral07a
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 2
SP  - 35
EP  - 42
L1  - http://proceedings.mlr.press/v2/bijral07a/bijral07a.pdf
UR  - https://proceedings.mlr.press/v2/bijral07a.html
AB  - Machine learning applications often involve data that can be analyzed as unit vectors on a d-dimensional hypersphere, or equivalently are directional in nature. Spectral clustering techniques generate embeddings that constitute an example of directional data and can result in different shapes on a hypersphere (depending on the original structure). Other examples of directional data include text and some sub-domains of bioinformatics. The Watson distribution for directional data presents a tractable form and has more modeling capability than the simple von Mises-Fisher distribution. In this paper, we present a generative model of mixtures of Watson distributions on a hypersphere and derive numerical approximations of the parameters in an Expectation Maximization (EM) setting. This model also allows us to present an explanation for choosing the right embedding dimension for spectral clustering. We analyze the algorithm on a generated example and demonstrate its superiority over the existing algorithms through results on real datasets.
ER  -
APA
Bijral, A.S., Breitenbach, M. & Grudic, G. (2007). Mixture of Watson Distributions: A Generative Model for Hyperspherical Embeddings. Proceedings of the Eleventh International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 2:35-42. Available from https://proceedings.mlr.press/v2/bijral07a.html.