A Stick-Breaking Likelihood for Categorical Data Analysis with Latent Gaussian Models


Mohammad Khan, Shakir Mohamed, Benjamin Marlin, Kevin Murphy
Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics, PMLR 22:610-618, 2012.


The development of accurate models and efficient algorithms for the analysis of multivariate categorical data is an important and long-standing problem in machine learning and computational statistics. In this paper, we focus on modeling categorical data using Latent Gaussian Models (LGMs). We propose a novel stick-breaking likelihood function for categorical LGMs that exploits accurate linear and quadratic bounds on the logistic log-partition function, leading to an effective variational inference and learning framework. We thoroughly compare our approach to existing algorithms for multinomial logit/probit likelihoods on several problems, including inference in multinomial Gaussian process classification and learning in latent factor models. Our extensive comparisons demonstrate that our stick-breaking model effectively captures correlation in discrete data and is well suited for the analysis of categorical data.
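To illustrate the stick-breaking idea named in the abstract, the following is a minimal sketch that assumes the standard construction: each of K-1 real-valued latent variables is passed through a logistic function to decide what fraction of the remaining probability mass ("stick") goes to the current category, and the last category receives whatever is left. The function names `sigmoid` and `stick_breaking_probs` are hypothetical and are not taken from the authors' code. The variational bounds on the logistic log-partition function are not shown.

```python
import math


def sigmoid(x):
    """Logistic function, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def stick_breaking_probs(eta):
    """Map K-1 real-valued latents to K category probabilities.

    Assumed construction (not copied from the paper's code):
    p(y = k) = sigmoid(eta[k]) * prod_{j < k} (1 - sigmoid(eta[j])),
    with the final category taking the leftover mass.
    """
    probs = []
    remaining = 1.0  # length of the stick not yet assigned
    for e in eta:
        s = sigmoid(e)
        probs.append(remaining * s)  # break off a fraction s of what is left
        remaining *= 1.0 - s
    probs.append(remaining)  # last category gets the remainder
    return probs


if __name__ == "__main__":
    print(stick_breaking_probs([0.0, 0.0]))  # [0.5, 0.25, 0.25]
```

Because each step involves only a logistic function of a single latent variable, the log-likelihood decomposes into a sum of logistic terms. This is what makes bounds on the logistic log-partition function applicable, in contrast to the multinomial logit likelihood, whose normalizer couples all categories.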
