A recursive estimate for the predictive likelihood in a topic model
Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics, PMLR 31:527-535, 2013.
Abstract
We consider the problem of evaluating the predictive log likelihood of a previously unseen document under a topic model. This task arises when cross-validating for a model hyperparameter, when testing a model on a holdout set, and when comparing the performance of different fitting strategies. Yet it is known to be very challenging, as it is equivalent to estimating a marginal likelihood in Bayesian model selection. We propose a fast algorithm for approximating this likelihood, one whose computational cost is linear both in document length and in the number of topics. The method is a first-order approximation to the algorithm of Carvalho et al. (2010), and can also be interpreted as a one-particle, Rao-Blackwellized version of the "left-to-right" method of Wallach et al. (2009). On our test examples, the proposed method gives similar answers to these other methods, but at lower computational cost.
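The abstract does not spell out the recursion, so the following is only a minimal sketch of one way a sequential, soft-count estimate of this kind could look for an LDA-style model; it should not be read as the authors' algorithm. It assumes fixed topic-word probabilities `phi`, a Dirichlet prior `alpha` on topic proportions, and a document given as a list of word ids. The function name and all variable names are hypothetical. Each word is scored by its predictive probability given the expected topic counts accumulated so far, and those counts are then updated with the word's posterior topic responsibilities, so the cost is proportional to document length times the number of topics.

```python
import numpy as np

def recursive_log_likelihood(doc, phi, alpha):
    """Sequential estimate of log p(doc | phi, alpha) using soft topic counts.

    Illustrative assumption, not the published method.
    doc   : sequence of integer word ids
    phi   : (K, V) array of topic-word probabilities (rows sum to 1)
    alpha : (K,) array, Dirichlet prior on topic proportions
    """
    phi = np.asarray(phi, dtype=float)
    alpha = np.asarray(alpha, dtype=float)
    counts = np.zeros_like(alpha)      # expected topic counts from earlier words
    log_lik = 0.0
    for n, w in enumerate(doc):
        # Predictive topic probabilities given the soft counts so far.
        topic_prior = (alpha + counts) / (alpha.sum() + n)
        joint = topic_prior * phi[:, w]        # p(topic, word | earlier words)
        p_w = joint.sum()                      # predictive probability of word n
        log_lik += np.log(p_w)
        counts += joint / p_w                  # add posterior responsibilities
    return log_lik

# Tiny usage example with two topics and a two-word vocabulary.
phi = np.array([[0.9, 0.1],
                [0.2, 0.8]])
alpha = np.array([1.0, 1.0])
print(recursive_log_likelihood([0, 0, 1], phi, alpha))
```

Each pass through the loop does O(K) work, which matches the linear cost in document length and number of topics described above. Replacing sampled topic assignments with their expectations is the simplifying assumption of this sketch.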