A recursive estimate for the predictive likelihood in a topic model

James Scott; Jason Baldridge

Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics, PMLR 31:527-535, 2013.

Abstract

We consider the problem of evaluating the predictive log likelihood of a previously unseen document under a topic model. This task arises when cross-validating for a model hyperparameter, when testing a model on a hold-out set, and when comparing the performance of different fitting strategies. Yet it is known to be very challenging, as it is equivalent to estimating a marginal likelihood in Bayesian model selection. We propose a fast algorithm for approximating this likelihood, one whose computational cost is linear both in document length and in the number of topics. The method is a first-order approximation to the algorithm of Carvalho et al. (2010), and can also be interpreted as a one-particle, Rao-Blackwellized version of the "left-to-right" method of Wallach et al. (2009). On our test examples, the proposed method gives similar answers to these other methods, but at lower computational cost.
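
The abstract does not spell out the recursion, so the following is only an illustrative sketch of one way a single-pass, one-particle estimate with linear cost could look. It is not taken from the paper. It assumes an LDA-style model with known topic-word probabilities `phi` and Dirichlet pseudo-counts `alpha`, and it replaces the sampled topic assignments of a left-to-right scheme with expected (soft) counts. The function name and all parameter names are hypothetical.

```python
import math


def recursive_log_predictive(doc, phi, alpha):
    """Single-pass estimate of log p(doc) under fixed topics (illustrative sketch).

    doc:   list of word ids
    phi:   phi[k][w] is the probability of word w under topic k (assumed known)
    alpha: Dirichlet pseudo-counts over topics, one per topic (assumed known)
    """
    num_topics = len(phi)
    alpha_sum = sum(alpha)
    # Expected topic counts for the words processed so far (soft assignments,
    # an assumption of this sketch rather than a detail from the paper).
    counts = [0.0] * num_topics
    log_lik = 0.0
    for n, w in enumerate(doc):
        # The soft counts sum to n, so n + alpha_sum normalizes the topic weights.
        denom = n + alpha_sum
        # joint[k] approximates p(w_n, z_n = k | earlier words).
        joint = [phi[k][w] * (counts[k] + alpha[k]) / denom
                 for k in range(num_topics)]
        p_w = sum(joint)
        log_lik += math.log(p_w)
        # Add the posterior topic probabilities instead of sampling a topic.
        for k in range(num_topics):
            counts[k] += joint[k] / p_w
    return log_lik
```

Each word costs one pass over the topics, so the total work grows linearly in both document length and number of topics, which is the cost profile the abstract describes.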

Cite this Paper

BibTeX


@InProceedings{pmlr-v31-scott13a,
  title = 	 {A recursive estimate for the predictive likelihood in a topic model},
  author = 	 {Scott, James and Baldridge, Jason},
  booktitle = 	 {Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics},
  pages = 	 {527--535},
  year = 	 {2013},
  editor = 	 {Carvalho, Carlos M. and Ravikumar, Pradeep},
  volume = 	 {31},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {Scottsdale, Arizona, USA},
  month = 	 {29 Apr--01 May},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v31/scott13a.pdf},
  url = 	 {https://proceedings.mlr.press/v31/scott13a.html},
  abstract = 	 {We consider the problem of evaluating the predictive log likelihood of a previously unseen document under a topic model. This task arises when cross-validating for a model hyperparameter, when testing a model on a hold-out set, and when comparing the performance of different fitting strategies. Yet it is known to be very challenging, as it is equivalent to estimating a marginal likelihood in Bayesian model selection. We propose a fast algorithm for approximating this likelihood, one whose computational cost is linear both in document length and in the number of topics. The method is a first-order approximation to the algorithm of Carvalho et al. (2010), and can also be interpreted as a one-particle, Rao-Blackwellized version of the "left-to-right" method of Wallach et al. (2009). On our test examples, the proposed method gives similar answers to these other methods, but at lower computational cost.}
}

Endnote

%0 Conference Paper
%T A recursive estimate for the predictive likelihood in a topic model
%A James Scott
%A Jason Baldridge
%B Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2013
%E Carlos M. Carvalho
%E Pradeep Ravikumar
%F pmlr-v31-scott13a
%I PMLR
%P 527--535
%U https://proceedings.mlr.press/v31/scott13a.html
%V 31
%X We consider the problem of evaluating the predictive log likelihood of a previously unseen document under a topic model. This task arises when cross-validating for a model hyperparameter, when testing a model on a hold-out set, and when comparing the performance of different fitting strategies. Yet it is known to be very challenging, as it is equivalent to estimating a marginal likelihood in Bayesian model selection. We propose a fast algorithm for approximating this likelihood, one whose computational cost is linear both in document length and in the number of topics. The method is a first-order approximation to the algorithm of Carvalho et al. (2010), and can also be interpreted as a one-particle, Rao-Blackwellized version of the "left-to-right" method of Wallach et al. (2009). On our test examples, the proposed method gives similar answers to these other methods, but at lower computational cost.

RIS


TY  - CPAPER
TI  - A recursive estimate for the predictive likelihood in a topic model
AU  - James Scott
AU  - Jason Baldridge
BT  - Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics
DA  - 2013/04/29
ED  - Carlos M. Carvalho
ED  - Pradeep Ravikumar
ID  - pmlr-v31-scott13a
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 31
SP  - 527
EP  - 535
L1  - http://proceedings.mlr.press/v31/scott13a.pdf
UR  - https://proceedings.mlr.press/v31/scott13a.html
AB  - We consider the problem of evaluating the predictive log likelihood of a previously unseen document under a topic model. This task arises when cross-validating for a model hyperparameter, when testing a model on a hold-out set, and when comparing the performance of different fitting strategies. Yet it is known to be very challenging, as it is equivalent to estimating a marginal likelihood in Bayesian model selection. We propose a fast algorithm for approximating this likelihood, one whose computational cost is linear both in document length and in the number of topics. The method is a first-order approximation to the algorithm of Carvalho et al. (2010), and can also be interpreted as a one-particle, Rao-Blackwellized version of the "left-to-right" method of Wallach et al. (2009). On our test examples, the proposed method gives similar answers to these other methods, but at lower computational cost.
ER  -

APA


Scott, J., & Baldridge, J. (2013). A recursive estimate for the predictive likelihood in a topic model. Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 31:527-535. Available from https://proceedings.mlr.press/v31/scott13a.html.
