Latent feature regression for multivariate count data


Arto Klami, Abhishek Tripathi, Johannes Sirola, Lauri Väre, Frederic Roulland
Proceedings of the Eighteenth International Conference on Artificial Intelligence and Statistics, PMLR 38:462-470, 2015.


We consider the problem of regression on multivariate count data and present a Gibbs sampler for a latent feature regression model suitable for both under- and overdispersed response variables. The model learns count-valued latent features conditional on arbitrary covariates, modeling them as negative binomial variables, and maps them into the dependent count-valued observations using a Dirichlet-multinomial distribution. From another viewpoint, the model can be seen as a generalization of a specific topic model for scenarios where we are interested in generating the actual counts of observations and not just their relative frequencies and co-occurrences. The model is demonstrated on a smart traffic application where the task is to predict public transportation volume for unknown locations based on a characterization of the close-by services and venues.
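
The two-stage structure described above (negative binomial latent counts conditional on covariates, then a Dirichlet-multinomial mapping to the observed counts) can be illustrated with a small forward-sampling sketch. This is not the authors' implementation and not the Gibbs sampler: the log link, the shared dispersion `r`, the weight matrix `W`, the concentration matrix `alpha`, and the function name `sample_counts` are all assumptions made for illustration, since the abstract does not specify the parameterization.

```python
import numpy as np

def sample_counts(X, W, r, alpha, rng):
    """Forward-sample latent and observed counts (hypothetical parameterization).

    X:     (n, p) covariates
    W:     (p, k) regression weights, assumed log link
    r:     positive dispersion of the negative binomial (assumed shared)
    alpha: (k, d) positive Dirichlet concentrations, one row per latent feature
    """
    n = X.shape[0]
    k, d = alpha.shape
    # Assumption: covariates set the mean of each latent feature via a log link.
    mean = np.exp(X @ W)                       # shape (n, k)
    # NumPy's negative_binomial(n, p) has mean n * (1 - p) / p,
    # so p = r / (r + mean) gives the desired mean.
    p = r / (r + mean)
    Z = rng.negative_binomial(r, p)            # latent counts, shape (n, k)
    Y = np.zeros((n, d), dtype=np.int64)
    for i in range(n):
        for j in range(k):
            if Z[i, j] > 0:
                # Dirichlet-multinomial: draw proportions, then split the count.
                theta = rng.dirichlet(alpha[j])
                Y[i] += rng.multinomial(Z[i, j], theta)
    return Z, Y

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))
W = 0.1 * rng.normal(size=(3, 2))
alpha = np.ones((2, 4))
Z, Y = sample_counts(X, W, r=2.0, alpha=alpha, rng=rng)
```

In this sketch each latent count is distributed in full over the output dimensions, so the row totals of `Y` equal the row totals of `Z`. That is one way to read the abstract's point about generating actual counts rather than only relative frequencies.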