Latent feature regression for multivariate count data

Arto Klami, Abhishek Tripathi, Johannes Sirola, Lauri Väre, Frederic Roulland
Proceedings of the Eighteenth International Conference on Artificial Intelligence and Statistics, PMLR 38:462-470, 2015.

Abstract

We consider the problem of regression on multivariate count data and present a Gibbs sampler for a latent feature regression model suitable for both under- and overdispersed response variables. The model learns count-valued latent features conditional on arbitrary covariates, modeling them as negative binomial variables, and maps them into the dependent count-valued observations using a Dirichlet-multinomial distribution. From another viewpoint, the model can be seen as a generalization of a specific topic model for scenarios where we are interested in generating the actual counts of observations and not just their relative frequencies and co-occurrences. The model is demonstrated on a smart traffic application where the task is to predict public transportation volume for unknown locations based on a characterization of the close-by services and venues.
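The two-stage structure described in the abstract can be illustrated with a small forward simulation. This is only a sketch under stated assumptions, not the authors' model specification or their Gibbs sampler: the abstract does not give the parameterization, so the log-linear link from covariates to the negative binomial mean, the per-feature dispersion `r`, the concentration matrix `alpha`, and the function name `sample_counts` are all hypothetical choices made for illustration.

```python
import numpy as np


def sample_counts(X, W, r, alpha, rng):
    """Forward-simulate counts from an assumed two-stage count model.

    X     : (n, p) covariates
    W     : (p, k) regression weights (assumed log-linear link)
    r     : (k,)   negative binomial dispersion per latent feature (assumed)
    alpha : (k, d) Dirichlet concentrations per latent feature (assumed)
    """
    n = X.shape[0]
    k, d = alpha.shape

    # Stage 1: count-valued latent features, negative binomial given covariates.
    mean = np.exp(X @ W)                 # (n, k) positive means
    p = r / (r + mean)                   # numpy's success probability; gives E[Z] = mean
    Z = rng.negative_binomial(r, p)      # (n, k) latent counts

    # Stage 2: spread each latent count over the d outputs with a
    # Dirichlet-multinomial draw (Dirichlet proportions, then multinomial).
    Y = np.zeros((n, d), dtype=np.int64)
    for j in range(k):
        theta = rng.dirichlet(alpha[j], size=n)      # (n, d) proportions
        for i in range(n):
            Y[i] += rng.multinomial(Z[i, j], theta[i])
    return Z, Y


# Toy usage with made-up sizes: 5 samples, 3 covariates, 2 latent features, 4 outputs.
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))
W = rng.normal(scale=0.3, size=(3, 2))
r = np.array([2.0, 5.0])
alpha = np.ones((2, 4))
Z, Y = sample_counts(X, W, r, alpha, rng)
```

In this sketch every observed count is allocated from some latent feature, so each row of `Y` sums to the same total as the corresponding row of `Z`. The negative binomial stage therefore sets the actual volume of counts, while the Dirichlet-multinomial stage only decides how that volume is distributed across the output dimensions.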

Cite this Paper


BibTeX
@InProceedings{pmlr-v38-klami15,
  title = {{Latent feature regression for multivariate count data}},
  author = {Klami, Arto and Tripathi, Abhishek and Sirola, Johannes and Väre, Lauri and Roulland, Frederic},
  booktitle = {Proceedings of the Eighteenth International Conference on Artificial Intelligence and Statistics},
  pages = {462--470},
  year = {2015},
  editor = {Lebanon, Guy and Vishwanathan, S. V. N.},
  volume = {38},
  series = {Proceedings of Machine Learning Research},
  address = {San Diego, California, USA},
  month = {09--12 May},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v38/klami15.pdf},
  url = {https://proceedings.mlr.press/v38/klami15.html},
  abstract = {We consider the problem of regression on multivariate count data and present a Gibbs sampler for a latent feature regression model suitable for both under- and overdispersed response variables. The model learns count-valued latent features conditional on arbitrary covariates, modeling them as negative binomial variables, and maps them into the dependent count-valued observations using a Dirichlet-multinomial distribution. From another viewpoint, the model can be seen as a generalization of a specific topic model for scenarios where we are interested in generating the actual counts of observations and not just their relative frequencies and co-occurrences. The model is demonstrated on a smart traffic application where the task is to predict public transportation volume for unknown locations based on a characterization of the close-by services and venues.}
}
Endnote
%0 Conference Paper
%T Latent feature regression for multivariate count data
%A Arto Klami
%A Abhishek Tripathi
%A Johannes Sirola
%A Lauri Väre
%A Frederic Roulland
%B Proceedings of the Eighteenth International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2015
%E Guy Lebanon
%E S. V. N. Vishwanathan
%F pmlr-v38-klami15
%I PMLR
%P 462--470
%U https://proceedings.mlr.press/v38/klami15.html
%V 38
%X We consider the problem of regression on multivariate count data and present a Gibbs sampler for a latent feature regression model suitable for both under- and overdispersed response variables. The model learns count-valued latent features conditional on arbitrary covariates, modeling them as negative binomial variables, and maps them into the dependent count-valued observations using a Dirichlet-multinomial distribution. From another viewpoint, the model can be seen as a generalization of a specific topic model for scenarios where we are interested in generating the actual counts of observations and not just their relative frequencies and co-occurrences. The model is demonstrated on a smart traffic application where the task is to predict public transportation volume for unknown locations based on a characterization of the close-by services and venues.
RIS
TY  - CPAPER
TI  - Latent feature regression for multivariate count data
AU  - Arto Klami
AU  - Abhishek Tripathi
AU  - Johannes Sirola
AU  - Lauri Väre
AU  - Frederic Roulland
BT  - Proceedings of the Eighteenth International Conference on Artificial Intelligence and Statistics
DA  - 2015/02/21
ED  - Guy Lebanon
ED  - S. V. N. Vishwanathan
ID  - pmlr-v38-klami15
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 38
SP  - 462
EP  - 470
L1  - http://proceedings.mlr.press/v38/klami15.pdf
UR  - https://proceedings.mlr.press/v38/klami15.html
AB  - We consider the problem of regression on multivariate count data and present a Gibbs sampler for a latent feature regression model suitable for both under- and overdispersed response variables. The model learns count-valued latent features conditional on arbitrary covariates, modeling them as negative binomial variables, and maps them into the dependent count-valued observations using a Dirichlet-multinomial distribution. From another viewpoint, the model can be seen as a generalization of a specific topic model for scenarios where we are interested in generating the actual counts of observations and not just their relative frequencies and co-occurrences. The model is demonstrated on a smart traffic application where the task is to predict public transportation volume for unknown locations based on a characterization of the close-by services and venues.
ER  -
APA
Klami, A., Tripathi, A., Sirola, J., Väre, L. & Roulland, F. (2015). Latent feature regression for multivariate count data. Proceedings of the Eighteenth International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 38:462-470. Available from https://proceedings.mlr.press/v38/klami15.html.
