On the challenges of learning with inference networks on sparse, high-dimensional data

Rahul Krishnan, Dawen Liang, Matthew Hoffman
Proceedings of the Twenty-First International Conference on Artificial Intelligence and Statistics, PMLR 84:143-151, 2018.

Abstract

We study parameter estimation in Nonlinear Factor Analysis (NFA) where the generative model is parameterized by a deep neural network. Recent work has focused on learning such models using inference (or recognition) networks; we identify a crucial problem when modeling large, sparse, high-dimensional datasets – underfitting. We study the extent of underfitting, highlighting that its severity increases with the sparsity of the data. We propose methods to tackle it via iterative optimization inspired by stochastic variational inference (Hoffman et al., 2013) and improvements in the data representation used for inference. The proposed techniques drastically improve the ability of these powerful models to fit sparse data, achieving state-of-the-art results on a benchmark text-count dataset and excellent results on the task of top-N recommendation.
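The refinement idea the abstract describes (initialize local variational parameters with an inference network, then improve them by iterative optimization of the ELBO, in the spirit of stochastic variational inference) can be illustrated on a toy linear factor-analysis model. This is a minimal numpy sketch, not the paper's implementation: the model, the fixed linear "encoder" `A`, and all dimensions are illustrative assumptions, and the Gaussian structure makes the ELBO gradient analytic.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear factor-analysis model: x = W z + eps, z ~ N(0, I), eps ~ N(0, s2 I).
# (The paper's NFA uses a deep network decoder; linear W keeps gradients analytic.)
D, K, s2 = 20, 5, 1.0
W = rng.normal(size=(D, K))
x = W @ rng.normal(size=K) + np.sqrt(s2) * rng.normal(size=D)

# Stand-in for an inference network: a fixed (hypothetical) linear encoder that
# produces the amortized initialization of the variational mean for q(z) = N(mu, I).
A = rng.normal(scale=0.1, size=(K, D))
mu = A @ x

def elbo_grad(mu):
    # Gradient of E_q[log p(x|z)] - KL(q(z) || p(z)) w.r.t. mu, for this
    # Gaussian model with q(z) = N(mu, I): W^T (x - W mu)/s2 - mu.
    return W.T @ (x - W @ mu) / s2 - mu

# Iterative refinement of the local variational parameters (the SVI-style inner
# loop): plain gradient ascent with a step size safe for this concave objective.
lr = 1.0 / (np.linalg.norm(W, 2) ** 2 / s2 + 1.0)
for _ in range(1000):
    mu = mu + lr * elbo_grad(mu)
```

After refinement, `mu` is a much better variational mean than the amortized initialization `A @ x` alone; the paper's point is that on sparse, high-dimensional data this gap (and hence underfitting under purely amortized inference) can be large.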

Cite this Paper


BibTeX
@InProceedings{pmlr-v84-krishnan18a,
  title     = {On the challenges of learning with inference networks on sparse, high-dimensional data},
  author    = {Krishnan, Rahul and Liang, Dawen and Hoffman, Matthew},
  booktitle = {Proceedings of the Twenty-First International Conference on Artificial Intelligence and Statistics},
  pages     = {143--151},
  year      = {2018},
  editor    = {Storkey, Amos and Perez-Cruz, Fernando},
  volume    = {84},
  series    = {Proceedings of Machine Learning Research},
  month     = {09--11 Apr},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v84/krishnan18a/krishnan18a.pdf},
  url       = {https://proceedings.mlr.press/v84/krishnan18a.html},
  abstract  = {We study parameter estimation in Nonlinear Factor Analysis (NFA) where the generative model is parameterized by a deep neural network. Recent work has focused on learning such models using inference (or recognition) networks; we identify a crucial problem when modeling large, sparse, high-dimensional datasets – underfitting. We study the extent of underfitting, highlighting that its severity increases with the sparsity of the data. We propose methods to tackle it via iterative optimization inspired by stochastic variational inference (Hoffman et al., 2013) and improvements in the data representation used for inference. The proposed techniques drastically improve the ability of these powerful models to fit sparse data, achieving state-of-the-art results on a benchmark text-count dataset and excellent results on the task of top-N recommendation.}
}
Endnote
%0 Conference Paper
%T On the challenges of learning with inference networks on sparse, high-dimensional data
%A Rahul Krishnan
%A Dawen Liang
%A Matthew Hoffman
%B Proceedings of the Twenty-First International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2018
%E Amos Storkey
%E Fernando Perez-Cruz
%F pmlr-v84-krishnan18a
%I PMLR
%P 143--151
%U https://proceedings.mlr.press/v84/krishnan18a.html
%V 84
%X We study parameter estimation in Nonlinear Factor Analysis (NFA) where the generative model is parameterized by a deep neural network. Recent work has focused on learning such models using inference (or recognition) networks; we identify a crucial problem when modeling large, sparse, high-dimensional datasets – underfitting. We study the extent of underfitting, highlighting that its severity increases with the sparsity of the data. We propose methods to tackle it via iterative optimization inspired by stochastic variational inference (Hoffman et al., 2013) and improvements in the data representation used for inference. The proposed techniques drastically improve the ability of these powerful models to fit sparse data, achieving state-of-the-art results on a benchmark text-count dataset and excellent results on the task of top-N recommendation.
APA
Krishnan, R., Liang, D. & Hoffman, M. (2018). On the challenges of learning with inference networks on sparse, high-dimensional data. Proceedings of the Twenty-First International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 84:143-151. Available from https://proceedings.mlr.press/v84/krishnan18a.html.

Related Material