Convolutional Poisson Gamma Belief Network

Chaojie Wang, Bo Chen, Sucheng Xiao, Mingyuan Zhou
Proceedings of the 36th International Conference on Machine Learning, PMLR 97:6515-6525, 2019.

Abstract

For text analysis, one often resorts to a lossy representation that either completely ignores word order or embeds each word as a low-dimensional dense feature vector. In this paper, we propose convolutional Poisson factor analysis (CPFA) that directly operates on a lossless representation that processes the words in each document as a sequence of high-dimensional one-hot vectors. To boost its performance, we further propose the convolutional Poisson gamma belief network (CPGBN) that couples CPFA with the gamma belief network via a novel probabilistic pooling layer. CPFA forms words into phrases and captures very specific phrase-level topics, and CPGBN further builds a hierarchy of increasingly more general phrase-level topics. For efficient inference, we develop both a Gibbs sampler and a Weibull distribution based convolutional variational auto-encoder. Experimental results demonstrate that CPGBN can extract high-quality text latent representations that capture the word order information, and hence can be leveraged as a building block to enrich a wide variety of existing latent variable models that ignore word order.
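As a rough illustration (not the authors' code), the two ingredients the abstract names can be sketched in a few lines: the lossless representation that keeps each word as a high-dimensional one-hot vector, and the standard inverse-CDF Weibull reparameterization that a Weibull-based variational auto-encoder relies on. The vocabulary, document, and parameter values below are made-up toy assumptions.

```python
import numpy as np

# Toy vocabulary and document (illustrative only).
vocab = ["deep", "topic", "model", "word", "order"]
V = len(vocab)
doc = ["topic", "model", "word", "order"]

# Lossless representation: the document becomes a V x doc_length
# matrix of one-hot columns, preserving word order.
X = np.zeros((V, len(doc)))
for j, w in enumerate(doc):
    X[vocab.index(w), j] = 1.0

# Weibull reparameterization: for U ~ Uniform(0, 1),
# z = lam * (-log(1 - U))**(1/k) follows Weibull(k, lam),
# so samples are differentiable w.r.t. shape k and scale lam.
rng = np.random.default_rng(0)
k, lam = 2.0, 1.5  # illustrative shape and scale
u = rng.uniform(size=1000)
z = lam * (-np.log(1.0 - u)) ** (1.0 / k)
```

Here every column of `X` sums to one (exactly one active word per position), and all entries of `z` are positive, as required of the gamma-like latent variables the Weibull approximates.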

Cite this Paper


BibTeX
@InProceedings{pmlr-v97-wang19b,
  title     = {Convolutional Poisson Gamma Belief Network},
  author    = {Wang, Chaojie and Chen, Bo and Xiao, Sucheng and Zhou, Mingyuan},
  booktitle = {Proceedings of the 36th International Conference on Machine Learning},
  pages     = {6515--6525},
  year      = {2019},
  editor    = {Chaudhuri, Kamalika and Salakhutdinov, Ruslan},
  volume    = {97},
  series    = {Proceedings of Machine Learning Research},
  month     = {09--15 Jun},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v97/wang19b/wang19b.pdf},
  url       = {https://proceedings.mlr.press/v97/wang19b.html},
  abstract  = {For text analysis, one often resorts to a lossy representation that either completely ignores word order or embeds each word as a low-dimensional dense feature vector. In this paper, we propose convolutional Poisson factor analysis (CPFA) that directly operates on a lossless representation that processes the words in each document as a sequence of high-dimensional one-hot vectors. To boost its performance, we further propose the convolutional Poisson gamma belief network (CPGBN) that couples CPFA with the gamma belief network via a novel probabilistic pooling layer. CPFA forms words into phrases and captures very specific phrase-level topics, and CPGBN further builds a hierarchy of increasingly more general phrase-level topics. For efficient inference, we develop both a Gibbs sampler and a Weibull distribution based convolutional variational auto-encoder. Experimental results demonstrate that CPGBN can extract high-quality text latent representations that capture the word order information, and hence can be leveraged as a building block to enrich a wide variety of existing latent variable models that ignore word order.}
}
Endnote
%0 Conference Paper
%T Convolutional Poisson Gamma Belief Network
%A Chaojie Wang
%A Bo Chen
%A Sucheng Xiao
%A Mingyuan Zhou
%B Proceedings of the 36th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2019
%E Kamalika Chaudhuri
%E Ruslan Salakhutdinov
%F pmlr-v97-wang19b
%I PMLR
%P 6515--6525
%U https://proceedings.mlr.press/v97/wang19b.html
%V 97
%X For text analysis, one often resorts to a lossy representation that either completely ignores word order or embeds each word as a low-dimensional dense feature vector. In this paper, we propose convolutional Poisson factor analysis (CPFA) that directly operates on a lossless representation that processes the words in each document as a sequence of high-dimensional one-hot vectors. To boost its performance, we further propose the convolutional Poisson gamma belief network (CPGBN) that couples CPFA with the gamma belief network via a novel probabilistic pooling layer. CPFA forms words into phrases and captures very specific phrase-level topics, and CPGBN further builds a hierarchy of increasingly more general phrase-level topics. For efficient inference, we develop both a Gibbs sampler and a Weibull distribution based convolutional variational auto-encoder. Experimental results demonstrate that CPGBN can extract high-quality text latent representations that capture the word order information, and hence can be leveraged as a building block to enrich a wide variety of existing latent variable models that ignore word order.
APA
Wang, C., Chen, B., Xiao, S. & Zhou, M. (2019). Convolutional Poisson Gamma Belief Network. Proceedings of the 36th International Conference on Machine Learning, in Proceedings of Machine Learning Research 97:6515-6525. Available from https://proceedings.mlr.press/v97/wang19b.html.