Latent Normalizing Flows for Discrete Sequences

Zachary Ziegler; Alexander Rush

Proceedings of the 36th International Conference on Machine Learning, PMLR 97:7673-7682, 2019.

Abstract

Normalizing flows are a powerful class of generative models for continuous random variables, showing both strong model flexibility and the potential for non-autoregressive generation. These benefits are also desired when modeling discrete random variables such as text, but directly applying normalizing flows to discrete sequences poses significant additional challenges. We propose a VAE-based generative model which jointly learns a normalizing flow-based distribution in the latent space and a stochastic mapping to an observed discrete space. In this setting, we find that it is crucial for the flow-based distribution to be highly multimodal. To capture this property, we propose several normalizing flow architectures to maximize model flexibility. Experiments consider common discrete sequence tasks of character-level language modeling and polyphonic music generation. Our results indicate that an autoregressive flow-based model can match the performance of a comparable autoregressive baseline, and a non-autoregressive flow-based model can improve generation speed with a penalty to performance.
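The abstract gives no equations, so the following is only a minimal sketch of the change-of-variables rule that flow-based densities in general rely on. It is not the architecture proposed in the paper. It uses a single one-dimensional affine flow over a standard normal base distribution, and all function names are hypothetical, chosen for illustration.

```python
import math


def standard_normal_log_pdf(eps: float) -> float:
    """Log density of the base distribution N(0, 1)."""
    return -0.5 * (eps * eps + math.log(2.0 * math.pi))


def affine_flow_log_prob(z: float, shift: float, log_scale: float) -> float:
    """Log density of z under the flow z = shift + exp(log_scale) * eps.

    Change of variables: log p(z) = log p_base(eps) + log |d eps / d z|,
    where eps = (z - shift) / exp(log_scale), so log |d eps / d z| = -log_scale.
    """
    eps = (z - shift) * math.exp(-log_scale)
    return standard_normal_log_pdf(eps) - log_scale


def gaussian_log_pdf(z: float, mean: float, std: float) -> float:
    """Closed-form log density of N(mean, std^2), used only as a reference."""
    return (
        -0.5 * ((z - mean) / std) ** 2
        - math.log(std)
        - 0.5 * math.log(2.0 * math.pi)
    )


# An affine flow of a standard normal is itself Gaussian, so both routes
# should give the same log density up to floating-point error.
flow_lp = affine_flow_log_prob(0.3, shift=1.0, log_scale=math.log(2.0))
ref_lp = gaussian_log_pdf(0.3, mean=1.0, std=2.0)
```

A single affine flow like this one stays unimodal. The abstract's point that the latent distribution must be highly multimodal is the motivation for the more flexible flow architectures the paper proposes.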

Cite this Paper

BibTeX

@InProceedings{pmlr-v97-ziegler19a,
  title = 	 {Latent Normalizing Flows for Discrete Sequences},
  author =       {Ziegler, Zachary and Rush, Alexander},
  booktitle = 	 {Proceedings of the 36th International Conference on Machine Learning},
  pages = 	 {7673--7682},
  year = 	 {2019},
  editor = 	 {Chaudhuri, Kamalika and Salakhutdinov, Ruslan},
  volume = 	 {97},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {09--15 Jun},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v97/ziegler19a/ziegler19a.pdf},
  url = 	 {https://proceedings.mlr.press/v97/ziegler19a.html},
  abstract = 	 {Normalizing flows are a powerful class of generative models for continuous random variables, showing both strong model flexibility and the potential for non-autoregressive generation. These benefits are also desired when modeling discrete random variables such as text, but directly applying normalizing flows to discrete sequences poses significant additional challenges. We propose a VAE-based generative model which jointly learns a normalizing flow-based distribution in the latent space and a stochastic mapping to an observed discrete space. In this setting, we find that it is crucial for the flow-based distribution to be highly multimodal. To capture this property, we propose several normalizing flow architectures to maximize model flexibility. Experiments consider common discrete sequence tasks of character-level language modeling and polyphonic music generation. Our results indicate that an autoregressive flow-based model can match the performance of a comparable autoregressive baseline, and a non-autoregressive flow-based model can improve generation speed with a penalty to performance.}
}

Endnote

%0 Conference Paper
%T Latent Normalizing Flows for Discrete Sequences
%A Zachary Ziegler
%A Alexander Rush
%B Proceedings of the 36th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2019
%E Kamalika Chaudhuri
%E Ruslan Salakhutdinov
%F pmlr-v97-ziegler19a
%I PMLR
%P 7673--7682
%U https://proceedings.mlr.press/v97/ziegler19a.html
%V 97
%X Normalizing flows are a powerful class of generative models for continuous random variables, showing both strong model flexibility and the potential for non-autoregressive generation. These benefits are also desired when modeling discrete random variables such as text, but directly applying normalizing flows to discrete sequences poses significant additional challenges. We propose a VAE-based generative model which jointly learns a normalizing flow-based distribution in the latent space and a stochastic mapping to an observed discrete space. In this setting, we find that it is crucial for the flow-based distribution to be highly multimodal. To capture this property, we propose several normalizing flow architectures to maximize model flexibility. Experiments consider common discrete sequence tasks of character-level language modeling and polyphonic music generation. Our results indicate that an autoregressive flow-based model can match the performance of a comparable autoregressive baseline, and a non-autoregressive flow-based model can improve generation speed with a penalty to performance.

APA

Ziegler, Z. & Rush, A. (2019). Latent Normalizing Flows for Discrete Sequences. Proceedings of the 36th International Conference on Machine Learning, in Proceedings of Machine Learning Research 97:7673-7682. Available from https://proceedings.mlr.press/v97/ziegler19a.html.
