Latent Normalizing Flows for Discrete Sequences

Zachary Ziegler, Alexander Rush
Proceedings of the 36th International Conference on Machine Learning, PMLR 97:7673-7682, 2019.

Abstract

Normalizing flows are a powerful class of generative models for continuous random variables, showing both strong model flexibility and the potential for non-autoregressive generation. These benefits are also desired when modeling discrete random variables such as text, but directly applying normalizing flows to discrete sequences poses significant additional challenges. We propose a VAE-based generative model which jointly learns a normalizing flow-based distribution in the latent space and a stochastic mapping to an observed discrete space. In this setting, we find that it is crucial for the flow-based distribution to be highly multimodal. To capture this property, we propose several normalizing flow architectures to maximize model flexibility. Experiments consider common discrete sequence tasks of character-level language modeling and polyphonic music generation. Our results indicate that an autoregressive flow-based model can match the performance of a comparable autoregressive baseline, and a non-autoregressive flow-based model can improve generation speed with a penalty to performance.
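To make the abstract's high-level description concrete, below is a minimal, hypothetical PyTorch sketch of the general idea: a sequence VAE whose prior over per-timestep latent vectors is an autoregressive affine normalizing flow, paired with a per-timestep categorical emission and a Gaussian inference network. All module names, dimensions, and the single affine flow layer are illustrative assumptions for brevity, not the specific architectures proposed in the paper.

```python
# Illustrative sketch only: a VAE for discrete sequences whose prior over
# per-timestep latent vectors is an autoregressive affine normalizing flow.
import math
import torch
import torch.nn as nn
import torch.nn.functional as F


class AutoregressiveAffinePrior(nn.Module):
    """p(z) defined by z_t = mu(z_<t) + sigma(z_<t) * eps_t, eps_t ~ N(0, I).

    Density evaluation inverts the affine map at each timestep, conditioning
    on previous latents with an LSTM, and applies the change of variables.
    """

    def __init__(self, z_dim, hidden):
        super().__init__()
        self.lstm = nn.LSTM(z_dim, hidden, batch_first=True)
        self.proj = nn.Linear(hidden, 2 * z_dim)

    def log_prob(self, z):                      # z: (B, T, z_dim)
        # Condition timestep t on z_{<t} by shifting the sequence right.
        z_prev = F.pad(z, (0, 0, 1, 0))[:, :-1]
        h, _ = self.lstm(z_prev)
        mu, log_sigma = self.proj(h).chunk(2, dim=-1)
        eps = (z - mu) * torch.exp(-log_sigma)  # inverse affine transform
        base = -0.5 * (eps ** 2 + math.log(2 * math.pi))
        # log|det d eps / d z| = -sum(log_sigma)
        return (base - log_sigma).sum(dim=(1, 2))


class LatentFlowSeqVAE(nn.Module):
    def __init__(self, vocab, z_dim=32, hidden=128):
        super().__init__()
        self.prior = AutoregressiveAffinePrior(z_dim, hidden)
        self.embed = nn.Embedding(vocab, hidden)
        self.enc = nn.LSTM(hidden, hidden, batch_first=True, bidirectional=True)
        self.enc_out = nn.Linear(2 * hidden, 2 * z_dim)
        self.dec = nn.Linear(z_dim, vocab)      # emission p(x_t | z_t)

    def elbo(self, x):                          # x: (B, T) token ids
        h, _ = self.enc(self.embed(x))
        mu, log_sigma = self.enc_out(h).chunk(2, dim=-1)
        z = mu + torch.exp(log_sigma) * torch.randn_like(mu)  # reparameterize
        rec = -F.cross_entropy(self.dec(z).transpose(1, 2), x,
                               reduction="none").sum(-1)
        log_q = (-0.5 * ((z - mu) / torch.exp(log_sigma)) ** 2
                 - log_sigma - 0.5 * math.log(2 * math.pi)).sum(dim=(1, 2))
        # ELBO = E_q[log p(x|z)] + E_q[log p(z)] - E_q[log q(z|x)]
        return rec + self.prior.log_prob(z) - log_q


model = LatentFlowSeqVAE(vocab=50)
x = torch.randint(0, 50, (4, 16))               # toy batch of token ids
loss = -model.elbo(x).mean()
loss.backward()
```

In this toy version the flow prior captures dependencies across timesteps while the emission factorizes over positions; the paper's point is that the latent flow must be flexible (highly multimodal) enough to carry that sequential structure.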

Cite this Paper


BibTeX
@InProceedings{pmlr-v97-ziegler19a,
  title     = {Latent Normalizing Flows for Discrete Sequences},
  author    = {Ziegler, Zachary and Rush, Alexander},
  booktitle = {Proceedings of the 36th International Conference on Machine Learning},
  pages     = {7673--7682},
  year      = {2019},
  editor    = {Chaudhuri, Kamalika and Salakhutdinov, Ruslan},
  volume    = {97},
  series    = {Proceedings of Machine Learning Research},
  month     = {09--15 Jun},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v97/ziegler19a/ziegler19a.pdf},
  url       = {https://proceedings.mlr.press/v97/ziegler19a.html},
  abstract  = {Normalizing flows are a powerful class of generative models for continuous random variables, showing both strong model flexibility and the potential for non-autoregressive generation. These benefits are also desired when modeling discrete random variables such as text, but directly applying normalizing flows to discrete sequences poses significant additional challenges. We propose a VAE-based generative model which jointly learns a normalizing flow-based distribution in the latent space and a stochastic mapping to an observed discrete space. In this setting, we find that it is crucial for the flow-based distribution to be highly multimodal. To capture this property, we propose several normalizing flow architectures to maximize model flexibility. Experiments consider common discrete sequence tasks of character-level language modeling and polyphonic music generation. Our results indicate that an autoregressive flow-based model can match the performance of a comparable autoregressive baseline, and a non-autoregressive flow-based model can improve generation speed with a penalty to performance.}
}
Endnote
%0 Conference Paper
%T Latent Normalizing Flows for Discrete Sequences
%A Zachary Ziegler
%A Alexander Rush
%B Proceedings of the 36th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2019
%E Kamalika Chaudhuri
%E Ruslan Salakhutdinov
%F pmlr-v97-ziegler19a
%I PMLR
%P 7673--7682
%U https://proceedings.mlr.press/v97/ziegler19a.html
%V 97
%X Normalizing flows are a powerful class of generative models for continuous random variables, showing both strong model flexibility and the potential for non-autoregressive generation. These benefits are also desired when modeling discrete random variables such as text, but directly applying normalizing flows to discrete sequences poses significant additional challenges. We propose a VAE-based generative model which jointly learns a normalizing flow-based distribution in the latent space and a stochastic mapping to an observed discrete space. In this setting, we find that it is crucial for the flow-based distribution to be highly multimodal. To capture this property, we propose several normalizing flow architectures to maximize model flexibility. Experiments consider common discrete sequence tasks of character-level language modeling and polyphonic music generation. Our results indicate that an autoregressive flow-based model can match the performance of a comparable autoregressive baseline, and a non-autoregressive flow-based model can improve generation speed with a penalty to performance.
APA
Ziegler, Z. & Rush, A. (2019). Latent Normalizing Flows for Discrete Sequences. Proceedings of the 36th International Conference on Machine Learning, in Proceedings of Machine Learning Research 97:7673-7682. Available from https://proceedings.mlr.press/v97/ziegler19a.html.