Contrastive latent variable models for neural text generation

Zhiyang Teng, Chenhua Chen, Yan Zhang, Yue Zhang
Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence, PMLR 180:1928-1938, 2022.

Abstract

Deep latent variable models such as variational autoencoders and energy-based models are widely used for neural text generation. Most of them focus on matching the prior distribution with the posterior distribution of the latent variable for text reconstruction. In addition to instance-level reconstruction, this paper aims to integrate contrastive learning in the latent space, forcing the latent variables to learn high-level semantics by exploring inter-instance relationships. Experiments on various text generation benchmarks show the effectiveness of our proposed method. We also empirically show that our method can mitigate the posterior collapse issue for latent variable based text generation models.
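To make the high-level idea concrete, below is a minimal sketch of how a contrastive objective can be combined with an instance-level reconstruction objective in the latent space of a text VAE. It assumes a Gaussian-latent GRU variational autoencoder trained with the usual ELBO plus an InfoNCE-style term over two latent samples of the same sentence; the class and parameter names (ContrastiveVAE, info_nce, contrastive_weight, temperature) are illustrative assumptions and are not taken from the paper.

# Minimal sketch (not the authors' exact formulation): a Gaussian-latent text VAE
# whose training loss combines the standard ELBO with an InfoNCE-style contrastive
# term computed on latent codes.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ContrastiveVAE(nn.Module):
    def __init__(self, vocab_size, emb_dim=256, hidden_dim=512, latent_dim=32):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.encoder = nn.GRU(emb_dim, hidden_dim, batch_first=True)
        self.to_mu = nn.Linear(hidden_dim, latent_dim)
        self.to_logvar = nn.Linear(hidden_dim, latent_dim)
        self.decoder = nn.GRU(emb_dim + latent_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def encode(self, tokens):
        _, h = self.encoder(self.embed(tokens))        # h: (1, batch, hidden)
        h = h.squeeze(0)
        return self.to_mu(h), self.to_logvar(h)

    def reparameterize(self, mu, logvar):
        return mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)

    def decode(self, tokens, z):
        emb = self.embed(tokens)
        z_rep = z.unsqueeze(1).expand(-1, emb.size(1), -1)   # condition every step on z
        hidden, _ = self.decoder(torch.cat([emb, z_rep], dim=-1))
        return self.out(hidden)

def info_nce(z_a, z_b, temperature=0.1):
    # InfoNCE over two latent "views" of the same batch: matching rows are
    # positives, all other rows in the batch are negatives.
    z_a = F.normalize(z_a, dim=-1)
    z_b = F.normalize(z_b, dim=-1)
    logits = z_a @ z_b.t() / temperature               # (batch, batch) similarity matrix
    targets = torch.arange(z_a.size(0), device=z_a.device)
    return F.cross_entropy(logits, targets)

def training_loss(model, tokens, contrastive_weight=1.0):
    mu, logvar = model.encode(tokens)
    z1 = model.reparameterize(mu, logvar)               # two stochastic samples of z
    z2 = model.reparameterize(mu, logvar)               # serve as positive views
    logits = model.decode(tokens[:, :-1], z1)
    recon = F.cross_entropy(logits.reshape(-1, logits.size(-1)),
                            tokens[:, 1:].reshape(-1))
    kl = -0.5 * torch.mean(torch.sum(1 + logvar - mu.pow(2) - logvar.exp(), dim=-1))
    contrastive = info_nce(z1, z2)
    return recon + kl + contrastive_weight * contrastive

The contrastive term rewards latent codes that stay informative about which instance they came from, which is one way such an objective can counteract posterior collapse; the exact positive/negative construction and weighting used in the paper may differ from this sketch.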

Cite this Paper


BibTeX
@InProceedings{pmlr-v180-teng22a,
  title     = {Contrastive latent variable models for neural text generation},
  author    = {Teng, Zhiyang and Chen, Chenhua and Zhang, Yan and Zhang, Yue},
  booktitle = {Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence},
  pages     = {1928--1938},
  year      = {2022},
  editor    = {Cussens, James and Zhang, Kun},
  volume    = {180},
  series    = {Proceedings of Machine Learning Research},
  month     = {01--05 Aug},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v180/teng22a/teng22a.pdf},
  url       = {https://proceedings.mlr.press/v180/teng22a.html},
  abstract  = {Deep latent variable models such as variational autoencoders and energy-based models are widely used for neural text generation. Most of them focus on matching the prior distribution with the posterior distribution of the latent variable for text reconstruction. In addition to instance-level reconstruction, this paper aims to integrate contrastive learning in the latent space, forcing the latent variables to learn high-level semantics by exploring inter-instance relationships. Experiments on various text generation benchmarks show the effectiveness of our proposed method. We also empirically show that our method can mitigate the posterior collapse issue for latent variable based text generation models.}
}
Endnote
%0 Conference Paper
%T Contrastive latent variable models for neural text generation
%A Zhiyang Teng
%A Chenhua Chen
%A Yan Zhang
%A Yue Zhang
%B Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence
%C Proceedings of Machine Learning Research
%D 2022
%E James Cussens
%E Kun Zhang
%F pmlr-v180-teng22a
%I PMLR
%P 1928--1938
%U https://proceedings.mlr.press/v180/teng22a.html
%V 180
%X Deep latent variable models such as variational autoencoders and energy-based models are widely used for neural text generation. Most of them focus on matching the prior distribution with the posterior distribution of the latent variable for text reconstruction. In addition to instance-level reconstruction, this paper aims to integrate contrastive learning in the latent space, forcing the latent variables to learn high-level semantics by exploring inter-instance relationships. Experiments on various text generation benchmarks show the effectiveness of our proposed method. We also empirically show that our method can mitigate the posterior collapse issue for latent variable based text generation models.
APA
Teng, Z., Chen, C., Zhang, Y. & Zhang, Y. (2022). Contrastive latent variable models for neural text generation. Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence, in Proceedings of Machine Learning Research 180:1928-1938. Available from https://proceedings.mlr.press/v180/teng22a.html.
