Hierarchical Text Generation and Planning for Strategic Dialogue

Denis Yarats, Mike Lewis
Proceedings of the 35th International Conference on Machine Learning, PMLR 80:5591-5599, 2018.

Abstract

End-to-end models for goal-orientated dialogue are challenging to train, because linguistic and strategic aspects are entangled in latent state vectors. We introduce an approach to learning representations of messages in dialogues by maximizing the likelihood of subsequent sentences and actions, which decouples the semantics of the dialogue utterance from its linguistic realization. We then use these latent sentence representations for hierarchical language generation, planning and reinforcement learning. Experiments show that our approach increases the end-task reward achieved by the model, improves the effectiveness of long-term planning using rollouts, and allows self-play reinforcement learning to improve decision making without diverging from human language. Our hierarchical latent-variable model outperforms previous work both linguistically and strategically.
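The abstract's core training signal can be stated concretely: a message's latent representation is learned not by reconstructing the message itself, but by predicting what follows it (the next utterance and the eventual action), which is what separates the semantics of an utterance from its surface realization. The PyTorch sketch below illustrates that objective in spirit only; the module names, sizes, and the simplification of predicting a single next token are illustrative assumptions, not the authors' implementation.

import torch
import torch.nn as nn

VOCAB, N_ACTIONS, DIM = 1000, 10, 64

class MessageEncoder(nn.Module):
    """Encodes a message into a latent z trained to predict what comes next,
    rather than to reconstruct the message's own words."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, DIM)
        self.rnn = nn.GRU(DIM, DIM, batch_first=True)
        self.next_token_head = nn.Linear(DIM, VOCAB)    # predicts the reply
        self.action_head = nn.Linear(DIM, N_ACTIONS)    # predicts the final action

    def forward(self, msg):
        h, _ = self.rnn(self.embed(msg))
        z = h[:, -1]                 # latent summary of the whole message
        return self.next_token_head(z), self.action_head(z), z

model = MessageEncoder()
opt = torch.optim.Adam(model.parameters())
xent = nn.CrossEntropyLoss()

# Toy batch: a message, the first token of the reply, and the final action.
msg = torch.randint(0, VOCAB, (8, 12))
reply_tok = torch.randint(0, VOCAB, (8,))
action = torch.randint(0, N_ACTIONS, (8,))

tok_logits, act_logits, _ = model(msg)
loss = xent(tok_logits, reply_tok) + xent(act_logits, action)
opt.zero_grad()
loss.backward()
opt.step()

Because z is supervised by downstream signals rather than by the message's own tokens, two paraphrases of the same proposal are pushed toward the same latent, which is what makes such representations usable for hierarchical generation, rollout-based planning, and self-play reinforcement learning without drifting away from human language.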

Cite this Paper


BibTeX
@InProceedings{pmlr-v80-yarats18a,
  title     = {Hierarchical Text Generation and Planning for Strategic Dialogue},
  author    = {Yarats, Denis and Lewis, Mike},
  booktitle = {Proceedings of the 35th International Conference on Machine Learning},
  pages     = {5591--5599},
  year      = {2018},
  editor    = {Dy, Jennifer and Krause, Andreas},
  volume    = {80},
  series    = {Proceedings of Machine Learning Research},
  month     = {10--15 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v80/yarats18a/yarats18a.pdf},
  url       = {https://proceedings.mlr.press/v80/yarats18a.html},
  abstract  = {End-to-end models for goal-orientated dialogue are challenging to train, because linguistic and strategic aspects are entangled in latent state vectors. We introduce an approach to learning representations of messages in dialogues by maximizing the likelihood of subsequent sentences and actions, which decouples the semantics of the dialogue utterance from its linguistic realization. We then use these latent sentence representations for hierarchical language generation, planning and reinforcement learning. Experiments show that our approach increases the end-task reward achieved by the model, improves the effectiveness of long-term planning using rollouts, and allows self-play reinforcement learning to improve decision making without diverging from human language. Our hierarchical latent-variable model outperforms previous work both linguistically and strategically.}
}
EndNote
%0 Conference Paper
%T Hierarchical Text Generation and Planning for Strategic Dialogue
%A Denis Yarats
%A Mike Lewis
%B Proceedings of the 35th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2018
%E Jennifer Dy
%E Andreas Krause
%F pmlr-v80-yarats18a
%I PMLR
%P 5591--5599
%U https://proceedings.mlr.press/v80/yarats18a.html
%V 80
%X End-to-end models for goal-orientated dialogue are challenging to train, because linguistic and strategic aspects are entangled in latent state vectors. We introduce an approach to learning representations of messages in dialogues by maximizing the likelihood of subsequent sentences and actions, which decouples the semantics of the dialogue utterance from its linguistic realization. We then use these latent sentence representations for hierarchical language generation, planning and reinforcement learning. Experiments show that our approach increases the end-task reward achieved by the model, improves the effectiveness of long-term planning using rollouts, and allows self-play reinforcement learning to improve decision making without diverging from human language. Our hierarchical latent-variable model outperforms previous work both linguistically and strategically.
APA
Yarats, D., & Lewis, M. (2018). Hierarchical Text Generation and Planning for Strategic Dialogue. Proceedings of the 35th International Conference on Machine Learning, in Proceedings of Machine Learning Research 80:5591-5599. Available from https://proceedings.mlr.press/v80/yarats18a.html.
