Explainable and Discourse Topic-aware Neural Language Understanding

Yatin Chaudhary, Hinrich Schuetze, Pankaj Gupta
Proceedings of the 37th International Conference on Machine Learning, PMLR 119:1479-1488, 2020.

Abstract

Marrying topic models and language models exposes language understanding to a broader source of document-level context beyond sentences via topics. While introducing topical semantics in language models, existing approaches incorporate latent document topic proportions and ignore topical discourse in sentences of the document. This work extends the line of research by additionally introducing an explainable topic representation in language understanding, obtained from a set of key terms correspondingly for each latent topic of the proportion. Moreover, we retain sentence-topic association along with document-topic association by modeling topical discourse for every sentence in the document. We present a novel neural composite language modeling (NCLM) framework that exploits both the latent and explainable topics along with topical discourse at sentence-level in a joint learning framework of topic and language models. Experiments over a range of tasks such as language modeling, word sense disambiguation, document classification, retrieval and text generation demonstrate ability of the proposed model in improving language understanding.

Cite this Paper


BibTeX
@InProceedings{pmlr-v119-chaudhary20a,
  title = {Explainable and Discourse Topic-aware Neural Language Understanding},
  author = {Chaudhary, Yatin and Schuetze, Hinrich and Gupta, Pankaj},
  booktitle = {Proceedings of the 37th International Conference on Machine Learning},
  pages = {1479--1488},
  year = {2020},
  editor = {Hal Daumé III and Aarti Singh},
  volume = {119},
  series = {Proceedings of Machine Learning Research},
  month = {13--18 Jul},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v119/chaudhary20a/chaudhary20a.pdf},
  url = {http://proceedings.mlr.press/v119/chaudhary20a.html},
  abstract = {Marrying topic models and language models exposes language understanding to a broader source of document-level context beyond sentences via topics. While introducing topical semantics in language models, existing approaches incorporate latent document topic proportions and ignore topical discourse in sentences of the document. This work extends the line of research by additionally introducing an explainable topic representation in language understanding, obtained from a set of key terms correspondingly for each latent topic of the proportion. Moreover, we retain sentence-topic association along with document-topic association by modeling topical discourse for every sentence in the document. We present a novel neural composite language modeling (NCLM) framework that exploits both the latent and explainable topics along with topical discourse at sentence-level in a joint learning framework of topic and language models. Experiments over a range of tasks such as language modeling, word sense disambiguation, document classification, retrieval and text generation demonstrate ability of the proposed model in improving language understanding.}
}
Endnote
%0 Conference Paper
%T Explainable and Discourse Topic-aware Neural Language Understanding
%A Yatin Chaudhary
%A Hinrich Schuetze
%A Pankaj Gupta
%B Proceedings of the 37th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2020
%E Hal Daumé III
%E Aarti Singh
%F pmlr-v119-chaudhary20a
%I PMLR
%P 1479--1488
%U http://proceedings.mlr.press/v119/chaudhary20a.html
%V 119
%X Marrying topic models and language models exposes language understanding to a broader source of document-level context beyond sentences via topics. While introducing topical semantics in language models, existing approaches incorporate latent document topic proportions and ignore topical discourse in sentences of the document. This work extends the line of research by additionally introducing an explainable topic representation in language understanding, obtained from a set of key terms correspondingly for each latent topic of the proportion. Moreover, we retain sentence-topic association along with document-topic association by modeling topical discourse for every sentence in the document. We present a novel neural composite language modeling (NCLM) framework that exploits both the latent and explainable topics along with topical discourse at sentence-level in a joint learning framework of topic and language models. Experiments over a range of tasks such as language modeling, word sense disambiguation, document classification, retrieval and text generation demonstrate ability of the proposed model in improving language understanding.
APA
Chaudhary, Y., Schuetze, H. & Gupta, P. (2020). Explainable and Discourse Topic-aware Neural Language Understanding. Proceedings of the 37th International Conference on Machine Learning, in Proceedings of Machine Learning Research 119:1479-1488. Available from http://proceedings.mlr.press/v119/chaudhary20a.html.
