Description Based Text Classification with Reinforcement Learning

Duo Chai, Wei Wu, Qinghong Han, Fei Wu, Jiwei Li
Proceedings of the 37th International Conference on Machine Learning, PMLR 119:1371-1382, 2020.

Abstract

The task of text classification is usually divided into two stages: text feature extraction and classification. In this standard formalization, categories are merely represented as indexes in the label vocabulary, and the model lacks explicit instructions on what to classify. Inspired by the current trend of formalizing NLP problems as question answering tasks, we propose a new framework for text classification, in which each category label is associated with a category description. Descriptions are generated by hand-crafted templates or using abstractive/extractive models from reinforcement learning. The concatenation of the description and the text is fed to the classifier to decide whether or not the current label should be assigned to the text. The proposed strategy forces the model to attend to the most salient texts with respect to the label, which can be regarded as a hard version of attention, leading to better performance. We observe significant performance boosts over strong baselines on a wide range of text classification tasks including single-label classification, multi-label classification and multi-aspect sentiment analysis.
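
As a rough illustration of the input format the abstract describes (a description concatenated with the text, then a yes/no decision per label), the following is a minimal sketch and not the authors' implementation. The separator string, the template wording, the threshold, and every function name here are assumptions made for illustration; `toy_scorer` is a keyword-matching stand-in for a trained binary classifier.

```python
from typing import Callable, Dict, List

# Assumed separator; the abstract does not say how the two parts are joined.
SEP = " [SEP] "
# Hypothetical hand-crafted template prefix for a label description.
TEMPLATE_PREFIX = "Is this text about "


def template_description(label: str) -> str:
    """Turn a label name into a description using the hypothetical template."""
    return f"{TEMPLATE_PREFIX}{label}?"


def build_inputs(text: str, labels: List[str]) -> Dict[str, str]:
    """Build one classifier input per label: description, separator, text."""
    return {label: template_description(label) + SEP + text for label in labels}


def assign_labels(
    text: str,
    labels: List[str],
    scorer: Callable[[str], float],
    threshold: float = 0.5,
) -> List[str]:
    """Keep every label whose (description + text) input scores at or above threshold.

    Each label is decided independently, so zero, one, or several labels
    can be returned (the multi-label case).
    """
    inputs = build_inputs(text, labels)
    return [label for label, joined in inputs.items() if scorer(joined) >= threshold]


def toy_scorer(joined: str) -> float:
    """Stand-in for a trained binary classifier (assumption, for illustration only).

    Returns 1.0 if the label word named in the description occurs in the text.
    """
    description, text = joined.split(SEP, 1)
    label = description[len(TEMPLATE_PREFIX):-1].lower()  # strip prefix and "?"
    return 1.0 if label in text.lower() else 0.0


if __name__ == "__main__":
    example = "The sports match ended and the economy reacted."
    print(assign_labels(example, ["sports", "economy", "politics"], toy_scorer))
```

For single-label classification, one would instead pick the label with the highest score rather than thresholding each label separately; that choice is also an assumption of this sketch.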

Cite this Paper


BibTeX
@InProceedings{pmlr-v119-chai20a,
  title = {Description Based Text Classification with Reinforcement Learning},
  author = {Chai, Duo and Wu, Wei and Han, Qinghong and Wu, Fei and Li, Jiwei},
  booktitle = {Proceedings of the 37th International Conference on Machine Learning},
  pages = {1371--1382},
  year = {2020},
  editor = {Daumé III, Hal and Singh, Aarti},
  volume = {119},
  series = {Proceedings of Machine Learning Research},
  month = {13--18 Jul},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v119/chai20a/chai20a.pdf},
  url = {https://proceedings.mlr.press/v119/chai20a.html},
  abstract = {The task of text classification is usually divided into two stages: text feature extraction and classification. In this standard formalization, categories are merely represented as indexes in the label vocabulary, and the model lacks for explicit instructions on what to classify. Inspired by the current trend of formalizing NLP problems as question answering tasks, we propose a new framework for text classification, in which each category label is associated with a category description. Descriptions are generated by hand-crafted templates or using abstractive/extractive models from reinforcement learning. The concatenation of the description and the text is fed to the classifier to decide whether or not the current label should be assigned to the text. The proposed strategy forces the model to attend to the most salient texts with respect to the label, which can be regarded as a hard version of attention, leading to better performances. We observe significant performance boosts over strong baselines on a wide range of text classification tasks including single-label classification, multi-label classification and multi-aspect sentiment analysis.}
}
Endnote
%0 Conference Paper
%T Description Based Text Classification with Reinforcement Learning
%A Duo Chai
%A Wei Wu
%A Qinghong Han
%A Fei Wu
%A Jiwei Li
%B Proceedings of the 37th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2020
%E Hal Daumé III
%E Aarti Singh
%F pmlr-v119-chai20a
%I PMLR
%P 1371--1382
%U https://proceedings.mlr.press/v119/chai20a.html
%V 119
%X The task of text classification is usually divided into two stages: text feature extraction and classification. In this standard formalization, categories are merely represented as indexes in the label vocabulary, and the model lacks for explicit instructions on what to classify. Inspired by the current trend of formalizing NLP problems as question answering tasks, we propose a new framework for text classification, in which each category label is associated with a category description. Descriptions are generated by hand-crafted templates or using abstractive/extractive models from reinforcement learning. The concatenation of the description and the text is fed to the classifier to decide whether or not the current label should be assigned to the text. The proposed strategy forces the model to attend to the most salient texts with respect to the label, which can be regarded as a hard version of attention, leading to better performances. We observe significant performance boosts over strong baselines on a wide range of text classification tasks including single-label classification, multi-label classification and multi-aspect sentiment analysis.
APA
Chai, D., Wu, W., Han, Q., Wu, F. & Li, J. (2020). Description Based Text Classification with Reinforcement Learning. Proceedings of the 37th International Conference on Machine Learning, in Proceedings of Machine Learning Research 119:1371-1382. Available from https://proceedings.mlr.press/v119/chai20a.html.
