Learning Dynamic Hierarchical Topic Graph with Graph Convolutional Network for Document Classification

Zhengjue Wang, Chaojie Wang, Hao Zhang, Zhibin Duan, Mingyuan Zhou, Bo Chen
Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics, PMLR 108:3959-3969, 2020.

Abstract

Using a graph convolutional network (GCN) on a constructed graph to explore the relational structure of the data has attracted considerable interest in various tasks. However, for document classification, existing graph-based methods often focus on the straightforward word-word and word-document relations, ignoring the hierarchical semantics. Moreover, the graph construction is often independent of the task-specific GCN learning. To address these constraints, we integrate a probabilistic deep topic model into graph construction and propose a novel trainable hierarchical topic graph (HTG), which includes word-level, hierarchical topic-level, and document-level nodes and exhibits semantic variation from fine-grained to coarse. By treating document classification as a document-node label generation task, HTG can be dynamically evolved with the GCN by performing variational inference, which leads to an end-to-end document classification method named dynamic HTG (DHTG). Besides achieving state-of-the-art classification results, our model learns an interpretable document graph with meaningful node embeddings and semantic edges.

Cite this Paper


BibTeX
@InProceedings{pmlr-v108-wang20l,
  title     = {Learning Dynamic Hierarchical Topic Graph with Graph Convolutional Network for Document Classification},
  author    = {Wang, Zhengjue and Wang, Chaojie and Zhang, Hao and Duan, Zhibin and Zhou, Mingyuan and Chen, Bo},
  booktitle = {Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics},
  pages     = {3959--3969},
  year      = {2020},
  editor    = {Silvia Chiappa and Roberto Calandra},
  volume    = {108},
  series    = {Proceedings of Machine Learning Research},
  month     = {26--28 Aug},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v108/wang20l/wang20l.pdf},
  url       = {http://proceedings.mlr.press/v108/wang20l.html},
  abstract  = {Using a graph convolutional network (GCN) on a constructed graph to explore the relational structure of the data has attracted considerable interest in various tasks. However, for document classification, existing graph-based methods often focus on the straightforward word-word and word-document relations, ignoring the hierarchical semantics. Moreover, the graph construction is often independent of the task-specific GCN learning. To address these constraints, we integrate a probabilistic deep topic model into graph construction and propose a novel trainable hierarchical topic graph (HTG), which includes word-level, hierarchical topic-level, and document-level nodes and exhibits semantic variation from fine-grained to coarse. By treating document classification as a document-node label generation task, HTG can be dynamically evolved with the GCN by performing variational inference, which leads to an end-to-end document classification method named dynamic HTG (DHTG). Besides achieving state-of-the-art classification results, our model learns an interpretable document graph with meaningful node embeddings and semantic edges.}
}
Endnote
%0 Conference Paper
%T Learning Dynamic Hierarchical Topic Graph with Graph Convolutional Network for Document Classification
%A Zhengjue Wang
%A Chaojie Wang
%A Hao Zhang
%A Zhibin Duan
%A Mingyuan Zhou
%A Bo Chen
%B Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2020
%E Silvia Chiappa
%E Roberto Calandra
%F pmlr-v108-wang20l
%I PMLR
%P 3959--3969
%U http://proceedings.mlr.press/v108/wang20l.html
%V 108
%X Using a graph convolutional network (GCN) on a constructed graph to explore the relational structure of the data has attracted considerable interest in various tasks. However, for document classification, existing graph-based methods often focus on the straightforward word-word and word-document relations, ignoring the hierarchical semantics. Moreover, the graph construction is often independent of the task-specific GCN learning. To address these constraints, we integrate a probabilistic deep topic model into graph construction and propose a novel trainable hierarchical topic graph (HTG), which includes word-level, hierarchical topic-level, and document-level nodes and exhibits semantic variation from fine-grained to coarse. By treating document classification as a document-node label generation task, HTG can be dynamically evolved with the GCN by performing variational inference, which leads to an end-to-end document classification method named dynamic HTG (DHTG). Besides achieving state-of-the-art classification results, our model learns an interpretable document graph with meaningful node embeddings and semantic edges.
APA
Wang, Z., Wang, C., Zhang, H., Duan, Z., Zhou, M., & Chen, B. (2020). Learning Dynamic Hierarchical Topic Graph with Graph Convolutional Network for Document Classification. Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 108:3959-3969. Available from http://proceedings.mlr.press/v108/wang20l.html.
