Active Learning on Attributed Graphs via Graph Cognizant Logistic Regression and Preemptive Query Generation

Florence Regol, Soumyasundar Pal, Yingxue Zhang, Mark Coates
Proceedings of the 37th International Conference on Machine Learning, PMLR 119:8041-8050, 2020.

Abstract

Node classification in attributed graphs is an important task in multiple practical settings, but it can often be difficult or expensive to obtain labels. Active learning can improve the achieved classification performance for a given budget on the number of queried labels. The best existing methods are based on graph neural networks, but they often perform poorly unless a sizeable validation set of labelled nodes is available in order to choose good hyperparameters. We propose a novel graph-based active learning algorithm for the task of node classification in attributed graphs; our algorithm uses graph cognizant logistic regression, equivalent to a linearized graph-convolutional neural network (GCN), for the prediction phase and maximizes the expected error reduction in the query phase. To reduce the delay experienced by a labeller interacting with the system, we derive a preemptive querying system that calculates a new query during the labelling process, and to address the setting where learning starts with almost no labelled data, we also develop a hybrid algorithm that performs adaptive model averaging of label propagation and linearized GCN inference. We conduct experiments on five public benchmark datasets, demonstrating a significant improvement over state-of-the-art approaches and illustrate the practical value of the method by applying it to a private microwave link network dataset.
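As a rough illustration of the "linearized GCN" idea mentioned in the abstract, the sketch below propagates node features over the graph without any nonlinearity; a standard logistic regression could then be fit on the propagated features. The abstract does not give the propagation rule, so the symmetric normalization with self-loops, the function name propagate_features, and the num_hops parameter are all assumptions made for illustration. This is not the authors' implementation.

```python
import math


def propagate_features(adj, features, num_hops=2):
    """Return S^K X with S = D^{-1/2} (A + I) D^{-1/2} (assumed normalization).

    adj      -- dense symmetric adjacency matrix with non-negative weights
    features -- one feature row per node
    num_hops -- K, the number of propagation steps (hypothetical parameter)
    """
    n = len(adj)
    # Add self-loops so every node keeps part of its own features.
    a_tilde = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
               for i in range(n)]
    # Degrees of the self-looped graph are at least 1, so the division is safe.
    deg = [sum(row) for row in a_tilde]
    s = [[a_tilde[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
         for i in range(n)]
    out = [[float(v) for v in row] for row in features]
    dim = len(out[0])
    # Apply the linear propagation K times; no nonlinearity in between.
    for _ in range(num_hops):
        out = [[sum(s[i][k] * out[k][d] for k in range(n)) for d in range(dim)]
               for i in range(n)]
    return out


# Example: a 3-node path graph with one-hot node features.
adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
feats = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
smoothed = propagate_features(adj, feats, num_hops=2)
```

Each row of smoothed is a neighbourhood-averaged version of the original features, which is what would make a downstream logistic regression "graph cognizant" under these assumptions.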

Cite this Paper


BibTeX
@InProceedings{pmlr-v119-regol20a,
  title = {Active Learning on Attributed Graphs via Graph Cognizant Logistic Regression and Preemptive Query Generation},
  author = {Regol, Florence and Pal, Soumyasundar and Zhang, Yingxue and Coates, Mark},
  booktitle = {Proceedings of the 37th International Conference on Machine Learning},
  pages = {8041--8050},
  year = {2020},
  editor = {Daumé III, Hal and Singh, Aarti},
  volume = {119},
  series = {Proceedings of Machine Learning Research},
  month = {13--18 Jul},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v119/regol20a/regol20a.pdf},
  url = {https://proceedings.mlr.press/v119/regol20a.html},
  abstract = {Node classification in attributed graphs is an important task in multiple practical settings, but it can often be difficult or expensive to obtain labels. Active learning can improve the achieved classification performance for a given budget on the number of queried labels. The best existing methods are based on graph neural networks, but they often perform poorly unless a sizeable validation set of labelled nodes is available in order to choose good hyperparameters. We propose a novel graph-based active learning algorithm for the task of node classification in attributed graphs; our algorithm uses graph cognizant logistic regression, equivalent to a linearized graph-convolutional neural network (GCN), for the prediction phase and maximizes the expected error reduction in the query phase. To reduce the delay experienced by a labeller interacting with the system, we derive a preemptive querying system that calculates a new query during the labelling process, and to address the setting where learning starts with almost no labelled data, we also develop a hybrid algorithm that performs adaptive model averaging of label propagation and linearized GCN inference. We conduct experiments on five public benchmark datasets, demonstrating a significant improvement over state-of-the-art approaches and illustrate the practical value of the method by applying it to a private microwave link network dataset.}
}
Endnote
%0 Conference Paper
%T Active Learning on Attributed Graphs via Graph Cognizant Logistic Regression and Preemptive Query Generation
%A Florence Regol
%A Soumyasundar Pal
%A Yingxue Zhang
%A Mark Coates
%B Proceedings of the 37th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2020
%E Hal Daumé III
%E Aarti Singh
%F pmlr-v119-regol20a
%I PMLR
%P 8041--8050
%U https://proceedings.mlr.press/v119/regol20a.html
%V 119
%X Node classification in attributed graphs is an important task in multiple practical settings, but it can often be difficult or expensive to obtain labels. Active learning can improve the achieved classification performance for a given budget on the number of queried labels. The best existing methods are based on graph neural networks, but they often perform poorly unless a sizeable validation set of labelled nodes is available in order to choose good hyperparameters. We propose a novel graph-based active learning algorithm for the task of node classification in attributed graphs; our algorithm uses graph cognizant logistic regression, equivalent to a linearized graph-convolutional neural network (GCN), for the prediction phase and maximizes the expected error reduction in the query phase. To reduce the delay experienced by a labeller interacting with the system, we derive a preemptive querying system that calculates a new query during the labelling process, and to address the setting where learning starts with almost no labelled data, we also develop a hybrid algorithm that performs adaptive model averaging of label propagation and linearized GCN inference. We conduct experiments on five public benchmark datasets, demonstrating a significant improvement over state-of-the-art approaches and illustrate the practical value of the method by applying it to a private microwave link network dataset.
APA
Regol, F., Pal, S., Zhang, Y., & Coates, M. (2020). Active Learning on Attributed Graphs via Graph Cognizant Logistic Regression and Preemptive Query Generation. Proceedings of the 37th International Conference on Machine Learning, in Proceedings of Machine Learning Research 119:8041-8050. Available from https://proceedings.mlr.press/v119/regol20a.html.
