A Variance Minimization Criterion to Active Learning on Graphs

Ming Ji, Jiawei Han
Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics, PMLR 22:556-564, 2012.

Abstract

We consider the problem of active learning over the vertices in a graph, without feature representation. Our study is based on the common graph smoothness assumption, which is formulated in a Gaussian random field model. We analyze the probability distribution over the unlabeled vertices conditioned on the label information, which is a multivariate normal with the mean being the harmonic solution over the field. Then we select the nodes to label such that the total variance of the distribution on the unlabeled data, as well as the expected prediction error, is minimized. In this way, the classifier we obtain is theoretically more robust. Compared with existing methods, our algorithm has the advantage of selecting data in a batch offline mode with solid theoretical support. We show improved performance over existing label selection criteria on several real world data sets.

Cite this Paper


BibTeX
@InProceedings{pmlr-v22-ji12, title = {A Variance Minimization Criterion to Active Learning on Graphs}, author = {Ji, Ming and Han, Jiawei}, booktitle = {Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics}, pages = {556--564}, year = {2012}, editor = {Lawrence, Neil D. and Girolami, Mark}, volume = {22}, series = {Proceedings of Machine Learning Research}, address = {La Palma, Canary Islands}, month = {21--23 Apr}, publisher = {PMLR}, pdf = {http://proceedings.mlr.press/v22/ji12/ji12.pdf}, url = {https://proceedings.mlr.press/v22/ji12.html}, abstract = {We consider the problem of active learning over the vertices in a graph, without feature representation. Our study is based on the common graph smoothness assumption, which is formulated in a Gaussian random field model. We analyze the probability distribution over the unlabeled vertices conditioned on the label information, which is a multivariate normal with the mean being the harmonic solution over the field. Then we select the nodes to label such that the total variance of the distribution on the unlabeled data, as well as the expected prediction error, is minimized. In this way, the classifier we obtain is theoretically more robust. Compared with existing methods, our algorithm has the advantage of selecting data in a batch offline mode with solid theoretical support. We show improved performance over existing label selection criteria on several real world data sets.} }
Endnote
%0 Conference Paper %T A Variance Minimization Criterion to Active Learning on Graphs %A Ming Ji %A Jiawei Han %B Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics %C Proceedings of Machine Learning Research %D 2012 %E Neil D. Lawrence %E Mark Girolami %F pmlr-v22-ji12 %I PMLR %P 556--564 %U https://proceedings.mlr.press/v22/ji12.html %V 22 %X We consider the problem of active learning over the vertices in a graph, without feature representation. Our study is based on the common graph smoothness assumption, which is formulated in a Gaussian random field model. We analyze the probability distribution over the unlabeled vertices conditioned on the label information, which is a multivariate normal with the mean being the harmonic solution over the field. Then we select the nodes to label such that the total variance of the distribution on the unlabeled data, as well as the expected prediction error, is minimized. In this way, the classifier we obtain is theoretically more robust. Compared with existing methods, our algorithm has the advantage of selecting data in a batch offline mode with solid theoretical support. We show improved performance over existing label selection criteria on several real world data sets.
RIS
TY - CPAPER TI - A Variance Minimization Criterion to Active Learning on Graphs AU - Ming Ji AU - Jiawei Han BT - Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics DA - 2012/03/21 ED - Neil D. Lawrence ED - Mark Girolami ID - pmlr-v22-ji12 PB - PMLR DP - Proceedings of Machine Learning Research VL - 22 SP - 556 EP - 564 L1 - http://proceedings.mlr.press/v22/ji12/ji12.pdf UR - https://proceedings.mlr.press/v22/ji12.html AB - We consider the problem of active learning over the vertices in a graph, without feature representation. Our study is based on the common graph smoothness assumption, which is formulated in a Gaussian random field model. We analyze the probability distribution over the unlabeled vertices conditioned on the label information, which is a multivariate normal with the mean being the harmonic solution over the field. Then we select the nodes to label such that the total variance of the distribution on the unlabeled data, as well as the expected prediction error, is minimized. In this way, the classifier we obtain is theoretically more robust. Compared with existing methods, our algorithm has the advantage of selecting data in a batch offline mode with solid theoretical support. We show improved performance over existing label selection criteria on several real world data sets. ER -
APA
Ji, M. & Han, J.. (2012). A Variance Minimization Criterion to Active Learning on Graphs. Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 22:556-564 Available from https://proceedings.mlr.press/v22/ji12.html.

Related Material