Empirical Risk Minimization and Stochastic Gradient Descent for Relational Data

Victor Veitch, Morgane Austern, Wenda Zhou, David M. Blei, Peter Orbanz
Proceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics, PMLR 89:1733-1742, 2019.

Abstract

Empirical risk minimization is the main tool for prediction problems, but its extension to relational data remains unsolved. We solve this problem using recent ideas from graph sampling theory to (i) define an empirical risk for relational data and (ii) obtain stochastic gradients for this empirical risk that are automatically unbiased. This is achieved by considering the method by which data is sampled from a graph as an explicit component of model design. By integrating fast implementations of graph sampling schemes with standard automatic differentiation tools, we provide an efficient turnkey solver for the risk minimization problem. We establish basic theoretical properties of the procedure. Finally, we demonstrate relational ERM with application to two non-standard problems: one-stage training for semi-supervised node classification, and learning embedding vectors for vertex attributes. Experiments confirm that the turnkey inference procedure is effective in practice, and that the sampling scheme used for model specification has a strong effect on model performance.
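The core idea above — treat a random subgraph drawn by a fixed sampling scheme as a minibatch, compute the loss on that subgraph, and step with SGD — can be illustrated with a minimal sketch. This is not the paper's implementation; the toy graph, the vertex-keeping probability `p_keep`, the logistic edge-reconstruction loss, and all function names are hypothetical, chosen only to make the sampling-as-minibatching pattern concrete:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy symmetric adjacency matrix for n vertices (hypothetical example data).
n = 20
adj = (rng.random((n, n)) < 0.2).astype(float)
adj = np.triu(adj, 1)
adj = adj + adj.T

dim = 4
emb = 0.1 * rng.standard_normal((n, dim))  # vertex embeddings (the parameters)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def subgraph_loss_grad(emb, verts):
    """Logistic edge-reconstruction loss and its gradient, computed only
    on the subgraph induced by `verts` (the sampled minibatch)."""
    sub = adj[np.ix_(verts, verts)]
    e = emb[verts]
    p = sigmoid(e @ e.T)           # predicted edge probabilities
    np.fill_diagonal(p, sub.diagonal())  # ignore self-loop terms
    loss = -np.mean(sub * np.log(p + 1e-9) + (1 - sub) * np.log(1 - p + 1e-9))
    # d loss / d e = 2 (P - A) e / N for a symmetric subgraph.
    g = ((p - sub) @ e) * (2.0 / sub.size)
    grad = np.zeros_like(emb)
    grad[verts] = g                # vertices outside the sample get no update
    return loss, grad


loss_init, _ = subgraph_loss_grad(emb, np.arange(n))

# Relational SGD: each minibatch is the subgraph induced by independently
# keeping each vertex with probability p_keep (a simple vertex-sampling scheme).
lr, p_keep = 0.5, 0.5
for step in range(200):
    verts = np.flatnonzero(rng.random(n) < p_keep)
    if verts.size < 2:
        continue
    loss, grad = subgraph_loss_grad(emb, verts)
    emb -= lr * grad

loss_final, _ = subgraph_loss_grad(emb, np.arange(n))
```

The point of the sketch is the structure of the loop: the sampling scheme is a fixed, explicit part of the problem specification, and each gradient step touches only the sampled subgraph, so the per-step cost depends on the sample size rather than the full graph.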

Cite this Paper


BibTeX
@InProceedings{pmlr-v89-veitch19a,
  title     = {Empirical Risk Minimization and Stochastic Gradient Descent for Relational Data},
  author    = {Veitch, Victor and Austern, Morgane and Zhou, Wenda and Blei, David M. and Orbanz, Peter},
  booktitle = {Proceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics},
  pages     = {1733--1742},
  year      = {2019},
  editor    = {Chaudhuri, Kamalika and Sugiyama, Masashi},
  volume    = {89},
  series    = {Proceedings of Machine Learning Research},
  month     = {16--18 Apr},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v89/veitch19a/veitch19a.pdf},
  url       = {https://proceedings.mlr.press/v89/veitch19a.html},
  abstract  = {Empirical risk minimization is the main tool for prediction problems, but its extension to relational data remains unsolved. We solve this problem using recent ideas from graph sampling theory to (i) define an empirical risk for relational data and (ii) obtain stochastic gradients for this empirical risk that are automatically unbiased. This is achieved by considering the method by which data is sampled from a graph as an explicit component of model design. By integrating fast implementations of graph sampling schemes with standard automatic differentiation tools, we provide an efficient turnkey solver for the risk minimization problem. We establish basic theoretical properties of the procedure. Finally, we demonstrate relational ERM with application to two non-standard problems: one-stage training for semi-supervised node classification, and learning embedding vectors for vertex attributes. Experiments confirm that the turnkey inference procedure is effective in practice, and that the sampling scheme used for model specification has a strong effect on model performance.}
}
Endnote
%0 Conference Paper
%T Empirical Risk Minimization and Stochastic Gradient Descent for Relational Data
%A Victor Veitch
%A Morgane Austern
%A Wenda Zhou
%A David M. Blei
%A Peter Orbanz
%B Proceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2019
%E Kamalika Chaudhuri
%E Masashi Sugiyama
%F pmlr-v89-veitch19a
%I PMLR
%P 1733--1742
%U https://proceedings.mlr.press/v89/veitch19a.html
%V 89
%X Empirical risk minimization is the main tool for prediction problems, but its extension to relational data remains unsolved. We solve this problem using recent ideas from graph sampling theory to (i) define an empirical risk for relational data and (ii) obtain stochastic gradients for this empirical risk that are automatically unbiased. This is achieved by considering the method by which data is sampled from a graph as an explicit component of model design. By integrating fast implementations of graph sampling schemes with standard automatic differentiation tools, we provide an efficient turnkey solver for the risk minimization problem. We establish basic theoretical properties of the procedure. Finally, we demonstrate relational ERM with application to two non-standard problems: one-stage training for semi-supervised node classification, and learning embedding vectors for vertex attributes. Experiments confirm that the turnkey inference procedure is effective in practice, and that the sampling scheme used for model specification has a strong effect on model performance.
APA
Veitch, V., Austern, M., Zhou, W., Blei, D.M. & Orbanz, P. (2019). Empirical Risk Minimization and Stochastic Gradient Descent for Relational Data. Proceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 89:1733-1742. Available from https://proceedings.mlr.press/v89/veitch19a.html.