Semiparametric Nonlinear Bipartite Graph Representation Learning with Provable Guarantees

Sen Na, Yuwei Luo, Zhuoran Yang, Zhaoran Wang, Mladen Kolar
Proceedings of the 37th International Conference on Machine Learning, PMLR 119:7141-7152, 2020.

Abstract

Graph representation learning is a ubiquitous task in machine learning where the goal is to embed each vertex into a low-dimensional vector space. We consider the bipartite graph and formalize its representation learning problem as a statistical estimation problem of parameters in a semiparametric exponential family distribution: the bipartite graph is assumed to be generated by a semiparametric exponential family distribution, whose parametric component is given by the proximity of outputs of two one-layer neural networks that take high-dimensional features as inputs, while nonparametric (nuisance) component is the base measure. In this setting, the representation learning problem is equivalent to recovering the weight matrices, and the main challenges of estimation arise from the nonlinearity of activation functions and the nonparametric nuisance component of the distribution. To overcome these challenges, we propose a pseudo-likelihood objective based on the rank-order decomposition technique and show that the proposed objective is strongly convex in a neighborhood around the ground truth, so that a gradient descent-based method achieves linear convergence rate. Moreover, we prove that the sample complexity of the problem is linear in dimensions (up to logarithmic factors), which is consistent with parametric Gaussian models. However, our estimator is robust to any model misspecification within the exponential family, which is validated in extensive experiments.

Cite this Paper


BibTeX
@InProceedings{pmlr-v119-na20a, title = {Semiparametric Nonlinear Bipartite Graph Representation Learning with Provable Guarantees}, author = {Na, Sen and Luo, Yuwei and Yang, Zhuoran and Wang, Zhaoran and Kolar, Mladen}, booktitle = {Proceedings of the 37th International Conference on Machine Learning}, pages = {7141--7152}, year = {2020}, editor = {III, Hal Daumé and Singh, Aarti}, volume = {119}, series = {Proceedings of Machine Learning Research}, month = {13--18 Jul}, publisher = {PMLR}, pdf = {http://proceedings.mlr.press/v119/na20a/na20a.pdf}, url = {https://proceedings.mlr.press/v119/na20a.html}, abstract = {Graph representation learning is a ubiquitous task in machine learning where the goal is to embed each vertex into a low-dimensional vector space. We consider the bipartite graph and formalize its representation learning problem as a statistical estimation problem of parameters in a semiparametric exponential family distribution: the bipartite graph is assumed to be generated by a semiparametric exponential family distribution, whose parametric component is given by the proximity of outputs of two one-layer neural networks that take high-dimensional features as inputs, while nonparametric (nuisance) component is the base measure. In this setting, the representation learning problem is equivalent to recovering the weight matrices, and the main challenges of estimation arise from the nonlinearity of activation functions and the nonparametric nuisance component of the distribution. To overcome these challenges, we propose a pseudo-likelihood objective based on the rank-order decomposition technique and show that the proposed objective is strongly convex in a neighborhood around the ground truth, so that a gradient descent-based method achieves linear convergence rate. Moreover, we prove that the sample complexity of the problem is linear in dimensions (up to logarithmic factors), which is consistent with parametric Gaussian models. However, our estimator is robust to any model misspecification within the exponential family, which is validated in extensive experiments.} }
Endnote
%0 Conference Paper %T Semiparametric Nonlinear Bipartite Graph Representation Learning with Provable Guarantees %A Sen Na %A Yuwei Luo %A Zhuoran Yang %A Zhaoran Wang %A Mladen Kolar %B Proceedings of the 37th International Conference on Machine Learning %C Proceedings of Machine Learning Research %D 2020 %E Hal Daumé III %E Aarti Singh %F pmlr-v119-na20a %I PMLR %P 7141--7152 %U https://proceedings.mlr.press/v119/na20a.html %V 119 %X Graph representation learning is a ubiquitous task in machine learning where the goal is to embed each vertex into a low-dimensional vector space. We consider the bipartite graph and formalize its representation learning problem as a statistical estimation problem of parameters in a semiparametric exponential family distribution: the bipartite graph is assumed to be generated by a semiparametric exponential family distribution, whose parametric component is given by the proximity of outputs of two one-layer neural networks that take high-dimensional features as inputs, while nonparametric (nuisance) component is the base measure. In this setting, the representation learning problem is equivalent to recovering the weight matrices, and the main challenges of estimation arise from the nonlinearity of activation functions and the nonparametric nuisance component of the distribution. To overcome these challenges, we propose a pseudo-likelihood objective based on the rank-order decomposition technique and show that the proposed objective is strongly convex in a neighborhood around the ground truth, so that a gradient descent-based method achieves linear convergence rate. Moreover, we prove that the sample complexity of the problem is linear in dimensions (up to logarithmic factors), which is consistent with parametric Gaussian models. However, our estimator is robust to any model misspecification within the exponential family, which is validated in extensive experiments.
APA
Na, S., Luo, Y., Yang, Z., Wang, Z. & Kolar, M.. (2020). Semiparametric Nonlinear Bipartite Graph Representation Learning with Provable Guarantees. Proceedings of the 37th International Conference on Machine Learning, in Proceedings of Machine Learning Research 119:7141-7152 Available from https://proceedings.mlr.press/v119/na20a.html.

Related Material