Consistent Nonparametric Methods for Network Assisted Covariate Estimation

Xueyu Mao, Deepayan Chakrabarti, Purnamrita Sarkar
Proceedings of the 38th International Conference on Machine Learning, PMLR 139:7435-7446, 2021.

Abstract

Networks with node covariates are commonplace: for example, people in a social network have interests, or product preferences, etc. If we know the covariates for some nodes, can we infer them for the remaining nodes? In this paper we propose a new similarity measure between two nodes based on the patterns of their 2-hop neighborhoods. We show that a simple algorithm (CN-VEC) like nearest neighbor regression with this metric is consistent for a wide range of models when the degree grows faster than $n^{1/3}$ up-to logarithmic factors, where $n$ is the number of nodes. For "low-rank" latent variable models, the natural contender will be to estimate the latent variables using SVD and use them for non-parametric regression. While we show consistency of this method under less stringent sparsity conditions, our experimental results suggest that the simple local CN-VEC method either outperforms the global SVD-RBF method, or has comparable performance for low rank models. We also present simulated and real data experiments to show the effectiveness of our algorithms compared to the state of the art.

Cite this Paper


BibTeX
@InProceedings{pmlr-v139-mao21a, title = {Consistent Nonparametric Methods for Network Assisted Covariate Estimation}, author = {Mao, Xueyu and Chakrabarti, Deepayan and Sarkar, Purnamrita}, booktitle = {Proceedings of the 38th International Conference on Machine Learning}, pages = {7435--7446}, year = {2021}, editor = {Meila, Marina and Zhang, Tong}, volume = {139}, series = {Proceedings of Machine Learning Research}, month = {18--24 Jul}, publisher = {PMLR}, pdf = {http://proceedings.mlr.press/v139/mao21a/mao21a.pdf}, url = {https://proceedings.mlr.press/v139/mao21a.html}, abstract = {Networks with node covariates are commonplace: for example, people in a social network have interests, or product preferences, etc. If we know the covariates for some nodes, can we infer them for the remaining nodes? In this paper we propose a new similarity measure between two nodes based on the patterns of their 2-hop neighborhoods. We show that a simple algorithm (CN-VEC) like nearest neighbor regression with this metric is consistent for a wide range of models when the degree grows faster than $n^{1/3}$ up-to logarithmic factors, where $n$ is the number of nodes. For "low-rank" latent variable models, the natural contender will be to estimate the latent variables using SVD and use them for non-parametric regression. While we show consistency of this method under less stringent sparsity conditions, our experimental results suggest that the simple local CN-VEC method either outperforms the global SVD-RBF method, or has comparable performance for low rank models. We also present simulated and real data experiments to show the effectiveness of our algorithms compared to the state of the art.} }
Endnote
%0 Conference Paper %T Consistent Nonparametric Methods for Network Assisted Covariate Estimation %A Xueyu Mao %A Deepayan Chakrabarti %A Purnamrita Sarkar %B Proceedings of the 38th International Conference on Machine Learning %C Proceedings of Machine Learning Research %D 2021 %E Marina Meila %E Tong Zhang %F pmlr-v139-mao21a %I PMLR %P 7435--7446 %U https://proceedings.mlr.press/v139/mao21a.html %V 139 %X Networks with node covariates are commonplace: for example, people in a social network have interests, or product preferences, etc. If we know the covariates for some nodes, can we infer them for the remaining nodes? In this paper we propose a new similarity measure between two nodes based on the patterns of their 2-hop neighborhoods. We show that a simple algorithm (CN-VEC) like nearest neighbor regression with this metric is consistent for a wide range of models when the degree grows faster than $n^{1/3}$ up-to logarithmic factors, where $n$ is the number of nodes. For "low-rank" latent variable models, the natural contender will be to estimate the latent variables using SVD and use them for non-parametric regression. While we show consistency of this method under less stringent sparsity conditions, our experimental results suggest that the simple local CN-VEC method either outperforms the global SVD-RBF method, or has comparable performance for low rank models. We also present simulated and real data experiments to show the effectiveness of our algorithms compared to the state of the art.
APA
Mao, X., Chakrabarti, D. & Sarkar, P.. (2021). Consistent Nonparametric Methods for Network Assisted Covariate Estimation. Proceedings of the 38th International Conference on Machine Learning, in Proceedings of Machine Learning Research 139:7435-7446 Available from https://proceedings.mlr.press/v139/mao21a.html.

Related Material