Asynchronous Distributed Variational Gaussian Process for Regression

Hao Peng, Shandian Zhe, Xiao Zhang, Yuan Qi
Proceedings of the 34th International Conference on Machine Learning, PMLR 70:2788-2797, 2017.

Abstract

Gaussian processes (GPs) are powerful non-parametric function estimators. However, their applications are largely limited by the expensive computational cost of inference. Existing stochastic and distributed synchronous variational inference methods, although they have alleviated this issue by scaling GPs to millions of samples, remain far from satisfactory for real-world applications whose data sizes are often orders of magnitude larger, say, billions. To address this problem, we propose ADVGP, the first Asynchronous Distributed Variational Gaussian Process inference method for regression, built on the recent large-scale machine learning platform PARAMETER SERVER. ADVGP uses a novel, flexible variational framework based on a weight-space augmentation and implements highly efficient, asynchronous proximal gradient optimization. While maintaining comparable or better predictive performance, ADVGP greatly improves upon the efficiency of existing variational methods. With ADVGP, we effortlessly scale GP regression to a real-world application with billions of samples and demonstrate prediction accuracy superior to popular linear models.
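
To give a rough sense of the weight-space view that ADVGP's variational framework builds on, the sketch below writes the latent function as f(x) = phi(x)^T w with a standard Gaussian prior on the weights w and a Gaussian posterior N(mu, Sigma). It is only an illustration under assumed choices (random Fourier features for an RBF kernel, unit lengthscale, a fixed noise variance, and the closed-form conjugate posterior); the paper's actual augmentation, hyperparameter treatment, and asynchronous proximal gradient updates on PARAMETER SERVER differ.

import numpy as np

def rff_features(X, W, b):
    """Random Fourier feature map approximating an RBF kernel (illustrative choice)."""
    return np.sqrt(2.0 / W.shape[1]) * np.cos(X @ W + b)

rng = np.random.default_rng(0)
N, D, M = 500, 3, 100            # samples, input dimension, number of features (assumed)
noise_var = 0.1                  # assumed observation noise variance

# Synthetic regression data purely for demonstration.
X = rng.normal(size=(N, D))
y = np.sin(X.sum(axis=1)) + np.sqrt(noise_var) * rng.normal(size=N)

W = rng.normal(size=(D, M))      # spectral frequencies (unit lengthscale assumed)
b = rng.uniform(0.0, 2.0 * np.pi, size=M)
Phi = rff_features(X, W, b)      # N x M design matrix, so f(x) = phi(x)^T w

# With a conjugate Gaussian likelihood the posterior q(w) = N(mu, Sigma) is available
# in closed form; ADVGP instead reaches a variational posterior of this kind via
# asynchronous proximal gradient steps computed by many workers in parallel.
Sigma = np.linalg.inv(np.eye(M) + Phi.T @ Phi / noise_var)
mu = Sigma @ Phi.T @ y / noise_var

# Predictive mean and variance at a few test inputs.
X_test = rng.normal(size=(5, D))
Phi_test = rff_features(X_test, W, b)
pred_mean = Phi_test @ mu
pred_var = noise_var + np.einsum('ij,jk,ik->i', Phi_test, Sigma, Phi_test)
print(pred_mean, pred_var)

Because the weights w are shared global parameters while the likelihood factorizes over data points, each worker can compute gradient contributions on its own data shard, which is what makes the asynchronous, distributed optimization described in the abstract possible.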

Cite this Paper


BibTeX
@InProceedings{pmlr-v70-peng17a,
  title     = {Asynchronous Distributed Variational {G}aussian Process for Regression},
  author    = {Hao Peng and Shandian Zhe and Xiao Zhang and Yuan Qi},
  booktitle = {Proceedings of the 34th International Conference on Machine Learning},
  pages     = {2788--2797},
  year      = {2017},
  editor    = {Precup, Doina and Teh, Yee Whye},
  volume    = {70},
  series    = {Proceedings of Machine Learning Research},
  month     = {06--11 Aug},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v70/peng17a/peng17a.pdf},
  url       = {https://proceedings.mlr.press/v70/peng17a.html},
  abstract  = {Gaussian processes (GPs) are powerful non-parametric function estimators. However, their applications are largely limited by the expensive computational cost of inference. Existing stochastic and distributed synchronous variational inference methods, although they have alleviated this issue by scaling GPs to millions of samples, remain far from satisfactory for real-world applications whose data sizes are often orders of magnitude larger, say, billions. To address this problem, we propose ADVGP, the first Asynchronous Distributed Variational Gaussian Process inference method for regression, built on the recent large-scale machine learning platform PARAMETER SERVER. ADVGP uses a novel, flexible variational framework based on a weight-space augmentation and implements highly efficient, asynchronous proximal gradient optimization. While maintaining comparable or better predictive performance, ADVGP greatly improves upon the efficiency of existing variational methods. With ADVGP, we effortlessly scale GP regression to a real-world application with billions of samples and demonstrate prediction accuracy superior to popular linear models.}
}
Endnote
%0 Conference Paper
%T Asynchronous Distributed Variational Gaussian Process for Regression
%A Hao Peng
%A Shandian Zhe
%A Xiao Zhang
%A Yuan Qi
%B Proceedings of the 34th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2017
%E Doina Precup
%E Yee Whye Teh
%F pmlr-v70-peng17a
%I PMLR
%P 2788--2797
%U https://proceedings.mlr.press/v70/peng17a.html
%V 70
%X Gaussian processes (GPs) are powerful non-parametric function estimators. However, their applications are largely limited by the expensive computational cost of inference. Existing stochastic and distributed synchronous variational inference methods, although they have alleviated this issue by scaling GPs to millions of samples, remain far from satisfactory for real-world applications whose data sizes are often orders of magnitude larger, say, billions. To address this problem, we propose ADVGP, the first Asynchronous Distributed Variational Gaussian Process inference method for regression, built on the recent large-scale machine learning platform PARAMETER SERVER. ADVGP uses a novel, flexible variational framework based on a weight-space augmentation and implements highly efficient, asynchronous proximal gradient optimization. While maintaining comparable or better predictive performance, ADVGP greatly improves upon the efficiency of existing variational methods. With ADVGP, we effortlessly scale GP regression to a real-world application with billions of samples and demonstrate prediction accuracy superior to popular linear models.
APA
Peng, H., Zhe, S., Zhang, X. & Qi, Y. (2017). Asynchronous Distributed Variational Gaussian Process for Regression. Proceedings of the 34th International Conference on Machine Learning, in Proceedings of Machine Learning Research 70:2788-2797. Available from https://proceedings.mlr.press/v70/peng17a.html.