Supervised Neighborhoods for Distributed Nonparametric Regression

Adam Bloniarz, Ameet Talwalkar, Bin Yu, Christopher Wu
Proceedings of the 19th International Conference on Artificial Intelligence and Statistics, PMLR 51:1450-1459, 2016.

Abstract

Techniques for nonparametric regression based on fitting small-scale local models at prediction time have long been studied in statistics and pattern recognition, but have received less attention in modern large-scale machine learning applications. In practice, such methods are generally applied to low-dimensional problems, but may falter with high-dimensional predictors if they use a Euclidean distance-based kernel. We propose a new method, SILO, for fitting prediction-time local models that uses supervised neighborhoods that adapt to the local shape of the regression surface. To learn such neighborhoods, we use a weight function between points derived from random forests. We prove the consistency of SILO, and demonstrate through simulations and real data that our method works well in both the serial and distributed settings. In the latter case, SILO learns the weighting function in a divide-and-conquer manner, entirely avoiding communication at training time.
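
To make the high-level description concrete, the sketch below illustrates the core idea under stated assumptions: it uses a scikit-learn random forest, takes leaf co-membership between a query point and the training points as the supervised neighborhood weight function, and fits a weighted linear model at prediction time. The function names (forest_weights, silo_style_predict) and all parameter choices are illustrative, not the authors' implementation.

    # Minimal sketch (not the authors' code): random-forest leaf co-membership
    # defines "supervised neighborhood" weights, which are then used to fit a
    # local linear model at prediction time.
    import numpy as np
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.linear_model import LinearRegression

    def forest_weights(forest, X_train, x):
        """Weight each training point by how often it shares a leaf with x."""
        train_leaves = forest.apply(X_train)           # (n_train, n_trees) leaf ids
        query_leaves = forest.apply(x.reshape(1, -1))  # (1, n_trees) leaf ids
        same_leaf = train_leaves == query_leaves       # leaf co-membership per tree
        # Normalize within each tree so every tree contributes equal total weight.
        per_tree = same_leaf / same_leaf.sum(axis=0, keepdims=True)
        return per_tree.mean(axis=1)                   # average over trees

    def silo_style_predict(forest, X_train, y_train, x):
        """Fit a local linear model weighted by the forest-derived neighborhood."""
        w = forest_weights(forest, X_train, x)
        local_model = LinearRegression().fit(X_train, y_train, sample_weight=w)
        return local_model.predict(x.reshape(1, -1))[0]

    # Toy usage on simulated data.
    rng = np.random.default_rng(0)
    X = rng.uniform(-1, 1, size=(500, 5))
    y = np.sin(3 * X[:, 0]) + 0.1 * rng.normal(size=500)
    forest = RandomForestRegressor(n_estimators=100, min_samples_leaf=10,
                                   random_state=0).fit(X, y)
    print(silo_style_predict(forest, X, y, np.zeros(5)))

This sketch covers only the serial case; the distributed variant described in the abstract learns the weight function in a divide-and-conquer manner without training-time communication.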

Cite this Paper


BibTeX
@InProceedings{pmlr-v51-bloniarz16,
  title     = {Supervised Neighborhoods for Distributed Nonparametric Regression},
  author    = {Adam Bloniarz and Ameet Talwalkar and Bin Yu and Christopher Wu},
  booktitle = {Proceedings of the 19th International Conference on Artificial Intelligence and Statistics},
  pages     = {1450--1459},
  year      = {2016},
  editor    = {Arthur Gretton and Christian C. Robert},
  volume    = {51},
  series    = {Proceedings of Machine Learning Research},
  address   = {Cadiz, Spain},
  month     = {09--11 May},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v51/bloniarz16.pdf},
  url       = {http://proceedings.mlr.press/v51/bloniarz16.html},
  abstract  = {Techniques for nonparametric regression based on fitting small-scale local models at prediction time have long been studied in statistics and pattern recognition, but have received less attention in modern large-scale machine learning applications. In practice, such methods are generally applied to low-dimensional problems, but may falter with high-dimensional predictors if they use a Euclidean distance-based kernel. We propose a new method, SILO, for fitting prediction-time local models that uses supervised neighborhoods that adapt to the local shape of the regression surface. To learn such neighborhoods, we use a weight function between points derived from random forests. We prove the consistency of SILO, and demonstrate through simulations and real data that our method works well in both the serial and distributed settings. In the latter case, SILO learns the weighting function in a divide-and-conquer manner, entirely avoiding communication at training time.}
}
Endnote
%0 Conference Paper
%T Supervised Neighborhoods for Distributed Nonparametric Regression
%A Adam Bloniarz
%A Ameet Talwalkar
%A Bin Yu
%A Christopher Wu
%B Proceedings of the 19th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2016
%E Arthur Gretton
%E Christian C. Robert
%F pmlr-v51-bloniarz16
%I PMLR
%J Proceedings of Machine Learning Research
%P 1450--1459
%U http://proceedings.mlr.press
%V 51
%W PMLR
%X Techniques for nonparametric regression based on fitting small-scale local models at prediction time have long been studied in statistics and pattern recognition, but have received less attention in modern large-scale machine learning applications. In practice, such methods are generally applied to low-dimensional problems, but may falter with high-dimensional predictors if they use a Euclidean distance-based kernel. We propose a new method, SILO, for fitting prediction-time local models that uses supervised neighborhoods that adapt to the local shape of the regression surface. To learn such neighborhoods, we use a weight function between points derived from random forests. We prove the consistency of SILO, and demonstrate through simulations and real data that our method works well in both the serial and distributed settings. In the latter case, SILO learns the weighting function in a divide-and-conquer manner, entirely avoiding communication at training time.
RIS
TY - CPAPER
TI - Supervised Neighborhoods for Distributed Nonparametric Regression
AU - Adam Bloniarz
AU - Ameet Talwalkar
AU - Bin Yu
AU - Christopher Wu
BT - Proceedings of the 19th International Conference on Artificial Intelligence and Statistics
PY - 2016/05/02
DA - 2016/05/02
ED - Arthur Gretton
ED - Christian C. Robert
ID - pmlr-v51-bloniarz16
PB - PMLR
SP - 1450
DP - PMLR
EP - 1459
L1 - http://proceedings.mlr.press/v51/bloniarz16.pdf
UR - http://proceedings.mlr.press/v51/bloniarz16.html
AB - Techniques for nonparametric regression based on fitting small-scale local models at prediction time have long been studied in statistics and pattern recognition, but have received less attention in modern large-scale machine learning applications. In practice, such methods are generally applied to low-dimensional problems, but may falter with high-dimensional predictors if they use a Euclidean distance-based kernel. We propose a new method, SILO, for fitting prediction-time local models that uses supervised neighborhoods that adapt to the local shape of the regression surface. To learn such neighborhoods, we use a weight function between points derived from random forests. We prove the consistency of SILO, and demonstrate through simulations and real data that our method works well in both the serial and distributed settings. In the latter case, SILO learns the weighting function in a divide-and-conquer manner, entirely avoiding communication at training time.
ER -
APA
Bloniarz, A., Talwalkar, A., Yu, B. & Wu, C. (2016). Supervised Neighborhoods for Distributed Nonparametric Regression. Proceedings of the 19th International Conference on Artificial Intelligence and Statistics, in PMLR 51:1450-1459.
