Fast Distribution To Real Regression

Junier Oliva, Willie Neiswanger, Barnabas Poczos, Jeff Schneider, Eric Xing
Proceedings of the Seventeenth International Conference on Artificial Intelligence and Statistics, PMLR 33:706-714, 2014.

Abstract

We study the problem of distribution to real regression, where one aims to regress a mapping f that takes in a distribution input covariate P∈\mathcal{I} (for a non-parametric family of distributions \mathcal{I}) and outputs a real-valued response Y = f(P) + ε. This setting was recently studied in Póczos et al. (2013), where the “Kernel-Kernel” estimator was introduced and shown to have a polynomial rate of convergence. However, evaluating a new prediction with the Kernel-Kernel estimator scales as Ω(N). This creates a difficult situation in which a large amount of data may be necessary for a low estimation risk, yet the computational cost of estimation becomes infeasible when the dataset is too large. To this end, we propose the Double-Basis estimator, which alleviates this big-data problem in two ways: first, the Double-Basis estimator is shown to have a computational complexity that is independent of the number of instances N when evaluating new predictions after training; second, the Double-Basis estimator is shown to have a fast rate of convergence for a general class of mappings f∈\mathcal{F}.
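The abstract's key computational point — summarizing each input distribution by a fixed-length vector of basis coefficients so that prediction cost no longer grows with the number of training instances N — can be illustrated with a toy sketch. The following Python example is an assumption-laden illustration, not the paper's exact Double-Basis construction: each distribution is observed only through a sample on [0, 1], featurized by empirical cosine-basis coefficients, and a ridge regressor is fit on those features (the basis size, ridge penalty, and Beta-distributed demo data are all made up for the example).

```python
import numpy as np

def cosine_basis_features(sample, num_basis=10):
    """Project an empirical sample on [0, 1] onto the cosine basis.

    The k-th coefficient estimates E[phi_k(X)] for
    phi_k(x) = sqrt(2) * cos(pi * k * x), summarizing the sample's
    underlying density by a fixed-length vector.
    """
    x = np.asarray(sample)
    ks = np.arange(1, num_basis + 1)
    return np.sqrt(2) * np.cos(np.pi * np.outer(x, ks)).mean(axis=0)

def fit_ridge(features, responses, reg=1e-3):
    """Ridge regression on the fixed-length distribution features."""
    Phi = np.asarray(features)
    A = Phi.T @ Phi + reg * np.eye(Phi.shape[1])
    return np.linalg.solve(A, Phi.T @ np.asarray(responses))

rng = np.random.default_rng(0)
# Each instance is a sample from Beta(a, 2); the response is its mean a/(a+2).
train_features, responses = [], []
for _ in range(200):
    a = rng.uniform(0.5, 5.0)
    train_features.append(cosine_basis_features(rng.beta(a, 2.0, size=500)))
    responses.append(a / (a + 2.0))
w = fit_ridge(train_features, responses)

# Predicting for a new distribution touches only the length-10 feature
# vector and w — the cost is independent of N = 200 training instances.
a_new = 3.0
pred = cosine_basis_features(rng.beta(a_new, 2.0, size=500)) @ w
```

This is the contrast the abstract draws: a kernel-smoother-style estimator must compare a new input against all N training distributions at prediction time, whereas a basis-expansion estimator compresses training into fixed-size coefficients up front.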

Cite this Paper


BibTeX
@InProceedings{pmlr-v33-oliva14a,
  title     = {{Fast Distribution To Real Regression}},
  author    = {Oliva, Junier and Neiswanger, Willie and Poczos, Barnabas and Schneider, Jeff and Xing, Eric},
  booktitle = {Proceedings of the Seventeenth International Conference on Artificial Intelligence and Statistics},
  pages     = {706--714},
  year      = {2014},
  editor    = {Kaski, Samuel and Corander, Jukka},
  volume    = {33},
  series    = {Proceedings of Machine Learning Research},
  address   = {Reykjavik, Iceland},
  month     = {22--25 Apr},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v33/oliva14a.pdf},
  url       = {https://proceedings.mlr.press/v33/oliva14a.html},
  abstract  = {We study the problem of distribution to real regression, where one aims to regress a mapping f that takes in a distribution input covariate P∈\mathcal{I} (for a non-parametric family of distributions \mathcal{I}) and outputs a real-valued response Y = f(P) + ε. This setting was recently studied in Póczos et al. (2013), where the “Kernel-Kernel” estimator was introduced and shown to have a polynomial rate of convergence. However, evaluating a new prediction with the Kernel-Kernel estimator scales as Ω(N). This creates a difficult situation in which a large amount of data may be necessary for a low estimation risk, yet the computational cost of estimation becomes infeasible when the dataset is too large. To this end, we propose the Double-Basis estimator, which alleviates this big-data problem in two ways: first, the Double-Basis estimator is shown to have a computational complexity that is independent of the number of instances N when evaluating new predictions after training; second, the Double-Basis estimator is shown to have a fast rate of convergence for a general class of mappings f∈\mathcal{F}.}
}
Endnote
%0 Conference Paper
%T Fast Distribution To Real Regression
%A Junier Oliva
%A Willie Neiswanger
%A Barnabas Poczos
%A Jeff Schneider
%A Eric Xing
%B Proceedings of the Seventeenth International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2014
%E Samuel Kaski
%E Jukka Corander
%F pmlr-v33-oliva14a
%I PMLR
%P 706--714
%U https://proceedings.mlr.press/v33/oliva14a.html
%V 33
%X We study the problem of distribution to real regression, where one aims to regress a mapping f that takes in a distribution input covariate P∈\mathcal{I} (for a non-parametric family of distributions \mathcal{I}) and outputs a real-valued response Y = f(P) + ε. This setting was recently studied in Póczos et al. (2013), where the “Kernel-Kernel” estimator was introduced and shown to have a polynomial rate of convergence. However, evaluating a new prediction with the Kernel-Kernel estimator scales as Ω(N). This creates a difficult situation in which a large amount of data may be necessary for a low estimation risk, yet the computational cost of estimation becomes infeasible when the dataset is too large. To this end, we propose the Double-Basis estimator, which alleviates this big-data problem in two ways: first, the Double-Basis estimator is shown to have a computational complexity that is independent of the number of instances N when evaluating new predictions after training; second, the Double-Basis estimator is shown to have a fast rate of convergence for a general class of mappings f∈\mathcal{F}.
RIS
TY - CPAPER
TI - Fast Distribution To Real Regression
AU - Junier Oliva
AU - Willie Neiswanger
AU - Barnabas Poczos
AU - Jeff Schneider
AU - Eric Xing
BT - Proceedings of the Seventeenth International Conference on Artificial Intelligence and Statistics
DA - 2014/04/02
ED - Samuel Kaski
ED - Jukka Corander
ID - pmlr-v33-oliva14a
PB - PMLR
DP - Proceedings of Machine Learning Research
VL - 33
SP - 706
EP - 714
L1 - http://proceedings.mlr.press/v33/oliva14a.pdf
UR - https://proceedings.mlr.press/v33/oliva14a.html
AB - We study the problem of distribution to real regression, where one aims to regress a mapping f that takes in a distribution input covariate P∈\mathcal{I} (for a non-parametric family of distributions \mathcal{I}) and outputs a real-valued response Y = f(P) + ε. This setting was recently studied in Póczos et al. (2013), where the “Kernel-Kernel” estimator was introduced and shown to have a polynomial rate of convergence. However, evaluating a new prediction with the Kernel-Kernel estimator scales as Ω(N). This creates a difficult situation in which a large amount of data may be necessary for a low estimation risk, yet the computational cost of estimation becomes infeasible when the dataset is too large. To this end, we propose the Double-Basis estimator, which alleviates this big-data problem in two ways: first, the Double-Basis estimator is shown to have a computational complexity that is independent of the number of instances N when evaluating new predictions after training; second, the Double-Basis estimator is shown to have a fast rate of convergence for a general class of mappings f∈\mathcal{F}.
ER -
APA
Oliva, J., Neiswanger, W., Poczos, B., Schneider, J. & Xing, E. (2014). Fast Distribution To Real Regression. Proceedings of the Seventeenth International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 33:706-714. Available from https://proceedings.mlr.press/v33/oliva14a.html.