Attribute Efficient Linear Regression with Distribution-Dependent Sampling

Doron Kukliansky, Ohad Shamir
Proceedings of the 32nd International Conference on Machine Learning, PMLR 37:153-161, 2015.

Abstract

We consider a budgeted learning setting, where the learner can only choose and observe a small subset of the attributes of each training example. We develop efficient algorithms for Ridge and Lasso linear regression, which utilize the geometry of the data by a novel distribution-dependent sampling scheme, and have excess risk bounds which are better a factor of up to O(d/k) over the state-of-the-art, where d is the dimension and k+1 is the number of observed attributes per example. Moreover, under reasonable assumptions, our algorithms are the first in our setting which can provably use *less* attributes than full-information algorithms, which is the main concern in budgeted learning. We complement our theoretical analysis with experiments which support our claims.

Cite this Paper


BibTeX
@InProceedings{pmlr-v37-kukliansky15, title = {Attribute Efficient Linear Regression with Distribution-Dependent Sampling}, author = {Kukliansky, Doron and Shamir, Ohad}, booktitle = {Proceedings of the 32nd International Conference on Machine Learning}, pages = {153--161}, year = {2015}, editor = {Bach, Francis and Blei, David}, volume = {37}, series = {Proceedings of Machine Learning Research}, address = {Lille, France}, month = {07--09 Jul}, publisher = {PMLR}, pdf = {http://proceedings.mlr.press/v37/kukliansky15.pdf}, url = {https://proceedings.mlr.press/v37/kukliansky15.html}, abstract = {We consider a budgeted learning setting, where the learner can only choose and observe a small subset of the attributes of each training example. We develop efficient algorithms for Ridge and Lasso linear regression, which utilize the geometry of the data by a novel distribution-dependent sampling scheme, and have excess risk bounds which are better a factor of up to O(d/k) over the state-of-the-art, where d is the dimension and k+1 is the number of observed attributes per example. Moreover, under reasonable assumptions, our algorithms are the first in our setting which can provably use *less* attributes than full-information algorithms, which is the main concern in budgeted learning. We complement our theoretical analysis with experiments which support our claims.} }
Endnote
%0 Conference Paper %T Attribute Efficient Linear Regression with Distribution-Dependent Sampling %A Doron Kukliansky %A Ohad Shamir %B Proceedings of the 32nd International Conference on Machine Learning %C Proceedings of Machine Learning Research %D 2015 %E Francis Bach %E David Blei %F pmlr-v37-kukliansky15 %I PMLR %P 153--161 %U https://proceedings.mlr.press/v37/kukliansky15.html %V 37 %X We consider a budgeted learning setting, where the learner can only choose and observe a small subset of the attributes of each training example. We develop efficient algorithms for Ridge and Lasso linear regression, which utilize the geometry of the data by a novel distribution-dependent sampling scheme, and have excess risk bounds which are better a factor of up to O(d/k) over the state-of-the-art, where d is the dimension and k+1 is the number of observed attributes per example. Moreover, under reasonable assumptions, our algorithms are the first in our setting which can provably use *less* attributes than full-information algorithms, which is the main concern in budgeted learning. We complement our theoretical analysis with experiments which support our claims.
RIS
TY - CPAPER TI - Attribute Efficient Linear Regression with Distribution-Dependent Sampling AU - Doron Kukliansky AU - Ohad Shamir BT - Proceedings of the 32nd International Conference on Machine Learning DA - 2015/06/01 ED - Francis Bach ED - David Blei ID - pmlr-v37-kukliansky15 PB - PMLR DP - Proceedings of Machine Learning Research VL - 37 SP - 153 EP - 161 L1 - http://proceedings.mlr.press/v37/kukliansky15.pdf UR - https://proceedings.mlr.press/v37/kukliansky15.html AB - We consider a budgeted learning setting, where the learner can only choose and observe a small subset of the attributes of each training example. We develop efficient algorithms for Ridge and Lasso linear regression, which utilize the geometry of the data by a novel distribution-dependent sampling scheme, and have excess risk bounds which are better a factor of up to O(d/k) over the state-of-the-art, where d is the dimension and k+1 is the number of observed attributes per example. Moreover, under reasonable assumptions, our algorithms are the first in our setting which can provably use *less* attributes than full-information algorithms, which is the main concern in budgeted learning. We complement our theoretical analysis with experiments which support our claims. ER -
APA
Kukliansky, D. & Shamir, O.. (2015). Attribute Efficient Linear Regression with Distribution-Dependent Sampling. Proceedings of the 32nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 37:153-161 Available from https://proceedings.mlr.press/v37/kukliansky15.html.

Related Material