Attribute Efficient Linear Regression with Distribution-Dependent Sampling
Proceedings of the 32nd International Conference on Machine Learning, PMLR 37:153-161, 2015.
We consider a budgeted learning setting, where the learner can only choose and observe a small subset of the attributes of each training example. We develop efficient algorithms for Ridge and Lasso linear regression, which utilize the geometry of the data by a novel distribution-dependent sampling scheme, and have excess risk bounds which are better a factor of up to O(d/k) over the state-of-the-art, where d is the dimension and k+1 is the number of observed attributes per example. Moreover, under reasonable assumptions, our algorithms are the first in our setting which can provably use *less* attributes than full-information algorithms, which is the main concern in budgeted learning. We complement our theoretical analysis with experiments which support our claims.