Tight conditions for consistent variable selection in high dimensional nonparametric regression

Laëtitia Comminges; Arnak S. Dalalyan

Proceedings of the 24th Annual Conference on Learning Theory, PMLR 19:187-206, 2011.

Abstract

We address the issue of variable selection in the regression model with very high ambient dimension, i.e., when the number of covariates is very large. The main focus is on the situation where the number of relevant covariates, called intrinsic dimension, is much smaller than the ambient dimension. Without assuming any parametric form of the underlying regression function, we get tight conditions making it possible to consistently estimate the set of relevant variables. These conditions relate the intrinsic dimension to the ambient dimension and to the sample size. The procedure that is provably consistent under these tight conditions is simple and is based on comparing the empirical Fourier coefficients with an appropriately chosen threshold value.
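The thresholding idea in the last sentence of the abstract can be illustrated with a minimal sketch. This is not the authors' exact procedure. It assumes covariates uniform on [0, 1], looks only at univariate trigonometric frequencies (so it can only detect variables with a non-zero main effect), and uses a hypothetical threshold of the form c * std(y) * sqrt(2 log(M) / n), where M is the number of coefficients examined. The names `select_variables`, `max_freq`, `c`, and `demo` are illustrative and do not come from the paper.

```python
import math
import random


def select_variables(X, y, max_freq=3, c=2.0):
    """Keep variable j if one of its empirical Fourier coefficients is large.

    Assumptions (not taken from the paper): rows of X lie in [0, 1]^d,
    only univariate frequencies 1..max_freq are examined, and the threshold
    constant c is a hand-picked illustration value.
    """
    n, d = len(X), len(X[0])
    mean_y = sum(y) / n
    yc = [v - mean_y for v in y]  # centre the response
    std_y = math.sqrt(sum(v * v for v in yc) / n)
    n_coef = 2 * d * max_freq  # one cosine and one sine per (variable, frequency)
    threshold = c * std_y * math.sqrt(2.0 * math.log(n_coef) / n)
    scale = math.sqrt(2.0) / n  # sqrt(2) normalises cos/sin in L2([0, 1])

    selected = []
    for j in range(d):
        largest = 0.0
        for k in range(1, max_freq + 1):
            cos_sum = 0.0
            sin_sum = 0.0
            for i in range(n):
                angle = 2.0 * math.pi * k * X[i][j]
                cos_sum += yc[i] * math.cos(angle)
                sin_sum += yc[i] * math.sin(angle)
            largest = max(largest, abs(cos_sum * scale), abs(sin_sum * scale))
        if largest > threshold:
            selected.append(j)
    return selected, threshold


def demo(seed=0):
    """Synthetic check: only coordinates 0 and 3 influence the response."""
    rng = random.Random(seed)
    n, d = 1000, 20
    X = [[rng.random() for _ in range(d)] for _ in range(n)]
    y = [
        2.0 * math.sin(2.0 * math.pi * row[0])
        + 2.0 * math.cos(2.0 * math.pi * row[3])
        + 0.5 * rng.gauss(0.0, 1.0)
        for row in X
    ]
    selected, _ = select_variables(X, y)
    return selected
```

In `demo`, the two active coordinates have coefficients of roughly 1.4 while the threshold is about 0.4, so the sketch should recover exactly those two indices. A variable that enters the regression function only through interactions would be missed by this univariate simplification.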

Cite this Paper

BibTeX


@InProceedings{pmlr-v19-comminges11a,
  title = 	 {Tight conditions for consistent variable selection in high dimensional nonparametric regression},
  author = 	 {Comminges, Laëtitia and Dalalyan, Arnak S.},
  booktitle = 	 {Proceedings of the 24th Annual Conference on Learning Theory},
  pages = 	 {187--206},
  year = 	 {2011},
  editor = 	 {Kakade, Sham M. and von Luxburg, Ulrike},
  volume = 	 {19},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {Budapest, Hungary},
  month = 	 {09--11 Jun},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v19/comminges11a/comminges11a.pdf},
  url = 	 {https://proceedings.mlr.press/v19/comminges11a.html},
  abstract = 	 {We address the issue of variable selection in the regression model with very high ambient dimension, i.e., when the number of covariates is very large. The main focus is on the situation where the number of relevant covariates, called intrinsic dimension, is much smaller than the ambient dimension. Without assuming any parametric form of the underlying regression function, we get tight conditions making it possible to consistently estimate the set of relevant variables. These conditions relate the intrinsic dimension to the ambient dimension and to the sample size.  The procedure that is provably consistent under these tight conditions is simple and is based on comparing the empirical Fourier coefficients with an appropriately chosen threshold value.}
}

Endnote

%0 Conference Paper
%T Tight conditions for consistent variable selection in high dimensional nonparametric regression
%A Laëtitia Comminges
%A Arnak S. Dalalyan
%B Proceedings of the 24th Annual Conference on Learning Theory
%C Proceedings of Machine Learning Research
%D 2011
%E Sham M. Kakade
%E Ulrike von Luxburg
%F pmlr-v19-comminges11a
%I PMLR
%P 187--206
%U https://proceedings.mlr.press/v19/comminges11a.html
%V 19
%X We address the issue of variable selection in the regression model with very high ambient dimension, i.e., when the number of covariates is very large. The main focus is on the situation where the number of relevant covariates, called intrinsic dimension, is much smaller than the ambient dimension. Without assuming any parametric form of the underlying regression function, we get tight conditions making it possible to consistently estimate the set of relevant variables. These conditions relate the intrinsic dimension to the ambient dimension and to the sample size.  The procedure that is provably consistent under these tight conditions is simple and is based on comparing the empirical Fourier coefficients with an appropriately chosen threshold value.

RIS


TY  - CPAPER
TI  - Tight conditions for consistent variable selection in high dimensional nonparametric regression
AU  - Laëtitia Comminges
AU  - Arnak S. Dalalyan
BT  - Proceedings of the 24th Annual Conference on Learning Theory
DA  - 2011/12/21
ED  - Sham M. Kakade
ED  - Ulrike von Luxburg
ID  - pmlr-v19-comminges11a
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 19
SP  - 187
EP  - 206
L1  - http://proceedings.mlr.press/v19/comminges11a/comminges11a.pdf
UR  - https://proceedings.mlr.press/v19/comminges11a.html
AB  - We address the issue of variable selection in the regression model with very high ambient dimension, i.e., when the number of covariates is very large. The main focus is on the situation where the number of relevant covariates, called intrinsic dimension, is much smaller than the ambient dimension. Without assuming any parametric form of the underlying regression function, we get tight conditions making it possible to consistently estimate the set of relevant variables. These conditions relate the intrinsic dimension to the ambient dimension and to the sample size.  The procedure that is provably consistent under these tight conditions is simple and is based on comparing the empirical Fourier coefficients with an appropriately chosen threshold value.
ER  -

APA


Comminges, L. & Dalalyan, A.S. (2011). Tight conditions for consistent variable selection in high dimensional nonparametric regression. Proceedings of the 24th Annual Conference on Learning Theory, in Proceedings of Machine Learning Research 19:187-206. Available from https://proceedings.mlr.press/v19/comminges11a.html.
