Feature selection via block-regularized regression

Seyoung Kim, Eric Xing
Proceedings of the 24th Conference on Uncertainty in Artificial Intelligence, PMLR R6:325-332, 2008.

Abstract

Identifying co-varying causal elements in a very high-dimensional feature space with internal structure, e.g., a space with as many as millions of linearly ordered features, as one typically encounters in problems such as whole genome association (WGA) mapping, remains an open problem in statistical learning. We propose a block-regularized regression model for sparse variable selection in a high-dimensional space where the covariates are linearly ordered and are possibly subject to local statistical linkages (e.g., block structures) due to spatial or temporal proximity of the features. Our goal is to identify a small subset of relevant covariates that are not merely from random positions in the ordering, but grouped as contiguous blocks from a large number of ordered covariates. Following a typical linear regression framework between the features and the response, our proposed model employs a sparsity-enforcing Laplacian prior for the regression coefficients, augmented by a first-order Markovian process along the feature sequence that "activates" the regression coefficients in a coupled fashion. We describe a sampling-based learning algorithm and demonstrate the performance of our method on simulated and biological data for marker identification under WGA.
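
The abstract describes the model only at a high level. As a rough illustration, the following minimal Python sketch simulates data from one plausible generative model matching that description: binary activation indicators follow a first-order Markov chain along the feature ordering (so active features tend to form contiguous blocks), coefficients of active features receive a sparsity-enforcing Laplacian prior, and the response is a noisy linear function of the features. The specific parameterization (the names `stay_prob`, `activate_prob`, and the indicator-times-coefficient form) is an assumption made for illustration, not taken from the paper, and the sketch does not implement the paper's sampling-based learning algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_block_sparse_regression(n=200, p=500,
                                     stay_prob=0.98,      # assumed P(active stays active at the next feature)
                                     activate_prob=0.02,  # assumed P(an inactive feature starts a new block)
                                     laplace_scale=1.0, noise_sd=0.5):
    """Illustrative sketch (not the paper's exact model): a first-order Markov
    chain of binary indicators 'activates' regression coefficients in contiguous
    blocks along the feature ordering; active coefficients get a Laplacian
    (double-exponential) prior; the response is a linear function plus noise."""
    # First-order Markov chain over activation indicators gamma_1, ..., gamma_p
    gamma = np.zeros(p, dtype=bool)
    gamma[0] = rng.random() < activate_prob
    for j in range(1, p):
        if gamma[j - 1]:
            gamma[j] = rng.random() < stay_prob      # tend to remain inside a block
        else:
            gamma[j] = rng.random() < activate_prob  # occasionally open a new block
    # Laplacian prior on the coefficients of active features; inactive ones are zero
    beta = np.where(gamma, rng.laplace(0.0, laplace_scale, size=p), 0.0)
    # Linear regression between the ordered features and the response
    X = rng.normal(size=(n, p))
    y = X @ beta + rng.normal(0.0, noise_sd, size=n)
    return X, y, beta, gamma

X, y, beta, gamma = simulate_block_sparse_regression()
print("active (block-structured) features:", np.flatnonzero(gamma))
```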

Cite this Paper


BibTeX
@InProceedings{pmlr-vR6-kim08a,
  title     = {Feature selection via block-regularized regression},
  author    = {Kim, Seyoung and Xing, Eric},
  booktitle = {Proceedings of the 24th Conference on Uncertainty in Artificial Intelligence},
  pages     = {325--332},
  year      = {2008},
  editor    = {McAllester, David A. and Myllymäki, Petri},
  volume    = {R6},
  series    = {Proceedings of Machine Learning Research},
  month     = {09--12 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/r6/main/assets/kim08a/kim08a.pdf},
  url       = {https://proceedings.mlr.press/r6/kim08a.html},
  abstract  = {Identifying co-varying causal elements in a very high-dimensional feature space with internal structure, e.g., a space with as many as millions of linearly ordered features, as one typically encounters in problems such as whole genome association (WGA) mapping, remains an open problem in statistical learning. We propose a block-regularized regression model for sparse variable selection in a high-dimensional space where the covariates are linearly ordered and are possibly subject to local statistical linkages (e.g., block structures) due to spatial or temporal proximity of the features. Our goal is to identify a small subset of relevant covariates that are not merely from random positions in the ordering, but grouped as contiguous blocks from a large number of ordered covariates. Following a typical linear regression framework between the features and the response, our proposed model employs a sparsity-enforcing Laplacian prior for the regression coefficients, augmented by a first-order Markovian process along the feature sequence that "activates" the regression coefficients in a coupled fashion. We describe a sampling-based learning algorithm and demonstrate the performance of our method on simulated and biological data for marker identification under WGA.},
  note      = {Reissued by PMLR on 09 October 2024.}
}
Endnote
%0 Conference Paper
%T Feature selection via block-regularized regression
%A Seyoung Kim
%A Eric Xing
%B Proceedings of the 24th Conference on Uncertainty in Artificial Intelligence
%C Proceedings of Machine Learning Research
%D 2008
%E David A. McAllester
%E Petri Myllymäki
%F pmlr-vR6-kim08a
%I PMLR
%P 325--332
%U https://proceedings.mlr.press/r6/kim08a.html
%V R6
%X Identifying co-varying causal elements in a very high-dimensional feature space with internal structure, e.g., a space with as many as millions of linearly ordered features, as one typically encounters in problems such as whole genome association (WGA) mapping, remains an open problem in statistical learning. We propose a block-regularized regression model for sparse variable selection in a high-dimensional space where the covariates are linearly ordered and are possibly subject to local statistical linkages (e.g., block structures) due to spatial or temporal proximity of the features. Our goal is to identify a small subset of relevant covariates that are not merely from random positions in the ordering, but grouped as contiguous blocks from a large number of ordered covariates. Following a typical linear regression framework between the features and the response, our proposed model employs a sparsity-enforcing Laplacian prior for the regression coefficients, augmented by a first-order Markovian process along the feature sequence that "activates" the regression coefficients in a coupled fashion. We describe a sampling-based learning algorithm and demonstrate the performance of our method on simulated and biological data for marker identification under WGA.
%Z Reissued by PMLR on 09 October 2024.
APA
Kim, S. & Xing, E. (2008). Feature selection via block-regularized regression. Proceedings of the 24th Conference on Uncertainty in Artificial Intelligence, in Proceedings of Machine Learning Research R6:325-332. Available from https://proceedings.mlr.press/r6/kim08a.html. Reissued by PMLR on 09 October 2024.