Group Orthogonal Matching Pursuit for Logistic Regression

Aurelie Lozano, Grzegorz Swirszcz, Naoki Abe
Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, PMLR 15:452-460, 2011.

Abstract

We consider a matching pursuit approach for variable selection and estimation in logistic regression models. Specifically, we propose Logistic Group Orthogonal Matching Pursuit (Logit-GOMP), which extends the Group-OMP procedure, originally proposed for linear regression models, to select groups of variables in logistic regression models, given a predefined grouping structure within the explanatory variables. We theoretically characterize the performance of Logit-GOMP in terms of predictive accuracy, and also provide conditions under which Logit-GOMP is able to identify the correct (groups of) variables. Our results are non-asymptotic, in contrast to classical consistency results for logistic regression, which apply only in the asymptotic limit where the dimensionality is fixed or is restricted to grow slowly with the sample size. We conduct an empirical evaluation on simulated data sets and on the real-world problem of splice site detection in DNA sequences. The results indicate that Logit-GOMP compares favorably to Logistic Group Lasso in terms of both variable selection and prediction accuracy. We also provide a generic version of our algorithm that applies to the wider class of generalized linear models.
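The abstract does not spell out the algorithm, so the following is only a minimal sketch of a generic greedy group-selection loop for logistic regression in the spirit of group matching pursuit, not the paper's exact Logit-GOMP procedure. It assumes labels in {0, 1}, scores each group by the norm of the logistic-loss gradient restricted to that group, and refits on all selected groups with a plain Newton solver. The function names (`sigmoid`, `fit_logistic`, `greedy_group_logistic`), the stopping rule (a fixed number of groups), and the small ridge term are illustrative assumptions.

```python
import numpy as np


def sigmoid(z):
    # Clip the argument so np.exp cannot overflow.
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30.0, 30.0)))


def fit_logistic(X, y, n_iter=50, ridge=1e-6):
    """Fit logistic regression (labels in {0, 1}) by Newton's method.

    The tiny ridge term is an assumption added here to keep the Hessian
    invertible; it is not taken from the paper.
    """
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = sigmoid(X @ w)
        grad = X.T @ (p - y) + ridge * w
        hess = X.T @ (X * (p * (1.0 - p))[:, None]) + ridge * np.eye(X.shape[1])
        step = np.linalg.solve(hess, grad)
        w -= step
        if np.linalg.norm(step) < 1e-8:
            break
    return w


def greedy_group_logistic(X, y, groups, max_groups):
    """Greedy forward selection of predefined feature groups.

    groups: list of integer index arrays, one per group.
    Returns the selected group indices (in selection order) and the
    full-length weight vector, which is zero outside the selected groups.
    """
    selected = []
    w = np.zeros(X.shape[1])
    for _ in range(max_groups):
        # Negative gradient of the logistic loss with respect to the
        # linear predictor, evaluated at the current fit.
        residual = y - sigmoid(X @ w)
        # Score each unselected group by the norm of its gradient block.
        scores = np.array([
            -np.inf if g in selected else np.linalg.norm(X[:, idx].T @ residual)
            for g, idx in enumerate(groups)
        ])
        best = int(np.argmax(scores))
        if not np.isfinite(scores[best]):
            break  # every group has already been selected
        selected.append(best)
        # Refit on all selected groups (the "orthogonal" step).
        active = np.concatenate([groups[g] for g in selected])
        w = np.zeros(X.shape[1])
        w[active] = fit_logistic(X[:, active], y)
    return selected, w
```

Refitting all active coefficients after each selection is what distinguishes an orthogonal pursuit from plain matching pursuit, which would update only the newly chosen group. In practice the number of groups to select would be chosen by cross-validation or a similar criterion rather than fixed in advance.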

Cite this Paper


BibTeX
@InProceedings{pmlr-v15-lozano11a,
  title = {Group Orthogonal Matching Pursuit for Logistic Regression},
  author = {Lozano, Aurelie and Swirszcz, Grzegorz and Abe, Naoki},
  booktitle = {Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics},
  pages = {452--460},
  year = {2011},
  editor = {Gordon, Geoffrey and Dunson, David and Dudík, Miroslav},
  volume = {15},
  series = {Proceedings of Machine Learning Research},
  address = {Fort Lauderdale, FL, USA},
  month = {11--13 Apr},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v15/lozano11a/lozano11a.pdf},
  url = {https://proceedings.mlr.press/v15/lozano11a.html},
  abstract = {We consider a matching pursuit approach for variable selection and estimation in logistic regression models. Specifically, we propose Logistic Group Orthogonal Matching Pursuit (Logit-GOMP), which extends the Group-OMP procedure originally proposed for linear regression models, to select groups of variables in logistic regression models, given a predefined grouping structure within the explanatory variables. We theoretically characterize the performance of Logit-GOMP in terms of predictive accuracy, and also provide conditions under which Logit-GOMP is able to identify the correct (groups of) variables. Our results are non-asymptotic in contrast to classical consistency results for logistic regression which only apply in the asymptotic limit where the dimensionality is fixed or is restricted to grow slowly with the sample size. We conduct empirical evaluation on simulated data sets and the real world problem of splice site detection in DNA sequences. The results indicate that Logit-GOMP compares favorably to Logistic Group Lasso both in terms of variable selection and prediction accuracy. We also provide a generic version of our algorithm that applies to the wider class of generalized linear models.}
}
Endnote
%0 Conference Paper
%T Group Orthogonal Matching Pursuit for Logistic Regression
%A Aurelie Lozano
%A Grzegorz Swirszcz
%A Naoki Abe
%B Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2011
%E Geoffrey Gordon
%E David Dunson
%E Miroslav Dudík
%F pmlr-v15-lozano11a
%I PMLR
%P 452--460
%U https://proceedings.mlr.press/v15/lozano11a.html
%V 15
%X We consider a matching pursuit approach for variable selection and estimation in logistic regression models. Specifically, we propose Logistic Group Orthogonal Matching Pursuit (Logit-GOMP), which extends the Group-OMP procedure originally proposed for linear regression models, to select groups of variables in logistic regression models, given a predefined grouping structure within the explanatory variables. We theoretically characterize the performance of Logit-GOMP in terms of predictive accuracy, and also provide conditions under which Logit-GOMP is able to identify the correct (groups of) variables. Our results are non-asymptotic in contrast to classical consistency results for logistic regression which only apply in the asymptotic limit where the dimensionality is fixed or is restricted to grow slowly with the sample size. We conduct empirical evaluation on simulated data sets and the real world problem of splice site detection in DNA sequences. The results indicate that Logit-GOMP compares favorably to Logistic Group Lasso both in terms of variable selection and prediction accuracy. We also provide a generic version of our algorithm that applies to the wider class of generalized linear models.
RIS
TY - CPAPER
TI - Group Orthogonal Matching Pursuit for Logistic Regression
AU - Aurelie Lozano
AU - Grzegorz Swirszcz
AU - Naoki Abe
BT - Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics
DA - 2011/06/14
ED - Geoffrey Gordon
ED - David Dunson
ED - Miroslav Dudík
ID - pmlr-v15-lozano11a
PB - PMLR
DP - Proceedings of Machine Learning Research
VL - 15
SP - 452
EP - 460
L1 - http://proceedings.mlr.press/v15/lozano11a/lozano11a.pdf
UR - https://proceedings.mlr.press/v15/lozano11a.html
AB - We consider a matching pursuit approach for variable selection and estimation in logistic regression models. Specifically, we propose Logistic Group Orthogonal Matching Pursuit (Logit-GOMP), which extends the Group-OMP procedure originally proposed for linear regression models, to select groups of variables in logistic regression models, given a predefined grouping structure within the explanatory variables. We theoretically characterize the performance of Logit-GOMP in terms of predictive accuracy, and also provide conditions under which Logit-GOMP is able to identify the correct (groups of) variables. Our results are non-asymptotic in contrast to classical consistency results for logistic regression which only apply in the asymptotic limit where the dimensionality is fixed or is restricted to grow slowly with the sample size. We conduct empirical evaluation on simulated data sets and the real world problem of splice site detection in DNA sequences. The results indicate that Logit-GOMP compares favorably to Logistic Group Lasso both in terms of variable selection and prediction accuracy. We also provide a generic version of our algorithm that applies to the wider class of generalized linear models.
ER -
APA
Lozano, A., Swirszcz, G. & Abe, N. (2011). Group Orthogonal Matching Pursuit for Logistic Regression. Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 15:452-460. Available from https://proceedings.mlr.press/v15/lozano11a.html.
