Group Orthogonal Matching Pursuit for Logistic Regression
Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, PMLR 15:452-460, 2011.
We consider a matching pursuit approach for variable selection and estimation in logistic regression models. Specifically, we propose Logistic Group Orthogonal Matching Pursuit (Logit-GOMP), which extends the Group-OMP procedure, originally proposed for linear regression models, to select groups of variables in logistic regression models given a predefined grouping structure within the explanatory variables. We theoretically characterize the performance of Logit-GOMP in terms of predictive accuracy, and also provide conditions under which Logit-GOMP is able to identify the correct (groups of) variables. Our results are non-asymptotic, in contrast to classical consistency results for logistic regression, which apply only in the asymptotic limit where the dimensionality is fixed or is restricted to grow slowly with the sample size. We conduct an empirical evaluation on simulated data sets and on the real-world problem of splice site detection in DNA sequences. The results indicate that Logit-GOMP compares favorably to Logistic Group Lasso in terms of both variable selection and prediction accuracy. We also provide a generic version of our algorithm that applies to the wider class of generalized linear models.
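To make the greedy group-selection idea concrete, the following is a minimal sketch of a group matching pursuit loop for logistic regression. It is an illustration under stated assumptions, not the algorithm as specified in the paper: the selection rule (largest Euclidean norm of the logistic-loss gradient restricted to each group), the stopping tolerance `tol`, the small ridge term in the Newton refit, and the names `logit_gomp`, `fit_logistic`, and `sigmoid` are all hypothetical choices made for this example. The paper's exact selection criterion, normalization, and stopping rule may differ.

```python
import numpy as np

def sigmoid(z):
    # Clip the argument to avoid overflow in exp for extreme values.
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30.0, 30.0)))

def fit_logistic(X, y, n_iter=50, ridge=1e-6):
    """Refit logistic regression on the given columns by Newton's method.

    y is assumed to take values in {0, 1}. The tiny ridge term is an
    assumption of this sketch, added only for numerical stability.
    """
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(n_iter):
        p = sigmoid(X @ w)
        grad = X.T @ (p - y) / n + ridge * w
        s = p * (1.0 - p)
        hess = (X * s[:, None]).T @ X / n + ridge * np.eye(d)
        step = np.linalg.solve(hess, grad)
        w -= step
        if np.linalg.norm(step) < 1e-8:
            break
    return w

def logit_gomp(X, y, groups, max_groups, tol=1e-3):
    """Greedy group selection followed by a full refit at each step.

    groups: list of integer index arrays, one per predefined group.
    Returns the list of selected group indices and the coefficient vector.
    """
    n, d = X.shape
    w = np.zeros(d)
    selected = []
    for _ in range(max_groups):
        # Gradient of the average logistic loss at the current fit.
        grad = X.T @ (sigmoid(X @ w) - y) / n
        # Score each unselected group by its gradient norm (assumed rule).
        scores = [0.0 if g in selected else np.linalg.norm(grad[idx])
                  for g, idx in enumerate(groups)]
        best = int(np.argmax(scores))
        if scores[best] < tol:
            break
        selected.append(best)
        # "Orthogonal" step: refit all coefficients of the selected groups.
        cols = np.concatenate([groups[g] for g in selected])
        w = np.zeros(d)
        w[cols] = fit_logistic(X[:, cols], y)
    return selected, w

# Synthetic usage example: 6 groups of 3 variables, groups 0 and 3 active.
rng = np.random.default_rng(0)
n, group_size, n_groups = 1000, 3, 6
X = rng.standard_normal((n, group_size * n_groups))
groups = [np.arange(g * group_size, (g + 1) * group_size)
          for g in range(n_groups)]
w_true = np.zeros(X.shape[1])
w_true[groups[0]] = 1.0
w_true[groups[3]] = 1.0
y = (rng.random(n) < sigmoid(X @ w_true)).astype(float)
selected, w = logit_gomp(X, y, groups, max_groups=2)
```

The refit over all selected groups at every iteration is what distinguishes an orthogonal matching pursuit from plain matching pursuit, which would update only the newly added group. Replacing `sigmoid` and the Newton refit with the mean function and fitting routine of another generalized linear model would give a sketch in the spirit of the generic version mentioned above, again only as an assumption about how such a variant might look.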