Fast Sparse Classification for Generalized Linear and Additive Models

Jiachang Liu, Chudi Zhong, Margo Seltzer, Cynthia Rudin
Proceedings of The 25th International Conference on Artificial Intelligence and Statistics, PMLR 151:9304-9333, 2022.

Abstract

We present fast classification techniques for sparse generalized linear and additive models. These techniques can handle thousands of features and thousands of observations in minutes, even in the presence of many highly correlated features. For fast sparse logistic regression, our computational speed-up over other best-subset search techniques owes to linear and quadratic surrogate cuts for the logistic loss that allow us to efficiently screen features for elimination, as well as use of a priority queue that favors a more uniform exploration of features. As an alternative to the logistic loss, we propose the exponential loss, which permits an analytical solution to the line search at each iteration. Our algorithms are generally 2 to 5 times faster than previous approaches. They produce interpretable models that have accuracy comparable to black box models on challenging datasets.
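The abstract notes that the exponential loss permits an analytical solution to the line search at each iteration. As a hedged illustration only (not the paper's code), the sketch below shows the classic closed form for a single coordinate step: for loss L(α) = Σᵢ wᵢ·exp(−α·zᵢ) with zᵢ ∈ {−1, +1} (e.g., zᵢ = yᵢ·xᵢⱼ for a ±1-valued feature), the minimizer is α* = ½·ln(W₊/W₋), where W₊ and W₋ sum the weights with zᵢ = +1 and zᵢ = −1. All function names here are illustrative.

```python
import math

def closed_form_step(weights, z):
    """Analytical line search for the exponential loss along a +/-1 coordinate.

    Setting dL/d(alpha) = -W_plus*exp(-alpha) + W_minus*exp(alpha) = 0
    gives alpha* = 0.5 * ln(W_plus / W_minus).
    """
    w_plus = sum(w for w, zi in zip(weights, z) if zi == 1)
    w_minus = sum(w for w, zi in zip(weights, z) if zi == -1)
    return 0.5 * math.log(w_plus / w_minus)

def exp_loss(alpha, weights, z):
    """Exponential loss L(alpha) = sum_i w_i * exp(-alpha * z_i)."""
    return sum(w * math.exp(-alpha * zi) for w, zi in zip(weights, z))

# Sanity check: the closed-form step minimizes the loss locally.
weights = [0.4, 0.3, 0.2, 0.1]
z = [1, 1, -1, 1]
alpha = closed_form_step(weights, z)        # 0.5 * ln(0.8 / 0.2) = ln 2
eps = 1e-4
assert exp_loss(alpha, weights, z) <= exp_loss(alpha - eps, weights, z)
assert exp_loss(alpha, weights, z) <= exp_loss(alpha + eps, weights, z)
```

By contrast, the logistic loss has no such closed form, so each line search requires an iterative solver; this is one reason an exponential-loss objective can make best-subset search cheaper per iteration.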

Cite this Paper


BibTeX
@InProceedings{pmlr-v151-liu22f,
  title     = {Fast Sparse Classification for Generalized Linear and Additive Models},
  author    = {Liu, Jiachang and Zhong, Chudi and Seltzer, Margo and Rudin, Cynthia},
  booktitle = {Proceedings of The 25th International Conference on Artificial Intelligence and Statistics},
  pages     = {9304--9333},
  year      = {2022},
  editor    = {Camps-Valls, Gustau and Ruiz, Francisco J. R. and Valera, Isabel},
  volume    = {151},
  series    = {Proceedings of Machine Learning Research},
  month     = {28--30 Mar},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v151/liu22f/liu22f.pdf},
  url       = {https://proceedings.mlr.press/v151/liu22f.html},
  abstract  = {We present fast classification techniques for sparse generalized linear and additive models. These techniques can handle thousands of features and thousands of observations in minutes, even in the presence of many highly correlated features. For fast sparse logistic regression, our computational speed-up over other best-subset search techniques owes to linear and quadratic surrogate cuts for the logistic loss that allow us to efficiently screen features for elimination, as well as use of a priority queue that favors a more uniform exploration of features. As an alternative to the logistic loss, we propose the exponential loss, which permits an analytical solution to the line search at each iteration. Our algorithms are generally 2 to 5 times faster than previous approaches. They produce interpretable models that have accuracy comparable to black box models on challenging datasets.}
}
Endnote
%0 Conference Paper
%T Fast Sparse Classification for Generalized Linear and Additive Models
%A Jiachang Liu
%A Chudi Zhong
%A Margo Seltzer
%A Cynthia Rudin
%B Proceedings of The 25th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2022
%E Gustau Camps-Valls
%E Francisco J. R. Ruiz
%E Isabel Valera
%F pmlr-v151-liu22f
%I PMLR
%P 9304--9333
%U https://proceedings.mlr.press/v151/liu22f.html
%V 151
%X We present fast classification techniques for sparse generalized linear and additive models. These techniques can handle thousands of features and thousands of observations in minutes, even in the presence of many highly correlated features. For fast sparse logistic regression, our computational speed-up over other best-subset search techniques owes to linear and quadratic surrogate cuts for the logistic loss that allow us to efficiently screen features for elimination, as well as use of a priority queue that favors a more uniform exploration of features. As an alternative to the logistic loss, we propose the exponential loss, which permits an analytical solution to the line search at each iteration. Our algorithms are generally 2 to 5 times faster than previous approaches. They produce interpretable models that have accuracy comparable to black box models on challenging datasets.
APA
Liu, J., Zhong, C., Seltzer, M. & Rudin, C. (2022). Fast Sparse Classification for Generalized Linear and Additive Models. Proceedings of The 25th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 151:9304-9333. Available from https://proceedings.mlr.press/v151/liu22f.html.