BEAR: Sketching BFGS Algorithm for Ultra-High Dimensional Feature Selection in Sublinear Memory

Amirali Aghazadeh, Vipul Gupta, Alex DeWeese, Ozan Koyluoglu, Kannan Ramchandran
Proceedings of the 2nd Mathematical and Scientific Machine Learning Conference, PMLR 145:75-92, 2022.

Abstract

We consider feature selection for applications in machine learning where the dimensionality of the data is so large that it exceeds the working memory of the (local) computing machine. Unfortunately, current large-scale sketching algorithms exhibit a poor memory-accuracy trade-off when selecting features in high dimensions, due to the irreversible collision and accumulation of stochastic gradient noise in the sketched domain. Here, we develop a second-order feature selection algorithm, called BEAR, which avoids the extra collisions by efficiently storing the second-order stochastic gradients of the celebrated Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithm in a Count Sketch, at a memory cost that grows sublinearly with the size of the feature vector. BEAR reveals an unexplored advantage of second-order optimization for memory-constrained high-dimensional gradient sketching. Our extensive experiments on several real-world data sets, from genomics to language processing, demonstrate that BEAR requires up to three orders of magnitude less memory to achieve the same classification accuracy as first-order sketching algorithms, with comparable run time. Our theoretical analysis further proves the global convergence of BEAR at an $\mathcal{O}(1/t)$ rate over $t$ iterations of the sketched algorithm.
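To make the memory model concrete, the sketch below shows a minimal, generic Count Sketch in Python: a depth-by-width array of signed counters into which coordinate updates are hashed, queried via a median of signed counter reads. This is the standard textbook construction, not the authors' implementation; the depth/width defaults, the Mersenne-prime 2-universal hashes, and the class name are illustrative assumptions. In BEAR's setting, the inserted quantities would be the BFGS-preconditioned (second-order) stochastic gradient updates rather than raw gradients.

    # Minimal, generic Count Sketch for intuition only -- not the BEAR codebase.
    # Assumed parameters: depth=5, width=2**14; hashes are 2-universal mod a
    # Mersenne prime, computed on the fly so no O(d) state is ever stored.
    import numpy as np

    class CountSketch:
        P = (1 << 31) - 1  # Mersenne prime for the 2-universal hashes

        def __init__(self, depth=5, width=1 << 14, seed=0):
            rng = np.random.default_rng(seed)
            self.depth, self.width = depth, width
            self.rows = np.arange(depth)
            self.table = np.zeros((depth, width))
            # Random hash coefficients: (a, b) pick the bucket, (c, d) pick the sign.
            self.a = rng.integers(1, self.P, size=depth)
            self.b = rng.integers(0, self.P, size=depth)
            self.c = rng.integers(1, self.P, size=depth)
            self.d = rng.integers(0, self.P, size=depth)

        def _bucket(self, i):
            # One bucket index per row for coordinate i.
            return (self.a * i + self.b) % self.P % self.width

        def _sign(self, i):
            # Pseudo-random +/-1 sign per row, so collisions cancel in expectation.
            return 2.0 * ((self.c * i + self.d) % self.P % 2) - 1.0

        def update(self, i, delta):
            # Accumulate a (possibly preconditioned) gradient update at coordinate i.
            self.table[self.rows, self._bucket(i)] += self._sign(i) * delta

        def query(self, i):
            # Median of the depth signed counters: robust estimate of coordinate i.
            return np.median(self._sign(i) * self.table[self.rows, self._bucket(i)])

    # Toy usage: a feature vector of nominal dimension ~2**24 is never
    # materialized; only depth * width = 5 * 16384 counters are stored.
    cs = CountSketch()
    for _ in range(100):
        cs.update(12_345, 1.0)   # a repeatedly updated ("heavy") feature
    cs.update(777, 0.3)          # a rarely updated feature
    print(cs.query(12_345), cs.query(777))  # ~100.0 vs. ~0.3, up to collision noise

Feature selection then reduces to querying the sketch for the heaviest coordinates. Total storage is depth * width counters, independent of the nominal feature dimension, which is what the abstract means by memory growing sublinearly with the size of the feature vector; the abstract's central claim is that inserting BFGS-preconditioned updates, rather than raw stochastic gradients, reduces the collision noise that otherwise accumulates in the counters.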

Cite this Paper


BibTeX
@InProceedings{pmlr-v145-aghazadeh22a,
  title     = {BEAR: Sketching BFGS Algorithm for Ultra-High Dimensional Feature Selection in Sublinear Memory},
  author    = {Aghazadeh, Amirali and Gupta, Vipul and DeWeese, Alex and Koyluoglu, Ozan and Ramchandran, Kannan},
  booktitle = {Proceedings of the 2nd Mathematical and Scientific Machine Learning Conference},
  pages     = {75--92},
  year      = {2022},
  editor    = {Bruna, Joan and Hesthaven, Jan and Zdeborova, Lenka},
  volume    = {145},
  series    = {Proceedings of Machine Learning Research},
  month     = {16--19 Aug},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v145/aghazadeh22a/aghazadeh22a.pdf},
  url       = {https://proceedings.mlr.press/v145/aghazadeh22a.html}
}
Endnote
%0 Conference Paper
%T BEAR: Sketching BFGS Algorithm for Ultra-High Dimensional Feature Selection in Sublinear Memory
%A Amirali Aghazadeh
%A Vipul Gupta
%A Alex DeWeese
%A Ozan Koyluoglu
%A Kannan Ramchandran
%B Proceedings of the 2nd Mathematical and Scientific Machine Learning Conference
%C Proceedings of Machine Learning Research
%D 2022
%E Joan Bruna
%E Jan Hesthaven
%E Lenka Zdeborova
%F pmlr-v145-aghazadeh22a
%I PMLR
%P 75--92
%U https://proceedings.mlr.press/v145/aghazadeh22a.html
%V 145
APA
Aghazadeh, A., Gupta, V., DeWeese, A., Koyluoglu, O. & Ramchandran, K. (2022). BEAR: Sketching BFGS Algorithm for Ultra-High Dimensional Feature Selection in Sublinear Memory. Proceedings of the 2nd Mathematical and Scientific Machine Learning Conference, in Proceedings of Machine Learning Research 145:75-92. Available from https://proceedings.mlr.press/v145/aghazadeh22a.html.