Subsample Ridge Ensembles: Equivalences and Generalized Cross-Validation

Jin-Hong Du, Pratik Patil, Arun K. Kuchibhotla
Proceedings of the 40th International Conference on Machine Learning, PMLR 202:8585-8631, 2023.

Abstract

We study subsampling-based ridge ensembles in the proportional asymptotics regime, where the feature size grows proportionally with the sample size such that their ratio converges to a constant. By analyzing the squared prediction risk of ridge ensembles as a function of the explicit penalty $\lambda$ and the limiting subsample aspect ratio $\phi_s$ (the ratio of the feature size to the subsample size), we characterize contours in the $(\lambda, \phi_s)$-plane at any achievable risk. As a consequence, we prove that the risk of the optimal full ridgeless ensemble (fitted on all possible subsamples) matches that of the optimal ridge predictor. In addition, we prove strong uniform consistency of generalized cross-validation (GCV) over the subsample sizes for estimating the prediction risk of ridge ensembles. This allows for GCV-based tuning of full ridgeless ensembles without sample splitting and yields a predictor whose risk matches optimal ridge risk.
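As a rough illustration of the objects described above, the sketch below fits a subsample ridge ensemble and picks the penalty by GCV. It assumes only NumPy; all function names, the subsample size, and the penalty grid are hypothetical choices for this sketch, and the GCV shown is the classical single-fit formula, not the ensemble-level GCV whose uniform consistency the paper establishes.

```python
import numpy as np

def ridge_gcv(X, y, lam):
    """Classical GCV estimate of prediction risk for a single ridge fit.

    Note: this is the textbook single-estimator GCV, used here only for
    illustration; it is not the ensemble-level GCV analyzed in the paper.
    """
    n, p = X.shape
    # Smoothing matrix S = X (X^T X + n*lam*I)^{-1} X^T
    G = X.T @ X + n * lam * np.eye(p)
    S = X @ np.linalg.solve(G, X.T)
    resid = y - S @ y
    df = np.trace(S)  # effective degrees of freedom
    return (resid @ resid / n) / (1.0 - df / n) ** 2

def subsample_ridge_ensemble(X, y, lam, k, M, rng):
    """Average of M ridge fits, each trained on a random subsample of size k."""
    n, p = X.shape
    coefs = np.zeros(p)
    for _ in range(M):
        idx = rng.choice(n, size=k, replace=False)
        Xs, ys = X[idx], y[idx]
        G = Xs.T @ Xs + k * lam * np.eye(p)
        coefs += np.linalg.solve(G, Xs.T @ ys)
    return coefs / M

# Synthetic proportional-regime data: p grows with n (here p/n = 0.25).
rng = np.random.default_rng(0)
n, p = 200, 50
X = rng.standard_normal((n, p))
beta = rng.standard_normal(p) / np.sqrt(p)
y = X @ beta + rng.standard_normal(n)

# Tune lambda by minimizing GCV over a grid, then fit the ensemble.
lams = np.logspace(-3, 1, 20)
best = min(lams, key=lambda l: ridge_gcv(X, y, l))
beta_hat = subsample_ridge_ensemble(X, y, best, k=100, M=25, rng=rng)
```

In the paper's setting one would instead sweep the subsample aspect ratio $\phi_s$ (here controlled by `k`) at $\lambda = 0$ and use the ensemble-level GCV to tune without sample splitting; the sketch above only shows the basic ingredients.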

Cite this Paper


BibTeX
@InProceedings{pmlr-v202-du23d,
  title =     {Subsample Ridge Ensembles: Equivalences and Generalized Cross-Validation},
  author =    {Du, Jin-Hong and Patil, Pratik and Kuchibhotla, Arun K.},
  booktitle = {Proceedings of the 40th International Conference on Machine Learning},
  pages =     {8585--8631},
  year =      {2023},
  editor =    {Krause, Andreas and Brunskill, Emma and Cho, Kyunghyun and Engelhardt, Barbara and Sabato, Sivan and Scarlett, Jonathan},
  volume =    {202},
  series =    {Proceedings of Machine Learning Research},
  month =     {23--29 Jul},
  publisher = {PMLR},
  pdf =       {https://proceedings.mlr.press/v202/du23d/du23d.pdf},
  url =       {https://proceedings.mlr.press/v202/du23d.html},
  abstract =  {We study subsampling-based ridge ensembles in the proportional asymptotics regime, where the feature size grows proportionally with the sample size such that their ratio converges to a constant. By analyzing the squared prediction risk of ridge ensembles as a function of the explicit penalty $\lambda$ and the limiting subsample aspect ratio $\phi_s$ (the ratio of the feature size to the subsample size), we characterize contours in the $(\lambda, \phi_s)$-plane at any achievable risk. As a consequence, we prove that the risk of the optimal full ridgeless ensemble (fitted on all possible subsamples) matches that of the optimal ridge predictor. In addition, we prove strong uniform consistency of generalized cross-validation (GCV) over the subsample sizes for estimating the prediction risk of ridge ensembles. This allows for GCV-based tuning of full ridgeless ensembles without sample splitting and yields a predictor whose risk matches optimal ridge risk.}
}
Endnote
%0 Conference Paper
%T Subsample Ridge Ensembles: Equivalences and Generalized Cross-Validation
%A Jin-Hong Du
%A Pratik Patil
%A Arun K. Kuchibhotla
%B Proceedings of the 40th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2023
%E Andreas Krause
%E Emma Brunskill
%E Kyunghyun Cho
%E Barbara Engelhardt
%E Sivan Sabato
%E Jonathan Scarlett
%F pmlr-v202-du23d
%I PMLR
%P 8585--8631
%U https://proceedings.mlr.press/v202/du23d.html
%V 202
%X We study subsampling-based ridge ensembles in the proportional asymptotics regime, where the feature size grows proportionally with the sample size such that their ratio converges to a constant. By analyzing the squared prediction risk of ridge ensembles as a function of the explicit penalty $\lambda$ and the limiting subsample aspect ratio $\phi_s$ (the ratio of the feature size to the subsample size), we characterize contours in the $(\lambda, \phi_s)$-plane at any achievable risk. As a consequence, we prove that the risk of the optimal full ridgeless ensemble (fitted on all possible subsamples) matches that of the optimal ridge predictor. In addition, we prove strong uniform consistency of generalized cross-validation (GCV) over the subsample sizes for estimating the prediction risk of ridge ensembles. This allows for GCV-based tuning of full ridgeless ensembles without sample splitting and yields a predictor whose risk matches optimal ridge risk.
APA
Du, J., Patil, P. & Kuchibhotla, A.K. (2023). Subsample Ridge Ensembles: Equivalences and Generalized Cross-Validation. Proceedings of the 40th International Conference on Machine Learning, in Proceedings of Machine Learning Research 202:8585-8631. Available from https://proceedings.mlr.press/v202/du23d.html.