Distributionally Robust Groupwise Regularization Estimator

Jose Blanchet, Yang Kang
Proceedings of the Ninth Asian Conference on Machine Learning, PMLR 77:97-112, 2017.

Abstract

Regularized estimators in the context of group variables have been applied successfully in model and feature selection to preserve interpretability. We formulate a Distributionally Robust Optimization (DRO) problem which recovers popular estimators, such as the Group Square Root Lasso (GSRL). Our DRO formulation allows us to interpret GSRL as a game, in which we learn a regression parameter while an adversary chooses a perturbation of the data. We wish to pick the parameter that minimizes the expected loss under any plausible model chosen by the adversary, who in turn wishes to increase the expected loss. The regularization parameter turns out to be precisely determined by the amount of perturbation of the training data allowed to the adversary. In this paper, we introduce a data-driven (statistical) criterion for the optimal choice of regularization, which we evaluate asymptotically, in closed form, as the size of the training set increases. Our easy-to-evaluate regularization formula is compared against cross-validation, showing comparable performance.
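
To make the game described in the abstract concrete, here is a minimal LaTeX sketch of the DRO formulation. The notation is assumed for illustration, not taken verbatim from the paper body: $P_n$ denotes the empirical distribution of the training data, $D_c$ an optimal-transport discrepancy with cost $c$, and $\delta$ the adversary's perturbation budget. The learner chooses the regression parameter $\beta$ while the adversary perturbs $P_n$ within budget $\delta$:

% Sketch only: the symbols D_c, P_n, \delta and the exact cost are assumptions.
\[
  \min_{\beta}\; \sup_{P \,:\, D_c(P,\, P_n) \le \delta}
    \sqrt{\mathbb{E}_{P}\big[(Y - X^{\top}\beta)^{2}\big]}.
\]
% For a suitable transport cost c, the worst-case objective reduces to an
% empirical objective of GSRL form, with the regularization level pinned
% down by the adversary's budget:
\[
  \min_{\beta}\;
    \sqrt{\mathbb{E}_{P_n}\big[(Y - X^{\top}\beta)^{2}\big]}
    \;+\; \sqrt{\delta}\, \sum_{g} \lVert \beta_{g} \rVert_{2}.
\]

Under this reading, choosing the regularization parameter is equivalent to choosing the budget $\delta$, which is what the paper's data-driven criterion targets.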

Cite this Paper

BibTeX
@InProceedings{pmlr-v77-blanchet17a,
  title     = {Distributionally Robust Groupwise Regularization Estimator},
  author    = {Blanchet, Jose and Kang, Yang},
  booktitle = {Proceedings of the Ninth Asian Conference on Machine Learning},
  pages     = {97--112},
  year      = {2017},
  editor    = {Zhang, Min-Ling and Noh, Yung-Kyun},
  volume    = {77},
  series    = {Proceedings of Machine Learning Research},
  address   = {Yonsei University, Seoul, Republic of Korea},
  month     = {15--17 Nov},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v77/blanchet17a/blanchet17a.pdf},
  url       = {https://proceedings.mlr.press/v77/blanchet17a.html}
}
Endnote
%0 Conference Paper
%T Distributionally Robust Groupwise Regularization Estimator
%A Jose Blanchet
%A Yang Kang
%B Proceedings of the Ninth Asian Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2017
%E Min-Ling Zhang
%E Yung-Kyun Noh
%F pmlr-v77-blanchet17a
%I PMLR
%P 97--112
%U https://proceedings.mlr.press/v77/blanchet17a.html
%V 77
APA
Blanchet, J. & Kang, Y. (2017). Distributionally Robust Groupwise Regularization Estimator. Proceedings of the Ninth Asian Conference on Machine Learning, in Proceedings of Machine Learning Research 77:97-112. Available from https://proceedings.mlr.press/v77/blanchet17a.html.