Model Consistency for Learning with Mirror-Stratifiable Regularizers

Jalal Fadili, Guillaume Garrigos, Jérôme Malick, Gabriel Peyré
Proceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics, PMLR 89:1236-1244, 2019.

Abstract

Low-complexity non-smooth convex regularizers are routinely used to impose some structure (such as sparsity or low rank) on the coefficients of linear predictors in supervised learning. Model consistency then consists in selecting the correct structure (for instance, support or rank) by regularized empirical risk minimization. It is known that model consistency holds under appropriate non-degeneracy conditions. However, such conditions typically fail for highly correlated designs, and regularization methods are then observed to select larger models. In this work, we provide the theoretical underpinning of this behavior using the notion of mirror-stratifiable regularizers. This class encompasses the most well-known regularizers in the literature, including the L1 and trace norms. It brings into play a pair of primal-dual models, which in turn allows one to locate the structure of the solution using a specific dual certificate. We show how this analysis applies not only to optimal solutions of the learning problem, but also to the iterates computed by a certain class of stochastic proximal-gradient algorithms.
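To make the setting of the abstract concrete, here is a minimal sketch (not the authors' code) of L1-regularized empirical risk minimization solved by proximal gradient descent; the support of the solution is the "model" whose consistency the paper studies. All names, parameter values, and the synthetic data below are illustrative assumptions.

```python
# Minimal sketch, assuming a least-squares loss with an l1 regularizer:
#   min_x  (1/2n) ||A x - y||^2 + lam * ||x||_1
# solved by proximal gradient descent (ISTA). The soft-thresholding step is
# the proximal operator of the l1 norm; the sparsity pattern of the iterates
# is the "model" (support) discussed in the abstract.
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (soft-thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista(A, y, lam, step, n_iter=500):
    """Proximal gradient descent for (1/2n)||A x - y||^2 + lam * ||x||_1."""
    n, p = A.shape
    x = np.zeros(p)
    for _ in range(n_iter):
        grad = A.T @ (A @ x - y) / n                      # gradient of the smooth part
        x = soft_threshold(x - step * grad, step * lam)   # prox step on the l1 part
    return x

rng = np.random.default_rng(0)
n, p = 50, 100
A = rng.standard_normal((n, p))
x_true = np.zeros(p)
x_true[:5] = 1.0                                  # sparse ground truth, support {0,...,4}
y = A @ x_true + 0.01 * rng.standard_normal(n)

L = np.linalg.norm(A, 2) ** 2 / n                 # Lipschitz constant of the gradient
x_hat = ista(A, y, lam=0.05, step=1.0 / L)
print("recovered support:", np.flatnonzero(np.abs(x_hat) > 1e-8))
```

Under a well-conditioned design such as this one the recovered support matches the true one; with highly correlated columns of A, the non-degeneracy conditions mentioned in the abstract fail and the selected support is typically larger.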

Cite this Paper

BibTeX
@InProceedings{pmlr-v89-fadili19a,
  title     = {Model Consistency for Learning with Mirror-Stratifiable Regularizers},
  author    = {Fadili, Jalal and Garrigos, Guillaume and Malick, J\'{e}r\^{o}me and Peyr\'{e}, Gabriel},
  booktitle = {Proceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics},
  pages     = {1236--1244},
  year      = {2019},
  editor    = {Chaudhuri, Kamalika and Sugiyama, Masashi},
  volume    = {89},
  series    = {Proceedings of Machine Learning Research},
  month     = {16--18 Apr},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v89/fadili19a/fadili19a.pdf},
  url       = {https://proceedings.mlr.press/v89/fadili19a.html}
}
Endnote
%0 Conference Paper
%T Model Consistency for Learning with Mirror-Stratifiable Regularizers
%A Jalal Fadili
%A Guillaume Garrigos
%A Jérôme Malick
%A Gabriel Peyré
%B Proceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2019
%E Kamalika Chaudhuri
%E Masashi Sugiyama
%F pmlr-v89-fadili19a
%I PMLR
%P 1236--1244
%U https://proceedings.mlr.press/v89/fadili19a.html
%V 89
APA
Fadili, J., Garrigos, G., Malick, J. & Peyré, G. (2019). Model Consistency for Learning with Mirror-Stratifiable Regularizers. Proceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 89:1236-1244. Available from https://proceedings.mlr.press/v89/fadili19a.html.