Concentration of Non-Isotropic Random Tensors with Applications to Learning and Empirical Risk Minimization

Mathieu Even, Laurent Massoulie
Proceedings of Thirty Fourth Conference on Learning Theory, PMLR 134:1847-1886, 2021.

Abstract

Dimension is an inherent bottleneck to some modern learning tasks, where optimization methods suffer from the size of the data. In this paper, we study non-isotropic distributions of data and develop tools that aim at reducing these dimensional costs by a dependency on an effective dimension rather than the ambient one. Based on non-asymptotic estimates of the metric entropy of ellipsoids (which we show generalize to infinite dimensions) and on a chaining argument, our uniform concentration bounds involve an effective dimension instead of the global dimension, improving over existing results. We show the importance of taking advantage of non-isotropic properties in learning problems with the following applications: i) we improve state-of-the-art results in statistical preconditioning for communication-efficient distributed optimization, ii) we introduce a non-isotropic randomized smoothing for non-smooth optimization. Both applications cover a class of functions that encompasses empirical risk minimization (ERM) for linear models.
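To make the abstract's central quantity concrete, the sketch below (not taken from the paper) computes one standard notion of effective dimension, d_eff = tr(Sigma)/||Sigma||_op, for a covariance with fast eigenvalue decay, and contrasts isotropic randomized smoothing with a non-isotropic variant that draws perturbations from N(0, mu^2 Sigma). The objective f, the eigenvalue decay profile, and the helper names are illustrative assumptions; the paper's own definitions may differ.

import numpy as np

rng = np.random.default_rng(0)

d = 100
# Non-isotropic covariance with fast eigenvalue decay: most of the variance
# concentrates in a few directions, so tr(Sigma)/||Sigma||_op << d.
eigvals = 1.0 / (1.0 + np.arange(d)) ** 2
Sigma = np.diag(eigvals)

# One common notion of effective dimension (an assumption here; the paper's
# exact definition may differ): the trace normalized by the operator norm.
d_eff = np.trace(Sigma) / np.linalg.norm(Sigma, 2)
print(f"ambient dimension d = {d}, effective dimension ~ {d_eff:.2f}")

def f(x):
    # An illustrative non-smooth objective: the l1 norm, applied row-wise.
    return np.abs(x).sum(axis=-1)

def smoothed_value(x, cov, mu=0.1, n_samples=10_000):
    # Monte Carlo estimate of the smoothed function f_mu(x) = E[f(x + Z)]
    # with Z ~ N(0, mu^2 * cov); cov = Sigma gives non-isotropic smoothing,
    # while cov = np.eye(d) recovers the classical isotropic scheme.
    Z = rng.multivariate_normal(np.zeros(len(x)), (mu ** 2) * cov, size=n_samples)
    return f(x[None, :] + Z).mean()

x = rng.standard_normal(d)
print("isotropic smoothing    :", smoothed_value(x, np.eye(d)))
print("non-isotropic smoothing:", smoothed_value(x, Sigma))

With this decay profile, d_eff is a small constant even though the ambient dimension is 100, which is the regime where the paper's bounds, depending on the effective rather than the global dimension, pay off.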

Cite this Paper


BibTeX
@InProceedings{pmlr-v134-even21a,
  title     = {Concentration of Non-Isotropic Random Tensors with Applications to Learning and Empirical Risk Minimization},
  author    = {Even, Mathieu and Massoulie, Laurent},
  booktitle = {Proceedings of Thirty Fourth Conference on Learning Theory},
  pages     = {1847--1886},
  year      = {2021},
  editor    = {Belkin, Mikhail and Kpotufe, Samory},
  volume    = {134},
  series    = {Proceedings of Machine Learning Research},
  month     = {15--19 Aug},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v134/even21a/even21a.pdf},
  url       = {https://proceedings.mlr.press/v134/even21a.html},
  abstract  = {Dimension is an inherent bottleneck to some modern learning tasks, where optimization methods suffer from the size of the data. In this paper, we study non-isotropic distributions of data and develop tools that aim at reducing these dimensional costs by a dependency on an effective dimension rather than the ambient one. Based on non-asymptotic estimates of the metric entropy of ellipsoids (which we show generalize to infinite dimensions) and on a chaining argument, our uniform concentration bounds involve an effective dimension instead of the global dimension, improving over existing results. We show the importance of taking advantage of non-isotropic properties in learning problems with the following applications: i) we improve state-of-the-art results in statistical preconditioning for communication-efficient distributed optimization, ii) we introduce a non-isotropic randomized smoothing for non-smooth optimization. Both applications cover a class of functions that encompasses empirical risk minimization (ERM) for linear models.}
}
Endnote
%0 Conference Paper
%T Concentration of Non-Isotropic Random Tensors with Applications to Learning and Empirical Risk Minimization
%A Mathieu Even
%A Laurent Massoulie
%B Proceedings of Thirty Fourth Conference on Learning Theory
%C Proceedings of Machine Learning Research
%D 2021
%E Mikhail Belkin
%E Samory Kpotufe
%F pmlr-v134-even21a
%I PMLR
%P 1847--1886
%U https://proceedings.mlr.press/v134/even21a.html
%V 134
%X Dimension is an inherent bottleneck to some modern learning tasks, where optimization methods suffer from the size of the data. In this paper, we study non-isotropic distributions of data and develop tools that aim at reducing these dimensional costs by a dependency on an effective dimension rather than the ambient one. Based on non-asymptotic estimates of the metric entropy of ellipsoids (which we show generalize to infinite dimensions) and on a chaining argument, our uniform concentration bounds involve an effective dimension instead of the global dimension, improving over existing results. We show the importance of taking advantage of non-isotropic properties in learning problems with the following applications: i) we improve state-of-the-art results in statistical preconditioning for communication-efficient distributed optimization, ii) we introduce a non-isotropic randomized smoothing for non-smooth optimization. Both applications cover a class of functions that encompasses empirical risk minimization (ERM) for linear models.
APA
Even, M. & Massoulie, L. (2021). Concentration of Non-Isotropic Random Tensors with Applications to Learning and Empirical Risk Minimization. Proceedings of Thirty Fourth Conference on Learning Theory, in Proceedings of Machine Learning Research 134:1847-1886. Available from https://proceedings.mlr.press/v134/even21a.html.