Functions with average smoothness: structure, algorithms, and learning

Yair Ashlagi, Lee-Ad Gottlieb, Aryeh Kontorovich
Proceedings of Thirty Fourth Conference on Learning Theory, PMLR 134:186-236, 2021.

Abstract

We initiate a program of average smoothness analysis for efficiently learning real-valued functions on metric spaces. Rather than using the Lipschitz constant as the regularizer, we define a local slope at each point and gauge the function complexity as the average of these values. Since the mean can be dramatically smaller than the maximum, this complexity measure can yield considerably sharper generalization bounds, assuming that these admit a refinement where the Lipschitz constant is replaced by our average of local slopes. In addition to the usual average, we also examine a “weak” average that is more forgiving and yields a much wider function class. Our first major contribution is to obtain just such distribution-sensitive bounds. This required overcoming a number of technical challenges, perhaps the most formidable of which was bounding the empirical covering numbers, which can be much worse-behaved than the ambient ones. Our combinatorial results are accompanied by efficient algorithms for smoothing the labels of the random sample, as well as guarantees that the extension from the sample to the whole space will continue to be, with high probability, smooth on average. Along the way we discover a surprisingly rich combinatorial and analytic structure in the function class we define.
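To make the complexity measure concrete, the following sketch (in our own notation; the paper's precise definitions, in particular of the local slope and the “weak” average, may differ) contrasts the worst-case Lipschitz constant with an average of local slopes under the sampling distribution. Here (X, rho) is a metric space, mu a probability measure on X, and f a real-valued function on X.

% Illustrative only: our notation, not necessarily the paper's exact definitions.
% Local slope of f at a point x:
\[
  \Lambda_f(x) \;=\; \sup_{y \neq x} \frac{|f(x) - f(y)|}{\rho(x, y)} .
\]
% Worst-case versus average-case complexity: the Lipschitz constant is the
% supremum of the local slopes, while the average-smoothness quantity takes
% their mean under \mu and is therefore never larger.
\[
  \|f\|_{\mathrm{Lip}} \;=\; \sup_{x \in \mathcal{X}} \Lambda_f(x),
  \qquad
  \overline{\Lambda}_f \;=\; \mathbb{E}_{x \sim \mu}\!\big[\Lambda_f(x)\big]
  \;\le\; \|f\|_{\mathrm{Lip}} .
\]
% One natural "weak" analogue, in the spirit of the weak L^1 quasi-norm
% (we do not claim this is the paper's exact choice): by Markov's inequality
% it is bounded by the strong average above, so it admits an even wider class.
\[
  \widetilde{\Lambda}_f \;=\; \sup_{s > 0} \, s \,\mu\{x : \Lambda_f(x) > s\}
  \;\le\; \overline{\Lambda}_f .
\]

In this illustration, a function whose steep regions occupy only a small mu-measure set can have an arbitrarily large Lipschitz constant while the averaged quantity stays bounded, which is the sense in which the mean can be dramatically smaller than the maximum.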

Cite this Paper


BibTeX
@InProceedings{pmlr-v134-ashlagi21a,
  title     = {Functions with average smoothness: structure, algorithms, and learning},
  author    = {Ashlagi, Yair and Gottlieb, Lee-Ad and Kontorovich, Aryeh},
  booktitle = {Proceedings of Thirty Fourth Conference on Learning Theory},
  pages     = {186--236},
  year      = {2021},
  editor    = {Belkin, Mikhail and Kpotufe, Samory},
  volume    = {134},
  series    = {Proceedings of Machine Learning Research},
  month     = {15--19 Aug},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v134/ashlagi21a/ashlagi21a.pdf},
  url       = {https://proceedings.mlr.press/v134/ashlagi21a.html}
}
Endnote
%0 Conference Paper
%T Functions with average smoothness: structure, algorithms, and learning
%A Yair Ashlagi
%A Lee-Ad Gottlieb
%A Aryeh Kontorovich
%B Proceedings of Thirty Fourth Conference on Learning Theory
%C Proceedings of Machine Learning Research
%D 2021
%E Mikhail Belkin
%E Samory Kpotufe
%F pmlr-v134-ashlagi21a
%I PMLR
%P 186--236
%U https://proceedings.mlr.press/v134/ashlagi21a.html
%V 134
APA
Ashlagi, Y., Gottlieb, L., & Kontorovich, A. (2021). Functions with average smoothness: structure, algorithms, and learning. Proceedings of Thirty Fourth Conference on Learning Theory, in Proceedings of Machine Learning Research 134:186-236. Available from https://proceedings.mlr.press/v134/ashlagi21a.html.
