Optimal Errors and Phase Transitions in High-Dimensional Generalized Linear Models

Jean Barbier, Florent Krzakala, Nicolas Macris, Léo Miolane, Lenka Zdeborová
Proceedings of the 31st Conference On Learning Theory, PMLR 75:728-731, 2018.

Abstract

Generalized linear models (GLMs) arise in high-dimensional machine learning, statistics, communications and signal processing. In this paper we analyze GLMs when the data matrix is random, as relevant in problems such as compressed sensing, error-correcting codes or benchmark models in neural networks. We evaluate the mutual information (or “free entropy”) from which we deduce the Bayes-optimal inference and generalization errors. Our analysis applies to the high-dimensional limit where both the number of samples and dimensions are large and their ratio is fixed. Non-rigorous predictions for the optimal inference and generalization errors existed for special cases of GLMs, e.g., for the perceptron, obtained in statistical physics using the so-called replica method. Our present paper rigorously establishes those decades-old conjectures and brings forward their algorithmic interpretation in terms of the performance of the generalized approximate message-passing algorithm. Furthermore, we tightly characterize, for many learning problems, the regions of parameters for which this algorithm achieves the optimal performance, and locate the associated sharp phase transitions separating learnable and non-learnable regions.
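
For readers unfamiliar with the setting, the following sketch spells out the GLM observation model and the high-dimensional limit the abstract refers to; the notation is representative of this line of work and is assumed here rather than quoted from the paper. An unknown signal $X^* = (X_1^*, \dots, X_n^*)$ with i.i.d. prior entries is observed through $m$ measurements taken with a random data matrix $\Phi \in \mathbb{R}^{m \times n}$ of i.i.d. entries, passed through an output channel $P_{\rm out}$:

$$ Y_\mu \sim P_{\rm out}\Big( \cdot \;\Big|\; \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \Phi_{\mu i} X_i^* \Big), \qquad \mu = 1, \dots, m. $$

The channel encodes the (possibly nonlinear, possibly noisy) measurement process; for instance, $P_{\rm out}(y \mid z) = \delta\big(y - \mathrm{sign}(z)\big)$ recovers the perceptron. The high-dimensional limit takes $m, n \to \infty$ with the sampling ratio $\alpha = m/n$ held fixed, and the central quantity is the limit of the mutual information (free entropy) per variable, $\frac{1}{n} I(X^*; Y)$, from which the Bayes-optimal inference and generalization errors follow.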

Cite this Paper


BibTeX
@InProceedings{pmlr-v75-barbier18a,
  title     = {Optimal Errors and Phase Transitions in High-Dimensional Generalized Linear Models},
  author    = {Barbier, Jean and Krzakala, Florent and Macris, Nicolas and Miolane, L{\'e}o and Zdeborov{\'a}, Lenka},
  booktitle = {Proceedings of the 31st Conference On Learning Theory},
  pages     = {728--731},
  year      = {2018},
  editor    = {Bubeck, S{\'e}bastien and Perchet, Vianney and Rigollet, Philippe},
  volume    = {75},
  series    = {Proceedings of Machine Learning Research},
  month     = {06--09 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v75/barbier18a/barbier18a.pdf},
  url       = {https://proceedings.mlr.press/v75/barbier18a.html},
  abstract  = {Generalized linear models (GLMs) arise in high-dimensional machine learning, statistics, communications and signal processing. In this paper we analyze GLMs when the data matrix is random, as relevant in problems such as compressed sensing, error-correcting codes or benchmark models in neural networks. We evaluate the mutual information (or ``free entropy'') from which we deduce the Bayes-optimal inference and generalization errors. Our analysis applies to the high-dimensional limit where both the number of samples and dimensions are large and their ratio is fixed. Non-rigorous predictions for the optimal inference and generalization errors existed for special cases of GLMs, e.g., for the perceptron, obtained in statistical physics using the so-called replica method. Our present paper rigorously establishes those decades-old conjectures and brings forward their algorithmic interpretation in terms of the performance of the generalized approximate message-passing algorithm. Furthermore, we tightly characterize, for many learning problems, the regions of parameters for which this algorithm achieves the optimal performance, and locate the associated sharp phase transitions separating learnable and non-learnable regions.}
}
Endnote
%0 Conference Paper
%T Optimal Errors and Phase Transitions in High-Dimensional Generalized Linear Models
%A Jean Barbier
%A Florent Krzakala
%A Nicolas Macris
%A Léo Miolane
%A Lenka Zdeborová
%B Proceedings of the 31st Conference On Learning Theory
%C Proceedings of Machine Learning Research
%D 2018
%E Sébastien Bubeck
%E Vianney Perchet
%E Philippe Rigollet
%F pmlr-v75-barbier18a
%I PMLR
%P 728--731
%U https://proceedings.mlr.press/v75/barbier18a.html
%V 75
%X Generalized linear models (GLMs) arise in high-dimensional machine learning, statistics, communications and signal processing. In this paper we analyze GLMs when the data matrix is random, as relevant in problems such as compressed sensing, error-correcting codes or benchmark models in neural networks. We evaluate the mutual information (or “free entropy”) from which we deduce the Bayes-optimal inference and generalization errors. Our analysis applies to the high-dimensional limit where both the number of samples and dimensions are large and their ratio is fixed. Non-rigorous predictions for the optimal inference and generalization errors existed for special cases of GLMs, e.g., for the perceptron, obtained in statistical physics using the so-called replica method. Our present paper rigorously establishes those decades-old conjectures and brings forward their algorithmic interpretation in terms of the performance of the generalized approximate message-passing algorithm. Furthermore, we tightly characterize, for many learning problems, the regions of parameters for which this algorithm achieves the optimal performance, and locate the associated sharp phase transitions separating learnable and non-learnable regions.
APA
Barbier, J., Krzakala, F., Macris, N., Miolane, L. & Zdeborová, L. (2018). Optimal Errors and Phase Transitions in High-Dimensional Generalized Linear Models. Proceedings of the 31st Conference On Learning Theory, in Proceedings of Machine Learning Research 75:728-731. Available from https://proceedings.mlr.press/v75/barbier18a.html.