Optimal Errors and Phase Transitions in High-Dimensional Generalized Linear Models

Jean Barbier; Florent Krzakala; Nicolas Macris; Léo Miolane; Lenka Zdeborová

Optimal Errors and Phase Transitions in High-Dimensional Generalized Linear Models

Jean Barbier, Florent Krzakala, Nicolas Macris, Léo Miolane, Lenka Zdeborová

Proceedings of the 31st Conference On Learning Theory, PMLR 75:728-731, 2018.

Abstract

Generalized linear models (GLMs) arise in high-dimensional machine learning, statistics, communications and signal processing. % In this paper we analyze GLMs when the data matrix is random, as relevant in problems such as compressed sensing, error-correcting codes or benchmarks models in neural networks. % We evaluate the mutual information (or “free entropy”) from which we deduce the Bayes-optimal inference and generalization errors. Our analysis applies to the high-dimensional limit where both the number of samples and dimensions are large and their ratio is fixed. % Non-rigorous predictions for the optimal inference and generalization errors existed for special cases of GLMs, e.g. for the perceptron in the field of statistical physics based on the so-called replica method. Our present paper rigorously establishes those decades old conjectures and brings forward their algorithmic interpretation in terms of performance of the generalized approximate message-passing algorithm. % Furthermore, we tightly characterize, for many learning problems, regions of parameters for which this algorithm achieves the optimal performance, and locate the associated sharp phase transitions separating learnable and non-learnable regions.

Cite this Paper

BibTeX


@InProceedings{pmlr-v75-barbier18a,
  title = 	 {Optimal Errors and Phase Transitions in High-Dimensional Generalized Linear Models},
  author =       {Barbier, Jean and Krzakala, Florent and Macris, Nicolas and Miolane, L{\'e}o and Zdeborov{\'a}, Lenka},
  booktitle = 	 {Proceedings of the 31st  Conference On Learning Theory},
  pages = 	 {728--731},
  year = 	 {2018},
  editor = 	 {Bubeck, Sébastien and Perchet, Vianney and Rigollet, Philippe},
  volume = 	 {75},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {06--09 Jul},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v75/barbier18a/barbier18a.pdf},
  url = 	 {https://proceedings.mlr.press/v75/barbier18a.html},
  abstract = 	 {Generalized linear models (GLMs) arise in high-dimensional machine learning, statistics, communications and signal processing. % In this paper we analyze GLMs when the data matrix is random, as relevant in problems such as compressed sensing, error-correcting codes or benchmarks models in neural networks. % We evaluate the mutual information (or “free entropy”) from which we deduce the Bayes-optimal inference and generalization errors.  Our analysis applies to the high-dimensional limit where both the number of samples and dimensions are large and their ratio is fixed. % Non-rigorous predictions for the optimal inference and generalization errors existed for special cases of GLMs, e.g. for the perceptron in the field of statistical physics based on the so-called replica method. Our present paper rigorously establishes those decades old conjectures and brings forward their algorithmic interpretation in terms of performance of the generalized approximate message-passing algorithm. % Furthermore, we tightly characterize, for many learning problems, regions of parameters for which this algorithm achieves the optimal performance, and locate the associated sharp phase transitions separating learnable and non-learnable regions.}
}

Endnote

%0 Conference Paper
%T Optimal Errors and Phase Transitions in High-Dimensional Generalized Linear Models
%A Jean Barbier
%A Florent Krzakala
%A Nicolas Macris
%A Léo Miolane
%A Lenka Zdeborová
%B Proceedings of the 31st  Conference On Learning Theory
%C Proceedings of Machine Learning Research
%D 2018
%E Sébastien Bubeck
%E Vianney Perchet
%E Philippe Rigollet	
%F pmlr-v75-barbier18a
%I PMLR
%P 728--731
%U https://proceedings.mlr.press/v75/barbier18a.html
%V 75
%X Generalized linear models (GLMs) arise in high-dimensional machine learning, statistics, communications and signal processing. % In this paper we analyze GLMs when the data matrix is random, as relevant in problems such as compressed sensing, error-correcting codes or benchmarks models in neural networks. % We evaluate the mutual information (or “free entropy”) from which we deduce the Bayes-optimal inference and generalization errors.  Our analysis applies to the high-dimensional limit where both the number of samples and dimensions are large and their ratio is fixed. % Non-rigorous predictions for the optimal inference and generalization errors existed for special cases of GLMs, e.g. for the perceptron in the field of statistical physics based on the so-called replica method. Our present paper rigorously establishes those decades old conjectures and brings forward their algorithmic interpretation in terms of performance of the generalized approximate message-passing algorithm. % Furthermore, we tightly characterize, for many learning problems, regions of parameters for which this algorithm achieves the optimal performance, and locate the associated sharp phase transitions separating learnable and non-learnable regions.

APA


Barbier, J., Krzakala, F., Macris, N., Miolane, L. & Zdeborová, L.. (2018). Optimal Errors and Phase Transitions in High-Dimensional Generalized Linear Models. Proceedings of the 31st  Conference On Learning Theory, in Proceedings of Machine Learning Research 75:728-731 Available from https://proceedings.mlr.press/v75/barbier18a.html.

Related Material

Download PDF