Optimal Errors and Phase Transitions in High-Dimensional Generalized Linear Models
Proceedings of the 31st Conference On Learning Theory, PMLR 75:728-731, 2018.
Abstract
Generalized linear models (GLMs) arise in high-dimensional machine learning, statistics, communications and signal processing. In this paper we analyze GLMs when the data matrix is random, as relevant in problems such as compressed sensing, error-correcting codes or benchmark models in neural networks. We evaluate the mutual information (or "free entropy") from which we deduce the Bayes-optimal inference and generalization errors. Our analysis applies to the high-dimensional limit where both the number of samples and dimensions are large and their ratio is fixed. Non-rigorous predictions for the optimal inference and generalization errors existed for special cases of GLMs, e.g. for the perceptron in the field of statistical physics based on the so-called replica method. Our present paper rigorously establishes those decades-old conjectures and brings forward their algorithmic interpretation in terms of performance of the generalized approximate message-passing algorithm. Furthermore, we tightly characterize, for many learning problems, regions of parameters for which this algorithm achieves the optimal performance, and locate the associated sharp phase transitions separating learnable and non-learnable regions.
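To make the algorithmic side of the abstract concrete, below is a minimal NumPy sketch of generalized approximate message-passing (GAMP) iterations for the simplest GLM instance: a Gaussian channel y = Ax + noise with a standard Gaussian prior on the signal. All dimensions, the noise level, and the variable names are illustrative assumptions of this sketch, not taken from the paper's experiments; other GLMs plug in different channel and prior denoisers.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative problem instance (assumed, not from the paper):
# y = A x + noise, Gaussian channel; signal prior x_i ~ N(0, 1).
n, m, delta = 200, 600, 1e-3                     # dimensions, noise variance
A = rng.normal(0.0, 1.0 / np.sqrt(n), size=(m, n))
x_true = rng.normal(size=n)
y = A @ x_true + rng.normal(0.0, np.sqrt(delta), size=m)

A2 = A ** 2                                      # entrywise square, for variances

# GAMP state: posterior mean/variance estimates and the channel message.
xhat = np.zeros(n)
vx = np.ones(n)
s = np.zeros(m)

for _ in range(50):
    # Output (channel) step with the Onsager correction term vp * s.
    vp = A2 @ vx
    p = A @ xhat - vp * s
    # g_out for the Gaussian channel y = z + N(0, delta).
    s = (y - p) / (delta + vp)
    vs = 1.0 / (delta + vp)
    # Input (prior) step.
    vr = 1.0 / (A2.T @ vs)
    r = xhat + vr * (A.T @ s)
    # Posterior-mean denoiser for the N(0, 1) prior.
    xhat = r / (1.0 + vr)
    vx = vr / (1.0 + vr)

mse = np.mean((xhat - x_true) ** 2)
print(f"reconstruction MSE: {mse:.2e}")
```

With the sampling ratio m/n = 3 and small noise, the iterations settle quickly and the reconstruction error is of the order of the noise variance; the phase transitions discussed in the abstract appear when the ratio or the channel makes the problem hard.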