Thinking of Neural Networks Like a Physicist: The Statistical Physics of Machine Learning

Kai Jappe Sandbrink, Stefano Sarao Mannelli, Florent Krzakala
Proceedings of the Analytical Connectionism Schools 2023--2024, PMLR 320:15-41, 2026.

Abstract

Machine learning (ML) enables us to uncover patterns from data and generalize this information to new, unseen examples. The rapid development of the field has transformed not only classical computer science domains—such as computer vision, natural language processing, and speech recognition—but has also begun to reshape scientific research more broadly, including psychology and neuroscience. This paper presents a pedagogical introduction to an emerging line of research that seeks to interpret ML systems by “thinking like a physicist”, as presented by Florent Krzakala at Analytical Connectionism 2023. In particular, the methods and intuition of statistical physics—which has a long history of studying complex systems—can be fruitfully applied to high-dimensional problems encountered in ML. First, the paper presents applications of statistical physics techniques to unsupervised machine learning, in which patterns are found in data without any supervisory signal. The replica method – an important approximation that allows the expected value of the logarithm of the problem’s likelihood ratio to be computed efficiently – greatly facilitates the analysis of classic unsupervised learning problems such as sparse signal denoising and clustering. The approximate message passing algorithm provides an iterative approach to solving these types of problems. Second, the paper turns to the supervised learning setting, in which ground-truth training labels are used to train a learning algorithm. It characterizes the learning dynamics of neural networks with a single hidden layer across two regimes. In the lazy learning regime, learning occurs only in the readout layer of the neural network, with fixed embedding weights. With an infinitely wide hidden layer, this corresponds to the neural tangent kernel regime, in which the network behaves linearly over its features and can be used to characterize the possible solutions to the learning problem. Meanwhile, in the feature learning regime, learning occurs in all weights, including the embeddings. The paper ends with a brief discussion of current research going beyond single-sample stochastic gradient descent, and a brief introduction to applications of the concepts outlined in this paper to cognitive psychology.

Cite this Paper


BibTeX
@InProceedings{pmlr-v320-sandbrink26a,
  title = {Thinking of Neural Networks Like a Physicist: The Statistical Physics of Machine Learning},
  author = {Sandbrink, Kai Jappe and Mannelli, Stefano Sarao and Krzakala, Florent},
  booktitle = {Proceedings of the Analytical Connectionism Schools 2023--2024},
  pages = {15--41},
  year = {2026},
  editor = {Sarao Mannelli, Stefano and Mignacco, Francesca and Chou, Chi-Ning and Chung, SueYeon and Saxe, Andrew},
  volume = {320},
  series = {Proceedings of Machine Learning Research},
  month = {01 Jan--31 Dec},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v320/main/assets/sandbrink26a/sandbrink26a.pdf},
  url = {https://proceedings.mlr.press/v320/sandbrink26a.html},
  abstract = {Machine learning (ML) enables us to uncover patterns from data and generalize this information to new, unseen examples. The rapid development of the field has transformed not only classical computer science domains—such as computer vision, natural language processing, and speech recognition—but has also begun to reshape scientific research more broadly, including psychology and neuroscience. This paper presents a pedagogical introduction to an emerging line of research that seeks to interpret ML systems by “thinking like a physicist”, as presented by Florent Krzakala at Analytical Connectionism 2023. In particular, the methods and intuition of statistical physics—which has a long history of studying complex systems—can be fruitfully applied to high-dimensional problems encountered in ML. First, the paper presents applications of statistical physics techniques to unsupervised machine learning, in which patterns are found in data without any supervisory signal. The replica method – an important approximation that allows the expected value of the logarithm of the problem’s likelihood ratio to be computed efficiently – greatly facilitates the analysis of classic unsupervised learning problems such as sparse signal denoising and clustering. The approximate message passing algorithm provides an iterative approach to solving these types of problems. Second, the paper turns to the supervised learning setting, in which ground-truth training labels are used to train a learning algorithm. It characterizes the learning dynamics of neural networks with a single hidden layer across two regimes. In the lazy learning regime, learning occurs only in the readout layer of the neural network, with fixed embedding weights. With an infinitely wide hidden layer, this corresponds to the neural tangent kernel regime, in which the network behaves linearly over its features and can be used to characterize the possible solutions to the learning problem. Meanwhile, in the feature learning regime, learning occurs in all weights, including the embeddings. The paper ends with a brief discussion of current research going beyond single-sample stochastic gradient descent, and a brief introduction to applications of the concepts outlined in this paper to cognitive psychology.}
}
Endnote
%0 Conference Paper
%T Thinking of Neural Networks Like a Physicist: The Statistical Physics of Machine Learning
%A Kai Jappe Sandbrink
%A Stefano Sarao Mannelli
%A Florent Krzakala
%B Proceedings of the Analytical Connectionism Schools 2023--2024
%C Proceedings of Machine Learning Research
%D 2026
%E Stefano Sarao Mannelli
%E Francesca Mignacco
%E Chi-Ning Chou
%E SueYeon Chung
%E Andrew Saxe
%F pmlr-v320-sandbrink26a
%I PMLR
%P 15--41
%U https://proceedings.mlr.press/v320/sandbrink26a.html
%V 320
%X Machine learning (ML) enables us to uncover patterns from data and generalize this information to new, unseen examples. The rapid development of the field has transformed not only classical computer science domains—such as computer vision, natural language processing, and speech recognition—but has also begun to reshape scientific research more broadly, including psychology and neuroscience. This paper presents a pedagogical introduction to an emerging line of research that seeks to interpret ML systems by “thinking like a physicist”, as presented by Florent Krzakala at Analytical Connectionism 2023. In particular, the methods and intuition of statistical physics—which has a long history of studying complex systems—can be fruitfully applied to high-dimensional problems encountered in ML. First, the paper presents applications of statistical physics techniques to unsupervised machine learning, in which patterns are found in data without any supervisory signal. The replica method – an important approximation that allows the expected value of the logarithm of the problem’s likelihood ratio to be computed efficiently – greatly facilitates the analysis of classic unsupervised learning problems such as sparse signal denoising and clustering. The approximate message passing algorithm provides an iterative approach to solving these types of problems. Second, the paper turns to the supervised learning setting, in which ground-truth training labels are used to train a learning algorithm. It characterizes the learning dynamics of neural networks with a single hidden layer across two regimes. In the lazy learning regime, learning occurs only in the readout layer of the neural network, with fixed embedding weights. With an infinitely wide hidden layer, this corresponds to the neural tangent kernel regime, in which the network behaves linearly over its features and can be used to characterize the possible solutions to the learning problem. Meanwhile, in the feature learning regime, learning occurs in all weights, including the embeddings. The paper ends with a brief discussion of current research going beyond single-sample stochastic gradient descent, and a brief introduction to applications of the concepts outlined in this paper to cognitive psychology.
APA
Sandbrink, K.J., Mannelli, S.S. & Krzakala, F. (2026). Thinking of Neural Networks Like a Physicist: The Statistical Physics of Machine Learning. Proceedings of the Analytical Connectionism Schools 2023--2024, in Proceedings of Machine Learning Research 320:15-41. Available from https://proceedings.mlr.press/v320/sandbrink26a.html.