Brain–Language Model Alignment: Insights into the Platonic Hypothesis and Intermediate-Layer Advantage

Angela Lopez-Cardona, Sebastian Idesis, Mireia Masias Bruns, Sergi Abadal, Ioannis Arapakis
Proceedings of UniReps: the Third Edition of the Workshop on Unifying Representations in Neural Models, PMLR 322:81-101, 2026.

Abstract

Do brains and language models converge toward the same internal representations of the world? Recent years have seen a rise in studies of neural activations and model alignment. In this work, we review 25 fMRI-based studies published between 2023 and 2025 and explicitly confront their findings with two key hypotheses: (i) the Platonic Representation Hypothesis—that as models scale and improve, they converge to a representation of the real world, and (ii) the Intermediate-Layer Advantage—that intermediate (mid-depth) layers often encode richer, more generalizable features. Our findings provide converging evidence that models and brains may share abstract representational structures, supporting both hypotheses and motivating further research on brain–model alignment.

Cite this Paper


BibTeX
@InProceedings{pmlr-v322-lopez-cardona26a,
  title = {Brain{–}Language Model Alignment: Insights into the Platonic Hypothesis and Intermediate-Layer Advantage},
  author = {Lopez-Cardona, Angela and Idesis, Sebastian and Bruns, Mireia Masias and Abadal, Sergi and Arapakis, Ioannis},
  booktitle = {Proceedings of UniReps: the Third Edition of the Workshop on Unifying Representations in Neural Models},
  pages = {81--101},
  year = {2026},
  editor = {Fumero, Marco and Domine, Clementine and L{\"a}hner, Zorah and Cannistraci, Irene and Zhao, Bo and Williams, Alex},
  volume = {322},
  series = {Proceedings of Machine Learning Research},
  month = {06 Dec},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v322/main/assets/lopez-cardona26a/lopez-cardona26a.pdf},
  url = {https://proceedings.mlr.press/v322/lopez-cardona26a.html},
  abstract = {Do brains and language models converge toward the same internal representations of the world? Recent years have seen a rise in studies of neural activations and model alignment. In this work, we review 25 fMRI-based studies published between 2023 and 2025 and explicitly confront their findings with two key hypotheses: (i) the Platonic Representation Hypothesis—that as models scale and improve, they converge to a representation of the real world, and (ii) the Intermediate-Layer Advantage—that intermediate (mid-depth) layers often encode richer, more generalizable features. Our findings provide converging evidence that models and brains may share abstract representational structures, supporting both hypotheses and motivating further research on brain–model alignment.}
}
Endnote
%0 Conference Paper
%T Brain–Language Model Alignment: Insights into the Platonic Hypothesis and Intermediate-Layer Advantage
%A Angela Lopez-Cardona
%A Sebastian Idesis
%A Mireia Masias Bruns
%A Sergi Abadal
%A Ioannis Arapakis
%B Proceedings of UniReps: the Third Edition of the Workshop on Unifying Representations in Neural Models
%C Proceedings of Machine Learning Research
%D 2026
%E Marco Fumero
%E Clementine Domine
%E Zorah Lähner
%E Irene Cannistraci
%E Bo Zhao
%E Alex Williams
%F pmlr-v322-lopez-cardona26a
%I PMLR
%P 81--101
%U https://proceedings.mlr.press/v322/lopez-cardona26a.html
%V 322
%X Do brains and language models converge toward the same internal representations of the world? Recent years have seen a rise in studies of neural activations and model alignment. In this work, we review 25 fMRI-based studies published between 2023 and 2025 and explicitly confront their findings with two key hypotheses: (i) the Platonic Representation Hypothesis—that as models scale and improve, they converge to a representation of the real world, and (ii) the Intermediate-Layer Advantage—that intermediate (mid-depth) layers often encode richer, more generalizable features. Our findings provide converging evidence that models and brains may share abstract representational structures, supporting both hypotheses and motivating further research on brain–model alignment.
APA
Lopez-Cardona, A., Idesis, S., Bruns, M. M., Abadal, S., & Arapakis, I. (2026). Brain–Language Model Alignment: Insights into the Platonic Hypothesis and Intermediate-Layer Advantage. Proceedings of UniReps: the Third Edition of the Workshop on Unifying Representations in Neural Models, in Proceedings of Machine Learning Research 322:81-101. Available from https://proceedings.mlr.press/v322/lopez-cardona26a.html.
