Multi-Modal Natural Intelligence through Active Predictive Coding

Jeffrey Duan, Vishwas Sathish, Crimson Stambaugh, Rajesh P. N. Rao
Proceedings of the First Workshop on NeuroAI Multimodal Intelligence @ AAAI 2026, PMLR 308:18-26, 2026.

Abstract

Active predictive coding (APC) is a recently proposed theory of the neocortex that postulates that a canonical sensory-motor processing circuit is replicated across cortical areas. These areas are organized in a rough hierarchy, with higher-level neural states modulating lower-level circuits implementing state-transition dynamics and policy functions. Such a structure enables the network to learn the compositional structure of the world, allowing it to rapidly compose solutions to new problems and generalize quickly to new environments. In APC, complex state transition dynamics are modeled as a sequence of simpler dynamics, which in turn are modeled using even simpler dynamics, and so on. Complex policies are similarly modeled as sequences of simpler policies, with the lowest level comprising sequences of primitive actions. Here we show that the APC model offers a unifying framework for multi-modal intelligence by demonstrating that the same architecture can (a) perform visual object recognition via active sensing (eye movements) and parts-based understanding, (b) navigate to desired goal locations in a complex environment through hierarchical planning, (c) learn to parse language hierarchically, infer the goal (i.e., intent) of an uttered sentence, and achieve the inferred goal through actions, and (d) scale up to realistic environments. Our results suggest that neurally-inspired approaches such as APC can help pave the way for more interpretable, generalizable, efficient, and human-like multi-modal AI.

Cite this Paper


BibTeX
@InProceedings{pmlr-v308-duan26a,
  title = {Multi-Modal Natural Intelligence through Active Predictive Coding},
  author = {Duan, Jeffrey and Sathish, Vishwas and Stambaugh, Crimson and Rao, Rajesh P. N.},
  booktitle = {Proceedings of the First Workshop on NeuroAI Multimodal Intelligence @ AAAI 2026},
  pages = {18--26},
  year = {2026},
  editor = {Abbasi-Asl, Reza and Iqbal, Asim and Ito, Shinya and Arkhipov, Anton and Sanborn, Sophia},
  volume = {308},
  series = {Proceedings of Machine Learning Research},
  month = {27 Jan},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v308/main/assets/duan26a/duan26a.pdf},
  url = {https://proceedings.mlr.press/v308/duan26a.html},
  abstract = {Active predictive coding (APC) is a recently proposed theory of the neocortex that postulates that a canonical sensory-motor processing circuit is replicated across cortical areas. These areas are organized in a rough hierarchy, with higher-level neural states modulating lower-level circuits implementing state-transition dynamics and policy functions. Such a structure enables the network to learn the compositional structure of the world, allowing it to rapidly compose solutions to new problems and generalize quickly to new environments. In APC, complex state transition dynamics are modeled as a sequence of simpler dynamics, which in turn are modeled using even simpler dynamics, and so on. Complex policies are similarly modeled as sequences of simpler policies, with the lowest level comprising sequences of primitive actions. Here we show that the APC model offers a unifying framework for multi-modal intelligence by demonstrating that the same architecture can (a) perform visual object recognition via active sensing (eye movements) and parts-based understanding, (b) navigate to desired goal locations in a complex environment through hierarchical planning, (c) learn to parse language hierarchically, infer the goal (i.e., intent) of an uttered sentence, and achieve the inferred goal through actions, and (d) scale up to realistic environments. Our results suggest that neurally-inspired approaches such as APC can help pave the way for more interpretable, generalizable, efficient, and human-like multi-modal AI.}
}
Endnote
%0 Conference Paper
%T Multi-Modal Natural Intelligence through Active Predictive Coding
%A Jeffrey Duan
%A Vishwas Sathish
%A Crimson Stambaugh
%A Rajesh P. N. Rao
%B Proceedings of the First Workshop on NeuroAI Multimodal Intelligence @ AAAI 2026
%C Proceedings of Machine Learning Research
%D 2026
%E Reza Abbasi-Asl
%E Asim Iqbal
%E Shinya Ito
%E Anton Arkhipov
%E Sophia Sanborn
%F pmlr-v308-duan26a
%I PMLR
%P 18--26
%U https://proceedings.mlr.press/v308/duan26a.html
%V 308
%X Active predictive coding (APC) is a recently proposed theory of the neocortex that postulates that a canonical sensory-motor processing circuit is replicated across cortical areas. These areas are organized in a rough hierarchy, with higher-level neural states modulating lower-level circuits implementing state-transition dynamics and policy functions. Such a structure enables the network to learn the compositional structure of the world, allowing it to rapidly compose solutions to new problems and generalize quickly to new environments. In APC, complex state transition dynamics are modeled as a sequence of simpler dynamics, which in turn are modeled using even simpler dynamics, and so on. Complex policies are similarly modeled as sequences of simpler policies, with the lowest level comprising sequences of primitive actions. Here we show that the APC model offers a unifying framework for multi-modal intelligence by demonstrating that the same architecture can (a) perform visual object recognition via active sensing (eye movements) and parts-based understanding, (b) navigate to desired goal locations in a complex environment through hierarchical planning, (c) learn to parse language hierarchically, infer the goal (i.e., intent) of an uttered sentence, and achieve the inferred goal through actions, and (d) scale up to realistic environments. Our results suggest that neurally-inspired approaches such as APC can help pave the way for more interpretable, generalizable, efficient, and human-like multi-modal AI.
APA
Duan, J., Sathish, V., Stambaugh, C. & Rao, R.P.N. (2026). Multi-Modal Natural Intelligence through Active Predictive Coding. Proceedings of the First Workshop on NeuroAI Multimodal Intelligence @ AAAI 2026, in Proceedings of Machine Learning Research 308:18-26. Available from https://proceedings.mlr.press/v308/duan26a.html.
