SDMTR: A Brain-inspired Transformer for Relation Inference

Xiangyu Zeng, Jie Lin, Piao Hu, Zhihao Li, Tianxi Huang
Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, PMLR 238:3259-3267, 2024.

Abstract

Deep learning has seen a movement towards modularity, module coordination and sparse interactions, concepts that mirror the working principles of biological systems. Inspired by Global Workspace Theory and the long-term memory system of the human brain, both of which are instrumental in constructing biologically plausible artificial intelligence systems, we introduce the shared dual-memory Transformer (SDMTR), a model that builds upon the Transformer. The proposed approach couples a shared long-term memory with a shared workspace of finite capacity, into which different specialized modules compete to write information. Crucial information from the shared workspace is then inscribed into long-term memory through an outer-product attention mechanism, reducing information conflict and building a knowledge reservoir that facilitates subsequent inference, learning and problem-solving. We apply SDMTR to multi-modality question-answering and reasoning challenges, including the text-based bAbI-20k, visual Sort-of-CLEVR and triangle relations detection tasks. The results demonstrate that SDMTR significantly outperforms the vanilla Transformer and its recent improvements. Additionally, visualization analyses indicate that the presence of memory positively correlates with model effectiveness on inference tasks. This research provides novel insights and empirical support for advancing biologically plausible deep learning frameworks.
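To make the two memory mechanisms named in the abstract concrete, the following is a minimal, hypothetical PyTorch sketch: (1) specialized modules compete to write into a finite-capacity shared workspace via top-k sparse attention, and (2) the workspace is written into a long-term memory as a sum of outer products. The slot counts, dimensions, top-k competition rule and associative-matrix form are illustrative assumptions, not the authors' exact formulation.

import torch
import torch.nn.functional as F

torch.manual_seed(0)
d, n_tokens, n_slots, topk = 64, 16, 8, 4  # illustrative sizes, not the paper's settings

# Token states produced by specialized modules (batch size 1 for brevity).
tokens = torch.randn(1, n_tokens, d)

# Finite-capacity shared workspace: n_slots slots of width d.
workspace = torch.randn(1, n_slots, d)

# Long-term memory stored as an associative matrix updated by outer products.
long_term = torch.zeros(1, d, d)

# (1) Competitive write into the workspace: slots attend over tokens, but only
# the top-k tokens per slot win the competition and are written.
scores = workspace @ tokens.transpose(1, 2) / d ** 0.5          # (1, slots, tokens)
top_vals, top_idx = scores.topk(topk, dim=-1)
mask = torch.full_like(scores, float("-inf")).scatter(-1, top_idx, top_vals)
attn = F.softmax(mask, dim=-1)                                   # zero outside the top-k
workspace = workspace + attn @ tokens                            # residual write

# (2) Outer-product write into long-term memory: each workspace slot
# contributes key (outer) value, accumulating an associative knowledge store.
keys = F.normalize(workspace, dim=-1)                            # (1, slots, d)
values = workspace
long_term = long_term + keys.transpose(1, 2) @ values            # sum of outer products

# Read-out: a query retrieves from long-term memory with a dot product.
query = F.normalize(tokens[:, :1, :], dim=-1)                    # (1, 1, d)
retrieved = query @ long_term                                    # (1, 1, d)
print(retrieved.shape)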

Cite this Paper


BibTeX
@InProceedings{pmlr-v238-zeng24a,
  title     = {{SDMTR}: A Brain-inspired Transformer for Relation Inference},
  author    = {Zeng, Xiangyu and Lin, Jie and Hu, Piao and Li, Zhihao and Huang, Tianxi},
  booktitle = {Proceedings of The 27th International Conference on Artificial Intelligence and Statistics},
  pages     = {3259--3267},
  year      = {2024},
  editor    = {Dasgupta, Sanjoy and Mandt, Stephan and Li, Yingzhen},
  volume    = {238},
  series    = {Proceedings of Machine Learning Research},
  month     = {02--04 May},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v238/zeng24a/zeng24a.pdf},
  url       = {https://proceedings.mlr.press/v238/zeng24a.html},
  abstract  = {Deep learning has seen a movement towards modularity, module coordination and sparse interactions, concepts that mirror the working principles of biological systems. Inspired by Global Workspace Theory and the long-term memory system of the human brain, both of which are instrumental in constructing biologically plausible artificial intelligence systems, we introduce the shared dual-memory Transformer (SDMTR), a model that builds upon the Transformer. The proposed approach couples a shared long-term memory with a shared workspace of finite capacity, into which different specialized modules compete to write information. Crucial information from the shared workspace is then inscribed into long-term memory through an outer-product attention mechanism, reducing information conflict and building a knowledge reservoir that facilitates subsequent inference, learning and problem-solving. We apply SDMTR to multi-modality question-answering and reasoning challenges, including the text-based bAbI-20k, visual Sort-of-CLEVR and triangle relations detection tasks. The results demonstrate that SDMTR significantly outperforms the vanilla Transformer and its recent improvements. Additionally, visualization analyses indicate that the presence of memory positively correlates with model effectiveness on inference tasks. This research provides novel insights and empirical support for advancing biologically plausible deep learning frameworks.}
}
Endnote
%0 Conference Paper
%T SDMTR: A Brain-inspired Transformer for Relation Inference
%A Xiangyu Zeng
%A Jie Lin
%A Piao Hu
%A Zhihao Li
%A Tianxi Huang
%B Proceedings of The 27th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2024
%E Sanjoy Dasgupta
%E Stephan Mandt
%E Yingzhen Li
%F pmlr-v238-zeng24a
%I PMLR
%P 3259--3267
%U https://proceedings.mlr.press/v238/zeng24a.html
%V 238
%X Deep learning has seen a movement towards modularity, module coordination and sparse interactions, concepts that mirror the working principles of biological systems. Inspired by Global Workspace Theory and the long-term memory system of the human brain, both of which are instrumental in constructing biologically plausible artificial intelligence systems, we introduce the shared dual-memory Transformer (SDMTR), a model that builds upon the Transformer. The proposed approach couples a shared long-term memory with a shared workspace of finite capacity, into which different specialized modules compete to write information. Crucial information from the shared workspace is then inscribed into long-term memory through an outer-product attention mechanism, reducing information conflict and building a knowledge reservoir that facilitates subsequent inference, learning and problem-solving. We apply SDMTR to multi-modality question-answering and reasoning challenges, including the text-based bAbI-20k, visual Sort-of-CLEVR and triangle relations detection tasks. The results demonstrate that SDMTR significantly outperforms the vanilla Transformer and its recent improvements. Additionally, visualization analyses indicate that the presence of memory positively correlates with model effectiveness on inference tasks. This research provides novel insights and empirical support for advancing biologically plausible deep learning frameworks.
APA
Zeng, X., Lin, J., Hu, P., Li, Z., & Huang, T. (2024). SDMTR: A Brain-inspired Transformer for Relation Inference. Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 238:3259-3267. Available from https://proceedings.mlr.press/v238/zeng24a.html.
