A Study of Modus Ponens in Transformer Models
Proceedings of the International Conference on Neuro-symbolic Systems, PMLR 288:293-315, 2025.
Abstract
Transformer models are the backbone of modern natural language processing. However, whether they can truly perform logical reasoning remains uncertain. This paper examines transformers’ capacity for logical inference in a controlled setting, isolating a single rule, modus ponens, and eliminating confounding factors such as semantic knowledge and linguistic complexity. We systematically vary architectural components, specifically the number of attention heads and layers, to assess their impact on logical inference. Our results show that attention heads enhance information propagation, whereas deeper architectures accelerate convergence but also introduce potentially redundant parameters. While transformers successfully handle level-2 inference tasks, their difficulties with higher-level and out-of-distribution problems suggest that they rely on heuristic “shortcuts” rather than explicit multi-step reasoning. Analysis of attention maps and ablation experiments indicates that these shortcuts function similarly to a matching-aggregation algorithm, in which attention heads identify inference links and the feed-forward network verifies whether they form a valid chain. These findings highlight fundamental limitations in transformers’ ability to perform structured logical reasoning.
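
To make the setup concrete, the sketch below illustrates the kind of synthetic task and reference procedure the abstract describes: a level-k modus ponens instance (one fact, a chain of k implications, plus distractor rules) and a simple matching-aggregation check that repeatedly matches rules whose antecedents are already derived and aggregates their consequents. This is an illustrative assumption, not the paper’s actual data generator or model; names such as make_instance and matching_aggregation_check are hypothetical.

```python
import random

# Hypothetical sketch (not from the paper): a synthetic modus ponens instance
# of a given inference depth, checked by a "matching-aggregation" procedure.

def make_instance(depth: int, n_distractors: int = 3, seed: int = 0):
    """Create one fact, a chain of `depth` implications, distractor rules, and a query."""
    rng = random.Random(seed)
    atoms = [f"p{i}" for i in range(depth + n_distractors + 1)]
    rng.shuffle(atoms)
    chain = atoms[: depth + 1]  # fact -> ... -> query, with `depth` implication links
    rules = [(chain[i], chain[i + 1]) for i in range(depth)]
    # Distractor rules whose antecedents are never derivable from the given fact.
    for d in atoms[depth + 1:]:
        rules.append((d, rng.choice(atoms)))
    rng.shuffle(rules)
    return {"fact": chain[0], "rules": rules, "query": chain[-1]}

def matching_aggregation_check(instance, max_steps: int) -> bool:
    """Apply modus ponens for at most `max_steps` rounds and test the query."""
    derived = {instance["fact"]}
    for _ in range(max_steps):
        # Matching: rules whose antecedent is already derived.
        fired = {c for a, c in instance["rules"] if a in derived}
        if fired <= derived:
            break
        # Aggregation: merge newly derived consequents.
        derived |= fired
    return instance["query"] in derived

if __name__ == "__main__":
    inst = make_instance(depth=2, seed=1)
    print(inst)
    print("level-2 query derivable:", matching_aggregation_check(inst, max_steps=2))
```

Under this framing, a model that only implements a bounded version of the matching step would handle level-2 instances yet fail once the required chain length or the distribution of distractors exceeds what it saw in training, which is consistent with the shortcut behavior the abstract reports.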