Expert Collapse and Compositional Failure in Simple Multimodal MoE
Proceedings of AAAI 2026 Workshop on Bias in Multimodal AI, PMLR 332:1-10, 2026.
Abstract
Mixture-of-Experts (MoE) is a technique that replaces the single MLP at each transformer layer with multiple expert MLPs. MoE architectures are hypothesised to create specialised experts, but this specialisation is often inferred rather than quantitatively measured. Further, it is in conflict with standard load-balancing losses, which promote a more uniform load distribution but can thereby encourage expert redundancy. We construct a novel two-unimodal-expert (vision/text) MoE testbed and use a three-stage protocol: we first force specialisation with unimodal data (Stage 2), then test its stability during multimodal fine-tuning (Stage 3). Our findings demonstrate that while Stage 2 successfully creates specialised experts, this specialisation persists only at the object level. In Stage 3, the standard multimodal loss actively overwrites this structure, causing the latent space to default to clustering only by modality rather than by concept. We identify the mechanism as layer-level expert collapse. Furthermore, a case study on compositional binding reveals that even when specialisation is present, it captures monolithic objects (e.g., 'car') but fails to bind attributes such as colour, perhaps highlighting a source of bias in multimodal representation.
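To make the architectural setup concrete, the sketch below shows one simple way a two-unimodal-expert MoE layer could be realised: a transformer MLP block with a vision expert and a text expert, routed by a hard modality mask. This is a minimal illustration under assumed names, dimensions, and routing; the abstract does not specify the paper's actual router, gating, or hyperparameters.

    import torch
    import torch.nn as nn

    class TwoExpertMoELayer(nn.Module):
        """Illustrative two-expert MoE block (not the paper's exact implementation).

        Each transformer layer's single MLP is replaced by two expert MLPs;
        here tokens are routed by a hard vision/text mask, one plausible way
        to obtain the unimodal specialisation the abstract describes.
        """

        def __init__(self, d_model: int = 256, d_hidden: int = 1024):
            super().__init__()

            def expert() -> nn.Sequential:
                # A standard transformer MLP: expand, nonlinearity, project back.
                return nn.Sequential(
                    nn.Linear(d_model, d_hidden),
                    nn.GELU(),
                    nn.Linear(d_hidden, d_model),
                )

            self.vision_expert = expert()
            self.text_expert = expert()

        def forward(self, x: torch.Tensor, is_vision: torch.Tensor) -> torch.Tensor:
            # x: (batch, seq, d_model); is_vision: (batch, seq) boolean mask
            # marking which tokens belong to the vision modality.
            out = torch.empty_like(x)
            out[is_vision] = self.vision_expert(x[is_vision])
            out[~is_vision] = self.text_expert(x[~is_vision])
            return out

    if __name__ == "__main__":
        layer = TwoExpertMoELayer()
        tokens = torch.randn(2, 8, 256)
        mask = torch.zeros(2, 8, dtype=torch.bool)
        mask[:, :4] = True  # first half of each sequence treated as vision tokens
        print(layer(tokens, mask).shape)  # torch.Size([2, 8, 256])

A learned router with a load-balancing auxiliary loss would replace the hard mask in a standard MoE; the hard routing shown here simply makes the intended vision/text specialisation explicit for the purposes of the sketch.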