MisD-MoE: A Multimodal Misinformation Detection Framework with Adaptive Feature Selection
Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, PMLR 262:114-122, 2024.
Abstract
The rapid growth of social media has led to the widespread dissemination of misinformation across multiple content forms, including text, images, audio, and video. Compared to unimodal misinformation detection, multimodal detection benefits from the richer information available across modalities. However, these additional features may also introduce redundancy: overlapping or irrelevant information can disrupt the feature space and impair model performance. To address this issue, we propose a novel framework, Misinformation Detection Mixture of Experts (MisD-MoE), which employs a distinct expert model for each modality and incorporates an adaptive feature selection mechanism based on top-k gating and the Gumbel-Sigmoid relaxation. This approach dynamically filters relevant features, reducing redundancy and improving detection accuracy. Extensive experiments on the FakeSV and FVC-2018 datasets demonstrate that MisD-MoE significantly outperforms state-of-the-art methods, improving accuracy over baseline models by 3.45% and 3.71% on the respective datasets.
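To make the gating mechanism concrete, the sketch below shows a minimal, hypothetical PyTorch implementation of a top-k gate with a Gumbel-Sigmoid relaxation over per-modality features. The abstract does not specify implementation details, so the module name, arguments, and straight-through trick here are assumptions for illustration only, not the authors' actual code.

```python
import torch
import torch.nn as nn


class GumbelSigmoidTopKGate(nn.Module):
    """Illustrative feature-selection gate (hypothetical): scores each feature,
    perturbs the scores with Gumbel noise (Gumbel-Sigmoid relaxation), and
    keeps only the top-k features of one modality's representation."""

    def __init__(self, feat_dim: int, k: int, tau: float = 1.0):
        super().__init__()
        self.scorer = nn.Linear(feat_dim, feat_dim)  # per-feature gating logits
        self.k = k
        self.tau = tau

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, feat_dim) features from one modality expert
        logits = self.scorer(x)

        if self.training:
            # Gumbel-Sigmoid: add the difference of two Gumbel(0, 1) samples,
            # then apply a temperature-scaled sigmoid for a differentiable
            # relaxation of a binary keep/drop decision.
            g1 = -torch.log(-torch.log(torch.rand_like(logits) + 1e-9) + 1e-9)
            g2 = -torch.log(-torch.log(torch.rand_like(logits) + 1e-9) + 1e-9)
            gates = torch.sigmoid((logits + g1 - g2) / self.tau)
        else:
            gates = torch.sigmoid(logits)

        # Top-k selection: build a hard mask over the k highest-scoring features.
        topk_idx = gates.topk(self.k, dim=-1).indices
        mask = torch.zeros_like(gates).scatter_(-1, topk_idx, 1.0)

        # Straight-through estimator: forward pass uses the hard mask,
        # backward pass propagates gradients through the soft gates.
        hard_gates = (mask - gates).detach() + gates
        return x * hard_gates


# Usage sketch: keep 256 of 768 text-expert features per example.
gate = GumbelSigmoidTopKGate(feat_dim=768, k=256)
selected = gate(torch.randn(8, 768))
```

In a mixture-of-experts setup along these lines, one such gate would sit on top of each modality expert (text, image, audio, video), and the filtered representations would be fused before classification.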