The Trojan Detection Challenge
Proceedings of the NeurIPS 2022 Competitions Track, PMLR 220:279-291, 2022.
Abstract
Neural trojan attacks inject machine learning systems with hidden behavior that lies dormant until activated. In recent years, trojan detection has emerged as a promising avenue for defending against standard trojan attacks. However, there have been few investigations of trojans specifically designed to be difficult to detect. We organized the Trojan Detection Challenge to begin work on the important question of how to build more robust trojan detectors. This paper gives an overview of the competition and its results. Notably, participants greatly improved over strong baselines on trojan detection and reverse-engineering tasks, demonstrating the potential for proactively improving the robustness of trojan detectors. We hope the competition and its results will inspire further research in detecting hidden behavior in machine learning systems.