The Trojan Detection Challenge

Mantas Mazeika, Dan Hendrycks, Huichen Li, Xiaojun Xu, Sidney Hough, Andy Zou, Arezoo Rajabi, Qi Yao, Zihao Wang, Jian Tian, Yao Tang, Di Tang, Roman Smirnov, Pavel Pleskov, Nikita Benkovich, Dawn Song, Radha Poovendran, Bo Li, David Forsyth
Proceedings of the NeurIPS 2022 Competitions Track, PMLR 220:279-291, 2022.

Abstract

Neural trojan attacks inject machine learning systems with hidden behavior that lies dormant until activated. In recent years, trojan detection has emerged as a promising avenue for defending against standard trojan attacks. However, there have been few investigations on trojans specifically designed to be difficult to detect. We organized the Trojan Detection Challenge to begin work on the important question of how to build more robust trojan detectors. This paper gives an overview of the competition and its results. Notably, participants greatly improved over strong baselines on trojan detection and reverse-engineering tasks, demonstrating the potential for proactively improving the robustness of trojan detectors. We hope the competition and its results will inspire further research in detecting hidden behavior in machine learning systems.

Cite this Paper


BibTeX
@InProceedings{pmlr-v220-mazeika23a,
  title = {The Trojan Detection Challenge},
  author = {Mazeika, Mantas and Hendrycks, Dan and Li, Huichen and Xu, Xiaojun and Hough, Sidney and Zou, Andy and Rajabi, Arezoo and Yao, Qi and Wang, Zihao and Tian, Jian and Tang, Yao and Tang, Di and Smirnov, Roman and Pleskov, Pavel and Benkovich, Nikita and Song, Dawn and Poovendran, Radha and Li, Bo and Forsyth, David},
  booktitle = {Proceedings of the NeurIPS 2022 Competitions Track},
  pages = {279--291},
  year = {2022},
  editor = {Ciccone, Marco and Stolovitzky, Gustavo and Albrecht, Jacob},
  volume = {220},
  series = {Proceedings of Machine Learning Research},
  month = {28 Nov--09 Dec},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v220/mazeika23a/mazeika23a.pdf},
  url = {https://proceedings.mlr.press/v220/mazeika23a.html},
  abstract = {Neural trojan attacks inject machine learning systems with hidden behavior that lies dormant until activated. In recent years, trojan detection has emerged as a promising avenue for defending against standard trojan attacks. However, there have been few investigations on trojans specifically designed to be difficult to detect. We organized the Trojan Detection Challenge to begin work on the important question of how to build more robust trojan detectors. This paper gives an overview of the competition and its results. Notably, participants greatly improved over strong baselines on trojan detection and reverse-engineering tasks, demonstrating the potential for proactively improving the robustness of trojan detectors. We hope the competition and its results will inspire further research in detecting hidden behavior in machine learning systems.}
}
Endnote
%0 Conference Paper
%T The Trojan Detection Challenge
%A Mantas Mazeika
%A Dan Hendrycks
%A Huichen Li
%A Xiaojun Xu
%A Sidney Hough
%A Andy Zou
%A Arezoo Rajabi
%A Qi Yao
%A Zihao Wang
%A Jian Tian
%A Yao Tang
%A Di Tang
%A Roman Smirnov
%A Pavel Pleskov
%A Nikita Benkovich
%A Dawn Song
%A Radha Poovendran
%A Bo Li
%A David Forsyth
%B Proceedings of the NeurIPS 2022 Competitions Track
%C Proceedings of Machine Learning Research
%D 2022
%E Marco Ciccone
%E Gustavo Stolovitzky
%E Jacob Albrecht
%F pmlr-v220-mazeika23a
%I PMLR
%P 279--291
%U https://proceedings.mlr.press/v220/mazeika23a.html
%V 220
%X Neural trojan attacks inject machine learning systems with hidden behavior that lies dormant until activated. In recent years, trojan detection has emerged as a promising avenue for defending against standard trojan attacks. However, there have been few investigations on trojans specifically designed to be difficult to detect. We organized the Trojan Detection Challenge to begin work on the important question of how to build more robust trojan detectors. This paper gives an overview of the competition and its results. Notably, participants greatly improved over strong baselines on trojan detection and reverse-engineering tasks, demonstrating the potential for proactively improving the robustness of trojan detectors. We hope the competition and its results will inspire further research in detecting hidden behavior in machine learning systems.
APA
Mazeika, M., Hendrycks, D., Li, H., Xu, X., Hough, S., Zou, A., Rajabi, A., Yao, Q., Wang, Z., Tian, J., Tang, Y., Tang, D., Smirnov, R., Pleskov, P., Benkovich, N., Song, D., Poovendran, R., Li, B. & Forsyth, D. (2022). The Trojan Detection Challenge. Proceedings of the NeurIPS 2022 Competitions Track, in Proceedings of Machine Learning Research 220:279-291. Available from https://proceedings.mlr.press/v220/mazeika23a.html.
