Cowpox: Towards the Immunity of VLM-based Multi-Agent Systems

Yutong Wu, Jie Zhang, Yiming Li, Chao Zhang, Qing Guo, Han Qiu, Nils Lukas, Tianwei Zhang
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:68015-68035, 2025.

Abstract

Vision Language Model (VLM) Agents are stateful, autonomous entities capable of perceiving and interacting with their environments through vision and language. Multi-agent systems comprise specialized agents who collaborate to solve a (complex) task. A core security property is robustness, stating that the system maintains its integrity during adversarial attacks. Multi-agent systems lack robustness, as a successful exploit against one agent can spread and infect other agents to undermine the entire system’s integrity. We propose a defense Cowpox to provably enhance the robustness of a multi-agent system by a distributed mechanism that improves the recovery rate of agents by limiting the expected number of infections to other agents. The core idea is to generate and distribute a special cure sample that immunizes an agent against the attack before exposure. We demonstrate the effectiveness of Cowpox empirically and provide theoretical robustness guarantees.

Cite this Paper


BibTeX
@InProceedings{pmlr-v267-wu25aq,
  title = {Cowpox: Towards the Immunity of {VLM}-based Multi-Agent Systems},
  author = {Wu, Yutong and Zhang, Jie and Li, Yiming and Zhang, Chao and Guo, Qing and Qiu, Han and Lukas, Nils and Zhang, Tianwei},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages = {68015--68035},
  year = {2025},
  editor = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume = {267},
  series = {Proceedings of Machine Learning Research},
  month = {13--19 Jul},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/wu25aq/wu25aq.pdf},
  url = {https://proceedings.mlr.press/v267/wu25aq.html},
  abstract = {Vision Language Model (VLM) Agents are stateful, autonomous entities capable of perceiving and interacting with their environments through vision and language. Multi-agent systems comprise specialized agents who collaborate to solve a (complex) task. A core security property is robustness, stating that the system maintains its integrity during adversarial attacks. Multi-agent systems lack robustness, as a successful exploit against one agent can spread and infect other agents to undermine the entire system’s integrity. We propose a defense Cowpox to provably enhance the robustness of a multi-agent system by a distributed mechanism that improves the recovery rate of agents by limiting the expected number of infections to other agents. The core idea is to generate and distribute a special cure sample that immunizes an agent against the attack before exposure. We demonstrate the effectiveness of Cowpox empirically and provide theoretical robustness guarantees.}
}
Endnote
%0 Conference Paper
%T Cowpox: Towards the Immunity of VLM-based Multi-Agent Systems
%A Yutong Wu
%A Jie Zhang
%A Yiming Li
%A Chao Zhang
%A Qing Guo
%A Han Qiu
%A Nils Lukas
%A Tianwei Zhang
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-wu25aq
%I PMLR
%P 68015--68035
%U https://proceedings.mlr.press/v267/wu25aq.html
%V 267
%X Vision Language Model (VLM) Agents are stateful, autonomous entities capable of perceiving and interacting with their environments through vision and language. Multi-agent systems comprise specialized agents who collaborate to solve a (complex) task. A core security property is robustness, stating that the system maintains its integrity during adversarial attacks. Multi-agent systems lack robustness, as a successful exploit against one agent can spread and infect other agents to undermine the entire system’s integrity. We propose a defense Cowpox to provably enhance the robustness of a multi-agent system by a distributed mechanism that improves the recovery rate of agents by limiting the expected number of infections to other agents. The core idea is to generate and distribute a special cure sample that immunizes an agent against the attack before exposure. We demonstrate the effectiveness of Cowpox empirically and provide theoretical robustness guarantees.
APA
Wu, Y., Zhang, J., Li, Y., Zhang, C., Guo, Q., Qiu, H., Lukas, N., & Zhang, T. (2025). Cowpox: Towards the Immunity of VLM-based Multi-Agent Systems. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:68015-68035. Available from https://proceedings.mlr.press/v267/wu25aq.html.

Related Material