Offline Reward Perturbation Boosts Distributional Shift in Online RL

Zishun Yu, Siteng Kang, Xinhua Zhang
Proceedings of the Fortieth Conference on Uncertainty in Artificial Intelligence, PMLR 244:4041-4055, 2024.

Abstract

Offline-to-online reinforcement learning has recently been shown to reduce online sample complexity by first training on offline-collected data. However, this additional data source may also invite new poisoning attacks that target offline training. In this work, we reveal such a vulnerability in critic-regularized offline RL by proposing a novel data poisoning attack that is stealthy in the sense that performance during offline training remains intact, while the online fine-tuning stage suffers a significant performance drop. Our method leverages bi-level optimization techniques to promote over-estimation and distribution shift in the offline-to-online setting. Experiments on four environments confirm that the attack satisfies this new stealthiness requirement and remains effective with only a small perturbation budget and without white-box access to the victim model.
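At a high level, an attack of this kind can be framed as a bi-level program: an outer problem chooses a budget-bounded perturbation of the stored rewards to maximize the shift induced after online fine-tuning, while the inner problem is the victim's own offline training on the poisoned data. A minimal LaTeX sketch of this structure (the symbols \delta, \epsilon, \mathcal{L}_{\mathrm{offline}}, and J_{\mathrm{shift}} are illustrative placeholders, not the paper's notation):

    \[
    \max_{\|\delta\|_\infty \le \epsilon} \; J_{\mathrm{shift}}\!\left(\pi_{\theta^\star(\delta)}\right)
    \quad \text{s.t.} \quad
    \theta^\star(\delta) \in \operatorname*{arg\,min}_{\theta} \;
    \mathcal{L}_{\mathrm{offline}}\!\left(\theta;\, \{(s, a, r + \delta, s')\}\right),
    \]

where \delta perturbs the offline rewards within budget \epsilon, the inner problem is the victim's critic-regularized offline objective on the poisoned dataset, and J_{\mathrm{shift}} scores the over-estimation or distribution shift exhibited by the resulting policy \pi_{\theta^\star(\delta)} once it is fine-tuned online.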

Cite this Paper

BibTeX
@InProceedings{pmlr-v244-yu24a,
  title     = {Offline Reward Perturbation Boosts Distributional Shift in Online RL},
  author    = {Yu, Zishun and Kang, Siteng and Zhang, Xinhua},
  booktitle = {Proceedings of the Fortieth Conference on Uncertainty in Artificial Intelligence},
  pages     = {4041--4055},
  year      = {2024},
  editor    = {Kiyavash, Negar and Mooij, Joris M.},
  volume    = {244},
  series    = {Proceedings of Machine Learning Research},
  month     = {15--19 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v244/main/assets/yu24a/yu24a.pdf},
  url       = {https://proceedings.mlr.press/v244/yu24a.html}
}
Endnote
%0 Conference Paper
%T Offline Reward Perturbation Boosts Distributional Shift in Online RL
%A Zishun Yu
%A Siteng Kang
%A Xinhua Zhang
%B Proceedings of the Fortieth Conference on Uncertainty in Artificial Intelligence
%C Proceedings of Machine Learning Research
%D 2024
%E Negar Kiyavash
%E Joris M. Mooij
%F pmlr-v244-yu24a
%I PMLR
%P 4041--4055
%U https://proceedings.mlr.press/v244/yu24a.html
%V 244
APA
Yu, Z., Kang, S. & Zhang, X. (2024). Offline Reward Perturbation Boosts Distributional Shift in Online RL. Proceedings of the Fortieth Conference on Uncertainty in Artificial Intelligence, in Proceedings of Machine Learning Research 244:4041-4055. Available from https://proceedings.mlr.press/v244/yu24a.html.
