Corruption-Robust Variance-aware Algorithms for Generalized Linear Bandits under Heavy-tailed Rewards

Qingyuan Yu, Euijin Baek, Xiang Li, Qiang Sun
Proceedings of the Forty-first Conference on Uncertainty in Artificial Intelligence, PMLR 286:4826-4843, 2025.

Abstract

Stochastic linear bandits have recently received significant attention in sequential decision-making. However, real-world challenges such as heavy-tailed noise, reward corruption, and nonlinear reward functions remain difficult to address. To tackle these difficulties, we propose GAdaOFUL, a novel algorithm that leverages adaptive Huber regression to achieve robustness in generalized linear models (GLMs), where rewards can be nonlinear functions of features. GAdaOFUL achieves a state-of-the-art variance-aware regret bound, scaling with the square root of the cumulative reward variance over time, plus an additional term proportional to the level of corruption. The algorithm adapts to problem complexity, yielding improved regret when the cumulative variance is small. Simulation results demonstrate the robustness and effectiveness of GAdaOFUL in practice. The code is available at \url{https://github.com/NeXAIS/GAdaOFUL}.

Cite this Paper


BibTeX
@InProceedings{pmlr-v286-yu25b, title = {Corruption-Robust Variance-aware Algorithms for Generalized Linear Bandits under Heavy-tailed Rewards}, author = {Yu, Qingyuan and Baek, Euijin and Li, Xiang and Sun, Qiang}, booktitle = {Proceedings of the Forty-first Conference on Uncertainty in Artificial Intelligence}, pages = {4826--4843}, year = {2025}, editor = {Chiappa, Silvia and Magliacane, Sara}, volume = {286}, series = {Proceedings of Machine Learning Research}, month = {21--25 Jul}, publisher = {PMLR}, pdf = {https://raw.githubusercontent.com/mlresearch/v286/main/assets/yu25b/yu25b.pdf}, url = {https://proceedings.mlr.press/v286/yu25b.html}, abstract = {Stochastic linear bandits have recently received significant attention in sequential decision-making. However, real-world challenges such as heavy-tailed noise, reward corruption, and nonlinear reward functions remain difficult to address. To tackle these difficulties, we propose GAdaOFUL, a novel algorithm that leverages adaptive Huber regression to achieve robustness in generalized linear models (GLMs), where rewards can be nonlinear functions of features. GAdaOFUL achieves a state-of-the-art variance-aware regret bound, scaling with the square root of the cumulative reward variance over time, plus an additional term proportional to the level of corruption. The algorithm adapts to problem complexity, yielding improved regret when the cumulative variance is small. Simulation results demonstrate the robustness and effectiveness of GAdaOFUL in practice. The code is available at \url{https://github.com/NeXAIS/GAdaOFUL}.} }
Endnote
%0 Conference Paper %T Corruption-Robust Variance-aware Algorithms for Generalized Linear Bandits under Heavy-tailed Rewards %A Qingyuan Yu %A Euijin Baek %A Xiang Li %A Qiang Sun %B Proceedings of the Forty-first Conference on Uncertainty in Artificial Intelligence %C Proceedings of Machine Learning Research %D 2025 %E Silvia Chiappa %E Sara Magliacane %F pmlr-v286-yu25b %I PMLR %P 4826--4843 %U https://proceedings.mlr.press/v286/yu25b.html %V 286 %X Stochastic linear bandits have recently received significant attention in sequential decision-making. However, real-world challenges such as heavy-tailed noise, reward corruption, and nonlinear reward functions remain difficult to address. To tackle these difficulties, we propose GAdaOFUL, a novel algorithm that leverages adaptive Huber regression to achieve robustness in generalized linear models (GLMs), where rewards can be nonlinear functions of features. GAdaOFUL achieves a state-of-the-art variance-aware regret bound, scaling with the square root of the cumulative reward variance over time, plus an additional term proportional to the level of corruption. The algorithm adapts to problem complexity, yielding improved regret when the cumulative variance is small. Simulation results demonstrate the robustness and effectiveness of GAdaOFUL in practice. The code is available at \url{https://github.com/NeXAIS/GAdaOFUL}.
APA
Yu, Q., Baek, E., Li, X. & Sun, Q.. (2025). Corruption-Robust Variance-aware Algorithms for Generalized Linear Bandits under Heavy-tailed Rewards. Proceedings of the Forty-first Conference on Uncertainty in Artificial Intelligence, in Proceedings of Machine Learning Research 286:4826-4843 Available from https://proceedings.mlr.press/v286/yu25b.html.

Related Material