Corruption-Tolerant Gaussian Process Bandit Optimization

Ilija Bogunovic, Andreas Krause, Jonathan Scarlett
Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics, PMLR 108:1071-1081, 2020.

Abstract

We consider the problem of optimizing an unknown (typically non-convex) function with a bounded norm in some Reproducing Kernel Hilbert Space (RKHS), based on noisy bandit feedback. We study a novel variant of this problem in which the point evaluations are corrupted not only by random noise, but also by adversarial corruptions. We introduce an algorithm, Fast-Slow GP-UCB, based on Gaussian process methods, randomized selection between two instances labeled 'fast' (but non-robust) and 'slow' (but robust), enlarged confidence bounds, and the principle of optimism under uncertainty. We present a novel theoretical analysis upper bounding the cumulative regret in terms of the corruption level, the time horizon, and the underlying kernel, and we argue that certain dependencies cannot be improved. We observe that distinct algorithmic ideas are required depending on whether one is required to perform well in both the corrupted and non-corrupted settings, and whether the corruption level is known or not.
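
To make the abstract's description concrete, the sketch below illustrates its core ingredients: optimistic (UCB) selection, a 'fast' GP-UCB instance with standard confidence widths, a 'slow' instance whose widths are enlarged by a term depending on the corruption budget C, and a random choice between the two at each round. This is a minimal illustration under stated assumptions, not the paper's exact construction: the RBF kernel, the specific enlargement (beta + C), and the sampling probability p_fast are all illustrative choices.

```python
# Minimal, illustrative sketch of the fast/slow GP-UCB idea: two UCB rules share
# the same observations, the "slow" one uses an enlarged confidence width, and
# one of the two is chosen at random in each round. Parameter values are
# illustrative assumptions only.
import numpy as np

def rbf_kernel(A, B, lengthscale=0.2):
    """Squared-exponential kernel matrix between rows of A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * lengthscale ** 2))

def gp_posterior(X, y, Xq, noise_var=0.01):
    """Standard GP posterior mean and standard deviation at query points Xq."""
    K = rbf_kernel(X, X) + noise_var * np.eye(len(X))
    Kq = rbf_kernel(Xq, X)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    mean = Kq @ alpha
    v = np.linalg.solve(L, Kq.T)
    var = np.clip(np.diag(rbf_kernel(Xq, Xq)) - (v ** 2).sum(0), 1e-12, None)
    return mean, np.sqrt(var)

def fast_slow_gp_ucb(f, corrupt, domain, T, C=5.0, beta=2.0, p_fast=0.5, seed=0):
    """Optimistic (UCB) selection, randomizing between a 'fast' instance with a
    standard confidence width and a 'slow' instance with a width enlarged by a
    term depending on the corruption budget C (illustrative choice)."""
    rng = np.random.default_rng(seed)
    X, y = [], []
    for t in range(T):
        if not X:
            x = domain[rng.integers(len(domain))]
        else:
            mean, std = gp_posterior(np.array(X), np.array(y), domain)
            if rng.random() < p_fast:
                ucb = mean + beta * std        # fast: standard, non-robust width
            else:
                ucb = mean + (beta + C) * std  # slow: enlarged, robust width
            x = domain[np.argmax(ucb)]
        X.append(x)
        # Observation = true value + adversarial corruption + random noise.
        y.append(f(x) + corrupt(x, t) + 0.1 * rng.standard_normal())
    return np.array(X), np.array(y)

if __name__ == "__main__":
    domain = np.linspace(0.0, 1.0, 200)[:, None]
    f = lambda x: np.sin(6 * x[0])                  # unknown objective (illustrative)
    corrupt = lambda x, t: -0.5 if t < 10 else 0.0  # bounded adversarial corruption
    X, y = fast_slow_gp_ucb(f, corrupt, domain, T=40)
    print("best queried point:", X[np.argmax([f(x) for x in X])])
```

The enlarged width used by the slow instance makes its confidence bounds valid even after the observations have been corrupted by a bounded amount, at the cost of slower shrinkage; randomizing between the two instances is the mechanism the abstract refers to for balancing performance in the corrupted and non-corrupted settings.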

Cite this Paper


BibTeX
@InProceedings{pmlr-v108-bogunovic20a,
  title     = {Corruption-Tolerant Gaussian Process Bandit Optimization},
  author    = {Bogunovic, Ilija and Krause, Andreas and Scarlett, Jonathan},
  booktitle = {Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics},
  pages     = {1071--1081},
  year      = {2020},
  editor    = {Chiappa, Silvia and Calandra, Roberto},
  volume    = {108},
  series    = {Proceedings of Machine Learning Research},
  month     = {26--28 Aug},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v108/bogunovic20a/bogunovic20a.pdf},
  url       = {https://proceedings.mlr.press/v108/bogunovic20a.html},
  abstract  = {We consider the problem of optimizing an unknown (typically non-convex) function with a bounded norm in some Reproducing Kernel Hilbert Space (RKHS), based on noisy bandit feedback. We study a novel variant of this problem in which the point evaluations are corrupted not only by random noise, but also by adversarial corruptions. We introduce an algorithm, Fast-Slow GP-UCB, based on Gaussian process methods, randomized selection between two instances labeled 'fast' (but non-robust) and 'slow' (but robust), enlarged confidence bounds, and the principle of optimism under uncertainty. We present a novel theoretical analysis upper bounding the cumulative regret in terms of the corruption level, the time horizon, and the underlying kernel, and we argue that certain dependencies cannot be improved. We observe that distinct algorithmic ideas are required depending on whether one is required to perform well in both the corrupted and non-corrupted settings, and whether the corruption level is known or not.}
}
Endnote
%0 Conference Paper
%T Corruption-Tolerant Gaussian Process Bandit Optimization
%A Ilija Bogunovic
%A Andreas Krause
%A Jonathan Scarlett
%B Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2020
%E Silvia Chiappa
%E Roberto Calandra
%F pmlr-v108-bogunovic20a
%I PMLR
%P 1071--1081
%U https://proceedings.mlr.press/v108/bogunovic20a.html
%V 108
%X We consider the problem of optimizing an unknown (typically non-convex) function with a bounded norm in some Reproducing Kernel Hilbert Space (RKHS), based on noisy bandit feedback. We study a novel variant of this problem in which the point evaluations are corrupted not only by random noise, but also by adversarial corruptions. We introduce an algorithm, Fast-Slow GP-UCB, based on Gaussian process methods, randomized selection between two instances labeled 'fast' (but non-robust) and 'slow' (but robust), enlarged confidence bounds, and the principle of optimism under uncertainty. We present a novel theoretical analysis upper bounding the cumulative regret in terms of the corruption level, the time horizon, and the underlying kernel, and we argue that certain dependencies cannot be improved. We observe that distinct algorithmic ideas are required depending on whether one is required to perform well in both the corrupted and non-corrupted settings, and whether the corruption level is known or not.
APA
Bogunovic, I., Krause, A. & Scarlett, J. (2020). Corruption-Tolerant Gaussian Process Bandit Optimization. Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 108:1071-1081. Available from https://proceedings.mlr.press/v108/bogunovic20a.html.