Derivative-Free & Order-Robust Optimisation

Haitham Ammar, Victor Gabillon, Rasul Tutunov, Michal Valko
Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics, PMLR 108:2293-2303, 2020.

Abstract

In this paper, we formalise order-robust optimisation as an instance of online learning minimising simple regret, and propose Vroom, a zeroth-order optimisation algorithm capable of achieving vanishing regret in non-stationary environments, while recovering favorable rates under stochastic reward-generating processes. Our results are the first to target simple regret definitions in adversarial scenarios, unveiling a challenge that has been rarely considered in prior work.

Cite this Paper


BibTeX
@InProceedings{pmlr-v108-ammar20a,
  title = {Derivative-Free \& Order-Robust Optimisation},
  author = {Ammar, Haitham and Gabillon, Victor and Tutunov, Rasul and Valko, Michal},
  booktitle = {Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics},
  pages = {2293--2303},
  year = {2020},
  editor = {Silvia Chiappa and Roberto Calandra},
  volume = {108},
  series = {Proceedings of Machine Learning Research},
  month = {26--28 Aug},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v108/ammar20a/ammar20a.pdf},
  url = {http://proceedings.mlr.press/v108/ammar20a.html},
  abstract = {In this paper, we formalise order-robust optimisation as an instance of online learning minimising simple regret, and propose Vroom, a zeroth-order optimisation algorithm capable of achieving vanishing regret in non-stationary environments, while recovering favorable rates under stochastic reward-generating processes. Our results are the first to target simple regret definitions in adversarial scenarios, unveiling a challenge that has been rarely considered in prior work.}
}
Endnote
%0 Conference Paper
%T Derivative-Free & Order-Robust Optimisation
%A Haitham Ammar
%A Victor Gabillon
%A Rasul Tutunov
%A Michal Valko
%B Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2020
%E Silvia Chiappa
%E Roberto Calandra
%F pmlr-v108-ammar20a
%I PMLR
%P 2293--2303
%U http://proceedings.mlr.press/v108/ammar20a.html
%V 108
%X In this paper, we formalise order-robust optimisation as an instance of online learning minimising simple regret, and propose Vroom, a zeroth-order optimisation algorithm capable of achieving vanishing regret in non-stationary environments, while recovering favorable rates under stochastic reward-generating processes. Our results are the first to target simple regret definitions in adversarial scenarios, unveiling a challenge that has been rarely considered in prior work.
APA
Ammar, H., Gabillon, V., Tutunov, R. & Valko, M. (2020). Derivative-Free & Order-Robust Optimisation. Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 108:2293-2303. Available from http://proceedings.mlr.press/v108/ammar20a.html

Related Material