[edit]
Prediction by random-walk perturbation
Proceedings of the 26th Annual Conference on Learning Theory, PMLR 30:460-473, 2013.
Abstract
We propose a version of the follow-the-perturbed-leader online prediction algorithm in which the cumulative losses are perturbed by independent symmetric random walks. The forecaster is shown to achieve an expected regret of the optimal order O(\sqrtn \log N) where n is the time horizon and N is the number of experts. More importantly, it is shown that the forecaster changes its prediction at most O(\sqrtn \log N) times, in expectation. We also extend the analysis to online combinatorial optimization and show that even in this more general setting, the forecaster rarely switches between experts while having a regret of near-optimal order.