Target Tracking for Contextual Bandits: Application to Demand Side Management

Margaux Brégère, Pierre Gaillard, Yannig Goude, Gilles Stoltz
Proceedings of the 36th International Conference on Machine Learning, PMLR 97:754-763, 2019.

Abstract

We propose a contextual-bandit approach for demand side management by offering price incentives. More precisely, a target mean consumption is set at each round and the mean consumption is modeled as a complex function of the distribution of prices sent and of some contextual variables such as the temperature, weather, and so on. The performance of our strategies is measured in quadratic losses through a regret criterion. We offer $T^{2/3}$ upper bounds on this regret (up to poly-logarithmic terms)—and even faster rates under stronger assumptions—for strategies inspired by standard strategies for contextual bandits (like LinUCB, see Li et al., 2010). Simulations on a real data set gathered by UK Power Networks, in which price incentives were offered, show that our strategies are effective and may indeed manage demand response by suitably picking the price levels.

Cite this Paper


BibTeX
@InProceedings{pmlr-v97-bregere19a,
  title     = {Target Tracking for Contextual Bandits: Application to Demand Side Management},
  author    = {Br{\'e}g{\`e}re, Margaux and Gaillard, Pierre and Goude, Yannig and Stoltz, Gilles},
  booktitle = {Proceedings of the 36th International Conference on Machine Learning},
  pages     = {754--763},
  year      = {2019},
  editor    = {Chaudhuri, Kamalika and Salakhutdinov, Ruslan},
  volume    = {97},
  series    = {Proceedings of Machine Learning Research},
  month     = {09--15 Jun},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v97/bregere19a/bregere19a.pdf},
  url       = {https://proceedings.mlr.press/v97/bregere19a.html},
  abstract  = {We propose a contextual-bandit approach for demand side management by offering price incentives. More precisely, a target mean consumption is set at each round and the mean consumption is modeled as a complex function of the distribution of prices sent and of some contextual variables such as the temperature, weather, and so on. The performance of our strategies is measured in quadratic losses through a regret criterion. We offer $T^{2/3}$ upper bounds on this regret (up to poly-logarithmic terms)---and even faster rates under stronger assumptions---for strategies inspired by standard strategies for contextual bandits (like LinUCB, see Li et al., 2010). Simulations on a real data set gathered by UK Power Networks, in which price incentives were offered, show that our strategies are effective and may indeed manage demand response by suitably picking the price levels.}
}
Endnote
%0 Conference Paper
%T Target Tracking for Contextual Bandits: Application to Demand Side Management
%A Margaux Brégère
%A Pierre Gaillard
%A Yannig Goude
%A Gilles Stoltz
%B Proceedings of the 36th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2019
%E Kamalika Chaudhuri
%E Ruslan Salakhutdinov
%F pmlr-v97-bregere19a
%I PMLR
%P 754--763
%U https://proceedings.mlr.press/v97/bregere19a.html
%V 97
%X We propose a contextual-bandit approach for demand side management by offering price incentives. More precisely, a target mean consumption is set at each round and the mean consumption is modeled as a complex function of the distribution of prices sent and of some contextual variables such as the temperature, weather, and so on. The performance of our strategies is measured in quadratic losses through a regret criterion. We offer $T^{2/3}$ upper bounds on this regret (up to poly-logarithmic terms)—and even faster rates under stronger assumptions—for strategies inspired by standard strategies for contextual bandits (like LinUCB, see Li et al., 2010). Simulations on a real data set gathered by UK Power Networks, in which price incentives were offered, show that our strategies are effective and may indeed manage demand response by suitably picking the price levels.
APA
Brégère, M., Gaillard, P., Goude, Y., & Stoltz, G. (2019). Target Tracking for Contextual Bandits: Application to Demand Side Management. Proceedings of the 36th International Conference on Machine Learning, in Proceedings of Machine Learning Research 97:754-763. Available from https://proceedings.mlr.press/v97/bregere19a.html.
