Pursuit-Evasion Without Regret, with an Application to Trading

Lili Dworkin, Michael Kearns, Yuriy Nevmyvaka
Proceedings of the 31st International Conference on Machine Learning, PMLR 32(2):1521-1529, 2014.

Abstract

We propose a state-based variant of the classical online learning problem of tracking the best expert. In our setting, the actions of the algorithm and experts correspond to local moves through a continuous and bounded state space. At each step, Nature chooses payoffs as a function of each player’s current position and action. Our model therefore integrates the problem of prediction with expert advice with the stateful formalisms of reinforcement learning. Traditional no-regret learning approaches no longer apply, but we propose a simple algorithm that provably achieves no-regret when the state space is any convex Euclidean region. Our algorithm combines techniques from online learning with results from the literature on pursuit-evasion games. We describe a quantitative trading application in which the convex region captures inventory risk constraints, and local moves limit market impact. Using historical market data, we show experimentally that our algorithm has a strong advantage over classic no-regret approaches.

Cite this Paper


BibTeX
@InProceedings{pmlr-v32-dworkin14,
  title = {Pursuit-Evasion Without Regret, with an Application to Trading},
  author = {Dworkin, Lili and Kearns, Michael and Nevmyvaka, Yuriy},
  booktitle = {Proceedings of the 31st International Conference on Machine Learning},
  pages = {1521--1529},
  year = {2014},
  editor = {Xing, Eric P. and Jebara, Tony},
  volume = {32},
  number = {2},
  series = {Proceedings of Machine Learning Research},
  address = {Beijing, China},
  month = {22--24 Jun},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v32/dworkin14.pdf},
  url = {https://proceedings.mlr.press/v32/dworkin14.html},
  abstract = {We propose a state-based variant of the classical online learning problem of tracking the best expert. In our setting, the actions of the algorithm and experts correspond to local moves through a continuous and bounded state space. At each step, Nature chooses payoffs as a function of each player’s current position and action. Our model therefore integrates the problem of prediction with expert advice with the stateful formalisms of reinforcement learning. Traditional no-regret learning approaches no longer apply, but we propose a simple algorithm that provably achieves no-regret when the state space is any convex Euclidean region. Our algorithm combines techniques from online learning with results from the literature on pursuit-evasion games. We describe a quantitative trading application in which the convex region captures inventory risk constraints, and local moves limit market impact. Using historical market data, we show experimentally that our algorithm has a strong advantage over classic no-regret approaches.}
}
Endnote
%0 Conference Paper
%T Pursuit-Evasion Without Regret, with an Application to Trading
%A Lili Dworkin
%A Michael Kearns
%A Yuriy Nevmyvaka
%B Proceedings of the 31st International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2014
%E Eric P. Xing
%E Tony Jebara
%F pmlr-v32-dworkin14
%I PMLR
%P 1521--1529
%U https://proceedings.mlr.press/v32/dworkin14.html
%V 32
%N 2
%X We propose a state-based variant of the classical online learning problem of tracking the best expert. In our setting, the actions of the algorithm and experts correspond to local moves through a continuous and bounded state space. At each step, Nature chooses payoffs as a function of each player’s current position and action. Our model therefore integrates the problem of prediction with expert advice with the stateful formalisms of reinforcement learning. Traditional no-regret learning approaches no longer apply, but we propose a simple algorithm that provably achieves no-regret when the state space is any convex Euclidean region. Our algorithm combines techniques from online learning with results from the literature on pursuit-evasion games. We describe a quantitative trading application in which the convex region captures inventory risk constraints, and local moves limit market impact. Using historical market data, we show experimentally that our algorithm has a strong advantage over classic no-regret approaches.
RIS
TY  - CPAPER
TI  - Pursuit-Evasion Without Regret, with an Application to Trading
AU  - Lili Dworkin
AU  - Michael Kearns
AU  - Yuriy Nevmyvaka
BT  - Proceedings of the 31st International Conference on Machine Learning
DA  - 2014/06/18
ED  - Eric P. Xing
ED  - Tony Jebara
ID  - pmlr-v32-dworkin14
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 32
IS  - 2
SP  - 1521
EP  - 1529
L1  - http://proceedings.mlr.press/v32/dworkin14.pdf
UR  - https://proceedings.mlr.press/v32/dworkin14.html
AB  - We propose a state-based variant of the classical online learning problem of tracking the best expert. In our setting, the actions of the algorithm and experts correspond to local moves through a continuous and bounded state space. At each step, Nature chooses payoffs as a function of each player’s current position and action. Our model therefore integrates the problem of prediction with expert advice with the stateful formalisms of reinforcement learning. Traditional no-regret learning approaches no longer apply, but we propose a simple algorithm that provably achieves no-regret when the state space is any convex Euclidean region. Our algorithm combines techniques from online learning with results from the literature on pursuit-evasion games. We describe a quantitative trading application in which the convex region captures inventory risk constraints, and local moves limit market impact. Using historical market data, we show experimentally that our algorithm has a strong advantage over classic no-regret approaches.
ER  -
APA
Dworkin, L., Kearns, M., & Nevmyvaka, Y. (2014). Pursuit-Evasion Without Regret, with an Application to Trading. Proceedings of the 31st International Conference on Machine Learning, in Proceedings of Machine Learning Research 32(2):1521-1529. Available from https://proceedings.mlr.press/v32/dworkin14.html.
