Learning Exercise Policies for American Options

Yuxi Li; Csaba Szepesvari; Dale Schuurmans

Proceedings of the Twelfth International Conference on Artificial Intelligence and Statistics, PMLR 5:352-359, 2009.

Abstract

Options are important instruments in modern finance. In this paper, we investigate reinforcement learning (RL) methods—in particular, least-squares policy iteration (LSPI)—for the problem of learning exercise policies for American options. We develop finite-time bounds on the performance of the policy obtained with LSPI and compare LSPI and the fitted Q-iteration algorithm (FQI) with the Longstaff-Schwartz method (LSM), the standard least-squares Monte Carlo algorithm from the finance community. Our empirical results show that the exercise policies discovered by LSPI and FQI gain larger payoffs than those discovered by LSM, on both real and synthetic data. Furthermore, we find that for all methods the policies learned from real data generally gain similar payoffs to the policies learned from simulated data. Our work shows that solution methods developed in machine learning can advance the state-of-the-art in an important and challenging application area, while demonstrating that computational finance remains a promising area for future applications of machine learning methods.

Cite this Paper

BibTeX


@InProceedings{pmlr-v5-li09d,
  title = 	 {Learning Exercise Policies for American Options},
  author = 	 {Li, Yuxi and Szepesvari, Csaba and Schuurmans, Dale},
  booktitle = 	 {Proceedings of the Twelfth International Conference on Artificial Intelligence and Statistics},
  pages = 	 {352--359},
  year = 	 {2009},
  editor = 	 {van Dyk, David and Welling, Max},
  volume = 	 {5},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {Hilton Clearwater Beach Resort, Clearwater Beach, Florida USA},
  month = 	 {16--18 Apr},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v5/li09d/li09d.pdf},
  url = 	 {https://proceedings.mlr.press/v5/li09d.html},
  abstract = 	 {Options are important instruments in modern finance. In this paper, we investigate reinforcement learning (RL) methods—in particular, least-squares policy iteration (LSPI)—for the problem of learning exercise policies for American options. We develop finite-time bounds on the performance of the policy obtained with LSPI and compare LSPI and the fitted Q-iteration algorithm (FQI) with the Longstaff-Schwartz method (LSM), the standard least-squares Monte Carlo algorithm from the finance community. Our empirical results show that the exercise policies discovered by LSPI and FQI gain larger payoffs than those discovered by LSM, on both real and synthetic data. Furthermore, we find that for all methods the policies learned from real data generally gain similar payoffs to the policies learned from simulated data. Our work shows that solution methods developed in machine learning can advance the state-of-the-art in an important and challenging application area, while demonstrating that computational finance remains a promising area for future applications of machine learning methods.}
}

EndNote

%0 Conference Paper
%T Learning Exercise Policies for American Options
%A Yuxi Li
%A Csaba Szepesvari
%A Dale Schuurmans
%B Proceedings of the Twelfth International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2009
%E David van Dyk
%E Max Welling
%F pmlr-v5-li09d
%I PMLR
%P 352--359
%U https://proceedings.mlr.press/v5/li09d.html
%V 5
%X Options are important instruments in modern finance. In this paper, we investigate reinforcement learning (RL) methods—in particular, least-squares policy iteration (LSPI)—for the problem of learning exercise policies for American options. We develop finite-time bounds on the performance of the policy obtained with LSPI and compare LSPI and the fitted Q-iteration algorithm (FQI) with the Longstaff-Schwartz method (LSM), the standard least-squares Monte Carlo algorithm from the finance community. Our empirical results show that the exercise policies discovered by LSPI and FQI gain larger payoffs than those discovered by LSM, on both real and synthetic data. Furthermore, we find that for all methods the policies learned from real data generally gain similar payoffs to the policies learned from simulated data. Our work shows that solution methods developed in machine learning can advance the state-of-the-art in an important and challenging application area, while demonstrating that computational finance remains a promising area for future applications of machine learning methods.

RIS


TY  - CPAPER
TI  - Learning Exercise Policies for American Options
AU  - Yuxi Li
AU  - Csaba Szepesvari
AU  - Dale Schuurmans
BT  - Proceedings of the Twelfth International Conference on Artificial Intelligence and Statistics
DA  - 2009/04/15
ED  - David van Dyk
ED  - Max Welling
ID  - pmlr-v5-li09d
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 5
SP  - 352
EP  - 359
L1  - http://proceedings.mlr.press/v5/li09d/li09d.pdf
UR  - https://proceedings.mlr.press/v5/li09d.html
AB  - Options are important instruments in modern finance. In this paper, we investigate reinforcement learning (RL) methods—in particular, least-squares policy iteration (LSPI)—for the problem of learning exercise policies for American options. We develop finite-time bounds on the performance of the policy obtained with LSPI and compare LSPI and the fitted Q-iteration algorithm (FQI) with the Longstaff-Schwartz method (LSM), the standard least-squares Monte Carlo algorithm from the finance community. Our empirical results show that the exercise policies discovered by LSPI and FQI gain larger payoffs than those discovered by LSM, on both real and synthetic data. Furthermore, we find that for all methods the policies learned from real data generally gain similar payoffs to the policies learned from simulated data. Our work shows that solution methods developed in machine learning can advance the state-of-the-art in an important and challenging application area, while demonstrating that computational finance remains a promising area for future applications of machine learning methods.
ER  -

APA


Li, Y., Szepesvari, C. & Schuurmans, D. (2009). Learning Exercise Policies for American Options. Proceedings of the Twelfth International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 5:352-359. Available from https://proceedings.mlr.press/v5/li09d.html.
