An Analysis of State-Relevance Weights and Sampling Distributions on L1-Regularized Approximate Linear Programming Approximation Accuracy

Gavin Taylor; Connor Geer; David Piekut

Proceedings of the 31st International Conference on Machine Learning, PMLR 32(2):451-459, 2014.

Abstract

Recent interest in L_1 regularization for value function approximation includes Petrik et al.’s introduction of L_1-Regularized Approximate Linear Programming (RALP). RALP is unique among L_1-regularized approaches in that it approximates the optimal value function using off-policy samples. Additionally, it produces policies which outperform those of previous methods, such as LSPI. RALP’s value function approximation quality is affected heavily by the choice of state-relevance weights in the objective function of the linear program, and by the distribution from which samples are drawn; however, there has been no discussion of these considerations in the previous literature. In this paper, we discuss and explain the effects of choices in the state-relevance weights and sampling distribution on approximation quality, using both theoretical and experimental illustrations. The results provide not only insight into these effects, but also intuition into the types of MDPs which are especially well suited for approximation with RALP.

Cite this Paper

BibTeX


@InProceedings{pmlr-v32-taylor14,
  title = 	 {An Analysis of State-Relevance Weights and Sampling Distributions on L1-Regularized Approximate Linear Programming Approximation Accuracy},
  author = 	 {Taylor, Gavin and Geer, Connor and Piekut, David},
  booktitle = 	 {Proceedings of the 31st International Conference on Machine Learning},
  pages = 	 {451--459},
  year = 	 {2014},
  editor = 	 {Xing, Eric P. and Jebara, Tony},
  volume = 	 {32},
  number =       {2},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {Beijing, China},
  month = 	 {22--24 Jun},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v32/taylor14.pdf},
  url = 	 {https://proceedings.mlr.press/v32/taylor14.html},
  abstract = 	 {Recent interest in the use of L_1 regularization in the use of value function approximation includes Petrik et al.’s introduction of L_1-Regularized Approximate Linear Programming (RALP).  RALP is unique among L_1-regularized approaches in that it approximates the optimal value function using off-policy samples.  Additionally, it produces policies which outperform those of previous methods, such as LSPI.  RALP’s value function approximation quality is affected heavily by the choice of state-relevance weights in the objective function of the linear program, and by the distribution from which samples are drawn; however, there has been no discussion of these considerations in the previous literature.  In this paper, we discuss and explain the effects of choices in the state-relevance weights and sampling distribution on approximation quality, using both theoretical and experimental illustrations.  The results provide insight not only onto these effects, but also provide intuition into the types of MDPs which are especially well suited for approximation with RALP.}
}

Endnote

%0 Conference Paper
%T An Analysis of State-Relevance Weights and Sampling Distributions on L1-Regularized Approximate Linear Programming Approximation Accuracy
%A Gavin Taylor
%A Connor Geer
%A David Piekut
%B Proceedings of the 31st International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2014
%E Eric P. Xing
%E Tony Jebara
%F pmlr-v32-taylor14
%I PMLR
%P 451--459
%U https://proceedings.mlr.press/v32/taylor14.html
%V 32
%N 2
%X Recent interest in the use of L_1 regularization in the use of value function approximation includes Petrik et al.’s introduction of L_1-Regularized Approximate Linear Programming (RALP).  RALP is unique among L_1-regularized approaches in that it approximates the optimal value function using off-policy samples.  Additionally, it produces policies which outperform those of previous methods, such as LSPI.  RALP’s value function approximation quality is affected heavily by the choice of state-relevance weights in the objective function of the linear program, and by the distribution from which samples are drawn; however, there has been no discussion of these considerations in the previous literature.  In this paper, we discuss and explain the effects of choices in the state-relevance weights and sampling distribution on approximation quality, using both theoretical and experimental illustrations.  The results provide insight not only onto these effects, but also provide intuition into the types of MDPs which are especially well suited for approximation with RALP.

RIS


TY  - CPAPER
TI  - An Analysis of State-Relevance Weights and Sampling Distributions on L1-Regularized Approximate Linear Programming Approximation Accuracy
AU  - Gavin Taylor
AU  - Connor Geer
AU  - David Piekut
BT  - Proceedings of the 31st International Conference on Machine Learning
DA  - 2014/06/18
ED  - Eric P. Xing
ED  - Tony Jebara
ID  - pmlr-v32-taylor14
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 32
IS  - 2
SP  - 451
EP  - 459
L1  - http://proceedings.mlr.press/v32/taylor14.pdf
UR  - https://proceedings.mlr.press/v32/taylor14.html
AB  - Recent interest in the use of L_1 regularization in the use of value function approximation includes Petrik et al.’s introduction of L_1-Regularized Approximate Linear Programming (RALP).  RALP is unique among L_1-regularized approaches in that it approximates the optimal value function using off-policy samples.  Additionally, it produces policies which outperform those of previous methods, such as LSPI.  RALP’s value function approximation quality is affected heavily by the choice of state-relevance weights in the objective function of the linear program, and by the distribution from which samples are drawn; however, there has been no discussion of these considerations in the previous literature.  In this paper, we discuss and explain the effects of choices in the state-relevance weights and sampling distribution on approximation quality, using both theoretical and experimental illustrations.  The results provide insight not only onto these effects, but also provide intuition into the types of MDPs which are especially well suited for approximation with RALP.
ER  -

APA


Taylor, G., Geer, C. & Piekut, D. (2014). An Analysis of State-Relevance Weights and Sampling Distributions on L1-Regularized Approximate Linear Programming Approximation Accuracy. Proceedings of the 31st International Conference on Machine Learning, in Proceedings of Machine Learning Research 32(2):451-459. Available from https://proceedings.mlr.press/v32/taylor14.html.
