A Robust Test for the Stationarity Assumption in Sequential Decision Making

Jitao Wang, Chengchun Shi, Zhenke Wu
Proceedings of the 40th International Conference on Machine Learning, PMLR 202:36355-36379, 2023.

Abstract

Reinforcement learning (RL) is a powerful technique that allows an autonomous agent to learn an optimal policy to maximize the expected return. The optimality of various RL algorithms relies on the stationarity assumption, which requires time-invariant state transition and reward functions. However, deviations from stationarity over extended periods often occur in real-world applications such as robotics control, health care, and digital marketing, resulting in suboptimal policies when learning proceeds under the stationarity assumption. In this paper, we propose a model-based doubly robust procedure for testing the stationarity assumption and detecting change points in offline RL settings with a certain degree of homogeneity. Our proposed testing procedure is robust to model misspecifications and can effectively control the type-I error while achieving high statistical power, especially in high-dimensional settings. Extensive comparative simulations and a real-world interventional mobile health example illustrate the advantages of our method in detecting change points and optimizing long-term rewards in high-dimensional, non-stationary environments.
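For readers unfamiliar with this class of tests, the sketch below illustrates the general idea of model-based change-point testing in a deliberately simplified form: fit a parametric reward model separately on each side of a candidate change point, take the maximal improvement in fit over all candidates as the test statistic, and calibrate it against a permutation null. This is a generic CUSUM-style illustration only, not the authors' doubly robust procedure; the linear reward model, the permutation scheme, and all function names are assumptions introduced here.

```python
# Illustrative sketch (NOT the paper's procedure): a model-based change-point
# test for the reward function in a single offline trajectory. Assumes a
# linear reward model in (state, action) features and exchangeable time
# indices under the stationarity null. All names here are hypothetical.
import numpy as np

def fit_sse(X, y):
    """Least-squares fit; returns the sum of squared residuals."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

def change_point_statistic(X, y, candidates):
    """Max reduction in SSE from splitting the data at a candidate time."""
    full = fit_sse(X, y)
    stats = [full - (fit_sse(X[:t], y[:t]) + fit_sse(X[t:], y[t:]))
             for t in candidates]
    return max(stats), candidates[int(np.argmax(stats))]

def stationarity_test(X, y, candidates, n_perm=500, seed=0):
    """Permutation p-value for the no-change-point (stationarity) null."""
    rng = np.random.default_rng(seed)
    obs, t_hat = change_point_statistic(X, y, candidates)
    null = []
    for _ in range(n_perm):
        idx = rng.permutation(len(y))  # time order is exchangeable under the null
        null.append(change_point_statistic(X[idx], y[idx], candidates)[0])
    pval = (1 + sum(s >= obs for s in null)) / (1 + n_perm)
    return t_hat, obs, pval

# Toy example: the reward model shifts at t = 60 out of T = 100 steps.
T, d = 100, 3
rng = np.random.default_rng(1)
S = rng.normal(size=(T, d))              # states (features)
A = rng.integers(0, 2, size=T)           # binary actions
X = np.column_stack([np.ones(T), S, A])  # design matrix
beta0, beta1 = np.ones(d + 2), 2.0 * np.ones(d + 2)
R = np.where(np.arange(T) < 60, X @ beta0, X @ beta1) + rng.normal(size=T)

t_hat, stat, pval = stationarity_test(X, R, candidates=list(range(20, 80)))
print(f"estimated change point: {t_hat}, statistic: {stat:.2f}, p-value: {pval:.3f}")
```

A small p-value rejects stationarity, and the maximizing candidate serves as the change-point estimate; the paper's actual procedure additionally incorporates doubly robust estimation of the transition and reward models to guard against model misspecification.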

Cite this Paper


BibTeX
@InProceedings{pmlr-v202-wang23ai,
  title     = {A Robust Test for the Stationarity Assumption in Sequential Decision Making},
  author    = {Wang, Jitao and Shi, Chengchun and Wu, Zhenke},
  booktitle = {Proceedings of the 40th International Conference on Machine Learning},
  pages     = {36355--36379},
  year      = {2023},
  editor    = {Krause, Andreas and Brunskill, Emma and Cho, Kyunghyun and Engelhardt, Barbara and Sabato, Sivan and Scarlett, Jonathan},
  volume    = {202},
  series    = {Proceedings of Machine Learning Research},
  month     = {23--29 Jul},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v202/wang23ai/wang23ai.pdf},
  url       = {https://proceedings.mlr.press/v202/wang23ai.html},
  abstract  = {Reinforcement learning (RL) is a powerful technique that allows an autonomous agent to learn an optimal policy to maximize the expected return. The optimality of various RL algorithms relies on the stationarity assumption, which requires time-invariant state transition and reward functions. However, deviations from stationarity over extended periods often occur in real-world applications such as robotics control, health care, and digital marketing, resulting in suboptimal policies when learning proceeds under the stationarity assumption. In this paper, we propose a model-based doubly robust procedure for testing the stationarity assumption and detecting change points in offline RL settings with a certain degree of homogeneity. Our proposed testing procedure is robust to model misspecifications and can effectively control the type-I error while achieving high statistical power, especially in high-dimensional settings. Extensive comparative simulations and a real-world interventional mobile health example illustrate the advantages of our method in detecting change points and optimizing long-term rewards in high-dimensional, non-stationary environments.}
}
Endnote
%0 Conference Paper
%T A Robust Test for the Stationarity Assumption in Sequential Decision Making
%A Jitao Wang
%A Chengchun Shi
%A Zhenke Wu
%B Proceedings of the 40th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2023
%E Andreas Krause
%E Emma Brunskill
%E Kyunghyun Cho
%E Barbara Engelhardt
%E Sivan Sabato
%E Jonathan Scarlett
%F pmlr-v202-wang23ai
%I PMLR
%P 36355--36379
%U https://proceedings.mlr.press/v202/wang23ai.html
%V 202
%X Reinforcement learning (RL) is a powerful technique that allows an autonomous agent to learn an optimal policy to maximize the expected return. The optimality of various RL algorithms relies on the stationarity assumption, which requires time-invariant state transition and reward functions. However, deviations from stationarity over extended periods often occur in real-world applications such as robotics control, health care, and digital marketing, resulting in suboptimal policies when learning proceeds under the stationarity assumption. In this paper, we propose a model-based doubly robust procedure for testing the stationarity assumption and detecting change points in offline RL settings with a certain degree of homogeneity. Our proposed testing procedure is robust to model misspecifications and can effectively control the type-I error while achieving high statistical power, especially in high-dimensional settings. Extensive comparative simulations and a real-world interventional mobile health example illustrate the advantages of our method in detecting change points and optimizing long-term rewards in high-dimensional, non-stationary environments.
APA
Wang, J., Shi, C. & Wu, Z. (2023). A Robust Test for the Stationarity Assumption in Sequential Decision Making. Proceedings of the 40th International Conference on Machine Learning, in Proceedings of Machine Learning Research 202:36355-36379. Available from https://proceedings.mlr.press/v202/wang23ai.html.