Decision Mixer: Integrating Long-term and Local Dependencies via Dynamic Token Selection for Decision-Making

Hongling Zheng, Li Shen, Yong Luo, Deheng Ye, Bo Du, Jialie Shen, Dacheng Tao
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:78297-78314, 2025.

Abstract

The Conditional Sequence Modeling (CSM) paradigm, benefiting from the transformer’s powerful distribution modeling capabilities, has demonstrated considerable promise in offline Reinforcement Learning (RL) tasks. Depending on the task’s nature, it is crucial to carefully balance the interplay between inherent local features and long-term dependencies in Markov decision trajectories to mitigate potential performance degradation and unnecessary computational overhead. In this paper, we propose Decision Mixer (DM), which addresses the conflict between features of different scales in the modeling process from the perspective of dynamic integration. Drawing inspiration from conditional computation, we design a plug-and-play dynamic token selection mechanism to ensure the model can effectively allocate attention to different features based on task characteristics. Additionally, we employ an auxiliary predictor to alleviate the short-sightedness issue in the autoregressive sampling process. DM achieves state-of-the-art performance on various standard RL benchmarks while requiring significantly fewer computational resources, offering a viable solution for building efficient and scalable RL foundation models. Code is available here.

Cite this Paper


BibTeX
@InProceedings{pmlr-v267-zheng25m,
  title     = {Decision Mixer: Integrating Long-term and Local Dependencies via Dynamic Token Selection for Decision-Making},
  author    = {Zheng, Hongling and Shen, Li and Luo, Yong and Ye, Deheng and Du, Bo and Shen, Jialie and Tao, Dacheng},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages     = {78297--78314},
  year      = {2025},
  editor    = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume    = {267},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--19 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/zheng25m/zheng25m.pdf},
  url       = {https://proceedings.mlr.press/v267/zheng25m.html},
  abstract  = {The Conditional Sequence Modeling (CSM) paradigm, benefiting from the transformer’s powerful distribution modeling capabilities, has demonstrated considerable promise in offline Reinforcement Learning (RL) tasks. Depending on the task’s nature, it is crucial to carefully balance the interplay between inherent local features and long-term dependencies in Markov decision trajectories to mitigate potential performance degradation and unnecessary computational overhead. In this paper, we propose Decision Mixer (DM), which addresses the conflict between features of different scales in the modeling process from the perspective of dynamic integration. Drawing inspiration from conditional computation, we design a plug-and-play dynamic token selection mechanism to ensure the model can effectively allocate attention to different features based on task characteristics. Additionally, we employ an auxiliary predictor to alleviate the short-sightedness issue in the autoregressive sampling process. DM achieves state-of-the-art performance on various standard RL benchmarks while requiring significantly fewer computational resources, offering a viable solution for building efficient and scalable RL foundation models. Code is available here.}
}
Endnote
%0 Conference Paper
%T Decision Mixer: Integrating Long-term and Local Dependencies via Dynamic Token Selection for Decision-Making
%A Hongling Zheng
%A Li Shen
%A Yong Luo
%A Deheng Ye
%A Bo Du
%A Jialie Shen
%A Dacheng Tao
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-zheng25m
%I PMLR
%P 78297--78314
%U https://proceedings.mlr.press/v267/zheng25m.html
%V 267
%X The Conditional Sequence Modeling (CSM) paradigm, benefiting from the transformer’s powerful distribution modeling capabilities, has demonstrated considerable promise in offline Reinforcement Learning (RL) tasks. Depending on the task’s nature, it is crucial to carefully balance the interplay between inherent local features and long-term dependencies in Markov decision trajectories to mitigate potential performance degradation and unnecessary computational overhead. In this paper, we propose Decision Mixer (DM), which addresses the conflict between features of different scales in the modeling process from the perspective of dynamic integration. Drawing inspiration from conditional computation, we design a plug-and-play dynamic token selection mechanism to ensure the model can effectively allocate attention to different features based on task characteristics. Additionally, we employ an auxiliary predictor to alleviate the short-sightedness issue in the autoregressive sampling process. DM achieves state-of-the-art performance on various standard RL benchmarks while requiring significantly fewer computational resources, offering a viable solution for building efficient and scalable RL foundation models. Code is available here.
APA
Zheng, H., Shen, L., Luo, Y., Ye, D., Du, B., Shen, J. & Tao, D. (2025). Decision Mixer: Integrating Long-term and Local Dependencies via Dynamic Token Selection for Decision-Making. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:78297-78314. Available from https://proceedings.mlr.press/v267/zheng25m.html.