Continuous Neural Algorithmic Planners

Yu He, Petar Veličković, Pietro Lio, Andreea Deac
Proceedings of the First Learning on Graphs Conference, PMLR 198:54:1-54:13, 2022.

Abstract

Neural algorithmic reasoning studies the problem of learning algorithms with neural networks, especially using graph architectures. A recent proposal, XLVIN, reaps the benefits of using a graph neural network that simulates the value iteration algorithm in deep reinforcement learning agents. It allows model-free planning without access to privileged information about the environment, which is usually unavailable. However, XLVIN supports only discrete action spaces, and hence cannot be applied directly to most tasks of real-world interest. We extend XLVIN to continuous action spaces by discretization, and evaluate several selective expansion policies to deal with the resulting large planning graphs. Our proposal, CNAP, demonstrates how neural algorithmic reasoning can make a measurable impact in higher-dimensional continuous control settings, such as MuJoCo, bringing gains in low-data settings and outperforming model-free baselines.
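
For context, the sketch below (in Python; not code from the paper) illustrates the two ingredients the abstract refers to: classical tabular value iteration, the algorithm whose execution XLVIN's graph neural network learns to simulate, and a uniform discretization of a continuous action space. The toy MDP and all function names here are assumptions made for illustration; CNAP performs these computations implicitly over latent states rather than an explicit table.

# Illustrative sketch only -- not the paper's implementation. Shows (1) the
# classical value iteration algorithm that XLVIN/CNAP's GNN simulates, and
# (2) uniform discretization of a continuous action space.
import itertools
import numpy as np

def value_iteration(P, R, gamma=0.9, n_iters=50):
    # P: transition tensor of shape (S, A, S); R: reward matrix of shape (S, A).
    n_states = P.shape[0]
    V = np.zeros(n_states)
    for _ in range(n_iters):
        # Bellman optimality backup:
        # Q(s, a) = R(s, a) + gamma * sum_s' P(s' | s, a) * V(s')
        Q = R + gamma * np.einsum("sat,t->sa", P, V)
        V = Q.max(axis=1)
    return V

def discretize_actions(low, high, bins):
    # Bin each continuous action dimension uniformly, then take the Cartesian
    # product over dimensions; the action count grows as bins ** len(low).
    grids = [np.linspace(lo, hi, bins) for lo, hi in zip(low, high)]
    return np.array(list(itertools.product(*grids)))

# Toy 3-state, 2-action MDP (made up for illustration).
P = np.array([[[0.8, 0.2, 0.0], [0.1, 0.9, 0.0]],
              [[0.0, 0.5, 0.5], [0.0, 0.0, 1.0]],
              [[0.0, 0.0, 1.0], [0.0, 0.0, 1.0]]])
R = np.array([[0.0, 0.0], [0.0, 1.0], [0.0, 0.0]])
print(value_iteration(P, R))                                  # optimal state values
print(discretize_actions([-1.0, -1.0], [1.0, 1.0], 5).shape)  # (25, 2)

The exponential blow-up of the discretized action set is what motivates the selective expansion policies studied in the paper: expanding every discretized action at every node of the planning graph quickly becomes intractable in higher-dimensional settings such as MuJoCo.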

Cite this Paper


BibTeX
@InProceedings{pmlr-v198-he22a,
  title     = {Continuous Neural Algorithmic Planners},
  author    = {He, Yu and Veli{\v{c}}kovi{\'{c}}, Petar and Lio, Pietro and Deac, Andreea},
  booktitle = {Proceedings of the First Learning on Graphs Conference},
  pages     = {54:1--54:13},
  year      = {2022},
  editor    = {Rieck, Bastian and Pascanu, Razvan},
  volume    = {198},
  series    = {Proceedings of Machine Learning Research},
  month     = {09--12 Dec},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v198/he22a/he22a.pdf},
  url       = {https://proceedings.mlr.press/v198/he22a.html},
  abstract  = {Neural algorithmic reasoning studies the problem of learning algorithms with neural networks, especially using graph architectures. A recent proposal, XLVIN, reaps the benefits of using a graph neural network that simulates the value iteration algorithm in deep reinforcement learning agents. It allows model-free planning without access to privileged information about the environment, which is usually unavailable. However, XLVIN only supports discrete action spaces, and is hence nontrivially applicable to most tasks of real-world interest. We expand XLVIN to continuous action spaces by discretization, and evaluate several selective expansion policies to deal with the large planning graphs. Our proposal, CNAP, demonstrates how neural algorithmic reasoning can make a measurable impact in higher-dimensional continuous control settings, such as MuJoCo, bringing gains in low-data settings and outperforming model-free baselines.}
}
Endnote
%0 Conference Paper
%T Continuous Neural Algorithmic Planners
%A Yu He
%A Petar Veličković
%A Pietro Lio
%A Andreea Deac
%B Proceedings of the First Learning on Graphs Conference
%C Proceedings of Machine Learning Research
%D 2022
%E Bastian Rieck
%E Razvan Pascanu
%F pmlr-v198-he22a
%I PMLR
%P 54:1--54:13
%U https://proceedings.mlr.press/v198/he22a.html
%V 198
%X Neural algorithmic reasoning studies the problem of learning algorithms with neural networks, especially using graph architectures. A recent proposal, XLVIN, reaps the benefits of using a graph neural network that simulates the value iteration algorithm in deep reinforcement learning agents. It allows model-free planning without access to privileged information about the environment, which is usually unavailable. However, XLVIN only supports discrete action spaces, and is hence nontrivially applicable to most tasks of real-world interest. We expand XLVIN to continuous action spaces by discretization, and evaluate several selective expansion policies to deal with the large planning graphs. Our proposal, CNAP, demonstrates how neural algorithmic reasoning can make a measurable impact in higher-dimensional continuous control settings, such as MuJoCo, bringing gains in low-data settings and outperforming model-free baselines.
APA
He, Y., Veličković, P., Lio, P. & Deac, A. (2022). Continuous Neural Algorithmic Planners. Proceedings of the First Learning on Graphs Conference, in Proceedings of Machine Learning Research 198:54:1-54:13. Available from https://proceedings.mlr.press/v198/he22a.html.
