Neuro-algorithmic Policies Enable Fast Combinatorial Generalization

Marin Vlastelica; Michal Rolinek; Georg Martius

Proceedings of the 38th International Conference on Machine Learning, PMLR 139:10575-10585, 2021.

Abstract

Although model-based and model-free approaches to learning the control of systems have achieved impressive results on standard benchmarks, generalization to task variations is still lacking. Recent results suggest that generalization for standard architectures improves only after obtaining exhaustive amounts of data. We give evidence that generalization capabilities are in many cases bottlenecked by the inability to generalize on the combinatorial aspects of the problem. We show that, for a certain subclass of the MDP framework, this can be alleviated by a neuro-algorithmic policy architecture that embeds a time-dependent shortest path solver in a deep neural network. Trained end-to-end via blackbox-differentiation, this method leads to considerable improvement in generalization capabilities in the low-data regime.
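As a rough illustration of the "time-dependent shortest path" component mentioned above, here is a minimal sketch of such a solver on a small grid. It is not the authors' implementation. The grid layout, the five-move neighborhood (including waiting in place), and the function name `time_dependent_shortest_path` are assumptions made for this example. In the architecture the abstract describes, the cost maps would presumably come from a neural network and gradients would be passed through the solver via blackbox differentiation. Here the costs are simply given by hand, and no learning is involved.

```python
from typing import List, Tuple

Cell = Tuple[int, int]


def time_dependent_shortest_path(
    costs: List[List[List[float]]], start: Cell, goal: Cell
) -> Tuple[float, List[Cell]]:
    """Dynamic program over (time, row, col).

    costs[t][r][c] is the (hypothetical) cost of occupying cell (r, c) at
    time step t. The agent may stay in place or move to a 4-neighbor at
    every step. Returns the cheapest total cost and the visited cells.
    """
    T, H, W = len(costs), len(costs[0]), len(costs[0][0])
    INF = float("inf")
    moves = [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)]  # wait + 4 directions

    best = [[[INF] * W for _ in range(H)] for _ in range(T)]
    parent = [[[None] * W for _ in range(H)] for _ in range(T)]
    best[0][start[0]][start[1]] = costs[0][start[0]][start[1]]

    for t in range(1, T):
        for r in range(H):
            for c in range(W):
                for dr, dc in moves:
                    pr, pc = r - dr, c - dc  # predecessor cell at time t-1
                    if 0 <= pr < H and 0 <= pc < W:
                        cand = best[t - 1][pr][pc] + costs[t][r][c]
                        if cand < best[t][r][c]:
                            best[t][r][c] = cand
                            parent[t][r][c] = (pr, pc)

    # Pick the arrival time at the goal with the lowest accumulated cost.
    t_end = min(range(T), key=lambda t: best[t][goal[0]][goal[1]])
    total = best[t_end][goal[0]][goal[1]]
    if total == INF:
        raise ValueError("goal is not reachable within the time horizon")

    # Walk the parent pointers back to the start.
    path = [goal]
    cell = goal
    for t in range(t_end, 0, -1):
        cell = parent[t][cell[0]][cell[1]]
        path.append(cell)
    path.reverse()
    return total, path


# A 1x3 corridor over 4 time steps; the middle cell is expensive at t=1,
# so waiting one step before moving is cheaper than moving immediately.
example_costs = [[[1.0, 1.0, 1.0]] for _ in range(4)]
example_costs[1][0][1] = 10.0
total_cost, path = time_dependent_shortest_path(example_costs, (0, 0), (0, 2))
```

In this toy case the solver waits one step at the start to avoid the temporarily expensive cell. A static shortest path solver could not express this behavior, because it has no notion of costs that change over time.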

Cite this Paper

BibTeX

@InProceedings{pmlr-v139-vlastelica21a,
  title = {Neuro-algorithmic Policies Enable Fast Combinatorial Generalization},
  author = {Vlastelica, Marin and Rolinek, Michal and Martius, Georg},
  booktitle = {Proceedings of the 38th International Conference on Machine Learning},
  pages = {10575--10585},
  year = {2021},
  editor = {Meila, Marina and Zhang, Tong},
  volume = {139},
  series = {Proceedings of Machine Learning Research},
  month = {18--24 Jul},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v139/vlastelica21a/vlastelica21a.pdf},
  url = {https://proceedings.mlr.press/v139/vlastelica21a.html},
  abstract = {Although model-based and model-free approaches to learning the control of systems have achieved impressive results on standard benchmarks, generalization to task variations is still lacking. Recent results suggest that generalization for standard architectures improves only after obtaining exhaustive amounts of data. We give evidence that generalization capabilities are in many cases bottlenecked by the inability to generalize on the combinatorial aspects of the problem. We show that, for a certain subclass of the MDP framework, this can be alleviated by a neuro-algorithmic policy architecture that embeds a time-dependent shortest path solver in a deep neural network. Trained end-to-end via blackbox-differentiation, this method leads to considerable improvement in generalization capabilities in the low-data regime.}
}

Endnote

%0 Conference Paper
%T Neuro-algorithmic Policies Enable Fast Combinatorial Generalization
%A Marin Vlastelica
%A Michal Rolinek
%A Georg Martius
%B Proceedings of the 38th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2021
%E Marina Meila
%E Tong Zhang
%F pmlr-v139-vlastelica21a
%I PMLR
%P 10575--10585
%U https://proceedings.mlr.press/v139/vlastelica21a.html
%V 139
%X Although model-based and model-free approaches to learning the control of systems have achieved impressive results on standard benchmarks, generalization to task variations is still lacking. Recent results suggest that generalization for standard architectures improves only after obtaining exhaustive amounts of data. We give evidence that generalization capabilities are in many cases bottlenecked by the inability to generalize on the combinatorial aspects of the problem. We show that, for a certain subclass of the MDP framework, this can be alleviated by a neuro-algorithmic policy architecture that embeds a time-dependent shortest path solver in a deep neural network. Trained end-to-end via blackbox-differentiation, this method leads to considerable improvement in generalization capabilities in the low-data regime.

APA

Vlastelica, M., Rolinek, M. & Martius, G. (2021). Neuro-algorithmic Policies Enable Fast Combinatorial Generalization. Proceedings of the 38th International Conference on Machine Learning, in Proceedings of Machine Learning Research 139:10575-10585. Available from https://proceedings.mlr.press/v139/vlastelica21a.html.
