Self-Supervised Learning for Multi-Goal Grid World: Comparing Leela and Deep Q Network
Proceedings of the First International Workshop on Self-Supervised Learning, PMLR 131:72-88, 2020.
Modern machine learning research has explored numerous approaches to reinforcement learning with multiple goals and sparse rewards, as well as to learning correct actions from a small number of exploratory samples. We explore whether a self-supervised system that automatically creates and tests symbolic hypotheses about the world can address these same issues. Leela is a system that builds an understanding of the world using constructivist artificial intelligence. For our study, we create an N × N grid world in which goals are defined in terms of proprioceptive or visual positions. We compare Leela to a DQN that includes hindsight to improve multi-goal learning with sparse rewards. Our results show that Leela learns to solve multi-goal problems in an N × N world with approximately 160N^2 exploratory steps, compared to the approximately 360N^2.7 steps required by the DQN.
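To make the scaling comparison concrete, the following minimal sketch evaluates the two approximate step counts quoted above for a few grid sizes. The constants and exponents come from the abstract; the function names and the chosen values of N are illustrative assumptions, not the grid sizes or code used in the study.

```python
def leela_steps(n: int) -> float:
    """Approximate exploratory steps for Leela on an n x n grid (160 * n^2)."""
    return float(160 * n ** 2)


def dqn_steps(n: int) -> float:
    """Approximate exploratory steps for the DQN on an n x n grid (360 * n^2.7)."""
    return 360 * n ** 2.7


if __name__ == "__main__":
    # Illustrative grid sizes only (assumption); not taken from the experiments.
    for n in (3, 5, 10):
        ratio = dqn_steps(n) / leela_steps(n)
        print(f"N={n:2d}  Leela ~{leela_steps(n):9.0f}  "
              f"DQN ~{dqn_steps(n):10.0f}  ratio ~{ratio:5.2f}")
```

Under these fits the ratio of DQN steps to Leela steps is 2.25 · N^0.7, so the gap widens as the grid grows.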