BAGEL: Bootstrapping Agents by Guiding Exploration with Language

Shikhar Murty, Christopher D Manning, Peter Shaw, Mandar Joshi, Kenton Lee
Proceedings of the 41st International Conference on Machine Learning, PMLR 235:36894-36910, 2024.

Abstract

Following natural language instructions by executing actions in digital environments (e.g. web-browsers and REST APIs) is a challenging task for language model (LM) agents. Unfortunately, LM agents often fail to generalize to new environments without human demonstrations. This work presents BAGEL, a method for bootstrapping LM agents without human supervision. BAGEL converts a seed set of randomly explored trajectories to synthetic demonstrations via round-trips between two noisy LM components: an LM labeler which converts a trajectory into a synthetic instruction, and a zero-shot LM agent which maps the synthetic instruction into a refined trajectory. By performing these round-trips iteratively, BAGEL quickly converts the initial distribution of trajectories towards those that are well-described by natural language. We adapt the base LM agent at test time with in-context learning by retrieving relevant BAGEL demonstrations based on the instruction, and find improvements of over 2-13% absolute on ToolQA and MiniWob++, with up to 13x reduction in execution failures.
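The round-trip procedure described above can be sketched in a few lines. This is a minimal, hypothetical illustration of the loop structure only (the function and parameter names are ours, not from the paper's released code): an LM labeler maps each trajectory to a synthetic instruction, a zero-shot LM agent re-executes that instruction to produce a refined trajectory, and the process repeats.

```python
def bagel_round_trips(seed_trajectories, labeler, agent, num_iterations=3):
    """Hypothetical sketch of BAGEL's iterative round-trips.

    labeler: callable mapping a trajectory -> synthetic instruction
    agent:   callable mapping an instruction -> refined trajectory
    Returns (instruction, trajectory) pairs usable as synthetic demonstrations.
    """
    demos = list(seed_trajectories)
    refined = []
    for _ in range(num_iterations):
        refined = []
        for trajectory in demos:
            instruction = labeler(trajectory)   # trajectory -> instruction
            new_trajectory = agent(instruction)  # instruction -> trajectory
            refined.append((instruction, new_trajectory))
        # Feed the refined trajectories back in for the next round-trip
        demos = [trajectory for _, trajectory in refined]
    return refined
```

At test time, the paper adapts the base agent with in-context learning by retrieving the demonstrations most relevant to the given instruction; that retrieval step is not shown here.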

Cite this Paper


BibTeX
@InProceedings{pmlr-v235-murty24a,
  title     = {{BAGEL}: Bootstrapping Agents by Guiding Exploration with Language},
  author    = {Murty, Shikhar and Manning, Christopher D and Shaw, Peter and Joshi, Mandar and Lee, Kenton},
  booktitle = {Proceedings of the 41st International Conference on Machine Learning},
  pages     = {36894--36910},
  year      = {2024},
  editor    = {Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix},
  volume    = {235},
  series    = {Proceedings of Machine Learning Research},
  month     = {21--27 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v235/main/assets/murty24a/murty24a.pdf},
  url       = {https://proceedings.mlr.press/v235/murty24a.html},
  abstract  = {Following natural language instructions by executing actions in digital environments (e.g. web-browsers and REST APIs) is a challenging task for language model (LM) agents. Unfortunately, LM agents often fail to generalize to new environments without human demonstrations. This work presents BAGEL, a method for bootstrapping LM agents without human supervision. BAGEL converts a seed set of randomly explored trajectories to synthetic demonstrations via round-trips between two noisy LM components: an LM labeler which converts a trajectory into a synthetic instruction, and a zero-shot LM agent which maps the synthetic instruction into a refined trajectory. By performing these round-trips iteratively, BAGEL quickly converts the initial distribution of trajectories towards those that are well-described by natural language. We adapt the base LM agent at test time with in-context learning by retrieving relevant BAGEL demonstrations based on the instruction, and find improvements of over 2-13% absolute on ToolQA and MiniWob++, with up to 13x reduction in execution failures.}
}
Endnote
%0 Conference Paper
%T BAGEL: Bootstrapping Agents by Guiding Exploration with Language
%A Shikhar Murty
%A Christopher D Manning
%A Peter Shaw
%A Mandar Joshi
%A Kenton Lee
%B Proceedings of the 41st International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2024
%E Ruslan Salakhutdinov
%E Zico Kolter
%E Katherine Heller
%E Adrian Weller
%E Nuria Oliver
%E Jonathan Scarlett
%E Felix Berkenkamp
%F pmlr-v235-murty24a
%I PMLR
%P 36894--36910
%U https://proceedings.mlr.press/v235/murty24a.html
%V 235
%X Following natural language instructions by executing actions in digital environments (e.g. web-browsers and REST APIs) is a challenging task for language model (LM) agents. Unfortunately, LM agents often fail to generalize to new environments without human demonstrations. This work presents BAGEL, a method for bootstrapping LM agents without human supervision. BAGEL converts a seed set of randomly explored trajectories to synthetic demonstrations via round-trips between two noisy LM components: an LM labeler which converts a trajectory into a synthetic instruction, and a zero-shot LM agent which maps the synthetic instruction into a refined trajectory. By performing these round-trips iteratively, BAGEL quickly converts the initial distribution of trajectories towards those that are well-described by natural language. We adapt the base LM agent at test time with in-context learning by retrieving relevant BAGEL demonstrations based on the instruction, and find improvements of over 2-13% absolute on ToolQA and MiniWob++, with up to 13x reduction in execution failures.
APA
Murty, S., Manning, C.D., Shaw, P., Joshi, M. & Lee, K. (2024). BAGEL: Bootstrapping Agents by Guiding Exploration with Language. Proceedings of the 41st International Conference on Machine Learning, in Proceedings of Machine Learning Research 235:36894-36910. Available from https://proceedings.mlr.press/v235/murty24a.html.