Nemotron-CORTEXA: Enhancing LLM Agents for Software Engineering Tasks via Improved Localization and Solution Diversity

Atefeh Sohrabizadeh, Jialin Song, Mingjie Liu, Rajarshi Roy, Chankyu Lee, Jonathan Raiman, Bryan Catanzaro
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:56085-56100, 2025.

Abstract

Large Language Models (LLMs) have demonstrated significant potential in code generation by following natural language instructions. Unfortunately, crucial real-world software engineering tasks, such as debugging or repository-level feature implementation, involve processing extensive contexts beyond current LLM context sizes and performing complex reasoning that is brittle using standard autoregressive decoding. Enhancing LLMs’ performance in these scenarios requires careful consideration of the contextual information provided to the model, optimizing how the model leverages that, and identifying tools that enable more effective navigation of the development environment. To address these challenges, we introduce Nemotron-CORTEXA, an agentic system built on a predefined scaffold that enhances LLMs’ ability to navigate and reason efficiently in complex software engineering contexts. Specifically, we develop a novel code embedding model that retrieves the most relevant files with greater precision, along with a localization agent that refines the granularity of the retrieval process. Additionally, we demonstrate that providing diverse contextual information and utilizing different prompt formats enable the model to identify and resolve issues more efficiently. We evaluate Nemotron-CORTEXA using SWE-bench, a benchmark derived from real-world GitHub issues. Compared to the widely used Agentless framework, Nemotron-CORTEXA achieves a higher issue resolution rate at a lower cost, highlighting its practical impact in addressing real-world software engineering challenges.
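The abstract describes a retrieval stage in which a code embedding model surfaces the most relevant files before a localization agent narrows the scope further. As a rough illustration of that retrieval step only (not the paper's actual embedding model, scaffold, or agent), the sketch below ranks repository files by embedding similarity to an issue description; the `embed` function, the `rank_files` helper, and the `DIM` constant are placeholders invented here, with a hashed bag-of-words vector standing in for a learned code embedder.

```python
# Illustrative sketch: rank repository files against a GitHub issue description
# by cosine similarity of embeddings. The hashed bag-of-words "embedding" is a
# self-contained stand-in for a learned code embedding model, which the abstract
# does not specify in detail.
import hashlib
import math
from pathlib import Path

DIM = 512  # assumed embedding width for this toy stand-in


def embed(text: str) -> list[float]:
    """Toy embedding: hash each token into a fixed-size, L2-normalized vector."""
    vec = [0.0] * DIM
    for tok in text.lower().split():
        idx = int(hashlib.md5(tok.encode()).hexdigest(), 16) % DIM
        vec[idx] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def cosine(a: list[float], b: list[float]) -> float:
    """Dot product of two already-normalized vectors."""
    return sum(x * y for x, y in zip(a, b))


def rank_files(issue_text: str, repo_root: str, top_k: int = 5) -> list[tuple[str, float]]:
    """Return the top_k source files most similar to the issue description."""
    issue_vec = embed(issue_text)
    scored = []
    for path in Path(repo_root).rglob("*.py"):
        try:
            code = path.read_text(errors="ignore")
        except OSError:
            continue  # skip unreadable files
        scored.append((str(path), cosine(issue_vec, embed(code))))
    return sorted(scored, key=lambda p: p[1], reverse=True)[:top_k]


if __name__ == "__main__":
    hits = rank_files("TypeError raised when parsing nested config sections", ".")
    for path, score in hits:
        print(f"{score:.3f}  {path}")
```

In the system described by the paper, the candidate files returned by such a retrieval step would then be handed to a localization agent that refines the result to specific code regions; that finer-grained stage is not sketched here.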

Cite this Paper


BibTeX
@InProceedings{pmlr-v267-sohrabizadeh25a,
  title     = {Nemotron-{CORTEXA}: Enhancing {LLM} Agents for Software Engineering Tasks via Improved Localization and Solution Diversity},
  author    = {Sohrabizadeh, Atefeh and Song, Jialin and Liu, Mingjie and Roy, Rajarshi and Lee, Chankyu and Raiman, Jonathan and Catanzaro, Bryan},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages     = {56085--56100},
  year      = {2025},
  editor    = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume    = {267},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--19 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/sohrabizadeh25a/sohrabizadeh25a.pdf},
  url       = {https://proceedings.mlr.press/v267/sohrabizadeh25a.html},
  abstract  = {Large Language Models (LLMs) have demonstrated significant potential in code generation by following natural language instructions. Unfortunately, crucial real-world software engineering tasks, such as debugging or repository-level feature implementation, involve processing extensive contexts beyond current LLM context sizes and performing complex reasoning that is brittle using standard autoregressive decoding. Enhancing LLMs’ performance in these scenarios requires careful consideration of the contextual information provided to the model, optimizing how the model leverages that, and identifying tools that enable more effective navigation of the development environment. To address these challenges, we introduce Nemotron-CORTEXA, an agentic system built on a predefined scaffold that enhances LLMs’ ability to navigate and reason efficiently in complex software engineering contexts. Specifically, we develop a novel code embedding model that retrieves the most relevant files with greater precision, along with a localization agent that refines the granularity of the retrieval process. Additionally, we demonstrate that providing diverse contextual information and utilizing different prompt formats enable the model to identify and resolve issues more efficiently. We evaluate Nemotron-CORTEXA using SWE-bench, a benchmark derived from real-world GitHub issues. Compared to the widely used Agentless framework, Nemotron-CORTEXA achieves a higher issue resolution rate at a lower cost, highlighting its practical impact in addressing real-world software engineering challenges.}
}
Endnote
%0 Conference Paper
%T Nemotron-CORTEXA: Enhancing LLM Agents for Software Engineering Tasks via Improved Localization and Solution Diversity
%A Atefeh Sohrabizadeh
%A Jialin Song
%A Mingjie Liu
%A Rajarshi Roy
%A Chankyu Lee
%A Jonathan Raiman
%A Bryan Catanzaro
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-sohrabizadeh25a
%I PMLR
%P 56085--56100
%U https://proceedings.mlr.press/v267/sohrabizadeh25a.html
%V 267
%X Large Language Models (LLMs) have demonstrated significant potential in code generation by following natural language instructions. Unfortunately, crucial real-world software engineering tasks, such as debugging or repository-level feature implementation, involve processing extensive contexts beyond current LLM context sizes and performing complex reasoning that is brittle using standard autoregressive decoding. Enhancing LLMs’ performance in these scenarios requires careful consideration of the contextual information provided to the model, optimizing how the model leverages that, and identifying tools that enable more effective navigation of the development environment. To address these challenges, we introduce Nemotron-CORTEXA, an agentic system built on a predefined scaffold that enhances LLMs’ ability to navigate and reason efficiently in complex software engineering contexts. Specifically, we develop a novel code embedding model that retrieves the most relevant files with greater precision, along with a localization agent that refines the granularity of the retrieval process. Additionally, we demonstrate that providing diverse contextual information and utilizing different prompt formats enable the model to identify and resolve issues more efficiently. We evaluate Nemotron-CORTEXA using SWE-bench, a benchmark derived from real-world GitHub issues. Compared to the widely used Agentless framework, Nemotron-CORTEXA achieves a higher issue resolution rate at a lower cost, highlighting its practical impact in addressing real-world software engineering challenges.
APA
Sohrabizadeh, A., Song, J., Liu, M., Roy, R., Lee, C., Raiman, J. & Catanzaro, B. (2025). Nemotron-CORTEXA: Enhancing LLM Agents for Software Engineering Tasks via Improved Localization and Solution Diversity. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:56085-56100. Available from https://proceedings.mlr.press/v267/sohrabizadeh25a.html.
