RAG in the Aerospace Domain: A Comprehensive Retrieval, Generation, and User Evaluation for NASA Documentation

Dominykas Petniunas, Gabriel Iturra-Bocaz, Petra Galuscakova
Proceedings of the 7th Northern Lights Deep Learning Conference (NLDL), PMLR 307:345-357, 2026.

Abstract

Large Language Models (LLMs) have demonstrated remarkable capabilities in Natural Language Understanding and text generation, but their application is often limited by hallucinations, outdated knowledge, and lack of evidence. Retrieval-Augmented Generation (RAG) addresses these fundamental LLM limitations by integrating external knowledge sources, thereby improving factual accuracy and traceability while maintaining text-generation capabilities. This work presents the design and implementation of a web-based RAG system for the aerospace domain, leveraging more than 10,000 NASA technical documents and lessons-learned mission reports. The system integrates open-source LLaMA and closed-source OpenAI models and performs an extensive comparative analysis of their performance within the RAG framework. Evaluation through both automated metrics and user studies demonstrates the effectiveness of the RAG approach for both technical and non-technical users. The findings provide insights and establish a foundation for future advancements in AI-driven knowledge management for specialized fields.

Cite this Paper


BibTeX
@InProceedings{pmlr-v307-petniunas26a,
  title     = {{RAG} in the Aerospace Domain: A Comprehensive Retrieval, Generation, and User Evaluation for {NASA} Documentation},
  author    = {Petniunas, Dominykas and Iturra-Bocaz, Gabriel and Galuscakova, Petra},
  booktitle = {Proceedings of the 7th Northern Lights Deep Learning Conference (NLDL)},
  pages     = {345--357},
  year      = {2026},
  editor    = {Kim, Hyeongji and Ramírez Rivera, Adín and Ricaud, Benjamin},
  volume    = {307},
  series    = {Proceedings of Machine Learning Research},
  month     = {06--08 Jan},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v307/main/assets/petniunas26a/petniunas26a.pdf},
  url       = {https://proceedings.mlr.press/v307/petniunas26a.html},
  abstract  = {Large Language Models (LLMs) have demonstrated remarkable capabilities in Natural Language Understanding and text generation, but their application is often limited by hallucinations, outdated knowledge, and lack of evidence. Retrieval-Augmented Generation (RAG) addresses these fundamental LLM limitations by integrating external knowledge sources, thereby improving the factual accuracy and traceability while maintaining the text generative capabilities. This work presents the design and implementation of a web-based RAG system for the aerospace domain, leveraging more than 10,000 NASA technical documents and lessons-learned mission reports. The system integrates open-source LLaMA and closed-source OpenAI models and performs an extensive comparative analysis of their performance within the RAG framework. Evaluation through both automated metrics and user studies demonstrates the effectiveness of the RAG approach for both technical and non-technical users. The findings provide insights and establish a foundation for future advancements in AI-driven knowledge management for specialized fields}
}
Endnote
%0 Conference Paper
%T RAG in the Aerospace Domain: A Comprehensive Retrieval, Generation, and User Evaluation for NASA Documentation
%A Dominykas Petniunas
%A Gabriel Iturra-Bocaz
%A Petra Galuscakova
%B Proceedings of the 7th Northern Lights Deep Learning Conference (NLDL)
%C Proceedings of Machine Learning Research
%D 2026
%E Hyeongji Kim
%E Adín Ramírez Rivera
%E Benjamin Ricaud
%F pmlr-v307-petniunas26a
%I PMLR
%P 345--357
%U https://proceedings.mlr.press/v307/petniunas26a.html
%V 307
%X Large Language Models (LLMs) have demonstrated remarkable capabilities in Natural Language Understanding and text generation, but their application is often limited by hallucinations, outdated knowledge, and lack of evidence. Retrieval-Augmented Generation (RAG) addresses these fundamental LLM limitations by integrating external knowledge sources, thereby improving the factual accuracy and traceability while maintaining the text generative capabilities. This work presents the design and implementation of a web-based RAG system for the aerospace domain, leveraging more than 10,000 NASA technical documents and lessons-learned mission reports. The system integrates open-source LLaMA and closed-source OpenAI models and performs an extensive comparative analysis of their performance within the RAG framework. Evaluation through both automated metrics and user studies demonstrates the effectiveness of the RAG approach for both technical and non-technical users. The findings provide insights and establish a foundation for future advancements in AI-driven knowledge management for specialized fields
APA
Petniunas, D., Iturra-Bocaz, G. & Galuscakova, P. (2026). RAG in the Aerospace Domain: A Comprehensive Retrieval, Generation, and User Evaluation for NASA Documentation. Proceedings of the 7th Northern Lights Deep Learning Conference (NLDL), in Proceedings of Machine Learning Research 307:345-357. Available from https://proceedings.mlr.press/v307/petniunas26a.html.
