Citation Constraints and Reference Hallucinations in Large Language Models
Proceedings of the 39th Canadian Conference on Artificial Intelligence, PMLR 318:464-475, 2026.
Abstract
This paper investigates reference hallucinations in large language models (LLMs) under different prompting constraints. Thirty-six academic-style documents were generated with four systems (Gemini 3, ChatGPT 5.1, ChatGPT 4o, and Microsoft 365 Copilot) and evaluated using an automated citation verification method that cross-checks references against Crossref, OpenAlex, and arXiv. The results show that stricter citation requirements are associated with higher rates of invalid or inconsistent references, whereas unconstrained prompts more frequently produce unsupported conceptual claims rather than fabricated citations. These findings indicate that hallucination behaviour depends on task structure rather than simply on topic difficulty, highlighting the importance of prompt design and verification when LLMs are used for research-style writing and literature assistance.
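
The abstract does not describe how the citation verification method works internally. As a rough illustration of what cross-checking a generated reference against a bibliographic index could involve, the following minimal sketch labels a reference as valid, inconsistent, or not found. Everything in it is an assumption made for illustration, not the authors' method: the function names, the `title` and `year` record keys, the 0.9 similarity threshold, and the premise that candidate records have already been retrieved from an index such as Crossref, OpenAlex, or arXiv.

```python
from difflib import SequenceMatcher


def normalize(title):
    """Lowercase a title and replace punctuation with single spaces."""
    kept = "".join(ch if ch.isalnum() else " " for ch in title.lower())
    return " ".join(kept.split())


def title_similarity(a, b):
    """Return a similarity ratio in [0, 1] between two normalized titles."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def classify_reference(ref, candidates, threshold=0.9):
    """Label a generated reference against records returned by an index.

    `ref` and each entry of `candidates` are dicts with the hypothetical
    keys 'title' and 'year'. The threshold is an illustrative choice.
    Returns 'valid', 'inconsistent' (title matches but the year differs),
    or 'not_found' (no candidate title is similar enough).
    """
    best = None
    best_score = 0.0
    for cand in candidates:
        score = title_similarity(ref["title"], cand["title"])
        if score > best_score:
            best, best_score = cand, score
    if best is None or best_score < threshold:
        return "not_found"
    if ref.get("year") != best.get("year"):
        return "inconsistent"
    return "valid"
```

Under these assumptions, a reference whose title matches an indexed record but whose year differs is labelled `inconsistent`, while a title with no close match in any candidate list is labelled `not_found`, which roughly corresponds to the distinction the abstract draws between inconsistent and invalid references.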