Auditing Citation Behavior in AI-Generated Search Summaries: A Framework and a Case Study of Google AI Overviews

Rustem Kakimov; Xing Tan; Jonathan Gillham; Narcis Bejtic

Proceedings of the 39th Canadian Conference on Artificial Intelligence, PMLR 318:224-235, 2026.

Abstract

Search engines increasingly integrate Large Language Models (LLM) to generate natural-language summaries with cited sources, while a growing fraction of online content is partially or fully AI-generated. This convergence raises new questions about how generative search systems select citation sources, particularly with respect to document provenance. In this paper, we propose a system-agnostic observational framework for auditing citation behavior in AI-generated search summaries, modeling retrieval and citation as observable processes over query-document pairs and introducing rank- and provenance-conditioned citation measures. We instantiate the framework in a large-scale empirical study of Google AI Overviews on "Your Money or Your Life" queries drawn from the MS MARCO Web Search dataset. Our analysis shows that AI-generated documents are cited more frequently than human-authored documents even after controlling for retrieval rank, with the difference driven primarily by non-retrieved citations and most pronounced at highly ranked positions. These results highlight the importance of transparent, measurement-based auditing for understanding citation behavior in generative search systems.

Cite this Paper

BibTeX

@InProceedings{pmlr-v318-kakimov26a,
  title = 	 {Auditing Citation Behavior in AI-Generated Search Summaries: A Framework and a Case Study of Google AI Overviews},
  author =       {Kakimov, Rustem and Tan, Xing and Gillham, Jonathan and Bejtic, Narcis},
  booktitle = 	 {Proceedings of the 39th Canadian Conference on Artificial Intelligence},
  pages = 	 {224--235},
  year = 	 {2026},
  editor = 	 {Bouzar-Benlabiod, Lydia and Leung, Carson},
  volume = 	 {318},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {25--29 May},
  publisher =    {PMLR},
  pdf = 	 {https://raw.githubusercontent.com/mlresearch/v318/main/assets/kakimov26a/kakimov26a.pdf},
  url = 	 {https://proceedings.mlr.press/v318/kakimov26a.html},
  abstract = 	 {Search engines increasingly integrate Large Language Models (LLM) to generate natural-language summaries with cited sources, while a growing fraction of online content is partially or fully AI-generated. This convergence raises new questions about how generative search systems select citation sources, particularly with respect to document provenance. In this paper, we propose a system-agnostic observational framework for auditing citation behavior in AI-generated search summaries, modeling retrieval and citation as observable processes over query-document pairs and introducing rank- and provenance-conditioned citation measures. We instantiate the framework in a large-scale empirical study of Google AI Overviews on "Your Money or Your Life" queries drawn from the MS MARCO Web Search dataset. Our analysis shows that AI-generated documents are cited more frequently than human-authored documents even after controlling for retrieval rank, with the difference driven primarily by non-retrieved citations and most pronounced at highly ranked positions. These results highlight the importance of transparent, measurement-based auditing for understanding citation behavior in generative search systems.}
}

Endnote

%0 Conference Paper
%T Auditing Citation Behavior in AI-Generated Search Summaries: A Framework and a Case Study of Google AI Overviews
%A Rustem Kakimov
%A Xing Tan
%A Jonathan Gillham
%A Narcis Bejtic
%B Proceedings of the 39th Canadian Conference on Artificial Intelligence
%C Proceedings of Machine Learning Research
%D 2026
%E Lydia Bouzar-Benlabiod
%E Carson Leung
%F pmlr-v318-kakimov26a
%I PMLR
%P 224--235
%U https://proceedings.mlr.press/v318/kakimov26a.html
%V 318
%X Search engines increasingly integrate Large Language Models (LLM) to generate natural-language summaries with cited sources, while a growing fraction of online content is partially or fully AI-generated. This convergence raises new questions about how generative search systems select citation sources, particularly with respect to document provenance. In this paper, we propose a system-agnostic observational framework for auditing citation behavior in AI-generated search summaries, modeling retrieval and citation as observable processes over query-document pairs and introducing rank- and provenance-conditioned citation measures. We instantiate the framework in a large-scale empirical study of Google AI Overviews on "Your Money or Your Life" queries drawn from the MS MARCO Web Search dataset. Our analysis shows that AI-generated documents are cited more frequently than human-authored documents even after controlling for retrieval rank, with the difference driven primarily by non-retrieved citations and most pronounced at highly ranked positions. These results highlight the importance of transparent, measurement-based auditing for understanding citation behavior in generative search systems.

APA

Kakimov, R., Tan, X., Gillham, J., & Bejtic, N. (2026). Auditing Citation Behavior in AI-Generated Search Summaries: A Framework and a Case Study of Google AI Overviews. Proceedings of the 39th Canadian Conference on Artificial Intelligence, in Proceedings of Machine Learning Research 318:224-235. Available from https://proceedings.mlr.press/v318/kakimov26a.html.
