SelfCite: Self-Supervised Alignment for Context Attribution in Large Language Models

Yung-Sung Chuang; Benjamin Cohen-Wang; Zejiang Shen; Zhaofeng Wu; Hu Xu; Xi Victoria Lin; James R. Glass; Shang-Wen Li; Wen-Tau Yih

Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:10839-10858, 2025.

Abstract

We introduce SelfCite, a novel self-supervised approach that aligns LLMs to generate high-quality, fine-grained, sentence-level citations for the statements in their generated responses. Instead of only relying on costly and labor-intensive annotations, SelfCite leverages a reward signal provided by the LLM itself through context ablation: If a citation is necessary, removing the cited text from the context should prevent the same response; if sufficient, retaining the cited text alone should preserve the same response. This reward can guide the inference-time best-of-N sampling strategy to improve citation quality significantly, as well as be used in preference optimization to directly fine-tune the models for generating better citations. The effectiveness of SelfCite is demonstrated by increasing citation F1 up to 5.3 points on the LongBench-Cite benchmark across five long-form question answering tasks. The source code is available at https://github.com/facebookresearch/SelfCite.
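The context-ablation reward described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the names `ablation_reward`, `best_of_n`, and `toy_log_prob` are hypothetical, the reward is assumed to be the sum of a necessity term and a sufficiency term measured in log-probability, and the toy scorer merely stands in for an LLM's log-likelihood of the response given a context.

```python
from typing import Callable, List, Sequence

# Assumed scorer signature: log-probability of `response` given context sentences.
LogProb = Callable[[Sequence[str], str], float]


def ablation_reward(log_prob: LogProb, context: Sequence[str],
                    cited: Sequence[int], response: str) -> float:
    """Score a citation (sentence indices into `context`) by context ablation."""
    cited_set = set(cited)
    kept = [s for i, s in enumerate(context) if i in cited_set]
    removed = [s for i, s in enumerate(context) if i not in cited_set]
    full = log_prob(context, response)
    # Necessity: likelihood should drop when the cited text is removed.
    prob_drop = full - log_prob(removed, response)
    # Sufficiency: likelihood should hold when only the cited text is kept.
    prob_hold = log_prob(kept, response) - full
    return prob_drop + prob_hold


def best_of_n(log_prob: LogProb, context: Sequence[str],
              candidates: Sequence[Sequence[int]], response: str) -> Sequence[int]:
    """Pick the candidate citation with the highest ablation reward."""
    return max(candidates,
               key=lambda c: ablation_reward(log_prob, context, c, response))


def toy_log_prob(context: Sequence[str], response: str) -> float:
    """Toy stand-in for an LLM: minus the number of response words absent from context."""
    vocab = set(" ".join(context).lower().split())
    return -float(sum(w not in vocab for w in response.lower().split()))


context: List[str] = [
    "Paris is the capital of France.",
    "Bananas are yellow.",
    "The Seine flows through Paris.",
]
response = "Paris is the capital of France."
candidates = [[1], [0], [2]]  # hypothetical sampled citations
best = best_of_n(toy_log_prob, context, candidates, response)
```

In this toy setup the citation pointing at the first sentence wins, because removing that sentence hurts the response score while keeping it alone preserves the score. With a real model, `toy_log_prob` would be replaced by a call that returns the model's log-likelihood of the response under the given context.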

Cite this Paper

BibTeX

@InProceedings{pmlr-v267-chuang25a,
  title = 	 {{S}elf{C}ite: Self-Supervised Alignment for Context Attribution in Large Language Models},
  author =       {Chuang, Yung-Sung and Cohen-Wang, Benjamin and Shen, Zejiang and Wu, Zhaofeng and Xu, Hu and Lin, Xi Victoria and Glass, James R. and Li, Shang-Wen and Yih, Wen-Tau},
  booktitle = 	 {Proceedings of the 42nd International Conference on Machine Learning},
  pages = 	 {10839--10858},
  year = 	 {2025},
  editor = 	 {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume = 	 {267},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {13--19 Jul},
  publisher =    {PMLR},
  pdf = 	 {https://raw.githubusercontent.com/mlresearch/v267/main/assets/chuang25a/chuang25a.pdf},
  url = 	 {https://proceedings.mlr.press/v267/chuang25a.html},
  abstract = 	 {We introduce SelfCite, a novel self-supervised approach that aligns LLMs to generate high-quality, fine-grained, sentence-level citations for the statements in their generated responses. Instead of only relying on costly and labor-intensive annotations, SelfCite leverages a reward signal provided by the LLM itself through context ablation: If a citation is necessary, removing the cited text from the context should prevent the same response; if sufficient, retaining the cited text alone should preserve the same response. This reward can guide the inference-time best-of-N sampling strategy to improve citation quality significantly, as well as be used in preference optimization to directly fine-tune the models for generating better citations. The effectiveness of SelfCite is demonstrated by increasing citation F1 up to 5.3 points on the LongBench-Cite benchmark across five long-form question answering tasks. The source code is available at https://github.com/facebookresearch/SelfCite.}
}

Endnote

%0 Conference Paper
%T SelfCite: Self-Supervised Alignment for Context Attribution in Large Language Models
%A Yung-Sung Chuang
%A Benjamin Cohen-Wang
%A Zejiang Shen
%A Zhaofeng Wu
%A Hu Xu
%A Xi Victoria Lin
%A James R. Glass
%A Shang-Wen Li
%A Wen-Tau Yih
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-chuang25a
%I PMLR
%P 10839--10858
%U https://proceedings.mlr.press/v267/chuang25a.html
%V 267
%X We introduce SelfCite, a novel self-supervised approach that aligns LLMs to generate high-quality, fine-grained, sentence-level citations for the statements in their generated responses. Instead of only relying on costly and labor-intensive annotations, SelfCite leverages a reward signal provided by the LLM itself through context ablation: If a citation is necessary, removing the cited text from the context should prevent the same response; if sufficient, retaining the cited text alone should preserve the same response. This reward can guide the inference-time best-of-N sampling strategy to improve citation quality significantly, as well as be used in preference optimization to directly fine-tune the models for generating better citations. The effectiveness of SelfCite is demonstrated by increasing citation F1 up to 5.3 points on the LongBench-Cite benchmark across five long-form question answering tasks. The source code is available at https://github.com/facebookresearch/SelfCite.

APA

Chuang, Y., Cohen-Wang, B., Shen, Z., Wu, Z., Xu, H., Lin, X.V., Glass, J.R., Li, S. & Yih, W. (2025). SelfCite: Self-Supervised Alignment for Context Attribution in Large Language Models. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:10839-10858. Available from https://proceedings.mlr.press/v267/chuang25a.html.
