CodeSync: Synchronizing Large Language Models with Dynamic Code Evolution at Scale

Chenlong Wang, Zhaoyang Chu, Zhengxiang Cheng, Xuyi Yang, Kaiyue Qiu, Yao Wan, Zhou Zhao, Xuanhua Shi, Hai Jin, Dongping Chen
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:62672-62700, 2025.

Abstract

Large Language Models (LLMs) have exhibited exceptional performance in software engineering yet face challenges in adapting to continually evolving code knowledge, particularly the frequent updates of third-party library APIs. This limitation, rooted in the static pre-training datasets, often results in non-executable code or implementations with suboptimal safety and efficiency. To this end, we introduce CodeSync, a data engine to identify outdated code patterns and collect real-time code knowledge updates from Python third-party libraries. Building upon CodeSync, we develop CodeSyncBench, a comprehensive benchmark for assessing LLMs’ ability to stay synchronized with code evolution, which covers real-world updates for 220 APIs from six Python libraries. Our benchmark offers 3,300 test cases spanning three evaluation tasks and an update-aware instruction tuning dataset of 2,200 training samples. Extensive experiments on 14 LLMs reveal that they struggle with dynamic code evolution, even with the support of advanced knowledge updating methods (e.g., DPO, ORPO, and SimPO). Our CodeSync lays a strong foundation for developing more effective and robust methods for real-time code knowledge updating in the future. The experimental code is available at: https://github.com/CGCL-codes/naturalcc/tree/main/examples/codesync.
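
To make the abstract's notion of an "outdated code pattern" concrete, the sketch below shows the kind of check a data engine like CodeSync might perform. It is a minimal illustration, not the paper's actual implementation: the OUTDATED_APIS table and the find_outdated_calls helper are hypothetical names introduced here, and the pandas DataFrame.append-to-pandas.concat migration (append was deprecated in pandas 1.4 and removed in 2.0) is simply one well-known instance of the real-world API updates the benchmark targets.

import ast

# Illustrative only: a toy mapping from outdated API call names to
# their suggested replacements. CodeSync mines such updates from
# library releases at scale; this table is a single hand-picked case.
OUTDATED_APIS = {"append": "pandas.concat"}

def find_outdated_calls(source: str) -> list[tuple[int, str, str]]:
    """Flag attribute-style calls whose names match a known-outdated API."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Attribute):
            name = node.func.attr
            if name in OUTDATED_APIS:
                hits.append((node.lineno, name, OUTDATED_APIS[name]))
    return hits

legacy_snippet = "df = df.append(new_rows)  # removed in pandas 2.0\n"
for lineno, old, new in find_outdated_calls(legacy_snippet):
    print(f"line {lineno}: '{old}' is outdated; prefer {new}")

A model that still emits the legacy append call here produces code that fails outright on pandas 2.x, which is the non-executability failure mode the abstract describes.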

Cite this Paper
BibTeX
@InProceedings{pmlr-v267-wang25t,
  title     = {{C}ode{S}ync: Synchronizing Large Language Models with Dynamic Code Evolution at Scale},
  author    = {Wang, Chenlong and Chu, Zhaoyang and Cheng, Zhengxiang and Yang, Xuyi and Qiu, Kaiyue and Wan, Yao and Zhao, Zhou and Shi, Xuanhua and Jin, Hai and Chen, Dongping},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages     = {62672--62700},
  year      = {2025},
  editor    = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume    = {267},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--19 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/wang25t/wang25t.pdf},
  url       = {https://proceedings.mlr.press/v267/wang25t.html},
  abstract  = {Large Language Models (LLMs) have exhibited exceptional performance in software engineering yet face challenges in adapting to continually evolving code knowledge, particularly the frequent updates of third-party library APIs. This limitation, rooted in the static pre-training datasets, often results in non-executable code or implementations with suboptimal safety and efficiency. To this end, we introduce CodeSync, a data engine to identify outdated code patterns and collect real-time code knowledge updates from Python third-party libraries. Building upon CodeSync, we develop CodeSyncBench, a comprehensive benchmark for assessing LLMs’ ability to stay synchronized with code evolution, which covers real-world updates for 220 APIs from six Python libraries. Our benchmark offers 3,300 test cases spanning three evaluation tasks and an update-aware instruction tuning dataset of 2,200 training samples. Extensive experiments on 14 LLMs reveal that they struggle with dynamic code evolution, even with the support of advanced knowledge updating methods (e.g., DPO, ORPO, and SimPO). Our CodeSync lays a strong foundation for developing more effective and robust methods for real-time code knowledge updating in the future. The experimental code is available at: https://github.com/CGCL-codes/naturalcc/tree/main/examples/codesync.}
}
Endnote
%0 Conference Paper
%T CodeSync: Synchronizing Large Language Models with Dynamic Code Evolution at Scale
%A Chenlong Wang
%A Zhaoyang Chu
%A Zhengxiang Cheng
%A Xuyi Yang
%A Kaiyue Qiu
%A Yao Wan
%A Zhou Zhao
%A Xuanhua Shi
%A Hai Jin
%A Dongping Chen
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-wang25t
%I PMLR
%P 62672--62700
%U https://proceedings.mlr.press/v267/wang25t.html
%V 267
%X Large Language Models (LLMs) have exhibited exceptional performance in software engineering yet face challenges in adapting to continually evolving code knowledge, particularly the frequent updates of third-party library APIs. This limitation, rooted in the static pre-training datasets, often results in non-executable code or implementations with suboptimal safety and efficiency. To this end, we introduce CodeSync, a data engine to identify outdated code patterns and collect real-time code knowledge updates from Python third-party libraries. Building upon CodeSync, we develop CodeSyncBench, a comprehensive benchmark for assessing LLMs’ ability to stay synchronized with code evolution, which covers real-world updates for 220 APIs from six Python libraries. Our benchmark offers 3,300 test cases spanning three evaluation tasks and an update-aware instruction tuning dataset of 2,200 training samples. Extensive experiments on 14 LLMs reveal that they struggle with dynamic code evolution, even with the support of advanced knowledge updating methods (e.g., DPO, ORPO, and SimPO). Our CodeSync lays a strong foundation for developing more effective and robust methods for real-time code knowledge updating in the future. The experimental code is available at: https://github.com/CGCL-codes/naturalcc/tree/main/examples/codesync.
APA
Wang, C., Chu, Z., Cheng, Z., Yang, X., Qiu, K., Wan, Y., Zhao, Z., Shi, X., Jin, H. & Chen, D. (2025). CodeSync: Synchronizing Large Language Models with Dynamic Code Evolution at Scale. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:62672-62700. Available from https://proceedings.mlr.press/v267/wang25t.html.