Function-to-Style Guidance of LLMs for Code Translation

Longhui Zhang, Bin Wang, Jiahao Wang, Xiaofeng Zhao, Min Zhang, Hao Yang, Meishan Zhang, Yu Li, Jing Li, Jun Yu, Min Zhang
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:76273-76288, 2025.

Abstract

Large language models (LLMs) have made significant strides in code translation tasks. However, ensuring both the correctness and readability of translated code remains a challenge, limiting their effective adoption in real-world software development. In this work, we propose F2STrans, a function-to-style guiding paradigm designed to progressively improve the performance of LLMs in code translation. Our approach comprises two key stages: (1) Functional learning, which optimizes translation correctness using high-quality source-target code pairs mined from online programming platforms, and (2) Style learning, which improves translation readability by incorporating both positive and negative style examples. Additionally, we introduce a novel code translation benchmark that includes up-to-date source code, extensive test cases, and manually annotated ground-truth translations, enabling comprehensive functional and stylistic evaluations. Experiments on both our new benchmark and existing datasets demonstrate that our approach significantly improves code translation performance. Notably, our approach enables Qwen-1.5B to outperform prompt-enhanced Qwen-32B and GPT-4 on average across 20 diverse code translation scenarios.
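The two-stage recipe described above can be pictured with a small sketch. Everything in it is our illustrative assumption rather than the paper's implementation: a toy model stands in for the LLM, random token ids stand in for the mined source-target code pairs and for the positive/negative style examples, stage 1 is plain supervised fine-tuning, and stage 2 uses a DPO-like margin loss (our choice) to push well-styled translations above poorly-styled ones.

# Toy sketch of the two-stage F2STrans idea from the abstract.
# Model, data, and losses are illustrative stand-ins, not the paper's method.
import torch
import torch.nn as nn
import torch.nn.functional as F

VOCAB, DIM = 1000, 64

class ToyTranslator(nn.Module):
    """Tiny causal-LM stand-in for an LLM code translator."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, DIM)
        self.head = nn.Linear(DIM, VOCAB)

    def forward(self, ids):                  # ids: (batch, seq)
        return self.head(self.embed(ids))    # logits: (batch, seq, vocab)

def seq_logprob(model, ids):
    """Sum of next-token log-probabilities over a token sequence."""
    logits = model(ids[:, :-1])
    logp = F.log_softmax(logits, dim=-1)
    tok = logp.gather(-1, ids[:, 1:].unsqueeze(-1)).squeeze(-1)
    return tok.sum(dim=-1)

model = ToyTranslator()
opt = torch.optim.AdamW(model.parameters(), lr=1e-3)

# Stage 1: functional learning -- supervised fine-tuning on mined
# source->target code pairs (random ids here as placeholders).
pair = torch.randint(0, VOCAB, (4, 32))      # concat(source, target) ids
loss_functional = -seq_logprob(model, pair).mean()
loss_functional.backward()
opt.step()
opt.zero_grad()

# Stage 2: style learning -- prefer the well-styled translation (pos)
# over the poorly-styled one (neg) of the same program.
pos = torch.randint(0, VOCAB, (4, 32))       # readable translation
neg = torch.randint(0, VOCAB, (4, 32))       # unreadable translation
margin = seq_logprob(model, pos) - seq_logprob(model, neg)
loss_style = -F.logsigmoid(margin).mean()    # push pos likelihood above neg
loss_style.backward()
opt.step()
opt.zero_grad()

print(f"functional {loss_functional.item():.2f}, style {loss_style.item():.2f}")

The one design point the sketch tries to capture is the ordering the abstract describes: correctness is learned first from functional supervision, and style preferences are applied afterwards on top of the same model.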

Cite this Paper

BibTeX
@InProceedings{pmlr-v267-zhang25cb,
  title     = {Function-to-Style Guidance of {LLM}s for Code Translation},
  author    = {Zhang, Longhui and Wang, Bin and Wang, Jiahao and Zhao, Xiaofeng and Zhang, Min and Yang, Hao and Zhang, Meishan and Li, Yu and Li, Jing and Yu, Jun and Zhang, Min},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages     = {76273--76288},
  year      = {2025},
  editor    = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume    = {267},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--19 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/zhang25cb/zhang25cb.pdf},
  url       = {https://proceedings.mlr.press/v267/zhang25cb.html},
  abstract  = {Large language models (LLMs) have made significant strides in code translation tasks. However, ensuring both the correctness and readability of translated code remains a challenge, limiting their effective adoption in real-world software development. In this work, we propose F2STrans, a function-to-style guiding paradigm designed to progressively improve the performance of LLMs in code translation. Our approach comprises two key stages: (1) Functional learning, which optimizes translation correctness using high-quality source-target code pairs mined from online programming platforms, and (2) Style learning, which improves translation readability by incorporating both positive and negative style examples. Additionally, we introduce a novel code translation benchmark that includes up-to-date source code, extensive test cases, and manually annotated ground-truth translations, enabling comprehensive functional and stylistic evaluations. Experiments on both our new benchmark and existing datasets demonstrate that our approach significantly improves code translation performance. Notably, our approach enables Qwen-1.5B to outperform prompt-enhanced Qwen-32B and GPT-4 on average across 20 diverse code translation scenarios.}
}
Endnote
%0 Conference Paper
%T Function-to-Style Guidance of LLMs for Code Translation
%A Longhui Zhang
%A Bin Wang
%A Jiahao Wang
%A Xiaofeng Zhao
%A Min Zhang
%A Hao Yang
%A Meishan Zhang
%A Yu Li
%A Jing Li
%A Jun Yu
%A Min Zhang
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-zhang25cb
%I PMLR
%P 76273--76288
%U https://proceedings.mlr.press/v267/zhang25cb.html
%V 267
%X Large language models (LLMs) have made significant strides in code translation tasks. However, ensuring both the correctness and readability of translated code remains a challenge, limiting their effective adoption in real-world software development. In this work, we propose F2STrans, a function-to-style guiding paradigm designed to progressively improve the performance of LLMs in code translation. Our approach comprises two key stages: (1) Functional learning, which optimizes translation correctness using high-quality source-target code pairs mined from online programming platforms, and (2) Style learning, which improves translation readability by incorporating both positive and negative style examples. Additionally, we introduce a novel code translation benchmark that includes up-to-date source code, extensive test cases, and manually annotated ground-truth translations, enabling comprehensive functional and stylistic evaluations. Experiments on both our new benchmark and existing datasets demonstrate that our approach significantly improves code translation performance. Notably, our approach enables Qwen-1.5B to outperform prompt-enhanced Qwen-32B and GPT-4 on average across 20 diverse code translation scenarios.
APA
Zhang, L., Wang, B., Wang, J., Zhao, X., Zhang, M., Yang, H., Zhang, M., Li, Y., Li, J., Yu, J. & Zhang, M. (2025). Function-to-Style Guidance of LLMs for Code Translation. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:76273-76288. Available from https://proceedings.mlr.press/v267/zhang25cb.html.