Model Steering: Learning with a Reference Model Improves Generalization Bounds and Scaling Laws

Xiyuan Wei, Ming Lin, Fanjiang Ye, Fengguang Song, Liangliang Cao, My T. Thai, Tianbao Yang
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:66084-66102, 2025.

Abstract

This paper formalizes an emerging learning paradigm, named model steering, that uses a trained model as a reference to guide and enhance the training of a target model through strategic data selection or weighting. While ad-hoc methods have been used in various contexts, including the training of large foundation models, the paradigm's underlying principles remain insufficiently understood, leading to sub-optimal performance. In this work, we propose a theory-driven framework for model steering called DRRho risk minimization, which is rooted in Distributionally Robust Optimization (DRO). Through a generalization analysis, we provide theoretical insights into why this approach improves generalization and data efficiency compared to training without a reference model. To the best of our knowledge, this is the first time such theoretical insights have been provided for this learning paradigm, and they significantly enhance our understanding and practice of model steering. Building on these insights and the connection between contrastive learning and DRO, we introduce a novel method for Contrastive Language-Image Pretraining (CLIP) with a reference model, termed DRRho-CLIP. Extensive experiments validate the theoretical insights, reveal a superior scaling law compared to CLIP without a reference model, and demonstrate its strength over existing heuristic approaches. Code is released at github.com/Optimization-AI/DRRho-CLIP.
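As a rough, illustrative sketch only (the notation below is ours, not taken from the paper): a KL-constrained DRO objective over an empirical sample z_1, ..., z_n, with a reference model w_ref entering through an excess loss, could take the form

\min_{w}\ \max_{q \in \Delta_n,\ \mathrm{KL}(q \,\|\, \mathbf{1}/n) \le \rho}\ \sum_{i=1}^{n} q_i \big( \ell(w; z_i) - \ell(w_{\mathrm{ref}}; z_i) \big)
\;=\; \min_{w}\ \inf_{\tau > 0}\ \tau \log\!\Big( \frac{1}{n} \sum_{i=1}^{n} \exp\!\Big( \frac{\ell(w; z_i) - \ell(w_{\mathrm{ref}}; z_i)}{\tau} \Big) \Big) + \tau \rho,

where the robust weights q_i upweight examples on which the target model w still lags the reference, and the equality is the standard dual of KL-constrained DRO. The paper's actual DRRho formulation may differ in its constraint set, regularization, and choice of loss.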

Cite this Paper


BibTeX
@InProceedings{pmlr-v267-wei25f,
  title     = {Model Steering: Learning with a Reference Model Improves Generalization Bounds and Scaling Laws},
  author    = {Wei, Xiyuan and Lin, Ming and Ye, Fanjiang and Song, Fengguang and Cao, Liangliang and Thai, My T. and Yang, Tianbao},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages     = {66084--66102},
  year      = {2025},
  editor    = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume    = {267},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--19 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/wei25f/wei25f.pdf},
  url       = {https://proceedings.mlr.press/v267/wei25f.html},
  abstract  = {This paper formalizes an emerging learning paradigm that uses a trained model as a reference to guide and enhance the training of a target model through strategic data selection or weighting, named model steering. While ad-hoc methods have been used in various contexts, including the training of large foundation models, its underlying principles remain insufficiently understood, leading to sub-optimal performance. In this work, we propose a theory-driven framework for model steering called DRRho risk minimization, which is rooted in Distributionally Robust Optimization (DRO). Through a generalization analysis, we provide theoretical insights into why this approach improves generalization and data efficiency compared to training without a reference model. To the best of our knowledge, this is the first time such theoretical insights are provided for the new learning paradigm, which significantly enhance our understanding and practice of model steering. Building on these insights and the connection between contrastive learning and DRO, we introduce a novel method for Contrastive Language-Image Pretraining (CLIP) with a reference model, termed DRRho-CLIP. Extensive experiments validate the theoretical insights, reveal a superior scaling law compared to CLIP without a reference model, and demonstrate its strength over existing heuristic approaches. Code is released at github.com/Optimization-AI/DRRho-CLIP}
}
Endnote
%0 Conference Paper
%T Model Steering: Learning with a Reference Model Improves Generalization Bounds and Scaling Laws
%A Xiyuan Wei
%A Ming Lin
%A Fanjiang Ye
%A Fengguang Song
%A Liangliang Cao
%A My T. Thai
%A Tianbao Yang
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-wei25f
%I PMLR
%P 66084--66102
%U https://proceedings.mlr.press/v267/wei25f.html
%V 267
%X This paper formalizes an emerging learning paradigm that uses a trained model as a reference to guide and enhance the training of a target model through strategic data selection or weighting, named model steering. While ad-hoc methods have been used in various contexts, including the training of large foundation models, its underlying principles remain insufficiently understood, leading to sub-optimal performance. In this work, we propose a theory-driven framework for model steering called DRRho risk minimization, which is rooted in Distributionally Robust Optimization (DRO). Through a generalization analysis, we provide theoretical insights into why this approach improves generalization and data efficiency compared to training without a reference model. To the best of our knowledge, this is the first time such theoretical insights are provided for the new learning paradigm, which significantly enhance our understanding and practice of model steering. Building on these insights and the connection between contrastive learning and DRO, we introduce a novel method for Contrastive Language-Image Pretraining (CLIP) with a reference model, termed DRRho-CLIP. Extensive experiments validate the theoretical insights, reveal a superior scaling law compared to CLIP without a reference model, and demonstrate its strength over existing heuristic approaches. Code is released at github.com/Optimization-AI/DRRho-CLIP
APA
Wei, X., Lin, M., Ye, F., Song, F., Cao, L., Thai, M.T. & Yang, T. (2025). Model Steering: Learning with a Reference Model Improves Generalization Bounds and Scaling Laws. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:66084-66102. Available from https://proceedings.mlr.press/v267/wei25f.html.
