MGit: A Model Versioning and Management System

Wei Hao, Daniel Mendoza, Rafael Mendes, Deepak Narayanan, Amar Phanishayee, Asaf Cidon, Junfeng Yang
Proceedings of the 41st International Conference on Machine Learning, PMLR 235:17597-17615, 2024.

Abstract

New ML models are often derived from existing ones (e.g., through fine-tuning, quantization or distillation), forming an ecosystem where models are related to each other and can share structure or even parameter values. Managing such a large and evolving ecosystem of model derivatives is challenging. For instance, the overhead of storing all such models is high, and models may inherit bugs from related models, complicating error attribution and debugging. In this paper, we propose a model versioning and management system called MGit that makes it easier to store, test, update, and collaborate on related models. MGit introduces a lineage graph that records the relationships between models, optimizations to efficiently store model parameters, and abstractions over this lineage graph that facilitate model testing, updating and collaboration. We find that MGit works well in practice: MGit is able to reduce model storage footprint by up to 7$\times$. Additionally, in a user study with 20 ML practitioners, users complete a model updating task 3$\times$ faster on average with MGit.

Cite this Paper


BibTeX
@InProceedings{pmlr-v235-hao24c,
  title = {{MG}it: A Model Versioning and Management System},
  author = {Hao, Wei and Mendoza, Daniel and Mendes, Rafael and Narayanan, Deepak and Phanishayee, Amar and Cidon, Asaf and Yang, Junfeng},
  booktitle = {Proceedings of the 41st International Conference on Machine Learning},
  pages = {17597--17615},
  year = {2024},
  editor = {Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix},
  volume = {235},
  series = {Proceedings of Machine Learning Research},
  month = {21--27 Jul},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v235/main/assets/hao24c/hao24c.pdf},
  url = {https://proceedings.mlr.press/v235/hao24c.html},
  abstract = {New ML models are often derived from existing ones (e.g., through fine-tuning, quantization or distillation), forming an ecosystem where models are related to each other and can share structure or even parameter values. Managing such a large and evolving ecosystem of model derivatives is challenging. For instance, the overhead of storing all such models is high, and models may inherit bugs from related models, complicating error attribution and debugging. In this paper, we propose a model versioning and management system called MGit that makes it easier to store, test, update, and collaborate on related models. MGit introduces a lineage graph that records the relationships between models, optimizations to efficiently store model parameters, and abstractions over this lineage graph that facilitate model testing, updating and collaboration. We find that MGit works well in practice: MGit is able to reduce model storage footprint by up to 7$\times$. Additionally, in a user study with 20 ML practitioners, users complete a model updating task 3$\times$ faster on average with MGit.}
}
Endnote
%0 Conference Paper
%T MGit: A Model Versioning and Management System
%A Wei Hao
%A Daniel Mendoza
%A Rafael Mendes
%A Deepak Narayanan
%A Amar Phanishayee
%A Asaf Cidon
%A Junfeng Yang
%B Proceedings of the 41st International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2024
%E Ruslan Salakhutdinov
%E Zico Kolter
%E Katherine Heller
%E Adrian Weller
%E Nuria Oliver
%E Jonathan Scarlett
%E Felix Berkenkamp
%F pmlr-v235-hao24c
%I PMLR
%P 17597--17615
%U https://proceedings.mlr.press/v235/hao24c.html
%V 235
%X New ML models are often derived from existing ones (e.g., through fine-tuning, quantization or distillation), forming an ecosystem where models are related to each other and can share structure or even parameter values. Managing such a large and evolving ecosystem of model derivatives is challenging. For instance, the overhead of storing all such models is high, and models may inherit bugs from related models, complicating error attribution and debugging. In this paper, we propose a model versioning and management system called MGit that makes it easier to store, test, update, and collaborate on related models. MGit introduces a lineage graph that records the relationships between models, optimizations to efficiently store model parameters, and abstractions over this lineage graph that facilitate model testing, updating and collaboration. We find that MGit works well in practice: MGit is able to reduce model storage footprint by up to 7$\times$. Additionally, in a user study with 20 ML practitioners, users complete a model updating task 3$\times$ faster on average with MGit.
APA
Hao, W., Mendoza, D., Mendes, R., Narayanan, D., Phanishayee, A., Cidon, A., & Yang, J. (2024). MGit: A Model Versioning and Management System. Proceedings of the 41st International Conference on Machine Learning, in Proceedings of Machine Learning Research 235:17597-17615. Available from https://proceedings.mlr.press/v235/hao24c.html.
