Beyond Low-rank Decomposition: A Shortcut Approach for Efficient On-Device Learning

Le-Trung Nguyen, Aël Quélennec, Van-Tam Nguyen, Enzo Tartaglione
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:46196-46210, 2025.

Abstract

On-device learning has emerged as a promising direction for AI development, particularly because of its potential to reduce latency issues and mitigate privacy risks associated with device-server communication, while improving energy efficiency. Despite these advantages, significant memory and computational constraints still represent major challenges for its deployment. Drawing on previous studies on low-rank decomposition methods that address activation memory bottlenecks in backpropagation, we propose a novel shortcut approach as an alternative. Our analysis and experiments demonstrate that our method can reduce activation memory usage, even up to $120.09\times$ compared to vanilla training, while also reducing overall training FLOPs up to $1.86\times$ when evaluated on traditional benchmarks.

Cite this Paper


BibTeX
@InProceedings{pmlr-v267-nguyen25i,
  title = {Beyond Low-rank Decomposition: A Shortcut Approach for Efficient On-Device Learning},
  author = {Nguyen, Le-Trung and Qu\'{e}lennec, A\"{e}l and Nguyen, Van-Tam and Tartaglione, Enzo},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages = {46196--46210},
  year = {2025},
  editor = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume = {267},
  series = {Proceedings of Machine Learning Research},
  month = {13--19 Jul},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/nguyen25i/nguyen25i.pdf},
  url = {https://proceedings.mlr.press/v267/nguyen25i.html},
  abstract = {On-device learning has emerged as a promising direction for AI development, particularly because of its potential to reduce latency issues and mitigate privacy risks associated with device-server communication, while improving energy efficiency. Despite these advantages, significant memory and computational constraints still represent major challenges for its deployment. Drawing on previous studies on low-rank decomposition methods that address activation memory bottlenecks in backpropagation, we propose a novel shortcut approach as an alternative. Our analysis and experiments demonstrate that our method can reduce activation memory usage, even up to $120.09\times$ compared to vanilla training, while also reducing overall training FLOPs up to $1.86\times$ when evaluated on traditional benchmarks.}
}
Endnote
%0 Conference Paper
%T Beyond Low-rank Decomposition: A Shortcut Approach for Efficient On-Device Learning
%A Le-Trung Nguyen
%A Aël Quélennec
%A Van-Tam Nguyen
%A Enzo Tartaglione
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-nguyen25i
%I PMLR
%P 46196--46210
%U https://proceedings.mlr.press/v267/nguyen25i.html
%V 267
%X On-device learning has emerged as a promising direction for AI development, particularly because of its potential to reduce latency issues and mitigate privacy risks associated with device-server communication, while improving energy efficiency. Despite these advantages, significant memory and computational constraints still represent major challenges for its deployment. Drawing on previous studies on low-rank decomposition methods that address activation memory bottlenecks in backpropagation, we propose a novel shortcut approach as an alternative. Our analysis and experiments demonstrate that our method can reduce activation memory usage, even up to $120.09\times$ compared to vanilla training, while also reducing overall training FLOPs up to $1.86\times$ when evaluated on traditional benchmarks.
APA
Nguyen, L., Quélennec, A., Nguyen, V. & Tartaglione, E. (2025). Beyond Low-rank Decomposition: A Shortcut Approach for Efficient On-Device Learning. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:46196-46210. Available from https://proceedings.mlr.press/v267/nguyen25i.html.
