Towards Interpretable Deep Local Learning with Successive Gradient Reconciliation

Yibo Yang, Xiaojie Li, Motasem Alfarra, Hasan Abed Al Kader Hammoud, Adel Bibi, Philip Torr, Bernard Ghanem
Proceedings of the 41st International Conference on Machine Learning, PMLR 235:56196-56215, 2024.

Abstract

Relieving neural network training's reliance on global back-propagation (BP) has emerged as a notable research topic due to the biological implausibility and heavy memory consumption of BP. Among the existing solutions, local learning optimizes gradient-isolated modules of a neural network with local errors and has proven effective even on large-scale datasets. However, the reconciliation among local errors has never been investigated. In this paper, we first theoretically study non-greedy layer-wise training and show that convergence cannot be assured when the local gradient in a module w.r.t. its input is not reconciled with the local gradient in the previous module w.r.t. its output. Inspired by this theoretical result, we further propose a local training strategy that successively regularizes the gradient reconciliation between neighboring modules without breaking gradient isolation or introducing any learnable parameters. Our method can be integrated into both local-BP and BP-free settings. In experiments, we achieve significant performance improvements over previous methods. In particular, our method for CNN and Transformer architectures on ImageNet attains performance competitive with global BP while saving more than 40% of memory consumption.
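
To make the idea concrete, below is a minimal, hypothetical PyTorch sketch of local learning with a gradient-reconciliation penalty between two neighboring modules. It illustrates the general strategy described in the abstract, not the paper's exact Successive Gradient Reconciliation objective: the module and auxiliary-head definitions, the squared-difference penalty, and the weight `lam` are assumptions made for this sketch.

```python
# Minimal sketch (assumed, not the paper's exact method): two gradient-isolated
# modules, each trained by its own local error, plus a reconciliation penalty
# that encourages the gradient of module 2's local loss w.r.t. its input to
# agree with the gradient of module 1's local loss w.r.t. its output.
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Two gradient-isolated modules, each with its own auxiliary classifier.
module1 = nn.Sequential(nn.Linear(32, 64), nn.ReLU())
module2 = nn.Sequential(nn.Linear(64, 64), nn.ReLU())
head1 = nn.Linear(64, 10)   # local error for module 1
head2 = nn.Linear(64, 10)   # local error for module 2

opt = torch.optim.SGD(
    list(module1.parameters()) + list(head1.parameters())
    + list(module2.parameters()) + list(head2.parameters()), lr=0.1)

x = torch.randn(8, 32)
y = torch.randint(0, 10, (8,))
lam = 0.1  # weight of the (assumed) reconciliation penalty

opt.zero_grad()

# --- Module 1: local forward/backward ---
h1 = module1(x)
loss1 = F.cross_entropy(head1(h1), y)
# Gradient of module 1's local loss w.r.t. its output.
g_out1, = torch.autograd.grad(loss1, h1, retain_graph=True)
loss1.backward()

# --- Module 2: gradient isolation via detach ---
h1_in = h1.detach().requires_grad_(True)
h2 = module2(h1_in)
loss2 = F.cross_entropy(head2(h2), y)
# Gradient of module 2's local loss w.r.t. its input.
g_in2, = torch.autograd.grad(loss2, h1_in, create_graph=True)

# Reconciliation penalty: encourage the two local gradients to agree,
# without letting any gradient flow back into module 1.
recon = (g_in2 - g_out1.detach()).pow(2).sum()
(loss2 + lam * recon).backward()

opt.step()
```

The two points the sketch tries to convey are that each module is updated only by its local error (gradient isolation via `detach`), and that the regularizer compares a module's input gradient with the detached output gradient of its predecessor, so no learnable parameters are added and gradient isolation is preserved.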

Cite this Paper


BibTeX
@InProceedings{pmlr-v235-yang24m,
  title     = {Towards Interpretable Deep Local Learning with Successive Gradient Reconciliation},
  author    = {Yang, Yibo and Li, Xiaojie and Alfarra, Motasem and Hammoud, Hasan Abed Al Kader and Bibi, Adel and Torr, Philip and Ghanem, Bernard},
  booktitle = {Proceedings of the 41st International Conference on Machine Learning},
  pages     = {56196--56215},
  year      = {2024},
  editor    = {Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix},
  volume    = {235},
  series    = {Proceedings of Machine Learning Research},
  month     = {21--27 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v235/main/assets/yang24m/yang24m.pdf},
  url       = {https://proceedings.mlr.press/v235/yang24m.html}
}
Endnote
%0 Conference Paper
%T Towards Interpretable Deep Local Learning with Successive Gradient Reconciliation
%A Yibo Yang
%A Xiaojie Li
%A Motasem Alfarra
%A Hasan Abed Al Kader Hammoud
%A Adel Bibi
%A Philip Torr
%A Bernard Ghanem
%B Proceedings of the 41st International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2024
%E Ruslan Salakhutdinov
%E Zico Kolter
%E Katherine Heller
%E Adrian Weller
%E Nuria Oliver
%E Jonathan Scarlett
%E Felix Berkenkamp
%F pmlr-v235-yang24m
%I PMLR
%P 56196--56215
%U https://proceedings.mlr.press/v235/yang24m.html
%V 235
APA
Yang, Y., Li, X., Alfarra, M., Hammoud, H.A.A.K., Bibi, A., Torr, P. & Ghanem, B. (2024). Towards Interpretable Deep Local Learning with Successive Gradient Reconciliation. Proceedings of the 41st International Conference on Machine Learning, in Proceedings of Machine Learning Research 235:56196-56215. Available from https://proceedings.mlr.press/v235/yang24m.html.