Bayesian Adaptation of Network Depth and Width for Continual Learning

Jeevan Thapa, Rui Li
Proceedings of the 41st International Conference on Machine Learning, PMLR 235:48038-48061, 2024.

Abstract

While existing dynamic architecture-based continual learning methods adapt network width by growing new branches, they overlook the critical aspect of network depth. We propose a novel non-parametric Bayesian approach to infer network depth and adapt network width while maintaining model performance across tasks. Specifically, we model the growth of network depth with a beta process and apply drop-connect regularization to network width using a conjugate Bernoulli process. Our results show that our proposed method achieves superior or comparable performance with state-of-the-art methods across various continual learning benchmarks. Moreover, our approach can be readily extended to unsupervised continual learning, showcasing competitive performance compared to existing techniques.

Cite this Paper


BibTeX
@InProceedings{pmlr-v235-thapa24b,
  title = {{B}ayesian Adaptation of Network Depth and Width for Continual Learning},
  author = {Thapa, Jeevan and Li, Rui},
  booktitle = {Proceedings of the 41st International Conference on Machine Learning},
  pages = {48038--48061},
  year = {2024},
  editor = {Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix},
  volume = {235},
  series = {Proceedings of Machine Learning Research},
  month = {21--27 Jul},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v235/main/assets/thapa24b/thapa24b.pdf},
  url = {https://proceedings.mlr.press/v235/thapa24b.html},
  abstract = {While existing dynamic architecture-based continual learning methods adapt network width by growing new branches, they overlook the critical aspect of network depth. We propose a novel non-parametric Bayesian approach to infer network depth and adapt network width while maintaining model performance across tasks. Specifically, we model the growth of network depth with a beta process and apply drop-connect regularization to network width using a conjugate Bernoulli process. Our results show that our proposed method achieves superior or comparable performance with state-of-the-art methods across various continual learning benchmarks. Moreover, our approach can be readily extended to unsupervised continual learning, showcasing competitive performance compared to existing techniques.}
}
Endnote
%0 Conference Paper
%T Bayesian Adaptation of Network Depth and Width for Continual Learning
%A Jeevan Thapa
%A Rui Li
%B Proceedings of the 41st International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2024
%E Ruslan Salakhutdinov
%E Zico Kolter
%E Katherine Heller
%E Adrian Weller
%E Nuria Oliver
%E Jonathan Scarlett
%E Felix Berkenkamp
%F pmlr-v235-thapa24b
%I PMLR
%P 48038--48061
%U https://proceedings.mlr.press/v235/thapa24b.html
%V 235
%X While existing dynamic architecture-based continual learning methods adapt network width by growing new branches, they overlook the critical aspect of network depth. We propose a novel non-parametric Bayesian approach to infer network depth and adapt network width while maintaining model performance across tasks. Specifically, we model the growth of network depth with a beta process and apply drop-connect regularization to network width using a conjugate Bernoulli process. Our results show that our proposed method achieves superior or comparable performance with state-of-the-art methods across various continual learning benchmarks. Moreover, our approach can be readily extended to unsupervised continual learning, showcasing competitive performance compared to existing techniques.
APA
Thapa, J. & Li, R. (2024). Bayesian Adaptation of Network Depth and Width for Continual Learning. Proceedings of the 41st International Conference on Machine Learning, in Proceedings of Machine Learning Research 235:48038-48061. Available from https://proceedings.mlr.press/v235/thapa24b.html.
