Bayesian Adaptation of Network Depth and Width for Continual Learning

Jeevan Thapa; Rui Li

Proceedings of the 41st International Conference on Machine Learning, PMLR 235:48038-48061, 2024.

Abstract

While existing dynamic architecture-based continual learning methods adapt network width by growing new branches, they overlook the critical aspect of network depth. We propose a novel non-parametric Bayesian approach to infer network depth and adapt network width while maintaining model performance across tasks. Specifically, we model the growth of network depth with a beta process and apply drop-connect regularization to network width using a conjugate Bernoulli process. Our results show that our proposed method achieves superior or comparable performance with state-of-the-art methods across various continual learning benchmarks. Moreover, our approach can be readily extended to unsupervised continual learning, showcasing competitive performance compared to existing techniques.

Cite this Paper

BibTeX


@InProceedings{pmlr-v235-thapa24b,
  title     = {{B}ayesian Adaptation of Network Depth and Width for Continual Learning},
  author    = {Thapa, Jeevan and Li, Rui},
  booktitle = {Proceedings of the 41st International Conference on Machine Learning},
  pages     = {48038--48061},
  year      = {2024},
  editor    = {Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix},
  volume    = {235},
  series    = {Proceedings of Machine Learning Research},
  month     = {21--27 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v235/main/assets/thapa24b/thapa24b.pdf},
  url       = {https://proceedings.mlr.press/v235/thapa24b.html},
  abstract  = {While existing dynamic architecture-based continual learning methods adapt network width by growing new branches, they overlook the critical aspect of network depth. We propose a novel non-parametric Bayesian approach to infer network depth and adapt network width while maintaining model performance across tasks. Specifically, we model the growth of network depth with a beta process and apply drop-connect regularization to network width using a conjugate Bernoulli process. Our results show that our proposed method achieves superior or comparable performance with state-of-the-art methods across various continual learning benchmarks. Moreover, our approach can be readily extended to unsupervised continual learning, showcasing competitive performance compared to existing techniques.}
}

Endnote

%0 Conference Paper
%T Bayesian Adaptation of Network Depth and Width for Continual Learning
%A Jeevan Thapa
%A Rui Li
%B Proceedings of the 41st International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2024
%E Ruslan Salakhutdinov
%E Zico Kolter
%E Katherine Heller
%E Adrian Weller
%E Nuria Oliver
%E Jonathan Scarlett
%E Felix Berkenkamp
%F pmlr-v235-thapa24b
%I PMLR
%P 48038--48061
%U https://proceedings.mlr.press/v235/thapa24b.html
%V 235
%X While existing dynamic architecture-based continual learning methods adapt network width by growing new branches, they overlook the critical aspect of network depth. We propose a novel non-parametric Bayesian approach to infer network depth and adapt network width while maintaining model performance across tasks. Specifically, we model the growth of network depth with a beta process and apply drop-connect regularization to network width using a conjugate Bernoulli process. Our results show that our proposed method achieves superior or comparable performance with state-of-the-art methods across various continual learning benchmarks. Moreover, our approach can be readily extended to unsupervised continual learning, showcasing competitive performance compared to existing techniques.

APA


Thapa, J. & Li, R. (2024). Bayesian Adaptation of Network Depth and Width for Continual Learning. Proceedings of the 41st International Conference on Machine Learning, in Proceedings of Machine Learning Research 235:48038-48061. Available from https://proceedings.mlr.press/v235/thapa24b.html.
