Memorization Capacity for Additive Fine-Tuning with Small ReLU Networks

Jy-yong Sohn; Dohyun Kwon; Seoyeon An; Kangwook Lee

Proceedings of the Fortieth Conference on Uncertainty in Artificial Intelligence, PMLR 244:3264-3278, 2024.

Abstract

Fine-tuning large pre-trained models is a common practice in machine learning applications, yet its mathematical analysis remains largely unexplored. In this paper, we study fine-tuning through the lens of memorization capacity. Our new measure, the Fine-Tuning Capacity (FTC), is defined as the maximum number of samples a neural network can fine-tune, or equivalently, as the minimum number of neurons ($m$) needed to arbitrarily change $N$ labels among $K$ samples considered in the fine-tuning process. In essence, FTC extends the memorization capacity concept to the fine-tuning scenario. We analyze FTC for the additive fine-tuning scenario where the fine-tuned network is defined as the summation of the frozen pre-trained network $f$ and a neural network $g$ (with $m$ neurons) designed for fine-tuning. When $g$ is a ReLU network with either 2 or 3 layers, we obtain tight upper and lower bounds on FTC; we show that $N$ samples can be fine-tuned with $m=\Theta(N)$ neurons for 2-layer networks, and with $m=\Theta(\sqrt{N})$ neurons for 3-layer networks, no matter how large $K$ is. Our results recover the known memorization capacity results when $N = K$ as a special case.
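To make the additive setup concrete, here is a minimal illustrative sketch. It is not the construction from the paper. It assumes scalar inputs that are sorted and distinct, uses `np.sin` as a stand-in for the frozen network $f$, and builds a 2-layer ReLU network $g$ from "tent" bumps, spending three hidden neurons per changed label. The helper names (`build_tent_network`, `g`) are hypothetical. The point it shows is only that the width of $g$ grows with the number $N$ of changed labels, not with the total number $K$ of samples.

```python
import numpy as np


def relu(z):
    return np.maximum(z, 0.0)


def build_tent_network(xs, changed, deltas):
    """Build g(x) = sum_j coeffs[j] * relu(x - knots[j]) for 1-D inputs.

    xs      : sorted, distinct sample locations (all K samples)
    changed : indices of the N samples whose labels should move
    deltas  : desired label change at each of those indices
    Assumption: this tent construction is for illustration only.
    """
    xs = np.asarray(xs, dtype=float)
    knots, coeffs = [], []
    for i, d in zip(changed, deltas):
        # Neighbouring samples (or a dummy point one unit away at the ends).
        left = xs[i - 1] if i > 0 else xs[i] - 1.0
        right = xs[i + 1] if i < len(xs) - 1 else xs[i] + 1.0
        up = 1.0 / (xs[i] - left)      # slope rising from left neighbour
        down = 1.0 / (right - xs[i])   # slope falling to right neighbour
        # Three ReLU neurons form a bump: 0 at both neighbours, d at xs[i].
        knots += [left, xs[i], right]
        coeffs += [d * up, -d * (up + down), d * down]
    return np.array(knots), np.array(coeffs)


def g(x, knots, coeffs):
    """Evaluate the 2-layer ReLU network defined by (knots, coeffs)."""
    x = np.asarray(x, dtype=float)
    return relu(x[:, None] - knots[None, :]) @ coeffs


# K = 9 samples, N = 3 labels to change.
xs = np.linspace(-2.0, 2.0, 9)
f = np.sin                      # stand-in for the frozen pre-trained network
changed = [2, 3, 7]
deltas = [1.0, -0.5, 2.0]

knots, coeffs = build_tent_network(xs, changed, deltas)
fine_tuned = f(xs) + g(xs, knots, coeffs)   # additive fine-tuning: f + g

target = f(xs).copy()
target[changed] += deltas
```

Here `len(knots)` is the hidden width of $g$, which equals $3N$ regardless of how many unchanged samples surround the changed ones. Each bump vanishes at the neighbouring samples, so the labels of the other $K - N$ samples are left as $f$ produced them.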

Cite this Paper

BibTeX


@InProceedings{pmlr-v244-sohn24a,
  title = 	 {Memorization Capacity for Additive Fine-Tuning with Small ReLU Networks},
  author =       {Sohn, Jy-yong and Kwon, Dohyun and An, Seoyeon and Lee, Kangwook},
  booktitle = 	 {Proceedings of the Fortieth Conference on Uncertainty in Artificial Intelligence},
  pages = 	 {3264--3278},
  year = 	 {2024},
  editor = 	 {Kiyavash, Negar and Mooij, Joris M.},
  volume = 	 {244},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {15--19 Jul},
  publisher =    {PMLR},
  pdf = 	 {https://raw.githubusercontent.com/mlresearch/v244/main/assets/sohn24a/sohn24a.pdf},
  url = 	 {https://proceedings.mlr.press/v244/sohn24a.html},
  abstract = 	 {Fine-tuning large pre-trained models is a common practice in machine learning applications, yet its mathematical analysis remains largely unexplored. In this paper, we study fine-tuning through the lens of memorization capacity. Our new measure, the Fine-Tuning Capacity (FTC), is defined as the maximum number of samples a neural network can fine-tune, or equivalently, as the minimum number of neurons ($m$) needed to arbitrarily change $N$ labels among $K$ samples considered in the fine-tuning process. In essence, FTC extends the memorization capacity concept to the fine-tuning scenario. We analyze FTC for the additive fine-tuning scenario where the fine-tuned network is defined as the summation of the frozen pre-trained network $f$ and a neural network $g$ (with $m$ neurons) designed for fine-tuning. When $g$ is a ReLU network with either 2 or 3 layers, we obtain tight upper and lower bounds on FTC; we show that $N$ samples can be fine-tuned with $m=\Theta(N)$ neurons for 2-layer networks, and with $m=\Theta(\sqrt{N})$ neurons for 3-layer networks, no matter how large $K$ is. Our results recover the known memorization capacity results when $N = K$ as a special case.}
}

Endnote

%0 Conference Paper
%T Memorization Capacity for Additive Fine-Tuning with Small ReLU Networks
%A Jy-yong Sohn
%A Dohyun Kwon
%A Seoyeon An
%A Kangwook Lee
%B Proceedings of the Fortieth Conference on Uncertainty in Artificial Intelligence
%C Proceedings of Machine Learning Research
%D 2024
%E Negar Kiyavash
%E Joris M. Mooij
%F pmlr-v244-sohn24a
%I PMLR
%P 3264--3278
%U https://proceedings.mlr.press/v244/sohn24a.html
%V 244
%X Fine-tuning large pre-trained models is a common practice in machine learning applications, yet its mathematical analysis remains largely unexplored. In this paper, we study fine-tuning through the lens of memorization capacity. Our new measure, the Fine-Tuning Capacity (FTC), is defined as the maximum number of samples a neural network can fine-tune, or equivalently, as the minimum number of neurons ($m$) needed to arbitrarily change $N$ labels among $K$ samples considered in the fine-tuning process. In essence, FTC extends the memorization capacity concept to the fine-tuning scenario. We analyze FTC for the additive fine-tuning scenario where the fine-tuned network is defined as the summation of the frozen pre-trained network $f$ and a neural network $g$ (with $m$ neurons) designed for fine-tuning. When $g$ is a ReLU network with either 2 or 3 layers, we obtain tight upper and lower bounds on FTC; we show that $N$ samples can be fine-tuned with $m=\Theta(N)$ neurons for 2-layer networks, and with $m=\Theta(\sqrt{N})$ neurons for 3-layer networks, no matter how large $K$ is. Our results recover the known memorization capacity results when $N = K$ as a special case.

APA


Sohn, J., Kwon, D., An, S. & Lee, K. (2024). Memorization Capacity for Additive Fine-Tuning with Small ReLU Networks. Proceedings of the Fortieth Conference on Uncertainty in Artificial Intelligence, in Proceedings of Machine Learning Research 244:3264-3278. Available from https://proceedings.mlr.press/v244/sohn24a.html.
