Robust Fine-Tuning of Deep Neural Networks with Hessian-based Generalization Guarantees

Haotian Ju; Dongyue Li; Hongyang R Zhang

Proceedings of the 39th International Conference on Machine Learning, PMLR 162:10431-10461, 2022.

Abstract

We consider transfer learning approaches that fine-tune a pretrained deep neural network on a target task. We investigate generalization properties of fine-tuning to understand the problem of overfitting, which often happens in practice. Previous works have shown that constraining the distance from the initialization of fine-tuning improves generalization. Using a PAC-Bayesian analysis, we observe that besides distance from initialization, Hessians affect generalization through the noise stability of deep neural networks against noise injections. Motivated by the observation, we develop Hessian distance-based generalization bounds for a wide range of fine-tuning methods. Next, we investigate the robustness of fine-tuning with noisy labels. We design an algorithm that incorporates consistent losses and distance-based regularization for fine-tuning. Additionally, we prove a generalization error bound of our algorithm under class conditional independent noise in the training dataset labels. We perform a detailed empirical study of our algorithm on various noisy environments and architectures. For example, on six image classification tasks whose training labels are generated with programmatic labeling, we show a 3.26% accuracy improvement over prior methods. Meanwhile, the Hessian distance measure of the fine-tuned network using our algorithm decreases by six times more than existing approaches.

Cite this Paper

BibTeX


@InProceedings{pmlr-v162-ju22a,
  title =        {Robust Fine-Tuning of Deep Neural Networks with Hessian-based Generalization Guarantees},
  author =       {Ju, Haotian and Li, Dongyue and Zhang, Hongyang R},
  booktitle =    {Proceedings of the 39th International Conference on Machine Learning},
  pages =        {10431--10461},
  year =         {2022},
  editor =       {Chaudhuri, Kamalika and Jegelka, Stefanie and Song, Le and Szepesvari, Csaba and Niu, Gang and Sabato, Sivan},
  volume =       {162},
  series =       {Proceedings of Machine Learning Research},
  month =        {17--23 Jul},
  publisher =    {PMLR},
  pdf =          {https://proceedings.mlr.press/v162/ju22a/ju22a.pdf},
  url =          {https://proceedings.mlr.press/v162/ju22a.html},
  abstract =     {We consider transfer learning approaches that fine-tune a pretrained deep neural network on a target task. We investigate generalization properties of fine-tuning to understand the problem of overfitting, which often happens in practice. Previous works have shown that constraining the distance from the initialization of fine-tuning improves generalization. Using a PAC-Bayesian analysis, we observe that besides distance from initialization, Hessians affect generalization through the noise stability of deep neural networks against noise injections. Motivated by the observation, we develop Hessian distance-based generalization bounds for a wide range of fine-tuning methods. Next, we investigate the robustness of fine-tuning with noisy labels. We design an algorithm that incorporates consistent losses and distance-based regularization for fine-tuning. Additionally, we prove a generalization error bound of our algorithm under class conditional independent noise in the training dataset labels. We perform a detailed empirical study of our algorithm on various noisy environments and architectures. For example, on six image classification tasks whose training labels are generated with programmatic labeling, we show a 3.26\% accuracy improvement over prior methods. Meanwhile, the Hessian distance measure of the fine-tuned network using our algorithm decreases by six times more than existing approaches.}
}

Endnote

%0 Conference Paper
%T Robust Fine-Tuning of Deep Neural Networks with Hessian-based Generalization Guarantees
%A Haotian Ju
%A Dongyue Li
%A Hongyang R Zhang
%B Proceedings of the 39th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2022
%E Kamalika Chaudhuri
%E Stefanie Jegelka
%E Le Song
%E Csaba Szepesvari
%E Gang Niu
%E Sivan Sabato
%F pmlr-v162-ju22a
%I PMLR
%P 10431--10461
%U https://proceedings.mlr.press/v162/ju22a.html
%V 162
%X We consider transfer learning approaches that fine-tune a pretrained deep neural network on a target task. We investigate generalization properties of fine-tuning to understand the problem of overfitting, which often happens in practice. Previous works have shown that constraining the distance from the initialization of fine-tuning improves generalization. Using a PAC-Bayesian analysis, we observe that besides distance from initialization, Hessians affect generalization through the noise stability of deep neural networks against noise injections. Motivated by the observation, we develop Hessian distance-based generalization bounds for a wide range of fine-tuning methods. Next, we investigate the robustness of fine-tuning with noisy labels. We design an algorithm that incorporates consistent losses and distance-based regularization for fine-tuning. Additionally, we prove a generalization error bound of our algorithm under class conditional independent noise in the training dataset labels. We perform a detailed empirical study of our algorithm on various noisy environments and architectures. For example, on six image classification tasks whose training labels are generated with programmatic labeling, we show a 3.26% accuracy improvement over prior methods. Meanwhile, the Hessian distance measure of the fine-tuned network using our algorithm decreases by six times more than existing approaches.

APA


Ju, H., Li, D. & Zhang, H.R. (2022). Robust Fine-Tuning of Deep Neural Networks with Hessian-based Generalization Guarantees. Proceedings of the 39th International Conference on Machine Learning, in Proceedings of Machine Learning Research 162:10431-10461. Available from https://proceedings.mlr.press/v162/ju22a.html.
