Characterizing and Understanding the Generalization Error of Transfer Learning with Gibbs Algorithm

Yuheng Bu, Gholamali Aminian, Laura Toni, Gregory W. Wornell, Miguel Rodrigues
Proceedings of The 25th International Conference on Artificial Intelligence and Statistics, PMLR 151:8673-8699, 2022.

Abstract

We provide an information-theoretic analysis of the generalization ability of Gibbs-based transfer learning algorithms by focusing on two popular empirical risk minimization (ERM) approaches for transfer learning, $\alpha$-weighted-ERM and two-stage-ERM. Our key result is an exact characterization of the generalization behavior using the conditional symmetrized Kullback-Leibler (KL) information between the output hypothesis and the target training samples given the source training samples. Our results can also be applied to provide novel distribution-free generalization error upper bounds on these two aforementioned Gibbs algorithms. Our approach is versatile, as it also characterizes the generalization errors and excess risks of these two Gibbs algorithms in the asymptotic regime, where they converge to the $\alpha$-weighted-ERM and two-stage-ERM, respectively. Based on our theoretical results, we show that the benefits of transfer learning can be viewed as a bias-variance trade-off, with the bias induced by the source distribution and the variance induced by the lack of target samples. We believe this viewpoint can guide the choice of transfer learning algorithms in practice.
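To make the setup concrete, the following sketch writes out an $\alpha$-weighted empirical risk and the corresponding Gibbs posterior. The notation here ($S_T$, $S_S$, $\gamma$, $\pi$) is ours and is only meant to illustrate the kind of algorithm the paper analyzes, not to reproduce its exact definitions.

% Illustrative sketch in our own notation (an assumption, not the paper's exact formulation).
% alpha-weighted empirical risk over n target samples S_T and m source samples S_S:
\[
  \hat{L}_{\alpha}(w, S_T, S_S)
    \;=\; \alpha \,\frac{1}{n}\sum_{i=1}^{n} \ell\bigl(w, z^{T}_{i}\bigr)
    \;+\; (1-\alpha)\,\frac{1}{m}\sum_{j=1}^{m} \ell\bigl(w, z^{S}_{j}\bigr),
  \qquad \alpha \in [0,1].
\]
% The alpha-weighted Gibbs algorithm samples a hypothesis W from a posterior proportional to
% a prior pi(w) tilted by the empirical risk at inverse temperature gamma:
\[
  P_{W \mid S_T, S_S}(w)
    \;\propto\; \pi(w)\, \exp\!\bigl(-\gamma\, \hat{L}_{\alpha}(w, S_T, S_S)\bigr).
\]
% As gamma grows, this posterior concentrates on the minimizers of the alpha-weighted
% empirical risk, i.e., the alpha-weighted-ERM solution mentioned in the abstract.

Roughly speaking, the two-stage variant instead learns part of the hypothesis from the source samples first and then, holding that part fixed, learns the remaining part from the target samples; its Gibbs form is analogous, with one posterior per stage.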

Cite this Paper


BibTeX
@InProceedings{pmlr-v151-bu22a,
  title     = {Characterizing and Understanding the Generalization Error of Transfer Learning with Gibbs Algorithm},
  author    = {Bu, Yuheng and Aminian, Gholamali and Toni, Laura and Wornell, Gregory W. and Rodrigues, Miguel},
  booktitle = {Proceedings of The 25th International Conference on Artificial Intelligence and Statistics},
  pages     = {8673--8699},
  year      = {2022},
  editor    = {Camps-Valls, Gustau and Ruiz, Francisco J. R. and Valera, Isabel},
  volume    = {151},
  series    = {Proceedings of Machine Learning Research},
  month     = {28--30 Mar},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v151/bu22a/bu22a.pdf},
  url       = {https://proceedings.mlr.press/v151/bu22a.html},
  abstract  = {We provide an information-theoretic analysis of the generalization ability of Gibbs-based transfer learning algorithms by focusing on two popular empirical risk minimization (ERM) approaches for transfer learning, $\alpha$-weighted-ERM and two-stage-ERM. Our key result is an exact characterization of the generalization behavior using the conditional symmetrized Kullback-Leibler (KL) information between the output hypothesis and the target training samples given the source training samples. Our results can also be applied to provide novel distribution-free generalization error upper bounds on these two aforementioned Gibbs algorithms. Our approach is versatile, as it also characterizes the generalization errors and excess risks of these two Gibbs algorithms in the asymptotic regime, where they converge to the $\alpha$-weighted-ERM and two-stage-ERM, respectively. Based on our theoretical results, we show that the benefits of transfer learning can be viewed as a bias-variance trade-off, with the bias induced by the source distribution and the variance induced by the lack of target samples. We believe this viewpoint can guide the choice of transfer learning algorithms in practice.}
}
Endnote
%0 Conference Paper
%T Characterizing and Understanding the Generalization Error of Transfer Learning with Gibbs Algorithm
%A Yuheng Bu
%A Gholamali Aminian
%A Laura Toni
%A Gregory W. Wornell
%A Miguel Rodrigues
%B Proceedings of The 25th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2022
%E Gustau Camps-Valls
%E Francisco J. R. Ruiz
%E Isabel Valera
%F pmlr-v151-bu22a
%I PMLR
%P 8673--8699
%U https://proceedings.mlr.press/v151/bu22a.html
%V 151
%X We provide an information-theoretic analysis of the generalization ability of Gibbs-based transfer learning algorithms by focusing on two popular empirical risk minimization (ERM) approaches for transfer learning, $\alpha$-weighted-ERM and two-stage-ERM. Our key result is an exact characterization of the generalization behavior using the conditional symmetrized Kullback-Leibler (KL) information between the output hypothesis and the target training samples given the source training samples. Our results can also be applied to provide novel distribution-free generalization error upper bounds on these two aforementioned Gibbs algorithms. Our approach is versatile, as it also characterizes the generalization errors and excess risks of these two Gibbs algorithms in the asymptotic regime, where they converge to the $\alpha$-weighted-ERM and two-stage-ERM, respectively. Based on our theoretical results, we show that the benefits of transfer learning can be viewed as a bias-variance trade-off, with the bias induced by the source distribution and the variance induced by the lack of target samples. We believe this viewpoint can guide the choice of transfer learning algorithms in practice.
APA
Bu, Y., Aminian, G., Toni, L., Wornell, G. W. & Rodrigues, M. (2022). Characterizing and Understanding the Generalization Error of Transfer Learning with Gibbs Algorithm. Proceedings of The 25th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 151:8673-8699. Available from https://proceedings.mlr.press/v151/bu22a.html.