Functional Gradient Boosting for Learning Residual-like Networks with Statistical Guarantees

Atsushi Nitanda; Taiji Suzuki

Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics, PMLR 108:2981-2991, 2020.

Abstract

Recently, several studies have proposed progressive or sequential layer-wise training methods for deep neural networks based on boosting theory. However, most of these studies lack global convergence guarantees or require weak learning conditions that can be verified only a posteriori, after running the methods. Moreover, their generalization bounds usually have an unfavorable dependence on network depth. To resolve these problems, we propose a new functional gradient boosting method for learning deep residual-like networks in a layer-wise fashion, with statistical guarantees on multi-class classification tasks. In the proposed method, each residual block is regarded as a functional gradient (i.e., a weak learner), and the functional gradient step is performed by stacking the block on the network, resulting in a strong optimization ability. In the theoretical analysis, we show the global convergence of the method under a standard margin assumption on the data distribution instead of a weak learning condition, and we eliminate the unfavorable dependence on network depth in the generalization bound via a fine-grained convergence analysis. Moreover, we show that the existence of a learnable function with a large margin on the training dataset significantly improves the generalization bound. Finally, we experimentally demonstrate that the proposed method is effective for learning deep residual networks.

Cite this Paper

BibTeX


@InProceedings{pmlr-v108-nitanda20a,
  title = 	 {Functional Gradient Boosting for Learning Residual-like Networks with Statistical Guarantees},
  author =       {Nitanda, Atsushi and Suzuki, Taiji},
  booktitle = 	 {Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics},
  pages = 	 {2981--2991},
  year = 	 {2020},
  editor = 	 {Chiappa, Silvia and Calandra, Roberto},
  volume = 	 {108},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {26--28 Aug},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v108/nitanda20a/nitanda20a.pdf},
  url = 	 {https://proceedings.mlr.press/v108/nitanda20a.html},
  abstract = 	 {Recently, several studies have proposed progressive or sequential layer-wise training methods for deep neural networks based on boosting theory. However, most of these studies lack global convergence guarantees or require weak learning conditions that can be verified only a posteriori, after running the methods. Moreover, their generalization bounds usually have an unfavorable dependence on network depth. To resolve these problems, we propose a new functional gradient boosting method for learning deep residual-like networks in a layer-wise fashion, with statistical guarantees on multi-class classification tasks. In the proposed method, each residual block is regarded as a functional gradient (i.e., a weak learner), and the functional gradient step is performed by stacking the block on the network, resulting in a strong optimization ability. In the theoretical analysis, we show the global convergence of the method under a standard margin assumption on the data distribution instead of a weak learning condition, and we eliminate the unfavorable dependence on network depth in the generalization bound via a fine-grained convergence analysis. Moreover, we show that the existence of a learnable function with a large margin on the training dataset significantly improves the generalization bound. Finally, we experimentally demonstrate that the proposed method is effective for learning deep residual networks.}
}

Endnote

%0 Conference Paper
%T Functional Gradient Boosting for Learning Residual-like Networks with Statistical Guarantees
%A Atsushi Nitanda
%A Taiji Suzuki
%B Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2020
%E Silvia Chiappa
%E Roberto Calandra
%F pmlr-v108-nitanda20a
%I PMLR
%P 2981--2991
%U https://proceedings.mlr.press/v108/nitanda20a.html
%V 108
%X Recently, several studies have proposed progressive or sequential layer-wise training methods for deep neural networks based on boosting theory. However, most of these studies lack global convergence guarantees or require weak learning conditions that can be verified only a posteriori, after running the methods. Moreover, their generalization bounds usually have an unfavorable dependence on network depth. To resolve these problems, we propose a new functional gradient boosting method for learning deep residual-like networks in a layer-wise fashion, with statistical guarantees on multi-class classification tasks. In the proposed method, each residual block is regarded as a functional gradient (i.e., a weak learner), and the functional gradient step is performed by stacking the block on the network, resulting in a strong optimization ability. In the theoretical analysis, we show the global convergence of the method under a standard margin assumption on the data distribution instead of a weak learning condition, and we eliminate the unfavorable dependence on network depth in the generalization bound via a fine-grained convergence analysis. Moreover, we show that the existence of a learnable function with a large margin on the training dataset significantly improves the generalization bound. Finally, we experimentally demonstrate that the proposed method is effective for learning deep residual networks.

APA


Nitanda, A. & Suzuki, T. (2020). Functional Gradient Boosting for Learning Residual-like Networks with Statistical Guarantees. Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 108:2981-2991. Available from https://proceedings.mlr.press/v108/nitanda20a.html.
