Interpolation between Residual and Non-Residual Networks

Zonghan Yang, Yang Liu, Chenglong Bao, Zuoqiang Shi
Proceedings of the 37th International Conference on Machine Learning, PMLR 119:10736-10745, 2020.

Abstract

Although ordinary differential equations (ODEs) provide insights for designing network architectures, their relationship with the non-residual convolutional neural networks (CNNs) is still unclear. In this paper, we present a novel ODE model by adding a damping term. It can be shown that the proposed model can recover both a ResNet and a CNN by adjusting an interpolation coefficient. Therefore, the damped ODE model provides a unified framework for the interpretation of residual and non-residual networks. The Lyapunov analysis reveals better stability of the proposed model, and thus yields robustness improvement of the learned networks. Experiments on a number of image classification benchmarks show that the proposed model substantially improves the accuracy of ResNet and ResNeXt over the perturbed inputs from both stochastic noise and adversarial attack methods. Moreover, the loss landscape analysis demonstrates the improved robustness of our method along the attack direction.
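
The interpolation idea in the abstract can be pictured with a small sketch. This is only an illustration under stated assumptions, not the paper's exact formulation: it assumes a damped ODE of the form dx/dt = -(1 - lam) * x + f(x) and a forward-Euler step of size 1, which gives the update x_next = lam * x + f(x). The names `interpolated_block`, `layer`, and `lam` are hypothetical, and `layer` is a fixed stand-in for a learned convolutional layer.

```python
def layer(x):
    # Hypothetical stand-in for a learned layer f: a fixed affine map followed by ReLU.
    return [max(0.0, 0.5 * xi - 0.1) for xi in x]


def interpolated_block(x, f, lam):
    # Assumed update x_next = lam * x + f(x), i.e. one forward-Euler step (size 1)
    # of the assumed damped ODE dx/dt = -(1 - lam) * x + f(x).
    return [lam * xi + fi for xi, fi in zip(x, f(x))]


x = [1.0, -2.0, 0.5]

# lam = 1 removes the damping term: the block is a residual block, x + f(x).
residual_out = interpolated_block(x, layer, 1.0)

# lam = 0 cancels the skip connection: the block is a plain (non-residual) layer, f(x).
plain_out = interpolated_block(x, layer, 0.0)

# Intermediate values of lam interpolate between the two.
mixed_out = interpolated_block(x, layer, 0.5)

print(residual_out, plain_out, mixed_out)
```

In this sketch the coefficient is a fixed scalar; whether it is shared, per-layer, or learned is a design choice the abstract does not specify.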

Cite this Paper


BibTeX
@InProceedings{pmlr-v119-yang20g,
  title = {Interpolation between Residual and Non-Residual Networks},
  author = {Yang, Zonghan and Liu, Yang and Bao, Chenglong and Shi, Zuoqiang},
  booktitle = {Proceedings of the 37th International Conference on Machine Learning},
  pages = {10736--10745},
  year = {2020},
  editor = {Daumé III, Hal and Singh, Aarti},
  volume = {119},
  series = {Proceedings of Machine Learning Research},
  month = {13--18 Jul},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v119/yang20g/yang20g.pdf},
  url = {https://proceedings.mlr.press/v119/yang20g.html},
  abstract = {Although ordinary differential equations (ODEs) provide insights for designing network architectures, their relationship with the non-residual convolutional neural networks (CNNs) is still unclear. In this paper, we present a novel ODE model by adding a damping term. It can be shown that the proposed model can recover both a ResNet and a CNN by adjusting an interpolation coefficient. Therefore, the damped ODE model provides a unified framework for the interpretation of residual and non-residual networks. The Lyapunov analysis reveals better stability of the proposed model, and thus yields robustness improvement of the learned networks. Experiments on a number of image classification benchmarks show that the proposed model substantially improves the accuracy of ResNet and ResNeXt over the perturbed inputs from both stochastic noise and adversarial attack methods. Moreover, the loss landscape analysis demonstrates the improved robustness of our method along the attack direction.}
}
Endnote
%0 Conference Paper
%T Interpolation between Residual and Non-Residual Networks
%A Zonghan Yang
%A Yang Liu
%A Chenglong Bao
%A Zuoqiang Shi
%B Proceedings of the 37th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2020
%E Hal Daumé III
%E Aarti Singh
%F pmlr-v119-yang20g
%I PMLR
%P 10736--10745
%U https://proceedings.mlr.press/v119/yang20g.html
%V 119
%X Although ordinary differential equations (ODEs) provide insights for designing network architectures, their relationship with the non-residual convolutional neural networks (CNNs) is still unclear. In this paper, we present a novel ODE model by adding a damping term. It can be shown that the proposed model can recover both a ResNet and a CNN by adjusting an interpolation coefficient. Therefore, the damped ODE model provides a unified framework for the interpretation of residual and non-residual networks. The Lyapunov analysis reveals better stability of the proposed model, and thus yields robustness improvement of the learned networks. Experiments on a number of image classification benchmarks show that the proposed model substantially improves the accuracy of ResNet and ResNeXt over the perturbed inputs from both stochastic noise and adversarial attack methods. Moreover, the loss landscape analysis demonstrates the improved robustness of our method along the attack direction.
APA
Yang, Z., Liu, Y., Bao, C. & Shi, Z. (2020). Interpolation between Residual and Non-Residual Networks. Proceedings of the 37th International Conference on Machine Learning, in Proceedings of Machine Learning Research 119:10736-10745. Available from https://proceedings.mlr.press/v119/yang20g.html.
