ConvNets with Smooth Adaptive Activation Functions for Regression

Le Hou, Dimitris Samaras, Tahsin Kurc, Yi Gao, Joel Saltz
Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, PMLR 54:430-439, 2017.

Abstract

Within Neural Networks (NNs), the parameters of Adaptive Activation Functions (AAFs) control the shapes of the activation functions. These parameters are trained along with the other parameters of the NN. AAFs have improved the performance of Convolutional Neural Networks (CNNs) on multiple classification tasks. In this paper, we propose and apply AAFs to CNNs for regression tasks. We argue that applying AAFs in the regression (second-to-last) layer of an NN can significantly decrease the bias of the regression NN. However, using existing AAFs may lead to overfitting. To address this problem, we propose a Smooth Adaptive Activation Function (SAAF) with a piecewise polynomial form that can approximate any continuous function to an arbitrary degree of accuracy while having a bounded Lipschitz constant for given bounded model parameters. As a result, NNs with SAAFs can avoid overfitting simply by regularizing their model parameters. We empirically evaluated CNNs with SAAFs and achieved state-of-the-art results on age and pose estimation datasets.
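The key property claimed in the abstract, that a piecewise polynomial activation with bounded parameters has a bounded Lipschitz constant, can be illustrated with a small sketch. This is not the authors' exact SAAF formulation (the paper should be consulted for that); it is a hypothetical C^1 piecewise-quadratic function built by integrating a per-segment learnable second derivative twice, so bounding the coefficients bounds how fast the slope can change:

```python
def saaf_like(x, coeffs, breakpoints):
    """Illustrative piecewise-quadratic adaptive activation (not the paper's exact SAAF).

    coeffs[k] is a learnable constant second derivative on segment
    [breakpoints[k], breakpoints[k+1]]. Integrating twice gives a C^1
    piecewise-quadratic function; the slope over the breakpoint range is
    bounded by sum(|coeffs[k]| * width_k), so regularizing the parameters
    bounds the Lipschitz constant on that range.
    """
    widths = [breakpoints[i + 1] - breakpoints[i] for i in range(len(coeffs))]
    # Slopes s and values v at each breakpoint, by cumulative integration
    # starting from f(b_0) = 0, f'(b_0) = 0.
    s, v = [0.0], [0.0]
    for c, w in zip(coeffs, widths):
        v.append(v[-1] + s[-1] * w + 0.5 * c * w * w)
        s.append(s[-1] + c * w)
    # Locate the segment containing x (clamped to the outer segments).
    k = 0
    while k < len(coeffs) - 1 and x >= breakpoints[k + 1]:
        k += 1
    d = x - breakpoints[k]
    return v[k] + s[k] * d + 0.5 * coeffs[k] * d * d
```

For example, with a single segment on [0, 1] and second derivative 2, the function reduces to f(x) = x^2 on that segment. In a network, the `coeffs` would be trained by backpropagation along with the other weights.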

Cite this Paper


BibTeX
@InProceedings{pmlr-v54-hou17a,
  title     = {{ConvNets with Smooth Adaptive Activation Functions for Regression}},
  author    = {Hou, Le and Samaras, Dimitris and Kurc, Tahsin and Gao, Yi and Saltz, Joel},
  booktitle = {Proceedings of the 20th International Conference on Artificial Intelligence and Statistics},
  pages     = {430--439},
  year      = {2017},
  editor    = {Singh, Aarti and Zhu, Jerry},
  volume    = {54},
  series    = {Proceedings of Machine Learning Research},
  month     = {20--22 Apr},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v54/hou17a/hou17a.pdf},
  url       = {https://proceedings.mlr.press/v54/hou17a.html},
  abstract  = {Within Neural Networks (NN), the parameters of Adaptive Activation Functions (AAF) control the shapes of activation functions. These parameters are trained along with other parameters in the NN. AAFs have improved performance of Convolutional Neural Networks (CNN) in multiple classification tasks. In this paper, we propose and apply AAFs on CNNs for regression tasks. We argue that applying AAFs in the regression (second-to-last) layer of a NN can significantly decrease the bias of the regression NN. However, using existing AAFs may lead to overfitting. To address this problem, we propose a Smooth Adaptive Activation Function (SAAF) with a piecewise polynomial form which can approximate any continuous function to arbitrary degree of error, while having a bounded Lipschitz constant for given bounded model parameters. As a result, NNs with SAAF can avoid overfitting by simply regularizing model parameters. We empirically evaluated CNNs with SAAFs and achieved state-of-the-art results on age and pose estimation datasets.}
}
Endnote
%0 Conference Paper
%T ConvNets with Smooth Adaptive Activation Functions for Regression
%A Le Hou
%A Dimitris Samaras
%A Tahsin Kurc
%A Yi Gao
%A Joel Saltz
%B Proceedings of the 20th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2017
%E Aarti Singh
%E Jerry Zhu
%F pmlr-v54-hou17a
%I PMLR
%P 430--439
%U https://proceedings.mlr.press/v54/hou17a.html
%V 54
%X Within Neural Networks (NN), the parameters of Adaptive Activation Functions (AAF) control the shapes of activation functions. These parameters are trained along with other parameters in the NN. AAFs have improved performance of Convolutional Neural Networks (CNN) in multiple classification tasks. In this paper, we propose and apply AAFs on CNNs for regression tasks. We argue that applying AAFs in the regression (second-to-last) layer of a NN can significantly decrease the bias of the regression NN. However, using existing AAFs may lead to overfitting. To address this problem, we propose a Smooth Adaptive Activation Function (SAAF) with a piecewise polynomial form which can approximate any continuous function to arbitrary degree of error, while having a bounded Lipschitz constant for given bounded model parameters. As a result, NNs with SAAF can avoid overfitting by simply regularizing model parameters. We empirically evaluated CNNs with SAAFs and achieved state-of-the-art results on age and pose estimation datasets.
APA
Hou, L., Samaras, D., Kurc, T., Gao, Y. & Saltz, J. (2017). ConvNets with Smooth Adaptive Activation Functions for Regression. Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 54:430-439. Available from https://proceedings.mlr.press/v54/hou17a.html.