Hypernetwork-based Implicit Posterior Estimation and Model Averaging of CNN
Proceedings of The 10th Asian Conference on Machine Learning, PMLR 95:176-191, 2018.
Abstract
Deep neural networks have a rich ability to learn complex representations and have achieved remarkable results in various tasks. However, they are prone to overfitting when the number of training samples is limited, so regularizing the learning process is critical. In this paper, we propose a novel regularization method that estimates the parameters of a large convolutional neural network as implicit probabilistic distributions generated by a hypernetwork. Because the hypernetwork yields a distribution over parameters rather than a single point estimate, we can also perform model averaging to improve network performance. Experimental results demonstrate that our regularization method outperforms the commonly used maximum a posteriori (MAP) estimation.
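
The idea of sampling parameters from a hypernetwork and averaging the resulting predictions can be illustrated with a minimal sketch. This is not the architecture or training procedure of the paper: a linear softmax classifier stands in for the large CNN, the hypernetwork is an untrained two-layer MLP with random weights, and every name (`hypernetwork`, `primary_predict`, `model_average`, the keys of `H`) is hypothetical. It only shows how noise is mapped to parameters and how predictive probabilities are averaged over several samples.

```python
import numpy as np

def hypernetwork(z, H):
    """Map a noise vector z to a flat parameter vector for the primary network."""
    h = np.tanh(z @ H["W1"] + H["b1"])
    return h @ H["W2"] + H["b2"]

def primary_predict(x, flat_w, n_in, n_out):
    """Linear softmax classifier (stand-in for a CNN) using the generated weights."""
    W = flat_w[: n_in * n_out].reshape(n_in, n_out)
    b = flat_w[n_in * n_out:]
    logits = x @ W + b
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(logits)
    return p / p.sum(axis=1, keepdims=True)

def model_average(x, H, n_in, n_out, n_samples, noise_dim, rng):
    """Average predictive probabilities over parameters sampled via the hypernetwork."""
    probs = np.zeros((x.shape[0], n_out))
    for _ in range(n_samples):
        z = rng.standard_normal(noise_dim)  # each z yields one parameter sample
        probs += primary_predict(x, hypernetwork(z, H), n_in, n_out)
    return probs / n_samples

# Toy setup with assumed sizes; the hypernetwork weights are random, not trained.
rng = np.random.default_rng(0)
n_in, n_out, noise_dim, hidden = 5, 3, 8, 16
n_params = n_in * n_out + n_out
H = {
    "W1": rng.standard_normal((noise_dim, hidden)) * 0.1,
    "b1": np.zeros(hidden),
    "W2": rng.standard_normal((hidden, n_params)) * 0.1,
    "b2": np.zeros(n_params),
}
x = rng.standard_normal((4, n_in))
avg = model_average(x, H, n_in, n_out, n_samples=10, noise_dim=noise_dim, rng=rng)
```

Each row of `avg` is a probability vector over the classes. Averaging probabilities rather than parameters is one common choice for model averaging; the abstract does not specify which variant is used.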