Maxout Networks

Ian Goodfellow, David Warde-Farley, Mehdi Mirza, Aaron Courville, Yoshua Bengio
Proceedings of the 30th International Conference on Machine Learning, PMLR 28(3):1319-1327, 2013.

Abstract

We consider the problem of designing models to leverage a recently introduced approximate model averaging technique called dropout. We define a simple new model called maxout (so named because its output is the max of a set of inputs, and because it is a natural companion to dropout) designed to both facilitate optimization by dropout and improve the accuracy of dropout's fast approximate model averaging technique. We empirically verify that the model successfully accomplishes both of these tasks. We use maxout and dropout to demonstrate state-of-the-art classification performance on four benchmark datasets: MNIST, CIFAR-10, CIFAR-100, and SVHN.
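The abstract's definition of a maxout unit (each output is the maximum over k learned affine functions of the input) is easy to state in code. The following is a minimal NumPy sketch, not the authors' implementation; the shapes, names, and the inverted-dropout helper are illustrative assumptions.

    import numpy as np

    def maxout_forward(x, W, b):
        """Maxout layer sketch: h[n, i] = max_j (x[n] @ W[:, i, j] + b[i, j]).

        x: (batch, d) inputs
        W: (d, m, k) weights -- m maxout units, each with k linear pieces
        b: (m, k) biases
        Returns (batch, m) activations: the max over the k pieces per unit.
        """
        z = np.einsum('nd,dmk->nmk', x, W) + b  # all k affine pieces at once
        return z.max(axis=-1)                   # maxout nonlinearity

    def dropout(h, p=0.5, rng=None):
        """Inverted dropout (illustrative): zero units with prob p, rescale."""
        rng = rng or np.random.default_rng(0)
        mask = rng.random(h.shape) >= p
        return h * mask / (1.0 - p)

    # Toy usage: 32 examples, 100 input features, 240 maxout units with k = 5.
    rng = np.random.default_rng(0)
    x = rng.standard_normal((32, 100))
    W = 0.01 * rng.standard_normal((100, 240, 5))
    b = np.zeros((240, 5))
    h = dropout(maxout_forward(x, W, b))  # (32, 240) hidden activations

Because each unit takes the max of k learned affine functions, a maxout layer realizes a piecewise-linear convex activation learned per unit; the paper argues this structure both eases optimization under dropout and makes dropout's fast approximate model averaging more accurate.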

Cite this Paper


BibTeX
@InProceedings{pmlr-v28-goodfellow13,
  title     = {Maxout Networks},
  author    = {Goodfellow, Ian and Warde-Farley, David and Mirza, Mehdi and Courville, Aaron and Bengio, Yoshua},
  booktitle = {Proceedings of the 30th International Conference on Machine Learning},
  pages     = {1319--1327},
  year      = {2013},
  editor    = {Dasgupta, Sanjoy and McAllester, David},
  volume    = {28},
  number    = {3},
  series    = {Proceedings of Machine Learning Research},
  address   = {Atlanta, Georgia, USA},
  month     = {17--19 Jun},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v28/goodfellow13.pdf},
  url       = {https://proceedings.mlr.press/v28/goodfellow13.html},
  abstract  = {We consider the problem of designing models to leverage a recently introduced approximate model averaging technique called dropout. We define a simple new model called maxout (so named because its output is the max of a set of inputs, and because it is a natural companion to dropout) designed to both facilitate optimization by dropout and improve the accuracy of dropout's fast approximate model averaging technique. We empirically verify that the model successfully accomplishes both of these tasks. We use maxout and dropout to demonstrate state of the art classification performance on four benchmark datasets: MNIST, CIFAR-10, CIFAR-100, and SVHN.}
}
Endnote
%0 Conference Paper
%T Maxout Networks
%A Ian Goodfellow
%A David Warde-Farley
%A Mehdi Mirza
%A Aaron Courville
%A Yoshua Bengio
%B Proceedings of the 30th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2013
%E Sanjoy Dasgupta
%E David McAllester
%F pmlr-v28-goodfellow13
%I PMLR
%P 1319--1327
%U https://proceedings.mlr.press/v28/goodfellow13.html
%V 28
%N 3
%X We consider the problem of designing models to leverage a recently introduced approximate model averaging technique called dropout. We define a simple new model called maxout (so named because its output is the max of a set of inputs, and because it is a natural companion to dropout) designed to both facilitate optimization by dropout and improve the accuracy of dropout's fast approximate model averaging technique. We empirically verify that the model successfully accomplishes both of these tasks. We use maxout and dropout to demonstrate state of the art classification performance on four benchmark datasets: MNIST, CIFAR-10, CIFAR-100, and SVHN.
RIS
TY - CPAPER
TI - Maxout Networks
AU - Ian Goodfellow
AU - David Warde-Farley
AU - Mehdi Mirza
AU - Aaron Courville
AU - Yoshua Bengio
BT - Proceedings of the 30th International Conference on Machine Learning
DA - 2013/05/26
ED - Sanjoy Dasgupta
ED - David McAllester
ID - pmlr-v28-goodfellow13
PB - PMLR
DP - Proceedings of Machine Learning Research
VL - 28
IS - 3
SP - 1319
EP - 1327
L1 - http://proceedings.mlr.press/v28/goodfellow13.pdf
UR - https://proceedings.mlr.press/v28/goodfellow13.html
AB - We consider the problem of designing models to leverage a recently introduced approximate model averaging technique called dropout. We define a simple new model called maxout (so named because its output is the max of a set of inputs, and because it is a natural companion to dropout) designed to both facilitate optimization by dropout and improve the accuracy of dropout's fast approximate model averaging technique. We empirically verify that the model successfully accomplishes both of these tasks. We use maxout and dropout to demonstrate state of the art classification performance on four benchmark datasets: MNIST, CIFAR-10, CIFAR-100, and SVHN.
ER -
APA
Goodfellow, I., Warde-Farley, D., Mirza, M., Courville, A. & Bengio, Y. (2013). Maxout Networks. Proceedings of the 30th International Conference on Machine Learning, in Proceedings of Machine Learning Research 28(3):1319-1327. Available from https://proceedings.mlr.press/v28/goodfellow13.html.
