The Implicit and Explicit Regularization Effects of Dropout

Colin Wei; Sham Kakade; Tengyu Ma

Proceedings of the 37th International Conference on Machine Learning, PMLR 119:10181-10192, 2020.

Abstract

Dropout is a widely-used regularization technique, often required to obtain state-of-the-art for a number of architectures. This work demonstrates that dropout introduces two distinct but entangled regularization effects: an explicit effect (also studied in prior work) which occurs since dropout modifies the expected training objective, and, perhaps surprisingly, an additional implicit effect from the stochasticity in the dropout training update. This implicit regularization effect is analogous to the effect of stochasticity in small mini-batch stochastic gradient descent. We disentangle these two effects through controlled experiments. We then derive analytic simplifications which characterize each effect in terms of the derivatives of the model and the loss, for deep neural networks. We demonstrate these simplified, analytic regularizers accurately capture the important aspects of dropout, showing they faithfully replace dropout in practice.
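
The explicit effect described in the abstract can be illustrated in a far simpler setting than the deep networks the paper analyzes. The sketch below is not taken from the paper: it assumes linear regression with squared loss and inverted dropout applied to the inputs, and the function names `explicit_dropout_objective` and `loss_plus_regularizer` are hypothetical. Under those assumptions, averaging the loss over every dropout mask gives the plain loss plus a data-dependent penalty, which is what "dropout modifies the expected training objective" means in this toy case.

```python
import itertools


def explicit_dropout_objective(w, x, y, keep_prob):
    """Exact expected squared loss over all dropout masks on the inputs.

    Assumes inverted dropout: each kept coordinate is rescaled by 1 / keep_prob.
    """
    total = 0.0
    for mask in itertools.product([0, 1], repeat=len(x)):
        # Probability of drawing this particular mask.
        prob = 1.0
        for m in mask:
            prob *= keep_prob if m else (1.0 - keep_prob)
        # Prediction of the linear model on the masked, rescaled input.
        pred = sum(wi * mi * xi / keep_prob for wi, mi, xi in zip(w, mask, x))
        total += prob * (y - pred) ** 2
    return total


def loss_plus_regularizer(w, x, y, keep_prob):
    """Plain squared loss plus the closed-form penalty for this toy model."""
    pred = sum(wi * xi for wi, xi in zip(w, x))
    # Variance of the masked prediction: (1 - p) / p * sum_i (w_i * x_i)^2.
    penalty = (1.0 - keep_prob) / keep_prob * sum(
        (wi * xi) ** 2 for wi, xi in zip(w, x)
    )
    return (y - pred) ** 2 + penalty
```

For any weights, input, and target, the two functions should agree up to floating-point error. A single training step samples only one mask, so its loss fluctuates around this expectation; that fluctuation is the stochasticity the abstract associates with the implicit effect, which this sketch does not model.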

Cite this Paper

BibTeX


@InProceedings{pmlr-v119-wei20d,
  title = 	 {The Implicit and Explicit Regularization Effects of Dropout},
  author =       {Wei, Colin and Kakade, Sham and Ma, Tengyu},
  booktitle = 	 {Proceedings of the 37th International Conference on Machine Learning},
  pages = 	 {10181--10192},
  year = 	 {2020},
  editor = 	 {Daum{\'e} III, Hal and Singh, Aarti},
  volume = 	 {119},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {13--18 Jul},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v119/wei20d/wei20d.pdf},
  url = 	 {https://proceedings.mlr.press/v119/wei20d.html},
  abstract = 	 {Dropout is a widely-used regularization technique, often required to obtain state-of-the-art for a number of architectures. This work demonstrates that dropout introduces two distinct but entangled regularization effects: an explicit effect (also studied in prior work) which occurs since dropout modifies the expected training objective, and, perhaps surprisingly, an additional implicit effect from the stochasticity in the dropout training update. This implicit regularization effect is analogous to the effect of stochasticity in small mini-batch stochastic gradient descent. We disentangle these two effects through controlled experiments. We then derive analytic simplifications which characterize each effect in terms of the derivatives of the model and the loss, for deep neural networks. We demonstrate these simplified, analytic regularizers accurately capture the important aspects of dropout, showing they faithfully replace dropout in practice.}
}

Endnote

%0 Conference Paper
%T The Implicit and Explicit Regularization Effects of Dropout
%A Colin Wei
%A Sham Kakade
%A Tengyu Ma
%B Proceedings of the 37th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2020
%E Hal Daumé III
%E Aarti Singh
%F pmlr-v119-wei20d
%I PMLR
%P 10181--10192
%U https://proceedings.mlr.press/v119/wei20d.html
%V 119
%X Dropout is a widely-used regularization technique, often required to obtain state-of-the-art for a number of architectures. This work demonstrates that dropout introduces two distinct but entangled regularization effects: an explicit effect (also studied in prior work) which occurs since dropout modifies the expected training objective, and, perhaps surprisingly, an additional implicit effect from the stochasticity in the dropout training update. This implicit regularization effect is analogous to the effect of stochasticity in small mini-batch stochastic gradient descent. We disentangle these two effects through controlled experiments. We then derive analytic simplifications which characterize each effect in terms of the derivatives of the model and the loss, for deep neural networks. We demonstrate these simplified, analytic regularizers accurately capture the important aspects of dropout, showing they faithfully replace dropout in practice.

APA


Wei, C., Kakade, S. & Ma, T. (2020). The Implicit and Explicit Regularization Effects of Dropout. Proceedings of the 37th International Conference on Machine Learning, in Proceedings of Machine Learning Research 119:10181-10192. Available from https://proceedings.mlr.press/v119/wei20d.html.
