Invertible Residual Networks

Jens Behrmann, Will Grathwohl, Ricky T. Q. Chen, David Duvenaud, Joern-Henrik Jacobsen
Proceedings of the 36th International Conference on Machine Learning, PMLR 97:573-582, 2019.

Abstract

We show that standard ResNet architectures can be made invertible, allowing the same model to be used for classification, density estimation, and generation. Typically, enforcing invertibility requires partitioning dimensions or restricting network architectures. In contrast, our approach only requires adding a simple normalization step during training, already available in standard frameworks. Invertible ResNets define a generative model which can be trained by maximum likelihood on unlabeled data. To compute likelihoods, we introduce a tractable approximation to the Jacobian log-determinant of a residual block. Our empirical evaluation shows that invertible ResNets perform competitively with both state-of-the-art image classifiers and flow-based generative models, something that has not been previously achieved with a single architecture.
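The abstract does not spell out the normalization step. A minimal sketch, assuming it keeps the Lipschitz constant of the residual branch g below 1 (here by rescaling a weight matrix by its estimated spectral norm): under that assumption the block y = x + g(x) can be inverted with the fixed-point iteration x <- y - g(x), which converges by the Banach fixed-point theorem. The function names, the tanh residual branch, and the coefficient c = 0.9 are hypothetical choices for illustration, not taken from the paper or its code.

```python
import math


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def spectral_norm(W, iters=100):
    """Estimate the largest singular value of W by power iteration on W^T W."""
    Wt = [list(col) for col in zip(*W)]
    v = [1.0] * len(W[0])
    for _ in range(iters):
        v = matvec(Wt, matvec(W, v))
        norm = math.sqrt(sum(vi * vi for vi in v))
        v = [vi / norm for vi in v]
    u = matvec(W, v)
    return math.sqrt(sum(ui * ui for ui in u))


def normalize(W, c=0.9):
    """Rescale W so its spectral norm is at most c < 1 (c is an assumed value)."""
    scale = min(1.0, c / spectral_norm(W))
    return [[w * scale for w in row] for row in W]


def g(W, x):
    """Residual branch g(x) = tanh(W x); tanh is 1-Lipschitz, so Lip(g) <= ||W||_2."""
    return [math.tanh(z) for z in matvec(W, x)]


def residual_forward(W, x):
    """Forward pass of one residual block: y = x + g(x)."""
    return [xi + gi for xi, gi in zip(x, g(W, x))]


def residual_inverse(W, y, iters=200):
    """Invert the block by iterating x <- y - g(x), starting from x = y."""
    x = list(y)
    for _ in range(iters):
        x = [yi - gi for yi, gi in zip(y, g(W, x))]
    return x


W = normalize([[2.0, -1.0], [0.5, 3.0]])
x = [0.3, -1.2]
y = residual_forward(W, x)
x_rec = residual_inverse(W, y)
```

With a contraction factor of 0.9 the reconstruction error shrinks geometrically, so a few hundred iterations recover `x` to well below floating-point display precision; a smaller `c` trades expressiveness of the block for faster inversion.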

Cite this Paper


BibTeX
@InProceedings{pmlr-v97-behrmann19a,
  title = {Invertible Residual Networks},
  author = {Behrmann, Jens and Grathwohl, Will and Chen, Ricky T. Q. and Duvenaud, David and Jacobsen, Joern-Henrik},
  booktitle = {Proceedings of the 36th International Conference on Machine Learning},
  pages = {573--582},
  year = {2019},
  editor = {Chaudhuri, Kamalika and Salakhutdinov, Ruslan},
  volume = {97},
  series = {Proceedings of Machine Learning Research},
  month = {09--15 Jun},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v97/behrmann19a/behrmann19a.pdf},
  url = {https://proceedings.mlr.press/v97/behrmann19a.html},
  abstract = {We show that standard ResNet architectures can be made invertible, allowing the same model to be used for classification, density estimation, and generation. Typically, enforcing invertibility requires partitioning dimensions or restricting network architectures. In contrast, our approach only requires adding a simple normalization step during training, already available in standard frameworks. Invertible ResNets define a generative model which can be trained by maximum likelihood on unlabeled data. To compute likelihoods, we introduce a tractable approximation to the Jacobian log-determinant of a residual block. Our empirical evaluation shows that invertible ResNets perform competitively with both state-of-the-art image classifiers and flow-based generative models, something that has not been previously achieved with a single architecture.}
}
Endnote
%0 Conference Paper
%T Invertible Residual Networks
%A Jens Behrmann
%A Will Grathwohl
%A Ricky T. Q. Chen
%A David Duvenaud
%A Joern-Henrik Jacobsen
%B Proceedings of the 36th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2019
%E Kamalika Chaudhuri
%E Ruslan Salakhutdinov
%F pmlr-v97-behrmann19a
%I PMLR
%P 573--582
%U https://proceedings.mlr.press/v97/behrmann19a.html
%V 97
%X We show that standard ResNet architectures can be made invertible, allowing the same model to be used for classification, density estimation, and generation. Typically, enforcing invertibility requires partitioning dimensions or restricting network architectures. In contrast, our approach only requires adding a simple normalization step during training, already available in standard frameworks. Invertible ResNets define a generative model which can be trained by maximum likelihood on unlabeled data. To compute likelihoods, we introduce a tractable approximation to the Jacobian log-determinant of a residual block. Our empirical evaluation shows that invertible ResNets perform competitively with both state-of-the-art image classifiers and flow-based generative models, something that has not been previously achieved with a single architecture.
APA
Behrmann, J., Grathwohl, W., Chen, R. T. Q., Duvenaud, D., & Jacobsen, J.-H. (2019). Invertible Residual Networks. Proceedings of the 36th International Conference on Machine Learning, in Proceedings of Machine Learning Research 97:573-582. Available from https://proceedings.mlr.press/v97/behrmann19a.html
