Understanding and Mitigating Exploding Inverses in Invertible Neural Networks

Jens Behrmann, Paul Vicol, Kuan-Chieh Wang, Roger Grosse, Joern-Henrik Jacobsen
Proceedings of The 24th International Conference on Artificial Intelligence and Statistics, PMLR 130:1792-1800, 2021.

Abstract

Invertible neural networks (INNs) have been used to design generative models, implement memory-saving gradient computation, and solve inverse problems. In this work, we show that commonly-used INN architectures suffer from exploding inverses and are thus prone to becoming numerically non-invertible. Across a wide range of INN use-cases, we reveal failures including the non-applicability of the change-of-variables formula on in- and out-of-distribution (OOD) data, incorrect gradients for memory-saving backprop, and the inability to sample from normalizing flow models. We further derive bi-Lipschitz properties of atomic building blocks of common architectures. These insights into the stability of INNs then provide ways forward to remedy these failures. For tasks where local invertibility is sufficient, like memory-saving backprop, we propose a flexible and efficient regularizer. For problems where global invertibility is necessary, such as applying normalizing flows on OOD data, we show the importance of designing stable INN building blocks.
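The idea of a numerically non-invertible layer can be illustrated with a toy example. The sketch below is not taken from the paper: it assumes a two-dimensional affine coupling layer, a building block used in many normalizing flows, with hypothetical constant scale and shift functions (`mild_scale`, `extreme_scale`, `shift`). The layer is invertible analytically, but when the log-scale is very negative the forward pass contracts the input below floating-point resolution, so the inverse pass cannot recover it.

```python
import math


def coupling_forward(x1, x2, s, t):
    """Affine coupling: y1 = x1, y2 = x2 * exp(s(x1)) + t(x1)."""
    return x1, x2 * math.exp(s(x1)) + t(x1)


def coupling_inverse(y1, y2, s, t):
    """Analytic inverse: x1 = y1, x2 = (y2 - t(y1)) * exp(-s(y1))."""
    return y1, (y2 - t(y1)) * math.exp(-s(y1))


def reconstruction_error(x1, x2, s, t):
    """Absolute error in x2 after a forward pass followed by the inverse."""
    y1, y2 = coupling_forward(x1, x2, s, t)
    _, x2_rec = coupling_inverse(y1, y2, s, t)
    return abs(x2_rec - x2)


# Hypothetical constant functions, chosen only for illustration.
def shift(x1):
    return 1.0


def mild_scale(x1):
    return 0.0  # forward and inverse both scale by 1


def extreme_scale(x1):
    return -40.0  # forward scales by exp(-40), inverse by exp(40)


err_mild = reconstruction_error(0.5, 1.0, mild_scale, shift)
err_extreme = reconstruction_error(0.5, 1.0, extreme_scale, shift)
print("mild log-scale:   ", err_mild)
print("extreme log-scale:", err_extreme)
```

With the extreme log-scale, `x2 * exp(-40)` is smaller than the spacing between double-precision numbers near 1.0, so it is lost when the shift is added and the inverse returns 0 instead of 1. The inverse also has a Lipschitz constant of `exp(40)`, so any small perturbation of the output is amplified by that factor.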

Cite this Paper


BibTeX
@InProceedings{pmlr-v130-behrmann21a,
  title = {Understanding and Mitigating Exploding Inverses in Invertible Neural Networks},
  author = {Behrmann, Jens and Vicol, Paul and Wang, Kuan-Chieh and Grosse, Roger and Jacobsen, Joern-Henrik},
  booktitle = {Proceedings of The 24th International Conference on Artificial Intelligence and Statistics},
  pages = {1792--1800},
  year = {2021},
  editor = {Banerjee, Arindam and Fukumizu, Kenji},
  volume = {130},
  series = {Proceedings of Machine Learning Research},
  month = {13--15 Apr},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v130/behrmann21a/behrmann21a.pdf},
  url = {http://proceedings.mlr.press/v130/behrmann21a.html},
  abstract = {Invertible neural networks (INNs) have been used to design generative models, implement memory-saving gradient computation, and solve inverse problems. In this work, we show that commonly-used INN architectures suffer from exploding inverses and are thus prone to becoming numerically non-invertible. Across a wide range of INN use-cases, we reveal failures including the non-applicability of the change-of-variables formula on in- and out-of-distribution (OOD) data, incorrect gradients for memory-saving backprop, and the inability to sample from normalizing flow models. We further derive bi-Lipschitz properties of atomic building blocks of common architectures. These insights into the stability of INNs then provide ways forward to remedy these failures. For tasks where local invertibility is sufficient, like memory-saving backprop, we propose a flexible and efficient regularizer. For problems where global invertibility is necessary, such as applying normalizing flows on OOD data, we show the importance of designing stable INN building blocks.}
}
Endnote
%0 Conference Paper
%T Understanding and Mitigating Exploding Inverses in Invertible Neural Networks
%A Jens Behrmann
%A Paul Vicol
%A Kuan-Chieh Wang
%A Roger Grosse
%A Joern-Henrik Jacobsen
%B Proceedings of The 24th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2021
%E Arindam Banerjee
%E Kenji Fukumizu
%F pmlr-v130-behrmann21a
%I PMLR
%P 1792--1800
%U http://proceedings.mlr.press/v130/behrmann21a.html
%V 130
%X Invertible neural networks (INNs) have been used to design generative models, implement memory-saving gradient computation, and solve inverse problems. In this work, we show that commonly-used INN architectures suffer from exploding inverses and are thus prone to becoming numerically non-invertible. Across a wide range of INN use-cases, we reveal failures including the non-applicability of the change-of-variables formula on in- and out-of-distribution (OOD) data, incorrect gradients for memory-saving backprop, and the inability to sample from normalizing flow models. We further derive bi-Lipschitz properties of atomic building blocks of common architectures. These insights into the stability of INNs then provide ways forward to remedy these failures. For tasks where local invertibility is sufficient, like memory-saving backprop, we propose a flexible and efficient regularizer. For problems where global invertibility is necessary, such as applying normalizing flows on OOD data, we show the importance of designing stable INN building blocks.
APA
Behrmann, J., Vicol, P., Wang, K.-C., Grosse, R., & Jacobsen, J. (2021). Understanding and Mitigating Exploding Inverses in Invertible Neural Networks. Proceedings of The 24th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 130:1792-1800. Available from http://proceedings.mlr.press/v130/behrmann21a.html.
