[edit]
HyperGAN: A Generative Model for Diverse, Performant Neural Networks
Proceedings of the 36th International Conference on Machine Learning, PMLR 97:5361-5369, 2019.
Abstract
We introduce HyperGAN, a generative model that learns to generate all the parameters of a deep neural network. HyperGAN first transforms low dimensional noise into a latent space, which can be sampled from to obtain diverse, performant sets of parameters for a target architecture. We utilize an architecture that bears resemblance to generative adversarial networks, but we evaluate the likelihood of generated samples with a classification loss. This is equivalent to minimizing the KL-divergence between the distribution of generated parameters, and the unknown true parameter distribution. We apply HyperGAN to classification, showing that HyperGAN can learn to generate parameters which solve the MNIST and CIFAR-10 datasets with competitive performance to fully supervised learning, while also generating a rich distribution of effective parameters. We also show that HyperGAN can also provide better uncertainty estimates than standard ensembles. This is evidenced by the ability of HyperGAN-generated ensembles to detect out of distribution data as well as adversarial examples.