Pixel Recurrent Neural Networks

Aäron van den Oord; Nal Kalchbrenner; Koray Kavukcuoglu

Proceedings of The 33rd International Conference on Machine Learning, PMLR 48:1747-1756, 2016.

Abstract

Modeling the distribution of natural images is a landmark problem in unsupervised learning. This task requires an image model that is at once expressive, tractable and scalable. We present a deep neural network that sequentially predicts the pixels in an image along the two spatial dimensions. Our method models the discrete probability of the raw pixel values and encodes the complete set of dependencies in the image. Architectural novelties include fast two-dimensional recurrent layers and an effective use of residual connections in deep recurrent networks. We achieve log-likelihood scores on natural images that are considerably better than the previous state of the art. Our main results also provide benchmarks on the diverse ImageNet dataset. Samples generated from the model appear crisp, varied and globally coherent.

Cite this Paper

BibTeX


@InProceedings{pmlr-v48-oord16,
  title = 	 {Pixel Recurrent Neural Networks},
  author = 	 {van den Oord, Aäron and Kalchbrenner, Nal and Kavukcuoglu, Koray},
  booktitle = 	 {Proceedings of The 33rd International Conference on Machine Learning},
  pages = 	 {1747--1756},
  year = 	 {2016},
  editor = 	 {Balcan, Maria Florina and Weinberger, Kilian Q.},
  volume = 	 {48},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {New York, New York, USA},
  month = 	 {20--22 Jun},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v48/oord16.pdf},
  url = 	 {https://proceedings.mlr.press/v48/oord16.html},
  abstract = 	 {Modeling the distribution of natural images is a landmark problem in unsupervised learning. This task requires an image model that is at once expressive, tractable and scalable. We present a deep neural network that sequentially predicts the pixels in an image along the two spatial dimensions. Our method models the discrete probability of the raw pixel values and encodes the complete set of dependencies in the image. Architectural novelties include fast two-dimensional recurrent layers and an effective use of residual connections in deep recurrent networks. We achieve log-likelihood scores on natural images that are considerably better than the previous state of the art. Our main results also provide benchmarks on the diverse ImageNet dataset. Samples generated from the model appear crisp, varied and globally coherent.}
}

Endnote

%0 Conference Paper
%T Pixel Recurrent Neural Networks
%A Aäron van den Oord
%A Nal Kalchbrenner
%A Koray Kavukcuoglu
%B Proceedings of The 33rd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2016
%E Maria Florina Balcan
%E Kilian Q. Weinberger
%F pmlr-v48-oord16
%I PMLR
%P 1747--1756
%U https://proceedings.mlr.press/v48/oord16.html
%V 48
%X Modeling the distribution of natural images is a landmark problem in unsupervised learning. This task requires an image model that is at once expressive, tractable and scalable. We present a deep neural network that sequentially predicts the pixels in an image along the two spatial dimensions. Our method models the discrete probability of the raw pixel values and encodes the complete set of dependencies in the image. Architectural novelties include fast two-dimensional recurrent layers and an effective use of residual connections in deep recurrent networks. We achieve log-likelihood scores on natural images that are considerably better than the previous state of the art. Our main results also provide benchmarks on the diverse ImageNet dataset. Samples generated from the model appear crisp, varied and globally coherent.

RIS


TY  - CPAPER
TI  - Pixel Recurrent Neural Networks
AU  - Aäron van den Oord
AU  - Nal Kalchbrenner
AU  - Koray Kavukcuoglu
BT  - Proceedings of The 33rd International Conference on Machine Learning
DA  - 2016/06/11
ED  - Maria Florina Balcan
ED  - Kilian Q. Weinberger
ID  - pmlr-v48-oord16
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 48
SP  - 1747
EP  - 1756
L1  - http://proceedings.mlr.press/v48/oord16.pdf
UR  - https://proceedings.mlr.press/v48/oord16.html
AB  - Modeling the distribution of natural images is a landmark problem in unsupervised learning. This task requires an image model that is at once expressive, tractable and scalable. We present a deep neural network that sequentially predicts the pixels in an image along the two spatial dimensions. Our method models the discrete probability of the raw pixel values and encodes the complete set of dependencies in the image. Architectural novelties include fast two-dimensional recurrent layers and an effective use of residual connections in deep recurrent networks. We achieve log-likelihood scores on natural images that are considerably better than the previous state of the art. Our main results also provide benchmarks on the diverse ImageNet dataset. Samples generated from the model appear crisp, varied and globally coherent.
ER  -

APA


van den Oord, A., Kalchbrenner, N. & Kavukcuoglu, K. (2016). Pixel Recurrent Neural Networks. Proceedings of The 33rd International Conference on Machine Learning, in Proceedings of Machine Learning Research 48:1747-1756. Available from https://proceedings.mlr.press/v48/oord16.html.
