Generation of Realistic Images for Learning in Simulation using FeatureGAN

Nicolas Cruz, Javier Ruiz-del-Solar
Proceedings of the 2020 Conference on Robot Learning, PMLR 155:1121-1136, 2021.

Abstract

This paper presents FeatureGAN, a methodology for training image translators (generators) on an unpaired image training set. FeatureGAN is based on Generative Adversarial Networks (GANs) and has three main novel components: (i) a feature loss that ensures alignment between the input and the generated image, (ii) a feature pyramid discriminator, which operates on a tensor composed of features at different levels of abstraction produced by a pre-trained network, and (iii) a per-class loss that improves results on the simulation-to-reality task. The main advantage of the proposed methodology over classical approaches is a more stable training process, with higher resilience to common GAN problems such as mode collapse, as well as better and more consistent results. FeatureGAN is also fast to train, easy to replicate, and especially suited to simulation-to-reality applications, where the generated realistic images help close the visual simulation-to-reality gap. As a proof of concept, we apply the proposed methodology to soccer robotics: realistic images are generated from a soccer robotics simulator, and robot and ball detectors are trained on these images and then tested in reality. The same methodology is used to generate realistic images from images rendered in a video game; these realistic images are then used to train a semantic segmentation network.
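The first two components described above, the feature loss and the feature pyramid tensor fed to the discriminator, can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: a simple multi-scale average-pool stands in for the pre-trained feature-extraction network, and all function names and shapes here are hypothetical.

```python
import numpy as np

def extract_features(img, levels=3):
    # Stand-in for a pre-trained network: repeated 2x2 average pooling
    # yields feature maps at increasingly coarse levels of abstraction.
    feats = []
    x = img
    for _ in range(levels):
        h, w = x.shape[0] // 2, x.shape[1] // 2
        x = x[:2 * h, :2 * w].reshape(h, 2, w, 2, -1).mean(axis=(1, 3))
        feats.append(x)
    return feats

def feature_loss(source, generated):
    # Mean L1 distance between feature maps of the input image and the
    # generated image, encouraging structural alignment between the two.
    f_src = extract_features(source)
    f_gen = extract_features(generated)
    return sum(np.abs(a - b).mean() for a, b in zip(f_src, f_gen))

def pyramid_tensor(img, size=(8, 8)):
    # Nearest-neighbour resize of every pyramid level to a common spatial
    # size, stacked along the channel axis: the composite tensor a
    # feature-pyramid discriminator would receive as input.
    feats = extract_features(img)
    resized = []
    for f in feats:
        rows = np.arange(size[0]) * f.shape[0] // size[0]
        cols = np.arange(size[1]) * f.shape[1] // size[1]
        resized.append(f[rows][:, cols])
    return np.concatenate(resized, axis=-1)
```

For a 32x32x3 image with three levels, the pyramid tensor is 8x8x9 (three 3-channel levels stacked), and `feature_loss(img, img)` is zero, as expected for a perfectly aligned pair.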

Cite this Paper


BibTeX
@InProceedings{pmlr-v155-cruz21a,
  title = {Generation of Realistic Images for Learning in Simulation using FeatureGAN},
  author = {Cruz, Nicolas and Ruiz-del-Solar, Javier},
  booktitle = {Proceedings of the 2020 Conference on Robot Learning},
  pages = {1121--1136},
  year = {2021},
  editor = {Kober, Jens and Ramos, Fabio and Tomlin, Claire},
  volume = {155},
  series = {Proceedings of Machine Learning Research},
  month = {16--18 Nov},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v155/cruz21a/cruz21a.pdf},
  url = {https://proceedings.mlr.press/v155/cruz21a.html},
  abstract = {This paper presents FeatureGAN, a methodology for training image translators (generators) on an unpaired image training set. FeatureGAN is based on Generative Adversarial Networks (GANs) and has three main novel components: (i) a feature loss that ensures alignment between the input and the generated image, (ii) a feature pyramid discriminator, which operates on a tensor composed of features at different levels of abstraction produced by a pre-trained network, and (iii) a per-class loss that improves results on the simulation-to-reality task. The main advantage of the proposed methodology over classical approaches is a more stable training process, with higher resilience to common GAN problems such as mode collapse, as well as better and more consistent results. FeatureGAN is also fast to train, easy to replicate, and especially suited to simulation-to-reality applications, where the generated realistic images help close the visual simulation-to-reality gap. As a proof of concept, we apply the proposed methodology to soccer robotics: realistic images are generated from a soccer robotics simulator, and robot and ball detectors are trained on these images and then tested in reality. The same methodology is used to generate realistic images from images rendered in a video game; these realistic images are then used to train a semantic segmentation network.}
}
Endnote
%0 Conference Paper %T Generation of Realistic Images for Learning in Simulation using FeatureGAN %A Nicolas Cruz %A Javier Ruiz-del-Solar %B Proceedings of the 2020 Conference on Robot Learning %C Proceedings of Machine Learning Research %D 2021 %E Jens Kober %E Fabio Ramos %E Claire Tomlin %F pmlr-v155-cruz21a %I PMLR %P 1121--1136 %U https://proceedings.mlr.press/v155/cruz21a.html %V 155 %X This paper presents FeatureGAN, a methodology for training image translators (generators) on an unpaired image training set. FeatureGAN is based on Generative Adversarial Networks (GANs) and has three main novel components: (i) a feature loss that ensures alignment between the input and the generated image, (ii) a feature pyramid discriminator, which operates on a tensor composed of features at different levels of abstraction produced by a pre-trained network, and (iii) a per-class loss that improves results on the simulation-to-reality task. The main advantage of the proposed methodology over classical approaches is a more stable training process, with higher resilience to common GAN problems such as mode collapse, as well as better and more consistent results. FeatureGAN is also fast to train, easy to replicate, and especially suited to simulation-to-reality applications, where the generated realistic images help close the visual simulation-to-reality gap. As a proof of concept, we apply the proposed methodology to soccer robotics: realistic images are generated from a soccer robotics simulator, and robot and ball detectors are trained on these images and then tested in reality. The same methodology is used to generate realistic images from images rendered in a video game; these realistic images are then used to train a semantic segmentation network.
APA
Cruz, N. & Ruiz-del-Solar, J. (2021). Generation of Realistic Images for Learning in Simulation using FeatureGAN. Proceedings of the 2020 Conference on Robot Learning, in Proceedings of Machine Learning Research 155:1121-1136. Available from https://proceedings.mlr.press/v155/cruz21a.html.