RadialGAN: Leveraging multiple datasets to improve target-specific predictive models using Generative Adversarial Networks

Jinsung Yoon, James Jordon, Mihaela van der Schaar
Proceedings of the 35th International Conference on Machine Learning, PMLR 80:5699-5707, 2018.

Abstract

Training complex machine learning models for prediction often requires a large amount of data that is not always readily available. Leveraging these external datasets from related but different sources is therefore an important task if good predictive models are to be built for deployment in settings where data can be rare. In this paper we propose a novel approach to the problem in which we use multiple GAN architectures to learn to translate from one dataset to another, thereby allowing us to effectively enlarge the target dataset, and therefore learn better predictive models than if we simply used the target dataset. We show the utility of such an approach, demonstrating that our method improves the prediction performance on the target domain over using just the target dataset and also show that our framework outperforms several other benchmarks on a collection of real-world medical datasets.

Cite this Paper


BibTeX
@InProceedings{pmlr-v80-yoon18b,
  title     = {{R}adial{GAN}: Leveraging multiple datasets to improve target-specific predictive models using Generative Adversarial Networks},
  author    = {Yoon, Jinsung and Jordon, James and van der Schaar, Mihaela},
  booktitle = {Proceedings of the 35th International Conference on Machine Learning},
  pages     = {5699--5707},
  year      = {2018},
  editor    = {Dy, Jennifer and Krause, Andreas},
  volume    = {80},
  series    = {Proceedings of Machine Learning Research},
  month     = {10--15 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v80/yoon18b/yoon18b.pdf},
  url       = {http://proceedings.mlr.press/v80/yoon18b.html},
  abstract  = {Training complex machine learning models for prediction often requires a large amount of data that is not always readily available. Leveraging these external datasets from related but different sources is therefore an important task if good predictive models are to be built for deployment in settings where data can be rare. In this paper we propose a novel approach to the problem in which we use multiple GAN architectures to learn to translate from one dataset to another, thereby allowing us to effectively enlarge the target dataset, and therefore learn better predictive models than if we simply used the target dataset. We show the utility of such an approach, demonstrating that our method improves the prediction performance on the target domain over using just the target dataset and also show that our framework outperforms several other benchmarks on a collection of real-world medical datasets.}
}
Endnote
%0 Conference Paper
%T RadialGAN: Leveraging multiple datasets to improve target-specific predictive models using Generative Adversarial Networks
%A Jinsung Yoon
%A James Jordon
%A Mihaela van der Schaar
%B Proceedings of the 35th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2018
%E Jennifer Dy
%E Andreas Krause
%F pmlr-v80-yoon18b
%I PMLR
%P 5699--5707
%U http://proceedings.mlr.press/v80/yoon18b.html
%V 80
%X Training complex machine learning models for prediction often requires a large amount of data that is not always readily available. Leveraging these external datasets from related but different sources is therefore an important task if good predictive models are to be built for deployment in settings where data can be rare. In this paper we propose a novel approach to the problem in which we use multiple GAN architectures to learn to translate from one dataset to another, thereby allowing us to effectively enlarge the target dataset, and therefore learn better predictive models than if we simply used the target dataset. We show the utility of such an approach, demonstrating that our method improves the prediction performance on the target domain over using just the target dataset and also show that our framework outperforms several other benchmarks on a collection of real-world medical datasets.
APA
Yoon, J., Jordon, J. & van der Schaar, M. (2018). RadialGAN: Leveraging multiple datasets to improve target-specific predictive models using Generative Adversarial Networks. Proceedings of the 35th International Conference on Machine Learning, in Proceedings of Machine Learning Research 80:5699-5707. Available from http://proceedings.mlr.press/v80/yoon18b.html.
