Concrete Autoencoders: Differentiable Feature Selection and Reconstruction

Muhammed Fatih Balın; Abubakar Abid; James Zou

Proceedings of the 36th International Conference on Machine Learning, PMLR 97:444-453, 2019.

Abstract

We introduce the concrete autoencoder, an end-to-end differentiable method for global feature selection, which efficiently identifies a subset of the most informative features and simultaneously learns a neural network to reconstruct the input data from the selected features. Our method is unsupervised, and is based on using a concrete selector layer as the encoder and using a standard neural network as the decoder. During the training phase, the temperature of the concrete selector layer is gradually decreased, which encourages a user-specified number of discrete features to be learned; during test time, the selected features can be used with the decoder network to reconstruct the remaining input features. We evaluate concrete autoencoders on a variety of datasets, where they significantly outperform state-of-the-art methods for feature selection and data reconstruction. In particular, on a large-scale gene expression dataset, the concrete autoencoder selects a small subset of genes whose expression levels can be used to impute the expression levels of the remaining genes; in doing so, it improves on the current widely-used expert-curated L1000 landmark genes, potentially reducing measurement costs by 20%. The concrete autoencoder can be implemented by adding just a few lines of code to a standard autoencoder, and the code for the algorithm and experiments is publicly available.
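The selector layer and temperature schedule described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' released code. It assumes the standard Gumbel-softmax construction of the Concrete distribution and an exponential temperature decay; the function names, the default temperatures, and the shape conventions are hypothetical choices made for illustration.

```python
import numpy as np

def concrete_select(x, logits, temperature, rng):
    """Relaxed selection of k features from x of shape (n, d).

    logits has shape (k, d): one row of unnormalized log-probabilities
    per selected feature (hypothetical parameterization).
    Returns the (n, k) relaxed selection and the (k, d) weights.
    """
    # Gumbel(0, 1) noise; u stays strictly inside (0, 1) so both logs are finite.
    u = rng.uniform(low=1e-10, high=1.0, size=logits.shape)
    gumbel = -np.log(-np.log(u))
    z = (logits + gumbel) / temperature
    z -= z.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    weights = np.exp(z)
    weights /= weights.sum(axis=1, keepdims=True)  # each row sums to 1
    # Each output column is a convex combination of the input features.
    return x @ weights.T, weights

def discrete_select(x, logits):
    """Test-time selection: keep the single most likely feature per row."""
    idx = logits.argmax(axis=1)
    return x[:, idx], idx

def temperature_at(step, total_steps, t_start=10.0, t_end=0.01):
    """Exponential decay from t_start to t_end (assumed schedule and values)."""
    return t_start * (t_end / t_start) ** (step / total_steps)
```

As the temperature falls, each row of `weights` concentrates on a single input feature, so the relaxed output approaches what `discrete_select` returns. A decoder network, omitted here, would be trained on the relaxed output to reconstruct `x`.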

Cite this Paper

BibTeX


@InProceedings{pmlr-v97-balin19a,
  title = 	 {Concrete Autoencoders: Differentiable Feature Selection and Reconstruction},
  author =       {Bal{\i}n, Muhammed Fatih and Abid, Abubakar and Zou, James},
  booktitle = 	 {Proceedings of the 36th International Conference on Machine Learning},
  pages = 	 {444--453},
  year = 	 {2019},
  editor = 	 {Chaudhuri, Kamalika and Salakhutdinov, Ruslan},
  volume = 	 {97},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {09--15 Jun},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v97/balin19a/balin19a.pdf},
  url = 	 {https://proceedings.mlr.press/v97/balin19a.html},
  abstract = 	 {We introduce the concrete autoencoder, an end-to-end differentiable method for global feature selection, which efficiently identifies a subset of the most informative features and simultaneously learns a neural network to reconstruct the input data from the selected features. Our method is unsupervised, and is based on using a concrete selector layer as the encoder and using a standard neural network as the decoder. During the training phase, the temperature of the concrete selector layer is gradually decreased, which encourages a user-specified number of discrete features to be learned; during test time, the selected features can be used with the decoder network to reconstruct the remaining input features. We evaluate concrete autoencoders on a variety of datasets, where they significantly outperform state-of-the-art methods for feature selection and data reconstruction. In particular, on a large-scale gene expression dataset, the concrete autoencoder selects a small subset of genes whose expression levels can be used to impute the expression levels of the remaining genes; in doing so, it improves on the current widely-used expert-curated L1000 landmark genes, potentially reducing measurement costs by 20%. The concrete autoencoder can be implemented by adding just a few lines of code to a standard autoencoder, and the code for the algorithm and experiments is publicly available.}
}

Endnote

%0 Conference Paper
%T Concrete Autoencoders: Differentiable Feature Selection and Reconstruction
%A Muhammed Fatih Balın
%A Abubakar Abid
%A James Zou
%B Proceedings of the 36th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2019
%E Kamalika Chaudhuri
%E Ruslan Salakhutdinov
%F pmlr-v97-balin19a
%I PMLR
%P 444--453
%U https://proceedings.mlr.press/v97/balin19a.html
%V 97
%X We introduce the concrete autoencoder, an end-to-end differentiable method for global feature selection, which efficiently identifies a subset of the most informative features and simultaneously learns a neural network to reconstruct the input data from the selected features. Our method is unsupervised, and is based on using a concrete selector layer as the encoder and using a standard neural network as the decoder. During the training phase, the temperature of the concrete selector layer is gradually decreased, which encourages a user-specified number of discrete features to be learned; during test time, the selected features can be used with the decoder network to reconstruct the remaining input features. We evaluate concrete autoencoders on a variety of datasets, where they significantly outperform state-of-the-art methods for feature selection and data reconstruction. In particular, on a large-scale gene expression dataset, the concrete autoencoder selects a small subset of genes whose expression levels can be used to impute the expression levels of the remaining genes; in doing so, it improves on the current widely-used expert-curated L1000 landmark genes, potentially reducing measurement costs by 20%. The concrete autoencoder can be implemented by adding just a few lines of code to a standard autoencoder, and the code for the algorithm and experiments is publicly available.

APA


Balın, M.F., Abid, A. & Zou, J. (2019). Concrete Autoencoders: Differentiable Feature Selection and Reconstruction. Proceedings of the 36th International Conference on Machine Learning, in Proceedings of Machine Learning Research 97:444-453. Available from https://proceedings.mlr.press/v97/balin19a.html.
