Post Selection Inference with Kernels

Makoto Yamada, Yuta Umezu, Kenji Fukumizu, Ichiro Takeuchi
Proceedings of the Twenty-First International Conference on Artificial Intelligence and Statistics, PMLR 84:152-160, 2018.

Abstract

Finding a set of statistically significant features from complex data (e.g., nonlinear and/or multi-dimensional output data) is important for scientific discovery and has a number of practical applications, including biomarker discovery. In this paper, we propose a kernel-based post-selection inference (PSI) algorithm that can find a set of statistically significant features from non-linearly related data. Specifically, our PSI algorithm is based on independence measures, and we call it the Hilbert-Schmidt Independence Criterion (HSIC)-based PSI algorithm (hsicInf). The novelty of hsicInf is that it can handle non-linearity and/or multivariate and multi-class outputs through kernels. Through synthetic experiments, we show that hsicInf can find a set of statistically significant features for both regression and classification problems. We also applied hsicInf to real-world datasets and showed that it can successfully identify important features.
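For a concrete picture of the independence measure the method builds on, the sketch below computes the standard biased empirical HSIC estimator, HSIC_b(X, Y) = tr(KHLH) / (n - 1)^2 with Gaussian kernels, and uses it to rank features. This is an illustrative NumPy sketch under stated assumptions, not the authors' implementation: the kernel choice, bandwidths, and top-k selection step are assumptions, and the paper's key contribution (inference that remains valid conditional on the selection event) is not reproduced here.

import numpy as np

def gaussian_gram(Z, bandwidth=1.0):
    # Gram matrix of a Gaussian (RBF) kernel on the rows of Z.
    sq_norms = np.sum(Z**2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * Z @ Z.T
    return np.exp(-sq_dists / (2.0 * bandwidth**2))

def hsic_biased(X, Y, bandwidth_x=1.0, bandwidth_y=1.0):
    # Biased empirical HSIC between paired samples X (n x d) and Y (n x p):
    # HSIC_b = trace(K H L H) / (n - 1)^2, H = I - (1/n) 11^T.
    n = X.shape[0]
    K = gaussian_gram(X, bandwidth_x)
    L = gaussian_gram(Y, bandwidth_y)
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2

# Illustrative use: score each feature against the response and keep the top k.
# (hsicInf would then test the selected features conditional on this selection.)
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
Y = np.sin(X[:, [0]]) + 0.1 * rng.normal(size=(100, 1))  # only feature 0 is relevant
scores = [hsic_biased(X[:, [j]], Y) for j in range(X.shape[1])]
top_k = np.argsort(scores)[::-1][:2]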

Cite this Paper


BibTeX
@InProceedings{pmlr-v84-yamada18a,
  title     = {Post Selection Inference with Kernels},
  author    = {Yamada, Makoto and Umezu, Yuta and Fukumizu, Kenji and Takeuchi, Ichiro},
  booktitle = {Proceedings of the Twenty-First International Conference on Artificial Intelligence and Statistics},
  pages     = {152--160},
  year      = {2018},
  editor    = {Storkey, Amos and Perez-Cruz, Fernando},
  volume    = {84},
  series    = {Proceedings of Machine Learning Research},
  month     = {09--11 Apr},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v84/yamada18a/yamada18a.pdf},
  url       = {https://proceedings.mlr.press/v84/yamada18a.html},
  abstract  = {Finding a set of statistically significant features from complex data (e.g., nonlinear and/or multi-dimensional output data) is important for scientific discovery and has a number of practical applications including biomarker discovery. In this paper, we propose a kernel-based post-selection inference (PSI) algorithm that can find a set of statistically significant features from non-linearly related data. Specifically, our PSI algorithm is based on independence measures, and we call it the Hilbert-Schmidt Independence Criterion (HSIC)-based PSI algorithm (hsicInf). The novelty of hsicInf is that it can handle non-linearity and/or multi-variate/multi-class outputs through kernels. Through synthetic experiments, we show that hsicInf can find a set of statistically significant features for both regression and classification problems. We applied hsicInf to real-world datasets and show that it can successfully identify important features.}
}
Endnote
%0 Conference Paper
%T Post Selection Inference with Kernels
%A Makoto Yamada
%A Yuta Umezu
%A Kenji Fukumizu
%A Ichiro Takeuchi
%B Proceedings of the Twenty-First International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2018
%E Amos Storkey
%E Fernando Perez-Cruz
%F pmlr-v84-yamada18a
%I PMLR
%P 152--160
%U https://proceedings.mlr.press/v84/yamada18a.html
%V 84
%X Finding a set of statistically significant features from complex data (e.g., nonlinear and/or multi-dimensional output data) is important for scientific discovery and has a number of practical applications including biomarker discovery. In this paper, we propose a kernel-based post-selection inference (PSI) algorithm that can find a set of statistically significant features from non-linearly related data. Specifically, our PSI algorithm is based on independence measures, and we call it the Hilbert-Schmidt Independence Criterion (HSIC)-based PSI algorithm (hsicInf). The novelty of hsicInf is that it can handle non-linearity and/or multi-variate/multi-class outputs through kernels. Through synthetic experiments, we show that hsicInf can find a set of statistically significant features for both regression and classification problems. We applied hsicInf to real-world datasets and show that it can successfully identify important features.
APA
Yamada, M., Umezu, Y., Fukumizu, K. & Takeuchi, I. (2018). Post Selection Inference with Kernels. Proceedings of the Twenty-First International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 84:152-160. Available from https://proceedings.mlr.press/v84/yamada18a.html.
