Nonoverlap-Promoting Variable Selection

Pengtao Xie, Hongbao Zhang, Yichen Zhu, Eric Xing
Proceedings of the 35th International Conference on Machine Learning, PMLR 80:5413-5422, 2018.

Abstract

Variable selection is a classic problem in machine learning (ML), widely used to find important explanatory factors and to improve the generalization performance and interpretability of ML models. In this paper, we consider variable selection for models where multiple responses are to be predicted based on the same set of covariates. Since each response is relevant to a unique subset of covariates, we desire the selected variables for different responses to have small overlap. We propose a regularizer that simultaneously encourages orthogonality and sparsity, which jointly have the effect of reducing overlap. We apply this regularizer to four model instances and develop efficient algorithms to solve the regularized problems. We provide a formal analysis of why the proposed regularizer can reduce generalization error. Experiments on both simulation studies and real-world datasets demonstrate the effectiveness of the proposed regularizer in selecting less-overlapped variables and improving generalization performance.

Cite this Paper


BibTeX
@InProceedings{pmlr-v80-xie18b,
  title = {Nonoverlap-Promoting Variable Selection},
  author = {Xie, Pengtao and Zhang, Hongbao and Zhu, Yichen and Xing, Eric},
  booktitle = {Proceedings of the 35th International Conference on Machine Learning},
  pages = {5413--5422},
  year = {2018},
  editor = {Dy, Jennifer and Krause, Andreas},
  volume = {80},
  series = {Proceedings of Machine Learning Research},
  month = {10--15 Jul},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v80/xie18b/xie18b.pdf},
  url = {https://proceedings.mlr.press/v80/xie18b.html},
  abstract = {Variable selection is a classic problem in machine learning (ML), widely used to find important explanatory factors and to improve the generalization performance and interpretability of ML models. In this paper, we consider variable selection for models where multiple responses are to be predicted based on the same set of covariates. Since each response is relevant to a unique subset of covariates, we desire the selected variables for different responses to have small overlap. We propose a regularizer that simultaneously encourages orthogonality and sparsity, which jointly have the effect of reducing overlap. We apply this regularizer to four model instances and develop efficient algorithms to solve the regularized problems. We provide a formal analysis of why the proposed regularizer can reduce generalization error. Experiments on both simulation studies and real-world datasets demonstrate the effectiveness of the proposed regularizer in selecting less-overlapped variables and improving generalization performance.}
}
Endnote
%0 Conference Paper
%T Nonoverlap-Promoting Variable Selection
%A Pengtao Xie
%A Hongbao Zhang
%A Yichen Zhu
%A Eric Xing
%B Proceedings of the 35th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2018
%E Jennifer Dy
%E Andreas Krause
%F pmlr-v80-xie18b
%I PMLR
%P 5413--5422
%U https://proceedings.mlr.press/v80/xie18b.html
%V 80
%X Variable selection is a classic problem in machine learning (ML), widely used to find important explanatory factors and to improve the generalization performance and interpretability of ML models. In this paper, we consider variable selection for models where multiple responses are to be predicted based on the same set of covariates. Since each response is relevant to a unique subset of covariates, we desire the selected variables for different responses to have small overlap. We propose a regularizer that simultaneously encourages orthogonality and sparsity, which jointly have the effect of reducing overlap. We apply this regularizer to four model instances and develop efficient algorithms to solve the regularized problems. We provide a formal analysis of why the proposed regularizer can reduce generalization error. Experiments on both simulation studies and real-world datasets demonstrate the effectiveness of the proposed regularizer in selecting less-overlapped variables and improving generalization performance.
APA
Xie, P., Zhang, H., Zhu, Y. & Xing, E. (2018). Nonoverlap-Promoting Variable Selection. Proceedings of the 35th International Conference on Machine Learning, in Proceedings of Machine Learning Research 80:5413-5422. Available from https://proceedings.mlr.press/v80/xie18b.html.
