Learning to Maximize Mutual Information for Dynamic Feature Selection

Ian Connick Covert, Wei Qiu, Mingyu Lu, Na Yoon Kim, Nathan J White, Su-In Lee
Proceedings of the 40th International Conference on Machine Learning, PMLR 202:6424-6447, 2023.

Abstract

Feature selection helps reduce data acquisition costs in ML, but the standard approach is to train models with static feature subsets. Here, we consider the dynamic feature selection (DFS) problem where a model sequentially queries features based on the presently available information. DFS is often addressed with reinforcement learning, but we explore a simpler approach of greedily selecting features based on their conditional mutual information. This method is theoretically appealing but requires oracle access to the data distribution, so we develop a learning approach based on amortized optimization. The proposed method is shown to recover the greedy policy when trained to optimality, and it outperforms numerous existing feature selection methods in our experiments, thus validating it as a simple but powerful approach for this problem.
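The abstract's greedy criterion can be made concrete. The following is a minimal LaTeX sketch of that selection rule under standard notation (y for the response, x_S for the already-acquired feature subset indexed by S); the symbols are a conventional choice and are not taken verbatim from the paper.

\documentclass{article}
\usepackage{amsmath}
% Greedy dynamic feature selection by conditional mutual information (CMI).
\DeclareMathOperator*{\argmax}{arg\,max}
\begin{document}
At each step, with acquired index set $S$, the greedy policy selects the
feature most informative about the response $y$ given what is already known,
\[
  i^{\star} \;=\; \argmax_{i \notin S} \; I\bigl(y;\, x_i \mid x_S\bigr),
  \qquad
  I\bigl(y;\, x_i \mid x_S\bigr) \;=\; H(y \mid x_S) - H(y \mid x_S, x_i),
\]
and then updates $S \leftarrow S \cup \{i^{\star}\}$. Evaluating this rule
exactly requires the conditional $p(y \mid x_S)$, i.e., oracle access to the
data distribution, which is what motivates the paper's amortized, learned
approximation of the greedy policy.
\end{document}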

Cite this Paper

BibTeX
@InProceedings{pmlr-v202-covert23a,
  title     = {Learning to Maximize Mutual Information for Dynamic Feature Selection},
  author    = {Covert, Ian Connick and Qiu, Wei and Lu, Mingyu and Kim, Na Yoon and White, Nathan J and Lee, Su-In},
  booktitle = {Proceedings of the 40th International Conference on Machine Learning},
  pages     = {6424--6447},
  year      = {2023},
  editor    = {Krause, Andreas and Brunskill, Emma and Cho, Kyunghyun and Engelhardt, Barbara and Sabato, Sivan and Scarlett, Jonathan},
  volume    = {202},
  series    = {Proceedings of Machine Learning Research},
  month     = {23--29 Jul},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v202/covert23a/covert23a.pdf},
  url       = {https://proceedings.mlr.press/v202/covert23a.html}
}
Endnote
%0 Conference Paper
%T Learning to Maximize Mutual Information for Dynamic Feature Selection
%A Ian Connick Covert
%A Wei Qiu
%A Mingyu Lu
%A Na Yoon Kim
%A Nathan J White
%A Su-In Lee
%B Proceedings of the 40th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2023
%E Andreas Krause
%E Emma Brunskill
%E Kyunghyun Cho
%E Barbara Engelhardt
%E Sivan Sabato
%E Jonathan Scarlett
%F pmlr-v202-covert23a
%I PMLR
%P 6424--6447
%U https://proceedings.mlr.press/v202/covert23a.html
%V 202
APA
Covert, I.C., Qiu, W., Lu, M., Kim, N.Y., White, N.J. & Lee, S. (2023). Learning to Maximize Mutual Information for Dynamic Feature Selection. Proceedings of the 40th International Conference on Machine Learning, in Proceedings of Machine Learning Research 202:6424-6447. Available from https://proceedings.mlr.press/v202/covert23a.html.
