Fast Feature Selection with Fairness Constraints

Francesco Quinzan, Rajiv Khanna, Moshik Hershcovitch, Sarel Cohen, Daniel Waddington, Tobias Friedrich, Michael W. Mahoney
Proceedings of The 26th International Conference on Artificial Intelligence and Statistics, PMLR 206:7800-7823, 2023.

Abstract

We study the fundamental problem of selecting optimal features for model construction. This problem is computationally challenging on large datasets, even with the use of greedy algorithm variants. To address this challenge, we extend the adaptive query model, recently proposed for the greedy forward selection for submodular functions, to the faster paradigm of Orthogonal Matching Pursuit for non-submodular functions. The proposed algorithm achieves exponentially fast parallel run time in the adaptive query model, scaling much better than prior work. Furthermore, our extension allows the use of downward-closed constraints, which can be used to encode certain fairness criteria into the feature selection process. We prove strong approximation guarantees for the algorithm based on standard assumptions. These guarantees are applicable to many parametric models, including Generalized Linear Models. Finally, we demonstrate empirically that the proposed algorithm competes favorably with state-of-the-art techniques for feature selection, on real-world and synthetic datasets.
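The abstract's core ingredients — greedy Orthogonal Matching Pursuit and a downward-closed constraint — can be illustrated with a minimal sketch. This is not the paper's parallel adaptive-query algorithm; it is a plain sequential OMP loop where a per-group selection cap (a hypothetical stand-in for a fairness constraint, since any family of feature sets closed under taking subsets is downward-closed) restricts which features remain eligible. The function name and parameters are illustrative, not from the paper.

```python
import numpy as np

def omp_with_group_caps(X, y, k, groups, group_cap):
    """Sequential Orthogonal Matching Pursuit with per-group caps.

    Selects up to k columns of X, allowing at most group_cap features
    from any one group -- an example of a downward-closed constraint.
    """
    n, d = X.shape
    selected = []
    counts = {}          # features chosen so far, per group
    residual = y.copy()
    for _ in range(k):
        # Score candidates by |correlation| with the current residual,
        # masking out already-selected features and capped groups.
        scores = np.abs(X.T @ residual)
        for j in selected:
            scores[j] = -np.inf
        for j in range(d):
            if counts.get(groups[j], 0) >= group_cap:
                scores[j] = -np.inf
        best = int(np.argmax(scores))
        if scores[best] == -np.inf:
            break  # no feasible feature remains
        selected.append(best)
        counts[groups[best]] = counts.get(groups[best], 0) + 1
        # Orthogonalize: refit least squares on the selected columns
        # and recompute the residual against that fit.
        coef, *_ = np.linalg.lstsq(X[:, selected], y, rcond=None)
        residual = y - X[:, selected] @ coef
    return selected
```

Because the cap only ever shrinks the feasible set as features are added, every subset of a feasible selection is itself feasible, which is exactly the downward-closed property the abstract exploits.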

Cite this Paper


BibTeX
@InProceedings{pmlr-v206-quinzan23a,
  title     = {Fast Feature Selection with Fairness Constraints},
  author    = {Quinzan, Francesco and Khanna, Rajiv and Hershcovitch, Moshik and Cohen, Sarel and Waddington, Daniel and Friedrich, Tobias and Mahoney, Michael W.},
  booktitle = {Proceedings of The 26th International Conference on Artificial Intelligence and Statistics},
  pages     = {7800--7823},
  year      = {2023},
  editor    = {Ruiz, Francisco and Dy, Jennifer and van de Meent, Jan-Willem},
  volume    = {206},
  series    = {Proceedings of Machine Learning Research},
  month     = {25--27 Apr},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v206/quinzan23a/quinzan23a.pdf},
  url       = {https://proceedings.mlr.press/v206/quinzan23a.html},
  abstract  = {We study the fundamental problem of selecting optimal features for model construction. This problem is computationally challenging on large datasets, even with the use of greedy algorithm variants. To address this challenge, we extend the adaptive query model, recently proposed for the greedy forward selection for submodular functions, to the faster paradigm of Orthogonal Matching Pursuit for non-submodular functions. The proposed algorithm achieves exponentially fast parallel run time in the adaptive query model, scaling much better than prior work. Furthermore, our extension allows the use of downward-closed constraints, which can be used to encode certain fairness criteria into the feature selection process. We prove strong approximation guarantees for the algorithm based on standard assumptions. These guarantees are applicable to many parametric models, including Generalized Linear Models. Finally, we demonstrate empirically that the proposed algorithm competes favorably with state-of-the-art techniques for feature selection, on real-world and synthetic datasets.}
}
Endnote
%0 Conference Paper
%T Fast Feature Selection with Fairness Constraints
%A Francesco Quinzan
%A Rajiv Khanna
%A Moshik Hershcovitch
%A Sarel Cohen
%A Daniel Waddington
%A Tobias Friedrich
%A Michael W. Mahoney
%B Proceedings of The 26th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2023
%E Francisco Ruiz
%E Jennifer Dy
%E Jan-Willem van de Meent
%F pmlr-v206-quinzan23a
%I PMLR
%P 7800--7823
%U https://proceedings.mlr.press/v206/quinzan23a.html
%V 206
%X We study the fundamental problem of selecting optimal features for model construction. This problem is computationally challenging on large datasets, even with the use of greedy algorithm variants. To address this challenge, we extend the adaptive query model, recently proposed for the greedy forward selection for submodular functions, to the faster paradigm of Orthogonal Matching Pursuit for non-submodular functions. The proposed algorithm achieves exponentially fast parallel run time in the adaptive query model, scaling much better than prior work. Furthermore, our extension allows the use of downward-closed constraints, which can be used to encode certain fairness criteria into the feature selection process. We prove strong approximation guarantees for the algorithm based on standard assumptions. These guarantees are applicable to many parametric models, including Generalized Linear Models. Finally, we demonstrate empirically that the proposed algorithm competes favorably with state-of-the-art techniques for feature selection, on real-world and synthetic datasets.
APA
Quinzan, F., Khanna, R., Hershcovitch, M., Cohen, S., Waddington, D., Friedrich, T. & Mahoney, M.W. (2023). Fast Feature Selection with Fairness Constraints. Proceedings of The 26th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 206:7800-7823. Available from https://proceedings.mlr.press/v206/quinzan23a.html.