Finding Relevant Information via a Discrete Fourier Expansion

Mohsen Heidari, Jithin Sreedharan, Gil I Shamir, Wojciech Szpankowski
Proceedings of the 38th International Conference on Machine Learning, PMLR 139:4181-4191, 2021.

Abstract

A fundamental obstacle in learning information from data is the presence of nonlinear redundancies and dependencies in it. To address this, we propose a Fourier-based approach to extract relevant information in the supervised setting. We first develop a novel Fourier expansion for functions of correlated binary random variables. This expansion is a generalization of the standard Fourier analysis on the Boolean cube beyond product probability spaces. We further extend our Fourier analysis to stochastic mappings. As an important application of this analysis, we investigate learning with feature subset selection. We reformulate this problem in the Fourier domain and introduce a computationally efficient measure for selecting features. Bridging the Bayesian error rate with the Fourier coefficients, we demonstrate that the Fourier expansion provides a powerful tool to characterize nonlinear dependencies in the features-label relation. Via theoretical analysis, we show that our proposed measure finds provably asymptotically optimal feature subsets. Lastly, we present an algorithm based on our measure and verify our findings via numerical experiments on various datasets.
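As background for the expansion the abstract describes, the standard Fourier expansion on the Boolean cube under the uniform product measure (the special case the paper generalizes beyond) can be sketched as follows. The majority function and all variable names here are illustrative choices, not taken from the paper: every function f on {-1,+1}^n decomposes as a sum of parity characters chi_S weighted by coefficients f_hat(S) = E[f(X) chi_S(X)].

```python
from itertools import product, combinations

n = 3

def f(x):
    # Illustrative Boolean function: majority of three {-1,+1} bits.
    return 1 if sum(x) > 0 else -1

def chi(S, x):
    # Parity character chi_S(x) = product of x_i over i in S.
    out = 1
    for i in S:
        out *= x[i]
    return out

points = list(product([-1, 1], repeat=n))

# Fourier coefficients under the uniform (product) measure:
# f_hat(S) = E[f(X) * chi_S(X)], averaged over all 2^n points.
coeffs = {}
for k in range(n + 1):
    for S in combinations(range(n), k):
        coeffs[S] = sum(f(x) * chi(S, x) for x in points) / len(points)

# Exact reconstruction: f(x) = sum over S of f_hat(S) * chi_S(x).
for x in points:
    recon = sum(c * chi(S, x) for S, c in coeffs.items())
    assert abs(recon - f(x)) < 1e-9
```

For majority on three bits this yields f_hat({i}) = 1/2 for each singleton and f_hat({0,1,2}) = -1/2, with Parseval's identity giving a squared-coefficient sum of 1; the paper's contribution is an analogous expansion when the coordinates are correlated rather than independent.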

Cite this Paper


BibTeX
@InProceedings{pmlr-v139-heidari21a,
  title     = {Finding Relevant Information via a Discrete Fourier Expansion},
  author    = {Heidari, Mohsen and Sreedharan, Jithin and Shamir, Gil I and Szpankowski, Wojciech},
  booktitle = {Proceedings of the 38th International Conference on Machine Learning},
  pages     = {4181--4191},
  year      = {2021},
  editor    = {Meila, Marina and Zhang, Tong},
  volume    = {139},
  series    = {Proceedings of Machine Learning Research},
  month     = {18--24 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v139/heidari21a/heidari21a.pdf},
  url       = {https://proceedings.mlr.press/v139/heidari21a.html},
  abstract  = {A fundamental obstacle in learning information from data is the presence of nonlinear redundancies and dependencies in it. To address this, we propose a Fourier-based approach to extract relevant information in the supervised setting. We first develop a novel Fourier expansion for functions of correlated binary random variables. This expansion is a generalization of the standard Fourier analysis on the Boolean cube beyond product probability spaces. We further extend our Fourier analysis to stochastic mappings. As an important application of this analysis, we investigate learning with feature subset selection. We reformulate this problem in the Fourier domain and introduce a computationally efficient measure for selecting features. Bridging the Bayesian error rate with the Fourier coefficients, we demonstrate that the Fourier expansion provides a powerful tool to characterize nonlinear dependencies in the features-label relation. Via theoretical analysis, we show that our proposed measure finds provably asymptotically optimal feature subsets. Lastly, we present an algorithm based on our measure and verify our findings via numerical experiments on various datasets.}
}
Endnote
%0 Conference Paper
%T Finding Relevant Information via a Discrete Fourier Expansion
%A Mohsen Heidari
%A Jithin Sreedharan
%A Gil I Shamir
%A Wojciech Szpankowski
%B Proceedings of the 38th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2021
%E Marina Meila
%E Tong Zhang
%F pmlr-v139-heidari21a
%I PMLR
%P 4181--4191
%U https://proceedings.mlr.press/v139/heidari21a.html
%V 139
%X A fundamental obstacle in learning information from data is the presence of nonlinear redundancies and dependencies in it. To address this, we propose a Fourier-based approach to extract relevant information in the supervised setting. We first develop a novel Fourier expansion for functions of correlated binary random variables. This expansion is a generalization of the standard Fourier analysis on the Boolean cube beyond product probability spaces. We further extend our Fourier analysis to stochastic mappings. As an important application of this analysis, we investigate learning with feature subset selection. We reformulate this problem in the Fourier domain and introduce a computationally efficient measure for selecting features. Bridging the Bayesian error rate with the Fourier coefficients, we demonstrate that the Fourier expansion provides a powerful tool to characterize nonlinear dependencies in the features-label relation. Via theoretical analysis, we show that our proposed measure finds provably asymptotically optimal feature subsets. Lastly, we present an algorithm based on our measure and verify our findings via numerical experiments on various datasets.
APA
Heidari, M., Sreedharan, J., Shamir, G. I. &amp; Szpankowski, W. (2021). Finding Relevant Information via a Discrete Fourier Expansion. Proceedings of the 38th International Conference on Machine Learning, in Proceedings of Machine Learning Research 139:4181-4191. Available from https://proceedings.mlr.press/v139/heidari21a.html.