WHInter: A Working set algorithm for High-dimensional sparse second order Interaction models

Marine Le Morvan, Jean-Philippe Vert
Proceedings of the 35th International Conference on Machine Learning, PMLR 80:3635-3644, 2018.

Abstract

Learning sparse linear models with two-way interactions is desirable in many application domains such as genomics. $\ell_1$-regularised linear models are popular to estimate sparse models, yet standard implementations fail to address specifically the quadratic explosion of candidate two-way interactions in high dimensions, and typically do not scale to genetic data with hundreds of thousands of features. Here we present WHInter, a working set algorithm to solve large $\ell_1$-regularised problems with two-way interactions for binary design matrices. The novelty of WHInter stems from a new bound to efficiently identify working sets while avoiding to scan all features, and on fast computations inspired from solutions to the maximum inner product search problem. We apply WHInter to simulated and real genetic data and show that it is more scalable and two orders of magnitude faster than the state of the art.

Cite this Paper


BibTeX
@InProceedings{pmlr-v80-morvan18a, title = {{WHI}nter: A Working set algorithm for High-dimensional sparse second order Interaction models}, author = {Morvan, Marine Le and Vert, Jean-Philippe}, booktitle = {Proceedings of the 35th International Conference on Machine Learning}, pages = {3635--3644}, year = {2018}, editor = {Dy, Jennifer and Krause, Andreas}, volume = {80}, series = {Proceedings of Machine Learning Research}, month = {10--15 Jul}, publisher = {PMLR}, pdf = {http://proceedings.mlr.press/v80/morvan18a/morvan18a.pdf}, url = {https://proceedings.mlr.press/v80/morvan18a.html}, abstract = {Learning sparse linear models with two-way interactions is desirable in many application domains such as genomics. $\ell_1$-regularised linear models are popular to estimate sparse models, yet standard implementations fail to address specifically the quadratic explosion of candidate two-way interactions in high dimensions, and typically do not scale to genetic data with hundreds of thousands of features. Here we present WHInter, a working set algorithm to solve large $\ell_1$-regularised problems with two-way interactions for binary design matrices. The novelty of WHInter stems from a new bound to efficiently identify working sets while avoiding to scan all features, and on fast computations inspired from solutions to the maximum inner product search problem. We apply WHInter to simulated and real genetic data and show that it is more scalable and two orders of magnitude faster than the state of the art.} }
Endnote
%0 Conference Paper %T WHInter: A Working set algorithm for High-dimensional sparse second order Interaction models %A Marine Le Morvan %A Jean-Philippe Vert %B Proceedings of the 35th International Conference on Machine Learning %C Proceedings of Machine Learning Research %D 2018 %E Jennifer Dy %E Andreas Krause %F pmlr-v80-morvan18a %I PMLR %P 3635--3644 %U https://proceedings.mlr.press/v80/morvan18a.html %V 80 %X Learning sparse linear models with two-way interactions is desirable in many application domains such as genomics. $\ell_1$-regularised linear models are popular to estimate sparse models, yet standard implementations fail to address specifically the quadratic explosion of candidate two-way interactions in high dimensions, and typically do not scale to genetic data with hundreds of thousands of features. Here we present WHInter, a working set algorithm to solve large $\ell_1$-regularised problems with two-way interactions for binary design matrices. The novelty of WHInter stems from a new bound to efficiently identify working sets while avoiding to scan all features, and on fast computations inspired from solutions to the maximum inner product search problem. We apply WHInter to simulated and real genetic data and show that it is more scalable and two orders of magnitude faster than the state of the art.
APA
Morvan, M.L. & Vert, J.. (2018). WHInter: A Working set algorithm for High-dimensional sparse second order Interaction models. Proceedings of the 35th International Conference on Machine Learning, in Proceedings of Machine Learning Research 80:3635-3644 Available from https://proceedings.mlr.press/v80/morvan18a.html.

Related Material