Generic Coreset for Scalable Learning of Monotonic Kernels: Logistic Regression, Sigmoid and more

Elad Tolochinksy, Ibrahim Jubran, Dan Feldman
Proceedings of the 39th International Conference on Machine Learning, PMLR 162:21520-21547, 2022.

Abstract

A coreset (or core-set) is a small weighted subset $Q$ of an input set $P$, with respect to a given monotonic function $f:\mathbb{R}\to\mathbb{R}$, that provably approximates the fitting loss $\sum_{p\in P}f(p\cdot x)$ for every $x\in\mathbb{R}^d$. Using $Q$, we can obtain an approximation of the minimizer $x^*$ of this loss by running existing optimization algorithms on $Q$ instead of $P$. In this work we provide: (i) A lower bound proving that there are input sets that admit no coreset smaller than $n=|P|$ for general monotonic loss functions. (ii) A proof that, with an additional common regularization term and under a natural assumption that holds, e.g., for the logistic regression and sigmoid activation functions, a small coreset exists for any input $P$. (iii) A generic coreset construction algorithm that computes such a small coreset $Q$ in $O(nd+n\log n)$ time. (iv) Experimental results, with open-source code, demonstrating that our coresets are effective and much smaller in practice than the theory predicts.
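To make the workflow concrete, below is a minimal Python sketch (not from the paper) of how such a coreset is used: sample a small weighted subset $Q$ of $P$, then evaluate, or optimize, the regularized loss on $Q$ instead of $P$. The sampling probabilities proportional to $1+\lVert p\rVert$ are a heuristic stand-in for the sensitivity scores of the paper's construction, and the $\lambda\lVert x\rVert^2$ regularizer is one common choice that may differ from the authors'; the actual algorithm, its guarantees, and the open-source implementation are in the paper.

import numpy as np

def regularized_loss(P, x, w=None, lam=1.0):
    # Weighted fitting loss sum_p w_p * f(p . x) with the logistic loss
    # f(z) = log(1 + e^z), computed stably via logaddexp(0, z), plus an
    # L2 term lam * ||x||^2 (an assumed regularizer, for illustration).
    if w is None:
        w = np.ones(len(P))
    return w @ np.logaddexp(0.0, P @ x) + lam * (x @ x)

def coreset(P, m, rng):
    # Importance-sample m rows of P and reweight by inverse probability,
    # so the weighted loss on Q is an unbiased estimate of the loss on P.
    # Probabilities proportional to 1 + ||p|| are a heuristic surrogate
    # for the paper's sensitivity-based sampling distribution.
    s = 1.0 + np.linalg.norm(P, axis=1)
    prob = s / s.sum()
    idx = rng.choice(len(P), size=m, p=prob)
    return P[idx], 1.0 / (m * prob[idx])

rng = np.random.default_rng(0)
P = rng.standard_normal((100_000, 10))   # n = 100,000 points in R^10
x = rng.standard_normal(10)              # an arbitrary query
Q, w = coreset(P, m=2_000, rng=rng)
print(regularized_loss(P, x))            # loss on the full set
print(regularized_loss(Q, x, w=w))       # approximation from the coreset

In this toy setting the coreset is 2% of the input, and any off-the-shelf optimizer for $x$ can then be run on the weighted pair $(Q,w)$ in place of $P$.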

Cite this Paper


BibTeX
@InProceedings{pmlr-v162-tolochinksy22a,
  title     = {Generic Coreset for Scalable Learning of Monotonic Kernels: Logistic Regression, Sigmoid and more},
  author    = {Tolochinksy, Elad and Jubran, Ibrahim and Feldman, Dan},
  booktitle = {Proceedings of the 39th International Conference on Machine Learning},
  pages     = {21520--21547},
  year      = {2022},
  editor    = {Chaudhuri, Kamalika and Jegelka, Stefanie and Song, Le and Szepesvari, Csaba and Niu, Gang and Sabato, Sivan},
  volume    = {162},
  series    = {Proceedings of Machine Learning Research},
  month     = {17--23 Jul},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v162/tolochinksy22a/tolochinksy22a.pdf},
  url       = {https://proceedings.mlr.press/v162/tolochinksy22a.html}
}
Endnote
%0 Conference Paper
%T Generic Coreset for Scalable Learning of Monotonic Kernels: Logistic Regression, Sigmoid and more
%A Elad Tolochinksy
%A Ibrahim Jubran
%A Dan Feldman
%B Proceedings of the 39th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2022
%E Kamalika Chaudhuri
%E Stefanie Jegelka
%E Le Song
%E Csaba Szepesvari
%E Gang Niu
%E Sivan Sabato
%F pmlr-v162-tolochinksy22a
%I PMLR
%P 21520--21547
%U https://proceedings.mlr.press/v162/tolochinksy22a.html
%V 162
APA
Tolochinksy, E., Jubran, I. & Feldman, D. (2022). Generic Coreset for Scalable Learning of Monotonic Kernels: Logistic Regression, Sigmoid and more. Proceedings of the 39th International Conference on Machine Learning, in Proceedings of Machine Learning Research 162:21520-21547. Available from https://proceedings.mlr.press/v162/tolochinksy22a.html.