Feature Selection for Linear SVM with Provable Guarantees

Saurabh Paul; Malik Magdon-Ismail; Petros Drineas

Feature Selection for Linear SVM with Provable Guarantees

Saurabh Paul, Malik Magdon-Ismail, Petros Drineas

Proceedings of the Eighteenth International Conference on Artificial Intelligence and Statistics, PMLR 38:735-743, 2015.

Abstract

We give two provably accurate feature-selection techniques for the linear SVM. The algorithms run in deterministic and randomized time respectively. Our algorithms can be used in an unsupervised or supervised setting. The supervised approach is based on sampling features from support vectors. We prove that the margin in the feature space is preserved to within ε-relative error of the margin in the full feature space in the worst-case. In the unsupervised setting, we also provide worst-case guarantees of the radius of the minimum enclosing ball, thereby ensuring comparable generalization as in the full feature space and resolving an open problem posed in Dasgupta et al. We present extensive experiments on real-world datasets to support our theory and to demonstrate that our methods are competitive and often better than prior state-of-the-art, for which there are no known provable guarantees.

Cite this Paper

BibTeX


@InProceedings{pmlr-v38-paul15,
  title = 	 {{Feature Selection for Linear SVM with Provable Guarantees}},
  author = 	 {Paul, Saurabh and Magdon-Ismail, Malik and Drineas, Petros},
  booktitle = 	 {Proceedings of the Eighteenth International Conference on Artificial Intelligence and Statistics},
  pages = 	 {735--743},
  year = 	 {2015},
  editor = 	 {Lebanon, Guy and Vishwanathan, S. V. N.},
  volume = 	 {38},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {San Diego, California, USA},
  month = 	 {09--12 May},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v38/paul15.pdf},
  url = 	 {https://proceedings.mlr.press/v38/paul15.html},
  abstract = 	 {We give two provably accurate feature-selection techniques for the linear SVM. The algorithms run in deterministic and randomized time respectively. Our algorithms can be used in an unsupervised or supervised setting. The supervised approach is based on sampling features from support vectors. We prove that the margin in the feature space is preserved to within ε-relative error of the margin in the full feature space in the worst-case. In the unsupervised setting, we also provide worst-case guarantees of the radius of the minimum enclosing ball, thereby ensuring comparable generalization as in the full feature space and resolving an open problem posed in Dasgupta et al. We present extensive experiments on real-world datasets to support our theory and to demonstrate that our methods are competitive and often better than prior state-of-the-art, for which there are no known provable guarantees.}
}

Endnote

%0 Conference Paper
%T Feature Selection for Linear SVM with Provable Guarantees
%A Saurabh Paul
%A Malik Magdon-Ismail
%A Petros Drineas
%B Proceedings of the Eighteenth International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2015
%E Guy Lebanon
%E S. V. N. Vishwanathan	
%F pmlr-v38-paul15
%I PMLR
%P 735--743
%U https://proceedings.mlr.press/v38/paul15.html
%V 38
%X We give two provably accurate feature-selection techniques for the linear SVM. The algorithms run in deterministic and randomized time respectively. Our algorithms can be used in an unsupervised or supervised setting. The supervised approach is based on sampling features from support vectors. We prove that the margin in the feature space is preserved to within ε-relative error of the margin in the full feature space in the worst-case. In the unsupervised setting, we also provide worst-case guarantees of the radius of the minimum enclosing ball, thereby ensuring comparable generalization as in the full feature space and resolving an open problem posed in Dasgupta et al. We present extensive experiments on real-world datasets to support our theory and to demonstrate that our methods are competitive and often better than prior state-of-the-art, for which there are no known provable guarantees.

RIS


TY  - CPAPER
TI  - Feature Selection for Linear SVM with Provable Guarantees
AU  - Saurabh Paul
AU  - Malik Magdon-Ismail
AU  - Petros Drineas
BT  - Proceedings of the Eighteenth International Conference on Artificial Intelligence and Statistics
DA  - 2015/02/21
ED  - Guy Lebanon
ED  - S. V. N. Vishwanathan	
ID  - pmlr-v38-paul15
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 38
SP  - 735
EP  - 743
L1  - http://proceedings.mlr.press/v38/paul15.pdf
UR  - https://proceedings.mlr.press/v38/paul15.html
AB  - We give two provably accurate feature-selection techniques for the linear SVM. The algorithms run in deterministic and randomized time respectively. Our algorithms can be used in an unsupervised or supervised setting. The supervised approach is based on sampling features from support vectors. We prove that the margin in the feature space is preserved to within ε-relative error of the margin in the full feature space in the worst-case. In the unsupervised setting, we also provide worst-case guarantees of the radius of the minimum enclosing ball, thereby ensuring comparable generalization as in the full feature space and resolving an open problem posed in Dasgupta et al. We present extensive experiments on real-world datasets to support our theory and to demonstrate that our methods are competitive and often better than prior state-of-the-art, for which there are no known provable guarantees.
ER  -

APA


Paul, S., Magdon-Ismail, M. & Drineas, P.. (2015). Feature Selection for Linear SVM with Provable Guarantees. Proceedings of the Eighteenth International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 38:735-743 Available from https://proceedings.mlr.press/v38/paul15.html.

Feature Selection for Linear SVM with Provable Guarantees

Abstract

Cite this Paper

Related Material