Theory of Dual-sparse Regularized Randomized Reduction

Tianbao Yang; Lijun Zhang; Rong Jin; Shenghuo Zhu

Proceedings of the 32nd International Conference on Machine Learning, PMLR 37:305-314, 2015.

Abstract

In this paper, we study randomized reduction methods, which reduce high-dimensional features into low-dimensional space by randomized methods (e.g., random projection, random hashing), for large-scale high-dimensional classification. Previous theoretical results on randomized reduction methods hinge on strong assumptions about the data, e.g., low rank of the data matrix or a large separable margin of classification, which hinder their applications in broad domains. To address these limitations, we propose dual-sparse regularized randomized reduction methods that introduce a sparse regularizer into the reduced dual problem. Under a mild condition that the original dual solution is a (nearly) sparse vector, we show that the resulting dual solution is close to the original dual solution and concentrates on its support set. In numerical experiments, we present an empirical study to support the analysis and we also present a novel application of the dual-sparse randomized reduction methods to reducing the communication cost of distributed learning from large-scale high-dimensional data.

Cite this Paper

BibTeX


@InProceedings{pmlr-v37-yangb15,
  title = 	 {Theory of Dual-sparse Regularized Randomized Reduction},
  author = 	 {Yang, Tianbao and Zhang, Lijun and Jin, Rong and Zhu, Shenghuo},
  booktitle = 	 {Proceedings of the 32nd International Conference on Machine Learning},
  pages = 	 {305--314},
  year = 	 {2015},
  editor = 	 {Bach, Francis and Blei, David},
  volume = 	 {37},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {Lille, France},
  month = 	 {07--09 Jul},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v37/yangb15.pdf},
  url = 	 {https://proceedings.mlr.press/v37/yangb15.html},
  abstract = 	 {In this paper, we study randomized reduction methods, which reduce high-dimensional features into low-dimensional space by randomized methods (e.g., random projection, random hashing), for large-scale high-dimensional classification. Previous theoretical results on randomized reduction methods hinge on strong assumptions about the data, e.g., low rank of the data matrix or a large separable margin of classification, which hinder their applications in broad domains. To address these limitations, we propose dual-sparse regularized randomized reduction methods that introduce a sparse regularizer into the reduced dual problem. Under a mild condition that the original dual solution is a (nearly) sparse vector, we show that the resulting dual solution is close to the original dual solution and concentrates on its support set. In numerical experiments, we present an empirical study to support the analysis and we also present a novel application of the dual-sparse randomized reduction methods to reducing the communication cost of distributed learning from large-scale high-dimensional data.}
}

Endnote

%0 Conference Paper
%T Theory of Dual-sparse Regularized Randomized Reduction
%A Tianbao Yang
%A Lijun Zhang
%A Rong Jin
%A Shenghuo Zhu
%B Proceedings of the 32nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2015
%E Francis Bach
%E David Blei
%F pmlr-v37-yangb15
%I PMLR
%P 305--314
%U https://proceedings.mlr.press/v37/yangb15.html
%V 37
%X In this paper, we study randomized reduction methods, which reduce high-dimensional features into low-dimensional space by randomized methods (e.g., random projection, random hashing), for large-scale high-dimensional classification. Previous theoretical results on randomized reduction methods hinge on strong assumptions about the data, e.g., low rank of the data matrix or a large separable margin of classification, which hinder their applications in broad domains. To address these limitations, we propose dual-sparse regularized randomized reduction methods that introduce a sparse regularizer into the reduced dual problem. Under a mild condition that the original dual solution is a (nearly) sparse vector, we show that the resulting dual solution is close to the original dual solution and concentrates on its support set. In numerical experiments, we present an empirical study to support the analysis and we also present a novel application of the dual-sparse randomized reduction methods to reducing the communication cost of distributed learning from large-scale high-dimensional data.

RIS


TY  - CPAPER
TI  - Theory of Dual-sparse Regularized Randomized Reduction
AU  - Tianbao Yang
AU  - Lijun Zhang
AU  - Rong Jin
AU  - Shenghuo Zhu
BT  - Proceedings of the 32nd International Conference on Machine Learning
DA  - 2015/06/01
ED  - Francis Bach
ED  - David Blei
ID  - pmlr-v37-yangb15
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 37
SP  - 305
EP  - 314
L1  - http://proceedings.mlr.press/v37/yangb15.pdf
UR  - https://proceedings.mlr.press/v37/yangb15.html
AB  - In this paper, we study randomized reduction methods, which reduce high-dimensional features into low-dimensional space by randomized methods (e.g., random projection, random hashing), for large-scale high-dimensional classification. Previous theoretical results on randomized reduction methods hinge on strong assumptions about the data, e.g., low rank of the data matrix or a large separable margin of classification, which hinder their applications in broad domains. To address these limitations, we propose dual-sparse regularized randomized reduction methods that introduce a sparse regularizer into the reduced dual problem. Under a mild condition that the original dual solution is a (nearly) sparse vector, we show that the resulting dual solution is close to the original dual solution and concentrates on its support set. In numerical experiments, we present an empirical study to support the analysis and we also present a novel application of the dual-sparse randomized reduction methods to reducing the communication cost of distributed learning from large-scale high-dimensional data.
ER  -

APA


Yang, T., Zhang, L., Jin, R. & Zhu, S. (2015). Theory of Dual-sparse Regularized Randomized Reduction. Proceedings of the 32nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 37:305-314. Available from https://proceedings.mlr.press/v37/yangb15.html.
