Efficient Distributed Linear Classification Algorithms via the Alternating Direction Method of Multipliers

Caoxie Zhang; Honglak Lee; Kang Shin

Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics, PMLR 22:1398-1406, 2012.

Abstract

Linear classification has demonstrated success in many application areas. Modern algorithms for linear classification can train reasonably good models while going through the data in only tens of rounds. However, large datasets often do not fit in the memory of a single machine, which makes disk I/O, not the CPU, the bottleneck in large-scale learning. Following this observation, Yu et al. (2010) made significant progress in reducing disk usage, and their algorithms now outperform LIBLINEAR. In this paper, rather than optimizing algorithms on a single machine, we propose and implement distributed algorithms that achieve parallel disk loading and access the disk only once. Our large-scale learning algorithms are based on the framework of the alternating direction method of multipliers. The framework yields a subproblem that must be solved efficiently, for which we propose using dual coordinate descent. Our experimental evaluations on large datasets demonstrate that the proposed algorithms achieve significant speedup over the classifier proposed by Yu et al. running on a single machine. Our algorithms are also faster than existing distributed solvers, such as the parallel stochastic gradient descent of Zinkevich et al. (2010) and Vowpal Wabbit.
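The combination described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a hinge-loss SVM with an L2 regularizer on the consensus vector, scaled-form consensus ADMM, and a single process that loops over data blocks in place of real machines. The names `dot`, `dcd_subproblem`, and `admm_svm`, and the parameters `C`, `rho`, `epochs`, and `iters`, are hypothetical choices made for illustration.

```python
import random


def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(ai * bi for ai, bi in zip(a, b))


def dcd_subproblem(X, y, v, alpha, C, rho, epochs=10, rng=None):
    """Approximately solve min_w C*sum_j hinge(y_j w.x_j) + (rho/2)*||w - v||^2
    by dual coordinate descent; alpha is updated in place (warm start)."""
    if rng is None:
        rng = random.Random(0)
    # Primal point implied by the duals: w = v + (1/rho) * sum_j alpha_j y_j x_j
    w = list(v)
    for xj, yj, aj in zip(X, y, alpha):
        for k, xjk in enumerate(xj):
            w[k] += aj * yj * xjk / rho
    order = list(range(len(y)))
    for _ in range(epochs):
        rng.shuffle(order)
        for j in order:
            q = dot(X[j], X[j]) / rho          # curvature along coordinate j
            if q == 0.0:
                continue
            grad = y[j] * dot(X[j], w) - 1.0   # dual gradient along coordinate j
            new_alpha = min(max(alpha[j] - grad / q, 0.0), C)  # box constraint
            step = (new_alpha - alpha[j]) * y[j] / rho
            for k, xjk in enumerate(X[j]):
                w[k] += step * xjk
            alpha[j] = new_alpha
    return w


def admm_svm(blocks, C=1.0, rho=1.0, iters=30, seed=0):
    """Consensus ADMM over data blocks; each block stands in for one machine."""
    rng = random.Random(seed)
    m, d = len(blocks), len(blocks[0][0][0])
    z = [0.0] * d                               # consensus weights
    u = [[0.0] * d for _ in range(m)]           # scaled dual variables
    alphas = [[0.0] * len(y) for _, y in blocks]
    for _ in range(iters):
        ws = []
        for i, (X, y) in enumerate(blocks):     # independent local solves
            v = [zk - uk for zk, uk in zip(z, u[i])]
            ws.append(dcd_subproblem(X, y, v, alphas[i], C, rho, rng=rng))
        # Closed-form z-update for the regularizer (1/2)*||z||^2
        z = [rho * sum(ws[i][k] + u[i][k] for i in range(m)) / (1.0 + rho * m)
             for k in range(d)]
        for i in range(m):
            u[i] = [u[i][k] + ws[i][k] - z[k] for k in range(d)]
    return z
```

In this sketch each block is a pair `(X, y)` of feature rows and labels in {-1, +1}. Only the vectors `w`, `u`, and `z` would need to be exchanged between machines, while each data block stays resident where it was first loaded.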

Cite this Paper

BibTeX


@InProceedings{pmlr-v22-zhang12a,
  title = 	 {Efficient Distributed Linear Classification Algorithms via the Alternating Direction Method of Multipliers},
  author = 	 {Zhang, Caoxie and Lee, Honglak and Shin, Kang},
  booktitle = 	 {Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics},
  pages = 	 {1398--1406},
  year = 	 {2012},
  editor = 	 {Lawrence, Neil D. and Girolami, Mark},
  volume = 	 {22},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {La Palma, Canary Islands},
  month = 	 {21--23 Apr},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v22/zhang12a/zhang12a.pdf},
  url = 	 {https://proceedings.mlr.press/v22/zhang12a.html},
  abstract = 	 {Linear classification has demonstrated success in many application areas. Modern algorithms for linear classification can train reasonably good models while going through the data in only tens of rounds. However, large datasets often do not fit in the memory of a single machine, which makes disk I/O, not the CPU, the bottleneck in large-scale learning. Following this observation, Yu et al. (2010) made significant progress in reducing disk usage, and their algorithms now outperform LIBLINEAR. In this paper, rather than optimizing algorithms on a single machine, we propose and implement distributed algorithms that achieve parallel disk loading and access the disk only once. Our large-scale learning algorithms are based on the framework of the alternating direction method of multipliers. The framework yields a subproblem that must be solved efficiently, for which we propose using dual coordinate descent. Our experimental evaluations on large datasets demonstrate that the proposed algorithms achieve significant speedup over the classifier proposed by Yu et al. running on a single machine. Our algorithms are also faster than existing distributed solvers, such as the parallel stochastic gradient descent of Zinkevich et al. (2010) and Vowpal Wabbit.}
}

Endnote

%0 Conference Paper
%T Efficient Distributed Linear Classification Algorithms via the Alternating Direction Method of Multipliers
%A Caoxie Zhang
%A Honglak Lee
%A Kang Shin
%B Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2012
%E Neil D. Lawrence
%E Mark Girolami
%F pmlr-v22-zhang12a
%I PMLR
%P 1398--1406
%U https://proceedings.mlr.press/v22/zhang12a.html
%V 22
%X Linear classification has demonstrated success in many application areas. Modern algorithms for linear classification can train reasonably good models while going through the data in only tens of rounds. However, large datasets often do not fit in the memory of a single machine, which makes disk I/O, not the CPU, the bottleneck in large-scale learning. Following this observation, Yu et al. (2010) made significant progress in reducing disk usage, and their algorithms now outperform LIBLINEAR. In this paper, rather than optimizing algorithms on a single machine, we propose and implement distributed algorithms that achieve parallel disk loading and access the disk only once. Our large-scale learning algorithms are based on the framework of the alternating direction method of multipliers. The framework yields a subproblem that must be solved efficiently, for which we propose using dual coordinate descent. Our experimental evaluations on large datasets demonstrate that the proposed algorithms achieve significant speedup over the classifier proposed by Yu et al. running on a single machine. Our algorithms are also faster than existing distributed solvers, such as the parallel stochastic gradient descent of Zinkevich et al. (2010) and Vowpal Wabbit.

RIS


TY  - CPAPER
TI  - Efficient Distributed Linear Classification Algorithms via the Alternating Direction Method of Multipliers
AU  - Caoxie Zhang
AU  - Honglak Lee
AU  - Kang Shin
BT  - Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics
DA  - 2012/03/21
ED  - Neil D. Lawrence
ED  - Mark Girolami
ID  - pmlr-v22-zhang12a
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 22
SP  - 1398
EP  - 1406
L1  - http://proceedings.mlr.press/v22/zhang12a/zhang12a.pdf
UR  - https://proceedings.mlr.press/v22/zhang12a.html
AB  - Linear classification has demonstrated success in many application areas. Modern algorithms for linear classification can train reasonably good models while going through the data in only tens of rounds. However, large datasets often do not fit in the memory of a single machine, which makes disk I/O, not the CPU, the bottleneck in large-scale learning. Following this observation, Yu et al. (2010) made significant progress in reducing disk usage, and their algorithms now outperform LIBLINEAR. In this paper, rather than optimizing algorithms on a single machine, we propose and implement distributed algorithms that achieve parallel disk loading and access the disk only once. Our large-scale learning algorithms are based on the framework of the alternating direction method of multipliers. The framework yields a subproblem that must be solved efficiently, for which we propose using dual coordinate descent. Our experimental evaluations on large datasets demonstrate that the proposed algorithms achieve significant speedup over the classifier proposed by Yu et al. running on a single machine. Our algorithms are also faster than existing distributed solvers, such as the parallel stochastic gradient descent of Zinkevich et al. (2010) and Vowpal Wabbit.
ER  -

APA


Zhang, C., Lee, H. & Shin, K. (2012). Efficient Distributed Linear Classification Algorithms via the Alternating Direction Method of Multipliers. Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 22:1398-1406. Available from https://proceedings.mlr.press/v22/zhang12a.html.
