A New Generalized Error Path Algorithm for Model Selection

Bin Gu, Charles Ling
; Proceedings of the 32nd International Conference on Machine Learning, PMLR 37:2549-2558, 2015.

Abstract

Model selection with cross validation (CV) is very popular in machine learning. However, CV with grid and other common search strategies cannot guarantee to find the model with minimum CV error, which is often the ultimate goal of model selection. Recently, various solution path algorithms have been proposed for several important learning algorithms including support vector classification, Lasso, and so on. However, they still do not guarantee to find the model with minimum CV error.In this paper, we first show that the solution paths produced by various algorithms have the property of piecewise linearity. Then, we prove that a large class of error (or loss) functions are piecewise constant, linear, or quadratic w.r.t. the regularization parameter, based on the solution path. Finally, we propose a new generalized error path algorithm (GEP), and prove that it will find the model with minimum CV error for the entire range of the regularization parameter. The experimental results on a variety of datasets not only confirm our theoretical findings, but also show that the best model with our GEP has better generalization error on the test data, compared to the grid search, manual search, and random search.

Cite this Paper


BibTeX
@InProceedings{pmlr-v37-gu15, title = {A New Generalized Error Path Algorithm for Model Selection}, author = {Bin Gu and Charles Ling}, pages = {2549--2558}, year = {2015}, editor = {Francis Bach and David Blei}, volume = {37}, series = {Proceedings of Machine Learning Research}, address = {Lille, France}, month = {07--09 Jul}, publisher = {PMLR}, pdf = {http://proceedings.mlr.press/v37/gu15.pdf}, url = {http://proceedings.mlr.press/v37/gu15.html}, abstract = {Model selection with cross validation (CV) is very popular in machine learning. However, CV with grid and other common search strategies cannot guarantee to find the model with minimum CV error, which is often the ultimate goal of model selection. Recently, various solution path algorithms have been proposed for several important learning algorithms including support vector classification, Lasso, and so on. However, they still do not guarantee to find the model with minimum CV error.In this paper, we first show that the solution paths produced by various algorithms have the property of piecewise linearity. Then, we prove that a large class of error (or loss) functions are piecewise constant, linear, or quadratic w.r.t. the regularization parameter, based on the solution path. Finally, we propose a new generalized error path algorithm (GEP), and prove that it will find the model with minimum CV error for the entire range of the regularization parameter. The experimental results on a variety of datasets not only confirm our theoretical findings, but also show that the best model with our GEP has better generalization error on the test data, compared to the grid search, manual search, and random search.} }
Endnote
%0 Conference Paper %T A New Generalized Error Path Algorithm for Model Selection %A Bin Gu %A Charles Ling %B Proceedings of the 32nd International Conference on Machine Learning %C Proceedings of Machine Learning Research %D 2015 %E Francis Bach %E David Blei %F pmlr-v37-gu15 %I PMLR %J Proceedings of Machine Learning Research %P 2549--2558 %U http://proceedings.mlr.press %V 37 %W PMLR %X Model selection with cross validation (CV) is very popular in machine learning. However, CV with grid and other common search strategies cannot guarantee to find the model with minimum CV error, which is often the ultimate goal of model selection. Recently, various solution path algorithms have been proposed for several important learning algorithms including support vector classification, Lasso, and so on. However, they still do not guarantee to find the model with minimum CV error.In this paper, we first show that the solution paths produced by various algorithms have the property of piecewise linearity. Then, we prove that a large class of error (or loss) functions are piecewise constant, linear, or quadratic w.r.t. the regularization parameter, based on the solution path. Finally, we propose a new generalized error path algorithm (GEP), and prove that it will find the model with minimum CV error for the entire range of the regularization parameter. The experimental results on a variety of datasets not only confirm our theoretical findings, but also show that the best model with our GEP has better generalization error on the test data, compared to the grid search, manual search, and random search.
RIS
TY - CPAPER TI - A New Generalized Error Path Algorithm for Model Selection AU - Bin Gu AU - Charles Ling BT - Proceedings of the 32nd International Conference on Machine Learning PY - 2015/06/01 DA - 2015/06/01 ED - Francis Bach ED - David Blei ID - pmlr-v37-gu15 PB - PMLR SP - 2549 DP - PMLR EP - 2558 L1 - http://proceedings.mlr.press/v37/gu15.pdf UR - http://proceedings.mlr.press/v37/gu15.html AB - Model selection with cross validation (CV) is very popular in machine learning. However, CV with grid and other common search strategies cannot guarantee to find the model with minimum CV error, which is often the ultimate goal of model selection. Recently, various solution path algorithms have been proposed for several important learning algorithms including support vector classification, Lasso, and so on. However, they still do not guarantee to find the model with minimum CV error.In this paper, we first show that the solution paths produced by various algorithms have the property of piecewise linearity. Then, we prove that a large class of error (or loss) functions are piecewise constant, linear, or quadratic w.r.t. the regularization parameter, based on the solution path. Finally, we propose a new generalized error path algorithm (GEP), and prove that it will find the model with minimum CV error for the entire range of the regularization parameter. The experimental results on a variety of datasets not only confirm our theoretical findings, but also show that the best model with our GEP has better generalization error on the test data, compared to the grid search, manual search, and random search. ER -
APA
Gu, B. & Ling, C.. (2015). A New Generalized Error Path Algorithm for Model Selection. Proceedings of the 32nd International Conference on Machine Learning, in PMLR 37:2549-2558

Related Material