A Bayesian nonparametric procedure for comparing algorithms

Alessio Benavoli, Giorgio Corani, Francesca Mangili, Marco Zaffalon
Proceedings of the 32nd International Conference on Machine Learning, PMLR 37:1264-1272, 2015.

Abstract

A fundamental task in machine learning is to compare the performance of multiple algorithms. This is typically done with frequentist tests (usually the Friedman test followed by a series of multiple pairwise comparisons), which means dealing with null hypothesis significance tests and p-values, whose shortcomings are well known. First, we propose a nonparametric Bayesian version of the Friedman test using a Dirichlet process (DP) based prior. Our derivations show that, from a Bayesian perspective, the Friedman test is an inference for a multivariate mean based on an ellipsoid inclusion test. Second, we derive a joint procedure for the analysis of the multiple comparisons which accounts for their dependencies and which is based on the posterior probability computed through the DP. The proposed approach allows one to verify the null hypothesis, not merely reject it. Third, we apply our test to algorithm racing, i.e., the problem of identifying the best algorithm among a large set of candidates. We show by simulation that our approach is competitive in terms of both accuracy and speed in identifying the best algorithm.
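A minimal sketch of the core construction may help. When the DP prior strength tends to zero, the DP posterior over the mean-rank vector reduces to the Bayesian bootstrap, i.e., Dirichlet-weighted averages of the per-dataset ranks; the test then asks whether the null point of equal mean ranks, ((k+1)/2, ..., (k+1)/2), falls inside a credible ellipsoid of that posterior. The Python below is an illustrative sketch under those assumptions, not the authors' implementation: the function name dp_friedman_test, the Gaussian/Mahalanobis approximation of the credible ellipsoid, and all defaults are assumptions made here.

import numpy as np
from scipy import stats

def dp_friedman_test(scores, n_samples=5000, alpha=0.05, seed=None):
    """Bayesian-bootstrap analogue of the Friedman test (sketch).

    scores: (n_datasets, n_algorithms) array of performance measures,
    higher = better. Returns (reject, mean_rank_samples).
    """
    rng = np.random.default_rng(seed)
    n, k = scores.shape
    # Rank the k algorithms within each dataset (ties get average ranks).
    ranks = stats.rankdata(scores, axis=1)                  # (n, k)
    # DP posterior with prior strength s -> 0: Bayesian bootstrap,
    # i.e. Dirichlet(1, ..., 1) weights over the n datasets.
    w = rng.dirichlet(np.ones(n), size=n_samples)           # (n_samples, n)
    mean_ranks = w @ ranks                                  # (n_samples, k)
    # Null point: all algorithms equivalent, mean rank (k+1)/2 each.
    null = np.full(k, (k + 1) / 2)
    # Each sample sums to k(k+1)/2 exactly, so the k-dimensional covariance
    # is singular; drop one coordinate before the ellipsoid check.
    x = mean_ranks[:, :-1]
    mu = x.mean(axis=0)
    cov = np.cov(x, rowvar=False)
    # Gaussian approximation of the posterior: the null point lies inside
    # the 1-alpha credible ellipsoid iff its Mahalanobis distance is small.
    diff = null[:-1] - mu
    d2 = diff @ np.linalg.solve(cov, diff)
    reject = d2 > stats.chi2.ppf(1 - alpha, df=k - 1)
    return reject, mean_ranks

# Hypothetical example: 30 datasets, 4 algorithms, the last one better.
scores = np.random.default_rng(0).normal(size=(30, 4)) + np.array([0., 0., 0., .5])
reject, samples = dp_friedman_test(scores, seed=1)
print(reject, samples.mean(axis=0))

The same posterior samples also support the paper's other two steps in spirit: the fraction of samples in which algorithm i attains a better mean rank than algorithm j estimates the posterior probability used in the joint multiple-comparison analysis, and repeatedly discarding candidates whose posterior probability of being best drops below a threshold yields a racing scheme.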

Cite this Paper


BibTeX
@InProceedings{pmlr-v37-benavoli15,
  title     = {A Bayesian nonparametric procedure for comparing algorithms},
  author    = {Benavoli, Alessio and Corani, Giorgio and Mangili, Francesca and Zaffalon, Marco},
  booktitle = {Proceedings of the 32nd International Conference on Machine Learning},
  pages     = {1264--1272},
  year      = {2015},
  editor    = {Bach, Francis and Blei, David},
  volume    = {37},
  series    = {Proceedings of Machine Learning Research},
  address   = {Lille, France},
  month     = {07--09 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v37/benavoli15.pdf},
  url       = {https://proceedings.mlr.press/v37/benavoli15.html}
}
APA
Benavoli, A., Corani, G., Mangili, F. & Zaffalon, M. (2015). A Bayesian nonparametric procedure for comparing algorithms. Proceedings of the 32nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 37:1264-1272. Available from https://proceedings.mlr.press/v37/benavoli15.html.
