Exploiting Performance-based Similarity between Datasets in Metalearning
AAAI Workshop on Meta-Learning and MetaDL Challenge, PMLR 140:90-99, 2021.
This paper describes an improved version of a previously proposed algorithm selection method called active testing. Active testing seeks the workflow (or its particular configuration) expected to yield the highest gain in performance (e.g., accuracy). The new version uses a performance-based characterization of each dataset in the form of a vector of performance values of different algorithms. Dataset similarity is then assessed by comparing these performance vectors; one useful measure for this comparison is Spearman's correlation. An advantage of this measure is that it can easily be recalculated as more information is gathered. Consequently, as the tests proceed, the system's recommendations become increasingly adjusted to the characteristics of the target dataset. We show that this new strategy improves the results of the active testing approach.
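To illustrate the similarity measure the abstract refers to, the sketch below computes Spearman's rank correlation between two performance vectors. The datasets, algorithm accuracies, and function names are hypothetical; this is a minimal pure-Python illustration of the comparison step, not the authors' implementation (which in practice would handle missing tests and incremental updates).

```python
from statistics import mean

def ranks(values):
    # Assign 1-based average ranks, so tied performance values share a rank.
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def spearman(x, y):
    # Spearman's correlation = Pearson correlation of the rank vectors.
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    num = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    den = (sum((a - mx) ** 2 for a in rx)
           * sum((b - my) ** 2 for b in ry)) ** 0.5
    return num / den

# Hypothetical performance vectors: accuracies of the same four
# algorithms on two datasets. Identical algorithm rankings mean
# the datasets look maximally similar under this measure.
d1 = [0.81, 0.75, 0.90, 0.62]
d2 = [0.78, 0.70, 0.88, 0.65]
print(spearman(d1, d2))  # -> 1.0
```

As new test results arrive for the target dataset, its performance vector grows and the correlation with each stored dataset can simply be recomputed, which is what allows the recommendations to adapt over time.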