Hierarchical Cost-Sensitive Algorithms for Genome-Wide Gene Function Prediction
Proceedings of the third International Workshop on Machine Learning in Systems Biology, PMLR 8:14-29, 2009.
In this work we propose new ensemble methods for the hierarchical classification of gene functions. Our methods exploit the hierarchical relationships between the classes in different ways: each ensemble node is trained “locally", according to its position in the hierarchy; moreover, in the evaluation phase the set of predicted annotations is built so to minimize a global loss function defined over the hierarchy. We also address the problem of sparsity of annotations by introducing a cost-sensitive parameter that allows to control the precision-recall trade-off. Experiments with the model organism S. cerevisiae, using the FunCat taxonomy and seven biomolecular data sets, reveal a significant advantage of our techniques over “flat” and cost-insensitive hierarchical ensembles.