Hierarchical Cost-Sensitive Algorithms for Genome-Wide Gene Function Prediction

Nicolò Cesa-Bianchi, Giorgio Valentini
Proceedings of the third International Workshop on Machine Learning in Systems Biology, PMLR 8:14-29, 2009.

Abstract

In this work we propose new ensemble methods for the hierarchical classification of gene functions. Our methods exploit the hierarchical relationships between the classes in different ways: each ensemble node is trained “locally”, according to its position in the hierarchy; moreover, in the evaluation phase the set of predicted annotations is built so as to minimize a global loss function defined over the hierarchy. We also address the problem of sparsity of annotations by introducing a cost-sensitive parameter that allows control of the precision-recall trade-off. Experiments with the model organism S. cerevisiae, using the FunCat taxonomy and seven biomolecular data sets, reveal a significant advantage of our techniques over “flat” and cost-insensitive hierarchical ensembles.

Cite this Paper


BibTeX
@InProceedings{pmlr-v8-cesa-bianchi10a,
  title = {Hierarchical Cost-Sensitive Algorithms for Genome-Wide Gene Function Prediction},
  author = {Nicolò Cesa-Bianchi and Giorgio Valentini},
  booktitle = {Proceedings of the third International Workshop on Machine Learning in Systems Biology},
  pages = {14--29},
  year = {2009},
  editor = {Sašo Džeroski and Pierre Geurts and Juho Rousu},
  volume = {8},
  series = {Proceedings of Machine Learning Research},
  address = {Ljubljana, Slovenia},
  month = {05--06 Sep},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v8/cesa-bianchi10a/cesa-bianchi10a.pdf},
  url = {http://proceedings.mlr.press/v8/cesa-bianchi10a.html},
  abstract = {In this work we propose new ensemble methods for the hierarchical classification of gene functions. Our methods exploit the hierarchical relationships between the classes in different ways: each ensemble node is trained ``locally'', according to its position in the hierarchy; moreover, in the evaluation phase the set of predicted annotations is built so as to minimize a global loss function defined over the hierarchy. We also address the problem of sparsity of annotations by introducing a cost-sensitive parameter that allows control of the precision-recall trade-off. Experiments with the model organism \emph{S. cerevisiae}, using the FunCat taxonomy and seven biomolecular data sets, reveal a significant advantage of our techniques over ``flat'' and cost-insensitive hierarchical ensembles.}
}
Endnote
%0 Conference Paper
%T Hierarchical Cost-Sensitive Algorithms for Genome-Wide Gene Function Prediction
%A Nicolò Cesa-Bianchi
%A Giorgio Valentini
%B Proceedings of the third International Workshop on Machine Learning in Systems Biology
%C Proceedings of Machine Learning Research
%D 2009
%E Sašo Džeroski
%E Pierre Geurts
%E Juho Rousu
%F pmlr-v8-cesa-bianchi10a
%I PMLR
%J Proceedings of Machine Learning Research
%P 14--29
%U http://proceedings.mlr.press
%V 8
%W PMLR
%X In this work we propose new ensemble methods for the hierarchical classification of gene functions. Our methods exploit the hierarchical relationships between the classes in different ways: each ensemble node is trained “locally”, according to its position in the hierarchy; moreover, in the evaluation phase the set of predicted annotations is built so as to minimize a global loss function defined over the hierarchy. We also address the problem of sparsity of annotations by introducing a cost-sensitive parameter that allows control of the precision-recall trade-off. Experiments with the model organism S. cerevisiae, using the FunCat taxonomy and seven biomolecular data sets, reveal a significant advantage of our techniques over “flat” and cost-insensitive hierarchical ensembles.
RIS
TY  - CPAPER
TI  - Hierarchical Cost-Sensitive Algorithms for Genome-Wide Gene Function Prediction
AU  - Nicolò Cesa-Bianchi
AU  - Giorgio Valentini
BT  - Proceedings of the third International Workshop on Machine Learning in Systems Biology
PY  - 2009/03/02
DA  - 2009/03/02
ED  - Sašo Džeroski
ED  - Pierre Geurts
ED  - Juho Rousu
ID  - pmlr-v8-cesa-bianchi10a
PB  - PMLR
SP  - 14
DP  - PMLR
EP  - 29
L1  - http://proceedings.mlr.press/v8/cesa-bianchi10a/cesa-bianchi10a.pdf
UR  - http://proceedings.mlr.press/v8/cesa-bianchi10a.html
AB  - In this work we propose new ensemble methods for the hierarchical classification of gene functions. Our methods exploit the hierarchical relationships between the classes in different ways: each ensemble node is trained “locally”, according to its position in the hierarchy; moreover, in the evaluation phase the set of predicted annotations is built so as to minimize a global loss function defined over the hierarchy. We also address the problem of sparsity of annotations by introducing a cost-sensitive parameter that allows control of the precision-recall trade-off. Experiments with the model organism S. cerevisiae, using the FunCat taxonomy and seven biomolecular data sets, reveal a significant advantage of our techniques over “flat” and cost-insensitive hierarchical ensembles.
ER  -
APA
Cesa-Bianchi, N., & Valentini, G. (2009). Hierarchical Cost-Sensitive Algorithms for Genome-Wide Gene Function Prediction. Proceedings of the third International Workshop on Machine Learning in Systems Biology, in PMLR 8:14-29.
