A Subgroup Discovery Approach for Relating Chemical Structure and Phenotype Data in Chemical Genomics


Lan Umek, Petra Kaferle, Mojca Mattiazzi, Aleš Erjavec, Črtomir Gorup, Tomaž Curk, Uroš Petrovič, Blaž Zupan
Proceedings of the third International Workshop on Machine Learning in Systems Biology, PMLR 8:136-144, 2009.


We report on the development of an algorithm that infers relations between chemical structure and biochemical pathways from mutant-based growth fitness characterizations of small molecules. Identifying such relations is important in drug discovery and development: it supports argument-based selection of candidate molecules in target-specific screenings and early exclusion of substances with highly probable undesired side effects. The algorithm combines unsupervised and supervised machine learning techniques and, besides experimental fitness data, uses knowledge on gene subgroups (pathways), structural descriptions of chemicals, and MeSH term-based chemical and pharmacological annotations. We demonstrate the utility of the proposed approach in the analysis of a genome-wide S. cerevisiae chemogenomics assay by Hillenmeyer et al. (Science, 2008).
