GOLF: A Generative AI Framework for Pathogenicity Prediction of Myocilin OLF Variants
Proceedings of the 20th Machine Learning in Computational Biology meeting, PMLR 311:148-161, 2025.
Abstract
Missense mutations in the MYOC gene, particularly those affecting the olfactomedin (OLF) domain of the myocilin protein, can cause open-angle glaucoma, a leading cause of irreversible blindness. However, predicting the pathogenicity of these mutations remains challenging due to the complex effects of toxic gain-of-function variants and the scarcity of labeled clinical data. Here, we present GOLF, a generative AI framework for assessing and explaining the pathogenicity of OLF domain variants. GOLF collects and curates a comprehensive dataset of OLF homologs and trains generative models that evaluate monoallelic missense mutations. While these models exhibit diverse predictive behaviors, they collectively achieve accurate classification of known pathogenic and benign variants. To interpret their decision mechanisms, GOLF uses a sparse autoencoder (SAE) that reveals the underlying biochemical features exploited by the generative models to categorize variant effects. GOLF enables accurate evaluation of disease-causing mutations, supporting early genetic risk stratification for glaucoma and facilitating interpretable investigations into the molecular basis of pathogenic variants.