GOLF: A Generative AI Framework for Pathogenicity Prediction of Myocilin OLF Variants

Thomas Walton, Darin Tsui, Amirali Aghazadeh, Raquel Lieberman, Lauren Fogel, Rafael Chagas, Dustin Huard
Proceedings of the 20th Machine Learning in Computational Biology meeting, PMLR 311:148-161, 2025.

Abstract

Missense mutations in the MYOC gene, particularly those affecting the olfactomedin (OLF) domain of the myocilin protein, can be causal for open-angle glaucoma, a leading cause of irreversible blindness. However, predicting the pathogenicity of these mutations remains challenging due to the complex effects of toxic gain-of-function variants and the scarcity of labeled clinical data. Herein, we present GOLF, a generative AI framework for assessing and explaining the pathogenicity of OLF domain variants. GOLF collects and curates a comprehensive dataset of OLF homologs and trains generative models that evaluate monoallelic missense mutations. While these models exhibit diverse predictive behaviors, they collectively achieve accurate classification of known pathogenic and benign variants. To interpret their decision mechanisms, GOLF uses a sparse autoencoder (SAE) that reveals the underlying biochemical features exploited by the generative models to categorize variant effects. GOLF enables accurate evaluation of disease-causing mutations, supporting early genetic risk stratification for glaucoma and facilitating interpretable investigations into the molecular basis of pathogenic variants.

Cite this Paper


BibTeX
@InProceedings{pmlr-v311-walton25a,
  title = {GOLF: A Generative AI Framework for Pathogenicity Prediction of Myocilin OLF Variants},
  author = {Walton, Thomas and Tsui, Darin and Aghazadeh, Amirali and Lieberman, Raquel and Fogel, Lauren and Chagas, Rafael and Huard, Dustin},
  booktitle = {Proceedings of the 20th Machine Learning in Computational Biology meeting},
  pages = {148--161},
  year = {2025},
  editor = {Knowles, David A and Koo, Peter K},
  volume = {311},
  series = {Proceedings of Machine Learning Research},
  month = {10--11 Sep},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v311/main/assets/walton25a/walton25a.pdf},
  url = {https://proceedings.mlr.press/v311/walton25a.html},
  abstract = {Missense mutations in the MYOC gene, particularly those affecting the olfactomedin (OLF) domain of the myocilin protein, can be causal for open-angle glaucoma, a leading cause of irreversible blindness. However, predicting the pathogenicity of these mutations remains challenging due to the complex effects of toxic gain-of-function variants and the scarcity of labeled clinical data. Herein, we present GOLF, a generative AI framework for assessing and explaining the pathogenicity of OLF domain variants. GOLF collects and curates a comprehensive dataset of OLF homologs and trains generative models that evaluate monoallelic missense mutations. While these models exhibit diverse predictive behaviors, they collectively achieve accurate classification of known pathogenic and benign variants. To interpret their decision mechanisms, GOLF uses a sparse autoencoder (SAE) that reveals the underlying biochemical features exploited by the generative models to categorize variant effects. GOLF enables accurate evaluation of disease-causing mutations, supporting early genetic risk stratification for glaucoma and facilitating interpretable investigations into the molecular basis of pathogenic variants.}
}
Endnote
%0 Conference Paper
%T GOLF: A Generative AI Framework for Pathogenicity Prediction of Myocilin OLF Variants
%A Thomas Walton
%A Darin Tsui
%A Amirali Aghazadeh
%A Raquel Lieberman
%A Lauren Fogel
%A Rafael Chagas
%A Dustin Huard
%B Proceedings of the 20th Machine Learning in Computational Biology meeting
%C Proceedings of Machine Learning Research
%D 2025
%E David A Knowles
%E Peter K Koo
%F pmlr-v311-walton25a
%I PMLR
%P 148--161
%U https://proceedings.mlr.press/v311/walton25a.html
%V 311
%X Missense mutations in the MYOC gene, particularly those affecting the olfactomedin (OLF) domain of the myocilin protein, can be causal for open-angle glaucoma, a leading cause of irreversible blindness. However, predicting the pathogenicity of these mutations remains challenging due to the complex effects of toxic gain-of-function variants and the scarcity of labeled clinical data. Herein, we present GOLF, a generative AI framework for assessing and explaining the pathogenicity of OLF domain variants. GOLF collects and curates a comprehensive dataset of OLF homologs and trains generative models that evaluate monoallelic missense mutations. While these models exhibit diverse predictive behaviors, they collectively achieve accurate classification of known pathogenic and benign variants. To interpret their decision mechanisms, GOLF uses a sparse autoencoder (SAE) that reveals the underlying biochemical features exploited by the generative models to categorize variant effects. GOLF enables accurate evaluation of disease-causing mutations, supporting early genetic risk stratification for glaucoma and facilitating interpretable investigations into the molecular basis of pathogenic variants.
APA
Walton, T., Tsui, D., Aghazadeh, A., Lieberman, R., Fogel, L., Chagas, R. & Huard, D. (2025). GOLF: A Generative AI Framework for Pathogenicity Prediction of Myocilin OLF Variants. Proceedings of the 20th Machine Learning in Computational Biology meeting, in Proceedings of Machine Learning Research 311:148-161. Available from https://proceedings.mlr.press/v311/walton25a.html.
