AIRIVA: A Deep Generative Model of Adaptive Immune Repertoires

Melanie F. Pradier, Niranjani Prasad, Paidamoyo Chapfuwa, Sahra Ghalebikesabi, Maximilian Ilse, Steven Woodhouse, Rebecca Elyanow, Javier Zazo, Javier Gonzalez Hernandez, Julia Greissl, Edward Meeds
Proceedings of the 8th Machine Learning for Healthcare Conference, PMLR 219:588-611, 2023.

Abstract

Recent advances in immunomics have shown that T-cell receptor (TCR) signatures can accurately predict active or recent infection by leveraging the high specificity of TCR binding to disease antigens. However, the extreme diversity of the adaptive immune repertoire presents challenges in reliably identifying disease-specific TCRs. Population genetics and sequencing depth can also have strong systematic effects on repertoires, which requires careful consideration when developing diagnostic models. We present an Adaptive Immune Repertoire-Invariant Variational Autoencoder (AIRIVA), a generative model that learns a low-dimensional, interpretable, and compositional representation of TCR repertoires to disentangle such systematic effects in repertoires. We apply AIRIVA to two infectious disease case-studies: COVID-19 (natural infection and vaccination) and the Herpes Simplex Virus (HSV-1 and HSV-2), and empirically show that we can disentangle the individual disease signals. We further demonstrate AIRIVA’s capability to: learn from unlabelled samples; generate in-silico TCR repertoires by intervening on the latent factors; and identify disease-associated TCRs validated using TCR annotations from external assay data.

Cite this Paper


BibTeX
@InProceedings{pmlr-v219-pradier23a,
  title = {AIRIVA: A Deep Generative Model of Adaptive Immune Repertoires},
  author = {Pradier, Melanie F. and Prasad, Niranjani and Chapfuwa, Paidamoyo and Ghalebikesabi, Sahra and Ilse, Maximilian and Woodhouse, Steven and Elyanow, Rebecca and Zazo, Javier and Hernandez, Javier Gonzalez and Greissl, Julia and Meeds, Edward},
  booktitle = {Proceedings of the 8th Machine Learning for Healthcare Conference},
  pages = {588--611},
  year = {2023},
  editor = {Deshpande, Kaivalya and Fiterau, Madalina and Joshi, Shalmali and Lipton, Zachary and Ranganath, Rajesh and Urteaga, Iñigo and Yeung, Serene},
  volume = {219},
  series = {Proceedings of Machine Learning Research},
  month = {11--12 Aug},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v219/pradier23a/pradier23a.pdf},
  url = {https://proceedings.mlr.press/v219/pradier23a.html},
  abstract = {Recent advances in immunomics have shown that T-cell receptor (TCR) signatures can accurately predict active or recent infection by leveraging the high specificity of TCR binding to disease antigens. However, the extreme diversity of the adaptive immune repertoire presents challenges in reliably identifying disease-specific TCRs. Population genetics and sequencing depth can also have strong systematic effects on repertoires, which requires careful consideration when developing diagnostic models. We present an Adaptive Immune Repertoire-Invariant Variational Autoencoder (AIRIVA), a generative model that learns a low-dimensional, interpretable, and compositional representation of TCR repertoires to disentangle such systematic effects in repertoires. We apply AIRIVA to two infectious disease case-studies: COVID-19 (natural infection and vaccination) and the Herpes Simplex Virus (HSV-1 and HSV-2), and empirically show that we can disentangle the individual disease signals. We further demonstrate AIRIVA’s capability to: learn from unlabelled samples; generate in-silico TCR repertoires by intervening on the latent factors; and identify disease-associated TCRs validated using TCR annotations from external assay data.}
}
Endnote
%0 Conference Paper
%T AIRIVA: A Deep Generative Model of Adaptive Immune Repertoires
%A Melanie F. Pradier
%A Niranjani Prasad
%A Paidamoyo Chapfuwa
%A Sahra Ghalebikesabi
%A Maximilian Ilse
%A Steven Woodhouse
%A Rebecca Elyanow
%A Javier Zazo
%A Javier Gonzalez Hernandez
%A Julia Greissl
%A Edward Meeds
%B Proceedings of the 8th Machine Learning for Healthcare Conference
%C Proceedings of Machine Learning Research
%D 2023
%E Kaivalya Deshpande
%E Madalina Fiterau
%E Shalmali Joshi
%E Zachary Lipton
%E Rajesh Ranganath
%E Iñigo Urteaga
%E Serene Yeung
%F pmlr-v219-pradier23a
%I PMLR
%P 588--611
%U https://proceedings.mlr.press/v219/pradier23a.html
%V 219
%X Recent advances in immunomics have shown that T-cell receptor (TCR) signatures can accurately predict active or recent infection by leveraging the high specificity of TCR binding to disease antigens. However, the extreme diversity of the adaptive immune repertoire presents challenges in reliably identifying disease-specific TCRs. Population genetics and sequencing depth can also have strong systematic effects on repertoires, which requires careful consideration when developing diagnostic models. We present an Adaptive Immune Repertoire-Invariant Variational Autoencoder (AIRIVA), a generative model that learns a low-dimensional, interpretable, and compositional representation of TCR repertoires to disentangle such systematic effects in repertoires. We apply AIRIVA to two infectious disease case-studies: COVID-19 (natural infection and vaccination) and the Herpes Simplex Virus (HSV-1 and HSV-2), and empirically show that we can disentangle the individual disease signals. We further demonstrate AIRIVA’s capability to: learn from unlabelled samples; generate in-silico TCR repertoires by intervening on the latent factors; and identify disease-associated TCRs validated using TCR annotations from external assay data.
APA
Pradier, M.F., Prasad, N., Chapfuwa, P., Ghalebikesabi, S., Ilse, M., Woodhouse, S., Elyanow, R., Zazo, J., Hernandez, J.G., Greissl, J. & Meeds, E. (2023). AIRIVA: A Deep Generative Model of Adaptive Immune Repertoires. Proceedings of the 8th Machine Learning for Healthcare Conference, in Proceedings of Machine Learning Research 219:588-611. Available from https://proceedings.mlr.press/v219/pradier23a.html.
