Bayesian Event-Based Model for Disease Subtype and Stage Inference

Hongtao Hao, Joseph L. Austerweil
Proceedings of the Fifth Machine Learning for Health Symposium, PMLR 297:88-119, 2026.

Abstract

Chronic diseases often progress differently across patients. This variation is typically not random: a disease tends to follow one of a small number of progression subtypes. To capture this structured heterogeneity, the Subtype and Stage Inference Event-Based Model (SuStaIn) estimates the number of subtypes and the order of disease progression for each subtype, and assigns each patient to a subtype, from primarily cross-sectional data. It has been widely applied to uncover the subtypes of many diseases and inform our understanding of them. But how robust is its performance? In this paper, we develop a principled Bayesian subtype variant of the event-based model (BebmS) and compare its performance to SuStaIn in a variety of synthetic data experiments with varied levels of model misspecification. BebmS substantially outperforms SuStaIn across ordering, staging, and subtype assignment tasks. Further, we apply BebmS and SuStaIn to a real-world Alzheimer’s data set. We find that the BebmS results are more consistent with the scientific consensus on Alzheimer’s disease progression than those of SuStaIn.

Cite this Paper


BibTeX
@InProceedings{pmlr-v297-hao26a,
  title = {Bayesian Event-Based Model for Disease Subtype and Stage Inference},
  author = {Hao, Hongtao and Austerweil, Joseph L.},
  booktitle = {Proceedings of the Fifth Machine Learning for Health Symposium},
  pages = {88--119},
  year = {2026},
  editor = {Argaw, Peniel and Zhang, Haoran and Jabbour, Sarah and Chandak, Payal and Ji, Jerry and Mukherjee, Sumit and Salaudeen, Olawale and Chang, Trenton and Healey, Elizabeth and Gröger, Fabian and Adibi, Amin and Hegselmann, Stefan and Wild, Benjamin and Noori, Ayush},
  volume = {297},
  series = {Proceedings of Machine Learning Research},
  month = {13--14 Dec},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v297/main/assets/hao26a/hao26a.pdf},
  url = {https://proceedings.mlr.press/v297/hao26a.html},
  abstract = {Chronic diseases often progress differently across patients. Rather than randomly varying, there are typically a small number of subtypes for how a disease progresses across patients. To capture this structured heterogeneity, the Subtype and Stage Inference Event-Based Model (SuStaIn) estimates the number of subtypes, the order of disease progression for each subtype, and assigns each patient to a subtype from primarily cross-sectional data. It has been widely applied to uncover the subtypes of many diseases and inform our understanding of them. But how robust is its performance? In this paper, we develop a principled Bayesian subtype variant of the event-based model (bebms) and compare its performance to SuStaIn in a variety of synthetic data experiments with varied levels of model misspecification. BebmS substantially outperforms SuStaIn across ordering, staging, and subtype assignment tasks. Further, we apply bebms and SuStaIn to a real-world Alzheimer’s data set. We find BebmS has results that are more consistent with the scientific consensus of Alzheimer’s disease progression than SuStaIn.}
}
Endnote
%0 Conference Paper
%T Bayesian Event-Based Model for Disease Subtype and Stage Inference
%A Hongtao Hao
%A Joseph L. Austerweil
%B Proceedings of the Fifth Machine Learning for Health Symposium
%C Proceedings of Machine Learning Research
%D 2026
%E Peniel Argaw
%E Haoran Zhang
%E Sarah Jabbour
%E Payal Chandak
%E Jerry Ji
%E Sumit Mukherjee
%E Olawale Salaudeen
%E Trenton Chang
%E Elizabeth Healey
%E Fabian Gröger
%E Amin Adibi
%E Stefan Hegselmann
%E Benjamin Wild
%E Ayush Noori
%F pmlr-v297-hao26a
%I PMLR
%P 88--119
%U https://proceedings.mlr.press/v297/hao26a.html
%V 297
%X Chronic diseases often progress differently across patients. Rather than randomly varying, there are typically a small number of subtypes for how a disease progresses across patients. To capture this structured heterogeneity, the Subtype and Stage Inference Event-Based Model (SuStaIn) estimates the number of subtypes, the order of disease progression for each subtype, and assigns each patient to a subtype from primarily cross-sectional data. It has been widely applied to uncover the subtypes of many diseases and inform our understanding of them. But how robust is its performance? In this paper, we develop a principled Bayesian subtype variant of the event-based model (bebms) and compare its performance to SuStaIn in a variety of synthetic data experiments with varied levels of model misspecification. BebmS substantially outperforms SuStaIn across ordering, staging, and subtype assignment tasks. Further, we apply bebms and SuStaIn to a real-world Alzheimer’s data set. We find BebmS has results that are more consistent with the scientific consensus of Alzheimer’s disease progression than SuStaIn.
APA
Hao, H. & Austerweil, J. L. (2026). Bayesian Event-Based Model for Disease Subtype and Stage Inference. Proceedings of the Fifth Machine Learning for Health Symposium, in Proceedings of Machine Learning Research 297:88-119. Available from https://proceedings.mlr.press/v297/hao26a.html.
