Let the Experts Speak: Improving Survival Prediction & Calibration via Mixture-of-Experts Heads

Todd Morrill, Aahlad Puli, Murad Megjhani, Soojin Park, Richard Zemel
Proceedings of the Fifth Machine Learning for Health Symposium, PMLR 297:697-720, 2026.

Abstract

Deep mixture-of-experts models have attracted a lot of attention for survival analysis problems, particularly for their ability to cluster similar patients together. In practice, grouping often comes at the expense of key metrics such as calibration error and predictive accuracy. This is due to the restrictive inductive bias that mixture-of-experts imposes, that predictions for individual patients must look like predictions for the group they are assigned to. Might we be able to discover patient group structure, where it exists, while improving calibration and predictive accuracy? In this work, we introduce several discrete-time deep mixture-of-experts ({MoE}) based architectures for survival analysis problems, one of which achieves all desiderata: clustering, calibration, and predictive accuracy. We show that a key differentiator between this array of {MoE}s is how expressive their experts are. We find that more expressive experts that tailor predictions per patient outperform experts that rely on fixed group prototypes.
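To make the abstract's contrast concrete, below is a minimal NumPy sketch of a discrete-time mixture-of-experts survival head. This is not the paper's implementation: the dimensions, random weights, and names such as `W_gate` and `proto_logits` are purely illustrative. It contrasts fixed-prototype experts (one hazard curve per group, shared by every patient assigned to that group) with more expressive experts that condition their hazard predictions on each patient's features.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

n_patients, n_features, n_experts, n_bins = 4, 8, 3, 5
X = rng.normal(size=(n_patients, n_features))

# Gating network: soft assignment of each patient to an expert (cluster).
W_gate = rng.normal(size=(n_features, n_experts))
gate = softmax(X @ W_gate, axis=1)                  # (patients, experts)

# Prototype experts: each expert holds one fixed, input-independent hazard
# curve, so every patient's prediction is a blend of group prototypes.
proto_logits = rng.normal(size=(n_experts, n_bins))
proto_hazard = sigmoid(proto_logits)                # (experts, bins)
haz_proto = gate @ proto_hazard                     # (patients, bins)

# Expressive experts: each expert maps the patient's own features to a
# per-patient hazard curve, tailoring predictions within each group.
W_exp = rng.normal(size=(n_experts, n_features, n_bins))
expert_hazard = sigmoid(np.einsum("pf,efb->peb", X, W_exp))
haz_expr = np.einsum("pe,peb->pb", gate, expert_hazard)

# Discrete-time survival: S(t) = prod_{s <= t} (1 - hazard_s).
surv_proto = np.cumprod(1.0 - haz_proto, axis=1)
surv_expr = np.cumprod(1.0 - haz_expr, axis=1)
```

Under this framing, the prototype head can only interpolate between a handful of group-level curves, while the expressive head produces a distinct curve per patient yet still yields soft cluster assignments via the gate.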

Cite this Paper


BibTeX
@InProceedings{pmlr-v297-morrill26a,
  title     = {Let the Experts Speak: Improving Survival Prediction \& Calibration via Mixture-of-Experts Heads},
  author    = {Morrill, Todd and Puli, Aahlad and Megjhani, Murad and Park, Soojin and Zemel, Richard},
  booktitle = {Proceedings of the Fifth Machine Learning for Health Symposium},
  pages     = {697--720},
  year      = {2026},
  editor    = {Argaw, Peniel and Zhang, Haoran and Jabbour, Sarah and Chandak, Payal and Ji, Jerry and Mukherjee, Sumit and Salaudeen, Olawale and Chang, Trenton and Healey, Elizabeth and Gröger, Fabian and Adibi, Amin and Hegselmann, Stefan and Wild, Benjamin and Noori, Ayush},
  volume    = {297},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--14 Dec},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v297/main/assets/morrill26a/morrill26a.pdf},
  url       = {https://proceedings.mlr.press/v297/morrill26a.html},
  abstract  = {Deep mixture-of-experts models have attracted a lot of attention for survival analysis problems, particularly for their ability to cluster similar patients together. In practice, grouping often comes at the expense of key metrics such as calibration error and predictive accuracy. This is due to the restrictive inductive bias that mixture-of-experts imposes, that predictions for individual patients must look like predictions for the group they are assigned to. Might we be able to discover patient group structure, where it exists, while improving calibration and predictive accuracy? In this work, we introduce several discrete-time deep mixture-of-experts ({MoE}) based architectures for survival analysis problems, one of which achieves all desiderata: clustering, calibration, and predictive accuracy. We show that a key differentiator between this array of {MoE}s is how expressive their experts are. We find that more expressive experts that tailor predictions per patient outperform experts that rely on fixed group prototypes.}
}
Endnote
%0 Conference Paper
%T Let the Experts Speak: Improving Survival Prediction & Calibration via Mixture-of-Experts Heads
%A Todd Morrill
%A Aahlad Puli
%A Murad Megjhani
%A Soojin Park
%A Richard Zemel
%B Proceedings of the Fifth Machine Learning for Health Symposium
%C Proceedings of Machine Learning Research
%D 2026
%E Peniel Argaw
%E Haoran Zhang
%E Sarah Jabbour
%E Payal Chandak
%E Jerry Ji
%E Sumit Mukherjee
%E Olawale Salaudeen
%E Trenton Chang
%E Elizabeth Healey
%E Fabian Gröger
%E Amin Adibi
%E Stefan Hegselmann
%E Benjamin Wild
%E Ayush Noori
%F pmlr-v297-morrill26a
%I PMLR
%P 697--720
%U https://proceedings.mlr.press/v297/morrill26a.html
%V 297
%X Deep mixture-of-experts models have attracted a lot of attention for survival analysis problems, particularly for their ability to cluster similar patients together. In practice, grouping often comes at the expense of key metrics such as calibration error and predictive accuracy. This is due to the restrictive inductive bias that mixture-of-experts imposes, that predictions for individual patients must look like predictions for the group they are assigned to. Might we be able to discover patient group structure, where it exists, while improving calibration and predictive accuracy? In this work, we introduce several discrete-time deep mixture-of-experts ({MoE}) based architectures for survival analysis problems, one of which achieves all desiderata: clustering, calibration, and predictive accuracy. We show that a key differentiator between this array of {MoE}s is how expressive their experts are. We find that more expressive experts that tailor predictions per patient outperform experts that rely on fixed group prototypes.
APA
Morrill, T., Puli, A., Megjhani, M., Park, S. & Zemel, R. (2026). Let the Experts Speak: Improving Survival Prediction & Calibration via Mixture-of-Experts Heads. Proceedings of the Fifth Machine Learning for Health Symposium, in Proceedings of Machine Learning Research 297:697-720. Available from https://proceedings.mlr.press/v297/morrill26a.html.