Bridging the utility gap between MALDI-TOF and WGS for affordable outbreak cluster detection

Chang Liu, Jieshi Chen, Lee H Harrison, Artur Dubrawski
Proceedings of the sixth Conference on Health, Inference, and Learning, PMLR 287:557-572, 2025.

Abstract

Rapid and accurate detection of emerging outbreak clusters can help contain the spread of diseases with epidemic potential. Among the available pathogen matching methods that can be used to support the task, whole genome sequencing (WGS) offers the highest discriminatory power but is expensive and time-consuming. On the other hand, Matrix-Assisted Laser Desorption Ionization–Time of Flight (MALDI-TOF) mass spectrometry is gaining attention for being a rapid and cost-effective, albeit less precise, alternative. In order to combine the strengths of both MALDI-TOF and WGS, we present MSMAP, the first machine learning framework that establishes a mapping between MALDI-TOF mass spectra and the single nucleotide polymorphism (SNP) distances obtained from WGS analysis. We demonstrate the effectiveness of MSMAP in retrieving WGS-defined outbreak clusters on synthetic mass spectrum data and on proprietary data with paired MALDI-TOF and SNP information. The results show that MSMAP augments MALDI-TOF with the discriminatory power of WGS, thus bridging their utility gap and paving the way toward fast, accurate and cost-effective outbreak cluster detection.

Cite this Paper


BibTeX
@InProceedings{pmlr-v287-liu25a,
  title = {Bridging the utility gap between MALDI-TOF and WGS for affordable outbreak cluster detection},
  author = {Liu, Chang and Chen, Jieshi and Harrison, Lee H and Dubrawski, Artur},
  booktitle = {Proceedings of the sixth Conference on Health, Inference, and Learning},
  pages = {557--572},
  year = {2025},
  editor = {Xu, Xuhai Orson and Choi, Edward and Singhal, Pankhuri and Gerych, Walter and Tang, Shengpu and Agrawal, Monica and Subbaswamy, Adarsh and Sizikova, Elena and Dunn, Jessilyn and Daneshjou, Roxana and Sarker, Tasmie and McDermott, Matthew and Chen, Irene},
  volume = {287},
  series = {Proceedings of Machine Learning Research},
  month = {25--27 Jun},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v287/main/assets/liu25a/liu25a.pdf},
  url = {https://proceedings.mlr.press/v287/liu25a.html},
  abstract = {Rapid and accurate detection of emerging outbreak clusters can help contain the spread of diseases with epidemic potential. Among the available pathogen matching methods that can be used to support the task, whole genome sequencing (WGS) offers the highest discriminatory power but is expensive and time-consuming. On the other hand, Matrix-Assisted Laser Desorption Ionization–Time of Flight (MALDI-TOF) mass spectrometry is gaining attention for being a rapid and cost-effective, albeit less precise, alternative. In order to combine the strengths of both MALDI-TOF and WGS, we present MSMAP, the first machine learning framework that establishes a mapping between MALDI-TOF mass spectra and the single nucleotide polymorphism (SNP) distances obtained from WGS analysis. We demonstrate the effectiveness of MSMAP in retrieving WGS-defined outbreak clusters on synthetic mass spectrum data and on proprietary data with paired MALDI-TOF and SNP information. The results show that MSMAP augments MALDI-TOF with the discriminatory power of WGS, thus bridging their utility gap and paving the way toward fast, accurate and cost-effective outbreak cluster detection.}
}
Endnote
%0 Conference Paper
%T Bridging the utility gap between MALDI-TOF and WGS for affordable outbreak cluster detection
%A Chang Liu
%A Jieshi Chen
%A Lee H Harrison
%A Artur Dubrawski
%B Proceedings of the sixth Conference on Health, Inference, and Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Xuhai Orson Xu
%E Edward Choi
%E Pankhuri Singhal
%E Walter Gerych
%E Shengpu Tang
%E Monica Agrawal
%E Adarsh Subbaswamy
%E Elena Sizikova
%E Jessilyn Dunn
%E Roxana Daneshjou
%E Tasmie Sarker
%E Matthew McDermott
%E Irene Chen
%F pmlr-v287-liu25a
%I PMLR
%P 557--572
%U https://proceedings.mlr.press/v287/liu25a.html
%V 287
%X Rapid and accurate detection of emerging outbreak clusters can help contain the spread of diseases with epidemic potential. Among the available pathogen matching methods that can be used to support the task, whole genome sequencing (WGS) offers the highest discriminatory power but is expensive and time-consuming. On the other hand, Matrix-Assisted Laser Desorption Ionization–Time of Flight (MALDI-TOF) mass spectrometry is gaining attention for being a rapid and cost-effective, albeit less precise, alternative. In order to combine the strengths of both MALDI-TOF and WGS, we present MSMAP, the first machine learning framework that establishes a mapping between MALDI-TOF mass spectra and the single nucleotide polymorphism (SNP) distances obtained from WGS analysis. We demonstrate the effectiveness of MSMAP in retrieving WGS-defined outbreak clusters on synthetic mass spectrum data and on proprietary data with paired MALDI-TOF and SNP information. The results show that MSMAP augments MALDI-TOF with the discriminatory power of WGS, thus bridging their utility gap and paving the way toward fast, accurate and cost-effective outbreak cluster detection.
APA
Liu, C., Chen, J., Harrison, L.H. & Dubrawski, A. (2025). Bridging the utility gap between MALDI-TOF and WGS for affordable outbreak cluster detection. Proceedings of the sixth Conference on Health, Inference, and Learning, in Proceedings of Machine Learning Research 287:557-572. Available from https://proceedings.mlr.press/v287/liu25a.html.
