Analyzing Tandem Mass Spectra: A Graphical Models Perspective
Proceedings of The 3rd International Workshop on Advanced Methodologies for Bayesian Networks, PMLR 73:6-6, 2017.
In the past two decades, the field of proteomics has seen explosive growth, largely due to the development of tandem mass spectrometry (MS/MS). With a complex biological sample as input, a typical MS/MS experiment quickly produces a large collection of spectra (often numbering in the hundreds of thousands) representative of the proteins present in the original sample. Most widely used methods for searching and identifying MS/MS spectra use scoring functions that rely on static, hand-selected parameters rather than learning parameters and adapting to the widely varying characteristics of MS/MS data. In this talk, we discuss recent work utilizing dynamic Bayesian networks (DBNs) to identify MS/MS spectra. In particular, we discuss a recently proposed DBN for Rapid Identification of Peptides (DRIP) which, in contrast to popular scoring functions, allows efficient generative and discriminative learning of parameters to achieve state-of-the-art spectrum-identification accuracy. Furthermore, facilitated by DRIP’s generative nature, we present current innovations leveraging DBNs to significantly enhance many other aspects of MS/MS analysis, such as improving downstream discriminative classification via detailed feature extraction and speeding up identification runtime using trellises and approximate inference.