Understanding Protein Dynamics with L1-Regularized Reversible Hidden Markov Models

Robert McGibbon; Bharath Ramsundar; Mohammad Sultan; Gert Kiss; Vijay Pande

Proceedings of the 31st International Conference on Machine Learning, PMLR 32(2):1197-1205, 2014.

Abstract

We present a machine learning framework for modeling protein dynamics. Our approach uses L1-regularized, reversible hidden Markov models to understand large protein datasets generated via molecular dynamics simulations. Our model is motivated by three design principles: (1) the requirement of massive scalability; (2) the need to adhere to relevant physical law; and (3) the necessity of providing accessible interpretations, critical for rational protein engineering and drug design. We present an EM algorithm for learning and introduce a model selection criterion based on the physical notion of relaxation timescales. We contrast our model with standard methods in biophysics and demonstrate improved robustness. We implement our algorithm on GPUs and apply the method to two large protein simulation datasets generated respectively on the NCSA Bluewaters supercomputer and the Folding@Home distributed computing network. Our analysis identifies the conformational dynamics of the ubiquitin protein responsible for signaling, and elucidates the stepwise activation mechanism of the c-Src kinase protein.

Cite this Paper

BibTeX


@InProceedings{pmlr-v32-mcgibbon14,
  title = 	 {Understanding Protein Dynamics with L1-Regularized Reversible Hidden Markov Models},
  author = 	 {McGibbon, Robert and Ramsundar, Bharath and Sultan, Mohammad and Kiss, Gert and Pande, Vijay},
  booktitle = 	 {Proceedings of the 31st International Conference on Machine Learning},
  pages = 	 {1197--1205},
  year = 	 {2014},
  editor = 	 {Xing, Eric P. and Jebara, Tony},
  volume = 	 {32},
  number =       {2},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {Beijing, China},
  month = 	 {22--24 Jun},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v32/mcgibbon14.pdf},
  url = 	 {https://proceedings.mlr.press/v32/mcgibbon14.html},
  abstract = 	 {We present a machine learning framework for modeling protein dynamics. Our approach uses L1-regularized, reversible hidden Markov models to understand large protein datasets generated via molecular dynamics simulations. Our model is motivated by three design principles: (1) the requirement of massive scalability; (2) the need to adhere to relevant physical law; and (3) the necessity of providing accessible interpretations, critical for rational protein engineering and drug design. We present an EM algorithm for learning and introduce a model selection criterion based on the physical notion of relaxation timescales. We contrast our model with standard methods in biophysics and demonstrate improved robustness. We implement our algorithm on GPUs and apply the method to two large protein simulation datasets generated respectively on the NCSA Bluewaters supercomputer and the Folding@Home distributed computing network. Our analysis identifies the conformational dynamics of the ubiquitin protein responsible for signaling, and elucidates the stepwise activation mechanism of the c-Src kinase protein.}
}

Endnote

%0 Conference Paper
%T Understanding Protein Dynamics with L1-Regularized Reversible Hidden Markov Models
%A Robert McGibbon
%A Bharath Ramsundar
%A Mohammad Sultan
%A Gert Kiss
%A Vijay Pande
%B Proceedings of the 31st International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2014
%E Eric P. Xing
%E Tony Jebara
%F pmlr-v32-mcgibbon14
%I PMLR
%P 1197--1205
%U https://proceedings.mlr.press/v32/mcgibbon14.html
%V 32
%N 2
%X We present a machine learning framework for modeling protein dynamics. Our approach uses L1-regularized, reversible hidden Markov models to understand large protein datasets generated via molecular dynamics simulations. Our model is motivated by three design principles: (1) the requirement of massive scalability; (2) the need to adhere to relevant physical law; and (3) the necessity of providing accessible interpretations, critical for rational protein engineering and drug design. We present an EM algorithm for learning and introduce a model selection criterion based on the physical notion of relaxation timescales. We contrast our model with standard methods in biophysics and demonstrate improved robustness. We implement our algorithm on GPUs and apply the method to two large protein simulation datasets generated respectively on the NCSA Bluewaters supercomputer and the Folding@Home distributed computing network. Our analysis identifies the conformational dynamics of the ubiquitin protein responsible for signaling, and elucidates the stepwise activation mechanism of the c-Src kinase protein.

RIS


TY  - CPAPER
TI  - Understanding Protein Dynamics with L1-Regularized Reversible Hidden Markov Models
AU  - Robert McGibbon
AU  - Bharath Ramsundar
AU  - Mohammad Sultan
AU  - Gert Kiss
AU  - Vijay Pande
BT  - Proceedings of the 31st International Conference on Machine Learning
DA  - 2014/06/18
ED  - Eric P. Xing
ED  - Tony Jebara
ID  - pmlr-v32-mcgibbon14
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 32
IS  - 2
SP  - 1197
EP  - 1205
L1  - http://proceedings.mlr.press/v32/mcgibbon14.pdf
UR  - https://proceedings.mlr.press/v32/mcgibbon14.html
AB  - We present a machine learning framework for modeling protein dynamics. Our approach uses L1-regularized, reversible hidden Markov models to understand large protein datasets generated via molecular dynamics simulations. Our model is motivated by three design principles: (1) the requirement of massive scalability; (2) the need to adhere to relevant physical law; and (3) the necessity of providing accessible interpretations, critical for rational protein engineering and drug design. We present an EM algorithm for learning and introduce a model selection criterion based on the physical notion of relaxation timescales. We contrast our model with standard methods in biophysics and demonstrate improved robustness. We implement our algorithm on GPUs and apply the method to two large protein simulation datasets generated respectively on the NCSA Bluewaters supercomputer and the Folding@Home distributed computing network. Our analysis identifies the conformational dynamics of the ubiquitin protein responsible for signaling, and elucidates the stepwise activation mechanism of the c-Src kinase protein.
ER  -

APA


McGibbon, R., Ramsundar, B., Sultan, M., Kiss, G. & Pande, V. (2014). Understanding Protein Dynamics with L1-Regularized Reversible Hidden Markov Models. Proceedings of the 31st International Conference on Machine Learning, in Proceedings of Machine Learning Research 32(2):1197-1205. Available from https://proceedings.mlr.press/v32/mcgibbon14.html.
