Graph learning for capturing long-range dependencies in protein structures
Proceedings of the 19th Machine Learning in Computational Biology meeting, PMLR 261:117-128, 2024.
Abstract
Polyamides, or peptides and proteins, are biomolecules that span a broad spectrum of size, structure, and function. Both structure and function are determined by the underlying sequence of amino acids, which causes the polyamide to adopt three-dimensional conformations in solution. Despite significant advances in function and conformation prediction, there remains a critical need for computational methods that accurately infer protein function from sequence and structure. Recent advances in deep learning, particularly Graph Neural Networks (GNNs), have shown promise in learning from the sequence and structure of proteins. However, GNNs fail to capture essential long-range dependencies inherent in the complex and dynamic three-dimensional structures of proteins, a limitation associated with oversquashing and oversmoothing. Here, we explore solutions to the challenge of capturing long-range dependencies in graph representations of polyamides, focusing on latent nodes and graph rewiring techniques. Graph rewiring enhances information flow between distant nodes, while latent nodes concentrate global information. In addition, we investigate the effectiveness of ChebNet, a spectral backbone, in capturing long-range dependencies. Our unified framework combines these approaches to address the limitations of current methods, offering insights into protein function and regulation. Through experimental analysis, we demonstrate the efficacy of our proposed methods in capturing long-range dependencies.
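To make the long-range problem concrete, the following is a minimal sketch, not the authors' implementation, of how a latent node and a simple rewiring rule shorten paths in a toy graph. It assumes a 10-residue chain as a stand-in for a protein graph. The helper names (`chain_graph`, `add_latent_node`, `rewire_two_hop`, `hops`) and the two-hop rewiring rule are hypothetical choices made for illustration only. In a standard message-passing network, the hop distance between two nodes is a lower bound on the number of layers needed for information to travel between them.

```python
from collections import deque


def chain_graph(n):
    """Adjacency sets for the path 0-1-...-(n-1), a toy residue backbone."""
    adj = {i: set() for i in range(n)}
    for i in range(n - 1):
        adj[i].add(i + 1)
        adj[i + 1].add(i)
    return adj


def add_latent_node(adj):
    """Return a copy with one extra node linked to every existing node."""
    new = {u: set(vs) for u, vs in adj.items()}
    latent = max(new) + 1
    new[latent] = set(adj)  # latent node sees all original nodes
    for u in adj:
        new[u].add(latent)
    return new, latent


def rewire_two_hop(adj):
    """Return a copy with an added edge between nodes two hops apart.

    This is an illustrative rewiring rule, assumed here for simplicity.
    """
    new = {u: set(vs) for u, vs in adj.items()}
    for u, vs in adj.items():
        for v in vs:
            for w in adj[v]:
                if w != u:
                    new[u].add(w)
                    new[w].add(u)
    return new


def hops(adj, src, dst):
    """Shortest-path length from src to dst by breadth-first search."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        if u == dst:
            return dist[u]
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return None  # dst is unreachable from src


graph = chain_graph(10)
with_latent, latent_id = add_latent_node(graph)

print(hops(graph, 0, 9))                  # 9
print(hops(rewire_two_hop(graph), 0, 9))  # 5
print(hops(with_latent, 0, 9))            # 2
```

In this toy case, rewiring roughly halves the distance between the chain ends, while the latent node puts every pair of residues within two hops at the cost of routing all global information through a single node.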