TCR-ECHO: Evolutionary Cross-attention with Physicochemical Bias for Hierarchical TCR-Peptide Binding Prediction
Proceedings of the 20th Machine Learning in Computational Biology meeting, PMLR 311:12-27, 2025.
Abstract
Accurate prediction of T-cell receptor (TCR)-peptide binding specificity remains challenging due to vast immune receptor diversity and complex molecular recognition principles. We present a deep learning framework that integrates evolutionary protein representations with physicochemical binding principles for TCR-epitope interaction prediction. Our approach employs separate ESM2 encoders for TCR CDR3beta and peptide sequences, capturing evolutionary patterns learned from millions of proteins. The architecture introduces two key innovations: physicochemically informed cross-attention, which incorporates Atchley factor biases to model molecular complementarity, and hierarchical contrastive learning, which operates at the residue and interaction levels to structure binding-specific representations. The bidirectional cross-attention mechanism models mutual recognition between binding partners, while Atchley factor integration provides the physicochemical context (hydrophobicity, polarity, and structural propensities) that governs biochemical interactions beyond sequence patterns. Hierarchical contrastive learning progressively refines representations from sequence patterns to interaction compatibility, creating peptide-specific clusters in TCR embedding space. Comprehensive evaluation demonstrates superior performance across multiple scenarios. Systematic ablation studies confirm each component’s importance, particularly for generalization to novel peptides. The learned representations are biologically interpretable: TCRs that bind the same peptide cluster together in embedding space. By integrating evolutionary information with molecular binding principles, both of which are informative for understanding TCR-peptide binding, this framework lays a foundation for computational immunology tools.
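The bidirectional, physicochemically biased cross-attention described in the abstract can be illustrated with a minimal NumPy sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the bias is modeled as a learned bilinear form over per-residue five-dimensional factor vectors and added to the attention logits, the embeddings and factor vectors are random placeholders rather than ESM2 outputs or real Atchley factor values, and the names `biased_cross_attention` and `Wb` are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def biased_cross_attention(q_emb, kv_emb, q_phys, kv_phys, Wq, Wk, Wv, Wb):
    """One direction of single-head cross-attention with an additive bias.

    Assumed bias form (not from the paper): q_phys @ Wb @ kv_phys.T, a
    bilinear score between per-residue physicochemical factor vectors.
    """
    Q = q_emb @ Wq                      # (Lq, d_head)
    K = kv_emb @ Wk                     # (Lk, d_head)
    V = kv_emb @ Wv                     # (Lk, d_head)
    logits = Q @ K.T / np.sqrt(Q.shape[-1])
    bias = q_phys @ Wb @ kv_phys.T      # (Lq, Lk) physicochemical bias
    attn = softmax(logits + bias, axis=-1)
    return attn @ V, attn

rng = np.random.default_rng(0)
L_tcr, L_pep, d_model, d_head, n_factors = 14, 9, 32, 16, 5

# Placeholder inputs: stand-ins for encoder embeddings and factor vectors.
tcr_emb = rng.normal(size=(L_tcr, d_model))
pep_emb = rng.normal(size=(L_pep, d_model))
tcr_phys = rng.normal(size=(L_tcr, n_factors))
pep_phys = rng.normal(size=(L_pep, n_factors))

# Randomly initialised projection weights (would be learned in practice).
Wq, Wk, Wv = (rng.normal(size=(d_model, d_head)) * 0.1 for _ in range(3))
Wb = rng.normal(size=(n_factors, n_factors)) * 0.1

# TCR residues attend to peptide residues, then the reverse direction.
tcr_ctx, attn_tp = biased_cross_attention(
    tcr_emb, pep_emb, tcr_phys, pep_phys, Wq, Wk, Wv, Wb)
pep_ctx, attn_pt = biased_cross_attention(
    pep_emb, tcr_emb, pep_phys, tcr_phys, Wq, Wk, Wv, Wb.T)

print(tcr_ctx.shape, attn_tp.shape, pep_ctx.shape, attn_pt.shape)
```

Passing `Wb.T` in the reverse direction makes the peptide-to-TCR bias the transpose of the TCR-to-peptide bias, so each residue pair receives the same physicochemical score in both directions. Sharing the projection weights across directions is a simplification of this sketch; a real model might well use separate weights per direction and multiple heads.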