PiTE: TCR-epitope Binding Affinity Prediction Pipeline using Transformer-based Sequence Encoder

Pengfei Zhang, Seojin Bang, Heewook Lee

Research output: Contribution to journal › Conference article › peer-review

2 Scopus citations

Abstract

Accurate prediction of TCR binding affinity to a target antigen is important for the development of immunotherapy strategies. Recent computational methods were built on various deep neural networks and used the evolution-based substitution matrix BLOSUM to embed the amino acids of TCR and epitope sequences as numeric values. A pre-trained language model of amino acids is an alternative embedding method in which each amino acid in a peptide is embedded as a continuous numeric vector. Little attention has yet been given to summarizing the amino-acid-wise embedding vectors into sequence-wise representations. In this paper, we propose PiTE, a two-step pipeline for TCR-epitope binding affinity prediction. First, we use an amino acid embedding model pre-trained on a large number of unlabeled TCR sequences to obtain a real-valued representation from the string representation of an amino acid sequence. Second, we train a binding affinity prediction model that consists of two sequence encoders and a stack of linear layers predicting the affinity score of a given TCR and epitope pair. In particular, we explore various types of neural network architectures for the sequence encoders in the two-step binding affinity prediction pipeline. We show that our Transformer-like sequence encoder achieves state-of-the-art performance and significantly outperforms the others, perhaps due to the model's ability to capture contextual information between amino acids in each sequence. Our work highlights that an advanced sequence encoder on top of a pre-trained representation significantly improves the performance of TCR-epitope binding affinity prediction.
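The two-step structure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the embedding table is random rather than pre-trained, the encoder is a single unparameterized self-attention pass followed by mean pooling (so one function stands in for both sequence encoders), and the linear layer is untrained. The names `make_embedding_table`, `self_attention_encode`, `predict_affinity`, the width `DIM`, and the example sequences are all assumptions made for illustration.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
DIM = 8  # hypothetical embedding width, chosen only to keep the sketch small


def make_embedding_table(seed=0):
    """Stand-in for a pre-trained amino acid embedding model (random vectors here)."""
    rng = random.Random(seed)
    return {aa: [rng.gauss(0, 1) for _ in range(DIM)] for aa in AMINO_ACIDS}


def embed(seq, table):
    """Step 1: map a string of amino acids to one vector per residue."""
    return [table[aa] for aa in seq]


def softmax(xs):
    m = max(xs)  # subtract the maximum for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention_encode(vectors):
    """Step 2a: scaled dot-product self-attention, then mean pooling to one vector."""
    scale = math.sqrt(DIM)
    contextual = []
    for q in vectors:
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in vectors]
        weights = softmax(scores)
        contextual.append(
            [sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(DIM)]
        )
    n = len(contextual)
    return [sum(vec[d] for vec in contextual) / n for d in range(DIM)]


def predict_affinity(tcr, epitope, table, weights, bias=0.0):
    """Step 2b: concatenate both sequence vectors, apply one linear layer and a sigmoid."""
    features = self_attention_encode(embed(tcr, table)) + self_attention_encode(
        embed(epitope, table)
    )
    logit = sum(w * x for w, x in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-logit))


table = make_embedding_table()
rng = random.Random(1)
weights = [rng.gauss(0, 0.1) for _ in range(2 * DIM)]  # untrained, illustrative only
score = predict_affinity("CASSLGQAYEQYF", "GILGFVFTL", table, weights)
print(f"affinity score: {score:.3f}")
```

Because nothing here is trained, the printed score carries no biological meaning; the sketch only shows how per-residue embeddings are summarized into sequence-level vectors before the prediction layers.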

Original language: English (US)
Pages (from-to): 347-358
Number of pages: 12
Journal: Pacific Symposium on Biocomputing
Issue number: 2023
State: Published - 2023
Event: 28th Pacific Symposium on Biocomputing, PSB 2023 - Kohala Coast, United States
Duration: Jan 3 2023 to Jan 7 2023

Keywords

  • TCR
  • binding affinity prediction
  • epitope
  • sequence encoder

ASJC Scopus subject areas

  • Biomedical Engineering
  • Computational Theory and Mathematics
