Overview of Epitope Prediction Methods

Predicted epitopes in this database were identified using a combination of established computational tools, each capturing a different aspect of antigen recognition. For T-cell epitopes, both cytotoxic (CD8⁺) and helper (CD4⁺) predictions were performed using the NetMHCpan and NetMHCIIpan series, which estimate how strongly a peptide binds MHC class I or class II molecules and how likely it is to be naturally presented. For B-cell epitopes, multiple sequence-based algorithms were applied to assess surface accessibility, flexibility, hydrophilicity, and antigenic propensity. Together, these approaches form an in silico framework for shortlisting candidate epitopes; all predictions require experimental validation.

Cytotoxic T Cells

NetMHCpan-4.1 EL

Predicts peptides presented by MHC-I using a neural network trained jointly on eluted ligand and binding affinity data, providing likelihood scores that reflect natural antigen processing and presentation. Higher scores (or lower percentile ranks) indicate a higher likelihood that the peptide is a naturally processed and presented ligand.

NetMHCpan-4.1 BA

Estimates peptide–MHC-I binding affinity using a neural network trained on experimentally measured IC₅₀ values, reporting the predicted affinity (in nM) and a percentile rank. Lower IC₅₀ values (or lower rank %) correspond to stronger MHC-I binding.
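
As a rough illustration of how BA output is usually read, the sketch below converts between IC₅₀ in nM and the 0–1 score used by the NetMHCpan family (score = 1 − log(IC₅₀)/log(50000)) and labels a peptide by its percentile rank. The 0.5% and 2% cut-offs are the commonly used NetMHCpan defaults for strong and weak class I binders and are assumptions here, not settings taken from this database. The function names are ours and are not part of any NetMHCpan interface.

```python
import math

MAX_IC50_NM = 50000.0  # upper bound of the affinity scale in the log transform


def ic50_to_score(ic50_nm):
    """Map an IC50 in nM to a 0-1 score: 1 - log(IC50) / log(50000)."""
    return 1.0 - math.log(ic50_nm) / math.log(MAX_IC50_NM)


def score_to_ic50(score):
    """Invert the transform: IC50 = 50000 ** (1 - score)."""
    return MAX_IC50_NM ** (1.0 - score)


def classify_by_rank(rank_percent, strong=0.5, weak=2.0):
    """Label a peptide by percentile rank (assumed class I default cut-offs)."""
    if rank_percent <= strong:
        return "strong"
    if rank_percent <= weak:
        return "weak"
    return "non-binder"


# Hypothetical values for illustration only.
print(round(ic50_to_score(500.0), 3))
print(classify_by_rank(1.2))
```

Percentile rank is generally preferred over raw IC₅₀ when comparing peptides across alleles, because it normalises for allele-specific differences in the predicted affinity distribution.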

TCRLens

A structure-aware deep learning framework developed by our group that models residue-level interactions across five critical interface zones of peptide–MHC–TCR complexes using multi-scale graph representations and an equivariant graph neural network (EGNN). The model integrates a VAE–GAN to generate realistic negative samples and jointly predicts peptide–MHC binding, peptide–TCR recognition, and full-complex interactions. Higher prediction scores indicate stronger MHC-I binding affinity and higher confidence in TCR recognition potential.

Helper T Cells

NetMHCIIpan-4.3 EL

Predicts peptides naturally presented by MHC-II molecules using deep learning trained on eluted ligand datasets, outputting presentation likelihood scores. Higher scores indicate a greater likelihood of presentation to CD4⁺ T cells, which is a prerequisite for helper T-cell activation but does not guarantee it.

NetMHCIIpan-4.3 BA

Evaluates peptide–MHC-II binding affinity through neural network models trained on IC₅₀ data, returning predicted binding strength and ranking scores. Lower IC₅₀ or rank values signify tighter binding between the peptide and MHC-II molecule.

B Cells

BepiPred 3.0

Uses protein language model (ESM-2) embeddings of the antigen sequence to predict linear and conformational B-cell epitopes, providing residue-level probability scores. Residues scoring above the chosen threshold are more likely to be part of antibody-recognized epitopes. The cut-off is version-specific: the BepiPred-3.0 default is approximately 0.15, whereas 0.5 was the default in BepiPred-2.0.
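
Residue-level scores are usually turned into candidate epitopes by grouping consecutive residues that pass the cut-off. The following is a minimal sketch of that grouping step, assuming the scores are available as a plain list. The function name, the example scores, and the `min_length` filter are hypothetical, and the threshold is left as a parameter because the appropriate value depends on the tool version.

```python
def epitope_segments(scores, threshold, min_length=1):
    """Return (start, end) pairs, 0-based and end-exclusive, for runs of
    consecutive residues whose score is at or above the threshold."""
    segments = []
    start = None
    for i, score in enumerate(scores):
        if score >= threshold:
            if start is None:
                start = i  # a new run begins here
        elif start is not None:
            if i - start >= min_length:
                segments.append((start, i))
            start = None
    # Close a run that extends to the end of the sequence.
    if start is not None and len(scores) - start >= min_length:
        segments.append((start, len(scores)))
    return segments


# Hypothetical per-residue scores for a nine-residue stretch.
scores = [0.05, 0.20, 0.30, 0.10, 0.02, 0.40, 0.50, 0.45, 0.01]
print(epitope_segments(scores, threshold=0.15, min_length=2))  # [(1, 3), (5, 8)]
```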

Chou–Fasman

Identifies regions likely to form β-turns in protein sequences, which are often located on accessible surfaces and can serve as B-cell epitopes. Residues with β-turn propensity values above the chosen threshold are predicted as potential epitope-forming regions.

Emini

Estimates surface accessibility of residues to identify exposed regions that are more likely to be recognized by antibodies. Higher accessibility scores indicate surface-exposed residues suitable for B-cell recognition.
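
The Emini surface probability of a hexapeptide is usually given as the product of the six per-residue fractional surface probabilities multiplied by 0.37⁻⁶, so that a hexapeptide of average residues scores 1.0 and values above 1.0 suggest surface exposure. The sketch below implements that formula under those assumptions. The scale passed in the example is a placeholder with made-up values, not the published Emini scale, and the function names are ours.

```python
import math

AVERAGE_FRACTION = 0.37  # normalising constant in the Emini formula


def emini_surface_probability(hexapeptide, scale):
    """Product of six fractional surface probabilities times 0.37 ** -6."""
    if len(hexapeptide) != 6:
        raise ValueError("Emini scores are defined on hexapeptides")
    product = math.prod(scale[aa] for aa in hexapeptide)
    return product * AVERAGE_FRACTION ** -6


def emini_profile(sequence, scale):
    """Score every overlapping hexapeptide along the sequence."""
    return [
        emini_surface_probability(sequence[i:i + 6], scale)
        for i in range(len(sequence) - 5)
    ]


# Placeholder scale (hypothetical values, for illustration only).
toy_scale = {"A": 0.37, "K": 0.74}
print(emini_profile("AAAAAAK", toy_scale))
```

Because the score is a product rather than a sum, a single strongly buried residue lowers the whole window more sharply than it would in an averaging method.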

Karplus–Schulz

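Predicts local flexibility of amino acid residues using a flexibility scale derived from the crystallographic B-factors (temperature factors) of proteins with known structures, highlighting flexible regions that tend to act as B-cell epitopes. The query protein's own structure is not required. Regions with higher flexibility scores suggest greater structural mobility and higher epitope potential.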

Kolaskar–Tongaonkar

Uses physicochemical properties and antigenic propensity values of amino acids to identify antigenic determinants in protein sequences. Regions with average scores above 1.0 are generally considered antigenic and likely to elicit an antibody response.
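
Scale-based methods of this kind typically average a per-residue value over a sliding window and assign the result to the window's centre residue. The sketch below shows that averaging step with the 1.0 cut-off mentioned above. The window size of 7 is an assumed default, the scale in the example contains made-up values rather than the published propensities, and the function names are ours.

```python
def windowed_average(sequence, scale, window=7):
    """Map each centre position (0-based) to the mean scale value of the
    full window around it; positions too close to either end are skipped."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    half = window // 2
    profile = {}
    for centre in range(half, len(sequence) - half):
        residues = sequence[centre - half:centre + half + 1]
        profile[centre] = sum(scale[aa] for aa in residues) / window
    return profile


def antigenic_positions(profile, threshold=1.0):
    """Centre positions whose window average exceeds the threshold."""
    return [pos for pos, value in profile.items() if value > threshold]


# Placeholder scale (hypothetical values, for illustration only).
toy_scale = {"A": 0.9, "V": 1.2}
profile = windowed_average("AAAAAAAVVVVVVV", toy_scale)
print(antigenic_positions(profile))
```

The same averaging applies to other scale-based predictors by substituting the relevant scale and threshold.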

Parker

Calculates hydrophilicity profiles to pinpoint hydrophilic regions that are typically located on protein surfaces and recognized as B-cell epitopes. Higher hydrophilicity scores denote more exposed, water-facing residues with increased epitope likelihood.