  • Original article
  • Open access

Multi-epitope vaccine design of African swine fever virus considering T cell and B cell immunogenicity

Abstract

T and B cell activation are equally important in triggering and orchestrating adaptive host responses, and both should be considered when designing multi-epitope African swine fever virus (ASFV) vaccines. However, few design methods have considered the trade-off between T and B cell immunogenicity when identifying promising ASFV epitopes. This work proposes a novel Pareto front-based ASFV screening method, PFAS, to identify promising epitopes for designing multi-epitope vaccines using five ASFV Georgia 2007/1 protein sequences. To accurately predict T cell immunogenicity, four scoring methods were used to estimate T cell activation in four stages: proteasomal cleavage probability, transport efficiency of the transporter associated with antigen processing (TAP), binding affinity to major histocompatibility complex (MHC) class I, and CD8+ cytotoxic T cell immunogenicity. PFAS ranked promising epitopes using a Pareto front method that considers both T and B cell immunogenicity. The coefficient of determination between the Pareto ranks of multi-epitope vaccines and the survival days of vaccinated swine was R² = 0.95. PFAS scored complete epitope profiles and identified 72 promising top-ranked epitopes, including 46 CD2v epitopes, two p30 epitopes, 10 p72 epitopes, and 14 pp220 epitopes. PFAS is the first method to use the Pareto front approach to identify promising epitopes with the objectives of maximizing both T and B cell immunogenicity. The top-ranked promising epitopes can be cost-effectively validated in vitro. The Pareto front approach can be adapted to various epitope predictors for bacterial, viral, and cancer vaccine development. The MATLAB code of the Pareto front method is available at https://github.com/NYCU-ICLAB/PFAS.

Key points

• A Pareto front-based method is proposed for designing swine multi-epitope vaccines.

• The method maximizes both T and B cell immunogenicity while ranking promising epitopes.

• Higher epitope Pareto ranks correspond to longer survival after vaccination (R² = 0.95).

Introduction

African swine fever virus (ASFV) causes a lethal hemorrhagic disease and has become an epidemic swine viral disease in Asia. Simultaneous activation of T and B cells results in a better immune response and more immunological memory against ASFV than activation of either cell type alone (Bosch-Camos et al. 2020; Teklue et al. 2020). The development of subunit vaccines, especially multi-epitope vaccines, is more challenging than that of live-attenuated virus vaccines. However, live-attenuated vaccines contain attenuated forms of pathogens that can infect individuals with weakened immune systems and may revert to more virulent strains. As a result, these vaccines are only appropriate for endemic areas. This highlights the need for a shift towards multi-epitope vaccines, which have significant applications in non-endemic regions and are driving increased research and development efforts (Teklue et al. 2020). However, there is currently no effective multi-epitope vaccine for the prevention of ASFV (Blome et al. 2020).

The selection of protein candidates for designing a multi-epitope vaccine should consider several factors, including conservation, abundance, extracellular localization, and cross-protection against various viral genotypes (Adamczyk-Poplawska et al. 2011; Alejo et al. 2018; Kessler et al. 2018). CD2v is a hemagglutinin and the main antigen protein involved in regulating immune responses and cell adhesion (Burmakina et al. 2019; Gaudreault and Richt 2019; Jia et al. 2017). p30 is a structural protein involved in the attachment and internalization of ASFV (Gomez-Puertas et al. 1996). p54 is the only membranous structural protein in the inner viral envelope associated with viral attachment (Gomez-Puertas et al. 1998). p72 is a major structural protein (approximately 31–33% of the entire virus) and an important antigenic protein owing to its high conservation and thermostable nature (Liu et al. 2019; Yu et al. 1996). pp220 is the largest multi-precursor protein (Lokhandwala et al. 2019). Regarding immunogenicity and protection, CD2v, p30, and p72 show both antigenicity and immunogenicity, while p54 has antigenicity but low immunogenicity. pp220 displays immunogenicity and presents many peptides to CD8+ cytotoxic T cells (CTLs), triggering a strong antibody response (Bosch-Camos et al. 2020; Lokhandwala et al. 2019). In swine vaccine protection experiments, CD2v did not confer protection, while p30, p54, and p72 showed partial protection. Selecting proteins that target different aspects of the immune response can provide effective protection. In addition, the binding of attachment proteins (p30, p54) to dominant B and T cell epitopes (CD2v, p72) may enhance viral neutralization and clearance (Bosch-Camos et al. 2020; Jancovich et al. 2018).

Bioinformatic methods using a machine learning approach serve as an effective strategy to identify vaccine candidates for human (Guo et al. 2022; Hajialibeigi et al. 2021; Kibria et al. 2022) and swine pathogens, including ASFV (Gao et al. 2021), influenza A virus (Baratelli et al. 2020; Fan et al. 2018), and porcine circovirus type 2 (Bandrick et al. 2020). Machine learning methods used to identify epitopes in the design of multi-epitope vaccines consider the biological presentation and activation of ASFV epitopes. Figure 1 shows the presentation and activation of ASFV epitopes, and the correspondence between biological and computational processes. In viral infections, CTLs (mainly involved in viral T cell immunity) support cell-mediated immunity against intracellular viruses, while B cells trigger humoral immunity and produce memory cells for future infection (Clem 2011).

Fig. 1

Presentation and activation of ASFV epitopes, and the correspondence between biological and computational processes. (A) Illustration of the biological processing and presentation of the ASFV epitope. When infected by ASFV, four processing stages activate T cells in APCs and epitope presentation activates B cells in helper T cells. (B) The computational procedure of the Pareto front method. The input is a set of protein sequences. The outputs are prediction scores of T and B cell immunogenicity. The T cell score is the mean score of four stages: (1) proteasomal cleavage probability, (2) peptide-TAP transporter binding affinity, (3) peptide-MHC I binding affinity, and (4) CTL immunogenicity. CTL: cytotoxic T lymphocyte. APC: antigen-presenting cell. TAP: transporter associated with antigen processing. Parts of the figure were taken from ‘Smart Servier Medical Art’ (https://smart.servier.com/)

The objective of vaccine design is to induce immune responses in both T and B cells. Existing computational methods for identifying ASFV epitopes screen potential epitopes by separately considering T and B cell immunogenicity (Bosch-Camos et al. 2021; Lopera-Madrid et al. 2017; Ros-Lucas et al. 2020). The Pareto front is a popular approach used for obtaining a set of non-dominated solutions to a bi-objective problem. Pareto-optimal methods have been used to dock proteins and peptides (Masoudi-Sobhanzadeh et al. 2021) and improve amino acid and protein production in Yarrowia lipolytica (Jach et al. 2020). A study related to epitope-based vaccine design against human immunodeficiency virus used the Pareto front to simultaneously optimize cleavage and immunogenicity (Dorigatti and Schubert 2020). However, few studies have used the Pareto front method to simultaneously accommodate T and B cell immunogenicity.

This work proposes, for the first time, a Pareto front-based screening method, PFAS, to identify promising epitopes with high T and B cell immunogenicity for designing ASFV recombinant multi-epitope vaccines. First, PFAS used experimental T and B cell epitopes from the Immune Epitope Database (IEDB) to verify the state-of-the-art computational methods and their parameter settings. Next, PFAS used the Pareto front technique to convert T and B cell prediction scores into bi-objective ranks and identify the top-ranked epitopes. PFAS scored whole epitope profiles and identified 72 promising epitopes. Based on three combinations of epitopes from pp220, p30, p72, and p54 used in a vaccination study against ASFV, the coefficient of determination between the Pareto ranks of recombinant multi-epitope vaccines and swine survival was R² = 0.95. The identified epitopes can be cost-effectively validated in vitro to design epitope-based ASFV vaccines.

Immunoinformatics-based reverse vaccinology approaches hold great promise for reducing the time and cost of vaccine development. Currently, several approaches can be used to optimize and validate multi-epitope vaccines. These approaches include allergenicity assessment, protein structure verification, docking conformation, and in silico immune simulations of the vaccine structure (Bappy et al. 2021; Gul et al. 2022). The number of potential epitopes screened can be dynamically adjusted using PFAS for subsequent validation in multi-epitope combination simulations. PFAS can be seamlessly integrated into immunoinformatics-based vaccine development pipelines to generate high-potential epitope combinations for biological experimental confirmation.

Materials and methods

Collection of ASFV protein sequences

The most lethal ASFV type, Georgia 2007/1 (GenBank: FR682468), was used as the target virus for screening. The protein sequences of Georgia 2007/1 were obtained from the NCBI, including CD2v (EP402R), p30 (CP204L), p54 (E183L), p72 (B646L), and pp220 (CP2475L). All sequences were cut into fragments using a sliding window. Finally, two datasets of ASFV proteins consisting of 9mer (Figure S1A) and 15mer (Figure S1B) fragments served as candidates for CTL and B cell epitopes, respectively.
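The sliding-window fragmentation described above can be sketched in a few lines of Python. This is an illustrative sketch rather than the authors' code: the function name `sliding_fragments`, the step size of one residue, and the toy sequence are all assumptions made for the example.

```python
def sliding_fragments(sequence, k):
    """Return all overlapping k-mers of a protein sequence.

    A step of one residue is assumed; the source does not state the step size.
    """
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


# Toy sequence for illustration only (not an ASFV protein).
toy_sequence = "MKTAYIAKQRQISFVKSHFSRQ"

nine_mers = sliding_fragments(toy_sequence, 9)      # CTL epitope candidates
fifteen_mers = sliding_fragments(toy_sequence, 15)  # B cell epitope candidates
```

With a step of one, a protein of length n yields n - 8 candidate 9mers and n - 14 candidate 15mers.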

Collection of validation datasets

To obtain the best parameter settings for PFAS, this work established two datasets from the IEDB, consisting of experimentally validated swine CTL and B cell epitopes. The CTL epitopes (n = 243) were annotated as Sus scrofa, infectious diseases, and Swine Leukocyte Antigen (SLA) class I. After removing duplicate sequences and uncertain sequences that appeared in both the positive and negative groups, the dataset contained 125 swine 9mer CTL fragments, including 37 epitopes and 88 non-epitopes.

Similarly, 1,700 validated swine B cell epitopes (BCEs) which were annotated as Sus scrofa and infectious disease were retrieved. Because IgG production is part of the secondary humoral immune response to an antigen, we extracted 1,389 IgG epitopes. Among them, the 15mer epitope was the largest in the dataset, followed by the 12mer epitope. Therefore, we established two datasets: (1) 650 B cell 15mer epitopes, including 116 positive and 534 negative epitopes, and (2) 293 swine B cell 12mer epitopes, including 35 positive and 258 negative epitopes.

Proposed method PFAS

Figure 2 shows a flowchart of the proposed method PFAS. Five protein sequences from Georgia 2007/1 were obtained and cut into 9mer and 15mer fragments. The CTL epitope predictor estimates T cell activation of 9mer fragments in the four stages and averages the four scores to obtain a T cell immunogenicity score. Similarly, the BCE predictor estimates B cell activation of 15mer fragments to obtain a B cell immunogenicity score. After normalizing these two scores into the range of [0, 1], the two fragments were superimposed by the central amino acid. Consequently, the fragments were extended, and thus conserved sequences were obtained. The Pareto front method produced ranks of the conserved fragments. The top-ranked fragments were considered as promising epitopes.

Fig. 2

The flowchart of PFAS. This flowchart includes three main parts: epitope prediction of T and B cell fragments, Pareto rank of fragments, and promising epitopes of multi-epitope vaccines

Calculation of T and B cell scores

Good CTL epitopes are involved in viral processing and antigen presentation, with major histocompatibility complex (MHC) I molecules playing a major role. First, pathogen debris is degraded by proteasomal degradation in the cytosol of productively infected cells. NetCTL is based on the NetChop method and predicts the probability of proteasomal cleavage (Larsen et al. 2007). Second, peptides are transported to the endoplasmic reticulum (ER) by a transporter associated with antigen processing (TAP). To predict TAP transport efficiency, NetCTL and MHC I Processing in the IEDB use the stabilized matrix method, and TAPPred is based on a support vector machine (SVM) with 33 physical features of amino acids (Bhasin and Raghava 2004). Third, an antigen is loaded onto MHC I and appears on the cell surface through vesicles. NetMHCpan (Reynisson et al. 2020), MHC I Processing, and NetCTL are the most widely used ANN-based methods to predict MHC I binding affinity using the BLOSUM50 matrix. Finally, the epitope stimulates CTL activation and differentiation. MHC I immunogenicity in the IEDB (Calis et al. 2013) is based on an immunogenicity score model to predict immunogenicity. In general, these four predictive roles are equally important.

To identify promising T cell epitopes (TCEs), five web predictors were used, including NetCTL (https://services.healthtech.dtu.dk/service.php?NetCTL-1.2), IEDB MHC I Processing (http://tools.iedb.org/processing/), TAPPred (https://webs.iiitd.edu.in/raghava/tappred/index.html), NetMHCpan (https://services.healthtech.dtu.dk/service.php?NetMHCpan-4.0), and IEDB MHC I Immunogenicity. NetCTL was used to predict proteasome processing, TAP transport efficiency, and MHC I binding affinity. To examine conserved epitope candidates that cover multiple MHC loci, including A1, A2, A3, A24, A26, B7, B8, B27, B39, B44, B58 and B62, we used sequences as inputs and applied ensemble learning with 12 supertype models. After averaging all predictive values in the 12 models, we obtained three estimated values: binding affinity, proteasome cleavage, and the TAP score. The IEDB MHC I Processing tool was used to estimate TAP transport efficiency and MHC I binding affinity. We used all 45 SLA I alleles (including 12 SLA1, 16 SLA2, 12 SLA3, and 5 SLA6) and set nine as the peptide length for each allele to obtain a file with the average predictive values, including the TAP and MHC scores in all sequence fragments. PFAS used TAPPred to predict the peptide-TAP transporter binding affinity based on SVM with validated sequences and obtained the prediction score. NetMHCpan was used to predict the binding affinity of peptide-MHC I. To obtain effective epitopes, we considered all 75 SLA alleles (including 23 SLA1, 26 SLA2, 21 SLA3, and five SLA6) and set nine as the peptide length for each allele. The binding affinity scores were estimated with mean scores for all fragments. This work used IEDB MHC I Immunogenicity to predict CTL immunogenicity considering all CTL active factors and obtained scores of all sequence fragments.

BCEs can induce the differentiation of naïve and memory B cells into plasma cells, including antigen processing, peptide-MHC II presentation, and cytokine promotion. In studies on BCE presentation, LBtope (Singh et al. 2013), iBCE-EL (Manavalan et al. 2018), IgPred (Gupta et al. 2013), and ABCpred (Saha and Raghava 2006) are sequence-based predictors. LBtope uses the sparse matrix and amino acid property profile features and is an SVM-based Weka Classifier using 38,197 IEDB experimental epitopes. iBCE-EL is based on ensemble learning using amino acid composition characteristics and proportions of 5,550 experimentally validated BCEs. IgPred uses 14,725 BCEs in different types of specific epitopes using physicochemical properties (PCPs) features and is based on Weka Classifiers. ABCpred is based on PCP features and the neural network method with a balanced BCE database. Among the aforementioned predictors, LBtope uses the largest dataset with ensemble learning.

To estimate the B cell immunogenicity score of 15mer and 12mer fragments, five online predictors (LBtope_Variable, LBtope_Confirm, iBCE-EL, IgPred, and ABCpred) were utilized and validated. Epitope probabilities and IgG scores were determined using the iBCE-EL and IgPred prediction tools, respectively. LBtope is based on multiple peptides from prediction models using two variable-length epitope models. The LBtope_Variable model was trained using 38,197 peptides. The LBtope_Confirm model was reported in at least two studies and contained 2,837 peptides. By submitting multiple fragments, the probability of epitopes was obtained along with the physical property score. As ABCpred exclusively accepts an even number of epitope lengths and continuous amino acid sequences as submissions, PFAS used only one 12mer dataset with parameters containing a threshold of zero and an overlapping filter to obtain the predicted scores.

Immunogenicity prediction of T and B cell fragments

CTL activation prediction has four important stages: proteasomal cleavage probability, TAP transport efficiency, MHC I binding affinity, and CTL immunogenicity. These predictions help identify potential TCE candidates. PFAS combined all the prediction values obtained from the online prediction tools in the four stages. The probability of proteasomal cleavage was estimated using NetCTL1.2. The TAP transport efficiency score is the mean of the NetCTL1.2, IEDB MHC I Processing, and TAPPred values. The peptide-MHC I binding affinity score is the mean of the NetCTL1.2, IEDB MHC I Processing, and NetMHCpan predictive values. The CTL immunogenicity score is obtained using IEDB MHC I Immunogenicity. After normalizing the scores of each stage and combining them using a set of weights, PFAS compiled the four stage scores into the TCE prediction score.
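The combination of stage scores can be illustrated with a short Python sketch. It is a sketch under stated assumptions, not the authors' implementation: min-max scaling is assumed as the normalization (the text only states that scores are mapped into [0, 1]), the function names are hypothetical, and the numbers are toy values rather than predictor output.

```python
def min_max(values):
    """Scale a list of scores into [0, 1] (min-max scaling is an assumption)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def t_cell_scores(stage_scores, weights=(0.25, 0.25, 0.25, 0.25)):
    """Weighted sum of the four normalized stage scores for each fragment.

    stage_scores holds four lists (cleavage, TAP, MHC I, CTL immunogenicity),
    each with one value per fragment.
    """
    normalized = [min_max(stage) for stage in stage_scores]
    n_fragments = len(normalized[0])
    return [
        sum(w * stage[i] for w, stage in zip(weights, normalized))
        for i in range(n_fragments)
    ]


# Toy values for three fragments (illustrative only).
toy_stages = [
    [0.2, 0.8, 0.5],   # proteasomal cleavage probability
    [1.0, 3.0, 2.0],   # TAP transport efficiency (mean over tools)
    [0.1, 0.3, 0.2],   # MHC I binding affinity (mean over tools)
    [-0.1, 0.3, 0.1],  # CTL immunogenicity
]
toy_t_scores = t_cell_scores(toy_stages)
```

In this toy case the second fragment is the best in every stage and therefore receives the highest combined score, while the first receives the lowest.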

For the BCE prediction, the best predictor was evaluated and used to obtain B cell immunogenicity scores. After compiling the results of the prediction values from the web tools, the output values were normalized into the range of [0, 1] and B cell immunogenicity scores were compiled.

Pareto rank of fragments

The Pareto front is the set of all efficient solutions to a bi-objective problem. In this study, a fragment Frag belongs to the Pareto front if no other fragment has both a larger T cell score and a larger B cell score than Frag. The T and B cell scores of all fragments, each represented by its central amino acid, were used as inputs to the Pareto front method to determine the Pareto rank of the fragments. The Pareto front method iteratively removes the current Pareto front, and the Pareto rank of a fragment is the serial number of the front with which it was removed. For instance, the fragments belonging to the initial Pareto front have rank one. After this front is removed, the fragments belonging to the new Pareto front have rank two, and so on.
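The iterative front removal can be sketched as follows. This Python sketch is not the authors' MATLAB code: it applies the dominance rule exactly as worded above (strictly larger in both scores), the handling of ties is therefore an assumption, and the function names and toy scores are hypothetical.

```python
def dominates(a, b):
    """True if fragment a has both a larger T score and a larger B score than b."""
    return a[0] > b[0] and a[1] > b[1]


def pareto_ranks(points):
    """Assign rank 1 to the first Pareto front, remove it, then rank 2, and so on.

    points is a list of (t_score, b_score) pairs, one per fragment.
    """
    remaining = set(range(len(points)))
    ranks = [0] * len(points)
    rank = 1
    while remaining:
        # The current front: fragments not dominated by any remaining fragment.
        front = {
            i for i in remaining
            if not any(dominates(points[j], points[i]) for j in remaining if j != i)
        }
        for i in front:
            ranks[i] = rank
        remaining -= front
        rank += 1
    return ranks


# Toy (T score, B score) pairs for five fragments (illustrative only).
toy_points = [(0.9, 0.2), (0.5, 0.5), (0.2, 0.9), (0.4, 0.4), (0.1, 0.1)]
toy_ranks = pareto_ranks(toy_points)  # [1, 1, 1, 2, 3]
```

The first three toy fragments trade T score against B score and none beats another in both objectives, so they share rank one. The quadratic scan per front is chosen for readability and is adequate for a few thousand fragments.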

Promising epitopes of the multi-epitope vaccine

This work extended the fragments to a length of 16–20 amino acids and obtained the epitope profiles with the average Pareto rank of the extended fragments. The average rank was defined as the sum of the Pareto ranks divided by the total number of fragments included in the extended fragment. Moreover, to select conserved epitopes, PFAS estimated protein variability using the Protein Variability Server (PVS) (Garcia-Boronat et al. 2008). PVS provides three methods that can be used as indicators of variability: the Shannon entropy, the Simpson diversity index, and the Wu–Kabat variability coefficient. In this study, a site with a Shannon entropy greater than two was considered a variable site. Accordingly, PFAS removed the fragments that contained highly variable sites. Finally, in the 16mer to 20mer epitope profiles, PFAS ranked conserved fragments according to the average Pareto rank, and the top-ranked promising epitopes were provided to the biological decision makers for in vitro validation.
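A minimal Python sketch of the two quantities used in this step is given below. Several points are assumptions: the entropy is computed in bits (log base 2) per alignment column, an extended fragment of length L is taken to contain the L - 14 consecutive 15mers that fit inside it, and the function names and example values are hypothetical. It does not reproduce PVS itself.

```python
import math
from collections import Counter


def shannon_entropy(column):
    """Shannon entropy (in bits) of the residues observed at one alignment site."""
    counts = Counter(column)
    total = len(column)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def average_rank(ranks_15mer, start, extended_length, base_length=15):
    """Mean Pareto rank of the 15mers contained in one extended fragment.

    ranks_15mer[i] is the Pareto rank of the 15mer starting at position i.
    """
    n_contained = extended_length - base_length + 1
    window = ranks_15mer[start:start + n_contained]
    return sum(window) / len(window)


# Toy values (illustrative only).
conserved_site = shannon_entropy("AAAAAAAA")  # one residue type at the site
variable_site = shannon_entropy("ACDEFGHI")   # eight equally frequent residues
toy_rank_profile = [1, 2, 3, 4, 5, 6]
rank_16mer = average_rank(toy_rank_profile, 0, 16)  # mean of the first two ranks
rank_20mer = average_rank(toy_rank_profile, 0, 20)  # mean of the first six ranks
```

Under the threshold stated above, the second toy site (entropy of three bits) would be flagged as variable, and any fragment covering it would be removed.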

Results

Estimation of T and B cell immunogenicity scores

After T cell online prediction, the T cell score was computed as the weighted sum of the scores of the four stages: proteasomal cleavage probability, TAP transport efficiency, MHC I binding affinity, and CTL immunogenicity. For each stage, the scores were normalized and averaged. To validate the weights of the four scores, the experimentally validated swine 9mer CTL epitopes from the IEDB were used. Figure S2A shows good performance, with an area under the receiver operating characteristic curve (AUC) of 0.71, and the largest AUC was reached when using the top 30% of epitopes (Figure S3). The set of four equal weights (1/4, 1/4, 1/4, and 1/4) for determining the T cell score was the most stable. This result is consistent with the earlier hypothesis that the four stages are equally important in swine immunogenicity prediction.
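For readers unfamiliar with the metric, the AUC can be computed directly from scores and binary labels as the probability that a randomly chosen epitope scores higher than a randomly chosen non-epitope. The Python sketch below uses this pairwise formulation; the source does not state how the AUC was computed, and the function name and toy data are hypothetical.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    labels use 1 for epitopes and 0 for non-epitopes; ties count as half a win.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Toy scores and labels (illustrative only, not the IEDB validation set).
toy_scores = [0.9, 0.8, 0.4, 0.3, 0.2]
toy_labels = [1, 0, 1, 0, 0]
toy_auc = auc(toy_scores, toy_labels)
```

In the toy data, five of the six positive-negative pairs are ordered correctly, giving an AUC of 5/6. An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation.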

In B cell immunogenicity prediction, previous studies have shown that sequence-based predictors, such as LBtope (Singh et al. 2013), iBCE-EL (Manavalan et al. 2018), IgPred (Gupta et al. 2013), and ABCpred (Saha and Raghava 2006), are an efficient approach to identifying BCEs. Because these methods differ in the aims of their training datasets and in their machine learning approaches, their prediction results differed. Therefore, PFAS used LBtope, which uses the largest experimental dataset with ensemble learning, as the prediction model to identify BCEs. Experimentally validated swine 15mer and 12mer BCEs were used to validate this hypothesis, with a validation pipeline consistent with that used for TCE prediction. Figure S2B shows the performance of four methods: iBCE-EL, IgPred, LBtope with a variable dataset, and LBtope with a confirmed dataset. LBtope with the variable dataset achieved an AUC of 0.86. When using the top-ranked 30% of epitopes, LBtope with the confirmed dataset was better than the other methods (Figures S4–5). These results are in good agreement with the hypothesis, showing that LBtope is an appropriate method for predicting ASFV BCEs.

Epitopes identification using the Pareto front method

Given that both T and B cell activation are equally important for mobilizing adaptive immunity, we applied the Pareto front method to identify potential epitopes. To rank and identify fragments simultaneously, the Pareto front method iteratively determined 116 Pareto ranks (Figure S6). T and B cell scores in the bi-objective problem were converted into Pareto ranks to identify epitope candidates.

Figure 3 shows 15mer epitope profiles for the five ASFV proteins. A higher average Pareto rank indicates a more promising epitope. Table S1 shows the results of the selected fragments in the five proteins, including 346 fragments of CD2v, 187 fragments of p30, 170 fragments of p54, 632 fragments of p72, and 2462 fragments of pp220. In short, Table 1 shows the Pareto ranks of the top three fronts. The best protein is CD2v, which has the most selected and continuous fragments in the rank one Pareto front.

Fig. 3

Epitope profiles of the five ASFV proteins. Each 15mer fragment has an average Pareto rank represented by the central amino acid. The higher Pareto rank indicates greater epitope potential. The light pink background indicates the epitope hotspot. (A) CD2v. (B) p30. (C) p54. (D) p72. (E) pp220. ID, identification

Table 1 Fragments of the top three fronts using PFAS

Evaluation of screening efficiency

To evaluate the screening efficiency of PFAS, two validation datasets were used, consisting of 30 experimentally validated and 34 predicted T or B cell epitopes annotated in previous studies and the IEDB database (Bosch-Camos et al. 2021; Ivanov et al. 2011; Ros-Lucas et al. 2020). PFAS selected the top 30% of fragments as promising epitopes. Table 2 lists the public epitopes with their Pareto ranks. PFAS identified 17 of the 30 experimental epitopes and 24 of the 34 predicted epitopes. Figure 4 shows scatter plots of the experimental, predicted, and PFAS-selected epitopes. Since animal studies can prove the actual antigenicity and immunogenicity of epitopes, the top-ranked epitopes may be superior to the published epitopes. Three recombinant multi-epitope vaccines with synthesized epitope groups were used to determine the coefficient of determination between the predicted ranks of PFAS and swine immunization outcomes (Ivanov et al. 2011). Combinations 1, 2, and 3 consisted of four pp220 epitopes, six p30 and p72 epitopes, and two p54 epitopes, respectively. The Pareto ranks of the peptides in each combination were determined using the Pareto front method. Figure 5 indicates a significant coefficient of determination of R² = 0.95 between the mean Pareto ranks of the recombinant vaccines and swine survival days in a vaccination study. The higher the combination Pareto rank, the longer the survival days of the pig (Table S2). These results reveal that PFAS is an efficient approach to epitope identification.
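The coefficient of determination for a simple linear fit can be computed as in the Python sketch below. The source does not describe the fitting procedure, so ordinary least squares is an assumption; the function name is hypothetical and the numbers are made up for illustration and are not the study's ranks or survival days.

```python
def r_squared(x, y):
    """Coefficient of determination of an ordinary least-squares line y = a*x + b."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((b - (slope * a + intercept)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    return 1.0 - ss_res / ss_tot


# Made-up values for three hypothetical vaccine combinations (illustrative only).
toy_mean_ranks = [1.0, 2.0, 3.0]
toy_survival_days = [1.0, 3.0, 2.0]
toy_r2 = r_squared(toy_mean_ranks, toy_survival_days)
```

With only three combinations, as in the vaccination study, a fitted line has a single residual degree of freedom, so an R² value from such a fit is best read as indicative rather than conclusive.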

Fig. 4

Scatter points of experimental, predicted and selected epitopes in the top 30% of the fragments using PFAS. (A) Experimental epitopes. (B) Predicted epitopes

Fig. 5

The correlation between Pareto ranks and swine survival days. Combinations 1, 2, and 3 contain four, six, and two epitopes, respectively. The vaccination experiments of combinations 1, 2, and 3 were performed in triplicate, quadruplicate, and triplicate, respectively. The coefficient of determination between the mean ranks of the recombinant vaccines and swine survival days was R² = 0.95. Results are presented as mean ± SD

Table 2 The Pareto rank of the epitopes in the top 30% fragments for the experimentally validated and predicted epitopes

Identification of promising epitopes for multi-epitope vaccines

Clustering epitopes into hotspots (regions of high-ranked epitopes) is an effective way to obtain vaccine candidates in multi-epitope vaccine design. The results shown in Table 1 are consistent with previous studies in which highly ranked sequences were continuous. Accordingly, PFAS extended the fragments to sequences of 16–20 amino acids and produced epitope profiles with the average Pareto ranks of the extended fragments (Figures S7–11). The higher the front rank, the greater the potential of the epitope. A region enriched in potential epitopes is considered an epitope hotspot.

Additionally, epitope variability is important for biologists in identifying vaccine candidates. To obtain conserved epitopes, PFAS calculated the variability of the five proteins using PVS (Table S3). A total of 45 sites, including two CD2v sites and 43 p54 sites, were regarded as highly variable because their Shannon entropy was greater than two (Figure S12). Similarly, a fragment containing highly variable sites was regarded as a highly variable region. After removing 69 highly variable fragments (Table S4), 3728 fragments were obtained, including 341 CD2v fragments, 187 p30 fragments, 106 p54 fragments, 632 p72 fragments, and 2462 pp220 fragments. After conservation verification, the mean Pareto rank of the extended fragments for each protein was determined. CD2v had the highest average rank among the five proteins, and p30, p72, pp220, and p54 ranked second, third, fourth, and fifth, respectively. Furthermore, we estimated all conserved fragments (Table S5). Biological decision makers can flexibly choose the appropriate epitope length and sample size for experimental validation in vitro (Tables S6–10). For example, Table 3 shows 72 promising epitopes with an average Pareto rank of four, including 26 16mer epitopes, 16 17mer epitopes, 12 18mer epitopes, 10 19mer epitopes, and 8 20mer epitopes. These epitopes came from four proteins: 46 from CD2v, two from p30, 10 from p72, and 14 from pp220.

Table 3 The top 72 epitopes and their average ranks with the 16–20 amino acids

Discussion

ASFV is a complex and lethal multi-antigen virus that originated in Africa and has recently caused an emerging epidemic in Asia. With advances in computational biology and machine learning in the field of immunology, computational epitope prediction provides a new opportunity to improve ASFV multi-epitope vaccines. Several studies have demonstrated that ASFV enhances or modulates the host immune response through multiple proteins. Therefore, a recombinant multi-epitope vaccine has the potential to be an excellent ASFV vaccine.

In this work, we have analyzed five proteins: CD2v, p30, p54, p72, and pp220. Variation in MHC polymorphisms would induce different immune responses (Opriessnig et al. 2021), which plays an important role in identifying potential epitopes. Human epitopes have been widely used to build prediction models. However, very few swine epitopes and prediction models are available. To identify potential epitopes for swine vaccine, PFAS used state-of-the-art predictors with promising parameter setting to calculate T and B cell scores. Even when experimentally validated porcine epitopes were used for parameter validation, cross-species prediction models may still reduce prediction accuracy.

Although both T and B cell immunogenicity are important, conventional prediction methods must trade one off against the other when identifying promising epitopes. Therefore, the Pareto front method was proposed to cope with the bi-objective problem by converting both the T and B cell scores of a fragment into a single Pareto rank, so that vaccine designers can easily determine the number of promising epitopes for biological experiments. The validation against reported ASFV recombinant multi-epitope vaccines suggests that the Pareto front method is a potentially useful approach to identifying promising epitopes in the design of multi-epitope vaccines against ASFV.

Because ASFV has multiple antigens and complex immune interactions with the host immune system, ASFV recombinant multi-epitope vaccines require a multi-epitope combination. Therefore, we analyzed the protein features of the top-ranked epitopes in five potential proteins, and some results were consistent with those of biological studies. Some studies have shown that CD2v exhibits serological specificity, participates in immune evasion, enhances viral replication, and damages lymphocyte functions (Sanna et al. 2017). In this work, we observed that CD2v had the highest average rank of extended fragments and the highest proportion (n = 46) of the top 72 epitopes. These results are consistent with the findings of existing studies revealing that CD2v plays an important role in activating adaptive host immune responses (Jia et al. 2017). CD2v may be a potential protein candidate in the ASFV vaccine due to its high ranking. p30, a phosphoprotein involved in ASFV entry, is synthesized in the early phase and continues to be synthesized during the late phase of viral infection. Although p30 is an antigenic and conserved structural protein, the immune response triggered by p30 alone is insufficient for antibody-mediated protection. However, combining it with other proteins, such as hemagglutinin, can increase humoral and cellular responses (Argilaguet et al. 2012). In this work, p30 had the second highest average rank and contributed two of the top 72 epitopes. It appears that p30 can be an important part of a multi-epitope vaccine. p54 is important for the recruitment of envelope precursors to assembly factories and induces apoptosis during the early phase of infection (Hernaez et al. 2004; Rodriguez et al. 1996). p54 is an antigenic structural protein that induces the production of specific antibodies. However, protein variability analysis demonstrated that there is a highly variable region in the C-terminus of the p54 protein. Accordingly, p54 had the lowest average rank among the five proteins in this work, which suggests that p54 epitope selection may depend on the target swine species and predominant MHC type. p72 is conserved and essential for viral icosahedron formation during viral infection (Cobbold and Wileman 1998); therefore, p72 has the characteristics of high antigenicity and immunogenicity and is enriched and assembled in the ER during late-stage expression of infection. A previous study (Neilan et al. 2004) showed that p72 may produce high levels of p72-specific IgG antibodies, but that p72 epitopes alone provide only partial protection. In this work, p72 had the third highest average rank, and 10 p72 epitopes were among the top 72 epitopes. These data suggest that p72 may increase antibody production in a multi-epitope vaccine. The ASFV polyprotein precursor pp220 is highly conserved in the viral genome, and pp220 is cleaved by proteases to produce the mature virion proteins p150, p37, p14, and p34, which account for approximately 30% of the total viral protein mass and play an important role in the assembly process of the viral capsids and viral infection (Andres et al. 2002). In this work, pp220 had seven epitopes in the top three fronts, and pp220 had the second highest proportion (n = 14) among the top 72 promising epitopes. These results indicate that pp220 can be an important component of a multi-epitope vaccine.

However, in computational studies of swine, the prediction models were trained mainly on human datasets and small amounts of animal data. In addition, the proportions of immune cell populations and the function of T cells differ between pigs and humans (Gerner et al. 2015; Rubic-Schneider et al. 2016), and variations in MHC polymorphisms induce different immune responses (Opriessnig et al. 2021). Identifying suitable individual predictors is therefore important for improving swine epitope prediction.

Although cross-species epitope prediction increases the uncertainty of the results, this study has demonstrated a relationship between Pareto rank and swine survival days based on the Pareto front approach. These findings support the hypothesis that accurate predictors combined with the Pareto front method may reduce vaccine development time and costs when applied to human vaccine development.
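As a minimal sketch of how such a relationship can be quantified, the coefficient of determination of a least-squares line can be computed as follows. The function name `r_squared` and the toy numbers are assumptions made for illustration; they are not the study's data and this is not the authors' MATLAB code.

```python
from typing import Sequence


def r_squared(x: Sequence[float], y: Sequence[float]) -> float:
    """Coefficient of determination of the least-squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products around the means.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual and total sums of squares.
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Placeholder values for illustration only (hypothetical, not from the study):
# x could hold the Pareto rank of each vaccine, y the observed survival days.
toy_rank = [1, 2, 3]
toy_days = [1, 2, 2]
print(r_squared(toy_rank, toy_days))
```

For a single predictor, this value equals the squared Pearson correlation, so it measures only how well a straight line describes the relationship.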

In this study, we used the Pareto front method to consider T and B cell immunogenicity simultaneously and ranked the epitopes by Pareto rank. This procedure involved state-of-the-art computational methods and confirmed parameters. In addition, the evaluation of the experimental epitope ranks against the vaccination study yielded a high coefficient of determination (R2 = 0.95), indicating that the Pareto front method screens epitopes effectively. Finally, promising epitopes based on fragment extension and the peptide sequences with their Pareto ranks were provided for biological experimental verification and confirmation. Overall, our study proposes a computational prediction method based on the Pareto front, provides the Pareto ranks of all fragments and a set of promising epitopes, and may contribute to the development of recombinant multi-epitope vaccines for ASFV. The method may also be used to identify promising epitopes in humans or across species.
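One common way to assign Pareto ranks is non-dominated sorting over the objectives to be maximized. The sketch below illustrates that idea for two objectives; it is not the authors' MATLAB implementation. The function names and the score pairs are hypothetical, and it assumes that higher scores are better for both T and B cell immunogenicity.

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b on every objective and better on at least one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_ranks(scores: List[Tuple[float, float]]) -> List[int]:
    """Rank 1 is the non-dominated front; remove it and repeat for rank 2, 3, ..."""
    ranks = [0] * len(scores)
    remaining = set(range(len(scores)))
    rank = 0
    while remaining:
        rank += 1
        # Points that no other remaining point dominates form the current front.
        front = {
            i for i in remaining
            if not any(dominates(scores[j], scores[i]) for j in remaining if j != i)
        }
        for i in front:
            ranks[i] = rank
        remaining -= front
    return ranks


# Hypothetical (T cell score, B cell score) pairs, not values from the study.
toy_scores = [(0.9, 0.2), (0.5, 0.5), (0.2, 0.9), (0.4, 0.4), (0.1, 0.1)]
toy_ranks = pareto_ranks(toy_scores)
print(toy_ranks)
```

In this toy input, the first three points trade one objective against the other, so none dominates another and they share the first front. A weighted sum of the two scores would instead force a single ordering that depends on the chosen weights.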

Data availability

The data downloaded from the prediction tools are available from the corresponding author upon request. The MATLAB code of the Pareto front method is available at https://github.com/NYCU-ICLAB/PFAS. The validated epitopes, experimental results, and promising epitopes are available in the supplementary information.

Acknowledgements

We would like to thank the National Core Facility for Biopharmaceuticals (NCFB, 111-2740-B-492-001) and the National Center for High-performance Computing (NCHC) of the National Applied Research Laboratories (NARLabs) of Taiwan for providing computational and storage resources.

Funding

This work was supported by grants from the National Science and Technology Council, Taiwan (110-2221-E-A49-099-MY3, 112-2740-B-400-005-), and was financially supported by the “Center for Intelligent Drug Systems and Smart Bio-devices (IDS2B)” from the Featured Areas Research Center Program within the framework of the Higher Education Sprout Project by the Ministry of Education (MOE) in Taiwan.

Author information

Contributions

TC, CC, SH, and PW conceived the study. TC and SH designed the experiments. TC, YH, and FK performed the experiments and analyzed the data. CC and PW performed the formal analysis and provided technical assistance. TC drafted the original manuscript. SH and CC validated, supervised, and proofread the manuscript. All authors read and approved the final manuscript.

Corresponding authors

Correspondence to Chia-Jung Chang or Shinn-Ying Ho.

Ethics declarations

Competing interests

Pei-Yin Wue and Chia-Jung Chang are employed by the Reber Genetics Co.

Ethical standards

This article does not contain any studies with animals performed by any of the authors.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Chen, TY., Ho, YJ., Ko, FY. et al. Multi-epitope vaccine design of African swine fever virus considering T cell and B cell immunogenicity. AMB Expr 14, 95 (2024). https://doi.org/10.1186/s13568-024-01749-6

  • DOI: https://doi.org/10.1186/s13568-024-01749-6

Keywords