Use of combinatorial peptide libraries for T-cell epitope mapping


Methods 29 (2003) 236–247 www.elsevier.com/locate/ymeth

Mireia Sospedra (a), Clemencia Pinilla (b), and Roland Martin (a,*)

(a) Neuroimmunology Branch, NINDS, National Institutes of Health, 10 Center Drive, MSC 1400, Bethesda, MD 20892-1400, USA
(b) Torrey Pines Institute for Molecular Studies, San Diego, CA 92121, USA

Accepted 25 November 2002

Abstract

T lymphocytes play important roles not only in infectious diseases and autoimmunity, but also in immune responses against tumors. For many of these disorders, the relevant target antigens are not known. Designing effective methods that allow the search for T-cell epitopes is therefore an important goal in the areas of infectious diseases, oncology, vaccine development, and numerous other biomedical specialties. So far, the strategies used to examine T-cell recognition have been based largely on mapping T-cell epitopes with overlapping peptides from known proteins or with entire proteins, e.g., from a specific virus, bacterium, or human tissue. These approaches are tedious and have a number of limitations. It is, for example, almost impossible to isolate T cells that infiltrate an organ or infectious site and identify their specificity unless one already has a concept as to which antigens may be relevant. During recent years, a number of laboratories have developed less biased approaches that select putative T-cell epitopes based on predicted binding to certain major histocompatibility complex (MHC) molecules, that use peptide or protein libraries generated in expression systems, e.g., phage, or that rely on combinatorial peptide chemistry. The latter technique has been refined by a number of laboratories including ours. Bead-bound or, preferably, positional scanning synthetic and soluble combinatorial peptide libraries allow the identification of T-cell epitopes within complex mixtures of proteins even for T cells that have been expanded from an organ infiltrate with a polyclonal stimulus. The practical steps that are involved in the latter method are described in this article. Published by Elsevier Science (USA).

* Corresponding author. Fax: 1-301-402-0373. E-mail address: [email protected] (R. Martin).

1046-2023/03/$ - see front matter. Published by Elsevier Science (USA). doi:10.1016/S1046-2023(02)00346-8

1. Introduction

Adaptive immune responses are antigen-specific and are mediated by antibodies, i.e., the humoral immune response, and by T lymphocytes, i.e., the cellular immune response. Antibodies usually react with conformational determinants of complex molecules that exist in soluble or particulate form in blood or other body fluids/compartments. Only a minority of antibodies respond to linear peptides. In contrast to antibodies, T cells react with oligopeptides, usually between 9 and about 20 amino acids in length, that are bound to the peptide-binding groove of either class I or class II self major histocompatibility complex (MHC)/human leukocyte antigen (HLA)¹ molecules. The different subpopulations of T cells can roughly be divided into MHC/HLA class II-restricted CD4+ helper T cells and CD8+ T cells that are usually cytolytic and recognize antigenic peptide in the context of MHC/HLA (both are used interchangeably below) class I molecules. Furthermore, different CD4+ helper T (Th) populations have been categorized according to their cytokine secretion patterns. Helper T cells stimulate B cells, activate and recruit numerous other immune cells, and, via the secretion of interleukin (IL)-2 and other cytokines, provide help to cytolytic T cells, which in turn exert their effector function by killing, e.g., virus-infected cells or tumor cells.

¹ Abbreviations used: aa, amino acid; TCC, T-cell clone; TCR, T-cell receptor; SCL, synthetic combinatorial peptide libraries; PS-SCL, positional scanning SCL; APC, antigen-presenting cell; MHC, major histocompatibility complex; HLA, human leukocyte antigen; S index (SI), stimulation index; SMPS, simultaneous multiple peptide synthesis.

It is relevant to note for this brief article that the antigen-binding grooves of MHC class I and MHC class II molecules differ in one important aspect. While MHC class I molecules accommodate comparatively short peptides (9–10 amino acids) in their binding groove, which is closed at both ends, class II molecules can bind longer peptides since the MHC class II binding groove is open. The core length of the MHC class II binding groove is similar, however. The large number of polymorphic MHC class I and class II alleles differ in amino acid composition and structure of their binding pockets. This high degree of polymorphism and the simultaneous expression of multiple class I and class II molecules on the surface of all nucleated cells (class I) or antigen-presenting cells (class I and class II) encode the specific self histocompatibility patterns that characterize each individual.

Identification of the specific antigen(s) recognized by T cells is of great importance to the understanding and potential treatment of infectious diseases and autoimmune disorders and in tumor immunology. Decrypting the specificity of T cells was greatly facilitated by the introduction of T-cell cloning by limiting dilution and other techniques, but also by the continuous propagation of these T-cell clones via repeated antigen stimulation and soluble growth factor, i.e., interleukin-2. Once unique and clonal T-cell populations could be expanded to sufficient cell numbers, numerous antigens including whole infectious agents, protein mixtures, individual proteins, peptides, expression libraries, and peptide libraries, among others, were used to identify the specificity of the respective T-cell clone (TCC).

The development of soluble and bead-bound combinatorial peptide libraries in various formats representing millions to trillions of peptides has emerged as a powerful approach to T-cell epitope determination [1,2]. Recent studies have demonstrated the efficacy of using positional scanning synthetic combinatorial libraries (PS-SCL) for identifying target antigens (Ags) and highly active peptide mimics [1–5]. The identification of these epitopes for clonotypic T-cell receptors (TCR) of known and unknown specificity is possible using a strategy that combines data acquisition with PS-SCL and analysis with a quantitative scoring matrix [4,6]. Peptides can be identified from database analysis and ranked according to a score that is predictive of their stimulatory potency. To our knowledge, this strategy (Fig. 1) is the most efficient approach available to identify stimulatory peptides for individual TCR and predict their actual stimulatory potency with relatively high accuracy [6]. Here, we focus on the practical steps and application of positional scanning synthetic combinatorial peptide libraries for the identification of T-cell specificity.

Fig. 1. Flow diagram of the strategy used to analyze TCR recognition of antigens by clonotypic T cells. The steps for a quantitative analysis are represented in boldface. Experimental data collected by measuring functional T-cell responses to PS-SCL are then analyzed by a scoring matrix approach. This allows the identification and ranking of the spectrum of antigenic ligands for TCC of known and unknown specificity.

2. Positional scanning synthetic combinatorial libraries

2.1. Synthetic combinatorial libraries

Synthetic combinatorial libraries (SCL) represent mixtures of very large numbers of synthetic compounds that are systematically arranged. SCL composed of peptides are used for the specific purpose of identifying T-cell epitopes [1,7]. SCL are generated using the multiple solid-phase synthesis method known as the "tea bag approach" [8]. The characteristic feature of SCL is the presence of individual defined amino acids (aa) at certain positions of the compound scaffold, while the remaining diversity positions are mixtures of aa. SCL are cleaved from the synthetic solid support and then assayed in solution, which allows each peptide within each mixture to freely interact with a given receptor. The first SCL were made up of peptides of various lengths and amino acids (L-, D-, and unusual).

Two different deconvolution methods are used to identify individual active compounds from mixture-based SCL. The first method involves an iterative deconvolution approach [9]. Following the identification of active mixtures having defined positions, the remaining mixture positions are then defined one diversity position at a time, through a synthesis and selection process, until individual compounds are identified. The second deconvolution method is termed positional scanning [5,10,11], in which each diversity position is individually addressed by separate sublibraries. The amino acids of the most active mixtures at each diversity position are combined, and the resulting individual compounds are synthesized and tested to determine their activities. The advantage of the positional scanning format is that iterations are not required; in most cases only a single synthesis is needed to obtain active individual compounds.

2.2. Positional scanning concept

PS-SCL are composed of positional SCL or sublibraries, in which each diversity position is defined with a single aa, while the remaining positions are composed of mixtures of aa [12]. Each positional sublibrary represents the same collection of individual compounds. The assay data derived from each positional sublibrary provide information about the most important aa for every diversity position of the PS-SCL.

An example of the PS-SCL concept using a tripeptide combinatorial library is illustrated in Fig. 2. Four different amino acids are incorporated at each of the three diversity positions, resulting in a diversity of 64 (4³) individual peptides. When the same diversity is arranged as a PS-SCL, only 12 peptide mixtures (4 amino acids × 3 positions) need to be synthesized. The three positional sublibraries, namely OXX, XOX, and XXO, contain the same diversity of peptides and differ only in the location of the defined aa position. The O positions represent one of the four amino acids while the remaining two diversity positions are mixtures (X) of the same four amino acids. Shown below each mixture are the 16 peptides (4²) that make up that mixture. Assuming that ART is the only active tripeptide in this library that is recognized, the ART tripeptide (outlined below each sublibrary in Fig. 2) is present in all three positional sublibraries. Thus, the only mixtures with activity are AXX, XRX, and XXT because the ART tripeptide is present only in those mixtures. The combination of these aa in their respective positions yields the tripeptide ART, which would then be synthesized and tested for its activity against the receptor. It should be noted that the activity observed for each of the three mixtures (AXX, XRX, and XXT) is due to the presence of the tripeptide ART within each mixture, and not due to the individual amino acids (A, R, and T) that occupy the defined positions. In more complex libraries, more than one mixture is often found to have activity at each position. Selection of the aa for the synthesis of individual peptides is based first on activity and then on differences in chemical character. Although the above example is a simple representation of the arrangement and use of a PS-SCL, the same concept applies to all types of mixture-based libraries having defined and mixture positions.

Fig. 2. Schematic comparison of a library of 64 tripeptides formatted as individual peptides or in PS-SCL. To construct the tripeptide library four different amino acids are incorporated at each of the three positions, resulting in a diversity of 4³ (64) individual peptides. The individual peptides are grouped according to the amino acid in the first position and the only active peptide (ART) is outlined. The PS-SCL is composed of 12 different peptide mixtures (3 positions × 4 amino acids). The active mixtures at each position and the active peptide responsible for the mixture activities are outlined.

To give an estimate of the complexity of PS-SCL, a hexapeptide library using 20 amino acids represents a total diversity of 6.4 × 10⁷ (20⁶) individual peptides, a number of peptides that clearly could not be synthesized or tested individually. A hexamer PS-SCL can be formatted into 120 mixtures (20 aa × 6 positions). The synthesis of SCL is not covered here, but has been described in detail elsewhere [12].

2.3. Screening mixture-based libraries

One advantage of using libraries formatted as mixtures is the reduction of assay reagents and materials required for screening. For example, the number of 96-well microtiter plates required for testing a hexapeptide library as mixtures in duplicate is 4 plates, compared with 10⁶ plates if the library is formatted as individual compounds (50 × 10⁶ compounds). Thus, each PS-SCL can be screened on several microtiter plates instead of hundreds or thousands, reducing workload and cost by several orders of magnitude. The starting concentration that is used to screen a library depends on the individual assay. In the context of this article, we focus on CD4+ T cells and use of antigen-specific proliferation measured by [³H]thymidine incorporation as a readout. In contrast to receptor binding assays or ELISA, cell-based assays are sensitive to high concentrations of compounds, and screening is typically performed at 0.1 mg/ml. For this reason it is convenient to aliquot very complex libraries at the highest concentration possible (10 mg/ml). This ensures that individual compounds within each mixture are present at a detectable concentration.
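The library sizes and plate counts quoted in Sections 2.2 and 2.3 follow from simple counting, which the following minimal Python sketch makes explicit. The function names `ps_scl_size` and `plates_needed` are illustrative helpers introduced here, not part of any published software, and the plate count ignores control wells (an assumption).

```python
def ps_scl_size(n_amino_acids, n_positions):
    """Counts for a positional scanning library (illustrative helper)."""
    return {
        "individual_peptides": n_amino_acids ** n_positions,
        "mixtures": n_amino_acids * n_positions,
        "peptides_per_mixture": n_amino_acids ** (n_positions - 1),
    }


def plates_needed(n_samples, replicates=2, wells_per_plate=96):
    """Plates required for the samples alone; control wells are not counted."""
    wells = n_samples * replicates
    return -(-wells // wells_per_plate)  # integer ceiling division


tripeptide = ps_scl_size(4, 3)     # the Fig. 2 example
hexapeptide = ps_scl_size(20, 6)   # 20 amino acids, 6 positions

print(tripeptide)                  # 64 peptides, 12 mixtures, 16 peptides per mixture
print(hexapeptide["individual_peptides"])       # 64000000, i.e. 6.4 x 10^7
print(plates_needed(hexapeptide["mixtures"]))   # 3 plates for 120 mixtures in duplicate
print(plates_needed(50_000_000))                # 1041667, on the order of 10^6 plates
```

Three plates hold the 240 sample wells of a hexapeptide PS-SCL; the figure of four plates given in the text presumably leaves room for controls, which this sketch does not model.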

One aspect of screening PS-SCL is that the activity of a mixture with a given aa in the defined position can be due to one or more families of compounds having that same aa at its respective position. When the combinations of the most active aa are synthesized and tested as individual compounds, it becomes clear if the activities of the mixtures between positions are connected. Thus, the activities of mixtures in the library are due to the activities of individual compounds, and there are several strategies that can be used for their deconvolution (see below).
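The tripeptide example of the positional scanning concept can be simulated in a few lines of Python. This is a sketch under stated assumptions: the article names only A, R, and T, so the fourth residue (G) is chosen here for illustration, and "activity" is reduced to a yes/no test of whether a mixture contains an active peptide.

```python
from itertools import product

AMINO_ACIDS = "ARTG"  # A, R, T come from the article's example; G is an assumed fourth residue
LENGTH = 3


def mixture(position, residue):
    """Peptides of one PS-SCL mixture: `residue` is defined at `position`, X elsewhere."""
    pools = [residue if i == position else AMINO_ACIDS for i in range(LENGTH)]
    return ["".join(p) for p in product(*pools)]


def screen(active_peptides):
    """Per position, list the defined residues whose mixture contains an active peptide."""
    return [
        [aa for aa in AMINO_ACIDS if any(p in active_peptides for p in mixture(pos, aa))]
        for pos in range(LENGTH)
    ]


def candidates(hits):
    """Combine the active residues of each position into individual peptides to test."""
    return ["".join(c) for c in product(*hits)]


single = screen({"ART"})
print(single)              # [['A'], ['R'], ['T']]
print(candidates(single))  # ['ART']

double = screen({"ART", "GTA"})
print(len(candidates(double)))  # 8 candidates, only 2 of which are active
```

The second call illustrates the point made above: when two unrelated peptide families are active, the positional data alone cannot tell which residues belong together, so the combined candidates have to be synthesized and tested individually.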

2.4. PS-SCL deconvolution

To identify the most active individual compounds from the PS-SCL, compounds that incorporate the most active aa at each position are synthesized and subsequently tested. For practical purposes, the number of aa selected from each position that will be used to synthesize the individual peptides should be minimized. For example, if two aa were selected from each position of a hexapeptide PS-SCL, one would need to synthesize 64 peptides (2⁶). Hence, the decision on which active aa are chosen for each position to synthesize individual compounds depends on their relative activity and the number of compounds that can reasonably be synthesized.

Successful deconvolution of active individual compounds from mixture-based libraries depends on reproducible screening data and clear dose–response activities of the most active mixtures. Assay optimization is crucial before screening a library. In most cases, dose–response curves can be determined for the most active mixtures, and activities based on calculated values of half-maximal stimulation are used to select the aa that will be included in the synthesis of individual compounds. Often aa of similar chemical character yield similar activities at a given position. This may indicate that a number of analogs of the same compound are responsible for the observed activity. Similar aa can be excluded from selection to reduce the number of final compounds that need to be synthesized. In some library screens, the distinction between active and inactive mixtures is more difficult to determine. In this case, dose–response determinations may not be possible, and as an alternative it may be useful to compare the activity of a given mixture relative to the average mixture activity at that diversity position. As noted previously, the analysis of data from complex mixtures is no different from the analysis of data obtained using individual compounds. One simply uses the relative activity of the PS-SCL for one specific position. In other words, distinguishing between active and inactive samples is independent of the complexity of these samples.

2.5. Assay optimization

PS-SCL have been used in a number of biological assays. An understanding of the various assay parameters, such as signal-to-noise ratio, variability, and sensitivity, is required for the successful identification of active compounds from the library. These parameters are inherent to the assay and are not influenced by the fact that complex mixtures are being tested instead of individual compounds. The most important parameter to control for an assay system is interassay variability. When screening complex mixtures, it is critical that the assay variability is known. For an assay with low variability, one has confidence that a 5- to 10-fold difference in observed activity between mixtures is significant. For an assay with high variability, the variation between replicates for a given mixture may obscure real differences in activities from other mixtures. The use of repeated experiments and averaged data ensures accurate deconvolution (i.e., selection of truly active mixtures), which results in the identification of individual compounds with significant activity.
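The selection step of PS-SCL deconvolution (Section 2.4) trades off activity against the number of peptides to synthesize. The sketch below illustrates one way to pick residues relative to the average mixture activity at a position and to count the resulting peptides. The threshold `fold_over_mean`, the cap `max_per_position`, and the toy activity values are all assumptions made for illustration; the article does not prescribe specific cutoffs.

```python
from math import prod


def select_residues(position_activity, fold_over_mean=1.5, max_per_position=2):
    """Pick, per position, the residues whose mixture activity exceeds a multiple
    of the average mixture activity at that position (cutoffs are assumptions)."""
    selected = []
    for activities in position_activity:  # one dict {residue: activity} per position
        average = sum(activities.values()) / len(activities)
        ranked = sorted(activities, key=activities.get, reverse=True)
        chosen = [aa for aa in ranked if activities[aa] >= fold_over_mean * average]
        selected.append(chosen[:max_per_position])
    return selected


def peptides_to_synthesize(selected):
    """Number of individual peptides obtained by combining the selected residues."""
    return prod(len(residues) for residues in selected)


# Toy data: one position with two clearly active mixtures (hypothetical values).
toy_position = {"A": 10.0, "V": 8.0, "G": 1.0, "K": 1.0}
print(select_residues([toy_position]))           # [['A', 'V']]

# Two residues at each of six positions reproduces the 2^6 = 64 peptides of the text.
print(peptides_to_synthesize([["A", "V"]] * 6))  # 64
```

Because the peptide count is the product of the selections per position, dropping even one residue at each position reduces the synthesis effort sharply.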

3. Use of combinatorial peptide libraries for T-cell epitope mapping

Combinatorial libraries composed of peptides have proven useful for the identification of sequences recognized by T-cell clones and for mapping T-cell specificity [3,5,11,13]. In addition to defining native ligand sequences, the specificity profiles that result from screening PS-SCL have led to the identification of cross-reactive sequences that do not necessarily correspond to sequences in proteins, or that were found through the analysis of sequence databases. In this respect, PS-SCL, as well as peptides in known proteins, may help to identify relevant epitopes that can lead to the design of novel vaccines for infectious diseases and cancer as well as the identification of target autoantigens involved in autoimmune diseases [3,4,14].

The strategy used to identify ligands for clonotypic TCR of known and unknown specificity combines data acquisition with PS-SCL with subsequent database analysis. As discussed in detail in Hemmer et al. [15], the assay data, i.e., which aa are active in which diversity position, can be used either to search for peptides with sequences that represent the most active positions or to search on the basis of a quantitative scoring matrix [6]. The latter technique is more powerful and relies on the assumption that the stimulatory potency of a peptide results from the additive stimulatory value of the aa in that peptide [15]. We have previously demonstrated that this assumption is likely correct; however, we currently do not know whether it can be generalized to every TCR or is valid only for a subset of T-cell clones and their TCR. The use of iterative deconvolution strategies with peptide libraries is not covered here, but is described elsewhere [1,16].

The following protocol summarizes the use of PS-SCL to map epitopes of CD4+ T cells. The same protocol can be applied with few variations to map epitopes of CD8+ T cells [14]. The main difference between studying the specificity of CD4+ and CD8+ T cells with PS-SCL is the assay readout. For CD4+ T cells, activation is assessed by proliferation assays that measure the incorporation of [³H]thymidine into newly synthesized DNA, whereas for CD8+ T cells the activation is usually assessed by ⁵¹Cr-release assays or other assays that measure cytotoxicity. Also, quantitative assessments of cytokine production (e.g., interferon-γ (IFN-γ) or granulocyte–macrophage colony-stimulating factor (GM-CSF)) by standard sandwich ELISA can be used to test T-cell activation. As already pointed out, we use CD4+ T cells as an example here. Examples for CD8+ T cells have been described in the literature [5,11,14,17].

3.1. Equipment and reagents

- Decapeptide PS-SCL
- T-cell clone
- Antigen-presenting cells
- Appropriate growth medium for cells
- 96-well U-bottom microtiter plates (Costar)
- Cell harvester
- Liquid scintillation counter
- [³H]thymidine, Na₂⁵¹CrO₄
- ELISA kit to measure IFN-γ production (PharMingen)

3.2. Method

3.2.1. Generating T-cell clones

TCC can be isolated from peripheral blood as well as from other organ compartments by limiting dilution using specific proteins or peptides, or using mitogens (e.g., PHA), or polyclonal T-cell receptor-targeted stimulation strategies (e.g., a combination of anti-CD3 and anti-CD28 monoclonal antibodies). T-cell limiting dilution can be accomplished by plating T cells in 96-well plates at 3 and 0.3 cells/well in the presence of 2 × 10⁵ irradiated PBMCs per well. As described in detail by Lefkovits, Waldmann, and others [18–20], the likelihood of expanding a single clone can be estimated from the cloning efficiency and other parameters. Growing T cells are expanded by discontinuous stimulation (cycles of stimulation and intervening rest) to reach the cell numbers that are needed to screen the PS-SCL. Approximately 8 million cells are needed to screen a decapeptide library, i.e., 10 × 20 mixtures in duplicate (400 wells with 2 × 10⁴ T cells/well).

3.3. Screening PS-SCL for biological activity

The response of a CD4+ TCC to PS-SCL is assessed in proliferation assays. T cells (20,000 cells/well), autologous irradiated PBMC (100,000 cells/well), and the different complex mixtures (100–200 µg/ml) are seeded in duplicate using 200 µl of tissue culture medium in 96-well plates (Fig. 3). A negative control, consisting of T cells and autologous irradiated PBMC without complex mixtures, and a positive control, consisting of T cells, autologous irradiated PBMC, and antigen, if the specificity of the TCC is known, or mitogen or TCR stimulator, if the specificity is unknown, are included. A decapeptide PS-SCL, composed of 200 mixtures tested in duplicate, fits on five 96-well plates. After 48 h incubation at 37 °C the response of the TCC to PS-SCL is assessed. One microcurie of [³H]thymidine is added to each well for the last 16 h of incubation. Cells are then harvested, and the incorporated radioactivity is measured by scintillation counting. The results should be confirmed in at least two repeat experiments, since the incorporated radioactivity and consequently the stimulation indices can vary depending on the state of activation of the T cells. However, the rank order of the most active mixtures should be consistent even if the absolute magnitude of the stimulation indices varies, e.g., due to high background incorporation.

Fig. 3. Screening of a PS-SCL with a CD4+ TCC for biological activity. The T-cell activity is tested with the PS-SCL by using antigen-specific proliferation as readout. T cells are seeded in duplicate in a 96-well plate in the presence of autologous irradiated PBMC and the different complex mixtures. For each position 40 wells are necessary (20 amino acids in duplicate). In one 96-well plate it is possible to test two positions (40 mixtures). A total of 5 plates are therefore required to test the entire set of decamer PS-SCL. After 48 h the incorporation of thymidine is measured by liquid scintillation counting. The negative control consists of T cells and PBMCs without mixtures, and either specific antigen or, if not known, a mitogen such as phytohemagglutinin (PHA) can serve as positive control.

3.3.1. Employing the PS-SCL screening data for protein database analysis to identify T-cell ligands

As indicated in Fig. 1, the next step is to identify individual stimulatory compounds by testing the PS-SCL. If the respective CD4+ T-cell clone responds to the PS-SCL, and the proliferative assay yields a pattern that indicates activity for specific amino acids in individual diversity positions [1,6], the most straightforward way to arrive at individual peptides is to align the most active aa for each position in a peptide, i.e., deduce a peptide sequence from these data. Using this approach, we previously demonstrated for an autoreactive CD4+ T-cell clone that highly active peptides can be identified [15]. It was further demonstrated that this approach yields peptides that are potent agonists for the respective clone [3]. In the case of a myelin basic protein (MBP)-specific T-cell clone, peptides were identified that stimulated the clone at an EC50 (half-maximal stimulatory concentration) several orders of magnitude lower than that of the MBP peptide [15].

Deducing such a potent stimulatory peptide does not necessarily lead to a peptide that exists in any known protein, and therefore, the above approach should be combined with database analysis to identify potentially stimulatory ligands that are derived from, e.g., humans, bacteria, or viruses. For this purpose, a single sequence can be searched for in public databases such as GenBank and SwissProt, or, alternatively, a search motif can be formulated; i.e., the most active aa for those positions that are clearly informative are compiled [6]. An example would be: [A, V]-[M, I]-[F, Y]-[K, R]-[A]-[V, I]-[E, D]-[P]-[N]-[I, L, M]. If individual PS-SCL positions are not clearly informative, they can be replaced by blanks in the search as outlined in Zhao et al. [6]. This approach is overall less efficient than the scoring matrix approach that is described below; however, it also leads to epitope candidates.

3.4. Designing a scoring matrix for advanced database analysis

In complex PS-SCL, i.e., mixtures containing large numbers of compounds at individually low concentrations, the activity of a mixture most likely is elicited by more than a single compound in the mixture. As an example, individual peptides in decamer PS-SCL are present at femtomolar concentrations, whereas, based on current experience with the sensitivity of CD4+ T cells to antigen, nano- to micromolar concentrations are usually required. This implies that a biologically active mixture of a peptide PS-SCL with a specific aa in one of the diversity positions is active because several peptides with this specific aa in the particular diversity position elicit biological activity, e.g., stimulate a CD4+ T cell. It further indicates that we obtain only indirect information about the active peptides that are responsible for the activity of mixtures having an amino acid defined. Furthermore, we have demonstrated that the stimulatory potency of a peptide can be approximated by adding the relative stimulation of each aa in the diversity positions to a score value [15]. Based on this assumption, a systematic strategy was developed that employs a scoring matrix and subsequent database analysis, during which essentially the stimulatory score for each decamer peptide in the respective database is calculated in one-aa increments [4,6]. This strategy is described in more detail below.

Fig. 4. (a) Score matrix from a representative proliferative experiment of a TCC to a decamer PS-SCL. Each number represents the S index (cpm in the presence of the mixture/cpm in the absence of the mixture) of each of the 200 mixtures of a decapeptide PS-SCL. In the matrix the columns represent positions, and the rows, the 20 amino acids used in PS-SCL libraries. According to our model of independent contribution of each amino acid to recognition, the stimulatory value of any decapeptide can be determined by summing the values of the individual amino acids in the score matrix. The example shown is a decamer peptide derived from influenza virus that was used to establish the TCC. Numbers corresponding to the amino acid sequence of the peptide are boxed in the score matrix and their sum represents the peptide score. Maximum and minimum scores for this particular matrix are also shown. (b) The scoring matrix can be used to score contiguous decamer peptides contained in all known protein sequences in public databases to find stimulatory peptides for a given TCC. The example shows the decamer scoring moved in one-amino-acid increments along the sequence of influenza virus hemagglutinin (HA) recognized by the TCC.

A positional scoring matrix is generated by assigning a value for the stimulatory potential to each of the 20 defined amino acids in each position. In the matrix the columns represent positions, and the rows, the 20 amino acids used in PS-SCL libraries. For the proliferative response of a CD4+ TCC to PS-SCL, the scoring matrix is created using the score called stimulation index (S index):

$$\text{S index}_{i,j} = \frac{M_{ij}}{B}$$

where $M_{ij}$ is the mean proliferation response to the $(i,j)$th mixture in cpm, $B$ is the mean cpm of the background (no mixtures), the $(i,j)$th mixture contains peptides with amino acid $i$ in peptide position $j$, and cpm stands for counts per minute. We generate the score index in each position by using the mean of duplicate cpm values of T-cell proliferation in the presence of mixtures from the PS-SCL divided by the mean of values in the absence of mixtures from the PS-SCL (negative control) (Fig. 4a). For cytotoxic responses the score of each amino acid in each position is calculated using the percentage lysis value.
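As a minimal sketch of how the S index could be computed from raw counts, the Python snippet below divides the mean cpm of each mixture by the mean background cpm. The data layout (a dictionary keyed by amino acid and zero-based position) and the cpm values are hypothetical and chosen only for illustration.

```python
from statistics import mean


def s_index_matrix(mixture_cpm, background_cpm):
    """S index per mixture: mean cpm with the mixture / mean cpm without mixtures.

    mixture_cpm maps (amino_acid, position) to the replicate cpm values of that
    mixture; positions are zero-based here (a convention of this sketch).
    """
    background = mean(background_cpm)
    return {key: mean(replicates) / background for key, replicates in mixture_cpm.items()}


# Hypothetical duplicate counts for two mixtures and the negative control.
cpm = {("A", 0): [2000, 4000], ("R", 0): [900, 1100]}
s_index = s_index_matrix(cpm, background_cpm=[500, 1500])

print(s_index[("A", 0)])  # 3.0
print(s_index[("R", 0)])  # 1.0
```

An S index close to 1 means the mixture did not raise proliferation above background; for a decapeptide PS-SCL the dictionary would hold 200 entries (20 amino acids × 10 positions).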
For some TCC, the response includes substantial variability. To incorporate this variability into the index, Zhao et al. [6] have defined a score matrix based on the Z index:

\[ Z_{i,j} = \frac{M_{ij} - B}{\left[ (\mathrm{std}\,M_{ij})^{2} + (\mathrm{std}\,B)^{2} \right]^{1/2}} \]

where $M_{ij}$ and $B$ are defined as above and $\mathrm{std}\,M_{ij}$ and $\mathrm{std}\,B$ denote smoothed estimates of the standard deviations. At a practical level, despite the error-adjusting capability of this index, no difference in performance was observed in the predictions for the TCC investigated, and to date it is not clear which matrix is optimal.

3.4.1. Deducing optimal peptides for the TCC by combining the defined amino acids of the most active mixtures at each position of the PS-SCL

As mentioned before, we recently demonstrated that each amino acid within a peptide contributes to recognition almost independently and in an additive fashion, so that amino acid substitutions that abrogate recognition can be compensated by highly stimulatory substitutions in other positions [15]. Thus, the overall stimulatory value of a peptide results from the combination of the positive and negative effects of each of its amino acids. Under these assumptions, the predicted stimulatory value of any peptide can be determined by summing the values of the individual amino acids in the score matrix (Fig. 4b).

3.4.2. Searching sequence databases for potential crossreactive ligands based on library results

The score matrix is then used to search for predicted stimulatory peptides in the public protein databases. A program has been designed to scan and score all the decapeptides contained in the GenPept database and to identify the sequences with the highest scores. We recommend the GenPept database (ftp://ftp.ncifcrf.gov/pub/genpept) over SwissProt (http://www.expasy.ch/prot) because the former is substantially larger. The program currently used by the authors to scan and score peptides with a score matrix is not yet publicly available. Until this is the case, please contact one of the authors or Richard Simon (e-mail: [email protected]) for specific information. These searches identify a large number of predicted stimulatory peptides from viral, bacterial, and human proteins. The next step is to demonstrate the actual stimulatory potential of the predicted peptides.
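The additive scoring of Section 3.4.1 and the one-aa-increment database scan of Section 3.4.2 can be illustrated as follows. This is a minimal sketch and not the authors' program, which is not publicly available. The function names, the (amino acid, position) dictionary layout, the default value of 0.0 for missing matrix entries, and the toy three-residue matrix are all assumptions made for illustration.

```python
def score_peptide(peptide, matrix, default=0.0):
    """Sum the matrix value of each residue at its (1-based) position."""
    return sum(matrix.get((aa, pos), default)
               for pos, aa in enumerate(peptide, start=1))


def scan_protein(sequence, matrix, length=10):
    """Score every window of `length` residues, shifting by one residue."""
    results = []
    for start in range(len(sequence) - length + 1):
        window = sequence[start:start + length]
        results.append((start + 1, window, score_peptide(window, matrix)))
    return results


def top_hits(proteins, matrix, length=10, n=5):
    """Rank all windows from all proteins by predicted score, best first."""
    hits = []
    for name, sequence in proteins.items():
        for start, window, score in scan_protein(sequence, matrix, length):
            hits.append((score, name, start, window))
    return sorted(hits, reverse=True)[:n]


# Toy matrix for tripeptides; a real matrix has 20 x 10 entries.
toy_matrix = {("A", 1): 2.0, ("A", 2): 0.5, ("C", 2): 3.0, ("D", 3): 1.0}
ranked = top_hits({"toy_protein": "AACD"}, toy_matrix, length=3)
```

Here `ranked` lists the window ACD (starting at residue 2) ahead of AAC, because its summed score is higher. Ranking by score in this way is what guides the choice of peptides for synthesis.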

Fig. 5. T-cell receptors with different requirements for peptide specificity can be divided into two groups: (a) receptors with highly specific recognition of the peptide that recognize only a very limited number of peptides and (b) receptors with more degenerate recognition that recognize large numbers of peptides.


3.4.3. Synthesizing individual peptides and confirming library results by testing TCC responses to these individual peptides

A feasible number (hundreds) of candidate peptides are synthesized and tested for their stimulatory activity. The score serves as a practical guide to identify potentially stimulatory T-cell epitopes when the number of candidate peptides is very large. Ranking the predicted stimulatory peptides according to their score assists in selecting which of the candidate peptides should be synthesized and tested with the TCC. In a previous study, Zhao et al. [6] assessed the predictive power of PS-SCL for identifying specific peptides using a TCC of known specificity. The comparison between the stimulatory potential predicted by the scoring matrix and the actual measurement of TCC stimulation showed that the positive predictive value was 93.5% and the negative predictive value 80.8%. The sensitivity for predictions with the clone used was 92% and the specificity was 84%. Although the predictions based on this method are very accurate, a few high-scoring peptides were not stimulatory. One reason could be that the influence of aa that are not permitted in certain positions is not adequately represented in the scoring matrix; i.e., it contains no negative values.
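For readers who want to compute the same four performance measures for their own clone, the standard definitions can be written as below. This is a sketch only: the function name and the (predicted, observed) pair layout are assumptions, and the example outcomes are hypothetical. They are not the data of Zhao et al. [6] and are not meant to reproduce the percentages quoted above.

```python
def predictive_metrics(pairs):
    """Compute sensitivity, specificity, PPV, and NPV.

    pairs is an iterable of (predicted_stimulatory, observed_stimulatory)
    booleans, one pair per synthesized peptide.
    """
    pairs = list(pairs)
    tp = sum(1 for pred, obs in pairs if pred and obs)
    fp = sum(1 for pred, obs in pairs if pred and not obs)
    tn = sum(1 for pred, obs in pairs if not pred and not obs)
    fn = sum(1 for pred, obs in pairs if not pred and obs)
    return {
        "sensitivity": tp / (tp + fn),  # stimulatory peptides predicted as such
        "specificity": tn / (tn + fp),  # non-stimulatory peptides predicted as such
        "ppv": tp / (tp + fp),          # predicted stimulatory that truly stimulate
        "npv": tn / (tn + fn),          # predicted non-stimulatory that truly do not
    }


# Hypothetical outcome: 40 true positives, 10 false positives,
# 45 true negatives, 5 false negatives.
example = ([(True, True)] * 40 + [(True, False)] * 10
           + [(False, False)] * 45 + [(False, True)] * 5)
metrics = predictive_metrics(example)
```

A peptide counts as "predicted stimulatory" only relative to a score cutoff chosen by the investigator. The cutoff is not fixed by the method and changes all four values.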

Fig. 6. (a) Proliferative response of a TCC with unknown specificity to 200 mixtures of a decapeptide PS-SCL in which each position has one defined amino acid [20 for each of the 10 positions (P1 to P10)]. Vertical axes: proliferation, as counts per minute (cpm) induced by each mixture of the PS-SCL. Horizontal axes: single-letter amino acid code. Some amino acids of the peptide appear as a double peak when the peptide mixtures with fixed amino acids are tested. The amino acid isoleucine induces the highest response when it is fixed in positions 1 and 2; arginine when it is fixed in positions 3 and 4; proline when it is fixed in positions 5 and 6; and tryptophan when it is fixed in positions 7, 8, and also 9. (b) This image represents the shift phenomenon in peptide and MHC alignment. Top: Aligned binding. Because MHC class II molecules have open-ended binding grooves, a shifted binding configuration may occur according to the affinity pattern of amino acids for each MHC position.


3.4.4. Identifying biologically relevant candidate peptides from library results and database analysis

The strategy described in this article allows us to find peptides from every known source that have stimulatory activity for the clone that has been tested with PS-SCL. This raises the problem of how to identify, from this wealth of data, which peptides may be biologically relevant. One strategy that can be used to identify proteins involved in autoimmune diseases is to analyze which of the predicted peptides stem from proteins that are overexpressed in autoimmune tissue compared with normal tissue, using cDNA microarray analysis or other techniques. Other approaches rely on hypotheses that have previously been generated, e.g., in animal models, and focus on examining first those peptides that stem from candidate target antigens in the respective disorder.

3.5. Hints for troubleshooting

Not all clones respond to the peptide libraries. There is probably a wide spectrum of T-cell receptor affinities for their respective MHC/peptide ligand, and autoreactive TCC in particular should usually express low- or intermediate-affinity TCR for the autoantigen based on current concepts about positive and negative selection. For some TCR with a narrow range of specificities or high specificity, only a very limited number of structurally defined peptides can be recognized and elicit downstream signaling (Fig. 5a). For other TCC with more degenerate (less specific) recognition, a large number of different peptides may be recognized (Fig. 5b). The high complexity of a PS-SCL [for example, for a decapeptide, each mixture contains $20^{9}$ different peptides ($19^{9}$ in PS-SCL where C is omitted from the randomized positions)] accounts for the fact that each peptide is represented at very low concentrations (femtomolar).
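The arithmetic behind these complexity figures is easy to check. The short sketch below computes the number of peptides in one mixture and the share of the mixture contributed by any single peptide. The function names and parameters are illustrative assumptions, and equal representation of all peptides in a mixture is assumed.

```python
def mixture_complexity(peptide_length=10, alphabet_size=20, defined_positions=1):
    """Number of distinct peptides in one positional scanning mixture."""
    randomized_positions = peptide_length - defined_positions
    return alphabet_size ** randomized_positions


def single_peptide_fraction(peptide_length=10, alphabet_size=20, defined_positions=1):
    """Share of the mixture made up by one peptide (equal representation assumed)."""
    return 1 / mixture_complexity(peptide_length, alphabet_size, defined_positions)


full = mixture_complexity()                     # 20**9 = 512,000,000,000
no_cys = mixture_complexity(alphabet_size=19)   # 19**9 = 322,687,697,779
fraction = single_peptide_fraction()            # about 2 x 10**-12 of the mixture
```

Because the complexity is a power of the alphabet size, every additional randomized position multiplies the number of peptides by 20 (or 19 without C) and divides the share of each individual peptide by the same factor.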
It is therefore possible that TCC with degenerate recognition are able to respond to the peptide libraries, whereas TCC with highly specific recognition of the peptide do not respond because their nominal antigen(s) is present at too low a concentration in the PS-SCL. In such cases, an attempt can be made to reduce the complexity of the libraries by assigning diversity positions only to the core of the epitope and attaching fixed amino acids (e.g., glycine or alanine) at either end. The elimination of each diversity position in the PS-SCL reduces the complexity of the library by a factor of 20 and thereby increases the relative concentration of each peptide in the mixture by a factor of 20. Reducing complexity by fixing MHC anchor positions is another strategy, but for neither approach is it currently clear whether it will lead to better sensitivity and achieve the desired goal. Furthermore, each of these manipulations compromises the original positional scanning concept and may introduce constraints that render assay interpretation more difficult.

3.6. Shifted alignments of the peptide mixtures

Our previous experiments have shown that for some TCC certain aa in the PS-SCL occur as multiple consecutive peaks, e.g., A occurs in two to three adjacent positions, such as positions 4–6, when the peptide mixtures are tested (Fig. 6a). This observation could indicate that the complex peptide mixtures slide by one position in the open MHC class II binding groove; however, there are other potential explanations as well. We are currently trying to devise strategies that allow us to better understand the underlying rules of this phenomenon. Fig. 6b shows how misalignment of the peptides by a single position within the MHC binding groove would influence TCR recognition.
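One simple way to flag candidate shifted alignments in a score matrix is to look for the same amino acid being the top-scoring residue in adjacent positions. The sketch below is a hypothetical heuristic offered for illustration only; it is not a method described by the authors. The function names and the (amino acid, position) matrix layout are assumptions, and the example pattern follows the description in the legend of Fig. 6a, with position 10 left undefined.

```python
def top_residue_per_position(matrix, n_positions=10):
    """Return the highest-scoring amino acid for each position (None if absent)."""
    tops = []
    for pos in range(1, n_positions + 1):
        column = {aa: value for (aa, p), value in matrix.items() if p == pos}
        tops.append(max(column, key=column.get) if column else None)
    return tops


def adjacent_runs(tops):
    """Return (amino_acid, first_position, last_position) for runs of length >= 2."""
    runs = []
    start = 0
    for i in range(1, len(tops) + 1):
        # Close the current run at the end of the list or when the residue changes.
        if i == len(tops) or tops[i] != tops[start]:
            if tops[start] is not None and i - start >= 2:
                runs.append((tops[start], start + 1, i))
            start = i
    return runs


# Top residues as described for Fig. 6a (position 10 not specified there).
fig6a_tops = ["I", "I", "R", "R", "P", "P", "W", "W", "W", None]
runs = adjacent_runs(fig6a_tops)
```

For this pattern, `runs` reports I at positions 1–2, R at 3–4, P at 5–6, and W at 7–9. Such a flag only points to mixtures worth a closer look. It cannot distinguish a register shift from other explanations.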

References

[1] C. Pinilla, R. Martin, B. Gran, J.R. Appel, C. Boggiano, D. Wilson, R.A. Houghten, Curr. Opin. Immunol. 11 (1999) 193–202.
[2] H.S. Hiemstra, P.A. van Veelen, N.C. Schloot, A. Geluk, K.E. van Meijgaarden, S.J. Willemen, J.A. Leunissen, W.E. Benckhuijsen, R. Amons, R.R. de Vries, B.O. Roep, T.H. Ottenhoff, J.W. Drijfhout, J. Immunol. 161 (1998) 4078–4082.
[3] B. Hemmer, B. Fleckenstein, M. Vergelli, G. Jung, H.F. McFarland, R. Martin, K.-H. Wiesmüller, J. Exp. Med. 185 (1997) 1651–1659.
[4] B. Hemmer, B. Gran, Y. Zhao, A. Marques, C. Pinilla, J. Pascal, A. Tzou, T. Kondo, I. Cortese, B. Bielekova, S. Straus, H.F. McFarland, R. Houghten, R. Simon, R. Martin, Nat. Med. 5 (1999) 1375–1382.
[5] K. Udaka, K.-H. Wiesmüller, S. Kienle, G. Jung, P. Walden, J. Immunol. 157 (1996) 670–678.
[6] Y. Zhao, B. Gran, C. Pinilla, S. Markovic-Plese, B. Hemmer, A. Tzou, L. Ward Whitney, W.E. Biddison, R. Martin, R. Simon, J. Immunol. 167 (2001) 2130–2141.
[7] R.A. Houghten, C. Pinilla, J.R. Appel, S.E. Blondelle, C.T. Dooley, J. Eichler, A. Nefzi, J.M. Ostresh, J. Med. Chem. 42 (1999) 3743–3778.
[8] R.A. Houghten, Proc. Natl. Acad. Sci. USA 82 (1985) 5131–5135.
[9] R.A. Houghten, C. Pinilla, S.E. Blondelle, J.R. Appel, C.T. Dooley, J.H. Cuervo, Nature 354 (1991) 84–86.
[10] C. Pinilla, J.R. Appel, P. Blanc, R.A. Houghten, BioTechniques 13 (1992) 901–905.
[11] E. Borras, R. Martin, V. Judkowski, J. Shukaliak, Y. Zhao, V. Rubio-Godoy, D. Valmori, D. Wilson, R. Simon, R. Houghten, C. Pinilla, J. Immunol. Methods 267 (2002) 79–97.
[12] C. Pinilla, J.R. Appel, S.E. Blondelle, C.T. Dooley, J. Eichler, J.M. Ostresh, R.A. Houghten, Drug Dev. Res. 33 (1994) 133–145.
[13] B. Hemmer, M. Vergelli, C. Pinilla, R. Houghten, R. Martin, Immunol. Today 19 (1998) 163–168.
[14] C. Pinilla, V. Rubio-Godoy, V. Dutoit, P. Guillaume, R. Simon, Y. Zhao, R.A. Houghten, J.C. Cerottini, P. Romero, D. Valmori, Cancer Res. 61 (2001) 5153–5160.
[15] B. Hemmer, C. Pinilla, B. Gran, M. Vergelli, N. Ling, P. Conlon, H.F. McFarland, R. Houghten, R. Martin, J. Immunol. 164 (2000) 861–871.
[16] J. Blake, J.V. Johnston, K.E. Hellström, H. Marquardt, L. Chen, J. Exp. Med. 184 (1996) 121–130.
[17] P. Walden, Curr. Opin. Immunol. 8 (1996) 68–74.
[18] H. Waldmann, I. Lefkovits, J. Quintans, Immunology 28 (1975) 1135–1148.
[19] H. Waldmann, H. Pope, I. Lefkovits, Immunology 31 (1976) 343–352.
[20] A. Moretta, G. Pantaleo, L. Moretta, J.C. Cerottini, M.C. Mingari, J. Exp. Med. 157 (1983) 743–754.