Exploring high-affinity binding properties of octamer peptides by principal component analysis of tetramer peptides


Journal of Bioscience and Bioengineering VOL. xx No. xx, 1–9, 2016 www.elsevier.com/locate/jbiosc

Akiko Kume,1 Shun Kawai,2 Ryuji Kato,2 Shinmei Iwata,1 Kazunori Shimizu,1 and Hiroyuki Honda1,*

1 Department of Biotechnology, Graduate School of Engineering, Nagoya University, Furo-cho, Chikusa-ku, Nagoya 464-8603, Japan
2 Department of Basic Medicinal Sciences, Graduate School of Pharmaceutical Sciences, Nagoya University, Furo-cho, Chikusa-ku, Nagoya 464-8601, Japan

Received 31 March 2016; accepted 17 August 2016
Available online xxx

To investigate the binding properties of a peptide sequence, we conducted principal component analysis (PCA) of the physicochemical features of a tetramer peptide library comprised of 512 peptides, and the variables were reduced to two principal components. We selected IL-2 and IgG as model proteins and the binding affinity to these proteins was assayed using the 512 peptides mentioned above. PCA of binding affinity data showed that 16 and 18 variables were suitable for localizing IL-2 and IgG high-affinity binding peptides, respectively, into a restricted region of the PCA plot. We then investigated whether the binding affinity of octamer peptide libraries could be predicted using the identified region in the tetramer PCA. The results show that octamer high-affinity binding peptides were also concentrated in the tetramer high-affinity binding region of both IL-2 and IgG. The average fluorescence intensity of high-affinity binding peptides was 3.3- and 2.1-fold higher than that of low-affinity binding peptides for IL-2 and IgG, respectively. We conclude that PCA may be used to identify octamer peptides with high- or low-affinity binding properties from data from a tetramer peptide library.

© 2016, The Society for Biotechnology, Japan. All rights reserved.

[Key words: Peptide; Binding analysis; Peptide array; Principal component analysis; Prediction]

* Corresponding author. Tel.: +81 52 789 3215; fax: +81 52 789 3214. E-mail address: [email protected] (H. Honda).

Peptides are among the most important biological molecules. They can bind to various intracellular molecules such as enzymes, receptors, cytokines, polysaccharides, and nucleotides, and thereby regulate complex biological mechanisms. A variety of peptides that bind specifically to target molecules have been described, and the information obtained may be applied to many biological research areas such as molecular drug targeting, bio-imaging, and diagnostics (1–11). Such functional peptides need to be rationally explored, and their variants should be designed with precision.

Recently, the construction of peptide libraries for screening and development of functional peptides has attracted much attention. For example, phage display (12) (a biological strategy), combinatorial split synthesis (13) (a chemical strategy), and beads display (14) (a chemical strategy) are the most popular and convenient methods of obtaining peptide libraries and are also powerful tools for the identification of new ligands. Such methods identify functional peptides based simply on positive screening (15–18). With these methods, however, rational analysis of the characteristics of functional peptides is difficult.

Libraries of longer peptides contain a larger number of peptides. Therefore, longer peptides have a higher chance of interacting with the target protein than shorter peptides. For example, the peptides binding to antibody have 13–16 residues (19) and the peptides binding to MHC class I have 9–10 residues (20). However, it is very difficult to exhaustively explore the functional properties of peptides from such large libraries. An octamer peptide library, for instance, consists of 25.6 billion different peptides. In contrast, a tetramer peptide library consists of only 160,000 peptides. Thus, exploring peptide libraries in an efficient and rational manner is crucial when screening for functional peptides among a large number of candidate molecules.

In the context of screening a peptide library efficiently, many researchers have demonstrated that the physicochemical properties of longer peptides differ depending on the target protein (21–26). Recently, we developed a screening method for peptides with binding affinity to bile acid and to the death receptor using both binding data from solid-phase peptide arrays and physicochemical characteristics from computational analysis (27,28). In the method described in these articles, we used random peptide libraries containing 2212 peptides for bile acid-binding peptides and 643 peptides for receptor-binding peptides. The larger the random library, the higher the number of high-affinity binding peptides we found. The minimum size requirements for screening of peptide libraries still needed to be determined. Thus, we proposed the use of a small, nonrandom, predesigned library for the screening of high-affinity binding peptides. In the predesigned library, the 20 amino acids were categorized into four groups according to the physicochemical properties of high/low hydrophobicity and positive/negative charge, and a tetramer peptide library (512 peptides, $4^4 \times 2$ versions) covering all four physicochemical properties was constructed (29). The size of this library was the minimum size for a tetramer library. Using the predesigned library, we found the physicochemical properties underlying the binding of peptides to the target protein, and succeeded in

1389-1723/$ – see front matter © 2016, The Society for Biotechnology, Japan. All rights reserved. http://dx.doi.org/10.1016/j.jbiosc.2016.08.005

Please cite this article in press as: Kume, A., et al., Exploring high-affinity binding properties of octamer peptides by principal component analysis of tetramer peptides, J. Biosci. Bioeng., (2016), http://dx.doi.org/10.1016/j.jbiosc.2016.08.005


developing a screening method for functional peptides that uses both binding affinity data and the physicochemical properties of peptides in an efficient and rational manner. However, our previously proposed screening method (29) was suited only to short tetramer peptides. In addition, if the peptide length were extended from tetramer to octamer in our proposed method, the size of the peptide library would need to increase 256-fold (from 256 = $4^4$ to 65,536 = $4^8$). Therefore, a new concept for predesigned peptide libraries of longer peptides was necessary.

Principal component analysis (PCA) is a mathematical algorithm that reduces the dimensionality of the data while retaining most of the variation, and it has recently been incorporated into biotechnological research such as genome-wide expression studies (30). The reduction is accomplished by identifying directions, called principal components (PCs), along which the variation in the data is maximal. PCA identifies new variables, such as PC1 or PC2, which are linear combinations of the original variables. Assuming that the physicochemical properties of peptides are used as original variables, the reduction of dimensionality obtained by PCA would result in two peptides with similar characteristics being placed at points close to each other in the PCA plot. Many different properties, such as polarity, charge, hydrophobicity, and molecular weight of each amino acid residue, and the average, maximum, and minimum values of each in the peptide sequence, can be used as original variables. It is precisely the multi-dimensionality of the properties of a peptide sequence that makes the analysis of peptide characteristics difficult. Indeed, analysis of peptides with high-affinity binding to a target protein should be performed by first selecting the variables relevant to peptide binding affinity, and then constructing a combinatory model of the relevant variables to predict the binding strength.
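As a rough illustration of the projection described above (a sketch, not the authors' R workflow), the following Python code standardizes a small descriptor table, extracts the first two PCs by power iteration on the correlation matrix, and reports the proportion of variance each one retains. The toy descriptor values and the function names `standardize`, `correlation_matrix`, `leading_eigenpairs`, and `pca_two` are assumptions made for this example only.

```python
import math

def standardize(rows):
    """Center each column and scale it to unit sample variance (columns must vary)."""
    n, m = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(m)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
           for j in range(m)]
    return [[(r[j] - means[j]) / sds[j] for j in range(m)] for r in rows]

def correlation_matrix(z):
    """Correlation matrix of already standardized rows."""
    n, m = len(z), len(z[0])
    return [[sum(r[i] * r[j] for r in z) / (n - 1) for j in range(m)]
            for i in range(m)]

def leading_eigenpairs(matrix, k=2, iterations=500):
    """Largest k eigenvalues/eigenvectors by power iteration with deflation."""
    m = len(matrix)
    a = [row[:] for row in matrix]
    pairs = []
    for _component in range(k):
        v = [float(i + 1) for i in range(m)]  # arbitrary non-zero start vector
        for _step in range(iterations):
            w = [sum(a[i][j] * v[j] for j in range(m)) for i in range(m)]
            norm = math.sqrt(sum(x * x for x in w))
            v = [x / norm for x in w]
        # Rayleigh quotient gives the eigenvalue for the converged unit vector.
        value = sum(v[i] * a[i][j] * v[j] for i in range(m) for j in range(m))
        pairs.append((value, v))
        # Remove the component just found so the next pass finds the following one.
        a = [[a[i][j] - value * v[i] * v[j] for j in range(m)] for i in range(m)]
    return pairs

def pca_two(rows):
    """Return (PC1, PC2) scores per row and the proportion of variance of each PC."""
    z = standardize(rows)
    pairs = leading_eigenpairs(correlation_matrix(z), k=2)
    scores = [[sum(r[j] * vec[j] for j in range(len(vec))) for _, vec in pairs]
              for r in z]
    # The trace of a correlation matrix equals the number of variables.
    proportions = [value / len(rows[0]) for value, _ in pairs]
    return scores, proportions

# Hypothetical table: 4 peptides x 3 descriptors (values invented for illustration).
toy = [[1.0, 2.0, 0.0], [2.0, 4.0, 1.0], [3.0, 6.0, 0.0], [4.0, 8.0, 1.0]]
scores, proportions = pca_two(toy)
print("proportion of variance:", proportions)
print("cumulative proportion:", sum(proportions))
```

In this toy table the second descriptor is an exact multiple of the first, so two PCs capture all of the variance; with many weakly related descriptors the cumulative proportion of the first two PCs would be lower.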
However, it is not possible to guarantee that the correct relevant variables will be selected from restricted data. In addition, selection of variables by supervised learning is more difficult when only a small sample of a very large original population is available. Dimensionality reduction by PCA with little loss of information is very attractive because two PCs alone allow a clear understanding of the localized characteristics of high-affinity binding peptides.

In the present study, the following methodology was employed. A PCA of peptide sequence properties was conducted on a predesigned short peptide library and only two PCs were retained. Next, the binding affinity of the peptide library to the target protein was investigated. After obtaining the area of the high-affinity binding peptides in a score plot of the two PCs, the area characteristics were used for searching the larger peptide library, and any resulting correlation between sequence and function was investigated. Specifically, we conducted PCA of a tetramer peptide library consisting of 512 peptides, and the variables corresponding to the physicochemical properties of peptides were reduced to two PCs. We selected IL-2 and IgG as model proteins and the binding affinity for each protein was assayed using a peptide array consisting of the 512 peptides. When the binding affinity data were laid over the PCA plots, PCAs with 16 and 18 variables were suitable for localizing IL-2 and IgG high-affinity binding peptides, respectively, into a restricted region of the PCA plot. We then tested whether the binding properties of an octamer peptide library to IL-2 and IgG could be predicted using the identified region from the PCA of the tetramer library. The binding affinities of octamer peptide libraries obtained using this method are discussed.

MATERIALS AND METHODS

Preparation of peptide array
The peptide arrays were prepared as described previously (29). A cellulose membrane (grade 542; Whatman, Maidstone, UK) was activated using β-alanine as the basal spacer. The activated 9-fluorenylmethoxycarbonyl amino acids (at 0.25 M) were then spotted using a peptide auto-spotter (ASP222, Intavis AG, Köln, Germany). The membrane was washed with N,N′-dimethylformamide after spotting, and deprotected with 20% piperidine. Additional washes were performed with N,N′-dimethylformamide, followed by washing with methanol. These steps were repeated at every elongation step. After elongation, the side-chain-protecting groups were removed by incubation in m-cresol:thioanisole:ethanedithiol:trifluoroacetic acid (1:6:3:40) for 3 h, and the membrane was washed with diethyl ether and methanol.

Preparation of peptide library
A tetramer peptide library consisting of 512 peptides covering the four important physicochemical properties for protein–protein interactions was obtained as previously described (29) and is shown in Table S1. An ordinary octamer peptide library was obtained from sequences of the extracellular domain of the human IL-2 receptor. The library consisted of 234 peptides designed as octamer peptides with two amino acid shifts (Table S2). The sequences of the human IL-2 receptor alpha (P01589) and beta (P14784) chains were obtained from Universal Protein Resource (http://www.uniprot.org/). A random octamer peptide library was designed by combining two peptides randomly selected from the tetramer peptide library. The random octamer library consisted of 640 peptides (Table S3).

Binding assay of peptide arrays to IL-2 or IgG
The binding assay of the peptide arrays to the target proteins was performed as described previously (29). After removing side-chain-protecting groups by incubation for 16 h, the peptide array membranes were washed several times with phosphate-buffered saline (PBS, pH 7.0) to remove traces of the reagent. The membranes were then soaked in 1% Block Ace Powder (DS Pharma Biomedical, Osaka, Japan) in PBS for 1 h to block nonspecific adsorption.
After washing three times with PBS containing 0.05% Tween-20 (T-PBS) for 5 min, the arrays were incubated with each of the IL-2 solutions in PBS at a final concentration of 130 nM for 2 h at 37 °C with rotation at 55 rpm using a small-size shaker (NR-3, Taitec Corporation, Saitama, Japan). Unbound cytokine was removed by washing the array with 0.01% T-PBS three times for 5 min. The peptide arrays were then incubated with primary rabbit antibodies against the target (ab9618 for IL-2, Abcam, Cambridge, UK) in PBS containing 0.25% Block Ace Powder for 2 h at 37 °C with rotation at 55 rpm. Unbound primary antibody was removed by washing the array three times for 5 min with 0.01% T-PBS. Next, the peptide arrays were incubated with secondary antibody (anti-mouse goat antibody) conjugated with Alexa 488 (ab150117, Abcam) in PBS containing 0.25% Block Ace Powder for 2 h at 37 °C with rotation at 55 rpm. Unbound secondary antibody was removed by washing the array three times for 10 min with 0.01% T-PBS. The fluorescence intensity of each peptide spot from the tetramer peptide library and from the ordinary octamer peptide library derived from the extracellular domains of the IL-2 receptor was determined using a Typhoon FLA-9500 (GE Healthcare Life Sciences, Buckinghamshire, UK). Fluorescence intensity from the random octamer peptide library was determined using an FLA-7000 (GE Healthcare Life Sciences) at 494/519 nm and ImageQuantTL (GE Healthcare Life Sciences). The amount of protein bound to a peptide spot was calculated from the fluorescence intensity as follows: (fluorescence intensity of protein bound to a peptide spot) = (fluorescence intensity of the peptide spot with cytokine) − (fluorescence intensity of the peptide spot without cytokine).

Principal component analysis
A principal component analysis (PCA) was performed using R (R Development Core Team, https://www.r-project.org/).
Principal components 1 and 2 (PC1 and PC2) were composed of physicochemical properties, which included isoelectric point (31), hydropathy index (32), polarity (31), side-chain contribution to protein stability (33), partition coefficient (Log P), molecular weight (34), and pK1 and pK2 (35) of the amino acid residues, together with the number of residues of each amino acid group in the peptide (29) (Table S4). To determine a suitable number of input variables for the tetramer peptide library, we performed several PCAs (numbered 1–16) with two components. The number of variables was reduced stepwise from 49 to 9 (Table S4). We first selected PCAs 10, 11, 14, and 16 as candidate variable sets based on the cumulative proportion of PC1 and PC2. We then calculated the correlation coefficient between fluorescence intensity and PC1 or PC2, and found that PCA 11 (16 variables) for IL-2 and PCA 10 (18 variables) for IgG were the best-fitting variable sets, with the highest coefficients. The proportions of variance and cumulative proportions were calculated from the PCA with R.

Analysis of high- and low-affinity binding peptide rules for IL-2 and IgG
We determined the high- and low-affinity binding rules for the tetramer, ordinary octamer, and random octamer peptide libraries from the PC scores of the tetramer peptide library. First, we selected the top 20 and bottom 20 peptides in fluorescence intensity from the tetramer peptide library. Second, we performed PCA of the top 20 and bottom 20 peptides using 16 variables for IL-2 and 18 variables for IgG, and calculated the averages and standard deviations (SDs) of the PC scores for the top 20 and bottom 20 peptides. Third, we determined the high-binder region as average ± SD of the top 20 peptides in a score plot of PC1 and PC2. In the same way, we determined the low-binder region as average ± SD of the bottom 20 peptides in a score plot of PC1 and PC2. The area outside these two regions was referred to as others.
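The three steps above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the authors' code: the record layout `(peptide, intensity, (PC1, PC2))`, the function names `extremes` and `region`, the toy values, and the use of the sample SD are all choices made for the example (the text does not say whether the sample or population SD was used).

```python
from statistics import mean, stdev

def extremes(records, n=20):
    """Return the n brightest and n dimmest records, ranked by fluorescence intensity."""
    ranked = sorted(records, key=lambda rec: rec[1], reverse=True)
    return ranked[:n], ranked[-n:]

def region(records):
    """Rectangle spanned by average ± SD of the PC1 and PC2 scores of the records."""
    bounds = []
    for axis in (0, 1):  # 0 = PC1, 1 = PC2
        values = [rec[2][axis] for rec in records]
        centre, spread = mean(values), stdev(values)  # sample SD: an assumption
        bounds.append((centre - spread, centre + spread))
    return tuple(bounds)

# Hypothetical records: (peptide id, fluorescence intensity, (PC1 score, PC2 score)).
toy = [
    ("p1", 10.0, (-2.0, 1.0)),
    ("p2", 9.0, (-4.0, 3.0)),
    ("p3", 5.0, (0.0, 0.0)),
    ("p4", 1.0, (2.0, -1.0)),
    ("p5", 0.5, (4.0, -3.0)),
]

top, bottom = extremes(toy, n=2)  # the study used n = 20 on 512 tetramers
high_region = region(top)
low_region = region(bottom)
print("high-binder region (PC1, PC2):", high_region)
print("low-binder region (PC1, PC2):", low_region)
```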
Next, we calculated the averages and SDs of the PC scores for the ordinary and random octamer peptide libraries. Peptides from these octamer libraries localized


within the high-binder region of the tetramer peptide library were referred to as high-affinity binding peptides. In the same manner, peptides from these octamer libraries localized within the low-binder region of the tetramer peptide library were referred to as low-affinity binding peptides.
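The assignment of octamers to groups described above amounts to a point-in-rectangle test on the PC score plot. A minimal Python sketch follows; the region bounds are placeholder numbers rather than the values obtained in this study, the function name `classify` is hypothetical, and checking the high-binder region first is an assumption (the text does not say how a peptide falling in both regions would be handled).

```python
def classify(score, high_region, low_region):
    """Label a (PC1, PC2) score as 'high-binder', 'low-binder', or 'others'."""
    def inside(region_bounds):
        # A point is inside when every coordinate lies within its (low, high) bounds.
        return all(low <= value <= high
                   for value, (low, high) in zip(score, region_bounds))

    if inside(high_region):  # assumption: the high-binder region takes precedence
        return "high-binder"
    if inside(low_region):
        return "low-binder"
    return "others"

# Placeholder regions: ((PC1 low, PC1 high), (PC2 low, PC2 high)).
HIGH = ((-4.0, -1.0), (0.5, 3.5))
LOW = ((1.0, 4.0), (-3.5, -0.5))

# Hypothetical octamer PC scores.
octamer_scores = {"o1": (-2.0, 2.0), "o2": (2.0, -2.0), "o3": (0.0, 0.0)}
labels = {name: classify(score, HIGH, LOW) for name, score in octamer_scores.items()}
print(labels)
```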

RESULTS

Optimization of variables for peptide sequence analysis
PCA using sequence data from the tetramer peptide library (512 peptides) was performed using 49 physicochemical variables. Variables corresponded to physicochemical properties of the amino acids forming the peptide and included polarity, isoelectric point, hydrophobicity, molecular weight of the amino acid at each position, and the maximum, minimum, and average values of the four amino acids. Positional information of the peptide was not considered because the acquired information should be applicable to octamer peptides (Table S4). At the same time, the binding assay using peptide arrays was performed for IL-2 and IgG as model proteins. Only peptide sequence data, not binding assay data, were used for PCA, and the reduction of dimensionality of the peptide sequence data was investigated. We first performed several PCAs (numbered 1–16) using 49 to 9 variables, respectively. Variables used for each PCA are listed in Table S4 and examples of PCA results are displayed in Fig. 1A and C for PCA 11 and Fig. 2A and C for PCA 10. All 512 peptides were


sparsely plotted in the score plot of PC1 and PC2. We also calculated the variance and cumulative proportions from the PC scores, which are listed in Table S4. In general, the first few components, especially the first two, have more variance than later ones, and the variance of the first two components becomes smaller when more variables are used. This means that as more variables are included, the dimensionality becomes too large for a high variance to remain in the first two PCs. Dimensionality reduction that retains high variance is easily achieved when the number of variables is relatively small. Therefore, we optimized the number of variables so that the data set could be interpreted using only the first two PCs. As expected, the cumulative proportions of the first two PCs decreased as the number of variables was increased. The variable sets of PCA 10 (18 variables), 11 (16 variables), 13 (16 variables), 14 (13 variables), and 16 (9 variables) showed higher cumulative proportions (0.56, 0.64, 0.57, 0.64, and 0.67, respectively; Table S4) than sets with more than 18 variables. Therefore, we succeeded in the reduction of dimensionality by PCA, and we conclude that 18 or fewer variables are suitable to interpret the peptide sequence using only the first two PCs. To optimize the variable set for PCA of each target, we next investigated the projection of the peptide sequence data onto two PCs. As shown in Fig. 1, projection of the data onto PC2 revealed that

FIG. 1. Principal component analysis (PCA) of a tetramer peptide library using 16 variables (PCA 11). Score plots of PCA for IL-2 (A) and IgG (C) are shown. Assessed peptides are shown as gray squares. Open squares and closed triangles correspond to the top 20 and bottom 20 peptides, respectively. Relations between fluorescence intensity of the tetramer peptide library to IL-2 (B) or IgG (D) and the PC scores (PC1 or PC2) from Fig. 1A and C are shown. Solid lines and formulas show the approximate straight line and approximate formula of each plot.


IL-2 binding activity increased as the PC2 score increased (Fig. 1B), whereas an inverse correlation was observed for IgG (Fig. 1D). We found that peptide sequences with high binding activity showed similar PC scores, especially in PC2. A similar tendency was observed in PCA 10, as shown in Fig. 2. We then calculated the correlation coefficients between fluorescence intensity and PC1 or PC2. For IL-2, the correlation coefficients between PC2 and fluorescence intensity were 0.46 (PCA 11 with 16 variables, shown in Fig. 1B), 0.39 (PCA 14 with 13 variables, shown in Fig. S2B), 0.36 (PCA 16 with 9 variables, shown in Fig. S1B), and 0.21 (PCA 10 with 18 variables, shown in Fig. 2B). We therefore selected PCA 11 as showing the highest correlation for IL-2. In the PCA 11 set, PC1 correlated with hydrophobicity, polarity, and Log P of the tetramer peptides, and PC2 correlated with isoelectric point, number of group 3 residues, and number of group 4 residues. For IgG, the correlation coefficients between PC2 and fluorescence intensity were 0.35 (PCA 10 with 18 variables, shown in Fig. 2D), 0.25 (PCA 11 with 16 variables, shown in Fig. 1D), 0.24 (PCA 14 with 13 variables, shown in Fig. S2B), and 0.21 (PCA 16 with 9 variables, shown in Fig. S1B). We selected PCA 10 as showing the highest correlation for IgG. In the PCA 10 set, PC1 correlated with Log P, hydrophobicity, polarity, and side-chain contribution to protein stability of the tetramer peptides, and PC2 correlated with molecular weight, isoelectric point, and side-chain contribution to protein stability.
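The selection step above can be illustrated with a short Python sketch. The coefficients in `reported` are the values quoted in the text; the helper `pearson` shows one standard way such a coefficient could be computed from PC2 scores and fluorescence intensities, and is an assumption, since the exact R call used by the authors is not given.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Correlation between PC2 and fluorescence intensity, keyed by PCA number,
# as reported in the text for the four candidate variable sets.
reported = {
    "IL-2": {11: 0.46, 14: 0.39, 16: 0.36, 10: 0.21},
    "IgG": {10: 0.35, 11: 0.25, 14: 0.24, 16: 0.21},
}

# Keep, for each target, the variable set with the highest coefficient.
best = {target: max(coeffs, key=coeffs.get) for target, coeffs in reported.items()}
print(best)
```

With real data, `pearson(pc2_scores, intensities)` would be evaluated once per candidate variable set before taking the maximum.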

Determination of rules for high- and low-affinity binding to IL-2 or IgG from PCA using a tetramer peptide library
The PC scores of the tetramer peptide library were calculated using 16 variables for IL-2 and 18 variables for IgG. The PC scores of the peptide sequences were plotted in the PC1 and PC2 dimensions, and peptides with similar characteristics were localized close to each other. The top 20 and bottom 20 peptides of the tetramer peptide library were localized in largely separated areas (Fig. 1A and B). It was assumed that each area represents sequence characteristics, called sequence rules, for high- and low-affinity binding peptides. To construct high- and low-affinity binding rules, the averages and SDs of PC scores were calculated from the top 20 and bottom 20 peptides of the tetramer peptide library. High- or low-affinity binding rules were defined as the region within average ± SD of the top 20 peptides or the bottom 20 peptides, respectively, in the score plot (Fig. 3). As a result, for IL-2, the high-affinity binding rule corresponded to the region within 6.33 to 1.43 of PC1 and 0.487 to 2.54 of PC2, and the low-affinity binding rule corresponded to the region within 1.35 to 2.58 of PC1 and 3.15 to 0.781 of PC2 (Fig. 3A). For IgG, the high-affinity binding rule corresponded to the region within 1.16 to 1.15 of PC1 and 2.93 to 0.859 of PC2, and the low-affinity binding rule corresponded to the region within 0.597 to 4.33 of PC1 and 1.06 to 3.35 of PC2 (Fig. 3B).

FIG. 2. PCA of a tetramer peptide library using 18 variables (PCA 10). Score plots of PCA for IL-2 (A) and IgG (C) are shown. Assessed peptides are shown as gray squares. Open squares and closed triangles correspond to the top 20 and bottom 20 peptides, respectively. Relations between fluorescence intensity of the tetramer peptide library to IL-2 (B) or IgG (D) and the PC scores (PC1 or PC2) from Fig. 2A and C are shown. Solid lines and formulas show the approximate straight line and approximate formula of each plot.


FIG. 3. Identification of regions containing the high- and low-affinity IL-2- (A) and IgG-binding (B) peptides in the score plot using a tetramer peptide library. Closed circles and error bars show the averages and standard deviations (SDs), respectively, of PC scores from the top 20 (open squares) and bottom 20 peptides (closed triangles). The averages and SDs of PC scores correspond to regions in which high- or low-affinity binding peptides are concentrated. The regions were calculated by PCA using 16 variables for IL-2 (Fig. 1) and 18 variables for IgG (Fig. 2).

Analysis of peptides within high- and low-affinity binding regions using the IL-2 receptor-derived octamer peptide library
The high- and low-affinity binding peptides of an octamer peptide library derived from the IL-2 receptor (234 peptides) were extracted from PC scores using 16 variables for IL-2 and 18 variables for IgG. Fig. 4A and C shows the high- and low-affinity binding peptides in score plots for IL-2 and IgG. For IL-2, 59 peptides (25%) were localized in the high-binder area and 47 peptides (20%) in the low-binder area. For IgG, 31 (13%) and 24 (10%) peptides of the octamer library were identified as belonging to the high- and low-binder groups, respectively. We then compared the fluorescence intensities of high- and low-affinity binding peptides for IL-2 (Fig. 4B) and IgG (Fig. 4D). For IL-2, the average fluorescence intensity of the high-binder group ($2.23 \times 10^6$, Table 1) was 1.8-fold higher than that of the low-binder group ($1.22 \times 10^6$), whereas the average fluorescence intensity of peptides of the others group was slightly higher ($2.44 \times 10^6$) than that of the high-binder group. For IgG, the average fluorescence intensity of the high-binder group ($4.02 \times 10^6$) was the highest of the three groups, and was 5.3-fold higher than that of the low-binder group ($0.76 \times 10^6$, Table 1). Although we could discriminate the high- and low-affinity IgG-binding octamer peptides derived from the IL-2 receptor using the binding rules from a tetramer peptide library, we could not discriminate the high- and low-affinity IL-2-binding octamer peptides. It is possible that the IL-2 receptor-derived ordinary octamer library is biased in physicochemical characteristics, and thus relatively high-affinity binding peptides were present in the low-binder or others areas.
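The percentages and fold differences quoted above follow directly from the group sizes in the text and the averages listed in Table 1. A short Python check is given below; the variable and function names are illustrative only.

```python
LIBRARY_SIZE = 234  # octamers derived from the IL-2 receptor

# Average fluorescence intensities from Table 1.
mean_intensity = {
    "IL-2": {"high": 2233707, "low": 1222325, "others": 2436329},
    "IgG": {"high": 4021037, "low": 764934, "others": 2184281},
}

# Numbers of peptides falling in each region, as given in the text.
counts = {
    "IL-2": {"high": 59, "low": 47},
    "IgG": {"high": 31, "low": 24},
}

def fold_difference(target):
    """Ratio of the high-binder average to the low-binder average."""
    return mean_intensity[target]["high"] / mean_intensity[target]["low"]

def percent_of_library(target, group):
    """Share of the 234-peptide library assigned to a group, in percent."""
    return 100 * counts[target][group] / LIBRARY_SIZE

for target in ("IL-2", "IgG"):
    print(target,
          f"fold = {fold_difference(target):.1f},",
          f"high = {percent_of_library(target, 'high'):.0f}%,",
          f"low = {percent_of_library(target, 'low'):.0f}%")
```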
Analysis of peptides from a random octamer peptide library selected using high- and low-affinity binding rules
To perform an unbiased evaluation of the tendency of peptides to bind to target proteins, we constructed a random octamer peptide library with broad physicochemical characteristics. This new octamer peptide library was constructed by combining two peptides randomly selected from the tetramer peptide library, yielding 640 peptides. Three peptide groups (high-binder, low-binder, and others) were extracted from PC scores using 16 variables for IL-2 and 18 variables for IgG. Fig. 5A and C shows the high- and low-binder groups in score plots for IL-2 and IgG. As a result, we identified 128 and 103 octamers of the high-binder group for IL-2 and IgG, respectively, and 140 and 74 octamers of the low-binder group for IL-2 and IgG, respectively. We then compared the fluorescence intensities of high- and low-affinity binding peptides for IL-2 (Fig. 5B) and IgG (Fig. 5D). For IL-2, 128 (20%) high- and 140 (22%) low-affinity peptides were found. The average fluorescence intensity of the high-binder group ($0.89 \times 10^5$) was the highest of the three groups, and was 3.3-fold higher than that of the low-binder group ($0.27 \times 10^5$, Table 2). For IgG, 103 (16%) high- and 74 (12%) low-affinity peptides were found. The average fluorescence intensity of the high-binder group ($2.13 \times 10^5$) was the highest of the three groups, and was 2.1-fold higher than that of the low-binder group ($1.03 \times 10^5$, Table 2). Therefore, we could discriminate the high- and low-affinity IL-2- and IgG-binding octamer peptides using binding rules from a tetramer peptide library.

DISCUSSION

In the present study, we performed PCA of physicochemical properties using a tetramer peptide library consisting of 512 peptides to extract the binding rules of these peptides to IL-2 and IgG. Using the rules obtained from the tetramer peptides, we analyzed the binding affinity of peptides from two octamer libraries: an ordinary library consisting of 234 peptides derived from the IL-2 receptor, and a random library consisting of 640 peptides. As a result, we could discriminate peptides with high- or low-affinity binding to IL-2 and IgG in these octamer libraries using only binding rules obtained from tetramer peptides. Therefore, the results suggest that the binding affinity of longer peptides may be predicted from the binding affinity of shorter peptides.
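For orientation, the two kinds of octamer library analyzed above can be generated along the following lines. This Python sketch rests on several assumptions: the function names, the toy sequences, and the seed are hypothetical, and the text does not state whether the same tetramer could be drawn twice or whether duplicate octamers were excluded from the random library.

```python
import random

def sliding_octamers(sequence, length=8, shift=2):
    """Overlapping peptides of the given length, moving `shift` residues at a time."""
    last_start = len(sequence) - length
    return [sequence[i:i + length] for i in range(0, last_start + 1, shift)]

def random_octamers(tetramers, count, seed=0):
    """Join two randomly drawn tetramers per octamer (drawn with replacement here)."""
    rng = random.Random(seed)  # fixed seed only to make the sketch reproducible
    return [rng.choice(tetramers) + rng.choice(tetramers) for _ in range(count)]

# Toy inputs; a real run would use the receptor sequences and the 512 tetramers.
windows = sliding_octamers("ABCDEFGHIJKL")
library = random_octamers(["AKLE", "GSDT", "VRFN"], count=640)

print(windows)
print(len(library), library[:3])
```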
Table 1 shows the fluorescence intensity of IL-2 binding of peptides discriminated into high- and low-binder groups from an ordinary octamer peptide library derived from the IL-2 receptor,


FIG. 4. Analysis of peptides within high- and low-binding regions from an octamer peptide library derived from IL-2 receptor sequences. PC score plots of octamer peptides for IL-2 (A) and IgG (C) and the fluorescence intensity of the high- and low-affinity binding peptides to IL-2 (B) or IgG (D) are shown. The regions containing high- and low-binding peptides identified for the tetramer peptide library shown in Fig. 3 were applied to an octamer peptide library derived from the IL-2 receptor. PC scores of octamer peptides for IL-2 and IgG were calculated using 16 and 18 variables for IL-2 and IgG, respectively. Peptides of the octamer library localized within the high- or low-binder groups were selected and displayed on the score plot for IL-2 (A) and IgG (C). In Fig. 4A and C, open squares and closed triangles show the PC scores of the high- and low-binding peptides, respectively. The number of high-binder peptides was 59 and 31 peptides for IL-2 and IgG, respectively. The number of low-binder peptides was 47 and 24 peptides for IL-2 and IgG, respectively.

whereas the intensity of the high-binder group was similar to that of the others group. On the other hand, the fluorescence intensity of IgG binding of peptides from the high-binder group of the ordinary octamer peptide library derived from the IL-2 receptor was markedly higher than that of the low-binder group. The ordinary octamer peptide library covered a smaller region of the PCA score plot than the new random octamer library for both IL-2 and IgG (Figs. 4A and 5A). We consider that the amino acid residues in the extracellular domain of the human IL-2 receptor are biased in physicochemical characteristics. Indeed, amino acid residues of groups 1, 2, 3, and 4 are present at 39.0%, 38.7%, 10.6%, and 11.7%, respectively, in the extracellular domain of the IL-2 receptor. In a perfectly randomized library, amino acid residues of groups 1, 2, 3, and 4 appear at 40%, 35%, 15%, and 10%, respectively. Therefore, it is assumed that the ordinary peptide library is suitable for screening of hydrophilic peptides containing group 2 and not suitable for positively charged peptides containing group 3. In the present study, peptides were better sorted into high-IgG binder and low-IgG binder groups by the score plot of the ordinary octamer library (Fig. 4D) than by that of the random octamer library (Fig. 5D). The difference between averages of the high and

low IL-2-binding groups is shown in Table 2. In our previous study (29), we reported that peptides with high binding affinity for IL-2 frequently contain amino acid residues of group 4 and rarely contain residues of group 3. We also reported that peptides with high binding affinity for IgG include residues of group 3. It is likely that the ordinary octamer library derived from the IL-2 receptor contains many peptides with high binding affinity to IL-2, so that the difference between the average of the high-binder group and that of the low-binder and others groups is relatively small. Because the ordinary octamer library was constructed from the amino acid sequence of the IL-2 receptor, obtaining a high IL-2-binding group with high binding affinity from this library would also be expected (Fig. 4D). In addition, it is likely that the ordinary octamer library derived from the IL-2 receptor contains only a few peptides with high binding affinity to IgG, so that the difference between the average of the high-binder group and that of the low-binder group is relatively large. As shown in Tables 1 and 2, for instance, the ratio of the average fluorescence intensity of IgG high-binding/low-binding peptides from the octamer peptide library derived from the IL-2 receptor (5.3-fold) was larger than that of peptides from the random octamer peptide library (2.1-fold). Thus, we reasoned that a peptide library covering a broader region in the

Please cite this article in press as: Kume, A., et al., Exploring high-affinity binding properties of octamer peptides by principal component analysis of tetramer peptides, J. Biosci. Bioeng., (2016), http://dx.doi.org/10.1016/j.jbiosc.2016.08.005


TABLE 1. Regions identified by the average, standard deviation, and number of peptides from an ordinary octamer peptide library derived from IL-2 receptor sequences.

Target   Group         Ave        SD        Number of peptides
IL-2     High-binder   2233707    407662    59
IL-2     Low-binder    1222325    879272    47
IL-2     Others        2436329    695693    128
IgG      High-binder   4021037    764934    31
IgG      Low-binder    764934     79927     24
IgG      Others        2184281    350052    179

Ave: average; SD: standard deviation. The high-binder, low-binder, and others were constructed from the average of fluorescent intensities, SD, and number of peptides in Fig. 4.

score plot was more suitable for discriminating high- and low-binding regions of peptides. However, experiments using other protein-receptor pairs are needed to support this speculative interpretation. The average fluorescence intensity of the random octamer peptide library was significantly lower than that of the ordinary octamer peptide library for both IL-2 and IgG binding, as shown in Tables 1 and 2. This is due to the use of different equipment, as described in Materials and Methods. However, although the average fluorescence intensities for the random and ordinary octamer


TABLE 2. Regions identified by the average, standard deviation, and number of peptides from a random octamer peptide library.

Target   Group         Ave       SD       Number of peptides
IL-2     High-binder   89125     28846    128
IL-2     Low-binder    26937     26472    140
IL-2     Others        70806     27159    372
IgG      High-binder   213481    16755    103
IgG      Low-binder    102782    7464     74
IgG      Others        152196    13007    463

Ave: average; SD: standard deviation. The high-binder, low-binder, and others were constructed from the average of fluorescent intensities, SD, and number of peptides in Fig. 5.
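The fold differences quoted in the text follow directly from the group averages in Tables 1 and 2. As a quick check, the minimal Python sketch below recomputes the high-binder/low-binder ratios within each library. The dictionary layout and the name `fold_ratio` are illustrative assumptions and are not part of the original analysis; only the averages are taken from the tables.

```python
# Average fluorescence intensities copied from Tables 1 (ordinary library)
# and 2 (random library). The data layout is an illustrative assumption.
AVERAGES = {
    ("ordinary", "IL-2"): {"high": 2233707, "low": 1222325},
    ("ordinary", "IgG"): {"high": 4021037, "low": 764934},
    ("random", "IL-2"): {"high": 89125, "low": 26937},
    ("random", "IgG"): {"high": 213481, "low": 102782},
}


def fold_ratio(library, target):
    """Return the high-binder / low-binder ratio of average intensities."""
    group = AVERAGES[(library, target)]
    return group["high"] / group["low"]


# Ratios rounded to one decimal place, as they are quoted in the text.
RATIOS = {key: round(fold_ratio(*key), 1) for key in AVERAGES}

if __name__ == "__main__":
    for (library, target), ratio in RATIOS.items():
        print(f"{library:8s} {target:4s} high/low = {ratio}")
```

Because the two libraries were measured with different equipment, only ratios computed within one library are comparable; the sketch therefore never divides an average from one table by an average from the other.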

peptide libraries were significantly different, the relevant comparisons were made within each library. For example, the average intensity of the high-binder group for IL-2 was about 2-fold higher than that of the low-binder group for the ordinary octamer peptide library, and about 3.3-fold higher than that of the low-binder group for the random octamer peptide library. In our previously described binding analysis using tetramer peptide libraries, the following physicochemical rules for high-affinity binding peptides were proposed: no positively charged

FIG. 5. Analysis of peptides within high- and low-binding regions from a random octamer peptide library. PC score plots of octamer peptides for IL-2 (A) and IgG (C) and the fluorescence intensity of peptides bound to IL-2 (B) or IgG (D) are shown. The regions containing the high- and low-binding peptides from the tetramer peptide library shown in Fig. 3 were applied to a random octamer peptide library. PC scores of octamer peptides were calculated using 16 and 18 variables for IL-2 and IgG, respectively. Peptides of the octamer library localized within the region for high- or low-binder groups from the tetramer peptide library were selected and displayed on the score plot for IL-2 (A) and IgG (C). In Fig. 5A and C, open squares and closed triangles show the PC scores of the high- and low-binding peptides, respectively. The number of high-binding peptides was 128 and 103 for IL-2 and IgG, respectively. The number of low-binding peptides was 140 and 74 for IL-2 and IgG, respectively.
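The selection step described in the captions of Figs. 4 and 5 (scoring octamers in the tetramer PC space and keeping those that fall inside a previously identified region) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the three-variable feature vectors, the means, standard deviations, loadings, rectangular region bounds, and peptide labels are all hypothetical placeholders, and the source does not state the shape of the regions.

```python
# Hypothetical parameters of a PCA fitted on the tetramer library.
# In practice these would come from the 16- or 18-variable model.
MEANS = [1.0, 2.0, 0.0]
STDS = [0.5, 1.0, 2.0]
LOADINGS = [            # one loading vector per principal component
    [0.6, 0.8, 0.0],    # PC1
    [0.0, 0.0, 1.0],    # PC2
]

# Assumed rectangular regions: ((PC1 min, PC1 max), (PC2 min, PC2 max)).
HIGH_REGION = ((0.0, 2.0), (1.0, 3.0))
LOW_REGION = ((-2.0, 0.0), (-3.0, -0.5))


def pc_scores(features):
    """Standardize with the tetramer statistics and project onto PC1/PC2."""
    z = [(x - m) / s for x, m, s in zip(features, MEANS, STDS)]
    return tuple(sum(w * v for w, v in zip(axis, z)) for axis in LOADINGS)


def in_region(score, region):
    """True if a (PC1, PC2) score lies inside a rectangular region."""
    (pc1_min, pc1_max), (pc2_min, pc2_max) = region
    return pc1_min <= score[0] <= pc1_max and pc2_min <= score[1] <= pc2_max


def classify(features):
    """Assign an octamer to the high-binder, low-binder, or others group."""
    score = pc_scores(features)
    if in_region(score, HIGH_REGION):
        return "high"
    if in_region(score, LOW_REGION):
        return "low"
    return "others"


# Placeholder octamer descriptors (not real peptides or measurements).
OCTAMERS = {
    "pep_a": [2.0, 2.0, 4.0],
    "pep_b": [1.0, 1.0, -2.0],
    "pep_c": [1.0, 2.0, 0.0],
}
GROUPS = {name: classify(feats) for name, feats in OCTAMERS.items()}

if __name__ == "__main__":
    print(GROUPS)
```

The key design point is that the octamer features are standardized with the tetramer library's means and standard deviations and projected with the tetramer loadings, so that both libraries share one score plot.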


hydrophilic (group 3) amino acid residue in any position and a non-charged hydrophobic (group 1) or negatively charged hydrophilic (group 4) amino acid residue in position 1 (group 1 in P2 and P4) for IL-2; and a non-charged hydrophilic (group 2) and a positively charged hydrophilic (group 3) amino acid residue in any position for IgG (29). These rules were compared with those derived from the present study as follows. In the present study, PCA 11 was selected for IL-2. PC1 correlated with hydrophobicity, polarity, and Log P of the tetramer peptides, whereas PC2 correlated with isoelectric point, number of group 3 residues, and number of group 4 residues. PC2 was negatively correlated with the number of group 3 residues and positively correlated with the number of group 4 residues for IL-2. Because PC2 was positively correlated with binding affinity to IL-2 (Fig. 1B), binding affinity increased with an increase in group 4 residues. Therefore, these results are in agreement with the published work mentioned above. For IgG, PCA 10 was selected and PC2 coefficients of the isoelectric point-related variables were slightly negative. Because PC2 was negatively correlated with binding affinity for IgG (Fig. 1D), peptides with positively charged amino acid residues (group 3) were found in the high-binder group. On the other hand, PC2 coefficients of isoelectric point-related variables were notably negative for IL-2. Because PC2 was positively correlated with binding affinity for IL-2 (Fig. 1B), peptides with negatively charged amino acid residues (group 4) were found in the high-binder group. PC2 coefficients of hydropathy index-related variables were positive in PCA 11 for IgG. Because PC2 was negatively correlated with binding affinity to IgG (Fig. 1D), the binding affinity to IgG decreased with an increase in the hydrophobicity of the peptide. These results are also in agreement with our previously published work.
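Because only PC1 and PC2 are interpreted here, it is worth making explicit how much of the total variance two components retain. The cumulative proportion of the first k components is the sum of their eigenvalues divided by the sum of all eigenvalues. The sketch below illustrates this with hypothetical eigenvalues chosen for easy arithmetic; they are not values from the PCA models in this study, and the function names are illustrative.

```python
from itertools import accumulate


def cumulative_proportion(eigenvalues):
    """Cumulative share of total variance explained by the leading PCs."""
    ordered = sorted(eigenvalues, reverse=True)  # largest variance first
    total = float(sum(ordered))
    return [running / total for running in accumulate(ordered)]


def components_needed(eigenvalues, threshold=0.8):
    """Smallest number of PCs whose cumulative proportion reaches threshold."""
    for k, share in enumerate(cumulative_proportion(eigenvalues), start=1):
        if share >= threshold:
            return k
    return len(eigenvalues)


# Hypothetical eigenvalues of a correlation matrix (illustration only).
EIGENVALUES = [6.0, 3.0, 2.0, 1.0]

if __name__ == "__main__":
    shares = cumulative_proportion(EIGENVALUES)
    print("two-component cumulative proportion:", shares[1])
    print("components needed for 0.8:", components_needed(EIGENVALUES))
```

In this toy case two components retain 0.75 of the variance, so a two-dimensional score plot falls just short of the commonly quoted 0.8 guideline while still capturing most of the structure.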
The reliability of PCA increases with the cumulative proportion, and a value of 0.8 is generally considered standard. However, many researchers have performed PCA with a cumulative proportion of less than 0.8 (30,36,37), which suggests that models constructed with a lower cumulative proportion can still be considered reliable. Thus, we selected PCA models with variable sets 10 and 11, although the cumulative proportion of these models is less than 0.8. In addition, we could discriminate the high- and low-affinity IL-2- and IgG-binding peptides among the octamer peptides using the binding rules of tetramer peptides. In previously published work, we demonstrated how to extract the physicochemical characteristics of high-affinity binding peptides for various kinds of target molecules, such as the death receptor, a cell-adhesion peptide, and bile acid (27,28,38). Therefore, applying this method to other targets might allow the analysis of binding affinities of peptides longer and/or shorter than those of the originally assayed library.

We focused on short-chain peptides because their structures are linear and the binding data can be submitted to computational analysis. A typical helix is generally said to contain about ten amino acids. Therefore, we investigated the properties of octamer peptides herein. The proposed method is difficult to apply when screening longer sequences (peptides with more than 10 residues) because secondary structure may form spontaneously.

In the present study, binding peptides were identified for IL-2 and IgG as target proteins. Although it is tempting to suggest that the physicochemical properties of a high-affinity binding peptide are reflected in the physicochemical properties of the target protein, the binding region of a peptide from a high-binder group is not necessarily a binding site for the target protein.
To understand this relation, it should be investigated whether the high-affinity IL-2-binding peptides can competitively inhibit binding of IL-2 to its receptor. Our method may be convenient for understanding the binding-related physicochemical characteristics of high- and low-affinity binding peptides. Thus, we conclude that the proposed method is useful for

a first screening of binding peptides. However, it is unclear whether the peptide with the highest binding affinity to the target molecule can be identified, because it is possible that this peptide does not have the selected physicochemical properties. As a next step, we plan to apply our method to the screening of peptides with the highest or lowest binding affinity by performing second and subsequent screening steps. For these additional screening steps, a library of exhaustive but rationally designed octamer peptides must be prepared after extracting the important variables and evaluating binding using the peptide array. After PCA, the scores of the newly designed peptides are easy to calculate, although calculation for longer peptides is time-consuming because of the huge number of candidates.

To summarize, our exploratory method using physicochemical characteristics is useful for effectively reducing library size. PCA also has the potential to identify high- or low-binder regions, and we succeeded in extending the identified region from tetramer to octamer peptides using PCA. Our results show that the binding affinity of longer peptides can be predicted from the binding affinity of shorter peptides. In addition, we were able to reduce the number of candidate octamer peptide ligands, using all 20 amino acids, from 2.56 × 10^10 to 512 peptides. Therefore, our method is an important tool for the efficient and rational development of peptide ligands.

Supplementary data to this article can be found online at http://dx.doi.org/10.1016/j.jbiosc.2016.08.005.
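The library-size figures quoted in the summary above can be verified with simple arithmetic: an exhaustive octamer library over 20 amino acids contains 20^8 sequences. The short sketch below reproduces that number and the resulting reduction factor relative to 512 peptides; the variable names are illustrative only.

```python
N_AMINO_ACIDS = 20      # natural amino acids considered at each position
PEPTIDE_LENGTH = 8      # octamer

# Size of an exhaustive octamer library: 20^8 = 2.56 x 10^10 sequences.
FULL_LIBRARY = N_AMINO_ACIDS ** PEPTIDE_LENGTH

# Number of peptides quoted in the text after reduction.
REDUCED_LIBRARY = 512

# How many times smaller the reduced library is.
REDUCTION_FACTOR = FULL_LIBRARY // REDUCED_LIBRARY

if __name__ == "__main__":
    print(f"exhaustive octamer library: {FULL_LIBRARY:.2e} peptides")
    print(f"reduction factor: {REDUCTION_FACTOR:,}")
```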

ACKNOWLEDGMENTS

This work was partially supported by JSPS KAKENHI (Grant Numbers: 25289292 and 18H04575). We would like to thank Editage for English language editing.

References

1. Terpe, K.: Overview of tag protein fusions: from molecular and biochemical fundamentals to commercial systems, Appl. Microbiol. Biotechnol., 60, 523-533 (2003).
2. Waibel, R., Alberto, R., Willuda, J., Finnern, R., Schibli, R., Stichelberger, A., Egli, A., Abram, U., Mach, J. P., Plückthun, A., and Schubiger, P. A.: Stable one-step technetium-99m labeling of His-tagged recombinant proteins with a novel Tc(I)-carbonyl complex, Nat. Biotechnol., 17, 897-901 (1999).
3. Hopp, T. P., Pricket, K. S., Price, V. L., Libby, R. T., March, C. J., Ceretti, D. P., Urdal, D. L., and Conlon, P. J.: A short polypeptide marker sequence useful for recombinant protein identification and purification, Nat. Biotechnol., 6, 1204-1210 (1988).
4. Murray, A., Sekowski, M., Spencer, D. I. R., Denton, G., and Price, M. R.: Purification of monoclonal antibodies by epitope and mimotope affinity chromatography, J. Chromatogr. A, 782, 49-54 (1997).
5. Murray, A., Spencer, D. I. R., Missailidis, S., Denton, G., and Price, M. R.: Design of ligands for the purification of anti-MUC1 antibodies by peptide epitope affinity chromatography, J. Pept. Res., 52, 375-383 (1998).
6. Roque, A. C. A., Lowe, C. R., and Taipa, M. A.: Antibodies and genetically engineered related molecules: production and purification, Biotechnol. Prog., 20, 639-654 (2004).
7. Chen, X., Conti, P. S., and Moats, R. A.: In vivo near-infrared fluorescence imaging of integrin αvβ3 in brain tumor xenografts, Cancer Res., 64, 8009-8014 (2004).
8. Qiao, Y., Tang, H., Munske, G. R., Dutta, P., Ivory, C. F., and Dong, W. J.: Enhanced fluorescence anisotropy assay for human cardiac troponin I and T detection, J. Fluoresc., 21, 2101-2110 (2011).
9. Sapsford, K. E., Blanco-Canosa, J. B., Dawson, P. E., and Medintz, I. L.: Detection of HIV-1 specific monoclonal antibodies using enhancement of dye-labeled antigenic peptides, Bioconjug. Chem., 21, 393-398 (2010).
10. Wang, X. H., Wang, C. Y., Qu, K. G., Song, Y. J., Ren, J. S., Miyoshi, D., Sugimoto, N., and Qu, X. G.: Ultrasensitive and selective detection of a prognostic indicator in early-stage cancer using graphene oxide and carbon nanotubes, Adv. Funct. Mater., 20, 3967-3971 (2010).
11. Izdebska, M., Gagat, M., Grzanka, D., and Grzanka, A.: Ultrastructural localization of F-actin using phalloidin and quantum dots in HL-60 promyelocytic leukemia cell line after cell death induction by arsenic trioxide, Acta Histochem., 115, 487-495 (2013).


12. Smith, G. P.: Filamentous fusion phage: novel expression vectors that display cloned antigens on the virion surface, Science, 228, 1315-1317 (1985).
13. Lam, K. S., Salmon, S. E., Hersh, E. M., Hruby, V. J., Kazmierski, W. M., and Knapp, R. J.: A new type of synthetic peptide library for identifying ligand-binding activity, Nature, 354, 82-84 (1991).
14. Lebl, M., Krchnak, V., Sepetov, N. F., Seligmann, B., Strop, P., Felder, S., and Lam, K. S.: One-bead-one-structure combinatorial libraries, Biopolymers, 37, 177-198 (1995).
15. Fassina, G., Ruvo, M., Palombo, G., Verdoliva, A., and Marino, M.: Novel ligands for the affinity-chromatographic purification of antibodies, J. Biochem. Biophys. Methods, 49, 481-490 (2001).
16. Fassina, G., Verdoliva, A., Odierna, M. R., Ruvo, M., and Cassini, G.: Protein A mimetic peptide ligand for affinity purification of antibodies, J. Mol. Recognit., 9, 564-569 (1996).
17. Sandin, C., Linse, S., Areschoug, T., Woof, J. M., Reinholdt, J., and Lindahl, G.: Isolation and detection of human IgA using a streptococcal IgA-binding peptide, J. Immunol., 169, 1357-1364 (2002).
18. Palombo, G., Verdoliva, A., and Fassina, G.: Affinity purification of immunoglobulin M using a novel synthetic ligand, J. Chromatogr. B, 715, 137-145 (1998).
19. Hatanaka, T., Ohzono, S., Park, M., Sakamoto, K., Tsukamoto, S., Sugita, R., Ishitobi, H., Mori, T., Ito, O., Sorajo, K., and other 3 authors: Human IgA-binding peptides selected from random peptide libraries: affinity maturation and application in IgA purification, J. Biol. Chem., 287, 43126-43136 (2012).
20. Wang, F. Y., Guo, J. Q., and Zhang, G. P.: Analysis of the linear epitope for Fc-binding on the mouse IgG Fc receptor (moFcγR1) by synthetic peptide, Genet. Mol. Res., 13, 4647-4653 (2014).
21. Burden, F. R. and Winkler, D. A.: Predictive Bayesian neural network models of MHC class II peptide binding, J. Mol. Graph. Model., 23, 481-489 (2005).
22. Brusic, V., Rudy, G., Honeyman, G., Hammer, J., and Harrison, L.: Prediction of MHC class II-binding peptides using an evolutionary algorithm and artificial neural network, Bioinformatics, 14, 121-130 (1998).
23. Singh, H. and Raghava, G. P.: ProPred: prediction of HLA-DR binding sites, Bioinformatics, 17, 1236-1237 (2001).
24. Tung, C. W. and Ho, S. Y.: POPI: predicting immunogenicity of MHC class I binding peptides by mining informative physicochemical properties, Bioinformatics, 23, 942-949 (2007).
25. Salomon, J. and Flower, D. R.: Predicting class II MHC-peptide binding: a kernel based approach using similarity scores, BMC Bioinformatics, 7, 501 (2006).


26. Zhang, G. L., Bozic, I., Kwoh, C. K., August, J. T., and Brusic, V.: Prediction of supertype-specific HLA class I binding peptides using support vector machines, J. Immunol. Methods, 320, 143-154 (2007).
27. Kaga, C., Okochi, M., Tomita, Y., Kato, R., and Honda, H.: Computationally assisted screening and design of cell-interactive peptides by a cell-based assay using peptide arrays and a fuzzy neural network algorithm, Biotechniques, 44, 393-402 (2008).
28. Takeshita, T., Okochi, M., Kato, R., Kaga, C., Tomita, Y., Nagaoka, S., and Honda, H.: Screening of peptides with a high affinity to bile acids using peptide arrays and a computational analysis, J. Biosci. Bioeng., 112, 92-97 (2011).
29. Kume, A., Okochi, M., Shimizu, K., Yoshida, Y., and Honda, H.: Development of a tactical screening method to investigate the characteristics of functional peptides, Biotechnol. Bioprocess Eng., 21, 119-127 (2016).
30. Ringnér, M.: What is principal component analysis? Nat. Biotechnol., 26, 303-304 (2008).
31. Zimmerman, J. M., Eliezer, N., and Simha, R.: The characterization of amino acid sequences in protein by statistical methods, J. Theor. Biol., 21, 170-201 (1968).
32. Kyte, J. and Doolittle, R. F.: A simple method for displaying the hydropathic character of a protein, J. Mol. Biol., 157, 105-132 (1982).
33. Takano, K. and Yutani, K.: A new scale for side-chain contribution to protein stability based on the empirical stability analysis of mutant proteins, Protein Eng., 14, 525-528 (2001).
34. Zamyatin, A. A.: Protein volume in solution, Prog. Biophys. Mol. Biol., 24, 107-123 (1972).
35. David, L. N. and Michael, M. C.: Lehninger principles of biochemistry, pp. 118, 3rd ed. Worth Pub, UK (2000).
36. Futamura, Y., Kawatani, M., Kazami, S., Tanaka, K., Muroi, M., Shimizu, T., Tomita, K., Watanabe, N., and Osada, H.: Morphobase, an encyclopedic cell morphology database, and its use for drug target identification, Chem. Biol., 19, 1620-1630 (2012).
37. Tarasova, N. K., Ytterberg, A. J., Lundberg, K., Zhang, X. M., Harris, R. A., and Zubarev, R. A.: Proteomics reveals a role for attachment in monocyte differentiation into efficient proinflammatory macrophages, J. Proteome Res., 14, 3940-3947 (2015).
38. Kaga, C., Okochi, M., Nakanishi, M., Hayashi, H., Kato, R., and Honda, H.: Screening of a novel octamer peptide, CNSCWSKD, that induces caspase-dependent cell death, Biochem. Biophys. Res. Commun., 362, 1063-1068 (2007).
