Amino acid sequence of horse colipase B

Biochimica et Biophysica Acta, 669 (1981) 39-45
Elsevier/North-Holland Biomedical Press

BBA 38687

J. BONICEL, P. COUCHOUD, E. FOGLIZZO, P. DESNUELLE and C. CHAPUS *

Centre de Biochimie et de Biologie Moléculaire du CNRS, 31, Chemin Joseph-Aiguier, B.P. 71, 13277 Marseille Cedex 9 (France)

(Received December 29th, 1980)

Key words: Colipase B; Amino acid sequence; (Horse)

The complete sequence of the 96 residues composing horse colipase B has been determined by automated analysis of the intact protein, of two CNBr peptides and of two tryptic peptides arising, respectively, from the citraconylated chain and from the unreduced protein. The single histidine of the protein is located at position 29, as in horse colipase A. His86, present in the C-terminal region of the pig cofactor and supposed to play a role in the folding of the molecule, is not conserved in horse B. Large pieces of the pig and horse B chains were found to be identical or very similar, especially the N-terminal sequence and the central segment Ala49-Cys65 including the three tyrosines of the molecule. The four lysines and the ten half-cystines are also conserved.

Colipase, a small protein cofactor secreted by the pancreas, is assumed to anchor lipase (EC 3.1.1.3) at bile salt-coated interfaces of insoluble triglyceride substrates [1]. This function requires specific binding of the molecule to both interface and enzyme. Sequence determination is a first step towards a better understanding of these processes, which are excellent models of protein-lipid interactions and of protein-protein interactions mediated by an organized lipid phase [2].

The amino acid sequence and disulfide bridges of a degraded but still active form of porcine colipase (colipase II) have already been elucidated [3,4]. The protein contains 84 residues and five disulfide bridges, of which four are located in a strongly reticulated core. Two hydrophobic regions (Ile7-Ile8-Ile9 in the N-terminal tail and Phe50-Thr51-Leu52-Tyr53-Gly54-Val55-Tyr56-Tyr57 in the core) are noteworthy. The two adjacent residues Tyr56 and Tyr57 have been shown by several physicochemical techniques to be involved in interfacial binding of the cofactor [5-9]. An interaction between these tyrosines and a histidine tentatively identified as His86 in the C-terminal part of the porcine colipase chain has been demonstrated by proton NMR spectroscopy and laser photochemically induced dynamic nuclear polarization [8,9]. Special emphasis was laid on this interaction, which may reflect a characteristic folding of the chain creating a relatively large interface recognition site in the protein.

Two isocolipases, A and B **, are synthesized by horse pancreas. Both have been shown by amino acid analysis to contain about 96 residues [10,11]. Their N-terminal sequences, including 55 residues for the first and 51 residues for the second, have been determined by automated Edman degradation [11]. The presence of two methionines in horse colipase B, and the resulting possibility of obtaining three peptides upon chain fragmentation by CNBr, prompted us to try to elucidate the complete amino acid sequence of this protein, whose proportion in horse pancreas homogenates largely exceeds that of colipase A [10]. The sequence, compared to that of the porcine cofactor, should give useful information about the regions essential for the expression of colipase activity. It should also help in the interpretation of crystallographic data: horse colipase B has already been crystallized, with encouraging results [10].

In the present work, the arrangement of the 96 residues composing the protein was derived from automated sequence analysis of the intact chain and of only four peptides isolated after chain fragmentation by CNBr or trypsin digestion. The single histidine of colipase B was definitely proved to be at position 29, and the above-mentioned hydrophobic regions of pig colipase were shown to be conserved in the horse. Preliminary information about the C-terminal and intermediary CNBr peptides has already been published [10,12].

* To whom correspondence should be addressed.

Abbreviation: TPCK, N-tosyl-L-phenylalanine chloroketone.

Supplementary data to this article are deposited with and can be obtained from Elsevier/North-Holland Biomedical Press B.V., BBA Data Deposition, P.O. Box 1345, 1000 BH Amsterdam, The Netherlands. Reference should be made to No. BBA/DD/181/38687/669 (1981) 39. The supplementary information includes: amino acid compositions and elution profiles of peptides obtained after trypsin digestion, and the initial and repetitive yields during automated sequence analysis.

** During pancreas processing and cofactor purification, the isocolipases give rise to small quantities of slightly degraded forms A2 and B2 [10]. Only the major forms A1 and B1 are considered here and simply designated A and B.

0005-2795/81/0000-0000/$02.50 © Elsevier/North-Holland Biomedical Press

Material and Methods

Material

Horse colipase B was purified (40 mg from 800 g of fresh pancreas) as described by Chapus et al. [10] in the presence of trypsin and carboxypeptidase inhibitors. The protein is characterized by an N-terminal valine and a C-terminal arginine detectable by digestion with carboxypeptidase B. CNBr, monoiodoacetic acid, mercaptoethanol and citraconic anhydride were Fluka products (Buchs, Switzerland) of the best available grade. Trypsin pretreated with N-tosyl-L-phenylalanine chloroketone (TPCK-trypsin) was from Worthington Biochem. Corp. (NJ, U.S.A.). Sephadex was from Pharmacia (Sweden) and DEAE-cellulose (DE 52) was from Whatman (U.K.).

Methods

CNBr peptides. These were obtained as described in Ref. 10.

Tryptic peptides. The reduced-carboxymethylated protein (2 μmol) was dialyzed for 24 h at 4°C against 15 l of 1 mM HCl, lyophilized and taken up in 2 ml of 0.1 M ethylmorpholine acetate buffer, pH 8.2, containing 50 mM EDTA. The solution was incubated for 1 h at room temperature with a 55-fold excess (calculated by reference to the lysine amino groups) of citraconic anhydride. After a 24 h dialysis against a 60 mM ammonium bicarbonate buffer, pH 8.5, the mixture was digested for 1 h at 37°C by TPCK-trypsin (enzyme : substrate ratio, 1 : 50). After separation and prior to analysis, the arginine peptides were decitraconylated in 5% formic acid at room temperature for 18 h.
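The effect of citraconylation on the tryptic digest can be illustrated with a minimal in-silico sketch. It assumes the usual rule of thumb that trypsin cleaves after lysine and arginine unless the next residue is proline, and that citraconylated lysines are no longer recognized, so that only arginine bonds are split. The function name `tryptic_fragments` and the sequence `TOY` are hypothetical and chosen for illustration only; `TOY` is not a colipase sequence.

```python
def tryptic_fragments(sequence, cleave_after="KR"):
    """Split a one-letter sequence after each residue listed in `cleave_after`.

    Simplified rule of thumb: a bond followed by proline is left intact.
    """
    fragments = []
    start = 0
    # The last residue never creates a cut, so stop one position early.
    for i in range(len(sequence) - 1):
        if sequence[i] in cleave_after and sequence[i + 1] != "P":
            fragments.append(sequence[start:i + 1])
            start = i + 1
    fragments.append(sequence[start:])
    return fragments


# Hypothetical toy peptide (NOT taken from colipase).
TOY = "AGKTLRSEKFGRVD"

# Unmodified chain: cleavage after both Lys (K) and Arg (R).
native = tryptic_fragments(TOY)

# Citraconylated chain: lysines are blocked, so only Arg bonds are cleaved.
citraconylated = tryptic_fragments(TOY, cleave_after="R")

print(native)          # five short peptides
print(citraconylated)  # three longer "arginine peptides"
```

Blocking the lysines gives fewer and longer fragments, which is what makes the arginine peptides useful for bridging two CNBr fragments.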

Limited trypsinolysis of unreduced colipase (1 μmol). This was carried out as described in Ref. 10 by a 24 h incubation at 37°C and pH 8.0, under stirring, with Sepharose-linked trypsin (1 mg of active enzyme).

Amino acid analysis and N-terminal residues. Amino acid analyses were performed in a Beckman Autoanalyzer Model 120 C after 24, 48 and 72 h hydrolysis of the peptides in constant-boiling HCl at 110°C. Values for serine and threonine were derived by extrapolation to zero time, while the highest values obtained after 72 h hydrolysis were used for valine, leucine and isoleucine. N-terminal residues in proteins and peptides were identified by the dansyl technique of Hartley [13].

Automated Edman degradations. Automated Edman degradations [14] were performed in a Beckman Sequencer Model 890 C using the 0.1 M Quadrol program. A Socosi sequencer was also used in some cases with a dimethylbenzylamine program [15]. Short peptides were retained in the spinning cup by apocytochrome C [16]. Conversion of anilinothiazolinones was carried out in 20% trifluoroacetic acid at 55°C for 30 min [17]. After evaporation to dryness, the resulting thiohydantoins were dissolved in 1 M HCl and extracted with ethyl acetate, thus leading to an organic-soluble and a water-soluble fraction. Organic-soluble thiohydantoins were identified and quantified by high-performance liquid chromatography using a Waters apparatus and a Merck RP 18 column, which was eluted by a linear methanol concentration gradient (20 to 46%) in a 5 mM sodium acetate buffer, pH 4.5. Complete separation was achieved within 50 min under these conditions. Water-soluble thiohydantoins of arginine and histidine were identified with a Waters C 18 column eluted as described above. Except for the Cit-T2A peptide, of which only a small amount was obtained, identifications were confirmed by gas chromatography before and after trimethylsilylation [18], thin-layer chromatography on Merck's precoated plates for nanochromatography and, in some cases, amino acid analysis after hydrolysis of thiohydantoins in hydroiodic acid [19].

Results

The strategy adopted to elucidate the sequence of horse colipase B by automated analysis is outlined in Fig. 1. The technique was applied to the intact chain and to only four peptides. Two of these peptides were CNBr fragments. The others were obtained by tryptic digestion of the citraconylated chain and of the unreduced protein, as reported in detail below.

N-terminal sequence of the intact chain. As shown by Fig. 1, automated degradation of the reduced-carboxymethylated protein was interrupted after the 22nd residue, which provided a good overlap with peptide CNBr I. Up to this point, our results fully agreed with those previously reported by Julien et al. [11] for the same region of the protein.

Purification and sequence of CNBr peptides. The peptide mixture resulting from chain fragmentation by CNBr was applied to Sephadex G-50 superfine and chromatographed into four fractions. The first, which contained incompletely degraded material, was discarded. The others, designated CNBr I, CNBr II and CNBr III by order of emergence from the column, corresponded to the three peptides arising from normal chain cleavage at the two methionines of the chain. CNBr II (18 residues) was N-terminal, since its amino acid composition was identical to that of the sequence determined on the intact chain up to Met18. CNBr III, which was the only peptide devoid of homoserine lactone, was considered C-terminal. It contained 15 residues, to which the C-terminal arginine detected in the intact protein by carboxypeptidase digestion should be added. As discussed later in more detail for the extreme C-terminal fragment of the unreduced protein detached by limited trypsinolysis, this arginine is lost during trypsin attack. It can be detected in the digests by Sephadex chromatography. Therefore, a total of 16 residues was assigned to CNBr III. In spite of their similar length, CNBr II and CNBr III were easily separated on Sephadex G-50, due probably to the presence of two phenylalanines in the latter peptide. CNBr I, with 62 residues, was the intermediary fragment between the two methionines. Strong absorption at 280 nm reflected the presence in this peptide of the tryptophan and the three tyrosines of the horse cofactor. The total number of residues found in the CNBr peptides (18 + 62 + 16 = 96) is in good agreement with the value derived from amino acid analysis of the complete protein [10,11].

CNBr II was not sequenced, since the necessary information on the N-terminal part of the chain was already known for the intact molecule (see above). By contrast, the arrangement of the first 50 residues in CNBr I was elucidated by automated Edman degradation (Fig. 1), thus extending the known sequence up to Gln68 and leaving only a short unknown interval of about 10 residues before the C-terminal CNBr III. As already reported [10], this latter peptide was sequenced up to the 14th residue, found to be a glutamic acid. Amino acid analysis showed that only one Glx was present in the rest of the chain.

[Diagram not reproduced: it shows the complete chain, the CNBr peptides (cleavage at Met18-Asn19 and Met78-Asn79), the Cit-T2A peptide and the peptides obtained by limited trypsinolysis.]

Fig. 1. Diagrammatic representation of peptide fragments used in sequence analysis of horse colipase B. Arrows indicate regions sequenced by automated Edman degradations (→) and carboxypeptidase digestion (←). The total number of residues in the protein is not 94, but 96, because of two insertions (Ala36A and Ser48A) compared to the porcine cofactor chain taken as reference. The C-terminal arginine identified in the intact protein is lost during digestion with trypsin (see text).

Tryptic peptides. The missing C-terminal parts of CNBr I and CNBr III, as well as the overlap between these fragments, were given by two tryptic peptides: an arginine peptide resulting from digestion of the citraconylated protein and a short C-terminal peptide obtained by limited trypsinolysis of the unreduced molecule. The first method took advantage of the presence of two arginines (Arg63 in CNBr I and Arg(n-5) in CNBr III) on both sides of the desired overlap. The reduced-carboxymethylated cofactor was citraconylated and digested by trypsin under the conditions reported in the Methods section. The peptides were filtered through Sephadex G-50 superfine and yielded four fractions, Cit-T1, Cit-T2, Cit-T3 and Cit-T4. N-terminal residue determination and amino acid analysis showed that Cit-T1 and Cit-T3 were pure and that they derived, respectively, from segments Gly6-Arg37 and Glu31-Arg37 in the chain. The coexistence of these two peptides suggested that the bond Arg30-Glu31 was incompletely cleaved by trypsin, due probably to the proximity of a negative charge.
Cit-T4, with two N-terminal residues (Val and Ser), was probably a mixture of short peptides tentatively identified as the N-terminal peptide Val1-Arg5 and the C-terminal peptide Ser-(Glx)2-3 (see below). Special attention was paid to the impure fraction Cit-T2, which contained the two phenylalanines of the expected overlapping peptide. Chromatography of this fraction on DEAE-cellulose at pH 8.5 yielded two peaks. The first was divided into three subfractions, Cit-T2A, Cit-T2B and Cit-T2C, which were analyzed separately. Fraction Cit-T2D, which formed a symmetrical peak, was also analyzed. The fraction Cit-T2A, with the two phenylalanine markers, was obtained in small quantity due to restricted solubility. Nevertheless, it could be sequenced up to Phe82 with a single gap at position 80, thus completing the CNBr I formula and providing an overlap with CNBr III. Starting with 6.2 nmol of reacting Cit-T2A, 19 cycles could be realized with a repetitive yield better than 90%.

Finally, the arrangement of the last C-terminal residues in colipase B was elucidated by limited trypsinolysis of the unreduced protein. Filtration of the resulting peptide mixture through Sephadex G-25 Superfine in 60 mM ammonium bicarbonate, pH 8.5, showed three peaks. The first contained a relatively large fragment with a molecular weight similar to that of the intact molecule and an N-terminal glycine. The second was formed by two short peptides, which were separated by electrophoresis-chromatography on cellulose plates (see Methods) and identified by amino acid analysis, respectively, as the N-terminal pentapeptide Val1-Arg5 (yield, 51%) and Ser-(Glx)2-3 (yield, 36%). Ser and Glu were sequentially detached from this latter peptide by automated degradation, thus providing a satisfactory overlap with the end of the known CNBr III sequence. The number of Glx in this peptide could not be definitely ascertained due to variable serine recovery after acid hydrolysis. But, as reported, the amino acid composition of CNBr III was consistent with a single Glx after Glu92. The third peak separated on Sephadex contained an amount of free arginine (yield, 32%) equivalent to that of the peptide, thus definitely proving that the extreme C-terminal sequence of horse colipase B is Ser-Glu-Glx-Arg-COOH.
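The way the C-terminal end is pieced together from overlapping fragments can be sketched as a suffix/prefix merge. The sketch below is an illustration under stated assumptions, not the authors' procedure. The residues 79-92 are those read from the horse B row of Fig. 2 in this text, written in one-letter code with `Z` standing for Glx, and the helper `merge_by_overlap` with its `min_overlap` parameter is hypothetical.

```python
def merge_by_overlap(left, right, min_overlap=2):
    """Append `right` to `left` using their longest suffix/prefix overlap."""
    longest = min(len(left), len(right))
    for k in range(longest, min_overlap - 1, -1):
        if left[-k:] == right[:k]:
            return left + right[k:]
    raise ValueError("no overlap of at least %d residues" % min_overlap)


# CNBr III as sequenced through its 14th residue (Asn79 ... Glu92),
# as read from the horse B row of Fig. 2.
cnbr3_sequenced = "NTNFGICFDAARSE"

# Short peptide from limited trypsinolysis: Ser-Glu-Glx (Z = Glx).
tryptic_c_terminal = "SEZ"

# The shared Ser-Glu places the tryptic peptide at the end of CNBr III.
merged = merge_by_overlap(cnbr3_sequenced, tryptic_c_terminal)

# The free arginine recovered in the third Sephadex peak is the last residue.
cnbr3_complete = merged + "R"

print(cnbr3_complete, len(cnbr3_complete))
```

The merged fragment has 16 residues, the number assigned to CNBr III in the text, and ends in Ser-Glu-Glx-Arg.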
The peptide Ser-Glu-Glx itself was isoelectric at pH 4.4, as shown by electrophoresis during purification [10]. This information was not sufficient to decide whether the peptide contained two glutamic acid residues or one glutamic acid and one glutamine. The intention of limited proteolysis is to cleave the two extreme segments of the chain located outside the disulfide bridge network, while the rest of the molecule is probably protected against tryptic attack by its strong reticulation.
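The residue bookkeeping used in the Results (pig-based numbering up to 94, two insertions labelled 36A and 48A, CNBr cleavage after Met18 and Met78) can be checked with a short sketch. The function names are hypothetical, and the position of the second methionine (78) is taken from the Met-Asn 78-79 label of Fig. 1.

```python
def numbering_labels(last=94, insertions=("36A", "48A")):
    """Return position labels 1..last with each insertion placed after its parent."""
    inserted_after = {}
    for label in insertions:
        parent = int(label[:-1])  # "36A" -> 36
        inserted_after.setdefault(parent, []).append(label)
    labels = []
    for n in range(1, last + 1):
        labels.append(str(n))
        labels.extend(inserted_after.get(n, []))
    return labels


def fragment_lengths(labels, cut_after):
    """Lengths of the pieces obtained by cutting after each label in `cut_after`."""
    lengths = []
    start = 0
    for label in cut_after:
        end = labels.index(label) + 1
        lengths.append(end - start)
        start = end
    lengths.append(len(labels) - start)
    return lengths


labels = numbering_labels()
# CNBr cleaves after methionine: Met18 and Met78 in the pig-based numbering.
lengths = fragment_lengths(labels, cut_after=["18", "78"])

print(len(labels))  # total residues in the chain
print(lengths)      # N-terminal, intermediary and C-terminal CNBr fragments
```

Under these assumptions the three pieces correspond to CNBr II, CNBr I and CNBr III, and the two insertions both fall in the intermediary fragment, which is why it counts 62 residues although it spans only positions 19-78.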

Discussion

The sequence of the 96 residues of horse colipase B is given in Fig. 2 and compared to those of the first 89 residues of the porcine cofactor [3] and the 55 residues of horse colipase A [11]. Results presented in Fig. 2 confirm the two insertions reported by Julien et al. [11] in the N-terminal part of horse colipases A and B compared to the pig cofactor. These insertions are designated Ala36A and Ser48A, as proposed by Hartley for the serine protease family [20], with the result that the chain is composed of 96 residues instead of the 94 indicated in Figs. 1 and 2. No other insertion or deletion was found in the rest of the chain up to the 89th residue, after which the pig colipase sequence is still uncertain. The proposal made by Canioni et al. [8], to add just after this residue the four amino acids split off by carboxypeptidase from the intact chain, would lead to an interesting homology with the horse at positions 90 and 91. However, the porcine cofactor would under these conditions be shorter by three residues than its horse analog, in contradiction with all published amino acid analysis data. More work is necessary to clear up this point.

[Alignment not reproduced: rows for pig colipase, horse colipase B and horse colipase A, numbered 1-94 with the insertions 36A and 48A. Legible rows for the C-terminal region:
Pig (78-89): Thr-Asn-Thr-Asn-Phe-Gly-Ile-Cys-His-Asn-Val-Gly ...
Horse B (78-94): Met-Asn-Thr-Asn-Phe-Gly-Ile-Cys-Phe-Asp-Ala-Ala-Arg-Ser-Glu-Glx-Arg]

Fig. 2. Amino acid sequence of horse colipase B. The sequence of pig colipase [3] is taken as reference for residue numbering. See Fig. 1 for the two insertions Ala36A and Ser48A. The total number of residues in the chain is 96. Invariant residues in the pig and the two horse proteins are enclosed by boxes (solid line). An interrupted line is used when homology is restricted to horse A and B.

A high degree of homology (93%) was found between the first 55 residues of horse colipases A and B. The two non-conservative substitutions (Leu-Met at position 18 and Gln-Arg at position 30) already mentioned by Julien et al. [11] were confirmed. By contrast, the substitution Gln-Glu at position 22 was not confirmed. According to the present analysis, the 22nd residue in horse B is a glutamine. A point of great interest was that the substitution His-Thr at position 29 was also not confirmed, this position being occupied by a histidine in both horse A and B. The single histidine of horse B has already been found in the intermediary CNBr peptide [12] and not in the C-terminal CNBr III [10], thus demonstrating that His86 present in the pig cofactor is not conserved in the horse. If Gln29 and His30 are assumed to be inverted in the published pig colipase sequence [3], a histidine would be present at position 29 of the three proteins. This histidine would in no case be adjacent to an acidic residue [7], and it would be located close to a characteristic structure composed of two contiguous half-cystines (Cys27 and Cys28). This may orient our ideas concerning the folding of the colipase chain [8].

If comparison is extended to the first 55 residues in the three proteins (pig, horse A and horse B), homology decreases to 75%, with eight non-conservative and six conservative substitutions plus the two insertions already mentioned. Significant conservative substitutions are those of Ile7 and Phe50 in the pig by, respectively, a valine and a tryptophan in the horse.

Considering now sequences and not merely individual residues, large pieces appear to be identical or very similar in the three proteins, apart from the region 29-33 and the extreme C-terminal end. A striking homology is noted in the sequences between Val1 and Gly6, with a possible extension to Leu11, the substitution at position 7 being highly conservative. A special role has been suggested for the N-terminal part of colipase in interfacial binding in the presence of phospholipids [21]. It is also remarkable that positions 7, 8 and 9 are invariably occupied by strongly hydrophobic residues in the three proteins and also in human colipase [22]. Another interesting region is that extending from Ala49 to Cys65, where strict homology between the pig and horse B cofactors is only interrupted by three conservative substitutions and one non-conservative substitution. This region includes the invariant Tyr53, Tyr56 and Tyr57, the latter two being involved in interface recognition [8,9]. It also includes residue Trp50 in horse colipase B.
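The paper does not state how its homology percentages were computed. One common convention, shown here purely as an assumption, is percent identity over an alignment of equal length in which an inserted residue faces a gap character. The function name and the toy strings below are hypothetical and are not colipase sequences.

```python
def percent_identity(a, b, gap="-"):
    """Percentage of aligned columns in which both sequences carry the same residue."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    # Ignore columns where both rows are gaps; they carry no information.
    columns = [(x, y) for x, y in zip(a, b) if not (x == gap and y == gap)]
    identical = sum(1 for x, y in columns if x == y)
    return 100.0 * identical / len(columns)


# Toy alignment with one substitution in ten columns.
toy_1 = percent_identity("ACDEFGHIKL", "ACDEFGHVKL")

# Toy alignment in which the second sequence has one inserted residue.
toy_2 = percent_identity("ACD-EF", "ACDGEF")

print(toy_1)            # one mismatch out of ten columns
print(round(toy_2, 1))  # the insertion counts as a non-identical column
```

Whether insertions and conservative substitutions count against the score changes the figure, so values such as those quoted in the text are comparable only under the same convention.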
The ultraviolet and intrinsic fluorescence spectra of Trp50 are modified, in the same way as those of the tyrosines, by the presence of lipid interfaces [12]. It has been suggested that lysine residues in the colipase core and acidic residues in the tails participate, respectively, in interfacial binding of colipase and in lipase recognition [23]. The present analysis shows that the four lysines of the pig cofactor are conserved in horse B. As for the external acidic residues located in the tails, Asp3 and Glu92, not present in active colipase II [3], and Glu13, not conserved in the horse, must be excluded, thus focusing attention onto Asp12 (or Glu12) and Glu15. The significance of these and other homologies will become apparent as more is discovered about the essential residues in colipase.

Acknowledgements

We are indebted to Drs. Mireille Rovery and Josiane Bianchetta-de Caro for helpful discussions and to Mrs. Andrea Guidoni for performing amino acid analysis. Financial help from University Aix-Marseille I is gratefully acknowledged.

References

1 Chapus, C., Sari, H., Sémériva, M. and Desnuelle, P. (1975) FEBS Lett. 58, 155-158
2 Sémériva, M. and Desnuelle, P. (1979) Adv. Enzymol. 48, 319-370
3 Charles, M., Erlanson, C., Bianchetta, J., Joffre, J., Guidoni, A. and Rovery, M. (1974) Biochim. Biophys. Acta 359, 186-197
4 Erlanson, C., Charles, M., Astier, M. and Desnuelle, P. (1974) Biochim. Biophys. Acta 359, 198-203
5 Sari, H., Entressangles, B. and Desnuelle, P. (1975) Eur. J. Biochem. 58, 561-565
6 Sari, H., Granon, S. and Sémériva, M. (1978) FEBS Lett. 95, 229-234
7 Canioni, P. and Cozzone, P. (1979) Biochimie 61, 343-354
8 Canioni, P., Cozzone, P. and Sarda, L. (1980) Biochim. Biophys. Acta 621, 29-42
9 Canioni, P., Cozzone, P. and Kaptein, P. (1980) FEBS Lett. 111, 219-222
10 Chapus, C., Desnuelle, P. and Foglizzo, E. (1981) Eur. J. Biochem. 115, 99-105
11 Julien, R., Béchis, G., Grégoire, J., Rathelot, J., Rochat, H. and Sarda, L. (1980) Biochem. Biophys. Res. Commun. 95, 1245-1252
12 Granon, S., Rahmani-Jourdheuil, D., Desnuelle, P. and Chapus, C. (1981) Biochem. Biophys. Res. Commun. 99, 114-119
13 Hartley, B.S. (1970) Biochem. J. 119, 805-822
14 Edman, P. and Begg, G. (1967) Eur. J. Biochem. 1, 80-91
15 Hermodson, M.A., Ericsson, L.M., Titani, K., Neurath, H. and Walsh, K.A. (1972) Biochemistry 11, 4493-4502
16 Bonicel, J., Bruschi, M., Couchoud, P. and Bovier-Lapierre, G. (1977) Biochimie 59, 111-113
17 Wittmann-Liebold, B. (1973) Hoppe-Seyler's Z. Physiol. Chem. 354, 1415-1431
18 Pisano, J.J., Bronzert, T.J. and Brewer, M.B.J. (1972) Anal. Biochem. 45, 43-59
19 Smithies, O., Gibson, D., Fanning, E.M., Goodfliesh, R.M., Gilman, J.G. and Ballantyne, D.L. (1971) Biochemistry 10, 4912-4921
20 Hartley, B.S. (1970) Philos. Trans. R. Soc. Lond. Ser. B 257, 77-87
21 Borgstrom, B., Wieloch, T. and Erlansson-Albertson, C. (1979) FEBS Lett. 108, 407-410
22 Sternby, B. and Borgstrom, B. (1979) Biochim. Biophys. Acta 572, 235-243
23 Erlansson, C., Barrowman, J.A. and Borgstrom, B. (1977) Biochim. Biophys. Acta 489, 150-162