The complete amino acid sequence of human complex-forming glycoprotein heterogeneous in charge (protein HC) from one individual


ARCHIVES OF BIOCHEMISTRY AND BIOPHYSICS Vol. 228, No. 2, February 1, pp. 544-554, 1984

The Complete Amino Acid Sequence of Human Complex-Forming Glycoprotein Heterogeneous in Charge (Protein HC) from One Individual¹

CARLOS LÓPEZ OTÍN, ANDERS O. GRUBB,* AND ENRIQUE MÉNDEZ²

Servicio de Endocrinología, Centro Ramón y Cajal, Carretera de Colmenar Km 9.1, Madrid 34, Spain, and *Department of Clinical Chemistry, University of Lund, Malmö General Hospital, S-214 01 Malmö, Sweden

Received April 21, 1983, and in revised form August 8, 1983

The complete amino acid sequence of the single polypeptide chain of human complex-forming glycoprotein heterogeneous in charge (protein HC) isolated from a single individual is reported with the supporting data. The primary structure was determined by automatic degradation of the intact chain and of fragments obtained by chemical and enzymatic degradations of the native or reduced and S-carboxymethylated protein. The polypeptide chain of protein HC contained 182 amino acid residues with a calculated molecular weight of 20,621. No amino acid sequence variability was found, and such variability therefore cannot explain the great charge heterogeneity of protein HC in a single individual. The amino acid sequence of protein HC was nearly identical to the one reported for human α1-microglobulin in a research communication but contained 15 additional residues.

Human complex-forming glycoprotein heterogeneous in charge (protein HC)³ is a recently described glycoprotein originally isolated from normal human urine (1). It carries an unidentified yellow-brown chromophore material and has been immunochemically demonstrated to occur in human blood plasma, where a considerable part of the immunoreactivity is complexed with IgA (1). Although protein HC forms a single band on dodecyl sulfate-polyacrylamide gel electrophoresis, it displays an appreciable charge heterogeneity on agarose gel electrophoresis and on isoelectric focusing which does not diminish after desialylation (1). Protein HC is immunochemically and physicochemically related to two other recently described glycoproteins, α1-microglobulin (2-6) and α1-microglycoprotein (7), both of which were isolated from urine of patients with renal tubular dysfunction. In order to elucidate the charge heterogeneity and the complex-forming tendency of protein HC as well as its exact relation to α1-microglobulin and α1-microglycoprotein, studies were undertaken to determine its amino acid sequence. The present article describes the complete amino acid sequence of the single polypeptide chain of protein HC isolated from the urine of one single individual. A preliminary report on parts of the present work has been published (8).

¹ This investigation was supported by grants from El Fondo de Investigaciones Sanitarias de la Seguridad Social (Spain), and the Medical Faculty, University of Lund, Sweden, Direktör A. Påhlssons Stiftelse, A. Österlunds Stiftelse, and the Swedish Medical Research Council (Project B82-13X-05196-05A).
² To whom correspondence should be addressed.
³ Abbreviations used: BNPS-skatole, 2-(2-nitrophenylsulfenyl)-3-methyl-3-bromoindolenine; dansyl-, 5-dimethylaminonaphthalene-1-sulfonyl-; protein HC, human complex-forming glycoprotein, heterogeneous in charge; PTH, phenylthiohydantoin; TPCK, L-1-tosylamide-2-phenylethylchloromethyl ketone; IgA, immunoglobulin A.

0003-9861/84 $3.00. Copyright © 1984 by Academic Press, Inc. All rights of reproduction in any form reserved.

EXPERIMENTAL PROCEDURES

The Experimental Procedures and parts of the Results are presented as a Miniprint Supplement immediately following this paper.

RESULTS

The amino acid sequence of the polypeptide chain of protein HC (Fig. A) was deduced from automated degradation of the intact reduced and carboxymethylated polypeptide chain and of fragments obtained by chemical and/or enzymatic cleavage. The fragments used for the derivation of the sequence are shown in Fig. B. Details are given for all cyanogen bromide fragments, but only for those tryptic and BNPS-skatole fragments needed to provide the necessary overlaps. Details concerning the purification of the fragments, their amino acid compositions (Tables 2, 8, and 13), and the results of the sequenator degradations (Tables 3-7, 9-12, and 14-16) are given in the Supplement, in which tables and figures are referred to by arabic numerals. All these fragments were purified to homogeneity (see Experimental Procedures) and their sequences agreed with their amino acid compositions. All amino-terminal sequence determinations were made by automatic Edman degradation except for one fragment, CNS3, which was sequenced by the dansyl-Edman technique.

The single polypeptide chain of protein HC was found to contain 182 amino acid residues with a calculated molecular weight of 20,621. According to amino acid analysis (Table A), protein HC contains five methionine residues, and at least six cyanogen bromide fragments should therefore be obtained. Gel filtration of the cyanogen bromide reaction mixture on Sephacryl S-200 resulted finally in the isolation of six different fragments (Figs. 1-3). However, one of them, CNP1a, from position 41 to 162, contained three methionine residues (Table 2). Partial cleavages at this region of the molecule could therefore explain the relatively low yield obtained for some of the cyanogen bromide fragments (Table 2) and the absence of the one from position 44 to 61, which is contained in this large cyanogen

bromide fragment, CNP1a. The six cyanogen bromide fragments were used to establish the major part of the amino acid sequence (Fig. B). The alignment of the cyanogen bromide fragments, and the identification of those residues which could not be established by sequence analysis of the cyanogen bromide fragments, were provided for by automatic degradation of four tryptic peptides (T14, T1b2, T1a2, and T1b1) and two BNPS-skatole fragments (BN1 and BN2) (Fig. B). Fragment CNS1b was identified as the carboxyl-terminal cyanogen bromide fragment, since it was the only cyanogen bromide fragment which was devoid of homoserine (Table 2). In addition, when CNS1b and the intact protein HC were digested with carboxypeptidase B, in both cases the only amino acid released was arginine.

DISCUSSION

The proposed sequence of the polypeptide chain of protein HC is well documented since all residues except five (at positions 56, 116-117, and 128-129) were identified in at least two different fragments obtained by chemical and enzymatic cleavages of the chain (8). In addition, all tryptic (Table 8) and a large number of peptic and chymotryptic peptides were isolated and sequenced during the present study and none of these had a sequence incompatible with the structure in Fig. A. The peptides shown in Fig. B and reported in detail are only those that were required to deduce the complete sequence. Furthermore, the amino acid composition of protein HC calculated from the sequence (Fig. A) conforms to the one obtained after acid hydrolysis (Table A).

  1  Gly-Pro-Val-Pro-Thr*-Pro-Pro-Asp-Asn-Ile-
 11  Gln-Val-Gln-Glu-Asn-Phe-Asn*-Ile-Ser-Arg-
 21  Ile-Tyr-Gly-Lys-Trp-Tyr-Asn-Leu-Ala-Ile-
 31  Gly-Ser-Thr-Cys-Pro-Trp-Leu-Lys-Ile-Met-
 41  Asp-Arg-Met-Thr-Val-Ser-Thr-Leu-Val-Leu-
 51  Gly-Glu-Gly-Ala-Thr-Glu-Ala-Glu-Ile-Ser-
 61  Met-Thr-Ser-Thr-Arg-Trp-Arg-Lys-Gly-Val-
 71  Cys-Glu-Glu-Thr-Ser-Gly-Ala-Tyr-Glu-Lys-
 81  Thr-Asp-Thr-Asp-Gly-Lys-Phe-Leu-Tyr-His-
 91  Lys-Ser-Lys-Trp-Asn*-Ile-Thr-Met-Glu-Ser-
101  Tyr-Val-Val-His-Thr-Asn-Tyr-Asp-Glu-Tyr-
111  Ala-Ile-Phe-Leu-Thr-Lys-Lys-Phe-Ser-Arg-
121  His-His-Gly-Pro-Thr-Ile-Thr-Ala-Lys-Leu-
131  Tyr-Gly-Arg-Ala-Pro-Gln-Leu-Arg-Glu-Thr-
141  Leu-Leu-Gln-Asp-Phe-Arg-Val-Val-Ala-Gln-
151  Gly-Val-Gly-Ile-Pro-Glu-Asp-Ser-Ile-Phe-
161  Thr-Met-Ala-Asp-Arg-Gly-Glu-Cys-Val-Pro-
171  Gly-Glu-Gln-Glu-Pro-Glu-Pro-Ile-Leu-Ile-
181  Pro-Arg

FIG. A. Amino acid sequence of the polypeptide chain of human protein HC. Residues 71 and 168 form an intrachain disulfide bridge while residue 34, at least partly, is blocked by a cysteine residue. Residues 5, 17, and 95 (marked with asterisks) probably carry carbohydrate side chains.
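The statement that the composition calculated from the sequence (Fig. A) conforms to Table A can be checked mechanically. The Python sketch below is an illustration added for that purpose, not part of the original analysis. It tallies residues in a one-letter transcription of Fig. A and compares them with the "Sequence" column of Table A. The names `PROTEIN_HC`, `POOLED_NAME`, `TABLE_A_SEQUENCE`, and `composition` are assumptions made for this example. Asp is pooled with Asn and Glu with Gln, because acid hydrolysis converts the amides to the acids and the table reports each pair as one value. The amide assignments in the transcription should be treated as an assumption; they do not affect the pooled counts.

```python
from collections import Counter

# One-letter transcription of Fig. A, ten residues per chunk
# (assumption: amide vs. acid assignments are a best reading of the figure).
PROTEIN_HC = "".join([
    "GPVPTPPDNI", "QVQENFNISR", "IYGKWYNLAI", "GSTCPWLKIM", "DRMTVSTLVL",
    "GEGATEAEIS", "MTSTRWRKGV", "CEETSGAYEK", "TDTDGKFLYH", "KSKWNITMES",
    "YVVHTNYDEY", "AIFLTKKFSR", "HHGPTITAKL", "YGRAPQLRET", "LLQDFRVVAQ",
    "GVGIPEDSIF", "TMADRGECVP", "GEQEPEPILI", "PR",
])

# Map each one-letter code to the name used for tallying.
# Asp/Asn and Glu/Gln are pooled as Asx and Glx.
POOLED_NAME = {
    "K": "Lys", "H": "His", "R": "Arg", "D": "Asx", "N": "Asx",
    "T": "Thr", "S": "Ser", "E": "Glx", "Q": "Glx", "P": "Pro",
    "G": "Gly", "A": "Ala", "C": "Cys", "V": "Val", "M": "Met",
    "I": "Ile", "L": "Leu", "Y": "Tyr", "F": "Phe", "W": "Trp",
}

# "Sequence" column of Table A (aspartic acid -> Asx, glutamic acid -> Glx,
# carboxymethyl cysteine -> Cys).
TABLE_A_SEQUENCE = {
    "Lys": 10, "His": 4, "Arg": 10, "Asx": 14, "Thr": 17, "Ser": 10,
    "Glx": 21, "Pro": 12, "Gly": 14, "Ala": 9, "Cys": 3, "Val": 11,
    "Met": 5, "Ile": 13, "Leu": 11, "Tyr": 8, "Phe": 6, "Trp": 4,
}


def composition(seq):
    """Count residues in seq, pooling amides with their acids."""
    return dict(Counter(POOLED_NAME[aa] for aa in seq))


if __name__ == "__main__":
    counts = composition(PROTEIN_HC)
    print("chain length:", len(PROTEIN_HC))
    for name, expected in TABLE_A_SEQUENCE.items():
        print(f"{name}: counted {counts.get(name, 0)}, Table A {expected}")
```

The comparison only tests internal consistency between the figure and the table; it says nothing about the hydrolysis values in the "Analysis" column.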

FIG. B. Scheme of the polypeptide chain of protein HC and the fragments used in the derivation of its amino acid sequence. Solid lines indicate regions sequenced. CN, cyanogen bromide fragments; BN, fragments isolated after BNPS-skatole cleavage; T, tryptic peptides.
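The overlap strategy summarized in Fig. B can be illustrated with an in-silico digestion. The Python sketch below is an added illustration under stated assumptions, not the authors' procedure. It splits a one-letter transcription of Fig. A on the carboxyl side of Met (cyanogen bromide) and on the carboxyl side of Lys or Arg unless Pro follows (a common simplification of trypsin specificity). It then lists the tryptic peptides that contain a Met together with the residue after it, since such peptides can link two adjacent cyanogen bromide fragments. The names `PROTEIN_HC`, `cleave_after`, and `spans_met_junction` are hypothetical. Cleavage is assumed to be complete, whereas the text reports partial cleavage (fragment CNP1a, residues 41-162).

```python
# One-letter transcription of Fig. A, ten residues per chunk
# (assumption: only the Met, Lys, Arg, and Pro positions matter here).
PROTEIN_HC = "".join([
    "GPVPTPPDNI", "QVQENFNISR", "IYGKWYNLAI", "GSTCPWLKIM", "DRMTVSTLVL",
    "GEGATEAEIS", "MTSTRWRKGV", "CEETSGAYEK", "TDTDGKFLYH", "KSKWNITMES",
    "YVVHTNYDEY", "AIFLTKKFSR", "HHGPTITAKL", "YGRAPQLRET", "LLQDFRVVAQ",
    "GVGIPEDSIF", "TMADRGECVP", "GEQEPEPILI", "PR",
])


def cleave_after(seq, residues, blocked_by=""):
    """Split seq on the carboxyl side of any residue in `residues`.

    A site is skipped when the following residue is in `blocked_by`.
    Returns 1-based, inclusive (first, last) position ranges.
    """
    ranges = []
    first = 1
    for pos, aa in enumerate(seq, start=1):
        nxt = seq[pos] if pos < len(seq) else ""
        blocked = nxt != "" and nxt in blocked_by
        if aa in residues and not blocked:
            ranges.append((first, pos))
            first = pos + 1
    if first <= len(seq):
        ranges.append((first, len(seq)))
    return ranges


def spans_met_junction(seq, first, last):
    """True if the peptide holds a Met and the residue that follows it."""
    return any(seq[p - 1] == "M" for p in range(first, last))


# Complete cyanogen bromide cleavage: one cut after each Met.
cnbr = cleave_after(PROTEIN_HC, "M")
# Simplified trypsin rule: cut after Lys/Arg unless Pro follows.
tryptic = cleave_after(PROTEIN_HC, "KR", blocked_by="P")
# Tryptic peptides able to bridge two cyanogen bromide fragments.
overlaps = [r for r in tryptic if spans_met_junction(PROTEIN_HC, *r)]

if __name__ == "__main__":
    print("CNBr fragments:", cnbr)
    print("tryptic peptides spanning a Met junction:", overlaps)
```

Under these assumptions the sketch gives six cyanogen bromide fragments (1-40, 41-43, 44-61, 62-98, 99-162, 163-182) and four tryptic peptides that span a Met junction. The counts agree with the six fragments expected from five methionines and with the number of tryptic peptides named in the text, but the sketch does not show that the predicted peptides are the ones that were isolated.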

The residues at positions 5, 17, and 95 were obtained in low yields but their identities were corroborated by the amino acid compositions of the corresponding tryptic and chymotryptic peptides. Carbohydrate prosthetic groups are probably bound at these residues since amino sugars were found in the corresponding peptides by amino acid analysis. In addition, it should be noted that α1-microglobulin has been reported to have glycosylation sites at homologous positions (27).

Although native protein HC has been reported to contain four cysteine residues (24), the present work could only establish the positions of three carboxymethylcysteine residues in the reduced and S-carboxymethylated polypeptide chain of the protein. Two of these (at positions 71 and 168) have earlier been shown to form an intrachain disulfide bridge by the diagonal map technique (24), but no cysteic acid-containing peptide related to the cysteine residue at position 34 could be found. Native protein HC does not contain any free sulfhydryl groups, but small amounts of free cysteic acid (0.10 mol/mol) and a cysteic acid-containing peptide of unknown sequence (0.01 mol/mol) can be released from native protein HC by treatment with performic acid (24). Since free cysteic acid also was released upon performic acid treatment of the native amino-terminal cyanogen bromide fragment CNS2a (Fig. 33) comprising residues 1-40 of the polypeptide chain, the cysteine residue at position 34 probably is

involved in the binding of small amounts of cysteine to native protein HC. However, since the release of cysteic acid from native protein HC and native CNS2a only amounted to about 10% of the molar concentrations of the parent molecules, it is tempting to assume that other components of unknown structure are bound to the cysteine residue at position 34, with a blocking of its sulfhydryl group as a result.

TABLE A
AMINO ACID COMPOSITION OF HUMAN PROTEIN HC

Amino acid                  Analysisᵃ    Sequence
Lysine                        10.2          10
Histidine                      3.9           4
Arginine                       9.3          10
Aspartic acid                 15.2          14
Threonineᵇ                    15.1          17
Serineᵇ                       10.7          10
Glutamic acid                 22.4          21
Proline                       12.0          12
Glycine                       13.1          14
Alanine                        9.5           9
Carboxymethyl cysteine         2.8           3
Valineᶜ                       10.3          11
Methionine                     4.8           5
Isoleucine                    13.0          13
Leucine                       12.2          11
Tyrosine                       8.4           8
Phenylalanine                  7.9           6
Tryptophanᵈ                    1.4           4

ᵃ Except where noted, all figures are averages of values from one 24-h, one 48-h, and one 72-h hydrolysate, assuming 182 residues in the polypeptide chain.
ᵇ Values obtained by extrapolation to zero hours hydrolysis.
ᶜ Seventy-two-hour hydrolysis value only.
ᵈ Determined by hydrolysis with 3 M p-toluenesulfonic acid (15).

One of the most conspicuous properties of protein HC is its charge heterogeneity. Although the preparation of protein HC used in this work was isolated from the urine of one single individual, it displayed a remarkable heterogeneity, like all other described preparations of protein HC and the related proteins α1-microglobulin and α1-microglycoprotein. In spite of this charge heterogeneity, no evidence could be found during the present sequence study of more than one amino acid residue at any position of the single polypeptide chain of the protein. It seems highly unlikely that a sequence variability of the polypeptide chain would have passed unobserved in the present study, since the chain was cleaved in five different ways and more than 200 different fragments were isolated and investigated. It should be stressed, however, that a contribution to the charge heterogeneity of protein HC by a varying amidation of its acidic side chains cannot be ruled out by the reported experiments, since some peptides were obtained in very low yields. On the other hand, no positive evidence of such a contribution to the heterogeneity was found.

If, therefore, a polymorphism of the amino acid sequence of the polypeptide chain of protein HC cannot explain its charge heterogeneity, alternative explanations must be sought. One such explanation might be that a variability of the carbohydrate prosthetic groups causes the charge heterogeneity of the protein.
Little evidence speaks in favor of this alternative, however, since no heterogeneity of the carbohydrate groups of α1-microglobulin could be found in a recent study (25) and since neuraminidase treatment of protein HC and α1-microglobulin, although it reduces their electrophoretic mobility, does not reduce their charge heterogeneity (1, 3). Another reason for the charge heterogeneity of protein HC might be a heterogeneity of the components attached at the cysteine residue at position 34. However, since fully reduced and alkylated protein HC displays virtually the same charge heterogeneity as native protein HC on isoelectric focusing and agarose gel electrophoresis, such a contribution to the heterogeneity is probably insignificant.

A more likely explanation for the charge heterogeneity of protein HC seems to be that the protein carries a varying amount of one or more unidentified substances. Although the chromophore material of protein HC, which gives the protein its yellow-brown color, is very strongly linked to the protein, we have observed that the color of the protein varies considerably when the protein is isolated from different urine samples from the same or different individuals. The color of the lyophilized protein has ranged from almost white to dark brown, and the charge heterogeneity of the protein as estimated by crossed immunoelectrophoresis (1) has displayed a more or less parallel variation, with the most pigmented protein preparation being the most heterogeneous one. Therefore, one of the main reasons for the charge heterogeneity of protein HC appears to be that individual molecules carry varying amounts of the chromophore material. The chromophore(s), which is so tightly linked to native protein HC, does not seem to be attached at only one or two specific points of the polypeptide chain, since a large number of colored peptides from widely separate parts of the chain were isolated during the present work.

The amino acid sequence of the polypeptide chain of human protein HC presented in this work is virtually identical with the one reported in a preliminary communication (8). The only difference is the presence of an additional tryptophan residue at position 36. This residue remained undetected in the earlier studies because it appeared at the end of long sequencer runs and because back-hydrolysis was used to identify residues in this polypeptide area.
In the present work this residue was identified by high-performance liquid chromatography in repeated runs of the intact carboxymethylated protein (Table 1). It was also found to be the carboxyl-terminal residue of the chymotryptic fragment C3a (Table 16).

Computer analysis according to Dayhoff's program (26) of the sequence of protein HC did not reveal any significant homology to any known protein with the exception of human α1-microglobulin, the sequence of which was recently reported by Takagi and co-workers (27). Although the reported sequences for human protein HC and α1-microglobulin are very similar, they differ in 16 positions. The reported α1-microglobulin sequence is devoid of residues corresponding to residue numbers 29-37, 117, 125, and 179-182 of the sequence of protein HC and has threonine instead of a histidine residue at position 122. The significance of these discrepancies is at present uncertain, but it should be mentioned that recent sequence studies in our laboratory have demonstrated the nonapeptide 29-37 to be present in protein HC isolated from a pool of urines as well as from plasma of one single individual. It should, in addition, be pointed out that the isolation and sequence analysis of three different types of fragments from the protein HC preparation used in this work (the cyanogen bromide fragment CNS2a, Table 2; the tryptic peptide TS, Table 8; and the chymotryptic peptide C3a, Table 16) proves beyond reasonable doubt the presence of the cysteine-containing nonapeptide in this protein preparation. The difference between the carboxyl-terminal sequences of protein HC and α1-microglobulin may be due to varying degrees of proteolytic breakdown of the sequenced molecules, since a recent investigation (28) has demonstrated that protein HC isolated from a pool of urines is a mixture of molecules with and without the carboxyl-terminal tetrapeptide of the protein HC preparation of the present work.

ACKNOWLEDGMENTS

The skillful technical assistance of Ms. V. Grimsberg, Mr. F. Soriano, Mr. R. Nilsson, and Mr. L. Hansson and the expert secretarial assistance of Ms. I. Bomark are gratefully acknowledged.

REFERENCES

1. TEJLER, L., AND GRUBB, A. O. (1976) Biochim. Biophys. Acta 439, 82-94.
2. EKSTRÖM, B., PETERSON, P. A., AND BERGGÅRD, I. (1975) Biochem. Biophys. Res. Commun. 65, 1427-1433.
3. EKSTRÖM, B., AND BERGGÅRD, I. (1977) J. Biol. Chem. 252, 8048-8057.
4. SVENSSON, L., AND RAVNSKOV, U. (1976) Clin. Chim. Acta 73, 415-422.
5. BERNIER, I., DAUTIGNY, A., GLATTHAAR, B. E., LERGIER, W., JOLLÈS, J., GILLESSEN, D., AND JOLLÈS, P. (1980) Biochim. Biophys. Acta 626, 188-196.
6. TAKAGI, K., KIN, K., ITOH, Y., KAWAI, T., KASAHARA, T., SHIMODA, T., AND SHIKATA, T. (1979) J. Clin. Invest. 63, 318-325.
7. SEON, B. K., AND PRESSMAN, D. (1978) Biochemistry 17, 2815-2821.
8. LÓPEZ, C., GRUBB, A. O., SORIANO, F., AND MÉNDEZ, E. (1981) Biochem. Biophys. Res. Commun. 103, 919-925.
9. NAKAI, N., LAI, C. Y., AND HORECKER, B. L. (1974) Anal. Biochem. 58, 563-570.
10. MÉNDEZ, E., AND LAI, C. Y. (1975) Anal. Biochem. 65, 281-292.
11. STEERS, E., JR., CRAVEN, G. R., ANFINSEN, C. B., AND BETHUNE, J. L. (1965) J. Biol. Chem. 240, 2478-2484.
12. OMENN, G. S., FONTANA, A., AND ANFINSEN, C. B. (1970) J. Biol. Chem. 245, 1895-1902.
13. AMBLER, R. P. (1972) in Methods in Enzymology (Hirs, C. H. W., and Timasheff, S. N., eds.), Vol. 25, pp. 143-154, Academic Press, New York.
14. MOORE, S. (1963) J. Biol. Chem. 238, 235-237.
15. LIU, T. Y., AND CHANG, Y. H. (1971) J. Biol. Chem. 246, 2842-2848.
16. EDMAN, P., AND BEGG, G. (1967) Eur. J. Biochem. 1, 80-91.
17. THOMSEN, J., BUCHER, D., BRUNFELDT, K., NEXØ, E., AND OLESEN, H. (1976) Eur. J. Biochem. 69, 87-96.
18. TARR, G. E., BEECHER, J. F., BELL, M., AND MCKEAN, D. J. (1978) Anal. Biochem. 84, 622-627.
19. LASKOWSKI, M., JR., AND KOHR, B. Internal Communication No. 03-LS-11-76, Waters Associates, Milford, Mass.
20. MÉNDEZ, E., AND LAI, C. Y. (1975) Anal. Biochem. 68, 47-53.
21. JEPPSSON, J. O., AND SJÖQUIST, J. (1967) Anal. Biochem. 18, 264-269.
22. GRAY, W. R. (1967) in Methods in Enzymology (Hirs, C. H. W., ed.), Vol. 11, pp. 469-475, Academic Press, New York.
23. WOODS, K. R., AND WANG, K. T. (1967) Biochim. Biophys. Acta 133, 369-372.
24. MÉNDEZ, E., GRUBB, A. O., LÓPEZ, C., FRANGIONE, B., AND FRANKLIN, E. C. (1982) Arch. Biochem. Biophys. 213, 240-259.
25. EKSTRÖM, B., LUNDBLAD, A., AND SVENSSON, S. (1981) Eur. J. Biochem. 114, 663-666.
26. DAYHOFF, M. O. (1978) Atlas of Protein Sequence and Structure, Vol. 5, Suppl. 3, National Biomedical Research Foundation, Washington, D. C.
27. TAKAGI, T., TAKAGI, K., AND KAWAI, T. (1981) Biochem. Biophys. Res. Commun. 98, 997-1001.
28. LÓPEZ, C., GRUBB, A., AND MÉNDEZ, E. (1982) FEBS Lett. 144, 349-353.
