ARCHIVES OF BIOCHEMISTRY AND BIOPHYSICS Vol. 243, No. 1, November 15, pp. 184-194, 1985
Characterization of the Subunits of β-Conglycinin¹
J. B. COATES, J. S. MEDEIROS,² V. H. THANH,³ AND N. C. NIELSEN⁴
United States Department of Agriculture and the Department of Agronomy, Purdue University, West Lafayette, Indiana 47907
Received December 6, 1984, and in revised form April 6, 1985
Four subunits of β-conglycinin were purified from soybean cultivar CX 635-1-1-1, and were designated α, α′, β, and β′ in accordance with nomenclature proposed by Thanh and Shibasaki [(1977) Biochim. Biophys. Acta 490, 370-384]. Of these subunits, β′ has not previously been reported or characterized. Consistent with the low levels of methionine in these proteins, cyanogen bromide cleavage of α′, α, and β′ subunits produced only a few fragments. The β subunit contains no methionine and was not cleaved by cyanogen bromide. The NH2-terminal amino acid sequences of the α and α′ subunits are homologous, and each has valine at its amino terminus. The β subunit has a very different NH2-terminal sequence from those of the α and α′ subunits, and has leucine at its amino terminus. The NH2-terminal sequence of the β′ subunit could not be determined, as it appeared to be blocked to Edman degradation. Although α and α′ subunits have similar NH2-terminal sequences, they differ in the number of methionine residues and so yielded different numbers of cyanogen bromide fragments. Two cyanogen bromide fragments (CB-1 and CB-2) were purified from the α subunit. CB-1 originated from the NH2-terminal end of the subunit. The amino acid sequence of CB-2 was identical to that predicted from the nucleotide sequence of cDNA clone pB36. The insert in pB36 encoded 216 amino acids from the COOH-terminal end of the α subunit and contained a 138-bp trailer sequence which was followed by a poly(A) tail. Maps showing the relative positions of methionine residues and carbohydrate moieties in the α and α′ subunits were drawn, based on primary sequence data and on the size and carbohydrate content of the CNBr fragments derived from the subunits. © 1985 Academic Press, Inc.
¹ Cooperative research of USDA-ARS and the Purdue Agricultural Experiment Station. Journal Paper No. 10,106. Mention of a trademark or proprietary product does not constitute a guarantee or warranty of the product by USDA-ARS or Purdue University. Financial support from the USDA Competitive Grants Program and the American Soybean Association Research Foundation is gratefully acknowledged.
² Present address: Dept. de Biologia, Universidade Federal Rural de Pernambuco, Dois Irmaos, Recife 50.000 PE, Brazil.
³ Present address: Dept. of Biological Sciences, Stanford University, Stanford, Calif. 94305.
⁴ To whom correspondence should be addressed.
0003-9861/85 $3.00 Copyright © 1985 by Academic Press, Inc. All rights of reproduction in any form reserved.

Soybeans contain 40-55% protein on a dry weight basis and are an important source of protein for both animals and humans. Early studies reported four major antigenic components among these proteins: glycinin, α-conglycinin, β-conglycinin, and γ-conglycinin (2-4). Glycinin and β-conglycinin have sedimentation coefficients of 11.8 and 7.5 S, respectively. Together they account for 70-80% of the seed protein (5, 6) and make a major contribution to the functional properties of food products made from soybean seeds. Due to their low sulfur-amino acid content and their abundance in the seed, these two proteins are largely responsible for the well-known nutritional deficiency of legume seeds with respect to the essential amino acid, methionine.
β-Conglycinin is a complex protein (a trimer) which exhibits polymorphism in its subunit composition. At least seven different isomeric forms, which consist of different combinations of three subunits, α, α′, and β (7), have been isolated from the variety Raiden (8, 9). In addition to these major subunits, Thanh and Shibasaki observed two minor bands after electrophoretic separation of 7 S protein. They called these γ and δ subunits (8). Although they did not characterize the polypeptides responsible for the γ and δ electrophoretic bands in detail, they did purify the major α, α′, and β subunits, and determined their amino acid composition, NH2-terminal residue, and carbohydrate content (1). As part of a continuing effort in our laboratory to characterize the soybean storage proteins, we have purified these major subunits (α, α′, and β) and a minor subunit (β′) from β-conglycinin and determined their amino acid composition, NH2-terminal sequence, and cyanogen bromide (CNBr⁵) cleavage patterns. These results have been compared with sequence data obtained from a cDNA clone for the α subunit.

MATERIALS AND METHODS
Source of soybeans. β-Conglycinin was isolated from seeds of soybean [Glycine max (L.) Merr.] cultivar CX 635-1-1-1. This is a high-protein selection (~50%) from the soybean breeding program at Purdue University (10). The selection originated from the cross Woodworth × Pando after four generations of single-seed pedigree selection for the high-protein phenotype. The selection has been maintained as selfed seed from a population of these plants through seven generations at the Agronomy Farm, Purdue University, West Lafayette, Indiana. Seeds were harvested at a mature stage prior to the onset of dehydration. They were frozen in liquid nitrogen and stored at -80°C until used for isolation of β-conglycinin.

Purification. Frozen seeds were ground to a fine powder in liquid nitrogen and lipids were removed by extraction with cold acetone as described by Moreira et al. (11). The β-conglycinin subunits were purified by a modification of the procedure of Thanh and Shibasaki (1). Protein was extracted from the acetone-washed powder by stirring at room temperature with Tris buffer (30 mM, pH 8.0, containing 20 mM 2-mercaptoethanol) for 1 h. Insoluble material was removed by centrifugation, and the supernatant was cooled to 0°C and adjusted to pH 6.4 with HCl. After 2 h, precipitated glycinin was removed by centrifugation. The supernatant was adjusted to pH 4.8 at 0°C and allowed to stand for 2 h. This step precipitated β-conglycinin and a small amount of low-molecular-weight proteins. The precipitate was pelleted by centrifugation, washed with the pH 4.8 buffer, and dissolved in a standard buffer (35 mM K-phosphate, pH 7.6, containing 0.4 M NaCl, 0.02% NaN3, and 10 mM 2-mercaptoethanol). Insoluble polymerized proteins were removed by centrifugation, and the supernatant was dialyzed against water at 4°C and lyophilized.

The protein was purified further on a column of Sepharose 6B (2.5 × 100 cm) equilibrated with the standard buffer. The proteins eluted from the column in several peaks, with the major one corresponding to β-conglycinin. In some instances, fractions from this peak were used directly for purification of β-conglycinin subunits. More commonly the protein was passed through a Con A-Sepharose 4B column (2.5 × 20 cm) equilibrated with the standard buffer from which 2-mercaptoethanol had been omitted, in order to remove nonglycosylated proteins. The β-conglycinin subunits, which are glycosylated, were eluted from the affinity column in the standard buffer with 0.1 M α-methyl-D-mannoside, dialyzed against water, and lyophilized.

Lyophilized β-conglycinin was dissolved in K-phosphate buffer (19 mM, containing 6 M urea, 1 mM Na2EDTA, and 0.02% NaN3) at either pH 8.1 or pH 7.0. This sample was applied to a column (2.5 × 50 cm) of DEAE-Sephadex A50 equilibrated with the same buffer. The protein was eluted with a 1.8-liter linear salt gradient between 0 and 0.45 M NaCl (Figs. 1, 2, and 3). The fractions in each peak were pooled, dialyzed against water, and lyophilized.

⁵ Abbreviations used: CNBr, cyanogen bromide; Con A, concanavalin A; SDS-PAGE, sodium dodecyl sulfate-polyacrylamide gel electrophoresis.
To purify the α′ and α subunits further, samples containing a mixture of the two subunits were dissolved in 32 mM K-phosphate, pH 6.0, containing 6 M urea and 1 mM Na2EDTA, and applied to a 2.5 × 20-cm column of CM-Sephadex C50 equilibrated with the same buffer. A 1-liter linear gradient from 0 to 0.5 M NaCl was used to resolve the α and α′ subunits. Fractions containing each subunit were pooled, dialyzed against water, and lyophilized. The pure subunits were denatured in 6 M guanidine-HCl, reduced, and then S-alkylated with 4-vinylpyridine as described by Hermodson et al. (12). After S-alkylation, β-conglycinin was desalted by dialysis against 9% formic acid and lyophilized.

Analytical procedures. Sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) was performed using modifications of the system of Laemmli (13, 14). Electrophoresis of intact polypeptides was performed in either 7 to 13% gradient gels or in
12.5% resolving gels, each with a 5% acrylamide stacking gel. The CNBr cleavage products were analyzed in 19% gels or in 10 to 18% gradient gels, each with a 5% stacking gel. Electrophoresis was at 15 mA until the sample reached the interface with the resolving gel, and then the current was increased to 25 mA for 4 h. In some cases, 6% acrylamide slab gels containing 5 M urea and 10% acetic acid were used [modified from Kitamura et al. (15)]. After overnight preelectrophoresis of the gels (18 × 12.5 × 0.15 cm) at a constant voltage of 50 V, proteins were separated by electrophoresis for 4.5 h at a constant voltage of 120 V. The electrode buffer used was 10% acetic acid. Gels were stained with Coomassie blue G to detect total protein. To detect glycoproteins, gels were stained with periodic acid-Schiff reagent by the method of Konat et al. (16).

CNBr cleavage of S-alkylated β-conglycinin subunits was carried out in 70% formic acid by the method of Steers et al. (17). The cleaved products from the α subunit were desalted, lyophilized, and dissolved in 0.1 M Tris-HCl, pH 7.6, containing 8 M urea. The mixture was applied to a column (2.5 × 100 cm) of Sephacryl S-300. The proteins eluted from the column in two peaks (CB-1 and CB-2). Fractions from each peak were pooled, dialyzed against 9% formic acid, and lyophilized.

The amino acid composition of the purified subunits and CNBr fragments was determined with a Durrum D500 amino acid analyzer by standard procedures. Proteins were hydrolyzed in 6 M HCl under vacuum for 22 h at 110°C. Cysteine was determined on samples which were oxidized with performic acid before HCl hydrolysis. The NH2-terminal sequence analysis was performed with a Beckman 890C sequencer as described by Hermodson et al. (12, 18). The phenylthiohydantoin derivatives were identified by high-performance liquid chromatography essentially according to Zimmerman et al. (19). Spot tests were used to confirm identification of arginine and histidine (18).
With the exception of the β′ subunit, the amounts of the amino acid phenylthiohydantoins produced during early cycles of degradation gave at least 70% of the expected amounts based on the amount of protein used for analysis. No identification was made unless the peak to background ratio was greater than 3:1 and the quantity of derivative obtained was consistent with that from preceding cycles. If a detectable contaminating sequence made up 10% of the yield of the predominant sequence in a sample, the data were not used. Where the above criteria were not met, but where a low yield of a specific amino acid was consistently observed with no other amino acid appearing in the same cycle, tentative identifications were made. These are indicated in the results by parentheses around the amino acid in question.

Preparation of cDNA clones. The 18-20 S-enriched polyadenylated RNAs, which were shown previously
to direct synthesis of glycinin and β-conglycinin precursors in vitro (20, 21), were used as templates to produce cDNA. Double-stranded cDNAs were prepared by standard procedures (22, 23), treated with S1 nuclease, and sized on a 5 to 20% linear sucrose gradient. The fraction larger than 1.3 kb was collected and inserted into the PstI site of pBR322 by the oligo(dG)·(dC) tailing method (24). Escherichia coli K12 strain HB101 was transformed with recombinant plasmids, and the transformants were screened for tetracycline resistance and ampicillin sensitivity. Four clones, pA06, pB36, pB40, and pC26, hybridized strongly to 32P-labeled cDNA synthesized from 18-20 S mRNA. The insert in plasmid pB36 was shown to hybridize to 20 S mRNA by Northern analysis. The insert was removed from pB36 by PstI cleavage, and was shown to be 850 bp long by analysis on agarose gels. The insert was sequenced by the Maxam and Gilbert (25) and Sanger (26) methods.

RESULTS AND DISCUSSION
β-Conglycinin was extracted from soybean seeds in its trimeric form, but was then denatured to allow separation of individual subunits (1). In initial attempts to resolve the subunits on DEAE-Sephadex A50, chromatography was performed at pH 8.1, as described by Thanh and Shibasaki (1). These conditions yielded an elution profile with three prominent peaks (Fig. 1). Analysis of the peaks by SDS-PAGE showed that the third, or largest, peak contained a mixture of the α and α′ subunits, and the central peak contained the β subunit. These three subunits, which together constitute most of the β-conglycinin, were studied in detail by Thanh and Shibasaki (1). However, the first peak to emerge from ion-exchange columns developed at pH 8.1 has not been well characterized. Electrophoretic analysis of this peak showed that it contained several proteins. The major polypeptide was similar in size to the β subunit, although it appeared to be slightly smaller on SDS gels. For purposes of discussion, we have referred to the major protein in this fraction as the β′ protein.

To study the β′ polypeptides further, it was necessary to find conditions that would improve resolution of the β-conglycinin subunits. We found that the subunits bound the DEAE-Sephadex A50 column more firmly at pH 7.0 than at pH 8.1, and that the elution profile from the column was more complex at the lower pH (Fig. 1 vs Fig. 2). Analysis of the various pH 7.0 column fractions by SDS-PAGE showed that the first major peak to elute from the column contained the β subunit (i.e., fraction 4, Figs. 2 and 3) and that most of the β′ polypeptides were contained in the distinct shoulder (i.e., fraction 3, Figs. 2 and 3) on the leading edge of this peak. There were also small amounts of the β′ polypeptides in earlier fractions eluted from the column, although these fractions were contaminated by low-molecular-weight (Mr 21-25,000) polypeptides that were not characterized further.

It proved to be very difficult to obtain β′ polypeptides which were not contaminated either by the low-molecular-weight proteins or by the β subunits. The degree and type of contamination were largely dependent on the previous treatment of the samples loaded onto the ion-exchange columns. If the protein samples were exposed to Con A-Sepharose 4B, to remove contaminating 11 S polypeptides, before they were loaded onto the ion-exchange columns, the β and β′ polypeptides showed an increased tendency to associate and were difficult to separate. If the Con A-Sepharose 4B purification step was omitted, however, the fractions that contained β′ polypeptides were usually contaminated by polypeptides of Mr 21-25,000 (e.g., fractions 1 and 2, Figs. 2 and 3).

[Fig. 1 plot: elution profile versus fraction number.]
FIG. 1. Separation of β-conglycinin subunits on DEAE-Sephadex A50 at pH 8.1. A sample of β-conglycinin (500 mg) was dissolved in 19 mM K-phosphate buffer, pH 8.1, containing 6 M urea, 1 mM Na2EDTA, and 0.02% NaN3, to give a final volume of 25 ml. The sample was loaded onto a column (2.5 × 50 cm), which was equilibrated with the same elution buffer. The fractionation was carried out at 4°C at a flow rate of 2.4 ml cm⁻² h⁻¹, and 10-ml fractions were collected. Proteins were eluted from the column by the salt gradient formed between 900 ml of the elution buffer and 900 ml of the elution buffer containing 0.45 M NaCl. The progress of the gradient was monitored by the conductivity of the eluate. β′, β′ subunit plus some low-molecular-weight contaminants; β, β subunit; α′ + α, α′ and α subunits eluting as a single peak.

[Fig. 2 plot: elution profile versus fraction number.]
FIG. 2. Separation of β-conglycinin subunits on DEAE-Sephadex A50 at pH 7.0. Details of operation are as for Fig. 1 except that the sample of β-conglycinin had previously been purified by passage through a column of Con A-Sepharose 4B, and the column and elution buffers used were at pH 7.0. A 27-ml sample at a concentration of 8 units/ml (measured at λ = 280 nm) was loaded on the column. The numbers (1-11) correspond to the fractions collected from the column.
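The salt gradient described in the legend to Fig. 1 can be converted into an approximate NaCl concentration for each fraction. The following is a minimal sketch, assuming an ideal linear gradient formed from the two 900-ml reservoirs and 10-ml fractions, and ignoring column dead volume and the volume of the loaded sample. The function name and this idealization are illustrative assumptions and are not part of the published method.

```python
# Sketch only: idealized NaCl concentration along a linear gradient.
# Assumptions (not from the paper): perfect mixing, no dead volume,
# and fraction 1 starting exactly where the gradient starts.

def nacl_molarity(fraction_number, fraction_ml=10.0,
                  gradient_ml=1800.0, start_m=0.0, end_m=0.45):
    """Return the idealized NaCl molarity at the end of a given fraction."""
    eluted_ml = fraction_number * fraction_ml
    # Clamp progress to [0, 1]: outside the gradient the concentration
    # stays at the starting or the final value.
    progress = min(max(eluted_ml / gradient_ml, 0.0), 1.0)
    return start_m + (end_m - start_m) * progress


if __name__ == "__main__":
    for n in (20, 60, 100, 140, 180):
        print(f"fraction {n:3d}: ~{nacl_molarity(n):.3f} M NaCl")
```

Because the authors monitored the gradient by the conductivity of the eluate, measured values would take precedence over this idealized estimate.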
[Fig. 3 gel lanes, left to right: M, T, 7 S, 1-11, T, M.]
FIG. 3. Analysis by SDS-PAGE of fractions collected from the DEAE-Sephadex A50 column shown in Fig. 2. M, molecular weight standards; T, total protein extracted from seeds of the soybean cultivar CX 635-1-1-1; 7 S, purified β-conglycinin loaded onto the column; 1-11, fractions collected from the column (correspond to Fig. 2); 1-3, mainly β′ subunit; 4, mainly β subunit; 8, mainly α′ subunit; 10-11, mainly α subunit. The more basic proteins are eluted before the more acidic proteins. The protein standards were phosphorylase B (Mr = 92,500); bovine serum albumin (Mr = 66,200); ovalbumin (Mr = 45,000); carbonic anhydrase (Mr = 31,000); soybean trypsin inhibitor (Mr = 21,500); and lysozyme (Mr = 14,400).
Changing the pH of development of the DEAE-Sephadex A50 column also affected the separation of the α and α′ subunits. At pH 8.1, the α and α′ subunits were eluted together (Fig. 1), whereas at pH 7.0 they were partially separated, with the α′ subunit emerging first (Figs. 2 and 3). The peak which contained the α′ subunit characteristically had a leading shoulder which contained additional polypeptides (fraction 6, Figs. 2 and 3). However, purification of the individual polypeptides in these minor fractions has not been successful and they have not been characterized further.

The four β-conglycinin subunits resolved by ion-exchange chromatography (α, α′, β, β′) were characterized by electrophoresis. By SDS-PAGE, the β′ subunit has slightly higher electrophoretic mobility than the β subunit (Fig. 3). However, in acid-urea gels, in which the charge differences between polypeptides influence their electrophoretic mobility to a greater extent than in SDS gels, the β′ subunit shows slightly lower mobility than the β subunit (Fig. 4). Similarly, the α subunit shows higher mobility than the α′ subunit in SDS gels (Fig. 3), but shows slightly lower mobility than the α′ subunit in acid-urea gels (Fig. 4).
Thanh and Shibasaki (1) drew attention to two bands that occurred in electrophoretic separations of purified 7 S proteins, and they named these bands γ and δ. These two minor 7 S components were clearly visible on acid-urea gels (1). They are identified in Fig. 4. We considered the possibility that the β′ subunit corresponds to one of these two components. However, from Fig. 4 it can be seen that the β′ polypeptide does not comigrate with either the γ or δ polypeptides in acid-urea gels. It is still possible that the β′ polypeptide has been modified during purification in such a way that it exhibits different electrophoretic properties from the γ and δ polypeptides, but in the absence of evidence for this we must conclude that the β′ polypeptide is distinct from either the γ or δ proteins.

The amino acid composition of the α, α′, β, and β′ subunits was determined. The data for the α, α′, and β subunits are similar to those previously reported by Thanh and Shibasaki (1). Data for the β and β′ subunits are compared in Table I. The determinations of the methionine and cysteine contents are estimates, as these residues are present in such low quantities in the subunits that accurate determination is difficult. However, the estimates of the methionine content of the subunits were confirmed by CNBr cleavage experiments, in which it was shown that the α′ and β′ subunits both formed more CNBr cleavage fragments than the α subunit, while the β subunit was not cleaved by CNBr (Figs. 5A and C). Apart from the difference in sulfur-containing amino acids and some difference in histidine content, the amino acid composition of the β and β′ subunits is almost identical. Similarly, the major difference between the α and α′ subunits is in their histidine content.

The NH2-terminal sequences of the α′, α, and β subunits are shown in Fig. 6A. These data are in agreement with the NH2-terminal amino acid analysis reported by Thanh and Shibasaki (1), who found NH2-terminal valine for both α and α′ subunits and leucine for the β subunit. No NH2-terminal sequence could be determined for the β′ subunit, as it appeared to be blocked to Edman degradation. The NH2-terminal sequence for the β subunit was determined to residue 26, whereas those for the α and α′ subunits were difficult to determine even to residue 12. This difficulty was probably due to the large number of consecutive glutamate residues found in the NH2-terminal regions of these subunits. Even so, it is clear that the α and α′ subunits are homologous at their NH2-termini, whereas the β subunit NH2-terminal sequence is quite different.

Purified α′, α, β, and β′ subunits were cleaved at methionine residues by treatment with CNBr. Their cleavage patterns were analyzed in SDS gels (Figs. 5A, B, and C). The α′ subunit gave major bands at Mr ~ 47,000, 19,500, and 15,500, and minor bands at Mr ~ 60,000, 55,000, 37,000, and 12,600. The data suggest that there are three or four methionine residues in the α′ subunit, an estimate consistent with the amino acid analysis (1). As β-conglycinin subunits are known to be glycosylated (1), the CNBr fragments were stained with periodic acid-Schiff reagent after electrophoretic separation. The Mr = 47,000 and 19,500 fragments contained sugar, whereas the Mr 15,500 fragment was clearly not glycosylated (Fig. 5B). Unfortunately, solvent conditions to purify the CNBr cleavage fragments of the α′ subunit were not found, so the fragments were not studied further. The β′ subunit yielded at least four fragments on cleavage with CNBr (Fig. 5C), suggesting that it contained at least two methionine residues. The β subunit was not cleaved by CNBr treatment (Fig. 5A), which indicated that it contained no methionine.

Cleavage of the α subunit with CNBr gave four fragments. Analysis by SDS-PAGE showed that there were two major fragments, at Mr ~ 54,000 and Mr ~ 19,500, one minor fragment at Mr ~ 52,000, and one very minor fragment at Mr ~ 15,500. When the fragments were separated by gel filtration on Sephacryl S-300, two major peaks were obtained. The first peak, CB-1, contained both the 54- and 52-kDa polypeptides. The second peak, CB-2, contained the 19.5-kDa polypeptide (Fig. 5D). The 54-, 52-, and 19.5-kDa polypeptides were all glycosylated (Fig. 5B).

[Fig. 4 gel lanes, left to right: R, 635, β′, β, α′, α, 635, R.]
FIG. 4. Analysis of selected fractions from the DEAE-Sephadex A50 column shown in Fig. 2, on an acetic acid-urea gel. R, total protein extracted from seeds of the soybean cultivar Raiden; 635, total protein extracted from seeds of the soybean cultivar CX 635-1-1-1; β′, β′ subunit (3); β, β subunit (4); α′, α′ subunit (8); α, α subunit (10). Numbers in parentheses correspond to the fraction number of the samples collected from the column (cf. Fig. 2). The positions of the γ and δ subunits noted by Thanh and Shibasaki (1) are indicated.

FIG. 5. SDS-PAGE of CNBr cleavage fragments. (A) Fragments derived from the α′, α, and β subunits. The gel was stained with Coomassie blue G to reveal total protein. T, total protein extracted from seeds of the soybean cultivar CX 635-1-1-1; M, molecular weight standards; α′, α′ subunit; α, α subunit; β, β subunit; + CNBr, treated with CNBr. The protein standards were bovine serum albumin (Mr = 66,000); ovalbumin (Mr = 45,000); glyceraldehyde-3-P-dehydrogenase (Mr = 36,000); carbonic anhydrase (Mr = 29,000); trypsinogen (Mr = 24,000); trypsin inhibitor (Mr = 20,100); α-lactalbumin (Mr = 14,200); and aprotinin (Mr = 6,500). (B) Replicate of (5A), except that the gel was stained with periodic acid-Schiff reagent to reveal glycoproteins. (C) Fragments derived from the β′ subunit. The gel was stained with Coomassie blue G. M, molecular weight standards; β′, β′ subunit; β′ + CNBr, β′ subunit treated with CNBr. The protein standards were phosphorylase B (Mr = 92,500); human serum albumin (Mr = 66,000); ovalbumin (Mr = 45,000); and myoglobin (Mr = 17,000). (D) Fragments derived from the α subunit after purification on Sephacryl S-300. The gel was stained with Coomassie blue G. CB-1, large fragment (Mr ~ 54,000); CB-2, small fragment (Mr ~ 19,500); α, α subunit; 7 S, β-conglycinin. A dimer of the α subunit is visible at the top of the α subunit sample.

TABLE I
AMINO ACID COMPOSITION OF β-CONGLYCININ SUBUNITS

Average mole %ᵃ
Amino acid   α′               α                β                β′
Asx          11.43 ± 0.32ᵇ    12.20 ± 0.36ᵇ    13.59 ± 0.30ᶜ    13.67 ± 0.30ᵈ
Thr           2.50 ± 0.11ᵇ     2.43 ± 0.08ᵇ     2.74 ± 0.10ᶜ     3.27 ± 0.32ᵈ
Ser           7.00 ± 0.27ᵇ     7.07 ± 0.14ᵇ     7.49 ± 0.17ᶜ     8.29 ± 0.42ᵈ
Glx          22.62 ± 2.00ᵇ    21.91 ± 1.09ᵇ    17.79 ± 0.17ᶜ    16.48 ± 0.12ᵈ
Pro           6.93 ± 0.21ᵇ     7.66 ± 0.32ᵇ     5.49 ± 0.20ᶜ     5.41 ± 0.37ᵈ
Gly           5.31 ± 0.13ᵇ     4.90 ± 0.19ᵇ     5.13 ± 0.10ᶜ     5.25 ± 0.55ᵈ
Ala           4.46 ± 0.22ᵇ     4.74 ± 0.22ᵇ     5.06 ± 0.15ᶜ     6.04 ± 0.24ᵈ
Val           3.87 ± 0.23ᵇ     3.63 ± 0.26ᵇ     4.50 ± 0.21ᶜ     4.54 ± 0.31ᵈ
Met           0.61 ± 0.06ᵇ     0.46 ± 0.17ᵇ     0.27 ± 0.30ᶜ     0.72 ± 0.21ᵈ
Ile           4.05 ± 0.16ᵇ     4.62 ± 0.24ᵇ     4.89 ± 0.20ᶜ     5.45 ± 0.19ᵈ
Leu           7.36 ± 0.35ᵇ     8.52 ± 0.24ᵇ    10.29 ± 0.18ᶜ    10.48 ± 0.45ᵈ
Tyr           1.28 ± 1.01ᵇ     1.43 ± 0.59ᵇ     1.37 ± 0.82ᶜ     1.38 ± 0.17ᵈ
Phe           4.78 ± 0.38ᵇ     5.02 ± 0.13ᵇ     6.08 ± 0.17ᶜ     5.32 ± 0.16ᵈ
His           3.65 ± 0.16ᵇ     1.20 ± 0.07ᵇ     2.12 ± 0.51ᶜ     0.79 ± 0.16ᵈ
Lys           6.96 ± 0.31ᵇ     5.77 ± 0.19ᵇ     5.09 ± 0.17ᶜ     5.84 ± 0.61ᵈ
Arg           6.88 ± 0.26ᵇ     8.12 ± 0.37ᵇ     7.35 ± 0.12ᶜ     6.53 ± 0.74ᵈ
Cys           0.30 ± 0.02ᶜ     0.31 ± 0.08ᵈ     0.16 ± 0.23ᶜ     0.49 ± 2.95ᵉ

Average number of residues/moleᵃ
Amino acid   α′                α                 β                β′
Asx           74.87 ± 2.08ᵇ     75.43 ± 2.20ᵇ    64.26 ± 1.41ᶜ    64.68 ± 1.41ᵈ
Thr           16.36 ± 0.69ᵇ     15.04 ± 0.52ᵇ    12.98 ± 0.49ᶜ    15.49 ± 1.52ᵈ
Ser           45.82 ± 1.78ᵇ     43.71 ± 0.85ᵇ    35.43 ± 0.84ᶜ    39.23 ± 2.01ᵈ
Glx          148.19 ± 13.08ᵇ   135.39 ± 6.77ᵇ    84.13 ± 0.83ᶜ    77.93 ± 0.54ᵈ
Pro           45.40 ± 1.40ᵇ     47.34 ± 1.98ᵇ    25.96 ± 0.96ᶜ    25.59 ± 1.75ᵈ
Gly           34.79 ± 0.89ᵇ     30.25 ± 1.20ᵇ    24.28 ± 0.45ᶜ    24.82 ± 2.61ᵈ
Ala           29.23 ± 1.49ᵇ     29.27 ± 1.32ᵇ    26.49 ± 0.74ᶜ    28.54 ± 1.15ᵈ
Val           25.32 ± 1.50ᵇ     22.45 ± 1.62ᵇ    21.28 ± 1.01ᶜ    21.48 ± 1.48ᵈ
Met            3.97 ± 0.37ᵇ      2.84 ± 1.03ᵇ     1.27 ± 1.41ᶜ     3.38 ± 1.01ᵈ
Ile           26.55 ± 1.03ᵇ     28.56 ± 1.48ᵇ    23.15 ± 0.94ᶜ    25.79 ± 0.8?ᵈ
Leu           48.24 ± 2.29ᵇ     52.65 ± 1.48ᵇ    48.68 ± 0.87ᶜ    49.54 ± 2.11ᵈ
Tyr            8.40 ± 6.64ᵇ      8.86 ± 3.63ᵇ     6.51 ± 3.85ᶜ     6.52 ± 0.81ᵈ
Phe           31.30 ± 2.53ᵇ     31.04 ± 0.82ᵇ    28.76 ± 0.82ᶜ    25.16 ± 0.77ᵈ
His           23.91 ± 1.04ᵇ      7.44 ± 0.43ᵇ    10.03 ± 2.40ᶜ     3.72 ± 0.74ᵈ
Lys           45.59 ± 2.00ᵇ     35.68 ± 1.20ᵇ    24.09 ± 0.79ᶜ    27.60 ± 2.91ᵈ
Arg           45.08 ± 1.68ᵇ     50.18 ± 2.28ᵇ    34.76 ± 0.55ᶜ    30.91 ± 3.46ᵈ
Cys            1.97 ± 0.13ᶜ      1.92 ± 0.49ᵈ     0.76 ± 1.09ᶜ     2.32 ± 13.95ᵉ

Note. Samples (two to four replicates) were hydrolyzed in evacuated tubes in 6 M HCl at 110°C for 22 h, and the amino acids were determined by standard techniques. The values shown are the averages of these determinations ± 95% confidence limits for n - 1 df. The cysteine content was determined on additional samples which had been oxidized with performic acid.
ᵃ Mean ± 95% confidence limits for n - 1 df.
ᵇ n = five samples.
ᶜ n = four samples.
ᵈ n = three samples.
ᵉ n = two samples.
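The two halves of Table I are related by a single scale factor per subunit: residues per mole is approximately mole % multiplied by the total number of residues per mole and divided by 100. As a rough consistency check, the sketch below back-calculates that total from the Asx and Glx entries of Table I and uses it to estimate the number of methionine residues from the methionine mole %. The dictionary layout and function names are illustrative assumptions, and the estimate can be no better than the tabulated values, which the authors themselves describe as estimates for methionine.

```python
# Sketch only: relate the mole % and residues/mole halves of Table I.
# (mole %, residues/mole) pairs copied from Table I for Asx and Glx,
# plus the methionine mole % for each subunit.
TABLE_I = {
    "alpha_prime": {"scale": [(11.43, 74.87), (22.62, 148.19)], "met_pct": 0.61},
    "alpha":       {"scale": [(12.20, 75.43), (21.91, 135.39)], "met_pct": 0.46},
    "beta":        {"scale": [(13.59, 64.26), (17.79, 84.13)],  "met_pct": 0.27},
}


def total_residues(pairs):
    """Estimate total residues/mole as the mean of residues / (mole % / 100)."""
    ratios = [residues / (pct / 100.0) for pct, residues in pairs]
    return sum(ratios) / len(ratios)


def residues_from_mole_percent(pct, total):
    """Convert a mole % value into residues per mole for a given total."""
    return pct / 100.0 * total


if __name__ == "__main__":
    for name, row in TABLE_I.items():
        total = total_residues(row["scale"])
        met = residues_from_mole_percent(row["met_pct"], total)
        print(f"{name}: ~{total:.0f} residues/mole, ~{met:.1f} Met")
```

For the β subunit the methionine estimate has confidence limits that include zero, which agrees with the observation that this subunit was not cleaved by CNBr.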
A.
α′: NH2-Val-Glu-Glu-Glu-Glu-Glu-(Glu)-X-Glu-X-X-Ile
α: NH2-Val-Glu-Lys-Glu-Glu-Ser-Glu-Glu-Gly-Glu-Ile-Pro
β: NH2-Leu-Lys-Val-Arg-Glu-Asp-Glu-Asn-Asn-Pro-Phe-Tyr-Leu-Arg-Ser-Ser-Asn-Ser-Phe-Gln-Thr-Leu-Phe-(Gln)-(Asn)-(Gln)

B.
α-CB-1: NH2-Val-Glu-Lys-Glu-Glu-(Ser)-Glu-Glu-Gly-Glu-Ile-Pro
α-CB-2: NH2-Asn-Glu-Gly-Ala-Leu-Leu-Leu-Pro-His-Phe-Asn-Ser-Lys-Ala-Ile-Val-Ile-Leu-Val-Ile-Asn-Glu-Gly-(Asp)-Ala-(Asn)-Ile-Glu-Leu-Val

FIG. 6. Partial amino acid sequence of β-conglycinin subunits. (A) NH2-Terminal sequences for the α′, α, and β subunits. (B) NH2-Terminal sequences for CNBr cleavage fragments, CB-1 (Mr = 54,000) and CB-2 (Mr = 19,500) from the α subunit. X, Unidentified amino acid. Residues in parentheses are identified only tentatively.
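The statement that the α and α′ subunits are homologous at their NH2-termini while the β subunit differs can be illustrated by counting identical positions. The sketch below is a minimal illustration that uses only the first five residues of each subunit as read from Fig. 6A, written in one-letter code; the function name and the choice to skip unidentified residues (X) are assumptions made for this example, and a position-by-position count is a much cruder measure than a real alignment.

```python
# Sketch only: position-by-position identity between short N-terminal reads.
# First five NH2-terminal residues from Fig. 6A, in one-letter code.
N_TERMINI = {
    "alpha_prime": "VEEEE",  # Val-Glu-Glu-Glu-Glu
    "alpha": "VEKEE",        # Val-Glu-Lys-Glu-Glu
    "beta": "LKVRE",         # Leu-Lys-Val-Arg-Glu
}


def identical_positions(seq_a, seq_b, unknown="X"):
    """Return (identical, compared), skipping positions with an unknown residue."""
    compared = 0
    identical = 0
    for a, b in zip(seq_a, seq_b):
        if unknown in (a, b):
            continue  # an unidentified residue cannot be scored
        compared += 1
        if a == b:
            identical += 1
    return identical, compared


if __name__ == "__main__":
    names = list(N_TERMINI)
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            same, total = identical_positions(N_TERMINI[first], N_TERMINI[second])
            print(f"{first} vs {second}: {same}/{total} identical")
```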
Despite the apparent size heterogeneity of CB-1, a single NH2-terminal sequence was obtained, which was identical to that of the intact subunit (Fig. 6B). This indicated that the polypeptides in CB-1 originated at the NH2-terminus of the subunit. Fragment CB-2 (Mr ~ 19,500) appeared to be homogeneous by SDS-PAGE (Fig. 5D). It contained a new NH2-terminal sequence, which originated within the molecule (Fig. 6B).

To obtain further information about the sequence of the proteins, cDNA clone pB36 was constructed. It was constructed using cDNA prepared from 18-20 S-enriched polyadenylated mRNA, and was inserted into the PstI site of pBR322 by the oligo(dG)·(dC) tailing method (24). Northern analysis revealed that pB36 hybridized to the 20 S mRNA fraction (27). This abundant class of mRNAs had previously been shown to encode α and α′ precursors (20). Clone pB36 also hybridized to selected message from a total polyadenylated RNA fraction which, when translated, produced polypeptides equivalent in size to the subunit precursors. These translation products were precipitated by immunoglobulins raised against β-conglycinin (data not shown). This preliminary evidence thus clearly suggested that pB36 encoded part of either an α or α′ subunit precursor.

In preparation for nucleotide sequence analysis, the DNA insert in pB36 was excised with PstI, separated by electrophoresis in agarose, and electroeluted (28). Restriction analysis of this purified insert with HindIII, BamHI, HaeIII, AluI, DdeI, HinfI, and TaqI was used to construct a physical map and provided the basis for generation of DNA fragments used to establish an overlapping nucleotide sequence (Fig. 7). The insert was about 850 bp long and contained 650 bp of coding sequence (Fig. 8). The 650-bp coding region was followed by a 138-bp 3′-nontranslated sequence which began with the TGA stop codon and ended with a short poly(A) tail.
Two overlapping polyadenylation signals (AATAAA) were located in the untranslated region, the last of which was 35 bp upstream from the poly(A) tail (29). The 650-bp coding sequence contained the sequence shown earlier to be at the NH2 terminus of CB-2, thus confirming that pB36 encodes an α or α′ subunit. This also indicates that CB-1 originates from the NH2-terminal region of the α subunit.

Schuler et al. (29, 30) reported the nucleotide sequences of several clones of cDNA origin which have incomplete coding regions for the α and α′ subunits. They divided these clones into two groups on the basis of sequence homology. Within groups the sequences were nearly identical, while between groups the homology decreased but still approached 90%. Because of their extensive homology, clones from each group selected messages for both the α and α′ subunits. The pB36 clone described in this report is nearly identical to Gmc 21 studied by Schuler et al. (29), although there are eight nucleotide mismatches in the coding region, one mismatch in the 3′-noncoding region, and an insertion of three extra bases just prior to the tandem polyadenylation signals. The eight mismatches in the coding region lead to three substitutions among the 160 amino acids. Significantly, the 30-amino acid NH2-terminal sequence of CB-2 was conserved completely in both pB36 and Gmc 21, while clones in the other group (e.g., Gmc 16 and Gmc 32) differed from them at 3 of the 30 positions (29, 30). It is therefore reasonable to conclude that pB36 and Gmc 21 encode part of the α subunit, while the clones in the other group correspond to the coding sequence for the α′ polypeptide. This interpretation is consistent with the methionine content of the two kinds of

[Fig. 7 map: scale in base pairs (0-800); sequencing strategy labeled Maxam-Gilbert.]
FIG. 7. Restriction map and sequencing strategy for the clone pB36. CB-2, start of sequence coding for the small CNBr cleavage fragment.
TGffiCAAACGTGCCI\AATCTAGTTCAAtiG~CCATllCTTCTG~GAT~ACCTTTT~CTTGGG~GCCGCGACCCCATCTACTCC
SerLysArgAlaLysSerSerSerArgLysThrIleSerSerGluAspLysProPheAsnLeuGlySerArgAspProIleTyrSer
AAGAAGCTTGGCAAGTTCTTTtiAGATCACCCCAGAGACCCCCAGCTTCGGGACTTGGATATCTTCCTCAGTATTGlGGATATGAAC
AsnLysLeuGlyLysPhePheGluIleThrProGluLysAsnProGlnLeuArgAspLeuAspIlePheLeuSerIleValAspMetAsn
tiAGGGAtiCTCTlCTTClACCACACTTCAATTCAAAGGCGATAGTGATACT(iGTAATTAATGAAGGAGATGCAAACATTGAACTTGTTGGC
GluGlyAlaLeuLeuLeuProHisPheAsnSerLysAlaIleValIleLeuValIleAsnGluGlyAspAlaAsnIleGluLeuValGly
CTAAAAGAACAACAACAGGAGCAGCAACAGGAAGAGCAACCTTTGG~GTGCGG~ATATAGAGCCGAATTGTCTG~CAAGATATATTT
LeuLysGluGlnGlnGlnGluGlnGlnGlnGluGluGlnProLeuGluValArgLysTyrArgAlaGluLeuSerGluGlnAspIlePhe
GTAATCCCAGCAGGTTATCCAGTTGTGGTCAACGCTACCTCAAATCTGAATTTCTTTGCTATTtiGTATTAATGCCGAGAACAACCAGAGG
ValIleProAlaGlyTyrProValValValAsnAlaThrSerAsnLeuAsnPhePheAlaIleGlyIleAsnAlaGluAsnAsnGlnArg
AACTTCCTCGCAGtiTTCGCAAGACAATGTGATAB(jCCAGATACCTAGlC~tiTGCAGGAGCTTGCATTCCCTGGGTCTGCAC~GCTGTT
AsnPheLeuAlaGlySerGlnAspAsnValIleSerGlnllnValGlnGluLeuAlaPheProGlySerAlaGlnAlaVal
GAGAAGCTATTAAAGAACCAAAGAGAATCCTACTTTtiTGGATGCTCAGCCTAATti~~AGAGGAGGGTAATAA~G~G~GGGTCCT
GluLysLeuLeuLysAsnGlnArgGluSerTyrPheValAspAlaGlnProAsnGluLysGluGluGlyAsnLysGlyArgLysGlyPro
TTGTCTTCAATTTTGAGGGCTTTTTAC~TAAGTATGTACTAAAATGTATtiCTGT~TAGCTCATAGTGAGCGAGGAAAGTATCGGGC
LeuSerSerIleLeuArgAlaPheTyr
TATtiTAACTATGACTAGAGCTTCAACTATti~TAAAT~ATCGACAGCATATGATtiCTTTTGTTTTGTGTTCTTCA~~AAAAAAA
FIG. 8. Nucleotide sequence of the α subunit clone, pB36, and derived amino acid sequence. The sequence of nucleotides coding for the polyadenylation signals (AATAAA) is underlined. The terminator codon (TGA) is identified by an overline. The 30-amino acid sequence corresponding to the NH2-terminal sequence of the CB-2 fragment of the α subunit is underlined. The probable attachment site (Asn-Ala-Thr) of one of the two α subunit carbohydrate groups is indicated by the bold underline.
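The attachment site marked in Fig. 8 can be located programmatically. The Python sketch below is only an illustration: it converts a short three-letter excerpt of the derived sequence (the stretch around Asn-Ala-Thr as it appears in Fig. 8) to one-letter code and scans for the general N-glycosylation sequon Asn-X-Ser/Thr, where X is any residue except proline. The function names are hypothetical, and the use of the general sequon rule rather than only the two tripeptides named in the text is an assumption made for the example.

```python
import re

# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(three_letter_seq: str) -> str:
    """Convert a run of three-letter codes (e.g. 'AsnAlaThr') to one-letter code."""
    codes = re.findall(r"[A-Z][a-z]{2}", three_letter_seq)
    return "".join(THREE_TO_ONE[code] for code in codes)


def find_sequons(seq: str) -> list[tuple[int, str]]:
    """Return (1-based position, tripeptide) for each Asn-X-Ser/Thr with X != Pro."""
    # A lookahead is used so that overlapping sequons are all reported.
    return [
        (m.start() + 1, seq[m.start():m.start() + 3])
        for m in re.finditer(r"(?=N[^P][ST])", seq)
    ]


# Excerpt of the derived sequence around the marked site in Fig. 8.
excerpt = "ProValValValAsnAlaThrSerAsnLeuAsnPhePhe"
one_letter = to_one_letter(excerpt)

print(one_letter)                # PVVVNATSNLNFF
print(find_sequons(one_letter))  # [(5, 'NAT')]
```

Only the Asn-Ala-Thr tripeptide in this excerpt satisfies the rule; the two downstream asparagines are not followed by Ser or Thr at the third position.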
subunits. Amino acid analysis and CNBr cleavage analysis indicated that the α subunit had two of these residues while the α′ subunit had at least three. Clone Gmc 21 contained two methionine codons in the COOH-terminal part of the molecule, whereas clones of the α′ group had three, all of which were located in the region of coding sequence corresponding to the CB-2 fragment of the α subunit.
β-Conglycinin, unlike glycinin, is glycosylated (31). Evidence reported by Thanh and Shibasaki (1) showed that there were two glycosylation sites in each α and α′ subunit. Each carbohydrate moiety (high-mannose type) is attached to an asparagine residue (32), but two different tripeptides are involved (i.e., Asn-Ala-Thr and Asn-Gly-Thr). The Asn-Ala-Thr glycosylation site was located in the CB-2 fragment of the α subunit, and the same sequence was found at an equivalent position in the sequence predicted for the α′ subunit on the basis of data reported by Schuler et al. (29). Further examination of their data did not reveal additional Asn-Ala-Thr or Asn-Gly-Thr sequences. Since the sequences reported by these workers encoded all of the primary structure of the COOH-terminal region from the first apparent methionine in the chains, this suggests that the other glycosylation site (i.e., Asn-Gly-Thr) must be included in the NH2-terminal CNBr fragment, although the exact site is unknown.
Exact mapping of the subunits by analysis of CNBr cleavage fragments in SDS-PAGE is difficult for two major reasons. First, fragments smaller than about 12 kDa are frequently lost from gels during the staining procedure, so small fragments are seriously underrepresented. Second, polypeptides which contain carbohydrate moieties often migrate anomalously on the gels. They are generally retarded compared to nonglycosylated polypeptides of equivalent size (33). This makes the real size of the cleavage fragments difficult to determine. In spite of this, it is still possible to draw partial maps of the
FIG. 9. Maps of the α and α′ subunits. M, methionine residue; CHO, carbohydrate moiety.
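The logic behind such maps rests on cyanogen bromide cleaving the peptide bond on the COOH-terminal side of each methionine, so that complete cleavage of a chain with n internal methionines yields n + 1 fragments. The Python sketch below illustrates this under stated assumptions: the function name is hypothetical, and the input is a made-up toy peptide with two methionines, not a sequence from the paper.

```python
def cnbr_fragments(seq: str) -> list[str]:
    """Split a one-letter protein sequence after every methionine (M).

    Models complete CNBr cleavage; each fragment except the last ends in M.
    """
    fragments = []
    start = 0
    for i, residue in enumerate(seq):
        if residue == "M":
            fragments.append(seq[start:i + 1])
            start = i + 1
    if start < len(seq):  # COOH-terminal fragment, if the chain does not end in M
        fragments.append(seq[start:])
    return fragments


# Hypothetical toy peptide with two methionines (not taken from the paper).
toy = "VEEAAMKPLNATSMGSAQAV"
pieces = cnbr_fragments(toy)

print(pieces)                    # ['VEEAAM', 'KPLNATSM', 'GSAQAV']
print([len(p) for p in pieces])  # [6, 8, 6]
```

A chain with two methionines, as reported for the α subunit, would therefore be expected to give three fragments on complete cleavage, whereas a chain with no methionine is returned uncut.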
primary structures for α′ and α subunits by combining the CNBr cleavage data, the protein sequence data, our cDNA sequence data and those of Schuler et al. (29), and the glycosylation data. This information is summarized in Fig. 9. An understanding of the “normal” structure of β-conglycinin subunits provides the basis for further studies, in which the structure and inheritance of electrophoretic variants of β-conglycinin from several cultivars are being investigated.
ACKNOWLEDGMENT
We are grateful to Dr. Mark Hermodson for his help in sequencing the β-conglycinin subunits.
REFERENCES
1. THANH, V. H., AND SHIBASAKI, K. (1977) Biochim. Biophys. Acta 490, 370-384.
2. CATSIMPOOLAS, N., AND EKENSTAM, C. (1969) Arch. Biochem. Biophys. 129, 490-497.
3. KOSHIYAMA, I., AND FUKUSHIMA, D. (1976) Phytochemistry 15, 157-159.
4. KOSHIYAMA, I., AND FUKUSHIMA, D. (1976) Phytochemistry 15, 161-164.
5. HILL, J. E., AND BREIDENBACH, R. W. (1974) Plant Physiol. 53, 742-746.
6. DERBYSHIRE, E., WRIGHT, D. J., AND BOULTER, D. (1976) Phytochemistry 15, 3-24.
7. THANH, V. H., AND SHIBASAKI, K. (1978) J. Agric. Food Chem. 26, 692-695.
8. THANH, V. H., AND SHIBASAKI, K. (1976) Biochim. Biophys. Acta 439, 326-338.
9. YAMAUCHI, F., SATO, M., SATO, W., KAMATA, Y., AND SHIBASAKI, K. (1981) Agric. Biol. Chem. 45, 2863-2868.
10. SIMPSON, A. J. (1977) PhD Thesis, Purdue University.
11. MOREIRA, M. A., HERMODSON, M. A., LARKINS, B. A., AND NIELSEN, N. C. (1979) J. Biol. Chem. 254, 9921-9926.
12. HERMODSON, M., SCHMER, G., AND KURACHI, K. (1977) J. Biol. Chem. 252, 6276-6279.
13. LAEMMLI, U. K. (1970) Nature (London) 227, 680-685.
14. LARKINS, B., AND HURKMAN, W. J. (1978) Plant Physiol. 62, 256-263.
15. KITAMURA, K., TAKAGI, T., AND SHIBASAKI, K. (1976) Agric. Biol. Chem. 40, 1837-1844.
16. KONAT, G., OFFNER, H., AND MELLAH, J. (1984) Experientia 40, 303-304.
17. STEERS, E., JR., CRAVEN, G. R., ANFINSEN, C. B., AND BETHUNE, J. L. (1965) J. Biol. Chem. 240, 2478-2484.
18. HERMODSON, M. A., ERICSSON, L. H., TITANI, K., NEURATH, H., AND WALSH, K. A. (1972) Biochemistry 11, 4493-4502.
19. ZIMMERMAN, C. C., APPELLA, E., AND PISANO, J. J. (1977) Anal. Biochem. 77, 569-573.
20. TUMER, N. E., THANH, V. H., AND NIELSEN, N. C. (1981) J. Biol. Chem. 256, 8756-8760.
21. GOLDBERG, R. B., HOSCHEK, G., DITTA, G. S., AND BREIDENBACH, R. W. (1981) Dev. Biol. 83, 218-231.
22. BUELL, G. N., WICKENS, M. P., PAYVAR, F., AND SCHIMKE, R. T. (1978) J. Biol. Chem. 253, 2471-2482.
23. WICKENS, M. P., BUELL, G. N., AND SCHIMKE, R. T. (1978) J. Biol. Chem. 253, 2483-2495.
24. ROYCHOUDHURY, R., JAY, E., AND WU, R. (1976) Nucleic Acids Res. 3, 101-116.
25. MAXAM, A. M., AND GILBERT, W. (1980) in Methods in Enzymology (Grossman, L., and Moldave, K., eds.), Vol. 65, pp. 499-560, Academic Press, New York.
26. SMITH, A. J. H. (1980) in Methods in Enzymology (Grossman, L., and Moldave, K., eds.), Vol. 65, pp. 560-580, Academic Press, New York.
27. MARCO, Y. A., THANH, V. H., TUMER, N. E., SCALLON, B. J., AND NIELSEN, N. C. (1984) J. Biol. Chem. 259, 13436-13441.
28. YANG, R. C.-A., LIS, J., AND WU, R. (1979) in Methods in Enzymology (Wu, R., ed.), Vol. 68, pp. 176-182, Academic Press, New York.
29. SCHULER, M. A., SCHMITT, E. S., AND BEACHY, R. N. (1982) Nucleic Acids Res. 10, 8225-8244.
30. SCHULER, M. A., LADIN, B. F., POLLACO, J. C., FREYER, G., AND BEACHY, R. N. (1982) Nucleic Acids Res. 10, 8245-8261.
31. KOSHIYAMA, I. (1966) Agric. Biol. Chem. 30, 646-650.
32. YAMAUCHI, F., THANH, V. H., KAWASE, M., AND SHIBASAKI, K. (1976) Agric. Biol. Chem. 40, 691-696.
33. SEGREST, J. P., AND JACKSON, R. L. (1972) in Methods in Enzymology (Ginsburg, V., ed.), Vol. 28, pp. 54-63, Academic Press, New York.