Processing and localization of bovine β-casein expressed in transgenic soybean seeds under control of a soybean lectin expression cassette

Plant Science 161 (2001) 323– 335 www.elsevier.com/locate/plantsci

Reena Philip a, Douglas W. Darnowski b, P. Jeffery Maughan c, Lila O. Vodkin d,*

a Gene Logic Inc., 708 Quince Orchard Road, Gaithersburg, MD 20878, USA
b Department of Biology, Washington College, 300 Washington Avenue, Chestertown, MD 21620, USA
c Monsanto Corporation, Ankeny, IA 50021, USA
d Department of Crop Sciences, University of Illinois at Urbana-Champaign, 384 E.R. Madigan Laboratory, 1201 West Gregory Drive, Urbana, IL 61801, USA

Received 12 February 2001; received in revised form 28 March 2001; accepted 28 March 2001

Abstract

We have examined the processing and subcellular localization of a chimeric gene consisting of the bovine milk protein, β-casein, under the control of a soybean seed lectin promoter and its 32 amino acid signal sequence in the seeds of transgenic soybean plants. The β-casein expressed in developing soybean seeds is a doublet with apparent molecular weight slightly smaller than the bovine β-casein, and expression of the protein was highest in immature cotyledons. The casein proteins were purified from the immature soybean seeds by immunoaffinity chromatography and were analyzed by two-dimensional gel electrophoresis, blotting, and amino terminal sequencing. The N-terminal sequences of both of the doublet soybean casein polypeptides were identical to the N-terminal sequence of the bovine β-casein, indicating that the 32 amino acid lectin signal sequence was cleaved precisely from the chimeric protein in developing soybean seeds. Analysis of the purified soybean β-casein polypeptides by mass spectrometry (MALDI-MS) showed that they are not phosphorylated. Absence of added phosphate groups is the cause of the size difference between the soybean β-casein and native bovine β-casein protein. Immunolocalization experiments showed that the casein protein was found in the protein storage vacuoles (PSV) in developing and mature soybean seeds. The precise removal of the 32 amino acid lectin amino terminal sequence from the chimeric lectin–casein fusion suggests that the lectin expression cassette can be used for production of pharmaceutical or other recombinant proteins of added value in the developing soybean seed. © 2001 Elsevier Science Ireland Ltd. All rights reserved.

Keywords: Transgenic soybean; Localization in protein storage vacuoles; Lectin signal sequence; β-casein phosphorylation

1. Introduction

Seeds store large amounts of proteins during their maturation phase. Plant seed storage proteins are compartmentalized in specialized bodies known as protein storage vacuoles (PSV), or protein bodies. The ability to express transgenic proteins in the developing seed and to localize transgenic proteins to the PSV is required to alter the nutritional quality of seed proteins or to enable the use of plant seeds as bioreactors for the production of recombinant proteins [1,2]. Recently, we described the transformation and expression of bovine β-casein in soybean plants [3]. The bovine β-casein sequence was cloned into a seed-specific lectin promoter and signal sequence expression cassette [4,5] and introduced into soybean somatic embryos via particle bombardment. The foreign gene was present as several copies at a single locus as shown by segregation data of the casein protein in F2 seed.

* Corresponding author. Tel.: +1-217-2446147; fax: +1-2173334582. E-mail address: [email protected] (L.O. Vodkin).

0168-9452/01/$ - see front matter © 2001 Elsevier Science Ireland Ltd. All rights reserved. PII: S0168-9452(01)00420-4

R. Philip et al. / Plant Science 161 (2001) 323–335

In this paper, we describe the purification and post-translational processing of the β-casein protein in soybean seed and its subcellular localization in the PSV. Although a number of recombinant proteins have been expressed in plants, very few have been subjected to detailed purification, amino terminal sequence analysis, and processing studies as reported here for the lectin–casein fusion protein. Soybean lectin is a moderately abundant seed protein that is localized in the protein bodies and encoded by a single gene locus [6]. The soybean lectin promoter cassette used to drive the seed-specific expression of the bovine β-casein transgene contains an N-terminal signal sequence of 32 amino acids (approximately 3.0 kDa) [4,6]. We have shown that this signal sequence appeared to be sufficient to result in localization of recombinant lectin-GUS (β-glucuronidase) to the PSV of transgenic tobacco seeds [5]. However, the amount of GUS protein in the tobacco seed was too low to be purified for more detailed characterization. Although soybean is one of the most recalcitrant species to transform, the seeds are large and amenable to biochemical studies. Expression levels of the casein protein in soybean under control of the lectin expression cassette were sufficient to allow purification and characterization of the recombinant protein. Therefore, we conducted immunolocalization experiments, immunoaffinity purification, N-terminal sequencing, and mass spectrometry analysis in order to determine the subcellular localization and processing state of this recombinant bovine β-casein protein in transgenic soybean plants. These studies showed that the transgenic soybean casein is less phosphorylated, leading to production of polypeptides with slightly smaller molecular weights than native bovine β-casein. The 32 amino acid lectin signal sequence is cleaved efficiently and accurately from the lectin–casein fusion protein, as shown by N-terminal sequencing of the doublet polypeptides. Immunofluorescent detection localized the casein protein to the PSV of the soybean seed. Whatever the mechanism for localization of the foreign casein protein in the PSV, the lectin cassette may have general utility when plant seeds are used as production factories, since its amino terminal sequence appears to be efficiently removed from the fusion protein. This feature is useful for production of valuable proteins that can be extracted and purified from the developing seeds.

2. Materials and methods

2.1. Production and purification of antibodies to bovine β-casein protein

Polyclonal antibodies to denatured and native casein were produced at the Immunological Resource Center at the University of Illinois. Two mg of bovine β-casein (Sigma catalog number C-6905) prepared in 1 ml of PBS (10 mM NaH2PO4–123 mM NaCl) and 1.0% SDS were injected per rabbit. A total of five immunizations were made, and bleedings of 20 ml per rabbit were collected every 1–2 months. Immune sera to native casein protein were prepared in the same manner with the omission of SDS. In order to purify the antibodies, bovine β-casein protein (Sigma catalog number C-6905) was coupled to CNBr-activated Sepharose 4B by standard methods ([7] and the Pharmacia Technical Bulletin ‘Affinity Chromatography’). Polyclonal antibodies were purified from immune sera by affinity chromatography on a column containing 1 g of the appropriate antigen–Sepharose matrix as described earlier [8].

2.2. Nomenclature used for plant transgenic lines

The pGLCH-2 plasmid, containing a chimeric lectin–β-casein gene construct and a constitutively expressed hygromycin phosphotransferase selectable marker, was used to transform soybeans [3]. The seeds analyzed in this report are all derivatives of one transgenic plant line derived from one transformed embryogenic culture denoted B1-10. Plant B1-10-1 was the first plant regenerated from this culture. It contains several inserts, likely in tandem, as they segregated as a single genetic locus [3]. The next generation (F2 seed and plants) are denoted B1-10-1-(specific plant #). These F2 plants are either positive (homozygous or heterozygous) or negative (homozygous) for casein expression, as determined by examining F3 seed by protein blotting. The embryogenic culture was derived from the soybean variety ‘Jack’, which serves as a non-transformed control line.

2.3. RT-PCR analysis

Total RNA was extracted from immature cotyledons (100–150 mg fresh weight) following standard methods [3,5,9] and stored at −70°C. Messenger RNA was isolated using the PolyATract® large scale mRNA Isolation system II as described by the manufacturer (Promega, Madison, WI). First-strand cDNA synthesis for RT-PCR was done using the Advantage RT-for-PCR kit from Clontech (Palo Alto, CA; catalog number K1402-2). RT-PCR was performed using different primer sets. The 5′ primers used for amplification were CAS1 and CAS7. The 3′ primers used for amplification were CAS2R, CAS8R and Lec41R. The primer sequences are as follows.

• CAS1 = 5′ AGCGGCCGCCAGAGAGCTGGAAGAACTC 3′
• CAS2R = 5′ GGCGGCCGCTTAGACAATAATAGGGAA 3′
• CAS7 = 5′ AGAGAGCTGGAAGAACTCAATGTACCGGGTGAG 3′
• CAS8R = 5′ TTAGACAATAATAGGGAAAGGTCCCCGGACAGG 3′
• Lec41R = 5′ TGACAATCACTAGCGATCGAGTAGTGAGAG 3′

The CAS1 primer matches the first six amino acids of the bovine β-casein protein [10] and contains a 5′ NotI site. The CAS2R primer matches the last five amino acids and stop codon of the bovine β-casein protein and contains a NotI site. The CAS7 primer matches the first 11 amino acids of the bovine β-casein protein. The CAS8R primer matches the last 10 amino acids and stop codon of the bovine β-casein protein. The Lec41R primer matches a 30-nucleotide stretch (the length of 10 codons) in the 3′ untranslated region of the soybean lectin gene, outside the coding region. PCR reactions and the electrophoresis of the RT-PCR products were conducted by standard methods as described earlier [4,11].
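The stated relationship between each casein primer and the coding sequence can be checked by translating the primers in silico. The following Python sketch is only an illustration: the helper names (`revcomp`, `translate`) and the offsets used to skip the non-annealing NotI tails are assumptions inferred from the primer sequences listed above, not part of the original methods.

```python
# Standard genetic code, indexed in the conventional T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

NOTI_SITE = "GCGGCCGC"  # NotI recognition sequence

# Casein primer sequences as listed in Section 2.3 (5' to 3').
PRIMERS = {
    "CAS1": "AGCGGCCGCCAGAGAGCTGGAAGAACTC",
    "CAS2R": "GGCGGCCGCTTAGACAATAATAGGGAA",
    "CAS7": "AGAGAGCTGGAAGAACTCAATGTACCGGGTGAG",
    "CAS8R": "TTAGACAATAATAGGGAAAGGTCCCCGGACAGG",
}


def revcomp(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons in frame 1; '*' marks a stop codon."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


# Assumed tail lengths: CAS1 carries 10 extra 5' bases (A + NotI + C) and
# CAS2R carries 9 (G + NotI); both values are inferred, not stated in the text.
encoded = {
    "CAS1": translate(PRIMERS["CAS1"][10:]),
    "CAS2R": translate(revcomp(PRIMERS["CAS2R"][9:])),
    "CAS7": translate(PRIMERS["CAS7"]),
    "CAS8R": translate(revcomp(PRIMERS["CAS8R"])),
}

for name, peptide in encoded.items():
    has_noti = NOTI_SITE in PRIMERS[name]
    print(f"{name}: encodes {peptide} (NotI site present: {has_noti})")
```

Reverse primers are reverse-complemented before translation so that the peptide is read in the sense orientation of the casein gene.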

2.4. Purification of bovine β-casein protein from transgenic plants

Purified denatured casein antibody was coupled to CNBr-activated Sepharose 4B by standard methods ([7] and the Pharmacia Technical Bulletin ‘Affinity Chromatography’) and used to purify the soybean casein protein from transgenic seed. The F2 plant line B1-10-1-39, which had the same DNA restriction banding pattern as the parental line B1-10-1 on an RFLP gel, was used to purify bovine β-casein protein from transgenic plants. Because dried seeds contained significantly lower amounts of casein, immature seeds in the fresh weight range of 150–300 mg per seed were collected and freeze-dried. Approximately 1 g dried seed weight was ground with a mortar and pestle and dissolved in 10 ml of PBS, pH 7.2. The sample was centrifuged at 5000 rpm in a 50 ml orange cap tube in a JA14 rotor for 20 min. The supernatant was filtered through Miracloth. The samples were then purified by passing through a purified antibody–Sepharose column. Nonspecifically bound material was removed by two cycles of washings with 0.1 M Na acetate, 1 M NaCl (pH 4.8), and 0.1 M Na carbonate, 1 M NaCl (pH 7.6), as described before [7,8]. Bound protein was eluted with 0.5 M acetic acid (pH 2.5). Fractions 1–15 and 15–30 were pooled separately. Pooled protein fractions were adjusted immediately to pH 7.0 by dropwise addition of 2.5 M ammonium hydroxide. The pooled fractions were then dialysed overnight with 2×1 l changes of PBS, centrifuged to remove any precipitate and freeze-dried. The freeze-dried material was dissolved in 500 µl of PBS. Out of that volume, 50 µl was used for a diagnostic SDS gel. The remaining 450 µl was freeze-dried and subjected to two-dimensional SDS gel electrophoresis by Kendrick Laboratories, Madison, WI.

2.5. Two-dimensional electrophoresis, N-terminal sequencing and MALDI-MS

Two-dimensional electrophoresis was performed according to the method of O’Farrell [12] by Kendrick Labs, Inc., Madison, WI. After slab gel electrophoresis, the gel was transferred to transfer buffer (12.5 mM Tris, pH 8.8, 86 mM glycine, 10% methanol) and transblotted onto PVDF paper overnight at 200 mA and approximately 100 V per two gels. The blots were stained with Coomassie blue and dried between sheets of filter paper. The protein spots on the PVDF membrane were cut out for N-terminal sequencing. For the MALDI-MS, the 2-D gel was dried between sheets of cellophane paper with the acid edge to the left.

N-terminal sequencing of 15 residues was carried out at the Keck Foundation Biotechnology Resource Laboratory at Yale University. Tryptic digests of the soybean casein, and HPLC and MALDI analysis of the peptides, were also carried out at the Yale facility. The sequence of soybean casein peptides was compared with the sequence of bovine β-casein as reported in accession number P02666 of the SwissProt database and I46963 of the PIR database.

2.6. Immunoblotting and dephosphorylation

To examine developmental expression, proteins were extracted from seeds at a range of fresh weights, from 10 mg through to mature, dried seed. Fresh weight was calculated from the weight of the two cotyledons and the seed coat. Seed coats were removed and the cotyledons were freeze-dried. Protein extraction and immunoblotting were conducted as described [3]. Dephosphorylation studies were conducted according to published procedures [13]. Samples were dissolved in PIPES/DTT, pH 6.0, to a final concentration of 0.5 mg/ml. They were incubated for 10 min at 30°C, and then 5 U of acid phosphatase was added and incubation continued for 15 min at 30°C. The reaction was terminated by the addition of 100 mM sodium pyrophosphate (10 mM final concentration). Twenty µl of the reaction was mixed with an equal volume of 2× gel loading mix + 8 M urea. The control and dephosphorylated samples were electrophoresed on a urea-PAGE gel and electroblotted to nitrocellulose overnight at a constant voltage of 25 V. The transfer buffer was 0.025 M Tris, 0.192 M glycine, 20% methanol.

2.7. Fluorescence immunolocalization

Sectioning was performed on immature and mature cotyledon pieces as described [14], except that substitution was performed in absolute ethanol and that samples were infiltrated with metamethacrylate (1:4 methyl:butyl with 10 mM DTT during infiltration; 1% benzoyl peroxide for polymerization). This resin can be removed from the sections so that more antigen can be reached by antibodies. Mature cotyledon samples were imbibed 24 h in the dark prior to cryoprotection and freezing. The primary antibodies were a rabbit polyclonal IgG anti-soybean seed lectin [8] or a rabbit anti-bovine β-casein polyclonal IgG [3], each at 1:10 dilution. The fluorescein-anti-rabbit IgG secondary antibody was used at a dilution of 1:10. Rabbit normal serum and secondary antibody-only controls were also performed, with the normal serum used at a dilution that contained amounts of IgG equivalent to those in the positive antisera based on serial dot immunoblotting. Photographs were made using an Olympus BH-2 compound microscope on Kodak E160T film.

3. Results

3.1. Detection of bovine β-casein mRNA in transgenic soybean plants

The presence of the casein mRNA was detected in Northern blots of total RNA of the F2 seed of transgenic soybean plants [3]. The structure of the lectin–casein fusion and its expression in the mRNA of transgenic plants as detected by RT-PCR is shown in Fig. 1. RT-PCR was used to confirm that the mRNA contained the full length of the casein protein coding region. The seed from two F2 generation plants, B1-10-1-39 (casein positive) and B1-10-1-32 (casein negative), were examined by amplification of genomic DNA and cDNA from each. As expected, the casein negative plant, B1-10-1-32, did not give any products in the PCR reactions. Fig. 1B shows the design of primers for RT-PCR analysis. In the reverse transcriptase-polymerase chain reaction, the CAS1 and CAS2R, and the CAS7 and CAS8R primer combinations were designed to amplify the full length coding region of the bovine β-casein gene (630 bp), both in the genomic DNA and cDNA of the transgenic plant, B1-10-1-39. The CAS1 and Lec41R primer combination was designed to amplify a 700 bp band only in the genomic DNA of the transgenic plant, since the Lec41R primer lies outside the coding region. This was to eliminate the possibility of DNA contamination in the total RNA preparation.
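The 630 bp size of the full-length product can be rationalized from the length of the coding region. The Python sketch below is a rough check under one stated assumption: that mature bovine β-casein is 209 residues long, a value that is not given in the text above. The tail lengths for CAS1 and CAS2R are likewise inferred from the primer sequences in Section 2.3, and the function names are illustrative.

```python
MATURE_RESIDUES = 209  # assumed length of mature bovine beta-casein (not stated in the text)
STOP_CODON_NT = 3      # the reverse primers include the stop codon


def coding_length_nt(residues, include_stop=True):
    """Nucleotides needed to encode `residues` amino acids, optionally plus a stop codon."""
    return residues * 3 + (STOP_CODON_NT if include_stop else 0)


def amplicon_length(coding_nt, fwd_tail=0, rev_tail=0):
    """Product length when the primers anneal at both ends of the coding region.

    `fwd_tail` and `rev_tail` are extra 5' bases on each primer that lie
    outside the coding region (for example a restriction-site tail).
    """
    return coding_nt + fwd_tail + rev_tail


core = coding_length_nt(MATURE_RESIDUES)
cas7_cas8r = amplicon_length(core)                          # primers without tails
cas1_cas2r = amplicon_length(core, fwd_tail=10, rev_tail=9)  # assumed NotI tails

print("coding region incl. stop:", core, "bp")
print("CAS7/CAS8R product:", cas7_cas8r, "bp")
print("CAS1/CAS2R product:", cas1_cas2r, "bp")
```

Under these assumptions the CAS7/CAS8R product is 630 bp, in line with the reported band. The CAS1/CAS2R product would be about 19 bp longer because of the NotI tails, a difference that would be difficult to resolve on a standard agarose gel.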

Fig. 1. (A) RT-PCR analysis of bovine β-casein RNA from transgenic soybean plants. Lane M is ΦX174 RF DNA/HaeIII fragments. Lanes 1–3, genomic DNA from B1-10-1-39, a positive F2 plant. Lanes 4–6, cDNA from B1-10-1-39. Lanes 7–9, genomic DNA from B1-10-1-32, a negative F2 plant. Lanes 1, 4 and 7, PCR products from CAS7 and CAS8R primers. Lanes 2, 5 and 8, PCR products from CAS1 and CAS2R primers. Lanes 3, 6 and 9, PCR products from CAS1 and Lec41R primers. A 630 bp β-casein fragment was detected in the genomic DNA and cDNA of B1-10-1-39 when the CAS1 and CAS2R and the CAS7 and CAS8R primer sets were used. A 700 bp band, which includes the bovine β-casein and lectin 3′ region, was detected only in the genomic DNA of B1-10-1-39 and not in the cDNA when CAS1 and Lec41R primers were used. Genomic DNA from the negative plant B1-10-1-32 did not show any PCR products with any of these primer sets. (B) Schematic representation of the relative positions of primers used for RT-PCR analysis. The sequences of each primer are given in Section 2.

3.2. Developmental expression of bovine β-casein protein in transformed soybean seeds

Individual seeds from many F2 plants of line B1-10-1 were analyzed for casein protein expression by immunoblotting. Fig. 2 shows the pattern of bovine casein expression in developing seed of F2 line B1-10-1-9. There were three other casein positive F2 lines (B1-10-1-25, B1-10-1-39 and B1-10-1-43) which gave similar results. The soybean casein appeared as a doublet band even in very young seed of 20 mg fresh weight (Fig. 2). The lower band had almost disappeared by the time the seeds were completely mature. The amount of the upper band was also reduced dramatically as the seeds matured. The level of casein protein was highest when the seeds were between 100 and 200 mg fresh weight. The doublet band was 1–2 kDa smaller than the bovine β-casein from Sigma Chemical Co. (lanes 1–3), which also appears as a doublet band.

Fig. 2. Developmental expression of bovine β-casein protein in F2 casein-transformed soybean plants. Lanes 1–3 are Sigma bovine β-casein loaded as a positive control marker with 75, 150 and 300 ng, respectively. Lanes 4–9 were loaded with protein extracted from seeds of the indicated fresh weight range from B1-10-1-9, a casein transformed F2 soybean plant. Lane 10 was protein extracted from mature seeds from a non-transformed (NT) Jack plant as a negative control. Lane 4 was loaded with 10 µg total protein and lanes 5 through 10 were loaded with 30 µg total protein. The arrows show the position of Sigma bovine β-casein of approximately 29 kDa and plant bovine β-casein of approximately 28 kDa.

3.3. Immunoaffinity purification and N-terminal sequencing of the soybean β-casein polypeptides

In order to conduct more extensive studies of the processing of the chimeric lectin–casein fusion protein, the β-casein produced in transgenic plants was purified from the seeds using immunoaffinity chromatography as described in Section 2. Samples of the purified protein were analyzed by 2-D gel electrophoresis and blotted to PVDF membrane (Kendrick Laboratories, Madison, WI). The 2-D gel pattern of the purified protein always gave four closely spaced spots in several independent preparations (data not shown). These were separately labeled as A (28 kDa left), B (28 kDa right), C (26 kDa left) and D (26 kDa right). The two spots at approximately 28 kDa were pooled and subjected to amino terminal sequencing of the first 15 residues by the Keck Foundation Biotechnology Resource Laboratory at Yale University. Likewise, the two spots at approximately 26 kDa were pooled and sequenced. Fig. 3 shows that the amino terminal sequences of both sizes of the soybean casein proteins matched the amino terminal sequence of the bovine casein protein, starting with arginine, which is the first residue of the mature bovine β-casein protein [10,15]. These data indicate that the lectin fusion is cleaved after the extra alanine residues added by the NotI fusion site between the lectin signal peptide and the casein coding region. They also demonstrate that the size difference in the soybean casein doublet band is not due to proteolytic cleavages of the amino terminal end of the two polypeptides.

3.4. MALDI-MS analysis of the purified soybean casein

The soybean β-casein migrates as a doublet band with an apparent molecular weight smaller than the control bovine β-casein from Sigma Chemical Company. The RT-PCR experiments (Fig. 1) showed that the full length of the coding region for casein is present in the transgenic soybean seed. One possible reason for the protein size differences is that the plant protein is not phosphorylated or is incompletely phosphorylated. The five phosphorylated residues should add about 1.4 kDa apparent molecular weight in SDS gels to the casein polypeptide. We conducted dephosphorylation studies of the soybean casein compared with the Sigma casein by treatment with acid phosphatase. No migration changes on one-dimensional gels were observed for the native or dephosphorylated soybean casein (data not shown), which indicated that the plant protein may not be phosphorylated. To better determine the phosphorylation and processing status of the soybean casein protein, the four spots from 2-D gel electrophoresis were subjected to HPLC analysis of tryptic digests to identify peptides that differ between the samples and the control Sigma casein protein. Those peptides that differed were identified by MALDI-MS analysis and amino terminal sequencing. The four spots of the purified soybean casein were cut directly out of the dried 2-D gels at Kendrick Laboratories. The control (Sigma bovine β-casein) was run on a separate gel and contained one intense spot that had an apparent molecular weight of approximately 29 kDa on the 2-D gel. After digesting both the Sigma bovine β-casein and the soybean β-casein samples with trypsin, all five samples were run on the microbore HPLC columns by the Keck Foundation Biotechnology Resource Laboratory at Yale University. The scale factors indicated that the Sigma casein control sample contained 50 pmoles of protein, the soybean casein sample B (28 kDa, right) contained 25 pmoles and sample D (26 kDa, right) contained 40 pmoles. The B and D samples, which have the same charge but different molecular weights in the 2-D gels, were analyzed in more detail as they contained more protein than samples A (28 kDa, left) and C (26 kDa, left).

Table 1 shows the major HPLC peak differences observed between the samples subjected to MALDI-MS and amino terminal sequence analysis of the peptides. We found that peak #387-57 in sample D had a mass of 2803.73, which matched the mass of the dephosphorylated tryptic peptide 1-25 in bovine β-casein. Peptide 1-25 is the phosphopeptide in native bovine β-casein and it contains four out of the five total phosphoserine residues in the bovine β-casein protein. Sequencing of peptide peak #387-57 confirmed that it is indeed the 1-25 peptide and thus is not phosphorylated based on the mass determination from the MALDI-MS data.

Another major difference was that peak #386-31 in the control Sigma protein was shifted to the right in all of the soy polypeptides A, B, C and D. We performed the MALDI-MS and the amino terminal sequence analysis for #386-31 (Sigma bovine casein) and the corresponding soy peptides 387-19 (26 kDa soy casein spot D) and 388-22 (28 kDa soy peptide B). The two soy peptides had an observed mass of 1983, which was 80 Da less than that of the Sigma bovine β-casein peptide, whose mass of 2063 matched the mass of the phosphorylated tryptic peptide 33-48 in bovine β-casein. All three peptides had the same amino terminal sequence that matched the amino terminus of tryptic peptide 33-48, as shown in Table 1. The serine residue at position 35 is not phosphorylated in the soybean peptides but is phosphorylated in the Sigma bovine β-casein, where this residue did not sequence as serine at that position. Thus, we have identified that peptide 33-48, which contains one out of five total phosphoserine residues in the bovine β-casein, is not phosphorylated in the soybean.

A third major difference between the HPLC peaks clarified the reason for the size difference between the 28 kDa (samples A and B) and 26 kDa (samples C and D) soybean casein polypeptides. Peaks corresponding to peptide #386-56 of the Sigma casein were found in soybean casein samples A and B, which are 28 kDa, but were absent in soy casein samples C and D, which are 26 kDa. As shown in Table 1, the MALDI-MS masses of peak #386-56 in the control casein and of #388-45 in sample B are 1384.83 and 1384.67 Da, respectively. We sequenced the peptide #388-45 and found that the sequence matched the bovine β-casein protein sequence very near to the carboxy terminal end. The predicted peptide position is 191–202 and the predicted mass of this peptide is 1384.65 Da. The residue preceding this peptide is Phe, so a chymotrypsin-like cut would generate this peptide. It is likely that there is a protease in the soybean plant that is producing the smaller 26 kDa casein polypeptide by cleavage near this region in the transgenic seed.
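The phosphorylation argument above rests on simple mass arithmetic: each missing phosphate lowers the peptide mass by roughly 80 Da. As a minimal sketch, the Python snippet below estimates the number of missing phosphate groups from the masses quoted in the text and in Table 1. The function name and the dictionary layout are illustrative assumptions; the nominal 80 Da shift is the value used in the table footnotes.

```python
PHOSPHATE_DA = 80.0  # nominal mass added per phosphate group, as used in Table 1


def missing_phosphates(phospho_mass, observed_mass):
    """Estimate how many phosphate groups are absent from the mass shortfall."""
    return round((phospho_mass - observed_mass) / PHOSPHATE_DA)


# (predicted fully phosphorylated mass, observed MALDI mass), values from Table 1
peptides = {
    "1-25, soy casein D (peak 387-57)": (3125.0, 2803.73),
    "33-48, soy casein D (peak 387-19)": (2063.0, 1983.12),
    "33-48, soy casein B (peak 388-22)": (2063.0, 1983.12),
    "33-48, Sigma casein (peak 386-31)": (2063.0, 2062.48),
}

results = {
    name: missing_phosphates(predicted, observed)
    for name, (predicted, observed) in peptides.items()
}

for name, count in results.items():
    print(f"{name}: {count} phosphate group(s) missing")
```

The soybean peptides come out four and one phosphate short for peptides 1-25 and 33-48, respectively, while the Sigma control peptide shows no shortfall, which is the pattern described in the text.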

Fig. 3. Amino-terminal sequences of the purified soybean casein polypeptides. The top line shows the predicted sequence of the fusion peptide of the 32 amino acid lectin signal sequence and the mature casein protein in the pGLCH-2 construct (Fig. 1) that was used to transform soybean. The NotI site that joins the two proteins results in three extra alanine residues in the fusion protein. * denotes the phosphorylated serine residues. The bottom two lines show the experimentally determined sequence of the purified soybean casein polypeptides.

Table 1
MALDI-MS and peptide sequence data comparison for soybean casein and Sigma β-casein

| Sample from 2-D gelᵃ | HPLC peptide numberᵇ | MALDI peptide mass | Predicted peptide mass | Predicted casein peptide (amino acid number) | Predicted casein peptide sequence from database / direct sequencing data of the MALDI peptide |
|---|---|---|---|---|---|
| Soy casein D (26 kDa) | 387-57 | 2803.73 | 3125 (2804)ᶜ | 1-25 | Arg-Glu-Leu-Glu-Leu-Asn-Val / Arg-Glu-Leu-Glu-Leu-Asn-Val |
| Soy casein D (26 kDa) | 387-19 | 1983.12 | 2063 (1983)ᵈ | 33-48 | Phe-Gln-Ser-Glu-Glu / Phe-Gln-Ser-Glu-Glu |
| Soy casein B (28 kDa) | 388-22 | 1983.12 | 2063 (1983)ᵈ | 33-48 | Phe-Gln-Ser-Glu-Glu / Phe-Gln-Ser-Glu-Glu |
| Sigma casein | 386-31 | 2062.48 | 2063 | 33-48ᵉ | Phe-Gln-Ser-Glu-Glu-Gln-Gln-Gln / Phe-Gln-Xxx-Glu-Glu-Gln-Gln-Gln |
| Soy casein B (28 kDa) | 388-45ᵉ | 1384.83 | 1384.7 | 191-202ᶠ | Leu-Leu-Tyr-Gln-Glu-Pro-Val-Leu / Leu-Leu-Tyr-Gln-Glu-Pro-Val-Leu |
| Sigma casein | 386-56 | 1384.67 | 1384.7 | 191-202 | Leu-Leu-Tyr-Gln-Glu-Pro-Val-Leu / not determined |

ᵃ The soybean casein samples refer to the position of the 2-D gel spots.
ᵇ The specific sample and HPLC peak number assigned by the Keck Protein Center at Yale University.
ᶜ The number in parentheses is the predicted mass of a dephosphorylated peptide if all four phosphate residues are missing from this peptide, i.e. 3125 − (4 × 80) ≈ 2804.
ᵈ The number in parentheses is the predicted mass of a dephosphorylated peptide if the single phosphate residue is missing from this peptide, i.e. 2063 − 80 = 1983.
ᵉ Serine #35 in the Sigma β-casein is phosphorylated and did not sequence as serine, whereas the soybean casein did sequence as unmodified serine.
ᶠ There is no corresponding peptide with this elution time in the 26 kDa soybean casein fragments C and D, indicating that this carboxy-terminal fragment is missing in those smaller fragments.

3.5. Subcellular immunolocalizations of β-casein in the seed of transgenic plants

We used light level fluorescent immunotechniques to determine the subcellular localization of the β-casein in seed of transgenic soybean plants. A specific reaction for casein is found in the protein storage vacuoles of immature (Fig. 4e) and mature (Fig. 4h) soybean seeds of transformed plants using a polyclonal anti-casein antibody and a fluorescein-labeled secondary antibody. The controls included cotyledon sections of non-transformed seed probed with the anti-β-casein antibody (Fig. 4c) or casein positive seed probed with normal (non-immune) rabbit serum (Fig. 4f). We found that the fluorescent signal for casein was less intense in the mature cotyledon sections (Fig. 4h) of transformed plants as compared with the immature (Fig. 4e) seeds. This agrees well with the developmental protein blots showing that casein is more abundant in immature seed and is decreased substantially by the time the seeds are completely dehydrated and mature (Fig. 2). The shape of the protein storage vacuoles in which the casein protein is located (Fig. 4e and h) is similar to the structure of the vacuoles containing the soybean lectin protein, as shown by immunolocalizations using anti-lectin antibody in non-transformed (Fig. 4b) and casein-positive cotyledons (Fig. 4g).

Fig. 4. Fluorescence immunolocalization of bovine β-casein in protein storage vacuoles of transgenic soybean seeds. Immature cotyledon sections were used in (a) through (f), and mature cotyledon sections were used in (g) and (h). (a) Phase contrast image of a cotyledon section from a non-transformed Jack plant stained with toluidine blue O. Vacuoles are seen as clear irregular bodies. (b) Fluorescence image of an immature cotyledon section from a non-transformed Jack plant, probed with lectin antibody, showing the localization of lectin in the protein storage vacuoles. (c) Fluorescence image of an immature cotyledon section from a non-transformed Jack plant, probed with casein antibody, showing the absence of a reaction. (d) Fluorescence image of an immature cotyledon section from a β-casein positive soybean seed probed with lectin antibody, showing the localization of lectin in the protein storage vacuoles. (e) Fluorescence image of an immature cotyledon section from a β-casein positive soybean seed probed with casein antibody, showing the localization of β-casein in the protein storage vacuoles. (f) Fluorescence image of an immature cotyledon section from a β-casein positive soybean seed probed with rabbit normal serum, showing the absence of a signal. (g) Fluorescence image of a mature cotyledon section from a β-casein positive seed probed with lectin antibody, showing the localization of lectin in the protein storage vacuoles. (h) Fluorescence image of a mature cotyledon section from a β-casein positive seed probed with casein antibody, showing the localization of casein in the protein storage vacuoles. P, protein storage vacuole; c, cytoplasm; w, wall. All are at 250× magnification.

4. Discussion

Genetic engineering of plant proteins for improved nutritional quality or for the production of recombinant proteins in transgenic plants requires a reliable way to express the foreign proteins in specific subcellular locations and with effective post-translational processing. We undertook a detailed analysis of the processing and localization of bovine β-casein as expressed in transgenic soybean seed [3] under control of the soybean lectin expression cassette [4] that contained the lectin promoter, 32 amino acid signal sequence, and lectin untranslated region (Fig. 1). The levels of β-casein protein accumulate during seed development and then decline as the seeds mature and desiccate (Fig. 2). Since the levels were higher in immature seed, we purified the casein protein from immature seed by immunoaffinity chromatography and characterized it by 2-D gel electrophoresis, amino terminal sequencing, tryptic digestion and MALDI-MS of the tryptic peptides. We determined the amino terminal sequence of the soybean casein protein, the phosphorylation status of the casein in soybean, and its subcellular localization in the cotyledons using fluorescence microscopy.

4.1. The lectin signal sequence is cleaved precisely from the casein fusion protein in developing seed

To directly examine the processing of the lectin–casein fusion in plants, we determined the N-terminal sequences of the casein polypeptides purified from soybean. Soybean casein is a doublet of two closely spaced polypeptides of approximately 28 and 26 kDa (Fig. 2), and the N-terminal sequences of both were identical to the N-terminal sequence of the mature bovine β-casein (Fig. 3). These data demonstrate that the lectin 32 amino acid signal sequence and the three extra alanine residues that result from the NotI site used at the fusion junction are removed by the ER membrane system of the plant.
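The reasoning behind this conclusion can be illustrated with a minimal sketch. It assumes a precursor built from the 32 amino acid signal sequence, the three alanines from the NotI junction, and the mature casein sequence. The signal residues are placeholders (`X`), because the actual lectin signal sequence is not reproduced in this section, and the mature N-terminal residues used below are an assumption standing in for the sequence reported in Fig. 3. The helper `residues_removed` is hypothetical and only locates the observed N-terminus within the precursor.

```python
SIGNAL_LENGTH = 32   # length of the lectin signal sequence (from the text)
LINKER = "AAA"       # three alanines introduced by the NotI site (from the text)

# Placeholder residues only: the real lectin signal sequence is not shown here.
signal_placeholder = "X" * SIGNAL_LENGTH

# Assumed first residues of mature bovine beta-casein, beginning with Arg;
# substitute the N-terminal sequence actually obtained by sequencing.
mature_start = "RELEELNVPGEIVE"


def residues_removed(precursor: str, observed_n_terminus: str) -> int:
    """Return how many N-terminal residues precede the observed N-terminus.

    Returns -1 when the observed sequence is not found in the precursor.
    """
    return precursor.find(observed_n_terminus)


precursor = signal_placeholder + LINKER + mature_start
removed = residues_removed(precursor, mature_start)
print(removed)  # 35 = 32 signal residues + 3 alanines
```

If the alanines had been retained, the observed N-terminus would begin with `AAA` and the same function would report 32 removed residues instead of 35.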

4.2. Casein is not phosphorylated in the transgenic seed

Because the amino terminal ends proved to be the same, we suspected that a difference in post-translational modification, possibly phosphorylation, might explain the size difference between the bovine and plant-produced proteins. Bovine β-casein is a phosphoprotein that is modified post-translationally by the covalent coupling of five phosphate groups to serine residues in the N-terminal region of the protein [16]. The phosphoserine residues of the bovine β-casein play an essential role in the formation of casein micelles via Ca2+-phosphate clusters and also contribute to increased curd tension during cheese making. Since the five phosphorylated residues add about 1.4 kDa of apparent molecular weight in SDS gels, we suspected that the size difference between our plant-purified β-casein and Sigma casein purified from the cow is due to the addition of phosphate residues to the β-casein in the animal system.

MALDI-MS has been used to investigate the phosphorylation state of the bovine β-casein [15]. The natural mature bovine β-casein is phosphorylated at serine residues 15, 17, 18, 19 and 35, numbering from the Arg of the mature protein as residue 1. The serine at residue 22 within the amino terminal peptide is not phosphorylated, and none of the other serines in the protein are phosphorylated. We compared the patterns of tryptic digests of the soybean casein and the Sigma casein and selected certain peptides for mass analysis by MALDI-MS to determine whether the soybean casein was phosphorylated. As shown in Table 1, we identified the tryptic peptide 1-25 in the soybean casein, but its mass (2803) is less than that predicted (3125) if the four serine residues are phosphorylated. We also identified peptide 33-48, and its mass (1983) is likewise less than that of the equivalent peptide of the Sigma casein (2063). In addition, we showed directly that serine 35 in the soybean casein was not phosphorylated, as it sequenced as serine, whereas this residue in the bovine β-casein from Sigma did not sequence since it is phosphorylated (Table 1). Thus, the MALDI-MS data confirm that the size difference between the Sigma bovine β-casein and our transgenic β-casein is due to the fact that the Sigma bovine β-casein is phosphorylated and the soybean casein is not.

Bovine β-casein has been expressed in several heterologous systems, including Escherichia coli [17], Saccharomyces cerevisiae [18] and transgenic mice [16,19]. The bovine β-casein was phosphorylated in yeast and mice, but not in E. coli due to the lack of phosphorylation systems in bacteria. Others have speculated that the human milk protein β-casein expressed in transgenic potato plants is not phosphorylated, based on the size difference between the plant-produced and human caseins [20]. Thus, plant systems, including soybean, may not be effective in phosphorylation of foreign proteins. This may be critical for some proteins, but not for others. The behavior of milk products during food processing is partly attributable to the phosphate groups of casein subunits.
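The mass comparison for the two tryptic peptides can be checked with simple arithmetic. The sketch below assumes that each phosphate group adds roughly 80 Da (HPO3, about 79.98 Da) to a peptide, and it uses only the peptide masses quoted in the text from Table 1. The function name `implied_phosphates` is illustrative, not part of any analysis software used in the study.

```python
# Approximate mass added per phosphate group (HPO3); an assumed round value.
PHOSPHATE_SHIFT_DA = 80.0


def implied_phosphates(phospho_mass: float, observed_mass: float) -> int:
    """Estimate how many phosphate groups account for a mass difference."""
    return round((phospho_mass - observed_mass) / PHOSPHATE_SHIFT_DA)


# (phosphorylated reference mass, soybean casein mass), as quoted in the text.
peptides = {
    "1-25": (3125, 2803),
    "33-48": (2063, 1983),
}

for name, (reference, soybean) in peptides.items():
    print(name, implied_phosphates(reference, soybean))
# 1-25 4
# 33-48 1
```

Under this assumption the two peptides together account for five missing phosphate groups, which matches the five phosphoserines (residues 15, 17, 18, 19 and 35) described for the bovine protein.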

4.3. The smaller of the doublet casein polypeptides in soybean is missing a carboxyterminal peptide

In bovine β-casein, the Lys residues at positions 28 and 29 are very susceptible to non-enzymatic cleavage (for example, from freeze-thawing) as well as to in vivo enzymatic cleavage by plasmin, a serine protease with a higher specificity than trypsin for the Lys 28–29 bond. This susceptibility is often the origin of the double band in bovine β-casein (Rafael Jimenez-Flores, personal communication). The smaller product, termed gamma casein, lacks the first 28 amino acids. Four of the five phosphoserines are also in this peptide. Thus, it was possible that the difference between the two soybean polypeptides (double bands of 28 and 26 kDa in SDS gels) was due to a cleavage at this position. However, N-terminal sequencing of the polypeptides gives no evidence for the loss of an amino terminal peptide from the lower molecular weight soybean casein polypeptide. Thus, the reason for the double bands in SDS gels in transgenic soybean seed is not the generation of a gamma casein. Instead, we found that a 1.4 kDa peptide towards the carboxy terminal end of the 28 kDa soybean casein polypeptide was missing from the smaller and less abundant 26 kDa polypeptide (Table 1). Hence, we speculate that there is a proteolytic cleavage that produces the smaller species. Whether this cleavage takes place in the plant before or after the cells are broken during extraction is unclear. Although there is not a tryptic site in the known sequence prior to this peptide, the preceding residue is Phe, so a chymotrypsin-like cleavage would remove this carboxyterminal peptide.
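The final inference rests on the residue immediately preceding the cleaved bond (the P1 position). A minimal sketch of that rule follows, under simplifying assumptions: trypsin-like enzymes cleave after Lys or Arg, chymotrypsin-like enzymes cleave after Phe, Tyr or Trp, and secondary preferences (such as Leu for chymotrypsin) and the inhibitory effect of a following Pro are ignored. The function `candidate_proteases` is hypothetical and for illustration only.

```python
# Simplified P1 preferences; real specificities are broader than these sets.
TRYPSIN_P1 = set("KR")        # Lys, Arg
CHYMOTRYPSIN_P1 = set("FYW")  # Phe, Tyr, Trp


def candidate_proteases(p1_residue: str) -> list:
    """List protease classes whose simplified rule matches the P1 residue."""
    p1 = p1_residue.upper()
    matches = []
    if p1 in TRYPSIN_P1:
        matches.append("trypsin-like")
    if p1 in CHYMOTRYPSIN_P1:
        matches.append("chymotrypsin-like")
    return matches


print(candidate_proteases("F"))  # ['chymotrypsin-like']
print(candidate_proteases("K"))  # ['trypsin-like']
```

With Phe at P1, only the chymotrypsin-like rule matches, which is consistent with the absence of a tryptic site before the missing carboxyterminal peptide.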

4.4. The soybean casein is localized in the protein storage vacuoles in developing seeds

Immunolocalization demonstrates that bovine β-casein is found in the protein storage vacuoles of the transgenic seed (Fig. 4). There is little observable casein protein located in the cell walls or the extracellular space between cells. The default pathway in non-seed tissues for proteins with a hydrophobic signal sequence alone, and no vacuolar targeting information, is thought to be export from the cell, which would lead to accumulation of the protein in the intercellular spaces. Clearly, this is not the case for the localization of the casein polypeptides after removal of the lectin amino terminal signal sequence. The native soybean lectin is known to be localized in the protein storage vacuoles [21], but the details of its in vivo processing are not known. The mature lectin protein begins with an alanine at residue 33, and computer programs predict the preferred site of cleavage of the hydrophobic signal sequence to be between the serine and alanine of residues 32 and 33, respectively. We have shown that cleavage also occurs at this position in the chimeric lectin–casein fusion protein and that cleavage removes the extra alanine residues of the NotI fusion site. Thus, misprocessing of the lectin signal leading to artificial retention of casein polypeptides in the ER is not the reason for the localization in the PSV.

A chimeric fusion protein joined to the hydrophobic lectin signal sequence of 32 amino acids would be predicted to be exported from the cell via the default pathway in the absence of additional vacuolar targeting information. However, sorting is a complex process that likely involves multiple mechanisms. Plant cells maintain two separate sorting pathways to two distinct types of vacuoles, the lytic vacuoles and the specialized protein storage vacuoles of seeds, and different types of proteins accumulate in each [22,23]. Other classes of proteins, such as the cereal prolamines, accumulate in specific regions of ER destined for formation of protein storage vacuoles by an autophagy-like process [24,25]. There are several possible explanations for why the casein protein fused to the lectin signal sequence is localized to the PSV. One is that no additional targeting information is required because the default pathway in seeds is not export, but rather is transport to the protein storage vacuoles.
Another possibility is that cryptic sorting signals present in the mature casein polypeptide are recognized by the targeting receptors of the sorting machinery of the cell. Alternatively, the 32 amino acid N-terminal sequence of the soybean lectin may contain information that specifies that the nascent lectin protein and the nascent lectin–casein chimeric protein enter a region of the ER destined for formation of the protein storage vacuoles by an autophagy-like process. Finally, the lectin 3′ untranslated region may be involved in localizing the synthetic machinery to specific regions of the ER that are destined to form protein storage vacuoles. Recently, the 3′ untranslated region of a rice prolamine storage protein gene has been shown to be necessary for targeting the prolamines to specific subregions of the ER in transgenic rice seeds [26].

Although the exact mechanism is unknown, the lectin expression cassette is sufficient for localizing the foreign protein, bovine β-casein, to the protein storage vacuoles of the developing seed. Recently, we have also shown that the green fluorescent protein (GFP) is found in vacuoles of transgenic Arabidopsis thaliana seeds when it is fused to the 32 amino acid N-terminal sequence of the lectin protein in this cassette (Darnowski and Vodkin, unpublished). Our earlier work on tobacco with lectin–GUS fusions [5] and our present results with two very different marker proteins (bovine β-casein and GFP) support this basic conclusion. Additionally, we show that the 32 amino acid N-terminal sequence is precisely removed from the lectin–casein fusion. The lectin expression cassette, which contains its promoter, 32 amino acid signal sequence, and 3′ untranslated region, will likely have general utility to direct seed-specific expression and localization of valuable foreign proteins to the PSV in developing seeds, with efficient removal of the lectin amino terminal sequences from the fusion protein.

Acknowledgements

We thank Daniel Choffnes for his excellent assistance with protein immunoblotting and genetic analysis of the transformed seed. We thank Ken Williams of the Keck Foundation Biotechnology Resource Laboratory at Yale University for valuable advice on the MALDI-MS. We are grateful to the North Central Soybean Research Program and the United Soybean Board for grants supporting the research.

References

[1] H. Horvath, J. Huang, O. Wong, E. Kohl, T. Okita, C.G. Kannangara, D. von Wettstein, The production of recombinant proteins in transgenic barley grains, Proc. Natl. Acad. Sci. USA 97 (2000) 1914–1919.
[2] A.R. Kusnadi, Z.L. Nikolov, J.A. Howard, Production of recombinant proteins in transgenic plants: practical considerations, Biotechnol. Bioeng. 56 (1997) 473–484.
[3] P.J. Maughan, R. Philip, M.J. Cho, J.M. Widholm, L.O. Vodkin, Biolistic transformation, expression and inheritance of bovine β-casein in soybean (Glycine max), In Vitro Cell. Devel. Biol. Plant 35 (1999) 344–349.
[4] M.-J. Cho, J.M. Widholm, L.O. Vodkin, Cassettes for seed-specific expression tested in transformed embryogenic cultures of soybean, Plant Mol. Biol. Rep. 13 (1995) 255–269.
[5] R. Philip, D.W. Darnowski, V. Sundararaman, M.-J. Cho, L.O. Vodkin, Localization of glucuronidase in protein bodies of transgenic tobacco seed by fusion to an amino terminal sequence of the soybean lectin gene, Plant Sci. 137 (1998) 191–204.
[6] L.O. Vodkin, P.R. Rhodes, R.B. Goldberg, A lectin gene insertion has the structural features of a transposable element, Cell 34 (1983) 1023–1031.
[7] D.J. Shapiro, J.M. Taylor, G.S. McKnight, R. Palacios, C. Gonzalez, M.L. Kiley, R.T. Schimke, Isolation of hen oviduct ovalbumin and rat liver albumin polysomes by indirect immunoprecipitation, J. Biol. Chem. 249 (1974) 3665–3671.
[8] L.O. Vodkin, Isolation and characterization of messenger RNAs for seed lectin and Kunitz trypsin inhibitor in soybeans, Plant Physiol. 68 (1981) 766–771.
[9] D. McCarty, A simple method for extraction of RNA from maize tissue, Maize Genet. Coop. Newslett. 60 (1986) 61.
[10] R. Jimenez-Flores, Y.C. Kang, T. Richardson, Cloning and sequence analysis of bovine β-casein cDNA, Biochem. Biophys. Res. Commun. 142 (1987) 617–621.
[11] J.J. Todd, L.O. Vodkin, Duplications that suppress and deletions that restore expression from a soybean multigene family, Plant Cell 8 (1996) 687–699.
[12] P.H. O'Farrell, High resolution two-dimensional electrophoresis of proteins, J. Biol. Chem. 250 (1975) 4007–4021.
[13] F.M. Ausubel, R. Brent, R.E. Kingston, D.M. Moore, J.G. Seidman, J.A. Smith, K. Struhl (Eds.), Current Protocols in Molecular Biology, vol. 3, Suppl. 33, Greene Publishing Associates and Wiley-Interscience, Wiley, New York, 1990, pp. 18.5.2–18.5.3.
[14] D.W. Darnowski, L.O. Vodkin, Construction of a device composed of common plumbing supplies for freezing microscopy samples, Biotechniques 24 (1998) 412–416.
[15] P.-C. Liao, J. Leykam, P.C. Andrews, D.A. Gage, J. Allison, An approach to locate phosphorylation sites in a phosphoprotein: mass mapping by combining specific enzymatic degradation with matrix-assisted laser desorption/ionization mass spectrometry, Anal. Biochem. 219 (1994) 9–20.
[16] S. Jeng, G.T. Bleck, M.B. Wheeler, R. Jimenez-Flores, Characterization and partial purification of bovine α-lactalbumin and β-casein produced in milk of transgenic mice, J. Dairy Sci. 80 (1997) 3167–3175.
[17] G. Simons, W. Van den Heuvel, T. Reynen, A. Fritjers, G. Rutten, C.J. Slangen, M. Groenen, W.M. deVos, R.J. Siezen, Overproduction of bovine β-casein in Escherichia coli and engineering of its main chymosin cleavage site, Prot. Eng. 6 (1993) 763–770.

[18] R. Jimenez-Flores, T. Richardson, L.F. Bisson, Expression of bovine β-casein in Saccharomyces cerevisiae and characterization of the protein produced in vivo, J. Agric. Food Chem. 38 (1990) 1134–1141.
[19] E. Hitchin, E.M. Stevenson, J. Clark, M. McClenaghan, J. Leaver, Bovine β-casein expressed in transgenic mouse milk is phosphorylated and incorporated into micelles, Protein Expr. Purif. 7 (1996) 247–252.
[20] D.K.X. Chong, W. Roberts, T. Arakawa, K. Illes, G. Bagi, C.W. Slattery, W.H.R. Langridge, Expression of the human milk protein β-casein in transgenic potato plants, Transgenic Res. 6 (1997) 289–296.
[21] L.O. Vodkin, N.V. Raikhel, Soybean lectin and related proteins in seeds and roots of Le+ and Le− soybean varieties, Plant Physiol. 81 (1986) 558–565.
[22] N. Paris, C.M. Stanley, R.L. Jones, J.C. Rogers, Plant cells contain two functionally distinct vacuolar compartments, Cell 85 (1996) 563–572.
[23] J.-M. Neuhaus, J.C. Rogers, Sorting of proteins to vacuoles in plant cells, Plant Mol. Biol. 38 (1998) 127–144.
[24] A.R. Kermode, Mechanisms of intracellular protein transport and targeting in plant cells, Crit. Rev. Plant Sci. 15 (1996) 285–423.
[25] T.W. Okita, J.C. Rogers, Compartmentation of proteins in the endomembrane system of plant cells, Annu. Rev. Plant Physiol. Plant Mol. Biol. 47 (1996) 327–350.
[26] S.-B. Choi, C. Wang, D.G. Muench, K. Ozawa, V.R. Franceschi, Y. Wu, T.W. Okita, Messenger RNA targeting of rice seed storage proteins to specific ER subdomains, Nature 407 (2000) 765–767.