Biochem. Physiol. Pflanzen 183, 99-106 (1988) VEB Gustav Fischer Verlag Jena
The Structure and Function of Zein Genes of Maize 1)
J. W. S. BROWN, U. G. MAIER, M. SCHWALL, L. SCHMITZ, C. WANDELT, and G. FEIX
Institut für Biologie III, Albert-Ludwigs-Universität, Freiburg, F.R.G.
Key Term Index: zein, promoter, transcription, splicing, DNA-protein binding, Zea mays
Summary
Zein genes are the genes coding for the zein storage proteins of maize. Analysis of a number of genomic zein clones by a variety of methods has suggested a unique gene structure for zein genes in which at least two promoters lie upstream of the coding region. The evidence for this structure is summarised, and new evidence which suggests a splicing event in the mRNA transcribed from the upstream (P1) promoter region is discussed. The 5' flanking region of one genomic clone, pMS1, has been analysed in detail for DNA-protein binding. Regions with potential regulatory effects on zein gene expression, including tissue-specificity, have been tentatively identified and will be studied on a functional basis following electroporation of different constructions into maize kernel protoplasts.
Introduction
Zein proteins are the major storage proteins of maize, making up approximately 50 to 70% of the total seed storage protein. The zein proteins can be divided into two major classes based on molecular weight: 19 and 21 kilodaltons (kDa) (GIANAZZA et al. 1976). The genes coding for these proteins form a multigene family comprising 50 to 100 hybridising regions (WIENAND and FEIX 1980; HAGEN and RUBENSTEIN 1981; BURR et al. 1982), of which many are pseudogenes containing mutations resulting in disrupted reading frames. A number of cDNA and genomic zein clones have been isolated from different maize varieties and analysed (for reviews: MESSING et al. 1983; HEIDECKER and MESSING 1986). The zein genes represent a developmentally regulated and highly coordinated gene system expressed exclusively in the endosperm. Towards an understanding of the molecular basis for this specific gene expression, we have determined the sequence and structure of several cloned zein genes and searched for putative regulatory motifs and for regions of homology common to all zein genes or to genes of the same subclass (BROWN et al. 1986; WANDELT 1987). These studies were complemented by transient transformation assays in which these genes and derived mutants were injected into the green alga Acetabularia. These experiments, together with the analysis of the specific binding of nuclear proteins to distinct parts of the genes, led to the tentative identification of regulatory regions of the zein genetic system.
1) The paper was presented at the 4th Symposium on Seed Proteins, held at Gatersleben, G.D.R., July 19-23, 1987.
Results and Discussion
Zein gene structure
The structure of zein genes from the variety A619 has been studied using two genomic clones representing the 19 kDa class (pMS1 and pMS2) and two representing the 21 kDa class (pML1 and pML2) (WIENAND et al. 1982; LANGRIDGE and FEIX 1983; LANGRIDGE et al. 1985; BROWN et al. 1986; WANDELT 1987). These genomic clones have been analysed by DNA sequencing of the coding and flanking regions, Northern analyses, R-looping, S1-mapping, in vitro transcription and in vivo transcription in yeast (WIENAND et al. 1981; LANGRIDGE et al. 1982a, 1982b, 1984, 1985; LANGRIDGE and FEIX 1983). Based on these analyses, a unique (and indeed controversial) structure has been proposed for the zein genes, at least those of the maize variety A619. Zein genes appear to have two promoters, P1 and P2: the P1 promoter lies approximately 900-1,000 bp upstream and the P2 promoter 40 to 60 bp upstream of the start of the coding region, which would give rise to RNA transcripts of about 1,800 and 900 bases, respectively.
The first suggestion of a multiple promoter system was obtained through Northern analyses of total endosperm RNA from A619 and from its opaque-2 derivative on formaldehyde agarose gels (LANGRIDGE et al. 1982a). In these Northern analyses cDNA probes specific to zein genes of the 19 and 21 kDa classes were hybridised to total RNA from different stages of development. Besides the expected band of 900 to 1,000 bases, other higher molecular weight zein-specific bands were clearly visible. Furthermore, the effect of the opaque-2 mutation, which reduces zein protein production in the endosperm and in particular that of the 21 kDa zein class, was reflected in the Northern analyses: the 19 kDa class final-size and higher molecular weight RNAs were reduced in amount and the 21 kDa class RNAs were scarcely visible (LANGRIDGE et al. 1982a). It is unlikely that these RNAs are an artefact of the RNA extraction or gel separation systems.
They are observed following RNA extraction by the original method reported by LANGRIDGE et al. (1982a, b) and by the guanidinium isothiocyanate RNA extraction method (MANIATIS et al. 1982), and are seen following separation of RNA on denaturing formaldehyde, methylmercuric hydroxide and glyoxal agarose gels (LANGRIDGE et al. 1982b). Furthermore, the 900 base and 1,800 base zein-specific RNAs have been observed in Northern analyses of poly(A)+ RNA fractions following oligo(dT)-cellulose and poly(U)-Sepharose chromatography (LANGRIDGE et al. 1982b; WANDELT 1987) and in R-loop analyses between poly(A)+ RNA and linearised plasmid DNA (LANGRIDGE et al. 1982b).
The analysis of zein gene structure was carried further in in vitro transcription experiments with Xenopus oocyte germinal vesicle and HeLa cell nuclear extracts. Addition of supercoiled DNA of the plasmid pML1 (21 kDa) to the oocyte-derived transcription system gave rise to bands of 900 and 1,800 bases which, when hybridised to a restriction digest of pML1, only lit up fragments which contained the coding region and about 1 kb of 5' flanking region (LANGRIDGE et al. 1982b; LANGRIDGE and FEIX 1983). The presence of the two bands in this system suggested that the zein gene contained on pML1 had either two promoters, each giving rise to a transcript, or one upstream promoter, with the 1,800 base and 900 base transcripts representing unprocessed and processed mRNAs respectively (LANGRIDGE et al. 1982b). In the transcription system prepared from HeLa cell extracts both supercoiled and linear plasmid DNA were active but only the 1,800 base transcript was observed. This transcript also hybridised to fragments containing the coding region and 1 kb of 5' flanking sequence.
Further evidence for the two promoter system has come from S1-mapping of the transcription start sites of the P1 and P2 promoters using RNA extracted from maize endosperm. The transcription start sites have been mapped to positions -42 and -56 (P2) and -920 and -922 (P1) for pMS1 and pMS2 (LANGRIDGE et al. 1984, 1985) and to positions -52 and -65 (P2) and -1038 and -1047 (P1) for pML1 and pML2 (LANGRIDGE and FEIX 1983). When the inserts of the genomic clones pMS1 and pML1 were stably integrated into the yeast genome, transcription was observed from the P2 promoter of pML1 and the P1 promoter of pMS1. Fine S1-mapping of the latter transcript showed that the transcriptional initiation sites were identical in yeast and maize endosperm, although no zein protein was produced (LANGRIDGE et al. 1984).
In the absence of a homologous transformation/expression system, the zein genomic clones pMS1, pMS2, pML1 and pML2 have been injected in supercoiled form into the nuclei of the single-celled alga Acetabularia mediterranea. Following incubation, zein protein was detected in the cytoplasm of the algal cells with a zein-specific immunofluorescent staining technique (LANGRIDGE et al. 1985; BROWN et al. 1986). This Acetabularia system still represents the only in vivo system capable of expressing cloned zein genes, but suffers from the disadvantage that it has not been possible to isolate RNA suitable for S1-mapping and primer extension. Thus, in order to study transcription it was necessary to construct a number of deletion mutants of our genomic clones.
Constructs containing an intact P1 promoter gave rise to zein polypeptides in the algal cytoplasm, while constructs in which the P1 promoter was affected by deletion of the CAAT box or the CAAT and TATA boxes did not produce zein protein. This result suggests that the P2 promoter, which lies directly in front of the coding region, was inactive and therefore not recognised in the Acetabularia nuclei (BROWN et al. 1986).
The results presented above from Northern analyses, R-looping, S1-mapping, and in vitro and in vivo transcription/expression studies all provide evidence that zein genes have at least two promoter regions, one in front of the coding sequence and one lying a further 900 to 1,000 bp upstream. The controversy which followed the publication of the proposed two promoter zein gene structure has arisen because a) the higher molecular weight zein-specific RNAs have not been observed in Northern analyses in other laboratories, and the amount of these RNAs relative to "final-size" mRNA and the exact nature of these hybridising bands have been brought into question, and b) these RNAs were referred to as "precursor" mRNAs (LANGRIDGE et al. 1982a, b).
It has been suggested that the higher molecular weight zein-specific RNAs seen in Northern experiments are due to cross-hybridisation of the zein cDNA probes with ribosomal RNA in the total RNA preparation (BOSTON et al. 1986; KRIZ et al. 1987). This supposition can be discounted for a number of reasons. Firstly, there is no sequence homology between the cDNA probes and 18S and 25S rRNA. Secondly, specific 18S and 25S rDNA probes from the maize rRNA gene (TOLOCZYKI and FEIX 1986) did not hybridise to zein mRNA. Thirdly, in the Northern experiments with RNA from the opaque-2 derivative of A619, the high molecular weight RNAs were visible with the 19 kDa zein cDNA probe but not with the 21 kDa zein cDNA probe (LANGRIDGE et al. 1982a). Finally, hybridisation of zein cDNAs to a Northern blot of total RNA extracted from kernels at different stages of development showed that the amount of zein mRNAs, including the higher molecular weight RNAs, increased and then decreased, whereas hybridisation with an rRNA probe showed a constant level of rRNA throughout the developmental time course. None of the above results is consistent with cross-hybridisation of zein and rRNA sequences.
The amount of zein mRNA precursor was reported as representing 10 to 30% of the zein mRNA (LANGRIDGE et al. 1982b). These levels have been observed and are still observed in our routine RNA preparations. A precursor-product relationship has now been established for the 1,800 base transcript in that an intron in the 5' flanking region appears to be removed by a splicing event. In the 5' flanking regions between the upstream P1 promoter and the translation initiation codon of pMS1 and pML1 lie 18 AUG codons, each followed by a short reading frame and a stop codon. Translation of an mRNA produced from P1 would require that these AUGs are not recognised as translation starts, that translation continually reinitiates after termination, or that the mRNA is processed to remove these short reading frames. Evidence for a splicing event that removes the short reading frames and stop codons comes from a comparison of S1-mapping and primer extension signals, the DNA sequence, injection of certain deletion mutants into Acetabularia and in vitro splicing of a hybrid intron in a HeLa cell nuclear extract. Firstly, the S1-mapping of the P2 promoter in pMS1 and pMS2 has shown two signals at -56 and -42. Primer extension analysis, on the other hand, shows signals at -56 and -57, suggesting that the S1 signal at -42 does not represent a transcription start (BROWN et al. 1987).
Secondly, this S1 signal lies adjacent to an AG dinucleotide (the highly conserved dinucleotide at the 3' end of an intron; MOUNT 1982; BROWN 1986). The sequence surrounding this dinucleotide has other attributes of a 3' splice junction: a reasonable polypyrimidine stretch and a perfect branch point consensus sequence. This discrepancy between the S1-mapping and the primer extension signals, and the proximity of an AG dinucleotide to the downstream S1 signal, have also been observed for pML1 (WANDELT 1987). Thirdly, injection of pMS1 into Acetabularia gave rise to zein protein in the algal cytoplasm (BROWN et al. 1986). Injection of a deletion mutant of pMS1 in which all of the short reading frames between the P1 promoter and the AUG codon have been removed also gave rise to zein protein. However, injection of a construct which removed the proposed 3' splice junction and left a single short reading frame with a stop codon between P1 and the AUG codon did not produce zein protein (BROWN et al. 1986, 1987). Thus, at least in Acetabularia, the short reading frames lying in the 5' region disrupt translation unless the 3' splice site is present. Finally, a hybrid intron was constructed in pSP65 between the 5' end of a legumin J intron of pea and a fragment containing the 3' splice junction lying in the 5' flanking region of pMS1. RNA produced from this construct by in vitro transcription was shown to be spliced in an in vitro HeLa cell nuclear extract splicing system. Thus, the original reference to the larger zein-specific RNAs seen in Northerns as representing zein precursor RNAs was correct, at least for the 1,800 base transcript of pMS1.
As stated by LANGRIDGE et al. (1982b), the unusual feature of multiple zein precursor RNAs in concentrations of 10 to 30% of the total zein mRNA is felt to reflect the mechanism of zein gene expression. It is possible that transcription from the upstream promoters or processing of the high molecular weight zein mRNAs may represent levels of regulation of zein gene expression, in which case the variation in the level of the precursor mRNAs may reflect the physiological state of the plant and the conditions under which it was grown.
The zein gene family is a multigene family in which the genes are for the most part clustered on three chromosomes (although the degree of linkage is variable; SOAVE et al. 1982). The complexity of this gene family has been seen in comparisons of various cDNA and genomic clones, where numerous point mutations, insertions, deletions and rearrangements have been observed. These rearrangements have obviously played a part in the evolution and diversification of the zein gene family, and continual rearrangement by unequal crossing-over and gene conversion between genes has been suggested as a means of maintaining a set of functional zein genes (HEIDECKER and MESSING 1986). The variation seen in the sequences of the coding regions and flanking sequences, in the intergenic distances as revealed by Southern analyses, in the differing lengths of flanking sequences in cloned genomic zein genes, and in cosmid clones (I. RUBENSTEIN and J. MESSING, University of Minnesota) suggests that abundant and large rearrangements have occurred. The variation in flanking sequences is seen in comparisons of genes of the 19 kDa class, where the homology in the 5' region stops at different positions for each gene (BROWN et al. 1986; KRIZ et al. 1987). Differences on a much larger scale are apparent with the recent finding that two 19 kDa genes run into highly repeated sequences within a few hundred base pairs of either the 5' or 3' end of the structural gene (KRIZ et al. 1987), which has not been observed for other zein genes. The level of complexity is then further increased by the presence of upstream promoters and by the limitations of the methods for analysing the promoters of genes in multigene families in general.
For example, a signal obtained in a Northern analysis, or by S1-mapping or primer extension with total RNA and with a DNA fragment or primer from a specific cloned gene, only gives information on the expression of gene families or at best sub-families, but gives none about the expression of the individual gene in question. Thus, it is not known whether all zein genes have, or are transcribed from, more than one promoter, or whether some genes are transcribed from the P2 promoter only, some from P1 only, and some from other promoters lying further upstream. The position of the promoters may have been altered relative to their structural genes by insertions or deletions in their flanking regions, thus giving rise to the different zein-specific mRNAs. Such a situation apparently exists with the 19 kDa clone pMS2. One subclone, pMS2CB, which extends to position -1094 and contains the P1 transcription start but, by virtue of a change in sequence, not the CAAT and TATA boxes, is not functional in Acetabularia. However, a second subclone, pMS2BB, which extends a further 2 kb in the 5' direction, does give rise to protein. The inference from this result is that in pMS2 the P1 promoter is not active and expression in Acetabularia is due to another promoter lying still further upstream. The presence of a functional 3' splice site in front of the structural genes would maximise the possibility of processing of the various precursor mRNAs to a translatable form.
Amplification of specific zein gene sequences in the highly amplified genome of the triploid maize endosperm cells could also play a role in the control of zein gene expression. The presence of multiple promoters may increase the capacity for binding of RNA polymerase II by simply providing more binding sites, which may lead to an accelerated or more efficient transcription process. The efficiency of the splicing of the various precursors may also represent a level at which translational regulation may be effected.
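The argument about the short reading frames in the P1 leader can be made concrete with a small scan for upstream reading frames. The following Python sketch is illustrative only: the function name `find_upstream_orfs`, the toy leader sequence and all coordinates are assumptions and are not taken from pMS1 or pML1. It lists every ATG that is followed in the same frame by a stop codon, which is the kind of feature a P1 transcript would have to bypass, or lose by splicing, before the zein reading frame could be translated.

```python
from typing import List, Tuple

# Standard stop codons, written as DNA.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_upstream_orfs(leader: str) -> List[Tuple[int, int]]:
    """Return (start, end) pairs for each ATG in `leader` that is followed,
    in the same reading frame, by a stop codon inside the leader.

    `start` is the 0-based index of the A of the ATG; `end` is one past the
    last base of the stop codon. RNA input (U) is accepted and read as T.
    """
    seq = leader.upper().replace("U", "T")
    orfs = []
    for start in range(len(seq) - 2):
        if seq[start:start + 3] != "ATG":
            continue
        # Walk codon by codon from the ATG until the first in-frame stop.
        for pos in range(start + 3, len(seq) - 2, 3):
            if seq[pos:pos + 3] in STOP_CODONS:
                orfs.append((start, pos + 3))
                break
    return orfs


if __name__ == "__main__":
    # Hypothetical toy leader with two short reading frames (not zein sequence).
    toy_leader = "CCATGAAATAGGGATGCCCTGACC"
    print(find_upstream_orfs(toy_leader))  # [(2, 11), (13, 22)]

    # Removing the stretch that carries both frames, as a splice would,
    # leaves a leader without any short reading frame.
    spliced = toy_leader[:2] + toy_leader[22:]
    print(find_upstream_orfs(spliced))  # []
```

Applied to a real 5' flanking sequence, one would compare the result before and after removing the proposed intron. The sketch does not model reinitiation or AUGs that go unrecognised, the two alternatives to splicing mentioned in the text.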
Analysis of the 5' flanking regions of zein genes for DNA-protein binding activity
The generation of extensive 5' flanking sequences of pMS1 and pML1 has allowed comparisons of sequences between genes coding for 19 kDa and 21 kDa zeins. Regions of homology extending to position -225 were found, particularly in the CAAT and TATA box regions. Further upstream, very little homology was found, with the exception of a 15 bp sequence at approximately -330 which was conserved in all zein genes published so far. DNA-protein binding analysis by nitrocellulose filter binding has shown specific binding of a nuclear factor(s) to fragments containing this consensus sequence and to fragments containing the P2 promoter region including the CAAT and TATA boxes (MAIER et al. 1987a). The binding to the consensus sequence region was analysed in detail by gel retention and DNase I footprinting experiments. A nuclear factor(s) was shown to bind to a 22 bp region which contained 14 bp of the 15 bp conserved sequence described above (MAIER et al. 1987a). Similar results have now been obtained for the P1 promoter region and for a sequence similar to the above consensus which lies the same distance upstream of the P1 promoter. Furthermore, by comparing the DNA-protein binding activities of nuclear extracts from maize endosperm and maize seedlings, it can be shown that fragments containing the P1 and P2 promoter regions bind nuclear factor(s) from both extracts, while fragments containing the 15 bp consensus sequence and the similar sequence upstream from P1 bind nuclear factor(s) only from maize endosperm nuclear extracts (MAIER et al. 1987b). This result suggests that these regions may be important in the tissue-specific regulation of the zein genes in the maize endosperm.
The success of the DNA-protein binding studies on the zein 5' flanking regions has depended greatly on the development of a method for the preparation of crude nuclear extracts from developing maize kernels (TOLOCZYKI 1987; MAIER et al. 1987a). Although the nuclear extracts tend to be variable in their DNA-protein binding activity, we have found binding activity for genes transcribed by RNA polymerase I (maize rRNA gene) and RNA polymerase II (zein genes). Indeed, every extract is tested for DNA-protein binding activity by nitrocellulose filter binding assays with fragments containing the consensus sequence from pMS1 and part of the external spacer region of a maize rRNA gene (TOLOCZYKI and FEIX 1986).
Expression of CaMV and zein promoter constructs following electroporation into protoplasts prepared from maize kernels
Recently, a method has been developed for the preparation of viable protoplasts from maize kernels at an early stage of development (SCHWALL et al., in preparation). These protoplasts are suitable for electroporation of plasmid DNA and the subsequent analysis of transient expression of the genes contained on the plasmids. Thus far, the bacterial chloramphenicol acetyl transferase (CAT) gene has been expressed from plasmids in which this gene has been cloned behind the 35S promoter of Cauliflower Mosaic Virus (CaMV) and behind the 5' region of pMS1 containing both the P1 and P2 promoters. This system will now allow the analysis by transient expression of the zein promoters and other regulatory sequences in cells of the tissue in which they are normally expressed. That is, in the absence of stable transformation of maize, we can now, finally, study zein gene expression in a homologous system. The construction of an array of plasmids containing either marker genes or a marked zein gene (in which a region of a pea storage protein gene has been cloned, in frame, into the zein gene) will further this aim.

References
BOSTON, R. S., KODRZYCKI, R., and LARKINS, B. A.: Transcriptional and post-transcriptional regulation of zein genes. Molecular biology of seed storage proteins and lectins. (Eds. SHANNON, L. M., and CHRISPEELS, M. J.) Ninth symposium on plant physiology. University of California, Riverside, Calif. pp. 117-126 (1986).
BROWN, J. W. S.: A catalogue of splice junction and putative branch point sequences from plant introns. Nucl. Acids Res. 14, 9549-9559 (1986).
BROWN, J. W. S., WANDELT, C., FEIX, G., NEUHAUS, G., and SCHWEIGER, H.-G.: The upstream regions of zein genes: sequence analysis and expression in the unicellular alga Acetabularia. Eur. J. Cell Biol. 42, 161-170 (1986).
BROWN, J. W. S., NEUHAUS, G., and FEIX, G.: Evidence for splicing of the zein mRNA produced from the P1 upstream promoter of a zein gene (in preparation).
BURR, B., BURR, F. A., ST. JOHN, T. P., THOMAS, J. M., and DAVIS, R. W.: Zein storage protein gene family of maize. J. Mol. Biol. 154, 33-49 (1982).
GIANAZZA, E., RIGHETTI, P. G., PIOLI, F., GALANTE, E., and SOAVE, C.: Size and charge heterogeneity of zein in normal and opaque-2 maize endosperms. Maydica 21, 1-17 (1976).
HAGEN, G., and RUBENSTEIN, I.: Complex organisation of zein genes in maize. Gene 13, 239-249 (1981).
HEIDECKER, G., and MESSING, J.: Structural analysis of plant genes. Ann. Rev. Plant Physiol. 37, 439-466 (1986).
KRIZ, A. L., BOSTON, R. S., and LARKINS, B. A.: Structural and transcriptional analysis of DNA sequences flanking genes that encode 19 kilodalton zeins. Mol. Gen. Genet. 207, 90-98 (1987).
LANGRIDGE, P., BROWN, J. W. S., PINTOR-TORO, J. A., FEIX, G., NEUHAUS, G., NEUHAUS-URL, G., and SCHWEIGER, H.-G.: Expression of zein genes in Acetabularia mediterranea. Eur. J. Cell Biol. 39, 257-264 (1985).
LANGRIDGE, P., EIBEL, H., BROWN, J. W. S., and FEIX, G.: Transcription from maize storage protein gene promoters in yeast. EMBO J. 3, 2467-2471 (1984).
LANGRIDGE, P., and FEIX, G.: A zein gene of maize is transcribed from two widely separated promoter regions. Cell 34, 1015-1022 (1983).
LANGRIDGE, P., PINTOR-TORO, J. A., and FEIX, G.: Transcriptional effects of the opaque-2 mutation of Zea mays L. Planta 156, 166-170 (1982a).
LANGRIDGE, P., PINTOR-TORO, J. A., and FEIX, G.: Zein precursor mRNAs from maize endosperm. Mol. Gen. Genet. 187, 432-438 (1982b).
MAIER, U. G., BROWN, J. W. S., TOLOCZYKI, C., and FEIX, G.: Binding of a nuclear factor to a consensus sequence in the 5' flanking region of zein genes from maize. EMBO J. 6, 17-22 (1987).
MAIER, U. G., BROWN, J. W. S., SCHMITZ, L., DIETRICH, G., SCHWALL, M., and FEIX, G.: Tissue-specific DNA-protein binding interactions with fragments from the 5' flanking regions of a zein gene (in preparation).
MANIATIS, T., FRITSCH, E., and SAMBROOK, J.: Molecular cloning. Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 1982.
MESSJ:
SOAVE, A., REGGIANI, R., DI FONZO, N., and SALAMINI, F.: Clustering of genes for 20 kd zein subunits in the short arm of chromosome 7. Genetics 97, 363-377 (1981).
SPENA, A., VIOTTI, A., and PIROTTA, V.: Two adjacent genomic zein sequences: structure, organisation and tissue-specific restriction pattern. J. Mol. Biol. 169, 799-811 (1983).
TOLOCZYKI, C.: Structure and expression of a cloned rRNA gene from Zea mays L. PhD Thesis, University of Freiburg (1987).
TOLOCZYKI, C., and FEIX, G.: Occurrence of 9 homologous repeat units in the external spacer region of a nuclear maize rRNA gene unit. Nucl. Acids Res. 14, 4969-4986 (1986).
WANDELT, C.: Structure and expression of cloned zein genes of the 21 kDa class from Zea mays. PhD Thesis, University of Freiburg (1987).
WIENAND, U., LANGRIDGE, P., and FEIX, G.: Isolation and characterisation of a genomic sequence of maize coding for a zein gene. Mol. Gen. Genet. 182, 440-444 (1981).
WIE:'
, UWE G. MAIER, MICHAEL SCHWALL, LIENHARD SCHMITZ, Dr. CHRISTINE WANDELT and Prof. Dr. GÜNTER FEIX, Institut für Biologie III, Albert-Ludwigs-Universität, Schänzlestraße 1, D-7800 Freiburg.
BPP 183 (1988) 2-3