J. Mol. Biol. (1982) 157, 681-686
LETTERS TO THE EDITOR
Abbreviated 3' Non-coding Region in Duck Alpha D Globin Messenger RNA Defines Evolutionarily Conserved Sequences

Nucleotide sequence data for the duck alpha D globin gene indicate an extremely short 3' untranslated region of only 49 nucleotides. Evolutionarily conserved sequences defined short regions of potential functional significance. Comparisons among sequences for duck and chicken alpha globins indicate that this region has importance in the regulation of globin gene expression.
One of the goals of studies on messenger RNA primary structure is to determine evolutionarily conserved sequences, which are thought to constitute important functional determinants of the mRNA molecule. Eukaryotic mRNAs contain 5' and 3' non-coding sequences that flank their coding regions. Nucleotide sequence comparisons are especially important for the determination of significant sequences in these 5' and 3' untranslated regions of mRNA, where the contribution of a particular sequence is not often obvious. In our studies of the duck globin gene system we have determined nucleotide sequences from a number of globin mRNA molecules (Paddock & Gaubatz, 1981; Paddock et al., 1982) that are expressed in adult anemic ducks, including alpha D globin (the minor avian adult alpha globin) and alpha A globin, which appears to be an alternative major adult alpha A globin in avians (Paddock & Gaubatz, 1981; Paddock et al., 1982; Salser et al., 1979; Richards & Wells, 1980; Dodgson et al., 1981; Matsuda et al., 1971). The nucleotide sequences obtained from the 3' non-coding region of these two duck mRNAs, along with the analogous region of a chicken globin mRNA, are shown in Figure 1. Comparison of the 3' non-coding regions from duck alpha A and chicken alpha A mRNA indicates a high degree of homology (85%). Comparison of the 3' regions from these two mRNAs with the analogous region of the duck alpha D mRNA does not reveal such a high degree of conservation. In fact, many of the regions of homology between the chicken alpha A and duck alpha A mRNAs are deleted from the abbreviated 3' end region of the duck alpha D mRNA molecule. We have found five short homologous sequences (Fig. 1). One such sequence, Pu-C-C-C-U (region II in Fig. 1), is found in all alpha globins, including mammalian (Paddock & Gaubatz, 1981; Paddock et al., 1982; Salser et al., 1979; Proudfoot & Longley, 1976; Proudfoot et al., 1977; Deacon et al., 1980; Heindell et al., 1978; Richards & Wells, 1980; Nishioka et al., 1980; Wilson et al., 1977), with the exception of the human alpha-1 globin gene, which has a variant of this sequence (Michelson & Orkin, 1980). Although it always seems to be present between the termination codon and the A-A-U-A-A-A sequence, neither its precise location nor the nucleotides from which it is evolutionarily derived appear to be conserved.

[Figure 1: aligned 3' non-coding sequences of chicken alpha A, duck alpha A and duck alpha D globin mRNAs, with conserved regions I to V marked.]

The hexanucleotide sequence A-A-U-A-A-A (region III in Fig. 1), first identified by Proudfoot & Brownlee (1976), is the most ubiquitous 3' conserved sequence, having been found in nearly all polyadenylated mRNAs. In addition, all avian globin mRNAs thus far sequenced (Paddock & Gaubatz, 1981; Salser et al., 1979; Deacon et al., 1980; Richards & Wells, 1980; Richards et al., 1979; Hampe et al., 1981) contain the nucleotides C-A-C at a short distance after the termination codon and C-A-U-Py between the A-A-U-A-A-A and the poly(A) tail (regions I and IV in Fig. 1). Finally, a C residue precedes the poly(A) tail in a wide variety of mRNAs (region V in Fig. 1), including all globins except Xenopus laevis globin (Williams et al., 1980). Other nucleotides, such as the C-U-G at positions 23 to 25, also appear to be conserved among the avian alpha globins, but we have not classified them because they appear sporadically elsewhere in other globin mRNA 3' untranslated sequences (Paddock & Gaubatz, 1981; Paddock et al., 1982; Salser et al., 1979).

Fitzgerald & Shenk (1981) show that polyadenylation in simian virus 40 (SV40) mutants containing a duplicated A-A-U-A-A-A sequence can occur following either
of these conserved sequences. It is interesting that neither the exact location nor all six nucleotides of this sequence need to be preserved for proper polyadenylation (Efstratiadis et al., 1980). Nunberg et al. (1980) have found that A-U-A-A suffices in this position for the short form of mouse dihydrofolate reductase mRNA. Other exceptions to the presence of the A-A-U-A-A-A sequence are found in the mRNAs coding for chicken lysozyme (Jung et al., 1980), human leukocyte interferon LeIF B (Yelverton et al., 1981), and mouse pancreatic amylase (Hagenbuchle et al., 1980; Tosi et al., 1981), where uridine is substituted for an additional adenosine in the sequence. Fitzgerald & Shenk (1981) have observed that the distance between the A-A-U-A-A-A sequence and the poly(A) site falls within a narrow range of 11 to 30 nucleotides. Their experiments with SV40 deletion mutants suggest the importance of a spatial relationship between the A-A-U-A-A-A sequence and the poly(A) addition site. Our finding that the distance between region III (A-A-U-A-A-A) and the poly(A) addition site in alpha D mRNA falls within the above-mentioned range (14 bases), in spite of the overall abbreviated character of this mRNA, lends support to the importance of this spatial relationship. The function of the other 3' non-coding sequences in regions I, II, IV and V is unknown. Indeed, the function of the 3' non-coding region as a whole is unclear. Kronenberg et al. (1979) have shown that translation of rabbit beta globin mRNA can occur, at least in vitro, without it, although possibly with less efficiency. While perhaps not completely essential for translation, the 3' non-coding region may play a role in determining the overall efficiency of mRNA translation in vivo.
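The spacing argument above (A-A-U-A-A-A lying 11 to 30 nucleotides upstream of the poly(A) addition site) can be checked mechanically for any 3' non-coding sequence. The following is a minimal sketch, not the method used in this letter: the function names are hypothetical, the distance is assumed to be the number of nucleotides between the end of the hexanucleotide and the last nucleotide before the poly(A) tail, and the test sequence is invented for illustration rather than taken from Figure 1.

```python
# Canonical signal only; variant signals can be supplied by the caller.
DEFAULT_SIGNALS = ("AAUAAA",)


def poly_a_signal_distance(utr3, signals=DEFAULT_SIGNALS):
    """Return (signal, distance) for the 3'-most signal found, or None.

    utr3 is assumed to be the 3' non-coding sequence ending at the
    poly(A) addition site, with the poly(A) tail itself removed.
    """
    seq = utr3.upper().replace("T", "U")  # accept DNA or RNA letters
    best = None
    for signal in signals:
        pos = seq.rfind(signal)
        if pos != -1 and (best is None or pos > best[1]):
            best = (signal, pos)
    if best is None:
        return None
    signal, pos = best
    # Nucleotides lying between the signal and the poly(A) addition site.
    return signal, len(seq) - (pos + len(signal))


def within_reported_range(distance, low=11, high=30):
    """True if the distance falls in the 11 to 30 nucleotide range."""
    return low <= distance <= high


# Invented example: 6 nt, the hexanucleotide, then 14 nt.
example = "CACUGC" + "AAUAAA" + "GGCAUUCACUGCAC"
result = poly_a_signal_distance(example)
if result is not None:
    print(result, within_reported_range(result[1]))
```

Passing a tuple such as `("AAUAAA", "AUUAAA")` as `signals` would also cover a uridine-substituted form; the exact variant to search for is an assumption the reader should confirm against the cited sequences.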
Within the cell, mRNA availability for translational activity may be modulated at a number of levels, including processing of pre-mRNA, transport to the cytoplasm, length of mRNA half-life, and maintenance of proper three-dimensional structure for ribosome binding. Support for the idea that the 3' non-coding region may play such a role is shown by the studies reported by Weatherall & Clegg (1979), who suggest that the elongated translation of Hemoglobin Constant Spring mRNA results in ribosome destabilization of the mRNA in the 3' untranslated region, leading to lowered alpha globin synthesis. Alternatively, Orkin & Goff (1981) suggest that the slight predominance of human alpha-2 globin mRNA over alpha-1 globin mRNA could be explained by nucleotide sequence differences in their 3' untranslated regions, resulting in differential stabilities for what are otherwise virtually identical mRNAs. While possibly coincidental, it is interesting to note that the human alpha-1 globin gene has a variant region II sequence in its 3' untranslated region. It would seem that proteins that play similar roles within a cell would most likely be coded by mRNAs that share sequences important to the proper timing and magnitude of expression. The duck alpha A and chicken alpha A globin proteins are both similarly expressed as the major adult alpha globins. With this in mind, it is interesting that an inter-species comparison of the 3' untranslated regions for these mRNAs reveals a much higher degree of conservation than is seen in the intraspecies comparison of alpha A and alpha D globin mRNAs, which code for proteins that exhibit very different quantitative and temporal properties of expression. An analogous situation can be seen in the study of protein evolution (Melderis et al., 1974; Chapman et al., 1980). For example, the embryonic alpha globins are more closely related between species than are the embryonic alpha-like and adult globins
within the same species. Thus it is possible that, as in the case of proteins, the structure of mRNA molecules in the 3' non-coding regions must be similarly preserved in order to maintain proper regulation of expression, perhaps through interactions with the proteins known to bind to and possibly stabilize pre-mRNA and mRNA (Scherrer & Maundrell, 1979; Goldenberg et al., 1980; Vincent et al., 1981). One must use caution in this type of comparison, however, since the degree of gene similarity could be related as much to evolutionary distance in time as to gene function (i.e. the alpha globin genes underwent gene duplication and divergence before species divergence). While a considerable amount of data are available concerning the processing, half-lives, and translational efficiencies of duck globin mRNAs (Spohr & Scherrer, 1972; Schrier et al., 1973; Spohr et al., 1972; Stewart et al., 1973; Reynaud et al., 1980; Niessing, 1978; Macnaughton et al., 1974), unfortunately no information exists at present concerning these relative characteristics for the different duck globin mRNAs. Acquisition of such data must represent a goal for future efforts in order that mRNA sequence may be related to function. Currently available recombinant DNA techniques should also make it possible to construct mutations in the various conserved sequences in order to determine their functional significance. It is conceivable that different classes of conserved sequences exist within the 3' non-coding region of mRNA molecules. For example, some may be present in widely divergent mRNAs because they perform a function common to all these mRNAs (such as A-A-U-A-A-A and polyadenylation). Other less ubiquitous sequences may be present only in a very small group of mRNAs, which code for proteins with a closely related function, as in the sequences common to the chicken alpha A and duck alpha A genes but not duck alpha D,
or the homologous sequences found in the A-A-U-A-A-A adjacent regions of the beta-like globin mRNAs (Benoist et al., 1980). Between these two extremes may be sequences important to the proper expression of larger groups of mRNAs, such as the Pu-C-C-C-U seen in all alpha globin genes and the C-A-U-Py found in all avian globin genes.

We gratefully acknowledge the technical assistance of Louise Just, manuscript preparation by Linda Paddock and Nancy Butler, and editorial suggestions by Charles L. Smith and Dr Jim Karam. This work was supported in part by grants (to G. V. P.) from the National Institutes of Health (GM-[illegible]) and from State of South Carolina Institutional Support Funds. This is publication no. [illegible] from the Department of Basic and Clinical Immunology and Microbiology, Medical University of South Carolina.

Molecular and Cellular [...]
[...] Carolina, S.C. 29425, U.S.A.

R. FRANKIS
G. V. PADDOCK
REFERENCES

Benoist, C., O'Hare, K., Breathnach, R. & Chambon, P. (1980). Nucl. Acids Res. 8, 127-146.
Chapman, B. S., Tobin, A. J. & Hood, L. E. (1980). J. Biol. Chem. 255, 9051-9059.
Deacon, N. J., Shine, J. & Naora, H. (1980). Nucl. Acids Res. 8, 1187-1199.
Dodgson, J. B., McCune, K. C., Rusling, D. J., Krust, A. & Engel, J. D. (1981). Proc. Nat. Acad. Sci., U.S.A. 78, 5998-6002.
Efstratiadis, A., Posakony, J. W., Maniatis, T., Lawn, R. M., O'Connell, C., Spritz, R. A., DeRiel, J. K., Forget, B. G., Weissman, S. M., Slightom, J. L., Blechl, A. E., Smithies, O., Baralle, F. E., Shoulders, C. C. & Proudfoot, N. J. (1980). Cell, 21, 653-668.
Fitzgerald, M. & Shenk, T. (1981). Cell, 24, 251-260.
Goldenberg, S., Vincent, A. & Scherrer, K. (1980). Nucl. Acids Res. 8, 5057-5069.
Hagenbuchle, O., Bovey, R. & Young, R. A. (1980). Cell, 21, 179-187.
Hampe, A., Therwath, A., Soriano, P. & Galibert, F. (1981). Gene, 14, 11-21.
Heindell, H. C., Liu, A., Paddock, G. V., Studnicka, G. M. & Salser, W. A. (1978). Cell, 15, 43-54.
Jung, A., Sippel, A. E., Grez, M. & Schutz, G. (1980). Proc. Nat. Acad. Sci., U.S.A. 77, 5759-5763.
Konkel, D. A., Tilghman, S. M. & Leder, P. (1978). Cell, 15, 1125-1132.
Kronenberg, H. M., Roberts, B. E. & Efstratiadis, A. (1979). Nucl. Acids Res. 6, 153-166.
Lawn, R. M., Efstratiadis, A., O'Connell, C. & Maniatis, T. (1980). Cell, 21, 647-651.
Macnaughton, M., Freeman, K. B. & Bishop, J. O. (1974). Cell, 1, 117-125.
Matsuda, G., Takei, H., Wu, K. C. & Shiozawa, T. (1971). Int. J. Protein Res. 3, 173-174.
Maxam, A. M. & Gilbert, W. (1980). Methods Enzymol. 65, 499-560.
Melderis, H., Steinheider, G. & Ostertag, W. (1974). Nature (London), 250, 774-776.
Michelson, A. M. & Orkin, S. H. (1980). Cell, 22, 371-377.
Niessing, J. (1978). Eur. J. Biochem. 91, 587-598.
Nishioka, Y., Leder, A. & Leder, P. (1980). Proc. Nat. Acad. Sci., U.S.A. 77, 2806-2809.
Nunberg, J. H., Kaufman, R. J., Chang, A. C. Y., Cohen, S. N. & Schimke, R. T. (1980). Cell, 19, 355-364.
Orkin, S. H. & Goff, S. C. (1981). Cell, 24, 345-351.
Paddock, G. V. & Gaubatz, J. (1980). Biochem. Biophys. Res. Commun. 97, 1116-1123.
Paddock, G. V. & Gaubatz, J. (1981). Eur. J. Biochem. 117, 269-273.
Paddock, G. V., Lin, F. K., Frankis, R., McNeil, W. & Gaubatz, J. (1982). Comp. Pathobiol. In the press.
Proudfoot, N. J. & Brownlee, G. G. (1976). Nature (London), 263, 211-216.
Proudfoot, N. J. & Longley, J. I. (1976). Cell, 9, 733-746.
Proudfoot, N. J., Gillam, S., Smith, M. & Longley, J. I. (1977). Cell, 11, 807-818.
Reynaud, C. A., Imaizumi-Scherrer, M. T. & Scherrer, K. (1980). J. Mol. Biol. 140, 481-504.
Richards, R. I. & Wells, J. R. E. (1980). J. Biol. Chem. 255, 9306-9311.
Richards, R. I., Shine, J., Ullrich, A., Wells, J. R. E. & Goodman, H. M. (1979). Nucl. Acids Res. 7, 1137-1146.
Salser, W. A., Cummings, I., Liu, A., Strommer, J., Padayatty, J. & Clarke, P. (1979). In Cellular and Molecular Regulation of Hemoglobin Switching (Stamatoyannopoulos, G. & Nienhuis, A. W., eds), pp. 621-634, Grune and Stratton, New York.
Scherrer, K. & Maundrell, K. (1979). Eur. J. Biochem. 99, 225-238.
Schrier, M. H., Staehelin, T., Stewart, A., Gander, E. & Scherrer, K. (1973). Eur. J. Biochem. 34, 213-218.
Spohr, G. & Scherrer, K. (1972). Cell Diff. 1, 53-61.
Spohr, G., Kayibanda, B. & Scherrer, K. (1972). Eur. J. Biochem. 31, 194-208.
Stewart, A. G., Gander, E. S., Morel, C., Luppis, B. & Scherrer, K. (1973). Eur. J. Biochem. 34, 205-212.
Takei, H., Ota, T., Wu, K. C., Kiyohara, T. & Matsuda, G. (1975). J. Biochem. 77, 13451345.
Tosi, M., Young, R. A., Hagenbuchle, O. & Schibler, U. (1981). Nucl. Acids Res. 10, 2313-2323.
Vincent, A., Goldenberg, S. & Scherrer, K. (1981). Eur. J. Biochem. 114, 179-193.
Weatherall, D. J. & Clegg, J. B. (1979). Cell, 16, 467-479.
Williams, J. G., Kay, R. M. & Patient, R. K. (1980). Nucl. Acids Res. 8, 4247-4257.
Wilson, J. T., deRiel, J. K., Forget, B. G., Marotta, C. A. & Weissman, S. M. (1977). Nucl. Acids Res. 4, 2353-2367.
Yelverton, E., Leung, D., Weck, P., Gray, P. W. & Goeddel, D. V. (1981). Nucl. Acids Res. 9, 731-741.
Note added in proof: Examination of the recently obtained chicken rho and epsilon globin mRNA sequences (Roninson, I. B. & Ingram, V. M. (1982). Cell, 28, 515-521) reveals that the rho mRNA contains the sequence C-A-U-U (C-A-U-Py, region IV), whereas the epsilon sequence contains a related sequence. Evidence in support of the functional importance of the 3' non-coding region comes from recent experiments which demonstrate that deletion mutations within this region in yeast cytochrome c mRNA lead to alterations in steady-state amounts of the mRNA within the cell (Zaret, K. S. & Sherman, F. (1982). Cell, 28, 563-573). If indeed this region plays an important regulatory function, one might postulate that the repetitive sequence "suffix" common to nearly 2% of the cytoplasmic poly(A)+ RNAs of Drosophila is shared by messages with common regulatory requirements (Tchurikov, N. A., Naumova, A. K., Zelentsova, E. S. & Georgiev, G. P. (1982). Cell, 28, 365-373).
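Checking a newly published sequence for the conserved motifs discussed in this letter, such as Pu-C-C-C-U (region II) or C-A-U-Py (region IV), amounts to a degenerate pattern search. The sketch below is one possible way to do this and is not drawn from the letter: the helper names are hypothetical, Pu is taken to mean A or G and Py to mean C or U, and the test sequence is invented.

```python
import re

# Degenerate classes as written in the text: purine and pyrimidine.
CLASSES = {"Pu": "[AG]", "Py": "[CU]"}


def motif_to_regex(motif):
    """Compile a hyphenated motif such as 'C-A-U-Py' into a regex."""
    parts = [CLASSES.get(token, re.escape(token)) for token in motif.split("-")]
    return re.compile("".join(parts))


def find_motif(seq, motif):
    """Return 0-based start positions of all matches, overlaps included."""
    rna = seq.upper().replace("T", "U")  # accept DNA or RNA letters
    # A lookahead lets overlapping occurrences be reported as well.
    lookahead = re.compile("(?=(" + motif_to_regex(motif).pattern + "))")
    return [m.start() for m in lookahead.finditer(rna)]


# Invented sequence containing one copy of each motif.
example = "GGCAUCAACCCUAA"
print(find_motif(example, "C-A-U-Py"))
print(find_motif(example, "Pu-C-C-C-U"))
```

Positions are reported relative to the start of the string supplied, so a sequence beginning at the termination codon would give offsets comparable with the region numbering used for Figure 1 only if the same origin is assumed.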