Virus Research, 22 (1992) 159-164 Elsevier Science Publishers B.V.
VIRUS 00731
Nucleotide sequence comparison of the M1 genome segment of reovirus type 1 Lang * and type 3 Dearing
S. Zou and E.G. Brown
Department of Microbiology and Immunology, Faculty of Medicine, University of Ottawa, Ottawa, Ont., Canada
(Received 17 September 1991; revision received and accepted 4 November 1991)
Summary
The mammalian reoviruses possess a genome composed of 10 double-stranded RNA segments. The serotype 1 strain Lang M1 segment was sequenced and compared to the published type 3 sequence. Both segments were 2304 base-pairs long and coded for the μ2 protein, predicted to be 736 amino acids long. The sequences were highly conserved, with 97.8% conservation of nucleotide sequence and 98.6% conservation of amino acid sequence. The M1 segments of serotypes 1 and 3 have recently diverged, as indicated by the distribution of variation with respect to codon positions. The conservation of amino acid sequence indicated that the μ2 protein has a relatively high functional density.
Reovirus; Type 1 Lang; M1 segment; Sequence
Correspondence to: E.G. Brown, Department of Microbiology and Immunology, Faculty of Medicine, University of Ottawa, 451 Smyth Road, Ottawa, Ont., Canada K1H 8M5.
* EMBL accession no. X59945.

Mammalian reoviruses are represented by 3 serotypes, 1, 2, and 3, that comprise the reovirus genus of the family Reoviridae. Reovirus has a double-shelled icosahedral particle containing 10 dsRNA genome segments. Wiener et al. (1989) determined the sequence of the serotype 3 strain Dearing M1 genome segment, which is 2304 nucleotides long with a single open reading frame from positions 14 to 2224 encoding 736 predicted amino acids. The M1 genome segment codes for a single protein, μ2, that is present in low amounts in the inner capsid shell of the virion. The mRNA from segment M1 is very inefficiently translated in vivo although it possesses a competent translation initiation site and is efficiently translated in vitro (Roner et al., 1989). In studies using genetic reassortants, the μ2 protein was observed to control the extent of cytopathic effect and plaque size in L929 cells (Moody and Joklik, 1989), the capacity to cause myocarditis in the mouse (Sherry and Fields, 1989) and the extent of replication in heart cells in vitro (Matoba et al., 1991). The role of the μ2 protein as a regulator of viral cytopathology may be related to its low level of translation in vivo (Roner et al., 1989), where host components limit μ2 expression.

Type 1 (strain Lang) and type 3 (strain Dearing) reassortants that possess the type 3 L2 segment and type 1 M3 segment generate M1 segment deletions on serial passage (Brown et al., 1983). Both T1 and T3 M1 segments appear to be equally susceptible to deletion and are the most likely segments to be deleted in type 3 L2 segment and type 1 M3 segment reassortants (Brown et al., 1983). We have recently identified the minimum essential sequences for replication and assembly of the M1 segment (Zou and Brown, 1992): the 5' terminal 132-135 nucleotides and the 3' terminal 183-185 nucleotides of the M1 segment were conserved in all M1 deletion mutants. To continue our analysis of M1 segment replication and assembly we needed the M1 sequence of the serotype 1 Lang strain, which we present here along with its comparison to the serotype 3 Dearing M1 sequence.

The M1 segment of type 1 Lang was cloned and sequenced. The intact M1 segment of type 1 was reverse transcribed and cloned by a modification of the methods of Cashdollar et al. (1982) and Schmid et al. (1987).
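The open reading frame coordinates quoted above (positions 14 to 2224 of a 2304-nucleotide segment) can be checked against the predicted protein length with a few lines of arithmetic. The sketch below assumes that the coordinates are 1-based and inclusive and that position 2224 is the last base of the termination codon; the function name `orf_summary` is illustrative only.

```python
# Values quoted in the text for the M1 segment.
SEGMENT_LENGTH = 2304
ORF_START = 14    # first base of the initiation codon (assumed 1-based)
ORF_END = 2224    # assumed to be the last base of the termination codon


def orf_summary(segment_length, orf_start, orf_end):
    """Derive ORF length, protein length and noncoding lengths."""
    orf_nt = orf_end - orf_start + 1
    if orf_nt % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    codons = orf_nt // 3
    return {
        "orf_nt": orf_nt,
        "amino_acids": codons - 1,  # the final codon is the stop codon
        "five_prime_noncoding": orf_start - 1,
        "three_prime_noncoding": segment_length - orf_end,
    }


summary = orf_summary(SEGMENT_LENGTH, ORF_START, ORF_END)
print(summary)
```

Under these assumptions the coordinates are consistent with a 736 amino acid product, flanked by short 5' and 3' noncoding regions.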
Low passage (passage 2) type 1 virus stock (Lang strain) was grown in L cell suspension culture, harvested by centrifugation and purified by freon extraction. The freon extraction began with suspension of the infected cell pellet in HO buffer (10 mM Tris, 250 mM NaCl, 10 mM 2-mercaptoethanol) followed by the addition of sodium desoxycholate to 0.5% and 1/2 volume of freon; the mixture was sonicated to mix the phases and then centrifuged to separate the aqueous phase from the organic phase. The aqueous phase was reextracted with freon and the resulting aqueous phase containing the virus was further purified by CsCl gradient centrifugation. The purified virus was dialyzed overnight against dialysis buffer (0.15 M NaCl, 0.015 M MgCl2, 0.01 M Tris pH 7.4) before viral dsRNA was extracted by treatment with 1% SDS and 10 μg/ml proteinase K at 37°C for 30 min, followed by phenol/chloroform extraction and precipitation with sodium acetate and ethanol. The dsRNA was dissolved in 10 mM Tris-HCl, pH 7.4, 1 mM EDTA, and melted at 50°C in 90% DMSO for reverse transcription.

For the purpose of cloning, both RNA strands of the M1 genome segment of reovirus type 1 strain Lang were reverse transcribed using primers complementary to 18 nucleotides at the 3' ends of the type 3 M1 segment (Antczak et al., 1982). The RNA/cDNA hybrids were treated with 75 mM NaOH at 60°C for 1 h, neutralized, precipitated and dissolved in annealing solution (20 mM Tris pH 8.0, 100 mM NaCl, 1 mM EDTA, 50% formamide) to allow dsDNA to form. After purification through a Sephacryl 400 spun column and precipitation, the dsDNA was end-repaired with Klenow fragment and phosphorylated using polynucleotide kinase before ligation with pGEM-7Zf(+) and transformation of DH5α as described for the cloning of M1 deletion fragments (Zou and Brown, 1992). For sequencing, M1 subclones were produced by progressive unidirectional deletion using the Erase-a-Base reaction kit and protocol supplied by Promega. Each M1 subclone was sequenced from both directions, and with both dGTP and dITP in the sequencing reactions, using SP6 and T7 primers and the Sequenase kit.

The T1 M1 segment consisted of 2304 nucleotides, the same length as the published T3 M1 sequence, and contained the same initiation and termination codons (Wiener et al., 1989). Comparison of T1 and T3 M1 sequences showed high homology in nucleotide sequences (2253/2304, 97.79%) (Fig. 1) and in predicted amino acid sequences (726/736, 98.64%) (Fig. 2). There were only 51 nucleotide substitutions and no deletion or addition mutations between the T1 and T3 M1 segments. Of the 51 substitutions, 39 (76.47%) occurred in the third base codon position, with 16% and 8% in the first and second base codon positions respectively. Thus, out of the 2.2% total nucleotide variability observed, 5.3% of the third base positions were substituted relative to 1.1% and 0.5% of the first and second positions respectively. The low level of third base substitution indicated that the T1 and T3 M1 segments were closely related, and the relatively less variable first and second base codon positions indicated that the μ2 protein is structurally conserved and thus possesses a high functional density. This is not surprising since μ2 is an internal core component that must interact with the other core proteins sigma 2, lambda 1, lambda 2 and lambda 3, and possibly the dsRNA genome.
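The percentages quoted above follow directly from the reported counts, and the short sketch below reproduces them. Only the third-position count (39) is stated explicitly in the text; the first- and second-position counts (8 and 4) are inferred here from the rounded 16% and 8% figures and should be read as assumptions. The per-position denominator of 736 sites (one per codon) is likewise an assumption chosen because it reproduces the stated values.

```python
TOTAL_NT = 2304
IDENTICAL_NT = 2253
TOTAL_AA = 736
IDENTICAL_AA = 726

# Substitutions by codon position. 39 is stated in the text;
# 8 and 4 are inferred from the rounded 16% and 8% (assumption).
SUBS_BY_POSITION = {1: 8, 2: 4, 3: 39}
SITES_PER_POSITION = 736  # assumed: one site per codon at each position


def pct(part, whole):
    """Return part/whole as a percentage."""
    return 100.0 * part / whole


nt_identity = pct(IDENTICAL_NT, TOTAL_NT)
aa_identity = pct(IDENTICAL_AA, TOTAL_AA)
total_subs = sum(SUBS_BY_POSITION.values())

# Share of all substitutions falling at each codon position.
share = {p: pct(n, total_subs) for p, n in SUBS_BY_POSITION.items()}
# Fraction of sites substituted at each codon position.
per_site = {p: pct(n, SITES_PER_POSITION) for p, n in SUBS_BY_POSITION.items()}

print(f"nucleotide identity: {nt_identity:.2f}%")
print(f"amino acid identity: {aa_identity:.2f}%")
for p in (1, 2, 3):
    print(f"position {p}: {share[p]:.1f}% of substitutions, "
          f"{per_site[p]:.1f}% of sites substituted")
```

Separating "share of substitutions" from "fraction of sites substituted" keeps apart the two kinds of percentage that the paragraph uses side by side.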
Divergence of the T1 and T3 genome segments can be estimated on a scale of 0 to 100% by multiplying nucleotide mismatch percentages by 1.33, since unrelated sequences exhibit a limit of 75% nucleotide mismatch. The evolutionary divergence patterns of all the T1 and T3 genome segments except M3 are shown in Table 1 and were obtained from our data and those of Wiener and Joklik (1989). The M1 and L3 segments have the lowest levels of third base codon position divergence, 7 and 6% respectively, suggesting that they have been evolving independently for the shortest time, whereas L1, L2, and S4 have been evolving for an intermediate time and the remaining genome segments (M2, S1, S2, and S3) for the longest time. This is consistent with reassortment-mediated exchange of genetic material that has occurred at different times and that involved groups of genome segments.

There were 51 nucleotide substitutions in the T1 M1 segment relative to T3; 1 in the noncoding region and 40 others in the coding region did not alter the sense of codons in the M1 segment. Only 10 amino acid substitutions were predicted for T1 relative to T3 (Fig. 2). Of the 10 predicted changes, 5 were conservative and 5 were non-conservative (at amino acid positions 150, 302, 347, 458, and 726). One or more of the non-conservative substitutions is most likely to be responsible for the differences in phenotype between T1 and T3 that have been associated with the M1 genome segment: differences in CPE, plaque size, and the capacity to replicate in cardiac fibroblasts and cause myocarditis.
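The scaling described above amounts to dividing an observed mismatch percentage by the 75% ceiling expected for unrelated sequences (a factor of 100/75, about 1.33). The sketch below applies that factor and then ranks the segments by the third-position values of Table 1. The assignment of values to segments follows the row order of Table 1 as read here, and the function name is illustrative.

```python
MAX_RANDOM_MISMATCH = 75.0  # % mismatch expected between unrelated sequences


def scaled_divergence(mismatch_pct):
    """Rescale a raw mismatch percentage onto a 0-100% divergence scale."""
    return mismatch_pct * 100.0 / MAX_RANDOM_MISMATCH


# Third codon position divergence (%), as read from Table 1.
THIRD_POSITION = {
    "L1": 13, "L2": 29, "L3": 6, "M1": 7, "M2": 53,
    "S1": 83, "S2": 56, "S3": 48, "S4": 22,
}

# Segments ordered from least to most diverged at the third position.
ranked = sorted(THIRD_POSITION, key=THIRD_POSITION.get)

m1_third = scaled_divergence(5.3)  # 5.3% of M1 third positions differ
print(f"M1 third-position divergence: {m1_third:.0f}%")
print("least to most diverged:", ", ".join(ranked))
```

The raw 5.3% third-position mismatch for M1 rescales to about 7%, matching the M1 entry in Table 1.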
Fig. 1. Nucleotide sequence comparison of the M1 genome segment of reovirus serotype 1 strain Lang with the M1 segment of serotype 3 strain Dearing. Initiation and termination codons are underlined. Only the nucleotide differences are indicated for type 3. The EMBL accession number for the T1 Lang M1 sequence is X59945.
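A comparison of the kind summarized in Fig. 1 can be tallied by codon position once the two sequences are aligned without gaps. The sketch below is a minimal illustration, not the procedure used by the authors; the function name and the two short toy sequences are hypothetical and are not taken from the M1 segment.

```python
def tally_substitutions(seq_a, seq_b, orf_start, orf_end):
    """Count mismatches between two aligned, equal-length sequences.

    Positions are 1-based and inclusive. Mismatches inside the open
    reading frame are keyed by codon position (1, 2 or 3); those
    outside it are counted as 'noncoding'.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    counts = {1: 0, 2: 0, 3: 0, "noncoding": 0}
    for pos, (a, b) in enumerate(zip(seq_a, seq_b), start=1):
        if a == b:
            continue
        if orf_start <= pos <= orf_end:
            counts[(pos - orf_start) % 3 + 1] += 1
        else:
            counts["noncoding"] += 1
    return counts


# Toy sequences (hypothetical): 2 noncoding bases, a 4-codon ORF
# at positions 3-14 (AUG ... UAA), then 2 noncoding bases.
toy_a = "GCAUGGCUUACUAAGG"
toy_b = "GUAUGGCCUAUUAAGG"

counts = tally_substitutions(toy_a, toy_b, orf_start=3, orf_end=14)
print(counts)
```

Applied to two full-length segments with no insertions or deletions, as reported here for T1 and T3, the same tally would give the per-position counts directly.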
Fig. 2. Comparison of the predicted amino acid sequence of the μ2 proteins of reovirus serotype 1 strain Lang and serotype 3 strain Dearing. Only the amino acid differences in serotype 3 relative to type 1 are indicated.

TABLE 1
Evolutionary divergence pattern of the L1, L2, L3, M1, M2, S1, S2, S3 and S4 genome segments of serotype 1 Lang and serotype 3 Dearing

Genome segment a    Codon position
                    First    Second    Third
L1                  2        1         13
L2                  2        2         29
L3                  6        3         6
M1                  2        1         7
M2                  4        2         53
S1                  68       59        83
S2                  5        0         56
S3                  5        2         48
S4                  4        1         22

a The percentage divergence of nucleotides at each codon position is indicated.

Fig. 3. Predicted secondary structure formed by terminal portions of the plus strand of the M1 segment of serotype 1. The secondary structure was predicted by the method of Zuker and is taken from Zou and Brown (1992). Nucleotide differences in the serotype 3 sequence are indicated by arrows adjacent to the relevant T1 sequence. The initiation and termination codons are shown.

M1 mRNA, m1, is translated very poorly in vivo; this may relate to the formation of a secondary structure or the complexing of m1 with cellular factors. Using the algorithm of Zuker and Stiegler (1981), a possible stable secondary structural motif is predicted to be formed by association of the two termini (59 nucleotides from the 5' end and 91 nucleotides from the 3' end) (Zou and Brown, 1992). The two substitutions in T3 relative to T1 in these regions would not disrupt this putative secondary structure, since base-pairing would be conserved at both of these positions by the formation of G to U base-pairs (Fig. 3). The 2 terminal regions of the M1 segment (132-135 nucleotides at the 5' end plus 183-185 nucleotides from the 3' end) have been shown to contain all the signals necessary for replication and assembly of the segment (Zou and Brown, 1992). Future work will involve the dissection of these signals, as well as those controlling the other phenotypes described above for the μ2 protein, using in vitro mutagenesis of cloned M1 genes and their reintroduction into virions (Roner et al., 1990).
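The argument above rests on the fact that RNA helices tolerate G to U wobble pairs in addition to the Watson-Crick pairs, so a substitution that turns a Watson-Crick pair into a G to U pair leaves the stem intact. The sketch below encodes that rule. The specific bases involved in the two T3 substitutions are not given in the text, so the example substitutions are hypothetical.

```python
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}


def can_pair(x, y):
    """True if RNA bases x and y form a Watson-Crick or G-U wobble pair."""
    return (x, y) in WATSON_CRICK or (x, y) in WOBBLE


def pairing_survives(base, partner, substituted_base):
    """True if a paired base can still pair with its partner after substitution."""
    return can_pair(base, partner) and can_pair(substituted_base, partner)


# Hypothetical examples: an A-U pair after A->G, a C-G pair after C->U,
# and an A-U pair after A->C.
a_to_g = pairing_survives("A", "U", "G")
c_to_u = pairing_survives("C", "G", "U")
a_to_c = pairing_survives("A", "U", "C")
print(a_to_g, c_to_u, a_to_c)
```

Transitions of the first two kinds preserve pairing through a wobble pair, whereas the transversion in the third example would open the helix at that position.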
Acknowledgements
This work was supported by Grant OGP0041771 from the Natural Sciences and Engineering Research Council of Canada.
References
Antczak, J.B., Chmelo, R., Pickup, D.S. and Joklik, W.K. (1982) Sequences at both termini of the 10 genes of reovirus serotype 3 (strain Dearing). Virology 121, 307-319.
Brown, E.G., Nibert, M.L. and Fields, B.N. (1983) The L2 gene of reovirus serotype 3 controls the capacity to interfere, accumulate deletions and establish persistent infection. In: Double-Stranded RNA Viruses (R.W. Compans and D.H.L. Bishop, Eds.), pp. 275-287, Elsevier, Amsterdam.
Cashdollar, L.W., Esparza, J., Hudson, G.R., Chmelo, R., Lee, P.W.K. and Joklik, W.K. (1982) Cloning the double-stranded RNA genes of reovirus: sequence of the cloned S2 gene. Proc. Natl. Acad. Sci. U.S.A. 79, 7644-7648.
Matoba, Y., Sherry, B., Fields, B.N. and Smith, T.W. (1991) Identification of the viral genes responsible for growth of strains of reovirus in cultured mouse heart cells. J. Clin. Invest. 87, 1628-1633.
Moody, M.D. and Joklik, W.K. (1989) The function of reovirus proteins during the reovirus multiplication cycle: analysis using monoreassortants. Virology 173, 437-446.
Roner, M.R., Gaillard, R.K. and Joklik, W.K. (1989) Control of reovirus messenger RNA translation efficiency by the regions upstream of initiation codons. Virology 168, 292-301.
Roner, M.R., Sutphin, L.A. and Joklik, W.K. (1990) Reovirus RNA is infectious. Virology 179, 845-852.
Schmid, A., Cattaneo, R. and Billeter, A. (1987) A procedure for selective full length cDNA cloning of specific RNA species. Nucl. Acids Res. 15, 3987-3996.
Sherry, B. and Fields, B.N. (1989) The reovirus M1 gene, encoding a viral core protein, is associated with the myocarditic phenotype of a reovirus variant. J. Virol. 63, 4850-4856.
Wiener, J.R., Bartlett, J.A. and Joklik, W.K. (1989) The sequences of reovirus serotype 3 genome segments M1 and M3 encoding the minor protein μ2 and the major nonstructural protein μNS, respectively. Virology 169, 293-304.
Wiener, J.R. and Joklik, W.K. (1989) The sequence divergence of the reovirus serotype 1, 2, and 3 L1 genome segments and analysis of the mode of divergence of the reovirus serotypes. Virology 169, 194-203.
Zou, S. and Brown, E.G. (1991) Identification of sequence elements containing signals for replication and encapsidation of the reovirus M1 genome segment. Virology, in press.
Zuker, M. and Stiegler, P. (1981) Optimal computer folding of large RNA sequences using thermodynamics and auxiliary information. Nucl. Acids Res. 9, 133-148.