VIROLOGY 191, 98-105 (1992)
The Nucleotide Sequence of Apple Stem Grooving Capillovirus Genome
NOBUYUKI YOSHIKAWA,1 EIMI SASAKI, MOTOHIRO KATO, AND TSUYOSHI TAKAHASHI
Faculty of Agriculture, Iwate University, Ueda 3-chome, Morioka 020, Japan
Received March 4, 1992; accepted July 8, 1992
The complete nucleotide sequence of the apple stem grooving virus (ASGV) genome has been determined. The genome is 6496 nucleotides in length excluding a 3'-terminal poly(A) tail and contains two overlapping open reading frames (ORFs). ORF1 begins at nucleotide position 37 and is terminated at position 6341, encoding a protein with a molecular weight of 241 kDa. ORF2, which is in a different reading frame within ORF1, begins at position 4788 and can encode a 36-kDa protein. The 241-kDa protein contains two consensus sequences associated with the RNA-dependent RNA polymerase and the NTP-binding helicase. Comparisons of amino acid sequences around these conserved motifs with other RNA viruses revealed that ASGV has extensive similarities with apple chlorotic leaf spot, tymo-, carla-, and potexviruses, and is a member of the Sindbis-like supergroup. ASGV coat protein is found to be located in the C-terminal region of the 241-kDa polyprotein. The 36-kDa protein encoded by ORF2 contains the consensus sequence Gly-Asp-Ser-Gly found in the active site of several cellular and viral serine proteases. © 1992 Academic Press, Inc.
INTRODUCTION
Apple stem grooving virus (ASGV) is the type member of the Capillovirus group, which also includes potato virus T and possibly lilac chlorotic leaf spot, Nandina stem pitting, and citrus tatter leaf viruses (Francki et al., 1991; Salazar and Harrison, 1978; Nishio et al., 1989). ASGV has very flexuous thread-like particles, approximately 600 to 700 × 12 nm (De Sequeira and Lister, 1969; Lister, 1970), and contains a polyadenylated, plus-sense, single-stranded RNA with an Mr of 2.30 × 10^6 and a single coat protein of 27 kDa (Yoshikawa and Takahashi, 1988). In vitro translation experiments using rabbit reticulocyte lysate showed that ASGV-RNA directed the synthesis of a polypeptide of 200 kDa as a major product, which was immunoprecipitated by antiserum against purified ASGV (Yoshikawa and Takahashi, 1992). However, a protein coinciding with coat protein in electrophoretic mobility was not detected in the translation products (Yoshikawa and Takahashi, 1992). These results indicated that ASGV coat protein was synthesized as part of a 200-kDa polyprotein but was not cleaved from it. At present, no information has been reported on the nucleotide sequence and genome organization of a capillovirus.
In this paper, we present the complete nucleotide sequence of the ASGV genome and compare the proteins it encodes with those of other plant RNA viruses.
1 To whom reprint requests should be addressed.
0042-6822/92 $5.00 Copyright © 1992 by Academic Press, Inc. All rights of reproduction in any form reserved.
MATERIALS AND METHODS
Viral RNA
ASGV (isolate P-209), originally isolated from an apple tree (Yanase, 1974), was propagated in Chenopodium quinoa and purified as described previously (Yoshikawa and Takahashi, 1988). Viral RNA was extracted from purified virus using SDS/phenol/chloroform (Yoshikawa and Takahashi, 1988).
cDNA cloning
Because ASGV-RNA has a poly(A) tail (Yoshikawa and Takahashi, 1988), cDNA synthesis was primed with an oligo(dT) primer. In another experiment, random hexanucleotides were also used as primers. First and second strand cDNAs were prepared from 2 μg of ASGV-RNA according to Gubler and Hoffman (1983) using a cDNA synthesis system (Amersham). The double-stranded DNAs were ligated to the SmaI site of pUC19 and used to transform competent Escherichia coli DH5α cells (Bethesda Research Laboratories). Fifty clones were selected from white, ampicillin-resistant colonies and plasmids were extracted from small overnight cultures by a boiling method (Sambrook et al., 1989). These clones contained inserts ranging from 1 to 6.5 kilobases (kb). Restriction maps of each clone were determined by single or double enzyme restriction digestions followed by 1% agarose gel electrophoresis. Two clones, pSG41 and pSG56, were selected for nucleotide sequencing. The clone pSG41 contained an insert of about 6.5 kb, equivalent to the entire genome.
[Figure 1: map of ASGV RNA (5' end to 3' poly(A), scale in kb) showing cDNA clone pSG41 and the primer-extension region.]
FIG. 1. The locations of ASGV cDNA clones used to determine the nucleotide sequence and the restriction sites used for subcloning. E, H, and P indicate the EcoRI, HindIII, and PstI sites, respectively. Most of the sequence was determined using the insert from pSG41. The large arrow indicates the sequenced portion from pSG56 and the small arrow indicates the 5' region sequenced by primer extension directly from the RNA.
Nucleotide sequencing
For nucleotide sequencing, pSG41 was digested with PstI, HindIII, or EcoRI and selected fragments were subcloned in plasmid Bluescript KS (Stratagene). Deletion mutants were also prepared from the subclones using exonuclease III and mung bean nuclease (Takara Shuzo Company) (Henikoff, 1984). The resultant cDNA clones were sequenced by the dideoxyribonucleotide chain termination method of Sanger et al. (1977) using Sequenase version 2.0 (United States Biochemical). Double-stranded plasmid DNAs were used as templates. The sequence of the 5'-terminal region of the genome was determined by extending a synthetic primer (5'-GCAATTTCGAGGGGGTTTCT-3') complementary to nucleotides 52 to 71 of the viral RNA with dideoxynucleotides using reverse transcriptase and terminal deoxynucleotidyl transferase (Deborde et al., 1986). All nucleotide sequence data were collected and analyzed using the program GENETYX version 8.0 (Software Development Co., Ltd.).
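As an illustrative check that is not part of the original analysis, the primer can be compared with its stated target. The sketch below assumes the primer reads 5'-GCAATTTCGAGGGGGTTTCT-3' and that the target stretch (nucleotides 52 to 71, written in the DNA alphabet as in Fig. 2) reads AGAAACCCCCTCGAAATTGC; the names `reverse_complement`, `PRIMER`, and `TARGET` are hypothetical helpers introduced only for this example.

```python
# Minimal sketch: confirm that a primer is the reverse complement of its target.
# PRIMER and TARGET are transcribed from the text and Fig. 2 (assumed readings).

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string written 5' to 3'."""
    return dna.upper().translate(COMPLEMENT)[::-1]


PRIMER = "GCAATTTCGAGGGGGTTTCT"   # synthetic primer, 5' to 3'
TARGET = "AGAAACCCCCTCGAAATTGC"   # viral-sense stretch, nucleotides 52-71 (assumed)

primer_matches_target = reverse_complement(PRIMER) == TARGET
target_length = 71 - 52 + 1        # length implied by the stated coordinates

if __name__ == "__main__":
    print("primer anneals to target:", primer_matches_target)
    print("primer length:", len(PRIMER), "target length:", target_length)
```

A primer that is the exact reverse complement of the viral-sense strand anneals to the RNA and is extended toward the 5' terminus, which is the orientation needed for primer-extension sequencing of the 5' end.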
Analysis of expressed protein in Escherichia coli
Two clones (pTrc99A-41H and pTrc99A-41EB) were constructed as follows. The HincII-HincII fragment (positions 5346-6399) from the clone pSG41 was ligated to the expression vector pTrc99A (Pharmacia), which had been digested with BamHI and then filled in using T4 DNA polymerase I (pTrc99A-41H). The EcoRI-BamHI fragment (positions 3743-5139) from the clone pSG41 was ligated to pTrc99A which had been digested with EcoRI and BamHI (pTrc99A-41EB). Both clones were subsequently used to transform competent E. coli JM109. Precultures of E. coli (40 μl) containing pTrc99A-41H were inoculated into 10 ml of fresh LB medium and grown for 3 hr. The cultures were grown for a further 4 hr after the addition of isopropyl thiogalactopyranoside (IPTG) to a final concentration of 0.5 mM. Proteins were prepared from the cultures as described by Jagadish et al. (1991), electrophoresed in a 12.5% polyacrylamide-SDS gel (Laemmli, 1970), and transferred electrophoretically to a nitrocellulose membrane. The membrane was incubated with ASGV antiserum, followed by anti-rabbit IgG goat IgG conjugated with alkaline phosphatase (Tago Inc.), and immersed in development solution containing Fast Red TR salt (Sigma) and naphthol AS-MX phosphate (Sigma) as described by Yoshikawa et al. (1986).
RESULTS AND DISCUSSION
Nucleotide sequence of ASGV RNA
Figure 1 shows the locations of cDNA clones used for sequencing and the sites of restriction enzymes (EcoRI, HindIII, and PstI) used for subcloning. Most of the sequence of the ASGV genome was determined using the insert from pSG41. All portions of the cDNAs were sequenced from both strands. The clone pSG56 was used to analyze the 5'-terminal region because pSG41 lacked this region (Fig. 1). The sequence of the 5'-terminal region of the genome was also determined by direct RNA sequencing using reverse transcriptase
1 NAAATTTAACAGGCTTAATTTCCGCTTTACOTCAATOGCTTTCACTTACAGAAACCCCCTCGAAATTGCAATCAACAAACTTCCTAGTAAGCAGTCTGATCAACTGCTTTCCTTGACC HAPTYRNPLEIAI NKLPSXQSDQLLSLT 121
ACCGACGAGATTGAAAAGACCTTAGAAGTGACCAACCGCTTCTTCTCTTTTTCAATCACACCAGAAGATCAAGAATTGTTGACTAAGCATGGTCTAACACTTGCACCTATAGGGTTTAAG T D E I EKTLEVTNRFFSFSITPEDCIELLTKHGLTLAPIGFK
241
TCACACTCCCATCCAATATCCAAAATGATAGAAAATCATCTCCTGTATATATGTGTTCCGAGTCTTTTATCCTCCTTTAAGTCAGTTGCCTTTTTTTCACTTAGGGAAAATAAAGTAGAC SHSHPI SKHIENHLLYICVPSLLSSFKSVAFFSLRENKVD
361
AGTTTTCTTAAGATGCATTCAGTCTTTTCCCATGGAAAAATTAAATCTTTGGGGATGTACAATGCTATAATTGATGGGAAAGATAAATATAGGTATGGTGATGTAGAGTTTTCATCTTTT SPLKHHSVFSHGKIKSLGHYNAI IDGKDKYRYGDVEPSSF
481
AGGGATAGAGTCATTGGTCTTAGAGATCAATGCCTTACACGTAACAAATTTCCAAAAGTTCTGTTTCTTCACGACGAGTTGCACTTTCTAAGTCCATTTGACATGGCTTTCCTATTTGAG R D R V IGLRDQCLTRNKFPKVLFLHDELHFLSPFDMAFLFE
601
ACAATCCCAGAAATTGATAGAGTTGTTGCAACCACAGTTTTTCCAATAGAACTTTTATTCGGGGACAAGGTCTCTAAGGAACCCAGGGTTTATACCTACAAGGTCCATGGCTCTTCATTT PEIDRVVATTVFPIELLFGDKVSKEPRVYTYKVHGSSF T I
721
TCATTTTATCCGGATGGTGTTGCCTCTGAGTGTTACGAACAGAATTTGGCAAATTCTAAATGGCCCTTCACCTGCAGCGGCATACAATGGGCTAACAGGAAAATTAGGGTAACCAAGCTA SFYPDGVASECYEQNLANSKWPFTCSG IQWANRKIRVTKL
841
CAGAGTCTCTTCGCCCATCATGTTTTCTCATTTGACAGGGGGAGGGCTTGTAATGAATTTAATCATTTCGACAAACCTAGCTGTCTACTTGCGGAAGAAATGCGCCTTTTGACCAAAAGG QSLFAHHVFSFDRGRACNEFNHFDKPSCLLAEEMRLLTKR
961
TTTGATAAAGCAGTTATTAACAGAAGCACAGTCTCTTCCCTCAGTACATACATGGCTTGTCTTAAAACTGCAAATGCGGCTTCAGCTGTTGCCAAGCTGAGGCAGTTGGAGAAGAGGGAT FDKAVINRSTVSSLSTYMACLKTANAASAVAKLRQLEKRD
1081
CTTTACCCAGATGAGTTGAACTTCGTCTATTCCTTTGGAGAGCATTTCAAAAATTTTGGGATGAGAGATGACTTTGATGTGTCAGTTCTACAATGGGTCAAAGACAAATTTTGCCAGGTC LYPDELNFVYSFGEHFKNFGNRDDFDVSVLQWVXDKFCQV
1201
ATGCCTCACTTCATCGCCGCCAGTTTCTTTGAACCAACAGAATTTCATTTAAACATGCGCAAATTGTTGAATGATCTGGCTACTAAAGGAATAGAGGTTCCCCTTTCTGTGATCATCCTG NPHFIAASFFEPTEFHLNNRKLLNDLATKGIEVPLSVIIL
1321
GACAAAGTCAACTTCATAGAGACCAGATTTCATGCCAGGATGTTCGACATAGCACAGGCAATCGGGGTGAACCTAGATTTACTGGGGAAAAGATTTGATTATGAGGCTGAGAGTGAAGAG IGVNLDLLGXRFDYEAESEE DKVNFIETRFHARHFDIAQA
1441
TACTTTTCAGAGAACGGTTACATCTTTATGCCCTCTAAATCAAATCCAGAGAGAAATTGGATTCTAAATTCCGGTTCGCTGAAAATTGACTATTCAAGATTGGTAAGAGCCAGGAGATTT YFSENGYIFHPSXSNPERNWILNSGSLKIDYSRLVRARRF
1561
AGATTGAGAAGAGATTTCCTAGATCCCATATCTAAAGGAAAATCCCCTAGAAAACAACTCTTCTTGGAGTCAACGGGAAACATTAAATCAAATCCCAATGCTGAAAAAAATAGCGAGAGC IKSNPNAEKNSES RLRRDFLDPISKGKSPRKQLFLESTGN
1681
GGCGAAATAAAGATTGAAGGCAGTGCCGAAAATGATCAGCCACATGAGGTATCACATACTTCAATGGAAACCGAGGATGGACAGGGTTTTGAAGGTTCAATACCAGTTGATTTAATCAAT SNETEDGQGFEGSI KIEGSAENDQPHEVSHT G E I
PVDLIN
1801
TGCTTTGAACCAGAAGAAATCAAGCTTCCAAAGAGAAGAAGGAAAAATGATTGCGTCTTCAAGGCCATCTCTGCACACTTGGGGATTGACTCTCAAGATTTGTTGAATTTTTTGGTAAAT DSQDLLNFLVN V F K A I SAHLGI XLPKRRRKNDC CFEPEEI
1921
GAAGACATATCAGATGAATTACTTGATTGCATTGAAGAGGACAAAGGACTGTCACATGAAATGATTGAAGAAGTTTTGATCACAAAGGGTCTTTCAATGGTTTATACTTCTGACTTCAAA TKGLSMVYTSDFK E E V L I IEEDKGLSHEMI EDISDELLDC
2041
GAAATGGCAGTTCTTAATAGAAAGTATGGAGTGAATGGCAAGATGTACTGCACAATTAAAGGCAATCACTGCGAGCTGAGTTCCAAAGAGTGCTTCATCAGATTATTGAAAGAAGGTGGT IRLLKEGG KGNHCELSSKECF EMAVLNRKYGVNGXNYCTI
2161
GAAGCGCAGATGTCAAATGAAAATCTAAATGCTGATTCCTTGTTCGACCTTGGAAGATTTGTGCATAATAGAGACAGGGCTGTCAAGCTAGCAAAATCAATGGCAAGAGGCACAACAGGC SLFDLGRFVHNRDRAVXLAKSNARGTTG EAQMSNENLNAD
2281
CTCCTGAATGAATTCGACCTAGAATTCTGCAAGAACATGGTGACCCTTTCGGAGTTGTTTCCTGAAAACTTTTCTTCTGTTGTCGGGCTAAGGCTTGGGTTTGCGGGTTCTGGTAAAACG SVVGLRLGFAGSGKT TLSELFPENFS LLNEFDLEFCKNMV
2401
CATAAGGTGCTTCAATGGATTAATTACACTCCAAGTGTCAAAAGAATGTTTATAAGTCCAAGGAGAATGCTGGCGGATGAAGTTGAACCTCAACTCAAGGGAACGGCCTGTCAGGTGCAT S P R R M L A D E V E P Q L K-CT NYTPSVKRHFI HKVLQWI
2521
ACATGGGAGACTGCACTTAAAAAAATCGACGGAACTTTTATGGAAGTTTTTGTTGATGAAATAGGTTTGTACCCACCTGGATACCTTACACTGCTACAGATGTGTGCTTTCAGAAAGATT GLYPPGYLTLLQNCAFRKI DGTFNEVFVDEI TWETALKKI
2641
GTTAAGGGACAAAGTGAAAATTTCTTGAAAGGCAAACTGTTGGAATTGTCAAAGACTTGCTTAAACATAAGATGTTTTGGTGATCCATTGCAATTAAGGTATTACTCAGCTGAAGACACC KLLELSXTCLNIRCFGDPLQLRYYSAEDT VKGQSENFLKG
2761
AATCTATTGGACAAAACACATGATATTGACCTCATGATCAAGACGATCAAGCACAAATATCTTTTCCAAGGGTACAGGTTCGGTCAGTGGTTTCAAGAACTGGTGAACATGCCCACTAGA IKTIKHKYLFQGYRFGQWFQELVNMPTR D L N NLLDKTHDI
2881
GTGGATGAGTCGAAATTCTCAAGGAAGTTCTTTGCAGACATTTCAAGTGTAAAAACTGAAGATTACGGACTCATCCTAGTTGCCAAGAGAGAAGATAAAGGTGTCTTCGCTGGAAGAGTT LVAKREDKGVFAGRV DISSVKTEDYGLI VDESKFSRKFFA
3001
CCTGTAGCAACAGTGAGTGAATCTCAGGGAATGACCATTAGCAAAAGGGTGTTGATATGTTTGGACCAAAATCTTTTTGCCGGGGGAGCCAATGCAGCCATTGTTGCAATAACAAGATCA ISKRVLICLDQNLFAGGANAA PVATVSESQGMT
I
3121
AAGGTCGGCTTTGACTTTATCCTTAAAGGGAATTCATTGAAAGAGGTACAGAGGATGGCACAAAAGACAATTTGGCAGTTCATCATTGAAGGGAAGTCTATTCCGATGGAGAGGATAGTG I E G K S W Q F I LKGNSLKEVQRMAQXTI KVGFDFI
IPHERIV
3241
AACATGAATCCTGGAGCCAGCTTTTATGAGAGTCCTTTGGATGTTGGAAATTCATCAATTCAAGACAAAGCTTCTAATGACCTGTTCATAATGCCTTTTATAAATTTGGCTGAGGAAGAA IMPFINLAEEE NMNPGASFYESPLDVGNSSIQDKASNDLF
3361
GTTGACCCAGAGGAAGTTGTTGGGGACGTAATTCAACCTGTTGAGTGGTTCAAATGTCATGTGCCTGTCTTCGACACAGATCCGACGCTTGCGGAGATTTTTGATAAGGTTGCAGCAAAA FDKVAAK CHVPVFDTDPTLAEI IQPVBWFK VDPEEVVGDV
3481
GAAAAAAGGGAATTCCAGTCTGTGCTGGGTCTTTCAAATCAATTTCTTGACATGGAAAAGAATGGATGCAAAATAGACATCTTGCCCTTTGCGCGACAAAATGTTTTTCCACATCATCAA DILPPARQNVFPHHQ LGLSNQFLDHEKNGCKI EKREFQSV
3601
GCGTCTGATGATGTTACTTTCTGGGCAGGTGTTCAAAAAAGAATTAGAAAGTCGAACTGGAGAAGGGAGAAATCGAAGTTTGAGGAATTTGAAAGCCAAGGGAAAGAACTTCTTCAAGAA KRIRKSNWRREKSKFEEFESQGKELLQE ASDDVTFWAGVQ
3721
TTCATCTCAATGCTACCGTTTGAATTCAAAGTGAATATCAAGGAGATTGAAGATGGAGAGAAGAGCTTTTTAGAAAAAAGAAAGCTAAAATCTGAGAAAATGTGGGCAAATCA~TCGGAG FLEKRKLKSEKNWANHSE KEIEDGEKS F I SNLPFEFKVNI
V
A
C
Q
V
H
A
I
T
R
S
FIG. 2. The complete nucleotide sequence of the ASGV genome and amino acid sequences of the major open reading frames. Asterisks indicate the termination codons. The amino acid sequences of the conserved motifs associated with the NTP-binding helicase, the RNA polymerase, and the serine protease are underlined.
3841
AGATCAGACATTGACTGGAAACTTOACCACCACGCCTTTCTCTTCATGAAATCACAATATTGCACGAAGGAAGGGAAGATGTTCACCGAAGCTAAAGCTGGCCAAACTTTGGCCTGTTTCCAA RSD~DWKLDHAFLFMKSQYCTKEGKMFTEAKAGQTLACFQ
3961
CATATAGTCCTATTTAGATTTGGACCCATGTTGAGAGCAATTGAAAGTGCCTTTTTGAGAAGCTGTGGAGACTCATACTACATACACTCCGGGAAAAACTTCTTCTGCCTGGATAGCTTT HIVLFRFGPMLRAIESAFLRSCGDSYYIHSGKNFFCLDSF
4081
GTGACAAAGAATGCAAGTGTCTTTGATGGATTTCAATGAGTCAGACTACACGGCCTTTGACTCATCTCAGGACCACGTCATATTGGCCTTTGAAATGGCACTGTTACAATACCTGGGC VTKNASVFDGFS IESDYTAFDSSQDHV ILAFEHALLQYLG
4201
GTGTCAAAGGAGTTTCAGCTAGATTACCTTAGACTGAAATTAACTCTCGGATGCCGTCTCGGATCACTAGCAATAATGAGGTTCACAGGAGAATTTTGCACTTTCTTATTCAACACATTT VSKEFC’LDYLRLKLTLGCR LGSLAI HRFTGEFCTFLFNTF
4321
GCCAATATGCTGTTTACTCAATTGAAGTACAAGATAGACCCAAGGAGGCATAGGATTTTATTTGCTGGGGACGATATGTGTTCCTTGAGCTCTCTCAAAAGAAGGAGAGGGGAGAGAGCG ANMLFTCILKYK IDPRRHRILFAGDDMCSLSSLKRRRGERA
4441
ACAAGATTGATGAAGAGCTTTTCCCTAACTGCAGTAGAAGAGGTGAGAAAATTCCCAATGTTTTGTGGATGGTACTTAAGTCCATATGGTATCATTAAATCTCCAAAATTGCTGTGGGCC KSPKLLWA TRLHKSFSLTAVEEVRKFPMFCGWYLSPYGII
4561
AGGATCAAGATGATGAGTGAGAGACAGCTTTTGAAGGAATGTGTTGATAATTACCTATTTGAGGCGATATTTGCCTACAGATTAGGTGAGAGGCTTTACACAATTTTGAAAGAAGAGGAT KMHSERQLLKECVDNYLFEAI FAYRLGERLYTI R 1
4681
TTTGAATACCATTATCTTGTCATAAGATTTTTTGTTAGAAATTCAAAATTGTTAACAGGGTTGAGCAAAAGCTTGATATTTGAAATTGGGGAGGGCATCGGGTCCAAATGGCTATCGTCA IRFFVRNSKLLTGLSKSLI FEIGEGIGSKWLSS FEYHYLV
4801
ACGTCAACCGCTTCCTCAAGGAGGTCGAATCTACAGACCTCAAAATTGATGCTATCTCGTCCTCAGAGCTTTACAAGGATGCAACCTTTTTCAAACCAGACGTGCTTAATTGCATCAAAA TSTASSRRSNLQTSKLMLSRPQSFTRMQPFSNQTCLIASK ESTDLKIDAI VNRFLKHV SSSELYKDATFFKPDVLNCIKR
4921
GGTTTGAATCAAACGTCAAGGTTTCCTCTCGATCTGGTGACGGCCTCGTCCTGTCTGATTTCAAACTGCTTGATGACACCGAAATTGATTCAATCAGGAAGAAAAGCAACAAGTACAAAT GLNQTSRFPLDLVTASSCLISNCLMTPKLIQSGRKATSTN FESNVKVSSRSGDGLVLSDFKLLDDTEIDS-IRKKSNKYKY
5041
ACTTACACTATGGAGTCATCCTGGTTGGGATCAAAGCAATGTTGCCAAACTTTAGAGGCATGGAAGGGAGAGTCATTGTATATGATGGAGCCTGCCTGGATCCGAAAAGAGGCCACATTT TYTMESSWLGSKQCCQTLEAWKGESLYHHEPAWl LHYGVlLVGlKAMLPNFRGMEGRV 1VYDGAcLDPKRGHIC
L
K
I4
5161
GCTCGTATCTTTTCAAGTTTGAGTCTGACTGTTGCTACTTTGGTCTCAGGCCAGAGCACTGTTTGTCTACCACAGACGCAAATTTGGCCAAAAGGTTTAGATTTCGTGTGGACTTTGATT A R I FSSLSLTVATLVSGQSTVCLPQTQIWPKGLDFVWTLI SYLFKFESDCCYFGLRPEHCLSTTDANLAKRFRFRVDFDC
5281
GTCCACAATATGAACAGGACACTGAGTTGTTTGCTCTTGACATTGGAGTTGCATACAGATGCGTCAACTCTGCAAGGTTTTTGGAAACCAAAACTGGCGATTCAGGATGGGCTTCACAGG VHNMNRTLSCLLLTLELHTDASTLQGFWKPKLAIQDGLHR PQYEQDTELFALDI GVAYRCVNSARFLETKTGDSGWASQA
5401
CAATCAGCGGCTGTGAAGCACTTAAATTCAATGAGGAAATCAAGATGGCCATCCTGGATCGCAGATCCCCGCTGTTTCTGGAAGAAGGTGCACCAAACGTGCACATTGAAAAGAGATTGT QSAAVKHLNSHRKSRWPSW IADPRCFWKKVHQTCTLKRDC ISGCEALKFNEEI KMAlLDRRSPLFLEEGAPNVHIEKRLF
5521
TCAGAGGTGACAAGGTTAGAAGGTCACGCTCAATTTCCGCTAAAAGGGGGCCAAACTCAAGGGTGCAAGAAAAGAGAGGATTTAGGTCCCTCTCGGCTAGAATTGAAAGATTTGGAAAAA SEVTRLEGHAQFPLKGGQTQGCKKREDLGPSRLELKDLEK RGDKVRRSRSISAKRGPNSRVQEKRGFRSLSARIERFGKN
5641
ATGAGTTTGGAAGACGTGCTTCAGCAAGCGAGGCGCCACCGGGTAGGAGTATATCTATGGAAGACTCACATAGACCCGGCAAAGGAACTTCTGACGGTTCCTCCCCCTGAAGGATTTAAG HSLEDVLQQARRHRVGVYLWKTHIDPAKELLTVPPPEGFK EFGRRASASEAPPGRSI SHEDSHRPGKGTSDGSSP?
5761
GAAGGTGAAAGCTTTGAGGGCAAAGAGCTTTACCTTCTTCTTTGCAACCATTACTGTAAATACTTGTTCGGTAATATTGCTGTCTTTGGGTCATCTGATAAGACCCAGTTTCCCGCTGTT EGESFEGKELYLLLCNHYCKYLFGNIAVFGSSDKTQFPAV
5881
GGATTTGATACACCTCCGGTTCATTATAATTTGACAACGACCCCAAAGGAAGGGGAGACTGACGAAGGAAGGAAGGCCAGAGCGGGTTCGTCTGGCGAAAAAACAAAAATTTGGAGGATC GFDTPPVHYNLTTTPKEGETDEGRKARAGSSGEKTKIWRI
6001
GATTTGTCAAATGTTGTTCCTGAATTGAAAACCTTTGCTGCCACTTCCAGGCAGAACTCTTTGAACGAATGTACGTTCAGAAAGCTTTGCGAGCCATTTGCCGATTTGGCTCGAGAATTT DLSNVVPELKTFAATSRQNSLNECTFRKLCEPFADLAREF
6121
CTACATGAAAGGTGGTCTAAGGGATTGGCCACCAATATTTACAAGAAATGGCCCAAAGCTTTCGAAAAAAGTCCATGGGTGGCCTTTGATTTTGCCACTGGTCTGAAAATGAATCGTCTA LHERWSKGLATNIYKKWPKAFEKSPWVAFDFATGLKHNRL
6241
ACACCTGATGAGAAACAGGTGATTGATAGAATGACCAAAAGACTTTTTCGTACTGAAGGACAAAAAGGGGTTTTCGAGGCAGGTTCGGAAAGTAACCTGGAACTGGAGGGTTAGGAGTCG TPDEKQVIDRHTKRLFRTEGQKGVFEAGSESNLELEG*
6361
TGTGAAATTCCGCAAACTTGGTCGCGGTCTTGCAGGTTGACATGCCTGCCTTTATACTTAATTAAAGGGTTCCCCCGGTTTTCTGAGCATTTCCGGGTTAGTGTGGTTTTTCTAGAGTCT
6481
AGAGTTTGTCCACTCT
E
A
E
I
D
V
N
RKEATF
Poly(A)
FIG. 2-Continued
and a synthetic oligonucleotide primer (Deborde et al., 1986). The extreme 5'-terminal nucleotide could not be determined unambiguously and is written as N in Fig. 2. This showed that the cDNA inserts of pSG41 and pSG56 were lacking 44 and 12 nucleotides from the 5'-terminus of the ASGV genome, respectively. The ASGV genome consists of 6496 nucleotides (Mr 2.21 × 10^6) excluding the 3' poly(A) tail (Fig. 2). This value agrees with the Mr of 2.30 × 10^6 previously estimated by the electrophoresis of poly(A)-tailed RNA denatured with glyoxal (Yoshikawa and Takahashi, 1988). The base composition of ASGV RNA revealed relatively high adenine and uracil contents (30.6% A, 28.0% U, 23.0% G, and 18.4% C), similar to those reported for other plant viruses (Domier et al., 1986; Forster et al., 1988; German et al., 1990).
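The arithmetic behind these figures can be sketched as follows. This is an illustrative calculation only: the genome length, the Mr, and the percentages are the values quoted in the text (with the percentages read as 30.6, 28.0, 23.0, and 18.4), while the rounding to whole nucleotides and the names `approximate_base_counts` and `implied_mass_per_nt` are assumptions introduced for this example.

```python
# Minimal sketch: convert the reported base composition into approximate counts
# and derive the average mass per nucleotide implied by the reported Mr.

GENOME_LENGTH = 6496                    # nucleotides, excluding the poly(A) tail
REPORTED_MR = 2.21e6                    # Mr quoted for the 6496-nt genome
COMPOSITION_PERCENT = {"A": 30.6, "U": 28.0, "G": 23.0, "C": 18.4}


def approximate_base_counts(length: int, percent: dict) -> dict:
    """Round each percentage of the genome length to a whole number of bases."""
    return {base: round(length * value / 100) for base, value in percent.items()}


base_counts = approximate_base_counts(GENOME_LENGTH, COMPOSITION_PERCENT)
au_percent = COMPOSITION_PERCENT["A"] + COMPOSITION_PERCENT["U"]
implied_mass_per_nt = REPORTED_MR / GENOME_LENGTH

if __name__ == "__main__":
    print("approximate base counts:", base_counts)
    print("A + U content (%):", round(au_percent, 1))
    print("implied mass per nucleotide:", round(implied_mass_per_nt, 1))
```

The implied mass per nucleotide is simply the quoted Mr divided by the genome length; it says nothing about which residue masses or counter-ions were assumed when the Mr was originally computed.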
FIG. 3. Open reading frames in all three reading frames for both the viral (+) and complementary (-) strands of ASGV RNA. The short and long vertical bars indicate the initiation (AUG) and the termination (UAG, UGA, and UAA) codons, respectively.
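The kind of frame-by-frame scan summarized in Fig. 3 can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the GENETYX procedure used for the paper: `find_orfs` and `encoded_residues` are hypothetical helpers, the toy RNA is invented for the example, and only the three frames of one strand are scanned (the complementary strand would be handled by scanning its reverse complement). The ORF2 coordinates (4788 to 5750, stop codon included) are those reported in this paper.

```python
# Minimal sketch: locate AUG-to-stop open reading frames in three reading frames.

STOP_CODONS = {"UAG", "UGA", "UAA"}
START_CODON = "AUG"


def find_orfs(rna: str, min_codons: int = 2) -> list:
    """Return (start, end, frame) tuples with 1-based inclusive coordinates.

    Each ORF runs from the first AUG after the previous stop codon in that
    frame to the last base of the next in-frame stop codon.
    """
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(rna) - 2, 3):
            codon = rna[i:i + 3]
            if start is None and codon == START_CODON:
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start + 1, i + 3, frame))
                start = None
    return sorted(orfs)


def encoded_residues(first_nt: int, last_nt_of_stop: int) -> int:
    """Number of amino acids encoded between an AUG and the end of its stop codon."""
    span = last_nt_of_stop - first_nt + 1
    if span % 3 != 0:
        raise ValueError("coordinates do not describe a whole number of codons")
    return span // 3 - 1                # subtract the stop codon


# Invented toy sequence with two overlapping ORFs in different frames.
TOY_RNA = "CCAUGGCAUGGAACCCUGAAUAGCC"
toy_orfs = find_orfs(TOY_RNA)

# Coordinates reported in the text for ORF2 (AUG at 4788-4790, UGA at 5748-5750).
orf2_residues = encoded_residues(4788, 5750)

if __name__ == "__main__":
    print("toy ORFs:", toy_orfs)
    print("ORF2 residues:", orf2_residues)
```

With the reported coordinates, ORF2 spans a whole number of codons, which is consistent with a product of about 36 kDa at a typical average residue mass of a little over 110 Da.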
Coding regions
Analysis of the putative open reading frames (ORFs) in all three reading frames of both the positive and complementary strands of the ASGV genome showed that two overlapping ORFs were present in the positive strand (Fig. 3). ORF1 begins at AUG (nucleotide positions 37-39) and terminates at UAG (nucleotide positions 6342-6345) to yield a large polypeptide with an Mr of 241,267 (241 kDa) (Fig. 2). The AUG codon at positions 37-39 fits the optimal sequence context for plant mRNAs proposed by Lütcke et al. (1987; AACAAUGGC) at the positions -2, -1, +1, and +2. Other potential initiation codons, e.g., those at positions 265-267, 273-275, and 415-417, are in a very poor context. A polypeptide of 241 kDa may correspond to the 200-kDa polyprotein synthesized in rabbit reticulocyte lysate (Yoshikawa and Takahashi, 1992). ORF2, in a different reading frame within ORF1, starts at the AUG (nucleotide positions 4788-4790) and stops at UGA at positions 5748-5750 (Figs. 2 and 3). The initiation codon of ORF2 also fits the optimal context for plant mRNAs at the positions -1, +1, and +2. ORF2 can encode a polypeptide with an Mr of 36,134 (36 kDa).
Amino acid sequence comparisons
The 241-kDa protein encoded by ORF1 contains two consensus sequences associated with the NTP-binding helicase and the RNA-dependent RNA polymerase (Argos, 1988; Hodgman, 1988; Gorbalenya et al., 1988) found in most positive-strand RNA viruses (Habili and Symons, 1989). The NTP-binding helicase motif GxxGxGKS/T is located at positions 781 to 788 within the 241-kDa protein (Figs. 2 and 4). The consensus sequences GxxxTxxxNT/S and GDD, thought to be core sequences of the RNA-dependent RNA polymerase, have also been found at positions 1418 to 1453 (Figs. 2 and 4). Comparisons of amino acid sequences around these conserved motifs with other viruses reveal extensive homologies with apple chlorotic leaf spot virus (ACLSV) (German et al., 1990), potato virus S (PVS) (Mackenzie et al., 1989), turnip yellow mosaic virus (TYMV) (Morch et al., 1988; Keese et al., 1989), eggplant mosaic virus (Osorio-Keese et al., 1989), ononis yellow mosaic virus (Ding et al., 1989), potato virus X (PVX) (Huisman et al., 1988), and white clover mosaic virus (Forster et al., 1988). An alignment of these regions with ACLSV is shown in Fig. 4. Our data on amino acid sequences of conserved motifs indicate that ASGV is a member of the Sindbis-like supergroup A (Habili and Symons, 1989) or polymerase supergroup III (Koonin, 1991).
A homology search of amino acid sequences between the 36-kDa protein encoded by ORF2 and proteins available through the NBRF and SWISS-PROT protein databases did not reveal highly significant sequence similarities. However, the 36-kDa protein contains the sequence Gly-Asp-Ser-Gly (GDSG; positions 197-200), which is found in the active site of several cellular and viral serine proteases (Bazan and Fletterick, 1988; Choi et al., 1991; Gorbalenya et al., 1989; Schlesinger and Schlesinger, 1990) (Figs. 2 and 5). In the protease of the nucleocapsid Sindbis core protein, Ser 215 (in GDSG), His 141, and Asp 163 were identified as the essential catalytic triad (Choi et al., 1991; Strauss et al., 1984). The 36-kDa protein also contains Ser (in GDSG), His, and Asp at the positions 199, 144, and 171, respectively (Fig. 5), indicating that the 36-kDa protein may act as a protease. The protease activity of the 36-kDa protein is under investigation.
In previous in vitro translation experiments using rabbit reticulocyte lysate, ASGV-RNA directed the synthesis of a 200-kDa polyprotein which was immunoprecipitated by the antiserum against purified ASGV (Yoshikawa and Takahashi, 1992). This suggests that a viral protease is necessary for the processing of the 241-kDa polyprotein, similar to como-, poty-, and nepoviruses (Dougherty and Carrington, 1988; Dougherty and Hiebert, 1985; Goldbach and van Kammen, 1985; Pelham, 1979). As the polypeptide corresponding to the 36-kDa protein was not synthesized from genomic RNA in an in vitro translation system (Yoshikawa and Takahashi, 1992), the 36-kDa protein may be expressed from subgenomic mRNA in vivo and may be involved in the processing of the 241-kDa polyprotein.
[Figure 4: alignment panels A and B for ASGV 241K (from residue 779) and ACLSV 216K (from residue 1057).]
FIG. 4. Amino acid sequence alignment of the putative helicase (A) and the RNA polymerase (B) regions from ASGV and ACLSV. Identical amino acids for both viruses are boxed.
FIG. 5. Amino acid sequence alignment of the 36-kDa protein encoded by ASGV ORF2 with those of chymotrypsin (CHYT), Streptomyces griseus protease B (SGPB), and the autoprotease domain in the capsid protein of Sindbis virus (SIN). Amino acids identical to the ASGV sequence are boxed. A double asterisk indicates that all four residues are identical. A single asterisk indicates that at least one of three amino acids is identical to that of ASGV.
[Figure 6: Western blot, lanes 1 to 4, with bands labeled 38K and CP.]
FIG. 6. Western blot analysis of the proteins expressed in E. coli containing pTrc99A-41H. Lane 1, proteins from E. coli cells grown without IPTG; lanes 2 and 3, proteins from E. coli cells induced by IPTG; lane 4, coat protein from purified ASGV particles.
Location of coat protein gene
Attempts to determine the N-terminal sequence of the coat protein by Edman degradation were unsuccessful, probably due to blocking of the N-terminus of the ASGV coat protein. We constructed the clone pTrc99A-41H, which contained the HincII-HincII fragment (nucleotide positions 5346-6399), and analyzed the expressed proteins by Western blotting using ASGV antiserum. This clone is expected to express a 38-kDa protein in E. coli, identical to the C-terminal region of the 241-kDa protein. As shown in Fig. 6, there were two bands which reacted with antiserum against purified ASGV. The slower migrating band corresponds to the 38-kDa protein expected from the calculation of the amino acid sequence. Another, faster migrating band coelectrophoresed with the coat protein prepared from purified virus (Fig. 6). The clone pTrc99A-41EB, which contained the EcoRI-BamHI fragment (nucleotide positions 3743-5139), did not express proteins that reacted with ASGV antiserum. These results indicate that the coat protein is located in the C-terminal region of the 241-kDa polyprotein, in agreement with the results of in vitro translation (Yoshikawa and Takahashi, 1992). At present, we cannot explain why a protein comigrating with coat protein was expressed in E. coli in addition to the 38-kDa protein. Internal initiation of translation (Verver et al., 1991) or autoproteolysis similar to that of the nucleocapsid Sindbis core protein (Schlesinger and Schlesinger, 1990) may occur in the ASGV coat protein, although these did not occur in rabbit reticulocyte lysates (Yoshikawa and Takahashi, 1992).
Evolutionary relationship with ACLSV
In addition to the similarities in amino acid sequences around conserved motifs between ASGV and ACLSV described above, extensive homologies were found in amino acid sequences between the 241-kDa protein of ASGV and the 216- and 28-kDa proteins of ACLSV (German et al., 1990). In sequences of ca. 800 amino acids containing two conserved regions (positions 780-1577 for ASGV-241 kDa and positions 1058-1849 for ACLSV-216 kDa), amino acid similarity calculated on the basis of an optimal alignment of two sequences was 35.7%. Furthermore, a region of about 100 amino acids from the C-terminus of the ASGV 241-kDa protein has 37.9% similarity with the ACLSV coat protein (Fig. 7). These similarities are interesting because the ORFs found in the ASGV genome are so different from those in the ACLSV genome, i.e., these two viruses may differ in gene expression strategy (Fig. 8).
[Figure 7 alignment: ASGV rows as read from Fig. 2; the ACLSV-CP rows (from residues 125 and 185) and the match line are illegible in this copy.]
ASGV-ORF1 (1981) EKTKIWRIDLSNVVPELKTFAATSRQNSLNECTFRKLCEPFADLAREFLHERWSKGLATN
ASGV-ORF1 (2041) IYKKWPKAFEKSPWVAFDFATGLKMNRLTPDEKQVIDRMTKRLFRTEGQKGVFEAGSESN
FIG. 7. Amino acid homology between the C-terminal region of the ASGV 241-kDa protein and a portion of the ACLSV coat protein. Identical amino acids are indicated by asterisks and amino acids with similar properties by dots.
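How a percentage is obtained from a pairwise alignment can be sketched as follows. This is a minimal illustration, not the calculation used for the values above: it scores strict identity over ungapped columns, whereas the reported similarity figures may also count conservative substitutions, and the exact scoring scheme is not stated. The function name `percent_identity` and the two toy aligned strings are hypothetical.

```python
# Minimal sketch: percent identity between two pre-aligned sequences.
# Columns containing a gap in either sequence are excluded from the count.


def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Return identical columns as a percentage of ungapped aligned columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for residue_a, residue_b in zip(aligned_a, aligned_b):
        if residue_a == gap or residue_b == gap:
            continue                     # skip gapped columns
        compared += 1
        if residue_a == residue_b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


# Invented toy alignment (not taken from ASGV or ACLSV).
TOY_A = "ACDE-FGHIK"
TOY_B = "ACDQWFGH-K"
toy_identity = percent_identity(TOY_A, TOY_B)

if __name__ == "__main__":
    print("toy percent identity:", toy_identity)
```

Whether gapped columns are excluded or counted as mismatches changes the denominator, so percentages from different programs are only comparable when the same convention is used.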
FIG. 8. Comparison of open reading frames between ASGV and ACLSV. The closed triangles and squares indicate the locations of the NTP-binding helicase and the RNA polymerase motifs, respectively. Regions of amino acid sequence similarity are shown by similar shadowing. The open reading frames of ACLSV were drawn from the paper by German et al. (1990).
ACKNOWLEDGMENTS
We thank Dr. N. Suzuki for his help on DNA and RNA sequencing, Dr. H. Taira for helpful discussion and critical reading of the manuscript, and Dr. K. Tsutsumi for the synthesis of the oligonucleotide primer. This work was supported in part by a Grant-in-Aid from the Ministry of Education, Science and Culture, Japan.
REFERENCES
ARGOS, P. (1988). A sequence motif in many polymerases. Nucleic Acids Res. 16, 9909-9916.
BAZAN, J. F., and FLETTERICK, R. J. (1988). Viral cysteine proteases are homologous to the trypsin-like family of serine proteases: Structural and functional implications. Proc. Natl. Acad. Sci. USA 85, 7872-7876.
CHOI, H., TONG, L., MINOR, W., DUMAS, P., BOEGE, U., ROSSMANN, M. G., and WENGLER, G. (1991). Structure of Sindbis virus core protein reveals a chymotrypsin-like serine proteinase and the organization of the virion. Nature 354, 37-43.
DEBORDE, D. C., NAEVE, C. W., HERLOCHER, M. L., and MAASSAB, H. F. (1986). Resolution of a common RNA sequencing ambiguity by terminal deoxynucleotidyl transferase. Anal. Biochem. 157, 275-282.
DE SEQUEIRA, O. A., and LISTER, R. M. (1969). Purification and relationships of some filamentous viruses from apple. Phytopathology 59, 1740-1749.
DING, S., KEESE, P., and GIBBS, A. (1989). Nucleotide sequence of the ononis yellow mosaic tymovirus genome. Virology 172, 555-563.
DOMIER, L. L., FRANKLIN, K. M., SHAHABUDDIN, M., HELLMANN, G. M., OVERMEYER, J. H., HIREMATH, S. T., SIAW, M. F. E., LOMONOSSOFF, G. P., SHAW, J. G., and RHOADS, R. E. (1986). The nucleotide sequence of tobacco vein mottling virus. Nucleic Acids Res. 14, 5417-5430.
DOUGHERTY, W. G., and CARRINGTON, J. C. (1988). Expression and function of potyviral gene products. Annu. Rev. Phytopathol. 26, 123-143.
DOUGHERTY, W. G., and HIEBERT, E. (1985). Genome structure and gene expression of plant RNA viruses. In "Molecular Plant Virology, Vol. II: Replication and Gene Expression" (J. W. Davies, Ed.), pp. 23-81. CRC Press, Boca Raton, FL.
FORSTER, R. L. S., BEVAN, M. W., HARBISON, S. A., and GARDNER, R. (1988). The complete nucleotide sequence of the potexvirus white clover mosaic virus. Nucleic Acids Res. 16, 291-303.
FRANCKI, R. I. B., FAUQUET, C. M., KNUDSON, D. L., and BROWN, F. (1991). Classification and nomenclature of viruses. Fifth Report of the International Committee on Taxonomy of Viruses. Archives Virol. Suppl. 2.
GERMAN, S., CANDRESSE, T., LANNEAU, M., HUET, J. C., PERNOLLET, J. C., and DUNEZ, J. (1990). Nucleotide sequence and genomic organization of apple chlorotic leaf spot closterovirus. Virology 179, 104-112.
GOLDBACH, R., and VAN KAMMEN, A. (1985). Structure, replication, and expression of the bipartite genome of cowpea mosaic virus. In "Molecular Plant Virology, Vol. II: Replication and Gene Expression" (J. W. Davies, Ed.), pp. 83-120. CRC Press, Boca Raton, FL.
GORBALENYA, A. E., KOONIN, E. V., DONCHENKO, A. P., and BLINOV, V. M. (1988). A conserved NTP-motif in putative helicases. Nature 333, 22.
GORBALENYA, A. E., DONCHENKO, A. P., BLINOV, V. M., and KOONIN, E. V. (1989). Cysteine proteases of positive strand RNA viruses and chymotrypsin-like serine proteases. FEBS Lett. 243, 103-114.
GUBLER, U., and HOFFMAN, B. J. (1983). A simple and very efficient method for generating cDNA libraries. Gene 25, 263-269.
HABILI, N., and SYMONS, R. H. (1989). Evolutionary relationship between luteoviruses and other RNA plant viruses based on sequence motifs in their putative RNA polymerases and nucleic acid helicases. Nucleic Acids Res. 17, 9543-9555.
HENIKOFF, S. (1984). Unidirectional digestion with exonuclease III creates targeted breakpoints for DNA sequencing. Gene 28, 351-359.
HODGMAN, T. C. (1988). A new superfamily of replicative proteins. Nature 333, 22-23.
HUISMAN, M. J., LINTHORST, H. J. M., BOL, J. F., and CORNELISSEN, B. J. C. (1988). The complete nucleotide sequence of potato virus X and its homologies at the amino acid level with various plus-stranded RNA viruses. J. Gen. Virol. 69, 1789-1798.
JAGADISH, M. N., WARD, C. W., GOUGH, K. H., TULLOCH, P. A., WHITTAKER, L. A., and SHUKLA, D. D. (1991). Expression of potyvirus coat protein in Escherichia coli and yeast and its assembly into virus-like particles. J. Gen. Virol. 72, 1543-1550.
KEESE, P., MACKENZIE, A., and GIBBS, A. (1989). Nucleotide sequence of the genome of an Australian isolate of turnip yellow mosaic tymovirus. Virology 172, 536-546.
KOONIN, E. V. (1991). The phylogeny of RNA-dependent RNA polymerases of positive-strand RNA viruses. J. Gen. Virol. 72, 2197-2206.
LAEMMLI, U. K. (1970). Cleavage of structural proteins during the assembly of the head of bacteriophage T4. Nature (London) 227, 680-685.
LISTER, R. M. (1970). Apple stem grooving virus. C.M.I./A.A.B. "Descriptions of Plant Viruses," no. 31.
LÜTCKE, H. A., CHOW, K. C., MICKEL, F. S., MOSS, K. A., KERN, H. F., and SCHEELE, G. A. (1987). Selection of AUG initiation codons differs in plants and animals. EMBO J. 6, 43-48.
MACKENZIE, D. J., TREMAINE, J. H., and STACE-SMITH, R. (1989). Organization and interviral homologies of the 3'-terminal portion of potato virus S RNA. J. Gen. Virol. 70, 1053-1063.
MORCH, M. D., BOYER, J. C., and HAENNI, A. L. (1988). Overlapping open reading frames revealed by complete nucleotide sequencing of turnip yellow mosaic virus genomic RNA. Nucleic Acids Res. 16, 6157-6173.
NISHIO, T., KAWAI, A., TAKAHASHI, T., NAMBA, S., and YAMASHITA, S. (1989). Purification and properties of citrus tatter leaf virus. Ann. Phytopathol. Soc. Japan 55, 254-258.
OSORIO-KEESE, M. E., KEESE, P., and GIBBS, A. (1989). Nucleotide sequence of the genome of eggplant mosaic tymovirus. Virology 172, 547-554.
PELHAM, H. R. B. (1979). Synthesis and proteolytic processing of cowpea mosaic virus proteins in reticulocyte lysates. Virology 96, 463-477.
SALAZAR, L. F., and HARRISON, B. D. (1978). Host range, purification and properties of potato virus T. Ann. Appl. Biol. 89, 223-235.
SAMBROOK, J., FRITSCH, E. F., and MANIATIS, T. (1989). "Molecular Cloning: A Laboratory Manual," 2nd ed. Cold Spring Harbor Laboratory, Cold Spring Harbor, New York.
SANGER, F., NICKLEN, S., and COULSON, A. R. (1977). DNA sequencing with chain terminating inhibitors. Proc. Natl. Acad. Sci. USA 74, 5463-5467.
SCHLESINGER, S., and SCHLESINGER, M. J. (1990). Replication of togaviridae and flaviviridae. In "Virology" (B. N. Fields et al., Eds.), 2nd ed., pp. 697-711. Raven Press, New York.
STRAUSS, E. G., RICE, C. M., and STRAUSS, J. H. (1984). Complete nucleotide sequence of the genomic RNA of Sindbis virus. Virology 133, 92-110.
VERVER, J., LE GALL, O., VAN KAMMEN, A., and WELLINK, J. (1991). The sequence between nucleotides 161 and 512 of cowpea mosaic virus M RNA is able to support internal initiation of translation in vivo. J. Gen. Virol. 72, 2339-2345.
YANASE, H. (1974). Studies on apple latent viruses in Japan. Bull. Fruit Tree Res. Sta. Japan, Ser. C1, 47-109.
YOSHIKAWA, N., and TAKAHASHI, T. (1988). Properties of RNAs and proteins of apple stem grooving and apple chlorotic leaf spot viruses. J. Gen. Virol. 69, 241-245.
YOSHIKAWA, N., and TAKAHASHI, T. (1992). Evidence for translation of apple stem grooving capillovirus genomic RNA. J. Gen. Virol. 73, 1313-1315.
YOSHIKAWA, N., POOLPOL, P., and INOUYE, T. (1986). Use of a dot immunobinding assay for rapid detection of strawberry pseudo mild yellow edge virus. Ann. Phytopathol. Soc. Japan 52, 728-731.