Virus Research 136 (2008) 130–139
Contents lists available at ScienceDirect
Virus Research journal homepage: www.elsevier.com/locate/virusres
Complete genome sequence of a raccoon rabies virus isolate
Annamaria G. Szanto a,∗, Susan A. Nadin-Davis b, Bradley N. White a
a DNA Profiling and Forensic Research Centre, Trent University, DNA Building, 2140 East Bank Drive, Peterborough, Ontario K9J 7B8, Canada
b Rabies Centre of Expertise, Ottawa Laboratory Fallowfield, Canadian Food Inspection Agency, 3851 Fallowfield Road, Ottawa, Ontario K2H 8P9, Canada
Article info
Article history: Received 20 December 2007; received in revised form 28 April 2008; accepted 30 April 2008; available online 11 June 2008
Keywords: Rabies virus; Raccoon rabies strain; Complete genome sequence; L gene; Wildlife isolate

Abstract
The entire genome of a mid-Atlantic raccoon strain rabies virus (RRV) isolated in Canada was sequenced; this is the second North American wildlife rabies virus isolate to be fully characterized. The overall organization and length of the genome was similar to that of other lyssaviruses. The nucleotide sequence identity of the raccoon strain ranged between 32.7% and 85.0% when compared to other lyssaviruses, while the deduced amino acid sequence identity ranged between 22.9% and 94.2% with the nucleoprotein and polymerase being the most conserved. Notable features of RRV include the phosphoprotein’s four amino acid extension compared to most other rabies viruses, and a nucleotide substitution immediately prior to the normal start codon that results in an additional methionine at the beginning of the L protein. This is the first report of the RRV L gene sequence and its 2128 amino acid product. Rates of non-synonymous and synonymous nucleotide changes within the lyssavirus L gene identified the conserved blocks II, III and IV as being most constrained. Analysis of L gene codon substitution patterns favoured models that supported positive selection, but only one site, corresponding to Leu62 of the RRV L protein, was identified as being under weak positive selection. © 2008 Published by Elsevier B.V.
1. Introduction

Rabies virus, a member of the family Rhabdoviridae and genus Lyssavirus, is a bullet-shaped, enveloped, negative strand RNA virus with an unsegmented genome of about 12 kb. The genome is organized into five coding regions: nucleoprotein (N), phosphoprotein (P), matrix protein (M), glycoprotein (G) and polymerase or large protein (L). The intergenic regions are short, just 2–5 nucleotides in length, except for a long non-coding region between the G termination codon and the L mRNA initiation signal (reviewed in Wunner, 2007). Phylogenetic analysis of the lyssavirus N gene initially revealed six genotypes (GTs): (1) Rabies virus (RV); (2) Lagos bat virus (LBV); (3) Mokola virus (MOKV); (4) Duvenhage virus (DUVV); (5) European bat lyssavirus type 1 (EBLV-1); and (6) European bat lyssavirus type 2 (EBLV-2) (Bourhy et al., 1993). Subsequent characterization of isolates of the Australian bat lyssavirus (ABLV) identified this strain as a seventh genotype (GT 7) (Gould et al., 1998). These genotypes can be classified into two phylogroups based upon distinct immunopathology and genetic characteristics: phylogroup 1 comprises GTs 1 and 4–7, and phylogroup 2 comprises GTs 2 and 3 (Badrane et al., 2001). Four recently described lyssaviruses (Aravan, Khujand, Irkut and West Caucasian Bat Virus) recovered from bats in Eurasia still await formal classification but may constitute several new genotypes (Arai et al., 2003; Botvinkin et al., 2003; Kuzmin et al., 2003, 2005).

To date, 33 complete lyssavirus genome sequences from 5 GTs have been reported (see list in Table 1). The rabies-related viruses for which complete genome sequences have been determined include one GT 3 (MOKV), one isolate each of GTs 5 (EBLV-1) and 6 (EBLV-2), and two isolates of GT 7 (ABLV). Of the 28 complete sequences derived from GT 1 classical RVs, 23 are laboratory-adapted strains while just five represent field isolates, including two recently reported from China that were isolated from a deer (DRV) and a mouse (MRV), respectively, and which appear to be closely related to vaccine strains (Meng et al., 2007). Only one of these field isolate sequences, the SHBRV-18 strain, is derived from a virus recovered directly from its natural reservoir host. It is evident from this list that complete genome sequencing has concentrated on vaccine and certain reference strains while very few lyssavirus isolates recovered directly from their natural reservoirs have been characterized to this extent. Since it appears likely that adaptation of specific lyssavirus strains to their reservoir host species involves very subtle changes to unknown regions of the viral genome, the identification of host-specific features of these viruses will require complete genome sequence determination. Only when a significant number of such sequences are available from viruses recovered directly from their reservoir hosts can a systematic search for viral features that direct host specificity be undertaken.

∗ Corresponding author. Tel.: +1 905 873 8885; fax: +1 905 873 4940. E-mail address: [email protected] (A.G. Szanto).
0168-1702/$ – see front matter © 2008 Published by Elsevier B.V. doi:10.1016/j.virusres.2008.04.029
Table 1. Lyssaviruses for which complete genome sequences have been described

| Isolate name | Origins | Reference | GenBank no. |
|---|---|---|---|
| Genotype 1 laboratory isolates | | | |
| PV | Pathogenic strain derived from original Louis Pasteur isolate | Tordo et al., 1986, 1988; Sacramento et al., 1992 | M13215 |
| SADB19 | Vaccine strain derived by multiple passage of an isolate (SAD) recovered from a rabid dog in Alabama, 1935 | Conzelmann et al., 1990 | M31046 |
| SRV9 | Vaccine strain, plaque purified clone of SAD | | AF499686 |
| SAD derivatives | 14 separate derivatives of SAD (SAD Bern original var 1, 2, 3, 4 and 5, SAD Bern (Sanafox) and (Lysvulpen), SAD1-3670 var 1 and 2, SAD VA1, SAD B19 (Fuchsoral), SAD P5/88 (Rabifox), SAG 2, ERA) | | EF206707-20 |
| Flury-LEP | Derived from a human isolate in Georgia, USA, 1939 | | DQ099524 |
| HEP-Flury | Avirulent strain derived from Flury-LEP through high egg passage | Morimoto et al., 1989 | AB085828 |
| PM1503 | | | DQ099525 |
| Nishigahara | Virulent strain developed in Japan: thought to be derived from PV | Ito et al., 2001 | AB044824 |
| RC-HL | Vaccine strain derived from Nishigahara | Ito et al., 2001 | AB009663 |
| Ni-CE | Avirulent strain | | AB128149 |
| Genotype 1 street isolates | | | |
| Hum-Trans-IND | Human transplant donor (India) | | AY956319 |
| NNV-RAB-H | Human Indian case | | EF437215 |
| SHBRV-18 | Silver-haired bat isolate | Faber et al., 2004 | AY705373 |
| DRV | Recovered from a deer in China | | DQ875051 |
| MRV | Recovered from a mouse in China | | DQ875050 |
| Other genotypes | | | |
| MOKV (GT 3) | Recovered from a cat in Zimbabwe, 1981 | Le Mercier et al., 1997 | NC006429 |
| EBLV1-RV9 (GT 5) | From a bat in Germany, 1968 | Marston et al., 2007 | EF157976 |
| EBLV2-RV1333 (GT 6) | From a human in Scotland, 2002 | Marston et al., 2007 | EF157977 |
| ABLV-Hum (GT 7) | Recovered from a human in 1998 | Warrilow et al., 2002 | AF418014 |
| ABLV-Sacc (GT 7) | Insectivorous bat isolate from 1996 | Gould et al., 2002 | AF081020 |
As for all lyssaviruses on the American continent, the raccoon strain of rabies virus is a GT 1 RV. While the raccoon (Procyon lotor) is the primary host for the virus, spill-over to other mammals can occur, particularly to the striped skunk (Mephitis mephitis) (Wandeler et al., 1994). The first outbreak of rabies in the raccoon host was observed in Florida in the mid 1950s (Burridge et al., 1986; McLean, 1971). The epizootic remained contained until a second focus was reported from Virginia in 1977 (Jenkins and Winkler, 1987). This second outbreak, referred to as the mid-Atlantic raccoon variant, was probably the result of translocation of infected raccoons from Florida. Subsequently the virus moved rapidly into the states neighbouring Virginia, and then progressively spread north and south at an unprecedented rate until it became endemic in all states along the eastern seaboard of the United States (Biek et al., 2007). In Canada the first case of raccoon strain rabies was reported in Eastern Ontario on July 14th, 1999 (Wandeler and Salsberg, 1999), followed by a second case 12 days later. This outbreak was due to cross-border incursion from New York State and eventually resulted in 132 reported cases of raccoon variant rabies in the province of Ontario during 1999–2008 (Nadin-Davis et al., 2006; Rosatte et al., 2006). Additional incursions into Canada have included an outbreak in New Brunswick, in the years 2000–2002, that was probably due to cross-border incursion from the state of Maine (MacInnes, 2000; Nadin-Davis et al., 2006) and a more recent incursion into Quebec province (Ressources naturelles et faune Quebec, 2006). In this study, we report the first full-length genomic sequence of a mid-Atlantic raccoon rabies virus (RRV) isolated in Canada. Although the N, P, M and G genes of different isolates of this strain have been analyzed previously (Nadin-Davis et al., 1996, 1997, 2006), this is the first report describing the sequence of the RRV L gene. 
The RRV genome sequence has been compared to the sequences of all other complete lyssavirus genomes as well as complete gene sequences of many other wildlife isolates available from GenBank. The raccoon strain of rabies is just the sixth wildlife RV that has been fully sequenced and this information significantly
extends our knowledge of the extent to which viral variation is limited in wildlife reservoirs.

2. Materials and methods

2.1. Virus isolate and RNA isolation

The RRV specimen, ON-99-2, targeted by this study, was recovered in July 1999 from Leeds county in the province of Ontario and provided by the Canadian Food Inspection Agency (CFIA). Total RNA was extracted directly from original infected raccoon brain tissue using TRIzol according to the supplier’s instructions (Invitrogen, Burlington, Canada) and the RNA precipitates were dissolved in 50 µl of DNAse-, RNAse-free distilled water (GIBCO). Total RNA was quantified using a NanoDrop ND-1000 spectrophotometer (NanoDrop Technologies Inc., Rockland DE).

2.2. Primer design

Primers employed for the amplification and sequencing of the RRV genome are described in Table 2. The outermost external PCR primers were designed using terminal sequences of GT 1 RVs deposited in GenBank. Attempts to amplify and sequence the 3′-terminal half of the genome were first made using internal primers designed using sequences compiled primarily from various Canadian RV strains. Based on initial sequence data some of the primers were re-designed using the PRIMER EXPRESS computer software program (Applied Biosystems) so as to improve their match with the RRV sequence. To generate initial PCR products for the 5′-terminal half of the genome, primers were designed based on conserved coding regions among the GT 1 RVs accessed from GenBank, with particular emphasis placed on the sequence of the SHBRV isolate which, of all complete lyssavirus sequences deposited to the database, is phylogenetically the most closely related to the RRV strain (Nadin-Davis et al., 2002). Using the
Table 2. Primers used for amplification and sequencing of the raccoon rabies virus genome

| Name | Sequence (5′–3′) | Gene | Base position (PVᵃ) | Sense |
|---|---|---|---|---|
| External primers | | | | |
| RVfor | AACTGCAGACTAGTACGCTTAACAACCAGATCAAAG | N | Beginning | Positive |
| RVrev | GGATGCATGCGGCCGCACGCTTAACAAATAAAC | L | End | Negative |
| Internal primers | | | | |
| #305 | GTYCCYTRCAAGAACTGCATTGC | N | 299–321 | Negative |
| RRV-N F | CGGCCCCCTTTGTGAAG | N | 594–610 | Positive |
| RRV-N R | TAGCGCACATTTTGTGAGTTGTC | N | 634–656 | Negative |
| Rac 887 | CTACTTCTCTGGTGAAACCAGGAG | N | 1249–1272 | Positive |
| RRV-1R | ATGATTGACCTTCGAGTTGAG | N | 1998–2018 | Negative |
| Rac 656 | GCCGAGAGCTTCTCCAAGAAGTACA | P | 2129–2153 | Positive |
| Rac 995 | CCTTGACTATGTCATCAAGGAACA | P | 2208–2231 | Negative |
| Rac 888 | GGCAACCACAGATCATCATCAT | M | 2575–2596 | Negative |
| RRV-3 F | CATTCTGAAATCGTTTGATGAGATCT | M | 2717–2742 | Positive |
| RRV-3 R | CCTTTAGACGCTCTCTTCCCTTT | G | 3958–3980 | Negative |
| #423 | CTCGGATGAGCTTGAGCATCTTGT | G | 4164–4187 | Positive |
| #429 | TGTTGTTGGAGAAGGGATGA | G | 4504–4523 | Negative |
| RRV-5 F | TTCATCTCCCAGATGTTCACAAA | G | 4625–4647 | Positive |
| RRV-2 F | GGACACAAGCAATAGATCACAATCA | G | 5259–5283 | Positive |
| #974 | TCTTCTA(GT)CAAAGGAGAGTTGAG(AG)TTGTAGT | L | 5521–5551 | Negative |
| RRV-4 F | TTTGTTCTTTCATGTGATTACCCTTT | L | 5931–5956 | Positive |
| RRV-2 R | CCCTCCACAGTGCCAAGATAG | L | 5990–6010 | Negative |
| RRV-6 F | CGACTTCTTGCGAGGGATCA | L | 6688–6707 | Positive |
| RRV-4 R | TATATGCTTGGGAGGCCATGT | L | 6736–6756 | Negative |
| RRV-7 F | GGAGCCTGGTCAGCCTACTG | L | 7517–7536 | Positive |
| RRV-5 R | CCCTGGGCTAGTATCTTAGTTCTTGT | L | 7567–7592 | Negative |
| RRV-9 F | GCAGCTATGACTTTCTCATTTATGGA | L | 7760–7785 | Positive |
| L3rev | CCCTTCWGCRCTCAGRATCTT | L | 8032–8052 | Negative |
| RRV-12 F | CAGAGATTCATGGGATCAATCG | L | 8624–8645 | Positive |
| L9Ra | GCYCTYTTCACCACATGAACRTT | L | 8962–8984 | Negative |
| L9Fb | ATGTCGACCCAGCTATTCCATGC | L | 8923–8945 | Positive |
| RRV-13 F | CAGAAGGACATACGGCTGAGAGA | L | 9310–9332 | Positive |
| L10Fb | ACCACWATGAAAGARGGCAACAGATC | L | 9886–9911 | Positive |
| L10Rb | GCAACCTGTTGAGCAGAGCAYACCCA | L | 10312–10337 | Negative |
| RRV-14 F | GCCTGTTGGAAGTGAATGACTTG | L | 10589–10611 | Positive |
| L11Fb | GTYACTGACATTGCATCGATCAA | L | 10807–10829 | Positive |
| L11Rb | GACAATGCRAAATCRGACAT | L | 10846–10865 | Negative |
| RRV-15F | GGGAACATTTATTTATCCTGGAGTTG | L | 11541–11566 | Positive |

ᵃ GenBank accession M13215.
gene-walking approach (Tordo et al., 1986), sequencing primers appropriate to the RRV sequence were designed with PRIMER EXPRESS.

2.3. cDNA synthesis, amplification and sequencing

Synthesis of cDNA was performed with 2 µg of total RNA as outlined by Nadin-Davis (1998). The polymerase chain reaction (PCR) was performed on a PTC-100 Programmable Thermal Controller (MJ Research Inc.) in a 25 µl reaction mixture that consisted of cDNA (5 µl), 1× PCR buffer (Invitrogen), 1 µM of forward and reverse primers, 2 mM MgCl2, 100 µM of each dNTP and 1 unit of Taq DNA Polymerase (Invitrogen). Thermocycling profiles involved an initial denaturation for 2 min at 95 °C, followed by 35 cycles of 95 °C for 1 min, 50 °C for 1 min and 72 °C for 2 min, and a final extension at 72 °C for 5 min. Sizes of the amplicons were confirmed visually by ethidium bromide staining of 1.5% agarose gels after electrophoresis. The amplicons were purified using ExoSAP-IT (USB) following the manufacturer’s instructions. The purified amplicons were sequenced with both forward and reverse primers on the MegaBACE 1000 (GE Healthcare) 96-well plate DNA sequencer using the DYEnamic™ ET terminator cycle sequencing kit (GE Healthcare).

2.4. Amplification and sequencing of the terminal ends

The 5′-terminal end of the genome was confirmed using a 5′ RACE System (Invitrogen) according to the supplier’s instructions. Synthesis of cDNA was performed using a positive sense L gene-specific primer (L10Fb); this same primer, together with the 5′ abridged anchor primer included in the kit, was used in the first round of amplification. A second round of amplification, performed by hemi-nested PCR, used an internal positive sense primer (L11Fb) and the 5′ abridged anchor primer used in the first PCR. The 3′-terminal end was generated essentially as outlined by Marston et al. (2007). An RLM-3′ RACE oligonucleotide (5′-GTCGTACTAGTCGACGCGTGGCCTAG-3′) was phosphorylated with T4 kinase. Eighty pmol of this oligonucleotide was ligated to 4 µg of total RV-infected raccoon brain RNA using T4 RNA ligase at 16 °C for 2 h. After recovery of the RNA by ethanol precipitation, cDNA synthesis was primed using the complementary oligonucleotide 5′-GGCCACGCGTCGACTAGTAC-3′. For the first round of PCR the complementary oligonucleotide was paired with a negative sense primer, RRV-1R, that targets N gene coding sequence, and for the second round of hemi-nested PCR it was paired with another primer, RRV-N R. The amplicons were purified and sequenced as described above. Sequencing of the 5′ and 3′ terminal ends was carried out with primers RRV-15F and #305, respectively.

2.5. Data analysis

Sequences were manually edited and compiled using the BioEdit (Hall, 1999) program. Alignments were performed using ClustalW (Thompson et al., 1994) in BioEdit. The raccoon strain nucleotide sequence was compared to full-length genome sequences of published lyssavirus isolates of several GTs to establish its genome
organization. Protein sequences of coding regions were deduced using the MEGA 3.1 program (Kumar et al., 2004) and compared to those of many representative lyssavirus GTs to examine the degree of conservation of functionally important motifs and to identify raccoon strain-specific residues. Hydrophilicity plots of selected L proteins were generated using the method of Hopp and Wood (1981) implemented in the DNAsis package, v. 2.6 (Hitachi). 2.6. Phylogenetic analysis Phylogenetic analysis of lyssavirus L gene sequences, including phylogenetic tree construction and calculation of nucleotide and amino acid (AA) conservation was performed using the MEGA 3.1 program. Phylogenetic trees were constructed using the neighbour joining (NJ) method with Kimura (1980) evolutionary distance correction statistics. The branching pattern was statistically evaluated by bootstrap analysis of 1000 replicates (Felsenstein, 1985; Dopazo, 1994); bootstrap values of 70% or greater were considered significant (Hillis and Bull, 1993). 2.7. Evolutionary analysis of L gene sequences Using an alignment of 22 L gene sequences, the MEGA 4.0 package was used to determine pairwise and overall transition/transversion (Tn/Tv) ratios for the complete coding region and for segments thereof. The codeml program of the PAML package, v 3.15 (Yang, 1997) was used for pairwise calculation of the rates of non-synonymous (dN) and synonymous (dS) base changes using the same alignment.
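The distance correction and Tn/Tv calculations described in Sections 2.6 and 2.7 were run in MEGA; the sketch below only illustrates the underlying arithmetic for a single pair of aligned sequences. It assumes that "Kimura (1980)" refers to the two-parameter model, skips gaps and ambiguity codes (an assumption, since the handling used in the study is not stated), and uses two short toy sequences that are hypothetical rather than taken from the RRV genome.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def count_substitutions(seq1, seq2):
    """Return (transitions, transversions, compared_sites) for two aligned sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = sites = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # assumption: gaps and ambiguity codes are ignored
        sites += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1      # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1    # purine<->pyrimidine
    return transitions, transversions, sites


def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned sequences."""
    ts, tv, n = count_substitutions(seq1, seq2)
    p = ts / n  # proportion of transitional differences
    q = tv / n  # proportion of transversional differences
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# Hypothetical toy alignment: one transition (A/G) and one transversion (C/A).
toy_a = "ACGTACGTAC"
toy_b = "GCGTACGTAA"
ts, tv, n = count_substitutions(toy_a, toy_b)
tn_tv_ratio = ts / tv
distance = k2p_distance(toy_a, toy_b)
```

Counting raw differences, as done here for the Tn/Tv ratio, is the simplest estimator; MEGA may apply model-based corrections, so values from this sketch need not match Table 5 exactly.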
A maximum likelihood approach was used to investigate selection pressures in the lyssavirus L gene. A maximum likelihood tree, generated for the 22 L gene alignment using the dnaml program of PHYLIP, version 3.63 (Felsenstein, 1993), and employing the calculated Tn/Tv ratio, was used together with the sequence alignment in the codeml package of PAML to examine various models of codon substitution (M0, M1, M2, M3, M7 and M8) using the NSSITES option. Sites with gaps were removed from the analysis and each model was run twice to check for convergence. Nested models were assessed against each other using a likelihood ratio test as detailed previously (Yang et al., 2000; Holmes et al., 2002).

3. Results

3.1. Gene sequence analysis and genome organization

Using a total of 35 primers (as shown in Table 2) the RRV genome was amplified as 21 separate overlapping PCRs that permitted sequencing of the entire genome. In addition, the 5′- and 3′-termini were confirmed as described. The length (11,923 nucleotides (nts)) and overall organization of the RRV genome were similar to those of other GT 1 RVs (data not shown). The viral sequence described here has been submitted to GenBank (accession number EU311738).

3.1.1. Non-translated and non-coding regions

Consistent with previous observations on RV termini (Tordo et al., 1988), and their importance with respect to viral transcription and replication (Wunner, 2007), the leader region of the RRV strain is highly A/T rich and its genomic 3′-terminal 11 nucleotides exhibit
Fig. 1. (A) Comparison of the 3′- and 5′-termini of the antigenome (+) sense RNA (in DNA code) of the RRV strain. Complementary nucleotides are indicated by an exclamation mark. The start of the N gene is underlined and the start of the N ORF is indicated in bold with the predicted amino acids shown above. The TTP motif of the L gene is in italics and underlined. Also shown is the alignment of the 3′ (B) and 5′ (C) termini of the antigenomic (+sense) RNA (in DNA code) of the RRV with corresponding sequences from 10 other lyssaviruses including 6 GT 1 RVs and 4 rabies-related viruses of GTs 3, 5, 6 and 7. Only differences from the reference RRV sequence are shown.
Table 3. Percentage of conserved nucleotide and deduced amino acid residues of all five genes for 29 genotype 1 lyssavirusesᵃ

| | N (%) | P (%) | M (%) | G (%) | L (%) |
|---|---|---|---|---|---|
| Nucleotide homology | 67.4 | 56.1 | 63.1 | 58.5 | 65.9 |
| Amino acid homology | 80.9 | 63.2 | 70.9 | 65.0 | 84.1 |

ᵃ As listed in Table 1.
100% complementarity to the last 11 bases of the genomic 5′ terminus. Indeed, as shown in Fig. 1A, 25 of the 40 most terminal bases are complementary in nature. The high conservation of the RRV sequence with those of other lyssaviruses at the 3′- and 5′-genomic termini is illustrated in Fig. 1B and C, respectively. Apart from an A/G substitution in ABLV and EBLV2 sequences at nucleotide 10, all lyssaviruses in this comparison are absolutely conserved over the 12 bases of the genomic 3′-terminus and highly conserved over a 25 nucleotide region spanning the N gene’s transcriptional start (see below) and beginning of its open reading frame (ORF). Similarly, the 16 terminal nucleotides of the 5′-genomic terminus are absolutely conserved in all GT 1 RVs examined, with just a single A/T substitution at position 13 from the end in the non GT 1 lyssaviruses and an additional C nucleotide at the terminus of MOKV. Indeed, of the 82 bases of the 5′-genomic terminus, substitutions amongst all RVs are very limited.

3.1.2. Protein-coding regions and nucleotide sequence identity

Comparisons of the five protein-coding regions of the RRV with those of the other 28 GT 1 RVs revealed a high degree of homology as summarized in Table 3. The highest deduced AA sequence identity was observed for the L gene (84.1% of all residues conserved), followed by the N, M and G genes, with the P gene being the most variable (63.2% of AAs conserved). Similar results were obtained when the RRV sequence was compared with just the six rabies wildlife isolates (data not shown). Pairwise comparisons of deduced AA sequences between RRV and lyssaviruses representative of other genotypes showed similar trends (Table 4), although the most conserved protein was either the L or N depending on the viral pair used for comparison. The predicted AA sequence identity ranged between 22.9% (RRV-MOKV P gene) and 94.2% (RRV-PV N gene).
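The terminal complementarity described in Section 3.1.1 can be illustrated with the two external primers from Table 2, which were designed from GT 1 terminal sequences in GenBank rather than from RRV itself. The sketch below assumes that the first 14 bases of RVfor and the first 16 bases of RVrev are non-viral cloning tails (this split is an inference and is not stated in the text). If both ends are written 5′ to 3′ on their own strands, complementary termini show up as a shared prefix.

```python
RVFOR = "AACTGCAGACTAGTACGCTTAACAACCAGATCAAAG"  # positive sense, genome beginning (Table 2)
RVREV = "GGATGCATGCGGCCGCACGCTTAACAAATAAAC"     # negative sense, genome end (Table 2)

# Assumed lengths of the non-viral 5' tails on each primer (hypothetical split).
RVFOR_TAIL = 14
RVREV_TAIL = 16


def shared_prefix_length(a, b):
    """Number of leading positions at which the two strings are identical."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(seq))


viral_start = RVFOR[RVFOR_TAIL:]   # 5' end of the antigenome
viral_end_rc = RVREV[RVREV_TAIL:]  # reverse complement of the antigenome 3' end

# Identical leading bases here mean the two genome ends can base-pair with each other.
complementary_bases = shared_prefix_length(viral_start, viral_end_rc)

# The antigenome 3' end itself, written 5'->3' on the positive strand.
antigenome_3prime_end = reverse_complement(viral_end_rc)
```

Under these assumptions the two primers share their first 11 viral bases, which agrees with the 11-nucleotide perfect complementarity reported for the RRV termini.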
At the nucleotide level N was overall and by pairwise comparison the most conserved gene, followed by L, M, G and P (summarized in Tables 3 and 4). Within GT 1, nucleotide sequence identity ranged between 67.4% and 56.1% while between RRV and other lyssaviruses the range was between 32.7% (MOKV) to 85.0% (SHBRV-18).
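As a rough illustration of how a pairwise identity of the kind reported in Table 4 can be computed from two aligned sequences, the following sketch counts matching positions. The study used MEGA for these calculations; the gap handling below (columns with a gap in either sequence are skipped) is an assumption, and the two short sequences are hypothetical.

```python
def percent_identity(aligned1, aligned2):
    """Percentage of identical positions between two aligned sequences.

    Columns containing a gap ('-') in either sequence are excluded,
    which is an assumption rather than the documented behaviour of MEGA.
    """
    if len(aligned1) != len(aligned2):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned1.upper(), aligned2.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


# Hypothetical 10-nucleotide alignment with two mismatches.
toy_identity = percent_identity("ATGGACGCCG", "ATGGATGCAG")
```

The same function works for deduced amino acid sequences, since it only compares characters position by position.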
Table 4. Pairwise comparisons of nucleotide and predicted amino acid sequence identities of the raccoon rabies virus (RRV) with all five genes of other lyssaviruses

| Strains compared | Identity | N (%) | P (%) | M (%) | G (%) | L (%) |
|---|---|---|---|---|---|---|
| RRV – PV | Nucleotide | 82.7 | 58.5 | 78.5 | 78.3 | 80.1 |
| | Amino acid | 94.2 | 66.7 | 89.2 | 85.2 | 92.5 |
| RRV – SHBRV-18 | Nucleotide | 85.0 | 60.7 | 83.4 | 79.3 | 84.2 |
| | Amino acid | 93.3 | 67.0 | 92.6 | 84.5 | 94.6 |
| RRV – MOKV | Nucleotide | 71.9 | 32.7 | 73.4 | 57.0 | 70.1 |
| | Amino acid | 81.2 | 22.9 | 77.3 | 57.3 | 78.3 |
| RRV – EBLV-1 | Nucleotide | 73.7 | 44.9 | 73.4 | 68.0 | 72.7 |
| | Amino acid | 87.4 | 31.1 | 80.8 | 70.7 | 85.7 |
| RRV – EBLV-2 | Nucleotide | 75.1 | 65.0 | 75.9 | 68.8 | 74.6 |
| | Amino acid | 86.5 | 65.6 | 83.3 | 75.0 | 86.9 |
| RRV – ABLV-Hum | Nucleotide | 77.1 | 67.5 | 76.7 | 69.0 | 74.9 |
| | Amino acid | 91.6 | 69.2 | 89.2 | 76.1 | 88.6 |
3.2. Protein sequence comparisons The present study confirmed the primary sequences of the N, P, M and G proteins of the RRV as reported previously (Nadin-Davis et al., 1996, 1997, 2002; Badrane and Tordo, 2001) and thus these data are not examined in detail here. N, M and G all exhibited identical lengths and well conserved sequences compared to other GT 1 viruses, while the RRV P is notable in having a length of 301 AAs compared to 297 residues for most RVs. Important functional motifs (as summarized in Wunner, 2007) that are retained in the RRV strain included the following: the highly conserved phosphorylation site, Ser 389, of N; several alternate internal methionine initiation codons in P, the minimal binding motif (KXTQT) for the dynein light chain LC8 at P residues 144–148, and the lysine-rich motif (FSKKYKF) at residues 209–215 important for P binding to N-RNA; an M PPPY motif important for viral budding (Harty et al., 2001); G retained the well characterized antigenic sites II and III, a linear epitope located between residues 223–276, an N-linked glycosylation site at residue 319 and residue Arg333 important for viral pathogenicity. Unlike most other RVs, the RRV strain G lacks the N-linked glycosylation site at residue 37 due to an Asn to Ser replacement. Comparisons of each of these four protein sequences of the raccoon strain to those of all other characterized lyssaviruses identified certain residues which appear specific to the raccoon strain thus: N, Ser166 and Ser303; P, Leu164, Gly166, Arg297 and Gln298; M, His189; G, Trp-16 (in the signal peptide), Leu246 (in the ectodomain), Phe454 (in the transmembrane domain) and Lys488 (in the cytoplasmic domain). In the absence of any clear functions identified for these residues the importance of any of these changes is presently unknown. 3.2.1. Polymerase structure The 6387 nucleotide ORF of the RRV strain L gene encodes a product of 2128 AA. 
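The relationship between the ORF length and the size of the L protein is simple codon arithmetic, sketched below. The sketch assumes that the 6387-nucleotide figure includes the termination codon, which is consistent with a 2128-residue product but is not stated explicitly in the text.

```python
def protein_length_from_orf(orf_nucleotides, includes_stop_codon=True):
    """Number of amino acids encoded by an ORF of the given nucleotide length."""
    if orf_nucleotides % 3 != 0:
        raise ValueError("ORF length must be a multiple of three")
    codons = orf_nucleotides // 3
    # The stop codon occupies one codon but contributes no residue.
    return codons - 1 if includes_stop_codon else codons


rrv_l_protein_length = protein_length_from_orf(6387)
```

With these inputs, 6387 nucleotides correspond to 2129 codons, of which 2128 encode amino acids.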
An L gene alignment illustrated that both the RRV and SHBRV-18 isolates have an extra Met codon at the start of the ORF but they share the same termination codon with several other lyssaviruses with the exception of the PV strain, which is extended at its C-terminus by 15 AA. As has been noted previously, although the lyssavirus L is highly conserved overall, six blocks of sequence exhibiting particularly high levels of conservation, separated by less conserved segments, have been identified (Tordo et al., 1988; Poch et al., 1990; Le Mercier et al., 1997). Since these conserved blocks are probably responsible for several enzymatic functions, their high level of conservation across the genus is not unexpected, and the RRV sequence was no exception in this regard. A hydrophilicity plot of the RRV L yielded a profile virtually indistinguishable from plots for other GT 1 L proteins and there was even minimal difference from the plot for MOKV L (data not shown). Thus, the overall structure and conformation of the lyssavirus L protein is highly conserved. Specific motifs retained by RRV L include the AQGDNQ motif (conserved domain III, block C) shown to be critical for RNA polymerase function (Schnell and Conzelmann, 1995) and the Pre-A sequence, rich in Lys and Glu residues within conserved domain II, responsible for RNA template binding in many RNA viruses (Muller et al., 1994). Comparison of RRV L with that of all 33 other lyssaviruses characterized at this locus (see Table 1) identified 39 AA replacements specific to the raccoon strain. Twenty-nine of these AA changes were located within the non-conserved blocks at AA positions Ser20, Asn27, Arg56, Leu81, Met118, Phe137, Met208, Ile230, His884, Pro1086, Ala1585, Ala1616, Gly1620, Met1627, Arg1804, Thr1805, Met1810, Ile1811, Phe1815, Glu1822, Arg1823, Thr1853, Ile1891, Val1970, Asn1993, Lys2042, Ala2074, Thr2093 and Asn2095. Two changes were found within conserved block I (His236 and Tyr312), 1 AA change within
Table 5. Pairwise transition/transversion (Tn/Tv) ratios for various regions across the lyssavirus L gene. Column headings give the protein domain (where applicable) and the nucleotide range.

| Virus strain (compared with RRVᵃ) | Total L | 1–660 | I 661–1260 | 1261–1500 | II 1501–1800 | 1801–4050 | III 1801–2490 | IV 2671–3180 | V 3271–3990 | 3991–5010 | VI 5011–5280 | 5281–6384 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| PV | 3.18 | 2.79 | 4.74 | 1.93 | 4.15 | 3.03 | 4.52 | 2.48 | 2.82 | 3.95 | 2.62 | 2.97 |
| SAD | 3.11 | 2.79 | 4.24 | 1.87 | 4.23 | 2.93 | 4.00 | 2.38 | 2.87 | 3.78 | 2.62 | 3.06 |
| FluryHEP | 2.79 | 2.64 | 4.35 | 2.39 | 2.88 | 2.63 | 3.83 | 2.29 | 2.33 | 3.34 | 2.00 | 2.59 |
| PM1503 | 2.92 | 2.97 | 4.35 | 2.15 | 3.39 | 2.67 | 3.78 | 2.20 | 2.50 | 3.61 | 2.46 | 2.65 |
| Nishigahara | 3.09 | 2.69 | 3.68 | 3.25 | 4.25 | 2.69 | 3.55 | 2.41 | 2.34 | 4.24 | 2.33 | 3.11 |
| NNV-RAB-H | 3.02 | 2.84 | 4.21 | 2.54 | 3.67 | 3.08 | 4.00 | 3.24 | 2.89 | 3.04 | 2.85 | 2.68 |
| SHBRV | 3.77 | 4.43 | 4.88 | 1.73 | 4.27 | 3.77 | 5.38 | 4.06 | 3.00 | 4.33 | 4.43 | 3.02 |
| DRV | 3.01 | 2.81 | 3.46 | 3.33 | 4.08 | 2.69 | 3.44 | 2.77 | 2.19 | 3.79 | 2.43 | 2.85 |
| EBLV-1 | 1.47 | 1.25 | 1.95 | 1.00 | 1.32 | 1.47 | 1.82 | 1.61 | 1.18 | 1.40 | 1.06 | 1.75 |
| EBLV-2 | 1.44 | 1.35 | 1.72 | 1.15 | 1.41 | 1.32 | 1.65 | 1.09 | 1.16 | 1.52 | 1.60 | 1.58 |
| ABLV-Hum | 1.57 | 1.30 | 2.16 | 1.57 | 1.63 | 1.55 | 1.96 | 1.08 | 1.40 | 1.40 | 1.87 | 1.70 |
| ABLV-Sacc | 1.48 | 1.31 | 2.11 | 1.56 | 1.73 | 1.36 | 1.91 | 1.15 | 1.22 | 1.34 | 1.38 | 1.60 |
| MOKV | 1.13 | 0.90 | 0.95 | 1.14 | 1.17 | 1.24 | 1.25 | 1.31 | 1.15 | 1.22 | 1.11 | 1.14 |
| Mean (GT1 only) | 3.11 | 2.99 | 4.24 | 2.40 | 3.86 | 2.94 | 4.06 | 2.73 | 2.62 | 3.76 | 2.72 | 2.87 |
| ±S.D. | 0.2926 | 0.5887 | 0.4791 | 0.613 | 0.511 | 0.379 | 0.626 | 0.6317 | 0.311 | 0.432 | 0.735 | 0.202 |

ᵃ Sequence used for comparison in each case.
conserved block IV (Asn1061) and 7 changes were noted within conserved block V (Ala1093, Val1137, Val1228, Arg1232, Glu1274, Cys1282 and Arg1289). 3.3. Evolutionary trends on the L gene To further explore the evolutionary processes operating on the L gene coding region, further analysis was performed on an alignment of 22 sequences. This group includes RRV and all isolates listed in Table 1, except that all SAD derivatives apart from ERA and SAG2 were excluded so as not to unduly affect the results by the inclusion of a large number of closely related laboratory strains. Pairwise transition/transversion (Tn/Tv) ratios were determined for the entire length of the ORF as well as for specific regions of the gene corresponding to the various conserved blocks and less conserved flanking regions as identified previously (Tordo et al., 1988; Marston et al., 2007). A comparison of the RRV strain with several other lyssaviruses showed that the Tn/Tv ratio is highest for strains that are most closely related phylogenetically (cf RRV vs. SHBRV with RRV vs. MOKV) (Table 5). Secondly, perhaps not surprisingly
this ratio appears to vary across the length of the gene with particularly high values associated with regions encoding the most highly conserved blocks, particularly those of I, II and III. Interestingly, the region between those encoding blocks V and VI (nts 3991–5010) also exhibited rather high Tn/Tv values. Since transitions are less likely to lead to AA changes, a second analysis was applied to estimate the rate of non-synonymous to synonymous changes (dN/dS) for each of these L gene segments (see Table 6). As expected, particularly low dN/dS values were observed for regions encoding blocks II, III and IV, lower than average dN/dS values were observed for those regions encoding blocks I and VI, while the L gene segment encoding block V yielded a value close to average for the entire gene. These data suggest that mutational constraints operate most selectively on the L gene segments encoding blocks II, III and IV, while block V exhibits greater flexibility consistent with the observation of a relatively large number of RRV-specific AA replacements within this block. Further analysis of this dataset used a maximum likelihood approach to evaluate selection pressures acting on the L gene by testing various models of codon substitution as detailed previously
Table 6
Rates of non-synonymous/synonymous (dN/dS) base changes for various regions across the lyssavirus L gene

| Virus strain (compared with RRVᵃ) | Total L | 1–660 | I (661–1260) | 1261–1500 | II (1501–1800) | 1801–4050 | III (1801–2490) | IV (2671–3180) | V (3271–3990) | 3991–5010 | VI (5011–5280) | 5281–6384 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| PV | 0.027 | 0.054 | 0.019 | 0.016 | 0.001 | 0.023 | 0.017 | 0.009 | 0.030 | 0.022 | 0.016 | 0.032 |
| SAD | 0.028 | 0.054 | 0.028 | 0.014 | 0.001 | 0.022 | 0.018 | 0.012 | 0.026 | 0.025 | 0.016 | 0.031 |
| FluryHEP | 0.022 | 0.043 | 0.009 | 0.001 | 0.008 | 0.016 | 0.003 | 0.006 | 0.032 | 0.019 | 0.015 | 0.026 |
| PM1503 | 0.024 | 0.055 | 0.007 | 0.024 | 0.001 | 0.017 | 0.002 | 0.008 | 0.033 | 0.024 | 0.014 | 0.027 |
| Nishigahara | 0.025 | 0.060 | 0.015 | 0.001 | 0.001 | 0.016 | 0.005 | 0.009 | 0.025 | 0.017 | 0.012 | 0.036 |
| NNV-RAB-H | 0.022 | 0.060 | 0.023 | 0.001 | 0.003 | 0.014 | 0.002 | 0.009 | 0.023 | 0.019 | 0.007 | 0.029 |
| SHBRV | 0.033 | 0.068 | 0.033 | 0.056 | 0.001 | 0.028 | 0.009 | 0.014 | 0.048 | 0.033 | 0.001 | 0.046 |
| DRV | 0.028 | 0.069 | 0.020 | 0.001 | 0.001 | 0.019 | 0.009 | 0.009 | 0.029 | 0.023 | 0.012 | 0.035 |
| EBLV-1 | 0.020 | 0.028 | 0.028 | 0.008 | 0.005 | 0.012 | 0.005 | 0.005 | 0.021 | 0.021 | 0.009 | 0.025 |
| EBLV-2 | 0.023 | 0.025 | 0.027 | 0.024 | 0.002 | 0.024 | 0.012 | 0.020 | 0.030 | 0.024 | 0.027 | 0.016 |
| ABLV-Hum | 0.016 | 0.022 | 0.021 | 0.003 | 0.002 | 0.010 | 0.003 | 0.006 | 0.014 | 0.017 | 0.005 | 0.027 |
| ABLV-Sacc | 0.019 | 0.020 | 0.027 | 0.004 | 0.001 | 0.016 | 0.004 | 0.014 | 0.017 | 0.022 | 0.041 | 0.014 |
| MOKV | 0.033 | 0.004 | 0.014 | 0.086 | 0.009 | 0.028 | 0.016 | 0.021 | 0.022 | 0.041 | 0.015 | 0.027 |
| Mean (GT1 only) | 0.026 | 0.058 | 0.019 | 0.014 | 0.002 | 0.020 | 0.008 | 0.009 | 0.031 | 0.023 | 0.012 | 0.033 |
| ±S.D. | 0.004 | 0.009 | 0.009 | 0.019 | 0.003 | 0.004 | 0.006 | 0.002 | 0.008 | 0.005 | 0.005 | 0.006 |

Column headings give the protein block (roman numeral, where applicable) and the nucleotide range.
ᵃ Sequence used for comparison in each case.
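The summary rows of Table 6 can be checked against the per-strain values. The following is a minimal sketch, assuming that the S.D. reported is the sample standard deviation over the eight GT1 strains (the text does not state which estimator was used) and working from the rounded values printed in the table, so small rounding differences from the published figures are possible. The dictionary name and region labels are illustrative only.

```python
from statistics import mean, stdev

# dN/dS values for the eight GT1 strains compared against RRV, copied from
# Table 6 (order: PV, SAD, FluryHEP, PM1503, Nishigahara, NNV-RAB-H, SHBRV, DRV).
GT1_DNDS = {
    "Total L": [0.027, 0.028, 0.022, 0.024, 0.025, 0.022, 0.033, 0.028],
    "Block II (1501-1800)": [0.001, 0.001, 0.008, 0.001, 0.001, 0.003, 0.001, 0.001],
    "Block V (3271-3990)": [0.030, 0.026, 0.032, 0.033, 0.025, 0.023, 0.048, 0.029],
}


def summarize(values):
    """Return (mean, sample standard deviation) of a list of dN/dS values."""
    return mean(values), stdev(values)


for region, values in GT1_DNDS.items():
    region_mean, region_sd = summarize(values)
    print(f"{region}: mean = {region_mean:.3f}, S.D. = {region_sd:.3f}")
```

Comparing each regional mean with the "Total L" mean gives a quick view of which segments fall below or near the gene-wide average.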
Table 7
Comparison of codon substitution models for the lyssavirus L gene

| Model code | Tree length | Likelihood | npᵃ | Tn/Tv | Estimates of parameters | Models compared | 2× difference in likelihood | d.f.ᵇ | p-value |
|---|---|---|---|---|---|---|---|---|---|
| M1 (neutral) | 11.141 | −38604.075376 | 44 | 4.252 | p0 = 0.964, p1 = 0.036; ω0 = 0.025 | M1 vs. M2 | 0.000008 | 2 | 0.999996 |
| M2 (selection) | 11.141 | −38604.075372 | 46 | 4.252 | p0 = 0.964, p1 = 0.036, p2 = 0.000; ω0 = 0.025, ω1 = 1.000 | | | | |
| M0 (one-ratio) | 10.895 | −38846.851319 | 43 | 3.917 | ω = 0.032 | M0 vs. M3 | 1374.979046 | 4 | 0.000000 |
| M3 (discrete) | 12.296 | −38159.361796 | 47 | 4.191 | p0 = 0.665, p1 = 0.268, p2 = 0.067; ω0 = 0.003, ω1 = 0.061, ω2 = 0.257 | | | | |
| M7 (beta) | 12.216 | −38165.801833 | 44 | 4.166 | p = 0.226, q = 5.405 | M7 vs. M8 | 16.954506 | 2 | 0.000208 |
| M8 (beta&ω) | 12.291 | −38157.324580 | 46 | 4.190 | p0 = 0.997 (p1 = 0.003), p = 0.237, q = 6.100, ω = 1.000 | | | | |

ᵃ np = number of parameters.
ᵇ d.f. = degrees of freedom (= difference in np values for respective models).
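The model comparisons in Table 7 follow directly from the tabulated log-likelihoods and parameter counts. The sketch below is not the PAML procedure itself; it only recomputes the test statistic (twice the difference in log-likelihood), the degrees of freedom (difference in np) and the upper-tail χ² probability. The helper names are hypothetical, and the closed-form tail probability used here is valid only for even degrees of freedom, which covers the three comparisons in the table.

```python
import math

# Log-likelihoods and parameter counts (np) copied from Table 7.
MODELS = {
    "M0": (-38846.851319, 43),
    "M1": (-38604.075376, 44),
    "M2": (-38604.075372, 46),
    "M3": (-38159.361796, 47),
    "M7": (-38165.801833, 44),
    "M8": (-38157.324580, 46),
}


def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; closed form valid for even df only."""
    if df <= 0 or df % 2:
        raise ValueError("this closed form needs a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    # Sum of (x/2)^k / k! for k = 0 .. df/2 - 1
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total


def lrt(null, alt):
    """Return (2 x delta lnL, d.f., p-value) for a nested model comparison."""
    lnl_null, np_null = MODELS[null]
    lnl_alt, np_alt = MODELS[alt]
    stat = 2.0 * (lnl_alt - lnl_null)
    df = np_alt - np_null
    return stat, df, chi2_sf_even_df(stat, df)


for null, alt in [("M1", "M2"), ("M0", "M3"), ("M7", "M8")]:
    stat, df, p = lrt(null, alt)
    print(f"{null} vs. {alt}: 2dlnL = {stat:.6f}, d.f. = {df}, p = {p:.6f}")
```

For odd degrees of freedom a general χ² routine from a statistics library would be needed in place of the closed form.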
(Yang, 1997; Holmes et al., 2002). By exploring the fit of various substitution models to the data, various categories of sites can be assigned different dN/dS (ω) ratios, thereby facilitating the identification of a small number of sites for which ω > 1 and which are thus under positive selection. A summary of such an analysis of the alignment of 22 L gene sequences is presented in Table 7. Based on the likelihood values, models M3 and M8, which both allow for positive selection, provided the best fit to the data. Model M3, which allows for three classes of sites with different ω ratios, was strongly favoured over the M0 model, which applies one ω ratio to all sites, since the likelihood ratio test (LRT) for this comparison yielded a value of 1374.979, much greater than critical values from a χ² distribution with d.f. = 4. However, M3 did not contain a category of sites with ω > 1. Moreover, the selection model M2 failed to detect a category of sites with ω > 1 and the data clearly did not support M2 over the strictly neutral model M1. Model M8 (beta&ω), which gave the best fit to the data, was significantly supported over the M7 (beta) model. M8 identified a single codon, encoding Leu62 of the RRV sequence, as being under weak positive selection based on an ω value of 1.476 with a probability of >0.95. However, the vast majority of sites appear to be strongly constrained and have very low ω values.

3.4. Phylogenetic analysis

A phylogenetic tree was generated based on the L gene sequences of all 34 lyssavirus isolates (Fig. 2). The raccoon isolate grouped most closely with the silver-haired bat RV (SHBRV-18) and these two American viruses formed an outlying clade to all other GT 1 RVs which, in this tree, are represented primarily by various laboratory and vaccine strains.
The two wildlife isolates (DRV and MRV) reported from China both grouped together with other laboratory isolates; DRV was closely related to the Japanese laboratory strains while MRV was very closely related to the vaccine strain PM1503. Similar observations were reported based on phylogenetic analysis of the G protein-coding region (Meng et al., 2007). Phylogenetic relationships of these same isolates were similar when trees were generated using P and G-L regions (data not shown).

4. Discussion

Although sequences of several complete lyssavirus genomes have become available over the past few years, in almost all cases these sequences represent those of laboratory isolates. While this information improves our understanding of structural features critical to fundamental aspects of viral replication, such information
may not help to elucidate features of the virus that facilitate adaptation to reservoir hosts. For this latter purpose, characterization of street isolates recovered directly from their normal host is required. In this study, we report the first complete nucleotide sequence of an RRV genome; this strain is only the second rabies isolate from North America for which a complete genomic sequence has been determined and the first from a non-chiropteran host. As shown here by phylogenetic analysis of the lyssavirus L gene, RRV probably emerged from a North American bat rabies strain. Another terrestrial strain designated as the south central skunk (SCSK) RV has been shown in other studies (Badrane and Tordo, 2001; Nadin-Davis et al., 2002) to group most closely with RRV, and collectively the RRV, SCSK and all American bat RVs form a phylogenetically distinct group of viruses referred to as the American indigenous lineage (Nadin-Davis et al., 2002). It remains to be determined whether emergence of the RRV strain occurred directly as a result of a bat virus adapting to the raccoon host or by adaptation of the SCSK virus to the raccoon. As expected, the overall organization of the genome is similar to that of other lyssaviruses with minor variations. The complete length of the ON-99-2 RRV genome was 11,923 nts. This is in contrast to the observation of Warrilow et al. (2002) that all lyssaviruses published up to 2002 had genomes comprising only even numbers of nucleotides. Indeed, over the past 5 years several other GT 1 isolates having genomes with odd numbers of nucleotides have been reported (for example, SHBRV-18 (11,923 nts), DRV (11,863 nts), MRV (11,869 nts)). Genomes with odd numbers of nucleotides may even be the more common in nature; in any case, genomes with odd and with even numbers of nucleotides both appear to occur frequently, so the biological significance of this phenomenon is questionable.
In accord with previous studies (Bourhy et al., 1993; Johnson et al., 2002; Kuzmin et al., 2005) we found that, at the genus level, the N gene is the most conserved protein-coding region, but within GT 1 lyssaviruses the L gene is the most conserved, followed in order by the N, M, G and P genes. The similarity scores for the nucleotide sequences and their predicted protein products suggest that most synonymous nucleotide substitutions occur in the L gene of the GT 1 lyssaviruses and in the N gene of other lyssaviruses. The data also revealed that, within GT 1 lyssaviruses, different regions of the genome with different levels of nucleotide conservation generate similar phylogenetic relationships between groups of isolates, in agreement with prior studies (Kissi et al., 1995; Nadin-Davis et al., 1997; Johnson et al., 2002). Previous sequence analyses of full-length lyssavirus genomes have identified conserved signals at the beginning and end of
Fig. 2. Phylogenetic tree of 34 lyssaviruses (see list in Table 1) generated with L gene coding sequences by NJ analysis with the Kimura 2-parameter model. Bootstrap values out of 1000 replicates are indicated as a percentage to the left of each branch of the tree. Wildlife isolates (WL) and lyssavirus genotypes (GT) are shown to the right of the tree. A scale at the bottom indicates the genetic distance represented by the horizontal branches.
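The Kimura two-parameter distance named in the caption of Fig. 2 depends only on the proportions of transitional (P) and transversional (Q) differences between two aligned sequences, the same quantities that underlie the Tn/Tv ratios discussed in the text: d = −½ ln[(1 − 2P − Q)·√(1 − 2Q)] (Kimura, 1980). The following is a minimal sketch of that calculation, not the software used in this study; the function names and the two short sequences are hypothetical and serve only to illustrate the arithmetic.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def count_changes(seq1, seq2):
    """Count transitions, transversions and compared sites in two aligned sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = sites = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguity codes
        sites += 1
        if a == b:
            continue
        # A change within purines or within pyrimidines is a transition.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    return transitions, transversions, sites


def k2p_distance(seq1, seq2):
    """Kimura (1980) two-parameter distance between two aligned sequences."""
    tn, tv, sites = count_changes(seq1, seq2)
    if sites == 0:
        raise ValueError("no comparable sites")
    p = tn / sites  # proportion of transitional differences
    q = tv / sites  # proportion of transversional differences
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# Hypothetical aligned fragments, not taken from any lyssavirus genome.
seq_a = "ATGCATGCAT"
seq_b = "ATGTATGAAT"
tn, tv, sites = count_changes(seq_a, seq_b)
print(f"transitions = {tn}, transversions = {tv}, sites = {sites}")
print(f"K2P distance = {k2p_distance(seq_a, seq_b):.4f}")
```

In this toy pair there is one transition (C to T) and one transversion (C to A) in ten sites; a Tn/Tv ratio for a real alignment would be obtained from the same two counts.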
each transcription unit. These include the motif WGAAAAAAA that directs transcription termination and polyadenylation (TTP) and a nine-nucleotide transcription initiation motif, collectively summarized as AACAYYHCT (Marston et al., 2007). In the raccoon rabies virus, the TTP motif is conserved at all gene termini with the exception of a CGAAAAAAA TTP signal at the N-P junction. The initiation signals at the start of each transcription unit of the raccoon strain agree precisely with the nine-nucleotide consensus sequence described by Marston et al. (2007), while all intergenic sequences (IGS) also follow the consensus sequences and lengths described in that report. Similar to observations made on other wildlife isolates (Ravkov et al., 1995; Sabeta et al., 2003; Sacramento et al., 1992; Marston et al., 2007; Warrilow et al., 2002), the G-L intergenic region of the raccoon strain lacks the first of the two polyadenylation sites observed in certain laboratory strains including PV and SAD (Tordo et al., 1986; Conzelmann et al., 1990), confirming that this region is not a pseudogene but a long non-coding region of the G gene.

As identified previously by Nadin-Davis et al. (1997), the RRV P protein is extended at its C terminus by 4 AAs compared with other GT 1 RVs. This is the result of a mutation at the stop codon (TAA to CAA), thus allowing extended translation until the next stop signal (TAG) located 12 nucleotides downstream. Also notable was an additional AA, methionine (Met), at the start of the L protein, which has to date been observed in only one other lyssavirus isolate, the North American silver-haired bat strain SHBRV-18. This additional AA is due to an A to T mutation that changes AAG to ATG immediately prior to the normal start codon. It would be interesting to explore whether this change is found in other viruses of the American indigenous clade and whether indeed this is a consistent feature of viruses of this lineage.

Of the few lyssavirus L genes described to date, including the RRV described in this report, all exhibit the modular organization originally described by Tordo et al. (1988); conserved stretches of AAs suggest that functional domains are not randomly distributed but are clustered in blocks (Poch et al., 1990). The selection pressures being exerted on this gene were further examined during this study by exploring patterns of base substitution (Tn/Tv ratio) over the length of the L gene and for distinct sections of the gene corresponding to the conserved blocks and the less conserved intervening sequences. It was indeed apparent that the most conserved segments of the gene exhibited high Tn/Tv ratios which corresponded to low dN/dS ratios; by this approach, blocks II, III and IV of the L protein were most constrained, blocks I and VI rather less so, while block V was the least constrained. A maximum likelihood analysis of L gene codon substitution models indicated, as previously reported for the rabies virus G and N genes (Holmes et al., 2002), that the codon substitution models M3 and M8 that allow for positive selection were favoured over other models such as M0 (one dN/dS ratio) and M7 (beta distribution); however, M2 was not supported over M1 (neutral).
It is apparent that most sites are under purifying selection since the M8 model identified just one site, corresponding to Leu62 of RRV, which appeared to exhibit positive selection. At this site Leu is also retained in the closely related SHBRV strain, while most other GT 1 rabies viruses encoded Tyr at this position except for the Nishigahara and RC-HL strains, for which Cys was encoded, and the NICE strain, for which Arg was encoded. Lyssaviruses of other genotypes encoded different AAs at the corresponding position: MOKV (Ser), EBLV-1 (Ser), EBLV-2 (Arg), ABLV-Hum (Gly) and ABLV-Sacc (Met). This residue lies a short distance downstream of the well conserved motif LNSPL (Marston et al., 2007) at AAs 39–43 of RRV, but further investigation would be needed to explore any functional significance of this high level of variation at residue 62. Despite the high level of conservation of the RRV L protein compared with that of other lyssaviruses, it did exhibit 39 AA replacements that were confined to this strain. Interestingly, 29 of these AA changes occurred outside the conserved blocks and clustered at the N- and C-terminal ends of the L polymerase, consistent with the notion that major polymerase activities are performed by the central portion of the protein (Tordo et al., 1988). Ten RRV-specific AA changes were found within conserved domains, seven of which are located in block V, thought to have a metal-binding dependent catalytic role, as indicated by its Cys and His rich region (Poch et al., 1990); indeed in VSV high AA conservation in this region is essential for polymerase activity (Massey and Lenard, 1987). Two RRV-specific AA changes occurred in block I, which has multiple putative polymerase functions (Chandrika et al., 1995), and one was located in block IV, proposed to have polyadenylation or protein kinase activity (Poch et al., 1990).
While there is presently insufficient data to assess host-specific variation within the lyssavirus L protein, the probable interaction of this product with multiple host factors, based on interactions of the VSV product with elongation factor 1 components (Das et al., 1998) and cellular mRNA capping enzymes (Gupta et al., 2002), suggests that this protein is a likely candidate for conferring some level of host specificity. Molecular characterization of this locus for many additional street rabies viruses may shed light on this aspect of lyssavirus biology.
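The stop-codon change described above for the RRV P gene (TAA to CAA, with translation continuing to a TAG 12 nucleotides downstream) can be illustrated with a short read-through sketch. The sequence below is entirely hypothetical and was constructed only so that the arithmetic matches the description; it is not the RRV P gene sequence, and the function name is likewise illustrative.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons_to_stop(cds):
    """Return the sense codons of `cds` up to, but excluding, the first in-frame stop."""
    codons = []
    for i in range(0, len(cds) - 2, 3):
        codon = cds[i:i + 3].upper()
        if codon in STOP_CODONS:
            return codons
        codons.append(codon)
    raise ValueError("no in-frame stop codon found")


# Hypothetical 3' end of an open reading frame (NOT the real RRV sequence):
# two sense codons, a TAA stop, three further codons, then an in-frame TAG.
ancestral = "GGATCT" + "TAA" + "GCTAACCCA" + "TAG"

# Single T-to-C substitution at the first position of the stop codon (TAA -> CAA).
mutant = ancestral[:6] + "C" + ancestral[7:]

extension = len(codons_to_stop(mutant)) - len(codons_to_stop(ancestral))
offset_to_next_stop = mutant.index("TAG") - 6  # measured from the mutated codon

print(f"codons added by read-through: {extension}")
print(f"next stop begins {offset_to_next_stop} nt downstream of the mutated codon")
```

The mutated codon itself and the three codons that follow it account for the four additional residues, which is why a stop signal 12 nucleotides downstream corresponds to a 4-AA extension.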
Acknowledgements

We thank Dr. A.I. Wandeler and the diagnostic staff of the Rabies Centre of Expertise, CFIA, for diagnosis and strain typing of the isolate used in these studies. We would also like to thank the technical staff of the DNA and Forensic Science Centre for generating the sequencing data, C. Fehlner-Gardiner and M. Lin for their critical reading of the manuscript and two anonymous reviewers for their helpful suggestions. We are most grateful to Dr. S. Aris-Brosou of the University of Ottawa for guidance in the use of the PAML software for the positive selection analysis. Also, we are grateful to the Ontario Ministry of Natural Resources for their support. This study was funded by a Natural Sciences and Engineering Research Council of Canada grant to Dr. B.N. White.
References

Arai, Y.T., Kuzmin, I.V., Kameoka, Y., Botvinkin, A.D., 2003. New lyssavirus genotype from the lesser mouse-eared bat (Myotis blythi), Kyrghyzstan. Emerg. Infect. Dis. 9, 333–337.
Badrane, H., Tordo, N., 2001. Host switching in Lyssavirus history from the Chiroptera to the Carnivora orders. J. Virol. 75, 8096–8104.
Badrane, H., Bahloul, C., Perrin, P., Tordo, N., 2001. Evidence of two Lyssavirus phylogroups with distinct pathogenicity and immunogenicity. J. Virol. 75, 3268–3276.
Biek, R., Henderson, C.J., Waller, L.A., Rupprecht, C.E., Real, L.A., 2007. A high-resolution genetic signature of demographic and spatial expansion in epizootic rabies virus. Proc. Natl. Acad. Sci. USA 104, 7993–7998.
Botvinkin, A.D., Poleschuk, E.M., Kuzmin, I.V., Borisova, T.I., Gazaryan, S.V., Yager, P., Rupprecht, C.E., 2003. Novel lyssaviruses isolated from bats in Russia. Emerg. Infect. Dis. 9, 1623–1625.
Bourhy, H., Kissi, B., Tordo, N., 1993. Molecular diversity of Lyssavirus genus. Virology 194, 70–81.
Burridge, M.J., Sawyer, L.A., Bigler, W.J., 1986. Rabies in Florida. Department of Health and Rehabilitative Services, Tallahassee, FL, p. 147.
Chandrika, R., Horikami, S.M., Smallwood, S., Moyer, S.A., 1995. Mutations in conserved domain I of the Sendai virus L polymerase protein uncouple transcription and replication. Virology 213, 352–363.
Conzelmann, K.K., Cox, J.H., Schneider, L.G., Thiel, H.J., 1990. Molecular cloning and complete nucleotide sequence of the attenuated rabies virus SAD B19. Virology 175, 485–489.
Das, T., Mathur, M., Gupta, A.K., Janssen, G.M.C., Banerjee, A.K., 1998. RNA polymerase of vesicular stomatitis virus specifically associates with translation elongation factor-1 αβγ for its activity. Proc. Natl. Acad. Sci. USA 95, 1449–1454.
Dopazo, J., 1994. Estimating errors and confidence intervals for branch lengths in phylogenetic trees by a bootstrap approach. J. Mol. Evol. 38, 301–302.
Faber, M., Pulmanausahakul, R., Nagao, K., Prosniak, M., Rice, A.B., Koprowski, H., Schnell, M.J., Dietzschold, B., 2004. Identification of viral genomic elements responsible for rabies virus neuroinvasiveness. Proc. Natl. Acad. Sci. USA 101, 16328–16332.
Felsenstein, J., 1993. PHYLIP: Phylogeny Inference Package. [Version 3.52c]. University of Washington, Seattle, Washington.
Felsenstein, J., 1985. Confidence limits on phylogenies: an approach using the bootstrap. Evolution 39, 783–791.
Gould, A.R., Hyatt, A.D., Lunt, R., Kattenbelt, J.A., Hengstberger, S., Blacksell, S.D., 1998. Characterization of a novel lyssavirus isolated from Pteropid bats in Australia. Virus Res. 54, 165–187.
Gould, A.R., Kattenbelt, J.A., Gumley, S.G., Lunt, R.A., 2002. Characterization of an Australian bat lyssavirus variant isolated from an insectivorous bat. Virus Res. 89, 1–28.
Gupta, A.K., Mathur, M., Banerjee, A.K., 2002. Unique capping activity of the recombinant RNA polymerase (L) of vesicular stomatitis virus: association of cellular capping enzyme with the L protein. Biochem. Biophys. Res. Commun. 293, 264–268.
Hall, T.A., 1999. BioEdit: a user-friendly biological sequence alignment editor and analysis program for Windows 95/98/NT. Nucleic Acids Symp. Ser. 41, 95–98.
Harty, R.N., Brown, M.E., McGettigan, J.P., Wang, G., Jayakar, H.R., Huibregtse, J.M., Whitt, M., Schnell, M., 2001. Rhabdoviruses and the cellular ubiquitin-proteasome system: a budding interaction. J. Virol. 75, 10623–10629.
Hillis, D.M., Bull, J.J., 1993. An empirical test of bootstrapping as a method for assessing confidence in phylogenetic analysis. Syst. Biol. 42, 182–192.
Holmes, E.C., Woelk, C.H., Kassis, R., Bourhy, H., 2002. Genetic constraints and the adaptive evolution of rabies virus in nature. Virology 292, 247–257.
Hopp, T.P., Wood, K.R., 1981. Prediction of protein antigenic determinants from amino acid sequences. Proc. Natl. Acad. Sci. USA 78, 3824–3828.
Ito, N., Kakemizu, M., Ito, K.A., Yamamoto, A., Yoshida, Y., Sugiyama, M., Minamoto, N., 2001. A comparison of complete genome sequences of the attenuated RC-HL strain of rabies virus used for production of animal vaccine in Japan, and the parental Nishigahara strain. Microbiol. Immunol. 45, 51–58.
Jenkins, S.R., Winkler, W.G., 1987. Descriptive epidemiology from an epizootic of raccoon rabies in the middle Atlantic states, 1982–1983. Am. J. Epidemiol. 126, 429–437.
Johnson, N., McElhinney, L.M., Smith, J., Lowings, P., Fooks, A.R., 2002. Phylogenetic comparison of the genus Lyssavirus using distal coding sequences of the glycoprotein and nucleoprotein genes. Arch. Virol. 147, 2111–2123.
Kimura, M., 1980. A simple method for estimating evolutionary rates of base substitutions through comparative studies of nucleotide sequences. J. Mol. Evol. 16, 111–120.
Kissi, B., Tordo, N., Bourhy, H., 1995. Genetic polymorphism in the rabies virus nucleoprotein gene. Virology 209, 526–537.
Kumar, S., Tamura, K., Nei, M., 2004. MEGA3: Integrated software for molecular evolutionary genetics analysis and sequence alignment. Brief. Bioinform. 5, 150–163.
Kuzmin, I.V., Orciari, L.A., Arai, Y.T., Smith, J.S., Hanlon, C.A., Kameoka, Y., Rupprecht, C.E., 2003. Bat lyssaviruses (Aravan and Khujand) from Central Asia: phylogenetic relationships according to N, P and G gene sequences. Virus Res. 97, 65–79.
Kuzmin, I.V., Hughes, G.J., Botvinkin, A.D., Orciari, L.A., Rupprecht, C.E., 2005. Phylogenetic relationships of Irkut and West Caucasian bat viruses within the Lyssavirus genus and suggested quantitative criteria based on the N gene sequence for lyssavirus genotype definition. Virus Res. 111, 28–43.
Le Mercier, P., Jacob, Y., Tordo, N., 1997. The complete Mokola virus genome sequence: structure of the RNA-dependent RNA polymerase. J. Gen. Virol. 78, 1571–1576.
MacInnes, C.D., 2000. Raccoon rabies in New Brunswick. The Rabies Reporter 11, 10.
Marston, D.A., McElhinney, L.M., Johnson, N., Muller, T., Conzelmann, K.K., Tordo, N., Fooks, A.R., 2007. Comparative analysis of the full genome sequence of European bat lyssavirus type 1 and type 2 with other lyssaviruses and evidence for a conserved transcription termination and polyadenylation motif in the G-L 3′ non-translated region. J. Gen. Virol. 88, 1302–1314.
Massey, D.M., Lenard, J., 1987. Inactivation of the RNA polymerase of vesicular stomatitis virus by N-ethylmaleimide and protection by nucleotide triphosphate. J. Biol. Chem. 262, 8734–8737.
McLean, R.G., 1971. Rabies in raccoons in the south-eastern United States. J. Infect. Dis. 123, 680–681.
Meng, S.-L., Yan, J.-X., Xu, G.-L., Nadin-Davis, S.A., Ming, P.-G., Liu, S.-Y., Wu, J., Ming, H.-T., Zhu, F.-C., Zhou, D.-J., Xiao, Q.-Y., Dong, G.-M., Yang, X.-M., 2007. A molecular epidemiological study targeting the glycoprotein gene of rabies virus isolates from China. Virus Res. 124, 125–138.
Morimoto, K., Okhubo, A., Kawai, A., 1989. Structure and transcription of the glycoprotein genes of attenuated HEP-flury strain of rabies virus. Virology 173, 465–477.
Muller, R., Poch, O., Delarue, M., Bishop, D.H., Bouloy, M., 1994. Rift Valley Fever Virus L segment: correction of the sequence and possible functional role of newly identified regions conserved in RNA-dependent polymerases. J. Gen. Virol. 75, 1345–1352.
Nadin-Davis, S.A., 1998. Polymerase chain reaction protocols for rabies virus discrimination. J. Virol. Methods 75, 1–8.
Nadin-Davis, S.A., Huang, W., Wandeler, A.I., 1996. The design of strain-specific polymerase chain reaction of the raccoon rabies virus strain from indigenous rabies viruses of Ontario. J. Virol. Methods 57, 141–156.
Nadin-Davis, S.A., Huang, W., Wandeler, A.I., 1997. Polymorphism of rabies viruses within the phosphoprotein and matrix protein genes. Arch. Virol. 142, 979–992.
Nadin-Davis, S.A., Abdel-Malik, M., Armstrong, J., Wandeler, A.I., 2002. Lyssavirus P gene characterization provides insights into the phylogeny of the genus and identifies structural similarities and diversity within the encoded phosphoprotein. Virology 298, 286–305.
Nadin-Davis, S.A., Muldoon, F., Wandeler, A.I., 2006. A molecular epidemiological analysis of the incursion of the raccoon strain of rabies virus into Canada. Epidemiol. Infect. 134, 534–547.
Poch, O., Blumberg, B.M., Bougueleret, L., Tordo, N., 1990. Sequence comparison of five polymerases (L proteins) of unsegmented negative-strand RNA viruses: theoretical assignment of functional domains. J. Gen. Virol. 71, 1153–1162.
Ravkov, E.V., Smith, J.S., Nichol, S.T., 1995. Rabies virus glycoprotein gene contains a long 3′ noncoding region which lacks pseudogene properties. Virology 206, 718–723.
Ressources naturelles et faune Quebec, 2006. http://www.mrnf.gouv.qc.ca/faune/sante-animaux-sauvages/raton-laveur.jsp (accessed October 11, 2007).
Rosatte, R., Sobey, K., Donavan, D., Bruce, L., Allan, M., Silver, A., Bennett, K., Gibson, M., Simpson, H., Davies, C., Wandeler, A., Muldoon, F., 2006. Behaviour, movements and demographics of rabid raccoons in Ontario, Canada: management implications. J. Wildl. Dis. 42, 589–605.
Sabeta, C.T., Bingham, J., Nel, L.H., 2003. Molecular epidemiology of canid rabies in Zimbabwe and South Africa. Virus Res. 91, 203–211.
Sacramento, D., Badrane, H., Bourhy, H., Tordo, N., 1992. Molecular epidemiology of rabies virus in France: comparison with vaccine strains. J. Gen. Virol. 73, 1149–1158.
Schnell, M.J., Conzelmann, K.K., 1995. Polymerase activity of in vitro mutated rabies L protein. Virology 214, 522–530.
Thompson, J.D., Higgins, D.G., Gibson, T.J., 1994. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucl. Acid Res. 22, 4673–4680.
Tordo, N., Poch, O., Ermine, A., Keith, G., Rougeon, F., 1986. Walking along the genome: is the large G-L intergenic region a remnant gene? Proc. Natl. Acad. Sci. USA 83, 3914–3918.
Tordo, N., Poch, O., Ermine, A., Keith, G., Rougeon, F., 1988. Completion of the rabies virus genome sequence determination: highly conserved domains among the L (polymerase) proteins of unsegmented negative-strand RNA viruses. Virology 165, 565–576.
Wandeler, A.I., Salsberg, E., 1999. Raccoon rabies in eastern Ontario. Can. Vet. J. 40, 731.
Wandeler, A.I., Nadin-Davis, S.A., Tinline, R.R., Rupprecht, C.E., 1994. Rabies Epidemiology: some ecological and evolutionary perspective. In: Rupprecht, C.E., Dietzschold, B., Koprowski, H. (Eds.), Current Topics in Microbiology and Immunology, 187. Springer-Verlag, Berlin Heidelberg, Germany, pp. 297–324.
Warrilow, D., Smith, I.L., Harrower, B., Smith, G.A., 2002. Sequence analysis of an isolate from a fatal human infection of Australian bat Lyssavirus. Virology 297, 109–119.
Wunner, W.H., 2007. Rabies virus. In: Jackson, A.C., Wunner, W.H. (Eds.), Rabies, second ed. Academic Press, San Diego, pp. 23–68.
Yang, Z., 1997. PAML: a program package for phylogenetic analysis by maximum likelihood. Comp. Appl. BioSci. 13, 555–556.
Yang, Z., Nielsen, R., Goldman, N., Krabbe Pedersen, A., 2000. Codon substitution models for heterogeneous selection pressure at amino acid sites. Genetics 155, 431–449.