ELSEVIER
Virus Research 39 (1995) 181-193
The inverted repeat regions of the simian varicella virus and varicella-zoster virus genomes have a similar genetic organization 1
Wayne L. Gray *, Nanette J. Gusick, Christine Ek-Kommonen, Steven E. Kempson, Thomas M. Fletcher III
Department of Microbiology and Immunology, 4301 West Markham Street, University of Arkansas for Medical Sciences, Little Rock, AR 72205, USA
Received 27 June 1995; revised 17 August 1995; accepted 17 August 1995
Abstract
Simian varicella virus (SVV) causes a varicella-like disease in nonhuman primates. The DNA sequence and genetic organization of the inverted repeat region (RS) of the SVV genome were determined. The SVV RS is 7559 bp in size with 56% guanine + cytosine (G + C) content and includes 3 open reading frames (ORFs). The SVV RS1 ORF encodes a 1279 amino acid (aa) protein with 58% and 39% identity to the varicella-zoster virus (VZV) gene 62 and herpes simplex virus type 1 (HSV-1) ICP4 homologs, respectively. The predicted 261 aa SVV RS2 polypeptide possesses 52% identity with the VZV gene 63 homolog and 23% identity with the HSV-1 ICP22. The SVV RS3 encodes a 187 aa polypeptide with 56% and 28% identity to the VZV gene 64 and the HSV-1 US10 homologs, respectively, and includes an atypical zinc finger motif. A G + C-rich 16 base-pair (bp) sequence which is repeated 7 times and a putative SVV origin of replication were identified between the RS1 and RS2 ORFs. Comparison with the VZV RS indicates that the SVV and VZV RS regions are similar in size and genetic organization.
Keywords: Simian varicella virus; Varicella-zoster virus; DNA sequence; Inverted repeat region
* Corresponding author. Tel.: +1 (501) 686-5187; Fax: +1 (501) 686-5359; E-mail: [email protected].
1 GenBank accession number U33499.
0168-1702/95/$09.50 © 1995 Elsevier Science B.V. All rights reserved. SSDI 0168-1702(95)00091-7
1. Introduction
Varicella-zoster virus (VZV) is the cause of two distinct clinical conditions. Varicella (chickenpox) is a highly contagious disease of childhood. Herpes zoster (shingles) is generally a disease of the elderly caused by reactivation of latent virus. While these are usually benign diseases, they may be severe and life-threatening, especially in immunocompromised individuals. Studies on VZV pathogenesis and latency and the development of antiviral strategies against VZV are hampered by a lack of suitable animal models (Myers and Connelly, 1992).
Simian varicella virus (SVV) causes a natural, varicella-like disease in nonhuman primates (Oakes and d'Offay, 1988). The pathogenesis of SVV infection in monkeys closely parallels human varicella. SVV and VZV are antigenically related and share many cross-reacting polypeptides (Felsenfeld and Schmidt, 1975; Fletcher and Gray, 1992). In addition, the viruses share 70-75% DNA homology and a similar genetic organization (Gray and Oakes, 1984; Pumphrey and Gray, 1992). Based upon the clinical similarities of human and simian varicella and the relatedness of VZV and SVV, simian varicella is used as an experimental animal model for VZV infections (Soike, 1992).
The SVV and VZV genomes are similar in size and structure (Gray et al., 1992; Clarke et al., 1992). SVV DNA consists of a long (L, 100 kilobase pairs [kbp]) component covalently linked to a short (S, 20 kbp) component. The S component includes a unique short sequence (US) bracketed by inverted repeat sequences (RS, Fig. 1). DNA sequence analysis of the 4.9 kbp SVV US sequence identified 4 open reading frames designated US1, US2, US3, and US4, which are homologous to their respective VZV counterparts: gene 65 (a putative tegument component), gene 66 (protein kinase), gene 67 (glycoprotein I), and gene 68 (glycoprotein E, Fletcher and Gray, 1993).
Fig. 1. Restriction endonuclease maps of the SVV genome and cloned SVV DNAs used to determine the RS DNA sequence. (A) Schematic diagram of the SVV genome with SVV DNA EcoRI and BamHI restriction endonuclease maps. (B) The SVV RS region. The SVV BamHI D and L, SmaI subclones of the BamHI D, and lambda clones were used as templates for DNA sequence analysis of the RS.
The 7.3 kbp VZV RS includes 3 ORFs, genes 62, 63, and 64. Gene 62 is the counterpart of the herpes simplex virus type 1 (HSV-1) immediate early gene ICP4 (Davison and Scott, 1986). Like ICP4, the VZV gene 62 product is a transactivator of viral gene expression (Perera et al., 1993). The gene 63 polypeptide, a homolog of HSV-1 ICP22, also appears to play a regulatory role in viral gene expression. The function of the VZV gene 64 product is unknown. However, the HSV-1 US10 homolog of gene 64 is speculated to be a virion structural protein (Beers et al., 1994). In addition to the 3 ORFs, the VZV RS also contains the viral origin of replication (oriS) and a guanine + cytosine (G + C)-rich reiterated sequence of unknown function.
A homolog of the VZV gene 62 maps within the SVV RS component (Clarke et al., 1993). In addition, the DNA sequence of the SVV DNA termini and RS-L junction was recently determined (Clarke et al., 1995). In this study, we report the DNA sequence and genetic organization of the entire SVV RS. The genetic contents of the SVV and VZV RS regions are compared.
2. Materials and methods
2.1. Determination of the SVV RS DNA sequence
The SVV RS maps within the SVV BamHI D and L fragments (Fig. 1, Gray et al., 1992). SmaI subclones of the BamHI D fragment were generated in the recombinant plasmid vector pGEM7Z (Promega Corp.) according to standard techniques (Sambrook et al., 1989). Nested sets of unidirectional deletions of the BamHI D and L clones and the SmaI subclones were prepared for both strands using the exonuclease III digestion procedure originally described by Henikoff (1984) as modified for use in the Erase-a-Base system (Promega Corp., Madison, WI). Dideoxy-chain termination sequencing (Sanger et al., 1977) was performed with oligonucleotide primers using Sequenase Version 2 (USB Corp., Cleveland, OH) and [α-35S]dATP (NEN). SVV DNA from the RS region was also cloned into recombinant lambda clones (Gray et al., 1992). These SVV lambda clones (lambda 4, 51, and 21) were used as templates for cycle sequencing employing the Promega Corp. 'fmol' cycle-sequencing system. The sequence of both strands of the SVV RS DNA was determined. The junctions between the BamHI D and L clones and between the SmaI subclones were confirmed by cycle sequencing using the overlapping BamHI D or EcoRI C clones as templates. Sequence reaction products were separated on 6% polyacrylamide, 7 M urea, 1 × Tris-Borate EDTA (TBE) gels. The gels were dried, and individual DNA bands were visualized by exposure of the dried gels to Kodak X-Omat film.
2.2. DNA sequence analysis
The sequence data were assembled and analyzed using the Sequence Analysis Software Package of the Genetics Computer Group (GCG), University of Wisconsin (Devereux et al., 1984). Overlapping nucleotide sequences of both DNA strands were assembled into the final RS sequence using the Gelassemble program based on the method of Staden (1980). Protein coding regions were determined by using the Frames and Testcode programs (Fickett, 1982). Amino acid sequence homologies were determined using the Bestfit program, which employs the local homology algorithm of Smith and Waterman (1981). The Pileup program, which uses the algorithm of Feng and Doolittle (1987), was used to construct multiple-sequence alignments between homologous herpesvirus genes. The Motifs program (GCG) was used to search protein sequences for patterns defined in the PROSITE Dictionary of Protein Sites and Patterns (Bairoch, 1991).
The DNA sequence in this report has been deposited in the GenBank Data Library under the accession number U33499. The sequence includes 97 bp of the SVV DNA L component and the 7559 bp RS. In order to precisely identify specific sequences throughout the text, nucleotide (nt) numbers are used with nt number 98 at the SVV L-RS junction and nt number 7656 at the RS-US junction.
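The local homology algorithm of Smith and Waterman (1981) that underlies Bestfit can be illustrated with a minimal sketch. The function names, the scoring values (match +1, mismatch -1, linear gap -2) and the short test sequences below are assumptions chosen for illustration only; they are not the parameters or the implementation used by the GCG Bestfit program.

```python
def smith_waterman(a, b, match=1, mismatch=-1, gap=-2):
    """Return (best_score, aligned_a, aligned_b) for one best local alignment.

    Illustrative scoring only (assumed values, linear gap penalty).
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the dynamic-programming matrix; scores never drop below zero.
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                0,
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the highest-scoring cell until a zero is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and score[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(aligned_a, aligned_b):
    """Percentage of aligned columns that hold identical residues."""
    same = sum(1 for p, q in zip(aligned_a, aligned_b) if p == q and p != "-")
    return 100.0 * same / len(aligned_a)


# Hypothetical test sequences sharing the 8-base motif TCGCACTT.
best_score, top, bottom = smith_waterman("AAATCGCACTTGGG", "CCTCGCACTTAA")
print(best_score, top, bottom, percent_identity(top, bottom))
```

Percent identity values such as those reported for the RS ORFs depend on the scoring scheme and on how gapped columns are counted, so this sketch should not be expected to reproduce the published figures.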
3. Results
3.1. Size and genetic content of the SVV RS sequence
The DNA sequence of the entire SVV internal RS (IRS), which maps within the BamHI D and L restriction endonuclease fragments, was determined (Fig. 1, Fletcher and Gray, 1993). The junction between the RS and US components of the SVV genome was previously determined (Fletcher and Gray, 1993). The junction of the SVV RS and L components was derived by DNA sequence analysis of the EcoRI L, which includes the genomic terminus and terminal RS (TRS) sequences (Fig. 1). The SVV RS was determined to be 7559 base pairs (bp) in size. Three ORFs were identified and were designated SVV RS1, RS2, and RS3 (Fig. 2). Between the RS1 and RS2 ORFs a putative SVV oriS was identified, based upon homology to other herpesvirus oriS elements. A G + C-rich reiterated region was located between the oriS and the RS2 ORF. The overall G + C content of the RS is 55.5%. However, the G + C% varies considerably, from 81.2% in the reiterated region and 71.0% in the RS1 ORF to 31.5% within the region of the oriS (Fig. 2).
Fig. 2. Distribution of the G + C content and genetic organization of the SVV RS. (A) The G + C content was calculated over a range of 100 bp at 3-bp shift increments using the GCG Window program and plotted using the GCG Statplot program. (B) The locations of the RS1, RS2, and RS3 ORFs within the SVV RS are indicated. The map positions of the reiterated region (R) and putative oriS (O) are shown.
3.2. Analysis of SVV RS ORFs
SVV ORF RS1
The SVV RS1 ORF (nt 4436 to 597) is 3840 bp in size and codes for a predicted 1279 amino acid polypeptide with a putative size of 136.8 kDa (kilodaltons, data not shown). A consensus poly A motif (AATAAA) is located 141 bp downstream of the termination codon. The RS1 polypeptide has 57.5% identity to the VZV gene 62 (Davison and Scott, 1986), 42.8% identity to the equine herpesvirus type 1 gene 64 (EHV-1, Telford et al., 1992) and 39.3% identity to the HSV-1 ICP4 (McGeoch et al., 1986) immediate early genes. The SVV RS1 ORF DNA sequence derived in this study is similar, but not identical, to the SVV RS1 DNA sequence reported by Clarke et al. (1993). The first 1178 and last 23 amino acids of both versions are identical. However, our results indicate a 1-bp addition at nt 901 and a 5-base addition at nt 666-670 within the 3' region of the SVV RS1 gene, resulting in a predicted polypeptide which is 2 amino acids larger than originally described. The changes in this region of SVV RS1 result in a 2.2-fold increase in amino acid identity with the homologous region in the VZV gene 62 polypeptide.
SVV ORF RS2
The coding region for the SVV RS2 ORF (nt 5822 to 6607) includes 786 bp. The sequence environment around the ATG start codon (TCAATGC) is a poor match to Kozak's rules for translational initiation of eukaryotic mRNAs (Kozak, 1986), as has been demonstrated for other SVV ORFs (Fletcher and Gray, 1993). A potential TATA box (ATAAATTA) is located 111 bp upstream of the initiation codon. A consensus poly A signal (AATAAA) is centered 51 bp downstream of the TAA stop codon. The predicted 261 amino acid RS2 polypeptide is 29.3 kDa in size and shares 51.6% amino acid identity with the VZV gene 63 homolog. Other homologs include the HSV-1 ICP22 (US1, 23.0% identity), the EHV-1 gene 65 (33.9% identity), and the pseudorabies virus (PRV) RSp40 (29.6% identity). Fig. 3 shows the amino acid alignment of the SVV RS2 homologs.
SVV ORF RS3
The 564-bp RS3 ORF (nt 6829 to 7392) is the smallest of the SVV RS genes. The sequence around the initiation codon, -3(G) and +4(G), is in agreement with Kozak's rules (Kozak, 1986). A potential TATA box (TTATATA) is centered 119 bp upstream of the start codon. A consensus poly A signal (AATAAA) is located 35 bp downstream of the TAA termination codon. The RS3 ORF codes for a predicted 187 aa polypeptide with an estimated size of 21.1 kDa. The RS3 polypeptide is homologous to the gene products of VZV gene 64 (55.7% identity), the HSV-1 US10 gene (27.8% identity), and the EHV-1 gene 66 (31.1% identity). An amino acid alignment of the SVV RS3 homologs is shown in Fig. 4A. A potential zinc finger motif, M-X3-C-X3-H-X13-C, located between amino acids 120-142, was identified and is similar to the consensus zinc finger motif present in the predicted polypeptides of VZV gene 64, HSV-1 US10, and EHV-1 ORF 66 (Fig. 4B). The amino acids located between positions 2 and 10 of the motif are highly conserved among the herpesvirus RS3 homologs (Fig. 4A).
Fig. 3. Predicted amino acid sequence of the SVV RS2 and alignment with the amino acid sequences of the VZV gene 63 (Davison and Scott, 1986), EHV-1 gene 65 (Telford et al., 1992), PRV RSp40 (Zhang and Leader, 1990), and HSV-1 ICP22 (US1, McGeoch et al., 1985) polypeptides. Vertical lines indicate identical amino acid residues. Amino acids which are identical in all 5 polypeptides are indicated (#). Amino acids which are conserved in 4 of the 5 polypeptides are indicated (*). The numbers to the right of the sequences indicate the positions in the amino acid sequences, with the initiation codon being number 1 in all cases.
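The coordinates and sizes reported for the three ORFs can be cross-checked with simple arithmetic. The sketch below assumes that the nucleotide coordinates are inclusive and that each reported ORF length includes the termination codon; the text does not state this explicitly, although the published numbers are consistent with it. The function and variable names are hypothetical.

```python
# ORF coordinates (nt) as given in the text. RS1 is listed from the higher
# to the lower coordinate (nt 4436 to 597).
ORFS = {
    "RS1": (4436, 597),
    "RS2": (5822, 6607),
    "RS3": (6829, 7392),
}

RS_FIRST_NT = 98      # nt number at the L-RS junction (from the text)
RS_LENGTH_BP = 7559   # reported size of the RS


def span_bp(first, last):
    """Length in bp of an inclusive coordinate range, in either orientation."""
    return abs(last - first) + 1


def predicted_aa(length_bp):
    """Amino acids encoded, assuming the length includes one stop codon."""
    if length_bp % 3 != 0:
        raise ValueError("ORF length is not a multiple of 3")
    return length_bp // 3 - 1


for name, (first, last) in ORFS.items():
    bp = span_bp(first, last)
    print(name, bp, "bp,", predicted_aa(bp), "aa")

# Last nucleotide of the RS, expected at the RS-US junction.
print("RS ends at nt", RS_FIRST_NT + RS_LENGTH_BP - 1)
```

Under these assumptions the computed values agree with the sizes given in the text for RS1, RS2, and RS3 and with nt 7656 at the RS-US junction.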
B.
Homolog       First aa   Amino acid motif
SVV RS3       120        M-X3-C-X3-H-X13-C
VZV ORF 64    123        M-X3-C-X3-H-X13-C
EHV-1 IR5     140        C-X3-C-X3-H-X3-C
HSV-1 US10    287        C-X3-C-X3-H-X3-C
Consensus                C-X2-4-C-X2-15-C/H-X2-4-C/H
Fig. 4. (A) Alignment of the SVV predicted amino acid sequence with the amino acid sequences of the VZV gene 64 (Davison and Scott, 1986), EHV-1 gene 66 (Telford et al., 1992), and HSV-1 US10 (McGeoch et al., 1985) polypeptides. Vertical lines indicate identical amino acid residues. Amino acids which are identical in all 4 polypeptides are indicated (#). Amino acids which are conserved in 3 of the 4 polypeptides are indicated (*). The numbers to the right of the sequences indicate the positions in the amino acid sequences, with the initiation codon being number 1 in all cases. (B) Comparison of the zinc finger amino acid motifs in the SVV RS3 homologs. Conserved methionine (M), cysteine (C), and histidine (H) residues are indicated. X represents any amino acid. The consensus motif for C-C-H-H type zinc fingers is shown (Berg, 1986).
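A motif of the form M-X3-C-X3-H-X13-C can be located in a protein sequence with a regular expression. The sketch below is only an illustration: the pattern accepts either M or C at the first position and either 3 or 13 residues before the final cysteine, which covers the four homolog motifs described in Fig. 4B, and the test peptide is a made-up sequence, not the SVV RS3 polypeptide. It is not the PROSITE pattern used by the GCG Motifs program.

```python
import re

# [MC]-X3-C-X3-H-(X3 or X13)-C; the lookahead allows overlapping hits.
ZINC_FINGER_LIKE = re.compile(r"(?=([MC].{3}C.{3}H(?:.{3}|.{13})C))")


def find_motifs(protein):
    """Return (1-based start position, matched segment) for every hit."""
    return [(m.start() + 1, m.group(1)) for m in ZINC_FINGER_LIKE.finditer(protein)]


# Hypothetical peptide carrying one M-X3-C-X3-H-X13-C motif at position 3.
peptide = "GG" + "M" + "AAA" + "C" + "AAA" + "H" + "A" * 13 + "C" + "GG"
hits = find_motifs(peptide)
print(hits)
```

A motif of this form spans 23 residues, which matches the interval of amino acids 120-142 given for the SVV RS3 motif.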
SVV RS reiterated sequence
The G + C-rich reiterated sequence was identified in a noncoding region between the RS1 and RS2 ORFs (nt 5156 to 5274). The reiterated sequence is a total of 119 nucleotides in length and includes a 16-bp sequence (AGAGGGGGGACGGGGG) which is repeated in tandem 7 times plus a 7-bp (AGAGGGG) partial repeat (Fig. 5). The SVV repeat sequence shares extensive homology to a 27-bp VZV RS reiterated sequence which is located between the VZV genes 62 and 63. A 9-bp core sequence of the SVV repeat is identical, except for a single cytosine nucleotide, to an analogous sequence within the VZV repeat element.
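The composition of the reiterated region can be reconstructed from the repeat unit given above, and its G + C content examined with a sliding window like the one described for Fig. 2 (100-bp windows at 3-bp shifts). This is a minimal sketch in plain Python with hypothetical function names; it is not the GCG Window program, and its windowed values are not claimed to reproduce the published plot.

```python
UNIT = "AGAGGGGGGACGGGGG"  # 16-bp repeat unit from the text
PARTIAL = "AGAGGGG"         # 7-bp partial repeat from the text


def gc_fraction(seq):
    """Fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def gc_windows(seq, window=100, step=3):
    """Yield (0-based start, G+C fraction) for each full-length window."""
    for start in range(0, len(seq) - window + 1, step):
        yield start, gc_fraction(seq[start:start + window])


# Seven tandem copies of the unit plus the partial repeat.
reiterated = UNIT * 7 + PARTIAL

print(len(reiterated))                       # total length of the region
print(round(100 * gc_fraction(UNIT), 2))     # G+C% of one repeat unit
for start, frac in gc_windows(reiterated):
    print(start, round(100 * frac, 1))
```

The reconstructed length, 7 × 16 + 7, equals the 119 nucleotides spanned by nt 5156 to 5274.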
SVV     GGGGGGACGGGGGAGA
VZV     CCGCCGATGGGGAGGGGGCGCGGTACC
HSV c   GACGCGGGGGCGGAGGAGGGGG
HSV a   GCGAGGAGGGGG
Fig. 5. Reiterated sequence in the SVV RS and comparison to homologous repeat sequences in the VZV and HSV-1 genomes. The DNA sequences of the 16-bp SVV and 27-bp VZV RS tandem direct repeats are illustrated. The DNA sequences of the 22-bp repeat in the 'c' region (Murchie and McGeoch, 1982) and the 12-bp repeat in the 'a' region (Mocarski and Roizman, 1982) of the HSV-1 genome are indicated. Vertical lines indicate identical nucleotides.
Homologous repeat elements are also present within the 'a' sequence of the HSV-1 DNA termini (Mocarski and Roizman, 1982) and the 'c' sequence of the HSV-1 DNA RS (Fig. 5, Murchie and McGeoch, 1982). The significance of this finding is uncertain since these HSV-1 repeat elements are not located in genomic areas corresponding to the SVV and VZV repeats within the RS.
Putative SVV oriS
A putative SVV oriS was identified based on homology to the VZV and other alphaherpesvirus replication origins. The SVV oriS region (nt 5467 to 5625) was located between the reiterated sequence and the RS2 ORF (Fig. 2). Typical of other alphaherpesvirus replication origins, the SVV oriS includes an A + T-rich sequence capable of forming a hairpin structure (Fig. 6A). The hairpin consists of a 6-base palindromic stem (ATAATA and TATTAT) and an 8-base loop (AAGATATA). Immediately upstream of the hairpin are three motifs (TCGCACTT) which are homologous to binding sites for the VZV gene 51 and the HSV-1 UL9 origin binding proteins (Fig. 6A and 6B, Stow et al., 1990). The SVV oriS region, including the hairpin and the most proximal origin binding site, shares 72% nucleotide identity with the homologous region of the VZV oriS (Fig. 6B).
B.
SVV oriS     TTATTATTCGCACTTTAATATAATAAAGATATATATTATTCACCACGTAC
VZV oriS     CCACCGTTCGCACTTTCTTTCTATATATATATATATATATATATATATATATAGAGAAAGA
HSV-1 oriS   GAAGCGTTCGCACTTCGTCCCAATATATATATATTATTAGGGCGAAGTGCGA
Fig. 6. DNA sequence of the SVV oriS region and comparison to the VZV and HSV-1 oriS. (A) DNA sequence of the SVV oriS (nt 5461-5640). Palindrome sequences are underlined. The 3 origin protein DNA binding site motifs are indicated with asterisks. (B) Comparison of the SVV, VZV, and HSV-1 oriS regions. Palindrome sequences are underlined. The proximal SVV origin protein DNA binding site motif is indicated with asterisks and in bold. The VZV and HSV-1 origin protein DNA binding site motifs are indicated in bold. The VZV and HSV-1 DNA sequences are derived from Davison and Scott, 1986 (nt 110,175-110,235) and Murchie and McGeoch, 1982 (nt 2408-2459), respectively. Vertical lines indicate identical nucleotides.
Fig. 7. Summary of gene organization within the S components of the VZV and SVV genomes. The genomic location and orientation of the VZV ORFs 62-68 and the SVV RS1-3 and US1-4 are indicated. Regions of the VZV and SVV RS containing reiterated (R) and oriS (O) sequences are illustrated with arrows.
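The hairpin and the proximal origin binding motif described above for the SVV oriS can be checked directly on the SVV segment shown in Fig. 6B. The sketch below is a simple string search under stated assumptions: the segment is transcribed from Fig. 6B, positions are 0-based offsets within that segment rather than genome coordinates, and the function names are hypothetical. It does not model DNA folding energetics.

```python
# SVV oriS segment from Fig. 6B, written in blocks of at most 10 bases.
SVV_ORIS = "".join([
    "TTATT", "ATTCGCACTT", "TAATATAATA", "AAGATATATA", "TTATTCACCA", "CGTAC",
])
BINDING_MOTIF = "TCGCACTT"  # origin binding protein motif from the text

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_hairpins(seq, stem=6, loop=8):
    """List (start, left stem, loop, right stem) where the stems can pair."""
    hits = []
    for i in range(len(seq) - (2 * stem + loop) + 1):
        left = seq[i:i + stem]
        right = seq[i + stem + loop:i + 2 * stem + loop]
        if right == reverse_complement(left):
            hits.append((i, left, seq[i + stem:i + stem + loop], right))
    return hits


motif_offset = SVV_ORIS.find(BINDING_MOTIF)
hairpins = find_hairpins(SVV_ORIS)
print("motif offset:", motif_offset)
for hit in hairpins:
    print(hit)
```

Because the region is A + T-rich, the search may report more than one candidate; the stem and loop named in the text (ATAATA, AAGATATA, TATTAT) appear among them, downstream of the binding motif.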
4. Discussion
This study of the SVV RS, combined with our previous study of the SVV US region (Fletcher and Gray, 1993), demonstrates that the S components of the SVV and VZV genomes are similar in genetic organization and size (Fig. 7). The SVV and VZV S components contain homologous ORFs, replication origins, and reiterated regions. The SVV RS is 7559 bp in size, slightly larger than the 7319-bp RS of the VZV Dumas strain (Davison and Scott, 1985, 1986). The S component of the SVV genome, including the duplicated RS and the 4904-bp SVV US region, contains 20,022 bp, similar to the 19,870-bp VZV S component (Davison and Scott, 1986). The sizes of the SVV RS, US and total S components as determined by DNA sequence analysis are consistent with previous size estimates of SVV DNA by electron microscopy (Gray et al., 1992; Clarke et al., 1992).
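The S component sizes quoted above follow from the component lengths, since the S component consists of the US region flanked by two copies of the RS. The short sketch below reproduces that arithmetic. The VZV US length it derives is implied by the quoted VZV figures under the same assumption; it is not a value stated in this text, and the names are hypothetical.

```python
SVV_RS_BP = 7559   # SVV RS (this study)
SVV_US_BP = 4904   # SVV US (Fletcher and Gray, 1993)
VZV_RS_BP = 7319   # VZV Dumas RS, as quoted in the text
VZV_S_BP = 19870   # VZV S component, as quoted in the text


def s_component_bp(rs_bp, us_bp):
    """Size of an S component made of US bracketed by two RS copies."""
    return 2 * rs_bp + us_bp


svv_s_bp = s_component_bp(SVV_RS_BP, SVV_US_BP)
vzv_us_implied_bp = VZV_S_BP - 2 * VZV_RS_BP  # derived, not quoted in the text

print("SVV S component:", svv_s_bp, "bp")
print("Implied VZV US:", vzv_us_implied_bp, "bp")
print("SVV RS minus VZV RS:", SVV_RS_BP - VZV_RS_BP, "bp")
```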
The G + C content of the SVV RS (55.5%) is higher than that of the US region (39.1%) and the whole SVV genome (41%) and slightly lower than that of the VZV RS (59.0%, Fletcher and Gray, 1993; Clarke et al., 1992; Davison and Scott, 1986). The SVV and VZV RS have a similar G + C% distribution with particularly high G + C content in the RS1-gene 62 homologs and in the reiterated sequences. Although a relatively high G + C content is characteristic of the inverted repeats of most herpesviruses, the factors which produce this phenomenon are not understood (Davison and McGeoch, 1986).
The SVV RS contains 3 ORFs, each homologous to VZV RS genes. The SVV RS1 gene is closely related to the major immediate early genes of VZV (gene 62), EHV-1 (gene 64), and HSV-1 (ICP4). While the function of the SVV RS1 polypeptide has not been investigated, the VZV gene 62 homolog, a major component of the virion tegument, transactivates immediate early, early, and late genes and plays an important regulatory role in VZV replication (Kinchington et al., 1992; Perera et al., 1992).
The SVV RS2 gene has extensive homology with the VZV gene 63 and limited homology with a domain of the HSV-1 immediate early ICP22 gene. Transient expression assays indicate that the VZV gene 63 polypeptide strongly represses gene 62 expression, promotes transcription of a VZV early gene (thymidine kinase), and has no effect on the expression of late genes (glycoprotein gE and gB, Jackers et al., 1992). VZV gene 63 transcripts are detected in latently infected human and rat ganglia, suggesting this viral gene may play a role in VZV latency and reactivation (Gilden et al., 1987; Merville-Louis et al., 1989; Croen and Straus, 1991).
The SVV RS3 (187 amino acids) and VZV gene 64 (180 amino acids) homologs are considerably smaller than the HSV-1 US10 counterpart (312 amino acids), and homology with HSV-1 US10 is limited to a 130 amino acid domain (Fig. 4A). The functions of the SVV RS3 and VZV gene 64 proteins are unknown. HSV-1 US10 is nonessential for viral replication and is speculated to be a virion polypeptide (Nishiyama et al., 1993; Beers et al., 1994). An HSV-1 US10 deletion mutant is as neurovirulent as the parental virus following intracranial inoculation of mice, is 20-fold less virulent following intraperitoneal inoculation, and is able to establish latent infection of ganglia and reactivate in vitro (Nishiyama et al., 1993). Proteins which contain zinc finger motifs are generally regulatory proteins which act by binding to specific DNA or RNA sequences (Berg, 1990). The potential zinc finger motifs identified within the predicted SVV RS3 and VZV gene 64 polypeptides are similar to the C-X2-3-C-X3-4-H-X4-8-C consensus motifs present in other herpesvirus US10 homologs and in retroviral low molecular weight nucleic acid-binding proteins (Berg, 1986). However, the SVV RS3 and VZV gene 64 motifs are unusual in that each begins with a methionine instead of the typical cysteine (Fig. 4B). In addition, the SVV and VZV motifs are atypical in having 13 amino acids between the histidine at position 9 and the terminal cysteine residue instead of the usual 4-8 residues. If the atypical zinc finger motifs of the SVV RS3 and VZV gene 64 polypeptides function to permit specific DNA binding, these proteins may play an important regulatory role in viral replication.
The G + C-rich SVV reiterated region includes a 16-bp sequence which is tandemly repeated 7 times plus a partial repeat and extends a total of 119 bp. The analogous 27-bp VZV repeat is duplicated a variable number of times in different VZV strains (Casey et al., 1985; Davison and Scott, 1986). For example, the VZV Oka, Ellen, and Dumas strains contain 15, 6, and 5 copies, respectively, of the repeat element. It will be of interest to determine if a similar variation occurs in the RS regions of different SVV isolates. The basis for reiterated regions within herpesvirus genomes is not understood. The similarity in location within non-coding regions of the SVV and VZV RS and the homology between the SVV and VZV repeat sequences suggest some importance in viral DNA structure and/or gene regulation. However, it has also been proposed that the reiterated regions are no more than parasitic sequences which accumulate in herpesvirus DNA by recombination or duplication in locations where they do not cause a selective disadvantage (Davison and Scott, 1986).
The putative SVV oriS was identified based on extensive homology with the VZV and HSV-1 replication origins, an A-T-rich palindromic sequence which may form a stem and loop structure, and a conserved origin binding protein motif. The 20-bp SVV stem and loop sequence is shorter than the 45-46 bp stem and loop structures of the VZV and HSV-1 oriS (Davison and Scott, 1985). However, the total length of the palindromic region may not be so important since deletion of 12-20 bp within the A-T-rich sequence of the VZV oriS does not eliminate oriS activity (Stow and Davison, 1986). In addition, studies employing HSV-1 oriS mutants indicate that neither the size of the palindromic region nor the ability to form a cruciform structure is important for efficient oriS replication activity (Deb and Doelberg, 1988). The 3 SVV origin binding site motifs are located on the same side of the palindrome, similar to the situation in the VZV oriS (Fig. 6, Stow et al., 1990). In contrast, two conserved oriS binding sites overlap the ends of the HSV-1 palindromic sequence (Elias and Lehman, 1988). In studies of the VZV oriS, the binding site closest to the palindrome is essential for oriS replication activity whereas the most distal site is dispensable (Stow et al., 1990).
This study further demonstrates the genetic relatedness of SVV and VZV and provides support for the simian varicella model as an experimental system for studying aspects of varicella pathogenesis which are not feasible with VZV. Identification of SVV homologs to the VZV genes 62, 63, and 64 will permit use of the simian varicella model to investigate the role of these genes in viral replication, pathogenesis, and latency.
Acknowledgements
This study was supported by Public Health Service Grant AI 26070 from the National Institutes of Health, Grant 1N167 from the American Cancer Society, and U.A.M.S. Institutional Biomedical Research Support Grant RR05350-26. The authors thank Dr. Don Gilden for providing the SVV EcoRI L and Dr. Bill Stroop and Ms. Carla Pumphrey for critical review of the paper.
References
Bairoch, A. (1991) PROSITE: A dictionary of sites and patterns in proteins. Nucl. Acids Res. 19, 2241-2245.
Beers, D.R., Henkel, J.S. and Stroop, W.G. (1994) Herpes simplex virus. In: R.R. McKendall and W.G. Stroop (Eds.), Handbook of Neurovirology, Marcel Dekker, Inc., New York, pp. 225-252.
Berg, J.M. (1986) Potential metal-binding domains in nucleic acid binding proteins. Science 232, 485-487.
Berg, J.M. (1990) Zinc fingers and other metal-binding domains. J. Biol. Chem. 265, 6513-6516.
Casey, T.A., Ruyechan, W.T., Flora, M.N., Reinhold, W., Straus, S.E. and Hay, J. (1985) Fine mapping and sequencing of a variable segment in the inverted repeat region of varicella-zoster virus DNA. J. Virol. 54, 639-642.
Clarke, P., Rabkin, S.D., Inman, M.V., Mahalingam, R., Cohrs, R., Wellish, M. and Gilden, D.H. (1992) Molecular analysis of simian varicella virus DNA. Virology 190, 597-605.
Clarke, P., Brunschwig, A. and Gilden, D.H. (1993) DNA sequence of a simian varicella virus gene that encodes a homologue of varicella zoster virus IE62 and herpes simplex virus ICP4. Virology 197, 45-52.
Clarke, P., Beer, T. and Gilden, D.H. (1995) Configuration and terminal sequences of the simian varicella virus genome. Virology 207, 154-159.
Croen, K.D. and Straus, S.E. (1991) Varicella-zoster virus latency. Annu. Rev. Microbiol. 45, 265-282.
Davison, A.J. and McGeoch, D.J. (1986) Evolutionary comparisons of the S segments in the genomes of herpes simplex virus type 1 and varicella-zoster virus. J. Gen. Virol. 67, 597-611.
Davison, A.J. and Scott, J.E. (1985) DNA sequence of the major inverted repeat in the varicella-zoster virus genome. J. Gen. Virol. 66, 207-220.
Davison, A.J. and Scott, J.E. (1986) The complete DNA sequence of varicella-zoster virus. J. Gen. Virol. 67, 1759-1816.
Deb, S. and Doelberg, M. (1988) A 67-base-pair segment from the ori-S region of the herpes simplex virus type 1 encodes origin function. J. Virol. 62, 2516-2519.
Devereux, J., Haeberli, P. and Smithies, O. (1984) A comprehensive set of sequence analysis programs for the VAX. Nucl. Acids Res. 12, 387-395.
Elias, P. and Lehman, I.R. (1988) Interaction of origin binding protein with an origin of replication of herpes simplex virus 1. Proc. Natl. Acad. Sci., USA 85, 2959-2963.
Felsenfeld, A.D. and Schmidt, N.J. (1975) Immunological relationship between delta herpesvirus of patas monkeys and varicella-zoster virus of humans. Infect. Immun. 12, 261-266.
Feng, D.-F. and Doolittle, R.F. (1987) Progressive sequence alignment as a prerequisite to correct phylogenetic trees. J. Mol. Evol. 25, 351-360.
Fickett, J.W. (1982) Recognition of protein coding regions in DNA sequences. Nucl. Acids Res. 10, 5303-5318.
Fletcher, T.M. and Gray, W.L. (1992) Simian varicella virus: Characterization of virion and infected cell polypeptides and the antigenic cross-reactivity with varicella zoster virus. J. Gen. Virol. 73, 1209-1215.
Fletcher, T.M. and Gray, W.L. (1993) DNA sequence and genetic organization of the unique short (Us) region of the simian varicella virus genome. Virology 193, 762-773.
Gilden, D.H., Rozenman, Y., Murray, R., Devlin, M. and Vafai, A. (1987) Detection of varicella-zoster virus nucleic acid in neurons of normal human thoracic ganglia. Ann. Neurol. 22, 377-380.
Gray, W.L. and Oakes, J.E. (1984) Simian varicella virus DNA shares homology with human varicella-zoster virus DNA. Virology 136, 241-246.
Gray, W.L., Pumphrey, C.Y., Ruyechan, W.T. and Fletcher, T.M. (1992) The simian varicella virus and varicella zoster virus genomes are similar in size and structure. Virology 186, 562-572.
Henikoff, S. (1984) Unidirectional digestion with exonuclease III creates targeted breakpoints for DNA sequencing. Gene 28, 351-359.
Jackers, P., Defechereux, P., Baudoux, L., Lambert, C., Massaer, M., Merville-Louis, M.-P., Rentier, B. and Piette, J. (1992) Characterization of regulatory functions of the varicella-zoster virus gene 63-encoded protein. J. Virol. 66, 3899-3903.
Kinchington, P.R., Hougland, J.K., Arvin, A.M., Ruyechan, W.T. and Hay, J. (1992) The varicella-zoster virus immediate-early protein IE62 is a major component of virus particles. J. Virol. 66, 359-366.
Kozak, M. (1986) Point mutations define a sequence flanking the AUG initiator codon that modulates translation by eukaryotic ribosomes. Cell 44, 283-292.
McGeoch, D.J., Dolan, A., Donald, S. and Rixon, F.J. (1985) Sequence determination and genetic content of the short unique sequence of the herpes simplex virus type 1. J. Mol. Biol. 181, 1-13.
McGeoch, D.J., Dolan, A., Donald, S. and Brauer, D.H.K. (1986) Complete DNA sequence of the short repeat region in the genome of herpes simplex type 1. Nucl. Acids Res. 14, 1727-1745.
Merville-Louis, M.-P., Sadzot-Delvaux, C., Delree, P., Piette, J., Moonen, G. and Rentier, B. (1989) Varicella-zoster virus infection of adult rat sensory neurons in vitro. J. Virol. 63, 3155-3160.
Mocarski, E.S. and Roizman, B. (1982) Structure and role of the herpes simplex virus DNA termini in inversion, circularization and generation of virion DNA. Cell 31, 89-97.
Murchie, M.-J. and McGeoch, D.J. (1982) DNA sequence analysis of an immediate-early gene region of the herpes simplex virus type 1 genome (map coordinates 0.950 to 0.978). J. Gen. Virol. 62, 1-15.
Myers, M.G. and Connelly, B.L. (1992) Animal models of varicella. J. Infect. Dis. 166, Suppl. 1, S48-S50.
Nishiyama, Y., Kurachi, R., Daikoku, T. and Umene, K. (1993) The US9, 10, 11, and 12 genes of herpes simplex virus type 1 are of no importance for its neurovirulence and latency in mice. Virology 194, 419-423.
Oakes, J.E. and d'Offay, J.M. (1988) Simian varicella virus. In: G. Darai (Ed.), Virus Diseases in Laboratory and Captive Animals, Martinus Nijhoff, Boston, MA, pp. 163-174.
Perera, L.P., Mosca, J.D., Ruyechan, W.T. and Hay, J. (1992) Regulation of varicella-zoster virus gene expression in human T lymphocytes. J. Virol. 66, 5298-5304.
Perera, L.P., Mosca, J.D., Ruyechan, W.T., Hayward, G.S., Straus, S.E. and Hay, J. (1993) A major transactivator of varicella-zoster virus, the immediate-early protein IE62, contains a potent N-terminal activation domain. J. Virol. 67, 4474-4483.
Pumphrey, C.Y. and Gray, W.L. (1992) The genomes of simian varicella virus and varicella zoster virus are colinear. Virus Res. 26, 255-266.
Sambrook, J., Fritsch, E.F. and Maniatis, T. (1989) In: Molecular Cloning: A Laboratory Manual, 2nd ed. Cold Spring Harbor Laboratory Press.
Sanger, F., Nicklen, S. and Coulson, A.R. (1977) DNA sequencing with chain-terminating inhibitors. Proc. Natl. Acad. Sci., USA 74, 5463-5467.
Smith, T.F. and Waterman, M.S. (1981) Comparison of bio-sequences. Adv. Appl. Math. 2, 482-489.
Soike, K.F. (1992) Simian varicella virus infection in African and Asian monkeys. Trop. Vet. Med. 653, 323-333.
Staden, R. (1980) A new computer method for the storage and manipulation of DNA gel reading data. Nucl. Acids Res. 8, 3673-3694.
Stow, N.D. and Davison, A.J. (1986) Identification of a varicella-zoster virus origin of replication and its activation by herpes simplex virus type 1 gene products. J. Gen. Virol. 67, 1613-1623.
Stow, N.D., Weir, H.M. and Stow, E.C. (1990) Analysis of the binding sites for the varicella-zoster virus gene 51 product within the viral origin of DNA replication. Virology 177, 570-577.
Telford, E.A.R., Watson, M.S., McBride, K. and Davison, A.J. (1992) The DNA sequence of equine herpesvirus-1. Virology 189, 304-316.
Zhang, G. and Leader, D.P. (1990) The structure of the pseudorabies virus genome at the end of the inverted repeat sequences proximal to the junction with the short unique region. J. Gen. Virol. 71, 2433-2441.