Analysis of the nucleotide sequence of a 43 kbp segment of the genome of variola virus India-1967 strain


Virus Research, 30 (1993) 239-258 © 1993 Elsevier Science Publishers B.V. All rights reserved 0168-1702/93/$06.00

VIRUS 00935

Sergei N. Shchelkunov, Vladimir M. Blinov, Sergei M. Resenchuk, Aleksei V. Totmenin and Lev S. Sandakhchiev

Institute of Molecular Biology, NPO 'Vector', Koltsovo, Novosibirsk Region 633159, Russia

(Received 10 December 1992; revision received 10 June 1993; accepted 28 June 1993)

Summary

Sequencing and computer analysis of a 43069 bp segment of the genome of variola virus strain India-1967 (VAR), covering the region of the HindIII C, E, R, Q, K and H DNA fragments, have been carried out. Forty-three potential open reading frames (ORFs) have been identified, and the polypeptides encoded by them have been compared with the analogous proteins of vaccinia virus strain Copenhagen (COP). ORF E7R of VAR is much shorter than its COP analog. The other polypeptides encoded by the potential ORFs of VAR are highly conserved in comparison with COP. Possible functions of the predicted viral polypeptides are discussed.

Orthopoxvirus; Variola virus; Sequence analysis

Introduction

Comparative studies of the structure-function organization of the genomes of various orthopoxviruses are necessary in order to detect the genes determining such species-specific properties as tissue tropism, or modulation of the protective mechanisms of the host organism against viral infection. Such information is also extremely important for revealing the evolutionary interrelations between different orthopoxviruses. For instance, the origin of vaccinia virus is still unclear (Buller and Palumbo, 1991). Important questions are which virus should be considered the evolutionary ancestor of variola virus and whether such a virus could appear as a result of spontaneous changes in the genomes of orthopoxviruses persisting in nature now. The complete nucleotide sequence of one of the vaccinia virus strains has recently been determined (Goebel et al., 1990). The next stage of this study should be sequencing of the genomes of variola virus (VAR), monkeypox and cowpox viruses. In this work we have sequenced and analyzed a 43069 bp segment of the genome of VAR strain India-1967.

Correspondence to: S.N. Shchelkunov, Institute of Molecular Biology, NPO 'Vector', Koltsovo, Novosibirsk Region 633159, Russia.

Materials and Methods

Hybrid plasmids (Shchelkunov et al., 1991) containing HindIII fragments of the DNA of variola major virus strain India-1967 (Shchelkunov et al., 1993a) were used. Sequencing of DNA fragments was carried out by the Maxam and Gilbert (1980) technique with double redundancy as described earlier (Shchelkunov et al., 1993a). Structure-function analysis of the variola virus genome was carried out using the following application programs: 'Alignment Service', 'Image', 'Nucleowriter' and 'Gene Tools', all developed at NPO 'Vector'. The search for consensus sequences and for viral protein sites most homologous to cellular ones was done with the specialized package 'Q-Search'. Primary assessment and comparison of genomes was performed with the program 'G-Search', which can compare genomes of up to 200,000 base pairs simultaneously. Databases on immunoglobulin-like domains and cell receptors, created at NPO 'Vector', were used in the work. The databases SWISS-PROT (Protein Sequence Database, Release 21) and EMBL (Nucleotide Sequence Database, Release 28) were also used. VAR open reading frames (ORFs) were denoted by the name of the corresponding HindIII fragment in which the given ORF is initiated or completely located. The letters 'L' and 'R' indicate ORF directions from right to left and from left to right, respectively. Sequence data from this article have been deposited with the EMBL Data Library under the accession number X69198.
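The ORF naming rule described above can be illustrated with a short sketch. It splits a name such as E7R into the HindIII fragment letter, a number and the direction letter. The function and type names are hypothetical, and reading the number as the ordinal of the ORF within its fragment is an assumption, since the text only states the meaning of the fragment letter and of 'L'/'R'.

```python
import re
from typing import NamedTuple


class OrfName(NamedTuple):
    fragment: str   # HindIII fragment letter, e.g. "C"
    number: int     # assumed: ordinal of the ORF within that fragment
    direction: str  # "L" = right to left, "R" = left to right


# One fragment letter, one or more digits, then L or R (hypothetical helper).
_ORF_RE = re.compile(r"^([A-Z])(\d+)([LR])$")


def parse_orf_name(name: str) -> OrfName:
    """Split an ORF name such as 'E7R' into its three parts."""
    m = _ORF_RE.match(name)
    if m is None:
        raise ValueError(f"not of the form <fragment><number><L|R>: {name!r}")
    fragment, number, direction = m.groups()
    return OrfName(fragment, int(number), direction)


def describe(name: str) -> str:
    """Return a plain-language reading of an ORF name."""
    orf = parse_orf_name(name)
    sense = "right to left" if orf.direction == "L" else "left to right"
    return f"ORF {orf.number} of HindIII fragment {orf.fragment}, read {sense}"
```

For example, `describe("E7R")` gives "ORF 7 of HindIII fragment E, read left to right".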

Results and Discussion

After sequencing of the HindIII-C, E, R, Q, K and H fragments of variola virus strain India-1967 DNA (Fig. 1) we have carried out computer analysis of this sequence and compared it with the analogous region of the vaccinia virus strain Copenhagen (COP) genome (Goebel et al., 1990).

Fig. 1. HindIII restriction maps of the DNAs of vaccinia virus, Copenhagen strain, and variola virus, India-1967 strain. The VAR genome segment analyzed in the present work is cross-hatched. Size and direction of the corresponding ORFs are marked with arrows.
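As a small illustration of what a HindIII restriction map encodes, the sketch below locates HindIII sites in a DNA string and returns the fragment lengths. HindIII recognizes AAGCTT and cuts after the first A. The function names are hypothetical, the molecule is assumed to be linear, and any input used with it here would be a toy sequence rather than data from this work.

```python
HINDIII_SITE = "AAGCTT"  # HindIII recognition sequence
CUT_OFFSET = 1           # the enzyme cuts between the first A and AGCTT


def hindiii_cut_positions(seq: str) -> list:
    """Return 0-based cut positions (index of the first base after each cut)."""
    seq = seq.upper()
    cuts = []
    start = seq.find(HINDIII_SITE)
    while start != -1:
        cuts.append(start + CUT_OFFSET)
        start = seq.find(HINDIII_SITE, start + 1)
    return cuts


def fragment_lengths(seq: str) -> list:
    """Lengths of the fragments of a linear molecule, in left-to-right order."""
    bounds = [0] + hindiii_cut_positions(seq) + [len(seq)]
    return [right - left for left, right in zip(bounds, bounds[1:])]
```

Ordering the fragments by decreasing length and lettering them A, B, C, ... is the usual convention behind fragment names such as HindIII-C; the sketch does not attempt that step.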

Forty-three potential ORFs were detected in this region of the VAR genome (Fig. 1), the same number as for COP (Fig. 2). ORF E7R of VAR is much shorter than its COP analog. The other polypeptides encoded by the potential ORFs of VAR are highly conserved in comparison with COP. However, the C11L protein lacks the multiple Lys-Asn repeats that are present in the corresponding COP protein, and the E5R, C19L, C9L and C8L polypeptides of VAR have additional sequences at the N-termini (Fig. 2). It is possible that in these cases the methionine codons chosen in the variola sequence are not the authentic start codons. Thirteen of the 42 polypeptides analyzed have extended segments of high hydrophobicity. Fourteen polypeptides contain a leucine zipper motif (Fig. 2), which implies their potential participation in protein-protein or protein-nucleic acid interactions (Landschulz et al., 1988; Kouzarides and Ziff, 1988). The leucine zipper motif has the following consensus sequence: [L,I]-(x)6-[L,I]-(x)6-[L,I]-(x)6-[L,I]. It should be noted that in VAR polypeptide Q1L the second leucine zipper motif from the N-terminus of the COP analog is disrupted by an amino acid substitution (Fig. 2). The experimental data obtained so far assign the following ORFs of VAR to the early genes: C8L, C15L, C16L, C19L, C20L, E1L, E3L, E9L, Q1L, K4L, H3R (Traktman et al., 1984; Golini and Kates, 1984; Earl et al., 1986; Slabaugh et al., 1988; Tengelsen et al., 1988; Gershon et al., 1991; Watson et al., 1991; Meis and Condit, 1991; Chang et al., 1992); to the late genes: C17L, C21R, Q2L, K1L, K2L, K5L, K7L, H1L, H2L, H4L (Wittek et al., 1984; Hirt et al., 1986; Schmitt and Stunnenberg, 1988; Fathi and Condit, 1991a; Meis and Condit, 1991; Blasco and Moss, 1991; 1992; Ahn and Moss, 1992; Ryazankina et al., 1993); to the early-late

[Fig. 2, sequence alignment panels: VAR C8L / COP F4L; VAR C9L / COP F5L; VAR C10L / COP F6L; VAR C11L / COP F7L; VAR C12L / COP F8L; VAR C13L / COP F9L; VAR C14L / COP F10L; VAR C15L / COP F11L]

Fig. 2. Comparison of the amino acid sequences of the polypeptides encoded by the corresponding ORFs of VAR and COP. Asterisks show the potential glycosylation sites. Extended clusters of hydrophobic amino acids are marked above with solid lines. Coinciding amino acids are marked with dots. Square brackets denote the leucine zipper motif (Landschulz et al., 1988). Sequences characteristic of the protein kinase are boxed in ORF varC14L (see the text).
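The leucine zipper consensus given in the text, [L,I]-(x)6-[L,I]-(x)6-[L,I]-(x)6-[L,I], can be searched for with a regular expression. The following is a minimal sketch, not the 'Q-Search' package used in this work; the function name is hypothetical. It reports overlapping matches as well, on the assumption that every occurrence of the pattern is of interest.

```python
import re

# Four L/I residues, each separated by exactly six arbitrary residues.
# The pattern sits inside a lookahead so that overlapping matches are found.
LEUCINE_ZIPPER = re.compile(r"(?=([LI].{6}[LI].{6}[LI].{6}[LI]))")


def find_leucine_zippers(protein: str) -> list:
    """Return (0-based start, matched 22-residue segment) for every match."""
    protein = protein.upper()
    return [(m.start(), m.group(1)) for m in LEUCINE_ZIPPER.finditer(protein)]
```

A single substitution at any of the four L/I positions removes the match, which is the kind of change described above for the second motif of Q1L.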

[Fig. 2, sequence alignment panels: VAR C16L / COP F12L; VAR C17L / COP F13L; VAR C18L / COP F14L; VAR C19L / COP F15L; VAR C20L / COP F16L; VAR C21R / COP F17R; VAR E1L / COP E1L]

Fig. 2 (continued).
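The dot notation used in Fig. 2 (a dot in one row means the same residue as in the other row) can be expanded back into a full sequence, and the share of coinciding positions can be counted. The sketch below assumes that the first row of each pair is written out in full, that the second row carries the dots, and that '-' marks an alignment gap; these conventions and the function names are assumptions made for illustration.

```python
def expand_dotted(reference: str, dotted: str) -> str:
    """Replace each '.' in the dotted row with the residue from the reference row."""
    if len(reference) != len(dotted):
        raise ValueError("aligned rows must have equal length")
    return "".join(r if d == "." else d for r, d in zip(reference, dotted))


def percent_identity(reference: str, dotted: str) -> float:
    """Percentage of gap-free alignment columns in which the residues coincide."""
    columns = [(r, d) for r, d in zip(reference, dotted) if r != "-" and d != "-"]
    if not columns:
        raise ValueError("no gap-free columns to compare")
    same = sum(1 for _, d in columns if d == ".")
    return 100.0 * same / len(columns)
```

With the toy rows `"MKLV-TA"` and `"..I.S.-"`, `expand_dotted` returns `"MKIVST-"`, and four of the five gap-free columns coincide.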

[Fig. 2, sequence alignment panels: VAR E2L / COP E2L; VAR E3L / COP E3L; VAR E5R / COP E5R; VAR E6R / COP E6R; VAR E7R / COP E7R]

Fig. 2 (continued).

[Fig. 2, sequence alignment panels: VAR E8R / COP E8R; VAR E9L / COP E9L; VAR E10R / COP E10R; VAR E11L / COP E11L; VAR Q1L / COP O1L]

Fig. 2 (continued).

[Fig. 2, sequence alignment panels: VAR Q2L / COP O2L; VAR K1L / COP I1L; VAR K2L / COP I2L; VAR K3L / COP I3L; VAR K4L / COP I4L; VAR K5L / COP I5L; VAR K6L / COP I6L]

Fig. 2 (continued).

[Fig. 2 alignment panels: varK7L/copI7L, varK8R/copI8R, varQ1L/copO1L, varH1L/copG1L, varH2R/copG2R, varH3L/copG3L]

Fig. 2 (continued).

[Fig. 2 alignment panels: VAR H-region ORFs and their COP G-region analogs, including varH5R/copG5R, varH6R/copG6R and varH7L/copG7L]

Fig. 2 (continued).

genes: C18L, E4L, HSR, H5.5R (Golini and Kates, 1984; Jones et al., 1987; Ahn et al., 1990; Meis and Condit, 1991; Amegadzie et al., 1992); and to the intermediate genes: H8R (Keck et al., 1990). It should be noted that the COP genes corresponding to the K3L and K8R genes of VAR have been assigned to the intermediate category on the basis of experimental data (Vos and Stunnenberg, 1988). However, the same laboratory has shown that the I3L gene of COP (the analog of the VAR K3L gene) is expressed constitutively throughout the cycle of virus development (Schmitt and Stunnenberg, 1988). In another laboratory (Fathi and Condit, 1991a,b), gene I8R of COP (the analog of the VAR K8R gene) was found to be an early-late one. These results illustrate the difficulty of assigning some orthopoxvirus genes to one temporal class or another, and reflect the complexity of the regulation of viral gene expression (Moss et al., 1991).

Fig. 3. Comparison of amino acid sequences of the proteins C8L of variola virus (var), F4L of vaccinia virus (cop) and the small subunits of the mouse (mou) (Thelander and Thelander, 1989), Saccharomyces cerevisiae (yea) (Hurd et al., 1987) and African swine fever virus (asfv) (Boursnell et al., 1991) ribonucleotide reductases (RR). The consensus sequence characteristic of the small subunit of RR is enframed (see the text). Amino acid residues coinciding with the corresponding residues in the sequence of varC8L are marked with dots. Dash denotes the deletion of the corresponding amino acid residue.
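The dot notation used in these alignments (a dot for a residue identical to the reference row, a dash for a deleted residue) can be expanded mechanically. The following Python sketch is an illustration only and is not part of the original analysis; the function names and the toy rows are hypothetical and were not taken from the figure.

```python
def expand_dot_row(reference: str, row: str) -> str:
    """Expand a dot-notation alignment row against its reference row.

    '.' means "same residue as the reference in this column";
    '-' (a deletion) and explicit letters are kept unchanged.
    """
    if len(reference) != len(row):
        raise ValueError("aligned rows must have equal length")
    return "".join(ref if ch == "." else ch for ref, ch in zip(reference, row))


def percent_identity(reference: str, row: str) -> float:
    """Percentage of identical residues over columns without a gap in either row."""
    expanded = expand_dot_row(reference, row)
    compared = [(a, b) for a, b in zip(reference, expanded) if a != "-" and b != "-"]
    if not compared:
        return 0.0
    matches = sum(a == b for a, b in compared)
    return 100.0 * matches / len(compared)


# Toy rows invented for illustration (hypothetical, not from Fig. 3).
ref_row = "MEPILAPNPN"
dot_row = "..A...-..."
print(expand_dot_row(ref_row, dot_row))  # MEAILA-NPN
print(percent_identity(ref_row, dot_row))
```

Skipping gapped columns is one possible convention; counting gaps as mismatches would give a lower identity value.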

The C8L polypeptide of VAR is highly homologous to the F4L protein of COP (Fig. 3) and represents the small subunit of ribonucleotide reductase (RR) (Slabaugh et al., 1988). The small RR subunit is known to bind iron atoms (Nordlund et al., 1990), and the consensus sequence in the active center region is: E-x-[L,I,V]-H-x-x-x-Y-x-x-[L,I,Y]-x-x-x-[L,I,V,M,F,Y]-[L,I,V,M,F,Y]. Besides the characteristic active center (Fig. 3), the C8L protein of VAR contains a leucine zipper motif and a cluster of hydrophobic amino acid residues (Fig. 2). These sequences are apparently important for the interaction of the small and large RR subunits. Ribonucleotide reductase catalyzes the reductive synthesis of deoxyribonucleotides from the corresponding ribonucleotides, thus providing the pool of precursors necessary for viral DNA synthesis. RR is an oligomeric enzyme consisting of large and small subunits. The large RR subunit is encoded by the K4L gene of the analyzed VAR strain. Comparison of the amino acid sequences of the large RR subunits of various origins (Fig. 4) reveals considerable homology. The active center of this polypeptide, which possesses the consensus sequence G-x-x-N-S-x-x-x-A-x-M-P (Nillson et al., 1988), is fully preserved in orthopoxviruses. As Fig. 3 shows, the viral small RR subunits are shortened at the N-terminus compared to the analogous polypeptides of cell origin. The homology between the overlapping sequences of C8L of VAR and F4L of COP and the analogous cell proteins is high. The small RR subunit of the African swine fever

Fig. 4. Alignment of amino acid sequences of the following proteins: K4L of variola virus (var), I4L of vaccinia virus (cop) (Goebel et al., 1990) and analogous proteins of vaccinia virus Tian Tan (tan) (Oi et al., 1988) and WR (wr) (Schmitt and Stunnenberg, 1988) strains as well as the large subunits of RR of mouse (mou) (Caras et al., 1985), yeast (yea) (Yagle and McEntee, 1990) and African swine fever virus (asfv) (Boursnell et al., 1991).


virus (ASFV) shows greater sequence differences from the corresponding proteins of orthopoxviruses and their cell analogs. These results suggest that there is a great evolutionary distance between ASFV and orthopoxviruses. Furthermore, the genes of the large and the small RR subunits of ASFV are tandemly arranged in the viral genome, whereas they are spaced apart in the orthopoxvirus genome (K4L and C8L genes of VAR).

In the C14L ORF of VAR we have detected two sequences which are characteristic of protein kinases (Fig. 2). Protein kinases are known (Kamps et al., 1984; Bairoch and Claverie, 1988) to have two conserved segments containing the following sequences: [L,I,V]-G-x-G-x-[F,Y]-[S,G]-x-[L,I,V] and [L,I,M,F,Y,C]-x-[H,Y]-x-R-[L,I,V,M,F,Y]-K-x-x-N-[L,I,V,M,F,C]₃. Both sequences are present in the protein under study; however, the first contains Ser instead of the second Gly. The effect of this change on the possible enzyme characteristics is unclear. To answer this question it will be necessary to study expression of the given ORF, isolate the protein and analyze its properties.

The C17L polypeptide of VAR represents a 37 kDa acylated (palmitoylated) protein of the outer envelope of extracellular virions (Hirt et al., 1986; Blasco and Moss, 1992). This protein is highly conserved in orthopoxviruses (Fig. 2). It is required for extracellular enveloped virion formation and also plays a critical role in the local cell-to-cell transmission of virus (Blasco and Moss, 1991, 1992).

The DNA-binding 11 kDa phosphoprotein (Wittek et al., 1984) of VAR (C21R) has two fewer cysteine residues than the analogous COP protein (Fig. 2). This protein has been shown to be essential for the virus (Zhang and Moss, 1991). It is included in the virion core and is necessary for proteolytic processing of the major structural viral proteins P4a and P4b.
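Consensus sequences written in the notation used above, for example G-x-x-N-S-x-x-x-A-x-M-P for the large RR subunit, can be checked against a protein sequence mechanically. The Python sketch below is a minimal illustration under stated assumptions, not the procedure used in this work: the function names are hypothetical, the notation is translated into a regular expression, and the test sequence is invented.

```python
import re
from typing import List, Tuple


def motif_to_regex(motif: str) -> "re.Pattern":
    """Translate 'G-x-[L,I,V]-H' style notation into a compiled regex.

    'x' or 'X' matches any residue, '[A,B]' matches one residue from the
    listed set, and a single capital letter matches itself.
    """
    parts = []
    for token in motif.split("-"):
        token = token.strip()
        if token in ("x", "X"):
            parts.append("[A-Z]")
        elif token.startswith("[") and token.endswith("]"):
            residues = token[1:-1].replace(",", "").replace(" ", "")
            if not residues.isalpha():
                raise ValueError(f"unrecognised residue set: {token!r}")
            parts.append("[" + residues.upper() + "]")
        elif len(token) == 1 and token.isalpha() and token.isupper():
            parts.append(token)
        else:
            raise ValueError(f"unrecognised motif token: {token!r}")
    return re.compile("".join(parts))


def find_motif(motif: str, sequence: str) -> List[Tuple[int, str]]:
    """Return (1-based start position, matched segment) for each non-overlapping hit."""
    pattern = motif_to_regex(motif)
    return [(m.start() + 1, m.group()) for m in pattern.finditer(sequence.upper())]


# Invented toy sequence (hypothetical, not a real RR subunit).
toy_sequence = "MKTGAANSQWEATMPLL"
print(find_motif("G-x-x-N-S-x-x-x-A-x-M-P", toy_sequence))  # [(4, 'GAANSQWEATMP')]
```

A substitution such as Ser for Gly at a fixed position makes a strict pattern of this kind fail, so a relaxed residue set would have to be written explicitly to find such a variant.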
For vaccinia virus it was shown that the E1L gene codes for a 55 kDa protein which, together with the J3R protein (39 kDa), forms the virus-specific poly(A) polymerase (Gershon et al., 1991). This heterodimer provides the posttranscriptional addition of 3'-poly(A) termini to the mRNAs. The E1L polypeptide might be the catalytic subunit, and J3R is evidently a regulatory subunit providing a higher reaction rate and the formation of longer poly(A) termini. This protein complex can be found in mature virions (Moss et al., 1975). The conservation of the polypeptide structure (Fig. 2) points to its importance for virus replication.

The E4L protein contains a Me2+ (Zn2+)-binding domain (Fig. 5) and is homologous to cellular transcription factors. This polypeptide has been identified as a 30 kDa subunit of the viral DNA-dependent RNA polymerase (Earl and Moss, 1990; Ahn et al., 1990; Broyles and Pennington, 1990), and is highly conserved (Fig. 2).

The largest protein coded by the analyzed region of the viral genome, E9L, is the DNA-polymerase (Earl et al., 1986), a key enzyme in viral DNA replication. It is highly conserved within poxviruses (Fig. 2) and contains the described motif [Y,A]-X-D-T-D-S-[L,I,V,M,T] conserved in many DNA-polymerases (Argos, 1988). On the whole, DNA-polymerases of different origins differ greatly in their amino acid sequences. Nevertheless, our analysis revealed other conserved regions in addition to the one described. We consider the region shown

Fig. 5. Metal-binding domain of variola virus E4L protein (A) and comparison of amino acid sequences (B) in the homology region of transcriptional factors of man (hum), mouse (mou) (Hirashima et al., 1988), Drosophila (dro) (Marshall et al., 1990) and proteins E4L of variola virus (var) and vaccinia virus (cop) (Goebel et al., 1990). Above the sequences the conservatively located residues of hydrophobic (h), positively (+) and negatively (-) charged amino acids are marked. The Cys residues forming part of the metal-binding domain are indicated by a separate sign. Amino acid residues coinciding with the corresponding residues in the varE4L sequence are marked with dots.

Fig. 6. Comparison of amino acid sequences in the most conserved region of DNA-polymerases A (DPA) of man (hum) (Wang et al., 1988), S. cerevisiae (yea) (Pizzagalli et al., 1988), DNA-polymerase D (DPD) of yeast (Boulet et al., 1989) as well as of DNA-polymerases of human cytomegalovirus (hcmv) (Kouzarides et al., 1987), varicella zoster virus (vzvd) (Davison and Scott, 1986), herpes simplex virus type 1 (hsv1) (Quinn and McGeoch, 1985) and type 2 (hsv2) (Tsurumi et al., 1987), Epstein-Barr virus (ebv) (Baer et al., 1984), fowlpox virus (fpv) (Binns et al., 1987) and E9L protein of variola virus (var) and vaccinia virus (cop) (Goebel et al., 1990). Above the sequences the conservatively located residues of hydrophobic (h) and positively (+) and negatively (-) charged amino acids are marked. Amino acid residues coinciding with the corresponding residues in the varE9L sequence are marked with dots. Dash denotes the deletion of an amino acid residue.


in Fig. 6 to be functionally important and relatively conserved among DNA-polymerases. It can be roughly divided into two parts, one of which contains negatively charged and hydrophobic amino acid residues, the other hydrophobic and positively charged residues. The active site of DNA-polymerase might be formed by the interaction of differently charged and hydrophobic amino acid residues. The analysis suggests the existence of an additional new consensus sequence for

Fig. 7. Alignment of amino acid sequences of the K8R protein of variola virus (VAR) and the I8R protein of vaccinia virus (COP) (Goebel et al., 1990) and WR (wr) strains (Schmitt and Stunnenberg, 1988). Amino acid residues coinciding with K8R are marked with dots. Box A and box B (Gorbalenya et al., 1989) of the protein nucleotide-binding domain are enframed.

Fig. 8. Alignment of amino acid sequences of the proteins: Q2L of variola virus (var), O2L of vaccinia virus, Copenhagen strain (cop) (Goebel et al., 1990) and the analogous protein of vaccinia virus WR (wr) (Tengelsen and Hruby, 1989) and LIVP (liv) strains, ectromelia virus, K-1 strain (ect) (Ryazankina et al., 1993) as well as rabbit (rab) (Hopper et al., 1989), pig (Gan and Wells, 1987), bovine (bov) (Papayannopoulos et al., 1989) and yeast (yea) (Gan et al., 1990) glutaredoxins. The glutaredoxin active center is enframed. Amino acid residues coinciding with varQ2L are marked with dots.
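The dot convention used in the alignment figures can be illustrated with a short sketch. The Python function below expands a dotted row against its reference row. The name `expand_dots` is hypothetical, and the two 12-residue fragments are assumptions: they are the active-center region as read from the scanned Fig. 8 and should be checked against the deposited sequences before any real use.

```python
def expand_dots(reference, row):
    """Replace each '.' in an aligned row with the reference residue in that column."""
    if len(reference) != len(row):
        raise ValueError("aligned rows must have the same length")
    return "".join(ref if res == "." else res for ref, res in zip(reference, row))


# Assumed fragments (read from the scanned figure, not from a database entry).
VAR_Q2L_FRAGMENT = "TCPFCRNALDIL"   # reference row (varQ2L)
RABBIT_ROW = "...Y..KTQE.."         # rabbit glutaredoxin row in dot notation

print(expand_dots(VAR_Q2L_FRAGMENT, RABBIT_ROW))  # TCPYCRKTQEIL
```

Gap characters ('-') are passed through unchanged, so the expanded row keeps the alignment columns of the original figure.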

DNA polymerases, that is: [I,L~-~E,D]-X-[E,D~-X-X-[Y,F~-~-~-EL,Y]-[~,L,M~-[M,L]-X-XK-K-[&RI-Y.

The products of the K3L, K8R and H8R genes, together with some other viral proteins, are likely to be factors necessary for the transcription of the late orthopoxvirus genes (Keck et al., 1990; Moss et al., 1991; Wright et al., 1991). K3L and H8R are small, highly conserved proteins (Fig. 2), whereas K8R is larger. K8R contains the nucleotide-binding domain (Fig. 7); the differences in this protein between orthopoxvirus strains are concentrated in the N- and C-terminal regions, whereas the central region is highly conserved. The COP gene I8R (VAR K8R) has recently been shown to encode an RNA helicase (Shuman, 1992).

The Q2L protein exhibits marked homology with the glutaredoxin of eukaryotic cells (Fig. 8). In addition to the VAR Q2L gene, we have sequenced the corresponding genes of vaccinia virus, LIVP strain, and ectromelia virus, K-1 strain. The active site of glutaredoxin, C-x-[F,Y]-C-x-x-[T,A]-[K,Q]-x-[L,I] (Johnson et al., 1991), is almost unchanged in the viral protein. Glutaredoxin (also known as thioltransferase) is a small protein which serves as an electron carrier in the glutathione-dependent synthesis of deoxyribonucleotides by ribonucleotide reductase. It has recently been shown that viral glutaredoxin is synthesized after viral DNA replication and is associated with purified vaccinia virions (Ahn and Moss, 1992).

The K1L gene is essential for virus growth (Shchelkunov et al., 1993b). We have demonstrated (Ryazankina et al., 1993) that the protein product of this gene appears at the late stage of vaccinia virus development in a viroplast but is not incorporated into the virions. We suggested that this protein could either participate in the maturation of viral DNA molecules or be necessary during the first stages of assembly of viral particles. The protein is highly conserved among orthopoxviruses; no analogs of it have been found in other organisms.

The G5.5R gene, encoding a small RNA polymerase subunit (7 kDa), was found in the region between ORFs G5R and G6R of COP (Amegadzie et al., 1992). Analysis of the VAR sequence shows that ORF H5.5R is of the same size (63 amino acid residues) and has 97% homology with the COP gene.

The functions of the other proteins encoded by the analyzed region of the VAR genome are so far unknown, and no analogs could be identified among cellular proteins or the proteins of other viruses. Most of the analyzed polypeptides are likely to be characteristic only of orthopoxviruses (poxviruses). The high conservation of these proteins indicates their role in maintaining virus viability.

Notably, the amino acid sequence of the K2L protein is identical (Fig. 2) in variola virus, India strain, and in vaccinia virus Copenhagen (Goebel et al., 1990), WR (Schmitt and Stunnenberg, 1988) and TIAN.TAN (Oi et al., 1988) strains. This is the first case of full conservation of a protein sequence that we have observed in comparing various vaccinia virus strains and various species of orthopoxviruses.

Thus, analysis of the VAR genome HindIII C, E, R, Q, K and H fragments has demonstrated high conservation of this region, which suggests that the genes necessary for orthopoxvirus viability are concentrated in it.
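The remark that the glutaredoxin active site is almost unchanged in the viral protein can be made concrete with a small sketch. The Python code below parses the active-site pattern quoted from Johnson et al. (1991) and counts, for the best-fitting window of a sequence, how many constrained positions disagree with it. The function names are hypothetical, and the two fragments are assumptions read from the scanned Fig. 8 alignment rather than from a database entry, so the result is illustrative only.

```python
# Active-site pattern as quoted in the text; 'x' stands for any residue.
MOTIF = "C-x-[F,Y]-C-x-x-[T,A]-[K,Q]-x-[L,I]"


def motif_positions(motif):
    """Return one string of allowed residues per position ('' means any residue)."""
    positions = []
    for token in motif.split("-"):
        if token.lower() == "x":
            positions.append("")
        else:
            positions.append(token.strip("[]").replace(",", ""))
    return positions


def best_match(seq, motif=MOTIF):
    """Return (offset, mismatches) of the best-fitting window, or None if seq is too short."""
    allowed = motif_positions(motif)
    best = None
    for offset in range(len(seq) - len(allowed) + 1):
        window = seq[offset:offset + len(allowed)]
        mismatches = sum(1 for res, ok in zip(window, allowed) if ok and res not in ok)
        if best is None or mismatches < best[1]:
            best = (offset, mismatches)
    return best


# Assumed fragments (read from the scanned figure).
print(best_match("TCPFCRNALDIL"))  # variola Q2L fragment -> (1, 1)
print(best_match("TCPYCRKTQEIL"))  # rabbit glutaredoxin fragment -> (1, 0)
```

Under this reading of the figure, the rabbit fragment fits the pattern exactly, while the variola fragment differs at a single constrained position (the [K,Q] site), which is consistent with the description "almost unchanged".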

Acknowledgements We would like to thank Dr. V.Ye. Chizhikov, V.V. Gutorov, and P.F. Safronov for sequencing of DNA fragments.

References
Ahn, B.-Y., Gershon, P.D., Jones, E.V. and Moss, B. (1990) Identification of rpo30, a vaccinia virus RNA polymerase gene with structural similarity to a eucaryotic transcription elongation factor. Mol. Cell. Biol. 10, 5433-5441.
Ahn, B.-Y. and Moss, B. (1992) Glutaredoxin homolog encoded by vaccinia virus is a virion-associated enzyme with thioltransferase and dehydroascorbate reductase activities. Proc. Natl. Acad. Sci. USA 89, 7060-7064.
Amegadzie, B.Y., Ahn, B.-Y. and Moss, B. (1992) Characterization of a 7-kilodalton subunit of vaccinia virus DNA-dependent RNA polymerase with structural similarities to the smallest subunit of eukaryotic RNA polymerase II. J. Virol. 66, 3003-3010.
Argos, P. (1988) A sequence motif in many polymerases. Nucleic Acids Res. 16, 9909-9916.
Baer, R., Bankier, A.T., Biggin, M.D., Deininger, P.L., Farrell, P.J., Gibson, T.J., Hatfull, G., Hudson, G.S., Satchwell, S.C., Seguin, C., Tuffnell, P.S. and Barrell, B.G. (1984) DNA sequence and expression of the B95-8 Epstein-Barr virus genome. Nature 310, 207-211.
Bairoch, A. and Claverie, J.-M. (1988) Sequence patterns in protein kinases. Nature 331, 22.
Binns, M.M., Stenzler, L., Tomley, F.M., Campbell, J. and Boursnell, M.E.G. (1987) Identification by a random sequencing strategy of the fowlpox virus DNA polymerase gene, its nucleotide sequence and comparison with other viral DNA polymerases. Nucleic Acids Res. 15, 6563-6573.
Blasco, R. and Moss, B. (1991) Extracellular vaccinia virus formation and cell-to-cell virus transmission are prevented by deletion of the gene encoding the 37,000-dalton outer envelope protein. J. Virol. 65, 5910-5920.
Blasco, R. and Moss, B. (1992) Role of cell-associated enveloped vaccinia virus in cell-to-cell spread. J. Virol. 66, 4170-4179.
Boulet, A., Simon, M., Faye, G., Bauer, G.A. and Burgers, M.J. (1989) Structure and function of the Saccharomyces cerevisiae CDC2 gene encoding the large subunit of DNA polymerase III. EMBO J. 8, 1849-1854.
Boursnell, M., Shaw, K., Yanez, R.J., Vinuela, E. and Dixon, L. (1991) The sequences of the ribonucleotide reductase genes from African swine fever virus show considerable homology with those of the orthopoxvirus, vaccinia virus. Virology 184, 411-416.
Broyles, S.S. and Pennington, M.J. (1990) Vaccinia virus gene encoding a 30-kilodalton subunit of the viral DNA-dependent RNA polymerase. J. Virol. 64, 5376-5382.
Buller, M.L. and Palumbo, G.J. (1991) Poxvirus pathogenesis. Microbiol. Rev. 55, 80-122.
Caras, I.W., Levinson, B.B., Fabry, M., Williams, S.R. and Martin, D.W. (1985) Cloned mouse ribonucleotide reductase subunit M1 cDNA reveals amino acid sequence homology with Escherichia coli and herpes virus ribonucleotide reductases. J. Biol. Chem. 260, 7015-7022.
Chang, H.-W., Watson, J.C. and Jacobs, B.L. (1992) The E3L gene of vaccinia virus encodes an inhibitor of the interferon-induced, double-stranded RNA-dependent protein kinase. Proc. Natl. Acad. Sci. USA 89, 4825-4829.
Davison, A.J. and Scott, J.E. (1986) A complete DNA sequence of varicella-zoster virus. J. Gen. Virol. 67, 1759-1816.
Earl, P.L., Jones, E.V. and Moss, B. (1986) Homology between DNA polymerase of poxviruses, herpesviruses, and adenoviruses: Nucleotide sequence of the vaccinia virus DNA polymerase gene. Proc. Natl. Acad. Sci. USA 83, 3659-3663.
Earl, P.L. and Moss, B. (1990) Vaccinia virus. In: Locus Maps Complex Genomes, pp. 138-218. Cold Spring Harbor Laboratory, Cold Spring Harbor, NY.
Fathi, Z. and Condit, R.C. (1991a) Genetic and molecular biological characterization of vaccinia virus temperature-sensitive complementation group affecting a virion component. Virology 181, 258-272.
Fathi, Z. and Condit, R.C. (1991b) Phenotypic characterization of a vaccinia virus temperature-sensitive complementation group affecting a virion component. Virology 181, 273-276.
Gan, Z.-R. and Wells, W.W. (1987) The primary structure of pig liver thioltransferase. J. Biol. Chem. 262, 6699-6703.
Gan, Z.-R., Polokoff, M.A., Jacobs, J.W. and Sardana, M.K. (1990) Complete amino acid sequence of yeast thioltransferase (glutaredoxin). Biochem. Biophys. Res. Commun. 168, 944-951.
Gershon, P.D., Ahn, B.-Y., Garfield, M. and Moss, B. (1991) Poly(A) polymerase and a dissociable polyadenylation stimulatory factor encoded by vaccinia virus. Cell 66, 1269-1278.
Goebel, S.J., Johnson, G.P., Perkus, M.E., Davis, S.W., Winslow, J.P. and Paoletti, E. (1990) The complete DNA sequence of vaccinia virus. Virology 179, 247-266.
Golini, F. and Kates, J.R. (1984) Transcriptional and translational analysis of a strongly expressed early region of the vaccinia virus genome. J. Virol. 49, 459-470.
Gorbalenya, A.E., Blinov, V.M., Donchenko, A.P. and Koonin, E.V. (1989) An NTP-binding motif is the most conserved sequence in a highly diverged monophyletic group of proteins involved in positive strand RNA viral replication. J. Mol. Evol. 28, 256-268.
Hirashima, S., Hirai, H., Nakanishi, Y. and Natori, S. (1988) Molecular cloning and characterization of cDNA for eukaryotic transcription factor S-II. J. Biol. Chem. 263, 3858-3863.
Hirt, P., Hiller, G. and Wittek, R. (1986) Localization and fine structure of a vaccinia virus gene encoding an envelope antigen. J. Virol. 58, 757-764.
Hopper, S., Johnson, R.S., Vath, J.E. and Biemann, K. (1989) Glutaredoxin from rabbit bone marrow. J. Biol. Chem. 264, 20438-20447.
Hurd, H.K., Roberts, C.W. and Roberts, J.W. (1987) Identification of the gene for the yeast ribonucleotide reductase small subunit and its inducibility by methyl methanesulfonate. Mol. Cell. Biol. 7, 3673-3677.
Johnson, G.P., Goebel, S.J., Perkus, M.E., Davis, S.W., Winslow, J.P. and Paoletti, E. (1991) Vaccinia virus encodes a protein with similarity to glutaredoxins. Virology 181, 378-381.
Jones, E.V., Puckett, C. and Moss, B. (1987) DNA-dependent RNA polymerase subunits encoded within the vaccinia virus genome. J. Virol. 61, 1765-1771.
Kamps, M.P., Taylor, S.S. and Sefton, B.M. (1984) Direct evidence that oncogenic tyrosine kinases and cyclic AMP-dependent protein kinase have homologous ATP-binding sites. Nature 310, 589-592.
Keck, J.G., Baldick, C.J. and Moss, B. (1990) Role of DNA replication in vaccinia virus gene expression: A naked template is required for transcription of three late trans-activator genes. Cell 61, 801-809.
Kouzarides, T., Bankier, A.T., Satchwell, S.C., Weston, K., Tomlinson, P. and Barrell, B.G. (1987) Sequence and transcription analysis of the human cytomegalovirus DNA polymerase gene. J. Virol. 61, 125-133.
Kouzarides, T. and Ziff, E. (1988) The role of the leucine zipper in the fos-jun interaction. Nature 336, 646-651.
Landschulz, W.H., Johnson, P.F. and McKnight, S.L. (1988) The leucine zipper: A hypothetical structure common to a new class of DNA binding proteins. Science 240, 1759-1764.
Marshall, T.K., Guo, H. and Price, D.H. (1990) Drosophila RNA polymerase II elongation factor DmS-II has homology to mouse S-II and sequence similarity to yeast PPR2. Nucleic Acids Res. 18, 6293-6298.
Maxam, A. and Gilbert, W. (1980) Sequencing end-labeled DNA with base-specific chemical cleavages. Methods Enzymol. 65, 499-560.
Meis, R.J. and Condit, R.C. (1991) Genetic and molecular biological characterization of vaccinia virus gene which renders the virus dependent on isatin-β-thiosemicarbazone (IBT). Virology 182, 442-454.
Moss, B., Rosenblum, E.N. and Gershowitz, A. (1975) Characterization of a polyriboadenylate polymerase from vaccinia virions. J. Biol. Chem. 250, 4722-4729.
Moss, B., Ahn, B.-Y., Amegadzie, B., Gershon, P.D. and Keck, J.G. (1991) Cytoplasmic transcription system encoded by vaccinia virus. J. Biol. Chem. 266, 1355-1358.
Nillson, O., Lundqvist, T., Hahne, S. and Sjoberg, B.-M. (1988) Structure-function studies of the large subunit of ribonucleotide reductase from Escherichia coli. Biochem. Soc. Trans. 16, 91-94.
Nordlund, P., Sjoberg, B.-M. and Eklund, H. (1990) Three-dimensional structure of the free radical protein of ribonucleotide reductase. Nature 345, 593-598.
Oi, J., Yue, C., Jiangguang, L., Zhiliang, L., Dongyan, J. and Yunde, H. (1988) 25 kb nucleotide sequence of the genome of Chinese vaccinia virus vaccine strain (TIAN.TAN) and comparison with non-vaccine strain (WR). Ping Tu Hsueh Pao 4, 285-304.
Papayannopoulos, I.A., Gan, Z.-R., Wells, W.W. and Biemann, K. (1989) A revised sequence of calf thymus glutaredoxin. Biochem. Biophys. Res. Commun. 159, 1448-1454.
Pizzagalli, A., Valsasnini, P., Plevani, P. and Lucchini, G. (1988) DNA polymerase I gene of Saccharomyces cerevisiae: Nucleotide sequence, mapping of a temperature-sensitive mutation, and protein homology with other DNA polymerases. Proc. Natl. Acad. Sci. USA 85, 3772-3776.
Quinn, J.P. and McGeoch, D.J. (1985) DNA sequence of the region in the genome of herpes simplex virus type 1 containing the genes for DNA polymerase and the major DNA binding protein. Nucleic Acids Res. 13, 8143-8163.
Ryazankina, O.I., Muravlev, A.I., Gutorov, V.V., Mikrjukov, N.N. and Shchelkunov, S.N. (1993) Comparative analysis of conservative area of orthopoxvirus genomes coding 36K and 12K proteins. Virus Res. 29.
Schmitt, J.F.C. and Stunnenberg, H.G. (1988) Sequence and transcriptional analysis of the vaccinia virus HindIII I fragment. J. Virol. 62, 1889-1897.
Shchelkunov, S.N., Marennikova, S.S., Totmenin, A.V., Blinov, V.M., Chizhikov, V.V., Gutorov, V.V., Safronov, P.F., Pozdnyakov, S.G., Shelukhina, E.M., Gashnikov, P.V., Andzhaparidze, O.G. and Sandakhchiev, L.S. (1991) Construction of clonoteques of fragments of smallpox virus DNA and structure-function investigation of viral host range genes. Dokl. Akad. Nauk. 321, 402-406 (in Russian).
Shchelkunov, S.N., Blinov, V.M., Totmenin, A.V., Marennikova, S.S., Kolykhalov, A.A., Frolov, I.V., Chizhikov, V.E., Gutorov, V.V., Gashnikov, P.V., Belanov, E.F., Belavin, P.A., Resenchuk, S.M., Andzhaparidze, O.G. and Sandakhchiev, L.S. (1993a) Nucleotide sequence analysis of variola virus HindIII M, L, I genome fragments. Virus Res. 27, 25-35.
Shchelkunov, S.N., Ryazankina, O.I. and Gashnikov, P.V. (1993b) The gene encoding the late nonstructural 36K protein of vaccinia virus is essential for virus reproduction. Virus Res. 28, 273-283.
Shuman, S. (1992) Vaccinia virus RNA helicase: An essential enzyme related to the DE-H family of RNA-dependent NTPases. Proc. Natl. Acad. Sci. USA 89, 10935-10939.
Slabaugh, M.B., Roseman, N., Davis, R. and Matthews, C. (1988) Vaccinia virus-encoded ribonucleotide reductase: Sequence conservation of the gene for the small subunit and its amplification in hydroxyurea-resistant mutants. J. Virol. 62, 519-527.
Tengelsen, L.A. and Hruby, D.E. (1989) Nucleotide sequence and transcriptional studies of the vaccinia virus KpnI I DNA fragment. Virus Genes 3, 175-187.
Tengelsen, L.A., Slabaugh, M.B., Bibler, J.K. and Hruby, D.E. (1988) Nucleotide sequence and molecular genetic analysis of the large subunit of ribonucleotide reductase encoded by vaccinia virus. Virology 164, 121-131.
Thelander, M. and Thelander, L. (1989) Molecular cloning and expression of the functional gene encoding the M2 subunit of mouse ribonucleotide reductase: a new dominant marker gene. EMBO J. 8, 2475-2479.
Traktman, P., Sridhar, P., Condit, R.C. and Roberts, B.E. (1984) Transcriptional mapping of the DNA polymerase gene of vaccinia virus. J. Virol. 49, 125-131.
Tsurumi, T., Maeno, K. and Nishiyama, Y. (1987) Nucleotide sequence of the DNA polymerase gene of herpes simplex virus type 2 and comparison with the type 1 counterpart. Gene 52, 129-137.
Vos, J.C. and Stunnenberg, H.G. (1988) Derepression of a novel class of vaccinia virus genes upon DNA replication. EMBO J. 7, 3487-3492.
Watson, J.C., Chang, H.-W. and Jacobs, B. (1991) Characterization of a vaccinia virus-encoded double-stranded RNA-binding protein that may be involved in inhibition of the double-stranded RNA-dependent protein kinase. Virology 185, 206-216.
Wittek, R., Hanggi, M. and Hiller, G. (1984) Mapping of a gene coding for a major late structural polypeptide on the vaccinia virus genome. J. Virol. 49, 371-378.
Wong, S.W., Wahl, A.F., Yuan, P.-M., Arai, N., Pearson, B.E., Arai, K., Korn, D., Hunkapiller, M.W. and Wang, T.S. (1988) Human DNA polymerase α gene expression is cell proliferation dependent and its primary structure is similar to both prokaryotic and eukaryotic replicative DNA polymerases. EMBO J. 7, 37-47.
Wright, C.F., Keck, J.G., Tsai, M.M. and Moss, B. (1991) A transcription factor for expression of vaccinia virus late genes is encoded by an intermediate gene. J. Virol. 65, 3715-3720.
Yagle, K. and McEntee, K. (1990) The DNA damage-inducible gene DIN1 of Saccharomyces cerevisiae encodes a regulatory subunit of ribonucleotide reductase and is identical to RNR3. Mol. Cell. Biol. 10, 5553-5557.
Zhang, Y. and Moss, B. (1991) Vaccinia virus morphogenesis is interrupted when expression of the gene encoding an 11-kilodalton phosphorylated protein is prevented by the Escherichia coli lac repressor. J. Virol. 65, 6101-6110.