A frog virus 3 gene codes for a protein containing the motif characteristic of the INT family of integrases


VIROLOGY 186, 693-700 (1992)

A Frog Virus 3 Gene Codes for a Protein Containing the Motif Characteristic of the INT Family of Integrases¹

JAN ROHOZINSKI AND RAKESH GOORHA²

Department of Virology and Molecular Biology, St. Jude Children’s Research Hospital, 332 North Lauderdale, P.O. Box 318, Memphis, Tennessee 38101

Received August 8, 1991; accepted October 28, 1991

The integrase (INT) family of bacteriophage-coded integrase-recombinase proteins is responsible for catalyzing strand exchange between DNA molecules and plays an important role in the DNA replication of many bacteriophages. Within the frog virus 3 (FV3) genome we have identified an open reading frame (ORF) whose deduced amino acid sequence contains a motif characteristic of the INT family of integrases-recombinases. The ORF consists of 825 bp, which code for a protein of 275 amino acids with a predicted M_r of 29,945. RNA transcribed from this ORF during virus infection was detected by Northern blot analysis; it is a delayed-early message of approximately 1100 bases. The 5’ and 3’ ends of the putative FV3 integrase-recombinase transcript were mapped. The transcriptional start site is preceded by a presumptive TATA box, and a region of hyphenated dyad symmetry is present at the 3’ end of the message. A protein with an M_r of approximately 30,500 was synthesized by a rabbit reticulocyte lysate programmed with capped runoff transcripts from the cloned gene, indicating that the ORF can be transcribed into a message coding for a viral protein. In the FV3 life cycle, DNA replication occurs in a large complex formed through the recombination of small viral DNA molecules. Thus, at this stage, DNA replication and recombination are interlinked. Resolution of concatameric DNA is required for the packaging of genomes into virus particles. The putative FV3 INT gene may be involved in one or more of these functions. © 1992 Academic Press, Inc.

INTRODUCTION

We have identified an ORF in the FV3 genome which codes for a delayed-early protein. This protein contains the sequence motif characteristic of bacteriophage DNA integrases of the INT family (Smith et al., 1990). Members of the integrase family are characterized as having only very limited similarity (the INT motif) even though they carry out similar enzymatic functions (Argos et al., 1986). These integrases are responsible for conservative recombination which is restricted to specific sites on both partner DNA molecules; only a very short region of DNA homology is required. The occurrence of the motif characteristic of integrases in an FV3 protein is surprising, as no such integrase-recombinase proteins have been previously reported in viruses infecting eucaryotic cells. However, the replication strategy displayed by FV3 is similar to that of the lambdoid and T bacteriophages (reviewed in Murti et al., 1985). The presence of a gene coding for an integrase-recombinase protein in the FV3 genome may reflect the similarities in viral DNA replication/packaging between FV3 and bacteriophages.

Frog virus 3 (FV3) is an iridovirus which was isolated from the leopard frog (Rana pipiens) (Granoff et al., 1966). It is icosahedral in structure with an inner lipid component (reviewed in Murti et al., 1985). The genome is double-stranded linear DNA of 170 kbp (Murti et al., 1982) which is circularly permuted, terminally redundant (Goorha and Murti, 1982), and highly methylated (Willis and Granoff, 1980). Synthesis and expression of viral RNA are under temporal control (reviewed in Willis et al., 1990) and virion assembly takes place within morphologically distinct regions of the cytoplasm known as virus assembly sites (Darlington et al., 1966). Viral DNA synthesis occurs in two stages. Initial synthesis of approximately unit length DNA takes place in the nucleus. About 3 hr postinfection DNA synthesis moves to the cytoplasm, where viral DNA is synthesized in a large, concatameric complex (Goorha, 1982). Concatamer synthesis in the cytoplasm is believed to occur by recombination of unit length DNA molecules into a branched concatameric structure; cytoplasmic DNA replication occurs in conjunction with recombination (Goorha and Dixit, 1984).

¹ Sequence data from this article have been deposited with the GenBank Data Libraries under Accession No. M80548.
² To whom reprint requests should be addressed.

MATERIALS AND METHODS

Cells and virus

Fathead minnow (FHM) cells for propagation of FV3 were cultured at 33° in Eagle’s minimal essential medium supplemented with 5% fetal calf serum. A clonal isolate of wild-type FV3 was grown as previously described (Naegele and Granoff, 1971).

Bacterial strains and plasmids

Escherichia coli XL1-Blue strain (Stratagene) was used to isolate and propagate the plasmids containing FV3 DNA. The plasmid used for sequencing was constructed by inserting a KpnI fragment of FV3 DNA containing the region of interest into the multiple cloning site of the plasmid pBS M13+ (Stratagene) and was designated pBS M13+ (7J). Bacteria were grown in media containing 50 µg/ml ampicillin, and plasmid was recovered by alkaline lysis and ethanol precipitation (Maniatis et al., 1982).

DNA sequencing

Both strands of the cloned KpnI fragment were sequenced using the dideoxynucleotide chain termination-based Sequenase version 2.0 DNA sequencing kit (United States Biochemical Corp.). Sequencing was done by walking along the insert using 21-base primers.

Northern blots

FHM cell monolayers were infected with FV3 (20 PFU/cell m.o.i.) in the presence of cycloheximide (100 µg/ml) or fluorophenylalanine (100 µg/ml) or in the absence of drug. At 6 hr postinfection, the total RNA was extracted using a previously described procedure (Maniatis et al., 1982). The RNA (20 µg/well) was electrophoresed through 1.2% agarose containing formaldehyde and 6.6 ng of ethidium bromide/ml (modified from Maniatis et al., 1982). After electrophoresis the gel was photographed under UV illumination, washed in several changes of water, and finally washed in 10X SSC (1X SSC = 150 mM NaCl, 15 mM Na₃ citrate, pH 7). The RNA was transferred to Nytran (Schleicher and Schuell Inc.) with 20X SSC using the procedure of Southern (1975). Blots were hybridized with 32P-labeled DNA generated by priming within the sequence of interest and extending with sequencing reaction mixture. The length of the probe was kept to approximately 150 bases by limiting the amount of dideoxynucleotide triphosphates in the reaction mixture.

Mung bean nuclease mapping

The 5’ and 3’ termini of the messenger RNA were mapped using mung bean nuclease (Murry, 1986). The DNA used for protection was either a 3’ or a 5’ 32P-labeled restriction fragment obtained from the cloned gene. The sequence for aligning the nuclease-insensitive residue was generated by making a primer which primed at the appropriate restriction site and running a sequencing reaction.

Production of RNA transcripts

The gene of interest was isolated from within the plasmid pBS M13+ (7J) by PCR amplification. At the 5’ end a primer containing the T7 promoter sequence was used to generate a PCR product which could be directly transcribed using T7 RNA polymerase. Transcripts capped with m7G(5’)ppp(5’)G were produced using a T7 RNA polymerase kit (Stratagene No. 200350). Additional cap analog was used to increase the ratio of cap analog to GTP from 0.5 (as supplied in the kit) to 10, as recommended by Yisraeli and Melton (1989).

In vitro translation

Rabbit reticulocyte lysate (Stratagene) was used to produce in vitro translation products from RNA transcripts following the procedure supplied with the kit. The translational product was analyzed by SDS-PAGE containing 15% acrylamide using the Laemmli (1970) buffer system.

Data analysis

Sequence analysis and data bank searches were done on a VAX 8350 computer using the IntelliGenetics Suite version 5.37 program (IntelliGenetics, Inc.).

RESULTS

Sequence of the ORF

As part of an ongoing search for genes involved in viral DNA replication, a KpnI fragment approximately 5 kbp long located within the HindIII D fragment of the FV3 genome (Fig. 1) (Willis et al., 1985) was cloned into the multiple cloning site of the plasmid pBS M13+ and sequenced in both directions using the dideoxy method (Tabor and Richardson, 1987). A large ORF was identified and, when the amino acid sequence was deduced, it was found to contain the motif common to the INT family of integrases. The ORF is within the 1113-bp region shown in Fig. 2 and there is only one ORF present. There is a methionine codon translational start site at position 17. The ORF ends with a TGA termination codon at position 842. We conclude that the coding region of the ORF is 825 bp long, specifying a protein of 275 amino acids with a calculated M_r of 29,945.
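The coordinates quoted above can be checked with a few lines of arithmetic. The sketch below is only an illustration and not part of the original analysis: it assumes the positions are 1-based and that position 842 is the first base of the stop codon. The 110 Da average residue mass is a common rule of thumb introduced here for comparison, not the composition-based calculation behind the reported value of 29,945.

```python
# Coordinates as reported in the text (assumed 1-based positions in the Fig. 2 sequence).
START_CODON_POS = 17   # first base of the ATG start codon
STOP_CODON_POS = 842   # assumed to be the first base of the TGA stop codon

# Assumption: ~110 Da is a rule-of-thumb average residue mass, used only as a rough check.
AVG_RESIDUE_MASS_DA = 110


def orf_summary(start: int, stop: int) -> dict:
    """Return the coding length (bp), residue count, and a rough mass estimate."""
    coding_bp = stop - start  # bases from the start codon up to, but excluding, the stop codon
    if coding_bp % 3 != 0:
        raise ValueError("coding region is not a whole number of codons")
    residues = coding_bp // 3
    return {
        "coding_bp": coding_bp,
        "residues": residues,
        "approx_mass_da": residues * AVG_RESIDUE_MASS_DA,
    }


summary = orf_summary(START_CODON_POS, STOP_CODON_POS)
print(summary)
```

Under these assumptions the coordinates give 825 coding bases and 275 codons, and the rough mass estimate falls close to the calculated M_r.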

FIG. 1. Location of the cloned KpnI fragment relative to the whole FV3 genome. The KpnI fragment (G1) containing the putative integrase gene is located within the HindIII C fragment. The KpnI G1 fragment was cut out of the HindIII C fragment and cloned into the plasmid pBS M13+ to give the plasmid pBS M13+ (7J). Also shown is the position of the ORF. A single SalI site is located within the gene.

FIG. 2. The complete nucleotide and deduced amino acid sequences of the FV3 gene. The methionine translational start site is boxed, as is the stop codon at the 3’ end of the sequence. The amino acid residues which make up the functional motif characteristic of the INT family of integrases are underlined.

Analysis of mRNA

To determine if this ORF is transcribed in FV3-infected cells, and to identify the time of its transcription, a Northern blot analysis was done using total RNA extracted from FV3-infected fathead minnow cells. Expression of FV3 genes is under temporal control, and three classes of transcripts, namely immediate-early, delayed-early, and late, can be identified by the use of drugs (reviewed in Willis et al., 1985). FV3-infected cells treated with cycloheximide produce only immediate-early messages, which do not require de novo synthesis of viral proteins for transcription. Delayed-early as well as immediate-early mRNA transcripts are present in FV3-infected cells treated with fluorophenylalanine, which blocks late message synthesis. Infected cells without drug treatment contain late messages in addition to the previously described classes of mRNAs. Radiolabeled probe was made using a 21-base primer complementary to the ORF (bases 288-308, Fig. 2) priming within the cloned gene and extended using the sequencing reaction. This probe was hybridized to Northern blots of RNA from FV3-infected cells. No hybridization was detected with RNA from cycloheximide-treated FV3-infected cells (Fig. 3, lane 1). However, a single transcript of about 1100 nucleotides was detected in RNA from both fluorophenylalanine-treated and untreated cells (Fig. 3, lanes 2 and 3), indicating that the ORF was transcribed as a delayed-early message.

FIG. 3. The putative INT gene is transcribed in FV3-infected cells. Total RNA from FV3-infected cells was subjected to Northern blot analysis and the message of interest was identified by probing with 32P-labeled DNA generated from within the cloned gene. Lane 1, RNA from FV3-infected cells treated with cycloheximide; lane 2, RNA from FV3-infected cells treated with fluorophenylalanine; and lane 3, RNA from FV3-infected cells. Twenty micrograms of RNA was loaded per lane. The positions of RNA size markers and their sizes (kb) are indicated on the right. The absence of detectable message in the cycloheximide-treated cells and its presence in fluorophenylalanine-treated and untreated cells indicate that the message falls into the delayed-early class of FV3 messages.

In vitro translation

To check if the ORF is recognized as an authentic message, capped RNA transcripts were made using the PCR-amplified product from the gene contained in the plasmid pBS M13+ (7J). This strategy was used because repeated attempts to clone the gene alone in either pBS M13+ or pBR322 were unsuccessful. The PCR reaction utilized a primer containing the T7 promoter sequence and started 101 bases upstream of the translational start site. Stop codons in all three reading frames upstream of the initiation methionine ensure that any translational product is correctly initiated. At the 3’ end a primer of 21 bases, terminating 450 bases downstream of the stop codon, was used. The resultant PCR product containing the T7 promoter was used directly for RNA synthesis with T7 RNA polymerase. Transcripts were added to rabbit reticulocyte lysate in the presence of [35S]methionine and the products of protein synthesis were resolved in an SDS-polyacrylamide gel. A single translation product was observed (Fig. 4) with an estimated mass of 30,500 Da. The molecular weight of the in vitro-synthesized protein is in agreement with the expected molecular mass calculated from the deduced amino acid composition of the ORF (29,945 Da). These data indicate that the transcript synthesized from the region containing the ORF is recognized as an authentic message.

FIG. 4. SDS-PAGE analysis of the in vitro translation products from capped RNA runoff transcripts generated from the PCR-amplified gene. Lane 1, the translation product from the runoff transcript; lane 2, translation products in the absence of added RNA. Positions of molecular weight markers are shown to the right. A molecular mass of approximately 30,500 Da was estimated for the in vitro product.

Regulatory sequences

The position of transcriptional initiation and termination was determined so that regulatory sequences flanking the ORF could be analyzed. The 5’ and 3’ termini of the mRNA were determined with the use of mung bean nuclease. The 5’ end of the message was well defined; however, the 3’ end showed minor microheterogeneity (Fig. 5). The 5’ end was located 16 bases upstream of the translational initiation codon and the 3’ end was 269 bases downstream from the TGA termination codon (Fig. 2). A TATA box-like sequence is located starting 47 bases before the transcriptional start site and consists of the sequence ATAAA (Fig. 6). There is no CCAAT promoter site at the 5’ end. The CCAAT promoter is found in many eukaryotic genes and functions as a binding site for DNA-dependent RNA polymerase II. It is present in the promoter regions of some immediate-early genes of FV3 (Beckman et al., 1988). The mRNAs from FV3 are not poly(A) tailed (Willis and Granoff, 1976) and have no polyadenylation signal at the 3’ end. However, there are other sequences present which may serve as transcriptional termination signals. The presence of regions of hyphenated dyad-like symmetry at the 3’ termini of immediate-early and late FV3 transcripts has been previously reported by Aubertin et al. (1989). We have also found a similar region of dyad symmetry near the 3’ end of the ORF. This sequence between bases 1041 and 1073 (Fig. 6) could base-pair to form a hairpin structure and may be analogous to the hyphenated dyad symmetry present at the 3’ termination sites of procaryotic messages, which acts as a transcriptional termination signal (Watson et al., 1987).

FIG. 5. Determination of the 5’ and 3’ termini of the putative INT gene transcript. The 5’ and 3’ termini of the mRNA were determined with the use of mung bean nuclease. The left panel shows the sequence of the noncoding complementary strand for the 5’ end and the nuclease-resistant fragment in the right lane. The panel on the right shows the sequence in the region of the 3’ terminus; two major nuclease-resistant fragments are visible in the right lane.

FIG. 6. The regulatory sequence flanking the coding region of the gene. In the 5’ region the presumptive TATA motif is boxed and the transcriptional start site is underlined. In the 3’ region solid arrowheads indicate the transcriptional stop sites. Opposing arrows underline the hyphenated dyad-like symmetry, and the center of the potential loop is indicated by a dot between the arrows.

FIG. 7. Alignment of the functional motifs of some known integrases and that of the FV3 protein. The sequence identifier names are given on the right and the numbers in parentheses indicate the positions of the histidine residue which delineates the functional motif within the various integrases. The sequences marked with an asterisk were obtained from Argos et al. (1986). The other data for known integrases and the underlined stringent sequence motif in the bottom line were obtained from Smith et al. (1990). Integrase (INT) genes used in this comparison were from bacteriophages 186, φ80, P2, P4, and λ (LAM). The Cre protein, which is responsible for conserved recombination and excision, was from bacteriophage P1. The FLP protein sequence was from the site-specific recombinase (FLP) encoded by the yeast plasmid 2 µm circle from Saccharomyces cerevisiae.

Homology with other proteins and INT motif

A computer search of the Swiss-Prot and PIR data banks was undertaken to identify proteins with similarity to that coded for by the ORF. The search, using FASTDB (IntelliGenetics, Inc.), revealed no proteins which had a significant level of homology with that encoded by the ORF. However, when searching for motifs known to be associated with catalytic or functional properties of proteins, the presence of the motif characteristic of the INT family of integrases was observed (Fig. 7). A sequence motif HxxRx(21)G, characteristic of all bacteriophage integrases, has recently been established by Smith et al. (1990). This motif had been previously recognized within a region of similarity between the lambdoid bacteriophages and FLP integrase of yeast (Argos et al., 1986). Argos et al. (1986) compared the amino acid sequences of seven integrases from lambdoid phages and the yeast 2 µm plasmid FLP protein and found very little amino acid sequence homology within this group except for the residues which comprise the characteristic INT family motif. The alignment between the INT motif within the translated ORF and several other integrases from the INT family is shown in Fig. 7. Also shown on the bottom line of the figure is the INT motif and the most common amino acids found in the conservatively substituted 40 amino acid residue region delineated by the INT motif.

Of particular interest is the tyrosine residue near the carboxyl end of this region. This tyrosine residue is known to be required for covalent coupling of the INT protein to DNA during the recombination reaction (reviewed in Craig, 1988). The position of this tyrosine is imperfectly conserved (Fig. 7) and it is also absent from the FV3 ORF; however, several serine residues are present in the FV3 ORF in the general region where a tyrosine residue is normally found. Serine can have a catalytic function similar to that of tyrosine, as demonstrated by the resolvases involved in transposon cointegrate resolution. These resolvases are part of a recombination system of the transposable element 3 (Tn3) family and catalyze recombination via a DNA-serine linkage (Reed and Moser, 1984).

DISCUSSION

In this paper we report the sequence of a gene from frog virus 3 coding for a protein containing the sequence motif characteristic of the integrase-resolvase family of site-specific recombinases. The deduced amino acid sequence of the ORF indicates that the FV3 protein consists of 275 amino acids with a calculated M_r of 29,945. In vitro translation using RNA transcribed from a PCR product of the cloned gene resulted in the detection of a single protein with an M_r of ~30,500. The gene is expressed as a delayed-early message of approximately 1100 bases in infected cells. There is a presumptive TATA box 47 bases upstream of the transcriptional start site. There are no promoter elements such as GC or CCAAT boxes present. These promoter elements are common to many genes transcribed by RNA polymerase II, and their absence may indicate either that the host RNA polymerase is modified upon viral infection or that transcription of delayed-early viral genes is carried out by a virus-coded RNA polymerase. Goorha (1981) and Campadelli-Fiume et al. (1975) have previously presented biochemical evidence that RNA polymerase II is modified in FV3-infected cells. There is a 33-base region of hyphenated dyad symmetry at the 3’ end of the message which may act as a transcriptional termination signal.

A complete data bank search failed to yield any proteins having global homology with the FV3 gene product, so the motif characteristic of the INT family of recombinases (Argos et al., 1986; Smith et al., 1990) is the only feature it shares with other known proteins. The INT family includes site-specific recombinases encoded by phage λ, by the λ-related phages φ80, P2, P4, P22, and 186, and by phage P1 (Cre). The FLP recombinase of the yeast 2 µm plasmid is also grouped in the INT family (Argos et al., 1986; Parsons et al., 1988). This group of enzymes carries out site-specific recombination reactions by cleavage of the substrate DNA and transient covalent attachment of the enzyme to the 3’ end of the cut. Recombination proceeds by pairwise exchange of single strands by formation and resolution of a Holliday intermediate (Craig, 1988). The amino acid residues which make up the stringent sequence motif found in bacteriophage DNA integrases are HXXRX(21)G (Fig. 7).
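The stringent motif can be written as a simple pattern: a histidine, two arbitrary residues, an arginine, 21 arbitrary residues, and a glycine. The following sketch is an illustration only and does not reproduce the data bank search tools used in this work. The test segment is the FV3 row as transcribed from the extracted Fig. 7 alignment, so individual letters may contain transcription errors; the function name and the negative-control peptide are hypothetical.

```python
import re

# Assumption: FV3 segment as transcribed from the extracted Fig. 7 alignment;
# individual residues may be affected by transcription errors.
FV3_SEGMENT = "PPENVFAALYIDSLHGGRHQIMPGFCKLSCPTLDVGPGLGRFAASHFSERVSODR"

# H, any two residues, R, any 21 residues, G: the HxxRx(21)G motif.
INT_MOTIF = re.compile(r"H..R.{21}G")


def find_int_motif(seq: str):
    """Return (0-based start index, matched span) of the first motif hit, or None."""
    match = INT_MOTIF.search(seq)
    if match is None:
        return None
    return match.start(), match.group()


hit = find_int_motif(FV3_SEGMENT)
print(hit)

# Hypothetical short peptide used as a negative control (no H..R spacing present).
print(find_int_motif("MKTAYIAKQR"))
```

In this segment the pattern matches once, beginning at the histidine of the HGGR stretch and ending at the glycine 25 residues further on, which is the spacing the motif requires.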
When this stringent motif was used to search the PIR and Swiss-Prot data banks, only integrases of the INT family were identified, thus indicating that this motif is not the result of random assortment. The FV3 ORF displays the characteristic features of this motif. Although integrases have no overall homology, the 40 amino acid residue region around the motif is conservatively substituted (Argos et al., 1986; Craig, 1988). This is also a feature of the FV3 protein, which shares a 93% homology with other integrases in this region. The conserved motif found in the INT family of integrases is known to represent the catalytic site. An interesting feature of the motif is the invariant histidine and arginine residues at positions 1 and 4 of the motif (Fig. 7). Site-directed mutagenesis in the FLP protein has established that the histidine 305 residue is required for recombination via strand exchange and religation (Parsons et al., 1988). Substitution of arginine 308 in the FLP protein eliminates or reduces substrate cleavage (Parsons et al., 1988, 1990). The FV3 protein contains both of the conserved histidine and arginine residues discussed above, in positions 53 and 56 of its amino acid sequence.

The need for an integrase-resolvase protein in FV3 replication is illustrated by the replication strategy used by FV3. Viral DNA replication occurs in two stages (Goorha, 1982). The first stage of replication involves the synthesis of approximately unit length genomic DNA in the nucleus. The second stage of DNA replication is initiated in the cytoplasm about 3 hr after infection. Cytoplasmic DNA synthesis involves the formation of a large concatameric complex by recombination of smaller genomic DNA molecules (Goorha and Dixit, 1984). These small DNA molecules also serve as primers for DNA replication via chain elongation; thus, during this latter stage, DNA replication occurs in conjunction with recombination, and DNA synthesis and recombination are interlinked. The high level of genetic recombination (Chinchar and Granoff, 1986) displayed by FV3 is consistent with DNA synthesis being dependent on recombination of genomic DNA. Packaging of genomic DNA into the capsid presumably occurs via a headful mechanism, resulting in the packaged DNA being circularly permuted and terminally redundant. However, before packaging can occur, the concatamer must be resolved so that linear genomic DNA is available for encapsidation. An enzyme having integrase-resolvase-recombinase-type activity may be required for the formation of concatameric complex replicative intermediates and/or their resolution before encapsidation. The putative FV3 INT protein may be involved in these functions.
The presence of an integrase-like gene within the genome of FV3 is surprising, since such an integrase protein has not previously been reported in animal viruses. However, FV3 shares many features of bacteriophages, particularly the lambdoid and T bacteriophages. FV3 has a highly methylated DNA; its genomic DNA is circularly permuted and terminally redundant, and its mRNAs lack poly(A) tails (Murti et al., 1985). The two-stage DNA replication involving the formation of a concatameric complex most closely resembles that of bacteriophage T4 (Goorha and Dixit, 1984). The presence of a protein having a characteristic motif of the INT family of integrase-recombinases may further reflect the similarity of FV3 to bacteriophages.

ACKNOWLEDGMENTS

We thank Ms. Ramona Tirey for expert technical assistance and Ms. Glenith D. White for typing the manuscript. This work was supported by National Institutes of Health Research Grant GM 23638 and American Lebanese Syrian Associated Charities.

REFERENCES

ARGOS, P., LANDY, A., ABREMSKI, K., EGAN, J. B., HAGGARD-LJUNGQUIST, E., HOESS, R. H., KAHN, M. L., KALIONIS, B., NARAYANA, S. V. L., PIERSON, L. S., STERNBERG, N., and LEONG, J. M. (1986). The integrase family of site-specific recombinases: Regional similarities and global diversity. EMBO J. 5, 433-440.

AUBERTIN, A. M., TONDRE, L., and THAM, T. N. (1989). Translational regulation of frog virus 3. In “Viruses of Lower Vertebrates” (W. Ahne and E. Kurstak, Eds.), pp. 51-59. Springer-Verlag, Berlin-Heidelberg.

BECKMAN, W., THAM, T. N., AUBERTIN, A. M., and WILLIS, D. B. (1988). Structure and regulation of the immediate early frog virus 3 gene that encodes ICR489. J. Virol. 62, 1271-1277.

CAMPADELLI-FIUME, G., COSTANZO, F., FOA-TOMASI, L., and LAPLACA, M. (1975). Modification of cellular RNA polymerase II after infection with frog virus 3. J. Gen. Virol. 27, 391-394.

CHINCHAR, V. G., and GRANOFF, A. (1986). Temperature-sensitive mutants of frog virus 3: Biochemical and genetic characterization. J. Virol. 58, 192-202.

CRAIG, N. L. (1988). The mechanism of conservative site-specific recombination. In “Annual Review of Genetics” (A. Campbell, B. S. Baker, and I. Herskowitz, Eds.), Vol. 22, pp. 77-105. Annual Reviews, Palo Alto, CA.

DARLINGTON, R. W., GRANOFF, A., and BREEZE, D. C. (1966). Viruses and renal carcinoma of Rana pipiens. II. Ultrastructural studies and sequential development of virus isolated from normal and tumor tissue. Virology 29, 149-156.

GOORHA, R. (1981). Frog virus 3 requires RNA polymerase II for its replication. J. Virol. 37, 496-499.

GOORHA, R. (1982). Frog virus 3 DNA replication occurs in two stages. J. Virol. 43, 519-528.

GOORHA, R., and DIXIT, P. (1984). A temperature-sensitive (ts) mutant of frog virus 3 (FV3) is defective in second-stage DNA replication. Virology 36, 186-195.

GOORHA, R., and MURTI, K. G. (1982). The genome of frog virus 3, an animal DNA virus, is circularly permuted and terminally redundant. Proc. Natl. Acad. Sci. USA 79, 248-252.

GRANOFF, A., CAME, P. E., and BREEZE, D. C. (1966). Viruses and renal carcinoma of Rana pipiens. I. The isolation and properties of virus from normal and tumor tissue. Virology 29, 133-148.

GRONOSTAJSKI, R. M., and SADOWSKI, P. D. (1985). The FLP recombinase of the Saccharomyces cerevisiae 2-μm plasmid attaches covalently to DNA via a phosphotyrosyl linkage. Mol. Cell. Biol. 5, 3274-3279.

LAEMMLI, U. K. (1970). Cleavage of structural proteins during the assembly of the head of bacteriophage T4. Nature 227, 680-685.

MANIATIS, T., FRITSCH, E. F., and SAMBROOK, J. (1982). “Molecular Cloning: A Laboratory Manual.” Cold Spring Harbor Laboratory, Cold Spring Harbor, New York.

MURRY, M. G. (1986). Use of sodium trichloroacetate and mung bean nuclease to increase sensitivity and precision during transcription mapping. Anal. Biochem. 158, 165-170.

MURTI, K. G., GOORHA, R., and CHEN, M. (1985). Interaction of frog virus 3 with the cytoskeleton. In “Current Topics in Microbiology and Immunology: Iridoviridae” (D. B. Willis, Ed.), pp. 107-131. Springer-Verlag, Berlin-Heidelberg.

MURTI, K. G., GOORHA, R., and GRANOFF, A. (1982). Structure of frog virus 3 genome: Size and arrangement of nucleotide sequences as determined by electron microscopy. Virology 116, 275-283.

MURTI, K. G., GOORHA, R., and GRANOFF, A. (1985). An unusual replication strategy of an animal iridovirus. In “Advances in Virus Research” (K. Maramorosch, F. A. Murphy, and A. J. Shatkin, Eds.), pp. 1-19. Academic Press, New York.

NAEGELE, R. F., and GRANOFF, A. (1971). Viruses and renal carcinoma of Rana pipiens. XI. Isolation of frog virus 3 temperature-sensitive mutants: Complementation and genetic recombination. Virology 44, 286-295.

PARGELLIS, C. A., NUNES-DUBY, S. E., MOITOSO DE VARGAS, L., and LANDY, A. (1988). Suicide recombination substrates yield covalent λ integrase-DNA complexes and lead to identification of the active site tyrosine. J. Biol. Chem. 263, 7678-7685.

PARSONS, R. L., EVANS, B. R., ZHENG, L., and JAYARAM, M. (1990). Functional analysis of arg-308 mutants of FLP recombinase. J. Biol. Chem. 265, 4527-4533.

PARSONS, R. L., PRASAD, P. V., HARSHEY, R. M., and JAYARAM, M. (1988). Step-arrest mutants of FLP recombinase: Implications for the catalytic mechanism of DNA recombination. Mol. Cell. Biol. 8, 3303-3310.

REED, R. R., and MOSER, C. D. (1984). Resolvase-mediated recombination intermediates contain a serine residue covalently linked to DNA. Cold Spring Harbor Symp. Quant. Biol. 49, 245-249.

SMITH, H. O., ANNAU, T. M., and CHANDRASEGARAN, S. (1990). Finding sequence motifs in groups of functionally related proteins. Proc. Natl. Acad. Sci. USA 87, 826-830.

SOUTHERN, E. (1975). Detection of specific sequences among DNA fragments separated by gel electrophoresis. J. Mol. Biol. 98, 503-517.

TABOR, S., and RICHARDSON, C. C. (1987). DNA sequence analysis with a modified bacteriophage T7 DNA polymerase. Proc. Natl. Acad. Sci. USA 84, 4767-4771.

WATSON, J. D., HOPKINS, N. H., ROBERTS, J. W., STEITZ, J. A., and WEINER, A. M. (1987). “Molecular Biology of the Gene,” fourth ed. Benjamin/Cummings, Menlo Park, CA.

WILLIS, D. B., GOORHA, R., and CHINCHAR, V. G. (1985). Macromolecular synthesis in cells infected by frog virus 3. In “Current Topics in Microbiology and Immunology: Iridoviridae” (D. B. Willis, Ed.), pp. 77-106. Springer-Verlag, Berlin-Heidelberg.

WILLIS, D., and GRANOFF, A. (1976). Macromolecular synthesis in cells infected by frog virus 3. V. The absence of polyadenylic acid in the majority of virus-specific RNA species. Virology 73, 543-547.

WILLIS, D., and GRANOFF, A. (1980). Frog virus 3 DNA is heavily methylated at CpG sequences. Virology 107, 250-257.

WILLIS, D. B., THOMPSON, J. P., and BECKMAN, W. (1990). Transcription of frog virus 3. In “Molecular Biology of Iridoviruses” (G. Darai, Ed.), pp. 173-185. Kluwer Academic, Norwell, MA.

YISRAELI, J. K., and MELTON, D. A. (1989). Synthesis of long, capped transcripts in vitro by SP6 and T7 RNA polymerases. In “Methods in Enzymology” (J. N. Abelson and M. I. Simon, Eds.), Vol. 180, pp. 42-53. Academic Press, New York.