Gene, 168 (1996) 183-187 © 1996 Elsevier Science B.V. All rights reserved. 0378-1119/96/$15.00
GENE 09418
Sequence of the mouse adenovirus type-1 DNA encoding the 100-kDa, 33-kDa and DNA-binding proteins (Animal virus; sequence comparison; early region 2; late region 4; late protein)
Angela N. Cauthen and Katherine R. Spindler Department of Genetics, University of Georgia, Athens, GA 30602-7223, USA Received by J.A. Engler: 10 April 1995; Revised/Accepted: 22 August 1995; Received at publishers: 5 October 1995
SUMMARY
The genomic nucleotide sequence for the region of 66 to 77 map units (m.u.) of mouse adenovirus type 1 (MAV-1) was determined and predicted to encode proteins homologous to the human adenovirus (Ad) 100-kDa, 33-kDa and DNA-binding proteins (DBP). The putative MAV-1 100-kDa protein has 65-70% amino-acid similarity to 100-kDa proteins from five different human Ad serotypes. The mRNA for the putative 33-kDa protein is internally spliced within the coding sequence, as are its human Ad counterparts [Oosterom-Dragon and Anderson, J. Virol. 45 (1983) 251-263]. The N-terminal region of the putative MAV-1 33-kDa protein has 41-44% similarity to two human Ad 33-kDa N-termini, and the C-terminal regions are more conserved, with 60-65% similarity. The MAV-1 DBP is predicted to be encoded in this region and was compared to six different human Ad DBP N- and C-termini. The N-termini of the MAV-1 and Ad DBP were 33-48% similar and the C-termini were 56-60% similar. The MAV-1 DBP contains conserved regions (CR) 1, 2 and 3, and it retains important residues for a putative zinc finger (Zf) motif identified in Ad DBP [Eagle and Klessig, Virology 187 (1992) 777-787]. Additional sequence features of these three proteins have also been identified.
INTRODUCTION
The organization of the MAV-1 genome is similar to that of human Ad (Song et al., 1995). The region of the MAV-1 genome from 66 to 77 m.u. is expected to encode the 100-kDa and 33-kDa proteins, which are translated from mRNAs transcribed in the rightward direction, and the DBP, which is translated from mRNA transcribed in the leftward direction. The 100-kDa protein has been well characterized in human Ad. It is translated from a late region 4 (L4) transcript, which is driven by the major late promoter (MLP), as are many virally encoded messages produced late in infection (Ginsberg, 1984). The human Ad type 2/5 (Ad2/5) 100-kDa protein is phosphorylated (Russell and Blair, 1977) and is involved in virion morphogenesis (Cepko and Sharp, 1982) and in increased translational efficiency of late viral messages (Hayes et al., 1990). The 33-kDa protein is also a phosphoprotein (Russell and Blair, 1977) translated from an L4 mRNA transcribed in the rightward direction from the MLP (Ginsberg, 1984). The 33-kDa protein is localized to the nucleus of infected cells and is proposed to be involved in virion morphogenesis (Gambke and Deppert, 1981; Oosterom-Dragon and Anderson, 1983).
Many functions have been ascribed to the DBP, which is translated from an early region-2 mRNA (Ginsberg, 1984). DBP contains two distinct functional domains. The N terminus is involved in host range determination (Brough et al., 1985), nuclear localization (Morin et al., 1989), transformation (Rice et al., 1987) and possibly replication (Brough et al., 1993). The C terminus is involved in replication (Klessig and Quinlan, 1982), DNA and RNA binding (Van der Vliet and Levine, 1973; Fowlkes et al., 1979; Cleghon and Klessig, 1986), Zn2+ binding (Eagle and Klessig, 1992 and references therein), and protection of single-stranded DNA (ssDNA) from degradation by nucleases (Nass and Frenkel, 1980). The DBP is phosphorylated on the N-terminal portion of the protein (Linne and Philipson, 1980) and it may be involved in virion assembly (Nicolas et al., 1983).
We determined the nt sequence for the region of the MAV-1 genome that is positionally equivalent to the region of the human Ad genome that encodes the 100-kDa protein, the 33-kDa protein and the DBP. The predicted 100-kDa, 33-kDa and DBP polypeptides were identified in MAV-1 by aa sequence comparison. These proteins share greater sequence similarity with the human Ad proteins than do MAV-1 early region 1, 3 and 4 proteins (Kring et al., 1992 and references therein).
Correspondence to: Dr. K.R. Spindler, Department of Genetics, University of Georgia, Life Sciences Building, Athens, GA 30602-7223, USA. Tel. (1-706) 542-8395; Fax (1-706) 542-3910; e-mail: [email protected]
Abbreviations: aa, amino acid(s); AAV, adeno-associated virus; Ad, human adenovirus(es); bp, base pair(s); CR, conserved region(s); DBP, DNA-binding protein(s); Exo, exonuclease; GCG, Genetics Computer Group (Madison, WI, USA); kb, kilobase(s) or 1000 bp; L4, late region 4; MAV-1, mouse adenovirus type 1; MLP, major late promoter; m.u., map unit(s); NLS, nuclear localization signal; nt, nucleotide(s); PAGE, polyacrylamide-gel electrophoresis; RRM, RNA-recognition motif; SDS, sodium dodecyl sulfate; ss, single-strand(ed); Zf, zinc finger(s).
SSDI 0378-1119(95)00715-6
EXPERIMENTAL AND DISCUSSION
(a) Exo III deletions and sequencing of MAV-1
Exo III deletions (Henikoff, 1984) were made in a Bluescript KS+ plasmid (Stratagene, La Jolla, CA, USA) containing the MAV-1 BamHI-HindIII fragment that encompasses 64 to 77 m.u. of the MAV-1 genome. Exo III deletions were made in the BamHI to HindIII direction, and these deletion clones were used to make ssDNA for sequencing. To sequence the other strand of the DNA, the BamHI-HindIII fragment was cloned into Bluescript SKII+ plasmid (Stratagene). Exo III deletions were then made in the HindIII to BamHI direction, and the resulting clones were used to make ssDNAs. The DNAs were sequenced by dideoxy chain termination (Sanger et al., 1977) using Sequenase 2.0 (US Biochemical, Cleveland, OH, USA), and the sequenced fragments were assembled using the Gel program version 5.4 from Intelligenetics.
(b) Computer analysis of the MAV-1 sequence
Computer analyses were performed on the MAV-1 sequence using version 8.0 of the GCG package (Devereux et al., 1984) and version 5.4 of Intelligenetics. Protein database searches were performed using FASTA and BLAST of GCG. Pairwise comparisons and multiple sequence alignments were obtained using GAP and PILEUP, respectively (GCG).
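The percent identity and percent similarity values reported from pairwise comparisons can be illustrated with a minimal Python sketch. This is not the GAP program: GAP derives similarity from an amino-acid comparison table, whereas the residue groups in SIMILAR_GROUPS below are a hypothetical simplification chosen for illustration, and the function names are likewise assumptions. The sketch scores two sequences that have already been aligned, ignoring columns that contain a gap.

```python
from __future__ import annotations

# Hypothetical conservative-substitution groups (an assumption for illustration;
# GAP scores similarity from a comparison table, which is not reproduced here).
SIMILAR_GROUPS = ("ILVM", "FYW", "KRH", "DE", "NQ", "ST", "AG")


def is_similar(a: str, b: str) -> bool:
    """Return True if residues a and b are identical or share a group."""
    if a == b:
        return True
    return any(a in group and b in group for group in SIMILAR_GROUPS)


def percent_identity_similarity(seq1: str, seq2: str) -> tuple[float, float]:
    """Score two pre-aligned, equal-length sequences ('-' marks a gap).

    Returns (percent identity, percent similarity) over the columns in
    which both sequences carry a residue.
    """
    if len(seq1) != len(seq2):
        raise ValueError("aligned sequences must have equal length")
    pairs = [
        (a, b)
        for a, b in zip(seq1.upper(), seq2.upper())
        if a != "-" and b != "-"
    ]
    if not pairs:
        return 0.0, 0.0
    identical = sum(a == b for a, b in pairs)
    similar = sum(is_similar(a, b) for a, b in pairs)
    return 100.0 * identical / len(pairs), 100.0 * similar / len(pairs)


if __name__ == "__main__":
    # Toy aligned peptides, not taken from any adenovirus protein.
    ident, sim = percent_identity_similarity("MKV-LDE", "MRVALEQ")
    print(f"identity {ident:.1f}%, similarity {sim:.1f}%")
```

Because identical residues are counted as similar, the similarity value is never lower than the identity value, which is the pattern expected of any identity/similarity pair computed this way.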
(c) Analysis of the 100-kDa protein
Sequence comparison indicated a high degree of similarity between the putative MAV-1 100-kDa protein and the human Ad 100-kDa proteins throughout their sequences (Table I). The MAV-1 protein and those of human Ad type 40 (Ad40), Ad2, Ad5 and Ad12 share an average of 67% similar and 50% identical aa as determined by the GAP program (Table I) (Devereux et al., 1984). The extensive conservation between the 100-kDa proteins from human and mouse adenoviruses is not surprising given that the function of the 100-kDa protein is in morphogenesis of the virus (Cepko and Sharp, 1982 and references therein). The Ad2 100-kDa protein binds newly synthesized hexon proteins, and it is thought to assemble them into trimers which are later used in assembling the capsid (Cepko and Sharp, 1982). The Ad5 100-kDa protein has been shown to bind RNA (Riley and Flint, 1993 and references therein) and to increase the translational efficiency of late viral mRNAs (Hayes et al., 1990). There is evidence that the 100-kDa protein domains which bind mRNA and hexon are different (Riley and Flint, 1993). The Ad2/5 100-kDa proteins exhibit all four subdomains ascribed to an RNA recognition motif (RRM) (Hayes et al., 1990). Although the putative MAV-1 100-kDa protein has sequence similarity to the Ad2/5 100-kDa proteins, its similarity to the putative RRM is minimal.
(d) Analysis of the 33-kDa protein
The hypothesis that the human Ad 33-kDa protein functions in virion morphogenesis is based on its late transcription (Ginsberg, 1984), absence in mature virions (Edvardsson et al., 1976) and presence in 'top components', which are empty capsids (Edvardsson et al., 1976). However, others do not find the 33-kDa protein in top components (J. Weber, personal communication). The 33-kDa protein was first identified as the 39-kDa protein (Russell and Blair, 1977) because it migrates anomalously in SDS-PAGE.
Therefore, the protein is referred to as both the 33-kDa and the 39-kDa protein in the literature. The Ad2 33-kDa protein mRNA is spliced within the coding sequence (Oosterom-Dragon and Anderson, 1983). The same is true for MAV-1. One mock-infected and two infected cell mRNA samples were treated with RQ1 DNase (Promega, Madison, WI, USA), reverse transcribed with AMV reverse transcriptase (US Biochemical) and amplified by PCR, using two different sets of PCR primers flanking the splice junction. Products unique to the infected cells and of the predicted size based on hypothesized 33-kDa mRNA splicing were isolated and sequenced (data not shown). In all cases the sequence indicated a splice donor at nt 3215 (accession No. U23770) and a splice acceptor at nt 116 (accession No. M30594).
The similarity and identity computations for the putative MAV-1 33-kDa protein are shown in Table I. The putative MAV-1 33-kDa N- and C-termini were compared with the corresponding termini of the 33-kDa proteins of the human Ad. The boundaries between the termini were arbitrarily made at the splice junction of the proteins. As with the Ad 33-kDa proteins, there is higher aa similarity in the C-terminal end of the MAV-1 protein. The higher degree of conservation in the C terminus of these 33-kDa proteins may be related to their function. A potential consensus glycosylation site in the C terminus of the Ad2 33-kDa protein is conserved in each of the adenovirus proteins compared, including MAV-1 (Oosterom-Dragon and Anderson, 1983). Glycosylation of the 33-kDa proteins along with phosphorylation and high proline and glutamic acid content may explain the anomalous migration rate at 39 kDa for this Ad2 protein upon SDS-PAGE (Oosterom-Dragon and Anderson, 1983).

TABLE I
Comparison of MAV-1 proteins to human Ad proteins (a)

                             Entire protein (d)          N-termini (b,c,d)           C-termini (b,c,d)
Protein                      % Identity   % Similarity   % Identity   % Similarity   % Identity   % Similarity
100-kDa (Ad2,5,12,40)        49-53 (50)   65-70 (67)
33-kDa (b) (Ad2,40)          39, 34       52, 51         26, 19       41, 44         53, 48       65, 60
DBP (c) (Ad2,4,5,7,12,40)    32-35 (34)   52-57 (54)     17-26 (21)   33-48 (39)     37-40 (38)   57-60 (58)

(a) Pairwise comparisons of the MAV-1 proteins to human adenovirus proteins were accomplished using the default parameters of the GAP program from GCG. Arithmetic means of the similarity scores are shown in parentheses. The GenBank accession number for this sequence is U23770. Amino-acid sequence comparisons of human Ad to MAV-1 proteins were accomplished using GCG PILEUP and BOXSHADE and may be obtained upon request from the corresponding author (K.R.S.).
(b) The splice junction of each 33-kDa protein was arbitrarily used as the division between the N- and C-terminus of each protein.
(c) The N-termini for the DBPs are from 1-170 aa (Ad5 numbering). The C-termini for the DBPs are from 171-592 aa (Ad5 numbering) (Kitchingman, 1985).
(d) Identical or similar aa, when compared to the MAV-1 protein as determined by the GAP program from GCG.
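The reasoning behind the RT-PCR test for splicing in section (d) is that primers flanking the junction give a shorter product from spliced mRNA than from unspliced template, by exactly the intron length. A minimal Python sketch of that arithmetic follows. The function name and all coordinates are hypothetical, and the sketch assumes that the donor is the last exon nt before the intron and the acceptor the first exon nt after it, on a single coordinate system. The positions reported above (nt 3215 and nt 116) come from two different accession entries and so cannot be substituted directly.

```python
from __future__ import annotations


def amplicon_sizes(fwd_start: int, rev_end: int,
                   donor: int, acceptor: int) -> tuple[int, int]:
    """Return (unspliced, spliced) PCR product lengths in bp.

    All arguments are 1-based positions on one genomic sequence:
    fwd_start -- position of the 5' end of the forward primer
    rev_end   -- position matched by the 5' end of the reverse primer
    donor     -- last exon nt before the intron (assumed convention)
    acceptor  -- first exon nt after the intron (assumed convention)
    """
    if not fwd_start <= donor < acceptor <= rev_end:
        raise ValueError("primers must flank the splice junction")
    unspliced = rev_end - fwd_start + 1
    intron_length = acceptor - donor - 1
    return unspliced, unspliced - intron_length


if __name__ == "__main__":
    # Hypothetical coordinates, chosen only to show the calculation.
    unspliced, spliced = amplicon_sizes(100, 600, 250, 451)
    print(f"unspliced {unspliced} bp, spliced {spliced} bp")
```

A product of the spliced size that appears only in infected-cell samples is then the candidate to isolate and sequence across the junction.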
(e) Analysis of the DNA-binding protein
The DBPs encoded by human Ad are multifunctional proteins containing two distinct functional domains (Klessig and Quinlan, 1982). The N-termini of DBPs are not very well conserved among human Ad serotypes while the C-terminal domains are well conserved (Kitchingman, 1985). The putative MAV-1 DBP is predicted to be translated from mRNA transcribed in the leftward direction, as are DBPs from human Ad. The sequence of MAV-1 DBP (461 aa) is shorter than the DBPs from human Ad (473 to 592 aa). For both the human Ad and MAV-1 DBPs, size differences are due mostly to deletions in the N-terminal regions compared to Ad5 (Kitchingman, 1985 and references therein). Overall sequence similarity between MAV-1 DBP and human Ad DBPs is 54%, while between the N-termini it is 37% and between the C-termini it is 58% (Table I). Since the N terminus of the Ad5 DBP is involved in host range determination (Brough et al., 1985, and references therein) and Ad are species-specific in their host infectivity, low aa similarity is not unexpected between human and mouse DBP N-termini. The N-terminal region of the DBP is thought to contain a nuclear localization signal (NLS) (Morin et al., 1989). Ad5 aa 42-46 and 84-89 are necessary for efficient nuclear transport of the DBP. It is difficult to determine if an NLS exists in the putative MAV-1 DBP based on aa sequence. There is aa conservation throughout the C-terminal portion of the putative DBP encoded by MAV-1. Four C-terminal regions of human Ad DBPs that are conserved are CR1, CR2, and CR3 (Quinn and Kitchingman, 1984), and a putative Zf (Eagle and Klessig, 1992 and references therein). CR1, 2 and 3 were identified in Ad2/5 as aa 178-186, 322-330 and 464-475, respectively (Quinn and Kitchingman, 1984). CR1 mutants retain wild-type adeno-associated virus (AAV) helper function and ssDNA-binding (Neale and Kitchingman, 1990 and references therein). In the MAV-1 region corresponding to CR1, two of the nine aa are identical, three have conservative changes and four are not conserved when compared to human Ad CR1s. Mutational studies
in Ad5 indicate that CR2 is a DNA-binding domain of DBP (Neale and Kitchingman, 1990). The putative MAV-1 DBP CR2 is identical to all the human Ad serotypes to which it is compared, with the exception of one aa. Notably, the basic aa, R326 in Ad2/5, which is thought to be important in DNA-binding (Quinn and Kitchingman, 1984), is conserved in MAV-1. Mutational analysis of Ad5 DBP CR3 shows that this region plays a functional role in AAV helper function and ssDNA-binding (Neale and Kitchingman, 1990). MAV-1 DBP CR3 is well conserved and the consensus basic aa, K470 in Ad2/5 (Quinn and Kitchingman, 1984), is present in MAV-1. A fourth conserved region described in the Ad2/5 DBP is a Zf motif located at aa 273-286 (Vos et al., 1988a). This motif is important for the protein's ability to bind both Zn2+ and ssDNA, implying that Zn2+ binding is required for ssDNA binding (Vos et al., 1988b; Eagle and Klessig, 1992). The DBP Zf motif is a variation of the classic finger first described for TFIIIA (Berg, 1990 and references therein). Most of the important residues in the Zf region of the putative MAV-1 DBP are conserved. The Zf portion of the motif is only five aa in MAV-1 vs. eight aa in human Ad. The 3-aa deletion in MAV-1 could potentially alter the function of this putative Zf.
Sequence comparisons of the predicted proteins of MAV-1 and human Ad in the region from 66-77 m.u. imply that the overall structure of this genome region has been conserved. Transcriptional mapping and identification of the predicted proteins in MAV-1 infected cells will be needed to confirm this hypothesis.
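The conserved regions and the N-/C-terminal boundary of the DBP in section (e) are given in Ad2/5 or Ad5 numbering (for example CR1 at aa 178-186, and the division at aa 170/171). Locating the corresponding MAV-1 residues therefore requires mapping reference residue numbers onto alignment columns. The Python sketch below shows one way to do this for a gapped pairwise alignment. The function names and the toy sequences are hypothetical; this is an illustration of the bookkeeping, not the procedure used with PILEUP.

```python
from __future__ import annotations


def column_of_residue(aligned_ref: str, residue_number: int) -> int:
    """Return the 0-based alignment column of a 1-based reference residue."""
    if residue_number < 1:
        raise ValueError("residue numbers are 1-based")
    seen = 0
    for column, char in enumerate(aligned_ref):
        if char != "-":  # gaps do not consume a residue number
            seen += 1
            if seen == residue_number:
                return column
    raise ValueError("residue number beyond end of reference")


def extract_region(aligned_ref: str, aligned_query: str,
                   start: int, end: int) -> tuple[str, str]:
    """Slice both aligned sequences over reference residues start..end.

    start and end are inclusive and use the reference numbering; the
    returned strings keep any gap characters inside the region.
    """
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must have equal length")
    if start > end:
        raise ValueError("start must not exceed end")
    first = column_of_residue(aligned_ref, start)
    last = column_of_residue(aligned_ref, end)
    return aligned_ref[first:last + 1], aligned_query[first:last + 1]


if __name__ == "__main__":
    # Toy alignment; the reference has residues M1 K2 V3 L4 D5 E6.
    ref, query = "MK--VLDE", "MRAAV-DQ"
    print(extract_region(ref, query, 3, 5))
```

With a real alignment, a call such as extract_region(ad5_aligned, mav1_aligned, 178, 186) would return the aligned CR1 segments, where ad5_aligned and mav1_aligned are placeholder names for the two aligned DBP sequences.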
ACKNOWLEDGEMENTS
We thank Steve Starling and Hannah Wooley for their preparation of deletion clones and preliminary sequencing and Clayton Beard for RNA samples from infected and uninfected cells. We also thank Julie Olszewski, Jeanne McLachlin, Tom Bureau and Michael Weise for computer assistance. We thank Carl Anderson and Joe Weber for discussions of the 33-kDa protein. We thank members of our laboratory for comments on the manuscript. This work was supported by NIH R01 AI23762 to K.R.S., who is also the recipient of an NIH Research Career Development Award.
REFERENCES
Berg, J.: Zinc fingers and other metal-binding domains. J. Biol. Chem. 265 (1990) 6513-6516.
Brough, D.E., Droguett, G., Horwitz, M.S. and Klessig, D.F.: Multiple functions of the adenovirus DNA-binding protein are required for efficient viral DNA synthesis. Virology 196 (1993) 269-281.
Brough, D.E., Rice, S.A., Sell, S. and Klessig, D.F.: Restricted changes in the adenovirus DNA-binding protein that lead to extended host range or temperature-sensitive phenotypes. J. Virol. 55 (1985) 206-212.
Cepko, C.L. and Sharp, P.A.: Assembly of adenovirus major capsid protein is mediated by a nonvirion protein. Cell 31 (1982) 407-415.
Cleghon, V.G. and Klessig, D.F.: Association of the adenovirus DNA-binding protein with RNA both in vitro and in vivo. Proc. Natl. Acad. Sci. USA 83 (1986) 8947-8951.
Devereux, J., Haeberli, P. and Smithies, O.: A comprehensive set of sequence analysis programs for the VAX. Nucleic Acids Res. 12 (1984) 387-395.
Eagle, P.A. and Klessig, D.F.: A zinc-binding motif located between amino acids 273 and 286 in the adenovirus DNA-binding protein is necessary for ssDNA binding. Virology 187 (1992) 777-787.
Edvardsson, B., Everitt, E., Jörnvall, J., Prage, L. and Philipson, L.: Intermediates in adenovirus assembly. J. Virol. 19 (1976) 533-547.
Fowlkes, D.M., Lord, S.T., Linne, T., Pettersson, U. and Philipson, L.: Interaction between the adenovirus DNA-binding protein and double-stranded DNA. J. Mol. Biol. 132 (1979) 163-180.
Gambke, C. and Deppert, W.: Late nonstructural 100,000- and 33,000-dalton proteins of adenovirus type 2. I. Subcellular localization during the course of the infection. J. Virol. 40 (1981) 585-593.
Ginsberg, H.S.: The Adenoviruses. Plenum Press, New York, NY, 1984.
Hayes, B.W., Telling, G.D., Myat, M.M., Williams, J.F. and Flint, S.J.: The adenovirus L4 100-kilodalton protein is necessary for efficient translation of viral late mRNA species. J. Virol. 64 (1990) 2732-2742.
Henikoff, S.: Unidirectional digestion with exonuclease III creates targeted breakpoints for DNA sequencing. Gene 28 (1984) 351-359.
Kitchingman, G.R.: Sequence of the DNA-binding protein of a human subgroup E adenovirus (type 4): comparison with subgroup A (type 12), subgroup B (type 7), and subgroup C (type 5). Virology 146 (1985) 90-101.
Klessig, D.F. and Quinlan, M.P.: Genetic evidence for separate functional domains on the human adenovirus specified, 72 kd, DNA binding protein. J. Mol. Appl. Genet. 1 (1982) 263-272.
Kring, S.C., Ball, A.O. and Spindler, K.R.: Transcription mapping of mouse adenovirus type 1 early region 4. Virology 190 (1992) 248-255.
Linne, T. and Philipson, L.: Further characterization of the phosphate moiety of the adenovirus type 2 DNA-binding protein. Eur. J. Biochem. 103 (1980) 259-270.
Morin, N., Delsert, C. and Klessig, D.F.: Nuclear localization of the adenovirus DNA-binding protein: requirement for two signals and complementation during viral infection. Mol. Cell. Biol. 9 (1989) 4372-4380.
Nass, K. and Frenkel, G.D.: Adenovirus-specific DNA-binding protein inhibits the hydrolysis of DNA by DNase in vitro. J. Virol. 35 (1980) 314-319.
Neale, G.A.M. and Kitchingman, G.R.: Conserved region 3 of the adenovirus type 5 DNA-binding protein is important for interaction with single-stranded DNA. J. Virol. 64 (1990) 630-638.
Nicolas, J.C., Sarnow, P., Girard, M. and Levine, A.J.: Host range temperature-conditional mutants in the adenovirus DNA binding protein are defective in the assembly of infectious virus. Virology 126 (1983) 228-239.
Oosterom-Dragon, E.A. and Anderson, C.W.: Polypeptide structure and encoding location of the adenovirus serotype 2 late, nonstructural 33K protein. J. Virol. 45 (1983) 251-263.
Quinn, C.O. and Kitchingman, G.R.: Sequence of the DNA-binding protein gene of a human subgroup B adenovirus (type 7). J. Biol. Chem. 259 (1984) 5003-5009.
Rice, S.A., Klessig, D.F. and Williams, J.: Multiple effects of the 72-kDa, adenovirus-specified DNA binding protein on the efficiency of cellular transformation. Virology 156 (1987) 366-376.
Riley, D. and Flint, S.J.: RNA-binding properties of a translational activator, the adenovirus L4 100K protein. J. Virol. 67 (1993) 3586-3595.
Russell, W.C. and Blair, G.E.: Polypeptide phosphorylation in adenovirus-infected cells. J. Gen. Virol. 34 (1977) 19-35.
Sanger, F., Nicklen, S. and Coulson, A.R.: DNA sequencing with chain-terminating inhibitors. Proc. Natl. Acad. Sci. USA 74 (1977) 5463-5467.
Song, B., Spindler, K.R. and Young, C.S.H.: Sequence of the mouse adenovirus serotype-1 DNA encoding the precursor to capsid protein VI. Gene 152 (1995) 279-280.
Van der Vliet, P.C. and Levine, A.J.: DNA-binding proteins specific for cells infected by adenovirus. Nature New Biol. 246 (1973) 170-174.
Vos, H.L., Van der Lee, F.M., Reemst, A.M.C.B., Van Loon, A.E. and Sussenbach, J.S.: The genes encoding the DNA binding protein and the 23K protease of adenovirus type 40 and 41. Virology 163 (1988a) 1-10.
Vos, H.L., Van der Lee, F.M. and Sussenbach, J.S.: The binding of in vitro synthesized adenovirus DNA binding protein to single-stranded DNA is stimulated by zinc ions. FEBS Lett. 239 (1988b) 251-254.