Gene, 166 (1995) 237-242 © 1995 Elsevier Science B.V. All rights reserved. 0378-1119/95/$09,50
237
GENE 09281
Isolation and sequence determination of the cDNA encoding DNA polymerase from Drosophila melanogaster (DNA replication; PCNA; cloning; PCR; antibody; sequence conservation)
Chuen-Sheue Chiang* and I. R. Lehman Department of Biochemistry, Beckman Center, Stanford University, Stanford, CA 94305. USA Received by R. Padmanabhan: 5 July 1995; Accepted: l0 July 1995; Received at publishers: 21 August 1995
SUMMARY
The cDNA encoding the catalytic subunit of Drosophila melanogaster (Din) DNA polymerase 8 (Pol~) was isolated by a combination of PCR amplification and cDNA library screening. The cDNA is 3457 nucleotides in length and contains an open reading frame (ORF) that encodes a protein of 1092 amino acids (124799 Da). The ORF contains the sequence that was determined for a peptide from the purified catalytic subunit of Dm PolS. Polyclonal antibodies raised against Dm Po18 specifically recognize a protein of the expected size when the cDNA is expressed in either Escherichia coli or insect cells. Comparison of the deduced aa sequence with other Polg sequences demonstrates that Polk is one of the most highly conserved of the DNA polymerases.
INTRODUCTION
Three DNA polymerases, DNA polymerase ~, ~ and (Pol~, 8 and ~) are required to replicate eukaryotic chromosomal DNA (for review, see Wang, 1991; Stillman, 1994). Polk and ~ also appear to be involved in DNA repair (Nishida et al., 1988; Budd and Campbell, 1995; Aboussekhra et al., 1995). Recent studies that link mutations of Pol~ to the replication error (RER) phenotype
Correspondence to: Dr. I.R. Lehman, Department of Biochemistry, Beckman Center, Stanford University, Stanford, CA 94305, USA. Tel. (1-415) 723-6164; Fax (1-415) 723-6783. * Present address: Department of Pathology, Stanford University, Stanford, CA 94305-5324, USA. Tel (1-415) 725-4908. Abbreviations: aa, amino acid(s); Ab, antibody(ies); bp, base pair(s); cDNA, DNA complementary to RNA; Dm, Drosophila melanogaster; Exo, exonuclease; HSV-1, herpes simplex virus type 1; kb, kilobase(s) or 1000 bp; nt, nucleotide(s); ORF, open reading frame; PCNA, proliferating cell nuclear antigen; PCR, polymerase chain reaction; Pol, DNA polymerase; Pol, gene (DNA) encoding Pol; S., Saccharomyces; SDS, sodium dodecyl sulfate; SV40, simian virus 40; UTR, untranslated region(s); Zf, zinc finger(s).
SSDI 0378-1119(95)00567-6
of some colorectal tumors are in agreement with this idea (da Costa et al., 1995). Po18 is highly conserved both structurally and functionally. It consists of two subunits of approx. 120 and 50 kDa. The 120-kDa subunit has the polymerase and 3'-5' exonuclease activities associated with the two subunit enzyme. The function of the 50-kDa subunit is unknown. Deoxynucleotide polymerization by Po15 is distributive; however, it becomes highly processive in the presence of the accessory protein, proliferating cell nuclear antigen (PCNA). The strong functional conservation of Po18 is supported by the finding that calf thymus PCNA stimulates the processivity of Po18 from S. cerevisiae and S. cerevisiae PCNA stimulates the processivity of the calf thymus Polk (Bauer and Burgers, 1988). Polfi from S. cerevisiae can also substitute for human Pol~ in SV40 replication in vitro (Tsurimoto et al., 1990). We have been using Drosophila melanogaster (Din) embryos as a model for the study of eukaryotic DNA replication and repair (Lehman and Kaguni, 1989). As part of this study, we purified Polk from 0-2-h embryos to near homogeneity and demonstrated that it consists of only a single 120-kDa polypeptide, that is unresponsive
238 to PCNA (Chiang et al., 1993). As a prelude to analyzing the interaction of the Dm Po18 with PCNA and determining the function of the 50-kDa subunit, we have cloned the Pol6 gene for the 120-kDa subunit and determined the nucleotide sequence of the full-length cDNA.
EXPERIMENTAL AND DISCUSSION
(a) Isolation and sequence determination of the Poi6 cDNA encoding the 120-kDa subunit of Dm Poll By using PCR amplication of a Dm ovarian cDNA library constructed in ~gtll (generous gift from Dr. Laura Kafayen) with oligodeoxynucleotide primers based on the aa sequences MAHNLCYT and VV/IYGDTD in Domains II and I of PolS, respectively, one positive PCR product of 419 bp was isolated. Sequence analysis demonstrated that it contained an ORF extensively homologous to human and bovine Po18 (67% and 64%
aa identity, respectively). This ORF could be aligned with aa 615 to 750 of human Po16 and contained conserved domains VI and III in the correct order and spatial separation (Fig. 1). The ORF had only 10% identity with the catalytic subunit of the Dm Pol~. The partial cDNA was used to screen the same cDNA library. One of 23 positive clones that carried the longest insert was further analyzed and sequenced. The cDNA consisted of 3457 nt, including a 5'-UTR of 93 nt, an ORF of 1092 aa, a 3'-UTR of 68 nt and a poly(A) tail of 20 nt (Fig. 1). The ATG sequence at nt 94 was chosen as the start codon because (i) the only preceding ATG sequence was followed by several in-frame stop codons, (ii) the sequence surrounding the start codon was in agreement with the Cavener (1987) criteria for a translational start site for Dm mRNA and (iii) the first 5 deduced aa were identical to those of human, bovine and mouse Pol6. This ORF encodes a protein of 124 799 Da, which is consistent with the 120 kDa for the purified catalytic
~T¢T~cGTGccTACc~c~AA~cGcG¢cTT~TTTTTTGG~TcGCTcTT@Tc~TATGGATTTG~GTA~¢ATTT¢Ac~GGTAc~cAG~TGGATGGc~GcGc~TTT~TGG~¢cTCc~TGGACAT@~c~AAGcccA~C; M D G K R K F N G T S N G H A K K P R
!5o 19
~TCCTGATGACGATGAGG~TGGGCTTTGAGGCGGAGCTGGCCGCCTTCGAG~CTCCGAGGACATGGACCAGAcTCTGCT~TGGGCGATGGACCCGAG~CC~CGAcCAGTGAGCGTTGGTCCCGTCCGCCGCCCCCAG~CTA N P D D D E E M G F E A E L A A F E N S E D M D Q T L L M G D G P E N Q T T S E R W S R P P p p E L
300 69
GATCCCTCC~GCAC~CTTGGAGTTTCAGCAGCTGGACGTG~AAAACTATTTGGGACAGCCGTTGCCGGG~TGCCAGGTGCCC~TAGGACCCGTGCCGGTG~TCCG~TGTTTGGTGTCACCATGGAGGGT~CTCT@TGTGCTGC D P S K H N L E F Q Q L D V E N Y L G Q P L P G M P G A Q I G P V P V V R H F G V T M E G N S V C C
480 i19
CATGTGCATGGTTTCTGTCCATACTTCTACATAGAGGCGCCCAGTC~TTCGAGGAGCACCATTGCGAG~CTACA~GCCTTGGATCAAAAGGTTATTGCCGATATTCGCAAC~C~GAT~TGTCCAGGAGGCTGTGCTTATG H V H G F C P Y F Y I E A P S Q F E E H H C E K L Q K A L D Q K V I A D I R N N K D N V Q E A V L M
6@0 169
GTGGAACTGGTGGAG~GCTG~CATCCATGGATAC~TGGAGAC~G~GCAGAGGTACATcA~TATCGGTTACGCTGCCCAGATTTGT~CTGCGGcCTCAcGTCTCCTC~GGAAGTGATCATGTCGGAGATT~ACTTCCAG V E L V E K L N I H G Y N G D K K Q R Y I K I S V T L P R F V A A A S R L L K K E V I H S E I D F Q
750 219
GACTGTCGCGCCTTTGAG~T~CATAGACTTTGACATTCGCTTCATGGTGGACACTGATGTGGTGGGTTGC~TTGGATAGAGCTTCCCAT~GTCACTGGCG~T~GG~CAGTCACAGC~GCCGTT~CC~G~TCCCGCTGCCAG D C R A F E N N I D F D I R F M V D T D V V G C N W I E L P M G H W R ! R N S H S K P L P E S R C Q
900 269
ATTG~GTAGACGTGGCCTTCGACAGATTTATATCCCACGAGCCcG~GGTG~'~GGTCC~GGTGGCTCCCTTCCGGATCCTCTCCTTTGATA~TG~TGCGCTGGTCGC~GG~TATTTCCGG~GCCAAAATAGATCCAG?CATC ! E V D V A F D R F I S H E P E G E W S H V A P ' F R I L S F D I E C A G R K G I F P E A K I D P V I
1050 319
CAGATAGCC~TAT~GTGAT~GGCAGGGAG~CGAG~CCTTTCATTAGG~TGTCTTTACCCT~TG~TGCGCTCC~TCATA~GcAGCCAGGTGTTGTGcCACGAC~GGAGACCCAGATGCTGGAC~GTGGTCTGCCTTTGTC~2~ Q I A N M V I R Q G E R E P F I R N V F T L N E C A P I I G S Q V L C I { D K E T Q M L D K W S A F V 369 CGGG~GTTGACCCGGATATTTTGACCGGATAT~TATCAAC~cTTTGACTTCCCCTATTTGCTTAACCGAGCAGCTCACTTG~G~TCAGG.~CTTTGAGTATTTGGGCAGGATT~G~CATTCGTTCGGTGATCAAGG~CAGATG R E V D P D I L T G Y N I N N F D F P Y L L N R A A H L K V R N F E Y L G R I K N I R S V I K E Q M
1350 419
TTGCAGTCG~GcAGATGGGTCGCAGGGA/~AACCAGTACGTT~TTTTGAGGGTAGAGTTCCCTTCGATCTcCTCTTTGTCCTGCTGCGCGACTAC~CTACGCTCGTACACTCT~cGCTGTGAGC'IATCACTTTCTGCAGGAGC~ L Q S K Q M G R R E N Q Y V N F E G R V P F D L L F V L L R D Y K L R S Y T L N A V S Y H F L Q E Q
1500 469
~GGAGGATGTGCATCATAGCATTATCACAGATCTTCAG~T@AGACGAGCAGACACGTCG~CGTTCGGCCATGTACTGCCT~GGATGCCTACTTACCGCTTAGATTGCTGGAG~GTT~TGGCCATTGTT~CTACATGGAGATG K E D V H H S I I T D L Q N G D E Q T R R R S A M Y C L K D A Y L P L R L L E K L M A I V N Y N E M
1650 519
GCCAGGGTGACGGGTGTGCCACTGGAGTCCTTGCTCACCCGCGGAC~CAGAT~GGTTTT~GTC~TTGCTGCGC~GGCCAA~CC~@GATTCATCATGCCCTCGTACACCTcTCAGGGATCGGATG~CAGTATG~GGAGCG A R V T G V P L E S L L T R G Q Q I K V L S Q L L R K A K T K G F I M P S Y T S Q G S D E Q Y E G A
IS00 569
ACTGTGATTG~CCAAAACGTGGCTACTATGCGGACCCCATCTCCACGCTGGATTTCGCcTCTCTGTATCC~GTAT~TGATGGCGCAT~TTTGTGCTACACcACCTTGGTTTTG~GTGG~CTC~TGAG~GCTGCGGCA~CAGGAG T V I E P K R G Y Y A D P I S T L D F A S L Y P S I M N A H N L C Y ~ T L V L G G T R E H L R Q Q E
1950 619
~CCTGCAGGACGATC~GTGG~CGTACGCCTGC~C~CTACTTTGTG~GTCTGAGGTGCGTCGTGGTcTGCTCCCTGAGATTC~GG~TCTCTTTTGGCGGCCAG~GCGTGCC~-~hAATGACCTAAAAGTGGA~CAGATCCG ~h~ ~ Q ~ ~R~ ~A~ ~ Y~ ~ ~S~ ~R~ ~ L~ ~ ~I ~ ~8~ ~A~ ~ K ~ ~N~ L~K~ ~T~ ~
2100
TTTA~G~GGTCCTGGATGGCAGACAGCTGGCGcTG~GATTTCGGCT~TTCCGTGTACGGATTTACTGGCGCACAGGTTGG~GTTGCCATGCTTAGAGATCTCGGGCAGCGTCACCGCCTACGGTCGTACCATGATCGAGATG F K R K V L D G R ~ L A L K I S A N S V Y G F T G A ~ V G K L P C L E I S G S V T A Y G R T M I E M
2250 719
ACGAA~CG~GTGG~TCCCATTACACACA~CC~TGGCTACGAG~C~TGCAGTGGTCATCTACGGCGACAcTGATTCTGTGATGGTT~TTTCGGAGT~AAACTCTGGAGcGCAGCATGGAGCTGGGACGCGAGGCTGCCG~
2403
CTGGTCAGTTCC~GTTCGTGCATCCTATT~TTGG~TTCGAG~GTTTACTATCCTTACCT~CTGATT~C~GAAACGCTATGCGGGA~TATACTTTACGCGCCCAGATACCTACGATA~TGGATTGC~GGGCATAG~CC
2550 819
T K N E V E S H Y T Q A N G Y E N N A V V ~
Y G D T D S V M V N F G V K T L E R S M E L G R E A A E
L V S S K F V H P I K L E F B K V Y Y P Y L L I N K K R Y A O L Y F T R P D T Y D K M D C K G I E T
669
769
GTGAGGAGAGAT~CTCTCCGCTG@TGGCC~%CCTGATG~CTCCTGCcTGCAG~CTACTCATCG~GGGATCCCGATGGTGCAGTTGCCTATGTGA~CAGGTGATAGCCGATCTCCTCTGCAATCGCATCGACATCTCGCACTTG27~ V R R D N S P L V A N L N N S C L Q K L L I E R D P D G A V A Y V K Q V ! A D L L C N R I D I S H L 869 GTCAT~cC~GGAGTTGGCCAAAACGGATTACGCAGc~CAGGCAcAcGTTGAGCTGGCCGCC~GATG~GAAAAGAGATCcCGGTACGGCGCCC~ACTGGGGGATCGAGTTCCCTATGTGATCTGTGCGGCAGCC~A~CACA V I T K E L A K T D Y A A K Q A H V E L A A K M K K E D P G T A P K L G D R V P Y V I C A A A K N T
2850 919
CCCGCTTACCAG~GGCCGAGGATCCGCTGTATGTGCTGGAA~CAGCGT@~CCATCGATGCCACTTACTACCTGG~CAGCAGCTGTCTAAGCCGCTGCT~GGATCTTTG~CCTATTTTGGGCGAC~TGCCGAGTC~TTTTGTTA p A Y Q K A E D F L Y V L E N S V P I D A T Y Y L E O Q L S K P L L R I F E P I L . G D N A E S I L L
3000
~GGAG~CACAC~CGCACACG~CTGTGGT~CATCC~GTGGGTGGACTTGCTGGATTTATGACc~GAA~CGTCGTGTTTGGGCTGCA~TCCCTGATGCCC~GGGCTACG~CAGGCCTGTCTGTGTCCACACTGCGAGccA K G E H T R T R T V V T S K V G G L A G F M T K K T S C L G C K S L M P K G Y E Q A C L C P H C E P I 0 1 9
3150
CG~TGAGTGAGCTGTATCAG~G~AGGTGGGTGCG~GAGGG~CTGGAGGAGACCTTCTCTCGCCTGTGGACCGAGTGCCAGCGATGCCAGG~TCCTTGCACGAGGAGGTTATCTGCTCC~CAGAGATTGCCCCATCTTCTACATG R M S E L Y Q K E V G A K R E L E E T F S R L W T E C Q R C Q E S L H E E V I C S N R D C P ! F Y M I 0 6 9
3300
CGACAG~GGTTcGCATGGATCTGGA~T~AGGAG~GCGGGTGTTGcGATTCGGc~TGGCCGA~TGGT~CCATTGCATGAGTTTACTG~TTGTTT~TCCTAT~TTT~T~TTATATTACTAG~GTTATTAAAA~AA~A R Q K V R M D L D N Q E K ~ V L R F G L A E W I 0 9 2
3450
A ~
969
3457
Fig. 1. Nucleotide sequence of the Drosophila melanogaster Pol6 cDNA (EMBL accession No. X88928) and the deduced aa sequence. The directly determined aa sequence is underlined. Also indicated are the oligodeoxynucleotide primers used ~ r PCR (arrows), the sequence of the PCR product (dashed underlining) and the polyadenylation signal (double underlining).
239 subunit of Dm Po18 as judged by SDS-PAGE (Chiang et al., 1993). The 3'-UTR had a consensus AATAA polyadenylation signal 20 nt upstream from the polyadenylation site. (b) The eDNA encodes Dm Po18 Three lines of evidence demonstrate that the ORF encodes Dm PolS: (i) The deduced aa sequence contains the aa sequence determined for a peptide generated from the purified 120-kDa Dm PolS, YYADPISTLDFASLYPSI (Fig. 1), that is strictly conserved among members of the Po18 family (Fig. 2). (ii) A polyclonal Ab raised in chicken against Dm Po18 specifically recognized a 120-kDa protein when the cDNA was expressed in E. coli. When the eDNA was constructed in the opposite orientation and transfected into E. coli, no Ab-reactive protein could be detected (data not shown). (iii) Lysates prepared from Sf9 insect cells transfected with the cDNAbaculovirus construct showed increased DNA polymerase activity as compared with lysates from the insect cells alone or in~ect cells transfected with wild-type baculovirus, when they were assayed with poly(dA-dT), the preferred template for PolS. The DNA polymerase activity in the lysates corresponded to a 120-kDa protein as judged by immunoblot analysis (data not shown).
(e) Genomic Southern and RNA transcript analysis Genomic DNA from Dm adults was isolated, digested with HindIII, transferred to Nytran membrane, and then probed with the cDNA. A single band >12 kb was detected, indicating that there is only a single copy of the gene (data not shown). When total RNA or mRNA isolated from Dm adults was probed with the cDNA, a single band of about 4 kb was detected (data not shown), indicating that only one major mRNA species corresponded to Pol6. (d) Po16 is highly conserved Po18 belongs to the family of m-like DNA polymerases (reviewed in Wang, 1991). The homology among the various Po18 species is much greater than for Pol~. Fig. 2 shows an aa sequence comparison for the catalytic subunit of all known Po18 species. Although the N-terminal region shows the least homology, there are portions of the N-terminal region that are very conserved, e.g., NT-1 and NT-2 (Cullmann et al., 1993). Recent studies have suggested the involvement of this region in the interaction of Po18 with PCNA. Specifically, a truncated yeast Po18 without the N-terminal 220 aa expressed in E. coli did not interact with PCNA (Brown and Campbell, 1993), and studies with deletion mutants expressed in insect cells localized the region of human Po18 that binds PCNA to the first 182 aa (Zhang et al., 1995). Furthermore, a pep-
tide containing aa 129 to 149 of human Pol8 inhibited PCNA stimulation (Zhang et al,. 1995). Site-directed mutagenesis will be needed to define precisely the aa residues that mediate the interaction of Pol8 with PCNA. The central region of the protein is the most conserved among all DNA polymerases. It contains all six a-like homology boxes (Pol boxes I to VI with box I the most and box VI the least conserved) and three Exo boxes (Fig. 2). The Pol boxes constitute parts of the polymerase active site. Studies with Pol from HSV-1 and the Bacillus phage qb29 have demonstrated the necessity of box I for DNA polymerase activity (Dorsky and Crumpacker, 1990; Bernad et al., 1990) and substrate recognition (Marcy et al., 1990). Studies with human Pol~ have also shown that box I is involved in metal-specific catalysis (Copeland and Wang, 1993). Mutations in box II/III of HSV-1 Pol and box II of human Pol~ indicate that these boxes participate in dNTP (Larder et al., 1987; Tsurumi et al., 1987; Gibbs et al., 1988; Dong et al., 19t93a) and primer binding (Dong et al., 1993b). On the basis of aa sequence homology to DNA polymerase I (Pol I) of E. coli, it was suggested that three regions (Exo I, II and III) are related to the 3'-5' Exo activity associated with eukaryotic DNA polymerases (Bernad et al., 1989). These regions also contain conserved aa sequences that are necessary for metal, and single-stranded DNA binding (Ollis et al., 1985; Derbyshire et al., 1991). Point mutations in the predicted Exo boxes of qb29 Pol resulted in a loss of 3'-5' Exo activity (Bernad et al., 1989). Recently, a new Exo I box has been proposed, and subsequently confirmed by mutational analysis of S. cerevisiae Pol8 (Simon et al., 1991). The original Exo I box was renamed Exo I'. ° The C-terminal region of Po18 has five conserved regions, termed CT-1, CT-2, CT-3, ZnF1 and ZnF2 (Yang et al., 1992; Cullmann et al., 1993). The function of these regions is unknown. However, the C-terminal 227 aa of the HSV-1 Pol, including CT-2 and CT-3, are necessary for interaction with the UL 42 protein (Digard and Coen, 1990), a protein that is analogous in function to PCNA (Hernandez and Lehman, 1990). No significant homology downstream from the CT-3 box could be detected between the eukaryotic Pol and the HSV-1 Pol. ZnF1 and ZnF2 are two potential zinc finger (Zf) domains that were first identified in human Pola (Wong et al., 1988). Although the Zf regions could be good candidates for the interaction of Po18 with PCNA (Yang et al., 1992; Cullmann et al., 1993), the N-terminal region of the protein appears to mediate this interaction as noted above. The Zf regions might therefore be involved in the interaction of the 120-kDa and 50-kDa subunits. Site-directed mutagenesis in combination with biochemical studies is clearly necessary to elucidate the function of this region.
240 Pf Mm Bt Hs Dm Sp Sc
MEELKTC PFTNVI PYGLLYDKLKKEKNNDVPENYVI EEFDKLLKNYER pNVYDE IGNATFKNDEDL ITFQ ID LDYTVENI FKNMIYNESGSNNS ILNDI YN MDCKR RQGPGPGVPPKRARGHLWDEDE PSPSQFEANLALLEE IEAENRLQEAEEE LQLPPEGTVGG QFSTADIDPRWRRPTLRALDPSTEPLIFQQLEIDHYVGS MDGKR RPGPGPGVPPKRARGGLWDEDEAYRPSQFEEELALMEEM EAERRLQEQEEEELQSALE AADG QFSPTAIDARWLRPAPPALDPQMEPLIFQQLEIDHYVAZ MDGKR R PGPG PGVPPKRARGGLWDDDDA PWPSQFEEDLALMEEM EAEHRLQEQEEEELQSVLEGVADG QVP P SA IDPRWLRPTPPALD PQTE PL IFQQLE IDHYVGF MDGKR KFNGTSNGHAKKP RNPDDD EEMG FEAELAAFENSEDMDQTLLMGDG PENQTTS ERWSRPPPPELDPSKHNLEFQQL DVENYi M TDRSSNEGVVLNKENYP FFRRNGSI HG EITDVKRRRLSERNGYGDKKG SSSKEKTSS FEDEL AEYASQLDQ DEIKSS KDQQWQRPA MSEKRS L pMVDVKI DDEDTPQLEKK IKRQS I DHGVGSE PVST IE I IPSDSFRKYNSQGFKAKDTDLMGTQLESTFEQELSQMEHDMADQEEHDLSSFERKKLPTDFDF
N T -i Pf 102 PYRILLSKDKNYV~VPIIRIYSLRKDGCSVLINVHNFFPYFY VEKPDDFE~EDLIKLEMLMNENLNLNSQYKIYEKKILKIEIVKTESLMYFKKNGKKDFLKITVLLP KMVPSLKKYFE HB~ts t 106 APPLPERPLPSRN~vPILRAFGVTDEGFSvCCHIQGFAPYFYTPAPPGFGA~LSELQQELNAAISRDQRGGKELSGPAvLAIELCSRE~MFGYHGHGPSPFLRITLAL~RLMAPAR}~ LLE 107 A~LPGAPPPSQD~PIL~AFGVTNEGVSVCCHIHGFAPYFYTPAPPGFG~E~LSELQRELSAAISRDQRGGKELTGPAVLAVELCSRESMFGYHGHGPS~FLRITLALPRLMAPARR LLE 108 AQPVPGGPPPSRG~/PVLRAFGvTDEGFSVCCHIHGFAPYFYT~APPGFGP~MGDLQRELNLAISRDSRGGRELTGPAvLAVELCSRESMFGYHGHGPSPFLRITVALPRLVAPARR LLE ~c 88 GQPLPGMPGAQIG~RMFGVTMEGNSVCCHVHGFCPYFYIEAPSQFEE}~£CEKLQKALDQKVIADIRNNKDNVQEAVLMVELVEKLNIHGYNGDKKQRYIKISVTLPRFvAAAS~ LLK 88 L PA INPEKDD IYF(~ IDSEEFTEGSVPS IRLFGVTDNGNS ILVHVVGFLPYF~6VKAPVGFRPEMLERFTQDLDATCNGGVI DHC I I EMKENLYGFQGNEKSPF IK IFTTNPR ILS RARNVFE 109 SLYD ISFQQI DAE~SVLNG IKDENTSTVVRFFGVTSEGHSVLCNVTGFKNYI~VPAPNS SDANDQEQ INKFVHYLNETFDHA IDS IEVVSKQS IWGYSGDTKL PFWK IYVTY PHMVNKLRTA _
Pf ~s
.
T -~
_
H
.
Exo
I
Dm Sp Sc
221 227 228 229 209 210 231
GIVHV NNKS I~GIVYEANL PF IL}{YIIDHKITGSSWINCKKGHY]~RNKNKK ISNCTFE IDI SYEHVE P ITLENEYQQI P~LR ILSFD IEC IKLDGKGF~EAKNDPI IQ ! S S ILYFQG QGVRVPGLGTPSFA~fEANVDFEIRFMVDADIVGCNWLELP AGKS~JRRAEKKATLCQLEVDVLWSDVISHPPEGQWQRIA~LRVLSFDIEC AGRKGIFP~PERDPVIQICSLGLRWG QGIRLAGLGTPSFA~fEANVDFEIRFMVDTDIVGCNWLELP AGKS~[LRPEGKATLCQLEADVLWSDVISHPPEGEWQRIA~LRVLSFDIEC AGRKGIF~PERDPVIQICSLGLRWG QGIRVAGLGTPSFA~YEANVDFEIRFMVDTDIVGCNWLELP AGKI~%LRLKEKATQCQLEADVLWSDWSHPPEGPWQRIA~LRVLSFDIEC AGRKGIF~PERDPVIQICSLGLRWG KEVIMSEIDFQDCR~ENNIDFDIRFMVDTD~GCNWIELPMGHWRIR~HSKPLPESRCQIEVDVAFDRFISHEPEGEWSKVA~ILSFDIEC AGEKGIF~EAKIDPVIQIANMVIRQG RGEFNFEELFP~GV(~TTFESNTQYLLRFMIDCDwGMNW~HLPASKY~FRYQjRVSNCQIEAWINYKDLISLPAEGQWSKMA~LRIMSFDIEC AGRKGVFI~DPSIDPVIQIASIVTQYG FERGHLSFNSWFSNC~TT YDNIAYTLRLMVDCGIVGMSWITLPKGKY~IEPNNRVSSCQLEVSINYRNLIAHPAEGDWSHTA~LRIMSFDIEC AGRIGVFP~PEYDPVIQIANVVSIAG
Pf Mm Bt Hs Dm
339 344 345 346 330
EPIDNCTKFIFTLLECAS I P %WNEF I I R ID+FLTGYNI INFDL PYI L N ~ A L N L K K L K F L G R IKNVASTVKDS SFSSKQFGTHETKE INI FGR IQFDVYDL IKRD EP EPFLRLALTLRPCAPIL(~d~QSYEREEDLL~ ~wADFILAMDI~DVITGYNIQNFDLPYLISR~(~ALKVDRFPFLGRVTGLRSNIRDSSFQSRQVGRRDSKVISMvGRVQMDMLQVLLRE EP EPFLRLALTLRPCAPILG~%~QSYEREEDLL~WSTFIRIMD~DVITGYNIQNFDLPYLIS~(~fLKVPGFPLLGR~IGLRSNIRESSFQSRQTGRRDSKVvSMvGR~QMDMLQ~LLRE EP EPFLRLALTLRPCAPILG~QSYEKEEDLL~ ~WSTFIRIMD~D~ITGYNIQNFDLPYLIS~LK~QTFPFLGRVAG~CSNIRDSSFQSKQTGRRDTKwSMVGRV~MDMLQVLLRE E REPFIRNVFTLNECAPII(~Q~LCHDKETQML~KWSAFVREvD~DILTGYNINNFDFPYLLNR~LKVRNFEYLGRIKNIRSVIKEQMLQSKQMGRRENQYVNFEGRv~FDLLFVLLRD
pf Mm Bt Hs Dm Sp SC
461 465 466 467 451 451 472
YKLKS YTLNYVSFEFLKEQKEDVHYS IMNDLQNES PE+KRIATYC IKDGVLPLR+ DKLLF IYNYVEMARVTGTPFVYLLTRGQQ IKVTSQLYRKCKELNYVI PSTYMKVNTNEK~EGATV HKLRSYTLNAVSFHFLGEQKEDVQHSIITDLQNGNEQ~RRRLAvYCLKDAFLPLR~LERLMvLVNNvEMAR~TG~PLGYLLTRGQQVK~SQL LRQAMRQGLLMP VVKTEGGE~fTGATV YKLRSYTL~A~SFHFLGEQKEDvQHS~ITDLQNGNDQ~RRRLAVYCLKDAFLPLR~LERLMVLVNAMEMARVTGVPLGYLLSRGQQVKwSQL LRQAMRQGLLMP VVKTEGGE~TGATV YKLRSHTLNAvSFHFLGEQKEDVQHSIITDLQNGNDQ~RRRLAVYCLKDAYLPLR~LERLMVLVNAVEMARVTGVPLSYLLSRGQQVKvVSQL LRQAMHEGLLMP W K S E G G E ~ T G A T V YKLRSYTLNAVSYHFLQEQKEDvHHSIITDLQNGDEQ~RRRSAMYCLKDAYLPLR~LEKLMA~VNYMEMARVTGVPLESLLTRGQQIKVLSQL LRKAKTKGFIMFSYTSQGSDE~fEGATV FKLRS~SLNA~CSQFLGEQKED~H~SIITDLQNGTAD~RRRLAIYCLKDAYLPQR~DKLMCFVNYTEMARVTGVPFNFLLARGQQIKVISQL FCKALQHDLVVPNIRVNGTDE~{EGATV YKLRSYTL~AVSAHFLGEQKEDVHYSIISDLQNGDSE~RRLAVYCLKDAYLPLR~MEKLMALVNYTEMARVTGVPFSYLLARGQQIK~SQL FRKCLEIDTVIPNMQSQASDD~EGATV
Pf ~st
583 585 586 587 572 572 593
LEPIKGYYIEPI STLDFASLYFS IMIAHNLCYS~ IKSNH EVSDLQNDDI TTIQGKNNLK FVKKNVK~G IL PL IVEEL IEARKKVKL~ IKNEKNNI TKMV~NGRQLALKI SANSVYGYT IE FLKGYYDVP IATLDFSS LYPS IM~AHNLCYT~LRPGAAQK L GLK PDEF IKTPTGDEFVKS S V R ~ L L PQ ILENLLSARKRAK/~LAQETDPLHRQVI~GRQLALKVSANSVYGFT IE PLKGYYDVP IATLDFSS LY FS IMMAHNLCYT~LRPGAAQKL GLTEDQF IKTPTGDEFVKASVR~LL PQI LENLLSARKRAK~LAKETDPLRRQVI~X3RQLALKVSANSVYGFT IEPLKGYYDVP IATLDFSS LYPS IMMAHNLCYT~LRPGTAQKL GLTEDQF IRTPTGDEFVKTSVR~GLLPQ ILENLLSARKRAK~LAKETDPLRRQV[~DGRQLALKVSANSVYGFT IEPKRGYYADP I STLDFAS LYPS IMMAHNLCyT~LVLGGTRERL~{QQENLQDDQVERTPANNYFVKS EVR~GLL PE ILE SLLAARKRAK~DLKVETDPFKRKVI~]GRQLALK I SANSVYGFT IEF IKGYYDTP IATLDFSS LYPS IMQAHNLCYT~ L DSNTAE LL KEKQ DVDYSVT PNGDYFVK PHVR~GL L P I ILADLLNARKKAK~LKKETD PF KKAVI~DGRQLALKVSANSVYGFT IEFI RGYYDVP IATLDFNSLYPS IMMAHNLCYT~L CNKATVERLNLK I DEDYV ITPNGDYFVTTKRR~GILP I ILDELI SARKRAKK~LRDEKDPFKRDV~NGR~LALKI SANSVYGFT
~
PolII
Exo
Pol
Sc
Exo
I[
ZI
]I
PolVI
Pol
]I[
Pol I pf 702 G A S ~ G G Q L ~ C L E V A V ~ I T T L G ~ M I E K T K E R V E ~ F Y C K ~ N G Y E H N S ~ ~ K F G T N N I E E A M T L G K D A A E R I S K E F L ~ I K L E F E K V Y C ~ Y L L L N K K R Y A G L L yTNpNKH~'~ BH~tS 703 GA QVGKLPCLEISQSVTGFG~MIEKTKQLVESKYTVENGYDANA~ VVYGDTDS~/MCRFGVSSVAEAMSLGREAANWVSSHFPSPIRLEFEKVYFPYLLISKKRYAGLLFSSRSDAH[~
Pol
V
T -2
CT-I
~LGKTDYETRL~HVELAKKLKQRDSATAPNVGDRVSYIIZ FRAAADYAGKQ~HVELAERMRKRDPGSAPSLGDRVPYVI] FRAAADYAGKQ~VELAERMRKRDPGSAPSLGDRVPYVI] PRAASDYAGKQ~HVELAERMRKRDPGSAPSLGDRVPYVI] AKTDYAAKQ~VELAAKMKKRDPGTAPKLGDRVPYVI£ SKTDYAAKM2~VELAERMRKRDAGSAPAIGDRVAYVI] KTLAPNYTNPQ~4AVLAEKM KRREGVGPNVGDRVDYVIJ
KGVKGQAQYERA~DPLYVLD ~AA]
Pf Mm Bt HS Dm Sp Sc
823 824 825 826 8]4 809 832
CKGIETVRRDFC~LIQQMMETVLNKLLIEKNLNSAIEYTKS~IKELLTNNIDMSLLVVTK CKGLEAVRRDNC~LVANLVTSSLRRILVDRDPDGAVAHAKD~ISDLLCNRIDISQLVITKE] CKGLEAVRRDNC~LVANLVTASLRRLLIDRDPSGAVAHAQD~ISDLLCNRIDISQLVITKE] CKGLEAVRRDNC~LVANLVTASLRRLLIDRDPEGAVAHAQD~ISDLLCNRIDISQLVITKEI CKGIETVRRDNS~LVANLMNSCLQKLLIERDPDGAVAYVKQ~IADLLCNRIDISHLVITKEL SKGIETVRRDNC~LVSYVIDTALRKMLIDQDVEGAQLFTKK~ISDLLQNKIDMSQHVITKAi QKGLASVRRDSC~LVSIVMNKVLKKILIERNVDGA LAF~ETINDILHNRVDISKLII6
Pf ~s
943 946 947 948 934 929 948
NNLAIDYN}{YL DAIKSPLSRIFE |VIMQNSDSLFSGDHTRHKTILTSSQTALSKFLKKSVRCIGC NSSIKKPP LCNHCKENKEFSIYMQKIKDFKNKQNEFFQLWTECQ~Q~ HSLPIDTQYYLEQQLAKPLLRIFEP~LGEGRAESVLLRGDHTRCKTVLTSKVGGLLAFTKRRNCCIGC RSVIDHQGAVCKFCQPRESELSQKEVSHLN ALEERFSRLWTQCQ Q HSLPIDTQYYLEQQLAKPLLRIFEP~LGEGRAEAVLLRGDHTRCKTVLTGKVGGLLAFAKRRNCCIGC RTVLSHQGAVCKFCQPRESELYQKEVSHLS ALEERFSRLWTQCQRCQGSL HSLPIDTQYYLEQQLAKPLLRIFEP~LGEGRAEAVLLRGDHTRCKTVLTGKVGGLLAFAKRRNCCIGC RTVLSHQGAVCEFCQPRESELYQKEVSHLN ALEERFSRLWTQCQRCQGSL NSVPIDATYYLEQQLSKPLLRIFEP~LG DNAESILLKGEHTRTRTVVTSKVGGLAGFMTKKTSCLGCKSLMPKGYEQACLCPHCEPRMSELYQKEVGAKR ELEETFSRLWTECQRCQESL NNIPIDAKYYLENQLSKPLLRIFEP~LG EKASSLLHGDHTRTISMAAPSVGGIMKFAVKVETCLGCKApIKKG KTALCENCLNRSAELYQRQVAQVN DLEVRFAELWTQCQRCQGSM NNIQVDSRYyLTNQLQNPIISIVAP~ IGDKQANGMFVVKSIKINTGSQKGGLMSFIKKVEACKSCKGPLRKG E~3PLCSNCLARSGELYIKALYDVR DLEEKYSRLWTQCQRCAGNL
pf Mm Bt HS Dm SQ Sc
1057 1064 1065 1066 1054 1046 1064
CT-3
**
**
**
W HVDVICMNRDCPIFYRRAKIKKD IANLQEQVTSLRMDW HEDVICTSRDCPIFYMRKKVRKDLEDQERLLQRFGPPGFEAW HEDVICTSRDCPIFYM~KKVRKDLEDQERLLRRFGPPGPEAW HEDVICTSRDCPIFYMRKKVRKDLEDQEQLLRRFGPPGFEAW HEEVICSNRDCPIFYMRQKVRMDLDNQEKRVLRFGLAEW HQDVICTSRDCPIFYMRIAEHKKLQQSVDLLKRFDEMSW HSEVLCSNKNCDIFYMRVKVKKELQEKVEQLSKW
Fig. 2. Amino-acid sequence comparison of several Pol6. The homology boxes and conserved Cys residues of putative Zf are indicated. Sources for the sequences were Drosophila melanogaster (Din, this work), Homo sapiens (Hs, Chung et al., 199l), Bos taurus (Bt, Zhang et al., 1991), Mus musculus (Mm, Cullmann et al, 1993), Saccharyomyces cerevisiae (Sc, Boulet et al., 1989), Schizosaccharomyces pombe (Sp, Pign6de et al., 1991 ) and Plasmodium
falciparum (Pf, Ridley et al., 1991).
241
(e) Conclusions (1) A full-length cDNA encoding Dm Po18 has been isolated using a combination of PCR and cDNA library screening. (2) The deduced ORF contains all the homology boxes for Pol~. (3) A comparison of the deduced aa sequence with those from other sources indicate that Po16 is very highly conserved.
ACKNOWLEDGEMENT
This work was supported by Grant GM-06196 from the National Institutes of Health.
REFERENCES Aboussekhra, A., Biggerstaff, M., Shivji, M.K.K., Vilpo, J.A., Moncollin, V., Podust, V.N., Protic, M., Hiabscher, U., Egly, J.-M. and Wood, R.D.: Mammalian DNA nucleotide excision repair reconstituted with purified protein components. Cell 80 (1995) 859-868. Bauer, G.A. and Burgers, P.M.J.: The yeast analog of mammalian cyclin/proliferating cell nuclear antigen interacts with mammalian DNA polymerase 6. Proc. Natl. Acad. Sci. USA 85 (1988) 7506-7510. Bernad, A., Blanco, L., L~zaro, J.M., Martin, G. and Salas, M.: A conserved 3'-5' exonuclease active site in prokaryotic and eukaryotic DNA polymerases. Cell 59 (1989) 219-228. Bernad, A., L~tzaro, J.M., Salas, M. and Blanco, L.: The highly conserved amino acid sequence motif Tyr-Gly-Asp-Thr-Asp-Ser in or-like DNA polymerases is required by d~29 DNA polymerase for protein-primed initiation and polymerization. Proc. Natl. Acad. Sci. USA 87 (1990) 4610-4614. Boulet, A., Simon, M., Faye, G., Bauer, G.A. and Burgers, PM.J.: Structure and function of the Saccharomyces cerevisiae CDC2 gene encoding the large subunit of DNA polymerase III. EMBO J. 8 (1989) 1849-1854. Brown, W.C. and Campbell, J.L.: Interaction of proliferating cell nuclear antigen with yeast DNA polymerase 6. J. Biol. Chem. 268 (1993) 21706 21710. Budd, M.E. and Campbell, J.L.: DNA polymerase required for repair of UV-induced damage in Saccharomyces cerevisiae. Mol. Cell. Biol. 15 (1995) 2173-2179. Cavener, D.R.: Comparison of the consensus sequence flanking translational start sites in Drosophila and vertebrates. Nucleic Acids Res. 15 (1987) 1353-1361. Chiang, C.-S., Mitsis, P.G. and Lehman, I.R.: DNA polymerase 6 from embryos of Drosophila melanogaster. Proc. Natl. Acad. Sci USA 90 (1993) 9105-9109. Chung, D.W., Zhang, J., Tan, C.-K., Davie, E.W., So, A.G. and Downey, K.M.: Primary structure of the catalytic subunit of human DNA polymerase 6 and chromosomal location of the gene. Proc. Natl. Acad. Sci. USA 88 (1991) 11197-11201. Copeland, W.C. and Wang, T.S.-F.: Mutational analysis of the human DNA polymerase zt. The most conserved region in ~t-like DNA polymerases is involved in metal-specific catalysis. J. Biol. Chem. 268 (1993) 11028 11040.
CuUmann, G., Hindges, R., Berchtold, M.W. and Hfibscher, U.: Cloning of a mouse cDNA encoding DNA polymerase 8: refinement of the homology boxes. Gene 134 (1993) 191-200. da Costa, L.T., Liu, B., E1-Deiry, W.S., Hamilton S. R., Kinzler, K.W., Vogelstein, B., Markowitz, S., Willson, J.K.V., de la Chapelle, A., Downey, K.M. and So, A.G.: Polymerase 8 variants in RER colorectal tumours. Nature Genet. 9 (1995) 10-11. Derbyshire, V., Grindley, N.D.F. and Joyce, C.M.: The 3'-5' exonuclease of DNA polymerase I of Escherichia coil: contribution of each amino acid at the active site to the reaction. EMBO J. 10 (1991) 17-24. Digard, P. and Coen, D.M.: A novel functional domain of an zt-like DNA polymerase. J. Biol. Chem. 265 (1990) 17393 17396. Dong, Q., Copeland, W.C. and Wang, T.S.-F.: Mutational studies of human DNA polymerase zt. Identification of residues critical for deoxynucleotide binding and misinsertion fidelity of DNA synthesis. J. Biol. Chem, 268 (1993a) 24163-24174. Dong, Q., Copeland, W.C. and Wang, T.S.-F.: Mutational studies of human DNA polymerase a. Serine 867 in the second most conserved region among a-like DNA polymerases is involved in primer binding and mispair primer extension. J. Biol. Chem. 268 (1993b) 24175-24182. Dorsky, D.I. and Crumpacker, C.S.: Site-specific mutagenesis of a highly conserved region of the herpes simplex virus type 1 DNA polymerase gene. J. Virol. 64 (1990) 1394-1397. Gibbs, J.S., Chiou, H.C., Bastow, K.F., Cheng, Y.-C. and Coen, D.M.: Identification of amino acids in herpes simplex virus DNA polymerase involved in substrate and drug recognition. Proc. Natl. Acad. Sci. USA 85 (1988) 6672-6676. Hernandez, T.R. and Lehman, I.R.: Functional interaction between the herpes simplex-I DNA polymerase and UL42 protein. J. Biol. Chem. 265 (1990) 11227-11232. Larder, B.A., Kemp, S.D. and Darby, G.: Related functional domains in virus DNA polymerases. EMBO J. 6 (1987) 169 175. Lehman, I.R. and Kaguni, L.S.: DNA polymerase zt. J. Biol. Chem. 264 (1989) 4265-4268. Marcy, Ad., Hwang, C.B.C., Ruffner, K.L. and Coen, D.M.: Engineered herpes simplex virus DNA polymerase point mutants: the most highly conserved region shared among cz-like DNA polymerases is involved in substrate recognition. J. Virol. 64 (1990) 5883-5890. Nishida, C., Reinhard, P. and Linn, S.: DNA repair synthesis in human fibroblasts requires DNA polymerase 6. J. Biol. Chem. 263 (1988) 501 510. Ollis, D.L., Brick, P., Hamlin, R., Xuong, N.G. and Teitz, T.A.: Structure of large fragment of Escherichia coli DNA polymerase I complexed with dTMP. Nature 313 (1985) 762-766. Pign6de, G., Bouvier, D., de Recondo, A.-M. and Baldacci, G.: Characterization of the POL3 gene product from Schizosaccharomyces pombe indicates inter-species conservation of the catalytic subunit of DNA polymerase 6. J. Mol. Biol. 222 (1991) 209 218. Ridley, R.G., White, J.H., McAleese, S.M., Goman, M., Alano, P., de Vries, E. and Kilbey, B.J.: DNA polymerase 8: gene sequence from Plasmodium .fitlciparum indicates that this enzyme is more highly conserved than DNA polymerase et. Nucleic Acids Res. 19 (1991) 6731 6736. Simon, M., Giot, L. and Faye, G.: The 3' to 5' exonuclease activity located in the DNA polymerase 6 subunit of Saccharoymyces cerevisiae is required for accurate replication. EMBO J. 10 (1991) 2165 2170. Stillman, B.: Smart machines at the DNA replication fork. Cell 78 (1994) 725-728. Tsurimoto, T., Melendy, T. and StiUman, B.: Sequential initiation of lagging and leading strand synthesis by two different polymerase complexes at the SV40 DNA replication origin. Nature 346 (1990) 534 539.
242 Tsurumi, T., Maeno, K. and Nishiyama, Y.: A single-base change within the DNA polymerase locus of herpes simplex virus type 2 confers resistance to aphidicolin. J. Virol. 61 (1987) 388-394. Wang, T.S.-F.: Eukaryotic DNA polymerases. Annu. Rev. Biochem. 60 (1991) 513-552. Wong, S.W., Wahl, A.F., Yuan, P.M., Arai, N., Pearson, B.E., Arai, K., Korn, D., Hunkapiller, M.W. and Wang, T.S.-F.: Human DNA polymerase ~ gene expression is cell proliferation dependent and its primary structure is similar to both prokaryotic and eukaryotic replicative DNA polymerases. EMBO J. 7 (1988) 37-47.
Yang, C.-L., Chang, L.-S., Zhang, P., Hao, H., Zhu, L., Toomey, N.L. and Lee, M.Y.W.T.: Molecular cloning of the cDNA for the catalytic subunit of human DNA polymerase 6. Nucleic Acids Res. 20 (1992) 735 745. Zhang, J., Chung, D.W., Tan, C.-K., Downey, K.M., Davie, E.W. and So, A.G.: Primary structure of the catalytic subunit of calf thymus DNA polymerase 6: sequence similarities with other DNA polymerases. Biochemistry 30 (1991) 11742-11750. Zhang, S.-J., Zeng, X.-R., Zhang, P., Toomey, N.L., Chuang, R.-Y., Chang, L.-S. and Lee, M.Y.W.T.: A conserved region in the N terminus of DNA polymerase 6 is involved in proliferation cell nuclear antigen binding. J. Biol. Chem. 270 (1995) 7988 7992.