Pergamon
Int. J. Biochem. Cell Biol. Vol. 29, No. 6, pp. 895-900, 1997
© 1997 Elsevier Science Ltd. All rights reserved
Printed in Great Britain
1357-2725/97 $17.00 + 0.00
PII: S1357-2725(97)00029-0
Cloning and Characterization of the Genes of the CeqI Restriction-Modification System

ZSUZSA IZSVÁK,1† ZSOLT JOBBÁGY,1 IMRE TAKÁCS,2 ERNŐ DUDA1*
1Institute of Biochemistry, Biological Research Centre, Hungarian Academy of Sciences, Szeged, H-6701, Hungary and 2Agricultural Research Institute of the Hungarian Academy of Sciences, Martonvásár H-2462, Hungary

Two genes from Corynebacterium equii, a Gram-positive bacterium producing the CeqI restriction-modification enzymes, were cloned and sequenced. In vivo restriction experiments, DNA and amino acid sequence data suggest that the two genes code for the endonuclease and the methyltransferase enzymes. However, when the two genes are expressed in E. coli, practically no enzyme activity can be detected in the supernatants of sonicated cells. Based on the DNA sequence data, CeqI restriction endonuclease (an EcoRV isoschizomer) consists of 270 amino acid residues with a predicted molecular mass of 31.6 kDa, in good agreement with the previously measured 32 ± 2 kDa. The methyltransferase is 517 residues long (approx. 60 kDa). The two genes are in opposite orientation and overlap by 37 base pairs on the chromosome. The deduced amino acid sequence of the putative endonuclease gene revealed long stretches of hydrophobic amino acids that may form the structural basis of the unusual aggregation properties of the restriction endonuclease. The amino acid sequence of the methylase shows homologies with other type II methyltransferases. © 1997 Elsevier Science Ltd

Keywords: Recombinant DNA; Corynebacterium equii; EcoRV isoschizomer; DNA methylation; Primary sequence; Hydrophobicity

Int. J. Biochem. Cell Biol. (1997) 29, 895-900
INTRODUCTION
Over a hundred type II restriction-modification systems have been cloned and their organization and nucleotide sequences analysed (Wilson, 1991). In this paper we report the cloning and sequencing of the putative genes of the CeqI restriction-modification system discovered in a Gram-positive bacterium, Corynebacterium equii. The recognition sequence of CeqI was found to be identical with that of EcoRV (Duda et al., 1987). CeqI endonuclease was purified to homogeneity and characterized biochemically. Some of its properties were found to be quite different from those of EcoRV (Izsvák and Duda, 1989; Izsvák et al., 1992). Characteristically, it formed inactive complexes of 12-20 subunits under physiological conditions in the absence of DNA, while it dissociated into active tetramers in the presence of substrate. The isolated enzyme complex was partially converted into monomers and oligomers in the presence of detergents or in high ionic strength solutions (Jobbágy et al., 1992).

We have cloned the genes of the CeqI restriction-modification system, as primary sequence data might help to explain the unusual properties of the enzyme. Amino acid sequence homologies between CeqI and EcoRV could reveal information on the molecular mechanisms of DNA sequence recognition of these proteins.

*To whom all correspondence should be addressed.
†Present address: Dept. Genetics and Cell Biology, BioScience Center, University of Minneapolis, St. Paul, MN 55108, U.S.A., phone: 612 625 5754, fax: 612 624 4206.
Abbreviations: aa, amino acid(s); LB medium, Luria-Bertani medium; ORF, open reading frame; R-M, restriction-modification; u, unit of enzyme activity; SD, Shine-Dalgarno sequence.
Received 14 October 1996; accepted 18 February 1997.
MATERIALS AND METHODS
Construction of a phagemid library
DNA was purified from 10 g of C. equii cells using Proteinase K digestion, phenol extraction and isopropanol precipitation. The purified C. equii chromosomal DNA was partially digested with MboI at 37°C to get the maximum number of fragments at around 10 kb. The λpGY97 phagemid vector (Vincze and Kiss, 1990) was digested with XhoI and XbaI and dephosphorylated by calf intestine phosphatase (CIP) treatment. The pretreated vector was digested with BamHI and ligated with the partially digested chromosomal DNA at a molar ratio of 1:2. The ligate was packaged in vitro into phage particles (Scalenghe et al., 1981). The K-1400 host was infected with these in vitro packaged phage particles to produce a phage library.

Isolation of the genes of the CeqI restriction-modification system
Phagemid DNA prepared from the library was digested with an excess of R.CeqI. This reaction was repeated three times, with phenol/chloroform extraction and ethanol precipitation between each reaction. The digested DNA was used for the transformation of E. coli K-1400 cells. The recombinant DNA prepared from K-1400 cells was subsequently transformed into E. coli strains JM107 or JM109. Phagemid DNA purified from the colonies was tested for DNA modification by R.CeqI digestion. (In the control experiment, lambda DNA mixed with the recombinant phagemid DNA was used to check the activity of the enzyme.) Phagemids resistant to endonuclease cleavage were selected and retested in the cotransformation experiment.

Plasmid pBRPC9 is a derivative of plasmid pBR322, constructed as follows: the PstI-EcoRI fragment from pBR322 was replaced with a PstI-EcoRI fragment from plasmid pPC9, resulting in a partial deletion of the pBR322 ampicillin gene. Plasmid pPC9 was derived from pUC19 by the insertion of the double-stranded oligomer GGGATATCCC into its polylinker region at the SmaI site.
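The role of the inserted oligomer can be illustrated with a short site search. The sketch below is only an illustration: it assumes the shared EcoRV/CeqI recognition sequence is GATATC (the text states only that the two enzymes recognize the same sequence), and the function name `find_sites` is hypothetical.

```python
import re

# Assumed recognition sequence of EcoRV and its isoschizomer CeqI.
# The site is palindromic, so scanning one strand is sufficient.
SITE = "GATATC"


def find_sites(seq: str, site: str = SITE) -> list[int]:
    """Return 0-based start positions of every occurrence of `site` in `seq`.

    A zero-width lookahead is used so that overlapping matches would also be
    reported (not possible for GATATC itself, but safe for other sites).
    """
    pattern = re.compile(f"(?={re.escape(site.upper())})")
    return [m.start() for m in pattern.finditer(seq.upper())]


# The oligomer inserted into the pUC19 polylinker to create pPC9.
oligo = "GGGATATCCC"
print(find_sites(oligo))  # [2]
```

The oligomer carries exactly one candidate site, which is what makes a pPC9-derived fragment a convenient reporter for methylase protection.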
In vivo restriction experiments
In vivo restriction activity was tested by determining the plating efficiency of lambda phage on JM109 E. coli cells carrying the recombinant constructs. Clones were selected that showed a thousand times higher resistance to infection than control cells. Lambda phages grown on these recombinant cells were very effective in reinfecting the same bacterial strain, proving that in vivo restriction was responsible for the poor growth of the phage in the first cycle of infection.

Sequencing
Restriction fragments of interest were subcloned into M13 mp18 and mp19 vectors. The nucleotide sequence was determined by the dideoxy chain termination method (Sanger et al., 1978) using 35S-dATP and Sequenase Version 2.0 (United States Biochemical Corp., Cleveland, OH, U.S.A.). Both strands were sequenced.

RESULTS
Isolation of the putative genes coding for the CeqI R-M system
The selection of the gene coding for the methyltransferase was based on the resistance of self-modifying recombinant phagemids to digestion by the corresponding endonuclease (Szomolanyi et al., 1980). The phagemid library of C. equii was digested to completion with an excess of CeqI. All phagemid clones were expected to be cleaved by the endonuclease except those modified by the recombinant methylase. This digestion mixture was used to transform K-1400 cells. After antibiotic selection 55 colonies were obtained. Recombinant phagemids purified from these transformants were not affected by digestion with R.CeqI, while control lambda DNA in the same mixtures was digested.

These large phagemid constructs (over 48 kb) were rather unstable, and their size decreased dramatically between subsequent transformation steps until smaller but stable plasmids were established. Since CeqI cleavage sites might get lost during this process, cotransformation experiments were carried out to select those in which methylase activity was not affected by deletion. pBRPC9 was cotransformed with the plasmids into E. coli JM109 and transformants resistant to both ampicillin and tetracycline were selected. Digestion of pBRPC9 with BamHI yields a 400 bp fragment that has two CeqI sites, one in the middle of the fragment (the phagemids did not contain BamHI recognition sites). The CeqI site-containing BamHI restriction fragment of plasmid pBRPC9 proved to be resistant to CeqI digestion in the case of three phagemids, named pCM7, pCM14 and pCM22.

Assuming that the genes coding for the methylase and endonuclease proteins could be in close proximity on the C. equii chromosome, all methylase-positive recombinant clones were tested for endonuclease activity, both in vivo and in vitro. Restriction activity was tested by determining the efficiency of plating of lambda phage on JM109 E. coli cells carrying the investigated recombinants (Trautner et al., 1974). All three clones showed approx. 10^3-fold higher resistance to the infection than control cells. The clone pCM7 was selected for further experiments (and renamed pCMR7, Fig. 1).

Fig. 2. Sequencing strategy and functional analysis of CeqI restriction-modification genes and restriction map of the 3 kb long KpnI-SalI fragment of pCMR7 plasmid. Bold arrows indicate the open reading frames on the cloned DNA; lines represent the DNA sequences present in the constructs; thin arrows show the subcloned DNA fragments and the direction of the sequencing. The effects of deletions on the activity of the enzymes are shown (+: active, -: inactive). The in vivo activity of the methyltransferase (M) was determined by in vitro digestion of the DNA isolated from the transformed bacteria; in vivo restriction activity of R.CeqI (R) was monitored by determination of the relative plating efficiency of lambda phages grown on E. coli strains harbouring plasmids carrying the ceqIR gene. Deletion mutations affecting the methyltransferase enzyme were non-viable, unless the putative endonuclease gene was also deleted. Non-viable mutants are indicated by asterisks.
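The plating-efficiency comparison behind the "approx. 10^3" figure can be sketched as a simple ratio of phage titres. The function names and the titres used below are hypothetical illustrations, not data from the experiments.

```python
def efficiency_of_plating(titre_test: float, titre_control: float) -> float:
    """Phage titre on the test strain relative to the control strain.

    Both titres must be in the same units (e.g. plaque-forming units per ml).
    """
    if titre_control <= 0:
        raise ValueError("control titre must be positive")
    if titre_test < 0:
        raise ValueError("test titre cannot be negative")
    return titre_test / titre_control


def fold_restriction(titre_test: float, titre_control: float) -> float:
    """Fold reduction in plating caused by restriction (inverse of the EOP)."""
    eop = efficiency_of_plating(titre_test, titre_control)
    return float("inf") if eop == 0 else 1.0 / eop


# Hypothetical titres: lambda plated on control cells and on cells
# carrying a restricting plasmid.
control_titre = 2.0e9  # pfu/ml on JM109 without the R-M genes (made-up value)
test_titre = 2.0e6     # pfu/ml on JM109 carrying the plasmid (made-up value)

print(efficiency_of_plating(test_titre, control_titre))
print(fold_restriction(test_titre, control_titre))
```

With these made-up numbers the efficiency of plating is about 1e-3, i.e. roughly a thousandfold restriction. Phage recovered from the restricting host would be expected to plate with an efficiency near 1 on the same host, which is the second-cycle observation described in the text.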
Fig. 1. Plasmid map of pCMR7 and pBRPC9. pCMR7 is one of three phagemids carrying the genes of the CeqI restriction-modification system. Details of its construction are described in Materials and Methods. Plasmid pBRPC9 is a derivative of plasmid pBR322 as follows: the PstI-EcoRI fragment from pBR322 was replaced with a PstI-EcoRI fragment from plasmid pPC9 resulting in a partial deletion of the pBR322 ampicillin gene. Plasmid pPC9 was derived from pUC19 by the insertion of GGGATATCCC double stranded oligomer into its polylinker region at the SmaI site.
Nucleotide sequences of the putative genes of the CeqI R-M system (EMBL accession number Z34099)
Sequence determination of the 2979 bp KpnI-SalI fragment of pCMR7 revealed two ORFs in opposite orientation. Starting from the KpnI site, the first 133 bp were found to be part of the lambda genome, while the last 276 nucleotides were from pBR322 (Fig. 2). These flanking sequences were omitted from the published sequences.

The first ORF starts at nucleotide 286. A potential Shine-Dalgarno sequence (AGAGGG) could be identified upstream of the translational start codon. According to the analysis of deletion mutants, this ORF could code for the endonuclease, since mutations in this gene lead to the loss of in vivo restriction (Fig. 2). The 810 bp ceqIR gene corresponds to a protein of 270 amino acids (Fig. 3a). The molecular weight of the predicted protein (31.6 kDa) is in good agreement with the experimental data, 32 ± 2 kDa, measured on CeqI (Duda et al., 1987; Izsvák et al., 1992).

Bacteria transformed with pCMR7 mutants carrying an intact ceqIR gene and deletions in the other ORF, located on the complementary strand, were not viable. This suggests that the other ORF codes for the methyltransferase enzyme (Fig. 2). The most probable start codon is the AUG triplet at position 62, but two GUG triplets can also be considered. Potential Shine-Dalgarno sequences are shown in bold letters in Fig. 3(b). The predicted polypeptide consists of 517 amino acids (Fig. 3b). Mobility shift experiments with C. equii extracts suggested a 50-55 kDa molecular mass for the methylase (Izsvák, unpublished). The two ORFs coding for the endonuclease and the methyltransferase overlap by 37 nucleotides.
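The length and mass figures quoted above can be sanity-checked with simple arithmetic. The sketch below takes the numbers in the text at face value (810 bp, 270 and 517 residues) and uses a rough average residue mass of 110 Da, which is an assumption of this illustration rather than the method used to obtain the 31.6 kDa value; the function names are hypothetical.

```python
# Assumption: a commonly used rough average mass per amino acid residue.
AVG_RESIDUE_MASS_DA = 110.0


def codon_count(orf_bp: int) -> int:
    """Number of codons in an ORF of the given length in base pairs."""
    if orf_bp <= 0 or orf_bp % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    return orf_bp // 3


def approx_mass_kda(n_residues: int, avg_mass_da: float = AVG_RESIDUE_MASS_DA) -> float:
    """Crude molecular mass estimate in kDa from the residue count alone."""
    return n_residues * avg_mass_da / 1000.0


print(codon_count(810))       # 270
print(approx_mass_kda(270))   # rough estimate for the endonuclease
print(approx_mass_kda(517))   # rough estimate for the methyltransferase
```

The crude estimates (just under 30 kDa and about 57 kDa) land within roughly 10% of the composition-based 31.6 kDa and the approximate 60 kDa given in the text; an exact value requires summing the actual residue masses of the deduced sequence.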
[Fig. 3(a), sequence panel: nucleotide sequence of the R.CeqI coding strand with the deduced amino acid sequence printed beneath it. The sequence text is not legibly recoverable in this copy.]
A hydrophobicity plot of the predicted protein sequence of CeqI endonuclease shows many potential hydrophobic regions, unlike the isoschizomer EcoRV endonuclease and three other well known endonucleases (Fig. 4). This property may form the structural basis of multiple interactions between the subunits, resulting in a complex quaternary structure under physiological conditions (Jobbágy et al., 1992; Izsvák et al., 1992).

Intermolecular amino acid sequence comparisons were performed between R.CeqI and all known endonucleases, as well as the complete content of the SwissProt database (release 25), using the FASTA program (Genetics Computer Group, 1992) and BLASTP on the BLAST server at NCBI (NLM, NIH). No significant homologies were found. Similarly, comparison of the CeqI methylase and endonuclease revealed no homology between the two proteins. However, in the amino acid sequence of the methylase we were able to define the conserved motifs (D/N)PP(Y/F) and FXGXG characteristic of adenine methylases (Fig. 5). Interestingly, these motifs were unusually close to each other in M.CeqI.
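Both analyses in this section, the windowed hydrophobicity profile and the search for the two methylase motifs, can be sketched in a few lines. This is a minimal illustration under stated assumptions: the Kyte-Doolittle scale is used because the scale applied by SeqAid is not given in the text, the seven-residue window follows the Fig. 4 caption (each residue plus the next six), and the toy peptide and function names are hypothetical.

```python
import re

# Kyte-Doolittle hydropathy values (assumed scale; not confirmed by the text).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def windowed_hydropathy(protein: str, window: int = 7) -> list[float]:
    """Mean hydropathy of each residue plus the following `window - 1` residues."""
    values = [KD[aa] for aa in protein.upper()]
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# The two motifs named in the text, written as regular expressions (X = any residue).
MOTIFS = {
    "(D/N)PP(Y/F)": re.compile(r"[DN]PP[YF]"),
    "FXGXG": re.compile(r"F[A-Z]G[A-Z]G"),
}


def find_motifs(protein: str) -> dict[str, list[int]]:
    """Return 0-based start positions of each motif in the protein sequence."""
    seq = protein.upper()
    return {name: [m.start() for m in pat.finditer(seq)] for name, pat in MOTIFS.items()}


toy = "MKFAGSGLLNPPYAA"  # hypothetical peptide, not the M.CeqI sequence
print(find_motifs(toy))                # {'(D/N)PP(Y/F)': [9], 'FXGXG': [2]}
print(windowed_hydropathy("IIIIIII"))  # [4.5]
```

The distance between the two motif start positions (7 residues in the toy peptide) is the kind of spacing the text describes as unusually short in M.CeqI.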
[Fig. 3(a), continued, and Fig. 3(b), sequence panels: remainder of the R.CeqI coding strand with its deduced amino acid sequence, followed by the partial nucleotide sequence of the complementary strand (starting at an MboI site) and the predicted 517-residue amino acid sequence of M.CeqI. The sequence text is not legibly recoverable in this copy.]
Fig. 3. (a) Nucleotide sequence of the R.CeqI coding DNA strand. The Shine-Dalgarno ribosome binding site is shown in bold letters. The potential stop codon is indicated by X. Long hydrophobic stretches of the protein sequence are underlined. Restriction sites indicated in Fig. 2 are italicized (EMBL Accession Number Z34099). (b) Partial nucleotide sequence of the complementary strand coding for M.CeqI and complete predicted amino acid sequence of the protein. The potential start codons are indicated by asterisks; the Shine-Dalgarno-like sequences are in bold. Sequences showing homology to the (D/N)PP(Y/F) and FXGXG motifs of other adenine methylases are underlined.

DISCUSSION
The cloning method we used for the isolation of the methyltransferase gene is a powerful selection technique successfully utilized for the cloning of a number of methyltransferase genes. The resistance of plasmids isolated from the recombinant bacteria to CeqI digestion and protein sequence homologies with other methyltransferases provide further evidence that the isolated gene codes for a methyltransferase enzyme. Finally, lambda phages showed restricted growth on recombinant E. coli cells carrying the putative genes of the CeqI restriction-modification system only in the first replication cycle.

The results of in vivo restriction experiments suggest that the other, partially overlapping open reading frame codes for the endonuclease. In most cases the methyltransferase gene shows a close genetic linkage with the endonuclease gene. The facts that the gene (product) did not allow survival of the transformed cells in the absence of a functional methyltransferase gene (deletion experiments) and that the molecular mass of the predicted protein is in good agreement with previously published experimental data further support the presumption.

Fig. 4. Comparison of hydrophobicity plots of R.CeqI (270 aa), R.EcoRV (245 aa), R.BamHI (213 aa) and R.EcoRI (276 aa). Hydrophobicity data were calculated with the computer program SeqAid (Lipman and Wilbur). Values plotted are averages over each amino acid residue plus the next six.

Fig. 5. Alignment between homologous regions of M.CeqI and several adenine methyltransferases. Identical amino acids are shown in bold letters. Published sequences of the following methylases were used: RsrI (Kaszubska et al., 1989), MboII (Bocklage et al., 1991), EcoRV (Bougueleret et al., 1984), HinfI (Chandrasegaran et al., 1988a), KpnI (Chatterjee et al., 1991), HhaII (Chandrasegaran et al., 1988b), EcaI (Brenner et al., 1990), EcoRI (Greene et al., 1981) and Eco57I (Janulaitis et al., 1991).

Unfortunately, attempts to prove the presence of active endonuclease and methyltransferase enzymes in crude extracts of the recombinant E. coli cells were unsuccessful. DNA cleavage patterns resembling those of CeqI digests were found only after very brief incubation times (1-2 min); longer digestion of lambda DNA resulted in a smear. It is not exceptional to lose the activity of a recombinant protein in a different host organism, but CeqI endonuclease was a fairly stable enzyme in its native host. We have to add, though, that authentic CeqI enzyme is converted into a non-specific endonuclease in suboptimal assay conditions (Izsvák et al., 1992), and previous studies on CeqI did not rule out a stabilizing role of an intensively coloured substance associated with the enzyme (Izsvák et al., 1992). The lack of this compound in recombinant E. coli cells and the presence of non-specific nucleases (totally absent in C. equii) might have interfered with the measurement of CeqI in crude E. coli extracts.

Purified CeqI exhibits peculiar aggregation properties. A hydrophobicity plot of the protein predicted on the basis of the DNA sequence of the putative endonuclease gene shows many hydrophobic regions. This property could explain the observed behaviour of the enzyme.

Acknowledgements: We acknowledge the excellent help of Maria Fejes throughout the experiments. The help of Dr. László Szilak in the synthesis of oligonucleotides is also acknowledged. This work was supported by OTKA (National Science Foundation) grants.
REFERENCES
Duda E., Izsvák Z. and Orosz A. (1987) Isolation and purification of CeqI endonuclease. Nucleic Acids Research 15, 1334.
Genetics Computer Group (1992) Sequence Analysis Software Package version 7.2, Science Drive, Madison, Wisconsin, USA 53711.
Izsvák Z. and Duda E. (1989) 'Star' activity and complete loss of specificity of CeqI endonuclease. Biochemistry Journal 258, 301-303.
Izsvák Z., Jobbágy Z. and Duda E. (1992) Purification and characterization of CeqI restriction endonuclease. Zeitschrift für Naturforschung 47c, 830-834.
Jobbágy Zs., Izsvák Zs. and Duda E. (1992) Positive co-operative interaction between the subunits of CeqI endonuclease. Biochemistry Journal 286, 85-88.
Scalenghe F., Turco E., Edstrom J. E., Pirotta V. and Melli M. (1981) Microdissection and cloning of DNA from a specific region of Drosophila melanogaster polytene chromosomes. Chromosoma 82, 205-216.
Szomolanyi É., Kiss A. and Venetianer P. (1980) Cloning the modification methylase gene of Bacillus sphaericus R in Escherichia coli. Gene 10, 219-225.
Trautner T. A., Pawlek B., Bron S. and Anagnostopoulos C. (1974) Restriction and modification in B. subtilis. Molecular and General Genetics 131, 181-191.
Vincze É. and Kiss G. B. (1990) A phosphate group at the cos ends of phage lambda DNA is not a prerequisite for in vitro packaging: an alternative method for constructing genomic libraries using a new plasmid vector, λpGY97. Gene 96, 17-22.
Wilson G. G. (1991) Organization of restriction-modification systems. Nucleic Acids Research 19, 2539-2565.