Gene, 125 (1993) 65-68
© 1993 Elsevier Science Publishers B.V. All rights reserved. 0378-1119/93/$06.00
GENE 06923
Short Communications
Sequence motifs common to the EcoRII restriction endonuclease and the proposed sequence specificity domain of three DNA-[cytosine-C5] methyltransferases
(Restriction-modification systems; sequence alignment; target recognition domain)
Valeri G. Kossykh^a, Alexander V. Repyk^b and Stanley Hattman^a
^a Department of Biology, University of Rochester, Rochester, NY 14627, USA; and ^b Institute of Biochemistry and Physiology of Microorganisms, Russian Academy of Sciences, Pushchino, Moscow Region 142292, Russia. Tel. (7-095-925-7448)
Received by F. Barany: 18 May 1992; Revised/Accepted: 31 October/3 November 1992; Received at publishers: 5 November 1992
SUMMARY
We have compared the deduced amino acid (aa) sequences of the EcoRII restriction endonuclease (R.EcoRII) and the proposed specificity (target recognition) domains of three DNA-[cytosine-C5] methyltransferases (MTases), M.EcoRII, M.Dcm, and M.SPR, each of which recognizes the same nucleotide sequence, CCWGG (where W is A or T). We have identified a region containing sequence motifs that are partially conserved in the MTases and R.EcoRII. This may be the first example of aa sequence homology between a MTase specificity (target recognition) domain and its cognate restriction endonuclease (ENase). It suggests that this region is important for DNA recognition by R.EcoRII and that the EcoRII ENase and MTase genes may have evolved from a common progenitor.
INTRODUCTION
Analysis of the deduced aa sequences of mono- and multispecific DNA-[cytosine-C5] MTases has revealed a common architecture for these enzymes. This consists of ten conserved blocks of 10-20 aa which are separated by regions of variable length. A long variable region [50-120 aa for monospecific MTases and 200-300 aa for multispecific MTases] separates conserved regions VIII and IX; it has been proposed that this variable region determines nt sequence (target recognition) specificity (Lauster et al., 1989; Posfai et al., 1989). Evidence supporting this was presented for the monospecific MTases, M.HpaII and M.HhaI (Klimasauskas et al., 1991), and for the multispecific phage SPR MTase (Lauster et al., 1989; Lange et al., 1991). Primary structure similarities have been found in the specificity (target recognition) domain of MTases recognizing identical or related DNA sequences (Lauster et al., 1989). However, no similarity has been reported between a MTase specificity (target recognition) domain and its cognate ENase for at least nine different type-II systems (Chandrasegaran and Smith, 1988; Lauster, 1989). R.EcoRV, however, has a short motif similar to one found in several heterospecific ENases and MTases, as well as in the cognate M.EcoRV (Thielking et al., 1991). In the X-ray structure, specific H-bonds and hydrophobic interactions are made between the target DNA sequence and several residues within this motif (Winkler, 1992). Although it has been proposed that this motif is important for nt sequence recognition, mutation of these residues does not alter DNA binding but does sharply reduce specific cleavage (Vermote et al., 1992). Thus, it remains open whether the motif is in the sequence specificity domain.

Correspondence to: Dr. V. G. Kossykh, Department of Biology, University of Rochester, Rochester, NY 14627, USA. Tel. (716) 275-3846; Fax (716) 275-2070.
Abbreviations: aa, amino acid(s); bp, base pair(s); ENase (or R.), restriction endonuclease; kb, kilobase(s); MTase (or M.), methyltransferase; nt, nucleotide(s); W, A or T.
EXPERIMENTAL AND DISCUSSION
(a) The aa sequence comparison for R.EcoRII and CCWGG-specific MTases
In this report we compare the primary structure of R.EcoRII (Kossykh et al., 1989) with the proposed specificity (target recognition) domain for two monospecific MTases (M.EcoRII and M.Dcm) and one multispecific MTase (M.SPR), each of which methylates the internal C in the sequence CCWGG. Analysis of the aa sequences was done with the GCG (Madison, WI, USA) software package (Devereux et al., 1984); e.g., the PILEUP program was used to construct multiple alignments of the R.EcoRII aa sequence against the so-called variable region [containing the specificity (target recognition) domain] of the three MTases. The alignments revealed the presence of six blocks of homology (A-B-CDE-F) between R.EcoRII and the monospecific MTases.
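The recognition site shared by the four enzymes can be made concrete with a short illustration. The following is a minimal sketch, assuming a plain top-strand DNA string and 1-based coordinates; the function names and the toy sequence are hypothetical and are not part of the original analysis. It locates every CCWGG site (W = A or T) and reports the position of the internal C, the residue the text describes as the methylation target.

```python
import re

# IUPAC nucleotide codes needed for this site; W stands for A or T.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "W": "[AT]"}


def find_sites(dna, site="CCWGG"):
    """Return (1-based start, matched sequence) for every occurrence of site."""
    core = "".join(IUPAC[base] for base in site)
    # A lookahead lets overlapping occurrences be reported as well.
    pattern = re.compile("(?=(" + core + "))")
    return [(m.start() + 1, m.group(1)) for m in pattern.finditer(dna.upper())]


def internal_c_positions(dna):
    """1-based top-strand positions of the internal (second) C of each CCWGG."""
    return [start + 1 for start, _ in find_sites(dna)]


toy_dna = "TTCCAGGAACCTGGTTCCGGG"  # hypothetical sequence, for illustration only
print(find_sites(toy_dna))
print(internal_c_positions(toy_dna))
```

Only the top strand is scanned here. Because CCWGG reads the same on the complementary strand, each site also carries an internal C on the opposite strand, which this sketch does not report.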
[Fig. 1: schematic drawing of M.SPR, M.EcoRII, M.Dcm and R.EcoRII, showing conserved regions I-X, the target recognition (sequence specificity) domain between regions VIII and IX, and blocks A-F and CDE*. The drawing is not recoverable from the extracted text.]

Fig. 1. Schematic illustration of the aa sequences of R.EcoRII and three C5-MTases. The ten conserved regions (I-X) found in the MTases are denoted by solid and hatched boxes. Motifs common to R.EcoRII and the three MTases are found in blocks A-F (represented by stippled boxes), which are located in the variable specificity (target recognition) domain between regions VIII and IX. R.EcoRII does not contain the nt sequences corresponding to regions I-X found in the MTases. The motif CDE* is shown as a bar. The aa sequences of blocks A-F are shown in Fig. 2.
[Fig. 2: aligned aa sequences of M.SPR, M.Dcm, M.EcoRII and R.EcoRII for motif A and for blocks C, D, E and F. The aligned residues are not recoverable from the extracted text. Legible motif lines: motif A, QxxxxxxLxexxVdxKYIL; motif CDE, GFGxxxxxxxxxxSVxxxxxxRxxK.]

Fig. 2. Alignment of the aa sequences of R.EcoRII and three C5-MTases within the sequence specificity (target recognition) domain; a contiguous block (starting at aa 278 of M.SPR and corresponding to CDE* in Fig. 1) and interrupted blocks (starting at aa 245 of M.SPR and corresponding to C, D, and E in Fig. 1) are compared with CDE of the other three enzymes. The numbers following each enzyme symbol denote the aa position relative to the N terminus. Block C in M.SPR [starting at residue 245] is part of the specificity (target recognition) domain involved in recognition of CCWGG (Wilke et al., 1988; Lange et al., 1991). Initial alignment was obtained with the PILEUP program from published sequences (Buhk et al., 1984; Som et al., 1987; Hanck et al., 1989; Kossykh et al., 1989) and then refined manually. Dashes denote gaps in the alignment, and slashes (/) delimit intervening sequences. Identical and chemically similar [K-R, T-S, L-I-V, D-E, N-Q, F-Y] aa are enclosed in boxes. In the consensus sequences, bold capital letters denote identical aa for all four enzymes, capital letters denote identical or chemically similar aa in R.EcoRII and at least two MTases, and lower case letters denote identical or chemically similar aa in at least three of the enzymes. Letter x in the motif lines refers to any aa.
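The motif notation in the Fig. 2 legend can be read as a simple search pattern. The following is a minimal sketch under stated assumptions: x is treated as any aa, upper- and lower-case motif letters are treated alike, and a motif letter may optionally be replaced by a member of its chemically similar group as listed in the legend. The function names and both toy sequences are hypothetical; they are not the enzyme sequences and this is not the procedure used with the GCG programs.

```python
import re

# Chemically similar aa groups as listed in the Fig. 2 legend.
SIMILAR_GROUPS = ["KR", "TS", "LIV", "DE", "NQ", "FY"]


def similar_set(aa):
    """Return the similarity group containing aa, or aa itself if ungrouped."""
    aa = aa.upper()
    for group in SIMILAR_GROUPS:
        if aa in group:
            return group
    return aa


def motif_to_regex(motif, allow_similar=True):
    """Translate a motif line (x = any aa) into a regular expression string."""
    parts = []
    for ch in motif:
        if ch == "x":
            parts.append(".")
        else:
            residues = similar_set(ch) if allow_similar else ch.upper()
            parts.append("[" + residues + "]")
    return "".join(parts)


def find_motif(motif, protein, allow_similar=True):
    """Return 1-based start positions of motif matches, overlaps included."""
    lookahead = re.compile("(?=" + motif_to_regex(motif, allow_similar) + ")")
    return [m.start() + 1 for m in lookahead.finditer(protein.upper())]


MOTIF_A = "QxxxxxxLxexxVdxKYIL"  # motif A line as legible in Fig. 2

# Hypothetical toy sequences, for illustration only.
exact_toy = "MA" + "QAAAAAALAEAAVDAKYIL" + "GG"
similar_toy = "MA" + "NAAAAAAIADAAIEARFVI" + "GG"

print(find_motif(MOTIF_A, exact_toy, allow_similar=False))
print(find_motif(MOTIF_A, similar_toy, allow_similar=False))
print(find_motif(MOTIF_A, similar_toy, allow_similar=True))
```

Allowing similar residues widens each fixed position to its group, so the second toy sequence matches only in the relaxed mode. A pattern match of this kind says nothing about statistical significance, which the text assesses separately with Z-scores.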
Five blocks of homology (A-C-D-E-F) were found with the multispecific M.SPR MTase. Fig. 1 shows a schematic representation of the alignment of the complete aa sequences, and Fig. 2 shows an alignment of the variable regions of the three MTases and R.EcoRII. In addition to showing sequence similarity at the aa level, the blocks have the same relative positions within their respective genes. We checked the statistical significance (Z-scores) of these alignments using the BESTFIT program. Pairwise comparisons of the MTase sequences against R.EcoRII, before and after 100 random shufflings, yielded Z-scores > 5, which indicates that the sequence similarities were not due simply to chance. Comparisons of motif A and motif CDE in the mono- and multispecific MTases and R.EcoRII all yielded similar scores, except for motif CDE in comparison with M.SPR. However, it is interesting to note that motif CDE appears to be present in an interrupted form in M.SPR (Fig. 1). In addition, a related contiguous sequence, CDE*, was found just upstream from block E (Figs. 1 and 2); the Z-score for this sequence was 2.5, so we are not certain of its significance.

(b) Additional homologies
We also used the FASTA program to search the SWISS BANK protein database for homology to motifs A and CDE. For motif A, the highest matches (Z-scores of 5-7.9) corresponded to R.EcoRII, M.EcoRII, M.Dcm, and M.SPR. Moreover, significant matches were found with the MTases of bacteriophages ρ11S, φ3T and H2, as well as to the bacterial MTases, M.NlaX and M.HpaII. In contrast to these results, the CDE motif was detected in R.EcoRII, M.EcoRII, and M.Dcm, and they had the three highest scores.

(c) Conclusions
(1) These results demonstrate aa sequence homology between the proposed specificity (target recognition) domain of a MTase and its cognate ENase. We suggest that the motifs are important for sequence recognition by R.EcoRII and that there may have been a common ancestor of the genes specifying R.EcoRII and the isospecific MTases. It is widely believed that sequence specificity domains of type-II ENases and cognate MTases do not make identical contacts with their target sites, because the ENases typically function as dimers while the MTases appear to be monomers (Modrich, 1982; Lauster, 1989). Thus, we were surprised to find a similarity between the proposed EcoRII sequence specificity domains.
(2) The catalytic sites of an ENase and its cognate MTase are also expected to be different (because the reactions they catalyze differ), but it is likely that sequence specificity (target recognition) is mediated by another domain (both domains are probably in close proximity to one another in the active site). In this context, it has been shown recently that the R.FokI sequence recognition and cleavage domains are physically distinct (Li et al., 1992). Moreover, Friedman and Ansari (1992) have shown that the catalytic site for methyl transfer and covalent attachment to the target cytosine is in the N-terminal half of M.EcoRII. This is upstream from the proposed sequence specificity domain shown in Fig. 2. There is no detailed information available, yet, on the location of the catalytic domain in R.EcoRII.

ACKNOWLEDGEMENTS
This work was supported by grants from the Russian Academy of Sciences and from the U.S. National Institutes of Health (GM29227). We thank Hamilton Smith, colleague Howard Ochman and the referees for their helpful suggestions and criticism.

REFERENCES
Buhk, H.-J., Behrens, B., Tailor, R., Wilke, K., Prada, J.J., Günthert, U., Noyer-Weidner, M., Jentsch, S. and Trautner, T.A.: Restriction and modification in Bacillus subtilis: nucleotide sequence, functional organization and product of the DNA methyltransferase gene of bacteriophage SPR. Gene 29 (1984) 51-61.
Chandrasegaran, S. and Smith, H.O.: Amino acid sequence homologies among twenty-five restriction endonucleases and methylases. In: Sarma, M.H. and Sarma, R.H. (Eds.), From Proteins to Ribosomes, Vol. 1. Adenine Press, Guilderland, NY, 1988, pp. 149-156.
Devereux, J., Haeberli, P. and Smithies, O.: A comprehensive set of sequence analysis programs for the VAX. Nucleic Acids Res. 12 (1984) 387-395.
Friedman, S. and Ansari, N.: Binding of the EcoRII methyltransferase to 5-fluorocytosine-containing DNA. Isolation of a bound peptide. Nucleic Acids Res. 20 (1992) 3241-3248.
Hanck, T., Gerwin, N. and Fritz, H.-J.: Nucleotide sequence of the dcm locus of Escherichia coli K-12. Nucleic Acids Res. 17 (1989) 5844.
Klimasauskas, S., Nelson, J.L. and Roberts, R.J.: The sequence specificity domain of cytosine-C5 methylases. Nucleic Acids Res. 19 (1991) 6183-6190.
Kossykh, V., Repyk, A.V., Kaliman, A. and Buryanov, Ya.: Nucleotide sequence of the EcoRII restriction endonuclease gene. Biochim. Biophys. Acta 1009 (1989) 290-292.
Lange, C., Jugel, A., Walter, J., Noyer-Weidner, M. and Trautner, T.A.: 'Pseudo' domains in phage-encoded DNA methyltransferases. Nature 352 (1991) 645-648.
Lauster, R.: Evolution of type II methyltransferases: a gene duplication model. J. Mol. Biol. 206 (1989) 313-321.
Lauster, R., Trautner, T.A. and Noyer-Weidner, M.: Cytosine-specific type II methyltransferases: a conserved enzyme core with variable target-recognition domains. J. Mol. Biol. 206 (1989) 305-312.
Li, L., Wu, L.P. and Chandrasegaran, S.: Functional domains in FokI restriction endonuclease. Proc. Natl. Acad. Sci. USA 89 (1992) 4275-4279.
Modrich, P.: Studies on sequence recognition by type II restriction and modification enzymes. CRC Crit. Rev. Biochem. 13 (1982) 287-323.
Posfai, J., Bhagwat, A.S., Posfai, G. and Roberts, R.J.: Predictive motifs derived from cytosine methyltransferases. Nucleic Acids Res. 17 (1989) 2421-2435.
Som, S., Bhagwat, A.S. and Friedman, S.: Nucleotide sequence and expression of the gene coding the EcoRII modification enzyme. Nucleic Acids Res. 15 (1987) 313-332.
Thielking, V., Selent, U., Köhler, E., Wolfes, H., Pieper, U., Geiger, R., Urbanke, C., Winkler, F.K. and Pingoud, A.: Site-directed mutagenesis studies with EcoRV restriction endonuclease to identify regions involved in recognition and catalysis. Biochemistry 30 (1991) 6416-6422.
Vermote, C.L.M., Vipond, I.B. and Halford, S.E.: EcoRV restriction endonuclease: communication between DNA recognition and catalysis. Biochemistry 31 (1992) 6089-6097.
Wilke, K., Rauhut, E., Noyer-Weidner, M., Lauster, R., Pawlek, B., Behrens, B. and Trautner, T.A.: Sequential order of target-recognizing domains in multi-specific DNA-methyltransferases. EMBO J. 7 (1988) 2601-2609.
Winkler, F.K.: Structure and function of restriction-endonucleases. Curr. Opin. Struct. Biol. 2 (1992) 93-99.