Sequence motifs common to the Eco RII restriction endonuclease and the proposed sequence specificity domain of three DNA-[cytosine-C5] methyltransferases


Gene, 125 (1993) 65-68
© 1993 Elsevier Science Publishers B.V. All rights reserved. 0378-1119/93/$06.00

GENE 06923

Short Communications

Sequence motifs common to the EcoRII restriction endonuclease and the proposed sequence specificity domain of three DNA-[cytosine-C5] methyltransferases

(Restriction-modification systems; sequence alignment; target recognition domain)

Valeri G. Kossykh (a), Alexander V. Repyk (b) and Stanley Hattman (a)

(a) Department of Biology, University of Rochester, Rochester, NY 14627, USA; and (b) Institute of Biochemistry and Physiology of Microorganisms, Russian Academy of Sciences, Pushchino, Moscow Region 142292, Russia. Tel. (7-095-925-7448)

Received by F. Barany: 18 May 1992; Revised/Accepted: 31 October/3 November 1992; Received at publishers: 5 November 1992

SUMMARY

We have compared the deduced amino acid (aa) sequences of the EcoRII restriction endonuclease (R.EcoRII) and the proposed specificity (target recognition) domains of three DNA-[cytosine-C5] methyltransferases (MTases), M.EcoRII, M.Dcm, and M.SPR, each of which recognizes the same nucleotide sequence, CCWGG (where W is A or T). We have identified a region containing sequence motifs that are partially conserved in the MTases and R.EcoRII. This may be the first example of aa sequence homology between a MTase specificity (target recognition) domain and its cognate restriction endonuclease (ENase). It suggests that this region is important for DNA recognition by R.EcoRII and that the EcoRII ENase and MTase genes may have evolved from a common progenitor.
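The shared recognition sequence can be made concrete with a short sketch. It is an illustration only, not part of the original analysis: the function name `find_ccwgg_sites` and the test DNA string are hypothetical. It expands the degenerate symbol W to A or T and reports, for each site, the position of the internal (second) C, which is the base these MTases methylate.

```python
import re

# W stands for A or T, so CCWGG expands to the pattern CC[AT]GG.
CCWGG = re.compile(r"CC[AT]GG")


def find_ccwgg_sites(dna: str) -> list:
    """Return (start, site, internal_c_index) for each CCWGG site (0-based)."""
    hits = []
    for match in CCWGG.finditer(dna.upper()):
        # The internal C is the second base of the site.
        hits.append((match.start(), match.group(), match.start() + 1))
    return hits


if __name__ == "__main__":
    # Hypothetical test sequence: CCAGG and CCTGG are sites, CCGGG is not.
    for start, site, c_index in find_ccwgg_sites("AACCAGGTTCCTGGCCGGG"):
        print(start, site, c_index)
```

Because a CCWGG site cannot overlap another CCWGG site, a plain non-overlapping scan is sufficient here.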

INTRODUCTION

Analysis of the deduced aa sequences of mono- and multispecific DNA-[cytosine-C5] MTases has revealed a common architecture for these enzymes. This consists of ten conserved blocks of 10-20 aa which are separated by regions of variable length. A long variable region [50-120 aa for monospecific MTases and 200-300 aa for multispecific MTases] separates conserved regions VIII and IX; it has been proposed that this variable region determines nt sequence (target recognition) specificity (Lauster et al., 1989; Posfai et al., 1989). Evidence supporting this was presented for the monospecific MTases, M.HpaII and M.HhaI (Klimasauskas et al., 1991), and for the multispecific phage SPR MTase (Lauster et al., 1989; Lange et al., 1991). Primary structure similarities have been found in the specificity (target recognition) domain of MTases recognizing identical or related DNA sequences (Lauster et al., 1989). However, no similarity has been reported between a MTase specificity (target recognition) domain and its cognate ENase for at least nine different type-II systems (Chandrasegaran and Smith, 1988; Lauster, 1989). In contrast, R.EcoRV has a short motif similar to one found in several heterospecific ENases and MTases, as well as in the cognate M.EcoRV (Thielking et al., 1991). In the X-ray structure, specific H-bonds and hydrophobic interactions are made between the target DNA sequence and several residues within this motif (Winkler, 1992). Although it has been proposed that this motif is important for nt sequence recognition, mutation of these residues does not alter DNA binding but does sharply reduce specific cleavage (Vermote et al., 1992). Thus, it remains open whether the motif is in the sequence specificity domain.

Correspondence to: Dr. V. G. Kossykh, Department of Biology, University of Rochester, Rochester, NY 14627, USA. Tel. (716) 275-3846; Fax (716) 275-2070.
Abbreviations: aa, amino acid(s); bp, base pair(s); ENase (or R.), restriction endonuclease; kb, kilobase(s); MTase (or M.), methyltransferase; nt, nucleotide(s); W, A or T.

EXPERIMENTAL AND DISCUSSION

(a) The aa sequence comparison for R.EcoRII and CCWGG-specific MTases

In this report we compare the primary structure of R.EcoRII (Kossykh et al., 1989) with the proposed specificity (target recognition) domain for two monospecific MTases (M.EcoRII and M.Dcm) and one multispecific MTase (M.SPR), each of which methylates the internal C in the sequence CCWGG. Analysis of the aa sequences was done with the GCG (Madison, WI, USA) software package (Devereux et al., 1984); e.g., the PILEUP program was used to construct multiple alignments of the R.EcoRII aa sequence against the so-called variable region [containing the specificity (target recognition) domain] of the three MTases. The alignments revealed the presence of six blocks of homology (A-B-CDE-F) between R.EcoRII and the monospecific MTases, and five blocks of homology (A-C-D-E-F) with the multispecific M.SPR MTase.

[Fig. 1: schematic of M.SPR, M.EcoRII, M.Dcm and R.EcoRII, showing the target recognition (sequence specificity) domain, conserved regions IX and X, and blocks A, B, C, D, E, F and CDE*.]

Fig. 1. Schematic illustration of the aa sequences of R.EcoRII and three C5-MTases. The ten conserved regions (I-X) found in the MTases are denoted by solid and hatched boxes. Motifs common to R.EcoRII and the three MTases are found in blocks (represented by stippled A-F boxes), which are located in the variable specificity (target recognition) domain between regions VIII and IX. R.EcoRII does not contain sequences corresponding to regions I-X found in the MTases. The motif CDE* is shown as a bar. The aa sequences of blocks A-F are shown in Fig. 2.

[Fig. 2: aligned aa sequences of M.SPR, M.Dcm, M.EcoRII and R.EcoRII for blocks A, C, D, E and F, with intervening sequences of 13, 10, 4 and 19 aa marked; an additional consensus fragment, ..YA..H..K, is also shown. The consensus and motif lines are:]

Block A
consensus   IS.....Q......L.e..Vd.KYIL.......l.....
motif              QxxxxxxLxexxVdxKYIL

Block CDE
consensus   gnGFG..........SVS.tlsaR..K........i.Id.G....AT..
motif         GFGxxxxxxxxxxSVxxxxxxRxxK

Fig. 2. Alignment of the aa sequences of R.EcoRII and three C5-MTases within the sequence specificity (target recognition) domain; a contiguous block (starting at aa 278 of M.SPR and corresponding to CDE* in Fig. 1) and interrupted blocks (starting at aa 245 of M.SPR and corresponding to C, D, and E in Fig. 1) are compared with CDE of the other three enzymes. The numbers following each enzyme symbol denote the aa position relative to the N terminus. Block C in M.SPR [starting at residue 245] is part of the specificity (target recognition) domain involved in recognition of CCWGG (Wilke et al., 1988; Lange et al., 1991). Initial alignment was obtained with the PILEUP program from published sequences (Buhk et al., 1984; Som et al., 1987; Hanck et al., 1989; Kossykh et al., 1989) and then refined manually. Dashes denote gaps in the alignment, and slashes (/) delimit intervening sequences. Identical and chemically similar [K-R, T-S, L-I-V, D-E, N-Q, F-Y] aa are enclosed in boxes. In the consensus sequences, bold capital letters denote identical aa for all four enzymes, capital letters denote identical or chemically similar aa in R.EcoRII and at least two MTases, and lower case letters denote identical or chemically similar aa in at least three of the enzymes. Letter x in the motif lines refers to any aa.
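The notation of Fig. 2 and the shuffling test used in section (a) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the GCG procedure: BESTFIT builds gapped local alignments, whereas the sketch scores two equal-length segments position by position. All names (`SIMILAR_GROUPS`, `similar`, `motif_to_regex`, `segment_score`, `shuffle_z`) are hypothetical, and treating every motif letter as "any member of its similarity group" is an assumption based on the caption.

```python
import random
import re
import statistics

# Chemically similar groups as listed in the Fig. 2 caption.
SIMILAR_GROUPS = ["KR", "TS", "LIV", "DE", "NQ", "FY"]
CLASS_OF = {aa: group for group in SIMILAR_GROUPS for aa in group}


def similar(a: str, b: str) -> bool:
    """True if two residues are identical or belong to the same group."""
    a, b = a.upper(), b.upper()
    return a == b or (a in CLASS_OF and CLASS_OF[a] == CLASS_OF.get(b))


def motif_to_regex(motif: str) -> re.Pattern:
    """Convert a motif line such as 'VdxKYIL' to a regex; 'x' is any aa.

    Assumption: each letter may be replaced by any member of its group.
    """
    parts = []
    for ch in motif:
        if ch == "x":
            parts.append(".")
        else:
            aa = ch.upper()
            parts.append("[" + CLASS_OF.get(aa, aa) + "]")
    return re.compile("".join(parts))


def segment_score(s1: str, s2: str) -> int:
    """Ungapped score: count of positions that are identical or similar."""
    return sum(similar(a, b) for a, b in zip(s1, s2))


def shuffle_z(s1: str, s2: str, n: int = 100, seed: int = 0) -> float:
    """Z-score of the real score against n random shufflings of s2."""
    rng = random.Random(seed)
    real = segment_score(s1, s2)
    shuffled_scores = []
    for _ in range(n):
        residues = list(s2)
        rng.shuffle(residues)
        shuffled_scores.append(segment_score(s1, "".join(residues)))
    mean = statistics.mean(shuffled_scores)
    sd = statistics.pstdev(shuffled_scores)
    if sd == 0:
        return 0.0  # no spread among shuffled scores: Z is undefined
    return (real - mean) / sd
```

For example, `motif_to_regex("VdxKYIL").pattern` is `[LIV][DE].[KR][FY][LIV][LIV]`. Shuffling preserves aa composition, so a high Z-score indicates that the order of residues, not just their frequencies, accounts for the match.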

Fig. 1 shows a schematic representation of the alignment of the complete aa sequences, and Fig. 2 shows an alignment of the variable regions of the three MTases and R.EcoRII. In addition to showing sequence similarity at the aa level, the blocks have the same relative positions within their respective genes. We checked the statistical significance (Z-scores) of these similarities using the BESTFIT program, before and after 100 random shufflings. Pairwise comparisons of motif A of the mono- and multispecific MTases against R.EcoRII all yielded Z-scores > 5, which indicates that the sequence similarities were not due simply to chance. Pairwise alignments of motif CDE of the MTase sequences and R.EcoRII yielded similar scores, except in comparison with M.SPR. However, it is interesting to note that motif CDE appears to be present in an interrupted form in M.SPR (Fig. 1). In addition, a related contiguous sequence, CDE*, was found just upstream from block E (Figs. 1 and 2); the Z-score for this sequence was 2.5, so we are not certain of its significance.

(b) Additional homologies

We also used the FASTA program to search the SWISS BANK protein database for homology to motifs A and CDE. For motif A, the highest matches (Z-scores of 5-7.9) corresponded to R.EcoRII, M.EcoRII, M.Dcm, and M.SPR. Moreover, significant matches were found with the MTases of bacteriophages ρ11S, φ3T and H2, as well as to the bacterial MTases, M.NlaX and M.HpaII. In contrast to these results, the CDE motif was detected in R.EcoRII, M.EcoRII, and M.Dcm, and they had the three highest scores.

(c) Conclusions

(1) These results demonstrate aa sequence homology between the proposed specificity (target recognition) domain of a MTase and its cognate ENase. We suggest that the motifs are important for sequence recognition by R.EcoRII and that there may have been a common ancestor of the genes specifying R.EcoRII and the isospecific MTases. It is widely believed that sequence specificity domains of type-II ENases and cognate MTases do not make identical contacts with their target sites, because the ENases typically function as dimers while the MTases appear to be monomers (Modrich, 1982; Lauster, 1989). Thus, we were surprised to find a similarity between the proposed EcoRII sequence specificity domains.

(2) The catalytic sites of an ENase and its cognate MTase are also expected to be different (because the reactions they catalyze differ), but it is likely that sequence specificity (target recognition) is mediated by another domain (both domains are probably in close proximity to one another in the active site). In this context, it has been shown recently that the R.FokI sequence recognition and cleavage domains are physically distinct (Li et al., 1992). Moreover, Friedman and Ansari (1992) have shown that the catalytic site for methyl transfer and covalent attachment to the target cytosine is in the N-terminal half of M.EcoRII. This is upstream from the proposed sequence specificity domain shown in Fig. 2. There is no detailed information available, yet, on the location of the catalytic domain in R.EcoRII.

ACKNOWLEDGEMENTS

This work was supported by grants from the Russian Academy of Sciences and from the U.S. National Institutes of Health (GM29227). We thank Hamilton Smith, colleague Howard Ochman and the referees for their helpful suggestions and criticism.

REFERENCES

Buhk, H.-J., Behrens, B., Tailor, R., Wilke, K., Prada, J.J., Günthert, U., Noyer-Weidner, M., Jentsch, S. and Trautner, T.A.: Restriction and modification in Bacillus subtilis: nucleotide sequence, functional organization and product of the DNA methyltransferase gene of bacteriophage SPR. Gene 29 (1984) 51-61.
Chandrasegaran, S. and Smith, H.O.: Amino acid sequence homologies among twenty-five restriction endonucleases and methylases. In: Sarma, M.H. and Sarma, R.H. (Eds.), From Proteins to Ribosomes, Vol. 1. Adenine Press, Guilderland, NY, 1988, pp. 149-156.
Devereux, J., Haeberli, P. and Smithies, O.: A comprehensive set of sequence analysis programs for the VAX. Nucleic Acids Res. 12 (1984) 387-395.

Friedman, S. and Ansari, N.: Binding of the EcoRII methyltransferase to 5-fluorocytosine-containing DNA. Isolation of a bound peptide. Nucleic Acids Res. 20 (1992) 3241-3248.
Hanck, T., Gerwin, N. and Fritz, H.-J.: Nucleotide sequence of the dcm locus of Escherichia coli K-12. Nucleic Acids Res. 17 (1989) 5844.
Klimasauskas, S., Nelson, J.L. and Roberts, R.J.: The sequence specificity domain of cytosine-C5 methylases. Nucleic Acids Res. 19 (1991) 6183-6190.
Kossykh, V., Repyk, A.V., Kaliman, A. and Buryanov, Ya.: Nucleotide sequence of the EcoRII restriction endonuclease gene. Biochim. Biophys. Acta 1009 (1989) 290-292.
Lange, C., Jugel, A., Walter, J., Noyer-Weidner, M. and Trautner, T.A.: 'Pseudo' domains in phage-encoded DNA methyltransferases. Nature 352 (1991) 645-648.
Lauster, R.: Evolution of type II methyltransferases: a gene duplication model. J. Mol. Biol. 206 (1989) 313-321.
Lauster, R., Trautner, T.A. and Noyer-Weidner, M.: Cytosine-specific type II methyltransferases: a conserved enzyme core with variable target-recognition domains. J. Mol. Biol. 206 (1989) 305-312.
Li, L., Wu, L.P. and Chandrasegaran, S.: Functional domains in FokI restriction endonuclease. Proc. Natl. Acad. Sci. USA 89 (1992) 4275-4279.
Modrich, P.: Studies on sequence recognition by type II restriction and modification enzymes. CRC Crit. Rev. Biochem. 13 (1982) 287-323.
Posfai, J., Bhagwat, A.S., Posfai, G. and Roberts, R.J.: Predictive motifs derived from cytosine methyltransferases. Nucleic Acids Res. 17 (1989) 2421-2435.
Som, S., Bhagwat, A.S. and Friedman, S.: Nucleotide sequence and expression of the gene coding the EcoRII modification enzyme. Nucleic Acids Res. 15 (1987) 313-332.
Thielking, V., Selent, U., Köhler, E., Wolfes, H., Pieper, U., Geiger, R., Urbanke, C., Winkler, F.K. and Pingoud, A.: Site-directed mutagenesis studies with EcoRV restriction endonuclease to identify regions involved in recognition and catalysis. Biochemistry 30 (1991) 6416-6422.
Vermote, C.L.M., Vipond, I.B. and Halford, S.E.: EcoRV restriction endonuclease: communication between DNA recognition and catalysis. Biochemistry 31 (1992) 6089-6097.
Wilke, K., Rauhut, E., Noyer-Weidner, M., Lauster, R., Pawlek, B., Behrens, B. and Trautner, T.A.: Sequential order of target-recognizing domains in multi-specific DNA-methyltransferases. EMBO J. 7 (1988) 2601-2609.
Winkler, F.K.: Structure and function of restriction-endonucleases. Curr. Opin. Struct. Biol. 2 (1992) 93-99.