Gene, 42 (1986) 89-96
Elsevier

GENE 1547

DNA methyltransferase genes of Bacillus subtilis phages: comparison of their nucleotide sequences

(Recombinant DNA; temperate bacteriophages SPR, SPβ, φ3T, Z; promoters; molecular evolution; Bacillus sphaericus M.BspRI)

A. Tran-Betcke, B. Behrens, M. Noyer-Weidner and T.A. Trautner

Max-Planck-Institut für molekulare Genetik, Ihnestrasse 63/73, D-1000 Berlin 33 (Germany) Tel. 8307260

(Received December 4th, 1985)
(Accepted December 30th, 1985)
SUMMARY
The φ3T DNA methyltransferase (Mtase) gene and most of the SPβ Mtase gene have been sequenced. With the exception of their promoters, no difference was found between the φ3T and SPβ Mtase genes, which code for an enzyme with an Mr of 50 507, consisting of 443 amino acids (aa). Comparison of the deduced aa sequence of the φ3T/SPβ type Mtase (target specificity: GGCC and GCNGC) with the previously established sequence of the SPR Mtase (Buhk et al., 1984), which has the target specificity GGCC and CCGG, reveals strong similarities between these two types of enzymes. There is, however, one striking difference: both the φ3T/SPβ and the SPR enzymes contain, at different positions, inserts of 33 aa which have no homology to each other. We suggest that the methylation specificity unique to each of the two types of Mtases (GCNGC in φ3T/SPβ; CCGG in SPR) depends on these inserts, while the GGCC-specific modification potential common to all Mtases is determined by structures conserved in both types of enzymes. A DNA fragment of the non-modifying phage Z, which shows homology to both flanks of the SPR Mtase gene, was also sequenced. This segment can be described as a derivative of SPR DNA in which the Mtase gene and sequences at its 5' end have been deleted, with the deletion extending between two direct repeats of 25 bp.
INTRODUCTION
Two related kinds of Mtases encoded by a group of temperate Bacillus subtilis bacteriophages show multiple sequence specificity (Günthert and Trautner, 1984). The Mtase of phage SPR methylates cytosines in the sequences GGCC, CCGG (Trautner et al., 1980; Jentsch et al., 1981) and, as recently discovered, also in the sequence CC(A/T)GG (U. Günthert, unpublished results). The Mtases of phages φ3T, SPβ, and ρ11 recognize the sequences GGCC and GCNGC; phage Z has no methylation capacity (Noyer-Weidner et al., 1983). The Mtase genes of these phages, as well as a DNA fragment of phage Z which carries DNA sequences homologous to those flanking the Mtase genes of the other phages, have been cloned (Kiss and Baldauf, 1983; Noyer-Weidner et al., 1985). The nucleotide sequence of the SPR Mtase gene has been established (Buhk et al., 1984; Posfai et al., 1984) and relatedness of this gene to the φ3T-type genes has been documented by DNA/DNA hybridization (Noyer-Weidner et al., 1985).

We are interested in defining the structures which determine the specificity recognition of these enzymes. One experimental approach, which we follow in this communication, is to compare the nucleotide sequences of those Mtase genes which encode, in one polypeptide, an identical (GGCC) and a different (CCGG/GCNGC) specificity. To this end we have sequenced DNA fragments which contain the Mtase genes of phages φ3T and SPβ. A fragment of Z DNA, in which the Mtase gene had been deleted (Kupsch, 1984), was also sequenced.

Abbreviations: aa, amino acid(s); bp, base pair(s); Mtase, DNA methyltransferase; nt, nucleotide(s); ORF, open reading frame; PA, polyacrylamide.

0378-1119/86/$03.50 © 1986 Elsevier Science Publishers B.V. (Biomedical Division)

EXPERIMENTAL

(a) Strains and plasmids

DNA fragments containing parts of the Mtase genes of phages φ3T and SPβ were obtained from pBR328 derivatives pBN16, pBN17, pBN30, pBN32 and pBJ20, pBJ21 (Noyer-Weidner et al., 1985). Plasmid pBKU1 (Kupsch, 1984) provided the relevant DNA of phage Z. All plasmids (see Fig. 1) were propagated on E. coli strain HB101 (Boyer and Roulland-Dussoix, 1969).

(b) DNA sequencing

The Sanger dideoxy chain termination method (Sanger et al., 1977) was followed throughout. Overlapping restriction fragments were cloned into phages M13mp8 and M13mp9 (Messing and Vieira, 1982). [35S]dATP was used in the DNA polymerase reaction starting from the Amersham universal 17-mer primer. Electrophoresis was carried out in 0.3 mm × 200 mm × 600 mm 6% PA gels. Multiple sequence determinations and/or sequencing of complementary strands were performed to maximize accuracy. Sequence evaluation was based on computer programs of Staden (1978) and Isono (1982).

(c) Nucleotide sequences

The approximate location of the φ3T and SPβ Mtase genes in plasmids pBN16 and pBJ20 and their direction of transcription had been determined previously (Noyer-Weidner et al., 1985). A pBR328-derived plasmid (pBKU1), containing a contiguous fragment of phage Z DNA which has homology to DNA flanking the 3' side of the φ3T/SPβ and SPR Mtase genes and the 5' side of the SPR Mtase gene, had also been described (Kupsch, 1984). Phage DNA for sequencing was obtained from these plasmids (Fig. 1) or from plasmids containing subfragments of the Mtase genes. Regions sequenced (Fig. 1) include the Mtase gene of φ3T and its flanking regions, part of the structural gene of SPβ and DNA on its 5' terminus, and Z sequences which have homology to both the 5' and 3' flanking regions of the SPR Mtase gene. We also present in Fig. 1 the Mtase gene of SPR, which had previously been sequenced (Buhk et al., 1984).

Within the 2360 bp of φ3T DNA sequenced, the Mtase gene extends from codon TTG at position 714 to the termination codon TAA at position 2043 (Fig. 2). This assignment agrees with the previous localization of the Mtase gene. The sequence identified represents the only ORF in the anticipated direction of transcription which could code for the Mtase. This ORF codes for 443 aa. Based on the predicted aa composition, the φ3T Mtase would have an Mr of 50 507. This value is in reasonable agreement with the Mr of 47 000 of the φ3T Mtase derived from protein analyses of either in vitro synthesized Mtase or Mtase produced in minicells (Noyer-Weidner et al., 1985).

The φ3T and SPβ Mtases have the same Mr and methylating specificity (Noyer-Weidner et al., 1985). Sequencing of the major part of the SPβ gene revealed identity to the φ3T sequences (Fig. 2). From this result, and also from the finding of the same restriction pattern of the φ3T and SPβ Mtase genes, we assume that those nucleotide sequences of SPβ not sequenced are also identical in SPβ and φ3T. Based on restriction data, identity of φ3T and SPβ DNA is also likely beyond the 3' ends of both genes. On their 5' ends the genes are preceded by an identical region of 52 bp. Further upstream the φ3T and SPβ sequences are different.

The lack of methylation capacity of phage Z is a
consequence of the absence of an Mtase gene in Z DNA (Noyer-Weidner et al., 1985). However, we have observed in Z DNA a single restriction fragment which has homology to the flanking regions of the SPR Mtase gene (Noyer-Weidner et al., 1985). When sequencing this DNA fragment (Figs. 1 and 2) and comparing it with SPR DNA, we realized that Z DNA can be described as a derivative of SPR DNA in which bp 257 to 1924, including the Mtase gene plus 348 bp to the 5' end, have been lost. This 'deletion' in Z extends between two quasi-complete direct repeats in SPR DNA, which are found at coordinates 245 and 1913: (GAAACTGATTAAAATATCTCTTTTA) (GAATCTGAATAAAATTTGTCTTTTA) (Fig. 2). Recombination leading to the deletion in Z must have occurred within the overlined region.

Fig. 1. Schematic representation of Mtase genes and their environs. Mtase genes and their direction of transcription are indicated by open arrows. Boxed regions represent phage DNA, dashed lines vector DNA. Homologous flanking regions, defined either by restriction analysis, DNA/DNA hybridization, or sequence analysis, are represented by the same kind of shadowing. Only a limited number of restriction sites are shown. Symbol Δ represents the deletion. Heavy lines below mark the DNA regions sequenced.
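The repeat and deletion coordinates quoted above can be cross-checked with a few lines of Python. This is only an illustrative sketch: the two repeat strings are copied from the text as transcribed here (the overlining of the original figure is not reproduced), the helper names are hypothetical, and the SPR start coordinate 605 is the TTG position given in the Discussion.

```python
# Direct repeats as transcribed in the text (SPR coordinates 245 and 1913).
REPEAT_5P = "GAAACTGATTAAAATATCTCTTTTA"
REPEAT_3P = "GAATCTGAATAAAATTTGTCTTTTA"


def mismatch_positions(a, b):
    """Return the 1-based positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return [i + 1 for i, (x, y) in enumerate(zip(a, b)) if x != y]


def inclusive_length(first, last):
    """Number of positions in an inclusive coordinate range."""
    return last - first + 1


mismatches = mismatch_positions(REPEAT_5P, REPEAT_3P)
identical = len(REPEAT_5P) - len(mismatches)

deleted_bp = inclusive_length(257, 1924)  # segment missing from Z DNA
repeat_offset = 1913 - 245                # distance between the repeat starts
upstream_bp = 605 - 257                   # deleted DNA 5' of the SPR TTG start

print(f"repeat length: {len(REPEAT_5P)} bp, identical positions: {identical}")
print(f"mismatch positions: {mismatches}")
print(f"deleted: {deleted_bp} bp, repeat offset: {repeat_offset} bp, "
      f"5' of gene: {upstream_bp} bp")
```

With the sequences as transcribed, the two 25-bp repeats agree at 21 positions, and both the inclusive span 257 to 1924 and the offset between the repeat starts come to 1668 bp, which is what a recombination event between the repeats would be expected to remove.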
DISCUSSION
Fig. 2. Nucleotide and predicted aa sequences. Sequences of the Mtase genes have been aligned at the common translational start codon TTG, where aa numbering begins (coordinate +1). Asterisks indicate nt or aa identical to those in the φ3T sequence. Blanks in SPβ, SPR and Z sequences define sequences not determined. Regions where matching nt/aa are absent are represented by dashes. The Shine-Dalgarno (SD) sequence and the -10 and -35 regions of putative promoters are underlined. Two direct repeats in SPR DNA bracketing the deletion in Z DNA are indicated by arrows. The SPR sequence presented is taken from Buhk et al. (1984). SPβ DNA was sequenced in regions corresponding to nt 295 to 1142 and nt 1581 to 1923 of the φ3T sequence (see Fig. 1).

Fig. 3. Schematic representation of aa homologies between phage and B. sphaericus (BspRI) Mtases. Regions of homology between phage enzymes are stippled, while those between phage and bacterial enzymes are hatched. Sequence data for the BspRI Mtase are from Posfai et al. (1983). (Horizontal axis: amino acids, 1 to 500; rows: φ3T, SPR, BspRI.)

To facilitate comparison of the φ3T, SPβ and SPR Mtases, we have aligned in Fig. 2 the sequences at the common codon TTG at locations 714 (φ3T), 420 (SPβ), 605 (SPR), respectively.
This was shown to be the translational start of the SPR Mtase gene (Buhk et al., 1984; R. Lauster, unpublished) and is most likely also the start signal in the φ3T and SPβ Mtase genes. A comparison of the deduced aa sequences of the SPR and φ3T/SPβ Mtases (Fig. 2) reveals strong similarities between the two types of enzymes. They have almost the same number of aa (439 in SPR; 443 in φ3T/SPβ) and extensive regions of homology, ranging between 55% and 90% identity. There are some 50 aa at both the N and C ends of the enzymes which are virtually identical. Amino acid sequences which show strong homology are also found in the central portions of the enzymes (Fig. 3). The close relatedness of the enzymes, based on sequence comparison, is reflected in their immunological cross-reactivity (R. Lauster and U. Günthert, unpublished).

The most striking differences between these enzymes are two regions of 33 nonmatched aa within each Mtase. These 'inserts' have no homology to each other. They extend from aa 92 to 124 in the φ3T/SPβ Mtase and from aa 297 to 329 in the SPR enzyme. We speculate that the inserts represent determinants of that methylation specificity which is unique to each of the two types of Mtases, i.e., CCGG methylation in SPR and GCNGC methylation in φ3T/SPβ. Support for this speculation comes from the localization of mutations. We have found three independent mutants of SPR which are still capable of methylating the sequence GGCC, but are unable to methylate the sequence CCGG, which is uniquely recognized by the SPR enzyme. Two of these mutations are caused by aa changes within the insert of SPR; the third mutation is located three aa away from the insert (Buhk et al., 1984 and unpublished results). In contrast, mutations causing a defect in both Mtase activities were found to be located all over the Mtase gene. We are presently attempting to obtain and localize additional SPR mutants which are defective in only CCGG methylation. Mutagenesis experiments with φ3T are also in progress. In analogy to SPR, mutations within the non-homologous 99-bp insert of φ3T DNA would affect the capacity of this enzyme to methylate the Fnu4HI sequence, which is uniquely recognized by the φ3T Mtase. Since no B. subtilis strain with Fnu4HI-specific restriction/modification has been identified, such mutants could so far not be selected.

As has been shown previously (Buhk et al., 1984; Posfai et al., 1984; Kiss et al., 1985), conserved regions, characterized by 40 to 60% identical aa, are also observed between the SPR enzyme and modification Mtases of B. sphaericus R and B. subtilis R. This would point to an ancestral Mtase gene. Along with this evolutionary consideration, the additional sequence information provided here shows that conserved regions between bacterial and phage Mtases fall exclusively into regions which are shared by the phage enzymes (Fig. 3).

The nucleotide sequences of the φ3T and SPβ Mtase genes, including the insert pattern, are identical. The ρ11 Mtase gene has most likely the same sequence, since its restriction map is identical to those of φ3T/SPβ. Furthermore, it has previously been reported that there is strong DNA/DNA homology between the cloned ρ11 and φ3T Mtase genes (Noyer-Weidner et al., 1985). The restriction maps of the φ3T-type Mtase genes are distinct from that of the corresponding SPR gene. This is due both to real aa differences between the two types of enzymes and to the degeneracy of the third position of the codons used within regions of aa homology.

Looking at sequences surrounding the Mtase genes (Figs. 1 and 2), we recognize in all sequences a highly conserved region starting 52 bp upstream from the translational start signal. However, the preceding sequences are different in all three phage DNAs. The DNA beyond the 5' end of all three genes analyzed here must contain signal elements (underlined in Fig. 2) which define the transcriptional start, since the fragments sequenced (Fig. 1) express the Mtase gene when cloned into heterologous vectors irrespective of their orientation with respect to the vector promoters (Noyer-Weidner et al., 1985; Günthert et al., 1986). Within the conserved 52-bp region we can recognize, at a consensus distance of 6 bp from the translational start codon, a ribosomal binding site GGAGGT. Several sequences in this region would qualify to be the -10 promoter sequence. This would imply that the -35 regions of all genes sequenced fall into the region of non-homology and hence would be different for the three genes analyzed here. The putative '-35 regions' (underlined in Fig. 2) resemble, in the case of SPR and φ3T, the consensus sequence TTGACA of the vegetative σ55 RNA polymerase, whereas the corresponding sequence CTAAA in SPβ is identical to the '-35 region' described for the σ28 RNA polymerase (Gilman et al., 1981; Moran et al., 1982). Apart from the significance for the expression of the phage Mtases studied here, this finding might require a redefinition of the -35 sequence as an element in promoter recognition. We are presently conducting S1 mapping experiments to obtain direct information on the location of the promoter(s) used by these genes.

Nucleotide sequences 3' to the genes and to the deletion of phage Z DNA are almost fully conserved. This includes a palindromic sequence immediately following the TAA stop codon, which might represent a transcriptional stop signal, and an ORF of 72 aa extending between coordinates 2097 and 2313 of φ3T DNA. It is conceivable that this DNA codes for the protein of approx. 11 kDa which was observed in previous studies with φ3T (Noyer-Weidner et al., 1985). The function of this protein is not known.

Through DNA/DNA hybridization experiments and electron microscopic inspection of appropriate heteroduplex molecules we had previously demonstrated homology between the fragment of Z DNA shown in Fig. 1 and both flanks of the SPR Mtase gene. The sequence comparison between the cloned Z and SPR DNAs demonstrated the absence in Z DNA of 1668 bp of DNA, including the Mtase gene. Since neither phage Z nor, for that matter, any of the SPR or φ3T mutants defective in Mtase activity show any loss of viability, the Mtase genes must be dispensable. Still, these genes remain conserved in the methylating phages in the absence of apparent selective pressure for gene maintenance. Such stability is particularly surprising in phage SPR, where repetitious DNA sequences bracketing the Mtase gene would obviously facilitate deletions similar to that found in phage Z.
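The ORF and insert coordinates reported in this paper are mutually consistent under one reading of the coordinate convention, which the following sketch makes explicit. It assumes (this is an assumption, not something the text states) that a gene's start coordinate is the first nt of its start codon and its end coordinate is the first nt of its stop codon; the function names are hypothetical.

```python
def codon_count(start_first_nt, stop_first_nt):
    """Codons from the first nt of the start codon up to, but excluding, the stop codon.

    ASSUMPTION: both coordinates refer to the first nt of the respective codon.
    """
    coding_nt = stop_first_nt - start_first_nt
    if coding_nt % 3 != 0:
        raise ValueError("coordinates are not in the same reading frame")
    return coding_nt // 3


def inclusive_length(first, last):
    """Number of positions in an inclusive coordinate range."""
    return last - first + 1


mtase_codons = codon_count(714, 2043)        # phi3T Mtase ORF (TTG ... TAA)
downstream_codons = codon_count(2097, 2313)  # ORF 3' of the phi3T Mtase gene

insert_phi3t_aa = inclusive_length(92, 124)  # insert in the phi3T/SPbeta Mtase
insert_spr_aa = inclusive_length(297, 329)   # insert in the SPR Mtase
insert_bp = 3 * insert_phi3t_aa              # insert length at the DNA level

mean_residue_mass = 50507 / mtase_codons     # predicted Mr per residue

print(mtase_codons, downstream_codons)       # 443 72
print(insert_phi3t_aa, insert_spr_aa, insert_bp)  # 33 33 99
print(round(mean_residue_mass, 1))           # 114.0
```

Under this convention the φ3T ORF comes to 443 codons and the downstream ORF to 72 codons, matching the values in the text; both inserts span 33 aa, corresponding to the 99-bp insert mentioned for φ3T DNA.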
ACKNOWLEDGEMENTS
We thank R. Lurz for cheerful and patient help in computer work and R. Lauster, U. Günthert, J. Kupsch, K. Wilke and P.A. Terschüren for discussions and the communication of unpublished results. M. N.-W. is supported by the Deutsche Forschungsgemeinschaft (Tr 25/10-1).
REFERENCES
Boyer, H.W. and Roulland-Dussoix, D.: A complementation analysis of the restriction and modification of DNA in Escherichia coli. J. Mol. Biol. 41 (1969) 459-472.

Buhk, H.-J., Behrens, B., Tailor, R., Wilke, K., Prada, J.J., Günthert, U., Noyer-Weidner, M., Jentsch, S. and Trautner, T.A.: Restriction and modification in Bacillus subtilis: nucleotide sequence, functional organization and product of the DNA methyltransferase gene of bacteriophage SPR. Gene 29 (1984) 51-61.

Gilman, M.Z., Wiggs, J.L. and Chamberlin, M.J.: Nucleotide sequences of two Bacillus subtilis promoters used by Bacillus subtilis sigma-28 RNA polymerase. Nucl. Acids Res. 9 (1981) 5991-6000.

Gitt, M.A., Wang, L.-F. and Doi, R.H.: A strong sequence homology exists between the major RNA polymerase σ factors of Bacillus subtilis and Escherichia coli. J. Biol. Chem. 260 (1985) 7178-7185.

Günthert, U. and Trautner, T.A.: DNA methyltransferases of Bacillus subtilis and its bacteriophages. In Trautner, T.A. (Ed.), Methylation of DNA. Current Topics in Microbiology and Immunology, Vol. 108. Springer, Berlin, 1984, pp. 11-22.

Günthert, U., Reiners, L. and Lauster, R.: Cloning and expression of Bacillus subtilis phage DNA methyltransferase genes in Escherichia coli and B. subtilis. Gene 41 (1986) 261-270.

Isono, K.: Computer programs to analyze DNA and aa sequence data. Nucl. Acids Res. 10 (1982) 85-89.

Jentsch, S., Günthert, U. and Trautner, T.A.: DNA methyltransferases affecting the sequence 5'CCGG. Nucl. Acids Res. 9 (1981) 2753-2759.

Kiss, A. and Baldauf, F.: Molecular cloning and expression in Escherichia coli of two modification methylase genes of Bacillus subtilis. Gene 21 (1983) 111-119.

Kiss, A., Posfai, G., Keller, C.C., Venetianer, P. and Roberts, R.J.: Nucleotide sequence of the BsuRI restriction-modification system. Nucl. Acids Res. 13 (1985) 6403-6421.

Kupsch, J.: Untersuchungen zur Evolution der DNA-Methyltransferase-Gene temperenter B. subtilis-Phagen. Diploma Thesis, Freie Universität, Berlin, 1984.

Maniatis, T., Fritsch, E.F. and Sambrook, J.: Molecular Cloning. A Laboratory Manual. Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 1982.

Messing, J. and Vieira, J.: A new pair of M13 vectors for selecting either DNA strand of double-digest restriction fragments. Gene 19 (1982) 269-276.

Moran, Jr., C.P., Lang, N., LeGrice, S.F.J., Lee, G., Stephens, M., Sonenshein, A.L., Pero, J. and Losick, R.: Nucleotide sequences that signal the initiation of transcription and translation in B. subtilis. Mol. Gen. Genet. 186 (1982) 339-346.

Noyer-Weidner, M., Jentsch, S., Pawlek, B., Günthert, U. and Trautner, T.A.: Restriction and modification in Bacillus subtilis: DNA methylation potential of the related bacteriophages Z, SPR, SPβ, φ3T and ρ11. J. Virol. 46 (1983) 446-453.

Noyer-Weidner, M., Jentsch, S., Kupsch, J., Bergbauer, M. and Trautner, T.A.: DNA methyltransferase genes of Bacillus subtilis phages: structural relatedness and gene expression. Gene 35 (1985) 143-150.

Posfai, G., Kiss, A., Erdei, S., Posfai, J. and Venetianer, P.: Structure of the Bacillus sphaericus R modification methylase gene. J. Mol. Biol. 170 (1983) 597-610.

Posfai, G., Baldauf, F., Erdei, S., Posfai, J., Venetianer, P. and Kiss, A.: Structure of the gene coding for the sequence-specific DNA-methyltransferase of the B. subtilis phage SPR. Nucl. Acids Res. 12 (1984) 9039-9049.

Sanger, F., Nicklen, S. and Coulson, A.R.: DNA sequencing with chain-terminating inhibitors. Proc. Natl. Acad. Sci. USA 74 (1977) 5463-5467.

Staden, R.: Further procedures for sequence analysis by computer. Nucl. Acids Res. 5 (1978) 1013-1015.

Trautner, T.A., Pawlek, B., Günthert, U., Canosi, U., Jentsch, S. and Freund, M.: Restriction and modification in Bacillus subtilis: identification of a gene in the temperate phage SPβ coding for a BsuR specific modification methyltransferase. Mol. Gen. Genet. 180 (1980) 361-367.

Communicated by T. Bickle.