Gene, 76 (1989) 195-205 Elsevier GEN 02867
IS257 from Staphylococcus aureus: member of an insertion sequence superfamily prevalent among Gram-positive and Gram-negative bacteria (Transposase; structure;
evolutionary
immigrant
comparisons;
codon
bias;
G + C content
selection;
protein
evolution;
genetic
genes)
Duncan A. Rouch and Ronald A. Skurray
Department of Microbiology, Monash University, Clayton, Victoria 3168 (Australia)
Received by P.A. Manning: 28 June 1988
Accepted: 5 September 1988
SUMMARY
The nucleotide sequences for the IS257 family of insertion sequences from Staphylococcus aureus were compared with those of the ISS1 family from Streptococcus lactis and the IS15 family, which is widespread amongst Gram-negative bacteria. These elements have a striking degree of similarity in both their putative transposase polypeptide sequences and their nucleotide sequences (40 to 64% between pairs), including 12 out of 14 bp conservation in their terminal inverted repeats. The evolutionary distance between the IS15 family and the IS257 and ISS1 families of Gram-positive origin is approximately twice that between the IS257 and ISS1 families. Analysis of base substitutions in the three sequences has provided insights into the effect of selection for the G + C content of immigrant genes to conform to that of their hosts, and into the evolution of biases in overall amino acid composition of cellular proteins in prokaryotes and eukaryotes. The IS257, ISS1 and IS15 families form a superfamily of insertion sequences that has been involved in the spread of a number of antimicrobial resistance determinants in Gram-positive and Gram-negative pathogens.
INTRODUCTION
Transposable elements, through their ability to enhance gene transfer and rearrangements via homologous and non-homologous recombination processes and to affect gene expression (Cohen, 1976; Campbell, 1981; Kleckner, 1981; Syvanen, 1984), are thought to have played a pivotal role in the emergence of multi-drug resistance in pathogenic bacteria, such as S. aureus (Lyon and Skurray, 1987; Skurray et al., 1988) and the streptococci (Clewell, 1981; Clewell and Gawron-Burke, 1986). One such element from S. aureus is IS257, of which directly repeated copies flank the trimethoprim-resistance determinant on many members of the pSK1 family of multi-resistance plasmids, forming the composite transposon Tn4003 (Gillespie et al., 1987; Lyon and Skurray, 1987; Rouch et al., 1989). IS257 also forms direct repeats bounding mercury-resistance determinants on heavy-metal-resistance plasmids to constitute the putative transposable element Tn4004 (Gillespie et al., 1987; Lyon and Skurray, 1987); these repeats associated with mercury resistance on the latter plasmids are also known as IS431 (Barberis-Maino et al., 1987). IS257 sequences are additionally located adjacent to chromosomal determinants for resistance to mercury, methicillin and tetracycline (Gillespie et al., 1987; Matthews et al., 1987), and may be involved in the chromosomal integration of these three determinants and the in vitro amplification of the methicillin-resistance determinant (Gillespie et al., 1987; Matthews and Stewart, 1988). Furthermore, up to five copies of sequences homologous to IS257 have been identified on aminoglycoside-resistance plasmids from North America (Gillespie et al., 1987); two of the IS257-like elements on these plasmids form part of the inverted repeats which flank the aacA-aphD aminoglycoside-resistance determinant (M. Byrne, D.A.R., M. Gillespie and R.A.S., in preparation). Remarkably, the IS257 family of insertion elements, of which six members have been sequenced (Barberis-Maino et al., 1987; Rouch et al., 1989), shares sequence homology with the IS15 family, which is widespread among R plasmids in enterobacteria and is also detected in Acinetobacter and Campylobacter-like species (Labigne-Roussel et al., 1981; Labigne-Roussel and Courvalin, 1983; Ouellette et al., 1987); it includes IS15, IS15-Δ (Labigne-Roussel et al., 1981; Trieu-Cuot and Courvalin, 1984), IS26 (Mollet et al., 1983), IS46 (Brown et al., 1984; Hall, 1987) and IS140 (Brau and Piepersberg, 1983). ISS1, detected on a lactose plasmid from S. lactis (Polzin and Shimizu-Kadota, 1987), is also homologous to IS257 (Murphy, 1988; Rouch et al., 1989). We present a detailed sequence comparison of members of the IS257, ISS1 and IS15 insertion sequence families, to examine the evolutionary relationships of this collection of IS elements.

Correspondence to: Dr. R.A. Skurray, Department of Microbiology, Monash University, Clayton, Victoria 3168 (Australia) Tel. 61-3-5654927; Fax 61-3-5654007.
Abbreviations: aa, amino acid(s); bp, base pair(s); IR, inverted repeat(s); IS, insertion sequence(s); nt, nucleotide(s).
0378-1119/89/$03.50 © 1989 Elsevier Science Publishers B.V. (Biomedical Division)
MATERIALS AND METHODS
(a) Sources of nucleotide sequences and their alignment
The sequences included were of IS257L, IS257R1, IS257R2 (Rouch et al., 1989), IS431L, IS431R and IS431mec (Barberis-Maino et al., 1987) in the IS257 family; ISS1S and ISS1T (Polzin and Shimizu-Kadota, 1987) in the ISS1 family; and IS15-ΔI, IS15-ΔII (Trieu-Cuot and Courvalin, 1984), IS15-ΔIII (Ouellette et al., 1987) and IS26R (Mollet et al., 1983) in the IS15 family. IS46 and IS140 were not included in the analysis as only limited sequence information is available for them (Hall, 1987; Brau and Piepersberg, 1983). Sequence alignment was performed using the programs of Staden (1986) as modified by A. Kyne, Walter and Eliza Hall Institute for Medical Research, Melbourne, Australia. To optimize alignment in the coding region, the putative transposase polypeptide sequences were aligned using the mutation data matrix (Schwartz and Dayhoff, 1978) with a deletion penalty of 8 and a zero matrix bias; the corresponding nucleotide sequences were then positioned according to the polypeptide alignment. This procedure was applied because of the moderate degree of similarity between the nucleotide sequences from the different families.
(b) Evolutionary relationships
Phylogenetic relationships were determined with the parsimony analysis of Fitch (1971), utilizing the nucleotide sequence alignment shown in Fig. 1. To delineate the main branches in the evolutionary tree, representative sequences from the three IS families were analysed; namely, IS257L, ISS1S and IS15-ΔI, for the IS257, ISS1 and IS15 families, respectively.
(c) Codon bias
The reading frame for each sequence was scored for the degree of use of the codons preferentially utilized by highly expressed genes, corresponding to the major isoaccepting tRNA species in Escherichia coli and Bacillus subtilis, according to Bennetzen and Hall (1982). The preferred codons for E. coli and B. subtilis, taken from Sharp and Li (1986) and Shields and Sharp (1987) respectively, were UUC, CUG, GUU, GUA, UCU, UCC, CCG, ACU, ACC, UAC, CAC, CAG, AAC, AAA, GAC, GAA, CGU and GGU for E. coli, and UUC, UUA, CUU, AUC, GUU, UCU, CCU, ACU, ACA, GCU, UAC, CAU, CAA, AAC, AAA, GAC, GAA, CGU, GGU and CGA for B. subtilis. Codon bias is equal to the number of preferred codons used divided by the corrected total number of codons, which is the total codon number minus the number of codons utilized for methionine, cysteine, tryptophan and termination, with alanine codons additionally excluded for E. coli only.
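The codon bias statistic defined in section (c) can be illustrated with a minimal Python sketch. The preferred-codon sets are copied from the lists given above; the function name `codon_bias`, the `exclude_alanine` flag and the toy sequence are hypothetical and serve only as illustration, not as the program used by the authors.

```python
# Preferred codons exactly as listed in section (c) of MATERIALS AND METHODS.
E_COLI_PREFERRED = frozenset(
    "UUC CUG GUU GUA UCU UCC CCG ACU ACC UAC CAC CAG AAC AAA GAC GAA CGU GGU".split()
)
B_SUBTILIS_PREFERRED = frozenset(
    "UUC UUA CUU AUC GUU UCU CCU ACU ACA GCU UAC CAU CAA AAC AAA GAC GAA CGU GGU CGA".split()
)

# Codons removed from the total: Met, Cys, Trp and termination codons.
ALWAYS_EXCLUDED = frozenset({"AUG", "UGU", "UGC", "UGG", "UAA", "UAG", "UGA"})
# Alanine codons are additionally removed when scoring against E. coli.
ALANINE_CODONS = frozenset({"GCU", "GCC", "GCA", "GCG"})


def codon_bias(coding_sequence, preferred, exclude_alanine=False):
    """Preferred codons used / corrected total number of codons.

    `coding_sequence` is an in-frame DNA or RNA string (hypothetical input
    format); any trailing partial codon is ignored.
    """
    rna = coding_sequence.upper().replace("T", "U")
    usable = len(rna) - len(rna) % 3
    codons = [rna[i:i + 3] for i in range(0, usable, 3)]
    excluded = ALWAYS_EXCLUDED | (ALANINE_CODONS if exclude_alanine else frozenset())
    counted = [codon for codon in codons if codon not in excluded]
    if not counted:
        raise ValueError("no codons remain after exclusions")
    return sum(codon in preferred for codon in counted) / len(counted)


# Toy reading frame (not a real transposase gene): Met-Lys-Glu-Leu-Ala-stop.
toy = "ATGAAAGAACTGGCGTAA"
bias_e_coli = codon_bias(toy, E_COLI_PREFERRED, exclude_alanine=True)
bias_b_subtilis = codon_bias(toy, B_SUBTILIS_PREFERRED)
```

For the toy frame, the E. coli score counts AAA, GAA and CUG (all preferred), whereas the B. subtilis score also counts GCG and finds only AAA and GAA preferred.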
RESULTS AND DISCUSSION
(a) Evolutionary relationships among the IS257, ISS1 and IS15 families
To initiate examination of the evolutionary relationships among members of the IS257, ISS1 and IS15 families, their nucleotide sequences, where available, were aligned (Fig. 1), as were the polypeptide sequences of their putative transposase enzymes (Fig. 2). The DNA and polypeptide sequence similarities between representatives of the three IS families, viz., IS257L, ISS1S and IS15-ΔI, derived from these sequence alignments are shown in Table I; the similarities leave no doubt that the sequences are related and that IS257L and ISS1S are the most closely related pair. Construction of a phylogeny allows for a more accurate and detailed analysis of the evolutionary relationships between these sequences and the IS families they represent. In determining the phylogenetic relationships, 369 positions out of 842 in the nucleotide sequence alignment were informative with regard to differentiating substitutions; at 119 nt positions each sequence contained a unique nt, and there were 14 insertion/deletion regions, shown by gaps in the aligned sequences (Fig. 1). The distance score for each sequence was calculated for its branch in the unrooted tree (Fig. 3a). Assuming uniform mutation rates, the rooted evolutionary tree shown in Fig. 3b is produced; remarkably, the distance scores for IS257L and ISS1S from their last common ancestor are very similar (167 and 161, respectively), supporting the assumption regarding mutation rates. The distance between IS15-ΔI and IS257L (649) or ISS1S (643) is approx. two-fold greater than the distance between IS257L and ISS1S (328). Compared to the distance between the prototype of each family, the distances within each family are quite small, as suggested by inspection of Figs. 1 and 2, being less than 12 for the most divergent pair, IS257R1 and IS431L (not shown). Thus, each family forms a tight cluster, which is consistent with the notion that since their initial divergence, elements from the three families have emerged relatively recently in staphylococci, streptococci and enterobacteria; in these cases the elements have emanated from at least three separate sources, which are likely to be non-pathogens of soil origin. In the case of the IS257 family, S. aureus appears to have acquired copies of this IS element in at least two sets: IS431L and IS431R, which flank the merA merB mercury-resistance determinant on heavy-metal/β-lactamase plasmids (Barberis-Maino et al., 1987; Gillespie et al., 1987), show significant sequence divergence from the IS257L, IS257R1 and IS257R2 sub-families (distance = 10 to 11), which are components of the trimethoprim-resistance transposon Tn4003 found on quaternary ammonium-compound-resistance plasmids (Gillespie et al., 1987; Skurray et al., 1988; Rouch et al., 1989). The IS431 pair may have been first selected in nosocomial staphylococci with the advent of mercurial antiseptics late last century, whereas Tn4003 was not detected until 1979, subsequent to the introduction of trimethoprim in chemotherapy (Lyon and Skurray, 1987).

TABLE I
Sequence similarities between IS257L, ISS1S and IS15-ΔI

             % homologies a
             IS257L    ISS1S     IS15-ΔI
IS257L       (100)     64        49
ISS1S        59        (100)     50
IS15-ΔI      40        46        (100)

a Percentage nucleotide sequence identities are shown above the diagonal and amino acid sequence identities below. These are derived from the nucleotide sequence alignment (Fig. 1) and the corresponding amino acid sequence alignment (Fig. 2).
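The pairwise percentages reported in Table I can be understood through a short sketch of how identity is typically scored from two rows of an alignment. How gapped columns were treated in the original analysis is not stated, so skipping them here is an assumption, as are the function name `percent_identity`, the gap symbol and the toy sequences.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percentage of identical residues between two aligned sequences.

    Assumption (not stated in the source): columns where either sequence
    has a gap are left out of both the numerator and the denominator.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # gapped column: not scored under this assumption
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy aligned pair: five ungapped columns, four of them identical.
identity = percent_identity("ATG-CA", "ATGGCT")
```

The same function applies unchanged to aligned amino acid sequences, which is how the values below the diagonal of such a table could be obtained.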
In contrast, divergence among most members of the IS15 family that have been sequenced is minimal, supporting the contention that the widespread occurrence of elements in this family is largely due to recent intra- and inter-species gene-transfer events between Gram-negative pathogens, promoted indirectly by antimicrobial chemotherapy (Labigne-Roussel and Courvalin, 1983; Ouellette et al., 1987).
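The branch distances discussed in this section follow from the formula given in the legend to Fig. 3, and a minimal Python sketch may make the arithmetic easier to follow. The function names `classify_positions` and `branch_distance`, the gap symbol and the toy sequences are hypothetical; skipping gapped columns when classifying positions is an assumption, since insertion/deletion events enter the formula separately through the term Xi.

```python
import math


def classify_positions(seq1, seq2, seq3, gap="-"):
    """Count differentiating and unique positions in a three-way alignment.

    Returns ([d1, d2, d3], u): d_i counts columns where sequence i differs
    from the other two, which agree; u counts columns where all three differ.
    Columns containing a gap are skipped (an assumption of this sketch).
    """
    if not (len(seq1) == len(seq2) == len(seq3)):
        raise ValueError("aligned sequences must have equal length")
    d = [0, 0, 0]
    u = 0
    for a, b, c in zip(seq1, seq2, seq3):
        if gap in (a, b, c) or a == b == c:
            continue
        if b == c:
            d[0] += 1
        elif a == c:
            d[1] += 1
        elif a == b:
            d[2] += 1
        else:
            u += 1
    return d, u


def branch_distance(length, d_i, total_d, u, x_i):
    """Di = L{-3/4 ln(1 - 4/3 (Si/L))} + Xi, with Si = di + 2u(di/sum of d)."""
    s_i = d_i + 2 * u * (d_i / total_d)
    corrected = -0.75 * math.log(1 - (4 / 3) * s_i / length)
    return length * corrected + x_i


# Values quoted in the Fig. 3 legend for IS257L.
d_is257l = branch_distance(length=842, d_i=88, total_d=369, u=119, x_i=2)
print(f"{d_is257l:.1f}")
```

Evaluating the expression as written gives roughly 166.4, close to the value of 167 quoted for IS257L; the small gap may reflect rounding or details of the original calculation that cannot be recovered from the text.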
[Fig. 1: nucleotide sequence alignment (positions 1 to 842) of IS257L/R2, IS257R1, IS431L, IS431R, IS431mec, ISS1S, ISS1T, IS15-ΔI, IS15-ΔII, IS15-ΔIII and IS26R, with the -35, -10 and RBS sites marked; the sequence rows are not recoverable.]

Fig. 1. Nucleotide sequence alignment for members of the IS257, ISS1 and IS15 families. The complete nucleotide sequence is shown for IS257L and IS257R2 (IS257L/R2); for the other aligned sequences, nucleotides are only shown where they differ from IS257L/R2. Bold lines mark the positions of putative promoter sequences (-35 and -10) and potential ribosome-binding sites (RBS) for the members of each family. Half-arrowed lines at the termini indicate the 14-bp IR which is well conserved among members of the IS257, ISS1 and IS15 families. Alignment position numbers are shown to the right. Translation of the reading frame for the transposase of IS257L/R2 is shown according to the polypeptide sequence alignment (Fig. 2); associated numbers thus indicate positions with the polypeptide sequence alignment, to facilitate orientation to the polypeptide sequence. Uncertainties in the IS431mec sequence (Barberis-Maino et al., 1987) are indicated by R, purine, and by an asterisk, unknown. Note that the IS431mec and IS15-ΔIII sequences are incomplete at the 3' and 5' ends, respectively.
(b) Codon usage within transposase genes
Codon usage within a gene reflects both on the identity of the host that it originated from (Grantham et al., 1980; Bibb et al., 1984) and on the degree to which it is expressed; the more highly expressed a gene is, the greater the use of synonymous codons corresponding to the major isoaccepting tRNA species within the host with which it has had a long-term association (Post et al., 1979; Bennetzen and Hall, 1982). The level of use of these preferred codons in the putative transposase genes located in the three prototype IS elements, IS257L, ISS1S and IS15-ΔI (Fig. 1, cf. Fig. 2), was measured for representative Gram-positive and Gram-negative host species, namely B. subtilis and E. coli, respectively, using the codon bias statistic of Bennetzen and Hall (1982), which allows discrimination between different hosts. The codon bias results (Table II) show that the two sequences of Gram-positive origin, IS257L and ISS1S, have a higher codon bias for B. subtilis as the host compared to E. coli; the opposite applies for IS15-ΔI isolated from Gram-negative bacteria. These results are consistent with the idea that the IS257 and ISS1 families have resided in Gram-positive hosts for some time and the IS15 family in Gram-negative hosts. This is not surprising, since transfer of genes between Gram-positive and Gram-negative bacteria is likely to be a rare event. Thus, only one such event need be proposed to explain the available data; the last common ancestral sequence for the three families of insertion sequences must have undergone such a transfer. The codon bias values for the IS elements in their preferred host system, B. subtilis for IS257L and ISS1S, and E. coli for IS15-ΔI, correspond to a prediction of low to moderate levels of expression, which is consistent with the proposed nature of the polypeptides as transposases, since such enzymes are usually produced in low amounts (e.g., Morisato et al., 1983).

TABLE II
Codon bias in transposase genes

Host           Insertion sequence
               IS257L    ISS1S     IS15-ΔI
               Codon bias a
B. subtilis    0.45      0.42      0.34
E. coli        0.37      0.35      0.54

a Codon bias was calculated according to MATERIALS AND METHODS, section c. Genes expressed at low and moderate levels show a codon bias of less than 0.6 (Bennetzen and Hall, 1982).

[Fig. 2: amino acid sequence alignment of the putative transposases of IS257L/R2, IS257R1, IS431L, IS431R, IS431mec, ISS1S/T, IS15-ΔI, IS15-ΔII, IS15-ΔIII and IS26R; the sequence rows are not recoverable.]

Fig. 2. Amino acid sequence alignment for the putative transposases of members of the IS257, ISS1 and IS15 families. Positions containing identical aa are boxed and conservative replacements within the groups (D, E), (F, Y, W), (I, L, V, M), (H, K, R), (N, Q) or (S, T) are underlined. The two predicted α-helix-turn-α-helix recognition motifs (Rouch et al., 1989) are shown above the sequences, as is numbering of the alignment. Note that the IS431mec and IS15-ΔIII sequences are incomplete at their C-terminal and N-terminal ends, respectively. The X occurring in the IS431mec sequence at residue 11 (Barberis-Maino et al., 1987) is a result of uncertainty in the nucleotide sequence of DNA (Fig. 1).

[Fig. 3: (a) unrooted tree joining IS257L, ISS1S and IS15-ΔI, with branch distances of 167 and 161 labelled for IS257L and ISS1S; (b) rooted tree drawn against a time axis, with the sequenced members of each family listed below: IS15-ΔI, IS15-ΔII, IS15-ΔIII, IS26, IS46, IS140; ISS1S, ISS1T; IS257R1, IS257L/R2, IS431L, IS431R, IS431mec.]

Fig. 3. Evolutionary tree topologies determined with the parsimony analysis of Fitch (1971). (a) The only possible unrooted tree, for the three prototype sequences. The distance, $D_i$, of a particular sequence from its branch point in the unrooted tree was calculated from: $D_i = L\{-\tfrac{3}{4}\ln[1 - \tfrac{4}{3}(S_i/L)]\} + X_i$, in which the first term is the contribution of substitutions to the distance, that includes a Poisson correction for superimposed and reverted substitutions (Kimura and Ohta, 1972), and $X_i$ accounts for insertion or deletion events, marked by gaps in the aligned sequences (Fig. 1). $L$ is the effective length of the sequence in bp and $S_i$ is the raw substitution score, determined from: $S_i = d_i + 2u(d_i/\Sigma d)$, in which $d_i$ is the number of positions where the sequence contains a different nucleotide compared to the other two, $u$ is the number of positions where all sequences contain unique nucleotides, and $\Sigma d$ is the total number of differentiating positions for the three sequences; $d_i/\Sigma d$ distributes the number of unique nucleotide positions in proportion to the number of differentiating substitutions relative to all sequences. $X_i$ was arbitrarily set at equal to twice the number of insertion/deletion events for the sequence; e.g., for IS257L, where $L = 842$, $d_i = 88$, $\Sigma d = 369$, $u = 119$ and $X_i = 2$: $D_i = 842\{-\tfrac{3}{4}\ln(1 - \tfrac{4}{3}[88 + 2 \times 119(88/369)]/842)\} + 2 = 167$. (b) The rooted evolutionary tree derived from the unrooted tree structure assuming uniform mutation rates, calculated for the three family prototype sequences; other sequenced members of the three families are shown below.

(c) Selection for G + C content and evolution of transposase proteins
The G + C content of genomic DNA from different bacterial species varies between approx. 25% and 75% (Rosypal and Rosypalova, 1966). To allow encodement of proteins with similar activities across species, the diversity of G + C content in coding regions is accommodated for, in part, by the degeneracy in the genetic code, since synonymous codons vary in G + C content themselves (Bibb et al., 1984). Furthermore, the G + C content of the DNAs of each species is apparently under stabilizing selection, so that where a DNA molecule, which for example contains a resistance gene or transposon, is transferred from its original host to a new host with a different G + C content, the G + C content of the transferred DNA gradually alters to that of the new host (Sueoka, 1962). The question then arises of how the ability to encode for a function, such as a particular enzyme activity, is maintained under selection for an altered G + C content. The differing G + C contents of the related IS257L, ISSlS and ISIS-d1 transposase genes, which are 34.5%, 37.3% and 52.8%, respectively, permits examination of this question in terms of nucleotide and amino acid sequence changes. Nucleotide substitutions responsible for differences in G + C content between the reading frames of these transposase genes are distributed through most of the 205 codon positions examined for which the sequences had corresponding codons, excluding 15 invariant codon positions (Fig. 1). Among the codons for invariant amino acids, the majority of the mutation differences between IS257L or ISSI S and ISIS-d1 altered the G + C content, and were
naturally restricted for the most part to the third base positions. This indicates that selection for particular nucleotides has occurred at the third position in codons, and demonstrates that the neutral substitution theory (e.g., Kimura, 1968; 1983) is not directly applicable in this situation. Conservative amino acid replacements, that is exchange within the subsets (D, E), (F, Y, W), (I, L, V, M), (H, K, R), (N, Q) or (S, T), involved G + C content changes at all codon positions and most combinations of positions. These changes tended to result in an amino acid with A + T-rich codons occurring in the transposases encoded by A + T-rich DNA sequences (IS257L and ISS1S) opposite an amino acid from the same subset encoded by G + C-richer codons in the IS15-d1 sequence; for example, lysine (AAA) opposite arginine (CGG) at aa position 7 (cf. Figs. 1 and 2), and tyrosine (TAC) or phenylalanine (TTT) opposite tryptophan (TGG) at aa positions 20 and 129. In these cases, the lysine/arginine substitution can occur through three separate single nt substitutions, with the intermediate codons specifying lysine or arginine [i.e., AAA(lys)-AAG(lys)-AGG(arg)-CGG(arg)]; however, the aromatic amino acid replacements would require double substitution events to avoid encoding a non-aromatic residue in an intermediate step. At positions demonstrating non-conservative amino acid replacements there is a tendency for an amino acid with A + T-rich codons to occur in the A + T-rich sequences opposite any amino acid with G + C-rich codons in the G + C-rich sequences; for example, tyrosine (TAT) against proline (CCA) at aa position 3. In addition, in the IS15-d1 sequence there are separate relative insertions of three arginine codons (CGG or CGC; aa positions 61, 63 and 182) and two glycine codons (GGT or GGA; aa positions 120 and 189), and relative deletions of a tyrosine (TAT) and a phenylalanine (TTT) codon (aa positions 210 and 222), which positively contribute to the differences in G + C content. Taken together, these results account for the overall biases in amino acid compositions of the putative transposases, with a trend for more residues specified by A + T-rich codons, such as F, Y, I, M, N or K, to occur in the IS257 and ISS1 sequences compared to the IS15 sequences, and vice versa for residues specified by G + C-rich codons, like P, R, A, and G. Furthermore, when extrapolated to whole genomes, the results explain how similar biases in the amino acid composition of bulk cellular protein isolated from bacterial species with low or high G + C contents (Sueoka, 1962) could have evolved, and also the evolution of amino acid composition biases in eukaryotes with non-average G + C contents, like Plasmodium falciparum (Hyde and Sims, 1987). It can also be inferred that transfer of a gene to a host with a different G + C content will accelerate polypeptide evolution compared to maintenance in a host with a similar G + C content. Thus, sequence alterations of particular proteins would be hastened by the shuffling of their genes between host
bacteria with differing G + C contents, as has probably occurred, for example, in the aminoglycoside phosphotransferase family (Thompson and Gray, 1983; Rouch et al., 1987); transfer may have aided in producing the wide variety of substrate profiles exhibited by the enzymes in this family, which is mirrored in the range of G + C contents (25-73%) among their encoding DNAs (D.A.R. and R.A.S., unpublished). Such transfer events will be rare given the likely impediments against transfer between unrelated species, such as restriction systems. However, this is not to deny their possible importance in the long term for the evolution of some genes.
(d) Molecular organization of IS in the IS257, ISS1 and IS15 families
Sequence conservation between members of the IS257, ISS1 and IS15 families is reflected in their common molecular organization, as shown for their representatives in Fig. 1. The single significant open reading frame, encoding the putative transposase, accounts for approx. 85% of each IS element; the most highly conserved regions in the transposase polypeptide sequences specify two predicted α-helix-turn-α-helix nucleotide sequence recognition domains (Fig. 2; Rouch et al., 1989), with a corresponding high degree of nucleotide conservation in their encoding DNA sequences (nt positions 169-229 and 302-361; Fig. 1). Also, although the lengths of the terminal IR vary between the IS257 (16-28 bp), ISS1 (18 bp) and IS15 (14 bp) families (Rouch et al., 1989), there is good conservation of a 14-bp terminal IR, with the left-hand IR containing the -35 box of the predicted promoter for the transposase gene of each sequence (Fig. 1). Since the
transposase would be expected to bind to the IR during transposition (Craig and Kleckner, 1987), it may regulate its own expression, as this would also occlude the -35 promoter region from an RNA polymerase molecule. Although the -35 box is conserved, the -10 box and the spacing between these two promoter elements are not; IS257, ISS1 and IS15 family members show spacings of 14, 16 and 20 bp, respectively. The promoters are close enough to the ends of the IS elements that sequences flanking their left-hand ends could affect expression, especially in the case of IS257 with its sub-optimal spacing (Rouch et al., 1989). In addition, expression of the transposase may be influenced by promoters located outside an IS but able to direct transcription into its transposase gene; if above a certain strength, such externally directed transcription could outcompete the transcriptional block imposed by a transposase molecule bound at its -35 box. Hence, transposase gene expression, and therefore transposition activity, may be influenced by the site of insertion of these elements. In contrast, some other IS elements, such as IS5 (Kroger and Hobom, 1982), contain Rho-independent transcription terminators in their terminal regions which block outside-to-inside transcription. Partly due to lack of optional extras like these, insertion elements in the three IS families described here are among the smallest such sequences. Of these, the IS257 elements are the most compact, with minimal spacing between the potential promoter and ribosome-binding site (8 bp, cf. 23 and 22 bp for ISS1 and IS15 families, respectively) and between the -10 and -35 regions of the promoter (14 bp, cf. 16 and 20 bp for ISS1 and IS15 families, respectively); and they also potentially encode the shortest transposase, of 224 aa (cf. 226 aa for ISS1 and 234 aa for IS15). Interestingly, sequence differences between the IS257 sub-family and the two IS431 elements, which flank merA merB in the putative mercury-resistance transposon Tn4004 (Gillespie et al., 1987; Lyon and Skurray, 1987), suggest reasons for the failure to observe transposition of Tn4004 (Murphy, 1988); IS431L has suffered a G → A transition at a critical position in its ribosome-binding site (Fig. 1), suggesting that expression of the transposase would be impaired, and if the transposase is preferentially cis-acting, as occurs for a number of other transposable elements (McFall, 1986), transposition might also be impaired. Also, IS431R has undergone a G → A transition at nt 2, which is in the terminal IR of the outside end of IS431R in Tn4004, so that if integrity of the IR is important for transposition, as in other cases (e.g., Huang et al., 1986), then this substitution could render Tn4004 effectively inactive. Together, these results suggest Tn4004 is a defective transposon.
(e) Conclusions
The degree of sequence similarity between members of the IS257, ISS1 and IS15 families of IS leaves little doubt that they have a common ancestry. These IS elements, therefore, form a superfamily of IS that has members prevalent among a number of Gram-positive and Gram-negative pathogens; in these organisms, it would seem that this IS superfamily has been associated with the spread of many antimicrobial resistance determinants in recent evolutionary times (Labigne-Roussel and Courvalin, 1983; Skurray et al., 1988; Rouch et al., 1989). The biases in amino acid substitutions in the transposase polypeptides, which correlate with the G + C contents of their encoding DNAs, suggest that the transfer of genes between hosts of different G + C contents will accelerate the evolution of their products, as a result of G + C content normalization. Similarly, divergence in G + C content of related species, whether prokaryotic or eukaryotic, would accelerate protein evolution on a genomic scale.
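The codon-level reasoning behind these conclusions (the lysine/arginine and aromatic replacements discussed for aa positions 7, 20 and 129) can be checked mechanically. The following is a minimal illustrative sketch, not part of the original analysis: it assumes the standard genetic code, considers only direct paths in which each differing codon position is changed exactly once, and the names `CODON_TABLE`, `gc_count` and `conservative_paths` are hypothetical helpers introduced here.

```python
from itertools import permutations, product

# Standard genetic code, codons enumerated in T, C, A, G order
# (third base varying fastest); '*' marks stop codons.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def gc_count(codon):
    """Number of G or C bases in a codon (0 to 3)."""
    return sum(base in "GC" for base in codon)


def conservative_paths(start, end, subset):
    """Direct single-nucleotide paths from start to end whose every codon
    (including both endpoints) encodes a residue in `subset`.

    Assumption: only direct paths are explored, i.e. each differing
    position is substituted once, in every possible order.
    """
    differing = [i for i in range(3) if start[i] != end[i]]
    paths = []
    for order in permutations(differing):
        codon, path = start, [start]
        for i in order:
            codon = codon[:i] + end[i] + codon[i + 1:]
            path.append(codon)
        if all(CODON_TABLE[c] in subset for c in path):
            paths.append(path)
    return paths


if __name__ == "__main__":
    # Lysine (AAA) to arginine (CGG) within the (H, K, R) subset.
    for path in conservative_paths("AAA", "CGG", "HKR"):
        print(" -> ".join(f"{c}({CODON_TABLE[c]}, GC={gc_count(c)})" for c in path))
    # Aromatic replacements within the (F, Y, W) subset.
    print(conservative_paths("TAC", "TGG", "FYW"))
    print(conservative_paths("TTT", "TGG", "FYW"))
```

Under these assumptions the AAA to CGG search includes the path AAA-AAG-AGG-CGG given in the text, with the G + C count rising by one at each step, whereas the two aromatic searches return no path, because every single-substitution intermediate between TAC or TTT and TGG encodes a non-aromatic residue or a stop codon.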
ACKNOWLEDGEMENTS
This work was supported by a Project Grant from the National Health and Medical Research Council (Australia).
REFERENCES
Barberis-Maino, L., Berger-Bächi, B., Weber, H., Beck, W.D. and Kayser, F.H.: IS431, a staphylococcal insertion sequence-like element related to IS26 from Proteus vulgaris. Gene 59 (1987) 107-113.
Bennetzen, J.L. and Hall, B.D.: Codon selection in yeast. J. Biol. Chem. 257 (1982) 3026-3031.
Bibb, M.J., Findlay, P.R. and Johnson, M.W.: The relationship between base composition and codon usage in bacterial genes and its use for the simple and reliable identification of protein-coding sequences. Gene 30 (1984) 157-166.
Bräu, B. and Piepersberg, W.: Cointegrational transduction and mobilization of gentamicin resistance plasmid pWP14a is mediated by IS140. Mol. Gen. Genet. 189 (1983) 298-303.
Brown, A.M.C., Coupland, G.M. and Willetts, N.S.: Characterization of IS46, an insertion sequence found on two IncN plasmids. J. Bacteriol. 159 (1984) 472-481.
Campbell, A.: Evolutionary significance of accessory DNA elements in bacteria. Annu. Rev. Microbiol. 35 (1981) 55-83.
Clewell, D.B.: Plasmids, drug resistance and gene transfer in the genus Streptococcus. Microbiol. Rev. 45 (1981) 409-436.
Clewell, D.B. and Gawron-Burke, C.: Conjugative transposons and the dissemination of antibiotic resistance in streptococci. Annu. Rev. Microbiol. 40 (1986) 635-659.
Cohen, S.N.: Transposable genetic elements and plasmid evolution. Nature 263 (1976) 731-738.
Craig, N.L. and Kleckner, N.: Transposition and site-specific recombination. In Neidhardt, F.C., Ingraham, J.L., Low, K.B., Magasanik, B., Schaechter, M. and Umbarger, H.E. (Eds.), Escherichia coli and Salmonella typhimurium. Cellular and Molecular Biology. American Society for Microbiology, Washington, DC, 1987, pp. 1054-1070.
Fitch, W.M.: Toward defining the course of evolution: minimum change for a specific tree topology. Syst. Zool. 20 (1971) 406-416.
Gillespie, M.T., Lyon, B.R., Loo, L.S.L., Matthews, P.R., Stewart, P.R. and Skurray, R.A.: Homologous direct repeat sequences associated with mercury, methicillin, tetracycline and trimethoprim resistance determinants in Staphylococcus aureus. FEMS Microbiol. Lett. 43 (1987) 165-171.
Grantham, R., Gautier, C., Gouy, M., Mercier, R. and Pave, A.: Codon catalog usage and the genome hypothesis. Nucleic Acids Res. 8 (1980) r49-r62.
Hall, R.M.: pKM101 is an IS46-promoted deletion of R46. Nucleic Acids Res. 15 (1987) 5479.
Huang, C.-J., Heffron, F., Twu, J.-S., Schloemer, R.J. and Lee, C.-H.: Analysis of Tn3 sequences required for transposition and immunity. Gene 41 (1986) 23-31.
Hyde, J.E. and Sims, P.F.G.: Anomalous dinucleotide frequencies in both coding and non-coding regions from the genome of the human malarial parasite Plasmodium falciparum. Gene 61 (1987) 177-187.
Kimura, M.: Evolutionary rate at the molecular level. Nature 217 (1968) 624-626.
Kimura, M.: The neutral theory of molecular evolution. In Nei, M. and Koehn, R.K. (Eds.), Evolution of Genes and Proteins. Sinauer Associates, Sunderland, MA, 1983, pp. 208-233.
Kimura, M. and Ohta, T.: On the stochastic model for estimation of mutational distance between homologous proteins. J. Mol. Evol. 2 (1972) 87-90.
Kleckner, N.: Transposable elements in prokaryotes. Annu. Rev. Genet. 15 (1981) 341-404.
Kroger, M. and Hobom, G.: Structural analysis of insertion sequence IS5. Nature 297 (1982) 159-162.
Labigne-Roussel, A. and Courvalin, P.: IS15, a new insertion sequence widely spread in R plasmids of Gram-negative bacteria. Mol. Gen. Genet. 189 (1983) 102-112.
Labigne-Roussel, A., Gerbaud, G. and Courvalin, P.: Translocation of sequences encoding antibiotic resistance from the chromosome to a receptor plasmid in Salmonella ordonez. Mol. Gen. Genet. 182 (1981) 390-408.
Lyon, B.R. and Skurray, R.A.: Antimicrobial resistance of Staphylococcus aureus: genetic basis. Microbiol. Rev. 51 (1987) 88-134.
Matthews, P.R., Reed, K.C. and Stewart, P.R.: The cloning of chromosomal DNA associated with methicillin and other resistances in Staphylococcus aureus. J. Gen. Microbiol. 133 (1987) 1919-1929.
Matthews, P.R. and Stewart, P.R.: Amplification of a section of chromosomal DNA in methicillin-resistant Staphylococcus aureus following growth in high concentrations of methicillin. J. Gen. Microbiol. 134 (1988) 1455-1464.
McFall, E.: cis-acting proteins. J. Bacteriol. 167 (1986) 429-432.
Mollet, B., Shigeru, I., Shepherd, J. and Arber, W.: Nucleotide sequence of IS26, a new prokaryote mobile genetic element. Nucleic Acids Res. 11 (1983) 6319-6330.
Morisato, D., Way, J.C., Kim, H.-J. and Kleckner, N.: Tn10 transposase acts preferentially on nearby transposon ends in vivo. Cell 32 (1983) 799-807.
Murphy, E.: Transposable elements in Staphylococcus. In Kingsman, A.J., Chater, K.F. and Kingsman, S.M. (Eds.), Transposition. Cambridge University Press, Cambridge, 1988, pp. 59-89.
Ouellette, M., Gerbaud, G., Lambert, T. and Courvalin, P.: Acquisition by a Campylobacter-like strain of aphA-1, a kanamycin resistance determinant from members of the family Enterobacteriaceae. Antimicrob. Agents Chemother. 31 (1987) 1021-1026.
Polzin, K.M. and Shimizu-Kadota, M.: Identification of a new insertion element, similar to Gram-negative IS26, on the lactose plasmid of Streptococcus lactis ML3. J. Bacteriol. 169 (1987) 5481-5488.
Post, L.E., Strycharz, G.D., Nomura, M., Lewis, H. and Dennis, P.P.: Nucleotide sequence of the ribosomal protein gene cluster adjacent to the gene for RNA polymerase subunit β in Escherichia coli. Proc. Natl. Acad. Sci. USA 76 (1979) 1697-1701.
Rosypal, S. and Rosypalova, A.: Genetic, phylogenetic and taxonomic relationships among bacteria as determined by their deoxyribonucleic acid base composition. Folia Fac. Sci. Nat. Univ. J.E. Purkynye (Biologica) Publ. 7 (1966) 1-90.
Rouch, D.A., Byrne, M.E., Kong, Y.C. and Skurray, R.A.: The aacA-aphD gentamicin and kanamycin resistance determinant of Tn4001 from Staphylococcus aureus: expression and nucleotide sequence analysis. J. Gen. Microbiol. 133 (1987) 3039-3052.
Rouch, D.A., Messerotti, L.J., Loo, L.S.L., Jackson, C.A. and Skurray, R.A.: Trimethoprim resistance transposon Tn4003 from Staphylococcus aureus encodes genes for a dihydrofolate reductase and thymidylate synthetase flanked by three copies of IS257. Mol. Microbiol. (1989) in press.
Schwartz, R.M. and Dayhoff, M.O.: Matrices for detecting distance relationships. In Dayhoff, M.O. (Ed.), Atlas of Protein Sequence and Structure, Vol. 5, Suppl. 3. National Biomedical Research Foundation, Washington, DC, 1978, pp. 353-358.
Sharp, P.M. and Li, W.-H.: Codon usage in regulatory genes in Escherichia coli does not reflect selection for 'rare' codons. Nucleic Acids Res. 14 (1986) 7737-7749.
Shields, D.C. and Sharp, P.M.: Synonymous codon usage in Bacillus subtilis reflects both translational selection and mutational biases. Nucleic Acids Res. 15 (1987) 8023-8040.
Skurray, R.A., Rouch, D.A., Lyon, B.R., Gillespie, M.T., Tennent, J.M., Byrne, M.E., Messerotti, L.J. and May, J.W.: Multiresistant Staphylococcus aureus: genetics and evolution of epidemic Australian strains. J. Antimicrob. Chemother. 21, Suppl. C (1988) 19-38.
Staden, R.: The current status and portability of our sequence handling software. Nucleic Acids Res. 14 (1986) 217-231.
Sueoka, N.: On the genetic basis of variation and heterogeneity of DNA base composition. Proc. Natl. Acad. Sci. USA 48 (1962) 582-592.
Syvanen, M.: The evolutionary implications of mobile genetic elements. Annu. Rev. Genet. 18 (1984) 271-293.
Thompson, C.J. and Gray, G.S.: Nucleotide sequence of a streptomycete aminoglycoside phosphotransferase gene and its relationship to phosphotransferases encoded by resistance plasmids. Proc. Natl. Acad. Sci. USA 80 (1983) 5190-5194.
Trieu-Cuot, P. and Courvalin, P.: Nucleotide sequence of the transposable element IS15. Gene 30 (1984) 113-120.