The role of IS6110 in the evolution of Mycobacterium tuberculosis




ARTICLE IN PRESS Tuberculosis (2007) 87, 393–404

Available at www.sciencedirect.com

journal homepage: http://intl.elsevierhealth.com/journals/tube

REVIEW

Christopher R.E. McEvoy, Alecia A. Falmer, Nicolaas C. Gey van Pittius, Thomas C. Victor, Paul D. van Helden, Robin M. Warren

DST/NRF Centre of Excellence in Biomedical Tuberculosis Research, MRC Centre for Molecular and Cellular Biology, Division of Molecular Biology and Human Genetics, Faculty of Health Sciences, Stellenbosch University, Tygerberg, South Africa

Received 7 March 2007; received in revised form 15 May 2007; accepted 22 May 2007

KEYWORDS Mycobacterium tuberculosis; IS6110; Transposon; Evolution

Summary

Members of the Mycobacterium tuberculosis complex contain the transposable element IS6110 which, due to its high numerical and positional polymorphism, has become a widely used marker in epidemiological studies. Here, we review the evidence that IS6110 is not simply a passive or ‘junk’ DNA sequence, but that, through its transposable activity, it is able to generate genotypic variation that translates into strain-specific phenotypic variation. We also speculate on the role that this variation has played in the evolution of M. tuberculosis and conclude that the presence of a moderate IS6110 copy number within the genome may provide the pathogen with a selective advantage that has aided its virulence. © 2007 Elsevier Ltd. All rights reserved.

Contents

Introduction
Insertion sequences
IS6110
Evolution of IS6110 RFLP patterns
Consequences of IS6110 transposition
    Integration into intragenic regions
    IS6110 flanking mutations
    Recombination/gene deletion
    Promoter activity of IS6110
IS6110 and evolution
Conclusion
Acknowledgement
References

Corresponding author. Tel.: +27 21 9389482; fax: +27 21 9389476. E-mail address: [email protected] (C.R.E. McEvoy).

1472-9792/$ - see front matter © 2007 Elsevier Ltd. All rights reserved. doi:10.1016/j.tube.2007.05.010

Introduction

Tuberculosis (TB) is one of the leading causes of infectious disease mortality in the world, with over 2 million deaths recorded annually, and it is estimated that one third of the world’s population is latently infected [1]. Although effective drugs are available for most cases and rates of TB infection have decreased dramatically in industrialised countries, poor public health systems in developing countries, the HIV/AIDS pandemic, and the emergence of multidrug-resistant strains have all contributed to an ongoing increase in reported cases worldwide [2].

Mycobacterium tuberculosis belongs to a group of closely related organisms known as the M. tuberculosis complex (MTBC), which comprises seven recognised members: M. tuberculosis, M. bovis, M. africanum, M. pinnipedii, M. caprae, M. microti and M. canetti. They are characterised by identical 16S rRNA sequences [3] and high genetic homogeneity [4], yet they display divergent phenotypes, elicit different pathologies and show different host specificities. The high (99.9%) similarity at the nucleotide level between MTBC members [4] has been suggested to result from a recent evolutionary bottleneck, with the resulting MTBC being derived from the clonal expansion of a single successful ancestor [4–8]. Indeed, the clonal nature and extreme sequence similarity of MTBC members have led to suggestions that they should be viewed as host-adapted ecotypes of the same species [9].

Until recently, the evolutionary history of M. tuberculosis remained enigmatic, with the favoured hypothesis suggesting that it is a recent human pathogen, probably derived from M. bovis in a cross-species jump from domesticated bovids during the early years of animal husbandry [10]. The exact evolutionary history of the MTBC is still unclear, although recent reports have clarified several aspects [7,8]. Studies that have examined sequence data from all MTBC members [7] suggest that a severe evolutionary bottleneck occurred in the MTBC 20,000–35,000 years ago and further suggest that the rarely encountered MTBC member M. canetti (an unusual mycobacterium strain that exhibits smooth and glossy colonies) predates this bottleneck, while all other MTBC members do indeed represent the clonal expansion of one of these progenitor species. Moreover, the extant representatives of this ancient group (including M. canetti) are localised to East Africa and are also human pathogens, suggesting that the MTBC may have coevolved with the human lineage since at least the time of early hominids between 2.6 and 2.8 million years ago [8]. However, the precise evolutionary origins of the MTBC remain unresolved [11,12]. It should be noted that although numerous studies indicate that members of the MTBC are overwhelmingly, if not exclusively, clonal, a recent report has provided evidence for lateral gene transfer within the M. tuberculosis PE_PGRS genomic region [13].

Because M. tuberculosis is an intracellular pathogen, host immunity is expected to have provided the most significant selective force during its evolutionary history. More recently introduced evolutionary pressures include the BCG vaccine and anti-TB drug therapy. The response of M. tuberculosis to these new selective pressures can be observed in the numerous outbreaks of drug-resistant (including multidrug-resistant and extensively drug-resistant) TB [14] and in the suggestion that mass BCG vaccination in Eastern Asia has been a selective force in the emergence of the Beijing family phenotype [15] (although not all data are consistent regarding this last point [16]). It may be speculated that the recent emergence of HIV has introduced a further selective pressure. The evolutionary consequences of M. tuberculosis coexistence with HIV, and with the antiretroviral therapy used to treat it, remain to be determined.

Despite the extreme sequence similarity seen within gene coding regions [4], different M. tuberculosis strains display variable phenotypes relating to transmission ability [17], disease manifestation [18], immunological responses [19–21], replication rate [22], and possibly the frequency of drug resistance [23] and the ability to evade vaccination [15]. Through a thorough understanding of M. tuberculosis evolution and genetics we may increase our knowledge of the organism’s pathogenesis and subsequently improve treatment and control measures. This understanding has been greatly expanded through the publication in 1998 of the complete genome sequence of the M. tuberculosis H37Rv laboratory strain [24] and the subsequent complete genome sequences of three other clinical strains, as well as the sequence of M. bovis [25]. Comparative genomics studies have shown that members of the MTBC have evolved through single nucleotide polymorphisms, insertions, deletions and other genomic rearrangements.

As in most bacteria, various mobile genetic elements (MGEs), or ‘jumping genes’, have been detected in the M. tuberculosis genome [26]. These elements are capable of moving from one chromosomal location to another in a process called transposition, and their dynamic nature has been implicated in the phenotypic characteristics of several pathogenic bacteria [27,28]. Due to its extensive numerical and positional polymorphism in M. tuberculosis, the MGE known as insertion sequence (IS) 6110 has been used extensively as a genotypic marker in epidemiological studies. However, apart from its invaluable role in molecular epidemiology, it has become clear in recent years that IS6110 is not just a passive genetic element and that, through its mobility, it is capable of altering gene expression and may thereby contribute to phenotypic diversity among different M. tuberculosis strains. In this review we describe the general role of IS6110 in driving the evolution of the M. tuberculosis genome, with a particular emphasis on bacterial virulence.

Insertion sequences

The genomes of most organisms, from bacteria to mammals, have been found to contain MGEs. These elements are defined by their ability to excise themselves from one region of the genome and insert themselves into another region. MGEs were first identified by Barbara McClintock in the 1950s, when they were found to alter the colour of maize kernels. However, they were originally dismissed as passive elements or ‘junk DNA’, and it is only in recent years that their major influence on genome evolution, particularly through their regulation of gene activity, has become appreciated [29]. Genome sequencing has emphasised their importance and ubiquitous nature. A large proportion (45%) of the human genome, for example, consists of active or inactive MGE sequences.

MGEs are classed according to their mechanism of transposition. Class I MGEs, or retrotransposons, move within the genome by being transcribed to RNA and then back to DNA by reverse transcriptase, while class II MGEs (often known as transposons) encode a transposase and move directly from one position to another within the genome using a ‘cut and paste’ or ‘copy’ mechanism. In bacteria, transposons are often referred to as insertion sequences (ISs) and these are further classified according to structural similarities. ISs are capable of transposition by one of two pathways, either non-replicative (conservative) or replicative transposition (Fig. 1). The product of non-replicative transposition is a simple insertion or ‘jump’ from one genomic location to another, while replicative transposition generates an additional IS copy. In both pathways the transposase catalyses the cleavage of a DNA strand to generate a 3′ OH at the transposon termini [30]. Which pathway is followed depends on whether a double-strand break completely removes the element from the donor DNA or a single-strand nick allows co-integrates to form [30]. Not all IS elements exclusively use one mechanism or the other; IS6110, for example, makes use of both.
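The difference in outcome between the two pathways can be illustrated with a toy model. The sketch below is only an illustration under strong assumptions, not a biochemical model: the genome is a list of labelled segments, the label `"IS"` stands for one element, and the functions `transpose_nonreplicative` and `transpose_replicative` are hypothetical names introduced here. Target-site duplication and co-integrate resolution are deliberately ignored.

```python
from typing import List

IS = "IS"  # hypothetical label for one insertion sequence copy


def transpose_nonreplicative(genome: List[str], donor: int, target: int) -> List[str]:
    """Cut-and-paste: excise the element at index `donor` and reinsert it
    before the segment that currently sits at index `target`."""
    if genome[donor] != IS:
        raise ValueError("donor index does not hold an IS element")
    if target == donor:
        raise ValueError("target must differ from donor")
    result = genome[:donor] + genome[donor + 1:]
    # Removing the donor shifts later segments one position to the left.
    insert_at = target - 1 if target > donor else target
    result.insert(insert_at, IS)
    return result


def transpose_replicative(genome: List[str], donor: int, target: int) -> List[str]:
    """Copy: the donor copy stays in place and a new copy is inserted
    before the segment at index `target`."""
    if genome[donor] != IS:
        raise ValueError("donor index does not hold an IS element")
    result = list(genome)
    result.insert(target, IS)
    return result


# Hypothetical four-segment genome with a single element.
genome = ["geneA", IS, "geneB", "geneC"]
moved = transpose_nonreplicative(genome, donor=1, target=3)
copied = transpose_replicative(genome, donor=1, target=3)
```

In this sketch `moved` still holds one element (a simple ‘jump’), whereas `copied` holds two, which is the copy-number increase that replicative transposition produces.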

Figure 1 Summary of the biochemical steps that occur in simple and replicative transposition pathways. Nicking at the 3′ ends is the initial step in both pathways (a). In the simple insertion pathway, this is followed by cleavage of the 5′-flanking DNA (b), which generates an excised transposon. Interaction with a target (c) allows strand transfer to occur, which results in a simple insertion (d). Replicative transposition (a, e and f) occurs via a strand transfer reaction (e) involving the nicked transposon and a target to generate a strand transfer intermediate. Replication of this intermediate using the exposed 3′ OHs in the target DNA as priming sites (f) results in duplication of the transposon and a co-integrate structure. IRs are represented by filled triangles, the transposon DNA is shown as a thick line, flanking and target DNAs as thin lines, and cleavage sites by small vertical arrows. Reproduced from Ref. 30 with the kind permission of the authors.

IS6110

Gordon and colleagues, in an analysis of the M. tuberculosis H37Rv genome, reported the presence of 56 loci representing almost 30 different IS elements [26]. One of these, IS6110, was originally described in 1990 as a 1.36 kb IS found only within the MTBC, belonging to the IS3 family and characterised by unique 28 bp imperfect terminal inverted repeats (TIRs) [31]. Another characteristic of IS6110 is that the 3–4 bp adjacent to the insertion site are duplicated upon integration [32]. Most members of the MTBC, including M. tuberculosis, contain multiple IS6110 copies, although M. bovis generally contains only one copy and shows limited transposition. Almost all MTBC species and strains possess an IS6110 element in the direct repeat (DR) region of the genome, and this is considered to be the site of the original insertion into the MTBC early in its evolution [33,34].

As with all other members of the IS3 family, the IS6110 sequence contains two partially overlapping reading frames, orfA and orfB (Fig. 2). By way of translational frameshifting, an OrfAB protein, which acts as a transposase, may also be produced [35]. While specific studies on IS6110 are lacking, analysis of the transposition mechanism of IS3 has revealed that the orfA and orfB gene products inhibit transpositional recombination promoted by the transposase [36]. Thus, it is likely that the relative proportions of orfA, orfB and orfAB produced determine the frequency of transposition. It should be noted, however, that no function has been attributed to orfB in the IS3 family members IS150 [37] and IS911 [38], and that its role in the biology of IS6110 is therefore currently unclear.
Evidence exists that some IS3 family members form both circular and linear intermediates during transposition and that the TIRs in circular intermediates may combine in a hairpin structure to generate a strong promoter, allowing for more efficient genomic integration [37,39]. It has also been demonstrated that separate copies of the IS6110 element do not operate independently; that is, they are able to share the same transposase [40]. Indeed, distinct but structurally related ISs may sometimes act through the same transposase even though they may differ in their respective transposition pathways [39].

The high degree of IS6110 polymorphism between different M. tuberculosis strains, in both copy number and position, has made it a useful marker for strain genotyping. Accordingly, the standardised IS6110 fingerprinting method [41] has become the most widely used genotyping method in molecular epidemiological studies of M. tuberculosis (Fig. 3). This is because IS6110 transposition events are generally common enough to allow differentiation between more distantly related strains but are still rare enough for patterns to remain stable among closely related strains. They are thus useful in distinguishing between recent epidemiological events (transmission) and distant epidemiological events (reactivation). In practice, two or more isolates with identical or near-identical (±1 band) IS6110 fingerprints (known as a cluster) are generally accepted as representing a recent transmission event.

It was previously assumed that the rate of IS6110 transposition is constant between strains. However, different M. tuberculosis strains have been shown to present varying transposition rates [42–45]. In particular, low copy number strains show a lower transposition rate, and clustering may therefore be overestimated in these cases [46]. This may have important implications for the interpretation of epidemiological results and highlights the need for a more thorough knowledge of the biology of IS6110 and its relation to the emergence of specific M. tuberculosis strains.

Figure 2 Diagram of IS6110 illustrating the position of the TIRs and the two open reading frames orfA and orfB. Note the 1 bp overlap between the two orfs that allows for the production of the orfAB protein by translational frameshifting. (Diagram labels: OrfA, positions 6–275; OrfB, positions 274–1311; TIRs at the 5′ and 3′ ends; total length 1358 bp.)

Figure 3 Autoradiograph of IS6110 RFLP fingerprinting patterns for 13 clinical isolates of M. tuberculosis along with the standard laboratory strain M. tuberculosis 14323 (far right). Each band represents an individual IS6110 element with the different band positions representing different locations within the genome. Strain relatedness can generally be inferred from the similarity of RFLP patterns.
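The working definition of a cluster given in this section (identical fingerprints, or fingerprints differing by at most one band) can be sketched in code. The example below is a minimal illustration under stated assumptions: each fingerprint is reduced to a set of integer band positions, "differing by one band" is taken to mean one band gained or lost, and linked isolates are grouped by single linkage. The function names and the isolate data are hypothetical; real RFLP analysis also needs a tolerance for band-size measurement error, which is ignored here.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Set


def band_difference(a: FrozenSet[int], b: FrozenSet[int]) -> int:
    """Number of bands present in one fingerprint but not the other."""
    return len(a ^ b)


def cluster_isolates(fingerprints: Dict[str, FrozenSet[int]],
                     max_diff: int = 1) -> List[Set[str]]:
    """Single-linkage grouping: isolates are linked when their fingerprints
    differ by at most `max_diff` bands; groups of two or more are clusters."""
    parent = {name: name for name in fingerprints}

    def find(x: str) -> str:
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in combinations(fingerprints, 2):
        if band_difference(fingerprints[a], fingerprints[b]) <= max_diff:
            parent[find(a)] = find(b)

    groups: Dict[str, Set[str]] = {}
    for name in fingerprints:
        groups.setdefault(find(name), set()).add(name)
    return [members for members in groups.values() if len(members) >= 2]


# Hypothetical isolates; band positions are arbitrary integers, not real data.
isolates = {
    "iso1": frozenset({10, 25, 40, 55}),
    "iso2": frozenset({10, 25, 40, 55}),      # identical to iso1
    "iso3": frozenset({10, 25, 40, 55, 70}),  # one extra band
    "iso4": frozenset({12, 30, 48}),          # unrelated pattern
}
clusters = cluster_isolates(isolates)
```

Single linkage is a design choice made for brevity: a chain of isolates that each differ by one band from the next would be merged into one cluster even if the two ends of the chain differ by several bands.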

Evolution of IS6110 RFLP patterns

Sequencing of the IS6110 element has revealed that the sequence is conserved within the M. tuberculosis complex, and thus differences in the frequency of transposition are not caused by differences in the sequence of the element itself [47]. It appears that IS6110 copy number variation is largely due to the nature of the genomic region into which the element is integrated [40]. When residing in transcriptionally silent genomic regions, IS6110 is inactive and rarely undergoes transposition. However, when inserted into a transcriptionally active region of the genome, the transposition rate of IS6110 is greatly increased, presumably due to an increase in transposase production [40]. This finding suggests that a single transpositional event may generate a sudden burst of transpositional activity if the element is incorporated into a transcriptionally active site [40]. With each additional duplication event, the chance increases that an element will be integrated into an active genomic region and undergo further transposition and duplication. Consequently, the copy number of IS6110 may quickly increase, resulting in strains with an intermediate or high copy number [48].

The commonly observed bimodal distribution of IS6110 copy number, in which strains that display an intermediate copy number (approximately 5–14) are relatively rare, may be a reflection of this process. In this scenario a low copy number (approximately 1–5) peak represents stable strains possessing low IS6110 mobility [49]. This is followed by a trough comprising intermediate copy number strains in which one or more IS6110 elements are unstable, resulting in a (relatively) rapid move into the high copy number (approximately 15–25) peak.

The explanation for an observed upper limit of approximately 25 copies is unclear, although a runaway process resulting in uncontrolled IS accumulation would clearly be deleterious to the organism. Possibilities include a limitation on the number of insertion sites available, an elevated recombination frequency in strains with high copy number, the presence of a trans-acting transpositional inhibitor that increases with copy number (for example, accumulation of orfA), or that the element has only recently invaded the MTBC and has not yet had time to expand to a higher copy number. It seems unlikely that a limit in the number of available insertion sites that do not exert deleterious effects has been reached, since mapping studies in various M. tuberculosis clinical isolates have documented hundreds of unique insertion sites (see below). In addition, other bacterial pathogens can display extremely high IS copy numbers (in the hundreds) within a single genome [27,28]. Recombination between IS6110 elements results in the loss of one element along with all intervening sequence [50–52]. It is likely that an increase in copy number would correlate with an increase in recombination frequency, although this has not been experimentally tested. Furthermore, it is unclear how factors such as the proximity of IS elements to one another and the phenotypic effects due to the loss of intervening sequences would affect the frequency of IS loss. The possibility that a trans-acting feedback mechanism exists to limit copy number seems more probable. Trans-inhibition has been documented in IS10 [53] and is likely to also occur in IS1 [54]. The process has not, however, been documented in IS3 family members, although studies are lacking (M. Chandler, personal communication). Finally, because of uncertainties in the evolutionary history of M. tuberculosis and the fact that IS6110 copy number cannot be easily determined from ancient DNA samples, it is impossible to say how recently high copy number strains evolved or whether sufficient evolutionary time has elapsed to generate even higher copy number strains. It has been proposed that the highly successful, high copy number W-Beijing family arose in central Asia over 30,000 years ago [55], although the copy number status of this ‘proto-Beijing’ lineage is unknown and may have been considerably lower than that found in current strains. Interestingly, a recent report has shown that more recently evolved sub-lineages of the W-Beijing family possess a higher number of IS6110 elements than more ancient lineages [56].

Sudden bursts of genomic change may have important consequences for molecular epidemiological studies, since such an occurrence in epidemiologically related strains may result in IS6110 RFLP patterns displaying high diversity. The strains may thus be mistakenly assigned as non-clustered and presumed to be cases of reactivation [40].
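The burst scenario described in this section can be explored with a small stochastic simulation. The sketch below is a toy model only: the function name `simulate_copy_number`, the per-generation transposition probabilities, the fraction of insertions landing in active regions and the hard ceiling of 25 copies are all hypothetical parameters chosen for illustration, not estimates from the literature. Only replicative transposition is modelled, and the ceiling is imposed by hand because the text leaves its cause open.

```python
import random
from typing import List


def simulate_copy_number(generations: int, p_silent: float = 0.001,
                         p_active: float = 0.05, frac_active: float = 0.2,
                         max_copies: int = 25, seed: int = 0) -> List[int]:
    """Return the IS copy number after each generation, starting from one
    copy in a transcriptionally silent region. All rates are hypothetical."""
    rng = random.Random(seed)
    active_flags = [False]  # one entry per element: True if in an active region
    history = []
    for _ in range(generations):
        new_flags = []
        for is_active in active_flags:
            rate = p_active if is_active else p_silent
            if rng.random() < rate:
                # Replicative transposition: the new copy lands in an active
                # region with probability `frac_active`.
                new_flags.append(rng.random() < frac_active)
        active_flags.extend(new_flags)
        del active_flags[max_copies:]  # impose the assumed ceiling
        history.append(len(active_flags))
    return history


history = simulate_copy_number(generations=2000)
```

Under these assumptions a lineage tends to stay at low copy number while all of its elements are silent and then climbs quickly once one copy lands in an active region, which is the qualitative behaviour the bimodal distribution is proposed to reflect.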

Consequences of IS6110 transposition

In recent years the effects that insertion sequences can exert on host genomes have become more fully appreciated. Depending on the position of integration, a transpositional event may result in a broad range of phenotypic alterations in the host, ranging from lethality to neutrality to occasional beneficial effects. Mapping of IS6110 insertion points in M. tuberculosis has demonstrated that no obvious insertion site sequence specificity exists, although more subtle sequence preferences may occur. However, insertion sites are not completely random and integration hot-spots do exist. These may be defined as genomic regions that exhibit integration frequencies above the level expected if integration were randomly distributed. They include the DR region [57], the phospholipase C gene region [58], members of the PPE gene family [59,60], the intergenic region between the DnaA and DnaN genes [61], as well as other ISs themselves, such as IS1547 (also described as the ipl site) [62,63]. By defining a preferential integration region as a genomic domain <500 bp where different IS6110 insertion points have been identified in more than one M. tuberculosis isolate, Warren and colleagues have characterised over 10 integration hot-spots [64]. Insertion point mapping studies also detect genomic regions where IS integration is rare or absent [65]. This strongly suggests that integration into these regions is detrimental to the organism.

Mutational events caused by IS6110 transposition can be divided into the following classes. For each class, phenotypic effects that have either been directly observed or have the potential to occur are also discussed. A diagrammatic representation of each mutational event is shown in Fig. 4.
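The operational definition of a preferential integration region quoted above (a domain shorter than 500 bp containing different insertion points found in more than one isolate) lends itself to a short sketch. The code below is one possible reading of that definition, not the procedure used in the cited study: windows are built greedily from left to right, the genome is treated as linear, and the function name `find_hotspots` and the coordinates are hypothetical.

```python
from typing import List, Tuple


def find_hotspots(insertions: List[Tuple[str, int]],
                  window: int = 500) -> List[Tuple[int, int]]:
    """Return (start, end) coordinates of regions shorter than `window` bp
    that contain at least two different insertion points seen in at least
    two different isolates. Windows are built greedily from left to right."""
    ordered = sorted(set(insertions), key=lambda item: item[1])
    hotspots = []
    i = 0
    while i < len(ordered):
        start = ordered[i][1]
        j = i
        while j < len(ordered) and ordered[j][1] - start < window:
            j += 1
        group = ordered[i:j]
        positions = {pos for _, pos in group}
        isolates = {iso for iso, _ in group}
        if len(positions) >= 2 and len(isolates) >= 2:
            hotspots.append((start, group[-1][1]))
        i = j
    return hotspots


# Hypothetical (isolate, genome coordinate) pairs; not real mapping data.
mapped = [
    ("isoA", 1000), ("isoB", 1120), ("isoC", 1400),  # three points within 500 bp
    ("isoA", 9000),                                   # isolated insertion
    ("isoB", 20000), ("isoB", 20100),                 # same isolate only
]
hotspots = find_hotspots(mapped)
```

With these invented coordinates only the first group qualifies: the insertion at 9000 stands alone, and the pair near 20000 comes from a single isolate.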

Figure 4 Schematic representation of the possible effects of IS6110 transposition. The top diagram (1) shows an open reading frame, orfA, located between two IS6110 elements, IS6110a and IS6110b. (2) IS6110a integrates within orfA causing disruption of the coding region. (3) IS6110a integrates within orfA producing the following flanking mutations: a deletion event 5′ of the integration site that encompasses the 5′ portion of orfA and part of the upstream non-coding region (arrowed) and point mutations 3′ of the integration site (asterisks). (4) A recombination event occurs between IS6110a and IS6110b resulting in loss of all the intervening sequence and deletion of orfA. (5) IS6110a integrates upstream of orfA and upregulates its expression through IS6110 promoter activity.

Integration into intragenic regions

Analysis of integration sites has shown that the majority of IS6110 integration events occur within coding regions, presumably rendering the affected gene inactive in the majority of cases [59,60,64,65]. In the most numerically thorough analysis it was found that 60% of integration points in clinical isolates were intragenic [65]. However, over 90% of the M. tuberculosis genome is intragenic, demonstrating that insertions into coding regions occur at a lower frequency than expected if random integration is assumed. This, in turn, suggests that relatively few genes are capable of undergoing an IS6110 insertion without causing deleterious effects to the pathogen. Emphasising this was the finding that integration into putative virulence genes (defined as genes whose knockout results in an attenuated phenotype in vivo) was completely absent. Also less affected by IS6110 integration than expected were information pathway genes, lipid metabolism genes and genes involved in cell wall synthesis [65]. There was also substantial agreement between this study and previous work designed to document all genes essential for M. tuberculosis growth [66], with only 5 of the 100 genes shown to exhibit an IS6110 insertion having previously been designated as essential. Furthermore, Sampson and colleagues have found that most disrupted genes are members of multiple gene families, and hence the phenotypic effect may be limited in these cases because of functional gene redundancy [60].

Gene classes found to contain more insertions than expected were the PPE genes and genes of unknown function [65]. Members of the PPE gene family are hypothesised to be variable surface antigens [67], and the presumed absence of gene product due to intragenic IS6110 integration may benefit the organism as a method of immune evasion. Lending support to this theory is the fact that genes of the PPE family are also more likely to exhibit single nucleotide polymorphisms than other gene regions [68], demonstrating another mechanism that may potentially alter their antigenic effects. The extremely high rate of IS6110 integration into these genes even suggests the possibility that the process is actively selected for due to host immune pressures. The fact that genes of unknown function were also found to contain IS6110 insertions at a higher than expected frequency agrees with earlier work suggesting that relatively few of these genes are essential for the organism’s survival [69].

Yang and colleagues have examined the phenotypic effects of IS6110 disruption of the plcD gene [70]. Phospholipase C genes have been shown to play a role in the pathogenesis of several intracellular bacteria, and gene knockout studies in M. tuberculosis have demonstrated that mutants are attenuated in the late phase of infection [71]. TB patients infected with M. tuberculosis strains in which the plcD gene was either disrupted by IS6110 integration or had undergone IS6110-associated partial deletion were shown to be twice as likely to exhibit extrathoracic disease [70].

A second line of evidence that demonstrates the potential phenotypic effects of IS6110 integration into coding regions relates to the technique of transposon mutagenesis. This method of generating a mutant library of bacterial strains has been used to identify many M. tuberculosis genes that contribute to bacterial survival and virulence. The transposons used for mutagenesis in these studies are not naturally found in MTBC species (although they may be derived from closely related species such as M. smegmatis [72]) but they may replicate the effect of an IS6110 insertion event. The major difference between in vitro mutagenesis studies and those based on clinical isolates is that in vitro studies only examine the short term effects on growth and virulence on artificial media or in animal models, whereas the latter examine long term effects on virulence, including transmission. It is to be expected therefore that the spectrum of genes deemed essential in transposon mutagenesis studies will be narrower than that found in clinical isolates. Transposon site hybridisation (TraSH) analysis has revealed that many, if not most, IS insertions into gene coding regions reduce the fitness of the pathogen [66,69,73]. However, as described below, an increase in virulence has been noted in some cases.

A major cause of concern to clinicians is the increasing rate of emergence of drug-resistant and multidrug-resistant TB strains. Drug resistance is generally mediated via mutations in key genes. Such mutations may include spontaneous IS6110 insertions, as documented by Lemaitre and colleagues in their study on the mechanisms of pyrazinamide resistance [74]. Capreomycin resistance caused by IS6110 integration into the tlyA gene of M. smegmatis has also been described [75]. The ability of transposon mutagenesis techniques to produce drug-resistant phenotypes in various MTBC species is also well established. Sassetti and colleagues generated a TraSH library of M. bovis BCG that contained numerous mutants that were resistant to the front-line anti-TB antibiotics isoniazid and ethionamide due to independent insertions of an IS element into the katG and etaA genes, respectively [73]. Similarly, transposon insertions into the thyA gene of M. bovis BCG have been shown to result in resistance to the second-line antibiotic para-aminosalicylic acid [76], and transposon integration into the tlyA gene of M. tuberculosis confers resistance to the second-line antibiotic capreomycin [75]. Pathogens that evolve drug resistance often display a reduced overall fitness [77], and the laboratory-constructed mutants described above may incur an unknown phenotypic cost, which may explain why few naturally occurring examples involving IS6110 have been found.

Besides drug resistance, other aspects of virulence have also been shown to increase in certain strains following transposon mutagenesis. McAdam and colleagues randomly selected 11 H37Rv transposon insertion mutants, located in genes of defined general function, and demonstrated that five of these displayed significantly increased virulence compared to wild-type, as determined by the survival time of infected SCID mice [78]. The disrupted genes comprised members of several functional classification groups, including a transcriptional regulator and an aminopeptidase. While all of the above examples are artificially generated, they demonstrate the potential for IS6110 intragenic insertions to increase M. tuberculosis virulence, as measured both by decreased host survival time and by drug resistance characteristics.
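The comparison made earlier in this subsection, between the 60% of integration points that are intragenic and the more than 90% of the genome that is intragenic, can be made explicit with a short calculation. The sketch below uses only those two reported figures; the helper name `fold_enrichment` is hypothetical, and 0.90 is used as a conservative stand-in for "over 90%". A formal significance test would need the underlying insertion counts, which are not given here.

```python
def fold_enrichment(observed_fraction: float, expected_fraction: float) -> float:
    """Observed share of insertions divided by the share expected under
    random integration (values below 1 indicate under-representation)."""
    return observed_fraction / expected_fraction


observed_intragenic = 0.60  # reported share of integration points within genes
genome_intragenic = 0.90    # conservative bound; the text says "over 90%"

intragenic_ratio = fold_enrichment(observed_intragenic, genome_intragenic)
intergenic_ratio = fold_enrichment(1 - observed_intragenic, 1 - genome_intragenic)
```

On these figures intragenic insertions occur at roughly two thirds of the frequency expected under random integration, while intergenic insertions are about four times more frequent than expected. Because the true intragenic share of the genome exceeds 90%, the real intergenic excess would be somewhat larger than this estimate.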

IS6110 flanking mutations In the genomic regions flanking IS6110 elements a high mutation rate is observed.64 This genome plasticity appears to be driven by integration of the element and is possibly a direct result of the disruptive effect of IS6110 insertion on the surrounding DNA structure. Mutations observed in these regions include point mutations, expansion and contraction of tandem repeats, and larger genomic deletions.64 It may therefore be hypothesised that IS6110 integration not only alters the precise integration site but also the region surrounding it and it is therefore possible that these regions evolve at a substantially higher rate. The exact mechanism of these events is currently unknown but is suspected to be complex and varied due to the heterogeneous nature of the mutations involved.

Alternative explanations for the frequent presence of IS6110 flanking mutations are that a gene that has been previously inactivated by IS integration will be more likely to acquire subsequent mutations as these will be phenotypically neutral, or, conversely, a gene that has undergone an inactivating mutation may be more likely to undergo subsequent IS6110 integration. A causal relationship has not been established and it is therefore unclear whether IS6110 flanking mutations occur simultaneously, subsequently, or prior to IS6110 integration. Although not documented, it is possible that additional mutational consequences relating to the 3–4 bp duplication flanking the element that is generated upon genome integration32 may also occur. If IS6110 subsequently transposes to a new genomic location via non-replicative transposition these duplicated nucleotides may remain, resulting in a further genetic alteration.

Recombination/gene deletion

Deletion events have been shown to be capable of removing 20 kb of DNA containing up to 13 genes.79 They are common in the M. tuberculosis genome and result in approximately 5% of genes being variably absent in clinical isolates.80 Deletion of genomic regions may result from recombination between two adjacent copies of IS6110 leading to subsequent loss of gene function. Such events are identified by the absence of the 3 bp duplication flanking the element that occurs at integration. Previously, it was reported that homologous recombination plays a major role in gene deletion, particularly near the ipl locus integration hotspot.50 However, deletions do not always appear to be classical homologous recombination events between adjacent directly repeated elements and may involve elements that are in opposing directions.51,52 Numerous possible alternative mechanisms exist, some of which are described by Sampson and colleagues.51 Using high-density oligonucleotide arrays, Kato-Maeda and colleagues compared the deletion profiles of 19 M. tuberculosis isolates, including 13 non-clonal isolates that had caused disease in 148 patients.81 Deletion polymorphisms were shown to be present in almost all non-clonal isolates and a correlation between an increased amount of genomic deletion and a decreased chance of pulmonary cavitation was found. An example of the phenotypic effect of specific IS6110-mediated deletions has been documented in the previously described study by Yang and colleagues, where partial deletions of the plcD gene were shown to be associated with extrathoracic disease.70
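The diagnostic mentioned above, the presence or absence of the short direct repeat on either side of an element, can be illustrated with a minimal Python sketch. The function name `flanking_duplication`, its parameters and the example sequences are hypothetical and chosen purely for illustration; none of them come from the studies cited. The sketch assumes that the host sequence immediately upstream and downstream of the element has already been determined and that a duplication of 3 to 4 bp is expected.

```python
from typing import Optional


def flanking_duplication(left_flank: str, right_flank: str,
                         min_len: int = 3, max_len: int = 4) -> Optional[str]:
    """Return the direct repeat shared by the two flanks, or None.

    left_flank  -- host sequence immediately upstream of the element (5'->3')
    right_flank -- host sequence immediately downstream of the element (5'->3')

    A repeat is reported when the last k bases of the left flank equal the
    first k bases of the right flank, for k between min_len and max_len.
    """
    left, right = left_flank.upper(), right_flank.upper()
    # Try the longest allowed repeat first, then shorter ones.
    for k in range(max_len, min_len - 1, -1):
        if len(left) >= k and len(right) >= k and left[-k:] == right[:k]:
            return left[-k:]
    return None


# Hypothetical flanks: the element sits between "...GAC" and "GAC...".
print(flanking_duplication("TTACGGAC", "GACTTAGC"))  # GAC
# Hypothetical flanks with no shared repeat, as might follow a
# recombination event between two elements.
print(flanking_duplication("TTACGGAC", "CCATTAGC"))  # None
```

A result of `None` is only suggestive: a 3 bp match can also arise by chance (roughly 1 in 64 for random sequence with equal base frequencies), so this kind of check would need to be interpreted alongside other evidence.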

Promoter activity of IS6110

Most instances of IS6110 integration are expected to result in negative effects on gene expression due to the disruption of gene coding or promoter regions. However, like several other ISs described in other organisms, IS6110 has been found to possess an outward-directed promoter at its 3′ end and can thus act as a mobile promoter.59,82,83 A suspicion of possible IS6110 promoter activity was first raised by Beggs and colleagues who, in a study of IS6110 integration sites, found that an element had been inserted 55 bp into the ctpD gene and that the gene was still expressed normally.59 Two more recent studies have provided further evidence for IS6110 promoter activity. Safi and colleagues have characterised the promoter’s location and shown that it is able to upregulate several downstream genes from natural insertion sites in M. tuberculosis isolates cultured in human monocytes.82 Soto and colleagues have emphasised the effects that the IS6110 promoter may exert on mycobacterial virulence by describing the result of its insertion upstream of the phoP gene.83 phoP is a transcriptional regulator whose disruption results in the impaired growth of M. tuberculosis when cultured in mouse macrophages along with its attenuation in an in vivo mouse infection model, and it is thus considered to be essential for M. tuberculosis virulence.84 Soto and colleagues, examining an MDR outbreak strain of M. bovis, discovered an IS6110 insertion 75 bp upstream from the ATG start codon of phoP. This insertion was associated with an approximate 10-fold increase in phoP transcription as compared to wild-type when the phoP gene and promoter region were cloned into a mycobacterial replicative plasmid and expressed in M. smegmatis. The authors conclude that the IS6110-associated upregulation of phoP may be strongly associated with the high virulence of the strain.83

IS6110 and evolution

As described above, the transpositional activity of IS6110 produces genomic alterations that have the potential to result in an altered phenotype. Since phenotypic variation between organisms is the selectable parameter in Darwinian evolution, IS6110 transposition can be seen as an integral part of this process in M. tuberculosis. The genomes of different M. tuberculosis strains (and even distinct MTBC species) are remarkably similar with limited diversity within gene regions.4 In addition, horizontal DNA transfer appears to be rare within the MTBC.5,7,24,85–87 In contrast, genetic alterations caused by IS6110 transposition are frequently observed. Within the same individual, rates of IS6110 fingerprint change (defined as the proportion of cases that show fingerprint alterations over time) of up to 4.6% have been reported after 1 month,42,88 while rates of between 5.6% and 22% have been reported at 48 months.42,44,88–90 Furthermore, low intensity bands on IS6110 RFLP gels are relatively common91 and these have been shown to be produced by genetic heterogeneity within the clinical isolate.91,92 de Boer and colleagues found a significant correlation between low intensity bands and patient age and have suggested that IS6110 transposition may play an active role in the endogenous reactivation of dormant infection.91 It is probable that an increased rate of change occurs during transmission and two studies have estimated this to be approximately 18% over a period of 5–6 years.46,88 These findings suggest that the ability of M. tuberculosis to evolve may depend largely on the transposition of IS6110 and fuel the notion that phenotypic diversity within M. tuberculosis is, to an extent, due to the highly mobile IS. This, in turn, suggests that strains with a higher IS6110 copy number may have a higher evolutionary rate, and a potentially increased selective advantage, than strains with a low copy number.
This hypothesis is supported by the fact that the number of elements contained within the genome has increased over time from one original integration into the DR region to more than seven copies found in the majority of strains worldwide today. It is also notable that the W-Beijing strain family possesses, on average, a higher number of IS6110 copies (typically around 21) than any other strain. This strain is dominant throughout Eastern and South-Eastern Asia as well as Northern Eurasia, and has rapidly spread to many other regions of the world where it has also increased in incidence.93 Thus, the greater evolutionary success of high copy number strains might be directly related to the IS6110 copy number. This, however, overlooks the fact that low copy number strains may also cause outbreaks and evolve traits such as drug resistance17,94 and that other members of the complex, such as M. bovis, generally contain only one IS6110 element yet appear to infect and transmit quite efficiently. These findings are suggestive of distinct mechanisms of successful adaptive evolution.

This leads to the question of whether the presence of IS6110 provides an overall benefit to its host or if it is a purely ‘parasitic’ or ‘selfish’ replicating element. In the selfish scenario, the element is driven only by its own evolutionary interests and contributes little or no benefit to its host. Its replication rate is therefore increased and the host genome is forced to reduce the replication rate in order to survive, resulting in genomic conflict between the IS and the host genome and negative selection against high copy numbers. Conversely, the bacterial host might derive a selectable advantage by integrating transposable elements into its genome and therefore tolerate or even encourage their presence up to a critical point. Analysis of IS6110 integration sites suggests that the majority of transpositional events, like all mutations, are deleterious to M.
tuberculosis.65 In a series of computer modelling studies using various combinations of possible transposition functions and selective regimes, Tanaka and colleagues found that the most successful models involved selection against the uncontrolled expansion of IS6110 copy number,48 thus providing evidence for the selfish element theory. However, as discussed above, IS6110 replication may sometimes increase the fitness of M. tuberculosis, for example by upregulating virulence genes83 or by disrupting potential antigenic determinants.60 In another possible example of its beneficial effects, Ghanekar and colleagues have shown that the frequency of IS6110 transposition is stress-inducible.95 In such a situation the increase in transposition rate may increase the likelihood of a mutant emerging that can survive the stress. Analysis of the M. tuberculosis genome of various lineages by Alland and colleagues has lent additional support to the notion that IS6110 can confer an evolutionary advantage to its host.96 In this study it was found that M. tuberculosis strains containing <7 IS6110 elements segregate into a single lineage while strains containing >7 elements have arisen at least 3 times. In addition, the loss of IS6110 elements to <7 was uncommon, further emphasising that their presence is possibly beneficial. Another study that has investigated the phylogeny of M. tuberculosis has disputed the finding that low copy number strains segregate into a single lineage but demonstrates that a sequential increase in IS6110 copy number over time has occurred independently in many lineages.87 An additional line of evidence suggesting a beneficial effect of IS6110 transposition comes from the analysis of its transposition frequency in vitro compared to that observed during transmission. As mentioned above, the IS6110 mutation rate (as determined by changes in RFLP patterns) is relatively high. This contrasts with the mutation rate seen in vitro, where no alteration of IS6110 fingerprint patterns was observed following weekly passaging in liquid culture over a 6-month period.97 A new genotype observed following transmission presumably represents the clonal expansion of a single altered bacterium that has out-competed its far more numerous unaltered siblings in certain aspects of virulence. The high IS6110 mutation frequency observed during transmission may therefore be interpreted as an adaptive evolutionary response to the selection pressure exerted by the pathogen’s host. When minimal selection pressure is exerted, as in the case of a cultured organism, the requirement for genotypic/phenotypic change is greatly reduced and limited IS6110 transposition is observed. The two scenarios are not mutually exclusive and probably both occur, since a successful transposable element will evolve a balance with its host through limiting its own replication rate and possibly also producing the occasional beneficial mutation. In other words, although a selfish element may act only in its own evolutionary interest, it is not in its evolutionary interest to kill its host from uncontrolled replication. In all bacterial organisms, transposition of IS elements appears to be highly self-regulated in order to achieve a balance between extinction of the element and the presumably lethal effects of over-transposition.
Intrinsic control of IS transpositional activity may be achieved through a variety of mechanisms, including weak promoter strength (sometimes achieved through Dam methylation of promoter sequences), poor ribosome binding sites, the inclusion of transcription termination sites within the transposase ORF and, as previously described for IS6110, programmed translational frameshifting.98 From the viewpoint of the host organism, a balance must also be struck between the neutral or potentially positive effects of limited transposition and the deleterious effects of runaway replication, which the host achieves by exerting its own regulatory controls. These may include the production of transposition repressor proteins, altering the DNA supercoiling of the element and Dam methylation of the transposon ends.98 Note that in many individual bacteria this balance will not be ideal but, as in all Darwinian processes, these ‘less fit’ organisms will be selected against and their lineages will eventually become extinct. The finely tuned transpositional balance achieved between IS6110 and M. tuberculosis raises the possibility of an anti-TB drug whose mechanism involves tipping the balance towards uncontrolled over-transposition resulting in a host genome damaged to such an extent that the organism dies.48 This process, termed error catastrophe, could rely on the specific upregulation of the transposase or downregulation of transposase inhibitors and would represent a novel method of TB control and one to which no resistance has yet developed, although as with all other bactericidal drugs, eventual evolution of resistance is probably inevitable.
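The balance described above, between gain of copies through transposition and selection against lineages that carry too many, can be made concrete with a toy simulation. The sketch below is an illustrative assumption only: it is not the model of Tanaka and colleagues,48 and the function name `simulate_copy_number`, the parameter names and every numerical value are hypothetical. It assumes that each lineage gains one copy per generation with a probability proportional to its current copy number, and that relative fitness falls exponentially with copy number.

```python
import math
import random
from typing import List


def simulate_copy_number(generations: int, pop_size: int = 200,
                         start_copies: int = 1, gain_rate: float = 0.01,
                         cost_per_copy: float = 0.02,
                         seed: int = 0) -> List[int]:
    """Return the copy number of each lineage after the given generations."""
    rng = random.Random(seed)  # fixed seed so that runs are repeatable
    population = [start_copies] * pop_size
    for _ in range(generations):
        # Transposition: a lineage gains one copy with a probability that
        # scales with its current copy number (capped at 1).
        transposed = [
            n + (1 if rng.random() < min(1.0, gain_rate * n) else 0)
            for n in population
        ]
        # Selection: lineages are resampled with weights that decline
        # exponentially as copy number rises.
        weights = [math.exp(-cost_per_copy * n) for n in transposed]
        population = rng.choices(transposed, weights=weights, k=pop_size)
    return population


final = simulate_copy_number(generations=500)
print(sum(final) / len(final))  # mean copy number under these toy settings
```

Setting `cost_per_copy` to zero removes selection, so comparing runs with and without a cost gives a rough feel for how selection against high copy numbers could restrain expansion. The sketch ignores loss of elements, beneficial insertions and host regulatory controls, all of which are discussed in the text.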

Conclusion

Different M. tuberculosis strains display phenotypic variation in many aspects of virulence despite showing extreme similarity within protein coding regions of the genome. Here we have compiled and discussed evidence suggesting that, through its relatively high transpositional activity, IS6110 is a major contributor to this phenotypic variation and that this action has influenced the evolution of the pathogen. IS6110 has been shown to influence gene expression through integration into protein coding genomic regions, the ability to undergo recombination events resulting in gene deletion, and the upregulation of genes due to its intrinsic promoter activity. It is generally accepted that insertion sequences and other selfish DNA elements are deleterious to the host and various lines of evidence suggest that the majority of IS6110 transpositional events in M. tuberculosis result in lethality or reduced virulence. However, IS6110 transposition does appear to occasionally offer a reproductive benefit to its bacterial host. One of the most convincing demonstrations of this has been provided by Soto and colleagues, who described an IS6110 integration event that was found to upregulate a gene implicated in M. tuberculosis virulence in a clinical isolate that had caused an MDR TB outbreak.83 IS6110 can thus play a dynamic role in adaptive evolution and its presence within the genome may subsequently be tolerated or even encouraged. A runaway increase in IS6110 copy number is prevented by the deleterious effects of over-replication. The balance achieved in this conflict between IS and host results in few strains observed having a copy number of >24. The long term result of this genomic conflict is uncertain and may result in a further general IS6110 increase or reduction, even to the point of extinction.48,99 Indeed, it has been suggested that the dynamics of the element are currently out of equilibrium and that this could reflect its transient state within M. tuberculosis.99 If, however, as we propose here, the element is able to provide occasional selective benefits and thus ‘pay for its keep’, its future survival is far more assured.

Funding: The authors wish to thank the South African Medical Research Council, National Research Foundation and Stellenbosch University for continued financial support.
Competing interests: None declared
Ethical approval: Not required

Acknowledgement

We wish to acknowledge Dr. M. Chandler for his interesting and enthusiastic discussion along with the reviewers for their insightful comments.

References

1. Corbett EL, Watt CJ, Walker N, Maher D, Williams BG, Raviglione MC, et al. The growing burden of tuberculosis: global trends and interactions with the HIV epidemic. Arch Intern Med 2003;163:1009–21.
2. Dye C, Scheele S, Dolin P, Pathania V, Raviglione MC. Consensus statement. Global burden of tuberculosis: estimated incidence, prevalence, and mortality by country. WHO Global Surveillance and Monitoring Project. JAMA 1999;282:677–86.
3. Boddinghaus B, Rogall T, Flohr T, Blocker H, Bottger EC. Detection and identification of Mycobacteria by amplification of rRNA. J Clin Microbiol 1990;28:1751–9.
4. Sreevatsan S, Pan X, Stockbauer KE, Connell ND, Kreiswirth BN, Whittam TS, et al. Restricted structural gene polymorphism in the Mycobacterium tuberculosis complex indicates evolutionarily recent global dissemination. Proc Nat Acad Sci USA 1997;94:9869–74.
5. Gutacker MM, Smoot JC, Migliaccio CA, Ricklefs SM, Hua S, Cousins DV, et al. Genome-wide analysis of synonymous single nucleotide polymorphisms in Mycobacterium tuberculosis complex organisms. Resolution of genetic relationships among closely related microbial strains. Genetics 2002;162:1533–43.
6. Hughes AL, Friedman R, Murray M. Genomewide pattern of synonymous nucleotide substitution in two complete genomes of Mycobacterium tuberculosis. Emerg Infect Dis 2002;8:1342–6.
7. Brosch R, Gordon SV, Marmiesse M, Brodin P, Buchrieser C, Eiglmeier K, et al. A new evolutionary scenario for the Mycobacterium tuberculosis complex. Proc Nat Acad Sci USA 2002;99:3684–9.
8. Gutierrez MC, Brisse S, Brosch R, Fabre M, Omais B, Marmiesse M, et al. Ancient origin and gene mosaicism of the progenitor of Mycobacterium tuberculosis. PLoS Pathog 2005;1:e5.
9. Smith NH, Kremer K, Inwald J, Dale J, Driscoll JR, Gordon SV, et al. Ecotypes of the Mycobacterium tuberculosis complex. J Theor Biol 2006;239:220–5.
10. Stead WW, Eisenach KD, Cave MD, Beggs ML, Templeton GL, Thoen CO, et al. When did Mycobacterium tuberculosis infection first occur in the New World? An important question with public health implications. Am J Respir Crit Care Med 1995;151:1267–8.
11. Smith NH. A re-evaluation of M. prototuberculosis. PLoS Pathog 2006;2:e98.
12. Brisse S, Supply P, Brosch R, Vincent V, Gutierrez MC. "A re-evaluation of M. prototuberculosis": continuing the debate. PLoS Pathog 2006;2:e95.
13. Liu X, Gutacker MM, Musser JM, Fu YX. Evidence for recombination in Mycobacterium tuberculosis. J Bacteriol 2006;188:8169–77.
14. Emergence of Mycobacterium tuberculosis with extensive resistance to second-line drugs--worldwide, 2000–2004. MMWR Morb Mortal Wkly Rep 2006;55:301–5.
15. Abebe F, Bjune G. The emergence of Beijing family genotypes of Mycobacterium tuberculosis and low-level protection by bacille Calmette-Guerin (BCG) vaccines: is there a link? Clin Exp Immunol 2006;145:389–97.
16. Anh DD, Borgdorff MW, Van LN, Lan NT, van Gorkom T, Kremer K, et al. Mycobacterium tuberculosis Beijing genotype emerging in Vietnam. Emerg Infect Dis 2000;6:302–5.
17. Valway SE, Sanchez MP, Shinnick TF, Orme I, Agerton T, Hoy D, et al. An outbreak involving extensive transmission of a virulent strain of Mycobacterium tuberculosis. N Engl J Med 1998;338:633–9.
18. Dormans J, Burger M, Aguilar D, Hernandez-Pando R, Kremer K, Roholl P, et al. Correlation of virulence, lung pathology, bacterial load and delayed type hypersensitivity responses after infection with different Mycobacterium tuberculosis genotypes in a BALB/c mouse model. Clin Exp Immunol 2004;137:460–8.
19. Reed MB, Domenech P, Manca C, Su H, Barczak AK, Kreiswirth BN, et al. A glycolipid of hypervirulent tuberculosis strains that inhibits the innate immune response. Nature 2004;431:84–7.
20. Manca C, Reed MB, Freeman S, Mathema B, Kreiswirth B, Barry CE III, et al. Differential monocyte activation underlies strain-specific Mycobacterium tuberculosis pathogenesis. Infect Immun 2004;72:5511–4.
21. Chacon-Salinas R, Serafin-Lopez J, Ramos-Payan R, Mendez-Aragon P, Hernandez-Pando R, van Soolingen D, et al. Differential pattern of cytokine expression by macrophages infected in vitro with different Mycobacterium tuberculosis genotypes. Clin Exp Immunol 2005;140:443–9.
22. Zhang M, Gong J, Yang Z, Samten B, Cave MD, Barnes PF. Enhanced capacity of a widespread strain of Mycobacterium tuberculosis to grow in human macrophages. J Infect Dis 1999;179:1213–7.
23. Glynn JR, Whiteley J, Bifani PJ, Kremer K, van Soolingen D. Worldwide occurrence of Beijing/W strains of Mycobacterium tuberculosis: a systematic review. Emerg Infect Dis 2002;8:843–9.
24. Cole ST, Brosch R, Parkhill J, Garnier T, Churcher C, Harris D, et al. Deciphering the biology of Mycobacterium tuberculosis from the complete genome sequence. Nature 1998;393:537–44.
25. Garnier T, Eiglmeier K, Camus JC, Medina N, Mansoor H, Pryor M, et al. The complete genome sequence of Mycobacterium bovis. Proc Nat Acad Sci USA 2003;100:7877–82.
26. Gordon SV, Heym B, Parkhill J, Barrell B, Cole ST. New insertion sequences and a novel repeated sequence in the genome of Mycobacterium tuberculosis H37Rv. Microbiology 1999;145:881–92.
27. Parkhill J, Sebaihia M, Preston A, Murphy LD, Thomson N, Harris DE, et al. Comparative analysis of the genome sequences of Bordetella pertussis, Bordetella parapertussis and Bordetella bronchiseptica. Nat Genet 2003;35:32–40.
28. Brugger K, Torarinsson E, Redder P, Chen L, Garrett RA. Shuffling of Sulfolobus genomes by autonomous and non-autonomous mobile elements. Biochem Soc Trans 2004;32:179–83.
29. Biemont C, Vieira C. Genetics: junk DNA as an evolutionary force. Nature 2006;443:521–4.
30. Tavakoli NP, Derbyshire KM. Tipping the balance between replicative and simple transposition. EMBO J 2001;20:2923–30.
31. Thierry D, Brisson-Noel A, Vincent-Levy-Frebault V, Nguyen S, Guesdon JL, Gicquel B. Characterization of a Mycobacterium tuberculosis insertion sequence, IS6110, and its application in diagnosis. J Clin Microbiol 1990;28:2668–73.
32. Dale JW. Mobile genetic elements in mycobacteria. Eur Respir J Suppl 1995;20:633s–48s.
33. Fang Z, Morrison N, Watt B, Doig C, Forbes KJ. IS6110 transposition and evolutionary scenario of the direct repeat locus in a group of closely related Mycobacterium tuberculosis strains. J Bacteriol 1998;180:2102–9.
34. Fomukong N, Beggs M, el Hajj H, Templeton G, Eisenach K, Cave MD. Differences in the prevalence of IS6110 insertion sites in Mycobacterium tuberculosis strains: low and high copy number of IS6110. Tuber Lung Dis 1997;78:109–16.
35. Sekine Y, Eisaki N, Ohtsubo E. Translational control in production of transposase and in transposition of insertion sequence IS3. J Mol Biol 1994;235:1406–20.
36. Sekine Y, Izumi K, Mizuno T, Ohtsubo E. Inhibition of transpositional recombination by OrfA and OrfB proteins encoded by insertion sequence IS3. Genes Cells 1997;2:547–57.
37. Haas M, Rak B. Escherichia coli insertion sequence IS150: transposition via circular and linear intermediates. J Bacteriol 2002;184:5833–41.
38. Polard P, Prere MF, Chandler M, Fayet O. Programmed translational frameshifting and initiation at an AUU codon in gene expression of bacterial insertion sequence IS911. J Mol Biol 1991;222:465–77.
39. Duval-Valentin G, Normand C, Khemici V, Marty B, Chandler M. Transient promoter formation: a new feedback mechanism for regulation of IS911 transposition. EMBO J 2001;20:5802–11.

40. Wall S, Ghanekar K, McFadden J, Dale JW. Context-sensitive transposition of IS6110 in mycobacteria. Microbiology 1999;145:3169–76.
41. van Embden JD, Cave MD, Crawford JT, Dale JW, Eisenach KD, Gicquel B, et al. Strain identification of Mycobacterium tuberculosis by DNA fingerprinting: recommendations for a standardized methodology. J Clin Microbiol 1993;31:406–9.
42. de Boer AS, Borgdorff MW, de Haas PE, Nagelkerke NJ, van Embden JD, van Soolingen D. Analysis of rate of change of IS6110 RFLP patterns of Mycobacterium tuberculosis based on serial patient isolates. J Infect Dis 1999;180:1238–44.
43. Niemann S, Richter E, Rusch-Gerdes S. Stability of Mycobacterium tuberculosis IS6110 restriction fragment length polymorphism patterns and spoligotypes determined by analyzing serial isolates from patients with drug-resistant tuberculosis. J Clin Microbiol 1999;37:409–12.
44. Yeh RW, Ponce dL, Agasino CB, Hahn JA, Daley CL, Hopewell PC, et al. Stability of Mycobacterium tuberculosis DNA genotypes. J Infect Dis 1998;177:1107–11.
45. Warren RM, van der Spuy GD, Richardson M, Beyers N, Borgdorff MW, Behr MA, et al. Calculation of the stability of the IS6110 banding pattern in patients with persistent Mycobacterium tuberculosis disease. J Clin Microbiol 2002;40:1705–8.
46. Warren RM, van der Spuy GD, Richardson M, Beyers N, Booysen C, Behr MA, et al. Evolution of the IS6110-based restriction fragment length polymorphism pattern during the transmission of Mycobacterium tuberculosis. J Clin Microbiol 2002;40:1277–82.
47. Dale JW, Tang TH, Wall S, Zainuddin ZF, Plikaytis B. Conservation of IS6110 sequence in strains of Mycobacterium tuberculosis with single and multiple copies. Tuber Lung Dis 1997;78:225–7.
48. Tanaka MM, Rosenberg NA, Small PM. The control of copy number of IS6110 in Mycobacterium tuberculosis. Mol Biol Evol 2004;21:2195–201.
49. Dale JW, Al Ghusein H, Al Hashmi S, Butcher P, Dickens AL, Drobniewski F, et al. Evolutionary relationships among strains of Mycobacterium tuberculosis with few copies of IS6110. J Bacteriol 2003;185:2555–62.
50. Fang Z, Doig C, Kenna DT, Smittipat N, Palittapongarnpim P, Watt B, et al. IS6110-mediated deletions of wild-type chromosomes of Mycobacterium tuberculosis. J Bacteriol 1999;181:1014–20.
51. Sampson SL, Warren RM, Richardson M, Victor TC, Jordaan AM, van der Spuy GD, et al. IS6110-mediated deletion polymorphism in the direct repeat region of clinical isolates of Mycobacterium tuberculosis. J Bacteriol 2003;185:2856–66.
52. Sampson SL, Richardson M, van Helden PD, Warren RM. IS6110-mediated deletion polymorphism in isogenic strains of Mycobacterium tuberculosis. J Clin Microbiol 2004;42:895–8.
53. Ma C, Simons RW. The IS10 antisense RNA blocks ribosome binding at the transposase translation initiation site. EMBO J 1990;9:1267–74.
54. Escoubas JM, Prere MF, Fayet O, Salvignol I, Galas D, Zerbib D, et al. Translational control of transposition activity of the bacterial insertion sequence IS1. EMBO J 1991;10:705–12.
55. Mokrousov I, Ly HM, Otten T, Lan NN, Vyshnevskyi B, Hoffner S, et al. Origin and primary dispersal of the Mycobacterium tuberculosis Beijing genotype: clues from human phylogeography. Genome Res 2005;15:1357–64.
56. Hanekom M, van der Spuy GD, Streicher E, Ndabambi SL, McEvoy CR, Kidd M, et al. A recently evolved sublineage of the Mycobacterium tuberculosis Beijing strain family is associated with an increased ability to spread and cause disease. J Clin Microbiol 2007;45:1483–90.
57. Hermans PW, van Soolingen D, Bik EM, de Haas PE, Dale JW, van Embden JD. Insertion element IS987 from Mycobacterium bovis BCG is located in a hot-spot integration region for insertion elements in Mycobacterium tuberculosis complex strains. Infect Immun 1991;59:2695–705.
58. Vera-Cabrera L, Hernandez-Vera MA, Welsh O, Johnson WM, Castro-Garza J. Phospholipase region of Mycobacterium tuberculosis is a preferential locus for IS6110 transposition. J Clin Microbiol 2001;39:3499–504.
59. Beggs ML, Eisenach KD, Cave MD. Mapping of IS6110 insertion sites in two epidemic strains of Mycobacterium tuberculosis. J Clin Microbiol 2000;38:2923–8.
60. Sampson SL, Warren RM, Richardson M, van der Spuy GD, van Helden PD. Disruption of coding regions by IS6110 insertion in Mycobacterium tuberculosis. Tuber Lung Dis 1999;79:349–59.
61. Kurepina NE, Sreevatsan S, Plikaytis BB, Bifani PJ, Connell ND, Donnelly RJ, et al. Characterization of the phylogenetic distribution and chromosomal insertion sites of five IS6110 elements in Mycobacterium tuberculosis: non-random integration in the dnaA-dnaN region. Tuber Lung Dis 1998;79:31–42.
62. Fang Z, Forbes KJ. A Mycobacterium tuberculosis IS6110 preferential locus (ipl) for insertion into the genome. J Clin Microbiol 1997;35:479–81.
63. Fang Z, Doig C, Morrison N, Watt B, Forbes KJ. Characterization of IS1547, a new member of the IS900 family in the Mycobacterium tuberculosis complex, and its association with IS6110. J Bacteriol 1999;181:1021–4.
64. Warren RM, Sampson SL, Richardson M, van der Spuy GD, Lombard CJ, Victor TC, et al. Mapping of IS6110 flanking regions in clinical isolates of M. tuberculosis demonstrates genome plasticity. Mol Microbiol 2000;37:1405–16.
65. Yesilkaya H, Dale JW, Strachan NJ, Forbes KJ. Natural transposon mutagenesis of clinical isolates of Mycobacterium tuberculosis: how many genes does a pathogen need? J Bacteriol 2005;187:6726–32.
66. Sassetti CM, Boyd DH, Rubin EJ. Genes required for mycobacterial growth defined by high density mutagenesis. Mol Microbiol 2003;48:77–84.
67. Banu S, Honore N, Saint-Joanis B, Philpott D, Prevost MC, Cole ST. Are the PE-PGRS proteins of Mycobacterium tuberculosis variable surface antigens? Mol Microbiol 2002;44:9–19.
68. Fleischmann RD, Alland D, Eisen JA, Carpenter L, White O, Peterson J, et al. Whole-genome comparison of Mycobacterium tuberculosis clinical and laboratory strains. J Bacteriol 2002;184:5479–90.
69. Sassetti CM, Rubin EJ. Genetic requirements for mycobacterial survival during infection. Proc Nat Acad Sci USA 2003;100:12989–94.
70. Yang Z, Yang D, Kong Y, Zhang L, Marrs CF, Foxman B, et al. Clinical relevance of Mycobacterium tuberculosis plcD gene mutations. Am J Respir Crit Care Med 2005;171:1436–42.
71. Raynaud C, Guilhot C, Rauzier J, Bordat Y, Pelicic V, Manganelli R, et al. Phospholipases C are involved in the virulence of Mycobacterium tuberculosis. Mol Microbiol 2002;45:203–17.
72. McAdam RA, Weisbrod TR, Martin J, Scuderi JD, Brown AM, Cirillo JD, et al. In vivo growth characteristics of leucine and methionine auxotrophic mutants of Mycobacterium bovis BCG generated by transposon mutagenesis. Infect Immun 1995;63:1004–12.
73. Sassetti CM, Boyd DH, Rubin EJ. Comprehensive identification of conditionally essential genes in mycobacteria. Proc Nat Acad Sci USA 2001;98:12712–7.
74. Lemaitre N, Sougakoff W, Truffot-Pernot C, Jarlier V. Characterization of new mutations in pyrazinamide-resistant strains of Mycobacterium tuberculosis and identification of conserved regions important for the catalytic activity of the pyrazinamidase PncA. Antimicrob Agents Chemother 1999;43:1761–3.
75. Maus CE, Plikaytis BB, Shinnick TM. Mutation of tlyA confers capreomycin resistance in Mycobacterium tuberculosis. Antimicrob Agents Chemother 2005;49:571–7.

76. Rengarajan J, Sassetti CM, Naroditskaya V, Sloutsky A, Bloom BR, Rubin EJ. The folate pathway is a target for resistance to the drug para-aminosalicylic acid (PAS) in mycobacteria. Mol Microbiol 2004;53:275–82.
77. Andersson DI, Levin BR. The biological cost of antibiotic resistance. Curr Opin Microbiol 1999;2:489–93.
78. McAdam RA, Quan S, Smith DA, Bardarov S, Betts JC, Cook FC, et al. Characterization of a Mycobacterium tuberculosis H37Rv transposon library reveals insertions in 351 ORFs and mutants with altered virulence. Microbiology 2002;148:2975–86.
79. Ho TB, Robertson BD, Taylor GM, Shaw RJ, Young DB. Comparison of Mycobacterium tuberculosis genomes reveals frequent deletions in a 20 kb variable region in clinical isolates. Yeast 2000;17:272–82.
80. Tsolaki AG, Hirsh AE, DeRiemer K, Enciso JA, Wong MZ, Hannan M, et al. Functional and evolutionary genomics of Mycobacterium tuberculosis: insights from genomic deletions in 100 strains. Proc Nat Acad Sci USA 2004;101:4865–70.
81. Kato-Maeda M, Rhee JT, Gingeras TR, Salamon H, Drenkow J, Smittipat N, et al. Comparing genomes within the species Mycobacterium tuberculosis. Genome Res 2001;11:547–54.
82. Safi H, Barnes PF, Lakey DL, Shams H, Samten B, Vankayalapati R, et al. IS6110 functions as a mobile, monocyte-activated promoter in Mycobacterium tuberculosis. Mol Microbiol 2004;52:999–1012.
83. Soto CY, Menendez MC, Perez E, Samper S, Gomez AB, Garcia MJ, et al. IS6110 mediates increased transcription of the phoP virulence gene in a multidrug-resistant clinical isolate responsible for tuberculosis outbreaks. J Clin Microbiol 2004;42:212–9.
84. Perez E, Samper S, Bordas Y, Guilhot C, Gicquel B, Martin C. An essential role for phoP in Mycobacterium tuberculosis virulence. Mol Microbiol 2001;41:179–87.
85. Baker L, Brown T, Maiden MC, Drobniewski F. Silent nucleotide polymorphisms and a phylogeny for Mycobacterium tuberculosis. Emerg Infect Dis 2004;10:1568–77.
86. Filliol I, Motiwala AS, Cavatore M, Qi W, Hazbon MH, Bobadilla dV. Global phylogeny of Mycobacterium tuberculosis based on single nucleotide polymorphism (SNP) analysis: insights into tuberculosis evolution, phylogenetic accuracy of other DNA fingerprinting systems, and recommendations for a minimal standard SNP set. J Bacteriol 2006;188:759–72.
87. Gutacker MM, Mathema B, Soini H, Shashkina E, Kreiswirth BN, Graviss EA, et al. Single-nucleotide polymorphism-based population genetic analysis of Mycobacterium tuberculosis strains from 4 geographic sites. J Infect Dis 2006;193:121–8.

88. Glynn JR, Yates MD, Crampin AC, Ngwira BM, Mwaungulu FD, Black GF, et al. DNA fingerprint changes in tuberculosis: reinfection, evolution, or laboratory error? J Infect Dis 2004;190:1158–66.
89. Cave MD, Eisenach KD, Templeton G, Salfinger M, Mazurek G, Bates JH, et al. Stability of DNA fingerprint pattern produced with IS6110 in strains of Mycobacterium tuberculosis. J Clin Microbiol 1994;32:262–6.
90. Quy HT, Lan NT, Borgdorff MW, Grosset J, Linh PD, Tung LB, et al. Drug resistance among failure and relapse cases of tuberculosis: is the standard re-treatment regimen adequate? Int J Tuberc Lung Dis 2003;7:631–6.
91. de Boer AS, Kremer K, Borgdorff MW, de Haas PE, Heersma HF, van Soolingen D. Genetic heterogeneity in Mycobacterium tuberculosis isolates reflected in IS6110 restriction fragment length polymorphism patterns as low-intensity bands. J Clin Microbiol 2000;38:4478–84.
92. Matsumoto T, Ano H, Nagai T, Danno K, Takashima T, Tsuyuguchi I. IS6110 DNA fingerprinting analysis of individually separated colonies of Mycobacterium tuberculosis. Tuberculosis (Edinb) 2005;85:207–12.
93. Bifani PJ, Mathema B, Kurepina NE, Kreiswirth BN. Global dissemination of the Mycobacterium tuberculosis W-Beijing family strains. Trends Microbiol 2002;10:45–52.
94. Victor TC, Streicher EM, Kewley C, Jordaan AM, van der Spuy GD, Bosman M, et al. Spread of an emerging Mycobacterium tuberculosis drug-resistant strain in the western Cape of South Africa. Int J Tuberc Lung Dis 2007;11:195–201.
95. Ghanekar K, McBride A, Dellagostin O, Thorne S, Mooney R, McFadden J. Stimulation of transposition of the Mycobacterium tuberculosis insertion sequence IS6110 by exposure to a microaerobic environment. Mol Microbiol 1999;33:982–93.
96. Alland D, Whittam TS, Murray MB, Cave MD, Hazbon MH, Dix K, et al. Modeling bacterial evolution with comparative-genome-based marker systems: application to Mycobacterium tuberculosis evolution and pathogenesis. J Bacteriol 2003;185:3392–9.
97. van Soolingen D, Hermans PW, de Haas PE, Soll DR, van Embden JD. Occurrence and stability of insertion sequences in Mycobacterium tuberculosis complex strains: evaluation of an insertion sequence-dependent DNA polymorphism as a tool in the epidemiology of tuberculosis. J Clin Microbiol 1991;29:2578–86.
98. Nagy Z, Chandler M. Regulation of transposition in bacteria. Res Microbiol 2004;155:387–98.
99. Tanaka MM, Small PM, Salamon H, Feldman MW. The dynamics of repeated elements: applications to the epidemiology of tuberculosis. Proc Nat Acad Sci USA 2000;97:3532–7.