Molecular epidemiology, phylogeny and evolution of Legionella


PII: S1567-1348(16)30167-8
DOI: 10.1016/j.meegid.2016.04.033
Reference: MEEGID 2731

Received date: 14 February 2016
Revised date: 29 April 2016
Accepted date: 30 April 2016

Please cite this article as: Khodr, A., Kay, E., Gomez-Valero, L., Ginevra, C., Doublet, P., Buchrieser, C., Jarraud, S., Molecular epidemiology, phylogeny and evolution of Legionella, (2016), doi: 10.1016/j.meegid.2016.04.033

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

A. Khodr1,2, E. Kay3, L. Gomez-Valero1,2, C. Ginevra3,4, P. Doublet3, C. Buchrieser1,2, S. Jarraud3,4

1 Institut Pasteur, Unité de Biologie des Bactéries Intracellulaires
2 CNRS UMR 3525, 28, Rue du Dr Roux, 75724, Paris, France
3 CIRI, International Center for Infectiology Research, Inserm, U1111; CNRS; UMR5308, Université Lyon 1; École Normale Supérieure de Lyon, Lyon, F-69008, France
4 French National Reference Center of Legionella, Hospices Civils de Lyon, France

Running title: Legionella epidemiology and evolution

Keywords: Legionella pneumophila; phylogeny; epidemiology; experimental evolution; ICEs; HGT; T4ASS.

Abstract

Legionella are opportunistic pathogens that develop in aquatic environments where they multiply in protozoa. When infected aerosols reach the human respiratory tract they may accidentally infect the alveolar macrophages, leading to a severe pneumonia called Legionnaires' disease (LD). The ability of Legionella to survive within host cells is strictly dependent on the Dot/Icm Type 4 Secretion System, which translocates a large repertoire of effectors into the host cell cytosol. Although Legionella is a large genus comprising nearly 60 species that are distributed worldwide, only about half of them have been involved in LD cases. Strikingly, the species L. pneumophila alone is responsible for 90% of all LD cases.

The present review summarizes the molecular approaches that are used for L. pneumophila genotyping, with a major focus on the contribution of whole genome sequencing (WGS) to the investigation of local L. pneumophila outbreaks and global epidemiology studies. We report the newest knowledge regarding the phylogeny and the evolution of Legionella and then focus on virulence evolution of those Legionella species that are known to have the capacity to infect humans. Finally, we discuss the evolutionary forces and adaptation mechanisms acting on the Dot/Icm system itself as well as the role of mobile genetic elements (MGE) encoding T4ASSs and of gene duplications in the evolution of Legionella and its adaptation to different hosts and lifestyles.

Contents

1. Introduction
2. Taxonomy and ecology of the genus Legionella
3. Genotyping methods and epidemiology
3.1. Historical typing methods and today’s standards
3.2. Subtyping of endemic clones
3.3. Whole genome sequencing: a new tool for molecular typing of L. pneumophila
4. Update on the phylogeny of the genus Legionella
5. Evolution of the species Legionella pneumophila and the genus Legionella
5.1. Impact of recombination on the evolution of Legionella pneumophila
5.2. Experimental evolution of Legionella pneumophila
5.3. Evolutionary forces acting on the Dot/Icm type IVB secretion systems in the genus Legionella
6. Adaptation of Legionella to the environment and its hosts
6.1. Role of integrative conjugative elements in the evolution of Legionella
6.1.1. Lvh (Legionella Vir homologue) region
6.1.2. Trb/Tra family of conjugative elements
6.1.3. Genomic Island associated T4SS family
6.2. Role of gene duplications in the adaptation of Legionella
6.3. Specific features of Legionella influencing environmental adaptation
7. Concluding remarks

1. Introduction

Legionella pneumophila is a human pathogen that was recognised only 40 years ago after a large outbreak of pneumonia during a convention of the American Legion; the illness was therefore named Legionnaires’ disease (LD) (Fraser et al., 1977; McDade et al., 1977). Why was a bacterium that causes severe pneumonia only identified in 1977? Firstly, the development of man-made water systems producing aerosols (e.g. air conditioning systems, cooling towers, spas etc.) created conditions allowing the direct access of this opportunistic bacterium to human lungs. This is thought to have made LD an emerging disease since the 1970s. Secondly, the fastidious growth of L. pneumophila and its requirement for L-cysteine and iron for growth on axenic culture media would have made it difficult to isolate this organism earlier (Feeley et al., 1978). However, the magnitude of the first recognized outbreak of LD in Philadelphia in 1976, the mystery created by the long search for the causative agent, which was widely reported by the media, the severity of the pneumonia and the fact that most patients belonged to the American Legion made this outbreak an exceptional event. Thus, very significant resources were employed for the epidemiological investigation to finally identify the causative agent. Interestingly, only techniques routinely used in the field of Rickettsia, also an intracellular bacterium, such as inoculation of both guinea pigs and yolk sacs of embryonated eggs, allowed Joseph McDade and his collaborators to cultivate and identify L. pneumophila (McDade et al., 1977). Once the bacterium was identified and isolated, sero-epidemiological studies were done, which allowed the recognition of earlier outbreaks of LD and also enabled the description of Pontiac fever (Blaser, 1977; McDade et al., 1979). Pontiac fever is a milder form of legionellosis, classically described as an influenza-like illness without pneumonia.

Today, L. pneumophila has been identified as one of the three most common causes of severe community-acquired pneumonia (CAP): Legionnaires’ disease accounts for 2%–8% of CAP cases (Bartlett, 2008, 2011; Roig and Rello, 2003). A review of 41 European studies of CAP identified Legionella as the causative agent in 1.9% of outpatients, 4.9% of hospitalised patients and 7.9% of ICU patients (Woodhead, 2002). The exact incidence of LD worldwide is unknown because surveillance, reporting and diagnosis of LD differ between countries. The development of diagnostic tests for the easy detection of L. pneumophila serogroup 1-specific antigens in urine was a significant advance in the diagnosis of legionellosis. In 2014 the European prevalence was 13.5 cases per million inhabitants; most cases (78%) were confirmed by this urinary antigen test (European Centre for Disease Prevention and Control, 2016). The mortality rate remains high (8% to 30% in ICUs and for hospital-acquired LD) despite improved diagnostic and therapeutic management of patients. Known risk factors for LD include increasing age, male gender, smoking, chronic lung disease, diabetes and various conditions associated with immunodeficiency (Phin et al., 2014). LD occurs sporadically and in outbreaks. In Europe, most cases (approximately 80%) are community-acquired and sporadic (European Centre for Disease Prevention and Control, 2016). Most cases occur during summer and early autumn; the yearly incidence of LD seems to be associated with climate changes, such as increased precipitation (Cunha et al., 2015). The sources of infection of sporadic cases are more rarely investigated and identified. However, several systems and matrices have been classified as confirmed sources of Legionella (van Heijnsbergen et al., 2015). Healthcare- and travel-associated outbreaks are mainly related to contaminated water systems (showers and baths, respiratory therapy or air conditioning equipment, spa pools and, less commonly, food display humidifiers). Community outbreaks are predominantly linked to contaminated aerosols from wet cooling systems. As mentioned above, L. pneumophila, like several other Legionella species, can also cause Pontiac fever. However, the causative bacteria have not yet been isolated from Pontiac fever patients. The diagnosis is made by urinary antigen test or by serology. While the disease progresses acutely and shows a high attack rate of approximately 95%, the mortality rate is zero (Fields et al., 2002).

Important knowledge about the epidemiology, the clinical presentations and the treatment of LD was readily gained and published soon after its recognition in 1976 (Fraser et al., 1977). However, advances in microbiology have now led to a better understanding of the ecological niches, the pathogenesis and the evolution of L. pneumophila. Here, we review the latest knowledge gained on the phylogeny and evolution of L. pneumophila through whole genome sequencing and the different molecular approaches recently employed for its study.

2. Taxonomy and ecology of the genus Legionella

Since Legionnaires’ disease was recognized, the isolation, identification and characterization of different strains led to the establishment of the family Legionellaceae, consisting of the single genus Legionella within the γ2 subdivision of the Proteobacteria (Brenner et al., 1979; Woese, 1987). On the basis of low DNA-DNA hybridization values between some Legionella species, a classification separating the family Legionellaceae into three genera (Legionella, Fluoribacter and Tatlockia) was proposed (Garrity et al., 1980). However, additional studies showing that all legionellae studied have 16S ribosomal RNA sequences that are more than 95% identical did not support this division (Fry et al., 1991). Today, the genus Legionella comprises over 60 species (http://www.bacterio.net/legionella.html); all species have been isolated from environmental samples, and about half of the known species were also isolated at least once from patients and have thus been associated with infection. The type species is Legionella pneumophila, the first bacterium of the genus to be described, which is nowadays also the species responsible for nearly 95% of cases of LD diagnosed worldwide (Fraser et al., 1977; McDade et al., 1977). This species can be subdivided into 16 serogroups, but the majority of culture-confirmed LD cases (84% worldwide, 80% in Europe) are caused by L. pneumophila serogroup 1 (Lp1) (Beaute et al., 2013; Fields et al., 2002; Yu et al., 2002). In 1988, Brenner et al. subdivided the species Legionella pneumophila into three subspecies (Brenner et al., 1988): L. pneumophila subsp. pneumophila, L. pneumophila subsp. fraseri and L. pneumophila subsp. pascullei. Nevertheless, the tools to distinguish subspecies are not available in routine laboratories and thus these subdivisions are rarely reported. Other species of the genus Legionella isolated from patients in 2014 in Europe are L. longbeachae (2%), L. micdadei (1%), L. bozemanii (<1%), L. maceachernii (<1%), L. sainthelensi (<1%), other Legionella species (<1%) and Legionella species not identified (1%) (European Centre for Disease Prevention and Control, 2016). The majority of confirmed infections involving non-pneumophila Legionella species occur in severely immune-compromised patients. Interestingly, Graham et al. reported a distinct epidemiological pattern of legionellosis in New Zealand and Australia, where L. longbeachae and L. pneumophila are similarly prevalent in LD (Graham et al., 2011). Infections with L. longbeachae are commonly associated with exposure to contaminated composts and potting soils, and have been increasingly reported in Europe in the past ten years (Amodeo et al., 2010; Currie and Beattie, 2015). The comparison of the L. longbeachae genome with that of Lp1 identified many species-specific differences that may account for the different environmental niches and disease epidemiology of these two species. Among these, L. longbeachae encodes several enzymes that might confer the ability to degrade plant material, suggesting that L. longbeachae may also be able to interact with plants. Interestingly, L. longbeachae does not encode flagella, a major virulence factor for L. pneumophila, but is encapsulated (Cazalet et al., 2010). Regarding L. pneumophila, the high prevalence of Lp1 strains in human disease mentioned above does not seem to be due to its environmental distribution (Doleans et al., 2004). In order to explain this epidemiological predominance, comparative genome analyses of over 200 Legionella strains using macroarrays were performed. One major finding of this study was that the lipopolysaccharide (LPS) biosynthesis gene cluster of sg1 was the only common feature of Lp1 strains, suggesting that the specific LPS of sg1 is at least partly responsible for the predominance of sg1 in human disease (Cazalet et al., 2008). These genome analyses further showed that the LPS gene cluster of sg1 strains is present in Lp strains occupying different positions in the phylogenetic tree and having different genetic backgrounds, indicating that the LPS cluster is exchanged by HGT (Cazalet et al., 2008; Cazalet et al., 2004; Gomez-Valero et al., 2011; Merault et al., 2011).

A particular form of Legionella bacteria called Legionella-like amoebal pathogens (LLAPs) was described in 1956 as an intracellular parasite of free-living amoebae that causes lysis of the amoeba cells (Drozanski, 1956). Though initially named Sarcobium lyticum (Drozanski, 1991), this species was reclassified within the genus Legionella as L. lytica (Hookey et al., 1996). The dichotomy between LLAPs and other Legionella spp. is based only on the fact that LLAPs grow poorly or not at all on BCYE agar. Based on 16S rRNA sequences, LLAPs clustered within a monophyletic group containing all other species (Birtles et al., 1996; Fry et al., 1991). Today these LLAPs represent five species in the genus Legionella: Legionella lytica (LLAP-3, LLAP-9, LLAP-7), L. drozanskii (LLAP-1), L. rowbothamii (LLAP-6), L. fallonii (LLAP-10) and L. drancourtii (LLAP-12) (Adeleke et al., 1996; Adeleke et al., 2001; La Scola et al., 2004). Serological studies suggested that they might be involved in human infections as co-pathogens (La Scola et al., 2002; Marrie et al., 2001; McNally et al., 2000).

Legionella exists as an intracellular parasite of protozoa such as Acanthamoeba, Hartmannella, Tetrahymena and Naegleria (Brieland et al., 1997; Rowbotham, 1980). During human infection, Legionella invades and replicates in macrophages, which are considered the primary target of Legionella. Conservation of many signaling pathways from protozoa to human macrophages explains in part the ability of Legionella to infect humans. Through the action of a large number of effector proteins translocated into the host cell cytosol via the defect in organelle trafficking/intracellular multiplication (Dot/Icm) type IV secretion system (T4SS), the bacteria modulate host cell vesicular traffic and endosomal maturation pathways. This includes the recruitment of vesicles derived from the endoplasmic reticulum (ER) to convert the nascent phagosome into an ER-derived compartment, termed the Legionella-containing vacuole (LCV), for survival and replication (Allombert et al., 2013; Isberg et al., 2009). L. pneumophila delivers more than 300 effectors into host cells (Isberg et al., 2009; Newton et al., 2010; Zhu et al., 2011). The exact function of the vast majority of effectors still remains unknown. Many of these effector proteins harbor eukaryotic-like domains or resemble eukaryotic proteins that seem to have been acquired from their eukaryotic hosts via horizontal gene transfer (Gomez-Valero and Buchrieser, 2013). All Legionella species analyzed to date harbor a type IVB Icm/Dot secretion system; however, they have a large non-overlapping effector repertoire (Burstein et al., 2016; Gomez-Valero et al., 2014) (see paragraph 5.3).

Legionella uses amoebae as a natural niche for its development, and human cells (such as macrophages and epithelial cells) are thought to be an accidental niche in the normal life cycle of the bacteria. Although human-to-human transmission has been thought not to occur, the first case of a probable human-to-human transmission of L. pneumophila was recently reported during a large outbreak in Portugal in 2015 (Correia et al., 2016). However, co-evolution with environmental phagocytic cells is still without doubt the major driving force of the evolution of this bacterium. The replication capacity of different Legionella species in amoebae and human cells correlates with the epidemiological data for these species, suggesting that common as well as species-specific mechanisms might be involved in Legionella infection and replication in human cells (Gomez-Valero et al., 2014). Although amoebae and human macrophages are the conventional cellular models for studying the virulence of L. pneumophila, this bacterium is able to colonize other models such as non-phagocytic cell lines like HeLa cells (Garduno et al., 1998), permissive A/J mice and different cell lines (Brieland et al., 1994), the nematode Caenorhabditis elegans (Brassinga and Sifri, 2013), the wax moth Galleria mellonella (Harding et al., 2012) or the fruit fly Drosophila melanogaster (Sun et al., 2013). From an evolutionary perspective, the fact that L. pneumophila is able to multiply in very different phagocytic cells (from unicellular protists to mammalian macrophages) reveals that these bacteria are able to control cellular pathways which are conserved within its different eukaryotic hosts (Segal and Shuman, 1999).

3. Genotyping methods and epidemiology

3.1. Historical typing methods and today’s standards

As mentioned above, L. pneumophila is the most prevalent Legionella species involved in LD. After genus and species identification, Legionella isolates can be discriminated using several different subtyping methods. These typing methods aim to characterize isolates in order to identify epidemiologically linked LD cases and/or to identify the environmental source of the infection so that corrective actions can be taken. Most of the methods are dedicated to L. pneumophila typing, but some of them could also be applied to other Legionella species. Most typing methods used are molecular methods; nevertheless, a few phenotyping methods have also been described. For example, sets of monoclonal antibodies that allow subtyping of L. pneumophila serogroup 1 (Lp1) have been described (Helbig et al., 2002; Joly et al., 1986). First, a standard phenotyping scheme was described that divides Lp1 into 9 subtypes, which constitutes an interesting first screen during epidemiological investigations and which could also enhance the discriminatory power of an associated genotyping method (Helbig et al., 2002). Later, the first molecular typing methods were developed for L. pneumophila; these were based on the comparison of DNA band patterns generated by different methods. Finally, methods based on sequencing of amplicons have been developed (Aurell et al., 2005; Gaia et al., 2005; Helbig et al., 2003; Pourcel et al., 2007; Ratzow et al., 2007) and, more recently, a MALDI-TOF-MS approach (Fujinami et al., 2010), whole genome mapping (Bosch et al., 2015) and whole genome sequencing-based methods have been described (see paragraph 3.3). The efficiency of a typing method in differentiating bacterial isolates can be evaluated by calculating its discriminatory power (D). The discriminatory power is the average probability that the typing system will assign a different type to two unrelated strains randomly sampled in the microbial population of a given taxon and is calculated according to Hunter and Gaston (Hunter and Gaston, 1988). Table 1 summarizes the discriminatory power of the typing methods for which it was available.
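As an illustration of the discriminatory power described above, the following minimal sketch computes the Hunter and Gaston index in its standard published form, D = 1 - [1/(N(N-1))] Σ n_j(n_j - 1), where N is the number of unrelated isolates and n_j the number of isolates assigned to type j. The function name and the example type labels are hypothetical and are not taken from Table 1.

```python
from collections import Counter


def discriminatory_power(types):
    """Hunter-Gaston discriminatory index D for a list of type labels.

    `types` holds one label per unrelated isolate (e.g. an ST or a PFGE pattern).
    D is the probability that two isolates drawn without replacement differ in type.
    """
    n = len(types)
    if n < 2:
        raise ValueError("at least two isolates are required")
    counts = Counter(types)
    # Number of ordered pairs of isolates that share the same type.
    same_type_pairs = sum(c * (c - 1) for c in counts.values())
    return 1.0 - same_type_pairs / (n * (n - 1))


# Hypothetical panel of five unrelated isolates, two of which share a type.
example_types = ["ST1", "ST1", "ST23", "ST47", "ST62"]
d_example = discriminatory_power(example_types)
```

A value of 1 means that every isolate received a distinct type, whereas a value of 0 means that all isolates were indistinguishable by the method.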

Restriction fragment length polymorphism (RFLP) analysis was one of the first molecular typing methods developed for L. pneumophila. This method was used as the gold standard until recently in some countries such as the UK (Harrison et al., 2007). The method is a restriction endonuclease digest analysis of gDNA in which differences between the sequences of the genomic DNAs of strains are detected following electrophoretic separation of the digests, using specific probes in a Southern blot assay (Saunders et al., 1990). This method has also been used for L. longbeachae typing (Lanser et al., 1990). Alternative probes targeting the ribosomal operon have also been used for L. pneumophila typing (Saunders et al., 1991). RFLP typing has a high discriminatory index, but several methods developed later have an equal or higher discriminatory index (e.g. pulsed-field gel electrophoresis (PFGE) (Pruckler et al., 1995)) or are easier to set up (e.g. PCR-based techniques). PFGE was for a long time one of the gold standard methods. The method is based on the separation, by pulsed-field gel electrophoresis, of macro-restriction fragments of the bacterial chromosome generated by digestion with an infrequently cutting restriction endonuclease. PFGE has a high discriminatory index as assessed by several studies that mainly used the enzyme SfiI as restriction endonuclease (Amemura-Maekawa et al., 2005; Marrie et al., 1999; Riffard et al., 1998). Despite its high discriminatory power, this method suffers from some drawbacks: it is time consuming (4 days to obtain results), inter-gel reproducibility is often poor, electrophoresis requires specific equipment, computer-aided imaging analysis is needed, and data are difficult to exchange between laboratories, which makes investigations of travel-associated LD cases more difficult. Recently, an improved protocol for L. pneumophila typing was described that reduces the total duration of the experiment to 2 days (Chang et al., 2009). Based on the complete genome sequences, new restriction endonucleases for PFGE typing were identified and the electrophoretic parameters were optimized (Zhou et al., 2010). PFGE has also been used for L. longbeachae typing (Montanaro-Punzengruber et al., 1999); moreover, an optimized protocol using a double digestion has been described for L. anisa typing (Akermi et al., 2006). Despite international efforts to standardize PFGE typing protocols (restriction endonuclease, plug preparation and electrophoretic parameters), the data obtained remain difficult to exchange.
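The principle shared by the band-pattern methods above (RFLP, PFGE) can be sketched in silico: a sequence change that creates or destroys a recognition site changes the set of fragment lengths. The sketch below is a toy illustration only. It assumes a short linear sequence (a real bacterial chromosome is circular and several megabases long) and uses the EcoRI site GAATTC, which is cut after the first G, as a stand-in; it does not model SfiI or any protocol cited here, and the sequences and function names are hypothetical.

```python
import re


def digest(sequence, site, cut_offset):
    """Return fragment lengths after cutting a linear sequence at every `site`.

    `cut_offset` is the position of the cut within the recognition site
    (1 for GAATTC means the cut falls between G and AATTC).
    """
    # The lookahead finds overlapping occurrences of the site as well.
    cuts = [m.start() + cut_offset
            for m in re.finditer(f"(?={re.escape(site)})", sequence)]
    bounds = [0] + cuts + [len(sequence)]
    return [end - start for start, end in zip(bounds, bounds[1:]) if end > start]


def band_pattern(sequence, site, cut_offset):
    """Fragment lengths sorted from largest to smallest, as read on a gel."""
    return sorted(digest(sequence, site, cut_offset), reverse=True)


# Two hypothetical isolates differing by one substitution inside the second site.
isolate_a = "AAAGAATTCCCCCGAATTCTT"
isolate_b = "AAAGAATTCCCCCGAGTTCTT"
pattern_a = band_pattern(isolate_a, "GAATTC", 1)
pattern_b = band_pattern(isolate_b, "GAATTC", 1)
```

Here the single substitution removes one cut site, so the two isolates yield different band patterns even though they differ at only one position.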

PCR-based techniques generating band patterns have also been applied to L. pneumophila typing. Amplified fragment length polymorphism (AFLP) is one of the methods standardized by the European Working Group on Legionella Infections (EWGLI, which in 2012 became the European Society of Clinical Microbiology and Infectious Diseases (ESCMID) Study Group for Legionella Infections, ESGLI), as it has a high discriminatory index. Bacterial DNA is digested and specific adapters are ligated to the restriction fragments. These adapters are then used as targets for PCR amplification. The length polymorphism of the amplified fragments is visualized by agarose or acrylamide gel electrophoresis (Fry et al., 2005; Fry et al., 1999; Fry et al., 2002). Several other PCR-based methods generating DNA band patterns have been described for L. pneumophila subtyping, such as arbitrarily primed PCR (AP-PCR) (Gomez-Lus et al., 1993; Grattard et al., 1996; Lawrence et al., 1999c), infrequent-restriction-site PCR (IRS-PCR) (Jakubek et al., 2013; Riffard et al., 1998) or multi-locus variable-number tandem-repeat (VNTR) analysis (MLVA), which uses the diversity of VNTRs for typing (Nederbragt et al., 2008; Pourcel et al., 2003). An MLVA type database has also been created and is available via the website http://bacterial-genotyping.igmors.u-psud.fr/. In contrast to AP-PCR, the amplification step of AFLP, IRS-PCR and MLVA is performed under stringent conditions, which improves inter-laboratory reproducibility.

Matrix-assisted laser desorption ionization time-of-flight mass spectrometry (MALDI-TOF-MS) was also evaluated for rapid discrimination of Legionella isolates (Fujinami et al., 2010). First tests showed that two sets of L. pneumophila isolates clustered in the same way when typed by mass spectrometry or by PFGE. Nevertheless, only 23 L. pneumophila isolates have been tested and thus this method needs to be evaluated on a larger number of isolates from diverse origins before conclusions on its usefulness can be drawn. However, it seems to be a promising tool, as MALDI-TOF-MS data can be generated within a few hours after Legionella growth and MALDI-TOF mass spectrometers are increasingly present in microbiology laboratories for bacterial identification purposes.

Whole genome mapping has recently been evaluated for L. pneumophila typing (Bosch et al., 2015). The authors tested this new tool on a panel of 53 related and unrelated isolates, including stability and reproducibility testing. They demonstrated the high discriminatory power of the method, which provides a result in less than 24 hours after culture growth.

Methods based on the comparison of polymorphisms present in selected DNA targets have also been developed. These sequence-based typing (SBT) methods are based on the sequence comparison of several PCR-amplified DNA fragments. For L. pneumophila typing, several target combinations have been assessed to obtain the most discriminative method (Aurell et al., 2005; Gaia et al., 2005; Gaia et al., 2003; Ratzow et al., 2007). A consensus was defined in 2007, and since then SBT has been the gold standard method recommended by EWGLI for L. pneumophila subtyping. The method is based on the sequence comparison of 7 PCR-amplified gene fragments (flaA, pilE, asd, mip, mompS, proA, and neuA). Raw sequence data are submitted to an online database available at the Public Health England (PHE) website (http://bioinformatics.phe.org.uk/legionella/legionella_sbt/php/sbt_homepage.php), which allows the assignment of individual allele numbers to each target. A string of the individual allele numbers separated by commas in a predetermined order defines the allelic profile (e.g., 1,4,3,1,1,1,1) (Gaia et al., 2005; Ratzow et al., 2007). A Sequence Type (ST) number is assigned to each allelic profile (e.g. allelic profile 1,4,3,1,1,1,1 = ST1). In March 2016, the SBT database included 10792 entries from 60 countries representing 2156 different STs. Isolation of clinical or environmental L. pneumophila strains is not an easy task. This impairs epidemiological investigations, as both clinical and environmental isolates are required for comparisons to find the source of infection. In this respect, SBT, like other PCR-based methods, has an advantage as it can be applied directly to DNA extracted from clinical or environmental samples without the need to isolate the strains. This direct application to DNA samples has been used in a few cases; however, variable results have been reported: in some cases all genes could be amplified and sequenced from the extracted DNA (Fry NK, 2005), but in other cases none or only a few of the 7 SBT genes could be amplified and sequenced (Luck et al., 2007). To enhance the sensitivity of SBT applied directly to clinical samples, the addition of a prior amplification step, leading to a nested or semi-nested PCR before sequencing of the target genes, was reported (Coscolla and Gonzalez-Candelas, 2009; Ginevra et al., 2009). Coscollá and colleagues enhanced the discriminatory index of the method by adding three intergenic regions as new targets to the six target genes of standard SBT, but these targets were not included in the EWGLI definition of sequence types. An optimized protocol for nested PCR-based SBT is available on the EWGLI site (www.ewgli.org).
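The allelic profile and ST assignment described above amount to a simple lookup, sketched below. This is an illustration only: it assumes that the predetermined locus order is the order in which the seven genes are listed in the text, and the lookup table contains only the one profile quoted there (1,4,3,1,1,1,1 = ST1). A real assignment would rely on the curated SBT database; all function names are hypothetical.

```python
# Locus order assumed to follow the order in which the text lists the genes.
SBT_LOCI = ("flaA", "pilE", "asd", "mip", "mompS", "proA", "neuA")

# Only the ST1 profile comes from the text; other STs are deliberately absent.
KNOWN_PROFILES = {(1, 4, 3, 1, 1, 1, 1): "ST1"}


def allelic_profile(alleles):
    """Build the ordered allelic profile from a {locus: allele number} mapping."""
    missing = [locus for locus in SBT_LOCI if locus not in alleles]
    if missing:
        raise ValueError(f"no allele number for: {', '.join(missing)}")
    return tuple(alleles[locus] for locus in SBT_LOCI)


def profile_string(profile):
    """Comma-separated representation of a profile, e.g. '1,4,3,1,1,1,1'."""
    return ",".join(str(allele) for allele in profile)


def sequence_type(profile):
    """Return the ST for a known profile, or None if the profile is not in the table."""
    return KNOWN_PROFILES.get(tuple(profile))


isolate_alleles = {"flaA": 1, "pilE": 4, "asd": 3, "mip": 1,
                   "mompS": 1, "proA": 1, "neuA": 1}
isolate_profile = allelic_profile(isolate_alleles)
isolate_st = sequence_type(isolate_profile)
```

Because a profile is an ordered tuple of small integers, it can be exchanged between laboratories as plain text, which is the portability advantage the text attributes to SBT.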

Sequence-based typing appears to be the method of choice for the exchange of results and a powerful tool for global epidemiology. Furthermore, the application of the SBT method directly to clinical and environmental samples offers new solutions during epidemiological investigations (Coscolla and Gonzalez-Candelas, 2009; Mentasti et al., 2012; Moran-Gilad et al., 2015). The addition of new targets to the standard SBT has shown that the discriminatory power of this technique can be enhanced easily (Coscolla and Gonzalez-Candelas, 2009; Ratzow et al., 2007). This concept has now been extended to whole genome sequencing data, and core genome MLST approaches have been described that will be detailed further below (see paragraph 3.3).

3.2. Subtyping of endemic clones

Comprehensive molecular typing of legionellae around the world allows a global picture of strain distributions to be built up. Several independent studies have shown that only a few genotypes (e.g. ST1, ST23, ST37, ST40, ST47, ST62) cause most unrelated culture-proven LD cases (Cazalet et al., 2008; Ginevra et al., 2008; Harrison et al., 2009; Lawrence et al., 1999b; Moran-Gilad et al., 2014). Some of these genotypes are distributed worldwide (Cazalet et al., 2008).

The collaborative results obtained by members of EWGLI since 2003 using the standard SBT scheme showed that a minority of strains causes nearly half of all reported cases of Legionnaires’ disease in Western Europe. However, the high isolation rate of strains with the above-mentioned genotypes in clinical samples makes it difficult, and frequently even impossible, to demonstrate links between cases and environmental sources of infection during epidemiological investigations. A few studies have described methods allowing the subtyping of some STs (e.g. ST1 or ST62) for better discrimination within these genotypes. A spoligotyping method was designed to subtype ST1. The CRISPR spacers were amplified using primers located in the direct repeat of the CRISPR array. The presence of these spacers was then revealed by reverse dot blot using 42 probes homologous to CRISPR spacers. A spoligotype was defined by the specific binary code deduced from the presence or absence of each of the 42 spacers. In their study, Ginevra et al. were able to discriminate 233 ST1 isolates into 41 different spoligotypes (Ginevra et al., 2012). This method was first developed using a membrane-based hybridization assay and was later transferred to a microbead-based high-throughput assay (Gomgnimbou et al., 2014). The subtyping of ST62 isolates by sequencing of the CRISPR array was also helpful in differentiating outbreak isolates from unrelated ST62 isolates and in confirming the environmental source of the outbreak (Luck et al., 2015).

3.3. Whole genome sequencing: a new tool for molecular typing of L. pneumophila

Different studies have been conducted to evaluate the potential of whole genome sequencing (WGS) as a tool for L. pneumophila typing. Data comparisons were based on SNP analyses, core genome comparisons, accessory genome comparisons, mobile genetic element profiling and core genome MLST (cgMLST). The first study describing WGS as a typing tool for L. pneumophila retrospectively sequenced 7 isolates (3 clinical and 4 environmental) from an outbreak in the UK in 2003 (Reuter et al., 2013). The objective of the study was to evaluate the feasibility of using WGS for Legionella outbreak investigations. The results obtained by SNP analyses were concordant with the SBT results (which were derived from the WGS data) and with the previous investigation results that linked two clinical isolates with three environmental isolates and excluded one clinical and one environmental isolate from the outbreak. The authors pointed to some limitations of WGS such as the limited number of available L. pneumophila genomes and the lack of automated pipelines for genome analysis.

WGS has also been used for the real-time investigation of a L. pneumophila community outbreak in Québec City in 2012, where it was performed in comparison with SBT and PFGE (Levesque et al., 2014). All three methods gave concordant results and allowed the identification of the environmental source of the outbreak. Recently the use of WGS for the real-time investigation of a nosocomial L. pneumophila outbreak in an Australian hospital was reported (Graham et al., 2014). In this study, the genomes of six ST1 isolates were sequenced: four from the hospital (three clinical isolates and one environmental isolate) and two unrelated isolates. All genomes obtained were compared to the ST1 reference genome (Paris, CR628336). SBT and virulence gene profiling were also performed for the typing of these isolates. WGS was the only method used in this study that allowed the discrimination of unrelated ST1 isolates from the hospital isolates. Unrelated isolates showed more than 1500 SNP differences when compared to the hospital isolates, which did not differ by more than 20 SNPs between any two strains. In this study, the authors could link the two clinical isolates of the 2013 outbreak and one clinical isolate from a LD case in 2011 to an environmental isolate from a water sample from the hospital. They demonstrated the usefulness of WGS for real-time investigation of LD cases and showed the higher discriminatory power of this method compared to the SBT method, especially for the typing of endemic clones such as ST1.
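The SNP-based comparison described above can be illustrated with a minimal Python sketch. It assumes the isolates have already been aligned to a common reference so that all sequences have the same length; the function names, the toy sequences and the threshold used in the usage example are hypothetical and are not taken from any of the cited studies.

```python
from itertools import combinations


def snp_distance(seq_a, seq_b):
    """Count aligned positions that differ, skipping gaps ('-') and ambiguous bases ('N')."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    ignored = {"-", "N"}
    return sum(
        1
        for x, y in zip(seq_a.upper(), seq_b.upper())
        if x != y and x not in ignored and y not in ignored
    )


def pairwise_snp_distances(sequences):
    """Return {(name_a, name_b): distance} for every pair of isolates (names sorted)."""
    return {
        (a, b): snp_distance(sequences[a], sequences[b])
        for a, b in combinations(sorted(sequences), 2)
    }


def linked_pairs(distances, threshold):
    """Return the isolate pairs whose SNP distance does not exceed the threshold."""
    return sorted(pair for pair, d in distances.items() if d <= threshold)


# Toy aligned sequences (hypothetical, far shorter than a real genome alignment).
toy_isolates = {
    "clinical_1": "ACGTACGTAC",
    "environmental_1": "ACGTACGTAT",
    "unrelated_1": "TCGAACCTGT",
}
toy_distances = pairwise_snp_distances(toy_isolates)
toy_links = linked_pairs(toy_distances, threshold=1)
```

On real data the threshold would have to be chosen per investigation; the figures of 20 and 1500 SNPs reported above describe one hospital outbreak and are not general cut-offs.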

A large retrospective study sequencing isolates from the same hospital was conducted recently (Bartley et al., 2016). The clinical isolates as well as 39 environmental ST1 isolates from different parts of the hospital were sequenced. A SNP-based approach as well as mobile genetic element profiling was used to analyze the whole genome sequencing data. As in the previous study, few SNP differences were observed between the related clinical and environmental isolates. Despite this small number of differences, the authors identified three clonal variants that seem to correspond to different areas of the hospital. The patient isolates were most closely related to isolates from their immediate environments, which might indicate localized geographic microevolution. Nevertheless, the authors also demonstrated a highly stable L. pneumophila population that contaminated the hospital, with only few differences according to the time and location of isolation. Thus the authors had doubts about the correlation between the geographic localizations of the variants and those of the patients.

Isolates belonging to ST191 that had caused a community LD outbreak in 2012 in Edinburgh, UK have been particularly closely analyzed by performing WGS of 21 isolates from 15 patients (McAdam et al., 2014). First, a SNP-based comparison of the core genomes was performed after removal of three high-density SNP regions presumed to be due to recombination events. This analysis led to the discrimination of four different subtypes among the patients. Two subtypes differed by 20 SNPs but had been isolated from the same patient. Second, an analysis of the accessory genome was performed; in particular the gene content of type 4 secretion system (T4SS) related genomic regions was assessed. Variations in the Lvh T4ASS encoding region were identified between isolates. Of note, this analysis allowed the identification of a new L. pneumophila vir homologues (Lvh) encoding region (the Lvh region will be detailed further in the article). SNPs were also identified in the T4BSS encoded by the dot/icm genes. The differences between the T4SSs did not enhance the discrimination among the isolates. Finally, two isolates from the same patient, which had an indistinguishable core genome, differed by the presence of a 55 kb element including genes encoding resistance to heavy metals and a 2.7 kb region encoding 2 hypothetical proteins. This study highlights that, using the same dataset, different levels of discrimination can be obtained depending on the analyses performed. It also highlights that an outbreak can be due to a polymorphic population of L. pneumophila strains in the environmental water source, leading to a complex distribution of L. pneumophila variants between patients of the outbreak and even within the same patient.

Recently a new cgMLST scheme for L. pneumophila typing, based on 17 genomes from databases and analyzing 1521 genes, has been proposed. This cgMLST method has been validated on 21 genomes from the databases and evaluated on three epidemiologically unrelated humidifier-associated pediatric LD clusters from Israel. The cgMLST allowed differentiating isolates within the endemic ST1 genotype. The cgMLST seems to provide a high discriminatory power and easily exchangeable results (Moran-Gilad et al., 2015).

Taken together, the different studies discussed show the power of WGS for investigations of L. pneumophila outbreaks. Different sequencing and analysis methods were used, giving the same or higher discrimination levels as the standard methods used. Although different discriminatory power was observed when several analysis methods were combined, WGS seems to be the future of molecular typing. From a global epidemiology standpoint, there is a need for standardization of both sequencing methods and sequence data analyses in order to make the data comparable and exchangeable between labs. In contrast, for local outbreak investigations, several analysis strategies may be used with the same success for the in-depth characterization of outbreak-related isolates.
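One reason cgMLST results are described as easily exchangeable is that each isolate reduces to a profile of allele numbers, one per core gene, which can be compared without reanalysing raw reads. The following Python sketch shows one common way to compare two such profiles; the locus names, allele numbers and the choice to skip loci with missing calls are assumptions made for illustration and do not reproduce the published scheme.

```python
def allele_distance(profile_a, profile_b):
    """Count loci called in both profiles whose allele numbers differ.

    Profiles map locus name -> allele number; None marks a missing call,
    and loci missing in either profile are skipped (an assumed convention).
    """
    shared_loci = set(profile_a) & set(profile_b)
    return sum(
        1
        for locus in shared_loci
        if profile_a[locus] is not None
        and profile_b[locus] is not None
        and profile_a[locus] != profile_b[locus]
    )


# Hypothetical four-locus profiles; a real core genome scheme has far more loci.
cluster_isolate_1 = {"locus_1": 1, "locus_2": 4, "locus_3": 2, "locus_4": None}
cluster_isolate_2 = {"locus_1": 1, "locus_2": 4, "locus_3": 2, "locus_4": 7}
unrelated_isolate = {"locus_1": 3, "locus_2": 4, "locus_3": 5, "locus_4": 7}

within_cluster = allele_distance(cluster_isolate_1, cluster_isolate_2)
between_clusters = allele_distance(cluster_isolate_1, unrelated_isolate)
```

Because only integers keyed by locus are exchanged, two laboratories using the same allele nomenclature could compare isolates directly, which is the standardization point made above.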

4. Update on the phylogeny of the genus Legionella

Phylogenetic studies of the genus Legionella have been scarce and these few studies were based on only one or a few genes (Grattard et al., 2006; Ko et al., 2002; Rubin et al., 2005). Since these studies were published, new Legionella species have been described, and the growing number of complete genome sequences now available allows the use of a higher number of genes for phylogenetic reconstruction. A whole genome based phylogeny is important as it allows a better understanding of the phylogenetic relationships of the different Legionella species. Recently the whole genome sequences of 38 different Legionella species have been published and the genus phylogeny was reconstructed based on a concatenated alignment of 78 quasi-universal bacterial proteins. The results showed that the genus Legionella is divided into 3 clades where L. hackeliae, L. lansingensis and L. micdadei are present in the older one. The major Legionella pathogens (L. pneumophila, L. longbeachae, Legionella bozemanii and L. dumoffii) clustered together and a deep-branching clade consisting of Legionella oakridgensis, Legionella londiniensis and Legionella adelaidensis was observed. Legionella geestiana was considered as an outgroup to the rest of the Legionella species (Burstein et al., 2016).

5. Evolution of the species Legionella pneumophila and the genus Legionella

5.1. Impact of recombination on the evolution of L. pneumophila

Early studies applying multilocus enzyme electrophoresis (MEE) supported a clonal population structure of the species L. pneumophila (Selander et al., 1985). Similarly, different sequence-based typing methods showed no evidence of recombination but suggested that the species L. pneumophila is clonal (Edwards et al., 2008). However, large population genomics approaches not only identified the occurrence of recombination in the species L. pneumophila but also showed that recombination makes a key contribution to the genetic diversity of this species. For example, recent data demonstrated that 34% to 57% of the genome of L. pneumophila has been involved in recombination events, which are an ongoing process (Coscolla et al., 2011). Additionally, two recombination hot spots were identified: the first one is rich in genes related to T4ASS and the second one is rich in genes encoding mostly hypothetical proteins of unknown function (Coscolla et al., 2011). Interestingly, intragenic recombination events were shown to occur also in housekeeping genes (pgk, atpD, ffh, metK) (Gomez-Valero et al., 2011), which is indicative of very high recombination rates (Feil et al., 2001).

Recently the index of association, which is commonly used to measure the rate of intergenic recombination, and the r/m ratio (recombination/mutation) were calculated. This showed that the species L. pneumophila has a recombination value between that of Staphylococcus aureus (clonal) and that of Neisseria meningitidis (highly recombining species) (Underwood et al., 2013). Furthermore, to confirm the occurrence of recombination, a reticulate network tree based on the sequences used for the sequence-based typing (SBT) scheme was constructed. Such a network tree allows a more explicit representation of the evolutionary history than traditional phylogenetic trees. The tree computed from L. pneumophila data obtained from strains sampled both from environmental sources associated with human habitations and from patients with Legionnaires’ disease showed that many recombination events between lineages occurred. Thus Legionella undergoes significant recombination, and both clonal vertical inheritance of genetic material and exchange via recombination play a role in the evolution of L. pneumophila (Underwood et al., 2013). Interestingly, not only intragenic but also intergenic recombination events take place. Whole genome comparisons of single nucleotide polymorphisms (SNPs) of different strains identified in each genome several regions with very low polymorphism (<0.05%), suggesting the exchange of DNA fragments between these strains. Surprisingly, these regions can be very large. For example, genomic regions ranging from 10 to 99 kb were identified between the strains Paris and HL06041035, and a region of 213 kb with a SNP rate lower than 0.005% between strains Philadelphia and Paris. It was suggested that these exchanges occur by conjugation. Thus a high DNA exchange rate seems to be present, in particular among strains present in the same geographical region (Gomez-Valero et al., 2011). Taken together, the recent results obtained through whole genome sequencing and comparison changed the initial view of a clonal population structure of L. pneumophila and showed that, in contrast, high recombination rates and DNA exchange via conjugation are key evolutionary mechanisms.

5.2. Experimental evolution of Legionella pneumophila

The development of next generation sequencing methods in recent years has made it possible to undertake studies that were previously not feasible. Two recent studies in the Legionella field illustrate these new research possibilities well. Ensminger and colleagues undertook a genome comparison of the classically used L. pneumophila strain Philadelphia-1 laboratory strains called JR32 and Lp01 and their derivatives called Lp02 and Lp03 (Rao et al., 2013). This study describes all the mutations associated with these strains and thereby allowed their evolutionary history to be reconstructed from the ancestral Philadelphia-1 strain isolated during the first reported Legionnaires’ disease outbreak. The study confirms that JR32 and Lp02 derived from genetically identical clones, whereas polymorphisms detected in strains Lp02 and Lp03, both thymidine auxotrophs, reveal that Lp03 is not a derivative of Lp02 as thought before. By analysing the whole genome sequence of each of these strains, it was shown that the genetic make-up of each laboratory strain has indeed changed over time due to numerous lab passages and genetic manipulations. For example, each strain contains large deletions in MGE discussed later in this review, in particular deletions of the Lvh-region and the tra regions (Rao et al., 2013). Furthermore, several different single nucleotide polymorphisms (SNPs) were observed in each strain. The authors also suggest that the use of strains JR32 and Lp01 for research should perhaps be reconsidered. These clones were genetically changed to allow easier strain manipulation and to improve transformation efficiencies with exogenous DNA (Berger and Isberg, 1993; Marra and Shuman, 1989) as compared to the original Philadelphia-1 strain. However, Rao and colleagues showed that the electroporation of plasmids is also possible in the original Philadelphia-1 strain, similar to what was shown for the Paris strain isolated from a nosocomial outbreak of legionellosis in a Paris hospital (Lawrence et al., 1999a). This strain has been successfully genetically manipulated several times by using natural competence (Lomma et al., 2010; Rolando et al., 2013; Sahr et al., 2012). These findings led the authors to suggest that the original Philadelphia-1 strain, as it is closer to the original clinical background, or the Paris strain, which has not been engineered, might be a better choice for research on this pathogen (Rao et al., 2013). Finally, re-sequencing of the Philadelphia-1 strain revealed several sequencing errors present in the sequence published in 2004, which should be taken into consideration in the future (Chien et al., 2004; Rao et al., 2013).

Ensminger and colleagues conducted another very interesting study on genome evolution of Legionella using next generation sequencing (Ensminger et al., 2012). To learn which molecular determinants are important for infection of macrophages by L. pneumophila, and thereby to understand how this bacterium adapts to a specific host, L. pneumophila was restricted to growth within mouse macrophages for hundreds of generations. The genomic sequence and the infectivity of the resulting strains were compared to the original ones. This showed that the evolved strains had accumulated several adaptive mutations, as they replicated better in macrophages than the ancestor strain. Indeed several of the mutations correlated with the fitness increase, such as those present in the gene coding for the flagellar regulator FleN or in the genes encoding the lysine biosynthesis pathway. Both provided an advantage in replication in macrophages. Furthermore, identical mutations were present in different lineages that had undergone the same growth restriction, demonstrating parallel evolution due to the selective pressures exerted by the macrophages. These results reveal how the environment may modify a genome and support the hypothesis that cycling through many different eukaryotic hosts allows L. pneumophila to be a broad host-range pathogen. Thus this ecological profile is advantageous for infecting many different hosts, at the cost of not being highly adapted to any of these hosts. Further similar studies will help us to decipher the virulence factors necessary for the infection of human macrophages and the adaptation to different specific hosts, in particular for L. pneumophila as there is high functional redundancy among its effectors.
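The signal of parallel evolution mentioned above, identical mutations arising in independently evolved lineages, can be sketched in Python as a simple recurrence count. The lineage names and mutation labels below are hypothetical placeholders; the source reports mutations affecting FleN and lysine biosynthesis genes but the specific changes are not given here, so none are implied.

```python
from collections import Counter


def recurrent_mutations(lineages, min_lineages=2):
    """Return mutations found in at least `min_lineages` independent lineages.

    `lineages` maps a lineage name to an iterable of hashable mutation labels.
    Each mutation is counted at most once per lineage.
    """
    counts = Counter()
    for mutations in lineages.values():
        counts.update(set(mutations))
    return {mutation: n for mutation, n in counts.items() if n >= min_lineages}


# Hypothetical mutation calls for three independently passaged lineages.
toy_lineages = {
    "lineage_A": [("fleN", "variant_1"), ("lysine_pathway_gene", "variant_1")],
    "lineage_B": [("fleN", "variant_1"), ("other_gene", "variant_2")],
    "lineage_C": [("lysine_pathway_gene", "variant_1"), ("fleN", "variant_1")],
}
shared_mutations = recurrent_mutations(toy_lineages)
```

Recurrence alone does not prove adaptation; in the study summarized above it was interpreted together with the measured fitness increase in macrophages.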

5.3. Evolutionary forces acting on the Dot/Icm type IVB secretion system (T4BSS) in the genus Legionella

Legionella cycle between extra- and intracellular environments, but seem to replicate only within eukaryotic cells such as aquatic protozoa. Essential for the adaptation to these different environments is the ability of Legionella to secrete proteins via its different secretion systems, in particular via the Dot/Icm T4BSS. The Dot/Icm T4BSS is similar to the Tra/Trb system of IncI plasmids and is essential for the establishment of a replication permissive vacuole within host cells (Isberg et al., 2009). The Dot/Icm system has the amazing capacity to translocate over 300 effector proteins, but may also transfer DNA (Segal et al., 1999; Vogel et al., 1998). To date several effector proteins have been functionally analysed and their roles in the subversion of many different host-signalling pathways have been elucidated (Hubber and Roy, 2010; Nora et al., 2009; Rolando and Buchrieser, 2012). Secretion of proteins via the Dot/Icm system is essential for the adaptation of Legionella to the intracellular environments it encounters. The description of the different effectors studied to date is not within the scope of this review; thus we will discuss here the evolutionary forces and the adaptation mechanisms acting on the Dot/Icm system itself.

Early studies using DNA/DNA hybridization analyses of the Dot/Icm locus already reported that the dot/icm genes are conserved in different L. pneumophila strains but also in different Legionella species (Morozova et al., 2004). Whole genome sequencing confirmed this finding and expanded the knowledge of this system as it revealed that the dot/icm loci of strains L. pneumophila Philadelphia, Paris, Lens, Lorraine, Corby and HL 06041035 are not only present but also display a very high nucleotide conservation of 98-100% among orthologs. The only exceptions are dotA and icmX, which are more divergent (84% nucleotide identity), and icmC of strain Corby, which is shorter than icmC of strain Paris and thus shows only 54% nucleotide identity. These results indicate that a strong negative selection acts on the genes constituting the Dot/Icm system (Gomez-Valero et al., 2011). A recent study analysing the genome sequences of Legionella hackeliae, Legionella micdadei and Legionella fallonii showed that the Dot/Icm system is also conserved, but that the genes show a lower nucleotide identity than among strains belonging to the same species. The degree of conservation of the different Dot/Icm proteins is very variable, ranging from >90% for DotB to proteins without any homology such as IcmR (Gomez-Valero et al., 2014). Surprisingly, the dotA gene is again one of the least conserved genes of the Dot/Icm system.
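The nucleotide identity values quoted above depend on how alignment gaps are treated, which matters when one ortholog is shorter than the other, as reported for icmC. The Python sketch below computes percent identity over a pairwise alignment under both conventions; the function name, the `count_gaps` parameter and the toy sequences are assumptions for illustration, and the method used in the cited studies is not specified here.

```python
def percent_identity(aligned_a, aligned_b, count_gaps=False):
    """Percent identity between two aligned sequences of equal length.

    With count_gaps=False, columns containing a gap ('-') are ignored.
    With count_gaps=True, a gap opposite a base counts as a mismatch;
    columns where both sequences have a gap are always ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" and y == "-":
            continue
        if (x == "-" or y == "-") and not count_gaps:
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable alignment columns")
    return 100.0 * matches / compared


# Toy alignment in which the first sequence is truncated (hypothetical data).
identity_ignoring_gaps = percent_identity("ACGT--", "ACGTAC")
identity_counting_gaps = percent_identity("ACGT--", "ACGTAC", count_gaps=True)
```

In this toy case the identity is 100% when gap columns are ignored and about 67% when they are counted, which shows how a length difference alone can lower a reported identity.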

The dotA gene encodes an integral cytoplasmic membrane protein with eight hydrophobic transmembrane domains that is essential for L. pneumophila virulence, as bacteria lacking the dotA gene are defective in all virulence activities that require the Dot/Icm secretion system (Roy et al., 1998). Although the DotA protein is essential for successful infection of eukaryotic cells, the dotA gene shows higher variation than most of the other dot/icm genes among L. pneumophila strains and also among different Legionella species. The variation found in the dotA gene may be a sign that this gene is a target for host specialization and adaptive evolution, and its sequence variation may reflect the adaptation to different hosts and environments. Analyses of over 300 dotA gene sequences from L. pneumophila strains isolated from natural, man-made and clinical environments showed that clinical and man-made environmental strains belong to a sub-set of the genotypes existing in nature. Recombination and strong diversifying selection of the dotA alleles were observed (Costa et al., 2010).

Interestingly one gene of the Dot/Icm system, icmR, is not conserved at the sequence level, but is replaced in all species analysed to date by a nonhomologous gene with functional equivalence. This finding was first reported for the species L. micdadei and L. longbeachae (Feldman and Segal, 2004) and later confirmed for 27 additional Legionella species (Feldman et al., 2005). Similarly, a gene encoding a protein with no similarity to any previously described protein is present in the position of icmR in L. fallonii, suggesting that it serves as a functional equivalent of icmR of L. pneumophila, as was shown for other species (Gomez-Valero et al., 2014). IcmR is a regulator of IcmQ (another component of the Dot/Icm system) that possesses pore forming activity (Dumenil et al., 2004). In contrast, the most highly conserved genes among all different strains investigated are the dotB, icmS, icmW and icmP genes (Gomez-Valero et al., 2014). This observation is in line with the finding that the L. pneumophila dotB, icmS, icmW and icmP genes can functionally replace their homologues in Coxiella burnetii (Zusman et al., 2003).

Taken together, recombination and frequent non-synonymous mutations are the evolutionary mechanisms that led to the diversification of the dotA gene. Positive selection seems to act on the icmR gene homologues (Feldman et al., 2005) whereas purifying selection is playing a role in the evolution of the highly conserved dotB, icmS, icmW and icmP genes. Furthermore, Burstein and colleagues observed that even though the gene order and orientation of the Dot/Icm systems of different Legionella species are highly conserved, gene insertions, most likely mediated by recombination events, may occur (Burstein et al., 2016). These mechanisms together may allow the adaptation of the different Legionella strains and species to specific environmental niches and diverse protozoan hosts and play a role in the evolutionary arms race between the protozoan host cell and the pathogen.

6. Adaptation of Legionella to the environment and its hosts

6.1. Role of integrative conjugative elements in the evolution of Legionella

Bacteriophages, plasmids, and mobile genetic elements (MGE) like integrative conjugative elements (ICE), as well as natural competence, are key players in genome plasticity and evolution and may mediate adaptation to specific environments and conditions (Bellanger et al., 2014; Thomas and Nielsen, 2005). Genome sequencing and comparison revealed that the L. pneumophila genomes show high plasticity and have a very dynamic accessory genome due to horizontal gene transfer and the presence of many MGE (Cazalet et al., 2004; Gomez-Valero and Buchrieser, 2013). Particularly interesting is the finding that many of the L. pneumophila MGEs are ICEs that encode different type IVA secretion systems (T4ASS). The first one described was the so-called L. pneumophila vir homologues region (Lvh-region). It contains 11 genes encoding a T4ASS and was found to be located on a DNA island with a GC content higher than the average of the L. pneumophila chromosome (Segal et al., 1999). Then, the L. pneumophila pathogenicity island 1 (LpPI-1), a 65 kb region encoding a T4ASS homologous to the Tra proteins of the F-plasmid of E. coli, was described (Brassinga et al., 2003). It was thought to be specific to strain Philadelphia, but genome comparison by hybridization showed that this LpPI-1 is present in several L. pneumophila strains (Cazalet et al., 2008). Recent whole genome analyses of different L. pneumophila strains revealed that ICE carrying T4ASS are a common feature of the Legionella genomes (Gomez-Valero and Buchrieser, 2013; Gomez-Valero et al., 2011). Different but related T4ASS are present on MGE in the genomes and three distinct T4ASSs with different prevalence among strains have been described: Lvh, Trb/Tra-like T4SS and GI-T4SS (Glöckner et al., 2008; Gomez-Valero et al., 2011; Samrakandi et al., 2002; Segal et al., 1999). The genetic organization and components of these T4ASSs have high similarity to the classical VirB4/D4 conjugative system of Agrobacterium tumefaciens (Alvarez-Martinez and Christie, 2009). Here we will focus on the analysis and description of these three different classes of T4ASS as they seem to play an important role in the evolution of the Legionella genomes and probably also in the adaptation capacity of Legionella to the different environments this bacterium encounters.

6.1.1. Lvh (Legionella Vir homologue) region

The Lvh T4ASS is located on a plasmid-like element of 36 kb in strain Paris, 45 kb in strain Philadelphia and strain Lens. This plasmid-like element can exist integrated in the genome or as a multi-copy plasmid (Cazalet et al., 2004). Analysis of the mobility of this region showed that it is a phage-like element whose excision and integration are growth phase dependent (Doleans-Jordheim et al., 2006). In L. pneumophila Lens and Paris the Lvh-region integrates in the tmRNA (Cazalet et al., 2004) and in L. pneumophila strain Philadelphia-1 it is inserted in a tRNA-Arg (Chien et al., 2004). In all but the L. pneumophila strain 130b, the Lvh-region exists as a single copy when integrated in the chromosome. In L. pneumophila strain 130b two distinct Lvh-regions are integrated in the genome in close vicinity. Lvh1 and Lvh2 are integrated in the 130b genome on either side of a CRISPR locus and share 92% similarity to each other. Lvh2 is 99% similar to the Lvh of strain Paris, suggesting that it was acquired by horizontal gene transfer from this strain (Schroeder et al., 2010). The mobility of the Lvh-region was also deduced from whole genome analyses of the different derivatives of the original Philadelphia-1 ancestor strain used as laboratory strains (JR32, Lp01, Lp02, Lp03). Indeed the original Philadelphia-1 strain carried the Lvh-region, but its derivatives Lp01, Lp02 and Lp03 have lost the entire Lvh-region (Rao et al., 2013). In contrast, in the JR32 strain, another derivative of the Philadelphia-1 strain, the Lvh-region is still present. However, in the JR32 lineage, a locus belonging to the Trb/Tra type T4ASS is deleted in comparison to the Philadelphia-1 progenitor (Rao et al., 2013). Interestingly, the Lvh-region shows not only intra- but also interspecies mobility since it is present as a chromosomally integrated putative plasmid-like element in L. longbeachae strain D-4968 (Kozak et al., 2010b).

The functional role of the Lvh T4ASS in L. pneumophila is not clear. However, it was shown that a mutant in the lvhB2 gene, which is part of the T4ASS, does not display a significant defect in interactions with host cells when the bacteria are grown at 37°C, but it has an approximately 100-fold effect on entry and intracellular replication when the bacteria are grown at 30°C, suggesting a role of this element at lower temperatures (Ridenour et al., 2003). Similarly, Bandyopadhyay and colleagues showed that the Lvh-region impacts infection and inclusion in Acanthamoeba castellanii cysts at lower temperatures. Furthermore, they showed that in growth conditions that mimic the aquatic phase of L. pneumophila such as water stress (WS) the Lvh T4ASS can functionally complement the defect in entry, delay of phagosome acidification, and intracellular multiplication phenotypes of a Dot/Icm mutant (Bandyopadhyay et al., 2007). They thus proposed that the Lvh-region is involved in virulence-related phenotypes under conditions mimicking the spread of Legionnaires' disease from environmental niches. In a recent study Bandyopadhyay and colleagues showed that the VirD4 protein encoded in the Lvh-region is alone responsible for the restoration of the wild-type phenotype of a Dot/Icm mutant (Bandyopadhyay et al., 2013).

Interestingly, the presence of a small noncoding RNA (sRNA, lpr0035) overlapping the attL site of the Lvh-region of strain Philadelphia (pLp45) was reported. A deletion in the 5’ chromosomal junction of pLp45 that removes the 49-bp direct repeat as well as the sRNA lpr0035 locked this element in the chromosome and abolished its mobility. Amoebae and macrophage infection assays with stationary phase or WS-treated cultures of this mutant strain established a role of lpr0035 in the Legionella entry phenotype and in intracellular multiplication in J774 macrophages and A. castellanii. However, the authors conclude that the mobility of pLp45 was not required for the entry and intracellular replication phenotype since only the virulence phenotype but not the mobility of pLp45 was complemented when expressing lpr0035 in the deletion mutant (Jayakumar et al., 2012). Together, the different analyses suggest that the Lvh-region is not important for in vitro growth of L. pneumophila, but that it seems to offer adaptive advantages in certain stressful environmental niches, such as lower temperatures and water stress. Further analyses will be necessary to fully understand the function of this T4ASS that is circulating among Legionella strains as it is encoded on a MGE.

6.1.2. Trb/Tra family of conjugative elements

The Lvh T4ASS is not present in L. pneumophila strain Corby, but a similar T4ASS is integrated in this site (tmRNA) and a second genomic island carrying a T4ASS is integrated in the tRNAPro gene, a site not occupied by a MGE in the strains Paris, Lens or Philadelphia. These two MGEs were named Trb-1 and Trb-2 (Glöckner et al., 2008). Like the Lvh-region, the Trb-1 and Trb-2 regions (42,710 bp and 34,434 bp, respectively) are able to excise from the chromosome forming episomal circles (Glöckner et al., 2008). Trb-1 encodes the proteins needed to assemble a functional conjugation/T4ASS and can be transferred horizontally to other L. pneumophila strains by conjugation, where it integrates in a site-specific manner in the tRNAPro gene. In contrast to the Lvh-region, excision of the Trb-1 island is not growth phase dependent and can take place during intracellular replication in A. castellanii. However, Trb-1 is not necessary for intracellular replication of L. pneumophila (Glöckner et al., 2008; Lautner et al., 2013). Recently, two similar genomic islands, Trb-3 in L. pneumophila strain Lorraine and Trb-4 in L. longbeachae NSW150, have been identified but it is not known yet whether these elements are also mobile (Gomez-Valero et al., 2011). Comparative genomics revealed that the Lvh-region and all Trb-1 like MGEs of L. pneumophila are associated with a lvrRABC gene cluster (Gomez-Valero et al., 2011). The lvrR gene is predicted to code for a transcriptional regulator and LvrC is a homologue of CsrA, an RNA binding protein that is crucial for the regulation of the L. pneumophila life cycle (Molofsky and Swanson, 2003). Deletion of the lvrRABC gene cluster from the Trb-1 region of strain Corby showed that these proteins are involved in the regulation of the circularisation of the Trb-1 ICE; however, the exact mechanism is not known (Lautner et al., 2013).

The so-called LpPI-1 described in strain Philadelphia-1 also belongs to the Trb/Tra family of conjugative/T4SS (Brassinga et al., 2003). This 65 kb region is inserted in the tRNALys,Arg and is predicted to encode a T4ASS and a set of cargo genes coding for transport proteins, acetyltransferases, reductases, transposases, integrases, phage related genes and the virulence related magA-msrA locus. MagA is expressed intracellularly during transition from exponential phase to post-stationary phase. MsrA catalyses the reduction of methionine sulfoxide residues to methionine and is an established virulence determinant in E. coli, Neisseria gonorrhoeae and Mycoplasma genitalium (Dhandayuthapani et al., 2001; Moskovitz et al., 1995; Skaar et al., 2002). This ICE was recently renamed ICE-βox as studies examining its function and mobility showed that it confers resistance to β-lactam antibiotics, hydrogen peroxide and bleach in vitro. It also protects L. pneumophila from the immune response of macrophages that produce reactive oxygen species, a protection directly correlated with the NADPH component of macrophages since L. pneumophila carrying the ICE-βox replicated much better during the first 24 h than bacteria lacking this element (Flynn and Swanson, 2014). Thus, this ICE may contribute to increased fitness of L. pneumophila in natural environments and confers oxidative stress resistance to this pathogen.

6.1.3. Genomic island-associated T4SS family

An additional class of T4SSs, named the genomic island T4SS (GI-like) (Juhas et al., 2007), is present within the L. pneumophila genomes. Two such GI-like islands, named Legionella genomic island 1 (LGI-1) and LGI-2, were identified in strain 130b (Schroeder et al., 2010). Strain Corby carries four such genomic island-associated T4SSs, named LpcGI-1, LpcGI-2, LpcGI-Asn and LpcGI-Phe. LpcGI-1 and LpcGI-2 are also present in strain Paris (Gomez-Valero et al., 2011). Both are mobile and exist in two or three different episomal forms, respectively. As for Trb-1, the excision and transfer of one form, named LpcGI-2-A, involves the integrase encoded on this GI. The second form, named LpcGI-2AB, seems to be transferred by a different conjugation system, since its excision is integrase independent. However, for an unknown reason, deletion of the two integrases encoded on this island increases the rate of the episomal form. In contrast, LpcGI-Asn and LpcGI-Phe do not encode a T4ASS. Little is known about their mobility, although they are also integrated in tRNA genes, and nothing is known about the functional role they may play (Lautner et al., 2013). The completely sequenced L. pneumophila genomes analysed here (Paris, Lens, Corby, Alcoy, Lorraine, HL 06041035, 130b) and one L. longbeachae genome (NSW150) encode between one and four GI-T4SSs, and at least one of these elements is inserted in a tRNAThr site.

Although GI-T4SSs are widely conserved, some of those present in L. longbeachae or L. drancourtii lack part of the LvrABC regulatory locus, which in contrast is highly conserved in the Lvh-region and the Trb/Tra-type ICEs. GI-T4SSs are evolutionarily distant from T4ASSs and T4BSSs and form a separate, divergent monophyletic clade. They lack the essential mobilization genes (mob), which process the DNA before and after conjugation, and the oriT. In the cargo regions of these LGI-T4SSs, a number of antibiotic resistance and virulence genes are present, in addition to variable numbers of RND (resistance/nodulation/division) efflux systems putatively involved in pathogenicity (Wee et al., 2013). In strain Philadelphia, the region carrying the RND modules is called pLP100 because of its size (100 kb) and its capacity to excise from the chromosome to form a circular 100 kb plasmid-like element (Trigui et al., 2013). It encodes a putative conjugation system and is integrated in a tRNAThr gene. It excises from the chromosome in a growth phase-dependent manner but, in contrast to the Lvh-region, excision is more frequent during exponential than during post-exponential growth. Interestingly, excision in post-exponential phase is Hfq dependent. pLP100 cargo genes such as copA confer higher copper resistance to this strain (Trigui et al., 2013). In contrast, pLP100 does not influence Legionella survival during infection of amoebae or macrophages (Kim et al., 2009).

Taken together, all Legionella strains sequenced to date encode one or several T4ASSs present on MGEs that are often ICEs. Their implication in Legionella virulence and adaptation seems to be host and strain specific and varies from one type to another. They may confer advantages to Legionella in specific environmental growth conditions, as does the Lvh-region for growth at low temperature and under water stress conditions, or the ICE-βox, which confers resistance to β-lactam antibiotics, hydrogen peroxide and bleach and protects L. pneumophila from the immune response of macrophages that produce reactive oxygen species. However, for most of these T4ASSs, no strong virulence phenotype has been observed, and for many the function is not yet known. The presence of so many and such varied types of MGEs encoding T4ASSs in the Legionella genomes is intriguing (tables 2 and 3), and it is plausible that each plays a different role in the adaptation of Legionella to the environment and to its different hosts; however, the exact conditions under which each element has an impact on L. pneumophila fitness have yet to be identified.

6.2. Role of gene duplications in the adaptation of Legionella

In addition to horizontal gene transfer (HGT), gene duplication events also shape bacterial genomes and contribute to evolution (Andersson et al., 2015). Gene duplication and subsequent divergence of the extra copy to acquire a new function is one of the main sources of functional diversity and may be associated with adaptation to specific environments, development of novel life strategies, or other species-specific adaptations (Andersson et al., 2015; Conant and Wolfe, 2008; Innan and Kondrashov, 2010). Although Legionella has a moderate genome redundancy relative to other bacterial species, genome sequencing revealed that certain protein families possess numerous paralogous genes (Cazalet et al., 2004; Chien et al., 2004). For instance, expansion of the Dot/Icm effector repertoire, probably through acquisition of foreign genetic material and gene duplications, results in paralogous effector families, as in the case of the Sid (substrate of Icm/Dot) effectors (Luo and Isberg, 2004). As an example, the SidE family comprises four paralogous genes, sdeA, sdeB, sdeC and sidE, encoding potentially redundant functions (Bardill et al., 2005; Jeong et al., 2015). The genetic and functional redundancies in the effector arsenal probably promote intracellular replication of Legionella in a diverse range of protozoan hosts in the environment, and it is possible that each paralogue is specific for a particular host cell (Chien et al., 2004; Luo and Isberg, 2004; O'Connor et al., 2011).

In addition to duplication of structural genes, a new picture has recently emerged in which Legionella may favor duplication, divergence, and maintenance of several paralogous global regulatory proteins during evolution. Global regulators allow coordination of cellular functions in response to physiological, metabolic and/or environmental fluctuations and thus are crucial for successful bacterial infection, persistence and dissemination. Duplication of such regulators may therefore lead to more versatile and better adapted bacteria than duplication of structural genes (Wang et al., 2011). The nucleoid-associated protein Fis (factor for inversion stimulation) is a central component of such global regulatory networks. Comparative genomic analysis revealed that three Fis paralogs (Fis1, Fis2 and Fis3) are present in all currently accessible Legionella genome sequences, whereas Gammaproteobacteria usually contain only a single Fis protein (Cazalet et al., 2004; Chien et al., 2004; Zusman et al., 2014). Fis is considered a global regulator with a broad set of regulatory functions such as genome organization, phage transposition, initiation of replication and gene transcription (Dillon and Dorman, 2010). In particular, Fis regulates the expression of virulence-related genes in many species such as enteric bacteria (e.g. Escherichia coli, Salmonella enterica, Shigella, Vibrio cholerae and Dikeya dadantii) (Falconi et al., 2001; Goldberg et al., 2001; Kelly et al., 2004; Lautier and Nasser, 2007; Lenz and Bassler, 2007). It was shown that Fis preferentially binds a 17-bp, AT-rich consensus sequence at promoter regions of target genes and can work in concert with other regulators to modulate transcription initiation (Browning et al., 2010). The reconstruction of the evolutionary history of the three Fis proteins of Legionella suggests that two duplication events occurred before the divergence of the genus Legionella (Zusman et al., 2014). The three Fis protein sequences share a little less than 50% amino acid identity, suggesting rapid evolution after gene duplication. However, the helix-turn-helix DNA-binding domain is conserved in the three protein sequences, as are the amino acids involved in the recognition of the Fis regulatory motif in the promoter region of target genes (Zusman et al., 2014). Interestingly, Fis1 and Fis3, but not Fis2, are required for optimal intracellular growth in both amoeba and macrophages, and Fis1 and Fis3 directly repress the level of expression of 18 Dot/Icm effector-encoding genes. Fis1 and Fis3 appear to have both common and distinct target genes, since some effector genes are mainly repressed by either Fis1 or Fis3 while other sets of genes are repressed by both regulators in a similar manner. In contrast, Fis2 does not seem to modulate the expression of the effector genes tested, at least under the experimental conditions of the study. These results suggest that, after gene duplication, the three copies have diverged and have acquired, at least in part, distinct regulatory functions. They also suggest that maintaining three distinct Fis regulatory proteins may allow fine-tuning of the expression of the large repertoire of Dot/Icm effectors of Legionella. In addition, the three Fis proteins are probably part, together with other transcriptional regulators (CpxR, PmrA), of sophisticated regulatory networks that allow proper coordination of virulence-related gene expression and/or a response to the multiple stimuli that the bacterium encounters inside hosts or in the environment.

Fis is not the only regulator present in several copies in the Legionella genomes. In their study, Zusman and collaborators also reported that the two-component system CpxRA, which regulates the expression of several effector-encoding genes as well as dot/icm genes (Altman and Segal, 2008; Gal-Mor and Segal, 2003), has two other paralogs in Legionella longbeachae and L. dumoffii. However, the roles of these multiple CpxRA homologs in some Legionella species remain to be determined.

Another remarkable feature of the Legionella genomes is the presence of four to seven csrA paralogs, depending on the strain (Abbott et al., 2015b). For instance, L. pneumophila strains Philadelphia and Paris contain five and six CsrA-like genes, respectively, and up to seven CsrA copies are identified in the virulent environmental isolate L. pneumophila LPE509 (Abbott et al., 2015b) or in L. drancourtii (Zusman et al., 2014). In contrast, L. longbeachae strains NSW150 and D-4968 contain four CsrA copies. CsrA is a pivotal regulator of metabolism and virulence that controls the switch from replicative to transmissive phase in L. pneumophila. It was shown that CsrA is absolutely required for efficient bacterial replication in broth culture or inside eukaryotic cells and that it represses phenotypic traits related to the virulent phase of the L. pneumophila life cycle such as pigmentation, motility, cell shape shortening, stress resistance, sodium sensitivity and cytotoxicity (Fettes et al., 2001; Forsbach-Birk et al., 2004; Molofsky and Swanson, 2003). The CsrA protein acts as a global post-transcriptional regulator by directly binding messenger RNA at an ANGGA motif and thus interfering with translation initiation and/or mRNA stability. In L. pneumophila, it was shown that CsrA directly represses the expression of numerous Dot/Icm effector genes (Nevo et al., 2014; Rasis and Segal, 2009).
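
As an illustration of what recognition of an ANGGA motif means at the sequence level, the following minimal Python sketch scans a transcript for candidate sites. It is an assumption-laden example rather than a method used in the cited studies: the helper name `find_csra_motifs` and the example sequence are hypothetical, and a sequence match alone does not demonstrate CsrA binding, which in practice is likely to depend on RNA structural context as well.

```python
import re

# Hypothetical helper (not from the cited studies): report candidate ANGGA
# sites, where N is any nucleotide. The pattern sits inside a zero-width
# lookahead so that overlapping occurrences are also reported.
ANGGA = re.compile(r"(?=(A[ACGTU]GGA))")

def find_csra_motifs(seq):
    """Return (0-based start, matched site) pairs for every ANGGA occurrence."""
    seq = seq.upper()
    return [(m.start(), m.group(1)) for m in ANGGA.finditer(seq)]

# Made-up leader sequence, used for illustration only.
example = "GCAAGGAUCACGGAUGAUGGAGG"
for pos, site in find_csra_motifs(example):
    print(pos, site)
```

Such a scan only yields a list of candidate positions; whether a given site is actually bound would still have to be tested experimentally, as was done in vitro for the transcripts discussed in this section.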

Recently, Abbott et al. (2015b) performed a functional study on one of the csrA paralogues (named CsrR) that, like the canonical CsrA, is conserved among L. pneumophila strains. CsrA and CsrR share only 28% amino acid identity but, despite significant genetic drift, CsrR retained the amino acid residues necessary for its RNA-binding activity. However, CsrR appears unlikely to regulate either the transmissive traits commonly studied or intracellular growth inside eukaryotic cells. In contrast, CsrR seems to be required to enhance survival of L. pneumophila in water, since a csrR deletion mutant shows reduced viability after prolonged incubation in hot tap water compared to the wild-type strain. In addition, CsrA specifically binds to an ANGGA motif of the csrR mRNA in vitro and represses its translation, creating a global hierarchical regulatory cascade that adds another level of complexity to post-transcriptional regulation in L. pneumophila (Abbott et al., 2015b). The authors speculate that csrA gene duplication and genetic drift resulted in a new regulatory role for CsrR, which retained the mRNA-binding function of canonical CsrA but acquired a new cohort of targets that increase the capacity of L. pneumophila to withstand a broad range of environmental conditions. While CsrA promotes intracellular replication inside host cells, CsrR would play a role in environmental persistence by enhancing survival of the bacterium under nutrient-poor conditions. Both the canonical csrA and csrR genes appear to be stably integrated into the chromosome, in contrast to the other CsrA paralogs, which are associated with T4ASSs within integrative and conjugative elements (ICEs). It seems that the number of copies of CsrA depends on the number of genomic island-encoded T4SSs present in the genome of a specific strain (Gomez-Valero et al., 2011). In L. pneumophila strain Paris, apart from the canonical csrA and csrR genes that belong to the core genome, the four other csrA-like gene copies are located on the accessory genome: i) one copy, named LvrC, flanks the Lvh region (Doléans-Jordheim et al., 2006); ii) two other CsrA/LvrC-like copies are located at the 5’ end of two distinct genomic island-associated T4ASSs (Wee et al., 2013); and iii) the fourth CsrA homolog is carried on the 136 kb plasmid upstream of the Tra-type T4ASS (Gomez-Valero et al., 2011) (see table 3). As previously suggested in this review, ICE-related LvrC/csrA homologues might be implicated in the regulation of the mobility of these islands (Lautner et al., 2013). Very recently, Swanson and collaborators (Abbott et al., 2015a) demonstrated that the CsrA paralog encoded by the ICE-βox element, named CsrT (for CsrA paralog for ICE transfer), decreases conjugative transfer, replication in macrophages and the oxidative stress resistance conferred by its cognate mobile element. CsrT seems to play an even broader regulatory role, since its ectopic expression also reduces motility not only in L. pneumophila but also in Bacillus subtilis, probably by binding hag mRNA at an ANGGA motif as the B. subtilis CsrA does (Abbott et al., 2015a). It is also tempting to speculate that these extra CsrA-like copies might play a role in regulating the expression of their cognate T4ASS-encoding genes that mediate horizontal transfer of these ICEs. To conclude, L. pneumophila ICE CsrA-like regulators seem to play a crucial role in the maintenance and spread of these mobile elements; however, the specific physiological stimuli, metabolic shifts or changing environmental conditions that allow their excision and transfer are still unknown. A summary of the number of copies of each of the genes mentioned above is shown in table 3. The reason why Legionella duplicated and maintained several copies of such regulatory proteins (transcriptional or post-transcriptional regulators) is not known, but it presumably reflects a regulatory plasticity that may be important for the adaptation to different hosts and lifestyles. It would be of interest to determine the role of each paralog, whether they have related or redundant functions in L. pneumophila, and whether they are differentially expressed in response to signals at specific stages of infection or to environmental cues.

6.3. Specific features of Legionella influencing environmental adaptation

One of the major findings in microbiology of the last 10 years was the discovery of the CRISPR-Cas system and the identification of its immune role against invasion by phages or plasmids (Horvath and Barrangou, 2010). It became a breakthrough discovery when it was shown that it could be used as a programmable genome-editing tool (Jinek et al., 2012). Recently, it was shown that this locus might also play a role in infection. The genome of L. pneumophila strain 130b contains a type II-B CRISPR-Cas locus comprising cas9, cas1, cas2, cas4, and an array with 60 repeats and 58 unique spacers. The cas1 and cas2 genes of the CRISPR-Cas locus are more strongly expressed during intracellular growth in macrophages than during in vitro growth in liquid media. Further analyses showed that cas9, cas1, cas4, and CRISPR array mutants grew normally in macrophages and amoeba; in contrast, the cas2 gene was required for intracellular growth in amoeba but dispensable for growth in macrophages (Gunderson and Cianciotto, 2013). Cas2 of L. pneumophila is thus one of the rare proteins identified to date that show different functions in human macrophages and in protozoan hosts. Further analyses of the function of Cas2 showed that it possesses RNase and DNase activity and that this nuclease activity promotes infection of amoeba. Interestingly, the contribution of this activity is not conserved across all amoeba hosts but differs between amoeba species (Gunderson et al., 2015). Thus the CRISPR-Cas locus has an additional role, namely the adaptation of L. pneumophila to specific protozoan hosts to better persist in the environment. As the introduction of cas2 into strain Philadelphia-1, an L. pneumophila strain not carrying the CRISPR-Cas locus, caused that strain to be more infectious, there might be a selective advantage in acquiring cas2, perhaps via MGEs. Indeed, in strain Paris this locus is located on the Lvh-region described earlier, which is an MGE.

7. Concluding remarks

In recent years, large-scale genome sequencing of many Legionella species and strains has greatly improved our knowledge of the evolution, pathogenicity and genetic diversity of the genus Legionella. Furthermore, an increasing number of studies have demonstrated that whole genome sequencing allows robust genotyping, and it will without doubt become the method of choice for epidemiological investigations in the future. Comparative genomics approaches are equally valuable in shedding light on the high plasticity of the Legionella genomes that has led to the present genetic and pathogenic heterogeneity within this species. Significant differences are observed between strains, showing that numerous genomic rearrangements continue to occur after speciation. Some sites of syntenic differences are marked by the presence of mobile genetic elements such as ICEs or genomic islands that are usually associated with T4ASSs. In contrast to the Dot/Icm T4BSS, which is the key to Legionella pathogenesis, the roles of these various T4ASSs in virulence or in the adaptation to changing environmental conditions remain to be elucidated. However, the data available to date on the known T4ASSs show that they are less decisive for L. pneumophila adaptation and persistence than the Dot/Icm T4BSS. Another striking feature of the Legionella genomes revealed by comparative genomics is the presence of a large cohort of Dot/Icm effectors, many of which harbour numerous and various protein domains typically present in eukaryotic organisms. A large proportion of these genes is species specific and is correlated with the evolutionary history of this bacterium, a pathogen that co-evolved with a variety of protozoan hosts. As the Dot/Icm T4BSS is well conserved among species, even within non-pathogenic strains, it is likely that the increased virulence of some strains towards humans is related to this pool of effectors and/or to the sophisticated regulatory circuits that control their expression, and not to the Dot/Icm secretion system itself. Thus these many different putative effector proteins represent a promising research topic for future studies on the emergence and evolution of pathogenicity.

Acknowledgements

The authors declare that they have no competing interests.

Work in the SJ and PD laboratory is performed within the framework of the LABEX ECOFECT (ANR-11-LABX-0042) of Université de Lyon, within the program “Investissements d’Avenir” (ANR-11-IDEX-0007) operated by the French National Research Agency (ANR), and is supported by the Institut national de la santé et de la recherche médicale (INSERM), the Fondation pour la Recherche Médicale (FRM) grant n° DBI20131228568, the Hospices Civils de Lyon and the General Direction for Health (DGS).

Work in the CB laboratory is financed by the Institut Pasteur, the Centre National de la Recherche Scientifique (CNRS), the Institut Carnot-Pasteur MI, the French Region Ile de France (DIM Malinf), the grant n° ANR-10-LABX-62-IBEID, the Infect-ERA project EUGENPATH (ANR-13-IFEC-0003-02) and the Fondation pour la Recherche Médicale (FRM) grant n° DEQ20120323697.

References

Abbott, Z.D., Flynn, K.J., Byrne, B.G., Mukherjee, S., Kearns, D.B., Swanson, M.S., 2015a. csrT represents a new class of csrA-like regulatory genes associated with Integrative Conjugative Elements of Legionella pneumophila. J Bacteriol. Abbott, Z.D., Yakhnin, H., Babitzke, P., Swanson, M.S., 2015b. csrR, a Paralog and Direct Target of CsrA, Promotes Legionella pneumophila Resilience in Water. MBio 6, e00595. Adeleke, A., Pruckler, J., Benson, R., Rowbotham, T., Halablab, M., Fields, B., 1996. Legionella-like amebal pathogens--phylogenetic status and possible role in respiratory disease. Emerg Infect Dis 2, 225-230. Adeleke, A.A., Fields, B.S., Benson, R.F., Daneshvar, M.I., Pruckler, J.M., Ratcliff, R.M., Harrison, T.G., Weyant, R.S., Birtles, R.J., Raoult, D., Halablab, M.A., 2001. Legionella drozanskii sp. nov., Legionella rowbothamii sp. nov. and Legionella fallonii sp. nov.: three unusual new Legionella species. Int J Syst Evol Microbiol 51, 1151-1160. Akermi, M., Doleans, A., Forey, F., Reyrolle, M., Meugnier, H., Freney, J., Vandenesch, F., Etienne, J., Jarraud, S., 2006. Characterization of the Legionella anisa population structure by pulsed-field gel electrophoresis. FEMS Microbiol Lett 258, 204-207. Allombert, J., Fuche, F., Michard, C., Doublet, P., 2013. Molecular mimicry and original biochemical strategies for the biogenesis of a Legionella pneumophila replicative niche in phagocytic cells. Microbes Infect 15, 981-988. Altman, E., Segal, G., 2008. The response regulator CpxR directly regulates expression of several Legionella pneumophila icm/dot components as well as new translocated substrates. J Bacteriol 190, 1985-1996. Alvarez-Martinez, C.E., Christie, P.J., 2009. Biological diversity of prokaryotic type IV secretion systems. Microbiology and molecular biology reviews : MMBR 73, 775-808. Amemura-Maekawa, J., Kura, F., Chang, B., Watanabe, H., 2005. 
Legionella pneumophila serogroup 1 isolates from cooling towers in Japan form a distinct genetic cluster. Microbiol Immunol 49, 1027-1033. Amodeo, M.R., Murdoch, D.R., Pithie, A.D., 2010. Legionnaires' disease caused by Legionella longbeachae and Legionella pneumophila: comparison of clinical features, host-related risk factors, and outcomes. Clinical microbiology and infection : the official publication of the European Society of Clinical Microbiology and Infectious Diseases 16, 1405-1407. Andersson, D.I., Jerlstrom-Hultqvist, J., Nasvall, J., 2015. Evolution of new functions de novo and from preexisting genes. Cold Spring Harb Perspect Biol 7. Aurell, H., Farge, P., Meugnier, H., Gouy, M., Forey, F., Lina, G., Vandenesch, F., Etienne, J., Jarraud, S., 2005. Clinical and environmental isolates of Legionella pneumophila serogroup 1 cannot be distinguished by sequence analysis of two surface protein genes and three housekeeping genes. Appl Environ Microbiol 71, 282-289. Bandyopadhyay, P., Lang, E.A., Rasaputra, K.S., Steinman, H.M., 2013. Implication of the VirD4 coupling protein of the Lvh type 4 secretion system in virulence phenotypes of Legionella pneumophila. J Bacteriol 195, 3468-3475. Bandyopadhyay, P., Liu, S., Gabbai, C.B., Venitelli, Z., Steinman, H.M., 2007. Environmental Mimics and the Lvh Type IVA Secretion System Contribute to Virulencerelated Phenotypes of Legionella pneumophila. Infect Immun 75, 723-735. Bardill, J.P., Miller, J.L., Vogel, J.P., 2005. IcmS-dependent translocation of SdeA into macrophages by the Legionella pneumophila type IV secretion system. Mol Microbiol 56, 90-103. Bartlett, J.G., 2008. Is activity against "atypical" pathogens necessary in the treatment protocols for community-acquired pneumonia? Issues with combination therapy. Clin Infect Dis 47 Suppl 3, S232-236. Bartlett, J.G., 2011. Diagnostic tests for agents of community-acquired pneumonia. Clin Infect Dis 52 Suppl 4, S296-304. 
Bartley, P.B., Ben Zakour, N.L., Stanton-Cook, M., Muguli, R., Prado, L., Garnys, V., Taylor, K., Barnett, T.C., Pinna, G., Robson, J., Paterson, D.L., Walker, M.J., Schembri, M.A.,

Beatson, S.A., 2016. Hospital-wide Eradication of a Nosocomial Legionella pneumophila Serogroup 1 Outbreak. Clin Infect Dis 62, 273-279. Beaute, J., Zucs, P., de Jong, B., 2013. Legionnaires disease in Europe, 2009-2010. Euro Surveill 18, 20417. Bellanger, X., Payot, S., Leblond-Bourget, N., Guedon, G., 2014. Conjugative and mobilizable genomic islands in bacteria: evolution and diversity. FEMS microbiology reviews 38, 720-760. Berger, K.H., Isberg, R.R., 1993. Two distinct defects in intracellular growth complemented by a single genetic locus in Legionella pneumophila. Molecular microbiology 7, 7-19. Birtles, R.J., Rowbotham, T.J., Raoult, D., Harrison, T.G., 1996. Phylogenetic diversity of intra-amoebal Legionellae as revealed by 16S rRNA gene sequence comparison. Microbiology 142 ( Pt 12), 3525-3530. Blaser, M., 1977. Hot-bath syndrome, Pontiac fever, and Legionnaires' disease. Lancet (London, England) 2, 1226. Bosch, T., Euser, S.M., Landman, F., Bruin, J.P., EP, I.J., den Boer, J.W., Schouls, L.M., 2015. Whole-Genome Mapping as a Novel High-Resolution Typing Tool for Legionella pneumophila. Journal of clinical microbiology 53, 3234-3238. Brassinga, A.K., Hiltz, M.F., Sisson, G.R., Morash, M.G., Hill, N., Garduno, E., Edelstein, P.H., Garduno, R.A., Hoffman, P.S., 2003. A 65-Kilobase Pathogenicity Island Is Unique to Philadelphia-1 Strains of Legionella pneumophila. J Bacteriol 185, 4630-3637. Brassinga, A.K., Sifri, C.D., 2013. The Caenorhabditis elegans model of Legionella infection. Methods Mol Biol 954, 439-461. Brenner, D.J., Steigerwalt, A.G., Epple, P., Bibb, W.F., McKinney, R.M., Starnes, R.W., Colville, J.M., Selander, R.K., Edelstein, P.H., Moss, C.W., 1988. Legionella pneumophila serogroup Lansing 3 isolated from a patient with fatal pneumonia, and descriptions of L. pneumophila subsp. pneumophila subsp. nov., L. pneumophila subsp. fraseri subsp. nov., and L. pneumophila subsp. pascullei subsp. nov. Journal of clinical microbiology 26, 1695-1703. 
Brenner, D.J., Steigerwalt, A.G., McDade, J.E., 1979. Classification of the Legionnaires' disease bacterium: Legionella pneumophila, genus novum, species nova, of the family Legionellaceae, familia nova. Ann Intern Med 90, 656-658. Brieland, J., Freeman, P., Kunkel, R., Chrisp, C., Hurley, M., Fantone, J., Engleberg, C., 1994. Replicative Legionella pneumophila lung infection in intratracheally inoculated A/J mice. A murine model of human Legionnaires' disease. Am J Pathol 145, 1537-1546. Brieland, J., McClain, M., LeGendre, M., Engleberg, C., 1997. Intrapulmonary Hartmannella vermiformis: a potential niche for Legionella pneumophila replication in a murine model of legionellosis. Infect Immun 65, 4892-4896. Browning, D.F., Grainger, D.C., Busby, S.J., 2010. Effects of nucleoid-associated proteins on bacterial chromosome structure and gene expression. Curr Opin Microbiol 13, 773780. Burstein, D., Amaro, F., Zusman, T., Lifshitz, Z., Cohen, O., Gilbert, J.A., Pupko, T., Shuman, H.A., Segal, G., 2016. Genomic analysis of 38 Legionella species identifies large and diverse effector repertoires. Nature genetics. Cazalet, C., Gomez-Valero, L., Rusniok, C., Lomma, M., Dervins-Ravault, D., Newton, H.J., Sansom, F.M., Jarraud, S., Zidane, N., Ma, L., Bouchier, C., Etienne, J., Hartland, E.L., Buchrieser, C., 2010. Analysis of the Legionella longbeachae genome and transcriptome uncovers unique strategies to cause Legionnaires' disease. PLoS Genet 6, e1000851. Cazalet, C., Jarraud, S., Ghavi-Helm, Y., Kunst, F., Glaser, P., Etienne, J., Buchrieser, C., 2008. Multigenome analysis identifies a worldwide distributed epidemic Legionella pneumophila clone that emerged within a highly diverse species. Genome Res. 18, 431441. Cazalet, C., Rusniok, C., Bruggemann, H., Zidane, N., Magnier, A., Ma, L., Tichit, M., Jarraud, S., Bouchier, C., Vandenesch, F., Kunst, F., Etienne, J., Glaser, P., Buchrieser, C., 2004. 
Evidence in the Legionella pneumophila genome for exploitation of host cell functions and high genome plasticity. Nature genetics 36, 1165-1173.

Chang, B., Amemura-Maekawa, J., Watanabe, H., 2009. An improved protocol for the preparation and restriction enzyme digestion of pulsed-field gel electrophoresis agarose plugs for the analysis of Legionella isolates. Jpn J Infect Dis 62, 54-56.
Chien, M., Morozova, I., Shi, S., Sheng, H., Chen, J., Gomez, S.M., Asamani, G., Hill, K., Nuara, J., Feder, M., Rineer, J., Greenberg, J.J., Steshenko, V., Park, S.H., Zhao, B., Teplitskaya, E., Edwards, J.R., Pampou, S., Georghiou, A., Chou, I.C., Iannuccilli, W., Ulz, M.E., Kim, D.H., Geringer-Sameth, A., Goldsberry, C., Morozov, P., Fischer, S.G., Segal, G., Qu, X., Rzhetsky, A., Zhang, P., Cayanis, E., De Jong, P.J., Ju, J., Kalachikov, S., Shuman, H.A., Russo, J.J., 2004. The genomic sequence of the accidental pathogen Legionella pneumophila. Science 305, 1966-1968.
Conant, G.C., Wolfe, K.H., 2008. Turning a hobby into a job: how duplicated genes find new functions. Nature reviews. Genetics 9, 938-950.
Correia, A.M., Ferreira, J.S., Borges, V., Nunes, A., Gomes, B., Capucho, R., Goncalves, J., Antunes, D.M., Almeida, S., Mendes, A., Guerreiro, M., Sampaio, D.A., Vieira, L., Machado, J., Simoes, M.J., Goncalves, P., Gomes, J.P., 2016. Probable Person-to-Person Transmission of Legionnaires' Disease. N Engl J Med 374, 497-498.
Coscolla, M., Comas, I., Gonzalez-Candelas, F., 2011. Quantifying nonvertical inheritance in the evolution of Legionella pneumophila. Molecular biology and evolution 28, 985-1001.
Coscolla, M., Gonzalez-Candelas, F., 2009. Direct sequencing of Legionella pneumophila from respiratory samples for sequence-based typing analysis. Journal of clinical microbiology 47, 2901-2905.
Costa, J., Tiago, I., Da Costa, M.S., Verissimo, A., 2010. Molecular evolution of Legionella pneumophila dotA gene, the contribution of natural environmental strains. Environmental microbiology 12, 2711-2729.
Cunha, B.A., Connolly, J., Abruzzo, E., 2015. Increase in pre-seasonal community-acquired Legionnaire's disease due to increased precipitation. Clinical microbiology and infection : the official publication of the European Society of Clinical Microbiology and Infectious Diseases 21, e45-46.
Currie, S.L., Beattie, T.K., 2015. Compost and Legionella longbeachae: an emerging infection? Perspect Public Health 135, 309-315.
Dhandayuthapani, S., Blaylock, M.W., Bebear, C.M., Rasmussen, W.G., Baseman, J.B., 2001. Peptide methionine sulfoxide reductase (MsrA) is a virulence determinant in Mycoplasma genitalium. J Bacteriol 183, 5645-5650.
Dillon, S.C., Dorman, C.J., 2010. Bacterial nucleoid-associated proteins, nucleoid structure and gene expression. Nat Rev Microbiol 8, 185-195.
Doleans-Jordheim, A., Akermi, M., Ginevra, C., Cazalet, C., Kay, E., Schneider, D., Buchrieser, C., Atlan, D., Vandenesch, F., Etienne, J., Jarraud, S., 2006. Growth-phase-dependent mobility of the lvh-encoding region in Legionella pneumophila strain Paris. Microbiology 152, 3561-3568.
Doleans, A., Aurell, H., Reyrolle, M., Lina, G., Freney, J., Vandenesch, F., Etienne, J., Jarraud, S., 2004. Clinical and environmental distributions of Legionella strains in France are different. Journal of clinical microbiology 42, 458-460.
Drozanski, W., 1956. Fatal bacterial infection in soil amoebae. Acta. Microbiol. Pol. 5, 315-317.
Drozanski, W.J., 1991. Sarcobium lyticum gen. nov., sp. nov., an obligate intracellular bacterial parasite of small free-living amoebae. Int. J. Syst. Bacteriol. 41, 82-87.
Dumenil, G., Montminy, T.P., Tang, M., Isberg, R.R., 2004. IcmR-regulated membrane insertion and efflux by the Legionella pneumophila IcmQ protein. The Journal of biological chemistry 279, 4686-4695.
Edwards, M.T., Fry, N.K., Harrison, T.G., 2008. Clonal population structure of Legionella pneumophila inferred from allelic profiling. Microbiology 154, 852-864.
Ensminger, A.W., Yassin, Y., Miron, A., Isberg, R.R., 2012. Experimental evolution of Legionella pneumophila in mouse macrophages leads to strains with altered determinants of environmental survival. PLoS pathogens 8, e1002731.

European Centre for Disease Prevention and Control, 2016. Legionnaires' disease in Europe, 2014.
Falconi, M., Prosseda, G., Giangrossi, M., Beghetto, E., Colonna, B., 2001. Involvement of FIS in the H-NS-mediated regulation of virF gene of Shigella and enteroinvasive Escherichia coli. Mol Microbiol 42, 439-452.
Feeley, J.C., Gorman, G.W., Weaver, R.E., Mackel, D.C., Smith, H.W., 1978. Primary isolation media for Legionnaires disease bacterium. Journal of clinical microbiology 8, 320-325.
Feil, E.J., Holmes, E.C., Bessen, D.E., Chan, M.S., Day, N.P., Enright, M.C., Goldstein, R., Hood, D.W., Kalia, A., Moore, C.E., Zhou, J., Spratt, B.G., 2001. Recombination within natural populations of pathogenic bacteria: short-term empirical estimates and long-term phylogenetic consequences. Proceedings of the National Academy of Sciences of the United States of America 98, 182-187.
Feldman, M., Segal, G., 2004. A specific genomic location within the icm/dot pathogenesis region of different Legionella species encodes functionally similar but nonhomologous virulence proteins. Infect Immun 72, 4503-4511.
Feldman, M., Zusman, T., Hagag, S., Segal, G., 2005. Coevolution between nonhomologous but functionally similar proteins and their conserved partners in the Legionella pathogenesis system. Proceedings of the National Academy of Sciences of the United States of America 102, 12206-12211.
Fettes, P.S., Forsbach-Birk, V., Lynch, D., Marre, R., 2001. Overexpression of a Legionella pneumophila homologue of the E. coli regulator csrA affects cell size, flagellation, and pigmentation. Int J Med Microbiol 291, 353-360.
Fields, B.S., Benson, R.F., Besser, R.E., 2002. Legionella and Legionnaires' disease: 25 years of investigation. Clin Microbiol Rev 15, 506-526.
Flynn, K.J., Swanson, M.S., 2014. Integrative conjugative element ICE-betaox confers oxidative stress resistance to Legionella pneumophila in vitro and in macrophages. mBio 5, e01091-01014.
Forsbach-Birk, V., McNealy, T., Shi, C., Lynch, D., Marre, R., 2004. Reduced expression of the global regulator protein CsrA in Legionella pneumophila affects virulence-associated regulators and growth in Acanthamoeba castellanii. Int J Med Microbiol 294, 15-25.
Fraser, D.W., Tsai, T.R., Orenstein, W., Parkin, W.E., Beecham, H.J., Sharrar, R.G., Harris, J., Mallison, G.F., Martin, S.M., McDade, J.E., Shepard, C.C., Brachman, P.S., 1977. Legionnaires' disease: description of an epidemic of pneumonia. N. Engl. J. Med. 297, 1189-1197.
Fry NK, A.B., Wewalka G, and Harrison TG, 2005. Epidemiological typing of Legionella pneumophila in the absence of isolates, in: Cianciotto, N.P. (Ed.), Legionella: State of art 30 years after its recognition. ASM Press, Chicago, p. 152.
Fry, N.K., Afshar, B., Visca, P., Jonas, D., Duncan, J., Nebuloso, E., Underwood, A., Harrison, T.G., 2005. Assessment of fluorescent amplified fragment length polymorphism analysis for epidemiological genotyping of Legionella pneumophila serogroup 1. Clinical microbiology and infection : the official publication of the European Society of Clinical Microbiology and Infectious Diseases 11, 704-712.
Fry, N.K., Alexiou-Daniel, S., Bangsborg, J.M., Bernander, S., Castellani Pastoris, M., Etienne, J., Forsblom, B., Gaia, V., Helbig, J.H., Lindsay, D., Christian Luck, P., Pelaz, C., Uldum, S.A., Harrison, T.G., 1999. A multicenter evaluation of genotypic methods for the epidemiologic typing of Legionella pneumophila serogroup 1: results of a pan-European study. Clinical microbiology and infection : the official publication of the European Society of Clinical Microbiology and Infectious Diseases 5, 462-477.
Fry, N.K., Bangsborg, J.M., Bergmans, A., Bernander, S., Etienne, J., Franzin, L., Gaia, V., Hasenberger, P., Baladron Jimenez, B., Jonas, D., Lindsay, D., Mentula, S., Papoutsi, A., Struelens, M., Uldum, S.A., Visca, P., Wannet, W., Harrison, T.G., 2002. Designation of the European Working Group on Legionella Infection (EWGLI) amplified fragment length polymorphism types of Legionella pneumophila serogroup 1 and results of intercentre proficiency testing using a standard protocol. Eur J Clin Microbiol Infect Dis 21, 722-728.

Fry, N.K., Warwick, S., Saunders, N.A., Embley, T.M., 1991. The use of 16S ribosomal RNA analyses to investigate the phylogeny of the family Legionellaceae. J Gen Microbiol 137, 1215-1222.
Fujinami, Y., Kikkawa, H.S., Kurosaki, Y., Sakurada, K., Yoshino, M., Yasuda, J., 2010. Rapid discrimination of Legionella by matrix-assisted laser desorption ionization time-of-flight mass spectrometry. Microbiological research.
Gaia, V., Fry, N.K., Afshar, B., Luck, P.C., Meugnier, H., Etienne, J., Peduzzi, R., Harrison, T.G., 2005. Consensus sequence-based scheme for epidemiological typing of clinical and environmental isolates of Legionella pneumophila. Journal of clinical microbiology 43, 2047-2052.
Gaia, V., Fry, N.K., Harrison, T.G., Peduzzi, R., 2003. Sequence-based typing of Legionella pneumophila serogroup 1 offers the potential for true portability in legionellosis outbreak investigation. J. Clin. Microbiol. 41, 2932-2939.
Gal-Mor, O., Segal, G., 2003. Identification of CpxR as a positive regulator of icm and dot virulence genes of Legionella pneumophila. J Bacteriol 185, 4908-4919.
Garduno, R.A., Garduno, E., Hoffman, P.S., 1998. Surface-associated hsp60 chaperonin of Legionella pneumophila mediates invasion in a HeLa cell model. Infect Immun 66, 4602-4610.
Garrity, G., Brown, A., Vickers, R., 1980. Tatlockia and Fluoribacter: two new genera of organisms resembling Legionella pneumophila. Int. J. Syst. Bacteriol. 30, 609-614.
Ginevra, C., Forey, F., Campese, C., Reyrolle, M., Che, D., Etienne, J., Jarraud, S., 2008. Lorraine strain of Legionella pneumophila serogroup 1, France. Emerg Infect Dis 14, 673-675.
Ginevra, C., Jacotin, N., Diancourt, L., Guigon, G., Arquilliere, R., Meugnier, H., Descours, G., Vandenesch, F., Etienne, J., Lina, G., Caro, V., Jarraud, S., 2012. Legionella pneumophila sequence type 1/Paris pulsotype subtyping by spoligotyping. Journal of clinical microbiology 50, 696-701.
Ginevra, C., Lopez, M., Forey, F., Reyrolle, M., Meugnier, H., Vandenesch, F., Etienne, J., Jarraud, S., Molmeret, M., 2009. Evaluation of a nested-PCR-derived sequence-based typing method applied directly to respiratory samples from patients with Legionnaires' disease. Journal of clinical microbiology 47, 981-987.
Glöckner, G., Albert-Weissenberger, C., Weinmann, E., Jacobi, S., Schunder, E., Steinert, M., Hacker, J., Heuner, K., 2008. Identification and characterization of a new conjugation/type IVA secretion system (trb/tra) of Legionella pneumophila Corby localized on a mobile genomic island. International journal of medical microbiology : IJMM 298, 411-428.
Goldberg, M.D., Johnson, M., Hinton, J.C., Williams, P.H., 2001. Role of the nucleoid-associated protein Fis in the regulation of virulence properties of enteropathogenic Escherichia coli. Mol Microbiol 41, 549-559.
Gomez-Lus, P., Fields, B.S., Benson, R.F., Martin, W.T., O'Connor, S.P., Black, C.M., 1993. Comparison of arbitrarily primed polymerase chain reaction, ribotyping, and monoclonal antibody analysis for subtyping Legionella pneumophila serogroup 1. Journal of clinical microbiology 31, 1940-1942.
Gomez-Valero, L., Buchrieser, C., 2013. Genome dynamics in Legionella: the basis of versatility and adaptation to intracellular replication. Cold Spring Harb Perspect Med 3.
Gomez-Valero, L., Rusniok, C., Jarraud, S., Vacherie, B., Rouy, Z., Barbe, V., Medigue, C., Etienne, J., Buchrieser, C., 2011. Extensive recombination events and horizontal gene transfer shaped the Legionella pneumophila genomes. BMC Genomics 12, 536.
Gomez-Valero, L., Rusniok, C., Rolando, M., Neou, M., Dervins-Ravault, D., Demirtas, J., Rouy, Z., Moore, R.J., Chen, H., Petty, N.K., Jarraud, S., Etienne, J., Steinert, M., Heuner, K., Gribaldo, S., Medigue, C., Glockner, G., Hartland, E.L., Buchrieser, C., 2014. Comparative analyses of Legionella species identifies genetic features of strains causing Legionnaires' disease. Genome Biol 15, 505.

Gomez-Valero, L., Rusniok, C., Cazalet, C., Buchrieser, C., 2011. Comparative and functional genomics of Legionella identified eukaryotic like proteins as key players in host-pathogen interactions. Frontiers in microbiology 2.
Gomgnimbou, M.K., Ginevra, C., Peron-Cane, C., Versapuech, M., Refregier, G., Jacotin, N., Sola, C., Jarraud, S., 2014. Validation of a microbead-based format for spoligotyping of Legionella pneumophila. Journal of clinical microbiology 52, 2410-2415.
Graham, F.F., White, P.S., Harte, D.J., Kingham, S.P., 2011. Changing epidemiological trends of legionellosis in New Zealand, 1979-2009. Epidemiol Infect, 1-16.
Graham, R.M., Doyle, C.J., Jennison, A.V., 2014. Real-time investigation of a Legionella pneumophila outbreak using whole genome sequencing. Epidemiol Infect 142, 2347-2351.
Grattard, F., Berthelot, P., Reyrolle, M., Ros, A., Etienne, J., Pozzetto, B., 1996. Molecular typing of nosocomial strains of Legionella pneumophila by arbitrarily primed PCR. Journal of clinical microbiology 34, 1595-1598.
Grattard, F., Ginevra, C., Riffard, S., Ros, A., Jarraud, S., Etienne, J., Pozzetto, B., 2006. Analysis of the genetic diversity of Legionella by sequencing the 23S-5S ribosomal intergenic spacer region: from phylogeny to direct identification of isolates at the species level from clinical specimens. Microbes Infect 8, 73-83.
Gunderson, F.F., Cianciotto, N.P., 2013. The CRISPR-associated gene cas2 of Legionella pneumophila is required for intracellular infection of amoebae. mBio 4, e00074-00013.
Gunderson, F.F., Mallama, C.A., Fairbairn, S.G., Cianciotto, N.P., 2015. Nuclease activity of Legionella pneumophila Cas2 promotes intracellular infection of amoebal host cells. Infect Immun 83, 1008-1018.
Harding, C.R., Schroeder, G.N., Reynolds, S., Kosta, A., Collins, J.W., Mousnier, A., Frankel, G., 2012. Legionella pneumophila pathogenesis in the Galleria mellonella infection model. Infect Immun 80, 2780-2790.
Harrison, T.G., Afshar, B., Doshi, N., Fry, N.K., Lee, J.V., 2009. Distribution of Legionella pneumophila serogroups, monoclonal antibody subgroups and DNA sequence types in recent clinical and environmental isolates from England and Wales (2000-2008). Eur J Clin Microbiol Infect Dis 28, 781-791.
Harrison, T.G., Doshi, N., Fry, N.K., Joseph, C.A., 2007. Comparison of clinical and environmental isolates of Legionella pneumophila obtained in the UK over 19 years. Clinical microbiology and infection : the official publication of the European Society of Clinical Microbiology and Infectious Diseases 13, 78-85.
Helbig, J.H., Bernander, S., Castellani Pastoris, M., Etienne, J., Gaia, V., Lauwers, S., Lindsay, D., Luck, P.C., Marques, T., Mentula, S., Peeters, M.F., Pelaz, C., Struelens, M., Uldum, S.A., Wewalka, G., Harrison, T.G., 2002. Pan-European study on culture-proven Legionnaires' disease: distribution of Legionella pneumophila serogroups and monoclonal subgroups. Eur J Clin Microbiol Infect Dis 21, 710-716.
Helbig, J.H., Uldum, S.A., Bernander, S., Luck, P.C., Wewalka, G., Abraham, B., Gaia, V., Harrison, T.G., 2003. Clinical utility of urinary antigen detection for diagnosis of community-acquired, travel-associated, and nosocomial legionnaires' disease. Journal of clinical microbiology 41, 838-840.
Hookey, J.V., Saunders, N.A., Fry, N.K., Birtles, R.J., Harrison, T.G., 1996. Phylogeny of Legionellaceae based on small-subunit ribosomal DNA sequences and proposal of Legionella lytica comb. nov. for Legionella-like amoebal pathogens. Int J Syst Bacteriol 46, 526-531.
Horvath, P., Barrangou, R., 2010. CRISPR/Cas, the immune system of bacteria and archaea. Science 327, 167-170.
Hubber, A., Roy, C.R., 2010. Modulation of host cell function by Legionella pneumophila type IV effectors. Annu Rev Cell Dev Biol 26, 261-283.
Hunter, P.R., Gaston, M.A., 1988. Numerical index of the discriminatory ability of typing systems: an application of Simpson's index of diversity. Journal of clinical microbiology 26, 2465-2466.

Innan, H., Kondrashov, F., 2010. The evolution of gene duplications: classifying and distinguishing between models. Nature reviews. Genetics 11, 97-108.
Isberg, R.R., O'Connor, T.J., Heidtman, M., 2009. The Legionella pneumophila replication vacuole: making a cosy niche inside host cells. Nat Rev Microbiol 7, 13-24.
Jakubek, D., Le Brun, M., Leblon, G., Dubow, M., Binet, M., 2013. Validation of IRS PCR, a molecular typing method, for the study of the diversity and population dynamics of Legionella in industrial cooling circuits. Lett Appl Microbiol 56, 135-141.
Jayakumar, D., Early, J.V., Steinman, H.M., 2012. Virulence phenotypes of Legionella pneumophila associated with noncoding RNA lpr0035. Infect Immun 80, 4143-4153.
Jeong, K.C., Sexton, J.A., Vogel, J.P., 2015. Spatiotemporal Regulation of a Legionella pneumophila T4SS Substrate by the Metaeffector SidJ. PLoS Pathogens 11, e1004695.
Jinek, M., Chylinski, K., Fonfara, I., Hauer, M., Doudna, J.A., Charpentier, E., 2012. A programmable dual-RNA-guided DNA endonuclease in adaptive bacterial immunity. Science 337, 816-821.
Joly, J.R., McKinney, R.M., Tobin, J.O., Bibb, W.F., Watkins, I.D., Ramsay, D., 1986. Development of a standardized subgrouping scheme for Legionella pneumophila serogroup 1 using monoclonal antibodies. Journal of clinical microbiology 23, 768-771.
Juhas, M., Crook, D.W., Dimopoulou, I.D., Lunter, G., Harding, R.M., Ferguson, D.J., Hood, D.W., 2007. Novel type IV secretion system involved in propagation of genomic islands. J Bacteriol 189, 761-771.
Kelly, A., Goldberg, M.D., Carroll, R.K., Danino, V., Hinton, J.C., Dorman, C.J., 2004. A global role for Fis in the transcriptional control of metabolism and type III secretion in Salmonella enterica serovar Typhimurium. Microbiology 150, 2037-2053.
Kim, E.H., Charpentier, X., Torres-Urquidy, O., McEvoy, M.M., Rensing, C., 2009. The metal efflux island of Legionella pneumophila is not required for survival in macrophages and amoebas. FEMS Microbiol Lett 301, 164-170.
Ko, K.S., Lee, H.K., Park, M.Y., Park, M.S., Lee, K.H., Woo, S.Y., Yun, Y.J., Kook, Y.H., 2002. Population genetic structure of Legionella pneumophila inferred from RNA polymerase gene (rpoB) and DotA gene (dotA) sequences. J Bacteriol 184, 2123-2130.
Kozak, N.A., Buss, M., Lucas, C.E., Frace, M., Govil, D., Travis, T., Olsen-Rasmussen, M., Benson, R.F., Fields, B.S., 2010a. Virulence factors encoded by Legionella longbeachae identified on the basis of the genome sequence analysis of clinical isolate D-4968. J Bacteriol 192, 1030-1044.
Kozak, N.A., Buss, M., Lucas, C.E., Frace, M., Govil, D., Travis, T., Olsen-Rasmussen, M., Benson, R.F., Fields, B.S., 2010b. Virulence factors encoded by Legionella longbeachae identified on the basis of the genome sequence analysis of clinical isolate D-4968. J Bacteriol 192, 1030-1044.
La Scola, B., Birtles, R.J., Greub, G., Harrison, T.J., Ratcliff, R.M., Raoult, D., 2004. Legionella drancourtii sp. nov., a strictly intracellular amoebal pathogen. Int J Syst Evol Microbiol 54, 699-703.
La Scola, B., Mezi, L., Auffray, J.P., Berland, Y., Raoult, D., 2002. Patients in the intensive care unit are exposed to amoeba-associated pathogens. Infect Control Hosp Epidemiol 23, 462-465.
Lanser, J.A., Adams, M., Doyle, R., Sangster, N., Steele, T.W., 1990. Genetic relatedness of Legionella longbeachae isolates from human and environmental sources in Australia. Appl Environ Microbiol 56, 2784-2790.
Lautier, T., Nasser, W., 2007. The DNA nucleoid-associated protein Fis co-ordinates the expression of the main virulence genes in the phytopathogenic bacterium Erwinia chrysanthemi. Mol Microbiol 66, 1474-1490.
Lautner, M., Schunder, E., Herrmann, V., Heuner, K., 2013. Regulation, integrase-dependent excision, and horizontal transfer of genomic islands in Legionella pneumophila. J Bacteriol 195, 1583-1597.
Lawrence, C., Reyrolle, M., Dubrou, S., Forey, F., Decludt, B., Goulvestre, C., Matsiota-Bernard, P., Etienne, J., Nauciel, C., 1999a. Single clonal origin of a high proportion of Legionella pneumophila serogroup 1 isolates from patients and the environment in the area of Paris, France, over a 10-year period. J Clin Microbiol 37, 2652-2655.
Lawrence, C., Reyrolle, M., Dubrou, S., Forey, F., Decludt, B., Goulvestre, C., Matsiota-Bernard, P., Etienne, J., Nauciel, C., 1999b. Single clonal origin of a high proportion of Legionella pneumophila serogroup 1 isolates from patients and the environment in the area of Paris, France, over a 10-year period. Journal of clinical microbiology 37, 2652-2655.
Lawrence, C., Ronco, E., Dubrou, S., Leclercq, R., Nauciel, C., Matsiota-Bernard, P., 1999c. Molecular typing of Legionella pneumophila serogroup 1 isolates from patients and the nosocomial environment by arbitrarily primed PCR and pulsed-field gel electrophoresis. J Med Microbiol 48, 327-333.
Lenz, D.H., Bassler, B.L., 2007. The small nucleoid protein Fis is involved in Vibrio cholerae quorum sensing. Mol Microbiol 63, 859-871.
Levesque, S., Plante, P.L., Mendis, N., Cantin, P., Marchand, G., Charest, H., Raymond, F., Huot, C., Goupil-Sormany, I., Desbiens, F., Faucher, S.P., Corbeil, J., Tremblay, C., 2014. Genomic characterization of a large outbreak of Legionella pneumophila serogroup 1 strains in Quebec City, 2012. PloS one 9, e103852.
Lomma, M., Dervins-Ravault, D., Rolando, M., Nora, T., Newton, H.J., Sansom, F.M., Sahr, T., Gomez-Valero, L., Jules, M., Hartland, E.L., Buchrieser, C., 2010. The Legionella pneumophila F-box protein Lpp2082 (AnkB) modulates ubiquitination of the host protein parvin B and promotes intracellular replication. Cellular microbiology 12, 1272-1291.
Luck, C., Brzuszkiewicz, E., Rydzewski, K., Koshkolda, T., Sarnow, K., Essig, A., Heuner, K., 2015. Subtyping of the Legionella pneumophila "Ulm" outbreak strain using the CRISPR-Cas system. Int J Med Microbiol 305, 828-837.
Luck, P.C., Ecker, C., Reischl, U., Linde, H.J., Stempka, R., 2007. Culture-independent identification of the source of an infection by direct amplification and sequencing of Legionella pneumophila DNA from a clinical specimen. Journal of clinical microbiology 45, 3143-3144.
Luo, Z.Q., Isberg, R.R., 2004. Multiple substrates of the Legionella pneumophila Dot/Icm system identified by interbacterial protein transfer. Proc Natl Acad Sci U S A 101, 841-846.
Marra, A., Shuman, H.A., 1989. Isolation of a Legionella pneumophila restriction mutant with increased ability to act as a recipient in heterospecific matings. J Bacteriol 171, 2238-2240.
Marrie, T.J., Raoult, D., La Scola, B., Birtles, R.J., de Carolis, E., 2001. Legionella-like and other amoebal pathogens as agents of community-acquired pneumonia. Emerg Infect Dis 7, 1026-1029.
Marrie, T.J., Tyler, S., Bezanson, G., Dendy, C., Johnson, W., 1999. Analysis of Legionella pneumophila serogroup 1 isolates by pulsed-field gel electrophoresis. J Clin Microbiol 37, 251-254.
McAdam, P.R., Vander Broek, C.W., Lindsay, D.S., Ward, M.J., Hanson, M.F., Gillies, M., Watson, M., Stevens, J.M., Edwards, G.F., Fitzgerald, J.R., 2014. Gene flow in environmental Legionella pneumophila leads to genetic and pathogenic heterogeneity within a Legionnaires' disease outbreak. Genome Biol 15, 504.
McDade, J.E., Brenner, D.J., Bozeman, F.M., 1979. Legionnaires' disease bacterium isolated in 1947. Ann. Intern. Med. 90, 659-661.
McDade, J.E., Shepard, C.C., Fraser, D.W., Tsai, T.R., Redus, M.A., Dowdle, W.R., 1977. Legionnaires' disease: isolation of a bacterium and demonstration of its role in other respiratory disease. N Engl J Med 297, 1197-1203.
McNally, C., Hackman, B., Fields, B.S., Plouffe, J.F., 2000. Potential importance of Legionella species as etiologies in community acquired pneumonia (CAP). Diagn Microbiol Infect Dis 38, 79-82.
Mentasti, M., Fry, N.K., Afshar, B., Palepou-Foxley, C., Naik, F.C., Harrison, T.G., 2012. Application of Legionella pneumophila-specific quantitative real-time PCR combined with direct amplification and sequence-based typing in the diagnosis and epidemiological investigation of Legionnaires' disease. Eur J Clin Microbiol Infect Dis 31, 2017-2028.
Merault, N., Rusniok, C., Jarraud, S., Gomez-Valero, L., Cazalet, C., Marin, M., Brachet, E., Aegerter, P., Gaillard, J.L., Etienne, J., Herrmann, J.L., Lawrence, C., Buchrieser, C., 2011. Specific real-time PCR for simultaneous detection and identification of Legionella pneumophila serogroup 1 in water and clinical samples. Appl Environ Microbiol 77, 1708-1717.
Molofsky, A.B., Swanson, M.S., 2003. Legionella pneumophila CsrA is a pivotal repressor of transmission traits and activator of replication. Mol Microbiol 50, 445-461.
Montanaro-Punzengruber, J.C., Hicks, L., Meyer, W., Gilbert, G.L., 1999. Australian isolates of Legionella longbeachae are not a clonal population. Journal of clinical microbiology 37, 3249-3254.
Moran-Gilad, J., Mentasti, M., Lazarovitch, T., Huberman, Z., Stocki, T., Sadik, C., Shahar, T., Anis, E., Valinsky, L., Harrison, T.G., Grotto, I., 2014. Molecular epidemiology of Legionnaires' disease in Israel. Clinical microbiology and infection : the official publication of the European Society of Clinical Microbiology and Infectious Diseases 20, 690-696.
Moran-Gilad, J., Prior, K., Yakunin, E., Harrison, T.G., Underwood, A., Lazarovitch, T., Valinsky, L., Luck, C., Krux, F., Agmon, V., Grotto, I., Harmsen, D., 2015. Design and application of a core genome multilocus sequence typing scheme for investigation of Legionnaires' disease incidents. Euro Surveill 20.
Morozova, I., Qu, X., Shi, S., Asamani, G., Greenberg, J.E., Shuman, H.A., Russo, J.J., 2004. Comparative sequence analysis of the icm/dot genes in Legionella. Plasmid 51, 127-147.
Moskovitz, J., Rahman, M.A., Strassman, J., Yancey, S.O., Kushner, S.R., Brot, N., Weissbach, H., 1995. Escherichia coli peptide methionine sulfoxide reductase gene: regulation of expression and role in protecting against oxidative damage. J Bacteriol 177, 502-507.
Nederbragt, A.J., Balasingham, A., Sirevag, R., Utkilen, H., Jakobsen, K.S., Anderson-Glenna, M.J., 2008. Multiple-locus variable-number tandem repeat analysis of Legionella pneumophila using multi-colored capillary electrophoresis. J Microbiol Methods 73, 111-117.
Nevo, O., Zusman, T., Rasis, M., Lifshitz, Z., Segal, G., 2014. Identification of Legionella pneumophila effectors regulated by the LetAS-RsmYZ-CsrA regulatory cascade, many of which modulate vesicular trafficking. J Bacteriol 196, 681-692.
Newton, H.J., Ang, D.K., van Driel, I.R., Hartland, E.L., 2010. Molecular pathogenesis of infections caused by Legionella pneumophila. Clin Microbiol Rev 23, 274-298.
Nora, T., Lomma, M., Gomez-Valero, L., Buchrieser, C., 2009. Molecular mimicry: an important virulence strategy employed by Legionella pneumophila to subvert host functions. Future Microbiol 4, 691-701.
O'Connor, T.J., Adepoju, Y., Boyd, D., Isberg, R.R., 2011. Minimization of the Legionella pneumophila genome reveals chromosomal regions involved in host range expansion. Proc Natl Acad Sci U S A 108, 14733-14740.
Phin, N., Parry-Ford, F., Harrison, T., Stagg, H.R., Zhang, N., Kumar, K., Lortholary, O., Zumla, A., Abubakar, I., 2014. Epidemiology and clinical management of Legionnaires' disease. Lancet Infect Dis 14, 1011-1021.
Pourcel, C., Vidgop, Y., Ramisse, F., Vergnaud, G., Tram, C., 2003. Characterization of a tandem repeat polymorphism in Legionella pneumophila and its use for genotyping. Journal of clinical microbiology 41, 1819-1826.
Pourcel, C., Visca, P., Afshar, B., D'Arezzo, S., Vergnaud, G., Fry, N.K., 2007. Identification of variable-number tandem-repeat (VNTR) sequences in Legionella pneumophila and development of an optimized multiple-locus VNTR analysis typing scheme. Journal of clinical microbiology 45, 1190-1199.
Pruckler, J.M., Mermel, L.A., Benson, R.F., Giorgio, C., Cassiday, P.K., Breiman, R.F., Whitney, C.G., Fields, B.S., 1995. Comparison of Legionella pneumophila isolates by arbitrarily primed PCR and pulsed-field gel electrophoresis: analysis from seven epidemic investigations. J Clin Microbiol 33, 2872-2875.
Rao, C., Benhabib, H., Ensminger, A.W., 2013. Phylogenetic reconstruction of the Legionella pneumophila Philadelphia-1 laboratory strains through comparative genomics. PloS one 8, e64129.
Rasis, M., Segal, G., 2009. The LetA-RsmYZ-CsrA regulatory cascade, together with RpoS and PmrA, post-transcriptionally regulates stationary phase activation of Legionella pneumophila Icm/Dot effectors. Mol Microbiol 72, 995-1010.
Ratzow, S., Gaia, V., Helbig, J.H., Fry, N.K., Luck, P.C., 2007. Addition of neuA, the gene encoding N-acylneuraminate cytidylyl transferase, increases the discriminatory ability of the consensus sequence-based scheme for typing Legionella pneumophila serogroup 1 strains. Journal of clinical microbiology 45, 1965-1968.
Reuter, S., Harrison, T.G., Koser, C.U., Ellington, M.J., Smith, G.P., Parkhill, J., Peacock, S.J., Bentley, S.D., Torok, M.E., 2013. A pilot study of rapid whole-genome sequencing for the investigation of a Legionella outbreak. BMJ Open 3.
Ridenour, D.A., Cirillo, S.L., Feng, S., Samrakandi, M.M., Cirillo, J.D., 2003. Identification of a gene that affects the efficiency of host cell infection by Legionella pneumophila in a temperature-dependent fashion. Infect Immun 71, 6256-6263.
Riffard, S., Lo Presti, F., Vandenesch, F., Forey, F., Reyrolle, M., Etienne, J., 1998. Comparative analysis of infrequent-restriction-site PCR and pulsed-field gel electrophoresis for epidemiological typing of Legionella pneumophila serogroup 1 strains. Journal of clinical microbiology 36, 161-167.
Roig, J., Rello, J., 2003. Legionnaires' disease: a rational approach to therapy. J Antimicrob Chemother 51, 1119-1129.
Rolando, M., Buchrieser, C., 2012. Post-translational modifications of host proteins by Legionella pneumophila: a sophisticated survival strategy. Future Microbiol 7, 369-381.
Rolando, M., Sanulli, S., Rusniok, C., Gomez-Valero, L., Bertholet, C., Sahr, T., Margueron, R., Buchrieser, C., 2013. Legionella pneumophila effector RomA uniquely modifies host chromatin to repress gene expression and promote intracellular bacterial replication. Cell Host Microbe 13, 395-405.
Rowbotham, T.J., 1980. Preliminary report on the pathogenicity of Legionella pneumophila for freshwater and soil amoebae. J Clin Pathol 33, 1179-1183.
Roy, C.R., Berger, K.H., Isberg, R.R., 1998. Legionella pneumophila DotA protein is required for early phagosome trafficking decisions that occur within minutes of bacterial uptake. Mol Microbiol 28, 663-674.
Rubin, C.J., Thollesson, M., Kirsebom, L.A., Herrmann, B., 2005. Phylogenetic relationships and species differentiation of 39 Legionella species by sequence determination of the RNase P RNA gene rnpB. Int J Syst Evol Microbiol 55, 2039-2049.
Sahr, T., Rusniok, C., Dervins-Ravault, D., Sismeiro, O., Coppee, J.Y., Buchrieser, C., 2012. Deep sequencing defines the transcriptional map of L. pneumophila and identifies growth phase-dependent regulated ncRNAs implicated in virulence. RNA Biol 9, 503-519.
Samrakandi, M.M., Cirillo, S.L., Ridenour, D.A., Bermudez, L.E., Cirillo, J.D., 2002. Genetic and phenotypic differences between Legionella pneumophila strains. J Clin Microbiol 40, 1352-1362.
Saunders, N.A., Harrison, T.G., Haththotuwa, A., Kachwalla, N., Taylor, A.G., 1990. A method for typing strains of Legionella pneumophila serogroup 1 by analysis of restriction fragment length polymorphisms. J Med Microbiol 31, 45-55.
Saunders, N.A., Harrison, T.G., Haththotuwa, A., Taylor, A.G., 1991. A comparison of probes for restriction fragment length polymorphism (RFLP) typing of Legionella pneumophila serogroup 1 strains. J Med Microbiol 35, 152-158.
Schroeder, G.N., Petty, N.K., Mousnier, A., Harding, C.R., Vogrin, A.J., Wee, B., Fry, N.K., Harrison, T.G., Newton, H.J., Thomson, N.R., Beatson, S.A., Dougan, G., Hartland, E.L., Frankel, G., 2010. Legionella pneumophila strain 130b possesses a unique combination of type IV secretion systems and novel Dot/Icm secretion system effector proteins. J Bacteriol 192, 6001-6016.

Segal, G., Russo, J.J., Shuman, H.A., 1999. Relationships between a new type IV secretion system and the icm/dot virulence system of Legionella pneumophila. Mol Microbiol 34, 799-809.
Segal, G., Shuman, H.A., 1999. Legionella pneumophila utilizes the same genes to multiply within Acanthamoeba castellanii and human macrophages. Infect Immun 67, 2117-2124.
Selander, R.K., McKinney, R.M., Whittam, T.S., Bibb, W.F., Brenner, D.J., Nolte, F.S., Pattison, P.E., 1985. Genetic structure of populations of Legionella pneumophila. J Bacteriol. 163, 1021-1037.
Skaar, E.P., Tobiason, D.M., Quick, J., Judd, R.C., Weissbach, H., Etienne, F., Brot, N., Seifert, H.S., 2002. The outer membrane localization of the Neisseria gonorrhoeae MsrA/B is involved in survival against reactive oxygen species. Proceedings of the National Academy of Sciences of the United States of America 99, 10108-10113.
Sun, E.W., Wagner, M.L., Maize, A., Kemler, D., Garland-Kuntz, E., Xu, L., Luo, Z.Q., Hollenbeck, P.J., 2013. Legionella pneumophila infection of Drosophila S2 cells induces only minor changes in mitochondrial dynamics. PloS one 8, e62972.
Thomas, C.M., Nielsen, K.M., 2005. Mechanisms of, and barriers to, horizontal gene transfer between bacteria. Nature reviews. Microbiology 3, 711-721.
Trigui, H., Dudyk, P., Sum, J., Shuman, H.A., Faucher, S.P., 2013. Analysis of the transcriptome of Legionella pneumophila hfq mutant reveals a new mobile genetic element. Microbiology 159, 1649-1660.
Underwood, A.P., Jones, G., Mentasti, M., Fry, N.K., Harrison, T.G., 2013. Comparison of the Legionella pneumophila population structure as determined by sequence-based typing and whole genome sequencing. BMC microbiology 13, 302.
van Heijnsbergen, E., Schalk, J.A., Euser, S.M., Brandsema, P.S., den Boer, J.W., de Roda Husman, A.M., 2015. Confirmed and Potential Sources of Legionella Reviewed. Environ Sci Technol 49, 4797-4815.
Vogel, J.P., Andrews, H.L., Wong, S.K., Isberg, R.R., 1998. Conjugative transfer by the virulence system of Legionella pneumophila. Science 279, 873-876.
Wang, L., Wang, F.F., Qian, W., 2011. Evolutionary rewiring and reprogramming of bacterial transcription regulation. Journal of genetics and genomics = Yi chuan xue bao 38, 279-288.
Wee, B.A., Woolfit, M., Beatson, S.A., Petty, N.K., 2013. A distinct and divergent lineage of genomic island-associated Type IV Secretion Systems in Legionella. PloS one 8, e82221.
Woese, C.R., 1987. Bacterial evolution. Microbiol Rev 51, 221-271.
Woodhead, M., 2002. Community-acquired pneumonia in Europe: causative pathogens and resistance patterns. Eur Respir J Suppl 36, 20s-27s.
Yu, V.L., Plouffe, J.F., Pastoris, M.C., Stout, J.E., Schousboe, M., Widmer, A., Summersgill, J., File, T., Heath, C.M., Paterson, D.L., Chereshsky, A., 2002. Distribution of Legionella species and serogroups isolated by culture in patients with sporadic community-acquired legionellosis: an international collaborative survey. J Infect Dis 186, 127-128.
Zhou, H., Ren, H., Zhu, B., Kan, B., Xu, J., Shao, Z., 2010. Optimization of pulsed-field gel electrophoresis for Legionella pneumophila subtyping. Appl Environ Microbiol 76, 1334-1340.
Zhu, W., Banga, S., Tan, Y., Zheng, C., Stephenson, R., Gately, J., Luo, Z.Q., 2011. Comprehensive identification of protein substrates of the Dot/Icm type IV transporter of Legionella pneumophila. PloS one 6, e17638.
Zusman, T., Speiser, Y., Segal, G., 2014. Two Fis regulators directly repress the expression of numerous effector-encoding genes in Legionella pneumophila. J Bacteriol 196, 4172-4183.
Zusman, T., Yerushalmi, G., Segal, G., 2003. Functional similarities between the icm/dot pathogenesis systems of Coxiella burnetii and Legionella pneumophila. Infect Immun 71, 3714-3723.


Table legends


Table 3: Summary table of the distribution of the various ICEs and some of the duplicated genes present in Legionella. To assess the presence or absence of Fis, CsrA and SidE, a tblastn search was performed using the amino acid sequence of each of these proteins from the Legionella pneumophila Paris strain. The “(+)” for Legionella pneumophila Corby and Alcoy corresponds to the presence of a truncated 4th copy of the protein. +: present, -: absent. The phylogenetic tree is based on the 16S gene after running Gblocks on the alignment.
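
The presence/absence calls described in this legend can be illustrated with a minimal sketch. It assumes the tblastn results were saved in the standard 12-column tabular format (BLAST+ `-outfmt 6`); the function name `call_presence`, the E-value and coverage cutoffs, the contig names and the query lengths are hypothetical and are not taken from the study.

```python
def call_presence(tabular_text, query_lengths, max_evalue=1e-5, min_coverage=0.5):
    """Return {query: "+" or "-"} from tblastn tabular (outfmt 6) output.

    Columns assumed: qseqid sseqid pident length mismatch gapopen
    qstart qend sstart send evalue bitscore.
    The cutoffs are illustrative assumptions, not the authors' criteria.
    """
    calls = {query: "-" for query in query_lengths}
    for line in tabular_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query = fields[0]
        if query not in query_lengths:
            continue
        q_start, q_end = int(fields[6]), int(fields[7])
        evalue = float(fields[10])
        # Fraction of the protein query covered by this single hit
        coverage = (abs(q_end - q_start) + 1) / query_lengths[query]
        if evalue <= max_evalue and coverage >= min_coverage:
            calls[query] = "+"
    return calls


# Hypothetical hits against one genome (values invented for illustration)
hits = (
    "Fis1\tcontig1\t85.0\t95\t14\t0\t1\t95\t1200\t1484\t1e-50\t180\n"
    "CsrA\tcontig2\t40.0\t20\t12\t0\t1\t20\t300\t359\t0.5\t25\n"
)
lengths = {"Fis1": 98, "CsrA": 64, "SidE": 1500}  # hypothetical lengths (aa)
calls = call_presence(hits, lengths)
print(calls)
```

This sketch does not distinguish truncated copies, which the table marks as “(+)”; flagging those would need an additional coverage band chosen by the analyst.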


Table 1. Discriminatory power of Legionella typing methods

Methods                 Discriminatory power (D)   Reference
Ribotyping              0.84                       Fry et al., 1999
RFLP                    0.896                      Fry et al., 1999
PFGE                    0.99                       Fry et al., 1999
AFLP                    0.891                      Fry et al., 1999
SBT                     0.94                       Gaia et al., 2005
SBT + Mab               0.98                       Gaia et al., 2005
Spoligotyping           0.797                      Ginevra et al., 2011
Whole genome mapping    0.87                       Bosch et al., 2015
cgMLST                  NA*                        NA*

*NA: Not available
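
For readers unfamiliar with the index, discriminatory power in bacterial typing is conventionally computed as the Hunter-Gaston index (Simpson's index of diversity), D = 1 - [1 / (N(N-1))] Σ n_j(n_j - 1), where N is the number of isolates and n_j the number of isolates assigned to type j. The sketch below assumes this convention applies to the values in Table 1; the function name and the example type labels are hypothetical.

```python
from collections import Counter


def discriminatory_power(type_assignments):
    """Hunter-Gaston discriminatory index for a list of type labels.

    D = 1 - sum(n_j * (n_j - 1)) / (N * (N - 1)), where n_j is the number
    of isolates of type j and N is the total number of isolates.
    """
    n_isolates = len(type_assignments)
    if n_isolates < 2:
        raise ValueError("at least two isolates are required")
    type_counts = Counter(type_assignments).values()
    same_type_pairs = sum(count * (count - 1) for count in type_counts)
    return 1.0 - same_type_pairs / (n_isolates * (n_isolates - 1))


# Hypothetical typing result for eight isolates (labels invented)
isolates = ["ST1", "ST1", "ST23", "ST47", "ST47", "ST47", "ST62", "ST1"]
d_value = discriminatory_power(isolates)
print(round(d_value, 3))
```

A value of 1 means every isolate received a distinct type, and 0 means all isolates share one type, which is why methods in Table 1 with D closer to 1 separate unrelated isolates more reliably.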

Table 2: Distribution and characteristics of the various ICEs present in Legionella

Strain

Insertion site

Size (Kb)

Paris

Nb of copies 1

tmRNA

36

Philadelphia

1

Arg

Lens Wadsworth 130b

1 2

LpVv2 HL06041035

1

tmRNA Inserted upstream and downstream of the CRISPR Locus tmRNA

L. micdadei

ATCC33218

1

tmRNA

L. longbeachae

D-4968 - WGS ACZGv1

1

L. pneumophila

Paris (plasmid) Lorraine (plasmid) Lens (plasmid) Philadelphia NSW150 NSW150 (plasmid) D4968-WGS

1 1 1 1 1 1 1

Species L. pneumophila Lvh

L. longbeachae

L. hackeliae L. fallonii

ATCC35250 LHAv2 (Plasmid) ATCC700992 (plasmid) ATCC700992 Chromosome


tRNA

tRNA

Reference

45 or 50 Lvh1: 25.5 Lvh2: 17.2

(Cazalet et al., 2004) (Chien et al., 2004) This study (Schroeder et al., 2010)

36

This study

32.3

This study

45.2

(Kozak et al., 2010a) (Gomez-Valero et al., 2011)

45

Genome Accession Nb and status CR628336 (complete) AE017354 (Complete) CR628337 (complete) FR687201, Draft (145 contigs) EMBL: FQ958211 (complete) EMBL: PRJEB7312 (complete)

tRNA, tRNA Arg tRNA, tRNA (A) ?

28 28.2 32 64 48.7 26.6 ?

1

?

?

This study

ACZG00000000, Draft (13 contigs) CR628338 (complete) FQ958212 (complete) CR628339 (complete) AE017354 (complete) NC_013861 (complete) NC_014544 (complete) ACZG00000000, Draft (13 contigs) EMBL: PRJEB7321

1 1

?

100 78

This study This study

LN614828 (complete) LN614827 (complete)


Tra (F-type)

Pro

*


Element

Near a pin site specific recombinase e14 prophage: (A) Lys

Arg

Ser

Leu

tRNA,

Gly

tRNA,

Cyst

tRNA

This study


Alcoy

2

L. longbeachae

NSW150

1

L. pneumophila

Alcoy Corby

1 2

Philadelphia Paris

1 2

Lens Wadsworth 130b

1 2

Lorraine Subspecies HL06041035

1 2

L. longbeachae

D-4968

1

L. drancourtii

LLAP12

L. fallonii

ATCC700992

1

L. hackeliae

ATCC35250

L. micdadei

ATCC33218

thr

tRNA thr LpcGI-1: tRNA Met LpcGI-2: tRNA thr tRNA thr lppGI-1: tRNA Met lppGI-2: tRNA thr tRNA thr LpwGI-1: tRNA Arg LpwGI-2: tRNA Pro tRNA Thr LpvGI-1: tRNA Arg LpvGI-2: tRNA Pro tRNA Arg

Trb1: 42.7 Trb2: 34.4 60.2 22


1 1


Lorraine Wadsworth 130b

pro

Trb1: tRNA Trb2: tmRNA Met tRNA Non-synonymous genomic location pro Trb1: tRNA Trb2: tmRNA Met tRNA


2


Corby


GIT4ASS

L. pneumophila


Trb (P-type)

39.6 35.3 80

51 45 22 115 45 123 80 59 70.5 98 104 >45

(Lautner et al., 2013) This study (Schroeder et al., 2010) This study

CP000675 (complete)

(Gomez-Valero et al., 2011) (Wee et al., 2013)

NC_013861 (complete)

This study This study

CP001828 complete CP000675 complete AE017354 complete CR628336 complete CR628337 complete CAFM0000000 (Draft, 159 contigs) FQ958210 (complete) FQ958211 (complete)

(Wee et al., 2013) (Wee et al., 2013)

ACZG00000000 (Draft, 13 contigs) ACUL00000000 (Draft, 263 contigs)

This study

EMBL: PRJEB7322 (complete) EMBL: PRJEB7321 (Complete) EMBL: PRJEB7312 (Complete)

LdrGI-1: tRNA pro LdrGI-2: tRNA LdrGI-3:? LdrGI-4: ? pro tRNA

>46 >30 >44 >58 49

1

Arg

89.2

This study

1

Met

37.1

This study

4

tRNA tRNA

FQ958210 (complete) CAFM0000000 (Draft, 159 contigs) CP001828 complete


Table 3: Summary table of the distribution of the various ICEs and some of the duplicated genes present in Legionella.


Highlights

- WGS has become the method of choice for L. pneumophila outbreak investigations

- High recombination rates and DNA exchange via conjugation are key evolutionary mechanisms

- The Dot/Icm T4BSS, well conserved among species, is the key to Legionella pathogenesis

- Legionella species have a large non-overlapping Dot/Icm effector repertoire

- All Legionella strains sequenced encode one or several T4ASSs, often present on ICEs