Drancourtella massiliensis gen. nov., sp. nov. isolated from fresh healthy human faecal sample from South France


Accepted Manuscript

Drancourtella massiliensis gen. nov., sp. nov. isolated from fresh healthy human fecal sample from South France

Guillaume A. Durand, Jean-Christophe Lagier, Saber Khelaifia, Nicholas Armstrong, Catherine Robert, Jaishriram Rathored, Pierre-Edouard Fournier, Didier Raoult

PII: S2052-2975(16)00019-6
DOI: 10.1016/j.nmni.2016.02.002
Reference: NMNI 130
To appear in: New Microbes and New Infections
Received Date: 14 November 2015
Revised Date: 30 January 2016
Accepted Date: 3 February 2016

Please cite this article as: Durand GA, Lagier J-C, Khelaifia S, Armstrong N, Robert C, Rathored J, Fournier P-E, Raoult D, Drancourtella massiliensis gen. nov., sp. nov. isolated from fresh healthy human fecal sample from South France, New Microbes and New Infections (2016), doi: 10.1016/j.nmni.2016.02.002.

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

ACCEPTED MANUSCRIPT

Guillaume A. Durand (a,b), Jean-Christophe Lagier (a,b), Saber Khelaifia (a), Nicholas Armstrong (a), Catherine Robert (a), Jaishriram Rathored (a), Pierre-Edouard Fournier (a,b), Didier Raoult (a,b,c)*

a: URMITE UM63, CNRS7278, IRD198, INSERM1085, Faculté de Médecine, Aix Marseille Université, 27 boulevard Jean Moulin, 13385 Marseille cedex 5, France

b: Pôle des Maladies Infectieuses, Hôpital La Timone, Assistance Publique-Hôpitaux de Marseille, Marseille, France

c: Special Infectious Agents Unit, King Fahd Medical Research Center, King Abdulaziz University, Jeddah, Saudi Arabia

* Corresponding author: Pr Didier Raoult, URMITE UM63, CNRS7278, IRD198, INSERM1085, Faculté de Médecine, Aix Marseille Université, 27 boulevard Jean Moulin, 13385 Marseille cedex 5, France, [email protected]

Running head: Drancourtella massiliensis gen. nov., sp. nov. genome

Keywords: Drancourtella massiliensis gen. nov., sp. nov., taxono-genomics, culturomics, anaerobe, gut microbiota

Abstract word count: 75

Text word count: 1,860

Abbreviations:

CSUR: Collection de Souches de l’Unité des Rickettsies
DSM: Deutsche Sammlung von Mikroorganismen
MALDI-TOF MS: Matrix-assisted laser-desorption/ionization time-of-flight mass spectrometry
TE buffer: Tris-EDTA buffer
SDS: sodium dodecyl sulfate
URMITE: Unité de Recherche sur les Maladies Infectieuses et Tropicales Emergentes
FAME: Fatty Acid Methyl Ester
GC/MS: Gas Chromatography/Mass Spectrometry

ABSTRACT

Strain GD1T is the type strain of the newly proposed genus and species Drancourtella massiliensis gen. nov., sp. nov., belonging to the order Clostridiales. This strain, isolated from the stool of a healthy person, is a gram-positive, oxygen-intolerant, non-motile, spore-forming rod. The features of this organism and its genome sequence are described. The draft genome is 3,057,334 bp long with a G+C content of 45.24% and contains 2,861 protein-coding genes and 64 RNA genes.

Introduction

The gut microbiota is an important part of the Human Microbiome Project. Indeed, 53% of the bacterial species cultivated from humans were isolated from the gut (1), and 80% of the gut phylotypes detected by metagenomic methods, which are mainly anaerobic strains (2), have not yet been cultivated (3). These phylotypes belong mainly to the Firmicutes and Proteobacteria. Culturomics consists of using a large number of different culture methods, notably multiple anaerobic conditions, in order to cultivate most of the bacterial species present in a single polymicrobial sample (4). With the aim of cultivating most of the anaerobic species in human stool, we applied culturomics methods to a fresh stool sample from a healthy French person.

According to 16S rRNA phylogenetic analysis, the new genus proposed here is close to Ruminococcus torques, a member of the order Clostridiales in the phylum Firmicutes that was placed in a separate clade on the basis of 16S rRNA analysis (5). Cluster XIV notably contains the genera Ruminococcus, Coprococcus and Blautia. Ruminococcus was first described from rumen fluid. Human infections caused by these bacteria have been reported (6–9).

We propose here a novel genus and species, Drancourtella massiliensis, for a strain isolated from the fresh stool of a healthy French person and phylogenetically close to Ruminococcus torques. The type strain is GD1T (= CSUR P1506 = DSM 100357).

Materials and methods

Ethics and sample collection

After informed and signed consent was obtained, under a protocol approved by the Institut Fédératif de Recherche 48 (Faculty of Medicine, Marseille, France) under agreement number 09-022, a stool specimen was collected at La Timone hospital, Marseille (France), in January 2015. The specimen came from a healthy 28-year-old French male (BMI 23.2 kg/m2) who was receiving no treatment, in particular no antibiotics.

Isolation and growth conditions of the strain

After a thermal shock of 20 minutes at 80°C, the stool specimen was incubated at 37°C in a Bactec Lytic/10 Anaerobic/F anaerobic blood culture bottle (Becton Dickinson, Le Pont de Claix, France) supplemented with 5% sheep blood. Dilution cultures were then performed, and growth conditions were characterized as previously described (10). Finally, sporulation and growth at different pH levels and NaCl concentrations were tested on agar plates under the best culture conditions (11).

Strain identification by MALDI-TOF and 16S rRNA

Colonies were identified using a MALDI-TOF Microflex spectrometer (Bruker Daltonics, Leipzig, Germany) as previously described, and each spectrum was compared with our database, which includes the Bruker database and our own collection (10,12). In case of non-identification, i.e. if the spectrum found no match in the database (score < 1.7), we proceeded to spectrum verification. If the spectrum was free of background noise, and after culture contamination had been excluded, the 16S rRNA gene was sequenced as previously described (10). When the sequence similarity value was lower than 96%, the strain was considered to represent a new genus, without performing DNA-DNA hybridization, as suggested by Stackebrandt and Ebers (13).
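The identification workflow described above can be summarized as a small decision procedure. The following Python sketch is only an illustration: the two thresholds (MALDI-TOF score of 1.7 and 16S rRNA similarity of 96%) come from the text, whereas the function names, the returned labels and the fallback step for a noisy or contaminated spectrum are hypothetical and are not part of the authors' pipeline.

```python
# Thresholds stated in the text.
MALDI_MIN_SCORE = 1.7            # a score below this means no database match
GENUS_SIMILARITY_CUTOFF = 96.0   # % 16S rRNA similarity below which a new genus is proposed


def identification_step(maldi_score: float, spectrum_is_clean: bool,
                        culture_is_pure: bool) -> str:
    """Return the next step for a colony (labels are illustrative only)."""
    if maldi_score >= MALDI_MIN_SCORE:
        return "identified by MALDI-TOF"
    if spectrum_is_clean and culture_is_pure:
        return "sequence 16S rRNA"
    # Assumption: the text does not say what happens otherwise;
    # repeating the measurement is a placeholder choice.
    return "repeat MALDI-TOF"


def is_candidate_new_genus(similarity_percent: float) -> bool:
    """True when the 16S rRNA similarity falls below the genus cutoff."""
    return similarity_percent < GENUS_SIMILARITY_CUTOFF


if __name__ == "__main__":
    print(identification_step(1.2, True, True))
    print(is_candidate_new_genus(93.8))
```

With the 93.8% similarity reported for strain GD1T in the Results, `is_candidate_new_genus` returns `True`.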

Morphological and biochemical characterization

Morphological characterization was first performed by Gram staining and by observing the motility of a fresh sample. For negative staining, bacteria were fixed in 2.5% glutaraldehyde, deposited on a carbon formvar film, incubated for one second on 1% ammonium molybdate, dried on blotting paper and finally observed using a TECNAI G20 transmission electron microscope (FEI Company, Limeil-Brevannes, France) at an operating voltage of 200 keV. Biochemical features, namely oxidase and catalase activity and the reactions of API 50CH, 20A and ZYM strips (Biomerieux, Marcy l’Etoile, France), were investigated according to the manufacturer’s instructions. Cellular fatty acids were analysed from two samples, each prepared with approximately 10 mg of bacterial biomass harvested from several culture plates. Fatty acid methyl esters were prepared as described (14) and GC/MS analyses were carried out as described before (15).

Antibiotic susceptibility

Antibiotic susceptibility was tested with the diffusion method according to the CASFM/EUCAST 2015 recommendations for fastidious anaerobes (16), using a 1 McFarland suspension on Wilkins Chalgren agar (Sigma Aldrich, Steinheim, Germany) supplemented with 5% sheep blood. Plates were incubated under anaerobic conditions at 37°C and read at 48 hours using the Sirscan system (i2a, Montpellier, France), with visual verification.

Genome sequencing, annotation and comparison

For DNA extraction prior to whole genome sequencing, the strain was grown on Columbia agar supplemented with 5% sheep blood (Biomerieux, Marcy l’Etoile, France) at 37°C in an anaerobic atmosphere. Bacteria grown on three Petri dishes were harvested and resuspended in 4 x 100 µL of TE buffer. Then, 200 µL of this suspension was diluted in 1 mL of TE buffer for lysis treatment, which included a 30-minute incubation with 2.5 µg/µL lysozyme at 37°C followed by an overnight incubation with 20 µg/µL proteinase K at 37°C (17). As previously described (10, 15, 18), our genomic platform uses a protocol that includes an overnight proteinase K incubation in order to digest contaminating proteins. Extracted DNA was then purified using three successive phenol-chloroform extractions and ethanol precipitation at -20°C overnight. After centrifugation, the DNA was resuspended in 160 µL of TE buffer. The whole genome was then sequenced on a MiSeq sequencer (Illumina Inc, San Diego, CA, USA) with the mate-pair strategy as previously described (18).

Open Reading Frames (ORFs) were predicted using Prodigal (19) with default parameters, but predicted ORFs were excluded if they spanned a sequencing gap region (i.e. contained N). The predicted bacterial protein sequences were searched against the Clusters of Orthologous Groups (COG) database using BLASTP (E-value 1e-03, coverage 0.7 and identity 30%). If no hit was found, a search was performed against the NR database using BLASTP with an E-value of 1e-03, a coverage of 0.7 and an identity of 30%. If the sequence length was smaller than 80 amino acids, we used an E-value of 1e-05. The tRNAscan-SE tool (20) was used to find tRNA genes, whereas ribosomal RNAs were found using RNAmmer (21). Lipoprotein signal peptides and the number of transmembrane helices were predicted using Phobius (22). ORFans were defined as sequences without a BLASTP hit to a known sequence, using an E-value threshold of 1e-03 for sequences longer than 80 amino acids and 1e-05 for sequences shorter than 80 amino acids. Such thresholds have already been used in previous work to define ORFans (23).

For each selected genome, the complete genome sequence, the proteome sequence and the ORFeome sequence were retrieved from the File Transfer Protocol (FTP) site of NCBI (24). All proteomes were analyzed with Proteinortho (25). Then, for each pair of genomes, a similarity score was computed. This score, the average genomic identity of orthologous gene sequences (AGIOS), is the mean nucleotide similarity between all pairs of orthologues shared by the two genomes studied (26). The entire proteome was annotated to define the distribution of the functional classes of predicted genes according to the clusters of orthologous groups of proteins (using the same method as for the genome annotation). Annotation and comparison were performed in the multi-agent software system DAGOBAH (27), which includes the Figenix libraries (28) that provide pipeline analysis.
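The BLASTP acceptance thresholds and the AGIOS score described in this section can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the DAGOBAH implementation: the function names are hypothetical, the thresholds are treated as inclusive (the text does not say whether they are strict), and the per-orthologue nucleotide identities are assumed to have been computed beforehand.

```python
from statistics import mean

MIN_COVERAGE = 0.7      # alignment coverage threshold from the text
MIN_IDENTITY = 30.0     # percent identity threshold from the text
SHORT_PROTEIN_AA = 80   # proteins shorter than this use the stricter E-value


def evalue_cutoff(protein_length: int) -> float:
    """E-value threshold: 1e-05 for short proteins, 1e-03 otherwise."""
    return 1e-05 if protein_length < SHORT_PROTEIN_AA else 1e-03


def keep_hit(evalue: float, coverage: float, identity_percent: float,
             protein_length: int) -> bool:
    """Accept a BLASTP hit only if it passes all three thresholds.

    Assumption: thresholds are inclusive.
    """
    return (evalue <= evalue_cutoff(protein_length)
            and coverage >= MIN_COVERAGE
            and identity_percent >= MIN_IDENTITY)


def agios(ortholog_identities: list[float]) -> float:
    """Mean nucleotide identity (%) over all orthologous gene pairs
    shared by two genomes."""
    if not ortholog_identities:
        raise ValueError("no orthologous pairs to average")
    return mean(ortholog_identities)


if __name__ == "__main__":
    print(keep_hit(1e-10, 0.9, 45.0, 200))   # long protein, strong hit
    print(keep_hit(1e-04, 0.9, 45.0, 60))    # short protein, E-value too high
    print(agios([70.0, 68.0, 66.0]))
```

Under this reading, a protein with no accepted hit in either COG or NR would be a candidate ORFan.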

Results

Classification and features

Type strain GD1T was first isolated after 10 days of anaerobic incubation of the stool sample in the presence of sheep blood, following thermal shock, and was then cultivated on Columbia agar supplemented with 5% sheep blood under anaerobic conditions. The MALDI-TOF spectrum of GD1T found no match in our database or in the Bruker database, despite the absence of background noise (Figures 1 and 2). Sequencing of the 16S rRNA gene yielded a complete sequence of 1,476 bp, and nucleotide BLAST results indicated Ruminococcus torques JCM6553 as the closest cultured species, at 93.8% similarity (Figure 3). This result allows us to define a new genus according to the thresholds delimited by Stackebrandt and Ebers (13). The 16S rRNA sequence of GD1T is available from the EBI Sequence Database under accession number LN828944. A gel view was produced to show the spectral differences between Drancourtella massiliensis and other closely related bacteria (Figure 2).

The culture conditions tested identified optimal growth at 37°C after 48 hours under anaerobic conditions, but slight growth appeared under microaerophilic conditions, suggesting a relative oxygen tolerance. The pH range for growth is 6.5 to 7.0 and the NaCl concentration must be below 10 g/L. Colonies of strain GD1T are approximately 2 mm in diameter, homogeneous, translucent, smooth and non-hemolytic. Cells are non-motile and spore-forming. Gram staining was positive and cells appeared as bacilli under the microscope (Figure 4). Electron microscopy shows small rods of about 500 nm (Figure 5). Classification and principal phenotypic features are presented in Table 1.

Catalase and oxidase reactions were negative. Using an API 50CH strip, positive reactions were observed for D-ribose, methyl-αD-glucopyranoside, D-turanose and potassium 5-ketogluconate. Negative reactions were observed for all others. Using an API 20A strip, all reactions were negative. Using an API ZYM strip, reactions were positive for leucine arylamidase, valine arylamidase, cystine arylamidase, naphthol-AS-BI-phosphohydrolase, β-galactosidase and N-acetyl-β-glucosaminidase, and negative for the others. Differences in characteristics compared with other representatives of the family Ruminococcaceae are detailed in Table 2. The major fatty acids detected (16:0 and 14:0) are saturated species. The GC/MS results indicate lower amounts of unsaturated acids and other saturated compounds (Table 3).

Concerning antibiotic susceptibility, strain GD1T was susceptible to amoxicillin, ticarcillin, cefepime, vancomycin, teicoplanin and linezolid but resistant to ceftriaxone and ceftazidime. Among aminoglycosides, cells were susceptible to gentamicin and resistant to tobramycin and amikacin. Strain GD1T was resistant to sulfamethoxazole, aztreonam, ciprofloxacin and ofloxacin, and susceptible to imipenem, fosfomycin, clindamycin, metronidazole, rifampicin and colistin.

Genomic characterization and comparison

The genome is 3,057,334 bp long with a G+C content of 45.24% (Table 4). It is composed of 7 scaffolds (composed of 7 contigs). Of the 2,925 predicted genes, 2,861 were protein-coding genes and 64 were RNA genes (two 5S rRNA genes, two 16S rRNA genes, three 23S rRNA genes and 57 tRNA genes). A total of 1,969 genes (68.82%) were assigned a putative function (by COGs or by NR BLAST). Seventy-eight genes (2.73%) were identified as ORFans. The remaining genes (703 genes, 24.57%) were annotated as hypothetical proteins. Table 5 summarizes the distribution of genes into COG functional categories. The genome sequence has been deposited in GenBank under accession number CVPG00000000.

The draft genome sequence of Drancourtella massiliensis (3.05 Mb) is smaller than those of Ruminococcus gnavus, Clostridium scindens, Coprococcus comes and Dorea formicigenerans (3.72, 3.62, 3.24 and 3.19 Mb respectively), but larger than that of Eubacterium ventriosum (2.87 Mb). The G+C content of Drancourtella massiliensis is lower than that of Clostridium scindens (45.24% versus 46.35%), but higher than those of Ruminococcus gnavus, Coprococcus comes, Dorea formicigenerans and Eubacterium ventriosum (42.52, 42.49, 40.97 and 34.92% respectively). The gene content of Drancourtella massiliensis (2,861) is smaller than those of Ruminococcus gnavus, Clostridium scindens, Coprococcus comes and Dorea formicigenerans (3,762; 3,995; 3,913 and 3,277 respectively), but larger than that of Eubacterium ventriosum (2,802). Figures 6 and 7 show that the distribution of genes into COG categories was similar in all compared genomes. Drancourtella massiliensis also shared 1,235, 922, 1,082, 1,202 and 1,266 orthologous proteins with Dorea formicigenerans, Eubacterium ventriosum, Coprococcus comes, Ruminococcus gnavus and Clostridium scindens respectively (Table 7). To evaluate the genomic similarity among the studied strains, we determined two parameters: dDDH, which exhibits a high correlation with DDH (Table 6) (29, 30), and AGIOS (Table 7) (26), which was designed to be independent from DDH.
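The percentages quoted at the start of this section can be recomputed from the raw counts. The short Python sketch below is only a consistency check: the counts are taken from the text and from Table 4, the helper name `percent` is hypothetical, and the use of the 2,861 protein-coding genes as the denominator for the gene categories is an assumption that happens to reproduce the figures given in the text.

```python
GENOME_SIZE_BP = 3_057_334   # genome size from the text
GC_BP = 1_383_137            # G+C base count from Table 4
PROTEIN_CODING = 2_861       # protein-coding genes from the text


def percent(part: int, whole: int) -> float:
    """Percentage of `part` in `whole`, rounded to two decimals."""
    return round(100.0 * part / whole, 2)


gc_content = percent(GC_BP, GENOME_SIZE_BP)        # 45.24
putative_function = percent(1_969, PROTEIN_CODING)  # 68.82
orfans = percent(78, PROTEIN_CODING)                # 2.73
hypothetical = percent(703, PROTEIN_CODING)         # 24.57

if __name__ == "__main__":
    print(gc_content, putative_function, orfans, hypothetical)
```

Note that Table 4 reports 67.32% for genes with function prediction; that value corresponds to 1,969 out of the 2,925 total genes rather than the 2,861 protein-coding genes.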

Conclusion

The anaerobic culture conditions used in culturomics permitted the culture of the first strain of the new genus Drancourtella. Taxono-genomic analyses confirmed the status of this species, Drancourtella massiliensis gen. nov., sp. nov.

Taxonomic and nomenclatural proposals

Description of Drancourtella gen. nov.

Drancourtella (Dran.cour.tel'la. N.L. gen. fem.). The name was chosen in honour of the French microbiologist Michel Drancourt (Université de la Méditerranée, Marseille, France). Gram-positive bacilli. Spore-forming and non-motile. Drancourtella massiliensis is anaerobic. Oxidase and catalase negative. Indole negative, nitrate reductase negative. Habitat is the human gut. Type species is Drancourtella massiliensis strain GD1T.

Description of Drancourtella massiliensis gen. nov., sp. nov.

Drancourtella massiliensis (mas.si.li.en'sis. L. fem. adj. massiliensis, of Massilia, the Latin name of Marseille, the city where the bacterium was isolated).

D. massiliensis strain GD1T (= CSUR P1506 = DSM 100357) formed translucent colonies 2 mm in diameter and was isolated from the stool of a healthy French person. Bacteria appeared as gram-positive bacilli 0.5 µm long. Metabolism was strictly anaerobic, and growth was optimal at 37°C. D. massiliensis was spore-forming and non-motile. Oxidase and catalase were negative. Positive reactions were observed for D-ribose, methyl-αD-glucopyranoside, D-turanose, potassium 5-ketogluconate, leucine arylamidase, valine arylamidase, cystine arylamidase, naphthol-AS-BI-phosphohydrolase, β-galactosidase and N-acetyl-β-glucosaminidase. Resistance to ceftriaxone, ceftazidime, tobramycin, amikacin, sulfamethoxazole, aztreonam, ciprofloxacin and ofloxacin was observed.

This strain exhibited a G+C content of 45.24%. Its 16S rRNA sequence was deposited in GenBank under accession number LN828944, and the whole genome shotgun sequence has been deposited in GenBank under accession number CVPG00000000.

Conflict of interest

The authors declare no conflict of interest.

Acknowledgements

The authors thank the Xegen Company (www.xegen.fr) for automating the genomic annotation process. This study was funded by the “Fondation Méditerranée Infection”. We also thank Karolina Griffiths for English reviewing and Claudia Andrieu for administrative assistance.

References

1. Hugon P, Dufour JC, Colson P, Fournier P-E, Sallah K, Raoult D. A comprehensive repertoire of prokaryotic species identified in human beings. Lancet Infect Dis. 2015 Oct;15(10):1211–9.

2. Faith JJ, Guruge JL, Charbonneau M, Subramanian S, Seedorf H, Goodman AL, et al. The long-term stability of the human gut microbiota. Science. 2013 Jul 5;341(6141):1237439.

3. Turnbaugh PJ, Ley RE, Hamady M, Fraser-Liggett CM, Knight R, Gordon JI. The human microbiome project. Nature. 2007 Oct 18;449(7164):804–10.

4. Lagier JC, Hugon P, Khelaifia S, Fournier PE, La Scola B, Raoult D. The rebirth of culture in microbiology through the example of culturomics to study human gut microbiota. Clin Microbiol Rev. 2015 Jan;28(1):237–64.

5. Collins MD, Lawson PA, Willems A, Cordoba JJ, Fernandez-Garayzabal J, Garcia P, et al. The phylogeny of the genus Clostridium: proposal of five new genera and eleven new species combinations. Int J Syst Bacteriol. 1994 Oct;44(4):812–26.

6. Hansen SGK, Skov MN, Justesen US. Two cases of Ruminococcus gnavus bacteremia associated with diverticulitis. J Clin Microbiol. 2013 Apr;51(4):1334–6.

7. Livaoğlu M, Yilmaz G, Kerimoğlu S, Aydin K, Karacal N. Necrotizing fasciitis with ruminococcus. J Med Microbiol. 2008 Feb;57(Pt 2):246–8.

8. Sucu N, Köksal I, Yilmaz G, Aydin K, Caylan R, Aktoz Boz G. [Liver abscess and infective endocarditis cases caused by Ruminococcus productus]. Mikrobiyoloji Bül. 2006 Oct;40(4):389–95.

9. Wang L, Christophersen CT, Sorich MJ, Gerber JP, Angley MT, Conlon MA. Increased abundance of Sutterella spp. and Ruminococcus torques in feces of children with autism spectrum disorder. Mol Autism. 2013;4(1):42.

10. Lagier JC, Elkarkouri K, Rivet R, Couderc C, Raoult D, Fournier P-E. Non contiguous-finished genome sequence and description of Senegalemassilia anaerobia gen. nov., sp. nov. Stand Genomic Sci. 2013;7(3):343–56.

11. Aghnatios R, Cayrou C, Garibal M, Robert C, Azza S, Raoult D, et al. Draft genome of Gemmata massiliana sp. nov, a water-borne Planctomycetes species exhibiting two variants. Stand Genomic Sci. 2015;10:120. doi: 10.1186/s40793-015-0103-0

12. Seng P, Abat C, Rolain JM, Colson P, Lagier JC, Gouriet F, et al. Identification of rare pathogenic bacteria in a clinical microbiology laboratory: impact of matrix-assisted laser desorption ionization-time of flight mass spectrometry. J Clin Microbiol. 2013 Jul;51(7):2182–94.

13. Stackebrandt E, Ebers J. Taxonomic parameters revisited: tarnished gold standards. Microbiol Today. 2006.

14. Sasser M. Bacterial Identification by Gas Chromatographic Analysis of Fatty Acids Methyl Esters (GC-FAME). MIDI. 2006;(Technical Note #101).

15. Dione N, Sankar SA, Lagier JC, Khelaifia S, Michele C, Armstrong N, et al. Genome sequence and description of Anaerosalibacter massiliensis sp. nov. New Microbes New Infect. 2016; in press.

16. European Committee on Antimicrobial Susceptibility Testing, Comité de l’Antibiogramme de la Société Française de Microbiologie. CASFM-EUCAST recommendations 2015 [Internet]. Available from: http://www.sfmmicrobiologie.org/UserFiles/files/casfm/CASFM_EUCAST_V1_2015.pdf

17. Sengüven B, Baris E, Oygur T, Berktas M. Comparison of Methods for the Extraction of DNA from Formalin-Fixed, Paraffin-Embedded Archival Tissues. Int J Med Sci. 2014 Mar 27;11(5):494–9.

18. Lagier JC, Bibi F, Ramasamy D, Azhar EI, Robert C, Yasir M, et al. Non contiguous-finished genome sequence and description of Clostridium jeddahense sp. nov. Stand Genomic Sci. 2014 May 1;9(3):1003–19.

19. Prodigal [Internet]. Available from: http://prodigal.ornl.gov/

20. Lowe TM, Eddy SR. tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res. 1997 Mar 1;25(5):955–64.

21. Lagesen K, Hallin P, Rødland EA, Staerfeldt H-H, Rognes T, Ussery DW. RNAmmer: consistent and rapid annotation of ribosomal RNA genes. Nucleic Acids Res. 2007;35(9):3100–8.

22. Bendtsen JD, Nielsen H, von Heijne G, Brunak S. Improved prediction of signal peptides: SignalP 3.0. J Mol Biol. 2004 Jul 16;340(4):783–95.

23. Daubin V, Ochman H. Bacterial genomes as new gene homes: the genealogy of ORFans in E. coli. Genome Res. 2004 Jun;14(6):1036–42.

24. The new NCBI Genomes FTP site is here! [Internet]. [cited 2016 Jan 22]. Available from: http://www.ncbi.nlm.nih.gov/news/08-26-2014-new-genomes-FTP-live/

25. Lechner M, Findeiss S, Steiner L, Marz M, Stadler PF, Prohaska SJ. Proteinortho: detection of (co-)orthologs in large-scale analysis. BMC Bioinformatics. 2011;12:124.

26. Ramasamy D, Mishra AK, Lagier JC, Padhmanabhan R, Rossi M, Sentausa E, et al. A polyphasic strategy incorporating genomic data for the taxonomic description of novel bacterial species. Int J Syst Evol Microbiol. 2014 Feb;64(Pt 2):384–91.

27. Gouret P, Paganini J, Dainat J, Louati D, Darbo E, Pontarotti P. Integration of Evolutionary Biology Concepts for Functional Annotation and Automation of Complex Research in Evolution: The Multi-Agent Software System DAGOBAH. In: Evolutionary Biology – Concepts, Biodiversity, Macroevolution and Genome Evolution. Springer Berlin Heidelberg; 2011. p. 71–87.

28. Gouret P, Vitiello V, Balandraud N, Gilles A, Pontarotti P, Danchin EGJ. FIGENIX: intelligent automation of genomic annotation: expertise integration in a new software platform. BMC Bioinformatics. 2005;6:198.

29. Auch AF, von Jan M, Klenk H-P, Göker M. Digital DNA-DNA hybridization for microbial species delineation by means of genome-to-genome sequence comparison. Stand Genomic Sci. 2010;2(1):117–34.

30. Meier-Kolthoff JP, Auch AF, Klenk H-P, Göker M. Genome sequence-based species delimitation with confidence intervals and improved distance functions. BMC Bioinformatics. 2013;14:60.

Figure legends

Figure 1. Reference mass spectrum from Drancourtella massiliensis strain GD1T. Spectra from 12 individual colonies were compared and a reference spectrum was generated.

Figure 2. Gel view comparing Drancourtella massiliensis to other phylogenetically close species. The gel view displays the raw spectra of loaded spectrum files arranged in a pseudo-gel-like look. The x-axis records the m/z value. The left y-axis displays the running spectrum number originating from subsequent spectra loading. The peak intensity is expressed by a gray scale scheme code. The color bar and the right y-axis indicate the relation between the color a peak is displayed with and the peak intensity in arbitrary units. Displayed species are indicated on the left.

Figure 3. Phylogenetic tree highlighting the position of Drancourtella massiliensis gen. nov., sp. nov. strain GD1T relative to other phylogenetically close type strains. Clusters are made according to Collins MD et al. (5). Sequences were aligned using CLUSTALW, and phylogenetic inferences were obtained with the Kimura two-parameter model using the neighbor-joining method with 1,000 bootstrap replicates, within MEGA6. The scale bar represents a 1% nucleotide sequence divergence.

Figure 4. Gram staining of Drancourtella massiliensis strain GD1T.

Figure 5. Transmission electron microscopy of Drancourtella massiliensis strain GD1T, using a TECNAI G20 (FEI) at an operating voltage of 200 keV. The scale bar represents 200 nm.

Figure 6. Graphical circular map of the chromosome. From outside to the center: genes on the forward strand colored by COG categories (only genes assigned to COG), genes on the reverse strand colored by COG categories (only genes assigned to COG), RNA genes (tRNAs green, rRNAs red), GC content and GC skew.

Figure 7. Distribution of functional classes of predicted genes according to the clusters of orthologous groups of proteins.
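The Kimura two-parameter model named in the Figure 3 legend estimates the evolutionary distance between two aligned sequences from the proportion of transitions (P) and transversions (Q), as d = -1/2 ln((1 - 2P - Q) sqrt(1 - 2Q)). The Python sketch below illustrates that formula only. The tree in Figure 3 was built with MEGA6, not with this code, and the function name and the choice to skip gapped or ambiguous positions are assumptions made for the example.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1: str, seq2: str) -> float:
    """Kimura two-parameter distance between two aligned DNA sequences.

    Positions where either sequence has a gap or an ambiguous base
    are skipped (an assumption of this sketch).
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable positions")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError when sequences are too divergent
    # for the model (argument <= 0).
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


if __name__ == "__main__":
    print(k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"))
```

For small divergences the distance is close to the raw proportion of differing sites, which is why a scale bar can be read as a percentage of nucleotide sequence divergence.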

Table 1: Classification and general features of Drancourtella massiliensis strain GD1T

Property                  Term
Current classification    Domain: Bacteria
                          Phylum: Firmicutes
                          Class: Clostridia
                          Order: Clostridiales
                          Family: Ruminococcaceae
                          Genus: Drancourtella
                          Species: Drancourtella massiliensis
                          Type strain: GD1T
Gram stain                Positive
Cell shape                Rod
Motility                  Non motile
Sporulation               Sporulating
Temperature range         Mesophilic
Optimum temperature       37°C

Table 2: Differential characteristics of Drancourtella massiliensis strain GD1T, Ruminococcus faecis Eg2T, Ruminococcus torques ATCC27756 and Ruminococcus lactaris ATCC29176. na: data not available.

Properties               D. massiliensis   R. faecis   R. torques   R. lactaris
Cell diameter (µm)       0.5               1.0         na           na
Oxygen requirement       anaerobic         anaerobic   anaerobic    anaerobic
Gram stain               +                 +           +            +
Salt requirement         <10 g/L           na          na           na
Motility                 -                 -           na           na
Endospore formation      +                 -           -            -
Indole                   -                 -           +            -
Production of
  Alkaline phosphatase   -                 +           na           na
  Catalase               -                 +           na           na
  Oxidase                -                 -           na           na
  Nitrate reductase      -                 -           +            -
  Urease                 -                 -           na           na
  β-galactosidase        -                 -           -            +
  N-acetyl-glucosamine   -                 -           +            +
Acid from
  L-Arabinose            -                 -           +            -
  Ribose                 +                 na          na           na
  Mannose                -                 -           +            -
  Mannitol               -                 -           na           na
  Saccharose             -                 -           +            -
  D-glucose              -                 +           -            -
  D-fructose             -                 na          na           na
  D-maltose              -                 +           +            -
  D-lactose              -                 +           +            -
Habitat                  Human gut         Human gut   Human gut    Human gut

Table 3: Total cellular fatty acid composition

Fatty acids     IUPAC name                      Mean relative % (a)
16:0            Hexadecanoic acid               39.5 ± 4.1
14:0            Tetradecanoic acid              21.4 ± 9.6
18:1n9          9-Octadecenoic acid             8.2 ± 5.2
16:1n7          9-Hexadecenoic acid             6.3 ± 0.9
18:0            Octadecanoic acid               6.2 ± 1.6
18:1n7          11-Octadecenoic acid            3.8 ± 0.5
13:0            Tridecanoic acid                3.3 ± 1.8
18:2n6          9,12-Octadecadienoic acid       3.2 ± 0.8
14:1n5          9-Tetradecenoic acid            2.7 ± 0.4
12:0            Dodecanoic acid                 1.4 ± 0.8
15:0            Pentadecanoic acid              1.4 ± 0.8
5:0 anteiso     2-methyl-butanoic acid          TR
15:0 anteiso    12-methyl-tetradecanoic acid    TR
15:1n5          10-Pentadecenoic acid           TR
C13:0 3-OH      3-hydroxy-Tridecanoic acid      TR
17:0            Heptadecanoic acid              TR

(a) Mean peak area percentage calculated from the analysis of FAMEs in 3 sample preparations ± standard deviation (n = 3); TR = trace amounts < 1 %

Table 4: Nucleotide content and gene count levels of the genome

                                    Genome (total)
Attribute                           Value        % of total (a)
Size (bp)                           3,057,334    100
G+C content (bp)                    1,383,137    45.24
Coding region (bp)                  2,772,730    90.7
Total genes                         2,925        100
RNA genes                           64           2.18
Protein-coding genes                2,861        97.81
Genes with function prediction      1,969        67.32
Genes assigned to COGs              1,763        61.62
Genes with peptide signals          245          8.56
Genes with transmembrane helices    664          23.2
Genes with Pfam domains             2,693        92

(a) The total is based on either the size of the genome in base pairs or the total number of protein coding genes in the annotated genome.

Table 5: Number of genes associated with the 25 general COG functional categories

Code   Value   % of total (a)   Description
J      152     5.31             Translation
A      0       0                RNA processing and modification
K      177     6.18             Transcription
L      115     4.02             Replication, recombination and repair
B      0       0                Chromatin structure and dynamics
D      23      0.8              Cell cycle control, mitosis and meiosis
Y      0       0                Nuclear structure
V      71      2.48             Defense mechanisms
T      56      1.96             Signal transduction mechanisms
M      95      3.32             Cell wall/membrane biogenesis
N      3       0.10             Cell motility
Z      0       0                Cytoskeleton
W      0       0                Extracellular structures
U      22      0.77             Intracellular trafficking and secretion
O      59      2.06             Post-translational modification, protein turnover, chaperones
C      98      3.43             Energy production and conversion
G      202     7.06             Carbohydrate transport and metabolism
E      227     7.93             Amino acid transport and metabolism
F      62      2.16             Nucleotide transport and metabolism
H      67      2.34             Coenzyme transport and metabolism
I      44      1.53             Lipid transport and metabolism
P      104     3.64             Inorganic ion transport and metabolism
Q      28      0.98             Secondary metabolites biosynthesis, transport and catabolism
R      237     8.28             General function prediction only
S      111     3.88             Function unknown
-      1098    38.38            Not in COGs

(a) The total is based on the total number of protein coding genes in the annotated genome

Table 6: Pairwise comparison of Drancourtella massiliensis with five other species using GGDC, formula 2 (DDH estimates based on identities / HSP length), upper right*

                             D. formicigenerans   E. ventriosum    C. comes         R. gnavus        C. scindens      D. massiliensis
Dorea formicigenerans        100% ± 00            32.8% ± 2.56     39.4% ± 2.70     25.6% ± 2.59     22.8% ± 2.62     22.9% ± 2.56
Eubacterium ventriosum                            100% ± 00        38.9% ± 2.56     32.2% ± 2.55     28% ± 2.54       25.5% ± 2.53
Coprococcus comes                                                  100% ± 00        23.1% ± 2.58     22.5% ± 2.56     21.5% ± 2.56
Ruminococcus gnavus                                                                 100% ± 00        25.7% ± 2.58     22.2% ± 2.56
Clostridium scindens                                                                                 100% ± 00        22.2% ± 2.57
Drancourtella massiliensis                                                                                            100% ± 00

*The confidence intervals indicate the inherent uncertainty in estimating DDH values from intergenomic distances based on models derived from empirical test data sets (which are always limited in size). These results are in accordance with the 16S rRNA (Figure 3) and phylogenomic analyses as well as the GGDC results.

Table 7: Numbers of orthologous proteins shared between genomes (upper right), average percentage similarity of nucleotides corresponding to orthologous proteins shared between genomes (lower left) and numbers of proteins per genome (diagonal).

                             D. formicigenerans   E. ventriosum   C. comes   R. gnavus   C. scindens   D. massiliensis
Dorea formicigenerans        3,277                987             1,194      1,234       1,337         1,235
Eubacterium ventriosum       66.46                2,802           870        954         946           922
Coprococcus comes            71.29                65.98           3,913      1,075       1,144         1,082
Ruminococcus gnavus          69.73                65.47           70.00      3,762       1,269         1,202
Clostridium scindens         70.87                63.52           68.76      69.15       3,995         1,266
Drancourtella massiliensis   68.07                63.73           68.96      69.31       68.68         2,861

Figure 1

Figure 2

Figure 3

Figure 4

Figure 5

Figure 6

Figure 7