Accepted Manuscript
Title: Complete genome sequence of Enterococcus faecalis LD33, a bacteriocin-producing strain
Authors: Yuehua Jiao, Lanwei Zhang, Fei Liu, Huaxi Yi, Xue Han
PII: S0168-1656(16)30210-3
DOI: http://dx.doi.org/doi:10.1016/j.jbiotec.2016.04.030
Reference: BIOTEC 7517
To appear in: Journal of Biotechnology
Received date: 10-4-2016
Accepted date: 13-4-2016
Please cite this article as: Yuehua, Jiao, Lanwei, Zhang, Fei, Liu, Huaxi, Yi, Xue, Han, Complete genome sequence of Enterococcus faecalis LD33, a bacteriocin-producing strain. Journal of Biotechnology http://dx.doi.org/10.1016/j.jbiotec.2016.04.030
Complete genome sequence of Enterococcus faecalis LD33, a bacteriocin-producing strain
Yuehua Jiao a,b, Lanwei Zhang a,*, Fei Liu c, Huaxi Yi a, Xue Han a
a Department of Food Science and Engineering, School of Chemical Engineering and Technology, Harbin Institute of Technology, Harbin 150090, PR China
b Center of Drug Safety Evaluation, Heilongjiang University of Chinese Medicine, Harbin 150040, PR China
c Key Laboratory of Dairy Science, Ministry of Education, Northeast Agricultural University, Harbin 150030, PR China
* Corresponding author at: Department of Food Science and Engineering, School of Chemical Engineering and Technology, Harbin Institute of Technology, No.73 Huanghe Road, Nangang District, Harbin 150090, China. Tel.: +86 451 86282901; fax: +86 451 86282906. E-mail address: [email protected] (L. Zhang)
Highlights
Whole-genome sequencing of E. faecalis LD33 was performed using the Illumina HiSeq and PacBio RSII platforms.
E. faecalis LD33 has more microcin clusters than other sequenced E. faecalis strains, indicating that this strain may have a competitive advantage in attacking pathogenic bacteria.
When exogenous heme was added, E. faecalis LD33 could respire and reach a high biomass because, according to in silico analysis, it possesses an aerobic respiratory chain.
Abstract: Enterococcus faecalis strain LD33 was originally isolated from traditional naturally fermented cream in Inner Mongolia, China. Its complete genome sequence was determined using the Illumina HiSeq and PacBio RSII platforms. The genome consists of a single circular chromosome with a GC content of 37.58%. Further information from the genome sequencing results gives insight into this bacterium's genetic elements for bacteriocin production and its genes related to the respiratory chain.
Keywords: Enterococcus faecalis; Genome; Bacteriocin; Respiration
Enterococci are lactic acid bacteria that belong to the gut microbiota of humans and animals and form part of the food microflora. These strains are capable of producing a variety of enterocins with activity against Gram-positive pathogenic bacteria and other food-borne pathogens. Most of the enterocins produced belong to the class II bacteriocins, which are small, heat-stable, non-lantibiotic peptides (Moreno et al., 2006). E. faecalis LD33 was isolated from traditional naturally fermented cream in China and identified by API 50CH strips and 16S rRNA gene sequence analysis (Jiao et al., 2013a). In a recent study, we showed that the bacteriocin produced by E. faecalis LD33 exhibited antimicrobial activity against Staphylococcus aureus and Shigella sonnei. In the presence of heme, E. faecalis LD33 could not only resist oxygen damage but might also use oxygen for aerobic respiratory metabolism (Jiao et al., 2013b), resulting in a significant increase in viable cell count, which may in turn increase bacteriocin production.
To mine the gene clusters encoding bacteriocins in E. faecalis LD33 and to explore the strain's full potential, whole-genome sequencing was performed using a combined strategy of Illumina paired-end sequencing and PacBio RSII sequencing. A total of 35,181 reads were de novo assembled using the hierarchical genome assembly process (HGAP) workflow (Chin et al., 2013), and the Illumina data were subsequently used to correct the contig with soapSNP and soapIndel. This resulted in a complete circular chromosome. Gene prediction was performed with Glimmer (Besemer et al., 2001). Gene annotation was carried out with the NCBI Prokaryotic Genome Annotation Pipeline (Pruitt et al., 2012). Genes involved in the biosynthesis
of bacteriocins were analyzed in silico using antiSMASH (http://antismash.secondarymetabolites.org/). Genomic islands and virulence factors were displayed with IslandViewer (http://www.pathogenomics.sfu.ca/islandviewer/browse/). Antibiotic resistance genes in E. faecalis LD33 were mined with the Antibiotic Resistance Genes Database (http://ardb.cbcb.umd.edu/).
The complete genome of E. faecalis LD33 is composed of a circular chromosome with a GC content of 37.58%. The chromosome of E. faecalis LD33 contains 2,643
CDSs, 77 RNAs and 61 pseudogenes (Table 1). All coding genes can be classified into clusters of orthologous groups (COG) (Tatusov et al., 2000). In detail, there are 88 genes for energy production and conversion, 157 genes for translation, ribosomal structure and biogenesis, 17 genes involved in cell cycle control, cell division and chromosome partitioning, 163 genes for amino acid transport and metabolism, 215 genes for carbohydrate transport and metabolism, 75 genes for nucleotide transport and metabolism, 49 genes for lipid transport and metabolism, 57 genes for coenzyme transport and metabolism, 130 genes for transcription, 154 genes for translation, ribosomal structure and biogenesis, 75 genes for cell wall/membrane/envelope biogenesis, 115 genes for replication, recombination and repair, 49 genes for posttranslational modification, protein turnover and chaperones, 5 genes for cell motility, 26 genes for secondary metabolite biosynthesis, transport and catabolism, 113 genes for inorganic ion transport and metabolism, 21 genes for intracellular trafficking, secretion and vesicular transport, 51 genes for signal transduction mechanisms, 39 genes for defense mechanisms, 161 genes of unknown function and 212 genes with general function prediction only.

Table 1 General genome features of E. faecalis LD33.
Feature                        Chromosome
Size [bp]                      2,803,429
GC content [%]                 37.58
Predicted genes                2,781
Protein coding genes (CDSs)    2,643
Pseudogenes                    61
rRNA operons                   4
tRNAs                          61
ncRNA                          4
Frameshifted genes             32
GenBank accession              CP014949
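The counts in Table 1 can be cross-checked against the figures quoted in the text with a few lines of Python. The sketch below is illustrative only and is not part of the authors' pipeline. It assumes that each rRNA operon contributes three rRNA genes (5S, 16S and 23S) and that the "predicted genes" total is the sum of CDSs, RNA genes and pseudogenes; neither assumption is stated in the manuscript, although both are consistent with the reported numbers. The helper `gc_content` and all variable names are hypothetical.

```python
# Values copied from Table 1 and the accompanying text.
TABLE_1 = {
    "predicted_genes": 2781,
    "cds": 2643,
    "pseudogenes": 61,
    "rrna_operons": 4,
    "trnas": 61,
    "ncrnas": 4,
}
TOTAL_RNAS_IN_TEXT = 77

# Assumption (not stated in the source): each rRNA operon carries
# one 5S, one 16S and one 23S rRNA gene.
RRNA_GENES_PER_OPERON = 3


def rna_gene_count(features):
    """Total RNA genes implied by the operon, tRNA and ncRNA counts."""
    rrna = features["rrna_operons"] * RRNA_GENES_PER_OPERON
    return rrna + features["trnas"] + features["ncrnas"]


def gene_total(features):
    """CDSs plus RNA genes plus pseudogenes (assumed definition of the total)."""
    return features["cds"] + rna_gene_count(features) + features["pseudogenes"]


def gc_content(sequence):
    """Percentage of G and C bases in a nucleotide string (case-insensitive)."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


if __name__ == "__main__":
    print("RNA genes:", rna_gene_count(TABLE_1))
    print("Gene total:", gene_total(TABLE_1))
    print("GC% of toy sequence ATGC:", gc_content("ATGC"))
```

Under these assumptions the RNA gene count matches the 77 RNAs given in the text and the component counts add up to the 2,781 predicted genes in Table 1. Applying `gc_content` to the deposited chromosome sequence (CP014949) would be the natural way to reproduce the reported 37.58%; that step is not run here.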
We compared the genome of E. faecalis LD33 obtained in this study with the available complete genome sequences of E. faecalis D32 (GenBank accession no. NC0182210) (Zischka et al., 2012), E. faecalis 62 (GenBank accession no. CP002491) (Solheim et al., 2009), E. faecalis ATCC29212 (GenBank accession no. CP008816) (Minogue et al., 2014) and E. faecalis V583 (GenBank accession no. NC004668.1) (Aakra et al., 2005). We mined the genome of E. faecalis LD33 for genetic elements associated with bacteriocin production. There are four microcin clusters, each composed of relatively few peptides. E. faecalis LD33 has more microcin clusters than other sequenced E. faecalis strains, indicating that this strain may have a competitive advantage in attacking pathogenic bacteria.
The genome of E. faecalis LD33 contains three multidrug resistance genes. A3777_13900 encodes a multidrug transporter that belongs to the major facilitator transporters and acts as a multidrug resistance efflux pump conferring resistance to fluoroquinolone.
A3777_04840 encodes undecaprenyl-phosphate alpha-N-acetylglucosaminyl 1-phosphate transferase, which is involved in the sequestration of undecaprenyl pyrophosphate and thereby in resistance to bacitracin. A3777_06475 encodes a multidrug ABC transporter ATP-binding protein, which confers resistance to MLS antibiotics (macrolide, lincosamide and streptogramin). However, these genes are not located in a genomic island, so they cannot be transferred by mobile elements. E. faecalis LD33 does not possess any virulence factors according to the in silico analysis; therefore, this strain may carry no hazardous factors of concern for industrial and economic use.
A3777_08935 encodes noxA and A3777_04465 encodes noxB; together these two genes form the membrane NADH dehydrogenase used as an electron donor. A3777_11130 encodes isochorismate synthase, A3777_11125 encodes 2-succinylbenzoate-CoA ligase, A3777_11135 encodes 2-succinyl-5-enolpyruvyl-6-hydroxy-3-cyclohexene-1-carboxylate synthase, A3777_11145 encodes o-succinylbenzoate synthase, A3777_11120 encodes 1,4-dihydroxy-2-naphthoyl-CoA synthase, A3777_08920 encodes 1,4-dihydroxy-2-naphthoate polyprenyltransferase and A3777_11140 encodes 2-succinyl-6-hydroxy-2,4-cyclohexadiene-1-carboxylate synthase. These genes constitute a complete pathway for the synthesis of menaquinones, which are used as an electron shuttle. A3777_04495 encodes cydA, A3777_04490 encodes cydB, A3777_04485 encodes cydC and A3777_04480 encodes cydD. These genes are essential for synthesizing cytochrome quinol oxidase, which must be activated by heme (Lechardeur et al., 2011); however, E. faecalis LD33 lacks some genes of the heme biosynthesis pathway. According to our recent study (Jiao et al., 2013b), when exogenous heme was added, E. faecalis LD33 could undergo respiration and achieve a high biomass, which must be because an aerobic respiratory chain had been established.
The complete chromosome sequence has been deposited in the GenBank database under accession number CP014949. This strain has been deposited at the China General Microbiological Culture Collection Center (CGMCC No.1.15424).
Acknowledgements: This research was supported by the National Natural Science Foundation of China (Grant No. 31401512) and the Scientific Research Fund of Heilongjiang Provincial Education Commission (Grant No. 12541768).
References
Aakra, Å., Vebø, H., Snipen, L., Hirt, H., Aastveit, A., Kapur, V., Dunny, G., Murray, B., Nes, I.F., (2005) Transcriptional response of Enterococcus faecalis V583 to erythromycin. Antimicrobial Agents and Chemotherapy 49, 2246-2259.
Besemer, J., Lomsadze, A., Borodovsky, M., (2001) GeneMarkS: a self-training method for prediction of gene starts in microbial genomes. Implications for finding sequence motifs in regulatory regions. Nucleic acids research 29, 2607-2618.
Chin, C.-S., Alexander, D.H., Marks, P., Klammer, A.A., Drake, J., Heiner, C., Clum, A., Copeland, A., Huddleston, J., Eichler, E.E., (2013) Nonhybrid, finished microbial genome assemblies from long-read SMRT sequencing data. Nature methods 10, 563-569.
Jiao, Y.H., Zhang, L.W., Liu, F., (2013a) Identification of a Lactic Acid Bacteria Strain from Traditional Dairy Products. Advanced Materials Research. Trans Tech Publ, pp. 1599-1602.
Jiao, Y.H., Zhang, L.W., Liu, F., (2013b) Screening of Lactic Acid Bacteria Strains with Respiration Ability in the Present of Heme. Advanced Materials Research. Trans Tech Publ, pp. 448-451.
Lechardeur, D., Cesselin, B., Fernandez, A., Lamberet, G., Garrigues, C., Pedersen, M., Gaudu, P., Gruss, A., (2011) Using heme as an energy boost for lactic acid bacteria. Current Opinion in Biotechnology 22, 143-149.
Minogue, T., Daligault, H., Davenport, K., Broomall, S., Bruce, D., Chain, P., Coyne, S., Chertkov, O., Freitas, T., Gibbons, H., (2014) Complete genome assembly of Enterococcus faecalis 29212, a laboratory reference strain. Genome announcements 2, e00968-00914.
Moreno, M.F., Sarantinopoulos, P., Tsakalidou, E., De Vuyst, L., (2006) The role and application of enterococci in food and health. International journal of food microbiology 106, 1-24.
Pruitt, K.D., Tatusova, T., Brown, G.R., Maglott, D.R., (2012) NCBI Reference Sequences (RefSeq): current status, new features and genome annotation policy. Nucleic acids research 40, D130-D135.
Solheim, M., Aakra, Å., Snipen, L.G., Brede, D.A., Nes, I.F., (2009) Comparative genomics of Enterococcus faecalis from healthy Norwegian infants. BMC genomics 10, 1.
Tatusov, R.L., Galperin, M.Y., Natale, D.A., Koonin, E.V., (2000) The COG database: a tool for genome-scale analysis of protein functions and evolution. Nucleic acids research 28, 33-36.
Zischka, M., Kuenne, C., Blom, J., Dabrowski, P.W., Linke, B., Hain, T., Nitsche, A., Goesmann, A., Larsen, J., Jensen, L.B., (2012) Complete genome sequence of the porcine isolate Enterococcus faecalis D32. Journal of bacteriology 194, 5490-5491.