Genome sequence of Geobacillus thermoglucosidasius DSM2542, a platform hosts for biotechnological applications with industrial potential


Accepted Manuscript

Title: Genome Sequence of Geobacillus thermoglucosidasius DSM2542, a Platform Hosts for Biotechnological Applications with Industrial Potential
Authors: Jingyu Chen, Zhengzhi Zhang, Caili Zhang, Bo Yu
PII: S0168-1656(15)30150-4
DOI: http://dx.doi.org/doi:10.1016/j.jbiotec.2015.10.002
Reference: BIOTEC 7271
To appear in: Journal of Biotechnology
Received date: 29-9-2015
Accepted date: 5-10-2015

Please cite this article as: Chen, Jingyu, Zhang, Zhengzhi, Zhang, Caili, Yu, Bo, Genome Sequence of Geobacillus thermoglucosidasius DSM2542, a Platform Hosts for Biotechnological Applications with Industrial Potential. Journal of Biotechnology http://dx.doi.org/10.1016/j.jbiotec.2015.10.002

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

Genome Sequence of Geobacillus thermoglucosidasius DSM2542, a Platform Hosts for Biotechnological Applications with Industrial Potential

Jingyu Chen1*, Zhengzhi Zhang2, Caili Zhang3, and Bo Yu3

1 Beijing Laboratory for Food Quality and Safety, College of Food Science and Nutritional Engineering, China Agricultural University, Beijing 100083, China
2 Linyi Municipal Environmental Protection Bureau, Linyi 276000, China
3 CAS Key Laboratory of Microbial Physiological and Metabolic Engineering, Institute of Microbiology, Chinese Academy of Sciences, Beijing 100101, China

*Corresponding author. Email: [email protected] Phone: +86-10-62737966; Fax: +86-10-62737078


Highlights

1. Geobacillus thermoglucosidasius DSM2542 can ferment a wide range of substrates, has low nutrient requirements, and grows between 40°C and 70°C.
2. Its success in bioethanol production and its advantageous properties make G. thermoglucosidasius DSM2542 a potential industrial workhorse.
3. The first released genome sequence of G. thermoglucosidasius DSM2542 may facilitate the design of rational strategies for further strain improvement.
4. The availability of the genome sequence also provides information for exploring industrially interesting thermotolerant enzymes.


ABSTRACT

Thermophilic Geobacillus thermoglucosidasius can ferment a wide range of substrates and has low nutrient requirements for growth. Here we report the first released complete genome sequence of G. thermoglucosidasius DSM2542, which may facilitate the design of rational strategies for further strain improvement and provide information for exploring industrially interesting thermotolerant enzymes.

Keywords: Geobacillus thermoglucosidasius, genome sequence, thermophilic


The genus Geobacillus includes thermophilic Gram-positive spore-forming bacteria that form a phylogenetically coherent clade within the family Bacillaceae. These bacteria were previously categorized as ‘Group 5’ within the genus Bacillus but were subsequently split into the new genus Geobacillus (Nazina et al., 2001). Species in the genus Geobacillus are capable of growth between 40°C and 70°C. The advantages of using thermophilic bacteria as whole-cell biocatalysts include a reduced risk of contamination and faster biochemical processes during fermentation. Additionally, many glycolytic thermophiles have low nutrient requirements and can use polymeric or short oligomeric carbohydrates to generate lactate, formate, acetate and ethanol as products. Thus, Geobacillus spp. are of great interest for biotechnology as sources of thermostable enzymes and as platform hosts for producing natural products from lignocellulosic biomass (Niehaus et al., 1999; Cripps et al., 2009; Taylor et al., 2009).

Geobacillus thermoglucosidasius is a facultatively anaerobic, rod-shaped, Gram-positive, endospore-forming bacterium (Nazina et al., 2001). It ferments a wide range of substrates, utilizing both cellobiose and pentose sugars. In the context of bioethanol production, there are the additional advantages of reduced cooling costs and easier removal and recovery of the volatile product by sparging or partial vacuum, which also avoids ethanol poisoning of the bacteria (Taylor et al., 2009). G. thermoglucosidasius DSM2542 has been engineered and exploited for industrial bioethanol production from lignocellulosic feedstocks (Cripps et al., 2009). This is primarily due to its rapid growth rate and its ability to ferment a broad range of monosaccharides, cellobiose and short-chain oligosaccharides (particularly xylans); importantly, this strain is also amenable to genetic manipulation. The success already achieved in engineered bioethanol production, together with these advantageous properties, makes G. thermoglucosidasius DSM2542 a potential workhorse for the production of other value-added chemicals, such as organic acids and amino acids. The availability of genomic data should help to remove the barriers to greater exploitation of this thermophilic species. However, genome sequences are not yet publicly available for such readily transformable strains (Thompson et al., 2008). We therefore sequenced and analyzed the genome of G. thermoglucosidasius DSM2542 to provide a genetic basis for exploring the production of other target chemicals by systems metabolic engineering.

Genomic DNA from G. thermoglucosidasius DSM2542 was extracted using the QIAamp DNA Mini Kit (Qiagen, CA). The quantity and quality of the genomic DNA were evaluated on an Agilent 2100 Bioanalyzer (Agilent, US). The genomic DNA was used to construct a 10 kb insert SMRTbell library, which was sequenced on a Pacific Biosciences (PacBio) RS II single molecule real-time (SMRT) sequencer (Pacific Biosciences, CA) (Eid et al., 2009). One SMRT cell with a 3 h movie time yielded 150,292 polymerase reads totaling 1,088,091,659 nucleotide bases. After filtering to remove reads with accuracy values below 0.8, 912,529,533 read bases remained, with a read quality of 0.847. All of the filtered sequences were assembled de novo using the RS hierarchical genome assembly process (HGAP) assembly protocol 2.0 in SMRT Analysis software version 2.3.0 (Pacific Biosciences) (Chin et al., 2013). The complete circular chromosome is 3,873,116 bp in length with a G + C content of 43.9%. Annotation was performed with the NCBI Prokaryotic Genome Annotation Pipeline (PGAP), resulting in the prediction of 3,818 genes, including 3,576 coding sequences (CDSs), 90 tRNA and 27 rRNA (5S rRNA, 16S rRNA, and 23S rRNA) sequences (Table 1).

The genome sequence could serve as a basis for further elucidation of the genetic background of this promising strain, and it provides significant opportunities for investigating the metabolic and regulatory mechanisms underlying the formation of ethanol, organic acids, amino acids and other products. It may also facilitate the identification of suitable target genes that can assist the development of superior microbial cell factories with higher optical purity, concentration, yield, and productivity by systems metabolic engineering. Additionally, the availability of Geobacillus genome sequences has enabled or accelerated the discovery, cloning and exploitation of enzymes with biotechnological potential. In G. thermoglucosidasius DSM2542, several genes encoding industrially important enzymes, such as lipase/esterase (Schmidt-Dannert et al., 1998), hydrolase (Bartosiak-Jentys et al., 2013) and protease (Chen et al., 2004), were annotated, which may provide additional information for the exploration of such enzymes from thermophiles.
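The sequencing yield figures quoted in the text can be combined into a few derived quantities that the manuscript does not report itself. The following Python sketch is illustrative only: the function name and dictionary keys are hypothetical, and the fold coverage is simply filtered bases divided by chromosome length, which is an approximation rather than a value stated by the authors.

```python
# Figures quoted in the text. The derived values below are illustrative
# and are NOT reported by the authors.
POLYMERASE_READS = 150_292
RAW_BASES = 1_088_091_659
FILTERED_BASES = 912_529_533
GENOME_SIZE_BP = 3_873_116


def sequencing_summary(reads, raw_bases, filtered_bases, genome_size_bp):
    """Return simple derived statistics for a single sequencing run."""
    return {
        # Mean length of a polymerase read before any filtering.
        "mean_polymerase_read_length": raw_bases / reads,
        # Share of raw bases that survived the accuracy filter.
        "fraction_bases_retained": filtered_bases / raw_bases,
        # Naive coverage estimate: filtered bases per genome position.
        "approx_fold_coverage": filtered_bases / genome_size_bp,
    }


summary = sequencing_summary(
    POLYMERASE_READS, RAW_BASES, FILTERED_BASES, GENOME_SIZE_BP
)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

Under these assumptions the run corresponds to a mean polymerase read length of roughly 7.2 kb, with about 84% of bases retained after filtering and an approximate 236-fold coverage of the 3,873,116 bp chromosome.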

Nucleotide sequence accession number

The complete genome sequence of G. thermoglucosidasius DSM2542 was deposited in GenBank under the accession number CP012712.
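Once the deposited sequence has been downloaded in FASTA format, the genome size and G + C content can be recomputed directly from it. The sketch below is a minimal, standard-library illustration: the helper names are hypothetical, and the demonstration uses a toy record rather than the real CP012712 sequence.

```python
from io import StringIO


def read_fasta_sequence(handle):
    """Concatenate the sequence lines of the first record in a FASTA stream."""
    parts = []
    seen_header = False
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if seen_header:
                break  # stop at the start of a second record
            seen_header = True
            continue
        parts.append(line.upper())
    return "".join(parts)


def gc_percent(sequence):
    """Return the percentage of G and C bases in a nucleotide sequence."""
    if not sequence:
        raise ValueError("empty sequence")
    gc_count = sequence.count("G") + sequence.count("C")
    return 100.0 * gc_count / len(sequence)


# Toy record for demonstration only; it is not real genome data.
toy_fasta = StringIO(">toy_record\nATGCGC\nTTAAGG\n")
seq = read_fasta_sequence(toy_fasta)
print(len(seq), gc_percent(seq))
```

To apply it to the real genome, one would pass an open file handle for the downloaded FASTA file instead of the `StringIO` object and compare the results with the values listed in Table 1.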

Authors’ contributions

Dr. J Chen contributed to study design and coordination of the work. Dr. ZZ Zhang participated in data analysis. CL Zhang prepared the DNA for sequencing. Drs. J Chen and B Yu were responsible for article preparation. All the authors have approved the manuscript and agree with its submission to Journal of Biotechnology.

Acknowledgments

The work was partly supported by grants from the Beijing Higher Education Young Elite Teacher Project (YETP0310), the Chinese Universities Scientific Fund (2015SP003), the National Natural Science Foundation of China (21466007) and the Project of Guangxi Provincial Science & Technology Development, China (14125008-2-22).

References

Bartosiak-Jentys, J., Hussein, A.H., Lewis, C.J., Leak, D.J., 2013. Modular system for assessment of glycosyl hydrolase secretion in Geobacillus thermoglucosidasius. Microbiology 159, 1267-1275.
Chen, X.G., Stabnikova, O., Tay, J.H., Wang, J.Y., Tay, S.T.L., 2004. Thermoactive extracellular proteases of Geobacillus caldoproteolyticus, sp. nov., from sewage sludge. Extremophiles 8, 489-498.
Cripps, R.E., Eley, K., Leak, D.J., Rudd, B., Taylor, M., Todd, M., Boakes, S., Martin, S., Atkinson, T., 2009. Metabolic engineering of Geobacillus thermoglucosidasius for high yield ethanol production. Metab. Eng. 11, 398-408.
Nazina, T.N., Tourova, T.P., Poltaraus, A.B., Novikova, E.V., Grigoryan, A.A., Ivanova, A.E., Lysenko, A.M., Petrunyaka, V.V., Osipov, G.A., Belyaev, S.S., Ivanov, M.V., 2001. Taxonomic study of aerobic thermophilic bacilli: descriptions of Geobacillus subterraneus Gen. Nov., Sp. Nov. and Geobacillus uzenensis Sp. Nov. from petroleum reservoirs and transfer of Bacillus stearothermophilus, Bacillus thermocatenulatus, Bacillus thermoleovorans, Bacillus kaustophilus, Bacillus thermodenitrificans to Geobacillus as the new combinations G. stearothermophilus, G. thermoglucosidasius. Int. J. Syst. Evol. Microbiol. 51, 433-446.
Niehaus, F., Bertoldo, C., Kähler, M., Antranikian, G., 1999. Extremophiles as a source of novel enzymes for industrial application. Appl. Microbiol. Biotechnol. 51, 711-729.
Schmidt-Dannert, C., Pleiss, J., Schmid, R.D., 1998. A toolbox of recombinant lipases for industrial applications. Ann. N. Y. Acad. Sci. 864, 14-22.
Taylor, M.P., Eley, K.L., Martin, S., Tuffin, M.I., Burton, S.G., Cowan, D.A., 2009. Thermophilic ethanologenesis: future prospects for second-generation bioethanol production. Trends Biotechnol. 27, 398-405.
Thompson, A.H., Studholme, D.J., Green, E.M., Leak, D.J., 2008. Heterologous expression of pyruvate decarboxylase in Geobacillus thermoglucosidasius. Biotechnol. Lett. 30, 1359-1365.
Chin, C.S., Alexander, D.H., Marks, P., Klammer, A.A., Drake, J., Heiner, C., Clum, A., Copeland, A., Huddleston, J., Eichler, E.E., Turner, S.W., Korlach, J., 2013. Nonhybrid, finished microbial genome assemblies from long-read SMRT sequencing data. Nature Methods 10, 563-569.
Eid, J., Fehr, A., Gray, J., Luong, K., Lyle, J., Otto, G., Peluso, P., Rank, D., Baybayan, P., Bettman, B., Bibillo, A., Bjornson, K., Chaudhuri, B., Christians, F., Cicero, R., Clark, S., Dalal, R., Dewinter, A., Dixon, J., Foquet, M., Gaertner, A., Hardenbol, P., Heiner, C., Hester, K., Holden, D., Kearns, G., Kong, X., Kuse, R., Lacroix, Y., Lin, S., Lundquist, P., Ma, C., Marks, P., Maxham, M., Murphy, D., Park, I., Pham, T., Phillips, M., Roy, J., Sebra, R., Shen, G., Sorenson, J., Tomaney, A., Travers, K., Trulson, M., Vieceli, J., Wegener, J., Wu, D., Yang, A., Zaccarin, D., Zhao, P., Zhong, F., Korlach, J., Turner, S., 2009. Real-time DNA sequencing from single polymerase molecules. Science 323, 133-138.

Table captions

Table 1 Features of the genome of Geobacillus thermoglucosidasius DSM2542.

Feature                        Value
Genome size (bp)               3,873,116
GC content (%)                 43.9
Total number of genes          3,818
Protein coding genes (CDSs)    3,576
rRNA genes (5S, 16S, 23S)      27 (7, 10, 10)
tRNA genes                     90
ncRNA                          0
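The counts in Table 1 can be cross-checked with simple arithmetic. The Python sketch below uses hypothetical names and only the values listed in the table. It confirms that the three rRNA classes sum to the stated total and reports how many of the 3,818 predicted genes are not covered by the listed categories. The table does not say what those remaining genes are; treating them as pseudogenes or other features would be an assumption.

```python
# Values transcribed from Table 1; the dictionary layout is illustrative.
TABLE_1 = {
    "total_genes": 3818,
    "cds": 3576,
    "rrna_by_class": {"5S": 7, "16S": 10, "23S": 10},
    "rrna_total": 27,
    "trna": 90,
    "ncrna": 0,
}


def check_gene_counts(table):
    """Compare the listed gene categories against the stated totals."""
    rrna_sum = sum(table["rrna_by_class"].values())
    categorized = table["cds"] + rrna_sum + table["trna"] + table["ncrna"]
    return {
        "rrna_sum_matches": rrna_sum == table["rrna_total"],
        "categorized_genes": categorized,
        # Genes counted in the total but not assigned to a listed category.
        "uncategorized_genes": table["total_genes"] - categorized,
    }


report = check_gene_counts(TABLE_1)
print(report)
```

With the table values, the listed categories account for 3,693 genes, leaving 125 genes that the table does not break down further.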
