Natural Products from Mammalian Gut Microbiota


TIBTEC 1714 No. of Pages 13

Review

Leli Wang,1,4 Vinothkannan Ravichandran,2,4 Yulong Yin,1,3 Jia Yin,1,2,* and Youming Zhang2,*

The mammalian gut has a remarkable abundance of microbes. These microbes have strong potential to biosynthesize distinct metabolites that are promising drugs, and many more bioactive compounds have yet to be explored as potential drug candidates. These small bioactive molecules often mediate important host–microbe and microbe–microbe interactions. In this review, we provide perspectives on and challenges associated with three mining strategies – culture-based, (meta)genomics-based, and metabolomics-based mining approaches – for discovering natural products derived from biosynthetic gene clusters (BGCs) in mammalian gut microbiota. In addition, we comprehensively summarize the structures, biological functions, and BGCs of these compounds. Improving these techniques, including by using combinatorial approaches, may accelerate drug discovery from gut microbes.

Highlights

The gut microbiota has strong potential to biosynthesize large amounts of structurally distinct metabolites with a variety of biological activities that are promising drugs and drug candidates.

Only a fraction of the gut microbiota has been cultivated so far, which makes it all the more important to investigate microbial ‘dark matter’, a promising source for drug discovery.

One emerging approach is to develop metabolomics data sets and to construct metabolic network models by machine learning for both clinics and laboratories.

Natural Products from Mammalian Gut Microbiota – a Boon or a Bane?

Endothermic mammals need a nutritious and sizeable diet and are either carnivorous or herbivorous/omnivorous. The diversity of their habitats and the complexity of the mammalian digestive tract create a rich niche for commensal microorganisms. Symbiotic relationships exist between microorganisms and their hosts, and the fitness of the microbe–host system depends primarily on a diverse set of molecular interactions between the symbiotic partners [1], including colonization resistance against host-specific pathogens [2] and mediators of cell–cell communication [3]. For instance, microbiota-derived butyrate and niacin inhibit inflammation of the intestinal tract, preventing the development of colitis and colitis-associated colorectal cancer [4]. A better understanding of the mechanisms behind these interactions has led to greater interest in the corresponding secondary metabolites [5].

Gut microbes produce natural products in two distinct ways. The first yields metabolites that the gut microbiota produces from dietary components, such as tryptophan metabolites [6], short-chain fatty acids [7], and oligosaccharides [8]. The second yields the largely uncharacterized end products of the unique biosynthetic gene clusters (BGCs) (see Glossary) of the gut microbes. The past three decades have witnessed great progress in the discovery of BGC-derived natural products from the mammalian gut microbiota (Table 1). These natural products have potentially novel bioactive functions [9]; examples include the antibiotic microcin M/H47 [10], the genotoxin colibactin [11], the cytotoxic compound tilivalline [12], and the protease-inhibiting dipeptide aldehydes [13]. These distinct secondary metabolites play vital roles in understanding the gut microbiota and in developing the pharmaceutical, agricultural, and food industries.

Considering the rapid increase in gut microbiome sequencing and research, we characterize BGCs from mammals and outline three mining strategies for functional bacterial metabolites, with a special emphasis on drug discovery, to create a foundation for exploring this fascinating resource and to accelerate the discovery of natural products from the mammalian gut microbiota.

Trends in Biotechnology, Month Year, Vol. xx, No. yy

The boundary between genome/metagenomics-based, culture-based, and metabolomics-based mining approaches is fading, and it is becoming feasible to combine two or more strategies to accelerate the mining of bioactive metabolites in the future.

1 Laboratory of Animal Nutrition and Human Health, Hunan International Joint Laboratory of Animal Intestinal Ecology and Health, College of Life Science, Hunan Normal University, 410081, Changsha, China
2 Shandong University-Helmholtz Institute of Biotechnology, State Key Laboratory of Microbial Technology, Suzhou Institute of Shandong University, 266235, Qingdao, China
3 Hunan Provincial Key Laboratory of Animal Nutritional Physiology and Metabolic Process; Key Laboratory of Agro-ecological Processes in Subtropical Region, Institute of Subtropical Agriculture, Chinese Academy of Sciences; Hunan Provincial Engineering Research Center for Healthy Livestock and Poultry Production; Scientific

https://doi.org/10.1016/j.tibtech.2018.10.003 © 2018 Elsevier Ltd. All rights reserved.


BGC-Derived Natural Products

Most BGC-derived natural products from the mammalian gut microbiota characterized so far are ribosomally synthesized and post-translationally modified peptides (RiPPs), a diverse class of natural products of ribosomal origin [14]. The structurally diverse RiPPs synthesized by gut microbiota are grouped into four categories. First, lantibiotics, such as ruminococcin A, are peptide antibiotics containing fewer than 40 amino acids with post-translationally formed crosslinks between the terminal thiol of a cysteine residue and a dehydrated serine or threonine (Figure 1A) [15]. Second, bacteriocins, a class of narrow-spectrum antibiotics, are proteinaceous toxins expressed by bacteria to restrain the growth of similar or closely related bacterial species. Third, microcins, synthesized by members of the Enterobacteriaceae family, are typical narrow-spectrum antibiotics with a series of unique post-translational modifications in their peptide chain, such as the internal amide crosslink that gives microcin J25 its lasso-like topology (Figure 1A) [16]. Fourth, thiazole/oxazole-modified microcins (TOMMs), such as clostridiolysin S, are post-translationally modified compounds containing azole and azoline heterocycles (Figure 1A) [17].

Polyketides (PKs) and nonribosomal peptides (NRPs) are key natural products because they are the precursors of numerous therapeutic molecules [18]. Most PKs and NRPs have been isolated from soil or aquatic microbes, and only a few are derived from the mammalian gut microbiota. Figure 1B illustrates the complex chemical structures of BGC-derived PKs/NRPs. BGC-derived natural products are often toxic to a limited set of species, which may play an important role in niche colonization by their producers [5]. BGCs typically contain genes related to immunity and/or transport. These immunity and transport genes should therefore be considered when mining the genomes of mammalian gut microbiota.

Since BGC-derived natural products from the mammalian gut microbiota appear to affect the physiology of the hosts that the microbiota colonize [13], it should be possible to find other bioactive molecules in this niche and to develop them as potential drug candidates.
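The advice to look at immunity and transport genes when mining genomes can be sketched as a simple ranking step. The snippet below is only an illustration under stated assumptions: the `Gene` and `Cluster` records, the role labels, and the example clusters are hypothetical and are not taken from any annotation tool or from the clusters reviewed here.

```python
from dataclasses import dataclass, field


@dataclass
class Gene:
    name: str
    role: str  # hypothetical labels: "precursor", "modifying", "immunity", "transport", "regulator"


@dataclass
class Cluster:
    label: str
    genes: list = field(default_factory=list)


def has_self_protection(cluster):
    """True if the cluster carries at least one immunity or transport gene."""
    roles = {gene.role for gene in cluster.genes}
    return bool(roles & {"immunity", "transport"})


def rank_candidates(clusters):
    """Order clusters so that those with immunity/transport genes come first."""
    # sorted() is stable, so the original order is kept within each group.
    return sorted(clusters, key=lambda c: not has_self_protection(c))


# Hypothetical example input, for illustration only.
candidates = [
    Cluster("cluster_1", [Gene("g1", "precursor"), Gene("g2", "modifying")]),
    Cluster("cluster_2", [Gene("g3", "precursor"), Gene("g4", "immunity"),
                          Gene("g5", "transport")]),
]
ranked = rank_candidates(candidates)
```

A ranking like this only prioritizes candidates for closer inspection; it says nothing about whether a cluster is expressed or what its product does.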

Culture-Based Mining Approach

Pure culturing is a critical step in the identification and utilization of BGC-derived natural products [12,19–22]. The isolation of bacteria from any origin, followed by growth in broths, organic extraction, and bioactivity screening, is a conventional approach in drug discovery. Culture conditions such as broth composition, pH, and temperature can be varied to improve the yield. Recent advances in this approach include cultivating previously uncultured bacteria and activating silent gene clusters. A schematic diagram of the culture-based mining approach is provided in Figure 2, Key Figure. Considering that distinct RiPP and PK/NRP clusters can produce molecules with different bioactivities, isolates can be selected for specific functions, such as activity against enteric pathogens or toxicity to mammalian cells (Figure 2A II). In this section, we present the challenges of isolating the diverse mammalian gut microbiota and the techniques currently used to do so.

The colonization and growth of the microbiota face a range of selective pressures, which can be ascribed to abiotic perturbations and biotic factors. With respect to abiotic perturbations, growing fastidious bacteria in vitro requires dynamic physical and chemical gradients, including mammalian body temperatures, suitable pH, oxygen-free conditions, and nutrient sources [23]. Walsh and colleagues designed a microfabricated device that emulated the steep oxygen gradient along the colon wall


Observing and Experimental Station of Animal Nutrition and Feed Science in South-Central, Ministry of Agriculture, 410125, Changsha, China 4 These authors contributed equally to this work

*Correspondence: [email protected] (J. Yin) and [email protected] (Y. Zhang).


[24]. Inventive in vitro model culture systems that resemble the conditions of the human gut are being studied and developed, such as the Simulator of the Human Intestinal Microbial Ecosystem (SHIME) [25].

Recent research on biotic factors suggests that interbacterial communication occurs extensively among gut bacteria [26]. The interspecies relationships that limit the use of the pure-culture technique are primarily inhibition by competing bacteria and promotion by symbiotic microorganisms. The growth of some bacteria may be inhibited by nutrient-rich media containing natural materials, which are typically used to cultivate fast-growing pathogens, or by neighboring bacteria in mixed cultures [27]. It is not surprising that these microorganisms compete for nutrition and living space, whether in vivo or in vitro. Boedicker and colleagues diluted mixed samples to single cells before cultivation, which decreased the potential competition stress [27]. Another strategy is to design selective media that allow the growth of target microorganisms while suppressing the growth of other microorganisms on the medium. For example, Kawanishi and colleagues developed a new detection system for bacteria using selective media, a carbon source, and antimicrobials to isolate Burkholderia glumae [28]. In contrast, some symbiotic microorganisms depend on the metabolic activity of neighboring bacteria for survival and growth, and are therefore difficult to separate from a mixed culture. Attention has also been focused on the cultivation and isolation of single colonies of bacteria whose growth must be facilitated by other bacteria. For example, Tanaka and Benno used a soft agar coculture technique that employed soft agar layers divided by a membrane filter with a 0.02-μm pore size to isolate previously uncultured strains [29].

In addition to diversified culture conditions, identification methodologies such as matrix-assisted laser desorption/ionization-time-of-flight mass spectrometry (MALDI–TOF MS) and 16S rRNA sequencing promote the wide application of culturomics [30]. After specific bacteria have been isolated using these methods, BGCs with specific functions can be confirmed by random mutagenesis or by constructing genomic libraries (Figure 2A III–VI). A random mutagenesis library can be constructed using a Tn5 transposon (a DNA sequence that can move from one genomic location to another) or chemical mutagenesis. Mutants are selected by the change in a specific phenotype, such as the cytotoxicity of tilivalline [12] or the genotoxic activity of colibactin [11]. The insertion site can then be located by sequencing the DNA adjacent to the mutation. For BGCs of known sequence, site-directed mutagenesis using a PCR-based one-step inactivation method and an in vivo complementation study were used to discover microcin M [10].

Two strategies are used to target bacteria with known BGC sequences: in situ expression and heterologous expression. In situ expression refers to the expression of the BGC in its original host. Dornisch and colleagues identified NpsA and NpsB as necessary for tilivalline production based on Tn5 transposon mutagenesis of a wild-type strain and the sequencing of four toxin-deficient mutants [12] (Figure 2A VI a). One of the major challenges in natural product discovery is that some BGCs encode cryptic pathways whose transcription is either silent or too low under laboratory conditions [31]. For in situ expression, artificially adding a strong promoter and deleting inhibitory factors can induce high-level expression and enhance the synthesis of the natural product under certain circumstances. A recombineering system based on host-specific phage proteins is required for in situ expression; it can be developed following the approaches used for Plu2670 of Photorhabdus [32] and the glidopeptins/rhizomides of Burkholderiales [33]. Recently, Zalatan developed synthetic bacterial transcriptional activators by linking activation domains to programmable CRISPR-Cas DNA binding domains [34], which have great potential for the in situ expression of cryptic pathways.
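The dilution-to-single-cell strategy mentioned above lends itself to a short worked calculation. The sketch below assumes that cells are distributed into compartments (wells or droplets) independently, so that the number of cells per compartment follows a Poisson distribution with mean `lam`; this statistical model is an assumption made here for illustration and is not described in the cited work.

```python
import math


def poisson_pmf(k, lam):
    """Probability that a compartment receives exactly k cells at mean loading lam."""
    return lam ** k * math.exp(-lam) / math.factorial(k)


def single_cell_fraction(lam):
    """Fraction of occupied compartments that hold exactly one cell."""
    if lam <= 0:
        raise ValueError("mean loading must be positive")
    occupied = 1.0 - poisson_pmf(0, lam)
    return poisson_pmf(1, lam) / occupied


# Compare a few mean loadings (cells per compartment).
for lam in (1.0, 0.3, 0.1):
    empty = poisson_pmf(0, lam)
    clonal = single_cell_fraction(lam)
    print(f"lam={lam}: empty={empty:.3f}, clonal among occupied={clonal:.3f}")
```

Under this model, stronger dilution raises the share of occupied compartments that start from a single cell, at the price of leaving most compartments empty, which is the usual trade-off when choosing a dilution factor.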

Glossary

Biosynthetic gene cluster (BGC): group of genes in a genome that encodes a biosynthetic pathway to produce a specialized metabolite in bacteria, fungi, and plants.

Cas9-assisted targeting of chromosome (CATCH): technology in which the targeting fragment is digested from bacterial chromosomes in vitro by the RNA-guided Cas9 nuclease at two designated loci and cloned to the vector by Gibson assembly.

CRISPR-Cas: essential adaptive immunity system in certain bacteria and archaea, enabling the organisms to respond to and eliminate invading genetic material. Based on these technologies, it is possible to effectively and specifically change genes within organisms.

Culturomics: culturing approach that uses multiple culture conditions and MALDI–TOF and 16S rRNA to rapidly identify large numbers of colonies.

Emulsion, paired isolation, and concatenation PCR (epicPCR): novel technique that links functional genes and phylogenetic markers in uncultured single cells that cost-effectively provides a throughput of hundreds of thousands of cells.

Exonuclease Combined with RecET recombination (ExoCET): combining in vitro exonuclease and annealing with the remarkable capacity of full length RecET homologous recombination to retrieve specified regions from genomic DNA preparations, this method bypasses DNA library construction and screening.

Fragment assembly: aligning and merging fragments from a longer DNA sequence to reconstruct the original sequence.

Genome binning: process of grouping reads or contigs and assigning them to operational taxonomic units.

Known Media Database (KOMODO): web tool that enables a systematic investigation of monophyletic genomes to detect significantly enriched groups of homologous genes between one taxon and another.

Nonribosomal peptides (NRPs): peptide secondary metabolites from bacteria and fungi with a variety of


Heterologous expression is another option for studying BGCs (Figure 2A VI b). In the discovery of colibactin [21], systematic mutagenesis of a genomic island, named the pks island, carried on a bacterial artificial chromosome (BAC) in the laboratory Escherichia coli strain DH10B, revealed the enzymes responsible for the cytopathic effect of colibactin.

For a genomic library, DNA fragments of a specific length are collected after shearing the genomic DNA of the functional bacteria. Based on the conserved sequences of known BGCs, degenerate oligonucleotides can be designed and used to screen for similar genes. After sequencing, associated genes that are likely to be highly conserved, such as those encoding proteins involved in post-translational modifications, can be used to screen for novel BGCs from the mammalian intestinal tract. A DNA fragment from a novel BGC can be used for target gene mapping or as a probe for hybridization analysis of the genomic library. Using this strategy, a novel lantibiotic cluster, nisin O, which shares considerable homology with class AI lantibiotics, was discovered in the anaerobic bacterium Blautia obeum A2-162 (Figure 2A VI d) [19]. The genomic library can also be screened using a phenotypic assay, such as antibiotic halos for microcin H47 [35] (Figure 2A VI e).

Once the function of the isolated bacterium is determined, its natural products can be purified by a series of chromatographic steps, such as Sephadex G-75 gel filtration, cation-exchange HPLC, or reverse-phase HPLC. These techniques are advantageous for the analysis of BGC-derived bioactive molecules. Sephadex G-75 is a well-established gel filtration resin for desalting and buffer exchange of large biomolecules with >80 000 Da molecular weight. Cation-exchange chromatography is used to separate a large range of molecules, from amino acids and nucleotides to large proteins. Reverse-phase columns have a hydrophobic stationary phase, which works well to retain most organic analytes. Reverse-phase chromatography also has the advantage that pH selectivity can be used to improve separations. The bioactivity of each fraction is determined at each step of the purification procedure. The amino acid sequence can then be determined by Edman degradation and used to deduce the open reading frames surrounding the BGC sequence. This strategy was applied to discover ruminococcin A [15], ruminococcin C [36], and bovicin HC5 [22] (Figure 2A VI c). Ruminococcin A was purified by reverse-phase chromatography from the supernatant of the Ruminococcus gnavus E1 strain [15]. Ruminococcin C was purified directly from the cecal content of R. gnavus E1-mono-contaminated rats [36]. One limitation of this approach is that it can be used only if the bioactive natural product is a peptide, not if it is a PK or NRP.
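Screening with degenerate oligonucleotides, as described above, can be mimicked in silico. The following is a minimal sketch that expands the standard IUPAC nucleotide ambiguity codes into a regular expression and scans a sequence for matching sites. The oligonucleotide and the target sequence are made-up examples, and only the forward strand is searched.

```python
import re

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degenerate_to_regex(oligo):
    """Translate a degenerate oligonucleotide into a regex pattern string."""
    parts = []
    for base in oligo.upper():
        choices = IUPAC[base]
        parts.append(choices if len(choices) == 1 else f"[{choices}]")
    return "".join(parts)


def degeneracy(oligo):
    """Number of distinct sequences the degenerate oligonucleotide represents."""
    total = 1
    for base in oligo.upper():
        total *= len(IUPAC[base])
    return total


def find_sites(oligo, sequence):
    """Start positions (0-based, forward strand) where the oligo can anneal."""
    # A lookahead is used so that overlapping sites are also reported.
    pattern = re.compile(f"(?=({degenerate_to_regex(oligo)}))")
    return [match.start() for match in pattern.finditer(sequence.upper())]


# Made-up example: Y stands for C or T.
sites = find_sites("GAYTGG", "TTGATTGGCCGACTGGAA")
```

In practice the reverse complement would be searched as well, and mismatches tolerated by real hybridization are not modeled here.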

Genome/Metagenomics-Based Mining Approach

Genome mining is the process of identifying, annotating, and analyzing biosynthetic gene clusters of unknown secondary metabolites from whole-genome sequence data. In metagenomics, the genomic content comes from the complex mixture of microorganisms in an environmental sample. The emergence of next-generation sequencing (NGS) and cost-effective whole-genome sequencing technologies has accelerated drug discovery by making it possible to mine biologically active secondary metabolites from genome sequences using bioinformatics tools. A schematic diagram of genome/metagenomics mining is shown in Figure 2B. It comprises DNA extraction from intestinal samples (Figure 2B I), (meta)genomics sequencing (Figure 2B II), bioinformatic analysis and the prediction and annotation of BGCs (Figure 2B III and IV), in situ expression (Figure 2B V and VI f) or cloning of BGCs for heterologous expression (Figure 2B V and VI g and h), and metabolic analysis of the products (Figure 2B VII, VIII and IX). This strategy exploits the fact that the genes encoding a specific microbial secondary metabolite are typically colocalized in a BGC. Currently, only a fraction of the gut microbiota has been obtained in pure culture [37], so this approach can circumvent the need to culture gut bacteria.
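The colocalization principle can be illustrated with a deliberately naive grouping step. The sketch below takes biosynthesis-related gene hits on one contig and groups neighbours that lie within a chosen distance of each other. The function name, the `max_gap` and `min_genes` thresholds, and the coordinates are all hypothetical; dedicated BGC predictors rely on much richer models than this distance rule.

```python
def group_colocalized(hits, max_gap=10_000, min_genes=3):
    """Group gene hits on one contig into candidate clusters by proximity.

    hits: iterable of (start, end, label) tuples.
    max_gap: largest allowed distance (bp) between neighbouring hits (assumed value).
    min_genes: smallest number of hits that counts as a candidate (assumed value).
    """
    clusters = []
    current = []
    current_end = None
    for start, end, label in sorted(hits):
        # Start a new group when the next hit is too far from the current one.
        if current and start - current_end > max_gap:
            if len(current) >= min_genes:
                clusters.append(current)
            current = []
            current_end = None
        current.append((start, end, label))
        current_end = end if current_end is None else max(current_end, end)
    if len(current) >= min_genes:
        clusters.append(current)
    return clusters


# Hypothetical hits: three neighbouring genes and one distant gene.
example_hits = [
    (1_000, 2_000, "precursor"),
    (2_500, 4_000, "modifying enzyme"),
    (4_200, 6_000, "transporter"),
    (80_000, 81_000, "regulator"),
]
candidate_clusters = group_colocalized(example_hits)
```

The distant regulator is left out because it neither lies within `max_gap` of the first group nor forms a group of `min_genes` on its own.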


medicinal properties synthesized by enzymes called nonribosomal peptide synthetases (NRPSs) and a series of tailoring enzymes, in which ribosomal machineries and messenger RNAs do not play a role.

Polyketides (PKs): natural products found in a variety of organisms, including plants, fungi, and bacteria, which are produced by the polymerization of acyl-coenzyme A by PK synthase enzymes and a series of tailoring enzymes, including transport, regulatory, and modifying enzymes.

Red/ET recombineering: in vivo reaction based either on the red genes of the lambda phage or the recE/recT genes of the Rac prophage. Redα or RecE acts as a 5′→3′ exonuclease and Redβ or RecT as a single-strand DNA-binding protein.

Ribosomal RNA gene flanking region sequencing (RiboFR-Seq): novel approach to link 16S rRNA amplicon profiles to metagenomes that enables capturing both ribosomal RNA variable regions and their flanking protein-coding genes simultaneously.

Ribosomally synthesized and post-translationally modified peptides (RiPPs): natural products produced by ribosomes with molecular weight less than 1000 Da that undergo chemical transformations after translation. They are biosynthesized by precursor peptides and a series of enzymes, including immune factors and transport, regulatory, and modifying enzymes.

Surface localized antimicrobial display (SLAY): platform that drives bacteria to express and self-test peptides of any size, structure, or sequence complexity for antimicrobial activity. It is a high-throughput screening platform that rapidly identifies lead antimicrobial peptides to combat multidrug-resistant Gram-negative bacteria.

Thiazole/oxazole-modified microcins (TOMMs): ribosomally produced peptides with post-translationally installed thiazole and oxazole heterocycles derived from cysteine, serine, and threonine residues.


Table 1. BGC-derived Natural Products from the Mammalian Gut Microbiota

| Compound | Class | Producer | Known/predicted activity | Length/position of gene cluster | Strategy of discovery | Refs |
| --- | --- | --- | --- | --- | --- | --- |
| Nisin O | RiPP (lantibiotic) | Blautia obeum A2-162 | Acts against Clostridium perfringens, Clostridium difficile and Lactococcus lactis. | 16.2 kb/chromosome | (d) | [19] |
| Nisin H | RiPP (lantibiotic) | Streptococcus hyointestinalis DPC6484 | Acts against a wide range of Gram-positive bacteria. | 15.8 kb/chromosome | (f) | [45] |
| Ruminococcin A | RiPP (lantibiotic) | Ruminococcus gnavus E1 | Acts against various pathogenic clostridia and bacteria phylogenetically related to R. gnavus. | 12.8 kb/chromosome | (c) | [15] |
| Bovicin HC5 | RiPP (lantibiotic) | Streptococcus bovis HC5 | Acts against a variety of Gram-positive bacteria. | 10.1 kb/chromosome | (c) | [22] |
| Ruminococcin C | RiPP (bacteriocin) | Ruminococcus gnavus E1 | Acts against the pathogen C. perfringens. | 15 kb/chromosome | (c) | [36,70] |
| Microcin M | RiPP (microcin) | Escherichia coli Nissle 1917 | Acts against adherent invasive E. coli and Salmonella enterica serovar Typhimurium. | 7.9 kb/chromosome | (a) | [10] |
| Microcin J25 | RiPP (microcin) | Escherichia coli AY25 | Exhibits high activity against Gram-negative bacteria. | 5.0 kb/plasmid | (a) | [71] |
| Microcin H47 | RiPP (microcin) | Escherichia coli Nissle 1917 | Acts against adherent invasive E. coli. | 7.9 kb/chromosome | (e) | [10] |
| Microcin S | RiPP (microcin) | Escherichia coli G3/10 | Inhibits the adherence of enteropathogenic E. coli (EPEC) strain E2348/69 to intestinal epithelial cells. | 4.7 kb/plasmid | (f) | [47] |
| Clostridiolysin S | RiPP (TOMM) | Clostridium sporogenes | Responsible for a hemolytic phenotype. | 8.6 kb/chromosome | (g) | [17] |
| Listeriolysin S | RiPP (TOMM) | Listeria monocytogenes | Targets exclusively prokaryotic cells during in vivo listeriosis infections. | 6.9 kb/chromosome | (g) | [50,72] |
| Tilivalline | NRP | Klebsiella oxytoca | Cytotoxic, induces apoptosis. | 25.8 kb/chromosome | (a) | [12] |
| Dipeptide aldehydes | NRP | Clostridium sp. KLE1755 | Exhibits potent protease inhibitory activity. | 10 kb/chromosome | (h) | [13] |
| Colibactin | NRP-PK | Escherichia coli IHE3034 | Genotoxin, a driver of carcinogenesis. | 54 kb/chromosome | (b) | [11,73] |

Strategies of discovery: the letters refer to the strategies detailed in Figure 2.

Abbreviations: BGC, biosynthetic gene clusters; NRP, nonribosomal peptides; NRP-PK, nonribosomal and polyketide hybrid peptides; RiPP, ribosomally synthesized and post-translationally modified peptides; TOMM, thiazole/oxazole-modified microcins.

Metagenomic sequencing generates large amounts of data from a vast number of organisms, and those data require automated processing and analysis, such as fragment assembly and genome binning. Researchers recently reconstructed near-complete genomes for hundreds of microorganisms from the soil ecosystem using genome-resolved metagenomic methods, including automated binning software packages such as ABAWACA [38],


[Figure 1 graphic not reproduced. Panel A: ruminococcin C, ruminococcin A, clostridiolysin S, and microcin J25, with gene clusters (2-kb scale bar) annotated as precursor peptide, immunity factor, transport, regulator, and modifying enzyme. Panel B: tilivalline, dipeptide aldehydes, and precolibactin, with gene clusters (5-kb scale bar) annotated as NRPS, PKS, transport, regulatory, PPTase, and others.]

Figure 1. Biosynthetic Gene Clusters and Structures of Representative Natural Products from Gut Microbiota. (A) Representative RiPP natural products; the key characteristic of the natural product class is depicted in green. (B) Representative PK/NRP natural products. Abbreviations: NRPS, nonribosomal peptide synthetase; PKS, polyketide synthase; PPTase, phosphopantetheinyl transferase; RiPP, ribosomally synthesized and post-translationally modified peptide.


Key Figure

Schematic Diagram of Mining Approaches to Discover BGC-derived Natural Products from the Mammalian Gut Microbiota

[Figure 2 graphic not reproduced. Panels (A), (B), and (C) are each laid out as steps (I) to (X), with routes (a) to (h) marking the individual discovery strategies. Step labels include isolation with specific functions, mutant and genomic libraries, phenotypic assay, (meta)genomics sequencing, bioinformatics analysis, in situ and heterologous expression, extraction of metabolites from intestinal contents or feces, comparative metabolomics, culturing, HPLC isolation, NMR structure elucidation, and setting up a metabolomics database.]

Figure 2. (A) Culture-based mining approach. (B) Genome/metagenomics-based mining approach. (C) Metabolomics-based mining approach. Examples of natural products produced from these approaches include microcin M, microcin J25, and tilivalline (from approach a), colibactin (b), ruminococcin A, ruminococcin C, and bovicin HC5 (c), nisin O (d), microcin H47 (e), nisin H and microcin S (f), clostridiolysin S and listeriolysin S (g), and dipeptide aldehydes (h). Abbreviations: BGC, biosynthetic gene cluster; NRP, nonribosomal peptide; PK, polyketide; RiPP, ribosomally synthesized and post-translationally modified peptide.


ABAWACA2 [39], MaxBin2 [40], CONCOCT [41], and MetaBAT [42], and identified microorganisms from previously understudied phyla that harbor diverse BGCs [38]. Genome sequences can be searched using the BLASTn and BLASTp programs through the National Center for Biotechnology Information web portal (https://blast.ncbi.nlm.nih.gov/Blast.cgi). Automated programs such as ClusterFinder [43] and antiSMASH 3.0 [44] have been developed to identify BGCs from DNA sequencing data quickly, precisely, and in batches [45]. They can also predict structures from the corresponding gene clusters, which is useful information for subsequent compound purification and structural analysis (Figure 2B III and IV). Using ClusterFinder, Donia and colleagues identified 3118 small-molecule BGCs in the genomes of human-associated bacteria and showed that the BGCs of thiopeptides are widely distributed in the metagenomes of the human microbiota [45]. The genome sequencing of a porcine gut-derived strain revealed a novel version of nisin, designated nisin H, which inhibits a wide range of Gram-positive bacteria [46]. In the discovery of microcin S [47], BLAST analysis of three automatically annotated open reading frames in the original plasmid revealed only slight homologies to characterized proteins. Subcloning genes and gene fragments in E. coli G4/9, followed by a phenotype assay, characterized the function of the entire microcin S operon (Figure 2B V and VI f).

Certain native host strains, such as Clostridium sporogenes and Listeria monocytogenes, are largely refractory to genetic manipulation. For these BGCs, heterologous expression with refactored and reconstructed control elements is a better option (Figure 2B V and VI g). A recent study incorporating heterologous expression discovered the BGCs of TOMMs, including listeriolysin S and clostridiolysin S [48]. Recently, Davies developed a platform called surface localized antimicrobial display (SLAY) for studying the function of peptides by heterologous expression [49], which has potential for studying antimicrobial peptides from the gut microbiota. Many of the products act as antibiotics that provide an ecological advantage to the producers [48]. For instance, listeriolysin S from the human pathogen L. monocytogenes exhibits bactericidal activity and alters the host gut microbiota during infection [20,50].

Codon-optimized, synthesized BGCs can also be used (Figure 2B V and VI h). For example, a family of NRPS gene clusters in the gut microbiota encodes dipeptide aldehydes. After codon optimization, the synthesized BGC was heterologously expressed in two commonly used and well-characterized laboratory hosts, E. coli and Bacillus subtilis [13]. One such dipeptide aldehyde from Clostridium sp. KLE1755, Phe-Phe-H, targets cathepsins in an unbiased cell-based assay, which indicates a possible role for lysosomal proteases in microbiota–host interactions [13]. In addition, the modified strain E. coli GB05-MtaA was created by introducing a phosphopantetheinyl transferase (PPTase) that is required for the post-translational activation of PKS/NRPS proteins [51]. E. coli has shown feasibility as a cell factory for the heterologous production of PKS/NRPS proteins [52].

For heterologous expression (Figure 2A VI b, Figure 2B VI g, and Figure 2B VI h), an expression vector must be constructed, which may entail integrating an inducible promoter or the necessary cassette, including the genes for transposition, conjugation, and integration, into the vector. The final constructs can then be introduced into heterologous hosts by electroporation, conjugation, or transposition. Red/ET recombineering is a well-suited technology that allows the exchange of genetic information between two DNA molecules in a precise, specific, and accurate manner, enabling the seamless modification of natural biosynthetic genes and accompanying vector systems [53,54]. Furthermore, this technology

Trends in Biotechnology, Month Year, Vol. xx, No. yy

TIBTEC 1714 No. of Pages 13

can be applied to produce novel compounds by tailoring native enzymes and replacing domains [55]. In addition, novel molecular engineering techniques can be applied for seamless DNA engineering of gene clusters from mammalian gut microbiota. Two examples are the CRISPR/Cas9 system with an integrated algorithm to predict highly effective single guide RNAs [56], and recombineering in combination with CcdB counterselection, which is robust and does not require titrations or optimization [57]. The challenge associated with heterologous expression is acquiring massive molecular-weight BGCs from the gut microbiota genome. However, this challenge can be addressed by several methods, including linear plus linear homologous recombination (LLHR)-mediated RecET as used for salinomycin BGC cloning [58], Cas9-assisted targeting of chromosome (CATCH), which was used for psk BGC cloning [59], or exonuclease combined with RecET recombination (ExoCET) for mammalian genomic DNA cloning [60]. Each method has the potential to directly clone large BGCs from the mammalian gut microbiota.
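Cas9-based methods such as those above start from candidate guide sites. The sketch below only enumerates 20-nt protospacers that sit next to an NGG PAM (the motif recognized by the commonly used Streptococcus pyogenes Cas9) on both strands. It does not reproduce the activity-prediction algorithm of [56]; the function names and the example sequence are hypothetical.

```python
import re


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_protospacers(seq, length=20):
    """List (strand, start, protospacer, pam) for every NGG PAM site.

    `start` is 0-based on the scanned strand, so for strand "-" it indexes
    the reverse complement of `seq`, not `seq` itself.
    """
    seq = seq.upper()
    # The lookahead lets overlapping candidate sites all be reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % length)
    hits = []
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for match in pattern.finditer(strand_seq):
            hits.append((strand, match.start(), match.group(1), match.group(2)))
    return hits


if __name__ == "__main__":
    toy_flank = "ACGTTGCAAGCTTGACCTGAAGGTCATGCCGGATTACA"  # made-up sequence
    for hit in find_protospacers(toy_flank):
        print(hit)
```

In a direct-cloning setting, one would run such a scan on the regions flanking a BGC and then rank the candidates with a dedicated scoring tool before choosing the two cut sites.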

Table 2. Advantages, Disadvantages, and Possible Future Developments of Three Mining Approaches

| Approach | Advantages | Disadvantages | Possible future developments |
| --- | --- | --- | --- |
| Culture-based mining approach | Enables the direct isolation of bacteria. Function is known before purification. Simulated in vitro intestinal model culture systems are applied. Numerous traditional media recipes have been accumulated. | Involves challenges associated with pure-culture techniques. Massive uncultivable microbes. Genetic modification is difficult for wild-type strains. Screening for a functional clone is laborious and time-consuming. | Automated growth detection and identification. Miniaturization. Innovative culture conditions. |
| Genome/metagenomics-based mining approach | High throughput and high efficiency. Applicable for uncultivable microbes. Enables the synthesis of BGC sequences with optimized codons. Structure and function are predictable via large existing public databases. Enzymatic chemistry is well understood. | Generic annotation may be incomplete or misidentified. Function is unknown before purification. Difficult to obtain large BGCs (>100 kb). Contamination from host-derived DNA and organelles may obscure subsequent bioinformatic analysis. | Increased depth of sequencing. NGS data interpretation tools. |
| Metabolomics-based mining approach | Enables rapid analysis for the chemical structure and amino acid sequences. Highly sensitive, quantitative, and high throughput. Directly applicable to pattern recognition. | Not available in purified form for low yield compounds. Enzymatic chemistry is poorly understood. Subjected to degradation and oxidation during extraction. No correlation with genomic information. Cannot distinguish the source between dietary components and BGCs. | Improved instrumentation, such as GC–EI–MS and GC × GC techniques. In silico structure prediction tools. Combined with deep learning models for clinics and laboratories. |

Abbreviations: BGC, biosynthetic gene cluster; EI, electron ionization; GC, gas chromatography; NGS, next-generation sequencing.

Metabolomics-Based Mining Approach Metabolomics is a metabolite profiling technique that is increasingly applied as a powerful tool to mine novel bioactive compounds from gut microbiota and to identify the microbiota-driven mechanisms underlying the link between microbial community structure and health [61]. Methods in the metabolomics field are classified as either targeted or untargeted [62]. Analytical techniques used for chemical profiling include liquid chromatography in combination with high-resolution tandem MS (LC-HRMS), MALDI-imaging-MS, and NMR. Targeted metabolomics involves measuring, with high sensitivity and selectivity, a set of known metabolites that are chemically characterized or biochemically annotated [62]. The clones that result from a genome/metagenomics-based or culture-based mining approach can be transferred into an enriched medium, and the supernatant extract can then be analyzed for BGC-derived metabolites via NMR spectroscopy, gas chromatography (GC) or LC coupled to MS, Fourier transform infrared spectroscopy (FTIR), ion cyclotron resonance-FT (ICR-FT), or capillary electrophoresis (CE) coupled to MS [63] (Figure 2AB VII, VIII and IX). For instance, a sphingolipid from Bacteroides fragilis, purified as an isolated peak by HPLC–tandem MS, was found to regulate homeostasis of the host’s invariant natural killer T cells and to protect the host from a colitis challenge [64]. Increasing the automation, speed, and accuracy of absolute metabolite quantitation is critical in targeted metabolomics. Untargeted metabolomics (Figure 2C) is an unbiased methodology to measure a broad range of metabolites, including unknown chemicals [62]. A gut community harbored in humans or other mammals can be subjected to disease, antibiotic treatment, or other perturbations (Figure 2C II). Comparative analysis can then be applied directly to identify changes in the abundance of microbiota-derived molecules and to provide insight into the underlying mechanisms through the systems-level effects of the associated perturbation (Figure 2C IV).
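A simple form of the comparative analysis described above is a per-metabolite fold change between a control group and a perturbed group. The sketch below is a minimal illustration under stated assumptions: the function names, the pseudocount, the twofold threshold, and the feature labels are hypothetical, and a real study would add normalization and statistical testing with multiple-comparison correction.

```python
import math
from statistics import mean


def log2_fold_changes(control, perturbed, pseudocount=1.0):
    """Map each shared metabolite to log2(mean perturbed / mean control).

    `control` and `perturbed` map a metabolite name to a list of
    intensities, one per sample. The pseudocount avoids division by zero.
    """
    changes = {}
    for metabolite in control.keys() & perturbed.keys():
        c = mean(control[metabolite]) + pseudocount
        p = mean(perturbed[metabolite]) + pseudocount
        changes[metabolite] = math.log2(p / c)
    return changes


def flag_changed(changes, threshold=1.0):
    """Names of metabolites whose |log2 fold change| meets the threshold."""
    return sorted(m for m, lfc in changes.items() if abs(lfc) >= threshold)


if __name__ == "__main__":
    # Made-up intensities for two unnamed features.
    control = {"feature_1": [3, 3], "feature_2": [7, 7]}
    perturbed = {"feature_1": [15, 15], "feature_2": [7, 7]}
    print(flag_changed(log2_fold_changes(control, perturbed)))
```

Working on the log2 scale treats increases and decreases symmetrically, so a threshold of 1.0 flags features that at least double or halve.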
For instance, using an untargeted metabolomics approach, trimethylamine N-oxide (TMAO), produced via metaorganismal pathways, was discovered and structurally identified as a predictor of incident cardiovascular disease risk [65]. Although TMAO is formed by the gut microbiota from dietary nutrients, the approach also provides insight into the link between BGC-derived natural products and diseases. In addition to the metabolites produced by microorganisms, metabolites can also be derived from the host organism, as well as from xenobiotic, dietary, and other exogenous sources. Thus, 16S rDNA or metagenomic sequencing data are essential to identify specific metabolites and correlate them with altered microbial profiles. The challenge for metabolomics is to resolve the structures of unknown bioactive chemicals and identify their corresponding BGCs, and BGC-chemical matching is comparatively difficult. This challenge can be addressed with a reverse lookup approach that involves comparing a chemical for which the BGC is known with a chemical of unknown origin. As the amount of targeted and untargeted metabolomics data increases rapidly, such data should be maintained in a consolidated manner. It is essential to expand the compound libraries by adding more metabolites and by elucidating the structures of novel metabolites via hybrid MS/NMR methods. Building metabolomics databases and training metabolic network models by machine learning for clinics and laboratories could therefore help to study the metabolic intricacies of gut microbial communities. New metabolic fingerprinting data from gut microbiota samples can be deposited into such databases and accessed by researchers worldwide [66].
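One common way to correlate a metabolite with a microbial profile, as called for above, is a rank correlation across samples between the metabolite intensity and the relative abundance of a taxon from 16S data. The following sketch implements Spearman correlation with the standard library only; the function names and the numbers are hypothetical, and this is offered as one plausible choice rather than the method of any study cited here.

```python
from statistics import mean


def ranks(values):
    """1-based ranks of `values`; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the two rank lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long lists with at least 2 samples")
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)


if __name__ == "__main__":
    metabolite_intensity = [120.0, 340.0, 90.0, 800.0]  # made-up values
    taxon_abundance = [0.02, 0.05, 0.01, 0.20]          # made-up values
    print(spearman(metabolite_intensity, taxon_abundance))
```

A rank-based measure is used here because metabolite intensities and relative abundances are rarely normally distributed; a strong correlation still does not show that the taxon produces the metabolite.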

Outstanding Questions
Genomic/metagenomic studies produce huge amounts of data and are rich in complexity. Can structural prediction tools with high accuracy and efficiency be developed to process the metagenomic data in the future?
If uncultured gut microbiota are cultivable when favorable growth conditions are provided, which factors will affect the growth of these organisms in laboratory conditions? Will it be possible to miniaturize the culture conditions of gut microbiota in a laboratory setup to address the issue of culturomics?
In metabolomics, GC-MS and LC-MS lack precision and accuracy, while NMR lacks sensitivity. Beyond LC-MS-NMR, what potential technical advancements will improve the efficiency of metabolite profiling?
Secondary metabolite profiling from the gut microbiota is only partially feasible and will be fully realized only if structure–activity relationships are elucidated. Will it be possible to discover potential molecular targets through computational techniques such as reverse docking?
Due to the structural complexity of the mammalian gut, can a more systematic and specific mining approach be developed to explore small molecule-mediated interactions between mammals and their microbiota?

Concluding Remarks Commensal microorganisms in mammals produce numerous metabolites to mediate significant interactions and physiological functions for survival [1]. However, from the perspective of providing benefits to humans, numerous gene clusters and distinct functional compounds have yet to be explored. Exploring the mechanisms of new compounds at various levels and from various perspectives is critical to ensure the safety and value of clinical applications; this includes investigating the class of the compounds, their pharmacokinetics, pharmacodynamics, and mode of action, as well as their effects on the structure and function of the human intestine. The limitations of culture-based, genome/metagenomics-based, and metabolomics-based mining approaches are now diminishing. Considering the advantages and disadvantages of each strategy (Table 2), it is wise to combine two or more strategies to accelerate the mining of bioactive compounds and to elucidate their functions. For example, ribosomal RNA gene flanking region sequencing (RiboFR-Seq) [67] and emulsion, paired isolation, and concatenation PCR (epicPCR) [68] were proposed to link the functional genes derived from metagenomics to their 16S rRNA profiles. Once the phylogeny has been obtained, the Known Media Database (KOMODO) can be used to predict specific culture media for cultivating targeted bacteria based on their 16S rDNA sequences [69]. The productivity of these compounds is still low, so various strategies have been applied to improve their production (see Outstanding Questions). Generally, uncultured gut microbiota are cultivable when favorable growth conditions are supplied, but the factors influencing the growth of these microbes in laboratory conditions have yet to be investigated. Similarly, miniaturizing the culture conditions of gut microbiota in a laboratory setup represents an interesting emerging approach in culturomics. In the future, genomic/metagenomic studies will produce large amounts of highly complex data, which will be useful only when suitable tools for data management and analysis are available. Structure prediction tools in particular must be refined to enhance their accuracy and efficiency.
In metabolomics, the lack of precision and accuracy in GC-MS and LC-MS and the low sensitivity of NMR have yet to be addressed. Similarly, technical advancements to LC-MS-NMR are essential to improve the efficiency of metabolite profiling. Metabolomics datasets can be pooled into a database through which researchers worldwide will be able to access the data. In addition, following the example of iScreen, a cloud-computing-based web server for virtually screening the secondary metabolite database of traditional Chinese medicine plants against molecular targets, a database with a molecular docking facility could be created to enable reverse pharmacological approaches.

Acknowledgments We are grateful to Dr Ruijuan Li and Prof Xiaoying Bian from Shandong University for the editorial assistance with the figures. We also would like to thank Prof Jianzhong Li and Dr Huansheng Yang from Hunan Normal University for their constructive suggestions and editorial contribution. This work was supported by funding awarded to J.Y. from the National Natural Science Foundation of China (31700004), Natural Science Foundation of Jiangsu Province (BK20160368), and Huxiang Youth Excellence project (2017RS3029). Y.L. received funding from the Key Programs of Frontier Scientific Research of the Chinese Academy of Sciences (QYZDY-SSW-SMC008).

Supplemental Information Supplemental information associated with this article can be found, in the online version, at https://doi.org/10.1016/j.tibtech.2018.10.003.

References
1. Rooks, M.G. and Garrett, W.S. (2016) Gut microbiota, metabolites and host immunity. Nat. Rev. Immunol. 16, 341–352
2. Koppel, N. and Balskus, E.P. (2016) Exploring and understanding the biochemical diversity of the human microbiota. Cell Chem. Biol. 23, 18–30
3. Mousa, W.K. et al. (2017) Antibiotics and specialized metabolites from the human microbiota. Nat. Prod. Rep. 34, 1302–1331
4. Jobin, C. (2014) GPR109a: The missing link between microbiome and good health? Immunity 40, 8–10
5. Donia, M.S. and Fischbach, M.A. (2015) Small molecules from the human microbiota. Science 349, 1254766–1254766
6. Krishnan, S. et al. (2018) Gut microbiota-derived tryptophan metabolites modulate inflammatory response in hepatocytes and macrophages. Cell Rep. 23, 1099–1111
7. Hryckowian, A.J. et al. (2018) Microbiota-accessible carbohydrates suppress Clostridium difficile infection in a murine model. Nat. Microbiol. 3, 662–669
8. Ose, R. et al. (2018) The ability of human intestinal anaerobes to metabolize different oligosaccharides: Novel means for microbiota modulation? Anaerobe 51, 110–119
9. Milshteyn, A. et al. (2018) Accessing bioactive natural products from the human microbiome. Cell Host Microbe 23, 725–736
10. Sassone-Corsi, M. et al. (2016) Microcins mediate competition among Enterobacteriaceae in the inflamed gut. Nature 540, 280–283
11. Balskus, E.P. (2015) Colibactin: understanding an elusive gut bacterial genotoxin. Nat. Prod. Rep. 32, 1534–1540
12. Dornisch, E. et al. (2017) Biosynthesis of the enterotoxic pyrrolobenzodiazepine natural product tilivalline. Angew. Chem. Int. Ed. Engl. 56, 14753–14757
13. Guo, C.J. et al. (2017) Discovery of reactive microbiota-derived metabolites that inhibit host proteases. Cell 168, 517–526
14. Letzel, A.-C. et al. (2014) Genome mining for ribosomally synthesized and post-translationally modified peptides (RiPPs) in anaerobic bacteria. BMC Genomics 15, 983
15. Dabard, J. et al. (2001) Ruminococcin A, a new lantibiotic produced by a Ruminococcus gnavus strain isolated from human feces. Appl. Environ. Microbiol. 67, 4111–4118
16. Nadine, A. et al. (2016) Initial molecular recognition steps of McjA precursor during microcin J25 lasso peptide maturation. ChemBioChem 17, 1851–1858
17. Gonzalez, D.J. et al. (2010) Clostridiolysin S, a post-translationally modified biotoxin from Clostridium botulinum. J. Biol. Chem. 285, 28220–28228
18. Dejong, C.A. et al. (2016) Polyketide and nonribosomal peptide retro-biosynthesis and global gene cluster matching. Nat. Chem. Biol. 12, 1007–1014
19. Hatziioanou, D. et al. (2017) Discovery of a novel lantibiotic nisin O from Blautia obeum A2-162, isolated from the human gastrointestinal tract. Microbiology 163, 1292–1305
20. Quereda, J.J. et al. (2017) Listeriolysin S: a bacteriocin from epidemic Listeria monocytogenes strains that targets the gut microbiota. Gut Microbes 8, 1–8
21. Nougayrede, J.-P. et al. (2006) Escherichia coli induces DNA double-strand breaks in eukaryotic cells. Science 313, 848–851
22. Mantovani, H.C. et al. (2002) Bovicin HC5, a bacteriocin from Streptococcus bovis HC5. Microbiology 148, 3347–3352
23. Vartoukian, S.R. (2016) Cultivation strategies for growth of uncultivated bacteria. J. Oral Biosci. 58, 142–149
24. Walsh, D.I. et al. (2018) Emulation of colonic oxygen gradients in a microdevice. SLAS Technol. 23, 164–171
25. Van de Wiele, T. et al. (2015) The simulator of the human intestinal microbial ecosystem (SHIME®). In The Impact of Food Bioactives on Health: In Vitro and Ex Vivo Models (Verhoeckx, K., ed.), pp. 305–317, Springer International Publishing
26. Friedman, J. and Alm, E.J. (2012) Inferring correlation networks from genomic survey data. PLoS Comput. Biol. 8, e1002687
27. Boedicker, J.Q. et al. (2009) Microfluidic confinement of single cells of bacteria in small volumes initiates high-density behavior of quorum sensing and growth and reveals its variability. Angew. Chem. Int. Ed. Engl. 48, 5908–5911
28. Kawanishi, T. et al. (2011) New detection systems of bacteria using highly selective media designed by SMART: Selective Medium-Design Algorithm Restricted by Two Constraints. PLoS One 6, e16512
29. Tanaka, Y. and Benno, Y. (2015) Application of a single-colony coculture technique to the isolation of hitherto unculturable gut bacteria. Microbiol. Immunol. 59, 63–70
30. Lagier, J.-C. et al. (2015) The rebirth of culture in microbiology through the example of culturomics to study human gut microbiota. Clin. Microbiol. Rev. 28, 237–264
31. Ren, H. et al. (2017) Breaking the silence: new strategies for discovering novel natural products. Curr. Opin. Biotechnol. 48, 21–27
32. Yin, J. et al. (2015) A new recombineering system for Photorhabdus and Xenorhabdus. Nucleic Acids Res. 43, e36
33. Wang, X. et al. (2018) Discovery of recombinases enables genome mining of cryptic biosynthetic gene clusters in Burkholderiales species. Proc. Natl. Acad. Sci. U. S. A. 115, E4255–E4263
34. Dong, C. et al. (2018) Synthetic CRISPR-Cas gene activators for transcriptional reprogramming in bacteria. Nat. Commun. 9, 2489
35. Laviña, M. et al. (1990) Microcin H47, a chromosome-encoded microcin antibiotic of Escherichia coli. J. Bacteriol. 172, 6585–6588
36. Crost, E.H. et al. (2011) Ruminococcin C, a new anti-Clostridium perfringens bacteriocin produced in the gut by the commensal bacterium Ruminococcus gnavus E1. Biochimie 93, 1487–1494
37. Hugon, P. et al. (2015) A comprehensive repertoire of prokaryotic species identified in human beings. Lancet Infect. Dis. 15, 1211–1219
38. Crits-Christoph, A. et al. (2018) Novel soil bacteria possess diverse genes for secondary metabolite biosynthesis. Nature 558, 440–444
39. Brown, C.T. et al. (2015) Unusual biology across a group comprising more than 15% of domain Bacteria. Nature 523, 208–211
40. Wu, Y.-W. et al. (2016) MaxBin 2.0: an automated binning algorithm to recover genomes from multiple metagenomic datasets. Bioinformatics 32, 605–607
41. Alneberg, J. et al. (2014) Binning metagenomic contigs by coverage and composition. Nat. Methods 11, 1144–1146
42. Kang, D.D. et al. (2015) MetaBAT, an efficient tool for accurately reconstructing single genomes from complex microbial communities. PeerJ 3, e1165
43. Peter, C. et al. (2014) Insights into secondary metabolism from a global analysis of prokaryotic biosynthetic gene clusters. Cell 158, 412–421
44. Weber, T. et al. (2015) antiSMASH 3.0-a comprehensive resource for the genome mining of biosynthetic gene clusters. Nucleic Acids Res. 43, W237–W243
45. Donia, M.S. et al. (2014) A systematic analysis of biosynthetic gene clusters in the human microbiome reveals a common family of antibiotics. Cell 158, 1402–1414
46. O’Connor, P.M. et al. (2015) Nisin H is a new nisin variant produced by the gut-derived strain Streptococcus hyointestinalis DPC6484. Appl. Environ. Microbiol. 81, 3953–3960
47. Zschüttig, A. et al. (2012) Identification and characterization of microcin S, a new antibacterial peptide produced by probiotic Escherichia coli G3/10. PLoS One 7, e33351
48. Melby, J.O. et al. (2011) Thiazole/oxazole-modified microcins: complex natural products from ribosomal templates. Curr. Opin. Chem. Biol. 15, 369–378
49. Tucker, A.T. et al. (2018) Discovery of next-generation antimicrobials through bacterial self-screening of surface-displayed peptide libraries. Cell 172, 618–628.e13
50. Quereda, J.J. et al. (2017) Listeriolysin S is a streptolysin S-like virulence factor that targets exclusively prokaryotic cells in vivo. mBio 8, e00259–17
51. Fu, J. et al. (2012) Full-length RecE enhances linear-linear homologous recombination and facilitates direct cloning for bioprospecting. Nat. Biotechnol. 30, 440–446
52. Li, J. and Neubauer, P. (2014) Escherichia coli as a cell factory for heterologous production of nonribosomal peptides and polyketides. New Biotechnol. 31, 579–585
53. Zhang, Y. et al. (2000) DNA cloning by homologous recombination in Escherichia coli. Nat. Biotechnol. 18, 1314–1317
54. Zhang, Y. et al. (1998) A new logic for DNA engineering using recombination in Escherichia coli. Nat. Genet. 20, 123–128
55. Ongley, S.E. et al. (2013) Recent advances in the heterologous expression of microbial natural product biosynthetic pathways. Nat. Prod. Rep. 30, 1121–1138
56. Guo, J. et al. (2018) Improved sgRNA design in bacteria via genomewide activity profiling. Nucleic Acids Res. 46, 7052–7069
57. Wang, H. et al. (2014) Improved seamless mutagenesis by recombineering using ccdB for counterselection. Nucleic Acids Res. 42, e37
58. Yin, J. et al. (2015) Direct cloning and heterologous expression of the salinomycin biosynthetic gene cluster from Streptomyces albus DSM41398 in Streptomyces coelicolor A3(2). Sci. Rep. 5, 15081
59. Jiang, W. et al. (2015) Cas9-assisted targeting of chromosome segments CATCH enables one-step targeted cloning of large gene clusters. Nat. Commun. 6, 8101
60. Wang, H. et al. (2018) ExoCET: exonuclease in vitro assembly combined with RecET recombination for highly efficient direct DNA cloning from complex genomes. Nucleic Acids Res. 46 (5), e28–e28
61. Johnson, C.H. et al. (2016) Metabolomics: beyond biomarkers and towards mechanisms. Nat. Rev. Mol. Cell Biol. 17, 451–459
62. Deda, O. et al. (2018) Rat fecal metabolomics-based analysis. In Metabolic Profiling: Methods and Protocols (Theodoridis, G.A., ed.), pp. 149–157, Springer New York
63. Wishart, D.S. (2016) Emerging applications of metabolomics in drug discovery and precision medicine. Nat. Rev. Drug Discov. 15, 473–484
64. An, D. et al. (2014) Sphingolipids from a symbiotic microbe regulate homeostasis of host intestinal natural killer T cells. Cell 156, 123–133
65. Li, X.S. et al. (2018) Untargeted metabolomics identifies trimethyllysine, a TMAO-producing nutrient precursor, as a predictor of incident cardiovascular disease risk. JCI Insight 3, e99096
66. Yan, S. et al. (2016) Metabolomics in gut microbiota: applications and challenges. Sci. Bull. 61, 1151–1153
67. Zhang, Y. et al. (2016) RiboFR-Seq: a novel approach to linking 16S rRNA amplicon profiles to metagenomes. Nucleic Acids Res. 44, e99–e99
68. Spencer, S.J. et al. (2016) Massively parallel sequencing of single cells by epicPCR links functional genes with phylogenetic markers. ISME J. 10, 427–436
69. Oberhardt, M.A. et al. (2015) Harnessing the landscape of microbial culture media to predict new organism-media pairings. Nat. Commun. 6, 8493–8493
70. Pujol, A. et al. (2011) Characterization and distribution of the gene cluster encoding RumC, an anti-Clostridium perfringens bacteriocin produced in the gut. FEMS Microbiol. Ecol. 78, 405–415
71. Lai, P.K. and Kaznessis, Y.N. (2017) Free energy calculations of microcin J25 variants binding to the FhuA receptor. J. Chem. Theory Comput. 13, 3413–3423
72. Quereda, J.J. et al. (2016) Bacteriocin from epidemic Listeria strains alters the host intestinal microbiota to favor infection. Proc. Natl. Acad. Sci. U. S. A. 113, 5706–5711
73. Trautman, E.P. et al. (2017) Domain-targeted metabolomics delineates the heterocycle assembly steps of colibactin biosynthesis. J. Am. Chem. Soc. 139, 4195–4201