Comparative whole genome analysis of Listeria monocytogenes 4b strains reveals least genome diversification irrespective of their niche specificity

Dharmendra K. Soni, Arpita Ghosh, Surendra K. Chikara, Krishna M. Singh, Chaitanya G. Joshi, Suresh K. Dubey

PII: S2452-0144(17)30040-7
DOI: 10.1016/j.genrep.2017.05.007
Reference: GENREP 149

To appear in: Gene Reports

Received date: 15 October 2016
Revised date: 12 May 2017
Accepted date: 19 May 2017

Please cite this article as: Dharmendra K. Soni, Arpita Ghosh, Surendra K. Chikara, Krishna M. Singh, Chaitanya G. Joshi, Suresh K. Dubey, Comparative whole genome analysis of Listeria monocytogenes 4b strains reveals genome diversification irrespective of their niche specificity, Gene Reports (2017), doi: 10.1016/j.genrep.2017.05.007

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

Comparative whole genome analysis of Listeria monocytogenes 4b strains reveals genome diversification irrespective of their niche specificity

Dharmendra K Soni^a, Arpita Ghosh^b, Surendra K Chikara^b, Krishna M Singh^b, Chaitanya G Joshi^c, Suresh K Dubey^a,*

Running title: Comparison of L. monocytogenes genome

a Centre of Advanced Study in Botany, Institute of Science, Banaras Hindu University, Varanasi 221005, India

b Eurofins Genomics India Pvt Ltd, 540/I, Doddenakundi Industrial Area 2, Hoodi, Whitefield, Bangalore 560048, Karnataka, India

c Department of Animal Biotechnology, Anand Agriculture University, Anand, Gujarat, India

* Corresponding author. Tel.: +91 542 2307147; Fax: +91 542 2368174

E-mail address: [email protected] (S. K. Dubey)

ABSTRACT

Listeria monocytogenes has emerged as a deadly pathogen inflicting high mortality in humans and animals. To investigate the strain-specific characteristics of L. monocytogenes, and their role(s) in virulence and ecological sustenance, we sequenced the genomes of three L. monocytogenes strains, BHU1, 2 and 3, isolated from the Ganges river, agricultural soil and a human placenta bit, respectively, and compared them with L. monocytogenes EGD-e (serovar 1/2a) and L. monocytogenes F2365 (serovar 4b). The contigs of all three strains had a similarity of > 90 % in regions that aligned with EGD-e and F2365. A total of 2872 core genes were identified across BHU1, 2 and 3. In the mice virulence assay, BHU2 and 3 showed pathogenicity while BHU1 was non-pathogenic. The strains also carried unique genes (8 in BHU1, 12 in BHU2, and 17 in BHU3). Phylogenetic analysis based on multilocus sequence typing placed BHU1, 2 and 3 closest to lineage I, serotype 4b, clonal complex 1, and sequence type 328. BHU1 seemed to harbor a nucleotide mutation from A to G in the major virulence genes, i.e. hemolysin D and listeriolysin O. Eight, five and two strain-specific mutations were identified in BHU1, 2 and 3, respectively, compared to F2365. Though all three strains were genetically very close, the observed differences may play crucial role(s) in their virulence attributes, and also in the prevalence of L. monocytogenes.

Keywords: Listeria monocytogenes; MLST; SNP; Phylogenetic analysis; Genomics

1. Introduction

Listeria monocytogenes is a low GC-content, Gram-positive, facultative intracellular foodborne pathogen with a wide ecological amplitude that resists diverse environmental stresses such as heavy metal ions, high salt concentrations, low pH, a wide range of temperatures and low water activity [1]. This possibly accounts for the prevalence of the pathogen in different ecological niches, e.g., soil, vegetation, food processing plants, food, animals, and humans [2]. Human listeriosis has been a major health concern, mostly in immune-compromised people, with a high mortality rate (20-30 %). These people develop invasive and systemic infections that manifest as meningitis, encephalitis, and septicemia [3]. Moreover, infections during pregnancy lead to abortion, stillbirth or septicemia in the neonates [4]. The pathogenic attributes of L. monocytogenes depend mainly on the prfA virulence gene cluster (Listeria pathogenicity island, LIPI-1), which encodes a number of proteins essential to cytosolic replication and to intra- and inter-cellular movement. Among these, listeriolysin O (LLO) is the main virulence factor for the escape of bacteria from the primary and secondary intracellular vacuoles [5,6]. A second cluster of virulence genes comprises members of the internalin protein family (InlA/B), which are cell wall-anchored or secreted proteins essential for attachment to and invasion of non-phagocytic host cells. Other genes associated with the infection event regulate the metabolic activities and stress response of the pathogen [7,8].

The widespread incidence of L. monocytogenes necessitates virulence and genotypic characterization of clinical, environmental and food isolates to establish epidemiologic linkages during disease outbreaks, and also for source tracking. In previous studies, we characterized L. monocytogenes isolates from food and clinical samples in terms of antibiotic susceptibility, serotypes, virulence genes, and phylogenetic and DNA fingerprint analyses [9-12]. Such studies reflect considerable diversity among isolates of the same species because of their distinct fingerprint profiles. However, such studies have limitations, as the repertoire of genetic determinants implicated in diversification and microevolution is still not well understood. Whole genome sequencing has emerged as a highly feasible tool in epidemiological analyses, and is anticipated to provide a cost-effective alternative to the current subtyping methods, multi-locus sequence typing (MLST) and pulsed-field gel electrophoresis (PFGE). Genomes of several L. monocytogenes strains have been sequenced in recent years [13-16]. Most studies emphasized genome comparisons for different lineages and serovars as well as for disease outbreak and persistent strains of L. monocytogenes [13-16]. The rapid expansion of genome sequencing and comparative analysis of additional strains from different countries, in relation to diverse habitats, is warranted.

L. monocytogenes can be divided into four possible lineages, among which lineage I, serovar 4b strains are responsible for the majority of human listeriosis [17]. Differences in virulence among strains belonging to the same serotype (4b), lineage, clonal complex (CC) and sequence type (ST) remain undefined. Therefore, the present study aimed at comparing the genome sequences of three L. monocytogenes strains (BHU1, 2 and 3; belonging to serovar 4b, lineage I, but with different virulence potential) isolated from the Ganges river, agricultural soil, and a human placenta bit, respectively, at Varanasi, India. This could provide a better understanding of the underlying molecular mechanisms of L. monocytogenes virulence and facilitate the identification of potential novel virulence determinants. We also looked for features uniquely shared by such strains to decipher gene(s) crucial to their survival under different environments. Studies on the newly sequenced L. monocytogenes strains are likely to offer an opportunity to delineate specific information at the genome level.

2. Materials and Methods

2.1. Listeria monocytogenes strains

L. monocytogenes strains BHU1, BHU2 and BHU3 were isolated from the Ganges river, agricultural soil and a human placenta bit, respectively [18]. The Ganges water samples from Varanasi (Ravidas ghat) were collected ten meters away from the bank of the river and one meter deep at around 9.00 am as described [9]. Soil samples were collected from a vegetable-grown field of the Indian Institute of Vegetable Research (IIVR), Varanasi, and were homogeneously mixed and sieved (2 mm) to remove plant debris as described [10]. Human placenta bits (a piece of placenta from pregnant women with repeated abortion) were collected immediately after delivery of the baby as described [12]. All the samples were collected from the Varanasi region of Uttar Pradesh, India, transported aseptically and chilled to the laboratory, and processed within 24 h of collection.

Isolates of L. monocytogenes were obtained from the samples using the standard double enrichment method as described by ISO 11290-1. Following inoculation of 25 mL/g each of water, soil and placental bit into 225 mL of half-Fraser broth (Difco, USA), incubation was done for 24 h at 30 °C. The second enrichment was carried out by adding 0.1 mL of the overnight grown culture into 10 mL of Fraser broth with fully concentrated selective agents (Difco, USA). After incubation at 37 °C for 48 h, the culture was spread on PALCAM agar (Difco, USA), followed again by incubation at 37 °C for 48 h. Grey-greenish colonies with a black sunken centre and a black halo were picked. The presence of L. monocytogenes was then confirmed by Gram staining and by biochemical tests: catalase test, methyl red-Voges-Proskauer (MR-VP) reaction, nitrate reduction, motility at 20-25 °C, acid production from rhamnose, xylose, mannitol and α-methyl-D-mannopyranoside, and the CAMP test with Staphylococcus aureus and Rhodococcus equi. L. monocytogenes strain MTCC1143, S. aureus strain MTCC1144 and R. equi strain MTCC1135 were used as controls for the reactions. Both the L. monocytogenes isolates and the control strains were preserved on tryptic soy agar slants at room temperature.

The whole genome sequences of L. monocytogenes EGD-e (Lineage II, serotype 1/2a, CC9, and ST35), the clinical wild type strain, and L. monocytogenes F2365 (Lineage I, serotype 4b, CC1, and ST1), a well-characterized disease outbreak strain from Jalisco soft cheese, were used as references. For MLST analysis, we used the following strains: F2365, CLIP 200900372 (Lineage I, serotype 4b, CC6, and ST6), CLIP 2005-00704 (Lineage I, serotype 1/2b, CC5, and ST5), FSL N1-017 (Lineage I, serotype 1/2b, CC3, and ST3), F6854 (Lineage II, serotype 1/2a, CC11, and ST11), F6900 (Lineage II, serotype 1/2a, CC11, and ST11), 10403S (Lineage II, serotype 1/2a, CC7, and ST85), 085578 (Lineage II, serotype 1/2a, CC8, and ST120), 085923 (Lineage II, serotype 1/2a, CC8, and ST120), and HCC23 (Lineage III, serotype 4a, CC69, and ST201).

2.2. DNA isolation

Genomic DNA from overnight grown strains (37 °C) was extracted using the QIAamp DNA Mini Kit (Qiagen, Milan, Italy). Cells (maximum 2 × 10^9 cells) harvested in a microcentrifuge tube (7500 rpm, 10 min) were re-suspended in 180 μL lysis buffer [20 mM Tris-Cl (pH 8.0), 2 mM sodium EDTA, 1.2 % Triton® X-100, 20 mg lysozyme (Sigma) per mL] and incubated (30 min, 37 °C). Proteinase K (25 μL) and 200 μL buffer AL (without ethanol) were added, mixed by vortexing, and the mixture was incubated (56 °C, 30 min). Thereafter, 4 μL RNase A (100 mg/mL) was added and the mixture was incubated further (2 min) at room temperature. Pure ethanol (200 μL, Merck) was added to the sample, which was then vortexed. The DNA was eluted in buffer AE, and its concentration and purity were ascertained using a NanoDrop spectrophotometer (ND 1000, NanoDrop Technologies, Inc., Wilmington, DE, USA).

2.3. Virulence-specific genes and serogroup identification

Internalin genes (inlA, inlC and inlJ), virulence-associated genes (plcA, actA, hlyA, iap and prfA) and serogroup (1/2a, 1/2b, 1/2c, and 4b) were determined by multiplex PCR with slight modifications of published protocols [19-21]. The PCR products were analyzed on an agarose gel (1.5 %), stained with ethidium bromide, and visualized under a UV transilluminator (AlphaImager EC). The details of the oligonucleotide sequences (Sigma) and PCR cycling conditions used in this study are given elsewhere [9,10].

2.4. Mice virulence assay

In the present investigation, 5-week-old female laboratory mice [Parkes (P) strain; weight 20-22 g] maintained under hygienic conditions in well-ventilated rooms (23 ± 2 °C) with a 12 h photoperiod (8 a.m. to 8 p.m. light) and relative humidity of 50 ± 20 % were used. Animals were provided with pellet food (Amrut Laboratory Animal Feeds, Pune, India) and drinking water ad libitum. Each group (n = 5) of experimental animals was housed separately in polypropylene cages (450 × 270 × 150 mm) with dry rice husk as the bedding material. The mice virulence assay was performed with a modification of the method described by Menudier et al. [22]. BHU1, 2 and 3 strains were grown along with control clinical strains (ATCC19115 and MTCC1143) for 24 h (37 °C) on Brain Heart Infusion Agar (BHIA; Difco) slants, and harvested in 5 mL sterile normal saline solution (NSS). Each suspension was washed by agitation with a sterile pipette, standardised turbidimetrically, and adjusted to McFarland nephelometric tube number 1 (approximately 3 × 10^8 cfu/mL) by adding either NSS or bacterial suspension. The inoculum (0.4 mL; ~10^7 cfu/mL) of each strain was administered intraperitoneally to a set of five mice. In each experiment, control mice were dosed with the known pathogenic bacteria (ATCC19115 and MTCC1143) or with saline alone. Mice were then observed every 6 h for 72 h, and the mortality was recorded. General health conditions were checked from time to time throughout. Mice were sacrificed under chloroform anesthesia by decapitation after death and the rest of the living mice were sacrificed after 72 h. Animals were maintained according to the guidelines of the Institutional Ethical Committee. The results are expressed as relative virulence (%), obtained by dividing the number of dead mice by the total number of mice tested for a particular strain.
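The relative virulence measure defined above can be illustrated with a short Python sketch. The function name is an illustrative choice and not part of the original protocol; the counts are the mortality figures reported for BHU1, 2 and 3 in Section 3.2 (five mice per strain).

```python
def relative_virulence(dead: int, tested: int) -> float:
    """Return relative virulence (%) = dead mice / mice tested * 100."""
    if tested <= 0:
        raise ValueError("at least one mouse must be tested")
    if not 0 <= dead <= tested:
        raise ValueError("dead must lie between 0 and tested")
    return 100.0 * dead / tested


# (dead, tested) per strain, as reported in the mice virulence results
mortality = {"BHU1": (0, 5), "BHU2": (3, 5), "BHU3": (5, 5)}

for strain, (dead, tested) in mortality.items():
    print(f"{strain}: {relative_virulence(dead, tested):.0f} % relative virulence")
```

With only five animals per group, each death shifts the measure by 20 percentage points, so the values are best read as a coarse ranking of the strains.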

2.5. Genome sequencing

Genome sequencing was carried out using high molecular weight genomic DNA from pure isolates. Barcoded shotgun libraries were prepared using the Ion Plus Fragment Library Preparation kit. Briefly, 500 ng of genomic DNA (BHU1, 2 and 3) was enzymatically sheared and subjected to Ion Torrent adapter and barcode ligation as per the manufacturer's instructions (Life Technologies, U.S.A.). The un-ligated adapters and barcodes were removed using AMPure bead purification (Beckman Coulter, Brea, CA, USA). After Qubit quantification of the DNA libraries (BHU1, 2 and 3), the barcoded libraries were pooled in equimolar amounts. The equimolar pooled library was subjected to emulsion PCR for clonal amplification using the Ion OneTouch 400 template kit, and enrichment for template-positive Ion Sphere Particles (ISPs) was performed with an Ion OneTouch ES (Life Technologies, Carlsbad, CA, U.S.A.). Sequencing of the enriched templates bound to the ISPs was carried out using the Ion PGM 400 sequencing kit (Life Technologies, Carlsbad, CA, U.S.A.). The fastq files generated for each barcoded sample were retrieved from the Ion Torrent suite for further analysis.
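Equimolar pooling of the barcoded libraries can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' worksheet: the conversion uses the common approximation of 660 g/mol per base pair of double-stranded DNA, and the concentrations and mean fragment lengths in the example are hypothetical placeholders, not values from this study.

```python
AVG_BP_MASS = 660.0  # g/mol per base pair; a common approximation for dsDNA


def molarity_nm(conc_ng_per_ul: float, mean_length_bp: float) -> float:
    """Convert a mass concentration (ng/uL) to molarity (nM, i.e. fmol/uL)."""
    return conc_ng_per_ul * 1e6 / (AVG_BP_MASS * mean_length_bp)


def pooling_volumes(libraries: dict, fmol_each: float) -> dict:
    """Volume (uL) of each library that contributes the same number of fmol."""
    return {
        name: fmol_each / molarity_nm(conc, length)
        for name, (conc, length) in libraries.items()
    }


# Hypothetical fluorometric readings: (ng/uL, mean fragment length in bp)
example = {"BHU1": (4.2, 480), "BHU2": (6.1, 470), "BHU3": (3.5, 490)}
for name, volume in pooling_volumes(example, fmol_each=20.0).items():
    print(f"{name}: {volume:.2f} uL")
```

Because 1 nM equals 1 fmol/μL, dividing the target amount by the molarity gives the volume directly; more concentrated libraries contribute proportionally less volume to the pool.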

2.6. Genome assembly

The raw reads were quality filtered (prinseq-lite-0.20.3), and de novo assembly was carried out using Newbler v2.6 [http://www.454.com/products/analysis-software/]. Gene annotation and screening for RNAs were performed by submitting the sequences to the Rapid Annotation using Subsystem Technology (RAST) server [23]. Comparative genomics was carried out with the MUMmer 3.23 (nucmer) alignment tool, using a minimum match length of 100 bases, by aligning BHU1, 2 and 3 against the EGD-e genome [24].

2.7. Phylogeny

Phylogenetic analyses were carried out for 9 L. monocytogenes strains of different lineages and serotypes using full-length MLST genes (i.e., the seven housekeeping genes abcZ, bglA, cat, dapE, dat, ldh and lhkA) retrieved from GenBank, http://www.pasteur.fr/recherche/genopole/PF8/mlst/Lmono.html and other databases [25]. For each strain, the MLST genes were concatenated, and multiple sequence alignment was performed using ClustalW implemented in MEGA5. Maximum likelihood phylogenetic trees were estimated in MEGA5 using the Tamura-Nei model [26].
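The concatenation step described above can be sketched in Python as follows. This is only an illustration of the idea: the alignment and tree building were done in MEGA5, the fixed locus order is an arbitrary assumption, and the short sequences in the example are made-up placeholders rather than real alleles.

```python
# Seven MLST loci; abcZ is the spelling used by the Institut Pasteur scheme.
# The order is an arbitrary but fixed choice so all strains are comparable.
MLST_LOCI = ("abcZ", "bglA", "cat", "dapE", "dat", "ldh", "lhkA")


def concatenate_mlst(alleles: dict, loci: tuple = MLST_LOCI) -> str:
    """Join the locus sequences of one strain in a fixed order."""
    missing = [locus for locus in loci if locus not in alleles]
    if missing:
        raise KeyError(f"missing loci: {', '.join(missing)}")
    return "".join(alleles[locus].upper() for locus in loci)


def to_fasta(strains: dict) -> str:
    """Build a FASTA string with one concatenated record per strain."""
    records = []
    for name, alleles in strains.items():
        records.append(f">{name}\n{concatenate_mlst(alleles)}")
    return "\n".join(records) + "\n"


# Placeholder sequences, for illustration only
toy = {
    "strain_A": dict(zip(MLST_LOCI, ["atg", "ccg", "tta", "gga", "cat", "aag", "tga"])),
    "strain_B": dict(zip(MLST_LOCI, ["atg", "ccg", "tta", "ggc", "cat", "aag", "tga"])),
}
print(to_fasta(toy))
```

Keeping the locus order identical for every strain matters: the concatenated sequences are only comparable column by column if each gene occupies the same block in every record.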

2.8. Single nucleotide polymorphisms

The high quality reads were mapped to the reference genomes EGD-e and F2365 using BWA v0.7.5a [27]. The SNPs were called for BHU1, 2 and 3 using SAMtools v0.1.19 (mpileup). PCR duplicates were removed to obtain high quality SNPs [28]. The SNPs were filtered using the following criteria: a minimum read depth of 20 was required; SNPs located in the terminal regions of contigs with less than 100 bp of flanking sequence, SNPs within a distance of 100 bp of the target SNP, and SNPs with a base quality of less than 25 were all discarded. SNPs in low complexity regions were also discarded.
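The filtering criteria listed above can be expressed as a small Python sketch. It is an interpretation under stated assumptions rather than the authors' pipeline: the `Snp` record and function name are hypothetical, neighbouring SNPs are checked against all raw calls on the same contig, and the low-complexity filter is not implemented because the text does not say how such regions were defined.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snp:
    contig: str
    pos: int      # 1-based position on the contig
    depth: int    # read depth at the site
    qual: float   # base quality at the site


def filter_snps(snps, contig_lengths, min_depth=20, min_qual=25,
                flank=100, min_spacing=100):
    """Keep SNPs that pass the depth, quality, flank and spacing criteria."""
    positions = {}
    for snp in snps:
        positions.setdefault(snp.contig, []).append(snp.pos)

    kept = []
    for snp in snps:
        if snp.depth < min_depth or snp.qual < min_qual:
            continue
        length = contig_lengths[snp.contig]
        # Require at least `flank` bases on both sides of the SNP
        if snp.pos <= flank or snp.pos > length - flank:
            continue
        # Discard SNPs that have another raw call within `min_spacing` bp
        if any(p != snp.pos and abs(p - snp.pos) <= min_spacing
               for p in positions[snp.contig]):
            continue
        kept.append(snp)
    return kept


# Hypothetical calls on a 5 kb contig, for illustration only
calls = [Snp("contig1", 50, 30, 40), Snp("contig1", 500, 30, 40),
         Snp("contig1", 2000, 30, 40), Snp("contig1", 2050, 30, 40)]
print([s.pos for s in filter_snps(calls, {"contig1": 5000})])
```

Checking spacing against the raw calls is the stricter reading of the criterion; checking only against SNPs that survive the other filters would retain more sites.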

2.9. Nucleotide sequence accession numbers

The whole genome sequences described here have been deposited at DDBJ/EMBL/GenBank under the accession numbers JUKE00000000 (SRA: SRS752068), JUKF00000000 (SRA: SRS752762) and JUKG00000000 (SRA: SRS752764) for Listeria monocytogenes BHU1, BHU2 and BHU3, respectively.

3. Results

3.1. Virulence-specific genes and serogroup identification

BHU1, 2 and 3 strains of L. monocytogenes were positive for the inlA, inlC, inlJ, plcA, prfA, actA, hlyA and iap genes. In the serotype-specific multiplex PCR, all the strains were positive for the ORF2110 and ORF2819 genes, indicating that they belonged to the 4b, 4d or 4e serogroup.

3.2. Mice virulence assay

The three strains from the Ganges river, agricultural soil and human clinical samples were assessed by the mice inoculation test. Among these, BHU1 was non-pathogenic to mice (none of the 5 mice tested died), while BHU2 and 3 showed 60 % (3 of the 5 mice tested died) and 100 % (all 5 mice tested died) relative virulence, respectively, within 72 h. The control clinical strains (ATCC 19115 and MTCC 1143) showed 100 % relative virulence within 72 h.

3.3. General features of genomes

De novo assembly of the BHU1, 2 and 3 short reads produced 13 [N50 (a statistic measuring assembly quality) = 476568], 19 (N50 = 263325) and 15 (N50 = 540218) contigs, respectively. The general features of the BHU1, 2 and 3 genomes are described in Table 1. A total of 2872 core genes were identified across BHU1, 2 and 3 using OrthoMCL (Fig. 1). Thirty-three genes were common to BHU1 and 2, 34 to BHU2 and 3, and 23 to BHU3 and 1. Eight, 12 and 17 unique genes were found in BHU1, 2 and 3, respectively (Table 2).
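For readers unfamiliar with the N50 statistic quoted above, the following Python sketch shows one common way to compute it: the length of the contig at which the cumulative length, counted from the longest contig downwards, first reaches half of the total assembly length. The contig lengths in the example are hypothetical and are not those of the BHU assemblies.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly given its contig lengths."""
    if not contig_lengths:
        raise ValueError("no contigs given")
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        # First contig at which the running sum covers half of the assembly
        if 2 * running >= total:
            return length


# Hypothetical contig lengths (bp), for illustration only
print(n50([800_000, 700_000, 500_000, 400_000, 300_000, 200_000]))
```

A larger N50 for a similar total length indicates that more of the genome is contained in long contigs, which is why it is commonly reported next to the contig count.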

3.4. Phylogenetic analyses

We extracted the MLST determinants in silico from the genome sequences. Phylogenetic analysis based on the MLST genes placed L. monocytogenes strains BHU1, 2 and 3 in the same group, within evolutionary lineage I, serotype 4b, CC1, and ST328 (Fig. 2).

3.5. Single nucleotide polymorphism analysis in BHU1, 2 and 3 against EGD-e strain

Single nucleotide polymorphism (SNP) analysis was carried out by mapping the reads of BHU1, 2 and 3 against the reference strain. First, the three strains (BHU1, 2 and 3), when mapped against EGD-e, revealed 91580, 85557 and 91399 SNPs and 544, 828 and 702 indels, respectively. Among these, 58291 SNPs were common to all three strains. Out of the total, only 147, 215 and 243 SNPs were heterozygous in BHU1, 2 and 3, respectively. We found common nucleotide as well as amino acid substitutions in the internalin genes (inlA, inlB, inlC, inlE, inlH) of strains BHU1, 2 and 3 when compared with EGD-e (Table 3). Further, we identified common nucleotide mutations with no change in amino acids in the internalin genes (inlA, inlB, inlC, inlE, inlH) of strains BHU1, 2 and 3 (Supplementary Table S1). We also observed different nucleotide mutations in the internalin genes (inlA, inlB, inlC, inlE, inlH) in strain BHU1, 2 or 3 when compared with EGD-e (Supplementary Table S2).

The study revealed common mutations at the nucleotide level in genes (actA, mpl, pycA, uhpT) that could have resulted in amino acid substitutions in strains BHU1, 2 and 3 compared to EGD-e; among these, the substitutions of the small amino acid alanine by the polar amino acid tyrosine, and of serine by proline, could have altered protein structure and function (Table 4). We also identified common nucleotide mutations with no change in amino acids in genes (actA, ami, dltA, fri, hly, iap, lgt, mpl, murA, plcA, plcB, pycA, recA, sod, uhpT, lsp and gap) of strains BHU1, 2 and 3 (Supplementary Table S1). We found different mutations in genes (actA, ami, dltA, fri, hly, iap, lgt, mpl, murA, plcA, plcB, pycA, recA, sod, uhpT, lsp, gap, gtc and prfA) in strains BHU1, 2 and 3 relative to EGD-e (Supplementary Table S2).
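The split of SNP calls into those common to all strains and those specific to a single strain, as reported in this section, amounts to simple set operations. The sketch below assumes that a SNP is identified by its reference position, reference base and alternate base; the strain labels and calls are hypothetical placeholders, not data from this study.

```python
def partition_snps(calls: dict):
    """Split per-strain SNP sets into common and strain-specific SNPs.

    calls maps a strain name to a set of (position, ref_base, alt_base).
    """
    common = set.intersection(*calls.values())
    specific = {}
    for strain, snps in calls.items():
        others = set().union(*(v for k, v in calls.items() if k != strain))
        specific[strain] = snps - others
    return common, specific


# Hypothetical calls against one reference, for illustration only
toy_calls = {
    "strain_A": {(10, "A", "G"), (20, "C", "T")},
    "strain_B": {(10, "A", "G"), (30, "G", "A")},
    "strain_C": {(10, "A", "G"), (20, "C", "T")},
}
common, specific = partition_snps(toy_calls)
print("common:", sorted(common))
for strain, snps in specific.items():
    print(strain, "specific:", sorted(snps))
```

Including the alternate base in the key means that two strains carrying different substitutions at the same position are not counted as sharing a SNP.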

3.6. SNP analysis in BHU1, 2 and 3 against F2365 strain

Further, the three strains (BHU1, 2 and 3), when mapped against F2365, revealed 127, 142 and 144 SNPs and 4833, 1543 and 2151 indels, respectively. Among these, 107, 122 and 119 were genic SNPs and 4063, 1214 and 1750 were genic indels in BHU1, 2 and 3, respectively. Out of the total genic SNPs, 94 were common to all three strains (Supplementary Table S3). Besides the common mutations, we also found different SNPs in strains BHU1, 2 and 3 relative to F2365 (Table 5). We observed BHU1 and 2 to harbor a nucleotide mutation from A to G in the major virulence genes, i.e. hemolysin D and listeriolysin O. Apart from this, we also identified 8, 5 and 2 strain-specific mutations in BHU1, 2 and 3, respectively, compared to F2365.

In BHU1, the 8 strain-specific nucleotide mutations were from G to T at 424218 in the gene of uracil-DNA glycosylase 1; from T to C, C to T and A to G at 1246941, 1247091 and 2002410, respectively, in the gene of cell wall surface anchor protein; from C to A at 2117553 in the gene of a hypothetical protein; from C to T at 516197 in the gene of macrolide ABC transporter permease; from A to C at 2192925 in the gene of maltose phosphorylase; and from G to T at 2507717 in the gene of UDP-glucose 4-epimerase.

In BHU2, the 5 strain-specific nucleotide mutations were from C to G at 159466 in the gene of peptide ABC transporter substrate-binding protein, from C to A at 1026229 in the gene of acyltransferase, from T to A at 1132280 in the gene of CAAX amino terminal protease, from A to G at 2676448 in the gene of membrane protein, from G to A at 2704685 in the gene of two-component system sensor histidine kinase KdpD, and from G to A at 2807625 in the gene of RpiR family transcriptional regulator.

In BHU3, the two strain-specific nucleotide mutations were from T to C at 486998 in the gene of internalin and from C to T at 1869219 in the gene of a hypothetical protein.

4. Discussion

The emergence of next-generation sequencing technology has facilitated rapid, inexpensive and high-throughput microbial whole-genome analysis, and shows promise in improving our understanding of bacterial pathogenesis, detection and control. The present study reports the genome sequences of L. monocytogenes BHU1, 2 and 3 strains from different habitats (river water, agricultural soil and human placenta bit). The genome size (2.9 Mb) of BHU1, 2 and 3 is in accordance with those of other sequenced strains, which lie between 2.87 Mb [L. monocytogenes Finland 1988 (GenBank accession number CP002004)] and 3.02 Mb [L. monocytogenes Scott A (GenBank accession number AFGI00000000.1)]. The three L. monocytogenes genomes sequenced are similar in size, G+C content, coding percentage, and the average length of protein-coding genes (Table 1).

The genome analysis indicates that the strains harbor unique genes. BHU1 specifically harbors an ABC transporter, ATP-binding protein, which plays important role(s) in various physiological processes including nutrient uptake, multi-drug resistance, secretion of signal molecules or toxins, cell volume regulation, and other events [29]. BHU2 harbors a zinc ABC transporter, periplasmic-binding protein (ZnuA). Two ABC-type zinc importers (ZnuABC and ZurAM) are reported to contribute to full virulence [30]. Further, BHU2 also harbors an internalin-like protein (LPXTG motif) Lmo2821 homolog, which encodes inlJ. Mutation in inlJ is reported to cause a virulence defect in both wild-type mice (after intravenous infection) and hEcad mice (after oral infection) [31]. Another gene implicated in pathogenesis, the pdu gene, with its specific role in propanediol catabolism, has also been found in BHU2 [32]. The presence of such genes suggests their role in the pathogenesis of the BHU2 strain. Additionally, BHU2 harbors another strain-specific gene related to adenosylcobalamin (AdoCbl), also known as coenzyme B12, which is the most complex coenzyme known, with important metabolic role(s) in many microbes [33]. BHU3 harbors Lmo2470, which belongs to the secreted internalins and encodes inlP. InlP has been identified for its role in infection of the placenta; it is conserved in virulent L. monocytogenes strains but absent in Listeria species that are non-pathogenic for humans [34]. In concordance with the previous report [34], the results of the present study suggest that InlP strongly promotes placental infection and can be used to understand host-pathogen interactions at the maternal-fetal interface. Further studies may facilitate the development of new modalities for the prevention and treatment of infection-related pregnancy issues such as preterm labor, stillbirth and abortion. BHU3 also harbors the phosphoenolpyruvate-dependent phosphotransferase (PTS) system, galactitol-specific IIB component, which allows galactitol to be used as an energy source in glycolysis [35] (Table 2).

Further, BHU3 harbors a gene for acyl carrier protein synthases, whose products play vital role(s) in membrane adaptation to high and low temperature [36]. Another gene, Lmo2026, is present in BHU3 and is possibly involved in listerial multiplication in the brain [37] (Table 2). However, the presence of these unique genes in the three sequenced strains needs further validation. The results are congruent with the concept that the Listeria genome contains genes crucial to virulence, and even to the very survival of the pathogen. The findings of the present study are in accordance with earlier reports that the L. monocytogenes genomes of strains EGDe, F6854, F2365 and H7858 contain 61, 97, 51 and 69 strain-specific genes, respectively [38,39]. Recently, the comparison of two 4b L. monocytogenes genomes (CLIP80459 and F2365) revealed 115 genes to be specific for CLIP80459 with respect to strain F2365 [40]. It was suggested that four transcriptional regulators and four surface-anchored proteins were specific to 4b CLIP80459, indicating differences in regulation, sugar metabolism and surface characteristics between the two strains. Further, the comparative analysis revealed the BHU strains to have three 'conserved hypothetical' proteins and also 547 to 549 hypothetical proteins. The conserved hypothetical proteins are potential targets for designing diagnostic molecular markers.

Prior to whole genome sequencing, we screened L. monocytogenes strains from our laboratory for their serotype and virulence genes, and used the in vivo mice virulence assay to discriminate between pathogenic and non-pathogenic strains, as it is the "gold standard" pathogenicity test in bacteria including L. monocytogenes and facilitates in vivo measurement of all the virulence determinants [41,42]. The strains (BHU1, 2 and 3) showed the presence of all major virulence genes and belonged to the 4b serovar. These strains had different virulence potential: BHU2 and 3 were pathogenic, in contrast to BHU1. We observed BHU1 to harbor a nucleotide mutation from A to G in the major virulence genes, i.e. hemolysin D and listeriolysin O, which is in accordance with previous reports suggesting that the mere presence of virulence-associated genes cannot be correlated with the virulence of L. monocytogenes [12,43]. Further, the literature also indicates that some naturally virulence-attenuated L. monocytogenes strains often contain mutations in the prfA, hlyA, actA, and inlA genes, and that this attribute possibly results in the expression of truncated or non-functional PrfA, LLO, ActA and InlA proteins [44,45]. In addition, in the present study, the BHU1 strain showed genic mutations in cell wall surface anchor proteins, UDP-glucose 4-epimerase, maltose phosphorylase, macrolide ABC transporter permease, and uracil-DNA glycosylase 1, which could also be a confounding factor in the non-pathogenicity of the strain. Variations in genes involved in cell wall metabolism, and in those encoding cell wall-anchored proteins, possibly reflect the ability of the strains to interact with and infect various cell types and tissues. However, as the number of strains investigated is relatively limited and the genomes of BHU1, 2 and 3 are drafts, it cannot be concluded which genes or mutations best explain the difference in virulence among strains assigned to the same serotype and lineage, or their adaptability to diverse environments. It is likely that comparisons between a collection of 4b strains from different environments, and the assessment of their virulence potential at the genomic and proteomic level, could point to the key features behind the differences observed in virulence and in adaptability to diverse environments.

The precise characterization of L. monocytogenes is essential for assessing the long-term trends in sporadic cases as well as for the detection and identification of their common source. The MLST analysis revealed that all three BHU strains sequenced belonged to lineage I, CC1, and ST328 (Fig. 2). The average nucleotide identity of the serotype 4b strains sequenced in this study was 99.99 %, supporting their highly clonal nature. This clonal complex and sequence type of L. monocytogenes is persistently seen in India [46]. Moreover, ST328 is found to be a geographically distinct strain constrained primarily to India, with an unknown route of transmission that warrants further investigation.

5. Conclusion

Phylogenetic analyses revealed that BHU1, 2 and 3 belong to lineage I, serotype 4b, and CC1. In spite of having all the virulence genes, the strains showed variations in mice virulence. Apart from the strain-specific genes and the mutations identified in genes, the genomes of all three strains were remarkably similar in terms of gene content and organization. Genic mutations in energy metabolism and transport most probably impart varying abilities to the strains to withstand adverse environments and to colonize different ecological niches. Overall, this study demonstrated that currently used DNA sequencing provides valuable insight into the basic genetic underpinnings that define features of the organism and may have role(s) in its virulence and survival in different ecological niches. The information gained here is likely to add to our understanding of the intra-specific variations in virulence, ecology, epidemiology and evolution of the pathogen.

Authors' contributions

DKS and SKD contributed to the experimental design, data analysis, and manuscript preparation. AG, SKC, KMS and CGJ contributed to whole genome sequence analysis. All authors have read and approved the final draft before submission to Gene Reports.

Competing interests

We have no conflict of interest.

Acknowledgements

This study was supported by the Indian Council of Medical Research (ICMR), Government of India, New Delhi, through research project No. 5/3/3/10/2007-ECD-I. We thank the Coordinator, CAS and DST-FIST, Department of Botany, Banaras Hindu University, for facilities.

References

[1] B.H. Lado, A.E. Yousef, Characteristics of Listeria monocytogenes important to food processors, in: E.T. Ryser, E.H. Marth (Eds.), Listeria, listeriosis and food safety, CRC Press, (2007) 157-214.
[2] D. Liu, Epidemiology, in: D. Liu (Ed.), Handbook of Listeria monocytogenes, CRC Press, Boca Raton (FL), (2008) 27-59.
[3] B. Swaminathan, P. Gerner-Smidt, The epidemiology of human listeriosis, Microbes Infect. 9 (2007) 1236-43.
[4] K.P. Poulsen, C.J. Czuprynski, Pathogenesis of listeriosis during pregnancy, Anim. Health Res. Rev. 14 (2013) 30-39.
[5] D.K. Soni, A. Nath, S.K. Dubey, Evaluation and use of in-silico structure based epitope prediction for listeriolysin O of Listeria monocytogenes, Indian J. Biotechnol. 14 (2015) 160-66.
[6] Kashish, D.K. Soni, S.K. Mishra, R. Prakash, S.K. Dubey, Label-free impedimetric detection of Listeria monocytogenes based on poly-5-carboxy indole modified ssDNA probe, J. Biotechnol. 200 (2015) 70-76.
[7] M. Hamon, H. Bierne, P. Cossart, Listeria monocytogenes: a multifaceted model, Nat. Rev. Microbiol. 4 (2006) 423-34.
[8] N.E. Freitag, G.C. Port, M.D. Miner, Listeria monocytogenes-from saprophytes to intracellular pathogen, Nat. Rev. Microbiol. 7 (2009) 623-28.
[9] D.K. Soni, R.K. Singh, D.V. Singh, S.K. Dubey, Characterization of Listeria monocytogenes isolated from Ganges water, human clinical and milk samples at Varanasi, India, Infect. Genet. Evol. 14 (2013) 83-91.
[10] D.K. Soni, M. Singh, D.V. Singh, S.K. Dubey, Virulence and genotypic characterization of Listeria monocytogenes isolated from vegetable and soil samples, BMC Microbiol. 14 (2014) 241.
[11] D.K. Soni, S.K. Dubey, Phylogenetic analysis of the Listeria monocytogenes based on sequencing of 16S rRNA and hlyA genes, Mol. Biol. Rep. 41 (2014) 8219-29.
[12] D.K. Soni, D.V. Singh, S.K. Dubey, Pregnancy-associated human listeriosis: virulence and genotypic analysis of Listeria monocytogenes, J. Microbiol. 53 (2015) 653-660.
[13] M.W. Gilmour, M. Graham, G.V. Domselaar, S. Tyler, H. Kent, K.M. Trout-Yakel, O. Larios, V. Allen, B. Lee, C. Nadon, High-throughput genome sequencing of two Listeria monocytogenes clinical isolates during a large foodborne outbreak, BMC Genomics 11 (2010) 120.
[14] H.C. den Bakker, B.M. Bowen, L.D. Rodriguez-Rivera, M. Wiedmann, FSL J1-208, a virulent uncommon phylogenetic lineage IV Listeria monocytogenes strain with a small chromosome size and a putative virulence plasmid carrying internalin-like genes, Appl. Environ. Microbiol. 78 (2012) 1876-89.
[15] A. Holch, K. Webb, O. Lukjancenko, D. Ussery, B.M. Rosenthal, L. Gram, Genome sequencing identifies two nearly unchanged strains of persistent Listeria monocytogenes isolated at two different fish processing plants sampled 6 years apart, Appl. Environ. Microbiol. 79 (2013) 2944-51.
[16] C. Becavin, C. Bouchier, P. Lechat, A. Cristel, S. Creno, E. Gouin, Z. Wu, A. Kuhbacher, S. Brisse, M.G. Pucciarelli, F.G. Portillo, T. Hain, D.A. Portnoy, T. Chakraborty, M. Lecuit, J. Pizarro-Cerda, I. Moszer, H. Bierne, P. Cossart, Comparison of widely used Listeria monocytogenes strains EGD, 10403S, and EGD-e highlights genomic differences underlying variations in pathogenicity, mBio 5 (2014) e00969-14.
[17] R.H. Orsi, H.C. den Bakker, M. Wiedmann, Listeria monocytogenes lineages: Genomics, evolution, ecology, and phenotypic characteristics, Int. J. Med. Microbiol. 301 (2011) 79-96.
[18] D.K. Soni, K.M. Singh, A. Ghosh, S.K. Chikara, C.G. Joshi, S.K. Dubey, Whole-genome sequence of Listeria monocytogenes strains from clinical and environmental samples from Varanasi, India, Genome Announc. 3 (2015) e01496-14.
[19] D. Liu, M.L. Lawrence, F.W. Austin, A.J. Ainsworth, A multiplex PCR for species- and virulence-specific determination of Listeria monocytogenes, J. Microbiol. Meth. 71 (2007) 133-40.
[20] S.H. Notermans, J. Dufrenne, M. Leimeister-Wachter, E. Domann, T. Chakraborty, Phosphatidylinositol-specific phospholipase C activity as a marker to distinguish between pathogenic and non-pathogenic Listeria species, Appl. Environ. Microbiol. 57 (1991) 2666-70.
[21] M. Doumith, C. Buchrieser, P. Glaser, C. Jacquet, P. Martin, Differentiation of the major Listeria monocytogenes serovars by multiplex PCR, J. Clin. Microbiol. 42 (2004) 3819-22.
[22] A. Menudier, C. Bosiraud, J.A. Nicolas, Virulence of Listeria monocytogenes serovars and Listeria spp. in experimental infection of mice, J. Food Prot. 54 (1991) 917-21.
[23] R.K. Aziz, D. Bartels, A.A. Best, M. DeJongh, T. Disz, R.A. Edwards, K. Formsma, S. Gerdes, E.M. Glass, M. Kubal, F. Meyer, G.J. Olsen, R. Olson, A.L. Osterman, R.A. Overbeek, L.K. McNeil, D. Paarmann, T. Paczian, B. Parrello, G.D. Pusch, C. Reich, R. Stevens, O. Vassieva, V. Vonstein, A. Wilke, O. Zagnitko, The RAST server: rapid annotations using subsystems technology, BMC Genomics 9 (2008) 75.
[24] A.L. Delcher, S.L. Salzberg, A.M. Phillippy, Using MUMmer to identify similar regions in large sequence sets, Curr. Protoc. Bioinformatics 10 (2003) 10.3.
[25] M. Ragon, T. Wirth, F. Hollandt, R. Lavenir, M. Lecuit, A. Le Monnier, S. Brisse, A new perspective on Listeria monocytogenes evolution, PLoS Pathog. 4 (2008) e1000146.
[26] K. Tamura, D. Peterson, N. Peterson, G. Stecher, M. Nei, S. Kumar, MEGA5: molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods, Mol. Biol. Evol. 28 (2011) 2731-9.
[27] H. Li, R. Durbin, Fast and accurate short read alignment with Burrows-Wheeler transform, Bioinformatics 25 (2009) 1754-60.
[28] H. Li, B. Handsaker, A. Wysoker, T. Fennell, J. Ruan, N. Homer, G. Marth, G. Abecasis, R. Durbin, The sequence alignment/map format and SAMtools, Bioinformatics 25 (2009) 2078-9.
[29] J.S. Patzlaff, T. van der Heide, B. Poolman, The ATP/substrate stoichiometry of the ATP-binding cassette (ABC) transporter OpuA, J. Biol. Chem. 278 (2003) 29546-51.
[30] D. Corbett, J. Wang, S. Schuler, G. Lopez-Castejon, S. Glenn, D. Brough, P.W. Andrew, J.S. Cavet, I.S. Roberts, Two zinc uptake systems contribute to the full virulence of Listeria monocytogenes during growth in vitro and in vivo, Infect. Immun. 80 (2012) 14-21.
[31] C. Sabet, M. Lecuit, D. Cabanes, P. Cossart, H. Bierne, LPXTG Protein InlJ, a newly identified internalin involved in Listeria monocytogenes virulence, Infect. Immun. 73 (2005) 6912-22.
[32] J.R. Mellin, T. Tiensuu, C. Bécavin, E. Gouin, J. Johansson, P. Cossart, A riboswitch-regulated antisense RNA in Listeria monocytogenes, Proc. Natl. Acad. Sci. 110 (2013) 13132-7.
[33] C. Fan, H.J. Fromm, T.A. Bobik, Kinetic and functional analysis of L-threonine kinase, the PduX enzyme of Salmonella enterica, J. Biol. Chem. 284 (2009) 20240-48.
[34] C. Faralla, G.A. Rizzuto, D.E. Lowe, B. Kim, C. Cooke, L.R. Shiow, A.I. Bakardjiev, InlP, a new virulence factor with strong placental tropism, Infect. Immun. 84 (2016) 3584-96.
[35] S. Mujahid, R.H. Orsi, K.J. Boor, M. Wiedmann, Protein level identification of the Listeria monocytogenes Sigma H, Sigma L, and Sigma C regulons, BMC Microbiol. 13 (2013) 156.
[36] K. Zhu, X. Ding, M. Julotok, B.J. Wilkinson, Exogenous isoleucine and fatty acid shortening ensure the high content of anteiso-C15:0 fatty acid required for low-temperature growth of Listeria monocytogenes, Appl. Environ. Microbiol. 71 (2005) 8002-7.
[37] H. Bierne, C. Sabet, N. Personnic, P. Cossart, Internalins: a complex family of leucine-rich repeat-containing proteins in Listeria monocytogenes, Microbes Infect. 9 (2007) 1156-66.
[38] K.E. Nelson, D.E. Fouts, E.F. Mongodin, J. Ravel, R.T. DeBoy, J.F. Kolonay, D.A. Rasko, S.V. Angiuoli, S.R. Gill, I.T. Paulsen, J. Peterson, O. White, W.C. Nelson, W. Nierman, M.J. Beanan, L.M. Brinkac, S.C. Daugherty, R.J. Dodson, A.S. Durkin, R. Madupu, D.H. Haft, J. Selengut, S. Van Aken, H. Khouri, N. Fedorova, H. Forberger, B. Tran, S. Kathariou, L.D. Wonderling, G.A. Uhlich, D.O. Bayles, J.B. Luchansky, C.M. Fraser, Whole genome comparisons of serotype 4b and 1/2a strains of the food-borne pathogen Listeria monocytogenes reveal new insights into the core genome components of this species, Nucleic Acids Res. 32 (2004) 2386-95.
[39] C. Buchrieser, Biodiversity of the species Listeria monocytogenes and the genus Listeria, Microbes Infect. 9 (2007) 1147-55.
[40] T. Hain, R. Ghai, A. Billon, C.T. Kuenne, C. Steinweg, B. Izar, W. Mohamed, M.A. Mraheil, E. Domann, S. Schaffrath, U. Kärst, A. Goesmann, S. Oehm, A. Pühler, R. Merkl, S. Vorwerk, P. Glaser, P. Garrido, C. Rusniok, C. Buchrieser, W. Goebel, T. Chakraborty, Comparative genomics and transcriptomics of lineages I, II and III strains of Listeria monocytogenes, BMC Genomics 13 (2012) 144.
[41] K. Takeuchi, N. Mytle, S. Lambert, M. Coleman, M.P. Doyle, M.A. Smith, Comparison of Listeria monocytogenes virulence in a mouse model, J. Food Protect. 69 (2006) 842-46.
[42] D. Liu, M.L. Lawrence, A.J. Ainsworth, F.W. Austin, Toward an improved laboratory definition of Listeria monocytogenes virulence, Int. J. Food Microbiol. 118 (2007) 101-11.
[43] S. Kaur, S.V.S. Malik, K.N. Bhilegaonkar, V.M. Vaidya, S.B. Barbuddhe, Use of a phospholipase-C assay, in vivo pathogenicity assays and PCR in assessing the virulence of Listeria spp, Vet. J. 184 (2010) 366-70.
[44] R.H. Orsi, D.R. Ripoll, M. Yeung, K.K. Nightingale, M. Wiedmann, Recombination and positive selection contribute to evolution of Listeria monocytogenes inlA, Microbiol. 153 (2007) 2666-78.
[45] J. Shen, L. Rump, Y. Zhang, Y. Chen, X. Wang, J. Meng, Molecular subtyping and virulence gene analysis of Listeria monocytogenes isolates from food, Food Microbiol. 35 (2013) 58-64.
[46] S.B. Barbuddhe, S.P. Doijad, A. Goesmann, R. Hilker, K.V. Poharkar, D.B. Rawool, N.V. Kurkure, D.R. Kalorey, S.S. Malik, I. Shakuntala, S. Chaudhari, V. Waskar, D. D'Costa, R. Kolhe, R. Arora, A. Roy, A. Raorane, S. Kale, A. Pathak, M. Negi, S. Kaur, R. Waghmare, S. Warke, S. Shoukat, B. Harish, A. Poojary, C. Madhavaprasad, K. Nagappa, S. Das, R. Zende, S. Garg, S. Bhosle, S. Radriguez, A. Paturkar, M. Fritzenwanker, H. Ghosh, T. Hain, T. Chakraborty, Presence of a widely disseminated Listeria monocytogenes serotype 4b clone in India, Emerg. Microbes Infect. 5 (2016) e55.

Legend to figures

Fig. 1. Venn diagram representing the core, common and unique genes in strains BHU1, BHU2, and BHU3 of L. monocytogenes.

Fig. 2. Maximum likelihood phylogenetic tree of L. monocytogenes strains based on the concatenated full-length MLST gene sequences, calculated with MEGA 5 using the Tamura-Nei model. Lineage, serotype, clonal complex and sequence type are indicated in brackets. Branch lengths are shown, and node values represent the percent bootstrap confidence of each branch. The phylogeny shows three main clades, corresponding to lineages I, II and III.

Fig. 1.


Fig. 2.

Table 1 Origin and general features of compared isolates of species L. monocytogenes

Strains | BHU 1 | BHU 2 | BHU 3 | EGD-e
Origin | Ganges water | Agricultural soil | Placenta bit | Rabbit
Chromosome accession | GCA_000806395.1 | GCA_000806455.1 | GCA_000806405.1 | AL591824
No. of contigs/chromosome | 13 | 19 | 15 | 1
Total length of all de novo-assembled contigs (bp) | 2944481 | 2945958 | 2946809 | 2944528
Total G+C content (%) | 39 | 38 | 38 | 38
G+C content of protein coding genes (%) | 37.9 | 37.8 | 37.9 | 38
Number of CDS | 3005 | 3011 | 3030 | 2846
Protein coding DNA (%) | 89 | 89 | 89 | 89
Conserved hypothetical | 3 | 3 | 3 | 0
Unknown function | 6 | 5 | 6 | 0
Hypothetical | 547 | 549 | 547 | 0
Total RNAs | 54 | 48 | 48 | 85
Number of rRNA genes | 2 | 2 | 2 | 18
Number of tRNA genes | 52 | 46 | 46 | 67
Number of sRNA genes | 1 | 1 | 1 | 6

Table 2 Unique genes found in BHU1, 2 and 3

BHU1
BHU2
BHU3
Beta-glucosidase (EC 3.2.1.21)
Zinc ABC transporter, periplasmic-binding protein ZnuA
Internalin-like protein (LPXTG motif) Lmo2026 homolog
Hypothetical protein
SSU ribosomal protein S21p
PTS system, galactitol-specific IIB component (EC 2.7.1.69)
ABC transporter, ATP-binding protein
Hypothetical protein
TsaE protein, required for threonylcarbamoyladenosine t(6)A37 formation in tRNA
Late competence protein ComC, processing protease
Internalin-like protein (LPXTG motif) Lmo2821 homolog
ABC-type Fe3+-siderophore transport system, permease component
Hypothetical protein
Amidase family protein
Hypothetical protein DUF194, DegV family
Hypothetical protein, Lmo2307 homolog [Bacteriophage A118]
Internalin-like protein (LPXTG motif) Lin1204 homolog
FIG00776142: hypothetical protein
Hypothetical protein
PspC domain protein, truncated
Adenosylcobinamide-phosphate synthase (EC 6.3.1.10)
Internalin-like protein Lmo2470 homolog
Propanediol utilization polyhedral body protein PduA
Hypothetical protein YrhD
Ribonuclease M5 (EC 3.1.26.8)
FIG00774251: hypothetical protein
Protein of unknown function identified by role in sporulation (SpoVG)
FIG00775260: hypothetical protein
Acyl carrier protein
Putative peptidoglycan bound protein (LPXTG motif) Lin2281 homolog
Beta-glucosidase (EC 3.2.1.21)
FIG00774380: hypothetical protein
Ribosome-associated heat shock protein implicated in the recycling of the 50S subunit (S4 paralog)
ATP synthase F0 sector subunit c (EC 3.6.3.14)
Acetyltransferase, GNAT family
ORF37-1
FIG00775553: hypothetical protein


inlE

455791

G (A)

C (P)

455893

A (T)

G (A)

457225

G (A)

C (L)

457609

C (Q)

G (E)

458485

T (S)

G (A)

457633

G (A)

T (S)

1860866

T (R)

C (V)

287035

C (P)

A (I)

287545

C (Q)

A (K)

286237

G (I)

A (A)

286444

G (V)

A (I)

286540

G (V)

A (I)

286879

G (V)

A (I)

286894

G (S)

286924

G (E)

286981

G (V)

287248

G (A)

287329

G (A)

287506

G (D)

287590

G (G)

286702

T (S)

286654

G (D)

286234

A (T)

286372

A (T)

286561

A (T)

286588

A (R)

286858

A (M)

286891

A (T)

286993


A (R) A (I) A (T) A (T) A (I) A (S) A (T)

C (H) G (V) G (A) G (A) G (G) G (E) G (A)

A (N)

G (D)

A (T)

G (V)

A (N)

G (E)

A (K)

G (E)

A (I)

G (V)

A (N)

G (D)

A (N)

G (D)

A (K)

G (E)

C (L)

G (V)

286249

T (L)

G (V)

286549

T (L)

G (V)

286651

A (M)

T (L)

286864

A (S)

T (C)

287284

G (A)

T (L)

284425

C (L)

A (I)

284545

G (A)

A (T)

284491

G (A)

C (P)

284479

A (T)

G (A)

284569

A (T)

G (A)

284599

A (T)

G (A)

285115

A (T)

T (L)

287185 287311 287368 287497 287521 287632


286369


287398

inlH

A (G)


inlC

BHU allelic type*


inlB

EGD-e allelic type*


inlA

SNP Position


Gene


Table 3 The non-synonymous common mutations of internalin genes in BHU1, 2 and 3 strains against EGD-e

* In the allelic type columns, the amino acid residue is given in parentheses
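
As an illustration of how a substitution of the kind listed in Table 3 can be classified, the following minimal Python sketch translates a codon before and after a single-base change using the standard genetic code and reports whether the encoded residue changes. The function name classify_substitution and the example codons are hypothetical: the table reports only the changed base and the residue, not the full codon, and this is not presented as the pipeline used in the study.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G
# (first base varies slowest, third base fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_substitution(codon, offset, alt_base):
    """Return (reference residue, altered residue, class) for one base change.

    codon    -- reference codon, e.g. "GCA" (hypothetical example input)
    offset   -- 0-based position of the changed base within the codon
    alt_base -- base observed in the query strain
    """
    ref_aa = CODON_TABLE[codon]
    mutated = codon[:offset] + alt_base + codon[offset + 1:]
    alt_aa = CODON_TABLE[mutated]
    kind = "synonymous" if ref_aa == alt_aa else "non-synonymous"
    return ref_aa, alt_aa, kind


if __name__ == "__main__":
    # Assumed codon: an alanine codon whose first base changes from G to C.
    print(classify_substitution("GCA", 0, "C"))
    # Assumed codon: a third-position change that leaves alanine unchanged.
    print(classify_substitution("GCA", 2, "G"))
```

Under these assumptions, a G to C change at the first position of an alanine codon gives a proline codon, the same residue pair as a G (A) to C (P) entry in the table, whereas the third-position change is synonymous and would not be listed.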

Table 4 The non-synonymous common mutations of virulence genes in BHU1, 2 and 3 strains against EGD-e

Gene | SNP Position | EGD-e allelic type* | BHU allelic type*
actA | 210013 | G (A) | A (T)
actA | 210019 | G (A) | A (T)
actA | 211012 | G (A) | A (I)
actA | 211141 | G (D) | A (N)
actA | 210913 | T (S) | C (P)
actA | 211195 | T (S) | C (P)
actA | 211039 | A (R) | G (G)
Mpl | 209017 | C (L) | A (I)
Mpl | 208504 | A (T) | G (A)
pycA | 1100940 | G (V) | A (I)
uhpT | 869125 | C (H) | T (V)

* In the allelic type columns, the amino acid residue is given in parentheses

Table 5 Different genic mutations in BHU1, 2 and 3 strains against F2365

SNP position | F2365 allelic type | Allelic SNPs | BHU1 allelic type | BHU2 allelic type | BHU3 allelic type | Gene
56309 | C | T | + | - | + | Single-stranded DNA-binding protein 1
159466 | C | G | - | + | - | peptide ABC transporter substrate-binding protein
202365 | A | G | + | + | - | hemolysin D
211295 | A | G | + | + | - | listeriolysin O
424218 | G | T | + | - | - | uracil-DNA glycosylase 1
486998 | T | C | - | - | + | Internalin
506235 | G | A | - | + | + | cytosine permease
573557 | A | G | - | + | + | glycosyltransferase family 2
625098 | T | C | - | + | + | 50S rRNA methyltransferase
650047 | T | C | - | + | + | hypothetical protein
658874 | A | G | - | + | + | FMN-dependent NADH-azoreductase 1
937507 | T | C | - | + | + | hypothetical protein
1026229 | C | A | - | + | - | Acyltransferase
1117847 | A | G | - | + | + | CDP-glycerol--glycerophosphate glycerophosphotransferase
1132280 | T | A | - | + | - | CAAX amino terminal protease
1246941 | T | C | + | - | - | cell wall surface anchor protein
1247091 | C | T | + | - | - | cell wall surface anchor protein
1247196 | G | A | - | + | + | cell wall surface anchor protein
1256960 | T | C | - | + | + | quinolone MFS transporter
1516197 | C | T | + | - | - | macrolide ABC transporter permease
1675514 | A | G | - | + | + | hypothetical protein
1705173 | A | C | - | + | + | membrane protein insertion efficiency factor YidD
1716381 | C | T | + | - | + | cystathionine beta-lyase
1719576 | G | A | - | + | + | 5-methyltetrahydropteroyltriglutamate--homocysteine methyltransferase
1730518 | G | A | - | + | + | Dehydrogenase
1776559 | C | T | + | - | + | ABC transporter substrate-binding protein
1795666 | T | A | - | + | + | ABC transporter permease
1823238 | A | G | - | + | + | phosphoribosylformylglycinamidine synthase subunit PurQ
1869219 | C | T | - | - | + | hypothetical protein
1907497 | T | A | - | + | + | peptidase S41
1967125 | T | G | - | + | + | diguanylate cyclase
2002410 | A | G | + | - | - | cell wall surface anchor protein
2103381 | G | C | - | + | + | cell division protein FtsA
2117553 | C | A | + | - | - | hypothetical protein
2168566 | C | T | - | + | + | GntR family transcriptional regulator
2192925 | A | C | + | - | - | maltose phosphorylase
2492861 | A | G | - | + | + | membrane protein
2507717 | G | T | + | - | - | UDP-glucose 4-epimerase
2676448 | A | G | - | + | - | membrane protein
2704685 | G | A | - | + | - | two-component system sensor histidine kinase KdpD
2807625 | G | A | - | + | - | RpiR family transcriptional regulator
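
The presence/absence pattern in Table 5 can be summarised programmatically. The short Python sketch below is illustrative only: it copies a few rows from the table, assumes that "+" marks a strain carrying the variant allele (the table does not define the symbol), and uses hypothetical helper names (carriers, strain_specific) that are not part of the original analysis.

```python
# A few rows taken from Table 5: (SNP position, BHU1, BHU2, BHU3, gene).
ROWS = [
    (56309, "+", "-", "+", "Single-stranded DNA-binding protein 1"),
    (211295, "+", "+", "-", "listeriolysin O"),
    (424218, "+", "-", "-", "uracil-DNA glycosylase 1"),
    (486998, "-", "-", "+", "Internalin"),
    (506235, "-", "+", "+", "cytosine permease"),
]
STRAINS = ("BHU1", "BHU2", "BHU3")


def carriers(row):
    """Strains marked '+' for this SNP (assumed to carry the variant allele)."""
    return tuple(s for s, mark in zip(STRAINS, row[1:4]) if mark == "+")


def strain_specific(rows):
    """Map SNP position -> strain, for SNPs marked '+' in exactly one strain."""
    result = {}
    for row in rows:
        found = carriers(row)
        if len(found) == 1:
            result[row[0]] = found[0]
    return result


if __name__ == "__main__":
    for position, strain in strain_specific(ROWS).items():
        print(position, strain)
```

With the rows shown, only the uracil-DNA glycosylase 1 site (BHU1) and the internalin site (BHU3) are marked in a single strain; applying the same filter to the full table would list every candidate strain-specific mutation.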


List of Abbreviations:

1. listeriolysin O (LLO)
2. multi-locus sequence typing (MLST)
3. Rapid Annotations using Subsystems Technology (RAST)
4. single nucleotide polymorphism (SNP)
5. phosphotransferase system (PTS)

Highlights

1. This work focuses on Listeria monocytogenes serotype 4b strains.
2. This pathogen is a major cause of abortion, stillbirth and septicemia.
3. The strains are genetically very close and belong to lineage I.
4. The major virulence-associated genes show mutations.