Accepted Manuscript

Whole genome sequencing of Shigella sonnei through PulseNet Latin America and Caribbean: advancing global surveillance of foodborne illnesses

Kate Susan Baker, Josefina Campos, Mariana Pichel, Anabella Della Gaspera, Francisco Duarte-Martínez, Elena Campos-Chacón, Hilda María Bolaños-Acuña, Caterina Guzman-Verri, Alison E. Mather, Susana Diaz Velasco, Maria Luz Zamudio Rojas, Jessica Forbester, Thomas Richard Connor, Karen Helena Keddy, Anthony Marius Smith, Enedia A. Lopez de Delgado, Giovanny Angiolillo, Nirvia Cuaical, Jorge Fernandez, Carolina Aguayo, Melissa Morales Aguilar, Claudia Valenzuela, Ana Judith Morales Medrano, Alfredo Sirok Esteve, Natalie Weiler Gustafson, Paula Lucia Diaz Guevara, Lucy Angeline Montaño, Enrique Perez, Nicholas R. Thomson

PII: S1198-743X(17)30190-8
DOI: 10.1016/j.cmi.2017.03.021
Reference: CMI 905
To appear in: Clinical Microbiology and Infection
Received Date: 12 January 2017
Revised Date: 16 March 2017
Accepted Date: 27 March 2017

Please cite this article as: Baker KS, Campos J, Pichel M, Della Gaspera A, Duarte-Martínez F, Campos-Chacón E, Bolaños-Acuña HM, Guzman-Verri C, Mather AE, Velasco SD, Zamudio Rojas ML, Forbester J, Connor TR, Keddy KH, Smith AM, Lopez de Delgado EA, Angiolillo G, Cuaical N, Fernandez J, Aguayo C, Aguilar MM, Valenzuela C, Morales Medrano AJ, Esteve AS, Gustafson NW, Diaz Guevara PL, Montaño LA, Perez E, Thomson NR, Whole genome sequencing of Shigella sonnei through PulseNet Latin America and Caribbean: advancing global surveillance of foodborne illnesses, Clinical Microbiology and Infection (2017), doi: 10.1016/j.cmi.2017.03.021.

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
Title: Whole genome sequencing of Shigella sonnei through PulseNet Latin America and Caribbean: advancing global surveillance of foodborne illnesses

Running title: Shigella sonnei across Latin America
Kate Susan Baker* (1,2), Josefina Campos (3), Mariana Pichel (3), Anabella Della Gaspera (3), Francisco Duarte-Martínez (4), Elena Campos-Chacón (4), Hilda María Bolaños-Acuña (4), Caterina Guzman-Verri (5,6), Alison E Mather (2,7), Susana Diaz Velasco (8), Maria Luz Zamudio Rojas (8), Jessica Forbester (2), Thomas Richard Connor (9), Karen Helena Keddy (10), Anthony Marius Smith (10), Enedia A. Lopez de Delgado (11), Giovanny Angiolillo (11), Nirvia Cuaical (11), Jorge Fernandez (12), Carolina Aguayo (12), Melissa Morales Aguilar (13), Claudia Valenzuela (13), Ana Judith Morales Medrano (13), Alfredo Sirok Esteve (14), Natalie Weiler Gustafson (15), Paula Lucia Diaz Guevara (16), Lucy Angeline Montaño (16), Enrique Perez (17), Nicholas R Thomson* (2,18)
1. University of Liverpool, Department of Functional and Comparative Genomics, Liverpool, United Kingdom, L69 7ZB
2. Wellcome Trust Sanger Institute, Pathogen Variation Programme, Hinxton, United Kingdom, CB10 1SA
3. Instituto Nacional de Enfermedades Infecciosas, ANLIS, Buenos Aires, Argentina
4. Instituto Costarricense de Investigación y Enseñanza en Nutrición y Salud (Inciensa), Costa Rica
5. Programa de Investigación en Enfermedades Tropicales, Escuela de Medicina Veterinaria, Universidad Nacional, Heredia, Costa Rica
6. Centro de Investigación en Enfermedades Tropicales, Facultad de Microbiología, Universidad de Costa Rica, San José, Costa Rica
7. University of Cambridge, Department of Veterinary Medicine, Cambridge, United Kingdom, CB3 0ES
8. National Institute of Health, Lima, Perú
9. Organisms and Environment Division, Cardiff University School of Biosciences, Sir Martin Evans Building, Cardiff, CF10 3AX
10. Centre for Enteric Diseases, National Institute for Communicable Diseases and Faculty of Health Sciences, University of the Witwatersrand, Johannesburg, South Africa
11. Department of Bacteriology, National Institute of Hygiene “Rafael Rangel”, Ciudad University, Los Chaguaramos, Venezuela
12. Molecular Genetics Laboratory, Institute of Public Health of Chile, Santiago, Chile
13. Department of Foodborne Diseases, National Health Laboratory of Guatemala, Laboratorio Nacional de Salud, Barcenas, Guatemala
14. Bacteriology Laboratory, Departamento de Laboratorios de Salud Pública (DLSP), Ministerio de Salud Pública (MSP); Alfredo Navarro 3051, 11600, Montevideo, Uruguay
15. Laboratorio Central de Salud Pública, Departamento of Bacteriology, Asuncion, Paraguay
16. Grupo de Microbiología, Instituto Nacional de Salud, Bogotá, Colombia
17. Pan American Health Organization/World Health Organization, Department of Health Emergencies, 525 23rd Street NW, Washington DC, USA
18. London School of Hygiene and Tropical Medicine, London, United Kingdom, WC1E 7HT
*Corresponding authors: Dr Kate Baker, [email protected], +44 1517 954575; Professor Nicholas R Thomson, [email protected], +44 1223 494740
Abstract

Objective

Shigella sonnei is a globally-important diarrhoeal pathogen tracked through the surveillance network PulseNet Latin America and Caribbean (PNLA&C), which participates in PulseNet International. PNLA&C laboratories use common molecular techniques to track pathogens causing foodborne illness. We aimed to demonstrate the possibility and advantages of transitioning to whole genome sequencing (WGS) for surveillance within existing networks across a continent where S. sonnei is endemic.
Methods

We applied WGS to representative archive isolates of S. sonnei (n=323) from laboratories in nine PNLA&C countries to generate a regional phylogenomic reference for S. sonnei and put this in the global context. We used this reference to contextualise 16 S. sonnei from three Argentinian outbreaks, using locally-generated sequence data. Assembled genome sequences were used to predict antimicrobial resistance (AMR) phenotypes and identify AMR determinants.
Results
S. sonnei isolates clustered in five Latin American sublineages in the global phylogeny, with many (46%, 149 of 323) belonging to previously undescribed sublineages. Predicted multiple drug resistance was common (77%, 249 of 323) and clinically-relevant differences in AMR were found among sublineages. The regional overview showed that Argentinian outbreak isolates belonged to distinct sublineages and had different epidemiological origins.
Conclusions

Latin America contains novel genetic diversity of S. sonnei that is relevant on a global scale and commonly exhibits multiple drug resistance. Retrospective passive surveillance with WGS has utility for informing treatment, identifying regionally-epidemic sublineages and providing a framework for interpretation of prospective, locally-sequenced outbreaks.
Introduction

Shigella are globally important bacteria, causing more than 190 million diarrhoeal disease cases and 65,796 deaths annually, 18 million and 1,023 of which, respectively, occur in the Americas [1, 2]. In Latin America, S. sonnei is a common cause of diarrhoeal disease (mainly in children [3-5]) and is variably resistant to commonly-used antimicrobials [6, 7]. Explosive outbreaks still occur (e.g. a 900-case epidemic of S. sonnei in Argentina in 2016 [8]) and, mirroring trends in other economically-developing areas [9], increases in endemic S. sonnei prevalence are also reported [10]. In addition to local transmission, new phylogenetic lineages of S. sonnei can disseminate nationally and spread internationally within two to three decades [11, 12]. Given its worldwide distribution, increasing importance, and international transmission, it is unsurprising that S. sonnei is under surveillance through PulseNet International [13].
PulseNet Latin America and Caribbean (PNLA&C) is a regional network that contributes to PulseNet International, a public health network of >120 laboratories in >80 countries that has performed surveillance of foodborne illnesses for twenty years [13]. PNLA&C laboratories use common molecular subtyping techniques and share their results and associated epidemiological information through a regional database to facilitate early identification of disease outbreaks in an increasingly globalised world [14]. Owing to the increased resolution of WGS compared with traditional techniques (e.g. Pulsed-Field Gel Electrophoresis, PFGE), PulseNet International is currently transitioning to the use of WGS [15].
WGS has been applied to subtype S. sonnei effectively: a species-defining study identified four main phylogenetic lineages that were further split into sublineages containing isolates of similar geographical origins [16]. Subsequent WGS studies of S. sonnei have demonstrated the emergence of sublineages of public health importance at national and international levels, often driven by the acquisition of AMR [11, 12, 16, 17]. In a strict public health setting, Public Health England researchers have used WGS to identify epidemiological clusters of S. sonnei, as existing subtyping techniques (phage typing) provided poor lineage discrimination [18]. WGS has also been used to predict AMR phenotypes, with high (e.g. >95%) specificities and sensitivities reported for other enteric bacteria, including E. coli, Campylobacter and Salmonella [19-21]. Collectively, these studies suggest that the application of WGS within international surveillance networks, such as PNLA&C, can enhance outbreak detection and surveillance for S. sonnei and its AMR determinants.
The strength of surveillance frameworks lies in both the use of common techniques and large reference databases, e.g. hundreds of thousands of PFGE-subtyped pathogen profiles exist within PulseNet International. Populating these databases with WGS data from passive and active surveillance programs will promote the continued success of international surveillance. In this study, members of PNLA&C worked collaboratively toward this goal by generating a WGS overview of S. sonnei in the region.
Materials and Methods
Clinical isolates
Latin American archive isolates

To construct a regional overview of S. sonnei across Latin America, WGS data were generated from 323 archived clinical isolates of S. sonnei collected over nineteen years from nine countries (Table 1, Figure 1). Each national PNLA&C partner was responsible for selecting isolates from their own archives with the aim of achieving diversity with respect to the following: PFGE profile, year of isolation, AMR profile, disease manifestation, and PFGE profile linkage to outbreaks of disease or sporadic cases. Metadata associated with the isolates frequently included the year of collection, AMR susceptibility testing results and geographical information (e.g. patient residential province or address of submitting laboratory). All metadata and results are shown in Table S1.
Global context isolates
To set this regional overview in a global context, publicly available sequence data from global reference isolates (n=116) of S. sonnei were also included (Table S1). These comprise temporally and geographically diverse isolates (samples from four continents collected between 1943 and 2008) used to define the population structure of S. sonnei [16].
Argentine outbreak isolates
To demonstrate the utility of the Latin American (LA) regional overview for investigating national outbreaks, WGS data generated at the PNLA&C reference laboratory (ANLIS) from three Argentinian outbreaks (n=16 isolates) of S. sonnei were also used. Previously reported at a national level [8], these isolates were from outbreaks in 2010 (n=5), 2011 (n=3) and 2016 (n=8).
Genome sequencing and bioinformatics analysis
Archive isolates were sequenced, trimmed and quality checked at the Wellcome Trust Sanger Institute according to in-house protocols [22]. Sequencing data were de novo assembled using a custom assembly pipeline [23]. All isolates assembled into >4 Mb and <650 contiguous sequences (Table S1). Sequencing data and assemblies are publicly available at the European Nucleotide Archive (accession numbers, Table S1). Argentinian outbreak isolates were sequenced at ANLIS [8].
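The assembly criteria quoted above (more than 4 Mb of sequence in fewer than 650 contiguous sequences) can be illustrated with a minimal Python sketch. This is not the custom assembly pipeline of [23]; the function names, the reading of "4 Mb" as 4,000,000 bp, and the toy FASTA record are assumptions made purely for illustration.

```python
def parse_fasta(text):
    """Return a dict mapping each FASTA header line to its full sequence."""
    contigs = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the header text as the contig name (hypothetical naming rule).
            name = line[1:].strip() or f"contig_{len(contigs) + 1}"
            contigs[name] = []
        elif name is not None:
            contigs[name].append(line)
    return {key: "".join(parts) for key, parts in contigs.items()}


def assembly_passes(contigs, min_total_bp=4_000_000, max_contigs=650):
    """Apply the two thresholds: total length above min, contig count below max.

    Assumption: '>4 Mb' is interpreted as more than 4,000,000 bp in total.
    """
    total_bp = sum(len(seq) for seq in contigs.values())
    return total_bp > min_total_bp and len(contigs) < max_contigs


# Toy input, far smaller than a real assembly, so it fails the default thresholds.
toy_fasta = ">contig_a\nACGT\nAC\n>contig_b\nGG\n"
toy_contigs = parse_fasta(toy_fasta)
print(len(toy_contigs), assembly_passes(toy_contigs))
```

In practice the same check would be run once per isolate on the assembled FASTA file, with the per-isolate totals and contig counts reported alongside the other metadata.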
To construct the regional overview phylogeny, a multiple sequence alignment (MSA) was created by mapping the sequence data from 439 taxa (archive and global context isolates) to Shigella sonnei Ss046 and its five associated plasmids (5055316 bp) using SMALT, followed by removal of repeat regions and mobile elements (7210310 bp) [16] and regions of recombination (7074 sites) [24], resulting in a final alignment of 13,988 variant sites. A maximum likelihood phylogeny with 100 bootstraps was then inferred [25]. Phylogenetic analysis incorporating outbreak isolates was conducted similarly (final alignment 14,075 variant sites).
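The reduction of a reference-mapped alignment to its variant sites, after masking excluded regions, can be sketched as follows. This is an illustrative outline only: the study used SMALT and the tools cited in [16, 24, 25], whereas the function names, the half-open interval convention and the toy alignment below are assumptions introduced here.

```python
def variant_sites(alignment, ignore="-N"):
    """Return 0-based columns in which at least two distinct bases occur.

    `alignment` maps taxon name to an aligned sequence; characters listed in
    `ignore` (gaps and ambiguous bases) do not count as alleles.
    """
    seqs = list(alignment.values())
    if not seqs:
        return []
    length = len(seqs[0])
    if any(len(seq) != length for seq in seqs):
        raise ValueError("all aligned sequences must have the same length")
    sites = []
    for column in range(length):
        bases = {seq[column].upper() for seq in seqs} - set(ignore)
        if len(bases) > 1:
            sites.append(column)
    return sites


def mask_regions(sites, regions):
    """Drop sites that fall inside any half-open (start, end) interval."""
    return [
        site for site in sites
        if not any(start <= site < end for start, end in regions)
    ]


# Toy alignment of three hypothetical taxa against a short reference.
toy_alignment = {
    "reference": "ACGTACGT",
    "isolate_1": "ACGTTCGT",
    "isolate_2": "ACCTACG-",
}
all_sites = variant_sites(toy_alignment)
kept_sites = mask_regions(all_sites, [(4, 6)])  # e.g. a masked repeat region
print(all_sites, kept_sites)
```

The masked intervals stand in for the repeat, mobile-element and recombinant regions described above; only the surviving columns would be passed to tree inference.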
For analysis of sequences related to AMR, AMR genes were detected on assembled sequences [26] and cross-referenced with phylogeny, contiguous sequence length and traditional comparative genetic approaches including Artemis, BLAST (against NCBI reference databases and locally) and ACT, as previously described [11], to determine the presence of known AMR determinants in shigellae. Single Nucleotide Polymorphisms (SNPs) in known quinolone-resistance determining regions, including gyrA positions 83, 87 and 211 and parC positions 80 and 84 [27], were retrieved.
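Retrieval of substitutions at the listed quinolone-resistance determining positions can be illustrated by comparing a query protein against a reference at those codon positions. The sketch below is hypothetical: it assumes the two protein sequences are already aligned without insertions or deletions, and the sequences shown are synthetic placeholders rather than real gyrA or parC sequences.

```python
# Positions (1-based amino acid numbering) taken from the text above.
QRDR_POSITIONS = {"gyrA": (83, 87, 211), "parC": (80, 84)}


def qrdr_substitutions(gene, reference_protein, query_protein):
    """List differences at the QRDR positions as labels such as 'S83L'.

    Assumption: both sequences share the same numbering (no indels), so
    position p corresponds to string index p - 1 in each.
    """
    changes = []
    for position in QRDR_POSITIONS[gene]:
        if position > len(reference_protein) or position > len(query_protein):
            continue  # position not covered by one of the sequences
        ref_aa = reference_protein[position - 1]
        query_aa = query_protein[position - 1]
        if ref_aa != query_aa:
            changes.append(f"{ref_aa}{position}{query_aa}")
    return changes


# Synthetic 217-residue placeholder: 'S' at position 83 and 'D' at position 87.
toy_reference = "A" * 82 + "S" + "A" * 3 + "D" + "A" * 130
# The same placeholder with position 83 changed to 'L'.
toy_query = toy_reference[:82] + "L" + toy_reference[83:]
print(qrdr_substitutions("gyrA", toy_reference, toy_query))
```

A wild-type isolate yields an empty list, which corresponds to the "WT" category used for gyrA in Figure S3.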
Approximate longitude and latitude of locations were deduced and phylogeographic analysis was visualised employing MicroReact [28]. Figtree and iTOL were used for additional visualisations [29].

Results
Phylogenetic analysis of LA S. sonnei was conducted to define its population structure within the known four-lineage context of S. sonnei. This divided the LA isolates into a new genetic lineage and four genetic sublineages of variable diversity. The new lineage comprised 26 (8%) of the 323 archive isolates and was designated Lineage V (Figure 1, Figure 2, Table 1). Since its detection in this study, Lineage V isolates have been detected in South Africa (Figure S1) and the United Kingdom (Baker et al, in preparation). The remaining archive isolates clustered within Lineages II (n=123, 38%) and III (n=174, 54%). The archive isolates in Lineage II were further subdivided into the sublineages LAIIa and LAIIb (Table 1, Figure 1, Figure 2). LAIIa and LAIIb were phylogenetically distinct from the previously-described South America II sublineage (Figure S2). The archive isolates in Lineage III were similarly subdivided into sublineages LAIIIa and LAIIIb, which were expansions of previously described sublineages (see Table 1, Figure S2), with LAIIIb being part of the multi-drug resistant (MDR) Global III sublineage of S. sonnei that expanded globally following the acquisition of AMR [16]. The four LA sublineages had variable genetic diversity with, for example, the maximum MSA pairwise distance between any two isolates in LAIIIb being 172 Single Nucleotide Polymorphisms (SNPs), compared with 309 SNPs for LAIIb (Table 2). S. sonnei isolates belonging to Lineages I and IV were not found in this study.
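The pairwise distance summaries reported per sublineage (average, median and largest SNP distance, Table 2) can be computed from aligned sequences along the following lines. This is a minimal sketch under stated assumptions: gaps and ambiguous bases are skipped rather than counted, and the three short sequences are invented for illustration, not drawn from the study data.

```python
from itertools import combinations
from statistics import mean, median


def snp_distance(seq_a, seq_b, ignore="-N"):
    """Count positions where two aligned sequences carry different bases.

    Assumption: positions with a gap or ambiguous base in either sequence
    are skipped rather than counted as differences.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a != b and a not in ignore and b not in ignore
    )


def pairwise_summary(alignment):
    """Summarise all pairwise SNP distances within one group of isolates."""
    distances = [
        snp_distance(alignment[first], alignment[second])
        for first, second in combinations(sorted(alignment), 2)
    ]
    if not distances:
        raise ValueError("at least two sequences are required")
    return {
        "average": mean(distances),
        "median": median(distances),
        "largest": max(distances),
    }


# Three hypothetical isolates from one sublineage.
toy_sublineage = {"iso_a": "ACGTAC", "iso_b": "ACGTAT", "iso_c": "TCGTAT"}
summary = pairwise_summary(toy_sublineage)
print(summary["largest"], summary["median"])
```

Running the summary separately on the isolates of each sublineage gives one row per sublineage, which is the shape of the pairwise-distance columns in Table 2.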
The regional phylogenetic overview provides a nomenclature for discussion and interpretation of national and regional surveillance patterns of S. sonnei in LA. For that reason, the geospatial information and phylogenetic results of the archived isolates were loaded into a MicroReact project (found at http://microreact.org/project/Shigella_sonnei_in_Latin_America). This public resource can be used interactively by end-users to display and filter the results of this study based on time, phylogeny and geography, highlighting, for example, that sublineages were not uniformly distributed around LA or within individual countries (Figure 3).
The isolates within an LA lineage or sublineage were temporally and geographically diverse. Each lineage or sublineage contained isolates collected in multiple years over the course of the study, ranging from ten years for IIIb to the entire seventeen years for IIb (Figure 1, Table 2). No obvious shifts in the presence of the LA lineage or sublineages were seen over time, apart from IIIb possibly predominating in later years (Figure 1). The LA lineage and sublineages were also diverse with respect to their countries of origin, with each comprising isolates from between four and six countries (Figure 1, Figure 2, Table 1). Within a given LA lineage or sublineage, isolates from different countries were frequently intermingled rather than being phylogenetically separated based on geography (often with good phylogenetic support, Figure S5), indicating that international transmission across the region may be occurring.
To demonstrate the utility of these data for contextualising new outbreaks, we performed further phylogenetic analysis with additional isolates from S. sonnei outbreaks in Argentina. This confirmed that the Argentinian outbreaks in 2010 and 2011 were caused by phylogenetically distinct S. sonnei with distinct AMR profiles [8]. Here this is demonstrated by the majority of isolates from the 2011 outbreak belonging to sublineage IIIa and those from 2010 and 2016 belonging to sublineage IIIb (Figure 4). Notably, however, the 2011 and 2016 isolates fell into multiple sublineages, indicating that the outbreaks may have multiple epidemiological origins.
Predicted AMR phenotypes correlated well with antimicrobial resistance testing data (94.3% sensitivity, 99.4% specificity of 330 available phenotypes, Table S1), and are the results presented throughout this text. MDR (resistance to ≥3 antimicrobial classes) was common among the isolates (77%, 249 of 323), with isolates being resistant to between 0 and 7 antimicrobials (mean 3.7, mode and median 4) (Table 3, Table S1, Figure 2). No relationship of increasing antimicrobial resistance over time was detected (Figure S4). Macrolide resistance and an extended-spectrum beta-lactamase (ESBL) gene were found in individual isolates: the azithromycin resistance gene mphA in a IIIb isolate, and a blaSHV129 gene conferring resistance to ceftazidime in a Lineage V isolate (Table S1). Resistance to quinolones was similarly infrequent, being present in only 3% (10 of 323) of isolates (conferred by gyrA mutations (n=8) or qnr genes (n=2), Table S1, Table 3). Resistance to other classes of antimicrobials was more common, with 65-82% of isolates being resistant to aminoglycosides (streptomycin), trimethoprim, sulphonamide and tetracycline classes of antimicrobials, and resistance to phenicol and beta-lactam classes also being frequently detected (25 and 48% of isolates respectively) (Figure 2, Table 3). These resistances were encoded by a variety of AMR genes (Table S1, Figure S3).
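The agreement between predicted and tested phenotypes is summarised above as sensitivity and specificity. A minimal sketch of that calculation is given below, treating the laboratory susceptibility result as the reference standard; the function name and the small set of paired results are illustrative assumptions and do not reproduce the 330 phenotypes in Table S1.

```python
def sensitivity_specificity(pairs):
    """Compute (sensitivity, specificity) from paired boolean results.

    Each pair is (predicted_resistant, tested_resistant); the tested
    phenotype is treated as the reference standard.
    """
    tp = fp = tn = fn = 0
    for predicted, tested in pairs:
        if tested and predicted:
            tp += 1  # resistant by test, predicted resistant
        elif tested:
            fn += 1  # resistant by test, predicted susceptible
        elif predicted:
            fp += 1  # susceptible by test, predicted resistant
        else:
            tn += 1  # susceptible by test, predicted susceptible
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical paired results: 4 tested-resistant and 5 tested-susceptible.
toy_pairs = (
    [(True, True)] * 3
    + [(False, True)]
    + [(False, False)] * 4
    + [(True, False)]
)
toy_sensitivity, toy_specificity = sensitivity_specificity(toy_pairs)
print(toy_sensitivity, toy_specificity)  # 0.75 0.8
```

Each isolate-antimicrobial combination with both a prediction and a test result contributes one pair, so the denominator is the number of available phenotypes rather than the number of isolates.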
Notably, the distribution of resistance against an antimicrobial class was not uniform among the sublineages (Figure 2). For example, predicted beta-lactam resistance in sublineage IIIa was 90% compared with just 7% for sublineage IIIb (p<0.01), and while IIIa and IIIb had ~90% resistance to tetracycline, sublineages IIb and IIa had between 25 and 30% tetracycline resistance (Table 3). The presence of resistance toward a variety of antimicrobials gave rise to an AMR profile (i.e. antibiogram) in each isolate (Figure 2), with each sublineage containing isolates of between 6 and 13 different AMR profiles. However, in the case of sublineages IIIa and IIIb, single AMR profiles dominated, with one profile being present among 69 and 79% (respectively) of the isolates in the sublineage, and other AMR profiles being present in fewer than 9% of isolates in the sublineage (Table S1, Figure 2). The dominant AMR profiles in each of IIIa and IIIb were determined by the presence of chromosomal and plasmid-encoded AMR genes conferring resistance to multiple antimicrobial classes (Table 3). Specifically, sublineage IIIb carried the chromosomal Int2/Tn7 resistance determinant and the SpA plasmid, and sublineage IIIa carried the chromosomal Shigella Resistance Locus (SRL) and a variant plasmid of SpA, pABC-3, on which the aminoglycoside-resistance gene strA has been interrupted by the acquisition of a trimethoprim-resistance conferring gene dfrA14 [30]. In contrast to the presence of a dominant AMR profile in sublineages IIIa and IIIb, sublineages IIa, IIb and lineage V contained at least three AMR profiles that were each present in ≥15% of the isolates (Table S1).
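The notions of an AMR profile, a dominant profile within a sublineage, and MDR (resistance to three or more antimicrobial classes) can be made concrete with a short sketch. The function names and the four hypothetical isolates are assumptions for illustration; the percentages reported above come from Table S1, not from this code.

```python
from collections import Counter


def is_mdr(resistant_classes, threshold=3):
    """MDR as defined in the text: resistance to at least three classes."""
    return len(set(resistant_classes)) >= threshold


def profile_summary(isolates):
    """Return (MDR fraction, dominant profile, dominant profile fraction).

    `isolates` maps an isolate name to the antimicrobial classes it is
    predicted to resist; a profile is the unordered set of those classes.
    """
    if not isolates:
        raise ValueError("no isolates supplied")
    total = len(isolates)
    profiles = Counter(frozenset(classes) for classes in isolates.values())
    mdr_count = sum(1 for classes in isolates.values() if is_mdr(classes))
    dominant, count = profiles.most_common(1)[0]
    return mdr_count / total, sorted(dominant), count / total


# Four hypothetical isolates from a single sublineage.
toy_isolates = {
    "iso_1": ["tetracycline", "sulphonamide", "trimethoprim"],
    "iso_2": ["trimethoprim", "tetracycline", "sulphonamide"],
    "iso_3": ["tetracycline"],
    "iso_4": [],
}
mdr_fraction, dominant_profile, dominant_fraction = profile_summary(toy_isolates)
print(mdr_fraction, dominant_profile, dominant_fraction)
```

Using an unordered set for each profile means that two isolates resistant to the same classes are grouped together regardless of the order in which the classes were recorded.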
Discussion
Here we have created a resource of novel WGS diversity of S. sonnei from Latin America relevant to public health surveillance on regional and global scales. We report the identification of a new global lineage (Lineage V) [16]; previously undescribed sublineages of Lineage II (Latin America IIa and Latin America IIb); and expansions in Lineage III, including Latin America IIIa (previously South America (III)) and Latin America IIIb, within the MDR Global III lineage. The subsequent detection of Lineage V in Europe and Africa testifies to the relevance of this diversity for surveillance on a global scale, and the regional importance of these data is demonstrated here by contextualisation of Argentinian S. sonnei outbreaks.
In a previous study, outbreak isolates from Argentina were discriminated at a national level into three WGS sublineages [8]. When these isolates were built into this regional overview, the 2010 and 2016 outbreak isolates were contained entirely within the diversity of archive isolates from Argentina, indicating these epidemics were likely from previously circulating strains (Figure 4). In contrast, the 2011 outbreak isolates were more closely related to IIIa isolates from Perú and Chile than the single IIIa isolate from Argentina, indicating the epidemic may have been subsequent to an importation event. Thus, by providing interpretative context, our results enhance national, regional and global surveillance of S. sonnei through publicly available sequencing data and MicroReact.
The WGS data were also used to examine AMR in S. sonnei across LA. Quinolone and macrolide resistance and genes encoding extended-spectrum beta-lactamases were not widespread, being present in only a handful of isolates. Notably, however, quinolone-resistant isolates in Lineage V and sublineage IIIa were outside of the Central Asia III lineage, thought to act as the global reservoir for ciprofloxacin-resistant S. sonnei [17]. Resistance against many other classes of antimicrobials (including aminoglycosides, beta-lactams, trimethoprim-sulphonamides, phenicol and tetracyclines), however, was common, as was MDR. Common MDR across the sublineages is part of the problem of increasing AMR in Shigella and has already led to the emergence of epidemiologically-dominant sublineages elsewhere [11, 12].
This study provides both applications in future surveillance and retrospective insight on S. sonnei epidemiology across Latin America. The presence of closely related isolates from different countries within an individual sublineage indicates that international transmission of S. sonnei occurs across LA, as suggested for the Argentina 2011 outbreak. Also, although not a representative epidemiological cross-section, this study suggests that sublineages IIIa and IIIb are epidemically expanding across Latin America, driven by AMR. This is supported by the temporal incidence of IIIa and IIIb isolates being weighted toward later years of the study and the low phylogenetic diversity in sublineages IIIa and IIIb compared with sublineages IIa, IIb and lineage V, likely resulting from rapid clonal expansion. Furthermore, the presence of a dominant AMR profile in each lineage is associated with combinations of specific AMR determinants already associated with globally-epidemic S. sonnei. Specifically, sublineage IIIb falls within the Global III sublineage and contains the Int2/Tn7 resistance determinant and pSpA resistance plasmid associated with the global expansion of Global III [16]. Similarly, sublineage IIIa contains the chromosomally-integrated SRL and pABC-3, two of the major resistance determinants reported in a recent sublineage of S. flexneri 3a which transmitted globally among men-who-have-sex-with-men, from a possible Latin American origin [31]. Notably, pABC-3 has been reported in epidemic S. sonnei in Chile [30], but the SRL, known as an important AMR determinant in S. flexneri and S. dysenteriae [32, 33], has not been commonly reported in S. sonnei, so represents a worrying addition to the armaments of this pathogen.
Despite the insight gained, there were limitations to this study. It was retrospective in nature and we selected intentionally-diverse isolates without a case definition or representation relative to the disease burden of each country. Thus, the final overview only approximates proportional representation of the phylogenetic variation and AMR profiles of S. sonnei in Latin America and should be used primarily as a contextual tool for future surveillance. Additionally, using WGS for Shigella surveillance relies on isolate culture, which is diagnostically less sensitive than alternative molecular techniques (e.g. qPCR) [34]. However, as evidenced here, isolate cultures have a value-added role for epidemiological, AMR and evolutionary surveillance given the increased insight that can be gained through WGS.
As we exploit novel subtyping techniques for understanding the spread of pathogens, contextual databases need to be rebuilt through regional cooperation and investment in appropriate technologies, facilities and training. This study demonstrates that this is possible within existing infrastructure and surveillance networks, such as PNLA&C. Through collaborative efforts we created a context for S. sonnei across LA, showing international transmission and epidemiologically-expanding sublineages within this region. We have also shown how this information can be used to place recent outbreaks and increasing levels of AMR in a national, regional and global context. This information is essential if we are to maintain current surveillance activities and halt the increase and spread of AMR in important bacterial pathogens such as S. sonnei.
Acknowledgements

This work was funded by the Wellcome Trust grant number 098051. AEM is supported by a BBSRC grant BB/M014088/1. KSB is a Wellcome Trust Clinical Research Fellow (106690/A/14/Z).
Figure legends
Figure 1. Distribution of 323 Latin American S. sonnei isolates sequenced as part of this study. By year and country (upper), sublineage designation and year (middle), and sublineage and country (lower).
Figure 2. Genomic portrait of Latin American S. sonnei in context. Maximum likelihood phylogenetic tree of S. sonnei showing 323 Latin American isolates. Latin American sublineages are shown in black and labelled with adjacent arcs; topology unlinked with Latin American isolates is shown in grey, with lineages labelled toward the interior of the tree. The country of origin of Latin American isolates is shown on the internal track at the tips of the tree, coloured according to the map. The presence of predicted antimicrobial resistance is shown in outer tracks according to the inlaid key. Global reference isolates in the tree do not have country or resistance labels.
Figure 3. Phylogeography of Latin American S. sonnei. Midpoint rooted phylogenetic tree with taxa coloured by LA lineage or sublineage (reference isolates shown in grey). The distribution of each lineage or sublineage across Latin America is shown in the maps below, and for all sublineages and the lineage in Costa Rica.
Figure 4. Argentine S. sonnei outbreaks in the context of the Latin American regional overview. The midpoint rooted phylogenetic tree shows the phylogenetic positions of isolates from outbreaks in 2010, 2011 and 2016 using data generated locally in Argentina. The tree is labelled as in Figure 2, with additional outbreak isolates being coloured by year (according to the inlaid key) in the outermost track exterior to the country labels (coloured according to the map).
Table S1. Origin details, data accessions and results for S. sonnei in this study
Figure S1. Phylogenetic distribution of S. sonnei isolates from South Africa. A subset of reference isolates from Holt et al 2012 and this study (for Lineage V) were used to generate the phylogenetic framework shown in black. South African isolates are shown in red.
Figure S2. Relationship of LA S. sonnei lineage and sublineages to previously designated sublineages. Previously named sublineages are highlighted in coloured, labelled blocks (according to [16]), with the Global III MDR sublineage being coloured in blue and labelled at the defining internal node. The new Latin American lineage and sublineages are similarly indicated (red labels).
Figure S3. Antimicrobial resistance genes in Latin American S. sonnei. The midpoint rooted phylogenetic tree shows the phylogenetic relationships of 439 S. sonnei used in this study, with taxa names coloured by lineage. Adjacent tracks show features of antimicrobial resistance determinants for the Latin American S. sonnei sequenced as part of this study. The first track shows amino acid substitutions in gyrA (WT = wildtype), where black is shown for global reference isolates, with subsequent tracks showing the presence of major antimicrobial resistance genes (present in >5% of isolates) coloured by the length of the contiguous sequence on which the gene is found, according to the heat map in the inlaid key. Black is shown for Latin American isolates where the gene is absent. Blocks of major antimicrobial resistance determinants for each lineage are highlighted by dashed-white boxes and labelled over the colour columns.
Figure S4. Predicted trends in antimicrobial resistance phenotypes over time. The histogram shows the proportion of Latin American isolates predicted to have resistant phenotypes by antimicrobial class. The isolates are further resolved by their year of collection according to the inlaid key, with the total number of isolates in that time bracket shown adjacent.
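The proportions plotted in Figure S4 can be illustrated with a minimal sketch. It assumes each isolate is recorded as a collection year plus the set of antimicrobial classes to which it is predicted resistant. The function name, the time brackets and the toy isolates below are hypothetical and are not taken from the study data.

```python
from collections import defaultdict

def resistance_proportions(isolates, brackets):
    """Return (proportions, totals) per time bracket.

    isolates: iterable of (year, set_of_resistant_classes)
    brackets: list of (label, first_year, last_year), years inclusive
    """
    totals = defaultdict(int)
    resistant = defaultdict(lambda: defaultdict(int))
    for year, classes in isolates:
        for label, first, last in brackets:
            if first <= year <= last:
                totals[label] += 1
                for cls in classes:
                    resistant[label][cls] += 1
                break  # each isolate belongs to at most one bracket
    proportions = {
        label: {cls: n / totals[label] for cls, n in resistant[label].items()}
        for label in totals
    }
    return proportions, dict(totals)

# Hypothetical brackets and isolates, for illustration only.
toy_brackets = [("1997-2004", 1997, 2004), ("2005-2014", 2005, 2014)]
toy_isolates = [
    (1999, {"Aminoglycoside"}),
    (2003, {"Aminoglycoside", "Trimethoprim"}),
    (2010, {"Trimethoprim"}),
    (2012, set()),
]
toy_props, toy_totals = resistance_proportions(toy_isolates, toy_brackets)
```

The per-bracket totals returned alongside the proportions correspond to the isolate counts shown next to the key in the figure.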
Figure S5. Phylogenetic relationships of international S. sonnei across Latin America, including phylogenetic support. The sublineages of Latin American S. sonnei are shown with taxa labelled and coloured by country. Tree branches are coloured by phylogenetic support according to the inlaid key.
Table 1. Shigella sonnei isolates included in this study

| Study | Country | Years | Number |
|---|---|---|---|
| Latin America | Argentina | 2002 - 2011 | 50 |
| Latin America | Chile | 2010 - 2011 | 27 |
| Latin America | Colombia | 2008 - 2011 | 31 |
| Latin America | Costa Rica | 2002 - 2010 | 50 |
| Latin America | Guatemala | 2011 - 2012 | 30 |
| Latin America | Paraguay | 2008 - 2012 | 18 |
| Latin America | Perú | 1999 - 2012 | 48 |
| Latin America | Uruguay | 2000 - 2011 | 28 |
| Latin America | Venezuela | 1997 - 2014 | 41 |
| Global | Many | 1943 - 2008 | 116 |
| | Total | NA | 439 |

Table 2. Genomic and epidemiological features of Latin American S. sonnei

| Lineage or sublineage | Number | Years | Countries | Average pairwise distance (SNPs) | Median pairwise distance (SNPs) | Largest pairwise distance (SNPs) | Previous sublineage name^ |
|---|---|---|---|---|---|---|---|
| V (lineage) | 26 (8%) | 1997 - 2009 | 4 | 134 | 160 | 221 | NA |
| LAIIa | 66 (20%) | 1999 - 2012 | 5 | 176 | 208 | 295 | Unnamed |
| LAIIb | 57 (18%) | 1997 - 2014 | 6 | 161 | 210 | 309 | Unnamed |
| LAIIIa | 89 (28%) | 1999 - 2012 | 6 | 95 | 95 | 292 | South America (III) |
| LAIIIb | 85 (26%) | 2002 - 2012 | 5 | 105 | 118 | 172 | Africa/South America, within Global III |

^ according to [16]
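The pairwise distance columns in Table 2 summarise SNP differences between isolates within each lineage or sublineage. The following is a minimal sketch of how such a summary could be computed, assuming each isolate is represented by an equal-length sequence from a SNP alignment and that gaps and ambiguous bases are ignored. The function names and the toy sequences are hypothetical and do not reproduce the analysis pipeline used in this study.

```python
from itertools import combinations
from statistics import mean, median
from typing import Dict

SKIP = {"-", "N", "n"}  # assumption: gaps and ambiguous bases are not counted

def snp_distance(a: str, b: str) -> int:
    """Count aligned positions at which two sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1 for x, y in zip(a, b)
        if x != y and x not in SKIP and y not in SKIP
    )

def summarise_pairwise(seqs: Dict[str, str]) -> Dict[str, float]:
    """Average, median and largest pairwise SNP distance within one group."""
    dists = [
        snp_distance(seqs[i], seqs[j])
        for i, j in combinations(sorted(seqs), 2)
    ]
    if not dists:
        raise ValueError("need at least two sequences")
    return {"average": mean(dists), "median": median(dists), "largest": max(dists)}

# Toy alignment of four hypothetical isolates, for illustration only.
toy = {
    "iso1": "ACGTACGT",
    "iso2": "ACGTACGA",
    "iso3": "TCGTACGA",
    "iso4": "TCGAACNA",
}
summary = summarise_pairwise(toy)
```

Applied per sublineage, a summary of this kind would give one row of average, median and largest distances, in the layout used by Table 2.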
Table 3. Predicted resistance phenotypes and major resistance determinants
Lineage
62 30 0 0 0 62
3·3
2·7
30 25 0 0 0 51
9
11
83 90 0 9 0 91
2·3
-
-
11
6
30 -
Int2/Tn7
SRL
Int2/Tn7 and SpA
Int2/Tn7 -
SRL SRL SRL pABC-3
SpA Int2/Tn7
pABC-3 gyrA SNPs, qnrS genes
SpA -
13
-
-
-
-
65 0 3 0 77
3·7
Tetracycline Macrolide
Quinolone
82 48 25 80 67
3·9
Previously described major AMR determinants
Beta-lactam Phenicol Trimethoprim Sulphonamide
Int2/Tn7 with bla Int2/Tn7 with bla Int2/Tn7 -
87 91 1 0 0 91
All (n=323)
5·3
-
Aminoglycoside
LAIIIb (n=85) 100 7 0 100
46 73 0 0 4 81
Multidrug resistance
Sulphonamide Tetracycline Macrolide Quinolone ESBL MDR (percentage) AMR phenotypes per isolate (average) Unique resistance profiles (number)
V (n=26) 81 42 0 85
LAIIa (n=66) 58 59 2 61
Predicted resistant (percentage)
Antimicrobial class Aminoglycoside Beta-lactam Phenicol Trimethoprim
Sublineage
LAIIb (n=57) 70 35 0 68
LAIIIa (n=89) 90 90 89 81
-
-
-
Conflict of Interest statement
AEM reports grants from the Biotechnology and Biological Sciences Research Council during the conduct of the study. KSB reports grants from the Wellcome Trust during the conduct of the study. All authors declare no conflict of interest.
References
[1] M.D. Kirk, S.M. Pires, R.E. Black, M. Caipo, J.A. Crump, B. Devleesschauwer, D. Dopfer, A. Fazil, C.L. Fischer-Walker, T. Hald, A.J. Hall, K.H. Keddy, R.J. Lake, C.F. Lanata, P.R. Torgerson, A.H. Havelaar, F.J. Angulo, PLoS medicine 12(12) (2015) e1001921.
[2] G. World Health Organization Foodborne Disease Burden Epidemiology Reference, WHO estimates of the global burden of foodborne diseases: foodborne disease burden epidemiology reference group 2007-2015, World Health Organisation, Switzerland, 2015.
[3] J.A. Platts-Mills, S. Babji, L. Bodhidatta, J. Gratz, R. Haque, A. Havt, B.J. McCormick, M. McGrath, M.P. Olortegui, A. Samie, S. Shakoor, D. Mondal, I.F. Lima, D. Hariraju, B.B. Rayamajhi, S. Qureshi, F. Kabir, P.P. Yori, B. Mufamadi, C. Amour, J.D. Carreon, S.A. Richard, D. Lang, P. Bessong, E. Mduma, T. Ahmed, A.A. Lima, C.J. Mason, A.K. Zaidi, Z.A. Bhutta, M. Kosek, R.L. Guerrant, M. Gottlieb, M. Miller, G. Kang, E.R. Houpt, M.-E.N. Investigators, Lancet Glob Health 3(9) (2015) e564-75.
[4] R. Achi, L. Mata, X. Siles, A.A. Lindberg, The Journal of infection 32(3) (1996) 211-8.
[5] M.A. Sousa, E.N. Mendes, G.B. Collares, L.A. Peret-Filho, F.J. Penna, P.P. Magalhaes, Mem Inst Oswaldo Cruz 108(1) (2013) 30-5.
[6] D.L. Woodward, F.G. Rodgers, Can J Infect Dis 11(4) (2000) 181-6.
[7] N. Fulla, V. Prado, C. Duran, R. Lagos, M.M. Levine, The American journal of tropical medicine and hygiene 72(6) (2005) 851-4.
[8] I. Chinen, M. Galas, E. Tuduri, M.R. Vinas, C. Carbonari, A. Della Gaspera, D. Napoli, D.M. Aanensen, S. Argimon, N.R. Thomson, D. Hughes, S. Baker, C. Guzman-Verri, M.T. Holden, A.M. Abdala, L.P. Alvarez, B. Alvez, R. Barros, S. Budall, C. Campano, L.S. Chamosa, P. Cheddie, D. Cisterna, D. De Belder, M. Dropa, D. Durand, A. Elena, G. Fontecha, C. Huber, A.P. Lemos, L. Melli, R.E. Paul, L. Suarez, J. Torres Flores, J. Campos, bioRxiv 10.1101/049940 (2016).
[9] C.N. Thompson, P.T. Duy, S. Baker, PLoS neglected tropical diseases 9(6) (2015) e0003708.
[10] H. Bolaños, A. Tijerino, G. Oropeza, J. Morales, E. Campos, N.L.N.o. Bacteriology., Incremento significativo de la shigelosis en Costa Rica durante el primer cuatrimestre del 2014 (Significant increase in shigellosis in Costa Rica during the first quarter of 2014) Laboratory based surveillance report, INCIENSA, Tres Ríos, Costa Rica, 2014.
[11] K.S. Baker, T.J. Dallman, A. Behar, F.X. Weill, M. Gouali, J. Sobel, M. Fookes, L. Valinsky, O. Gal-Mor, T.R. Connor, I. Nissan, S. Bertrand, J. Parkhill, C. Jenkins, D. Cohen, N.R. Thomson, Emerging infectious diseases 22(9) (2016) 1545-53.
[12] K.E. Holt, T.V. Thieu Nga, D.P. Thanh, H. Vinh, D.W. Kim, M.P. Vu Tra, J.I. Campbell, N.V. Hoang, N.T. Vinh, P.V. Minh, C.T. Thuy, T.T. Nga, C. Thompson, T.T. Dung, N.T. Nhu, P.V. Vinh, P.T. Tuyet, H.L. Phuc, N.T. Lien, B.D. Phu, N.T. Ai, N.M. Tien, N. Dong, C.M. Parry, T.T. Hien, J.J. Farrar, J. Parkhill, G. Dougan, N.R. Thomson, S. Baker, Proceedings of the National Academy of Sciences of the United States of America 110(43) (2013) 17522-7.
[13] B. Swaminathan, P. Gerner-Smidt, L.K. Ng, S. Lukinmaa, K.M. Kam, S. Rolando, E.P. Gutierrez, N. Binsztein, Foodborne pathogens and disease 3(1) (2006) 36-50.
[14] J. Campos, M. Pichel, T.M.I. Vaz, A.T. Tavechio, S.A. Fernandes, N. Munoz, C. Rodriguez, M.E. Realpe, J. Moreno, P. Araya, J. Fernandez, A. Fernandez, E. Campos, F. Duarte, N.W. Gustafson, N. Binsztein, E.P. Gutierrez, Food Res Int 45(2) (2012) 1030-1036.
[15] X. Deng, H.C. den Bakker, R.S. Hendriksen, Annual review of food science and technology 7 (2016) 353-74.
[16] K.E. Holt, S. Baker, F.X. Weill, E.C. Holmes, A. Kitchen, J. Yu, V. Sangal, D.J. Brown, J.E. Coia, D.W. Kim, S.Y. Choi, S.H. Kim, W.D. da Silveira, D.J. Pickard, J.J. Farrar, J. Parkhill, G. Dougan, N.R. Thomson, Nature genetics 44(9) (2012) 1056-9.
[17] H. Chung The, M.A. Rabaa, D. Pham Thanh, N. De Lappe, M. Cormican, M. Valcanis, B.P. Howden, S. Wangchuk, L. Bodhidatta, C.J. Mason, T. Nguyen Thi Nguyen, D. Vu Thuy, C.N. Thompson, N. Phu Huong Lan, P. Voong Vinh, T. Ha Thanh, P. Turner, P. Sar, G. Thwaites, N.R. Thomson, K.E. Holt, S. Baker, PLoS medicine 13(8) (2016) e1002055.
[18] T.J. Dallman, M.A. Chattaway, P. Mook, G. Godbole, P.D. Crook, C. Jenkins, Journal of medical microbiology 65(8) (2016) 882-4.
[19] E. Zankari, H. Hasman, R.S. Kaas, A.M. Seyfarth, Y. Agerso, O. Lund, M.V. Larsen, F.M. Aarestrup, The Journal of antimicrobial chemotherapy 68(4) (2013) 771-7.
[20] S. Zhao, G.H. Tyson, Y. Chen, C. Li, S. Mukherjee, S. Young, C. Lam, J.P. Folster, J.M. Whichard, P.F. McDermott, Applied and environmental microbiology 82(2) (2015) 459-66.
[21] G.H. Tyson, P.F. McDermott, C. Li, Y. Chen, D.A. Tadesse, S. Mukherjee, S. Bodeis-Jones, C. Kabera, S.A. Gaines, G.H. Loneragan, T.S. Edrington, M. Torrence, D.M. Harhay, S. Zhao, The Journal of antimicrobial chemotherapy 70(10) (2015) 2763-9.
[22] M.A. Quail, T.D. Otto, Y. Gu, S.R. Harris, T.F. Skelly, J.A. McQuillan, H.P. Swerdlow, S.O. Oyola, Nature methods 9(1) (2012) 10-1.
[23] A.J. Page, N. De Silva, M. Hunt, M.A. Quail, J. Parkhill, S.R. Harris, T.D. Otto, J.A. Keane, Microbial Genomics 2(8) (2016).
[24] N.J. Croucher, A.J. Page, T.R. Connor, A.J. Delaney, J.A. Keane, S.D. Bentley, J. Parkhill, S.R. Harris, Nucleic acids research 43(3) (2015) e15.
[25] A. Stamatakis, Bioinformatics 22(21) (2006) 2688-90.
[26] E. Zankari, H. Hasman, S. Cosentino, M. Vestergaard, S. Rasmussen, O. Lund, F.M. Aarestrup, M.V. Larsen, The Journal of antimicrobial chemotherapy 67(11) (2012) 2640-4.
[27] I.J. Azmi, B.K. Khajanchi, F. Akter, T.N. Hasan, M. Shahnaij, M. Akter, A. Banik, H. Sultana, M.A. Hossain, M.K. Ahmed, S.M. Faruque, K.A. Talukder, PloS one 9(7) (2014) e102533.
[28] S. Argimón, K. Abudahab, R.J.E. Goater, A. Fedosejev, J. Bhai, C. Glasner, E.J. Feil, M.T.G. Holden, C.A. Yeats, H. Grundmann, B.G. Spratt, D.M. Aanensen, Microbial Genomics 2(11) (2016).
[29] I. Letunic, P. Bork, Nucleic acids research 44(W1) (2016) W242-W245.
[30] A. Miranda, B. Avila, P. Diaz, L. Rivas, K. Bravo, J. Astudillo, C. Bueno, M.T. Ulloa, G. Hermosilla, F. Del Canto, J.C. Salazar, C.S. Toro, Front Cell Infect Microbiol 6 (2016) 77.
[31] K.S. Baker, T.J. Dallman, P.M. Ashton, M. Day, G. Hughes, P.D. Crook, V.L. Gilbart, S. Zittermann, V.G. Allen, B.P. Howden, T. Tomita, M. Valcanis, S.R. Harris, T.R. Connor, V. Sintchenko, P. Howard, J.D. Brown, N.K. Petty, M. Gouali, D.P. Thanh, K.H. Keddy, A.M. Smith, K.A. Talukder, S.M. Faruque, J. Parkhill, S. Baker, F.X. Weill, C. Jenkins, N.R. Thomson, The Lancet infectious diseases (2015).
[32] E. Njamkepo, N. Fawal, A. Tran-Dien, J. Hawkey, N. Strockbine, C. Jenkins, K.A. Talukder, R. Bercion, K. Kuleshov, R. Kolinska, J.E. Russell, L. Kaftyreva, M. Accou-Demartin, A. Karas, O. Vandenberg, A.E. Mather, C.J. Mason, A.J. Page, T. Ramamurthy, C. Bizet, A. Gamian, I. Carle, A.G. Sow, C. Bouchier, A.L. Wester, M. Lejay-Collin, M.C. Fonkoua, S.L. Hello, M.J. Blaser, C. Jernberg, C. Ruckly, A. Merens, A.L. Page, M. Aslett, P. Roggentin, A. Fruth, E. Denamur, M. Venkatesan, H. Bercovier, L. Bodhidatta, C.S. Chiou, D. Clermont, B. Colonna, S. Egorova, G.P. Pazhani, A.V. Ezernitchi, G. Guigon, S.R. Harris, H. Izumiya, A. Korzeniowska-Kowal, A. Lutynska, M. Gouali, F. Grimont, C. Langendorf, M. Marejkova, L.A. Peterson, G. Perez-Perez, A. Ngandjio, A. Podkolzin, E. Souche, M. Makarova, G.A. Shipulin, C. Ye, H. Zemlickova, M. Herpay, P.A. Grimont, J. Parkhill, P. Sansonetti, K.E. Holt, S. Brisse, N.R. Thomson, F.X. Weill, Nat Microbiol 1 (2016) 16027.
[33] T.R. Connor, C.R. Barker, K.S. Baker, F.X. Weill, K.A. Talukder, A.M. Smith, S. Baker, M. Gouali, D. Pham Thanh, I. Jahan Azmi, W. Dias da Silveira, T. Semmler, L.H. Wieler, C. Jenkins, A. Cravioto, S.M. Faruque, J. Parkhill, D. Wook Kim, K.H. Keddy, N.R. Thomson, Elife 4 (2015) e07335.
[34] J. Liu, J.A. Platts-Mills, J. Juma, F. Kabir, J. Nkeze, C. Okoi, D.J. Operario, J. Uddin, S. Ahmed, P.L. Alonso, M. Antonio, S.M. Becker, W.C. Blackwelder, R.F. Breiman, A.S. Faruque, B. Fields, J. Gratz, R. Haque, A. Hossain, M.J. Hossain, S. Jarju, F. Qamar, N.T. Iqbal, B. Kwambana, I. Mandomando, T.L. McMurry, C. Ochieng, J.B. Ochieng, M. Ochieng, C. Onyango, S. Panchalingam, A. Kalam, F. Aziz, S. Qureshi, T. Ramamurthy, J.H. Roberts, D. Saha, S.O. Sow, S.E. Stroup, D. Sur, B. Tamboura, M. Taniuchi, S.M. Tennant, D. Toema, Y. Wu, A. Zaidi, J.P. Nataro, K.L. Kotloff, M.M. Levine, E.R. Houpt, Lancet 388(10051) (2016) 1291-301.