Whole genome sequencing of Shigella sonnei through PulseNet Latin America and Caribbean: advancing global surveillance of foodborne illnesses


Accepted Manuscript

Kate Susan Baker, Josefina Campos, Mariana Pichel, Anabella Della Gaspera, Francisco Duarte-Martínez, Elena Campos-Chacón, Hilda María Bolaños-Acuña, Caterina Guzman-Verri, Alison E. Mather, Susana Diaz Velasco, Maria Luz Zamudio Rojas, Jessica Forbester, Thomas Richard Connor, Karen Helena Keddy, Anthony Marius Smith, Enedia A. Lopez de Delgado, Giovanny Angiolillo, Nirvia Cuaical, Jorge Fernandez, Carolina Aguayo, Melissa Morales Aguilar, Claudia Valenzuela, Ana Judith Morales Medrano, Alfredo Sirok Esteve, Natalie Weiler Gustafson, Paula Lucia Diaz Guevara, Lucy Angeline Montaño, Enrique Perez, Nicholas R. Thomson

PII: S1198-743X(17)30190-8

DOI: 10.1016/j.cmi.2017.03.021

Reference: CMI 905

To appear in: Clinical Microbiology and Infection

Received Date: 12 January 2017

Revised Date: 16 March 2017

Accepted Date: 27 March 2017

Please cite this article as: Baker KS, Campos J, Pichel M, Della Gaspera A, Duarte-Martínez F, Campos-Chacón E, Bolaños-Acuña HM, Guzman-Verri C, Mather AE, Velasco SD, Zamudio Rojas ML, Forbester J, Connor TR, Keddy KH, Smith AM, Lopez de Delgado EA, Angiolillo G, Cuaical N, Fernandez J, Aguayo C, Aguilar MM, Valenzuela C, Morales Medrano AJ, Esteve AS, Gustafson NW, Diaz Guevara PL, Montaño LA, Perez E, Thomson NR, Whole genome sequencing of Shigella sonnei through PulseNet Latin America and Caribbean: advancing global surveillance of foodborne illnesses, Clinical Microbiology and Infection (2017), doi: 10.1016/j.cmi.2017.03.021. This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.


Title: Whole genome sequencing of Shigella sonnei through PulseNet Latin America and Caribbean: advancing global surveillance of foodborne illnesses

Running title: Shigella sonnei across Latin America

Kate Susan Baker*1,2, Josefina Campos3, Mariana Pichel3, Anabella Della Gaspera3, Francisco Duarte-Martínez4, Elena Campos-Chacón4, Hilda María Bolaños-Acuña4, Caterina Guzman-Verri5,6, Alison E Mather2,7, Susana Diaz Velasco8, Maria Luz Zamudio Rojas8, Jessica Forbester2, Thomas Richard Connor9, Karen Helena Keddy10, Anthony Marius Smith10, Enedia A. Lopez de Delgado11, Giovanny Angiolillo11, Nirvia Cuaical11, Jorge Fernandez12, Carolina Aguayo12, Melissa Morales Aguilar13, Claudia Valenzuela13, Ana Judith Morales Medrano13, Alfredo Sirok Esteve14, Natalie Weiler Gustafson15, Paula Lucia Diaz Guevara16, Lucy Angeline Montaño16, Enrique Perez17, Nicholas R Thomson*2,18

1 University of Liverpool, Department of Functional and Comparative Genomics, Liverpool, United Kingdom, L69 7ZB

2 Wellcome Trust Sanger Institute, Pathogen Variation Programme, Hinxton, United Kingdom, CB10 1SA

3 Instituto Nacional de Enfermedades Infecciosas, ANLIS, Buenos Aires, Argentina

4 Instituto Costarricense de Investigación y Enseñanza en Nutrición y Salud (Inciensa), Costa Rica

5 Programa de Investigación en Enfermedades Tropicales, Escuela de Medicina Veterinaria, Universidad Nacional, Heredia, Costa Rica

6 Centro de Investigación en Enfermedades Tropicales, Facultad de Microbiología, Universidad de Costa Rica, San José, Costa Rica

7 University of Cambridge, Department of Veterinary Medicine, Cambridge, United Kingdom, CB3 0ES

8 National Institute of Health, Lima, Perú

9 Organisms and Environment Division, Cardiff University School of Biosciences, Sir Martin Evans Building, Cardiff, CF10 3AX

10 Centre for Enteric Diseases, National Institute for Communicable Diseases and Faculty of Health Sciences, University of the Witwatersrand, Johannesburg, South Africa

11 Department of Bacteriology, National Institute of Hygiene “Rafael Rangel”, Ciudad University, Los Chaguaramos, Venezuela

12 Molecular Genetics Laboratory, Institute of Public Health of Chile, Santiago, Chile

13 Department of Foodborne Diseases, National Health Laboratory of Guatemala, Laboratorio Nacional de Salud, Barcenas, Guatemala

14 Bacteriology Laboratory, Departamento de Laboratorios de Salud Pública (DLSP), Ministerio de Salud Pública (MSP); Alfredo Navarro 3051, 11600, Montevideo, Uruguay

15 Laboratorio Central de Salud Pública, Departamento of Bacteriology, Asuncion, Paraguay

16 Grupo de Microbiología, Instituto Nacional de Salud, Bogotá, Colombia

17 Pan American Health Organization/World Health Organization, Department of Health Emergencies, 525 23rd Street NW, Washington DC, USA

18 London School of Hygiene and Tropical Medicine, London, United Kingdom, WC1E 7HT

*Corresponding authors: Dr Kate Baker, [email protected], +44 1517 954575; Professor Nicholas R Thomson, [email protected], +44 1223 494740

Abstract

Objective

Shigella sonnei is a globally-important diarrhoeal pathogen tracked through the surveillance network PulseNet Latin America and Caribbean (PNLA&C), which participates in PulseNet International. PNLA&C laboratories use common molecular techniques to track pathogens causing foodborne illness. We aimed to demonstrate the possibility and advantages of transitioning to whole genome sequencing (WGS) for surveillance within existing networks across a continent where S. sonnei is endemic.

Methods

We applied WGS to representative archive isolates of S. sonnei (n=323) from laboratories in nine PNLA&C countries to generate a regional phylogenomic reference for S. sonnei and put this in the global context. We used this reference to contextualise 16 S. sonnei from three Argentinian outbreaks, using locally-generated sequence data. Assembled genome sequences were used to predict antimicrobial resistance (AMR) phenotypes and identify AMR determinants.

Results

S. sonnei isolates clustered in five Latin American sublineages in the global phylogeny, with many (46%, 149 of 323) belonging to previously undescribed sublineages. Predicted multiple drug resistance was common (77%, 249 of 323) and clinically-relevant differences in AMR were found among sublineages. The regional overview showed that Argentinian outbreak isolates belonged to distinct sublineages and had different epidemiological origins.

Conclusions

Latin America contains novel genetic diversity of S. sonnei that is relevant on a global scale and commonly exhibits multiple drug resistance. Retrospective passive surveillance with WGS has utility for informing treatment, identifying regionally-epidemic sublineages and providing a framework for interpretation of prospective, locally-sequenced outbreaks.

Introduction

Shigella are globally important bacteria, causing more than 190 million diarrhoeal disease cases and 65,796 deaths annually, 18 million and 1,023 of which, respectively, occur in the Americas [1, 2]. In Latin America, S. sonnei is a common cause of diarrhoeal disease (mainly in children [3-5]) and is variably resistant to commonly-used antimicrobials [6, 7]. Explosive outbreaks still occur (e.g. a 900-case epidemic of S. sonnei in Argentina in 2016 [8]) and, mirroring trends in other economically-developing areas [9], increases in endemic S. sonnei prevalence are also reported [10]. In addition to local transmission, new phylogenetic lineages of S. sonnei can disseminate nationally and spread internationally within two to three decades [11, 12]. Given its worldwide distribution, increasing importance, and international transmission, it is unsurprising that S. sonnei is under surveillance through PulseNet International [13].

PulseNet Latin America and Caribbean (PNLA&C) is a regional network that contributes to PulseNet International, a public health network of >120 laboratories in >80 countries that has performed surveillance of foodborne illnesses for twenty years [13]. PNLA&C laboratories use common molecular subtyping techniques and share their results and associated epidemiological information through a regional database to facilitate early identification of disease outbreaks in an increasingly globalised world [14]. Owing to the increased resolution of whole genome sequencing (WGS) compared with traditional techniques (e.g. Pulsed-Field Gel Electrophoresis, PFGE), PulseNet International is currently transitioning to the use of WGS [15].

WGS has been applied to subtype S. sonnei effectively: a species-defining study identified four main phylogenetic lineages that were further split into sublineages containing isolates of similar geographical origins [16]. Subsequent WGS studies of S. sonnei have demonstrated the emergence of sublineages of public health importance at national and international levels, often driven by the acquisition of AMR [11, 12, 16, 17]. In a strict public health setting, Public Health England researchers have used WGS to identify epidemiological clusters of S. sonnei, as the existing subtyping technique (phage typing) provided poor lineage discrimination [18]. WGS has also been used to predict AMR phenotypes, with high (e.g. >95%) specificities and sensitivities reported for other enteric bacteria, including E. coli, Campylobacter and Salmonella [19-21]. Collectively, these studies suggest that the application of WGS within international surveillance networks, such as PNLA&C, can enhance outbreak detection and surveillance for S. sonnei and its AMR determinants.

The strength of surveillance frameworks lies in both the use of common techniques and large reference databases; for example, hundreds of thousands of PFGE-subtyped pathogen profiles exist within PulseNet International. Populating these databases with WGS data from passive and active surveillance programs will promote the continued success of international surveillance. In this study, members of PNLA&C worked collaboratively toward this goal by generating a WGS overview of S. sonnei in the region.

Materials and Methods

Clinical isolates

Latin American archive isolates

To construct a regional overview of S. sonnei across Latin America, WGS data were generated from 323 archived clinical isolates of S. sonnei collected over nineteen years from nine countries (Table 1, Figure 1). Each national PNLA&C partner was responsible for selecting isolates from their own archives with the aim of achieving diversity with respect to the following: PFGE profile, year of isolation, AMR profile, disease manifestation, and PFGE profile linkage to outbreaks of disease or sporadic cases. Metadata associated with the isolates frequently included the year of collection, AMR susceptibility testing results and geographical information (e.g. patient residential province or address of submitting laboratory). All metadata and results are shown in Table S1.

Global context isolates

To set this regional overview in a global context, publicly available sequence data from global reference isolates (n=116) of S. sonnei were also included (Table S1). These comprise the temporally and geographically diverse isolates (from four continents, collected between 1943 and 2008) used to define the population structure of S. sonnei [16].

Argentine outbreak isolates

To demonstrate the utility of the Latin American (LA) regional overview for investigating national outbreaks, WGS data generated at the PNLA&C reference laboratory (ANLIS) from three Argentinian outbreaks (n=16 isolates) of S. sonnei were also used. Previously reported at a national level [8], these isolates were from outbreaks in 2010 (n=5), 2011 (n=3) and 2016 (n=8).

Genome sequencing and bioinformatics analysis

Archive isolates were sequenced, trimmed and quality checked at the Wellcome Trust Sanger Institute according to in-house protocols [22]. Sequencing data were de novo assembled using a custom assembly pipeline [23]. All assemblies were >4 Mb in total length and comprised <650 contiguous sequences (Table S1). Sequencing data and assemblies are publicly available at the European Nucleotide Archive (accession numbers, Table S1). Argentinian outbreak isolates were sequenced at ANLIS [8].

To construct the regional overview phylogeny, a multiple sequence alignment (MSA) was created by mapping the sequence data from 439 taxa (archive and global context isolates) to Shigella sonnei Ss046 and its five associated plasmids (5055316 bp) using SMALT, followed by removal of repeat regions and mobile elements (7210310 bp) [16] and regions of recombination (7074 sites) [24], resulting in a final alignment of 13,988 variant sites. A maximum likelihood phylogeny with 100 bootstraps was then inferred [25]. Phylogenetic analysis incorporating outbreak isolates was conducted similarly (final alignment 14,075 variant sites).
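The notion of a "variant site" in the alignment step above can be illustrated with a toy example. This is a minimal sketch only, not the pipeline used in this study (which mapped reads with SMALT and first removed repeats, mobile elements and recombinant regions); the function name `variant_sites` and the toy sequences are hypothetical, and treating `N`, `-` and `?` as uninformative is an assumption.

```python
from typing import Dict, List

# Characters treated as missing/uninformative (an assumption for this sketch).
AMBIGUOUS = {"N", "-", "?"}


def variant_sites(alignment: Dict[str, str]) -> List[int]:
    """Return 0-based columns where two or more distinct unambiguous bases occur."""
    seqs = list(alignment.values())
    if not seqs:
        return []
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all sequences in an alignment must have equal length")
    sites = []
    for i in range(length):
        # Collect the distinct bases in this column, ignoring missing data.
        bases = {s[i].upper() for s in seqs} - AMBIGUOUS
        if len(bases) > 1:
            sites.append(i)
    return sites


# Hypothetical toy alignment: a reference and two isolates.
toy = {"ref": "ACGTACGT", "iso1": "ACGTACGA", "iso2": "ACCTNCGA"}
print(variant_sites(toy))
```

Keeping only such columns is what reduces a multi-megabase whole-genome alignment to the few thousand informative positions used for tree inference.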

For analysis of sequences related to AMR, AMR genes were detected on assembled sequences [26] and cross-referenced with phylogeny, contiguous sequence length and traditional comparative genetic approaches, including Artemis, BLAST (against NCBI reference databases and locally) and ACT, as previously described [11], to determine the presence of known AMR determinants in shigellae. Single Nucleotide Polymorphisms (SNPs) in known quinolone resistance-determining regions, including gyrA positions 83, 87 and 211 and parC positions 80 and 84 [27], were retrieved.

Approximate longitude and latitude of locations were deduced and phylogeographic analysis was visualised employing MicroReact [28]. FigTree and iTOL were used for additional visualisations [29].

Results

Phylogenetic analysis of LA S. sonnei was conducted to define its population structure within the known four-lineage context of S. sonnei. This divided the LA isolates into a new genetic lineage and four genetic sublineages of variable diversity. The new lineage comprised 26 (8%) of the 323 archive isolates and was designated Lineage V (Figure 1, Figure 2, Table 1). Since its detection in this study, Lineage V isolates have been detected in South Africa (Figure S1) and the United Kingdom (Baker et al, in preparation). The remaining archive isolates clustered within Lineages II (n=123, 38%) and III (n=174, 54%). The archive isolates in Lineage II were further subdivided into the sublineages LA IIa and LA IIb (Table 1, Figure 1, Figure 2). LA IIa and LA IIb were phylogenetically distinct from the previously-described South America II sublineage (Figure S2). The archive isolates in Lineage III were similarly subdivided into sublineages LA IIIa and LA IIIb, which were expansions of previously described sublineages (see Table 1, Figure S2), with IIIb being part of the multi-drug resistant (MDR) Global III sublineage of S. sonnei that expanded globally following the acquisition of AMR [16]. The four LA sublineages had variable genetic diversity with, for example, the maximum MSA pairwise distance between any two isolates in LA IIIb being 172 Single Nucleotide Polymorphisms (SNPs), compared with 309 SNPs for LA IIb (Table 2). S. sonnei isolates belonging to Lineages I and IV were not found in this study.
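The "maximum MSA pairwise distance" used above as a measure of within-sublineage diversity can be sketched as follows. This is an illustrative toy under stated assumptions, not the authors' implementation: the function names and sequences are hypothetical, and positions where either sequence has missing data (`N`, `-`, `?`) are assumed to be skipped.

```python
from itertools import combinations
from typing import Dict

# Characters treated as missing data (an assumption for this sketch).
MISSING = {"N", "-", "?"}


def snp_distance(a: str, b: str) -> int:
    """Count aligned positions where both bases are known and differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1
        for x, y in zip(a.upper(), b.upper())
        if x != y and x not in MISSING and y not in MISSING
    )


def max_pairwise_distance(alignment: Dict[str, str]) -> int:
    """Largest SNP distance between any two sequences (0 if fewer than two)."""
    return max(
        (snp_distance(a, b) for a, b in combinations(alignment.values(), 2)),
        default=0,
    )


# Hypothetical isolates from one sublineage.
sublineage = {"a": "ACGTAC", "b": "ACGTAA", "c": "TCGNAA"}
print(max_pairwise_distance(sublineage))
```

A smaller maximum distance within a sublineage, such as the 172 SNPs reported for LA IIIb against 309 for LA IIb, is the kind of signal that is later interpreted as consistent with more recent clonal expansion.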

The regional phylogenetic overview provides a nomenclature for discussion and interpretation of national and regional surveillance patterns of S. sonnei in LA. For that reason, the geospatial information and phylogenetic results of the archived isolates were loaded into a MicroReact project (found at http://microreact.org/project/Shigella_sonnei_in_Latin_America). This public resource can be used interactively by end-users to display and filter the results of this study based on time, phylogeny and geography to highlight, for example, that sublineages were not uniformly distributed around LA or within individual countries (Figure 3).

The isolates within an LA lineage or sublineage were temporally and geographically diverse. Each lineage or sublineage contained isolates collected in multiple years over the course of the study, ranging from ten years for IIIb to the entire seventeen years for IIb (Figure 1, Table 2). No obvious shifts in the presence of the LA lineage or sublineages were seen over time, apart from IIIb possibly predominating in later years (Figure 1). The LA lineage and sublineages were also diverse with respect to their countries of origin, with each comprising isolates from between four and six countries (Figure 1, Figure 2, Table 1). Within a given LA lineage or sublineage, isolates from different countries were frequently intermingled rather than being phylogenetically separated based on geography (often with good phylogenetic support, Figure S5), indicating that international transmission across the region may be occurring.

To demonstrate the utility of these data for contextualising new outbreaks, we performed further phylogenetic analysis with additional isolates from S. sonnei outbreaks in Argentina. This confirmed that the Argentinian outbreaks in 2010 and 2011 were caused by phylogenetically distinct S. sonnei with distinct AMR profiles [8]. Here this is demonstrated by the majority of isolates from the 2011 outbreak belonging to sublineage IIIa and those from 2010 and 2016 belonging to sublineage IIIb (Figure 4). Notably, however, the 2011 and 2016 isolates fell into multiple sublineages, indicating that the outbreaks may have multiple epidemiological origins.

Predicted AMR phenotypes correlated well with antimicrobial resistance testing data (94·3% sensitivity, 99·4% specificity of 330 available phenotypes, Table S1), and the predicted phenotypes are the results presented throughout this text. MDR (resistance to ≥3 antimicrobial classes) was common among the isolates (77%, 249 of 323), with isolates being resistant to between 0 and 7 antimicrobials (mean 3.7, mode and median 4) (Table 3, Table S1, Figure 2). No relationship of increasing antimicrobial resistance over time was detected (Figure S4). Macrolide resistance and an extended-spectrum beta-lactamase (ESBL) gene were each found in a single isolate: the azithromycin resistance gene mphA in a IIIb isolate, and a blaSHV129 gene that conferred resistance to ceftazidime in a Lineage V isolate (Table S1). Resistance to quinolones was similarly infrequent, being present in only 3% (10 of 323) of isolates (conferred by gyrA mutations (n=8) or qnr genes (n=2), Table S1, Table 3). Resistance to other classes of antimicrobials was more common, with 65-82% of isolates being resistant to aminoglycosides (streptomycin), trimethoprim, sulphonamide and tetracycline classes of antimicrobials, and resistance to phenicol and beta-lactam classes also being frequently detected (25 and 48% of isolates, respectively) (Figure 2, Table 3). These resistances were encoded by a variety of AMR genes (Table S1, Figure S3).
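The two summary measures used above, agreement between predicted and tested phenotypes (sensitivity and specificity) and the MDR definition (resistance to ≥3 antimicrobial classes), can be sketched as follows. This is a minimal illustration under stated assumptions: the function names and the toy data are hypothetical and are not drawn from Table S1.

```python
from typing import Iterable, Tuple


def sensitivity_specificity(pairs: Iterable[Tuple[bool, bool]]) -> Tuple[float, float]:
    """pairs are (predicted_resistant, observed_resistant) for each tested phenotype."""
    tp = fn = tn = fp = 0
    for predicted, observed in pairs:
        if observed and predicted:
            tp += 1  # resistant by testing, predicted resistant
        elif observed and not predicted:
            fn += 1  # resistant by testing, predicted susceptible
        elif not observed and not predicted:
            tn += 1  # susceptible by testing, predicted susceptible
        else:
            fp += 1  # susceptible by testing, predicted resistant
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one resistant and one susceptible phenotype")
    return tp / (tp + fn), tn / (tn + fp)


def is_mdr(resistant_classes: Iterable[str], threshold: int = 3) -> bool:
    """MDR as defined in the text: resistance to >= 3 antimicrobial classes."""
    return len(set(resistant_classes)) >= threshold


# Hypothetical phenotype comparisons: 3 TP, 1 FN, 4 TN, 1 FP.
toy_pairs = [(True, True)] * 3 + [(False, True)] + [(False, False)] * 4 + [(True, False)]
print(sensitivity_specificity(toy_pairs))
print(is_mdr(["tetracycline", "sulphonamide", "trimethoprim"]))
```

Counting distinct classes rather than individual drugs matters here, because resistance to several agents within one class would otherwise inflate the MDR proportion.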

Notably, the distribution of resistance against an antimicrobial class was not uniform among the sublineages (Figure 2). For example, predicted beta-lactam resistance in sublineage IIIa was 90% compared with just 7% for sublineage IIIb (p<0.01), and while IIIa and IIIb had ~90% resistance to tetracycline, sublineages IIb and IIa had between 25 and 30% tetracycline resistance (Table 3). The presence of resistance toward a variety of antimicrobials gave rise to an AMR profile (i.e. antibiogram) in each isolate (Figure 2), with each sublineage containing isolates of between 6 and 13 different AMR profiles. However, in the case of sublineages IIIa and IIIb, single AMR profiles dominated, with one profile being present among 69 and 79% (respectively) of the isolates in the sublineage, and other AMR profiles being present in fewer than 9% of isolates in the sublineage (Table S1, Figure 2). The dominant AMR profiles in each of IIIa and IIIb were determined by the presence of chromosomal and plasmid-encoded AMR genes conferring resistance to multiple antimicrobial classes (Table 3). Specifically, sublineage IIIb carried the chromosomal Int2/Tn7 resistance determinant and the SpA plasmid, and sublineage IIIa carried the chromosomal Shigella Resistance Locus (SRL) and a variant plasmid of SpA, pABC-3, on which the aminoglycoside-resistance gene strA has been interrupted by the acquisition of a trimethoprim-resistance conferring gene dfrA14 [30]. In contrast to the presence of a dominant AMR profile in sublineages IIIa and IIIb, sublineages IIa, IIb and Lineage V contained at least three AMR profiles that were present in ≥15% of the isolates (Table S1).

Discussion

Here we have created a resource of novel WGS diversity of S. sonnei from Latin America relevant to public health surveillance on regional and global scales. We report the identification of a new global lineage (Lineage V) [16]; previously undescribed sublineages of Lineage II (Latin America IIa and Latin America IIb); and expansions in Lineage III, including Latin America IIIa (previously South America (III)) and Latin America IIIb, within the MDR Global III lineage. The subsequent detection of Lineage V in Europe and Africa testifies to the relevance of this diversity for surveillance on a global scale, and regional importance of this data is demonstrated here by contextualisation of Argentinian S. sonnei outbreaks.

In a previous study, outbreak isolates from Argentina were discriminated at a national level into three WGS sublineages [8]. Building these isolates into this regional overview, the 2010 and 2016 outbreak isolates were contained entirely within the diversity of archive isolates from Argentina, indicating these epidemics were likely from previously circulating strains (Figure 4). In contrast, the 2011 outbreak isolates were more closely related to IIIa isolates from Perú and Chile than to the single IIIa isolate from Argentina, indicating the epidemic may have been subsequent to an importation event. Thus, by providing interpretative context, our results enhance national, regional and global surveillance of S. sonnei through publicly available sequencing data and MicroReact.

The WGS data were also used to examine AMR in S. sonnei across LA. Quinolone and macrolide resistance and genes encoding extended-spectrum beta-lactamases were not widespread, being present in only a handful of isolates. Notably, however, quinolone resistant isolates in Lineage V and sublineage IIIa were outside of the Central Asia III lineage, thought to act as the global reservoir for ciprofloxacin resistant S. sonnei [17]. Resistance against many other classes of antimicrobials (including aminoglycosides, beta-lactams, trimethoprim-sulphonamides, phenicol and tetracyclines), however, was common, as was MDR. Common MDR across the sublineages is part of the problem of increasing AMR in Shigella and has already led to the emergence of epidemiologically-dominant sublineages elsewhere [11, 12].

This study provides both applications in future surveillance and retrospective insight on S. sonnei epidemiology across Latin America. The presence of closely related isolates from different countries within an individual sublineage indicates that international transmission of S. sonnei occurs across LA, as suggested for the Argentina 2011 outbreak. Also, although not a representative epidemiological cross-section, this study suggests that sublineages IIIa and IIIb are epidemically expanding across Latin America driven by AMR. This is supported by the temporal incidence of IIIa and IIIb isolates being weighted toward latter years of the study and the low phylogenetic diversity in sublineages IIIa and IIIb compared with sublineages IIa, IIb and Lineage V, likely resulting from rapid clonal expansion. Furthermore, the presence of a dominant AMR profile in each lineage is associated with combinations of specific AMR determinants already associated with globally-epidemic S. sonnei. Specifically, sublineage IIIb falls within the Global III sublineage and contains the Int2/Tn7 resistance determinant and pSpA resistance plasmid associated with the global expansion of Global III [16]. Similarly, sublineage IIIa contains the chromosomally-integrated SRL and pABC-3, two of the major resistance determinants reported in a recent sublineage of S. flexneri 3a which transmitted globally among men-who-have-sex-with-men, from a possible Latin American origin [31]. Notably, pABC-3 has been reported in epidemic S. sonnei in Chile [30] but the SRL, known as an important AMR determinant in S. flexneri and S. dysenteriae [32, 33], has not been commonly reported in S. sonnei, so represents a worrying addition to the armaments of this pathogen.

Despite the insight gained, there were limitations to this study. It was retrospective in nature and we selected intentionally-diverse isolates without a case definition or representation relative to the disease burden of each country. Thus, the final overview only approximates proportional representation of the phylogenetic variation and AMR profiles of S. sonnei in Latin America and should be used primarily as a contextual tool for future surveillance. Additionally, using WGS for Shigella surveillance relies on isolate culture, which is diagnostically less-sensitive than alternative molecular techniques (e.g. qPCR) [34]. However, as evidenced here, isolate cultures have a value-added role for epidemiological, AMR and evolutionary surveillance given the increased insight that can be gained through WGS.

As we exploit novel subtyping techniques for understanding the spread of pathogens, contextual databases need to be rebuilt through regional cooperation and investment in appropriate technologies, facilities and training. This study demonstrates that this is possible within existing infrastructure and surveillance networks, such as PNLA&C. Through collaborative efforts we created a context for S. sonnei across LA, showing international transmission and epidemiologically-expanding sublineages within this region. We have also shown how this information can be used to place recent outbreaks and increasing levels of AMR in a national, regional and global context. This information is essential if we are to maintain current surveillance activities and halt the increase and spread of AMR in important bacterial pathogens such as S. sonnei.

Acknowledgements

This work was funded by the Wellcome Trust grant number 098051. AEM is supported by a BBSRC grant BB/M014088/1. KSB is a Wellcome Trust Clinical Research Fellow (106690/A/14/Z).

Figure legends

Figure 1. Distribution of 323 Latin American S. sonnei isolates sequenced as part of this study. By year and country (upper), sublineage designation and year (middle), and sublineage and country (lower).

Figure 2. Genomic portrait of Latin American S. sonnei in context. Maximum likelihood phylogenetic tree of S. sonnei showing 323 Latin American isolates. Latin American sublineages are shown in black and labelled with adjacent arcs, topology unlinked with Latin American isolates is shown in grey, and lineages are labelled on the inside of the tree. The country of origin of Latin American isolates is shown on the internal track at the tips of the tree, coloured according to the map. The presence of predicted antimicrobial resistance is shown in outer tracks according to the inlaid key. Global reference isolates in the tree do not have country or resistance labels.

Figure 3. Phylogeography of Latin American S. sonnei. Midpoint rooted phylogenetic tree with taxa coloured by LA lineage or sublineage (reference isolates shown in grey). The distribution of each lineage or sublineage across Latin America is shown in the maps below, and for all sublineages and lineage in Costa Rica.

Figure 4. Argentine S. sonnei outbreaks in the context of the Latin American regional overview. The midpoint rooted phylogenetic tree shows the phylogenetic positions of isolates from outbreaks in 2010, 2011 and 2016 using data generated locally in Argentina. The tree is labelled similarly to Figure 2, with additional outbreak isolates being coloured by year (according to the inlaid key) in the outermost track exterior to the country labels (coloured according to the map).

Table S1. Origin details, data accessions and results for S. sonnei in this study

Figure S1. Phylogenetic distribution of S. sonnei isolates from South Africa. A subset of reference isolates from Holt et al 2012 and this study (for Lineage V) were used to generate the phylogenetic framework shown in black. South African isolates are shown in red.

Figure S2. Relationship of LA S. sonnei lineage and sublineages to previously designated sublineages. Previously named sublineages are highlighted in coloured, labelled blocks (according to [16]), with the Global III MDR sublineage coloured in blue and labelled at the defining internal node. The new Latin American lineage and sublineages are similarly indicated (red labels).

Figure S3. Antimicrobial resistance genes in Latin American S. sonnei. The midpoint rooted phylogenetic tree shows the phylogenetic relationships of 439 S. sonnei used in this study, with taxa names coloured by lineage. Adjacent tracks show features of antimicrobial resistance determinants for the Latin American S. sonnei sequenced as part of this study. The first track shows amino acid substitutions in gyrA (WT = wildtype), where black is shown for global reference isolates. Subsequent tracks show the presence of major antimicrobial resistance genes (present in >5% of isolates), coloured by the length of the contiguous sequence on which the gene is found, according to the heat map in the inlaid key. Black is shown for Latin American isolates where the gene is absent. Blocks of major antimicrobial resistance determinants for each lineage are highlighted by dashed white boxes and labelled over the colour columns.

Figure S4. Predicted trends in antimicrobial resistance phenotypes over time. The histogram shows the proportion of Latin American isolates predicted to have resistant phenotypes by antimicrobial class. The isolates are further resolved by their year of collection according to the inlaid key, with the total number of isolates in that time bracket shown adjacent.

Figure S5. Phylogenetic relationships of international S. sonnei across Latin America, including phylogenetic support. The sublineages of Latin American S. sonnei are shown with taxa labelled and coloured by country. Tree branches are coloured by phylogenetic support according to the inlaid key.

Table 1. Shigella sonnei isolates included in this study

| Study | Country | Years | Number |
|---|---|---|---|
| Latin America | Argentina | 2002 - 2011 | 50 |
| Latin America | Chile | 2010 - 2011 | 27 |
| Latin America | Colombia | 2008 - 2011 | 31 |
| Latin America | Costa Rica | 2002 - 2010 | 50 |
| Latin America | Guatemala | 2011 - 2012 | 30 |
| Latin America | Paraguay | 2008 - 2012 | 18 |
| Latin America | Perú | 1999 - 2012 | 48 |
| Latin America | Uruguay | 2000 - 2011 | 28 |
| Latin America | Venezuela | 1997 - 2014 | 41 |
| Global | Many | 1943 - 2008 | 116 |
| Total | | NA | 439 |

Table 2. Genomic and epidemiological features of Latin American S. sonnei

| Feature | Lineage V | Sublineage LAIIa | Sublineage LAIIb | Sublineage LAIIIa | Sublineage LAIIIb |
|---|---|---|---|---|---|
| Isolate features: Number | 26 (8%) | 66 (20%) | 57 (18%) | 89 (28%) | 85 (26%) |
| Isolate features: Years | 1997 - 2009 | 1999 - 2012 | 1997 - 2014 | 1999 - 2012 | 2002 - 2012 |
| Isolate features: Countries | 4 | 5 | 6 | 6 | 5 |
| Pairwise distances (SNPs): Average | 134 | 176 | 161 | 95 | 105 |
| Pairwise distances (SNPs): Median | 160 | 208 | 210 | 95 | 118 |
| Pairwise distances (SNPs): Largest distance | 221 | 295 | 309 | 292 | 172 |
| Previous sublineage name^ | NA | Unnamed | Unnamed | South America (III) | Africa/South America, within Global III |

^ according to [16]
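The counts in Tables 1 and 2 can be cross-checked against each other. The sketch below recomputes the Latin American total, the overall total, and each lineage's share of the 323 Latin American isolates. The counts are copied from the tables; the variable and function names are illustrative only, and rounding to the nearest whole percent is an assumption about how the published percentages were derived.

```python
# Isolate counts per country, copied from Table 1 (Latin American rows only).
country_counts = {
    "Argentina": 50, "Chile": 27, "Colombia": 31, "Costa Rica": 50,
    "Guatemala": 30, "Paraguay": 18, "Perú": 48, "Uruguay": 28,
    "Venezuela": 41,
}
# Number of global reference isolates (the "Many" row of Table 1).
GLOBAL_REFERENCE_COUNT = 116

# Isolates per lineage or sublineage, copied from Table 2.
lineage_counts = {"V": 26, "LAIIa": 66, "LAIIb": 57, "LAIIIa": 89, "LAIIIb": 85}

latin_american_total = sum(country_counts.values())
study_total = latin_american_total + GLOBAL_REFERENCE_COUNT


def percent_share(count, total):
    """Share of `total` represented by `count`, rounded to a whole percent."""
    return round(100 * count / total)


# Assumed rounding rule: nearest whole percent of the Latin American total.
lineage_percentages = {
    name: percent_share(count, latin_american_total)
    for name, count in lineage_counts.items()
}

if __name__ == "__main__":
    print(latin_american_total, study_total)
    print(lineage_percentages)
```

Under these assumptions the lineage counts sum to the same Latin American total as the country counts, and the recomputed shares agree with the percentages printed in Table 2.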

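Table 3. Predicted resistance phenotypes and major resistance determinants

| Antimicrobial class | Lineage V (n=26) | Sublineage LAIIa (n=66) | Sublineage LAIIb (n=57) | Sublineage LAIIIa (n=89) | Sublineage LAIIIb (n=85) | All (n=323) |
|---|---|---|---|---|---|---|
| Predicted resistant (percentage): Aminoglycoside | 81 | 58 | 70 | 90 | 100 | 82 |
| Predicted resistant (percentage): Beta-lactam | 42 | 59 | 35 | 90 | 7 | 48 |
| Predicted resistant (percentage): Phenicol | 0 | 2 | 0 | 89 | 0 | 25 |
| Predicted resistant (percentage): Trimethoprim | 85 | 61 | 68 | 81 | 100 | 80 |
| Predicted resistant (percentage): Sulphonamide | 46 | 62 | 30 | 87 | 83 | 67 |
| Predicted resistant (percentage): Tetracycline | 73 | 30 | 25 | 91 | 90 | 65 |
| Predicted resistant (percentage): Macrolide | 0 | 0 | 0 | 1 | 0 | 0 |
| Predicted resistant (percentage): Quinolone | 0 | 0 | 0 | 0 | 9 | 3 |
| Predicted resistant (percentage): ESBL | 4 | 0 | 0 | 0 | 0 | 0 |
| Multidrug resistance: MDR (percentage) | 81 | 62 | 51 | 91 | 91 | 77 |
| Multidrug resistance: AMR phenotypes per isolate (average) | 3·3 | 2·7 | 2·3 | 5·3 | 3·9 | 3·7 |

Multidrug resistance: Unique resistance profiles (number)
9 11 11 6 30 13

Previously described major AMR determinants
Aminoglycoside Beta-lactam Phenicol Trimethoprim Sulphonamide Tetracycline Macrolide Quinolone
Int2/Tn7
SRL
Int2/Tn7 and SpA
Int2/Tn7 -
SRL SRL SRL pABC-3
SpA Int2/Tn7
pABC-3 gyrA SNPs, qnrS genes
SpA -
Int2/Tn7 with bla Int2/Tn7 with bla Int2/Tn7 -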
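The "AMR phenotypes per isolate (average)" row of Table 3 can be related to the class-level percentages: if each isolate is scored once per antimicrobial class, the mean number of resistant phenotypes per isolate is the sum of the per-class proportions. The sketch below applies this to the Lineage V and All columns. The percentages are copied from Table 3; the function name is illustrative, and treating ESBL as a separately counted phenotype is an assumption rather than something the table states.

```python
# Predicted resistance percentages copied from Table 3, in the row order:
# aminoglycoside, beta-lactam, phenicol, trimethoprim, sulphonamide,
# tetracycline, macrolide, quinolone, ESBL.
# Assumption: ESBL is counted as its own phenotype alongside the eight classes.
predicted_resistant_pct = {
    "V": [81, 42, 0, 85, 46, 73, 0, 0, 4],
    "All": [82, 48, 25, 80, 67, 65, 0, 3, 0],
}


def mean_phenotypes_per_isolate(percentages):
    """Sum the per-class proportions to get the expected phenotypes per isolate."""
    return round(sum(percentages) / 100, 1)


if __name__ == "__main__":
    for column, percentages in predicted_resistant_pct.items():
        print(column, mean_phenotypes_per_isolate(percentages))
```

Under these assumptions the sums reproduce the averages reported in the table for these two columns (3·3 for Lineage V and 3·7 for all isolates).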

Conflict of Interest statement

AEM reports grants from the Biotechnology and Biological Sciences Research Council during the conduct of the study. KSB reports grants from the Wellcome Trust during the conduct of the study. All authors declare no conflict of interest.

References

[1] M.D. Kirk, S.M. Pires, R.E. Black, M. Caipo, J.A. Crump, B. Devleesschauwer, D. Dopfer, A. Fazil, C.L. Fischer-Walker, T. Hald, A.J. Hall, K.H. Keddy, R.J. Lake, C.F. Lanata, P.R. Torgerson, A.H. Havelaar, F.J. Angulo, PLoS medicine 12(12) (2015) e1001921.
[2] World Health Organization Foodborne Disease Burden Epidemiology Reference Group, WHO estimates of the global burden of foodborne diseases: foodborne disease burden epidemiology reference group 2007-2015, World Health Organization, Switzerland, 2015.
[3] J.A. Platts-Mills, S. Babji, L. Bodhidatta, J. Gratz, R. Haque, A. Havt, B.J. McCormick, M. McGrath, M.P. Olortegui, A. Samie, S. Shakoor, D. Mondal, I.F. Lima, D. Hariraju, B.B. Rayamajhi, S. Qureshi, F. Kabir, P.P. Yori, B. Mufamadi, C. Amour, J.D. Carreon, S.A. Richard, D. Lang, P. Bessong, E. Mduma, T. Ahmed, A.A. Lima, C.J. Mason, A.K. Zaidi, Z.A. Bhutta, M. Kosek, R.L. Guerrant, M. Gottlieb, M. Miller, G. Kang, E.R. Houpt, MAL-ED Network Investigators, Lancet Glob Health 3(9) (2015) e564-75.
[4] R. Achi, L. Mata, X. Siles, A.A. Lindberg, The Journal of infection 32(3) (1996) 211-8.
[5] M.A. Sousa, E.N. Mendes, G.B. Collares, L.A. Peret-Filho, F.J. Penna, P.P. Magalhaes, Mem Inst Oswaldo Cruz 108(1) (2013) 30-5.
[6] D.L. Woodward, F.G. Rodgers, Can J Infect Dis 11(4) (2000) 181-6.
[7] N. Fulla, V. Prado, C. Duran, R. Lagos, M.M. Levine, The American journal of tropical medicine and hygiene 72(6) (2005) 851-4.
[8] I. Chinen, M. Galas, E. Tuduri, M.R. Vinas, C. Carbonari, A. Della Gaspera, D. Napoli, D.M. Aanensen, S. Argimon, N.R. Thomson, D. Hughes, S. Baker, C. Guzman-Verri, M.T. Holden, A.M. Abdala, L.P. Alvarez, B. Alvez, R. Barros, S. Budall, C. Campano, L.S. Chamosa, P. Cheddie, D. Cisterna, D. De Belder, M. Dropa, D. Durand, A. Elena, G. Fontecha, C. Huber, A.P. Lemos, L. Melli, R.E. Paul, L. Suarez, J. Torres Flores, J. Campos, bioRxiv 10.1101/049940 (2016).
[9] C.N. Thompson, P.T. Duy, S. Baker, PLoS neglected tropical diseases 9(6) (2015) e0003708.
[10] H. Bolaños, A. Tijerino, G. Oropeza, J. Morales, E. Campos, N.L.N.o. Bacteriology, Incremento significativo de la shigelosis en Costa Rica durante el primer cuatrimestre del 2014 (Significant increase in shigellosis in Costa Rica during the first quarter of 2014), Laboratory based surveillance report, INCIENSA, Tres Ríos, Costa Rica, 2014.
[11] K.S. Baker, T.J. Dallman, A. Behar, F.X. Weill, M. Gouali, J. Sobel, M. Fookes, L. Valinsky, O. Gal-Mor, T.R. Connor, I. Nissan, S. Bertrand, J. Parkhill, C. Jenkins, D. Cohen, N.R. Thomson, Emerging infectious diseases 22(9) (2016) 1545-53.
[12] K.E. Holt, T.V. Thieu Nga, D.P. Thanh, H. Vinh, D.W. Kim, M.P. Vu Tra, J.I. Campbell, N.V. Hoang, N.T. Vinh, P.V. Minh, C.T. Thuy, T.T. Nga, C. Thompson, T.T. Dung, N.T. Nhu, P.V. Vinh, P.T. Tuyet, H.L. Phuc, N.T. Lien, B.D. Phu, N.T. Ai, N.M. Tien, N. Dong, C.M. Parry, T.T. Hien, J.J. Farrar, J. Parkhill, G. Dougan, N.R. Thomson, S. Baker, Proceedings of the National Academy of Sciences of the United States of America 110(43) (2013) 17522-7.
[13] B. Swaminathan, P. Gerner-Smidt, L.K. Ng, S. Lukinmaa, K.M. Kam, S. Rolando, E.P. Gutierrez, N. Binsztein, Foodborne pathogens and disease 3(1) (2006) 36-50.
[14] J. Campos, M. Pichel, T.M.I. Vaz, A.T. Tavechio, S.A. Fernandes, N. Munoz, C. Rodriguez, M.E. Realpe, J. Moreno, P. Araya, J. Fernandez, A. Fernandez, E. Campos, F. Duarte, N.W. Gustafson, N. Binsztein, E.P. Gutierrez, Food Res Int 45(2) (2012) 1030-1036.
[15] X. Deng, H.C. den Bakker, R.S. Hendriksen, Annual review of food science and technology 7 (2016) 353-74.
[16] K.E. Holt, S. Baker, F.X. Weill, E.C. Holmes, A. Kitchen, J. Yu, V. Sangal, D.J. Brown, J.E. Coia, D.W. Kim, S.Y. Choi, S.H. Kim, W.D. da Silveira, D.J. Pickard, J.J. Farrar, J. Parkhill, G. Dougan, N.R. Thomson, Nature genetics 44(9) (2012) 1056-9.
[17] H. Chung The, M.A. Rabaa, D. Pham Thanh, N. De Lappe, M. Cormican, M. Valcanis, B.P. Howden, S. Wangchuk, L. Bodhidatta, C.J. Mason, T. Nguyen Thi Nguyen, D. Vu Thuy, C.N. Thompson, N. Phu Huong Lan, P. Voong Vinh, T. Ha Thanh, P. Turner, P. Sar, G. Thwaites, N.R. Thomson, K.E. Holt, S. Baker, PLoS medicine 13(8) (2016) e1002055.
[18] T.J. Dallman, M.A. Chattaway, P. Mook, G. Godbole, P.D. Crook, C. Jenkins, Journal of medical microbiology 65(8) (2016) 882-4.
[19] E. Zankari, H. Hasman, R.S. Kaas, A.M. Seyfarth, Y. Agerso, O. Lund, M.V. Larsen, F.M. Aarestrup, The Journal of antimicrobial chemotherapy 68(4) (2013) 771-7.
[20] S. Zhao, G.H. Tyson, Y. Chen, C. Li, S. Mukherjee, S. Young, C. Lam, J.P. Folster, J.M. Whichard, P.F. McDermott, Applied and environmental microbiology 82(2) (2015) 459-66.
[21] G.H. Tyson, P.F. McDermott, C. Li, Y. Chen, D.A. Tadesse, S. Mukherjee, S. Bodeis-Jones, C. Kabera, S.A. Gaines, G.H. Loneragan, T.S. Edrington, M. Torrence, D.M. Harhay, S. Zhao, The Journal of antimicrobial chemotherapy 70(10) (2015) 2763-9.
[22] M.A. Quail, T.D. Otto, Y. Gu, S.R. Harris, T.F. Skelly, J.A. McQuillan, H.P. Swerdlow, S.O. Oyola, Nature methods 9(1) (2012) 10-1.
[23] A.J. Page, N. De Silva, M. Hunt, M.A. Quail, J. Parkhill, S.R. Harris, T.D. Otto, J.A. Keane, Microbial Genomics 2(8) (2016).
[24] N.J. Croucher, A.J. Page, T.R. Connor, A.J. Delaney, J.A. Keane, S.D. Bentley, J. Parkhill, S.R. Harris, Nucleic acids research 43(3) (2015) e15.
[25] A. Stamatakis, Bioinformatics 22(21) (2006) 2688-90.
[26] E. Zankari, H. Hasman, S. Cosentino, M. Vestergaard, S. Rasmussen, O. Lund, F.M. Aarestrup, M.V. Larsen, The Journal of antimicrobial chemotherapy 67(11) (2012) 2640-4.
[27] I.J. Azmi, B.K. Khajanchi, F. Akter, T.N. Hasan, M. Shahnaij, M. Akter, A. Banik, H. Sultana, M.A. Hossain, M.K. Ahmed, S.M. Faruque, K.A. Talukder, PloS one 9(7) (2014) e102533.
[28] S. Argimón, K. Abudahab, R.J.E. Goater, A. Fedosejev, J. Bhai, C. Glasner, E.J. Feil, M.T.G. Holden, C.A. Yeats, H. Grundmann, B.G. Spratt, D.M. Aanensen, Microbial Genomics 2(11) (2016).
[29] I. Letunic, P. Bork, Nucleic acids research 44(W1) (2016) W242-W245.

[30] A. Miranda, B. Avila, P. Diaz, L. Rivas, K. Bravo, J. Astudillo, C. Bueno, M.T. Ulloa, G. Hermosilla, F. Del Canto, J.C. Salazar, C.S. Toro, Front Cell Infect Microbiol 6 (2016) 77.
[31] K.S. Baker, T.J. Dallman, P.M. Ashton, M. Day, G. Hughes, P.D. Crook, V.L. Gilbart, S. Zittermann, V.G. Allen, B.P. Howden, T. Tomita, M. Valcanis, S.R. Harris, T.R. Connor, V. Sintchenko, P. Howard, J.D. Brown, N.K. Petty, M. Gouali, D.P. Thanh, K.H. Keddy, A.M. Smith, K.A. Talukder, S.M. Faruque, J. Parkhill, S. Baker, F.X. Weill, C. Jenkins, N.R. Thomson, The Lancet infectious diseases (2015).
[32] E. Njamkepo, N. Fawal, A. Tran-Dien, J. Hawkey, N. Strockbine, C. Jenkins, K.A. Talukder, R. Bercion, K. Kuleshov, R. Kolinska, J.E. Russell, L. Kaftyreva, M. Accou-Demartin, A. Karas, O. Vandenberg, A.E. Mather, C.J. Mason, A.J. Page, T. Ramamurthy, C. Bizet, A. Gamian, I. Carle, A.G. Sow, C. Bouchier, A.L. Wester, M. Lejay-Collin, M.C. Fonkoua, S.L. Hello, M.J. Blaser, C. Jernberg, C. Ruckly, A. Merens, A.L. Page, M. Aslett, P. Roggentin, A. Fruth, E. Denamur, M. Venkatesan, H. Bercovier, L. Bodhidatta, C.S. Chiou, D. Clermont, B. Colonna, S. Egorova, G.P. Pazhani, A.V. Ezernitchi, G. Guigon, S.R. Harris, H. Izumiya, A. Korzeniowska-Kowal, A. Lutynska, M. Gouali, F. Grimont, C. Langendorf, M. Marejkova, L.A. Peterson, G. Perez-Perez, A. Ngandjio, A. Podkolzin, E. Souche, M. Makarova, G.A. Shipulin, C. Ye, H. Zemlickova, M. Herpay, P.A. Grimont, J. Parkhill, P. Sansonetti, K.E. Holt, S. Brisse, N.R. Thomson, F.X. Weill, Nat Microbiol 1 (2016) 16027.
[33] T.R. Connor, C.R. Barker, K.S. Baker, F.X. Weill, K.A. Talukder, A.M. Smith, S. Baker, M. Gouali, D. Pham Thanh, I. Jahan Azmi, W. Dias da Silveira, T. Semmler, L.H. Wieler, C. Jenkins, A. Cravioto, S.M. Faruque, J. Parkhill, D. Wook Kim, K.H. Keddy, N.R. Thomson, Elife 4 (2015) e07335.
[34] J. Liu, J.A. Platts-Mills, J. Juma, F. Kabir, J. Nkeze, C. Okoi, D.J. Operario, J. Uddin, S. Ahmed, P.L. Alonso, M. Antonio, S.M. Becker, W.C. Blackwelder, R.F. Breiman, A.S. Faruque, B. Fields, J. Gratz, R. Haque, A. Hossain, M.J. Hossain, S. Jarju, F. Qamar, N.T. Iqbal, B. Kwambana, I. Mandomando, T.L. McMurry, C. Ochieng, J.B. Ochieng, M. Ochieng, C. Onyango, S. Panchalingam, A. Kalam, F. Aziz, S. Qureshi, T. Ramamurthy, J.H. Roberts, D. Saha, S.O. Sow, S.E. Stroup, D. Sur, B. Tamboura, M. Taniuchi, S.M. Tennant, D. Toema, Y. Wu, A. Zaidi, J.P. Nataro, K.L. Kotloff, M.M. Levine, E.R. Houpt, Lancet 388(10051) (2016) 1291-301.
