Journal Pre-proof
Performance of host-associated genetic markers for microbial source tracking in China
Yang Zhang, Renren Wu, Kairong Lin, Yishu Wang, Junqing Lu
PII: S0043-1354(20)30206-2
DOI: https://doi.org/10.1016/j.watres.2020.115670
Reference: WR 115670
To appear in: Water Research
Received Date: 29 July 2019
Revised Date: 25 February 2020
Accepted Date: 26 February 2020
Please cite this article as: Zhang, Y., Wu, R., Lin, K., Wang, Y., Lu, J., Performance of host-associated genetic markers for microbial source tracking in China, Water Research (2020), doi: https://doi.org/10.1016/j.watres.2020.115670. This is a PDF file of an article that has undergone enhancements after acceptance, such as the addition of a cover page and metadata, and formatting for readability, but it is not yet the definitive version of record. This version will undergo additional copyediting, typesetting and review before it is published in its final form, but we are providing this version to give early visibility of the article. Please note that, during the production process, errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain. © 2020 Published by Elsevier Ltd.
Performance of host-associated genetic markers for microbial source tracking in China

Yang Zhang (a), Renren Wu (b,c,*), Kairong Lin (a,*), Yishu Wang (b,c), Junqing Lu (b,c)

(a) Department of Water Resources and Environment, Sun Yat-sen University, Guangzhou 510275, PR China;
(b) The Key Laboratory of Water and Air Pollution Control of Guangdong Province, South China Institute of Environmental Sciences, Ministry of Ecology and Environment of the People’s Republic of China, Guangzhou 510000, PR China;
(c) State Environmental Protection Key Laboratory of Water Environmental Simulation and Pollution Control, South China Institute of Environmental Sciences, Ministry of Ecology and Environment of the People’s Republic of China, Guangzhou 510530, PR China
Running title: Performance of host-associated microbial source tracking markers in China

Corresponding Author: Renren Wu; Kairong Lin
Address: Ruihe Road 18, Huangpu District, Guangzhou 510000, P. R. China; West Xingang Road 135, Guangzhou 510275, P. R. China.
Email: [email protected]; [email protected]

Renren Wu and Kairong Lin contributed equally to this study.
Abstract: Numerous genetic markers have been developed to establish microbial source tracking (MST) assays in the last decade. However, the selection of suitable markers is challenging due to a lack of understanding of fundamental factors such as sensitivity, specificity, and concentration in target/nontarget hosts, especially in East Asia. In this study, a total of 506 faecal samples from human and 12 nonhuman hosts were collected from 28 cities across China and tested for marker performance characteristics. We first tested 40 host-associated markers based on a binary (presence/absence) criterion. Here, 15 markers (7 human-associated, 4 pig-associated, 3 ruminant-associated, and 1 poultry-associated) showed potential applicability in our study area. The selected 15 markers were then tested using qualitative and quantitative methods to characterise their performance. Overall, Bacteroidales markers presented higher sensitivity and concentrations in target samples compared to other bacterial or viral markers, but their specificity was low. Among nontarget samples, pets were the main source of cross-reactivity with human-associated and poultry-associated markers (43.7% and 35.7% of pet samples tested positive, respectively). Non-common animals, including horse and donkey, showed 61.3% cross-reactivity with ruminant-associated markers. When considering the quantitative distribution of markers, their concentrations in nontarget samples were 1–3 orders of magnitude lower than in target samples. Moreover, a novel classification method was proposed to classify the nontarget hosts into four groups: “no cross-reactivity”, “weak cross-reactivity”, “moderate cross-reactivity”, and “strong cross-reactivity” animal hosts. In total, 77.9% of nontarget samples were classified as no cross-reactivity or weak cross-reactivity hosts, suggesting that these nontarget hosts produce little interference for the corresponding markers. Our findings elucidate the performance of host-associated markers across China in a qualitative and quantitative manner, and reveal the degree of interference from nontarget animal cross-reactivity with genetic markers, which will facilitate tracking of multiple faecal pollution sources and planning timely remedial strategies in China.

Keywords: faecal pollution; microbial source tracking; genetic marker; quantitative PCR; China
1. Introduction
Microbial source tracking (MST) is a tool used to discriminate faecal pollution from different source hosts. This method has an advantage over the traditional faecal indicator bacteria (FIB) approach, which generally cannot determine the source of faecal pollution because FIB are widely present in the faeces of most warm-blooded animals (Reischer et al., 2013; Mayer et al., 2018). Another limitation of monitoring FIB is that these bacteria can reproduce in aquatic environments alongside aquatic bacteria (Zhang et al., 2018), which may confound pollution assessment. Therefore, library-independent MST methods, which rely on the measurement of genetic markers targeting certain host-associated gut microorganisms, have become increasingly prominent (Reischer et al., 2013; Harwood et al., 2014; Feng et al., 2019). The presence of these genetic markers in a watershed indicates faecal pollution from specific hosts in environmental waters (Ahmed et al., 2019). Furthermore, most genetic markers were developed from obligate anaerobes and therefore degrade rapidly outside the host intestine (Bonjoch et al., 2005).
Specific genetic markers, such as host-associated Bacteroidales 16S rRNA gene markers, have been developed to discriminate the sources of faecal pollution from humans (Ahmed et al., 2010a), pigs (Mieszkin et al., 2009), ruminants (Bernhard et al., 2000a), poultry (Green et al., 2012), common pets (Kildare et al., 2007), seagulls (Lu et al., 2008), and other animals. The performance of these markers is a key determinant in accurately identifying the source of faecal contamination. Moreover, the suitability and accuracy of these genetic markers are susceptible to regional variability. For example, HF183 is known to be highly sensitive and specific to human-sourced faecal pollution in Belgium (Seurinck et al., 2005) and the USA (Boehm et al., 2013), but performed poorly in Singapore and India (Nshimyimana et al., 2017; Odagiri et al., 2015). The Bac305 marker also exhibits high specificity to ruminant faecal pollution in one particular region, but not in others (Bernhard et al., 2000a; Malla et al., 2018). Thus, the search for high-performance host-associated marker genes in new and varied geographical regions has become a major focus of MST research in recent years. The widespread applicability of genetic markers has also been limited by the often narrow regional scope of many seminal studies that evaluated the performance of faecal markers. To our knowledge, only one study has reported the efficacy of gene markers (two human-associated, two ruminant-associated, and one bovine-associated) beyond a regional context, collecting faecal samples from sixteen countries across six continents (Reischer et al., 2013). This lack of data on genetic marker performance in a broad geographical context makes it difficult to select MST markers for validation in new areas with certainty.
The performance of markers in different locations relies on repeated testing of reference faecal samples (Ahmed et al., 2009; Bernhard et al., 2000b; Shanks et al., 2010a). Sensitivity and specificity, the most typical evaluation endpoints, are usually determined by testing the presence/absence of specific markers in host samples. However, this approach fails to quantify the abundance of said markers in individual sources. The variation of marker abundance in different host species has significant implications for accurate and in-depth assessment of marker performance; thus, characterising these variations is of great importance (Reischer et al., 2013).
To investigate the characteristics of a range of previously reported MST markers in hosts beyond a regional context, China was selected as our study area. China is the second-largest country in Asia, spanning five temperature zones from cold to tropical climates. By 2018, more than 1.39 billion people inhabited China. Moreover, Chinese animal husbandry developed rapidly throughout the economic “great leap forward” period. Nonetheless, few studies have validated MST assays across China. In the present study, more than 500 faecal samples were collected from 28 cities across different regions in China. In terms of sample composition, faecal pollution from humans, common livestock and poultry is of greatest concern. Moreover, the markers selected for our study included but were not limited to the widely acknowledged Bacteroidales genetic markers. The objectives of this study were to (i) characterise the performance of host-associated markers based on qualitative and quantitative analysis in a broad geographical area, (ii) determine cross-reactivity between hosts and the resulting level of false-positive signals for each marker, and (iii) provide a novel classification method to quantitatively assess the degree of cross-reactivity between hosts.
2. Materials and methods
2.1 Faecal sample collection
From 2018 to 2019, a total of 506 faecal samples were collected from human volunteers and nonhuman hosts in 28 cities across China. The sampling cities are shown in Figure 1. Most of the sample sites were distributed across seven major river systems in China, except for those in Lhasa City and Urumqi City, which were distributed across the continental river systems of Tibet and north-western China, respectively. Twenty-two sampling cities were selected in south-eastern China, following the Heihe–Tengchong geo-demographic demarcation line. The cities located east of said line account for the vast majority of the Chinese population and, consequently, contribute far more faecal pollution to nearby watersheds. Of the 506 faecal samples, 117 were of human source, 76 from pigs, 102 from ruminants (including cattle, sheep, and camels), 104 from poultry (including chickens, ducks, and geese), 70 from common pet animals (including dogs, cats, and rabbits), and 37 from uncommon animals (including horses and donkeys). The number of faecal samples collected from each city, along with their respective source species, is summarized in Table S1. Fresh faecal samples were collected from volunteers in hospitals and from families in highly urbanized areas. Although some human faecal samples were collected from hospitals, we asked the doctors for faecal samples from healthy people who visited the hospital for routine physical examination, rather than from patients. Most dog, cat and rabbit faecal samples were collected from the same urban households that donated human faecal samples; however, a few dog, cat and rabbit faeces were collected from rural households, and as these animals were also living with humans, they were considered pets in this study. The rest of the animal faecal material was collected from rural families, livestock farms, and zoos. To ensure each sample came from a known source, and to avoid contamination from other hosts, unified sampling guidelines were defined and sent to all research partners prior to sampling in each city. All faecal samples were collected using 60 mL sterile tubes. Collected samples were immediately placed in a sealed icebox, then transported to the laboratory, protected from sunlight. Upon arrival at the laboratory, the faecal samples were stored at -80 °C until DNA was extracted.

Fig. 1. Sampling cities in China. Red triangles represent the sampling cities. The white line is the Heihe–Tengchong geo-demographic demarcation line. The cities located east of the dividing line account for the vast majority of the Chinese population.
2.2 DNA Extraction
The TIANamp Stool DNA Kit (TIANGEN, Beijing, China) was used to extract genomic DNA from all faecal samples following the manufacturer's recommendations. Briefly, 0.25 g (wet weight) of each faecal sample was added into bead tubes with a lysis buffer. Then, the samples were vigorously homogenized using a TGrinder H24 Tissue Homogenizer (TIANGEN, Beijing, China). Afterwards, the Universal DNA Purification Kit (TIANGEN, Beijing, China) was employed to remove polymerase chain reaction (PCR) inhibitors and ensure DNA purity. The concentration and quality of genomic DNA were then measured using a NanoDrop ND 1000 UV spectrophotometer (MAESTROGEN, USA). DNA concentrations in the purified extracts were between 15 and 120 ng/µL. Following a previous study (Reischer et al., 2013), purified DNA extracts with concentrations > 30 ng/µL were diluted tenfold to ensure that all the purified DNA extract concentrations ranged from 3 to 30 ng/µL for downstream analyses, and the concentrations of most DNA templates were within 10 ng/µL (Fig. S1).
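The dilution rule described above can be illustrated with a minimal sketch. The threshold (30 ng/µL) and the tenfold factor are taken from the text; the function name and the input readings are hypothetical and serve only to show that extracts between 15 and 120 ng/µL end up within 3 to 30 ng/µL.

```python
# Illustrative dilution rule; threshold and factor follow Section 2.2,
# the input concentrations below are hypothetical readings.
DILUTION_THRESHOLD_NG_UL = 30.0
DILUTION_FACTOR = 10.0


def working_concentration(measured_ng_ul):
    """Return (working concentration in ng/uL, dilution factor applied)."""
    if measured_ng_ul > DILUTION_THRESHOLD_NG_UL:
        return measured_ng_ul / DILUTION_FACTOR, DILUTION_FACTOR
    return measured_ng_ul, 1.0


extracts = [15.0, 30.0, 45.0, 120.0]  # hypothetical spectrophotometer readings
templates = [working_concentration(c) for c in extracts]

for measured, (working, factor) in zip(extracts, templates):
    print(f"{measured:6.1f} ng/uL -> {working:5.1f} ng/uL (dilution x{factor:g})")
```

The dilution factor is returned alongside the working concentration because any marker concentration measured on a diluted template would need to be multiplied back by that factor.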
2.3 qPCR assays and preliminary experiments
All qPCR reactions were performed in triplicate on a Roche LightCycler® 480 II system (Roche Diagnostics Ltd., Rotkreuz, Switzerland). Because we used different commercial reaction components (e.g. polymerases) than those reported in the original publications, all assays were run according to the recommended reaction mixtures and procedure of the commercial kit used (TIANGEN, China). Probe-based host-associated gene marker assays were performed using 20 µL qPCR mixtures, containing 10 µL of 2× SuperReal probe PreMix (TIANGEN, China) and 2 µL of DNA template. The quantities of probe and primer added were determined by their intended final concentration in the mixtures. For SYBR-based marker assays, 20 µL qPCR mixtures were prepared, incorporating 10 µL of 2× Talent qPCR PreMix (TIANGEN, China), 2 µL of DNA template, and 10 µM of each primer. The protocol for probe-based qPCR assays was executed according to the SuperReal probe PreMix manufacturer's instructions (95 °C for 15 min, followed by 40 cycles of 95 °C for 3 s and 60 °C for 30 s). The SYBR-based qPCR protocol consisted of a step at 95 °C for 30 min, followed by 40 cycles of 95 °C for 5 s and 60 °C for 10 s. As described in previous studies, the general Bacteroidetes marker assay AllBac was performed to confirm the amplification of DNA templates and the absence of PCR inhibition (Reischer et al., 2013; Mayer et al., 2018). Details of the AllBac assay are shown in Table S2.
To identify suitable host-associated markers for an in-depth analysis, we adopted the validation method of a previous study to perform pre-screening experiments targeting 40 markers (Table S2) (Fan et al., 2017). These genetic markers (21 human-associated, 8 pig-associated, 7 ruminant-associated, and 4 poultry-associated) included but were not limited to the commonly used Bacteroidales genetic markers. The host-associated marker selection criteria in this study were (i) that they targeted human, pig, ruminant, and poultry hosts, and (ii) that they had a record of good performance, either in the region where they were developed or in other locations. The performance of the 40 markers is shown in Table S3. Markers with sensitivity and specificity greater than 50% were selected for further analysis. Among these 40 markers, only fifteen met this criterion, including 7 human-associated, 4 pig-associated, 3 ruminant-associated and 1 poultry-associated (Table 1). After the pre-screening experiments, we performed an in-depth assessment of these fifteen markers.
Quantitative analysis of these 15 markers was based on plasmid standard dilutions. Plasmid DNA for different hosts was prepared with the respective target PCR product and primers. The pGEM®-T Easy Vector (BGI, China) was used for the crAssphage marker; all other assays were performed with the pMD 19-T vector. Standard curves for all assays were generated using seven 10-fold serial dilutions of plasmid DNA (i.e. 10^0–10^6 gene copies; GC). The resulting qPCR efficiencies were between 90 and 110%. The limits of detection (LODs) for individual markers were calculated at 99% confidence intervals, as previously described (Nshimyimana et al., 2014). Every qPCR run incorporated DNA template triplicates and non-template controls (Table S4). To ensure reproducibility between different plates, two standards of diluted positive controls (plasmid DNA; 10^2 and 10^3 copies/µL) of each marker were tested in different plates, as described in a previous study (Nshimyimana et al., 2017). The average coefficient of variability (%CV) was 3.87±0.87% for the 10^3 copies/µL standard and 3.75±0.80% for the 10^2 copies/µL standard (Table S5).
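As an illustration of how a standard curve of this kind is typically evaluated, the sketch below fits Cq against log10 gene copies by ordinary least squares and derives the amplification efficiency from the slope using the standard relation E = 10^(-1/slope) - 1. The Cq values are hypothetical, and the function names are introduced here for illustration only; this is not the authors' analysis script.

```python
# Minimal standard-curve sketch: linear fit of Cq vs. log10(gene copies),
# amplification efficiency from the slope, and back-calculation of copies.
# All Cq values are hypothetical example data.


def fit_standard_curve(log10_copies, cq_values):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(log10_copies)
    mean_x = sum(log10_copies) / n
    mean_y = sum(cq_values) / n
    sxx = sum((x - mean_x) ** 2 for x in log10_copies)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(log10_copies, cq_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def amplification_efficiency(slope):
    """E = 10^(-1/slope) - 1; a slope near -3.32 gives E near 1.0 (100%)."""
    return 10 ** (-1.0 / slope) - 1.0


def copies_from_cq(cq, slope, intercept):
    """Invert the standard curve to estimate gene copies per reaction."""
    return 10 ** ((cq - intercept) / slope)


# Seven 10-fold dilutions, 10^0 to 10^6 gene copies (as in the text).
log10_copies = [0, 1, 2, 3, 4, 5, 6]
cq_values = [38.0 - 3.4 * x for x in log10_copies]  # hypothetical ideal curve

slope, intercept = fit_standard_curve(log10_copies, cq_values)
efficiency = amplification_efficiency(slope)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, efficiency={efficiency:.1%}")
print(f"Cq 27.8 corresponds to {copies_from_cq(27.8, slope, intercept):.0f} copies")
```

An efficiency between 0.90 and 1.10 from such a fit corresponds to the 90–110% acceptance range reported above.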
Table 1. Primer and probe information for selected qPCR assays in the second test phase

| host | qPCR assay | primer or probe | sequence 5’-3’ | target microorganism | reference |
|---|---|---|---|---|---|
| Human | BacH | BacH-f | CTTGGCCAGCCTTCTGAAAG | Bacteroides-Prevotella | (Reischer et al., 2010) |
| | | BacH-r | CCCCATCGTCTACCGAAAATAC | | |
| | | BacH-PC | FAM-TCATGATCCCATCCTG-NFQ–MGB | | |
| | BacHum | BacHum-160f | TGAGTTCACATGTCCGCATGA | Bacteroidales | (Kildare et al., 2007) |
| | | BacHum-241r | CGTTACCCCGCCTACTATCTAATG | | |
| | | BacHum-193p | FAM-TCCGGTAGACGATGGGGATGCGTT-NFQ | | |
| | SYBR-HF183 | HF183-f | ATCATGAGTTCACATGTCCG | Bacteroides dorei | (Ahmed et al., 2010b) |
| | | HF183-r | TACCCCGCCTACTATCTAATG | | |
| | Hum2 | Hum2-f | CGTCAGGTTTGTTTCGGTATTG | Hypothetical protein BF3236 | (Shanks et al., 2010a) |
| | | Hum2-r | TCATCACGTAACTTATTTATATGCATTAGC | | |
| | | HumM2P | (FAM)-TATCGAAAATCTCACGGATTAACTCTTGTGTACGC-(TAMRA) | | |
| | Hum163 | Hum163-f | CGTCAGGTTTGTTTCGGTATTG | Hypothetical protein BF3236 | (Shanks et al., 2010a) |
| | | Hum163-r | AAGGTGAAGGTCTGGCTGATGTAA | | |
| | CPQ_056 | 056F1 | CAGAAGTACAAACTCCTAAAAAACGTAGAG | crAssphage | (Stachler et al., 2014) |
| | | 056R1 | GATGACCAATAAACAAGCCATTAGC | | |
| | | 056P1 | (FAM)-AATAACGATTTACGTGATGTAAC-(MGB) | | |
| | CPQ_064 | 064F1 | TGTATAGATGCTGCTGCAACTGTACTC | crAssphage | (Stachler et al., 2014) |
| | | 064R1 | CGTTGTTTTCATCTTTATCTTGTCCAT | | |
| | | 064P1 | (FAM)-CTGAAATTGTTCATAAGCAA-(MGB) | | |
| Pig | Pig-1-Bac | Bac32-f | AACGCTAGCTACAGGCTTAAC | Pig-specific Bacteroidales | (Mieszkin et al., 2009) |
| | | Bac108r | CGGGCTATTCCTGACTATGGG | | |
| | | Bac44P | (FAM)ATCGAAGCTTGCTTTGATAGATGGCG(BHQ-1) | | |
| | Pig-2-Bac | Bac41-f | GCATGAATTTAGCTTGCTAAATTTGAT | Pig-specific Bacteroidales | (Mieszkin et al., 2009) |
| | | Bac163-r | ACCTCATACGGTATTAATCCGC | | |
| | L.amylovorus | L.amylovorus-f | TTCTGCCTTTTTGGGATCAA | Lactobacillus amylovorus | (He et al., 2016) |
| | | L.amylovorus-r | CCTTGTTTATTCAAGTGGGTGA | | |
| | P.ND5 | P.ND5-f | ACAGCTGCACTACAAGCAATGC | Mitochondrial DNA NADH 5 gene | (He et al., 2016) |
| | | P.ND5-r | GGATGTAGTCCGAATTGAGCTGATTAT | | |
| Ruminant | Rum-2-Bac | BacB2-590f | ACAGCCCGCGATTGATACTGGTAA | Ruminant-specific Bacteroidales | (Mieszkin et al., 2010) |
| | | Bac708Rm | CAATCGGAGTTCTTCGTGAT | | |
| | | BacB2626P | (FAM)ATGAGGTGGATGGAATTCGTGGTGT(BHQ-1) | | |
| | Bac708 | CF128-f | CCAACYTTCCCGWTACTC | Bacteroides-Prevotella | (Bernhard et al., 2000a) |
| | | Bac708-r | CAATCGGAGTTCTTCGTG | | |
| | BacCow | CF128-f | CCAACYTTCCCGWTACTC | Cow Bacteroidales | (Kildare et al., 2007) |
| | | 305r | GGACCGTGTCTCAGTTCCAGTG | | |
| Poultry | GFD | GFD-f | TCGGCTGAGCACTCTAGGG | Unclassified Helicobacter spp. | (Green et al., 2012) |
| | | GFD-r | GCGTCTCTTTGTACATCCCA | | |
2.4 Data analysis
Sensitivity (r) and specificity (s) were determined according to the following equations (Kildare et al., 2007; Odagiri et al., 2015):

$$ r = \frac{TP}{TP + FN} \qquad (1) $$

$$ s = \frac{TN}{TN + FP} \qquad (2) $$

TP represents positive results for target reference samples, and FN represents negative results for target reference samples. Conversely, TN indicates negative results for nontarget reference samples and FP represents positive results for nontarget reference samples. In the preliminary experiments, a reaction with a mean Cq < 31.0 was considered a positive result. For the 15 selected markers, the qualitative performance was strictly re-assessed based on the lower limit of detection (LOD) (Boehm et al., 2013; Layton et al., 2013). The concentrations of the 15 markers in target and nontarget samples were evaluated with standard curves.
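As a worked example of Eqs. (1) and (2), the sketch below computes sensitivity and specificity for the BacH marker from the per-host positive counts reported in Table 2 (115 positives among 117 human samples; 506 samples in total). The function and variable names are illustrative assumptions, not part of the original analysis.

```python
# Sensitivity and specificity for BacH, using the counts reported in Table 2.
# Function and variable names are illustrative.


def sensitivity(tp, fn):
    """r = TP / (TP + FN)"""
    return tp / (tp + fn)


def specificity(tn, fp):
    """s = TN / (TN + FP)"""
    return tn / (tn + fp)


TOTAL_SAMPLES = 506
TARGET_SAMPLES = 117          # human faecal samples
TARGET_POSITIVES = 115        # BacH-positive human samples

# BacH-positive samples per nontarget host (Table 2).
nontarget_positives = {
    "pig": 35, "cattle": 22, "sheep": 16, "camel": 0,
    "chicken": 21, "duck": 16, "goose": 13,
    "dog": 17, "cat": 14, "rabbit": 26,
    "horse": 0, "donkey": 11,
}

tp = TARGET_POSITIVES
fn = TARGET_SAMPLES - TARGET_POSITIVES
fp = sum(nontarget_positives.values())
tn = (TOTAL_SAMPLES - TARGET_SAMPLES) - fp

bach_sensitivity = sensitivity(tp, fn)
bach_specificity = specificity(tn, fp)
print(f"BacH sensitivity: {bach_sensitivity:.1%}")
print(f"BacH specificity: {bach_specificity:.1%}")
```

Rounded to whole percentages, these values agree with the 98% sensitivity and 51% specificity listed for BacH in Table 2.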
We employed a “25th/75th” metric to classify nontarget animal marker specificity and abundance into 4 groups. The 25th/75th metric was determined by subtracting the 75th percentile concentration in the nontarget hosts from the 25th percentile concentration in the target hosts for each marker (i.e. 25th/75th metric = 25th percentile(target) − 75th percentile(nontarget)) (Reischer et al., 2013). The four aforementioned groups were: (1) “no cross-reactivity” (NCR), the marker did not produce any positive signals in the nontarget animal; (2) “weak cross-reactivity” (WCR), the 25th/75th metric rendered a positive value; (3) “moderate cross-reactivity” (MCR), the 25th/75th metric rendered a negative value; and (4) “strong cross-reactivity” (SCR), the disparity between the mean concentrations of target and nontarget samples was below 1 order of magnitude. qPCR data were converted into a log10 format, and statistical significance was determined via the t-test or one-way ANOVA. All data analysis was performed using Microsoft Excel 2010, SPSS 22 and the R Statistical Computing Software.
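The four-group classification can be sketched as follows. This is a minimal illustration under stated assumptions: concentrations are log10 GC/g values of positive samples only; percentiles use linear interpolation; the SCR criterion is checked before the 25th/75th metric; and a metric of exactly zero is assigned to MCR. The text does not specify these details, and all input values are hypothetical.

```python
# Sketch of the NCR/WCR/MCR/SCR classification based on the 25th/75th metric.
# Assumptions (not stated in the text): linear-interpolation percentiles,
# SCR takes precedence over WCR/MCR, and a metric of exactly 0 counts as MCR.
import math


def percentile(values, q):
    """Percentile (0-100) with linear interpolation between order statistics."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo = math.floor(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def classify_cross_reactivity(target_log10, nontarget_log10):
    """Classify one nontarget host for one marker (inputs in log10 GC/g)."""
    if not nontarget_log10:
        return "NCR"  # no positive signal in the nontarget host
    mean_target = sum(target_log10) / len(target_log10)
    mean_nontarget = sum(nontarget_log10) / len(nontarget_log10)
    if mean_target - mean_nontarget < 1.0:
        return "SCR"  # means differ by less than one order of magnitude
    metric = percentile(target_log10, 25) - percentile(nontarget_log10, 75)
    return "WCR" if metric > 0 else "MCR"


target = [5.0, 6.0, 7.0, 8.0]  # hypothetical target-host concentrations
examples = {
    "no signal": [],
    "low signal": [2.0, 2.5, 3.0],
    "overlapping upper tail": [2.0, 3.0, 6.0, 6.5],
    "similar to target": [6.0, 6.5, 7.0],
}
labels = {name: classify_cross_reactivity(target, values)
          for name, values in examples.items()}
for name, label in labels.items():
    print(f"{name}: {label}")
```

Checking SCR first reflects the reading that a nontarget host whose mean concentration approaches that of the target is the most problematic case regardless of the percentile metric; other orderings are possible.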
3. Results
To understand the potential challenges of applying markers across a wide range of geographical regions, the subsequent analysis in this study focuses on the performance of the 15 promising pre-selected markers. Interestingly, qualitative analysis showed that the cow-specific Bacteroidales markers Bac708 and BacCow were not only detected in cattle samples but were also highly prevalent in sheep and camel samples. Therefore, Bac708 and BacCow should be more generally considered ruminant-associated markers rather than cow-specific markers.
3.1 Qualitative analysis
The sensitivities of all the markers tested ranged from 59% to 100% (Table 2). Among these, human-associated markers had the most variable sensitivity (59–98%), followed by pig-associated markers (68–100%). In contrast, ruminant-associated marker sensitivity was in the 96–100% range (Table 2). Among human-associated markers, the Bacteroidales markers BacH, BacHum, and SYBR-HF183 were the most prevalent genetic markers, exhibiting host sensitivity values of 98%, 82%, and 74%, respectively. Mitochondrial DNA markers (Hum2, Hum163) and crAssphage markers (CPQ_056, CPQ_064) exhibited relatively low sensitivity (59–67%). Moreover, the pig-associated Bacteroidales markers (Pig-1-Bac, Pig-2-Bac) and the mitochondrial marker (P.ND5) exhibited significantly higher sensitivity (95–100%) compared to the Lactobacillus amylovorus marker (68%). All ruminant-associated markers targeted Bacteroidales and showed the highest prevalence in target samples compared to other host-associated markers (≥ 96%). Unfortunately, only one poultry-associated marker (GFD) was selected from the preliminary screens for use in our study, and the sensitivity of this marker (68%) was low compared to the Bacteroidales markers.
The specificity of the evaluated markers ranged from 50 to 91%. No marker exhibited absolute host specificity, but most host-associated markers presented limited cross-reactivity to nontarget samples, except for Pig-1-Bac, Rum-2-Bac, and Bac708. Human-associated markers presented the highest number of false positives with pets (i.e. on average, 43.7% of pet samples tested positive across the seven human-associated markers). Similarly, pets also contributed many false-positive signals to the poultry-associated marker GFD (i.e. 35.7% of pet samples tested positive for the GFD marker). Ruminant-associated markers yielded the highest numbers of false positives with non-common animals (horse and donkey), with 61.3% of non-common animal samples testing positive for ruminant-associated markers.
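The cross-reactivity percentages quoted above appear consistent with pooling all pet-sample tests across the relevant markers in Table 2. The sketch below reproduces that calculation for the seven human-associated markers and for GFD; the pooling approach is our reading of the reported figures, not a procedure stated explicitly in the text, and the variable names are illustrative.

```python
# Pooled false-positive rate of pet samples, using counts from Table 2.
# Pooling over markers is an assumption made to reproduce the reported values.

# host: (number of samples, positives for BacH, BacHum, SYBR-HF183, Hum2,
#        Hum163, CPQ_056, CPQ_064)
pet_human_marker_positives = {
    "dog": (20, [17, 11, 10, 0, 8, 11, 0]),
    "cat": (17, [14, 7, 11, 0, 5, 0, 0]),
    "rabbit": (33, [26, 28, 20, 18, 11, 8, 9]),
}
pet_gfd_positives = {"dog": 11, "cat": 6, "rabbit": 8}

n_pets = sum(n for n, _ in pet_human_marker_positives.values())
n_markers = 7

positives = sum(sum(counts) for _, counts in pet_human_marker_positives.values())
human_marker_rate = 100.0 * positives / (n_pets * n_markers)
gfd_rate = 100.0 * sum(pet_gfd_positives.values()) / n_pets

print(f"Pets vs human-associated markers: {human_marker_rate:.1f}%")
print(f"Pets vs GFD: {gfd_rate:.1f}%")
```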
Among human-associated markers, Bacteroidales markers exhibited lower specificity compared to mitochondrial DNA markers and crAssphage markers. This trend was especially true for BacH and BacHum, which exhibited host specificity values of 51% and 55%, respectively. Among the pig-associated markers, the specificity values of Pig-2-Bac and P.ND5 were high (88% and 90%, respectively), but Pig-1-Bac showed relatively lower specificity (68%) compared to other pig-associated markers. Ruminant-associated markers all presented low specificity (<80%), especially Bac708, which exhibited a specificity of barely 50%. In contrast, GFD showed the highest specificity (91%) of all host-associated markers. Overall, there was an apparent trade-off between sensitivity and specificity in MST markers, whereby an improvement in one parameter usually translated to a decrease in the other.
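Table 2. Numbers of qPCR positives with the indicated primers in source species or source groups. Marker columns are grouped as human-associated (BacH to CPQ_064), pig-associated (Pig-1-Bac to P.ND5), ruminant-associated (Rum-2-Bac to BacCow), and poultry-associated (GFD).

| source group | source | no. samples | BacH | BacHum | SYBR-HF183 | Hum2 | Hum163 | CPQ_056 | CPQ_064 | Pig-1-Bac | Pig-2-Bac | L.amylovorus | P.ND5 | Rum-2-Bac | Bac708 | BacCow | GFD |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| human | human | 117 | 115 | 96 | 87 | 69 | 77 | 71 | 78 | 26 | 13 | 24 | 10 | 22 | 26 | 15 | 0 |
| pig | pig | 76 | 35 | 25 | 16 | 24 | 12 | 22 | 21 | 76 | 72 | 52 | 72 | 13 | 47 | 23 | 12 |
| ruminant | cattle | 51 | 22 | 30 | 0 | 0 | 7 | 0 | 0 | 17 | 5 | 18 | 0 | 51 | 51 | 51 | 0 |
| | sheep | 32 | 16 | 14 | 11 | 0 | 0 | 0 | 0 | 11 | 4 | 9 | 5 | 32 | 32 | 32 | 0 |
| | camel | 19 | 0 | 0 | 7 | 0 | 0 | 0 | 0 | 8 | 0 | 8 | 0 | 15 | 19 | 19 | 0 |
| poultry | chicken | 48 | 21 | 23 | 13 | 0 | 9 | 0 | 0 | 12 | 10 | 16 | 7 | 19 | 24 | 16 | 35 |
| | duck | 35 | 16 | 12 | 8 | 0 | 5 | 14 | 13 | 6 | 0 | 9 | 5 | 15 | 20 | 8 | 29 |
| | goose | 21 | 13 | 10 | 7 | 0 | 6 | 0 | 0 | 13 | 8 | 11 | 8 | 12 | 17 | 0 | 7 |
| pet | dog | 20 | 17 | 11 | 10 | 0 | 8 | 11 | 0 | 11 | 0 | 7 | 0 | 11 | 14 | 9 | 11 |
| | cat | 17 | 14 | 7 | 11 | 0 | 5 | 0 | 0 | 6 | 7 | 0 | 4 | 8 | 15 | 0 | 6 |
| | rabbit | 33 | 26 | 28 | 20 | 18 | 11 | 8 | 9 | 7 | 0 | 8 | 4 | 7 | 12 | 0 | 8 |
| non-common animals | horse | 18 | 0 | 0 | 5 | 0 | 0 | 0 | 0 | 11 | 0 | 5 | 0 | 10 | 11 | 9 | 0 |
| | donkey | 19 | 11 | 15 | 9 | 9 | 0 | 0 | 0 | 10 | 5 | 6 | 0 | 9 | 16 | 13 | 0 |
| | sensitivity (%) | 506 (a) | 98 | 82 | 74 | 59 | 66 | 61 | 67 | 100 | 95 | 68 | 95 | 96 | 100 | 100 | 68 |
| | specificity (%) | 506 (a) | 51 | 55 | 70 | 87 | 84 | 86 | 89 | 68 | 88 | 72 | 90 | 69 | 50 | 77 | 91 |

(a) Total number of samples.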
3.2 Quantitative analysis
Marker abundance in faecal material was characterised per gram of wet faeces, as discussed in previous studies (Ahmed et al., 2019; Nshimyimana et al., 2017; Layton et al., 2013). The abundance of markers in target and nontarget samples was assessed based on the 25th/75th percentiles and mean concentrations. Mean concentrations of human-associated markers in target samples ranged from 3.57 ± 0.77 log10 GC/g to 5.27 ± 1.25 log10 GC/g, while those of pig-associated markers in target samples ranged from 4.99 ± 1.79 log10 GC/g to 6.58 ± 1.59 log10 GC/g. Ruminant-associated markers had the highest concentrations, ranging from 6.19 ± 1.26 log10 GC/g to 7.15 ± 0.89 log10 GC/g. The poultry-associated marker GFD exhibited relatively low concentrations (4.09 ± 0.96 log10 GC/g) in target samples. Bacteroidales markers generally presented significantly higher concentrations in target samples compared to most of the other markers (paired t-test, p < 0.05). Meanwhile, the mitochondrial marker P.ND5 also exhibited high concentrations and showed no statistically significant difference in abundance from Pig-1-Bac (paired t-test, p > 0.05). Moreover, the concentrations of the tested markers in target samples were broadly distributed: the 25th and 75th percentiles of marker concentrations were separated by 1–4 orders of magnitude (Fig. 2). Among human-associated markers, the 25th and 75th percentiles of Bacteroidales markers were separated by 3–4 orders of magnitude, which indicates a relatively broad distribution compared to mitochondrial DNA and crAssphage markers. In contrast, all ruminant-associated markers presented a relatively narrow gap between the 25th and 75th percentiles, which were separated by 1–3 orders of magnitude. The 25th and 75th percentiles of pig-associated markers were separated by 1–4 orders of magnitude, but the Bacteroidales markers among them showed only 1–2 orders of magnitude difference between the 25th and 75th percentiles. The 25th and 75th percentiles of the poultry-associated marker GFD were separated by only 2 orders of magnitude.
The concentrations of the 15 host-associated markers in nontarget samples were also determined. The mean concentration in nontarget samples ranged from 2.48 ± 0.48 log10 GC/g for Hum163 to 4.09 ± 0.90 log10 GC/g for BacCow, which revealed that marker concentrations in nontarget samples were approximately 1–3 orders of magnitude lower than in target samples. In nontarget samples, the 25th and 75th percentiles of markers were separated by only 1–2 orders of magnitude. These results indicate relatively limited distributions for the markers in nontarget samples (Fig. 2). To investigate the contribution of different nontarget animals to marker concentrations, the distribution of markers in each nontarget host was calculated based on the corresponding standard curve (Figs. S2–S5).

Fig. 2. Concentrations of human- and nonhuman-associated markers in target/nontarget faecal samples. Boxes indicate the 25th/75th percentiles. Diamonds indicate the median values. Panel (a): human-associated markers; Panel (b): nonhuman-associated markers.
3.3 Classification of nontarget samples
Based on the distribution of false positives in nontarget samples, we classified nontarget hosts into 4 groups (Fig. 3). The concentration ranges of the no cross-reactivity (NCR), weak cross-reactivity (WCR), moderate cross-reactivity (MCR) and strong cross-reactivity (SCR) classes for each marker are shown in Tables S6–S9. The classification revealed that 47.1% of nontarget animals were assigned to the WCR class, and 30.8% fell into the NCR class. The MCR class (14.5%) and SCR class (7.6%) were much less frequent (Fig. 3). Overall, this classification method indicates that most nontarget samples (77.9%) had little or no impact on MST assays (WCR and NCR). The distributions of WCR and SCR were significantly different between human and nonhuman hosts (paired t-test, p < 0.05). For instance, 40.5% of nontarget hosts were classified as NCR for human-associated markers, whereas for nonhuman-associated markers, 52.9% of nontarget samples were classified as WCR. Fewer nontarget animal hosts fell into the strongly affected category. For human-associated markers, Bacteroidales and mitochondrial markers resulted in high mean concentrations in rabbit samples (ranging from 5.37 ± 1.03 log10 GC/g for BacHum to 3.11 ± 0.72 log10 GC/g for Hum163), which were similar to those in target samples. Moreover, for the ruminant-associated markers Bac708 and BacCow, the levels of false-positive signals from donkey samples were similar to those of ruminant faecal samples.

Fig. 3. Classification of nontarget samples for each host-associated marker. The results are colored on the basis of the following criteria: NCR (no cross-reactivity), no false-positive signal was amplified; WCR (weak cross-reactivity), positive value for the 25th/75th metric; MCR (moderate cross-reactivity), negative value for the 25th/75th metric; SCR (strong cross-reactivity), the disparity of mean concentration between target and nontarget samples is less than 1 order of magnitude.
4. Discussion
4.1 Evaluation of genetic markers beyond a regional context
Suitable genetic markers spanning a wide variety of geographical regions were selected from a preliminary literature review, rather than by conducting a blind validation study for a range of genetic markers. The performance of markers typically declines dramatically when validation studies are performed beyond the regional context in which the markers were originally developed. Moreover, it is difficult to meet the >80% requirement for both sensitivity and specificity (USEPA, 2005), at which point the marker is considered useful. For example, previously published studies reported BacH and BacHum to be the best performing markers (Kildare et al., 2007; Reischer et al., 2010), but these markers performed poorly in our study. Consistent with our results, one study found the specificity of BacH and BacHum to be 53% and 68%, respectively, even in the context of a study area that spanned sixteen countries (Reischer et al., 2013). Several possible factors, such as differences in diet, climate, animal health and lifestyle, may lead to the inconsistent performance of genetic markers (Stewart et al., 2013; Shanks et al., 2010b; Shanks et al., 2011; Ahmed et al., 2019). In addition, different DNA isolation procedures and qPCR parameters (e.g. qPCR reagent, DNA load) are other influencing factors that cannot be ignored and may also contribute to the variable outcomes observed in different locations (Reischer et al., 2013; Boehm et al., 2013). For example, in the verification of BacH, BacHum and BacCow, all qPCR reactions were run in a total volume of 20 µL in our study, whereas previous studies performed qPCR reactions in a total volume of 25 µL (Kildare et al., 2007; Reischer et al., 2010; Reischer et al., 2013), and the temperature settings of the qPCR assays also differed among these studies. This discrepancy was due to the application of different commercial reaction components in each study, leading to different qPCR protocols and affecting the performance of these markers. Therefore, we postulate that there is no single performance threshold that determines a genetic marker’s applicability for MST in a wide array of regions. Rather, the performance requirements are subject to each MST challenge and the conditions present within each particular study area.
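
The >80% benchmark for sensitivity and specificity discussed in this section can be illustrated with a minimal sketch. It assumes the usual binary definitions (sensitivity as the fraction of target samples that test positive, specificity as the fraction of nontarget samples that test negative). The function names and the sample counts are hypothetical and are not taken from this study.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of target-host samples in which the marker was detected."""
    total = true_pos + false_neg
    if total == 0:
        raise ValueError("no target samples were tested")
    return true_pos / total


def specificity(true_neg, false_pos):
    """Fraction of nontarget-host samples in which the marker was not detected."""
    total = true_neg + false_pos
    if total == 0:
        raise ValueError("no nontarget samples were tested")
    return true_neg / total


def meets_benchmark(sens, spec, threshold=0.80):
    """True only if both metrics exceed the threshold (0.80 by default)."""
    return sens > threshold and spec > threshold


# Hypothetical counts: 18 of 20 target samples positive,
# 47 of 100 nontarget samples positive (false positives).
sens = sensitivity(true_pos=18, false_neg=2)
spec = specificity(true_neg=53, false_pos=47)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}, "
      f"useful={meets_benchmark(sens, spec)}")
```

A marker can therefore pass on sensitivity and still fail the benchmark on specificity, which is the situation described for some markers when they are tested outside their region of origin.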
4.2 Variation of source-sensitivity
Quantitative data on the concentrations of host-associated markers in target samples are critical for detecting their presence in aquatic environments. A previous study proposed that if markers have a high qualitative sensitivity but a low quantitative sensitivity (i.e. abundance in target samples), they are unlikely to detect faecal pollution in water samples or will otherwise tend to underestimate the level of contamination due to dilution or losses during sample processing steps (Ahmed et al., 2019). Based on our results, we propose that sensitivity may also be positively associated with the concentrations of specific markers in the target samples. For example, the human-associated markers Hum2, Hum163, CPQ_056, and CPQ_064 exhibited poor sensitivity (59–67%), and their concentrations in the target samples were also lower than those of the other human-associated markers by 1–3 orders of magnitude. A similar trend was reported by previous studies that evaluated the performance of human-associated markers in Singapore and Australia (Nshimyimana et al., 2017; Ahmed et al., 2019). These observations suggest that a higher mean concentration of markers allows for a higher likelihood of obtaining true-positive signals above the LOD in target hosts. Thus, the sensitivity of genetic markers could be used to predict their quantitative performance in target hosts when conducting preliminary screens to select promising markers based on binary data (i.e. presence/absence).
The population distribution of target microorganisms in the host intestines, which is another factor considered in the development of genetic markers, may also be linked to marker sensitivity and concentrations in target samples. In the present study, Bacteroidales markers typically exhibited higher sensitivity and concentrations than other bacterial and viral markers. This may be because Bacteroidales are a dominant microbial population, compared to most other microorganisms, in mammal intestines (Ahmed et al., 2010a). The pig-associated mitochondrial DNA (mtDNA) marker P.ND5 also exhibited high sensitivity in our study. This is consistent with several earlier studies on the abundance of mtDNA markers, which suggest that multiple copies of mtDNA are contained in exfoliated epithelial cells (He et al., 2016; Tambalo et al., 2012; Caldwell et al., 2009). Thus, these mtDNA copies could provide strong positive signals comparable to those of the Bacteroidales 16S rRNA genes.
We also found that human-associated Bacteroidales markers had lower sensitivity and concentrations in target samples than nonhuman-associated Bacteroidales markers. Moreover, human-associated Bacteroidales markers in target samples exhibited a relatively broad concentration distribution. As far as we know, an investigation of the widespread distribution and stability of most Bacteroidales markers has not been systematically performed to date; nevertheless, our partial results are consistent with a previous study reporting that BacCow has a broader target host distribution and greater stability than BacH and BacHum (Reischer et al., 2013). This can be attributed to less dominant and more variable target Bacteroidales in the human gut compared to those in other mammals. This was illustrated in a previous study, where target Bacteroidales in pig and cow faeces showed higher and more stable relative abundances than human-associated Bacteroidales, including B. fragilis, B. caccae, B. uniformis, and B. vulgatus (Hong, 2010). Overall, the data suggest that the distribution of target microorganism populations has a significant effect on the sensitivity and concentrations of genetic markers.
A previous study reported the discovery of a highly abundant bacteriophage, “crAssphage”, in human faeces (Dutilh et al., 2014); however, in our study, the sensitivity and concentrations of the CPQ_056 and CPQ_064 crAssphage markers in target samples were relatively low compared to other evaluated markers. Consistent with our results, one study also found low sensitivity for CPQ_056 and CPQ_064 (46.1% for both) in faecal samples (Ahmed et al., 2018). This may be explained by the uneven distribution of crAssphage in human faeces (Ahmed et al., 2018). Previous studies have suggested that host age, health or other factors may influence the dispersion or aggregation of crAssphage in faeces (Liang et al., 2016; Cinek et al., 2018; Ahmed et al., 2018), thereby affecting its detection using qPCR (Ahmed et al., 2018). However, as this experiment did not consider the age of donors and lacked faecal samples from patients, the reason for the low sensitivity of CPQ_056 and CPQ_064 needs to be further confirmed in future studies. In addition, differences in crAssphage abundance among Chinese, European and American populations may also contribute to the low sensitivity of CPQ_056 and CPQ_064 in this study. A previous survey of 86 publicly available metagenomes reported that crAssphage was less abundant in sewage from Asia than in sewage from the United States and Europe (Stachler and Bibby, 2014).
4.3 Cross-reactivity
The occurrence of cross-reactivity from nontarget samples leads to poor marker specificity. To our knowledge, there are no genetic markers with absolute host specificity to date. The mechanisms for cross-reactivity are unclear and require further investigation. Some genome regions targeted by host-associated genetic markers may be homologous to those of microorganisms found in different animals (Stachler et al., 2018). Cohabitation may also be an important factor explaining cross-reactivity. In our study, frequent contact between humans and pets in urbanised areas likely led to a higher proportion of false-positive signals from pets in human-associated markers. Similar observations have been made in other Asian areas. For instance, a previous study reported that rabbits contribute a large proportion of cross-reactivity to human-associated markers among nontarget samples in Singapore due to their widespread adoption as common pets in this area (Nshimyimana et al., 2017). Moreover, in rural areas, free-roaming poultry and pets (especially dogs) that are not isolated from each other might also explain the large proportion of cross-reactivity from pets to the poultry-associated GFD marker. However, as we did not collect faecal samples from pet-like animals whose habitats are isolated from humans and poultry, such as wild rabbits, the influence of cohabitation needs to be further confirmed in future studies by using animals from different habitats. Besides cohabitation, diet and physiology may also explain cross-reactivity. For instance, the high proportion of cross-reactivity from non-common animals in ruminant-associated markers observed in our study might be attributed to the relatively similar diets and physiologies of these animals.
Despite the observations mentioned above, most genetic markers exhibit limited cross-reactivity with nontarget samples. This may help to correctly identify sources of contamination if two or more markers are used simultaneously (Ahmed et al., 2019). This highlights the importance of characterising the species cross-reactivity of each marker in order to combine markers effectively for optimized target identification. For example, based on our results, if the Hum2 and CPQ_056 human-associated markers are used simultaneously, the potential false-positive signals due to the presence of Hum2 in donkey may be identified when tracking the source of faecal contamination in water bodies. Similarly, Hum2 could also be used to resolve the false-positive results associated with duck and dog, which derive from the CPQ_056 marker (Fig. S1).
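
The reasoning behind pairing markers can be sketched as a simple set intersection. The sketch below is illustrative only: the host sets list the cross-reactive hosts named in the text for Hum2 (pig, donkey, rabbit) and CPQ_056 (duck, dog) rather than the full supplementary data, the function name is hypothetical, and only positive results are used because the limited sensitivity of both markers makes a negative result weak evidence.

```python
# Hosts in which each marker can produce a signal: the target host (human)
# plus the cross-reactive hosts named in the text. Illustrative, not exhaustive.
MARKER_HOSTS = {
    "Hum2": {"human", "pig", "donkey", "rabbit"},
    "CPQ_056": {"human", "duck", "dog"},
}


def candidate_sources(positive_markers):
    """Return the hosts that could explain every positive marker at once."""
    host_sets = [MARKER_HOSTS[name] for name in positive_markers]
    if not host_sets:
        return set()
    return set.intersection(*host_sets)


# A single positive marker leaves several candidate hosts.
print(sorted(candidate_sources(["Hum2"])))
# Two positive markers with disjoint cross-reactivity narrow the source.
print(sorted(candidate_sources(["Hum2", "CPQ_056"])))
```

Under these assumptions, a sample positive for both markers is most simply explained by a human source, because no single nontarget host in the two sets cross-reacts with both markers. Mixed sources (for example donkey plus dog) are not ruled out by this simplification.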
4.4 Low level of false positive signals
The inconsistent species cross-reactivity of genetic markers in different locations poses another significant challenge for the application of MST assays. For instance, Hum2 was previously identified to cross-react with chicken, sheep, cattle, and goose in some parts of the United States (Layton et al., 2013; Shanks et al., 2009), but did not amplify with these faecal samples in our study, although cross-reactivity was observed in pigs, donkeys, and rabbits (Fig. S2). This geographical variability in cross-reactivity suggests that markers may produce solely negative results for some animals if tested in a different place or region. However, our study revealed that most false-positive signals in nontarget hosts were significantly lower than in target hosts (Fig. 2); this observation could help to exclude cross-reactivity. This is consistent with several earlier studies, which reported that the mean concentrations of markers in target sources were generally 1–5 orders of magnitude higher than in nontarget sources (Reischer et al., 2013; Ahmed et al., 2019). Such low marker levels in these nontarget hosts may not significantly interfere with the interpretation of results, owing to dilution and the loss of gene copies through sample concentration and DNA extraction in water samples (Ahmed et al., 2019). Moreover, we observed that most of the highly sensitive markers, such as Pig-1-Bac, Pig-2-Bac and the ruminant-associated markers, appeared to have a greater gap between true-positive and false-positive signals. The mean concentrations of these markers in target samples were 3–4 orders of magnitude higher than those in nontarget samples. This may also provide more possibilities for detecting the potential source of faecal pollution while eliminating false-positive signals.
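
The dilution argument above can be made concrete with a small sketch on the log10 scale. All numbers are hypothetical, the function names are illustrative, and the single "fold reduction" parameter lumps together dilution in the water body and losses during sample concentration and DNA extraction, which is a deliberate simplification.

```python
import math


def log10_gap(target_log10, nontarget_log10):
    """Orders of magnitude separating target and nontarget mean concentrations."""
    return target_log10 - nontarget_log10


def detectable(source_log10, fold_reduction, lod_log10):
    """True if the signal is still at or above the LOD after the fold reduction."""
    return source_log10 - math.log10(fold_reduction) >= lod_log10


# Hypothetical mean concentrations (log10 gene copies per unit of source material).
target, nontarget = 8.0, 4.5
lod = 2.0                # hypothetical limit of detection (log10)
fold_reduction = 1000.0  # hypothetical combined dilution and processing loss

print(f"gap = {log10_gap(target, nontarget):.1f} orders of magnitude")
print("target detectable:", detectable(target, fold_reduction, lod))
print("nontarget detectable:", detectable(nontarget, fold_reduction, lod))
```

With a gap of several orders of magnitude, the same reduction that leaves the target signal well above the LOD pushes the nontarget signal below it. When the gap is small, as reported here for rabbit and donkey, this separation no longer holds.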
It should be noted that although most false-positive signals were significantly lower than true-positive signals, individual cross-reactive species may still be more likely to exhibit false-positive signals and influence the results of MST. For example, in our study, rabbit and donkey exhibited a high degree of cross-reactivity with human- and ruminant-associated markers, respectively. These false-positive signals are likely to confound the detection of true-positive signals. Therefore, if false-positive signals are suspected to be present in water samples, the species known to cross-react (i.e. those that show concentrations similar to target samples) should be prioritised for closer scrutiny or otherwise excluded from the study by adopting other markers instead.
4.5 Implication of classification for host specificity
Although the mean concentrations of markers were lower in nontarget samples than in target samples, it would be difficult to predict whether the resulting lower false-positive signals from cross-reactive animals will accordingly have little impact on the MST results. We proposed a novel way to address this problem based on a 25th/75th metric. If the 25th percentile concentration of target samples is not overlapped by the 75th percentile concentration of nontarget samples, the marker usually presents a clear gap between the distributions of true-positive and false-positive signals. A small proportion of positive signals in nontarget samples may further reduce the likelihood of a false-positive signal interfering with the interpretation of results.
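
The classification can be sketched in code as follows. This is a minimal illustration, not the authors' implementation. It assumes that the 25th/75th metric is the 25th percentile of target concentrations minus the 75th percentile of nontarget concentrations (both in log10 units), that only positive nontarget signals are passed in, and that the SCR criterion (mean difference below one order of magnitude, as in the Fig. 3 caption) is checked before the sign of the metric. The function names and example values are hypothetical.

```python
from statistics import mean, quantiles


def quartiles(values):
    """Return the (25th, 75th) percentiles; a single value is its own quartile."""
    if len(values) == 1:
        return values[0], values[0]
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return q1, q3


def classify_nontarget(target_log10, nontarget_log10):
    """Assign a nontarget host to NCR, WCR, MCR or SCR for one marker."""
    if not nontarget_log10:
        return "NCR"  # no false-positive signal was amplified
    if mean(target_log10) - mean(nontarget_log10) < 1.0:
        return "SCR"  # means differ by less than one order of magnitude
    target_q1, _ = quartiles(target_log10)
    _, nontarget_q3 = quartiles(nontarget_log10)
    metric = target_q1 - nontarget_q3
    return "WCR" if metric > 0 else "MCR"


# Hypothetical log10 concentrations for one marker.
target = [7.0, 7.5, 8.0, 8.5, 9.0]
print(classify_nontarget(target, []))               # no amplification
print(classify_nontarget(target, [3.0, 3.5, 4.0]))  # clear gap
print(classify_nontarget(target, [5.0, 7.5, 8.0]))  # overlapping quartiles
print(classify_nontarget(target, [7.0, 7.5, 8.5]))  # similar means
```

The order in which the SCR and metric criteria are applied is a design choice of this sketch; the source defines the four classes but does not state a precedence between them.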
Our recommendations provide opportunities for policymakers and managers to select suitable markers according to specific requirements, such as land-use patterns. For instance, in highly urbanised areas, there may not be significant faecal contamination input from livestock and poultry farming to justify their monitoring. Rather, the main pollution sources here would be humans and pets. According to the classification of our results, the application of the human-specific SYBR-HF183 marker could provide high sensitivity to human-originated pollution. Furthermore, the cross-reactivity from most nontarget animals could be masked owing to the low abundance of the marker in nontarget hosts (i.e. the animals that were assigned to WCR). Additionally, pairing the Hum2 and CPQ_064 markers could accurately rule out the interference of cats (MCR), since no false-positive signals were present in these animals. Moreover, Pig-2-Bac or BacCow could also be used to rule out cross-reactivity from rabbits (SCR).
It is important to note that the acquisition of reliable 25th/75th metrics and mean concentrations in target and nontarget samples requires a large number of validation samples from various animals. There is no established guideline on how many nontarget hosts must be validated in specificity testing; however, the USEPA MST guidelines suggest that more than 10 species of animals should be used for evaluating host specificity, which would place our study well within an acceptable range. However, the number of nontarget host species and the sample size should be increased when testing host specificity beyond a regional context, due to the variability of the markers in different geographic locations. Also, when tracking the source of faecal contamination in environmental waters, the level of true-positive signals may critically mask false-positive signals from WCR. If high concentrations of markers occur in water samples, potential false-positive signals from WCR may also be increased and interfere with MST results. Therefore, markers with high concentrations in water samples should be diluted to ensure that the lower false-positive signals from WCR cannot be detected.
This study validated the performance of a range of host-associated genetic markers based on qualitative and quantitative tests for MST in China and proposed a novel classification method for comprehensive characterization of the specificity of markers. Our findings provide opportunities for policymakers and managers to gain access to key background information for selecting suitable markers to address the challenges of accurately tracking faecal pollution sources in a quantitative manner.
Unfortunately, in our pre-screening experiments, only one poultry marker (GFD) was found to have applicability in China, and it is hard to validate the accuracy of MST results with a single marker. The ability to develop reliable genetic markers for the specific detection of poultry faecal contamination is still a worldwide challenge. This could be due to different distributions of gut microbes in poultry compared to mammals. Bacteroides and its associated organisms are commonly used to develop host-associated genetic markers (Vadde et al., 2019), but previous studies revealed that Bacteroides were not frequently detected in poultry gut or faeces (Zhu et al., 2002; Scupham et al., 2008). Thus, further research needs to focus on developing reliable methods for poultry-associated source tracking. Moreover, the differing persistence of genetic markers could also affect MST results. A recent study established decay models for HF183 and CPQ_056 in freshwater and seawater as a function of temperature; the results showed that the decay rate of HF183 was significantly faster than that of CPQ_056 (Balleste et al., 2019). This difference in decay rate may affect the coupling of HF183 with CPQ_056 to determine cross-reactivity from nontarget hosts. Therefore, identification of decay rates of suitable markers is equally important for MST validation.
5. Conclusions
• Fifteen host-associated markers presented potential suitability for MST of human-, pig-, ruminant- and poultry-derived faecal contamination in China. Overall, Bacteroidales markers exhibited higher sensitivity and concentrations compared to other bacterial and viral markers in target samples, but their specificity was low, suggesting that it might be necessary to use multiple markers when tracking the sources of faecal contamination.
• The observed high proportion of cross-reactivity from pets to human- and poultry-associated markers was likely due to cohabitation. Likewise, similarities in diet and physiology may explain cross-reactivity from noncommon animals (horse and donkey) to ruminant-associated markers.
• Our novel animal cross-reactivity classification method has broad implications for identifying the degree of impact of false-positive results. According to this method, 77.9% of nontarget samples were considered unlikely to be mismatched to the selected 15 markers, owing to the absence or low concentrations of the markers in nontarget samples.
Acknowledgements
This research was financially supported by the National Natural Science Foundation of China (Grant No. 41303054), the Basal Specific Research of the Central Public-Interest Scientific Institute (Grant No. PM-zx703-201803-089), the Outstanding Youth Science Foundation of NSFC (Grant No. 51822908) and Natural Science Foundation of Guangdong Province of China (Grant No. 2015A030313850).
References
Ahmed, W., Goonetilleke, A., Powell, D., Gardner, T., 2009. Evaluation of multiple sewage-associated Bacteroides PCR markers for sewage pollution tracking. Water Res. 43(19), 4872-4877.
Ahmed, W., Gyawali, P., Feng, S., McLellan, S.L., 2019a. Host specificity and sensitivity of the established and novel sewage-associated marker genes in human and non-human fecal samples. Appl. Environ. Microbiol. in press.
Ahmed, W., Payyappat, S., Cassidy, M., Besley, C., Power, K., 2018. Novel crAssphage marker genes ascertain sewage pollution in a recreational lake receiving urban stormwater runoff. Water Res. 145, 769-778.
Ahmed, W., O'Dea, C., Masters, N., Kuballa, A., Marinoni, O., Katouli, M., 2019b. Marker genes of fecal indicator bacteria and potential pathogens in animal feces in subtropical catchments. Sci. Total Environ. 656, 1427-1435.
Ahmed, W., Stewart, J., Powell, D., Gardner, T., 2010a. Evaluation of Bacteroides markers for the detection of human faecal pollution. Lett. Appl. Microbiol. 46(2), 237-242.
Ahmed, W., Yusuf, R., Hasan, I., Goonetilleke, A., Gardner, T., 2010b. Quantitative PCR assay of sewage-associated Bacteroides markers to assess sewage pollution in an urban lake in Dhaka, Bangladesh. Can. J. Microbiol. 56(10), 838.
Ayeni, F.A., Biagi, E., Rampelli, S., Fiori, J., Soverini, M., Audu, H.J., Cristino, S., Caporali, L., Schnorr, S.L., Carelli, V., 2018. Infant and Adult Gut Microbiome and Metabolome in Rural Bassa and Urban Settlers from Nigeria. Cell Rep. 23(10), 3056-3067.
Balleste, E., Pascual-Benito, M., Martin-Diaz, J., Blanch, A.R., Lucena, F., Muniesa, M., Jofre, J., Garcia-Aljaro, C., 2019. Dynamics of crAssphage as a human source tracking marker in potentially faecally polluted environments. Water Res. 155, 233-244.
Bernhard, A.E., Field, K.G., 2000. A PCR assay to discriminate human and ruminant feces on the basis of host differences in Bacteroides-Prevotella genes encoding 16S rRNA. Appl. Environ. Microbiol. 66(10), 4571-4574.
Boehm, A.B., Werfhorst, L.C., Van De, Griffith, J.F., Holden, P.A., Jay, J.A., Shanks, O.C., Dan, W., Weisberg, S.B., 2013. Performance of forty-one microbial source tracking methods: A twenty-seven lab evaluation study. Water Res. 47(18), 6812-6828.
Bonjoch, X., Balleste, E., Blanch, A.R., 2005. Enumeration of bifidobacterial populations with selective media to determine the source of waterborne fecal pollution. Water Res. 39(8), 1621-1627.
Caldwell, J.M., Levine, J.F., 2009. Domestic wastewater influent profiling using mitochondrial real-time PCR for source tracking animal contamination. J. Microbiol. Meth. 77(1), 17-22.
Cinek, O., Mazankova, K., Kramna, L., Odeh, R., Alassaf, A., Ibekwe, M.U., Ahmadov, G., Mekki, H., Abdullah, M.A., Elmahi, B.M.E., 2018. Quantitative CrAssphage real-time PCR assay derived from data of multiple geographically distant populations. J. Med. Microbiol. 90(4), 767-771.
Dutilh, B.E., Cassman, N.A., Mcnair, K., Sanchez, S.E., Silva, G.G.Z., Boling, L., Barr, J.J., Speth, D.R., Seguritan, V., Aziz, R.K., 2014. A highly abundant bacteriophage discovered in the unknown sequences of human faecal metagenomes. Nat. Commun. 5(1), 4498-4498.
Fan, L., Shuai, J., Zeng, R., Mo, H., Wang, S., Zhang, X., He, Y., 2017. Validation and application of quantitative PCR assays using host-specific Bacteroidales genetic markers for swine fecal pollution tracking. Environ. Pollut. 23, 1569-1577.
Feng, S., McLellan, S.L., 2019. Highly specific sewage-derived Bacteroides qPCR assays target sewage polluted waters. Appl. Environ. Microbiol. in press.
Gawler, A.H., Beecher, J.E., Brandão, J., Carroll, N.M., Falcão, L., Gourmelon, M., Masterson, B., Nunes, B., Porter, J., Rincé, A., 2007. Validation of host-specific Bacteriodales 16S rRNA genes as markers to determine the origin of faecal pollution in Atlantic Rim countries of the European Union. Water Res. 41(16), 3780-3784.
Green, H.C., Dick, L.K., Brent, G., Mansour, S., Field, K.G., 2012. Genetic markers for rapid PCR-based identification of gull, Canada goose, duck, and chicken fecal contamination in water. Appl. Environ. Microbiol. 78(2), 503-510.
Harwood, V.J., Christopher, S., Badgley, B.D., Kim, B., Asja, K., 2014. Microbial source tracking markers for detection of fecal contamination in environmental waters: relationships between pathogens and human health outcomes. FEMS Microbiol. Rev. 38(1), 1-40.
He, X., Liu, P., Zheng, G., Chen, H., Shi, W., Cui, Y., Ren, H., Zhang, X.X., 2016. Evaluation of five microbial and four mitochondrial DNA markers for tracking human and pig fecal pollution in freshwater. Sci. Rep. 6, 35311.
Hong, P.Y., 2010. A high-throughput and quantitative hierarchical oligonucleotide primer extension (HOPE)-based approach to identify sources of faecal contamination in water bodies. Environ. Microbiol. 11(7), 1672-1681.
Karkman, A., Parnanen, K., Larsson, D.G.J., 2019. Fecal pollution can explain antibiotic resistance gene abundances in anthropogenically impacted environments. Nat. Commun. 10(1), 80.
Kildare, B.J., Leutenegger, C.M., McSwain, B.S., Bambic, D.G., Rajal, V.B., Wuertz, S., 2007. 16S rRNA-based assays for quantitative detection of universal, human-, cow-, and dog-specific fecal Bacteroidales: A Bayesian approach. Water Res. 41(16), 3701-3715.
Kim, J.Y., Lee, H., Lee, J.E., Chung, M., Ko, G., 2013. Identification of Human and Animal Fecal Contamination after Rainfall in the Han River, Korea. Microbes Environ. 28(2), 187-194.
Kushugulova, A., Forslund, S.K., Costea, P.I., Kozhakhmetov, S., Khassenbekova, Z., Urazova, M., Nurgozhin, T., Zhumadilov, Z., Benberin, V., Driessen, M., 2018. Metagenomic analysis of gut microbial communities from a Central Asian population. BMJ Open 8(7).
Layton, B.A., Mckay, L., Dan, W., Garrett, V., Gentry, R., Sayler, G., 2006. Development of Bacteroides 16S rRNA Gene TaqMan-Based Real-Time PCR Assays for Estimation of Total, Human, and Bovine Fecal Pollution in Water. Appl. Environ. Microbiol. 72(6), 4214-4224.
Layton, B.A., Yiping, C., Ebentier, D.L., Kaitlyn, H., Elisenda, B., João, B.O., Muruleedhara, B., Reagan, C., Farnleitner, A.H., Jennifer, G.S., 2013. Performance of human fecal anaerobe-associated PCR-based assays in a multi-laboratory method evaluation study. Water Res. 47(18), 6897-6908.
Liang, Y.Y., Zhang, W., Tong, Y.G., Chen, S.P., 2016. CrAssphage is not associated with diarrhoea and has high genetic diversity. Epidemiol. Infect. 144(16), 3549-3553.
Lu, J., Santo Domingo, J.W., Lamendella, R., Edge, T., Hill, S., 2008. Phylogenetic diversity and molecular detection of bacteria in gull feces. Appl. Environ. Microbiol. 74(13), 3969-3976.
Lu, S., Smith, A.P., Dan, M., Lee, N.M., 2010. Different real-time PCR systems yield different gene expression values. Mol. Cell. Probes 24(5), 315-320.
Malla, B., Ghaju, S.R., Tandukar, S., Bhandari, D., Inoue, D., Sei, K., Tanaka, Y., Sherchand, J.B., Haramoto, E., 2018. Validation of host-specific Bacteroidales quantitative PCR assays and their application to microbial source tracking of drinking water sources in the Kathmandu Valley, Nepal. J. Appl. Microbiol. 125, 609-619.
Mayer, R.E., Reischer, G., Ixenmaier, S.K., Derx, J., Blaschke, A.P., Ebdon, J.E., Linke, R., Egle, L., Ahmed, W., Blanch, A., 2018. Global Distribution of Human-associated Fecal Genetic Markers in Reference Samples from Six Continents. Environ. Sci. Technol. 52(9), 5076-5084.
Mieszkin, S., Furet, J.-P., Corthier, G., Gourmelon, M., 2009. Estimation of Pig Fecal Contamination in a River Catchment by Real-Time PCR Using Two Pig-Specific Bacteroidales 16S rRNA Genetic Markers. Appl. Environ. Microbiol. 75(10), 3045-3054.
Mieszkin, S., J-F, Y., Joubrel, R., Gourmelon, M., 2010. Phylogenetic analysis of Bacteroidales 16S rRNA gene sequences from human and animal effluents and assessment of ruminant faecal pollution by real-time PCR. J. Appl. Microbiol. 108(3), 974-984.
Nshimyimana, J.P., Cruz, M.C., Thompson, R.J., Wuertz, S., 2017. Bacteroidales markers for microbial source tracking in Southeast Asia. Water Res. 118, 239-248.
Nshimyimana, J.P., Ekklesia, E., Shanahan, P., Chua, L.H.C., Thompson, J.R., 2014. Distribution and abundance of human specific Bacteroides and relation to traditional indicators in an urban tropical catchment. J. Appl. Microbiol. 116(5), 1369-1383.
Odagiri, M., Schriewer, A., Hanley, K., Wuertz, S., Misra, P.R., Panigrahi, P., Jenkins, M.W., 2015. Validation of Bacteroidales quantitative PCR assays targeting human and animal fecal contamination in the public and domestic domains in India. Sci. Total Environ. 502(5), 462-470.
Reischer, G.H., Ebdon, J.E., Bauer, J.M., Nathalie, S., Warish, A., Johan, A.M., Blanch, A.R., Günter, B.S., Denis, B., Tricia, C., 2013. Performance characteristics of qPCR assays targeting human- and ruminant-associated Bacteroidetes for microbial source tracking across sixteen countries on six continents. Environ. Sci. Technol. 47(15), 8548-8556.
Reischer, G., Kasper, D., Steinborn, R., Farnleitner, A., Mach, R., 2010. A quantitative real-time PCR assay for the highly sensitive and specific detection of human faecal influence in spring water from a large alpine catchment area. Lett. Appl. Microbiol. 44(4), 351-356.
Seurinck, S., Defoirdt, T., Verstraete, W., Siciliano, S.D., 2005. Detection and quantification of the human-specific HF183 Bacteroides 16S rRNA genetic marker with real-time PCR for assessment of human faecal pollution in freshwater. Environ. Microbiol. 7(2), 249-259.
Scupham, A.J., Patton, T.G., Bent, E., Bayles, D.O., 2008. Comparison of the Cecal Microbiota of Domestic and Wild Turkeys. Microb. Ecol. 56(2), 322-331.
Shanks, O.C., Karen, W., Kelty, C.A., Mano, S., Janet, B., Mark, M., Manju, V., Haugland, R.A., 2010a. Performance of PCR-based assays targeting Bacteroidales genetic markers of human fecal pollution in sewage and fecal samples. Environ. Sci. Technol. 44(16), 6281-6288.
906
Shanks, O.C., Karen, W., Kelty, C.A., Sam, H., Mano, S., Michael, J.,
907
Manju, V., Haugland, R.A., 2010b. Performance assessment
908
PCR-based assays targeting Bacteroidales genetic markers of bovine
909
fecal pollution. Appl. Environ. Microbiol. 76(5), 1359-1366.
910
Shanks, O.C., Kelty, C.A., Mano, S., Manju, V., Haugland, R.A., 2009.
911
Quantitative PCR for genetic markers of human fecal pollution. Appl.
912
Environ. Microbiol. 75(17), 5507-5513.
913
Shanks, O.C., Kelty, C.A., Shawn, A., Michael, J., Newton, R.J., Mclellan,
S.L., Huse, S.M., Sogin, M.L., 2011. Community structures of fecal
bacteria in cattle from different animal feeding operations. Appl.
Environ. Microbiol. 77(9), 2992-3001.
Stachler, E., Akyon, B., Carvalho, N.A.d., Ference, C., Bibby, K., 2018.
Correlation of crAssphage qPCR Markers with Culturable and
Molecular Indicators of Human Fecal Pollution in an Impacted
Urban Watershed. Environ. Sci. Technol. 52(13), 7505-7512.
Stachler, E., Bibby, K., 2014. Metagenomic Evaluation of the Highly
Abundant Human Gut Bacteriophage CrAssphage for Source
Tracking of Human Fecal Pollution. Environ. Sci. Technol. Lett.
1(10), 405-409.
Stewart, J.R., Boehm, A.B., Dubinsky, E.A., Fong, T.T., Goodwin, K.D.,
Griffith, J.F., Noble, R.T., Shanks, O.C., Vijayavel, K., Weisberg,
S.B., 2013. Recommendations following a multi-laboratory
comparison of microbial source tracking methods. Water Res. 47(18),
6829-6838.
Tambalo, D.D., Boa, T., Liljebjelke, K., Yost, C.K., 2012. Evaluation of
two quantitative PCR assays using Bacteroidales and mitochondrial
DNA markers for tracking dog fecal contamination in waterbodies. J.
Microbiol. Meth. 91(3), 459-467.
USEPA, 2005. Microbial Source Tracking Guide Document
EPA/600/R-05/064. Washington, DC.
Vadde, K.K., McCarthy, A.J., Rong, R., Sekar, R., 2019. Quantification of
Microbial Source Tracking and Pathogenic Bacterial Markers in
Water and Sediments of Tiaoxi River (Taihu Watershed). Front.
Microbiol. 10.
Zhang, Y., Wu, R., Zhang, Y., Wang, G., Li, K., 2018. Impact of nutrient
addition on diversity and fate of fecal bacteria. Sci. Total Environ.
636, 717-726.
Zhu, X.Y., Zhong, T., Pandya, Y., Joerger, R.D., 2002. 16S rRNA-Based
Analysis of Microbiota from the Cecum of Broiler Chickens. Appl.
Environ. Microbiol. 68(1), 124-137.
Zuo, T., Kamm, M.A., Colombel, J.F., Ng, S.C., 2018. Urbanization
and the gut microbiota in health and inflammatory bowel disease.
Nat. Rev. Gastro. Hepat. 15(7), 440-452.
Performance of host-associated genetic markers for microbial source tracking in China
Highlights:
1. The performance of host-associated genetic markers was investigated across a large area of China.
2. The distribution of target microorganisms affects the sensitivity of the corresponding markers and their concentrations in target samples.
3. Cohabitation, diet and physiology are important reasons for the occurrence of cross-reactivity.
4. A novel classification method was used to identify the degree of impact of false-positive results from nontarget hosts.
Declaration of interests
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.