Accepted Manuscript

Distinguishing neurocysticercosis epilepsy from epilepsy of unknown etiology using a minimal serum mass profiling platform

Jay S. Hanas, James R. Hocker, Govindan Ramajayam, Vasudevan Prabhakaran, Vedantam Rajshekhar, Anna Oommen, Josephine J. Manoj, Michael P. Anderson, Douglas A. Drevets, Hélène Carabin

PII: S0014-4894(17)30648-3
DOI: 10.1016/j.exppara.2018.07.015
Reference: YEXPR 7590
To appear in: Experimental Parasitology
Received Date: 5 January 2018
Revised Date: 8 June 2018
Accepted Date: 20 July 2018

Please cite this article as: Hanas, J.S., Hocker, J.R., Ramajayam, G., Prabhakaran, V., Rajshekhar, V., Oommen, A., Manoj, J.J., Anderson, M.P., Drevets, D.A., Carabin, H., Distinguishing neurocysticercosis epilepsy from epilepsy of unknown etiology using a minimal serum mass profiling platform, Experimental Parasitology (2018), doi: 10.1016/j.exppara.2018.07.015.

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
Distinguishing neurocysticercosis epilepsy from epilepsy of unknown etiology using a minimal serum mass profiling platform.

Jay S. Hanas*, James R. Hocker*, Govindan Ramajayam†, Vasudevan Prabhakaran†, Vedantam Rajshekhar†, Anna Oommen†, Josephine J. Manoj†, Michael P. Anderson‡, Douglas A. Drevets§, Hélène Carabin‡
Corresponding author:
Hélène Carabin, D.V.M., Ph.D.
Department of Biostatistics and Epidemiology, College of Public Health
University of Oklahoma Health Sciences Center
801 NE 13th St, Oklahoma City, OK, 73104
Tel: +1 405 271-2229 x48083
Fax: +1 405 271-2068
email: [email protected]

Note: Supplementary data associated with this article
Abstract
Neurocysticercosis is associated with epilepsy in pig-raising communities with poor sanitation. Current internationally recognized diagnostic guidelines for neurocysticercosis rely on brain imaging, a technology that is frequently not available or not accessible in areas endemic for neurocysticercosis. Minimally invasive and low-cost aids for diagnosing neurocysticercosis epilepsy could improve treatment of neurocysticercosis. The goal of this study was to test the extent to which patients with neurocysticercosis epilepsy, epilepsy of unknown etiology, and idiopathic headaches, as well as patients with different types of neurocysticercosis lesions, could be distinguished from each other based on serum mass profiling. For this, we collected sera from patients with neurocysticercosis-associated epilepsy, epilepsy of unknown etiology, recovered neurocysticercosis, and idiopathic headaches, then performed binary group comparisons among them using electrospray ionization mass spectrometry. A leave one [serum sample] out cross validation procedure was employed to analyze spectral data. Sera from neurocysticercosis patients were distinguished from sera of epilepsy of unknown etiology patients with a p-value of 10^-28. This distinction was lost when samples were randomized to either group (p-value=0.22). Similarly, binary comparisons of patients with neurocysticercosis who had different types of lesions showed that different forms of this disease were also distinguishable from one another. These results suggest neurocysticercosis epilepsy can be distinguished from epilepsy of unknown etiology based on biomolecular differences in sera detected by mass profiling.
Keywords: neurocysticercosis, epilepsy, diagnosis, serum, electrospray mass spectrometry, India
1. Introduction
Epilepsy is a common neurological disorder affecting approximately 6.38 per 1000 persons (95% Confidence Interval: 5.57-7.30 per 1000) (Fiest et al., 2017). Eighty-five percent of patients with epilepsy reside in low and middle income countries (LMIC), where mortality rates from epilepsy are also significantly higher (Newton and Garcia, 2012; Ngugi et al., 2010). Neuroimaging (CT and MRI) of patients with adult onset epilepsy in endemic regions frequently reveals lesions diagnostic or suggestive of neurocysticercosis (NCC), a zoonotic infection of the central nervous system by larvae of Taenia solium. The infection is transmitted between humans (definitive host) and pigs (intermediate host). However, NCC may develop when humans become accidentally infected with the eggs of T. solium shed in human feces. NCC is most prevalent where sanitation is poor and pigs roam and scavenge for food, conditions found in several countries of Latin America, Africa and Asia (Donadeu et al., 2016). A meta-analysis estimated that 29% of people with epilepsy show lesions consistent with NCC in endemic areas (Ndimubanzi et al., 2010). The internationally recognized NCC diagnostic guidelines rely on brain imaging (Del Brutto, 2012), facilities which are poorly accessible to most people in endemic areas in LMIC. The mismatch between brain imaging accessibility and prevailing economic realities in LMICs creates challenges for accurately diagnosing NCC as a cause of epilepsy and for validly estimating the frequency of, and risk factors for, NCC (John et al., 2015; Ndimubanzi et al., 2010).

Inflammatory responses are important for NCC epilepsy and epilepsy of unknown etiology (EUE), but the host response to NCC is not completely understood (Garcia et al., 2014; Vezzani, 2005). Host responses to degenerating larvae and calcified lesions are thought to be associated with seizures in NCC epilepsy (Nash et al., 2015). Clinically distinguishing seizures associated with NCC (including among different types of NCC lesions) from those of EUE is impossible, although this distinction is critical to guide treatment and monitoring of patients after diagnosis (Coyle, 2014; Nash and Garcia, 2011).

Analysis of biomolecules in readily accessible body fluids is one avenue of research for developing alternative NCC diagnostic aids in LMICs. While serological tests detecting antigens and antibodies have had success in diagnosing the most active forms of NCC, their specificities remain poor because metacestodes can be found in most human tissues, and their sensitivities are low for detection of single or calcified NCC lesions (Rodriguez et al., 2012; Sako et al., 2015). In addition, these tests do not differentiate multiple from single lesions or among cysts in different stages of development in the brain.

Although soluble adhesion molecules in CSF and sera have been suggested as biomarkers for EUE (Luo et al., 2014), one method of biomarker investigation not explored for NCC and EUE is electrospray ionization (ESI) mass spectrometry (MS) serum mass profiling. The hypothesis of serum biomolecule mass profiling is that the amounts and kinds of biomolecules in serum reflect physiological changes, including those accompanying disease states (Hocker et al., 2011a; Hocker et al., 2011b; Richter et al., 1999). The ESI-MS serum mass profiling platform requires minimal sample preparation and examines a large number of different biomolecules in sera. In contrast, other biomarker platforms focus on one or relatively small numbers of similar components. Examining larger numbers of biomolecules increases the power of the platform to discriminate among disease states (Hocker et al., 2011a; Hocker et al., 2011b; Vachani et al., 2015). For example, ESI-MS serum profiling has been used to discriminate early-stage pancreatic cancer and lung cancer patients from control individuals (Hanas et al., 2008; Hocker et al., 2011a; Hocker et al., 2011b).

The goal of this study was to assess the degree to which serum mass profiling using ESI-MS could discriminate between patients with NCC-associated epilepsy and those with EUE or idiopathic headaches, and among different types of NCC lesions.

2. Materials and methods
2.1 Study participant descriptions
Patients aged 18 to 51 years were recruited at the Department of Neurological Sciences, Christian Medical College (CMC) and Hospital, Vellore, India, as described elsewhere (Prabhakaran et al., 2017). The study was approved by the Institutional Review Boards of CMC and the University of Oklahoma HSC, USA. All participants consented to participate in this study. Participants were categorized into four groups. Group 1 included new patients diagnosed with NCC-associated epilepsy who had experienced at least one seizure in the 7 months prior to enrollment. Patient sub-groups included: i) solitary cysticercus granuloma (SCG), ii) single calcified cysts (SCC), and iii) multiple neurocysticercosis cysts at various stages of development (MNCC). NCC patients were further categorized for the absence or presence of peri-lesional edema on brain imaging. Group 2 included previously-diagnosed NCC patients with no seizures for at least two years and no residual brain lesions (recovered NCC - RNCC). Group 3 included new patients with EUE reporting at least one seizure in the 7 months prior to enrollment, no evidence of NCC or other lesions on brain imaging and seronegative for cysticercosis antigens and antibodies. Group 4 included new patients with headaches and normal brain imaging, no history of seizures, head trauma, human immunodeficiency virus (HIV), hepatitis B virus (HBV) and hepatitis C virus (HCV) infections, or serum cysticercosis antigens or antibodies (herein designated as idiopathic headaches). Patients in all groups had not taken anti-inflammatory drugs (i.e., acetaminophen, ibuprofen) for at least 7 days prior to enrollment, and were not acutely ill at the time of phlebotomy. Peripheral blood was obtained and serum was prepared as described previously according to blood biomarker standards (Hocker et al., 2017; Hocker et al., 2015; Tuck et al., 2009).

NCC and RNCC patients with extra-parenchymal lesions were excluded because the study focused on patients with epilepsy. Patients were tested for HIV, HBV and HCV only as clinically indicated. All patients were tested for antigens and antibodies for cysticercosis (Prabhakaran et al., 2017).

2.2 Definition of NCC-associated epilepsy and epilepsy of unknown etiology
As described before (Prabhakaran et al., 2017), we used the proposed diagnostic criteria, including computed tomography (CT) or magnetic resonance imaging (MRI) brain images interpreted by one of the authors (VR), to define cases of NCC and categorize them into the subgroups described above (Del Brutto et al., 2001; García and Del Brutto, 2003). These diagnostic criteria were the only ones available at the time the study was initiated and were therefore retained throughout the study. These criteria were recently shown to have similar sensitivity and specificity as those proposed in (Carpio et al., 2016). Single calcified lesions were defined according to the recommendations by del Brutto et al. in their diagnostic criteria: “solid, dense, supratentorial calcifications 1 to 10mm in diameter, in the absence of other illnesses should be considered as highly suggestive of neurocysticercosis.” (Del Brutto et al., 2001). In this study, the diagnosis of solitary cysticercus granuloma (SCG) was made on the basis of previously validated criteria for SCG that have been published by Rajshekhar et al. (Rajshekhar and Chandy, 1997). Patients with MNCC could show only active lesions (i.e. viable or degenerating cysts), only calcified lesions, or a combination of both. The operational definition for epilepsy of the International League Against Epilepsy was used so that those with NCC and a single seizure met the definition (Fisher et al., 2014). All EUE patients had experienced at least two seizures in their lifetime.

2.3 Mass spectrometry
Mass spectrometry (MS) analysis was conducted with an Advantage LCQ ion-trap bench top ESI-MS instrument (ThermoFisher, Inc.) and an ESI-Single Quadrupole (Advion) instrument; both were calibrated following manufacturer protocols. All solvents were HPLC grade and purchased from ThermoFisher. Each patient’s serum aliquot (4 µl) was diluted 1:300 into 50% methanol and 2% formic acid, and separated into 3 aliquots. The samples were loop injected (20 µl) into the nano source of the mass spectrometer fitted with a 20 micron inner diameter fused silica (Polymicro Technologies) tip at a flow rate of 0.5 µl/min using an Eldex MicroPro series 1000 pumping system with instrument settings determined in previous work (Hocker et al., 2017). High-resolution triplicate mass spectra from two of the study groups were collected each day. The spectra were sampled with m/Z (mass divided by charge) resolution of two hundredths over the m/Z range of the instrument (i.e. 400 to 2000 m/Z). Positive ion mode spectra were collected over 30 min for each injection. Raw spectral data were extracted using the manufacturer's software "Qual Browser" version 1.4SR1 and exported in rounded unit m/Z and intensity values. Data were locally normalized in segments of 10 m/Z from 400-2000 m/Z. MS spectral peak area assignments were calculated as centroid m/Z peak area values (valley to valley) using Mariner Data Explorer 4.0.0.1 software (Applied BioSystems).

Centroid m/Z mass peak areas (referred to as peak areas), defined as the area of the peak calculated from its geometric m/Z center, were exported into Excel 2013, and triplicate peak areas at each m/Z value were averaged for each serum sample.

An ESI-Single Quadrupole instrument (Advion, Inc.) was also used to analyse the sera. This MS instrument uses a different mass analyzer with reduced m/Z range. Daily calibration with Agilent ESI Tuning Mix (G242A) diluted 1:4 with 100% acetonitrile on peaks of 188.09, 322.05, 622.03, and 922.01 m/Z was performed. Fluid solvent flow of 0.23 µL/min was provided by a Harvard Apparatus Pump 11 Elite equipped with a Hamilton 250 microliter gastight syringe. There was no gas flow provided. General modifications made to the standard Advion system and set up included the following. Advion Data Express version 3.3.5.2 was used. The tip was identical to that used in the LCQ ADVANTAGE except that the voltage was supplied through an M-572 IDEX-Health & Science conductive MicroUnion Assembly. All solvents were HPLC grade and purchased from ThermoFisher Scientific. All acquisition and calibration were performed with the same voltages and flow rate as the Advantage LCQ serum analysis. For data analysis of Advion samples, a 15 minute averaged mass spectrum (150-1200 m/Z data range) was extracted for each of 3 injections for each patient sample.

2.4 Statistical and quantitative analysis
Peak areas were analyzed with a nested leave one out [serum sample] cross validation (LOOCV) protocol to mitigate “over-fitting” (Guan et al., 2009; Hocker et al., 2015; Ransohoff, 2004). Fig. 1A illustrates the general approach for comparing two study groups. First, all peak areas of one subject (in either group) are taken out of the database (“left out” serum sample). Second, the difference of the means of peak area at each m/Z value for subjects “left-in” the two compared groups is analyzed with a Student’s t-test (one-tailed, unequal variance, (Hocker et al., 2015)) at an alpha value of 0.05. For each statistically significant peak area, the mid-point between these two means is used as a Peak Classification Value (PCV) to classify the left-out sample. If the peak area value of the left-out sample is above the PCV, it is allocated to the study group with the highest mean peak area at this m/Z value. Otherwise, it is allocated to the other group. As an example, Fig. 1B illustrates this classification procedure for 10 differentially expressed peak areas observed between 650 and 720 m/Z when one NCC sample is left out and 75 NCC (solid line) and 29 EUE (dotted line) samples are left in. The peak area at 670 m/Z is categorized as an “NCC” peak area and the one at 689 m/Z as an “EUE” peak area. If the left-out sample had a peak area of 12 at 670 m/Z (> PCV), it would be classified as an NCC peak area, but if its value was 8 (< PCV), it would be classified as an EUE peak area, and so on for all 10 peaks in Fig. 1B. This process is repeated sequentially for all significant peak areas between 400 and 2000 m/Z and until all samples have been left out and compared to the remaining left-in samples.

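The per-peak classification step described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' software: the function names (`one_tailed_p`, `classify_left_out`), the dictionary layout (m/Z value mapped to an averaged peak area) and the toy peak areas are all hypothetical, and the one-tailed p-value uses a normal approximation in place of the unequal-variance Student's t-test so that the sketch needs only the standard library.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def one_tailed_p(a, b):
    """Approximate one-tailed p-value for a difference in means (unequal variances).

    ASSUMPTION: a normal approximation stands in for the Student's t distribution.
    """
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    if se == 0:
        return 1.0  # no spread at all: this sketch treats the peak as uninformative
    return 1.0 - NormalDist().cdf(abs(mean(a) - mean(b)) / se)


def classify_left_out(left_out, group_a, group_b, labels=("NCC", "EUE"), alpha=0.05):
    """Count the significant peaks at which the left-out sample falls on each side of the PCV."""
    votes = {labels[0]: 0, labels[1]: 0}
    for mz, area in left_out.items():
        a = [sample[mz] for sample in group_a]
        b = [sample[mz] for sample in group_b]
        if one_tailed_p(a, b) >= alpha:
            continue  # peak is not significantly different between left-in groups
        pcv = (mean(a) + mean(b)) / 2  # mid-point between the two left-in means
        high, low = labels if mean(a) > mean(b) else labels[::-1]
        votes[high if area > pcv else low] += 1
    return votes


# Hypothetical left-in samples: 670 m/Z is higher in NCC, 689 m/Z is higher in EUE.
ncc_left_in = [{670: 11, 689: 4}, {670: 12, 689: 5}, {670: 13, 689: 6}]
eue_left_in = [{670: 7, 689: 9}, {670: 8, 689: 10}, {670: 9, 689: 11}]
print(classify_left_out({670: 12, 689: 5}, ncc_left_in, eue_left_in))
```

The function handles a single left-out sample; a full LOOCV run would call it once per subject, each time rebuilding the two left-in groups without that subject.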
Each left-out sample is scored as the number of significant peak areas assigned to a specific group divided by the number of all significant peak areas in that group. We refer to this score as the % Total Group LOOCV classified peak areas, abbreviated as % Total Group LOOCV.

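As a small illustration of the score defined above, the sketch below turns a list of per-peak assignments into a percentage. The function name `percent_total_group` is hypothetical, and the denominator is assumed here to be the number of all significant peak areas evaluated for the sample; the text could also be read as restricting the denominator to peaks characteristic of one group.

```python
def percent_total_group(assignments, group):
    """Percentage of significant peaks at which a sample was assigned to `group`.

    ASSUMPTION: `assignments` holds one group label per significant peak area,
    and every significant peak counts in the denominator.
    """
    if not assignments:
        raise ValueError("no significant peak areas to score")
    return 100.0 * sum(label == group for label in assignments) / len(assignments)


# Hypothetical left-out sample assigned to NCC at 3 of 4 significant peaks.
print(percent_total_group(["NCC", "NCC", "EUE", "NCC"], "NCC"))
```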
The overall ability of the LOOCV approach to correctly classify subjects is determined by comparing the mean % Total Group LOOCV (for example mean % Total NCC LOOCV) between subjects from two study groups (for example, NCC and EUE). The p-value of the difference in the means is determined using a Student’s t-test with unequal variance.

2.5 Estimating the sensitivity and specificity of the ESI-MS approach to classify subjects
Means and standard deviations (SD) of the % Total Group LOOCV used to estimate the p-value of the difference between two study groups were also used to estimate the sensitivity and specificity of the ESI-MS LOOCV approach in classifying subjects. Cohen’s d effect size values are calculated from the % LOOCV means and standard deviations of the two groups being compared to gauge the magnitude of the difference observed (Cohen, 1988; Soper, 2018). A Cohen’s d value of 0.8 and above is interpreted as a large effect size (Cohen, 1988). The observed Cohen’s d values were then combined with the observed means and standard deviations of the two groups compared to estimate the statistical power for each comparison conducted, as described by Soper (Soper, 2018). Classification of subjects into the NCC or EUE groups is used as an example, with the assumption that the mean % Total NCC LOOCV is larger for the NCC group. First, a scale factor is calculated as follows:
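
$$\text{Scale factor} = \frac{\text{Mean \% Total NCC LOOCV}_{\text{NCC}} - \text{Mean \% Total NCC LOOCV}_{\text{EUE}}}{\text{SD \% Total NCC LOOCV}_{\text{NCC}} + \text{SD \% Total NCC LOOCV}_{\text{EUE}}}$$

Second, the cut-off threshold value to classify samples into the NCC or EUE groups is obtained as follows:

$$\text{Cut-off threshold} = \text{Mean \% Total NCC LOOCV}_{\text{NCC}} - \text{SD \% Total NCC LOOCV}_{\text{NCC}} \times \text{Scale factor}$$
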
Each subject’s % Total NCC LOOCV is then compared to the cut-off threshold and if above, the
214
subject is classified with the NCC group, otherwise, the subject is classified in the EUE group.
215
Taking NCC as, for example, the “infection” state and EUE as the “non-infection” state, this 9
ACCEPTED MANUSCRIPT
216
approach results in each sample being either correctly classified as NCC (True Positive or TP) or
217
EUE (True Negative or TN) or being wrongly classified as EUE (False Negative or FN) or NCC
218
(False Positive or FP). The sensitivity (Se = TP/TP+FN) and specificity (Sp = TN/TN+FP) are then
219
determined and the 95% confidence interval (95%CI) estimated using a binomial distribution.
RI PT
220
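The classification and accuracy calculations described above can be sketched as follows. This is an illustrative sketch only: the function names and toy scores are hypothetical, the cut-off is assumed to sit the same number of standard deviations (the scale factor) below the higher group mean as above the lower group mean, and a Wilson score interval is swapped in as one common binomial-based 95% interval because the text does not say which binomial method was used.

```python
from math import sqrt
from statistics import mean, stdev


def cutoff_threshold(scores_pos, scores_neg):
    """Cut-off between two groups of % Total Group LOOCV scores.

    ASSUMPTION: scale factor = (difference in means) / (sum of SDs), and the
    threshold lies that many SDs below the mean of the higher-scoring group.
    """
    scale = (mean(scores_pos) - mean(scores_neg)) / (stdev(scores_pos) + stdev(scores_neg))
    return mean(scores_pos) - scale * stdev(scores_pos)


def sensitivity_specificity(scores_pos, scores_neg, threshold):
    """Se = TP/(TP+FN) and Sp = TN/(TN+FP) for a 'score above threshold is positive' rule."""
    tp = sum(s > threshold for s in scores_pos)
    fn = len(scores_pos) - tp
    tn = sum(s <= threshold for s in scores_neg)
    fp = len(scores_neg) - tn
    return tp / (tp + fn), tn / (tn + fp)


def wilson_ci(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (stand-in for the paper's method)."""
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


# Hypothetical % Total NCC LOOCV scores for a positive and a negative group.
pos_scores, neg_scores = [70, 80, 90], [20, 30, 40]
cut = cutoff_threshold(pos_scores, neg_scores)
print(cut, sensitivity_specificity(pos_scores, neg_scores, cut))
print(wilson_ci(75, 76))  # interval for 75 of 76 subjects correctly classified
```

With equal spread in both toy groups the cut-off falls midway between the two means; when one group is more variable, the same rule moves the cut-off away from that group's mean.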
2.6 Randomizing subject allocation to assess the potential for over-fitting
Sample randomization was used to assess the potential for “over-fitting” (Baker et al., 2002). A randomized database (RND) was created by randomizing each sample (Fig. 1A) to one of two study groups while maintaining the original number of subjects, gender, and age group distribution in each group. The nested LOOCV approach described above is applied to the RND. The number of significant peak areas selected using the original dataset determines the number of peak areas selected in the RND. The resulting classification p-values are expected to be either non-significant or considerably larger than those obtained with the original dataset when good discrimination between groups is present. The cut-off thresholds to classify the randomized subjects are determined using the scale factor estimated with the original database. This results in different cut-off threshold values for the two groups being compared (Fig. 2), which in turn means that a subject randomized to the NCC group can have a % Total NCC LOOCV value which is both above the cut-off threshold for NCC (TP) and below the cut-off value for EUE (FN), resulting in this randomized subject being simultaneously classified in two groups.

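One way to build such a randomized database is to shuffle group labels within strata, which keeps the group sizes and the gender and age-group make-up of each group unchanged. The sketch below is an assumption about how this could be done, not the authors' procedure; the function name `randomize_groups` and the field names (`group`, `sex`, `age_group`) are hypothetical.

```python
import random
from collections import Counter, defaultdict


def randomize_groups(subjects, seed=0):
    """Return copies of `subjects` with group labels shuffled within (sex, age_group) strata.

    ASSUMPTION: shuffling labels inside each stratum is used to preserve the number
    of subjects per group together with each group's sex and age-group distribution.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for index, subject in enumerate(subjects):
        strata[(subject["sex"], subject["age_group"])].append(index)

    randomized = [dict(subject) for subject in subjects]
    for indices in strata.values():
        labels = [subjects[i]["group"] for i in indices]
        rng.shuffle(labels)  # permute labels only among subjects of the same stratum
        for i, label in zip(indices, labels):
            randomized[i]["group"] = label
    return randomized


def composition(subjects):
    """Count subjects per (group, sex, age_group) combination."""
    return Counter((s["group"], s["sex"], s["age_group"]) for s in subjects)


# Hypothetical subjects.
cohort = [
    {"id": 1, "group": "NCC", "sex": "F", "age_group": "18-30"},
    {"id": 2, "group": "EUE", "sex": "F", "age_group": "18-30"},
    {"id": 3, "group": "NCC", "sex": "M", "age_group": "31-51"},
    {"id": 4, "group": "NCC", "sex": "M", "age_group": "31-51"},
    {"id": 5, "group": "EUE", "sex": "M", "age_group": "31-51"},
]
print(composition(randomize_groups(cohort, seed=1)) == composition(cohort))
```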
2.7 Analysis of data from members of a smaller sized group
When small sized groups are analyzed, two larger groups are used to identify significant peak areas and corresponding PCVs as well as the % Total Group LOOCV cut-off thresholds. Each subject of the small group, referred to as the left-out “blinded sample group”, is classified at each significant peak area to obtain their % Total Group LOOCV and further classified according to the % Total Group LOOCV cut-off threshold. For the RND, subjects are randomized into three groups, including the small sized group, and then treated as described above.

2.8 Analysis of “left-out” data to determine the ability of the approach to classify new cases
The approach to analyze smaller groups can be applied to assess how “new” subjects classify into two groups. A set number of subjects are excluded from the LOOCV analysis and put into a “blind database”. The left-in subjects are put into a “training database” and analyzed with LOOCV to determine the cut-off threshold for classification of subjects in the blind database. A p-value for the classification of members of the blind database is estimated using the % Total Group LOOCV as described in the general approach.

3. Results
3.1 Demographics and characteristics of study groups
The socio-demographic and clinical characteristics of all recruited patients, i.e. 76 patients with NCC-associated epilepsy (including 29 SCG, 20 SCC and 27 MNCC), 29 with EUE, 17 with idiopathic headaches and 10 RNCC, were as reported earlier (Prabhakaran et al., 2017). Overall, 44 (57%) of 76 patients with NCC had a Definite NCC diagnosis and 32 had a Probable NCC diagnosis using the criteria of del Brutto et al. (Del Brutto et al., 2001). Among the idiopathic headache group, patients suffered from vascular and migraine headache (n=20), tension type headache (n=3) and unspecified headache (n=3). There were no statistical differences among the groups except for patients in the NCC and RNCC groups more frequently living near a pig-rearing household. Patients with MNCC were more often sero-positive for cysticercosis antigens and antibodies than those with single cysts. While all EUE patients reported two or more lifetime seizures, 9 SCG and 5 MNCC patients reported having had only one lifetime seizure. No SCC case reported only one lifetime seizure (see Supplementary Table 1).

3.2 Distinguishing NCC-associated epilepsy, EUE and idiopathic headache patients with serum mass profiling
ESI-MS distinguished patients with NCC-associated epilepsy from those with EUE and idiopathic headache (Fig. 2). The % Total NCC LOOCV clearly distinguished the NCC and EUE groups (Fig. 2A), with 75 of the 76 NCC subjects correctly classified as NCC (Se=99%) and all of the EUE subjects correctly classified (Sp=100%) (Table 1). The p-value of the classification was estimated at 2.6 × 10^-28. In contrast, the groups were not distinguished from each other using the RND, as represented by the much higher p-values and the finding that most randomized NCC cases were classified as both NCC and EUE (Fig. 2C) or as both NCC and idiopathic headache (Fig. 2D).

Similarly, NCC-associated epilepsy patients were distinguished from idiopathic headache patients with 74 of the 76 NCC subjects classified as NCC (Se=97%) and all of the idiopathic headache subjects classified as such (Sp=100%) (p-value=2.7 × 10^-12; Table 1). When RND was used, all but one subject were classified as both NCC and idiopathic headache (Fig. 2D). The larger p-value associated with the classification in Fig. 2B and 2D as compared to Fig. 2A and 2C is due in part to the smaller number of idiopathic headache subjects (n=17) compared with EUE (n=29).

3.3 Different forms of NCC can be distinguished by serum mass profiling
Subjects with SCG and SCC appeared most different from each other, with Se and Sp values of 100% each (p-value of 3.9 × 10^-25) (Table 1). Good discrimination was obtained among all three sub-groups (Fig. 3). The RND resulted in p-values that were all significant, indicating some degree of over-fitting; however, RND p-values were several orders of magnitude larger than p-values obtained from the actual data, and discrimination with this database was poor (Supplemental Data Fig. S1).

Next, patients with active NCC (29 SCG and 11 MNCC) or calcified lesions (20 SCC and 10 MNCC) only were compared to those with EUE. MNCC subjects (n=6) with both active and calcified lesions were not analyzed. Patients with EUE were distinct from patients with calcified NCC (p-value=1.4 × 10^-25) and from active NCC patients (p-value=8.2 × 10^-18) (Fig. 4A and 4B). Moreover, active NCC patients were also distinct from those with calcified NCC, with 38 out of 40 patients with active NCC (Se=95%) and 28 out of 30 with calcified NCC (Sp=93%) being correctly classified (Fig. 4C; Table 1, p-value=1.6 × 10^-19). The RND showed much larger p-values, although the comparison between active NCC and EUE may have slight over-fitting (p-value=0.02). However, the distinction among groups using the RND was poor, with most subjects simultaneously classified in two groups (Supplemental Data Fig. S2).

3.4 Analysis of NCC patients with and without brain edema
Our study population included 48 NCC patients with edema, but edema was not evenly distributed among the types of lesions (Suppl Table 1). To prevent the analysis from being overly influenced by the types of lesions, subjects were selected to balance the number of subjects with and without edema in each study sub-group and to frequency match for age and sex. Results shown in Fig. 4D illustrate discrimination between NCC patients with (n=20) and without (n=20) edema (p-value=1.8 × 10^-19). The % Total edema LOOCV cut-off threshold correctly classified the 40 NCC cases evaluated (Se=100%; Sp=100%) (Table 1) whereas the RND yielded a p-value of 0.02 and poor discrimination (Fig. 4D).

3.5 Assessing the classification of Recovered NCC (RNCC) with the other study groups
The 10 RNCC patients were analyzed as a left-out blinded sample group. When compared with the idiopathic headache and EUE patients, RNCC patients were best differentiated from the idiopathic headache group (p-value=8.0 × 10^-10), and appeared more similar to the EUE group, although the p-value for the latter comparison was significant (Fig. 5A; p-value=0.002). In contrast, RNCC patients were indistinguishable from the NCC group (p-value=0.1) while remaining distinctly different from the EUE group (Fig. 5B, p-value=4.1 × 10^-7). Collectively these data suggest that the RNCC mass peak profiles were more similar to NCC patients with visible lesions than to NCC-free subjects. Fig. 5C suggests more similarity between RNCC and SCG (p-value=2.0 × 10^-6) than between RNCC and SCC patients (p-value=3.3 × 10^-11). Even when testing RNCC sera against the more complicated relationship between active and calcified NCC patients (Fig. 5D), the RNCC profiles indicated higher similarity to those with active lesions (p-value=0.045) than with calcified lesions (p-value=1.1 × 10^-5).

All comparisons presented above showed Cohen’s d values well above 0.8, a value considered to indicate a large difference. The observed data suggested that the power of our analyses was above 90% for all comparisons. This suggests that our sample size was sufficient to observe the large differences which we found between groups (Table 1).

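For readers unfamiliar with the effect size used here, Cohen’s d can be computed from the means, standard deviations and sizes of two groups. The sketch below assumes the pooled-standard-deviation form of the statistic; the function name `cohens_d` and the input numbers are hypothetical and are not values from Table 1.

```python
from math import sqrt


def cohens_d(mean1, sd1, n1, mean2, sd2, n2):
    """Cohen's d using the pooled standard deviation of two independent groups.

    ASSUMPTION: the pooled-SD form is used; other variants (for example averaging
    the two variances without weighting by sample size) give slightly different values.
    """
    pooled_sd = sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2))
    return (mean1 - mean2) / pooled_sd


# Hypothetical mean % Total Group LOOCV scores for groups of 76 and 29 subjects.
d = cohens_d(80.0, 10.0, 76, 30.0, 10.0, 29)
print(d, d >= 0.8)  # values of 0.8 and above are read as a large effect
```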
3.6 Assessing the classification of “blinded” left-out samples
The left-in training dataset used to determine the group cut-off threshold when comparing the NCC to the EUE group is illustrated in Fig. 6A while Fig. 6B shows how 28 NCC and five EUE left-out subjects were classified. All five EUE were classified as such while 21 of 28 of the blinded NCC samples were classified correctly, for an estimated sensitivity of 75%. Fig. 6D exhibits a similar blind analysis of five active and five calcified NCC patient serum samples, tested against their training set (Fig. 6C). Nine out of 10 samples were identified correctly with a sub-group discriminatory p-value of 10^-4.

3.7 Performance of the Advion instrument
The Advion instrument performed reasonably well at distinguishing subjects in the NCC group from those in the EUE group (Suppl. Fig. S3A, p-value=1.0 × 10^-14) and those with active NCC from those with calcified NCC (Suppl. Fig. S3B, p-value=9.9 × 10^-17). However, the classification of subjects was not as good as that observed with the Advantage LCQ. Indeed, 64 out of 76 NCC (84.2%) and 24 out of 29 EUE (82.8%) were classified as such. A similar performance was observed in classifying the active NCC patients (40/46 or 87.0%) as compared to the calcified NCC patients (27/30 or 90%). All p-values for the RND were non-significant, suggesting that over-fitting was not an issue here. These results suggest that even a less accurate and lower resolution instrument with reduced m/Z range can detect enough mass spectrum signal differences between these groups, strengthening our conclusion that there are some biomolecules in the serum which differ among the study groups and that could, if identified, help in the diagnosis of NCC-associated epilepsy and of NCC lesions.

4. Discussion
This study reports an initial step towards developing minimally invasive and low-cost aids to diagnose NCC-associated epilepsy based on biomolecules identified from mass spectra. We used a LOOCV method combined with randomization of subjects to limit over-fitting of high-dimensional mass peak data. All comparisons showed very good discrimination among groups, whereas poor discrimination was observed with the RND, supporting the hypothesis that disease-specific perturbations contributed to measurable differences in serum (Hocker et al., 2011a; Hocker et al., 2011b). These observations were further supported by similar results using a different instrument, the Advion CMS, which employs a different type of mass analyzer, albeit with a reduced m/Z range (see Supplemental material and Suppl. Data Figure S3).
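
The combination of LOOCV with subject randomization described above can be sketched on synthetic data. This is a simplified illustration, not the authors' pipeline: the data are simulated, a Welch t statistic with an arbitrary cut-off of 2.0 stands in for the per-peak significance test, and all function names are hypothetical. Following the Figure 1 legend, each discriminating peak of a left-out subject is classified by comparing it to the midpoint between the two left-in group means.

```python
import random
from statistics import mean, stdev

def welch_t(a, b):
    """Welch t statistic for two lists of numbers."""
    se = (stdev(a) ** 2 / len(a) + stdev(b) ** 2 / len(b)) ** 0.5
    return (mean(a) - mean(b)) / se

def percent_group1_peaks(left_out, group1, group2, t_cut=2.0):
    """Percent of discriminating peaks whose left-out value falls on the group-1 side."""
    votes = []
    for j, value in enumerate(left_out):
        a = [subject[j] for subject in group1]
        b = [subject[j] for subject in group2]
        if abs(welch_t(a, b)) < t_cut:
            continue  # left-in groups do not differ enough at this peak
        cutoff = (mean(a) + mean(b)) / 2  # midpoint between the two left-in means
        votes.append((value > cutoff) == (mean(a) > mean(b)))
    return 100 * sum(votes) / len(votes) if votes else 50.0

def loocv(group1, group2):
    """Leave each subject out in turn and score it against the left-in subjects."""
    scores1 = [percent_group1_peaks(s, group1[:i] + group1[i + 1:], group2)
               for i, s in enumerate(group1)]
    scores2 = [percent_group1_peaks(s, group1, group2[:i] + group2[i + 1:])
               for i, s in enumerate(group2)]
    return scores1, scores2

def randomize(group1, group2, rng):
    """Shuffle subjects across the two groups, keeping the group sizes."""
    pooled = group1 + group2
    rng.shuffle(pooled)
    return pooled[:len(group1)], pooled[len(group1):]

# Synthetic "spectra": 40 peaks per subject, the first 10 shifted in group 1.
rng = random.Random(0)

def make_subject(shift):
    return [rng.gauss(shift if j < 10 else 0.0, 1.0) for j in range(40)]

group1 = [make_subject(2.0) for _ in range(20)]
group2 = [make_subject(0.0) for _ in range(15)]

real1, real2 = loocv(group1, group2)
rnd1, rnd2 = loocv(*randomize(group1, group2, rng))
print("actual groups, mean % group-1 peaks:", round(mean(real1), 1), round(mean(real2), 1))
print("randomized groups, mean % group-1 peaks:", round(mean(rnd1), 1), round(mean(rnd2), 1))
```

With a true group difference, the per-subject percentages of the two groups should separate clearly; after shuffling the group labels, any remaining separation reflects chance or over-fitting rather than pathology, which is the role the RND plays in the comparisons above.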

Serum mass peak profiles are hypothesized to result from tissue shedding and secretion of biomolecules into the bloodstream (Hocker et al., 2017; Hocker et al., 2015). The small spectral masses of 500-1200 m/Z that were analyzed comprise a lower-mass peptide “serome” and likely result from differential host tissue/organ exoprotease activities and other cell/tissue signaling activities (Villanueva et al., 2006). Possible mechanisms yielding differences between pathologies could involve “alarmin”-like molecules shed or secreted by differentially damaged or altered cells, which could trigger downstream responses in other cells (Bianchi, 2007).

Differentiating between seizures due to EUE and those due to NCC is clinically relevant and important for treatment, as is knowing whether brain edema is present. Using serum mass profiling, differences between NCC lesion types and between patients with and without edema were evident, despite all NCC patients having seizures. An interesting finding was that the serum mass profiles of RNCC patients segregated with NCC patients, in particular with those with SCG, rather than with EUE patients. However, RNCC patients segregated with EUE patients rather than with seizure-free idiopathic headache subjects. These results suggest that novel tests based on biomolecules corresponding to the mass peak areas showing differences could guide therapeutic decisions downstream of a diagnosis of NCC.

Our blinded analyses showed promise that biological differences between groups could be helpful in identifying new patients. These results are limited by the potential for data over-fitting and by the lack of identification of the molecular composition of the key discriminating peaks. Although the RND approach could demonstrate that age and gender were unlikely to be confounders, it is possible that other variables confounded the observed associations. However, it would be very difficult, if not impossible, to account for all potential confounders in a LOOCV model such as the one used here. In addition, this study was meant to be the first step in exploring the possibility of using mass spectrometry as a tool to differentiate among patients with different lesions and symptoms. While the comparison of the NCC group with the idiopathic headache group could have been subject to confounding due to the imbalanced distribution of several variables, other major comparisons, such as between SCG and MNCC, SCC and SCG, and MNCC and SCC, were conducted among patients with similar distributions of these same potential confounders. Yet the difference between the p-values of the study group comparisons (which were all highly significant) and those of the RND comparisons (which were much higher) was similar when the NCC sub-groups were compared as when the NCC group was compared to the idiopathic headache group. These observations suggest that confounders may play minimal roles in these group separations. Furthermore, the possibility of over-fitting was mitigated by RND analysis and by performing a blinded analysis of patient samples against a training database. All comparisons showed good power, as suggested by Cohen’s d. In addition, analyses are underway to determine the composition of disease-discriminating mass peaks more completely and to gain a better understanding of the molecules and mechanisms that differentiate the clinical groups.
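
For readers who want to check the effect sizes, the sketch below computes Cohen's d from group means and standard deviations. The choice of formula is an assumption: it uses the root mean square of the two standard deviations as the denominator, which is one common variant and is not spelled out in this section. The input values are the NCC and EUE % LOOCV means (SD) from Table 1, and the function name is hypothetical.

```python
from math import sqrt

def cohens_d(mean1, sd1, mean2, sd2):
    """Cohen's d with the root mean square of the two group SDs as denominator."""
    return (mean1 - mean2) / sqrt((sd1 ** 2 + sd2 ** 2) / 2)

# NCC vs. EUE: 52.1 (4.2) and 37.0 (3.4), % LOOCV mean (SD) from Table 1.
d = cohens_d(52.1, 4.2, 37.0, 3.4)
print(f"Cohen's d = {d:.2f}")
```

Under this assumption the result is close to the 3.95 listed for the NCC and EUE comparison in Table 1; a pooled SD weighted by group size would give a somewhat smaller value.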
Acknowledgments
We wish to thank all participants for their time and willingness to take part in this study. This work was supported by the National Institute of Neurological Disorders and Stroke in the U.S. [R21NS077466] and by the Department of Biotechnology in India [BT/MB/BRCP/06/2011] under the U.S.-India Bilateral Brain Research Collaborative Partnerships (U.S.-India BRCP). Further support was received from the National Institute of Neurological Disorders and Stroke and the Fogarty International Center [R01NS098891] under the Global Brain and Nervous System Disorders Research Across the Lifespan program.

Table 1: Estimated sensitivity and specificity values (95% CI) of the % Total Group LOOCV mass peak areas to classify subjects into their appropriate groups using the actual and randomized databases.

Group 1 | Group 2 | % LOOCV mean (SD), Group 1 | % LOOCV mean (SD), Group 2 | TPᵃ | TNᵇ | Sensitivityᶜ (95% CI) | Specificityᵈ (95% CI) | p-value of the classifications (actual data) | p-value of the classifications (randomized dataset) | Cohen’s d of the classifications
NCC | EUE | 52.1 (4.2) | 37.0 (3.4) | 75 | 29 | 99 (93; 100) | 100 (88; 100) | 2.6 × 10⁻²⁸ | 0.22 | 3.95
NCC | Idiopathic headache | 38.3 (5.0) | 18.1 (5.7) | 74 | 16 | 97 (91; 100) | 94 (71; 100) | 2.7 × 10⁻¹² | 0.08 | 3.76
EUE | Idiopathic headache | 51.7 (4.5) | 35.1 (3.4) | 29 | 17 | 100 (88; 100) | 100 (88; 100) | 8.2 × 10⁻¹⁸ | 0.12 | 4.16
SCG | SCC | 49.3 (4.3) | 26.6 (3.4) | 29 | 20 | 100 (88; 100) | 100 (83; 100) | 3.9 × 10⁻²⁵ | 0.03 | 5.85
SCC | MNCC | 63.7 (40.6) | 52.5 (2.9) | 19 | 24 | 95 (75; 100) | 89 (71; 98) | 6.2 × 10⁻¹¹ | 0.02 | 2.91
MNCC | SCG | 67.3 (4.3) | 46.5 (5.2) | 26 | 29 | 96 (81; 100) | 100 (88; 100) | 1.1 × 10⁻²² | 0.04 | 4.35
Active NCC | EUE | 56.3 (5.8) | 37.7 (6.5) | 37 | 27 | 93 (77; 98) | 93 (77; 99) | 8.2 × 10⁻¹⁸ | 0.02 | 3.01
Calcified NCC | EUE | 65.5 (4.3) | 46.1 (4.0) | 30 | 28 | 100 (88; 100) | 97 (82; 100) | 1.4 × 10⁻²⁵ | 0.28 | 4.67
Active NCC | Calcified NCC | 54.2 (5.7) | 34.5 (6.5) | 38 | 28 | 95 (83; 99) | 93 (78; 99) | 1.6 × 10⁻¹⁹ | 0.14 | 3.27
NCC with edema | NCC without edema | 59.2 (4.2) | 34.2 (5.0) | 20 | 20 | 100 (83; 100) | 100 (83; 100) | 1.8 × 10⁻¹⁹ | 0.02 | 5.41

ᵃ TP: Number of subjects truly from Group 1 classified as Group 1
ᵇ TN: Number of subjects truly from Group 2 classified as Group 2
ᶜ Sensitivity: Proportion of subjects truly from Group 1 classified as Group 1
ᵈ Specificity: Proportion of subjects truly from Group 2 classified as Group 2

References
Baker, S.G., Kramer, B.S., Srivastava, S., 2002. Markers for early detection of cancer: Statistical guidelines for nested case-control studies. BMC Medical Research Methodology 2, 4.
Bianchi, M.E., 2007. DAMPs, PAMPs and alarmins: all we need to know about danger. Journal of Leukocyte Biology 81, 1-5.
Carpio, A., Fleury, A., Romo, M.L., Abraham, R., Fandino, J., Duran, J.C., Cardenas, G., Moncayo, J., Leite Rodrigues, C., San-Juan, D., Serrano-Duenas, M., Takayanagui, O., Sander, J.W., 2016. New diagnostic criteria for neurocysticercosis: Reliability and validity. Ann Neurol 80, 434-442.
Cohen, J., 1988. Statistical Power Analysis for the Behavioral Sciences, 2 ed. Lawrence Erlbaum Associates, Hillsdale, NJ.
Coyle, C.M., 2014. Neurocysticercosis: an update. Curr Infect Dis Rep 16, 437.
Del Brutto, O.H., 2012. Diagnostic criteria for neurocysticercosis, revisited. Pathogens and Global Health 106, 299-304.
Del Brutto, O.H., Rajshekhar, V., White, A.C., Tsang, V.C.W., Nash, T.E., Takayanagui, O.M., Schantz, P.M., Evans, C.A.W., Flisser, A., Correa, D., Botero, D., Allan, J.C., Sarti, E., Gonzalez, A.E., Gilman, R.H., García, H.H., 2001. Proposed diagnostic criteria for neurocysticercosis. Neurology 57, 177-183.
Donadeu, M., Lightowlers, M.W., Fahrion, A.S., Kessels, J., Abela-Ridder, B., 2016. Taenia solium: WHO endemicity map update. The Weekly Epidemiological Record 91, 595-599.
Fiest, K.M., Sauro, K.M., Wiebe, S., Patten, S.B., Kwon, C.S., Dykeman, J., Pringsheim, T., Lorenzetti, D.L., Jette, N., 2017. Prevalence and incidence of epilepsy: A systematic review and meta-analysis of international studies. Neurology 88, 296-303.
Fisher, R.S., Acevedo, C., Arzimanoglou, A., Bogacz, A., Cross, J.H., Elger, C.E., Engel, J., Forsgren, L., French, J.A., Glynn, M., Hesdorffer, D.C., Lee, B.I., Mathern, G.W., Moshé, S.L., Perucca, E., Scheffer, I.E., Tomson, T., Watanabe, M., Wiebe, S., 2014. ILAE Official Report: A practical clinical definition of epilepsy. Epilepsia 55, 475-482.
García, H.H., Del Brutto, O.H., 2003. Imaging findings in neurocysticercosis. Acta Tropica 87, 71-78.
Garcia, H.H., Gonzales, I., Lescano, A.G., Bustos, J.A., Pretell, E.J., Saavedra, H., Nash, T.E., The Cysticercosis Working Group in Peru, 2014. Enhanced steroid dosing reduces seizures during antiparasitic treatment for cysticercosis and early after. Epilepsia 55, 1452-1459.
Guan, W., Zhou, M., Hampton, C.Y., Benigno, B.B., Walker, L.D., Gray, A., McDonald, J.F., Fernández, F.M., 2009. Ovarian cancer detection from metabolomic liquid chromatography/mass spectrometry data by support vector machines. BMC Bioinformatics 10, 259.
Hanas, J.S., Hocker, J.R., Cheung, J.Y., Larabee, J.L., Lerner, M.R., Lightfoot, S.A., Morgan, D.L., Denson, K.D., Prejeant, K.C., Gusev, Y., Smith, B.J., Hanas, R.J., Postier, R.G., Brackett, D.J., 2008. Biomarker Identification in Human Pancreatic Cancer Sera. Pancreas 36, 61-69.
Hocker, J.R., Deb, S.J., Li, M., Lerner, M.R., Lightfoot, S.A., Quillet, A.A., Hanas, R.J., Reinersman, M., Thompson, J.L., Vu, N.T., Kupiec, T.C., Brackett, D.J., Peyton, M.D., Dubinett, S.M., Burkhart, H.M., Postier, R.G., Hanas, J.S., 2017. Serum Monitoring and Phenotype Identification of Stage I Non-Small Cell Lung Cancer Patients. Cancer Investigation 35, 573-585.
Hocker, J.R., Lerner, M.R., Mitchell, S.L., Lightfoot, S.A., Lander, T.J., Quillet, A.A., Hanas, R.J., Peyton, M.D., Postier, R.G., Brackett, D.J., Hanas, J.S., 2011a. Distinguishing early-stage pancreatic cancer patients from disease-free individuals using serum profiling. Cancer Invest 29, 173-179.
Hocker, J.R., Peyton, M.D., Lerner, M.R., Lightfoot, S.A., Hanas, R.J., Brackett, D.J., Hanas, J.S., 2011b. Distinguishing non-small cell lung adenocarcinoma patients from squamous cell carcinoma patients and control individuals using serum profiling. Cancer Invest 30, 180-188.
Hocker, J.R., Postier, R.G., Li, M., Lerner, M.R., Lightfoot, S.A., Peyton, M.D., Deb, S.J., Baker, C.M., Williams, T.L., Hanas, R.J., Stowell, D.E., Lander, T.J., Brackett, D.J., Hanas, J.S., 2015. Discriminating patients with early-stage pancreatic cancer or chronic pancreatitis using serum electrospray mass profiling. Cancer Letters 359, 314-324.
John, C.C., Carabin, H., Montano, S.M., Bangirana, P., Zunt, J.R., Peterson, P.K., 2015. Global research priorities for infections that affect the nervous system. Nature 527, S178-186.
Luo, J., Wang, W., Xi, Z., Dan, C., Wang, L., Xiao, Z., Wang, X., 2014. Concentration of Soluble Adhesion Molecules in Cerebrospinal Fluid and Serum of Epilepsy Patients. Journal of Molecular Neuroscience 54, 767-773.
Nash, T.E., Garcia, H.H., 2011. Diagnosis and Treatment of Neurocysticercosis. Nature Reviews Neurology 7, 584-594.
Nash, T.E., Mahanty, S., Loeb, J.A., Theodore, W.H., Friedman, A., Sander, J.W., Singh, G., Cavalheiro, E., Del Brutto, O.H., Takayanagui, O.M., Fleury, A., Verastegui, M., Preux, P.M., Montano, S., Pretell, E.J., White, A.C., Jr., Gonzales, A.E., Gilman, R.H., Garcia, H.H., 2015. Neurocysticercosis: A natural human model of epileptogenesis. Epilepsia 56, 177-183.
Ndimubanzi, P.C., Carabin, H., Budke, C.M., Nguyen, H., Qian, Y.-J., Rainwater, E., Dickey, M., Reynolds, S., Stoner, J.A., 2010. A Systematic Review of the Frequency of Neurocysticercosis with a Focus on People with Epilepsy. PLOS Neglected Tropical Diseases 4, e870.
Newton, C.R., Garcia, H.H., 2012. Epilepsy in poor regions of the world. Lancet 380, 1193-1201.
Ngugi, A.K., Bottomley, C., Kleinschmidt, I., Sander, J.W., Newton, C.R., 2010. Estimation of the burden of active and life-time epilepsy: a meta-analytic approach. Epilepsia 51, 883-890.
Prabhakaran, V., Drevets, D.A., Ramajayam, G., Manoj, J.J., Anderson, M.P., Hanas, J.S., Rajshekhar, V., Oommen, A., Carabin, H., 2017. Comparison of monocyte gene expression among patients with neurocysticercosis-associated epilepsy, Idiopathic Epilepsy and idiopathic headaches in India. PLOS Neglected Tropical Diseases 11, e0005664.
Rajshekhar, V., Chandy, M.J., 1997. Validation of diagnostic criteria for solitary cerebral cysticercus granuloma in patients presenting with seizures. Acta Neurologica Scandinavica 96, 76-81.
Ransohoff, D.F., 2004. Rules of evidence for cancer molecular-marker discovery and validation. Nat Rev Cancer 4, 309-314.
Richter, R., Schulz-Knappe, P., Schrader, M., Standker, L., Jurgens, M., Tammen, H., Forssmann, W.G., 1999. Composition of the peptide fraction in human blood plasma: database of circulating human peptides. J Chromatogr B Biomed Sci Appl 726, 25-35.
Rodriguez, S., Wilkins, P., Dorny, P., 2012. Immunological and molecular diagnosis of cysticercosis. Pathogens and Global Health 106, 286-298.
Sako, Y., Takayanagui, O.M., Odashima, N.S., Ito, A., 2015. Comparative Study of Paired Serum and Cerebrospinal Fluid Samples from Neurocysticercosis Patients for the Detection of Specific Antibody to Taenia solium Immunodiagnostic Antigen. Tropical Medicine and Health 43, 171-176.
Soper, D., 2018. Free Statistical Calculators, 4.0 ed.
Tuck, M.K., Chan, D.W., Chia, D., Godwin, A.K., Grizzle, W.E., Krueger, K.E., Rom, W., Sanda, M., Sorbara, L., Stass, S., Wang, W., Brenner, D.E., 2009. Standard Operating Procedures for Serum and Plasma Collection: Early Detection Research Network Consensus Statement Standard Operating Procedure Integration Working Group. Journal of Proteome Research 8, 113-117.
Vachani, A., Pass, H.I., Rom, W.N., Midthun, D.E., Edell, E.S., Laviolette, M., Li, X.-J., Fong, P.-Y., Hunsucker, S.W., Hayward, C., Mazzone, P.J., Madtes, D.K., Miller, Y.E., Walker, M.G., Shi, J., Kearney, P., Fang, K.C., Massion, P.P., 2015. Validation of a Multiprotein Plasma Classifier to Identify Benign Lung Nodules. Journal of Thoracic Oncology 10, 629-637.
Vezzani, A., 2005. Inflammation and Epilepsy. Epilepsy Currents 5, 1-6.
Villanueva, J., Shaffer, D.R., Philip, J., Chaparro, C.A., Erdjument-Bromage, H., Olshen, A.B., Fleisher, M., Lilja, H., Brogi, E., Boyd, J., Sanchez-Carbayo, M., Holland, E.C., Cordon-Cardo, C., Scher, H.I., Tempst, P., 2006. Differential exoprotease activities confer tumor-specific serum peptidome patterns. The Journal of Clinical Investigation 116, 271-284.

Figure captions and legends
Figure 1. A, Flowchart of the steps taken for an electrospray ionization mass spectrometry analysis, using the comparison between neurocysticercosis (NCC) and epilepsy of unknown etiology (EUE) as an example. B, Example of statistically significant differences in LOOCV means of normalized spectral mass peaks seen between 650 and 750 m/Z when data from 75 NCC and 29 EUE subjects are left in while 1 NCC case is left out.
Legend: * indicates that the difference in a normalized spectral mass peak mean between the NCC and EUE left-in subjects is statistically significant. The horizontal bar indicates the median between the two means and corresponds to the LOOCV peak classification value.
Abbreviations: LOOCV: Leave One Out Cross Validation; NCC: neurocysticercosis; EUE: epilepsy of unknown etiology.

Figure 2. Percent total (randomized) NCC LOOCV classified mass peaks of each study subject and of randomized subjects in relation to the cut-off thresholds used to classify subjects into one of two groups, with the p-values corresponding to the difference in the means of the two groups being compared. A, Comparison of subjects with neurocysticercosis (NCC) and with epilepsy of unknown etiology (EUE). B, Comparison of subjects with NCC and headaches. C, Comparison of subjects randomized to either the NCC or EUE groups. D, Comparison of subjects randomized to the NCC or headache groups.
Abbreviations: NCC: neurocysticercosis; EUE: epilepsy of unknown etiology; LOOCV: leave one out cross validation; RND: randomized database; SD: standard deviation.

Figure 3. Percent total group LOOCV classified mass peaks of each subject in the three NCC sub-groups in relation to the cut-off thresholds used to classify subjects into one of two groups, the p-value corresponding to the difference in the means of the two groups being compared, and the p-value obtained with the corresponding randomized database. A, Comparison of subjects with multiple neurocysticercosis (MNCC) and with single cysticercus granuloma (SCG). B, Comparison of subjects with SCG and single calcified cyst (SCC). C, Comparison of subjects with SCC and MNCC.
Abbreviations: NCC: neurocysticercosis; MNCC: multiple neurocysticercosis; SCG: single cysticercus granuloma; SCC: single calcified cyst; LOOCV: leave one out cross validation; SD: standard deviation.

Figure 4. Percent total group LOOCV classified mass peaks of each subject with calcified neurocysticercosis (NCC), active NCC, NCC with or without edema, or epilepsy of unknown etiology (EUE) in relation to the cut-off thresholds used to classify subjects into one of two groups, the p-value corresponding to the difference in the means of the two groups being compared, and the p-value obtained with the corresponding randomized database. A, Comparison of subjects with calcified NCC and with EUE. B, Comparison of subjects with active NCC and with EUE. C, Comparison of subjects with active NCC and calcified NCC. D, Comparison of NCC subjects with edema* and without edema**.
Abbreviations: NCC: neurocysticercosis; LOOCV: leave one out cross validation; SD: standard deviation.
Legend: NCC with edema*: There were a total of 48 NCC subjects with edema. Among these, 6/6 multiple NCC patients with mixed lesions, 2/2 multiple NCC patients with calcified cysts only, 1/10 multiple NCC patients with active cysts only, 4/4 single calcified cyst patients and 7/26 single cysticercus granuloma patients were included in this analysis.
NCC without edema**: There were a total of 28 NCC subjects without edema. Among these, 1/1 multiple NCC patient with active cysts only, 8/8 multiple NCC patients with calcified cysts only, 8/16 single calcified cyst patients and 3/3 single cysticercus granuloma patients were included in this analysis.

Figure 5. Classification of patients with recovered NCC (RNCC) according to their percent total group LOOCV classified mass peaks, using the percent total group LOOCV classified mass peaks and cut-off thresholds from subjects in other groups. A, Classification of RNCC patients using data from the idiopathic headache (headache) and epilepsy of unknown etiology (EUE) patients. B, Classification of the RNCC patients using data from the EUE and the NCC patients. C, Classification of the RNCC patients using data from the single cysticercus granuloma (SCG) and single calcified cyst (SCC) patients. D, Classification of the RNCC patients using data from patients with active and calcified NCC.
Abbreviations: EUE: epilepsy of unknown etiology; NCC: neurocysticercosis; RNCC: recovered neurocysticercosis; SCG: single cysticercus granuloma; SCC: single calcified cyst; LOOCV: leave one out cross validation; SD: standard deviation.

Figure 6. Classification of 28 neurocysticercosis (NCC), five epilepsy of unknown etiology (EUE), five active NCC and five calcified NCC samples taken out of the original database (blind samples) according to their percent Total group LOOCV classified mass peaks, using data from the remaining samples (training set) to determine the group cut-off thresholds. A, Percent Total NCC LOOCV classification mass peaks and group cut-off threshold for classifying subjects into the NCC or EUE groups using the data from 48 NCC and 24 EUE subjects in the training dataset. B, Classification of 28 NCC and five EUE subjects not included in the training dataset according to their % Total NCC LOOCV classification mass peaks compared to the group cut-off threshold obtained in (A). C, Percent Total active NCC classification mass peaks and group cut-off threshold for classifying subjects into the active NCC or calcified NCC groups using the data from 35 active NCC and 25 calcified NCC subjects in the training dataset. D, Classification of five active NCC and five calcified NCC subjects not included in the training dataset according to their % Total NCC LOOCV classification mass peaks compared to the group cut-off threshold obtained in (C).

Supplementary Figure captions and legends
Supplementary Data Figure S1. Percent total randomized group LOOCV classified mass peaks of each subject randomized in the three NCC sub-groups in relation to the randomized cut-off thresholds used to classify subjects into one of two groups, and the p-value obtained with the corresponding randomized database. A, Comparison of subjects randomized to the multiple neurocysticercosis (MNCC) and single cysticercus granuloma (SCG) sub-groups. B, Comparison of subjects randomized to the SCG and single calcified cyst (SCC) sub-groups. C, Comparison of subjects randomized to the SCC and MNCC sub-groups.
Abbreviations: NCC: neurocysticercosis; MNCC: multiple neurocysticercosis; SCG: single cysticercus granuloma; SCC: single calcified cyst; LOOCV: leave one out cross validation; RND: randomized database; SD: standard deviation.

Supplementary Data Figure S2. Percent total randomized group LOOCV classified mass peaks of each subject randomized to calcified neurocysticercosis (NCC), active NCC, NCC with or without edema, or epilepsy of unknown etiology (EUE) in relation to the randomized cut-off thresholds used to classify subjects into one of two groups, and the p-value obtained with the corresponding randomized database. A, Comparison of subjects randomized to calcified NCC and to EUE. B, Comparison of subjects randomized to active NCC and to EUE. C, Comparison of subjects randomized to calcified NCC and to active NCC. D, Comparison of subjects randomized to NCC with edema* or without edema**.
Abbreviations: NCC: neurocysticercosis; EUE: epilepsy of unknown etiology; LOOCV: leave one out cross validation; RND: randomized database; SD: standard deviation.
NCC with edema*: There were a total of 48 NCC subjects with edema. Among these, 6/6 multiple NCC patients with mixed lesions, 2/2 multiple NCC patients with calcified cysts only, 1/10 multiple NCC patients with active cysts only, 4/4 single calcified cyst patients and 7/26 single cysticercus granuloma patients were included in this analysis.
NCC without edema**: There were a total of 28 NCC subjects without edema. Among these, 1/1 multiple NCC patient with active cysts only, 8/8 multiple NCC patients with calcified cysts only, 8/16 single calcified cyst patients and 3/3 single cysticercus granuloma patients were included in this analysis.

Supplementary Data Figure S3. Results using the Advion desktop instrument showing the percent total (randomized) NCC LOOCV classified mass peaks of each study subject and of randomized subjects in relation to the cut-off thresholds used to classify subjects into one of two groups, with the p-values corresponding to the difference in the means of the two groups being compared. A, Comparison of subjects with neurocysticercosis (NCC) and with epilepsy of unknown etiology (EUE). B, Comparison of subjects with NCC and headaches. C, Comparison of subjects randomized to either the NCC or EUE groups. D, Comparison of subjects randomized to the NCC or headache groups.
Abbreviations: NCC: neurocysticercosis; EUE: epilepsy of unknown etiology; LOOCV: leave one out cross validation; RND: randomized database; SD: standard deviation.

[Figure 1A, flowchart: Subject samples (NCC patient serum; EUE patient serum); Dilute; All-liquid ESI-MS sample analysis; Mass peak processing prior to database analysis; Subject groups LOOCV* data analysis (NCC vs. EUE); Plot % of NCC classified serum mass peaks vs. patient/subject number (true pathology: p-value distribution; random grouping: p-value distribution). *Leave One Out Cross Validation (LOOCV)]

Highlights
• Patients with NCC and epilepsy of unknown etiology were compared by serum mass profiling
• Patients with NCC, epilepsy of unknown etiology and headache had distinct spectral signals
• NCC patients with different types of lesions had distinct mass profiles
• Analysis of serum biomolecules could be used to diagnose NCC.