Schizophrenia Research 149 (2013) 108–111
Schizophrenia Research journal homepage: www.elsevier.com/locate/schres
Family-wise automatic classification in schizophrenia
René C.W. Mandl ⁎, Rachel M. Brouwer, Wiepke Cahn, René S. Kahn, Hilleke E. Hulshoff Pol
Department of Psychiatry, University Medical Center Utrecht, Rudolf Magnus Institute of Neuroscience, The Netherlands
Article info
Article history: Received 6 March 2013; Received in revised form 5 June 2013; Accepted 1 July 2013; Available online 20 July 2013
Keywords: Early detection; Heritability; Support vector machine; MRI; Automatic classification; Contextual information
Abstract
Automatic classification of individuals at increased risk for schizophrenia can become an important screening method that allows for early intervention based on disease markers, if proven to be sufficiently accurate. Conventional classification methods typically consider information from single subjects, thereby ignoring (heritable) features of the person's relatives. In this paper we show that the inclusion of these features can lead to an increase in classification accuracy from 0.54 to 0.72 using a support vector machine model. This inclusion of contextual information is especially useful in diseases where the classification features carry a heritable component.
© 2013 Elsevier B.V. All rights reserved.
⁎ Corresponding author at: Department of Psychiatry, University Medical Center Utrecht, Heidelberglaan 100, 3584CX, Utrecht, The Netherlands, hpnr A01.126. Tel.: +31 0887559705; fax: +31 0887555443. E-mail address: [email protected] (R.C.W. Mandl).
0920-9964/$ – see front matter © 2013 Elsevier B.V. All rights reserved. http://dx.doi.org/10.1016/j.schres.2013.07.002
1. Introduction
Automatic classification methods may become a helpful tool for predicting an increased risk for schizophrenia, allowing for early intervention (Kloppel et al., 2012). Classification methods operating on MRI data are of particular interest (Arribas et al., 2010; Castro et al., 2011; Castellani et al., 2012; Nieuwenhuis et al., 2012) because of the noninvasive nature of this modality. Several studies show impressive results in terms of both a high sensitivity and a high specificity when applying such methods to classify schizophrenia patients who have been ill or have been using antipsychotic medication for a considerable time period. Unfortunately, the overall performance of automatic classification methods remains modest when applied for early detection of the disease, and a substantial increase in performance is required before these methods may become of practical use (Strobl et al., 2012). These automatic classification methods are designed to operate on single subject information, ignoring both the fact that schizophrenia is in part a heritable disease (Aukes et al., 2008; Brans et al., 2008; Derks et al., 2012; Mulle, 2012; Turner et al., 2012; van Haren et al., 2012; Veltman and Brunner, 2012; Wray and Gottesman, 2012) and, conversely, the possibility that compensatory mechanisms may be detectable in brain morphology in healthy relatives (e.g. a higher than normal fiber integrity for the healthy siblings (Boos et al., 2013; Gogtay et al., 2012)). By changing the question ‘does this person have a high risk for schizophrenia?’ to ‘does this person,
given information from his or her relative(s), have a high risk for schizophrenia?’ one can make additional use of both types of information, thereby possibly increasing the performance of the automatic classification. Thus, by adding information from unaffected siblings we not only increase the accuracy of the classification algorithm by controlling for nonspecific environmental and genetic effects that impact on the classification measures but we also increase accuracy by including the effects of possible compensatory mechanisms that alter brain morphology of healthy siblings in the classification process. To test this hypothesized increase in performance we compare two different automatic classification approaches: the ‘single subject’ model, which is based solely on information from the subject to be classified and the ‘family-wise’ model, which also includes information on the subject's sibling. Here a support vector machine (SVM) – a supervised machine learning algorithm – is used to perform the automatic classifications. To allow for a fair comparison, in both approaches the SVM operates on the same number of subjects (n = 80) as well as on the same classification features.
2. Methods
2.1. Subjects
A total of 77 patients with schizophrenia, 77 of their healthy siblings and 20 healthy control sibling pairs (n = 40) were included in this study (see Table 1). These subjects are all part of a sample that has been described earlier (Boos et al., 2012, 2013). After complete description of the study to the subjects, written informed consent was obtained. Subjects with a major medical or neurological illness were excluded. All subjects were assessed with the Comprehensive Assessment of Symptoms and History (CASH) (Andreasen et al., 1992)
R.C.W. Mandl et al. / Schizophrenia Research 149 (2013) 108–111
Table 1 Subject characteristics.

Sample                                                      Patients                 Siblings                 Controls
Nr of subjects                                              77                       77                       40
Age in years: mean (SD)                                     26.8 (5.2)               26.9 (5.9)               28.7 (8.7)
Sex: male/female                                            65/12                    35/42                    21/19
Handedness: right/left                                      72/5                     61/16                    37/3
IQ-score: mean (SD) [range]                                 97.7 (14.9) [68–136]     97.8 (16.3) [63–152]     108.9 (14.7) [78–135]
PANSS-positive symptoms score: mean (SD) [range]            14.9 (5.7) [7–30]        n.a.                     n.a.
PANSS-negative symptoms score: mean (SD) [range]            15.3 (5.3) [7–31]        n.a.                     n.a.
PANSS-total symptoms score: mean (SD) [range]               61.7 (16.8) [30–107]     n.a.                     n.a.
Illness duration at scan time in years: mean (SD) [range]   4.3 (3.7) [0.0–17.5]     n.a.                     n.a.
Patients on atypical medication                             3                        n.a.                     n.a.
Patients on typical medication                              64                       n.a.                     n.a.
Patients medication unknown                                 10                       n.a.                     n.a.
performed by at least one independent rater who was trained to assess this interview. Diagnosis was based on the DSM-IV criteria. No healthy control subject met the criteria for any DSM-IV axis I disorder at time of inclusion. Healthy control sibling pairs had no first- or second-degree family members with a lifetime psychotic disorder. The Intelligence Quotient (IQ) of each subject was estimated based on four subtests of the Dutch version of the Wechsler Adult Intelligence Scale (WAIS) (Information, Arithmetic, Block Design and Digit Symbol Coding). All individuals received a 1.5 Tesla magnetic resonance imaging scan of the whole brain. Acquisition and post-processing of the volumetric MRI data and micro-structural MRI data have been described in detail elsewhere (Mandl et al., 2010; Boos et al., 2012, 2013).
2.2. Classification
In this study we propose to use a model for classification that includes familial information. We do not aim to present an optimal set of classification features. The combination of features used here was selected only because they have been shown to carry a heritable component and/or are known to show disease-related differences that are not related to antipsychotic medication use (DeLisi et al., 1985; Altamura et al., 2012; Boos et al., 2013; Castellani et al., 2012; Derks et al., 2012; Mandl et al., 2012; Terwisscha van Scheltinga et al., 2013). The features used to classify are identical in both models and are a combination of volumetric brain measures, micro-structural measures of two major white matter fiber bundles and total IQ. The volumetric measures include the volume of the lateral ventricles, the third ventricle, the cerebral white matter (all normalized with respect to cerebral brain volume) and the intracranial volume.
The mean diffusivity (computed on the diffusion tensor images) and the magnetization transfer ratio (computed on magnetization transfer images) were computed for the left arcuate fasciculus and the right uncinate fasciculus to measure micro-structural properties of these white matter fiber bundles. The classification experiments were carried out with the statistical program R version 2.15.2 (www.R-project.org) using a nonlinear bound-constraint support vector machine (polynomial kernel) from the kernlab package. To compare both models we computed the sensitivity, the specificity and the accuracy for the classification results. Sensitivity is defined by TP/(TP + FN) and specificity by TN/(TN + FP), where TP denotes true positives, FP denotes false positives, TN denotes true negatives and FN denotes false negatives. The accuracy is defined by (sensitivity + specificity)/2. Fig. 1 details the experimental setup for both classification models.
3. Results
For the single subject model the mean sensitivity was 0.55 (SD: 0.1), the mean specificity was 0.54 (0.1) and the mean accuracy was 0.54 (0.1) while for the family-wise model the mean sensitivity was
0.72 (0.1), the mean specificity was 0.73 (0.1) and the mean accuracy was 0.72 (0.1). The corresponding histograms are shown in Fig. 2.
4. Discussion
Conventional classification methods typically consider single subjects. This ignores how the features of the person to be classified relate to those of the person's relatives. The inclusion of this contextual information is especially useful in cases where the classification features have a familial (heritable) component in common with the disease but also when the classification features are sensitive to possible compensatory mechanisms (both are the case for schizophrenia (Aukes et al., 2008; Brans et al., 2008; Boos et al., 2013; Derks et al., 2012; Gogtay et al., 2012; Mulle, 2012; Turner et al., 2012; van Haren et al., 2012; Veltman and Brunner, 2012; Wray and Gottesman, 2012)). The results of our experiment show that an increase of 0.18 in overall classification accuracy (from 0.54 to 0.72) may be obtained by including such contextual information. The increased accuracy could make automatic classification useful in a clinical setting to estimate the increased risk for schizophrenia prior to the manifestation of the disease. Of course, this would require an additional effort from the person's relatives. We note that these relatives need not be siblings per se. In this particular study we used siblings to demonstrate the advantage of the family-wise model. If a dataset for classification were acquired from scratch, however, including parental data should be considered because most parents would be more than willing to participate when it concerns the health of their child. The added value of including sibling information in the classification model reflects at least in part the increased genetic risk for the disease in structural brain measures and IQ.
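For reference, the sensitivity, specificity and accuracy values compared here can be recomputed from confusion-matrix counts. The following is a minimal sketch in Python, not the R/kernlab code used in the study. It assumes the standard definitions (sensitivity = TP/(TP + FN), specificity = TN/(TN + FP)) and takes accuracy as the mean of the two, as in Section 2.2. The function name and the example counts are hypothetical.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts.

    Assumption: standard definitions; accuracy is the mean of
    sensitivity and specificity, as stated in the text.
    """
    sensitivity = tp / (tp + fn)  # fraction of patients correctly identified
    specificity = tn / (tn + fp)  # fraction of controls correctly identified
    accuracy = (sensitivity + specificity) / 2
    return sensitivity, specificity, accuracy


# Hypothetical counts for one leave-one-out run over 40 patients and 40 controls.
sens, spec, acc = classification_metrics(tp=30, fn=10, tn=28, fp=12)
```

With equal numbers of patients and controls, as in both models here, this accuracy coincides with the plain fraction of correctly classified cases.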
However, in addition, increased classification accuracy by adding sibling information may stem from shared environmental factors, such as an urban environment (van Os et al., 2010), which was recently found to link to social stress processing in the brain of healthy individuals (Lederbogen et al., 2011). The relatively small size of the data set used in our experiment led to substantial variability in estimates of the different measures (for instance, the 1000 accuracy estimates using the family-wise model range from 0.37 to 0.98). Nonetheless, even with this small dataset we demonstrated the clear advantage of context inclusion. However, for real world applications larger datasets are needed for training the classification algorithms to increase stability (Nieuwenhuis et al., 2012). This study is not about the selection of optimal classification features. In fact, the levels of sensitivity, specificity and accuracy reported in this study are relatively low. This study is about the comparison of two classification models (the single subject model and the family-wise model) utilizing the same features for which it was already known that they carry a heritable component. Because both classification models operate on the same set of features, the increased performance of the family-wise model is an asset of the
[Fig. 1 diagram. Single subject model (77 patients, 77 siblings): repeat 1000 times: randomly select 40 patients (from 77 patients); 40 patients plus 40 control subjects (20 controls, 20 siblings) give 80 subjects; leave one out (79 + 1). Family-wise model (77 patients, 77 siblings): repeat 1000 times: randomly select 20 patient/sibling pairs (from 77 patient/sibling pairs); 20 patient/sibling pairs (40 subjects) plus 20 control/sibling pairs (40 subjects) give 80 subjects; leave one out (39 + 1).]
Fig. 1. Experimental setup for single subject model and family-wise model. Single subject model: In the single subject model we included the 40 healthy comparison subjects and 40 randomly selected unrelated subjects from the 77 patients with schizophrenia. For these 80 subjects we then computed the mean specificity, mean sensitivity and mean accuracy using leave-one-out cross validation. In a leave-one-out validation one subject is removed from the set and the reduced set (that is, 79 subjects) is then used to train the classifier. The classifier is then used to classify the previously removed subject. This procedure is repeated for each of the 80 subjects in the set. This leave-one-out cross validation was repeated 1000 times with a different random selection of patients, after which the overall mean sensitivity, specificity and accuracy were computed. Family-wise model: In the family-wise model we included the 20 healthy control/sibling pairs and 20 randomly selected patient/sibling pairs from the 77 patient/sibling pairs available. Note that, similar to the single subject model, this set also contains 80 subjects but here the classification is applied at family level (40 pairs). For IQ and the volumetric and micro-structural brain measures, the difference between schizophrenia patient and sibling (or healthy control subject and sibling), normalized to their sum, was used for classification instead of the single volumetric and micro-structural measures as used in the single subject model. In this way the number of features in the model was kept equal to the number of features used in the single subject model. In each leave-one-(pair)-out step the first subject of the left-out pair was classified. Identical to the single subject model, this procedure was repeated 1000 times, after which the overall mean sensitivity, specificity and accuracy were computed.
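The setup in Fig. 1 can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the authors' implementation. The study used a polynomial-kernel SVM from the R package kernlab, whereas the sketch swaps in a simple nearest-centroid classifier so that it runs with the standard library only. All function names are hypothetical, and the family-wise feature is taken to be (a − b)/(a + b) per measure, following the caption's "difference ... normalized to their sum".

```python
import random
from statistics import mean


def family_wise_features(first, sibling):
    """Per-feature difference normalized to the sum: (a - b) / (a + b).

    Assumption: all measures are positive, so a + b is never zero.
    """
    return [(a - b) / (a + b) for a, b in zip(first, sibling)]


def predict_nearest_centroid(train_x, train_y, x):
    """Stand-in classifier (NOT the polynomial-kernel SVM used in the study)."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(train_y)):
        rows = [r for r, y in zip(train_x, train_y) if y == label]
        centroid = [mean(col) for col in zip(*rows)]
        dist = sum((xi - ci) ** 2 for xi, ci in zip(x, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def leave_one_out(xs, ys):
    """One leave-one-out run; label 1 = patient (or patient pair), 0 = control."""
    tp = fn = tn = fp = 0
    for i in range(len(xs)):
        # Train on everything except item i, then classify item i.
        pred = predict_nearest_centroid(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:], xs[i])
        if ys[i] == 1:
            tp += pred == 1
            fn += pred != 1
        else:
            tn += pred == 0
            fp += pred != 0
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity, (sensitivity + specificity) / 2


def repeated_leave_one_out(patients, controls, n_select, n_repeats, seed=0):
    """Repeat leave-one-out with a fresh random patient selection each time.

    Returns the mean (sensitivity, specificity, accuracy) over all repeats.
    """
    rng = random.Random(seed)
    runs = []
    for _ in range(n_repeats):
        chosen = rng.sample(patients, n_select)
        xs = chosen + controls
        ys = [1] * n_select + [0] * len(controls)
        runs.append(leave_one_out(xs, ys))
    return tuple(mean(col) for col in zip(*runs))
```

Under these assumptions, the single subject model would correspond to passing the feature vectors of the 77 patients and the 40 control subjects with `n_select=40` and `n_repeats=1000`. The family-wise model would correspond to passing `family_wise_features` vectors for the 77 patient/sibling pairs and the 20 control sibling pairs with `n_select=20`, so that each leave-one-out step trains on 39 pairs.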
model and is irrespective of the level of quality of the classification features or the particular properties of this dataset. As an example for the latter, the male/female distributions between patients and healthy participants differ but this affects both classification models equally and has therefore no effect on the reported results. To illustrate this we split the results with respect to gender and found for the single subject model for males 0.67, 0.42 and 0.54, and for females 0.31, 0.76 and 0.54, respectively for mean sensitivity, mean specificity and mean accuracy. For the family-wise model these values were for males 0.78, 0.63 and 0.70, and for females 0.52, 0.87 and 0.69. Here an imbalance appears to exist for mean sensitivity and mean specificity values between males and females (i.e. the results for males show a higher average sensitivity and a lower specificity while for females a
lower sensitivity and higher specificity are found). However, this imbalance affects both models in an equal way. We note that this imbalance in results between males and females suggests that in a real world application two separate models for males and females should be used. For this proof of principle we used a small number of classification features because for such a small example dataset the inclusion of non-heritable classification features could add a substantial amount of noise. Determining which set(s) of classification features is optimal for a family-wise classification model to predict, for example, an increased risk for schizophrenia will be part of future research. In this study we show that inclusion of contextual information (the family-wise model) leads to a substantial increase in performance of automatic classification methods. This way of classification may not
Fig. 2. Histogram results for single subject model (A) and family-wise model (B).
only be beneficial in schizophrenia (or other psychiatric diseases) but is applicable in all diseases where familial (heritable) factors and/or possible compensatory mechanisms detectable in brain morphology are involved.
Role of funding source
None.
Contributors
René Mandl designed the study, performed the analysis, wrote the article and approved the final version of the manuscript. Rachel Brouwer made a substantial contribution to the design of the study and the interpretation of the data, critically reviewed the manuscript and approved the final version of the manuscript. Wiepke Cahn, René Kahn and Hilleke Hulshoff Pol made a substantial contribution to the acquisition and interpretation of the data, critically reviewed the manuscript and approved the final version.
Conflicts of interest
All authors report no conflicts of interest.
Acknowledgment
There are no further acknowledgments.
References
Altamura, M., Fazio, L., De Salvia, M., Petito, A., Blasi, G., Taurisano, P., Romano, R., Gelao, B., Bellomo, A., Bertolino, A., 2012. Abnormal functional motor lateralization in healthy siblings of patients with schizophrenia. Psychiatry Res. 203 (1), 54–60.
Andreasen, N.C., Flaum, M., Arndt, S., 1992. The Comprehensive Assessment of Symptoms and History (CASH). An instrument for assessing diagnosis and psychopathology. Arch. Gen. Psychiatry 49 (8), 615–623.
Arribas, J.I., Calhoun, V.D., Adali, T., 2010. Automatic Bayesian classification of healthy controls, bipolar disorder, and schizophrenia using intrinsic connectivity maps from FMRI data. IEEE Trans. Biomed. Eng. 57 (12), 2850–2860.
Aukes, M.F., Alizadeh, B.Z., Sitskoorn, M.M., Selten, J.P., Sinke, R.J., Kemner, C., Ophoff, R.A., Kahn, R.S., 2008. Finding suitable phenotypes for genetic studies of schizophrenia: heritability and segregation analysis. Biol. Psychiatry 64 (2), 128–136.
Boos, H.B., Cahn, W., van Haren, N.E., Derks, E.M., Brouwer, R.M., Schnack, H.G., Hulshoff Pol, H.E., Kahn, R.S., 2012. Focal and global brain measurements in siblings of patients with schizophrenia. Schizophr. Bull. 38 (4), 814–825.
Boos, H.B., Mandl, R.C., van Haren, N.E., Cahn, W., van Baal, G.C., Kahn, R.S., Hulshoff Pol, H.E., 2013. Tract-based diffusion tensor imaging in patients with schizophrenia and their non-psychotic siblings. Eur. Neuropsychopharmacol. 23 (4), 295–304.
Brans, R.G., van Haren, N.E., van Baal, G.C., Schnack, H.G., Kahn, R.S., Hulshoff Pol, H.E., 2008. Heritability of changes in brain volume over time in twin pairs discordant for schizophrenia. Arch. Gen. Psychiatry 65 (11), 1259–1268.
Castellani, U., Rossato, E., Murino, V., Bellani, M., Rambaldelli, G., Perlini, C., Tomelleri, L., Tansella, M., Brambilla, P., 2012. Classification of schizophrenia using feature-based morphometry. J. Neural Transm. 119 (3), 395–404.
Castro, E., Martinez-Ramon, M., Pearlson, G., Sui, J., Calhoun, V.D., 2011. Characterization of groups using composite kernels and multi-source fMRI analysis data: application to schizophrenia. Neuroimage 58 (2), 526–536.
DeLisi, L.E., Goldin, L.R., Hamovit, J.R., Maxwell, M.E., Kurtz, D., Nurnberger, J.I., Gershon, E.S., 1985. Cerebral ventricular enlargement as a possible genetic marker for schizophrenia. Psychopharmacol. Bull. 21 (3), 365–367.
Derks, E.M., Allardyce, J., Boks, M.P., Vermunt, J.K., Hijman, R., Ophoff, R.A., 2012. Kraepelin was right: a latent class analysis of symptom dimensions in patients and controls. Schizophr. Bull. 38 (3), 495–505.
Gogtay, N., Hua, X., Stidd, R., Boyle, C.P., Lee, S., Weisinger, B., Chavez, A., Giedd, J.N., Clasen, L., Toga, A.W., Rapoport, J.L., Thompson, P.M., 2012. Delayed white matter growth trajectory in young nonpsychotic siblings of patients with childhood-onset schizophrenia. Arch. Gen. Psychiatry 69 (9), 875–884.
Kloppel, S., Abdulkadir, A., Jack Jr., C.R., Koutsouleris, N., Mourao-Miranda, J., Vemuri, P., 2012. Diagnostic neuroimaging across diseases. Neuroimage 61 (2), 457–463.
Lederbogen, F., Kirsch, P., Haddad, L., Streit, F., Tost, H., Schuch, P., Wust, S., Pruessner, J.C., Rietschel, M., Deuschle, M., Meyer-Lindenberg, A., 2011. City living and urban upbringing affect neural social stress processing in humans. Nature 474 (7352), 498–501.
Mandl, R.C., Schnack, H.G., Luigjes, J., van den Heuvel, M.P., Cahn, W., Kahn, R.S., Hulshoff Pol, H.E., 2010. Tract-based analysis of magnetization transfer ratio and diffusion tensor imaging of the frontal and frontotemporal connections in schizophrenia. Schizophr. Bull. 36 (4), 778–787.
Mandl, R.C., Rais, M., van Baal, G.C., van Haren, N.E., Cahn, W., Kahn, R.S., Hulshoff Pol, H.E., 2012. Altered white matter connectivity in never-medicated patients with schizophrenia. Hum. Brain Mapp. http://dx.doi.org/10.1002/hbm.22075 (Epub ahead of print).
Mulle, J.G., 2012. Schizophrenia genetics: progress, at last. Curr. Opin. Genet. Dev. 22 (3), 238–244.
Nieuwenhuis, M., van Haren, N.E., Hulshoff Pol, H.E., Cahn, W., Kahn, R.S., Schnack, H.G., 2012. Classification of schizophrenia patients and healthy controls from structural MRI scans in two large independent samples. Neuroimage 61 (3), 606–612.
Strobl, E.V., Eack, S.M., Swaminathan, V., Visweswaran, S., 2012. Predicting the risk of psychosis onset: advances and prospects. Early Interv. Psychiatry 6 (4), 368–379.
Terwisscha van Scheltinga, A.F., Bakker, S.C., van Haren, N.E., Derks, E.M., Buizer-Voskamp, J.E., Boos, H.B., Cahn, W., Hulshoff Pol, H.E., Ripke, S., Ophoff, R.A., Kahn, R.S., 2013. Genetic schizophrenia risk variants jointly modulate total brain and white matter volume. Biol. Psychiatry 73 (6), 525–556.
Turner, J.A., Calhoun, V.D., Michael, A., van Erp, T.G., Ehrlich, S., Segall, J.M., Gollub, R.L., Csernansky, J., Potkin, S.G., Ho, B.C., Bustillo, J., Schulz, S.C., FBIRN, Wang, L., 2012. Heritability of multivariate gray matter measures in schizophrenia. Twin Res. Hum. Genet. 15 (3), 324–335.
van Haren, N.E., Rijsdijk, F., Schnack, H.G., Picchioni, M.M., Toulopoulou, T., Weisbrod, M., Sauer, H., van Erp, T.G., Cannon, T.D., Huttunen, M.O., Boomsma, D.I., Hulshoff Pol, H.E., Murray, R.M., Kahn, R.S., 2012. The genetic and environmental determinants of the association between brain abnormalities and schizophrenia: the schizophrenia twins and relatives consortium. Biol. Psychiatry 71 (10), 915–921.
van Os, J., Kenis, G., Rutten, B.P., 2010. The environment and schizophrenia. Nature 468 (7321), 203–212.
Veltman, J.A., Brunner, H.G., 2012. De novo mutations in human genetic disease. Nat. Rev. Genet. 13 (8), 565–575.
Wray, N.R., Gottesman, I.I., 2012. Using summary data from the Danish national registers to estimate heritabilities for schizophrenia, bipolar disorder, and major depressive disorder. Front. Genet. 3, 118.