Predicting suicidal ideation in primary care: An approach to identify easily assessable key variables


General Hospital Psychiatry 51 (2018) 106–111




Pascal Jordan a,b,*, Meike C. Shedden-Mora a, Bernd Löwe a

a Department of Psychosomatic Medicine and Psychotherapy, University Medical Center Hamburg-Eppendorf and Schön Klinik Hamburg Eilbek, Hamburg, Germany
b Psychological Methods, Faculty for Psychology and Human Movement Science, University of Hamburg, Hamburg, Germany

ARTICLE INFO

Keywords: Suicide; Suicidal ideation; Primary care; Depression; Anxiety; Somatic symptoms

ABSTRACT

Objective: To obtain predictors of suicidal ideation, which can also be used for an indirect assessment of suicidal ideation (SI). To create a classifier for SI based on variables of the Patient Health Questionnaire (PHQ) and sociodemographic variables, and to obtain an upper bound on the best possible performance of a predictor based on those variables.

Methods: From a consecutive sample of 9025 primary care patients, 6805 eligible patients (60% female; mean age = 51.5 years) participated. Advanced methods of machine learning were used to derive the prediction equation. Various classifiers were applied and the area under the curve (AUC) was computed as a performance measure.

Results: Classifiers based on methods of machine learning outperformed ordinary regression methods and achieved AUCs around 0.87. The key variables in the prediction equation comprised four items - namely feelings of depression/hopelessness, low self-esteem, worrying, and severe sleep disturbances. The generalized anxiety disorder scale (GAD-7) and the somatic symptom subscale (PHQ-15) did not enhance prediction substantially.

Conclusions: In predicting suicidal ideation researchers should refrain from using ordinary regression tools. The relevant information is primarily captured by the depression subscale and should be incorporated in a nonlinear model. For clinical practice, a classification tree using only four items of the whole PHQ may be advocated.

1. Introduction

According to a review [1] including 40 studies, 45% of individuals who died by suicide had contact with primary care providers within 1 month prior to suicide, and 75% had contact within the year of suicide. Given that the majority of individuals who die by suicide are in contact with primary care providers in the preceding months, appropriate screening at the level of the primary care setting might help to initiate proper interventions in order to prevent suicide. Primary care physicians have to be aware of the prevalence of patients who experience suicidal thoughts [3] (the figures vary between 1 and 10% [2]). Moreover, adequate assessment tools have to be known and available. However, the prediction of suicidal ideation (SI) at this level can be challenging, as the underlying population is not a high-risk population, which in turn implies relatively low base rates. Nevertheless, results from studies examining 1) SI within patients with anxiety disorders [4], 2) SI within somatoform disorders [5] and 3) SI within patients diagnosed with depression [6] suggest that item batteries which reflect these constructs (accompanied by known predictors on the sociodemographic level such as age) could also be used within a non-preselected sample in order to screen for SI.

In particular, the 9th item of the Patient Health Questionnaire depression scale (PHQ-9) [7], a scale which is routinely used in the clinical context, has been widely used to assess suicidal ideation. Endorsement of the item has been identified as a consistent predictor of suicide attempts and suicide deaths in large primary care and population-based analyses [8,9,11]. Of those patients reporting SI, around 32% will make a suicide attempt at some point in their life [12]. Beyond asking directly about suicidal thoughts, taking into account additional relevant variables, especially depressive symptoms and known risk factors such as gender or age, can improve the detection of suicide risk [8,9].

In psychiatric epidemiology or in general population studies, for reasons of liability, the 9th item of the PHQ-9 is frequently not included, and the 8-item version of the Patient Health Questionnaire (PHQ-8) [14], which omits the suicidal ideation item, is used instead of the PHQ-9. In these studies, the results of our study might provide a valuable method for imputing the ninth item and for assessing suicide risk in the investigated population.

The aim of this study, besides assessing the base rate of SI in primary care, was to confirm and identify relevant risk factors for suicidal ideation and to determine their relative importance in predicting SI, whereby SI is operationalized via the response on the 9th PHQ item. We aimed to create a prediction equation based on the PHQ on item level and sociodemographic variables using advanced methods of machine learning. By using methods of machine learning rather than ordinary regression models, the resulting predictions can in general reflect much more complicated relationships between the variables and the outcome (suicidal ideation). Knowledge of these predictors can aid the primary care physician in detecting patients with SI and can also serve as a tool for a brief, initial assessment which can then, depending on the outcome, be followed by a more precise and prolonged method of assessment, such as the Columbia-Suicide Severity Rating Scale (C-SSRS) [15]. Moreover, in practical cases wherein directly asking a patient about suicidal ideation is not endorsed, or in cases wherein the response on the 9th item is missing, the results of accurate classifiers can serve as a valuable proxy for the unknown response.

* Corresponding author at: Psychological Methods, Faculty for Psychology and Human Movement Science, University of Hamburg, Von Melle-Park 5, 20146 Hamburg, Germany.

https://doi.org/10.1016/j.genhosppsych.2018.02.002
Received 4 August 2017; Received in revised form 1 February 2018; Accepted 1 February 2018
0163-8343/ © 2018 Elsevier Inc. All rights reserved.

Hence, the item was dichotomized with 0 referring to no suicidal thoughts, whereas responding in any of the three remaining categories was coded with 1. This cut-off was chosen to provide a rather sensitive definition of SI.

2.3. Statistical analysis

We used methods of pattern recognition in order to discern patients with no SI from patients with SI. Our aim was twofold. By reporting results of models of pattern recognition (see below), we sought to provide an estimate for the best possible prediction of SI one can achieve using the variables at hand. In addition, due to practical necessities (i.e. a classifier has to be easily applicable in order to be adopted in practice), we also incorporated results of classifiers which are more restrictive. By the term “classifier”, we mean a function which takes as input the measurements of a patient on several variables (e.g. PHQ-9 items) and outputs a probability that this patient belongs to the group of patients with suicidal thoughts.

We further distinguished between three blocks of predictor variables which serve as input for the various classification methods:
1) Demographic variables (age, gender, education and marital status)
2) Variables of the Patient Health Questionnaire on the item level (all items of the PHQ-15, PHQ-8 and GAD-7)
3) Composite scores derived from the PHQ (PHQ-8 score, GAD-7 score and PHQ-15 score).

Each classification method was applied to each block in order to gain insight into the predictive power at the demographic, the item and the scale score level. In a final analysis, each classifier was also applied to the whole set of variables.

The derivation and evaluation of each prediction equation was done in two steps. First, a training sample (a randomly chosen half of the data set) was used to learn the classifier. Second, the model derived in the first step was used to predict the class membership probability for each entry of the test sample (the remaining half of the data set).
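The two-step procedure (learn on a random half, predict on the other half) can be sketched as follows. This is a minimal illustration in Python, not the authors' R code: the helper names `split_half` and `fit_rate_by_value`, the seed, and the toy data are all assumptions, and the "classifier" is a deliberately trivial stand-in that returns the empirical SI rate observed in the training half for each predictor value.

```python
import random

def split_half(records, seed=0):
    """Randomly split records into a training half and a test half."""
    rng = random.Random(seed)          # fixed seed only to make the sketch repeatable
    shuffled = list(records)
    rng.shuffle(shuffled)
    mid = len(shuffled) // 2
    return shuffled[:mid], shuffled[mid:]

def fit_rate_by_value(train):
    """Toy classifier (hypothetical): SI rate per predictor value in the training half."""
    counts = {}
    for x, y in train:
        n, k = counts.get(x, (0, 0))
        counts[x] = (n + 1, k + y)
    overall = sum(y for _, y in train) / len(train)

    def predict(x):
        n, k = counts.get(x, (0, 0))
        # Fall back to the overall training rate for values unseen in training.
        return k / n if n else overall

    return predict

# Hypothetical records: (item score 0-3, SI indicator 0/1).
data = ([(0, 0)] * 60 + [(1, 0)] * 20 + [(1, 1)] * 5
        + [(2, 0)] * 5 + [(2, 1)] * 5 + [(3, 1)] * 5)

train, test = split_half(data, seed=1)         # step 1: learn on one half
predict = fit_rate_by_value(train)
test_probs = [predict(x) for x, _ in test]     # step 2: predict on the other half
```

Any of the methods described in this section could take the place of the toy classifier; the point of the sketch is only that the test half never influences the fitted model.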
The predicted probabilities were dichotomized for a continuum of cut-off scores, and an approximate measure of the area under the resulting ROC curve was computed as the performance measure of the classifier (see also Section 3 and the corresponding footnote). We distinguish the following methods:

Method 1: classification trees
For each variable, and each potential cut-off value for this variable, a split according to the cut-off is applied and the conditional class probabilities for SI of the resulting split categories are examined. The variable and cut-off for which splitting results in partitions with strongly informative conditional class probabilities are chosen first, and splitting continues within the partitioned data set provided by the first split (we used the process implemented in the R function “tree” with default settings). For a full description see [18].

Method 2: the Support Vector Machine (SVM)
This method [19] provides highly flexible classifiers by seeking to separate classes in a transformed feature space via methods of convex analysis. The method incorporates two tuning parameters, which represent attributes of an underlying radial kernel function and penalties for non-perfect separation and which regulate the generalization error. We optimized the classifier over a grid specified by the “gamma” vector (0.1, 0.2, …, 0.9, 1, 2, …, 10) and the “cost” vector (1, 2, …, 10, 100, 1000). Basically, each group (those with and those without SI) can be depicted in a multidimensional space, wherein the coordinates are given by the values of the predictor variables (or by properly transformed values thereof). If the corresponding “clouds” of SI vs. non-SI patients can be separated by some line, then an SVM classifier will be able to find this separation. In general, SVMs make it possible to capture much more complex relationships than ordinary logistic regression models.
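The cut-off sweep behind the AUC, described at the start of this passage, can be sketched as follows. This is a minimal sketch that assumes the trapezoidal rule over all distinct predicted probabilities and the convention "predict SI when the probability is at least the cut-off"; the exact definition used by the AUC R package may differ in detail, and the function names and the six toy predictions are hypothetical.

```python
def roc_points(probs, labels):
    """(false positive rate, true positive rate) for every distinct cut-off."""
    pos = sum(labels)
    neg = len(labels) - pos
    points = [(0.0, 0.0)]                       # cut-off above every prediction
    for cutoff in sorted(set(probs), reverse=True):
        tp = sum(1 for p, y in zip(probs, labels) if p >= cutoff and y == 1)
        fp = sum(1 for p, y in zip(probs, labels) if p >= cutoff and y == 0)
        points.append((fp / neg, tp / pos))
    return points

def auc_trapezoid(points):
    """Area under the piecewise-linear ROC curve through the given points."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area

# Hypothetical predicted probabilities and true SI indicators.
probs = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
labels = [1, 1, 1, 0, 0, 0]
auc = auc_trapezoid(roc_points(probs, labels))  # 8 of 9 SI/non-SI pairs ranked correctly
```

Multiplying the result by 100 gives the percentage scale on which the AUC values in Section 3 are reported.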

2. Methods

2.1. Design and sample

For this study, we used cross-sectional data from a broad screening assessment performed in 33 primary care practices located in the metropolitan area of Hamburg, Germany, from 2011 [16] to 2014. Data collection was part of a study evaluating the Network for Somatoform and Functional Disorders (Sofu-Net) [16,17]. Patients were asked to complete a screening questionnaire after providing oral informed consent while waiting for the consultation (no reimbursement was given). Exclusion criteria were severe somatic or psychiatric disease, severe cognitive disabilities, age below 18 years, impaired vision, and insufficient German language skills. The practices were approached on two to four consecutive days, and each patient who gave consent provided sociodemographic data and was assessed via the Patient Health Questionnaire (PHQ). Ethics approval was obtained from the Medical Chamber Hamburg, Germany.

2.2. Instruments

The three subscales GAD-7, PHQ-9 and PHQ-15 of the Patient Health Questionnaire, which are routinely applied in clinical settings, were used to measure generalized anxiety, depression and somatic symptoms. Each of the seven GAD-7 items has a score range from 0 to 3 (the same holds for each of the PHQ-9 items), while the items of the PHQ-15 are scored from 0 to 2. Whenever we refer to an item of these scales, we use the score on this item in the statistical analysis. The PHQ-8 (i.e. the PHQ-9 without the 9th item assessing suicidal ideation) is used in the analysis as a predictor instead of the PHQ-9 to avoid circularity. The PHQ scales have good psychometric properties, which are summarized in Kroenke et al. [7].

The 9th PHQ-9 item concerning suicidal ideation serves as the dependent variable in this study. Participants are asked to indicate on an ordinal measurement scale how often they experienced suicidal thoughts (“thoughts that you would be better off dead or of hurting yourself in some way”) within the last two weeks. Although there are four possible values (0 = not at all; 1 = several days; 2 = more than half the days; 3 = nearly every day), we decided to dichotomize the item, primarily for reasons of statistical precision (and for ease of interpretation). That is, categories 2 vs. 3 might for example be very difficult to distinguish properly, whereas the comparison of 0 vs. the other categories enables sharper distinctions. Moreover, as shown in the Results section, some categories are rather sparsely chosen, which limits the power of any method to detect the corresponding class membership.
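The dichotomization of the ninth PHQ-9 item (0 = no suicidal thoughts, any other category = SI) can be written out as a short sketch. The function name `dichotomize_item9` and the example responses are hypothetical and serve only to illustrate the coding rule.

```python
def dichotomize_item9(response):
    """Map the ordinal PHQ-9 item 9 response (0-3) to a binary SI indicator."""
    if response not in (0, 1, 2, 3):
        raise ValueError("PHQ-9 item 9 must be scored 0, 1, 2 or 3")
    # 0 = "not at all" -> no SI; "several days" or more -> SI.
    return 0 if response == 0 else 1

# Hypothetical responses of eight patients.
responses = [0, 0, 1, 0, 3, 2, 0, 1]
si = [dichotomize_item9(r) for r in responses]
base_rate = sum(si) / len(si)   # share of patients coded as SI in this toy sample
```

Because every non-zero response counts as SI, this coding favours sensitivity: a patient reporting suicidal thoughts on only a few days is treated the same as one reporting them nearly every day.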


Method 3: neural network
The terminology of this method is inspired by the functioning of neurons in the brain. Basically, each predictor serves as an input for a “neural network”, wherein each “neuron” receives as input some weighted signal of the output of other neurons. The output of each neuron may be a nonlinear function of the input signal. The neurons are clustered into layers, wherein each layer only receives inputs from previous layers. By using a number of “neurons” and layers, this cascading mechanism is able to produce a total output of the system which may be a highly complicated, nonlinear function of the predictor variables. Predictors are weighted and incorporated into a nonlinear activation function to yield intermediate quantities (the hidden layer), which in turn are weighted and combined into an output unit [19]. The weights are determined via optimization of a likelihood function (using the output units as dependent variables). We used the default settings of the R function “nnet”, adjusted by a maximum number of iterations of 1000, and optimized over the number of units in the hidden layer (over a grid from 1 to 20).

A final method, Fisher's linear discriminant function approach (see p. 318 f. in [20]), seeks to determine a summary score from all PHQ items such that the two groups (patients with and without SI) are separated as strictly as possible in terms of an ANOVA group comparison. That is, if for example X, Y, Z are the predictor variables, then the method seeks to determine weights a, b, c such that the two-sample t-test statistic (comparing the SI with the non-SI group) based on the composite score a*X + b*Y + c*Z is as large as possible. The resulting score can then be used a) to rank the items with respect to their contribution to separating the classes and b) to predict SI via logistic regression.
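For two predictors, the weights of Fisher's linear discriminant function have a closed form: they are proportional to the inverse of the pooled within-group scatter matrix applied to the difference of the group means. The following sketch illustrates this in plain Python; the function names and the twelve toy patients are hypothetical, and the study itself used all PHQ items rather than two.

```python
def mean(values):
    return sum(values) / len(values)

def group_stats(group):
    """Mean vector and within-group scatter (sxx, sxy, syy) of 2-D points."""
    mx, my = mean([p[0] for p in group]), mean([p[1] for p in group])
    sxx = sum((p[0] - mx) ** 2 for p in group)
    syy = sum((p[1] - my) ** 2 for p in group)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in group)
    return (mx, my), (sxx, sxy, syy)

def fisher_weights_2d(group0, group1):
    """Weights (a, b) proportional to S_w^{-1} (m1 - m0) for two predictors."""
    (m0, s0), (m1, s1) = group_stats(group0), group_stats(group1)
    sxx, sxy, syy = (s0[i] + s1[i] for i in range(3))   # pooled scatter
    det = sxx * syy - sxy ** 2
    dx, dy = m1[0] - m0[0], m1[1] - m0[1]
    # Explicit inverse of the 2x2 scatter matrix applied to the mean difference.
    return (syy * dx - sxy * dy) / det, (-sxy * dx + sxx * dy) / det

def t_statistic(scores0, scores1):
    """Two-sample t statistic with pooled variance."""
    n0, n1 = len(scores0), len(scores1)
    m0, m1 = mean(scores0), mean(scores1)
    ss = sum((s - m0) ** 2 for s in scores0) + sum((s - m1) ** 2 for s in scores1)
    pooled_var = ss / (n0 + n1 - 2)
    return (m1 - m0) / (pooled_var * (1 / n0 + 1 / n1)) ** 0.5

def composite_t(weights, group0, group1):
    a, b = weights
    return t_statistic([a * x + b * y for x, y in group0],
                       [a * x + b * y for x, y in group1])

# Hypothetical item scores (X, Y) for patients without SI and with SI.
no_si = [(0, 0), (1, 0), (0, 1), (1, 1), (0, 2), (1, 0)]
with_si = [(2, 1), (3, 1), (2, 2), (3, 3), (1, 2), (2, 0)]

weights = fisher_weights_2d(no_si, with_si)
t_fisher = composite_t(weights, no_si, with_si)
```

No other weighting of X and Y yields a larger absolute t statistic on the same data, which is the sense in which the composite score separates the groups as strictly as possible.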

Fig. 1. Classification tree using all PHQ items and depicting the conditional probabilities of suicidal ideation at each final node. Note. Each relative frequency depicts the probability of suicidal ideation when the specific path is followed. For example, the entry of 17.2% refers to the estimated probability of suicidal ideation among all patients who fulfill PHQ8_2 = 1, i.e. among all patients who indicated that they felt down, depressed or hopeless for several days (= score of 1) within the last 2 weeks. The confidence intervals for the underlying probabilities are as follows (proceeding from top to bottom and from left to right): (1.4–2.9), (24.1–29.8), (1.3–2.8), (0.8–22.8), (14.3–20.0), (48.3–60.9), (5.7–11.5), (21.9–31.5), (36.7–54.3), (55.4–72.8) [in percentages].

3. Results

Of n = 9025 consecutive primary care patients, 8347 patients (92.5%) were eligible and 6805 patients agreed to participate in this study (participation rate 81.5%). 60% of participants were female, mean age was 51.5 (SD = 19.07) years, and the distribution of age did not differ significantly between males and females. The key outcome of our study, the presence of SI, was endorsed in 12.6% (CI: 11.8–13.4%) of all cases. Hence, 87.4% indicated no suicidal thoughts. The 12.6% endorsement probability of the dichotomized item splits into 9.5% (“several days”), 1.9% (“more than half the days”) and 1.2% (“nearly every day”) for the respective original categories.

In general, sociodemographic variables were markedly inferior to the other blocks, regardless of the applied classification method (classification tree, support vector machine, neural network). For this block, not a single classifier's AUC¹ exceeded 61. In contrast, most methods achieved AUCs exceeding 80 when using all available PHQ items. Remarkably, the sociodemographic variables added no surplus value: methods using all available variables did not show better overall performance than methods relying on PHQ items only.

The simplest classifier among the best classifiers was a classification tree based on the PHQ items (see Fig. 1; AUC 85.6). Within this tree, only four variables underlie the classification (listed in decreasing order of importance): the second PHQ-8 item (“feeling down, depressed or hopeless”), the sixth PHQ-8 item (“feeling bad about yourself”), the extreme category of the third PHQ-8 item (“trouble falling or staying asleep”) and, to a minor extent, the second GAD-7 item (“not being able to stop worrying”). We also note that the classification tree based on the overall PHQ-8, PHQ-15 and GAD-7 scores (see Fig. 2), with an AUC of 84.8, almost exclusively uses the PHQ-8 score and to some minor extent the GAD-7 score. That is, once the depression score is known, the information added by the PHQ-15 scale score is dispensable. This result is in accordance with the spurious correlation between PHQ-15 and SI, which can be detected when comparing a logistic regression model using all three scales with a logistic regression model using PHQ-15 only. The former results in a nonsignificant coefficient for PHQ-15 (and a barely significant coefficient for GAD-7), whereas the latter results in a highly significant p-value (see also Fig. 3). This by no means implies that knowledge of the PHQ-15 score is in general dispensable: if data on the PHQ-8 are not available, then the PHQ-15 can be used in predicting SI (AUC of 73.5 for the classification tree). Similarly, the classification tree based on GAD-7 items only achieves an AUC of 79.9 (see Fig. 4). Hence, although GAD-7 items are almost superfluous when PHQ-8 item scores are known, they allow for some prediction accuracy when other information is lacking (see Table 2). The corresponding tree is also very simple, relying on just two items (“not being able to stop worrying”; “feeling afraid as if something awful might happen”).

When turning to the more complex classifiers, i.e. SVMs and neural networks, it can be ascertained that each block (except the sociodemographic variables) achieves AUCs around 86 when using the items of the PHQ or all available variables. The classification rule underlying both methods is complex (the SVM uses around 1000 support vectors), so it is natural to seek more transparent classifiers while maintaining predictive accuracy. The use of ordinary regression methods is not helpful in this regard, as the corresponding AUCs are poor (below 70). However, one particularly promising method was already described above: the usage of a classification tree. Another method (see Table 1) was provided by the computation of Fisher's linear discriminant function (see p. 318 f. in [20]), which derives optimal scores for the items in order to separate the classes. Among the variables which contributed most to the summary score were (in decreasing order of importance) 1) the second PHQ-8 item (“feeling down, depressed or hopeless”), 2) the sixth PHQ-8 item (“feeling bad about yourself”), 3) the second GAD-7 item (“not being able to stop worrying”), 4) the thirteenth item of the PHQ-15 (“shortness of breath”), scored in the opposite direction, and 5) the third PHQ-8 item (“trouble falling or staying asleep, or sleeping too much”). Using the derived composite score in a logistic regression results in an AUC of 87.9. The estimated intercept and slope are −3.943 and −1.353, respectively. In comparison, the statistically optimal estimator, based on estimating the class-conditional densities of the composite score via two different methods (kernel-based and log-concave [22] density estimation) and applying Bayes' theorem, does not achieve a substantial improvement over the logistic model.

Note that fundamental diagnostic performance measures (for definitions see [21]), such as sensitivity, specificity, positive predictive value and negative predictive value, are provided in Table 2 for a selection of relevant models.

¹ We applied the definition used in the AUC R package [21]. All AUC values are reported as percentages unless otherwise stated.

Fig. 2. Classification tree based on the three scale scores PHQ-8, PHQ-15 and GAD-7. Note: Each relative frequency depicts the probability of suicidal ideation when the specific path is followed. For example, the entry of 1.9% refers to the estimated probability of suicidal ideation among all patients who show a PHQ-8 score ≤ 6 and a GAD-7 score ≤ 3. The corresponding confidence intervals for the underlying probabilities are as follows (proceeding from top to bottom and from left to right): (2.3–4.0), (30.0–37.1), (1.1–2.6), (4.0–8.0), (19.0–26.6), (52.2–65.8) [in percentages].

Fig. 3. Application of a generalized additive model smoother based on splines. Depicted are the contributions of the three subscale scores to the linear predictor.

Fig. 4. Classification tree based on GAD-7 items only. Note: Each relative frequency depicts the probability of suicidal ideation when the specific path is followed. For example, the entry of 41.2% refers to the estimated probability of suicidal ideation among all patients who indicate that they were unable to control worrying for more than half of the days within the last two weeks. The confidence intervals are as follows (proceeding from top to bottom and from left to right): (2.3–4.2), (21.3–26.7), (5.9–8.6), (35.8–46.5), (3.1–5.1), (12.9–19.2) [in percentages].

4. Discussion

The present study indicates that, with a two-week prevalence of 12.6%, SI is a relevant issue in primary care, in line with the literature [23–25]. Results of this study also indicate which symptoms are closely interrelated with SI and might therefore point to potential SI: feeling down, depressed or hopeless, low self-esteem, worrying, and severe sleep disturbances were identified as core indicators. AUCs near 87 are achievable for the purpose of predicting SI in a primary care setting based on sociodemographic variables and variables pertaining to the Patient Health Questionnaire. The maximal AUC can be achieved either by a neural network or by assigning appropriate weights to the variables of the PHQ via Fisher's linear discriminant function. Moreover, the distribution of weights in the latter approach and the results of other classifiers show that the most important variables in predicting SI are located in the PHQ-8 scale and that variables attributed to the GAD-7 or PHQ-15 subscale contribute only to a minor extent to predicting SI given knowledge of the PHQ-8. This result suggests that comorbidities like GAD or somatoform disorders add some surplus value when it comes to estimating SI, but that the dominating predictor remains the depression score. The vastly increased risk of mortality for patients with depression is of course well documented [26,27]. In addition, if one views the PHQ-15 items as reflecting physical health symptoms, then our findings are in line with the results of Bomyea et al. [4], who noted that physical health factors do not contribute to predicting suicide risk when controlling for mental health status.

Based on the data, we advocate either a) the usage of four key variables, i.e. the classification tree using the second (“feeling down, depressed or hopeless”), third (“trouble falling or staying asleep”) and sixth (“feeling bad about yourself”) PHQ-8 items as well as the second GAD-7 item (“not being able to stop worrying”) (Fig. 1) or b) the usage of the PHQ-8 score and the GAD-7 score as highlighted by the corresponding tree (Fig. 2) or

Table 1 Scoring of the PHQ items with respect to the best discriminating composite score (derived via the method of Fisher's linear discriminant analysis). Items with negative weights are estimated to increase the likelihood of suicidal ideation, whereas items with positive values decrease the likelihood of suicidal ideation.

PHQ-8 item                   Weight
Little interest, pleasure    −0.03
Feeling down                 −0.63
Sleep disturbance            −0.22
Feeling tired                 0.07
Poor appetite                −0.17
Feel like failure            −0.41
Trouble concentrating        −0.06
Moving slowly                −0.18

GAD-7 item                   Weight
Nervous anxiety              −0.05
Not able stop worrying       −0.34
Worrying too much             0.09
Trouble relaxing              0.16
Hard to sit still             0.06
Easily annoyed                0.06
Feeling afraid               −0.09

PHQ-15 item                  Weight
Stomach                       0.08
Back                          0.02
Arms, legs, joints           −0.07
Feeling tired                 0.11
Trouble with sleep            0.03
Menstrual cramps             −0.02
Sexual intercourse pain       0.00
Headaches                    −0.03
Chest pain                    0.14
Dizziness                    −0.12
Fainting spells               0.00
Heart pound/race             −0.14
Shortness of breath           0.22
Constipation/diarrhea         0.12
Nausea/indigestion           −0.09

Note. Each number represents the weight of the corresponding variable in a composite score of the form a*X + b*Y + c*Z + … The predicted probability of suicidal ideation is then a function which solely depends on the value of this composite score. Bold highlights items with weights > 0.15.
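How the weights of Table 1 combine with the logistic regression reported in the Results (intercept −3.943, slope −1.353) can be illustrated with a short sketch. It rests on an assumption the article does not spell out, namely that the raw, uncentered item scores enter the composite score directly; if the original analysis centered or rescaled the items, the probabilities below would shift. The function names and the example patient are hypothetical, and the numbers should be read as an illustration of the mechanism, not as clinical estimates.

```python
import math

# Weights copied from Table 1, in questionnaire item order.
PHQ8_WEIGHTS = [-0.03, -0.63, -0.22, 0.07, -0.17, -0.41, -0.06, -0.18]
GAD7_WEIGHTS = [-0.05, -0.34, 0.09, 0.16, 0.06, 0.06, -0.09]
PHQ15_WEIGHTS = [0.08, 0.02, -0.07, 0.11, 0.03, -0.02, 0.00, -0.03,
                 0.14, -0.12, 0.00, -0.14, 0.22, 0.12, -0.09]

# Logistic regression coefficients reported in the Results section.
INTERCEPT, SLOPE = -3.943, -1.353

def composite_score(phq8, gad7, phq15):
    """Weighted sum a*X + b*Y + ... over all 30 items (assumes raw item scores)."""
    if (len(phq8), len(gad7), len(phq15)) != (8, 7, 15):
        raise ValueError("expected 8 PHQ-8, 7 GAD-7 and 15 PHQ-15 item scores")
    weights = PHQ8_WEIGHTS + GAD7_WEIGHTS + PHQ15_WEIGHTS
    return sum(w * x for w, x in zip(weights, list(phq8) + list(gad7) + list(phq15)))

def predicted_probability(score):
    """Logistic link applied to intercept + slope * composite score."""
    return 1.0 / (1.0 + math.exp(-(INTERCEPT + SLOPE * score)))

# Hypothetical patient with no symptoms at all ...
no_symptoms = composite_score([0] * 8, [0] * 7, [0] * 15)
p_none = predicted_probability(no_symptoms)

# ... and one who felt down nearly every day (PHQ-8 item 2 = 3), all else 0.
feeling_down = composite_score([0, 3, 0, 0, 0, 0, 0, 0], [0] * 7, [0] * 15)
p_down = predicted_probability(feeling_down)
```

Because the slope is negative, items with negative weights push the composite score down and the predicted probability up, which matches the reading of the signs given in the table caption.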

been incorporated into the diagnostic process, the other variables (e.g., knowing that the patient has poor appetite, little pleasure in doing things or several somatic symptoms) do not provide additional information for the prediction of SI, given knowledge of the four relevant predictors. Finally, the scoring of all PHQ items according to Fisher's linear discriminant function allows one to quantify which items contribute to an increasing or decreasing likelihood of SI and offers an interesting observation. It can be seen that the inability to stop worrying is indicative of SI; however, having trouble relaxing (item 4 of the GAD-7) and exhibiting shortness of breath (item 13 of the PHQ-15) are both contraindicative of SI.

A limitation of the study is that other well-established risk factors for suicide, such as past suicide attempts, functional impairment, stressful life events or physical illness [28], were not assessed. As further potential limitations of this study and lines of further research, we want to point out one major aspect: the dependent variable was SI, and it was assessed via a single item which directly addresses this issue. It is not clear to what extent patients respond truthfully to this item, although its validity has been shown [29]. That is, SI is a sensitive topic, and such topics are prone to response bias and social desirability. Moreover, even if the responses were free of bias, it should be borne in mind that the scope of the underlying SI construct which is reflected in the response to this single item is probably narrow. Hence, it cannot be ruled out that predictors which are deemed unimportant on the single SI-item level could gain predictive accuracy when more detailed assessment tools, such as the C-SSRS, are used to reflect the SI construct. It should also be noted that the enforced cut-off of none vs. any suicidal ideation does not necessarily match the best choice in terms of effect size. That is, it could very well be the case that the distinction between the two highest versus the two lowest scores provides a greater distinction in terms of, e.g., the actual suicide risk than the distinction employed in this study. However, as pointed out earlier, the sparsity of the responses in the upper categories is a severe caveat for the meaningful application of any (statistical) classifier.

A second, potentially very difficult topic to explore is the pathway between ideation and attempt. Whereas the endorsement of SI in questionnaires clearly predicts suicide attempts and deaths [8,9,11], there are, to the authors' knowledge, no well-scrutinized results examining differences in the transition (from ideation to attempt) with respect to covariates. That is, if the conditional probabilities of attempting suicide given ideation vary to a great extent depending on some variables (e.g. the probability of committing suicide given SI could be higher for older people than for younger), then

Table 2 Sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV) for various classifiers using different cut-off values.

Classifier        Cut-Off  Sensitivity  Specificity  PPV   NPV
Tree all          0.2      0.78         0.83         0.39  0.97
                  0.5      0.27         0.98         0.64  0.91
                  0.8      0.00         1.00         NaN   0.88
Tree items        0.2      0.78         0.83         0.39  0.97
                  0.5      0.27         0.98         0.64  0.91
                  0.8      0.00         1.00         NaN   0.88
Tree GAD-items    0.2      0.73         0.78         0.31  0.96
                  0.5      0.00         1.00         NaN   0.88
                  0.8      0.00         1.00         NaN   0.88
Tree PHQ-scales   0.2      0.81         0.78         0.34  0.97
                  0.5      0.42         0.96         0.59  0.92
                  0.8      0.00         1.00         NaN   0.88
SVM all           0.2      0.72         0.86         0.41  0.96
                  0.5      0.15         0.99         0.58  0.90
                  0.8      0.06         1.00         0.62  0.89
NNet PHQ-scales   0.2      0.68         0.88         0.42  0.95
                  0.5      0.33         0.97         0.62  0.91
                  0.8      0.01         1.00         0.75  0.88
Fisher            0.2      0.59         0.92         0.49  0.94
                  0.5      0.35         0.97         0.64  0.92
                  0.8      0.23         0.99         0.75  0.90

Note. The term “All” refers to a classification method which uses all available variables (i.e. PHQ items and sociodemographic variables). The label “Fisher” is applied for Fisher's linear discriminant analysis using all PHQ items as depicted in Table 1. Labels “NNet” and “SVM” refer to neural networks and support vector machines, respectively. Hence, “Tree all” and “SVM all” refer to classification tree and support vector machine classifiers using all available items as input. Further, “Tree GAD-items” refers to the classification tree depicted in Fig. 4 (which uses GAD-7 items only) and “Tree PHQ-scales” refers to the tree underlying Fig. 2.
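The four measures in Table 2 follow from the confusion matrix obtained at a given cut-off. The sketch below is an illustration with hypothetical predictions, and it assumes the convention "predict SI when the predicted probability is at least the cut-off", which the article does not state. It also shows one way a NaN entry can arise: when no patient exceeds the cut-off, the PPV has a zero denominator.

```python
import math

def diagnostic_measures(probs, labels, cutoff):
    """Sensitivity, specificity, PPV and NPV at one cut-off (NaN if undefined)."""
    tp = fp = tn = fn = 0
    for p, y in zip(probs, labels):
        predicted = 1 if p >= cutoff else 0   # assumed convention: >= cut-off means SI
        if predicted == 1 and y == 1:
            tp += 1
        elif predicted == 1 and y == 0:
            fp += 1
        elif predicted == 0 and y == 0:
            tn += 1
        else:
            fn += 1

    def ratio(numerator, denominator):
        return numerator / denominator if denominator else math.nan

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }

# Hypothetical predicted probabilities and true SI indicators.
probs = [0.10, 0.30, 0.60, 0.70, 0.15, 0.05]
labels = [0, 1, 1, 0, 0, 0]

at_low_cutoff = diagnostic_measures(probs, labels, 0.2)
at_high_cutoff = diagnostic_measures(probs, labels, 0.8)   # nobody is classified as SI
```

With a tree-based classifier the predicted probability can only take the values attached to the final nodes, so a cut-off above the largest node probability classifies no patient as SI.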

c) the usage of all 30 PHQ items with the scoring given by Fisher's linear discriminant rule (Table 1). The latter gives the highest AUC, at the expense of using every PHQ item. The most economical choice is to use the classification tree shown in Fig. 1. This choice is parsimonious in that it is not necessary to administer the whole PHQ. The assessment is restricted to only four items and can in principle even be incorporated in the framework of the routine consultation. To give an example, if the physician knows that the patient has been feeling down, depressed or hopeless for more than half the days, and has had sleep problems nearly daily, the probability of endorsing the SI item is 64.1%. Thus, the advantage of our approach is that the risk of SI can be quantified based on the known predictors. Practitioners should also be aware that once these four predictors have


this has to be accounted for by any reasonable classifier. Finally, the suicidal ideation item was dichotomized for our analyses, and we cannot exclude that an ordinal evaluation of suicidal risk might have yielded different results.

In a clinical setting, physicians should consider the four PHQ items (‘feeling down, depressed or hopeless’, ‘trouble falling or staying asleep’, ‘feeling bad about yourself’, ‘not being able to stop worrying’) when judging the risk of SI. However, while sociodemographic variables and the additional PHQ variables do not improve the prediction of SI beyond these four items, we are well aware that other variables and their interplay may enter the complex process of clinical judgement and decision making, and that these are not captured within the standardized framework of self-assessment tools provided by the PHQ and sociodemographic variables. Importantly, several risk factors for suicidal ideation and suicide deaths are well known, among them age, gender, race, psychiatric disorder, chronic medical conditions, serious adverse childhood experience, sexual minority status and family history of self-harm [8,9,11,30]. Clinicians should know and consider these factors when screening for SI. Aside from the direct clinical context, the proposed method, i.e. the classification tree depicted in Fig. 1, might be used in studies of psychiatric epidemiology to allow for a prediction of the SI response.

Overall, the results of our study need to be interpreted in the light of the debate about the utility of screening for suicide risk in primary care. The systematic review by O'Connor et al. [30] cautions that “Minimal evidence (two studies) suggests that there are screening tools that can identify adults and older adults in primary care who are at increased risk of suicide, at the cost of many false-positives”. The overall picture regarding the use of screening is controversial. Not only are there limited data on the effects of screening, i.e. on whether the process of screening changes the subsequent risk, but the evidence regarding the effectiveness of treatments is also vague. The most promising finding reported in the systematic review by O'Connor et al. [30] is that psychotherapy can reduce the risk of suicide attempts by approximately 32%. However, when it comes to the effects of treatments on suicide deaths, there is an inherent problem of statistical power (due to the rarity of the outcome) that hinders the evaluation of treatment effects with respect to this outcome. In addition, it should be borne in mind that the accuracy with which suicidal thoughts predict future suicide is far from perfect [31]. More than a third of suicide attempts and deaths occur in patients who did not endorse the PHQ-9 suicidal ideation item within the preceding month [9], clearly underlining that suicidal behaviour is at least partly unpredictable.

Overall, we highlight the substantial base rate of SI in primary care and the key role of depression/hopelessness, problems with self-esteem, constant worrying and major sleep disturbances, all of which should trigger appropriate responses by primary care physicians.
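The cost in false positives noted in the review quoted above follows directly from Bayes' rule: for a fixed sensitivity and specificity, the share of positive screens that are true positives (the positive predictive value, PPV) shrinks as the outcome becomes rarer. The following minimal sketch illustrates this dependence. The function name and the sensitivity, specificity and base-rate values are hypothetical placeholders chosen for illustration only; they are not results of this study or of the cited review.

```python
def positive_predictive_value(sensitivity: float,
                              specificity: float,
                              base_rate: float) -> float:
    """Share of positive screens that are true positives (Bayes' rule)."""
    true_pos = sensitivity * base_rate                    # P(screen+, outcome+)
    false_pos = (1.0 - specificity) * (1.0 - base_rate)   # P(screen+, outcome-)
    return true_pos / (true_pos + false_pos)


# Hypothetical screening tool (sensitivity = specificity = 0.80),
# evaluated for a rare and for a more common outcome.
for base_rate in (0.01, 0.10):
    ppv = positive_predictive_value(0.80, 0.80, base_rate)
    print(f"base rate {base_rate:.2f}: PPV = {ppv:.3f}")
```

With these placeholder values, the PPV rises from roughly 4% at a base rate of 1% to roughly 31% at a base rate of 10%, which is why the same instrument produces far more false positives per true positive when the target outcome is rare.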

Funding

This study was funded by the German Ministry of Education and Research (BMBF) as one subproject of psychenet – Hamburg Network for Mental Health, a large health services research study in the Hamburg Metropolitan Area (subproject Somatoform Disorders; principal investigator: Bernd Löwe, grant number 01KQ1002B).

Conflict of interest

The authors have no competing interests to report.

References

[1] Luoma JB, Martin CL, Pearson JL. Contact with mental health and primary care providers before suicide: a review of the evidence. Am J Psychiatry 2002;159(6):909–16.
[2] Schulberg HC, Bruce M, Lee PW, Williams JW, Dietrich AJ. Preventing suicide in primary care patients: the primary care physician's role. Gen Hosp Psychiatry 2004;26(5):337–45.
[3] Wiborg JF, Gieseler D, Löwe B. Suicidal ideation in German primary care. Gen Hosp Psychiatry 2013;35(4):366–9.
[4] Bomyea J, Lang AJ, Craske MG, Chavira D, Sherbourne CD, Rose RD, et al. Suicidal ideation and risk factors in primary care patients with anxiety disorders. Psychiatry Res 2013;209(1):60–5.
[5] Wiborg JF, Gieseler D, Fabisch AB, Voigt K, Lautenbach A, Löwe B. Suicidality in primary care patients with somatoform disorders. Psychosom Med 2013;75(9):800–6.
[6] Schulberg HC, Lee PW, Bruce ML, Raue PJ, Lefever JJ, Williams JW, et al. Suicidal ideation and risk levels among primary care patients with uncomplicated depression. Ann Fam Med 2005;3(6):523–8.
[7] Kroenke K, Spitzer RL, Williams JB, Löwe B. The patient health questionnaire somatic, anxiety, and depressive symptom scales: a systematic review. Gen Hosp Psychiatry 2010;32(4):345–59.
[8] Simon GE, Rutter CM, Peterson D, Oliver M, Whiteside U, Operskalski B, et al. Does response on the PHQ-9 depression questionnaire predict subsequent suicide attempt or suicide death? Psychiatr Serv 2013;64:1195–202.
[9] Simon GE, Coleman KJ, Rossom RC, Beck A, Oliver M, Johnson E, et al. Risk of suicide attempt and suicide death following completion of the patient health questionnaire depression module in community practice. J Clin Psychiatry 2016;77(2):221–7.
[11] Ribeiro JD, Franklin JC, Fox KR, Bentley KH, Kleiman EM, Chang BP, et al. Self-injurious thoughts and behaviors as risk factors for future suicide ideation, attempts, and death: a meta-analysis of longitudinal studies. Psychol Med 2016;46(2):225–36.
[12] Nock MK, Borges G, Bromet EJ, Alonso J, Angermeyer M, Beautrais A, et al. Cross-national prevalence and risk factors for suicidal ideation, plans and attempts. Br J Psychiatry 2008;192(2):98–105.
[14] Kroenke K, Strine TW, Spitzer RL, Williams JB, Berry JT, Mokdad AH. The PHQ-8 as a measure of current depression in the general population. J Affect Disord 2009;114:163–73.
[15] Posner K, Brown GK, Stanley B, Brent DA, Yershova KV, Oquendo MA, et al. The Columbia–suicide severity rating scale: initial validity and internal consistency findings from three multisite studies with adolescents and adults. Am J Psychiatry 2011;168:1266–77. http://dx.doi.org/10.1176/appi.ajp.2011.10111704.
[16] Shedden-Mora MC, Gross B, Lau K, Gumz A, Wegscheider K, Löwe B. Collaborative stepped care for somatoform disorders: a pre-post-intervention study in primary care. J Psychosom Res 2016;80:23–30.
[17] Löwe B, Piontek K, Daubmann A, Härter M, Wegscheider K, König HH, et al. Effectiveness of a stepped, collaborative, and coordinated health care network for somatoform disorders (Sofu-Net): a controlled cluster cohort study. Psychosom Med 2017;79:1016–24.
[18] Breiman L, Friedman J, Stone CJ, Olshen RA. Classification and regression trees. Taylor & Francis; 1984.
[19] Bishop CM. Pattern recognition and machine learning. Springer; 2006.
[20] Mardia KV, Kent JT, Bibby JM. Multivariate analysis. Academic Press; 1979.
[21] Ballings M. Van den Poel D, editor. AUC: threshold independent performance measures for probabilistic classifiers. 2013. (R package version 0.3.0). https://CRAN.R-project.org/package=AUC.
[22] Cule M, Samworth R, Stewart M. Maximum likelihood estimation of a multi-dimensional log-concave density. J Roy Stat Soc B 2010;72:545–607.
[23] Jordans M, Rathod S, Fekadu A, Medhin G, Kigozi F, Kohrt B, et al. Suicidal ideation and behaviour among community and health care seeking populations in five low- and middle-income countries: a cross-sectional study. Epidemiol Psychiatr Sci 2017:1–10. http://dx.doi.org/10.1017/s2045796017000038.
[24] Moreno-Kustner B, Jones R, Svab I, Maaroos H, Xavier M, Geerlings M, et al. Suicidality in primary care patients who present with sadness and anhedonia: a prospective European study. BMC Psychiatry 2016;16:94. http://dx.doi.org/10.1186/s12888-016-0775-z.
[25] Schulberg HC, Bruce ML, Lee PW, Williams JW, Dietrich AJ. Preventing suicide in primary care patients: the primary care physician's role. Gen Hosp Psychiatry 2004;26(5):337–45. http://dx.doi.org/10.1016/j.genhosppsych.2004.06.007.
[26] Stoudemire A, Frank R, Hedemark N, Kamlet M, Blazer D. The economic burden of depression. Gen Hosp Psychiatry 1986;8(6):387–94.
[27] Gaynes BN, West SL, Ford CA, Frame P, Klein J, Lohr KN. Screening for suicide risk in adults: a summary of the evidence for the US preventive services task force. Ann Intern Med 2004;140:822–35.
[28] Raue PJ, Ghesquiere AR, Bruce ML. Suicide risk in primary care: identification and management in older adults. Curr Psychiatry Rep 2014;16(9):466. http://dx.doi.org/10.1007/s11920-014-0466-8.
[29] Inagaki M, Ohtsuki T, Yonemoto N, Kawashima Y, Saitoh A, Oikawa Y, et al. Validity of the patient health questionnaire (PHQ)-9 and PHQ-2 in general internal medicine primary care at a Japanese rural hospital: a cross-sectional study. Gen Hosp Psychiatry 2013;35:592–7.
[30] O'Connor E, Gaynes B, Burda BU, Williams C, Whitlock EP. Screening for suicide risk in primary care: a systematic evidence review for the United States preventive services task force. Rockville, MD, USA: Agency for Healthcare Research and Quality; 2013.
[31] Large M, Galletly C, Myles N, Ryan CJ, Myles H. Known unknowns and unknown unknowns in suicide risk assessment: evidence from meta-analyses of aleatory and epistemic uncertainty. BJPsych Bull 2017;41(3):160–3. http://dx.doi.org/10.1192/pb.bp.116.054940.