Asking patients about their general level of functioning: Is it worth it for common mental disorders?


Psychiatry Research 229 (2015) 791–800



Contents lists available at ScienceDirect

Psychiatry Research journal homepage: www.elsevier.com/locate/psychres

Elena Olariu a,b,c, Carlos G Forero b,c,*, Pilar Álvarez d, José-Ignacio Castro-Rodriguez a,b,d, MJ Blasco a,b, Jordi Alonso a,b,c, on behalf of INSAyD Investigators 1

a Universitat Pompeu Fabra (UPF), Department of Experimental and Health Sciences, Spain
b Health Services Research Unit, IMIM-Institut Hospital del Mar d'Investigacions Mèdiques, Spain
c CIBER Epidemiología y Salud Pública (CIBERESP), Spain
d INAD-Institut de Neuropsiquiatria i Addiccions, Parc de Salut Mar, Barcelona, Spain

Article info

Article history: Received 6 April 2015; received in revised form 21 July 2015; accepted 31 July 2015; available online 7 August 2015

Keywords: Generalized anxiety disorder; Major depressive episode; Functional disability; Assessment; Diagnosis; Common mental disorders

Abstract

Functional disability (FD) is a diagnostic criterion for the psychiatric diagnosis of many mental disorders (e.g. generalized anxiety disorder (GAD); major depressive episode (MDE)). We aimed to assess the contribution of measuring FD to diagnosing GAD and MDE using clinical (Global Assessment of Functioning, GAF) and self-reported methods (Analog scale of functioning, ASF, and World Health Organization Disability Assessment Schedule, WHODAS 2.0). Patients seeking professional help for mood/anxiety symptoms (N = 244) were evaluated. The MINI interview was used to determine the presence of common mental disorders. Symptoms were assessed with two short checklists. Logistic and hierarchical logistic models were used to determine the diagnostic accuracy and the added diagnostic value of FD assessment in detecting GAD and MDE. For GAD, FD alone had a diagnostic accuracy of 0.79 (GAF), 0.79 (ASF) and 0.78 (WHODAS) and for MDE of 0.83, 0.84 and 0.81, respectively. Self-reported measures of FD improved the diagnostic performance of the number of symptoms (4% AUC increase) for GAD, but not for MDE. If assessed before symptom evaluation, FD can discriminate well between patients with and without GAD/MDE. When assessed together with symptoms, self-reported methods improve GAD detection rates. © 2015 Elsevier Ireland Ltd. All rights reserved.

1. Introduction

Mood and anxiety disorders are commonly seen in Primary Care (Terluin et al., 2009). Prevalence levels of depression and anxiety vary in this setting from 3% to 29% and from 1% to 23%, respectively, according to a multicenter WHO study (Goldberg and Lecrubier, 1995). They are highly prevalent and have a negative effect on quality of life, causing considerable functional disability (Olfson et al., 1997) and, subsequently, important personal, social and economic losses (McKnight and Kashdan, 2009). Thus, the evaluation of functional disability is essential for treatment planning (McQuaid et al., 2012) and monitoring clinical progress (Niv et al., 2007), being at the same time an important treatment outcome (Brown et al., 2014).

* Correspondence to: Carrer del Doctor Aiguader, 88, Edifici PRBB, 08003 Barcelona, Spain. Fax: +34 933 160 410. E-mail address: [email protected] (C. Forero).

1 INSAyD Investigators: Jordi Alonso, Carlos García Forero, Gemma Vilagut, Pilar Álvarez, José-Ignacio Castro-Rodriguez, Luis Miguel Martín-López, Maite Campillo, Lina Abellanas, Carrie Garnier, Maria Rosa Más, Marta Reinoso, Gabriela Barbaglia, Miquel A. Fullana, Alberto Maydeu, Anna Brown

http://dx.doi.org/10.1016/j.psychres.2015.07.088
0165-1781/© 2015 Elsevier Ireland Ltd. All rights reserved.

Many instruments, both clinical and self-reported, have been developed over the years for the assessment of mental disorders' symptoms in clinical practice (Olariu et al., 2014). This contrasts with the assessment of functional disability, where fewer instruments are available and where clinician ratings are the preferred method of assessment. Arguably, the most important instrument for evaluating change in psychiatric functioning has been the Global Assessment of Functioning (GAF) (Vatnaland et al., 2007; Pedersen and Karterud, 2012). The GAF was included in the DSM-IV and was proposed as the standard diagnostic assessment of functional disability in psychiatry (Hall, 1995; Sohail et al., 2013), reflecting the clinician's judgment of a patient's overall level of psychological, social and occupational functioning (Söderberg et al., 2005). It provides a good clinical overview with good reliability in research settings (Russell et al., 1979; Mezzich et al., 1985; Rey et al., 1988; Hilsenroth et al., 2000; Beitchman et al., 2001; Startup et al., 2002; Haro et al., 2003; Söderberg et al., 2005; Niv et al., 2007; Pedersen et al., 2007). In spite of its simplicity (Startup et al., 2002; Urbanoski et al., 2014) and focus on the quality of life of psychiatric patients (Dimsdale et al., 2010), the GAF has been criticized for having low inter-rater reliability in routine clinical settings (Vatnaland et al., 2007) and for failing to


capture the complexities of functional disability in psychiatric patients (Aas, 2010; Gold, 2014). Notwithstanding, a major criticism has been that it is sensitive to symptom severity (Hilsenroth et al., 2000; Söderberg et al., 2005; Smith et al., 2011; Von Korff et al., 2011) to a higher degree than it is to impairment.

As an alternative, patient-reported outcomes (PROs) have been proposed as measures of change in health status and treatment efficacy (Kulnik and Nikoletou, 2014) and have been included as the standard for routine clinical evaluations of disability in the recent DSM-5 (American Psychiatric Association, 2013). More specifically, the American Psychiatric Association, through the issue of DSM-5, adhered to the International Classification of Functioning, Disability and Health (ICF) (World Health Organization, 2001; Ustün et al., 2003) in order to converge with international standards. In the ICF, functioning is understood as a dynamic interaction between health, environment and personal factors. Functioning and disability depend on symptoms, but they are not symptoms, being conceptualized as impairments in body functions and structures, limitations in activities and restriction in participation. Importantly, a health condition must always be present when ICF is applied. In accordance with the ICF, DSM-5 proposes using the World Health Organization Disability Scale 2.0 (WHODAS 2.0) (Ustün et al., 2010a) as the standard for assessing disability. WHODAS 2.0 has good psychometric properties (Federici et al., 2009) tested in patients with diverse physical and mental conditions (Chwastiak and Von Korff, 2003; Chisolm et al., 2005; Baron et al., 2008; Luciano et al., 2010b) in culturally diverse countries. It is short, simple and easy to administer, and it can be used across physical and mental conditions to generate standardized disability levels and profiles in both clinical and general population settings.

Besides WHODAS 2.0, other self-reported instruments of functional disability have been developed. They are based on patients' responses to the GAF and are more sensitive to treatment effects than observer rating scales (Kellner et al., 1979), just like any other self-rating procedure. Additionally, they have been found to have acceptable validity and reliability between patients' and experts' ratings (Bodlund et al., 1994; Ramirez et al., 2008).

The criterion of clinically significant distress or impairment is necessary for the psychiatric diagnosis of many mental disorders (Konecky et al., 2014). Hence, standard systems, such as the DSM, include the evaluation of patient functioning as part of the compulsory criteria for establishing an active disorder. Even though the latest edition of the DSM has brought substantial change in the method of assessment of functional disability, little information exists about the contribution of clinical and self-reported data on functional disability in the diagnostic process.

The objective of this study was to assess the relative contribution of clinical and self-reported functional disability measures to the diagnosis of GAD and MDE, in terms of concordance with a final diagnosis made with consensus methods. We also aimed to compare the contribution of measuring functional disability in addition to evaluating symptoms, when diagnosing generalized anxiety disorder (GAD) and major depressive episode (MDE). Our main hypothesis was that functional disability yields good discriminatory ability to detect positive cases regardless of the method of assessment. Our second hypothesis was that the contribution of functional disability would be subsidiary to symptomatology information, as indicated by a marginal increase in the predictive ability when included in the assessment in addition to symptoms. As a final hypothesis, we expected that self-reported measures of functional disability would have similar diagnostic ability to clinician-reported measures.

2. Methods

2.1. Design

Prospective study (baseline, 1-month and 3-month assessments) with a convenience sample of patients who sought professional help for mood and anxiety disorder symptoms. Patients came from three different health-care levels: primary care, outpatient mental health centers and acute psychiatric inpatient facilities/hospitals. The study protocol was reviewed and approved by the Ethics committee of the Hospital del Mar Medical Research Institute (IMIM) and was carried out in accordance with the principles of the Declaration of Helsinki.

2.2. Sample

Health-care professionals from collaborating health-care centers invited patients to participate in our study according to the following inclusion criteria: adults older than 18 years seeking care for affective symptomatology. Patients were excluded if they presented psychotic symptoms, syndromes attributable to organic or substance origin, or significant cognitive impairment. Further details on recruitment have been published elsewhere (Olariu et al., 2014).

2.3. Data collection

Certified clinical psychologists, all of them with four years of postgraduate formal hospital internships as required by the Spanish health system, conducted the interviews. The clinical psychology residency program includes, among others, formal instruction in psychiatric diagnosis according to DSM and ICD systems, in differential diagnosis and in the use of assessment procedures and instruments, such as the SCID, the CIDI and the MINI. It also includes an extensive use of the GAF as part of the training given on the SCID assessment. After an initial appointment, patients underwent a personal interview where they were informed again about the scope and aim of the study. When the patient met all the requirements and agreed to participate, he/she signed the informed consent and the assessment was performed using a computer application. Following the clinical interview, patients were asked to fill in the self-reported questionnaires.

2.4. Instruments

We collected data on sociodemographic characteristics of the included patients. Additionally, the following instruments were included:

Mini-International Neuropsychiatric Interview 5.0 (MINI)

We used the Spanish version of the MINI 5.0 (Ferrando et al., 1998), an algorithmic, fully structured diagnostic interview for Axis I psychiatric disorders in DSM-IV (Sheehan et al., 1998). The MINI decision algorithm stops the exploration of DSM diagnostic criteria when DSM requirements for a positive case are met. Given that the main focus of our study was on major depressive disorder and generalized anxiety and that the assessments had a limited time window, we explored only the following comorbidities: dysthymic disorder, (hypo)manic episode and panic disorder. Thus, with the MINI we only assessed 5 disorders and any other possible comorbidity was excluded from the assessment.

Self-reported symptom assessment: Inventory of depression (INS-D) and anxiety (INS-G) symptom scales

We assessed patients' perceptions of MDE and GAD symptoms with two clinician-administered checklists based on DSM symptom criteria. The checklists were part of the INSAyD project (Inventory of Depression and Anxiety Symptoms) (Olariu et al., 2014), and consisted of direct exploration of DSM-IV-TR symptomatologic criteria for MDE (9 symptoms) and GAD (8 symptoms). These checklists (INS-D for MDE and INS-G for GAD) have excellent reliability and are highly responsive to short-term clinical evolution (Olariu et al., 2014). Using INSAyD measures, clinicians examined the presence of DSM symptoms by simply asking the patient whether each symptom was present within the specified time period. As a non-algorithmic exploration, the INSAyD checklists do not require symptom interpretation, and they reflect a direct insight of individual symptoms given by the patient. Unlike the MINI, the INSAyD measures are patient reported and check for all symptoms without stopping criteria. The INSAyD checklists do not assess impairment, substance or physiological effects; they only assess symptoms. Thus, the INSAyD checklists do not provide information about criteria B–D for MDE or criteria D–F for GAD.

Global Assessment of Functioning (GAF)

The Global Assessment of Functioning is a clinician-rated scale that assesses a patient's overall level of functioning. The GAF is a single-item scale ranging from 1 (worst functioning) to 100 (best functioning). Clinicians are guided by brief explanations and examples of symptoms and levels of functioning in 10-point intervals. For example, the 1–10 interval is described as “persistent danger of severely hurting oneself or others or inability to maintain hygiene or suicidal attempts”, whereas the 91–100 interval is depicted as “superior functioning in a wide range of activities; life problems never seem to be out of hand; sought by others because of many positive qualities”. We used the Spanish version of the GAF to assess the overall level of functioning of recruited patients (American Psychiatric Association, 1994).

Analog scale of functioning (ASF)

The analog scale of functioning is a one-item scale proposed for this study: “On a scale from 1 to 100, how would you currently rate your overall level of functioning?” Higher scores show a higher level of functioning. It is based on the clinical GAF and on previous versions of self-reported, one-item instruments (Bodlund et al., 1994; Ramirez et al., 2008). It aims to capture the patient's own perceptions of his/her current level of functioning.

World Health Organization Disability Scale 2.0 (WHODAS 2.0)

We used the Spanish validated version of the 12-item WHODAS 2.0 (Vázquez-Barquero et al., 2000). The WHODAS 2.0 provides a global measure of functional disability, assessing the impact of pathology on body functions and structures, limitations in activities and restriction in social participation. Response options follow a 5-point Likert scale and go from 1 (no difficulty) to 5 (extreme difficulty or cannot do). Scores were computed using the simple-scoring method (Ustün et al., 2010b), by adding up the scores from each of the items without recoding or collapsing the response categories. The total scores range from 12 to 60; higher scores reflect greater disability. WHODAS 2.0 has good psychometric properties (Federici et al., 2009). This instrument has a robust latent structure and construct validity in normal, pathological and psychopathological samples (World Health Organization, 2001; Ustün et al., 2003; Luciano et al., 2010b).

2.5. Statistical analysis

Diagnostic ability of functional disability models

The diagnostic accuracy of functional disability assessment was determined using logistic models. The dependent variable was diagnosis of MDE/GAD based on the MINI, the gold standard. Patients with either pure or comorbid MINI diagnoses of MDE/GAD were considered positive cases and patients without an active MINI diagnosis of the aforementioned disorders were considered negative cases. For the comorbidity models, patients with comorbid MINI diagnoses of MDE and GAD were considered positive cases and all other patients were considered negative cases. The models estimated the probability of a positive diagnosis as a function of two different diagnostic criteria: functional disability measures and the number of DSM symptoms of the respective disorder. A first set of models (Functional Disability Models) estimated the diagnostic ability of functional disability against MINI diagnoses, in the absence of symptoms. A second set of models (Symptom Model) was used to compute the diagnostic ability of the number of clinical symptoms to establish a diagnosis. In order to compute the incremental contribution of functional disability to diagnostic accuracy in the presence of symptoms, a third set of models added functional disability measures to the second set of models (Symptom Model) in a hierarchical logistic model. Given that the Symptom Model was nested within the Symptom + Functional Disability models, it was possible to compare the added diagnostic value of the GAF, WHODAS and ASF to symptom assessment. All models were adjusted for age, sex and number of physical disorders. Models were compared in terms of significance and goodness of fit (GOF), using the Hosmer and Lemeshow test and Nagelkerke's pseudo-R² value (Hosmer and Lemeshow, 2000). Predictor parameters were obtained using exponential transformation to odds ratios.
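The quantities named in this paragraph can be illustrated with a minimal sketch. It assumes that nested models are compared with a likelihood-ratio test on one added term (the text does not state which test was used), and all function names and the log-likelihood values in the usage note are hypothetical, chosen only for illustration.

```python
import math


def odds_ratio(b):
    """Odds ratio for a one-unit increase in a predictor with logit coefficient b."""
    return math.exp(b)


def percent_change_in_odds(or_value):
    """Percentage change in the odds per unit increase (negative means lower odds)."""
    return (or_value - 1.0) * 100.0


def lr_test_one_added_term(loglik_reduced, loglik_full):
    """Likelihood-ratio test for nested models that differ by ONE parameter.

    Assumption: this is one common way to test an added predictor; it is not
    confirmed as the authors' procedure. For 1 degree of freedom the chi-square
    survival function equals erfc(sqrt(x / 2)).
    """
    stat = max(2.0 * (loglik_full - loglik_reduced), 0.0)
    return stat, math.erfc(math.sqrt(stat / 2.0))


def nagelkerke_r2(loglik_null, loglik_model, n):
    """Nagelkerke's pseudo-R2: Cox-Snell R2 rescaled by its maximum attainable value."""
    cox_snell = 1.0 - math.exp(2.0 * (loglik_null - loglik_model) / n)
    max_cox_snell = 1.0 - math.exp(2.0 * loglik_null / n)
    return cox_snell / max_cox_snell
```

As a usage note, `percent_change_in_odds(0.93)` gives about -7, which is how an odds ratio of 0.93 per scale point reads as "7% lower odds per unit increase"; `lr_test_one_added_term(-100.0, -98.0)` uses made-up log-likelihoods and returns a statistic of 4.0.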
As we were interested in predictive discrimination, we also computed the c-index using the Area Under the Curve (AUC) in a ROC analysis. The c-index is more appropriate as a fit index than any other method for assessing diagnostic fit (Hosmer et al., 1997), and it is recommended for addressing agreement between predicted and observed values, as it does not require predictor categorization like the Hosmer and Lemeshow test. This strategy is more adequate as a GOF test for predictive discrimination models (Steyerberg et al., 2010). As an AUC, the minimum c-value is 0.5 and the maximum is 1.0. C-values of 0.7–0.8 show acceptable discrimination, values of 0.8–0.9 indicate excellent discrimination, and values ≥0.9 show outstanding discrimination (Pepe, 2003). Given the small sample size, model parameters and standard errors were estimated using bias-corrected and accelerated bootstrap estimation (Efron, 1987). We used 5,000 bootstrap replications and we adjusted for bias and skewness of the bootstrap distribution. In order to avoid model overfitting when assessing case classification, we computed the probability of active diagnoses in a k-fold cross-validation sample, by dividing the data set into 10 subsets of approximately n/10 cases (Molinaro et al., 2005). Each subset served as a test set after fitting the model in the remaining 9 × (n/10) observations. The logistic model was then used to classify the cases in the test sets, computing the c-index (i.e. the AUC) in each of them. The averaged AUC and its standard error over the 10 test sets were computed as a cross-validated estimate of diagnostic accuracy. We also determined the level of correlation between the different methods of assessing functional disability. We used the following cutoff points to quantify the levels of correlation between questionnaires: <0.45, weak correlation; 0.45–0.70, moderate


correlation; >0.7, strong correlation (Cohen, 1988). Given that the GAF and ASF are single-item assessments and hence internal consistency measurement error cannot be estimated, we only determined their test–retest reliability (Lohr, 2002). We did this in a subgroup of patients with a stable severity level (i.e. patients with changes under 1 standard error of measurement in the Hamilton Anxiety and Depression scales (Rejas et al., 2008)) within a one-month interval. Statistical analyses were run using STATA version 10 and SPSS version 20.0.
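The 10-fold cross-validated c-index described in this section can be sketched as follows. This is an illustrative outline under stated assumptions, not the authors' STATA/SPSS procedure: the gradient-ascent fitter, the function names and the synthetic data generator are all hypothetical, and the bias-corrected and accelerated bootstrap is not reproduced here.

```python
import math
import random
from statistics import mean, stdev


def auc(scores, labels):
    """c-index: chance that a random case outranks a random non-case (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one case and one non-case")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def linear_predictor(w, xi):
    """Intercept plus weighted sum of predictors (the logit of the predicted probability)."""
    return w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))


def fit_logistic(X, y, steps=300, lr=0.5):
    """Plain gradient-ascent logistic regression (hypothetical stand-in for a stats package)."""
    n, k = len(X), len(X[0])
    w = [0.0] * (k + 1)
    for _ in range(steps):
        grad = [0.0] * (k + 1)
        for xi, yi in zip(X, y):
            z = max(-30.0, min(30.0, linear_predictor(w, xi)))  # clamp to avoid overflow
            err = yi - 1.0 / (1.0 + math.exp(-z))
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj + lr * g / n for wj, g in zip(w, grad)]
    return w


def cross_validated_auc(X, y, k=10, seed=0):
    """Fit on k-1 folds, score the held-out fold, return (mean AUC, standard error)."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    aucs = []
    for test in (idx[i::k] for i in range(k)):
        held = set(test)
        train = [i for i in idx if i not in held]
        w = fit_logistic([X[i] for i in train], [y[i] for i in train])
        y_test = [y[i] for i in test]
        if len(set(y_test)) < 2:
            continue  # AUC is undefined when a fold holds a single class
        aucs.append(auc([linear_predictor(w, X[i]) for i in test], y_test))
    return mean(aucs), stdev(aucs) / math.sqrt(len(aucs))


def simulate(n=200, seed=1):
    """Synthetic data only: one disability-like score that raises the odds of being a case."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        X.append([x])
        y.append(1 if rng.random() < 1.0 / (1.0 + math.exp(-1.5 * x)) else 0)
    return X, y
```

Calling `cross_validated_auc(*simulate())` returns the averaged AUC and its standard error over the test folds. Ranking by the linear predictor gives the same AUC as ranking by predicted probability, because the logistic transformation is monotone.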

3. Results

The results of this study are estimated using only the baseline data of the project. 233 patients underwent the baseline assessment. Table 1 presents the sociodemographic characteristics of the recruited sample. In our sample, we found a total of 388 diagnoses in the 233 patients: 34.79% were MDE diagnoses, 30.67% melancholic MDE diagnoses, 2.58% dysthymia diagnoses, 3.09% (hypo)mania diagnoses, 8.51% panic diagnoses and 20.36% GAD diagnoses. 79% of the patients with no GAD or MDE diagnosis had no other diagnosis of mental disorder registered. Patients with an MDE diagnosis had the highest mean number of MDE symptoms (6.5; SD = 1.6), and a high mean of GAD symptoms (6.1; SD = 1.5). On the other hand, patients with a GAD diagnosis had a high mean number of GAD symptoms (6.8; SD = 0.9) but a low mean number of MDE symptoms (3.2; SD = 2.0). The highest

mean number of GAD symptoms was registered in patients with a comorbid MDE and GAD diagnosis: 7.4 (SD = 0.7). These patients had a relatively high mean number of MDE symptoms as well (6.4; SD = 1.5). Patients with no MDE or GAD symptoms had a higher mean number of GAD symptoms (4.2; SD = 2.3) than MDE symptoms (1.7; SD = 1.5). Table 2 shows the average scores of the three different functional disability measures by disorder status. As seen in the table, patients with no GAD or MDE diagnosis showed the highest functioning levels, as reflected by the mean scores of all functional disability measures. Patients with a GAD diagnosis had lower levels of functional disability than either patients with MDE only or patients with MDE and GAD, as shown by the mean scores of both clinician and self-reported measures. Cases with additional comorbidities had lower functioning levels than pure cases (cases with no other mental comorbidity). However, these differences were not significant. WHODAS and ASF correlated weakly with one another (r = −0.42). The test–retest reliability in the subgroup with a stable severity level (n = 49) was 0.65 for the GAF, 0.64 for the ASF and 0.86 for the WHODAS. The correlation between the WHODAS and the GAF (r = −0.54) was significantly higher than the correlation with the ASF (Fisher's Z = 1.65, p = 0.048); this difference is associated with 12% more shared variance between the WHODAS and the GAF than with the ASF. Finally, a somewhat higher correlation was obtained between ASF and GAF (ASF–GAF: r = 0.49), which was not significantly different from the other two measures. Table 3 shows the functional disability models (FDM). As seen, all

Table 1. Sociodemographic and clinical characteristics of the study sample at baseline. Columns 3 to 6 are groups by active mental disorder.

| Sociodemographic variables | Total, N (%) | No GAD^a or MDE^b, N (%) | Only GAD^a, N (%) | Only MDE^b, N (%) | GAD^a and MDE^b, N (%) |
|---|---|---|---|---|---|
| Total | 233 (100) | 75 (32.2) | 23 (9.9) | 79 (33.9) | 56 (24) |
| Age, M (SD) | 49.1 (14.8) | 50.5 (16.1) | 45 (13.2) | 51.1 (13.3) | 46.1 (15.2) |
| Sex | | | | | |
| Men | 72 (30.9) | 26 (34.7) | 8 (34.8) | 23 (29.1) | 15 (26.8) |
| Women | 161 (69.1) | 49 (65.3) | 15 (65.2) | 56 (70.9) | 41 (73.2) |
| Nationality | | | | | |
| Spanish | 212 (91.4) | 69 (92.0) | 22 (95.6) | 72 (91.1) | 49 (89.1) |
| Not Spanish | 20 (8.6) | 6 (8.0) | 1 (4.4) | 7 (8.9) | 6 (10.9) |
| Education level | | | | | |
| Less than Primary and Primary | 92 (39.5) | 32 (42.7) | 5 (21.7) | 30 (38.0) | 25 (44.6) |
| Secondary | 97 (41.6) | 25 (33.3) | 13 (56.6) | 38 (48.1) | 21 (37.5) |
| Higher | 44 (18.9) | 18 (24.0) | 5 (21.7) | 11 (13.9) | 10 (17.9) |
| Employment status | | | | | |
| Working | 106 (45.5) | 34 (45.3) | 15 (65.2) | 30 (38.0) | 27 (48.2) |
| Non-working | 118 (50.6) | 40 (53.3) | 8 (34.8) | 43 (54.4) | 27 (48.2) |
| Others | 9 (3.9) | 1 (1.3) | 0 (0) | 6 (7.6) | 2 (3.6) |
| Civil status | | | | | |
| Never married | 61 (26.2) | 18 (24) | 9 (39.1) | 22 (27.9) | 12 (21.4) |
| Married or living with a partner | 132 (56.6) | 42 (56) | 10 (43.5) | 47 (59.5) | 33 (58.9) |
| Divorced or separated | 30 (12.9) | 12 (16) | 3 (13) | 8 (10.1) | 7 (12.5) |
| Widowed | 10 (4.3) | 3 (4) | 1 (4.4) | 2 (2.5) | 4 (7.2) |
| Setting | | | | | |
| Primary Care | 91 (39.0) | 31 (41.3) | 15 (65.2) | 15 (19.0) | 30 (53.6) |
| Outpatient mental health centers | 122 (52.4) | 42 (56) | 8 (34.8) | 47 (59.5) | 25 (44.6) |
| Hospital | 20 (8.6) | 2 (2.7) | 0 | 17 (21.5) | 1 (1.8) |
| Number of physical disorders, M (SD) | 1.13 (1.01) | 0.96 (0.94) | 1.17 (0.65) | 1.24 (1.20) | 1.18 (0.94) |
| Categories of mental disorders | | | | | |
| MDE^b | 135 (57.9) | 0 (0) | 0 (0) | 79 (100) | 56 (100) |
| GAD^a | 79 (33.9) | 0 (0) | 23 (100) | 0 (0) | 56 (100) |
| Dysthymia | 10 (4.3) | 10 (13.3) | 0 (0) | 0 (0) | 0 (0) |
| (Hypo)mania | 12 (5.2) | 1 (1.3) | 0 (0) | 5 (6.3) | 6 (10.7) |
| Panic | 33 (14.2) | 5 (6.7) | 5 (21.7) | 10 (12.6) | 13 (23.2) |
| No diagnoses | | 59 (78.7) | | | |

a Generalized Anxiety Disorder. b Major Depressive Episode. M: mean; SD: standard deviation.


Table 2. Descriptive statistics of functional disability measures by mental disorder group. Values are average functioning (SD).

| Group | GAF (1 worse – 100 best) | ASF (0 worse – 100 best) | WHODAS 2.0 (12 best – 60 worse) |
|---|---|---|---|
| No GAD^a or MDE^b | 76.3 (10.2) | 65.2 (19.5) | 21.2 (8.4) |
| Only GAD^a | 70.9 (11.4) | 58.3 (16.5) | 23.1 (8.3) |
| Only MDE^b | 59.6 (14.3) | 41.1 (21.1) | 33.2 (10.4) |
| GAD^a and MDE^b | 60.5 (14.4) | 41.6 (17.6) | 34.3 (8.8) |
| Other mental disorder^c | 67.3 (9.2) | 58 (11.9) | 26.5 (9.9) |
| GAD^a + other mental disorder^c | 61.3 (8.5) | 52.0 (22.0) | 27.3 (9.2) |
| MDE^b + other mental disorder^c | 57.2 (11.5) | 32.6 (20.5) | 33.9 (11.6) |
| GAD^a and MDE^b + other mental disorder^c | 57.9 (15.4) | 40.3 (15.9) | 33.7 (8.0) |
| N (% missing) | 231 (0.9%) | 227 (2.6%) | 213 (8.6%) |

GAF: Global Assessment of Functioning; ASF: Analog scale of functioning; WHODAS 2.0: World Health Organization Disability Scale 2.0.
a Generalized Anxiety Disorder. b Major Depressive Episode. c Panic Disorder, Dysthymia, (Hypo)mania.

models were significant and had good fit to the data. The goodness of fit improved when a self-reported method of assessing functional disability (WHODAS) was used for both GAD and comorbid cases of MDE and GAD, with respect to the model based on clinician measures (GAF) (64% and 60% increase, respectively). The odds of having GAD were 7% lower for each unit increase in the GAF score. For the WHODAS, the odds of having GAD were 17% higher for each unit increase in the score. A similar pattern was observed for MDE and for MDE and GAD comorbidity. The odds ratios of the three scales are not directly comparable, as the ranges of the scales are not the same. Under suspicion of GAD or MDE, the assessment of functional disability alone had acceptable discrimination for GAD irrespective of the method used (AUC_WHODAS = 0.78, 95% CI 0.66–0.90; AUC_ASF = 0.79, 95% CI 0.72–0.87; AUC_GAF = 0.79, 95% CI 0.71–0.87) (see Fig. 1). The method of assessment of functional disability did not influence its diagnostic accuracy for MDE either: the ability of all three instruments to detect cases of MDE was excellent. The highest AUC was registered when functional disability was assessed with the ASF: AUC_ASF = 0.84 (95% CI 0.77–0.90) (see Fig. 2). A cutoff score, based on the Youden index, showed that self-reported methods had the highest sensitivity and specificity for MDE and clinician-rated methods for GAD (GAD: Sensitivity_GAF = 0.77; Specificity_GAF = 0.78;

| Questionnaire | AUC | 95% CI | Sens (95% CI) | Spec (95% CI) |
|---|---|---|---|---|
| GAF* | 0.79 | 0.71–0.87 | 0.77 (0.68–0.87) | 0.78 (0.68–0.89) |
| ASF* | 0.79 | 0.72–0.87 | 0.77 (0.66–0.88) | 0.77 (0.70–0.91) |
| WHODAS* | 0.78 | 0.66–0.90 | 0.75 (0.66–0.86) | 0.79 (0.69–0.91) |

Fig. 1. Receiver Operating Characteristic (ROC) curves of functional disability measures to identify GAD using MINI as a gold standard. GAF: Global Assessment of Functioning; WHODAS: World Health Organization Disability Scale 2.0 – 12 items; ASF: Analog scale of functioning; GAD: Generalized Anxiety Disorder; MINI: Mini International Neuropsychiatric Interview; AUC: Area Under the Curve; CI: Confidence Interval; Sens: Sensitivity; Spec: Specificity.

MDE: Sensitivity_ASF = 0.79; Specificity_ASF = 0.81). As seen in Table 4, for GAD, the diagnostic ability of the number of symptoms improved when functional disability was assessed with either clinician (GAF: p-value = 0.002) or self-reported methods (ASF: p-value < 0.0001; WHODAS: p-value = 0.002). The ASF model had the highest diagnostic accuracy in the k-fold cross-validation sample (see Table 4), when considering as positive cases patients with either pure or comorbid MINI diagnoses of GAD (AUC_ASF = 0.92, 95% CI 0.87–0.97) and as negative cases patients without an active MINI diagnosis of the

Table 3. Detection of active GAD, MDE or comorbid cases (GAD + MDE) according to the MINI using functional disability models in the absence of symptom information.

| Mental disorder (MINI) | Functional disability model | Model p-value | Nagelkerke pseudo-R² | Hosmer–Lemeshow chi-square (df = 8) | Hosmer–Lemeshow Sig. | Odds ratio [Exp(B)] | 95% CI for OR |
|---|---|---|---|---|---|---|---|
| Generalized anxiety (n = 134) | GAF* | 0.008 | 0.25 | 4.32 | 0.25 | 0.93 | 0.89–0.96 |
| | ASF* | <0.001 | 0.32 | 7.04 | 0.49 | 0.95 | 0.92–0.97 |
| | WHODAS* | <0.001 | 0.41 | 8.62 | 0.38 | 1.17 | 1.09–1.27 |
| Major depressive episode (n = 178) | GAF* | <0.001 | 0.46 | 3.50 | 0.89 | 0.86 | 0.80–0.94 |
| | ASF* | <0.001 | 0.42 | 5.64 | 0.70 | 0.93 | 0.90–0.96 |
| | WHODAS* | 0.002 | 0.27 | 3.78 | 0.87 | 1.11 | 1.05–1.17 |
| Comorbid MDE and GAD (N = 233) | GAF* | 0.005 | 0.10 | 9.87 | 0.28 | 0.96 | 0.94–0.98 |
| | ASF* | 0.002 | 0.10 | 11.76 | 0.16 | 0.98 | 0.96–0.99 |
| | WHODAS* | <0.000 | 0.16 | 18.31 | 0.02 | 1.07 | 1.03–1.10 |

GAD: Generalized Anxiety Disorder; MDE: Major Depressive Episode; MINI: Mini International Neuropsychiatric Interview; AUC: Area under the curve; GAF: Global Assessment of Functioning; ASF: Analog scale of functioning; WHODAS 2.0: World Health Organization Disability Scale 2.0. Functional Disability Models: logistic models used to predict the diagnostic ability of functional disability measures against MINI diagnoses. All models were adjusted for age, sex, number of physical disorders and number of additional psychiatric comorbidities.


| Questionnaire | AUC | 95% CI | Sens (95% CI) | Spec (95% CI) |
|---|---|---|---|---|
| GAF* | 0.83 | 0.76–0.89 | 0.80 (0.70–0.91) | 0.78 (0.58–0.98) |
| ASF* | 0.84 | 0.77–0.90 | 0.79 (0.69–0.95) | 0.81 (0.60–0.96) |
| WHODAS* | 0.81 | 0.75–0.88 | 0.65 (0.40–0.90) | 0.78 (0.60–0.96) |

Fig. 2. Receiver Operating Characteristic (ROC) curves of functional disability measures to identify MDE using MINI as a gold standard. GAF: Global Assessment of Functioning; WHODAS: World Health Organization Disability Scale 2.0 – 12 items; ASF: Analog scale of functioning; MDE: Major Depressive Episode; MINI: Mini International Neuropsychiatric Interview; AUC: Area Under the Curve; CI: Confidence Interval; Sens: Sensitivity; Spec: Specificity.

before-mentioned disorder. In case of MDE and in comorbid cases of MDE and GAD, none of the different methods of disability assessment seemed to improve the diagnostic performance of the Symptoms Model. When symptoms were present, similar AUC values were registered for all models in the cross-validation sample (see Table 4).
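The sensitivities and specificities reported in this section come from a cutoff chosen with the Youden index. A minimal sketch of that selection follows, assuming the cutoff maximizes J = sensitivity + specificity − 1 over the observed scores; the function name, the `higher_is_positive` flag and the example scores are hypothetical and are not study data.

```python
def youden_cutoff(scores, labels, higher_is_positive=True):
    """Return the cutoff that maximizes Youden's J = sensitivity + specificity - 1.

    higher_is_positive=True suits scales where high scores mean more disability
    (WHODAS-like); False suits scales where low scores indicate a case (GAF-like).
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one case and one non-case")
    sign = 1 if higher_is_positive else -1
    best = None
    for cut in sorted(set(scores)):
        # A subject is flagged as a case when the score is at or beyond the cutoff
        flagged = [sign * s >= sign * cut for s in scores]
        tp = sum(1 for f, y in zip(flagged, labels) if f and y == 1)
        tn = sum(1 for f, y in zip(flagged, labels) if not f and y == 0)
        sens, spec = tp / n_pos, tn / n_neg
        j = sens + spec - 1.0
        if best is None or j > best["youden_j"]:
            best = {"cutoff": cut, "sensitivity": sens,
                    "specificity": spec, "youden_j": j}
    return best
```

For a GAF-like scale, `youden_cutoff([40, 50, 60, 70, 80, 90], [1, 1, 1, 0, 0, 0], higher_is_positive=False)` selects a cutoff of 60 on this toy input, since every case scores at or below it and every non-case scores above it.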

4. Discussion

In this work we studied, independently and together, the role of symptoms and of functional disability in diagnostic decisions. We used three instruments of functional disability assessment that draw on two different sources of information on disability: clinical (the GAF) and self-reported (the WHODAS 2.0, the new DSM proposal, and the ASF, a simplified assessment with a subjective analog scale). Our results show that both self-reported and clinician-administered methods of measuring functional disability have good to excellent discriminatory ability to detect positive cases of GAD and MDE. Nevertheless, when functional disability is measured in addition to the number of symptoms, a non-significant increase in the predictive ability of the number of symptoms was observed for GAD. No similar effect was observed for MDE or comorbid cases of MDE and GAD. According to our results, MDE had a higher impact on functioning than GAD. Apparently, patients with GAD seem to bear their condition in a better functioning status than depressive patients. This could be due to the fact that GAD symptoms might be associated with lower levels of impairment than MDE. Another

explanation might be that functional disability in MDE is redundant with symptoms: symptoms might be already collecting information on functional impairment. Also, it might be that the assessment process is more lenient to the presence of functioning in case of anxiety, and more stringent to it in case of MDE. Future research should contemplate why the level of functioning – either clinical or self-reported – is not as specific as symptoms in common mental disorders in spite of it being a compulsory criterion. These possibilities open an interesting line of research that would require assessing functional disability in disorders with wider ranges of disability levels (for instance, schizophrenia, obsessivecompulsive disorders or social phobia) using different methods of assessment. The ideal diagnostic instruments for busy clinicians are tools that have good content validity, are short, reliable and highly specific. Even though the PRO measures herein considered (WHODAS; ASF) did not fully meet the requirement of high specificity, self-reported levels of functional disability do discriminate positive cases of GAD or MDE well enough to be used as tools in health care services under pressure from rising demand. Additionally, although the GAF and the ASF were less reliable than the WHODAS, it is worth observing that they only have one item rather than 12 concurrent items. Given the reduced number of items, this result suggests that both clinicians and patients have remarkable abilities to assess an overall level of disability. The clinician-administered measures of functional disability did not fully meet the requirements of an ideal diagnostic instrument either. The GAF requires specific training and substantial experience in psychiatric assessment. 
Additionally, it is time consuming and is sometimes more sensitive to symptom severity than to impairment, raising questions over its construct validity (Endicott et al., 2008; Von Korff et al., 2011) and its value over symptoms when predicting the allocation and outcomes of patient care (Moos et al., 2000, 2002).

Our results regarding the diagnostic accuracy of functional disability measures in the absence of symptom evaluation are in line with those obtained by Luciano et al. in a primary care sample with depression (Luciano et al., 2010a). In that case, the accuracy of the WHODAS 2.0 in discriminating depression “caseness” was found to be good, with levels similar to the ones reported in this study (AUC = 0.76, 95% CI 0.75–0.78).

When the effect of functional disability measures was adjusted for the number of self-reported symptoms, a proxy for severity (Andrews et al., 2007), the diagnostic performance of the models remained the same in case of MDE and comorbid cases of MDE and GAD, and improved slightly but not significantly in case of GAD. Our results indicate that symptoms alone take the lion's share of the information regarding the disorder status. The slightly different results obtained in case of GAD might be due to the complex relationship between symptom severity and functional impairment in patients with anxiety disorders, which has been reported by other studies as well. For instance, even though functioning varies only slightly with the number of anxiety symptoms (Leon et al., 1992; Michelson et al., 1998; Rapaport et al., 2005), it is as relevant and important as symptom severity for capturing treatment outcomes in anxiety disorders (Brown et al., 2014). It might be that, in anxiety disorders, not only treatment monitoring but also the diagnostic process should focus more on assessing functional disability along with disorder symptoms.
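The adjustment discussed above can be summarized compactly as a pair of nested logistic models. The notation below is introduced here purely for illustration and is an assumption, not the original model specification: D is the MINI diagnosis, S the number of self-reported symptoms, F the functional disability score (GAF, ASF or WHODAS), and Z the adjustment covariates.

```latex
\begin{aligned}
\text{Symptoms Model:}\quad
  & \operatorname{logit} P(D = 1) = \beta_0 + \beta_1 S + \boldsymbol{\gamma}^{\top}\mathbf{Z} \\
\text{Symptoms + FD Model:}\quad
  & \operatorname{logit} P(D = 1) = \beta_0 + \beta_1 S + \beta_2 F + \boldsymbol{\gamma}^{\top}\mathbf{Z} \\
  & \mathrm{OR}_F = e^{\beta_2}, \qquad
    \Delta\mathrm{AUC} = \mathrm{AUC}_{S+F} - \mathrm{AUC}_{S}
\end{aligned}
```

Under this reading, "no added diagnostic value" corresponds to an odds ratio for F whose confidence interval includes 1 and a change in AUC close to zero.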
Given that functional disability provides additional information that does not overlap with symptoms (ICF framework), our third set of models only measure the concurrence between different sources of information (symptom severity and functional disability) to establish a diagnosis. They are not predictive and our results must be seen as what in test theory is understood as homogeneity: the covariation of criteria with regard

Table 4. Incremental diagnostic accuracy of adding functional disability assessment to the number of symptoms, according to mental disorder.

| Mental disorder (MINI) | Model | Model p-value¹ | Nagelkerke pseudo-R² | Hosmer–Lemeshow fit test (df = 8): chi-square | Hosmer–Lemeshow fit test (df = 8): p-value | Parameter odds ratio Exp(B) | 95% CI for OR | AUC (cross-validation sample) | 95% CI for AUC |
|---|---|---|---|---|---|---|---|---|---|
| Generalized anxiety disorder (n = 134) | Symptoms | < 0.001* | 0.62 | 6.97 | 0.45 | 4.19 | 2.12–8.31 | 0.88 | 0.82–0.94 |
| | Symptoms + GAF | 0.002# | 0.72 | 4.11 | 0.85 | 0.95 | 0.90–1.00 | 0.90 | 0.85–0.95 |
| | Symptoms + ASF | < 0.001# | 0.79 | 2.74 | 0.95 | 0.90 | 0.86–0.96 | 0.92 | 0.87–0.97 |
| | Symptoms + WHODAS | 0.002# | 0.69 | 1.41 | 0.99 | 1.14 | 1.03–1.25 | 0.91 | 0.85–0.96 |
| Major depressive episode (n = 178) | Symptoms | < 0.000* | 0.87 | 0.26 | 0.99 | 11.58 | 2.55–52.58 | 0.98 | 0.96–0.99 |
| | Symptoms + GAF | 0.40# | 0.87 | 2.07 | 0.98 | 0.95 | 0.84–1.07 | 0.98 | 0.96–1.00 |
| | Symptoms + ASF | 0.01# | 0.91 | 0.61 | 0.99 | 0.89 | 0.79–0.99 | 0.98 | 0.96–1.00 |
| | Symptoms + WHODAS | 0.05# | 0.89 | 0.55 | 0.99 | 1.15 | 0.97–1.36 | 0.98 | 0.96–1.00 |
| Comorbid MDE and GAD (n = 233) | Symptoms | < 0.001* | 0.36 | 11.39 | 0.18 | MDE 1.18; GAD 2.10 | MDE 0.996–1.41; GAD 1.50–2.97 | 0.83 | 0.77–0.88 |
| | Symptoms + GAF | 0.187# | 0.36 | 13.69 | 0.09 | 1.00 | 0.98–1.00 | 0.82 | 0.76–0.88 |
| | Symptoms + ASF | 0.395# | 0.36 | 12.37 | 0.14 | 0.99 | 0.97–1.01 | 0.82 | 0.76–0.88 |
| | Symptoms + WHODAS | 0.153# | 0.36 | 3.91 | 0.87 | 1.03 | 0.99–1.07 | 0.82 | 0.76–0.88 |

MINI Mini International Neuropsychiatric Interview; AUC Area under the curve; CI Confidence Interval; OR Odds Ratio; GAD Generalized Anxiety Disorder; MDE Major Depressive Episode; Symptoms + GAF – Symptoms Model adjusted for functional disability as assessed with the Global Assessment of Functioning (GAF); Symptoms + WHODAS – Symptoms Model adjusted for functional disability as assessed with the World Health Organization Disability Scale 2.0 (WHODAS); Symptoms + ASF – Symptoms Model adjusted for functional disability as assessed with the analog scale of functioning (ASF). All models were adjusted for age, sex, number of physical disorders and number of additional psychiatric comorbidities.
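To help read the Hosmer–Lemeshow columns of Table 4, the sketch below converts a chi-square statistic into its upper-tail p-value for the df = 8 stated in the table. It uses the closed-form expression that holds for even degrees of freedom, so only the standard library is needed. The function name is a hypothetical helper introduced here; the study itself would have obtained these values from statistical software.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with even df.

    For even df the survival function has the closed form
    exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!.
    """
    if df <= 0 or df % 2:
        raise ValueError("this closed form needs a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k   # builds (x/2)^k / k! incrementally
        total += term
    return math.exp(-half) * total


# Chi-square values listed in Table 4 for the comorbid MDE and GAD models.
for chi2 in (11.39, 13.69, 12.37, 3.91):
    print(chi2, round(chi2_sf_even_df(chi2, 8), 2))
```

Applied to the four comorbid-model statistics, this returns 0.18, 0.09, 0.14 and 0.87, matching the p-values listed in the table. Large p-values indicate no evidence of poor fit.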


to the final decision, adjusted by the remaining available information.

All in all, functional disability assessment might serve as a first-step evaluation when patients seek care for anxious and depressive symptomatology (GAD or MDE). Thus, when a case is suspected of having anxiety due to reduced functioning, the diagnosis can be confirmed next with symptom evaluation. A first self-reported assessment of functional disability could help a busy clinician decide whether a more in-depth assessment of symptoms is needed when a patient presents with signs that raise the suspicion of a mental disorder.

Health-related functional disability is a complex construct, and factors other than symptomatology, such as personality traits (Mouzas et al., 2014), may influence its subjective assessment (World Health Organization, 2001). Such complex issues, though, exceed the diagnostic needs of patients seeking care for mental disorders at non-specialized healthcare levels. The assessment of functional disability is an important part of the diagnostic procedure that can provide valuable clues about the severity of the case in patients with mental healthcare demands. However, further research about determinants of subjective disability is needed, especially when disability is related to psychopathology.

The results of this study should be interpreted in view of its limitations. First of all, the sample was not large enough to validate our prediction models in a true test dataset. Ideally, independent datasets should have been used for the selection of the models and for estimating the generalization error (Molinaro et al., 2005). Nevertheless, 10-fold cross-validation techniques have been shown to be minimally affected by sample characteristics (Van der Gaag et al., 2006) and to achieve stable values of the prediction error (Molinaro et al., 2005). Additionally, we could not estimate the inter-rater reliability of the GAF.
Hence measurement errors due to differences in scores between raters could not be determined. This might affect the internal validity of our study's results. On the other hand, even though the GAF has been criticized over the years for its low inter-rater reliability in routine clinical use (Vatnaland et al., 2007), several studies have shown that its validity and reliability can be improved through appropriate training (Bates et al., 2002; Söderberg et al., 2005; Abbo et al., 2013), independently of the clinical experience of the rater (Støre-Valen et al., 2015). For this reason, the interviewers in our study were systematically trained by the research team to ensure adequate and consistent data collection and questionnaire administration. They participated in a 4-hour training session that included vignette assessments of clinical cases and GAF assessments. The average inter-rater reliability of the GAF scores for the vignettes presented was 0.72, as measured using a two-way mixed-effects ICC model for absolute agreement. In addition, the interviewers conducted 3 real assessments under the supervision of a senior psychiatrist. Also, they were all psychologists with broad clinical experience.

The strengths of our study lie in the in-depth clinical assessment and in the wide range of questionnaires assessing functional disability. In particular, we estimated functional disability with the two standard instruments recommended by the DSM: the GAF, the scale proposed by DSM-IV, and the WHODAS 2.0, the standard recommended by ICD-10 and, more recently, by DSM-5.

In summary, functional disability assessment can be used with adequate results to identify patients at risk of MDE or GAD in busy health care settings where more detailed and in-depth mental health evaluations are not feasible. In this type of setting, a clinically relevant instrument should always be short, straightforward and user-friendly and should always have sound psychometric properties.
A simple question asking the patient to assess his/her overall level of functioning seems to have results as promising as those of other more established self-reported instruments. Additionally, it can be used to improve the diagnostic ability of symptoms in case of GAD, with good results that can lead to a more accurate diagnosis, and subsequently to better outcomes for the patient. Future studies using broader primary care samples are necessary in order to generalize the results obtained. They will also help shed more light on the complex question of whether, when diagnosing common mental disorders in busy primary care practices, clinicians should focus first on assessing patients' functional status and then on patients' symptoms, thus inverting the order of assessments that is so clearly established in more specialized health care settings. Hence, it would be appropriate to carry out clinical trials and longitudinal follow-ups informing about the effectiveness of a model of assessment where self-reported functional disability would serve as a first and quick examination and where symptom evaluation would come second, completing and defining the diagnosis.

Contributors

JA, CGF, PA and JICR contributed to the conception and design of the study. EO, PA and JICR were involved in data collection. EO, CGF and PA drafted the manuscript. EO and CGF performed the analysis and interpretation of the data. JA and MJB critically reviewed the manuscript. All authors read and approved the final manuscript.

Conflict of interest

The authors declare there is no conflict of interest.

Acknowledgment

We thank the participating patients and health-care centers. This study was funded by a grant from the Spanish Ministry of Health, Instituto de Salud Carlos III FEDER (PI10/00530), FI11/00154; Ministerio de Ciencia e Innovacion FSE (JCI-2009-05486) and DIUE Generalitat de Catalunya (2014 SGR 748 and 2009 SGR 1095). The funders had no direct role in the design or conduct of the study, interpretation of the data or review of the manuscript.

References

Aas, I.H.M., 2010. Global Assessment of Functioning (GAF): properties and frontier of current knowledge. Ann. Gen. Psychiatry 9, 20.
Abbo, C., Okello, E.S., Nakku, J., 2013. Effect of brief training on reliability and applicability of Global Assessment of functioning scale by Psychiatric clinical officers in Uganda. Afr. Health Sci. 13, 78–81.
American Psychiatric Association, 1994. Diagnostic and statistical manual of mental disorders (4th ed.). Washington, DC.
American Psychiatric Association, 2013. Diagnostic and statistical manual of mental disorders (5th ed.). Washington, DC.
Andrews, G., Brugha, T., Thase, M.E., Duffy, F.F., Rucci, P., Slade, T., 2007. Dimensionality and the category of major depressive episode. Int. J. Methods Psychiatr. Res., 16.
Baron, M., Schieir, O., Hudson, M., Steele, R., Kolahi, S., Berkson, L., Couture, F., Fitzcharles, M.A., Gagné, M., Garfield, B., Gutkowski, A., Kang, H., Kapusta, M., Ligier, S., Mathieu, J.P., Ménard, H., Starr, M., Stein, M., Zummer, M., 2008. The clinimetric properties of the World Health Organization disability assessment schedule II in early inflammatory arthritis. Arthr. Care Res. 59, 382–390.
Bates, L.W., Lyons, J.A., Shaw, J.B., 2002. Effects of brief training on application of the Global Assessment of Functioning Scale. Psychol. Rep. 91, 999–1006.
Beitchman, J.H., Adlaf, E.M., Douglas, L., Atkinson, L., Young, A., Johnson, C.J., Escobar, M., Wilson, B., 2001. Comorbidity of psychiatric and substance use disorders in late adolescence: a cluster analytic approach. Am. J. Drug Alcohol Abuse 27, 421–440.
Bodlund, O., Kullgren, G., Ekselius, L., Lindström, E., von Knorring, L., 1994. Axis V – Global Assessment of Functioning Scale. Evaluation of a self-report version. Acta Psychiatr. Scand. 90, 342–347.
Brown, L.A., Krull, J.L., Roy-Byrne, P., Sherbourne, C.D., Stein, M.B., Sullivan, G., Rose, R.D., Bystritsky, A., Craske, M.G., 2014. An examination of the bidirectional relationship between functioning and symptom levels in patients with anxiety disorders in the CALM study. Psychol. Med., 1–15.
Chisolm, T.H., Abrams, H.B., McArdle, R., Wilson, R.H., Doyle, P.J., 2005. The WHODAS II: psychometric properties in the measurement of functional health status in adults with acquired hearing loss. Trends Amplif. 9, 111–126.
Chwastiak, L.A., Von Korff, M., 2003. Disability in depression and back pain: Evaluation of the World Health Organization Disability Assessment Schedule (WHO DAS II) in a primary care setting. J. Clin. Epidemiol. 56, 507–514.
Cohen, A., 1988. Statistical power for the behavioral sciences. Lawrence Erlbaum, Hillsdale, NJ.
Dimsdale, J.E., Jeste, D.V., Patterson, T.L., 2010. Beyond the global assessment of functioning: learning from Virginia Apgar. Psychosomatics 51, 515–519.
Efron, B., 1987. Better bootstrap confidence intervals. J. Am. Stat. Assoc. 82, 171–185.
Endicott, J., Spitzer, R.L., Fleiss, J., 2008. Global Assessment Scale (GAS); Global Assessment of Functioning Scale (GAF), Social and Occupational Functioning Assessment Scale (SOFAS). In: Rush, A.J., First, M.B., Blacker, D. (Eds.), American Psychiatric Publishing, Inc., Washington, DC, pp. 86–90.
Federici, S., Meloni, F., Presti, A.L., 2009. International literature review on WHODAS II (World Health Organization Disability Assessment Schedule II). Life Span Disabil. 12, 83–110.
Ferrando, L., Bobes, J., Gibert, M., Soto, M., Soto, O., 1998. M.I.N.I. Mini International Neuropsychiatric Interview. Versión en español 5.0.DSM-IV. Instituto IAP, Madrid.
Gold, L.H., 2014. DSM-5 and the assessment of functioning: the World Health Organization Disability Assessment Schedule 2.0 (WHODAS 2.0). J. Am. Acad. Psychiatry Law 42, 173–181.
Goldberg, D., Lecrubier, Y., 1995. Form and frequency of mental disorders across centres. In: Ustun, T., Sartorius, N. (Eds.), John Wiley & Sons on Behalf of WHO, Chichester, pp. 323–334.
Hall, R.C., 1995. Global assessment of functioning. A modified scale. Psychosomatics 36, 267–275.
Haro, J.M., Kamath, S.A., Ochoa, S., Novick, D., Rele, K., Fargas, A., Rodríguez, M.J., Rele, R., Orta, J., Kharbeng, A., Araya, S., Gervin, M., Alonso, J., Mavreas, V., Lavrentzou, E., Liontos, N., Gregor, K., Jones, P.B., 2003. The Clinical Global Impression - Schizophrenia scale: a simple instrument to measure the diversity of symptoms present in schizophrenia. Acta Psychiatr. Scand. Suppl., 16–23.
Hilsenroth, M.J., Ackerman, S.J., Blagys, M.D., Baumann, B.D., Baity, M.R., Smith, S.R., Price, J.L., Smith, C.L., Heindselman, T.L., Mount, M.K., Holdwick, D.J., 2000. Reliability and validity of DSM-IV axis V. Am. J. Psychiatry 157, 1858–1863.
Hosmer, D.W., Hosmer, T., Le Cessie, S., Lemeshow, S., 1997. A comparison of goodness-of-fit tests for the logistic regression model. Stat. Med. 16, 965–980.
Hosmer, D.W., Lemeshow, S., 2000. Applied logistic regression. Wiley, New York.
Kellner, R., Rada, R.T., Andersen, T., Pathak, D., 1979. The effects of chlordiazepoxide on self-rated depression, anxiety, and well-being. Psychopharmacology 64, 185–191.
Konecky, B., Meyer, E.C., Marx, B.P., Kimbrel, N.A., Morissette, S.B., 2014. Using the WHODAS 2.0 to assess functional disability associated with DSM-5 mental disorders. Am. J. Psychiatry 171, 818–820.
Kulnik, S.T., Nikoletou, D., 2014. WHODAS 2.0 in community rehabilitation a qualitative exploration of content and construct validity of a generic disability measure. Disabil. Rehabil. 36, 146–154.
Leon, A.C., Shear, M.K., Portera, L., Klerman, G.L., 1992. Assessing impairment in patients with panic disorder: the Sheehan Disability Scale. Soc. Psychiatry Psychiatr. Epidemiol. 27, 78–82.
Lohr, K.N., 2002. Assessing health status and quality-of-life instruments: attributes and review criteria. Qual. Life Res. 11, 193–205.
Luciano, J.V., Ayuso-Mateos, J.L., Fernandez, A., Aguado, J., Serrano-Blanco, A., Roca, M., Haro, J.M., 2010a. Utility of the twelve-item World Health Organization disability assessment schedule II (WHO-DAS II) for discriminating depression “caseness” and severity in Spanish primary care patients. Qual. Life Res. 19, 97–101.
Luciano, J.V., Ayuso-Mateos, J.L., Fernández, A., Serrano-Blanco, A., Roca, M., Haro, J.M., 2010b. Psychometric properties of the twelve item World Health Organization Disability Assessment Schedule II (WHO-DAS II) in Spanish primary care patients with a first major depressive episode. J. Affect. Disord. 121, 52–58.
McKnight, P.E., Kashdan, T.B., 2009. The importance of functional impairment to mental health outcomes: a case for reassessing our goals in depression treatment research. Clin. Psychol. Rev. 29, 243–259.
McQuaid, J.R., Marx, B.P., Rosen, M.I., Bufka, L.F., Tenhula, W., Cook, H., Keane, T.M., 2012. Mental health assessment in rehabilitation research. J. Rehabil. Res. Dev. 49, 121–138.
Mezzich, A.C., Mezzich, J.E., Coffman, G.A., 1985. Reliability of DSM-III vs. DSM-II in child psychopathology. J. Am. Acad. Child Psychiatry 24, 273–280.
Michelson, D., Lydiard, R.B., Pollack, M.H., Tamura, R.N., Hoog, S.L., Tepner, R., Demitrack, M.A., Tollefson, G.D., 1998. Outcome assessment and clinical improvement in panic disorder: evidence from a randomized controlled trial of fluoxetine and placebo. The Fluoxetine Panic Disorder Study Group. Am. J. Psychiatry 155, 1570–1577.
Molinaro, A.M., Simon, R., Pfeiffer, R.M., 2005. Prediction error estimation: a comparison of resampling methods. Bioinformatics 21, 3301–3307.
Moos, R.H., McCoy, L., Moos, B.S., 2000. Global Assessment of Functioning (GAF) ratings: Determinants and role as predictors of one-year treatment outcomes. J. Clin. Psychol. 56, 449–461.
Moos, R.H., Nichol, A.C., Moos, B.S., 2002. Global Assessment of Functioning ratings and the allocation and outcomes of mental health services. Psychiatr. Serv. 53, 730–737 (Washington, D.C.).
Mouzas, O.D., Zibis, A.H., Bonotis, K.S., Katsimagklis, C.D., Hadjigeorgiou, G.M., Papaliaga, M.N., Dimitroulias, A.P., Malizos, K.N., 2014. Psychological distress, personality traits and functional disability in patients with osteonecrosis of the femoral head. J. Clin. Med. Res. 6, 336–344.
Niv, N., Cohen, A.N., Sullivan, G., Young, A.S., 2007. The MIRECC version of the Global Assessment of Functioning scale: reliability and validity. Psychiatr. Serv. 58, 529–535 (Washington, DC).
Olariu, E., Castro-Rodriguez, J.-I., Alvarez, P., Garnier, C., Reinoso, M., Martín-López, L.M., Alonso, J., Forero, C.G., 2014. Validation of clinical symptom IRT scores for diagnosis and severity assessment of common mental disorders. Qual. Life Res. 24, 979–992.
Olfson, M., Fireman, B., Weissman, M.M., Leon, A.C., Sheehan, D.V., Kathol, R.G., Hoven, C., Farber, L., 1997. Mental disorders and disability among patients in a primary care group practice. Am. J. Psychiatry 154, 1734–1740.
Pedersen, G., Hagtvet, K.A., Karterud, S., 2007. Generalizability studies of the Global Assessment of Functioning-Split version. Compr. Psychiatry 48, 88–94.
Pedersen, G., Karterud, S., 2012. The symptom and function dimensions of the Global Assessment of Functioning (GAF) scale. Compr. Psychiatry 53, 292–298.
Pepe, M.S., 2003. The Statistical Evaluation of Medical Tests for Classification and Prediction. Oxford University Press, Oxford UK.
Ramirez, A., Ekselius, L., Ramklint, M., 2008. Axis V-Global Assessment of Functioning Scale (GAF), further evaluation of the self-report version. Eur. Psychiatry 23, 575–579.
Rapaport, M.H., Clary, C., Fayyad, R., Endicott, J., 2005. Quality-of-life impairment in depressive and anxiety disorders. Am. J. Psychiatry 162, 1171–1178.
Rejas, J., Pardo, A., Ruiz, M.Á., 2008. Standard error of measurement as a valid alternative to minimally important difference for evaluating the magnitude of changes in patient-reported outcomes measures. J. Clin. Epidemiol. 61, 350–356.
Rey, J.M., Stewart, G.W., Plapp, J.M., Bashir, M.R., Richards, I.N., 1988. Validity of Axis V of DSM-III and other measures of adaptive functioning. Acta Psychiatr. Scand. 77, 535–542.
Russell, A.T., Cantwell, D.P., Mattison, R., Will, L., 1979. A comparison of DSM-II and DSM-III in the diagnosis of childhood psychiatric disorders. III. Multiaxial features. Arch. Gen. Psychiatry 36, 1223–1226.
Sheehan, D.V., Lecrubier, Y., Sheehan, K.H., Amorim, P., Janavs, J., Weiller, E., Hergueta, T., Baker, R., Dunbar, G.C., 1998. The Mini-International Neuropsychiatric Interview (M.I.N.I.): the development and validation of a structured diagnostic psychiatric interview for DSM-IV and ICD-10. J. Clin. Psychiatry 59 (Suppl. 2), S22–S33 (quiz 34–57).
Smith, G.N., Ehmann, T.S., Flynn, S.W., MacEwan, G.W., Tee, K., Kopala, L.C., Thornton, A.E., Schenk, C.H., Honer, W.G., 2011. The assessment of symptom severity and functional impairment with DSM-IV axis V. Psychiatr. Serv. 62, 411–417.
Söderberg, P., Tungström, S., Armelius, B.A., 2005. Reliability of global assessment of functioning ratings made by clinical psychiatric staff. Psychiatr. Serv. 56, 434–438.
Sohail, Z., Bailey, R.K., Richie, W.D., 2013. How to evaluate disability. Front. Psychiatry, 4.
Startup, M., Jackson, M.C., Bendix, S., 2002. The concurrent validity of the Global Assessment of Functioning (GAF). Br. J. Clin. Psychol. 41, 417–422.
Steyerberg, E.W., Vickers, A.J., Cook, N.R., Gerds, T., Gonen, M., Obuchowski, N., Pencina, M.J., Kattan, M.W., 2010. Assessing the performance of prediction models: a framework for traditional and novel measures. Epidemiology 21, 128–138.
Støre-Valen, J., Ryum, T., Pedersen, G.A.F., Pripp, A.H., Jose, P.E., Karterud, S., 2015. Does a web-based feedback training program result in improved reliability in clinicians' ratings of the Global Assessment of Functioning (GAF) Scale. Psychol. Assess.
Terluin, B., Brouwers, E.P.M., van Marwijk, H.W.J., Verhaak, P.F.M., van der Horst, H.E., 2009. Detecting depressive and anxiety disorders in distressed patients in primary care; comparative diagnostic accuracy of the Four-Dimensional Symptom Questionnaire (4DSQ) and the Hospital Anxiety and Depression Scale (HADS). BMC Fam. Pract. 10, 58.
Urbanoski, K.A., Henderson, C., Castel, S., 2014. Multilevel analysis of the determinants of the global assessment of functioning in an inpatient population. BMC Psychiatry 14, 63.
Ustün, T.B., Chatterji, S., Bickenbach, J., Kostanjsek, N., Schneider, M., 2003. The International Classification of Functioning, Disability and Health: a new tool for understanding disability and health. Disabil. Rehabil. 25, 565–571.
Ustün, T.B., Chatterji, S., Kostanjsek, N., Rehm, J., Kennedy, C., Epping-Jordan, J., Saxena, S., von Korff, M., Pull, C., 2010a. Developing the World Health Organization Disability Assessment Schedule 2.0. Bull. World Health Organ. 88, 815–823.
Ustün, T.B., Kostanjsek, N., Chatterji, S., Rehm, J., 2010b. Measuring Health and Disability: Manual for WHO Disability Assessment Schedule (WHODAS 2.0). World Health Organization, Geneva.
Van der Gaag, M., Hoffman, T., Remijsen, M., Hijman, R., de Haan, L., van Meijel, B., van Harten, P.N., Valmaggia, L., de Hert, M., Cuijpers, A., Wiersma, D., 2006. The five-factor model of the Positive and Negative Syndrome Scale II: A ten-fold cross-validation of a revised model. Schizophr. Res. 85, 280–287.
Vatnaland, T., Vatnaland, J., Friis, S., Opjordsmoen, S., 2007. Are GAF scores reliable in routine clinical use. Acta Psychiatr. Scand. 115, 326–330.
Vázquez-Barquero, J.L., Vázquez Bourgón, E., Herrera Castanedo, S., Saiz, J., Uriarte, M., Morales, F., Gaite, L., Herrán, A., Ustün, T.B., 2000. [Spanish version of the new World Health Organization Disability Assessment Schedule II (WHO-DASII): initial phase of development and pilot study. Cantabria disability work group]. Actas Esp. Psiquiatr. 28, 77–87.
Von Korff, M., Andrews, G., Delves, J., 2011. Assessing activity limitations and disability among adults. In: Regier, D., Narrow, W., Kuhl, E. (Eds.), American Psychiatric Publishing, Inc., Washington, DC, pp. 163–188.
World Health Organization, 2001. International Classification of Functioning, Disability and Health (ICF).