The symptom and function dimensions of the Global Assessment of Functioning (GAF) scale

Available online at www.sciencedirect.com

Comprehensive Psychiatry 53 (2012) 292 – 298 www.elsevier.com/locate/comppsych

Geir Pedersen (a,⁎), Sigmund Karterud (a,b)

(a) Division of Mental Health and Addiction, Oslo University Hospital, Ullevaal, PO Box 4956 Nydalen, 0424 Oslo, Norway
(b) Institute for Clinical Medicine, University of Oslo, 0318 Oslo, Norway

Abstract

Objective: The objective was to investigate the validity and clinical impact of the symptom and function dimensions of the Global Assessment of Functioning (GAF) in the Diagnostic and Statistical Manual of Mental Disorders (DSM), Fourth Edition. Is there any need for revision with respect to the DSM, Fifth Edition?

Material: The sample comprised 2695 patients consecutively admitted to 14 different treatment units participating in the Norwegian Network of Personality-Focused Treatment Programs from 1998 to 2007.

Methods: Convergent and discriminant validity of the symptom and function dimensions of GAF was analyzed by their associations with demographic variables, diagnostic status, and other self-reported variables assessing symptom distress, interpersonal problems, work and social impairment, and quality of life.

Results: The validity of the separate GAF dimensions was confirmed by discriminant and concurrent associations with other relevant clinical measures. However, the traditional GAF measure based on the lower score of either symptom or function level was found to serve well as a global indicator of symptom distress and social dysfunction. A substantial difference between the symptom and function score of GAF was found in about 10% of the cases; and when differences were found, functional impairment was most often more severe.

Conclusion: This study confirms the validity of the 2 GAF dimensions. However, substantial differences between these dimensions rarely occur. We therefore recommend that the GAF scale be retained in the DSM, Fifth Edition, in roughly the same form as in the DSM, Fourth Edition.

© 2012 Elsevier Inc. All rights reserved.

None of the above authors has any financial disclosure/conflict of interest related to this manuscript.
⁎ Corresponding author. Department for Personality Psychiatry, Division of Mental Health and Addiction, Oslo University Hospital, Ullevaal, PO Box 4956 Nydalen, 0424 Oslo, Norway. E-mail address: [email protected] (G. Pedersen).
0010-440X/$ – see front matter © 2012 Elsevier Inc. All rights reserved. doi:10.1016/j.comppsych.2011.04.007

1. Background

The first standardized instrument for assessing patients' overall mental health was introduced nearly 50 years ago when Luborsky [1] reported the development of the Health-Sickness Rating Scale. More than a decade later, Endicott et al [2] modified the original instrument, which resulted in the Global Assessment Scale (GAS). Both the Health-Sickness Rating Scale and GAS are single 100-point rating scales reflecting overall functioning from 1, representing the hypothetically sickest, to 100, representing the hypothetically healthiest individual. In the Diagnostic and Statistical Manual of Mental Disorders (DSM), Third Edition (1980) [3], axis V was introduced as a measure of “adaptive functioning,” scored on a 7-point scale ranging from superior to grossly impaired. In the DSM, Revised Third Edition (1987), the Global Assessment of Functioning (GAF) scale replaced this axis V [4] for the assessment of psychological, social, and occupational functioning. The GAF scale was based on and made very similar to the GAS, although the upper range from 91 to 100 was omitted. Within the DSM, Fourth Edition (DSM-IV) [5], the GAF scale was extended to a 100-point scale. The GAF scale is intended to be a single measure of overall impairment caused by mental factors. Its intended use is to communicate the level of impairment, indicate the need for professional help, and reflect improvement or change over time. There has been, and still is, some scepticism concerning the use of one single scale to assess both the level of

psychological symptoms and social and occupational functioning [6-9]. Because these 3 dimensions do not always vary together, a single score poses a challenge to both the reliability and the validity of GAF measures. This scepticism has led to several studies of the GAF scale. The review by Goldman and colleagues [7] suggested the need for different measures for different areas of functioning, and their answer to this was the development of the Social and Occupational Functioning Assessment Scale (SOFAS). Another reason for developing the SOFAS was to include dysfunction related to general medical impairment and not only to mental impairment, as is the case for GAF. Along with 2 additional scales, the Defensive Functioning Scale and the Global Assessment of Relational Functioning, SOFAS is a supplement for further research under axis V of the DSM-IV. However, in routine clinical settings, the traditional GAF scale is far more frequently used than any of these 3 supplements.

The reliability of GAF scores has proven acceptable, especially when raters are experienced and trained [10-15]. Validity concerns the justification of the inferences drawn from observed GAF scores. Several studies have focused on GAF and its associations with other clinical phenomena. Some of these studies have found the validity of GAF to be only modest [7,16,17]; but other studies have found significant associations between GAF scores and the presence of axis II pathology, self-reported symptom distress, interpersonal problems, as well as social functioning [12,13,18]. Nonetheless, one of the strongest criticisms of the GAF scale concerns the use of one single measure as an operationalization of more than one clinical phenomenon [19]. Within the GAF manual, 10 main GAF sections are described by examples of symptoms and functional impairment, separated by “and/or” in the text. This makes it easy to form 2 separate scales.

The only study of such a simple split version of the traditional GAF scale was conducted by Jones and colleagues [20], who concluded that these 2 separate scales had different patterns of validity. As for the forthcoming DSM, Fifth Edition, it is important to consider the current evidence on the relation between the 2 dimensions of the GAF scale. That is, should the current GAF scale be retained; or should it be revised, for example, into a split GAF?

In 1998, the authors of this article constructed a Norwegian version of a split GAF scale [21]. The manual was the same as for the original GAF scale except that the symptom descriptions and function descriptions were kept on different sheets and rated separately. We kept the same procedure of choosing the lower of the 2 GAF scores as the “official” one. We supplemented the GAF manual with some additional guidelines and established an interactive GAF training program on the Internet (www.personlighetsprosjekt.com/gaf/). In addition, we have conducted 1-day GAF rating workshops based upon focused video interviews. The clinical staff in the Norwegian Network of Personality-Focused Treatment Programs have been trained according to these guidelines.

In this study, we report the findings from analyses of split GAF data from 2695 patients from the Norwegian Network who were rated at 2 points in time. The research questions for the study were as follows:

1. Are the symptom and function dimensions valid; for example, do they share unique variance with other well-established indicators of symptoms and social functioning?
2. Are the symptom and function dimensions rated equally often as “lowest GAF”?
3. How frequently are there substantial differences between the symptom and function dimensions of GAF?
4. Are the measured differences in GAF dimensions valid; for example, do they correspond to other clinically relevant indicators?
5. Does it matter with respect to treatment course that the GAF dimensions differ substantially?

2. Material and methods

2.1. The sample

This multisite study comprised data from 2695 patients consecutively admitted to 14 different treatment units participating in the Norwegian Network of Personality-Focused Treatment Programs [22] from 1998 to 2007. The units are specialized in the treatment of personality disorders (PDs) and provide intensive group-oriented, 18-week treatment programs. The majority of the patients were women (72%), and the mean age was 36 years (SD = 9). According to the DSM-IV, 76% had at least one PD; and 98% had at least one symptom disorder, of which 76% were mood disorders, 66% anxiety disorders, and 8% substance-related disorders. Further details regarding sociodemographic and diagnostic characteristics are reported by Karterud et al [18] and Pedersen and Karterud [23]. All data registered by the different units were collected regularly in a central, anonymous database, administered by the Department of Personality Psychiatry, Oslo University Hospital. All patients gave written consent, and the procedures were approved by the State Data Inspectorate and the Regional Committee for Medical Research and Ethics.

2.2. Assessment

All patients were diagnosed according to the DSM-IV by use of the Mini International Neuropsychiatric Interview for symptom disorders [24] and the Structured Clinical Interview for DSM-IV Axis II Personality Disorders (SCID-II) for PDs [25]. The reliability of the diagnoses was not investigated. However, staff in these units were thoroughly trained in diagnostic procedures; and they were instructed to follow the Longitudinal, Expert, All-Data procedure [26]. This means that tentative diagnoses were given at admission on the basis of referral letters, self-reported history and complaints, several assessment interviews, and SCID-II and Mini

International Neuropsychiatric Interview interviews and then reviewed at discharge. The therapists reassessed diagnoses, including the SCID-II protocol, based upon their extensive knowledge from a variety of group observations during approximately 18 weeks of treatment.

The staff within the Network had been thoroughly trained in GAF ratings, and all GAF scores in this study were based on consensus between 2 or more clinicians. In a generalizability study comprising 8 units in the Network, all of which are represented in the current study, GAF ratings revealed highly acceptable generalizability coefficients. Estimated reliability coefficients (absolute decision) based on 4 raters were .94 for the symptom scale and .95 for the function scale. Estimated coefficients based on 2 raters were .89 and .91, respectively [15].

Symptom distress was measured by the Revised Symptom Checklist 90 (SCL-90-R) [27], assessing 9 symptom dimensions. However, 3 of the subscales in SCL-90-R address interpersonal problems more than subjective symptoms. A study by Karterud et al [28] indicated that the sum score of these 3 subscales is associated with the severity of PDs, and labeled this sum score the Personality Severity Index (PSI). Later on, Starcevic et al [29] introduced the Current Symptom Index (CSI), derived from the sum of the remaining 6 subscales of the SCL-90-R. Further validations of these 2 indexes support their discriminant validity with respect to symptom distress and interpersonal dysfunction [30,31]. On this basis, the 2 derived indexes of SCL-90-R, PSI and CSI, are used in the current study instead of the traditional Global Severity Index. Interpersonal problems were measured by the sum score (Index of Interpersonal Problems [IIP]) of the Circumplex of Interpersonal Problems [32], a 48-item Norwegian version of the Inventory of Interpersonal Problems-Circumplex version [33].
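The reliability figures reported for 2 and 4 raters can be related to each other with a short worked example. The sketch below assumes a Spearman-Brown type step-up (reliability as a function of the number of raters being averaged), which is our assumption and not a formula stated in the text; the function name is hypothetical. Under that assumption, doubling the raters from 2 to 4 moves the reported 2-rater coefficients close to the reported 4-rater coefficients.

```python
def stepped_up_reliability(r_n: float, factor: float) -> float:
    """Reliability after multiplying the number of raters by `factor`.

    Spearman-Brown form; an assumption for illustration, not taken
    from the source text.
    """
    return factor * r_n / (1.0 + (factor - 1.0) * r_n)


# Reported coefficients for 2 raters (symptom scale, function scale).
symptom_2_raters = 0.89
function_2_raters = 0.91

# Going from 2 to 4 raters doubles the number of raters (factor = 2).
symptom_4_raters = stepped_up_reliability(symptom_2_raters, 2)
function_4_raters = stepped_up_reliability(function_2_raters, 2)

print(round(symptom_4_raters, 2), round(function_4_raters, 2))
```

The rounded values agree with the .94 and .95 reported for 4 raters, which suggests the two sets of estimates are mutually consistent under this assumption.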
In addition to the IIP sum score, the total number of PD criteria from the SCID-II protocol (excluding antisocial criteria from youth) was used as a measure of interpersonal dysfunction at admission. From 1999, the patients rated their current quality of life on a 10-point scale (1 = worst possible and 10 = best possible quality of life). Two-day test-retest reliability of the quality of life scale assessed in the Norwegian Network was 0.82. From 2005, self-estimated work and social dysfunction was assessed by the Work and Social Adjustment Scale (WSAS) [34]. The WSAS is a 5-item scale rated on a 9-point Likert format, with a sum score ranging from 0 to 40, where higher scores represent more distress. In addition to these self-report measures, patients admitted to treatment after 1998 also reported the number of months in which they were working or studying during the 12 months before admission.

Patients were rated on GAF at 2 points in time: at admission to day hospital treatment and at discharge approximately 20 weeks later. We defined a clinically significant difference between the GAF symptom score and the GAF function score as a difference of 10 or more points. The rationale for this criterion is that the GAF manual gives no guidelines for differentiating at narrower levels of the GAF scale and, furthermore, that a recent reliability study of GAF supports the assumption that such a difference would be highly reliable among the raters within the current study [15]. We allocated patients into 3 groups: Group “ND” had, according to our definition, no substantial difference between the 2 GAF scores, group “LS” had their symptom score lower than the function score, and group “LF” had their function score lower than the symptom score.

Reliable change was calculated by the formula 1.96 × SEdiff, where SEdiff is the standard error of the difference (change) score [35]; and effect size was calculated as (Xa − Xb)/SDa, where the measurement of Xa precedes Xb in time. Patients who dropped out are excluded from the discharge analysis because of missing data. Statistical analyses were conducted using Statistical Package for the Social Sciences, version 16 for Windows [36].
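The grouping rule and the two change statistics described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' analysis code: the function names are hypothetical, and the expression used for the standard error of the difference, SD × sqrt(2 × (1 − r)), is a commonly used form that we assume here because the text only cites a reference for it.

```python
import math
from statistics import mean, stdev

SUBSTANTIAL_DIFF = 10  # points; the study's criterion for a substantial difference


def gaf_lowest(gaf_s: float, gaf_f: float) -> float:
    """Traditional single GAF score: the lower of symptom and function."""
    return min(gaf_s, gaf_f)


def contrast_group(gaf_s: float, gaf_f: float) -> str:
    """Allocate a patient to 'LS', 'LF', or 'ND' by the 10-point criterion."""
    if gaf_f - gaf_s >= SUBSTANTIAL_DIFF:
        return "LS"  # symptom score lower than function score
    if gaf_s - gaf_f >= SUBSTANTIAL_DIFF:
        return "LF"  # function score lower than symptom score
    return "ND"      # no substantial difference


def reliable_change_threshold(sd: float, reliability: float) -> float:
    """1.96 * SE_diff, ASSUMING SE_diff = SD * sqrt(2 * (1 - reliability))."""
    se_diff = sd * math.sqrt(2.0 * (1.0 - reliability))
    return 1.96 * se_diff


def effect_size(earlier: list[float], later: list[float]) -> float:
    """(mean_a - mean_b) / SD_a as written in the text, a = earlier measurement.

    With this sign convention an increase in score yields a negative value.
    """
    return (mean(earlier) - mean(later)) / stdev(earlier)


# Hypothetical patients, for illustration only.
print(contrast_group(40, 50), contrast_group(50, 41), contrast_group(55, 45))
print(gaf_lowest(47, 52))
```

Because the criterion is "10 or more points", a 9-point gap (50 versus 41) is classified as ND, while a gap of exactly 10 points falls into LS or LF.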

3. Results

The mean GAF scores at admission were the following: GAF-S = 47.4 (SD = 5.3), GAF-F = 46.8 (SD = 6.2), and GAF-L (the lower of either the symptom or function score) = 45.3 (SD = 5.1). At discharge, the mean levels for patients who completed the treatment program (n = 2151) were GAF-S = 53.4 (SD = 7.2), GAF-F = 52.5 (SD = 7.7), and GAF-L = 50.9 (SD = 7.0).

Table 1 shows the results of the 4 linear regression models accounting for the variance of GAF-S and GAF-F at admission and discharge. Main findings were as follows: (1) The largest contribution to the GAF-S scale came from the GAF-F scale, and vice versa. This was found both at admission and discharge from treatment. (2) At admission, subjective experience of quality of life and symptom distress (CSI) accounted for more variance on the GAF-S than the GAF-F scale; and length of work before treatment, measures from the WSAS, the PSI, and the number of PD criteria accounted for more variance of the GAF-F scale than the GAF-S scale. (3) At discharge from treatment, the CSI accounted for more variance of the GAF-S than the GAF-F scale. Furthermore, the length of work before treatment had a unique contribution to the variance of the GAF-F scale. (4) Interpersonal problems, as measured by the Circumplex of Interpersonal Problems, had no unique contribution to the variance of the GAF scales. (5) All in all, the regression models accounted for about 40% of the GAF variance at admission and 60% at discharge from treatment.

Table 1
Regression models accounting for variance of the GAF scales (standardized β coefficients)

                                    Dependent variables
                                    Admission             Discharge (a)
Independent variables               GAF-S      GAF-F      GAF-S      GAF-F
The other GAF scale (b)             .558†      .592‡      .594‡      .696‡
Months at work/study last year      −.015      .168‡      −.024      .115‡
WSAS                                .007       −.132†     .019       −.090⁎
Quality of life                     .120†      −.056      .072       .059
Interpersonal problems (IIP)        .007       −.054      .007       .019
PSI                                 .041       .114⁎      .072       −.065
CSI                                 −.257‡     .069       −.363‡     −.110⁎
No. of PD criteria                  −.012⁎     −.085⁎
Model/analysis of variance
  Adjusted R2                       .458       .424       .640       .574
  df                                8; 558     8; 558     7; 620     7; 620
  F                                 60.703     53.108     160.119    121.774
  Significance                      < .001     < .001     < .001     < .001

(a) Only completers included in the analysis at discharge.
(b) Where GAF-S is the dependent variable, GAF-F is the independent variable; and vice versa.
⁎ P < .05, † P < .01, and ‡ P < .001 for standardized β coefficients.

At admission, 8% of the patients had, according to our definition, a substantial difference (>9 points) between the function and symptom score. At discharge, we found a substantial difference among 10%. For most patients, the pattern of difference between the 2 GAF scores was not the same at both admission and discharge (Table 2). When a difference of 10 or more points occurred between the symptom and function score of GAF, the function score was most often lower, both at admission and discharge of treatment (P < .001). Of those patients with significant differences between the GAF scores at admission, 28.4% displayed the same difference at discharge, whereas 54.4% had no differences at discharge. The remaining 17.2% had missing data due to early dropout. Moreover, of the patients that had differences between the GAF scores at discharge, 71% had no such substantial differences at admission.

Table 2
Relations between GAF symptom and function scores at admission and discharge

                                                                                                n      %
Differences between function and symptom scores at admission                                  204    7.6
  Symptom severity greater than functional impairment at admission (LS: low symptom)           73    2.7
  Functional impairment greater than symptom severity at admission (LF: low function)         131    4.9
Difference between function and symptom scores at discharge (a)                               212    9.9
  Symptom severity greater than functional impairment at discharge (LS)                        56    2.6
  Functional impairment greater than symptom severity at discharge (LF)                       156    7.3
No difference between function and symptom scores at admission or discharge                  1828   85.0
Symptom severity greater than functional impairment at both admission and discharge (LS)       14    0.7
Functional impairment greater than symptom severity at both admission and discharge (LF)       44    2.0
Symptom score greater than function score at admission (LF) but no difference at discharge     63    2.9
Function score greater than symptom score at admission (LS) but no difference at discharge     48    2.2
No difference at admission but symptom score greater than function score at discharge (LF)    111    5.2
No difference at admission but function score greater than symptom score at discharge (LS)     40    1.9

(a) Only completers included in the analysis at discharge (n = 2151).

No age differences were found between the 3 groups: lowest symptom (LS), lowest function (LF), and no difference (ND); but there were slightly fewer women within the LS group (63.4%) than within the ND group (72.3%, P < .05). Within the LS group, 91.4% of the patients were working or studying before admission, compared with 47.7% within the LF group (P < .001). Differences were found between the 3 groups with respect to the prevalence of earlier psychotic episodes, close relationships, suicidal thoughts, and sexual abuse (Table 3). No differences were found between the 3 groups with respect to prevalence of PD, but differences were found with respect to prevalence of some symptom disorders (Table 3).

Table 3
Clinical differences between GAF contrast groups (a)

                                            ND      LS      LF
                                            %       %       %      Significance of group differences (b)
Earlier psychotic episodes                  10.2    1.4     7.7    LS < ND*, LS < LF*
Close regular relationship at admission     41.8    29.2    41.7   LS < ND*
Suicidal thoughts before treatment          29.2    44.8    17.8   LS > LF‡, ND < LS†, ND > LF†
Previously sexually abused                  25.9    30.0    17.8   LF < LS*, LF < ND*
Posttraumatic stress disorder               9.3     16.4    6.9    LS > ND*, LS > LF*
Agoraphobia without panic                   4.0     1.4     13.8   LF > ND†, LF > LS†, ND > LF†
Agoraphobia with or without panic           22.0    12.3    29.0   LS < ND*, LS < LF†
Obsessive-compulsive disorder               5.2     9.6     3.1    LS > LF*
Eating disorder                             11.7    19.2    8.5    LS > LF*
Major depressive disorder                   54.8    47.9    40.8   ND > LF†

(a) Group ND: no difference between GAF symptom and function score; group LS: GAF symptom score lower than function score; and group LF: GAF function score lower than symptom score.
(b) χ2 2-sided: *P < .05, †P < .01, and ‡P < .001.

As to the fifth research question, no differences were found between the 3 groups with respect to dropout frequency; but the LS group was more likely (P < .01) to continue with outpatient group psychotherapy after the 18-week program (63.0%) than both the LF and ND groups (46.6%). Moreover, Table 4 indicates a harmonization between the symptom and function scores of GAF. That is, the LS group has a higher effect size on GAF-S than the LF group, whereas the LF group has a higher effect size on GAF-F than the LS group. Furthermore, the LS group has the highest effect size when it comes to symptom distress.

Table 4
Effect size and reliable change based on GAF contrast groups (a) at admission (b)

          ND               LS               LF
          ES     RC (%)    ES     RC (%)    ES     RC (%)    Significance of group differences on prevalence of RC (c)
GAF-F     0.94   48.2      0.16   21.2      1.42   56.8      ND > LS†, LF > LS†
GAF-S     1.14   56.6      1.73   72.7      0.55   37.8      LS > ND†, ND > LF†, LS > LF†
WSAS      0.51   28.9      0.53   23.1      0.77   37.8      Ns
QoL       0.77   41.2      0.90   26.7      0.26   11.5      LS > ND*, ND > LF†, LS > LF*
IIP       0.48   32.8      0.38   29.5      0.21   20.6      ND > LF*
PSI       0.38   43.6      0.58   55.0      0.15   30.8      LS > ND†, ND > LF*, LS > LF†
CSI       0.48   23.2      0.74   35.0      0.24   14.2      ND > LF*, LS > ND*, LS > LF†

ES indicates effect size; RC, reliable change; QoL, quality of life; Ns, not significant.
(a) Group ND: no difference between GAF symptom and function score; group LS: GAF symptom score lower than function score; and group LF: GAF function score lower than symptom score.
(b) Only completers included.
(c) χ2 2-sided: *P < .05, †P < .01.

When we analyzed the frequency of substantial differences between the 2 GAF dimensions, 2 groups of patients emerged that had no differences between the GAF scores at admission, but significant differences at discharge (Table 2). Of these, the subjects in the LS group only improved on their function score, whereas those in the LF group only improved their symptom score of GAF. Those who only increased their function score (LS group) had also worked more (7 vs 3 months, respectively; P < .001) during the last 12 months before admission than those who only increased their symptom score (LF group). Moreover, 83.8% of subjects in the LS group were working or studying before admission, compared with 47.5% of those in the LF group (P < .001).
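The percentages quoted in the text for the stability of GAF differences can be recovered from the counts in Table 2. The short Python check below is only an arithmetic illustration; the variable names are ours, and the count of patients with missing discharge data is derived here as a remainder and is not reported directly in the table.

```python
def pct(part: int, whole: int) -> float:
    """Percentage rounded to one decimal."""
    return round(100.0 * part / whole, 1)


# Counts taken from Table 2.
diff_at_admission = 204          # LS or LF at admission
same_at_both = 14 + 44           # LS at both times + LF at both times
resolved_at_discharge = 63 + 48  # difference at admission, none at discharge
diff_at_discharge = 212          # LS or LF at discharge (completers)
new_at_discharge = 111 + 40      # no difference at admission, LF or LS at discharge

# Derived remainder (an inference, not a tabulated value).
missing_at_discharge = diff_at_admission - same_at_both - resolved_at_discharge

print(pct(same_at_both, diff_at_admission))           # same difference at discharge
print(pct(resolved_at_discharge, diff_at_admission))  # no difference at discharge
print(pct(missing_at_discharge, diff_at_admission))   # missing because of dropout
print(pct(new_at_discharge, diff_at_discharge))       # difference first seen at discharge
```

These reproduce the 28.4%, 54.4%, and 17.2% given for patients with a difference at admission, and the roughly 71% of discharge differences that were not present at admission.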

4. Discussion

The main findings of this study were as follows:

The validity of the 2 GAF dimensions was confirmed in 2 ways. First, their variance revealed both discriminant and concurrent validity through their associations with other variables reflecting symptom distress and social functioning. Second, a considerable amount of their variance is not accounted for by these other variables or by the variance of each other. This means that the 2 GAF scales also reflect different aspects of clinical impairment not measured in this study.

The lowest GAF score is more often based on function level than on symptom level. At any point in time, around 8% to 10% of patients had a significant difference between their symptom and function level as measured by GAF. When taking both admission and discharge into account, about 15% of the patients in the current sample had a difference of 10 or more points between the symptom and function score of GAF. When differences occurred, functional impairment was usually the more severe.

When there are differences between the GAF scores at admission to treatment, it seems that one can expect a harmonization between the GAF scores at discharge. However, when there are differences between the GAF scores at discharge, it is more likely that there was no

difference at admission and that just one of the GAF scores has improved during treatment.

When differences between the GAF scales occur at admission to treatment, they are valid indicators of several clinical phenomena. Patients whose symptom score of GAF was lower (the LS group) were more prone to have been sexually abused and to have suicidal thoughts before treatment. They also had a higher prevalence of PTSD and eating disorders. Among the patients whose function score of GAF was lower (the LF group), agoraphobia occurred more often.

With respect to treatment course, patients with significantly lower GAF-S can expect a major symptom reduction and a considerably increased symptom score of GAF (ES = 1.73). Conversely, patients with lower GAF-F can expect a major gain in social functioning by treatment. There were no differences between the 2 groups with respect to dropout rates. Furthermore, patients who are working or studying before treatment are more likely to improve their function score during treatment than patients who have been out of work during the last year before treatment.

Substantial differences between the 2 GAF dimensions may occur through natural fluctuations in a patient's clinical state. Our findings indicate that a period with suicidal thoughts will, by definition in the GAF manual, lower the symptom score, but not necessarily the patient's ability to uphold more instrumental role expectancies such as work, studies, outdoor activities, or other social and family obligations. Likewise, periods with strong agoraphobic avoidance will lead to impaired social activities, but not necessarily to equally increased symptom distress as captured by definitions in the GAF manual.

Although the 2 dimensions of GAF are valid and able to detect some clinically significant differences between patients, the results indicate that the traditional GAF scale does surprisingly well as one global indicator of symptom distress and social dysfunction.
In around 90% of the cases, these dimensions, as they are defined in the GAF manual, are rated on a fairly equal level. Thus, the receiver of the information can assume in most cases that the single GAF score reflects a fairly unified level of symptoms and function. In around 10% of the cases, one single global GAF score fails in this respect. When there are substantial differences between the 2 separate GAF scores, these differences are associated with significant clinical characteristics.

Although we find the use of one global score to be a sound conservative measure for clinical purposes, there are still benefits to a differentiated scoring procedure. For research purposes, separate scores may be preferred to a single score because of different covariance patterns between the 2 dimensions. Furthermore, among the patients who had no significant difference between the 2 GAF dimensions at admission, some had significant differences at discharge. For those patients, the use of only one GAF score would hide the fact that one of the GAF dimensions had improved significantly during treatment.

There are possible limitations to this study. First, the sample comprises severely impaired patients with personality, mood, and anxiety disorders. In such a patient sample, clinical measures tend to have a restricted range, which might reduce the variance and covariance of some of the measures. Second, the allocation of patients into the 3 GAF contrast groups by the observed differences between their GAF-S and GAF-F scores introduces the possibility that the findings are affected by regression toward the mean, which may limit their generalizability. Third, the findings from the current study might not generalize to other patient samples with higher or lower GAF levels.

In conclusion, when exploring the uncertainties connected to the 2-dimensional nature of GAF, we find no alarming signs that one global GAF score conveys grossly misleading information. For clinical purposes, a global score derived from the lower of the 2 dimensions will serve well in most cases. In around 10% of the cases, the 2 GAF dimensions will differ significantly because of an imbalance between symptoms and function. All in all, we recommend that the GAF scale be retained in the DSM, Fifth Edition, in roughly the same form as in the DSM-IV.
Acknowledgment

We wish to thank the patients and staff from the following 14 treatment units in the Norwegian Network of Personality-Focused Treatment Programs for their contribution to this study: Department for Personality Psychiatry, Oslo University Hospital; the Group Therapy Unit, Lillestrøm District Psychiatric Center, Akershus University Hospital; the Unit for Group Therapy, District Psychiatric Center, Lovisenlund, Sørlandet Hospital HF, Kristiansand; the Unit of Personality Psychiatry, Department of Mental Health, Sanderud, Innlandet Hospital Health Authority; the Group Therapy Unit, Outpatient Clinic, Drammen Psychiatric Center; the Unit for Group Therapy, Vestfold Mental Health Care Trust, Tønsberg; the Group Therapy Unit, Alna District Psychiatric Center, Akershus University Hospital; the Årstad Day Unit, Fjell & Årstad District Psychiatric Center, Bergen; the Bergenhus Day Unit, District Psychiatric Center, Bergen; the Unit for Group Therapy, Skien District Psychiatric Center, Telemark Hospital Health Authority; Day Treatment Unit, Furuset District Psychiatric Center, Aker University Hospital, Oslo; the Group Therapy Unit, Ringerike Psychiatric Center, Hønefoss; the Outpatient Clinic in Farsund, District Psychiatric Center, Farsund; and the Unit for Group Therapy, Jessheim District Psychiatric Center, Akershus University Hospital HF.

References

[1] Luborsky L. Clinicians' judgements of mental health. Arch Gen Psychiatry 1962;7:407-17.
[2] Endicott J, Spitzer RL, Fleiss JF, Cohen J. The Global Assessment Scale. A procedure for measuring overall severity of psychiatric disturbance. Arch Gen Psychiatry 1976;33:766-71.

[3] American Psychiatric Association. Diagnostic and statistical manual of mental disorders, 3rd ed. Washington, DC: American Psychiatric Association; 1981.
[4] American Psychiatric Association. Diagnostic and statistical manual of mental disorders, 3rd ed. Washington, DC: American Psychiatric Association; 1987.
[5] American Psychiatric Association. Diagnostic and statistical manual of mental disorders, 4th ed. Washington, DC: American Psychiatric Association; 1994.
[6] Skodol AE, Link BG, Shrout PE, Horwath E. The revision of axis V in DSM-III-R: should symptoms have been included? Am J Psychiatry 1988;145(7):825-9.
[7] Goldman HH, Skodol AE, Lave TR. Revising axis V for DSM-IV: a review of measures of social functioning. Am J Psychiatry 1992;149(9):1148-56.
[8] Bacon SF, Collins M, Plake EV. Does the Global Assessment of Functioning assess functioning? J Mental Health Counseling 2002;24(3):202-12.
[9] Schwartz RC, Del Prete-Brown T. Construct validity of the Global Assessment of Functioning Scale for clients with anxiety disorder. Psychol Rep 2003;92(2):548-50.
[10] Dworkin RJ, Friedman LC, Telschow RL, Grant KD, Moffic HS, Sloan VJ. The longitudinal use of the Global Assessment Scale in multiple-rater situations. Community Ment Health J 1990;26(4):335-44.
[11] Løvdahl H, Friis S. Routine evaluation of mental health: reliable information or worthless ‘guesstimates’? Acta Psychiatr Scand 1996;93(2):125-8.
[12] Hilsenroth MJ, Ackerman SJ, Blagys MD, Baumann BD, Baity MR, Smith SR, et al. Reliability and validity of DSM-IV axis V. Am J Psychiatry 2000;157:1858-63.
[13] Startup M, Jackson MC, Bendix S. The concurrent validity of the Global Assessment of Functioning (GAF). Br J Clin Psychol 2002;41(4):417-22.
[14] Vatnaland T, Vatnaland J, Friis S, Opjordsmoen S. Are GAF scores reliable in routine clinical use? Acta Psychiatr Scand 2007;115(4):257-336.
[15] Pedersen G, Hagtvet KA, Karterud S. Generalizability studies of the Global Assessment of Functioning (GAF)–Split Version. Comp Psychiatry 2007;48(1):88-94.
[16] Roy-Byrne P, Dagadakis C, Unutzer J, Ries R. Evidence for limited validity of the revised global assessment of functioning scale. Psychiatric Serv 1996;47(8):864-6.
[17] Moos RH, Nichol AC, Moos BS. Global Assessment of Functioning ratings and the allocation and outcomes of mental health services. Psychiatric Serv 2002;53:730-7.
[18] Karterud S, Pedersen G, Bjordal E, Brabrand J, Friis S, Haaseth Ø, et al. Day hospital treatment of patients with personality disorders. Experiences from a Norwegian treatment research network. J Personal Disord 2003;17(2):173-93.
[19] Monrad Aas IH. Global Assessment of Functioning (GAF): properties and frontier of current knowledge. Ann Gen Psychiatry 2010;9:20, doi:10.1186/1744-859X-9-20.
[20] Jones SH, Thornicroft G, Coffey M, Dunn G. A brief mental health outcome scale-reliability and validity of the Global Assessment of Functioning (GAF). Br J Psychiatry 1995;166(5):654-9.
[21] Karterud S, Pedersen G, Løvdahl H, Friis S. Global Assessment of Functioning–Split Version. Background and scoring guidelines. Oslo: Dep. of Psychiatry, Ullevaal University Hospital; 1998.
[22] Karterud S, Pedersen G, Friis S, Urnes Ø, Irion T, Brabrand J, et al. The Norwegian network of psychotherapeutic day hospitals. Ther Communities 1998;1(19):15-28.
[23] Pedersen G, Karterud S. Associations between patient characteristics and ratings of treatment milieu. Nord J Psychiatry 2007;61(4):271-8.
[24] Sheehan DV, Lecrubier Y. Mini International Neuropsychiatric Interview (M.I.N.I.). Tampa, Florida / Paris, France: University of South Florida Institute for Research in Psychiatry / INSERM-Hôpital de la Salpétrière; 1994.
[25] First MB, Gibbon M, Spitzer RL, Williams JBW, Benjamin LS. The Structured Clinical Interview for DSM-IV Axis II Personality Disorders (SCID-II). Washington, DC: American Psychiatric Press; 1997.
[26] Spitzer RL. Psychiatric diagnoses: are clinicians still necessary? Comp Psychiatry 1983;24:399-411.
[27] Derogatis LR. SCL-90-R: Symptom Checklist-90-R: administration, scoring and procedures manual. USA: Minneapolis (MN): National Computer Systems; 1994.
[28] Karterud S, Friis S, Irion T, Mehlum L, Vaglum P, Vaglum S. A SCL-90-R derived index of the severity of personality disorders. J Personal Disord 1995;9:112-23.
[29] Starcevic V, Bogojevic G, Marinkovic J. The SCL-90-R as a screening instrument for severe personality disturbance among outpatients with mood and anxiety disorders. J Personal Disord 2000;14:199-207.
[30] Pedersen G, Karterud S. Is SCL-90R helpful for the clinician in assessing DSM-IV symptom disorders? Acta Psychiatr Scand 2004;110(3):215-24.
[31] Pedersen G, Karterud S. Using measures from the SCL-90-R to screen for personality disorders. Pers Ment Health 2010;4(2):121-32.
[32] Pedersen G. Norwegian revised version of Inventory of Interpersonal Problems–Circumplex (IIP-C). Tidsskr Nor Psykol foren 2002;39(1):25-34.
[33] Alden LE, Wiggins JS, Pincus AL. Construction of circumplex scales for the Inventory of Interpersonal Problems. J Pers Assess 1990;55:521-36.
[34] Mundt JC, Marks IM, Shear MK, Greist JM. The Work and Social Adjustment Scale: a simple measure of impairment in functioning. Br J Psychiatry 2002;180:461-4.
[35] Jacobson NS, Truax P. Clinical significance: a statistical approach to defining meaningful change in psychotherapy research. J Consult Clin Psychol 1991;59(1):12-9.
[36] SPSS. Statistical package for the social sciences, release 16.0.1 for Windows. Chicago: SPSS Inc; 2007.