Journal of Affective Disorders 64 (2001) 89–98 www.elsevier.com/locate/jad
Brief report
The Seasonal Health Questionnaire: a preliminary validation of a new instrument to screen for Seasonal Affective Disorder
Chris Thompson*, Andrew Cowan
Research Division of Community Clinical Sciences, Department of Mental Health, Faculty of Medicine, Health and Biological Sciences, University of Southampton, Southampton, UK
Received 17 October 1999; received in revised form 21 March 2000; accepted 3 April 2000
Abstract
Background: The main screening tool for Seasonal Affective Disorder (SAD) is the Seasonal Pattern Assessment Questionnaire (SPAQ), but its reliability and validity have been thrown into doubt by several studies.
Method: In this study we developed a new questionnaire, the Seasonal Health Questionnaire (SHQ), which is scored by computer to derive the four main operational criteria for diagnosis of SAD. A group of clinically diagnosed SAD patients was contrasted with a group of patients with recurrent non-seasonal depressive disorder using the SPAQ and the SHQ.
Results: The SHQ could be completed without difficulty by patients with long histories of recurrent mood disorder. The SPAQ and the Rosenthal Criteria were the least specific of the criteria for identifying SAD, misclassifying many non-seasonal patients.
Conclusions: After further development the SHQ may be a more appropriate screening instrument for SAD. The SPAQ should no longer be used for this purpose as it gives misleadingly high estimates of prevalence.
© 2001 Elsevier Science B.V. All rights reserved.
Keywords: Seasonal Affective Disorder; Seasonal Pattern Assessment Questionnaire; Seasonal Health Questionnaire; Validity
*Corresponding author. Department of Psychiatry, Royal South Hants Hospital, Brinton’s Terrace, Southampton SO14 0YG, UK. Tel.: +44-1703-825-533; fax: +44-1703-234-243. E-mail address: [email protected] (C. Thompson).
1. Introduction
Since its first description in 1984, Seasonal Affective Disorder (SAD) has become widely known and diagnosed as a recurrent depressive disorder in which episodes occur at particular times of the year, especially the winter. Atypical depressive symptoms including increased sleep and carbohydrate craving are prominent but not diagnostic as they frequently occur in non-seasonal depression. Apart from academic interest the importance of recognising the condition stems from the availability of a relatively specific treatment of bright light (Rodin et al., 1999). Four sets of criteria have been adduced for making clinical and research diagnoses of SAD. These are the original Rosenthal criteria based upon Research Diagnostic Criteria for depression (Spitzer et al., 1978; Rosenthal et al., 1984), DSM-IIIR (APA,
0165-0327/01/$ – see front matter © 2001 Elsevier Science B.V. All rights reserved. PII: S0165-0327(00)00208-1
C. Thompson, A. Cowan / Journal of Affective Disorders 64 (2001) 89 – 98
1987), DSM-IV (APA, 1994), and ICD-10 (WHO, 1992). These differ in several respects but they all rely explicitly or implicitly on several features of the disorder.
1. The diagnosis of depression in each of the episodes and an assumption that all counted episodes are similar.
2. A minimum number of episodes having occurred in the same season of the year.
3. A definition of a season.
4. A definition of the number of previous years during which the episodes took place (in some definitions this is assumed to be the whole of life).
5. A maximum number of episodes of non-seasonal depression during the same time period, producing in some definitions a minimum ratio of seasonal:non-seasonal episodes.
Due to the inaccuracies of recall for morbid events it would be necessary to observe a subject clinically over several years to be sure that a diagnosis of SAD was correct. However, at the time of screening or early in the course of treatment this is not available and it is therefore necessary to develop structured screening instruments with high reliability and validity against standard diagnostic criteria.
The most commonly used screening instrument for SAD is the Seasonal Pattern Assessment Questionnaire (SPAQ; Rosenthal et al., 1984). This consists of standard Likert scales (scored 0–4) for seasonal variations in sleep length, social activity, mood, weight, appetite and energy level. In each case the scale is for no change (= 0) to extremely marked change (= 4), giving a maximum score of 24. A ‘global seasonality score’ is also recorded as either not present, mild, moderate, marked, severe or disabling. The threshold for putative diagnosis of SAD (Kasper’s criteria) is 11 or greater plus a global rating of moderate or greater (Kasper et al., 1989; Jakobsen and Mellerup, 1998). Several studies have cast doubt on the reliability (Thompson and Isaacs, 1988; Thompson et al., 1988) and validity (Thompson et al., 1995) of the SPAQ in screening for SAD.
For example, it does not predict accurately the persistence of seasonal episodes in SAD patients (Raheja et al., 1996). The fundamental problem is that it has poor face validity for the clinical syndrome, not being based on the categorical definitions of recurrent depressive disorders but on the concept of seasonality as a continuously distributed variable. This approach has some advantages for population research but makes the definition of thresholds for putative diagnosis of the clinical syndrome difficult. No time frame is defined for the questionnaire so respondents are assumed to be answering in respect of their whole life. This contains the assumption that seasonality is a trait, whereas SAD has been shown to remit in a significant number of patients over a 7-year follow-up (Raheja et al., 1996).
However, until now the SPAQ has been the only screening instrument available and thus has been widely used despite evidence for over-sensitivity to the clinical diagnosis. If the prevalence were as high as some of the upper estimates have suggested, a large unrecognised, untreated pool of SAD sufferers would exist. It may therefore be helpful to attempt to develop a more valid and reliable screening tool, able to identify the extent of unrecognised need in various clinical groups such as primary care or psychiatric out-patients and establish more accurately the community prevalence of clinical SAD.
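The SPAQ threshold described in the Introduction can be illustrated with a short Python sketch. This is not the SPAQ's official scoring code: the function name `meets_kasper_criteria`, the item keys and the use of list position to rank the global rating are assumptions made for this example. It sums the six 0–4 item scores and checks for a total of 11 or greater together with a global rating of moderate or greater.

```python
# Global rating labels in the order given in the text, least to most severe.
GLOBAL_RATINGS = ["not present", "mild", "moderate", "marked", "severe", "disabling"]

# The six seasonal-variation items named in the text (key names are hypothetical).
SPAQ_ITEMS = ("sleep_length", "social_activity", "mood",
              "weight", "appetite", "energy_level")


def meets_kasper_criteria(item_scores: dict, global_rating: str) -> bool:
    """Return True if the summed score is >= 11 and the rating is >= moderate."""
    for item in SPAQ_ITEMS:
        score = item_scores[item]
        if not 0 <= score <= 4:
            raise ValueError(f"{item} must be scored 0-4, got {score}")
    total = sum(item_scores[item] for item in SPAQ_ITEMS)  # maximum is 24
    rating_rank = GLOBAL_RATINGS.index(global_rating)
    return total >= 11 and rating_rank >= GLOBAL_RATINGS.index("moderate")


# Illustrative respondent: every item scored 2, so the total is 12.
respondent = dict.fromkeys(SPAQ_ITEMS, 2)
print(meets_kasper_criteria(respondent, "moderate"))
```

Both conditions must hold, so a high summed score with a global rating of only mild would not count as a putative case under this sketch.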
2. Aims and predictions
1. To create a self-rating scale, the Seasonal Health Questionnaire, with high face validity for the clinical syndrome of SAD as defined in the four main diagnostic criteria. The questionnaire should be easy to use for the subject and the researcher.
2. To modify it according to qualitative comment on content and layout from a pilot group of clinically diagnosed SAD and major depressive disorder (MDD) patients.
3. To determine the construct and concurrent validity of the modified SHQ in out-patient groups with MDD and SAD. If there is acceptable preliminary validity it should attribute one of the four diagnostic categories of SAD to most of the SAD patients and few of the MDD patients. There should be a positive relationship between the SHQ and the SPAQ but the SPAQ is expected to score more patients as having SAD than any diagnostic criterion derived from the SHQ.
4. The four diagnostic criteria for SAD will be appreciably different in the number of subjects they allocate to the SAD diagnosis. No prior data were available to allow a directional hypothesis.
3. Methods
3.1. Construction of the Seasonal Health Questionnaire
A questionnaire was constructed for self-completion based upon the items contained in the four most frequently used diagnostic criteria. Any such questionnaire must be able to show the following.
(a) Has the respondent ever been depressed in the time period of interest?
(b) If so how many times? Is the number of episodes sufficient to demonstrate seasonality on at least one of the criteria?
(c) Does the respondent believe there is a seasonal influence?
(d) If so do the detailed seasonal characteristics of the recurrences meet one of the four criteria?
The Seasonal Health Questionnaire (SHQ; available on request) meets these design specifications. It utilises probe questions and cut-off points to allow individuals with less psychopathology to complete the questionnaire more quickly. It is divided into six sections, labelled A–F. The time period of interest is explicitly set at 10 years for two reasons. Firstly, the reliability of self-report information, suspect at the best of times, is likely to be negligible for periods before this. Secondly, Seasonal Affective Disorder is not a lifetime trait (Raheja et al., 1996) and thus episodes which occurred longer than 10 years ago may be irrelevant to the current depressive risk of the subject.
The first two sections of the SHQ are derived from the Hampshire Depression Scale (in development), a screening questionnaire for current major depressive disorder / depressive episode. Its specificity and sensitivity for current depression have been validated
against the SCAN (Schedules for Clinical Assessment in Neuropsychiatry) and were found to be 79% and 88%, respectively (Pender, personal communication). The sections were structured such that the obligatory symptoms were in section A. If none of the obligatory symptoms of any of the three diagnostic systems were reported the subject stops at that point. If at least one is reported they continue to section B and complete all the items. Only if the subject reports having had an episode of depression in the past 10 years do they proceed to the next section. This gathers information about the number of episodes. Section D is a probe question asking for the subject’s perception of seasonality in their episodes. An affirmative answer leads to section E that details the seasonal pattern. Finally, section F allows a ratio of seasonal to non-seasonal episodes to be calculated. DSM-IIIR explicitly requires a ratio of more than three seasonal to one non-seasonal depression. However, the ICD-10 and DSM-IV criteria only require that ‘seasonal episodes substantially outnumber non-seasonal ones’. For the purposes of the questionnaire a ratio of 2:1 was taken to represent this criterion numerically. The version of the questionnaire used in this study is shown in Appendix A. However, it should be noted that the questionnaire is provisional and may be subject to change as a result of further testing.
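The skip logic and the ratio rule described in this section can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' SPSS program: the function names are hypothetical, section F is assumed to be reached only after section E, and the 2:1 ratio adopted for DSM-IV and ICD-10 is assumed to be inclusive (at least 2:1), which the text does not specify.

```python
def sections_to_complete(has_obligatory_symptom: bool,
                         episode_in_last_10_years: bool,
                         perceives_seasonality: bool) -> list:
    """Return the SHQ sections (A-F) a respondent would work through."""
    sections = ["A"]
    if not has_obligatory_symptom:
        return sections              # no obligatory symptom: stop after A
    sections.append("B")
    if not episode_in_last_10_years:
        return sections              # no episode in the period: stop after B
    sections += ["C", "D"]           # C: number of episodes, D: seasonality probe
    if perceives_seasonality:
        sections += ["E", "F"]       # assumption: F is only reached via E
    return sections


def meets_ratio(seasonal: int, non_seasonal: int, criterion: str) -> bool:
    """Check the seasonal:non-seasonal episode ratio for one criterion set."""
    if seasonal <= 0:
        return False                 # no seasonal episodes, nothing to compare
    if criterion == "DSM-IIIR":
        return seasonal > 3 * non_seasonal    # more than 3:1
    if criterion in ("DSM-IV", "ICD-10"):
        return seasonal >= 2 * non_seasonal   # assumption: 2:1 is inclusive
    raise ValueError(f"no ratio rule sketched for {criterion!r}")


print(sections_to_complete(True, True, False))
print(meets_ratio(seasonal=7, non_seasonal=2, criterion="DSM-IIIR"))
```

Comparing `seasonal` with a multiple of `non_seasonal` avoids dividing by zero when a respondent reports no non-seasonal episodes.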
3.2. Pilot testing of content and layout
Ten patients were selected from psychiatric outpatients, five with major depressive disorder (MDD) and five with SAD. Each was chosen so that they would be able to work through much of the questionnaire if answering correctly. They were given a brief explanation of the purpose of the SHQ and were asked to look out for wording that was difficult to understand, misleading or confusing. After completion they were interviewed to note their comments.
3.3. Construction of a machine readable version and an automated diagnostic algorithm
The modified questionnaire was transformed using the Teleform Elite package, a software tool for enabling an Optical Mark Reader (OMR) to process the forms automatically. Then, to enable automated analysis of the machine-readable forms, an SPSS programme was written to allocate each patient to one or more of the following categories.
• No history of depression in the last 10 years.
• Major Depressive Disorder (DSM): one episode in the last 10 years.
• Depressive Episode (ICD-10): one episode in the last 10 years.
• Major Depressive Disorder (DSM): multiple episodes (non-seasonal) in the last 10 years.
• Depressive Episodes (ICD-10): multiple episodes in the last 10 years, non-seasonal.
• SAD: Rosenthal criteria.
• SAD: DSM-IIIR.
• SAD: DSM-IV.
• SAD: ICD-10.
3.4. Preliminary validation of the questionnaire
The SHQ and the SPAQ were distributed by post, with a covering letter and stamped addressed envelope, to a group of 59 clinically diagnosed SAD patients and 50 recurrent MDD patients over the age of 25 (as the SHQ investigates the last 10 years of adult life). The groups were identified from outpatient lists of specialist mood disorder clinics and all had been diagnosed by an experienced clinician. Self-diagnosed SAD patients were not included. Both bipolar and unipolar patients were included. The questionnaires were sent out in February 1999. A reminder was sent to non-responders after 3 weeks.
4. Results
As a result of piloting, the instructions were amended to enable the help of a friend or relative in completion of the SHQ. The range of answers in questions E(3) and F(1) was amended to restrict the possible answers to those required for diagnosis.
The response rate after the initial questionnaire mailing and the reminder was 68% in both groups. In the responses to Sections A and B almost all subjects who qualified for one depression diagnosis also qualified for the others, demonstrating the great congruity between the sets of diagnostic criteria for depression. Two (5.0%) SAD and four (11.8%) MDD respondents failed to qualify for any episodes of major depression or depressive episode as their illnesses were too mild. If seasonality was reported the episodes almost always occurred in the winter. The SAD group had more total episodes in the previous 10 years, with 92.5% experiencing greater than five, in comparison to 50% of the MDD patients.
The numbers of putative SAD diagnoses made by the SPAQ and the four criteria derived from the SHQ are shown in Table 1. The percentage of any SAD diagnosis was greater in the SAD group (90%) but 29% of the MDD group also received a diagnosis of SAD on at least one criterion. The Kasper criteria of the SPAQ produced more putative diagnoses in the SAD group (90%) than in the MDD group (29%) and the mean score for the SAD group (18.1) was significantly higher than that in the MDD group (9.1), showing that it has some validity. However, of all the criteria, it produced the greatest number of SAD diagnoses in both groups, showing that it is the most sensitive but least specific. The Rosenthal diagnostic criteria were less sensitive (75% diagnosis in the SAD group) but only marginally more specific (26% false positive rate) than the SPAQ. The other three diagnostic criteria were all roughly equal in performance, with the DSM-IV being slightly superior to the DSM-IIIR or ICD-10.
Kappa coefficients (chance adjusted: Table 2) were calculated for each combination of diagnostic criteria and confirmed a poor agreement between the SPAQ and all the diagnostic criteria. DSM-IIIR and the ICD-10 were the most highly correlated pair.

Table 1
A comparison of screening and diagnostic criteria for Seasonal Affective Disorder a

Diagnostic criteria    Number diagnosed (%)
                       SAD (N = 40)    MDD (N = 34)
No depression           2 (5)           4 (12)
SPAQ                   36 (90)         10 (29)
Rosenthal              30 (75)          9 (26)
DSM-IIIR               21 (53)          3 (9)
DSM-IV                 25 (63)          2 (6)
ICD-10                 23 (58)          3 (9)

a The number and % of patients meeting criteria for Seasonal Affective Disorder according to the Kasper criteria of the SPAQ, and the four available diagnostic criteria derived from the SHQ.
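The sensitivity and specificity figures discussed in the Results follow directly from the counts in Table 1. The Python sketch below is an illustration only: it treats the clinical diagnosis as the reference standard and the MDD respondents as the only non-cases, and the names `TABLE_1` and `sensitivity_specificity` are introduced here for the example.

```python
N_SAD, N_MDD = 40, 34  # respondents in each clinically diagnosed group

# (number diagnosed in SAD group, number diagnosed in MDD group), from Table 1
TABLE_1 = {
    "SPAQ": (36, 10),
    "Rosenthal": (30, 9),
    "DSM-IIIR": (21, 3),
    "DSM-IV": (25, 2),
    "ICD-10": (23, 3),
}


def sensitivity_specificity(sad_positive: int, mdd_positive: int) -> tuple:
    """Sensitivity in the SAD group and specificity in the MDD group."""
    sensitivity = sad_positive / N_SAD
    specificity = 1 - mdd_positive / N_MDD  # MDD positives are false positives
    return sensitivity, specificity


for name, (sad_pos, mdd_pos) in TABLE_1.items():
    sens, spec = sensitivity_specificity(sad_pos, mdd_pos)
    print(f"{name:10s} sensitivity={sens:.2f} specificity={spec:.2f}")
```

Because the comparison group consists of patients with recurrent depression rather than the general population, the specificity values from this calculation should not be read as community estimates.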
Table 2
Agreement between the SPAQ and four diagnostic criteria a

             DSM-IIIR    DSM-IV    ICD-10    SPAQ
Rosenthal    0.54        0.71      0.62      0.33
DSM-IIIR                 0.39      0.80      0.11
DSM-IV                             0.48      0.19
ICD-10                                       0.15

a Kappa coefficients of agreement between the five definitions of Seasonal Affective Disorder: Kasper’s criteria derived from the SPAQ, and four diagnostic criteria derived from the SHQ.
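For readers unfamiliar with the statistic in Table 2, Cohen's kappa for two yes/no classifications of the same subjects can be computed from a 2 × 2 agreement table. The paper does not report the underlying cross-tabulations, so the counts in this Python sketch are hypothetical and the function name `cohen_kappa` is introduced here for illustration.

```python
def cohen_kappa(both_pos: int, a_only: int, b_only: int, both_neg: int) -> float:
    """Cohen's kappa for two binary classifications of the same subjects."""
    n = both_pos + a_only + b_only + both_neg
    if n == 0:
        raise ValueError("no subjects")
    observed = (both_pos + both_neg) / n   # proportion of raw agreement
    a_pos = (both_pos + a_only) / n        # proportion positive on criterion A
    b_pos = (both_pos + b_only) / n        # proportion positive on criterion B
    # Agreement expected by chance if A and B were independent.
    expected = a_pos * b_pos + (1 - a_pos) * (1 - b_pos)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Hypothetical counts, not taken from the study.
kappa = cohen_kappa(both_pos=20, a_only=5, b_only=10, both_neg=15)
print(round(kappa, 2))
```

A kappa of 0 means agreement no better than chance and 1 means perfect agreement, which is why the low SPAQ values in Table 2 indicate poor agreement with the SHQ-derived criteria.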
5. Discussion
There are several limitations to our results in this preliminary evaluation of a new screening instrument for SAD. The Hampshire Depression Scale, a new questionnaire for screening for depression in primary care which forms the basis of the episode screen in the SHQ, has good sensitivity and specificity in the primary care setting but is as yet untested for retrospective diagnosis. No research diagnostic interview was undertaken as a gold standard against which to compare the putative diagnostic categories allocated by the SHQ, and we therefore relied on the clinical diagnoses of expert clinicians. We cannot rule out response bias (Nayyar and Cochrane, 1996), but the response rate was not so low as to prejudice this preliminary analysis of validity. Thus, for a preliminary test of feasibility and validity this methodology may have been sufficient.
The SHQ was inevitably longer and more difficult to complete than the SPAQ despite the use of cut-off points and probe questions. However, only four subjects returned the SPAQ without completing the SHQ, suggesting that community prevalence surveys would not be unduly compromised by non-compliance. The complex algorithm used for scoring the questionnaire does mean that it would be unsuitable for widespread use in clinical settings without the SPSS program. Sub-syndromal SAD cannot be identified by this version of the questionnaire since respondents are not invited to complete the later sections on seasonality until it has been established that they have suffered at least one full depressive episode.
The high rate of false positive diagnoses in the MDD group on the SPAQ further demonstrates its low specificity, as has been previously described (Lam and Levitt, 1999). This finding explains some of the discrepancies in the epidemiological literature on SAD. The original epidemiological study using the SPAQ (the Maryland study; Kasper et al., 1989) showed that most SPAQ positive subjects had recurrent MDD on subsequent telephone interview, rather than Seasonal Affective Disorder. In the presence of a low true positive rate a non-specific questionnaire would identify more false positives than true positives, which appears to have been the case. Such poor specificity probably also accounts for unrealistic estimates of prevalence in other studies using the SPAQ. With a few exceptions these have given results between 1 and 9% of the general population (Magnusson and Axelsson, 1993; Magnusson and Stefannson, 1993) with the largest (Kasper et al., 1989) suggesting a range of 4.3 to 10%. Reliable estimates of the prevalence of all major depressive disorder (of which SAD is but one subtype) are around 5%. Therefore the SPAQ must have been over-sensitive (including seasonal patients without major depressive disorder), non-specific (diagnosing non-seasonal depressed patients as SAD), or both to have given such high estimates. We have shown that it is certainly non-specific relative to all diagnostic criteria for SAD. A more robust approach to case identification using a structured interview based upon DSM-IIIR diagnostic criteria in 8000 subjects found a lifetime prevalence (0.4%) of one tenth the estimates based on the SPAQ (Blazer et al., 1998).
We have shown that a questionnaire for screening for SAD using clinical definitions can be constructed and used without much difficulty by patients with recurrent mood disorders, both seasonal and non-seasonal. It has concurrent validity, correlating highly with the SPAQ, but better specificity.
However, we have not yet demonstrated its positive predictive value for the clinical diagnosis in unselected community populations or in primary care, where it might be used eventually for case identification. Bearing in mind these limitations we can nevertheless compare the characteristics of the four clinical diagnostic criteria for SAD. This is a useful exercise since the effect of the changes in criteria between, say, DSM-IIIR and DSM-IV was not predictable from perusal of the items. Some individual items were made more restrictive and others less so. Using
the SHQ results the DSM-IV appears to be both the most sensitive and specific set of criteria compared to the clinical diagnoses. The Rosenthal criteria were less specific compared to all three of the later-developed criteria. They were the first criteria used in research studies of SAD and should now be considered obsolete. Such results indicate that this criterion could be omitted from use in further SHQ modifications, which will simplify the questionnaire.
A drawback of the questionnaire is the uncertainty about the accuracy of recall over the 10-year time span. However, this is an inevitable consequence of the identification of SAD itself, with the requirement to establish seasonal patterns, and would be a problem in both self-rated and observer-rated instruments. The SHQ is intended only as a screening instrument, not a ‘gold standard’ research diagnostic tool, and so the expectations are considerably less than, for example, the lifetime version of the Schedule for Affective Disorders and Schizophrenia. A formal test of positive predictive value in a representative population would demonstrate whether any extra bias has been introduced by the self-report format compared with a clinical interview.
The next steps in the validation process, currently underway, will be to re-distribute the SHQ in the summer to assess test–retest reliability under different seasonal conditions. Subsequently a large consecutive sample of primary care attenders will be used to assess the positive predictive value against a research diagnosis.
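The Discussion's point that a non-specific questionnaire yields more false positives than true positives when the condition is uncommon can be made concrete with Bayes' rule. The Python sketch below is a worked illustration, not an analysis from the study: it borrows the SPAQ sensitivity (0.90) and specificity (0.71) implied by Table 1 and the 0.4% prevalence from Blazer et al. (1998), and assuming that these clinic-based figures would carry over to a community sample is a strong simplification. The function name `positive_predictive_value` is introduced for the example.

```python
def positive_predictive_value(sensitivity: float,
                              specificity: float,
                              prevalence: float) -> float:
    """Proportion of screen-positive subjects who truly have the condition."""
    true_pos = sensitivity * prevalence                # cases correctly flagged
    false_pos = (1 - specificity) * (1 - prevalence)   # non-cases wrongly flagged
    return true_pos / (true_pos + false_pos)


# Illustrative inputs (see the assumptions stated in the lead-in).
ppv = positive_predictive_value(sensitivity=0.90, specificity=0.71, prevalence=0.004)
print(f"PPV = {ppv:.3f}")
```

With these inputs the predictive value comes out at a little over 1%, so almost all screen-positive subjects would be false positives. Specificity in the general population is likely to differ from that measured against depressed out-patients, which is why the paper calls for a formal test in a representative sample.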
6. Conclusion
In view of the inconsistencies in SAD epidemiology and our current findings we suggest that the use of the SPAQ as a screening tool and the Rosenthal criteria for diagnosis of SAD should cease. The SHQ holds promise as a new, more specific screening tool but requires further validation before it can be recommended for general use.
Appendix A
References
APA, 1987. In: 3rd Edition revised. Diagnostic and Statistical Manual of Mental Disorders, (DSM-IIIR). American Psychiatric Association, Washington DC.
APA, 1994. In: 4th Edition. Diagnostic and Statistical Manual of Mental Disorders, (DSM-IV). American Psychiatric Association, Washington DC.
Blazer, D.G., Kessler, R.C., Swartz, M.S., 1998. Epidemiology of recurrent major and minor depression with a seasonal pattern. The National Comorbidity Survey. Br. J. Psychiatry 172, 164–167.
Jakobsen, D.H., Mellerup, E., 1998. Prevalence of winter depression in Denmark. Acta Psychiatr. Scand. 97, 1–4.
Kasper, S., Rogers, B., Yancey, A. et al., 1989. Epidemiological findings of seasonal changes in mood and behaviour: a telephone survey of Montgomery County, Maryland. Arch. Gen. Psychiatry 46, 823–833.
Lam, R.W., Levitt, A.J. (Eds.), 1999. Canadian Consensus Guidelines for the Treatment of Seasonal Affective Disorder. Clinical and Academic Publishing, Vancouver, BC.
Magnusson, A., Stefannson, J.G., 1993. Prevalence of seasonal affective disorder in Iceland. Arch. Gen. Psychiatry 50, 941–946.
Magnusson, A., Axelsson, J., 1993. The prevalence of seasonal affective disorder is low among descendants of Icelandic immigrants in Canada. Arch. Gen. Psychiatry 50, 947–951.
Nayyar, K., Cochrane, R., 1996. Seasonal change in affective state measured prospectively and retrospectively. Br. J. Psychiatry 168, 627–632.
Raheja, S.K., King, E.A., Thompson, C., 1996. The seasonal pattern assessment questionnaire for identifying seasonal affective disorders. J. Affect. Disord. 41, 193–199.
Rodin, I., Birtwistle, J., Thompson, C., 1999. A meta-analysis of phototherapy in treatment of depressive disorders. In: Society for Light Treatment and Biological Rhythms Annual Meeting, Washington DC.
Rosenthal, N.E., Sack, D.A., Gillin, J.C. et al., 1984. Seasonal affective disorder: a description of the syndrome and preliminary findings with light therapy. Arch. Gen. Psychiatry 41, 72–80.
Spitzer, R.L., Endicott, J., Robins, E., 1978. Research Diagnostic Criteria for a Selected Group of Functional Disorders. Biometric Research, New York State Department of Mental Hygiene, New York.
Thompson, C. et al., 1988. A comparison of normal bipolar and seasonal affective disorder subjects using the seasonal pattern assessment questionnaire. J. Affect. Disord. 14, 257–264.
Thompson, C., Isaacs, G., 1988. Seasonal Affective Disorder, a British sample, symptomatology in relation to mode of referral and diagnostic subtype. J. Affect. Disord. 14, 1–13.
Thompson, C., Raheja, S.K., King, E.A., 1995. A follow-up study of seasonal affective disorder. Br. J. Psychiatry 167, 380–384.
WHO, 1992. International Classification of Mental and Behavioural Disorders: Clinical Descriptions and Diagnostic Guidelines. World Health Organisation.