Health-related quality of life in remitted psychotic depression✰


Journal of Affective Disorders 256 (2019) 373–379


Journal of Affective Disorders journal homepage: www.elsevier.com/locate/jad

Research paper

Kathleen S. Bingham a,⁎, Ellen M. Whyte b, Benoit H. Mulsant a,c, Anthony J. Rothschild d, Matthew V. Rudorfer e, Patricia Marino f, Samprit Banerjee g, Meryl A. Butters b, George S. Alexopoulos h, Barnett S. Meyers h, Alastair J. Flint a,i, on behalf of the STOP-PD Study Group

a University of Toronto, Department of Psychiatry, Toronto, ON, Canada
b University of Pittsburgh School of Medicine, Department of Psychiatry, Pittsburgh, PA, United States
c Centre for Addiction and Mental Health, Toronto, ON, Canada
d University of Massachusetts Medical School, Department of Psychiatry, Worcester, MA, United States
e National Institute of Mental Health, Rockville, MD, United States
f Weill Cornell Medical College, Department of Psychiatry, New York, NY, United States
g Weill Cornell Medical College, Department of Biostatistics and Epidemiology, New York, NY, United States
h Weill Cornell Medicine, New York-Presbyterian/Westchester Division, White Plains, NY, United States
i University Health Network, Toronto, ON, Canada

ABSTRACT

Background: Some patients with major depression continue to demonstrate deficits in health-related quality of life (HRQL) following remission. No data exist, however, regarding HRQL in remitted psychotic depression. In this study, we aimed to characterize HRQL in patients with psychotic depression receiving controlled pharmacotherapy.

Methods: This is a secondary analysis of a randomized controlled trial studying continuation pharmacotherapy of psychotic depression. We compared participants’ HRQL (measured using the SF-36) between baseline and remission and to population norms. We also compared SF-36 scores stratified by age and gender and examined the correlation between SF-36 scores and medical burden, depression score and neuropsychological performance in remission.

Results: SF-36 scores were significantly lower than population norms at baseline, but improved following remission to the level of population norms. Neither SF-36 scores nor magnitude of SF-36 improvement differed substantially between genders or between younger and older participants. In remission, depression scores were correlated with most SF-36 scales and medical burden was correlated with SF-36 scales measuring physical symptoms. Neuropsychological measures were generally not correlated with SF-36 scores.

Limitations: This study was a secondary analysis not powered specifically to measure HRQL as an outcome variable and the SF-36 was the only HRQL measure used.

Conclusions: Participants with remitted psychotic depression demonstrated levels of HRQL comparable to population norms, despite marked impairment in HRQL when acutely ill. This finding suggests that, when treated in a rigorous manner, many patients with this severe illness improve significantly from a clinical and HRQL perspective.

1. Introduction

The construct of health-related quality of life (HRQL) has been defined as a person's perceived life functioning and wellbeing in the physical, mental and social domains of health (Hays and Morales, 2001). HRQL is recognized as an important outcome to consider in healthcare delivery, research and policy (Bakas et al., 2012). Given the well-established personal, social and economic costs of major depressive disorder (MDD), HRQL and its measurement are of critical importance in the MDD literature (Lam et al., 2014). Patients with MDD report more impaired subjective HRQL than population norms (Ware et al., 1993). Although HRQL improves with depression treatment, several observational studies suggest that it may not return to population-level norms following remission (Bos et al., 2018; Pukrop et al., 2003; ten Doesschate et al., 2010). On the other hand, data from controlled trials suggest that the majority of patients with MDD in remission report normative levels of HRQL, similar to those found in the general population (IsHak et al., 2015; Kocsis et al., 2002; Miller et al., 1998). Independent of study design, there is heterogeneity in the degree of HRQL impairment following remission of MDD (Bos et al., 2018). Variables that have been identified to contribute to more

✰ Trial Registration and URL: Clinicaltrials.gov. Registry ID: NCT01427608
⁎ Corresponding author at: Toronto General Hospital, 200 Elizabeth St., 8 Eaton North Room 241, Toronto, Ontario M5G 2C4, Canada. E-mail address: [email protected] (K.S. Bingham).

https://doi.org/10.1016/j.jad.2019.05.068
Received 18 March 2019; Received in revised form 13 May 2019; Accepted 28 May 2019; Available online 28 May 2019
0165-0327/© 2019 Elsevier B.V. All rights reserved.


as Parkinson's disease that might affect neuromuscular function; and unstable physical illness, although many of the study participants had stable chronic physical problems. Using procedures approved by local institutional review boards, written informed consent was obtained from all participants or their substitute decision maker prior to the initiation of any research assessments or treatment.

In the open-label acute phase, participants were treated with a combination of sertraline (target dose of 150–200 mg/day) and olanzapine (target dose of 15–20 mg/day). Participants continued with open-label combination pharmacotherapy for a fixed 8-week stabilization period once they no longer had delusions or hallucinations (as determined by the delusion and hallucination severity items of the Schedule for Affective Disorders and Schizophrenia (SADS) (Spitzer and Endicott, 1979)) and either a) had a 17-item Ham-D score of ≤10 for 2 consecutive weeks (‘remission’), or b) had a Ham-D score of 11–15 with ≥50% reduction in their baseline Ham-D score by the end of the acute phase and were rated ‘very much improved’ or ‘much improved’ on the Clinical Global Impression (CGI) Scale (Guy, 1976) (‘near-remission’). While the acute phase of the study could last up to 12 weeks, participants entered the stabilization phase as soon as they met full-remission criteria. Participants who continued to meet remission or near-remission criteria at the end of the stabilization phase entered the 36-week RCT phase.

The current report includes only those participants who completed the acute and stabilization phases of the study and met the study's criteria for remission at the end of the stabilization phase (N = 119). Participants who met near-remission criteria at the end of the stabilization phase were not included in the analyses because we wanted to examine SF-36 scores in the fully remitted state. Thus, the data reported here were collected from participants who had been in remission for a total of 10 weeks following treatment with open-label sertraline and olanzapine.
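The remission and near-remission rules described above can be written as a small decision function. This is only an illustrative sketch, not the study's procedure: the `Assessment` record, its field names and the `classify_response` function are hypothetical, and the SADS delusion and hallucination items are collapsed into a single yes/no flag.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Assessment:
    """Hypothetical record of one participant's acute-phase ratings."""
    hamd_weekly: Sequence[int]   # 17-item Ham-D totals, oldest first
    hamd_baseline: int           # Ham-D total at acute-phase baseline
    psychosis_present: bool      # any delusion/hallucination still rated present
    cgi_improvement: str         # e.g. "much improved"


def classify_response(a: Assessment) -> str:
    """Return 'remission', 'near-remission' or 'not remitted'."""
    if a.psychosis_present or not a.hamd_weekly:
        return "not remitted"

    # a) Ham-D <= 10 on two consecutive weekly ratings
    last_two = list(a.hamd_weekly)[-2:]
    if len(last_two) == 2 and all(score <= 10 for score in last_two):
        return "remission"

    # b) Ham-D 11-15, at least a 50% drop from baseline, and a CGI rating
    #    of 'very much improved' or 'much improved'
    current = a.hamd_weekly[-1]
    halved = current <= 0.5 * a.hamd_baseline
    improved = a.cgi_improvement in {"very much improved", "much improved"}
    if 11 <= current <= 15 and halved and improved:
        return "near-remission"

    return "not remitted"
```

Only participants classified as 'remission' at the end of stabilization correspond to the sample analysed in this report.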

impaired HRQL in MDD, and may contribute to this heterogeneity, include worse premorbid functioning (Bos et al., 2018), medical burden (Ho et al., 2014), residual depressive symptoms (ten Doesschate et al., 2010; Pukrop et al., 2003), and, though it has not been well studied, cognitive impairment (Evans et al., 2014). In addition, certain subgroups of patients with MDD may experience poorer functional outcomes than others. The HRQL of patients with MDD with psychotic features (psychotic depression) has not been well studied. Psychotic depression is a severe disorder associated with significant disability during acute episodes and in short term follow-up (Coryell and Tsuang, 1982; Maj et al., 2007). In a study comparing HRQL measured using the 36-Item Short Form Survey (SF-36) (Ware et al., 1993) among different depression groups, patients with current psychotic depression exhibited poorer HRQL than those with severe, non-psychotic depression (Kruijshaar et al., 2003). However, no studies have investigated HRQL in remitted psychotic depression; it is therefore not known to what extent impairment of HRQL persists into remission in this patient group. The Sustaining Remission of Psychotic Depression study (STOP-PD II) examined the risks and benefits of continuing the antipsychotic medication olanzapine in patients who had experienced remission of psychotic depression when treated with both sertraline and olanzapine. HRQL was measured by the 36-item Short Form Survey (SF-36) at acute treatment baseline and following eight weeks of remission. Because older persons with MDD are more likely to experience psychotic features than their younger counterparts, STOP-PD II included participants aged 18–85 years, thus allowing us to analyze HRQL across the adult lifespan. The primary aim of this analysis was to examine HRQL in acute psychotic depression and in remission. 
Based on the previous findings of increased disability in psychotic depression (Coryell and Tsuang, 1982; Maj et al., 2007), and data suggesting poorer HRQL in depression generally compared to age-adjusted population norms or controls (Bos et al., 2018; Pukrop et al., 2003; ten Doesschate et al., 2010), we hypothesized that self-reported HRQL would improve with remission of psychotic depression, but nevertheless would remain more impaired than age- and gender-adjusted population norms. In addition, exploratory aims were to examine i) the relation of age and gender to HRQL and change in HRQL from baseline to remission, and ii) the association of depression score, medical burden, and neuropsychological performance with HRQL in remitted psychotic depression.

2.2. Measures

The primary dependent variable of HRQL was measured using the 36-item Short Form Survey (SF-36) (Ware et al., 1993). The SF-36 is a well-established, widely used measure of general health status that has evidence for reliability, construct validity and responsiveness to change in a broad age range of adults with depression (Beusterien et al., 1996; McHorney et al., 1993; Pukrop et al., 2003). Although the SF-36 was first developed as a measure of general health status, MDD researchers use it to measure a variety of related constructs, including health-related quality of life (Bingham et al., 2018). In this study, we used the SF-36 to measure HRQL, based on the health services research model described in the Medical Outcomes Study, which conceptualizes HRQL as the extent to which a person's health impacts his or her ability to function, as well as his or her perceived well-being in physical, mental, and social domains of life (Hays and Morales, 2001; Stewart et al., 1992). The SF-36 contains eight scales that measure the following domains: Physical Functioning, role limitations related to physical symptoms (Role-Physical), Mental Health, role limitations related to emotional symptoms (Role-Emotional), General Health, Bodily Pain, Vitality, and Social Functioning. Not surprisingly, the Mental Health and Role-Emotional scales are best able to discriminate between groups with and without psychiatric disorders and among groups with differing levels of psychiatric symptom severity (McHorney et al., 1993). However, we used all eight scales to gain a full picture of HRQL in patients with psychotic depression. Neuropsychological function was measured using the Repeatable Battery for the Assessment of Neuropsychological Status (RBANS) (Randolph, 1998) and the Delis-Kaplan Executive Function System (DKEFS) (Delis and Kaplan, 2001).
Neuropsychological domains measured included: (i) processing speed and sustained attention (RBANS coding task), (ii) executive functions and visual attention (the DKEFS Trail-Making Task, Condition 4), and (iii) delayed verbal recall (RBANS

2. Methods

2.1. Participants and study design

STOP-PD II is an NIMH-funded, multi-site, randomized, placebo-controlled trial investigating the benefits and risks of continuing antipsychotic medication in persons with remitted psychotic depression. The design and methods of STOP-PD II have been described in detail previously (Flint et al., 2013). The study has three phases: acute, stabilization, and randomized. This paper is based on data from the acute and stabilization phases only. At the time of enrollment in the acute phase of the study, participants were aged between 18 and 85 years, met Structured Clinical Interview for DSM-IV-TR (First et al., 2001) criteria for a current episode of major depressive disorder with at least one associated delusion (with or without hallucinations), and had a 17-item Hamilton Depression Rating Scale (Ham-D) (Hamilton, 1960; Williams et al., 2008) total score greater than or equal to 21. Exclusion criteria included: meeting current or lifetime DSM-IV criteria for any other psychotic disorder, bipolar affective disorder, or intellectual disability; current DSM-IV criteria for body dysmorphic disorder or obsessive-compulsive disorder; DSM-IV criteria for dementia preceding the index episode of depression or a 26-item Informant Questionnaire on Cognitive Decline in the Elderly (IQCODE) (Jorm, 2004) mean score ≥ 4 at acute phase baseline; DSM-IV criteria for substance abuse or dependence within the preceding three months; neurologic disease such


aged 60 years and older. Independent two-sided t-tests were used to compare these age groups on i) SF-36 scale scores at baseline and at remission, and ii) the magnitude of change in SF-36 scale scores between baseline and remission. The same procedure was used to compare baseline, remission and change scores between men and women. Finally, to explore the relationship between SF-36 scale scores and medical burden, residual depressive symptoms, and neuropsychological performance, we performed correlation analyses (using Pearson's r) between each of the eight transformed SF-36 scale scores in remitted STOP-PD II participants and the following variables: Ham-D total score, CIRS-G total score, RBANS Coding scaled score, RBANS Delayed List Recall scaled score and DKEFS Trail Making Test, Condition 4 scaled score. All analyses were conducted using SAS 9.4 (SAS Institute, Cary, NC). Given that we performed eight statistical tests (one for each SF-36 scale) per dependent variable, we adjusted for multiple comparisons using the Bonferroni correction, resulting in a significance threshold of p < 0.006 (0.05/8 = 0.00625).
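The multiple-comparison adjustment and the correlation analysis described above can be illustrated with a minimal sketch. It is not the authors' SAS code: the function names are hypothetical, the Bonferroni threshold is simply the conventional 0.05 divided by the eight SF-36 scales, and the conversion from r to a t statistic (with df = n − 2) is the standard textbook formula rather than something stated in the paper.

```python
import math


def bonferroni_alpha(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under the Bonferroni correction."""
    return alpha / n_tests


def pearson_r(x, y) -> float:
    """Pearson product-moment correlation between two equal-length samples."""
    if len(x) != len(y) or len(x) < 3:
        raise ValueError("need two samples of equal length (at least 3 pairs)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / math.sqrt(sxx * syy)


def r_to_t(r: float, df: int) -> float:
    """t statistic for testing r against zero, where df = n - 2."""
    return r * math.sqrt(df / (1.0 - r * r))


# Eight SF-36 scales tested per variable
threshold = bonferroni_alpha(0.05, 8)   # 0.00625, reported as p < 0.006
```

A correlation is then declared significant only if its p-value falls below `threshold`, which is stricter than the unadjusted 0.05 level.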

List Recall task). We targeted assessment of processing speed, executive function, and verbal memory because these are the domains found to be most impaired in psychotic and remitted depression (Bora et al., 2013; Schatzberg et al., 2000). Severity of depressive symptoms was measured with the total score of the 17-item Ham-D and severity of cumulative medical burden was measured with the total score of the Cumulative Illness Rating Scale for Geriatrics (CIRS-G) (Miller et al., 1992). The SF-36, Ham-D, and CIRS-G were each administered at acute phase baseline and the end of the stabilization phase, whereas the neuropsychological measures were administered at the end of the stabilization phase only.

2.3. Data analyses

SF-36 raw data were re-coded and transformed according to directions in the instrument manual (Ware et al., 1993). In this procedure, raw SF-36 item scores are first recoded such that all high scores indicate better health. Next, raw item scores are summed to create the eight raw scale scores. Finally, raw scale scores are transformed to create a scale ranging from 0 to 100 based on the tables and formula provided in the manual. The formula to create transformed scale scores is as follows:

transformed scale = [(actual raw score − lowest possible raw score) / possible raw score range] × 100.

Unless otherwise mentioned, these transformed scale scores were used in analyses. Although individual SF-36 domains do not conform to a normal distribution (Torrance et al., 2009), our relatively large sample size means that parametric tests (e.g. paired t-tests) have asymptotically correct distributions, and we therefore used conventional parametric tests.

With respect to the primary aim of the study, paired t-tests were used to examine change in SF-36 scale scores between baseline and Stabilization Week 8 (full remission) in the entire sample.
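The transformation formula above can be sketched in a few lines. The function names and the example raw-score range are hypothetical, and the simple item reversal shown here is a simplification: the instrument manual, not this sketch, governs how individual items are actually recoded.

```python
def reverse_item(value: int, lowest: int, highest: int) -> int:
    """Flip an item so that a higher value indicates better health."""
    if not lowest <= value <= highest:
        raise ValueError("item value outside its response range")
    return lowest + highest - value


def transform_scale(raw_score: float, lowest: float, highest: float) -> float:
    """Rescale a summed raw scale score to the 0-100 range.

    Implements: (raw - lowest possible) / possible range * 100.
    """
    score_range = highest - lowest
    if score_range <= 0:
        raise ValueError("highest possible score must exceed lowest")
    if not lowest <= raw_score <= highest:
        raise ValueError("raw score outside the possible range")
    return (raw_score - lowest) / score_range * 100.0


# Hypothetical scale whose summed raw score can run from 10 to 30
print(transform_scale(25, lowest=10, highest=30))  # 75.0
```

On this 0–100 metric, the lowest possible raw score maps to 0 and the highest to 100, so scales with different numbers of items become directly comparable.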
Secondly, in order to compare SF-36 scale scores in STOP-PD II participants with those of the general population, we used the norms for the general American population provided in the instrument manual (Ware et al., 1993). Population norms are available for men and women separately in the following age categories: 18–24, 25–34, 35–44, 45–54, 55–64, 65–74, and 75+ years. Based on these norms, SF-36 scale scores from the STOP-PD II participants at both baseline and remission were standardized to z-transformed values by subtracting the mean scale score of the general population from each STOP-PD II participant's scale score and dividing by the standard deviation of the general population, for each age/gender stratum separately. Mean z-transformed values for each SF-36 scale score, weighted by the proportion of participants in each age/gender stratum in our study sample, were then calculated. Mean z-scores approximate the number of standard deviations by which STOP-PD II participants' SF-36 scores differ from the age- and gender-adjusted US population mean; negative values indicate scores below the norm. Under the null hypothesis, SF-36 scores in the STOP-PD II participant group follow population distributions, and thus z-transformed values will follow a standard normal distribution (with mean 0 and standard deviation of 1). Therefore, we examined the z-transformed values descriptively via means and standard deviations, using the normative population mean of zero as a point of reference. We performed one-sample t-tests to examine whether STOP-PD II scale scores were significantly different from the standard normal mean of zero. We also calculated the proportion of participants in remission with persistent and clinically meaningful HRQL impairments in our study sample.
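The standardization against population norms can be sketched as follows. This is an illustration under stated assumptions, not the study's code: the `NORMS` table holds made-up values for two strata (real values would come from the instrument manual), the function names are hypothetical, and the default cut-off of −2 anticipates the two-standard-deviation threshold for impairment that the text goes on to define.

```python
import math
from statistics import mean, stdev

# Hypothetical norms: (gender, age band) -> (population mean, population SD)
NORMS = {
    ("F", "55-64"): (80.0, 20.0),
    ("M", "55-64"): (84.0, 18.0),
}


def z_score(score, gender, age_band, norms=NORMS):
    """Standardize one participant's scale score against their own stratum."""
    pop_mean, pop_sd = norms[(gender, age_band)]
    return (score - pop_mean) / pop_sd


def one_sample_t(z_values, mu0=0.0):
    """One-sample t statistic and df for testing mean(z) against mu0."""
    n = len(z_values)
    standard_error = stdev(z_values) / math.sqrt(n)
    return (mean(z_values) - mu0) / standard_error, n - 1


def fraction_below(z_values, cutoff=-2.0):
    """Share of participants scoring below the cut-off (in SD units)."""
    return sum(z < cutoff for z in z_values) / len(z_values)


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
```

Because every participant is standardized within their own stratum, the plain mean of the resulting z-values equals the stratum means weighted by each stratum's share of the sample. For reference, `normal_cdf(-2.0)` is about 0.023, the share of a normally distributed population expected to fall more than two standard deviations below the mean.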
Since there is no validated SF-36 threshold defining clinically meaningful HRQL impairment or disability in patients with depression or in the general population, we conservatively elected to use two standard deviations below the age- and gender-adjusted standardized normal population mean as our cut-off to define this construct. With respect to the exploratory aim of comparing SF-36 scores between younger and older adults and between genders, we divided the STOP-PD II sample into two age groups, those aged 18–59 and those

3. Results

3.1. Participant characteristics

Sociodemographic, clinical and neuropsychological data for the 119 participants who completed the STOP-PD II stabilization phase in remission are shown in Table 1.

3.1.1. Comparison of SF-36 scores between baseline and remission

Table 2 shows mean SF-36 scale scores for all participants at baseline and remission. The paired comparisons between baseline and remission are based on fewer than 119 participants, since six participants did not complete the SF-36 at baseline and two participants did not complete the SF-36 at the end of the stabilization phase. SF-36 norms for the general United States population are shown for comparison. SF-36 scores improved significantly from baseline to remission for all scales. Magnitudes of improvement from baseline to remission were largest in the scales related to mental health (Mental Health, Role-Emotional and Social Functioning), and smallest (though still statistically significant) in those specific to physical symptoms (Physical Functioning and Bodily Pain).

3.1.2. Age group and gender analysis

Of the 119 participants, 52 were aged 18–59 years and 67 were aged 60 years and older. As shown in Table 1, our study group was 61% female. Only the Physical Functioning scale score was significantly different between age groups at baseline, with lower scores observed in older adults (mean [SD] = 61.13 [29.28]) compared to younger adults (mean [SD] = 75.68 [26.92]) (t(116) = −3.12, p = 0.0023). None of the SF-36 scale scores were significantly different between age groups at remission. None of the change scores were significantly different between age groups. None of the SF-36 scale scores were different at baseline or remission between genders, nor were any of the change scores.

Table 1
Sociodemographic and clinical variables and neuropsychological measures in STOP-PD II participants in sustained remission (N = 119).

Variables and Measures                               Baseline         Remission
Age, mean (SD)                                       56.35 (14.56)    –
Gender
  Female, N (%)                                      73 (61.34)       –
  Male, N (%)                                        46 (38.66)       –
Education (standardized years), mean (SD)            13.64 (3.80)     –
CIRS-G total score, mean (SD)                        –                3.85 (3.60)
Ham-D score, mean (SD)                               –                4.59 (2.87)
RBANS Coding scaled score, mean (SD)                 –                4.93 (4.01)
RBANS Delayed List Recall scaled score, mean (SD)    –                7.85 (2.89)
DKEFS Trail Making Test scaled score, mean (SD)      –                5.80 (3.85)

CIRS-G = Cumulative Illness Rating Scale for Geriatrics; Ham-D = 17-item Hamilton Depression Rating Scale; RBANS = Repeatable Battery for the Assessment of Neuropsychological Status; DKEFS = Delis-Kaplan Executive Function System.

Table 2
Comparison of SF-36 scale scores between baseline and remission.

SF-36 Scale¹          Paired N used²  Baseline, mean (SD)  Remission, mean (SD)  Paired t-test  df   p-value    General American Population Norm (N = 2474)³, mean (SD)
Physical Functioning  112             69.63 (28.71)        78.14 (24.21)         −3.75          111  0.0003*    84.15 (23.28)
Role-Physical         111             47.79 (45.14)        71.37 (37.03)         −5.66          110  <0.0001*   80.96 (34)
Role-Emotional        111             13.57 (26.22)        77.78 (35.39)         −15.73         110  <0.0001*   81.26 (33.04)
Bodily Pain           111             62.93 (27.8)         77.32 (23.93)         −5.66          110  <0.0001*   75.15 (23.69)
Vitality              111             25.84 (18.89)        60.85 (19.26)         −16.93         110  <0.0001*   60.86 (20.96)
General Health        110             49.35 (22.35)        72.02 (17.76)         −10.72         109  <0.0001*   71.95 (23.69)
Social Functioning    111             31.42 (23.27)        78.63 (22.51)         −15.98         110  <0.0001*   83.28 (22.69)
Mental Health         111             28.57 (15.51)        73.94 (17.87)         −21.91         110  <0.0001*   74.74 (18.05)

* Statistically significant at p < 0.006.
¹ Scale scores calculated and transformed according to instructions provided by Ware et al., 1993.
² Of the 119 total participants, six were missing only baseline SF-36 data and two were missing only stabilization SF-36 data, leaving fewer than 119 participants with data available for paired comparisons.
³ From Ware et al., 1993.
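The paired comparisons reported in Table 2 rest on the standard paired t statistic, sketched below. The function name and the example data are hypothetical; the sketch drops participants missing either assessment, which mirrors the reduced paired N in Table 2 but is an assumption about handling rather than a description of the authors' SAS procedure.

```python
import math
from statistics import mean, stdev


def paired_t(baseline, remission):
    """Paired t statistic and df for baseline minus remission scores.

    Entries set to None mark a missing assessment; such pairs are dropped.
    """
    diffs = [b - r for b, r in zip(baseline, remission)
             if b is not None and r is not None]
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two complete pairs")
    sd = stdev(diffs)
    if sd == 0:
        raise ValueError("t is undefined when all differences are equal")
    return mean(diffs) / (sd / math.sqrt(n)), n - 1


# Hypothetical scores for four participants; the last is missing baseline data
t_stat, df = paired_t([10, 20, 30, None], [20, 35, 40, 50])
```

With the difference taken as baseline minus remission, improvement yields a negative t, which matches the sign of the statistics in Table 2.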

3.1.3. Comparison to population norms

The age- and gender-adjusted SF-36 z-transformed values are shown in Table 3. The number of participants in Table 3 differs from that in Table 2, because Table 2 reports participants with complete paired data at baseline and remission, whereas Table 3 reports unpaired data. At baseline, all mean SF-36 scale scores except for Physical Functioning and Bodily Pain are at least 0.5 standard deviations below the normal population mean of zero. Again, the mental health-related scales show the greatest deviation from the population mean, with Mental Health and Social Functioning scale scores approximately two or more standard deviations below the mean of zero. On statistical testing, all baseline SF-36 z-transformed values are significantly lower than the standardized normal population mean of zero except for Physical Functioning and Bodily Pain. At remission, on the other hand, SF-36 z-transformed values are very close (less than 0.3 standard deviations) to the population mean, with the exception of Bodily Pain and General Health, which were significantly higher than the population mean (see Table 3). When examining the proportions of STOP-PD II participants in remission, we found that, for the majority of scales, less than 5% of participants reported scores more than two standard deviations below age- and gender-adjusted population means. Exceptions were Role-Physical (6.1%), Social Functioning (5.1%) and Role-Emotional (7.7%).

Table 4
Correlation coefficients between SF-36 scale scores and clinical and neuropsychological measures in remission¹.

                      RBANS Coding        RBANS Delayed List Recall  DKEFS Trail Making Test, Condition 4  Ham-D                   CIRS-G
SF-36 Scale           r (df) p            r (df) p                   r (df) p                              r (df) p                r (df) p
Physical Functioning  −0.07 (109) 0.47    0.19 (109) 0.04            0.43* (108) <0.0001                   −0.42* (116) <0.0001    −0.52* (116) <0.0001
Role-Physical         −0.09 (108) 0.35    0.08 (108) 0.41            0.30* (108) 0.002                     −0.47* (115) <0.0001    −0.35* (115) 0.0001
Role-Emotional        0.02 (108) 0.87     −0.02 (108) 0.87           0.18 (108) 0.07                       −0.43* (115) <0.0001    −0.11 (115) 0.23
Bodily Pain           −0.01 (108) 0.88    0.16 (108) 0.10            0.26 (108) 0.006                      −0.25 (115) 0.006       −0.23 (115) 0.01
Vitality              0.08 (108) 0.42     0.10 (108) 0.30            0.23 (108) 0.02                       −0.53* (115) <0.0001    −0.22 (115) 0.02
General Health        −0.07 (108) 0.46    0.12 (108) 0.20            0.29 (108) 0.002                      −0.45* (115) <0.0001    −0.35* (115) 0.0001
Social Functioning    0.19 (108) 0.05     −0.06 (108) 0.54           0.24 (108) 0.012                      −0.37* (115) <0.0001    −0.07 (115) 0.47
Mental Health         0.10 (108) 0.32     0.02 (108) 0.83            0.12 (108) 0.21                       −0.49* (115) <0.0001    −0.05 (115) 0.61

* Correlation coefficient ≥ |0.3| (suggesting at least a moderate correlation).
¹ Of the 119 total participants, data were not completed for some of the SF-36, RBANS, and DKEFS items, as reflected in the df for each correlation coefficient.

3.1.4. Relationships with clinical and neuropsychological measures

Pearson's correlation coefficients between SF-36 scale scores and

Table 3
Age- and gender-adjusted SF-36 scale z-transformed values at baseline and remission and statistical comparison with standardized normal population mean of zero.

Baseline
SF-36 Scale           N used¹  Mean z-transformed value (SD)²  t-test   df   p
Physical Functioning  113      −0.25 (0.41)                    −2.24    112  0.03
Role-Physical         113      −0.60 (0.46)                    −4.9     112  <0.0001*
Role-Emotional        113      −1.85 (0.28)                    −24.78   112  <0.0001*
Bodily Pain           113      −0.20 (0.39)                    −1.86    112  0.07
Vitality              113      −1.47 (0.31)                    −17.55   112  <0.0001*
General Health        112      −0.77 (0.38)                    −7.48    111  <0.0001*
Social Functioning    113      −1.98 (0.37)                    −19.9    112  <0.0001*
Mental Health         113      −2.43 (0.30)                    −30.48   112  <0.0001*

Remission
SF-36 Scale           N used¹  Mean z-transformed value (SD)²  t-test   df   p
Physical Functioning  118      0.08 (0.33)                     0.97     117  0.34
Role-Physical         117      −0.03 (0.36)                    −0.36    116  0.72
Role-Emotional        118      −0.07 (0.33)                    −0.71    117  0.48
Bodily Pain           117      0.32 (0.35)                     3.56     116  0.0005*
Vitality              117      0.05 (0.31)                     0.66     116  0.51
General Health        117      0.27 (0.31)                     3.39     116  0.001*
Social Functioning    117      −0.07 (0.33)                    −0.87    116  0.39
Mental Health         117      −0.04 (0.34)                    −0.48    116  0.63

* Statistically significant at p < 0.006.
¹ N = 6 participants were missing SF-36 data at baseline (except for General Health, where N = 7 participants were missing data); N = 2 participants were missing data for most SF-36 scales at remission (except for Physical Functioning and Role-Emotional, for which only one participant was missing data).
² Weighted by proportion of participants per age group/gender stratum.


patients in clinical trials is more rigorously-defined and more sustained, thus reducing the confound of residual symptoms present in early clinical remission on HRQL. In fact, our findings are in keeping with those of Miller et al. (1998) and Kocsis et al. (2002), who reported on psychosocial treatment outcomes of participants in i) a randomized controlled trial of imipramine versus sertraline for the acute treatment of chronic depression (Miller et al., 1998), and ii) maintenance treatment with sertraline versus placebo in an extension of the original trial (Kocsis et al., 2002). Miller et al. (1998) found that mean scale scores of participants who attained remission with acute treatment obtained or surpassed published general population norms on SF-36 domains of Role-Emotional, Social Functioning, Role-Physical, General Health, Physical Functioning, RolePhysical and Bodily Pain. Kocsis et al. (2002) reported the proportion (rather than simply mean scores) of participants in the maintenance study who achieved study-defined “normative” SF-36 scores. They found that the majority of participants in remission for 12 additional weeks following eight weeks of acute treatment with sertraline had normative SF-36 scores: 92% of their participants scored within 10% of the normative range on Social Functioning, 75% on the Role-Emotional scale, and 70% on the Role-Physical scale. (They did not include the other scales). In comparing the magnitude of improvement between our study group and participants who attained remission in the Miller et al. (1998) study, STOP-PD II participants had greater improvement in the domains most reflective of mental health symptoms, reflecting lower baseline scores in STOP-PD II participants. For example, Miller et al. (1998) reported a change in mean (SD) Role-Emotional score from baseline to 8-week study endpoint from 22.9 (32.4) to 82.3 (30.9) (delta of 49.9). 
STOP-PD II participants improved from a baseline RoleEmotional mean score of 13.57 (26.22) to 77.78 (35.39) (delta 64.21). Changes in domains more reflective of physical health were more comparable, however. For Role-Physical, participants in the Miller et al. (1998) study reported an improvement from 65.5 (40.9) to 88.8 (25.1) (delta 23.3). STOP-PD II participants improved from a mean baseline Role-Physical score of 47.79 (45.14) to 71.37 (37.03) at remission (delta 23.58). This discrepancy in magnitude of change scores in the mental health-related domains likely represents study population differences: Miller et al. (1998) were investigating participants with chronic depression, a disorder that varies in severity, unlike psychotic depression, which is typically in the severe range (Coryell and Tsuang, 1982). Finally, although informative, it is difficult to make direct comparisons between our study and that of Miller et al. (1998) and Kocsis et al. (2002). These authors had a younger study sample (mean age in Miller et al. [1998] was 41.1 years). Further, in their comparison with population norms, neither Miller et al. (1998) nor Kocsis et al. (2002) adjusted for age and gender. Thus, our findings extend theirs both by assessing HRQL in psychotic depression across the lifespan and by adjusting for age and gender. In addition to investigating the effect of remission of depression on HRQL, we examined factors that have previously been reported to contribute to variability in HRQL in the remitted depressed state. HamD total score in remission exhibited at least moderate correlations to all SF-36 scales. This finding is in keeping with studies reporting an association between residual depressive symptoms and worse HRQL (ten Doesschate et al., 2010; Pukrop et al., 2003). It suggests that persistent symptoms even below the threshold for remission (or symptoms from another condition captured by the Ham-D) influence self-reported HRQL in treated psychotic depression. 
In our sample, medical burden had a moderate correlation with role-physical and general health and a strong correlation with physical functioning, but did not have clinically meaningful strengths of correlation with other SF-36 scores. This finding is not surprising, given the established construct validity of the SF-36 in separating emotional from physical symptoms and functioning in depressed patients (McHorney et al., 1993).

Ham-D, CIRS-G, and neuropsychological measures at remission are shown in Table 4. Although interpretation requires context, correlation coefficients are generally considered weak at the absolute value of r < 0.3, moderate when 0.3 < r < 0.5, and strong when r > 0.5 (Cohen, 1988). In STOP-PD participants, Ham-D total score is moderately to strongly correlated to all SF-36 scale scores except for its weak correlation with Bodily Pain. Overall medical burden is strongly correlated with Physical Functioning and moderately correlated with RolePhysical and General Health. The neuropsychological measures show very weak correlations with the SF-36 scale scores, with the exception of the DKEFS Trail Making Test, which was moderately correlated to Role-Physical and Physical Functioning and weakly correlated with Bodily Pain and General Health. 4. Discussion The principal finding of this study is that patients with psychotic depression who attained 10 weeks of remission with combined antidepressant and antipsychotic pharmacotherapy report HRQL similar to population norms. To our knowledge, this study is the first to investigate HRQL in patients with psychotic depression both when they were acutely ill and when they were in remission. As hypothesized, patients with psychotic depression reported overall low HRQL while acutely ill, with SF-36 scores at study entry significantly lower than ageand gender-adjusted norms for all eight scales. In keeping with existing literature, scales measuring constructs of mental health and emotional and social functioning were particularly low in acute psychotic depression (McHorney et al., 1993; Ware et al., 1993). However, scales measuring the effect of physical symptoms on everyday activities and general health were also significantly lower than population norms, reflecting both the multi-faceted nature of major depression and the complex relationship between depression and physical complaints (Bair et al., 2003). 
Encouragingly, patients treated with a combination of an antidepressant and an antipsychotic who experienced remission had HRQL similar to population norms across all domains, and even slightly better than population norms in two domains (General Health and Bodily Pain). Further, few remitted participants exhibited clinically significant HRQL impairment, although 5–8% exhibited substantial impairment (scores more than two standard deviations below the population mean) in Role-Physical, Role-Emotional and Social Functioning, slightly higher than the 2.3% one would expect to see in a normally distributed population.

We did not find that younger and older participants or men and women differed in either their baseline or remission SF-36 scale scores or changes in scale scores between baseline and remission. However, a caveat to this finding is that the sample size may have been too small to detect statistically significant differences.

Naturalistic studies investigating the SF-36 in MDD have found that patients continue to score lower on the SF-36 than population norms when assessed outside of an acute episode of depression (Bos et al., 2018; ten Doesschate et al., 2010; Pukrop et al., 2003). In contrast, we found that mean SF-36 scale scores in remitted psychotic depression were no worse than population norms and that few participants exhibited clinically significant impairment in HRQL in remission. This discrepancy may be at least partly explained by the fact that our study was a controlled trial rather than an observational study, meaning that all participants were receiving systematic treatment with close monitoring. If participants in controlled trials attain better HRQL than remitted participants in observational studies, this may suggest that many MDD patients achieve good HRQL in the context of systematic treatment and close monitoring, over and above the benefits conferred by clinical remission alone.
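The 2.3% reference figure mentioned above for scores more than two standard deviations below the mean follows from the normal distribution. The following is a minimal sketch of that arithmetic, assuming scale scores in the reference population are normally distributed; the function name is hypothetical and the calculation is not taken from the study's analysis.

```python
from statistics import NormalDist


def expected_fraction_below(z: float) -> float:
    """Fraction of a standard normal population scoring below z standard deviations."""
    return NormalDist(mu=0.0, sigma=1.0).cdf(z)


# Share of a normally distributed population more than 2 SD below the mean.
share = expected_fraction_below(-2.0)
print(f"{share:.1%}")  # prints 2.3%
```

Under the same assumption, the 5–8% observed in Role-Physical, Role-Emotional and Social Functioning corresponds to roughly two to three and a half times the expected share.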
Alternatively, the difference in HRQL between observational studies and controlled trials may simply reflect a qualitative difference in patients who participate in clinical trials or observational studies. Another possibility may be that remission of

Neuropsychological tests of processing speed, executive function and delayed verbal memory were, for the most part, not related to SF-36 scores. The only exceptions were moderate correlations between the Trail Making Test and Role-Physical and Physical Functioning scales and weak correlations between the Trail Making Test and General Health and Bodily Pain. These correlations may represent a chance finding, but are in keeping with a previously identified relationship between higher Trail Making Test score and better physical performance in community-dwelling older adults (Vazzana et al., 2010). In addition to measuring executive function, the Trail Making Test Condition 4 reflects skills in visuospatial scanning, working memory and speed, additional domains that may be related to physical functioning, either directly or due to a shared relationship with another variable such as age (Clouston et al., 2013; Salthouse, 2011).

Our finding that neuropsychological performance did not substantially correlate with subjective impairment in HRQL is notable given that our participants with remitted psychotic depression performed worse than age-adjusted population norms on neuropsychological tests of coding and Trail Making (i.e., more than one standard deviation below the age-adjusted norms, see Table 1). Prior studies suggest a relationship between cognition and the related construct of functioning in depression (McIntyre et al., 2013). However, most studies relied on participant self-report of cognitive performance rather than objective testing (Lam et al., 2014). Also, other studies have typically focused on the potentially more cognitively demanding area of occupational functioning. Furthermore, the SF-36 is a broad measure of HRQL and may, therefore, not be sensitive to the subtler functional deficits potentially experienced by patients with remitted depression.

The main limitation of this study is that it was a secondary analysis not powered specifically to measure HRQL as an outcome variable. Furthermore, because HRQL was not a primary outcome of STOP-PD II, the SF-36 was the only measure of HRQL used. As it is a widely used measure with extensive normative data and well-characterized measurement properties, we believe the SF-36 was a good choice to assess HRQL in STOP-PD II. However, studies with the primary aim of investigating HRQL in depression would benefit from including more specific, comprehensive and nuanced measures of HRQL, such as the Quality of Life Enjoyment and Satisfaction Questionnaire, a scale developed specifically for use in depressed patients (Endicott et al., 1993). Finally, we did not have SF-36 measures from a healthy comparison group but instead used population norms, which may not be as comparable to our study participants as a comparison group drawn from the same local population.

In conclusion, our group of participants with psychotic depression in remission demonstrated similar levels of HRQL to population norms, despite marked impairment on most HRQL domains when acutely ill. This finding is encouraging, as it suggests that, when treated in a rigorous and systematic manner, many patients with this severe illness improve significantly from both a clinical and HRQL perspective. Nevertheless, even in a remitted state, very low levels of depressive symptoms can still influence HRQL and a small, but potentially meaningful, proportion of participants continue to report very low HRQL scores in certain domains.

Funding/Support

This study was funded by USPHS grants MH 62446, MH 62518, MH 62565, and MH 62624 from the National Institute of Mental Health (NIMH). Eli Lilly provided olanzapine and matching placebo pills and Pfizer provided sertraline; neither company provided funding for the study.

Role of the funder/sponsor

The National Institute of Mental Health (NIMH) participated in the implementation of this study through the U01 mechanism. M.V. Rudorfer, M.D. represented NIMH on the study's steering committee. Dr. Rudorfer participated in the conduct of the study; interpretation of the data; preparation, review, and approval of the manuscript; and decision to submit the manuscript for publication. NIMH did not participate in the design of the study or the collection, management, or analysis of data. A data safety monitoring board at NIMH provided data and safety monitoring. Neither Eli Lilly nor Pfizer participated in the design and conduct of the study; collection, management, analysis, or interpretation of the data; preparation, review, or approval of the manuscript; or decision to submit the manuscript for publication.

Acknowledgements

None.

Declaration of interest

E.M. Whyte receives grant support from the National Institute of Mental Health (NIMH) and Health Resources & Services Administration (HRSA). During the past 5 years, B.H. Mulsant has received: research funding from Brain Canada, the CAMH Foundation, the Canadian Institutes of Health Research, and the U.S. National Institutes of Health; research support from Bristol-Myers Squibb (medications for a NIH-funded clinical trial), Eli Lilly (medications for a NIH-funded clinical trial), Pfizer (medications for a NIH-funded clinical trial), Capital Solution Design LLC (software used in a study funded by CAMH Foundation), and HAPPYneuron (software used in a study funded by Brain Canada). He directly owns stock in General Electric (less than $5000). A.J. Rothschild has received grant or research support from Allergan, Janssen, the National Institute of Mental Health, Takeda, Eli Lilly (medications for a NIH-funded clinical trial), Pfizer (medications for a NIH-funded clinical trial), and the Irving S. and Betty Brudnick Endowed Chair in Psychiatry, is a consultant to Sage Therapeutics, GlaxoSmithKline and Sanofi-Aventis, and has received royalties for the Rothschild Scale for Antidepressant Tachyphylaxis (RSAT)®; Clinical Manual for the Diagnosis and Treatment of Psychotic Depression, American Psychiatric Press, 2009; The Evidence-Based Guide to Antipsychotic Medications, American Psychiatric Press, 2010; The Evidence-Based Guide to Antidepressant Medications, American Psychiatric Press, 2012, and Up-to-Date®. G.S. Alexopoulos has served on the speakers bureaus of Takeda, Lundbeck, Otsuka, Allergan, Astra/Zeneca, and Sunovion. B.S. Meyers received research support from the National Institute of Mental Health at the time this work was done. A.J. Flint currently receives grant support from the U.S. National Institutes of Health, the Patient-Centered Outcomes Research Institute, the Canadian Institutes of Health Research, Brain Canada, and the Alzheimer's Association. M.V. Rudorfer has no disclosures. P. Marino has no disclosures. S. Banerjee has no disclosures. M.A. Butters has no disclosures. K.S. Bingham has no disclosures.

Supplementary materials

Supplementary material associated with this article can be found, in the online version, at doi:10.1016/j.jad.2019.05.068.

References

Bair, M.J., Robinson, R.L., Katon, W., Kroenke, K., 2003. Depression and pain comorbidity: a literature review. Arch. Intern. Med. 163, 2433–2445. https://doi.org/10.1001/archinte.163.20.2433.
Bakas, T., McLennon, S.M., Carpenter, J.S., Buelow, J.M., Otte, J.L., Hanna, K.M., Ellett, M.L., Hadler, K.A., Welch, J.L., 2012. Systematic review of health-related quality of life models. Health Qual. Life Outcomes 10, 134. https://doi.org/10.1186/1477-7525-10-134.
Beusterien, K.M., Steinwald, B., Ware, J.E., 1996. Usefulness of the SF-36 Health Survey in measuring health outcomes in the depressed elderly. J. Geriatr. Psychiatry Neurol. 9, 13–21. https://doi.org/10.1177/089198879600900103.
Bingham, K.S., Kumar, S., Dawson, D.R., Mulsant, B.H., Flint, A.J., 2018. A systematic review of the measurement of function in late-life depression. Am. J. Geriatr. Psychiatry 26, 54–72. https://doi.org/10.1016/j.jagp.2017.08.011.
Bora, E., Harrison, B.J., Yücel, M., Pantelis, C., 2013. Cognitive impairment in euthymic major depressive disorder: a meta-analysis. Psychol. Med. 43, 2017–2026. https://doi.org/10.1017/S0033291712002085.
Bos, E.H., Ten Have, M., van Dorsselaer, S., Jeronimus, B.F., de Graaf, R., de Jonge, P., 2018. Functioning before and after a major depressive episode: pre-existing vulnerability or scar? A prospective three-wave population-based study. Psychol. Med. 1–9. https://doi.org/10.1017/S0033291717003798.
Clouston, S.A.P., Brewster, P., Kuh, D., Richards, M., Cooper, R., Hardy, R., Rubin, M.S., Hofer, S.M., 2013. The dynamic relationship between physical function and cognition in longitudinal aging cohorts. Epidemiol. Rev. 35, 33–50. https://doi.org/10.1093/epirev/mxs004.
Cohen, J., 1988. Statistical Power Analysis for the Behavioral Sciences, second ed. Lawrence Erlbaum Associates, United States of America.
Coryell, W., Tsuang, M.T., 1982. Primary unipolar depression and the prognostic importance of delusions. Arch. Gen. Psychiatry 39, 1181–1184.
Delis, D., Kaplan, E., 2001. Delis-Kaplan Executive Function System (DKEFS). The Psychological Corporation, San Antonio, TX.
Endicott, J., Nee, J., Harrison, W., Blumenthal, R., 1993. Quality of life enjoyment and satisfaction questionnaire: a new measure. Psychopharmacol. Bull. 29, 321–326.
Evans, V.C., Iverson, G.L., Yatham, L.N., Lam, R.W., 2014. The relationship between neurocognitive and psychosocial functioning in major depressive disorder: a systematic review. J. Clin. Psychiatry 75, 1359–1370. https://doi.org/10.4088/JCP.13r08939.
First, M., Spitzer, R.L., Gibbon, M., Williams, J.B.W., 2001. Structured Clinical Interview for DSM-IV-TR Axis I Disorders - Patient Edition (SCID-I/P). Biometric Research Department, New York, NY.
Flint, A.J., Meyers, B.S., Rothschild, A.J., Whyte, E.M., Mulsant, B.H., Rudorfer, M.V., Marino, P., STOP-PD II Study Group, 2013. Sustaining remission of psychotic depression: rationale, design and methodology of STOP-PD II. BMC Psychiatry 13, 38. https://doi.org/10.1186/1471-244X-13-38.
Guy, W., 1976. Clinical Global Impressions: ECDEU Assessment Manual for Psychopharmacology. US Dept of Health, Education, and Welfare, Washington, DC.
Hamilton, M., 1960. A rating scale for depression. J. Neurol. Neurosurg. Psychiatry 23, 56–62.
Hays, R.D., Morales, L.S., 2001. The RAND-36 measure of health-related quality of life. Ann. Med. 33, 350–357. https://doi.org/10.3109/07853890109002089.
Ho, C.S., Feng, L., Fam, J., Mahendran, R., Kua, E.H., Ng, T.P., 2014. Coexisting medical comorbidity and depression: multiplicative effects on health outcomes in older adults. Int. Psychogeriatr. 26, 1221–1229. https://doi.org/10.1017/S1041610214000611.
IsHak, W.W., Mirocha, J., James, D., Tobia, G., Vilhauer, J., Fakhry, H., Pi, S., Hanson, E., Nashawati, R., Peselow, E.D., Cohen, R.M., 2015. Quality of life in major depressive disorder before/after multiple steps of treatment and one-year follow-up. Acta Psychiatr. Scand. 131, 51–60. https://doi.org/10.1111/acps.12301.
Jorm, A.F., 2004. The Informant Questionnaire on Cognitive Decline in the Elderly (IQCODE): a review. Int. Psychogeriatr. 16, 275–293. https://doi.org/10.1017/S1041610204000390.
Kocsis, J.H., Schatzberg, A., Rush, A.J., Klein, D.N., Howland, R., Gniwesch, L., Davis, S.M., Harrison, W., 2002. Psychosocial outcomes following long-term, double-blind treatment of chronic depression with sertraline vs placebo. Arch. Gen. Psychiatry 59, 723–728.
Kruijshaar, M.E., Hoeymans, N., Bijl, R.V., Spijker, J., Essink-Bot, M.L., 2003. Levels of disability in major depression: findings from the Netherlands Mental Health Survey and Incidence Study (NEMESIS). J. Affect. Disord. 77, 53–64.
Lam, R.W., Kennedy, S.H., McIntyre, R.S., Khullar, A., 2014. Cognitive dysfunction in major depressive disorder: effects on psychosocial functioning and implications for treatment. Can. J. Psychiatry 59, 649–654.
Maj, M., Pirozzi, R., Magliano, L., Fiorillo, A., Bartoli, L., 2007. Phenomenology and prognostic significance of delusions in major depressive disorder: a 10-year prospective follow-up study. J. Clin. Psychiatry 68, 1411–1417.
McHorney, C.A., Ware, J.E., Raczek, A.E., 1993. The MOS 36-Item Short-Form Health Survey (SF-36): II. Psychometric and clinical tests of validity in measuring physical and mental health constructs. Med. Care 31, 247–263.
McIntyre, R.S., Cha, D.S., Soczynska, J.K., Woldeyohannes, H.O., Gallaugher, L.A., Kudlow, P., Alsuwaidan, M., Baskaran, A., 2013. Cognitive deficits and functional outcomes in major depressive disorder: determinants, substrates, and treatment interventions. Depress. Anxiety 30, 515–527. https://doi.org/10.1002/da.22063.
Miller, M.D., Paradis, C.F., Houck, P.R., Mazumdar, S., Stack, J.A., Rifai, A.H., Mulsant, B., Reynolds, C.F., 1992. Rating chronic medical illness burden in geropsychiatric practice and research: application of the Cumulative Illness Rating Scale. Psychiatry Res. 41, 237–248.
Miller, I.W., Keitner, G.I., Schatzberg, A.F., Klein, D.N., Thase, M.E., Rush, A.J., Markowitz, J.C., Schlager, D.S., Kornstein, S.G., Davis, S.M., Harrison, W.M., Keller, M.B., 1998. The treatment of chronic depression, part 3: psychosocial functioning before and after treatment with sertraline or imipramine. J. Clin. Psychiatry 59, 608–619.
Pukrop, R., Schlaak, V., Möller-Leimkühler, A.M., Albus, M., Czernik, A., Klosterkötter, J., Möller, H.J., 2003. Reliability and validity of Quality of Life assessed by the Short-Form 36 and the Modular System for Quality of Life in patients with schizophrenia and patients with depression. Psychiatry Res. 119, 63–79.
Randolph, C., 1998. Repeatable Battery for the Assessment of Neuropsychological Status. The Psychological Corporation, San Antonio, TX.
Salthouse, T.A., 2011. What cognitive abilities are involved in trail-making performance? Intelligence 39, 222–232. https://doi.org/10.1016/j.intell.2011.03.001.
Schatzberg, A.F., Posener, J.A., DeBattista, C., Kalehzan, B.M., Rothschild, A.J., Shear, P.K., 2000. Neuropsychological deficits in psychotic versus nonpsychotic major depression and no mental illness. Am. J. Psychiatry 157, 1095–1100.
Spitzer, R., Endicott, J., 1979. Schedule for Affective Disorders and Schizophrenia, 3rd ed. Biometrics Research Dept, New York State Psychiatric Institute, New York, NY.
Stewart, A., Sherbourne, C.D., Hays, R.D., Wells, K.B., Nelson, E.C., Kamberg, C., Rogers, W.H., Berry, S.H., Ware, J.E., 1992. Summary and discussion of MOS measures [www Document]. URL https://www.rand.org/pubs/external_publications/EP19920054.html (accessed 2.19.18).
ten Doesschate, M.C., Koeter, M.W.J., Bockting, C.L.H., Schene, A.H., DELTA Study Group, 2010. Health related quality of life in recurrent depression: a comparison with a general population sample. J. Affect. Disord. 120, 126–132. https://doi.org/10.1016/j.jad.2009.04.026.
Torrance, N., Smith, B.H., Lee, A.J., Aucott, L., Cardy, A., Bennett, M.I., 2009. Analysing the SF-36 in population-based research. A comparison of methods of statistical approaches using chronic pain as an example. J. Eval. Clin. Pract. 15, 328–334. https://doi.org/10.1111/j.1365-2753.2008.01006.x.
Vazzana, R., Bandinelli, S., Lauretani, Fabrizio, Volpato, S., Lauretani, Fulvio, Di Iorio, A., Abate, G., Corsi, A.M., Milaneschi, Y., Guralnik, J.M., Ferrucci, L., 2010. Trail making test predicts physical impairment and mortality in older persons. J. Am. Geriatr. Soc. 58, 719–723. https://doi.org/10.1111/j.1532-5415.2010.02780.x.
Ware, J.E., Snow, K., Kosinski, M., Gandek, B., 1993. SF-36 Health Status Survey Manual. The Health Institute, New England Medical Centre, Boston, MA.
Williams, J.B.W., Kobak, K.A., Bech, P., Engelhardt, N., Evans, K., Lipsitz, J., Olin, J., Pearson, J., Kalali, A., 2008. The GRID-HAMD: standardization of the Hamilton Depression Rating Scale. Int. Clin. Psychopharmacol. 23, 120–129. https://doi.org/10.1097/YIC.0b013e3282f948f5.