Baseline characteristics of major depressive disorder patients in clinical trials in Europe and United States


Journal of Psychiatric Research 35 (2001) 71–81 www.elsevier.com/locate/jpsychires

Baseline characteristics of major depressive disorder patients in clinical trials in Europe and United States: is there a transatlantic difference?

I.A. Niklson *,1, P.-E. Reimitz 2

Research and Development, NV Organon

Received 17 August 2000; received in revised form 31 January 2001; accepted 16 February 2001

Abstract

There is a widespread belief that different patients are recruited into antidepressant clinical trials conducted in Europe and the USA, probably because recruitment strategies vary between the two continents. In order to get an insight into the patients’ characteristics in clinical studies on both continents, we compared the baseline characteristics of depressed patients in a database of a cancelled development program of an antidepressant (2220 patients, Intention-to-Treat group). For the evaluation of continental differences, we compared the elements of demographics, previous psychiatric history, DSM-III-R criteria, HAM-D and MADRS total scores and separate items and/or factors and CGI severity scores at baseline. USA patients had statistically significantly higher baseline values on height, weight and BMI. European patients showed statistically significantly higher baseline severity scores on HAM-D, MADRS and CGI. Furthermore, European patients had statistically significantly higher baseline scores on HAM-D factors I (‘anxiety/somatization’), VI (‘sleep disturbance’), and the HAM-D Angst anxiety/agitation factor, whereas USA patients had a statistically significantly higher baseline value on the Bech depression factor and the HAM-D Angst retarded depression factor. European patients appear to have a more severe depressive episode with more anxiety and melancholic features. Some of the statistically significant differences found may be the result of the large sample size and are probably without any clinical relevance when the absolute size of the difference is taken into account. Our opinion is that the differences found in our sample between European and USA populations are much smaller than is generally expected and not of a magnitude that would question the reliability of the results obtained in our global antidepressant drug development program. If our findings were reproducible in other antidepressant databases, it would indicate that data gathered in Europe and the USA within a global antidepressant drug development program can be pooled. © 2001 Elsevier Science Ltd. All rights reserved.

Keywords: Antidepressant trials; Transatlantic difference; Clinical trial methodology; Baseline patient characteristics

1. Introduction

In the last 10 years, global drug development has been more a rule than an exception in the pharmaceutical industry. Companies are trying to shorten development time by simultaneously initiating several large efficacy clinical trials in a given indication. In the course of these efforts trials are being set up in different countries and continents. When the trial-specific efficacy results are measured by hard and objective outcome parameters, as in some non-psychiatric indications, there is limited concern about how different the patient populations are that are recruited into clinical trials on particular continents. This issue is of greater importance in indications where the outcome measures are less hard and largely influenced by patient selection, non-specific treatment effects, local clinical practices, health care policies, etc. Depression is one of those diseases where the outcomes of clinical trials are largely influenced by the patients being recruited and the investigators conducting the trial (Niklson et al., 1997).

* Corresponding author. Head Therapeutic Group CNS, Strategic Clinical Development, UCB Pharma S.A. R&D, B-1420 Braine-l’Alleud, Belgium. Tel.: +32 (0)2-386-2557; fax: +32 (0)2-386-2828. E-mail address: ida.niklson@ucb-group.com (I.A. Niklson).
1 Present address: Head TG CNS, Strategic Clinical Development, UCB SA. R&D, Chemin du Foriest, B-1420 Braine-l’Alleud, Belgium.
2 Present address: Director Data Management & Biometrics, AstraZeneca GmbH, Tinsdaler Weg 183, D-22880 Wedel, Germany.

0022-3956/01/$ - see front matter © 2001 Elsevier Science Ltd. All rights reserved. PII: S0022-3956(01)00011-5


I.A. Niklson, P.-E. Reimitz / Journal of Psychiatric Research 35 (2001) 71–81

There is a widespread belief, not based on research data, that different patients are being recruited into antidepressant clinical trials conducted in Europe and the USA. This belief may have been partly generated by the fact that the recruitment strategies differ between the two continents. In the USA, clinical research developed rapidly into a core business of specific research centers that need many eligible patients for the different trials performed at the center simultaneously. These patients are most often recruited not from the clinical practice of the investigators but from a large screening pool generated by advertisements in the local media. In the USA, this way of recruiting patients has a long history (Covi et al., 1979) and is widely accepted by patients, investigators and sponsors, and Ethics Committees (ECs) have no major concerns about granting approval. In Europe, by contrast, the use of media for recruiting patients for a clinical trial is far less common, and in those rare cases when it is requested, some investigators may be refused by the ECs. European patients are mostly recruited from the psychiatrists’ own practices and through referrals from general practitioners. Safe treatments are available for treating depression in the general practice setting, and appropriate patients with no co-morbidity, no suicidal risk and no concurrent physical illness are often treated by general practitioners. In order to recruit eligible patients, some psychiatrists cooperate with their network of general practitioners, who refer appropriate patients with Major Depressive Disorder (MDD) for recruitment into a clinical trial.

The availability of medical care varies between the USA and most European countries (Davies and Marchall, 2000). In Europe, greater availability of free or nearly free medical care for the patient (i.e. treatment is paid by health insurance) may influence the patient’s motivation to participate in a clinical trial. The difference in the patient populations recruited into antidepressant clinical trials through referrals and through advertisements has been the topic of several publications (Thase et al., 1984; Krupnik et al., 1986; Amori and Lenox, 1989; Rapaport et al., 1995, 1996; Miller et al., 1997). Contrary to expectations, these investigations found almost no difference between patients recruited via the above-mentioned methods; the two cohorts of patients were remarkably similar on most variables. Other studies found some consistent differences between symptomatic volunteers and clinical subjects (Brauzer and Goldstein, 1973; Covi et al., 1979; Hersen et al., 1981).

According to common belief, patients in a USA antidepressant clinical trial are less severely ill, and therefore differ substantially from patients in Europe. To our knowledge, there are no published data to support this belief. We performed an intensive literature search, but the results were very limited. In order to get an insight into the patients’ characteristics in clinical studies on both continents, we compared the baseline characteristics of depressed patients in a database of a cancelled development program of an antidepressant. Our expectation was that the patients in the USA are less severely ill at baseline when entering the clinical trial than European patients. In this publication, we will limit ourselves to comparing the baseline characteristics of European and USA patient populations in antidepressant clinical trials. Exploratory analysis of the treatment effects and the possible influence of baseline characteristics will be the subject of further research.

2. Material and methods

The overall Phase II/III database consisted of 2220 depressed patients belonging to the Intention-to-Treat (ITT) group. We compared, in an exploratory way, the data from two European (525 patients) and seven USA studies (1695 patients) performed between 1992 and 1995. The four Phase II studies (two European and two USA) had a treatment period of 6 weeks, whereas the five other studies (Phase III, USA) had a treatment period of 8 weeks. The studies were designed as double-blind, placebo and/or active controlled trials. The inclusion criteria as specified in the protocols were almost identical; the only difference was that in the USA, investigators were allowed to include patients with a current episode lasting for more than 12 months. All studies were performed according to the Declaration of Helsinki and the ECs of all investigators involved had approved the corresponding protocol. All patients provided written informed consent for study participation after the scope and the nature of the investigation had been explained, but before any study activities took place.

All patients had a diagnosis of MDD according to DSM-III-R criteria. Included were patients older than 18 years with a HAM-D total score (17 items) of at least 17 points at baseline. The duration of the present depressive episode had to be at least 2 weeks since the diagnosis was made. Patients were excluded from participation if they had any other primary psychiatric diagnosis (bipolar, dysthymic or not otherwise specified depressive disorder, anxiety or adjustment disorder, schizophrenia, organic mental syndromes or alcoholism), if previous antidepressant therapy had not been washed out adequately, or if they had received electroconvulsive therapy within the previous year. Patients who had received an adequate antidepressant therapy during the current episode were also excluded from participation. Patients with clinically relevant renal, hepatic, cardiovascular or cerebrovascular disease, diabetic and epileptic patients, and women of childbearing potential not adequately protected against pregnancy as judged by the investigator were excluded as well.

In all studies investigator and rater training meetings were held in order to explain the protocol and to agree on the scoring criteria to be used during the study. All investigators participating in the studies on both continents were qualified psychiatrists and no general practitioners participated in any of the studies. Investigators in the USA were allowed to advertise for patients, but the source of patients (advertisement or clinical practice) was not recorded. Also, one center in Europe was allowed to advertise.

2.1. Statistical methods

For the evaluation of continental differences, we compared the elements of demographics, previous psychiatric history, DSM-III-R criteria, HAM-D and MADRS total scores and separate items and/or factors and CGI severity scores at baseline. Based on the ITT group, appropriate summary statistics were calculated by continent for the various baseline parameters. For quantitative variables, mean, standard deviation, minimum and maximum were calculated, whereas absolute and percentage numbers were calculated for categorical parameters. For the exploratory statistical comparison of the data between the continents, one of the following (two-sided) tests was used, depending on the type of parameter: the χ²-test or the Wilcoxon test. In addition, bar charts with frequencies or means and standard deviation (S.D.) are presented to visualize the results.
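As an illustration of the exploratory χ²-test applied to categorical parameters, the following minimal sketch computes a Pearson χ² statistic for a 2×2 table, using the gender counts reported in Table 1. The function name is hypothetical, and the absence of a continuity correction is an assumption; the paper does not state which variant of the test or which software was used.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, the upper-tail probability of the
    # chi-square distribution equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Gender counts from Table 1: rows = Europe, USA; columns = females, males.
chi2, p = chi_square_2x2(319, 206, 966, 729)
print(f"chi2 = {chi2:.3f}, two-sided P = {p:.3f}")
```

Under these assumptions the computed P-value is approximately 0.126, in line with the value tabulated for gender. The Wilcoxon test used for quantitative variables requires patient-level data and therefore cannot be reproduced from the summary statistics alone.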

3. Results

3.1. Demographics

Demographic characteristics are presented by continent in Table 1. At baseline, USA patients had a statistically significantly higher height, weight and body mass index, whereas suicide attempts in the past were documented for more European patients. In addition, the duration of present episode in the USA population was longer. However, this can probably be explained by the difference between the European and USA protocols regarding this specific inclusion criterion.

3.2. HAM-D total score (17 items), factor scores and individual items

The continent-specific results on HAM-D total score (17 items) and some HAM-D related factor scores are presented in Table 2. The frequency distribution of the HAM-D total score at baseline is presented in Fig. 1. The mean HAM-D total score at baseline was statistically significantly higher in European patients and more patients from Europe were considered as severely depressed (as indicated by a HAM-D total score ≥25). In addition, European patients had statistically significantly higher scores at baseline on the HAM-D factors I (‘anxiety/somatization’) and VI (‘sleep disturbance’). USA patients had statistically significantly higher scores on the Bech depression factor (Bech et al., 1983, 2000) which consists of six core items from the HAM-D scale: ‘depressed mood’, ‘work and activities’, ‘anxiety psychic’, ‘somatic symptoms general’, ‘feelings of guilt’ and ‘retardation’. Statistically significantly higher scores were found in European patients on the anxiety/agitation factor according to Angst et al. (1993). As shown in Fig. 2, statistically significantly higher scores at baseline were found in the European population on HAM-D items [4] ‘somatic symptoms gastrointestinal’, [5] ‘loss of weight’, [8] ‘insomnia late’, [13] ‘anxiety psychic’, [14] ‘hypochondriasis’, [16] ‘retardation’. Statistically significantly higher scores at baseline were found in the USA population on HAM-D items [9] ‘somatic symptoms general’ and [10] ‘guilt’.

3.3. MADRS total score and individual items

Consistent with findings on the HAM-D total score, European patients also showed a statistically significantly higher MADRS total score at baseline. Consequently, the percentage of severely depressed patients (as indicated by a MADRS total score ≥30) was also statistically significantly higher in Europe than in the USA. The MADRS items [3], [4], [5] and [10] accounted for this difference (Table 2, Figs. 3 and 4).

3.4. CGI severity score

As shown in Table 2, a statistically significant difference between Europe and the USA was found regarding the CGI severity score at baseline. In Europe, 54.4% of the patients were considered moderately ill as compared with 80.9% in the USA. Investigators judged 36.8% of the European patients, but only 17% of the USA patients as markedly ill. Furthermore, 5.8% of the European patients were considered to be severely ill as compared with 1.5% of the USA patients.

3.5. DSM-III-R

As shown in Table 1, there was no statistically significant difference between the continents regarding the number of patients having a recurrent depressive episode. Among the diagnostic symptoms (i.e. the A criteria, see Table 4 for a description), there was a statistically highly significant difference (see Fig. 5) with more patients in Europe suffering on item [A2] ‘markedly diminished interest or pleasure in all or almost all activities most of the day or nearly every day’, and with more patients in the USA suffering on items [A7] ‘feelings of worthlessness or excessive or inappropriate guilt nearly every day’ and [A9] ‘recurrent thoughts of death (not just fear of dying)’. As shown in Fig. 6, the number of melancholic patients, defined as those patients with at least five melancholic features present, was statistically significantly higher in Europe (71.6 vs. 54.9%). The mean HAM-D total score of melancholic patients in Europe was statistically significantly higher, whereas the mean score of non-melancholic patients did not differ between Europe and the USA (Table 3). With regard to the number of patients suffering from specific melancholic features (i.e. the M criteria, see Table 4 for a description), a statistically highly significant difference was found on item [M1] indicating that more patients in the USA had a loss of interest or pleasure, whereas more patients in Europe than in the USA suffered on items [M2] ‘lack of reactivity to usually pleasurable stimuli’, [M3] ‘depression regularly worse in morning’, [M4] ‘early morning awakening, at least 2 hrs before usual awakening’, [M6] ‘significant anorexia or weight loss, e.g. >5% of weight loss in a month’, [M7] ‘no significant personality disturbance before first major depressive episode’, and [M9] ‘previous good response to specific and adequate somatic antidepressant therapy’.
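The HAM-D factor scores reported in Section 3.2 are plain sums of individual HAM-D items, with the item sets given in the footnote to Table 2. A minimal sketch of that calculation follows, assuming item scores are supplied as a mapping from item number (in the paper's own numbering, 1 to 17) to the rated score; the function and key names are hypothetical.

```python
# Item sets copied from the footnote to Table 2 (the paper's own item numbering).
HAMD_FACTORS = {
    "factor_I_anxiety_somatization": (4, 9, 12, 13, 14, 15),
    "factor_V_retardation": (1, 2, 3, 16),
    "factor_VI_sleep_disturbance": (6, 7, 8),
    "bech_depression": (1, 2, 9, 10, 12, 16),
    "angst_anxiety_agitation": (4, 9, 12, 13, 14, 17),
    "angst_retarded_depression": (1, 2, 8, 10, 11, 16),
}

def hamd_factor_scores(item_scores):
    """Return the 17-item total and each factor score as a sum of item scores.

    item_scores: dict mapping item number (1..17) to the rated score.
    """
    missing = [i for i in range(1, 18) if i not in item_scores]
    if missing:
        raise ValueError(f"missing HAM-D items: {missing}")
    scores = {
        name: sum(item_scores[i] for i in items)
        for name, items in HAMD_FACTORS.items()
    }
    scores["total_17"] = sum(item_scores[i] for i in range(1, 18))
    return scores

# Hypothetical patient with every item rated 1 (illustration only, not study data).
example = {i: 1 for i in range(1, 18)}
result = hamd_factor_scores(example)
print(result)
```

Because several items contribute to more than one factor, the factor scores overlap and do not add up to the 17-item total.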

Table 1
Summary statistics on demographic parameters at baseline by continent and on various parameters regarding the psychiatric history or the present major depressive episode at baseline by continent (Intent-To-Treat group)

Parameter                           Europe (n=525)   USA (n=1695)     Exploratory P-value
Gender
  Females                           319 (60.8%)      966 (57.0%)      0.126 (a)
  Males                             206 (39.2%)      729 (43.0%)
Age (years)
  Mean (S.D.)                       42.4 (11.8)      41.2 (11.3)      0.0217 (b)
  Min–max                           17–70            18–81
  ≤65                               509 (97.0%)      1659 (97.9%)
  >65                               16 (3.0%)        36 (2.1%)
Height (cm)
  Mean (S.D.)                       168.7 (9.0)      170.2 (10.1)     0.0010 (b)
  Min–max                           140–194          135–203
Body weight (kg)
  Mean (S.D.)                       71.4 (15.9)      79.8 (19.3)      0.0001 (b)
  Min–max                           38.0–137.0       40.8–176.9
Body mass index (kg/m²)
  Mean (S.D.)                       25.0 (4.8)       27.5 (6.1)       0.0001 (b)
  Min–max                           16.4–48.0        15.0–62.9
  ≤20                               59 (11.2%)       81 (4.8%)
  (20; 25]                          227 (43.2%)      581 (34.3%)
  (25; 30]                          167 (31.8%)      575 (33.9%)
  >30                               62 (11.8%)       455 (26.8%)
DSM-III-R diagnosis
  296.2 (single episode)            180 (34.3%)      605 (35.7%)      0.556 (a)
  296.3 (recurrent episode)         345 (65.7%)      1090 (64.3%)
Duration of present episode
  <2 weeks                          –                2 (0.1%)         0.001 (a)
  <1 month                          24 (4.6%)        27 (1.6%)
  1–6 months                        372 (70.9%)      480 (28.3%)
  7–12 months                       124 (23.6%)      336 (19.8%)
  >1 year                           5 (1.0%)         850 (50.2%)
Number of patients with suicide
  attempts in the past              71 (13.5%)       200 (11.8%)      0.001 (a)

(a) Exploratory P-value based on a (two-sided) χ²-test.
(b) Exploratory P-value based on a (two-sided) Wilcoxon test.

Table 2
Summary statistics on various efficacy parameters (HAM-D (a), MADRS, CGI) at baseline by continent (Intent-To-Treat group)

Parameter                           Europe (n=525)   USA (n=1695)     Exploratory P-value
HAM-D total score (17 items)
  Mean (S.D.)                       23.4 (4.0)       22.3 (3.1)       0.0001 (b)
  Min–max                           9–40             15–36
  <25                               334 (63.6%)      1323 (78.1%)     0.001 (c)
  ≥25 (severely depressed)          191 (36.4%)      372 (21.9%)
HAM-D factor I ‘anxiety/somatization’
  Mean (S.D.)                       7.9 (1.9)        7.0 (1.6)        0.0001 (b)
  Min–max                           2–14             0–13
HAM-D factor V ‘retardation’
  Mean (S.D.)                       8.0 (1.7)        7.9 (1.5)        0.5102 (b)
  Min–max                           3–13             3–12
HAM-D factor VI ‘sleep disturbance’
  Mean (S.D.)                       3.9 (1.5)        3.5 (1.7)        0.0001 (b)
  Min–max                           0–6              0–6
HAM-D Bech depression factor
  Mean (S.D.)                       11.8 (2.0)       12.3 (1.8)       0.0001 (b)
  Min–max                           5–18             6–19
HAM-D Angst retarded depression factor
  Mean (S.D.)                       10.3 (2.2)       10.4 (2.0)       0.0475 (b)
  Min–max                           4–18             4–17
HAM-D Angst anxiety/agitation factor
  Mean (S.D.)                       8.5 (2.1)        7.6 (1.8)        0.0001 (b)
  Min–max                           2–16             2–14
MADRS total score
  Mean (S.D.)                       30.4 (5.8)       28.1 (5.5)       0.0001 (b)
  Min–max                           14–50            9–47
  <30                               219 (41.7%)      1026 (60.7%)     0.001 (c)
  ≥30 (severely depressed)          303 (57.7%)      664 (39.3%)
CGI severity score
  Mean (S.D.)                       4.5 (0.7)        4.2 (0.4)        0.0001 (b)
  Min–max                           2–6              3–6
  2: borderline                     1 (0.2%)         –
  3: mildly                         15 (2.9%)        11 (0.6%)
  4: moderate                       284 (54.4%)      1364 (80.9%)
  5: markedly                       192 (36.8%)      286 (17.0%)
  6: severely                       30 (5.8%)        26 (1.5%)
  7: extremely severe               –                –

(a) The various factor scores on the HAM-D have been calculated as follows: HAM-D factor I: sum of items 4, 9, 12, 13, 14, 15; HAM-D factor V: sum of items 1, 2, 3, 16; HAM-D factor VI: sum of items 6, 7, 8; Bech depression factor: sum of items 1, 2, 9, 10, 12, 16; Angst anxiety/agitation factor: sum of items 4, 9, 12, 13, 14, 17; Angst retarded depression factor: sum of items 1, 2, 8, 10, 11, 16.
(b) Exploratory P-value based on a (two-sided) Wilcoxon test.
(c) Exploratory P-value based on a (two-sided) χ²-test.

4. Discussion

There are methodological limitations to this post-hoc analysis because the studies were neither set up to investigate differences in baseline characteristics of patients in the USA and Europe, nor to compare the similarities and differences between patients recruited by advertisement and clinical practice. Nevertheless, we undertook the analysis in the first instance to gain insight into this interesting and much talked about topic of transatlantic differences in patient populations that are being recruited in psychopharmacological clinical trials. When discussing the results of antidepressant clinical trials the issue of clinically meaningful difference becomes essential. There have been long discussions in the scientific community on this issue and there were proposals that a four-point difference between placebo and an active drug on the HAM-D total score at the end of the 6–8 week trial would represent a clinically meaningful difference (Huitfeldt and Montgomery, 1983).

Results from the trials completed in the last 10 years further influenced this view. There are now proposals that a two-point difference on the same scale would represent a clinically meaningful difference in a short-term clinical trial (Montgomery, personal communication). There are no guidelines or criteria regarding the clinically meaningful difference for the various efficacy scores, factors and items that we analyzed and compared in this clinical sample. Therefore, in the absence of any reference to clinically relevant differences when single items are concerned, we have used our clinical judgement on what we thought might represent a clinically relevant difference for a particular item, taking into account the absolute size of the difference.

Based on our exploratory comparisons regarding the baseline characteristics between European and USA patients, there were numerous statistically significant differences, some of them indicating a clinically meaningful difference and some of them occurring simply due to the large sample size. When inspecting the absolute size of the differences on some of the statistically significantly different items, we noticed that they were very small and were probably without any clinical meaningfulness. For example, the mean (± S.D.) at baseline on the HAM-D item [5] ‘weight loss’ was 0.4 ± 0.7 in the European sample and 0.3 ± 0.6 in the USA sample, which was statistically significant, but we considered it a clinically meaningless difference. Another example was age: the mean age (± S.D.) at baseline was 42.4 ± 11.8 years in Europe and 41.2 ± 11.3 years in the USA; it was statistically significantly different, but we considered it clinically not meaningful in the context of an antidepressant clinical trial.

Regarding demographic parameters, we consider the statistically significant differences on age and height in our sample as chance findings. In contrast, we consider the statistically significant difference in the body mass index as clinically relevant (Europe: 25.0 ± 4.8, USA: 27.5 ± 6.1) because it reflects the generally known fact about the differences of body weight in European and USA populations. There was also a statistically significant difference in the duration of the present depressive episode. This finding can probably be explained by the differences in the inclusion criteria concerning the duration of the present episode. In the USA, patients were allowed to participate in the study if their present episode had lasted longer than 12 months, while in the European studies these patients were to be excluded. This was the only major difference between the European and USA protocols, as already indicated in Section 2.

Fig. 1. Frequency distribution of the HAM-D Total Score (17 items) at baseline by continent (Intent-To-Treat group).

Fig. 2. Summary statistics (mean ± S.D.) on the various HAM-D items at baseline by continent (Intent-To-Treat group).
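One way to make the distinction between statistical significance and absolute size concrete is to express a difference in means in units of the pooled standard deviation (Cohen's d). This is not a measure used by the authors; it is offered here only as a hedged illustration, computed from the means and standard deviations in Table 1 and assuming that the full group sizes (525 and 1695) apply to each parameter.

```python
import math

def cohens_d(mean1, sd1, n1, mean2, sd2, n2):
    """Standardized mean difference (group 1 minus group 2) using the pooled S.D."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    return (mean1 - mean2) / math.sqrt(pooled_var)

# Summary statistics from Table 1 (Europe first, USA second).
# Assumption: n = 525 and n = 1695 for both parameters, although a few
# values may have been missing for individual parameters.
d_age = cohens_d(42.4, 11.8, 525, 41.2, 11.3, 1695)
d_bmi = cohens_d(25.0, 4.8, 525, 27.5, 6.1, 1695)

print(f"age: d = {d_age:.2f}")
print(f"BMI: d = {d_bmi:.2f}")
```

Under these assumptions the standardized difference for age is several times smaller in magnitude than that for body mass index, which is consistent with the judgement that the former is not clinically meaningful while the latter is.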

There were statistically significantly more patients in Europe with suicide attempts in the past (13.5 vs. 11.8%), but not with suicidal thoughts in the present episode. On HAM-D item [11] ‘suicide’, USA patients had statistically significantly higher scores, whereas on the MADRS item [10] ‘suicidal thoughts’ European patients had statistically significantly higher scores, which is difficult to explain. USA and European protocols excluded patients with known or suspected suicidal tendencies in the current episode. This fact could explain the relatively low average score for the suicide item. The mean score (± S.D.) for HAM-D item [11] was 0.9 ± 0.8 in Europe and 1.0 ± 0.8 in the USA (score ‘1’ on this four-point item corresponds to the description ‘life is not worth living’). The average score (± S.D.) for MADRS item [10] was 1.7 ± 1.0 in Europe and 1.6 ± 1.1 in the USA (score ‘2’ on this six-point item corresponds to the description ‘weary of life, only fleeting suicidal thoughts’). Overall, we considered these findings as clinically meaningless due to the very small differences.

The difference in means on the HAM-D total score (17 items) at baseline (Europe: 23.4 ± 4.0; USA: 22.3 ± 3.1) was statistically significant. In a ‘normal-sized’ antidepressant trial with a sample size of 60–100 patients per treatment arm, a difference at baseline between the treatment groups of even more than 1.1 points on the total HAM-D is considered as clinically not relevant for the treatment outcome (e.g. HAM-D total score at baseline: group 1: 22.7 points, group 2: 24.0 points [data on file]). These findings can be related to the findings of Thase et al. (1984) who found that the sample of ‘symptomatic volunteers’ (patients recruited by means of advertisements, as was probably frequently the case in our USA sample) had somewhat lower baseline HAM-D scores (symptomatic volunteers: 22.2, clinical referrals: 24.7). Even with a baseline difference of 2.5 points on the HAM-D total score the authors considered the samples as comparable.

Fig. 3. Frequency distribution of the MADRS Total Score at baseline by continent (Intent-To-Treat group).

Fig. 4. Summary statistics (mean ± S.D.) on the various MADRS items at baseline by continent (Intent-To-Treat group).

Fig. 5. Presence of various diagnostic symptoms according to the DSM-III-R checklist for a major depressive episode at baseline by continent Intent-To-Treat group.

Fig. 6. Presence of various melancholic features according to the DSM-III-R checklist for a major depressive episode at baseline by continent Intent-To-Treat group.

When we compared the samples of melancholic and non-melancholic patients in Europe and the USA, the difference in HAM-D baseline scores was not of the magnitude found by Ansseau (1992). He reported that in nefazodone studies melancholic patients had 3.3 points higher scores in Europe than in the USA, and the scores in non-melancholic patients were 4.2 points higher in Europe than in the USA. In our sample, the mean HAM-D total score (17 items) at baseline within melancholic patients in Europe was 1.3 points higher, and within non-melancholic patients 0.1 point higher than in the USA. In our sample, there were statistically significantly more severely depressed patients (as indicated by an accepted cut-off score of 25 and above on the HAM-D total score) in Europe as compared with the USA (36.4 vs. 21.9%) which we also consider of clinical relevance due to the magnitude of the difference. There are published data showing that patients with less severe depression respond more favorably to placebo (Fairchild et al., 1986; Paykel et al., 1988; Reimherr et al., 1989; Brown et al., 1992; Wilcox et al., 1992; Brown, 1994; Stassen et al., 1994; Bialik et al., 1995). Although the baseline severity of depression as measured by the HAM-D total score was lower in the USA, USA patients showed a higher level of depression on the Bech depression factor. The item analysis showed that the somatic items on the HAM-D are the ones that could explain this difference, namely ‘sleep’, ‘appetite’ and ‘weight loss’ which were more present in European patients. This is in accordance with the DSM-III melancholia subtype where sleep, appetite or weight loss are more pronounced in European patients than in USA patients. This observation was confirmed by the finding of statistically significantly more melancholic patients in Europe.

There was a statistically highly significant difference on the HAM-D items [4] ‘somatic symptoms gastrointestinal’ and [5] ‘loss of weight’ being more affected by depression in European patients. Furthermore, European patients had more ‘somatic anxiety’, ‘hypochondriasis’ and ‘retardation’. USA patients showed more ‘general somatic symptoms’ and more ‘feelings of guilt’. These differences in symptom presentation may be the result of cultural variations in the frequencies with which certain symptoms appear (De Girolamo, 1993). However, the clinical relevance of these findings in the context of clinical trials with the aim to show efficacy of a new antidepressant treatment is unknown.
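To put a number on the magnitude of the difference in the share of severely depressed patients (HAM-D total score of 25 and above), the following minimal sketch computes the absolute difference in proportions with an approximate 95% confidence interval from the counts in Table 2. The Wald interval is an assumption chosen for simplicity; it is not part of the authors' analysis, and the function name is hypothetical.

```python
import math

def proportion_difference(x1, n1, x2, n2, z=1.96):
    """Difference p1 - p2 with a Wald confidence interval (z = 1.96 for about 95%)."""
    p1, p2 = x1 / n1, x2 / n2
    diff = p1 - p2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return diff, diff - z * se, diff + z * se

# Severely depressed patients (HAM-D >= 25) from Table 2:
# Europe 191 of 525, USA 372 of 1695.
diff, lower, upper = proportion_difference(191, 525, 372, 1695)
print(f"difference = {100 * diff:.1f} percentage points "
      f"(approx. 95% CI {100 * lower:.1f} to {100 * upper:.1f})")
```

The point estimate corresponds to the 36.4% versus 21.9% contrast quoted in the text; the interval only indicates the sampling uncertainty around it and does not by itself establish clinical relevance.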

Table 3
Summary statistics on various efficacy parameters (HAM-D, MADRS) at baseline by continent and type of patients (melancholic versus non-melancholic according to DSM-III-R (a)) (Intent-To-Treat group)

Melancholic patients according to DSM-III-R

Parameter                           Europe (n=376)   USA (n=929)      Exploratory P-value
HAM-D total score (17 items)
  Mean (S.D.)                       24.1 (4.1)       22.8 (3.3)       P=0.0001 (b)
  Min–max                           17–40            17–36
  <25                               211 (56.1%)      672 (72.3%)      P=0.001 (c)
  ≥25 (severely depressed)          165 (43.9%)      257 (27.7%)
MADRS total score
  Mean (S.D.)                       31.0 (6.0)       29.3 (5.5)       P=0.0001 (b)
  Min–max                           14–50            9–47
  <30                               146 (38.8%)      484 (52.2%)      P=0.001 (c)
  ≥30 (severely depressed)          228 (60.6%)      443 (47.8%)

Non-melancholic patients according to DSM-III-R

Parameter                           Europe (n=149)   USA (n=762)      Exploratory P-value
HAM-D total score (17 items)
  Mean (S.D.)                       21.7 (3.2)       21.6 (2.8)       P=0.6033 (b)
  Min–max                           21–32            15–33
  <25                               123 (82.6%)      647 (84.9%)      P=0.467 (c)
  ≥25 (severely depressed)          26 (17.4%)       115 (15.1%)
MADRS total score
  Mean (S.D.)                       29.1 (5.2)       26.6 (5.2)       P=0.0001 (b)
  Min–max                           14–42            13–41
  <30                               73 (49.0%)       541 (71.3%)      P=0.001 (c)
  ≥30 (severely depressed)          75 (50.3%)       218 (28.7%)

(a) For four subjects in the USA population, the DSM-III-R classification was missing.
(b) Exploratory P-value based on a (two-sided) Wilcoxon test.
(c) Exploratory P-value based on a (two-sided) χ²-test.
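The melancholic subgroup is defined in Section 3.5 as patients with at least five melancholic features (the M criteria) present. The sketch below encodes that rule and recomputes the percentages of melancholic patients from the subgroup sizes in Table 3. The function names are hypothetical, and representing the nine features as booleans is an assumption about how a checklist might be stored.

```python
def is_melancholic(features_present):
    """Apply the 'at least five of nine melancholic features' rule.

    features_present: iterable of nine booleans for M1..M9.
    """
    features = list(features_present)
    if len(features) != 9:
        raise ValueError("expected nine melancholic features (M1-M9)")
    return sum(bool(f) for f in features) >= 5

def percent(count, total):
    """Percentage rounded to one decimal place."""
    return round(100 * count / total, 1)

# Subgroup sizes from Table 3. Four USA patients had no DSM-III-R
# classification, so the USA denominator is 929 + 762 = 1691, not 1695.
europe_pct = percent(376, 376 + 149)
usa_pct = percent(929, 929 + 762)
print(europe_pct, usa_pct)
```

These percentages match the 71.6% and 54.9% quoted in the text, which suggests that the USA figure was calculated on the patients with a non-missing classification.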


Table 4
Description of the diagnostic symptoms and melancholic features of the DSM-III-R checklist

Symptom/feature   Description
A1                Depressed mood most of the day or nearly every day
A2                Markedly diminished interest or pleasure in all or almost all activities most of the day or nearly every day
A3                Significant weight loss when not dieting (e.g. ≥5% in a month)
A4                Insomnia or hypersomnia nearly every day
A5                Psychomotor agitation or retardation nearly every day
A6                Fatigue or loss of energy nearly every day
A7                Feelings of worthlessness or excessive or inappropriate guilt nearly every day
A8                Diminished ability to think or concentrate, or indecisiveness nearly every day
A9                Recurrent thoughts of death (not just fear of dying); recurrent suicidal ideation; or a suicide attempt or a specific plan for committing suicide

M1                Loss of interest or pleasure in all or almost all activities
M2                Lack of reactivity to usually pleasurable stimuli
M3                Depression regularly worse in morning
M4                Early morning awakening (at least 2 h before usual awakening)
M5                Psychomotor retardation or agitation
M6                Significant anorexia or weight loss (e.g. >5% of weight loss in a month)
M7                No significant personality disturbance before first major depressive episode
M8                One or more previous major depressive episodes followed by complete or nearly complete recovery
M9                Previous good response to specific and adequate somatic antidepressant therapy

The statistically significantly lower scores on the HAM-D Angst ‘anxiety/agitation’ factor in the USA patient population are in line with the findings of Thase et al. (1984) that patients recruited via advertisements had lower anxiety scores.

The findings on the MADRS scale confirmed the results on the HAM-D scale: the patients recruited in studies in Europe were more severely depressed, as indicated by a higher total score at baseline. There were several statistically highly significant findings for individual MADRS items, namely [3] ‘inner tension’, [4] ‘reduced sleep’ and [5] ‘reduced appetite’, indicating that these symptoms were more pronounced in European patients. Even so, one can argue that the differences in baseline means between the groups of 0.4 for ‘inner tension’ and 0.6 for ‘reduced sleep’ may not be clinically relevant. The difference between the baseline means on ‘reduced appetite’ of 1.2 points appears to us to be clinically relevant.

The CGI severity score confirmed the statistically significant difference in baseline severity as measured by HAM-D and MADRS total scores. In Europe, 42.6% of the patients were markedly or severely ill as compared with only 18.5% in the USA. The importance and the consequences of the difference in severity have already been discussed.

There was no statistically significant difference in the percentage of patients with recurrent episodes as classified by DSM-III-R diagnosis. The most consistent finding on the DSM-III-R was the statistically significant difference in the number of patients with melancholic features (Europe: 71.6%, USA: 54.9%). Significantly more European patients showed ‘lack of reactivity to usually pleasurable stimuli’, ‘depression regularly worse in the morning’, ‘early morning awakening’, ‘significant anorexia or weight loss’, ‘no significant personality disturbances before first MDD episode’ and ‘previous good response to specific and adequate somatic antidepressant therapy’.

To our knowledge, this is the first study to investigate systematically the similarities and differences between depressive patients from Europe and the USA who participated in antidepressant clinical trials. However, the findings should be cross-verified using databases of other antidepressant drugs.

Our results indicate that there are some differences between the patient populations recruited in Europe and the USA for antidepressant clinical trials. Some of the statistically significant differences found may be the result of the large sample size and are probably without any clinical relevance when the absolute size of the difference is taken into account. The influence of clinically relevant baseline differences on efficacy and safety outcomes during the treatment period should be carefully investigated and handled appropriately in the statistical analysis, using techniques such as stratification or covariate analysis, or sub-group analyses for more exploratory purposes.

In our sample, European patients appear to have a more severe depressive episode, as measured by HAM-D and MADRS total scores, with more anxiety and melancholic features than USA patients, while USA patients had a higher mean score on the HAM-D Bech depression factor, which consists of six core depression items. The differences between the European and the USA populations which we found in our sample were far smaller than we expected and, in our opinion, not of a magnitude that would call into question the reliability of the results obtained in a global antidepressant drug development program. If these general observations could be confirmed, it would indicate that the differences between European and American depression trial populations are smaller than assumed and that the data gathered in Europe and the USA on an antidepressant in development can be pooled.
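The remark that large samples can make small differences statistically significant can be illustrated with a standardized mean difference. The sketch below is an illustration only and not part of the authors' analysis, which used Wilcoxon tests. It computes Cohen's d with a pooled standard deviation from the baseline means and S.D.s reported for the melancholic and non-melancholic subgroups. The function name `cohens_d` is hypothetical, and the subgroup sizes are assumed to apply to both scales even though a few MADRS values were missing.

```python
import math

def cohens_d(mean1, sd1, n1, mean2, sd2, n2):
    """Standardized mean difference using the pooled standard deviation."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    return (mean1 - mean2) / math.sqrt(pooled_var)

# (Europe mean, S.D., n, USA mean, S.D., n) taken from the subgroup table.
# Assumption: subgroup n is used for every scale, ignoring the few missing MADRS values.
baseline = {
    "HAM-D, melancholic": (24.1, 4.1, 376, 22.8, 3.3, 929),
    "MADRS, melancholic": (31.0, 6.0, 376, 29.3, 5.5, 929),
    "HAM-D, non-melancholic": (21.7, 3.2, 149, 21.6, 2.8, 762),
    "MADRS, non-melancholic": (29.1, 5.2, 149, 26.6, 5.2, 762),
}

effect_sizes = {label: cohens_d(*values) for label, values in baseline.items()}
for label, d in effect_sizes.items():
    raw_difference = baseline[label][0] - baseline[label][3]
    print(f"{label}: difference in means = {raw_difference:.1f}, d = {d:.2f}")
```

Under these assumptions the standardized differences range from near zero to a little under 0.5, although three of the four comparisons carry an exploratory P-value of 0.0001.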

References

American Psychiatric Association. Diagnostic and statistical manual of mental disorders (DSM-III-R). Washington D.C., 1987.
Amori G, Lenox RH. Do volunteer subjects bias clinical trials? Journal of Clinical Psychopharmacology 1989;9:321–7.
Angst J, Scheidegger P, Stabl M. Efficacy of moclobemide in different patient groups. Results of new sub-scale of the Hamilton Depression Rating Scale. Clinical Neuropharmacology 1993;16(2):S55–S62.
Ansseau M. The Atlantic gap: clinical trials in Europe and the United States. Biological Psychiatry 1992;31:109–11.
Bech P, Allerup P, Gram LF, Reisby N, Rosenberg R, Jacobsen O, et al. The Hamilton depression scale. Evaluation of objectivity using logistic models. Acta Psychiatrica Scandinavica 1981;63:290–9.
Bech P, Cialdella P, Haugh MC, Birkett MA, Hours A, Boissel JP, et al. Meta-analysis of randomised controlled trials of fluoxetine v. placebo and tricyclic antidepressants in the short-term treatment of major depression. British Journal of Psychiatry 2000;176:421–8.
Bialik RJ, Ravindran AV, Bakish D, Lapierre YD. A comparison of placebo responders and non-responders in subgroups of depressive disorder. Journal of Psychiatry and Neuroscience 1995;20:265–70.
Brauzer B, Goldstein BJ. Symptomatic volunteers: another patient dimension for clinical trials. Journal of Clinical Pharmacology 1973;89–98.
Brown WA. Placebo as a treatment for depression. Neuropsychopharmacology 1994;10:265–9.
Brown WA, Johnson MF, Chen MG. Clinical features of depressed patients who do and do not improve with placebo. Psychiatry Research 1991;41:203–14.
Covi L, Lipman RS, McNair DM, Czerlinsky T. Symptomatic volunteers in multicenter drug trials. Progress in Neuropsychopharmacology 1979;3:521–33.
Davies HTO, Marshall MN. UK and US health-care systems: divided by more than a common language. The Lancet 2000;355:336.
De Girolamo G. Cross-cultural differences in depression. Focus on Depression 1993;4:28–38.


Fairchild CJ, Rush AJ, Vasavada N, Giles DE, Khatami M. Which depressions respond to placebo. Psychiatry Research 1986;18:217–26.
Hersen M, Bellack AS, Himmelhoch JM. Comparison of solicited and nonsolicited female unipolar depressives for treatment outcome research. Journal of Consulting and Clinical Psychology 1981;49:611–3.
Huitfeldt M, Montgomery SA. Comparison between zimeldine and amitriptyline of efficacy and adverse symptoms: a combined analysis of four British clinical trials in depression. Acta Psychiatrica Scandinavica 1983;68(Suppl. 308):55–69.
Krupnik J, Shea T, Elkin I. Generalizability of treatment studies utilising solicited patients. Journal of Consulting and Clinical Practice 1986;54:68–78.
Miller CA, Hooper CL, Bakish D. A comparison of patients with major depressive disorder recruited through newspaper advertising versus consultation referrals for clinical drug trials. Psychopharmacology Bulletin 1987;33:69–73.
Niklson IA, Reimitz P-E, Sennef C. Factors that influence the outcome of placebo controlled antidepressant clinical trial. Psychopharmacology Bulletin 1997;33:41–61.
Paykel ES, Hollyman JA, Freeling P, Sedgwick P. Predictors of therapeutic benefit from amitriptyline in mild depression: a general practice placebo-controlled trial. Journal of Affective Disorders 1988;14:83–95.
Rapaport MH, Frevert T, Babior S, Zisook S, Judd LL. A comparison of demographic variables, symptom profiles and measurements of functioning in symptomatic volunteers and an outpatient clinical population. Psychopharmacology Bulletin 1995;31:111–4.
Rapaport MH, Zisook S, Frevert T, Seymour S, Kelsoe JR, Judd LL. A comparison of descriptive variables for clinical patients and symptomatic volunteers with depressive disorders. Journal of Clinical Psychopharmacology 1996;16:242–6.
Reimher FW, Ward MF, Byerley WF. The introductory placebo washout: a retrospective evaluation. Psychiatry Research 1989;30:191–9.
Stassen HH, Angst J, Delini-Stula A. Severity at baseline and onset of improvement in depression. Meta-analysis of imipramine and moclobemide versus placebo. European Psychiatry 1994;9:129–36.
Thase ME, Last CG, Hersen M, Bellack AS, Himmelhoch JM. Symptomatic volunteers in depression research: a closer look. Psychiatry Research 1984;11:25–33.
Wilcox CS, Cohn JB, Linden RD, Heiser JF, Lucas PB, Morgan DL, et al. Predictors of placebo response: a retrospective analysis. Psychopharmacology Bulletin 1992;28:157–62.