Psychometric validation of two Italian quality of life questionnaires in menopausal women


Maturitas 35 (2000) 129 – 142 www.elsevier.com/locate/maturitas

C. Wool a, R. Cerutti a, P. Marquis b,*, P. Cialdella c, C. Hervié b, ISGQL 1

a Ethical Drugs Medical Department, Bracco S.p.a., Via Egidio Folli 50, 20134 Milan, Italy
b Mapi Values France, 27 Rue de la Villette, 69003 Lyon, France
c 172 Avenue des Frères Lumière, 69008 Lyon, France

Received 4 January 1999; accepted 19 January 2000

Abstract

Objective: To establish the psychometric properties of the Italian version of two quality of life (QOL) questionnaires in menopausal women: the psychological general well being index (PGWBI) and the women’s health questionnaire (WHQ).

Method: These questionnaires were translated into Italian and then self-administered to out-patient women a first time, 1 week later in stable women to assess reproducibility, and 3 months later to evaluate responsiveness over time. Baseline analyses included: factorial structure, multitrait analysis, internal consistency reliability, and clinical validity.

Results: Questionnaires were returned by 155 women (median age: 54 years, median duration of amenorrhoea: 56 months, median Kupperman index 26). Principal component analysis (PCA) of the PGWBI showed an important general factor and then, after rotation, three factors. The PCA of the WHQ showed ten factors. Only five reproduced the dimensions postulated a priori quite well. The item convergent validity was confirmed for all items of the major dimension of the two questionnaires, and the item divergent validity, although acceptable, was less satisfying for the PGWBI than the WHQ. The internal reliability was good (Cronbach’s alpha approximately 0.70) for the PGWBI and for nine scales out of ten for the WHQ. The six dimensions of the PGWBI and most of the dimensions of the WHQ were significantly correlated to the Kupperman index, indicating the clinical validity of the instruments. The responsiveness to change in clinical status at 3 months was better in the PGWBI than in the WHQ with moderate effect size (around 0.5).

Conclusion: The Italian versions of the PGWBI and the WHQ are reliable and useful for HRT clinical trials but the dimensional scores must be calculated bearing in mind the limitations in the structure. Other studies are needed to improve the factorial stability of certain WHQ dimensions. For the Italian version of the PGWBI, the validation process is to be completed by studies of mixed populations suffering from other types of disease. © 2000 Published by Elsevier Science Ireland Ltd. All rights reserved.

Keywords: Menopause; Quality of life; PGWBI; WHQ; Validity; Reliability; Factorial analysis; Psychometric validation

* Corresponding author. Tel.: +33-4-72136698; fax: +33-4-72136668. E-mail address: [email protected] (P. Marquis)
1 Italian Study Group on Quality of Life, Italy.

0378-5122/00/$ - see front matter © 2000 Published by Elsevier Science Ireland Ltd. All rights reserved. PII: S0378-5122(00)00093-1

1. Introduction

Even though menopause is a normal physiological state, menopause transition means possible changes in health and well-being. Women report physical discomfort, sleeplessness, and embarrassment, and complain about many symptoms such as vasomotor symptoms (hot flushes and night sweating), and vaginal dryness, which can lead to dyspareunia. These symptoms affect a very high proportion of women and, when occurring together, are typical of the menopausal period. In an epidemiological study involving 5213 women aged from 39 to 60 years old, Oldenhave et al. [1] demonstrated that vasomotor complaints peaked for 85% of the women at the end of the premenopause and at the beginning of the postmenopause. The post-menopausal stage is defined by the duration since the last menstrual cycle, which should be > 6 months and < 2 years.

This symptomatology is likely to have a major impact on patients’ subjective health-related quality of life (HRQoL) and the resolution of these symptoms may play a major role in improving patients’ HRQoL. As a result, it is important to have instruments available which measure the subjective consequences of this health state on quality of life, whereas clinical scales currently used by clinicians, such as the Kupperman index [2], focus primarily on hot flushes. If possible, these HRQoL instruments should measure the impact of the menopause on several dimensions, and be reliable, valid and sensitive to changes in the clinical status if they are to be used as assessment criteria in clinical research.

At the present time, all experience in HRQoL assessment points to the simultaneous use of both a generic instrument (to assess general health status) and a condition-specific instrument. Several generic questionnaires have already been validated and are reliable and available in several languages (SF-36, SIP, NHP, PGWBI). Since the menopause is not a disease in itself, the generic questionnaire was chosen with regard to the psychological components, which are more disturbed than the physical components during menopause.
The psychological general well-being index (PGWBI) [3], a well-known, validated and widely-used self-administered questionnaire (Naughton and Wiklund, 1993), was chosen. The women’s health questionnaire (WHQ) [4–6] was the specific questionnaire chosen to be administered alongside the PGWBI questionnaire. The WHQ is a self-administered questionnaire developed to assess a wide range of physical and emotional symptoms, or sensations, experienced by mid-aged women and to specifically study possible changes in health and well-being during menopause transition. The WHQ, like the PGWBI, consists of both positive items (of well-being) and negative items (symptoms). The WHQ was originally developed and validated in English in 682 women. Although several translations existed, no Italian version was available. Several hormone replacement therapy (HRT) trials have already been performed with the WHQ as outcome criterion [7–9].

For this study, the WHQ and the PGWBI were first translated into Italian, according to current standards [10,11]: the details of the translation process are given in Section 2. The psychometric properties of these versions also had to be evaluated before using both questionnaires in clinical studies and HRT trials. To complete the cultural adaptation process, a study was conducted to assess the main psychometric properties of these new linguistic versions.

2. Methods

2.1. Study design and objective

This was a multi-centre cross-sectional study with a 3-month follow-up among postmenopausal women. The aims of the study were to assess the psychometric properties of the WHQ and PGWBI Italian versions, and evaluate QoL in postmenopausal women.

2.2. Study population

A total of 150 postmenopausal women recruited by gynaecologists in Italy were included in the study. The number of subjects was estimated so that the ratio of the number of subjects to the number of items was around five which is considered satisfactory for factorial analyses. Since the longer questionnaire (WHQ) consisted of 58 items, it was calculated that the sample for analysis should include about 150 subjects. Consecutive subjects consulting their gynaecologists were recruited, irrespective of their treatment. A total of 11 gynaecologists each recruited 15 patients (see list of investigators).

2.2.1. Inclusion criteria

Female outpatients aged 40–65 years, with a menopausal state defined as either the last natural menstrual cycle completed at least 12 consecutive months prior to screening, or a surgical postmenopausal state (total oophorectomy) completed at least 8 days prior to screening, or hysterectomized women who met the rest of inclusion criteria. Women had to complain of sufficiently serious post-menopausal symptoms, defined as a Kupperman index [2] > 20, for which they had not been treated for over a year. Written informed consent was obtained before entering the study.

2.2.2. Exclusion criteria

Inability to participate and/or to complete a quality of life questionnaire for cognitive and/or linguistic reasons. Treatment for postmenopausal symptoms for 1 year or more.

2.3. Conduct of the study and data collected

Three visits were scheduled: at entry (D0), at 1 week post-entry (D7), and at 3 months post-entry (D90). Patients were asked to complete the quality of life questionnaires on these three occasions. In addition, socio-demographic, post-menopausal symptomatology, and clinical data were collected in the case record form which was completed by the clinicians during the three scheduled visits.

The PGWBI [3] is a 22-item self-administered questionnaire designed to measure perception of general well-being through six dimensions or scores: anxiety (five items); depression (three items); feeling of well-being (four items); vitality (four items); general health (three items); and self-control (three items). All the items are rated on a 6-point scale (0–5). Six sub-scores and a global score (‘index’) can be calculated which ranges from 0 to 100 after a linear transformation. The higher the scores, the better the well-being.
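The linear transformation of the PGWBI global score can be illustrated with a minimal sketch. It assumes that the 22 item scores have already been oriented so that higher values mean better well-being, and that the global index is the raw sum (0 to 110) rescaled to 0 to 100; the function name `pgwbi_global_index` is hypothetical and missing-data handling is not shown.

```python
from typing import Sequence

N_ITEMS = 22        # number of PGWBI items, as stated in the text
MAX_PER_ITEM = 5    # each item is rated on a 0-5 scale


def pgwbi_global_index(item_scores: Sequence[int]) -> float:
    """Rescale the raw sum of the 22 item scores (0-110) linearly to 0-100.

    ASSUMPTION: items are already oriented so that higher = better.
    """
    if len(item_scores) != N_ITEMS:
        raise ValueError(f"expected {N_ITEMS} item scores, got {len(item_scores)}")
    if any(not 0 <= s <= MAX_PER_ITEM for s in item_scores):
        raise ValueError("each item score must lie between 0 and 5")
    raw_sum = sum(item_scores)
    return 100.0 * raw_sum / (N_ITEMS * MAX_PER_ITEM)
```

For example, a respondent answering 3 on every item has a raw sum of 66 and a transformed index of 60.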


The WHQ [6] is a 36-item self-administered specific instrument which provides a detailed examination of minor psychological and somatic symptoms experienced during peri- and postmenopausal periods. The items are combined into nine dimensions: depressed mood; somatic symptoms; cognitive difficulties; vasomotor symptoms; anxiety/fear; sexual functioning; sleep problems; menstrual symptoms and attractiveness. All the items are rated on a 4-point scale and nine subscores are calculated. The higher the scores, the more profound the distress or dysfunction. The translation of each of these two questionnaires was performed as follows: translation from English to Italian by two independent bilingual translators whose mother tongue is Italian. The two versions were compared and harmonised during a meeting. This first version was back-translated to English. The back-translated version was compared to the original English version of the questionnaire during a second meeting in order to make amendments as necessary. This yielded a new harmonised Italian version. This second harmonised version was field-tested with Italian patients in order to check the adequacy and comprehensibility of questions. Finally, a harmonisation meeting was held in order to produce the final version. For the WHQ, an additional meeting was held and the translation procedure was carried out in close collaboration with the author of the questionnaire, M. Hunter.

2.4. Statistical analysis

2.4.1. Patients description

Socio-demographic and clinical variables were described at D0. Postmenopausal symptoms and QoL scores were described at D0, D7, and D90. Variables were described by classical summary statistics such as mean, standard deviation, median, minimum, maximum (quantitative variables), numbers of patients and percentages by categories (qualitative variables).

2.4.2. Psychometric validation

According to psychometric standards [10,11], the psychometric validation consisted of the following steps:

2.4.2.1. Construct validity. The construct validity of the questionnaires at D0 was studied by means of principal component analysis (PCA) with varimax rotation and multitrait analysis. For the PCA, missing data were not replaced. The observed factors were compared to the a priori dimensions. The resulting scaling structure was subsequently examined using multitrait analysis. Item convergent and divergent validity was studied for each scale, using the following [12]:
1. item convergent validity defined as a correlation of 0.40 or greater between an item and its hypothesised scale (corrected for overlap);
2. item discriminant validity based on a comparison of the magnitude of the correlation of an item with its own scale, compared to other scales;
3. scaling success was defined as cases in which an item is correlated higher with its hypothesised scale (corrected for overlap) than with other scales. The results were expressed in percentage of scaling success; 100% meaning that in all cases, items of a scale were better correlated with their own scale than with any other scale;
4. the dimensional scores distributions were examined to check floor and ceiling effects which would have constituted a limit to the measuring capacities of the instrument.

2.4.2.2. Reliability. Two aspects of reliability were studied:
- internal consistency was estimated using the Cronbach’s alpha coefficient [13] at D0;
- test-retest reliability was analysed for the subgroup of patients who remained clinically stable between D0 and D7, according to the following criterion: Kupperman index which did not vary by more than two points between D0 and D7. We used the intraclass correlation coefficient (ICC), based on an analysis of variance as recommended by several authors [14].

2.4.2.3. Clinical validity. It was postulated that the severity of patients’ postmenopausal symptoms was directly related to their QoL score. The correlations (non-parametric) between QoL scores and the clinical severity criterion (Kupperman index) were then assessed. Other potential confounding criteria such as concomitant disease and age were also introduced in this analysis.
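The item convergent validity criterion of Section 2.4.2.1 and the internal consistency coefficient of Section 2.4.2.2 can be sketched as follows. This is an illustration under stated assumptions, not the authors' software: each scale is given as a list of item columns (one list of respondent scores per item) with no missing data, "corrected for overlap" is taken to mean that the item is correlated with the sum of the other items of its scale, and the function names are hypothetical.

```python
from math import sqrt
from statistics import mean, variance
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two equally long score lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def corrected_item_scale_r(items: List[Sequence[float]], index: int) -> float:
    """Correlate one item with the sum of the remaining items of its scale.

    ASSUMPTION: this is how the correction for overlap is implemented.
    """
    n_respondents = len(items[0])
    rest = [
        sum(col[i] for j, col in enumerate(items) if j != index)
        for i in range(n_respondents)
    ]
    return pearson(items[index], rest)


def meets_convergent_criterion(items: List[Sequence[float]], index: int) -> bool:
    """Convergent validity threshold quoted in the text: r of 0.40 or greater."""
    return corrected_item_scale_r(items, index) >= 0.40


def cronbach_alpha(items: List[Sequence[float]]) -> float:
    """Cronbach's alpha from the item variances and the total-score variance."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]
    sum_item_var = sum(variance(col) for col in items)
    return k / (k - 1) * (1 - sum_item_var / variance(totals))
```

With made-up scores for a three-item scale, `cronbach_alpha([[1, 2, 3, 4, 5], [2, 2, 3, 5, 5], [1, 3, 3, 4, 4]])` gives roughly 0.95; real data would of course have to replace these hypothetical values.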

2.4.2.4. Responsiveness to clinical change over time. A subgroup of patients was selected for this analysis, choosing as criterion of improved health status (whatever the treatment taken by the patients) a change in Kupperman index of at least four points. In order to compare the responsiveness to clinical change in PGWBI and WHQ scores over time, the score changes between D0 and D90 were transformed into effect sizes [15]. Two types of calculation formulae were used for the effect size. In both cases the effect size is presented as a ratio with the difference in the average score between D90 and D0 as numerator, the denominator for effect size 1 being the standard deviation at D0, and for effect size 2, a composite standard deviation [16]. According to Cohen [17], an effect size of 0.20 or less can be considered small, 0.50 as medium, while 0.80 or more would be large.
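The two effect-size ratios can be sketched as below. Effect size 1 follows the description in the text directly. The text does not define the composite standard deviation of effect size 2, so the pooled form used here (square root of the mean of the D0 and D90 variances) is purely an assumption for illustration; the function names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev
from typing import Sequence


def effect_size_1(d0: Sequence[float], d90: Sequence[float]) -> float:
    """Mean change between D90 and D0 divided by the standard deviation at D0."""
    return (mean(d90) - mean(d0)) / stdev(d0)


def effect_size_2(d0: Sequence[float], d90: Sequence[float]) -> float:
    """Mean change divided by a composite standard deviation.

    ASSUMPTION: the composite SD is the pooled SD of the D0 and D90 scores;
    the source only says 'a composite standard deviation [16]'.
    """
    pooled_sd = sqrt((stdev(d0) ** 2 + stdev(d90) ** 2) / 2)
    return (mean(d90) - mean(d0)) / pooled_sd
```

With hypothetical scores `d0 = [10, 12, 14, 16, 18]` and `d90 = [12, 14, 16, 18, 20]`, both functions return about 0.63, since the mean rises by 2 points against a standard deviation of about 3.16.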

3. Results

3.1. Description of the sample

The database comprised 155 patients. Only one patient did not fulfil all the inclusion and exclusion criteria (37 years old); her data were retained for the analysis. No questionnaire was missing at any time during the study. The proportion of questionnaires with at least one missing data item (MD) was higher at D90 (50%) than at D7 (22%) or at D0 (17.5%). For three of the items of the WHQ, the number of missing data was higher than for the other items. These items were related to ‘heavy periods’ (item 26) and sexuality, and thus did not concern all patients recruited.

Socio-demographic and clinical data at baseline are described in Table 1. The age of the patients ranged from 37 to 65 years (mean = 54 ± 5.1; median = 54). At the time of study, 82.6% of the women were living with partners. The largest group of women (43.2%) had a level of education below A-level and 50.3% were unemployed and therefore at home.

Table 1
Patients’ baseline clinical and socio-demographic data (n = 155)

Quantitative variables              Mean (S.D.)      Median
Age (years)                         54.21 (5.08)     54
Age of first periods (years)        12.52 (1.34)     12
Amenorrhea duration (months)        71.70 (58.65)    56
Duration of treatment (weeks)       17.67 (13.94)    17
Number of pregnancies               2.12 (1.14)      2
Kupperman index at D0               27.69 (6.51)     26
Kupperman index at D7               26.56 (7.13)     25
Kupperman index at D90              23.65 (7.53)     23

Nominal variables (detailed by modalities)    Number of patients    %
Family status
  Living alone                      12       7.74
  Living as a couple                128      82.58
  Other                             15       9.68
Highest level of education
  No diploma                        42       27.10
  < A-level                         67       43.23
  A-level or equivalent             28       18.06
  University level                  18       11.61
Present job
  Full time                         38       24.52
  Part time                         15       9.68
  Retired or early retired          23       14.84
  At home                           78       50.32
  Unemployed                        1        0.65
Current or last job
  Sales, company head               15       9.68
  Executive                         6        3.87
  Teacher, technical                15       9.68
  Employee                          60       38.71
  Unemployed                        39       25.16
  Other                             19       12.26
  Missing data                      1        0.65
Current treatment
  No treatment                      124      80
  Symptomatic treatment             2        1.29
  Hormonal replacement therapy      27       17.42
  Other treatments                  1        0.65
Bleedings
  Yes                               14       9.03
  No                                141      90.97


The median age of first menstrual periods was 12 years old. The median number of pregnancies was two. The median duration of amenorrhoea was more than 4 years (56 months). A total of 91% of the women had not experienced any bleeding over the past month. A total of 80% did not take any treatment, and among those who were treated, 90% were taking some form of HRT. The median Kupperman index (non parametric distribution) was 26 on D0, 25 on D7 and 23 on D90. A total of 61.3% of the women did not suffer from breast tension, 40.4% did not complain of vaginal dryness, 70.3% of the women had sexual intercourse at least once during the past month and of these women, 49.5% did not complain of any dyspareunia. The distribution of PGWBI and WHQ scores per dimension are given in Table 2. Examining the most menopause-specific items of the WHQ, 45% of the women complained of having hot flushes (item 19), whereas 34% complained of having hot flushes only periodically and 14% just sporadically. The proportions were not so high for the item on night sweats (item 27) because 25% had no such complaint. Concerning sexual activities (item 24), 13% complained of constant dyspareunia linked to vaginal dryness (item 34), and 9% complained of total dissatisfaction during sexual intercourse (item 31), although these last two items were characterised by 30% missing data, probably linked to the 28% who had lost interest in sexual activities (item 24).

3.2. Psychometric results

3.2.1. Construct validity

3.2.1.1. PGWBI (Italian version). The principal component analysis (PCA) resulted in three factors, according to Kaiser’s criterion [18], which explained up to 60.8% of the total variance. The first factor explained 48.9% of the total variance before rotation and 30.7% after rotation, and can be considered as a general factor. Table 3 details the content of each factor with specific variance explained for each factor. Items of anxiety and depression were distributed on factor 1. Some items of well-being were distributed on factors 1 and 2. Some items of vitality were distributed on factors 1 and 3. Items of general health were distributed on factor 3, and items of self control were distributed on factor 2. This PCA showed that the PGWBI measured only three main aspects in this study: anxiety/depression; self-control; and general health/vitality.

According to the multitrait analysis, all items respected the convergent validity criterion. Eight items did not respect the divergent validity criteria, being more correlated to another dimension than to their own dimension (items 1–3, 7, 10, 13, 14 and 18). There was a wide range of scaling success: from 100% for anxiety and vitality dimensions to 66.7% for the self control and general health dimensions. No floor and/or ceiling effects were observed.

Table 2
Mean scores and standard deviation of PGWBI and WHQ questionnaires

PGWBI a                     Mean score    Standard deviation
Anxiety (5)                 14.1          5.4
Depressed mood (3)          10.7          3.0
Positive well-being (4)     9.0           3.6
Self-control (3)            12.5          4.4
General health (3)          9.7           2.9
Vitality (4)                7.9           3.0
Global index (22)           63.7          18.6

WHQ a                       Mean score    Standard deviation
Depressed mood (7)          0.34          0.21
Somatic symptoms (5)        0.52          0.22
Memory (3)                  0.57          0.36
Vasomotor symptoms (2)      0.71          0.37
Anxiety (4)                 0.51          0.33
Sexual behaviour (3)        0.48          0.35
Sleep problems (2)          0.70          0.32
Menstrual symptoms (5)      0.28          0.27
Attractiveness (2)          0.48          0.42

a Number of items in the dimension.

Table 3
Principal component analysis of the PGWBI questionnaire, with varimax rotation

Factor 1 (% V.E. = 30.7) a
Item   PGWBI items: observed groupings               Original dimension
19     Did you feel relaxed                          Anxiety
5      Have you been bothered by nervousness         Anxiety
17     Have you been anxious                         Anxiety
8      Were you tense or did you feel tension        Anxiety
3      Did you feel depressed                        Depression
22     Have you been under or felt under any strain  Anxiety
20     I felt cheerful                               Well-being
7      I felt downhearted                            Depression
1      How have you been feeling in general          Well-being
9      How happy                                     Well-being
21     I felt tired                                  Vitality
11     Have you felt so low                          Depression
16     Did you feel active                           Vitality
15     My daily life was full of things that         Well-being
12     I woke up feeling fresh and rested            Vitality
6      How much energy                               Vitality

Factor 2 (% V.E. = 15.4) a
Item   PGWBI items: observed groupings               Original dimension
4      Have you been in firm control of              Self-control
18     I was emotionally stable                      Self-control
15     My daily life was full of things that         Well-being
14     Have you had any reason to wonder             Self-control
20     I felt cheerful                               Well-being
7      I felt downhearted                            Depression
13     Have you been concerned                       General health
10     Did you feel healthy enough to carry          General health

Factor 3 (% V.E. = 14.7) a
Item   PGWBI items: observed groupings               Original dimension
2      Were you bothered by any illness              General health
12     I woke up feeling fresh and rested            Vitality
13     Have you been concerned                       General health
10     Did you feel healthy enough to carry          General health
6      How much energy                               Vitality
21     I felt tired                                  Vitality
11     Did you feel active                           Vitality

a % V.E., percentage of total variance extracted by the rotated factor

3.2.1.2. WHQ. As item 26 (‘heavy periods’) was not applicable to this sample (inclusion criteria), this item had virtually no covariance with the other items, and was not included in the PCA. A first PCA resulted in 11 factors (Kaiser’s criterion) and a first multitrait analysis showed that item 10 (I have a good appetite) was correlated neither to its own dimension, depression (r = 0.01), nor to the other dimensions. In addition, some items appeared to be better correlated to other scales than to their own scale. Item 10 was deleted; items 30 (I often notice pins…in my hands…) and 35 (I need to pass urine more frequently) were transferred from the dimension somatic symptoms to menstrual symptoms. Item 11 (I am restless and can’t keep still) was transferred from the dimension sleep problems to depressed mood.

A second PCA was then performed. The first factor before rotation explained 22.9% of the total variance and there was no real tendency for a general factor to appear. Although Kaiser’s criterion had indicated 11 factors, the eleventh value was < 1.5; as a consequence only ten factors were retained, which explained 66.8% of the total variance. Table 4 displays the content of each factor with variance explained for each factor. Three factors were quite composite (factors 1, 3 and 4) because they consisted of items taken from different dimensions of the WHQ, but the other factors were more homogeneous and reproduced certain dimensions fairly well, such as factors 2 (menstrual symptoms); 5 (sexual behaviour); 6 (vasomotor symptoms); 7 (memory) and 8 (sleep problems). Factor 9 was reduced to two items of the somatic symptoms dimension (the last two items of the somatic symptoms dimension were spread over factors 1 and 3). Factor 10 was composed of item 13 (worry of growing old) which was not attributed to a specific dimension [6] and of item 36.

A second multitrait analysis of the WHQ (34 items) was performed: for six dimensions (memory, vasomotor symptoms, anxiety/fears, sexual behaviour, sleep problems and attractiveness), the items fulfilled the convergent validity criterion, and for the five items of the other three dimensions, the item-scale correlations were lower than the threshold of 0.40, but were > 0.32. Only four items did not fulfil the divergent validity criteria, being more correlated to another dimension than to their own dimension (items 9, 12, 15 and 21). There was a medium range of scaling success: from 100% for vasomotor symptoms, sexual behaviour, sleep problems, menstrual symptoms dimensions, to 91.1% for somatic symptoms dimension. A floor effect was observed for two dimensions: vasomotor symptoms, where the answers were concentrated around the response choices ‘yes, definitely’ and ‘yes, sometimes’, and sleep problems where extreme response choices were over-represented. Taking into account those psychometric results, the construct validity of the 34-item questionnaire was considered to be moderately satisfactory.

3.2.2. Reliability

Cronbach’s coefficients for the PGWBI ranged from 0.63 (general health) to 0.89 (anxiety). Four dimensions had a coefficient greater than the generally recommended value (0.70), but for the dimensions general health and self control, the coefficients were equal to 0.63 and 0.69 (Table 5). However, Cronbach’s alpha value is dependent on the number of items: if these subscales had comprised ten items, the coefficients of these two dimensions would have been 0.85 and 0.88 (see third column of Table 5). Cronbach’s coefficients for the WHQ ranged from 0.56 (vasomotor symptoms) to 0.78 (depressed mood). Four dimensions had a coefficient greater than 0.70, but this was not the case for the other dimensions. However, if these latter dimensions had comprised ten items, their coefficients would have been at least 0.79.
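The text does not name the formula behind the ten-item projection, but the reported values are consistent with the Spearman-Brown prophecy formula. The sketch below assumes that formula was used; the function name `projected_alpha` is hypothetical.

```python
def projected_alpha(alpha: float, n_items: int, target_items: int = 10) -> float:
    """Spearman-Brown projection of a reliability coefficient to a longer scale.

    ASSUMPTION: the 'theoretical alpha if 10 items' was obtained this way.
    """
    k = target_items / n_items          # lengthening factor
    return k * alpha / (1 + (k - 1) * alpha)


# PGWBI general health: 3 items, alpha = 0.63
print(round(projected_alpha(0.63, 3), 2))   # 0.85
# PGWBI self-control: 3 items, alpha = 0.69
print(round(projected_alpha(0.69, 3), 2))   # 0.88
```

Both results match the values quoted for these two dimensions, which supports (without proving) the assumption about the formula.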

A total of 95 patients were evaluable for test-retest reliability, according to the stability of their clinical status (based on Kupperman index). The ICCs for each subscale are shown in Table 5, where values ≥ 0.80 were considered to be very satisfactory according to many authors. Results showed high test-retest reliability for four of the PGWBI dimensions and for the global index, but not for the anxiety and depressed mood dimensions, where the ICCs were lower than the recommended value (0.79 and 0.77). However, these values are still satisfactory. The test-retest reliability of the WHQ was high for six dimensions, acceptable for the attractiveness dimension, but poor for the dimensions vasomotor symptoms and sleep problems.

Table 4
Principal component analysis of the WHQ, with varimax rotation

Factor 1 (% V.E. = 4.36)
Item   WHQ: observed groupings                              Original dimension
12     I am more irritable than usual                       Depressed mood
11     I am restless and can’t keep still                   Depressed mood
3      I feel miserable and sad                             Depressed mood
9      I feel tense or ‘wound up’                           Anxiety/fears
15     I feel more tired than usually                       Somatic symptoms
2      I get very frightened                                Anxiety/fears
20     I am more clumsy than usual                          Memory
21     I feel rather lively and excitable                   Attractiveness

Factor 2 (% V.E. = 2.64)
28     My stomach feels bloated                             Menstrual symptoms
17     My breast feels tender or uncomfortable              Menstrual symptoms
22     I have abdominal cramps or discomfort                Menstrual symptoms
30     I often have a feeling of pins in hands and feet     Menstrual symptoms
35     I need to pass urine more frequently                 Menstrual symptoms

Factor 3 (% V.E. = 2.53)
23     I feel sick or nauseated                             Somatic symptoms
16     I have dizzy spells                                  Somatic symptoms
4      I feel anxious when I go out of house                Anxiety/fears
6      I get palpitations in my stomach or my chest         Anxiety/fears
5      I have lost interest in things                       Depressed mood
25     I have feelings of well-being                        Depressed mood

Factor 4 (% V.E. = 2.47)
32     I feel physically attractive                         Attractiveness
8      I feel life is not worth living                      Depressed mood
21     I feel rather lively and excitable                   Attractiveness
25     I have feelings of well-being                        Depressed mood
24     I have lost interest in sexual activities            Sexual behaviour
5      I have lost interest in things                       Depressed mood

Factor 5 (% V.E. = 2.16)
34     Due to vaginal dryness sexuality is uncomfortable    Sexual behaviour
31     I am satisfied with my current sexual relations      Sexual behaviour
24     I have lost interest in sexual activities            Sexual behaviour

Factor 6 (% V.E. = 1.90)
27     I suffer from night sweats                           Vasomotor symptoms
19     I have hot flushes                                   Vasomotor symptoms
7      I still enjoy the things                             Depressed mood

Factor 7 (% V.E. = 1.90)
33     I have difficulty concentrating                      Memory
36     My memory is poor                                    Memory

Factor 8 (% V.E. = 1.67)
29     I have difficulty in getting off to sleep            Sleep problems
1      I wake early then sleep badly                        Sleep problems

Factor 9 (% V.E. = 1.60)
18     I suffer from backache/or pain in my limbs           Somatic symptoms
14     I have headaches                                     Somatic symptoms

Factor 10 (% V.E. = 1.43)
13     I worry about growing old                            No dimension
35     I need to pass urine more frequently                 Menstrual symptoms

Table 5
Reliability of the PGWBI and WHQ subscales (internal consistency and test-retest reliability) a

Dimensions               Number of items   Cronbach’s alpha   Theoretical alpha if 10 items   Test-retest reliability (ICC) b
PGWBI
  Anxiety                5                 0.89               0.94                            0.79
  Depression             3                 0.82               0.94                            0.77
  Positive well-being    4                 0.83               0.93                            0.90
  Self-control           3                 0.69               0.88                            0.86
  General health         3                 0.63               0.85                            0.83
  Vitality               4                 0.83               0.93                            0.85
WHQ
  Depressed mood         7                 0.78               0.83                            0.82
  Somatic symptoms       5                 0.65               0.79                            0.85
  Memory                 3                 0.71               0.89                            0.85
  Vasomotor symptoms     2                 0.56               0.86                            0.70
  Anxiety                4                 0.74               0.88                            0.88
  Sexual behaviour       3                 0.78               0.92                            0.85
  Sleep problems         2                 0.56               0.87                            0.67
  Menstrual symptom      5                 0.68               0.81                            0.83
  Attractiveness         2                 0.63               0.90                            0.77

a In bold characters we have indicated the values considered as satisfactory.
b For this analysis, n = 95. ICC, intraclass correlation coefficient.
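The test-retest analysis can be illustrated with a short sketch. The stability filter follows the criterion given in the Methods (Kupperman index varying by no more than two points between D0 and D7). The Methods state only that the ICC was based on an analysis of variance, so the one-way random-effects form used below is an assumption, and the function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def is_clinically_stable(kupperman_d0: float, kupperman_d7: float) -> bool:
    """Stability criterion from the Methods: change of at most two points."""
    return abs(kupperman_d7 - kupperman_d0) <= 2


def icc_oneway(pairs: Sequence[Tuple[float, float]]) -> float:
    """One-way random-effects ICC for two measurements (D0, D7) per patient.

    ASSUMPTION: the source does not say which ANOVA-based ICC form was used.
    """
    n, k = len(pairs), 2
    grand_mean = sum(a + b for a, b in pairs) / (n * k)
    subject_means: List[float] = [(a + b) / 2 for a, b in pairs]
    # Between-subject and within-subject sums of squares
    ss_between = k * sum((m - grand_mean) ** 2 for m in subject_means)
    ss_within = sum(
        (a - m) ** 2 + (b - m) ** 2 for (a, b), m in zip(pairs, subject_means)
    )
    ms_between = ss_between / (n - 1)
    ms_within = ss_within / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)
```

As a check with made-up scores, perfectly reproduced measurements such as `[(1, 1), (2, 2), (3, 3)]` give an ICC of 1.0, and any within-patient disagreement lowers the coefficient.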

3.2.3. Clinical validity

The average scores of each sub-scale (PGWBI, WHQ) on D0 were correlated with external clinical variables. These external variables were Kupperman index (which acts as an external validator), the number of associated diseases (comorbidity), and the age of the patients. Table 6 shows the correlations and associated P-values of PGWBI and WHQ subscale scores with Kupperman index, number of concomitant diseases and age.

For the PGWBI, all the correlation coefficients between sub-scores and Kupperman index were at least 0.34 in absolute value, and for the global index, the correlation coefficient was 0.49 in absolute value. All results were statistically significant (P = 0.0001). Some PGWBI subscales (self-control, general health and vitality) were significantly correlated to the number of concomitant diseases, while the depression subscale was also significantly associated with age (the older the patient, the less depressed she appeared). These results confirmed the sensitivity of the PGWBI to differences in health states and level of symptoms.

For the WHQ, correlations between dimensional scores and the Kupperman index were significant for seven of the nine dimensions, but not for memory or sexual behaviour. This is not surprising because the Kupperman index does not assess memory and sexual activities. Depression and anxiety subscores were correlated with the number of concomitant diseases, and depression, vasomotor symptoms, menstrual symptoms subscores were inversely correlated with age (which means an improvement in quality of life with age). These results confirmed the clinical sensitivity of the WHQ and age as a confounding factor.
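The non-parametric correlations reported here are Spearman coefficients (see the column headings of Table 6). A minimal standard-library sketch is given below: it ranks both variables (average ranks for ties) and applies the Pearson formula to the ranks. The function names are hypothetical and P-values are not computed.

```python
from math import sqrt
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for t in range(i, j + 1):
            ranks[order[t]] = shared_rank
        i = j + 1
    return ranks


def spearman_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return sxy / sqrt(sxx * syy)
```

Because higher PGWBI scores indicate better well-being while a higher Kupperman index indicates more severe symptoms, a negative coefficient is the expected direction for the PGWBI; the WHQ, scored in the direction of distress, is expected to correlate positively.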


3.2.4. Sensitivity to clinical changes between D0 and D90

A total of 67 patients showing an improvement in the Kupperman index were included in the analysis. The sensitivity to change was checked after calculation of effect-size. Effect-size values are presented for the PGWBI and WHQ subscales in Table 7. Effect-size values indicated a medium sensitivity of the PGWBI questionnaire, and a medium or even a low responsiveness of the WHQ.

Table 6. Clinical validity of the PGWBI and WHQ questionnaires: correlations of subscale scores with the Kupperman index (a)

| Subscale | Kupperman index (Spearman's r) | P-value | Comorbidity (Spearman's r) (b) | P-value | Age (Spearman's r) | P-value |
|---|---|---|---|---|---|---|
| PGWBI | | | | | | |
| Anxiety | −0.4818 | 0.0001 | −0.0739 | 0.3606 | 0.1224 | 0.1293 |
| Depressed mood | −0.4244 | 0.0001 | −0.1217 | 0.1314 | 0.1606 | 0.0460 |
| Positive well-being | −0.3472 | 0.0001 | −0.0371 | 0.6466 | 0.0662 | 0.4132 |
| Self-control | −0.3956 | 0.0001 | −0.1787 | 0.0261 | 0.0323 | 0.6898 |
| General health | −0.3560 | 0.0001 | −0.1936 | 0.0158 | 0.0169 | 0.8347 |
| Vitality | −0.4351 | 0.0001 | −0.1859 | 0.0205 | 0.1004 | 0.2140 |
| Global index | −0.4940 | 0.0001 | −0.1509 | 0.0609 | 0.1051 | 0.1933 |
| WHQ | | | | | | |
| Depressed mood | 0.3811 | 0.0001 | 0.1780 | 0.0267 | −0.1992 | 0.0130 |
| Somatic symptoms | 0.4456 | 0.0001 | 0.1245 | 0.1228 | −0.1223 | 0.1293 |
| Memory | 0.0988 | 0.2211 | 0.0653 | 0.4198 | −0.1171 | 0.1468 |
| Vasomotor symptoms | 0.2091 | 0.0095 | −0.0390 | 0.6321 | −0.1718 | 0.0338 |
| Anxiety | 0.4438 | 0.0001 | 0.1797 | 0.0252 | −0.1363 | 0.0907 |
| Sexual behaviour (a) | −0.0034 | 0.9714 | 0.0040 | 0.9661 | 0.0179 | 0.9506 |
| Sleep problems | 0.3730 | 0.0001 | 0.0878 | 0.2772 | −0.0637 | 0.4313 |
| Menstrual symptoms | 0.2574 | 0.0012 | 0.0234 | 0.7723 | −0.2630 | 0.0009 |
| Attractiveness | 0.1588 | 0.0484 | −0.0467 | 0.5636 | −0.0229 | 0.7776 |

(a) In bold characters we have indicated the values considered as satisfactory. For this analysis n = 113.
(b) Defined as the number of concomitant diseases.

Table 7. Sensitivity to clinical change over time of the PGWBI and WHQ subscales (a)

Patients having changed between D0 and D90, according to the Kupperman index (n = 67).

| PGWBI and WHQ dimensions | Effect-size 1 | Effect-size 2 |
|---|---|---|
| PGWBI | | |
| Anxiety | 0.52 | 0.53 |
| Depression | 0.39 | 0.46 |
| Positive well-being | 0.32 | 0.33 |
| Self control | 0.41 | 0.42 |
| General health | 0.36 | 0.42 |
| Vitality | 0.58 | 0.58 |
| Global index | 0.36 | 0.42 |
| WHQ | | |
| Depressed mood | −0.38 | −0.36 |
| Somatic symptoms | −0.24* | −0.27* |
| Memory | −0.22* | −0.25* |
| Vasomotor symptoms | −0.72 | −0.57 |
| Anxiety | −0.29* | −0.36 |
| Sexual behaviour | −0.11* | −0.11* |
| Sleep problems | −0.45 | −0.47 |
| Menstrual symptoms | −0.31 | −0.29* |
| Attractiveness | −0.17* | −0.15* |

(a) *Effect-size < 0.30, meaning small sensitivity to changes.

4. Discussion

This study, conducted among 155 menopausal Italian women, provides information on the validation of the Italian versions of the PGWBI and the WHQ. However, it will be essential to perform further validation studies of the Italian version of the PGWBI in men and in other medical conditions. No QoL questionnaire was missing and there was a low percentage of missing data. The sample, recruited with the assistance of 11 gynaecologists, was seen on D0, D7 and D90. Although the sample was not representative of the general population because of the recruitment method, it did not seem to have any special characteristics. The mean age was 54.2 years, 82.6% of women were living with partners, 43% had not been educated to A-level and 50.3% were currently employed. The average Kupperman index was 27.69 at D0, 26.56 at D7 and 23.65 at D90. In all, 70.3% had had sexual intercourse during the last month; amongst these, 49.5% reported no problems of dyspareunia. A total of 9% of women reported bleeding, while 80% received no treatment for post-menopausal symptoms.

For the construct validity of the Italian version of the PGWBI in this specific study, we found that only three main aspects were assessed (anxiety/depression, self-control and general health/vitality) instead of the six dimensions expected. However, the global index could be used without any problem, given the importance of the general factor before rotation in the PCA (48.9% of the total variance extracted). Similar results concerning the PGWBI questionnaire were also found in an international study including ten countries [19]. In spite of the predominantly three-dimensional structure of the PGWBI in our study, the reliability of the six original dimensions does not pose any major problem: Cronbach's coefficients were good for four of the six PGWBI dimensions (at least 0.80 for most of the scales), and the test-retest reliability at 7 days was good for all six subscales. The PGWBI does not seem to pose any floor or ceiling effect problems in this type of population. Furthermore, the clinical validity study proved highly satisfactory in terms of correlations with the Kupperman index. Although the PGWBI is sensitive to problems of comorbidity, its sensitivity to a change in clinical status over a period of time was not very high but was, nevertheless, satisfactory.

For the Italian version of the WHQ, construct validity was considered satisfactory after deletion of two items and the transfer of three items from their original dimensions to other dimensions that seemed more appropriate. However, the more psychological dimensions of the WHQ (depression, anxiety/fears, somatic symptoms, attractiveness) were not found in the PCA, because the items in these dimensions formed composite factors (cf. factors 1 and 4). To some extent, this illustrates the difficulty of differentiating the dimensions of depression, anxiety and somatic symptoms. We are particularly aware that anxiety and depression scores are very frequently correlated to a considerable degree. These correlations are greater in self-administered questionnaires than in observer scales [20]. According to Clark and Watson [20], the relations between anxiety and depression measures depend on three factors: a major non-specific common factor of distress called negative affectivity; the lack of positive affectivity, which defines depression-specific anhedonia; and physiological hyperarousal, which accounts for the somatic dimension of anxiety. As for the reliability of the WHQ, the Cronbach's alpha coefficients were acceptable for four of nine subscales, but several subscales would have reached a satisfactory level simply with a higher number of items.
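
The link between internal consistency and the number of items can be made concrete with a small sketch. It computes Cronbach's alpha [13] from item scores and then applies the Spearman-Brown formula, a standard psychometric result that is not used in this paper and is brought in here only for illustration, to project the reliability of a lengthened scale. The item scores and the starting reliability of 0.60 are hypothetical.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha; item_scores holds one list of respondent scores per item."""
    k = len(item_scores)
    # Total score of each respondent across all items.
    totals = [sum(responses) for responses in zip(*item_scores)]
    item_variance_sum = sum(variance(item) for item in item_scores)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


def spearman_brown(reliability, length_factor):
    """Projected reliability when the number of items is multiplied by length_factor."""
    return length_factor * reliability / (1 + (length_factor - 1) * reliability)


# Hypothetical 3-item subscale answered by four respondents.
items = [
    [1, 2, 3, 4],
    [2, 2, 4, 4],
    [1, 3, 3, 4],
]
alpha = cronbach_alpha(items)

# A hypothetical subscale with reliability 0.60, doubled in length.
projected = spearman_brown(0.60, 2)
print(round(alpha, 3), round(projected, 2))
```

Under these assumptions, a short subscale with a reliability of 0.60 would be projected to reach 0.75 if its length were doubled with comparable items, which is the kind of gain the preceding sentence alludes to.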
The test-retest reliability at 7 days proved more satisfactory for seven of the nine dimensions (ICCs ≥ 0.77). For only two subscales (vasomotor symptoms and sleep problems), a floor effect was found, indicating that the patients were very concerned by these health problems. The clinical validity study proved satisfactory in terms of correlations with the Kupperman index (except for the subscales which clearly measured phenomena other than those assessed by this index, such as memory and sexual behaviour). Sensitivity to clinical changes over a period of time was moderate or low for attractiveness, sexual behaviour and memory. The WHQ's relatively low sensitivity to change was also observed (but to a lesser degree) in the PGWBI and was probably related to the fact that most women were not on hormone replacement therapy, so that their scores did not really improve; indeed, the score improvement was small.

In addition, the Kupperman index used to determine clinical change was unbalanced by the relatively high weighting of hot flushes, which have a multiplicative coefficient of 4. This is detrimental to other symptoms such as vaginal dryness, for example. This index is certainly not ideal for measuring clinical validity. A recent article criticising the Blatt-Kupperman index was published while our study was being conducted [21]. Apart from its questionable weighting system, the index totally omits vaginal dryness and decreased libido. However, a large amount of published work is based on this index and it captures both the physician and patient viewpoints. Hormonal assays might have been a better clinical criterion but we did not want to subject our patients to laboratory investigations. Thus, the use of this index puts our study somewhere between clinical validity and construct validity.

More generally, the reasons for the less than perfect reproducibility of the PGWBI and WHQ factors, compared with the earlier analyses, must be considered at three levels: the subjects, the items and the methods. For a factorial structure to be reproducible, the sample should present the same sociodemographic and clinical characteristics as the original. This was not really feasible in our study.
Our sample was recruited from clinic patients, whereas the two questionnaires were validated in non-clinical samples. The women in Hunter's sample [6] were noticeably younger than those in our study: their average age was closer to 52 (i.e. 2 years younger than in our study), and more of them were in active employment (66% as opposed to 50% in our sample). Hunter's sample [6] included a number of pre-menopausal women but excluded surgical menopauses, whereas these were retained in our sample. The exclusively female composition and the particular age group may have had effects on the PGWBI's factorial structure, but we are not really able to explain why this might be. However, we do know that for a supposed dimension to appear in a factorial analysis, the sample must present sufficient variability in this dimension. For example, a general depression factor may not appear easily during the PCA of a depression scale if this is carried out on severely depressed patients.

For the items, inter-item correlations may be distorted by distribution asymmetry, as was the case for several items in our study and especially for the WHQ, whose items have a small and even number of response categories.

As to the methods, the factor-loading (saturation) threshold retained by Hunter was 0.300, whereas ours was more restrictive (0.400). Other factorial analyses of the WHQ were performed by the Swedish team of Wiklund et al. [22]: these were alpha factor analyses and not PCAs. Their results were closer to Hunter's than to ours, although the sleep items did not form a separate factor in one of their analyses, where they were associated with vasomotor items; this was not the case in their second analysis.

It was no doubt unrealistic to expect perfect reproducibility of the factorial structures for both the PGWBI and the WHQ. Furthermore, the problems of replicating a factorial structure are frequently encountered and only rarely constitute a practical obstacle to using a scale. For example, Hamilton's depression scale is very widely used in spite of its factorial instability [23].
Nevertheless, the consequences of the factorial instability are not the same for the PGWBI and for the WHQ, in that a global score can be used for the PGWBI (the first factor before rotation is clearly a general factor) but not for the WHQ. Some dimensions of the WHQ pose greater construct validity problems than others, especially the more psychological dimensions (depression, anxiety/fear, attractiveness) and the somatic symptoms dimension, which assesses somatic signs that are not specific to the menopause.


To conclude, the Italian versions of the PGWBI and the WHQ are reliable and useful for HRT clinical trials, but the dimensional scores must be calculated bearing in mind the limitations in the construct validity. Other studies are needed to improve the factorial stability of certain WHQ dimensions, and on a more general scaling level it may well prove useful to replace the 4-level coding system by a 5-level coding system. This would reduce the distributional asymmetry. For the Italian version of the PGWBI, the validation process is to be completed by studies of mixed populations suffering from other types of disease. Consideration has to be given, however, to the fact that these two questionnaires measure only certain aspects of quality of life, notably mood, well-being and physical symptoms, without assessing functional status and interference of symptoms with daily life. Nevertheless, there are other questionnaires which assess these aspects of life quality, which are perhaps less affected by menopause than by established disease conditions. In this regard, the term health-related quality of life (HRQoL) is appealing, although it is conceptualized in very different ways in the literature.

ISGQL: Italian Study Group on Quality of Life

The principal investigators for the ISGQL were as follows:

- Andrea R. Genazzani, Marco Gambacciani, Massimo Ciaponi. Clinica Ostetrica e Ginecologica, Ospedale Santa Chiara, Università degli Studi di Pisa, Pisa. Tel: (+39) 050 992615
- Pier Giorgio Crosignani, Fiorenza Bruschi, Raffaela Di Pace. I Clinica Ostetrica e Ginecologica, Istituti Clinici di Perfezionamento, Università degli Studi di Milano, Milano. Tel: (+39) 02 57992267
- Francesco Bottiglioni, Domenico De Aloysio, Alessandra Roncuzzi. Clinica Ostetrica e Ginecologica, Ospedale Sant'Orsola, Università degli Studi di Bologna, Bologna. Tel: (+39) 051 6363339
- Carlo Zara, Franco Polatti, Rossella Colleoni. Clinica Ostetrica e Ginecologica, Policlinico San Matteo, Università degli Studi di Pavia, Pavia. Tel: (+39) 0382 503721
- Piero Sismondi, Nicoletta Biglia, Riccardo Roagna. Cattedra di Ginecologia Oncologica, Ospedale Mauriziano, Università degli Studi di Torino, Torino. Tel: (+39) 011 5080427
- Gianfranco Scarselli, Maria Sandra Bucciantini, Luisa Bigozzi. Clinica Ostetrica e Ginecologica II, Ospedale Careggi, Università degli Studi di Firenze, Firenze. Tel: (+39) 055 4277362
- Salvatore Di Leo, Salvatore Sciacchitano, Grazia De Luca. I Clinica Ostetrica e Ginecologica, Ospedale Vittorio Emanuele, Università degli Studi di Catania, Catania. Tel: (+39) 095 7435492
- Piero Capetta, Federico Sallusto. 5 Clinica Ostetrica e Ginecologica, Ospedale Macedonio Melloni, Università degli Studi di Milano, Milano. Tel: (+39) 02 7523256
- Ettore Cittadini, Marcello Mezzatesta, Leopoldo Di Cara. Clinica Ostetrica e Ginecologica, Policlinico Universitario, Università degli Studi di Palermo, Palermo. Tel: (+39) 091 6552038
- Emilio Imparato, Aurelio Storace, Maria Giuseppina Piga. Divisione di Ostetricia e Ginecologia, Ente Ospedaliera Galliera, Genova. Tel: (+39) 010 5632439
- Giovan Battista Serra, Stefania Ricci. Divisione di Ostetricia e Ginecologia, Ospedale Generale di Zona 'Cristo Re', Roma. Tel: (+39) 06 6275741

Acknowledgements

The research was supported by the Ethical Medical Department of BRACCO S.p.A. The authors would like to acknowledge the Italian Study Group on Quality of Life for actively participating in the screening and recruitment of the patients, the supervision of self-administration of the questionnaires by the patients and the reporting of all clinically relevant data in the case record forms (CRFs).

References

[1] Oldenhave A, Jaszmann LJB, Ary A, et al. Impact of climacteric on well-being. Am J Obstet Gynecol 1993;168(3)-1:772–80.
[2] Kupperman H, Blatt MH, Wiesbader H, et al. Comparative clinical evaluation of estrogenic preparation by the menopausal and amenorrheal indices. J Clin Endocrinol 1957;13:688–703.
[3] Dupuy HJ. The psychological general well-being (PGWB) index. In: Wenger NK, Matson MI, Furberg CD, Elinson J, editors. Assessment of Quality of Life in Clinical Trials of Cardiovascular Therapies. NY, USA: Le Jacq Publ Inc, 1984:170–83.
[4] Hunter MS, Battersby R, Whitehead MI. Relationships between psychological symptoms, somatic complaints and menopausal status. Maturitas 1986;8:217–28.
[5] Hunter MS. Psychological and somatic experience of the menopause: a prospective study. Psychosom Med 1990;52:357–67.
[6] Hunter MS. The women's health questionnaire: a measure of mid-aged women's perceptions of their emotional and physical health. Psychol Health 1992;0:1–10.
[7] Limouzin-Lamothe MA, Mairon N, Joyce CRB, et al. Quality of life after the menopause: influence of hormonal replacement therapy. Am J Obstet Gynecol 1994;170(2):618–24.
[8] Wiklund I, Berg G, Hammar M, Karlberg J, Lindgren R, Sandin K. Long-term effect of transdermal hormonal therapy on aspects of quality of life in postmenopausal women. Maturitas 1992;14:225–36.
[9] Wiklund I, Karlberg J, Lindgren R, et al. A Swedish version of the women's health questionnaire, a measure of postmenopausal complaints. Acta Obstet Gynecol Scand 1993;72:648–55.
[10] Guyatt GH. The philosophy of health-related quality of life translation. Qual Life Res 1993;2:461–5.
[11] Guillermin F, Bombardier C, Beaton D. Cross-cultural adaptation of health-related quality of life measures: literature review and proposed guidelines. J Clin Epidemiol 1993;46:1417–32.
[12] Hays RD, Hayashi T. Beyond internal consistency reliability: rationale and user's guide for Multitrait Analysis Program on the microcomputer. Behav Res Methods, Instruments Comput 1990;22(2):167–75.
[13] Cronbach LJ. Coefficient alpha and the internal structure of tests. Psychometrika 1951;16:297–334.
[14] Deyo RA, Diehr P, Patrick DL. Reproducibility and responsiveness of health status measures: statistics and strategies for evaluation. Control Clin Trials 1991;12:142S–58S.
[15] Deyo RA, Centor RM. Assessing the responsiveness of functional scales to clinical change: an analogy to diagnostic test performance. J Chronic Dis 1986;39(11):897–906.
[16] Kazis LE, Anderson JJ, Meenan RF. Effect sizes for interpreting changes in health status. Med Care 1989;27(Suppl. 3):S178–89.
[17] Cohen J. Statistical Power Analysis for the Behavioral Sciences. New York: Academic Press, 1977.
[18] Kaiser HF. A second generation little Jiffy. Psychometrika 1970;35:401–15.
[19] Marquis P, Dubois D. The psychological general well-being (PGWB) index: scores of a representative sample of the general population in ten countries. Qual Life Res 1997;6:689.
[20] Clark LA, Watson D. Tripartite model of anxiety and depression: psychometric evidence and taxonomic implications. J Abnorm Psychol 1991;100(3):316–36.
[21] Alder E. The Blatt-Kupperman menopausal index: a critique. Maturitas 1998;29(1):19–24.
[22] Wiklund I, Karlberg J, Mattson L-A. Quality of life of post-menopausal women on a regimen of transdermal estradiol therapy. A double-blind, placebo-controlled study. Am J Obstet Gynecol 1993;168:824–30.
[23] Guelfi JD. L'Échelle de dépression d'Hamilton. In: Guelfi JD, editor. L'Évaluation Clinique Standardisée en Psychiatrie. Castres: Editions Médicales Pierre Fabre, 1993:187–96.