Development and validation of the PCQ: A questionnaire to measure the psychological consequences of screening mammography


Soc. Sci. Med. Vol. 34, No. 10, pp. 1129-1134, 1992. Printed in Great Britain. All rights reserved

0277-9536/92 $5.00 + 0.00 Copyright © 1992 Pergamon Press Ltd

DEVELOPMENT AND VALIDATION OF THE PCQ: A QUESTIONNAIRE TO MEASURE THE PSYCHOLOGICAL CONSEQUENCES OF SCREENING MAMMOGRAPHY

JILL COCKBURN,1 TRUDY DE LUISE,1 SUSAN HURLEY2* and KERRIE CLOVER3

1Centre for Behavioural Research in Cancer, 2Cancer Epidemiology Centre, Anti-Cancer Council of Victoria, 1 Rathdowne St, Carlton South, 3053 Victoria, Australia and 3Discipline of Behavioural Science in Relation to Medicine, University of Newcastle, New South Wales, Australia

*Present address: Department of Social and Preventive Medicine, Monash University, Melbourne, Victoria, Australia.

Abstract-We have developed a reliable and valid questionnaire to measure the psychological consequences of screening mammography. The questionnaire measures the effect of screening on an individual’s functioning on emotional, social, and physical life domains. Content validity was ensured by extensive review of the relevant literature, discussion with professionals and interviews with attenders at a pilot Breast X-ray Screening Program in Melbourne, Australia. Discriminant validity was assessed by having expert judges sort items into the dimensions which they appeared to be measuring. Acceptable levels of concordance (above 80%) with a priori classifications were found. Concurrent validity was demonstrated by comparison of subscale scores of 53 attenders at the Breast X-ray Program with an independent interview assessment of dysfunction on each of the emotional, social and physical dimensions. There was over 79% agreement between interview scores and questionnaire scores for each dimension. Construct validity was confirmed by showing that subscale scores varied in predicted ways. For women who were recalled for further investigation, scores on each subscale measuring negative consequences were higher at the recall clinic than at the screening clinic (emotional: t = -7.28; df = 70; P < 0.001; physical: t = -2.53; df = 70; P = 0.014; social: t = -2.49; df = 70; P = 0.015). The internal consistency of all subscales was found to be acceptable. This questionnaire is potentially useful for assessing the psychological consequences of the screening process and should have wide application.

Key words-psychological consequences, questionnaire, mammography

INTRODUCTION

Screening for breast cancer may lead to a worthwhile reduction in deaths from breast cancer [1, 2]. However, the benefits of screening mammography in terms of mortality reduction need to be carefully weighed against any other consequences [3]. Such consequences may be economic costs, for example the use of health service resources and the personal expenses involved with attendance, or they may be less tangible, for example any positive or negative psychological consequences for participants.

Despite the importance of the issue [4] there have been only a few attempts to measure the psychological consequences caused by attendance for screening [5, 6]. Additionally, there are some problems in the interpretation of the results of previous research, especially in studies which used the General Health Questionnaire (GHQ) as a measure of psychological consequences. The GHQ was designed to measure non-psychotic psychiatric morbidity in a general practice population [7], and therefore detects psychological morbidity of unknown aetiology. In addition, the questionnaire only asks about changes in psychological well-being over a period of a few weeks before completion of the questionnaire. Any long term sustained change in wellbeing caused by an event in the past, such as attendance at screening, would not be detected. Also, the GHQ focuses on negative consequences, and does not take into account any perceived positive consequences from screening. This limitation also applies to another study which reports on the quality of life after a false positive mammogram [8].

It seems essential, therefore, that an instrument be developed which measures psychological consequences specific to the screening process, is capable of detecting any long term changes which might be the result of attendance for screening, and examines not only the negative but also the positive consequences of screening.

There are a number of issues involved in the measurement of psychological consequences caused by screening. The first is an adequate definition. It seems reasonable to regard a worry or concern which has reached a level where it has a significant impact on the normal functioning of an individual as a ‘negative consequence’. The second issue involves an adequate conceptualisation of normal functioning. Commonly, quality of life and health status scales have regarded a deficit in functioning on any of the major life domains (emotional, social and physical) as detrimental [9-11]. The framework is also relevant for examining any positive consequences which women may experience from attending a mammographic screening program. For example, consequences such as a ‘greater sense of well being’ or ‘feeling more relaxed’ may occur through getting a clear result after screening. It therefore appears appropriate to measure any positive consequences within the same framework of major life domains.

Two other measurement issues which must be considered in the development of any questionnaire are reliability and validity. Reliability refers to the extent to which measurements are repeatable and measurement error is small [12]. Validity refers to the extent to which the questionnaire measures that which it is purported to measure. The main types of validity criteria are predictive, content, discriminant, concurrent, and construct [13]. Predictive validity will be addressed in future studies relating scores on psychological consequences scales to subsequent behaviour, in particular attendance for subsequent mammography, and is not further considered in this paper. Content validity refers to the extent to which items represent the domain of the phenomenon under study, and is assessed by comparing items logically to the target domain (or domain of interest). Discriminant validity involves determining whether items measuring one dimension can be distinguished from items measuring another dimension. Concurrent validity refers to the agreement between scores obtained on the questionnaire under study and scores on another standard measure assessing the same characteristic. Construct validity is the extent to which measurement corresponds to theoretical concepts so that scores vary in predictable ways.

The aim of the series of studies described in this paper was to develop a reliable and valid measure of the psychological consequences for women attending a mammography screening program. Standard psychometric procedures were used to develop and field test the questionnaire (the PCQ).
The project consisted of six distinct studies as follows: (1) content validation; (2) discriminant validation; (3) pilot testing of the instrument and item analysis; (4) concurrent validation; (5) testing internal consistency of subscales; and (6) construct validation.

STUDY 1: CONTENT VALIDATION

The aims of Study 1 were to ensure content validity, first by identifying all relevant life domains where worry and concern engendered by the screening process might be expected to have an impact on normal functioning, and secondly by developing a pool of items which would adequately measure the identified domains.

Method

A review of the literature of previously validated quality of life, health status and psychological morbidity scales [7-11, 14-17] and discussions with researchers and clinical workers suggested that three domains (emotional, social and physical) were necessary to adequately describe the normal functioning of an individual.

A pool of 16 items to measure these domains was derived from a review of published scales [7-11, 14-17] and the ideas of the research team and clinical associates. In addition, 48 women attending the Breast X-ray Program at the Essendon Hospital in Victoria, Australia were interviewed in 1990 prior to having a mammogram to gain ideas for item content and for colloquial language to express concepts. Two separate sets of interviews were conducted. The first set was concerned with the negative consequences of screening, the second with the positive consequences.

Interviews relating to negative consequences

Fifteen consecutive attenders at screening clinic and 22 consecutive attenders at recall clinics were asked open ended questions about what they had been thinking since making the decision to come to the clinic (screening clinic participants) or learning they had to come back (recall clinic participants). Women were specifically asked whether they had been worried or concerned about the visit. If any worry or concern was expressed, three open ended questions were asked about the ways that this may have affected them in terms of the things they normally do (physical dimension); the way they normally feel (emotional dimension); and the way they react to others (social dimension). In addition, 16 items from the initial item pool were presented to determine their relevance to the women. These items asked women to rate, on a four point scale, the frequency with which thoughts and feelings about breast cancer had caused a number of symptoms, e.g. “found yourself noticeably withdrawing from friends and relatives”, “felt under strain”, “had difficulty doing things around the house which you would normally do”.

Interviews relating to positive consequences

Women were approached while waiting at the recall clinic and asked to participate, and the 11 who agreed were given eight items from the initial pool reworded to emphasise the positive consequences of screening. These questions asked whether their experience at the Breast X-ray Program had led to a number of positive consequences. The consequences listed corresponded wherever possible to the ‘negative consequences’, e.g. “improved relations with friends and family”, “felt more hopeful about the future”, “felt more able to do things which you normally do”. Women were also asked an open ended question about any other positive consequences of their clinic visit. Women were telephoned one week after their visit and asked the same questions over the telephone.

Results

Interviews relating to psychological costs

Response rate. All women who were approached consented to be interviewed. Thirty-seven interviews were conducted, 15 with women at screening clinic and 22 with women at recall clinic.

Thinking or feeling about clinic visit. Responses mentioned by at least one respondent after open ended questioning which might be indicative of psychological distress (and the number of women mentioning them) were: feeling uncomfortable (1), being anxious (3), feeling tense (1), feeling concerned (5), feeling off-colour (1), feeling worried (2), feeling nervous (1), feeling shocked (1), and feeling upset (1). Two women mentioned they had tried to keep busy so that they wouldn’t think about the clinic visit.

Effect on major life dimensions. Responses mentioned after open ended questioning about each dimension were:

Physical: affecting sleep (1); unable to do things around the house (1); more busy than usual (1).
Emotional: unhappy/depressed (3); anxious (4); scared (1); short tempered (2).
Social: taking things out on others/aggressive with others (2); distracted/withdrawn (2); resentful of others (1).

Many of the symptoms elicited by open-ended questioning were in the initial pool of items. Four items were infrequently mentioned by respondents from both recall and screening clinics and were removed from the questionnaire at this stage. The wording of two items was amended and five items suggested by open-ended questioning were added.

Positive consequences interviews

Response rate. Twelve women were approached at recall clinics and 11 agreed to participate. Of these, 9 were recontacted by telephone 1 week later.

Interview results. Two women at recall clinics felt they could not answer some or all of the questions because they did not seem relevant before results were received. For other women, there was an increase in the reporting of all benefits at time 2 (one week after recall clinic). The particular items used therefore seem appropriate. The open ended questioning also suggested three further items which were added to the questionnaire at this stage.

The result of these studies was a 28 item questionnaire containing negative items and positive items which appeared to adequately cover the relevant dimensions. By the counts given above, this corresponds to 17 negative items (16 - 4 + 5) and 11 positive items (8 + 3).

STUDY 2: DISCRIMINANT VALIDATION

The aim of Study 2 was to determine whether the postulated dimensions, physical, social and emotional, could be distinguished on the basis of item content.


Method

A modified Q sort procedure was used. Each item was written on a card and the cards were given to eight expert research psychologists. They were asked to classify the items according to the particular dimensions which they seemed to represent. The classification of items measuring both positive and negative consequences was examined.

Results

For every item, the expert judges’ classification by dimension agreed with the a priori classification in at least six of the eight judgements. For 19 of the 28 items presented, all judges agreed with the a priori classification. For a further seven items only one judge did not agree with the a priori classification, and for the other two items only two judges did not agree.

STUDY 3: FIELD-TESTING THE PILOT INSTRUMENT

The aim of Study 3 was to field test the pilot instrument to determine the relevance of individual items. We planned to exclude items at this stage if:

• few women reported experiencing the symptom;
• there was no variation between screening and recall responses;
• they were consistently not completed.

Method

An interviewer asked women at both the screening and recall clinics to complete the pilot questionnaire. After the women had completed it, the interviewer asked a series of standard questions to determine whether there were any difficulties associated with its completion.

Results

Response rate. Twenty women from the screening clinic and 18 women from the recall clinic were approached and all agreed to complete the questionnaire.

Negative consequences. Five items (four emotional and one social item) were excluded at this stage because they met one or more of the exclusion criteria. Table 1 lists the items in the final questionnaire, which we call the PCQ (Psychological Consequences Questionnaire). In the negative consequences section there were five items in the emotional scale, four items in the physical scale and three items in the social scale.

Positive consequences. There were problems with the completion of the positive consequences section for both screening and recall women. At screening clinic, 7/20 women did not complete the section, stating that they did not feel it was appropriate to do so before results were received. A further two women, although completing the section, claimed when questioned about it that it did not seem relevant for them. At recall clinic, 6/18 did not complete the section, while one other woman only completed half of it. Seven women in total claimed, on questioning, that the section did not appear relevant. Four women also gave responses for the positive consequences section which were inconsistent with responses to the negative consequences section. Given the large numbers of women who had difficulties understanding the purpose of this section, it was apparent that the positive consequences section was only appropriate after results had been received.

One item was excluded from the positive consequences section at this stage, as its counterpart ‘negative’ item had been excluded. This left five questions in the emotional scale, three questions in the physical scale and two questions in the social scale for the positive consequences section. These items are listed in Table 1.

Table 1. Items and layout of the PCQ*

We would like to find out about women’s experiences of the Breast X-ray Program. We would therefore like you to answer the questions on this questionnaire as best you can.

Over the last week how often have you experienced the following things because of thoughts and feelings about breast cancer:

Response options for each item: 0 = Not at all; 1 = Rarely; 2 = Some of the time; 3 = Quite a lot of the time

Subscale  Item
(P)       had trouble sleeping
(P)       experienced a change in appetite
(E)       been unhappy or depressed
(E)       been scared and panicky
(E)       felt nervous or strung up
(E)       felt under strain
(S)       found you have been keeping things from those who are close to you
(S)       found yourself taking things out on other people
(S)       found yourself noticeably withdrawing from those who are close to you
(P)       had difficulty doing things around the house which you normally do
(P)       had difficulty meeting work or other commitments
(E)       felt worried about your future

All things considered, would you say your experiences at the Breast X-ray Program have caused any of the following:

Response options for each item: 0 = Not at all; 1 = A little bit; 2 = Quite a bit; 3 = A great deal

Subscale  Item
(E)       A sense of reassurance that you do not have breast cancer
(E)       Feeling more relaxed
(S)       Improved relationship with friends or relations
(P)       Feeling more able to do things which you normally do
(P)       Feeling more able to meet your home and/or work responsibilities
(E)       Feeling more hopeful about the future
(E)       Feeling less anxious about breast cancer
(S)       Getting on better with those around you
(P)       Been sleeping better
(E)       A greater sense of well being

N.B.: the positive consequences section is only given to participants after results have been received.
*Letters in parentheses indicate subscale: E = Emotional; P = Physical; S = Social.

STUDY 4: CONCURRENT VALIDITY

The aim of Study 4 was to compare different measurements of the psychological consequences of screening. This involved comparing the scores on the subscales of the questionnaire with an independent interview assessment of respondents’ level of functioning on each dimension.

Method

A trained clinical interviewer approached consecutive attenders before their mammogram at both the initial screening clinic and recall clinic. After women completed the questionnaire, the interviewer asked some further questions, within a loosely structured framework, to elicit further information about the physical, emotional and social effects of attending the program. The interviewer then rated the degree of apparent dysfunction on each dimension separately. The questionnaire scores were not known to the interviewer at the time of assessment.

Results

Response rate. Thirty consecutive attenders at screening clinic were approached and consented to participate. A further 23 consecutive attenders at the recall clinics were approached and agreed to take part.

Comparison. The possible range of scores for each subscale of the negative consequences section of the PCQ was divided into four equal class intervals. The lowest class interval, representing the bottom 25% of the range of possible scores, was taken to indicate no or minimal dysfunction on the particular dimension. Scores within the 2nd, 3rd and 4th class intervals were taken to indicate mild, moderate and marked disturbance respectively. The subscale scores for each individual were computed by adding together responses to items within subscales. For subscales with three or more items, if there was only one item missing from a subscale then the mean of the other items in that subscale for that individual was substituted. If more than one item was missing, then a subscale score was not calculated. Subscale scores were classified as belonging to one of the four categories of dysfunction.

The rating schema used by the interviewer had five categories. These were: 0 = no dysfunction; 1 = minimal dysfunction; 2 = mild dysfunction; 3 = moderate dysfunction and 4 = marked dysfunction. In order to assess concurrence between interview and questionnaire scores, for each dimension an interview score of 0 or 1 (no or minimal dysfunction) was taken to be equivalent to a score in the first class interval for the corresponding subscale of the PCQ. A score of 2 or above on the relevant interview dimension (mild dysfunction or above) was taken to be equivalent to scores in the 2nd, 3rd or 4th class interval of the corresponding subscale score of the PCQ. A comparison was made of each individual’s interview and subscale scores. The criterion used for evidence of concurrent validity for each subscale was >75% of respondents showing agreement between the two measures [18].

For the physical subscale 89% of respondents showed agreement between the two measures, for the emotional subscale 79% of respondents showed agreement, and for the social subscale there was agreement in 94% of cases.
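The scoring and comparison rules described for Study 4 can be sketched in Python as follows. This is an illustration only, not the authors' code: the function names are hypothetical, the paper does not say which class interval a score lying exactly on a boundary belongs to (the sketch assigns it to the higher interval), and returning no score for a two-item subscale with a missing item is likewise an assumption.

```python
from typing import Iterable, Optional, Sequence, Tuple

MAX_ITEM_SCORE = 3  # every PCQ item is rated 0-3 (Table 1)


def subscale_score(responses: Sequence[Optional[int]]) -> Optional[float]:
    """Sum the item responses of one subscale for one respondent.

    None marks a missing item. A single missing item in a subscale of
    three or more items is replaced by the mean of the answered items.
    Any other pattern of missing data yields no score (for two-item
    subscales this is an assumption; the paper is silent on that case).
    """
    answered = [r for r in responses if r is not None]
    missing = len(responses) - len(answered)
    if missing == 0:
        return float(sum(answered))
    if missing == 1 and len(responses) >= 3:
        return sum(answered) + sum(answered) / len(answered)
    return None


def class_interval(score: float, n_items: int) -> int:
    """Return 1-4: the quarter of the possible score range containing score.

    Assumption: a score exactly on a boundary goes to the higher interval,
    except the maximum possible score, which stays in interval 4.
    """
    max_score = n_items * MAX_ITEM_SCORE
    return min(4, int(4 * score / max_score) + 1)


def measures_agree(interview_rating: int, interval: int) -> bool:
    """Interview 0-1 corresponds to interval 1; interview 2-4 to intervals 2-4."""
    return (interview_rating <= 1) == (interval == 1)


def percent_agreement(pairs: Iterable[Tuple[int, int]]) -> float:
    """Percentage of (interview_rating, interval) pairs that agree."""
    pairs = list(pairs)
    return 100.0 * sum(measures_agree(r, k) for r, k in pairs) / len(pairs)
```

For example, a physical subscale answered 1, missing, 2, 3 is scored 6 + 2 = 8 out of a possible 12, which falls in the third class interval (moderate disturbance) and therefore agrees with any interview rating of 2 or above.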

Table 2. Mean and SD of the emotional, physical and social subscales* at screening and recall clinic

                            Emotional         Physical          Social
Point of data collection    Mean   (SD)       Mean   (SD)       Mean   (SD)
Screening clinic            1.62   (2.97)     0.60   (1.80)     0.61   (1.57)
Recall clinic               5.01   (4.71)     1.38   (2.44)     1.45   (2.38)

*n = 77 for each subscale because of some missing data.

STUDY 5: INTERNAL CONSISTENCY

The aim of Study 5 was to examine reliability by estimating Cronbach’s alpha, which measures the internal consistency of scales [12]. Following Helmstadter’s recommendations, estimates of 0.5 or above are considered sufficient measures of the internal consistency of scales to be used for group comparisons [19].

Method

Women in this study completed a questionnaire at the screening clinic while waiting for their mammogram.

Results

Data were collected for 1722 women. Cronbach’s alpha for the emotional subscale was 0.89; for the physical subscale 0.77; and for the social subscale 0.78.

STUDY 6: CONSTRUCT VALIDITY

The aim of Study 6 was to assess the construct validity of the questionnaire by comparing scores obtained from women while waiting at the screening clinic with their scores obtained while waiting at the recall clinic. It was predicted that all negative consequences subscale scores obtained at recall clinic would be significantly higher (indicating a higher level of dysfunction) than those obtained at screening clinic [5].

Method

Women in this study completed a questionnaire at the screening clinic while waiting for their mammogram. Women whose mammogram showed a suspicious lesion which on further investigation turned out to be benign (‘false positive’ group) completed another questionnaire while waiting at recall clinic.

Results

There were 78 women who completed questionnaires at both screening clinic and recall clinic. Table 2 shows the mean and SD for each subscale at these points of data collection. For each subscale, a paired t-test showed that there were significantly higher scores at recall clinic than at screening clinic: Emotional: t = -7.41; df = 76; P < 0.001; Physical: t = -2.90; df = 76; P = 0.005; Social: t = -2.84; df = 76; P = 0.006.
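The two statistics reported for Studies 5 and 6, Cronbach's alpha and the paired t statistic, follow standard textbook formulas and can be sketched in Python as below. The function names and the input layout (one row of item responses per respondent) are assumptions made for illustration; the paper does not describe the software used, and no study data are reproduced here.

```python
import math
from statistics import mean, variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for one subscale.

    item_scores: list of rows, one per respondent, each row holding that
    respondent's responses to the k items of the subscale (k >= 2).
    alpha = k / (k - 1) * (1 - sum of item variances / variance of totals)
    """
    k = len(item_scores[0])
    item_vars = [variance([row[j] for row in item_scores]) for j in range(k)]
    total_var = variance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


def paired_t(first, second):
    """Paired t statistic and degrees of freedom for first minus second.

    first and second hold the two scores of the same respondents in the
    same order, e.g. screening clinic scores and recall clinic scores.
    """
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    standard_error = math.sqrt(variance(diffs) / n)
    return mean(diffs) / standard_error, n - 1
```

With scores passed as (screening, recall), a negative t indicates higher scores at recall, which matches the sign of the values reported for Study 6. The P value would then be read from a t distribution with n - 1 degrees of freedom, a step this sketch leaves out.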

DISCUSSION

The series of studies outlined in this paper shows how systematic procedures were used to develop a reliable and valid questionnaire to measure the psychological consequences of screening mammography. The techniques used to assess content validity meant that all relevant domains were included in the questionnaire, and that the content of items adequately represented these domains. They also served to ensure that items and response formats were acceptable to and understood by respondents. Evidence was also obtained for the discriminant, construct and concurrent validity of the PCQ: classification of items by expert judges agreed with the a priori classification of items into dimensions, the measurements from different points of data collection varied in a predictable manner, and there was acceptable concurrence between different forms of measurement. There was also evidence for the reliability of the PCQ, as all subscales showed adequate internal consistency. The results give confidence that the PCQ meets methodological standards for achieving an adequate representation of respondents’ perceptions.

The PCQ is a potentially useful tool for assessing the psychological consequences of the screening process, as the effect of the screening process on three dimensions of normal functioning is measured and scored separately. It allows an assessment not only of the negative consequences of screening but also of any positive consequences after results are received. The questionnaire is short, taking less than 5 min to complete. We found that the response formats and self-completion techniques were acceptable.

Given the importance of psychological consequences in assessment of the costs and benefits of screening programs, the PCQ should have a wide application, particularly in studies where aspects of the service are varied and psychological consequences are treated as a dependent variable. We are currently using this questionnaire in a longitudinal study in which a cohort of women is followed through all stages of the screening process, which will enable us to report on the psychological consequences of attendance for screening mammography.

It should be noted that although the PCQ was designed specifically to examine the psychological consequences of breast cancer screening, it might also be applied to a variety of patient groups by the substitution of medical-condition specific terms. However, we would recommend further psychometric testing if it were to be used in populations other than the one for which it was developed.

Acknowledgements-We would like to thank Ian Russell, Director, and Delia Flint-Richter, Manager, of the Breast X-ray Program at the Essendon and District Memorial Hospital (Melbourne) for their co-operation in having the study conducted. We would also like to thank: Sally Redman and Barbara Murphy for helpful discussion on definitions of consequences; staff at the Program for their help during data collection; Bronwyn Nixon and Denise Hoban for efficient data collection; and Marco Cappiello for careful data analysis. Finally, we would like to acknowledge the women who gave their time to participate in this study. This research was funded by the Anti-Cancer Council of Victoria.

REFERENCES

1. Tabar L., Gad A., Holmberg L. H. et al. Reduction in mortality from breast cancer after mass screening with mammography. Lancet 829-832, 1985.
2. Shapiro S., Venet W., Strax P., Venet L. and Roeser R. Selection follow-up and analysis in the health insurance plan study: a randomised trial with breast cancer screening. Natn. Cancer Inst. Monogr. 65, 65-74, 1985.
3. Roberts M. M. Breast screening: time for a rethink. Br. Med. J. 299, 1153-1155, 1989.
4. Marteau T. M. Psychological costs of screening. Br. Med. J. 299, 527, 1989.
5. Ellman R., Angeli N., Christians A., Moss S., Chamberlain J. and Maguire P. Psychiatric morbidity associated with screening for breast cancer. Br. J. Cancer 60, 781-784, 1989.
6. Dean C., Roberts M. M., French K. and Robinson S. Psychiatric morbidity after screening for breast cancer. J. Epidem. Community Hlth 40, 71-75, 1986.
7. Goldberg D. Manual for the General Health Questionnaire. NFER Publishing, Horsham, 1978.
8. Gram I. T., Lund E. and Slenker S. E. Quality of life after a false positive mammogram. Br. J. Cancer 62, 1018-1022, 1990.
9. Selby P. J., Chapman J. A. W., Etazadi-Amoli J., Dalley D. and Boyd N. F. The development of a method of assessing the quality of life of cancer patients. Br. J. Cancer 50, 13-22, 1984.
10. Schipper H., Clinch J., McMurray A. and Levitt M. Measuring the quality of life of cancer patients: the functional living index-cancer: development and validation. J. clin. Oncol. 2, 472-483, 1984.
11. Bergner J., Bobbitt R. A., Carter W. B. and Gibson B. S. The sickness impact profile: development and final revision of a health status measure. Med. Care 19, 787-805, 1981.
12. Cronbach L. J. Essentials of Psychological Testing. Harper & Row, New York, 1970.
13. Anastasi A. Psychological Testing. MacMillan, New York, 1988.
14. Irwig L., Cockburn J., Turnbull D., Simpson J., Mock P. and Tattersall M. Womens’ perceptions of screening mammography. Aust. J. Publ. Hlth 15, 24-32, 1991.
15. Vinokur A. D., Threatt B. A., Caplan R. D. and Zimmerman B. L. Physical and psychosocial functioning and adjustment to breast cancer. Cancer 63, 394-405, 1989.
16. McNair D. M., Lorr M. and Droppleman L. F. Manual: profile of mood states. Educational and Industrial Testing Service, San Diego, 1971.
17. Marteau T. M. Reducing the psychological costs. Br. Med. J. 301, 26-28, 1990.
18. Nelson L. D., Satz P., Mitrushina M., Gorp W. V., Cichetti D., Lewis R. and VanLancker D. Development and validation of the neuropsychology behavior and affect profile. Psychological Assessment. J. Consult. Clin. Psychol. 1, 266-272, 1989.
19. Helmstadter G. C. Principles of Psychological Measurement. Appleton-Century-Crofts, New York, 1964.