Validating the Medical Data Interpretation Test in a Dutch population

Patient Education and Counseling 68 (2007) 287–290
www.elsevier.com/locate/pateducou

Short communication

Chris M.R. Smerecnik *, Ilse Mesters

Care and Public Health Research Institute (Caphri), Department of Health Education and Health Promotion, Maastricht University, P.O. Box 616, 6200 MD Maastricht, The Netherlands

Received 8 January 2007; received in revised form 26 June 2007; accepted 26 June 2007

Abstract

Objective: To validate the Dutch translation of the Medical Data Interpretation Test.
Methods: A test–retest design with a 2-week interval was used.
Results: The intraclass correlation coefficient (ICC = .82), the limits-of-agreement interval (LOA = −8.96 to 2.48) and the test–retest reliability (Pearson’s r = .86) suggest that the Dutch translation has good reproducibility. Construct validity was tested by two hypotheses, both of which were confirmed: university participants had higher test scores than non-university participants (p = .02), and males did not score differently from females (p = .61).
Conclusion: The results suggest that the Dutch version of the Medical Data Interpretation Test is an adequate scale to assess the ability to interpret medical data.
Practice implications: Assessing patients’ numeracy skills before a counseling session will enable the counselor to adjust subsequent communication accordingly and, as such, improve the session’s effectiveness.

© 2007 Elsevier Ireland Ltd. All rights reserved.

Keywords: Medical decision making; Medical Data Interpretation Test–Dutch; Validity; Reliability

1. Introduction

Public health care has steadily moved towards active patient participation in recent decades [1,2]. An integral aspect of patient participation concerns informing the patient about treatment options and their consequences [1], often by presenting the patient with statistical medical information. However, patients’ ability to understand and interpret such information is often inadequate [3]. In view of the importance of accurate understanding of such information in decision-making [4], it would be useful to assess patients’ understanding skills and consequently improve subsequent communication [5]. Although an English scale that is capable of adequately measuring patients’ understanding of medical data already exists [6], a Dutch scale does not. Instead of creating a new Dutch scale, the purpose of the present study was to validate the Dutch version of the Medical Data Interpretation Test [6].

* Corresponding author. Tel.: +31 43 388 4278; fax: +31 43 367 1032. E-mail address: [email protected] (C.M.R. Smerecnik).
0738-3991/$ – see front matter © 2007 Elsevier Ireland Ltd. All rights reserved. doi:10.1016/j.pec.2007.06.013

1.1. Creation of the Dutch Medical Data Interpretation Test

The original Medical Data Interpretation Test (MDIT) is a 20-item self-administered questionnaire that was developed to quickly assess patients’ understanding of medical data and, as such, facilitate health communications [6]. It has been shown to have good internal consistency (Cronbach’s α = .71), reliability (Pearson’s r = .67) and good content and construct validity. Prior to the present study, the Dutch version of the MDIT (the MDIT-D) was created using forward and backward translation [7]. Furthermore, two experts in the field of health communication assessed and confirmed the face and content validity of the MDIT-D. Subsequently, it was pre-tested and pilot-tested among 50 students of the Department of Health Education and Health Promotion of Maastricht University to determine whether the MDIT-D was comparable to the original MDIT. This pilot test revealed that one item (item 3) differed substantially in the proportion of correct answers; this item was consequently rephrased. The remainder of the paper discusses the reliability and validity of the MDIT-D.


2. Methods

2.1. Participants

Five university students of Maastricht University and five students of vocational education colleges were recruited for the present study. These students were requested to ask their fellow students to participate in the study as well. After agreeing to participate, individuals were presented with the MDIT-D. Additionally, participants were asked to report their age, gender, level of education, and their interest in, experience with, and ability to interpret medical data. Two weeks later, participants were again presented with the MDIT-D in order to assess the test–retest reliability [7]. All participants were native Dutch speakers.

Table 1
Demographical characteristics of the University and higher education groups

| Demographical characteristics | University | Higher education | Test-value | P-value |
|---|---|---|---|---|
| Age | 21.35 | 20.73 | 1.61 | .11 |
| Gender | | | 2.57 | .11 |
| Men | 8 | 12 | | |
| Women | 23 | 16 | | |
| Interest (range: 1–5) | 3.71 | 3.42 | 1.55 | .13 |
| Experience (range: 1–5) | 3.42 | 3.62 | 1.17 | .25 |
| Self-reported ability (range: 1–5) | 4.19 | 4.00 | 1.18 | .24 |

2.2. Test evaluation

2.2.1. Test scores

Test scores were analyzed to establish the usability of the scale. That is, item non-response should be minimal and items should differ substantially in difficulty. Furthermore, the MDIT-D should have no floor or ceiling effects (defined as 15% of the participants having the lowest or highest possible score).

2.2.2. Reproducibility

Reproducibility was determined by assessing the agreement and reliability of the MDIT-D. Agreement was determined using the intraclass correlation coefficient and the limits-of-agreement procedure [8,9]. Reliability was assessed using the test–retest method, which measures the agreement between individuals’ scores separated by a certain time interval. The scale was also tested for internal consistency. Although the MDIT-D was intended to measure one skill (i.e. the ability to interpret medical data), an exploratory factor analysis was conducted to determine whether the scale was unidimensional or multidimensional, so that internal consistency could be assessed properly. In case of multidimensionality, internal consistency was determined using Cronbach’s alpha [10] for each subscale and compared to the overall Cronbach’s alpha.

2.2.3. Validity

Criterion validity is considered the most powerful indicator of validity [7]. However, criterion validity requires the existence of an external ‘gold’ standard with which the scale can be compared. As no such ‘gold’ standard exists, construct validity was assessed by linking the attribute measured by the scale (i.e. understanding medical data) to another attribute (i.e. level of education) based on theoretical or empirical notions [7]. For instance, research has shown that while there are no gender differences, highly educated individuals have better numeracy skills than lower educated individuals [11]. We therefore expected an effect of education, but not of gender, on numeracy skills.
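The reproducibility and validity analyses described in Sections 2.2.2 and 2.2.3 reduce to short formulas, so two of the figures reported in Section 3 can be re-derived from summary numbers alone. The following is a minimal sketch, not the authors’ analysis code. It assumes that the limits of agreement were computed as the mean difference ± 1.96 × SD of the differences [9], that the lower limit is negative (−8.96), and that the education comparison was a pooled-variance independent-samples t-test with the group sizes given in Section 3.1. All function names are illustrative.

```python
import math


def sd_from_limits(lower, upper, z=1.96):
    """SD of the test-retest differences implied by two limits of agreement."""
    return (upper - lower) / (2 * z)


def standard_error_of_measurement(sd_difference):
    """SEM taken as the SD of the differences divided by the square root of 2."""
    return sd_difference / math.sqrt(2)


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Equal-variance independent-samples t statistic from group summaries."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se, df


# Reported limits of agreement; the negative lower limit is an assumption.
sd_diff = sd_from_limits(-8.96, 2.48)
sem = standard_error_of_measurement(sd_diff)

# Reported summaries: university (n = 26) versus non-university (n = 31).
t_value, df = pooled_t(71.92, 11.65, 26, 64.33, 11.74, 31)

print(f"SD of differences = {sd_diff:.2f}, SEM = {sem:.2f}")
print(f"t({df}) = {t_value:.2f}")
```

Under these assumptions the sketch yields an SEM of about 2.06 and t(55) of about 2.44, which agrees with the values reported in Sections 3.2.2 and 3.2.3.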

Table 2
Percentages of correct and incorrect answers, with percentages for the original MDIT in parentheses

| Items | Completed: answered correctly | Completed: answered incorrectly | Left blank |
|---|---|---|---|
| Knows that comparison group is needed to establish benefits | 78.9 (81) | 21.1 (18) | 0 (1) |
| Knows that age and sex of participants are needed | 56.1 (47) | 43.9 (51) | 0 (2) |
| Knows that lowering all causes of death is better than just one cause of death | 7.0 (20) | 93.0 (79) | 0 (1) |
| Knows that age of participants is needed | 64.9 (60) | 35.1 (39) | 0 (1) |
| Knows that risk of other diseases is needed | 63.2 (62) | 36.8 (35) | 0 (3) |
| Select ‘1 in 296’ as larger than ‘1 in 407’ | 98.2 (85) | 1.8 (14) | 0 (1) |
| Knows that denominator is needed to compare a risk between groups | 54.4 (45) | 45.6 (54) | 0 (1) |
| Calculate absolute risk reduction by using relative risk reduction | 86 (87) | 14 (11) | 0 (2) |
| Calculate risk by reducing relative risk from baseline | 94.7 (80) | 5.3 (19) | 0 (1) |
| Knows that denominator is needed to calculate a risk | 96.5 (75) | 1.8 (24) | 1.8 (1) |
| Knows that, for male smokers, risk of lung cancer is larger than prostate cancer | 66.7 (60) | 33.3 (37) | 0 (3) |
| Calculate relative risk reduction from two absolute risks | 45.6 (52) | 47.4 (46) | 7 (2) |
| Calculate absolute risk reduction from two absolute risks | 61.4 (77) | 38.6 (19) | 0 (4) |
| Calculate number of events by applying absolute risk to total group | 77.2 (72) | 22.8 (22) | 0 (6) |
| Rate riskiness in gain frame the same as in loss frame | 78.9 (61) | 21.1 (37) | 0 (1) |
| Rate death from all causes larger than from one cause | 52.6 (30) | 47.2 (69) | 0 (1) |
| Rate death in 10 years smaller than in 20 years | 71.9 (39) | 28.1 (60) | 0 (1) |
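Item difficulty, as used in Section 3.2.1, is the proportion of participants who answered an item correctly. As a small illustration, the MDIT-D percentages from Table 2 can be converted to proportions to recover the reported difficulty range. This is a sketch with variable names chosen here; the values are the MDIT-D "answered correctly" percentages in the order the items are listed.

```python
# Percentage of MDIT-D participants answering each item correctly,
# in the order the items appear in Table 2.
percent_correct = [78.9, 56.1, 7.0, 64.9, 63.2, 98.2, 54.4, 86.0, 94.7,
                   96.5, 66.7, 45.6, 61.4, 77.2, 78.9, 52.6, 71.9]

# Item difficulty = proportion of correct answers (higher means easier).
difficulty = [p / 100 for p in percent_correct]

# 1-based positions of the hardest and easiest items in the table.
hardest = min(range(len(difficulty)), key=difficulty.__getitem__) + 1
easiest = max(range(len(difficulty)), key=difficulty.__getitem__) + 1

print(f"difficulty range: {min(difficulty):.2f} to {max(difficulty):.2f}")
print(f"hardest item: {hardest}, easiest item: {easiest}")
```

The range printed (0.07 to 0.98) corresponds to the third and sixth rows of Table 2.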


Fig. 1. Frequencies of test scores.
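The score distribution in Fig. 1 is what the floor and ceiling check defined in Section 2.2.1 is applied to. A minimal sketch of that check follows. It assumes that the criterion means "at least 15%" of participants at the lowest or highest possible score, and that scores lie on the 0–100 scale mentioned in Section 3.2.2. The function names are illustrative, and the scores are invented for the example; they are not the study data.

```python
def extreme_score_shares(scores, lowest=0, highest=100):
    """Return the shares of scores at the lowest and highest possible value."""
    n = len(scores)
    at_floor = sum(1 for s in scores if s == lowest) / n
    at_ceiling = sum(1 for s in scores if s == highest) / n
    return at_floor, at_ceiling


def has_floor_or_ceiling_effect(scores, lowest=0, highest=100, threshold=0.15):
    """Flag an effect when either share reaches the threshold (assumed >= 15%)."""
    at_floor, at_ceiling = extreme_score_shares(scores, lowest, highest)
    return at_floor >= threshold or at_ceiling >= threshold


# Hypothetical scores, not the study data.
example_scores = [30, 44, 56, 61, 67, 72, 72, 78, 83, 94]
print(has_floor_or_ceiling_effect(example_scores))
```

With no score at either extreme, as in the example, the check reports no floor or ceiling effect.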

3. Results

3.1. Sample characteristics

In total, 80 individuals agreed to participate in the present study. Fifty-seven participants (20 males, 36 females) completed both questionnaires, resulting in a response rate of 71.25%. Mean age of the sample was 21.0 (SD = 1.61; range: 19–28). Thirty-one participants reported being non-university students, whereas 26 participants were university students. University and non-university students did not differ on demographical characteristics (see Table 1).

3.2. Test evaluation

3.2.1. Test scores

The mean test score for the numeracy scale was 69 (SD = 12.2; range: 30–94), with a median test score of 72. Item non-response ranged from 0% to 7%, with most items having a 0% non-response. Item difficulty ranged from .07 to .98 (see Table 2). No floor or ceiling effects were observed (see Fig. 1).

3.2.2. Reproducibility

Analyses revealed an ICC for agreement of .82, p < .001, for a two-way random model with participants and moments of measurement as random factors. The limits-of-agreement interval ranged from −8.96 to 2.48 (on a 0–100 scale). The standard error of measurement (calculated as the SD of the differences divided by √2) was 2.06. The test–retest analysis revealed a high test–retest reliability, Pearson’s r = .86, p < .001.

Regarding internal consistency, we observed an overall Cronbach’s alpha of .73. Exploratory factor analysis revealed the existence of four factors in the scale (see Table 3). Analyses showed that Cronbach’s alpha of each factor was similar to the overall Cronbach’s alpha. The items loading on factor 1 had a Cronbach’s alpha of .74; factor 2 had a Cronbach’s alpha of .76; factor 3 had a Cronbach’s alpha of .73; finally, factor 4 had a Cronbach’s alpha of .81.

Table 3
Factor loadings for the MDIT-D items

Factors: 1, 2, 3, 4

Items:
Knows that comparison group is needed to establish benefits
Knows that age and sex of participants are needed
Knows that lowering all causes of death is better than just one cause of death
Knows that age of participants is needed
Knows that risk of other diseases is needed
Select ‘1 in 296’ as larger than ‘1 in 407’
Knows that denominator is needed to compare a risk between groups
Calculate absolute risk reduction by using relative risk reduction
Calculate risk by reducing relative risk from baseline
Knows that denominator is needed to calculate a risk
Knows that comparison group is needed
Knows that, for male smokers, risk of lung cancer is larger than prostate cancer
Calculate relative risk reduction from two absolute risks
Calculate absolute risk reduction from two absolute risks
Calculate number of events by applying absolute risk to total group
Rate riskiness in gain frame the same as in loss frame
Rate death from all causes larger than from one cause
Rate death in 10 years smaller than in 20 years

Loadings (in extraction order): .60 .53 .66 .82 .61 .52 .79 .55 .48 .54 .59 .54 .65 .55

3.2.3. Validity

Independent samples t-tests revealed a significant effect of level of education, t(55) = 2.44, p = .02, but not of gender, t(55) = .516, p = .61, on MDIT-D score. University students scored significantly better (M = 71.92, SD = 11.65) than non-university participants (M = 64.33, SD = 11.74), whereas men (M = 66.56, SD = 11.29) did not score higher on the numeracy scale than women (M = 68.41, SD = 12.78).

4. Discussion and conclusion

4.1. Discussion

The purpose of the present study was to validate a Dutch version of the Medical Data Interpretation Test (the MDIT-D). The results indicate that the MDIT-D is a reliable and valid questionnaire. The analyses of the test scores showed good usability: item non-response was low and item difficulty differed substantially. Furthermore, no floor or ceiling effects were observed. The intraclass correlation, the limits-of-agreement and the test–retest reliability suggest that the scale is reliable. We also tested and confirmed two hypotheses, suggesting adequate construct validity.

A number of limitations of the present study deserve attention. First, the research sample was relatively small, which may have influenced the analyses. A second and related limitation is that the sample consisted of students only, limiting the generalizability of the results to other populations. Future research should test the MDIT-D using larger research samples and other subpopulations.

4.2. Conclusion

Based on the analyses of the usability, the reproducibility and the validity of the scale, we conclude that the MDIT-D seems a reliable and valid scale to assess patients’ numeracy skills. The intraclass correlation, the limits-of-agreement and the test–retest reliability suggest that the scale is reliable. We also tested and confirmed two hypotheses, suggesting adequate construct validity.

4.3. Practice implications

In view of the increased tendency to involve patients in medical decision making [1,2], the need to assess patients’ understanding of medical data at any given moment is becoming increasingly important in health care settings. The present study suggests that the Dutch version of the Medical Data Interpretation Test is capable of accomplishing this purpose. In other words, assessing patients’ understanding of medical data before a counseling session will enable the counselor to adjust subsequent communication accordingly and, as such, improve the session’s effectiveness.

Acknowledgements

This study was financially supported by Maastricht University and was performed at Care and Public Health Research Institute (Caphri), which participates in the Netherlands School of Primary Care research (CaRe), acknowledged by the Royal Dutch Academy of Science (KNAW) in 1995.

References

[1] Frosch DL, Kaplan RM. Shared decision making in clinical medicine: past research and future directions. Am J Prev Med 1999;17:285–94.
[2] Sheridan SL, Harris R, Woolf S. Shared decision making about screening and chemoprevention: a suggested approach from the US Preventive Services Task Force. Am J Prev Med 2004;26:56–66.
[3] Schwartz LM, Woloshin S, Black WC, Welch HG. The role of numeracy in understanding the benefit of screening mammography. Ann Internal Med 1997;127:966–72.
[4] Peters E, Västfjäll D, Slovic P, Mertz CK, Mazzocco K, Dickert S. Numeracy and decision making. Psychol Sci 2006;17:407–13.
[5] DeWalt DA, Pignone M, Malone R, Rawls C, Kosnar M, George G, Bryant B, Rothman RL, Angel B. Development and pilot testing of a disease management program for low literacy patients with heart failure. Patient Educ Counsel 2004;55:78–86.
[6] Schwartz LM, Woloshin S, Welch HG. Can patients interpret health information? An assessment of the medical data interpretation test. Med Decision Making 2005;25:290–300.
[7] Streiner DL, Norman GR. Health measurement scales. A practical guide to their development and use, 2nd ed., Oxford: Oxford University Press; 2003.
[8] Shrout PE, Fleiss JL. Intraclass correlations: uses in assessing rater reliability. Psychol Bull 1979;86:420–8.
[9] Bland JM, Altman DG. Statistical methods for assessing agreement between two methods of clinical measurement. Lancet 1986;1:307–10.
[10] Cronbach LJ. Coefficient alpha and the internal structure of tests. Psychometrika 1951;16:297–333.
[11] Woloshin S, Schwartz LM, Black WC, Welch HG. Women’s perceptions of breast cancer risk: how you ask matters. Med Decision Making 1999;19:221–9.