CME examinations: One year of experience

Lawrence E. Rosenthal, Ph.D.* Evanston, IL

Since its inception, the JOURNAL OF THE AMERICAN ACADEMY OF DERMATOLOGY (JAAD) has provided physicians with an opportunity to earn Category I CME credit by completing an examination on CME articles that appear in the JOURNAL. Since July, 1979, an average of 748 physicians per month have participated in this program. The Academy office has developed a computer analysis program for each of these examinations. This analysis includes: (1) item analysis, (2) difficulty and discrimination indices, (3) descriptive statistics, (4) reliability coefficient, (5) raw score distribution, and (6) percentile distribution. This article provides a detailed description of the above statistical treatments along with an analysis of CME examination performance during the first year of publication (July, 1979-June, 1980). (J AM ACAD DERMATOL 4:115-118, 1981.)
*Associate Executive Director, American Academy of Dermatology.

With the advent of the JOURNAL OF THE AMERICAN ACADEMY OF DERMATOLOGY (JAAD) in July, 1979, the Academy implemented a new delivery system of Continuing Medical Education. Each month a CME article with two hours for Category I credit is published in the JOURNAL. Dermatologists who are interested in earning credit from JOURNAL reading can earn up to 24 hours per year by responding to the examination that appears at the end of each CME article and returning the completed answer sheet to the Academy office. The answer sheets are fed through an optical scanner that is linked to the Academy's computer. In addition to analyzing the examinations, the computer automatically assigns CME credit to the
Table I. CME examination summary statistics

| Quiz | No. of items | Mean* | Median* | SD | Reliability coefficient |
|---|---|---|---|---|---|
| July, 1979 | 13 | 10.09 | 10 | 2.17 | 0.56 |
| August | 23 | 17.66 | 18 | 3.22 | 0.63 |
| September | 28 | 18.55 | 19 | 3.52 | 0.51 |
| October | 24 | 17.56 | 18 | 3.02 | 0.50 |
| November | 20 | 16.34 | 17 | 2.79 | 0.64 |
| December | 27 | 23.33 | 24 | 3.27 | 0.73 |
| January, 1980 | 33 | 27.57 | 29 | 5.28 | 0.86 |
| February | 27 | 22.04 | 23 | 3.66 | 0.72 |
| March | 33 | 27.81 | 29 | 5.15 | 0.86 |
| April | 33 | 28.61 | 30 | 4.3 | 0.82 |
| May | 33 | 28.14 | 29 | 4.3 | 0.80 |
| June | 21 | 17.22 | 18 | 3.16 | 0.79 |

SD: standard deviation.
*Mean and median scores are for correct answers.
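Because the monthly quizzes differ in length, the means in Table I are easier to compare as a fraction of items answered correctly. The following Python sketch is illustrative only and is not the Academy's program. It copies its figures from Table I, assumes that the last column of Table I is the Reliability Coefficient, and applies the 0.6 rule of thumb cited later in this article. The names `TABLE_I`, `fraction_correct`, and `below_rule_of_thumb` are hypothetical.

```python
# Figures copied from Table I: (quiz, number of items, mean score, reliability).
# Treating the last column as the Reliability Coefficient is an assumption.
TABLE_I = [
    ("July, 1979", 13, 10.09, 0.56),
    ("August", 23, 17.66, 0.63),
    ("September", 28, 18.55, 0.51),
    ("October", 24, 17.56, 0.50),
    ("November", 20, 16.34, 0.64),
    ("December", 27, 23.33, 0.73),
    ("January, 1980", 33, 27.57, 0.86),
    ("February", 27, 22.04, 0.72),
    ("March", 33, 27.81, 0.86),
    ("April", 33, 28.61, 0.82),
    ("May", 33, 28.14, 0.80),
    ("June", 21, 17.22, 0.79),
]


def fraction_correct(n_items, mean_score):
    """Mean score expressed as a fraction of the items on the quiz."""
    return mean_score / n_items


def below_rule_of_thumb(rows, threshold=0.6):
    """Quizzes whose reliability is not greater than the threshold."""
    return [quiz for quiz, _, _, reliability in rows if reliability <= threshold]


for quiz, n_items, mean_score, reliability in TABLE_I:
    share = fraction_correct(n_items, mean_score)
    print(f"{quiz:<14} {share:.2f} of items correct, reliability {reliability:.2f}")

print("Not above 0.6:", below_rule_of_thumb(TABLE_I))
```

Expressing the mean as a fraction of items is a presentation choice made here for comparison across quizzes; the article itself reports raw means only.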
0190-9622/81/010115+04$00.40/0 © 1981 Am Acad Dermatol
physician whose Medical Education number is marked on the answer sheet. After the CME credit is recorded, the quiz sheets are destroyed. The computer does not record individual test scores. It scans, analyzes, and reports composite or group results from the recorded examinations. Each month the JOURNAL editors, the Chairman of the Council on Educational Affairs, and the Chairman of the Committee on Evaluation receive an analysis of the previous month's examination. This information is used to plan and improve future CME articles and programs.

From July, 1979 through June, 1980, an average of 748 physicians per month returned completed answer sheets to the Academy office. An overall total of 8,985 answer sheets representing 17,970 CME credit hours have been received and processed since July, 1979. Physicians from every state except Alaska have participated in the program. This represents 16.6% participation of all non-Life U.S. members of the Academy.

The computer analysis of each month's examination includes:

1. Question Item Analysis
2. Difficulty and Discrimination Index for each test item
3. Descriptive Statistics (number participating, mean, median, and standard deviation)
4. Reliability Coefficient
5. Raw Score Distribution
6. Percentile Distribution

The Item Analysis process calculates the percent of responses for each possible answer choice for each item, providing a distribution of how each test item was answered. Too many incorrect responses suggest that the test item or the possible

Table II. American Academy of Dermatology analysis of test 880-107 (July, 1980)

Columns A through Other: percentage of item choice.

| Item No. | Correct choice | A | B | C | D | E | * | Other | Diff. index | Disc. index |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | B | 1.3 | 85.7 | 0.6 | 7.4 | 0.7 | 0.0 | 4.5 | 0.77 | 0.44 |
| 2 | B | 1.5 | 85.5 | 0.3 | 0.6 | 8.4 | 0.0 | 3.8 | 0.77 | 0.45 |
| 3 | C | 0.1 | 2.1 | 83.3 | 8.2 | 0.0 | 0.0 | 6.3 | 0.75 | 0.47 |
| 4 | D | 7.8 | 0.1 | 0.1 | 83.3 | 4.5 | 0.0 | 4.2 | 0.76 | 0.46 |
| 5 | C | 0.1 | 0.4 | 93.7 | 0.1 | 0.6 | 0.0 | 5.0 | 0.91 | 0.15 |
| 6 | B | 1.0 | 75.3 | 2.1 | 7.4 | 8.4 | 0.0 | 5.8 | 0.68 | 0.59 |
| 7 | D | 8.1 | 1.0 | 0.8 | 83.1 | 1.1 | 0.0 | 5.8 | 0.76 | 0.44 |
| 8 | E | 0.6 | 0.0 | 1.0 | 8.5 | 82.0 | 0.0 | 7.9 | 0.74 | 0.51 |
| 9 | D | 0.7 | 8.1 | 0.6 | 84.1 | 0.1 | 0.0 | 6.4 | 0.74 | 0.51 |
| 10 | A | 94.0 | 0.3 | 0.1 | 1.0 | 1.1 | 0.0 | 3.5 | 0.91 | 0.18 |
| 11 | C | 0.4 | 1.4 | 93.2 | 0.0 | 0.4 | 0.0 | 4.6 | 0.88 | 0.24 |
| 12 | E | 6.5 | 1.3 | 0.6 | 0.1 | 85.7 | 0.0 | 5.8 | 0.76 | 0.47 |
| 13 | A | 87.6 | 7.2 | 0.4 | 0.6 | 0.0 | 0.0 | 4.2 | 0.78 | 0.44 |
| 14 | D | 0.8 | 6.7 | 0.3 | 86.9 | 0.6 | 0.0 | 4.7 | 0.78 | 0.44 |
| 15 | B | 7.9 | 85.4 | 1.4 | 0.7 | 1.0 | 0.0 | 3.6 | 0.77 | 0.45 |
| 16 | A | 80.6 | 0.6 | 0.1 | 1.3 | 11.8 | 0.0 | 5.6 | 0.73 | 0.47 |
| 17 | C | 0.4 | 9.5 | 78.3 | 0.3 | 0.8 | 0.0 | 10.7 | 0.80 | 0.24 |
| 18 | A | 69.2 | 3.3 | 0.8 | 2.1 | 9.3 | 0.0 | 15.2 | 0.63 | 0.59 |
| 19 | B | 6.8 | 80.8 | 0.4 | 3.8 | 0.4 | 0.0 | 7.8 | 0.73 | 0.51 |
| 20 | B | 7.5 | 69.6 | 6.3 | 3.3 | 4.7 | 0.0 | 8.5 | 0.64 | 0.61 |
| 21 | A | 79.7 | 7.0 | 4.7 | 2.1 | 0.4 | 0.0 | 6.1 | 0.73 | 0.49 |
| 22 | E | 7.5 | 1.8 | 1.8 | 2.5 | 80.6 | 0.0 | 5.7 | 0.74 | 0.49 |
| 23 | C | 5.4 | 4.5 | 75.8 | 0.4 | 0.8 | 0.0 | 13.1 | 0.68 | 0.53 |
| 24 | E | 1.7 | 1.7 | 1.7 | 5.7 | 77.7 | 0.0 | 11.6 | 0.71 | 0.53 |
| 25 | A | 80.5 | 3.8 | 0.3 | 1.9 | 0.4 | 0.0 | 13.1 | 0.71 | 0.54 |
| 26 | C | 17.0 | 2.1 | 66.7 | 0.1 | 0.1 | 0.0 | 13.9 | 0.62 | 0.60 |
| 27 | B | 3.1 | 65.5 | 3.5 | 12.4 | 0.1 | 0.0 | 15.5 | 0.61 | 0.58 |
| 28 | A | 80.9 | 1.0 | 3.1 | 0.6 | 0.0 | 0.0 | 14.5 | 0.73 | 0.53 |
Volume 4 Number 1 January, 1981
Table II. Cont'd

Columns A through Other: percentage of item choice.

| Item No. | Correct choice | A | B | C | D | E | * | Other | Diff. index | Disc. index |
|---|---|---|---|---|---|---|---|---|---|---|
| 29 | * | 1.9 | 0.3 | 0.1 | 0.1 | 1.1 | 53.2 | 43.2 | 0.52 | 0.70 |
| 30 | * | 0.8 | 0.3 | 0.1 | 0.1 | 0.3 | 19.8 | 78.6 | 0.20 | 0.30 |
| 31 | * | 6.7 | 0.1 | 1.7 | 0.1 | 0.0 | 71.2 | 20.2 | 0.64 | 0.60 |
| 32 | * | 1.4 | 0.4 | 0.3 | 0.1 | 0.7 | 68.5 | 28.6 | 0.62 | 0.74 |
| 33 | * | 0.8 | 0.3 | 0.0 | 0.4 | 0.4 | 53.5 | 44.6 | 0.51 | 0.75 |

*Items 29-33 required multiple responses as a correct answer. The correct responses were: item 29: a, b, c, d, e; item 30: a, b, c, d, e; item 31: a, b; item 32: a, b, c, d, e; item 33: a, b, c, d, e. The * column denotes the percent of those participants who answered these items correctly.

Test statistics†

Raw score distribution
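| Score | ≤5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Nos. | 65 | 0 | 3 | 1 | 1 | 1 | 0 | 4 | 2 | 1 | 2 | 0 | 1 | 4 | 7 |

| Score | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 | 31 | 32 | 33 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Nos. | 8 | 10 | 12 | 24 | 39 | 30 | 54 | 54 | 70 | 82 | 87 | 71 | 74 | 11 |

Percentile distribution

| Percentile | 5 | 10 | 15 | 20 | 25 | 30 | 35 | 40 | 45 | 50 | 55 | 60 | 65 | 70 | 75 | 80 | 85 | 90 | 95 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Score | 4 | 10 | 21 | 23 | 24 | 25 | 26 | 27 | 27 | 28 | 29 | 29 | 29 | 30 | 30 | 31 | 31 | 32 | 32 |

†Number = 718; mean = 25.41; median = 28; SD = 7.9; rel. = 0.93468.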
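The raw score distribution in Table II is enough to recover the number of participants and the median reported in the test statistics. The Python sketch below is illustrative and is not the Academy's program. `RAW_SCORES`, `total_participants`, and `score_at_percentile` are hypothetical names, and the percentile convention used (the smallest score whose cumulative count reaches the requested share of participants) is an assumption that need not reproduce every entry of the printed percentile distribution. The "5 or fewer" group is stored under key 5, so the mean and standard deviation cannot be recomputed from this table alone.

```python
# Raw score distribution from Table II (July, 1980 examination).
# Key 5 stands for the "5 or fewer" group; exact scores in it are not reported.
RAW_SCORES = {
    5: 65, 6: 0, 7: 3, 8: 1, 9: 1, 10: 1, 11: 0, 12: 4, 13: 2, 14: 1,
    15: 2, 16: 0, 17: 1, 18: 4, 19: 7, 20: 8, 21: 10, 22: 12, 23: 24,
    24: 39, 25: 30, 26: 54, 27: 54, 28: 70, 29: 82, 30: 87, 31: 71,
    32: 74, 33: 11,
}


def total_participants(dist):
    """Number of answer sheets in the frequency distribution."""
    return sum(dist.values())


def score_at_percentile(dist, pct):
    """Smallest score whose cumulative count reaches pct percent of participants.

    This convention is an assumption; other percentile definitions exist.
    """
    needed = pct / 100 * total_participants(dist)
    running = 0
    for score in sorted(dist):
        running += dist[score]
        if running >= needed:
            return score
    return max(dist)


print("participants:", total_participants(RAW_SCORES))
print("median score:", score_at_percentile(RAW_SCORES, 50))
print("perfect scores:", RAW_SCORES[33])
print("missed only 1 question:", RAW_SCORES[32])
```

With these counts, `total_participants(RAW_SCORES)` gives 718 and `score_at_percentile(RAW_SCORES, 50)` gives 28, in agreement with the reported number and median.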
choices did not focus adequately on the concept being evaluated. Too many correct responses may indicate that the item was too easy or that the choices presented "led" the responder too much.

The Difficulty and Discrimination Indices provide an additional dimension for analyzing the effectiveness of a test item. In both instances, the upper and lower 27% of the scores are analyzed as a subgroup of the total examination participants. The Difficulty Index is the percentage of the upper and lower groups that answered the item correctly. The index can then be compared to the performance of the overall group participating in the quiz. Such comparison answers the question of how those groups (upper and lower) performed in relation to the overall group. Generally, items that have more than a 0.90 Difficulty Index should be questioned as too easy and items that have less than a 0.30 index rating should be regarded as possibly too difficult to be included in a test. They should be questioned, not rejected, for they may be justified on other grounds.

The Discrimination Index, which ranges from -1.00 to +1.00, provides a measure by which test item developers can determine the performance on the item between the upper and lower groups. For example, if a certain test item was developed so that correct responses were expected only from the upper group, the Discrimination Index would provide the answer to the item developer as to whether or not the item actually discriminated between the upper and lower groups.¹

The Descriptive Statistics produced by the
computer provide overall examination performance data. These data include: (1) number who participated, (2) mean or average score, (3) median, and (4) standard deviation. The median is the score at which 50% of the respondents scored below and 50% scored above. The standard deviation measures how widely individual scores are spread around the mean. Its interpretation rests on a theoretical normal distribution of scores around the mean. The standard deviation is calculated to describe, partially, the shape of the score distribution and its relation to the theoretical normal distribution curve.

The Reliability Coefficient measures how consistently an examination measures skill or knowledge. It provides an indicator as to whether or not the same or similar individuals would perform similarly on the same test at some other time. Reliability is a statistical concept. Consequently, no logical or visual examination or review by experts will provide a clue to Reliability. As a rule of thumb, most experts consider a test with a Reliability Coefficient greater than 0.6 to be a good reusable test.²

The Raw Score Distribution is merely a frequency distribution of scores of participants. It answers questions such as: "How many participants received a perfect score?" "How many got 5 or fewer correct or missed only 1 question on the quiz?" Each correct test item is worth 1 point. Hence, a perfect score on a 27 item test is 27.

The Percentile Distribution is designed to answer the question of the relation of a test score to other scores. For example, if a score of 26 is calculated to be at the 95th percentile, it means that someone receiving a score of 26 scored better than 95% of all participants who took the quiz.

Table I provides overall descriptive statistics for the first twelve months of the Academy's CME Journal reading program.

The Academy is considering printing, on a routine monthly basis, descriptive statistics (number of test participants, mean, median, and standard deviation), item analysis, and raw score distribution of CME examinations. A sample of such reporting is contained in Table II, which contains the above statistics for the July, 1980 examination. Would providing information, as presented in Table II, for each examination be helpful to you? Please let the Academy know your views on this matter. You may address your comments to:

American Academy of Dermatology
Continuing Education Assessment Service
820 Davis St.
Evanston, IL 60201

REFERENCES

1. Isaac S, Michael W: Handbook in Research and Evaluation. San Diego, 1971, Robert R. Knapp, pp. 80, 87, 89, 136.
2. Cronbach L: Essentials of Psychological Testing. New York, 1960, Harper and Brothers, p. l4J.
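The upper and lower 27% procedure described for the Difficulty and Discrimination Indices can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the Academy's program. The Difficulty Index is taken as the proportion of the combined upper and lower groups answering the item correctly, as the text describes. The Discrimination Index is assumed to be the upper-group proportion minus the lower-group proportion, a common textbook form that yields the stated range of -1.00 to +1.00. The function names, the rounding of the group size, and the toy response data are hypothetical.

```python
def upper_lower_groups(total_scores, fraction=0.27):
    """Return (upper, lower) lists of participant indices ranked by total score."""
    n = len(total_scores)
    k = max(1, round(n * fraction))  # group size; the rounding rule is an assumption
    order = sorted(range(n), key=lambda p: total_scores[p])
    return order[-k:], order[:k]


def item_indices(responses, item, fraction=0.27):
    """responses[p][i] is 1 if participant p answered item i correctly, else 0."""
    totals = [sum(row) for row in responses]
    upper, lower = upper_lower_groups(totals, fraction)
    p_upper = sum(responses[p][item] for p in upper) / len(upper)
    p_lower = sum(responses[p][item] for p in lower) / len(lower)
    difficulty = (p_upper + p_lower) / 2   # groups are the same size
    discrimination = p_upper - p_lower     # assumed formula, see lead-in
    return difficulty, discrimination


def review_flag(difficulty):
    """Apply the 0.90 and 0.30 guidelines quoted in the article."""
    if difficulty > 0.90:
        return "question as too easy"
    if difficulty < 0.30:
        return "question as possibly too difficult"
    return "no flag"


# Toy data: 8 hypothetical participants, 4 items (not taken from any real quiz).
RESPONSES = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 1],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]

for item in range(4):
    difficulty, discrimination = item_indices(RESPONSES, item)
    print(item + 1, difficulty, discrimination, review_flag(difficulty))
```

A flag is only a prompt for review: as the article notes, flagged items should be questioned, not rejected, for they may be justified on other grounds.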