The In-Training Examination: An Analysis of Its Predictive Value on Performance on the General Pediatrics Certification Examination

LINDA A. ALTHOUSE, PHD, AND GAIL A. MCGUINNESS, MD

Objective This study investigates the predictive validity of the In-Training Examination (ITE). Although studies have confirmed the predictive validity of ITEs in other medical specialties, no study has been done for general pediatrics.

Study design Each year, residents in accredited pediatric training programs take the ITE as a self-assessment instrument. The ITE is similar to the American Board of Pediatrics General Pediatrics Certifying Examination. First-time takers of the certifying examination over a 5-year period who took at least 1 ITE examination were included in the sample. Regression models analyzed the predictive value of the ITE.

Results The predictive power of the ITE in the first training year is minimal. However, the predictive power of the ITE increases each year, providing the greatest power in the third year of training.

Conclusions Even though ITE scores provide information regarding the likelihood of passing the certification examination, the data should be used with caution, particularly in the first training year. Other factors also must be considered when predicting performance on the certification examination. This study continues to support the ITE as an assessment tool for program directors, as well as a means of providing residents with feedback regarding their acquisition of pediatric knowledge. (J Pediatr 2008;153:425-8)

For approximately 35 years, residents in general pediatric training programs have been given an In-Training Examination (ITE) as a self-assessment instrument. The ITE is similar in difficulty, format, and content to the American Board of Pediatrics (ABP) General Pediatrics (GP) Certifying Examination and is created with 200 items from a prior certifying examination. The ITE is administered each July, approximately 2 weeks after the academic year begins. As a result, first-year residents take the examination as they begin formal training. Therefore, their scores are indicative of the pediatric knowledge they have before training. Third-year residents take the examination at the beginning of their final year of training, after completion of 2 years of training.

The ABP offers the ITE annually to all accredited residency training programs in the United States and Canada. Program participation is 100%, with more than 200 programs participating each academic year. Residents are encouraged to take the ITE but are not required to do so. Nonetheless, more than 9800 residents (more than 95% of all residents) participated in the ITE in 2005.

The purpose of this study is to investigate the predictive validity of the ITE relative to performance on the GP certifying examination. Predictive validity is the extent to which performance on an examination correlates with future performance on another examination or scorable measure for the same set of examinees. Although studies have attempted to confirm the predictive validity of ITEs in other medical specialties,1-9 no formal study has been done to explore the predictive validity of the ITE for general pediatrics.

ABP    American Board of Pediatrics
GP     General pediatrics
ITE    In-Training Examination
PL-1   First-year resident
PL-2   Second-year resident
PL-3   Third-year resident

From the American Board of Pediatrics (L.A., G.M.), Chapel Hill, NC. Submitted for publication Oct 2, 2007; last revision received Jan 29, 2008; accepted Mar 13, 2008. Reprint requests: Linda A. Althouse, PhD, American Board of Pediatrics, 111 Silver Cedar Ct, Chapel Hill, NC 27514. E-mail: [email protected]. 0022-3476/$ - see front matter Copyright © 2008 Mosby Inc. All rights reserved. 10.1016/j.jpeds.2008.03.012

Table I. ITE and GP scores for all examinees of the 2001-2005 GP examination

Score            N       Mean     Standard deviation   Standard error of mean   Range of scores
PL-1 ITE Score   13611   173.48   116.94               1.00                     0-760
PL-2 ITE Score   13627   299.93   123.03               1.05                     0-780
PL-3 ITE Score   13614   359.09   118.55               1.02                     0-760
GP Score         14525   484.97   110.24               .92                      0-800

METHODS

Scoring of the GP Certifying Examination

When scoring the certifying examination, the number of items a candidate answered correctly is converted to a scaled score. This scaled score is determined by the reference group, which is composed of individuals who have graduated recently from an accredited medical school in the United States or Canada and who are taking the examination for the first time. The mean of the reference group is set to 500, with the standard deviation set to 100. The raw score for each candidate is then converted to a standardized scaled score of 0 to 800. Although evaluated each year, the passing point for the certifying examination has consistently remained at 410.
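The text gives the anchors of the reporting scale (reference-group mean of 500, standard deviation of 100, range 0 to 800) but not the conversion itself. The following is a minimal sketch, assuming a linear z-score transformation truncated to the 0 to 800 range; the function names and parameters are hypothetical, and the ABP's actual procedure may differ.

```python
def scaled_score(raw, ref_mean, ref_sd, scale_mean=500.0, scale_sd=100.0,
                 lo=0.0, hi=800.0):
    """Map a raw number-correct score onto the 0-800 reporting scale.

    Assumption: a linear transformation of the z-score relative to the
    reference group, truncated to [lo, hi]. The article does not describe
    the exact conversion.
    """
    if ref_sd <= 0:
        raise ValueError("ref_sd must be positive")
    z = (raw - ref_mean) / ref_sd
    return min(hi, max(lo, scale_mean + scale_sd * z))


def sds_from_reference_mean(scaled, scale_mean=500.0, scale_sd=100.0):
    """Express a scaled score in standard deviations from the reference mean."""
    return (scaled - scale_mean) / scale_sd
```

Under this reading, `sds_from_reference_mean(410)` places the passing point 0.9 standard deviations below the reference-group mean.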

Scoring of the ITE

The ITE, like the GP examination, is a norm-referenced examination; relative achievement, rather than absolute knowledge, is measured. The ITE is scored with the reference group from the certifying examination from which the ITE was derived. The same 0 to 800 scale is used. A score of 300 on the ITE can be interpreted as scoring 2 standard deviations below the mean performance of the reference group on the same set of items when taken as part of the GP examination.

Interpreting the ITE Scores

Because the ITE difficulty level is stable from year to year, a resident’s scores can be compared across all 3 years of training. By scoring all residents, regardless of their level, on the same scale as the GP examination, residents are able to compare their scores with those of individuals who took the same set of items on an actual certifying examination and to assess whether they are improving during their training, how they are improving relative to their peers, and whether their scores are nearing the GP certifying examination passing mark of 410. First-year (PL-1) residents are not expected to score as high as second-year (PL-2) or third-year (PL-3) residents.

Although the ITE is a subset of the certifying examination, it must be noted that the 2 examinations are administered under different conditions. The certifying examination is a high-stakes, secure examination administered simultaneously across the United States and Canada during a single 2-day period; it is administered by ABP-appointed proctors following strict, standardized procedures. The ITE is administered by the program directors at each training program. Even though program directors are encouraged to administer the examination to all residents on a single administration day, they are given a 1-week window to allow for flexibility in the rotation schedules of residents. Not all residents (even within the same training program) take the ITE at the same time. Trainees are not encouraged to prepare for the ITE, whereas intensive preparation may take place for the certifying examination. In addition, although individuals are likely to allow the appropriate amount of rest before taking the certifying examination, a resident may take the ITE with minimal rest time (eg, after being on call). Even with these differences, it is hypothesized that ITE scores provide some measure of predicting performance on the certification examination.

All first-time test takers of the GP Certifying Examination during the 5-year period of 2001 to 2005 who took at least one ITE examination while in training were included in the sample. The scores of each of their ITE administrations and of their initial GP Certifying Examination were recorded, which resulted in 14 525 examinee records for the 5-year period. The number of first-time takers each year for the GP examination during this 5-year period ranged from 2809 to 2960, indicating a relatively stable sample size across the 5 years.

Linear regression was used to analyze the predictive nature of the ITE scores. The predictive validity was assessed at each training level. Descriptive statistics, relative to performance on the GP certifying examination, for specific score points for the ITE also are provided.
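The regression approach described above can be sketched as follows. This is a minimal illustration, assuming ordinary least squares with a single predictor (the article says only that linear regression was used); `fit_simple_ols` is a hypothetical helper, and the toy values are for illustration only and are not study data.

```python
from statistics import mean


def fit_simple_ols(x, y):
    """Fit y = intercept + slope * x by ordinary least squares.

    Returns (intercept, slope, r_squared), where r_squared is the share
    of the variance in y accounted for by x.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each vary")
    slope = sxy / sxx
    intercept = my - slope * mx
    r_squared = (sxy * sxy) / (sxx * syy)
    return intercept, slope, r_squared


# Toy values for illustration only; these are NOT study data.
toy_ite = [200, 300, 400, 500]
toy_gp = [380, 450, 500, 570]
intercept, slope, r2 = fit_simple_ols(toy_ite, toy_gp)
```

In the study's setup, one such model would be fitted per training level, with the ITE score for that level as `x` and the first-attempt GP score as `y`.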

RESULTS

Descriptive Statistics

Table I provides descriptive statistics for all first-time test takers of the GP examination during the 5-year period of 2001 to 2005 who took at least 1 ITE examination while in training. The sample size for each set of scores varies, because those who took the GP examination during 2001 to 2005 may not have taken the ITE in all 3 years of their training. As expected, the average score increases as time in training increases. The increase between PL-1 and PL-2 average scores (126.5) was more than twice the increase between PL-2 and PL-3 scores (59.2). The increase between the last year of training and GP scores was similar to the gain during the first year of training. Results of paired-sample t tests indicate that the differences in the mean performance at each of the training levels are statistically significant (P < .0001). This systematic increase of scores at each training level supports the intent of the ITE, because one would expect medical knowledge and scores to increase as one progresses through training.

Linear Regression

Table II provides the results of the linear regression models conducted for each of the 3 training years. In each model, the respective ITE scores were used as the predictor variables, with the GP examination scores as the dependent variable. If ITE scores were a perfect predictor of GP scores, they would account for 100% of the variation of scores among those taking the GP examination. That is, there would be a perfect correlation between ITE and GP scores such that the person scoring the lowest on the ITE would also score the lowest on the GP examination, and the spread of the scores for the ITE would be the same as the GP examination. In practice, perfect prediction does not occur. PL-1 ITE scores accounted for 39.5% of the variation (R2) in the GP examination scores. The PL-2 ITE scores accounted for about 47% of the variance, and the PL-3 scores accounted for about 51%. These results indicate that, as residents move through their training years, their scores on the ITE become better predictors of their GP examination scores. All coefficients were significant at the P < .0001 level.

Table II. Regression models in predicting GP scores from past ITEs

Models           Intercept   Unstandardized coefficient   Standardized coefficient   R2
PL-1 ITE Score   381.82      .591                         .628                       .395
PL-2 ITE Score   298.98      .620                         .687                       .471
PL-3 ITE Score   248.21      .660                         .712                       .506

Summary Statistics on the Basis of Actual Data

Although regression analysis provides models for predicting performance, the ABP provides actual data from past results to program directors and residents. Table III shows score ranges for the ITE examination for third-year residents compared with actual performance on the certifying examination. This table is an abbreviated version of the actual table provided to program directors, which provides data points for ITE scores in increments of 10. Similar tables for first- and second-year residents also are provided to program directors. This table can be used to ascertain how previous residents, at the same training level, who achieved a particular ITE score performed on the GP examination.

The Figure, which illustrates the passing rate at various increments of ITE score points, is also provided to program directors. The passing rate is a 5-year moving average of pass rates for each training level at each score group. The Figure shows a decrease in GP scores at the higher ITE performance ranges for PL-1s. This decrease likely is a function of the small sample sizes, because very few first-year residents score above 500.

Table III. PL-3 scores compared with actual performance on the 2001-2005 GP examination

PL-3 ITE score range   Mean GP score   Standard deviation   95% confidence interval   Number of test takers   Actual GP score range   First attempt pass rate
≤50                    240             120                  210-260                   101                     10-540                  8.9%
60-100                 290             100                  270-300                   156                     0-500                   11.5%
110-150                320             100                  320-330                   441                     0-570                   19.7%
160-200                370             100                  360-370                   699                     0-640                   36.6%
210-250                400             90                   390-400                   1235                    80-630                  48.9%
260-300                440             90                   430-440                   1689                    0-670                   67.0%
310-350                470             80                   470-570                   2053                    180-690                 81.3%
360-400                500             70                   500-500                   2285                    120-690                 90.7%
410-450                530             70                   530-540                   2043                    160-730                 96.5%
460-500                560             70                   560-560                   1243                    220-730                 98.1%
>500                   600             70                   600-610                   1669                    210-800                 99.2%
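The intercepts and unstandardized coefficients in Table II can be read as a prediction line, GP score ≈ intercept + coefficient × ITE score. The sketch below applies that reading; the helper names are hypothetical, and the result is a point estimate of the average outcome only, since even the PL-3 model leaves roughly half of the variance in GP scores unexplained.

```python
# Intercepts, unstandardized coefficients, and R2 as reported in Table II.
TABLE_II = {
    "PL-1": {"intercept": 381.82, "slope": 0.591, "r2": 0.395},
    "PL-2": {"intercept": 298.98, "slope": 0.620, "r2": 0.471},
    "PL-3": {"intercept": 248.21, "slope": 0.660, "r2": 0.506},
}

PASSING_SCORE = 410  # GP certifying examination passing point (from the text)


def predict_gp_score(level, ite_score):
    """Point prediction of the GP score from one ITE score (hypothetical helper)."""
    model = TABLE_II[level]
    return model["intercept"] + model["slope"] * ite_score


def ite_score_for_predicted_pass(level):
    """ITE score at which the fitted line reaches the passing point."""
    model = TABLE_II[level]
    return (PASSING_SCORE - model["intercept"]) / model["slope"]
```

For example, `predict_gp_score("PL-3", 300)` gives about 446, which is in line with the mean GP score of 440 that Table III reports for the 260-300 range. The PL-3 line reaches the passing point at an ITE score of roughly 245, which falls in the 210-250 range where Table III reports a first-attempt pass rate of 48.9%.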

DISCUSSION

Results from this study suggest that the ITE is an important predictor of a resident’s chances of passing the GP examination, with its predictive value increasing each training year. Although the models showed a significant correlation, the single variable of ITE performance is only one contributing factor. Other factors must be considered when predicting performance on the GP Certifying Examination. Future research should focus on other variables (eg, characteristics of the training program, performance ratings) that correlate with performance of residents on the certifying examination.

Table I shows gains of about 125 points in examination scores both between training years 1 and 2 and between training year 3 and the GP examination. However, there appears to be a plateau between training years 2 and 3. This pattern is difficult to explain. One postulation is that residents take the first ITE almost immediately upon entering pediatrics training, so their first score would reflect minimal pediatrics knowledge. The higher scores in the second year of training can be attributed to the accelerated learning of pediatrics that occurs in the first training year, compared with prior medical school years. From training year 3 to the GP examination, residents are motivated to pass the certification examination, which may explain the large gain in scores during this period. Further research would need to be conducted to ascertain other factors that may contribute to this result, because one might expect equal gains in knowledge across all 3 years.
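The gains discussed above can be recomputed from the means in Table I. This is a small sketch with a hypothetical helper name; it takes differences of group means, and because the number of examinees differs slightly between rows of Table I, these are not paired within-resident gains.

```python
# Mean scores as reported in Table I.
MEAN_SCORES = [
    ("PL-1 ITE", 173.48),
    ("PL-2 ITE", 299.93),
    ("PL-3 ITE", 359.09),
    ("GP", 484.97),
]


def successive_gains(means):
    """Return (from_label, to_label, gain) for each consecutive pair of means."""
    return [
        (a_label, b_label, round(b - a, 2))
        for (a_label, a), (b_label, b) in zip(means, means[1:])
    ]


gains = successive_gains(MEAN_SCORES)
```

The three differences come to about 126, 59, and 126 points, which is the pattern of two large gains separated by a smaller one that the discussion describes.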

In addition to Table III and the Figure, program directors are provided with a summary of residents’ performance on each item in the examination, content feedback statements that provide the topic for each item on the examination, and individual item performance reports that note the exact items each resident missed. The predictive validity of the ITE is one more tool that program directors can use to help interpret ITE scores.

Figure. Five-year average passing rate for the 2001-2005 GP Certifying Examination by ITE Score Groups.

Program directors and residents are instructed that the ITE is not to be used for anything other than a tool to measure the acquisition of pediatric knowledge during training. The limitations arising from the variability in examinee preparedness and administration protocols likely result in differences that have an impact on the predictability for the certifying examination. However, even with these factors, the results indicate that the ITE can help in predicting GP outcome, particularly at the PL-3 level.

REFERENCES

1. Babbott SF, Beasley BW, Hinchey KT, Blotzer JW, Holmboe ES. The predictive validity of the internal medicine in-training examination. Am J Med 2007;120:735-40.
2. Borlase BC, Bartle EJ, Moore EE. Does the in-service training examination correlate with clinical performance in surgery? Curr Surg 1985;42:290-2.
3. Grossman RS, Fincher RM, Layne RD, Seelig CB, Berkowitz LR, Levine MA. Validity of the in-training examination for predicting American Board of Internal Medicine certifying examination scores. J Gen Intern Med 1992;7:63-7.
4. Leigh TM, Johnson TP, Pisacano NJ. Predictive validity of the American Board of Family Practice in-training examination. Acad Med 1990;65:454-7.
5. Garibaldi RA, Subhiyah R, Moore ME, Waxman H. The in-training examination in internal medicine: an analysis of resident performance over time. Ann Intern Med 2002;137:505-10.
6. Replogle WH. Interpretation of the American Board of Family Practice in-training examination. Fam Med 2001;33:98-103.
7. Replogle WH, Johnson WD. Assessing the predictive value of the American Board of Family Practice in-training examination. Fam Med 2004;36:185-8.
8. Rollins LK, Martindale JR, Edmond M, Manser T, Scheld WM. Predicting pass rates on the American Board of Internal Medicine certifying examination. J Gen Intern Med 1998;13:414-6.
9. Waxman H, Braunstein G, Dantzker D, Goldberg S, Lefrak S, Lichstein E, et al. Performance on the internal medicine second-year residency in-training examination predicts the outcome of the ABIM certifying examination. J Gen Intern Med 1994;9:692-4.

The Journal of Pediatrics • September 2008