The Promise and Peril of High-Stakes Tests in Nursing Education


Darrell Spurlock, Jr., PhD, RN, NEA-BC

Regulators, educators, and the public view first-time NCLEX-RN® pass rates as an indication of prelicensure program quality. However, some nursing education programs have implemented progression policies that prevent students at risk for failure from taking the NCLEX-RN, thus undermining the intended value of the NCLEX-RN first-time pass rate as an indicator of educational program quality. While artificially driving up licensure pass rates may protect a program from regulatory or accreditation actions, progression policies based on high-stakes testing do not improve educational program quality and divert attention from other issues that could be affecting NCLEX-RN pass rates.

Today, a continuous focus on quality, efficiency, and accountability has prompted sweeping changes in fields as diverse as health care, education, and business. In nursing, educational programs are subject to requirements that they provide quality educational experiences for prelicensure students. Approval by an accrediting agency (generally the National League for Nursing Accrediting Commission or the Commission on Collegiate Nursing Education) provides external validation to the public, employers, students, and state boards of nursing (BONs). State BONs approach program quality from a different perspective, striving to protect the public by ensuring the safe and effective practice of nursing (Exstrom, 2001). Exercising their regulatory authority, state BONs often provide a framework for the content, delivery, and evaluation of prelicensure nursing education programs (Russell, 2012). State BONs also may focus on nursing education program quality as another mechanism to protect the public. One indicator of nursing education program quality that has become increasingly important in the last 2 decades is the first-time National Council Licensure Examination for Registered Nurses (NCLEX-RN®) pass rate, defined as the proportion of students from a program who pass the NCLEX-RN on their first attempt (Shultz, 2010). As these indicators of program quality have taken hold, the use of high-stakes testing to achieve first-time pass rates above required thresholds has emerged. The purpose of this article is to explore the value of high-stakes testing, focusing on implications for nursing students, nursing faculty, and leaders in nursing education policy.

NCLEX-RN Pass Rates and Program Quality

Regulators, educators, and the public view first-time NCLEX-RN pass rates as an indication of prelicensure program quality (Pennington & Spurlock, 2010). To achieve and maintain approval from the state BON, nursing education programs must achieve defined NCLEX-RN pass rates, which vary from state to state but are generally based on national first-time NCLEX-RN pass rates each year. To meet the licensure exam pass-rate requirements, some nursing education programs have implemented progression policies and high-stakes tests to maintain desirable pass rates by preventing students at risk for failure from taking the NCLEX-RN. For the purpose of this article, progression policies are defined as policies that prevent students from progressing to graduation or from sitting for the NCLEX-RN once they have graduated (Morrison, Free, & Newman, 2002; Spurlock, 2006). Progression decisions are often made based on students' scores on high-stakes tests, normally administered near the end of the nursing program. A high-stakes test, as defined by the American Educational Research Association, American Psychological Association, and National Council on Measurement in Education (American Educational Research Association, 1999), is "a test used to provide results that have important, direct consequences for examinees, programs, or institutions involved in the testing" (p. 176). A nursing education program's use of an exit exam to determine whether a student should sit for the NCLEX-RN is clearly a high-stakes situation. A 2011 National League for Nursing survey of registered nurse programs found that one in three schools had implemented progression policies (National League for Nursing [NLN], 2012a). Twenty percent of the schools required students to achieve a minimum score on a standardized exam in order to progress. One in four practical nursing programs reported implementing the same types of progression policies.

The problem with using progression policies to address licensure pass rates is that doing so undermines the intended value of the NCLEX-RN first-time pass rate. In her 2009 essay, Giddens addressed this point, suggesting that graduation and persistence rates should be reported alongside NCLEX-RN pass rates. Persistence rates reflect program completion rates, that is, the proportion of students starting a program who graduate. Such reporting would provide more information on program quality because persistence and graduation rates are perhaps better indicators of educational program soundness than licensure pass rates alone. In the last few years, the idea of considering the graduation rate along with other metrics has been endorsed by others in higher education. Often, the endorsement comes from political leaders who see taxpayer money as wasted when students enroll but do not complete college programs, but it also comes from accreditation and professional groups who see low retention rates as strong indicators of poor program quality (Ramaley et al., 2012).

In an ideal world, highly qualified students are admitted to nursing education programs, receive high-quality instruction and clinical experiences from qualified and experienced faculty, study intently over the course of their program, graduate from the program, pass the NCLEX-RN, and enter practice as safe, competent, novice nurses. In the real world, students admitted to nursing education programs come from a variety of educational and demographic backgrounds and bring with them an equally diverse set of skills and abilities (Evans, 2008). Also, new, inexperienced faculty members may be hired because of vacancies or increasing student enrollment. Students juggle one or more jobs, tremendous family responsibilities, and a variety of other distractors in addition to their student responsibilities. Lastly, once students graduate, education programs have little control over when they take the licensure exam (Carrick, 2011). Research has shown that the longer students wait to take the NCLEX-RN after separation from the education program, the worse their performance (National Council of State Boards of Nursing [NCSBN], 2002). Given this complex picture, it is easy to see why a single measure can only partially indicate the quality of a nursing education program.

High-Stakes Tests

The use of high-stakes tests in nursing education has been much debated over the last decade. The core question that has driven much of the debate is one of test-use validity, a term Heubert and Hauser (1999) describe as the validity of a test for its intended purpose. The claimed purpose of high-stakes tests in nursing education is to identify students who are likely to pass the NCLEX-RN. Schools can theoretically increase their pass rates if only those students who are likely to pass are permitted to sit for the NCLEX-RN. But faculty members, regulators, and researchers should note a caveat in the language: Students either pass or fail the NCLEX-RN, so there are only two possible outcomes. Testing companies do not clearly report the ability of their tests to predict NCLEX-RN failure, which is what schools truly need to know. But setting passing cut scores is complex work. Downing (2006) notes, "Passing scores reflect policy, values, expert judgment, and politics. The defensibility and the strength of the validity evidence for passing scores rely on the reasonableness of the unbiased process, its rationale and research basis, and the psychometric characteristics of expert judgments" (p. 20). Thus, it is left to the faculty to decide what happens to students who are not in the predicted-to-pass category on high-stakes exams. These students, who may have good grade point averages, may be told that they cannot take the NCLEX-RN. They are left to sort out the meaning of their high-stakes test scores, the good grades they achieved in their courses, and their inability to simply take the licensing exam, even though students who score in the lowest categories of the high-stakes exams in use today pass the NCLEX-RN more often than they fail it (Lewis, 2006).

Cut Scores

To illustrate, consider one widely studied exit exam on which only students scoring 900 or above (the highest scoring category) are predicted to pass the NCLEX-RN; thus, students with scores below 900 are considered at risk for failing the NCLEX-RN (Nibert, Young, & Adamson, 2002). Figure 1, based on data from several large published studies, shows the proportion of students from each exit exam scoring category who passed the NCLEX-RN on their first attempt. Nibert, Young, and Adamson (2002) found that only 19.3% of N = 3,844 students scoring below 900 failed the NCLEX-RN. Lewis (2006) reported that 13% of N = 5,295 students scoring below 900 failed the NCLEX-RN on their first attempt; even students scoring less than 700, the lowest scoring category, passed the NCLEX-RN 68.7% of the time. Data from Young and Langford (2010) show that only 14.2% of N = 1,553 students scoring below 900 actually failed the NCLEX-RN on their first attempt, and students scoring less than 700 passed the NCLEX-RN 61.8% of the time. Spurlock and Hunt (2008) showed a score of 625 to 650 to be a much more accurate cut score, though overall classification accuracy was still low. Young and Langford (2010) surveyed 66 schools in their study. Thirty-two schools reported using 850 as the cut score for progression, even though the predicted-to-pass label is applied only to students scoring 900 and above (Young & Langford, 2010, p. 2). When students did not reach the cut score determined by their school, the consequences were capstone course failure (13 schools), delay or denial of graduation (21 schools), and delay or denial of NCLEX candidacy (18 schools).
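To make the classification trade-off concrete, the sketch below (illustrative, not from the original article) uses the counts reported above to estimate how many eventual first-attempt passers a blanket below-900 progression policy would hold back for every eventual failure it intercepts. The sample sizes and failure proportions are those published in the three studies; the rest is simple arithmetic.

```python
# Illustrative sketch: the trade-off implied by a "hold back everyone
# scoring below 900" progression policy. Inputs are the published counts
# of students scoring below 900 and the proportion of those students who
# failed the NCLEX-RN on their first attempt.

studies = {
    "Nibert, Young, & Adamson (2002)": (3844, 0.193),
    "Lewis (2006)": (5295, 0.130),
    "Young & Langford (2010)": (1553, 0.142),
}

for study, (n_below_900, fail_rate) in studies.items():
    failures_caught = n_below_900 * fail_rate        # would-be failures blocked
    passers_blocked = n_below_900 * (1 - fail_rate)  # would-be passers blocked
    ratio = passers_blocked / failures_caught
    print(f"{study}: {passers_blocked:,.0f} likely passers held back per "
          f"{failures_caught:,.0f} likely failures ({ratio:.1f} to 1)")
```

Using Lewis (2006), for example, roughly 4,600 of the 5,295 students scoring below 900 went on to pass on their first attempt, so a strict below-900 policy would block almost seven eventual passers for every eventual failure it prevents.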

Question of Value

Some conclusions can be drawn from these data. First, when classifying only those students scoring in the highest scoring category as predicted to pass, high-stakes tests perform well: Students with the best content knowledge and test-taking skills are the most prepared for the NCLEX-RN and will likely pass it. Second, students scoring in the lower scoring categories are still likely to pass the NCLEX-RN, yet faculty may erroneously assume these students are at great risk for NCLEX failure because of their low scores, and they are denied graduation or NCLEX candidacy despite evidence of high NCLEX-RN pass rates among these students. Third, schools represented in the studies summarized above reported NCLEX-RN results in line with national average pass rates, despite the fact that most students did not score 900 or more on their high-stakes exams.


Figure 1
First-Time NCLEX-RN® Pass Rates by Scoring Category From Three National Studies

Exit exam score                  Nibert, Young, & Adamson (2002)   Lewis (2006)   Young & Langford (2010)
> 900                            98.3                              99.1           98.3
850-899                          94.08                             96             95.6
800-849                          89.18                             93.3           92.3
700-799                          76.28                             85.3           84.1
< 699                            49.01                             68.7           61.8
National avg. first-time rate    83.8                              86.7           86.7

Note. Values are average first-time NCLEX-RN pass rates (%). National average first-time NCLEX-RN pass rates were collected from National Council of State Boards of Nursing data online (www.ncsbn.org/1237.htm) to correspond with the high-stakes testing pass rates reported in the respective studies: Nibert et al. (2002), 2000 national averages; Lewis (2006), 2002 national averages; Young and Langford (2010), 2008 national averages.

There is no rigorous empirical evidence to support claims that implementing progression policies and high-stakes testing improves educational program quality or NCLEX-RN pass rates. Carr (2011) reported implementing high-stakes testing and progression policies with a resulting increase in NCLEX-RN pass rates, but even after use of the high-stakes tests was disallowed by the New York State Education Department, pass rates remained high. Haleem et al. (2010) implemented a bundle of interventions, including standardized testing (but not high-stakes testing), to improve NCLEX-RN pass rates, and were successful. Nursing programs and their parent institutions have been the targets of lawsuits by students who were subjected to progression policies they were unaware of before the end of their program, and some of these students have prevailed (for example, Barbieri, 2008; Harris, 2011). The National League for Nursing (NLN) recently addressed many progression policy issues in a special Reflection & Dialogue series (NLN, 2010), in its Fair Testing Guidelines (NLN, 2012b), and in a position statement approved by the NLN Board of Governors (NLN, 2012a), imploring nursing faculty to consider more than just test scores when making important educational decisions and to take a more holistic approach to program evaluation.

Reconsidering the Approach

An interesting parallel exists between high-stakes testing in nursing education and in public K-12 education: In both cases, the testing process drifted away from the original intent (Spurlock, 2006). In a seminal and expansive report, the National Research Council (2011) recently evaluated the No Child Left Behind Act, which had as its main tool the use of high-stakes tests in public K-12 education. The Council found that high-stakes exit tests, especially at the high school level, had no impact on student achievement and only served to decrease graduation rates. The same effect seems to be playing out in nursing education.

Nearly 30 years ago, Cornell (1985, p. 357) warned nursing education leaders: "Just as grades are not the only result of the college experience, the environment is not the only cause of learning. Measuring only the components of students and structure overlooks the complete equation: input (student) plus process/structure equals outcomes." While students are denied progression in their nursing programs based on their test scores, reports on the consequences of low NCLEX-RN pass rates for faculty cannot be easily found. Are students the sole cause of a program's NCLEX-RN pass-rate problem? Clearly not, but the use of student progression and graduation policies based on high-stakes testing addresses only the student component. The lack of attention to other components of the teaching/learning equation is likely due to the complexity of addressing these issues, but they deserve attention nonetheless.

Over 3 decades ago, Astin (1980) described quality in higher education as an organization's continuous process of critical self-reflection and improvement. Today, an appropriate paradigm for approaching the issue of poor NCLEX-RN pass rates is continuous quality improvement (CQI). A thin distinction exists between formal program evaluation and CQI, though they may be more or less intertwined in individual nursing programs. Program evaluation normally refers to a timed, systematic, predetermined evaluation process driven by regulatory and accreditation requirements and providing general data about educational programs (as suggested by Spangehl, 2012). CQI is an improvement-oriented approach that most nurses in clinical practice settings understand. There are a variety of frameworks and models faculty and state BONs could use to address program quality in a more thoroughly systems-oriented manner, as suggested by Davis (2011) and Carrick (2011). A few illustrative examples follow.

Carrick addresses the problem of NCLEX-RN pass rates specifically by linking student and educational (faculty and environment) systems and describing how both must be addressed in any intervention toward improvement. This seems clear enough, though one is left wondering why it is not more widely employed. Suhayda and Miller (2006) provide an overview of how Rush University College of Nursing used a modified input-process-product model to frame its evaluation efforts, while also describing the culture of CQI existing in the college. Similarly, Brown and Marshall (2008) reported using a CQI model to address NCLEX-RN pass rates among their associate-degree students. In just 1 year, pass rates increased from 56% to 87%, without the need to use high-stakes testing or punitive progression policies. Spangehl (2012) describes the Academic Quality Improvement Program (AQIP), an alternative method of maintaining institutional accreditation through the Higher Learning Commission, one of the large regional accreditors in the United States. In this program, educators focus on CQI rather than static program evaluation plans. Carroll, Thomas, and DeWolff (2006) describe the usefulness of AQIP and its Plan-Do-Check-Act process in nursing education. The focus in programs like AQIP is on processes in addition to outcomes.

Conclusions and Recommendations

Nursing education accreditors and state BONs require prelicensure nursing programs to demonstrate educational effectiveness and quality instruction by achieving NCLEX-RN pass rates above established thresholds. Some nursing programs have responded by disallowing students they deem at high risk for NCLEX-RN failure to take the exam, thereby distorting the value of NCLEX-RN pass-rate information. While artificially driving up licensure pass rates may protect a program from regulatory or accreditation actions, progression policies based on high-stakes testing do nothing to improve educational program quality and divert attention from the issues that could be affecting NCLEX-RN pass rates, including poor instructional quality, disruptive or inadequate learning environments, and a lack of effective learning resources. In addition, making important educational decisions based on high-stakes test scores directly conflicts with the ethics of acceptable test use, as described by the Joint Committee on Testing Practices (2004).

From an evidence-based practice perspective, high-stakes testing has shown little value in improving school NCLEX-RN pass rates. High-stakes exams have not been found to accurately identify students who will fail the NCLEX-RN, yet these are the only students who should be subject to increased scrutiny. The current national pass rate for first-time, U.S.-educated test takers is more than 90% (NCSBN, 2012). Correctly identifying the 10% who fail the NCLEX-RN is exceedingly difficult because of the statistical problems in identifying cases of a low-prevalence condition (Grimes & Schulz, 2002), in this case, NCLEX-RN failure. Even tests with high sensitivity and specificity perform poorly when the prevalence of an event (NCLEX-RN failure) is low.
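A short worked example may help. The sketch below (illustrative, not from the original article) applies Bayes' theorem to show why even a seemingly accurate screening exam flags mostly students who would have passed. The prevalence of NCLEX-RN failure is set at 10%, consistent with the greater-than-90% national first-time pass rate (NCSBN, 2012); the sensitivity and specificity values are hypothetical round numbers chosen only for illustration.

```python
# Illustrative sketch of the low-prevalence screening problem described by
# Grimes and Schulz (2002). Prevalence of first-attempt NCLEX-RN failure is
# set at 10%; the sensitivity/specificity pairs are hypothetical.

def positive_predictive_value(sensitivity: float, specificity: float,
                              prevalence: float) -> float:
    """Probability that a student flagged as 'at risk' actually fails."""
    true_positives = sensitivity * prevalence
    false_positives = (1 - specificity) * (1 - prevalence)
    return true_positives / (true_positives + false_positives)

PREVALENCE = 0.10  # ~10% of first-time, U.S.-educated candidates fail

for sens, spec in [(0.80, 0.80), (0.90, 0.90), (0.95, 0.95)]:
    ppv = positive_predictive_value(sens, spec, PREVALENCE)
    print(f"sensitivity {sens:.0%}, specificity {spec:.0%}: "
          f"{ppv:.0%} of flagged students would actually fail")
```

Even at 90% sensitivity and 90% specificity, performance far better than the published cut-score data suggest, only half of the students flagged as at risk would actually fail; the other half would be held back despite being on track to pass.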

A way forward could include three considerations for state BONs, nursing education leaders, and nursing faculty. First, in addition to NCLEX-RN pass rates, attrition rates or program completion rates, defined as the proportion of students starting a program who graduate, should be reported. This alone could encourage prelicensure programs to invest time in critical self-reflection to determine the root causes of pass-rate problems. State BONs could also collect more information from nursing programs by inquiring about the programs' use of standardized exams, whether test scores alone determine progression or graduation, and how standards for progression (for example, test cut scores) were determined. These questions could further encourage nursing education programs to examine the processes and outcomes of using high-stakes exams within a more comprehensive framework of program evaluation.

Second, nursing faculty should reconsider how they use classroom-designed and standardized exams. The current paradigm is that exams are used primarily in summative evaluation, that is, to evaluate what a student knows about a subject. Exams are universally structured in a "minus framework": Students lose points unless they answer every question correctly, so from the start, exams function as anxiety-producing, grade-lowering mechanisms. Research from psychology overwhelmingly shows that frequent testing, used as much for learning as for evaluation, has robust positive effects on learning and memory (Roediger & Karpicke, 2006a, 2006b). This phenomenon, called the "testing effect," has been well described in the literature (for a review, see Carpenter, 2012) and should be implemented and evaluated in nursing education more widely.

Lastly, there is substantial opportunity to apply a positive research approach regarding pass rates, one that seeks to describe the strengths and characteristics of thriving nursing education programs. For decades, some nursing education programs have provided excellent education with resulting high NCLEX-RN pass rates, high rates of students returning for advanced education, and other positive outcomes, all without using high-stakes tests. We need to know more about these schools and their successful practices. The nursing education literature is replete with problem-oriented research, but in our commitment to starting any study with a research problem, have we overlooked the strengths and virtues of strong nursing education programs that could be studied so their practices can be replicated? Clearly, there are opportunities in this domain of research.

References

American Educational Research Association, American Psychological Association, & National Council on Measurement in Education. (1999). Standards for educational and psychological testing. Washington, DC: American Educational Research Association.

Astin, A. (1980). When does a college deserve to be called "high quality?" Improving teaching and institutional quality. Current Issues in Higher Education, 1(10).

Barbieri, L. (2008, April 22). Nursing students unsure if they will walk in May graduation. The Bolivar Commercial. Cleveland, MS.



Brown, J., & Marshall, B. (2008). Continuous quality improvement: An effective strategy for improvement of program outcomes in a higher education setting. Nursing Education Perspectives, 29(4), 205–211.

Carpenter, S. K. (2012). Testing enhances the transfer of learning. Current Directions in Psychological Science, 21(5), 279–283.

Carr, S. M. (2011). NCLEX-RN pass rate peril: One school's journey through curriculum revision, standardized testing, and attitudinal change. Nursing Education Perspectives, 32(6), 384–388.

Carrick, J. (2011). Student achievement and NCLEX-RN success: Problems that persist. Nursing Education Perspectives, 32(2), 78–83. doi:10.5480/1536-5026-32.2.78

Carroll, V., Thomas, G., & DeWolff, D. (2006). Academic quality improvement program: Using quality improvement as tool for the accreditation of nursing education. Quality Management in Health Care, 15(4), 291–295.

Cornell, G. (1985). The value-added approach to the measurement of educational quality . . . measuring student gains. Journal of Professional Nursing, 1(6), 356–363.

Davis, B. W. (2011). A conceptual model to support curriculum review, revision, and design in an associate degree nursing program. Nursing Education Perspectives, 32(6), 389–394.

Downing, S. M. (2006). Twelve steps for effective test development. In S. M. Downing & T. M. Haladyna (Eds.), Handbook of test development. Mahwah, NJ: Lawrence Erlbaum Associates.

Evans, B. (2008). The importance of educational and social backgrounds of diverse students to nursing program success. Journal of Nursing Education, 47(7), 305–313. doi:10.3928/01484834-20080701-04

Exstrom, S. (2001). The state board of nursing and its role in continued competency. Journal of Continuing Education in Nursing, 32(3), 118–125.

Giddens, J. (2009). Changing paradigms and challenging assumptions: Redefining quality and NCLEX-RN pass rates. Journal of Nursing Education, 48(3), 123–124.

Grimes, D. A., & Schulz, K. F. (2002). Uses and abuses of screening tests. The Lancet, 359(9309), 881–884. doi:10.1016/S0140-6736(02)07948-5

Haleem, D., Evanina, K., Gallagher, R., Golden, M., Healy-Karabell, K., & Manetti, W. (2010). Program evaluation: How faculty addressed concerns about the nursing program. Nurse Educator, 35(3), 118–121. doi:10.1097/NNE.0b013e3181d95000

Harris, A. (2011, December 9). MSU nursing dean faced similar complaints at last school. The Charleston Gazette. Charleston, WV. Retrieved from http://wvgazette.com/News/201112090120?page=1

Heubert, J., & Hauser, R. (Eds.). (1999). Testing for tracking, promotion, and graduation. Washington, DC: National Academies Press.

Joint Committee on Testing Practices. (2004). Code of fair testing practices in education. Retrieved from www.apa.org/science/programs/testing/fair-testing.pdf

Lewis, C. (2006). Predictive accuracy of the HESI exit exam on NCLEX-RN pass rates and effects of progression policies on nursing student exit exam scores (Doctoral dissertation). Texas Woman's University.

Morrison, S., Free, K., & Newman, M. (2002). Do progression and remediation policies improve NCLEX-RN pass rates? Nurse Educator, 27(2), 94–96.

National Council of State Boards of Nursing. (2002). NCLEX research report: The NCLEX® delay pass rate study. Retrieved from http://www.ncsbn.org/pdfs/RecentNCLEXResearch_Web_Testing017B02.pdf

National Council of State Boards of Nursing. (2012). 2012 NCLEX pass rates. Retrieved from www.ncsbn.org/Table_of_Pass_Rates_2012.pdf



National League for Nursing. (2010, December). Reflection and dialogue: High-stakes testing. Retrieved from www.nln.org/aboutnln/reflection_dialogue/refl_dial_7.htm

National League for Nursing. (2012a, February). The fair testing imperative in nursing education. Retrieved from www.nln.org/aboutnln/livingdocuments/pdf/nlnvision_4.pdf

National League for Nursing. (2012b). NLN fair testing guidelines. Retrieved from www.nln.org/facultyprograms/facultyresources/fairtestingguidelines.pdf

National Research Council. (2011). Incentives and test-based accountability in education. Washington, DC: National Academies Press.

Nibert, A., Young, A., & Adamson, C. (2002). Predicting NCLEX success with the HESI Exit Exam: Fourth annual validity study. CIN: Computers, Informatics, Nursing, 20(6), 261–267.

Pennington, T., & Spurlock, D. (2010). A systematic review of the effectiveness of remediation interventions to improve NCLEX-RN pass rates. Journal of Nursing Education, 49(9), 485–492. doi:10.3928/01484834-20100630-05

Ramaley, J., Hauptman, A. M., Callan, P. M., Hurtado, S., Bailey, T., Reno, E., & Merisotis, J. P. (2012). Do college-completion rates really measure quality? Chronicle of Higher Education, 58(27), A16–A19.

Roediger, H. L., & Karpicke, J. D. (2006a). Test-enhanced learning. Psychological Science, 17(3), 249–255. doi:10.1111/j.1467-9280.2006.01693.x

Roediger, H. L., & Karpicke, J. D. (2006b). The power of testing memory: Basic research and implications for educational practice. Perspectives on Psychological Science, 1(3), 181–210. doi:10.1111/j.1745-6916.2006.00012.x

Russell, K. (2012). Nurse practice acts guide and govern nursing practice. Journal of Nursing Regulation, 3(3), 36–42.

Shultz, C. M. (2010). President's message: High-stakes testing!? Help is on the way. Nursing Education Perspectives, 31(4), 205.

Spangehl, S. D. (2012). AQIP and accreditation: Improving quality and performance. Planning for Higher Education, 40(3), 29–35.

Spurlock, D. R., Jr. (2006). Do no harm: Progression policies and high-stakes testing in nursing education. Journal of Nursing Education, 45(8), 297–302.

Spurlock, D. R., Jr., & Hunt, L. (2008). A study of the usefulness of the HESI Exit Exam in predicting NCLEX-RN failure. Journal of Nursing Education, 47(4), 157–166.

Suhayda, R., & Miller, J. M. (2006). Optimizing evaluation of nursing education programs. Nurse Educator, 31(5), 200.

Young, A., & Langford, R. (2010). The eighth E2 validity study for RNs: Accuracy, benchmarking, remediation, and testing practices. Retrieved from www.elsevieradvantage.com/pdf/HESI_Eight_E2_Validity_Study_for_RNs_E-Flyer.pdf

Darrell Spurlock, Jr., PhD, RN, NEA-BC, is the Senior Nurse Researcher at Riverside Methodist Hospital and Assistant Professor at Mount Carmel College of Nursing, both in Columbus, Ohio.