Selecting Standardized Tests in Nursing Education


CHERYL L. MEE, MSN, MBA, RN⁎ AND VIRGINIA J. HALLENBECK, MS, RN, ACNS-BC†

Nursing faculty frequently use, or consider using, nationally standardized tests to evaluate nursing student performance and students' potential to pass the National Council Licensure Examination (NCLEX) © after graduation. There is little literature available to guide nursing faculty on the criteria to consider when selecting a standardized testing company to assess student readiness for the NCLEX. The intent of this article is to share criteria to consider when evaluating a standardized test or testing program; the criteria were gathered through an informal survey of faculty who are currently using standardized tests. (Index words: Standardized tests; Attributes of standardized tests; Selection criteria) J Prof Nurs 0:1–5, 2014. © 2012 Published by Elsevier Inc.

ARTICLE IN PRESS

⁎ Adjunct Faculty, Immaculata University, Manager Faculty Development, Elsevier.
† Clinical Nurse Specialist, The Ohio State University Medical Center, Adjunct Faculty, Indiana Wesleyan University.
Address correspondence to Cheryl Mee: Adjunct Faculty, Immaculata University, Manager Faculty Development, Elsevier. E-mail: [email protected]
8755-7223/12 Journal of Professional Nursing, Vol 0, No. 0 (Month), 2014: pp 1–5 © 2012 Published by Elsevier Inc.
http://dx.doi.org/10.1016/j.profnurs.2012.06.006

AS NURSING EDUCATORS, we are accustomed to using standards and guidelines to assist in making the best decisions. These include standards for choosing a textbook, guidelines to determine if a clinical site is appropriate for our students, standards to evaluate research, and the list goes on. But, in a recent search of the literature, no standards or guidelines could be found to assist nursing faculty members in determining the best standardized testing program to adopt.

A literature search was conducted using Cumulative Index to Nursing and Allied Health Literature Plus, PubMed, and ScienceDirect to find articles about the criteria to use in selecting standardized tests for nursing education. The search was limited to articles published in English in peer-reviewed nursing publications within the last 10 years. The search terms were selection, standardized tests, nursing education, and guidelines for selecting. No dissertations or unpublished documents were included in this review. Although there have certainly been articles published on the quality of various standardized testing products, no articles were found that specifically addressed how to select a standardized testing product.

More and more nursing programs are using standardized testing. The pressure for high National Council Licensure Examination (NCLEX) © pass rates is imposed by numerous stakeholders, including, but not limited to, boards of nursing, program faculty and administration, students and their families, and funding bodies. NCLEX pass rates are measures of student success and program quality. Nursing faculty want to ensure that the quality of the nursing education at their facility is meeting or exceeding national standards. Nursing faculty also feel an obligation not to retain students throughout the full program when there is little or no chance that these individuals will be able to pass boards. Standardized testing is one way to gauge a student's potential for program and NCLEX success. Because of the critical role standardized testing can play for an education program, it is imperative that the right decisions be made when selecting a testing product.

Because of the lack of literature addressing this concern, the authors queried 10 expert faculty from 10 schools across the United States who were currently using standardized tests and were involved in analyzing testing data. These nursing faculty members were currently using Health Education Systems, Inc. Assessment examinations; all were aware of product features offered by various standardized testing product companies, and some had experience using other testing products as well. One open-ended question was asked: “What do you think are the important criteria for selecting a standardized testing company/product that nursing faculty should consider?” Six major factors for faculties to consider emerged.

Data to Support the Examination's Predictive Capabilities

The first criterion to consider is: Does the testing program being considered have research that supports the claim that the test is valid? A valid test means that it measures

what it claims to measure (Morrison, Nibert, & Flick, 2006; Polit & Beck, 2012). Validity of a high-stakes exit examination in nursing can be determined by how accurately the test identifies students who will pass the licensure or certification examination. If the testing company states that its examinations are predictive of success on the NCLEX, the faculty should review both the research and the strength of that research. Faculty should inquire about the research process; the examination statistics, including those that demonstrate reliability and validity; and the predictive accuracy of the examinations in relation to student success. Factors to consider when reviewing the research data include the number of students in the research sample and the student selection methods. For example, does the pool of students in the exit examination validity research match the makeup of the pool of students taking the NCLEX? Has the examination been validated on the same type of students as those in the school that is considering the testing package? If the faculty is considering purchase of a testing program for an associate degree nursing program, has any research been done with associate degree graduates?

Reliability refers to how consistently an examination performs with repeated use among different groups of students, that is, how consistent the test scores are (Nunnally & Bernstein, 1994). A reliability coefficient, such as the Kuder–Richardson 20 (KR20), is a score for an examination's overall reliability. For more information on some examination statistics, see Table 1.

Faculty should consider the research as it pertains to the intended use of the testing product. If the school is planning on using the standardized exit examination, faculty should examine the research on the predictive accuracy of that examination. But, if the faculty are considering purchasing additional products such as admission assessment and specialty examinations, the research related to these examinations also needs to be examined.
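One way to make "predictive accuracy" concrete when reading a testing company's research is to tabulate, for a cohort, how often the examination's prediction matched the actual NCLEX outcome. The Python sketch below is only an illustration under stated assumptions: the function name `predictive_accuracy`, the (predicted_to_pass, passed_nclex) record layout, and the eight-student cohort are all hypothetical and are not taken from any testing company's data or methods.

```python
from typing import Dict, Iterable, Optional, Tuple


def predictive_accuracy(
    outcomes: Iterable[Tuple[bool, bool]]
) -> Dict[str, Optional[float]]:
    """Summarize how well exit-examination predictions matched NCLEX results.

    Each pair is (predicted_to_pass, passed_nclex). Both field names are
    hypothetical and chosen only for this illustration.
    """
    pass_right = pass_wrong = fail_right = fail_wrong = 0
    for predicted, passed in outcomes:
        if predicted and passed:
            pass_right += 1      # predicted to pass, did pass
        elif predicted and not passed:
            pass_wrong += 1      # predicted to pass, did not pass
        elif not predicted and not passed:
            fail_right += 1      # predicted to fail, did fail
        else:
            fail_wrong += 1      # predicted to fail, but passed
    total = pass_right + pass_wrong + fail_right + fail_wrong
    n_pass_pred = pass_right + pass_wrong
    n_fail_pred = fail_right + fail_wrong
    return {
        "n": total,
        # Share of "will pass" predictions that turned out to be correct.
        "pass_predictions_correct": pass_right / n_pass_pred if n_pass_pred else None,
        # Share of "will not pass" predictions that turned out to be correct.
        "fail_predictions_correct": fail_right / n_fail_pred if n_fail_pred else None,
        # Share of all predictions that matched the outcome.
        "overall_agreement": (pass_right + fail_right) / total if total else None,
    }


# Invented cohort of eight students, for illustration only.
cohort = [(True, True)] * 5 + [(True, False), (False, False), (False, True)]
summary = predictive_accuracy(cohort)
print(summary)
```

Reporting the pass and fail predictions separately matters because a study can show high accuracy for students predicted to pass while saying little about the students it flagged as at risk, and a small sample makes either figure unstable.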

Test and Test Item Development

A review of the examination development process should be conducted as well. Who are the test item writers, what are their qualifications as content experts, and how are they trained in test item writing? Are test items reviewed and revised after being submitted by the item writer, and are items piloted with statistical analysis completed for both difficulty and reliability before they are used as actual test items? Following the test blueprint, content experts with active clinical experience should be developing the questions to assure that items are based on true current clinical expectations for new nurse graduates. Experts should be well versed in the expectations for a new clinical nurse in their field of expertise so that they are writing appropriate level questions for entry into practice and for a diverse group geographically located across the nation. In addition to content expertise, proficiency in item writing is essential for the team of item writers and editors. The team should be well versed in developing strong test items that assess students at the analysis level or higher based on the revised Bloom's Taxonomy (Anderson & Krathwohl, 2001). Questions should be well written with a tightly focused stem, require multilogical thinking to answer, and contain strong plausible distracters (Morrison, Nibert, & Flick, 2006). Each question should be supported with rationales that clarify and instruct the student when the questions are reviewed. Editing processes help assure that formats and item development are of the highest quality and prevent cultural or regional biases. Internal quality review assures that item editing meets high quality standards.

Standardized tests should contain questions that mimic the types of questions and formats seen on the NCLEX, which include chart exhibit, hot spot, select all that apply, ranking, and fill in the blank, in addition to the customary multiple choice. And because the NCLEX evolves and newer innovative formats such as audio and video questions are added, the standardized test should reflect those changes. The test blueprint should adhere to the NCLEX blueprint. As the blueprint changes, so should the preparation examinations. It is also essential that the standardized test remains current and that all changes that will be in the NCLEX are incorporated in a timely manner. For example, NCLEX changes occur in the May/June time frame every 3 years. Therefore, the changes to the standardized testing program should occur prior to this event so that students preparing for the new version of the NCLEX are familiar with the appropriate test content.

Lastly, it is important to be sure that key concepts and content areas that the faculty desire to assess are actually covered in the examination. Does the standardized testing company provide a blueprint that the faculty can review? Finally, as mentioned above, item piloting and statistical analysis of items and examinations help assure that criteria are being met on an ongoing basis.

Faculty Friendly

Faculties need scoring reports that are well developed yet easy to interpret. The data analysis capability of the standardized testing provider should include the scores for the school's group of students and the national averages. In addition, to improve ease of analysis, scores should be broken into categories of content that are appropriate for that school. For example, a faculty member, depending on which report they are writing or what concern they are addressing, might need to see how their group of students scored on various National League for Nursing (NLN) or American Association of Colleges of Nursing (AACN) accreditation categories, nursing process categories, or Quality and Safety Education for Nurses (QSEN) categories. Before choosing a product, nursing faculty members should inquire about the scope of the categories available in the reports generated. Some companies offer a wide variety of content categories, and some are more limited. When

analyzing data across multiple examinations, faculty can evaluate scores in focused areas of interest for both curriculum evaluation and accreditation purposes. Faculty should consider how data are displayed in the score reports. Are there visual graphs and numeric information that can be easily interpreted? Sample reports should be available for review. In addition, the extent to which data can be manipulated beyond the customary score reports and the ease of performing manipulations should be considered. The testing company should provide the ability to analyze multiple tests across time and multiple content categories, such as the categories mentioned above. Ideally, a group of faculty should request a demonstration of the system for pulling reports to evaluate the reporting capabilities. Although having a system filled with data may be impressive, it is most helpful if it is easy to extract and analyze needed information.

Security is a major concern for faculty and for testing companies. If items are not secure and students know the correct response ahead of time, the item becomes invalid, and the test is no longer a good predictor of student success. Therefore, it is imperative that security measures for the test are in place. Are questions scrambled for each student individually? How is the item secured, and do students get to see the whole item, correct and incorrect responses, and item rationale? How is that review of the test protected? Can students access the Internet or other sources of information while taking the examination? Test items should be “retired” and replaced with new items on a regular basis, so inquire about the frequency of new item additions and deletion of items. Faculty should compare examination security systems between testing products while also evaluating the student's ability to learn from the test, test review, and remediation tools.

The faculty surveyed also stated that it is important that the test is appropriate for their unique student mix. Language used in the test items should be free of culturally or regionally biased terms. In addition, how are students accommodated for special learning needs?

Finally, it is important to assess the level of faculty commitment needed to correctly implement the test session. How easy is it for the faculty to learn the system, both administering the test and retrieving the results? Training options and additional support that faculty and students receive (including technical support) before, during, and after test sessions are essential considerations that should be discussed. An additional criterion would be the availability of higher level support such as help with implementation of the standardized testing from experienced faculty advisors.

Table 1. Test Statistics, Uses, and Interpretation of the Data

Statistical measure: Kuder–Richardson 20 (KR20)
Uses: Determines test reliability. Measures whether the high-scoring individuals who took the examination are consistently answering the questions correctly or the low-scoring individuals who took the examination are consistently answering the test items incorrectly.
Interpreting data: The KR20 ranges from −1 to +1, and the closer the KR20 is to +1, the more internal consistency the examination possesses. The best standardized tests have a reliability coefficient greater than .90.

Statistical measure: Item difficulty
Uses: Measures how many students got the item correct.
Interpreting data: The item difficulty index ranges from 0 to 100.

Statistical measure: Point–biserial correlation coefficient
Uses: Measures the ability of the item to help discriminate between students who know the content and students who do not know that content.
Interpreting data: The strength of the item to discriminate yields a number from 0 to 1, with a value greater than 0.30 indicating that the question is a highly discriminating test item.
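For faculty who want to see how the three statistics named in Table 1 (KR20, item difficulty, and the point–biserial coefficient) are typically calculated, the following Python sketch works through a tiny invented response matrix. It makes several assumptions that should be checked against any vendor's documentation: items are scored 1 (correct) or 0 (incorrect), the KR20 uses the population variance of total scores, and the point–biserial is computed against the total score including the item itself. The function names and the data are hypothetical.

```python
from math import sqrt
from statistics import mean, pstdev, pvariance
from typing import List, Optional

# Rows are students, columns are items; 1 = correct, 0 = incorrect.
Responses = List[List[int]]


def item_difficulty(responses: Responses) -> List[float]:
    """Percentage of students answering each item correctly (0 to 100)."""
    n_students = len(responses)
    n_items = len(responses[0])
    return [
        100.0 * sum(row[i] for row in responses) / n_students
        for i in range(n_items)
    ]


def kr20(responses: Responses) -> float:
    """Kuder-Richardson 20: (k / (k - 1)) * (1 - sum(p * q) / variance)."""
    k = len(responses[0])
    totals = [sum(row) for row in responses]
    var_total = pvariance(totals)  # assumption: population variance
    p = [d / 100.0 for d in item_difficulty(responses)]
    sum_pq = sum(pi * (1.0 - pi) for pi in p)
    return (k / (k - 1)) * (1.0 - sum_pq / var_total)


def point_biserial(responses: Responses, item: int) -> Optional[float]:
    """Correlation between success on one item and the total score."""
    totals = [sum(row) for row in responses]
    right = [t for row, t in zip(responses, totals) if row[item] == 1]
    wrong = [t for row, t in zip(responses, totals) if row[item] == 0]
    sd = pstdev(totals)
    if not right or not wrong or sd == 0:
        return None  # undefined when everyone (or no one) got the item right
    p = len(right) / len(responses)
    return (mean(right) - mean(wrong)) / sd * sqrt(p * (1.0 - p))


# Invented data: six students, four items.
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
]
print("Item difficulty (%):", [round(d, 1) for d in item_difficulty(responses)])
print("KR20:", round(kr20(responses), 3))
print("Point-biserial, item 3:", round(point_biserial(responses, 2), 3))
```

A real examination has far more items and examinees than this toy matrix, so values computed here only show the mechanics; they say nothing about what a good commercial examination should report.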

Student View

The student's view of testing should be considered along with the faculty perspective. How well does the testing program help students identify individual learning needs? This goes back to the predictive accuracy of the examination but also includes help for students after the examination. Do the test report and remediation program help students identify their strengths and weaknesses in various content categories? With any testing package, faculty should help students understand the potential value of the examination: it is one of many tools that can help students focus their study efforts. Students and faculty value an examination that is reliable and valid in predicting student success (even though students may not articulate this need). They appreciate when the testing environment and the test itself closely model the actual board examination. Practice that mimics real board examinations may help diminish some test anxiety.

Remediation can be a large undertaking for students, so is the remediation targeted to student weaknesses? Does the remediation content come from content resources that are current, reliable, and accurate? Is the content easy to use, and can it be collated by the student for further review in the future? Accessibility of the remediation and ease of retrieval by students who may live in remote areas with limited access to high-speed Internet should be determined. Because this is an issue for some students, it should be determined whether the feedback, reports, and remediation are available only on-line or whether students can print materials for use when access to the Internet is limited.

Faculty should have access to student reports and be able to collate testing results for students across multiple examinations in various content areas. In addition, faculty should be able to easily determine if students have accessed their on-line remediation and what types of remediation content they have explored. One plus would be if a testing product had practice tests in the remediation content, which would help not only students but also faculty, who could view the results to see students' success on these tests.

Table 2. Criteria Checklist for Selecting a Standardized Examination

Column headings: Criteria; Criteria met/not met; Fit with program

Examinations' ability to predict student success
• Validity
• Reliability
Test and item development
• Based on NCLEX blueprint
• Qualifications of item writers
• Variety of item types
• Covers desired content (specialty examinations)
Reports and data
• Ease of use/ability to collate data for groups and individual students
• Item content categories: NCLEX, AACN, NLN, QSEN, and so forth
Faculty support/training
Security issues addressed
Student perspective
• Testing model closely resembles actual examination
• Remediation content: quality and format
System requirements
• IT support
• Upgrades needed at school computer laboratories to accommodate testing
Purchasing considerations
• Cost
• Flexibility
• Testing options (exit, admission, specialty, custom, computer adaptive, etc.)
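The Student View discussion asks that faculty be able to collate each student's results across multiple examinations by content area. As a rough illustration of what that collation involves, the Python sketch below averages scores per student and per content category and then names the lowest-scoring category. Everything in it is an assumption made for the example: the record layout, the function names, the category labels, and the scores are invented and do not represent any testing company's export format or reporting tool.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, List, Tuple

# Hypothetical record layout:
# (student_id, exam_name, content_category, percent_correct)
ScoreRecord = Tuple[str, str, str, float]


def collate_by_category(records: List[ScoreRecord]) -> Dict[str, Dict[str, float]]:
    """Average each student's scores per content category across all exams."""
    grouped: Dict[str, Dict[str, List[float]]] = defaultdict(lambda: defaultdict(list))
    for student, _exam, category, score in records:
        grouped[student][category].append(score)
    return {
        student: {cat: mean(scores) for cat, scores in categories.items()}
        for student, categories in grouped.items()
    }


def weakest_category(category_means: Dict[str, float]) -> str:
    """Return the content category with the lowest average score."""
    return min(category_means, key=category_means.get)


# Invented scores for two students on two examinations.
records = [
    ("S1", "Exam A", "Safety", 60.0),
    ("S1", "Exam B", "Safety", 70.0),
    ("S1", "Exam A", "Pharmacology", 80.0),
    ("S2", "Exam A", "Safety", 90.0),
    ("S2", "Exam B", "Pharmacology", 75.0),
]
report = collate_by_category(records)
for student, means in report.items():
    print(student, means, "-> focus remediation on:", weakest_category(means))
```

If a product's data export cannot be reshaped into something this simple without manual effort, that is useful information when judging how easy the reports are to work with.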

System Requirements

System requirements, and faculty and student access to these systems, are essential information when considering a testing company's product. An evaluation of the system requirements necessary for implementing the testing program should be completed. Close collaboration between the nursing faculty and the information technology staff at the college is important when considering the system requirements of the proposed testing product and what the processes will be for students taking examinations. In addition, the ease of use of administrative functions related to the test, such as ordering and establishing test dates and times, should be evaluated. Consider where students will take the examinations and whether the computer laboratory area is available to accommodate testing needs. Although some of the technology requirements for standardized testing might seem cumbersome, stringent criteria for examination setup, administration, and proctoring are essential for protecting the examination content and maintaining examination validity. Faculty need to weigh the benefits of assured examination integrity against the requirements for proctoring and maintaining security.

Test Selection and Costs

A broad array of tests may be available. For example, specialty examinations, admission assessments, custom examinations, and computer adaptive examinations may be available in addition to exit examinations. Many schools start adoption of a standardized testing package by focusing on exit examinations and then consider the addition of other options.

Of course, it probably goes without saying that the actual cost is a consideration. In this deliberation, the decision needs to be made whether each student will bear the cost of the standardized testing program or whether it will be an institutional purchase. As in all cost–benefit analyses, the potential benefits need to be weighed against the cost. Faculty should determine what is a fair or reasonable price to help them assess how students are performing compared with other students nationally. Furthermore, consideration should be given to flexibility in selecting which tests to purchase and in package options, to the ability to use specialty and/or custom examinations, and to whether the time from order date to actual testing is reasonable enough to accommodate changes in the numbers of students because of attrition and changing schedules.

Conclusion

Standardized tests are tools that help faculty and students gauge or predict students' future success. They are used in conjunction with a teacher's expertise: their knowledge of the subject and their ability to help students not just understand the content but also apply the knowledge in clinical practice. Standardized testing is one piece of a puzzle to help faculty evaluate a student's performance at various points in their nursing education program. Standardized test results are just one of many measures of student capabilities, and the remediation provided is only one form of remediation that faculty may consider beneficial. Standardized tests that are used to predict a nursing student's success on the NCLEX vary greatly across companies in a number of factors, so faculty need to carefully assess the value of the products before committing to a purchase. See Table 2 for a checklist summarizing items nursing faculty should consider when evaluating a standardized testing program.

Using standardized tests over time can yield trending information that can help with curriculum evaluation, serving as a tool to help identify curriculum strengths, weaknesses, and potential gaps. But standardized testing is only one tool. As faculty work on curriculum development and ongoing revision, multiple sources of data must be considered. Besides analysis of the reports generated by the standardized testing company, the faculty need to consider student feedback, evaluations from current employers of their graduates, faculty-generated test results, simulation and laboratory practicum evaluations, and the observations and expertise of their own faculty. Excellence can be achieved when all the available feedback is considered and used.

References

Anderson, L. W., & Krathwohl, D. R. (Eds.). (2001). A taxonomy for learning, teaching and assessing: A revision of Bloom's Taxonomy of educational objectives: Complete edition. New York: Longman.
Morrison, S., Nibert, A., & Flick, J. (2006). Critical thinking item writing (2nd ed.). Houston, TX: Health Education Systems, Inc.
Nunnally, J. C., & Bernstein, I. H. (1994). Psychometric theory. New York: McGraw-Hill.
Polit, D. F., & Beck, C. T. (2012). Nursing research: Generating and assessing evidence for nursing practice. Philadelphia: Wolters Kluwer Health/Lippincott Williams & Wilkins.