Teaching and Learning in Nursing xxx (2017) xxx–xxx
Considerations for Developing a Student Evaluation of Teaching Form

Margaret A. Bush, PhD, MBA, RPh a; Sharron Rushton, DNP, MS, RN, CCM a; Jamie L. Conklin, MSLIS b; Marilyn H. Oermann, PhD, RN, ANEF, FAAN a,⁎

a Duke University School of Nursing, DUMC 3322, 307 Trent Drive, Durham, NC 27110, USA
b Liaison to the School of Nursing, Duke University Medical Center Library & Archives, Durham, NC, USA
Article info

Article history: Accepted 14 October 2017

Keywords: Student evaluation of teaching; Course evaluation; Nursing faculty
Abstract

Student evaluations of teaching (SET) provide feedback to nursing faculty for improving courses and teaching, and to administrators for making personnel decisions affecting faculty. We surveyed the literature to better understand the content of, and types of rating scales on, SET forms. In this article, we summarize our findings and use the information gathered to provide guidance for the development of SET forms for use in nursing programs.

© 2017 Organization for Associate Degree Nursing. Published by Elsevier Inc. All rights reserved.
Introduction

Student appraisal of courses and faculty is an important component of the overall evaluation of a nursing program. This appraisal generally takes the form of student evaluations of teaching (SET). Data from these evaluations are used for making decisions that affect the nurse educator's career: contract renewal, promotion, tenure, and other personnel decisions. In addition, student ratings and comments provide feedback to the teacher about areas of the course and teaching that might need to be improved. Given the potential impact of student evaluations, selecting the right evaluation tool is a key component of the process. The most appropriate tool depends on the areas to be assessed (teaching effectiveness, course organization, delivery of content, and others) and the type of information (quantitative and/or qualitative) that is required.

For this article, we reviewed the research on forms used for SET in multiple fields of study. The aim was to develop a better understanding of the content and types of rating scales included on these tools. The purpose of this article is to present a summary of the literature and guidelines for nursing faculty in developing a SET form for use in their nursing program.

Review Methods

The studies reported in this article were retrieved through a comprehensive search of the literature in the Cumulative Index to Nursing and Allied Health Literature, PubMed, and Education Resources Information Center. A research librarian developed a broad search strategy to capture all articles related to student evaluations of courses or teachers within health professions education. Our search also captured some articles from other academic disciplines, and we retained those that were relevant for developing a SET form for a nursing program. The search strategy included a combination of keywords and controlled vocabulary terms within each database. The search was limited to articles published in English from January 2010 to September 2017; these dates narrowed the broad search to the most recent findings. Editorials, letters, comments, and conference abstracts were excluded.

The authors screened the articles by title and abstract to identify those that addressed the content of SET forms, number and types of questions, types of rating scales, and analysis of student comments. Other inclusion and exclusion criteria were applied during the screening process (Table 1). After the title and abstract screening, we assessed full-text content, which resulted in 24 studies for review.

⁎ Corresponding author. Tel.: +1 919 684 1623 (office). E-mail addresses: [email protected] (M.A. Bush), [email protected] (S. Rushton), [email protected] (J.L. Conklin), [email protected] (M.H. Oermann).

Content of SET Forms
Student evaluations provide an opportunity for students to give feedback about the course and teaching. Items on the SET form should be based on research on effective teaching and other
https://doi.org/10.1016/j.teln.2017.10.002 1557-3087/© 2017 Organization for Associate Degree Nursing. Published by Elsevier Inc. All rights reserved.
Please cite this article as: Bush, M.A., et al., Considerations for Developing a Student Evaluation of Teaching Form, Teaching and Learning in Nursing (2017), https://doi.org/10.1016/j.teln.2017.10.002
Table 1
Inclusion and Exclusion Criteria

Inclusion criteria
1. Focused on the content of SET forms, number and types of questions, types of rating scales, and analysis of student comments
2. Quantitative or qualitative research, including anecdotal reports and reviews
3. Study set within colleges (all levels) and universities in all countries to reflect international perspectives
4. Relevant to all academic disciplines, including health professions, business, liberal arts, etc.
5. Published in English from January 2010 to September 2017

Exclusion criteria
1. Focused only on describing the aspects of SET involving costs or policy development
2. Published editorials, letters, case reports, comments, and conference abstracts
3. Published in a non-English language or prior to January 2010

Note. SET = student evaluation of teaching.
information that the faculty member seeks about the course or teaching approach, for example, feedback on a new assignment in the course or a change in clinical site. Student evaluation of teaching forms typically assess student perceptions about the organization of the course, teaching methods, course assignments, student workload, examinations and grading, communication with students, enthusiasm of the teacher, interactions between students and faculty (in a group and individually), and the course's value in terms of their learning (Annan, Tratnack, Rubenstein, Metzler-Sawin, & Hulton, 2013; Balam & Shannon, 2010; Oermann, 2017; Powell, Rubenstein, Sawin, & Annan, 2014; Pritchard, Saccucci, & Potter, 2010; Spooren, Brockx, & Mortelmans, 2013).

While students can rate these areas and their satisfaction with the course and teaching, they cannot judge the accuracy, depth, and currency of the course content (Oermann, 2017, p. 55). Students do not have the knowledge and understanding of the content to make these judgments. As a result, SET forms should not include questions that ask students to evaluate the content of the course.

In an effort to incorporate predictors of learning in the evaluation, Frick, Chadha, Watson, and Zlatkovska (2010) developed the Teaching And Learning Quality tool on the basis of the first principles of instruction identified by Merrill (2002) as components required for complex learning. These principles are that the instruction (a) is problem based, (b) connects what students know with new learning, (c) exposes students to demonstrations of what they should learn, (d) provides opportunities for students to apply what they have learned with teacher coaching and feedback, and (e) integrates learning into students' personal lives (Merrill, 2002).

Using 40 items, the Teaching And Learning Quality tool asks students to evaluate the teacher's use of these first principles, the time they spent learning, their own learning trajectory, their satisfaction with the course and faculty member, and the overall quality of the course and teaching. Frick et al. (2010) found a high positive correlation between students' perceptions of the teacher's use of the first principles and their evaluations of the course.
Types of Rating Scales on SET Forms

The majority of SET forms use objective items that are rated on 5-point Likert scales (Agbetsiafa, 2010; Barone & Lo Franco, 2010; Chang, 2012; Chulkov & Van Alstine, 2012; Frick et al., 2010; Nation, Carmichael, Fidler, & Violato, 2011; Ross, Wallis, Huggins, & Williams, 2013; Rucker & Haise, 2012; Stalmeijer, Dolmans, Wolfhagen, Muijtjens, & Scherpbier, 2010). In general, one Likert scale should be used consistently for all items on the form (Chulkov & Van Alstine, 2012). Using a different scale for one or more questions within the evaluation introduces potential issues in interpreting the results.
Chulkov and Van Alstine (2012) reported their experience transitioning an existing SET form from a 4-point to a 5-point Likert scale. Following the revision, mean evaluation scores decreased, which has implications for the subsequent ability to compare ratings with prior evaluation results (Chulkov & Van Alstine, 2012).

The lowest numerical value on the Likert scale is most commonly associated with the least favorable or worst descriptor, and the most favorable or best descriptor is commonly linked with the highest value, e.g., strongly disagree = 1 to strongly agree = 5 (Barone & Lo Franco, 2010; Chang, 2012; Chulkov & Van Alstine, 2012; Frick et al., 2010; Nation et al., 2011; Ross et al., 2013; Rucker & Haise, 2012; Stalmeijer et al., 2010). In rare cases, the lowest value is associated with the most favorable descriptor (Agbetsiafa, 2010). The lowest values appear to be most commonly oriented on the left side of the rating scale, increasing to the highest value on the right; however, a few SET forms position the highest value on the left (Chang, 2012). These conventions should be standardized within an institution to avoid confusion for students and for stakeholders who evaluate the data.

Rucker and Haise (2012) studied the effects of different response scales and descriptors on evaluation results in a clever design that administered six different versions of the evaluation form to the same students. Variations in routine elements of the form can have significant effects on evaluation results. For example, when descriptors on the Likert scale were more extremely worded, those responses tended to be selected less frequently than a less extreme descriptor corresponding to the same numeric value. When the highest value (best response) was oriented on the left of the scale, mean response values tended to be higher.
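The comparability problem raised by a scale change can be made concrete with a small sketch. The linear rescaling below is our own illustration, not a method proposed in the cited studies; the finding that means dropped after the 4-point to 5-point change suggests real responses do not shift this neatly, so such a conversion is at best a rough aid.

```python
def rescale_mean(mean, old_min, old_max, new_min, new_max):
    """Linearly map a mean rating from one Likert range onto another.

    A naive comparability aid only: it assumes responses shift
    proportionally when the number of scale points changes, which
    the evidence summarized above suggests they do not.
    """
    fraction = (mean - old_min) / (old_max - old_min)
    return new_min + fraction * (new_max - new_min)

# Under this naive assumption, a historical mean of 3.4 on a
# 1-4 scale corresponds to roughly 4.2 on a 1-5 scale.
print(round(rescale_mean(3.4, 1, 4, 1, 5), 2))
```

Comparing the rescaled historical mean with the observed post-change mean gives a rough sense of how much of any drop is attributable to the scale itself rather than to the teaching.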
Number of Items to Include on SET Form

The number of items included on SET forms varies. Most forms using Likert scales include about 15 to 20 items, and items are often grouped within categories. One SET form described in the literature had only nine items, but this tool had a focused purpose: to evaluate select teaching behaviors of clinical instructors (Keely, Oppenheimer, Woods, & Marks, 2010). The use of specialized survey techniques to ensure that students carefully read all questions, and thus improve the accuracy of the information collected, does not appear to be common. Examples include random ordering of questions rather than grouping by category, and negatively worded items with reverse scoring (Frick et al., 2010). Typically, SET forms include a general question at the end that asks for an overall rating of the instructor (Zhao & Gallant, 2012). Several studies reported that asking students to compare their current course and teacher to other courses and faculty they have
experienced in the program (global comparison questions) yielded lower average scores than more typical items limited to rating the current course and teacher (Chulkov & Van Alstine, 2012; Rucker & Haise, 2012).

Open-Ended Questions on SET Form

In addition to quantitative information, many SET forms include open-ended questions in combination with Likert-scale items. These questions allow both positive and negative feedback from students and give them an opportunity to suggest ways to improve the course (Dresel & Rindermann, 2011). When the open-ended questions are located in a separate section of the evaluation form, for example, at the end, rather than interspersed with Likert-scale items, the extent and number of comments increase (Chulkov & Van Alstine, 2012). Chulkov and Van Alstine (2012) suggested that SET forms should include specific open-ended questions about the course and teaching and not merely a blank space for comments. For example, questions might ask students: What was the most valuable aspect of this course in terms of your learning? What one aspect of this course would you change? What were the strengths of this teacher? What changes can this teacher make to be more effective?

Analyzing Student Comments

Student comments on open-ended questions in course and teacher evaluations can present challenges for analysis. Comments can be coded manually or with a software program to arrive at themes. For example, Tucker (2014) manually coded comments to understand the number and nature of inappropriate comments in teacher evaluations. In another approach, faculty used an automated tool, Leximancer (Lexi-Portal Version 4, Australia), that analyzed student comments for common themes without the faculty member having to create a coding scheme. This tool examines the frequency of words used, leading to the creation of concept maps (Stupans, McGuren, & Babey, 2016).
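Leximancer is proprietary, but the word-frequency idea it builds on can be sketched in a few lines of Python. The sample comments and the stopword list here are hypothetical, and a real analysis would use far more comments and a fuller stopword list; this only illustrates the counting step, not Leximancer's concept maps.

```python
from collections import Counter
import re

# Hypothetical student comments, for illustration only.
comments = [
    "The assignments were clear and the feedback was prompt.",
    "More feedback on assignments would help.",
    "Clear organization, but the workload was heavy.",
]

# A deliberately tiny stopword list; real analyses use larger ones.
STOPWORDS = {"the", "and", "was", "were", "on", "but", "would", "more", "a"}

def word_frequencies(texts):
    """Count content words across all comments, ignoring case and stopwords."""
    words = []
    for text in texts:
        words.extend(w for w in re.findall(r"[a-z']+", text.lower())
                     if w not in STOPWORDS)
    return Counter(words)

freq = word_frequencies(comments)
# The most frequent content words hint at recurring themes
# (here: assignments, clarity, feedback).
print(freq.most_common(3))
```

High-frequency words are only a starting point; a faculty member would still read the comments behind each frequent word to decide whether it reflects a genuine theme.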
Using the nominal group technique, Coker, Tucker, and Estrada (2013) elicited feedback on a physical assessment course. Students first recorded the top three areas they learned in the course and then listed the top three areas they had hoped to learn. Answers were then discussed in a small group with a facilitator to prioritize the feedback (Coker et al., 2014). This was an effective strategy for gathering detailed student feedback about the course.

Implications for Nursing Programs

Student ratings are a key strategy for gathering information on teaching effectiveness (Zhao & Gallant, 2012). These tools collect information from students about their experiences, opinions, and satisfaction related to the course and teacher. Considering the importance of this evaluation, SET forms need to be carefully selected or developed.

In this article, we identified typical areas of content evaluated on these forms: organization of content, teaching methods, course assignments, examinations, grading methods and criteria, promptness and quality of feedback, interactions between students and faculty (in a group and individually), enthusiasm of the teacher, accessibility of the teacher outside of class or clinical practice, and the course's value to students in terms of their learning. The items on the form need to address the type of course, for example, a clinical nursing course, and the behaviors expected of the instructor teaching that course.

Likert-type scales are used most frequently on these forms. Based on our review of the literature across different disciplines, the
majority of forms include approximately 15 to 20 objective items about the course and teacher, rated on a 5-point Likert scale, with the most favorable outcome paired with the highest numerical value on the scale.

A number of factors related to the Likert scale can affect evaluation results, including the number of points on the scale, the orientation of the numeric scale, the scale descriptors, and the wording used in item stems (Rucker & Haise, 2012). In cases where the lowest value is assigned the best descriptor, this needs to be clear to students rating items on the tool and to the faculty members and administrators reviewing the results; because this is not the norm, scores may be misinterpreted by someone unfamiliar with the scale. When there are multiple nursing programs within the department or school, the SET form should be standardized across programs. Changes to existing evaluations (e.g., changing the Likert scale) should be made with caution, as they may affect interpretation of results and comparability with past results.

Consistent with the literature, we recommend that no more than 15 to 20 items be included on SET forms that evaluate both the course and teacher. An excessive number of items may tax a student's attention span or conscientiousness in completing the evaluation. This is a problem in many nursing programs where students are asked to complete SET forms for multiple courses and faculty, including clinical and adjunct instructors, at the end of each semester. One item at the end of the SET form should ask for a summary rating, such as "Overall, this instructor was an effective teacher." The SET form should include specific open-ended questions about the course and teaching (e.g., What was the most valuable aspect of this course in terms of your learning?) instead of a blank space for general comments.
Those questions should be placed at the end of the tool rather than with each of the Likert-scale items. Student comments provide valuable feedback about aspects of the course and teaching that might be improved. While qualitative software and other programs can be used to analyze data from open-ended questions, it is unlikely that most nursing faculty will use these strategies. To provide value, however, student comments require organization in some form, such as placing comments into categories, for example, grouping all the comments on assignments together (Oermann, 2017). Another strategy is to group the student comments on the basis of the overall course ratings (one group with the highest ratings of the course and the other with the lowest ratings). The most important question for faculty is whether students learned what they needed to in the course and clinical experience (Oermann, 2017).

Using the same form for all nursing courses allows comparisons of ratings for individual courses and teachers with mean ratings for the nursing program. It also facilitates completion of tools by students and enables faculty and administrators to identify trends over time. Some institutions may use a single SET form across many academic majors. In the case of a single institution-wide SET combined with the need to evaluate aspects specific to nursing (e.g., clinical instruction), the addition of a separate validated SET form or a validated subset of items could be considered. Adding nursing-specific, open-ended questions is another alternative.

The SET form needs to be evaluated for validity and reliability before use. Powell et al. (2014) found that student perceptions of items on SET forms in their nursing program differed from the interpretations by faculty and administrators and emphasized the need to explore this in a nursing program before a form is used.
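The rating-based grouping strategy described above can be sketched as follows. The sample responses, the `split_by_rating` helper, and the cut points for "high" and "low" are all hypothetical; a program would choose its own thresholds for its own scale.

```python
# Hypothetical (comment, overall rating) pairs on a 5-point scale.
responses = [
    ("Great pacing and clear rubrics.", 5),
    ("Too many quizzes back to back.", 2),
    ("Loved the case studies.", 4),
    ("Feedback arrived too late to help.", 1),
]

def split_by_rating(responses, high=4, low=2):
    """Separate comments into high- and low-rating groups for review.

    The thresholds are illustrative defaults, not recommendations
    from the literature summarized above.
    """
    high_group = [c for c, r in responses if r >= high]
    low_group = [c for c, r in responses if r <= low]
    return high_group, low_group

high_group, low_group = split_by_rating(responses)
print(len(high_group), len(low_group))  # 2 high-rated, 2 low-rated comments
```

Reading the two groups side by side makes it easier to see which aspects of the course distinguish the most and least satisfied students.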
Focus groups with students, or asking students to think aloud as they answer sample items, provide easy methods for examining student interpretation of SET items.

The set of guidelines provided in this article should be considered within the context of the specific nursing program and institution, given the wide variability across nursing programs. A tool and
process that work effectively in one program may not be as useful in another, and some of the guidelines we identified from our review may not be appropriate for a particular nursing program. Students' evaluations of courses and teachers are influenced by many factors, some of which were presented in this article. No form or system in a nursing program is "perfect," and faculty and administrators need to keep that in mind when reviewing SET scores and student comments. The extent of student learning in a course is sometimes realized by students only well after the course has ended, and ratings on a SET form do not measure the learning that occurred in the course. The number and variety of factors that can affect evaluation results have important implications for the use of evaluations and the cross-comparability of results over time. Cautious interpretation of evaluation results is advisable, and data from SET should not be the only information about teaching used for decisions about nursing faculty.
References

Agbetsiafa, D. (2010). Evaluating effective teaching in college level economics using student ratings of instruction: A factor analytic approach. Journal of College Teaching and Learning, 7(5), 57–66.

Annan, S. L., Tratnack, S., Rubenstein, C., Metzler-Sawin, E., & Hulton, L. (2013). An integrative review of student evaluations of teaching: Implications for evaluation of nursing faculty. Journal of Professional Nursing, 29, e10–e24. https://doi.org/10.1016/j.profnurs.2013.06.004

Balam, E. M., & Shannon, D. M. (2010). Student ratings of college teaching: A comparison of faculty and their students. Assessment & Evaluation in Higher Education, 35, 209–221. https://doi.org/10.1080/02602930902795901

Barone, S., & Lo Franco, E. (2010). TESF methodology for statistics education improvement. Journal of Statistics Education, 18(3), 1–25.

Chang, S. F. (2012). The development of an evaluation tool to measure nursing core curriculum teaching effectiveness: An exploratory factor analysis. Journal of Nursing Research, 20, 228–236. https://doi.org/10.1097/jnr.0b013e3182656166

Chulkov, D. V., & Van Alstine, J. (2012). Challenges in designing student teaching evaluations in a business program. International Journal of Educational Management, 26, 162–174. https://doi.org/10.1108/09513541211201979

Coker, J., Castiglioni, A., Kraemer, R. R., Massie, F. S., Morris, J. L., Rodriguez, M., ... Estrada, C. A. (2014). Evaluation of an advanced physical diagnosis course using consumer preferences methods: The nominal group technique. The American Journal of the Medical Sciences, 347, 199–205. https://doi.org/10.1097/MAJ.0b013e3182831798

Coker, J., Tucker, J., & Estrada, C. (2013). Nominal group technique: A tool for course evaluation. Medical Education, 47(11), 1145. https://doi.org/10.1111/medu.12324

Dresel, M., & Rindermann, H. (2011). Counseling university instructors based on student evaluations of their teaching effectiveness: A multilevel test of its effectiveness under consideration of bias and unfairness variables. Research in Higher Education, 52, 717–737. https://doi.org/10.1007/s11162-011-9214-7

Frick, T. W., Chadha, R., Watson, C., & Zlatkovska, E. (2010). Improving course evaluations to improve instruction and complex learning in higher education. Educational Technology Research and Development, 58(2), 115–136. https://doi.org/10.1007/s11423-009-9131-z

Keely, E., Oppenheimer, L., Woods, T., & Marks, M. (2010). A teaching encounter card to evaluate clinical supervisors across clerkship rotations. Medical Teacher, 32(2), e96–e100. https://doi.org/10.3109/01421590903202496

Merrill, M. D. (2002). First principles of instruction. Educational Technology Research and Development, 50(3), 43–59.

Nation, J. G., Carmichael, E., Fidler, H., & Violato, C. (2011). The development of an instrument to assess clinical teaching with linkage to CanMEDS roles: A psychometric analysis. Medical Teacher, 33, e290–e296. https://doi.org/10.3109/0142159X.2011.565825

Oermann, M. H. (2017). Student evaluations of teaching: There is more to course evaluations than student ratings. Nurse Educator, 42, 55–56. https://doi.org/10.1097/nne.0000000000000366

Powell, N. J., Rubenstein, C., Sawin, E. M., & Annan, S. (2014). Student evaluations of teaching tools: A qualitative examination of student perceptions. Nurse Educator, 39, 274–279. https://doi.org/10.1097/NNE.0000000000000066

Pritchard, R. E., Saccucci, M. S., & Potter, G. C. (2010). Evaluating a program designed to demonstrate continuous improvement in teaching at an AACSB-accredited college of business at a regional university: A case study. Journal of Education for Business, 85, 280–283. https://doi.org/10.1080/08832320903449568

Ross, L., Wallis, J., Huggins, C., & Williams, B. (2013). Students' views of teachers using the clinical teaching effectiveness inventory. Journal of Paramedic Practice, 5(6), 336–340.

Rucker, M. H., & Haise, C. L. (2012). Effects of variations in stem and response options on teaching evaluations. Social Psychology of Education, 15, 387–394. https://doi.org/10.1007/s11218-012-9186-2

Spooren, P., Brockx, B., & Mortelmans, D. (2013). On the validity of student evaluation of teaching: The state of the art. Review of Educational Research, 83, 598–642. https://doi.org/10.3102/0034654313496870

Stalmeijer, R. E., Dolmans, D. H., Wolfhagen, I. H., Muijtjens, A. M., & Scherpbier, A. J. (2010). The Maastricht clinical teaching questionnaire (MCTQ) as a valid and reliable instrument for the evaluation of clinical teachers. Academic Medicine, 85, 1732–1738. https://doi.org/10.1097/ACM.0b013e3181f554d6

Stupans, I., McGuren, T., & Babey, A. M. (2016). Student evaluation of teaching: A study exploring student rating instrument free-form text comments. Innovative Higher Education, 41(1), 33–42. https://doi.org/10.1007/s10755-015-9328-5

Tucker, B. (2014). Student evaluation surveys: Anonymous comments that offend or are unprofessional. Higher Education, 68, 347–358. https://doi.org/10.1007/s10734-014-9716-2

Zhao, J., & Gallant, D. J. (2012). Student evaluation of instruction in higher education: Exploring issues of validity and reliability. Assessment & Evaluation in Higher Education, 37, 227–235. https://doi.org/10.1080/02602938.2010.523819