Student evaluation based indicators of teaching excellence from a highly selective liberal arts college


G Models

IREE-65; No. of Pages 14

International Review of Economics Education xxx (2014) e1–e14

Contents lists available at ScienceDirect

International Review of Economics Education journal homepage: www.elsevier.com/locate/iree

Student evaluation based indicators of teaching excellence from a highly selective liberal arts college §

Aju Jacob Fenn *

Economics and Business, Colorado College, 14 E. Cache La Poudre Street, Colorado Springs, CO 80903, USA

ARTICLE INFO

Article history: Received 17 December 2013. Received in revised form 1 October 2014. Accepted 22 November 2014. Available online xxx.

JEL classification: A10, A22

Keywords: Teaching Evaluations Determinants

ABSTRACT

The paper uses individual level data from each student who has enrolled in each class in the department in the last five years. Unlike several past studies, which lack data on teaching quality measures, this paper includes such measures. Ordered Probit models reveal that the significant determinants of effective teaching, based on student evaluations of teaching (SET), are organization, clarity of exposition, availability and enthusiasm of the instructor. The gender and race of the instructor are significant, with female and non-white instructors being downgraded by students. The usual finding that expected grades inflate teaching evaluations is contested by the empirical evidence from this paper. Highly selective liberal arts colleges claim to provide a better education based on smaller class sizes and excellent teachers who are also active scholars. The purpose of this paper is to examine the determinants of effective teaching in Economics and Business, according to student evaluations of teaching (SET), at one such college. The paper begins with a literature review of the scholarship on teaching evaluations. The next section describes the dataset and the models used to examine teaching effectiveness. The paper concludes with a discussion of the econometric results. © 2014 Elsevier Ltd. All rights reserved.

§ The author acknowledges Sidharth Moktan, Jeff Moore and Lachlan Watkins for their help as superb research assistants.
* Tel.: +1 719 389 6409; fax: +1 719 389 6927. E-mail address: [email protected].

http://dx.doi.org/10.1016/j.iree.2014.11.001
1477-3880/© 2014 Elsevier Ltd. All rights reserved.

Please cite this article in press as: Fenn, A.J., Student evaluation based indicators of teaching excellence from a highly selective liberal arts college. Int. Rev. Econ. Educ. (2014), http://dx.doi.org/10.1016/j.iree.2014.11.001


A.J. Fenn / International Review of Economics Education xxx (2014) e1–e14

1. Literature review

There is a substantial literature on student evaluations of teaching (SET), but a search for the determinants of teaching excellence at liberal arts colleges reveals a sparsely populated area of this pedagogical literature. Several prominent themes emerge. One dominant strand deals with the impact of grade inflation on SET. Another discusses the impact of non-teaching factors, such as faculty scholarship and the sociodemographic characteristics (gender, race) of students and faculty, on SET. The attributes of effective teachers have also been well documented. Yet another branch focuses on the drawbacks of student teaching evaluations and questions their validity as a tool to measure teaching quality.

This paper examines what attributes students value in an Economics and Business professor at a highly selective liberal arts college. At such colleges scholarship is essential, but it cannot substitute for teaching ability in tenure and promotion decisions. There is a growing trend for college administrators to use SET to evaluate a teacher’s prowess. While several schools also use other tools to evaluate teaching effectiveness, such as peer observations of teaching, SET are almost always the only tool used for annual reviews and annual salary raises. This paper adds to the literature by sorting out signal from noise and quantifying some of the demographic and grade inflation biases that exist in such data. The policy relevance of this piece is that not all SET scores are equal: some instructors are downgraded because of demographic differences, while others are downgraded because of actual deficiencies in teaching. This piece is the first of its kind to quantify these factors using average marginal effects from an Ordered Probit model.
The results are similar to the existing literature in that race, gender and grade inflation biases do exist. However, in contrast to existing studies, students’ expected grades do not boost their evaluation of an instructor across the board. Only a student who expects to get an A boosts the evaluation of the instructor and the course. If administrators or review committees fail to account for these biases, their decisions about teaching quality would reflect the biases of the students. The survey of the literature presented below attempts to identify salient features that impact student teaching evaluations and factors that must be controlled for in order to use student teaching evaluations as reliable instruments to gauge the quality of teaching.

2. Grade inflation and SET

Expected grades are a significant determinant of the SET that instructors receive. McPherson and Jewell find that faculty may buy better evaluations by inflating students’ grade expectations (McPherson and Jewell, 2007). They use feasible generalized least squares (FGLS) on data collected from students in their master’s program in Economics. Their SET scores used in the regression consist of the average of student responses on a one to five scale. They find that a 1 unit increase in expected grade corresponds to a 0.25 unit increase in the instructor’s SET score. They also find that White instructors score 0.08 points higher on their SET than non-white instructors, while gender is insignificant. The current study also finds these effects, but unlike McPherson and Jewell, expected grades only boost course and instructor evaluations if the student expects to get an A. Jewell, McPherson and Tieslau, using data from 1683 courses taught at 28 different academic departments, argue that incentives to inflate grades vary according to the characteristics of departments and instructors (Jewell et al., 2011).
In another recent study Jewell and McPherson suggest that female professors are most likely to inflate grades at large public universities, while ethnicity plays a smaller role in grade inflation. Departments with more talented students, as measured by average SAT scores, see grade inflation at a slower rate than those with less talented students (Jewell and McPherson, 2012). Not all scholars agree. Boretz claims that student evaluations do not cause grade inflation through instructors attempting to bribe students with higher grades in return for better teaching evaluations (Boretz, 2004). McPherson, Jewell and Kim find that instructors can buy better evaluation scores by inflating students’ grade expectations
(McPherson et al., 2009). Matos-Diaz and Ragan conclude that a tighter expected grade distribution corresponds to higher student ratings (Matos-Diaz and Ragan, 2009). Langbein finds that actual grades have a positive impact on SET at American University while controlling for expected grades and other fixed effects for faculty and courses (Langbein, 2008). Isely and Singh also find evidence of grade inflation. They contend that it is mainly the gap between expected grades and cumulative grade point average (coming into the class) that is the relevant explanatory variable for favorable SET (Isely and Singh, 2005).

Other scholars such as Krautmann and Sander have examined the potential simultaneity of expected grades and SET (Krautmann and Sander, 1999). They contend that expected grades may be correlated with an unobservable variable such as teaching productivity, and thus higher grades may be correlated with higher SET. If teaching productivity or its measures are omitted from the model, ordinary least squares (OLS) estimates will be biased and inconsistent, and two stage least squares (2SLS) is the appropriate estimator. The SET are measured on a five-point scale from 1 to 5. Krautmann and Sander find that tenured, visiting and emeritus professors get lower SETs than professors not in those categories. They also find that a one letter grade increase in the expected grade raises the SET score by about 0.6 on a five-point scale. Gender and class size are insignificant in their study. A Hausman test confirms that expected grades are determined simultaneously with SET scores in the absence of regressors that measure teaching productivity.

While most of the literature in this area is empirical, Love and Kotchen provide a neoclassical theoretical framework for grade inflation (Love and Kotchen, 2010). They nest the student’s grade maximization problem within the professor’s tenure and promotion utility maximization problem.
Their comparative static results suggest that grade inflation may be the unintended consequence of universities trying to improve teaching quality and/or research output. They hold the view that students’ grades and SETs are correlated not because of teaching excellence but because students reward professors on their SETs for grade leniency. Their modified framework shows how grade targets may be used to control grade inflation and to increase the professors’ allocation of time toward teaching.

While most grade impact studies focus on the student’s own expected grade, Nowell examines whether students’ own expected grades relative to the performance of their peers influence SET scores (Nowell, 2007). He is able to use individual SET data to test his hypotheses. Nowell does not have variables that measure teaching quality and thus tests whether expected grades and SET scores are endogenous. He finds that there is no endogeneity between SET scores and expected grades. Nowell maintains that sample selection bias for completed SET forms is not an issue. He finds that an increase in the student’s own grade relative to the student’s historical GPA has a positive and significant impact on SET scores. There is an unambiguously positive effect on SET scores when the reference grade, based on either the expected grades of peers in the same section or the grades of students who have taken the same instructor, increases. However, when the student’s own grade increases relative to either of the above reference points, the impact on SET scores may be positive or negative. The main finding is that the grades earned by peers in the same class, or by students who have had the same instructor, do influence the SET scores handed out by students. Finally, he finds that students do reward instructors for higher grades with higher SET scores. The next section deals with the impact of race and gender on SET.

3. Gender, race and SET

In a unique study Reid uses data on 3079 White, 142 Black, 238 Asian, 130 Latino, and 128 Other race faculty at the 25 highest ranked liberal arts colleges from the website www.RateMyProfessors.com (Reid, 2010). Using a two-step cluster analysis, Reid finds that the cluster corresponding to faculty viewed most favorably by students was populated exclusively by White faculty. The cluster corresponding to faculty of color was viewed favorably by students, but less so than the cluster corresponding to White faculty. The cluster corresponding to faculty viewed as the worst instructors is dominated by Black and Asian faculty. Gender was insignificant along the teaching dimensions of overall quality, helpfulness and clarity. However, men were seen as easier graders than women. Latino and Other race faculty are not perceived significantly
differently than White faculty. Race and gender do interact to impact SETs, with Black, Asian and Latino males being rated lower than their female counterparts of the same race.

McPherson and Jewell find that White instructors get about a 0.08 point boost in their SET scores on a four point scale due to their race (McPherson and Jewell, 2007). They use the average of SET scores on questions that ask about teaching quality as the dependent variable for each class. Kogan, Schoenfeld-Tacher and Hellyer find that female faculty appear to be more negatively impacted by student evaluations than their male counterparts (Kogan et al., 2010). Weinberg, Hashimoto and Fleisher find that women and foreign born instructors receive lower SETs, ceteris paribus (Weinberg et al., 2009).

There are also a number of studies that claim that race and gender have little to do with the course evaluations that an instructor receives from students. In particular, Centra and Gaubatz, Feldman, and Theall and Franklin find that students do not favor instructors solely because of their gender (Centra and Gaubatz, 1998; Feldman, 1992; Theall and Franklin, 2001). The impact of race and gender has been well documented in the literature. The interested reader is directed to Reid, who provides an excellent literature review on the subject (Reid, 2010). The present study finds that race and gender do matter when it comes to SET scores awarded by students.

4. Attributes of teaching effectiveness

Boex outlines SET questions from Georgia State that examine various dimensions of teaching ability (Boex, 2000). Students are asked to grade instructors on a five-point scale on their abilities in the following areas: making presentations, organization and clarity, grading and assignments, intellectual/scholarly, student interaction and student motivation. Each category contains statements describing that dimension of teaching, with which students are asked to agree or disagree on a five-point scale. All of these dimensions are found to be statistically significant in one or more of the Ordered Probit regressions. The most valued trait is organization and clarity.

Seiler, Seiler and Chiang examine professor, student and course attributes that contribute to successful SET (Seiler et al., 1999). Using factor analysis on 291 SET responses from a single semester, they conclude that research oriented professors, part time professors, MBA students, elective courses and GPA all contribute to higher SET scores. Their results are mixed on whether tenure, student major and the time of day that the class is offered benefit SET scores. DeCanio, using 6872 individual SET forms and employing OLS and a Multinomial Logit model, finds that class size negatively impacts SET scores while higher academic standards upheld by the instructor positively affect SET scores (DeCanio, 1986). The instructor’s accessibility out of class, the value of the lectures, enjoyment of the material, lecture preparation, the ability of the instructor to explain material and the ability to answer questions are all positively and significantly related to SET scores. Hamermesh and Parker’s findings suggest that more attractive instructors are likely to receive better evaluations (Hamermesh and Parker, 2005).
However, Campbell, Gerdes and Steiner find that when omitted variables pertaining to successful teaching attributes are included in the model, pulchritude loses its significance (Campbell et al., 2005). Next, the dataset and empirical model are described.

5. Data and empirical model

A total of 3754 individual SET forms are used, covering all courses taught in the Department of Economics and Business at a highly selective liberal arts college from 2006 to 2010. After incomplete forms are discarded, N = 3165 for the course quality regressions and N = 3151 for the instructor quality regressions. Students are asked to evaluate statements on teaching and about the course on a five-point scale with 1 = Strongly Disagree and 5 = Strongly Agree. In general 5’s correspond to favorable impressions of the course and the instructor. The only exceptions to this rule are the variables COURSEGRADE and GRADEEXPECTED, for which 4 is the best grade (A) and 0 the worst. The empirical model is presented next followed by a
description of the variables used in the study.

\[
\text{Quality of course/instructor} = \beta_0 + \beta_1\,\text{professor attributes} + \beta_2\,\text{student attributes} + \beta_3\,\text{teaching effectiveness} + \beta_4\,\text{annual dummies} + \beta_5\,\text{cross-sectional dummies}
\tag{1}
\]
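
Because both outcome measures are ordered categories, Eq. (1) can be read as the linear index of an ordered response model, which the paper later estimates by Ordered Probit. A textbook statement of that model is sketched below as an assumed formulation, not one quoted from the paper. The latent variable $y_i^*$, the cutpoints $\mu_j$ and the unit error variance are notational assumptions, with $x_i'\beta$ standing for the right-hand side of Eq. (1) and categories indexed $j = 0, \dots, J$.

\[
y_i^* = x_i'\beta + \varepsilon_i, \qquad \varepsilon_i \sim N(0, 1),
\]
\[
y_i = j \quad \text{if} \quad \mu_{j-1} < y_i^* \le \mu_j, \qquad \mu_{-1} = -\infty, \quad \mu_J = +\infty,
\]
\[
\Pr(y_i = j \mid x_i) = \Phi(\mu_j - x_i'\beta) - \Phi(\mu_{j-1} - x_i'\beta).
\]

In this formulation a constant cannot be identified separately from the cutpoints, which would be consistent with the Probit columns of the results tables reporting no constant.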

Teaching quality is measured using two different dependent variables. The first outcome variable measures teaching quality by the overall grade assigned to the course by students. Students are asked: If you could give this course a grade it would be A, B, C, D, No Credit i.e. F. The second measure of teaching quality is the score assigned to the professor in response to the statement: Recommend others take this course from this instructor. The students’ choices are: Strongly Agree, Agree, Neutral, Disagree and Strongly Disagree. The independent variables may be loosely classified into three main categories: professors’ characteristics, students’ characteristics and measures of teaching effectiveness. Table 1 contains a summary of the variable names and their definitions; Table 2 reports sample statistics.

5.1. Professor’s characteristics

Professor’s characteristics are captured by three dummy variables. CAUCASIAN takes on a value of 1 if the professor is white and a U.S. native, 0 otherwise. PROF_MALE is 1 for a male professor and 0 otherwise. TENURE takes on a value of 1 if the professor is tenured and 0 otherwise.

5.2. Students’ characteristics

Students’ characteristics include variables about both the student and the type of course (Introductory, Intermediate, Elective or Quantitative). GRADEEXPECTED reflects the grade that the student expects to earn in the class (4 = A, 3 = B, etc.). These evaluations are administered shortly before the final, so most students have a reasonable idea of their progress in the class. MALE is a dummy variable that takes on a value of 1 if the student is male and 0 otherwise. CLASSSTATUS is a count variable that accounts for the student’s class year (1 = First year, 2 = Sophomore, 3 = Junior and 4 = Senior). SEMESTER is a dummy variable that takes on a value of 1 if the course is offered in the Spring semester and a value of 0 otherwise.
PRINC, INTERMED and ELEC are dummy variables that take on a value of 1 if the class is a principles, intermediate theory or elective course respectively, and 0 otherwise. The omitted category is quantitative methods classes such as Econometrics. TIMEINCLASS is the average time spent in class per day by the respondent, measured in hours. Instructors at this school have considerable flexibility in determining class meeting times and the duration of class. TIMEOUTCLASS is the average time per day spent on class work out of class by the respondent, measured in hours. RELATIVETIME is a count variable that measures how much time the student spent on this class compared to other courses taken. It takes on values as follows: 1 = more time than average, 2 = about the same amount of time as other courses, 3 = less time than average, etc.

5.3. Measures of teaching effectiveness

Measures of teaching effectiveness are captured by the variables described below. Students are asked to respond on a 1–5 scale on whether they agree or disagree with statements about the instructor’s teaching, where 5 corresponds to strongly agree and 1 to strongly disagree. In general a score of 5 corresponds to teaching excellence in the given category. ACCESSIBLE captures whether the instructor was available outside class. This is one of the major selling points of a small liberal arts college with a low student–faculty ratio. ASSIGNMENTS_APPROPRIATE asks students whether they agree with the claim that the assignments in class were appropriate. AWARE_DIFFICULTIES asks students whether the instructor was aware of student learning difficulties during the course. BALANCE


Table 1
Variable names and brief definitions.

Variable name             Definition

Dependent variables
COURSEGRADE               If you could give this course a grade (4 = A, 3 = B, 2 = C, etc.)
RECOMMEND_INSTRUCTOR      Would recommend instructor to others (5 = Strongly recommend, 4 = Recommend, etc.)

Independent variables

Professors’ characteristics
CAUCASIAN                 Dummy variable for professor being White
PROF_MALE                 Dummy variable for professor being male
TENURE                    Dummy variable for professor being tenured

Students’ characteristics
GRADEEXPECTED             What grade the respondent is expecting to receive (4 = A, 3 = B, 2 = C, etc.)
MALE                      Dummy variable for respondent being male
CLASSSTATUS               Count variable for class year (1 = First year, 2 = Sophomore, etc.)
SEMESTER                  Dummy variable for semester (0 if Fall semester, 1 otherwise)
PRINC                     Dummy variable for principles classes (1 if Principles, 0 otherwise)
INTERMED                  Dummy variable for intermediate theory classes (1 if Intermediate, 0 otherwise)
ELEC                      Dummy variable for elective classes (1 if Elective, 0 otherwise)
TIMEINCLASS               Average time spent in class per day in hours
TIMEOUTCLASS              Average time spent on class work out of class per day in hours
RELATIVETIME              Relative to other classes taken at the college, the amount of time spent on this course is (1 = more than average, 2 = about the same, 3 = less than average, etc.)

Measures of teaching effectiveness
ACCESSIBLE                Was accessible outside of class (5 = Strongly Agree, 4 = Agree, etc.)
ASSIGNMENTS_APPROPRIATE   Assigned papers and projects that were appropriate (5 = Strongly Agree, 4 = Agree, etc.)
AWARE_DIFFICULTIES        Aware of students’ difficulties (5 = Strongly Agree, 4 = Agree, etc.)
BALANCE                   Instructor provided proper balance between in and out of class work (5 = Strongly Agree, 4 = Agree, etc.)
BOOKS                     Instructor chose books and other materials that were helpful (5 = Strongly Agree, 4 = Agree, etc.)
CLEAR_PRESENTATION        Instructor gave clear presentations (5 = Strongly Agree, 4 = Agree, etc.)
DEMANDING                 Instructor was demanding in expectations (5 = Strongly Agree, 4 = Agree, etc.)
EXAMS_APPROPRIATE         Exams, etc. were appropriate (5 = Strongly Agree, 4 = Agree, etc.)
EXPLANATIONS_ERRORS       Provided explanation of errors (5 = Strongly Agree, 4 = Agree, etc.)
EXPLAINED_ASSIGNMENTS     Explained assignments (5 = Strongly Agree, 4 = Agree, etc.)
INTERESTING               Student found subject interesting (5 = Strongly Agree, 4 = Agree, etc.)
KNOWLEDGE                 Instructor displayed sound knowledge (5 = Strongly Agree, 4 = Agree, etc.)
OBJECTIVE                 Instructor was objective (5 = Strongly Agree, 4 = Agree, etc.)
ORGANIZED                 Instructor was organized (5 = Strongly Agree, 4 = Agree, etc.)
POV                       Instructor was open to other points of view (5 = Strongly Agree, 4 = Agree, etc.)
QUESTIONS                 Encouraged questions (5 = Strongly Agree, 4 = Agree, etc.)
STIMULATED_THINK          Instructor stimulated students to think (5 = Strongly Agree, 4 = Agree, etc.)

captures whether the instructor provided proper balance between in and out of class work during the course. The variable BOOKS captures whether students found the books used for the class helpful. CLEAR_PRESENTATION evaluates the presentation skills of the instructor. DEMANDING captures the student’s impression of how demanding an instructor is, because higher standards may influence the student’s opinion of the instructor’s effectiveness. EXAMS_APPROPRIATE controls for the fit of tests to the instruction and coursework. KNOWLEDGE controls for the instructor’s knowledge of the material. OBJECTIVE controls for the objectivity of the instructor. POV measures whether the instructor is open to alternative points of view. QUESTIONS measures whether the instructor encouraged questions during class. STIMULATED_THINK measures whether the instructor stimulated students to think about the material.
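
The coding scheme described in this section can be illustrated with a short sketch. It is not the author's data-preparation code: the function `encode_form`, the raw field names and the example form are hypothetical, and the codes D = 1 and F = 0 are inferred from the statement that 0 indicates the worst grade.

```python
# Hypothetical coding helpers; the scales follow the variable definitions in
# the text, while the function and raw field names are illustrative only.
LIKERT = {"Strongly Agree": 5, "Agree": 4, "Neutral": 3,
          "Disagree": 2, "Strongly Disagree": 1}
GRADE = {"A": 4, "B": 3, "C": 2, "D": 1, "F": 0}  # D = 1 and F = 0 are inferred
COURSE_TYPES = ("PRINC", "INTERMED", "ELEC")      # quantitative methods is omitted


def encode_form(form):
    """Turn one raw SET form (a dict of strings) into numeric regressors."""
    row = {
        "COURSEGRADE": GRADE[form["course_grade"]],
        "GRADEEXPECTED": GRADE[form["expected_grade"]],
        "RECOMMEND_INSTRUCTOR": LIKERT[form["recommend"]],
        "ORGANIZED": LIKERT[form["organized"]],
        "SEMESTER": 1 if form["semester"] == "Spring" else 0,
    }
    # One dummy per course type; a quantitative course leaves all three at 0.
    for name in COURSE_TYPES:
        row[name] = 1 if form["course_type"] == name else 0
    return row


example = encode_form({
    "course_grade": "A", "expected_grade": "B",
    "recommend": "Strongly Agree", "organized": "Agree",
    "semester": "Spring", "course_type": "INTERMED",
})
```

Leaving the quantitative methods category without its own dummy mirrors the omitted-category convention stated above, so the course-type coefficients would be read relative to quantitative courses.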


Table 2
Sample statistics.

Variable                  Mean    Standard deviation

Dependent variables
COURSEGRADE               3.40    0.79
RECOMMEND_INSTRUCTOR      4.09    1.11

Independent variables

Professors’ characteristics
CAUCASIAN                 0.64    0.48
TENURE                    0.63    0.48

Students’ characteristics
GRADEEXPECTED             3.39    0.82
MALE                      0.62    0.49
CLASSSTATUS               2.52    1.04
SEMESTER                  0.48    0.50
PRINC                     0.39    0.49
INTERMED                  0.17    0.38
ELEC                      0.34    0.47
TIMEINCLASS               7.58    6.32
TIMEOUTCLASS              6.84    6.88

Measures of teaching effectiveness
ACCESSIBLE                4.31    0.81
ASSIGNMENTS_APPROPRIATE   4.08    0.86
AWARE_DIFFICULTIES        4.02    1.00
BALANCE                   4.04    0.91
BOOKS                     4.04    0.88
CLEAR_PRESENTATION        4.09    0.99
DEMANDING                 4.19    0.80
EXAMS_APPROPRIATE         4.13    0.89
EXPLANATIONS_ERRORS       4.09    0.90
EXPLAINED_ASSIGNMENTS     4.02    0.93
INTERESTING               4.12    0.93
KNOWLEDGE                 4.54    0.74
OBJECTIVE                 4.20    0.86
ORGANIZED                 4.22    0.88
POV                       4.22    0.81
QUESTIONS                 4.25    0.91
RELATIVETIME              2.03    0.76
STIMULATED_THINK          4.24    0.91

N = 3151.
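
As a quick consistency check on Table 2, the standard deviation of a 0/1 dummy with mean p is approximately sqrt(p(1 − p)) in a sample this large. The sketch below applies that identity to the reported dummy means. It is only an illustrative check, and the helper name `bernoulli_sd` is hypothetical.

```python
import math

# Means of the 0/1 dummy variables as reported in Table 2.
dummy_means = {
    "CAUCASIAN": 0.64, "TENURE": 0.63, "MALE": 0.62, "SEMESTER": 0.48,
    "PRINC": 0.39, "INTERMED": 0.17, "ELEC": 0.34,
}


def bernoulli_sd(p):
    """Population standard deviation of a 0/1 variable with mean p."""
    return math.sqrt(p * (1.0 - p))


# Implied standard deviations, rounded to two decimals as in the table.
implied = {name: round(bernoulli_sd(p), 2) for name, p in dummy_means.items()}
```

The implied values (0.48, 0.48, 0.49, 0.50, 0.49, 0.38 and 0.47) agree with the standard deviations reported for these variables.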

6. Econometric methodology

Several studies in the literature apply OLS to average SET data. OLS is not the appropriate estimator for ranked categorical variables. The appropriate estimator for ordered data in integer form is an ordered estimator such as Ordered Probit (Mason et al., 1995; DeCanio, 1986). The results remain largely the same for the Ordered Probit and Ordered Logit estimators. OLS results are also presented next to the Ordered Probit results for comparison purposes. The OLS and Ordered Probit standard errors are clustered at the instructor level. Both sets of regressions employ annual dummies and instructor specific dummies for non-white instructors. Employing instructor specific dummies for all instructors is not possible because of near multicollinearity with the CAUCASIAN variable. Average marginal effects are computed for the Ordered Probit models, taking into account that several of the independent variables are integer count variables and dummy variables, according to the procedure outlined in Bartus (2005).

The Ordered Probit model was subjected to robustness checks by running subsample regressions of SET scores for individual instructors/courses, including only semester and course controls. The results regarding the aspects of teaching that influence overall course scores and instructor scores vary by instructor, but most of the significant variables show up for at least a few, if not most, of the subsamples. The author is grateful to an anonymous referee for suggesting this check.
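
To make the idea of an average marginal effect for a dummy regressor concrete, the following sketch computes category probabilities for an ordered probit and averages the discrete change in those probabilities when a dummy switches from 0 to 1. It is a minimal illustration under stated assumptions, not the paper's procedure: the coefficients, cutpoints and observations are made up, and treating dummies by discrete change is assumed here to be in the spirit of the cited method.

```python
import math


def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def category_probs(xb, cuts):
    """P(y = j) for j = 0..len(cuts), given index xb and increasing cutpoints."""
    cdf = [norm_cdf(c - xb) for c in cuts] + [1.0]
    probs, prev = [], 0.0
    for value in cdf:
        probs.append(value - prev)
        prev = value
    return probs


def ame_dummy(rows, beta, cuts, name):
    """Average over rows of P(y = j | dummy = 1) - P(y = j | dummy = 0)."""
    n_cat = len(cuts) + 1
    totals = [0.0] * n_cat
    for row in rows:
        # Linear index from every regressor except the dummy of interest.
        base = sum(beta[k] * v for k, v in row.items() if k != name)
        p1 = category_probs(base + beta[name], cuts)
        p0 = category_probs(base, cuts)
        for j in range(n_cat):
            totals[j] += p1[j] - p0[j]
    return [t / len(rows) for t in totals]


# Hypothetical coefficients, cutpoints and data, for illustration only.
beta = {"PROF_MALE": 0.2, "ORGANIZED": 0.25}
cuts = [0.0, 0.5, 1.0, 1.5]  # four cutpoints give five ordered categories
rows = [
    {"PROF_MALE": 1, "ORGANIZED": 5},
    {"PROF_MALE": 0, "ORGANIZED": 3},
    {"PROF_MALE": 0, "ORGANIZED": 4},
]
effects = ame_dummy(rows, beta, cuts, "PROF_MALE")
```

Because the category probabilities always sum to one, the effects across categories sum to zero: a positive coefficient shifts probability out of the lowest category and into the highest.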


7. Overall course quality results

COURSEGRADE is used to measure the overall course experience or course quality. Students assign a grade to the entire course in response to the question: If you could give this course a grade it would be A, B, C, D, F or do not know. A numerical value of 4 corresponds to A, 3 to B and so on. In this equation, positive and significant coefficients increase the number assigned to COURSEGRADE and thus the appraisal of the course. Table 3 presents the OLS and Ordered Probit estimates of the model with COURSEGRADE as the dependent variable. For both sets of estimates, the significant coefficients on the CAUCASIAN, PROF_MALE and TENURE variables suggest that tenured, Caucasian, male professors tend to perform

Table 3
OLS and Ordered Probit estimates of overall experience.

Variable                     OLS                 Probit

Constant                     0.18 (0.87)         –

Professors’ characteristics
CAUCASIAN                    0.11** (2.49)       0.28*** (3.26)
PROF_MALE                    0.06** (2.13)       0.19** (2.33)
TENURE                       0.06* (1.93)        0.16* (1.84)

Students’ characteristics
GRADEEXPECTED                0.09*** (5.35)      0.24*** (6.89)
MALE                         0.01 (0.43)         0.03 (0.56)
CLASSSTATUS                  0.01 (0.70)         0.03 (0.87)
SEMESTER                     0.01 (0.48)         0.01 (0.18)
PRINC                        0.00 (0.05)         0.08 (0.63)
INTERMED                     0.03 (0.62)         0.14 (1.11)
ELEC                         0.00 (0.02)         0.03 (0.26)
TIMEINCLASS                  0.00 (1.14)         0.01 (1.24)
TIMEOUTCLASS                 0.00 (0.58)         0.00 (0.74)
RELATIVETIME                 0.01 (0.53)         0.03 (0.88)

Measures of teaching effectiveness
ACCESSIBLE                   0.02 (1.39)         0.02 (0.83)
ASSIGNMENTS_APPROPRIATE      0.06*** (2.91)      0.09** (2.32)
AWARE_DIFFICULTIES           0.07*** (5.61)      0.15*** (6.63)
BALANCE                      0.04** (2.71)       0.10*** (3.92)
BOOKS                        0.01 (0.33)         0.01 (0.17)
CLEAR_PRESENTATION           0.15*** (6.56)      0.30*** (6.41)
DEMANDING                    0.05*** (2.83)      0.08** (2.02)
EXAMS_APPROPRIATE            0.09*** (4.23)      0.20*** (4.39)
EXPLANATIONS_ERRORS          0.02 (1.03)         0.05 (1.41)
EXPLAINED_ASSIGNMENTS        0.03* (1.97)        0.09** (2.45)
INTERESTING                  0.14*** (10.37)     0.32*** (11.17)
KNOWLEDGE                    0.01 (0.25)         0.05 (0.83)
OBJECTIVE                    0.02 (1.13)         0.01 (0.41)
ORGANIZED                    0.13*** (6.14)      0.25*** (5.80)
POV                          0.05*** (3.10)      0.11** (2.49)
QUESTIONS                    0.04* (1.95)        0.08* (1.79)
STIMULATED_THINK             0.13*** (7.20)      0.22*** (6.31)

R-squared                    0.5785              –
Pseudo/adjusted R-squared    –                   0.3627
Log likelihood               –                   2074.8135

N = 3165.
* Significance at P < 0.1. ** Significance at P < 0.05. *** Significance at P < 0.01.
Both the OLS and Probit estimators’ standard errors have been clustered at the instructor level. The numbers in parentheses are the t-statistics for OLS and z-statistics for Probit.
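
The significance stars in Table 3 can be related to the reported z-statistics with a short sketch. It assumes a two-sided test against the standard normal distribution, which suits the Probit z-statistics; the OLS t-statistics with clustered errors would call for a t distribution instead. The function names `two_sided_p` and `stars` are hypothetical.

```python
import math


def two_sided_p(z):
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def stars(z):
    """Map a z-statistic to the star convention in the table notes."""
    p = two_sided_p(z)
    if p < 0.01:
        return "***"
    if p < 0.05:
        return "**"
    if p < 0.1:
        return "*"
    return ""


# Probit z-statistics taken from Table 3.
probit_z = {"CAUCASIAN": 3.26, "PROF_MALE": 2.33, "TENURE": 1.84, "MALE": 0.56}
flags = {name: stars(z) for name, z in probit_z.items()}
```

For these four rows the computed flags reproduce the stars printed in the Probit column.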


Table 4
OLS and Ordered Probit estimates of instructor quality.

Variable                     OLS                 Probit

Constant                     1.34 (6.57)         –

Professors’ characteristics
CAUCASIAN                    0.14** (2.23)       0.33*** (2.43)
PROF_MALE                    0.08** (1.93)       0.21** (1.99)
TENURE                       0.08* (1.93)        0.18* (1.77)

Students’ characteristics
GRADEEXPECTED                0.03*** (1.27)      0.06*** (1.59)
MALE                         0.05 (1.33)         0.09 (1.47)
CLASSSTATUS                  0.00 (0.35)         0.01 (0.24)
SEMESTER                     0.02 (0.61)         0.01 (0.21)
PRINC                        0.00 (0.05)         0.05 (0.43)
INTERMED                     0.05 (1.05)         0.09 (0.86)
ELEC                         0.02 (0.56)         0.07 (0.67)
TIMEINCLASS                  0.01 (2.01)         0.03 (2.15)
TIMEOUTCLASS                 0.00 (0.86)         0.01 (0.75)
RELATIVETIME                 0.00 (0.14)         0.02 (0.38)

Measures of teaching effectiveness
ACCESSIBLE                   0.00 (0.08)         0.05 (0.98)
ASSIGNMENTS_APPROPRIATE      0.07*** (3.30)      0.12** (2.89)
AWARE_DIFFICULTIES           0.11*** (6.63)      0.17*** (6.46)
BALANCE                      0.03** (1.68)       0.10*** (2.88)
BOOKS                        0.01 (0.46)         0.04 (1.09)
CLEAR_PRESENTATION           0.22*** (6.29)      0.34*** (6.04)
DEMANDING                    0.07*** (3.27)      0.07** (1.95)
EXAMS_APPROPRIATE            0.12*** (5.57)      0.19*** (5.27)
EXPLANATIONS_ERRORS          0.04 (1.51)         0.10 (2.55)
EXPLAINED_ASSIGNMENTS        0.02* (1.10)        0.07** (2.05)
INTERESTING                  0.24*** (7.98)      0.48*** (8.33)
KNOWLEDGE                    0.09 (3.17)         0.13 (2.52)
OBJECTIVE                    0.01 (0.35)         0.06 (1.38)
ORGANIZED                    0.15*** (6.50)      0.22*** (6.71)
POV                          0.05*** (1.72)      0.04** (0.69)
QUESTIONS                    0.11* (4.10)        0.18* (4.34)
STIMULATED_THINK             0.15*** (6.79)      0.18*** (5.43)

R-squared                    0.6874              –
Adjusted R-squared           –                   0.405
Log likelihood               –                   2405.1617
F-statistic                                      –

N = 3151.
* Significance at P < 0.1. ** Significance at P < 0.05. *** Significance at P < 0.01.
Both the OLS and Probit estimators’ standard errors have been clustered at the instructor level. T-statistics and Z-statistics are reported in parentheses.

differently on this measure of COURSEGRADE than their counterparts. GRADEEXPECTED, the grade the student expects in the course, is also significant. The gender of the student is insignificant, as is the type of class (principles, intermediate or elective). The time spent in class, the time spent out of class and the time spent relative to other courses are also insignificant determinants of the overall grade that a student assigns to a course. The measures of teaching effectiveness that significantly affect the overall grade students assign to a course are: ASSIGNMENTS_APPROPRIATE, AWARE_DIFFICULTIES, BALANCE, CLEAR_PRESENTATION, DEMANDING, EXAMS_APPROPRIATE, EXPLAINED_ASSIGNMENTS, INTERESTING, ORGANIZED, POV, QUESTIONS and STIMULATED_THINK. None of these significant variables will surprise most teachers. Students value clear presentations and interesting, organized professors most highly. Interestingly, access to the professor, as measured by the ACCESSIBLE variable, is insignificant. Highly selective liberal arts colleges pride themselves on small class sizes and access to the professor. It may be that, given the clear presentations, explained assignments and appropriate exams provided by many of the instructors, combined with the high quality of the students, few students actually need access to the professor after class. It may also be that the opportunity cost of attending a professor’s office hours is too high for this student body.

Next, the average marginal effects of the significant factors on overall course quality are discussed. Table 5 presents the average marginal effect (AME) of each of the independent variables on the dependent variable, COURSEGRADE. AMEs are computed by evaluating the marginal effects at each data point and then averaging them across the sample. The more familiar Marginal Effects at Means (MEM) are simply the marginal effects computed at the sample means. Bartus argues that AMEs are more realistic when some of the independent variables are dummy variables (Bartus, 2005). The AMEs here have been computed with dummy variables changing from 0 to 1 and count variables changing by an integer value of 1. This is a distinct advantage over MEM, which evaluates the effects at the sample means, including nonsensical values such as 0.64 for CAUCASIAN. Each AME shows the change in the probability of a given outcome of the dependent variable, computed at each sample point and then averaged across the entire sample. For example, the AME for COURSEGRADE = A or 4 shows the average impact of a marginal change in each independent variable (or a unit change in the case of dummy or count regressors) on the probability of the student assigning a COURSEGRADE = A or 4. Table 5 shows that CAUCASIAN professors have a 7% higher probability of being assigned an A as their COURSEGRADE, ceteris paribus, while MALE professors have a 4.6% higher probability of receiving a COURSEGRADE of A than their female counterparts.
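The distinction between AME and MEM drawn above can be illustrated with a minimal sketch. It assumes a five-category ordered probit with one dummy regressor and one 1 to 5 rating item; the cutpoints, coefficients and simulated sample (`CUTS`, `BETA_DUMMY`, `BETA_SCORE`, `sample`) are hypothetical values chosen for illustration and are not the paper's estimates or data.

```python
import math
import random

# All numbers below are hypothetical illustrations, not estimates from the paper.
CUTS = [-1.0, 0.0, 1.0, 2.0]   # four cutpoints -> five ordered grades (F, D, C, B, A)
BETA_DUMMY = 0.28              # assumed coefficient on a 0/1 regressor
BETA_SCORE = 0.30              # assumed coefficient on a 1-5 rating item


def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def category_probs(xb):
    """Ordered probit probabilities of each grade, lowest to highest."""
    cdf = [0.0] + [norm_cdf(c - xb) for c in CUTS] + [1.0]
    return [cdf[j + 1] - cdf[j] for j in range(len(CUTS) + 1)]


def ame_dummy(rows):
    """AME of the dummy: average over observations of P(grade | d=1) - P(grade | d=0)."""
    totals = [0.0] * (len(CUTS) + 1)
    for _, score in rows:
        p1 = category_probs(BETA_SCORE * score + BETA_DUMMY)
        p0 = category_probs(BETA_SCORE * score)
        totals = [t + a - b for t, a, b in zip(totals, p1, p0)]
    return [t / len(rows) for t in totals]


def mem_dummy(rows):
    """MEM of the dummy: derivative evaluated once, at the sample means."""
    mean_d = sum(d for d, _ in rows) / len(rows)   # a fraction, not 0 or 1
    mean_s = sum(s for _, s in rows) / len(rows)
    xb = BETA_DUMMY * mean_d + BETA_SCORE * mean_s
    pdf = [0.0] + [norm_pdf(c - xb) for c in CUTS] + [0.0]
    return [(pdf[j] - pdf[j + 1]) * BETA_DUMMY for j in range(len(CUTS) + 1)]


random.seed(0)
# Simulated sample: a dummy that equals 1 about 64% of the time, plus a 1-5 score.
sample = [(1 if random.random() < 0.64 else 0, random.randint(1, 5)) for _ in range(1000)]

ame = ame_dummy(sample)
mem = mem_dummy(sample)
for grade, a, m in zip("FDCBA", ame, mem):
    print(f"grade {grade}: AME = {a:+.4f}, MEM = {m:+.4f}")
```

In this sketch the AME switches the dummy from 0 to 1 for every observation, whereas the MEM differentiates at a fractional mean of the dummy. In both cases the effects across the five grades sum to zero, because the category probabilities always sum to one.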
GRADEEXPECTED, a student characteristic, and DEMANDING, a measure of teaching effectiveness, have significant and interesting signs. A one letter grade increase in the grade the student expects (GRADEEXPECTED) increases the probability that the student will give the course experience an A by 5.7%, while a unit increase in the instructor being DEMANDING lowers the probability of the course being rated an A by 2%. However, a one letter grade change in the expected grade decreases the probability of the course being rated a B by 3%. The signs of these two variables may be explained by remembering that the students at this highly selective college want to be challenged in the classroom. While they wish to be challenged, they do not want the instructor to be too demanding. While grade inflation may not buy excellent course grades for the instructor, there may be enough room to purchase adequate course scores.

Some of the teaching effectiveness variables that positively affect the student’s overall impression of the course are ASSIGNMENTS_APPROPRIATE, AWARE_DIFFICULTIES and BALANCE. These variables increase the probability of the course being given a grade of A by 2.3%, 3.6% and 2.5% respectively. Other teaching effectiveness variables that positively affect the probability of the student giving the course an A are CLEAR_PRESENTATION (7.4%), EXAMS_APPROPRIATE (4.8%), EXPLAINED_ASSIGNMENTS (2.2%), INTERESTING (7.8%) and STIMULATED_THINK (5.3%). The results suggest that students like their professors to be experts who are opinionated and not necessarily open to other points of view. The POINT OF VIEW variable reduces the probability of the course being graded an A by 2.6% for each unit increase on the scale of openness to the viewpoints of others.
Each regressor that has a positive AME on the probability of COURSEGRADE being an A necessarily has negative AMEs for the other categories of this ordered dependent variable (COURSEGRADE = B, C, D and F). The signs of the AMEs in the rest of Table 5 bear this out.

8. Overall instructor results

RECOMMEND_INSTRUCTOR is used to measure the student’s impression of the quality of the instructor. It is the score assigned to the professor in response to the statement: recommend others take this course from this instructor. The students’ choices are: Strongly Agree or 5, Agree or 4, Neutral or 3, Disagree or 2 and Strongly Disagree or 1. OLS and Ordered Probit estimates of instructor quality are contained in Table 4, while Table 6 presents the AMEs that correspond to the Ordered Probit results from Table 4.

Table 5 Average marginal effects for overall course experience.

Variable                 Course grade = A  Course grade = B  Course grade = C  Course grade = D  Course grade = F
CAUCASIAN                0.06987 (3.16)*   0.03954 (3.23)*   0.02227 (3.04)*   0.00541 (2.61)*   0.00265 (3.01)*
PROF_MALE                0.04644 (2.26)*   0.02662 (2.26)*   0.01451 (2.22)*   0.00352 (2.19)*   0.00179 (2.14)*
GRADEEXPECTED            0.05735 (6.53)*   0.03271 (6.73)*   0.01805 (6.05)*   0.00438 (3.94)*   0.00220 (5.44)*
SEMESTER                 0.00278 (0.18)    0.00159 (0.18)    0.00088 (0.18)    0.00021 (0.18)    0.00011 (0.18)
PRINCIPLES COURSE        0.01874 (0.63)    0.01069 (0.63)    0.00590 (0.63)    0.00143 (0.62)    0.00072 (0.62)
INTERMEDIATE COURSE      0.03409 (1.09)    0.01917 (1.10)    0.01095 (1.06)    0.00264 (1.02)    0.00134 (1.09)
ACCESSIBLE               0.00560 (0.82)    0.00320 (0.82)    0.00176 (0.82)    0.00043 (0.84)    0.00022 (0.80)
ASSIGNMENTS_APPROPRIATE  0.02260 (2.34)*   0.01289 (2.29)*   0.00711 (2.34)*   0.00173 (2.45)*   0.00087 (2.20)*
AWARE_DIFFICULTIES       0.03586 (6.74)*   0.02045 (6.76)*   0.01129 (6.31)*   0.00274 (4.17)*   0.00138 (5.73)*
BALANCE                  0.02499 (4.00)*   0.01426 (3.89)*   0.00787 (4.05)*   0.00191 (3.66)*   0.00096 (3.18)*
BOOKS                    0.00164 (0.17)    0.00094 (0.17)    0.00052 (0.17)    0.00013 (0.17)    0.00006 (0.17)
CLASS STATUS             0.00648 (0.86)    0.00369 (0.87)    0.00204 (0.85)    0.00049 (0.83)    0.00025 (0.87)
CLEAR_PRESENTATION       0.07351 (6.26)*   0.04194 (5.83)*   0.02314 (6.22)*   0.00561 (5.02)*   0.00282 (4.96)*
DEMANDING                0.01898 (1.99)*   0.01083 (1.97)*   0.00598 (2.03)*   0.00145 (1.89)    0.00073 (1.89)
ELECTIVE COURSE          0.00688 (0.26)    0.00391 (0.26)    0.00217 (0.26)    0.00053 (0.26)    0.00027 (0.26)
EXAMS_APPROPRIATE        0.04814 (4.56)*   0.02746 (4.34)*   0.01515 (4.64)*   0.00367 (4.52)*   0.00185 (3.33)*
EXPLANATIONS_ERRORS      0.01255 (1.42)    0.00716 (1.41)    0.00395 (1.43)    0.00096 (1.35)    0.00048 (1.35)
EXPLAINED_ASSIGNMENTS    0.02220 (2.49)*   0.01267 (2.55)*   0.00699 (2.39)*   0.00169 (2.27)*   0.00085 (2.31)*
INTERESTING              0.07754 (11.51)*  0.04423 (9.96)*   0.02441 (11.53)*  0.00592 (5.86)*   0.00298 (5.74)*
KNOWLEDGE                0.01174 (0.83)    0.00670 (0.82)    0.00370 (0.84)    0.00090 (0.80)    0.00045 (0.83)
MALE                     0.00746 (0.56)    0.00425 (0.57)    0.00235 (0.57)    0.00057 (0.56)    0.00029 (0.57)
OBJECTIVE                0.00306 (0.41)    0.00175 (0.41)    0.00096 (0.41)    0.00023 (0.41)    0.00012 (0.41)
ORGANIZED                0.06103 (6.00)*   0.03481 (5.95)*   0.01921 (5.94)*   0.00466 (3.73)*   0.00234 (5.54)*
POINT OF VIEW            0.02595 (2.45)*   0.01480 (2.36)*   0.00817 (2.54)*   0.00198 (2.61)*   0.00100 (2.27)*
QUESTIONS                0.01915 (1.81)    0.01092 (1.80)    0.00603 (1.80)    0.00146 (1.73)    0.00074 (1.82)
RELATIVE TIME            0.00712 (0.87)    0.00406 (0.88)    0.00224 (0.86)    0.00054 (0.82)    0.00027 (0.89)
STIMULATED_THINK         0.05319 (5.98)*   0.03034 (5.48)*   0.01675 (6.34)*   0.00406 (4.66)*   0.00204 (4.99)*
TENURE                   0.04030 (1.79)    0.02281 (1.78)    0.01283 (1.82)    0.00310 (1.62)    0.00156 (1.75)
TIME IN CLASS            0.00327 (1.23)    0.00186 (1.21)    0.00103 (1.26)    0.00025 (1.24)    0.00013 (1.21)
TIME OUT OF CLASS        0.00105 (0.74)    0.00060 (0.74)    0.00033 (0.75)    0.00008 (0.72)    0.00004 (0.75)

* The Z score is statistically significant at P < 0.05. Z scores are reported in parentheses.
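The adding-up property described in the text can be checked against the magnitudes printed in Table 5. The sketch below assumes, as the text states, that the entries for grades B to F carry the opposite sign to the entry for grade A, so the A magnitude should equal the sum of the other four up to rounding. The dictionary `TABLE5` and the helper `adding_up_gap` are illustrative names, and only three rows are copied from the table.

```python
# Magnitudes as printed in Table 5, in column order A, B, C, D, F.
TABLE5 = {
    "CAUCASIAN":     [0.06987, 0.03954, 0.02227, 0.00541, 0.00265],
    "GRADEEXPECTED": [0.05735, 0.03271, 0.01805, 0.00438, 0.00220],
    "INTERESTING":   [0.07754, 0.04423, 0.02441, 0.00592, 0.00298],
}


def adding_up_gap(row):
    """Magnitude of the AME on A minus the summed magnitudes for B, C, D and F."""
    return row[0] - sum(row[1:])


# A gap near zero is consistent with the category AMEs summing to zero.
gaps = {name: adding_up_gap(values) for name, values in TABLE5.items()}
for name, gap in gaps.items():
    print(f"{name}: gap = {gap:.5f}")
```

Gaps of this size are what rounding to five decimal places would produce, which is consistent with the claim that a regressor raising the probability of an A lowers the probabilities of the other grades by the same total amount.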

Table 6 Average marginal effects for recommended instructor.

Variable                 Instructor = 5     Instructor = 4     Instructor = 3     Instructor = 2     Instructor = 1
                         Strongly Agree     Somewhat Agree     Neutral            Somewhat Disagree  Strongly Disagree
CAUCASIAN                0.0713 (2.41)*     0.0245 (2.44)*     0.0217 (2.42)*     0.0137 (2.32)*     0.0113 (2.32)*
PROF_MALE                0.0458 (1.98)*     0.0163 (1.94)      0.0136 (1.95)      0.0087 (2.06)*     0.0072 (1.95)
GRADEEXPECTED            0.0130 (1.61)      0.0046 (1.61)      0.0039 (1.62)      0.0025 (1.59)      0.0020 (1.56)
SEMESTER                 0.0022 (0.20)      0.0008 (0.20)      0.0007 (0.21)      0.0004 (0.20)      0.0004 (0.21)
PRINCIPLES COURSE        0.0107 (0.43)      0.0038 (0.44)      0.0032 (0.43)      0.0020 (0.43)      0.0017 (0.43)
INTERMEDIATE COURSE      0.0191 (0.87)      0.0066 (0.91)      0.0057 (0.84)      0.0037 (0.84)      0.0030 (0.85)
ACCESSIBLE               0.0109 (0.98)      0.0039 (1.00)      0.0033 (0.98)      0.0021 (0.98)      0.0017 (0.96)
ASSIGNMENTS_APPROPRIATE  0.0264 (2.88)*     0.0093 (2.91)*     0.0079 (2.69)*     0.0050 (3.11)*     0.0041 (2.75)*
AWARE_DIFFICULTIES       0.0360 (6.62)*     0.0127 (7.49)*     0.0107 (5.43)*     0.0069 (6.12)*     0.0057 (5.98)*
BALANCE                  0.0225 (2.85)*     0.0079 (2.81)*     0.0067 (2.86)*     0.0043 (2.77)*     0.0035 (2.80)*
BOOKS                    0.0078 (1.10)      0.0028 (1.09)      0.0023 (1.14)      0.0015 (1.09)      0.0012 (1.07)
CLASS STATUS             0.0014 (0.24)      0.0005 (0.24)      0.0004 (0.24)      0.0003 (0.24)      0.0002 (0.24)
CLEAR_PRESENTATION       0.0741 (5.77)*     0.0262 (5.80)*     0.0221 (4.80)*     0.0142 (5.77)*     0.0117 (6.30)*
DEMANDING                0.0159 (1.96)      0.0056 (1.96)*     0.0048 (1.91)      0.0030 (2.03)*     0.0025 (1.89)
ELECTIVE COURSE          0.0140 (0.68)      0.0049 (0.69)      0.0042 (0.67)      0.0027 (0.67)      0.0022 (0.67)
EXAMS_APPROPRIATE        0.0407 (5.40)*     0.0144 (5.34)*     0.0121 (5.20)*     0.0078 (4.87)*     0.0064 (5.18)*
EXPLANATIONS_ERRORS      0.0214 (2.56)*     0.0076 (2.57)*     0.0064 (2.58)*     0.0041 (2.51)*     0.0034 (2.40)*
EXPLAINED_ASSIGNMENTS    0.0146 (2.08)*     0.0052 (2.08)*     0.0044 (2.10)*     0.0028 (2.00)*     0.0023 (2.06)*
INTERESTING              0.1029 (8.48)*     0.0364 (6.86)*     0.0307 (9.13)*     0.0197 (7.61)*     0.0162 (7.78)*
KNOWLEDGE                0.0270 (2.59)*     0.0096 (2.58)*     0.0081 (2.62)*     0.0052 (2.42)*     0.0043 (2.61)*
MALE                     0.0204 (1.47)      0.0071 (1.46)      0.0061 (1.50)      0.0039 (1.43)      0.0032 (1.43)
OBJECTIVE                0.0131 (1.38)      0.0046 (1.39)      0.0039 (1.34)      0.0025 (1.37)      0.0021 (1.40)
ORGANIZED                0.0484 (6.86)*     0.0171 (6.27)*     0.0144 (6.81)*     0.0093 (6.43)*     0.0076 (6.07)*
POINT OF VIEW            0.0082 (0.69)      0.0029 (0.68)      0.0025 (0.68)      0.0016 (0.70)      0.0013 (0.69)
QUESTIONS                0.0387 (4.23)*     0.0137 (3.86)*     0.0115 (4.20)*     0.0074 (4.23)*     0.0061 (4.52)*
RELATIVE TIME            0.0033 (0.38)      0.0012 (0.38)      0.0010 (0.38)      0.0006 (0.38)      0.0005 (0.38)
STIMULATED_THINK         0.0380 (5.38)*     0.0134 (5.06)*     0.0113 (4.96)*     0.0073 (5.65)*     0.0060 (5.23)*
TENURE                   0.0383 (1.79)      0.0133 (1.79)      0.0115 (1.82)      0.0074 (1.73)      0.0061 (1.75)
TIME IN CLASS            0.0059 (2.13)*     0.0021 (2.06)*     0.0018 (2.13)*     0.0011 (2.22)*     0.0009 (2.11)*
TIME OUT OF CLASS        0.0012 (0.74)      0.0004 (0.74)      0.0004 (0.75)      0.0002 (0.74)      0.0002 (0.74)

* The Z score is statistically significant at P < 0.05. Z scores are reported in parentheses.


The findings common to both the OLS and Ordered Probit model estimates presented in Table 4 are as follows: CAUCASIAN, PROF_MALE and TENURE are significant determinants of RECOMMEND_INSTRUCTOR. Among the determinants of teaching excellence, ASSIGNMENTS_APPROPRIATE, AWARE_DIFFICULTIES, BALANCE, CLEAR_PRESENTATION, DEMANDING, EXAMS_APPROPRIATE, INTERESTING, ORGANIZED, POV and STIMULATED_THINK are all significant determinants of RECOMMEND_INSTRUCTOR. In addition, GRADEEXPECTED is a significant determinant of RECOMMEND_INSTRUCTOR.

Table 6 presents the average marginal effect (AME) of each of the independent variables on the dependent variable, RECOMMEND_INSTRUCTOR. The following discussion concerns the probability of an instructor being strongly recommended by the students to their peers, that is, RECOMMEND_INSTRUCTOR = 5. The results in Table 6 show that CAUCASIAN instructors have a 7% better chance than their counterparts of being strongly recommended by the students, while MALE professors have a 4.5% better chance of being strongly recommended than their female counterparts, ceteris paribus. Among the measures of teaching effectiveness that affect RECOMMEND_INSTRUCTOR, CLEAR_PRESENTATION is one of the strongest. A unit increase in the clarity of presentation leads to a 7.4% increase in the probability of students strongly agreeing with the statement that this instructor should be recommended to others for this course. A unit increase in being INTERESTING leads to a 10% increase in the chance that students will strongly agree to recommend the instructor, while being ORGANIZED leads to a 4.8% increase in that chance. The other determinants of teaching excellence are the usual suspects: appropriate examinations, explaining students’ errors, explaining assignments, knowledge and stimulated thinking. These AMEs may be found in Table 6.

9. Conclusions

In keeping with the bulk of the literature, race and gender play a significant role in SETs even at a highly selective liberal arts college. In fact, the impact of race and gender on excellent SET may be as strong as that of being a clear presenter or an organized teacher. Instructors may not buy excellent ratings by giving out better grades, but they can buy adequate ratings by giving out higher grades. The overall impression of a course being excellent deteriorates with higher expected grades. This finding is new to the literature: a class that is an easy A is not a very attractive course to these highly selective students. The measures of teaching effectiveness that correspond to better SET are factors such as clear presentations, organized instructors, appropriate examinations, well explained assignments and explanations of student errors. The professor’s knowledge, ability to be interesting, awareness of students’ difficulties and ability to stimulate students to think are all significant determinants of SET. This paper highlights the qualities that students at a highly selective liberal arts institution value in an Economics or Business instructor.

References

Bartus, T., 2005. Estimation of marginal effects using Margeff. Stata J. 5 (3), 309–329.
Boex, L.F.J., 2000. Attributes of effective economics instructors: an analysis of student evaluations. J. Econ. Educ. 31 (3), 211–227.
Boretz, E., 2004. Grade inflation and the myth of student consumerism. Coll. Teach. 24 (4), 25.
Campbell, H.E., Gerdes, K., Steiner, S., 2005. What’s looks got to do with it? Instructor appearance and student evaluations of teaching. J. Policy Anal. Manage. 24 (3), 611–620.
Centra, J.A., Gaubatz, N.B., 1998. Is there gender bias in student ratings of instruction? In: Paper Presented at the Seventy-Ninth Annual Meeting of the American Educational Research Association, San Diego, April.
DeCanio, S.J., 1986. Student evaluations of teaching: a multinomial logit approach. J. Econ. Educ. 17 (3), 165–176.
Feldman, K.A., 1992. College students’ views of male and female college teachers: part I, evidence from the social laboratory and experiments. Res. High. Educ. 33, 317–375.
Hamermesh, D.S., Parker, A., 2005. Beauty in the classroom: instructors’ pulchritude and putative pedagogical productivity. Econ. Educ. Rev. 24 (4), 369–376.
Isely, P., Singh, H., 2005. Do higher grades lead to favorable student evaluations? J. Econ. Educ. 36 (1), 29–42.
Jewell, R.T., McPherson, M.A., 2012. Instructor-specific grade inflation: incentives, gender, and ethnicity. Soc. Sci. Q. 93 (12 January (1)), 95–109.
Jewell, R.T., McPherson, M.A., Tieslau, M.A., 2011. Whose fault is it? Assigning blame for grade inflation in higher education. Appl. Econ. 45 (9), 1185–1200.


Kogan, L., Schoenfeld-Tacher, R., Hellyer, P.W., 2010. Student evaluations of teaching: perceptions of faculty based on gender, position, and rank. Teach. High. Educ. 15 (6), 623–636.
Krautmann, A.C., Sander, W., 1999. Grades and student evaluations of teachers. Econ. Educ. Rev. 18 (1), 59–63.
Langbein, L., 2008. Management by results: student evaluation of faculty teaching and the mis-measurement of performance. Econ. Educ. Rev. 27 (4), 417–428.
Love, D.A., Kotchen, M.J., 2010. Grades, course evaluations, and academic incentives. East. Econ. J. 36, 151–163.
Mason, P.M., Steagall, J.W., Fabritius, M.M., 1995. Student evaluations of faculty: a new procedure for using aggregate measures of performance. Econ. Educ. Rev. 14 (4), 403–416.
Matos-Diaz, H., Ragan Jr., J.F., 2009. Do student evaluations of teaching depend on the distribution of expected grade? Educ. Econ. 18 (3), 317–330.
McPherson, M.A., Jewell, R.T., 2007. Leveling the playing field: should student evaluation scores be adjusted? Soc. Sci. Q. 88 (September (3)), 868–881.
McPherson, M.A., Jewell, R.T., Kim, M., 2009. What determines student evaluation scores? A random effects analysis of undergraduate economics classes. East. Econ. J. 35 (1), 37–51.
Nowell, C., 2007. The impact of relative grade expectations on student evaluation of teaching. Int. Rev. Econ. Educ. 6 (2), 42–56.
Reid, L.D., 2010. The role of perceived race and gender in the evaluation of college teaching on RateMyProfessors.Com. J. Divers. High. Educ. 3 (September (3)), 137–152.
Seiler, M.J., Seiler, V.L., Chiang, D., 1999. Professor, student, and course attributes that contribute to successful teaching evaluations. Financ. Pract. Educ. 9 (2), 91–99.
Theall, M., Franklin, J., 2001. Looking for bias in all the wrong places: a search for truth or a witch hunt in student ratings of instruction? In: Theall, M., Abrami, P.C., Mets, L.A. (Eds.), The Student Ratings Debate: Are They Valid? How Can We Best Use Them? New Directions for Institutional Research, vol. 109. Jossey-Bass, San Francisco, pp. 45–56.
Weinberg, B.A., Hashimoto, M., Fleisher, B.M., 2009. Evaluating teaching in higher education. J. Econ. Educ. 40 (3), 227–261.
