G Model JEBO-3474; No. of Pages 22
ARTICLE IN PRESS Journal of Economic Behavior & Organization xxx (2014) xxx–xxx
Journal of Economic Behavior & Organization journal homepage: www.elsevier.com/locate/jebo
Cheating and social interactions. Evidence from a randomized experiment in a national evaluation program☆

Claudio Lucifora a,b,∗, Marco Tonello c,d

a Università Cattolica di Milano, Italy
b IZA, Germany
c Bank of Italy, Directorate General for Economics, Statistics and Research, Structural Economic Analysis Directorate, Economics and Law Division, Rome, Italy
d CRELI (Università Cattolica di Milano), Italy
Article info

Article history: Received 2 January 2014; received in revised form 16 September 2014; accepted 1 December 2014; available online xxx

JEL classification: C31; D62; I21

Keywords: Social multiplier; Classroom cheating; Randomized experiment

Abstract

We investigate cheating behavior in school using a unique data set drawn from a national evaluation test. We exploit a randomized experiment to identify social interactions in the classroom and estimate a cheating social multiplier of about two, which is consistent with a change in students’ achievements twice as large as the initial response. Cheating behavior is found to be more relevant in primary schools as compared to junior-high schools. We also show that cheating occurs mainly when teachers shirk or lower monitoring effort, letting students exchange information and cooperate. Differences in the estimated effects are found in terms of social ties among classmates and social capital endowment in the territory.

© 2014 Elsevier B.V. All rights reserved.
“It’s seen as helping your friend out. If you ask people, they’d say it’s not cheating. I have your back, you have mine.”
Senior student at Stuyvesant High School in Manhattan.

“We want to be famous and successful, we think our colleagues are cutting corners, we’ll be damned if we’ll lose out to them, and some day, when we’ve made it, we’ll be role models. But until then, give us a pass.”
Student at Harvard Graduate School of Education.

The New York Times, September 25, 2012
☆ We thank Piero Cipollone, Paolo Sestito and Patrizia Falzetti (Invalsi) for outstanding support with the data. We are grateful to the editor of the journal, the guest editors of the special issue on “Behavioral Economics of Education” and two anonymous referees for their comments. We thank Marco Manacorda, Giulio Zanella, Maria De Paola, Andrea Ichino, Rudolf Winter-Ebmer, Sandra Black, Erich Battistin, Paolo Pinotti, Alfonso Rosolia, Ylenia Brilli, and participants at the XI Brucchi Luchino Labor Economics Workshop, the workshop on Cheating in test scores: identification and forms of contrast (Invalsi, Rome), the IV Workshop on Applied Economics of Education (IWAEE, University of Lancaster and Magna Grecia), the 2013 EEA Annual Conference (University of Gothenburg), the 2014 RES Annual Conference (University of Manchester), and seminars at Catholic University Milan and Bank of Italy for useful comments. The data have been kindly made available by INVALSI. The views expressed in the paper are those of the authors and do not necessarily reflect those of the institutions they belong to. The usual disclaimers apply.

∗ Corresponding author at: Università Cattolica di Milano, Largo A. Gemelli, 1, 20123 Milano (Italy).
E-mail addresses: [email protected] (C. Lucifora), [email protected] (M. Tonello).

http://dx.doi.org/10.1016/j.jebo.2014.12.006
0167-2681/© 2014 Elsevier B.V. All rights reserved.
Please cite this article in press as: Lucifora, C., Tonello, M., Cheating and social interactions. Evidence from a randomized experiment in a national evaluation program. J. Econ. Behav. Organ. (2014), http://dx.doi.org/10.1016/j.jebo.2014.12.006
1. Introduction

In many social and economic contexts, individuals face the choice of adopting different types of opportunistic or even illicit behavior to increase their welfare, taking advantage of others for personal interest. Leaving aside violent crimes, there is abundant evidence that cheating on taxes, free riding on public goods, claiming benefits without entitlement, bribing and corrupting public officials, abusing drugs and alcohol, smoking where it is not permitted, as well as other types of dishonest behavior are widespread phenomena in most countries (Kleven et al., 2011; Card and Giuliano, 2013). In this paper, we focus our attention on a specific type of such fraudulent behavior, that is, cheating in the classroom during a national evaluation test.

A number of papers report evidence that cheating has grown over the last decades, hand in hand with the more extensive use of testing programs (McCabe, 2005; Davis et al., 2009),1 yet empirical evidence on the effects of cheating behavior on educational outcomes is scarce. Cheating can alter evaluators’ ability to assess students’ performance and the signaling role of grades (Anderman and Murdock, 2007). Cheating also raises a number of concerns, not just regarding its unfairness relative to those who do not cheat, but more generally regarding the externalities that are created on others (McCabe and Trevino, 1993; Carrell et al., 2008; Dee and Jacob, 2012). When cheating occurs, either because teachers discretionally help some students or let students exchange information and cooperate with each other, other students – who otherwise would have behaved honestly – feel that they cannot afford to be disadvantaged by those who cheat and may end up cheating too (Anderman and Murdock, 2007; Davis et al., 2009).2 In this context, even an isolated cheating event may propagate through social interactions via a direct effect (i.e. 
private incentives to cheat) and an indirect effect on behaviors (i.e. a reaction to others cheating). The cheating outcome is amplified by the so-called social multiplier,3 generating large differences in variance across different groups (i.e. school and classroom) with otherwise similar characteristics (Glaeser et al., 2003). While unobserved heterogeneity and sorting of individuals across groups may account for part of the differences in cheating behavior, social interactions within a group of students linked by different types of contextual ties are often necessary to explain the excess variation that is observed in the data (Manski, 1993, 2000).

To investigate cheating behavior and social externalities, we use a unique data set drawn from a national evaluation test – the ‘National Survey of Students’ Attainments’, a test in mathematical and language literacy – which is compulsory for all schools and students attending different grades of primary and junior-high school in Italy. In particular, we exploit a specific feature of the administration of the test, which assigns an external inspector to a randomly chosen sample of classrooms. Social externalities in cheating behavior are identified by contrasting the variances of students’ test scores in classrooms where the test was supervised by school teachers only (i.e. our control group) with the variances of students’ test scores in classrooms where an external inspector was supervising the test (i.e. our treated group). We report evidence showing that the test scores are on average higher, more homogeneous and exhibit a lower variance in classrooms without external monitoring as compared to monitored classrooms. 
We interpret such evidence as an indication of cheating occurring during the test.4 The identification strategy is based on the excess-variance approach (henceforth, E-V), which exploits the exclusion restrictions provided by the randomized experiment to separate the part of the variability due to individual- and group-level heterogeneity from the excess variability genuinely originating from social interactions (Graham, 2008).

This paper contributes to the existing literature on cheating behavior (among others, Dee and Jacob, 2012, on plagiarism; Jacob and Levitt, 2003, and Dee et al., 2011, on teachers’ manipulation; Carrell et al., 2008, Bertoni et al., 2013, and Angrist et al., 2014, on students’ and teachers’ cheating).5 In this context, our approach departs from existing studies in a number of ways. With respect to Carrell et al. (2008), we do not identify the effects of cheating on individual test scores using self-reported measures of cheating and the ‘share of cheaters’ in the reference group, but identify cheating behavior via (endogenous) social interactions. We also depart from Bertoni et al. (2013) and Angrist et al. (2014), who use data similar to our own and focus on primary school only (second and fifth grade). In particular, Bertoni et al. (2013) estimate the effects of external monitoring on classroom average test scores, while Angrist et al. (2014) investigate the effect of class-size
1 Large-scale cheating has been uncovered over the last year at some of the U.S.’s most competitive schools, like Stuyvesant High School in Manhattan, the Air Force Academy and, most recently, Harvard University (The New York Times, 7 September 2012). An increasing trend in the number of cases of test maladministration and forms of students’ and teachers’ cheating has also been discovered in the U.K. national curriculum assessments (Maladministration Report, Standard & Testing Agency, 2013). A survey conducted as part of the Academic Integrity Assessment Project by the Center for Academic Integrity (Duke University) reported that 21% of undergraduates admitted to having cheated in exams at least once a year (McCabe, 2005). Another survey run in 2010 by the Josephson Institute of Ethics (Report on Honesty and Integrity, 2011) found that 59.3% of the U.S. students interviewed cheated at least once during a test, while more than 80% of them copied from others’ homework at least once.

2 Reporting the offenders, as contemplated in many schools’ ethical codes, is required to halt the diffusion of cheating behaviors; nevertheless, small transgressions and dishonest behavior are very often overlooked or tolerated within many schools, either because students do not like to be directly involved in the accusation or because the schools themselves do not want to be associated with the judiciary procedures required to support the allegations of a student’s dishonesty.

3 If there are social interactions in which one person’s actions influence his peers’ incentives or information (and vice versa), then the presence of positive social interactions, or strategic complementarities, implies the existence of a social multiplier where the aggregate relationships will overstate individual elasticities (Glaeser et al., 2003).

4 Students’ cheating during the test is an interesting case study of social interactions in the classroom, since it is likely to capture the same pattern of friendships and cooperative behaviors that takes place during the school year. Students are more likely to interact with close friends, classmates they see outside school (i.e. participating in social activities together), as well as those sitting closer.

5 Evidence on cheating behavior in academia is also presented in McCabe and Trevino (1993), Jordan (2001) and McCabe (2005).
on students’ achievements and interpret their results in terms of teachers’ shirking and score manipulation across Italian regions. In this paper, we use data on both primary and junior-high schools (fifth and sixth grade), matched to a follow-up survey with additional information on parental characteristics and motivational questions, and exploit the differences in the institutional setting between grades (in terms of reference teachers per classroom and the length of time students have spent together in the school) to uncover cheating behavior. We provide a measure of teachers’ and students’ cheating and investigate different behavioral mechanisms that may explain our results. We place a particular focus on students’ interactions, and address the usual ‘reflection problem’ by directly estimating the effects of the endogenous social multiplier (Bramoullé et al., 2009; De Giorgi et al., 2010).

We identify a social multiplier in cheating behavior of the order of two, suggesting a strong amplifying role of cheating interactions in the classroom. In other words, our findings suggest that an exogenous shock altering cheating behavior independently across individuals – either initiated by discretional help of teachers or by letting students exchange information or cooperate – produces an equilibrium variation, through social interactions, which is up to twice the initial response. While teachers’ attitude, such as shirking or ignoring students’ interactions, is often an important part of the cheating detected in school, when we evaluate the contribution of students’ cheating net of teachers’ effects, we find it to be more relevant in junior-high schools (sixth grade) as compared to primary schools (fifth grade). Finally, we use some predictions emerging from the educational psychology literature to test alternative cheating mechanisms. 
We show that the effects of cheating interactions are larger the more homogeneous students are in terms of parental background and other ‘social ties’, the latter measured by the extent of outside-school activities. This comparison also provides evidence of complementarity between the extent of cheating and social ties in the classroom. Our findings show that neglecting or tolerating cheating, as is often the case, has several negative implications: for example, by altering students’ performance, it impairs the signaling role of education; its diffusion also raises collective indulgence (i.e. the social norm) with respect to various forms of dishonest practices (Mechtenberg, 2009; Schwager, 2012).

The paper is organized as follows. In Section 2, we discuss a number of social mechanisms through which a social multiplier in cheating behavior may originate. Section 3 describes the institutional setting and the main features of the data and provides some descriptive statistics. Section 4 discusses the identification strategy, while Section 5 presents the baseline results and a number of robustness and specification tests. Section 6 illustrates heterogeneous effects in the estimated social multiplier. Section 7 concludes.

2. Cheating in the classroom

In many circumstances, the driving force for dishonest or illicit behavior may be found in personal traits and pressures, such as greed, envy and competitive pressure; however, social norms, a low level of trust, widespread acceptance of illicit behavior, high-powered incentive schemes and other background characteristics may also increase the likelihood of dishonesty in the classroom. 
Cheating can be seen from two different perspectives, which are often linked and difficult to disentangle: one is related to students’ interactions in exchanging information or cooperating; the other is associated with teachers, who may shirk by lowering monitoring intensity, or may help students at their discretion with suggestions or even by deliberately changing answers on students’ answer sheets.6 From the point of view of the students, cheating is a genuine free-riding problem, whereby students – for any given level of effort – try to maximize their performance (i.e. pass-rate probability, exam grades and test scores) and exploit the possibility of opportunistic behavior – i.e. exchanging information or cooperating – whenever the monitoring system tolerates it or is not efficient in reporting the offenders.7

The literature on social interactions in education has largely focused on peer effects in students’ achievements in classrooms and schools or on social outcomes within fraternity (or sorority) membership (Sacerdote, 2001; Imberman et al., 2012; Lavy et al., 2012), while less attention has been devoted to interactions in cheating behavior. The early literature found that students’ cheating is generally higher among members of a fraternity or sorority, and peer-related contextual factors have been shown to be the strongest predictors of cheating (Stanard and Bowers, 1970; McCabe and Trevino, 1993). In their multi-campus investigation of individual and contextual influences related to academic dishonesty, McCabe and Trevino (1993) found that students who perceived that their peers disapproved of academic dishonesty were less likely to cheat, while those who perceived higher levels of cheating among their peers were more likely to report having cheated. Carrell et al. 
(2008) were the first to analyze cheating behavior as a social interaction using separate estimations to identify exogenous (contextual or pre-treatment) and endogenous (during treatment) peer effects. They reported evidence of a statistically significant social multiplier in students’ cheating, suggesting that one additional college cheater is likely to induce other students to cheat.
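To make the notion of a social multiplier concrete, the following Python sketch iterates a simple feedback loop. It assumes a linear-in-means response in which each student's behavior moves by a fraction `J` of the change in the classroom average. This model, the parameter `J`, and both function names are illustrative assumptions, not the specification estimated in the paper. Under that assumption an initial shock accumulates to 1/(1 − J) times its size, so a multiplier of about two would correspond to `J` close to 0.5.

```python
def equilibrium_response(shock: float, J: float, rounds: int = 200) -> float:
    """Accumulate an initial shock through repeated peer feedback.

    Assumption (illustrative only): in every round the group reacts to the
    previous round's average change with strength J, where 0 <= J < 1.
    """
    if not 0.0 <= J < 1.0:
        raise ValueError("J must lie in [0, 1) for the feedback loop to converge")
    total = 0.0
    increment = shock
    for _ in range(rounds):
        total += increment   # add this round's response
        increment *= J       # the next round reacts to a fraction J of it
    return total


def social_multiplier(J: float) -> float:
    """Closed-form limit of the loop above: the geometric series 1 / (1 - J)."""
    return 1.0 / (1.0 - J)


if __name__ == "__main__":
    # Hypothetical interaction strength, chosen so that the multiplier is two.
    print(equilibrium_response(shock=1.0, J=0.5))
    print(social_multiplier(0.5))
```

With no interaction (`J` = 0) the total response equals the initial shock, which is the benchmark against which any amplification is measured.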
6 Related literature has focused on other forms of cheating, such as cheating on taxes. Kleven et al. (2011) analyzed a tax enforcement field experiment in Denmark comparing different types of tax reporting methods and showed that tax cheating is substantial when income is self-reported and that threat-of-audit letters may reduce cheating. Galbiati and Zanella (2012) estimated a social multiplier effect in tax cheating generated by a congestion effect of auditing resources, and found a sizable social multiplier.

7 Monitoring activities are introduced to validate testing procedures in national evaluation programs. However, contrary to international programs of students’ assessments (e.g. PISA, TIMSS, PIRLS) – which are usually conducted on a survey basis, with the sampled students sitting the test under the supervision of inspectors – national assessment programs are conducted on a census basis and, usually, the school’s own teachers supervise students while they take the exam (Eurydice, 2009; U.S. Department of Education, 2009).
Seen from the teachers’ point of view, cheating might originate from a genuine desire to help students, such as lowering monitoring intensity and letting them interact, or providing discretionary help to some of them, in particular low-achieving students (i.e. particularly when the pass-rate or admission to college is set at some given threshold or cutoff score) (Dee et al., 2011). Alternatively, cheating may arise from moral hazard in teacher behavior and outright dishonest behavior in grading (such as altering or even changing student responses on answer sheets), which typically occurs in high-stakes settings when pay incentives or promotions are linked to measures of classroom performance (Jacob and Levitt, 2003; Lazear, 2006; Dee et al., 2011; Neal, 2014). It is clear that some form of teachers’ cheating, such as shirking or a benevolent attitude – a kind of ‘benign neglect’ – that involves ignoring or not reporting and sanctioning students’ cheating, is required for student cheating interactions to occur.8

A number of studies have focused on teachers’ cheating, particularly in high-stakes settings such as test-based accountability systems for schools or performance-related pay systems for teachers. Jacob and Levitt (2003), using data from the Chicago public schools, document extensive manipulation of test scores by teachers and administrators in order to improve students’ performance. They show that repeating the test under controlled circumstances to avoid manipulations reduced the performance of students, particularly in low-achieving classrooms. Dee et al. (2011) investigate the existence of manipulation in the scoring of the Regents Examinations, reporting pervasive cheating by teachers mainly motivated by their desire to help students meet admission standards. Bertoni et al. 
(2013) show that the presence of external inspectors providing strict monitoring, which disciplines teachers’ and students’ cheating behavior, reduces the average performance (i.e. measured as the proportion of correct answers) of students in monitored classrooms, as compared to non-monitored classrooms. Finally, Angrist et al. (2014) analyze teachers’ cheating in a relatively low-stakes setting, and test alternative mechanisms which may motivate the manipulation of test scores. They contrast school accountability considerations, effort-related shirking and wholesale curbstoning (i.e. when an entire answer sheet is manipulated) and conclude in favor of the shirking hypothesis to explain the excess returns from class-size reduction observed in Italian Southern regions.

The available evidence in the educational psychology literature also highlights a number of mechanisms for students’ and teachers’ cheating behavior. A distinction is drawn between ‘imitative behavior’ in students’ cheating practices, induced by peer pressure (such as plagiarism or using prohibited materials), and ‘collaborative behavior’, involving an exchange of information or answers between classmates. The former is likely to be more widespread in relatively homogeneous groups, while the latter, requiring some trust between individuals, is expected to be more likely the longer students have spent time at school together (Anderman and Murdock, 2007). Similarly, benevolent teacher behavior and softer monitoring intensity are likely to increase with the length of time teachers have been with the students, and the younger the students are (Pickhardt, 2013; Davis et al., 2009). Conversely, various forms of illicit teachers’ behavior, such as altering students’ responses on answer sheets (i.e. 
wholesale curbstoning), are likely to increase with the expected benefits, as in high-stakes settings, and decrease with the perceived cost of being discovered, as in settings where there is peer monitoring by teachers or by an external inspector. In this respect, the various mechanisms described above are likely to produce different effects on students’ test scores, either increasing students’ interactions and thus magnifying the effects on achievement via a social multiplier, or altering the pattern of responses in some systematic way when explicit teachers’ cheating occurs.

While the different cheating mechanisms described above are not easily distinguishable in the data – in the sense that it would not be possible to clearly discriminate between them – they share a common feature, which is the bias that they introduce into both the mean and the variance of the classroom’s test scores. Nonetheless, both sources of cheating – students’ and teachers’ – are likely to be considerably reduced when forms of supervision or monitoring are enforced by the presence of an external inspector or by a technology capable of detecting cheaters.

For the purposes of our analysis, the most important implication of the presence of cheating in non-monitored classrooms is that students’ achievements tend to be more similar and exhibit a smaller within-classroom variance than in classrooms in which the students are supervised by an external inspector.9 In other words, cheating behavior is expected to alter both the between- and the within-classroom variance in achievements: in non-monitored classrooms, cheating generates excess variance in individual behavior with respect to individual and group characteristics. Moreover, it introduces a difference between the between-group and the within-group variance of individual behavior. These two features are the core of our identification strategy, which we discuss in more detail in the empirical analysis below.
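The implication for means and within-classroom variances can be illustrated with a toy simulation. The Python sketch below is only a sketch under stated assumptions: it assumes that, in a non-monitored classroom, every student closes a fraction `share` of the gap to the best score in the class. This copying rule, the parameter values, and the function names are hypothetical choices made for illustration; the code is not the excess-variance estimator of Graham (2008) and uses no real data.

```python
import random
import statistics


def simulate_scores(n_classes, n_students, share, seed):
    """Simulate test scores by classroom.

    Assumption (illustrative only): with cheating, each student closes a
    fraction `share` of the gap to the best score in the classroom.
    share = 0.0 stands for a monitored classroom with no copying.
    """
    rng = random.Random(seed)
    classrooms = []
    for _ in range(n_classes):
        class_effect = rng.gauss(0.0, 1.0)  # group-level heterogeneity
        own = [class_effect + rng.gauss(0.0, 2.0) for _ in range(n_students)]
        best = max(own)
        classrooms.append([x + share * (best - x) for x in own])
    return classrooms


def variance_components(classrooms):
    """Return (overall mean, average within-class variance, between-class variance)."""
    class_means = [statistics.fmean(c) for c in classrooms]
    within = statistics.fmean([statistics.pvariance(c) for c in classrooms])
    between = statistics.pvariance(class_means)
    return statistics.fmean(class_means), within, between


if __name__ == "__main__":
    # Same seed, so both runs start from identical underlying scores.
    monitored = variance_components(simulate_scores(200, 20, share=0.0, seed=1))
    unmonitored = variance_components(simulate_scores(200, 20, share=0.5, seed=1))
    print("monitored   (mean, within, between):", monitored)
    print("unmonitored (mean, within, between):", unmonitored)
```

By construction, the simulated non-monitored classrooms show a higher mean and a smaller within-classroom variance than the monitored ones, which is the qualitative pattern the text attributes to cheating. The size of the effect depends entirely on the assumed copying rule.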
8 Evidence of ‘benevolent’ teacher supervision during official test scores administration, for example, has been reported by the U.K. Standards & Testing Agency in almost half of the maladministration allegations in the 2012 Key Stages 1 and 2 assessments (Standard & Testing Agency, 2013).

9 While we focus our attention mainly on cheating, other social mechanisms may also be at work. For example, social interactions may also originate from ethical norms of behavior, the strength of which decreases with the extent of cheating itself, generating a kind of pro-social behavior, whereby dishonest behavior is less stigmatized when others are also likely to cheat. Alternatively, there might be congestion effects at work, such that for given enforcement resources one student’s cheating may reduce the chances of detecting illicit behavior among others, hence creating incentives for other students to cheat. A further interpretation is through learning: students, by observing other students cheating and not being sanctioned, could come to the conclusion that cheating is an easy way to improve their own achievements. While the above mechanisms are not the main focus of the paper, in a later section we investigate the robustness of our results with respect to a number of alternative interpretations.
3. Institutional context, data and descriptive statistics

3.1. Primary and junior-high schools in the Italian school system

The school system in Italy starts with five years of primary school (grades 1–5, corresponding to ISCED level 1) and three years of junior-high school (grades 6–8, corresponding to ISCED level 2). These two stages are part of the ‘first cycle’ of the educational system, which is compulsory and identical for all students. Children enroll in the first grade of primary school in the year they turn six, and start junior-high school when they turn eleven.

Primary and junior-high schools are quite different in terms of their organization and types of teaching activities. In primary schools, pupils have two reference teachers, teaching different subjects, who usually follow them from the first to the fifth grade, establishing a strong personal link. In junior-high school, by contrast, students have several teachers, one for each subject, and are expected to gain knowledge on a wide range of skills. Due to the larger number of teachers and because sixth grade students are just in their first year, the length of time each teacher spends in the classroom in junior-high school is considerably shorter than in primary school, and the formation of interpersonal relationships between students and teachers is correspondingly more limited. The switch from primary to junior-high school also coincides with the transition from childhood to the pre-adolescent and adolescent periods, which has implications for the teacher–student relationship, since the teacher is no longer perceived as a sort of parental care-giver, but more as an assertive and detached instructor (see Pickhardt, 2013).

Notice that these institutional characteristics might also affect both the extent and the type of cheating practices in the classroom. In particular, for the reasons discussed above, we expect cheating in the classroom to be on average higher in primary schools as compared to junior-high schools. 
Teachers may be more inclined to help students or to lower monitoring intensity in the fifth grade of primary schools, while students, for their part, may share stronger social ties with their classmates after five years together. In our empirical analysis, we combine information on the institutional features of the schooling system and the administration of the test with the randomized experiment to try to assess the contribution of students’ and teachers’ cheating behavior.

3.2. The National Survey of Students’ Attainments

Starting from the 2009–2010 school year, the ‘National Institute for the Evaluation of the Education System’ (Invalsi) has carried out a yearly evaluation of students’ attainment and schools’ quality (SNV). The evaluation is based on questionnaires and test score evaluations. The SNV takes the form of an annual census, since it is compulsory for all schools and students attending the second and fifth grade of primary schools and the sixth grade of junior-high schools (about 500,000 students per grade).10

Invalsi SNV data for sixth and fifth grade students contain test score results and individual-level information. Each student takes a test in Mathematics and Language on two different days in late May, and the test scores are obtained from a number of items that varies by subject and grade. Individual-level information is gathered in the dataset from three different sources: (i) students’ general information from school administrative records, compiled directly by school administrative staff on each student’s answer sheet; (ii) family background information collected through a ‘Family Questionnaire’ sent to each family some days before the test; (iii) additional individual information on family, school and environmental characteristics collected through a ‘Student Questionnaire’ completed by each student on the same day as one of the tests (after finishing the exam). 
The SNV are low-stakes tests: there are no teacher or school performance evaluations based on the test results, nor monetary or non-monetary benefits. The test results are intended to provide external and standardized information to each single school and teacher with the sole aim of improving the educational processes. The test administration is carried out by school teachers, who also have to transcribe the complete record of answers from the students’ test sheets to a single ‘answer sheet’. The answer sheets are then sent to an external institution that is responsible for the computation of the test scores using an automatic procedure.

Invalsi enforces a detailed protocol for the administration of the tests to reduce discretion and the possibility of teachers’ manipulations (see Invalsi, 2010, SNV Report). Several features have to be mentioned. First, as is often the case in national evaluation programs (Eurydice, 2009), the test is not administered by the class teachers but by other teachers of the school, who in general teach a different subject from the one that is being tested. Second, all the school teachers are simultaneously involved in the transcription process, so that they cross-check each other, while the school head – who is responsible for the correct implementation of the protocol – supervises the whole process.

Although the protocol and the low-stakes nature of the test should minimize the scope for illegal practices, it cannot be excluded that students’ cheating and teachers’ manipulations occur during the administration and the transcription process. As previously discussed, there are many forms of cheating that can take place during and after the test: teachers may shirk and let students collaborate or may provide discretional help to some students. Also, during the transcription process, teachers could deliberately change students’ answers or make mistakes in transferring the answers from the students’ test sheets to the answer sheets. 
This could happen when the correction leaves some discretion to the teacher in interpreting answers, and multiple answers are possible. However, in
10 Pupils with disabilities are identified by a team of specialists since the beginning of their schooling path, sit special formats of the tests compatible with their physical or mental disability, and their results are not included in the official reports.
Please cite this article in press as: Lucifora, C., Tonello, M., Cheating and social interactions. Evidence from a randomized experiment in a national evaluation program. J. Econ. Behav. Organ. (2014), http://dx.doi.org/10.1016/j.jebo.2014.12.006
order to get an idea of the degree of discretion teachers may have, it is worth reporting that only one item (out of a total of 100) admitted multiple correct answers in the 2009–10 SNV data for the sixth grade, and 4 items (out of 113) in the fifth grade.11

3.3. The randomized experiment in the SNV data

The protocol for the SNV survey entails the use of external inspectors to administer the tests in a representative and random sample of classrooms, both to validate the general results of the survey and to give each school a ‘certified’ benchmark. Inspectors are selected from a pool of potential candidates mainly composed of retired teachers or school heads. In particular, they are required to perform the following tasks in the monitored classrooms: (i) invigilate students during the tests, (ii) provide specific information on the test administration, and (iii) compute the test scores and send the results and documentation to Invalsi within a couple of days (Invalsi, 2010). The sampling procedure is a two-stage design, stratified at the regional level. The number of schools and classrooms to be selected is calculated in such a way as to ensure the country and regional representativeness of the sample. Schools are first sampled separately in each region; then, within each school, one or two classrooms per grade are randomly selected, depending uniquely on school size.12 Any negotiation process with school heads to establish monitored classrooms within a selected school is categorically excluded (Invalsi, 2010; Bertoni et al., 2013). The only exclusion criterion concerns classrooms with fewer than 10 students, which are dropped from the analysis.
We will take into account the features of the sampling design in our empirical utilization of the SNV data.13 The allocation of inspectors to a random sample of classrooms in the SNV data provides the ideal framework for our empirical strategy, as it introduces a random treatment with respect to the possibility that students’ or teachers’ cheating practices are in place. The possibility of students’ interactions and cheating in monitored classrooms can be excluded, as can any test score manipulation on the part of the inspectors. On the contrary, it cannot be ruled out that students in non-monitored classrooms are subject to lower monitoring intensity or receive help from teachers and that interactions during the test occur (Invalsi, 2010; Ferrer-Esteban, 2013; Bertoni et al., 2013). Given that the choice of the monitored classrooms is random and made after classrooms’ formation, no sorting or matching between the treatment and the school or classroom characteristics can occur. Monitored classrooms correspond to 7–8% of the total in each grade.

3.4. Descriptive statistics

To allow comparability across subjects and grades, we focus on all students who sit both Language and Math tests (SNV 2009–10), and use information on both sixth and fifth grades.14 To take into account the sampling design previously discussed, we regress the test scores on a full set of 20 regional fixed effects, a dummy indicating large schools (i.e. a dummy equal to 1 for schools with more than 100 students in the sampled grade), and the interaction of the two.
We then retain the residuals obtained from the above regressions and perform our empirical analysis on the test scores filtering out the effects of the sampling scheme.15 For each student, the SNV data provide the test score for Math and Language and microdata containing individual-level information.16 The individual characteristics cover information on gender, year and place of birth, Italian citizenship, grade retention, kindergarten attendance and both a school and a class (anonymous) identifier. We define as a ‘sampled school’ a school in which there are one or more monitored classrooms, and as a ‘monitored classroom’ (in a sampled school) a classroom where an external inspector is present supervising the test. A ‘non-monitored classroom in a sampled school’ identifies a classroom, in a sampled school, in which the external inspector was not present during the test. All other classrooms are non-monitored classrooms in non-sampled schools. Table 1 provides an overview of the main features of the data set: the number of schools and classrooms, as well as the average number of students per school and class. Table 2 compares the observable characteristics of students and cheating behaviors across the different treatment states: (i) monitored, (ii) non-monitored and (iii) non-monitored in sampled schools. We also report statistics on cheating behavior (i.e. ‘cheating scores’) using class-level and subject-specific statistical indicators
11 The multiple answers to be considered correct are listed in the ‘correction grid’ provided by Invalsi to guide the transcription process (Invalsi, 2010, pp. 192–324).
12 In practice, a school size greater than or equal to 100 students implies the random selection of two classrooms.
13 The sampling procedure to determine the pool of sampled schools is such that larger schools have a higher probability of being sampled than smaller schools (Bertoni et al., 2013; Angrist et al., 2014). However, Bertoni et al. (2013) point out that this difference in the selection probabilities is largely compensated at the second stage of sampling by selecting a fixed number of classrooms with equal probability from the sampled school, so that classrooms in large schools have a lower probability of selection than classrooms in smaller schools (Bertoni et al., 2013, p. 76).
14 Tests for second and eighth grade students have different institutional characteristics and no Student Questionnaires, which does not allow for comparisons with other grades (see Invalsi, 2010). Given that the tests were held on two different, although consecutive, days, there are students who sit just one of the two tests and students who sit neither of them because they are absent on both days. The percentage of absent students is about 0.6% in the sixth grade and 2.7% in the fifth grade.
15 This procedure is similar to the approach taken by Angrist et al. (2014). In a later section we perform additional robustness checks.
16 Micro data can be downloaded from www.invalsi.it. Test scores are obtained as a percentage of right answers for each subject and standardized with zero mean and unit standard deviation. Invalsi national evaluation tests are constructed with psychometric characteristics that allow an interpretation of the scores in terms of ‘students’ ability’ (Invalsi, 2010).
Table 1
Descriptive statistics.

| | 5th grade | 6th grade |
|---|---|---|
| Monitored classrooms (% over total) | 1962 (7.28) | 2058 (7.93) |
| Non-monitored classrooms in sampled school (% over total) | 2614 (9.70) | 3417 (13.16) |
| Average school size | 93.12 | 128.08 |
| Average class size | 18.45 | 19.93 |
| Total no. of schools | 7325 | 5748 |
| Total no. of classrooms | 26,942 | 25,959 |

Source: SNV Invalsi 2009–10.

Table 2
Observable characteristics.

| | Monitored students | Non-monitored students | Non-monitored students in sampled schools |
|---|---|---|---|
| 5th grade | | | |
| Individual characteristics | | | |
| Female | 49.21 | 49.42 | 49.16 |
| Grade repeater | 2.78 | 2.64 | 2.61 |
| Immigrant | 9.18 | 8.99 | 9.54* |
| Kindergarten attendance | 97.79 | 97.70 | 97.45*** |
| Speak dialect at home | 15.77 | 15.71 | 15.02*** |
| Cheating scores | | | |
| Statistical indicator: Math | 2.22 | 5.96*** | 3.99*** |
| Statistical indicator: Language | 1.97 | 5.97*** | 4.30*** |
| N (% over total) | 34,332 (7.46) | 425,713 (92.54) | 44,823 (9.74) |
| 6th grade | | | |
| Individual characteristics | | | |
| Female | 48.32 | 48.36 | 48.39 |
| Grade repeater | 7.28 | 7.09 | 7.23 |
| Immigrant | 10.25 | 9.96* | 10.45 |
| Kindergarten attendance | 96.84 | 96.81 | 96.47 |
| Speak dialect at home | 16.92 | 16.96 | 15.42*** |
| Cheating scores | | | |
| Statistical indicator: Math | 1.35 | 2.18*** | 1.34*** |
| Statistical indicator: Language | 6.20 | 8.06*** | 8.17*** |
| N (% over total) | 41,394 (8.05) | 472,724 (91.95) | 69,489 (13.52) |

Source: SNV Invalsi 2009–10.
Note: Cheating scores are statistical indicators of the propensity to cheat and are directly calculated by Invalsi (see Castellano, Longobardi, and Quintano 2009 for details on the methodology). The statistical indicator is a class-level and subject-specific index of the propensity to cheat, ranging from a theoretical maximum of 100 (high cheating) to a minimum of 0 (no cheating). In practice, extreme values are never observed – i.e. in almost 95% of classrooms the index of cheating is below 55 (both for Language and Math). Absent students are excluded: a student is considered ‘absent’ if he/she does not sit either the Math or the Language test, or both.
* The difference with monitored students (reference category) is statistically significant at the 0.1 significance level.
*** The difference with monitored students (reference category) is statistically significant at the 0.01 significance level.
based on sophisticated fuzzy-logic algorithms computed by Invalsi to detect cheating practices (see Castellano et al., 2009). We find no evidence of statistically significant differences in the characteristics and composition of monitored and non-monitored classrooms, with the only exception of the share of immigrant students (slightly oversampled in the sixth grade). Some minor differences are detected between monitored and non-monitored classrooms in sampled schools. Conversely, there is evidence of statistically significant differences in cheating behavior between the three groups: cheating scores are found to be considerably higher in non-monitored classrooms, both in Language and Math, than in monitored classrooms. In non-monitored classrooms in sampled schools, cheating scores are also statistically different from zero, but substantially lower than those of non-monitored classrooms. Finally, Invalsi (2010) shows that the propensity to cheat is not statistically different from zero in monitored classrooms, thus ruling out the possibility of any cheating practice.

Table 3 reports simple OLS regressions of the class-level mean test score (columns 1 and 2), the within-class variance of test scores (columns 3 and 4) and the ‘cheating scores’ (columns 5 and 6), on a dummy variable indicating monitored classrooms. In columns (2), (4) and (6), we also add a dummy for non-monitored classrooms in sampled schools and additional controls at the class level. Given the nature of the randomized experiment, we attach a causal interpretation to the estimated coefficients. The supervision of the external inspector decreases the mean of the test scores and increases the within-class variance. We also detect the existence of spillover effects of the external inspector on non-monitored classrooms in sampled schools, which show a negative (statistically significant) effect on the mean and a positive effect on the within-class variance, as compared with classrooms in non-sampled schools.
The spillover effects of ‘monitoring’ – which we call ‘indirect monitoring’ – are, however, of a smaller magnitude than the effects of direct monitoring when the external inspector is present. These estimates can be interpreted as an indirect discipline effect on cheating, driven by the presence of an external inspector in the school.
Table 3
Classroom test scores and cheating (OLS regressions).

| | Class mean test score (1) | Class mean test score (2) | Within class variance (3) | Within class variance (4) | Cheating scores (5) | Cheating scores (6) |
|---|---|---|---|---|---|---|
| 5th grade: Math | | | | | | |
| Monitored classroom | −0.176*** (0.014) | −0.182*** (0.015) | 0.072*** (0.008) | 0.075*** (0.008) | −3.777*** (0.273) | −3.988*** (0.289) |
| Non-monitored classroom in sampled school | | −0.036** (0.016) | | 0.025*** (0.008) | | −2.193*** (0.401) |
| R sq. | 0.005 | 0.016 | 0.003 | 0.036 | 0.003 | 0.017 |
| 5th grade: Language | | | | | | |
| Monitored classroom | −0.163*** (0.012) | −0.169*** (0.012) | 0.100*** (0.010) | 0.103*** (0.010) | −4.008*** (0.217) | −4.186*** (0.234) |
| Non-monitored classroom in sampled school | | −0.029** (0.014) | | 0.017* (0.009) | | −1.838*** (0.392) |
| R sq. | 0.006 | 0.032 | 0.004 | 0.040 | 0.003 | 0.015 |
| N | 26,942 | 26,942 | 26,942 | 26,942 | 26,942 | 26,942 |
| 6th grade: Math | | | | | | |
| Monitored classroom | −0.043*** (0.011) | −0.049*** (0.011) | 0.026*** (0.007) | 0.024*** (0.007) | −0.827*** (0.169) | −0.859*** (0.177) |
| Non-monitored classroom in sampled school | | −0.024** (0.011) | | −0.006 (0.007) | | −0.342* (0.175) |
| R sq. | 0.001 | 0.031 | 0.001 | 0.032 | 0.001 | 0.007 |
| 6th grade: Language | | | | | | |
| Monitored classroom | −0.023** (0.010) | −0.027*** (0.010) | 0.045*** (0.008) | 0.043*** (0.008) | −1.855*** (0.371) | −1.893*** (0.388) |
| Non-monitored classroom in sampled school | | −0.000 (0.011) | | −0.011 (0.007) | | 0.112 (0.428) |
| R sq. | 0.000 | 0.068 | 0.001 | 0.054 | 0.001 | 0.016 |
| N | 25,959 | 25,959 | 25,959 | 25,959 | 25,958 | 25,958 |
| Additional class-level controls | | Yes | | Yes | | Yes |

Source: SNV Invalsi 2009–10.
Note: Weighted OLS regressions with weights equal to the number of students in each class. Robust standard errors clustered at the school level in parentheses. Test scores are normalized with 0 mean and 1 standard deviation. All regressions include regional fixed effects, a dummy for schools with more than 100 students enrolled in the tested grade, and their interactions. Additional class-level controls include dummies equal to 1 for classrooms with: the share of immigrants higher than the ninetieth percentile, the share of students speaking dialect at home higher than the ninetieth percentile, the share of students who attended kindergarten higher than the ninetieth percentile. See Table 2 for the definition of the cheating scores.
* Significance level: P < 0.1.
** Significance level: P < 0.05.
*** Significance level: P < 0.01.
We expect this effect to be more likely to reduce teachers’ illicit practices rather than students’ interactions, since the latter may be unaware of the presence of the inspector in the school (Bertoni et al., 2013). In a later section, we exploit this feature of the monitoring process to try to disentangle the contribution of teachers’ and students’ cheating practices. Similar effects of the monitoring process are also detected on cheating scores: cheating is lower in monitored classrooms (as well as in non-monitored classrooms in sampled schools), both in Language and in Math, as compared to non-monitored classrooms. Finally, all the above effects are found to be stronger and estimated with higher precision in the fifth grade, suggesting that the deterrence effect of monitoring on cheating is higher in primary than in junior-high schools. Thus, these figures seem to be consistent with the idea that cheating interactions, as detected by cheating scores, are present in non-monitored classrooms and that these are more pronounced in primary schools, inducing an increase in the class mean and a reduction in the within-class variance of the test scores.
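As a purely illustrative aside, the logic of the simplest regressions in Table 3 can be sketched in a few lines. With an intercept and a single monitoring dummy, the weighted least-squares coefficient on the dummy equals the difference between the student-weighted means of the two groups. The sketch below assumes this stripped-down case only: it omits the regional fixed effects, the additional controls and the clustering of standard errors, and the function names and class-level numbers are hypothetical rather than taken from the SNV files.

```python
def weighted_mean(values, weights):
    """Mean of `values`, each entry weighted by the matching entry of `weights`."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def monitoring_gap(outcome, monitored, class_size):
    """Weighted OLS slope of `outcome` on a 0/1 `monitored` dummy.

    With an intercept and one dummy regressor, the weighted least-squares
    slope is the difference between the weighted means of the two groups.
    """
    treated = [(y, n) for y, d, n in zip(outcome, monitored, class_size) if d == 1]
    control = [(y, n) for y, d, n in zip(outcome, monitored, class_size) if d == 0]
    mean_treated = weighted_mean(*zip(*treated))
    mean_control = weighted_mean(*zip(*control))
    return mean_treated - mean_control


# Hypothetical class-level data (illustrative only, not SNV values).
class_mean_score = [-0.10, -0.20, 0.05, 0.10]
monitored = [1, 1, 0, 0]
class_size = [20, 10, 15, 15]

gap = monitoring_gap(class_mean_score, monitored, class_size)
print(f"monitored minus non-monitored: {gap:.4f}")
```

A negative gap in the class mean, as in this toy example, is what the first row of each panel in Table 3 reports for monitored classrooms.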
4. Empirical strategy

We start from the following achievement production function (APF):

$$y_{ic} = \delta_0 + x_i \delta_x + \bar{x}_c \tilde{\delta}_x + J \bar{y}_c + \tilde{\delta}\theta \tag{1}$$

where the test score of student i in classroom c ($y_{ic}$) depends on individual characteristics ($x_i$), average peers’ characteristics (i.e. exogenous or contextual effects, $\bar{x}_c$) and common shared factors ($\theta$), such as teachers’ characteristics and school environment.17 In the above specification, J represents the endogenous part of the social interactions – i.e. the extent to which the individual test score depends on the average test score of classroom peers ($\bar{y}_c$). If cheating interactions are at work (on the part of the students or of the teachers), they determine a positive covariance in individual behavior – i.e. a positive value of the parameter J – that triggers the social multiplier effect (Manski, 1993; Glaeser et al., 2003; Bramoullé et al., 2009; De Giorgi et al., 2010).18 From Eq. (1), averaging within the classroom and solving for $\bar{y}_c$ yields:

$$\bar{y}_c = \gamma\,[\delta_0 + \bar{x}_c(\delta_x + \tilde{\delta}_x) + \tilde{\delta}\theta] \tag{2}$$

where $\gamma = (1 - J)^{-1}$, with $\gamma \geq 1$, represents the social multiplier in cheating interactions. While the social multiplier derived from Eq. (1) can be interpreted in different ways, within the framework of our randomized experiment, the parameter $\gamma$ is expected to capture the effect of cheating interactions on students’ individual achievement (i.e. test score). In other terms, the aggregate outcome (class average test score) is the result of a direct effect (a reaction via private incentives to cheat) and an indirect effect (a reaction to the cheating behavior of others), the multiplier being the ratio of the (equilibrium) aggregate response to the sum of the reactions of individuals to cheating peer pressure (Glaeser et al., 2003). Substituting back Eq. (2) into (1), we obtain the classical reduced-form linear-in-means model of social interactions à la Manski (1993), which is the starting point for our empirical strategy.19

$$y_{ic} = \gamma\delta_0 + x_i \delta_x + \gamma\tilde{\delta}_x \bar{x}_c + (\gamma - 1)\delta_x \bar{x}_c + \gamma\tilde{\delta}\theta \tag{3}$$
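The substitution leading to Eq. (3) can be spelled out step by step. The sketch below writes $\gamma$ for the social multiplier and $\theta$ for the common shared factors (both symbols are notational assumptions made for this illustration) and uses only the definition $\gamma = (1 - J)^{-1}$.

```latex
% Two identities implied by \gamma = (1 - J)^{-1}:
%   J\gamma = J / (1 - J) = \gamma - 1,   and   1 + J\gamma = \gamma.
\begin{align*}
y_{ic} &= \delta_0 + x_i\delta_x + \bar{x}_c\tilde{\delta}_x
          + J\gamma\,[\delta_0 + \bar{x}_c(\delta_x + \tilde{\delta}_x) + \tilde{\delta}\theta]
          + \tilde{\delta}\theta \\
       &= (1 + J\gamma)\,\delta_0 + x_i\delta_x
          + (1 + J\gamma)\,\tilde{\delta}_x\bar{x}_c
          + J\gamma\,\delta_x\bar{x}_c
          + (1 + J\gamma)\,\tilde{\delta}\theta \\
       &= \gamma\delta_0 + x_i\delta_x
          + \gamma\tilde{\delta}_x\bar{x}_c
          + (\gamma - 1)\,\delta_x\bar{x}_c
          + \gamma\tilde{\delta}\theta .
\end{align*}
```

When there are no endogenous interactions (J = 0), $\gamma = 1$ and the expression collapses to the production function without the peer-outcome term.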
4.1. The excess-variance approach

In a reduced-form setting, J (and consequently $\gamma$) cannot be estimated directly because of the well-known reflection problem. To identify the structural parameter of the endogenous social multiplier originating from cheating in the classroom ($\gamma$), we exploit the randomized experiment in SNV data and implement the E-V approach developed by Graham (2008). This empirical strategy has several advantages. First, by relying only on the cross-group variation that originates from endogenous social effects, it bypasses most of the identification problems that characterize the classical reduced-form linear-in-means model. Second, it is robust to individual- and group-level heterogeneity. Third, the data requirements to overcome the bias originating from omitted variables due to correlated effects are very limited.20 Identification through the E-V approach relies on the existence of an appropriate classroom-level indicator ($Z_c$), which randomly allocates classrooms to two subpopulations that are identical in all observable and unobservable characteristics but differ in terms of a specific type of social interaction, which is present only in one of the two groups.21 Under this assumption, the model in Eq. (3) – as shown in Graham (2008) and in Galbiati and Zanella (2012) – can be reformulated in variance component terms as follows:

$$V_c^b = \Omega_c + \gamma^2 V_c^w \tag{4}$$

where $V_c^w(Z_c, \Omega_c) = V_c^w$ and $V_c^b(Z_c, \Omega_c) = V_c^b$ represent, respectively, the within-group and the between-groups conditional variance, and vectors $Z_c$ and $\Omega_c$ contain some classroom-level information. Note that the within-group variance of students’ achievements in the classroom – $V_c^w$ in Eq. (4) above – does not depend on social interactions and classroom-level heterogeneity, but only on differences in individual characteristics and students’ sorting across classrooms (which in a randomized setting are the same for all students). Conversely, the between-group variance – $V_c^b$ in Eq. (4) above – depends on classroom heterogeneity and, when cheating occurs, is likely to be magnified by students’ interactions. In other words, cheating externalities introduce a wedge into the variance of students’ achievements – at different levels of aggregation – between monitored and non-monitored classrooms, which is what we exploit to identify the social multiplier. Expressing the conditional variances as conditional expectations of the relative statistics ($G_c^w$ and $G_c^b$), we can rewrite Eq. (4) as22:

$$E(G_c^b \mid Z_c, \Omega_c) = \Omega_c + \gamma^2\,[E(G_c^w \mid Z_c, \Omega_c)] \tag{5}$$
17 See Lucifora and Tonello (2012) for a formal derivation. Notice that, consistently with Graham (2008), the vector $x_i$ includes both observable and unobservable characteristics.
18 Manski (1993) identified three main factors that are likely to influence social interactions: exogenous (or contextual) effects, correlated effects and endogenous social interactions. Only the latter effect (i.e. when the propensity of an individual to behave in a certain way varies with the behavior of the group) can determine the social multiplier. The empirical literature has focused on the estimation of reduced-form parameters, collapsing the endogenous and the exogenous effects (Ammermüller and Pischke, 2009; Lavy et al., 2012). Recent works have addressed the reflection problem in the estimation of the classical linear-in-means model à la Manski (1993) using data in which social groups are endogenously defined (Bramoullé et al., 2009) or introducing appropriate exclusion restrictions (De Giorgi et al., 2010).
19 While the linear-in-means specification comes at the cost of introducing some ad hoc (linear) functional forms, it has the notable advantage of providing a specification that allows direct estimation of the (endogenous) social multiplier parameter ($\gamma$). Notice also that this specification makes it possible to estimate potentially relevant forms of heterogeneity in the estimated social multiplier (Graham, 2008).
20 Durlauf and Tanaka (2008) suggested that E-V can be better justified whenever appropriate prior information exists on the variance matrix structure fulfilling the sort of exclusion restriction needed on the variance–covariance matrix of the outcomes. Our implementation follows this direction as we exploit the exclusion restriction from the natural experiment in SNV data.
21 Identification relies on: ‘[. . .] two subpopulations of social groups where assignment to groups is as if random’ (Graham, 2008, p. 658).
22 See Graham (2008) and Galbiati and Zanella’s (2012) web supplement for a formal derivation of conditional expectations.
which implies the following unconditional moment restriction:

$$E\left[\begin{pmatrix} Z_c \\ \Omega_c \end{pmatrix}\left(G_c^b - \Omega_c - \gamma^2 G_c^w\right)\right] = 0 \tag{6}$$

Eq. (6) delivers the appropriate specification to estimate, by GMM, the social multiplier ($\gamma^2$), using $Z_c$ as the instrumental variable.
4.2. Empirical implementation

In the above set-up, $Z_c$ must generate an exogenous variation that affects the between-classroom variance in students’ achievement only via the effect that cheating interactions have on the within-classroom variance. In practice, we use the random allocation of external inspectors to define a dummy variable for classrooms with external monitoring ($Z_c = 1$) and classrooms without external monitoring ($Z_c = 0$). Since the model is just-identified, we can simply estimate it by two-stage least squares. Given that the instrument, $Z_c$, is a dummy, the estimator of the social multiplier takes the form of a Wald estimator:

$$\gamma^2 = \frac{E(G_c^b \mid Z_c = 1) - E(G_c^b \mid Z_c = 0)}{E(G_c^w \mid Z_c = 1) - E(G_c^w \mid Z_c = 0)} \tag{7}$$

The numerator shows the difference in the observed (or actual) between-classroom variance in students’ achievement across treatment states. Randomization ensures that the difference in the variance of achievements across treatment states is purged of the influence of teacher heterogeneity and matching and sorting of students. The denominator also equals the difference in the variance of achievements across treatment states, but in this case is unaffected by social interactions (Graham, 2008; Sacerdote, 2010). Notice that the feasible estimator requires an estimate of the conditional expectation of students’ achievement $E(y_{ic} \mid Z_c, \Omega_c)$, which we obtain from a regression of $y_{ic}$ on $Z_c$ and $\Omega_c$. We then use the residuals to replace $G_c^b$ with $\hat{G}_c^b = (\hat{y}_c - Z_c\hat{\pi}_1 - \Omega_c\hat{\pi}_2)^2$, where $\hat{\pi}_1$ and $\hat{\pi}_2$ are least-squares estimates (Graham, 2008; Galbiati and Zanella, 2012).

Randomization also implies that (in principle) it is not necessary to include any control variable in the vector $\Omega_c$. The descriptive evidence in Section 3 shows that the treated and control subgroups constitute a representative and random sample of the student population. There are, however, a couple of matters for concern. First, since immigrant students appeared to be slightly oversampled in the treated classrooms (see Table 2), it seems preferable to control for the share of immigrant students in the classroom. Second, since we detected spillover effects of external monitoring on non-monitored classrooms in sampled schools, we may also want to control for that. In practice, for the baseline analysis, the following control variables are included in vector $\Omega_c$: a dummy variable indicating whether a classroom is a ‘non-monitored class in a sampled school’ and a dummy variable indicating whether there is a ‘high share’ of immigrant students in the classroom (i.e. it takes the value 1 if the immigrant share is greater than the ninetieth percentile of the immigrant class share distribution).
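The Wald form of the estimator described in this subsection can be illustrated with a minimal sketch. It assumes the simplest case with no control variables, takes the class-level between and within statistics as given inputs, and uses hypothetical names and toy numbers chosen only to make the arithmetic transparent; it is not the estimation code used for the results reported here.

```python
from statistics import mean


def wald_gamma_squared(g_between, g_within, monitored):
    """Wald estimate of the squared social multiplier.

    Ratio of the difference in mean between-class statistics to the
    difference in mean within-class statistics across the two values
    of the 0/1 monitoring dummy (no control variables).
    """
    def group_mean(values, flag):
        return mean(v for v, z in zip(values, monitored) if z == flag)

    numerator = group_mean(g_between, 1) - group_mean(g_between, 0)
    denominator = group_mean(g_within, 1) - group_mean(g_within, 0)
    return numerator / denominator


# Hypothetical class-level statistics (illustrative only).
g_between = [4.0, 6.0, 1.0, 1.0]
g_within = [2.0, 2.0, 1.0, 1.0]
monitored = [1, 1, 0, 0]

gamma_sq = wald_gamma_squared(g_between, g_within, monitored)
gamma = gamma_sq ** 0.5             # positive root of the squared multiplier
endogenous_effect = 1 - 1 / gamma   # J, from gamma = 1 / (1 - J)
print(gamma_sq, gamma, endogenous_effect)
```

In the toy data the between statistic differs by 4 and the within statistic by 1 across monitoring states, so the squared multiplier is 4, the multiplier is 2 and the implied endogenous effect is 0.5. Adding controls would require the two-stage least-squares version instead of this simple ratio of mean differences.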
We discuss further extensions and check the sensitivity of our estimates in the following sections. In particular, in Section 5 we present our baseline estimates and we discuss some potential behavioral factors that, independently from cheating, may affect test scores (Section 5.1). A number of robustness and sensitivity checks are presented in Section 5.2. Further, in Section 6, we test the existence of heterogeneous effects and explore alternative mechanisms of cheating interactions. We compare the North and the South of the country, we contrast areas with high or low social capital (Section 6.1), and sub-populations characterized by a higher (or lower) degree of heterogeneity according to a set of selected (exogenous) characteristics (Section 6.2). Finally, we exploit additional features of the randomized experiment – i.e. variations across classes within a sampled school – and some institutional differences between primary and junior-high schools, to evaluate the relative contribution of students’ cheating, net of teachers’ illicit behaviors (Section 6.3).
5. Baseline results

Estimates of the cheating social multiplier are obtained first without including any control variables (i.e. our baseline specification), then adding variables to the vector $\Omega_c$ to control for selected features of randomization and for the existence of spillover effects. The existence of cheating interactions requires the estimated social multiplier to be statistically different from one (see Eq. (2)), thus we test the null that $\gamma^2 = 1$ and report the corresponding P-value in each table. Since the E-V approach only allows the identification of $\gamma^2$, the sign of the social multiplier is in principle undetermined. However, given the nature of the randomized experiment, the cheating mechanisms discussed in the educational psychology literature and the descriptive evidence in Table 3, we posit a positive sign: that is, our structural parameter identifies the equilibrium effects
Table 4
Baseline estimates of the social multiplier.

| | Math (1) | Math (2) | Math (3) | Language (4) | Language (5) | Language (6) |
|---|---|---|---|---|---|---|
| 5th grade | | | | | | |
| Social multiplier sq. ($\gamma^2$) | 7.726 (0.366) | 7.739 (0.366) | 7.902 (0.407) | 5.151 (0.317) | 5.127 (0.320) | 5.170 (0.358) |
| P-value (H0: $\gamma^2$ = 1) | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| Model parameters | | | | | | |
| Social multiplier ($\gamma$) | 2.780 (0.066) | 2.782 (0.066) | 2.811 (0.072) | 2.270 (0.070) | 2.264 (0.071) | 2.274 (0.079) |
| Endogenous effect (J) | 0.640 (0.009) | 0.641 (0.009) | 0.644 (0.009) | 0.559 (0.014) | 0.558 (0.014) | 0.560 (0.015) |
| First stage | | | | | | |
| Instrument coeff. | 0.040 (0.000) | 0.040 (0.000) | 0.036 (0.001) | 0.047 (0.001) | 0.047 (0.001) | 0.042 (0.001) |
| F-stat | 7113.20 | 7112.94 | 5109.92 | 6504.80 | 6504.56 | 4669.66 |
| No. classrooms | 26,942 | 26,942 | 26,942 | 26,942 | 26,942 | 26,942 |
| 6th grade | | | | | | |
| Social multiplier sq. ($\gamma^2$) | 4.322 (0.200) | 4.261 (0.197) | 4.429 (0.225) | 3.306 (0.132) | 3.187 (0.129) | 3.279 (0.150) |
| P-value (H0: $\gamma^2$ = 1) | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| Model parameters | | | | | | |
| Social multiplier ($\gamma$) | 2.079 (0.048) | 2.064 (0.048) | 2.105 (0.054) | 1.818 (0.036) | 1.785 (0.036) | 1.811 (0.041) |
| Endogenous effect (J) | 0.519 (0.011) | 0.516 (0.011) | 0.525 (0.012) | 0.450 (0.011) | 0.440 (0.011) | 0.448 (0.013) |
| First stage | | | | | | |
| Instrument coeff. | 0.043 (0.000) | 0.043 (0.000) | 0.038 (0.000) | 0.046 (0.001) | 0.046 (0.001) | 0.041 (0.001) |
| F-stat | 10,772.44 | 10,772.02 | 6838.82 | 8290.73 | 8290.42 | 5641.34 |
| No. classrooms | 25,959 | 25,959 | 25,959 | 25,959 | 25,959 | 25,959 |
| Additional controls ($\Omega_c$) | | | | | | |
| Non-monitored classroom in sampled school | | Yes | Yes | | Yes | Yes |
| High share of immigrants | | | Yes | | | Yes |

Source: SNV Invalsi 2009–10.
Note: Social multiplier sq. ($\gamma^2$) is the square of the social multiplier ($\gamma$). Standard errors in parentheses. Additional controls ($\Omega_c$) include the dummy for non-monitored classrooms in sampled school and the dummy for high share of immigrants (i.e. the class share of immigrants greater than the ninetieth percentile of the immigrant class share distribution for each grade).
of cheating interactions when the latter are expected to increase students’ achievement and class average performance ($\gamma > 0$ and $J > 0$).23 Table 4 reports estimates of the social multiplier from our baseline equation. The first-stage F-statistics reject the hypothesis that the instrument is weak and the standard rank condition is always satisfied. The first-stage regressions also show that the conditional within variance of the test scores in monitored classrooms is always higher than that of non-monitored classrooms. This reflects the greater dispersion of individual heterogeneity in test scores when cheating practices are not in place. The estimate of $\gamma$ that we obtain from our baseline specification – columns (1) and (4) – is 2.78 for Math and 2.27 for Language in the fifth grade, and 2.08 for Math and 1.82 for Language in the sixth grade. Adding control variables has a negligible effect on the estimated social multiplier, confirming that the two subgroups are (almost) identical in terms of observable characteristics. All the estimates are statistically different from 1 at the 1% significance level, which means that we can reject the null of ‘no social interactions’.24 Our findings point to a significant amplifying role played by cheating interactions within the classroom. The baseline estimates for $\gamma$ imply that an exogenous shock altering cheating behavior independently across individuals, either initiated by discretional help of teachers or by students exchanging information or cooperating, produces an equilibrium variation, through social interactions, which is up to twice the initial response. Estimates of J range between 0.64 and 0.52 for Math and between 0.56 and 0.45 for Language.
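The mapping from the estimated squared multiplier to the model parameters, together with delta-method standard errors, can be sketched as follows. This is an illustration under the assumption of a first-order delta method applied to the positive square root and to J = 1 − 1/γ; the function name is hypothetical, and the inputs are the fifth-grade Math values from column (1) of Table 4.

```python
import math


def multiplier_from_square(gamma_sq, se_gamma_sq):
    """Return (gamma, se_gamma, J, se_J) from gamma^2 and its standard error.

    First-order delta method:
      gamma = sqrt(gamma^2)  ->  d gamma / d gamma^2 = 1 / (2 * gamma)
      J = 1 - 1 / gamma      ->  d J / d gamma       = 1 / gamma^2
    """
    gamma = math.sqrt(gamma_sq)
    se_gamma = se_gamma_sq / (2 * gamma)
    j = 1 - 1 / gamma
    se_j = se_gamma / gamma_sq
    return gamma, se_gamma, j, se_j


# Fifth grade, Math, column (1) of Table 4: gamma^2 = 7.726 (s.e. 0.366).
gamma, se_gamma, j, se_j = multiplier_from_square(7.726, 0.366)
print(f"gamma = {gamma:.3f} ({se_gamma:.3f}), J = {j:.3f} ({se_j:.3f})")
```

For these inputs the sketch gives a multiplier of about 2.780 (0.066) and an endogenous effect of about 0.640 (0.009), in line with the corresponding entries of Table 4.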
23 Note that this is not necessarily true as cheating interactions may also involve exchanging incorrect information or wrong answers, which could lower the class average performance. The descriptive statistics, however, seem to suggest that the effect of cheating on the average performance is expected to be positive. See Lucifora and Tonello (2012) for the underlying theoretical assumptions needed.
24 Standard errors for the model parameters (the social multiplier and J) are obtained using the delta method.
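The link between the reported multipliers and the reported values of J can be checked numerically. The sketch below assumes the standard linear-in-means relation in which the social multiplier equals 1/(1 − J); this relation is an assumption here, since it is not spelled out in this section, but it reproduces the reported range of J from the baseline multipliers. The names `gamma`, `endogenous_effect` and `amplified_response` are illustrative only.

```python
# Assumed relation (linear-in-means model): gamma = 1 / (1 - J), so J = 1 - 1 / gamma.
# The multipliers below are the baseline estimates quoted in the text.


def endogenous_effect(gamma: float) -> float:
    """Return J implied by a social multiplier gamma under gamma = 1 / (1 - J)."""
    if gamma <= 0:
        raise ValueError("gamma must be positive")
    return 1.0 - 1.0 / gamma


def amplified_response(j: float, rounds: int) -> float:
    """Partial sum 1 + J + J^2 + ... of the feedback rounds; tends to 1 / (1 - J)."""
    return sum(j ** k for k in range(rounds + 1))


baseline = {
    ("Math", "5th grade"): 2.78,
    ("Language", "5th grade"): 2.27,
    ("Math", "6th grade"): 2.08,
    ("Language", "6th grade"): 1.82,
}

for (subject, grade), gamma in baseline.items():
    j = endogenous_effect(gamma)
    print(f"{subject:8s} {grade}: multiplier = {gamma:.2f}, implied J = {j:.2f}, "
          f"response after 50 rounds = {amplified_response(j, 50):.2f}")
```

Under this assumption, a unit initial response is amplified round by round through peers until the total approaches the multiplier, which is one way to read the statement that the equilibrium variation is about twice the initial response.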
Comparing the results across grades, we find that the cheating social multiplier is larger in primary schools than in junior-high schools (for both Language and Math). This may reflect institutional differences between primary and junior-high schools, as previously discussed, and is consistent with the fact that students and teachers, over the five years spent together in primary school, are more likely to have developed strong interpersonal links, and that teachers may be more willing to help out younger students. Comparing the results across subjects, we find that the social multiplier is larger in Math than in Language, suggesting that cheating has higher returns in Math tests than in Language tests.25 This also confirms findings in the educational psychology literature, which suggests that cheating occurs more frequently in hard science subjects than in arts and social sciences (Anderman and Murdock, 2007). Our results, although not directly comparable, are in line with the evidence available from other studies in the social interactions literature (Glaeser et al., 1996; Maurin and Moschion, 2009; Drago and Galbiati, 2012), which have found social multipliers of between 2 and 3. Our estimates are also consistent with those of Carrell et al. (2008), who estimated a social multiplier in students’ cheating of between 2.57 and 3.93. A more direct comparison can be made with those studies that use the E-V approach to recover an estimate of the social multiplier. In his analysis of students’ peer effects in class learning activities, Graham (2008) reported an estimate for the social multiplier of approximately 1.9 for Math and 2.29 for Reading. However, as is standard in the literature (Sacerdote, 2001; Imberman et al., 2012), Graham’s social multiplier due to peer interactions in achievement embeds both exogenous and endogenous effects. 
Galbiati and Zanella (2012) estimated a social multiplier arising from congestion externalities in tax cheating between 3.1 and 3.2, and offered a structural interpretation of the multiplier generated by externalities in concealed income, which represents an upper bound for the endogenous effects of tax cheating.26
5.1. Psychological pressure, distraction and effort
A potential threat to the validity of our empirical strategy is represented by alternative behavioral factors unrelated to cheating that may affect test scores. For example, the presence of an external inspector may exert forms of psychological pressure on students, inducing stress and altering their performance. Alternatively, the presence of a stranger may simply make students more distracted (Bertoni et al., 2013). Finally, the presence of the inspector may change students’ motivation and the effort exerted in doing the test (Levitt et al., 2012). In order to assess the relevance of these hypotheses, we exploit a set of motivational questions that students have to answer immediately after taking the test (‘Student Questionnaire’), reporting their emotional feelings and psychological pressures while taking the test, or general information about their studying habits in the classroom or at home. To test whether the monitored students felt higher psychological pressure or distraction while sitting the test, we regress the share of students in each classroom who declared that they felt nervous or calm during the test on the dummies for monitored classrooms and non-monitored classrooms in sampled schools and the usual set of controls (as in Table 3). The results reported in Table 5, Panel A, show that there are some statistically significant differences in the answers to the motivational questions between monitored and non-monitored classrooms. However, the effects (where present) are small in magnitude and work in the opposite direction from what we would expect if they were a threat to our strategy. In monitored classrooms, students report being less nervous and calmer while taking the test than students in non-monitored classrooms. 
Hence, we are confident that the estimates of the social multiplier are not biased downward due to higher psychological pressure or distraction in the treated units. If the presence of the external inspector has some direct effect on students’ feelings and perceptions, this may alter their effort or concentration in taking the test. In Table 5, Panel B, we check whether monitored and non-monitored students have different feelings concerning the difficulty of the questions of the SNV test compared to exercises usually done in the classroom or as homework. If questions are perceived as easier (more difficult) than usual, students in monitored classrooms might exert lower (higher) effort compared to those in non-monitored classrooms. We find no evidence of statistically significant differences between perceptions in monitored and non-monitored classrooms. Only for the Math test in the sixth grade is there some weak evidence that students in monitored classrooms considered the SNV test questions easier. As a further check for the different monitoring technology between treated and non-treated students, we investigate whether there is evidence of a higher non-response rate for monitored classrooms in our sample. If there is any difference in effort, we should observe a higher share of items left blank where less effort is exerted. Results in Table 6 show no statistically significant effects of the presence of an external inspector on the share of items left blank (over the total number of items to be completed) in each classroom, suggesting no upward bias due to student
25 Given the structure of the test, whereby closed answers and problems are proposed in Math, while text comprehension and interpretation are required in Language, it could be the case that cheating technology is easier to implement in Math.
26 Note, however, that when comparing the results reported by Graham (2008) and Galbiati and Zanella (2012) with our own, it should be borne in mind that we exploit the identifying restriction given by a natural experiment, while both Graham (2008) and Galbiati and Zanella (2012) identified the social multiplier through exogenous variations in the size of the reference group.
Table 5
Robustness checks: students’ psychological pressure, distraction and effort.
Panel A
                                            Share of students who felt nervous     Share of students who felt calm
                                            while sitting the test                 while sitting the test
                                            (1)          (2)          (3)          (4)          (5)          (6)
5th grade
Monitored classroom                         −0.004***    −0.004***    −0.004***    0.010***     0.010***     0.010***
                                            (0.001)      (0.001)      (0.001)      (0.002)      (0.002)      (0.002)
Non-monitored classroom in sampled school                0.000        0.000                     0.002        0.002
                                                         (0.001)      (0.001)                   (0.002)      (0.002)
R sq.                                       0.023        0.023        0.029        0.027        0.027        0.029
N                                           26,764       26,764       26,764       26,764       26,764       26,764
6th grade
Monitored classroom                         −0.000       −0.000       −0.000       0.011***     0.011***     0.011***
                                            (0.001)      (0.001)      (0.001)      (0.002)      (0.002)      (0.002)
Non-monitored classroom in sampled school                −0.001       −0.001                    0.001        0.001
                                                         (0.001)      (0.001)                   (0.002)      (0.002)
R sq.                                       0.049        0.049        0.064        0.061        0.061        0.066
N                                           25,765       25,765       25,765       25,765       25,765       25,765
Panel B
                                            Share of students who felt that the test contained easier questions
                                            Math                                   Language
                                            (1)          (2)          (3)          (4)          (5)          (6)
5th grade
Monitored classroom                         0.003*       0.002        0.002        0.003        0.003        0.003
                                            (0.002)      (0.002)      (0.002)      (0.002)      (0.002)      (0.002)
Non-monitored classroom in sampled school                −0.007***    −0.007***                 0.000        0.000
                                                         (0.002)      (0.002)                   (0.002)      (0.002)
R sq.                                       0.008        0.008        0.008        0.043        0.043        0.046
N                                           26,764       26,764       26,764       26,764       26,764       26,764
6th grade
Monitored classroom                         0.006***     0.006***     0.006***     0.002        0.002        0.002
                                            (0.002)      (0.002)      (0.002)      (0.002)      (0.002)      (0.002)
Non-monitored classroom in sampled school                −0.000       −0.000                    0.000        0.000
                                                         (0.002)      (0.002)                   (0.001)      (0.001)
R sq.                                       0.018        0.018        0.019        0.054        0.054        0.056
N                                           25,765       25,765       25,765       25,765       25,765       25,765
Additional controls                                                   Yes                                    Yes
Source: SNV Invalsi 2009–10 (Student Questionnaire). Note: Weighted OLS regressions with weights equal to the number of students in each class. Robust standard error clustered at the school level. Classes with missing values in the relevant variables are dropped from the sample. All regressions include regional fixed effects, a dummy for schools with more than 100 students enrolled in the tested grade, and their interactions. Additional controls include a dummy equal to 1 for classrooms with a high share of immigrants (i.e. the class share of immigrants greater than the ninetieth percentile of the immigrant class share distribution for each grade). * Significance level: P < 0.1. *** Significance level: P < 0.01.
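The structure of the regressions behind Table 5 can be illustrated with a stripped-down sketch. It assumes a specification with only an intercept and two mutually exclusive group dummies, in which case weighted least squares coefficients equal differences in class-size-weighted group means relative to the omitted group. The records are hypothetical, and the regional fixed effects, school-size dummy, additional controls and school-clustered standard errors used in the paper are deliberately left out.

```python
# Hypothetical classroom-level records: (share who felt nervous, group, class size).
# With an intercept and mutually exclusive group dummies, weighted least squares
# coefficients equal differences in weighted group means relative to the omitted group.


def weighted_mean(values, weights):
    """Weighted average of values, with weights given by class size."""
    total = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total


def dummy_coefficients(records, reference):
    """Return {group: weighted mean of group - weighted mean of reference group}."""
    by_group = {}
    for share, group, size in records:
        shares, sizes = by_group.setdefault(group, ([], []))
        shares.append(share)
        sizes.append(size)
    means = {g: weighted_mean(s, w) for g, (s, w) in by_group.items()}
    return {g: m - means[reference] for g, m in means.items() if g != reference}


records = [
    (0.20, "monitored", 20),
    (0.30, "monitored", 10),
    (0.25, "non_monitored_sampled", 20),
    (0.35, "non_monitored_sampled", 20),
    (0.30, "non_sampled", 25),
    (0.20, "non_sampled", 25),
]

coefs = dummy_coefficients(records, reference="non_sampled")
for group, coef in coefs.items():
    print(f"{group}: {coef:+.4f}")
```

A negative coefficient on the monitored group in such a regression would mean that a smaller share of students reported feeling nervous in monitored classrooms than in classrooms of non-sampled schools.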
Table 6
Robustness checks: share of test items left blank.
                                            Share of test items left blank
                                            5th grade                 6th grade
                                            Math         Language     Math         Language
Monitored classroom                         0.000        0.000***     −0.000       0.000
                                            (0.000)      (0.000)      (0.000)      (0.000)
Non-monitored classroom in sampled school   0.000        −0.000       0.000        0.000
                                            (0.000)      (0.000)      (0.000)      (0.000)
R sq.                                       0.016        0.012        0.043        0.019
N                                           26,942       26,942       25,959       25,959
Source: SNV Invalsi 2009–10 (Student Questionnaire). Note: Weighted OLS regressions with weights equal to the number of students in each class. Robust standard error clustered at the school level. Classes with missing values in the relevant variables are dropped from the sample. All the regressions include regional fixed effects, a dummy for schools with more than 100 students enrolled in the tested grade, and their interactions; a dummy equal to 1 for classrooms with a high share of immigrants (i.e. the class share of immigrants greater than the ninetieth percentile of the immigrant class share distribution for each grade). *** Significance level: P < 0.01.
Table 7
Robustness checks: sensitivity analysis.
                                          Math                                    Language
                                          (1)       (2)       (3)       (4)       (5)       (6)       (7)       (8)
5th grade
Social multiplier                         2.898     2.838     2.948     2.816     2.321     2.262     2.323     2.278
                                          (0.087)   (0.074)   (0.091)   (0.072)   (0.095)   (0.080)   (0.099)   (0.080)
P-value (H0: social multiplier sq. = 1)   0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
No. classrooms                            26,942    29,596    29,596    26,942    26,942    29,596    29,596    26,942
6th grade
Social multiplier                         2.187     2.109     2.194     2.096     1.861     1.824     1.878     1.798
                                          (0.062)   (0.053)   (0.062)   (0.054)   (0.048)   (0.043)   (0.051)   (0.041)
P-value (H0: social multiplier sq. = 1)   0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
No. classrooms                            25,959    26,682    26,682    25,959    25,959    26,682    26,682    25,959
Specification rows (Yes markers by column): Additional controls; Including small classrooms; High share of immigrants (P90); High share of immigrants (P75); Non-monitored classroom in sampled school; Test scores alternative cleaning. [Column alignment of the Yes markers not recoverable from the extracted layout.]
Source: SNV Invalsi 2009–10. Note: Social multiplier sq. is the square of the social multiplier. Additional controls include the dummy for non-monitored classrooms in sampled schools and dummies equal to 1 for classrooms with a high share of immigrants (i.e. a class share of immigrants greater than the ninetieth, P90, or the seventy-fifth, P75, percentile of the immigrant class share distribution for each grade). Specifications (2–3) and (6–7) include classrooms with fewer than 10 students. Test scores alternative cleaning (columns 4 and 8) uses preliminary cleaning regressions where the dummy for large schools is replaced with school size (continuous variable).
overperforming in monitored classrooms. Taken together, the results of Tables 5 and 6 seem to exclude that alternative psychological mechanisms, other than cheating, are directly affecting test scores and our estimate of the social multiplier.27
5.2. Specification tests
In this section we further check the robustness of our baseline results by running a number of specification tests. In Table 7, columns (1) and (5), we test an alternative definition of our control variables. We define a dummy variable for the immigrant class share, which takes the value 1 for classrooms above the seventy-fifth percentile. The results do not change with respect to the estimates in the baseline model (Table 4). In columns (2)–(3) and (6)–(7), we check the selection criteria imposed on the original sample with respect to classrooms with fewer than 10 students, which were dropped to meet the ex ante selectivity used by Invalsi in the random assignment of monitored classrooms. To assess whether the ‘fewer than 10 students’ selection rule has any impact on our estimates, we replicate the analysis including also the classrooms with fewer than 10 students (723 classes for grade 6 and 2654 for grade 5, corresponding respectively to 2.7% and 8.8%). The results show no significant differences with respect to our baseline estimates. Finally, we apply a slightly different cleaning of raw test scores to filter out the effects of the sampling scheme. Instead of using a dummy identifying schools with more than 100 students in the grade tested, we control for school size (a continuous variable), together with the battery of regional fixed effects and interactions.28 Results, reported in columns (4) and (8), do not show differences with respect to the baseline.
6. Heterogeneous effects and cheating mechanisms
Results from our baseline specification have shown that, when monitoring is not accurate, cheating interactions may generate a social multiplier effect. The differences detected in the size of the multiplier between primary and junior-high grades also seem to suggest that institutional features and other background characteristics may increase the likelihood of cheating interactions in the classroom. In this section, we explore different dimensions of heterogeneous effects in cheating behavior which may correlate with social norms concerning trust and acceptance of illicit behavior, such as differences between the North and the South of Italy and the endowment of social capital in a territory. Further, we investigate whether heterogeneity in personal traits of students and their families also has an impact on the intensity of cheating interactions.
27 Notice that other studies exploiting the same natural experiment, albeit using different waves or grades, also come to a similar conclusion (Ferrer-Esteban, 2013; Bertoni et al., 2013).
28 This methodology is closer to the one applied by Angrist et al. (2014), who control for the size of the institution. Lacking this variable in our dataset because of privacy restrictions, we control for the school size.
Table 8
Heterogeneous effects in the cheating social multiplier.
                          South                 North                 Low social capital    High social capital
                          (1)                   (2)                   (3)                   (4)
5th grade
Math
Social multiplier sq.     12.9236               3.9778                10.8083               3.7752
                          (0.8777)              (0.2766)              (0.6552)              (0.2919)
F statistics 1st stage    1968.31               2551.46               2699.50               1821.33
P-value                   0.00                                        0.00
No. classrooms            9801                  15,500                13,606                11,695
Language
Social multiplier sq.     7.3732                2.5565                6.2801                2.6892
                          (0.5909)              (0.1778)              (0.4665)              (0.1987)
F statistics 1st stage    1966.01               2063.02               2539.48               1479.62
P-value                   0.00                                        0.00
No. classrooms            9801                  15,500                13,606                11,695
6th grade
Math
Social multiplier sq.     6.9820                2.6214                5.9185                2.4744
                          (0.5477)              (0.1643)              (0.4116)              (0.1544)
F statistics 1st stage    3004.64               3578.74               3857.82               2661.84
P-value                   0.00                                        0.00
No. classrooms            9175                  14,963                12,930                11,208
Language
Social multiplier sq.     4.4214                2.1894                4.0220                2.0768
                          (0.2807)              (0.1259)              (0.2258)              (0.1376)
F statistics 1st stage    3049.74               2717.02               3455.38               2129.25
P-value                   0.00                                        0.00
No. classrooms            9175                  14,963                12,930                11,208
Additional controls       Yes                   Yes                   Yes                   Yes
Source: SNV Invalsi 2009–10. Note: Social multiplier sq. is the square of the social multiplier. Classrooms with missing values in the relevant variables are dropped from the sample. The P-value indicates whether the social multipliers calculated for the subpopulations are statistically different. The measure of social capital is per capita donations of blood bags in 1995 (Guiso et al., 2004). Additional controls include the dummy for non-monitored classrooms in sampled schools and the dummy for a high share of immigrants (see Table 4).
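Table 8 reports squared multipliers, whereas the comparisons in the text are stated in terms of the multiplier itself. The sketch below takes the square root of two entries from Table 8 and attaches an approximate standard error via the delta method for the square-root transformation. This is an illustration under that approximation, not the authors' code, and the function name is hypothetical.

```python
import math


def multiplier_from_square(gamma_sq, se_gamma_sq):
    """Return (gamma, se_gamma) from an estimate of gamma^2 and its standard error.

    Delta method for g(x) = sqrt(x): se(gamma) is approximately se(gamma^2) / (2 * gamma).
    """
    gamma = math.sqrt(gamma_sq)
    return gamma, se_gamma_sq / (2.0 * gamma)


# Table 8, 5th grade, Math: squared multipliers and their standard errors.
south, se_south = multiplier_from_square(12.9236, 0.8777)
north, se_north = multiplier_from_square(3.9778, 0.2766)

print(f"South: {south:.3f} ({se_south:.3f})")
print(f"North: {north:.3f} ({se_north:.3f})")
print(f"South/North ratio of multipliers: {south / north:.2f}")
```

The same function can be applied to any other pair of columns in Table 8 or to the entries of Table 9.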
6.1. Regional differences, social capital and cheating
An extensive literature has documented the presence of significant and persistent differences between the North and the South of the country in terms of economic conditions, educational attainment, crime rates, as well as in social norms, social capital and work ethic (Guiso et al., 2004, 2013; Ichino and Maggi, 2000, among others). Moreover, in a recent paper, Paccagnella and Sestito (2014) show that cheating practices in SNV test scores are negatively correlated with the extent of social capital in a territory.29 These stylized facts motivate our analysis of heterogeneous effects in students’ and teachers’ cheating behavior between classrooms in the North and in the South of the country and according to the degree of social capital measured at the province level by the number of blood bags donated per resident population in 1995 (as in Guiso et al., 2004).30 The main results are reported in Table 8. The cheating social multiplier is always higher in the Southern regions than in the North, and in provinces with low social capital than in those with high social capital.31 For example, the social multiplier for the Math test in the fifth grade is 1.8 times larger in the South than in the North, and 1.6 times larger in low social capital provinces than in high social capital ones. The regional divide is also larger in primary schools than in junior-high schools, and for Math than for Language tests. In other words, cultural differences, a wider acceptance of illegal practices and higher shirking may explain the higher incidence of cheating in Southern regions and in low social capital provinces. Notice, however, that the above factors are likely to influence the cheating behavior of both students and teachers, and at this stage we cannot establish their relative contribution to the overall effect and whether (or where) one of the two prevails. Nevertheless, these
29 We define social capital as ‘civic capital’, that is, those persistent and shared beliefs and values that help a group overcome the free rider problem in the pursuit of socially valuable activities (Guiso et al., 2010).
30 In practice, we compare classrooms in Provinces (NUTS 3 level) with high (low) social capital, defined as above (below) the median value of our social capital variable. Results proved robust to alternative measures of social capital, such as electoral participation and trust (Guiso et al., 2004). Results are available upon request.
31 We test the null that the squared social multipliers of the two subpopulations are equal (for example, H0: social multiplier sq. North = social multiplier sq. South) using the Sargan–Hansen test of over-identification associated with the estimates of the combined sample in which the binary instrument (Z_c) and its interaction with the high heterogeneity dummy serve as excluded instruments (Graham, 2008).
results are consistent with the North–South gradient in the effects of monitoring on mean test scores reported in Bertoni et al. (2013), and with teachers’ shirking behavior in the Mezzogiorno found by Angrist et al. (2014).
6.2. Heterogeneous attributes and cheating mechanisms
Since cheating requires interactions among students to take place in the classroom, one may expect that in classrooms where students are more similar in terms of personal traits and parental background characteristics (defined as ‘homogeneous classrooms’), social interactions are stronger than in classrooms in which students are more heterogeneous (defined as ‘heterogeneous classrooms’). In this section, we use additional individual-level information, drawn from the Student Questionnaire, to test whether the cheating multiplier differs between homogeneous and heterogeneous classrooms (Graham, 2008). This comparison provides evidence regarding whether there is complementarity or substitutability between the extent of cheating and the strength of social ties in the classroom.32 If external monitoring and classroom heterogeneity are complementary, the social multiplier estimated on more homogeneous subpopulations should be greater than the one estimated on heterogeneous subpopulations. If they are substitutes, the opposite should occur. We select the following attributes to measure the degree of homogeneity (or heterogeneity) in students’ social ties within classrooms: the number of books at home, teachers’ evaluations at the end of the first semester, participation in outside school activities (music, arts and foreign language courses) and time spent watching TV. In all the above cases, we split the sample of classrooms into two groups characterized by high (H) and low (L) heterogeneity (see the Appendix for details on the definition of the variables and of the subsamples). 
We consider the number of books that students have at home as a proxy for parental background heterogeneity in terms of education and socio-economic status (Ammermüller and Pischke, 2009), while teachers’ evaluation at the end of the first semester is considered as a measure of the degree of homogeneity of the classroom in terms of perceived students’ learning attitude. The attributes related to ‘outside school activities’ and ‘time spent watching TV’ are expected to proxy for the strength of social ties within the classroom and are measured as the amount of time classmates spend in each activity: outside school activities are done with schoolmates and close friends, and are thus assumed to increase social ties; watching TV, on the contrary, is an activity usually done alone which does not entail interactions with peers, and is thus considered not to influence (or even to deteriorate) social ties (Bruni and Stanca, 2008).33 Table 9 shows the results for the Math tests of fifth- and sixth-grade students. Results for the Language test do not significantly differ and are reported in the Appendix to save space. For each selected attribute, we report the square of the social multiplier for the heterogeneous and homogeneous groups of classrooms and test the null of no differences (P-values reported). The first-stage F-statistics show that the effect is always strongly identified. We find that the social multiplier is larger, both for Language and Math, in the more homogeneous classrooms with respect to teachers’ evaluation and outside school activities, while no statistically significant difference is detected with respect to parental background characteristics and the watching TV activity. 
Hence, classrooms with stronger social ties, enhanced during outside school activities and interactions with peers, and classrooms with more homogeneous students in terms of learning attitude seem to create an environment that is more favorable to cheating interactions; on the contrary, weaker social ties (where the activity of watching TV is prevalent) and parental background characteristics do not seem to influence the extent of cheating behaviors. In the Appendix we also perform a number of additional falsification tests. We split the sample into heterogeneous and homogeneous classrooms according to student information that is not related to social ties within the classroom. Results do not show statistically different estimates of the social multiplier in the two subpopulations, thus supporting the overall soundness of the main analysis.
6.3. Teachers’ and students’ cheating
The cheating behavior detected in our estimates shows a pattern that is consistent with students’ interactions in the classroom, which occur when monitoring intensity is loose because of teachers’ shirking, when teachers adopt a benevolent attitude and let students help each other (either exchanging information or cooperating), or when they help students by suggesting correct answers. Thus some form of unscrupulous or even dishonest behavior on the part of teachers is an essential element of students’ cheating interactions in the classroom. The Invalsi protocol, as discussed in Section 3, carefully disciplines teachers’ tasks in administering the test, and school heads are held responsible for any illicit behavior of the school staff. Still, we cannot exclude that teachers’ cheating (either ‘soft’ or ‘more explicit’) contributes to inflating test scores in non-monitored classrooms, also reducing the within variance (Invalsi, 2010). 
While the contributions of teachers’ and students’ cheating are likely to be linked and difficult to disentangle, in this section we exploit further the random allocation of external inspectors to gain an idea of the relevance of the relative contribution of teachers’ and students’ cheating on our estimates of the social multiplier. In particular, we consider non-monitored classrooms in sampled schools as an alternative assignment to treatment, and
32 With complementarity, moving a group of students with more homogeneous characteristics and stronger social ties (i.e. homogeneous subpopulations) to a non-monitored classroom should, in addition to increasing the average test scores, reduce the variance more than for a comparable group of students with less homogeneous characteristics (i.e. heterogeneous subpopulations). Note that all the attributes are considered to be predetermined with respect to students’ achievement during the exam.
33 We test the null that the squared social multipliers of the high (H) and low (L) heterogeneity groups are equal, as explained above.
Table 9
Social ties, homogeneity in students’ characteristics and cheating: Math, 5th and 6th grade.
                          Heterogeneous     Homogeneous       Heterogeneous     Homogeneous
                          classrooms        classrooms        classrooms        classrooms
5th grade
                          No. Books at home                   Teachers’ evaluations
Social multiplier sq.     7.5078            7.6116            6.2946            8.8682
                          (0.4918)          (0.4856)          (0.4218)          (0.5517)
F statistics 1st stage    1925.19           2568.55           2342.08           2136.16
P-value                   0.88                                0.00
No. classrooms            12,611            12,689            12,660            12,640
                          Outside-school activities           Watching TV
Social multiplier sq.     6.9499            8.1755            7.7964            7.4332
                          (0.4589)          (0.5195)          (0.4866)          (0.4979)
F statistics 1st stage    2618.75           1876.99           2427.03           2022.41
P-value                   0.08                                0.60
No. classrooms            12,694            12,606            12,747            12,553
6th grade
                          No. Books at home                   Teachers’ evaluations
Social multiplier sq.     4.2241            4.6110            3.0693            5.9590
                          (0.3162)          (0.3401)          (0.2422)          (0.4093)
F statistics 1st stage    2855.98           3670.27           2999.31           3797.26
P-value                   0.40                                0.00
No. classrooms            12,032            12,103            12,057            12,078
                          Outside-school activities           Watching TV
Social multiplier sq.     3.7193            5.2178            4.7049            4.1130
                          (0.2594)          (0.3996)          (0.3096)          (0.3473)
F statistics 1st stage    3382.94           3074.08           4030.72           2716.44
P-value                   0.00                                0.20
No. classrooms            12,280            11,855            12,048            12,087
Additional controls       Yes               Yes               Yes               Yes
Source: SNV Invalsi 2009–10. Note: Social multiplier sq. is the square of the social multiplier. Classes with missing values in the relevant variables are dropped from the sample. The P-value indicates whether the social multipliers calculated for the subpopulations are statistically different. Additional controls include the dummy for non-monitored classrooms in sampled schools and the dummy for a high share of immigrants (see Table 4).
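The P-values in Table 9 come from a Sargan–Hansen test on the combined sample. As a rough cross-check, the sketch below swaps in a simpler Wald-type comparison of the two reported estimates, treating them as independent because they are computed on disjoint sets of classrooms. This is a different technique from the one used in the paper and is shown only as an approximation; the function names are hypothetical.

```python
import math


def two_sided_p(z):
    """Two-sided p-value for a standard normal statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def difference_test(est_a, se_a, est_b, se_b):
    """z statistic and p-value for est_a - est_b, treating the estimates as independent."""
    z = (est_a - est_b) / math.sqrt(se_a ** 2 + se_b ** 2)
    return z, two_sided_p(z)


# Table 9, 5th grade Math: squared multiplier and standard error for the
# homogeneous group, followed by the same two numbers for the heterogeneous group.
examples = {
    "No. books at home": (7.6116, 0.4856, 7.5078, 0.4918),
    "Outside-school activities": (8.1755, 0.5195, 6.9499, 0.4589),
    "Teachers' evaluations": (8.8682, 0.5517, 6.2946, 0.4218),
    "Watching TV": (7.4332, 0.4979, 7.7964, 0.4866),
}

for name, args in examples.items():
    z, p = difference_test(*args)
    print(f"{name}: z = {z:.2f}, p = {p:.2f}")
```

For these four rows the approximate p-values land close to the ones printed in Table 9, which suggests the simple comparison is a reasonable reading aid even though it is not the test the authors ran.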
identify the discipline effect of external supervision on teachers’ behavior in non-monitored classrooms within the sampled schools (i.e. our ‘indirect monitoring’ effect) comparing different treatment effects. We expect this discipline effect to work mainly on teachers’ cheating behavior, as students may be unaware of the presence of the inspector in the school. As already shown in Section 3 (Table 3), indirect monitoring has a statistically significant (negative) effect on test scores as well as on the cheating indicators (see also Invalsi, 2010; Bertoni et al., 2013; Ferrer-Esteban, 2013). In practice, to retrieve the relative contribution of students’ cheating – i.e. ‘net’ of the effect of teachers’ behavior, we estimate the social multiplier comparing monitored and non-monitored classrooms in sampled schools. Next we compare the above estimates to the baseline results reported in Table 4 which are likely to capture the overall effect of students’ and teachers’ cheating behavior. Notice that several caveats must be borne in mind in interpreting the results of this exercise. First, since we are attributing the discipline effect of indirect monitoring entirely to teachers’ cheating, these estimates should be interpreted as a lower bound of students’ cheating. Second, the indirect monitoring effect is likely to minimize the more ‘explicit’ forms of teachers’ cheating (such as direct suggestions to the students, or wholesale curbstoning in the transcription process, see Sections 2 and 3), while other ‘softer’ forms could be still at work. 
Finally, as randomness seems to be compromised due to the imperfect balancing of some observable characteristics (as reported in Table 3), we also include additional controls, namely: the share of immigrants, the share of students who attended kindergarten and the share of students speaking dialect at home.34 The estimates for the fifth and sixth grades are presented in Table 10 (for convenience, we only present the estimates of the social multiplier, while the extended results are relegated to Table A.3 in the Appendix). The social multiplier estimates in the sampled schools are smaller than the baseline estimates. With the above caveats in mind, to get an idea of the relative effects of students’ and teachers’ cheating behavior, we report some back-of-the-envelope calculations. In practice, our results imply that students’ contribution can explain between 69 and 62% of the overall effect in primary schools (columns 1 and 2), and between 78 and 68% in junior-high schools (columns 3 and 4). The relative contribution of students’ cheating appears to be larger in junior-high schools than in primary schools, and in Math than in Language.35 The lower effect detected in primary schools is consistent with differences existing in the institutional setting
34 Both the share of students in kindergarten and those speaking dialect at home are defined as dummies for shares above the ninetieth percentile. Experimentation with dummies with a different threshold (i.e. above the seventy-fifth percentile) did not change the results.
35 Effects for Language are less precisely estimated, given that the students’ contribution to the overall social multiplier is not statistically different from one (columns 2 and 4).
Table 10
Social multiplier and relative contribution of students and teachers to cheating.
                                                  5th grade                      6th grade
                                                  Math (1)       Language (2)    Math (3)       Language (4)
Social multiplier: baseline (A)                   2.811 (0.072)  2.274 (0.079)   2.105 (0.054)  1.811 (0.041)
P-value (H0: social multiplier = 1)               0.00           0.00            0.00           0.00
No. classrooms                                    26,942         26,942          25,959         25,959
Social multiplier: net of teachers’ effects (B)   1.949 (0.305)  1.413 (0.335)   1.633 (0.186)  1.229 (0.152)
P-value (H0: social multiplier = 1)               0.02           0.29            0.01           0.17
No. classrooms                                    4576           4576            5475           5475
Students’ contribution (%)                        69.33          62.14†          77.58          67.87†
Teachers’ contribution (%)                        30.67          37.86†          22.42          32.13†
Source: SNV Invalsi 2009–10. Note: The baseline social multiplier (A) is taken from Table 4, columns (3) and (6) (full specification including all the control variables). The social multiplier net of teachers’ effects (B) is taken from Table A.3, columns (3) and (6) (full specification including all the control variables). † The estimate is not statistically significant.
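The percentages in the last two rows of Table 10 can be reproduced, up to rounding, as the ratio of the multiplier net of teachers’ effects (B) to the baseline multiplier (A). The Python sketch below is only a back-of-the-envelope check under that assumption: the ratio formula is inferred from the reported figures rather than stated in the text, and the function and variable names are illustrative.

```python
# Back-of-the-envelope check of Table 10.
# Assumption (inferred from the reported figures): students' share = B / A.
# The estimates are copied from Table 10; all names are illustrative.

BASELINE = {"5th Math": 2.811, "5th Language": 2.274,
            "6th Math": 2.105, "6th Language": 1.811}         # row (A)
NET_OF_TEACHERS = {"5th Math": 1.949, "5th Language": 1.413,
                   "6th Math": 1.633, "6th Language": 1.229}  # row (B)


def contribution_shares(baseline: float, net: float) -> tuple[float, float]:
    """Return (students' %, teachers' %) assuming students' share = net / baseline."""
    students = 100.0 * net / baseline
    return students, 100.0 - students


if __name__ == "__main__":
    for column, a in BASELINE.items():
        students, teachers = contribution_shares(a, NET_OF_TEACHERS[column])
        print(f"{column}: students {students:.2f}%, teachers {teachers:.2f}%")
```

Small differences in the last digit relative to Table 10 are to be expected, since the inputs used here are already rounded to three decimals.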
(i.e. primary school teachers are more likely to help their students due to the longer time spent with them and the stronger interpersonal relations). Also, the finding that students’ cheating interactions prevail in Math supports the evidence from educational psychology that answers to short mathematical and logic problems are easier to exchange with other students than long narrative texts and reading-comprehension exercises.
7. Discussion and conclusions
There is abundant evidence showing that cheating has grown over the last decades, becoming a widespread practice in schools, colleges and high-ranked universities (Dee and Jacob, 2012). Experts report that cheating has become more common and widely tolerated, as both schools and parents fail to give students clear messages about what is allowed and what is prohibited. The diffusion of test-based accountability programs in many countries has increased the likelihood of cheating practices and dishonest behavior among teachers as well.
In this paper, we provide evidence on the effects of cheating behavior in a national evaluation test using data from a randomized experiment in which an external inspector supervised the test in primary and junior-high schools. Our findings document a strong amplifying role arising from cheating interactions in the classroom, which occur when monitoring intensity is loose because of teachers’ shirking or when teachers adopt a benevolent attitude and let students exchange information. The magnitude of the estimated social multiplier implied by cheating behavior is around 2, suggesting a change in students’ achievements that is twice as large as the initial response. We find that the cheating social multiplier is larger in primary schools than in junior-high schools, consistent with the evidence that students share stronger interpersonal links and teachers are more likely to lower monitoring intensity with younger students.
When we focus attention on the relative contribution of students’ cheating, we find it more relevant in junior-high schools and in Math. Heterogeneous effects in cheating interactions are found between the North and the South, as well as according to the social capital endowment at the local level, suggesting that social norms about trust and tolerance of illicit behavior may explain those differences. We show that social ties in the classroom are a complementary input to students’ cheating behavior, such that the estimated effect is larger the more homogeneous students are with respect to their participation in after-school activities. A number of sensitivity checks confirm the overall robustness of our results.
Our findings have a number of relevant policy implications. We argue that tolerating cheating, as is often done in schools, is a very unsafe practice, since the social multiplier magnifies the negative effects on students’ performance and also alters the signaling role of education. This is confirmed by studies showing that score inflation and biases in educational grades lead to skill misallocation in the labor market, favor students from high social backgrounds and contribute to gender inequalities (Mechtenberg, 2009; Schwager, 2012). Tolerating cheating can also be detrimental to societies since it is likely to feed back into social norms, thus raising collective indulgence with respect to various forms of dishonest practices. Evidence that social ties matter in students’ behavior in the classroom also indicates that tracking policies, whereby students are sorted by cognitive ability and other traits into homogeneous classrooms, may have the undesired effect of favoring cheating.
Given that increasing competition in the job market and high-stakes testing systems are likely to exert considerable pressure on students and teachers to perform well, it should be recognized that where (and when) the pressure is higher or teachers’ shirking more tolerated, larger resources should be devoted to monitoring activities in order to deter cheating practices. In other words, ethical or honorable codes of behavior in schools should be strictly enforced and students’ cheating behavior reported and sanctioned. In this sense, the social multiplier mechanism would also magnify the beneficial effects of policies directed toward stricter monitoring
and sanctioning of cheaters. Examples of such policies could be the implementation of a system of ‘fame and shame’ sanctioning using ‘cheating scores’ to warn those schools that lie above a threshold of ‘statistical’ acceptance. Alternatively, even simple changes in the administration of the test, such as reallocating students randomly across classrooms within schools on the day of the test, would break the social ties mechanism and reduce cheating in a rather inexpensive way.
The primary aim of a test is to determine what students have learned after instruction; cheating thus undermines the intent and process of assessment because it interferes with an evaluator’s ability to make such judgments and reduces the external validity of grades (Anderman and Murdock, 2007). We argue that cheating contaminates the information available to policy makers and families when making important choices, leading to immediate and long-term consequences for individuals and society.
Appendix. Heterogeneous effects and falsification tests
In the heterogeneous effects analysis (Section 6.2) we exploit some variables taken from the ‘Student Questionnaire’. Here we provide a detailed definition of each variable, show additional results for Language and provide a falsification test. The ‘number of books at home’ is a categorical variable with 5 levels (0–10 books; 11–25; 26–100; 101–200; more than 200). The ‘outside school activities’ variable asks students how many times per week they take part in leisure activities outside school hours (e.g. music, arts, theatre or foreign language courses) (never, 1 or 2, 3 or 4, more than 4). The ‘watching TV’ variable asks students how many hours per day they watch TV (including home video and DVD) (none, less than 1, 1 or 2, more than 2).
Teachers’ evaluations (categorical variables ranging from 1 to 10) are given to each student in January, at the end of the first semester, and represent a (relative) ranking of individual learning attitude. This ranking is typically relative to the pool of students within each classroom and generally reflects teachers’ evaluation of the learning environment in each class. Classrooms that are more homogeneous in the teachers’ evaluation plausibly reflect a more homogeneous pool of students in terms of learning attitude.36 For the teachers’ marks and books at home variables, the homogeneous (heterogeneous) group is defined as the subpopulation of classrooms with a standard deviation lower (higher) than or equal to the median standard deviation observed in the entire classroom population (e.g. in the homogeneous classrooms students are homogeneous because they are all given either high or low teachers’ marks). For the social ties variables (i.e. the outside school activities and watching TV variables) classrooms with a mean above the median level are homogeneous classrooms, in which a large number of students interact more outside school (or watch more TV), thus showing stronger (or weaker) social ties. The reverse is true for classrooms below the median. Finally, we exclude from the empirical analysis students with missing values in any of the four variables used.
Table A.1 contains the results for Language. The overall pattern does not depart from the one presented for Math (Table 9). In this case, we notice that the ‘watching TV’ variable yields statistically different social multipliers in the two subgroups. However, differently from the ‘outside school activities’ variable, which is intended to proxy for stronger social ties in homogeneous classrooms, the ‘watching TV’ variable proxies for weaker social ties in the homogeneous classrooms (i.e. more students who spend time alone watching TV). Thus, consistent with this hypothesis, we find that the cheating social multiplier is higher in heterogeneous classrooms.
In Table A.2 we present a falsification test. We identify a battery of questions that are common to the fifth and sixth grade Student Questionnaire, but are not related to students’ activities which may affect social ties within the classroom. For this purpose we exploit a set of questions about students’ home environment, i.e. whether they have a place to study at home, their own PC for studying purposes, their own desk, and their own room in which to study. As in the main heterogeneous effects analysis, we split the sample into classrooms that are heterogeneous and homogeneous in the students’ answers to these questions (below or above the overall median value). Results in Table A.2 do not show statistically different effects in the estimated social multiplier across the two subpopulations, thus confirming the overall validity of the main analysis. We also perform additional falsification tests using additional questions available for sixth grade students only, regarding students’ opinions about the school building, the classroom lighting, cleaning and heating, and the school dinner. Results, available upon request from the authors, never show statistically significant differences in the social multipliers calculated for the subpopulations of heterogeneous and homogeneous classrooms.
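The split into homogeneous and heterogeneous classrooms described above for the teachers’ marks and books at home variables can be illustrated with a short Python sketch. It is a minimal illustration, not the authors’ code: the use of the population standard deviation, the assignment of ties at the median to the homogeneous group, the function name and the toy data are all assumptions. For the social ties variables the text uses a different rule (the classroom mean compared with the median), which is not covered here.

```python
# Sketch of the homogeneous/heterogeneous classroom split by within-class dispersion.
# Assumptions: population standard deviation; ties at the median count as homogeneous.
from statistics import median, pstdev


def split_by_dispersion(values_by_classroom: dict[str, list[float]]) -> dict[str, str]:
    """Label each classroom by comparing its standard deviation with the median one."""
    spreads = {room: pstdev(values) for room, values in values_by_classroom.items()}
    cutoff = median(spreads.values())
    return {
        room: "homogeneous" if spread <= cutoff else "heterogeneous"
        for room, spread in spreads.items()
    }


if __name__ == "__main__":
    # Hypothetical teachers' marks (1 to 10) for four made-up classrooms.
    marks = {
        "A": [6, 6, 7, 6],
        "B": [3, 9, 5, 10],
        "C": [7, 7, 7, 8],
        "D": [2, 10, 4, 9],
    }
    print(split_by_dispersion(marks))
```

In this toy example the two classrooms with tightly clustered marks fall in the homogeneous group and the two with widely spread marks in the heterogeneous one.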
36 Teachers’ evaluations are given both for Language and Math. In the analysis we use Math teachers’ evaluations, but results do not change for the Language ones. We also run additional tests using alternative measures of (i) activities linked to the creation of social ties, such as the time spent each day playing with friends outside school and the number of times per week students practice sports outside school; (ii) activities linked to the deterioration of social ties (or not influencing social ties), such as the amount of daily time dedicated to video games or to helping parents with small household chores. Although less precisely estimated, the general pattern of the results is also confirmed using these variables.
Table A.1
Heterogeneous effects in cheating: Language, 5th and 6th grade.
Grade  Variable                   Classrooms     Social multiplier sq.  F statistics 1st stage  P-value  No. classrooms
5th    No. books at home          Heterogeneous  4.2241 (0.3162)        2855.98                 0.40     12,032
                                  Homogeneous    4.6110 (0.3401)        3670.27                          12,103
5th    Teachers’ evaluations      Heterogeneous  3.0693 (0.2422)        2999.31                 0.00     12,057
                                  Homogeneous    5.9590 (0.4093)        3797.26                          12,078
5th    Outside-school activities  Heterogeneous  3.7193 (0.2594)        3382.94                 0.00     12,280
                                  Homogeneous    5.2178 (0.3996)        3074.08                          11,855
5th    Watching TV                Heterogeneous  4.7049 (0.3096)        4030.72                 0.20     12,048
                                  Homogeneous    4.1130 (0.3473)        2716.44                          12,087
6th    No. books at home          Heterogeneous  3.1421 (0.2126)        2653.94                 0.51     12,032
                                  Homogeneous    3.3346 (0.1995)        2856.95                          12,103
6th    Teachers’ evaluations      Heterogeneous  2.4295 (0.1679)        2771.75                 0.00     12,057
                                  Homogeneous    3.9992 (0.2386)        2759.23                          12,078
6th    Outside-school activities  Heterogeneous  2.7927 (0.1756)        2946.13                 0.00     12,280
                                  Homogeneous    3.7371 (0.2375)        2573.98                          11,855
6th    Watching TV                Heterogeneous  3.7256 (0.2309)        3171.02                 0.00     12,048
                                  Homogeneous    2.7147 (0.1780)        2465.45                          12,087
Additional controls: Yes in all specifications.
Source: SNV Invalsi 2009–10. Note: Social multiplier sq. is the square of the social multiplier. Classes with missing values in the relevant variables are dropped from the sample. The P-value indicates whether the social multipliers calculated for the subpopulations are statistically different. Additional controls include the dummy for non-monitored classrooms in sampled school and the dummy for a high share of immigrants (see Table 4).
Table A.2
Heterogeneous effects: falsification tests.
Grade  Variable                Classrooms     Social multiplier sq.  F statistics 1st stage  P-value  No. classrooms
5th    Place to study at home  Heterogeneous  3.2381 (0.1938)        2746.21                 0.97     12,270
                               Homogeneous    3.2491 (0.2102)        2770.97                          11,857
5th    PC                      Heterogeneous  3.1302 (0.2026)        2824.64                 0.62     12,659
                               Homogeneous    3.2690 (0.1894)        2708.13                          11,468
5th    Own desk at home        Heterogeneous  3.1092 (0.1806)        2965.16                 0.58     11,924
                               Homogeneous    3.2639 (0.2104)        2775.94                          12,203
5th    Own room                Heterogeneous  3.3426 (0.1968)        3101.05                 0.50     12,450
                               Homogeneous    3.1472 (0.2100)        2409.57                          11,677
6th    Place to study at home  Heterogeneous  4.3821 (0.2966)        3098.42                 0.70     12,270
                               Homogeneous    4.5623 (0.3666)        3324.66                          11,857
6th    PC                      Heterogeneous  4.5092 (0.3385)        3021.58                 0.73     12,659
                               Homogeneous    4.3470 (0.3198)        3430.90                          11,468
6th    Own desk at home        Heterogeneous  4.1760 (0.3052)        3474.46                 0.26     11,924
                               Homogeneous    4.7020 (0.3534)        2958.46                          12,203
6th    Own room                Heterogeneous  4.5495 (0.3290)        3759.43                 0.70     12,450
                               Homogeneous    4.3684 (0.3266)        2655.03                          11,677
Additional controls: Yes in all specifications.
Source: SNV Invalsi 2009–10. Note: Social multiplier sq. is the square of the social multiplier. Classes with missing values in the relevant variables are dropped from the sample. The P-value indicates whether the social multipliers calculated for the subpopulations are statistically different. Additional controls include the dummy for non-monitored classrooms in sampled school and the dummy for a high share of immigrants (see Table 4). The analysis is conducted using Math tests as the outcome variable.
Table A.3
Estimates of the social multiplier: students’ contribution ‘net’ of teachers’ cheating.
                                                   Math                                           Language
                                                   (1)            (2)            (3)            (4)            (5)            (6)
5th grade
Social multiplier sq.                              8.050 (0.413)  4.347 (1.197)  3.797 (1.189)  5.257 (0.366)  4.267 (0.386)  1.998 (0.946)
P-value (H0: social multiplier sq. = 1)            0.00           0.01           0.02           0.00           0.00           0.29
Model parameters
Social multiplier                                  2.837 (0.073)  2.085 (0.287)  1.949 (0.305)  2.293 (0.080)  2.066 (0.093)  1.413 (0.335)
Endogenous effect (J)                              0.648 (0.009)  0.520 (0.066)  0.487 (0.080)  0.564 (0.015)  0.516 (0.022)  0.293 (0.167)
First stage
Instrument coeff.                                  0.038 (0.001)  0.015 (0.001)  0.014 (0.001)  0.043 (0.001)  0.017 (0.001)  0.017 (0.001)
F-stat                                             5511.61        426.72         417.26         4942.34        440.04         428.44
No. classrooms                                     4576           4576           4576           4576           4576           4576
6th grade
Social multiplier sq.                              4.426 (0.228)  2.844 (0.602)  2.667 (0.606)  3.269 (0.149)  2.784 (0.196)  1.509 (0.374)
P-value (H0: social multiplier sq. = 1)            0.00           0.00           0.01           0.00           0.00           0.17
Model parameters
Social multiplier                                  2.104 (0.054)  1.686 (0.178)  1.633 (0.186)  1.808 (0.041)  1.669 (0.059)  1.229 (0.152)
Endogenous effect (J)                              0.525 (0.012)  0.407 (0.063)  0.388 (0.070)  0.447 (0.013)  0.401 (0.021)  0.186 (0.101)
First stage
Instrument coeff.                                  0.040 (0.000)  0.017 (0.001)  0.016 (0.001)  0.042 (0.001)  0.019 (0.001)  0.018 (0.001)
F-stat                                             7955.36        631.34         610.91         6535.26        652.33         622.26
No. classrooms                                     5475           5475           5475           5475           5475           5475
Additional controls
High share of immigrants                           Yes            Yes            Yes            Yes            Yes            Yes
High share of students who attended kindergarten                  Yes            Yes                           Yes            Yes
High share of students speaking dialect at home                                  Yes                                          Yes
Source: SNV Invalsi 2009–10. Note: Only sampled schools are considered. Additional controls include dummies equal to 1 for classrooms with: the share of immigrants higher than the ninetieth percentile, the share of students speaking dialect at home higher than the ninetieth percentile, the share of students who attended kindergarten higher than the ninetieth percentile.
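The model parameters reported in Table A.3 are numerically consistent with a social multiplier equal to 1/(1 − J), where J is the endogenous effect, and with the first row of each panel being the square of that multiplier. The Python sketch below checks this for the full-specification columns (3) and (6). The relation is inferred from the reported numbers, so it should be read as an assumption rather than as the authors’ stated definition, and all names are illustrative.

```python
# Consistency check for Table A.3, columns (3) and (6).
# Assumption (inferred from the reported figures): multiplier = 1 / (1 - J).

# (label, multiplier squared, multiplier, endogenous effect J), copied from Table A.3
ROWS = [
    ("5th grade, Math (3)",     3.797, 1.949, 0.487),
    ("5th grade, Language (6)", 1.998, 1.413, 0.293),
    ("6th grade, Math (3)",     2.667, 1.633, 0.388),
    ("6th grade, Language (6)", 1.509, 1.229, 0.186),
]


def multiplier_from_endogenous_effect(j: float) -> float:
    """Social multiplier implied by an endogenous effect j, assuming 1 / (1 - j)."""
    if not 0.0 <= j < 1.0:
        raise ValueError("the endogenous effect must lie in [0, 1)")
    return 1.0 / (1.0 - j)


def max_discrepancy(rows) -> float:
    """Largest absolute gap between reported and implied values across rows."""
    gaps = []
    for _label, squared, multiplier, j in rows:
        gaps.append(abs(multiplier_from_endogenous_effect(j) - multiplier))
        gaps.append(abs(multiplier ** 2 - squared))
    return max(gaps)


if __name__ == "__main__":
    print(f"largest discrepancy: {max_discrepancy(ROWS):.4f}")
```

Under this reading, an endogenous effect of 0.5 corresponds to a multiplier of 2, which matches the order of magnitude discussed in the conclusions; the remaining gaps are of the size expected from rounding to three decimals.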
References
Ammermüller, A., Pischke, J.-S., 2009. Peer effects in European primary schools: evidence from PIRLS. J. Labour Econ. 27 (3), 315–348.
Anderman, E.M., Murdock, T.B., 2007. The Psychology of Academic Cheating. Academic Press Inc.
Angrist, J.D., Battistin, E., Vuri, D., 2014. In a small moment: class size and moral hazard in the Mezzogiorno. NBER Working Paper No. 20173, May 2014.
Bertoni, M., Brunello, G., Rocco, L., 2013. When the cat is near the mice won’t play: the effect of external examiners in Italian schools. J. Public Econ. 104, 65–77.
Bramoullé, Y., Djebbari, H., Fortin, B., 2009. Identification of peer effects through social networks. J. Econom. 150, 41–55.
Bruni, L., Stanca, L., 2008. Watching alone: relational goods, television and happiness. J. Econ. Behav. Organ. 65 (3–4), 506–528.
Card, D., Giuliano, L., 2013. Peer effects and multiple equilibria in the risky behavior of friends. Rev. Econ. Stat. 95 (4), 1130–1149.
Carrell, E.S., Frederick, M.V., James, E.W., 2008. Peer effects in academic cheating. J. Hum. Resour. 63 (1), 173–206.
Castellano, R., Longobardi, S., Quintano, C., 2009. A fuzzy clustering approach to improve the accuracy of Italian student data. Stat. Appl. 7 (2), 149–171.
Davis, F.S., Drinan, P.F., Gallant, T.B., 2009. Cheating in School. Wiley-Blackwell, U.K.
De Giorgi, G., Pellizzari, M., Redaelli, S., 2010. Identification of social interactions through partially overlapping peer groups. Am. Econ. J. Appl. Econ. 2 (2), 241–275.
Dee, S.T., Jacob, B.A., 2012. Rational ignorance in education: a field experiment in student plagiarism. J. Hum. Resour. 47 (2), 397–434.
Dee, T.S., Jacob, B.A., Justin, M., Jonah, R., 2011. Rules and Discretion in the Evaluation of Students and Schools: The Case of the New York Regents Examinations, Columbia Business School Research Paper Series.
Drago, F., Galbiati, R., 2012. Indirect effects of a policy altering criminal behavior: evidence from the Italian prison experiment. Am. Econ. J. Appl. Econ. 4 (2), 199–218.
Durlauf, S.N., Tanaka, H., 2008. Regression versus variance tests for social interactions. Econ. Inq. 46 (1), 25–28.
Eurydice, 2009. National Testing of Pupils in Europe: Objectives, Organization and Use of the Results. http://eacea.ec.europa.eu/education/eurydice/documents/
Ferrer-Esteban, G., 2013. Rationale and incentives for cheating in the standardized tests of the Italian assessment system. FGA Working Paper No. 50/2013.
Galbiati, R., Zanella, G., 2012. The tax evasion social multiplier. Evidence from Italy. J. Public Econ. 96 (5), 485–494.
Glaeser, E.L., Sacerdote, B., Scheinkman, J.A., 1996. Crime and social interactions. Q. J. Econ. 111, 507–548.
Glaeser, E.L., Sacerdote, B., Scheinkman, J.A., 2003. The social multiplier. J. Eur. Econ. Assoc. 1 (2–3), 345–353.
Graham, B., 2008. Identifying social interactions through conditional variance restrictions. Econometrica 76 (3), 643–660.
Guiso, L., Sapienza, P., Zingales, L., 2004. The role of social capital in financial development. Am. Econ. Rev. 94 (3), 526–556.
Guiso, L., Sapienza, P., Zingales, L., 2010. Civic capital as the missing link. In: Benhabib, J., Bisin, A., Jackson, M.O. (Eds.), Handbook of Social Economics. Elsevier Science B.V.
Guiso, L., Sapienza, P., Zingales, L., 2013. Long-term persistence. EIEF Working Paper No. 23/2013. Einaudi Institute of Economics and Finance, Rome.
Ichino, A., Maggi, G., 2000. Work environment and individual background: explaining regional shirking differentials in a large Italian firm. Q. J. Econ. 115 (3), 1057–1090.
Imberman, S.A., Kugler, A.D., Sacerdote, B.I., 2012. Katrina’s children: evidence on the structure of peer effects from hurricane evacuees. Am. Econ. Rev. 102 (5), 2048–2082.
Invalsi, 2010. The National Survey of Students’ Attainments: Technical Report 2009-10. http://www.invalsi.it/download/rapporti/snv2010/
Jacob, B.A., Levitt, S.D., 2003. Rotten apples: an investigation of the prevalence and predictors of teacher cheating. Q. J. Econ. 118 (3), 843–877.
Jordan, A.E., 2001. College student cheating: the role of motivation, perceived norms, attitudes, and knowledge of institutional policy. Ethics Behav. 11 (3), 233–247.
Kleven, H.J., Knudsen, M.B., Kreiner, T.C., Pedersen, S., Saez, E., 2011. Unwilling or unable to cheat? Evidence from a tax audit experiment in Denmark. Econometrica 79 (3), 651–692.
Lavy, V., Olmo, S., Felix, W., 2012. The good, the bad and the average: evidence on the scale and the nature of ability peer effects at school. J. Labour Econ. 30 (2), 367–414.
Lazear, P.E., 2006. Speeding, terrorism and teaching to the test. Q. J. Econ. 121 (3), 1029–1061.
Levitt, D.S., List, J.A., Neckermann, S., Sadoff, S., 2012. The behavioralist goes to school: leveraging behavioral economics to improve educational performance. NBER Working Paper, No. 18165.
Lucifora, C., Tonello, M., 2012. Students’ cheating as a social interaction: evidence from a randomized experiment in a national evaluation program, IZA D.P. No. 6967.
Manski, F.C., 1993. Identification of endogenous social effects: the reflection problem. Rev. Econ. Stud. 60, 531–542.
Manski, F.C., 2000. Economic analysis of social interactions. J. Econ. Perspect. 14 (3), 115–136.
Maurin, E., Moschion, J., 2009. The social multiplier and labor market participation of mothers. Am. Econ. J. Appl. Econ. 1 (1), 251–272.
Mechtenberg, L., 2009. Cheap talk in the classroom: how biased grading at school explains gender differences in achievements, career choices and wages. Rev. Econ. Stud. 76, 1431–1459.
McCabe, L.D., 2005. Cheating among college and university students: a North American perspective. Int. J. Educ. Integr. 1 (1).
McCabe, D.L., Trevino, L.K., 1993. Academic dishonesty: honour codes and other contextual influences. J. High. Educ. 64 (5), 522–538.
Neal, D., 2014. The consequences of using one assessment system to pursue two objectives. NBER Working Paper No. 19214.
Paccagnella, M., Sestito, P., 2014. School cheating and social capital. Educ. Econ. 22 (4), 367–388.
Pickhardt, C., 2013, March. Surviving (Your Child’s) Adolescence. Wiley Press.
Sacerdote, B., 2001. Peer effects with random assignment: results for Dartmouth roommates. Q. J. Econ. 116 (2), 681–704.
Sacerdote, B., 2010. Peer effects in education: how might they work, how big are they, and how much do we know so far? In: Hanushek, E.A., Machin, S., Woessmann, L. (Eds.), Handbook of the Economics of Education, vol. 3. North-Holland, Elsevier, Amsterdam (Chapter 4).
Stanard, C.I., Bowers, W.J., 1970. The college fraternity as an opportunity structure for meeting academic demands. Soc. Probl. 17 (3), 371–390.
Standard & Testing Agency, 2013. 2012 Maladministration report. National Curriculum assessments. http://www.gov.uk/government/publications/2012maladministration-report
Schwager, R., 2012. Grade inflation, social background, and labour market matching. J. Econ. Behav. Organ. 82, 56–66.
U.S. Department of Education, 2009. Standards and Assessments Peer Review Guidance: Information and Examples for Meeting Requirements of the No Child Left Behind Act of 2001. http://www2.ed.gov/policy/elsec/guide