School quality and learning gains in rural Guatemala



Economics of Education Review 28 (2009) 207–216

Contents lists available at ScienceDirect

Economics of Education Review journal homepage: www.elsevier.com/locate/econedurev

Jeffery H. Marshall ∗

Sapere Development Solutions, Avenida Sete de Setembro, no. 2774, Salvador da Bahia, Bahia, Brazil

Article info

Article history: Received 5 June 2007; accepted 14 October 2007

JEL classification: 20, I28, J15

Keywords: Human capital; Economic development; Efficiency; Resource allocation

Abstract

I use unusually detailed data on schools, teachers and classrooms to explain student achievement growth in rural Guatemala. Several variables that have received little attention in previous studies – including the number of school days, teacher content knowledge and pedagogical methods – are robust predictors of achievement. A series of decompositions by student ethnicity and type of school shed some additional light on important questions in the Guatemalan context, and beyond. The large indigenous test score gap is not explained by differences in an extensive list of observable features of schools. The large effect for community characteristics suggests peer group effects or more general institutional differences related to services or labor markets. PRONADE community schools are associated with moderate gains vis-à-vis public schools in areas related to utilization of capacity, such as days worked. But these gains are largely offset by lower teacher capacity, which highlights the challenge of improving school quality in poor, rural areas.

© 2008 Elsevier Ltd. All rights reserved.

1. Introduction

Standardized tests applied in developing countries show that many students fail to reach even minimum proficiency levels in reading, writing and mathematics. The size of this basic skills gap is disturbing, and has potentially serious implications for equity and overall development (Hanushek & Woessmann, 2007). And while there is general agreement that many schools are failing to provide an adequate learning environment, articulating an effective policy response to this problem remains a challenge. On the one hand, on-going efforts to expand access into secondary education in developing countries deflect some attention away from school quality issues. But even when policymakers are focused on improving skills in basic education the empirical evidence they have available to guide their actions is restricted. This unfortunate reality, despite more than 30 years of production function analyses of student achievement from all over the developing world, is in part attributable to the limited utility of many of the independent variables used in these studies (Fuller & Clarke, 1994). Concerns about production function work go beyond data limitations, as best demonstrated by the increasing use of randomized trials. Nevertheless, the framework remains a popular one for testing ideas in education, and a group of “exceptional” studies combining original data and methods provides a useful precedent to build on (Glewwe, 2002).

This paper continues in this vein using unusually detailed, longitudinal data on student achievement from rural Guatemala. In the first part I estimate models of student achievement using variables that are rarely available to researchers. These results are then contextualized through two sets of achievement decompositions. The first extends previous studies in Guatemala (McEwan & Trowbridge, 2007) and beyond (Psacharopoulos & Patrinos, 1994) by testing for specific mechanisms that explain a persistent indigenous student test score gap. The second analyzes different production dynamics between traditional public schools and their community school (PRONADE) counterparts.

∗ RAND Corporation, 1776 Main Street, P.O. Box 2138, Santa Monica, CA 90407-2138, USA. Tel.: +1 310 393 0411. E-mail address: [email protected].

0272-7757/$ – see front matter © 2008 Elsevier Ltd. All rights reserved. doi:10.1016/j.econedurev.2007.10.009


J.H. Marshall / Economics of Education Review 28 (2009) 207–216

The results push forward the production function literature on student achievement and school policy in several areas. First, the significant effects of total school days and teacher knowledge confirm the importance of two frequently cited, but rarely analyzed, influences on achievement. However, these variables explain little of the achievement gap between indigenous and non-indigenous students in rural Guatemala, which in turn points to other (unmeasured) aspects of schools or communities. Finally, there is evidence that PRONADE community schools are realizing efficiency gains vis-à-vis the public sector in things like teacher attendance. But these gains are largely offset by lower levels of capacity among the teachers recruited to work in these communities.

The paper proceeds as follows. Section 2 includes a brief review of production function work in developing countries, focusing on some of the harder to measure elements of school quality. Section 3 introduces the Guatemalan context, the data and methodology. Section 4 details the results from the various statistical analyses, and Section 5 concludes.

2. A brief review of some relevant evidence

This review focuses on elements of schooling that have received little attention in previous analyses of student achievement in developing countries. For example, surprisingly little is known about how variation in days offered or teacher attendance affects student achievement in developing countries. Measuring the length of the school year is complicated by unreliable official records, and in rural areas especially there is little supervision. The evidence that does exist, coming largely from surprise visits, is troubling. Kremer, Chaudhury, Rogers, Muralidharan, and Hammer (2005) and Alcazar et al. (2005) find daily teacher absence rates of between 15 and 40% based on visits to Indian and Peruvian classrooms, respectively. Bedi and Marshall (2002) show that “non-official” school closings in Honduras add up to 19 days on average per year, twice as many days as the average student is absent.

For the teacher’s work the performance of the “usual suspects” (teacher experience, education, etc.) has been underwhelming. One glaring omission in the literature is teacher knowledge, which is often reduced to proxies based on education levels or classes taken. Only a handful of studies use actual measures of teacher content knowledge based on an exit exam (Mullens, Murnane, & Willett, 1996), their overall score on a battery of performance tests (Santibañez, 2006), their primary level content knowledge (Harbison & Hanushek, 1992) or their “pedagogical content knowledge” (PCK) using real-life teaching situations (Hill, Rowan, & Ball, 2005; Marshall & Sorto, 2007). In each case the teacher’s knowledge is a significant predictor of student achievement. For teaching methodology a handful of studies have used limited classroom observations, detailed teacher questionnaires or student interview responses to gather clues about effective strategies (Fuller et al., 1999; Glewwe, Grosh, Jacoby, & Lockheed, 1995).

There are clear limits in conceptualizing school quality solely in terms of days worked or teacher qualifications. One policy initiative in the developing world that provides a natural arena for analyzing more dynamic influences on quality is the community school. Community schools empower principal agents through the use of parent councils that are responsible for making decisions locally, such as the hiring and firing of teachers. In Central America the EDUCO and PRONADE systems in El Salvador and Guatemala, respectively, have been associated with marginally higher student achievement levels (Jimenez & Sawada, 1999). One mechanism appears to be a better use of existing capacity, as community school teachers are less frequently absent (DiGropello & Marshall, 2005). But there is also evidence that efficiency gains in some areas are being offset by capacity limitations in others. A recent review of community schooling in Central America finds no evidence that teachers in decentralized programs are more effective than their public school counterparts. There is also a tendency for community schools to have fewer students in the classroom, which could reflect parental concerns about the teacher’s ability to work with many students at once (DiGropello, Marshall, & Rápalo et al., 2004). In other words, there appear to be limits to the community school impact when parents are unable (or unwilling) to attract high quality teachers, or unable to instigate in-service training activities to upgrade capacity.

3. Analytical framework: explaining academic achievement in rural Guatemala

3.1. Guatemala context, data collection and variable measurement

Guatemala is one of the poorest countries in Latin America. 34% of the population lives on less than two dollars per day, and 70% of rural Guatemalans live below the poverty line (World Bank, 2004). The inhabitants are a mixture of indigenous peoples, many of whom speak Mayan dialects, and Spanish-speaking non-indigenous (“ladinos”). There have been efforts in recent years to improve educational opportunities in rural areas. Access has been expanded through the PRONADE community school program.
Teacher capacity is being improved through an in-service training program. And the country’s linguistic diversity is receiving more direct recognition through the DIGEBI (Dirección General de Educación Bilingüe Intercultural) program.

The Human Capital in Rural Guatemala (HCRG) data were collected by the author throughout the 2002 school year in 58 rural schools. The selected students were part of a cohort that was originally tested in grade three in 2001 by the PRONERE assessment project (De Baessa, 2002). Because of resource limitations it was not possible to replicate the nationally representative PRONERE sampling framework in all 21 states (“departamentos”), so instead all of the schools from only three states were revisited.1 Comparisons of 2001 data between the HCRG and PRONERE samples show similar averages for achievement, poverty and ethnic makeup. Nevertheless, the HCRG sample is only loosely representative beyond these three parts of the country.

1 The states were chosen based primarily on their ethnic makeup, and include one state almost entirely inhabited by ladinos (Escuintla), another made up mainly of indigenous residents (Alta Verapaz), and a third that includes many mixed communities (Chimaltenango).

Of the original 2001 cohort a little over 80% of the students were in grade four in 2002, 9% were repeating grade three, and 9% had dropped out by the end of 2002. Enrolled students were given the same test (using a different form) at the end of the 2002 school year. The tests included 40 multiple choice questions designed by curriculum experts in Guatemala specifically for a rural population. Student interviews and questionnaires for teachers and directors provide the rest of the variables. In the analysis that follows the only information from 2001 is the incoming student test score; all other variables are based on the 2002 questionnaires and interviews.

Table 1 summarizes the variables. The information on the number of days of instruction is especially detailed. Based on information collected from school directors, most schools began classes by the middle of January 2002, as dictated by the official calendar. But a significant percentage began a week or more later. There is also information on how many days classes were held according to the teacher grade book for grade four.2 Classes were (apparently) not given about 20% of the time during the attendance-taking period, with the largest category being unknown reasons, where the teacher neither took attendance nor recorded a reason. The total days of class time from the first day of class until the application of the standardized exams is roughly 100 days, which is considerably lower than the official number (roughly 140) for this same period. Students were in attendance about 93% of the available days.

A quantitative classroom observation instrument was applied in one Spanish and mathematics class in fourth grade. This uses time segments to divide the total time of each class into various categories, such as student seatwork, student–teacher interaction, etc. There are also observations about whether or not the teacher checked everyone’s work during class, and the physical condition of the classroom. The teacher’s content knowledge in Spanish is based on their responses to 12 items drawn from the student test. Mathematics content knowledge is based on lower order items from the student test and 16 questions drawn from the middle school curriculum. Finally, the teacher’s pedagogical content knowledge (PCK) in mathematics is measured using three open-ended questions.3
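The instructional-time figures quoted above can be combined in a short calculation. The sketch below is illustrative arithmetic only: it uses the rounded numbers from the text (roughly 140 official days, roughly 100 days delivered, about 93% student attendance), and the function and variable names are hypothetical rather than part of the HCRG data.

```python
# Illustrative arithmetic only; inputs are the rounded figures quoted in the text.
OFFICIAL_DAYS = 140         # approximate official days, first day of class to test date
DELIVERED_DAYS = 100        # approximate days on which class was actually held
STUDENT_ATTENDANCE = 0.93   # share of available days the average student attended


def share_of_calendar(days, official_days):
    """Fraction of the official calendar represented by `days`."""
    return days / official_days


def days_attended(delivered_days, attendance_rate):
    """Days of class the average student actually received."""
    return delivered_days * attendance_rate


delivered_share = share_of_calendar(DELIVERED_DAYS, OFFICIAL_DAYS)
attended = days_attended(DELIVERED_DAYS, STUDENT_ATTENDANCE)
received_share = share_of_calendar(attended, OFFICIAL_DAYS)

print(f"Share of official calendar delivered: {delivered_share:.1%}")
print(f"Days attended by the average student: {attended:.0f}")
print(f"Share of official calendar received:  {received_share:.1%}")
```

Under these rounded inputs the average student receives well under the official amount of instruction, which is the motivation for treating school days and student attendance as separate variables in the analysis.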

2 When teachers did not take attendance it was assumed no class was given. Follow-up investigation during the fieldwork and discussions with fieldwork personnel support this strategy, although it is impossible to verify in every case that no class was given on these days.

3 For example, teachers were asked to diagnose the source of the error when students give the incorrect answer of 2806 to the multiplication problem of “352 × 8”. Teacher responses such as “the student does not know how to count” receive low marks, while specific diagnoses of the problem (i.e. “the student is not regrouping or carrying a ten”) score higher. The PCK items were created by Alejandra Sorto, and together we analyze their impact on achievement in this same sample in specific content areas (Marshall & Sorto, 2007).
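The diagnosis in footnote 3 can be checked mechanically. The sketch below is a hypothetical model of the student error, not part of the PCK instrument: it performs column multiplication by a single digit and, optionally, forgets the ten carried out of the units column, which reproduces the incorrect answer of 2806.

```python
def multiply_by_digit(number, digit, drop_units_carry=False):
    """Column multiplication of `number` by a single digit, right to left.

    If drop_units_carry is True, the carry produced by the units column is
    forgotten (a hypothetical model of the student error); carries from
    later columns are still applied.
    """
    carry = 0
    out_digits = []
    for position, ch in enumerate(reversed(str(number))):
        product = int(ch) * digit + carry
        out_digits.append(product % 10)   # digit written in this column
        carry = product // 10             # amount regrouped to the next column
        if drop_units_carry and position == 0:
            carry = 0                     # the student does not carry the ten
    if carry:
        out_digits.append(carry)
    return int("".join(str(d) for d in reversed(out_digits)))


print(multiply_by_digit(352, 8))                         # correct procedure
print(multiply_by_digit(352, 8, drop_units_carry=True))  # error from footnote 3
```

The two calls return 2816 and 2806 respectively, so the specific diagnosis ("not regrouping or carrying a ten") accounts exactly for the wrong answer, whereas a generic response such as "does not know how to count" does not.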


3.2. Methodology

3.2.1. Student achievement production functions

In theory student achievement in rural Guatemala is determined by a production function:

$$A = f(X, S, C; \alpha) \qquad (1)$$

where achievement A is a product of home background (X), school resources (S), community characteristics (C) and an efficiency parameter measuring capacity utilization in the school ($\alpha$). This general function does not specify levels of knowledge or knowledge growth. In policy circles a higher premium is usually placed on understanding the dynamics of growth (or gains), although in circumstances where the independent variables do not change much over time the analysis of levels will give very similar results (Glewwe, 2002). Function (1) can be expressed in reduced form as a linear estimating equation:

$$A_{ijt} = \beta_X X_i + \beta_S S_j + \beta_C C_i + \varepsilon_i \qquad (2)$$

where achievement A for student i in school j in time t is a function of vectors of variables corresponding to home background, schools and communities. By adding previous achievement ($A_{i(t-1)}$) Eq. (2) is extended to measure learning gains. In this paper I use both the level and gain scores for the production function work (see Table 2).

Interpretation of the point estimates in Eq. (2) can be affected by the presence of selection bias. One of the unique features of the data used in this study is information on three potential forms of selection bias. I begin by including a Heckman-style parameter ($\lambda$) based on a probit estimation of dropping out between test applications in 2001 and 2002. This requires excluding the student’s attendance during the 2002 school year and the grade control from the first stage equation. Three measures of the mother’s work participation are used as instruments. The identification strategy is supported empirically by the significance of the instrument set in the selection equation and the lack of power of these same variables in the achievement production functions, with and without the predicted hazard of non-selection (see bottom of Table 2). There are also different rates of attrition prior to where this cohort began in grade three in 2001, which is controlled using the average grades completed for a randomly drawn group of grade one students taken from enrollment rolls in each school in 1999. Finally, the data include the percentage of students in attendance on the day the exams were applied in 2002, which makes it possible to address “day-of-test selection” (Glewwe et al., 1995).

3.2.2. Achievement decompositions

Another potential problem is omitted variable bias, especially for the school and teacher characteristics. For example, the capacity utilization parameter ($\alpha$) from the theoretical production function (1) does not appear in the actual estimation (2).
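As a rough illustration of the gain-score form of Eq. (2) from Section 3.2.1, the sketch below regresses a test score gain on the incoming score, one home background variable and one school variable. Everything in it is an assumption made for illustration: the data are synthetic, the variable names and "true" coefficients are invented, community variables are omitted, and there is no selection correction or clustering adjustment. It is not the estimation code used for Table 2.

```python
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def ols(rows, y):
    """OLS coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Synthetic data: every number below is invented for illustration.
random.seed(0)
TRUE = {"const": 0.5, "prior_score": -0.03, "parent_educ": 0.04, "school_days": 0.02}
rows, gains = [], []
for _ in range(500):
    prior = random.gauss(50, 18)    # incoming (2001-style) score, A_{i(t-1)}
    educ = random.gauss(2.3, 2.0)   # a home background variable (X)
    days = random.gauss(111, 10)    # a school variable (S)
    gain = (TRUE["const"] + TRUE["prior_score"] * prior
            + TRUE["parent_educ"] * educ + TRUE["school_days"] * days
            + random.gauss(0, 0.3))
    rows.append([1.0, prior, educ, days])
    gains.append(gain)

coef = ols(rows, gains)
for name, estimate in zip(TRUE, coef):
    print(f"{name:12s} true = {TRUE[name]:+.3f}  estimated = {estimate:+.3f}")
```

The point of the sketch is only the structure: the gain is the dependent variable, the incoming score enters as a control, and the remaining coefficients are read as associations with learning growth rather than with achievement levels.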
However, since the sample includes schools from the PRONADE community school project it is possible to test for different production dynamics in one specific case. The community


Table 1. Descriptive statistics: dependent and independent variables

| Characteristic | Definition | Whole sample | PRONADE | Ethnicity: Ladino | Ethnicity: Indigenous |
|---|---|---|---|---|---|
| Achievement dependent variables | | | | | |
| Spanish score 2002 | Percent correct on same 40 item multiple choice test applied in 2001, with different form | 59.3 (19.2) | 46.6** (20.6) | 73.7 (14.0) | 51.1** (15.7) |
| Math score 2002 | Percent correct on same 40 item multiple choice test applied in 2001, with different form | 57.5 (15.8) | 48.8** (15.7) | 63.9 (14.9) | 53.8** (15.2) |
| Spanish gain score | Percentage difference between 2001 and 2002 result | 8.8 (11.3) | 6.8* (10.6) | 10.8 (11.4) | 7.6** (11.1) |
| Math gain score | Percentage difference between 2001 and 2002 result | 13.1 (11.7) | 11.5 (11.6) | 14.8 (11.5) | 12.1** (11.7) |
| Student-family background | | | | | |
| Spanish test 2001 | Percent correct on same 40 item multiple choice test | 50.5 (18.2) | 39.8** (20.7) | 62.9 (14.9) | 43.5** (16.0) |
| Math test 2001 | Percent correct on same 40 item multiple choice test | 44.4 (15.3) | 37.4** (15.2) | 49.2 (15.5) | 41.7** (14.6) |
| Student age | Student’s age in years | 11.7 (1.6) | 11.5 (1.6) | 11.1 (1.5) | 12.0** (1.5) |
| Female | Student is female (pct) | 49.0 | 53.0 | 50.0 | 48.0 |
| Indigenous | Student reports speaking a Mayan language (pct) | 64.0 | 84.5** | – | – |
| Parental education | Average years of parental education | 2.3 (2.1) | 1.5** (1.8) | 3.0 (2.3) | 1.9** (1.9) |
| SES | Principal component factor of household possessions and services (non-rotated) | 0.10 (1.1) | −0.61** (1.1) | 1.2 (1.1) | −0.4** (1.2) |
| Grade 4 | Student is in grade 4 (pct) | 86.0 | 92.0** | 89.0 | 85.0** |
| Student attendance | Percentage of total days offered in school during 2002 that student was in attendance | 93.5 (6.9) | 93.6 (5.0) | 94.4 (6.1) | 93.0** (7.3) |
| Selection correction | | | | | |
| Cohort control | Average grades completed for cohort of students chosen from grade 1 in 1999 | 2.6 (0.5) | 2.9** (2.5) | 2.6 (0.4) | 2.6 (0.6) |
| Day of test control | Percentage of grade 4 class in attendance on day of exam application | 90.1 | 85.1** | 89.3 | 91.1** |
| Non-selection hazard ($\lambda$) | Hazard that student is enrolled in school in 2002 (Mills ratio transformation) | 0.77 | 0.75** | 0.79 | 0.76** |
| School characteristics | | | | | |
| Class size | Number of students in classroom | 32.9 (10.0) | 35.0** (6.1) | 33.1 (11.5) | 32.4 (9.3) |
| Student-reported fighting (school average) | Students report fighting with other students in school (1 = None, 2 = Some, 3 = Many) | 1.33 (0.15) | 1.42** (0.17) | 1.35 (0.18) | 1.33 (0.14) |
| Teacher in university | Teacher currently taking university classes (pct) | 18.0 | 17.1 | 15.3 | 19.5 |
| Teacher experience | Years of experience in this school | 4.9 (5.6) | 2.3** (1.5) | 6.1 (7.7) | 4.2** (4.1) |
| Teacher is Indigenous | Teacher describes his/herself as Indigenous (pct) | 41.4 | 56.8** | 22.0 | 51.5** |
| Teacher content knowledge (Spanish) | Percentage correct of 12 items from student Spanish test | 94.2 (3.8) | 94.2 (7.0) | 94.1 (3.8) | 94.4 (3.9) |
| Teacher content knowledge (math) | Percentage correct of 10 items from student Math test and 16 items from middle school level | 69.7 (11.7) | 66.0** (10.4) | 70.5 (12.2) | 69.4 (11.6) |
| Teacher pedagogical content knowledge (PCK) | Sum of points (out of 8) for three activities designed to measure specialized teaching knowledge in grade 3 mathematics | 5.2 (2.0) | 4.3** (1.6) | 5.7 (1.7) | 4.2** (1.9) |
| Teaching segments | | | | | |
| Seat work | Percent of class devoted to students working individually | 40.6 | 31.5** | 45.2 | 37.8** |
| Teacher–student interaction | Percent of class teacher asks individual questions, group responses or students work at board | 26.6 | 27.4 | 25.3 | 28.1** |
| Group work | Percent of class devoted to students working in groups | 4.4 | 0.0** | 10.3 | 1.6** |
| Teacher-centered | Percent of class devoted to teacher giving instructions, resolving problems, etc. | 24.0 | 34.2** | 15.9 | 27.9** |
| Transition/discipline | Percent of class devoted to transitions in between activities or stopped time | 4.4 | 6.9** | 3.3 | 4.6** |
| Teacher checks all work | Teacher checked all students’ work during lesson (pct) | 25.1 | 18.5** | 33.3 | 22.4** |
| Physical condition of classroom | Average condition of classroom based on observations of space, lighting, noise and desks | 2.6 (0.6) | 2.9** (0.6) | 2.6 (0.6) | 2.6 (0.6) |
| School days | Total days of class from first day of class until day of test | 111.2 (9.6) | 112.8* (6.7) | 112.5 (8.3) | 110.5** (10.2) |
| Total enrollment | Total grade 1–6 enrollment | 249.4 (131.8) | 204.4** (145.2) | 254.4 (146.4) | 242.3 (126.4) |
| PRONADE school | School is part of PRONADE system (pct) | 10.6 | | 4.8 | 14.6** |
| Distance to middle school | Kilometers to nearest middle school | 5.0 (6.0) | 7.2** (8.9) | 3.5 (3.8) | 6.1** (6.9) |

Source: HCRG Databases, 2003. Notes: Means are presented with standard deviations (when appropriate) in parentheses. Asterisks for PRONADE schools and Indigenous students refer to significant differences compared with whole sample average. *Difference p < 0.10; **Difference p < 0.05.
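Because the teaching-segment rows in Table 1 are shares of observed class time, each column should sum to roughly 100%. The short check below is a sketch using the means as transcribed from Table 1; the dictionary layout and names are hypothetical and serve only to make the time-use comparison across groups easier to read.

```python
# Teaching-segment means (percent of class time) as transcribed from Table 1.
SEGMENTS = [
    "Seat work",
    "Teacher-student interaction",
    "Group work",
    "Teacher-centered",
    "Transition/discipline",
]
SHARES = {
    "Whole sample": [40.6, 26.6, 4.4, 24.0, 4.4],
    "PRONADE":      [31.5, 27.4, 0.0, 34.2, 6.9],
    "Ladino":       [45.2, 25.3, 10.3, 15.9, 3.3],
    "Indigenous":   [37.8, 28.1, 1.6, 27.9, 4.6],
}


def column_total(shares):
    """Sum of the segment shares for one group."""
    return sum(shares)


for group, shares in SHARES.items():
    total = column_total(shares)
    largest = SEGMENTS[shares.index(max(shares))]
    print(f"{group:12s} total = {total:5.1f}%   largest segment: {largest}")
```

Read this way, the table shows individual seat work as the largest single use of time in the whole sample, while in PRONADE classrooms teacher-centered activity takes the largest share.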

school/traditional school comparison is especially enticing given the stated purpose of community schooling to improve capacity utilization. I use a simplified decomposition framework that pools the data by school type and divides the achievement differences between PRONADE and traditional public schools into two sources: explainable differences based on endowments of independent variables, and an unexplainable fixed or “direct” effect (Blinder, 1973; Neumark, 1988; Oaxaca, 1973). The gain score production function estimates are used for this activity, and the variation is divided into four categories:

$$\bar{A}_1 - \bar{A}_2 = \hat{\beta}_{\mathrm{PRONADE}} + (\bar{X}_1 - \bar{X}_2)\hat{\beta}_X + (\bar{S}_1 - \bar{S}_2)\hat{\beta}_S + (\bar{C}_1 - \bar{C}_2)\hat{\beta}_C \qquad (3)$$

where the achievement difference between group 1 (traditional schools) and group 2 (PRONADE) is a function of the direct effect of PRONADE in (1) and differences in endowments for family background, schools and communities multiplied by the coefficients for each variable, all taken from Eq. (2) (with incoming achievement) from above.

Table 1 shows that indigenous students score much lower than their ladino counterparts. This is consistent with previous standardized testing in Guatemala and elsewhere, although the language gap in this sample of rural schools is larger than in any other previous study in the region. The same decomposition framework (3) is applied to explain the indigenous test score gap. The basic interpretation is the same except for the treatment of the “direct” effect for indigenous students, which represents the difference in achievement between indigenous and ladino students with equal family background endowments who are studying in the same school. All of the decomposition results are presented in Table 3 in the next section.

4. Results

4.1. The covariates of student achievement

The production function results for Spanish and mathematics are presented in Table 2. In each subject achievement is analyzed in both level and gain score form. Dependent variables are measured in standard deviations. T-statistics are based on robust standard errors that account for student clustering, and the models also include state (departamento) dummy variables (Alta Verapaz excluded). Upwards of 52% of the variation is explained in level form, and between 23 and 31% of the gain score variation. As expected, indigenous students score significantly lower than ladinos. In Spanish language the unexplained difference – measured by the indigenous student dummy variable – is 0.25 standard deviations in the gain score model. The corresponding effect in mathematics is much smaller and insignificant.
The very different results by subject cast some doubt on differential treatment inside the classroom. The stronger impact in language points instead to linguistic factors. Nevertheless, the size of this effect in Spanish is still troubling given the extensive controls.

With information on student attendance and days of class the HCRG data make it possible to analyze the attendance–achievement link with unusual detail. A standard deviation increase in days of class (about 10 days) predicts 0.20 standard deviations higher gain scores in Spanish, and a 0.11 higher level score in mathematics. The student attendance measure is only significant in the mathematics level model, although when combined with days of class the total attendance effect in standardized terms is roughly 0.22 standard deviations. However, what if students are attending less frequently as a response to the school being closed more often? The question is important enough to warrant further exploration, and in separate estimations (not presented) using attendance as the dependent variable I find that student attendance is higher in schools that work more days. The only category of school closings that significantly predicts less frequent student attendance is the unexplained category where the teacher did not write in the reason for no class. These kinds of linkages are only suggestive, but the most serious implication is that students attend less frequently when schools have more unofficial (or unannounced) closings.

In rural Guatemala the teacher’s ethnicity and language use are potentially important elements of quality. Table 2 shows that students studying with (self-described) indigenous teachers have higher Spanish and mathematics scores. However, for this prima facie evidence of bilingual education effectiveness to be convincing it needs to be shown that indigenous students fare especially well when studying with indigenous teachers. Additional estimations using interaction terms show evidence of positive interaction only in the mathematics gain score estimation. A better test of the impact of bilingual education is based on actual language use in the classroom. Teachers were asked if they ever use indigenous languages in instruction, and during the classroom observations the enumerators noted the language being used during class. The teacher’s reported use of Mayan languages (yes–no) is significant in mathematics only, and only when it is included without the teacher ethnicity control. So overall the results are inconclusive with regard to the effectiveness of bilingual education programs in rural Guatemala.

Other variables touch on actual abilities and still more specific elements of the teacher’s work. As expected, teacher content knowledge is associated with higher levels of student achievement. For mathematics a standard deviation increase in content knowledge predicts 0.11 standard deviations of gains. For pedagogical content knowledge (PCK) the point estimate is significant, but the effect size is modest: only 0.03 standard deviations in student gains for each point increase in teacher PCK. The results for the teaching segments show that the most effective classes appear to be those that limit group work and stress teacher–student interaction and teacher-centered lecture and explanation activities (in mathematics). Student achievement is also lower in classes that have more “down time” due to transitions between activities or disciplinary actions. More direct instruction methods have been associated with learning in the United States (Goldhaber & Brewer, 1997), while the negative coefficient for group work is corroborated by recent work with the TIMSS international mathematics test score data (Carnoy, Marshall, & Socias et al., 2007). Finally, when teachers are observed checking all students’ work the average achievement is significantly higher. The effect size is quite large, upwards of 0.35 standard deviations, which provides powerful evidence of the effectiveness of this particular strategy.

The remaining variables in Table 2 cover the community characteristics and controls for selection and school type. The results for selection bias are uneven, although they do point to a negative effect on average achievement when the school does a better job of retaining students. This


Table 2. OLS estimates of student achievement, by subject

| Variables | Spanish: Level | Spanish: Gains | Mathematics: Level | Mathematics: Gains |
|---|---|---|---|---|
| Student-family variables | | | | |
| Spanish test 2001 | – | −0.03*** (−9.67) | – | – |
| Math test 2001 | – | – | – | −0.03*** (−13.82) |
| Student age | −0.02 (−0.74) | −0.03 (−0.94) | −0.001 (−0.01) | −0.04 (−1.53) |
| Female | −0.14*** (−2.73) | −0.02 (−0.28) | −0.32*** (−5.30) | −0.15*** (−2.66) |
| Indigenous | −0.39*** (−3.04) | −0.25** (−2.05) | −0.14 (−1.13) | −0.02 (−0.19) |
| Parental education | 0.04** (2.56) | 0.03 (1.26) | 0.03** (2.04) | 0.02 (1.07) |
| SES | 0.05*** (2.62) | 0.03 (1.61) | −0.003 (−0.12) | 0.002 (0.08) |
| Attendance | | | | |
| Student attendance | 0.003 (0.68) | 0.001 (0.39) | 0.01* (1.83) | 0.01 (1.19) |
| School days | 0.02*** (5.83) | 0.02*** (4.67) | 0.01** (2.06) | 0.006 (0.80) |
| Teacher and classroom characteristics | | | | |
| Class size | 0.004 (0.98) | 0.01* (1.69) | −0.004 (−0.06) | −0.001 (−0.01) |
| Student-reported fighting (school average) | −0.01 (−0.37) | −0.01 (−0.32) | 0.03 (0.68) | 0.04 (1.28) |
| Teacher in university | 0.09 (0.93) | 0.14 (1.46) | 0.10 (0.70) | −0.05 (−0.58) |
| Teacher experience | 0.01 (0.42) | 0.02** (2.25) | −0.01 (−0.67) | −0.004 (−0.43) |
| Teacher is Indigenous | 0.15* (1.76) | 0.11 (1.30) | 0.21 (1.62) | 0.23*** (2.68) |
| Teacher content knowledge | 0.001 (0.13) | 0.01 (1.06) | 0.01* (1.74) | 0.01** (2.15) |
| Teacher pedagogical content knowledge (PCK) | – | – | −0.01 (−0.29) | 0.03* (1.79) |
| Teaching segments (a) | | | | |
| Teacher–student interaction | −0.001 (−0.22) | 0.003 (1.30) | 0.005 (1.43) | 0.01*** (4.22) |
| Group work | −0.008*** (−3.71) | −0.01*** (−5.21) | −0.001 (−0.38) | −0.006 (−1.48) |
| Teacher-centered | −0.005 (−1.47) | −0.002 (−0.28) | 0.01** (2.02) | 0.01*** (3.21) |
| Transition/discipline | −0.02*** (−2.56) | −0.03** (−1.99) | −0.03** (−2.26) | −0.05*** (−4.44) |
| Teacher checks all work | 0.35*** (3.80) | 0.20** (2.32) | 0.28* (1.90) | 0.34** (2.27) |
| Physical condition of classroom | 0.07 (1.41) | 0.09* (1.79) | 0.06 (0.65) | 0.11 (1.38) |
| School and community characteristics | | | | |
| Average parental education | 0.03 (0.61) | 0.07 (1.11) | −0.01 (−0.08) | −0.08 (−0.97) |
| Distance to middle school | −0.05*** (−4.82) | −0.03*** (−2.91) | −0.008 (−0.47) | 0.004 (0.37) |
| Total enrollment | −0.001** (−2.31) | −0.001** (−1.86) | −0.001 (−0.47) | −0.001 (−0.41) |
| PRONADE school | −0.30** (−2.26) | −0.23 (−1.92) | −0.44** (−2.46) | −0.17 (−0.97) |
| State controls (b) | | | | |
| Chimaltenango | 0.23 (1.51) | 0.26* (1.78) | 0.08 (0.39) | 0.14 (0.83) |
| Escuintla | 0.50*** (2.97) | 0.57*** (3.41) | 0.83*** (3.15) | 0.98*** (3.90) |
| Selection correction | | | | |
| Cohort control | −0.22** (−2.10) | −0.26*** (−3.09) | 0.06 (0.38) | −0.22** (−2.21) |
| Day of test control | 0.004 (1.07) | 0.008 (1.58) | −0.01 (−1.37) | −0.01*** (−3.00) |
| Non-selection hazard ($\lambda$) | −1.89*** (−2.75) | −1.10 (−1.20) | −0.30 (−0.29) | 0.08 (0.09) |
| LR test for instrument power (chi-square) | 9.86** | 9.15** | 5.67 | 5.61 |
| Wald test for instrument exclusion (F-test) | 0.51 | 1.37 | 0.45 | 0.91 |
| Wald test excluding lambda (F-test) | 1.07 | 1.22 | 0.50 | 0.94 |
| Sample size | 839 | 839 | 803 | 803 |
| R2 | 0.517 | 0.233 | 0.276 | 0.310 |

Source: HCRG, 2003. Notes: Dependent variables measured in standard deviations. Additional predictors include the number of older and younger siblings, textbooks, teacher gender and the distance to Guatemala City (the national capital). “Level” models use 2002 score as dependent variable; “Gains” refers to difference between 2002 and 2001 scores. The LR Test compares first stage selection estimations with and without identifying instruments to assess instrument power (as a group). Wald tests are used to test excludability of instruments from second stage achievement estimations, first with the predicted non-selection hazard ($\lambda$) and then without. T-statistics (in parentheses) are based on robust standard errors that correct for clustering within schools. See Table 1 for variable measurement specifics.
(a) Excluded category for time segment analysis is individual seatwork.
(b) Excluded category for state controls is Alta Verapaz.

and the very large point estimates for the all-ladino state of Escuintla will be returned to in the decomposition activity. As for the PRONADE schools there is some evidence that achievement is lower when all else is equal.

4.2. Academic achievement decompositions

Table 3 presents the decompositions for student achievement. In each comparison the raw difference (in standard deviations) is presented for the gain scores only. Negative coefficients favor the lower scoring schools, meaning indigenous schools and PRONADE. The overall flavor of each set of comparisons – especially for the (potential) policy levers – is little changed by model specification or decomposition technique.

Based on the fixed effects analysis of McEwan and Trowbridge (2007), and the more detailed decomposition in Hernandez-Zavala, Patrinos, Sakellariou, and Shapiro


Table 3
Academic achievement decompositions

| Variables | Indigenous test score gap: Spanish gains | Indigenous test score gap: Math gains | PRONADE test score gap: Spanish gains | PRONADE test score gap: Math gains |
|---|---|---|---|---|
| Raw difference in standard deviations | 0.21 | 0.25 | 0.14 | 0.07 |
| By variable category | | | | |
| Student/family background | −0.47 | −0.11 | −0.31 | −0.24 |
| Community characteristics | 0.40 | 0.46 | 0.09 | 0.13 |
| Selection controls | −0.01 | 0.04 | 0.11 | −0.02 |
| School characteristics | 0.04 | −0.16 | 0.02 | 0.04 |
| Unexplained “direct” parameter | 0.25 | 0.02 | 0.23 | 0.17 |
| Select school quality differences | | | | |
| School days | 0.04 | 0.01 | −0.04 | −0.01 |
| Teacher is Indigenous | −0.02 | −0.06 | −0.02 | −0.04 |
| Teacher content knowledge | 0.00 | 0.01 | 0.00 | 0.05 |
| Teacher PCK | – | 0.02 | – | 0.02 |
| Teaching segments | | | | |
| Group work | −0.10 | −0.05 | −0.06 | −0.03 |
| Teacher-centered | 0.03 | −0.15 | 0.02 | −0.10 |
| Transition/discipline | 0.05 | 0.07 | 0.07 | 0.21 |
| Teacher checks all work | 0.03 | 0.04 | 0.03 | 0.05 |

Source: HCRG, 2003. Notes: Decompositions based on achievement levels are available on request from author. Gains refer to difference between 2002 and 2001 scores. Students are classified as indigenous when they report speaking a Mayan language in the home. The top row refers to the raw difference in average achievement between each group (in standard deviations). This total difference is decomposed into the five variable categories listed in the top half (Student-family background, community, etc.). The incoming (2001) test score is included in the student-family background category. The decomposition uses a single estimation equation and focuses only on endowment differences between the two groups. All coefficients refer to standard deviation changes in achievement. Negative coefficients refer to areas where the low scoring group (indigenous and PRONADE) have more favorable endowments. See text for more information, Table 2 for the coefficients and Table 1 for the means for each group.
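The endowment-only decomposition described in the notes to Table 3 can be illustrated with a minimal sketch. It assumes that each variable's contribution equals its coefficient from the single estimation equation multiplied by the difference in group means (higher scoring group minus lower scoring group), which matches the sign convention in the notes. The function name, variable names and input values below are hypothetical and chosen only for illustration; they are not the Table 1 means or the Table 2 coefficients. The second part uses the category values reported in Table 3 to check that the five categories sum to the raw difference, up to rounding.

```python
def endowment_contributions(coefs, means_high, means_low):
    """Return beta_k * (mean_high_k - mean_low_k) for every variable k.

    Positive values favor the higher scoring group; negative values mark
    endowments that favor the lower scoring group (Table 3 sign convention).
    """
    return {k: coefs[k] * (means_high[k] - means_low[k]) for k in coefs}


# HYPOTHETICAL inputs, chosen only to make the arithmetic easy to follow.
# They are not the paper's estimated coefficients or sample means.
coefs = {"school_days": 0.02, "teacher_checks_all_work": 0.20}
means_high = {"school_days": 112.0, "teacher_checks_all_work": 0.60}
means_low = {"school_days": 110.0, "teacher_checks_all_work": 0.45}

contributions = endowment_contributions(coefs, means_high, means_low)

# Values copied from Table 3: raw gap, then the five category contributions
# (student/family, community, selection, school, unexplained "direct").
TABLE3 = {
    "indigenous_spanish": (0.21, [-0.47, 0.40, -0.01, 0.04, 0.25]),
    "indigenous_math": (0.25, [-0.11, 0.46, 0.04, -0.16, 0.02]),
    "pronade_spanish": (0.14, [-0.31, 0.09, 0.11, 0.02, 0.23]),
    "pronade_math": (0.07, [-0.24, 0.13, -0.02, 0.04, 0.17]),
}


def additivity_gap(raw, categories):
    """Absolute difference between the raw gap and the sum of its parts."""
    return abs(raw - sum(categories))


for name, value in contributions.items():
    print(f"{name}: {value:+.3f} SD")

for name, (raw, categories) in TABLE3.items():
    print(f"{name}: raw={raw:.2f}, sum of parts={sum(categories):.2f}")
```

In the reported figures the categories reproduce the raw difference in three of the four columns; the PRONADE math column differs by 0.01, which is consistent with rounding of the published values.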

(2006), we know that school and community characteristics figure somewhat prominently in explaining test score differences by ethnicity. With the HCRG data this activity can be taken one step further using multiple features of schools and classrooms that are significant predictors of achievement gains.

As argued above, the larger unexplained (or “within school”) result for Spanish suggests that this subject is more affected by the linguistic limitations that indigenous students bring to school. But we cannot rule out a pedagogical component to this, especially if differential treatment in the classroom revolves around language use. The remaining family background variables play a relatively minor role in determining the indigenous test score gap.

The school variables as a group also do little to explain why indigenous students have lower scores. This does not mean that all endowments are equal, however, and the detailed results at the bottom of Table 3 provide some clues into areas of schooling that can be improved in indigenous communities. For example, the non-indigenous schools gain as much as 0.04 standard deviations by working more days. Their classes are also more orderly, and the ladino school teachers are more likely to check all student work. But there are offsetting differences where the predominantly indigenous schools have more favorable endowments. One advantage is that they have more indigenous teachers. Indigenous classrooms also rely less on group work and use more teacher-centered instruction, each of which provides sizeable gains.

The largest gains for ladinos come instead from community characteristics. The biggest advantage is their concentration in the state of Escuintla, which accounts for upwards of 0.50 standard deviations. Indigenous students also fare worse because their schools are located farther away from middle schools. The critical question therefore is the extent to which these community and state controls are capturing unmeasured elements of school and teacher quality. This does not seem likely given the scope of information that is available for days worked, teacher quality and classroom processes. The centralized nature of policymaking in Guatemala also argues against inter-state quality effects of the magnitude suggested by the Escuintla dummy variable parameters. The more likely explanation takes us back to the kinds of unmeasured local influences identified in previous studies, such as peer effects, cultural attitudes towards schooling or variation in labor market dynamics.

The right hand side of Table 3 includes the decompositions by school type. As hypothesized, there is some evidence that community schools realize achievement gains as a result of more efficient capacity utilization. They report more days of class, which accounts for a small advantage in achievement. There is also less reliance on student group work and more use of teacher-centered instruction, two methodological choices that could – in theory – result from having more motivated teachers. However, the utilization argument is weakened by the greater tendency of non-PRONADE school teachers to check all student work, which in turn predicts an achievement gain advantage of up to 0.05 standard deviations.

For Spanish achievement there is one aspect where greater PRONADE efficiency has a detrimental effect on average test scores: grade completion. Average grade completion is significantly higher in PRONADE schools among the randomly drawn students who entered grade one in 1999 (Table 1). This greater efficiency in getting children through school does appear to come with a tradeoff in achievement gains. In the decomposition the effect of this


variable is 0.11 standard deviations. This is a tentative linkage given the different results for mathematics. But the important point is that community schools may be especially focused on getting children from grade to grade and reducing dropout, which in turn is likely to handicap their ability to maximize achievement.

Finally, PRONADE teacher endowments are also generally inferior, at least in several aspects that are related to student achievement. In mathematics their lower levels of content knowledge result in a 0.05 standard deviation disadvantage. The classes are also more disorderly, which in turn predicts upwards of 0.21 standard deviations in achievement differences. This disorder must be qualified somewhat, since the PRONADE teachers are more likely to work in a multigrade setting, which in turn requires more frequent transitions between activities.

Overall the results are consistent with the general research contours that are forming related to community schooling in Central America. There are some things that these schools do well, and these advantages in general predict more student achievement. But capacity utilization has limits, especially when the schools are either intentionally (to save money) or unintentionally (because of access) making greater use of lower quality teachers.

5. Conclusion

This paper uses unusually rich data from rural Guatemalan primary schools to analyze student achievement in a poor, developing country context. The detailed data make it possible to help fill in missing elements in several existing literatures. For example, a handful of previous studies have found that teachers are frequently absent in developing country schools; the results here corroborate this finding in rural Guatemala while demonstrating its consequences for student achievement. The results linking two different forms of teacher mathematics knowledge with student achievement gains also help break some new ground for a frequently mentioned – but rarely analyzed – feature of teacher quality. For policymakers in Guatemala and beyond these kinds of linkages serve as useful reminders of the potential impact of well-designed interventions in school management and teacher preparation. For researchers the results provide one more example of how exceptional data collections can help inform research questions that are frequently left unaddressed in the production function framework.

The analysis also addresses one of the most pressing social questions in contemporary Guatemalan society: the poor performance of indigenous students on standardized tests. The results suggest no simple policy prescription for equalizing test scores among indigenous and ladino students in rural areas. Most importantly, the schools attended by each group are very similar, at least based on an extensive list of school and teacher characteristics. This is itself an important finding, since previous decomposition activities in Guatemala have had few school quality mechanisms to consider. There is some evidence that indigenous teachers are more effective, and math gain scores are significantly higher for indigenous students when they are paired with indigenous teachers. But assessing the overall effectiveness of this intervention – let alone identifying the most important mechanisms – will have to wait for better data on implementation.

If indigenous and non-indigenous schools in rural Guatemala have similar measured endowments, then what can policymakers do to redress the historical legacy of massive inequalities in educational opportunities? The significant marginal differences in achievement between indigenous and ladino students, when all else is equal, raise the possibility of different treatment inside the classroom. But additional cultural and institutional factors also appear to be at work, especially when considering the large community effects in these comparisons. Indigenous students and their families may be less sanguine about the future payoffs to schooling, perhaps because cultural and/or physical isolation reduces access to urban labor markets. Or these communities may be neglected in other institutional areas, such as health care access. Each points to aspects of the larger opportunity structure that go beyond simply targeting these communities for more educational resources.

The results also shed some light on decentralization dynamics in a poor, rural context. Expectations that PRONADE community schools realize efficiency gains that are offset by endowment deficiencies in critical areas of teacher competence are largely confirmed. Once again the evidence does not point to dramatic differences in schools by program, and more qualitative evidence about how PRONADE communities conceptualize school quality is clearly needed. Given the tendency of PRONADE to enroll indigenous students this question once again touches on the need to understand the impact of underlying motivations and access to information on schooling behaviors. Capacity deficiencies in PRONADE schools may be a product of parents having low expectations about future returns to schooling, or community councils may be hard pressed to instigate changes to improve teaching quality.
Acknowledgements

Useful comments were provided on earlier versions by Martin Carnoy, Susanna Loeb, Eric Hanushek, Miguel Socias, Arjun Bedi, Jo Boaler, Nancy Tuma, and two anonymous referees. All remaining errors are of course my own. In Guatemala, Yetilu de Baessa and her research team at the PRONERE evaluation project provided crucial logistical assistance with the data collection. Alejandra Sorto provided the items for math teachers, which are part of a larger study we are working on together. Partial funding was provided by the Spencer Foundation. The views expressed here do not reflect those of the RAND Corporation. The HCRG data are available at http://www.sapere.org.

References

Alcazar, L., Rogers, F. H., Chaudhury, N., Hammer, J., Kremer, M., & Muralidharan, K. (2005). Why are teachers absent? Probing service delivery in Peruvian primary schools. Washington, DC: The World Bank.
Bedi, A. S., & Marshall, J. H. (2002). Primary school attendance in Honduras. Journal of Development Economics, 69, 129–153.
Blinder, A. S. (1973). Wage discrimination: Reduced form and structural estimates. The Journal of Human Resources, 8, 436–455.
Carnoy, M., Marshall, J. H., & Socias, M. (2007). Explaining differences in academic achievement using international test data. Working paper.


De Baessa, Y. (2002). Summary of 2001 PRONERE test application. Universidad del Valle, Guatemala. Working paper.
DiGropello, E., & Marshall, J. H. (2005). Teacher effort and schooling outcomes in rural Honduras. In E. Vega (Ed.), Incentives to improve teaching: Lessons from Latin America. Washington, DC: The World Bank.
DiGropello, E., Marshall, J. H., & Rápalo, R. (2004). The community based school management study: Evidence on progress and the impact of the models. Washington, DC: The World Bank.
Fuller, B., & Clarke, P. (1994). Raising school effects while ignoring culture? Local conditions and the influence of classroom tools, rules and pedagogy. Review of Educational Research, 64, 119–157.
Fuller, B., Dellagnelo, L., Strath, A., Bastos, E. S. B., Maia, M. H., Lopes de Matos, K. S., Portela, A. L., & Vieira, S. L. (1999). How to raise children’s early literacy? The influence of family, teacher, and classroom in northeast Brazil. Comparative Education Review, 43, 1–35.
Glewwe, P. (2002). Schools and skills in developing countries: Education policies and socioeconomic outcomes. Journal of Economic Literature, 90, 436–482.
Glewwe, P., Grosh, M., Jacoby, H., & Lockheed, M. (1995). An eclectic approach to estimating the determinants of achievement in Jamaican primary education. World Bank Economic Review, 9, 231–258.
Goldhaber, D. D., & Brewer, D. J. (1997). Why don’t schools and teachers seem to matter? Assessing the impact of unobservables on educational productivity. The Journal of Human Resources, 32(3), 505–523.
Hanushek, E. A., & Woessmann, L. (2007). The role of education quality in economic growth. World Bank Policy Research Working Paper No. 4122. Washington, DC: The World Bank.
Harbison, R., & Hanushek, E. A. (1992). Educational performance of the poor: Lessons from rural northeast Brazil. New York: Oxford University Press.
Hernandez-Zavala, M., Patrinos, H. A., Sakellariou, C., & Shapiro, J. (2006). Quality of schooling and quality of schools for indigenous students in Guatemala, Mexico and Peru. Washington, DC: The World Bank.
Hill, H., Rowan, B., & Ball, D. L. (2005). Effects of teachers’ mathematical knowledge for teaching on student achievement. American Educational Research Journal, 42, 371–406.
Jimenez, E., & Sawada, Y. (1999). Do community-managed schools work? An evaluation of El Salvador’s EDUCO program. World Bank Economic Review, 13, 415–441.
Kremer, M., Chaudhury, N., Rogers, F. H., Muralidharan, K., & Hammer, J. (2005). Teacher absence in India: A snapshot. Journal of the European Economic Association, 3, 658–667.
Marshall, J. H., & Sorto, A. M. (2007). Teaching what you know or knowing how to teach? The effects of different forms of teacher mathematics knowledge on student achievement in rural Guatemala. Working paper. http://www.sapere.org.
McEwan, P. J., & Trowbridge, M. (2007). The achievement of indigenous students in Guatemalan primary schools. International Journal of Educational Development, 27, 61–76.
Mullens, J. E., Murnane, R. J., & Willett, J. B. (1996). The contribution of training and subject matter knowledge to teaching effectiveness: A multilevel analysis of longitudinal evidence from Belize. Comparative Education Review, 40, 139–157.
Neumark, D. (1988). Employers’ discriminatory behavior and the estimation of wage discrimination. The Journal of Human Resources, 23, 279–295.
Oaxaca, R. (1973). Male–female wage differentials in urban labor markets. International Economic Review, 14, 693–709.
Psacharopoulos, G., & Patrinos, H. A. (Eds.). (1994). Indigenous people and poverty in Latin America: An empirical analysis. Washington, DC: The World Bank.
Santibañez, L. (2006). Why we should care if teachers get A’s: Teacher test scores and student achievement in Mexico. Economics of Education Review, 25, 510–520.
World Bank. (2004). Poverty in Guatemala. Washington, DC: The World Bank.