Journal of Public Economics 89 (2005) 821 – 839 www.elsevier.com/locate/econbase
The effects of spending on test pass rates: evidence from Michigan Leslie E. Papke Department of Economics, Michigan State University, East Lansing, MI 48824-1038, United States Received 25 October 2001; received in revised form 3 May 2004; accepted 12 May 2004 Available online 14 August 2004
Abstract
This study uses data on standardized test scores from 1992 through 1998 at Michigan schools to determine the effects of spending on student performance. The years in the data set straddle 1994, when Michigan dramatically changed the way that K-12 schools are funded, and moved toward equalization of spending across schools. Focusing on pass rates for a fourth-grade math test (the most complete and consistent data available for Michigan), I find that increases in spending have nontrivial, statistically significant effects on math test pass rates, and the effects are largest for schools with initially poor performance.
© 2004 Elsevier B.V. All rights reserved.
JEL classification: 324; 912
Keywords: Spending; Test pass rates; Michigan
0047-2727/$ - see front matter © 2004 Elsevier B.V. All rights reserved. doi:10.1016/j.jpubeco.2004.05.008
1. Introduction
The past 25 years have witnessed a move in the United States to centralize K-12 school funding at the state level with the goal of providing equal, or more equal, resources per student. For example, in the early 1970s, New Mexico limited the amount districts could spend on their own schools and made the state, rather than local property taxes, the main source of money for school operation. In New Mexico, and also in California, these reforms resulted in a leveling down of per-student resources. More recently, Michigan
centralized K-12 school finance in a manner also designed to equalize spending, but with the goal of leveling per-pupil spending upward. In all the states, the purpose of these equalizing reforms is both to offer equal educational opportunities and to improve student performance, particularly in initially low-spending (and low-performing) school districts. Clearly, having precise estimates of the effects of spending or other resources on student performance is important from a policy perspective. Yet, as Yinger (2004) concludes, the link between state aid reform and student performance has proven difficult to establish. In the absence of substantial reform, there is typically not enough variation in inputs to estimate causal effects of spending. In states where there have been dramatic efforts to equalize spending, researchers are often limited to cross-sectional data or panel data hampered by long stretches of time between measured inputs and outputs (as much as 10 years apart in the case of Census data). It is too ambitious, perhaps, to expect to estimate an education production function when key confounding factors cannot be controlled for either explicitly with data or by econometric methodology. This paper examines the dramatic changes in district and school-level funding that resulted from the 1994 school finance reform in Michigan called Proposal A. The reform imposed strict spending floors (and phased in ceilings) on operating expenditures for public school districts, replacing a matching grant system that allowed for local discretion in determining the level of education spending with a district foundation guarantee financed primarily from the state School Aid Fund. I use panel data on Michigan elementary schools for the years 1992 to 1998 to examine the effects of increased spending on test pass rates. 
Using annual data at the building level from years bracketing the reform, I am able to avoid many of the limitations in earlier work by using panel data techniques. In addition, the nature of Proposal A generates a natural instrument to use in addressing the possible endogeneity of spending, so I do not have to resort to “simulated instrument methods” as in other work.1 For each school, pass rates on various Michigan Educational Assessment Program (MEAP) exams are available, along with per student spending, school enrollment, average teacher salaries, and pupil-to-teacher ratios. The data come from annual Michigan School Reports (MSRs). The MSRs have some information on student demographics or economic status; I use the percent of students eligible for the school lunch program as a proxy variable for economic well-being of the students at a school. Ideally, we would follow individual students over several years, and determine how educational inputs, such as total spending or, perhaps more directly, teachers’ salaries or class size, affect performance. But panel data on individual students are very difficult to come by. For example, in Michigan, students are not tested in the same subjects every year, and so it would be difficult to obtain outcomes that could be compared across years. In the future, such data may become available, where elementary school performance, middle school performance, and high school performance can be linked together (along with information on school inputs). While cross-sectional data at the individual level are easier to come by, problems in inferring causality make such data less useful, unless they
1 In a 50-state study of school finance equalization schemes, Hoxby (2001) uses simulated instruments to address the endogeneity of tax prices in per pupil spending equations.
have been collected experimentally.2 While experimental data exist for studying the effects of class size on student performance, as far as I am aware no such data exist for studying the effects of total spending. From a policy perspective, it may be at least as interesting to know the effects of per student spending on average performance at a school. Some educational initiatives involve giving additional funds to schools and allowing the schools to spend the money as they see fit. The ceteris paribus question of interest is: If two schools are the same in all relevant aspects except per-student spending, what is the predicted difference in pass rates on a standardized exam? With panel data at the school level I can control for school heterogeneity, which is likely to be an important confounding factor, and for aggregate changes over time that affect performance of students at all Michigan schools. In particular, in studying the link between spending and performance, it is important to recognize that educational costs vary across districts (and schools) for reasons outside the control of school officials. In the absence of a good index of education costs by district or school, use of panel data techniques can address this problem. But, in the school finance literature, the use of fixed effects has been limited since the technique is effective only if the observed educational inputs contain enough variation over time. Fortunately, in the state of Michigan, Proposal A resulted in notable changes in the distribution of K-12 funding across schools. This exogenous change in funding acts as a natural experiment, and can allow more precise estimation of the effects of school inputs. In the next section, I provide a brief overview of the change in K-12 school funding in Michigan and of related work. 
In Section 3, I discuss my econometric approaches for the panel data at the building level, and, in particular, discuss how the possible endogeneity of spending can be handled. In Section 4, I describe the state-wide MEAP exams, and present summary statistics on student pass rates and school funding. I present the econometric results in Section 5. There is a brief concluding section.
2. Background Since 1974, Michigan had used a power-equalizing/guaranteed tax base (GTB) plan that was intended to provide an equal, basic per-pupil property tax base to each district, rather than a basic per-pupil minimum level of expenditure. In effect, the marginal cost of education spending was reduced because the GTB plan involved matching grants. No limits were placed on school spending. It was anticipated that the matching grants, by lowering the price of education, would increase education spending in low-spending districts. In fact, spending differences increased because residents of low-spending districts did not respond to the reduction in price of the GTB plan, while higher-spending districts continued to approve local tax increases to increase spending. Further, state categorical aid at this time was not equalizing.
2 Krueger (1999) analyzes one of the rare cross-sectional data sets that was generated experimentally, the Tennessee STAR class-size experiment.
As a consequence of growing spending inequalities across districts (and dissatisfaction with relatively high local property taxes), in 1994 Michigan changed its system of school finance entirely with Proposal A.3 Under the reform, which went into effect for the 1994/1995 school year, new revenues are deposited to the state School Aid Fund to finance a district foundation guarantee, or spending floor.4 This spending floor increases spending by low-spending districts, and, since revenue limits are based on 1994 spending levels, also caps district spending at the high end. Thus, spending differences between districts will gradually be reduced as low-spending districts are gradually raised to the basic foundation and as the growth is limited in high-spending districts.
2.1. Distribution of revenues to districts under the reform
The initial foundation each school district received in 1994/1995 was based on state and local source revenue for 1993/1994.5 High-revenue districts were “held harmless” in that they continued to receive at least as much revenue as they had received under the previous finance system. Consequently, these districts received much more money from the state than other districts, but their revenues increased at a much smaller rate. Any increased equalization was to be accomplished by “leveling up” (increasing revenues provided to low-spending districts) rather than “leveling down.” A district’s foundation grant is a non-smooth function of per-student spending in the 1993/1994 school year. In the first year of the new financing system, the minimum foundation grant was US$4200.6 Districts initially spending below US$4200 per student in 1993/1994 were given a grant of US$4200 or an extra US$250, whichever grant would be larger. If initial spending was between US$4200 and US$6500, the allowed increase varied from US$250 to US$160, with smaller increases in districts with higher initial spending. The bulk of districts were in this group.
If 1993/1994 spending was US$6500 or greater, districts were allowed increases of US$160 per student. In subsequent years, each district’s increase is determined by whether it is above or below what is termed the basic foundation. The basic foundation grant, US$5000 in 1995, increases each year according to an index that equals total statewide revenues per pupil for all taxes earmarked for the School Aid Fund, divided by the 1995 level of revenues. Districts whose grant is above the state’s basic grant receive an annual lump-sum per-student increase equal to the percentage growth in the index times the basic foundation. Districts whose grant is below the basic foundation receive up to double those annual per-student amounts.
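The first-year grant rule described above can be sketched as a small function. This is only an illustration, not the statutory formula: the text states that the allowed increase falls from US$250 to US$160 between US$4200 and US$6500 but does not give the exact schedule, so the linear interpolation below is an assumption, and the function name `first_year_grant` is hypothetical.

```python
def first_year_grant(spending_1994: float) -> float:
    """Per-pupil foundation grant for 1994/1995, given 1993/1994 per-pupil spending.

    Thresholds and dollar amounts are taken from the text; the shape of the
    schedule between the two thresholds is an ASSUMPTION (linear interpolation).
    """
    floor = 4200.0         # minimum foundation grant in the first year
    upper = 6500.0         # spending level at which the increase bottoms out
    max_increase = 250.0   # increase at (and just below) the floor
    min_increase = 160.0   # increase for the highest-spending districts

    if spending_1994 < floor:
        # Raised to the floor or given an extra US$250, whichever is larger.
        return max(floor, spending_1994 + max_increase)
    if spending_1994 >= upper:
        return spending_1994 + min_increase

    # ASSUMPTION: the increase declines linearly from 250 to 160 over [4200, 6500).
    share = (spending_1994 - floor) / (upper - floor)
    increase = max_increase - share * (max_increase - min_increase)
    return spending_1994 + increase


if __name__ == "__main__":
    for s in (3800.0, 4100.0, 4200.0, 5350.0, 6500.0, 7000.0):
        print(s, first_year_grant(s))
```

Under these assumptions the grant has a kink wherever the rule changes, which is the non-smoothness in initial spending that the text emphasizes.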
3 The following discussion of the finance reform is drawn in part from Courant, Cullen and Loeb (2003); see Fisher and Wassmer (1995) for an early analysis.
4 The new funds come from a two percentage point increase in the state sales and use tax, a 50-cent per pack increase in the cigarette tax, a new six mill education tax, a 0.75% real estate transfer tax, and 23% of individual income tax revenues.
5 Arsen and Plank (2003) provide an extensive summary of the K-12 reform and implications for the State budget.
6 See Cullen and Loeb (2004) for the time path of minimum and basic foundation grants. By 2000, the minimum and basic grants reach US$5700 and subsequently grow at the rate of the School Aid Fund.
In the years following adoption of Proposal A, per-pupil foundation allowances continued to increase in all districts, but the lowest revenue districts received significantly larger increases than higher revenue districts.7 Districts choose the amount to spend from the foundation and any funds remaining from previous years, but districts do not have the option of reducing local taxes in response to the grant.8 The change in financing continued an upward trend in real per pupil spending in Michigan, but, as I illustrate in Section 4, also resulted in dramatic increases in average real spending per pupil for initially low-spending schools. This increased variation, along with an established annual data series, provides an unusual opportunity to estimate school production functions.
Papers in this literature examine across-district inequality in spending, or in performance, but typically the data are inadequate to examine the link between the two. Murray et al. (1998) discuss the flurry of litigation that challenged the constitutionality of school finance systems in 16 states between 1971 and 1996. They demonstrate that court-induced reforms in state aid did reduce inequality between 1971 and 1992. Casserly (2002) provides some evidence on continuing performance disparities by comparing test scores in large cities to their state averages. Downes (1992) analyzes California’s Proposition 13 and finds reductions in differences between districts in total expenditures per student, but no corresponding equalization of student achievement as measured by test scores. Downes et al. (1998) provide indirect evidence in finding that the imposition of property tax limits in the Chicago metropolitan area does not appear to affect student performance.9 In surveying the literature, Hanushek (1996) concludes “. . . simply adding more resources to schools as currently structured is unlikely to yield significant improvement in student performance.” Yinger (2004) discusses education finance litigation and resulting reforms to state finance systems. He concludes that while some of the evidence indicates that state aid can boost student performance, none of the findings is definitive and some are quite ambiguous. However, Hoxby (2001) offers an important criticism of much of this literature by noting that school finance equalizations create dramatically different tax prices across states. She suggests that in multi-state analyses of student performance, binary indicators of “reform” do a poor job of representing the true economic content of a school finance reform. Using the “simulated instruments” methodology she developed to study the impact of state aid reform on spending equality, Hoxby (2001) estimates an education production function using the high school drop out rate as the performance measure. She uses Census data for the years 1970, 1980, and 1990 and includes district fixed effects to control for
7 Michigan does not use cost adjustments in its foundation amount, but does provide categorical programs with ad hoc adjustments for concentrations of poor students or other students with special needs. Unfortunately, these programs have never been fully funded and may therefore have little impact (Cullen and Loeb, 2004).
8 Fisher and Papke (2000) discuss how school districts respond, financially and educationally, to intergovernmental grants.
9 This is only indirect evidence on these issues since the districts could reshuffle their budgets to accommodate the tax limits. Indeed, the authors speculate that districts subject to the tax limits appear to have protected instructional spending at the expense of other, potentially less productive, spending.
local taste for education, school organization, factors that affect the local cost of schooling, some demographic characteristics, and some forms of categorical aid. Unfortunately, only the high school drop out rate is available as a performance measure, and results for this variable were weak. It may be that 10-year intervals are simply too great for evaluating progress and that schools do not choose to spend increased funds on students likely to drop out. In addition, intra-district differences in spending across schools may be important. In fact, even schools within a school district can vary greatly with regard to unobserved factors. Generally, the problem is that variations in education inputs might be correlated with unobserved factors that affect student outcomes, such as family income. Demographic and economic variables are known to affect student outcomes. Failure to account for such variables can lead to spurious relationships between spending and performance. For example, parent support, while perhaps partly captured by family income, cannot easily be measured. If schools that have large parent support also have higher per pupil spending, the effect of parent support will be wrongly attributed to spending. Studies that use aggregate time series data require other factors that affect student performance over time to be uncorrelated with spending, something that is difficult to argue.10 Michigan’s Proposal A created an excellent opportunity to examine the performance-spending relationship since the dramatic change in school funding produced a natural experiment (in particular, exogenous changes in spending) that allows for better estimates of the effect of spending. By using data from before and after the re-financing initiative, I hope to exploit the extra variation in spending within schools that would not have been present without Proposal A.
I can also examine the effect of changes in spending directly on student performance by using data from state-wide tests that have a long history in the state. The next section presents alternative models and discusses estimation issues.
3. Econometric methodology
There are two common econometric approaches that can be used to estimate the effects of spending in light of Proposal A. One possibility is a difference-in-differences (DD) analysis, where initially low-spending schools form the “treatment” group while initially higher spending schools constitute a control group. An alternative approach is to estimate an educational production function, allowing for school heterogeneity as well as possibly endogenous variation in idiosyncratic spending. I prefer the (structural) production function approach for several reasons, some having to do with the nature of the data and the Proposal A experiment. First, any attempt to define control and treatment groups would be arbitrary in this application, since the foundation grant affected spending at all schools. Defining the treatment group to include districts where spending was below US$4200 in 1993/1994 (the lowest-spending group pre-reform in Michigan) is imperfect because the change in spending, especially as a percent of initial spending, can vary greatly even within this group. Even if we settle on a
10 Hanushek (1986) contains detailed discussions of the problems inherent in inferring causality when relating student outcomes to spending and specific education inputs.
set of control and treatment groups, we would not be able to easily summarize the effects of spending on pass rates because of differences in the changes in spending within the treatment group. Also, I use school-level data here to exploit the extra variation in spending and performance within school districts, whereas the foundation grant is determined at the district level. As I demonstrate in Section 4, while there is a strong relationship between school-level spending and the foundation grant (which makes the foundation grant a good instrument for spending), the amount of school spending can vary within a district as well as across time. The production function approach allows me to summarize the effects of spending very concisely, and loses none of the attractiveness of DD estimation. While I rely on a particular functional form, it is no more restrictive than the implied functional form in a DD analysis when the control and treatment groups are determined by an arbitrary spending cutoff. Mainly for purposes of comparison, I initially pool the data across school and time and estimate standard regression models by ordinary least squares (OLS). I model the student performance variable (a school pass rate) as a function of per-student spending and other observable controls. Aggregate time intercepts allow for secular changes in student performance and spending over time. (For example, aggregate time effects can capture changes in the difficulty level of the test over time.) The equation can be written as

$$Y_{it} = X_{it}\beta + \gamma S_{it} + T_t\theta + v_{it}, \qquad (1)$$

where $Y_{it}$ is the percent of students passing the MEAP math test at a satisfactory level, $X_{it}$ includes my student and school characteristics (school enrollment and percent of students eligible for the school lunch program), and $S_{it}$ is per-student spending at the school level. The vector $T_t$ contains dummy variables for each year, and $v_{it}$ is the unobserved disturbance for school $i$ at time $t$. Without many controls in $X_{it}$, however, I am unlikely to isolate the causal effect of spending on performance with this model. The Michigan School Report data do not contain many such controls, although the free-lunch variable essentially measures the poverty rate, and school enrollment allows for school size to affect achievement and be correlated with spending.11 I include the spending variable in logarithmic form to impose a diminishing effect of spending on performance. As it turns out, this functional form is not rejected by the data. Naturally, for high-spending districts where real per-student spending increased slowly over the period, the model predicts only small increases in test performance due to higher spending. Eq. (1) is written as if spending has only a contemporaneous effect on student performance, but that may be incorrect for this application. One reason is timing of the tests and spending variables: in Michigan, the tests for elementary students are administered early in the second semester, whereas the spending variable is the allocation
11 Note that this model assumes $v_{it}$ is uncorrelated with $S_{it}$ and $X_{it}$. Using pooled OLS, the disturbances $v_{it}$ are likely to contain substantial serial correlation. In Section 4, I report standard errors and test statistics that are robust to arbitrary serial correlation within school, as well as arbitrary heteroskedasticity in $v_{it}$.
for the entire school year. Plus, it makes sense that increases in recent spending (say, more resources spent on teachers when a child is in third grade) could help performance on a test taken in fourth grade. Of course, allowing lagged effects is costly in terms of data requirements, as every lag of spending included in the equation means one fewer year is available for estimation. In this study, I allow a 1-year lag by using an average of spending in the current and previous year.12 Since estimating Eq. (1) by pooled OLS assumes all school-level variables not contained in $X_{it}$ are uncorrelated with spending, $S_{it}$, a common second option is to exploit the repeatability in panel data by decomposing the disturbance into a school fixed effect and an idiosyncratic error that changes over time:

$$Y_{it} = X_{it}\beta + \gamma S_{it} + T_t\theta + \alpha_i + u_{it}, \qquad (2)$$

where $\alpha_i$ is the unobserved school effect and $u_{it}$ is the idiosyncratic error that changes across time for each school. Provided spending changes over time, we can estimate $\gamma$ while allowing for arbitrary correlation between $\alpha_i$ and $S_{it}$. Practically, this means that schools with historically high levels of student achievement, as captured in $\alpha_i$, are allowed to have systematically higher (or even lower) levels of spending. The presence of $\alpha_i$ also allows for differences in educational costs, geographical differences, historical differences among schools in parental involvement, and other factors that may move slowly over time. (Incidentally, if I had only 2 years of data, no controls $X_{it}$, and I replace the spending variable with a binary treatment indicator for low-spending schools, the estimate of $\gamma$ would be the difference-in-differences estimate.) Estimating Eq. (2) by fixed effects allows for correlation between spending and unobserved school heterogeneity. Nevertheless, spending could be correlated with the idiosyncratic error, $u_{it}$. I have only a crude control for family incomes, and it may be that incomes, or parental motivation, or other factors correlated with spending are still in the time-varying error term. Fortunately, the nature of the Michigan education finance reform provides a natural instrumental variable for spending beginning in the 1994/1995 school year: the district foundation grant. As described in Section 2, the amount of the foundation grant awarded to each school district is a deterministic function of district-level spending in 1993/1994 and growth in dedicated state revenues. As the growth in the School Aid Fund is determined at the state level, it is unrelated to idiosyncratic shocks that affect student performance in specific districts. (In particular, two districts with the same level of spending in 1993/1994 receive the same foundation grant in all subsequent years.)
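As a rough sketch of why fixed effects estimation of Eq. (2) tolerates correlation between the school effect and spending, the within (time-demeaning) transformation can be written out. Here $\beta$, $\gamma$ and $\theta$ are the coefficients on the controls, spending and year dummies in Eq. (2), $\alpha_i$ is the school effect, and $u_{it}$ is the idiosyncratic error. The overbars (averages over time for school $i$), the low-spending indicator $D_i$ and the post-reform indicator $P_t$ are notation introduced here for illustration only.

```latex
% Within transformation of Eq. (2): subtracting school-level time averages removes alpha_i.
\begin{align*}
Y_{it} - \bar{Y}_i
  &= (X_{it} - \bar{X}_i)\beta + \gamma\,(S_{it} - \bar{S}_i)
     + (T_t - \bar{T})\theta + (u_{it} - \bar{u}_i), \\
% Special case: two years, no controls, and S_{it} replaced by D_i P_t.
\hat{\gamma}
  &= \bigl(\overline{\Delta Y}\bigr)_{D=1} - \bigl(\overline{\Delta Y}\bigr)_{D=0}.
\end{align*}
```

The first line shows that only within-school variation in spending identifies $\gamma$. The second line is the difference-in-differences special case mentioned in the text: the change in the mean pass rate for initially low-spending schools minus the change for the remaining schools.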
My identification strategy is to control for the 1993/1994 level of spending in the pass rate equation, along with aggregate time effects, and treat the grant awarded to each district as exogenous in the math pass rate equation. It seems reasonable that initial spending affects student performance in a smooth manner, whereas the grant, as described in Section 2, is a non-smooth function of initial spending based on essentially arbitrary cutoff points. In the
12 In an earlier version of this paper, I allowed current and lagged spending to have separate effects but the results are quantitatively similar and the models fit no better.
first year, for example, 20% of the districts were below the floor of US$4200 in 1994, and these were all brought up to the minimum of US$4200 in the first year of reform. The nonlinearity at this lower range is sharp, since very low spending districts were raised to the same level as those just under US$4200. Districts at the high end were effectively capped because most would have spent more had they not been subject to the cap. During the remainder of the sample period, there is another important source of nonlinearity relating the foundation grant to initial-year spending, and this arises from the difference between the minimum foundation and the basic foundation.13 In 1995, the basic foundation was US$5000 (compared with the guaranteed minimum of US$4200). More than 55% of schools were receiving less than US$5000. Those districts received twice as much of an increase in the foundation grant as districts above US$5000. As the years passed, more districts hit the basic foundation (which grew over time), but those still below it received higher increases than those above it. The nonlinearity in later years is a result of the rule about which districts receive the standard increase in the foundation and which districts receive twice the increase. Since the foundation grant is awarded at the district level, the foundation amount is the same for each school within a district. Using a district-level instrumental variable in a school-level equation makes the exogeneity assumption for the foundation grant plausible. Therefore, my preferred approaches are to estimate Eq. (1) by instrumental variables (with initial-level spending included as a control) and Eq. (2) by fixed-effects instrumental variables, using the foundation grant as the instrument for spending in both cases. I also include some robustness checks in Section 5. In the next section I present some summary statistics on the path of district spending and test scores before and after Proposal A.
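The fixed-effects instrumental variables idea described above can be illustrated on synthetic data. The sketch below is not the paper's code or data: every variable name, the data-generating process, and the true coefficient of 1.0 are assumptions made for illustration. It uses a balanced panel, one endogenous regressor, one instrument, and no additional controls, and it reports point estimates only (no robust standard errors).

```python
import numpy as np


def within(x, groups):
    """Subtract each group's mean from x (the 'within' transformation)."""
    sums = np.bincount(groups, weights=x)
    counts = np.bincount(groups)
    return x - (sums / counts)[groups]


def demean_two_way(x, school, year):
    """Remove school and year means; exact for a balanced panel."""
    return within(within(x, school), year)


def fe_ols_slope(y, s, school, year):
    """Fixed-effects OLS slope of y on s with school and year effects."""
    y_dd = demean_two_way(y, school, year)
    s_dd = demean_two_way(s, school, year)
    return (s_dd @ y_dd) / (s_dd @ s_dd)


def fe_iv_slope(y, s, z, school, year):
    """Just-identified fixed-effects IV slope, instrumenting s with z."""
    y_dd = demean_two_way(y, school, year)
    s_dd = demean_two_way(s, school, year)
    z_dd = demean_two_way(z, school, year)
    return (z_dd @ y_dd) / (z_dd @ s_dd)


def simulate_and_estimate(seed=0, n_schools=2000, n_years=6, gamma=1.0):
    """Simulate a panel where spending is endogenous; return (FE-OLS, FE-IV)."""
    rng = np.random.default_rng(seed)
    n_obs = n_schools * n_years
    school = np.repeat(np.arange(n_schools), n_years)
    year = np.tile(np.arange(n_years), n_schools)

    alpha = rng.normal(size=n_schools)[school]  # school effect
    grant = rng.normal(size=n_obs)              # hypothetical exogenous instrument
    shock = rng.normal(size=n_obs)              # unobserved; moves spending AND pass rates
    spending = 0.8 * grant + alpha + shock
    pass_rate = (gamma * spending + 2.0 * alpha + 0.3 * year
                 + 2.0 * shock + rng.normal(size=n_obs))

    ols = fe_ols_slope(pass_rate, spending, school, year)
    iv = fe_iv_slope(pass_rate, spending, grant, school, year)
    return ols, iv


if __name__ == "__main__":
    ols, iv = simulate_and_estimate()
    print(f"FE-OLS slope: {ols:.3f}  FE-IV slope: {iv:.3f}  (true value 1.0)")
```

In this simulation the fixed effects remove the school effect, but the time-varying shock still biases the FE-OLS slope; using the instrument's within-school variation recovers a value near the assumed coefficient. In applied work a tested panel IV routine with cluster-robust inference would be preferable to this hand-rolled estimator.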
4. Data description and summary statistics
The original Michigan legislation authorizing the establishment of educational standards and assessment of academic achievement dates from the 1970s. The legislation initially required tests administered at the elementary and junior high level. Subsequently, an optional high school series was added that has now become mandatory as well. The mathematics test for 4th and 7th graders was first administered in 1991. The percentage of students achieving a satisfactory level of performance in each school and district on these tests (the pass rate) is widely reported. Students performing well on the high school tests are eligible for US$2500 to spend on post-high school education administered through the Michigan Merit Award Program. The Michigan School Reports (MSRs) contain data at the school level and the district level. The MSRs contain test pass rates for several subjects and grade levels. In the past decade, students have been tested in math, reading, and science. The elementary school tests are given in 4th grade (math and reading) and 5th grade (science). Math and reading tests are also given in the 7th and 10th grades, and science is administered in the 8th and 11th grades.
13 In 2000, both the minimum and basic foundation amounts were US$5700.
Table 1
Percentage of students with satisfactory performance on the 4th grade math test

Percentile   1992   1993   1994   1995   1996   1997   1998
10th         21.4   26.5   31.5   44.5   43.2   40.0   58.0
25th         28.4   34.8   39.9   53.2   52.1   50.0   67.0
50th         36.3   42.9   48.6   62.4   62.7   61.0   76.0
75th         44.8   51.0   58.9   70.3   72.5   70.0   84.0
90th         53.5   59.3   67.6   78.5   81.1   78.7   91.0

Percentiles for school years 1991/1992–1997/1998.
Because there are many more elementary schools than either middle schools or high schools, the elementary school data seem best suited for studying the causal effects of spending on student performance. In addition, a significant spending change should be relatively more important for younger students. We might expect the effects for elementary students to be larger than those for older students. In this paper, I focus on the 4th grade math test pass rate, which has been reported consistently over the period.14 The change in school funding begins in the 1994/1995 school year.15 Table 1 contains pass rate percentile breakdowns of the percentage of 4th graders performing satisfactorily on the MEAP math test; subsequently, I refer to the pass rate as math4. For every percentile, the pass rate increased every year with the exception of the 1996/1997 school year. This improvement is evident in the 10th, 25th, 50th, 75th, and 90th percentiles. For example, in the 1991/1992 school year 21.4% of the lowest 10th percentile of 4th graders performed satisfactorily. By 1997/1998, this percent had risen to 58.0% of students. For students in the 50th percentile, the percentage of students passing rose from 36.3 to 76.0, and for the top 90th percentile, the percentage passing rose from 53.5% to 91.0%.16 Table 2 provides spending percentile breakdowns for real annual per pupil expenditures at the district level.17 Average real expenditures per pupil have risen every year and in each percentile for all districts combined. The lower percentiles increased the most in percentage terms. For example, in the 10th percentile, expenditures rose from US$3839.19 (1997 dollars) in 1991/1992 to US$5025.72 in 1997/1998, a 31% increase. In the 50th
14 Prior to 1994/1995, the reading test consisted of two components, and two pass rates are reported. After that, a single pass rate is reported. Unfortunately, this consolidation coincides with the change in how the schools were funded. For the science test, the method of reporting the pass rates changed in 1995/1996: only the percent of students performing at a high level is reported starting in 1995/1996. This makes comparing statistics across all years difficult, and could also pose problems for the econometric analysis. See Papke (2000) for further analysis.
15 All of the data used in this and the following sections can be obtained from the web site for the Michigan Department of Education: www.mde.state.mi.us.
16 An earlier version of this paper contains the 7th grade math results. The pattern is similar, although the overall pass rates are somewhat lower.
17 Because the foundation grants are awarded at the district level, I summarize spending patterns using district-level spending data. Plus, district-level spending data are available starting in 1991/1992, while the school-level spending data are available starting in 1992/1993. For the econometric analysis in Section 5, I use school-level data in order to exploit a much larger sample size as well as variation in spending within each school district. In addition, schools within a district can exhibit heterogeneity that can differentially affect student performance.
L.E. Papke / Journal of Public Economics 89 (2005) 821–839
831
Table 2 District-level real expenditures per pupil 1992 1993 1994 1995 1996 1997 1998
10th
25th
50th
75th
90th
3839.19 3577.87 3957.27 4646.41 4758.74 4890.50 5025.72
4109.17 4143.01 4461.94 5068.55 5186.07 5308.00 5425.49
4435.20 4537.32 4846.41 5434.78 5576.07 5728.50 5776.03
5080.40 5172.65 5523.28 6104.06 6194.95 6333.00 6332.37
6182.05 6192.08 6523.97 7052.21 7222.30 7217.90 7246.33
Percentiles for school years 1991/1992–1997/1998 (figures in 1997 dollars).
percentile, per pupil expenditures rose from US$4435.20 to US$5776.03 in 1997/1998, also a 31% increase. In the 90th percentile, per pupil expenditures rose from US$6182.05 in 1992/1993 to US$7246.33 in 1997/1998, a 17% increase. The largest increases occur between the 1993/1994 and 1994/1995 school years: 17% in the 10th percentile, 12% in the 50th percentile, and 8% in the 90th percentile. As desired, the reform increased average real spending generally and also reduced the coefficient of variation. Table 3 displays the means and standard deviations of real annual per pupil district expenditures along with the annual coefficient of variation. On average, spending increased by over 26% between 1991/1992 and 1997/1998, while the coefficient of variation fell from 0.261 to 0.192. The average increase in spending between the 1993/ 1994 and 1994/1995 school years is about 13%. We want to examine the link between spending and performance, so an interesting question is to ask what happened to spending over time based on pre-reform spending levels. Table 4 presents real expenditures per pupil by 1991/1992 spending percentile. In the first year of reform (1994/1995) real expenditure per pupil increased by an average of US$973 in the lowest spending decile, by US$701 between the 10th and 25th percentiles, by US$642 between the 25th and 50th percentiles, and by US$379 in the highest spending decile. Real spending continued to increase in subsequent years, but at a much lower rate. For example, the increases in spending in 1997/1998 fell to US$550 for the lowest decile, although this increase was well above what the schools were experiencing pre-reform. 
The drop off in spending increases at the high end is even more dramatic; the average real spending increase in 1997/1998 for the highest decile is US$75.[18]

In Table 5, I present a simple difference-in-differences estimate comparing the growth rates for a broader grouping of districts: those with spending below the median in 1991/1992 and those with spending above the median. Before the finance reform, real expenditures by low-spending districts grew at an average rate of 5.30%, while those above the median grew at 3.27%. In the year of the reform, both groups experienced a huge jump in expenditures, with low-spending districts spending 21.11% more and high-spending districts spending 10.27% more. This is evidence of equalization in the first year of the reform, since the change in the growth rate for low-spending districts exceeded that for high-spending districts by 8.81 percentage points.[19]

[18] There was no increase in the districts' foundation grant in 1998/1999.
[19] This difference-in-differences estimate is statistically significant, with a t-statistic equal to 5.0.
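The difference-in-differences figure can be re-derived from the growth rates reported in Table 5. The following is a minimal sketch; the dictionary keys and the helper name `change_in_growth` are illustrative choices, not names used in the paper.

```python
# Average annual growth rates (percent) of real per-pupil expenditure,
# copied from Table 5. Key names are illustrative, not the paper's.
growth = {
    "below_median": {"before": 5.30, "reform_year": 21.11},
    "above_median": {"before": 3.27, "reform_year": 10.27},
}


def change_in_growth(group, start, end):
    """Change in a group's growth rate between two periods (percentage points)."""
    return growth[group][end] - growth[group][start]


# Difference-in-differences: the jump in growth for initially low spenders
# minus the jump in growth for initially high spenders.
did_reform_year = (
    change_in_growth("below_median", "before", "reform_year")
    - change_in_growth("above_median", "before", "reform_year")
)

print(f"DiD in the reform year: {did_reform_year:.2f} percentage points")
```

The result is a difference between two changes in percentage growth rates, so it is naturally expressed in percentage points.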
Table 3
Means and standard deviations of expenditure per pupil, 1991/1992–1997/1998

Year    Average expenditure per pupil    Coefficient of
        (standard deviation)             variation
1992    4739.86 (1237.81)                0.261
1993    4730.11 (1301.78)                0.275
1994    5036.40 (1321.82)                0.262
1995    5674.03 (1339.76)                0.236
1996    5792.15 (1176.45)                0.203
1997    5899.93 (1115.77)                0.189
1998    5979.16 (1150.89)                0.192

Figures in 1997 dollars.
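As a quick check on Table 3, the coefficient of variation is simply the standard deviation divided by the mean. The sketch below recomputes it from the published means and standard deviations; the variable names are illustrative.

```python
# (mean, standard deviation) of real per-pupil district expenditure, from Table 3.
table3 = {
    1992: (4739.86, 1237.81),
    1993: (4730.11, 1301.78),
    1994: (5036.40, 1321.82),
    1995: (5674.03, 1339.76),
    1996: (5792.15, 1176.45),
    1997: (5899.93, 1115.77),
    1998: (5979.16, 1150.89),
}

# Coefficient of variation = standard deviation / mean.
cv = {year: sd / mean for year, (mean, sd) in table3.items()}

# Cumulative growth in mean spending over the whole period, in percent.
growth_92_98 = 100 * (table3[1998][0] / table3[1992][0] - 1)

for year in sorted(cv):
    print(year, round(cv[year], 3))
print(f"Growth in mean spending, 1992 to 1998: {growth_92_98:.1f}%")
```

Because the coefficient of variation is unit-free, it is a convenient summary of dispersion when mean spending is itself changing over time.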
After 1994/1995, average spending growth per year is actually lower than it was pre-1994/1995 (Table 5). For districts initially spending below the median, the annual growth rate averages 3.08%, while the average is 1.41% for higher spending districts. The difference, 1.67 percentage points, is statistically significant (with a t-statistic of 3.66). Compared with pre-1994/1995 differences in growth rates, the differential between low-spending and high-spending districts is actually smaller after 1994/1995. Therefore, the amount of additional catching up that occurs after the first year of reform for districts that were low spenders in 1992 is more modest.

Did this finance change improve student performance? A crude indication is provided in Table 6, where district-level math pass rates are presented for initial spending percentiles as in Table 4. While pass rates increased generally for all spending percentiles over time, the lowest spending percentile with the lowest pass rate initially also had the largest increase in the pass rate in the first year of reform: 15.7 percentage points. The increase in the pass rate in the highest decile for the 1994/1995 school year was 11.47 percentage points.
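The first-year changes in pass rates quoted above can be read off Table 6. The following sketch recomputes the 1993/1994 to 1994/1995 change for every initial spending group; the group labels and variable names are illustrative, and the figures are transcribed from Table 6.

```python
years = [1992, 1993, 1994, 1995, 1996, 1997, 1998]

# District math pass rates (percent) by 1991/1992 spending percentile, from Table 6.
pass_rates = {
    "<10th":     [40.13, 41.55, 46.65, 62.35, 59.56, 50.40, 70.38],
    "10th-25th": [33.70, 39.92, 45.48, 57.87, 58.12, 57.83, 71.71],
    "25th-50th": [36.21, 42.85, 47.90, 60.68, 60.22, 57.37, 75.32],
    "50th-75th": [38.05, 43.68, 51.86, 64.04, 64.65, 62.82, 76.43],
    "75th-90th": [38.16, 45.43, 50.61, 62.76, 64.28, 62.72, 73.85],
    ">90th":     [43.76, 47.01, 54.89, 66.36, 69.67, 64.55, 76.71],
}

i94, i95 = years.index(1994), years.index(1995)

# Change in the pass rate in the first year of reform, in percentage points.
first_year_change = {
    group: rates[i95] - rates[i94] for group, rates in pass_rates.items()
}
largest_gain = max(first_year_change, key=first_year_change.get)

for group, change in first_year_change.items():
    print(f"{group:>10}: {change:5.2f}")
print("Largest first-year gain:", largest_gain)
```

This is only a descriptive comparison; it does not control for anything else that changed between the two school years.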
Table 4
Average real expenditure per pupil by 1991/1992 spending percentile

Year    <10th      10th–25th  25th–50th  50th–75th  75th–90th  >90th
1992    3152.98    3994.40    4266.12    4721.75    5561.47    7441.85
1993    3297.88    4128.55    4352.19    4762.16    5629.65    7350.56
1994    3592.62    4391.79    4666.06    5117.77    5857.74    7565.08
1995    4571.96    5092.51    5308.39    5699.88    6449.82    7944.38
1996    4872.54    5232.56    5418.54    5798.02    6526.66    7985.25
1997    4976.03    5407.52    5566.28    5895.10    6614.12    7956.31
1998    5122.27    5501.08    5654.43    5908.25    6659.79    8019.52

Figures in 1997 dollars.

Table 5
Growth rates in real expenditure by spending in 1991/1992

               Spending below median in 1992    Spending above median in 1992
Before 1994    5.30 (10.81)                     3.27 (11.57)
1995           21.11 (32.92)                    10.27 (14.13)
After 1995     3.08 (10.39)                     1.41 (8.50)

Average annual growth rates, percent. Standard deviations in parentheses.

5. Econometric findings

In this section I estimate equations relating the 4th grade math pass rate (math4) on the MEAP exams to per pupil spending and a few other controls using data at the individual school level. Column (1) of Table 7 presents pooled OLS estimates that do not remove an unobserved school effect. As I mentioned in Section 4, I use the log of per pupil spending averaged over the current and previous year. Thus, to obtain the percentage point change in math4 given a 10% increase in spending, we divide the coefficient on the spending variable by 10. The percent of students eligible for the school lunch program (lunch) and the natural log of school enrollment appear in quadratic form to allow for diminishing or increasing effects. I also include, but do not report, aggregate year effects.

In the pooled OLS estimates, a 10% increase in average spending is estimated to increase the pass rate by 0.84 percentage points. The fully robust t-statistic that allows for serial correlation and heteroskedasticity is quite large (t = 4.90). It seems plausible that more resources devoted to children in 3rd and 4th grade could improve performance on a 4th grade test.[20] Not surprisingly, the coefficient on lunch is practically large and very statistically significant: ignoring the quadratic term, a one percentage point increase in the percent of students eligible for the school lunch program leads to a drop of more than half a point in the pass rate. The quadratic indicates a very small diminishing effect.

Column (2) of Table 7 contains the fixed effects estimates. The estimated spending effect is slightly smaller: a 10% increase in average spending is estimated to increase the pass rate by about 0.72 percentage points. Not surprisingly, the fully robust t-statistic is substantially smaller, t = 2.45, but still statistically significant (p-value ≈ 0.015 against a two-sided alternative). Once the unobserved school effects are controlled for, lunch, enroll, and their squares become much less significant, both practically and statistically; the fully robust p-value for joint significance is 0.404.
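The "divide by 10" rule for a model that is linear in log spending can be made concrete. The sketch below uses the spending coefficients and fully robust standard errors from columns (1) and (2) of Table 7; the function name `effect_of_increase` is illustrative. Dividing by 10 relies on the first-order approximation log(1.10) ≈ 0.10, so the exact implied change is slightly smaller.

```python
import math

# Spending coefficients and fully robust standard errors from Table 7.
estimates = {
    "pooled_ols": (8.442, 1.723),
    "fixed_effects": (7.179, 2.934),
}


def effect_of_increase(beta, pct):
    """Exact change in math4 (percentage points) when spending rises by
    pct percent, for a model that is linear in log(spending)."""
    return beta * math.log(1 + pct / 100)


summary = {}
for name, (beta, robust_se) in estimates.items():
    summary[name] = {
        "exact_10pct": effect_of_increase(beta, 10),
        "rule_of_thumb_10pct": beta / 10,  # uses log(1.10) ~ 0.10
        "robust_t": beta / robust_se,
    }
    print(name, {k: round(v, 2) for k, v in summary[name].items()})
```

The robust t-statistics computed this way agree with those quoted in the text up to rounding.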
It is not too surprising that controlling for school effects diminishes the importance of poverty rates and enrollments, as these vary much more across schools than within a school over a relatively short time period.

With a relatively short panel, allowing for school fixed effects might account for most of the endogeneity of spending. Nevertheless, there are two avenues for spending to be endogenous after netting out a school effect. Consistency of fixed effects requires that spending is strictly exogenous after accounting for the school fixed effect.[21] But if future movements in spending depend on current, unexplained changes in test performance, $S_{i,t+1}$ and $u_{it}$ are correlated in Eq. (2). This kind of feedback effect seems unlikely, but possible, because districts have some latitude in allocating school-level spending. As a test of strict exogeneity, I add next year's spending (in logarithmic form) to the fixed effects estimation and use a robust t-test. (I lose the last year of the data in doing this.) The coefficient on the lead of spending is very small, 0.116, and its t-statistic is only 0.04. Therefore, using this simple test, there is no evidence against strict exogeneity of spending after controlling for school heterogeneity.

[20] Recall, the math test is administered early in the spring term, and so the amount of spending for the current school year is not entirely relevant.
[21] That is, in Eq. (2), spending at time t, $S_{it}$, must be uncorrelated with the idiosyncratic errors, $u_{ir}$, in all time periods r.

Table 6
District pass rate by 1991/1992 spending percentile

Year    <10th    10th–25th  25th–50th  50th–75th  75th–90th  >90th
1992    40.13    33.70      36.21      38.05      38.16      43.76
1993    41.55    39.92      42.85      43.68      45.43      47.01
1994    46.65    45.48      47.90      51.86      50.61      54.89
1995    62.35    57.87      60.68      64.04      62.76      66.36
1996    59.56    58.12      60.22      64.65      64.28      69.67
1997    50.40    57.83      57.37      62.82      62.72      64.55
1998    70.38    71.71      75.32      76.43      73.85      76.71

A second possibility is that spending is contemporaneously correlated with time-varying, school-level idiosyncratic variables that affect the pass rate; that is, $S_{it}$ is correlated with $u_{it}$ in Eq. (2). As explained earlier, the Michigan education finance reform provides a potential instrumental variable for spending beginning in the 1994/1995 school year: the foundation grant awarded to each school district. (The use of instrumental variables also solves any problems due to feedback effects of the kind described above.) Using the subset of data starting in 1994/1995, I add to the controls in column (1) of Table 7 spending in the 1993/1994 school year and, for flexibility, interactions of the initial level of spending with a full set of year dummies. I use four instruments for average spending: the log of the foundation grant and the log of the grant interacted with dummy variables for the 1995/1996, 1996/1997, and 1997/1998 school years. A fully robust F-test for joint significance of the instruments in the first-stage regression has a p-value of 0.0000.

The pooled 2SLS estimates of the effects of spending are reported in column (3) of Table 7. The estimated effect of spending is much larger than the pooled OLS or fixed effects estimates. Now, a 10% increase in spending is predicted to increase the pass rate by 2.2 percentage points. Not surprisingly, the fully robust 95% confidence interval for the 2SLS coefficient is fairly wide, from 11.40 to 32.92, but it excludes the OLS and fixed effects estimates.
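To illustrate how OLS and instrumental variables estimates can diverge when a regressor is correlated with the error term, here is a minimal simulation on synthetic data. It is only a sketch under stated assumptions: one regressor, one instrument (in which case 2SLS reduces to the simple IV ratio), and made-up parameter values. None of the numbers come from the Michigan data, and the paper's specification additionally includes controls and four instruments.

```python
import random


def cov(a, b):
    """Sample covariance of two equal-length sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (len(a) - 1)


rng = random.Random(0)   # fixed seed so the sketch is reproducible
n = 20000
true_beta = 2.0          # hypothetical "true" spending effect

z = [rng.gauss(0, 1) for _ in range(n)]   # instrument (stand-in for the grant)
e = [rng.gauss(0, 1) for _ in range(n)]   # spending variation unrelated to z
spend = [0.8 * zi + ei for zi, ei in zip(z, e)]

# The error is negatively related to e, so spending is endogenous,
# while the instrument z is unrelated to the error by construction.
u = [-0.5 * ei + rng.gauss(0, 1) for ei in e]
y = [1.0 + true_beta * si + ui for si, ui in zip(spend, u)]

beta_ols = cov(spend, y) / cov(spend, spend)   # biased toward zero here
beta_iv = cov(z, y) / cov(z, spend)            # consistent for true_beta

print(f"OLS slope: {beta_ols:.3f}")
print(f"IV slope:  {beta_iv:.3f}")
```

In this design the OLS slope falls below the true value while the IV slope is close to it, which mirrors the direction of the gap between the OLS and 2SLS columns of Table 7 without saying anything about its cause in the actual data.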
To formally test the null hypothesis that school-level spending is exogenous in column (3) of Table 7, I use a fully robust Hausman test that compares pooled OLS and pooled 2SLS estimates. (That is, the pooled OLS estimation includes the extra controls in column (3).) The test strongly rejects exogeneity (p-value = 0.001), suggesting that spending is endogenous and that the IV procedure is needed for consistent estimation. In addition, the pooled OLS coefficient on spending is 6.22, which is substantially smaller than the 2SLS estimate.

Because the pooled 2SLS estimation uses three overidentifying restrictions, I can test these to see whether the foundation grant fails my exogeneity assumption. The Hausman test gives p-value = 0.098, which is only weak statistical evidence against the overidentifying restrictions. More importantly, the single-IV spending estimate that drops the interactions of the foundation variable with the year dummies is very similar to that in column (3) of Table 7: 21.44 with robust t-statistic = 3.72. I conclude that my IVs pass the exogeneity test.

Table 7
Pooled OLS, fixed effects, and instrumental variables estimation: percentage of students with a satisfactory score on the 4th grade math test (math4)

                                (1)           (2)           (3)           (4)
                                Pooled OLS    Fixed effects IV            FE–IV
log(average per pupil expend.)  8.442         7.179         22.159        37.310
                                (1.189)       (2.156)       (4.042)       (16.318)
                                [1.723]       [2.934]       [5.483]       [15.639]
lunch                           0.575         0.153         0.595         0.091
                                (0.027)       (0.080)       (0.035)       (0.101)
                                [0.042]       [0.111]       [0.053]       [0.130]
lunch^2                         0.0017        0.0012        0.0019        0.00061
                                (0.00029)     (0.00069)     (0.00038)     (0.00084)
                                [0.00051]     [0.0010]      [0.00066]     [0.0011]
log(enroll)                     4.992         14.052        16.736        15.502
                                (4.863)       (15.890)      (7.032)       (38.964)
                                [8.052]       [20.325]      [10.968]      [47.185]
[log(enroll)]^2                 0.294         0.998         1.365         0.833
                                (0.428)       (1.382)       (0.615)       (3.107)
                                [0.708]       [1.754]       [0.965]       [3.810]
log(per pupil exp. 1994)        –             –             13.506        –
                                                            (3.922)
                                                            [4.872]
y96*log(per pupil exp. 1994)    –             –             10.758        13.816
                                                            (3.671)       (4.410)
                                                            [2.914]       [4.015]
y97*log(per pupil exp. 1994)    –             –             11.693        14.570
                                                            (3.750)       (5.039)
                                                            [3.019]       [4.658]
y98*log(per pupil exp. 1994)    –             –             6.325         9.464
                                                            (3.724)       (5.432)
                                                            [3.508]       [5.243]
School-level fixed effects      No            Yes           No            Yes
Obs.                            7242          7242          4853          4853
R-squared                       0.402         0.362         0.355         0.217

(i) Data used in columns (1) and (2) are for the 1992/1993–1997/1998 school years and 1771 elementary schools. Data used in columns (3) and (4) are from the 1994/1995–1997/1998 school years. All regressions include separate year intercepts.
(ii) The variable "lunch" is the percentage of students eligible to participate in the school lunch program at no cost or reduced cost.
(iii) Quantities in parentheses are the usual OLS standard errors. Quantities in brackets are the standard errors robust to arbitrary within-school serial correlation and arbitrary heteroskedasticity.
(iv) For fixed effects methods, the R-squareds are net of school fixed effects.
(v) In columns (3) and (4), the instrumental variables for the spending variable are the log of the foundation grant and the log foundation grant interacted with the 3 year dummies.

That the 2SLS estimate is statistically greater than the previous estimates suggests a negative correlation between spending at time t and $u_{it}$. One scenario where this could occur is if the district, knowing something about the group of students coming up within
each school in its district, spends more on schools whose children are predicted to perform relatively poorly on the MEAP exams. Classical measurement error in (the log of) spending would also lead to a negative correlation between $S_{it}$ and $u_{it}$, and IV estimation can alleviate this problem, too.

A different interpretation of the difference between 2SLS and OLS estimates comes from the recent literature on estimating average treatment effects. For a binary treatment and binary instrumental variable, Imbens and Angrist (1994) define the notion of a "local average treatment effect." Imbens and Angrist show that, under weak assumptions, the IV estimator consistently estimates the treatment effect for those population units induced into treatment by a change in the instrumental variable. In the current application, neither the "treatment" (spending) nor the instrument (the foundation grant) is binary, but we might think that the 2SLS estimator puts more weight on the effect of spending for schools induced to spend relatively more because of increases in the foundation grant. These are exactly the schools that were low spending prior to 1993/1994.

In addition to controlling for spending in 1993/1994 as a way of making the foundation grant exogenous in the pass rate equation, I can combine a fixed effects model with instrumental variables estimation. The presence of a time-constant school effect allows for initial spending to affect current performance, and it allows for other school-level heterogeneity that might be correlated with the foundation grant. The equation is estimated by time-demeaning all variables and then using 2SLS on the time-demeaned data. The fixed effects–instrumental variables results are given in column (4) of Table 7. The estimated effect of spending is now even larger: a 10% increase in spending is estimated to increase the pass rate by 3.73 percentage points (with a fully robust p-value of 0.017). Not surprisingly, combining fixed effects with IV results in estimates that, while statistically significant at standard significance levels, are somewhat imprecise: the 95% confidence interval for the effect of a 10% increase in average spending is 0.66 to 6.80 percentage points, and it contains the estimated effects from all previous approaches.

The econometric results using the full set of Michigan elementary schools are broadly consistent with the notion that increased spending can improve student performance, and the estimates based on IV are substantive (if imprecise). One potential limitation of the analysis is that it pools schools that begin with fairly low performance with those that are always high performers.[22] To see why this pooling might be undesirable, consider an elementary school that has an 80% pass rate on the math4 exam at the beginning of the sample period, 1992/1993. It may be difficult for such a school to increase its pass rate compared to a school with a 30% pass rate in 1992/1993. This criticism is partly handled by the fixed effects estimation, because each school has its school-specific unobserved effect that includes historical factors that cause some schools to be better than others. However, because the pass rates are necessarily capped at 100, the linear models may not adequately capture the effect of spending on pass rates throughout a wide range of pass rates.

[22] Up to this point, I have assumed that a 1% increase in spending always has the same percentage point effect on the pass rate. To see if I missed any obvious nonlinearities in the effects of spending, I estimate column (2) of Table 7 adding the square of average spending. The squared term is highly collinear with the level, and neither is statistically significant. In fact, the robust test of joint significance gives p-value = 0.77. I conclude there are no obvious nonlinearities that are missed by the model.
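The fixed effects–IV procedure described above (time-demean every variable within school, then apply IV to the demeaned data) can be sketched on a synthetic panel. Everything below is an assumption made for illustration: a single regressor, a single instrument, and invented parameter values in which the instrument is correlated with the school effect. It is not the paper's data or full specification.

```python
import random


def demean_by_group(values, groups):
    """Subtract each group's average (the within, or time-demeaning, transformation)."""
    totals, counts = {}, {}
    for v, g in zip(values, groups):
        totals[g] = totals.get(g, 0.0) + v
        counts[g] = counts.get(g, 0) + 1
    return [v - totals[g] / counts[g] for v, g in zip(values, groups)]


def iv_slope(y, x, z):
    """IV slope for demeaned data with one regressor x and one instrument z."""
    return sum(zi * yi for zi, yi in zip(z, y)) / sum(zi * xi for zi, xi in zip(z, x))


rng = random.Random(1)
n_schools, n_years, true_beta = 2000, 4, 2.0   # hypothetical design
school, y, spend, grant = [], [], [], []
for i in range(n_schools):
    a_i = rng.gauss(0, 1)                       # time-constant school effect
    for _ in range(n_years):
        z_it = 0.5 * a_i + rng.gauss(0, 1)      # instrument, correlated with a_i
        e_it = rng.gauss(0, 1)
        s_it = 0.8 * z_it + a_i + e_it          # spending
        u_it = -0.5 * e_it + rng.gauss(0, 1)    # idiosyncratic error
        school.append(i)
        grant.append(z_it)
        spend.append(s_it)
        y.append(true_beta * s_it + 3.0 * a_i + u_it)

# Pooled IV removes only the overall mean, so the school effect stays in the error.
everyone = [0] * len(y)
beta_pooled_iv = iv_slope(
    demean_by_group(y, everyone),
    demean_by_group(spend, everyone),
    demean_by_group(grant, everyone),
)

# FE-IV removes school-specific means first, which wipes out a_i.
beta_fe_iv = iv_slope(
    demean_by_group(y, school),
    demean_by_group(spend, school),
    demean_by_group(grant, school),
)

print(f"Pooled IV slope: {beta_pooled_iv:.3f}")
print(f"FE-IV slope:     {beta_fe_iv:.3f}")
```

In this synthetic design the pooled IV slope is biased because the instrument is correlated with the omitted school effect, while the FE–IV slope is close to the value used to generate the data. Whether and in which direction such a bias arises in practice depends on the application.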
I study differences between low- and high-performing schools by splitting the sample based on 1992/1993 pass rates, the earliest year for which the rates are available at the school level, and then applying the same models and estimation methods as in Table 7. Sample splitting on the basis of initial performance is also interesting for policy purposes since it may be desirable to target limited funds at low-performing schools.

The median school pass rate on the 4th grade math test in 1992/1993 was about 42%. Therefore, Table 8 reports the results of fixed effects, pooled 2SLS, and fixed effects–IV on two different samples: schools with math4 below 42.0 in 1992/1993, and schools with math4 above 42.0 in 1992/1993. For the low-performing group, the estimated spending effects are uniformly larger than for the higher-performing group. For example, for low-performing schools, the pooled 2SLS estimate of the effect of a 10% increase in spending on the predicted pass rate is about 3.17 percentage points (with robust t-statistic = 2.60); for the high-performing schools, the estimated effect is about 1.32 percentage points (robust t-statistic = 2.19).

Table 8
Fixed effects and instrumental variables estimation: split sample results

                                Initial performance below median    Initial performance above median
                                (1)         (2)         (3)         (4)         (5)         (6)
                                FE          IV          FE–IV       FE          IV          FE–IV
log(average per pupil expend.)  9.573       31.723      102.56      2.121       13.180      26.307
                                (3.536)     (9.086)     (38.01)     (2.636)     (4.081)     (16.635)
                                [4.566]     [12.186]    [43.00]     [3.764]     [6.024]     [16.014]
lunch                           0.056       0.519       0.102       0.160       0.552       0.189
                                (0.131)     (0.066)     (0.179)     (0.103)     (0.044)     (0.129)
                                [0.140]     [0.088]     [0.171]     [0.179]     [0.090]     [0.216]
lunch^2                         0.00047     0.0013      0.0012      0.00063     0.0036      0.0012
                                (0.00103)   (0.00063)   (0.0014)    (0.00112)   (0.00056)   (0.0013)
                                [0.00125]   [0.00091]   [0.0015]    [0.00214]   [0.00143]   [0.0023]
log(enroll)                     12.766      22.971      125.42      23.447      15.514      58.859
                                (34.789)    (12.088)    (63.86)     (16.739)    (8.128)     (53.794)
                                [41.522]    [22.435]    [65.13]     [26.651]    [10.487]    [85.416]
[log(enroll)]^2                 1.334       1.904       8.800       1.862       1.391       5.251
                                (2.942)     (1.053)     (4.920)     (1.486)     (0.722)     (4.365)
                                [3.507]     [1.924]     [5.006]     [2.260]     [0.956]     [6.973]
log(per pupil exp. 1994)        –           29.730      –           –           1.501       –
                                            (7.720)                             (4.177)
                                            [9.714]                             [5.608]
y96*log(per pupil exp. 1994)    –           19.015      36.640      –           5.590       8.704
                                            (6.736)     (11.238)                (3.966)     (4.239)
                                            [6.007]     [12.732]                [3.241]     [3.818]
y97*log(per pupil exp. 1994)    –           20.715      40.675      –           8.097       9.919
                                            (6.989)     (13.048)                (4.061)     (4.740)
                                            [6.760]     [14.692]                [3.329]     [4.453]
y98*log(per pupil exp. 1994)    –           15.405      37.335      –           2.586       4.575
                                            (6.884)     (13.351)                (4.026)     (5.288)
                                            [7.193]     [15.337]                [4.074]     [5.350]
School-level fixed effects      Yes         No          Yes         Yes         No          Yes
Obs.                            2280        2280        2280        2573        2573        2573
R-squared                       0.375       0.262       0.144       0.363       0.262       0.207

(i) The "below median" results are for schools with a 1992/1993 pass rate on the 4th grade math test below 42%.
(ii) See Table 7 for other notes.

The difference in FE–IV estimates between the low- and high-performing schools is even more striking. For low-performing schools, the effect of a 10% increase in spending is now a 10.26 percentage point increase in math4 (robust t-statistic = 2.38). This is a substantial point estimate, although the robust confidence interval is pretty wide. For the high-performing schools the effect is much smaller (but nontrivial) at 2.63 percentage points (robust t-statistic = 1.64).

The evident difference in estimated effects between the initially low-performing and initially high-performing schools lends support to policies that increase spending at low-performing schools relative to high-performing schools.
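The split-sample effects quoted above follow directly from the spending coefficients and robust standard errors in Table 8. The sketch below converts each coefficient into the effect of a 10% spending increase, a robust t-statistic, and an approximate 95% interval. The 1.96 critical value is an assumption (a normal approximation), and the dictionary layout is illustrative.

```python
# (coefficient on log spending, fully robust standard error) from Table 8.
table8 = {
    ("below_median", "IV"): (31.723, 12.186),
    ("above_median", "IV"): (13.180, 6.024),
    ("below_median", "FE-IV"): (102.56, 43.00),
    ("above_median", "FE-IV"): (26.307, 16.014),
}

CRIT = 1.96  # assumed normal critical value for an approximate 95% interval

summary = {}
for key, (beta, se) in table8.items():
    summary[key] = {
        # Dividing by 10 gives the approximate effect of a 10% spending increase.
        "effect_10pct": beta / 10,
        "robust_t": beta / se,
        "ci_10pct": ((beta - CRIT * se) / 10, (beta + CRIT * se) / 10),
    }

for (sample, method), row in summary.items():
    low, high = row["ci_10pct"]
    print(
        f"{sample:>12} {method:>5}: effect={row['effect_10pct']:.2f}, "
        f"t={row['robust_t']:.2f}, 95% CI=({low:.2f}, {high:.2f})"
    )
```

The t-statistics computed from the rounded table entries can differ from the quoted ones in the second decimal place, since the published figures were presumably computed from unrounded estimates.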
6. Conclusions and caveats

A variety of econometric methods suggest a positive effect of spending on pass rates for a math test administered to Michigan 4th graders, with the most convincing estimates implying the largest effects. A rough rule-of-thumb would be that 10% more real spending increases the pass rate by between one and two percentage points, and more for initially underperforming schools. It is worth noting that I am able to obtain statistically significant estimates of spending effects with fixed effects estimation, something that has eluded much of the prior research on the spending/performance relationship. This highlights the importance of Michigan's dramatic education finance reform for increasing estimation precision. (In an earlier version of this paper, I quantify the likely reduction in standard errors due to the passage of Proposal A as roughly 22%.)

This paper addresses several limitations of earlier work. The data I use here are at the school level, available in consecutive years, and include a performance measure for elementary school children. Schools are much more homogeneous than states, and I am able to control for school-level characteristics, as well as address the endogeneity of idiosyncratic changes in school spending (including possible measurement error) by using the district foundation grant as an instrumental variable.

Still, student composition changes every year, and so I am not able to control fully for unobserved differences in the students across years. The fraction of students eligible for the free-lunch program reflects one characteristic of the students each year, but other factors, such as parent involvement, are not captured. Availability of individual-level data sets, where students are tracked over time, may help refine the econometric analysis, although some kind of natural experiment may be needed to identify the spending effects.
Acknowledgements

I thank participants in the Federal Reserve Bank of Chicago Conference on Education, seminar participants from the Education Policy Center at Michigan State University, the Department of Economics at MSU, the Department of Economics at the University of Michigan, Jon Gruber, and an anonymous referee for helpful comments on earlier drafts.
References

Arsen, D., Plank, D.N., 2003. Michigan School Finance Under Proposal A: State Control, Local Consequences, working paper, The Education Policy Center at Michigan State University.
Casserly, M., 2002. Beating the Odds: II. A City-by-City Analysis of Student Performance and Achievement Gaps on State Assessments. Council of the Great City Schools, Washington, DC. June.
Cullen, J.B., Loeb, S., 2003. K-12 Education in Michigan. Michigan at the Millennium. Michigan State University Press, East Lansing, Michigan, pp. 299–321.
Cullen, J.B., Loeb, S., 2004. School finance reform in Michigan: evaluating proposal A. In: Yinger, J. (Ed.), Helping Children Left Behind: State Aid and the Pursuit of Educational Equity. MIT Press, Cambridge, MA, pp. 215–249.
Downes, T.A., 1992. Evaluating the impact of school finance reform on the provision of public education: the California case. National Tax Journal 45, 405–419.
Downes, T.A., Dye, R.F., McGuire, T.J., 1998. Do limits matter? Evidence on the effects of tax limitations on student performance. Journal of Urban Economics 43, 401–417.
Fisher, R.C., Papke, L.E., 2000. Local government responses to education grants. National Tax Journal 53, 153–168.
Fisher, R.C., Wassmer, R.W., 1995. Centralizing educational responsibility in Michigan and other states: new constraints on states and localities. National Tax Journal 48, 417–428.
Hanushek, E.A., 1986. The economics of schooling. Journal of Economic Literature 24, 1141–1177.
Hanushek, E.A., 1996. Measuring investment in education. Journal of Economic Perspectives 10, 9–30.
Hoxby, C.M., 2001. All school finance equalizations are not created equal. Quarterly Journal of Economics 116, 1189–1231.
Imbens, G., Angrist, J.D., 1994. Identification and estimation of local average treatment effects. Econometrica 62, 467–476.
Krueger, A.B., 1999. Experimental estimates of education production functions. Quarterly Journal of Economics 114, 497–532.
Murray, S.E., Evans, W.N., Schwab, R.M., 1998. Education-finance reform and the distribution of education resources. American Economic Review 88, 789–812.
Papke, L.E., 2000. Final Report: Michigan Applied Public Policy Research Project on K-12 School Finance, mimeo, Michigan State University, East Lansing, MI.
Yinger, J., 2004. State aid and the pursuit of educational equity: an overview. In: Yinger, J. (Ed.), Helping Children Left Behind: State Aid and the Pursuit of Educational Equity. MIT Press, Cambridge, MA, pp. 3–57.