EXPLORATIONS IN ECONOMIC HISTORY 22, 220-226 (1985)
The Parker-Gallman Sample and Wealth Distributions for the Antebellum South: A Comment*

DONALD SCHAEFER AND MARK SCHMITZ

Department of Economics, Washington State University, Pullman, Washington 99164-4860
* The authors appreciate the comments of the editor and two anonymous referees. The data collection was funded under NSF Grant SES-8006419.

0014-4983/85 $3.00 Copyright © 1985 by Academic Press, Inc. All rights of reproduction in any form reserved.

Few data sets have proved as rich as the Parker-Gallman (hereafter P-G) sample for the cotton South in 1859-1860. The sample has been indispensable for inquiries into agricultural self-sufficiency, specialization, and productivity, the slave trade, and the distribution of agricultural wealth. However, in a recent article in this Journal, Yang (1984, pp. 88-102) raises three major points regarding the use of the P-G sample for comparing southern and northern rural wealth inequality before the Civil War. First, he contends that “. . . starting the sampling from the agricultural schedules [of the manuscript census] made the [P-G] sample substantially truncated,” also biasing wealth inequality downward relative to measures derived from nontruncated samples (Yang, 1984, p. 90). Second, the P-G sample does not contain the variable “real wealth” although that variable was enumerated by the census. Yang argues that the use of the variable “cash value of farm” as a proxy for real wealth biases the measured wealth inequality downward. A third point raised by Yang relates to the statistical properties of the southern distribution of wealth. Yang suggests it is more fruitful to consider the wealth distributions for southern free and slave farms separately, noting that when this is done the two distributions are approximately lognormal but differ with respect to log mean and log variance. This suggests to Yang that these distributions were generated by different mechanisms. Further, he argues the differences are understated since wealth is measured by the proxy variable cash value of farm. In addition, he makes the subpoint that the distributions of wealth for southern free and northern farms were not significantly different, at least with respect to their log variances.

In this note we briefly argue that Yang’s first point deals for the most part with the nature of the population sampled rather than with the characteristics of the sample itself. That is, the sample is not truncated for the purposes for which it was initially designed. Then we show evidence concerning the effect of using “cash value of farm” as a proxy for real wealth and show the effect overall to be less than suggested by Yang. Finally, we examine Yang’s hypothesis of separate generating mechanisms for the distributions of wealth for free and slave farms in the South and his tests for lognormality and equality of variances.

1

Yang’s objective is to compare rural wealth distributions for the South and North. His first concern is that the P-G sample covers only the (substantial) cotton growing region, but his greatest misgiving is that the sample was drawn from the census agricultural manuscripts rather than from all rural households listed in the population schedules (as was done by Bateman and Foust for their northern sample).¹ This last characteristic of the sampled population is already well known (Foust and Swan, 1970, pp. 58-60; Foust, 1968, pp. 23-30, 206-218; Wright, 1970b, p. 98; Gallman, 1969, pp. 24-25), as are the sampling techniques. Furthermore, Yang’s question of “whether the sample represents well the population in question” (p. 90) has been answered in the affirmative for variables tabulated by the census.²

2

Although the real wealth of the farm owner (or operator) was reported on the free population schedule, the P-G study did not collect this information. Therefore previous studies that have used the P-G sample to estimate wealth inequality have defined total wealth as the sum of personal wealth (PW) and farm value (FV), the latter being a proxy for the uncollected real wealth (RW).
Yang finds that when this proxy is used for the farmer subset of the Bateman-Foust northern sample the estimated Gini coefficient for total wealth falls from 0.563 to 0.463.³ From this result he concludes “the 10 percentage points would be the approximate magnitude by which the Southern inequality coefficient could have been understated” (p. 95).

As a part of an ongoing project that traces the P-G farm population over time, RW has been added to a large subset (n = 2685) of the original P-G sample. Our first experiment uses the P-G farms located in the New South (Ala., Ark., La., Miss., Tenn., Tex.), eliminating some farms according to the criteria proposed by Fogel and Engerman.⁴ The Gini coefficient for total wealth using the FV proxy is 0.728 (which is identical to that for the full P-G sample) while the use of the correct variable RW raises the Gini to only 0.740. The explanation for the small difference is the high correlation between RW and FV (0.915) in this subset of the P-G sample.⁵ Thus Yang’s concern that multiple farm ownership, which was not taken into account in the P-G sample, would cause FV and RW to diverge appears to be unfounded.

A second experiment was performed using Schaefer’s matched sample for the New South in 1850. This sample, drawn from the P-G subset of 2685 farms, was formed by first matching the 1860 head of household to the 1850 population schedule and then to the 1850 agricultural and slave census manuscripts. This 1850 sample thus contains Yang’s A, B, C, and D type households and has no truncation of the type B (tenants) observations. We compared the distributions of FV and RW for the 1062 type A and B households within the matched sample. The resulting Gini coefficients for FV and RW are 0.739 and 0.778, respectively.⁶ Unfortunately, personal wealth was not collected for the 1850 Census, but it is clear that if personal wealth were added to FV and RW the difference in the Ginis would be less than that reported above.

¹ Yang’s guarded acceptance of the cotton South as representative of the rural South with respect to wealth inequality may well be premature. Neimi’s (1977, p. 753) homogeneity conclusion based upon slaveholding distributions published in the 1860 Census is contradicted by evidence from other samples (and perhaps by his own tabular evidence). It seems clear that agricultural wealth in the sugar and rice growing regions was less equally distributed than in the cotton South, while the food and tobacco regions of Kentucky and Tennessee had a more equal distribution of agricultural wealth than the cotton South (Schmitz, 1974, pp. 104-105; Swan, 1972, pp. 25-27; Schaefer, 1978, pp. 430-431).

² Foust and Swan note, “[o]n the basis of these [statistical] tests it was not possible to conclude that the [P-G] sample differed significantly from the universe from which it was selected.” Also see Foust (1968, pp. 32-50).

³ Only a part of the decline is attributable to the use of the proxy variable. As noted in his footnote to Table 3, Yang also truncates the sample of households, removing nearly 3000 with no reported real wealth. He does not report the Ginis for the two total wealth measures when the same subset of farms is used.
⁴ These criteria eliminate farms with: (a) more than three slaves but no male slaves, (b) no labor, (c) no improved acreage, (d) no FV, (e) no farm machinery, (f) no corn output, (g) insufficient corn equivalents to feed the working animal stock, or (h) no output. The Gini coefficients for total wealth computed with FV over the full P-G sample (n = 5228) and the subset of farms not excluded by the above criteria (n = 4299) are 0.728 and 0.715, respectively.

⁵ One of the referees has suggested the lower correlation between RW and FV in the Northern sample results from the greater importance of the nonfarm RW in the North.

⁶ The 1850 sample also gives insight into the effect of adding farm laborers and “farmers with real wealth that were not found in the agricultural census”. The Ginis using the broader samples were as follows:

Groups           FV Gini   RW Gini   N
A + B + C        0.765     0.798     1176
A + B + C + D    0.777     0.792     1247

For the sample of types A and B households the correlation between RW and FV is 0.845.
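The Gini comparisons in this section can be illustrated with a short computation. The sketch below is a minimal Python illustration and not the authors' procedure: the `gini` function uses the standard sorted-rank formula, and the six farm records (`fv`, `rw`, `pw`) are hypothetical values invented only to show how total wealth is formed as PW plus either FV or RW.

```python
def gini(values):
    """Gini coefficient of non-negative values (sorted-rank formula)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total <= 0:
        raise ValueError("need at least one positive value")
    # G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n, with ranks i = 1..n
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n


# Hypothetical farm records in dollars; NOT drawn from the P-G sample.
fv = [400, 900, 1500, 3000, 8000, 20000]   # cash value of farm
rw = [400, 900, 1800, 3000, 9500, 26000]   # real wealth of the operator
pw = [150, 300, 700, 2500, 9000, 30000]    # personal wealth

total_fv = [f + p for f, p in zip(fv, pw)]  # total wealth using the FV proxy
total_rw = [r + p for r, p in zip(rw, pw)]  # total wealth using RW itself

print("Gini, FV + PW:", round(gini(total_fv), 3))
print("Gini, RW + PW:", round(gini(total_rw), 3))
```

When RW and FV move closely together across farms, as the reported correlation of 0.915 suggests, the two total-wealth vectors rank and weight households in nearly the same way, which is why the two Ginis can be expected to lie close to one another.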
3
Yang’s third point is that the southern distribution appears to be the combination of two distinct lognormal distributions (slave and free farms). Further, he speculates that the use of FV instead of RW causes the difference between these two distributions to be understated. Finally, he notes the log variances for the southern free and northern farms are not significantly different.

On the question of lognormality, the chi-square statistics reported in Table 1 for the entire P-G sample as well as the matched subset replicate and substantiate Yang’s result that total wealth for slave farms was lognormally distributed. However, in all cases the lognormality hypothesis is rejected for free farms. We could of course take the view that statistical significance is less relevant than substantive importance and note that the chi-square statistics are quite a bit smaller for free farms than for the entire sample. However, the same point holds for the Wright soil-type region results that Yang (1984) cites as failing the lognormality test (p. 99, fn. 8), where the median significant chi-square statistic is 36.57 (Wright, 1970a, pp. 88-89).⁷

The lognormality conclusions and the differences in the slave and free distributions are largely unaffected by the use of the proxy variable FV. Within the P-G subset, Table 1 shows that the use of the proxy variable has little effect upon the (geometric) mean wealth for free farms and slave farms. The use of FV does significantly raise the log variance for free farms but the change in the log variance for slave farms is not significant. However, the result for free farms should be viewed with some caution given the significance of the chi-square statistic for free farms.⁸ More importantly, and not unexpectedly, the use of FV does not alter the conclusion that a significant difference exists between the variances of the southern free and slave farms’ total wealth distributions.

TABLE 1
Descriptive Statistics for Wealth Distributions Based on the P-G Sample

Group          Wealth variable   Chi squareᵃ   Geometric mean   Log variance   nᵇ

Entire P-G sample
Free farms     FV + PW            24.55**        1,185          0.805          2627
Slave farms    FV + PW            11.33         12,805          1.297          2586
All farms      FV + PW           246.36**        3,859          2.645          5213

P-G subset
Free farms     RW + PW            17.72*         1,238          0.663          1396
Slave farms    RW + PW            12.89         15,224          1.277          1260
Free farms     FV + PW            29.77**        1,309          0.799          1396
Slave farms    FV + PW             7.25         14,830          1.327          1260

ᵃ Test for lognormality using 10 deciles.
ᵇ Farms with no wealth were excluded from the calculations.
* Significant at the 0.05 level with 7 degrees of freedom.
** Significant at the 0.01 level with 7 degrees of freedom.

⁷ Yang (p. 99, fn. 8) is correct in noting that Wright experimented with the incomplete-beta distribution, which can take on a bimodal shape. However, in this instance the form of the incomplete-beta distribution that best fits the data in 85% of the cases was the J-shaped curve where “. . . the fitted curve declines smoothly from a high point at or near the Y-axis” (Wright, 1970a, p. 91).

⁸ The chi-square statistic is used to test whether the sample data deviate sufficiently from a normal distribution for us to conclude that they were not drawn from a normal population. The test for the equality of variances assumes the data are drawn from normal populations. The significant chi-square statistic implies the assumption is not correct and thus the test results may be misleading.

It is our view that the finding of distinct free and slave wealth distributions and Yang’s inference that “. . . the wealth distributions of slave and free farms were influenced by somewhat different mechanisms” (p. 99) need to be approached with caution. We have two reasons for this stand. First, the cotton South is a large and heterogeneous area. By restricting the scope of the investigation to smaller, more homogeneous subsamples it is possible to find distinct distributions for each. The restrictions per se rather than the slave-free division may be the controlling factor. Second, by subdividing the sample and reducing the number of observations it becomes more difficult to reject the null hypothesis. Put the other way, large sample sizes make it easier to find significant results. This is a well known property of tests involving the chi-square statistic. As Leamer (1983, p. 39) noted:

Diagnostic tests such as goodness-of-fit tests, without explicit alternative hypotheses, are useless since, if the sample size is large enough, any maintained hypothesis will be rejected (for example no observed distribution is exactly normal). Such tests therefore degenerate into elaborate rituals for measuring the effective sample size.
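The lognormality tests reported in Table 1 (10 deciles, 7 degrees of freedom) can be made concrete with a small sketch. The Python code below is an illustration under stated assumptions, not the authors' program: it assumes the ten classes are the deciles of a normal distribution fitted to log wealth, so that the two estimated parameters leave 10 - 1 - 2 = 7 degrees of freedom. The function name `lognormality_chi_square` and the example data are hypothetical.

```python
import bisect
import math
from statistics import NormalDist, fmean, stdev

# Upper-tail critical values of chi-square with 7 degrees of freedom.
CHI2_CRIT_7DF = {0.05: 14.067, 0.01: 18.475}


def lognormality_chi_square(wealth, bins=10):
    """Goodness-of-fit statistic for lognormality of positive wealth values.

    Assumption of this sketch: classes are the deciles of the normal
    distribution fitted to log wealth, so each class expects n / bins farms.
    Returns (chi_square, degrees_of_freedom, geometric_mean, log_variance).
    """
    logs = [math.log(w) for w in wealth if w > 0]  # farms with no wealth dropped
    n = len(logs)
    mu, sigma = fmean(logs), stdev(logs)
    fitted = NormalDist(mu, sigma)
    # Interior class boundaries at the 10th, 20th, ..., 90th percentiles.
    cuts = [fitted.inv_cdf(k / bins) for k in range(1, bins)]
    observed = [0] * bins
    for x in logs:
        observed[bisect.bisect_right(cuts, x)] += 1
    expected = n / bins
    chi2 = sum((o - expected) ** 2 / expected for o in observed)
    return chi2, bins - 3, math.exp(mu), sigma ** 2


# Hypothetical two-point sample: clearly not lognormal.
two_point = [1.0] * 500 + [100.0] * 500
chi2, df, geo_mean, log_var = lognormality_chi_square(two_point)
print(chi2 > CHI2_CRIT_7DF[0.01], df)
```

For fixed class shares the statistic is proportional to the number of observations, since the sum of (n p - n/10)^2 / (n/10) over classes equals n times a quantity that depends only on the shares p. That is the arithmetic behind the sample-size argument above.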
Table 2 provides an example of both these forces, homogeneity and smaller sample size, for a portion of the Schaefer matched sample. The data relate to the distribution of wealth for farm operators in six states who did not migrate across state lines in the 1850s. In 5 out of 12 cases for the individual states we fail to reject the hypothesis of lognormality, a result similar to Yang’s for slave and free farms. Additionally, the smaller chi-square statistics and smaller sample sizes have a high positive correlation.⁹

⁹ One could also try to justify these results on substantive grounds. A rationale for thinking the wealth distributions of migrants and nonmigrants distinct has been put forward by Kearl et al. (1980). Similarly, it is not unreasonable to divide the sample by states, as different areas in the country might well have differently priced land depending upon not only population pressures but also soil quality.
TABLE 2
Descriptive Statistics for Wealth Distributions of Nonmigrant Farm Operators

State         Year   Chi square   Geometric meanᵃ   Log variance   nᵇ
Alabama       1850   21.07**        1635            1.211          304
              1860   48.82**        4546            1.154          308
Arkansas      1850    6.70           862            1.041           79
              1860   10.24          3428            0.762           79
Louisiana     1850   15.61*         3920            1.633           42
              1860    5.62         11097            1.775           42
Mississippi   1850   19.62**        1976            1.052          241
              1860   18.79**        6539            1.193          244
Tennessee     1850   20.94**        1744            1.015          175
              1860   28&S**         5119            0.981          179
Texas         1850    4.54          3036            1.068           52
              1860    3.38          8925            1.110           52
Six states    1850   46.50**        1779            1.163          893
              1860   86.84**        5428            1.149          904

Source. Schaefer matched sample.
ᵃ Wealth is defined as FV + slave wealth + animal wealth + value of machinery. This is similar to the approach in Wright (1978, p. 31, fn. 24).
ᵇ Farms with no wealth were omitted from the calculations.
* Significant at the 0.05 level with 7 degrees of freedom.
** Significant at the 0.01 level with 7 degrees of freedom.
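The association between the chi-square statistics and the sample sizes in Table 2 can be checked directly. The sketch below is an illustrative calculation and not the authors' own: it computes a Pearson correlation over the state-by-year cells of five states. Leaving out Tennessee and the six-state totals is an assumption of this example, made because one Tennessee chi-square entry cannot be read reliably and the totals are not independent of the state rows. The names `CELLS` and `pearson` are hypothetical.

```python
import math

# (state, year, chi-square, n) as printed in Table 2.
# Tennessee and the six-state rows are deliberately omitted (see lead-in).
CELLS = [
    ("Alabama", 1850, 21.07, 304), ("Alabama", 1860, 48.82, 308),
    ("Arkansas", 1850, 6.70, 79), ("Arkansas", 1860, 10.24, 79),
    ("Louisiana", 1850, 15.61, 42), ("Louisiana", 1860, 5.62, 42),
    ("Mississippi", 1850, 19.62, 241), ("Mississippi", 1860, 18.79, 244),
    ("Texas", 1850, 4.54, 52), ("Texas", 1860, 3.38, 52),
]


def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


chi_squares = [cell[2] for cell in CELLS]
sample_sizes = [cell[3] for cell in CELLS]
r = pearson(chi_squares, sample_sizes)
print("correlation between chi-square and n:", round(r, 2))
```

A positive coefficient here means that the cells with few farms are also the cells with small test statistics, which is the pattern the text attributes to sample size rather than to the shape of the distributions.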
Yang’s last statistical point, namely that the southern free and northern distributions have log variances that are not significantly different (pp. 99, 101), appears to be incorrect. Yang erroneously reported the variance ratio of (0.892/0.8047), with approximately 9000 degrees of freedom in the numerator and 2627 degrees of freedom in the denominator, as “not significant.” In fact this ratio is significant at the 0.001 level. One could, of course, question the substantive importance of the value of this ratio since, as noted above, the chi-square statistic is influenced by sample size.¹⁰

4

It is our contention that Yang erroneously argued that the P-G sample might be unrepresentative of the cotton South and that the substitution of the farm value for the missing real wealth variable made the southern wealth distribution appear more equal than it was. Further, we conclude his statistical results for the wealth holdings of free southern farmers are incorrect; their total wealth is not lognormally distributed and its log variance is not equal to that of northern farmers.

¹⁰ In addition, our evidence does not support the conjecture that the difference in the variances of the southern and northern distributions is explainable by the higher proportion of older heads of household in the North than the South. Using the age distributions for farmers in the North (Atack and Bateman, 1981, p. 91) and farmers in the South (using the Schaefer matched sample) we obtain the following results:

Percentage distribution

Age of head of household    North     South
Less than 20                  0.3       0.2
20-29                        20.1      14.3
30-39                        28.2      27.0
40-49                        22.3      26.4
50-59                        16.1      18.7
60 or older                  12.9      13.4
Sample size                13,268      2685

REFERENCES

Atack, J., and Bateman, F. (1981), “Egalitarianism, Inequality, and Age: The Rural North in 1860.” Journal of Economic History 41, 85-93.
Fogel, R. W., and Engerman, S. L. (1974), Time on the Cross. Boston: Little, Brown.
Foust, J. D. (1968), “The Yeoman Farmer and Westward Expansion of U.S. Cotton Production.” Ph.D. dissertation, University of North Carolina.
Foust, J. D., and Swan, D. E. (1970), “Productivity and Profitability of Antebellum Slave Labor: A Micro-Approach.” Agricultural History 44, 39-62.
Gallman, R. E. (1969), “Trends in the Size Distribution of Wealth in the Nineteenth Century: Some Speculations.” In L. Soltow (Ed.), Six Papers on the Size Distribution of Wealth and Income. National Bureau of Economic Research, Studies in Income and Wealth, Vol. 33, pp. 1-30.
Kearl, J. R., Pope, C. L., and Wimmer, L. T. (1980), “Household Wealth in a Settlement Economy: Utah, 1850-1870.” Journal of Economic History 40, No. 3, pp. 477-496.
Leamer, E. E. (1983), “Let’s Take the Con Out of Econometrics.” American Economic Review 73, No. 1, pp. 31-43.
Neimi, A. (1977), “Inequality in the Distribution of Slave Wealth: The Cotton South and Other Southern Agricultural Regions.” Journal of Economic History 37, 747-753.
Schaefer, D. (1978), “Yeomen Farmers and Economic Democracy: A Study of Wealth and Economic Mobility in the Western Tobacco Region, 1850 to 1860.” Explorations in Economic History 15, 421-437.
Schmitz, M. (1974), “Economic Analysis of Antebellum Sugar Plantations in Louisiana.” Ph.D. dissertation, University of North Carolina.
Soltow, L. (1975), Men and Wealth in the United States 1850-1870. New Haven, Conn.: Yale Univ. Press.
Swan, D. E. (1972), “The Structure of the Rice Economy: 1859.” Ph.D. dissertation, University of North Carolina.
Wright, G. (1970a), “‘Economic Democracy’ and the Concentration of Economic Wealth in the Cotton South, 1850-1860.” Agricultural History 44, 63-93.
Wright, G. (1970b), “A Note on the Manuscript Census Samples Used in These Studies.” Agricultural History 44, 95-99.
Wright, G. (1978), The Political Economy of the Cotton South. New York: Norton.
Yang, D. (1984), “Notes on the Wealth Distribution of Farm Households in the United States, 1860: A New Look at Two Manuscript Census Samples.” Explorations in Economic History 21, 88-102.
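The variance-ratio comparison of the northern and southern free log variances discussed in the text can be reproduced approximately. The sketch below is a rough check rather than the exact F test: it assumes the large-sample result that the log of a ratio of two independent sample variances is approximately normal with variance 2/df1 + 2/df2, and it takes the numerator degrees of freedom to be exactly 9000, which the text gives only as approximate. The function name `log_variance_ratio_test` is hypothetical.

```python
import math


def log_variance_ratio_test(var_num, var_den, df_num, df_den):
    """Large-sample normal approximation to a one-sided variance-ratio test.

    Assumption of this sketch: under equal population variances,
    ln(var_num / var_den) is roughly N(0, 2/df_num + 2/df_den).
    Returns (z statistic, one-sided upper-tail probability).
    """
    log_ratio = math.log(var_num / var_den)
    std_error = math.sqrt(2.0 / df_num + 2.0 / df_den)
    z = log_ratio / std_error
    p_upper = 0.5 * math.erfc(z / math.sqrt(2.0))  # P(Z > z), standard normal
    return z, p_upper


# Log variances and degrees of freedom as quoted in the text.
z, p = log_variance_ratio_test(0.892, 0.8047, 9000, 2627)
print("z =", round(z, 2), " one-sided p < 0.001:", p < 0.001)
```

Under these assumptions the one-sided tail probability falls below 0.001, in line with the statement that the ratio is significant at the 0.001 level. With degrees of freedom this large, even a ratio only about 11 percent above one is enough to produce that result.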