Economics Letters 1 (1978) 389-393 © North-Holland Publishing Company
EXTREME VALUES FOR GINI COEFFICIENTS CALCULATED FROM GROUPED DATA
David MURRAY * University of New England,
Armidale, NSW 2351, Australia
Received December 1978
Data on income distributions is usually specified as numbers of persons in mutually exclusive, non-overlapping income ranges with an open ended terminal group. In this paper we present a method for determining the maximum and minimum Gini coefficients when only the population mean income is known.
1. Introduction
The purpose of this paper is to provide a method for calculating minimum and maximum values for Gini coefficients from grouped data, when the mean income in each group is unknown but the population mean income is known. The problems of estimating Gini coefficients from grouped data are well known. When mean incomes of groups are known, the problem of determining the range for the estimates of the Gini has been solved by Gastwirth (1972). However, data on income distributions does not always include this information, e.g. that provided by the Australian Bureau of Statistics (1976). The method presented below is applicable to income distribution data where the numbers in a series of income ranges and an open ended terminal group are given together with the population mean income. We note initially that in this situation the Gini coefficient may be expressed as a sum of two terms, one depending on the mean incomes in the groups and the other being a weighted sum of within group Gini coefficients. It is then shown that the Gini coefficient has a minimum value which is a linear function of certain variables $\lambda_i$, while the maximum value is a quadratic function of the same variables. The functions can then be minimized or maximized using standard linear or quadratic programming techniques.
* I wish to thank Dan Dao for reassurance on some of the mathematical content of this paper.
D. Murray / Extreme values for Gini coefficients
2. The decomposed Gini coefficient, and its upper and lower bounds
Where data on incomes is consolidated into $k$ exhaustive, mutually exclusive and non-overlapping groups the Gini coefficient may be written [following Pyatt (1976)] as
$$G = \sum_{i=1}^{k}\sum_{j=1}^{k} \pi_i E_{ij} p_j + \sum_{i=1}^{k} \pi_i G_i^{W} p_i = G^{M} + G^{W}. \tag{1}$$
In the above formulation $\pi_i$ is the share of the population income received by group $i$, $p_i$ the proportion of the total population in group $i$, $E_{ij} = \max((y_j - y_i)/y_i, 0)$ and $G_i^{W}$ is the Gini coefficient of inequality within group $i$. The term $y_i$ is the mean income in group $i$. We shall also define $n_i$ as the number of individuals in group $i$, $N$ as the total number and $\bar{y}$ as the mean income of the population, so that $p_i = n_i/N$ and $\pi_i = n_i y_i/(N\bar{y})$. Considering the first $k-1$ closed income ranges we define $y_i^{U}$ and $y_i^{L}$ as the upper and lower limits of the income range. For the last, open-ended group only $y_k^{L}$ is defined. If $y_i$ is the (unknown) mean income in group $i$ then the minimum value for the within group Gini coefficient is zero (all individuals have income $y_i$). The maximum value of the within group Gini for the first $k-1$ groups for any $y_i$ will be obtained when all individuals have an income equal to either the lower or upper bound. If we define $\lambda_i$ for these groups as being the proportion of individuals in the group having income $y_i^{U}$ then it follows that
$$y_i = \lambda_i y_i^{U} + (1-\lambda_i)\, y_i^{L}, \qquad 1 \ge \lambda_i \ge 0, \qquad i = 1,\dots,k-1, \tag{2}$$
and the maximum within group Gini coefficient is given by
$$G_i^{W*} = \lambda_i(1-\lambda_i)(y_i^{U} - y_i^{L})/y_i, \qquad i = 1,\dots,k-1. \tag{3}$$
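As a numerical check of (2) and (3), the following minimal sketch builds a two-point distribution inside one closed range and compares its Gini coefficient, computed from all pairwise income differences, with the closed form. The group size, range limits and the helper names `gini` and `max_within_gini` are illustrative assumptions, not taken from the paper.

```python
def gini(incomes):
    """Gini coefficient: sum of |x_a - x_b| over all ordered pairs / (2 n^2 mean)."""
    n = len(incomes)
    mean = sum(incomes) / n
    total_diff = sum(abs(a - b) for a in incomes for b in incomes)
    return total_diff / (2 * n * n * mean)


def max_within_gini(lam, lower, upper):
    """Closed form (3), with the group mean taken from (2)."""
    mean = lam * upper + (1 - lam) * lower
    return lam * (1 - lam) * (upper - lower) / mean


# Hypothetical group: 100 people in the range 10..20, 30 of them at the upper limit.
n_i, lower, upper, at_upper = 100, 10.0, 20.0, 30
incomes = [upper] * at_upper + [lower] * (n_i - at_upper)
lam = at_upper / n_i

direct = gini(incomes)                      # from the individual incomes
closed = max_within_gini(lam, lower, upper)  # from eq. (3)
print(direct, closed)
```

Both numbers agree up to rounding, which is what (3) asserts for a group whose members sit only on the two limits of the range.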
For the terminal group we define $\lambda_k$ as the ratio of the mean income to the lower bound of the group, so that
$$y_k = \lambda_k y_k^{L}, \qquad \lambda_k \ge 1. \tag{4}$$
The maximum within group Gini for this terminal group for any $y_k$ will exist when $n_k - 1$ individuals have income equal to $y_k^{L}$, and one individual has income equal to $n_k y_k - (n_k - 1) y_k^{L}$. This Gini is given as
$$G_k^{W*} = n_k(n_k - 1)(y_k - y_k^{L})/(n_k^{2} y_k). \tag{5}$$
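The extreme allocation behind (5) can be checked the same way. This is a sketch under assumed values (a terminal group of 20 people, lower bound 20 and $\lambda_k = 1.5$); the helper `gini` is a hypothetical name for the pairwise-difference formula.

```python
def gini(incomes):
    """Gini coefficient: sum of |x_a - x_b| over all ordered pairs / (2 n^2 mean)."""
    n = len(incomes)
    mean = sum(incomes) / n
    total_diff = sum(abs(a - b) for a in incomes for b in incomes)
    return total_diff / (2 * n * n * mean)


# Hypothetical open-ended group: 20 people, lower bound 20, mean 1.5 times the bound.
n_k, lower_k, lam_k = 20, 20.0, 1.5
mean_k = lam_k * lower_k                       # eq. (4)

# n_k - 1 people on the lower bound, one person receiving the remainder.
top_income = n_k * mean_k - (n_k - 1) * lower_k
incomes = [lower_k] * (n_k - 1) + [top_income]

direct = gini(incomes)
closed = n_k * (n_k - 1) * (mean_k - lower_k) / (n_k ** 2 * mean_k)  # eq. (5)
print(direct, closed)
```

The allocation preserves the group mean by construction, so the two values differ only by floating-point rounding.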
Thus for any one $y_i$ within group inequality in group $i$ can take either a minimum value of zero or a maximum value as defined in (3) or (5). And consequently for any $k$ values of the $y_i$, $G^{W}$ [eq. (1)] has a minimum value of zero or a maximum value given by
$$G^{W*} = \sum_{i=1}^{k-1} \pi_i G_i^{W*} p_i + \pi_k G_k^{W*} p_k. \tag{6}$$
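The decomposition in (1), on which these bounds rest, can be illustrated on individual data. The sketch below assumes three small non-overlapping groups with made-up incomes and checks that the between-group and within-group terms add up to the Gini coefficient of the pooled population; the function names are hypothetical.

```python
def gini(incomes):
    """Gini coefficient: sum of |x_a - x_b| over all ordered pairs / (2 n^2 mean)."""
    n = len(incomes)
    mean = sum(incomes) / n
    total_diff = sum(abs(a - b) for a in incomes for b in incomes)
    return total_diff / (2 * n * n * mean)


def decompose(groups):
    """Return (G^M, G^W) of eq. (1) for non-overlapping groups of incomes."""
    N = sum(len(g) for g in groups)
    total_income = sum(sum(g) for g in groups)
    means = [sum(g) / len(g) for g in groups]                           # y_i
    p = [len(g) / N for g in groups]                                    # population shares
    pi = [len(g) * m / total_income for g, m in zip(groups, means)]     # income shares
    k = len(groups)
    g_between = sum(
        pi[i] * max((means[j] - means[i]) / means[i], 0.0) * p[j]
        for i in range(k) for j in range(k)
    )
    g_within = sum(pi[i] * gini(groups[i]) * p[i] for i in range(k))
    return g_between, g_within


# Hypothetical incomes, grouped into ranges that do not overlap.
groups = [[1.0, 3.0, 5.0], [10.0, 12.0], [20.0, 30.0, 40.0, 50.0]]
g_between, g_within = decompose(groups)
g_total = gini([x for g in groups for x in g])
print(g_between + g_within, g_total)
```

The identity holds exactly only because the groups do not overlap, which is the situation of income-range data considered here.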
It follows that for any set of $\lambda_i$ values the minimum Gini is given by the $G^{M}$ term in (1), and the maximum is given by the sum of $G^{M}$ plus $G^{W*}$ as defined in (6). Conversely if we can find the set of $\lambda_i$ values which minimize $G^{M}$ we will be able to find the minimum possible value of $G$, and if we find the set of $\lambda_i$ values which maximize $G^{M} + G^{W*}$ then we can find the maximum possible value of $G$. To do this we show that $G^{M}$ is a linear function of the $\lambda_i$, and $G^{M} + G^{W*}$ is a quadratic function of the $\lambda_i$.
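3. The functions to be minimized and maximized
Returning to (1) above, and ordering the ranges so that $y_j > y_i$, $j > i$, we can write
$$G^{M} = \frac{1}{N^{2}\bar{y}} \sum_{i=2}^{k} n_i (y_i - y_1)\Big[\sum_{j=1}^{i-1} n_j - \sum_{j=i+1}^{k} n_j\Big].$$
Given the values for $y_i$ defined in terms of $\lambda_i$ and writing the term in square brackets as $a_i$, we can further write
$$\begin{aligned} G^{M} &= \frac{1}{N^{2}\bar{y}}\Big[\sum_{i=2}^{k-1} n_i\big(\lambda_i y_i^{U} + (1-\lambda_i) y_i^{L} - \lambda_1 y_1^{U} - (1-\lambda_1) y_1^{L}\big) a_i + n_k\big(\lambda_k y_k^{L} - \lambda_1 y_1^{U} - (1-\lambda_1) y_1^{L}\big) a_k\Big] \\ &= \frac{1}{N^{2}\bar{y}}\Big[n_k a_k y_k^{L}\lambda_k + \sum_{i=2}^{k-1} n_i a_i (y_i^{U} - y_i^{L})\lambda_i - (y_1^{U} - y_1^{L})\Big(\sum_{i=2}^{k} n_i a_i\Big)\lambda_1 + \sum_{i=2}^{k-1} n_i a_i (y_i^{L} - y_1^{L}) - n_k a_k y_1^{L}\Big]. \end{aligned} \tag{7}$$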
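A short sketch can confirm that the linear form (7) reproduces the between-group term computed directly from the group means. The group counts, range limits and $\lambda$ values below are illustrative assumptions, as are the function names; `upper` holds the $k-1$ finite upper limits only.

```python
def position_weights(n):
    """a_i: people in lower groups minus people in higher groups."""
    return [sum(n[:i]) - sum(n[i + 1:]) for i in range(len(n))]


def group_means(lam, lower, upper):
    """y_i from eq. (2) for closed groups and eq. (4) for the terminal group."""
    k = len(lower)
    closed = [lam[i] * upper[i] + (1 - lam[i]) * lower[i] for i in range(k - 1)]
    return closed + [lam[k - 1] * lower[k - 1]]


def gm_direct(lam, n, lower, upper, ybar):
    """Between-group term as the sum over ordered pairs of group means."""
    y, k, N = group_means(lam, lower, upper), len(n), sum(n)
    pairs = sum(n[i] * n[j] * (y[j] - y[i]) for i in range(k) for j in range(i + 1, k))
    return pairs / (N * N * ybar)


def gm_linear(lam, n, lower, upper, ybar):
    """Between-group term via the second line of eq. (7)."""
    k, N, a = len(n), sum(n), position_weights(n)
    tail = sum(n[i] * a[i] for i in range(1, k))          # sum over i = 2..k of n_i a_i
    value = n[-1] * a[-1] * lower[-1] * lam[-1]
    value += sum(n[i] * a[i] * (upper[i] - lower[i]) * lam[i] for i in range(1, k - 1))
    value -= (upper[0] - lower[0]) * tail * lam[0]
    value += sum(n[i] * a[i] * (lower[i] - lower[0]) for i in range(1, k - 1))
    value -= n[-1] * a[-1] * lower[0]
    return value / (N * N * ybar)


# Hypothetical data: ranges 0-10, 10-20 and 20+, with arbitrary lambda values.
n, lower, upper = [50, 30, 20], [0.0, 10.0, 20.0], [10.0, 20.0]
lam, ybar = [0.4, 0.5, 1.6], 12.9
a = position_weights(n)
linear = gm_linear(lam, n, lower, upper, ybar)
direct = gm_direct(lam, n, lower, upper, ybar)
print(a, linear, direct)
```

Because the coefficients in `gm_linear` depend only on the counts and range limits, they can be computed once and handed to whatever optimizer is used.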
Thus $G^{M}$ may be expressed as a linear function of the $\lambda_i$. All other variables entering into (7) are known. It follows that we can find a minimum value for the population Gini by minimizing (7) with respect to the $\lambda_i$ subject to the constraints on the $\lambda_i$ and total mean income (the group means must reproduce the known population mean, $\sum_{i=1}^{k} n_i y_i = N\bar{y}$),
$$1 \ge \lambda_i \ge 0, \qquad i = 1,\dots,k-1, \qquad \lambda_k \ge 1. \tag{8}$$
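For a small number of groups the minimization can be sketched without a linear programming package. The example below is an assumption-laden stand-in for a proper LP solver: it relies on the fact that, with one equality constraint (the mean income condition) and simple bounds, every vertex of the feasible region has at most one $\lambda_i$ off its bounds, so all vertices can be enumerated. The data and function names are hypothetical, and the objective is evaluated from the group means rather than from the coefficients of (7).

```python
from itertools import product


def between_gini(lam, n, lower, upper, ybar):
    """G^M from the group means implied by lam (linear in lam)."""
    k, N = len(n), sum(n)
    y = [lam[i] * upper[i] + (1 - lam[i]) * lower[i] for i in range(k - 1)]
    y.append(lam[-1] * lower[-1])
    pairs = sum(n[i] * n[j] * (y[j] - y[i]) for i in range(k) for j in range(i + 1, k))
    return pairs / (N * N * ybar)


def min_gini_vertices(n, lower, upper, ybar, tol=1e-9):
    """Brute-force vertex enumeration of the LP; practical only for small k."""
    k, N = len(n), sum(n)
    inf = float("inf")
    # Weight of each lambda in the mean income equation and its right-hand side.
    w = [n[i] * (upper[i] - lower[i]) for i in range(k - 1)] + [n[-1] * lower[-1]]
    target = N * ybar - sum(n[i] * lower[i] for i in range(k - 1))
    lo = [0.0] * (k - 1) + [1.0]
    hi = [1.0] * (k - 1) + [inf]
    best = None
    for basic in range(k):                      # the one variable allowed off its bounds
        others = [i for i in range(k) if i != basic]
        choices = [(lo[i],) if hi[i] == inf else (lo[i], hi[i]) for i in others]
        for values in product(*choices):
            lam = [0.0] * k
            for i, v in zip(others, values):
                lam[i] = v
            lam[basic] = (target - sum(w[i] * lam[i] for i in others)) / w[basic]
            if lo[basic] - tol <= lam[basic] <= hi[basic] + tol:
                g = between_gini(lam, n, lower, upper, ybar)
                if best is None or g < best[0]:
                    best = (g, lam)
    return best


# Hypothetical data: ranges 0-10, 10-20 and 20+, population mean 14.
n, lower, upper, ybar = [50, 30, 20], [0.0, 10.0, 20.0], [10.0, 20.0], 14.0
g_min, lam_min = min_gini_vertices(n, lower, upper, ybar)
print(g_min, lam_min)
```

The enumeration grows as $k\,2^{k-1}$, so for data with many income ranges a standard linear programming routine is the sensible choice.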
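In order to find the maximum possible value of the Gini coefficient we need to find the values of $\lambda_i$ which maximize $G^{M} + G^{W*}$. Reverting to eqs. (3), (5) and (6), we may write
$$\begin{aligned} G^{W*} &= \sum_{i=1}^{k-1} (n_i y_i/N\bar{y})\big(\lambda_i(1-\lambda_i)(y_i^{U} - y_i^{L})/y_i\big)(n_i/N) + (n_k y_k/N\bar{y})\big(n_k(n_k-1)(y_k - y_k^{L})/(n_k^{2} y_k)\big)(n_k/N) \\ &= \frac{1}{N^{2}\bar{y}}\Big[\sum_{i=1}^{k-1} n_i^{2}(y_i^{U} - y_i^{L})\lambda_i(1-\lambda_i) + n_k(n_k-1)\, y_k^{L}(\lambda_k - 1)\Big]. \end{aligned} \tag{9}$$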
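Adding this to $G^{M}$ we get the maximum value of the Gini $G^{*}$,
$$\begin{aligned} G^{*} &= G^{M} + G^{W*} \\ &= \frac{1}{N^{2}\bar{y}}\Big[n_k(n_k + a_k - 1)\, y_k^{L}\lambda_k + \sum_{i=2}^{k-1} n_i(n_i + a_i)(y_i^{U} - y_i^{L})\lambda_i + (y_1^{U} - y_1^{L})\Big(n_1^{2} - \sum_{i=2}^{k} n_i a_i\Big)\lambda_1 \\ &\qquad - \sum_{i=1}^{k-1} n_i^{2}(y_i^{U} - y_i^{L})\lambda_i^{2} - n_k(n_k - 1)\, y_k^{L} + \sum_{i=2}^{k-1} n_i a_i (y_i^{L} - y_1^{L}) - n_k a_k y_1^{L}\Big]. \end{aligned} \tag{10}$$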
From (10) it is seen that $G^{*}$ is a quadratic function of the $\lambda_i$, and all other variables entering into the function are known. It follows that we can find a maximum value for the population Gini by maximizing (10) with respect to the $\lambda_i$ subject to the constraints given in (8) on the $\lambda_i$. Since (7) is a linear function in the $\lambda_i$, and (10) is a quadratic function, and the minimization or maximization is subject to a set of linear inequalities on the $\lambda_i$, it follows that solutions may be found using standard linear or quadratic programming techniques.
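The maximization can be illustrated in the same spirit. The sketch below evaluates the quadratic form (10) and, as a crude substitute for quadratic programming, searches a grid over the $\lambda_i$ of the closed groups while solving the mean income condition for $\lambda_k$. The data, the grid resolution and the function names are assumptions made for illustration only.

```python
from itertools import product


def g_star(lam, n, lower, upper, ybar):
    """Quadratic form (10): G^M plus the maximal within-group term G^W*."""
    k, N = len(n), sum(n)
    a = [sum(n[:i]) - sum(n[i + 1:]) for i in range(k)]
    width = [upper[i] - lower[i] for i in range(k - 1)]
    tail = sum(n[i] * a[i] for i in range(1, k))          # sum over i = 2..k of n_i a_i
    q = n[-1] * (n[-1] + a[-1] - 1) * lower[-1] * lam[-1]
    q += sum(n[i] * (n[i] + a[i]) * width[i] * lam[i] for i in range(1, k - 1))
    q += width[0] * (n[0] ** 2 - tail) * lam[0]
    q -= sum(n[i] ** 2 * width[i] * lam[i] ** 2 for i in range(k - 1))
    q -= n[-1] * (n[-1] - 1) * lower[-1]
    q += sum(n[i] * a[i] * (lower[i] - lower[0]) for i in range(1, k - 1))
    q -= n[-1] * a[-1] * lower[0]
    return q / (N * N * ybar)


def max_gini_grid(n, lower, upper, ybar, steps=100):
    """Grid search over closed-group lambdas; lambda_k follows from the mean."""
    k, N = len(n), sum(n)
    grid = [j / steps for j in range(steps + 1)]
    best = None
    for closed in product(grid, repeat=k - 1):
        closed_income = sum(
            n[i] * (closed[i] * upper[i] + (1 - closed[i]) * lower[i])
            for i in range(k - 1)
        )
        lam_k = (N * ybar - closed_income) / (n[-1] * lower[-1])
        if lam_k < 1.0:                         # violates the bound in (8)
            continue
        lam = list(closed) + [lam_k]
        g = g_star(lam, n, lower, upper, ybar)
        if best is None or g > best[0]:
            best = (g, lam)
    return best


# Hypothetical data: ranges 0-10, 10-20 and 20+, population mean 14.
n, lower, upper, ybar = [50, 30, 20], [0.0, 10.0, 20.0], [10.0, 20.0], 14.0
g_max, lam_max = max_gini_grid(n, lower, upper, ybar)
print(g_max, lam_max)
```

A grid only approximates the maximizer when it lies between grid points, and its cost grows exponentially with the number of groups, so a quadratic programming routine remains the appropriate tool for real data.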
4. Conclusion
In this paper we have provided a method for finding maximum and minimum values of the Gini coefficient when data is grouped into closed income ranges and an open ended terminal class and only the population mean is available. The reasons for wishing to obtain these statistics are the same as those given by Gastwirth (1972). The contribution made here is that it is demonstrated how outer bounds on the Gini coefficient can be obtained even though means in sub-groups are not available.
References
Australian Bureau of Statistics, 1976, Income distribution 1973-74, Part I and Part II (Canberra, Australia).
Gastwirth, Joseph L., 1972, The estimation of the Lorenz curve and the Gini index, Review of Economics and Statistics 54, 306-316.
Pyatt, Graham, 1976, On the interpretation and disaggregation of Gini coefficients, Economic Journal 86, 243-255.