Statistics and Probability Letters 82 (2012) 1504–1506
On the Amato inequality index

Barry C. Arnold
Statistics Department, University of California, Riverside, CA 92521, USA
Article info

Article history: Received 1 March 2012; received in revised form 25 April 2012; accepted 25 April 2012; available online 7 May 2012.

Keywords: Lorenz curve; Convex function; Curve length; Discrete approximation

Abstract

Amato (1968) proposed using the length of the Lorenz curve as an index of inequality. The index has been little used, perhaps because of the perceived difficulty in analytically evaluating the value of the index in specific situations. A simple representation of the index as an expectation of a particular convex function is presented here. © 2012 Elsevier B.V. All rights reserved.
1. Introduction

We will denote the class of all non-negative random variables with positive finite expectations by L+. For a random variable X ∈ L+, following Gastwirth (1971), we define its Lorenz curve by

\[
L_X(u) = \frac{\int_0^u F_X^{-1}(y)\,dy}{\int_0^1 F_X^{-1}(y)\,dy} = \frac{\int_0^u F_X^{-1}(y)\,dy}{E(X)}, \qquad 0 \le u \le 1, \tag{1}
\]

where

\[
F_X^{-1}(y) =
\begin{cases}
\sup\{x : F_X(x) \le y\}, & 0 \le y < 1, \\
\sup\{x : F_X(x) < 1\}, & y = 1,
\end{cases}
\]
is the right continuous inverse distribution function of the random variable X. This definition is the natural extension of Lorenz's (1905) original definition of the curve that bears his name. The Lorenz order allows comparison of random variables in L+ with regard to the inequality that they display.

Definition 1. For X, Y ∈ L+, with corresponding Lorenz curves L_X and L_Y, X is less than or equal to Y in the Lorenz order, written X ≤_L Y, if L_X(u) ≥ L_Y(u) for all u ∈ [0, 1].

It is generally accepted that summary measures or indices of inequality should be monotone with respect to the Lorenz order. A rich source of candidate inequality indices is provided by the following theorem, which is found implicitly in Hardy et al. (1929).

Theorem 1. For X, Y ∈ L+, X ≤_L Y if and only if E(g(X/E(X))) ≤ E(g(Y/E(Y))) for every continuous convex function g such that the expectations exist.
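For a finite sample, definition (1) and the Lorenz order can be sketched numerically as follows. This is only an illustration under the assumption that X follows the empirical distribution of the sample (each observation has weight 1/n); the function names `lorenz_curve` and `lorenz_leq` are hypothetical and do not come from the paper.

```python
import numpy as np

def lorenz_curve(x):
    """Vertices (u_k, L(u_k)), k = 0..n, of the empirical Lorenz curve of sample x."""
    x = np.sort(np.asarray(x, dtype=float))   # sorted values play the role of F^{-1}
    n = x.size
    u = np.arange(n + 1) / n                  # population shares 0, 1/n, ..., 1
    L = np.concatenate(([0.0], np.cumsum(x))) / x.sum()  # cumulative shares of the total
    return u, L

def lorenz_leq(x, y):
    """True if sample x is less than or equal to sample y in the Lorenz order."""
    ux, Lx = lorenz_curve(x)
    uy, Ly = lorenz_curve(y)
    # Both curves are piecewise linear, so comparing them at the union of
    # their vertices is enough to compare them on all of [0, 1].
    grid = np.union1d(ux, uy)
    return bool(np.all(np.interp(grid, ux, Lx) >= np.interp(grid, uy, Ly) - 1e-12))

# An equal split is below a fully concentrated split in the Lorenz order.
print(lorenz_leq([1, 1, 1, 1], [0, 0, 0, 4]))
```

Two samples whose Lorenz curves cross are not comparable in this order, in which case `lorenz_leq` returns False in both directions.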
doi:10.1016/j.spl.2012.04.020
Thus any continuous convex function g has an associated inequality index I_g that is monotone with respect to the Lorenz order, defined as follows:

\[
I_g(X) = E\bigl(g(X/E(X))\bigr). \tag{2}
\]
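As a rough sketch of how an index of the form (2) might be evaluated on data, the snippet below takes the expectation under the empirical distribution of a sample. The name `inequality_index` and the sample values are made up for illustration, and g(t) = (t − 1)^2 is just one admissible convex function.

```python
import numpy as np

def inequality_index(x, g):
    """I_g(X) = E[g(X / E(X))], with E taken under the empirical distribution of x."""
    x = np.asarray(x, dtype=float)
    return float(np.mean(g(x / x.mean())))

def g_square(t):
    # Illustrative convex function; any continuous convex g could be used instead.
    return (t - 1.0) ** 2

sample = np.array([1.0, 2.0, 3.0, 4.0, 10.0])   # hypothetical data

index = inequality_index(sample, g_square)
rescaled = inequality_index(1000.0 * sample, g_square)   # same data in other units
equal = inequality_index(np.full(5, 4.0), g_square)      # everyone has the mean

print(index, rescaled, equal)
```

Because X is divided by its mean, the index does not change when the data are rescaled. By Jensen's inequality, the value g(1) attained by a constant sample is the smallest value the index can take.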
For example, the choice g(x) = (x − 1)^2 in (2) produces the squared coefficient of variation as an index of inequality. But there are other inequality indices that are based on geometric properties of the Lorenz curve and which do not appear to be of the form (2). The Lorenz curve is a convex curve joining the points (0, 0) and (1, 1) in the unit square, lying below the egalitarian curve, which is the straight line joining (0, 0) to (1, 1). Three such geometrically based indices are:
(1) The Gini index, G(X), which corresponds to twice the area between the Lorenz curve and the egalitarian line.
(2) The Pietra index, P(X), which corresponds to the maximum vertical distance between the Lorenz curve and the egalitarian line. Alternatively, it can be viewed as twice the area of the largest triangle that can be inscribed in the region between the Lorenz curve and the egalitarian line.
(3) The Amato index, A(X), which is the length of the Lorenz curve.

All three of these indices are obviously monotone with respect to the Lorenz order. There are well known definitions of the Gini and the Pietra indices that do not make reference to the Lorenz curve. Thus

\[
G(X) = \frac{E\,|X^{(1)} - X^{(2)}|}{2E(X)}, \tag{3}
\]

where X^{(1)} and X^{(2)} are independent copies of X, and

\[
P(X) = \frac{E\,|X - E(X)|}{2E(X)}. \tag{4}
\]
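The agreement between the Lorenz-curve-free formulas (3) and (4) and the geometric definitions can be checked numerically. The sketch below assumes that X follows the empirical distribution of a sample, so that the two independent copies in (3) range over all ordered pairs of observations; all function names are hypothetical.

```python
import numpy as np

def gini(x):
    """Formula (3): E|X1 - X2| / (2 E(X)) for the empirical distribution of x."""
    x = np.asarray(x, dtype=float)
    # Average over all ordered pairs (i, j), including i == j (independent draws).
    mean_abs_diff = np.abs(x[:, None] - x[None, :]).mean()
    return float(mean_abs_diff / (2.0 * x.mean()))

def pietra(x):
    """Formula (4): E|X - E(X)| / (2 E(X)) for the empirical distribution of x."""
    x = np.asarray(x, dtype=float)
    return float(np.abs(x - x.mean()).mean() / (2.0 * x.mean()))

def lorenz_vertices(x):
    """Vertices of the piecewise linear empirical Lorenz curve."""
    x = np.sort(np.asarray(x, dtype=float))
    u = np.arange(x.size + 1) / x.size
    L = np.concatenate(([0.0], np.cumsum(x))) / x.sum()
    return u, L

def gini_from_curve(x):
    """Twice the area between the egalitarian line and the Lorenz curve."""
    u, L = lorenz_vertices(x)
    area_under = np.sum((L[1:] + L[:-1]) / 2.0 * np.diff(u))  # trapezoids, exact for a polyline
    return float(1.0 - 2.0 * area_under)

def pietra_from_curve(x):
    """Maximum vertical distance between the egalitarian line and the Lorenz curve."""
    u, L = lorenz_vertices(x)
    # u - L(u) is concave and piecewise linear, so its maximum is at a vertex.
    return float(np.max(u - L))

data = [1.0, 2.0, 3.0, 4.0]   # hypothetical sample
print(gini(data), gini_from_curve(data))
print(pietra(data), pietra_from_curve(data))
```

The pairwise computation in `gini` uses memory proportional to n^2, which is fine for small samples like this one but not for large data sets.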
The Gini index is immensely popular; the Pietra index is moderately so. The Amato index is much less frequently employed. Undoubtedly this is partly a consequence of the absence of a Lorenz-curve-free representation of the form (3) or (4). This perceived fault of the Amato index will be rectified in the next section. From (4), the Pietra index admits the representation

\[
P(X) = E\bigl(g_P(X/E(X))\bigr), \tag{5}
\]

where g_P(x) = |x − 1|/2, a continuous convex function. It is thus an index of the form (2), and could be denoted by I_{g_P}. The Gini index does not admit a representation of the form (2). It will be verified that the Amato index can indeed be represented in the form (2). Specifically,
\[
A(X) = E\Bigl(\sqrt{1 + (X/E(X))^2}\Bigr) = E\bigl(g_A(X/E(X))\bigr), \qquad \text{where } g_A(x) = \sqrt{1 + x^2}, \tag{6}
\]

a continuous convex function.
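Representation (6) can be checked numerically for a discrete distribution, whose Lorenz curve is a polyline. The sketch below computes the Amato index twice: once as the expectation in (6), and once as the geometric length of the polyline. The function names and the example distribution are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def amato_expectation(values, probs):
    """Right-hand side of (6): E[sqrt(1 + (X / E(X))^2)] for a discrete X."""
    values = np.asarray(values, dtype=float)
    probs = np.asarray(probs, dtype=float)
    mean = np.sum(probs * values)
    return float(np.sum(probs * np.sqrt(1.0 + (values / mean) ** 2)))

def amato_curve_length(values, probs):
    """Length of the piecewise linear Lorenz curve of a discrete X."""
    values = np.asarray(values, dtype=float)
    probs = np.asarray(probs, dtype=float)
    order = np.argsort(values)               # the Lorenz curve needs sorted values
    v, p = values[order], probs[order]
    mean = np.sum(p * v)
    u = np.concatenate(([0.0], np.cumsum(p)))             # cumulative population share
    L = np.concatenate(([0.0], np.cumsum(p * v))) / mean  # cumulative share of the total
    # Each segment is the hypotenuse of a right triangle with legs diff(u), diff(L).
    return float(np.sum(np.hypot(np.diff(u), np.diff(L))))

values = [4.0, 1.0, 3.0, 2.0]        # hypothetical support points
probs = [0.25, 0.25, 0.25, 0.25]     # hypothetical probabilities

a_expect = amato_expectation(values, probs)
a_length = amato_curve_length(values, probs)
print(a_expect, a_length)            # the two numbers should agree up to rounding
```

The expectation form needs neither sorting nor the curve itself, which is the practical appeal of (6).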
2. The Amato inequality index

Suppose that X ∈ L+. For each integer n define a random variable X_n with n2^n + 1 possible values by

\[
X_n = \frac{1}{2^n}\,\mathrm{Int}\bigl(2^n \min\{X, n\}\bigr), \tag{7}
\]

where Int(u) denotes the integer part of u. Evidently, X_n → X in distribution (in fact, the convergence is pointwise), L_{X_n}(u) → L_X(u) for every u ∈ [0, 1], A(X_n) → A(X) and, provided that E(g(X)) exists, E(g(X_n)) → E(g(X)). Because of this, it will suffice to prove the claim that

\[
A(X) = E\Bigl(\sqrt{1 + (X/E(X))^2}\Bigr) = E\bigl(g_A(X/E(X))\bigr)
\]
for a discrete random variable, since a limiting argument can be used to extend the result to cover all random variables in L+. Moreover, without loss of generality we can assume that the mean value of the discrete random variable is 1. Thus we consider a discrete random variable Y with m possible values {y_i : i = 1, 2, ..., m} with associated probabilities {p_i : i = 1, 2, ..., m}. Again without loss of generality, we may assume that the y_i's are arranged in non-decreasing order. The Lorenz curve of Y is then a piecewise linear curve joining (0, 0) to (1, 1). The segment of the curve associated with the value y_i has a rise of p_i y_i and a run of p_i. The length of this line segment (the hypotenuse of a right triangle) is then given by \sqrt{(p_i y_i)^2 + (p_i)^2}. The length of the Lorenz curve (the Amato index) is the sum of the lengths of the line segments. Thus

\[
A(Y) = \sum_{i=1}^{m} \sqrt{(p_i y_i)^2 + (p_i)^2} = \sum_{i=1}^{m} \Bigl[\sqrt{(y_i)^2 + 1}\Bigr] p_i = E\Bigl(\sqrt{1 + Y^2}\Bigr), \tag{8}
\]
and, by use of a suitable limiting argument, consequently we have

\[
A(X) = E\Bigl(\sqrt{1 + (X/E(X))^2}\Bigr) = E\bigl(g_A(X/E(X))\bigr)
\]

for every X ∈ L+, as claimed.

3. An example

Gastwirth (1988, pp. 23–24) provides data on the population of each of the 33 electoral districts for the Tennessee Legislature in the years 1900 and 1960. At issue was whether the population sizes of the districts had become unacceptably disparate after 60 years. To this end, a measure of variability should be applied to the two data vectors. Denote by x the vector of district populations in 1900 and by y the corresponding vector for 1960. The corresponding Amato indices will be denoted by A(1900) and A(1960) respectively. The computing formula for the 1900 Amato index is

\[
A(1900) = A(x) = \frac{1}{33} \sum_{i=1}^{33} \sqrt{\Bigl(\frac{x_i}{\bar{x}}\Bigr)^2 + 1},
\]

in which \bar{x} = \frac{1}{33} \sum_{i=1}^{33} x_i. This, as remarked earlier, permits computation of the length of the Lorenz curve without the necessity of drawing the curve. With the given data, the Amato indices are found to be A(1900) = 1.4209 and A(1960) = 1.4615, confirming that in 1960 the populations of the districts exhibited more inequality than they had in 1900.

Since the length of a Lorenz curve will always fall in the interval (√2, 2), a standardized Amato index might be recommended. It will be defined by
\[
\tilde{A}(X) = \frac{A(X) - \sqrt{2}}{2 - \sqrt{2}}.
\]

The standardized Amato indices for the Tennessee data, namely Ã(1900) = 0.011437 and Ã(1960) = 0.0807, provide more compelling evidence of the increased inequality exhibited in 1960.

4. Comments

The Amato index thus has a dual interpretation, just as does the Pietra index. The Amato index is the geometrical length of the Lorenz curve, but also can be viewed as the expectation of a simple convex function. Perhaps the popularity of the Amato index (up to now, very low) will be increased by the knowledge of the existence of this perhaps unexpected representation in terms of the expectation of a particular convex function.

Acknowledgments

I am grateful to the anonymous referees for their suggestions which have led to an improved version of this paper.

References

Amato, V., 1968. Metodologia Statistica Strutturale, Vol. 1. Cacucci, Bari.
Gastwirth, J.L., 1971. A general definition of the Lorenz curve. Econometrica 39, 1037–1039.
Gastwirth, J.L., 1988. Statistical Reasoning in Law and Public Policy, Vol. 1. Academic Press, Boston.
Hardy, G.H., Littlewood, J.E., Pólya, G., 1929. Some simple inequalities satisfied by convex functions. Messenger of Mathematics 58, 145–152.
Lorenz, M.O., 1905. Methods of measuring concentration of wealth. Journal of the American Statistical Association 9, 209–219.