Nonlinear Analysis 71 (2009) e1401–e1407
On the transmuted extreme value distribution with application
Gokarna R. Aryal a,∗, Chris P. Tsokos b
a Department of Mathematics, CS and Statistics, Purdue University Calumet, Hammond, IN 46323, USA
b Department of Mathematics and Statistics, University of South Florida, Tampa, FL 33620, USA
Keywords: Generalized extreme value distribution; Transmutation map; Gumbel distribution; Return level

Abstract

A functional composition of the cumulative distribution function of one probability distribution with the inverse cumulative distribution function of another is called the transmutation map. In this article, we use the quadratic rank transmutation map (QRTM) to generate a flexible family of probability distributions, taking the extreme value distribution as the base distribution and introducing a new parameter that offers more distributional flexibility. We show that the analytical results are applicable to modeling real-world data. © 2009 Elsevier Ltd. All rights reserved.
1. Introduction

The quality of the procedures used in statistical analysis depends heavily on the assumed probability model or distribution. Because of this, considerable effort has been expended over the years in the development of large classes of standard probability distributions, along with relevant statistical methodologies, designed to serve as statistical models for a wide range of real-world phenomena. However, there still remain many important problems where the real data do not follow any of the classical or standard probability models.

Very few of the real-world phenomena that we need to study statistically are symmetrical. Thus, the popular normal model is not a useful probability density function (pdf) for studying every phenomenon; at times it is a poor description of observed asymmetrical phenomena. Skewed models, which exhibit varying degrees of asymmetry, are a necessary component of the modeler's tool kit. According to Genton [1], the introduction of non-normal probability distributions can be traced back to the nineteenth century. Edgeworth [2] examined the problem of fitting asymmetrical distributions to asymmetrical frequency data. Several skewed and kurtotic statistical models have appeared in the literature recently. The skew normal probability model proposed by O'Hagan and extensively studied by Azzalini has been extended to other symmetric models and has been applied in several areas. An extensive bibliography is available at [3]. According to Azzalini's approach, a random variable X is said to have the skew-symmetric distribution if its pdf is $f(x) = 2g(x)G(\beta x)$, where $g(\cdot)$ and $G(\cdot)$ denote, respectively, the pdf and the cdf of the symmetric distribution and $\beta \in (-\infty, \infty)$ is the skewness parameter. Aryal et al. [4] have applied the skew Laplace probability distribution to study currency exchange rate data.
If the underlying (base) probability distribution is not symmetric, we cannot apply Azzalini's approach to generate a flexible family of probability distributions. The aim of the present study is to investigate a probability distribution that can be derived from a non-symmetric distribution, in particular the extreme value distribution, and that can be used in modeling and analyzing real-world data. In Section 2 we define the transmuted probability distribution and show its relationship with Azzalini's formulation for a uniform random variable. In Section 3 we define the transmuted generalized extreme value distribution. In Section 4 we study some basic mathematical characteristics of the transmuted Gumbel probability distribution and apply it to modeling snowfall data.
∗ Corresponding author. E-mail address: [email protected] (G.R. Aryal).
0362-546X/$ – see front matter © 2009 Elsevier Ltd. All rights reserved. doi:10.1016/j.na.2009.01.168
G.R. Aryal, C.P. Tsokos / Nonlinear Analysis 71 (2009) e1401–e1407
2. Transmutation map

Let $F_1$ and $F_2$ be the cumulative distribution functions (cdfs) of two distributions with a common sample space. The general rank transmutation, as given in [5], is defined as

$$G_{R12}(u) = F_2(F_1^{-1}(u)), \qquad G_{R21}(u) = F_1(F_2^{-1}(u)).$$

Note that the inverse cumulative distribution function, also known as the quantile function, is defined as $F^{-1}(y) = \inf\{x \in \mathbb{R} : F(x) \ge y\}$ for $y \in [0, 1]$. The functions $G_{R12}(u)$ and $G_{R21}(u)$ both map the unit interval $I = [0, 1]$ into itself, and under suitable assumptions they are mutual inverses and satisfy $G_{Rij}(0) = 0$ and $G_{Rij}(1) = 1$. A quadratic rank transmutation map (QRTM) is defined as

$$G_{R12}(u) = u + \lambda u(1 - u), \qquad |\lambda| \le 1,$$

from which it follows that the cdfs satisfy the relationship

$$F_2(x) = (1 + \lambda)F_1(x) - \lambda F_1(x)^2, \tag{1}$$

which on differentiation yields

$$f_2(x) = f_1(x)\,[1 + \lambda - 2\lambda F_1(x)], \tag{2}$$
where $f_1(x)$ and $f_2(x)$ are the pdfs associated with the cdfs $F_1(x)$ and $F_2(x)$, respectively. Extensive information about the quadratic rank transmutation map is given in Shaw et al. [5].

Lemma. $f_2(x)$ given in (2) is a well-defined probability density function.

Proof. Rewriting $f_2(x)$ as $f_2(x) = f_1(x)\,[1 - \lambda(2F_1(x) - 1)]$, we observe that $f_2(x)$ is nonnegative, since $|\lambda| \le 1$ and $|2F_1(x) - 1| \le 1$. We need to show that the integral over the support of the random variable is 1. Consider the case when the support of $f_1(x)$ is $(-\infty, \infty)$. In this case we have

$$\begin{aligned}
\int_{-\infty}^{\infty} f_2(x)\,dx &= \int_{-\infty}^{\infty} f_1(x)\,[1 + \lambda - 2\lambda F_1(x)]\,dx \\
&= (1 + \lambda)\int_{-\infty}^{\infty} f_1(x)\,dx - 2\lambda\int_{-\infty}^{\infty} f_1(x)F_1(x)\,dx \\
&= (1 + \lambda) - \lambda \qquad \text{since } \int_{-\infty}^{\infty} 2f_1(x)F_1(x)\,dx = 1 \\
&= 1.
\end{aligned}$$

The other cases, where the support of the random variable is a part of the real line, follow similarly. Hence $f_2(x)$ is a well-defined probability density function.

We call $f_2(x)$ the transmuted probability density of a random variable with base density $f_1(x)$. Also note that when $\lambda = 0$, $f_2(x) = f_1(x)$. In Theorem 1 below we establish the relationship between Azzalini's formulation and the QRTM formulation for a uniform random variable.

Theorem 1. Let X be a uniform$(-\theta, \theta)$ random variable with density function $f(x)$ and absolutely continuous distribution function $F(x)$ such that $F'(x)$ is symmetric about 0. Then

$$g(x \mid \beta) = 2f(x)F(\beta x) = \frac{\max\{\min(\beta x, \theta), -\theta\} + \theta}{2\theta^2}, \qquad -\theta < x < \theta,$$

is a density function for $\theta > 0$ and any real $\beta$. If $\lambda$ is the shape parameter used to define the transmuted uniform distribution, then $\lambda = -\beta$ for $|\beta| \le 1$.

Proof. Let $F(x)$ and $f(x)$ denote the cdf and pdf of a uniform random variable. Then for $|\beta| \le 1$ and $|\lambda| \le 1$, $2f(x)F(\beta x)$ and $f(x)[1 + \lambda - 2\lambda F(x)]$ define identical distributions if and only if the cumulative distribution functions are equal at each point in the support of the random variable. So the two definitions are identical if and only if, for all x,
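$$\begin{aligned}
& \int_{-\theta}^{x} 2f(y)F(\beta y)\,dy = \int_{-\theta}^{x} f(y)\,[1 + \lambda - 2\lambda F(y)]\,dy \\
\Leftrightarrow\ & \int_{-\theta}^{x}\left[\int_{-\theta}^{\beta y} f(s)\,ds + \lambda\int_{-\theta}^{y} f(s)\,ds\right] f(y)\,dy = \frac{1 + \lambda}{2}F(x) \\
\Leftrightarrow\ & \int_{-\theta}^{x} f(y)\left[\frac{1}{2} + \int_{0}^{\beta y} f(s)\,ds + \frac{\lambda}{2} + \lambda\int_{0}^{y} f(s)\,ds\right] dy = \frac{1 + \lambda}{2}F(x) \\
\Leftrightarrow\ & \frac{1 + \lambda}{2}F(x) + \int_{-\theta}^{x} f(y)\left[\int_{0}^{\beta y} f(s)\,ds + \lambda\int_{0}^{y} f(s)\,ds\right] dy = \frac{1 + \lambda}{2}F(x) \\
\Leftrightarrow\ & \int_{-\theta}^{x} f(y)\left[\int_{0}^{\beta y} f(s)\,ds + \lambda\int_{0}^{y} f(s)\,ds\right] dy = 0.
\end{aligned}$$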
Note that

$$f(x) = \begin{cases} \dfrac{1}{2\theta}, & -\theta < x < \theta, \\ 0, & \text{otherwise.} \end{cases}$$
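Before the case analysis, the claimed identity can be checked numerically. The following is a minimal sketch, not part of the original proof: it evaluates both densities on a grid inside $(-\theta, \theta)$ and reports the largest gap. The function names, the grid size and the values of $\beta$ and $\theta$ used in the example are illustrative assumptions.

```python
def uniform_pdf(x, theta):
    """Density of the uniform(-theta, theta) distribution."""
    return 1.0 / (2.0 * theta) if -theta < x < theta else 0.0


def uniform_cdf(x, theta):
    """Cdf of the uniform(-theta, theta) distribution, clamped to [0, 1]."""
    return (max(min(x, theta), -theta) + theta) / (2.0 * theta)


def azzalini_density(x, beta, theta):
    """Skewed density 2 f(x) F(beta x) from Azzalini's formulation."""
    return 2.0 * uniform_pdf(x, theta) * uniform_cdf(beta * x, theta)


def transmuted_density(x, lam, theta):
    """QRTM density f(x) [1 + lam - 2 lam F(x)], as in Eq. (2)."""
    f = uniform_pdf(x, theta)
    return f * (1.0 + lam - 2.0 * lam * uniform_cdf(x, theta))


def max_abs_gap(beta, lam, theta, n_points=201):
    """Largest pointwise gap between the two densities on an interior grid."""
    gap = 0.0
    for i in range(1, n_points):
        x = -theta + 2.0 * theta * i / n_points
        diff = azzalini_density(x, beta, theta) - transmuted_density(x, lam, theta)
        gap = max(gap, abs(diff))
    return gap


if __name__ == "__main__":
    # Illustrative values only: beta = 0.4, theta = 2.
    print("lam = -beta:", max_abs_gap(0.4, -0.4, 2.0))
    print("lam = +beta:", max_abs_gap(0.4, 0.4, 2.0))
```

With $\lambda = -\beta$ the gap should be at the level of floating-point rounding, while any other choice of $\lambda$ leaves a visible difference.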
Now, considering the cases β > 0, β = 0 and β < 0, it can easily be seen that the last equality holds if and only if λ = −β. This completes the proof.

Note that Azzalini's formulation can be applied only to symmetric distributions, whereas the QRTM can be applied to any (symmetric or asymmetric) distribution. On the other hand, the shape parameter in Azzalini's formulation can be any real number, whereas in the QRTM it is bounded between −1 and 1. In either case, a shape parameter equal to 0 produces the base distribution.

3. Extreme value distributions

Extreme Value Theory (EVT) originated in 1928 in the work of Fisher and Tippett describing the behavior of the maximum of independent and identically distributed random variables. It has been applied successfully in many fields, such as actuarial science, hydrology, climatology, engineering, economics and finance. Kotz and Nadarajah [6] describe several areas of application of the extreme value distributions. A random variable X is said to have a generalized extreme value (GEV) distribution if its cdf is given by
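$$F(x) = \exp\left\{-\left[1 + \gamma\,\frac{x - \mu}{\sigma}\right]^{-1/\gamma}\right\},$$

for $1 + \gamma(x - \mu)/\sigma > 0$, where $-\infty < \mu < \infty$ is the location parameter, $\sigma > 0$ is the scale parameter and $-\infty < \gamma < \infty$ is the shape parameter. The corresponding density function is given by

$$f(x) = \frac{1}{\sigma}\left[1 + \gamma\,\frac{x - \mu}{\sigma}\right]^{-1/\gamma - 1}\exp\left\{-\left[1 + \gamma\,\frac{x - \mu}{\sigma}\right]^{-1/\gamma}\right\}.$$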
Note that the shape parameter γ determines the tail behavior of the distribution. The particular case of the GEV for γ = 0 (taken as the limit γ → 0) and −∞ < x < ∞ is the Gumbel distribution. For γ > 0 and γ < 0 the GEV corresponds to the Fréchet and the negative Weibull distribution, respectively. The cdf of a transmuted generalized extreme value distribution is
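$$F_2(x) = (1 + \lambda)\exp\left\{-\left[1 + \gamma\,\frac{x - \mu}{\sigma}\right]^{-1/\gamma}\right\} - \lambda\exp\left\{-2\left[1 + \gamma\,\frac{x - \mu}{\sigma}\right]^{-1/\gamma}\right\}. \tag{3}$$

Hence the corresponding pdf is

$$f_2(x) = \frac{1}{\sigma}\left[1 + \gamma\,\frac{x - \mu}{\sigma}\right]^{-1/\gamma - 1}\exp\left\{-\left[1 + \gamma\,\frac{x - \mu}{\sigma}\right]^{-1/\gamma}\right\} \times \left[1 + \lambda - 2\lambda\exp\left\{-\left[1 + \gamma\,\frac{x - \mu}{\sigma}\right]^{-1/\gamma}\right\}\right]. \tag{4}$$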
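Eqs. (3) and (4) translate directly into code. The sketch below assumes γ ≠ 0 and uses only the Python standard library; the function names and the convention of returning 0 or 1 outside the support are illustrative choices, not something specified in the text.

```python
import math


def gev_cdf(x, mu, sigma, gamma):
    """GEV cdf for gamma != 0, extended by 0 or 1 outside the support."""
    t = 1.0 + gamma * (x - mu) / sigma
    if t <= 0.0:
        # Below the lower endpoint (gamma > 0) or above the upper one (gamma < 0).
        return 0.0 if gamma > 0 else 1.0
    return math.exp(-t ** (-1.0 / gamma))


def gev_pdf(x, mu, sigma, gamma):
    """GEV density for gamma != 0; zero outside the support."""
    t = 1.0 + gamma * (x - mu) / sigma
    if t <= 0.0:
        return 0.0
    return t ** (-1.0 / gamma - 1.0) * math.exp(-t ** (-1.0 / gamma)) / sigma


def transmuted_gev_cdf(x, mu, sigma, gamma, lam):
    """Eq. (3): (1 + lam) F(x) - lam F(x)^2 with F the GEV cdf."""
    F = gev_cdf(x, mu, sigma, gamma)
    return (1.0 + lam) * F - lam * F * F


def transmuted_gev_pdf(x, mu, sigma, gamma, lam):
    """Eq. (4): f(x) [1 + lam - 2 lam F(x)] with f, F the GEV pdf and cdf."""
    F = gev_cdf(x, mu, sigma, gamma)
    return gev_pdf(x, mu, sigma, gamma) * (1.0 + lam - 2.0 * lam * F)


if __name__ == "__main__":
    # Illustrative parameter values, not taken from the paper.
    print(transmuted_gev_cdf(1.0, 0.0, 1.0, 0.2, 0.5))
    print(transmuted_gev_pdf(1.0, 0.0, 1.0, 0.2, 0.5))
```

A quick consistency check is that a central finite difference of `transmuted_gev_cdf` agrees with `transmuted_gev_pdf` at interior points of the support.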
In Section 4, we study the transmuted Gumbel distribution extensively and exhibit an application in climate data modeling.

4. Transmuted Gumbel probability distribution

The Gumbel distribution, named after Gumbel [7], is also referred to as the Smallest Extreme Value (SEV) distribution or Type I Extreme Value distribution, and is perhaps the most widely used probability distribution in modeling climate data, including global warming, flood frequency analysis and rainfall modeling. Koutsoyiannis [8] studied the appropriateness of the Gumbel distribution in modeling extreme rainfall. Aryal et al. [9] have given the necessary formulation of the Gumbel distribution to study airline spill data. In fact, the Gumbel distribution has been widely used to analyze and model the behavior of random phenomena that occur in engineering, biology and the environment, among others.

Fig. 1. Probability density function of the transmuted Gumbel distribution for µ = 0, σ = 1 and λ = −1, −0.5, 0, 0.5 and 1.

A random variable X is said to have a Gumbel distribution if its pdf is given by
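$$f(x) = \frac{1}{\sigma}\exp\left\{-\frac{x - \mu}{\sigma} - \exp\left(-\frac{x - \mu}{\sigma}\right)\right\}, \qquad -\infty < x < \infty, \tag{5}$$

where µ and σ are the location and scale parameters, respectively. The corresponding cdf is given by

$$F(x) = \exp\left\{-\exp\left(-\frac{x - \mu}{\sigma}\right)\right\}. \tag{6}$$

Now, using Eq. (1) we can write

$$F_2(x) = (1 + \lambda)\exp\left\{-\exp\left(-\frac{x - \mu}{\sigma}\right)\right\} - \lambda\exp\left\{-2\exp\left(-\frac{x - \mu}{\sigma}\right)\right\}. \tag{7}$$

This yields the pdf of a transmuted Gumbel random variable as

$$f_2(x) = \frac{1}{\sigma}\exp\left\{-\frac{x - \mu}{\sigma} - \exp\left(-\frac{x - \mu}{\sigma}\right)\right\}\left[1 + \lambda - 2\lambda\exp\left\{-\exp\left(-\frac{x - \mu}{\sigma}\right)\right\}\right]. \tag{8}$$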
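As a numerical illustration of Eqs. (5) to (8) and of the Lemma in Section 2, the sketch below evaluates the transmuted Gumbel density and estimates its total mass with the trapezoidal rule. The integration limits, the step count and the function names are illustrative assumptions; the limits are chosen wide enough that the neglected tails are negligible.

```python
import math


def gumbel_pdf(x, mu, sigma):
    """Base Gumbel density, Eq. (5)."""
    z = (x - mu) / sigma
    return math.exp(-z - math.exp(-z)) / sigma


def gumbel_cdf(x, mu, sigma):
    """Base Gumbel cdf, Eq. (6)."""
    return math.exp(-math.exp(-(x - mu) / sigma))


def transmuted_gumbel_cdf(x, mu, sigma, lam):
    """Eq. (7): (1 + lam) F(x) - lam F(x)^2."""
    F = gumbel_cdf(x, mu, sigma)
    return (1.0 + lam) * F - lam * F * F


def transmuted_gumbel_pdf(x, mu, sigma, lam):
    """Eq. (8): f(x) [1 + lam - 2 lam F(x)]."""
    F = gumbel_cdf(x, mu, sigma)
    return gumbel_pdf(x, mu, sigma) * (1.0 + lam - 2.0 * lam * F)


def total_mass(mu, sigma, lam, lo=-20.0, hi=60.0, steps=20000):
    """Trapezoidal estimate of the integral of Eq. (8) over mu + sigma * [lo, hi]."""
    a = mu + sigma * lo
    h = (hi - lo) * sigma / steps
    total = 0.5 * (transmuted_gumbel_pdf(a, mu, sigma, lam)
                   + transmuted_gumbel_pdf(a + steps * h, mu, sigma, lam))
    for i in range(1, steps):
        total += transmuted_gumbel_pdf(a + i * h, mu, sigma, lam)
    return total * h


if __name__ == "__main__":
    # The same shape parameters as in Fig. 1, with mu = 0 and sigma = 1.
    for lam in (-1.0, -0.5, 0.0, 0.5, 1.0):
        print(lam, total_mass(0.0, 1.0, lam))
```

For every admissible λ the estimated mass should be 1 up to quadrature error, and λ = 0 recovers the base Gumbel density.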
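The probability density function of the transmuted Gumbel distribution for different values of the shape parameter λ is displayed in Fig. 1.

4.1. Moments

For a transmuted Gumbel random variable X the nth order moments are given by

$$E(X^n) = \int_{-\infty}^{\infty} \frac{x^n}{\sigma}\exp\left\{-\frac{x - \mu}{\sigma} - \exp\left(-\frac{x - \mu}{\sigma}\right)\right\}\left[1 + \lambda - 2\lambda\exp\left\{-\exp\left(-\frac{x - \mu}{\sigma}\right)\right\}\right] dx.$$

Let $y = \exp\left(-\frac{x - \mu}{\sigma}\right)$; then we have

$$\begin{aligned}
E(X^n) &= \int_{0}^{\infty} (\mu - \sigma\ln y)^n\,(1 + \lambda - 2\lambda\exp(-y))\exp(-y)\,dy \\
&= \sum_{k=0}^{n}\binom{n}{k}\mu^{n-k}(-\sigma)^k(1 + \lambda)\int_{0}^{\infty}(\ln y)^k\exp(-y)\,dy - \sum_{k=0}^{n}\binom{n}{k}\mu^{n-k}(-\sigma)^k\,2\lambda\int_{0}^{\infty}\exp(-2y)(\ln y)^k\,dy \\
&= \sum_{k=0}^{n}(-1)^k\binom{n}{k}\sigma^k\mu^{n-k}\left[(1 + \lambda)\frac{\partial^k}{\partial\nu^k}\Gamma(\nu) - 2\lambda\frac{\partial^k}{\partial\nu^k}\left(2^{-\nu}\Gamma(\nu)\right)\right]_{\nu=1}.
\end{aligned}$$

The last expression is obtained by using (4.358) in Gradshteyn et al. [10], which states that

$$\int_{0}^{\infty} x^{\nu-1}\exp(-\mu x)(\ln x)^n\,dx = \frac{\partial^n}{\partial\nu^n}\left[\mu^{-\nu}\Gamma(\nu)\right] \qquad \text{for } n = 0, 1, 2, 3, \ldots$$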
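As a worked instance of the general moment formula, the case n = 1 can be written out term by term. This is only a sketch of the intermediate step; it uses $\Gamma(1) = 1$, the product rule applied to $2^{-\nu}\Gamma(\nu)$, and the standard fact $\Gamma'(1) = -C$, where $C$ is Euler's constant.

```latex
\begin{aligned}
E(X) &= \mu\left[(1+\lambda)\Gamma(1) - 2\lambda\,2^{-1}\Gamma(1)\right]
      - \sigma\left[(1+\lambda)\Gamma'(1)
      - 2\lambda\,2^{-1}\bigl(\Gamma'(1) - \ln 2\,\Gamma(1)\bigr)\right] \\
     &= \mu - \sigma\left[\Gamma'(1) + \lambda\ln 2\right] \\
     &= \mu + \sigma C - \lambda\sigma\ln 2 .
\end{aligned}
```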
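Now we can write the expressions for the expected value and the variance of a transmuted Gumbel random variable as

$$E(X) = (\mu + \sigma C) - \lambda\sigma\ln 2,$$

$$\mathrm{Var}(X) = \sigma^2\left[\frac{\pi^2}{6} - \lambda(1 + \lambda)(\ln 2)^2\right],$$

where

$$C = \lim_{s\to\infty}\left(\sum_{m=1}^{s}\frac{1}{m} - \ln s\right) = 0.577215$$

is Euler's constant. Also note that $\Gamma'(1) = -C$.

4.2. Random number generator and parameter estimation

Using the method of inversion we can generate random numbers from a transmuted Gumbel probability distribution by solving

$$(1 + \lambda)\exp\left\{-\exp\left(-\frac{x - \mu}{\sigma}\right)\right\} - \lambda\exp\left\{-2\exp\left(-\frac{x - \mu}{\sigma}\right)\right\} = u,$$

where $u \sim U(0, 1)$. This yields

$$x = \mu - \sigma\ln\left[-\ln\left\{\frac{(1 + \lambda) - \sqrt{(1 + \lambda)^2 - 4\lambda u}}{2\lambda}\right\}\right]. \tag{9}$$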
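The inversion in Eq. (9) can be sketched as follows, using only the Python standard library. One deliberate departure from the printed formula is an assumption of this sketch: the quadratic root is written in the algebraically equivalent form $2u / \{(1 + \lambda) + \sqrt{(1 + \lambda)^2 - 4\lambda u}\}$, which avoids the 0/0 form at λ = 0 and the cancellation for small u. The function names, the seed and the sample size are illustrative.

```python
import math
import random

EULER_C = 0.5772156649015329  # Euler's constant C


def transmuted_gumbel_quantile(u, mu, sigma, lam):
    """Solve F2(x) = u for x, with 0 < u < 1, following Eq. (9)."""
    root = math.sqrt((1.0 + lam) ** 2 - 4.0 * lam * u)
    # Rationalised form of ((1 + lam) - root) / (2 lam); also valid at lam = 0.
    v = 2.0 * u / ((1.0 + lam) + root)
    return mu - sigma * math.log(-math.log(v))


def sample(n, mu, sigma, lam, seed=0):
    """Draw n transmuted Gumbel variates by inversion."""
    rng = random.Random(seed)
    out = []
    while len(out) < n:
        u = rng.random()
        if u > 0.0:  # skip the endpoint u == 0, where the quantile is -infinity
            out.append(transmuted_gumbel_quantile(u, mu, sigma, lam))
    return out


def theoretical_mean(mu, sigma, lam):
    """E(X) = mu + sigma C - lam sigma ln 2."""
    return mu + sigma * EULER_C - lam * sigma * math.log(2.0)


def theoretical_variance(sigma, lam):
    """Var(X) = sigma^2 [pi^2 / 6 - lam (1 + lam) (ln 2)^2]."""
    return sigma ** 2 * (math.pi ** 2 / 6.0 - lam * (1.0 + lam) * math.log(2.0) ** 2)


if __name__ == "__main__":
    xs = sample(200000, 0.0, 1.0, 0.5, seed=1)
    m = sum(xs) / len(xs)
    v = sum((x - m) ** 2 for x in xs) / (len(xs) - 1)
    print("sample mean", m, "theory", theoretical_mean(0.0, 1.0, 0.5))
    print("sample var ", v, "theory", theoretical_variance(1.0, 0.5))
```

Comparing the sample mean and variance with the closed-form expressions gives a simple end-to-end check of both the generator and the moment formulas.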
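One can use Eq. (9) to generate random numbers when the parameters µ, σ and λ are known. The maximum likelihood estimates (MLEs) of the parameters of the transmuted Gumbel probability distribution are obtained from the likelihood function

$$L = \prod_{i=1}^{n}\frac{1}{\sigma}\exp\left\{-\frac{x_i - \mu}{\sigma} - \exp\left(-\frac{x_i - \mu}{\sigma}\right)\right\}\left[1 + \lambda - 2\lambda\exp\left\{-\exp\left(-\frac{x_i - \mu}{\sigma}\right)\right\}\right],$$

$$\ln L = -n\ln\sigma - \sum_{i=1}^{n}\left[\frac{x_i - \mu}{\sigma} + \exp\left(-\frac{x_i - \mu}{\sigma}\right)\right] + \sum_{i=1}^{n}\ln\left[1 + \lambda - 2\lambda\exp\left\{-\exp\left(-\frac{x_i - \mu}{\sigma}\right)\right\}\right].$$

Now setting

$$\frac{\partial\ln L}{\partial\mu} = 0, \qquad \frac{\partial\ln L}{\partial\sigma} = 0 \quad\text{and}\quad \frac{\partial\ln L}{\partial\lambda} = 0,$$

we have

$$0 = -\frac{n}{\sigma} + \frac{1}{\sigma^2}\sum_{i=1}^{n}(x_i - \mu)\left\{1 - \exp\left(-\frac{x_i - \mu}{\sigma}\right)\right\} + \frac{2\lambda}{\sigma^2}\sum_{i=1}^{n}\frac{(x_i - \mu)\exp\left(-\frac{x_i - \mu}{\sigma}\right)\exp\left\{-\exp\left(-\frac{x_i - \mu}{\sigma}\right)\right\}}{1 + \lambda - 2\lambda\exp\left\{-\exp\left(-\frac{x_i - \mu}{\sigma}\right)\right\}},$$

$$0 = n - \sum_{i=1}^{n}\exp\left(-\frac{x_i - \mu}{\sigma}\right) + 2\lambda\sum_{i=1}^{n}\frac{\exp\left(-\frac{x_i - \mu}{\sigma}\right)\exp\left\{-\exp\left(-\frac{x_i - \mu}{\sigma}\right)\right\}}{1 + \lambda - 2\lambda\exp\left\{-\exp\left(-\frac{x_i - \mu}{\sigma}\right)\right\}},$$

and

$$0 = \sum_{i=1}^{n}\frac{1 - 2\exp\left\{-\exp\left(-\frac{x_i - \mu}{\sigma}\right)\right\}}{1 + \lambda - 2\lambda\exp\left\{-\exp\left(-\frac{x_i - \mu}{\sigma}\right)\right\}}.$$

Therefore, the likelihood equations are

$$n\sigma = \sum_{i=1}^{n}(x_i - \mu)\left\{1 - \exp\left(-\frac{x_i - \mu}{\sigma}\right)\right\} + 2\lambda\sum_{i=1}^{n}\frac{(x_i - \mu)\exp\left(-\frac{x_i - \mu}{\sigma}\right)\exp\left\{-\exp\left(-\frac{x_i - \mu}{\sigma}\right)\right\}}{1 + \lambda - 2\lambda\exp\left\{-\exp\left(-\frac{x_i - \mu}{\sigma}\right)\right\}},$$

$$n = \sum_{i=1}^{n}\exp\left(-\frac{x_i - \mu}{\sigma}\right) - 2\lambda\sum_{i=1}^{n}\frac{\exp\left(-\frac{x_i - \mu}{\sigma}\right)\exp\left\{-\exp\left(-\frac{x_i - \mu}{\sigma}\right)\right\}}{1 + \lambda - 2\lambda\exp\left\{-\exp\left(-\frac{x_i - \mu}{\sigma}\right)\right\}},$$

and

$$0 = \sum_{i=1}^{n}\frac{1 - 2\exp\left\{-\exp\left(-\frac{x_i - \mu}{\sigma}\right)\right\}}{1 + \lambda - 2\lambda\exp\left\{-\exp\left(-\frac{x_i - \mu}{\sigma}\right)\right\}}.$$

This system of equations must be solved simultaneously. There is no closed-form solution, but a numerical method, for example the Newton–Raphson method, can be used to solve the system and obtain the maximum likelihood estimates of the parameters µ, σ and λ.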
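The log-likelihood and its three partial derivatives can be coded directly, which is the input a Newton-type or quasi-Newton optimiser needs. The sketch below is illustrative only: it does not perform the optimisation, the function names are assumptions, and the small data set in the example is made up for demonstration and is not the snowfall data analysed in the paper.

```python
import math


def log_likelihood(data, mu, sigma, lam):
    """ln L for the transmuted Gumbel density of Eq. (8)."""
    total = -len(data) * math.log(sigma)
    for x in data:
        z = (x - mu) / sigma
        G = math.exp(-math.exp(-z))  # base Gumbel cdf, Eq. (6)
        total += -z - math.exp(-z) + math.log(1.0 + lam - 2.0 * lam * G)
    return total


def score(data, mu, sigma, lam):
    """Partial derivatives of ln L with respect to (mu, sigma, lam)."""
    d_mu = d_sigma = d_lam = 0.0
    for x in data:
        z = (x - mu) / sigma
        w = math.exp(-z)
        G = math.exp(-w)
        D = 1.0 + lam - 2.0 * lam * G
        d_mu += (1.0 - w) / sigma + 2.0 * lam * w * G / (sigma * D)
        d_sigma += (-1.0 + z * (1.0 - w) + 2.0 * lam * z * w * G / D) / sigma
        d_lam += (1.0 - 2.0 * G) / D
    return d_mu, d_sigma, d_lam


def finite_difference_score(data, mu, sigma, lam, h=1e-6):
    """Central-difference approximation of the score, used as a cross-check."""
    return (
        (log_likelihood(data, mu + h, sigma, lam) - log_likelihood(data, mu - h, sigma, lam)) / (2 * h),
        (log_likelihood(data, mu, sigma + h, lam) - log_likelihood(data, mu, sigma - h, lam)) / (2 * h),
        (log_likelihood(data, mu, sigma, lam + h) - log_likelihood(data, mu, sigma, lam - h)) / (2 * h),
    )


if __name__ == "__main__":
    toy_data = [3.1, 5.2, 6.0, 7.4, 9.8]  # hypothetical values, for illustration only
    print(score(toy_data, 5.5, 2.0, 0.3))
    print(finite_difference_score(toy_data, 5.5, 2.0, 0.3))
```

Setting the three components of `score` to zero reproduces the likelihood equations; agreement with the finite-difference values is a useful guard against sign errors in the derivatives.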
Fig. 2. Scatter diagram of the snow fall data.
4.3. Application of transmuted Gumbel probability distribution

In this section we show that the transmuted Gumbel probability distribution can be used to model snowfall data. We illustrate this by analyzing the snowfall data at Midway Airport in the state of Illinois. The data consist of the annual maximum daily snowfall during the period from 1970 to 2007 and were obtained from the National Climatic Data Center (NCDC). The data are shown in a scatter diagram in Fig. 2. We fitted the transmuted Gumbel probability distribution, $f_2(x)$ given in (8), by estimating the parameters µ, σ and λ using a quasi-Newton algorithm. The statistical software R [11] has been used to perform all necessary calculations and graphics. The estimates of the parameters are as follows:

$$\hat{\mu} = 5.83331, \qquad \hat{\sigma} = 2.205878, \qquad \hat{\lambda} = 0.1685559.$$
The goodness-of-fit test reveals that the data are well described by the transmuted Gumbel distribution with the above parameters. One characteristic that is often derived in snowfall/rainfall statistical data modeling is the return level. The return level with return period T is denoted by $x_T$ and is defined to be the level exceeded on average only once in T years. For example, the 2-year return level is the median of the distribution. For the proposed distribution, inverting $F_2(x_T) = 1 - \frac{1}{T}$, we have
$$x_T = \mu - \sigma\log\left[-\log\left\{\frac{1 + \lambda - \sqrt{(1 - \lambda)^2 + \frac{4\lambda}{T}}}{2\lambda}\right\}\right].$$
On substituting the parameters estimated from the subject data we have the following table:

Return period (T)    Return level (x_T)
  2                   6.38
  5                   8.79
 10                  10.42
 20                  11.99
 30                  12.91
 40                  13.54
 50                  14.04
 60                  14.45
 70                  14.79
 80                  15.08
 90                  15.34
100                  15.58
200                  17.11
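The return-level formula is easy to evaluate directly. The following sketch plugs in the parameter estimates reported in Section 4.3; the function and constant names are illustrative, and the formula as written requires λ ≠ 0 and T > 1.

```python
import math

# Parameter estimates reported in Section 4.3.
MU_HAT = 5.83331
SIGMA_HAT = 2.205878
LAMBDA_HAT = 0.1685559


def return_level(T, mu=MU_HAT, sigma=SIGMA_HAT, lam=LAMBDA_HAT):
    """Return level x_T for return period T > 1 (requires lam != 0)."""
    root = math.sqrt((1.0 - lam) ** 2 + 4.0 * lam / T)
    v = (1.0 + lam - root) / (2.0 * lam)  # base Gumbel cdf value at x_T
    return mu - sigma * math.log(-math.log(v))


if __name__ == "__main__":
    for T in (2, 5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 200):
        print(f"{T:>4d}  {return_level(T):6.2f}")
```

With these estimates the T = 2 and T = 100 entries of the table are recovered to within rounding, and the computed levels increase with the return period, as they must.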
5. Concluding remarks

In the present study we have developed the analytical framework of the transmuted extreme value probability distribution. We have established the relationship between Azzalini's framework and the QRTM for the uniform distribution. We have derived the expressions for basic statistical measures and have provided the maximum likelihood equations for the parameters of the transmuted Gumbel probability distribution. To illustrate the usefulness and effectiveness of the transmuted Gumbel probability distribution, we applied it to actual snowfall data from Midway Airport in the state of Illinois. We have provided the return levels for short-term as well as long-term return periods. It has been observed that 15.58 inches of snow would be the level of annual maximum daily snowfall expected to occur on average once in every 100 years, whereas 6.38 inches/day is expected on average once in every 2 years.

References

[1] M.G. Genton, Skew-Elliptical Distributions and their Applications: A Journey Beyond Normality, Chapman & Hall/CRC, 2004.
[2] F.Y. Edgeworth, The law of error and the elimination of chance, Philosophical Magazine 21 (1886) 308–324.
[3] A. Azzalini, References on the skew-normal distribution and related ones, On-line at http://tango.stat.unipd.it/SN/list-publ.pdf.
[4] G. Aryal, C.P. Tsokos, An application of skew Laplace distribution in finance, Journal of Statistical Theory and Applications 6 (2007) 45–60.
[5] W. Shaw, I. Buckley, The alchemy of probability distributions: Beyond Gram–Charlier expansions, and a skew-kurtotic-normal distribution from a rank transmutation map, 2007.
[6] S. Kotz, S. Nadarajah, Extreme Value Distributions: Theory and Applications, Imperial College Press, 2000.
[7] E.J. Gumbel, Statistics of Extremes, Columbia University Press, New York, 1958.
[8] D. Koutsoyiannis, On the appropriateness of the Gumbel distribution for modelling extreme rainfall, Hydrological Risk: Recent advances in peak river flow modelling, prediction and real-time forecasting, assessment of the impacts of land-use and climate changes, 2004, pp. 303–319.
[9] G. Aryal, C.P. Tsokos, Airline spill analysis using Gumbel and Moyal distributions, Neural, Parallel and Scientific Computations 16 (2008) 35–43.
[10] I.S. Gradshteyn, I.M. Ryzhik, Table of Integrals, Series and Products, Academic Press, 2000.
[11] R: A language for data analysis and graphics.