Computational Statistics & Data Analysis 51 (2006) 1822 – 1839 www.elsevier.com/locate/csda
Construction of bivariate S-distributions with copulas Lining Yu, Eberhard O. Voit∗ Department of Biostatistics, Bioinformatics and Epidemiology, Medical University of South Carolina, 135 Cannon Street, Suite 303K, Charleston, SC 29425 USA Received 3 August 2004; received in revised form 29 November 2005; accepted 30 November 2005 Available online 21 December 2005
Abstract S-distributions are univariate statistical distributions with four parameters. They have a simple mathematical structure yet provide excellent approximations for many traditional distributions and also contain a multitude of distributional shapes without a traditional analog. S-distributions furthermore have a number of beneficial features, for instance, in terms of data classification and scaling properties. They provide an appealing compromise between generality in data representation and logistic simplicity and have been applied in a variety of fields from applied biostatistics to survival analysis and risk assessment. Given their advantages in the single-variable case, it is desirable to extend S-distributions to several variates. This article proposes such an extension. It focuses on bivariate distributions whose marginals are S-distributions, but it is clear how more than two variates are to be addressed. The construction of bivariate S-distributions utilizes copulas, which have been developed quite rapidly in recent years. It is demonstrated here how one may generate such copulas and employ them to construct and analyze bivariate—and, by extension, multivariate—S-distributions. Particular emphasis is placed on Archimedean copulas, because they are easy to implement, yet quite flexible in fitting a variety of distributional shapes. It is illustrated that the bivariate S-distributions thus constructed have considerable flexibility. They cover a variety of marginals and a wide range of dependences between the variates and facilitate the formulation of relationships between measures of dependence and model parameters. Several examples of marginals and copulas illustrate the flexibility of bivariate S-distributions. © 2005 Elsevier B.V. All rights reserved. Keywords: S-distribution; Copula; Archimedean copula; Bivariate S-distribution; Marginal
∗ Corresponding author. The Wallace H. Coulter Department of Biomedical Engineering, Georgia Tech and Emory University, 313 Ferst Drive, Suite 2118, Atlanta, GA 30332, USA. Tel.: +1 404 385 0131; fax: +1 404 385 5028. E-mail address: [email protected] (E.O. Voit).
0167-9473/$ - see front matter © 2005 Elsevier B.V. All rights reserved. doi:10.1016/j.csda.2005.11.021

1. Introduction

Statistical data modeling often faces the challenge of insufficient information about the most appropriate probability distribution for the data of interest. To address this challenge, many methods have been developed, including the use of generalized systems of distribution functions, the most famous of which goes back more than a century (Pearson, 1895). These generalized systems are designed to encompass as special cases a set of known distributions that are considered particularly useful. They are interesting from a theoretical point of view, but often have a serious disadvantage for practical applications in that they tend to become rather complex and that they require a rapidly increasing number
of parameters as more special cases are included. Indeed, their mathematical complexity sometimes becomes so overwhelming that these generalized forms cannot be used for purposes of estimation or data analysis. In order to alleviate the conflicting demands of generality and simplicity, a new distribution function was proposed that did not attempt to include a maximal number of traditional distributions as exact special cases but instead approximated them with relatively high accuracy (Voit, 1992; see also Savageau, 1982). This distribution is based on four parameters and is called the S-distribution. Initially, the S-distribution was formulated as a differential equation in the cumulative distribution function (cdf) F; it was later shown that it may equivalently be written as an algebraic equation that expresses its density (pdf) f as a function of F. In this or in the original differential form, S-distributions have been used for a variety of purposes, including the representation and classification of long-tailed or otherwise odd-shaped data and distributions (Voit, 1992), survival analysis (Yu and Voit, 1995; He and Voit, 2005), representations of different types of trends (Voit, 1996; Sorribas et al., 2000; Voit and Sorribas, 2000), risk assessment (Balthis et al., 1996; Voit and Schwacke, 2000), and receiver-operator characteristics (Sorribas et al., 2002). Recent generalizations (Sorribas et al., 2006) have further increased the flexibility and range of application of S-distributions.

So far, the theory and applications of S-distributions have been limited to univariate cases. This is unfortunate, because multivariate data analysis and modeling are of increasing interest in various areas of theoretical and applied statistics. Since dealing with several variables is an inherently difficult subject, it appears desirable to make the advantages of S-distributions applicable to cases with several variates.
This article thus proposes extensions of the univariate S-distribution to two variates, and by extension to multivariate distributions, whose marginals are S-distributions. A very useful tool for this purpose exists in the form of so-called copulas. Although almost 50 years old (Sklar, 1959; see also Schweizer and Sklar, 1983; Schweizer, 1991), copulas have only recently begun to be recognized for their potential in directly relating multivariate distribution functions to their univariate marginals, thereby allowing the investigation of scale-free measures of dependence (e.g., Nelsen, 1998). Copulas can also be used to construct families of multivariate distributions from marginals that have a preselected correlation structure (Joe, 1997). In recent years, copulas have been applied to data in a variety of areas, including clinical trials (e.g., Meester and MacKay, 1994), survival analysis (e.g., Shih and Louis, 1995; Zheng and Klein, 1995), and resource management (cf. Das et al., 2003; Capéraà et al., 1997). We show here how streamlined methods based on copulas lead naturally to a multitude of bivariate S-distributions.
2. The univariate S-distribution

The S-distribution has its roots in the theory of S-systems, which were originally developed for the analysis of complex systems in biology (Savageau, 1969, 1982; Voit, 1991, 2000a; Voit and Rust, 1992). As in the case of such systems (Savageau, 1979), it was demonstrated that S-distributions may be understood as coarse, global approximations of complex population processes, which result in quite simply structured distributions of global features in such populations (Voit, 1998). An example of this type of approximation is the relatively simple structure of a survival function, which nevertheless is the result of hundreds of often-complex processes governing individuals and populations. Restricting the general S-system form to just one equation for the cumulative F as a function of the random variable X, one directly obtains the S-distribution

dF/dX = α · (F^g − F^h),  F(X0) = F0  (1)

(Voit, 1992; see also Savageau, 1982). F0 = F(X0) ∈ (0, 1) is the initial condition, which determines the location of the distribution. Specifically, if F0 = 0.5, X0 is the median of the distribution. The parameter α > 0 reflects the spread of the distribution, and g and h, with g < h, are parameters that determine the shape of the distribution. Depending on the selection of values for g and h, S-distributions can model symmetry as well as skewness to the right or left. Typical examples are shown in Fig. 1. Because the differential on the left-hand side of Eq. (1) is equivalent to the density f, it is immediately clear that the S-distribution may be formulated as an algebraic relationship between density and cumulative (Voit, 1992):

f = α · (F^g − F^h).  (2)
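Eq. (1) has no general closed-form solution, but it is straightforward to integrate numerically. The following sketch (not from the paper; the parameter values are illustrative) tabulates the cdf with a fourth-order Runge–Kutta scheme, with Eq. (2) supplying the density as a function of F:

```python
# Sketch: integrate dF/dX = alpha*(F^g - F^h) (Eq. 1) with RK4,
# starting from the initial condition F(x0) = F0 (illustrative parameters).
def s_distribution_cdf(x, x0=0.0, F0=0.5, alpha=1.0, g=0.75, h=2.0, steps=2000):
    """Approximate F(x) for an S-distribution by RK4 integration."""
    f = lambda F: alpha * (F**g - F**h)   # Eq. (2): density as a function of F
    F = F0
    dt = (x - x0) / steps
    for _ in range(steps):
        k1 = f(F)
        k2 = f(F + 0.5 * dt * k1)
        k3 = f(F + 0.5 * dt * k2)
        k4 = f(F + dt * k3)
        F += dt * (k1 + 2*k2 + 2*k3 + k4) / 6.0
    return min(max(F, 0.0), 1.0)

print(s_distribution_cdf(0.0))        # 0.5 (the initial condition: x0 is the median)
print(s_distribution_cdf(3.0) > 0.5)  # True (F is increasing in x)
```

Because F0 = 0.5 here, x0 is the median by construction; integrating to the right moves F monotonically toward 1.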
Fig. 1. Three S-distribution densities with α = 1 and F(0) = 0.01. f1: g = 0.1, h = 1; f2: g = 0.75, h = 2; and f3: g = 1, h = 8.
S-distributions have several interesting features, including the following:
(1) Their mathematical structure is simpler than that of distribution families based on densities, such as the ones proposed by Pearson (1895) and Johnston et al. (1980), and the suprafamily proposed by Savageau (1982; see also Voit and Rust, 1992).
(2) Because of the rigid general form of the distribution, there is no need to distinguish different “types” within a family, and no precautions are necessary to avoid poles or to limit the domain of definition (e.g., Pearson, 1895).
(3) The S-distribution permits symmetry as well as different types of skewness and closely approximates many traditional distributions.
(4) Because only two parameters determine the shape of an S-distribution, while the other two determine location and spread, this distribution has been used successfully for classifying traditional distributions based on their (approximated) shapes. Such classifications have been performed for continuous as well as discrete distributions (Voit, 1992; Voit and Yu, 1994).
(5) The S-distribution exhibits scaling properties similar to the normal distribution, which suggests the definition of standard S-distributions (Voit and Schwacke, 1998).
(6) The four parameters determining the S-distribution allow for considerable shape flexibility but at the same time permit efficient parameter estimation (Voit, 2000b; Hernández-Bermejo and Sorribas, 2001).
(7) The fixed structure of the S-distribution offers advantages for the numerical analysis of the distribution itself and of key features, such as quantiles and moments (Voit and Schwacke, 1998).
(8) The combination of a small number of parameters with rich flexibility in shape has allowed novel statistical analyses, for instance, of evolutions of data distributions over time (Voit et al., 1995; Voit, 1996; Sorribas et al., 2000, 2002).
3. Copulas

Copulas offer a tool for understanding relationships in multivariate data. The basic concept is that a multivariate distribution can be “organized” into its marginals plus a dependence function called a copula. A rigorous mathematical description of copulas and their features is available in Nelsen (1998). Linking univariate marginals to a multivariate distribution, the copula fully determines the dependence structure of the bivariate data, while the scaling and shape of the joint distribution are entirely determined by the marginals. The copula approach affords a high level of flexibility and convenience to the modeler, because the characterization of the margins can be separated from the specification of the dependence structure, which is determined by the parameters of the copula.

3.1. Definition and properties

A copula is itself a multivariate cdf with uniform marginals. Thus, for bivariate cases, the copula maps points of the unit square I² = [0, 1]² onto values on I = [0, 1] by correlating the quantiles of the two marginal distributions
rather than the two original random variables. Thus, the copula for two random variables is unaffected by any strictly increasing scaling of the variables. For example, the copula for X and Y is the same as the copula for ln(X) and exp(Y) (Nelsen, 1998; Rosenberg and Schuermann, 2004). Copulas overcome a number of drawbacks associated with multivariate distributions. They can deal with both normal and non-normal multivariate distributions, including those with heavy tails and high degrees of skewness. Furthermore, the construction of a joint distribution through copulas does not constrain the choice of marginal distributions. Marginals may be the same or different with respect to parameter values or even structures (Frees and Valdez, 1997; Genius and Strazzera, 2003). For instance, one marginal may be normal and the other exponential. Another important feature of copulas is that they permit varying degrees of tail-dependence, which refers to the extent to which the dependence between random variables arises from extreme observations. This type of asymmetric dependence is often encountered in practical applications, such as finance and environmental science (Das et al., 2003). This particular property of copulas is to be seen in contrast to the representation of dependence in multivariate normal distributions, which comes from central observations. Moreover, the tail dependence may be specified for the upper tail, the lower tail, or both (Das et al., 2003). Several methods may be used to construct copulas (Nelsen, 1998). The most important, the inversion method, which relates a copula to the corresponding joint distribution and will be discussed in the following, is based on Sklar’s (1959) seminal work. A limitation of this method is that it is sometimes not easy to calculate the inverses of F(x) and G(y). Similarly important is a method based on generator functions, which will also be discussed. This method can be very effective.
However, it is limited to the special case of Archimedean copulas (ACs). An alternative method employs the algebraic relationship between the joint distribution function and its univariate margins to determine copulas. A genuinely different set of methods works without reference to distribution functions or random variables and instead uses purely geometric features, such as a description of the support or the shape of the graphs of horizontal, vertical, or diagonal sections. The major limitation of these methods is that they are based on copulas with specific forms, such as C(u, v) = a(v)·u + b(v) for linear sections; C(u, v) = a(v)·u² + b(v)·u + c(v) for quadratic sections; and C(u, v) = uv + u·(1 − u)·[a(v)·(1 − u) + b(v)·u] for cubic sections. There are also mixing methods for copulas, including a method using a “convex sum” construction, which is a continuous analog of convex linear combinations. The convex sum method requires the specification of an entire family of copulas, from which the desired copula is constructed. Finally, there are ad hoc methods for specific purposes. Details of these methods, including advantages and disadvantages, are discussed in Nelsen (1998).

Of crucial importance for working with copulas is the theorem of Sklar, which proves the existence of copulas. For the two-dimensional case, it states the following (Sklar, 1959; Schweizer, 1991; Nelsen, 1998): Let H be a joint distribution function with margins F and G. Then there exists a copula C such that for all x, y ∈ [−∞, +∞]:

H(x, y) = C(F(x), G(y)).  (3)

If F and G are continuous, then C is unique; otherwise, C is uniquely determined on the product of ranges, Ran(F) × Ran(G). Conversely, if C is a copula and F and G are distribution functions, then the function H defined by Eq. (3) is a joint distribution function with margins F and G. One important consequence of Sklar’s theorem is that it provides an “inversion” method for constructing copulas from joint distributions, because if F and G are continuous, we obtain from Eq. (3):

C(u, v) = H(F^{-1}(u), G^{-1}(v)).  (4)

Thus, the copula can be retrieved from the joint distribution. As an example, consider Gumbel’s bivariate logistic distribution (Nelsen, 1998), which is defined on R² and has the form

H(x, y) = (1 + e^{-x} + e^{-y})^{-1}.  (5)

Its marginal distributions are u = F(x) = (1 + e^{-x})^{-1} and v = G(y) = (1 + e^{-y})^{-1}, and the copula corresponds to

C(u, v) = H(F^{-1}(u), G^{-1}(v)) = uv/(u + v − uv).  (6)
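The inversion in Eq. (4) is easy to check numerically for this example, since both marginals of Gumbel's bivariate logistic distribution are the same standard logistic cdf with an explicit quantile function. An illustrative sketch (not from the paper):

```python
import math

# Gumbel's bivariate logistic distribution (Eq. 5) and its logistic marginals
H = lambda x, y: 1.0 / (1.0 + math.exp(-x) + math.exp(-y))
F = lambda x: 1.0 / (1.0 + math.exp(-x))       # u = F(x) = G(x)
F_inv = lambda u: -math.log(1.0 / u - 1.0)     # quantile (inverse cdf)

# Inversion method (Eq. 4): C(u, v) = H(F^{-1}(u), G^{-1}(v))
def copula(u, v):
    return H(F_inv(u), F_inv(v))

# Compare with the closed form uv/(u + v - uv) of Eq. (6)
u, v = 0.3, 0.8
print(abs(copula(u, v) - u * v / (u + v - u * v)) < 1e-12)  # True
```

The agreement confirms that the copula depends only on the quantile scale of the marginals, not on the original units of X and Y.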
Sklar’s theorem may be used in the opposite direction as well: If C is a copula and F and G are distribution functions, then according to Eq. (3), the function H(x, y) = C(F(x), G(y)) is a joint distribution function with margins F and G. As an example, revisit Gumbel’s bivariate logistic distribution (Nelsen, 1998) with C(u, v) = uv/(u + v − uv), u = F(x) = (1 + e^{-x})^{-1} and v = G(y) = (1 + e^{-y})^{-1}. Then one obtains, according to Sklar’s theorem, H(x, y) as given by Eq. (5).

Densities may be obtained from the copula C through partial differentiation in Eq. (3). The first derivatives of the cumulative H are given as

∂H(x, y)/∂x = ∂C(F, G)/∂F · dF/dx  (7)

and

∂H(x, y)/∂y = ∂C(F, G)/∂G · dG/dy  (8)

and the bivariate density is given as the mixed second derivative, namely

h(x, y) = ∂²H(x, y)/(∂y ∂x) = ∂²C(F, G)/(∂G ∂F) · dF/dx · dG/dy = ∂²C(F, G)/(∂G ∂F) · f(x) · g(y).

This joint density exists if C(F, G) is absolutely continuous, that is, if ∂²C(F, G)/(∂F ∂G) exists. According to Nelsen (1998), the partial derivatives ∂C/∂u and ∂C/∂v exist for almost all u and v, and for such u and v, 0 ≤ ∂C(u, v)/∂u ≤ 1 and 0 ≤ ∂C(u, v)/∂v ≤ 1. These partial derivatives are defined and non-decreasing almost everywhere on I². The fact that the partial derivatives ∂C/∂u and ∂C/∂v exist is an important feature for the construction of bivariate S-distributions, as will be demonstrated later.

3.2. Archimedean copulas (ACs)

An important class among many families of copulas (see Appendix) consists of the so-called ACs. Because these have particularly useful properties, they are described here in more detail. ACs always have the same general form, namely

C(u, v) = φ^{-1}(φ(u) + φ(v)).  (9)
In this formulation, the generator of the copula, φ, is a strictly decreasing convex function from I to [0, ∞] such that φ(1) = 0. Detailed discussions of general properties of ACs can be found in the literature (e.g., Genest and Rivest, 1993; Nelsen, 1998). There are numerous types of ACs, which collectively cover a wide range of positive or negative dependences, joint asymmetry, and heavy or thin tails. ACs have beneficial properties for practical analyses. They render it possible to reduce the study of a multivariate structure to a single univariate function, the generator. Moreover, there are easy methods for constructing ACs and for sampling from them. Finally, there are obvious extensions to copulas for more than two dimensions. The key to constructing an AC is the selection of a suitable generator φ(t). Once selected, one computes φ^{-1}(t), from which the AC follows as C(u, v) = φ^{-1}(φ(u) + φ(v)). For example, if φ(t) = −ln(t) for t ∈ (0, 1], then φ(t) is continuous and strictly decreasing with φ(1) = 0 and φ^{-1}(t) = exp(−t). Elementary algebra yields C(u, v) = exp(−(−ln(u) − ln(v))). Clearly, this simplifies to C(u, v) = uv, which corresponds to the important special case of the product copula Π(u, v) for independent variables. Of course, there are also limitations. For instance, some one-parameter AC families permit only a limited degree of dependence. Other ACs are asymmetric in the sense that they lead to different tails, or they are monotonic and cannot accommodate negative dependence. To overcome these limitations, it is useful to note that, under certain conditions, families of new generators may be constructed from a “base form” generator (Nelsen, 1998). Specifically, if φ(t) is
a generator, then φ(t^α) is a generator for α > 0, and so is [φ(t)]^β for β ≥ 1. The combination, [φ(t^α)]^β, describes an entire class of generators for α > 0 and β ≥ 1. As an example, consider the base form φ(t) = (t^{-θ} − 1)/θ, which is the generator of the so-called Clayton family (Clayton, 1978). Then

φ(t) = [(t^{-θ} − 1)/θ]^δ  (10)

(−1 < θ < ∞, θ ≠ 0, δ ≥ 1) is also a valid and rather flexible generator; it will be used extensively in the construction of bivariate S-distributions below. For the particular case δ = 1/θ (θ ≥ 1), one obtains yet another generator, namely φ(t) = (1/θ)^{1/θ} × (t^{-θ} − 1)^{1/θ}, which is characterized by only one parameter. A list of generators is given in the Appendix.

ACs facilitate the computation of densities. According to Smith and Mu (2003), one directly obtains

φ′(C) · ∂C/∂F = φ′(F)  and  φ′(C) · ∂C/∂G = φ′(G),  (11)

which may be rearranged as

∂C/∂F = φ′(F)/φ′(C)  and  ∂C/∂G = φ′(G)/φ′(C).  (12)

If φ(·) is twice differentiable and φ′(C) ≠ 0, the mixed second derivative is thus

∂²C/(∂F ∂G) = −φ″(C) · φ′(F) · φ′(G)/[φ′(C)]³  (13)

and the density can be expressed in terms of the copula and the two marginal densities:

h(x, y) = ∂²C(F, G)/(∂F ∂G) · f(x) · g(y) = [−φ″(C) · φ′(F) · φ′(G)/φ′(C)³] · f(x) · g(y).  (14)

For the particular case φ(t) = [(t^{-θ} − 1)/θ]^δ, −1 < θ < ∞, θ ≠ 0, the derivatives are

φ′(t) = −δ · θ^{1−δ} · t^{-θ−1} · (t^{-θ} − 1)^{δ−1},
φ″(t) = δ · θ^{1−δ} · t^{-θ−2} · (t^{-θ} − 1)^{δ−2} · [(θ·δ + 1) · t^{-θ} − (θ + 1)],  (15)

from which we obtain

−φ″(C)/φ′(C)³ = δ^{-2} · θ^{2δ−2} · C^{2θ+1} · (C^{-θ} − 1)^{1−2δ} · [(θ·δ + 1) · C^{-θ} − (θ + 1)]  (16)
and thus, with S-distributed marginals f(x) = α·(F^g − F^h) and g(y) = β·(G^k − G^m), the bivariate density

h(x, y) = α · β · H^{2θ+1} · (H^{-θ} − 1)^{1−2δ} · [(θ·δ + 1) · H^{-θ} − (θ + 1)]
  × F^{-θ−1} · (F^{-θ} − 1)^{δ−1} · G^{-θ−1} · (G^{-θ} − 1)^{δ−1} · (F^g − F^h) · (G^k − G^m).  (17)
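The construction C(u, v) = φ^{-1}(φ(u) + φ(v)) with the two-parameter generator of Eq. (10) can be coded in a few lines. The sketch below is illustrative only (the helper name is hypothetical); for θ = 1, δ = 1 it reproduces the Clayton copula uv/(u + v − uv), and C(u, 1) = u holds by the generator property φ(1) = 0:

```python
# Sketch: an Archimedean copula C(u, v) = phi^{-1}(phi(u) + phi(v))
# built from the two-parameter generator phi(t) = ((t^-theta - 1)/theta)^delta.
def make_clayton_type_copula(theta, delta):
    phi = lambda t: ((t**-theta - 1.0) / theta) ** delta
    phi_inv = lambda s: (1.0 + theta * s**(1.0 / delta)) ** (-1.0 / theta)
    return lambda u, v: phi_inv(phi(u) + phi(v))

C = make_clayton_type_copula(theta=1.0, delta=1.0)   # Clayton, theta = 1
u, v = 0.4, 0.7
# For theta = 1, delta = 1 the copula reduces to uv/(u + v - uv)
print(abs(C(u, v) - u * v / (u + v - u * v)) < 1e-12)  # True
print(abs(C(0.3, 1.0) - 0.3) < 1e-12)                  # True: C(u, 1) = u
```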
The estimation of an AC takes advantage of the fact that C(u, v) = φ^{-1}(φ(u) + φ(v)) is uniquely determined by the function K(t) = t − φ(t)/φ′(t) = t − λ(t), which is defined on the unit interval; that is, the variable T = C(U, V) is distributed as

K(t) = t − φ(t)/φ′(t) = t − λ(t)  (18)

(Genest and Rivest, 1993). Thus, it is possible to recover φ by solving the differential equation φ(t)/φ′(t) = t − K(t) = λ(t), which yields

φ(t) = exp( ∫ (1/λ(t)) dt ).  (19)

As an example, the generator of the Gumbel–Hougaard family of copulas (Nelsen, 1998) is φ(t) = (−ln t)^θ, which corresponds to the differential equation

φ(t)/φ′(t) = t · ln t/θ = λ(t).  (20)
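Eq. (19) can be checked numerically for this example: integrating 1/λ(t) over an interval and exponentiating should reproduce the ratio φ(t1)/φ(t0) of the Gumbel–Hougaard generator (an illustrative sketch, not from the paper; the integration avoids the endpoints 0 and 1, where λ vanishes):

```python
import math

# Gumbel-Hougaard case: lambda(t) = t*ln(t)/theta (Eq. 20),
# generator phi(t) = (-ln t)^theta.
theta = 2.0
lam = lambda t: t * math.log(t) / theta

def phi_ratio(t0, t1, n=200000):
    """exp(Int_{t0}^{t1} dt/lambda(t)) ~ phi(t1)/phi(t0), per Eq. (19)."""
    dt = (t1 - t0) / n
    s = sum(1.0 / lam(t0 + (k + 0.5) * dt) for k in range(n)) * dt  # midpoint rule
    return math.exp(s)

t0, t1 = 0.2, 0.6
exact = (-math.log(t1))**theta / (-math.log(t0))**theta
print(abs(phi_ratio(t0, t1) - exact) < 1e-4)  # True
```

Only ratios of φ are recoverable here, which is consistent with the fact that a generator is defined up to a positive multiplicative constant.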
3.3. Measures of dependence

It is of great interest for many applications to investigate the dependence structure of a multivariate joint distribution. The use of copulas facilitates such analyses without the need to study marginal distributions. To appreciate this aspect, it is necessary to consider different measures of dependence, as described in Nelsen (1998).

Concordance: Let (x_i, y_i) and (x_j, y_j) denote two observations from a vector (X, Y) of continuous random variables. (x_i, y_i) and (x_j, y_j) are called concordant if x_i < x_j and y_i < y_j, or if x_i > x_j and y_i > y_j; otherwise they are called discordant. Equivalently, concordance is given if (x_i − x_j)·(y_i − y_j) > 0. The concordance function is defined as follows: suppose (X1, Y1) and (X2, Y2) are independent vectors of continuous random variables with joint distributions H1 and H2, respectively, where H1(x, y) = C1(F(x), G(y)) and H2(x, y) = C2(F(x), G(y)). The concordance function Q quantifies the difference between the probabilities of concordance and discordance of (X1, Y1) and (X2, Y2), i.e., Q = P[(X1 − X2)·(Y1 − Y2) > 0] − P[(X1 − X2)·(Y1 − Y2) < 0]. It can be shown that

Q = Q(C1, C2) = 4 ∫∫_{I×I} C2(u, v) dC1(u, v) − 1.  (21)
Kendall’s tau: Kendall’s τ is a measure of dependence that is based on the concept of concordance. It may be defined for the population or a sample. The former is given as

τ = Q,  (22)

whereas it is defined for a sample as

t = (c − d)/(c + d) = (c − d)/(n·(n − 1)/2),  (23)

where c is the number of concordant pairs, d is the number of discordant pairs, and n is the number of observations in the sample. If X and Y are continuous random variables whose copula is C, then

τ = Q(C, C) = 4 ∫∫_{I×I} C(u, v) dC(u, v) − 1 = 4E[C(U, V)] − 1.  (24)
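The sample version in Eq. (23) translates directly into code. A minimal sketch (illustrative only; ties are not handled, so it assumes continuous data):

```python
# Sample Kendall tau (Eq. 23): count concordant and discordant pairs.
def kendall_tau(xs, ys):
    n = len(xs)
    c = d = 0
    for i in range(n):
        for j in range(i + 1, n):
            s = (xs[i] - xs[j]) * (ys[i] - ys[j])
            if s > 0:
                c += 1      # concordant pair
            elif s < 0:
                d += 1      # discordant pair
    return (c - d) / (n * (n - 1) / 2)

print(kendall_tau([1, 2, 3, 4], [1, 2, 3, 4]))   # 1.0 (perfect concordance)
print(kendall_tau([1, 2, 3, 4], [4, 3, 2, 1]))   # -1.0 (perfect discordance)
```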
If X and Y are continuous random variables whose copula is an AC generated by φ, then

τ = 1 + 4 · ∫₀¹ φ(t)/φ′(t) dt.  (25)

As an example, the correlation structure is particularly simple for the generator of the Gumbel–Hougaard family, φ(t) = (−ln t)^θ. In this case, Eq. (25) reduces to

τ = (θ − 1)/θ,  (26)
which demonstrates explicitly how the parameter θ determines the correlation between the two marginals F and G.

Spearman’s rho: A second measure of dependence is Spearman’s ρ. Suppose (X1, Y1), (X2, Y2), and (X3, Y3) are three independent random vectors with a common joint distribution function H (whose margins are again F and G) and copula C. The population coefficient ρ is defined to be proportional to the probability of concordance minus the probability of discordance for the two vectors (X1, Y1) and (X2, Y3), i.e., ρ = 3·(P[(X1 − X2)·(Y1 − Y3) > 0] − P[(X1 − X2)·(Y1 − Y3) < 0]). If X and Y are continuous random variables whose copula is C, then

ρ = 3Q(C, Π) = 12 ∫∫_{I×I} uv dC(u, v) − 3  (27)
  = 12 ∫∫_{I×I} C(u, v) du dv − 3,  (28)

where the product copula Π(u, v) = uv corresponds to independent u and v.
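Eq. (28) lends itself to a simple numerical check. The sketch below (illustrative only) approximates the double integral on a midpoint grid; for the product copula it returns ρ = 0, as expected under independence, while the Clayton copula with θ = 1 shows positive dependence:

```python
# Sketch: Spearman's rho via Eq. (28), rho = 12*IntInt_{I x I} C(u, v) du dv - 3,
# approximated on an n x n midpoint grid.
def spearman_rho(C, n=400):
    total = 0.0
    for i in range(n):
        for j in range(n):
            u, v = (i + 0.5) / n, (j + 0.5) / n
            total += C(u, v)
    return 12.0 * total / (n * n) - 3.0

product = lambda u, v: u * v                      # independence copula Pi
clayton1 = lambda u, v: u * v / (u + v - u * v)   # Clayton with theta = 1

print(abs(spearman_rho(product)) < 1e-6)   # True: rho = 0 under independence
print(spearman_rho(clayton1) > 0)          # True: positive dependence
```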
4. Construction of bivariate S-distributions through copulas The construction of bivariate distributions in general requires the identification of the marginal distributions F (x) and G(y) and of an appropriate copula that connects these as H (x, y) = C(F (x), G(y)). In other words, the specification of a probability model for independent observations (x1 , y1 ) , . . . , (xn , yn ) from a bivariate population with non-normal distribution function H (x, y) can be simplified by expressing H in terms of its marginals, F (x) and G(y), and its associated dependence function, C. This strategy is used here to construct bivariate S-distributions. It is a natural way of analyzing bivariate data, because many tractable models are readily available for the marginal distributions. Furthermore, many authors restrict the estimation to dependence functions that are known to belong to a specific class of bivariate distributions indexed by a one- or two-dimensional parameter. A more general solution to the problem consists of choosing an appropriate parametric family of dependence functions. Genest and Rivest (1993) assume that the data can be suitably modeled by an AC, which the authors estimate from data (x1 , y1 ) , . . . , (xn , yn ), and describe in detail the method of estimating the dependence function and various analytical results. We follow a similar strategy by creating a copula in Archimedean form, selecting S-distributions as marginals, and finally constructing the bivariate distributions. Specifically, let us consider the case where one is interested in modeling the stochastic behavior of two random variables X and Y based on a set of n independent observations of the pair (X, Y ), say {(x1 , y1 ) , (x2 , y2 ) , . . . , (xn , yn )}. Suppose one has already estimated the respective marginal distributions, F (x), and G(y), using standard statistical techniques. 
It is then of interest to test whether X and Y are independent and to describe their dependence structure if they are not independent. One way to describe this dependence structure is to estimate and use their joint distribution function, with copulas as tools (Genest and Rivest, 1993). H and C are obtained as H(x, y) = P[X ≤ x, Y ≤ y] and C(u, v) = P[F(X) ≤ u, G(Y) ≤ v] = H(F^{-1}(u), G^{-1}(v)). Thus, if one can estimate the copula C, one has H as H(x, y) = C(F(x), G(y)). In fact, one can transform the observations of X and Y, (x1, y1), (x2, y2), ..., (xn, yn), into observations of U = F(X) and V = G(Y), (u1, v1), (u2, v2), ..., (un, vn), where ui = F(xi) and vi = G(yi), i = 1, 2, ..., n. Thus, C may be estimated as the joint distribution of U and V, and this leads to an estimate of H.
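The transformation to copula-scale observations is easy to sketch in code. In the sketch below (illustrative, not from the paper), empirical ranks stand in for the estimated marginals F and G, a common practical substitute when the marginals are unknown:

```python
# Sketch: transform raw observations into pseudo-observations
# u_i = F(x_i), v_i = G(y_i), using empirical ranks rescaled to (0, 1).
def pseudo_observations(xs, ys):
    n = len(xs)
    rank = lambda zs: [sorted(zs).index(z) + 1 for z in zs]  # 1-based ranks (no ties)
    us = [r / (n + 1) for r in rank(xs)]   # divide by n+1 to stay inside (0, 1)
    vs = [r / (n + 1) for r in rank(ys)]
    return list(zip(us, vs))

print(pseudo_observations([10, 30, 20], [1, 9, 5]))
# [(0.25, 0.25), (0.75, 0.75), (0.5, 0.5)]
```

The resulting pairs (u_i, v_i) can then serve as input for estimating the copula C, for instance along the lines of Genest and Rivest (1993).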
4.1. Examples of bivariate S-distributions

(1) For a first simple example, we select the AC C(u, v) = φ^{-1}(φ(u) + φ(v)) with generator φ(t) = t^{-1} − 1 and joint distribution H(x, y) = C(F(x), G(y)). Next, we select the marginals as generic univariate S-distributions, namely

dF/dx = α·(F^g − F^h)  and  dG/dy = β·(G^k − G^m).  (29)

Computing the appropriate partial derivatives of H, as shown above (see Eqs. (7), (8), and (14)), we obtain a set of partial differential equations for the bivariate S-distribution H in terms of the marginals F and G:

∂H(x, y)/∂x = [φ′(F)/φ′(C)] · α·(F^g − F^h) = [−F^{-2}/(−C^{-2})] · α·(F^g − F^h) = α · H² · (F^{g−2} − F^{h−2})  (30)

and

∂H(x, y)/∂y = β · H² · (G^{k−2} − G^{m−2}).  (31)
As in the univariate case (see Eq. (1)), these equations require the specification of initial conditions H(x0, y0), which determine the location of the distribution. The corresponding density is obtained from Eqs. (30) and (31) by further differentiation, evoking the appropriate initial conditions, and working out the density h(x, y) in terms of H, F, and G, namely

h(x, y) = ∂/∂x [∂H(x, y)/∂y] = ∂/∂x [β · H² · (G^{k−2} − G^{m−2})] = 2 · α · β · H³ · (F^{g−2} − F^{h−2}) · (G^{k−2} − G^{m−2}).  (32)
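Eqs. (30)–(32) can be checked numerically in a tractable special case. For g = k = 1, h = m = 2 and α = β = 1, each marginal ODE has the closed-form logistic solution F(x) = (1 + e^{-x})^{-1} (an exact S-distribution), so H is explicit and its mixed partial can be compared with Eq. (32) by finite differences (illustrative sketch, not from the paper):

```python
import math

# Special case g = k = 1, h = m = 2, alpha = beta = 1: logistic marginals,
# copula generated by phi(t) = 1/t - 1, hence H = FG/(F + G - FG).
F = lambda x: 1.0 / (1.0 + math.exp(-x))
H = lambda x, y: (lambda u, v: u * v / (u + v - u * v))(F(x), F(y))

def h_fd(x, y, eps=1e-4):
    """Mixed second partial of H by central finite differences."""
    return (H(x+eps, y+eps) - H(x+eps, y-eps)
            - H(x-eps, y+eps) + H(x-eps, y-eps)) / (4 * eps * eps)

x, y = 0.4, -0.6
# Eq. (32) with g=1, h=2: h = 2*H^3*(F^-1 - 1)*(G^-1 - 1)
h_closed = 2 * H(x, y)**3 * (1/F(x) - 1) * (1/F(y) - 1)
print(abs(h_fd(x, y) - h_closed) < 1e-6)  # True
```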
The equations for the derivatives and the density show an immediate analogy with the univariate S-distributed marginals. It is noted that S-distributions allow for finite left tails, in which case the cumulatives and densities are only defined down to the point where the density becomes 0. Hernández-Bermejo and Sorribas (2001) discussed this situation in detail.

(2) The copula of the previous example, generated by φ(t) = t^{-1} − 1, is from the Clayton family φ(t) = (t^{-θ} − 1)/θ, whose application is somewhat limited. Specifically, the copula generated by φ(t) = t^{-1} − 1 permits only a certain degree of dependence and a single structure of dependence. As a more flexible form, we consider again the two-parameter generator φ(t) = [(t^{-θ} − 1)/θ]^δ and apply it to S-distributed marginals as in the previous example. The AC corresponding to this generator is

C(u, v) = [1 + ((u^{-θ} − 1)^δ + (v^{-θ} − 1)^δ)^{1/δ}]^{-1/θ}.  (33)

Partial differentiation of the joint distribution H(x, y) = C(F(x), G(y)) yields H as a set of two differential equations:

∂H(x, y)/∂x = α · H^{θ+1} · (H^{-θ} − 1)^{1−δ} · F^{-θ−1} · (F^{-θ} − 1)^{δ−1} · (F^g − F^h)

and

∂H(x, y)/∂y = β · H^{θ+1} · (H^{-θ} − 1)^{1−δ} · G^{-θ−1} · (G^{-θ} − 1)^{δ−1} · (G^k − G^m).  (34)
As in the previous example, these equations are reminiscent of the univariate S-distributed marginals. The set of equations involves eight parameters: α, β, g, h, k, m for the marginal distributions and θ and δ for the generator of
the AC. In addition, the initial conditions, H(x0, y0), determine the location of the distribution, as discussed before. Further differentiation yields the bivariate density, as shown before. It has the form

h(x, y) = α · β · H^{2θ+1} · (H^{-θ} − 1)^{1−2δ} · [(θ·δ + 1) · H^{-θ} − (θ + 1)]
  × F^{-θ−1} · (F^{-θ} − 1)^{δ−1} · G^{-θ−1} · (G^{-θ} − 1)^{δ−1} · (F^g − F^h) · (G^k − G^m).  (35)

The copula allows us to study scale-free measures of dependence, such as Spearman’s ρ and Kendall’s τ, which are measures of concordance that are invariant under strictly increasing transformations of the underlying random variables. As an illustration, we compute Kendall’s τ for the bivariate S-distributions just obtained. Directly applying the computations in Eq. (25) yields

τ = 1 + 4 · ∫₀¹ φ(t)/φ′(t) dt = 1 + 4 · (1/(δ·θ)) · ∫₀¹ (t^{θ+1} − t) dt
  = 1 + (4/(δ·θ)) · (1/(θ+2) − 1/2) = 1 − 2/((θ + 2) · δ).  (36)

The result indicates that τ can accommodate positive and negative dependence. Specifically, τ = −1 if θ = −1 and δ = 1, and τ → 1 if θ → ∞ or δ → ∞. For other parameter combinations, the numerical values of θ and δ determine the correlation between the random variables, and the mathematical relationship between τ and the parameters of the generator of the copula permits the determination of the degree of correlation, given the copula. It also facilitates the construction of copulas, given a desired correlation, by means of suitable θ and δ. The example again includes the special case of δ = 1, −1 < θ < ∞, θ ≠ 0, which represents the Clayton family of copulas and leads to the particularly simple distribution

∂H(x, y)/∂x = α · H^{θ+1} · F^{-θ−1} · (F^g − F^h)  and  ∂H(x, y)/∂y = β · H^{θ+1} · G^{-θ−1} · (G^k − G^m)  (37)

and the density

h(x, y) = α · β · (θ + 1) · H^{2θ+1} · (F^{g−θ−1} − F^{h−θ−1}) · (G^{k−θ−1} − G^{m−θ−1}).  (38)
As in the previous scenario, it is interesting how the structure of the univariate S-distribution is reflected in this bivariate form.

5. Numerical implementations

The previous section addressed the construction of bivariate S-distributions in a generic fashion and thus permitted any choices of parameters for the marginals and the copula. It is now of interest to demonstrate the great flexibility of shapes afforded by the copula method as applied to S-distributed marginals. The examples were computed in MathCad.

The first two examples consider two marginals that have S-distribution parameters α = 0.84307, g = 0.690923, h = 2.885386, and F(0) = 0.5 and therefore closely approximate normal distributions (Voit, 1992). Fig. 2 shows an AC with the simple generator φ(t) = t^{-1} − 1; Kendall’s τ in this example is 1/3. In Fig. 3, the generator and dependence are φ(t) = [(t^{-2} − 1)/2]^4, τ = 7/8.

The second set of examples uses one quasi-normal marginal as before, but different second marginals and AC generators. Fig. 4 is based on an AC with generator φ(t) = t^{-1} − 1. The first marginal is exponential (β = 1, k = 0, m = 1; τ = 1/3) and the second is skewed to the right without a particular name (β = 1, k = 3, m = 3.2; τ = 2/3). The densities in Fig. 5 are based on the same skewed marginal without a particular name (β = 1, k = 1, m = 8), but differ in generators: φ(t) = (t^{-1} − 1)², τ = 2/3 (Panels A, B) versus φ(t) = [(t^{-4} − 1)/4]^6, τ = 17/18 (Panels C, D).
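The quoted dependence values follow from Eq. (36), τ = 1 − 2/((θ + 2)·δ); for instance, the Fig. 3 generator (θ = 2, δ = 4) gives τ = 7/8. An illustrative numerical cross-check of Eq. (36) against Eq. (25):

```python
# Numeric check of Eq. (36): tau = 1 + 4*Int_0^1 phi(t)/phi'(t) dt for
# phi(t) = ((t^-theta - 1)/theta)^delta, where phi/phi' = (t^(theta+1) - t)/(delta*theta),
# against the closed form tau = 1 - 2/((theta + 2)*delta).
def tau_two_param(theta, delta, n=100000):
    s = 0.0
    for k in range(n):
        t = (k + 0.5) / n                       # midpoint rule on (0, 1)
        s += (t**(theta + 1) - t) / (delta * theta)
    return 1.0 + 4.0 * s / n

theta, delta = 2.0, 4.0                          # the Fig. 3 generator
print(abs(tau_two_param(theta, delta) - (1 - 2 / ((theta + 2) * delta))) < 1e-4)  # True
```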
Fig. 2. Bivariate S-distribution H(x, y) with two quasi-normal marginals and AC with generator φ(t) = t^{−1} − 1. Panel A: plot of H(x, y); Panel B: contour plot of H(x, y); Panel C: plot of density h(x, y); Panel D: contour plot of h(x, y).
Fig. 3. Bivariate S-distribution H(x, y) with two quasi-normal marginals and AC with generator φ(t) = ((t^{−2} − 1)/2)^4. Panel A: plot of H(x, y); Panel B: contour plot of H(x, y); Panel C: plot of density h(x, y); Panel D: contour plot of h(x, y).
Fig. 4. Three-dimensional and contour plot representations of two bivariate S-distribution densities h(x, y) with one quasi-normal and one other marginal and with AC generated by φ(t) = t^{−1} − 1. Panels A and B: exponential marginal. Panels C and D: marginal without name, skewed to the right.
Fig. 5. Three-dimensional and contour plot representations of two bivariate S-distribution densities h(x, y) with one quasi-normal and one skewed marginal and different AC generators. Panels A, B: φ(t) = (t^{−1} − 1)^2; τ_C = 2/3. Panels C, D: φ(t) = ((t^{−4} − 1)/4)^6; τ_C = 17/18.
6. Conclusions and discussion

In this article we have extended S-distributions to their bivariate analogs. This step is of interest because univariate S-distributions have interesting features that combine great flexibility in shape with relative computational simplicity, given that most practical statistical analyses require the use of computers anyway. Bivariate distributions are generally more difficult to tackle, but it was shown here that bivariate S-distributions retain much of the structure of their univariate counterparts, which suggests that some methodologies may be transferable.

For the construction of the bivariate S-distributions we used copulas, which have been discussed in the literature for a while, but which only recently have received the attention they deserve. Copulas have the advantage that the correlation structure between random variables can be separated from the marginals. In our case, this immediately allowed us to use univariate S-distributions as marginals and to construct from them bivariate distributions with various dependence structures. The dependence was easily assessed through Kendall's τ or Spearman's ρ and, for the particularly convenient case of ACs, was shown to map directly to the parameters of the copula's generator. The resulting distributions permit the full range of dependence, [−1, 1], as well as varying degrees of tail dependence, such as stronger dependence in the upper tail, lower tail, or both.

The construction of bivariate S-distributions by means of copulas also led directly to the corresponding densities. These resemble in their mathematical structure the corresponding univariate marginals and, like these, may be expressed as algebraic relationships between density and cumulatives.

Of course, the strategy proposed here does not account for all possible cases, and it is always necessary to weigh generality against simplicity of representation and analysis. For instance, working out Eqs.
(34) and (35) requires selection of the initial conditions H(x₀, y₀), which may be difficult to specify. Generally, calculating Eqs. (34) and (35) is not always trivial, and while one may employ modern computers for the purpose, it might be advantageous to reduce these equations, for example, to the simpler Eqs. (37) and (38), if one expects that the data may be suitably modeled by a convenient family of copulas, such as the Clayton family.

Because of their particular importance, and for reasons of clarity, we have restricted the discussion here to bivariate distributions. Nevertheless, the construction methods based on copulas immediately suggest the transition to higher-dimensional distributions. For instance, the corresponding AC has the analogous general form C(u₁, …, u_n) = φ^{−1}(φ(u₁) + ⋯ + φ(u_n)) (Nelsen, 1998), which indicates that it is straightforward in principle to construct multivariate S-distributions of dimension n.

Acknowledgements

The authors are grateful to Drs. S. Lipsitz and P. Rust for their generous help throughout this project and for their constructive criticism.

Appendix

Following is a list of one-parameter bivariate families of copulas and generators that have been described in the literature (e.g., Joe, 1997).

(1) Normal: for 0 ≤ ρ ≤ 1, C_ρ(u, v) = Φ_ρ(Φ^{−1}(u), Φ^{−1}(v)), where Φ is the N(0, 1) cdf and Φ_ρ the bivariate standard normal cdf with correlation ρ. With x = Φ^{−1}(u), y = Φ^{−1}(v), the density of C_ρ(u, v) is
c_ρ(u, v) = (1 − ρ²)^{−1/2} exp(−(1/2)(1 − ρ²)^{−1}(x² + y² − 2ρxy)) exp((1/2)(x² + y²)).

Noteworthy properties: reflection symmetry; easy multivariate extension; reduces to the independence copula for ρ = 0.
(2) Plackett (1965): for 0 ≤ ψ ≤ ∞,

C_ψ(u, v) = (1/2)·η^{−1}·[1 + η(u + v) − ((1 + η(u + v))² − 4ψηuv)^{1/2}], where η = ψ − 1.

Noteworthy properties: reflection symmetry; reduces to the independence copula for ψ → 1.
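A compact implementation of family (2) may clarify the roles of ψ and η. This Python fragment is our illustrative addition, with the ψ → 1 limit handled explicitly:

```python
import math

def plackett(u, v, psi):
    """Plackett copula C_psi(u, v); psi -> 1 is the independence limit."""
    if abs(psi - 1.0) < 1e-9:
        return u * v  # limiting case, since eta = psi - 1 vanishes
    eta = psi - 1.0
    s = 1.0 + eta * (u + v)
    return 0.5 * (s - math.sqrt(s * s - 4.0 * psi * eta * u * v)) / eta
```

Setting either argument to 1 recovers the other argument, as required of a copula's margins.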
(3) Kimeldorf and Sampson (1975): for 0 ≤ θ ≤ ∞, C_θ(u, v) = (u^{−θ} + v^{−θ} − 1)^{−1/θ}. Noteworthy properties: lower tail dependence; mixture of powers; reduces to the independence copula for θ → 0.

(4) Galambos (1975): for 0 ≤ θ < ∞, C_θ(u, v) = uv·exp(((−ln u)^{−θ} + (−ln v)^{−θ})^{−1/θ}). Noteworthy properties: upper tail dependence; extreme value copula; reduces to the independence copula for θ → 0.

(5) For 0 ≤ θ ≤ 1, C_θ(u, v) = θ·min{u, v} + (1 − θ)uv. Noteworthy properties: reflection symmetry; reduces to the independence copula for θ = 0.

(6) For 0 ≤ θ ≤ 1, C_θ(u, v) = [min{u, v}]^θ [uv]^{1−θ}. Noteworthy properties: reflection symmetry; reduces to the independence copula for θ = 0.

Of particular interest are Archimedean copulas. These families are reviewed in more detail in Nelsen (1998):

(7) Ali–Mikhail–Haq family: C_θ(u, v) = uv/(1 − θ(1 − u)(1 − v)); φ(t) = ln((1 − θ(1 − t))/t), −1 ≤ θ < 1; reduces to the independence copula for θ = 0.

(8) Gumbel–Hougaard family: C_θ(u, v) = exp(−((−ln u)^θ + (−ln v)^θ)^{1/θ}); φ(t) = (−ln t)^θ, 1 ≤ θ < ∞.
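Families (7)–(14) all follow the Archimedean template C(u, v) = φ^{−1}(φ(u) + φ(v)). The generic sketch below is our addition (names are arbitrary), instantiated with the Gumbel–Hougaard generator of family (8):

```python
import math

def archimedean(u, v, phi, phi_inv):
    """Archimedean construction C(u, v) = phi_inv(phi(u) + phi(v))."""
    return phi_inv(phi(u) + phi(v))

def gumbel_hougaard(u, v, theta):
    # Generator phi(t) = (-ln t)**theta with inverse phi_inv(s) = exp(-s**(1/theta)).
    phi = lambda t: (-math.log(t)) ** theta
    phi_inv = lambda s: math.exp(-s ** (1.0 / theta))
    return archimedean(u, v, phi, phi_inv)
```

For θ = 1 the generator reduces to −ln t and the construction returns uv, the independence copula; larger θ produces increasingly positive dependence.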
(9) Frank family: C_θ(u, v) = −θ^{−1} ln([η − (1 − exp(−θu))(1 − exp(−θv))]/η), where η = 1 − exp(−θ); φ(t) = −ln((exp(−θt) − 1)/(exp(−θ) − 1)), for 0 ≤ θ < ∞.
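For family (9), the closed form and the generator route must agree; the following check is our illustrative Python addition:

```python
import math

def frank_closed_form(u, v, theta):
    """Frank copula in the closed form given above, with eta = 1 - exp(-theta)."""
    eta = 1.0 - math.exp(-theta)
    num = eta - (1.0 - math.exp(-theta * u)) * (1.0 - math.exp(-theta * v))
    return -math.log(num / eta) / theta

def frank_generator(u, v, theta):
    """Frank copula built from phi(t) = -ln((exp(-theta*t) - 1)/(exp(-theta) - 1))."""
    phi = lambda t: -math.log((math.exp(-theta * t) - 1.0) / (math.exp(-theta) - 1.0))
    # Inverse obtained by solving s = phi(t) for t.
    phi_inv = lambda s: -math.log(1.0 + math.exp(-s) * (math.exp(-theta) - 1.0)) / theta
    return phi_inv(phi(u) + phi(v))
```

Both routes return identical values up to floating-point error, and setting one argument to 1 returns the other argument.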
(10) Gumbel–Barnett family: C_θ(u, v) = uv·exp(−θ ln u ln v); φ(t) = ln(1 − θ ln t), 0 < θ ≤ 1.
(11) Joe (1993) family: C_θ(u, v) = 1 − [(1 − u)^θ + (1 − v)^θ − (1 − u)^θ (1 − v)^θ]^{1/θ}; φ(t) = −ln(1 − (1 − t)^θ), for 1 ≤ θ < ∞.

(12) Clayton (1978): C_θ(u, v) = [max(u^{−θ} + v^{−θ} − 1, 0)]^{−1/θ}; φ(t) = t^{−θ} − 1, for θ ≥ 0; reduces to the independence copula for θ → 0.

(13) C_θ(u, v) = θ/ln(exp(θ/u) + exp(θ/v) − exp(θ)); φ(t) = exp(θ/t) − exp(θ), for θ > 0.

(14) C_θ(u, v) = [1 + ((u^{−1/θ} − 1)^θ + (v^{−1/θ} − 1)^θ)^{1/θ}]^{−θ}; φ(t) = (t^{−1/θ} − 1)^θ, for θ > 0.
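The Kendall's τ relationship used in the main text, τ = 1 + 4∫₀¹ φ(t)/φ′(t) dt (Genest and Rivest, 1993), applies to every generator in this list. The small numerical sketch below is our addition (finite-difference derivative, midpoint integration); it recovers the known values τ = θ/(θ + 2) for the Clayton generator and τ = 1 − 1/θ for the Gumbel–Hougaard generator:

```python
import math

def kendall_tau(phi, n=4000, eps=1e-7):
    """Kendall's tau of an Archimedean copula from its generator phi.

    tau = 1 + 4 * integral_0^1 phi(t)/phi'(t) dt; phi' is approximated by a
    central difference, and midpoint nodes keep t away from 0 and 1.
    """
    total = 0.0
    for i in range(n):
        t = (i + 0.5) / n
        dphi = (phi(t + eps) - phi(t - eps)) / (2.0 * eps)
        total += phi(t) / dphi
    return 1.0 + 4.0 * total / n

clayton_gen = lambda th: (lambda t: t ** (-th) - 1.0)        # tau = th/(th + 2)
gumbel_gen = lambda th: (lambda t: (-math.log(t)) ** th)     # tau = 1 - 1/th
```

For instance, `kendall_tau(clayton_gen(2.0))` and `kendall_tau(gumbel_gen(2.0))` both approximate 1/2.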
References

Balthis, W.L., Voit, E.O., Meaburn, G.M., 1996. Setting prediction limits for mercury concentrations in fish having high bioaccumulation potential. Environmetrics 7, 429–439.
Capéraà, P., Fougères, A.-L., Genest, C., 1997. A nonparametric estimation procedure for bivariate extreme value copulas. Biometrika 84 (3), 567–577.
Clayton, D.G., 1978. A model for association in bivariate life tables and its application in epidemiological studies of familial tendency in chronic disease incidence. Biometrika 65, 141–151.
Das, S.R., Geng, G., 2003. Simulating correlated default processes using copulas: a criterion-based approach. Paper posted on DefaultRisk.com. http://www.defaultrisk.com/
Frees, E.W., Valdez, E.A., 1997. Understanding relationships using copulas. Paper presented at the 32nd Actuarial Research Conference, August 1997.
Galambos, J., 1975. Order statistics of samples from multivariate distributions. J. Amer. Statist. Assoc. 70, 674–680.
Genest, C., Rivest, L.-P., 1993. Statistical inference procedures for bivariate Archimedean copulas. J. Amer. Statist. Assoc. 88, 1034–1043.
Genius, M., Strazzera, E., 2003. The copula approach to sample selection modeling: an application to the recreational value of forests. Paper posted on the Social Science Research Network Electronic Paper Collection: http://papers.ssrn.com/
He, Q., Voit, E.O., 2005. Estimation and completion of survival data with piecewise linear models and S-distributions. J. Statist. Comput. Simulation 75 (4), 287–305.
Hernández-Bermejo, B., Sorribas, A., 2001. Analytical quantile solution for the S-distribution, random number generation and statistical data modeling. Biometrical J. 43, 1007–1025.
Joe, H., 1993. Parametric families of multivariate distributions with given margins. J. Multivariate Anal. 46, 262–282.
Joe, H., 1997. Multivariate Models and Dependence Concepts. Chapman & Hall, London.
Johnston, M.E., Tietjen, G.L., Beckman, R.J., 1980. A new family of probability distributions with applications to Monte Carlo studies. J. Amer. Statist. Assoc. 75, 276–279.
Kimeldorf, G., Sampson, A.R., 1975. Uniform representations of bivariate distributions. Comm. Statist. 4, 617–627.
Meester, S.G., MacKay, J., 1994. A parametric model for cluster correlated categorical data. Biometrics 50, 954–963.
Nelsen, R.B., 1998. An Introduction to Copulas. Springer, New York.
Pearson, K., 1895. Contributions to the mathematical theory of evolution. II. Skew variations in homogeneous material. Philos. Trans. Roy. Soc. A 186, 343–414.
Plackett, R.L., 1965. A class of bivariate distributions. J. Amer. Statist. Assoc. 60, 516–522.
Rosenberg, J.V., Schuermann, T.I., 2004. A general approach to integrated risk management with skewed, fat-tailed risks. Federal Reserve Bank of New York Staff Reports, No. 185. http://www.newyorkfed.org/
Savageau, M.A., 1969. Biochemical systems analysis, I. Some mathematical properties of the rate law for the component enzymatic reactions. J. Theoret. Biol. 25, 365–369.
Savageau, M.A., 1979. Growth of complex systems can be related to the properties of their underlying determinants. Proc. Nat. Acad. Sci. 76, 5413–5417.
Savageau, M.A., 1982. A suprasystem of probability distributions. Biometrical J. 24, 323–330.
Schweizer, B., 1991. Thirty years of copulas. In: Dall'Aglio, G., Kotz, S., Salinetti, G. (Eds.), Advances in Probability Distributions with Given Marginals. Kluwer, Dordrecht, Netherlands, pp. 13–50.
Schweizer, B., Sklar, A., 1983. Probabilistic Metric Spaces. Elsevier, North-Holland, New York.
Shih, J.H., Louis, T.A., 1995. Inferences on the association parameter in copula models for bivariate survival data. Biometrics 51, 1384–1399.
Sklar, A., 1959. Fonctions de répartition à n dimensions et leurs marges. Publ. Inst. Statist. Univ. Paris 8, 229–231.
Smith, M.D., 2003. Modelling sample selection using Archimedean copulas. Econometrics J. 6, 99–123.
Sorribas, A., March, J., Voit, E.O., 2000. Estimating age-related trends in cross-sectional studies using S-distributions. Statist. Med. 10 (5), 697–713.
Sorribas, A., March, J., Trujillano, J., 2002. A new parametric method based on S-distributions for computing receiver operating characteristic curves for continuous diagnostic tests. Statist. Med. 21, 1213–1235.
Sorribas, A., Muiño, J.M., Voit, E.O., 2006. GS-distributions: a new family of distributions for continuous unimodal variables. Comput. Statist. Data Anal., in press, doi:10.1016/j.csda.2005.04.016.
Voit, E.O. (Ed.), 1991. Canonical Nonlinear Modeling. S-System Approach to Understanding Complexity. Van Nostrand Reinhold, New York.
Voit, E.O., 1992. The S-distribution: a tool for approximation and classification of univariate, unimodal probability distributions. Biometrical J. 34, 855–878.
Voit, E.O., 1996. Dynamic trends in distributions. Biometrical J. 38 (5), 587–603.
Voit, E.O., 1998. Canonical modeling: a link between environmental models and statistics. Austrian J. Statist. 27, 109–121.
Voit, E.O., 2000a. Computational Analysis of Biochemical Systems. A Practical Guide for Biochemists and Molecular Biologists. Cambridge University Press, Cambridge, UK.
Voit, E.O., 2000b. A maximum likelihood estimator for the shape parameters of S-distributions. Biometrical J. 42 (4), 471–479.
Voit, E.O., Rust, P.F., 1992. Tutorial: S-system analysis of continuous univariate probability distributions. J. Statist. Comput. Simulation 42, 187–249.
Voit, E.O., Schwacke, L.H., 1998. Scalability properties of the S-distribution. Biometrical J. 40, 665–684.
Voit, E.O., Schwacke, L.H., 2000. Random number generation from right-skewed, symmetric, and left-skewed distributions. Risk Anal. 20 (1), 59–71.
Voit, E.O., Sorribas, A., 2000. Computer modeling of dynamically changing distributions of random variables. Math. Comput. Modelling 31, 217–225.
Voit, E.O., Yu, S.Y., 1994. The S-distribution: approximation of discrete distributions. Biometrical J. 36, 205–219.
Voit, E.O., Balthis, W.L., Holser, R.A., 1995. Hierarchical Monte Carlo modeling with S-distributions: concepts and illustrative analysis of mercury contamination in king mackerel. Environ. Int. 21, 627–635.
Yu, S., Voit, E.O., 1995. A simple, flexible failure model. Biometrical J. 37, 595–609.
Zheng, M., Klein, J.P., 1995. Estimates of the marginal survival for dependent competing risks based on an assumed copula. Biometrika 82 (1), 127–138.