A simple approach to standardized-residuals-based higher-moment tests

A simple approach to standardized-residuals-based higher-moment tests

Journal of Empirical Finance 19 (2012) 427–453 Contents lists available at SciVerse ScienceDirect Journal of Empirical Finance journal homepage: www...

452KB Sizes 0 Downloads 34 Views

Journal of Empirical Finance 19 (2012) 427–453

Contents lists available at SciVerse ScienceDirect

Journal of Empirical Finance journal homepage: www.elsevier.com/locate/jempfin

A simple approach to standardized-residuals-based higher-moment tests Yi-Ting Chen ⁎ Institute of Economics, Academia Sinica, Taipei 115, Taiwan

a r t i c l e

i n f o

Article history: Received 17 June 2011 Received in revised form 8 March 2012 Accepted 5 April 2012 Available online 17 April 2012 JEL classification: C12 C22 C52

Keywords: Conditional distribution Estimation effect GARCH-type models Higher-moment tests Standardized errors

a b s t r a c t We propose a new approach to the higher-moment tests for evaluating the standardized error distribution hypothesis of a conditional mean-and-variance model (such as a GARCH-type model). Our key idea is to purge the effect of estimating the conditional mean-and-variance parameters on the estimated higher moments by suitably using the first and second moments of the standardized residuals. The resulting higher-moment tests have a simple invariant form for various conditional mean-and-variance models, and are also applicable to the symmetry or independence hypothesis that does not involve a complete standardized error distribution. Thus, our tests are simple and flexible. Using our approach, we establish a class of skewness– kurtosis tests, characteristic-function-based moment tests, and Value-at-Risk tests for exploring the standardized error distribution and higher-order dependence structures. We also conduct a simulation to show the validity of our approach in purging the estimation effect, and provide an empirical example to show the usefulness of our tests in exploring conditional non-normality. © 2012 Elsevier B.V. All rights reserved.

1. Introduction A conditional distribution model is often established by adding a distribution hypothesis to the standardized errors of a conditional mean-and-variance model that are assumed to be independently and identically distributed (IID). This context encompasses a variety of fully specified GARCH (generalized autoregressive conditional heteroskedasticity)-type models for financial returns, and these models have wide applications in asset pricing, Value-at-Risk (VaR) evaluation, density forecast, and other conditional-distribution-oriented problems. In this context, researchers have proposed various specification tests for the standardized error distribution. For parametric (finite-dimensional-moments-based) tests, examples include the normality tests of Fiorentini et al. (2004) and Bontemps and Meddahi (2005) and the distribution tests of Duan (2003) and Bontemps and Meddahi (forthcoming). For nonparametric (infinite-dimensional-moments-based) tests, examples include the symmetry test of Bai and Ng (2001), the distribution tests of Bai (2003), Koul and Ling (2006), among others. These two classes of tests have various strengths and weaknesses in various aspects. Nonparametric tests could have large-sample powers against arbitrary deviations from the null hypothesis. In comparison, parametric tests could have more specific power directions for identifying possible causes of misspecification. In this paper, we focus entirely on the parametric tests, and aim at proposing a simple and flexible approach to generating model-invariant parametric tests in this context. By construction, the standardized error distribution has zero mean and unit variance, and any sensible parametric distribution test must be based on certain moments that cannot be recovered from the first and second moments. We refer to such moments as “higher moments” by the sense that they are in contrast with the “lower moments:” the first and second moments. Thus, the

⁎ Tel.: + 886 2 27822791x622; fax: + 886 2 27853946. E-mail address: [email protected]. 0927-5398/$ – see front matter © 2012 Elsevier B.V. All rights reserved. doi:10.1016/j.jempfin.2012.04.006

428

Y.-T. Chen / Journal of Empirical Finance 19 (2012) 427–453

higher moments of our interest include not only the skewness and kurtosis coefficients, underlying the normality tests of Jarque and Bera (1980, 1987), and the expectations of Hermite polynomials, underling the normality tests of Kiefer and Salmon (1983) and Bontemps and Meddahi (2005), but also any other moments beyond the mean-and-variance scope. Examples include the expected coverage rate and the expected shortfall of a lower-tail event for the VaR evaluation; see, e.g., Christoffersen (1998) and Berkowitz (2001). The bounded moments used by Chen et al. (2000) and Premaratne and Bera (2005) for symmetry tests and the probability-integral-transformation (PIT)-based moments considered by Diebold et al. (1998), Wallis (2003), Ghosh and Bera (2005), and Lejeune (2009) for distribution evaluation are also examples of higher moments. Other examples may also be extracted from the information matrix equality, the Bartlett (1953) identities, and the side conditions for non-Gaussian maximumentropy distributions; see, e.g., White (1982), Chesher et al. (1999), and Park and Bera (2009, Table 1), respectively. Theoretically, we may base the tests of our interest on such higher moments of the standardized errors. Nonetheless, we have to first deal with the issue of parameter estimation uncertainty before establishing the asymptotically valid tests. This issue arises because the standardized errors are unobserved, and any feasible higher-moment test must be based on the standardized residuals of an estimated conditional mean-and-variance model. In the literature, there are two main approaches to this issue. The first one is to derive the asymptotic null distribution of an estimated moment by directly accounting for the estimation effect; see, e.g., Newey (1985), Tauchen (1985), and White (1987) for this approach and Lejeune (2009) and Chen (2011) for recent applications. The second one is to purge the estimation effect using a score-based orthogonal transformation of the moment function; see, e.g., Neyman (1959) and Wooldridge (1990) for the ideas. Unlike the first approach that needs an explicit choice of the estimation method, the second approach is invariant to various T 1/2-consistent estimation methods, where T denotes the sample size. However, these two approaches are both model-specific, and their test statistics are dependent on the conditional mean-and-variance derivatives that could have some complicated forms. Recently, Bontemps and Meddahi (forthcoming) contributed a useful variant of the second approach for testing distribution hypotheses. Instead of using the whole score function, their transformation uses a particular score function that is free of the model derivatives. Thus, their distribution tests are simple and model-invariant. In this paper, we propose another simple and flexible approach to this issue. Our key idea is to eliminate the conditional meanand-variance estimation effect on the estimated higher moments using the first and second moments of the standardized residuals. This is motivated by the fact that the latter two statistics are also subject to this estimation effect. Accordingly, we obtain another class of standardized-residuals-based higher-moment tests that are also robust to various T 1/2-consistent estimators, free of the conditional mean-and-variance derivatives, invariant to various conditional mean-and-variance models, and hence attractive for applications. 
Importantly, because our approach does not rely on the score function to eliminate the conditional mean-and-variance estimation effect, it can also be flexibly applied to the hypothesis that does not involve a complete standardized error distribution assumption, such as the hypothesis of symmetry or independence. For practical applications, we also apply our approach to establishing a set of skewness–kurtosis tests for various distribution hypotheses, a set of characteristicfunction-based moment tests for testing heavily-tailed distribution hypotheses, and a set of VaR tests, and extend these tests to evaluating the independence hypothesis against higher-order dependence structures. A simulation shows the validity of our approach, and an empirical example shows that our tests are useful for exploring conditional non-normality. The remainder of this paper is organized as follows. In Section 2, we introduce the proposed approach and a general form of our tests. In Section 3, we compare the proposed approach with existing approaches. In Section 4, we apply our approach to generating different sets of higher-moment tests. Section 5 includes a Monte Carlo simulation. Section 6 contains an empirical example. Section 7 provides the conclusion. The mathematical proofs are presented in Appendix A. 2. The proposed approach Let yt be the dependent variable with the time index t, and X t−1 be the information set generated by the lagged dependent variables and some exogenous variables such that X t−1 ⊂X t for all t. The conditional mean-and-variance model for yt jX t−1 is the form: 1=2

yt ¼ μ t ðα Þ þ vt ht ðα Þ

;

ð1Þ

where μt : = μt(α) stands for a conditional mean specification, and ht : = ht(α) represents a conditional variance specification; μt and ht are both X t−1-measurable and differentiable with respect to the parameter vector α; vt : = vt(α) is the standardized error  with E½vt  ¼ 0 and E v2t ¼ 1. This context encompasses various regressions and GARCH-type models. Let f ð⋅jX t−1 Þ be the true conditional probability density function (PDF) of yt jX t−1 . Also, let g(·, β) be a postulated PDF of vt that may, or may not, include a parameter vector β, and Θ be the parameter space of the parameter vector θ : = (α ⊺, β ⊺) ⊺. Note that θ = α when β is irrelevant. As mentioned before, it is common to establish a conditional distribution model for yt jX t−1 by adding the distribution and independence hypotheses to the standardized errors of model (1). This conditional distribution model is correctly specified when there exists a true parameter vector θo : = (αo⊺, βo⊺) ⊺ ∈ Θ at which −1=2

f ðyjX t−1 Þ ¼ ht ðα o Þ

  −1=2 g ðy−μ t ðα o ÞÞht ðα o Þ ; βo ; ∀y∈R;

ð2Þ

Y.-T. Chen / Journal of Empirical Finance 19 (2012) 427–453

429

for all t. Condition (2) comprises the distribution hypothesis: fv(·)=g(·, βo), where fv(·) denotes the true PDF of vot : = vt(αo), and the independence hypothesis: Assumption 1. {vot} is an IID sequence, and vot is independent of X t−1 for all t. Assumption 1 further encompasses the presumption that model (1) is correctly specified for the conditional mean and   variance of yt jX t−1 ; that is, E½vot jX t−1  ¼ 0 and E v2ot X t−1  ¼ 1 for all t. Let ψt(θ) : = ψ(vt, β) be a q × 1 moment function of vt which is “higher-order” in the sense that it cannot be represented as a linear combination of the “lower-order” moment functions: vtιq and (vt2 − 1)ιq, where ιq : = (1, …, 1) ⊺ is a q × 1 vector of ones. Also, let xt : = xt(α) be a p × q matrix of X t−1 -measurable variables which may be dependent on α, and denote xot : = xt(αo). Suppose that condition (2) implies the conditional moment restriction: E½ψot jX t−1  ¼ 0; ∀t;

ð3Þ

with ψot : = ψt(θo). Our higher-moment test evaluates the distribution hypothesis by examining the unconditional moment restriction: E½xot ψot  ¼ 0; ∀t;

ð4Þ

with xt = Iq and p = q, where Iq denotes the q × q identity matrix. It can also evaluate the independence hypothesis against higherorder dependence structures by examining Eq. (4) with a time-varying xt. This is due to the fact that this restriction must be satisfied under the independence hypothesis; otherwise, this hypothesis is misspecified. We provide useful examples of Eq. (4) in Section 4.  ⊺ ^ ⊺ be an estimator for θo. Also, let vec(xt) be a pq × 1 vector which is obtained by stacking the column vectors ^ ⊺T ; β Let θ^T :¼ α T of the p × q matrix xt, and denote the score function st(θ) : = ∇ βlng(vt, β). Our approach needs Assumption 1 and the following ones: Assumption 2. The estimator θ^T is T 1/2-consistent for θo. Assumption 3. The asymptotic expansion: T T  pffiffiffi 1 X 1 X pffiffiffi ζ^ t ¼ pffiffiffi ζ ot þ E½∇θ⊺ ζ t θ¼θo T θ^T −θo þ op ð1Þ T t¼1 T t¼1

ð5Þ

  and T −1=2 ∑Tt¼1 ζ^ t ¼ Op ð1Þ, where ζ^ t :¼ ζ t θ^T and ζot : = ζt(θo), hold for the moment functions: (i) ζt = xtψt, (ii) ζt = vec(st⊺ ⊗ xt), and (iii) ζt = (vec(xt)vt, vec(xt)(vt2 − 1)) ⊺. Assumption 2 is satisfied for the maximum likelihood (ML) method and various two-step estimation methods under ^ T , it also applies to condition (2) (and suitable regularity conditions); see, e.g., Newey and McFadden (1994). For the estimator α the Gaussian quasi-ML (QML) method when the conditional mean-and-variance model is correctly specified, as implied by Assumption 1; see, e.g., Bollerslev and Wooldridge (1992). This QML method is particularly important in testing the hypothesis of symmetry, or independence, in which the standardized error distribution is unspecified and the ML method is inapplicable. Assumption 3 is also common in the moment-test literature; see, e.g., White (1994, Chapter 9). It is standard to derive Eq. (5) by first taking a Taylor expansion of the statistic T −1=2 ∑Tt¼1 ζ^ t and then assuming a uniform law of large number for the sequence {∇ θζt} and using Assumption 2. In addition, the {ζot}s in Assumption 3 are martingale-difference sequences under condition (2). Given Assumption 2 and Eq. (5), we may further show T −1=2 ∑Tt¼1 ζ^ t ¼ Op ð1Þ using the asymptotic normality T −1=2 ∑Tt¼1 ζ ot , implied by a suitable martingale-difference central limit theorem for {ζot}. Since these asymptotic arguments and their underlying conditions are well-documented, we present the Assumptions 2 and 3 as high-level assumptions without derivation for simplicity. As will be explained in Section 3, existing approaches also need similar assumptions. ^ Importantly, the asymptotic expansion in Eq. (5) with ζ t=x tψt indicates that, because of the estimation effect, the θ T -based −1=2 T ^ ^ ^ ^ ^ ^ ∑t¼1 x t ψ t , where x t :¼ xt ðα T Þ and ψ t :¼ ψt θ T , is not asymptotically equivalent to its θo-based counterpart statistic T   T −1=2 ∑Tt¼1 xot ψot , unless E ∇θ⊺ ðxt ψt Þ θ¼θo ¼ 0. Ignoring this effect could make the resulting test asymptotically invalid. It is common ^ following Eq. (5) and a particular choice to deal with this problem by directly deriving the asymptotic distribution of T −1=2 ∑Tt¼1 x^t ψ t of θ^T . Instead of using this direct approach, we propose a new approach to purging the estimation effect by exploiting the Jacobian   ^ T Þ. matrix E ∇θ⊺ ðxt ψt Þ θ¼θo and by suitably using the first and second moments of the standardized residual v^ t :¼ vt ðα 2.1. Purging the estimation effect To introduce our approach, we denote wt(α):=(∇ αμt)ht− 1/2, zt(α):=(∇ αht)ht− 1, svt(θ):=∂st/∂vt, ψvt(θ):=∂ψt/∂vt, and ψβt(θ):=         ^ :¼ ψ θ^T , and ψ ^ :¼ ψ θ^T , and their ^ T Þ, z^t :¼ zt ðα ^ T Þ, ^s t :¼ st θ^T , ^s vt :¼ svt θ^T , ψ ^ t :¼ wt ðα ∇ β⊺ψt, the θ^T -based variables: w vt

vt

βt

βt

θo-based counterparts: wot : = wt(αo), zot : = zt(αo), sot : = st(θo), sv, ot : = svt(θo), ψv, ot : = ψvt(θo), and ψβ, ot : = ψβt(θo). In the

430

Y.-T. Chen / Journal of Empirical Finance 19 (2012) 427–453

  Appendix A, by exploring the matrix E ∇θ⊺ ðxt ψt Þ θ¼θo , we show that the asymptotic expansion in Eq. (5) with ζt = xtψt can be reformulated as: Lemma 1. Under Assumptions 1–3(i), T T h ipffiffiffi  X 1 X ^ ¼ p1ffiffiffi ^ −β pffiffiffi xot ψot þ E½xot E ψβ;ot T β x^t ψ t T o T t¼1 T t¼1 i3 2 h  h E vecðx Þw⊺ i⊺ i⊺ pffiffiffi ot ot 1 h i 5 T ðα ^ T −α o Þ þ op ð1Þ: − E ψv;ot ⊗Ip ; E ψv;ot vot ⊗I p 4 h 2 E vecðx Þz⊺ ot

ot

Note that such an expansion is standard in the moment-test literature. For example, the asymptotic expansion of Chen (2008, ^ T is the Gaussian QMLE for Lemma 1) is also of a very similar structure, while it is derived in the case where β is irrelevant and α ^ on the statistic ^ T and β αo. According to Lemma 1, we conduct a two-step transformation to purge the estimation effects of α T ^ . T −1=2 ∑Tt¼1 x^t ψ t ^ by replacing the estimated moment T −1 ∑T x^t ψ ^ with The first-stage transformation eliminates the estimation effect of β T t t¼1 the statistic T −1 ∑Tt¼1 x^t ξ^t , where " #" #−1 T T X X ⊺ ⊺ ^ ^ ^ ^ ^ ^ ^s t st st ψt st ξ t :¼ ψ t − t¼1

ð6Þ

t¼1

^ on ^s t . This part is a straightforward application of the is the ordinary least squares (OLS) residual of the artificial regression: ψ t score-based orthogonal transformation underlying Neyman's (1959) C(α) test; see Section 3.2. Note that ξ^t has the population counterpart ξot : = ξt(θo) with h i h i ⊺ ⊺ −1 st : ξt ðθÞ :¼ ψt −E ψt st E st st

ð7Þ

We denote ξvt(θ) : = ∂ ξt(θ)/∂ vt and ξv, ot : = ξvt(θo), and make the assumption: h i      ^ ^s ⊺ ∑T ^s t ^s ⊺ −1 is consistent for E ψ s⊺ E sot s⊺ −1 . (ii) The generalized information matrix equality: Assumption 4. (i) ∑Tt¼1 ψ t t ot ot t ot t¼1 h i   E ζ β;ot þ E ζ ot s⊺ot ¼ 0, with ζot : = ζt(θo) and ζβ, ot : = ∇ β⊺ζt(θo), holds for ζt = ψt and ζt = st. (iii) E½ξvt θ¼θ^T and E½ξvt vt θ¼θ^T (or     T −1 ∑Tt¼1 ξ^vt and T −1 ∑Tt¼1 ξ^vt v^ t ) are consistent for E ξv;ot and E ξv;ot vot , respectively. This assumption comprises the consistency of certain plug-in statistics for their popular moments and the generalized information matrix equality which generally holds under condition (2); see, e.g., Tauchen (1985, Theorem 5). In Appendix A, we show the result: Lemma 2. Under Assumptions 1, 2, 3(i)–(ii), and 4(i)–(ii), T T 1 X 1 X pffiffiffi x^t ξ^t ¼ pffiffiffi xot ξot T t¼1 T t¼1 i3 2 h  h E vecðx Þw⊺ i⊺ i⊺ pffiffiffi ot ot 1 h 4 5 T ðα h i ^ T −α o Þ þ op ð1Þ: − E ξv;ot ⊗Ip ; E ξv;ot vot ⊗Ip 2 E vecðxot Þz⊺ot

ð8Þ

  ^ −β . This shows the validity of this transformation in purging the Note that the right-hand side of Eq. (8) is free of T 1=2 β T o ^ . estimation effect of β T ^ T by replacing The second-stage transformation is the key ingredient of our approach. It eliminates  the estimation effect of α ^ ¼ ϕ θ^T and ^ , where ϕ the estimated moment T −1 ∑Tt¼1 x^t ξ^t with the statistic T −1 ∑Tt¼1 ϕ ct ct ct

  1 2 ϕct ðθÞ :¼ xt ξt −E½ξvt vt − E½ξvt vt  vt −1 : 2

ð9Þ

Y.-T. Chen / Journal of Empirical Finance 19 (2012) 427–453

431

^ is redefined as: In the case where E½ξvt  and E½ξvt vt  have no closed-form formulae, and ϕ ct " # " # ! T T   X X 2 ^ v^ − 1 ^ v^ ^ ¼ x^ ξ^ − 1 ^ ϕ −1 ; ξ ξ v t ct t t T t¼1 vt t 2T t¼1 vt t

ð10Þ

h i  T ^ ^⊺ −1 ^ ^ − ∑T ψ ^ ^⊺ s vt . To see its idea and usefulness, we derive the following result in Appendix A: where ξ^vt :¼ ψ vt t¼1 t s t ∑t¼1 s t s t Lemma 3. Under Assumptions 1, 2 and 3(iii), 3 2 3 T T 1 X 1 X 2 h i3 7 6 7 6 pffiffiffi pffiffiffi vecðx^t Þv^t vecðxot Þvot ⊺ 7 6 7 6 E vecðxot Þwot pffiffiffi T t¼1 T t¼1 7 6 7 6 6 7 ^ T −α o Þ þ op ð1Þ: i 5 T ðα 7¼6 7− 4 h 6 T T ⊺  7 6 1 X  7 6 1 X E vecðxot Þzot 2 2 5 5 4 pffiffiffi 4 pffiffiffi vecðx^t Þ v^t −1 vecðxot Þ vot −1 T t¼1 T t¼1 2

ð11Þ



This indicates that the first and second moments of the v^ t s: T −1 ∑Tt¼1 vecðx^t Þv^ t and T −1 ∑Tt¼1 vecðx^t Þ v^ 2t −1 are also subject to ^ T . Thus, we can utilize Lemma 3 to purge this undesirable effect contained in Lemma 2. In Appendix A, we the estimation effect of α show the main result of our approach: Proposition 1. Under Assumptions 1–4, T T X 1 X ^ ¼ p1ffiffiffi pffiffiffi ϕc;ot þ op ð1Þ; ϕ ct T t¼1 T t¼1

ð12Þ

where ϕc, ot : = ϕct(θo). ^ is asymptotically equivalent to its θo-based counterpart: T −1=2 ∑T ϕ , This shows that the θ^T -based statistic T −1=2 ∑Tt¼1 ϕ ct t¼1 c;ot and hence is not contaminated by the estimation effect of θ^T . Recall that θ degenerates to θ = α when β is irrelevant. In this scenario, the higher-moment ψ t reduces to ψt = ψt(α). h ipfunction ffiffiffi ^ −β , and the first-stage Correspondingly, Lemma 1 no longer contains the β-specific component: E½xot E ψβ;ot T β T o orthogonal transformation and Lemma 2 become redundant. Thus, we can further simplify the transformed higher-moment function ϕct as

  1 2 ϕct ðθÞ ¼ xt ψt −E½ψvt vt − E½ψvt vt  vt −1 ; 2

ð13Þ

^ accordingly. This demonstrates again that our key idea is to purge the estimation effect of α ^ T by suitably and compute the ϕ ct using the first and second moments of the v^ t s. Importantly, the transformation in Eq. (13) does not explicitly rely on the score function st. Thus, it is particularly useful when the hypothesis being tested does not involve a complete standardized error distribution. 2.2. The proposed test   Note that {ϕc,ot} is a martingale-difference sequence with E ϕc;ot X t−1  ¼ 0 for Eq. (9) under condition (2) and for condition (13) under Assumption 1. Thus, we may apply a martingale-difference central limit theorem to this sequence and show that T 1 X d pffiffiffi ϕc;ot → Nð0; Ωc Þ; T t¼1

ð14Þ

h i with the asymptotic covariance matrix Ωc :¼ E ϕc;ot ϕ⊺c;ot . By further assuming that Ωc can be consistently estimated by a ^ c , we propose the test statistic: uniformly positive definite matrix Ω "

T 1X ^ ϕ C T :¼ T T t¼1 ct

#⊤

" # T X −1 1 ^ ^ Ωc ϕ : T t¼1 ct

ð15Þ

^ c can be computed by replacing the role of θo using θ^T in Ωc when Ωc has a closed-form formula. In more general The matrix Ω 2 2 ^ ^⊺ ^ c as Ω ^ c ¼ T −1 ∑T ϕ cases, we can also write Ω t¼1 ct ϕ ct , and present CT as CT = TRc accordingly, where Rc denotes the uncentered

432

Y.-T. Chen / Journal of Empirical Finance 19 (2012) 427–453

^ . Given Proposition 1, the asymptotic normality in Eq. (14), and the determinant coefficient of the artificial regression: 1 on ϕ ct ^ c for Ωc, the test statistic CT has the asymptotic null distribution of CT: consistency of Ω d

2

C T → χ ðpÞ

ð16Þ

under Assumption 1 (and the other aforementioned assumptions). Henceforth, we refer to this test as the C test. The C test is applicable to evaluating the distribution hypothesis by maintaining the independence hypothesis and setting xt = Iq and p = q. It is also applicable to evaluating the symmetry hypothesis or the independence hypothesis without relying on a particular distribution assumption. Meanwhile, it can flexibly generate various higher-moment tests by choosing various ψts, as will be explained in Section 4. More attractively, its test statistic is invariant to various conditional mean-and-variance models and various T 1/2-consistent estimation methods. In particular, the test statistic CT does not include the model derivatives: wt and zt, and hence is very simple to implement in empirical studies. The C test constitutes a class of parametric tests for condition (2). As mentioned in the Introduction, a parametric test does not have universal powers against arbitrary deviations from the null hypothesis. Nonetheless, the C test could be powerful in the directions where the higher-moment restriction: E½ϕct  ¼ 0 is not satisfied. This point follows from a standard result of the chisquare test theory: C T →d χ 2 ðp; νÞ under the local alternative: E½ϕct jX t−1  ¼ dt T −1=2 (and suitable regularity conditions), where dt is a p × 1 vector of X t−1 -measurable variables with E½dt ≠0; unlike (16), χ 2(p, ν) is a non-central chi-square distribution with the ⊺ degrees of freedom p and the noncentrality parameter ν :¼ E½dt  Ω−1 c E½dt . By maintaining the presumption that the  conditional mean-and-variance model is correctly specified, we can further utilize the restrictions: E½vt jX t−1  ¼ 0 and E v2t X t−1  ¼ 1 to observe that the non-zero E½dt  (or said the source of local powers) of the C test is resulted from E½xt ξt ≠0, which reduces to E½xt ψt ≠0 in the case of Eq. (13). This indicates that the C test is powerful against the misspecifications of the distribution hypothesis (xt = Iq) that make E½ψt ≠0 or E½st ≠0. It is also powerful against the deviations from the independence hypothesis that generate the correlation between ψt and xt. These features may be potentially useful for identifying possible misspecifications in the process of extending a correctly specified conditional mean-and-variance model to a conditional distribution model. 3. Comparison with existing approaches In this section, we compare our approach with existing approaches that are also applicable to generating higher-moment tests for evaluating condition (2). For ease of exposition, we first discuss these approaches in Sections 3.1 and 3.2, and make the comparisons in Section 3.3. 3.1. A direct approach ^ A direct approach to the estimation-effect problem is to derive and estimate the asymptotic covariance matrix of T −1=2 ∑Tt¼1 x^t ψ t ^ according to Eq. (5) with ζt = xtψt and a particular choice of θ T . This approach needs an extension of Assumption 2: T  pffiffiffi 1 X ηot þ op ð1Þ; T θ^T −θo ¼ pffiffiffi T t¼1

ð17Þ

  where ηot : = ηt(θo) and ηt : = ηt(θ) denotes the influence function of the selected θ^T , and it satisfies the restriction E ηot X t−1  ¼ 0 under condition (2); see Section 3.3 for examples of ηt. The resulting test statistic is of the form: "

T 1X ^ x^ ψ MT :¼ T T t¼1 t t

#⊺

" # T X −1 1 ^ ^ ^ Ωm xψ ; T t¼1 t t

ð18Þ

^ is a consistent (and uniformly positive-definite) estimator for and has hthe asymptotic null distribution: MT →d χ 2 ðpÞ, where Ω i m Ωm :¼ E ϕm;ot ϕ⊺m;ot with ϕm, ot : = ϕmt(θo) and ϕmt ðθÞ :¼ xt ψt þ E½∇θ⊺ ðxt ψt Þηt :

ð19Þ

 h i⊺ Note that E½∇θ ðxt ψt Þ :¼ E½∇α ⊺ ðxt ψt Þ; E ∇β⊺ ðxt ψt Þ , and we can write that E½∇α⊺ ðxt ψt Þθ¼θo

i3 2 h  h E vecðx Þw⊺ i⊺ i⊺ ot ot 1 h i5 ¼ − E ψv;ot ⊗Ip ; E ψv;ot vot ⊗Ip 4 h 2 E vecðxot Þz⊺ot

ð20Þ

h i   and E ∇β ðxt ψt Þ θ¼θo :¼ E½xot E ψβ;ot following the derivation of Lemma 1. It is easy to see that Ωm is the asymptotic covariance ^ using the expression: matrix of T −1=2 ∑Tt¼1 x^t ψ t T T X 1 X ^ ¼ p1ffiffiffi pffiffiffi ϕm;ot þ op ð1Þ; x^t ψ t T t¼1 T t¼1

Y.-T. Chen / Journal of Empirical Finance 19 (2012) 427–453

433

which is obtained by introducing Eq. (17) into Eq. (5) with ζt = xtψt. We refer to this test as the M test. More discussions of this approach can be found in Newey (1985), Tauchen (1985), White (1987, 1994), and Davidson and MacKinnon (1993, Section 16.8), among many others. 3.2. Indirect approaches Alternatively, an indirect approach eliminates the estimation effect of θ^T using a score-based orthogonal transformation of the moment function xtψt: h i h i ⊺ ⊺ −1 St ; ϕwt ðθÞ :¼ ðxt ψt Þ−E ðxt ψt ÞS t E S t S t

ð21Þ

where S t ðθÞ :¼ ∇θ lngyt and gyt : = ht− 1/2g(vt, β). Recall that st : = ∇ βlng(vt, β), and note that the score function S t is of the form: 

∇α lngyt St ¼ ∇β lngyt

"

¼

# 1 −wt ‘vt − zt ðvt ‘vt þ 1Þ ; 2 st

ð22Þ

where ‘vt ðθÞ :¼ ∂lngðvt ; βÞ=∂vt ; see also Bontemps and Meddahi (forthcoming, Eq. (39)). Denote Sot :¼ S t ðθo Þ and ϕw, ot : = ϕwt(θo). This orthogonal transformation is valid because condition (2) implies not only E½S ot jX t−1  ¼ 0 but also the generalized information matrix equality:  h ⊺  E½∇θ⊺ ϕwt jX t−1 θ¼θo þ E ϕw;ot S ot X t−1  ¼ 0;

ð23Þ

as in Assumption 4(ii). By introducing Eq. (21) into Eq. (23), we have E½∇θ⊺ ϕwt θ¼θo ¼ 0. This means that the estimation effect of θ^T becomes asymptotically negligible when the asymptotic expansion in Eq. (5) is based on ζt =ϕw, t. This result can be represented as: T T X 1 X ^ ¼ p1ffiffiffi pffiffiffi ϕw;ot þ op ð1Þ; ϕ wt T t¼1 T t¼1

ð24Þ

where " #" #−1 T  T    X X ⊺ ⊺ ^ ^ ^ ^ ^ ^ x^t ψ t S t StSt S^ t ; ϕ wt :¼ x^t ψ t − t¼1

ð25Þ

t¼1

  and S^ t :¼ S t θ^T . Note that like the result in Eq. (12), the result in Eq. (25) is also robust to the estimation effect of θ^T . Accordingly, this approach generates the test statistic: " W T :¼ T

T 1X ^ ϕ T t¼1 wt

#⊺

" # T X ^ −1 1 ^ Ω ϕ w T t¼1 wt

ð26Þ

^ is a consistent (and uniformly positive-definite) estimator for that hashthe asymptotic null distribution: W T →d χ 2 ðpÞ, where Ω i w ⊺ Ωw :¼ E ϕw;ot ϕw;ot . We refer to this test as the W test. As discussed by Bera and Bilias (2001, pp. 24–25), the idea of such a score-based orthogonal transformation can be traced back to Neyman's (1959) C(α) test for checking a partial set of parameter restrictions in the presence of nuisance parameters. With a different motivation, Wooldridge (1990, Theorem 2.1) also proposed an orthogonal transformation for generating robust conditional moment tests in the context of partially specified models. As mentioned by Bontemps and Meddahi (forthcoming, Section 4.1), the Khmaladze (1981) transformation underlying Bai's (2003) nonparametric test is also related to this approach; see also Duan (2003) for another type of orthogonal transformation. Focusing on testing the distribution hypothesis (where p = q and xt = Iq), Bontemps and Meddahi (forthcoming) proposed a clever variant of this approach using the fact that  h h h     1  E ψot S ⊺ot X t−1  ¼ −wot E ψot ‘v;ot X t−1 − zot E ψot vot ‘v;ot þ 1 X t−1 E½ψot sot jX t−1  2 " h i 1 h  i # z − −w E ψ ‘ E ψot vot ‘v;ot þ 1 ot ot v;ot ot ¼ ; 2 E½ψot sot 

ð27Þ

  where the second equality is implied by Assumption 1. Given Eq. (27), their approach generates the condition: E ∇θ⊺ ϕwt θ¼θo ¼ 0

434

Y.-T. Chen / Journal of Empirical Finance 19 (2012) 427–453

using the moment function ψt which is, or is transformed to be, orthogonal to ‘vt , vt ‘vt þ 1, and st. Note that ð‘vt ; vt ‘vt þ 1; st Þ⊺ is the (wt, zt)-free part of the whole score function S t in Eq. (22). By basing the original transformation on this part of S t , their distribution test statistics are also model-invariant and free of the model derivatives: wt and zt. 3.3. Comparisons The tests generated by the aforementioned approaches are all asymptotically valid in the presence of the estimation effect, but they involve different strategies in dealing with this effect and hence have different properties. To directly “calculate” this effect, the M test requires a particular choice of θ^T . In the case where θ^T is the MLE for θo, the influence function of θ^T is of the form: ηt ¼ −E½∇θ⊺ S t 

−1

St :

ð28Þ

^ , the resulting ηt = (ηαt(α) ⊺, ηβt(θ) ⊺) ⊺ ^ T and the α ^ T -based two-step MLE β In the case where θ^T comprises the Gaussian QMLE α T is composed of the subvectors: ηαt ðα Þ :¼ and

h i 1 h i −1

 1  2 ⊺ ⊺ E wt wt þ E zt zt wt vt þ zt vt −1 2 2

ð29Þ

h i

h i 1 h i ⊺ −1 ⊺ ⊺ ηβt ðθÞ :¼ −E st st st − E½svt E wt þ E½svt vt E zt ηαt : 2

ð30Þ

Obviously, the asymptotic covariance matrix Ωm and hence the test statistic MT would change with the choice of θ^T . It is known ^ t ¼ 0. In that MT could be asymptotically equivalent to WT when θ^T is the MLE of θo because of the estimating equation: T −1 ∑Tt¼1 S comparison, like the C test, the W test and its variant by Bontemps and Meddahi (forthcoming) are both invariant to the choice of θ^T because they are all established by “purging” the estimation effect. 2 2 ^ ^⊺ ^ c ¼ T −1 ∑T ϕ Recall that CT = TRc2 when Ω t¼1 ct ϕ ct . It is known that WT can also be computed as WT = TRw, where Rw denotes the ^ , when Ω ^ w ¼ T −1 ∑T ϕ ^ ϕ ^ ⊺ ; see Wooldridge (1990, uncentered determinant coefficient of the artificial regression: 1 on ϕ wt t¼1 wt wt Procedure 2.1). The difference between TRc 2 and TRw 2 reflects the difference between the underlying approaches of the C test and W test. Importantly, the ϕwt is dependent on S t whichis inseparable from the model derivatives: wt and zt as shown in (22); the  ϕmt in Eq. (19) is dependent on the Jacobian matrix E ∇θ⊺ ðxt ψt Þ which is also inseparable from wt and zt as shown in Lemma 1. Thus, the M and W test statistics are both model-specific and based on the model derivatives: wt and zt that could involve a computational issue in practical applications. For the simple location-scale model: yt ¼ μ þ σvt ;

ð31Þ

it is easy to write that wt = (σ − 1, 0) ⊺ and zt = (0, σ − 2) ⊺ using the fact that Eq. (31) is a special case of Eq. (1) where μt = μ and ht = σ 2. For the AR(1) model: yt ¼ α o þ α 1 yt−1 þ σ vt ;

ð32Þ

we have wt = (σ − 1, yt − 1σ − 1, 0) ⊺ and zt = (0, 0, σ − 2) ⊺. For the GARCH(1,1) model: 1=2

2

yt ¼ α o þ ut ; ut ¼ vt ht ; ht ¼ α 1 þ α 2 ht−1 þ α 3 ut−1 ; we need to write wt = (ht− 1/2, 0, 0,0)⊺ and zt ¼ h−1 t



∂ ∂α o

ð33Þ

ht ; ∂α∂ ht ; ∂α∂ ht ; ∂α∂ ht 1

2

3

⊺

, and compute zt using the recursive formula:

2

3 2 3 ∂ ∂ 6 ∂α ht 7 6 ∂α ht−1 7 6 o 7 2 6 o 7 3 6 ∂ 7 6 ∂ 7 −2α 3 ut−1 6 6 7 ht 7 h 6 7 6 6 7 t−1 7 1 ∂α 6 ∂α 1 7 6 6 7 7 þ α2 6 1 6 7¼4 7; 5 h 6 ∂ 7 6 ∂ 7 t−1 6 6 ht 7 ht−1 7 2 6 ∂α 2 7 6 ∂α 2 7 ut−1 6 7 6 7 4 ∂ 5 4 ∂ 5 ht ht−1 ∂α 3 ∂α 3 see Bollerslev (1986). Obviously, the derivation of wt and zt would be more tedious for more complicated conditional mean-andvariance models, such as the smooth-transition AR, the exponential GARCH (EGARCH) model of Nelson (1991), the GJR-GARCH model of Glosten et al. (1993), or a GARCH-type model that includes more past return shocks ut − ks. Theoretically, practitioners need to take these computational complexities into account in using the M test or the W test. In comparison, the proposed test and the test of Bontemps and Meddahi (forthcoming) are free of these complexities in their applications.

Y.-T. Chen / Journal of Empirical Finance 19 (2012) 427–453

435

To compare our approach with the approach of Bontemps and Meddahi (forthcoming), we focus on the case where θ = α and xt = Iq for ease of discussion. In this case, our ϕct degenerates to a particular version of Eq. (13):   1 2 ϕct ðα Þ ¼ ψt −E½ψvt vt − E½ψvt vt  vt −1 ; 2

ð34Þ

and their orthogonal transformation of ψt is of the form: h i h i−1 ϕbt ðα Þ :¼ ψt −E ψt ζ ⊺t E ζ t ζ ⊺t ζt; ¼ ψt −ς 1 ‘vt −ς 2 ðvt ‘vt þ 1Þ;

ð35Þ

where ζ t ¼ ð‘vt ; vt ‘vt þ 1Þ⊺ , i i  h   h  E½ψt ‘vt  E v2t ‘2vt þ 2E½vt ‘vt  þ 1 −E½ψt ðvt ‘vt Þ E vt ‘2vt þ E½‘vt  ; ς 1 :¼



2      E ‘2vt E v2t ‘2vt þ 2E½vt ‘vt  þ 1 − E vt ‘2vt þ E½‘vt  and h i i  h  E½ψt ðvt ‘vt ÞE ‘2vt −E½ψt ‘vt  E vt ‘2vt þ E½‘vt  ς 2 :¼   



2 :   E ‘2vt E v2t ‘2vt þ 2E½vt ‘vt  þ 1 − E vt ‘2vt þ E½‘vt  This shows that Eq. (34) is of a simpler expression than Eq. (35). Nonetheless, as addressed by a referee, the transformation of Bontemps and Meddahi (forthcoming) is also easy to use in practical applications. In particular, the coefficients ζ1 and ζ2 can be estimated by fitting the artificial regression: ψt on ‘vt and vt ‘vt þ 1 using the OLS method. Let φt(α) : = φ(vt) be a moment function of vt with the derivative φvt :¼ ∂v∂ t φt . Bontemps and Meddahi (forthcoming, Proposition 1) showed that, under the assumption: φtg(vt,⋅) → 0 as vt → ± ∞, the distribution hypothesis implies the moment restriction: h i E φv;ot þ φot ‘v;ot ¼ 0;

ð36Þ

where φot : = φt(αo), φv, ot : = φvt(αo), and ‘v;ot :¼ ‘vtðθo Þ. Using Eq. (36) with a suitable choice of φt and the normality-implied restrictions: ‘vt ¼ −vt , ς1 ¼ −E½ψt vt , and ς2 ¼ − 12 E ψt v2t , it is not difficult to verify that Eq. (34) is the same as Eq. (35) in testing normality. Thus, our and their tests are the same or asymptotically equivalent when the hypothesis being tested is N(0,1). Importantly, these two approaches become different in testing other hypotheses because the restriction: ‘vt ¼ −vt is no longer ^ T without relying on ‘vt and vt ‘vt þ 1. This is satisfied. In particular, Eq. (34) maintains its robustness to the estimation effect of α essential because the score-based orthogonal transformations: Eqs. (21) and (35) are both inapplicable when the hypothesis being tested does not include a complete g(·, β); see Section 4.1 for an example of testing symmetry and Section 4.4 for examples of testing independence. By contrast, our approach remains flexibly applicable to this scenario because it purges the estimation ^ T without relying on S t , ‘vt , or vt ‘vt þ 1. This shows an important feature of our approach. Indeed, it is even easier to effect of α implement our approach in this scenario because the associated ϕct can be simplified as Eq. (13), or as Eq. (34) when xt = Iq. 4. Applications In this section, we apply this approach to generating a set of skewness–kurtosis tests, a set of characteristic-function-based moment tests, and a set of VaR tests for testing distribution hypotheses from various aspects in Sections 4.1–4.3, respectively. For ease of reference, we first show the associated ψts and their E½ψvt s and E½ψvt vt s in Table 1. In Section 4.4, we further extend the applicability of these particular C tests to testing the independence hypothesis against higher-order dependence structures. 4.1. Skewness–kurtosis tests In empirical finance, researchers routinely explore the standardized error distributions of GARCH-type models using the sample skewness and kurtosis of standardized residuals; see, e.g., Bollerslev (1987) and Engle and Gonzalez-Rivera (1991), among many other. Nonetheless, with the exception of the skewness–kurtosis test for normality, it seems that a generalized skewness–kurtosis test is still lacking in this context. This motivates us to consider this test as a demonstrative example of our approach. Bai and Ng (2005) proposed a skewness–kurtosis test in the presence of unknown serial correlation; see also Bontemps and Meddahi (2005, forthcoming) for distribution tests in this case. Since this case is precluded by Assumption 1, our tests have a different focus from these tests. ^ k :¼ T −1 ∑Tt¼1 v^ kt be the k-th sample moment of the Let λk be the k-th arithmetic moment of the postulated PDF g(·, β), and m standardized residuals. Note that λ3 and λ4 are, respectively, the skewness and kurtosis of g(·, β) because g(·, β) is standardized

436

Y.-T. Chen / Journal of Empirical Finance 19 (2012) 427–453

Table 1 Higher-moment functions for particular C tests. 1. Skewness–kurtosis tests The S test ψSt(θ) : = vt3 − λ3 E½ψvt  ¼ 3 E½ψvt vt  ¼ 3λ3 The K test ψKt(θ) : = vt4 − λ4 E½ψvt  ¼ 4λ3 E½ψvt vt  ¼ 4λ4 2. Characteristic-function-based moment tests The I test

vt 1 ψIt ðθÞ :¼ 1þv ln 1 þ v2t ‘vt 2 þ t 2  

 ð1−v2t Þ vt E½ψvt  ¼ E þ E 1þv þ 12 E ln 1 þ v2t ‘vv;t 2 ‘vt 2 2 ð Þ 1þv ð t Þ  t 

 vt ð1−v2t Þ v2 E½ψvt vt  ¼ E þ E 1þvt 2 ‘vt þ 12 E vt ln 1 þ v2t ‘vv;t 2 ð tÞ ð1þv2t Þ The R test v2 ψRt ðθÞ :¼ 1þvt 2 þ ðvt −arctanðvt ÞÞ‘vt t    v2 vt E½ψvt  ¼ 2E þ E 1þvt 2 ‘vt þ E ðvt −arctanðvt ÞÞ‘vv;t 2 2 ð Þ 1þv ð t Þ  t  

v2t v3 E½ψvt vt  ¼ 2E þ E 1þvt 2 ‘vt þ E v2t −vt arctanðvt Þ ‘vv;t 2 ð tÞ ð1þv2t Þ 3. VaR tests The V test ψVt(θ) : = I(πt b Vτ) − τ E½ψvt  ¼ −gτ E½ψvt vt  ¼ −g τ vτ The E test  π Iðπ bV Þ þ g nτ ψEt ðθÞ :¼ 2 t t τ π t −1 Iðπ t bV τ Þ þ g nτ V τ " # gτ E½I ðπ t bV τ Þ−V τ   E½ψvt  ¼ 2 2E½ðg t =g nt Þπ t Iðπ t bV τ Þ− V τ −1 g τ " # E½ðg t =g nt Þvt I ðπ t bV τ Þ−vτ V τ gτ  E½ψvt vt  ¼ 2E½ðg t =g nt Þvt π t Iðπ t bV τ Þ−vτ V 2τ −1 gτ Note: For the S and K tests, λk denotes the kth moment implied by g(·, β). For the I and R tests, we denote‘vv;t :¼ ∂‘vt =∂vt with‘vt :¼ ∂lngðvt ; β Þ=∂v t . For example,‘vv;t ¼ −1    pffiffiffi    pffiffiffi      pffiffiffi v2t βþ1 1 for N(0,1), ‘vv;t ¼ ‘vt ‘vt þ π= 3 þ 2 − π= 3 exp − π= 3 vt g ðvt Þ for Lg, and ‘vv;t ¼ − β−2 for t(β). For the V and E þ 2 βþ1 2 2 1þv2t =ðβ−2Þ 2 ðβ−2Þ ð1þvt =ðβ−2ÞÞ tests, we define the indicator function I(A) which is equal to one (zero) if A occurs (does not occur), and denote πt in (55), Vτ :=Gn− 1(τ) with Gn− 1(·) standing for the quantile function of N(0,1) for some τ∈(0,1), gnτ :=gn(Vτ) with gn(·) standing for the PDF of N(0,1), gτ :=g(vτ,β) with vτ :=G− 1(τ,β) and G− 1(·, β) standing for the quantile function of g(·, β), gt :=g(vt,β), and gnt :=gn(πt). The notations: λk, ‘vt , Vτ, Gn− 1(·), and gn(·) are also defined in the main text for ease of discussion.

to have λ1 = 0 and λ2 = 1. The skewness–kurtosis test checks the distribution hypothesis: fv(⋅) = g(⋅, βo) by examining the higher    moment restrictions: E v3ot ¼ λ3 and E v4ot ¼ λ4 , where λ3 and λ4 are evaluated at β = βo. Given the moment functions: ψSt and ψKt in Table 1, we can easily obtain a skewness test, a kurtosis test, and a skewness–kurtosis test for this hypothesis by applying ⊺ ⊺ ⊺ the C test (with xt = Iq) to ψt = ψSt, ψt = ψKt, and ψt = (ψSt , ψKt ) , respectively. We refer to these particular  Ctests as the  S test, the K test, and the SK test, respectively. Note that the S test (the K test or the SK test) requires the existence of E v6ot and λ6 (E v8ot and λ8) for defining the associated Ωc. In the case where the PDF g(·) and hence the λks are free of the unknown β, these particular C tests would have very simple ⊺ ⊺ ⊺ test statistics. Specifically, by introducing ψt : = (ψSt , ψKt ) and xt = Iq into (13), we can obtain that 2

3  1  2 3 λ v − 3v −1 −3v t t 7 6 t 2 3   ϕct ¼ 4 5: v4t −λ4 2v2t −1 −4λ3 vt

ð37Þ

It is easy to further show that, in this case, " Ωc ¼

2

σS σ SK

# σ SK ; 2 σK

ð38Þ

where 2

σ S :¼ λ6 −3λ3 λ5 þ

9 2 27 2 λ3 −6 λ4 þ λ þ 9; 4 4 3

Y.-T. Chen / Journal of Empirical Finance 19 (2012) 427–453

437

  2 2 2 2 σ K :¼ λ8 −4λ4 λ6 −8λ3 λ5 þ 4λ4 −λ4 þ 16λ3 λ4 þ 16λ3 ; and

3 2 σ SK :¼ λ7 þ λ5 ð3 þ 2λ4 Þ þ λ3 3λ4 þ ðλ4 −λ6 Þ : 2

Thus, we can write the S test statistic, the K test statistic, and the SK test statistic as

2 1 ^ 3 − λ3 ð3m ^ 2 −1Þ−3m ^ 1 =σ 2S ; ST :¼ T m 2 2

ð39Þ

2

^ 4 −λ4 ð2m ^ 2 −1Þ−4λ3 m ^ 1 Þ =σ K ; K T :¼ T ðm and

"

1 ^ ^ ^ SK T :¼ T m 3 − 2 λ3 ð3m 2 −1Þ−3m 1 ^ 2 −1Þ−4λ3 m ^ 4 −λ4 ð2m ^1 m

#⊺ "

σ 2S σ SK

ð40Þ

σ SK σ 2K

#−1 "

# 1 ^ 2 −1Þ−3m ^1 ^ 3 − λ3 ð3m m ; 2 ^ 2 −1Þ−4λ3 m ^ 4 −λ4 ð2m ^1 m

ð41Þ

^ k s. Moreover, they encompass a number of respectively. These test statistics are simple transformations of the λks and the m existing test statistics, as discussed below. 4.1.1. Testing for symmetry The S test can be applied to testing the symmetry of the standardized errors by modifying ST as the following statistic: ′

S T :¼ T

^ 3 −3m ^ 1 Þ2 ðm : ^ 6 −6m ^4 þ9 m

ð42Þ

We need this modification because the PDF g(·) is not fully specified in this case, and this modification is obtained by applying the symmetry-implied odd-moment restrictions: λ3 = λ5 = 0 (and hence σS2 = λ6 − 6λ4 + 9) to ST and by estimating σS2 using its ^ 6 −6m ^ 4 þ 9. Note that ST′ includes the following statistic: sample counterpart σ^ 2S ¼ m ″

S

T

:¼ T

^ 23 m ^ 6 −6m ^4 þ9 m

ð43Þ

^ 1 ¼ 0. The statistic ST″ is known to have the asymptotic null distribution χ 2(1) under the null of as a special case where m symmetry for an IID sequence with unknown mean and variance and a finite sixth moment; see, e.g., Gupta (1967). Put differently, the ST″-based skewness test is applicable to the extreme case of model (1) where μt and ht are both constants; see ^ T using our approach, the ST′-based skewness test considerably Eq. (31). In comparison, by purging the estimation effect of α extends the applicability of the ST″-based test to the general context of Eq. (1) in a very simple way. 4.1.2. Testing for normality (and symmetric distributions) In the case where g(·) is symmetric and fully specified, we can simplify SKT as: SK s :¼ Ss þ K s ; Ss :¼ T

^ 3 −3m ^ 4 −λ4 ð2m ^ 1 Þ2 ^ 2 −1ÞÞ2 ðm ðm ; K s :¼ T ; λ6 −6λ4 þ 9 λ8 −4λ4 λ6 þ 4λ34 −λ24

ð44Þ

by the symmetry-implied restrictions: λ3 = λ5 = λ7 = 0 (and hence σSK = 0). This scenario includes testing normality as an important example. In this example, g(·) is the PDF of N(0,1), and implies the even-moment restrictions: k

λ2k ¼ ∏ ð2i−1Þ; k ¼ 1; 2; 3; 4; …;

ð45Þ

i¼1

see, e.g., Davidson and MacKinnon (1993, p. 804). Accordingly, the test statistic SKs can be further simplified as: SK n :¼ Sn þ K n ; Sn :¼ T

^ 3 −3m ^ 4 −6m ^ 1 Þ2 ^ 2 þ 3Þ2 ðm ðm ; K n :¼ T : 6 24

This test statistic degenerates to the Jarque and Bera (1980) test statistic: JB ¼ T

! ^ 23 ðm ^ 4 −3Þ2 m þ ; 6 24

ð46Þ

438

Y.-T. Chen / Journal of Empirical Finance 19 (2012) 427–453

^ 1 ¼ 0 and m ^ 2 ¼ 1. This case holds when μt is linear with an intercept, ht see also White and MacDonald (1980), in the case where m is a constant, and the linear regression is estimated using the OLS method. In comparison, the test statistic SKn is applicable to the context of Eq. (1) which is more general than this case. Note that SKn is the same as the normality test statistic of Kiefer and Salmon (1983) in the conditional homoskedasticity context. Specifically, this normality test is based on the third and fourth normalized Hermite polynomials: H3 ðvt Þ and H4 ðvt Þ with Hk ðvt Þ :¼ k

−1=2

  1=2 vt Hk−1 ðvt Þ−ðk−1Þ Hk−2 ðvt Þ

when k ≥ 2; H0 ðvt Þ :¼ 1 and H1 ðvt Þ :¼ vt . As shown by Bontemps and Meddahi (2005), the Kiefer–Salmon test is also applicable to

our context because, under the null of normality, ψt ¼ ðH3 ðvt Þ; H4 ðvt ÞÞ⊺ is orthogonal to S t ¼ wt vt þ 12 zt v2t −1 and hence the ^ T is eliminated; see also their paper for other Hermite-polynomials-based normality tests and their finiteestimation effect of α sample performance. In comparison, our approach provides a different interpretation to this model-invariant feature of the ^ T on the sample skewness and kurtosis of the Kiefer–Salmon test; that is, this test successfully eliminates the estimation effect of α v^ t s using the first and second moments of the v^ t s. This interpretation is useful because its underlying idea is generally applicable to many other cases. For example, our test statistic SKs is directly applicable to other symmetric g(·)s. If g(·) is the PDF of the standardized logistic distribution (denoted by Lg):



−2 π π π ; g ðvt Þ ¼ pffiffiffi exp − pffiffiffi vt 1 þ exp − pffiffiffi vt 3 3 3

ð47Þ

we can obtain the associated SKs using the Lg-implied even moments:

k   3 −ð2k−1Þ ζ ð2kÞ; k ¼ 1; 2; 3; 4; …; λ2k ¼ 2 2 ð2k!Þ 1−2 π

ð48Þ

where ζ ðkÞ :¼ ∑∞i¼1 i−k is the Riemann zeta function with ζ(2) = π2/6, ζ(4) = π4/90, ζ(6) = π6/945, and ζ(8) = π8/9450; see Balakrishnan and Nevzorov (2003, pp. 199–200). If g(·, β) is the PDF of the standardized t(β) with a fixed degrees of freedom β: g ðvt ; βÞ ¼

!−βþ1 2 2 Γ ððβ þ 1Þ=2Þ v pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi 1 þ t ; β−2 Γ ðβ=2Þ ðβ−2Þπ

ð49Þ

where Γ(·) denotes the gamma function, we can also obtain the associated SKs when β > 8 using the t(β)-implied even moments: k

λ2k ¼ ðβ−2Þ

Γ ðk þ 1=2ÞΓ ðβ=2−kÞ ; β > 2k; k ¼ 1; 2; 3; 4; …; Γ ð1=2ÞΓ ðβ=2Þ

ð50Þ

see Stuart and Ord (1994, p. 548). In the case where β is unknown, we may also apply the general form of the S (K or SK) test to examining the distribution hypothesis of t(β) when β > 6 (β > 8). 4.2. Characteristic-function-based moment tests The S (K or SK) test requires finite sixth (eighth) moments. This precludes a number of heavily-tailed distributions, such as the t(β)s with β ≤ 6 (β ≤ 8). It is therefore important to supplement these tests with another set of moment tests that are free of this restriction. We generate this set of tests by applying our approach to a pair of characteristic-function-based moments. Specifically, vot has the characteristic function: E½expðiωvot Þ ¼ E½cosðωvot Þ þ iE½sinðωvot Þ; i :¼

pffiffiffiffiffiffiffiffi −1; ω∈R;

regardless whether it has the skewness and kurtosis. As discussed by Lukacs (1970), E½expðiωvot Þ decays to zero as ω → ∞ for any continuous distribution. Motivated by this property and the fact that E½sinðωvot Þ ¼ 0 holds for all ω∈Rþ under the symmetry of vot, Chen et al. (2000) considered a ω-weighted moment restriction for testing symmetry: h i E ∫Rþ sinðωvot Þκ ðωÞdω ¼ 0;

ð51Þ

where κ(·) is the PDF of a nonnegative random variable that serves as a weighting function of ω. In this study, we extend this restriction to a set of higher-moment restrictions for testing the distribution hypothesis.

Y.-T. Chen / Journal of Empirical Finance 19 (2012) 427–453

439

For simplicity, we focus on the exponential weighting function: κ(ω) = exp(− ω), which is useful for generating the closedform moment functions: ∫Rþ sinðωvt Þκ ðωÞdω ¼

vt 1 þ v2t

ð52Þ

and ∫Rþ ½1−cosðωvt Þκ ðωÞdω ¼

v2t : 1 þ v2t

ð53Þ

Given Eq. (36) and the moment functions: ψIt and ψRt defined in Table 1, we can obtain the higher-moment restrictions: E½ψIt ðθo Þ ¼ 0 and E½ψRt ðθo Þ ¼ 0 by setting the φvt in Eq. (36) as Eqs. (52) and (53), respectively. We refer to the C tests with ψt = ψIt, ψt = ψRt, and ψt = (ψIt, ψRt) ⊺ as the I test, the R test, and the IR test, respectively. The notations “I” and “R” stand for the imaginary part: E½sinðωvot Þ and the real part: E½cosðωvot Þ of the characteristic function, respectively. This class of tests is also applicable to evaluating the distribution hypothesis. In particular, they are applicable to testing various g(·, β)s using the associated ‘vt s. For example, ‘vt ¼ −vt if g(·) is the PDF of N(0,1), as mentioned before; on the other hand,    pffiffiffi   pffiffiffi ‘vt ¼ − π= 3 þ 2g ðvt Þ 1 þ exp − π= 3 vt if g(·) follows Eq. (47), and

βþ1 vt

‘vt ¼ − β−2 1 þ v2t =ðβ−2Þ if g(·, β) follows Eq. (49). In testing symmetry or testing a symmetric g(·, β), we can simplify ψIt as vt(1 + vt2) − 1 because  −1 ¼ 0 holds by the symmetry, as implied by (51). In this scenario, we can also express the IR test statistic as the E vot 1 þ v2ot ^ following Eq. (34) and the sum of the I and R test statistics. We can easily compute the I, R, and IR test statistics as TRc2 using the ϕ ct associated ψts. It is known that the imaginary (real) part of the characteristic functionh determines ithe symmetry (dispersion) of the

−1  h 2

−1 i E vt 1 þ v2t as a symmetry distribution. Thus, like the skewness (kurtosis), we may interpret the moment E vt 1 þ v2t (dispersion) indicator. Correspondingly, like the S test (the K test), the I test (the R test) may be expected to be powerful for testing a distribution hypothesis against the misspecifications in the direction of symmetry (dispersion). However, unlike the former, the latter is valid regardless of whether the skewness (kurtosis) exists. Thus, the I, R, and IR tests are expected to be more useful than the S, K, and SK tests for evaluating heavily-tailed distribution hypotheses. 4.3. VaR tests In empirical finance, researchers have also proposed a variety of methods for the parametric VaR evaluation. Representative examples include Christoffersen's (1998) likelihood ratio (LR) test that checks whether the observed coverage rate of a lower-tail event is the same as its theoretical counterpart implied by a VaR model. The graphical inspection method of Diebold et al. (1998) examines whether the histogram of the PIT of returns, generated from condition (2), is sufficiently close to the PDF of U(0,1). These two methods are both closely related to Kupiec's (1995) backtesting method. Berkowitz's (2001) LR test, based on a normality-transformation of returns, is another well-known example in this literature. Although these tests have been routinely applied to the standardized residuals of an estimated GARCH-type model for parametric VaR evaluations, such applications are likely to be size-distorted because these existing tests are designed in the case of no estimation effect. To correct for this problem, Chen (2011) applied the M test to the associated higher-moment restrictions; see also Lejeune (2009) for a related study. Using our approach, we can easily establish another set of VaR tests that are more attractive than their M-test counterparts because of the appealing properties of the C test. Let G(·, β) be the distribution function of the PDF g(·, β). Also, let Gn(·), Gn− 1(·), and gn(·) be, respectively, the distribution function, quantile function, and PDF of N(0,1), and write Vτ : = Gn− 1(τ) as a pre-selected VaR threshold with a fixed probability τ > 0. The PIT of yt, generated from condition (2), is of the form: P t ðθÞ :¼ Gðvt ; βÞ:

ð54Þ

It is known that P ot :¼ P t ðθo Þ is U(0,1)-distributed and independent of X t−1 under condition (2); see, e.g., Diebold et al. (1998). The same condition also implies that the normality-transformed variable πot : = πt(θo), with −1

πt ðθÞ :¼ Gn ðP t ðθÞÞ;

ð55Þ

440

Y.-T. Chen / Journal of Empirical Finance 19 (2012) 427–453

is N(0,1)-distributed and independent of X t−1 , and hence satisfies the restrictions: E½Iðπot bV τ Þ ¼ τ

ð56Þ

and "

hE½πot Iðπot bV τ Þ i E π2ot −1 Iðπot bV τ Þ

#

 ¼

−g n ðV τ Þ : −g n ðV τ ÞV τ

ð57Þ

The first component of the left-hand side of Eq. (57) is based on the expected shortfall. The VaR-evaluation methods of Kupiec (1995), Christoffersen (1998), and Diebold et al. (1998) amount to checking the moment restriction in Eq. (56) when the parametric VaR model is established by assuming condition (2). Moreover, Berkowitz's (2001) LR test amounts to examining the restriction in Eq. (57) when it is applied to the same parametric VaR model; see the Appendix of Chen (2011). Using our approach, we can easily correct the estimation-effect problem that does not considered by these existing tests by applying the C test to the ⊺ ⊺ ψVt and ψEt defined in Table 1. We refer to the C tests with ψt = ψVt, ψt = ψEt, and ψt = (ψVt, ψEt ) as the V test, the E test, and the VE test, respectively. The notations “V” and “E” stand for the (estimation-effect-corrected) VaR and expected shortfall tests, respectively. Unlike the previously mentioned ψts, the ψVt and ψEt are based on the indicator function I(πt b Vτ) which is notdifferentiable in  the ordinary sense. A standard method to deal with this problem is to replace the Jacobian matrix E ∇θ⊺ ðxt ψt Þ in Eq. (5) with ζt = xtψt by the matrix ∇θ⊺ E½xt ψt , following the stochastic-equicontinuity argument; see, e.g., Andrews (1994). Alternatively, it is more convenient to maintain the asymptotic expansion in Eq. (5) with ζt = xtψt in this scenario by the generalized-Taylorexpansion approach of Phillips (1991, 1995). Heuristically, this approach replaces the role of the “derivative” of the indicator ∂ function with δðkÞ :¼ ∂k Iðk≥0Þ, in which the Dirac delta function δ(·) is a generalized function with the sifting property: ∫R δðk−ko Þr ðkÞdk ¼ rðko Þ; for any continuous function r(·); see, e.g., Kanwal (1983) and Phillips (1995, p. 917). In Appendix A, we use this property to derive the formulae of E½ψvt  and E½ψvt vt  for ψt = ψVt and ψt = ψEt shown in Table 1. Given these formulae, we can obtain the ϕcts for the V, E, and VE tests. Our simulation supports the validity of this approach. Similar to the SK test (the IR test), the VE test is also useful for checking the distribution hypothesis. However, unlike the former that checks this hypothesis from the aspect of skewness and kurtosis (asymmetry and dispersion), the latter is focused on exploring the lower-tail misspecifications that are particularly important for risk management. 4.4. Independence tests In the GARCH literature, the standardized error distributions are often specified to be time-invariant; see, e.g., Engle (1982) and Bollerslev (1986) for the standard normal distribution, Bollerslev (1987) for the standardized t distribution, and Nelson (1991) for the generalized error distribution. The aforementioned tests and these specifications are all based on the independence hypothesis. Nonetheless, there also exists a number of time-varying specifications that allow their distributional parameters to vary with certain vt − ks. These specifications are designed to generate higher-order dependence structures, such as time-varying skewness–kurtosis (asymmetry-dispersion) and tail properties. Examples include the conditional skewed t distribution of Hansen (1994) and the conditional skewness–kurtosis-based maximum entropy distribution of Rockinger and Jondeau (2002), among others. Our approach may also be applied to discriminating between these two classes of specifications by testing the independence hypothesis against higher-order dependence structures. 
Specifically, we can reinterpret φt(α) : = φ(vt) as a higher-moment function of vt, and set ψt ¼ φt −E½φot  for checking the ^ t :¼ φt ðα ^ T Þ, the C implication of the independence hypothesis: E½φot X t−1  ¼ E½φot ; recall that φot : = φt(αo). Given this ψt and φ ^ ¼φ ^ test with Eq. (13) is directly applicable to ψ −E ½ φ . However, because E ½ φ  is generally unknown under the independence t t ot ot ^ with ψ ~ :¼ φ ^ t and by replacing ψ  T :¼ T −1 ∑Tt¼1 φ ^ t −φ T. hypothesis, we need to facilitate this test by estimating E½φot  using φ t t From the expansion: ! T T T X X pffiffiffi 1 X ~ ¼ p1ffiffiffi ^ − 1  T −E½φot Þ; pffiffiffi x^t ψ x^t ψ x^t T ðφ t t T T t¼1 T t¼1 t¼1

ð58Þ

we can realize that this replacement generates the estimation effect of φ̄T, which is not considered by Proposition 1. Nonetheless, this expansion also shows that we can easily eliminate this effect by setting T⁻¹ ∑_{t=1}^T x̂t →p 0. This justifies the restriction T⁻¹ ∑_{t=1}^T x̂t →p 0 for testing the independence hypothesis. It is easy to satisfy this restriction by choosing a zero-mean (or a centered) xt, such as

x_t=(v_{t-1},\ldots,v_{t-m})^{\top}\otimes\iota_q \qquad (59)


and

x_t=\big(v_{t-1}^{2}-1,\ldots,v_{t-m}^{2}-1\big)^{\top}\otimes\iota_q. \qquad (60)
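To make the construction concrete, the following minimal sketch (in Python; not part of the original paper) builds these centered instruments from a given series of standardized residuals. The function names and the representation of ιq as a q-vector of ones are ours.

# A minimal sketch of the instruments in Eqs. (59) and (60); "vhat" is an array of
# standardized residuals (assumed given), and iota_q is represented by np.ones(q).
import numpy as np

def x_eq59(vhat, t, m, q=1):
    # x_t = (v_{t-1}, ..., v_{t-m})' ⊗ iota_q, defined for t >= m
    lags = vhat[t - m:t][::-1]                # (v_{t-1}, ..., v_{t-m})
    return np.kron(lags, np.ones(q))

def x_eq60(vhat, t, m, q=1):
    # x_t = (v_{t-1}^2 - 1, ..., v_{t-m}^2 - 1)' ⊗ iota_q, defined for t >= m
    lags = vhat[t - m:t][::-1]
    return np.kron(lags ** 2 - 1.0, np.ones(q))

# Both instruments have (approximately) zero sample mean when the standardized
# residuals do, which is the restriction T^{-1} * sum_t x_t -> 0 discussed above.

For q = 1 these reduce to the usual vectors of lagged (or lagged squared, centered) standardized residuals used in correlation-type diagnostics.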

Accordingly, the C test with Eq. (13) remains applicable to ψ̃t for testing the independence hypothesis, and its test statistic comprises the sample counterpart of Eq. (13):

\hat{\phi}_{ct}=\hat{x}_t\Big[(\hat{\varphi}_t-\bar{\varphi}_T)-\Big(\frac{1}{T}\sum_{t=1}^{T}\hat{\varphi}_{vt}\Big)\hat{v}_t-\Big(\frac{1}{2T}\sum_{t=1}^{T}\hat{\varphi}_{vt}\hat{v}_t\Big)\big(\hat{v}_t^{2}-1\big)\Big], \qquad (61)

where φ̂vt := φvt(α̂T). Similarly, we can also redefine ψ̃t as ψ̃t = φ̂t − E[φt]|θ=θ̂T and base the C test statistic on

\hat{\phi}_{ct}=\hat{x}_t\Big[\hat{\varphi}_t-\mathrm{E}[\varphi_t]\big|_{\theta=\hat{\theta}_T}-\mathrm{E}[\varphi_{vt}]\big|_{\theta=\hat{\theta}_T}\hat{v}_t-\frac{1}{2}\,\mathrm{E}[\varphi_{vt}v_t]\big|_{\theta=\hat{\theta}_T}\big(\hat{v}_t^{2}-1\big)\Big] \qquad (62)

for testing the distribution and independence hypotheses simultaneously. Consequently, we can easily extend the SK test and the IR test to evaluating the independence hypothesis against time-varying skewness–kurtosis and asymmetry-dispersion by using Eq. (61) with φt = (φSt, φKt)⊺ and φt = (φIt, φRt)⊺, respectively, in which φSt := vt³, φKt := vt⁴, φIt := vt/(1 + vt²), and φRt := vt²/(1 + vt²). Similarly, we can extend the S, K, I, and R tests to testing the independence hypothesis by using Eq. (61) with φt = φSt, φt = φKt, φt = φIt, and φt = φRt, respectively. These independence tests can be implemented without using a particular distribution hypothesis. We can also extend the V, E, and VE tests to checking time-varying lower tails by using Eq. (62) with φt = ψVt, φt = ψEt, and φt = (ψVt, ψEt)⊺, respectively. However, these tests are inseparable from a distribution hypothesis for calculating the expectations in Eq. (62); see also Table 1. The aforementioned C tests are all asymptotically valid when T⁻¹ ∑_{t=1}^T x̂t →p 0, and their test statistics with the xt in Eq. (59), or Eq. (60), all have the asymptotic null distribution χ²(m). This choice of xt mimics a convention in the GARCH literature that evaluates the conditional mean-and-variance specification by examining whether vt and vt² − 1 are correlated with the xt in Eq. (59), or Eq. (60), with q = 1. Indeed, we can also base the C test on other φt's and xt's for other purposes. For example, Christoffersen (1998) proposed an LR test for testing this hypothesis against the first-order Markov chain of the VaR violations. Berkowitz (2001) also proposed a related LR test for testing the independence of the VaR violations. These two LR tests amount to checking the moment restrictions E[ψVo,t ψVo,t−1] = 0 and E[πo,t π^j_{o,t−i}] = 0, respectively, with ψVo,t := ψVt(θo), i = 1, …, k, and j = 1, 2, in the context of no estimation effect; see Chen (2011). We can easily correct the estimation-effect problem by applying the C test to Eq. (62) with φt = ψVt and xt = ψV,t−1 and with ψt = πt and xt = (πt−1, …, πt−k, π²_{t−1} − 1, …, π²_{t−k} − 1)⊺, respectively. This shows again the flexibility of our approach in practical applications. It should be noted that, by construction, our higher-moment tests are not suitable for checking the independence hypothesis against a conditional mean-or-variance misspecification; we refer to Lundbergh and Teräsvirta (2002), Chen (2008), and the references therein for such tests. Nonetheless, our tests are potentially useful for evaluating whether a conditional mean-and-variance model is capable of fully explaining the dynamic dependence structures of yt | Xt−1, as implied by Assumption 1.

5. Simulation

In this section, we conduct a Monte Carlo simulation to assess the finite-sample performance of our approach in testing the distribution, symmetry, and independence hypotheses.

5.1. Simulation designs

In this simulation, the empirical sizes and powers are all evaluated at the 5% nominal level, the number of replications is 1000, and the sample size is T = 1000 or 2000. The models being considered include the AR(1) model in Eq. (32) with (αo, α1, σ²) = (0, 0.5, 1) and the GARCH(1,1) model in Eq. (33) with (αo, α1, α2, α3) = (0, 0.1, 0.9, 0.05). The parameters of these two models are, respectively, estimated using the OLS method and the Gaussian QML method. The Gaussian QML method is implemented using the Maximum Likelihood Estimation module of GAUSS™ with the BFGS method.
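As a rough illustration of this design, the following sketch simulates the GARCH(1,1) DGP with the parameter values stated above and computes the independence S test statistic from Eq. (61) with the xt in Eq. (59) (q = 1, m = 5). Two simplifications should be stressed: the standardized residuals are formed with the true parameters, so the Gaussian QML step and the estimation effect are omitted here, and the TRc² statistic is implemented as the uncentered T·R² from regressing a vector of ones on the ϕ̂ct's, which is a common moment-test device and an assumption on our part rather than the paper's stated formula.

# A minimal sketch (not the author's code) of the simulation design and of the
# independence C test in Eq. (61), under the assumptions stated in the lead-in:
# - the GARCH(1,1) model of Eq. (33) is h_t = a1 + a2*h_{t-1} + a3*u_{t-1}^2 with
#   y_t = a0 + u_t and u_t = v_t*sqrt(h_t), as in the note to Table 7;
# - standardized residuals are formed with the true parameters (no estimation effect);
# - TR_c^2 is the uncentered T*R^2 from regressing ones on the phi_ct's (assumed form).
import numpy as np

rng = np.random.default_rng(0)
T, m = 1000, 5
a0, a1, a2, a3 = 0.0, 0.1, 0.9, 0.05              # GARCH(1,1) parameters in the simulation

# --- generate the GARCH(1,1) DGP with IID N(0,1) standardized errors ---
v = rng.standard_normal(T + 200)                  # 200 burn-in observations
h = np.empty(T + 200); h[0] = a1 / (1.0 - a2 - a3)
u = np.empty(T + 200); u[0] = v[0] * np.sqrt(h[0])
for t in range(1, T + 200):
    h[t] = a1 + a2 * h[t - 1] + a3 * u[t - 1] ** 2
    u[t] = v[t] * np.sqrt(h[t])
y = a0 + u[200:]
v_hat = v[200:]                                   # "standardized residuals" (true parameters)

# --- independence S test: Eq. (61) with phi_t = v_t^3 and x_t from Eq. (59), q = 1 ---
phi  = v_hat ** 3
phiv = 3.0 * v_hat ** 2                           # phi_vt = d(phi_t)/dv_t
lam1 = phiv.mean()                                # (1/T) * sum_t phi_vt
lam2 = 0.5 * (phiv * v_hat).mean()                # (1/2T) * sum_t phi_vt * v_t
core = (phi - phi.mean()) - lam1 * v_hat - lam2 * (v_hat ** 2 - 1.0)
X = np.column_stack([np.roll(v_hat, k) for k in range(1, m + 1)])  # lagged residuals
phi_ct = X[m:] * core[m:, None]                   # drop the first m obs. with undefined lags

# --- assumed TR_c^2 form: regress a vector of ones on phi_ct, use uncentered R^2 ---
ones = np.ones(len(phi_ct))
beta, *_ = np.linalg.lstsq(phi_ct, ones, rcond=None)
R2 = 1.0 - np.sum((ones - phi_ct @ beta) ** 2) / np.sum(ones ** 2)
C_stat = len(phi_ct) * R2                         # compare with chi-square(m) critical values
print(C_stat)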
The programs used in this simulation are available upon request. We set xt = Ip (xt = vt−1) in testing the distribution and symmetry hypotheses (the independence hypothesis). In the experiment of distribution tests, we consider the C tests, the M tests, and the W tests with the skewness–kurtosis-based ψt's: ψSt, ψKt, and (ψSt, ψKt)⊺, the characteristic-function-based ψt's: ψIt, ψRt, and (ψIt, ψRt)⊺, and the VaR-based ψt's (τ = 0.1): ψVt, ψEt, and (ψVt, ψEt)⊺. The data generating process (DGP) for {vot} is an IID sequence with one of the five standardized distributions: N(0,1), Lg, t(9), t(12), and Ln(0.1), where Ln(a) represents the standardized log-normal distribution with the asymmetry parameter a = 0.1. The distribution hypotheses being tested include N(0,1), Lg, and t(12). The test statistic MT is computed as in Eq. (18) with Ω̂m = T⁻¹ ∑_{t=1}^T ϕ̂mt(θ̂T) ϕ̂mt(θ̂T)⊺, where ϕ̂mt(θ̂T) is the sample analogue of Eq. (19). The test statistics CT and WT are, respectively, computed as the TRc² and TRw² statistics mentioned in Sections 2 and 3. The empirical sizes and powers of these tests are shown in Table 2 for the AR(1) model and in Table 3 for the GARCH(1,1) model. In addition, we also consider the theoretical-covariance-matrix-based C tests


with ψt = ψSt, ψKt, and (ψSt, ψKt)⊺ using the closed-form formula of Ωc in Eq. (38). The resulting empirical sizes and powers are shown in Table 4. As mentioned in Section 3.3, Eq. (34) is the same as Eq. (35) in testing normality. This means that the C test in Table 3 (Table 4) can be understood as the TRc²-based (theoretical-covariance-matrix-based) test of Bontemps and Meddahi (forthcoming) when the distribution hypothesis being tested is N(0,1). In the experiments of symmetry and independence tests, we do not impose a specific distribution assumption. Thus, the score function St is not defined, and the W test is inapplicable to these two experiments. The distribution-oriented ψt's: ψVt and ψEt are also not considered for the same reason. The M and C test statistics are computed as in Tables 2 and 3 but using different ψt's. In testing symmetry, we base these two tests on the odd-symmetric ψt's: φSt = vt³, φIt = vt/(1 + vt²), and (φSt, φIt)⊺, but not the symmetric ψt's: φKt = vt⁴ and φRt = vt²/(1 + vt²), because the symmetry-implied restriction E[ψot] = 0 only holds for the former. The DGP for {vot} is an IID sequence with one of the symmetric distributions: N(0,1), Lg, and t(9) for computing the empirical sizes, and one of the asymmetric distributions: Ln(0.05) and Ln(0.1) for computing the empirical powers. Note that the asymmetry of Ln(0.1) is stronger than that of Ln(0.05). The empirical sizes and powers of these symmetry tests are shown in Table 5. In testing independence, the C test is based on Eq. (61) with the φt's: φSt, φKt, (φSt, φKt)⊺, φIt, φRt, and (φIt, φRt)⊺; correspondingly, the M test is implemented by using φ̂t − φ̄T in place of ψ̂t in Eq. (18) with these φt's. The DGP for {vot} is an IID sequence with one of the static distributions: N(0,1) and Lg for computing the empirical sizes, and a serially dependent sequence with one of the time-varying distributions: t(b1t), Ln(b2t), and Ln(b3t), where the parameters b1t, b2t, and b3t have the laws of motion:

b_{it}=b_{i,\min}+\frac{b_{i,\max}-b_{i,\min}}{1+\exp\!\big(-(c_{io}+c_{i1}v_{o,t-1})\big)},\qquad i=1,2,3,

with cio := −ln((bi,max − di)/(di − bi,min)), (b1,min, b1,max) = (5, 30), (b2,min, b2,max) = (b3,min, b3,max) = (0, 2), (d1, d2, d3) = (12, 0.05, 0.05), and (c11, c21, c31) = (3, 1.5, 3); see, e.g., Hansen (1994, p. 712) for a related specification. Note that d1 and d2 (or d3) are, respectively, the shape parameters of t(12) and Ln(0.05). Moreover, t(b1t) and Ln(b2t) (or Ln(b3t)) have, respectively, time-varying tails and time-varying asymmetry, and the time-varying asymmetry of Ln(b3t) is stronger than that of Ln(b2t). The empirical sizes and powers of these independence tests are presented in Table 6.

5.2. Simulation results

The following discussion is mainly focused on the experiment of distribution tests because the W test can only be compared with the M and C tests in this experiment. Moreover, this comparison is concentrated on Table 3 because Table 2 shows very similar results to Table 3 for all the cases considered. Table 3 shows that the empirical sizes of the M, W, and C tests are close to the 5% level for all the distributions considered when these tests are based on one of the ψt's: ψSt, ψIt, ψRt, (ψIt, ψRt)⊺, ψVt, and ψEt. This suggests that, like the existing M and W tests, the C test is also valid in this experiment. The same table also indicates that these tests become obviously over-sized in testing the heavily-tailed distributions Lg and t(12) when ψt = ψKt or (ψSt, ψKt)⊺. Since this over-sized distortion tends to be remedied when T increases from 1000 to 2000, it is likely to be a finite-sample problem. Specifically, the ψKt-based M, W, and C tests have, respectively, the empirical sizes 19.6%, 15.7%, and 18.4% (16.2%, 11.4%, and 15.1%) in testing Lg, and 20.6%, 17.8%, and 18.5% (16.8%, 14.1%, and 15.6%) in testing t(12), when T = 1000 (T = 2000). A possible interpretation of this problem is that these kurtosis-based tests all implicitly involve estimating the eighth moment of vot. The adequacy of this estimation requires a sufficiently large T for heavily-tailed distributions; see, e.g., Lobato and Velasco (2004) and Bai and Ng (2005). A similar, but much milder, over-sized distortion can also be found for the M and W tests in testing normality when ψt = ψKt or (ψSt, ψKt)⊺. Given T = 1000 (T = 2000), these two tests have, respectively, the empirical sizes 8.7% and 9.4% (7.8% and 8.3%) when ψt = ψKt, and 8.2% and 10.7% (8.6% and 9.6%) when ψt = (ψSt, ψKt)⊺. In comparison, the C tests with these two ψt's have, respectively, the empirical sizes 5.8% and 6.3% (6.4% and 7.1%) when T = 1000 (T = 2000). Thus, the C test has better size performance for testing normality. Table 3 also shows that the empirical powers of the M, W, and C tests all tend to increase with T, provided that the empirical sizes of these tests are close to the 5% level and the associated ψt's are suitably selected. For example, the M, W, and C tests with ψt = (ψIt, ψRt)⊺ have, respectively, the empirical sizes 5.8%, 6.2%, and 5.3% (6.6%, 6.8%, and 5.8%) for testing normality, and the empirical powers 95.5%, 97.5%, and 96.5% (100%, 100%, and 100%) for testing N(0,1) against Lg, when T = 1000 (T = 2000). This suggests that, like the M and W tests, the C test is also sensible for checking distribution hypotheses. From Table 3, we can also see that the relative power performance of the M, W, and C tests depends on the ψt, the distribution hypothesis, and the DGP being considered.
Nonetheless, the empirical powers of the C test are quite close to those of the M test in most cases of this experiment. Thus, the following comparison is mainly focused on the W and C tests for ease of exposition. In testing normality, the key difference between the empirical powers of the W and C tests arises when ψt = (ψSt, ψKt)⊺ and the DGP is symmetric. Given T = 1000 (T = 2000), the W test has the empirical powers 93.3%, 56.4%, and 82.1% (100%, 90.5%, and 98.6%) when the DGP is Lg, t(12), and t(9), respectively. Correspondingly, the C test has the powers 78.0%, 37.0%, and 58.8% (97.6%, 79.7%, and 89.0%). In testing Lg, the main difference appears when ψt = ψVt and the DGP is N(0,1). The W and C tests have, respectively, the empirical powers 10.3% and 54.9% (12.1% and 85.9%) when T = 1000 (T = 2000). In testing t(12), a key difference arises when ψt = ψRt and the DGP is N(0,1). In this example, the W and C tests have, respectively, the empirical powers 70.8% and 45.1% (94.1% and 78.0%) when T = 1000 (T = 2000). The power differences between these two tests are milder in most of the other cases in Table 3. At first sight, one might conclude that the W test (the C test) obviously outperforms the C test (the W test) in the first and third cases (the second case). However, this ignores the size differences between these two tests. As


mentioned before, the over-sized distortion of the W test is more serious than that of the C test in testing N(0,1). Thus, the “power advantage” of the W test over the C test is likely to be over-estimated in this case. To verify this point, we conduct a size-corrected experiment for power comparison. In this experiment, the significance of a test statistic is evaluated using the 95% quantile of the simulated null distribution, rather than that of the asymptotic null distribution, of this test statistic. The simulated null distribution is the empirical distribution of a test statistic generated from the 1000 replications under the null hypothesis. This permits us to preclude the effect of size distortions for a fair power comparison. In a Supplementary Appendix B, which is not reported here but is available upon request, we provide a table of the 95% quantiles of the simulated null distributions and a table of the size-corrected powers of various tests. For brevity, we only report the size-corrected powers of the W and C tests in the above three cases. In the first case, the size-corrected powers of the W test are, respectively, 76.8%, 29.1%, and 56.8% (99.7%, 79.5%, and 94.2%) when T = 1000 (T = 2000) and the DGP is Lg, t(12), and t(9). Correspondingly, the C test has the size-corrected powers 71.3%, 29.8%, and 50.9% (95.8%, 72.4%, and 84.8%). Importantly, the power advantage of the W test over the C test in checking normality shrinks considerably after the size correction. This verifies our viewpoint about the original power performance of the W and C tests in checking normality. Given T = 1000 (T = 2000), the W and C tests have, respectively, the size-corrected powers 7.9% and 53.9% (12.1% and 86.4%) in the second case and 68.1% and 44.6% (93.5% and 78.0%) in the third case. In these two cases, the size correction has little impact on the relative power performance of the W and C tests because the empirical sizes of the two tests are already close to the 5% level before the correction. Recall that the K and SK tests in Table 4 are, respectively, the theoretical-covariance-matrix-based C tests with ψt = ψKt and (ψSt, ψKt)⊺, and can be interpreted as the tests of Bontemps and Meddahi (2005, forthcoming) with the same ψt's in testing N(0,1). By comparing Table 3 with Table 4, we can observe that these theoretical-covariance-matrix-based tests have empirical sizes that are closer to the 5% level than their TRc²-based counterparts (or the M and W tests with the same ψt's). This is because, unlike the latter, the former are based on the distribution-implied λk's and hence are free of estimating the eighth moment of vot. Moreover, the former are also more powerful than the latter in testing normality against other symmetric distributions. This may be explained by the fact that, unlike the latter, the former consider not only the skewness–kurtosis restriction but also the theoretical covariance matrix of the sample skewness and kurtosis implied by the normality. Focusing on the C test, we can further compare the power performance of different higher moments in testing distribution hypotheses. The C tests being compared here include the SK test in Table 4 for the GARCH(1,1) model and the IR and VE tests in Table 3. Given T = 1000, the SK test has the empirical powers 99.6%, 83.4%, 96.1%, and 91.2% in testing normality when the DGPs are, respectively, Lg, t(12), t(9), and Ln(0.1).
Correspondingly, the IR test (the VE test) has the empirical powers 96.5%, 61.3%, 85.2%, and 87.3% (41.2%, 18.7%, 29.5%, and 91.1%). The former is more powerful than the latter in testing normality. However, their relative performance is reversed in testing heavily-tailed distributions. Given T = 1000, the IR test (the VE test) has the empirical powers 87.0%, 25.4%, 12.7%, and 99.0% (87.0%, 26.0%, 12.5%, and 100%) for testing Lg against N(0,1), t(12), t(9), and Ln(0.1), respectively. Correspondingly, the SK test has powers of only 23.7%, 1.5%, 7.0%, and 86.3%. Clearly, the IR test (the VE test) is more powerful than the SK test in examining Lg, particularly when the true distributions are symmetric. A similar result can also be found in testing t(12). Table 3 also shows that the IR test is more powerful than the VE test in testing normality, while the relative performance of these two tests becomes mixed in testing Lg or t(12). We also observe that it is more difficult to discriminate between two heavily-tailed distributions than to test normality against heavily-tailed distributions. In addition, the S test shown in Table 4 has essentially no power in testing Lg, or t(12), against normality. This is because the S test is based on the σS² implied by a heavily-tailed distribution in this case. This σS² obviously “over-estimates” the true σS² implied by the normality. By contrast, the ψSt-based C test in Table 3 is free of this problem because it consistently estimates the asymptotic covariance matrix of T⁻¹ᐟ² ∑_{t=1}^T ψ̂St using its sample counterpart, but this test has no power for testing N(0,1) against other symmetric distributions because the zero-skewness restriction holds under both the null and alternative hypotheses. (The S test in Table 4 has some power even in this scenario because it is based on the normality-implied σS², which “under-estimates” the true σS² under the alternatives.) In comparison, this skewness test is highly powerful for testing normality against the asymmetric distributions Ln(0.05) and Ln(0.1); see also the I test for a similar result. Thus, the S and I tests are powerful against non-normality in the direction of asymmetry. It can be seen that, by contrast, the K and R tests are powerful against non-normality in the direction of heavy tails. Moreover, the E test is systematically more powerful than the V test; see also Chen (2011) for a similar simulation finding. Regarding the experiments of symmetry tests and independence tests, Tables 5 and 6 show that the M and C tests both have reasonable size performance. Table 5 also indicates that these two tests have the same empirical sizes and powers for testing symmetry under the AR(1) model. This is because their test statistics are the same for testing symmetry when T⁻¹ ∑_{t=1}^T v̂t = 0, T⁻¹ ∑_{t=1}^T v̂t² = 1, T⁻¹ ∑_{t=1}^T v̂t ŵt = 0, and T⁻¹ ∑_{t=1}^T (v̂t² − 1) ẑt = 0, and this condition is satisfied because we estimate the parameters of the AR(1) model using the OLS method. This table also shows that the empirical powers of the C test are very similar to, or marginally higher than, those of the M test for testing symmetry against Ln(0.05) or Ln(0.1) under the GARCH(1,1) model. The empirical powers of these two tests all reasonably increase with T and the degree of asymmetry.
Table 6 demonstrates that the C test is more powerful than the M test for testing independence against the time-varying t(b1t), Ln(b2t), and Ln(b3t) under both the AR(1) model and the GARCH(1,1) model, provided that the φt is suitably selected. In particular, as shown by this table, the empirical powers of the C test for testing independence against t(b1t) are mainly contributed by the use of the symmetric φt: φKt or φRt. In comparison, the empirical powers of the C test for testing independence against Ln(b2t) or Ln(b3t) are mainly contributed by the use of the odd-symmetric φt: φSt or φIt. This is because t(b1t) is symmetric and its tails are time-varying, and Ln(b2t) and Ln(b3t) are asymmetric and their degrees of asymmetry are time-varying. In this experiment, the empirical powers of the C test also


Table 2 The empirical sizes and powers of the M, W, and C tests for the AR(1) model. T = 1000 DGP

Test

N(0,1)

M

W

C

Lg

M

C

t(12)

M

W

C

t(9)

M

W

C

Ln(0.1)

M

W

C

T = 1000

N(0,1)

Lg

t(12)

N(0,1)

Lg

t(12)

4.9 7.2 7.1 5.6 8.2 9.2 4.9 6.2 6.7 4.6 89.2 75.9 8.1 97.7 91.0 4.6 89.5 76.8 4.5 56.1 36.0 7.5 73.1 54.3 4.5 57.9 37.0 4.4 73.8 56.7 8.3 91.0 79.6 4.4 74.3 57.4 96.1 4.4 95.2 97.1 8.0 96.0 96.1 4.8 94.9

4.9 99.8 99.9 2.3 99.8 99.9 4.9 99.8 99.9 4.6 16.8 22.3 5.0 12.8 18.9 4.6 16.4 22.0 4.5 54.2 60.8 3.6 51.5 56.4 4.5 53.8 59.9 4.4 25.0 32.2 4.5 21.7 28.5 4.4 24.5 31.4 96.1 93.8 100.0 91.8 94.5 100.0 96.1 93.8 100.0

4.9 98.1 97.1 3.7 97.1 96.5 4.9 97.7 96.8 4.6 5.0 5.4 7.4 13.4 11.3 4.6 5.2 5.4 4.5 17.4 21.7 5.3 13.9 19.9 4.5 16.9 21.2 4.4 4.6 8.5 6.5 8.2 9.7 4.4 4.6 8.1 96.1 78.1 100.0 95.1 73.9 100.0 96.1 77.5 100.0

5.1 6.0 6.8 5.5 6.5 7.9 5.1 5.7 6.5 4.5 98.7 96.8 7.4 99.8 99.5 4.5 98.7 96.9 4.1 90.7 79.7 6.7 96.5 88.7 4.1 91.1 80.3 3.8 92.3 88.5 7.6 99.3 96.4 3.8 92.3 88.6 100.0 8.5 99.9 100.0 12.2 100.0 100.0 8.7 99.9

5.1 100.0 100.0 1.9 100.0 100.0 5.1 100.0 100.0 4.5 14.7 18.8 5.1 10.6 14.1 4.5 14.4 18.4 4.1 59.1 67.1 3.1 59.2 65.4 4.1 58.6 66.9 3.8 22.8 28.3 4.3 20.2 25.5 3.8 22.2 28.0 100.0 99.0 100.0 99.9 99.5 100.0 100.0 98.9 100.0

4.8 78.9 70.4 4.3 94.6 89.5 4.8 78.6 69.3 4.7 54.5 43.7 6.0 66.9 55.8 4.7 55.1 44.4 5.1 5.8 7.0 5.4 5.6 6.2 5.1 5.9 7.0 4.9 15.0 12.4 5.8 24.2 19.4 4.9 15.5 12.9 99.2 79.5 99.9 99.0 86.6 100.0 99.2 78.7 99.9

I R IR I R IR I R IR I R IR I R IR I R IR I R IR I R IR I R IR I R IR I R IR I R IR I R IR I R IR I R IR

T = 2000

T = 1000

N(0,1)

Lg

t(12)

N(0,1)

Lg

t(12)

5.1 5.1 4.7 5.4 5.7 4.9 5.2 5.3 4.7 6.5 98.4 96.6 7.4 98.8 97.5 6.5 98.5 97.0 6.2 73.1 57.9 7.2 77.6 63.9 6.2 74.4 59.2 6.4 91.3 85.2 7.6 93.7 89.8 6.5 91.8 86.0 92.3 6.6 87.7 92.9 7.2 88.7 92.3 6.8 87.6

5.2 94.2 88.8 3.4 99.1 96.7 5.2 93.7 87.9 6.0 5.2 5.3 5.9 6.5 6.1 6.0 5.4 5.2 5.8 34.8 26.8 4.8 38.8 30.1 5.8 33.3 25.8 5.9 14.7 13.1 5.6 12.9 11.9 5.9 14.0 12.5 81.4 94.3 99.2 75.9 96.6 99.7 81.4 93.5 99.0

5.0 47.8 38.3 4.2 71.0 60.6 5.0 46.2 37.0 6.1 30.0 22.4 6.8 39.7 31.0 6.1 31.4 23.1 5.9 5.3 5.9 5.8 6.4 6.1 5.9 5.3 5.8 6.1 10.1 8.4 6.5 15.4 12.6 6.1 10.5 9.0 84.8 47.9 91.9 84.3 57.7 96.3 84.8 45.6 91.6

4.9 4.0 5.0 5.0 4.1 4.9 4.9 4.0 4.9 5.4 100.0 100.0 6.5 100.0 100.0 5.4 100.0 100.0 5.2 95.7 91.2 5.8 96.9 92.8 5.2 96.1 91.4 5.6 99.8 99.7 6.7 99.8 99.7 5.6 99.8 99.7 99.9 6.8 99.5 99.9 7.5 99.5 99.9 7.1 99.5

4.7 100.0 99.7 3.7 100.0 100.0 4.7 100.0 99.6 5.2 6.0 6.6 5.2 5.6 6.0 5.2 6.2 6.8 5.2 58.0 46.4 4.2 64.2 53.8 5.2 57.5 45.3 5.4 25.3 20.2 5.0 22.0 16.6 5.4 24.6 19.4 98.7 100.0 100.0 97.5 100.0 100.0 98.7 100.0 100.0

5.9 75.3 64.7 2.9 95.4 91.0 3.9 80.0 71.6 5.1 52.4 42.9 6.1 67.3 56.9 5.5 56.2 45.2 5.5 5.2 4.4 4.5 6.3 5.1 4.6 5.2 4.6 4.6 13.8 11.2 4.5 23.7 17.6 3.7 13.8 9.3 98.7 72.1 99.9 99.0 86.0 100.0 99.0 77.8 99.8

V E VE V E VE V E VE V E VE V E VE V E VE V E VE V E VE V E VE V E VE V E VE V E VE V E VE V E VE V E VE

T = 2000

N(0,1)

Lg

t(12)

N(0,1)

Lg

t(12)

5.4 5.2 8.3 5.6 5.5 9.8 5.5 5.3 8.4 41.9 51.6 41.1 45.8 58.8 51.2 40.4 51.5 40.7 19.5 22.5 18.7 21.3 27.2 23.8 18.9 22.5 18.5 33.9 38.3 30.1 38.9 47.5 40.0 33.3 38.0 30.1 23.5 92.0 92.6 23.4 92.3 92.8 23.0 91.6 92.4

56.7 72.8 87.1 10.8 74.4 87.0 57.9 72.8 87.1 5.8 6.4 9.9 5.1 6.3 11.0 5.9 6.4 9.8 12.3 15.4 23.9 5.6 15.1 24.1 12.6 15.7 24.0 6.2 8.8 12.5 4.9 8.4 13.4 6.2 9.2 12.3 17.1 99.8 100.0 9.7 100.0 100.0 17.3 99.8 100.0

26.4 35.5 57.3 10.3 35.0 56.9 27.1 35.7 57.1 9.7 10.6 9.1 7.4 12.4 13.1 9.6 10.6 9.2 6.1 6.3 9.6 5.2 6.7 11.0 6.0 6.1 9.6 8.1 6.9 8.0 6.6 9.1 10.5 8.0 7.0 8.0 6.8 99.0 100.0 12.0 99.3 100.0 6.7 99.0 100.0

4.6 5.7 7.9 4.7 6.3 8.7 4.7 5.8 7.9 72.2 88.0 81.9 75.0 90.1 86.2 71.6 87.9 81.9 37.7 47.6 38.1 39.1 52.0 44.1 37.3 47.3 38.1 61.7 74.9 65.6 64.7 79.6 72.2 61.2 74.6 65.4 42.5 99.8 99.7 42.4 99.8 99.7 41.5 99.8 99.7

85.1 96.2 99.5 14.5 97.1 99.4 85.5 96.3 99.5 5.1 4.7 7.0 3.6 5.0 7.7 5.2 4.8 7.0 16.2 23.1 28.5 5.2 24.8 28.9 16.7 23.2 28.6 6.9 8.5 12.0 4.2 8.6 12.4 7.0 8.4 11.8 27.8 100.0 100.0 16.9 100.0 100.0 27.9 100.0 100.0

44.6 59.4 79.0 11.6 60.7 78.5 44.7 59.5 79.2 17.0 16.4 11.6 8.3 20.3 17.6 16.8 16.3 11.6 5.3 4.7 7.6 3.9 5.1 8.2 5.3 4.7 7.5 11.6 8.7 7.0 7.0 10.3 9.0 11.4 8.5 7.1 5.3 100.0 100.0 21.5 100.0 100.0 5.3 100.0 100.0

Note: the entries are rejection frequencies in percentages, and the rejection frequencies in boldface are the empirical sizes. The notations S, K, SK, I, R, IR, V, E, and VE stand, respectively, for ψt = ψSt, ψKt, (ψSt, ψKt)⊺, ψIt, ψRt, (ψIt, ψRt)⊺, ψVt, ψEt, and (ψVt, ψEt)⊺.


W

S K SK S K SK S K SK S K SK S K SK S K SK S K SK S K SK S K SK S K SK S K SK S K SK S K SK S K SK S K SK

T = 2000

Table 3 The empirical sizes and powers of the M, W, and C tests for the GARCH(1,1) model. T = 1000 DGP

Test

N(0,1)

M

W

C

Lg

M

C

t(12)

M

W

C

t(9)

M

W

C

Ln(0.1)

M

W

C

T = 1000

N(0,1)

Lg

t(12)

N(0,1)

Lg

t(12)

4.6 8.7 8.2 5.5 9.4 10.7 4.5 5.8 6.3 5.1 87.4 75.3 10.2 97.4 93.3 4.8 89.5 78.0 5.1 54.0 32.2 9.5 73.3 56.4 4.8 59.9 37.0 5.0 73.1 53.4 11.0 92.6 82.1 4.4 76.1 58.8 95.7 4.0 93.7 97.2 7.2 95.9 95.5 3.8 93.7

4.6 99.7 100.0 2.4 99.9 100.0 4.5 99.7 100.0 5.1 19.6 25.6 7.7 15.7 20.9 4.8 18.4 23.9 5.1 58.0 65.6 5.0 55.4 61.5 4.8 56.2 63.2 5.0 28.0 36.6 6.5 25.2 32.5 4.4 26.0 34.3 95.7 95.5 99.9 92.1 96.0 100.0 95.5 94.8 100.0

4.6 97.8 97.1 3.7 97.0 97.1 4.5 97.3 96.8 5.1 5.7 7.0 9.0 15.5 13.3 4.8 6.2 6.4 5.1 20.6 24.4 7.1 17.8 22.7 4.8 18.5 23.0 5.0 6.7 9.5 9.1 11.3 12.5 4.4 5.6 8.6 95.7 79.8 99.8 95.4 76.3 100.0 95.5 77.6 99.8

5.1 7.8 8.6 5.7 8.3 9.6 5.0 6.4 7.1 4.3 99.1 97.1 7.8 100.0 100.0 3.6 99.1 97.6 4.6 89.2 77.0 7.1 97.3 90.5 3.9 90.5 79.7 3.8 92.1 87.9 8.4 99.8 98.6 3.6 92.6 89.0 100.0 7.6 100.0 100.0 13.4 100.0 100.0 9.2 100.0

5.1 100.0 100.0 2.5 100.0 100.0 5.0 100.0 100.0 4.3 16.2 19.2 5.4 11.4 15.1 3.6 15.1 17.9 4.6 64.2 69.4 3.9 64.1 67.1 3.9 62.3 68.5 3.8 23.6 27.6 4.5 21.3 25.0 3.6 22.1 26.5 100.0 99.0 100.0 99.9 99.4 100.0 100.0 98.8 100.0

6.2 99.0 99.6 3.5 99.8 99.8 4.6 99.7 99.8 7.1 20.6 13.2 8.6 28.5 20.4 5.1 14.9 8.3 6.2 16.8 21.1 6.5 14.1 17.5 5.2 15.6 20.2 6.7 12.7 10.5 7.8 14.9 11.1 5.0 5.3 6.0 99.7 90.8 99.8 99.9 89.6 100.0 100.0 90.1 100.0

I R IR I R IR I R IR I R IR I R IR I R IR I R IR I R IR I R IR I R IR I R IR I R IR I R IR I R IR I R IR

T = 2000

T = 1000

N(0,1)

Lg

t(12)

N(0,1)

Lg

t(12)

5.1 5.6 5.8 5.8 5.4 6.2 5.4 5.2 5.3 6.0 98.4 95.5 8.2 99.0 97.5 5.9 99.0 96.5 6.0 69.7 55.7 7.0 75.9 65.7 6.0 74.1 61.3 6.2 89.5 81.8 8.1 93.3 88.0 6.0 91.3 85.2 92.7 6.4 87.8 92.9 6.7 89.3 92.8 6.2 87.3

4.1 93.9 90.3 3.5 98.9 96.8 4.4 92.5 87.0 5.6 6.7 5.9 5.8 6.4 6.5 5.8 5.4 5.6 5.8 36.7 29.6 5.4 39.4 32.4 6.1 31.1 25.4 5.7 17.8 15.5 5.9 15.8 13.7 6.0 15.2 12.7 80.0 93.3 99.3 74.7 96.7 99.8 79.5 92.0 99.0

4.8 51.1 42.7 4.5 70.8 60.3 5.0 45.1 37.2 5.4 27.9 22.2 6.7 38.5 31.7 5.6 33.1 26.2 5.4 6.2 6.2 5.9 5.8 6.3 5.9 5.4 5.7 5.6 8.1 6.5 6.4 14.6 13.1 5.7 9.7 7.6 84.3 51.1 92.7 83.7 58.5 96.1 84.1 44.5 91.5

5.4 5.5 6.6 5.4 5.1 6.8 5.1 4.2 5.8 5.1 100.0 100.0 5.9 100.0 100.0 5.0 100.0 100.0 4.4 94.8 89.0 4.8 96.6 92.3 4.1 96.3 91.4 4.7 99.8 99.6 6.2 99.9 99.7 4.8 99.9 99.7 99.7 6.9 99.3 99.8 8.7 99.2 99.8 8.5 99.2

5.6 100.0 99.8 3.7 100.0 100.0 5.8 99.8 99.4 4.8 5.6 5.4 4.9 5.6 5.8 4.5 5.2 5.1 5.1 59.5 49.7 4.3 64.8 56.3 4.8 56.7 46.2 4.7 26.2 21.9 4.5 21.7 18.6 4.6 23.6 18.9 98.1 100.0 100.0 96.6 100.0 100.0 98.0 99.9 100.0

5.7 80.8 73.4 4.4 94.1 89.4 5.6 78.0 69.0 4.5 50.6 40.9 6.0 64.7 55.9 4.7 56.1 45.6 5.0 5.3 5.0 5.0 5.7 5.6 4.6 5.1 5.3 4.8 13.1 10.7 5.5 25.8 19.3 4.8 15.5 11.8 98.7 80.4 99.7 98.8 86.5 99.9 98.4 77.3 99.7

V E VE V E VE V E VE V E VE V E VE V E VE V E VE V E VE V E VE V E VE V E VE V E VE V E VE V E VE V E VE

T = 2000

N(0,1)

Lg

t(12)

N(0,1)

Lg

t(12)

4.8 5.9 9.4 4.7 6.4 11.6 4.5 5.6 9.4 41.6 52.0 41.5 44.4 60.2 52.9 39.5 51.4 41.2 23.2 23.2 18.8 23.9 29.8 24.5 21.9 22.9 18.7 35.5 38.0 30.2 38.1 48.1 41.3 33.0 36.6 29.5 24.3 91.4 91.7 22.7 92.0 92.7 22.5 90.8 91.1

51.6 72.2 86.6 10.3 75.1 88.2 54.9 72.0 87.0 5.6 5.5 8.9 6.3 6.1 10.5 5.3 5.4 8.4 11.0 15.6 26.3 6.4 17.2 27.6 12.5 16.2 26.0 6.4 7.7 13.1 5.7 8.2 14.5 6.0 7.4 12.5 15.9 99.9 100.0 9.0 99.9 100.0 17.5 99.8 100.0

23.9 36.3 58.2 9.8 38.0 59.0 26.4 35.8 57.3 12.4 11.1 8.1 7.8 14.1 13.4 10.7 9.9 8.2 6.2 5.0 9.1 6.2 6.0 11.0 5.3 4.8 8.5 9.2 7.5 7.1 7.1 9.1 11.5 8.1 6.5 6.9 5.3 99.1 100.0 12.6 99.6 100.0 5.8 99.1 100.0

4.9 5.9 7.4 4.9 5.7 9.0 4.7 5.9 7.7 69.7 86.9 81.2 71.9 90.4 86.1 68.4 87.2 80.7 41.0 48.2 40.1 41.0 54.1 45.5 40.3 47.5 39.4 61.6 73.7 64.6 64.2 80.2 73.6 61.2 73.3 64.1 43.1 99.8 99.7 40.8 99.8 99.7 40.9 99.8 99.7

84.4 95.8 99.2 12.1 96.7 99.4 85.9 95.6 99.2 5.3 5.5 7.8 5.3 5.2 8.7 5.0 5.0 7.8 17.3 24.5 27.1 5.2 26.6 29.1 18.8 24.2 27.1 6.9 9.1 11.9 5.0 8.8 12.2 7.3 8.4 11.8 27.4 100.0 100.0 17.6 100.0 100.0 29.6 100.0 100.0

43.7 62.1 80.0 12.6 63.7 80.3 46.1 62.3 80.2 16.0 15.7 11.8 8.4 21.4 18.5 15.5 15.1 11.7 5.6 5.7 8.3 5.4 6.1 8.9 5.2 5.8 7.9 11.0 8.2 7.0 7.9 10.8 10.6 9.8 7.4 7.1 5.3 100.0 100.0 22.7 100.0 100.0 5.5 100.0 100.0


Note: the entries are rejection frequencies in percentages, and the rejection frequencies in boldface are the empirical sizes. The notations S, K, SK, I, R, IR, V, E, and VE stand, respectively, for ψt = ψSt, ψKt, (ψSt, ψKt)⊺, ψIt, ψRt, (ψIt, ψRt)⊺, ψVt, ψEt, and (ψVt, ψEt)⊺.


W

S K SK S K SK S K SK S K SK S K SK S K SK S K SK S K SK S K SK S K SK S K SK S K SK S K SK S K SK S K SK

T = 2000


Table 4 The empirical sizes and powers of the S, K, and SK tests based on Eq. (44). AR(1) model

GARCH(1,1) model

T = 1000 vot N(0,1)

Lg

t(12)

t(9)

Ln(0.1)

S K SK S K SK S K SK S K SK S K SK

T = 2000

T = 1000

T = 2000

N(0,1)

Lg

t(12)

N(0,1)

Lg

t(12)

N(0,1)

Lg

t(12)

N(0,1)

Lg

t(12)

4.8 3.6 4.5 29.4 99.6 99.2 20.4 90.2 87.3 31.6 98.0 96.6 95.3 23.3 91.8

0.0 83.9 29.4 5.2 3.3 4.9 2.6 3.3 1.4 6.3 5.4 6.3 46.4 53.7 89.3

0.0 2.0 0.1 8.4 9.8 10.8 5.3 2.4 4.4 8.7 10.9 13.5 66.0 0.4 58.2

5.6 4.6 5.7 29.4 100.0 100.0 25.2 99.5 98.9 33.8 99.9 99.9 100.0 34.2 100.0

0.0 100.0 99.3 5.5 4.9 4.6 2.4 17.8 5.1 7.5 7.3 8.1 93.1 95.2 100.0

0.3 61.1 12.6 8.4 16.5 16.9 4.1 3.0 4.0 9.8 19.9 18.6 97.4 19.9 100.0

4.7 3.8 4.6 28.9 99.8 99.6 20.5 88.0 83.4 29.1 98.0 96.1 94.4 21.5 91.2

0.0 80.3 23.7 4.6 3.1 3.4 1.8 3.6 1.5 6.5 6.0 7.0 43.0 50.7 86.3

0.0 1.7 0.0 7.7 10.2 11.3 3.4 3.0 3.8 9.8 10.4 12.2 64.8 0.7 55.0

4.5 4.5 4.6 29.6 100.0 100.0 24.8 98.4 97.7 35.1 100.0 99.9 99.9 33.5 99.8

0.0 99.9 99.0 5.2 2.7 3.9 2.6 16.2 6.8 7.3 7.9 7.8 93.6 96.0 100.0

0.0 61.1 11.2 9.2 13.9 15.2 4.5 3.6 4.2 11.3 15.9 16.6 98.3 19.5 99.7

Note: the entries are rejection frequencies in percentages, and the rejection frequencies in boldface are the empirical sizes. The notations S, K, and SK stand, respectively, for ψt = ψSt, ψKt, and (ψSt, ψKt)⊺.

reasonably increase with T and the degrees of higher-order dynamics (as shown by the cases of Ln(b2t) and Ln(b3t)). Meanwhile, they have very similar values under both models. This is consistent with the model-invariant feature of the C test. Generally speaking, this simulation shows that, compared to the M test, the C test has comparable (or even better) finite-sample performance in testing the distribution, symmetry, and independence hypotheses. Compared to the W test, the C test also has comparable (or at least reasonable) performance in testing distribution hypotheses. Unlike the W test, the C test is also applicable to testing symmetry and independence. The C test is even easier to use than the M and W tests in these experiments. This demonstrates that the C test can serve as a useful alternative to these two existing tests for the testing problems of our interest.

6. Empirical example

In this section, we further show the applicability of our approach to real data by extending an existing empirical example. Specifically, Harvey et al. (1994, HRS) and Kim et al. (1998, KSC) considered a set of daily exchange rate returns: British Pound (BP), Deutschmark (DM), Japanese Yen (JY), and Swiss Franc (SF) per U.S. Dollar, sampled from 1 October 1981 to 28 June 1985. On the basis of this data set, Bontemps and Meddahi (2005, forthcoming) evaluated the normality hypothesis and the

Table 5 The empirical sizes and powers of the symmetry tests. The M test

The C test

AR(1) vot N(0,1)

Lg

t(12)

Ln(0.05)

Ln(0.1)

S I SI S I SI S I SI S I SI S I SI

GARCH(1,1)

AR(1)

GARCH(1,1)

T = 1000

T = 2000

T = 1000

T = 2000

T = 1000

T = 2000

T = 1000

T = 2000

4.9 5.3 4.4 4.6 6.3 4.2 4.5 6.1 4.7 44.3 31.6 33.7 96.1 86.3 92.3

5.1 4.5 5.1 4.5 5.0 5.0 4.1 5.1 4.8 75.2 59.5 65.9 100.0 99.4 99.9

4.6 5.2 4.7 5.1 5.4 5.0 5.1 5.4 4.8 41.9 31.0 31.2 95.7 82.4 91.2

5.1 5.2 5.1 4.3 4.3 3.9 4.6 4.5 3.8 75.5 57.1 65.2 100.0 98.2 99.8

4.9 5.3 4.4 4.6 6.3 4.2 4.5 6.1 4.7 44.3 31.6 33.7 96.1 86.3 92.3

5.1 4.5 5.1 4.5 5.0 5.0 4.1 5.1 4.8 75.2 59.5 65.9 100.0 99.4 99.9

4.5 4.8 5.1 4.8 5.5 4.3 4.8 6.0 4.2 43.5 31.5 31.2 95.5 85.9 91.5

5.0 5.6 4.5 3.6 4.7 4.8 3.9 4.7 4.7 76.3 59.7 65.7 100.0 98.9 100.0

Note: the entries are rejection frequencies in percentages, and the rejection frequencies in boldface are the empirical sizes. The notations S, I, and SI stand, respectively, for φt = φSt, φIt, and (φSt, φIt)⊺.


Table 6 The empirical sizes and powers of the independence tests. The M test

The C test

AR(1) vot N(0,1)

Lg

t(b1t)

Ln(b2t)

Ln(b3t)

S K SK I R IR S K SK I R IR S K SK I R IR S K SK I R IR S K SK I R IR

GARCH(1,1)

AR(1)

GARCH(1,1)

T = 1000

T = 2000

T = 1000

T = 2000

T = 1000

T = 2000

T = 1000

T = 2000

4.2 4.1 2.6 5.1 4.5 4.9 5.1 2.6 3.6 4.7 4.1 5.3 3.2 5.3 4.1 4.9 24.0 18.7 24.9 3.0 19.1 16.4 4.8 12.5 43.7 3.3 36.8 28.1 5.1 22.3

4.4 4.9 3.3 4.0 4.0 3.7 4.6 3.7 3.9 4.6 5.6 4.9 3.8 9.0 4.9 4.5 45.8 37.1 48.4 3.9 42.0 30.7 4.3 23.8 75.6 4.7 73.2 54.2 4.3 44.6

6.0 3.2 3.7 5.2 5.4 4.4 4.5 3.9 3.8 3.7 5.4 5.1 3.6 4.2 2.4 4.0 25.7 19.9 15.3 3.4 8.2 11.4 5.1 9.5 25.3 3.7 16.7 18.0 5.2 13.8

4.8 5.2 3.8 4.3 4.2 4.0 3.6 2.6 3.3 4.3 4.7 4.6 3.7 9.8 4.9 3.4 46.4 36.8 28.3 4.5 19.7 17.7 4.9 13.2 48.7 5.1 39.3 29.8 5.5 22.4

3.6 4.9 3.2 4.6 4.2 3.8 4.4 3.3 3.5 5.4 3.9 5.0 4.3 29.9 21.8 5.3 38.0 25.5 34.6 5.0 25.7 27.8 5.2 21.0 59.9 6.8 47.3 47.7 6.1 37.0

3.9 4.7 2.5 3.9 3.5 3.7 4.5 4.1 3.5 4.5 5.4 4.8 4.0 48.7 39.0 5.1 67.4 54.5 65.0 7.5 53.7 51.7 6.0 40.1 89.5 12.6 84.5 80.3 6.9 70.0

4.4 4.5 4.4 5.0 5.3 5.4 4.3 4.2 3.7 5.0 3.8 4.7 4.6 31.4 20.7 5.7 38.8 28.5 36.3 4.4 24.1 27.4 5.2 20.7 61.4 5.9 48.3 49.2 5.4 37.4

4.3 4.3 3.4 4.7 4.5 4.6 4.5 5.2 3.3 4.9 5.4 5.5 3.2 49.7 39.7 5.1 68.2 56.0 64.7 7.2 52.0 51.8 6.0 41.7 88.1 13.0 83.3 79.0 7.3 69.2

Note: the entries are rejection frequencies in percentages, and the rejection frequencies in boldface are the empirical sizes. The notations S, K, SK, I, R, and IR stand, respectively, for φt = φSt, φKt, (φSt, φKt)⊺, φIt, φRt, and (φIt, φRt)⊺.

standardized t distribution hypothesis for the standardized errors of the GARCH(1,1) model in their 2005 and forthcoming papers, respectively. (In their studies, DM is replaced or denoted by the French Franc, while they noted that the data are those used in HRS and KSC.) The normality hypothesis is significantly rejected (at the 5% level) for all the exchange rates. By contrast, the standardized t distribution hypothesis cannot be significantly rejected for all but SF. In this example, we apply our approach to exploring such conditional non-normality in a more complete way. For comparison, we also base this example on the daily returns of BP, DM, JY, and SF. However, we use the exchange rate data released by the Federal Reserve Bank of New York because the data used in the aforementioned studies are not publicly available. Let Pt be an exchange rate on date t, and denote the daily return by yt = 100 × (ln(Pt) − ln(Pt−1)). The sample size is T = 937. Using the (first-five-sample-autocorrelations-based) Ljung–Box test, we observe that {yt} tends to be serially uncorrelated, and {yt²} tends to be highly serially correlated, for all the return sequences. (These test statistics are not reported for brevity.) The GARCH(1,1) model is a well-known baseline model for financial time series with these dynamic features. Choosing this model is also important for comparing our empirical results with those of Bontemps and Meddahi (2005, forthcoming). In this example, we allow the GARCH(1,1) model to include a location parameter because our exchange rate returns show non-zero means. In addition, we also consider the GJR-GARCH(1,1) model and the EGARCH(1,1) model. The Gaussian QMLEs of these GARCH-type models are shown in Table 7. As explained in the footnote of this table, these two alternative specifications do not outperform the GARCH(1,1) model in this example. Thus, our discussions are focused on the GARCH(1,1) model. Given the estimated GARCH(1,1) models, we apply the S, K, and SK tests, the I, R, and IR tests, and the V, E, and VE tests to evaluating the four standardized error distribution hypotheses: N(0,1), Lg, t(12), and t(β), where β stands for the unknown degrees of freedom. In addition, we also apply the (extended) SK and IR tests and their individual tests to examining the independence hypothesis, and use the (extended) V, E, and VE tests to evaluating the independence hypothesis and the normality hypothesis simultaneously. These tests are all based on the xt's in Eqs. (59) and (60) with m = 5. In this example, we estimate βo under the hypothesis of t(β) using the moment estimator β̂T = 4 + 6/(m̂4T − 3), where m̂4T is the sample fourth moment of the standardized residuals. The resulting β̂T is 15.835 for BP, 22.199 for DM, 28.769 for SF, and 7.062 for JY. This suggests that, under the hypothesis of t(β), the eighth moment λ8 may not exist for JY. Thus, the K and SK test statistics may not be asymptotically valid. For this reason, they are not reported in this example. Other C test statistics are shown in Table 8. The main results of this table are summarized as follows. We also find evidence against the normality hypothesis for the standardized errors of the GARCH(1,1) model. Given the 5% significance level, this hypothesis is rejected by the K, SK, R, and IR tests for BP, by the S, K, SK, R, and E tests for DM, by the R, IR, E, and


Table 7 The Gaussian QMLEs for GARCH-type models. BP

α0 α1 α2 α3

DM

SF

JY

(I)

(II)

(III)

(I)

(II)

(III)

(I)

(II)

(III)

(I)

(II)

(III)

0.064 (0.020) 0.010 (0.005) 0.909 (0.026) 0.070 (0.023)

0.066 (0.020) 0.010 (0.005) 0.912 (0.025) 0.074 (0.035) − 0.010 (0.041)

0.063 (0.020) − 0.128 (0.032) 0.983 (0.009) 0.146 (0.041) 0.007 (0.031)

0.048 (0.019) 0.018 (0.006) 0.866 (0.028) 0.093 (0.024)

0.050 (0.019) 0.016 (0.060) 0.871 (0.025) 0.102 (0.041) − 0.021 (0.045)

0.046 (0.019) − 0.176 (0.038) 0.964 (0.013) 0.181 (0.044) 0.013 (0.027)

0.052 (0.022) 0.016 (0.007) 0.900 (0.022) 0.068 (0.018)

0.047 (0.022) 0.019 (0.007) 0.893 (0.024) 0.055 (0.035) 0.028 (0.041)

0.048 (0.022) − 0.140 (0.033) 0.966 (0.013) 0.146 (0.040) − 0.023 (0.028)

0.011 (0.017) 0.006 (0.004) 0.939 (0.023) 0.041 (0.020)

0.014 (0.017) 0.004 (0.003) 0.950 (0.019) 0.047 (0.023) − 0.024 (0.022)

0.010 (0.016) − 0.051 (0.023) 0.991 (0.004) 0.050 (0.030) 0.015 (0.017)

α4

Note: the GARCH-type models shown in this table are of the specification yt = α0 + ut, where ut = vt ht^{1/2}, with (I) the GARCH(1,1) model in Eq. (33), (II) the GJR-GARCH(1,1) model: ht = α1 + α2 ht−1 + α3 u²_{t−1} + α4 I(ut−1 < 0) u²_{t−1}, and (III) the EGARCH(1,1) model: ht = exp(α1 + α2 ln ht−1 + α3 |vt−1| + α4 vt−1). The entries in the parentheses are the Bollerslev and Wooldridge (1992) asymptotic standard deviation estimates of the Gaussian QMLEs. We obtain these QMLEs and estimates using EViews 6. Note that the GARCH(1,1) model is a special case of the GJR-GARCH(1,1) model where α4 = 0. The t statistic for this parameter restriction is insignificant at the 5% level for all the exchange rate returns. This shows that the GJR-GARCH(1,1) model can be reduced to the GARCH(1,1) model in this empirical example. The GARCH(1,1) model and the EGARCH(1,1) model are non-nested. Nonetheless, the parameter restriction α4 = 0 of the EGARCH(1,1) model is also insignificant at the 5% level for all the returns. This means that the EGARCH(1,1) model can also be reduced to a symmetric volatility model, like the GARCH(1,1) model, in this empirical example.
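For readers who wish to reproduce the volatility paths, the following sketch (not part of the original paper) implements the three conditional-variance recursions (I)-(III) listed in the note above; the initialization h0 is our assumption, since the note does not specify how the recursions are started.

# A minimal sketch of the volatility recursions (I)-(III) in the note to Table 7.
# u is the array of residuals u_t = y_t - alpha0; h0 (the initial variance) is an
# assumption here -- the paper does not state how the recursion is initialized.
import numpy as np

def garch11_h(u, a1, a2, a3, h0):
    # (I) h_t = a1 + a2*h_{t-1} + a3*u_{t-1}^2
    h = np.empty(len(u)); h[0] = h0
    for t in range(1, len(u)):
        h[t] = a1 + a2 * h[t - 1] + a3 * u[t - 1] ** 2
    return h

def gjr_garch11_h(u, a1, a2, a3, a4, h0):
    # (II) h_t = a1 + a2*h_{t-1} + a3*u_{t-1}^2 + a4*I(u_{t-1} < 0)*u_{t-1}^2
    h = np.empty(len(u)); h[0] = h0
    for t in range(1, len(u)):
        h[t] = a1 + a2 * h[t - 1] + (a3 + a4 * (u[t - 1] < 0)) * u[t - 1] ** 2
    return h

def egarch11_h(u, a1, a2, a3, a4, h0):
    # (III) h_t = exp(a1 + a2*ln h_{t-1} + a3*|v_{t-1}| + a4*v_{t-1}), with v_t = u_t/sqrt(h_t)
    h = np.empty(len(u)); h[0] = h0
    for t in range(1, len(u)):
        v_prev = u[t - 1] / np.sqrt(h[t - 1])
        h[t] = np.exp(a1 + a2 * np.log(h[t - 1]) + a3 * abs(v_prev) + a4 * v_prev)
    return h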

Table 8 The C test statistics for the standardized errors of the GARCH(1,1) model. BP

S K SK I R IR V E VE

S K SK I R IR V E VE

DM

N(0, 1)

Lg

t(12)

t(β)

I1

I2

N(0, 1)

Lg

t(12)

t(β)

I1

I2

0.095 (0.757) 10.392 (0.001) 10.487 (0.005) 0.038 (0.846) 6.125 (0.013) 6.275 (0.043) 0.251 (0.616) 1.344 (0.511) 1.369 (0.713)

0.024 (0.876) 1.468 (0.226) 1.492 (0.474) 0.150 (0.699) 0.502 (0.479) 0.651 (0.722) 0.016 (0.900) 0.277 (0.871) 3.167 (0.367)

0.032 (0.857) 0.182 (0.669) 0.215 (0.898) 0.127 (0.721) 0.486 (0.486) 0.600 (0.741) 0.358 (0.550) 0.236 (0.889) 2.381 (0.497)

0.053 (0.818) 0.002 (0.967) 0.056 (0.972) 0.112 (0.738) 1.079 (0.299) 1.174 (0.556) 0.334 (0.564) 0.323 (0.851) 1.377 (0.711)

11.489 (0.043) 1.819 (0.874) 15.760 (0.107) 18.398 (0.002) 5.129 (0.400) 25.885 (0.004) 12.949 (0.024) 20.589 (0.024) 35.504 (0.002)

5.252 (0.386) 5.666 (0.340) 9.939 (0.446) 7.405 (0.192) 3.719 (0.591) 9.731 (0.464) 4.555 (0.473) 7.710 (0.657) 22.399 (0.098)

4.079 (0.043) 4.665 (0.031) 8.744 (0.013) 1.754 (0.185) 4.690 (0.030) 5.526 (0.063) 1.182 (0.277) 6.310 (0.043) 6.335 (0.096)

1.034 (0.309) 2.281 (0.131) 3.315 (0.191) 0.853 (0.356) 2.785 (0.095) 3.899 (0.142) 1.496 (0.221) 2.244 (0.326) 2.398 (0.494)

1.379 (0.240) 0.545 (0.460) 1.924 (0.382) 0.972 (0.324) 0.001 (0.975) 0.986 (0.611) 0.034 (0.853) 1.928 (0.381) 1.985 (0.576)

3.140 (0.076) 0.011 (0.916) 3.624 (0.163) 1.248 (0.264) 0.683 (0.409) 1.726 (0.422) 0.516 (0.472) 3.550 (0.169) 3.594 (0.309)

2.525 (0.773) 2.349 (0.799) 4.058 (0.945) 3.469 (0.628) 1.409 (0.923) 5.029 (0.889) 11.164 (0.048) 11.852 (0.295) 19.808 (0.179)

3.083 (0.687) 9.036 (0.108) 14.827 (0.139) 6.420 (0.267) 6.483 (0.262) 13.734 (0.185) 5.225 (0.389) 8.668 (0.564) 18.515 (0.237)

SF 2.825 (0.093) 2.663 (0.103) 5.488 (0.064) 3.321 (0.068) 4.999 (0.025) 8.458 (0.015) 0.013 (0.909) 7.868 (0.020) 11.581 (0.009)

0.717 (0.397) 2.751 (0.097) 3.468 (0.177) 2.585 (0.108) 1.321 (0.250) 3.990 (0.136) 6.587 (0.010) 9.454 (0.009) 15.601 (0.001)

0.955 (0.328) 0.797 (0.372) 1.752 (0.416) 3.048 (0.081) 0.211 (0.646) 3.214 (0.200) 1.427 (0.232) 5.756 (0.056) 11.919 (0.008)

2.503 (0.114) 0.017 (0.897) 2.644 (0.267) 3.320 (0.068) 1.878 (0.171) 5.134 (0.077) 0.209 (0.647) 6.162 (0.046) 10.155 (0.017)

2.986 (0.702) 4.985 (0.418) 9.924 (0.447) 8.619 (0.125) 5.235 (0.388) 14.155 (0.166) 3.974 (0.553) 7.065 (0.719) 14.099 (0.518)

13.029 (0.023) 8.399 (0.136) 24.490 (0.006) 8.949 (0.111) 6.607 (0.252) 15.991 (0.100) 5.822 (0.324) 22.084 (0.015) 24.209 (0.062)

JY 32.982 (0.000) 152.734 (0.000) 185.716 (0.000) 4.705 (0.030) 13.093 (0.000) 13.343 (0.001) 2.457 (0.117) 10.506 (0.005) 13.306 (0.004)

8.365 (0.004) 1.964 (0.161) 10.329 (0.006) 2.272 (0.132) 0.630 (0.427) 2.693 (0.260) 0.073 (0.787) 6.364 (0.041) 6.388 (0.094)

11.149 (0.001) 5.181 (0.023) 16.330 (0.000) 2.540 (0.111) 3.602 (0.058) 5.756 (0.056) 0.504 (0.478) 8.955 (0.011) 8.994 (0.029)

3.493 (0.062) · · · · 1.735 (0.188) 0.788 (0.375) 2.480 (0.289) 0.001 (0.971) 6.912 (0.032) 7.059 (0.070)

0.389 (0.996) 4.392 (0.494) 6.513 (0.770) 3.351 (0.646) 3.757 (0.585) 7.116 (0.714) 2.653 (0.753) 5.725 (0.838) 14.093 (0.518)

3.255 (0.661) 3.530 (0.619) 6.675 (0.756) 2.018 (0.847) 2.747 (0.739) 5.267 (0.873) 4.743 (0.448) 8.558 (0.575) 9.736 (0.836)

Note: the entries in boldface represent the test statistics that are significant at the 5% level, and the entries in the parentheses are the p-values of the C test statistics. These C test statistics are computed as TRc² statistics with the associated ψt's. For the V, E, and VE tests, we set τ = 0.1 as in the simulation. The C test statistics in Columns “I1” and “I2” examine the independence hypothesis using the xt's in Eqs. (59) and (60), respectively, with m = 5.


VE tests for SF, and by all but the V test for JY. As shown by Table 8, the evidence of non-normality for JY is much stronger than that for the other exchange rates. Bontemps and Meddahi (2005) also observed a similar result using their normality tests. Using our tests, we can obtain more information regarding this conditional non-normality by examining other distribution hypotheses and the independence hypothesis. Specifically, our tests show that the heavily-tailed distributions Lg, t(12), and t(β) outperform N(0,1) in modeling the standardized errors of the GARCH(1,1) model for these exchange rate returns. In particular, they are unable to reject these non-normal distribution hypotheses (at the 5% level) for both BP and DM. This implies that the conditional non-normality for BP and DM may be related to the heavy tails of the standardized errors ignored by N(0,1). Nonetheless, these non-normal distribution hypotheses are still rejected by the VE test for SF and by the E test for JY. This means that these non-normal distributions remain unable to explain the left-tail misspecifications of N(0,1) for SF and JY. The S and SK tests also reject the hypotheses of Lg and t(12) for JY, and provide some evidence of asymmetry. In addition, the independence hypothesis is rejected by the S, I, IR, V, E, and VE tests with the xt in Eq. (59) for BP and by the S, SK, and E tests with the xt in Eq. (60) for SF. Put differently, the GARCH(1,1) model may not fully explain the dynamic structures of the return sequences for BP and SF. In particular, the significant S test statistic (the significant E test statistic) indicates that this model is unable to explain the conditional skewness (the conditional expected shortfall) for these two exchange rate returns. In comparison, the independence hypothesis cannot be rejected by the C tests for DM and JY, with the exception that the V test statistic with the xt in Eq. (59) is marginally significant at the 5% level for DM.

7. Conclusion

A conditional distribution model is often established by adding a distribution hypothesis and the independence hypothesis to the standardized errors of a conditional mean-and-variance model. This context encompasses a variety of fully specified GARCH-type models. By utilizing the first and second moments of the standardized residuals, we propose a simple approach to generating standardized-residuals-based higher-moment tests for this class of models. This approach is not only asymptotically valid but also robust to T^{1/2}-consistent estimators. More attractively, it is applicable to hypotheses that do not include a complete distribution specification, and the resulting tests have a simple invariant form for various conditional mean-and-variance models. Using this approach, we further propose a set of skewness–kurtosis tests, a set of characteristic-function-based moment tests, and a set of VaR tests for checking the standardized error distribution hypothesis from various aspects. We also extend these higher-moment tests to checking the independence hypothesis against higher-order dynamic structures of the standardized errors. A Monte Carlo simulation shows the validity of our approach, and demonstrates the usage of various higher-moment tests. We also provide an empirical example to show the usefulness of our tests in exploring the conditional non-normality for a set of exchange rate returns.

Acknowledgements

The author gratefully acknowledges the Editor and two anonymous referees for their valuable comments and suggestions.
The comments from participants of the 2006 International Symposium on Econometric Theory and Applications (SETA) in Xiamen, China, and the 2007 European Meeting of the Econometric Society in Budapest, Hungary, on earlier versions of this paper are gratefully acknowledged. The author also acknowledges the National Science Council of Taiwan for research support (NSC 942415-H-001-00).

Appendix A

A.1. Proof of Lemma 1

The asymptotic expansion in Eq. (5) with ζt = xtψt is of the form:

\frac{1}{\sqrt{T}}\sum_{t=1}^{T}\hat{x}_t\hat{\psi}_t=\frac{1}{\sqrt{T}}\sum_{t=1}^{T}x_{ot}\psi_{ot}+\mathrm{E}[\nabla_{\theta^{\top}}(x_t\psi_t)]\big|_{\theta=\theta_o}\,\sqrt{T}\big(\hat{\theta}_T-\theta_o\big)+o_p(1). \qquad (A1)

By construction, we can write that

\mathrm{E}[\nabla_{\theta^{\top}}(x_t\psi_t)]\big|_{\theta=\theta_o}\,\sqrt{T}\big(\hat{\theta}_T-\theta_o\big)=\mathrm{E}[\nabla_{\alpha^{\top}}(x_t\psi_t)]\big|_{\theta=\theta_o}\,\sqrt{T}\big(\hat{\alpha}_T-\alpha_o\big)+\mathrm{E}[\nabla_{\beta^{\top}}(x_t\psi_t)]\big|_{\theta=\theta_o}\,\sqrt{T}\big(\hat{\beta}_T-\beta_o\big). \qquad (A2)

Let A be a p × q matrix, and b be a q × 1 vector. Note that the p × 1 vector Ab can be expressed as Ab = (b⊺ ⊗ Ip)vec(A); see, e.g., Magnus and Neudecker (1988, p. 31). By using this expression (with A = xt and b = ψt) and the notations ψαt(θ) := ∇α⊺ψt and ψα,ot := ψαt(θo), we can write that

\mathrm{E}[\nabla_{\alpha^{\top}}(x_t\psi_t)]\big|_{\theta=\theta_o}=\mathrm{E}[x_{ot}\psi_{\alpha,ot}]+\mathrm{E}\big[(\psi_{ot}^{\top}\otimes I_p)\nabla_{\alpha^{\top}}\mathrm{vec}(x_{ot})\big] \qquad (A3)


and

\mathrm{E}[\nabla_{\beta^{\top}}(x_t\psi_t)]\big|_{\theta=\theta_o}=\mathrm{E}[x_{ot}\psi_{\beta,ot}]+\mathrm{E}\big[(\psi_{ot}^{\top}\otimes I_p)\nabla_{\beta^{\top}}\mathrm{vec}(x_{ot})\big]. \qquad (A4)

Using the relationship ψαt = ψvt∇α⊺vt = −(ψvt wt⊺ + ½(ψvt vt)zt⊺), we also have

\mathrm{E}[x_{ot}\psi_{\alpha,ot}]=-\Big(\mathrm{E}[x_{ot}\psi_{v,ot}w_{ot}^{\top}]+\frac{1}{2}\mathrm{E}[x_{ot}(\psi_{v,ot}v_{ot})z_{ot}^{\top}]\Big)
=-\Big(\mathrm{E}\big[(\psi_{v,ot}^{\top}\otimes I_p)\mathrm{vec}(x_{ot})w_{ot}^{\top}\big]+\frac{1}{2}\mathrm{E}\big[((\psi_{v,ot}v_{ot})^{\top}\otimes I_p)\mathrm{vec}(x_{ot})z_{ot}^{\top}\big]\Big),

in which the second equality is again due to the expression Ab = (b⊺ ⊗ Ip)vec(A). By the law of iterated expectations, we can further write that

\mathrm{E}\big[(\psi_{v,ot}^{\top}\otimes I_p)\mathrm{vec}(x_{ot})w_{ot}^{\top}\big]=\mathrm{E}\big[(\mathrm{E}[\psi_{v,ot}\,|\,\mathcal{X}_{t-1}]^{\top}\otimes I_p)\mathrm{vec}(x_{ot})w_{ot}^{\top}\big]=\big(\mathrm{E}[\psi_{v,ot}]^{\top}\otimes I_p\big)\mathrm{E}\big[\mathrm{vec}(x_{ot})w_{ot}^{\top}\big],

where the second equality is due to the independence hypothesis encompassed by Assumption 1, which implies E[ψv,ot | Xt−1] = E[ψv,ot]. Similarly, we can apply the law of iterated expectations to show that, under Assumption 1,

\mathrm{E}\big[((\psi_{v,ot}v_{ot})^{\top}\otimes I_p)\mathrm{vec}(x_{ot})z_{ot}^{\top}\big]=\big(\mathrm{E}[\psi_{v,ot}v_{ot}]^{\top}\otimes I_p\big)\mathrm{E}\big[\mathrm{vec}(x_{ot})z_{ot}^{\top}\big],
\mathrm{E}\big[(\psi_{ot}^{\top}\otimes I_p)\nabla_{\alpha^{\top}}\mathrm{vec}(x_{ot})\big]=\mathrm{E}\big[(\mathrm{E}[\psi_{ot}\,|\,\mathcal{X}_{t-1}]^{\top}\otimes I_p)\nabla_{\alpha^{\top}}\mathrm{vec}(x_{ot})\big]=0,
\mathrm{E}\big[(\psi_{ot}^{\top}\otimes I_p)\nabla_{\beta^{\top}}\mathrm{vec}(x_{ot})\big]=\mathrm{E}\big[(\mathrm{E}[\psi_{ot}\,|\,\mathcal{X}_{t-1}]^{\top}\otimes I_p)\nabla_{\beta^{\top}}\mathrm{vec}(x_{ot})\big]=0,

and E[xotψβ,ot] = E[xot]E[ψβ,ot]. Lemma 1 is proved by first introducing these components into Eqs. (A3) and (A4) and then plugging the resulting expression of Eq. (A2) into (A1). □

A.2. Proof of Lemma 2

By the definition of ξ̂t in Eq. (6), we can write that

\frac{1}{\sqrt{T}}\sum_{t=1}^{T}\hat{x}_t\hat{\xi}_t=\frac{1}{\sqrt{T}}\sum_{t=1}^{T}\hat{x}_t\Big(\hat{\psi}_t-\Big[\frac{1}{T}\sum_{t=1}^{T}\hat{\psi}_t\hat{s}_t^{\top}\Big]\Big[\frac{1}{T}\sum_{t=1}^{T}\hat{s}_t\hat{s}_t^{\top}\Big]^{-1}\hat{s}_t\Big)
=\frac{1}{\sqrt{T}}\sum_{t=1}^{T}\hat{x}_t\Big(\hat{\psi}_t-\mathrm{E}[\psi_{ot}s_{ot}^{\top}]\,\mathrm{E}[s_{ot}s_{ot}^{\top}]^{-1}\hat{s}_t\Big)-\frac{1}{\sqrt{T}}\sum_{t=1}^{T}\hat{x}_t\hat{D}_T\hat{s}_t, \qquad (A5)

where

\hat{D}_T:=\Big[\frac{1}{T}\sum_{t=1}^{T}\hat{\psi}_t\hat{s}_t^{\top}\Big]\Big[\frac{1}{T}\sum_{t=1}^{T}\hat{s}_t\hat{s}_t^{\top}\Big]^{-1}-\mathrm{E}[\psi_{ot}s_{ot}^{\top}]\,\mathrm{E}[s_{ot}s_{ot}^{\top}]^{-1}.

Let A, B, and C be three matrices such that the matrix product ABC is defined. Note that x̂tD̂Tŝt = vec(x̂tD̂Tŝt) because x̂tD̂Tŝt is a p × 1 column vector. By using the expression vec(ABC) = (C⊺ ⊗ A)vec(B), see, e.g., Magnus and Neudecker (1988, p. 30), and setting A = x̂t, B = D̂T, and C = ŝt, we have

\frac{1}{\sqrt{T}}\sum_{t=1}^{T}\hat{x}_t\hat{D}_T\hat{s}_t=\Big(\frac{1}{\sqrt{T}}\sum_{t=1}^{T}\hat{s}_t^{\top}\otimes\hat{x}_t\Big)\mathrm{vec}(\hat{D}_T)=o_p(1),

in which the last equality is due to Assumption 3(ii) and Assumption 4(i). Accordingly, we can rewrite Eq. (A5) as

\frac{1}{\sqrt{T}}\sum_{t=1}^{T}\hat{x}_t\hat{\xi}_t=\frac{1}{\sqrt{T}}\sum_{t=1}^{T}\hat{x}_t\Big(\hat{\psi}_t-\mathrm{E}[\psi_{ot}s_{ot}^{\top}]\,\mathrm{E}[s_{ot}s_{ot}^{\top}]^{-1}\hat{s}_t\Big)+o_p(1). \qquad (A6)


By using the moment function ψt − E[ψot sot⊺]E[sot sot⊺]⁻¹st in place of the role of ψt in Lemma 1, it is easy to see that

\frac{1}{\sqrt{T}}\sum_{t=1}^{T}\hat{x}_t\Big(\hat{\psi}_t-\mathrm{E}[\psi_{ot}s_{ot}^{\top}]\,\mathrm{E}[s_{ot}s_{ot}^{\top}]^{-1}\hat{s}_t\Big)=\frac{1}{\sqrt{T}}\sum_{t=1}^{T}x_{ot}\xi_{ot}
+\mathrm{E}[x_{ot}]\Big(\mathrm{E}[\psi_{\beta,ot}]-\mathrm{E}[\psi_{ot}s_{ot}^{\top}]\,\mathrm{E}[s_{ot}s_{ot}^{\top}]^{-1}\mathrm{E}[s_{\beta,ot}]\Big)\sqrt{T}\big(\hat{\beta}_T-\beta_o\big)
-\Big[\mathrm{E}[\xi_{v,ot}]^{\top}\otimes I_p,\;\frac{1}{2}\,\mathrm{E}[\xi_{v,ot}v_{ot}]^{\top}\otimes I_p\Big]\begin{bmatrix}\mathrm{E}[\mathrm{vec}(x_{ot})w_{ot}^{\top}]\\ \mathrm{E}[\mathrm{vec}(x_{ot})z_{ot}^{\top}]\end{bmatrix}\sqrt{T}\big(\hat{\alpha}_T-\alpha_o\big)+o_p(1). \qquad (A7)

Using the generalized information matrix equality in Assumption 4(ii), we can write that E[ψβ,ot] + E[ψot sot⊺] = 0 and E[sβ,ot] + E[sot sot⊺] = 0, and hence

\mathrm{E}[\psi_{\beta,ot}]=\mathrm{E}[\psi_{ot}s_{ot}^{\top}]\,\mathrm{E}[s_{ot}s_{ot}^{\top}]^{-1}\mathrm{E}[s_{\beta,ot}].

Lemma 2 is proved by first introducing this result into Eq. (A7) and then using Eq. (A6). □



A.3. Proof of Lemma 3

Since the moment function ζt = (vec(xt)vt, vec(xt)(vt² − 1))⊺ is free of β, we can simplify θ as θ = α and express Eq. (5) with this ζt as:

\begin{bmatrix}\frac{1}{\sqrt{T}}\sum_{t=1}^{T}\mathrm{vec}(\hat{x}_t)\hat{v}_t\\ \frac{1}{\sqrt{T}}\sum_{t=1}^{T}\mathrm{vec}(\hat{x}_t)(\hat{v}_t^{2}-1)\end{bmatrix}=\begin{bmatrix}\frac{1}{\sqrt{T}}\sum_{t=1}^{T}\mathrm{vec}(x_{ot})v_{ot}\\ \frac{1}{\sqrt{T}}\sum_{t=1}^{T}\mathrm{vec}(x_{ot})(v_{ot}^{2}-1)\end{bmatrix}+\begin{bmatrix}\mathrm{E}[\nabla_{\alpha^{\top}}(\mathrm{vec}(x_t)v_t)]\\ \mathrm{E}[\nabla_{\alpha^{\top}}(\mathrm{vec}(x_t)(v_t^{2}-1))]\end{bmatrix}_{\alpha=\alpha_o}\sqrt{T}\big(\hat{\alpha}_T-\alpha_o\big)+o_p(1), \qquad (A8)

where

\nabla_{\alpha^{\top}}(\mathrm{vec}(x_t)v_t)=-\mathrm{vec}(x_t)\Big(w_t^{\top}+\frac{1}{2}v_tz_t^{\top}\Big)+(v_t\otimes I_{pq})\nabla_{\alpha^{\top}}\mathrm{vec}(x_t)

and

\nabla_{\alpha^{\top}}\big(\mathrm{vec}(x_t)(v_t^{2}-1)\big)=-2\,\mathrm{vec}(x_t)\Big(v_tw_t^{\top}+\frac{1}{2}v_t^{2}z_t^{\top}\Big)+\big((v_t^{2}-1)\otimes I_{pq}\big)\nabla_{\alpha^{\top}}\mathrm{vec}(x_t).

Recall that Assumption 1 implies E[vot | Xt−1] = 0 and E[vot² | Xt−1] = 1. Accordingly, we have E[∇α⊺(vec(xt)vt)]|α=αo = −E[vec(xot)wot⊺] and E[∇α⊺(vec(xt)(vt² − 1))]|α=αo = −E[vec(xot)zot⊺] by the law of iterated expectations. Lemma 3 is proved by introducing these results into Eq. (A8). □

A.4. Proof of Proposition 1

By introducing Eq. (11) into Eq. (8), we can obtain the result:

\frac{1}{\sqrt{T}}\sum_{t=1}^{T}\Big(\hat{x}_t\hat{\xi}_t-\big(\mathrm{E}[\xi_{v,ot}]^{\top}\otimes I_p\big)\mathrm{vec}(\hat{x}_t)\hat{v}_t-\frac{1}{2}\big(\mathrm{E}[\xi_{v,ot}v_{ot}]^{\top}\otimes I_p\big)\mathrm{vec}(\hat{x}_t)(\hat{v}_t^{2}-1)\Big)
=\frac{1}{\sqrt{T}}\sum_{t=1}^{T}\Big(x_{ot}\xi_{ot}-\big(\mathrm{E}[\xi_{v,ot}]^{\top}\otimes I_p\big)\mathrm{vec}(x_{ot})v_{ot}-\frac{1}{2}\big(\mathrm{E}[\xi_{v,ot}v_{ot}]^{\top}\otimes I_p\big)\mathrm{vec}(x_{ot})(v_{ot}^{2}-1)\Big)+o_p(1).

Using the expression Ab = (b⊺ ⊗ Ip)vec(A) mentioned in the Proof of Lemma 1, we have (E[ξv,ot]⊺ ⊗ Ip)vec(xot) = xotE[ξv,ot] and (E[ξv,ot vot]⊺ ⊗ Ip)vec(xot) = xotE[ξv,ot vot]. Accordingly, we can rewrite the above result as:

\frac{1}{\sqrt{T}}\sum_{t=1}^{T}\hat{x}_t\Big(\hat{\xi}_t-\mathrm{E}[\xi_{v,ot}]\hat{v}_t-\frac{1}{2}\mathrm{E}[\xi_{v,ot}v_{ot}](\hat{v}_t^{2}-1)\Big)=\frac{1}{\sqrt{T}}\sum_{t=1}^{T}\phi_{c,ot}+o_p(1). \qquad (A9)


Proposition 1 is derived by replacing the moments E[ξv,ot] and E[ξv,ot vot] on the left-hand side of Eq. (A9) by their consistent estimators shown in Assumption 4(iii), respectively. This replacement is asymptotically valid because of Assumption 3(iii) and Assumption 4(iii), for a reason similar to that underlying the derivation of (A6). □

A.5. Derivation of E[ψvt] and E[ψvt vt] for the V test

Given the choice of ψt = ψVt, we have

\psi_{vt}=\frac{\partial}{\partial v_t}\big(1-\mathrm{I}(\pi_t\ge V_\tau)\big)=-\delta(\pi_t-V_\tau)\,\frac{\partial\pi_t}{\partial v_t}.

By using the restriction Gn(πt) = G(vt, β), implied by Eqs. (54) and (55), we can show that

\frac{\partial\pi_t}{\partial v_t}=\frac{g\big(G^{-1}(G_n(\pi_t),\beta),\beta\big)}{g_n(\pi_t)}=\frac{g_t}{g_{nt}}.

Under Assumption 1, πt has the PDF gn(·), and hence

\mathrm{E}[\psi_{vt}]=\int_{\mathbb{R}}\psi_{vt}\big|_{\pi_t=\pi}\,g_n(\pi)\,d\pi=-\int_{\mathbb{R}}\delta(\pi-V_\tau)\,g\big(G^{-1}(G_n(\pi),\beta),\beta\big)\,d\pi.

By the sifting property of δ(·), we obtain E[ψvt] = −g(G⁻¹(Gn(Vτ), β), β) = −gτ, in which the last equality is due to the restriction τ = Gn(Vτ) and the definitions vτ := G⁻¹(τ, β) and gτ := g(vτ, β). Similarly, it is easy to see that

\mathrm{E}[\psi_{vt}v_t]=-\int_{\mathbb{R}}\delta(\pi-V_\tau)\,g\big(G^{-1}(G_n(\pi),\beta),\beta\big)\,G^{-1}(G_n(\pi),\beta)\,d\pi=-g\big(G^{-1}(G_n(V_\tau),\beta),\beta\big)\,G^{-1}(G_n(V_\tau),\beta)=-g_\tau v_\tau. \qquad \square

A.6. Derivation of E[ψvt] and E[ψvt vt] for the E test

Given the choice of ψt = ψEt, we have

\psi_{vt}=\frac{g_t}{g_{nt}}\Big(\mathrm{I}(\pi_t<V_\tau)-\pi_t\,\delta(\pi_t-V_\tau),\;\;2\pi_t\,\mathrm{I}(\pi_t<V_\tau)-(\pi_t^{2}-1)\,\delta(\pi_t-V_\tau)\Big)^{\top}.

By the sifting property of δ(·), we can write that

\mathrm{E}\Big[\frac{g_t}{g_{nt}}\,\pi_t\,\delta(\pi_t-V_\tau)\Big]=\int_{\mathbb{R}}\delta(\pi-V_\tau)\,\pi\,g\big(G^{-1}(G_n(\pi),\beta),\beta\big)\,d\pi=V_\tau\,g\big(G^{-1}(G_n(V_\tau),\beta),\beta\big)=V_\tau g_\tau,

\mathrm{E}\Big[\frac{g_t}{g_{nt}}\,(\pi_t^{2}-1)\,\delta(\pi_t-V_\tau)\Big]=\int_{\mathbb{R}}\delta(\pi-V_\tau)\,(\pi^{2}-1)\,g\big(G^{-1}(G_n(\pi),\beta),\beta\big)\,d\pi=(V_\tau^{2}-1)\,g\big(G^{-1}(G_n(V_\tau),\beta),\beta\big)=(V_\tau^{2}-1)\,g_\tau,

and hence obtain the E[ψvt] for ψt = ψEt in Table 1. The E[ψvt vt] for ψt = ψEt can be derived in a similar way. □

Appendix B. Supplementary data

Supplementary data to this article can be found online at http://dx.doi.org/10.1016/j.jempfin.2012.04.006.

References

Andrews, D.W.K., 1994. Empirical process methods in econometrics. In: Engle, R.F., McFadden, D.L. (Eds.), Handbook of Econometrics, vol. 4. Elsevier, Amsterdam, pp. 2247–2294.
Bai, J., 2003. Testing parametric conditional distributions of dynamic models. The Review of Economics and Statistics 85, 531–549.
Bai, J., Ng, S., 2001. A test for conditional symmetry in time series models. Journal of Econometrics 103, 225–258.
Bai, J., Ng, S., 2005. Tests for skewness, kurtosis, and normality for time series data. Journal of Business & Economic Statistics 23, 49–60.
Balakrishnan, N., Nevzorov, V.B., 2003. A Primer on Statistical Distributions. John Wiley, New York.
Bartlett, M.S., 1953. Approximate confidence intervals. Biometrika 40, 12–19.
Bera, A.K., Bilias, Y., 2001. Rao's score, Neyman's C(α), and Silvey's LM tests: an essay on historical developments and some new results. Journal of Statistical Planning and Inference 97, 9–44.
Berkowitz, J., 2001. Testing density forecasts, applications to risk management. Journal of Business & Economic Statistics 19, 465–474.
