Estimating a generalized correlation coefficient for a generalized bivariate probit model


Journal of Econometrics 141 (2007) 1100–1114 www.elsevier.com/locate/jeconom

Songnian Chen (a,*), Yahong Zhou (b)

(a) Department of Economics, Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong, PR China
(b) School of Economics, Shanghai University of Finance and Economics, Shanghai, PR China

Available online 2 March 2007

Abstract

In this paper we consider semiparametric estimation of a generalized correlation coefficient in a generalized bivariate probit model. The generalized correlation coefficient provides a simple summary statistic measuring the relationship between the two binary decision processes in a general framework. Our semiparametric estimation procedure consists of two steps, combining semiparametric estimators for univariate binary choice models with the method of maximum likelihood for the bivariate probit model with nonparametrically generated regressors. The estimator is shown to be consistent and asymptotically normal. The estimator performs well in our simulation study.

© 2007 Elsevier B.V. All rights reserved.

JEL classification: C31; C35

Keywords: Bivariate probit; Generalized correlation coefficient; Dependence measures

1. Introduction

Binary choice models have been widely used in empirical applications. The most popular binary choice models are the probit and logit models, in which the underlying error is assumed to follow the normal or the logistic distribution, respectively. Economic agents commonly make multiple decisions, which can be modelled by extending the univariate binary choice model. A natural extension commonly used in applied research is the bivariate probit model with possibly correlated disturbances. Beyond the estimation of each of the binary choice equations (jointly or separately), applied researchers are generally interested in the dependence structure of the two binary choice decision equations. In the context of the bivariate probit model, the correlation coefficient between the error disturbances in the latent regression equations provides a simple summary measure of how the unobservables interact with each other in the underlying equations. For example, applied researchers (e.g., Chiappori and Salanié, 2000) are very often interested in whether the two decisions are independent, which is implied by a zero correlation coefficient in the bivariate probit model.

* Corresponding author. Tel.: +852 2358 7602; fax: +852 2358 2084. E-mail addresses: [email protected] (S. Chen), [email protected] (Y. Zhou). 0304-4076/$ - see front matter © 2007 Elsevier B.V. All rights reserved. doi:10.1016/j.jeconom.2007.01.012

One major drawback of the bivariate probit model is that its complete parametric specification can be too restrictive in applied contexts. In general, the joint normality specification of the error disturbances and the linear structure of the underlying regression functions cannot be justified by economic theory. As is by now well known and well documented, misspecification of the error distribution and/or the regression structure may lead to inconsistent estimates and misleading inference. Thus an estimate of the correlation coefficient of a bivariate probit model might provide an inappropriate dependence measure in the event of model misspecification. Therefore, it would be most desirable to investigate the dependence structure of the two binary decisions in a more general setting. In this paper we consider the dependence structure through a generalized correlation coefficient in a generalized bivariate probit model. The most general estimation and testing procedures are based on fully nonparametric methods.
However, one major drawback of nonparametric procedures is the curse of dimensionality; namely, the finite-sample and asymptotic properties of the estimation and testing procedures deteriorate quickly as the dimension of the regressors increases, rendering them impractical in most typical empirical applications. Therefore, semiparametric models, which greatly relax the restrictive nature of parametric models while still allowing estimation and testing procedures whose effectiveness is comparable to that of parametric procedures, provide a good alternative to both tightly specified parametric models and general nonparametric models. In particular, we will consider a semiparametric generalized bivariate probit model that is much more general than the parametric bivariate probit model; yet a generalized correlation coefficient¹ in this model provides a simple summary statistic as a dependence measure of two binary decision problems in a very general framework.

¹ Our generalized correlation coefficient is essentially the tetrachoric correlation coefficient for a 2 × 2 contingency table introduced by Pearson (Kotz and Johnson, 1989), which is a popular dependence measure widely used by psychologists, educational psychologists and psychometricians. We are grateful to an anonymous referee for pointing out this connection.

The general error structure assumed in our framework for the bivariate binary choice model has its precedent in the literature on sample selection models with nonnormal distributions (Lee, 1982, 1983); in particular, Lee assumed a bivariate normal structure for the unobservable disturbance terms subject to some known monotone transformations in a binary choice sample selection framework. We adopt a similar structure in the context of a bivariate binary choice model, but allow the monotone transformations applied to the latent regression models to be completely unknown; thus our framework is much more general than that of Lee. In addition, adopting such monotone transformations is a natural way to generalize the normal error structure in the framework of (a system of) binary choice models, as these transformations applied to the latent regression models yield observationally equivalent binary dependent variables.

Our semiparametric estimation procedure consists of two steps. In the first step, we adopt existing semiparametric estimators for the regression coefficients of each binary choice equation; such estimators include Cosslett (1983), Han (1987), Horowitz and Härdle (1996), Ichimura (1993), Klein and Spady (1993), Manski (1985), Powell et al. (1989) and Sherman (1993, 1994), among others. In the second step, we construct generated regressors using a one-dimensional nonparametric kernel procedure based on the first-step estimates, and then propose a semiparametric maximum likelihood estimator for the generalized correlation coefficient using the nonparametrically generated regressors.

The paper is organized as follows. The next section describes the model and motivates the proposed estimator. Section 3 gives regularity conditions and investigates the large sample properties of the estimator. The estimator is shown to be consistent and asymptotically normal. Section 4 contains the results of a simulation study. The final section concludes.

2. The estimator

To fix ideas, consider the bivariate binary choice model

$$
\begin{cases}
d_{i1} = 1\{X_{i1}'\beta_{10} - \varepsilon_{i1} > 0\},\\
d_{i2} = 1\{X_{i2}'\beta_{20} - \varepsilon_{i2} > 0\},
\end{cases}
\qquad i = 1, 2, \ldots, n, \tag{2.1}
$$

where $X_1$ and $X_2$ are the independent variables, $d_1$ and $d_2$ are observable binary dependent variables, and the error terms $\varepsilon_1$ and $\varepsilon_2$ are assumed to be independent of $(X_1, X_2)$. When $(\varepsilon_1, \varepsilon_2)$ are jointly normally distributed, the joint log-likelihood function can be written as

$$
L_n(\beta_1, \beta_2, \rho) = \sum_{i=1}^{n} \ln \Phi_2(q_{i1} Z_{i1},\; q_{i2} Z_{i2},\; q_{i1} q_{i2} \rho),
$$

where $Z_{i1} = X_{i1}'\beta_1$, $Z_{i2} = X_{i2}'\beta_2$ and $q_{ij} = 2d_{ij} - 1$ for $j = 1, 2$; the true parameters $\beta_{10}$, $\beta_{20}$ and $\rho_0$ can then be jointly estimated by the method of maximum likelihood (see, for example, Greene, 2004). To motivate our semiparametric estimator proposed below, note that the parameters $\beta_{10}$, $\beta_{20}$ and $\rho_0$ can, alternatively, be estimated in a sequential manner: we first estimate $\beta_{10}$ and $\beta_{20}$ separately using the single-equation probit maximum likelihood estimators $\hat\beta_{1pb}$ and $\hat\beta_{2pb}$; in the second step, $\rho_0$ can be estimated by maximizing $L_n(\hat\beta_{1pb}, \hat\beta_{2pb}, \rho)$. While the two-step sequential method is computationally simpler, it is in general asymptotically less efficient than the joint estimation approach. However, this sequential approach is instructive, as our two-step semiparametric estimator for the correlation coefficient $\rho_0$ is based on a similar sequential estimation method for a generalized bivariate probit model.

To generalize the bivariate probit model, Lee (1982, 1983) assumed that the joint distribution of $(\varepsilon_1, \varepsilon_2)$ is of the form $\Phi_2(J_1(\varepsilon_1), J_2(\varepsilon_2), \rho_0)$, where the marginal distributions of $\varepsilon_1$ and $\varepsilon_2$ are $F_1(\varepsilon_1)$ and $F_2(\varepsilon_2)$, and $J_1(\varepsilon_1) = \Phi^{-1}(F_1(\varepsilon_1))$ and $J_2(\varepsilon_2) = \Phi^{-1}(F_2(\varepsilon_2))$. Namely, the bivariate distribution of $(\varepsilon_1, \varepsilon_2)$ is derived by assuming that the transformed variables $\varepsilon_1^* = J_1(\varepsilon_1)$ and $\varepsilon_2^* = J_2(\varepsilon_2)$ are jointly normally distributed with zero means, unit variances, and correlation coefficient $\rho$. Unlike Lee's setup, in which $F_1$ and $F_2$ (and hence $J_1$ and $J_2$) are known, we allow $J_1$ and $J_2$ to be any unknown monotone transformations, subject only to some mild regularity conditions specified below; thus our structure of the error distribution is much more general than Lee's. The case $\rho_0 = 0$ corresponds to statistical independence of $\varepsilon_1$ and $\varepsilon_2$. This generalized bivariate probit model reduces to the conventional bivariate probit model when both $J_1(\cdot)$ and $J_2(\cdot)$ are the identity transformation. As no restriction is imposed on the marginal distributions of $\varepsilon_1$ and $\varepsilon_2$, the model setup under consideration here is clearly much more general than the bivariate probit model; at the same time, we allow the dependence structure between $\varepsilon_1$ and $\varepsilon_2$ to be succinctly summarized by a single parameter $\rho_0$. The focus of our investigation is on the estimation of $\rho_0$.

To motivate our estimator, we write Eq. (2.1) as

$$
\begin{cases}
d_{i1} = 1\{J_1(X_{i1}'\beta_{10}) - \varepsilon_{i1}^* > 0\} = 1\{m_{i1} > \varepsilon_{i1}^*\},\\
d_{i2} = 1\{J_2(X_{i2}'\beta_{20}) - \varepsilon_{i2}^* > 0\} = 1\{m_{i2} > \varepsilon_{i2}^*\},
\end{cases} \tag{2.2}
$$

where $m_1 = J_1(x_1'\beta_{10})$ and $m_2 = J_2(x_2'\beta_{20})$. If $m_1$ and $m_2$ were known, then $\rho_0$ could be estimated by maximizing

$$
\sum_{i=1}^{n} \ln \Phi_2(q_{i1} m_{i1},\; q_{i2} m_{i2},\; q_{i1} q_{i2} \rho) \tag{2.3}
$$

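with respect to $\rho$. Our strategy is to estimate $\beta_{10}$ and $\beta_{20}$ by some existing semiparametric estimators $\hat\beta_1$ and $\hat\beta_2$, and then use $\hat\beta_1$ and $\hat\beta_2$ to construct the nonparametric estimates $\hat m_{i1}$ and $\hat m_{i2}$ to replace $m_{i1}$ and $m_{i2}$ in the likelihood function. Let $P_1(x_1'\beta_1, \beta_1) = E(d_1 \mid X_1'\beta_1 = x_1'\beta_1)$ and $P_2(x_2'\beta_2, \beta_2) = E(d_2 \mid X_2'\beta_2 = x_2'\beta_2)$, and let $P_1(x_1) = P_1(x_1'\beta_{10}, \beta_{10})$ and $P_2(x_2) = P_2(x_2'\beta_{20}, \beta_{20})$. Because $J_1(\varepsilon_1)$ and $J_2(\varepsilon_2)$ are standard normal, $P_1(x_1) = \Phi(m_1)$ and $P_2(x_2) = \Phi(m_2)$, so that $m_1 = \Phi^{-1}(P_1(x_1))$ and $m_2 = \Phi^{-1}(P_2(x_2))$. Given a pair of estimators $\hat\beta_1$ and $\hat\beta_2$ (for example, Han, 1987; Ichimura, 1993; Klein and Spady, 1993; Powell et al., 1989, etc.), we construct the nonparametric estimates for $m_{i1}$ and $m_{i2}$ by

$$
\hat m_{i1} = \Phi^{-1}(P_{n1}(X_{i1}'\hat\beta_1, \hat\beta_1))
\quad\text{and}\quad
\hat m_{i2} = \Phi^{-1}(P_{n2}(X_{i2}'\hat\beta_2, \hat\beta_2)), \tag{2.4}
$$

where

$$
P_{n1}(x_1'\beta_1, \beta_1) = \frac{(1/nh)\sum_{j=1}^{n} K\big((x_1'\beta_1 - X_{j1}'\beta_1)/h\big)\, d_{j1}}{(1/nh)\sum_{j=1}^{n} K\big((x_1'\beta_1 - X_{j1}'\beta_1)/h\big)} \tag{2.5}
$$

and

$$
P_{n2}(x_2'\beta_2, \beta_2) = \frac{(1/nh)\sum_{j=1}^{n} K\big((x_2'\beta_2 - X_{j2}'\beta_2)/h\big)\, d_{j2}}{(1/nh)\sum_{j=1}^{n} K\big((x_2'\beta_2 - X_{j2}'\beta_2)/h\big)}, \tag{2.6}
$$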

where $K$ is a one-dimensional kernel function and $\{h\}$ denotes a sequence of bandwidths converging to zero as the sample size increases. Different kernel functions and bandwidths could obviously have been used, but for simplicity we adopt a single kernel function and one bandwidth parameter. Consequently, we propose to estimate $\rho$ by $\hat\rho$, which maximizes the objective function

$$
L_n(\rho) = \sum_{i=1}^{n} I_i \ln \Phi_2(q_{i1}\hat m_{i1},\; q_{i2}\hat m_{i2},\; q_{i1} q_{i2} \rho),
$$

where $I = 1\{X \in \mathcal{X}\}$ is a fixed trimming function that deals with the boundary problems caused by nonparametric kernel estimation.

Remark 1. Our model is essentially a copula model with unknown marginal distributions. Because it nests the widely used bivariate probit model, the Gaussian copula model is a natural candidate for generalizing the bivariate probit model. Indeed, a simple test for the validity of the bivariate probit model can be easily constructed by comparing the estimates of the generalized correlation coefficient and the correlation coefficient of the probit model, based on the principle of the Hausman test.

Remark 2. In the bivariate probit model, the correlation coefficient of the underlying error terms provides a concise measure of the dependence structure of the two binary decisions beyond the linear relation of the error terms, owing to the distinctive nature of normality. In the normal context, the correlation coefficient measures both the linear and nonlinear relationships between different random variables; in particular, a zero correlation is equivalent to independence. Similarly, the generalized correlation coefficient provides a simple measure of the dependence structure of the binary equations in the generalized bivariate probit model. Indeed, knowledge of the generalized correlation coefficient as a Gaussian copula parameter in our current framework contains all the information on the dependence of the two binary decisions.

Remark 3. As pointed out earlier, applied researchers are frequently interested in whether the two binary decisions are correlated. It is straightforward to verify that $\rho = 0$ if and only if $\varepsilon_1$ and $\varepsilon_2$ are independent. Therefore, a simple test for independence can be based on our estimate $\hat\rho$, which would be asymptotically valid even if the Gaussian copula is misspecified.

Remark 4. Based on the equivalence result (2.2), the model under consideration here can also be viewed as an extension of the traditional bivariate linear probit model that allows a general index structure for the latent regression functions. Indeed, in this semiparametric framework, Eq. (2.2) is also observationally equivalent² to

$$
\begin{cases}
d_{i1} = 1\{g_1(J_1(X_{i1}'\beta_{10})) - g_1(J_1(\varepsilon_{i1})) > 0\},\\
d_{i2} = 1\{g_2(J_2(X_{i2}'\beta_{20})) - g_2(J_2(\varepsilon_{i2})) > 0\}
\end{cases}
$$

for any strictly increasing unknown functions $g_1$ and $g_2$. Therefore, it would be sensible to also consider association measures that are invariant to monotone transformations. Two such measures are Kendall's tau and Spearman's rho. Indeed, for the Gaussian copula model with unknown marginal distributions, there are simple relationships between the generalized correlation coefficient and Kendall's tau and Spearman's rho (Joe, 1997, p. 54). Thus, these three measures together would provide an informative picture of the dependence structure of the two binary decisions.

² We are grateful to an anonymous referee for a detailed discussion on this issue.

3. Large sample properties of the estimator

In this section, we study the large sample properties of the proposed estimator. We make the following assumptions:

Assumption 1. The vectors $(X_{i1}, X_{i2}, d_{i1}, d_{i2})$ are independent and identically distributed across $i$, with finite fourth order moments for each component of $X_{i1}$ and $X_{i2}$.

Assumption 2. The observations are generated based on Eq. (2.1).

Assumption 3. The parameters satisfy $\beta_1 \in B_1$ and $\beta_2 \in B_2$, and the first components of $\beta_1$ and $\beta_2$ are 1, where $B_1$ and $B_2$ are compact subsets, and $\beta_{10}$ and $\beta_{20}$ are interior points of $B_1$ and $B_2$, respectively. The generalized correlation coefficient $\rho$ is an interior point of a compact set $R$. We write $\beta_1 = (1, \tilde\beta_1')'$ and $\beta_2 = (1, \tilde\beta_2')'$.

Assumption 4. The preliminary estimators $\hat\beta_1$ and $\hat\beta_2$ for $\tilde\beta_{10}$ and $\tilde\beta_{20}$ are root-n consistent and have the asymptotic linear representations

$$
\hat\beta_1 = \tilde\beta_{10} + \frac{1}{n}\sum_{i=1}^{n} \psi_{i1} + o_p(n^{-1/2})
$$

and

$$
\hat\beta_2 = \tilde\beta_{20} + \frac{1}{n}\sum_{i=1}^{n} \psi_{i2} + o_p(n^{-1/2})
$$

for some $\psi_{i1} = \psi(d_{i1}, X_{i1})$ and $\psi_{i2} = \psi(d_{i2}, X_{i2})$ such that $E\psi_{i1} = E\psi_{i2} = 0$, $E\|\psi_{i1}\|^2 < \infty$ and $E\|\psi_{i2}\|^2 < \infty$.

Assumption 5. The kernel function $K(\cdot)$ is a symmetric density with bounded support. It is twice continuously differentiable.

Assumption 6. The bandwidth sequences satisfy: (1) $h \to 0$, $nh^2/\ln^2 n \to \infty$. (2) $nh^8 \to 0$ and $nh^3/\ln n \to \infty$ as $n$ increases.

Assumption 7. (i) The supports of the distributions of $x_1$ and $x_2$ are not contained in any proper linear subspace of $\mathbb{R}^q$. (ii) For almost every $\tilde x_1 = (x_{12}, \ldots, x_{1q_1})'$ ($\tilde x_2 = (x_{22}, \ldots, x_{2q_2})'$), the distribution of $x_{11}$ ($x_{21}$) conditional on $\tilde x_1$ ($\tilde x_2$) has everywhere positive density with respect to Lebesgue measure.

Let $p_1(t, \beta)$ ($p_2(t, \beta)$) denote the density of $x_1'\beta$ ($x_2'\beta$) evaluated at $t$, and write $p_1(t) = p_1(t, \beta_{10})$ and $p_2(t) = p_2(t, \beta_{20})$.

Assumption 8. Both $p_1(t)$ and $p_2(t)$ are twice continuously differentiable and their derivatives are uniformly bounded away from zero for $x \in \mathcal{X}$.

Assumptions 1 and 2 describe the model and data. The compactness required in Assumption 3 is common for extremum estimators (e.g., Amemiya, 1985), and the scale normalization is commonly used for semiparametric estimation of binary choice models (see, e.g., Horowitz, 1992). Assumption 4 requires that the estimators $\hat\beta_1$ and $\hat\beta_2$ be root-n consistent and possess asymptotic linear representations in terms of influence functions. Several existing estimators for the slope parameters (Ahn et al., 1996; Han, 1987; Horowitz and Härdle, 1996; Ichimura, 1993; Klein and Spady, 1993; Powell et al., 1989; Sherman, 1993, among others) satisfy Assumption 4. Assumptions 5 and 6 describe the kernel function and the rate of convergence condition for the bandwidth parameters. Assumptions 7 and 8 are boundedness and smoothness conditions, which can be justified by more primitive conditions on the distributions of the variables in the model (see Lee (1994) and Sherman (1994) for some discussion of similar conditions).

Theorem 1. If Assumptions 1–8 hold, then $\hat\rho$ is consistent and asymptotically normal,

$$
\sqrt{n}(\hat\rho - \rho_0) \xrightarrow{d} N(0, \Sigma),
$$

where $\Sigma = V^{-1}\Omega V^{-1}$,

$$
V = E\left[ I\, \frac{\partial^2 \ln \Phi_2(q_{i1} m_{i1},\; q_{i2} m_{i2},\; q_{i1} q_{i2} \rho_0)}{\partial \rho^2} \right]
$$

and $\Omega = E\psi_i^2$, with $\psi_i = \psi_{i1} + \psi_{i2} + \psi_{i3} + S_a\psi_{i4} + S_b\psi_{i5}$; the individual terms are defined in the proof of the theorem.

A consistent estimator of the asymptotic variance is needed to conduct large sample inference. A consistent estimator of $\Sigma$ can be easily obtained if we can find consistent estimators of $V$ and $\Omega$. Estimation of $V$ is straightforward; following the proof of the consistency result, we can show that $\hat V_{n1}$ is consistent for $V$, where

$$
\hat V_{n1} = \frac{1}{n}\sum_{i=1}^{n} I_i\, \frac{\partial^2 \ln \Phi_2\big(q_{i1}\Phi^{-1}(P_{n1}(X_{i1}'\hat\beta_1, \hat\beta_1)),\; q_{i2}\Phi^{-1}(P_{n2}(X_{i2}'\hat\beta_2, \hat\beta_2)),\; q_{i1} q_{i2} \hat\rho\big)}{\partial \rho^2}.
$$
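As a rough numerical illustration of the estimate of V above, the following sketch evaluates the trimmed sample average of the second derivative of ln Φ2 with respect to ρ. It is not the authors' implementation. It assumes that the generated regressors are already available as arrays (the names `m1`, `m2`, `trim` and all function names are hypothetical), it computes Φ2 through the standard identity ∂Φ2/∂ρ = φ2 with Gauss-Legendre quadrature, and it differentiates the analytic score numerically.

```python
import numpy as np
from scipy.stats import norm

# Gauss-Legendre nodes and weights on [-1, 1] (40 points is an arbitrary choice).
_NODES, _WEIGHTS = np.polynomial.legendre.leggauss(40)


def phi2_pdf(a, b, r):
    """Standard bivariate normal density with correlation r."""
    s = 1.0 - r * r
    quad = a * a - 2.0 * r * a * b + b * b
    return np.exp(-quad / (2.0 * s)) / (2.0 * np.pi * np.sqrt(s))


def phi2_cdf(a, b, rho):
    """Phi_2(a, b, rho) = Phi(a) Phi(b) + integral_0^rho phi_2(a, b, r) dr."""
    a, b, rho = np.broadcast_arrays(
        np.asarray(a, float), np.asarray(b, float), np.asarray(rho, float)
    )
    # Map the quadrature nodes from [-1, 1] to [0, rho] for each observation.
    r = 0.5 * rho[..., None] * (_NODES + 1.0)
    vals = phi2_pdf(a[..., None], b[..., None], r)
    integral = 0.5 * rho * np.sum(_WEIGHTS * vals, axis=-1)
    return norm.cdf(a) * norm.cdf(b) + integral


def score_rho(rho, d1, d2, m1, m2):
    """Derivative of ln Phi_2(q1 m1, q2 m2, q1 q2 rho) with respect to rho."""
    q1, q2 = 2 * d1 - 1, 2 * d2 - 1
    a, b, w = q1 * m1, q2 * m2, q1 * q2 * rho
    return q1 * q2 * phi2_pdf(a, b, w) / phi2_cdf(a, b, w)


def v_hat(rho, d1, d2, m1, m2, trim, step=1e-5):
    """Trimmed mean of the second derivative, via a central difference of the score."""
    up = score_rho(rho + step, d1, d2, m1, m2)
    down = score_rho(rho - step, d1, d2, m1, m2)
    return np.mean(trim * (up - down) / (2.0 * step))
```

In use, `m1` and `m2` would hold the generated regressors from (2.4), `trim` the values of the trimming indicator, and `rho` the second-step estimate. The quadrature is only reliable when |ρ| is not too close to one.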

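To estimate $\Omega$, we need to construct estimators for $S_a$ and $S_b$, and sequences $\tilde\psi_{i1}$, $\tilde\psi_{i2}$, $\tilde\psi_{i3}$, $\tilde\psi_{i4}$ and $\tilde\psi_{i5}$ to estimate $\psi_i = \psi_{i1} + \psi_{i2} + \psi_{i3} + S_a\psi_{i4} + S_b\psi_{i5}$. Define

$$
\hat S_{na} = \frac{1}{n}\sum_{i=1}^{n} I_i\, \frac{\partial^2 \ln \Phi_2\big(q_{i1}\Phi^{-1}(P_{ni1}),\; q_{i2}\Phi^{-1}(P_{ni2}),\; q_{i1} q_{i2} \hat\rho\big)}{\partial \rho\, \partial P_1}\, \frac{\partial P_{n1}(X_{i1}'\hat\beta_1, \hat\beta_1)}{\partial \beta_1},
$$

$$
\hat S_{nb} = \frac{1}{n}\sum_{i=1}^{n} I_i\, \frac{\partial^2 \ln \Phi_2\big(q_{i1}\Phi^{-1}(P_{ni1}),\; q_{i2}\Phi^{-1}(P_{ni2}),\; q_{i1} q_{i2} \hat\rho\big)}{\partial \rho\, \partial P_2}\, \frac{\partial P_{n2}(X_{i2}'\hat\beta_2, \hat\beta_2)}{\partial \beta_2},
$$

$$
\tilde\psi_{i1} = I_i\, \frac{\partial \ln \Phi_2\big(q_{i1}\Phi^{-1}(P_{ni1}),\; q_{i2}\Phi^{-1}(P_{ni2}),\; q_{i1} q_{i2} \hat\rho\big)}{\partial \rho},
$$

$$
\tilde\psi_{i2} = I_i C_{ni1}(d_{i1} - P_{ni1}) \quad\text{and}\quad \tilde\psi_{i3} = I_i C_{ni2}(d_{i2} - P_{ni2}),
$$

where $P_{ni1} = P_{n1}(X_{i1}'\hat\beta_1, \hat\beta_1)$, $P_{ni2} = P_{n2}(X_{i2}'\hat\beta_2, \hat\beta_2)$,

$$
C_{ni1} = \frac{\partial^2 \ln \Phi_2\big(q_{i1}\Phi^{-1}(P_{ni1}),\; q_{i2}\Phi^{-1}(P_{ni2}),\; q_{i1} q_{i2} \hat\rho\big)}{\partial \rho\, \partial P_1}
$$

and

$$
C_{ni2} = \frac{\partial^2 \ln \Phi_2\big(q_{i1}\Phi^{-1}(P_{ni1}),\; q_{i2}\Phi^{-1}(P_{ni2}),\; q_{i1} q_{i2} \hat\rho\big)}{\partial \rho\, \partial P_2}.
$$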

Then, following the arguments in the consistency proof and Powell et al. (1989), we can show

$$
\frac{1}{n}\sum_{i=1}^{n} \|\tilde\psi_{i1} - \psi_{i1}\|^2 = o_p(1)
\quad\text{and}\quad
\frac{1}{n}\sum_{i=1}^{n} \|\tilde\psi_{i2} - \psi_{i2}\|^2 = o_p(1),
$$

as well as $\hat S_{na} = S_a + o_p(1)$ and $\hat S_{nb} = S_b + o_p(1)$. In addition, based on existing semiparametric estimators for the binary choice model, examples of $\tilde\psi_{i4}$ and $\tilde\psi_{i5}$ are available (e.g., Powell et al. (1989)) such that

$$
\frac{1}{n}\sum_{i=1}^{n} \|\tilde\psi_{i3} - \psi_{i3}\|^2 = o_p(1),
\quad
\frac{1}{n}\sum_{i=1}^{n} \|\tilde\psi_{i4} - \psi_{i4}\|^2 = o_p(1)
\quad\text{and}\quad
\frac{1}{n}\sum_{i=1}^{n} \|\tilde\psi_{i5} - \psi_{i5}\|^2 = o_p(1).
$$

Finally, we define

$$
\tilde\psi_i = \tilde\psi_{i1} + \tilde\psi_{i2} + \tilde\psi_{i3} + \hat S_{na}\tilde\psi_{i4} + \hat S_{nb}\tilde\psi_{i5}.
$$

Then it is straightforward to show that

$$
\frac{1}{n}\sum_{i=1}^{n} \|\tilde\psi_i - \psi_i\|^2 = o_p(1)
$$

and that $\hat\Sigma = \hat V_{n1}^{-1}\hat\Omega\hat V_{n1}^{-1}$ is consistent for $\Sigma$, where $\hat\Omega = (1/n)\sum_{i=1}^{n}\tilde\psi_i^2$.

3.1. A simulation study

In this subsection we present the results of a small simulation study to assess the finite sample performance of our estimator. We report the results for our estimator $\hat\rho$ and for the bivariate probit estimator $\hat\rho_p$, obtained by the method of maximum likelihood, which ignores possible misspecification. In implementing our two-stage estimator, we use the maximum rank correlation estimator (Han, 1987) in the first step. Throughout, we report the mean, bias, SD (standard deviation), and RMSE (root mean square error) of these two estimators based on 500 replications for each design, with sample size equal to 200. All the results are reported in Table 1. The bandwidth for our estimator is chosen based on Silverman's (1986) rule-of-thumb method. The first design for generating the underlying data is as follows:

$$
d_{1i} = 1\{x_{1i} + x_{2i} + \varepsilon_{1i} > 0\}, \qquad
d_{2i} = 1\{x_{1i} + x_{2i} + \varepsilon_{2i} > 0\},
$$

where $x_1 \sim N(0, 1)$ and $x_2 \sim N(0, 1)$ are independent of each other, and $(\varepsilon_1, \varepsilon_2)$ are jointly normally distributed, independent of $(x_1, x_2)$, with correlation coefficient $\rho = 1/\sqrt{2} \approx 0.707$. In this case, the bivariate probit model is correctly specified. Thus both estimators are consistent in the current setting. As expected, the maximum likelihood estimator performs very well, but our estimator is also very competitive.

The data for the second design are generated by the model

$$
d_{1i} = 1\{h(x_{1i} + x_{2i}, \lambda_1) + \varepsilon_{1i} > 0\}, \qquad
d_{2i} = 1\{h(x_{1i} + x_{2i}, \lambda_2) + \varepsilon_{2i} > 0\},
$$

where $x_1$, $x_2$, $\varepsilon_1$ and $\varepsilon_2$ are generated as in Design I, and $h(y, \lambda) = |y|^{\lambda}\,\mathrm{sgn}(y) - 1/\lambda$ is a modified Box–Cox transformation function, with $\lambda_1 = 0.8$ and $\lambda_2 = 0.6$. In this case the bivariate probit model is misspecified; thus the maximum likelihood estimator is inconsistent, which results in a large bias. Our estimator, on the other hand, performs very well.

For the third design, the data are generated according to

$$
d_{1i} = 1\{h(x_{1i} + x_{2i}, \lambda_1) + \varepsilon_{1i} > 0\}, \qquad
d_{2i} = 1\{h(x_{1i} + x_{2i}, \lambda_2) + \varepsilon_{2i} > 0\},
$$

where $x_1$, $x_2$, $\varepsilon_1$ and $\varepsilon_2$ are generated as in the previous two designs except that $\rho = 0.5$, and $h(y, \lambda) = |y|^{\lambda}\,\mathrm{sgn}(y) - 1/\lambda$ with $\lambda_1 = 1$ and $\lambda_2 = 0.8$. Again, the bivariate probit model is misspecified; as in Design II, our estimator performs well, whereas the maximum likelihood estimator incurs a very large bias.

Table 1

Design      Estimator       True value   Mean      Bias      SD        RMSE
Design I    $\hat\rho_p$    0.707        0.69041   0.00659   0.13280   0.13296
Design I    $\hat\rho$      0.707        0.73588   0.02888   0.13388   0.13401
Design II   $\hat\rho_p$    0.707        0.89251   0.18551   0.10438   0.19231
Design II   $\hat\rho$      0.707        0.74840   0.04140   0.11481   0.12190
Design III  $\hat\rho_p$    0.5          0.46874   0.96874   0.09865   0.97374
Design III  $\hat\rho$      0.5          0.48404   0.01596   0.17590   0.17644
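The data-generating processes of the three designs can be sketched as follows. This is a minimal illustration rather than the authors' simulation code: the function names and the seed handling are hypothetical, and the transformation h is implemented by reading the formula in the text literally, as |y|^λ sgn(y) − 1/λ.

```python
import numpy as np


def modified_box_cox(y, lam):
    """h(y, lambda) = |y|**lambda * sgn(y) - 1/lambda, read literally from the text."""
    return np.abs(y) ** lam * np.sign(y) - 1.0 / lam


def simulate(n, rho, lam1=None, lam2=None, seed=0):
    """Draw one sample of size n; lam1 = lam2 = None gives the linear index of Design I."""
    rng = np.random.default_rng(seed)
    x1 = rng.standard_normal(n)
    x2 = rng.standard_normal(n)
    # Jointly normal errors with unit variances and correlation rho.
    cov = np.array([[1.0, rho], [rho, 1.0]])
    eps = rng.multivariate_normal(np.zeros(2), cov, size=n)
    index = x1 + x2
    g1 = index if lam1 is None else modified_box_cox(index, lam1)
    g2 = index if lam2 is None else modified_box_cox(index, lam2)
    d1 = (g1 + eps[:, 0] > 0).astype(int)
    d2 = (g2 + eps[:, 1] > 0).astype(int)
    return x1, x2, d1, d2


# Sample size 200 as in the study; one draw per design.
design_1 = simulate(200, 1.0 / np.sqrt(2.0))
design_2 = simulate(200, 1.0 / np.sqrt(2.0), lam1=0.8, lam2=0.6)
design_3 = simulate(200, 0.5, lam1=1.0, lam2=0.8)
```

Repeating such draws 500 times with different seeds, and applying both estimators to each sample, would mirror the structure of the reported experiment.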

4. Conclusion

In this paper we have considered semiparametric estimation of a generalized correlation coefficient in a generalized bivariate probit model, which provides a simple summary statistic measuring the relationship between the two binary decision processes in a very general setup. In particular, we adopt a bivariate normal structure for the unobservable disturbance terms subject to unknown monotone transformations. The estimator is shown to be consistent and asymptotically normal. A simulation study indicates that the estimator may be useful in practical applications.

For the estimation of the generalized probit model, we have considered a two-step approach. In principle, a joint estimation approach, say, by combining the approach in Klein and Spady (1993) and the Gaussian copula, could lead to some efficiency gain. One major drawback of this joint estimation lies in its computational difficulty. It might be possible to adopt a one-step efficient estimation approach based on our two-step estimates. We will consider the theoretical properties and practical implications of this approach in future research.

Acknowledgments

We would like to thank Xiaohong Chen, Chin-Fan Chung, Yanqin Fan, Chung-ming Kuan, Lung-Fei Lee and seminar participants at Academia Sinica for their helpful comments. Insightful comments from two anonymous referees have greatly improved the presentation of the paper. Chen's research was supported by the UGC Grants DAG 03/04 BM57 and DAG 04/05 BM52 from the University Grants Committee of Hong Kong, and Zhou's research was supported by the SHUFE Grant No.211-3-70.

Appendix

The following two lemmas present some results on rates of convergence and asymptotic linear representations for nonparametric kernel estimates. These results are useful for proving the main theorem. The proof of Lemma A1 is similar to the approach in Newey (1994, Lemma B.1) and is thus omitted here. Based on the first lemma, Lemma A2 follows from a simple linearization.

Lemma A1. Under Assumptions 1–4, 6–8, for $x \in \mathcal{X}$, and $\beta_1$ ($\beta_2$) belonging to a neighborhood of $\beta_{10}$ ($\beta_{20}$), we have

$$
\sup_{x, \beta_1} \big| P_{n1}(x_1'\beta_1, \beta_1) - P_1(x_1'\beta_1, \beta_1) \big| = O_p\big(h^2 + (nh)^{-1/2}(\ln n)^{1/2}\big),
$$

$$
\sup_{x, \beta_2} \big| P_{n2}(x_2'\beta_2, \beta_2) - P_2(x_2'\beta_2, \beta_2) \big| = O_p\big(h^2 + (nh)^{-1/2}(\ln n)^{1/2}\big),
$$

$$
\sup_{x, \beta_1} \left| \frac{\partial P_{n1}(x_1'\beta_1, \beta_1)}{\partial \beta_1} - \frac{\partial P_1(x_1'\beta_1, \beta_1)}{\partial \beta_1} \right| = O_p\big(h + (nh^3)^{-1/2}(\ln n)^{1/2}\big)
$$

and

$$
\sup_{x, \beta_2} \left| \frac{\partial P_{n2}(x_2'\beta_2, \beta_2)}{\partial \beta_2} - \frac{\partial P_2(x_2'\beta_2, \beta_2)}{\partial \beta_2} \right| = O_p\big(h + (nh^3)^{-1/2}(\ln n)^{1/2}\big).
$$
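The kernel estimator in (2.5), whose uniform convergence rate is the subject of Lemma A1, can be sketched as below. This is an illustration under stated assumptions, not the authors' code: the triweight kernel is one possible choice satisfying Assumption 5 (a symmetric density on a bounded support with two continuous derivatives), the function names are hypothetical, and the clipping of the estimated probabilities away from 0 and 1 is an added safeguard that stands in for the trimming used in the paper.

```python
import numpy as np
from scipy.stats import norm


def triweight(u):
    """Triweight kernel: symmetric density on [-1, 1], twice continuously differentiable."""
    return np.where(np.abs(u) <= 1.0, (35.0 / 32.0) * (1.0 - u * u) ** 3, 0.0)


def kernel_prob(t, index, d, h):
    """P_n(t) = sum_j K((t - index_j)/h) d_j / sum_j K((t - index_j)/h)."""
    t = np.atleast_1d(np.asarray(t, float))
    weights = triweight((t[:, None] - index[None, :]) / h)
    den = weights.sum(axis=1)
    num = weights @ d
    # Return NaN where no observation falls inside the kernel window.
    return np.divide(num, den, out=np.full_like(den, np.nan), where=den > 0)


def generated_regressor(index, d, h, clip=1e-6):
    """m_hat_i = Phi^{-1}(P_n(index_i)), with probabilities clipped so the inverse is finite."""
    p = np.clip(kernel_prob(index, index, d, h), clip, 1.0 - clip)
    return norm.ppf(p)
```

Here `index` would hold the estimated single index values from the first step and `d` the corresponding binary outcomes. The common factor 1/(nh) in (2.5) cancels between numerator and denominator, so it is omitted.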

Lemma A2. Under Assumptions 1–4, 6–8,

$$
P_{n1}(x_1'\beta_{10}, \beta_{10}) - P_1(x_1) = \frac{1}{nh}\sum_{i=1}^{n} \frac{1}{p_1(x_1'\beta_{10})}\,(d_{i1} - P_1(x_1))\, K\!\left(\frac{Z_{i1} - x_1'\beta_{10}}{h}\right) + O_p\big(h^4 + (nh)^{-1}\ln n\big)
$$

and

$$
P_{n2}(x_2'\beta_{20}, \beta_{20}) - P_2(x_2) = \frac{1}{nh}\sum_{i=1}^{n} \frac{1}{p_2(x_2'\beta_{20})}\,(d_{i2} - P_2(x_2))\, K\!\left(\frac{Z_{i2} - x_2'\beta_{20}}{h}\right) + O_p\big(h^4 + (nh)^{-1}\ln n\big)
$$

uniformly in $x \in \mathcal{X}$, where $Z_{i1} = X_{i1}'\beta_{10}$, $Z_{i2} = X_{i2}'\beta_{20}$, and $p_1(\cdot)$ and $p_2(\cdot)$ are the density functions of $Z_{i1}$ and $Z_{i2}$, respectively.

Proof of Theorem 1. First we establish consistency. Let

$$
\bar L_n(\rho) = \frac{1}{n}\sum_{i=1}^{n} I_i \ln \Phi_2(q_{i1} m_{i1},\; q_{i2} m_{i2},\; q_{i1} q_{i2} \rho)
$$

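and $\bar L(\rho) = E[I_i \ln \Phi_2(q_{i1} m_{i1}, q_{i2} m_{i2}, q_{i1} q_{i2} \rho)]$. From a Taylor expansion, we have

$$
\begin{aligned}
L_n(\rho) ={}& \frac{1}{n}\sum_{i=1}^{n} I_i \ln \Phi_2\big(q_{i1}\Phi^{-1}(P_{n1}(X_{i1}'\hat\beta_1, \hat\beta_1)),\; q_{i2}\Phi^{-1}(P_{n2}(X_{i2}'\hat\beta_2, \hat\beta_2)),\; q_{i1} q_{i2} \rho\big) \\
={}& \frac{1}{n}\sum_{i=1}^{n} I_i \ln \Phi_2\big(q_{i1}\Phi^{-1}(P_1(X_{i1}'\hat\beta_1, \hat\beta_1)),\; q_{i2}\Phi^{-1}(P_2(X_{i2}'\hat\beta_2, \hat\beta_2)),\; q_{i1} q_{i2} \rho\big) \\
&+ \frac{1}{n}\sum_{i=1}^{n} I_i\, \frac{\Phi_{2p_1}\big(q_{i1}\Phi^{-1}(\bar P_{1i}),\; q_{i2}\Phi^{-1}(\bar P_{2i}),\; q_{i1} q_{i2} \rho\big)}{\Phi_2\big(q_{i1}\Phi^{-1}(\bar P_{1i}),\; q_{i2}\Phi^{-1}(\bar P_{2i}),\; q_{i1} q_{i2} \rho\big)}\, \big[P_{n1}(X_{i1}'\hat\beta_1, \hat\beta_1) - P_1(X_{i1}'\hat\beta_1, \hat\beta_1)\big] \\
&+ \frac{1}{n}\sum_{i=1}^{n} I_i\, \frac{\Phi_{2p_2}\big(q_{i1}\Phi^{-1}(\bar P_{1i}),\; q_{i2}\Phi^{-1}(\bar P_{2i}),\; q_{i1} q_{i2} \rho\big)}{\Phi_2\big(q_{i1}\Phi^{-1}(\bar P_{1i}),\; q_{i2}\Phi^{-1}(\bar P_{2i}),\; q_{i1} q_{i2} \rho\big)}\, \big[P_{n2}(X_{i2}'\hat\beta_2, \hat\beta_2) - P_2(X_{i2}'\hat\beta_2, \hat\beta_2)\big] \\
={}& L_{n1} + L_{n2} + L_{n3},
\end{aligned}
$$

where $\bar P_{1i}$ is between $P_1(X_{i1}'\hat\beta_1, \hat\beta_1)$ and $P_{n1}(X_{i1}'\hat\beta_1, \hat\beta_1)$, $\bar P_{2i}$ is between $P_2(X_{i2}'\hat\beta_2, \hat\beta_2)$ and $P_{n2}(X_{i2}'\hat\beta_2, \hat\beta_2)$,

$$
\Phi_{2p_1}\big(q_{i1}\Phi^{-1}(P_1),\; q_{i2}\Phi^{-1}(P_2),\; q_{i1} q_{i2} \rho\big) = \frac{\partial \Phi_2\big(q_{i1}\Phi^{-1}(P_1),\; q_{i2}\Phi^{-1}(P_2),\; q_{i1} q_{i2} \rho\big)}{\partial P_1}
$$

and

$$
\Phi_{2p_2}\big(q_{i1}\Phi^{-1}(P_1),\; q_{i2}\Phi^{-1}(P_2),\; q_{i1} q_{i2} \rho\big) = \frac{\partial \Phi_2\big(q_{i1}\Phi^{-1}(P_1),\; q_{i2}\Phi^{-1}(P_2),\; q_{i1} q_{i2} \rho\big)}{\partial P_2}.
$$

By Lemma A1, under Assumption 6, it is easy to see that $L_{n2} = o_p(1)$ and $L_{n3} = o_p(1)$ uniformly in $\rho \in R$, so that $L_n = L_{n1} + o_p(1)$. Analogously, a Taylor expansion and an application of a uniform law of large numbers yield

$$
\begin{aligned}
L_{n1}(\rho) ={}& \bar L_n(\rho) + \frac{1}{n}\sum_{i=1}^{n} I_i\, \frac{\Phi_{2\beta_1}\big(q_{i1}\Phi^{-1}(P_1(X_{i1}'\bar\beta_1, \bar\beta_1)),\; q_{i2}\Phi^{-1}(P_2(X_{i2}'\bar\beta_2, \bar\beta_2)),\; q_{i1} q_{i2} \rho\big)}{\Phi_2\big(q_{i1}\Phi^{-1}(P_1(X_{i1}'\bar\beta_1, \bar\beta_1)),\; q_{i2}\Phi^{-1}(P_2(X_{i2}'\bar\beta_2, \bar\beta_2)),\; q_{i1} q_{i2} \rho\big)}\, (\hat\beta_1 - \beta_{10}) \\
&+ \frac{1}{n}\sum_{i=1}^{n} I_i\, \frac{\Phi_{2\beta_2}\big(q_{i1}\Phi^{-1}(P_1(X_{i1}'\bar\beta_1, \bar\beta_1)),\; q_{i2}\Phi^{-1}(P_2(X_{i2}'\bar\beta_2, \bar\beta_2)),\; q_{i1} q_{i2} \rho\big)}{\Phi_2\big(q_{i1}\Phi^{-1}(P_1(X_{i1}'\bar\beta_1, \bar\beta_1)),\; q_{i2}\Phi^{-1}(P_2(X_{i2}'\bar\beta_2, \bar\beta_2)),\; q_{i1} q_{i2} \rho\big)}\, (\hat\beta_2 - \beta_{20}),
\end{aligned}
$$

where $\bar\beta_1$ is between $\beta_{10}$ and $\hat\beta_1$, $\bar\beta_2$ is between $\beta_{20}$ and $\hat\beta_2$,

$$
\Phi_{2\beta_1}\big(q_{i1}\Phi^{-1}(P_1(X_{i1}'\beta_1, \beta_1)),\; q_{i2}\Phi^{-1}(P_2(X_{i2}'\beta_2, \beta_2)),\; q_{i1} q_{i2} \rho\big) = \frac{\partial \Phi_2\big(q_{i1}\Phi^{-1}(P_1(X_{i1}'\beta_1, \beta_1)),\; q_{i2}\Phi^{-1}(P_2(X_{i2}'\beta_2, \beta_2)),\; q_{i1} q_{i2} \rho\big)}{\partial \beta_1}
$$

and

$$
\Phi_{2\beta_2}\big(q_{i1}\Phi^{-1}(P_1(X_{i1}'\beta_1, \beta_1)),\; q_{i2}\Phi^{-1}(P_2(X_{i2}'\beta_2, \beta_2)),\; q_{i1} q_{i2} \rho\big) = \frac{\partial \Phi_2\big(q_{i1}\Phi^{-1}(P_1(X_{i1}'\beta_1, \beta_1)),\; q_{i2}\Phi^{-1}(P_2(X_{i2}'\beta_2, \beta_2)),\; q_{i1} q_{i2} \rho\big)}{\partial \beta_2}.
$$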

Under Assumption 4, we have

$$
L_{n1} = \bar L_n(\rho) + o_p(1) = \bar L(\rho) + o_p(1)
$$

uniformly in $\rho \in R$. The other conditions for the consistency proof for extremum estimators in Amemiya (1985) are easy to check. Thus consistency follows.

We now establish asymptotic normality. As $\rho_0$ is an interior point of $R$, by the consistency of $\hat\rho$, with probability approaching one as the sample size increases, we have the following first order condition:

$$
\frac{1}{n}\sum_{i=1}^{n} I_i\, \frac{\partial \ln \Phi_2\big(q_{i1}\Phi^{-1}(P_{n1}(X_{i1}'\hat\beta_1, \hat\beta_1)),\; q_{i2}\Phi^{-1}(P_{n2}(X_{i2}'\hat\beta_2, \hat\beta_2)),\; q_{i1} q_{i2} \hat\rho\big)}{\partial \rho} = 0.
$$

Then a Taylor expansion gives

$$
\sqrt{n}(\hat\rho - \rho_0) = -V_n^{-1} S_n,
$$

where

$$
V_n = \frac{1}{n}\sum_{i=1}^{n} I_i\, \frac{\partial^2 \ln \Phi_2\big(q_{i1}\Phi^{-1}(P_{n1}(X_{i1}'\hat\beta_1, \hat\beta_1)),\; q_{i2}\Phi^{-1}(P_{n2}(X_{i2}'\hat\beta_2, \hat\beta_2)),\; q_{i1} q_{i2} \bar\rho\big)}{\partial \rho^2}
$$

and

$$
S_n = \frac{1}{\sqrt{n}}\sum_{i=1}^{n} I_i\, \frac{\partial \ln \Phi_2\big(q_{i1}\Phi^{-1}(P_{n1}(X_{i1}'\hat\beta_1, \hat\beta_1)),\; q_{i2}\Phi^{-1}(P_{n2}(X_{i2}'\hat\beta_2, \hat\beta_2)),\; q_{i1} q_{i2} \rho_0\big)}{\partial \rho},
$$

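where $\bar\rho$ is between $\hat\rho$ and $\rho_0$. Similar to the proof of the consistency result, we can show

$$
V_n = V + o_p(1) = E\left[ I\, \frac{\partial^2 \ln \Phi_2(q_{i1} m_{i1},\; q_{i2} m_{i2},\; q_{i1} q_{i2} \rho_0)}{\partial \rho^2} \right] + o_p(1).
$$

Now turn to $S_n$. Note that

$$
S_n = S_n^* + S_{na}\sqrt{n}(\hat\beta_1 - \beta_{10}) + S_{nb}\sqrt{n}(\hat\beta_2 - \beta_{20}),
$$

where

$$
S_n^* = \frac{1}{\sqrt{n}}\sum_{i=1}^{n} I_i\, \frac{\partial \ln \Phi_2\big(q_{i1}\Phi^{-1}(P_{n1}(X_{i1}'\beta_{10}, \beta_{10})),\; q_{i2}\Phi^{-1}(P_{n2}(X_{i2}'\beta_{20}, \beta_{20})),\; q_{i1} q_{i2} \rho_0\big)}{\partial \rho},
$$

$$
S_{na} = \frac{1}{n}\sum_{i=1}^{n} I_i\, \frac{\partial^2 \ln \Phi_2\big(q_{i1}\Phi^{-1}(P_{n1}(X_{i1}'\bar\beta_1, \bar\beta_1)),\; q_{i2}\Phi^{-1}(P_{n2}(X_{i2}'\bar\beta_2, \bar\beta_2)),\; q_{i1} q_{i2} \rho_0\big)}{\partial \rho\, \partial P_1}\, \frac{\partial P_{n1}(X_{i1}'\bar\beta_1, \bar\beta_1)}{\partial \beta_1}
$$

and

$$
S_{nb} = \frac{1}{n}\sum_{i=1}^{n} I_i\, \frac{\partial^2 \ln \Phi_2\big(q_{i1}\Phi^{-1}(P_{n1}(X_{i1}'\bar\beta_1, \bar\beta_1)),\; q_{i2}\Phi^{-1}(P_{n2}(X_{i2}'\bar\beta_2, \bar\beta_2)),\; q_{i1} q_{i2} \rho_0\big)}{\partial \rho\, \partial P_2}\, \frac{\partial P_{n2}(X_{i2}'\bar\beta_2, \bar\beta_2)}{\partial \beta_2},
$$

where $\bar\beta$ lies on the line segment between $\beta_0$ and $\hat\beta$. Here $S_n^*$ denotes the score evaluated at the true index coefficients. Again, following the consistency proof, we can show that $S_{na} = S_a + o_p(1)$ and $S_{nb} = S_b + o_p(1)$, where

$$
S_a = E\left[ I\, \frac{\partial^2 \ln \Phi_2\big(q_{i1}\Phi^{-1}(P_1),\; q_{i2}\Phi^{-1}(P_2),\; q_{i1} q_{i2} \rho_0\big)}{\partial \rho\, \partial P_1}\, \frac{\partial P_1(X_1'\beta_{10}, \beta_{10})}{\partial \beta_1} \right]
$$

and

$$
S_b = E\left[ I\, \frac{\partial^2 \ln \Phi_2\big(q_{i1}\Phi^{-1}(P_1),\; q_{i2}\Phi^{-1}(P_2),\; q_{i1} q_{i2} \rho_0\big)}{\partial \rho\, \partial P_2}\, \frac{\partial P_2(X_2'\beta_{20}, \beta_{20})}{\partial \beta_2} \right].
$$

We now consider $S_n^*$. Another Taylor expansion, together with Lemma A2, gives

$$
S_n^* = S_{n1} + S_{n2} + S_{n3} + R_n,
$$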

where

$$
S_{n1} = \frac{1}{\sqrt{n}}\sum_{i=1}^{n} I_i\, \frac{\partial \ln \Phi_2(q_{i1} m_{i1},\; q_{i2} m_{i2},\; q_{i1} q_{i2} \rho_0)}{\partial \rho},
$$

$$
S_{n2} = \frac{1}{\sqrt{n}}\sum_{i=1}^{n} I_i C_{i1}\big[P_{n1}(X_{i1}'\beta_{10}, \beta_{10}) - P_{i1}\big]
$$

and

$$
S_{n3} = \frac{1}{\sqrt{n}}\sum_{i=1}^{n} I_i C_{i2}\big[P_{n2}(X_{i2}'\beta_{20}, \beta_{20}) - P_{i2}\big].
$$

Here $P_{i1} = P_1(X_{i1}'\beta_{10}, \beta_{10})$, $P_{i2} = P_2(X_{i2}'\beta_{20}, \beta_{20})$, $R_n$ is the residual term involving the second order terms,

$$
C_{i1} = \frac{\partial^2 \ln \Phi_2\big(q_{i1}\Phi^{-1}(P_{i1}),\; q_{i2}\Phi^{-1}(P_{i2}),\; q_{i1} q_{i2} \rho_0\big)}{\partial \rho\, \partial P_1}
$$

and

$$
C_{i2} = \frac{\partial^2 \ln \Phi_2\big(q_{i1}\Phi^{-1}(P_{i1}),\; q_{i2}\Phi^{-1}(P_{i2}),\; q_{i1} q_{i2} \rho_0\big)}{\partial \rho\, \partial P_2}.
$$

Following the proof of the consistency result and using Lemma A1, we can show that

$$
R_n = O_p(1)\, O_p\big(n^{1/2}(h^4 + (nh)^{-1}\ln n)\big) = o_p(1).
$$

By Lemma A2, we have

$$
\begin{aligned}
S_{n2} &= \frac{1}{\sqrt{n}}\sum_{i=1}^{n} I_i C_{i1}\big[P_{n1}(X_{i1}'\beta_{10}, \beta_{10}) - P_{i1}\big] \\
&= \frac{1}{\sqrt{n}\,nh}\sum_{i,j=1}^{n} I_i C_{i1}\, \frac{d_{j1} - P_{i1}}{p_{i1}}\, K\!\left(\frac{Z_{i1} - Z_{j1}}{h}\right) + n^{1/2}\, O_p\big(h^4 + (nh)^{-1}\ln n\big).
\end{aligned}
$$

Then, following the arguments in Powell et al. (1989) using a U-statistic projection, we obtain

$$
S_{n2} = \frac{1}{\sqrt{n}}\sum_{i=1}^{n} I_i C_{i1}(d_{i1} - P_{i1}) + o_p(1).
$$

Similarly, we can show

$$
S_{n3} = \frac{1}{\sqrt{n}}\sum_{i=1}^{n} I_i C_{i2}(d_{i2} - P_{i2}) + o_p(1).
$$

By combining the previous results, we obtain

$$
\begin{aligned}
S_n ={}& \frac{1}{\sqrt{n}}\sum_{i=1}^{n} I_i\, \frac{\partial \ln \Phi_2(q_{i1} m_{i1},\; q_{i2} m_{i2},\; q_{i1} q_{i2} \rho_0)}{\partial \rho}
+ \frac{1}{\sqrt{n}}\sum_{i=1}^{n} I_i C_{i1}(d_{i1} - P_{i1})
+ \frac{1}{\sqrt{n}}\sum_{i=1}^{n} I_i C_{i2}(d_{i2} - P_{i2}) \\
&+ S_a\sqrt{n}(\hat\beta_1 - \beta_{10}) + S_b\sqrt{n}(\hat\beta_2 - \beta_{20}) \\
={}& \frac{1}{\sqrt{n}}\sum_{i=1}^{n} \{\psi_{i1} + \psi_{i2} + \psi_{i3} + S_a\psi_{i4} + S_b\psi_{i5}\} + o_p(1).
\end{aligned}
$$

Applying the central limit theorem and the Slutsky theorem yields the desired result. □

References

Ahn, H., Ichimura, H., Powell, J.L., 1996. Simple estimators for monotone index models. Department of Economics, University of California, Berkeley.
Chiappori, P.A., Salanié, B., 2000. Testing for asymmetric information in insurance markets. Journal of Political Economy 108 (1).
Cosslett, S.R., 1983. Distribution-free maximum likelihood estimator of the binary choice model. Econometrica 51, 765–782.
Greene, W., 2004. Econometric Analysis. Macmillan, New York.
Han, A.K., 1987. Nonparametric analysis of a generalized regression model: the maximum rank correlation estimator. Journal of Econometrics 35, 303–316.
Horowitz, J.L., 1992. A smooth maximum score estimator for the binary response model. Econometrica 60, 505–531.
Horowitz, J.L., Härdle, W., 1996. Direct semiparametric estimation of single-index models with discrete covariates. Journal of the American Statistical Association 91, 1632–1640.
Ichimura, H., 1993. Semiparametric least squares (SLS) and weighted SLS estimation of single-index models. Journal of Econometrics 58, 71–120.
Joe, H., 1997. Multivariate Models and Dependence Concepts. Chapman and Hall, London.
Klein, R.W., Spady, R.S., 1993. An efficient semiparametric estimator of the binary response model. Econometrica 61, 387–421.
Kotz, S., Johnson, N.L., 1989. Encyclopedia of Statistical Sciences, vol. 9. Wiley, New York.
Lee, L.-F., 1982. Some approaches to the correction of selectivity bias. The Review of Economic Studies XLIX, 355–372.
Lee, L.-F., 1983. Generalized econometric models with selectivity. Econometrica 51, 507–512.
Lee, L.-F., 1994. Semiparametric two-stage estimation of sample selection models subject to Tobit-type selection rules. Journal of Econometrics 61, 305–344.
Manski, C.F., 1985. Semiparametric analysis of discrete response: asymptotic properties of the maximum score estimator. Journal of Econometrics 27, 313–333.
Newey, W.K., 1994. Kernel estimation of partial means and a general variance estimator. Econometric Theory 10, 233–253.
Powell, J.L., Stock, J.H., Stoker, T.M., 1989. Semiparametric estimation of weighted average derivatives. Econometrica 57, 1403–1430.
Sherman, R.P., 1993. The limiting distribution of the maximum rank correlation estimator. Econometrica 61, 123–138.
Sherman, R.P., 1994. Maximal inequalities for degenerate U-processes with applications to optimization estimators. The Annals of Statistics 22 (1), 439–459.
Silverman, B.W., 1986. Density Estimation for Statistics and Data Analysis. Chapman and Hall, London.