Nonparametric methodology for the time-dependent partial area under the ROC curve


Journal of Statistical Planning and Inference 141 (2011) 3829–3838


Journal of Statistical Planning and Inference journal homepage: www.elsevier.com/locate/jspi

Hung Hung, Chin-Tsang Chiang*

Department of Mathematics, National Taiwan University, Taipei 10617, Taiwan, ROC

Article info

Article history: Received 25 August 2009; received in revised form 9 March 2011; accepted 24 June 2011; available online 7 July 2011.

Keywords: AUC; Bandwidth; Censoring time; FPR; Gaussian process; Kaplan–Meier estimator; Marker-dependent censoring; Nonparametric estimator; pAUC; ROC; Survival time; TPR

Abstract

To assess the classification accuracy of a continuous diagnostic result, the receiver operating characteristic (ROC) curve is commonly used in applications. The partial area under the ROC curve (pAUC) is one of the widely accepted summary measures due to its generality and ease of probability interpretation. In the field of life science, a direct extension of the pAUC into the time-to-event setting can be used to measure the usefulness of a biomarker for disease detection over time. Without using a trapezoidal rule, we propose nonparametric estimators, which are easily computed and have closed-form expressions, for the time-dependent pAUC. The asymptotic Gaussian processes of the estimators are established and the estimated variance–covariance functions are provided, which are essential in the construction of confidence intervals. The finite sample performance of the proposed inference procedures is investigated through a series of simulations. Our method is further applied to evaluate the classification ability of CD4 cell counts for patients' survival time in the AIDS Clinical Trials Group (ACTG) 175 study. In addition, the inferences can be generalized to compare the time-dependent pAUCs between patients who received prior antiretroviral therapy and those who did not. © 2011 Elsevier B.V. All rights reserved.

* Corresponding author: C.-T. Chiang. doi:10.1016/j.jspi.2011.06.025

1. Introduction

Decision-making is an important issue in many fields such as signal detection, psychology, radiology, and medicine. For example, preoperative diagnostic tests are medically necessary and implemented in clinical preventive medicine to determine those patients for whom surgery is beneficial. For the sake of cost-saving or performance improvement, new diagnostic tests are often introduced and their classification accuracies are evaluated and compared with those of the existing ones. The ROC curve, a plot of the true positive rate (TPR) versus the false positive rate (FPR) for each possible cut point, has been widely used for this purpose when the considered diagnostic tests are continuous. One advantage of the ROC curve is that it describes the inherent classification capability of a biomarker without specifying a particular threshold. Moreover, the invariance of the ROC curve to the measurement scale provides a suitable basis for comparing different biomarkers. Generally, the more the curve moves toward the point (0,1), the better a biomarker performs.

In many applications, the area under the ROC curve (AUC), one of the most popular summary measures of the ROC curve, is used to evaluate the classification ability of a biomarker. It can be interpreted as the probability that the biomarker value of a randomly selected diseased case is greater than that of a non-diseased one. Generally, a perfect biomarker will have an AUC of one while a poor one takes a value close to 0.5. Since the AUC is the whole area under the ROC curve, relevant information might not be entirely captured in some cases. For example, two crossed ROC curves can have the same AUC but totally different performance. Furthermore, there might be limited or no data in the region of high FPR. In view of these drawbacks, it is more useful to consider the pAUC within a certain range of TPR or FPR.

To evaluate the performance of several biomarkers, McClish (1989) adopted the summary measure pAUC for the FPR over a practically relevant interval. On the other hand, Jiang et al. (1996) showed that women with false-negative findings at mammography cannot benefit from timely treatment of the cancer. Thus, these authors suggested using the pAUC with a portion of the true positive range in their applied data. Although their perspectives are different, the main frame is the same: only the practically acceptable area under the ROC curve is assessed. As mentioned by Dwyer (1996), the pAUC is a regional analysis of the ROC curve intermediate between the AUC and individual points on the ROC curve. The pAUC is a good measure of classification accuracy because it is easier for a practitioner to determine a range of TPR and/or FPR that is relevant. Several estimation and inference procedures have been proposed by Emir et al. (2000), Zhang et al. (2002), Dodd and Pepe (2003), among others. A more thorough treatment of the ROC, AUC, and pAUC can also be found in Zhou et al. (2002).

Recent research in ROC methodology has extended the binary disease status to the time-dependent setting. Let $T$ denote the time to a specific disease or death and $Y$ represent the continuous diagnostic marker measured before or at the onset of the study, with joint survivor function $S(t,y) = P(T > t, Y > y)$. For each fixed time point $t$, the disease status can be defined as a case if $T \le t$ and a control otherwise.
To evaluate the ability of $Y$ in classifying subjects who are diseased before time $t$ or not, Heagerty et al. (2000) generalized the traditional TPR and FPR to the time-dependent TPR and FPR as $T_t(y) = P(Y > y \mid T \le t)$ and $F_t(y) = P(Y > y \mid T > t)$, which can be further derived to be $(S(0,y) - S(t,y))/(1 - S_T(t))$ and $S(t,y)/S_T(t)$ with $S_T(t) = S(t,-\infty)$. For the time-dependent AUC, Chambless and Diao (2006), Chiang et al. (2009), and Chiang and Hung (2010) proposed different nonparametric estimators and developed the corresponding inference procedures. Little research exists so far on the time-dependent pAUC. We propose nonparametric estimators, which are shown to converge weakly to Gaussian processes, and the estimators for the corresponding variance–covariance functions. The established properties enable us to make inference on the time-dependent pAUC and can be applied to the time-dependent AUC because it is a special case of this summary measure.

The rest of this paper is organized as follows. In Section 2, the nonparametric estimation and inference procedures are proposed for the time-dependent pAUC. The finite sample properties of the estimators and the performance of the constructed confidence bands are studied through Monte Carlo simulations in Section 3. Section 4 presents an application of our method to the ACTG 175 study. In this section, an extended inference procedure is further provided for the comparison of the time-dependent pAUCs. Some conclusions and future work are discussed in Section 5. Finally, the proofs of the main results are given in the Appendix.
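As an illustration of the time-dependent TPR and FPR defined in the Introduction, the following minimal Python sketch evaluates $T_t(y)$ and $F_t(y)$ through the joint survivor function when complete (uncensored) failure times are available. The function names and the toy data are hypothetical and are not part of the paper; the empirical estimator of $S(t,y)$ is used only for simplicity.

```python
def joint_survivor(T, Y, t, y):
    """Empirical S(t, y) = P(T > t, Y > y) from complete (uncensored) data."""
    return sum(1 for ti, yi in zip(T, Y) if ti > t and yi > y) / len(T)


def time_dependent_rates(T, Y, t, y):
    """Return (TPR_t(y), FPR_t(y)) expressed through the joint survivor function."""
    s_ty = joint_survivor(T, Y, t, y)
    s_0y = joint_survivor(T, Y, 0.0, y)            # S(0, y); assumes all T > 0
    s_t = joint_survivor(T, Y, t, float("-inf"))   # S_T(t) = S(t, -inf)
    tpr = (s_0y - s_ty) / (1.0 - s_t)              # P(Y > y | T <= t)
    fpr = s_ty / s_t                               # P(Y > y | T > t)
    return tpr, fpr


# Hypothetical toy data: six subjects, three cases and three controls at t = 3.5.
T = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
Y = [2.5, 1.5, 0.5, 1.0, -0.5, -1.0]
tpr, fpr = time_dependent_rates(T, Y, t=3.5, y=0.75)
print(tpr, fpr)
```

In the toy data, two of the three cases and one of the three controls have markers above 0.75, so the sketch returns a TPR of 2/3 and an FPR of 1/3 (up to floating-point rounding).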

2. Estimation and inferences

In this section, we estimate the time-dependent pAUC and develop the corresponding inference procedures. Without loss of generality, the time-dependent pAUC is discussed for restricted $F_t(y)$ because the time-dependent pAUC for restricted $T_t(y)$ can be derived in the same way by reversing the roles of case and control subjects.

2.1. Estimation

Let $X$ be the minimum of $T$ and censoring time $C$, $\delta = I(X = T)$ represent the censoring status, and $q_{\alpha t} = F_t^{-1}(\alpha) = \inf\{y : F_t(y) \le \alpha\}$, $\alpha \in (0,1]$, denote the $(1-\alpha)$th quantile of $Y$ conditioning on $\{T > t\}$ at the fixed time point $t$. Following the expression $\{-\int T_t(y)\, d_y F_t(y)\}$ for the time-dependent AUC, the time-dependent pAUC $\theta_t(q_{\alpha t})$ with the $\mathrm{FPR}_t(y)$ less than $\alpha$ is derived to be a functional of $S(t,y)$:

$$\Theta_\alpha(S) = \frac{-\int (S(0,u) - S(t,u))\, I(u \ge q_{\alpha t})\, d_u S(t,u)}{S_T(t)(1 - S_T(t))} \quad \text{for } t \in (0,\tau] \text{ with } P(X > \tau) > 0. \tag{2.1}$$

Note that the value of $\theta_t(q_{\alpha t})$ for a perfect biomarker should be $\alpha$ while that of a useless one is $0.5\alpha^2$. In line with the interpretation of Cai and Dodd (2008), the rescaled time-dependent pAUC $\theta_t(q_{\alpha t})/\alpha$ can be explained as the probability that the test result of a case $\{T_i \le t\}$ is higher than that of a control $\{T_j > t\}$ with its value exceeding $q_{\alpha t}$ for $i \ne j$, i.e., $P(Y_i > Y_j \mid T_i \le t, T_j > t, Y_j > q_{\alpha t})$.

From the formulation in (2.1), an estimator of $\theta_t(q_{\alpha t})$ can be obtained if $S(t,y)$ is estimable. Under marker-dependent censoring ($T$ and $C$ are independent conditioning on $Y$), Akritas (1994) suggested estimating $S(t,y)$ by $\hat S(t,y) = n^{-1}\sum_{i=1}^n \hat S_T(t \mid Y_i) I(Y_i > y)$, where

$$\hat S_T(t \mid y) = \prod_{\{i : X_i \le t,\ \delta_i = 1\}} \left\{ 1 - \frac{K_\lambda(\hat S_Y(Y_i) - \hat S_Y(y))}{n \hat S_X(X_i \mid y)} \right\} \tag{2.2}$$

is an estimator of $S_T(t \mid y) = P(T > t \mid Y = y)$ with $\hat S_Y(y) = n^{-1}\sum_{j=1}^n I(Y_j > y)$ and $\hat S_X(t \mid y) = n^{-1}\sum_{j=1}^n I(X_j \ge t) K_\lambda(\hat S_Y(Y_j) - \hat S_Y(y))$ being estimators of $S_Y(y) = P(Y > y)$ and $S_X(t \mid y) = P(X > t \mid Y = y)$. Here, $K_\lambda(u) = (2\lambda)^{-1} I(|u| < \lambda)$ and $\lambda$ is a nonnegative smoothing parameter. Substituting $\hat S(t,y)$ for $S(t,y)$ in (2.1), $\theta_t(q_{\alpha t})$ is proposed to be estimated by

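$$\hat\theta_t(\hat q_{\alpha t}) \triangleq \Theta_\alpha(\hat S) = \frac{n^{-2}\sum_{i \ne j} (1 - \hat S_T(t \mid Y_i))\, \hat S_T(t \mid Y_j)\, \phi_{ij}(\hat q_{\alpha t})}{\hat S_T(t)(1 - \hat S_T(t))}, \tag{2.3}$$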

where $\phi_{ij}(y) = I(Y_i > Y_j > y)$, $\hat q_{\alpha t} = \hat F_t^{-1}(\alpha)$, $\hat F_t(y) = \hat S(t,y)/\hat S_T(t)$, and $\hat S_T(t) = \hat S(t,-\infty)$. In the Appendix, we show that $\sqrt{n}(\hat\theta_t(\hat q_{\alpha t}) - \theta_t(q_{\alpha t}))$ is uniformly approximated by $n^{-1/2}\sum_{i=1}^n \Psi_{\alpha i}(t)$ and converges weakly to a mean zero Gaussian process with variance–covariance function $\Sigma_\alpha(s,t) = E(\Psi_{\alpha i}(s)\Psi_{\alpha i}(t))$.

The application of the kernel function $K_\lambda(u)$ provides the nearest neighbor estimator of $S(t,y)$. An alternative choice of kernel function is possible and will lead to a different estimator of $\theta_t(q_{\alpha t})$. As mentioned in Akritas (1994), the asymptotic properties of $\hat S(t,y)$ do not depend on the choice of kernel function under some regularity conditions, and neither do those of $\hat\theta_t(\hat q_{\alpha t})$. The author further showed that any other estimator for $S(t,y)$ is at least as dispersed as $\hat S(t,y)$ and that the choice of $\lambda$ does not depend on the measurement scale of $Y$. It is not difficult to see that the estimation problem of $\theta_t(q_{\alpha t})$ becomes that of $S(t,y)$. From this perspective, the proposed estimation procedure can be extended to any censoring or truncation mechanism provided that $S(t,y)$ is estimable.

Alternatively, one might be interested in making inference on the time-dependent pAUC $(\theta_t(q_{\alpha' t}) - \theta_t(q_{\alpha t}))$ over the range $[\alpha, \alpha']$, $0 \le \alpha < \alpha' \le 1$, of $F_t(y)$. The estimator $(\hat\theta_t(\hat q_{\alpha' t}) - \hat\theta_t(\hat q_{\alpha t}))$ is suggested and the limiting Gaussian process of $\sqrt{n}\{(\hat\theta_t(\hat q_{\alpha' t}) - \hat\theta_t(\hat q_{\alpha t})) - (\theta_t(q_{\alpha' t}) - \theta_t(q_{\alpha t}))\}$ is a direct consequence of the large sample property of $\hat\theta_t(\hat q_{\alpha t})$.

When the complete failure time data $\{T_i, Y_i\}_{i=1}^n$ are available, $S(t,y)$ can be estimated by the empirical estimator $\tilde S(t,y) = n^{-1}\sum_{i=1}^n I(T_i > t, Y_i > y)$. A natural estimator for $\theta_t(q_{\alpha t})$ is obtained as

$$\tilde\theta_t(\tilde q_{\alpha t}) = \frac{n^{-2}\sum_{i \ne j} I(T_i \le t, T_j > t)\, \phi_{ij}(\tilde q_{\alpha t})}{\tilde S_T(t)(1 - \tilde S_T(t))}, \tag{2.4}$$

where $\tilde q_{\alpha t} = \tilde F_t^{-1}(\alpha)$, $\tilde F_t(y) = \tilde S(t,y)/\tilde S_T(t)$, and $\tilde S_T(t) = \tilde S(t,-\infty)$. By substituting the disease and disease-free groups for the time-varying case and control ones, $\theta_t(q_{\alpha t})$ and the estimator $\tilde\theta_t(\tilde q_{\alpha t})$ reduce to the time-invariant pAUC and the nonparametric estimator of Dodd and Pepe (2003). By an argument similar to the proof of the asymptotic Gaussian process of $\hat\theta_t(\hat q_{\alpha t})$, it is straightforward to derive that $\sqrt{n}(\tilde\theta_t(\tilde q_{\alpha t}) - \theta_t(q_{\alpha t}))$ converges weakly to a Gaussian process with mean zero and variance–covariance function $\Sigma^*_\alpha(s,t) = E(\Psi^*_{\alpha i}(s)\Psi^*_{\alpha i}(t))$, where

$$\Psi^*_{\alpha i}(t) = \frac{U^*_i(t,q_{\alpha t}) + \eta(t,q_{\alpha t}) V^*_i(t,-\infty) + (\alpha S_T(t) - S_Y(q_{\alpha t}))(V^*_i(t,q_{\alpha t}) - \alpha V^*_i(t,-\infty))}{S_T(t)(1 - S_T(t))},$$

$U^*_i(t,y) = E(h^*_{ij}(t,y) + h^*_{ji}(t,y) \mid T_i, Y_i) - 2H(t,y)$, $h^*_{ij}(t,y) = I(T_i \le t, T_j > t)\phi_{ij}(y)$, and $V^*_i(t,y) = I(T_i > t, Y_i > y)$.

2.2. Inference procedures on the time-dependent pAUC

The confidence intervals for $\theta_t(q_{\alpha t})$ and $(\theta_t(q_{\alpha' t}) - \theta_t(q_{\alpha t}))$ can be constructed from the asymptotic Gaussian processes and the estimated variance–covariance functions. Replacing the parameters with their sample analogues, $\Psi_{\alpha i}(t)$ is proposed to be estimated by

$$\hat\Psi_{\alpha i}(t) = \frac{\hat U_i(t,\hat q_{\alpha t}) + \hat\eta(t,\hat q_{\alpha t}) \hat V_i(t,-\infty) + (\alpha \hat S_T(t) - \hat S_Y(\hat q_{\alpha t}))(\hat V_i(t,\hat q_{\alpha t}) - \alpha \hat V_i(t,-\infty))}{\hat S_T(t)(1 - \hat S_T(t))}, \tag{2.5}$$

where $\hat U_i(t,y) = n^{-1}\sum_{\{j : j \ne i\}} (\hat h_{ij}(t,y) + \hat h_{ji}(t,y)) - 2\hat H(t,y) + (\hat S_Y(Y_i) - \hat S(t,y))\hat\xi_i(t) I(Y_i > y)$, $\hat V_i(t,y) = (\hat S_T(t \mid Y_i) + \hat\xi_i(t)) I(Y_i > y) - \hat S(t,y)$, and $\hat\eta(t,y) = \hat H(t,y)(2\hat S_T(t) - 1)/(\hat S_T(t) - \hat S_T^2(t))$ with $\hat h_{ij}(t,y) = (1 - \hat S_T(t \mid Y_i))\hat S_T(t \mid Y_j)\phi_{ij}(y)$, $\hat H(t,y) = n^{-2}\sum_{i \ne j} \hat h_{ij}(t,y)$, $\hat\xi_i(t) = -\hat S_T(t \mid Y_i)\int_0^t \hat S_X^{-1}(u \mid Y_i)\, d_u \hat M_i(u \mid Y_i)$, and $\hat M_i(t \mid Y_i) = I(X_i \le t)\delta_i + \ln \hat S_T(t \wedge X_i \mid Y_i)$. Thus, it is straightforward to have an estimated variance–covariance function

$$\hat\Sigma_\alpha(s,t) = \frac{1}{n}\sum_{i=1}^n \hat\Psi_{\alpha i}(s)\hat\Psi_{\alpha i}(t) \tag{2.6}$$

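and a $(1-\varsigma)$, $0 < \varsigma < 1$, pointwise confidence interval for $\theta_t(q_{\alpha t})$:

$$\hat\theta_t(\hat q_{\alpha t}) \pm \frac{Z_{\varsigma/2}}{\sqrt{n}} \hat\Sigma_\alpha^{1/2}(t,t), \tag{2.7}$$

where $Z_{\varsigma/2}$ is the $(1-\varsigma/2)$ quantile value of the standard normal distribution. With the independent and identically distributed representation $n^{-1/2}\sum_{i=1}^n \Psi_{\alpha i}(t)$, the re-sampling technique of Lin et al. (2000) is applied to determine a critical point $L_\varsigma$ so that

$$P\left( \sup_{t \in [\tau_1,\tau_2]} \left| \frac{\sqrt{n}(\hat\theta_t(\hat q_{\alpha t}) - \theta_t(q_{\alpha t}))}{\hat\Sigma_\alpha^{1/2}(t,t)} \right| < L_\varsigma \right) \doteq 1 - \varsigma \tag{2.8}$$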

for a subinterval $[\tau_1,\tau_2]$ of interest within the time period $[0,\tau]$. The validity of (2.8) enables us to construct a $(1-\varsigma)$ simultaneous confidence band for $\{\theta_t(q_{\alpha t}) : t \in [\tau_1,\tau_2]\}$ via

$$\left\{ \hat\theta_t(\hat q_{\alpha t}) \pm \frac{L_\varsigma}{\sqrt{n}} \hat\Sigma_\alpha^{1/2}(t,t) : t \in [\tau_1,\tau_2] \right\}. \tag{2.9}$$

Note that both pointwise confidence intervals and simultaneous confidence bands for $(\theta_t(q_{\alpha' t}) - \theta_t(q_{\alpha t}))$ can be constructed in the same way. When $\tilde\theta_t(\tilde q_{\alpha t})$ is applicable, the confidence bands are easily obtained by substituting $\tilde\Sigma^*_\alpha(s,t)$ for $\hat\Sigma_\alpha(s,t)$ in (2.7) and (2.9), where $\tilde\Sigma^*_\alpha(s,t) = n^{-1}\sum_{i=1}^n \tilde\Psi^*_{\alpha i}(s)\tilde\Psi^*_{\alpha i}(t)$,

$$\tilde\Psi^*_{\alpha i}(t) = \frac{\tilde U^*_i(t,\tilde q_{\alpha t}) + \tilde\eta(t,\tilde q_{\alpha t}) V^*_i(t,-\infty) + (\alpha \tilde S_T(t) - \hat S_Y(\tilde q_{\alpha t}))(V^*_i(t,\tilde q_{\alpha t}) - \alpha V^*_i(t,-\infty))}{\tilde S_T(t)(1 - \tilde S_T(t))},$$

$\tilde U^*_i(t,y) = n^{-1}\sum_{\{j : j \ne i\}} (h^*_{ij}(t,y) + h^*_{ji}(t,y)) - 2\tilde H(t,y)$, $\tilde H(t,y) = n^{-2}\sum_{i \ne j} h^*_{ij}(t,y)$, and $\tilde\eta(t,y) = \tilde H(t,y)(2\tilde S_T(t) - 1)/(\tilde S_T(t) - \tilde S_T^2(t))$.

3. Numerical studies

In this section, Monte Carlo simulations are conducted to investigate the finite sample properties of the proposed estimators and the performance of the inference procedures. The continuous biomarker $Y$ is designed to follow a standard normal distribution. Conditioning on $Y = y$, the failure time $T$ and the censoring time $C$ are independently generated from a lognormal distribution with parameters $\mu = -0.15y + \ln 10$ and $\sigma = 0.3$, and an exponential distribution with scale parameter $10b\{2I(y < 0) + I(y \ge 0)\}$, where the constant $b$ is set to produce the censoring rates (c.r.) of 30% and 50%. In our numerical studies, 500 data sets with sample sizes ($n$) 250 and 500 are simulated. The estimators and the pointwise confidence intervals of $\theta_t(q_{\alpha t})$ are evaluated at the selected time points $t_{0.4}$, $t_{0.5}$, and $t_{0.6}$ with $\alpha = 0.1$, 0.3, and 1, where $t_p$ is the $p$th quantile of the distribution of $T$. Moreover, the simultaneous confidence bands for $\theta_t(q_{\alpha t})$ over the subintervals $[t_{0.4},t_{0.5}]$ and $[t_{0.4},t_{0.6}]$ are considered. Since only a small portion of cases or controls occur outside $[t_{0.4},t_{0.6}]$ under the above design, the simulation results are presented within this time period.

When survival times are subject to censoring, an appropriate smoothing parameter becomes necessary in the estimation of $\theta_t(q_{\alpha t})$. One usually attempts to select a bandwidth that minimizes the asymptotic mean squared error of an estimator, which is obtained by using the plug-in method for unknown parameters. This approach, however, would lead to further bandwidth selection problems and is infeasible in our current setting. For the bandwidth selection, we propose a simple and easily implemented data-driven method.
This procedure is to find a bandwidth, say, $\lambda_{opt}$, which minimizes the following integrated squared error

$$\mathrm{ISE}(\lambda) = \int_0^1 (\hat S_e(u) - (1-u))^2\, dN_{e_i}(u), \tag{3.1}$$

where $\hat S_e(u)$ is the Kaplan–Meier estimator computed based on the data $\{e_i, \delta_i\}_{i=1}^n$, $e_i = 1 - \hat S_T^{(-i)}(X_i \mid Y_i)$, $\hat S_T^{(-i)}(t \mid y)$ is computed as $\hat S_T(t \mid y)$ with the $i$th observation $(X_i, \delta_i, Y_i)$ deleted, and $N_{e_i}(u) = \delta_i I(e_i \le u)$. The rationale behind (3.1) is that $\{1 - S_T(X_i \mid Y_i), \delta_i\}_{i=1}^n$ can be shown to be an independent censored sample from a uniform distribution $U(0,1)$ under the validity of conditionally independent censoring. For each generated sample, $\hat\theta_t(\hat q_{\alpha t})$ and the $\hat\Psi_{\alpha i}(t)$'s are computed by using $\lambda_{opt}$ and the subjective bandwidths of 0.01 and 0.2. Among the 500 simulated samples, the bandwidths obtained from minimizing $\mathrm{ISE}(\lambda)$ in (3.1) range between 0.01 and 0.2.

Tables 1 and 2 summarize the averages and standard deviations of estimates, the standard errors, and the empirical coverage probabilities of 0.95 pointwise confidence intervals for $\theta_t(q_{\alpha t})$. From the unshown simulation results for complete failure time data, $\tilde\theta_t(\tilde q_{\alpha t})$ and $\hat\theta_t(\hat q_{\alpha t})$, computed using $\lambda_{opt}$, give a slight overestimate and a slight underestimate of $\theta_t(q_{\alpha t})$, respectively. Moreover, the variance $\Sigma^*_\alpha(t,t)$ tends to be underestimated, which leads to a lower coverage probability. As expected, the bias of $\hat\theta_t(\hat q_{\alpha t})$ increases and its standard deviation decreases as the bandwidth becomes larger. It is also detected from these tables that poor estimates of the $\Sigma_\alpha(t,t)$'s appear at extremely small or large bandwidths. In the numerical studies, a larger bias of $\hat\theta_t(\hat q_{0.1t})$ is found, mainly because it is computed by comparing only (at most) the top 10% of subjects in the control group with those in the case group. For $\alpha = 0.3$ and 1, more data are available for statistical analysis and, hence, the performance becomes better.
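To make the simulation design and the complete-data estimator (2.4) concrete, the following Python sketch generates one sample and evaluates $\tilde\theta_t(\tilde q_{\alpha t})$. It is only a sketch under stated assumptions: the function names, the seed, and the value `b=3.0` are hypothetical (the paper tunes $b$ to reach 30% or 50% censoring without reporting its value), the exponential "scale parameter" is read as the mean, and markers are assumed to have no ties.

```python
import math
import random


def simulate_sample(n, b, seed=0):
    """One sample (T, C, Y) under the Section 3 design; b is a free constant."""
    rng = random.Random(seed)
    T, C, Y = [], [], []
    for _ in range(n):
        y = rng.gauss(0.0, 1.0)
        t = rng.lognormvariate(-0.15 * y + math.log(10.0), 0.3)
        scale = 10.0 * b * (2.0 if y < 0 else 1.0)
        c = rng.expovariate(1.0 / scale)  # expovariate takes the rate 1/scale
        T.append(t)
        C.append(c)
        Y.append(y)
    return T, C, Y


def control_quantile(controls, alpha):
    """q = inf{y : F_t(y) <= alpha}, F_t being the empirical FPR among controls."""
    if alpha >= 1.0:
        return float("-inf")
    ys = sorted(controls)
    m = len(ys)
    for k, y in enumerate(ys, start=1):
        # without ties, (m - k) controls lie strictly above the k-th smallest
        if (m - k) <= alpha * m + 1e-9:
            return y
    return ys[-1]


def pauc_complete(T, Y, t, alpha):
    """Complete-data estimator (2.4): case-control pairs with Y_i > Y_j > q."""
    cases = [y for ti, y in zip(T, Y) if ti <= t]
    controls = [y for ti, y in zip(T, Y) if ti > t]
    if not cases or not controls:
        raise ValueError("need at least one case and one control at time t")
    q = control_quantile(controls, alpha)
    pairs = sum(1 for yi in cases for yj in controls if yi > yj > q)
    # n^-2 * pairs / (S_T (1 - S_T)) simplifies to pairs / (#cases * #controls)
    return pairs / (len(cases) * len(controls))


# Hypothetical run: n = 250, illustrative b, pAUC at the sample median of T.
T, C, Y = simulate_sample(250, b=3.0, seed=1)
cens_rate = sum(1 for t, c in zip(T, C) if c < t) / len(T)
t_med = sorted(T)[len(T) // 2]
est = pauc_complete(T, Y, t_med, alpha=0.3)

# Small hand-checkable example: three cases and three controls at t = 3.5.
T_toy = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
Y_toy = [2.5, 1.5, 0.5, 1.0, -0.5, -1.0]
toy_pauc = pauc_complete(T_toy, Y_toy, 3.5, 1.0 / 3.0)
toy_auc = pauc_complete(T_toy, Y_toy, 3.5, 1.0)
```

For the toy data the sketch gives 2/9 at $\alpha = 1/3$ and 8/9 at $\alpha = 1$, the latter being the ordinary time-dependent AUC. The censored-data estimator (2.3) would replace the case and control indicators with the conditional survival estimates from (2.2).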
The biases of the proposed estimators are indistinguishable in the presence of heavy censoring, whereas the standard deviations and the standard errors become larger. For each simulated sample, the estimators using $\lambda_{opt}$ provide more satisfactory results than those obtained with subjective bandwidths. One can find from Tables 1–3 that most of the coverage probabilities are systematically lower than 0.95 for the bandwidths of 0.01 and 0.2. Since the fixed bandwidth of 0.2 is much larger than the optimal ones as the sample size increases, the corresponding coverage probabilities become worse and very low compared with the nominal value. However, our automatic bandwidth selection procedure is shown to perform well in interval estimation, except for $(n, \mathrm{c.r.}, \alpha)$ of (250, 30%, 0.1), (250, 50%, 0.1), and (500, 50%, 0.1) at the time point $t_{0.6}$. As expected, a very small control group is considered at $t_{0.6}$, and a direct consequence of heavy censoring at the right tail (unstable survival estimates) leads to intervals with poor coverage probabilities. Although the Edgeworth expansion and random bootstrap approximation of Chiang et al. (2009) are alternative methods to approximate the sampling distribution of $\hat\theta_t(\hat q_{\alpha t})$, the improvement is still limited for small $\alpha$ and, hence, more subjects in the control group are needed to draw meaningful analyses. For samples without censoring, it has been observed from the unshown results that the empirical coverage probabilities of the pointwise confidence intervals computed based on $\hat\theta_t(\hat q_{\alpha t})$ with $\lambda_{opt}$ are

Table 1
The averages (Mean) and the standard deviations (SD) of 500 estimates, the standard errors (SE), and the empirical coverage probabilities (CP); c.r. = 30%.

| $\alpha$ | $\lambda$ | Time | $\theta_t(q_{\alpha t})$ | Mean (n=250) | SD (n=250) | SE (n=250) | CP (n=250) | Mean (n=500) | SD (n=500) | SE (n=500) | CP (n=500) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.1 | 0.01 | $t_{0.4}$ | 0.0264 | 0.0274 | 0.0073 | 0.0048 | 0.796 | 0.0268 | 0.0052 | 0.0040 | 0.834 |
| 0.1 | 0.01 | $t_{0.5}$ | 0.0269 | 0.0276 | 0.0073 | 0.0048 | 0.794 | 0.0272 | 0.0055 | 0.0040 | 0.846 |
| 0.1 | 0.01 | $t_{0.6}$ | 0.0279 | 0.0282 | 0.0080 | 0.0049 | 0.738 | 0.0279 | 0.0055 | 0.0041 | 0.848 |
| 0.1 | $\lambda_{opt}$ | $t_{0.4}$ | 0.0264 | 0.0219 | 0.0068 | 0.0081 | 0.954 | 0.0234 | 0.0050 | 0.0058 | 0.948 |
| 0.1 | $\lambda_{opt}$ | $t_{0.5}$ | 0.0269 | 0.0233 | 0.0070 | 0.0077 | 0.926 | 0.0248 | 0.0054 | 0.0056 | 0.932 |
| 0.1 | $\lambda_{opt}$ | $t_{0.6}$ | 0.0279 | 0.0252 | 0.0078 | 0.0075 | 0.890 | 0.0264 | 0.0055 | 0.0056 | 0.924 |
| 0.1 | 0.20 | $t_{0.4}$ | 0.0264 | 0.0192 | 0.0043 | 0.0086 | 0.974 | 0.0190 | 0.0032 | 0.0062 | 0.894 |
| 0.1 | 0.20 | $t_{0.5}$ | 0.0269 | 0.0208 | 0.0053 | 0.0082 | 0.942 | 0.0203 | 0.0037 | 0.0061 | 0.868 |
| 0.1 | 0.20 | $t_{0.6}$ | 0.0279 | 0.0231 | 0.0067 | 0.0082 | 0.918 | 0.0218 | 0.0045 | 0.0061 | 0.864 |
| 0.3 | 0.01 | $t_{0.4}$ | 0.1420 | 0.1431 | 0.0197 | 0.0141 | 0.836 | 0.1424 | 0.0140 | 0.0116 | 0.880 |
| 0.3 | 0.01 | $t_{0.5}$ | 0.1423 | 0.1424 | 0.0197 | 0.0138 | 0.844 | 0.1427 | 0.0142 | 0.0113 | 0.874 |
| 0.3 | 0.01 | $t_{0.6}$ | 0.1446 | 0.1429 | 0.0204 | 0.0141 | 0.830 | 0.1438 | 0.0137 | 0.0115 | 0.892 |
| 0.3 | $\lambda_{opt}$ | $t_{0.4}$ | 0.1420 | 0.1318 | 0.0210 | 0.0223 | 0.952 | 0.1362 | 0.0144 | 0.0154 | 0.948 |
| 0.3 | $\lambda_{opt}$ | $t_{0.5}$ | 0.1423 | 0.1337 | 0.0208 | 0.0210 | 0.938 | 0.1383 | 0.0150 | 0.0147 | 0.932 |
| 0.3 | $\lambda_{opt}$ | $t_{0.6}$ | 0.1446 | 0.1374 | 0.0213 | 0.0207 | 0.930 | 0.1414 | 0.0143 | 0.0146 | 0.940 |
| 0.3 | 0.20 | $t_{0.4}$ | 0.1420 | 0.1272 | 0.0169 | 0.0242 | 0.982 | 0.1265 | 0.0122 | 0.0174 | 0.938 |
| 0.3 | 0.20 | $t_{0.5}$ | 0.1423 | 0.1297 | 0.0179 | 0.0227 | 0.960 | 0.1281 | 0.0129 | 0.0164 | 0.914 |
| 0.3 | 0.20 | $t_{0.6}$ | 0.1446 | 0.1342 | 0.0191 | 0.0223 | 0.956 | 0.1310 | 0.0137 | 0.0164 | 0.916 |
| 1 | 0.01 | $t_{0.4}$ | 0.7769 | 0.7775 | 0.0355 | 0.0256 | 0.874 | 0.7760 | 0.0244 | 0.0212 | 0.926 |
| 1 | 0.01 | $t_{0.5}$ | 0.7746 | 0.7729 | 0.0352 | 0.0252 | 0.844 | 0.7746 | 0.0238 | 0.0205 | 0.906 |
| 1 | 0.01 | $t_{0.6}$ | 0.7765 | 0.7708 | 0.0353 | 0.0256 | 0.858 | 0.7743 | 0.0247 | 0.0207 | 0.886 |
| 1 | $\lambda_{opt}$ | $t_{0.4}$ | 0.7769 | 0.7555 | 0.0398 | 0.0396 | 0.940 | 0.7665 | 0.0263 | 0.0268 | 0.948 |
| 1 | $\lambda_{opt}$ | $t_{0.5}$ | 0.7746 | 0.7531 | 0.0391 | 0.0387 | 0.934 | 0.7631 | 0.0263 | 0.0264 | 0.946 |
| 1 | $\lambda_{opt}$ | $t_{0.6}$ | 0.7765 | 0.7557 | 0.0385 | 0.0392 | 0.946 | 0.7633 | 0.0269 | 0.0270 | 0.944 |
| 1 | 0.20 | $t_{0.4}$ | 0.7769 | 0.7479 | 0.0341 | 0.0426 | 0.968 | 0.7436 | 0.0237 | 0.0305 | 0.888 |
| 1 | 0.20 | $t_{0.5}$ | 0.7746 | 0.7461 | 0.0341 | 0.0415 | 0.956 | 0.7431 | 0.0236 | 0.0298 | 0.894 |
| 1 | 0.20 | $t_{0.6}$ | 0.7765 | 0.7486 | 0.0339 | 0.0422 | 0.968 | 0.7442 | 0.0242 | 0.0306 | 0.908 |

closer to the nominal level of 0.95 than those based on $\tilde\theta_t(\tilde q_{\alpha t})$. Similar conclusions can also be drawn from Table 3 for the simultaneous coverage probabilities. The empirical coverage probabilities of the simultaneous confidence band (2.9) with $\lambda_{opt}$ roughly stay around 0.95 for larger sample size, lower censoring rate, smaller quantile of $Y$, and shorter time period.

4. A data example: ACTG 175 study

In the ACTG 175 study (Hammer et al., 1996), the classification accuracy of CD4 cell counts for the time in weeks from entry to AIDS diagnosis or death might depend on whether patients received prior antiretroviral therapy. A total of 2467 HIV-1-infected patients, who were recruited between December 1991 and October 1992, are considered. Of these patients, 1395 received the prior therapy while the remaining 1072 did not. During the study period, 308 patients died of all causes or were diagnosed with AIDS. To reflect the negative association between CD4 counts and the time to AIDS and death, we let $Y$ be a strictly decreasing function of the CD4 marker and $T$ be the minimum of time-to-AIDS and time-to-death. Currently, there is still no standard for clinically meaningful values of FPR for the pAUC in AIDS research. In this data analysis, we restrict our attention to the pAUC of $Y$ with the FPR less than 0.1, 0.2, or 0.3.

To simplify the presentation, the marker and the time-dependent pAUC for non-therapy and therapy patients are denoted by $(Y^{(1)}, \theta_t^{(1)}(q_{\alpha t}^{(1)}))$ and $(Y^{(2)}, \theta_t^{(2)}(q_{\alpha t}^{(2)}))$, respectively. Based on two independent data sets $\{X_i^{(1)}, \delta_i^{(1)}, Y_i^{(1)}\}_{i=1}^{n_1}$ and $\{X_i^{(2)}, \delta_i^{(2)}, Y_i^{(2)}\}_{i=1}^{n_2}$, $\hat\theta_t^{(1)}(\hat q_{\alpha t}^{(1)})$ and $\hat\theta_t^{(2)}(\hat q_{\alpha t}^{(2)})$ are computed as $\hat\theta_t(\hat q_{\alpha t})$ in (2.3) using the bandwidths of 0.042 and 0.069, which are the minimizers of $\mathrm{ISE}(\lambda)$ in (3.1). The confidence intervals for the $\theta_t^{(k)}(q_{\alpha t}^{(k)})$'s are further constructed from (2.7) and (2.9). Due to the large variation of the estimators before week 98, we only provide the estimated time-dependent pAUCs and 0.95 pointwise and simultaneous confidence bands from week 98 to the end of study in Figs. 1(a)–(f). Based on the summary measures $\theta_t^{(k)}(q_{0.1t}^{(k)})$'s, compared with 0.005, one can conclude from the simultaneous confidence bands that the CD4 count is a useless biomarker in classifying patients' survival time within the considered time period for both therapy and non-therapy patients. However, a different conclusion will be drawn at each time point based on the pointwise confidence

Table 2
The averages (Mean) and the standard deviations (SD) of 500 estimates, the standard errors (SE), and the empirical coverage probabilities (CP); c.r. = 50%.

| $\alpha$ | $\lambda$ | Time | $\theta_t(q_{\alpha t})$ | Mean (n=250) | SD (n=250) | SE (n=250) | CP (n=250) | Mean (n=500) | SD (n=500) | SE (n=500) | CP (n=500) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.1 | 0.01 | $t_{0.4}$ | 0.0264 | 0.0258 | 0.0080 | 0.0050 | 0.760 | 0.0266 | 0.0062 | 0.0042 | 0.786 |
| 0.1 | 0.01 | $t_{0.5}$ | 0.0269 | 0.0248 | 0.0079 | 0.0049 | 0.722 | 0.0268 | 0.0064 | 0.0041 | 0.772 |
| 0.1 | 0.01 | $t_{0.6}$ | 0.0279 | 0.0238 | 0.0083 | 0.0050 | 0.668 | 0.0265 | 0.0066 | 0.0043 | 0.754 |
| 0.1 | $\lambda_{opt}$ | $t_{0.4}$ | 0.0264 | 0.0233 | 0.0078 | 0.0093 | 0.954 | 0.0237 | 0.0057 | 0.0067 | 0.950 |
| 0.1 | $\lambda_{opt}$ | $t_{0.5}$ | 0.0269 | 0.0245 | 0.0086 | 0.0086 | 0.912 | 0.0250 | 0.0061 | 0.0064 | 0.918 |
| 0.1 | $\lambda_{opt}$ | $t_{0.6}$ | 0.0279 | 0.0268 | 0.0095 | 0.0081 | 0.864 | 0.0266 | 0.0071 | 0.0064 | 0.892 |
| 0.1 | 0.20 | $t_{0.4}$ | 0.0264 | 0.0195 | 0.0053 | 0.0098 | 0.970 | 0.0192 | 0.0037 | 0.0073 | 0.952 |
| 0.1 | 0.20 | $t_{0.5}$ | 0.0269 | 0.0214 | 0.0066 | 0.0094 | 0.928 | 0.0209 | 0.0047 | 0.0071 | 0.932 |
| 0.1 | 0.20 | $t_{0.6}$ | 0.0279 | 0.0235 | 0.0078 | 0.0090 | 0.906 | 0.0225 | 0.0056 | 0.0072 | 0.908 |
| 0.3 | 0.01 | $t_{0.4}$ | 0.1420 | 0.1377 | 0.0226 | 0.0148 | 0.788 | 0.1417 | 0.0167 | 0.0122 | 0.826 |
| 0.3 | 0.01 | $t_{0.5}$ | 0.1423 | 0.1330 | 0.0222 | 0.0145 | 0.768 | 0.1407 | 0.0168 | 0.0118 | 0.824 |
| 0.3 | 0.01 | $t_{0.6}$ | 0.1446 | 0.1274 | 0.0231 | 0.0147 | 0.668 | 0.1389 | 0.0176 | 0.0120 | 0.796 |
| 0.3 | $\lambda_{opt}$ | $t_{0.4}$ | 0.1420 | 0.1355 | 0.0238 | 0.0249 | 0.942 | 0.1369 | 0.0168 | 0.0176 | 0.948 |
| 0.3 | $\lambda_{opt}$ | $t_{0.5}$ | 0.1423 | 0.1366 | 0.0242 | 0.0235 | 0.928 | 0.1390 | 0.0173 | 0.0168 | 0.934 |
| 0.3 | $\lambda_{opt}$ | $t_{0.6}$ | 0.1446 | 0.1410 | 0.0247 | 0.0228 | 0.916 | 0.1418 | 0.0184 | 0.0169 | 0.928 |
| 0.3 | 0.20 | $t_{0.4}$ | 0.1420 | 0.1278 | 0.0196 | 0.0275 | 0.974 | 0.1273 | 0.0137 | 0.0201 | 0.968 |
| 0.3 | 0.20 | $t_{0.5}$ | 0.1423 | 0.1306 | 0.0212 | 0.0257 | 0.960 | 0.1295 | 0.0154 | 0.0190 | 0.954 |
| 0.3 | 0.20 | $t_{0.6}$ | 0.1446 | 0.1340 | 0.0227 | 0.0249 | 0.948 | 0.1321 | 0.0163 | 0.0190 | 0.938 |
| 1 | 0.01 | $t_{0.4}$ | 0.7769 | 0.7658 | 0.0419 | 0.0272 | 0.788 | 0.7755 | 0.0280 | 0.0222 | 0.866 |
| 1 | 0.01 | $t_{0.5}$ | 0.7746 | 0.7554 | 0.0408 | 0.0271 | 0.760 | 0.7682 | 0.0288 | 0.0218 | 0.854 |
| 1 | 0.01 | $t_{0.6}$ | 0.7765 | 0.7418 | 0.0435 | 0.0281 | 0.660 | 0.7621 | 0.0308 | 0.0223 | 0.824 |
| 1 | $\lambda_{opt}$ | $t_{0.4}$ | 0.7769 | 0.7605 | 0.0438 | 0.0434 | 0.944 | 0.7650 | 0.0298 | 0.0304 | 0.956 |
| 1 | $\lambda_{opt}$ | $t_{0.5}$ | 0.7746 | 0.7576 | 0.0441 | 0.0425 | 0.932 | 0.7621 | 0.0303 | 0.0301 | 0.942 |
| 1 | $\lambda_{opt}$ | $t_{0.6}$ | 0.7765 | 0.7607 | 0.0449 | 0.0429 | 0.944 | 0.7626 | 0.0323 | 0.0310 | 0.934 |
| 1 | 0.20 | $t_{0.4}$ | 0.7769 | 0.7465 | 0.0393 | 0.0481 | 0.972 | 0.7457 | 0.0277 | 0.0346 | 0.930 |
| 1 | 0.20 | $t_{0.5}$ | 0.7746 | 0.7455 | 0.0399 | 0.0467 | 0.968 | 0.7424 | 0.0280 | 0.0342 | 0.928 |
| 1 | 0.20 | $t_{0.6}$ | 0.7765 | 0.7454 | 0.0413 | 0.0475 | 0.964 | 0.7420 | 0.0296 | 0.0351 | 0.924 |

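Table 3
The empirical coverage probabilities of 0.95 simultaneous confidence bands.

| c.r. | $\lambda$ | $\alpha$ | $[t_{0.4},t_{0.5}]$ (n=250) | $[t_{0.4},t_{0.6}]$ (n=250) | $[t_{0.4},t_{0.5}]$ (n=500) | $[t_{0.4},t_{0.6}]$ (n=500) |
|---|---|---|---|---|---|---|
| 30% | 0.01 | 0.1 | 0.710 | 0.606 | 0.800 | 0.754 |
| 30% | 0.01 | 0.3 | 0.780 | 0.734 | 0.860 | 0.838 |
| 30% | 0.01 | 1.0 | 0.834 | 0.802 | 0.890 | 0.868 |
| 30% | $\lambda_{opt}$ | 0.1 | 0.918 | 0.878 | 0.936 | 0.920 |
| 30% | $\lambda_{opt}$ | 0.3 | 0.932 | 0.918 | 0.938 | 0.938 |
| 30% | $\lambda_{opt}$ | 1.0 | 0.946 | 0.942 | 0.948 | 0.946 |
| 30% | 0.20 | 0.1 | 0.946 | 0.924 | 0.888 | 0.878 |
| 30% | 0.20 | 0.3 | 0.970 | 0.966 | 0.934 | 0.936 |
| 30% | 0.20 | 1.0 | 0.970 | 0.976 | 0.914 | 0.922 |
| 50% | 0.01 | 0.1 | 0.616 | 0.520 | 0.702 | 0.614 |
| 50% | 0.01 | 0.3 | 0.670 | 0.584 | 0.774 | 0.720 |
| 50% | 0.01 | 1.0 | 0.712 | 0.610 | 0.850 | 0.792 |
| 50% | $\lambda_{opt}$ | 0.1 | 0.916 | 0.844 | 0.924 | 0.886 |
| 50% | $\lambda_{opt}$ | 0.3 | 0.926 | 0.900 | 0.940 | 0.928 |
| 50% | $\lambda_{opt}$ | 1.0 | 0.940 | 0.924 | 0.958 | 0.954 |
| 50% | 0.20 | 0.1 | 0.938 | 0.896 | 0.940 | 0.896 |
| 50% | 0.20 | 0.3 | 0.964 | 0.948 | 0.966 | 0.952 |
| 50% | 0.20 | 1.0 | 0.972 | 0.968 | 0.946 | 0.942 |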

[Figure 1: six panels (a)–(f); horizontal axis: Week (100 to 180); vertical axis: pAUC.]

Fig. 1. The estimated time-dependent pAUCs (solid curve) and the 0.95 pointwise confidence intervals (dotted curve) and simultaneous confidence bands (dashed curve).

intervals. The time-dependent pAUCs $\theta_t^{(1)}(q_{\alpha t}^{(1)})$ and $\theta_t^{(2)}(q_{\alpha t}^{(2)})$ are detected to be significantly higher than $0.5\alpha^2$ for $\alpha = 0.2$ and 0.3 after week 110 and week 98, respectively. Fig. 1(a), (c), and (e) give a clear indication that the pAUCs decrease slightly over time for patients without prior therapy. However, the pAUCs stay very close to a constant throughout the study period for those with prior therapy (Fig. 1(b), (d), and (f)).

The difference in the classification accuracies of $Y^{(1)}$ and $Y^{(2)}$ can be measured by the summary index $\gamma_\alpha(t) = \theta_t^{(1)}(q_{\alpha t}^{(1)}) - \theta_t^{(2)}(q_{\alpha t}^{(2)})$. When $\alpha = 1$, $\gamma_\alpha(t)$ is the usual comparison of AUCs in the time-dependent setting. It is natural to estimate $\gamma_\alpha(t)$ by $\hat\gamma_\alpha(t) = \hat\theta_t^{(1)}(\hat q_{\alpha t}^{(1)}) - \hat\theta_t^{(2)}(\hat q_{\alpha t}^{(2)})$. Along the same lines as the proof in the Appendix, we can derive that $\sqrt{n}(\hat\gamma_\alpha(t) - \gamma_\alpha(t))$ converges weakly to a mean zero Gaussian process with variance–covariance function $\Gamma_\alpha(s,t) = \kappa^{-1}E(\Psi_{\alpha i}^{(1)}(s)\Psi_{\alpha i}^{(1)}(t)) + (1-\kappa)^{-1}E(\Psi_{\alpha i}^{(2)}(s)\Psi_{\alpha i}^{(2)}(t))$ provided that $n_1/n \to \kappa$ $(0 < \kappa < 1)$ as $n = (n_1 + n_2) \to \infty$, where $\Psi_{\alpha i}^{(k)}(t)$ is a counterpart of $\Psi_{\alpha i}(t)$, $k = 1, 2$. To make inference on $\gamma_\alpha(t)$, $\Gamma_\alpha(s,t)$ is first estimated by

$$\hat\Gamma_\alpha(s,t) = \frac{n}{n_1^2}\sum_{i=1}^{n_1} \hat\Psi_{\alpha i}^{(1)}(s)\hat\Psi_{\alpha i}^{(1)}(t) + \frac{n}{n_2^2}\sum_{i=1}^{n_2} \hat\Psi_{\alpha i}^{(2)}(s)\hat\Psi_{\alpha i}^{(2)}(t). \tag{4.1}$$

A $(1-\varsigma)$ pointwise confidence interval for $\gamma_\alpha(t)$ and a $(1-\varsigma)$ simultaneous confidence band for $\{\gamma_\alpha(t) : t \in [\tau_1,\tau_2]\}$ are respectively given via

$$\hat\gamma_\alpha(t) \pm \frac{Z_{\varsigma/2}}{\sqrt{n}} \hat\Gamma_\alpha^{1/2}(t,t) \quad \text{and} \quad \left\{ \hat\gamma_\alpha(t) \pm \frac{L_\varsigma^{(\gamma)}}{\sqrt{n}} \hat\Gamma_\alpha^{1/2}(t,t) : t \in [\tau_1,\tau_2] \right\} \tag{4.2}$$

[Figure 2: three panels (a)–(c); horizontal axis: Week (100 to 180); vertical axis: Difference of pAUC.]

Fig. 2. The estimated curves for the difference of the time-dependent pAUCs between non-therapy and therapy patients (solid curve) and the 0.95 pointwise confidence intervals (dotted curve) and simultaneous confidence bands (dashed curve).

with $L_\varsigma^{(\gamma)}$ being obtained as in (2.8). It is revealed in Fig. 2(a)–(c) that $\gamma_\alpha(t)$, $\alpha = 0.1$, 0.2, and 0.3, tends to be positive within the study period and that the difference becomes negligible as $\alpha$ increases. In other words, with small values of $\mathrm{FPR}_t(y)$, prior antiretroviral therapy might lower the discrimination ability of CD4 counts in classifying a subject's $t$-week survival. One possible explanation for this conclusion is that the prior therapy makes patients more homogeneous in survival time and CD4 counts. The estimates are further found to be around zero after about week 160. This means that for long-term survival classification the performance of CD4 counts does not depend on whether patients receive prior therapy or not. Due to the large variability in the data, we could not detect any significant difference between $\theta_t^{(1)}(q_{\alpha t}^{(1)})$ and $\theta_t^{(2)}(q_{\alpha t}^{(2)})$. Extremely large sample sizes would be necessary to demonstrate significant differences between the pAUCs.
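The pointwise interval in (4.2) can be sketched in a few lines of Python once the two pAUC estimates and the estimated influence values $\hat\Psi_{\alpha i}^{(k)}(t)$ at a fixed $t$ are available. The sketch assumes those quantities have already been computed from (2.3) and (2.5); the function names and all numbers below are hypothetical placeholders, not values from the ACTG 175 analysis, and the simultaneous band is not covered because it needs the re-sampled critical point $L_\varsigma^{(\gamma)}$.

```python
from math import sqrt
from statistics import NormalDist


def gamma_variance(psi1, psi2):
    """Estimate of Gamma_alpha(t, t) in (4.1) from influence values at a fixed t."""
    n1, n2 = len(psi1), len(psi2)
    n = n1 + n2
    return (n / n1 ** 2) * sum(p * p for p in psi1) \
        + (n / n2 ** 2) * sum(p * p for p in psi2)


def pointwise_interval(theta1, theta2, psi1, psi2, level=0.05):
    """(1 - level) pointwise interval for gamma_alpha(t), first part of (4.2)."""
    n = len(psi1) + len(psi2)
    gamma_hat = theta1 - theta2
    z = NormalDist().inv_cdf(1.0 - level / 2.0)   # (1 - level/2) normal quantile
    half_width = z * sqrt(gamma_variance(psi1, psi2) / n)
    return gamma_hat - half_width, gamma_hat + half_width


# Hypothetical inputs, chosen only so the arithmetic is easy to follow.
psi_group1 = [0.1, -0.1, 0.2, -0.2]
psi_group2 = [0.3, -0.3]
var_hat = gamma_variance(psi_group1, psi_group2)
lower, upper = pointwise_interval(0.05, 0.03, psi_group1, psi_group2)
```

With these placeholder values, $n = 6$ and the variance estimate is $(6/16)(0.10) + (6/4)(0.18) = 0.3075$, so the interval is centred at 0.02 with half-width $Z_{0.025}\sqrt{0.3075/6}$.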

5. Discussion

The time-dependent pAUC has traditionally been estimated by trapezoidal numerical integration. With that approach, the derivation of the sampling distribution becomes complicated and the computational load is prohibitively expensive. Although inferences can be developed through a bootstrap technique, there is still no rigorous theoretical justification for this procedure. We can see in this paper that the proposed estimators are simple and have explicit mathematical expressions. The confidence bands are built on the asymptotic Gaussian process of the estimators and the corresponding estimates of the asymptotic variances. As long as the sample size of the control group is large relative to the change in $(\alpha, t_p, \mathrm{c.r.})$, the estimation and inference procedures are shown to be useful through simulation studies and an application to the ACTG 175 data.

Our numerical studies show that the performance of $\hat\theta_t(\hat q_{\alpha t})$ and $\hat\Sigma_\alpha(s,t)$ is very sensitive to a small value of $\alpha$, a large quantile value of time, and a high censoring rate. The bias of $\hat\theta_t(\hat q_{\alpha t})$ tends to be larger for small $\alpha$ because the estimator is then computed on a small portion of subjects in the control group. The resulting biases in these estimators further cause low coverage probabilities of the constructed confidence intervals. Although bootstrapping approaches might solve such a problem, a large sample is usually suggested to obtain more stable estimates, especially in the presence of censoring. Moreover, the price of the marker-dependent censoring assumption is the need to find an appropriate bandwidth in estimation. For this problem, we propose a simple and easily implemented selection procedure and show its good performance through simulations.

The estimator of $\theta_t(q_{\alpha t})$ can also be derived from the bivariate estimation methods of Campbell (1981) or Burke (1988) for $S(t,y)$. However, these estimators are only valid under independent censorship, which is a restrictive condition that may not always be met in applications. One advantage of the totally independent censoring assumption is that no smoothing technique is required.

In some empirical examples, censored survival data of the form $\{X_i, \delta_i, Y_i^{(1)}, Y_i^{(2)}\}_{i=1}^n$ arise in a paired design, with $(Y_i^{(1)}, Y_i^{(2)})$ being the different biomarkers of the ith subject. The scientific interest usually focuses on comparing the discrimination abilities of $Y^{(1)}$ and $Y^{(2)}$ for a subject's survival status at each time point within the study period. Obviously, the assumptions of marker-dependent censoring made separately on $(T, C, Y^{(1)})$ and $(T, C, Y^{(2)})$ are often unreasonable in practice. Under the more flexible assumption of conditionally independent censorship ($T$ and $C$ are independent conditional on $(Y^{(1)}, Y^{(2)})$), the estimated joint survivor function of $T$ and $Y^{(1)}$ and that of $T$ and $Y^{(2)}$ in this article are inadequate for the estimation of $\gamma_\alpha(t)$ without modification. It is worthwhile to investigate the associated comparison procedure in our future study.
Acknowledgments

The corresponding author's research was partially supported by the National Science Council Grants 97-2118-M-002020-MY2 and 99-2118-M-002-003 (Taiwan). We would like to thank the referee for valuable comments.

Appendix A

For the proof of the main results, the assumptions in Akritas (1994) and the conditions (A1: $f_t(y) = -\partial F_t(y)/\partial y$ exists with $\inf_t f_t(q_{\alpha t}) > 0$) and (A2: $\sup_t |\varsigma^{-1}\{F_t(q_{\alpha t} + \varsigma) - F_t(q_{\alpha t})\} + f_t(q_{\alpha t})| \to 0$ as $\varsigma \to 0$) are made throughout the rest of this paper.

Asymptotic Gaussian process of $\hat\theta_t(\hat q_{\alpha t})$: From Theorem 3.1 of Akritas (1994), one has

$$\sup_{t,y}\left| \sqrt{n}(\hat S(t,y) - S(t,y)) - \frac{\sqrt{n}}{n}\sum_{i=1}^n V_i(t,y) \right| = o_p(1), \tag{A.1}$$

where $V_i(t,y) = (S_T(t|Y_i) + \xi_i(t))I(Y_i > y) - S(t,y)$ with $\xi_i(t) = -S_T(t|Y_i)\int_0^t S_X^{-1}(u|Y_i)\, d_u M_i(u|Y_i)$, $M_i(t|Y_i) = I(X_i \le t)\delta_i + \ln S_T(t \wedge X_i|Y_i)$, and $t \wedge X_i = \min\{t, X_i\}$. Let $h_{ij}(t,y) = (1 - S_T(t|Y_i))S_T(t|Y_j)\phi_{ij}(y)$ and $H(t,y) = E(h_{ij}(t,y))$. The uniform consistency of $\hat S_T(t|y)$ (cf. Dabrowska, 1987) ensures that

$$\hat H(t,y) = \frac{1}{n^2}\sum_{i \ne j} h_{ij}(t,y) + \frac{1}{n^2}\sum_{i \ne j} (S_T(t|Y_i) - \hat S_T(t|Y_i))S_T(t|Y_j)\phi_{ij}(y) + \frac{1}{n^2}\sum_{i \ne j} (1 - S_T(t|Y_i))(\hat S_T(t|Y_j) - S_T(t|Y_j))\phi_{ij}(y) + r_{1n}(t,y) \tag{A.2}$$

with $\sup_{t,y}|r_{1n}(t,y)| = o_p(n^{-1/2})$. By a direct calculation and (A.1), a simplified form of the second term on the right-hand side of (A.2) is obtained as follows:

$$\frac{1}{n}\sum_{j=1}^n S_T(t|Y_j)I(Y_j > y)\left\{ \frac{1}{n}\sum_{i=1}^n S_T(t|Y_i)I(Y_i > Y_j) - \hat S(t,Y_j) \right\} = -\frac{1}{n^2}\sum_{i,j} S_T(t|Y_j)\xi_i(t)\phi_{ij}(y) + r_{2n}(t,y), \tag{A.3}$$

where $\sup_{t,y}|r_{2n}(t,y)| = o_p(n^{-1/2})$. Similarly, the third term can be expressed as

$$\frac{1}{n^2}\sum_{i,j} (1 - S_T(t|Y_i))\xi_j(t)\phi_{ij}(y) + r_{3n}(t,y) \tag{A.4}$$

with $\sup_{t,y}|r_{3n}(t,y)| = o_p(n^{-1/2})$. It follows from (A.2)–(A.4), the decomposition of a U-statistic into a sum of degenerate U-statistics (Serfling, 1980), and Theorem 7 of Sherman (1994) that

$$\sup_{t,y}\left| \sqrt{n}(\hat H(t,y) - H(t,y)) - \frac{\sqrt{n}}{n}\sum_{i=1}^n U_i(t,y) \right| = o_p(1), \tag{A.5}$$

where $U_i(t,y) = E(h_{ij}(t,y) + h_{ji}(t,y) \mid X_i, Y_i, \delta_i) - 2H(t,y) + (S_Y(Y_i) - S(t,y))\xi_i(t)I(Y_i > y)$. By the Taylor expansion of $\hat\theta_t(y) = \hat H(t,y)\{\hat S_T(t)(1 - \hat S_T(t))\}^{-1}$ at $(\hat H(t,y), \hat S_T(t)) = (H(t,y), S_T(t))$, (A.1) and (A.5), one has

$$\sup_{t,y}\left| \sqrt{n}(\hat\theta_t(y) - \theta_t(y)) - \frac{\sqrt{n}}{n}\sum_{i=1}^n \frac{U_i(t,y) + \eta(t,y)V_i(t,-\infty)}{S_T(t)(1 - S_T(t))} \right| = o_p(1), \tag{A.6}$$

where $\eta(t,y) = H(t,y)(2S_T(t) - 1)(S_T(t) - S_T^2(t))^{-1}$. Thus, $\sqrt{n}(\hat\theta_t(q_{\alpha t}) - \theta_t(q_{\alpha t}))$ can be shown to converge to a mean zero Gaussian process by an application of the functional central limit theorem.

The asymptotic Gaussian process of $\hat\theta_t(\hat q_{\alpha t})$ is established through the equality $\sqrt{n}(\hat\theta_t(\hat q_{\alpha t}) - \theta_t(q_{\alpha t})) = \sqrt{n}(\hat\theta_t(\hat q_{\alpha t}) - \theta_t(\hat q_{\alpha t})) + \sqrt{n}(\theta_t(\hat q_{\alpha t}) - \theta_t(q_{\alpha t}))$. Let $\sqrt{n}(\hat q_{\alpha t} - q_{\alpha t}) = \sqrt{n}(Q(\hat S) - Q(S))$ with $Q : S \to q_{\alpha t}$. By assumptions (A1) and (A2), the Hadamard differentiability of $Q$ is a direct result of Lemma A.1 in Daouia et al. (2008). Together with the functional delta method (cf. van der Vaart, 2000), we have

$$\sup_t \left| \sqrt{n}(\hat q_{\alpha t} - q_{\alpha t}) - \frac{\sqrt{n}}{n}\sum_{i=1}^n \frac{V_i(t,q_{\alpha t}) - \alpha V_i(t,-\infty)}{f_t(q_{\alpha t})S_T(t)} \right| = o_p(1) \tag{A.7}$$

and the weak convergence of $\sqrt{n}(\hat q_{\alpha t} - q_{\alpha t})$. It is further ensured by (a version of) Lemma 19.24 of van der Vaart (2000) and (A.6) that

$$\sup_t \left| \sqrt{n}(\hat\theta_t(\hat q_{\alpha t}) - \theta_t(\hat q_{\alpha t})) - \sqrt{n}(\hat\theta_t(q_{\alpha t}) - \theta_t(q_{\alpha t})) \right| = o_p(1). \tag{A.8}$$
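The plug-in ratio $\hat\theta_t(y) = \hat H(t,y)\{\hat S_T(t)(1 - \hat S_T(t))\}^{-1}$ on which the expansions (A.6) and (A.8) are built can be evaluated directly once the conditional survival estimates are available. The sketch below is only an illustration under stated assumptions: the kernel indicator is taken to be $I(Y_i > Y_j > y)$, the marginal estimate $\hat S_T(t)$ is taken to be the average of the conditional estimates (that is, $\hat S(t,-\infty)$), and `surv_cond` is a hypothetical input holding $\hat S_T(t|Y_i)$ from whatever smoother is used.

```python
import numpy as np


def plug_in_pauc(surv_cond, y, cutoff):
    """Evaluate H_hat(t, cutoff) / {S_hat(t) (1 - S_hat(t))} at one time point t.

    surv_cond : length-n array, estimates of S_T(t | Y_i) (assumed precomputed)
    y         : length-n array of marker values Y_i
    cutoff    : marker threshold (use -np.inf for the full AUC-type quantity)
    """
    s = np.asarray(surv_cond, dtype=float)
    y = np.asarray(y, dtype=float)
    n = y.size
    # Assumed kernel indicator: phi[i, j] = I(Y_i > Y_j > cutoff).
    phi = (y[:, None] > y[None, :]) & (y[None, :] > cutoff)
    # Double sum of (1 - s_i) * s_j * phi_ij; the diagonal vanishes because
    # Y_i > Y_i is false, so summing over all (i, j) equals summing over i != j.
    h_hat = ((1.0 - s)[:, None] * s[None, :] * phi).sum() / n**2
    # Assumed marginal survival estimate: average of the conditional estimates.
    s_t = s.mean()
    return h_hat / (s_t * (1.0 - s_t))
```

The double sum is formed as an n-by-n array, which is convenient for moderate n; a sorted cumulative-sum version would be preferable for large samples.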

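Moreover, the first order Taylor expansion of $\theta_t(\hat q_{\alpha t})$ at $\hat q_{\alpha t} = q_{\alpha t}$, the continuity of $\partial\theta_t(y)/\partial y$, $\sup_t |\hat q_{\alpha t} - q_{\alpha t}| = o_p(1)$, and the continuous mapping theorem imply that

$$\sup_t \left| \sqrt{n}(\theta_t(\hat q_{\alpha t}) - \theta_t(q_{\alpha t})) - \frac{\alpha S_T(t) - S_Y(q_{\alpha t})}{1 - S_T(t)}\, f_t(q_{\alpha t})\sqrt{n}(\hat q_{\alpha t} - q_{\alpha t}) \right| = o_p(1). \tag{A.9}$$

It follows from (A.6)–(A.9) that

$$\sup_t \left| \sqrt{n}(\hat\theta_t(\hat q_{\alpha t}) - \theta_t(q_{\alpha t})) - \frac{\sqrt{n}}{n}\sum_{i=1}^n \Psi_{\alpha i}(t) \right| = o_p(1), \tag{A.10}$$

where

$$\Psi_{\alpha i}(t) = \frac{U_i(t,q_{\alpha t}) + \eta(t,q_{\alpha t})V_i(t,-\infty) + (\alpha S_T(t) - S_Y(q_{\alpha t}))(V_i(t,q_{\alpha t}) - \alpha V_i(t,-\infty))}{S_T(t)(1 - S_T(t))}.$$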
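As a reading aid, the two algebraic steps behind the numerator of the influence function in (A.10) can be written out. This is a sketch of the bookkeeping only, using the quantities already defined in (A.6), (A.7) and (A.9); the helper notation $g(H,S)$ is introduced here for illustration and is not part of the original proof.

```latex
% Step 1: partial derivatives of g(H, S) = H / {S (1 - S)}, which give (A.6)
\frac{\partial g}{\partial H} = \frac{1}{S_T(t)(1 - S_T(t))}, \qquad
\frac{\partial g}{\partial S} = \frac{H(t,y)(2S_T(t) - 1)}{\{S_T(t)(1 - S_T(t))\}^{2}}
  = \frac{\eta(t,y)}{S_T(t)(1 - S_T(t))}.

% Step 2: substitute the summand of (A.7) into the linear term of (A.9);
% the factor f_t(q_{\alpha t}) cancels
\frac{\alpha S_T(t) - S_Y(q_{\alpha t})}{1 - S_T(t)}\, f_t(q_{\alpha t})
  \cdot \frac{V_i(t,q_{\alpha t}) - \alpha V_i(t,-\infty)}{f_t(q_{\alpha t})\, S_T(t)}
  = \frac{(\alpha S_T(t) - S_Y(q_{\alpha t}))\,(V_i(t,q_{\alpha t}) - \alpha V_i(t,-\infty))}
         {S_T(t)(1 - S_T(t))}.
```

Adding the summand of (A.6) evaluated at $y = q_{\alpha t}$ to the right-hand side of Step 2, with (A.8) justifying the replacement of $\hat q_{\alpha t}$ by $q_{\alpha t}$ in the first part of the decomposition, gives the three terms over the common denominator $S_T(t)(1 - S_T(t))$.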

Finally, the proof is completed by applying the functional central limit theorem to the approximated term $n^{-1/2}\sum_{i=1}^n \Psi_{\alpha i}(t)$ in (A.10).

References

Akritas, M.G., 1994. Nearest neighbor estimation of a bivariate distribution under random censoring. Ann. Statist. 22, 1299–1327.
Burke, M.D., 1988. Estimation of a bivariate distribution function under random censorship. Biometrika 75, 379–382.
Cai, T., Dodd, L.E., 2008. Regression analysis for the partial area under the ROC curve. Statist. Sinica 18, 817–836.
Campbell, G., 1981. Nonparametric bivariate estimation with randomly censored data. Biometrika 68, 417–423.
Chambless, L.E., Diao, G., 2006. Estimation of time-dependent area under the ROC curve for long-term risk prediction. Statist. Med. 25, 3474–3486.
Chiang, C.T., Wang, S.H., Hung, H., 2009. Random weighting and Edgeworth expansion for the nonparametric time-dependent AUC estimator. Statist. Sinica 19, 969–979.
Chiang, C.T., Hung, H., 2010. Nonparametric estimation for time-dependent AUC. J. Statist. Plann. Inference 140, 1162–1174.
Dabrowska, D.M., 1987. Uniform Consistency of Nearest Neighbor and Kernel Conditional Kaplan–Meier Estimates. Technical Report No. 86. University of California, Berkeley.
Daouia, A., Florens, J.P., Simar, L., 2008. Functional convergence of quantile-type frontiers with application to parametric approximations. J. Statist. Plann. Inference 138, 708–725.
Dodd, L., Pepe, M.S., 2003. Partial AUC estimation and regression. Biometrics 59, 614–623.
Dwyer, A.J., 1996. In pursuit of a piece of the ROC. Radiology 201, 621–625.
Emir, B., Wieand, S., Jung, S.H., Ying, Z., 2000. Comparison of diagnostic markers with repeated measurements: a non-parametric ROC curve approach. Statist. Med. 19, 511–523.
Hammer, S.M., Katzenstein, D.A., Hughes, M.D., Gundacker, H., Schooley, R.T., Haubrich, R.H., Henry, W.K., Lederman, M.M., Phair, J.P., Niu, M., Hirsch, M.S., Merigan, T.C., 1996. A trial comparing nucleoside monotherapy with combination therapy in HIV-infected adults with CD4 cell counts from 200 to 500 per cubic millimeter. New England J. Med. 335, 1081–1090.
Heagerty, P.J., Lumley, T., Pepe, M.S., 2000. Time-dependent ROC curves for censored survival data and a diagnostic marker. Biometrics 56, 337–344.
Jiang, Y., Metz, C.E., Nishikawa, R.M., 1996. A receiver operating characteristic partial area index for highly sensitive diagnostic tests. Radiology 201, 745–750.
Lin, D.Y., Wei, L.J., Yang, I., Ying, Z., 2000. Semiparametric regression for the mean and rate functions of recurrent events. J. Roy. Statist. Soc. B 62, 711–730.
McClish, D.K., 1989. Analyzing a portion of the ROC curve. Med. Decis. Making 9, 190–195.
Serfling, R.J., 1980. Approximation Theorems of Mathematical Statistics. Wiley, New York.
Sherman, R.P., 1994. Maximal inequalities for degenerate U-processes with applications to optimization estimators. Ann. Statist. 22, 439–459.
van der Vaart, A.W., 2000. Asymptotic Statistics. Cambridge University Press, Cambridge.
Zhang, D.D., Zhou, X.H., Freeman Jr., D.H., Freeman, J.L., 2002. A non-parametric method for the comparison of partial areas under ROC curves and its application to large health care data set. Statist. Med. 21, 701–715.
Zhou, X.H., McClish, D.K., Obuchowski, N.A., 2002. Statistical Methods in Diagnostic Medicine. Wiley, New York.