Computational Statistics and Data Analysis 55 (2011) 3183–3196
Some variants of adaptive sampling procedures and their applications

Raghu Nandan Sengupta a,∗, Angana Sengupta b

a Department of Industrial & Management Engineering, Indian Institute of Technology Kanpur, Kanpur - 208 016, India
b Delhi Public School Kalyanpur Kanpur, Kanpur - 208 017, India
Article history: Received 19 November 2009; received in revised form 24 May 2011; accepted 29 May 2011; available online 12 June 2011.

Keywords: Decision analysis; Normal; Exponential; Gamma; Extreme value; Sequential sampling; Loss functions; Squared error loss; Linear exponential loss; Bounded risk; Simulation; Business applications
Abstract: As a sampling technique, sequential analysis facilitates efficient statistical inference by requiring fewer observations than the fixed-sample method. The optimal stopping rule dictates the sample size and hence the statistical inference deduced thereafter. In this research we propose three variants of the existing multistage sampling procedures, which we name the (i) Jump and Crawl (JC), (ii) Batch Crawl and Jump (BCJ) and (iii) Batch Jump and Crawl (BJC) sequential sampling methods. We use the (i) normal, (ii) exponential, (iii) gamma and (iv) extreme value distributions for point estimation problems under bounded risk conditions. We highlight the efficacy of choosing the right adaptive sampling plan for the bounded risk problems for these four distributions under two different loss functions, namely the (i) squared error loss (SEL) and (ii) linear exponential (LINEX) loss functions. We compare and analyze our proposed methods against existing sequential sampling techniques, and the importance of this study is highlighted using extensive simulation runs. Crown Copyright © 2011 Published by Elsevier B.V. All rights reserved.
1. Introduction

Multistage sampling, or sequential analysis, was first developed as a tool by Wald along with Wolfowitz (Wald and Wolfowitz, 1945) for more efficient industrial quality control. The same approach was independently developed at about the same time by Alan Turing, to test the hypothesis of whether different messages coded by German Enigma machines should be connected and analyzed together. Wald (1947) developed this branch of statistics using the concept of the sequential probability ratio test (SPRT). In the course of time, different variations of multistage sampling techniques have been developed, such as those proposed by Hall (1981, 1983), Liu (1997), Mukhopadhyay (1990), Mukhopadhyay and Solanky (1991), Ray (1957) and Stein (1945). As a sampling technique this method is efficient, as it requires fewer observations than the fixed-sample method. A few good references are Ghosh et al. (1997), Ghosh and Sen (1991), Govindarajulu (2004), Mukhopadhyay et al. (2004), Mukhopadhyay and de Silva (2009), Mukhopadhyay and Solanky (1994), Schmitz (1993), Siegmund (1985), Wald (1947) and Zacks (2009). In recent times sequential analysis has found wide application in areas like regime switching models (Carvalhoa and Lopes, 2007; Bollen et al., 2000), the study of optimal clinical trials and novel therapies (Cui et al., 2009; Salvan, 1990; Orawo and Christen, 2009;
∗ Corresponding author. Tel.: +91 512 259 6607; fax: +91 512 259 7553. E-mail address: [email protected] (R.N. Sengupta).
doi:10.1016/j.csda.2011.05.020
Prado, in press; Todd et al., in press), and the analysis of financial time series (Bai and Perron, 2002; Alp and Demetrescu, 2010). In this paper we propose three different variants of the existing multistage sampling procedures available in the literature, and name them the (i) Jump and Crawl (JC), (ii) Batch Crawl and Jump (BCJ) and (iii) Batch Jump and Crawl (BJC) sequential sampling methodologies. We consider the normal, exponential, gamma and extreme value distributions separately to show the advantage of using the right adaptive sampling plan. We solve the bounded risk point estimation problems for (i) µ, the location parameter, (ii) λ, again the location parameter, (iii) α, the shape parameter and finally (iv) E(X) = µ + γ_E σ (where µ and σ are the location and scale parameters respectively, while γ_E is Euler's constant) for the (i) normal, (ii) exponential, (iii) gamma and (iv) extreme value distributions respectively. The two loss functions used for the bounded risk problem formulations are the squared error loss (SEL) and the linear exponential (LINEX) loss functions. To solve these bounded risk point estimation problems we use our proposed sequential sampling plans, under some specified constraints for these distributions. Finally we test the validity of our models using extensive simulation runs. The paper is organized as follows. In Section 2 we describe the concepts of a loss function and bounded risk, along with brief descriptions of a few existing sequential sampling schemes. Our proposed models are discussed in Section 3, while Section 4 deals with the background for the simulation runs conducted for the four distributions. Section 5 gives the detailed analysis related to all four sets of simulations, and we conclude in Section 6.

2. Loss functions, bounded risk, sequential sampling methodologies

2.1. Loss functions

In a point estimation problem our aim is to find an estimator, T_n, based on a sample of observations, X_1, X_2, ..., X_n, to estimate the parameter θ. T_n should be unbiased as well as consistent, but this may be unlikely because of the randomness of the sample, and imposing consistency and unbiasedness may not always lead to a unique estimate. To overcome this problem we utilize a non-negative metric, called the loss function, L(T_n, θ), where L(T_n, θ) = f(∆) is usually a function of ∆ = (T_n − θ), the error of estimation. The accuracy of an estimator under a loss function is measured by the corresponding risk function, R(T_n, θ) = E[L(T_n, θ)], and our main concern is to minimize this risk by a proper choice of the estimator T_n. Ideally we try to find the optimal estimator, T_n*, for which R(T_n*, θ) ≤ R(T_n, θ) ∀ θ ∈ Θ. The optimal estimator thus obtained is called the minimum risk estimator (MRE) with respect to that particular loss function only. Practically, for a particular loss function we may not be able to find T_n*, as the value of R(T_n, θ) usually depends on the sample size n and the parameter θ. The general risk may thus be expressed as E[L(T_n, θ) + c(n)], a sum of two terms: the first, L(T_n, θ), is non-increasing in n, while the second, the cost of sampling c(n), is non-decreasing in n. Hence it may become difficult to find the optimal value of n such that the risk E[L(T_n, θ) + c(n)] is minimized or made as low as possible. Before we discuss the method to find the optimal estimate corresponding to the minimum risk, we give examples of different types of loss functions available in the literature.
Loss functions are of various types, a few examples being the (i) squared error loss (SEL) function, (ii) weighted squared error loss function, (iii) linear loss function, (iv) non-uniform linear loss function, (v) 0-1 loss function, (vi) balanced loss function (BLF), Zellner (1994), (vii) squared exponential loss function and (viii) linear exponential (LINEX) loss function, Zellner (1986). Without going into the detail of each, we discuss the SEL and LINEX loss functions only, as our analysis for all four distributions is based on the risks associated with these two. Squared error loss (SEL) function: The SEL is of the form L(T_n, θ) = (T_n − θ)² and is the most widely used loss function. It is used in estimation problems when unbiased estimators of θ are considered, since the risk, R(T_n, θ) = E[L(T_n, θ)] = E[(T_n − θ)²], is the mean square error (MSE) of T_n, which reduces to the variance of T_n under unbiasedness. The corresponding optimal estimator, if it exists, is called the minimum variance unbiased (MVU) estimator. It must be remembered that the weighted squared error loss, w(θ)(θ − T_n)², is a variant of the SEL, where the choice of the weight w(θ) depends on the specific value of θ ∈ Θ (Θ being the parameter space). Linear exponential (LINEX) loss function: An important point which the SEL ignores is that overestimation and underestimation of θ may be of unequal importance in many situations. A loss function which takes care of this is the linear exponential (LINEX) loss function (Zellner, 1986), an asymmetric convex loss function given by L(T_n, θ) = b[e^{a(T_n−θ)} − a(T_n − θ) − 1], where a and b are the shape and scale parameters respectively. One can easily see that for a > 0 the convex loss increases almost exponentially (linearly) for positive (negative) values of the error ∆.
Therefore overestimation is of more serious concern than underestimation, while for a < 0 the trend is just the opposite. It is quite interesting to note that as a → 0, L(∆) = b[1 + a∆/1! + a²∆²/2! + ⋯ − a∆ − 1] ≈ b(a²/2)∆² + o(a²), i.e., the LINEX loss reduces to the SEL function for small values of a. The value of b does not affect the property of the loss function; it only scales the LINEX loss without affecting its shape.
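As a quick numeric illustration of these two properties (asymmetry for a > 0, and the SEL limit as a → 0), the following minimal sketch may help; it is our own code, not part of the paper, and the function name `linex_loss` and the chosen values of a, b and ∆ are illustrative assumptions:

```python
import math

def linex_loss(delta, a, b=1.0):
    """LINEX loss L(Delta) = b*(e^{a*Delta} - a*Delta - 1) for estimation error Delta."""
    return b * (math.exp(a * delta) - a * delta - 1.0)

# Asymmetry: with a > 0, overestimation (Delta > 0) is penalised far more
# heavily than underestimation of the same magnitude.
over = linex_loss(2.0, a=1.0)    # grows roughly exponentially in Delta
under = linex_loss(-2.0, a=1.0)  # grows roughly linearly in |Delta|

# SEL limit: for small a, L(Delta) is approximately b*(a^2/2)*Delta^2.
a_small, delta = 1e-4, 0.5
sel_like = 1.0 * (a_small ** 2 / 2.0) * delta ** 2
exact = linex_loss(delta, a=a_small)
```

Here `over` evaluates to e² − 3 while `under` is e⁻² + 1, confirming the asymmetry, and `exact` agrees with the quadratic approximation `sel_like` to high precision.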
2.2. Bounded risk

Let us illustrate the concept of bounded risk and its implication with a simple example. Consider a normal distribution with probability density function (p.d.f.) f(x; µ, σ) = (1/(σ√(2π))) e^{−(x−µ)²/(2σ²)}, −∞ < x < ∞, −∞ < µ < ∞, 0 < σ < ∞, with both the mean (µ) and the variance (σ²) unknown. Suppose one is interested in the point estimate of µ (the location parameter) subject to the SEL function L(T_n, θ) = (T_n − θ)². After recording the observations X_1, X_2, ..., X_n, we find the sample mean X̄_n = (1/n) Σ_{i=1}^n X_i, which is the estimator of µ. The corresponding risk is R(X̄_n, µ) = E[L(X̄_n, µ)] = σ²/n. Next suppose we require this risk not to exceed a pre-assigned known value, w (> 0). We immediately see that if σ² (the square of the scale parameter) is known, then the optimal sample size n is D = σ²/w. But with σ² unknown, the problem cannot be solved by any fixed sampling technique.
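With σ² known, the fixed-sample solution is immediate; a small sketch (our own, with illustrative numbers) computes the smallest n meeting the bound:

```python
import math

def fixed_sample_size(sigma_sq, w):
    """Smallest n with SEL risk sigma^2/n <= w, when sigma^2 is known."""
    return math.ceil(sigma_sq / w)

# Example: sigma^2 = 1 and w = 0.008 give D = 125, since 1/125 = 0.008 <= w.
D = fixed_sample_size(1.0, 0.008)
```

The whole difficulty addressed by the sequential methods below is that σ² is unknown, so this one-line computation is unavailable.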
2.3. Sequential sampling methodologies

From the previous section one is aware that, as σ² is unknown, D = σ²/w is also unknown. Hence a fixed sampling rule will not help us solve the problem of finding the minimum sample size, and one has to take recourse to some multistage or adaptive sampling technique. In order to retain continuity and keep the paper crisp, we restrict our discussion to a few of the multistage or adaptive sampling methodologies used in the literature to circumvent such a bounded risk estimation problem.

Two-stage sampling procedure: Stein (1945) considered a two-stage sampling procedure, where at the first stage a sample of size m (≥2) is drawn to estimate the unknown quantity D by N. Here N = max{m, ⌈S_m²/w⌉} is the estimate of the number of observations needed to satisfy the bound placed by w. The methodology works as follows. Start with X_1, X_2, ..., X_m observations in a single batch and determine N. If N = m, then we stop and do not take any more observations in the second stage. However, if N > m, then one samples an additional (N − m) observations in the second stage. Based on the total observations X_1, X_2, ..., X_N, the estimator X̄_N = (1/N) Σ_{i=1}^N X_i is calculated. We must remember that since S_m² is the result of a random procedure, the sample size N is also a random number.

Purely sequential sampling procedure: Ray (1957) considered a purely sequential methodology, which starts with a sample of size m (≥2) and continues to take one observation at a time until N = inf{n ≥ m : n ≥ S_n²/w}, where S_n² = (1/(n−1)) Σ_{i=1}^n (X_i − X̄_n)² is the best estimate of σ², recalculated each time the sample size n changes. In other words, the estimator is updated at each stage with the arrival of each new observation, until the stopping rule is met for the very first time. Once sampling stops, µ is estimated by X̄_N. Asymptotic results establish the superiority of the purely sequential sampling procedure over Stein's two-stage procedure from a statistical asymptotic viewpoint, though not necessarily from the practical perspective.

Three-stage sampling procedure: Even though, from the theoretical standpoint, the purely sequential procedure satisfies the asymptotic second-order efficiency property, one immediately realizes that taking one observation at a time, as is done in the purely sequential scheme, is practically inconvenient. Hence, in order to save sampling operations and at the same time maintain the second-order property, Hall (1981) and Mukhopadhyay (1990) considered the three-stage sampling procedure. The methodology is as follows. Let m = O(D^{1/r}), for some r > 1. That is, the starting sample size m is allowed to grow, but in such a manner that m/D → 0 as w → 0, which implies that m is allowed to increase at a slower rate than D as w becomes smaller. After having fixed 0 < ρ < 1 and with the starting sample size of m (≥2), let

T = max{m, ⌈ρ S_m²/w⌉},
N = max{T, ⌈S_T²/w⌉}.
Here T estimates ρ × D, a fraction of D. If T = m, then we do not sample any more in the second stage, but if T > m, one samples the difference (T − m) in one single batch. Based on the observations {X_1, X_2, ..., X_T} one now finds N, which is the estimator of D. If N = T, then we take no more samples in the third stage, but if N > T, the remaining (N − T) observations are taken in the third stage. After the sampling procedure terminates, the estimator X̄_N = (1/N) Σ_{i=1}^N X_i determines the estimated value of µ (the location parameter of the normal distribution). One must remember that even if there is a huge amount of variability in the last (N − T) observations, we are still certain to terminate the sampling procedure under the same stopping criterion. If the variability in the last (N − T) observations is appreciable, the number of observations one needs to take in the third, i.e., the last, stage would be quite
high. It is observed that such a three-stage procedure, apart from obeying the asymptotic consistency and asymptotic first-order efficiency properties, also obeys the asymptotic second-order efficiency property (Hall, 1981).

Accelerated sequential sampling procedure: Another variation of the purely sequential methodology has been considered by Hall (1983) and Mukhopadhyay and Solanky (1991), which cuts down on cost by accelerating the original sequential procedure. Here also one starts with a sample size of m (≥2) and, after having fixed 0 < ρ < 1, defines S_n*² = (n/(n−1)) S_n² and
R = inf{n ≥ m : n ≥ ρ S_n*²/w},
T = ⌈S_R*²/w⌉,
N = max{R, T}.

Thus one first samples purely sequentially, obtaining X_1, X_2, ..., X_R such that R estimates ρ × D, and then proceeds to estimate D by N. If T = R, then we take no more samples, but if T > R, then one samples (T − R) observations in one single batch, thus curtailing sampling operations, and comes up with the estimate of the location parameter µ. The asymptotic second-order properties of such accelerated sequential procedures have been developed by Hall (1983) and Mukhopadhyay and Solanky (1991).

Batch sequential sampling procedure: Liu (1997) proposed the batch sequential sampling procedure, where we first consider 0 < ρ_1 < ρ_2 < ⋯ < ρ_{k−1} < 1. We also specify r_1 ≥ r_2 ≥ ⋯ ≥ r_k ≥ 1 and the t_i's, where r_i (i = 1, 2, ..., k) denotes the minimum number of observations one takes at each and every step in the ith batch, while t_i is the number of such steps one is required to take in that ith batch. The minimum number of observations means the number of observations one takes at one go. Thus if we have k batches, then for the ith (i = 1, 2, ..., k) batch the number of observations one would take is t_i × r_i, and for the whole batch sequential sampling procedure it is m + Σ_{i=1}^k t_i × r_i, where m (≥2) is the number of observations required to initiate the procedure. Remember, these m (≥2) observations are taken at one go in the first step, which is literally the zeroth batch. One should also remember that r_1, r_2, ..., r_k and t_i ∈ Z⁺. The procedure works as follows. Start with a sample size of m (≥2) and for each batch follow the sampling rule given below:
R_1 = inf{n ≥ (m + r_1 × t_1) : n ≥ ρ_1 S_n²/w}   (batch #1, estimates ρ_1 D)
R_2 = inf{n ≥ (R_1 + r_2 × t_2) : n ≥ ρ_2 S_n²/w}   (batch #2, estimates ρ_2 D)
...
R_{k−1} = inf{n ≥ (R_{k−2} + r_{k−1} × t_{k−1}) : n ≥ ρ_{k−1} S_n²/w}   (batch #(k − 1), estimates ρ_{k−1} D)
N = inf{n ≥ (R_{k−1} + r_k × t_k) : n ≥ S_n²/w}   (batch #k, estimates D).
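To make the stopping rules of this section concrete, here is a minimal sketch, written by us and not taken from the paper, of Stein's two-stage rule and Ray's purely sequential rule for the normal-mean problem of Section 2.2; the function names and simulation settings are our own illustrative choices:

```python
import math
import random

def sample_var(xs):
    """Unbiased sample variance S_n^2."""
    n, mean = len(xs), sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / (n - 1)

def stein_two_stage(draw, m, w):
    """Pilot of size m, then N = max(m, ceil(S_m^2/w)) taken in one second batch."""
    xs = [draw() for _ in range(m)]
    n_final = max(m, math.ceil(sample_var(xs) / w))
    xs += [draw() for _ in range(n_final - m)]
    return xs

def purely_sequential(draw, m, w):
    """One observation at a time until n >= S_n^2/w, re-estimating S_n^2 each step."""
    xs = [draw() for _ in range(m)]
    while len(xs) < sample_var(xs) / w:
        xs.append(draw())
    return xs

rng = random.Random(11)
draw = lambda: rng.gauss(10.0, 1.0)  # N(mu = 10, sigma^2 = 1), as in Section 5
xs = purely_sequential(draw, m=10, w=0.008)
N, mu_hat = len(xs), sum(xs) / len(xs)  # N should land near D = sigma^2/w = 125
```

The three-stage, accelerated and batch rules follow the same pattern, merely changing when observations are taken singly and when in batches.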
3. Proposed models of multistage sampling methodologies

In line with the different multistage sampling procedures discussed in Section 2.3, we propose three variants of the sequential sampling plan, viz., the (i) Jump and Crawl (JC), (ii) Batch Crawl and Jump (BCJ) and (iii) Batch Jump and Crawl (BJC) sequential sampling procedures. These are hybrid or modified multistage sampling techniques similar to Hall (1981), Mukhopadhyay (1990) and Liu (1997).

3.1. Jump Crawl (JC) sequential sampling technique

To illustrate the Jump Crawl sequential sampling methodology, we first give the schematic diagram of the procedure in Fig. 1. The methodology works as follows. We start with an initial sample of size m (≥2) and also choose a value of ρ (0 < ρ < 1). After that we jump, i.e., collect a large sample of observations at one go, to estimate ρ × D (by R), keeping in mind that R = max{m, ⌈ρ S_m²/w⌉}. Once ρ × D is estimated we check whether we have completed our sampling procedure. If not, we proceed purely sequentially, i.e., take one observation at a time (we crawl), following the rule T = inf{n ≥ R : n ≥ S_n²/w}. Finally the random sample size N = max{R, T} is found, which estimates D. One should be aware that the choice of ρ depends on the compromise one makes between efficiency of the result and ease of sampling. A high value of ρ means we reduce our sampling effort, but at the cost of a larger value of the estimated sample size N. On the other hand, for a low ρ the reverse holds true: N as well as the estimated results are close to the optimal values, but the effort one spends in conducting the experiment is high.

Fig. 1. Jump Crawl (JC) sequential sampling technique.

3.2. Batch Crawl and Jump (BCJ) sequential sampling technique

This proposed Batch Crawl and Jump (BCJ) sequential sampling methodology follows from Liu (1997) and, for convenience of understanding, is illustrated in Fig. 2. The basic notion of this procedure differs from that of Liu (1997) in that for each individual batch we first proceed purely sequentially (crawl) and then literally jump after a certain number of stages to estimate the values of the ρ_i × D's. In order to explain the scheme of sampling, one first needs to specify γ_i, ρ_{i−1} and k, where i = 1, 2, ..., k. Here k is the number of batches, which is fixed at the beginning of the experiment depending on the experimenter's choice of the sampling scheme. Now 0 = ρ_0 < ρ_1 < ρ_2 < ⋯ < ρ_{k−1} < 1 are the fractions of the total actual sample size D, which is unknown and which one needs to estimate. The values of the ρ_{i−1}'s are also specified before the experiment. So if we consider BCJ1 as illustrated in Fig. 2, then we need to find ρ_1 × D, using its estimate R_1. The γ_i's (0 < γ_i < 1) are the corresponding fractions of the ith batch itself, i.e., they fix the stage in the ith batch up to which we continue sampling one observation at a time, i.e., continue using the purely sequential methodology. This effectively means we continue the purely sequential methodology till the (γ_i × ρ_i × D)th stage in each ith batch.

Fig. 2. Batch Crawl and Jump (BCJ) sequential sampling technique.
After that, one jumps at one go to estimate ρ_i × D. Thus the procedure works as follows. Start with a sample size of m (≥2) and for each batch follow the crawl-and-jump sampling rule according to the scheme given below:
T_1 = inf{n ≥ m : n ≥ γ_1 × ρ_1 S_n²/w},  U_1 = ⌈ρ_1 S_{T_1}²/w⌉,  R_1 = max{T_1, U_1}   (batch #1)
T_2 = inf{n ≥ R_1 : n ≥ γ_2 × ρ_2 S_n²/w},  U_2 = ⌈ρ_2 S_{T_2}²/w⌉,  R_2 = max{T_2, U_2}   (batch #2)
...
T_{k−1} = inf{n ≥ R_{k−2} : n ≥ γ_{k−1} × ρ_{k−1} S_n²/w},  U_{k−1} = ⌈ρ_{k−1} S_{T_{k−1}}²/w⌉,  R_{k−1} = max{T_{k−1}, U_{k−1}}   (batch #(k − 1))
T_k = inf{n ≥ R_{k−1} : n ≥ γ_k S_n²/w},  U_k = ⌈S_{T_k}²/w⌉,  N = max{T_k, U_k}   (batch #k).

Once sampling stops, we calculate the sample estimator X̄_N = (1/N) Σ_{i=1}^N X_i. We must remember that R_1, R_2, ..., R_{k−1} and N estimate the values of ρ_1 × D, ρ_2 × D, ..., ρ_{k−1} × D and D respectively. In choosing the values of ρ_1, ..., ρ_{k−1}, the basic trade-off between efficiency of the sampling results and cost of sampling is similar to that mentioned in Section 3.1 for the Jump Crawl (JC) sequential sampling technique.

Fig. 3. Batch Jump and Crawl (BJC) sequential sampling technique.
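The BCJ rule above admits a compact sketch; this is our own code, not the paper's, and for concreteness we take ρ_k = 1 for the final batch and the ceiling of each jump target:

```python
import math
import random

def sample_var(xs):
    n, mean = len(xs), sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / (n - 1)

def bcj(draw, m, w, rhos, gammas):
    """Batch Crawl-and-Jump: in batch i, crawl one observation at a time until
    n >= gamma_i * rho_i * S_n^2 / w (stage T_i), then jump in one go to
    ceil(rho_i * S^2 / w) (stage U_i); rhos[-1] should be 1 so N estimates D."""
    xs = [draw() for _ in range(m)]
    for rho, gamma in zip(rhos, gammas):
        while len(xs) < gamma * rho * sample_var(xs) / w:   # crawl phase
            xs.append(draw())
        target = math.ceil(rho * sample_var(xs) / w)        # jump phase
        xs += [draw() for _ in range(max(0, target - len(xs)))]
    return xs

rng = random.Random(5)
draw = lambda: rng.gauss(10.0, 1.0)
# Two intermediate batches plus a final one with rho = 1, loosely echoing
# the paper's rho_1 = 0.7, rho_2 = 0.8 and gamma choices for the normal case.
xs = bcj(draw, m=4, w=0.008, rhos=[0.7, 0.8, 1.0], gammas=[0.5, 0.6, 0.7])
N, mu_hat = len(xs), sum(xs) / len(xs)
```

Each pass through the loop performs one crawl-then-jump batch, so the number of sampling operations is roughly the number of crawled observations plus one jump per batch.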
3.3. Batch Jump and Crawl (BJC) sequential sampling technique

The Batch Jump and Crawl (BJC) sequential sampling methodology (Fig. 3) is a different variant of the methodology explained in Section 3.2, whereby for each batch we follow the jump-crawl sampling scheme rather than the crawl-jump procedure. The nomenclature of the BJC scheme is exactly the same as for the BCJ procedure (Section 3.2). Hence we give the procedure without a detailed explanation:
T_1 = ⌈γ_1 × ρ_1 S_m²/w⌉,  U_1 = max{m, T_1},  R_1 = inf{n ≥ U_1 : n ≥ ρ_1 S_n²/w}   (batch #1)
T_2 = ⌈γ_2 × ρ_2 S_{R_1}²/w⌉,  U_2 = max{R_1, T_2},  R_2 = inf{n ≥ U_2 : n ≥ ρ_2 S_n²/w}   (batch #2)
...
T_{k−1} = ⌈γ_{k−1} × ρ_{k−1} S_{R_{k−2}}²/w⌉,  U_{k−1} = max{R_{k−2}, T_{k−1}},  R_{k−1} = inf{n ≥ U_{k−1} : n ≥ ρ_{k−1} S_n²/w}   (batch #(k − 1))
T_k = ⌈γ_k S_{R_{k−1}}²/w⌉,  U_k = max{R_{k−1}, T_k},  N = inf{n ≥ U_k : n ≥ S_n²/w}   (batch #k).
After the sampling stops we find the estimator of µ using X̄_N = (1/N) Σ_{i=1}^N X_i. One should note that the value of ρ (for the JC method) or of the ρ_i and γ_i (for BCJ or BJC) would depend on two important factors: (i) the expected regret, i.e., the difference between the estimated risk and its actual value (which is denoted by w) and (ii) the number of sampling operations. Hence finding the optimal values of ρ, ρ_i and γ_i is a research problem in itself, and is not the focus of this paper.
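The BJC display above can be sketched in the same style (again our own code, not the paper's). Note that with a single batch and ρ_1 = 1, the jump-then-crawl step essentially reduces to the JC rule of Section 3.1, with γ_1 playing the role of ρ:

```python
import math
import random

def sample_var(xs):
    n, mean = len(xs), sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / (n - 1)

def bjc(draw, m, w, rhos, gammas):
    """Batch Jump-and-Crawl: in batch i, jump in one go to
    ceil(gamma_i * rho_i * S^2 / w) (stages T_i, U_i), then crawl one
    observation at a time until n >= rho_i * S_n^2 / w (stage R_i);
    rhos[-1] should be 1 so the final crawl returns N, the estimate of D."""
    xs = [draw() for _ in range(m)]
    for rho, gamma in zip(rhos, gammas):
        target = max(len(xs), math.ceil(gamma * rho * sample_var(xs) / w))  # jump
        xs += [draw() for _ in range(target - len(xs))]
        while len(xs) < rho * sample_var(xs) / w:                           # crawl
            xs.append(draw())
    return xs

rng = random.Random(7)
draw = lambda: rng.gauss(10.0, 1.0)
# Single batch with rho = 1: jump to about gamma*D, then crawl -- effectively JC.
xs = bjc(draw, m=10, w=0.008, rhos=[1.0], gammas=[0.7])
N, mu_hat = len(xs), sum(xs) / len(xs)
```

Multi-batch BJC runs would simply pass longer `rhos` and `gammas` lists, e.g. `rhos=[0.7, 0.8, 1.0]`.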
4. Background for the simulation study

The emphasis of this research work is to highlight the efficacy of the proposed models. Thus, for a better appreciation of this work, we illustrate the general framework of simulation employed for the proposed models for all four distributions. The results of these simulation runs are then discussed in Section 5.

Normal distribution: Suppose X ∼ N(µ, σ²), and one is interested in estimating the location parameter µ. In order to compare the efficiencies of the multistage sampling procedures we consider the bounded risk estimation problem of µ for both the (i) SEL and (ii) LINEX loss functions. The estimate of µ is µ̂ = X̄_n = (1/n) Σ_{i=1}^n X_i, and the bounded SEL and LINEX risk problems may be written as R(X̄_n, µ) = E[(X̄_n − µ)²] = σ²/n ≤ w and R(X̄_n, µ) = E[e^{a(X̄_n−µ)} − a(X̄_n − µ) − 1] = e^{a²σ²/(2n)} − 1 ≤ w, respectively. The corresponding optimal sample sizes, with a known value of the scale parameter σ, are then D ≥ σ²/w and D ≥ a²σ²/(2 log_e(1 + w)) respectively. In case σ is unknown, both bounded risk problems may be solved using different adaptive sampling procedures, some of which have been discussed in Section 3. In this work we determine the empirical efficiency of the proposed sequential sampling estimation procedures for the bounded risk problems for the SEL and LINEX loss functions separately. For some interesting theoretical results as well as practical examples of the bounded risk estimation problem of µ under the LINEX loss function, one may check Chattopadhyay et al. (2000), Chattopadhyay and Sengupta (2006), Sengupta (2008), Takada (2006), etc.

Exponential distribution: Consider the exponential distribution, X ∼ E(σ, λ), with location parameter λ and scale parameter σ. We highlight the advantages of using sequential sampling procedures when one is interested in estimating λ. If (X_1, X_2, ..., X_n) is the set of sample observations, then the best estimate of λ is X_(1) = min{X_1, X_2, ..., X_n}, for both known and unknown σ. Now the bounded SEL risk problem may be formulated as R(X_(1), λ) = E[(X_(1) − λ)²] = 2σ²/n² ≤ w, while the counterpart for the LINEX loss is R(X_(1), λ) = E[e^{a(X_(1)−λ)} − a(X_(1) − λ) − 1] = (1 − aσ/n)^{−1} − aσ/n − 1 ≤ w. The corresponding optimal sample sizes are D ≥ (2σ²/w)^{1/2} and D ≥ (1/2)[a + |a|(1 + 4/w)^{1/2}]σ. So with σ unknown, one may solve both these bounded risk problems using different multistage sampling methodologies. An interested reader may refer to Basu (1971) and Chattopadhyay (2000) for a good idea of the use of sequential sampling methods for estimating the location parameter λ under the SEL and the LINEX loss functions respectively.

Gamma distribution: Next assume X ∼ G(α, β, γ), i.e., the gamma distribution, with α, β and γ as the shape, scale and location parameters respectively. For simplicity assume β = 1, γ = 0; then one can easily see that α = E(X), for which the best estimate is α̂ = (1/n) Σ_{i=1}^n X_i. So the bounded SEL and LINEX risk problems for the gamma distribution are R(α̂, α) = E[(α̂ − α)²] = α/n ≤ w and R(α̂, α) = E[e^{a(α̂−α)} − a(α̂ − α) − 1] = e^{−aα}(1 − a/n)^{−nα} − 1 ≤ w (provided −1 < a < n) respectively. Furthermore, the corresponding optimal sample sizes are given by D ≥ α/w and D ≥ a²α/(2 log_e(1 + w)) (with −1 < a < n). Thus with unknown α one may solve both these bounded risk problems using any one of the sequential
analysis methods discussed earlier.

Extreme value distribution: Finally consider X ∼ EVD(µ, σ), the extreme value distribution (EVD), with µ as its location parameter and σ as the scale parameter. The estimate of σ is σ̂ = [(6/π²) × (1/(n−1)) × Σ_{i=1}^n (X_i − X̄_n)²]^{1/2}, where X̄_n = (1/n) Σ_{i=1}^n X_i, while that of µ is µ̂ = X̄_n − γ_E σ̂. Hence the SEL bounded risk problem formulation for µ is R(µ̂, µ) = E[(µ̂ − µ)²] = π²σ²/(6n) ≤ w, for which the optimal sample size is D ≥ π²σ²/(6w). In case one considers the LINEX loss function, the bounded risk problem takes the form R(µ̂, µ) = E[e^{a(µ̂−µ)} − a(µ̂ − µ) − 1] = Γ(1 − aσ/n)^n − aγ_E σ − 1 ≤ w, under the conditions that |a| ≤ n/σ and a ≠ 0. The optimal sample size for this bounded LINEX risk is D ≥ −a²σ²/(2{log_e(1 + w + aγ_E σ) + aσ}), when |a| ≤ n/σ and a ≠ 0. Hence, to find the optimal estimate of µ with unknown σ, both these bounded risk problems may be solved employing any one of the sequential sampling methods discussed. Furthermore, this estimate of µ will be used to calculate E(X): one first finds the bounded risk estimate of µ and then calculates E(X) using Ê(X) = µ̂_N + γ_E σ̂, where µ̂_N is the estimate of µ found using any one of the sequential sampling
plans.

5. Results and discussions

Consider X as the general random variable of the distribution used in our simulation study; thus X denotes the (i) normal, (ii) exponential, (iii) gamma and (iv) extreme value cases. Our study highlights the efficiencies of the three proposed sequential sampling schemes, and the exhaustive simulation results for all four distributions adequately emphasize this. As a first step we simulate 1,000,000 realized values of X from (i) X ∼ N(µ = 10, σ² = 1) (for both the SEL and LINEX cases), (ii) X ∼ E(σ = 10, λ = 4) (for both the SEL and LINEX instances), (iii) X ∼ G(α = 5, β = 1, γ = 0) (for both the SEL and LINEX runs) and (iv) X ∼ EVD(µ = 3, σ = 1)/X ∼ EVD(µ = 5, σ = 95) (for the SEL/LINEX examples). These 1,000,000 generated realized values may be considered to constitute the respective populations for the four distributions under study. Apart from treating the relevant parameter values present in the risk function for all four cases as unknown, we also face the added constraint that the level of risk, w, is bounded from above. This bound on w signifies resource restrictions, as we may have limitations on our budgetary allocations. So, to estimate the required value of the parameter of interest, one employs multistage sampling methodologies under the bounded risk consideration.

To understand the efficacies of the different sequential sampling plans, we compare the average performances (efficiencies) of the adaptive sampling procedures by simulating each metric value a total of 100,000 times (for any distribution) and then finding the average values of the metrics of interest. Using different choices of w, we simulate K (=1000) runs, i = 1, 2, ..., K, and repeat each of them B (=100) times, j = 1, 2, ..., B. Thus, with B × K simulations for different choices of w, we use a bootstrap-style methodology with B = 100 and K = 1000, the idea being to get the best values of the different estimates we wish to use for comparison. Hence, for each of the four distribution examples, our motivation is to highlight the efficiency of the proposed multistage sampling procedures by comparing the simulated (grand average) risk, R̄(θ̂_N, θ) = (1/(B × K)) Σ_{j=1}^B Σ_{i=1}^K R_{i,j}(θ̂_{N_{i,j}}, θ), the average sample size, N̄ = (1/(B × K)) Σ_{j=1}^B Σ_{i=1}^K N_{i,j}, and the average estimate, θ̂̄_N = (1/(B × K)) Σ_{j=1}^B Σ_{i=1}^K θ̂_{N_{i,j}}, with the corresponding values of w, D and θ respectively. Apart from this, one would also like to comment on how the given procedures compare among themselves. One must remember that the theoretical values are the benchmark against which any particular sequential sampling methodology should be judged. Two other useful metrics of comparison are (i) the average number of sampling operations, S̄O = (1/(B × K)) Σ_{j=1}^B Σ_{i=1}^K SO_{i,j}, and (ii) the CPU time, in seconds, required to complete a particular adaptive sampling procedure. While S̄O gives the average number of sampling operations one would need to perform for any sampling method, the CPU time provides the average time one would need to invest to conduct such an experiment. The highlight of this study is the generic nature of the analysis and the deductions one draws from all four sets of simulation runs pertaining to the four distributions. Keeping that in mind, we first state the different parameter values used for the four examples (Section 5.1) and then analyze the results and state our observations (Section 5.2).

5.1. Parameter values for different distributions

Normal distribution: Table 1(a) shows the first set of simulation runs, for the SEL function, for the Accelerated Sequential (AS) vs. JC sequential sampling procedures with the combinations w = (0.008, 0.010), ρ = (0.6, 0.7, 0.8) and m = 10. Table 1(b) contains the values of the simulation runs for the SEL case for the BCJ vs. BJC sequential sampling schemes with the combination set w = (0.008, 0.010), γ = (γ_1 = 0.5, γ_2 = 0.6, γ_3 = 0.7), ρ = (ρ_1 = 0.7, ρ_2 = 0.8) and m = 4. We know that the LINEX loss highlights the significance of overestimation vis-à-vis underestimation, depending on the value of a. So, with a = +0.8 (overestimation being penalized more), the corresponding results for (i) AS vs. JC and (ii) BCJ vs. BJC are shown in Table 2(a) and (b) respectively. The parameter combinations used for these runs (i.e., Table 2(a) and (b)) are (i) w = (0.004, 0.006), ρ = (0.6, 0.7, 0.8), m = 10 and (ii) w = (0.004, 0.006), γ = (γ_1 = 0.3, γ_2 = 0.5, γ_3 = 0.7), ρ = (ρ_1 = 0.7, ρ_2 = 0.9), m = 4, respectively. For the underestimation case (a = −1.0) the simulation results for (i) AS vs. JC and (ii) BCJ vs. BJC are depicted in Table 3(a) and (b).
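The B × K averaging scheme described above can be mimicked on a small scale; the following harness is our own sketch, with a deliberately reduced B and K so that it runs quickly, reporting the simulated risk and average sample size for the purely sequential rule under the SEL bound:

```python
import random

def one_replication(rng, m, w, mu=10.0, sigma=1.0):
    """One run of the purely sequential rule for N(mu, sigma^2);
    returns (N, squared estimation error)."""
    xs = [rng.gauss(mu, sigma) for _ in range(m)]
    def var():
        c = sum(xs) / len(xs)
        return sum((x - c) ** 2 for x in xs) / (len(xs) - 1)
    while len(xs) < var() / w:
        xs.append(rng.gauss(mu, sigma))
    err = sum(xs) / len(xs) - mu
    return len(xs), err * err

# Scaled-down stand-in for the paper's B = 100, K = 1000 design.
rng, B, K, w = random.Random(3), 5, 40, 0.01   # D = sigma^2/w = 100
reps = [one_replication(rng, 10, w) for _ in range(B * K)]
n_bar = sum(n for n, _ in reps) / len(reps)      # average sample size N-bar
risk_bar = sum(se for _, se in reps) / len(reps) # simulated risk R-bar
```

Here `n_bar` should land near the optimal D = 100 and `risk_bar` near the bound w, which is exactly the kind of comparison the tables below report for the AS, JC, BCJ and BJC procedures.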
Table 1
(a) and (b): Simulation results for X ∼ N(µ = 10, σ² = 1), i.e., the normal distribution, with the SEL function, B = 100, K = 1000, considering (a) Accelerated Sequential (AS) and Jump and Crawl (JC) with m = 10 and (b) Batch Crawl and Jump (BCJ) and Batch Jump and Crawl (BJC) with ρ1 = 0.7, ρ2 = 0.8, γ1 = 0.5, γ2 = 0.6, γ3 = 0.7, m = 4 sequential sampling procedures. Here µ̂_N = X̄_N; in (b), SO = SO1 + SO2 + SO3.

(a) AS / JC
w      D    ρ    N          SO         R̄                    µ̂_N                      CPU time (s)
0.008  126  0.6  74 / 124   64 / 114   0.013172 / 0.007946  9.996341 / 9.998960      522.83 / 851.86
0.008  126  0.7  86 / 124   76 / 114   0.011313 / 0.007946  10.001254 / 10.000606    599.20 / 848.00
0.008  126  0.8  99 / 124   89 / 114   0.009913 / 0.007946  9.999738 / 10.000087     680.16 / 853.53
0.010  101  0.6  59 / 99    49 / 89    0.016408 / 0.009915  9.998820 / 10.001149     493.45 / 678.27
0.010  101  0.7  69 / 99    59 / 89    0.014102 / 0.009915  10.002488 / 9.998018     482.06 / 675.34
0.010  101  0.8  79 / 99    69 / 89    0.012361 / 0.009914  9.998432 / 9.999236      544.48 / 689.47

(b) BCJ / BJC
w      D    N          SO (BCJ)         SO (BJC)          R̄                    µ̂_N                   CPU time (s)
0.008  126  85 / 124   75 = 35+15+25    120 = 77+14+29    0.011780 / 0.007946  9.995904 / 9.998626   184.89 / 960.05
0.010  101  67 / 99    57 = 26+11+20    95 = 61+12+22     0.014879 / 0.009914  9.997748 / 9.999248   175.08 / 753.06
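The two summary measures reported in all the tables, the average number of sampling operations over the B batches of K replications each and the CPU time, can be collected with a harness along the following lines. This is only a sketch: `run_procedure` is a purely illustrative stub standing in for one run of any of the four sampling rules, and all names are assumptions.

```python
import time
import random

def run_procedure(rng):
    """Illustrative stub for one run of a sampling rule (AS, JC, BCJ
    or BJC); returns the number of sampling operations in that run."""
    return rng.randint(50, 150)

def average_so_and_cpu(B=100, K=1000, seed=1):
    """Average sampling operations over B batches of K replications,
    i.e. (1/(B*K)) * sum_j sum_i SO_{i,j}, plus elapsed CPU seconds."""
    rng = random.Random(seed)
    start = time.process_time()
    total_so = sum(run_procedure(rng) for _ in range(B * K))
    cpu = time.process_time() - start
    return total_so / (B * K), cpu

avg_so, cpu_secs = average_so_and_cpu()
```

In the actual study the stub would be replaced by the stopping rule under test, so that SO and CPU time are measured on the same B × K replications.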
Table 2
(a) and (b): Simulation results for X ∼ N(µ = 10, σ² = 1), i.e., the normal distribution, with the LINEX loss function, B = 100, K = 1000, a = +0.8, considering (a) Accelerated Sequential (AS) and Jump and Crawl (JC) with m = 10 and (b) Batch Crawl and Jump (BCJ) and Batch Jump and Crawl (BJC) with ρ1 = 0.7, ρ2 = 0.9, γ1 = 0.3, γ2 = 0.5, γ3 = 0.7, m = 4 sampling procedures. Here µ̂_N = X̄_N; in (b), SO = SO1 + SO2 + SO3.

(a) AS / JC
w      D   ρ    N         SO        R̄                    µ̂_N                      CPU time (s)
0.004  81  0.6  47 / 79   37 / 69   0.000025 / 0.000017  10.001017 / 10.000107    347.36 / 568.03
0.004  81  0.7  55 / 79   45 / 69   0.000023 / 0.000017  9.998713 / 9.999260      401.17 / 568.13
0.004  81  0.8  63 / 79   53 / 69   0.000021 / 0.000017  10.000788 / 10.000520    461.73 / 567.50
0.006  54  0.6  31 / 52   21 / 42   0.000038 / 0.000025  10.000929 / 9.997193     242.39 / 376.36
0.006  54  0.7  36 / 52   26 / 42   0.000034 / 0.000025  10.000178 / 9.999643     272.16 / 381.64
0.006  54  0.8  41 / 52   31 / 42   0.000031 / 0.000026  10.002220 / 10.001867    313.63 / 380.84

(b) BCJ / BJC
w      D   N         SO (BCJ)        SO (BJC)         R̄                    µ̂_N                     CPU time (s)
0.004  81  53 / 79   43 = 8+16+19    75 = 46+18+11    0.000022 / 0.000017  9.999058 / 10.000127    160.11 / 633.58
0.006  54  35 / 52   25 = 3+10+12    48 = 27+12+9     0.000033 / 0.000026  9.999793 / 10.001627    152.83 / 438.73
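The bound w and the optimal sample size D reported in Tables 1–3 are linked, to first order, by the classical fixed-sample relation for the normal-mean SEL problem, where the risk of X̄_n is σ²/n. A sketch, treating D ≈ ⌈σ²/w⌉ and noting as an assumption that the paper's D values may include small second-order correction terms:

```python
import math

def first_order_optimal_n(sigma2, w):
    """Smallest fixed-sample n with SEL risk sigma^2 / n <= w for the
    normal-mean problem, i.e. the first-order optimal sample size."""
    return math.ceil(sigma2 / w)

# With sigma^2 = 1 these land near the D values of Tables 1-3, which
# may additionally include small second-order correction terms.
n_008 = first_order_optimal_n(1.0, 0.008)
n_010 = first_order_optimal_n(1.0, 0.010)
```

Since σ² is unknown in practice, the sequential procedures replace it by a running estimate, which is precisely why the stopping rules are adaptive.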
Table 3
(a) and (b): Simulation results for X ∼ N(µ = 10, σ² = 1), i.e., the normal distribution, with the LINEX loss function, B = 100, K = 1000, a = −1.0, considering (a) Accelerated Sequential (AS) and Jump and Crawl (JC) with m = 10 and (b) Batch Crawl and Jump (BCJ) and Batch Jump and Crawl (BJC) with ρ1 = 0.7, ρ2 = 0.9, γ1 = 0.3, γ2 = 0.5, γ3 = 0.7, m = 4 sequential sampling procedures. Here µ̂_N = X̄_N; in (b), SO = SO1 + SO2 + SO3.

(a) AS / JC
w      D    ρ    N          SO         R̄                    µ̂_N                      CPU time (s)
0.004  126  0.6  74 / 124   64 / 114   0.000025 / 0.000017  10.001221 / 9.996907     536.88 / 906.17
0.004  126  0.7  87 / 125   77 / 114   0.000023 / 0.000017  10.001031 / 10.000459    629.58 / 907.88
0.004  126  0.8  99 / 124   89 / 114   0.000021 / 0.000017  10.000290 / 9.997936     727.95 / 910.55
0.006  84   0.6  49 / 83    39 / 73    0.000038 / 0.000025  10.000313 / 10.000111    357.80 / 599.24
0.006  84   0.7  57 / 83    47 / 73    0.000034 / 0.000025  9.998714 / 9.995758      418.66 / 600.52
0.006  84   0.8  66 / 82    56 / 72    0.000031 / 0.000025  10.002332 / 10.003020    483.52 / 599.86

(b) BCJ / BJC
w      D    N          SO (BCJ)        SO (BJC)          R̄                    µ̂_N                     CPU time (s)
0.004  126  84 / 124   74 = 17+25+32   120 = 77+27+16    0.000022 / 0.000018  10.002424 / 9.999272    173.74 / 965.34
0.006  84   55 / 82    45 = 9+16+20    78 = 48+18+12     0.000033 / 0.000026  10.003427 / 10.002528   158.03 / 655.22
respectively. The parameter combinations of w, ρ, γ and m for a = −1.0 (Table 3(a) and (b)) are the same as those used for the run results shown in Table 2(a) and (b) (a = +0.8).

Exponential distribution: With the SEL function, Table 4(a) and (b) highlight the simulated estimates of the different values of interest for the AS vs. JC (Table 4(a)) and BCJ vs. BJC (Table 4(b)) methods, respectively. The parameter combinations used in Table 4(a) are w = (0.009, 0.010), ρ = (0.6, 0.7, 0.8), m = 10, while Table 4(b) shows the runs for the combination w = (0.009, 0.010), γ = (γ1 = 0.7, γ2 = 0.8, γ3 = 0.9), ρ = (ρ1 = 0.7, ρ2 = 0.8), m = 4. The corresponding estimates for the overestimation case (a = +0.8) under LINEX loss are depicted for the sequential sampling plans AS vs. JC in Table 5(a), for which the parameter combinations are w = (0.002, 0.003),
Table 4
(a) and (b): Simulation results for X ∼ E(σ = 10, λ = 4), i.e., the exponential distribution, with the SEL function, B = 100, K = 1000, considering (a) Accelerated Sequential (AS) and Jump and Crawl (JC) with m = 10 and (b) Batch Crawl and Jump (BCJ) and Batch Jump and Crawl (BJC) with ρ1 = 0.7, ρ2 = 0.8, γ1 = 0.7, γ2 = 0.8, γ3 = 0.9, m = 4 sequential sampling procedures. Here λ̂_N = min_{1≤i≤N} X_i; in (b), SO = SO1 + SO2 + SO3.

(a) AS / JC
w      D    ρ    N           SO          R̄                    λ̂_N                    CPU time (s)
0.009  106  0.6  89 / 148    79 / 138    0.024618 / 0.008920  4.000008 / 4.000019    660.753 / 1036.199
0.009  106  0.7  104 / 148   94 / 138    0.018127 / 0.008920  4.000028 / 4.000011    758.332 / 1053.812
0.009  106  0.8  119 / 148   109 / 138   0.013902 / 0.008920  4.000003 / 4.000016    853.055 / 1042.204
0.010  101  0.6  85 / 141    75 / 131    0.027332 / 0.009907  4.000029 / 4.000006    627.432 / 990.366
0.010  101  0.7  99 / 141    89 / 131    0.020128 / 0.009906  4.000005 / 4.000012    735.079 / 989.601
0.010  101  0.8  113 / 141   103 / 131   0.015438 / 0.009907  4.000021 / 4.000010    813.555 / 993.127

(b) BCJ / BJC
w      D    N           SO (BCJ)         SO (BJC)          R̄                    λ̂_N                    CPU time (s)
0.009  106  133 / 148   123 = 63+23+37   144 = 95+16+33    0.011112 / 0.008920  4.000014 / 4.000003    270.614 / 1112.202
0.010  101  126 / 141   116 = 59+21+36   137 = 92+16+29    0.012351 / 0.009907  4.000011 / 4.000003    269.163 / 1062.828
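For the two-parameter exponential E(σ, λ) used here, the natural (maximum-likelihood) point estimator of the location parameter λ is the sample minimum, which is consistent with the λ̂_N values sitting just above 4 throughout Tables 4–6. A minimal sketch, with the σ = 10, λ = 4 configuration of Table 4 assumed and all function names illustrative:

```python
import random

def sample_two_param_exponential(n, scale=10.0, loc=4.0, rng=None):
    """Draw n observations X = loc + scale * Exp(1) from a
    two-parameter exponential distribution."""
    rng = rng or random.Random(7)
    return [loc + rng.expovariate(1.0 / scale) for _ in range(n)]

def location_mle(xs):
    """MLE of the location parameter: the sample minimum."""
    return min(xs)

xs = sample_two_param_exponential(1000)
lam_hat = location_mle(xs)  # slightly above the true loc = 4
```

Since min X_i ≥ λ always, the estimator is biased upward by roughly σ/n, which explains the systematic tiny excess over 4 in the tables.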
Table 5
(a) and (b): Simulation results for X ∼ E(σ = 10, λ = 4), i.e., the exponential distribution, with the LINEX loss function, B = 100, K = 1000, a = +0.8, considering (a) Accelerated Sequential (AS) and Jump and Crawl (JC) with m = 10 and (b) Batch Crawl and Jump (BCJ) and Batch Jump and Crawl (BJC) with ρ1 = 0.7, ρ2 = 0.8, γ1 = 0.7, γ2 = 0.8, γ3 = 0.9, m = 4 sequential sampling procedures. Here λ̂_N = min_{1≤i≤N} X_i; in (b), SO = SO1 + SO2 + SO3.

(a) AS / JC
w      D    ρ    N           SO          R̄                    λ̂_N                    CPU time (s)
0.002  183  0.6  109 / 182   99 / 172    0.005645 / 0.001983  4.000003 / 4.000007    961.584 / 1617.564
0.002  183  0.7  127 / 182   117 / 172   0.004111 / 0.001983  4.000009 / 4.000000    1123.606 / 1617.564
0.002  183  0.8  146 / 182   136 / 172   0.003127 / 0.001983  4.000001 / 4.000024    1295.689 / 1621.090
0.003  151  0.6  89 / 151    79 / 141    0.008501 / 0.002969  4.000008 / 4.000002    790.686 / 1313.239
0.003  151  0.7  104 / 149   94 / 139    0.006177 / 0.002969  4.000004 / 4.000006    922.662 / 1310.743
0.003  151  0.8  119 / 149   109 / 139   0.004691 / 0.002969  4.000008 / 4.000004    1054.965 / 1314.612

(b) BCJ / BJC
w      D    N           SO (BCJ)          SO (BJC)           R̄                    λ̂_N                    CPU time (s)
0.001  258  230 / 256   220 = 104+51+65   252 = 172+27+53    0.001249 / 0.000994  4.000005 / 4.000009    359.362 / 2466.001
0.002  183  162 / 182   152 = 72+35+45    178 = 118+20+40    0.002514 / 0.001983  4.000005 / 4.000003    315.666 / 1704.051
Table 6
(a) and (b): Simulation results for X ∼ E(σ = 10, λ = 4), i.e., the exponential distribution, with the LINEX loss function, B = 100, K = 1000, a = −1.0, considering (a) Accelerated Sequential (AS) and Jump and Crawl (JC) with m = 10 and (b) Batch Crawl and Jump (BCJ) and Batch Jump and Crawl (BJC) with ρ1 = 0.7, ρ2 = 0.8, γ1 = 0.7, γ2 = 0.8, γ3 = 0.9, m = 4 sequential sampling procedures. Here λ̂_N = min_{1≤i≤N} X_i; in (b), SO = SO1 + SO2 + SO3.

(a) AS / JC
w      D    ρ    N           SO          R̄                    λ̂_N                    CPU time (s)
0.002  219  0.6  130 / 218   120 / 208   0.005335 / 0.001986  4.000007 / 4.000002    1168.581 / 1959.376
0.002  219  0.7  152 / 218   142 / 208   0.003966 / 0.001987  4.000005 / 4.000009    1347.637 / 1958.830
0.002  219  0.8  174 / 218   164 / 208   0.003064 / 0.001986  4.000009 / 4.000002    1563.136 / 1956.240
0.003  178  0.6  106 / 177   96 / 167    0.007934 / 0.002975  4.000008 / 4.000007    944.144 / 1560.436
0.003  178  0.7  124 / 177   114 / 167   0.005913 / 0.002975  4.000005 / 4.000009    1086.961 / 1569.314
0.003  178  0.8  141 / 177   131 / 167   0.004577 / 0.002975  4.000003 / 4.000031    1247.766 / 1556.194

(b) BCJ / BJC
w      D    N           SO (BCJ)          SO (BJC)           R̄                    λ̂_N                    CPU time (s)
0.001  312  279 / 311   269 = 128+62+79   307 = 208+33+66    0.001238 / 0.000995  4.000002 / 4.000009    402.075 / 3065.088
0.002  219  195 / 218   185 = 87+43+55    214 = 144+24+46    0.002480 / 0.001987  4.000031 / 4.000009    331.547 / 2065.394
ρ = (0.6, 0.7, 0.8), m = 10. On the other hand, for the BCJ vs. BJC sequential sampling plans, depicted in Table 5(b), the corresponding parameter set is w = (0.001, 0.002), γ = (γ1 = 0.7, γ2 = 0.8, γ3 = 0.9), ρ = (ρ1 = 0.7, ρ2 = 0.8), m = 4. Furthermore, Table 6(a) and (b) show a synopsis of the simulation results when a = −1.0 for the multistage sampling methodologies AS vs. JC and BCJ vs. BJC, respectively. The w, γ, ρ and m values for these are the same as those for the LINEX loss overestimation instance (a = +0.8), i.e., Table 5(a) and (b).
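The asymmetry between the a = +0.8 and a = −1.0 runs follows directly from the standard LINEX form L(Δ) = b(exp(aΔ) − aΔ − 1), where Δ is the estimation error. A small sketch; the scale b = 1 is an assumption, and this is the generic LINEX form rather than the paper's exact risk expressions:

```python
import math

def linex_loss(delta, a, b=1.0):
    """Generic LINEX loss b*(exp(a*delta) - a*delta - 1), where
    delta is the estimation error (estimate minus true value)."""
    return b * (math.exp(a * delta) - a * delta - 1.0)

# With a = +0.8, an overestimate (delta > 0) is penalized more
# heavily than an underestimate of the same magnitude; a < 0
# (e.g. a = -1.0) reverses the asymmetry.
over = linex_loss(+0.5, a=+0.8)
under = linex_loss(-0.5, a=+0.8)
```

For small |aΔ| the loss behaves like (b·a²/2)Δ², so LINEX reduces approximately to a squared error loss near the true value, with the asymmetry appearing only for larger errors.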
Table 7
(a) and (b): Simulation results for X ∼ G(α = 5, β = 1, γ = 0), i.e., the gamma distribution, with the SEL function, B = 100, K = 1000, considering (a) Accelerated Sequential (AS) and Jump and Crawl (JC) with m = 10 and (b) Batch Crawl and Jump (BCJ) and Batch Jump and Crawl (BJC) with ρ1 = 0.7, ρ2 = 0.8, γ1 = 0.6, γ2 = 0.7, γ3 = 0.8, m = 4 sequential sampling procedures. Here α̂_N = X̄_N; in (b), SO = SO1 + SO2 + SO3.

(a) AS / JC
w     D    ρ    N           SO          R̄                    α̂_N                    CPU time (s)
0.01  501  0.7  350 / 501   340 / 491   0.014298 / 0.009988  4.991384 / 4.996036    809.984 / 1181.718
0.01  501  0.8  400 / 501   390 / 491   0.012510 / 0.009988  4.999725 / 4.996925    935.907 / 1188.703
0.01  501  0.9  449 / 501   439 / 491   0.011151 / 0.009988  4.997953 / 4.998963    1076.484 / 1183.906
0.02  251  0.7  175 / 251   165 / 241   0.028630 / 0.019952  4.992816 / 4.998004    404.203 / 578.156
0.02  251  0.8  200 / 251   190 / 241   0.025047 / 0.019952  4.993773 / 4.997757    459.969 / 574.734
0.02  251  0.9  225 / 251   215 / 241   0.022260 / 0.019952  4.994360 / 4.992071    517.390 / 573.282

(b) BCJ / BJC
w     D    N           SO (BCJ)           SO (BJC)           R̄                    α̂_N                    CPU time (s)
0.01  501  404 / 501   394 = 203+71+120   497 = 342+52+103   0.012406 / 0.009988  5.003337 / 4.999364    178.828 / 1240.812
0.02  251  201 / 251   191 = 96+35+60     247 = 167+27+53    0.024956 / 0.019953  4.994732 / 4.998398    114.406 / 613.516
Table 8
(a) and (b): Simulation results for X ∼ G(α = 5, β = 1, γ = 0), i.e., the gamma distribution, with the LINEX loss function, B = 100, K = 1000, a = +0.8, considering (a) Accelerated Sequential (AS) and Jump and Crawl (JC) with m = 10 and (b) Batch Crawl and Jump (BCJ) and Batch Jump and Crawl (BJC) with ρ1 = 0.7, ρ2 = 0.8, γ1 = 0.7, γ2 = 0.8, γ3 = 0.9, m = 10 sequential sampling procedures. Here α̂_N = X̄_N; in (b), SO = SO1 + SO2 + SO3.

(a) AS / JC
w     D    ρ    N           SO          R̄                        α̂_N                    CPU time (s)
0.01  322  0.7  227 / 322   217 / 312   −0.999273 / −0.999658    4.899971 / 4.996633    570.893 / 821.9422
0.01  322  0.8  257 / 322   247 / 312   −0.999657 / −0.999659    4.998107 / 4.999112    663.940 / 797.3391
0.01  322  0.9  288 / 322   278 / 312   −0.999657 / −0.999661    4.997129 / 5.001849    710.5567 / 815.2728
0.02  162  0.7  113 / 162   103 / 152   −0.999646 / −0.999650    4.993756 / 4.991776    309.0546 / 419.6498
0.02  162  0.8  129 / 162   119 / 152   −0.999647 / −0.999651    4.992525 / 4.994285    340.6197 / 414.6433
0.02  162  0.9  145 / 162   135 / 152   −0.999651 / −0.999649    4.995651 / 4.990083    383.6076 / 412.4438

(b) BCJ / BJC
w     D    N           SO (BCJ)          SO (BJC)           R̄                        α̂_N                    CPU time (s)
0.01  322  290 / 322   280 = 149+49+82   312 = 216+32+64    −0.999657 / −0.999660    4.996630 / 5.001139    156.4205 / 821.8930
0.02  162  146 / 162   136 = 70+24+42    152 = 103+17+32    −0.999649 / −0.999650    4.992363 / 4.992775    112.2666 / 418.4169
Gamma distribution: Tables 7(a), (b) and 8(a), (b) summarize the results when one takes into account the gamma distribution for the AS vs. JC (SEL), BCJ vs. BJC (SEL), AS vs. JC (LINEX, a = +0.8) and BCJ vs. BJC (LINEX, a = +0.8) instances, respectively. For the SEL example (AS vs. JC, i.e., Table 7(a)) the assumed parameter values are w = (0.01, 0.02), ρ = (0.7, 0.8, 0.9), m = 10, while for BCJ vs. BJC under SEL (Table 7(b)) the corresponding values are w = (0.01, 0.02), γ = (γ1 = 0.6, γ2 = 0.7, γ3 = 0.8), ρ = (ρ1 = 0.7, ρ2 = 0.8), m = 4. When we switch over to the LINEX loss case with a = +0.8, Table 8(a) summarizes the AS vs. JC comparison, for which we consider the parameter values w = (0.01, 0.02), ρ = (0.7, 0.8, 0.9), m = 10. Finally, the BCJ vs. BJC comparison results are highlighted in Table 8(b) with w = (0.01, 0.02), γ = (γ1 = 0.7, γ2 = 0.8, γ3 = 0.9), ρ = (ρ1 = 0.7, ρ2 = 0.8), m = 10 as the parameter set for a = +0.8.

Extreme value distribution: The final two sets of tables, i.e., Tables 9(a), (b) and 10(a), (b), highlight the findings of the simulation runs when the distribution is extreme valued. For the SEL bounded risk example, one considers the combinations (i) w = (0.007, 0.008), ρ = (0.6, 0.7, 0.8), m = 10 and (ii) w = (0.007, 0.008), γ = (γ1 = 0.5, γ2 = 0.6, γ3 = 0.7), ρ = (ρ1 = 0.6, ρ2 = 0.7), m = 10 to evaluate the performance of (i) AS vs. JC (Table 9(a)) and (ii) BCJ vs. BJC (Table 9(b)), respectively. For the LINEX case we assume the asymmetric loss function with a fixed level of bounded risk, w = +0.05. Our aim is to study the effect of a change in a, the shape parameter of the LINEX loss, on the sample size as well as on the estimate of E(X) (which is of interest to us). The results are highlighted in Table 10(a) (AS vs. JC) and Table 10(b) (BCJ vs. BJC). For the AS vs. JC simulation runs the parameter vectors are a = (+0.5, +1.0), ρ = (0.5, 0.7, 0.9), m = 10, while a = (+0.5, +1.0), γ = (γ1 = 0.5, γ2 = 0.6, γ3 = 0.7), ρ = (ρ1 = 0.6, ρ2 = 0.7), m = 10 are the corresponding parameter values for the BCJ vs. BJC sampling procedures.

5.2. Results, general findings and discussions

A closer look at all the tables conveys a few general as well as some specific findings. We first highlight the general results and then discuss the specific analysis pertaining to a few individual instances.
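As a rough schematic only, and not the paper's exact stopping rules, a jump-and-crawl style run for the normal mean under a SEL risk bound can be sketched as follows: take a pilot of size m, "jump" in one batch to a fraction ρ of the currently estimated optimal size, then "crawl" one observation at a time until the estimated risk σ̂²/n falls below w. All names and the simplified stopping criterion here are illustrative assumptions.

```python
import math
import random
import statistics

def jump_then_crawl(w, m=10, rho=0.7, mu=10.0, sigma=1.0, seed=3):
    """Schematic jump-and-crawl run for a normal mean under a SEL
    risk bound: stop once the estimated risk var_hat / n <= w."""
    rng = random.Random(seed)
    xs = [rng.gauss(mu, sigma) for _ in range(m)]  # pilot sample

    def n_hat():
        # current plug-in estimate of the optimal sample size
        return math.ceil(statistics.variance(xs) / w)

    # Jump: one batch up to a fraction rho of the estimated size.
    target = max(m, math.ceil(rho * n_hat()))
    xs += [rng.gauss(mu, sigma) for _ in range(target - len(xs))]

    # Crawl: one observation at a time until the bound is met.
    while len(xs) < n_hat():
        xs.append(rng.gauss(mu, sigma))
    return len(xs), statistics.fmean(xs)

n_final, mu_hat = jump_then_crawl(w=0.008)
```

Varying ρ in such a sketch shows the trade-off the tables document: a larger jump reduces the number of sampling operations but risks stopping on a cruder variance estimate.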
Table 9
(a) and (b): Simulation results for X ∼ EVD(µ = 3, σ = 1), i.e., the extreme value distribution, with the SEL function, B = 100, K = 1000, considering (a) Accelerated Sequential (AS) and Jump and Crawl (JC) with m = 10 and (b) Batch Crawl and Jump (BCJ) and Batch Jump and Crawl (BJC) with ρ1 = 0.6, ρ2 = 0.7, γ1 = 0.5, γ2 = 0.6, γ3 = 0.7, m = 10 sequential sampling procedures. Here Ê(X) = µ + γ* σ̂_N; in (b), SO = SO1 + SO2 + SO3.

(a) AS / JC
w      D    ρ    N           SO          R̄                    Ê(X)                   CPU time (s)
0.007  235  0.6  228 / 384   218 / 374   0.011672 / 0.006985  3.731245 / 3.735142    838.719 / 1467.141
0.007  235  0.7  266 / 384   256 / 374   0.010004 / 0.006984  3.732031 / 3.735685    987.579 / 1471.593
0.007  235  0.8  306 / 383   296 / 373   0.008753 / 0.006984  3.734643 / 3.736172    1142.735 / 1465.781
0.008  206  0.6  199 / 334   189 / 324   0.013340 / 0.007980  3.729973 / 3.733215    734.719 / 1265.578
0.008  206  0.7  233 / 335   223 / 325   0.011434 / 0.007980  3.731583 / 3.733679    860.344 / 1264.047
0.008  206  0.8  267 / 337   257 / 327   0.010005 / 0.007980  3.733409 / 3.735939    991.250 / 1272.703

(b) BCJ / BJC
w      D    N           SO (BCJ)           SO (BJC)            R̄                    Ê(X)                   CPU time (s)
0.007  235  268 / 383   258 = 107+56+95    373 = 213+42+118    0.010149 / 0.006985  3.731585 / 3.734956    152.047 / 1501.125
0.008  206  234 / 335   224 = 93+48+83     325 = 190+35+100    0.011676 / 0.007980  3.731271 / 3.733650    140.000 / 1286.406
Table 10
(a) and (b): Simulation results for X ∼ EVD(µ = 5, σ = 95), i.e., the extreme value distribution, with the LINEX loss function, B = 100, K = 1000, w = +0.05, considering (a) Accelerated Sequential (AS) and Jump and Crawl (JC) with m = 10 and (b) Batch Crawl and Jump (BCJ) and Batch Jump and Crawl (BJC) with ρ1 = 0.6, ρ2 = 0.7, γ1 = 0.5, γ2 = 0.6, γ3 = 0.7, m = 10 sequential sampling procedures. Here Ê(X) = µ + γ* σ̂_N; in (b), SO = SO1 + SO2 + SO3.

(a) AS / JC
a     D   ρ    N         SO        Ê(X)                      CPU time (s)
+0.5  36  0.5  22 / 44   12 / 34   70.944758 / 72.993731     921.214 / 2474.927
+0.5  36  0.7  31 / 44   21 / 34   72.267342 / 73.143197     1550.071 / 2431.346
+0.5  36  0.9  40 / 44   30 / 34   72.934863 / 72.998988     2162.724 / 2436.554
+1.0  62  0.5  37 / 74   27 / 64   72.975407 / 74.213237     1948.064 / 4677.936
+1.0  62  0.7  52 / 74   42 / 64   73.672788 / 74.155536     3112.121 / 4588.755
+1.0  62  0.9  67 / 74   57 / 64   74.089739 / 74.150556     4024.412 / 4792.011

(b) BCJ / BJC
a     D   N         SO (BCJ)        SO (BJC)         Ê(X)                      CPU time (s)
+0.5  36  30 / 44   20 = 4+5+11     34 = 16+5+13     72.166052 / 73.122859     528.010 / 2745.722
+1.0  62  51 / 74   41 = 12+9+20    64 = 34+8+22     73.643506 / 74.146519     592.107 / 4895.464
General findings: As w decreases (i.e., as we deploy more resources for our sampling estimation problem) the value of N increases for both the AS and BCJ sequential sampling procedures. This holds irrespective of whether one uses the SEL or the LINEX loss function, and for each of the four underlying distributions; the only exception is the EVD bounded-risk LINEX case for any fixed value of w. For all these runs, with decreasing w, the corresponding values of the estimated risk (denoted by R̄(θ̂_N, θ)) show a decreasing trend, while the sampling time (denoted by CPU time) shows an increasing tendency. On the other hand, on an average, for the other two methods, where we first jump and then crawl (i.e., JC and BJC), the value of N almost always estimates the optimal sample size, D, accurately, and the values of the estimated risk and sampling time show trends similar to those witnessed for the AS and BCJ sampling procedures. One interesting fact worth mentioning is the pronounced effect of part estimation (through ρ and γ) in the AS and BCJ cases, which is almost negligible for both the JC and BJC methods. This may be because in both AS and BCJ, for increasing values of ρ, we complete more of the sampling scheme using one big chunk of observations, after which we switch over to one observation at a time until the end, while for JC and BJC one is already in overdrive, undertaking a big step after initially sampling one observation at a time. Hence, using AS and BCJ we slowly approach the limiting case, while utilizing JC and BJC we are there almost instantly. Thus we do get an accurate estimate of D (using the JC and BJC methods), but there are costs associated with it, namely (i) an increase in CPU time and (ii) lower values of R̄(θ̂_N, θ), which is not desirable for all practical purposes.

Lower simulated values of risk, R̄(θ̂_N, θ), or higher CPU time imply that more effort/time is required in terms of the resources/days one needs to invest/spend to estimate the parameters of interest. This may not be warranted considering resource constraints and bottlenecks. Another interesting point worth mentioning is the values of the sampling operations (SO). If one wants to ensure a more efficient sample survey plan in terms of a higher budget outlay (smaller value of w) and better estimation, then the experimenter invariably needs to consider more sets of observations at any stage, for any of the four adaptive sampling procedures discussed. The JC and BJC sampling plans will invariably have more sampling/sub-sampling operations, as these two procedures come closer to estimating the optimal sample size, D, than either the AS or BCJ sampling schemes. But it would cost us more time (CPU time)
and entail the allocation of a higher amount of resources/time for the JC and BJC multistage sampling rules. This is clearly visible if one concentrates on decreasing values of w, i.e., w → 0. Another interesting fact emerging from these exhaustive simulation runs is that, with a decrease in the value of w (an increase of the budgetary outlay), the value of R̄(θ̂_N, θ) obtained using JC, irrespective of the distribution, tracks the corresponding theoretical value (i.e., w) more efficiently than the corresponding value found using the AS sampling method, even though the value of R̄(θ̂_N, θ) estimated using the AS method is able to predict w on an average with a smaller percentage error. The same findings apply when we compare the results of the estimated risk for the BCJ and BJC sampling procedures. This is not immediately apparent, as the combinations of γ = (γ1, γ2, γ3) and ρ = (ρ1, ρ2) shown in the tables are inadequate in number, owing to page limitations and to the fact that the highlight of this research is the study of four different sequential sampling schemes for four different distributions under two different loss functions. On the other hand, the cost component, for which the best proxy is the CPU time in seconds, conveys the fact that a gain in efficiency for the estimation process invariably leads to an increase in costs, i.e., CPU time. The simulated value of risk also shows very clearly that more time and effort are required to obtain a higher level of precision in estimation; whether that is warranted depends on the level of resources one is willing to spend on such work. Last but not least, the average values of the different sampling operations, i.e., SO, SO1, SO2, SO3, for BCJ and BJC clearly highlight that a proper sampling plan, in terms of the efficiency of the final results (i.e., better estimates of D, w, etc.), is of importance. A higher budget outlay, i.e., more CPU time usage, for better estimation would definitely mean that the experimenter will consider more sets of observations at any stage. This may be regulated using different practical combinations of γ = (γ1, γ2, γ3) and ρ = (ρ1, ρ2). This fact is also corroborated by noting that decreasing w, i.e., w → 0, increases the estimation accuracy, but at a cost.

Specific findings: Under the general findings we mentioned that for the JC and BJC methods the N values are almost always equal to D. But from Tables 4(a), (b) and 9(a), (b) we get a different picture, which may be due to the specific combinations of ρ, γ = (γ1, γ2, γ3) and ρ = (ρ1, ρ2) used, with which we are unable to highlight the estimation power of the JC and BJC methods. The second specific finding relates to Table 8(a) and (b): due to the choice of w, the estimated risk is negative, which is an anomaly. The authors ran exhaustive simulations with many different combinations of w and a, with fixed values of (α, β, γ) as shown in the tables. One did obtain positive estimates of risk, R̄(θ̂_N, θ), but the optimal sample size in that case is always 1, which is practically unappealing to study, given the main motivation of the paper. One may refer to the gamma distribution under Section 4 for a better understanding of this aberration. Runs with different negative values of a (i.e., the underestimation case) and of w do result in positive estimates of risk, but here again the optimal sample size, D, is 1. Run results for G(α, β, γ) with negative values of a (LINEX loss function) are therefore also left out, as they too gave N = 1, which is not relevant for us. Finally, the presence of the gamma function in the formulation of the LINEX risk for the EVD case prevents us from presenting any simulation results highlighting the effect of underestimation (negative a), since the gamma function, which appears as an integral part of the LINEX risk for the EVD, does not exist for the resulting negative arguments.

6. Conclusions and future scope of research

The basic idea behind this extensive simulation-based research work was to highlight the efficacy of different sequential sampling methodologies. Using extensive simulation runs we verified the efficiency of different multistage sampling methodologies; this becomes apparent in how the samples are taken, keeping in mind that one wants to ensure the adherence of these different sampling schemes to a few theoretical properties. In this work it is noticed that decreasing the level of the bound on the risk invariably results in more precise estimation. But the cost and time required to complete the sampling plan are two of the most important issues one should keep in mind when designing any new sequential sampling procedure.
It is true that first undertaking a jump and then crawling leads to better estimation, but whether the asymptotic properties hold and whether it is at all practical (from a time and resource angle) are two very important points which need to be considered in all theoretical and practical situations. A future extension of this study may be to consider other types of distributions and check how well these different sampling techniques perform for them. Moreover, emphasis can be laid on designing new sampling schemes such that one may achieve greater accuracy of the population estimates without sacrificing too much on the cost front. In these cases it should be ensured that the estimates of the population parameters, sample size, risk, etc., are close to the actual values; apart from this, they should also adhere to the asymptotic properties of sequential sampling rules.

Acknowledgments

The authors would like to thank the Editor, Associate Editor and two anonymous referees for their critical and valuable comments/suggestions on the final version of this paper.