Computers & Industrial Engineering 143 (2020) 106409
Contents lists available at ScienceDirect
Computers & Industrial Engineering journal homepage: www.elsevier.com/locate/caie
Efficient monitoring of overdispersed counts with time-varying population sizes Dong Dinga, Jian Lib, a b
T
⁎
School of Management, Xi’an Polytechnic University, Xi’an, Shaanxi 710048, China School of Management and State Key Laboratory for Manufacturing Systems Engineering, Xi’an Jiaotong University, Xi’an, Shaanxi 710049, China
ARTICLE INFO
ABSTRACT
Keywords: Generalized Poisson distribution Incidence rate Overdispersion factor Statistical process control Weighted likelihood
The monitoring of count data is a major issue in many industrial applications and research activities. Count data often exhibit overdispersion with variance greater than mean, which makes the commonly used Poisson distribution problematic because of its underlying assumption of equal mean and variance. In this article, we consider the surveillance strategy for overdispersed count data, and we also take into account the effect of timevarying population sizes. In particular, to model the data we propose to adapt the generalized Poisson distribution to incorporate the incidence rate, the overdispersion factor, and the non-constant population sizes. The weighted likelihood ratio test is employed for online monitoring. Simulations show that the proposed method is efficient at detecting changes simultaneously in both the incidence rate and the overdispersion factor, and it is robust under different time-varying patterns of the population sizes.
1. Introduction Early detection of changes in a process is of great importance in a wide range of industries and is studied extensively in the literature. Statistical process control (SPC) has proven its effectiveness in monitoring the stability of a process. Among all, the detection problem of count data has attracted researchers’ attention for a long time. It traces back to the very well-known c-chart which models count data by the Poisson distribution. Under the assumption of a fixed population size or inspecting area of opportunity, monitoring the expected number of incidents is in fact equivalent to monitoring the incidence rate based on the Poisson distribution. The importance of detecting changes in the incidence rate can be verified in lots of examples, such as detection of an increased birth rate of infants with congenital malformations and an increased rate of incidence of some disease. Numerous methods have been developed given a constant population size. The classic c-chart is one of the Shewhart-type control charts. But, because the Shewhart chart is known to be less sensitive in detecting small and moderate shifts, many researchers suggest using cumulative sum (CUSUM) and exponentially weighted moving average (EWMA) charts to overcome this shortcoming. One can refer to Lucas (1985), Lai (1995), White and Keats (1996) for the CUSUM version. As for the EWMA charts, see Borror, Champ, and Rigdon (1998), Gan (1990), Frisén and de Maré (1991) for example. In particular, Frisén
⁎
and de Maré (1991) proposed the likelihood ratio method which shows its superiority over the Shewhart and CUSUM competitors. However, the assumption of a constant population size is sometimes invalid in practice, where the inspecting area of opportunity is changing over time. The expected number of incidents is also time-varying even in the in-control (IC) state, and therefore attention should be paid to the incidence rate, which is the ratio of the expected number of incidents to the population size. Although some articles use the term “sample size” to represent “population size” or “area of opportunity” here, please be reminded that it does not refer to the number of observations in a sample. In fact, the population size is not necessarily an integer, and it can be any real-valued number greater than zero. The product of the population size and the incidence rate is the expected number of incidents. In the literature, Mei, Han, and Tsui (2011) considers male thyroid cancer cases (with malignant behavior) in New Mexico during 1973—2005 with increasing population sizes. They developed some variations of the CUSUM method and proposed three CUSUM-based charts. Dong, Hedayat, and Sinha (2008) considered the EWMA methods to address the issue of time-varying population sizes. Ryan and Woodall (2010) evaluated the performance of EWMA and CUSUM methods in detecting increases in the Poisson rate and suggested a modified EWMA chart by adding a lower reflecting barrier. Both Dong et al. (2008) and Ryan and Woodall (2010) used the idea of dividing the
Corresponding author. E-mail address:
[email protected] (J. Li).
https://doi.org/10.1016/j.cie.2020.106409 Received 20 January 2019; Received in revised form 26 November 2019; Accepted 9 March 2020 Available online 14 March 2020 0360-8352/ © 2020 Elsevier Ltd. All rights reserved.
Computers & Industrial Engineering 143 (2020) 106409
D. Ding and J. Li
observed counts by the corresponding population sizes that can be seen as extensions of the classic u-chart. Zhou, Zou, Wang, and Jiang (2012) proposed the weighted likelihood ratio method which is effective and has satisfactory run length distribution. Other contributions also include Jiang, Shu, and Tsui (2011), Zhou, Shu, and Jiang (2016), and Shen, Zou, Jiang, and Tsung (2013), Shen, Tsui, Woodall, and Zou (2016). Richards, Woodall, and Purdy (2015) made a thorough discussion on the surveillance for non-homogenous Poisson processes. It should be noted that all the above methods are based on the Poisson distribution, which assumes that the mean and the variance are equal. However, it is usually observed that the variance is in fact greater than the mean. This phenomenon of excess variability is called overdispersion and is very common in various fields, such as insurance practice and ecology. For example, the Malaysian motor insurance claim count data analyzed by Zamani and Ismail (2012), and counts of harbor seals in Alaska studied by Ver Hoef and Boveng (2007) are overdispersed counts. To accommodate overdispersion, different SPC approaches have been proposed. Many researchers suggested the use of the geometric and negative binomial distributions, such as Kaminsky, Benneyan, Davis, and Burke (1992), Schwertman (2005), Sparks, Keighley, and Muscatello (2010); the Conway-Maxwell-Poisson distribution is also considered in some works like Sellers (2012) and Wang, Li, and Xue (2018). In addition, Yu, Wu, Wang, and Tsung (2018) considered a weighted Poisson CUSUM monitoring scheme that is robust to overdispersion. Das, Zhou, Chen, and Horst (2016) proposed to use approximate likelihood ratio tests for the monitoring of overdispersed multivariate count data based on the Poisson-multivariateGaussian mixed model. However, either the population size is assumed constant, or the focus is on monitoring the mean only, regardless of the overdispersion factor. In this article, we consider the monitoring of overdispersed count data with time-varying population sizes. Our contributions mainly lie in three aspects: (1) borrowing the additive property, we adapt the generalized Poisson (GP) distribution so as to model overdispersed counts with non-constant population sizes, which takes into account simultaneously the incidence rate, the overdispersion factor, and time-varying population sizes; (2) in Phase II SPC the weighted likelihood ratio test is employed to construct a control chart to detect changes either in the incidence rate or in the degree of overdispersion; (3) simulations show that the proposed method has powerful detection performance and is robust under different time-varying patterns of the population sizes. The remainder of this article is organized as follows: Section 2 describes the modeling of the overdispersed counts with time-varying population sizes; Section 3 is on the development of the monitoring strategy; simulation study is conducted in Section 4; the implementation of the proposed method is illustrated through a real example in Section 5; finally Section 6 concludes this article.
respectively. Here both mean and variance are equal to nt . However in practice, the assumption of equal mean and variance is often violated. It is more common that count data exhibit overdispersion with variance larger than mean. In such situations the Poisson distribution does no longer provide a good fit, and more general distributions for describing count data are needed. Among others, the generalized Poisson distribution is an appropriate choice when overdispersion occurs. The GP distribution has two parameters µ and and < 1) with the PMF is denoted by GP(µ, ) (0
Pr X = x = µ 1
1
exp[ µ (1
)
x]
x!
.
The mean and the variance are
E(X ) = µ ,
Var(X ) =
µ )2
(1
,
respectively. This is in fact a reparameterized form from the original one in Consul and Jain (1973). Compared with the Poisson distribution, the additional parameter controls the degree of overdispersion. With = 0 , the GP distribution GP(µ, ) will reduce to the Poisson distribution P( µ ), whereas with 0 < < 1, the variance is larger than the mean. Thus the GP distribution allows for overdispersion, and (1 ) 2 1 is called the overdispersion factor. For any two independent variables X and Y, if X ~GP(µ1 , ) and Y ~GP(µ 2 , ) , then X + Y ~GP(µ1 + µ 2 , ) . This additive property makes it possible to incorporate time-varying population sizes into consideration. To be specific, similar to the Poisson counts, an overdispersed sequence X1 , X2 , … with time-varying population size nt at < 1) with time t can be modeled by Xt ~GP(nt , ) (0
E(Xt ) = nt ,
Var(Xt ) =
nt (1
)2
.
Here can still be regarded as the incidence rate, and the over) 2 . The PMF of Xt is dispersion factor is still (1
Pr(Xt = x ) = nt (1
)[nt (1
) + x ]x 1 exp[ nt (1
)
x ]/ x!. (1)
Besides additivity, the GP distribution has many other properties that have been extensively studied. See Consul (1989) for reference. It is known that the negative binomial (NB) distribution also allows for overdispersion. That’s why some methods are based on the NB distribution as mentioned in the Introduction. When fitting to count data the performance of the GP and NB distributions are actually similar. The investigation by Joe and Zhu (2005) showed that GP is more appropriate for describing strictly unimodal (mode >0 ) count data or counts with a heavy tail, while NB is preferred when data have a very large zero fraction. Otherwise their differences are insignificant. The main reason why GP is chosen instead of NB is that GP has a simpler form of the likelihood function, so that it facilitates online monitoring. On the contrary, the PMF of NB involves the Gamma function that will bring too much trouble into computing maximum weighted likelihood estimates used for real-time monitoring. This will be seen more clearly in Section 3. In a word, the generalized Poisson distribution expressed in Eq. (1) is an appropriate choice for characterizing overdispersed count data, and can be adapted to account for time-varying population sizes.
2. Data modeling Let Xt (t = 1, 2, …) be a sequence of event counts during fixed time intervals. Usually count data can be modeled by the Poisson distribution denoted by P(µ) with mean and variance both equal to µ . The probability mass function (PMF) of P(µ) is
Pr X = x =
) + x ]x
[µ (1
e µµ x . x!
3. Process monitoring
With this expression, the expected number of occurrences is constant over time. However, in many cases the size of the associated population or inspecting area of opportunity is not constant, but varies over time. In such scenarios, it is reasonable to assume at time t , Xt ~P(nt ) , where nt and represent the population size at time t and the incidence rate,
SPC usually consists of two phases. In Phase I, retrospective analysis of historical data is conducted to see if any abnormalities exist. After identification of root causes and corresponding adjustment, a clean data set is obtained in order to represent the IC performance of the process.
2
Computers & Industrial Engineering 143 (2020) 106409
D. Ding and J. Li
Table 1 OC ARL comparison for scenario (I) with OC-TPS 5.00 21.1 64.0 133 126 62.3 22.7 6.51 2.66 1.24
0.5 0.2 0.1 0.05 0.05 0.1 0.2 0.5 1 2
0
= 0.1.
P-TPS
(0.02) (0.15) (0.59) (1.35) (1.30) (1.20) (0.18) (0.04) (0.01) (0.00)
5.29 18.5 53.9 123 117 54.7 20.1 6.01 2.91 1.63
EWMAu (0.02) (0.13) (0.50) (1.26) (1.20) (0.52) (0.15) (0.03) (0.01) (0.01)
5.53 20.1 63.0 144 106 50.6 19.3 5.91 2.86 1.60
OC-TPS
(0.03) (0.14) (0.60) (1.47) (1.10) (0.48) (0.15) (0.03) (0.01) (0.06)
184 221 214 178 139 87.0 34.8 16.9 6.05 2.16
0.1 0.05 0.02 0.02 0.05 0.1 0.2 0.3 0.5 0.8
P-TPS
(1.87) (2.35) (2.31) (1.90) (1.47) (0.91) (0.34) (0.15) (0.04) (0.01)
404 290 233 179 150 109 61.5 35.6 13.0 3.65
EWMAu (4.08) (3.02) (2.42) (1.88) (1.00) (1.14) (0.63) (0.36) (0.12) (0.02)
404 287 231 179 149 109 62.2 36.0 13.2 3.78
(4.13) (2.98) (2.44) (1.92) (1.57) (1.13) (0.64) (0.36) (0.13) (0.02)
NOTE: Standard errors are in parentheses, and the best ARLs are in bold.
The IC data set can be used to estimate the IC model. Then in Phase II, on the basis of the IC model, a control chart can be established to detect shifts as quickly as possible. In this article, we focus on only Phase II monitoring. Based on the generalized Poisson distribution, we can now turn to the monitoring of overdispersed count data with time-varying population sizes. Not only are we interested in detecting changes in the rate , ) 2 as well. To this end, the debut in the overdispersion factor (1 tection problem is formulated as the following change-point model:
Xt ~
GP(nt
0,
0 ),
for
t
GP(nt 1,
1),
for
t> ,
Xt at time t and fails to make full use of all the information. To tackle this problem, the EWMA scheme is incorporated to enhance the detection power. To be specific, the exponentially weighted likelihood function is employed, which is the linear combination of the log-likelihood functions at different time points with exponentially decaying weights (Li, Tsung, & Zou, 2014; Zhou et al., 2012;Zou & Qiu, 2009). The weighted log-likelihood function over all the observations up to the current time t is t
Wt ( ) = a1, t1,
,
=
0
and 0 or
=
t
0
1)ln(nt (1
) + Xt ) + ln(nt ) + ln(1
)
nt (1
)
Rt = 2[Wt ( t )
OC-TPS
0.5 0.2 0.1 0.05 0.05 0.1 0.2 0.5 1 2
6.60 30.1 85.5 156 149 86.2 34.0 9.58 4.09 1.79
(0.03) (0.25) (0.84) (1.59) (1.54) (0.85) (0.29) (0.05) (0.02) (0.01)
0
Xt .
= 0.3 .
P-TPS 6.99 27.2 77.7 152 135 72.7 29.2 8.32 3.79 2.01
Wt ( 0 )],
with 0 = [ 0 , 0 ]T . The proposed control chart triggers an OC signal if Rt > L where L > 0 is the control limit chosen to achieve a specified false alarm rate. Hereafter this chart is named overdispersed count control chart for time-varying population sizes, or OC-TPS chart for short. Since there is no recursive form of the exponentially weighted
The LRT statistic can then be constructed accordingly. However, this monitoring statistic is of Shewhart-type and inefficient in detecting small and moderate shifts, since it exploits only the current observation Table 2 OC ARL comparison for scenario (I) with
= argmax Wt ( ).
While this MWLE has no closed-form expression, the Newton-Raphson algorithm is used to solve the above maximization problem. More details can be found in the appendix. The finalized charting statistic takes the form
0
at every time point. To derive a monitoring statistic, the likelihood ratio test (LRT) is adopted due to its high efficiency and asymptotic optimality. By ignoring constants with respect to = [ , ]T , the log-likelihood function of based on only the observation Xt is lt ( ) = (Xt
(2)
(0, 1] is the EWMA smoothing parameter, and Here t at0, t , = i = t0 (1 )t i is to ensure that all the weights sum up to 1. Obviously Wt ( ) makes use of all the information of the observations up to the current time t, with exponentially decaying weights over time. The maximum weighted likelihood estimate (MWLE) of at time t, denoted by t , is the solution to the maximization problem:
where is the unknown change-point. Parameters with subscript 0 are assumed to be known or have been estimated based on a clean dataset, and they indicate that the process is in-control. Parameters with subscript 1 are unknown and imply that the process goes out-of-control (OC). The goal is to detect if any shifts occur in the parameters = [ , ]T . Therefore the monitoring task is to test the hypothesis
H0: H1:
)t ili ( ).
(1 i=1
EWMAu (0.03) (0.21) (0.76) (1.58) (1.40) (0.73) (0.25) (0.05) (0.02) (0.01)
7.52 31.3 97.4 184 120 65.1 27.0 8.05 3.73 1.96
OC-TPS
(0.03) (0.25) (0.98) (1.92) (1.24) (0.64) (0.23) (0.05) (0.02) (0.01)
0.3 0.2 0.1 0.05 0.05 0.1 0.2 0.3 0.4 0.5
NOTE: Standard errors are in parentheses, and the best ARLs are in bold.
3
41.5 73.4 157 213 128 69.8 24.2 10.9 6.22 3.95
(0.31) (0.65) (1.59) (2.21) (1.36) (0.72) (0.22) (0.09) (0.04) (0.02)
P-TPS
>500 >500 492 312 134 91.7 44.9 23.8 13.2 7.64
EWMAu (∗) (∗) (5.18) (3.25) (1.41) (0.96) (0.47) (0.23) (0.12) (0.06)
>500 >500 462 301 137 94.7 46.9 24.7 13.7 8.04
(∗) (∗) (4.63) (3.17) (1.45) (0.99) (0.48) (0.24) (0.13) (0.07)
Computers & Industrial Engineering 143 (2020) 106409
D. Ding and J. Li
Table 3 OC ARL comparison for scenario (I) with
0
OC-TPS 9.01 44.2 114 171 171 117 53.7 15.3 3.12 1.52
0.5 0.2 0.1 0.05 0.05 0.1 0.2 0.5 2 5
= 0.5 .
P-TPS
(0.05) (0.40) (1.17) (1.79) (1.75) (1.18) (0.49) (0.10) (0.01) (0.01)
10.6 47.4 126 191 150 99.2 47.0 13.4 2.78 1.57
EWMAu (0.05) (0.42) (1.30) (2.03) (1.58) (1.02) (0.44) (0.10) (0.01) (0.01)
12.2 68.0 192 247 136 88.6 42.4 12.9 2.71 1.54
OC-TPS
(0.06) (0.63) (2.02) (2.61) (1.43) (0.90) (0.40) (0.09) (0.01) (0.01)
17.1 20.5 27.6 47.6 118 108 48.9 14.5 6.61 3.79
0.5 0.4 0.3 0.2 0.1 0.05 0.1 0.2 0.3 0.4
P-TPS
(0.07) (0.10) (0.17) (0.38) (1.17) (1.13) (0.48) (0.12) (0.04) (0.02)
EWMAu (∗) (∗) (∗) (∗) (∗) (1.22) (0.75) (0.30) (0.13) (0.05)
>500 >500 >500 >500 >500 119 72.4 29.8 14.3 7.65
>500 >500 >500 >500 >500 126 80.4 33.8 16.1 8.71
(∗) (∗) (∗) (∗) (∗) (1.32) (0.83) (0.33) (0.15) (0.06)
NOTE: Standard errors are in parentheses, and the best ARLs are in bold.
likelihood function here, in the simulation study, by adding a sliding window of width M (Shang, Tsung, & Zou, 2011), we usually adopt the following slightly modified weighted likelihood function:
Wt ( ) = at
1 M + 1, t ,
t
)t ili ( ).
(3)
i=t M+1
Here M can be selected to ensure that (1
)M
1
is sufficiently small.
4. Performance assessment
Ut = a1, t1,
)t iXi ln
i=1
t 0
)t i n i
(1
t
0
t
(1
)t
i
Xi ni
0
.
(I) Uniformly distributed: nt ~U(7, 12) . c (II) Increasing: nt = 1 + exp((c1 t ) / c ) ;
.
(III) Decreasing: nt =
i=1
2
3
c1 / 2.4 1 + exp((t c2) / c3)
+ 5.
Here c1 = 13.8065, c2 = 11.8532 , and c3 = 26.4037 . Scenario (I)
Here Table 4 OC ARL comparison for scenario (II) with OC-TPS
0.5 0.2 0.1 0.05 0.05 0.1 0.2 0.5 1 2
)t i n i
To assess the performance of control charts, we use the average run length (ARL), which is the average number of samples taken before the chart signals. Here in our setting each sample contains only one observation. Given the same IC ARL, the chart with shorter OC ARL has better performance, because it detects a shift more quickly. In our simulation study, the IC ARL is specified as 200, and all the results are obtained based on 10,000 repetitions. We consider three scenarios of the patterns of time-varying population sizes, which are as follows:
t
a1, t1,
(1
4.2. Simulation results
The first competitor is the EWMA-type control chart suggested by Zhou et al. (2012), which is constructed by integrating the exponentially weighted LRT with the Poisson distribution and aims at monitoring Poisson counts with time-varying population sizes. Here for short we name this chart P-TPS chart. The P-TPS chart regards the data as Poisson distributed with time-varying population sizes, i.e., Xt ~P(nt ) , and its charting statistic is
(1
)t iXi
This chart is actually the EWMA version of the well-known u-chart, and therefore we call it the EWMAu chart.
4.1. Existing alternatives
Vt = a1, t1,
(1
i=1
The performance of the proposed OC-TPS chart is investigated in this section. We first briefly introduce two existing alternatives, and then present the comparison results. Notice that in the literature we cannot find any other work for overdispersed count data with timevarying population sizes. The existing methods introduced below are not designed for overdispersed count data, but for Poisson counts only.
t
t i=1 t i=1
is the maximum exponentially weighted likelihood estimate of the incidence rate at time t. The second competitive approach is proposed by Dong et al. (2008), which assigns the exponential weights directly to the rates at each time point and therefore adopts the following charting statistic (with some minor modification)
t
(1
=
4.20 17.1 50.5 120 111 49.3 18.4 5.50 2.24 1.11
(0.02) (0.11) (0.45) (1.20) (1.15) (0.45) (0.13) (0.03) (0.01) (0.00)
0
= 0.1.
P-TPS 4.50 14.7 42.2 104 98.6 43.1 16.0 5.09 2.54 1.44
EWMAu (0.02) (0.10) (0.37) (1.03) (1.01) (0.39) (0.12) (0.03) (0.01) (0.01)
4.55 15.2 46.5 122 94.4 39.7 14.9 4.87 2.46 1.41
OC-TPS
(0.02) (0.10) (0.45) (1.35) (1.04) (0.39) (0.11) (0.03) (0.01) (0.01)
0.1 0.05 0.02 0.02 0.05 0.1 0.2 0.3 0.5 0.8
NOTE: Standard errors are in parentheses, and the best ARLs are in bold.
4
187 224 223 181 140 86.4 34.3 16.4 5.76 1.92
(1.97) (2.35) (2.39) (1.94) (1.50) (0.88) (0.33) (0.14) (0.04) (0.01)
P-TPS 412 289 234 177 144 105 59.1 34.1 12.1 3.36
EWMAu (4.42) (3.13) (2.52) (1.90) (1.54) (1.10) (0.62) (0.35) (0.12) (0.02)
477 325 262 194 156 112 60.9 34.5 11.9 3.36
(5.41) (3.73) (3.02) (2.22) (1.80) (1.27) (0.70) (0.39) (0.12) (0.02)
Computers & Industrial Engineering 143 (2020) 106409
D. Ding and J. Li
Table 5 OC ARL comparison for scenario (II) with OC-TPS
0.5 0.2 0.1 0.05 0.05 0.1 0.2 0.5 1 2
5.50 23.5 69.0 139 135 69.7 27.3 7.95 3.43 1.52
(0.03) (0.18) (0.65) (1.44) (1.39) (0.67) (0.22) (0.04) (0.02) (0.01)
0
= 0.3 .
P-TPS 5.97 21.5 62.3 136 119 59.4 23.3 7.00 3.29 1.78
EWMAu (0.03) (0.16) (0.59) (1.40) (1.25) (0.58) (0.19) (0.04) (0.01) (0.01)
5.96 22.7 73.4 166 110 53.6 21.2 6.50 3.13 1.71
OC-TPS
(0.03) (0.18) (0.77) (1.89) (1.26) (0.56) (0.18) (0.04) (0.01) (0.01)
0.3 0.2 0.1 0.05 0.05 0.1 0.2 0.3 0.4 0.5
41.8 71.6 160 216 126 67.7 23.4 10.7 5.88 3.63
P-TPS
(0.31) (0.63) (1.60) (2.26) (1.35) (0.70) (0.22) (0.09) (0.04) (0.02)
EWMAu (∗) (∗) (∗) (3.44) (1.46) (0.97) (0.45) (0.22) (0.12) (0.06)
>500 >500 >500 319 134 91.0 43.5 22.4 12.3 7.11
>500 >500 >500 344 144 95.3 45.0 22.3 12.1 6.93
(∗) (∗) (∗) (4.00) (1.69) (1.13) (0.51) (0.24) (0.13) (0.06)
NOTE: Standard errors are in parentheses, and the best ARLs are in bold. Table 6 OC ARL comparison for scenario (II) with OC-TPS
0.5 0.2 0.1 0.05 0.05 0.1 0.2 0.5 2 5
7.59 35.7 94.7 160 159 101 44.1 12.6 2.61 1.15
(0.04) (0.30) (0.95) (1.68) (1.67) (1.01) (0.39) (0.07) (0.01) (0.00)
0
= 0.5 .
P-TPS
EWMAu
8.80 36.4 101 173 140 85.6 38.0 10.9 2.44 1.26
(0.04) (0.31) (1.02) (1.82) (1.51) (0.87) (0.34) (0.07) (0.01) (0.00)
9.22 44.4 144 240 131 77.1 33.9 10.1 2.32 1.22
OC-TPS
(0.05) (0.41) (1.59) (2.76) (1.49) (0.84) (0.32) (0.07) (0.01) (0.00)
0.5 0.4 0.3 0.2 0.1 0.05 0.1 0.2 0.3 0.4
17.0 20.5 27.3 46.8 118 103 45.9 13.5 6.10 3.37
P-TPS
(0.07) (0.10) (0.17) (0.38) (1.12) (1.08) (0.44) (0.11) (0.04) (0.02)
EWMAu (∗) (∗) (∗) (∗) (∗) (1.25) (0.73) (0.29) (0.12) (0.05)
>500 >500 >500 >500 >500 116 69.7 28.0 12.9 6.78
>500 >500 >500 >500 >500 127 76.0 29.6 13.4 7.01
(∗) (∗) (∗) (∗) (∗) (1.46) (0.86) (0.33) (0.13) (0.05)
NOTE: Standard errors are in parentheses, and the best ARLs are in bold. Table 7 OC ARL comparison for scenario (III) with OC-TPS
0.8 0.2 0.1 0.05 0.05 0.1 0.2 0.5 1 2
3.26 34.9 99.8 167 149 88.9 35.0 9.41 3.82 1.62
(0.01) (0.29) (0.99) (1.75) (1.53) (0.88) (0.31) (0.06) (0.02) (0.01)
0
= 0.1.
P-TPS 4.09 28.9 84.0 156 141 77.2 30.3 8.46 3.81 2.01
EWMAu (0.00) (0.24) (0.83) (1.56) (1.44) (0.76) (0.27) (0.05) (0.02) (0.01)
4.46 34.6 103 178 122 69.2 28.6 8.42 3.80 2.02
OC-TPS
(0.01) (0.26) (0.95) (1.69) (1.17) (0.64) (0.23) (0.05) (0.02) (0.01)
0.1 0.05 0.02 0.02 0.05 0.1 0.2 0.3 0.5 0.8
184 219 216 179 141 88.3 35.7 17.9 6.78 2.77
(1.87) (2.24) (2.25) (1.87) (1.45) (0.90) (0.34) (0.15) (0.05) (0.01)
P-TPS 398 282 231 179 147 110 61.4 37.1 14.3 4.47
EWMAu (4.25) (2.99) (2.41) (1.84) (1.52) (1.12) (0.62) (0.37) (0.13) (0.03)
365 261 217 172 145 110 64.1 39.4 15.6 4.79
(3.59) (2.60) (2.16) (1.67) (1.39) (1.06) (0.60) (0.37) (0.14) (0.03)
NOTE: Standard errors are in parentheses, and the best ARLs are in bold. Table 8 OC ARL comparison for scenario (III) with OC-TPS
0.5 0.2 0.1 0.05 0.05 0.1 0.2 0.5 1 5
9.52 48.0 122 179 165 114 51.3 13.9 5.88 1.08
(0.05) (0.43) (1.23) (1.89) (1.74) (1.18) (0.48) (0.09) (0.03) (0.00)
0
= 0.3 .
P-TPS 9.74 44.3 115 179 152 98.0 44.6 11.9 5.18 1.24
EWMAu (0.05) (0.39) (1.14) (1.83) (1.55) (1.00) (0.42) (0.08) (0.03) (0.00)
11.4 58.8 153 211 132 84.0 40.1 11.7 5.12 1.23
OC-TPS
(0.05) (0.48) (1.44) (2.06) (1.29) (0.80) (0.35) (0.08) (0.03) (0.01)
0.3 0.2 0.1 0.05 0.05 0.1 0.2 0.3 0.4 0.5
NOTE: Standard errors are in parentheses, and the best ARLs are in bold.
5
41.8 75.3 161 211 132 74.3 26.3 12.7 7.31 4.79
(0.32) (0.67) (1.63) (2.21) (1.38) (0.75) (0.24) (0.10) (0.05) (0.03)
P-TPS
>500 >500 479 307 136 95.5 48.4 25.9 14.9 9.03
EWMAu (∗) (∗) (4.93) (3.18) (1.38) (0.97) (0.49) (0.25) (0.14) (0.08)
>500 >500 422 282 136 97.4 52.1 28.9 16.7 10.1
(∗) (∗) (4.06) (2.74) (1.32) (0.92) (0.50) (0.27) (0.15) (0.08)
Computers & Industrial Engineering 143 (2020) 106409
D. Ding and J. Li
Table 9 OC ARL comparison for scenario (III) with OC-TPS
0.5 0.2 0.1 0.05 0.1 0.2 0.5 1 2 8
12.9 65.8 138 184 135 74.0 21.5 9.22 4.42 1.14
0
= 0.5 .
P-TPS
(0.08) (0.63) (1.41) (1.92) (1.41) (0.72) (0.16) (0.05) (0.02) (0.00)
15.4 80.2 172 211 119 66.4 20.1 7.96 3.68 1.17
EWMAu (0.09) (0.78) (1.79) (2.20) (1.22) (0.67) (0.16) (0.05) (0.02) (0.00)
21.4 149 270 253 102 58.0 19.0 7.85 3.64 1.17
OC-TPS
(0.12) (1.40) (2.66) (2.51) (0.99) (0.54) (0.15) (0.05) (0.02) (0.00)
0.5 0.4 0.3 0.2 0.1 0.05 0.1 0.2 0.3 0.4
17.0 20.7 28.1 48.2 117 119 56.3 17.6 8.23 4.98
P-TPS
(0.08) (0.11) (0.18) (0.39) (1.14) (1.23) (0.56) (0.15) (0.05) (0.02)
EWMAu (∗) (∗) (∗) (∗) (∗) (1.29) (0.80) (0.33) (0.15) (0.07)
>500 >500 >500 >500 >500 125 77.5 33.5 16.3 9.24
>500 >500 >500 >500 492 130 87.1 40.9 20.4 11.5
(∗) (∗) (∗) (∗) (4.92) (1.28) (0.86) (0.39) (0.18) (0.08)
NOTE: Standard errors are in parentheses, and the best ARLs are in bold. Table 10 OC ARL comparison for double shifts in scenario (I) with
0.5
0.2
0.1
0.05
0.05 0.1 0.2 0.5 2 5
0
= 0.5 .
0.5
0.4
0.3
0.2
0.1
0.05
0.1
0.2
0.3
0.4
11.8 11.1 13.0 19.3 284 >500 18.7 >500 >500 18.0 >500 >500 16.0 >500 >500 15.1 >500 >500 13.3 285 205 9.84 14.3 13.5 3.00 2.63 2.58 1.16 1.28 1.24
11.7 11.0 13.0 23.2 191 490 23.1 >500 >500 21.9 >500 >500 19.2 >500 >500 17.6 >500 >500 15.0 184 143 10.3 14.1 13.4 3.03 2.67 2.59 1.16 1.28 1.25
11.7 11.0 13.0 30.3 131 285 31.7 >500 >500 30.5 >500 >500 25.1 >500 >500 22.5 >500 >500 18.2 122 97.0 11.0 14.0 13.3 3.01 2.67 2.62 1.17 1.30 1.25
11.2 10.9 12.8 44.1 90.0 170 56.9 >500 >500 53.3 >500 >500 40.0 >500 >500 33.0 377 294 23.4 85.4 73.4 11.9 13.7 13.0 3.04 2.70 2.63 1.20 1.31 1.28
10.3 10.8 12.5 59.4 66.2 107 123 292 >500 137 >500 >500 89.8 363 297 63.2 183 150 35.8 62.4 55.2 13.4 13.5 12.9 3.08 2.74 2.66 1.23 1.33 1.30
8.14 10.4 11.9 31.0 40.1 54.3 62.0 83.6 117 85.4 115 138 109 100 95.7 93.9 74.0 67.5 53.8 40.3 37.4 16.0 13.1 12.6 3.16 2.82 2.77 1.32 1.40 1.38
7.26 10.2 11.7 21.0 33.0 42.7 33.2 57.3 72.7 41.1 69.3 84.1 54.6 67.8 66.6 53.6 54.9 54.2 42.3 34.9 32.9 16.0 13.0 12.6 3.21 2.91 2.83 1.37 1.44 1.41
5.52 9.31 10.5 10.3 20.9 24.4 12.3 27.1 32.2 13.7 30.2 35.3 16.3 32.8 35.5 17.3 31.3 33.6 18.1 26.1 26.4 14.0 13.1 12.7 3.48 3.15 3.08 1.51 1.55 1.53
4.13 7.77 8.68 5.72 12.6 14.1 6.29 14.2 16.1 6.65 15.1 17.1 7.35 17.1 19.3 7.59 17.8 19.7 8.33 18.4 19.7 9.39 15.1 15.4 4.29 4.01 3.95 1.88 1.86 1.84
3.08 5.84 6.47 3.59 7.27 7.99 3.71 7.71 8.61 3.84 8.02 8.94 4.03 8.47 9.56 4.15 8.82 9.99 4.30 9.42 10.6 4.91 11.5 12.7 6.30 8.46 8.42 3.32 3.21 3.16
represents stationary but inhomogeneous population sizes. Scenarios (II) and (III) are in fact the same as case (I) and (III) in Zhou et al. (2012) and also in Mei et al. (2011). These two settings are from the suggestion of Mei et al. (2011) that the population growth can be modeled by the logistic function. We let 0 = 1 to maintain consistency with the literature. For the dispersion factor, we consider three cases of 0 :
OC-TPS P-TPS EWMAu OC-TPS P-TPS EWMAu OC-TPS P-TPS EWMAu OC-TPS P-TPS EWMAu OC-TPS P-TPS EWMAu OC-TPS P-TPS EWMAu OC-TPS P-TPS EWMAu OC-TPS P-TPS EWMAu OC-TPS P-TPS EWMAu OC-TPS P-TPS EWMAu
charts (OC-TPS, P-TPS, and EWMAu) in detecting such shifts are compared, summarized in Tables 1–9 as follows. Throughout the simulations, the EWMA smoothing parameter is 0.1. Actually, it can usually be chosen between 0.05 and 0.2. Different values of have different effects on detection. As indicated by Lucas and Saccucci (1990), a larger facilitates faster detection in large shifts, whereas a smaller is helpful for quickly detecting small shifts. Tables 1–3 are for scenario (I), where nt ~U(7, 12) , and 0 = 0.1, 0.3, 0.5 , respectively. With shifts in , for small and moderate , the proposed OC-TPS chart has slightly larger OC ARLs than the other two charts. Since P-TPS and EWMAu are devised exclusively for detecting shifts in , so they are supposed to perform well in detecting shifts in . On the other hand, for very large shifts such as = 0.5 or 2 in Table 1, OC-TPS outperforms both P-TPS and EWMAu. Tables 2 and 3 also demonstrate this phenomenon that OC-TPS performs better in detecting very large shifts in . When it comes to shifts occurring in , things are totally different.
• = 0.1 with the IC dispersion factor 1.23 for a small degree of overdispersion; • = 0.3 with the IC dispersion factor 2.04 for a moderate degree of overdispersion; • = 0.5 with the IC dispersion factor 4.00 for a large degree of 0
0
0
overdispersion.
In total there are nine cases, and in each case shifts occur either in or in , denoted by and respectively. The performance of the three
6
Computers & Industrial Engineering 143 (2020) 106409
D. Ding and J. Li
and EWMAu. The reason is that the proposed OC-TPS chart is designed to monitor changes in both the incidence rate and the overdispersion factor, while the other two charts are devised for only. As one reviewer suggested, for illustration we have run more simulations for pairs of shifts in ( , ) in scenario (I) with 0 = 1 and 0 = 0.5. The results are listed in Table 10, according to which some conclusions are drawn. (1) With negative , our proposed OC-TPS chart performs the best in 36/40 cases, regardless of negative or positive 0.2 , where the OC-TPS chart . Exceptions are in = 0.5 and performs better than EWMAu, but worse than P-TPS. This may happen because of the inaccuracy in parameter estimation due to small OC means and too many zeros in observations. (2) With positive = 0.1, 0.2, 0.5, 2 , regardless of negative or positive , there are 24/40 cases where OC-TPS performs the best. Similar to Table 3 with shifts in only, as increases, there are also more and more cases where EWMAu performs the best. The reason is perhaps that positive shifts in dominate shifts in . For the same reason, with = 5, OC-TPS outperforms the others again in 8/10 cases. In total, with simultaneous shifts in both and , there are 77 out of 100 cases where our proposed OC-TPS chart has the best performance among all. Throughout Tables 1–10, the P-TPS and EWMAu charts are two effective charts that are devised to detect shifts in the incidence rate only, whereas our proposed OC-TPS chart can detect shifts in and/or . For shifts in only, it’s not beyond expectation that P-TPS and EWMAu are powerful. However, in some cases, the proposed OC-TPS still outperforms the other two. For shifts in only, OC-TPS performs the best in almost all cases, and the other two charts even lose detection power in some cases. In this sense, our OC-TPS chart is a more general chart that takes both the incidence rate and overdispersion into account, and it also demonstrates fair robustness to various patterns of time-varying population sizes.
Table 11 The adverse event counts, product P exposures, and rates for each quarter reported during July 1, 1999 — December 31, 2004. (Dong et al., 2008). time t
adverse event counts X
product exposure n (in millions)
rate (per million units)
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22
1 0 0 0 1 0 3 3 3 2 5 5 2 4 4 3 4 3 8 3 2 2
0.206 0.313 0.368 0.678 0.974 0.927 0.814 0.696 0.659 0.775 0.731 0.71 0.705 0.754 0.682 0.686 0.763 0.833 0.738 0.741 0.843 0.792
4.854 0 0 0 1.027 0 3.686 4.310 4.552 2.581 6.840 7.042 2.837 5.305 5.865 4.373 5.242 3.601 10.840 4.049 2.372 2.525
The proposed OC-TPS chart uniformly outperforms the other two charts, as we expect. The reason is that our proposed chart consider overdispersion whereas the other two fail to do so. When 0 gets larger with the degree of overdispersion increasing, from Table 1 up to Table 3, the P-TPS chart and the EWMAu chart even lose their power in detecting negative shifts in in terms of OC ARLs larger than 200. This is more obvious from Tables 2 and 3 where many OC ARLs of P-TPS and EWMAu are even larger than 500. In sharp contrast, the proposed OCTPS chart is very efficient at detecting shifts in , except several cases with very small shifts, such as = 0.02 in Table 1 and = 0.05 in Table 2. Tables 4–6 are for scenario (II) where nt follows the increasing function with different values of 0 , and Tables 7–9 are for scenario (III) where nt follows the decreasing function with different values of 0 . The results herein are similar to scenario (I). When shifts occur in , the proposed OC-TPS chart is slightly worse than the other two charts in detecting small and moderate shifts, whereas OC-TPS outperforms the other two in detecting very large shifts. On the other hand, when shifts, the OC-TPS chart has much better performance than both P-TPS
5. Implementation into practice In this section, we illustrate the implementation of the proposed OCTPS chart through a real example. This example comes from Dong et al. (2008) for illustration. The dataset describes adverse events for a product P in a pharmaceutical company. Totally there are 22 observations. Although the dataset is not large, it suffices to demonstrate the use of the proposed chart in a real-world setting. Table 11 provides the adverse event counts and the product exposures per quarter (per million product) between July 1, 1999 and December 31, 2004, and the rates are also listed in the last column. Dong et al. (2008) directly assume the IC value of the incidence rate 0 = 4.0 , and the value of 0 is unavailable therein. Here for illustration,
Fig. 1. An OC-TPS chart for the implementation. 7
Computers & Industrial Engineering 143 (2020) 106409
D. Ding and J. Li
Fig. 2. A P-TPS chart for the implementation.
Fig. 3. An EWMAu chart for the implementation.
we regard the first 14 observations as the in-control dataset, and based on them we employ the maximum likelihood estimation method to obtain the IC incidence rate 0 = 3.115 and the IC parameter 0 = 0.1549 , leading to the IC overdispersion factor 1.400. Furthermore, similar to Dong et al. (2008), we select the IC ARL as 100, and the control limit is finalized as 0.6450 by Monte Carlo simulations via the bisection method (Ding, Tsung, & Li, 2016). In choosing the control limit, the population sizes are obtained by resampling the above 14 = 0.1 population sizes. The EWMA smoothing parameter is throughout. To be consistent with Dong et al. (2008), the whole dataset is used in Phase II. The OC-TPS chart is shown in Fig. 1 from t = 1 to t = 22 , and it displays an OC signal at t = 18 with the charting statistic above the control limit. For comparison purpose, we also apply the T-TPS chart and the EWMAu chart to this example with the IC incidence rate = 0.1. These two 0 = 3.115 and the EWMA smoothing parameter charts are shown in Figs. 2 and 3, respectively, and they both trigger an OC alarm at t = 19, slower than our proposed OC-TPS chart.
problem of count data with overdispersion and time-varying population sizes. Facing this challenge, this article has made three contributions. First, we propose a model to describe overdispersed counts with timevarying population sizes, which is built on the GP distribution. This model is fairly reasonable and flexible enough to incorporate the incidence rate, the overdispersion factor, and time-varying population sizes. Second, we suggest an efficient surveillance approach by employing the exponentially weighted likelihood ratio test. Compared with the extant techniques, the proposed OC-TPS chart possesses satisfactory performance in detecting shifts in the incidence rate and almost uniformly better behavior in discovering abnormalities in the overdispersion factor. Third, the detection power of the OC-TPS chart has been verified to be robust under various patterns of time-varying population sizes. As indicated in the second last paragraph in Section 2, the proposed model is constructed on the GP distribution, which is suitable for characterizing strictly unimodal overdispersed counts or counts with a long tail. But if there is large mass at zero (mode = 0 ), the NB distribution can be considered for modeling overdispersed counts with time-varying population sizes. In this case, the zero-inflated versions of the GP and NB distributions can also be selected for data modeling and further process monitoring. These will be left in our future work.
6. Conclusion Count data are very popular yet complicated in practice, and their modeling and monitoring are drawing increasingly more attention recently. In this article, we comprehensively consider the detection 8
Computers & Industrial Engineering 143 (2020) 106409
D. Ding and J. Li
CRediT authorship contribution statement
anonymous referees for their many helpful comments that have resulted in significant improvements in this article. This research was supported by the National Key R&D Program of China Grant 2016YFF0202004; the National Natural Science Foundation of China Grants 71602155 and 71772147; the Fundamental Research Funds for the Central Universities; the Youth Innovation Team of Shaanxi Universities “Big data and Business Intelligent Innovation Team”.
Dong Ding: Conceptualization, Methodology, Writing - original draft, Writing - review & editing, Funding acquisition. Jian Li: Methodology, Software, Visualization, Formal analysis, Investigation, Funding acquisition. Acknowledgements The authors would like to thank the associate editor and three Appendix A. Derivation of the MWLE in Eq. (2)
The derivation of the MWLE of = [ , ]T is shown here. At time t, Wt ( ) is the weighted log-likelihood function involving all observations up to t. For simplicity the subscript t is suppressed temporarily without causing any confusion. Taking the first derivatives of the weighted log-likelihood function W ( ) , we have the score vector denoted by s( ) , i.e.,
s( )
W ( ).
The solution to the score equations
s( ) =
W( ) = 0
is denoted by and is referred to as the MWLE of . The Newton-Raphson algorithm is adopted to obtain the MWLE. Denote the Hessian matrix of by
H( )
W ( ).
Given the kth iteration, the Newton-Raphson algorithm gives the (k + 1) th iteration by (k + 1)
=
(k )
H 1(
(k ) ) s ( (k ) ).
All the derivatives are given as follows. Wt ( )
= a1, t1,
t
(1
)t
ni (1 ni (1
i
i=1 Wt ( )
= a1, t1,
2W ( ) t 2 2W ( t
)
2W ( ) t 2
= a1, t1, = a1, t1, = a1, t1,
t
(1
)t
+
1
(Xi ni )(Xi 1) ni (1 ) + Xi
i
i=1 t
(1
)t
i
(1
)t
i
(1
)t
i
)2 (Xi
ni2 (1 (ni (1
i=1 t
(Xi
1
1 2
1)
) + Xi)2
ni
,
,
+ ni ,
1)
) + Xi)2
, Xi
1 1)
ni )2 (Xi
(ni (1
ni 1
) + Xi)2
ni Xi (Xi (ni (1
i=1 t i=1
)(Xi 1) ) + Xi
1 (1
)2
.
Kaminsky, F. C., Benneyan, J. C., Davis, R. D., & Burke, R. J. (1992). Statistical control charts based on a geometric distribution. Journal of Quality Technology, 24, 63–69. Lai, T. L. (1995). Sequential changepoint detection in quality control and dynamical system. Journal of the Royal Statistical Society, Series B, 57, 613–658. Li, J., Tsung, F., & Zou, C. (2014). Multivariate binomial/multinomial control chart. IIE Transactions, 46, 526–542. Lucas, J. M. (1985). Counted data CUSUMs. Technometrics, 27, 129–144. Lucas, J. M., & Saccucci, M. S. (1990). Exponentially weighted moving average control schemes: properties and enhancements. Technometrics, 32, 1–29. Mei, Y., Han, S. W., & Tsui, K. L. (2011). Early detection of a change in Poisson rate after accounting for population size effects. Statistica Sinica, 21, 597–624. Ryan, A. G., & Woodall, W. H. (2010). Control charts for Poisson count data with varing sample sizes. Journal of Quality Technology, 42, 260–274. Richards, S. C., Woodall, W. H., & Purdy, G. (2015). Surveillance of nonhomogeneous Poisson processes. Technometrics, 57, 388–394. Schwertman, N. C. (2005). Designing accurate control charts based on the geometric and negative binomial distributions. Quality and Reliability Engineering International, 21, 743–756. Sellers, K. F. (2012). A generalized statistical control chart for over- or under-dispersed data. Quality and Reliability Engineering International, 28, 59–65. Shang, Y., Tsung, F., & Zou, C. (2011). Phase-II profile monitoring with binary data and random predictors. Journal of Quality Technology, 43, 196–208. Shen, X., Zou, C., Jiang, W., & Tsung, F. (2013). Monitoring Poisson count data with dynamic control limits when the sample sizes are time-varying. Naval Research Logistics, 60, 625–636. Shen, X., Tsui, K. L., Woodall, W. H., & Zou, C. (2016). Self-starting monitoring scheme for Poisson count data. Technometrics, 58, 460–471.
References Borror, C. M., Champ, C. W., & Rigdon, S. E. (1998). Poisson EWMA control charts. Journal of Quality Technology, 30, 352–361. Consul, P. C. (1989). Generalized Poisson distribution: Properties and applications. New York: Marcel Dekker Inc. Consul, P. C., & Jain, G. C. (1973). A generalization of the Poisson distribution. Technometrics, 15, 791–799. Das, D., Zhou, S., Chen, Y., & Horst, J. (2016). Statistical monitoring of over-dispersed multivariate count data using approximate likelihood ratio tests. International Journal of Production Research, 54, 6579–6593. Ding, D., Tsung, F., & Li, J. (2016). Rank-based process control for mixed-type data. IIE Transactions, 48, 673–683. Dong, Y., Hedayat, A. S., & Sinha, B. K. (2008). Surveillance strategies for detecting changepoint in incidence rate based on exponentially weighted moving average methods. Journal of the American Statistical Association, 103, 843–853. Frisén, M., & de Maré, J. (1991). Optimal surveillance. Biometrika, 78, 271–280. Gan, F. F. (1990). Monitoring Poisson observations using modified exponentially weighted moving average control charts. Communications in Statistics—Simulation and Computation, 19, 103–124. Jiang, W., Shu, L., & Tsui, K. L. (2011). Weighted CUSUM control charts for monitoring Poisson processes with varying sample sizes. Journal of Quality Technology, 43, 346–362. Joe, H., & Zhu, R. (2005). Generalized Poisson distribution: The property of mixture of Poisson and comparison with negative binomial distribution. Biometrical Journal, 47, 219–229.
9
Computers & Industrial Engineering 143 (2020) 106409
D. Ding and J. Li Sparks, R., Keighley, T., & Muscatello, D. (2010). Exponentially weighted moving average plans for detecting unusual negative binomial counts. IIE Transactions, 42, 721–733. Ver Hoef, J. M., & Boveng, P. L. (2007). Quasi-Poisson VS. negative binomial regression: How should we model overdispersed count data? Ecology, 88, 2766–2772. Wang, R., Li, J., & Xue, L. (2018). Joint monitoring of mean and dispersion of count data. Journal of Industrial and Production Engineering, 35, 269–276. White, C. H., & Keats, J. B. (1996). ARLs and higher-order run-length moments for the Poisson CUSUM. Journal of Quality Technology, 28, 363–369. Yu, M., Wu, C., Wang, Z., & Tsung, F. (2018). A Robust CUSUM scheme with a weighted likelihood ratio to monitor an overdispersed counting process. Computers & Industrial Engineering, 126, 165–174.
Zamani, H., & Ismail, N. (2012). Functional form for the generalized Poisson regression model. Communications in Statistics—Theory and Methods, 41, 3666–3675. Zhou, Q., Shu, L., & Jiang, W. (2016). One-sided EWMA control charts for monitoring Poisson processes with varying sample sizes. Communications in Statistics—Theory and Methods, 45, 6112–6132. Zhou, Q., Zou, C., Wang, Z., & Jiang, W. (2012). Likelihood-based EWMA charts for monitoring Poisson count data with time-varying sample sizes. Journal of the American Statistical Association, 107, 1049–1062. Zou, C., & Qiu, P. (2009). Multivariate statistical process control using LASSO. Journal of the American Statistical Association, 104, 1586–1596.
10