Dynamic VAR Model-Based Control Charts for Batch Process Monitoring
Journal Pre-proof
Dynamic VAR Model-Based Control Charts for Batch Process Monitoring Danilo Marcondes Filho, Marcio Valk PII: DOI: Reference:
S0377-2217(19)31078-1 https://doi.org/10.1016/j.ejor.2019.12.038 EOR 16246
To appear in:
European Journal of Operational Research
Received date: Accepted date:
18 March 2019 23 December 2019
Please cite this article as: Danilo Marcondes Filho, Marcio Valk, Dynamic VAR Model-Based Control Charts for Batch Process Monitoring, European Journal of Operational Research (2019), doi: https://doi.org/10.1016/j.ejor.2019.12.038
This is a PDF file of an article that has undergone enhancements after acceptance, such as the addition of a cover page and metadata, and formatting for readability, but it is not yet the definitive version of record. This version will undergo additional copyediting, typesetting and review before it is published in its final form, but we are providing this version to give early visibility of the article. Please note that, during the production process, errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain. © 2019 Elsevier B.V. All rights reserved.
Highlights • A new batch control approach based on VAR coefficients is proposed. • The proposed VAR model-based charts holds more information about the variables dynamics than the classical approaches. • Through a simulated toy model we have shown that the proposed approach outperforms the residual-based control charts in both offline and online context. • A new class of couple VAR processes is established and the asymptotic theory for least squares estimators is obtained.
1
Dynamic VAR Model-Based Control Charts for Batch Process Monitoring Danilo Marcondes Filho1 , Marcio Valk2 Universidade Federal do Rio Grande do Sul, Department of Statistics, Av. Bento Gon¸calves, 9500, post code 91509-900, Porto Alegre, RS, Brazil
Abstract In the field of Statistical Process Control (SPC) there are several different approaches to deal with monitoring of batch processes. Such processes present a three-way data structure (batches × variables × time-instants), so that for each batch a multivariate time series is available. Traditional approaches do not take into account the time series nature of the data. They deal with this kind of data by applying multivariate techniques in a reduced two-way data structure, in order to capture variables dynamics in some way. Recent developments in SPC have proposed the use of the Vector Autoregressive (VAR) time series model considering the original three-way structure. However, they are restricted to control approaches focused on VAR residuals. This paper proposes a new approach to deal with batch processes focusing on VAR coefficients instead of residuals. In short, we estimate VAR coefficients from historical in-control reference batch samples and build two multivariate control charts to monitoring new batches. We showcase the advantages of the proposed methodology for offline and online monitoring in a simulate example comparing it with the residual-based approach. Keywords: Quality Control, Vector Autoregressive Model, VAR coefficients, Batch Process, Control Charts Email addresses:
[email protected] (Danilo Marcondes Filho),
[email protected] (Marcio Valk) 1 Phone: +555133086214 2 Phone: +555133086191
Preprint submitted to European Journal of Operational Research
January 9, 2020
Declarations of interest: none
1. Introduction Batch processes are widely used in chemical, biochemical, pharmaceutical, and food industries. This kind of processing involves charging a vessel with row materials, processing under conditions controlled by many process variables according to standard time trajectories, and discharging the final product after the batch is finished. During the batch, multiple measurements of these process variables are obtained, yielding a large amount of information to be used in performance evaluation. Data from batch processes can be arranged as a tri-dimensional array with dimension (I × K × T ), where I represents the number of batches, K represents the number of variables and T represents the number of time-instants. More specifically, for each batch, a K-dimensional time series of size T is available, containing data with a strong dynamic feature. In a wide range of traditional monitoring approaches, the data is rearranged to compose a two-dimensional data array and analyzed by using multivariate techniques like Principal Components, Partial Least Squares, Support Vector machines, Neural Networks, etc. This two dimensional structure has two classical forms. The classical approach of Nomikos & MacGregor (1995) uses the data array (I × KT ), so that the dynamic behavior of the process variables is prioritized (time-wise analysis), i.e., they focus on the serial correlations [Figure 1 (a)]. Wold et al. (1998) proposes an alternative approach using the arrangement (IT × K), in which the variable-wise analysis is prioritized. This approach focuses on the cross correlation among variables and allows diagnostics of process disturbances [Figure 1 (b)]. Camacho et al. (2009) presents a wide discussion about advantages and disadvantages of these two approaches. However, there is not a consensus about what is the best strategy and there is a trade-off between modeling cross and serial correlation. Some improvements and applications of these approaches can be found in Van den Kerkhof et al. (2012), Peng et al. (2014) and Wang et al.
3
(2018). (a)
(b)
Figure 1: (a) Time-wise analysis: I × KT two-way design by Nomikos & MacGregor (1995) . (b) Variable-wise analysis: IT × K two-way design by Wold et al. (1998) .
Time series models have been considered in order to deal with multivariate statistical process control, more specifically, the VAR model. However, most of these approaches are restricted to monitoring continuous process, i.e., processes that have the intrinsic two-way structure (I × K). So, since there are no replications of measures in each sample, there is no time dimension. These approaches use a VAR model to eliminate the correlation within the I samples, which is a necessary condition to apply classical multivariate techniques. Basically, in a preliminary stage they propose a VAR model in order to estimate data serial correlation. In a post stage, classical control charts are used in the VAR residuals to monitor the behavior of future samples. An overview of these approaches can be found in Snoussi (2011), Pan & Jarrett (2012), Vanhatalo & Kulahci (2015), Leoni et al. (2015) and Sim˜ oes et al. (2016). An approach to deal with monitoring of batch processes based on VAR models was first proposed by Choi et al. (2008). They consider the data batch structure (I × K × T ) as I K-dimensional time series of size T . In short, the authors propose a set of charts based on the traditional Hotelling statistic to monitor the VAR residuals, obtained thought the adjusted VAR from the I
4
historical samples, in a reduced variable space by using Partial Least Squares. Through a benchmark batch process, they show that the residual-based control approach outperforms competitive classical approaches such as Nomikos & MacGregor (1995) and Wold et al. (1998). Their VAR residuals scheme seems to hold substantial information about the variables dynamics, considering serial and cross correlations simultaneously. This paper proposes a new approach to deal with batch processes using VAR models focused on the VAR coefficients instead of the residuals. A monitoring scheme based on the estimated coefficients from the adjusted VAR model is presented. In short, we estimate VAR coefficients for each historical batch sample and compute information about the standard serial and cross correlations of the in-control process. After that, we use the Hotelling and the Generalized Variance control charts for offline and online monitoring of new batches. Through a simulated batch process, we show that the proposed Dynamic VAR-based control charts outperforms the residual-based approach. The paper is organized as follows: a brief description of the VAR models and the traditional multivariate control charts are presented in the background Sections 2 and 3. The proposed monitoring strategy is presented in Section 4. In Section 5, the proposed approach is illustrated through simulated data from an appropriate toy model. Conclusions are presented in Section 6.
2. Vector Autoregressive Model The VAR model is widely used in modeling multivariate time series. This model, initially proposed by Sims (1980), is specially useful for forecasting and jointly describing the dynamical behavior of financial time series and allows expressing the evolution of a set of K variables as a linear function of their past values. The basic form of the VAR(p) model is
z t = Φ0 +
p X
Φj zt−j + ut ,
j=1
5
t ∈ Z,
(1)
where Φ0 is a K × 1 vector of constants, Φj , j = 1, . . . , p is a K × K matrix of
model coefficients and ut is a K × 1 vector of error terms satisfying E[ut ] = 0,
Var[ut ] = Γ and Cov[ut , ut+h ] = 0, for every h = 1, 2, 3, . . . . If det(I K ζ p − Pp p−j ) 6= 0 for {ζ ∈ C; |ζ| ≤ 1}, then the VAR(p) process is called j=1 Φ1 ζ stable.
2.1. Model Estimation Properties of multivariate least squares (LS) estimation for VAR models are discussed in L¨ utkepohl (2005). Let z = (z1 , . . . , zT ) be a time series of dimension (K × T ). Denote by Φ = (Φ0 , Φ1 , . . . , Φp )
(2)
a (K×(Kp+1))-matrix of coefficients from equation (1). Let wt = (1, zt , . . . , zt−p+1 )0 be a vector of length (Kp + 1) and define W = (w0 , . . . , wT −1 ) a ((Kp + 1) × T ) dimensional matrix. Additionally, define a (K × T )-matrix U = (u1 , . . . , uT ). Using vectorial notation we can rewrite (1) as v = (W0 ⊗ IK )φ + u,
(3)
where φ = Vec(Φ), and u = Vec(U), ⊗ is the Kronecker product and Vec is the operator that transform a matrix into a vector. We note that the covariance matrix of u is Σu = IT ⊗ Γ. The multivariate LS estimator of φ is b = ((WW0 )−1 ⊗ IK )v. φ
(4)
From Proposition 3.1 in L¨ utkepohl (2005) we have that for a stable Kdimensional VAR(p) process with standard white noise innovations, the result 1 T
P
WW0 − → H holds with H non singular (it means that
1 T
WW0 converges
in probability to a limit H). Furthermore the LS estimator is consistent and asymptotically normal, that is, 1 L b − φ) = T 21 Vec(Φ b − Φ) − T 2 (φ → N (0, Σφ )
as T → ∞,
(5)
L
where Σφ = H−1 ⊗ Γ. The notation − → means convergence in distribution P
and − → means convergence in probability. In order to assess the asymptotic 6
dispersion of the LS estimator, we need to estimate the matrix Σφ . A consistent estimator for Σφ is given by L¨ utkepohl (2005) bφ = Σ
T (WW0 )−1 ⊗ z(IT − W0 (WW0 )−1 W)z. T − Kp − 1
(6)
3. Multivariate Statistical Control This section describes briefly the two classical multivariate control charts: the Hotelling and Generalized Variance charts for monitoring means and the covariance structure, respectively. Let X ∼ N (µ0 , Σ0 ) be a K-dimensional normal random variable, where µ0 is the vector of means and Σ0 is the covariance structure of the in-control process. In phase I, µ0 and Σ0 are estimated through independent reference samples from the in-control process, i.e., without any contamination in the covariance structure caused by some atypical event in the process. The control chart widely used to monitor multivariate means is based on the Hotelling statistic T 2 (Montgomery, 2007). In phase II, considering a new sample of size one, the T 2 statistic has the following form: 0
T 2 = (x − x) S−1 (x − x) ∼
K(n + 1)(n − 1) FK,n−K , n(n − K)
(7)
where T 2 is F -distributed, x and S represent the sample mean vector and the sample covariance, respectively, obtained from n reference samples. In each new sample, the score T 2 is obtained by (7). Scores above the α percentile imply that the new observation was obtained from a process with µ 6= µ0 . In a practical sense, it means that this observations was sampled during the process in which the variables (or a subset of them) have values far from their expected values. Consider once more the K-dimensional normal random variable X ∼ N (µ0 , Σ0 ). The process covariance structure can be monitored by the control chart based on the statistic called Generalized Variance. In phase II, for a new group of
7
samples of size m, the W statistic has the following form:
W = −Kn + Knln(n) − nln
det(A) det(S)
+ tr(S−1 A) ∼ χ2K(K+1)/2 ,
(8)
where W is χ2 -distributed and S is obtained from n reference samples. Here A = (m − 1)Si , in which Si is the covariance matrix of the new group of observation of size m. For the new group of m samples, the score W is obtained by (8) and scores above the α percentile imply that the new group of observation was obtained in a process with Σ 6= Σ0 . In this case, notice that the covariance structure is very different to the expected according to S. A wide description of the T 2 and W charts including real data applications can be found in Johnson & Wichern (2007) and Montgomery (2007).
4. Dynamic VAR-Model Based Control Approach 4.1. Offline design Consider a historical data set containing realizations of I batches yielding products compliant with specifications. The data represent I well succeeded batches each one containing the time trajectory of K process variables, measured at T equal time-instants (i.e., K-dimensional time series of size T ). So, for each batch, consider the array Zi = (zi,1 , . . . , zi,T ) of dimension (K × T ), for i ∈ {1, . . . , I}. Assume that these I samples come from the in-control process, i.e., these samples represent the trajectories of the K variables in the process under normal regime. Under a statistical point of view, this means that the behaviour of these samples are related to the dynamics of the same stochastic process. Thus, it is reasonable to assume that this dynamic is well characterized by a VAR(p) model. In phase I, for each historical batch, we fit a VAR(p) model (1) and store
b (4) and Σ b φ (6), for the vector (column vector) of estimated parameters φ i i i = 1, . . . , I. In the next step, we build the unique estimates of the mean and
8
variance of the parameters in Φ (2) by combining the individual estimates: I
φ=
1Xb φ I i=1 i
I
and Σφ =
1Xb Σφi . I i=1
(9)
These estimates hold relevant information about the correlation structure (including cross and serial correlation from lag 1 to p) of the K variables in the process operating in a normal regime. Since φ is asymptotically normally distributed (as a consequence of (5)), we propose control charts based on the Hotelling T 2 and the Generalized Variance W statistics, described in Section 3. In phase II, the new batch is monitored using (7) and (8) rewritten in the following way:
and
0 C(I + 1)(I − 1) −1 b b Tφ2 = (φ FC,I−C new − φ) Σφ (φnew − φ) ∼ I(I − C)
(10)
Wφ = −C(T − 1) + C(T − 1)ln(T − 1) − (T − 1) D + E ∼ χ2C(C+1)/2 , (11) (−1) in which D = ln det(A)/ det(Σφ ) , E = tr(Σφ A). C is the length of φ and
b φ , where Σ bφ A = (T − 2)Σ is the estimated covariance matrix obtained new new 2 b from the φ new . Scores above the α percentile in Tφ or Wφ imply that the
correlation structure of the variables (cross or serial correlation from lag 1 to p) are very different to their expected values for the in-control process. The overall view is shown in Figure [2 (a)].
9
(a)
(b)
Phase I
OFFLINE HandwritingOFFL INE
K - dimensional VAR
ONLINE HandwritingOFFL INE Phase I
i
Random Array
/ 2
Phase II new K - dimensional VAR
Fitting Array
Coupled Array
/ 2
new
i,t
,
/ 2 Phase II ,
K - dimensional VAR
new
new,t
Figure 2: The overall view of (a) offline and (b) online approaches.
4.2. Online design Consider once more the arrays Zi = (zi,1 , . . . , zi,T ), for i ∈ {1, . . . , I}, containing the I sampled K-dimensional time series of size T , obtained from the incontrol batch process. In order to propose an online control approach, as in preliminary stage we must design the online reference distribution for the in-control observations using the historical batches Zi . First we randomly split these I 10
samples into two groups of size I/2: the random group Zrj = (zrj,1 , . . . , zrj,T ) and the fitting group Zfi = (zfi,1 , . . . , zfi,T ), with i, j ∈ {1, . . . , I/2}.
For each fitting element Zfi , in each time-instant t ∈ {1, . . . , T }, we define
the coupled array Zci,t as follow: Zci,1
=
(zrj(s1 ),1 , . . . , zrj(s1 ),Lc −1 , zfi,1 )
Zci,2
=
(zrj(s2 ),1 , . . . , zrj(s2 ),Lc −2 , zfi,1 , zfi,2 )
Zci,3
=
(zrj(s3 ),1 , . . . , zrj(s3 ),Lc −3 , zfi,1 , zfi,2 , zfi,3 )
.. . Zci,t
=
(zrj(st ),1 , . . . , zrj(st ),Lc −t , zfi,1 , zfi,2 , zfi,3 , . . . , zfi,t ),
(12)
for t ∈ {1, . . . , Lf } and Zci,Lf +1
=
(zrj(st ),1 , . . . , zrj(st ),Lr , zfi,2 , zfi,3 , . . . , zfi,Lf +1 ),
Zci,Lf +2
=
(zrj(st ),1 , . . . , zrj(st ),Lr , zfi,3 , zfi,4 , . . . , zfi,Lf +2 ),
.. . Zci,T
=
(zrj(st ),1 , . . . , zrj(st ),Lr , zfi,t−Lf +1 , zfi,t−Lf +2 , . . . , zfi,T ),
for t ∈ {Lf + 1, . . . , T }, where Lr and Lf are the maximum number of observa-
tions in Zci,t from random and fitting vectors, respectively, in which Lc = Lr +Lf
is the length of the coupled array Zci,t . The subscripts j(st ) represent that for each time-instant, a new random array Zrj is randomly chosen (according to the discrete uniform distribution) to compose the coupled array Zci,t . The array Zci,t is composed of Lr successive observations from an element Zrj and Lf successive observations from the fitting element Zfi . So, for each t we have as many coupled arrays as the fitting elements in the online reference distribution. In the next step, for each Zci,t , i ∈ {1, · · · , I/2} and t ∈ {1, . . . , T } we obtain the LS estimates of the VAR(p) coefficients and denote this (K×(Kp+1)) matrix b i,t . Thus, for a fixed time-instant, we have I/2 matrices Φ b i,∗ containing by Φ information on cross and serial correlation of lags from 1 to p among variables
11
of the in-control process. In Subsection 4.4 we show that the coupled time series Zci,t is non-stationary and prove that the asymptotic properties of LS estimators are the same as those for the classical VAR(p) model. b i,t for This approach aims to remove the correlation among the estimates Φ
b i,t , for each i. Moreover, this fixed t, and reduce the serial correlation among Φ approach allows monitoring of the new ongoing batch since the first time-instant.
Our online procedure consists in applying the proposed offline control approach for each time-instant. In phase I, we now have I/2 batches in the reference distribution. For each time-instant, we estimate the mean and the b i,∗ like in (9). In phase II, the ongoing new variance of the parameters in Φ batch at time-instant t is monitored similar to (10) and (11), replacing T by Lc
in (11). Thus we can monitor the cross and serial correlation of the variables during the batch through the scores Tφ2t and Wφ2 t . The overall scheme is shown in Figure [2 (b)]. 4.3. The concept behind the online methodology The first issue to discuss is why we use an approach based on VAR models in a context of batch processes. Such processes generate, for each batch, Kdimensional time series of size T . As we mentioned in the Introduction, the classical approaches do not consider any time series framework to model this kind of data. These tri-dimentional array (I ×K ×T ) are rearranged to compose a two-dimensional data array of the forms (I × KT ) or (IT × K), and some multivariate technique are used to build control charts. The first array try to prioritize the time dynamic of the variables (serial correlation), however it does not consider directly the cross correlation structure. On the other hand, the second one seeks to do the opposite. Through the VAR-based methodology, we can model and monitor the cross and serial correlation simultaneously. The second issue is why we monitor the VAR coefficients. The VAR-based approaches to monitoring continuous and batch processes found in the literature are mostly focused in VAR residuals obtained in phase I. The residuals capture, in some way, changes in variable dynamics, including cross and serial 12
correlations. However, through direct monitoring of VAR parameters estimates, we are able to build a control approach with better sensitivity that allows to detect changes in variable dynamics faster. The third point is about our methodology itself. More specifically, why we split the data into random and fitting groups in phase I and how does the size of each portion impacts in its performance. For fixed Lc , if the random portion Lr of the reference coupled array Zci,t is large, it decreases the charts performance, because most of the information comes from the in-control process. On the other hand, if Lr is small, the fitting portion Lf will be large causing an increase in b t and, once again, the charts the serial correlation among the VAR parameters Φ
performance decreases. Therefore, it is clear that we need a balance between the sizes Lr and Lf in each coupled array. Let us assume that the sizes Lr and Lf are similar. For large Lc , we have a lot of information about the in-control process, decreasing the performance, and increasing computational time. Alternatively, small Lc causes a poor estimation of the VAR coefficients, restrains the use of the asymptotic theory and, by consequence, causes instability on the charts control limits. Additionally, the use of the random portion in the coupled array guarantees independence among batches for each fixed time-instant t, which is a required assumption for the T 2 and W statistics. Besides that, it minimizes the b t estimates, even thought this is not a central problem. correlations among Φ 4.4. Asymptotic properties of LS estimator for coupled processes
The goal of this Section is to derive the asymptotic properties of the LS estimators of VAR(p) parameters for modelling the coupled arrays Zci,t . A detailed account on the theory of coupling can be found in Lindvall (2002). However, unlike our proposition, the classical theory of coupling assume random coupling time, countable state space and identically initial distribution. The proof procedure is similar to the one used by Fuller (1976) and concern in the univariate case, since for multivariate structure the proof differs in no substantive way from the univariate framework. We start defining coupled process. To do that, let {xt }t∈Z be a univariate stochastic process AR(p) defined 13
by xt = φ1 xt−1 + · · · + φp xt−p + at , where the roots of ζp −
p X
φi ζ p−i = 0
(13)
(14)
i=1
are less than one in absolute value and {at }t∈Z are independent random variables
with zero mean and variance σ 2 . Assume that E[a4t ] < ∞, for all t ∈ Z. Suppose the stochastic process {yt }t∈Z is an independent copy of {xt }t∈Z , defined by yt = φ1 yt−1 + · · · + yt−p + εt ,
(15)
where the sequence {εt }t∈Z has the same properties of {at }t∈Z . Additionally, {at }t∈Z and {εt }t∈Z are independent for all t, s ∈ Z. We define the coupled
process {zt }∞ t=1 by
zt =
xt , if t ≤ τ ;
(16)
yt−τ , if t > τ,
where τ is a non random coupling time, possibly depending on t. Note that both {xt }t∈Z and {yt }t∈Z processes are stationary. The first natural and critical question is whether {zt }t∈Z is a stationary process. To answer this question we have to verify whether the first moments, mean, variance and covariance function are not depending on t. Observe that the processes {xt }t∈Z and {yt }t∈Z may be represented as stationary moving average processes. The M A(∞) representation are given by xt =
∞ X
ωj at−j ,
and
j=0
yt =
∞ X
ωj εt−j ,
(17)
j=0
where the weights ωj are defined in Fuller (1976) pg. 57. Therefore, the coupled process {zt }t∈Z can be expressed as P∞ ωj at−j , if t ≤ τ ; j=0 zt = P ∞ ωj εt−τ −j , if t > τ. j=0 14
(18)
For the especial case where {xt }t∈Z and {yt }t∈Z are AR(1) processes with autorregressive parameter φ, the coupled process becomes P∞ φj at−j , if t ≤ τ ; j=0 zt = P ∞ φj εt−τ −j , if t > τ. j=0
(19)
From this representation of {zt }t∈Z is straightforward to verify that E[zt ] = 0
and E[zt2 ] = σz2 < ∞, for all t ∈ Z, since at and εt are independent sequences of P∞ random variables and j=0 |ωj | < ∞. Thus, mean and variance of {zt }t∈Z do not depend on t. The autocovariance function of {xt }t∈Z and {yt }t∈Z are iden-
tically, that is γx (h) = γy (h) = γ(h). The autocovariance function of {zt }t∈Z , γzt (h) = Cov(zt , zt+h ) is γ(h) if t and t + h are smaller than τ or, if t and t + h are grater than τ . If t is smaller than τ and t + h are grater than τ , then γzt (h) = 0. Therefore, we can write γzt (h) as γ(h), if |h| ≤ |τ − t|; γzt (h) = 0, otherwise ,
(20)
which depends on t. Consequently {zt }t∈Z is a non stationary process. For the stationary process {xt }t∈Z in (13), we define the parameter φ =
(φ1 , . . . , φp )0 and let {xt }Tt=1 be a time series of size T from {xt }t∈Z . Moreover, let b = A−1 vT , φ T
(21)
xt = (xt−1 , . . . , xt−p )0 and the vector vT =
P
T 1 0 t=p+1 xt xt , the vector T −p P T 1 t=p+1 xt xt . The existence T −p
be the LS estimator of φ where the matrix AT =
of AT and A−1 utkepohl (2005). From T is discussed in the Section 3.2.2 in L¨ Theorem 8.2.1 in Fuller (1976) we have that 1
P
L
b − φ) → N (0, A−1 σ 2 ), T 2 (φ
(22)
where AT → A. Note that the expression (22) is equal to (5) when K = 1. The same result holds for {yt }t∈Z in (15). Although {zt }t∈Z is a non stationary coupled process, we will prove that the LS estimator of {zt }t∈Z parameters of process in (16) is asymptotically normal. 15
We first show a central limit theorem for m-dependent coupled processes. Assuming that {xt }t∈Z and {yt }t∈Z are m-dependent process (Definiton 6.3.1 in Fuller (1976)), then the coupled process {zt }t∈Z is m-dependent, since the two sets (. . . , zr−2 , zr−1 , zr ) and (zs , zs+1 , zs+2 , . . . ) are independent if s − r > m, where m is a positive integer. Theorem 4.1. Let {xt }Tt=1 be m-dependent time series with E[xt ] = 0, E[x2t ] = σt2 < ∞, |E[x3t ]| < ∞. Let
νt = E[x2t+m ] + 2
m X
E[xt+m−j xt+m ]
(23)
j=1
and suppose that the limit s
1X νt+j = ν s→∞ s j=1 lim
(24)
exists and it is uniform for t ∈ {1, 2, 3, . . . }, where ν 6= 0. Let {yt } be an independent copy of {xt }. Then, {yt } has the same asymptotic variance of {xt } given by (24). Let {zt } be a coupled time series defined in (16) where the coupling time is a constant (τ ) or a function of T (τT ), satisfying τT /T = δ ∈ (0, 1). Then, 1
T−2
T X t=1
L
zt → N (0, ν),
T → ∞.
(25)
Proof: See Appendix. The next result presents the asymptotic properties of LS estimators of the AR(p) coefficients for non stationary univariate coupled time series {zt }. Theorem 4.2. Let {zt }Tt=1 be a coupled AR(p) time series satisfying zt = φ1 zt−1 + · · · + φp zt−p + ut ,
(26)
where ut is an i.i.d. (0, σ 2 ) sequence with E[u4t ] < ∞. Then, 1 L b − φ) → T 2 (φ N (0, A−1 σ 2 ),
b φ and A are defined following (21). where φ, 16
(27)
Proof: See Appendix. The purpose of this Section is to obtain a result similar to (5) for multivariate coupled processes. Since we have proved the univariate case, by Theorem 8.2.3 in Fuller (1976) we have a natural extension of these results for the multivariate framework. Let {xt }Tt=1 be a stable K-dimensional VAR(p) time series represented by xt =
p X
Φj xt−j + at
(28)
j=1
where the standard white noise residuals at and the VAR coefficients Φj are defined similar to (1). The ith coordinate of xt is xit =
p X
ϕji xt−j + ait
(29)
j=1
where xit is the ith element of xt , the vector ϕji is the ith row of Φj , and ait is the ith element of at . Now, define θ i = (ϕ1i , . . . , ϕpi )0 , i ∈ {1, . . . , K} and wt = (x0t−1 , . . . , x0t−p )0 . Then, we can rewrite (29) as xit = wt0 θ i + ait . From univariate results, a natural estimator for θ i is " T #−1 T X X 0 b wt xit , i ∈ {1, 2, . . . , K}. θi = wt wt
(30)
(31)
t=p+1
t=p+1
Theorem 4.3. Let the vector time series {yt }Tt=1 be an independent copy of {xt }Tt=1 defined in (28). Define the K-dimensional coupled time series {zt }Tt=1
as zt =
xt , if t ≤ τ ;
yt−τ , if t > τ,
(32)
where the coupling time is a constant (τ ) or a function of T (τT ), satisfying τT /T = δ ∈ (0, 1). The vector time series {zt }Tt=1 satisfy zt =
p X
Φj zt−j + ut
j=1
17
(33)
where ut and Φj are defined similar to (28). Then, as in (5) we have L
b − φ) → N (0, H−1 ⊗ Γ), T 1/2 (φ
(34)
b = Vec(Φ) b0 , · · · , θ b0 ), Φ bi is defined in (31), b = (θ b = (Φ b 1, . . . , Φ b p ), θ where φ 1 K P P T 1 0 → H. Γ = Var(ut ) and T −p t=p+1 wt wt −
Proof: See Appendix. 5. Simulation Study 5.1. Offline case
For the sake of simplicity, we consider a batch process with 2 variables in which the dynamic behaviour follows a VAR(1) model, as described in (1), for p = 1, K = 2 and vector of intercepts Φ0 = 0. More explicitly, we have zt = Φzt−1 + ut ,
(35)
where ut represent the residual vector at time-instant t with Γ = E(ut u0t ). The bivariate vector of observations zt can be rewritten as z1,t φ11 φ12 z1,t−1 u1,t = + . z2,t φ21 φ22 z2,t−1 u2,t
(36)
In phase I, we assume that, under normal operating conditions, the batches
are generated by:
φ11 Φ= φ21
−0.3 0.4 = . φ22 0.4 0.5 φ12
(37)
The matrix Φ contains information about serial correlation of lag 1 on the main diagonal and contains information about cross correlation between z1,t and z2,t lagged by 1; and between z2,t and z1,t lagged by 1, on antidiagonal. In order to evaluate the efficiency of the proposed strategy, in phase II study, we consider four scenarios, as described below: φ11 Φ= 0.4
0.4 0.5
and
18
−0.3 0.4
φ12 0.5
,
(38)
with φ11 and φ12 taking values in {−0.90, −0.85, −0.8, . . . , 0.90, 0.95}. This values represent a wide range of disturbance levels. In the first case we disturb one serial correlation of lag 1 (z1,t is chosen) and in the second case we disturb one cross correlation of lag 1 (we choose z1,t and z2,t lagged by one). These values of Φ represent a typical stationary VAR. The VAR(1) time series is generated by assuming that the residual covariance matrix is the identity (Γ = I). In each scenario we generate 1,000 batches in phases I and II, with T = 300
time-instants. We replicate each scenario 10,000 times. In phase I, Tφ2 (10) and Wφ (11) charts were setting to the false alarm probability of α = 0.01. We adopt the ARL index (Average Run Length) to evaluate charts performance. ARL = 1/r, where r is the rate of batches beyond the control limits. We consider ARL0
as the average number of batches until a false alarm (for α = 0.01, ARL0 = 100), i.e., points above control limits in the process without disturbances (in-control process). In opposite, ARL1 is the average number of samples until an out-ofcontrol batch falls outside the control limits. This last one is a measure of the charts sensibility. As a benchmark approach, we consider the residual control approach arranged according to Choi et al. (2008), as mentioned in the Introduction. They propose a monitoring scheme based on VAR adjusted residuals, obtained from historical in-control batches. In phase I, the residuals from the I historical batches are arranged in a (I(p − 1) × K) array, where p is the VAR order and K is the number of variables. The set of control charts based on the Hottelling statistics is used to evaluate new batches. Considering that in our simulations batch data is generated from a VAR(1) model, we actually do not need to estimate the VAR parameters in phase I. So, the residual in each time-instant is b t = zt − Φzt−1 , where Φ is given in (37). The new batches are so obtained by u
monitored by the proposed charts (10) and (11) using the VAR residuals of the b t,new , arranged according to Choi et al. (2008) new batch under monitoring, u
(we call Tu2 and Wu charts). Simulations and calculations were conducted using R (Team et al., 2018).
Tables 1 and 2 summarize the results of both approaches in each scenario. 19
The tables show the mean and standard deviation of ARL values for each disturbance. The results of the two scenarios are similar. Therefore, they will not be commented separately. Table 1: Mean and standard deviation of ARL0 (in gray) and ARL1 values for disturbances in φ11 for comparing (Tφ2 , Wφ ) and (Tu2 , Wu ) charts.
Tφ2
Disturbance
Tu2
Wφ
Wu
φ11
mean
sd
mean
sd
mean
sd
mean
sd
0.10
1.00
0.00
2.31
0.28
60.63
1.00
3.20
0.19
0.05
1.00
0.00
3.26
0.51
68.01
1.19
6.12
0.52
0.00
1.00
0.00
5.01
0.91
74.89
1.31
12.79
1.57
-0.05
1.05
0.01
8.04
1.74
81.09
1.51
26.63
4.91
-0.10
1.32
0.03
14.01
3.73
86.54
1.58
50.07
12.15
-0.15
2.42
0.11
25.07
8.10
90.93
1.80
77.22
23.45
-0.20
7.26
0.68
48.27
21.21
94.33
1.84
98.48
39.30
-0.25
32.80
6.21
84.90
56.58
96.36
1.89
105.26
43.39
-0.30
101.93
38.36
99.54
30.13
97.01
1.92
108.41
49.26
-0.35
43.46
10.21
83.73
44.59
96.27
1.90
103.91
38.65
-0.40
8.70
0.88
36.11
13.52
93.77
1.83
93.18
43.42
-0.45
2.48
0.12
11.87
3.11
89.42
1.68
62.38
18.76
-0.55
1.04
0.01
1.78
0.17
74.76
1.33
10.44
1.16
-0.60
1.00
0.00
1.18
0.04
64.28
1.11
3.79
0.24
-0.65
1.00
0.00
1.02
0.01
51.90
0.87
1.75
0.06
-0.70
1.00
0.00
1.00
0.00
38.35
0.64
1.16
0.02
In the gray line we highlighted that the observed ARL0 is close to the chosen nominal value of 100, which is consistent with the fact that no disturbance were introduced in the process. For the remaining cases, either in the Hotelling T 2 as the W charts, it is evident from the ARL1 values that the proposed
20
Dynamic VAR-based charts outperform the residual-based charts in detecting disturbances of different intensities. Additionally, we notice that the degree of detection in our approach increases faster as the perturbations get more intense. Even for the higher values disturbances, the performance of our approach remains better than the residual-based charts. We emphasize here the power of the proposed approach (based on estimates of correlations in Φ) to capture information about process dynamics. Table 2: Mean and standard deviation of ARL0 (in gray) and ARL1 values for disturbances in φ12 for comparing (Tφ2 , Wφ ) and (Tu2 , Wu ) charts.
Tφ2
Disturbances
Tu2
Wφ
Wu
φ12
mean
sd
mean
sd
mean
sd
mean
sd
0.05
1.00
0.00
1.97
0.29
64.93
1.09
4.70
0.36
0.10
1.00
0.00
3.16
0.67
71.63
1.28
9.62
1.05
0.20
1.14
0.02
11.66
3.46
84.24
1.57
42.34
9.78
0.25
1.78
0.07
24.06
8.74
89.50
1.71
72.70
22.99
0.30
4.97
0.38
51.27
21.85
93.53
1.83
96.22
45.10
0.35
24.56
4.41
91.35
58.36
96.20
1.89
105.40
40.88
0.40
104.62
43.50
115.69
81.73
97.01
1.95
107.82
41.54
0.45
38.09
8.42
80.10
44.76
95.98
1.94
105.86
45.57
0.50
6.06
0.55
34.85
18.13
92.92
1.71
92.58
30.93
0.55
1.84
0.07
12.35
3.48
88.03
1.62
63.21
19.24
0.65
1.01
0.00
2.30
0.33
73.62
1.32
12.05
1.47
0.70
1.00
0.00
1.41
0.11
64.80
1.13
4.80
0.36
21
5.2. Online case Lets assume again the simulated process presented in (35) and (36) with the in-control parameters described in (37). In phase II we consider the scenario for disturbances as described in (38) with φ11 and φ12 taking values in the set {−0.4, −0.3, −0.2, 0.0, 0.2, 0.4}. For each scenario in phase I we generate I = 2, 000 batches of T = 300 time-instants, which are equally split into random and fitting sets. In phase II, we generate 100 batches with 100 replications of each disturbance. For out of control processes, disturbances were imposed in three different time-instants (t∗ ), taking values in {5, 50, 200}. For the coupled arrays we set Lc = 50, Lr = 30 and Lf = 20. We use ART L index (Average Relative Time Length) to evaluate the charts performance. The ART L records the number of time-instants, from t∗ , until the chart signalize the out of control condition, given the maximum number (T − t∗ ) of time-instants available. We set the false alarm probability to α = 0.01, which corresponds to ARL0 = 100. The benchmark approach is again based on the work of Choi et al. (2008). The residuals from the adjusted VAR model are arranged in a (I(p − 1) × K) array. In the online context we cannot fully apply our aproach, since the estimates of residual covariance at every new time-instant are not available. So we can not use the W based-chart to monitor an ongoing batch. For that reason the online comparative study is performed using only the Tu2 chart (the
Hotelling T 2 based on VAR residuals). Its important to notice that this is an additional advantage of our approach over the residual based one. The Tφ2t and Wφ2 t charts allow monitoring both mean and covariance of Φt .
Tables 3 and 4 summarizes the results of both approaches in each scenario. The tables show the mean and standard deviations of the ART L values for each disturbance. In the gray line we highlighted that ARL0 values is closed to the nominal target of 100, since the charts were set to false alarm probability of α = 0.01. As in the offline case, the results of the two scenarios are similar and will not be commented separately. Notice that the ART L measure, for each scenario t∗ , is associated to a distinct probability space, since the remaining times (T − t∗ ) do not have the 22
same length. So, we cannot compare the ART L for different t∗ . However, for each t∗ , the set of charts Tφ2t and Wφ2 t present better ART L values than Tu for all disturbance sizes. Additionally, the degree of detection of these charts increases faster as the perturbations get more intense. We emphasize that, for small t∗ , the sensibility of our Dynamic VAR-base aproach is even more pronounced over the residual based approach, showing a good monitoring power from the very first time-instant. Table 3: Mean and standard deviation of ARL0 (in gray) and ART L values for disturbances 2 ) and (T 2 ) charts. in φ11 for comparing (Tφ2 , Wφ u t
t∗
Disturbance
t
Tφ2t
Wφ2 t
Tu2
φ11
mean
sd
mean
sd
mean
sd
5
0.4
0.063
0.004
0.159
0.016
0.089
0.009
5
0.2
0.102
0.009
0.242
0.024
0.178
0.018
5
0.0
0.179
0.016
0.292
0.031
0.268
0.025
5
-0.2
0.270
0.030
0.317
0.026
0.326
0.029
5
-0.3
118.434
8.808
122.962
8.606
132.331
8.700
5
-0.4
0.310
0.024
0.324
0.026
0.326
0.025
50
0.4
0.073
0.005
0.187
0.018
0.105
0.013
50
0.2
0.125
0.012
0.272
0.024
0.209
0.019
50
0.0
0.208
0.019
0.314
0.027
0.295
0.026
50
-0.2
0.299
0.027
0.341
0.031
0.339
0.027
50
-0.3
116.546
8.720
124.735
8.349
133.298
8.402
50
-0.4
0.332
0.024
0.346
0.029
0.337
0.025
200
0.4
0.189
0.014
0.334
0.032
0.235
0.020
200
0.2
0.280
0.023
0.339
0.044
0.319
0.031
200
0.0
0.336
0.038
0.320
0.057
0.324
0.051
200
-0.2
0.319
0.052
0.305
0.049
0.314
0.057
200
-0.3
118.020
8.109
123.394
9.260
133.861
8.638
200
-0.4
0.296
0.057
0.294
0.057
0.320
0.065
23
Table 4: Mean and standard deviation of ARL0 (in gray) and ART L values for disturbances 2 ) and (T 2 ) charts. in φ12 for comparing (Tφ2 , Wφ u t
t∗
Disturbance
t
Tφ2t
Wφ2 t
Tu2
φ12
mean
sd
mean
sd
mean
sd
5
0.4
117.119
8.658
123.914
8.569
131.826
7.543
5
0.3
0.263
0.024
0.321
0.030
0.324
0.028
5
0.2
0.197
0.019
0.314
0.031
0.294
0.026
5
0.0
0.115
0.010
0.279
0.027
0.215
0.020
5
-0.2
0.076
0.005
0.239
0.026
0.136
0.014
5
-0.4
0.058
0.003
0.184
0.022
0.084
0.008
50
0.4
116.644
7.319
125.233
8.951
131.199
9.022
50
0.3
0.283
0.026
0.342
0.030
0.338
0.025
50
0.2
0.231
0.022
0.334
0.030
0.324
0.024
50
0.0
0.136
0.012
0.307
0.024
0.243
0.022
50
-0.2
0.091
0.006
0.269
0.028
0.158
0.015
50
-0.4
0.070
0.004
0.216
0.024
0.098
0.009
200
0.4
117.075
9.036
123.523
8.132
133.939
8.779
200
0.3
0.334
0.045
0.302
0.057
0.317
0.057
200
0.2
0.332
0.040
0.310
0.056
0.317
0.047
200
0.0
0.298
0.026
0.313
0.056
0.329
0.045
200
-0.2
0.222
0.017
0.346
0.048
0.301
0.032
200
-0.4
0.174
0.010
0.351
0.035
0.225
0.020
In order to show that the performance of the proposed method are not affected by the number of batches used to built the control charts, table 5 summarizes ART L results of additional simulations varying the number of batches in the reference data set (200 and 500) with two levels of disturbances in φ11 and φ12 . For the previous one, the in-control value is φ11 = −0.3 and the disturbances are 0.0 (moderate) and 0.4 (severe). For the later, the in-control value
24
is φ12 = 0.4 and the disturbances are 0.0 (moderate) and -0.2 (severe). The results are very similar to the previous ones presented in Tables 3 and 4, for which 1000 batches were considered in the reference data set. We can see that, regardless of the time-instant t∗ that the disturbance was introduced, the size of the data set used to determine the control limits doesn’t change the good performance of Tφ2t . 2 ) and (T 2 ) Table 5: Mean and standard deviation of ART L values for comparing (Tφ2 , Wφ u t
t
charts for two intensity levels (moderate and severe) of disturbances in φ11 and φ12 , and two different sample sizes (200 and 500). In-control
I
Parameter
#bat
200
φ11 =-0.3
500
200
φ12 =0.4
500
t∗
Disturb.
mean
sd
mean
sd
mean
sd
5
0.4
0.053
0.005
0.128
0.016
0.090
0.009
Tφ2t
2 Wφ t
Tu2
5
0.0
0.124
0.015
0.220
0.029
0.273
0.028
50
0.4
0.063
0.005
0.155
0.019
0.106
0.010
50
0.0
0.152
0.017
0.246
0.027
0.290
0.023
200
0.4
0.161
0.014
0.314
0.029
0.241
0.024
200
0.0
0.304
0.030
0.350
0.047
0.326
0.049
5
0.4
0.060
0.004
0.150
0.016
0.090
0.010
5
0.0
0.160
0.018
0.270
0.030
0.267
0.025
50
0.4
0.072
0.005
0.177
0.020
0.107
0.011
50
0.0
0.195
0.020
0.294
0.029
0.297
0.026
200
0.4
0.179
0.014
0.332
0.031
0.236
0.024
200
0.0
0.339
0.037
0.323
0.047
0.322
0.048
5
-0.2
0.061
0.005
0.193
0.033
0.137
0.015
5
0.0
0.087
0.009
0.218
0.031
0.217
0.022
50
-0.2
0.075
0.007
0.214
0.027
0.158
0.014
50
0.0
0.103
0.010
0.247
0.033
0.245
0.024
200
-0.2
0.184
0.017
0.351
0.038
0.302
0.027
200
0.0
0.244
0.024
0.352
0.035
0.323
0.039
5
-0.2
0.072
0.005
0.220
0.022
0.137
0.013
5
0.0
0.105
0.009
0.265
0.034
0.216
0.020
50
-0.2
0.085
0.007
0.252
0.027
0.160
0.015
50
0.0
0.125
0.012
0.286
0.031
0.247
0.024
200
-0.2
0.212
0.016
0.343
0.039
0.301
0.029
200
0.0
0.288
0.025
0.346
0.050
0.327
0.042
25
5.3. Choice of tuning parameters As we mentioned in the Section 4.3 the proposed approach is sensitive to the choice of Lr , Lf and Lc values. The choice of this values in the simulated case study were based on a previous analysis considering the ideas discussed in Section 4.3. Although we are not concerned on finding the optimal set, we have made preliminary simulations varying the size of Lc (from 15 to 70) and the proportion of elements (Lr ,Lf ) in Lc from (40%, 60%) to (60%, 40%). We noticed that the charts performance decreases only for the smallest size of Lc . For values of Lc higher than 40 the charts became very sensitive to detect disturbances of any intensities. However, for the highest values of Lc (close to 70), the computational burden increases considerably. Additionally, we observed that varying the proportion of elements of the pair (Lr ,Lf ) doesn’t impact significantly the charts performance. We conclude that, considering batches that generate time-series of sizes of at least 300 time-instants, the choice of Lc around 40 to 50 with nearly equal proportion of elements in (Lr ,Lf ) will offer charts settings close to optimal performance. 5.4. Computational aspects It is important to highlight that the computational complexity of our approach may only be an issue in the online charts settings, since we need to run coupled vectors from the reference batches to obtain control limits for every time-instant. However, this procedure has a very short computational time comparing to the number of batches and time-instants in the reference data set. In our simulation study for a reference set of 1000 batches with 300 time-instants each, with Lc = 50 (Lr =20, Lf = 30), the average time to calculate the online limits is 1.3 minutes by using a PC with a standard configuration [Intel(R) Core(TM) i7-4770 3.4GHz]. Additionally, we must keep in mind that this procedure of “setting limits” is done only once, so this time length is very reasonable, making our approach computationally feasible for online monitoring.
26
In the monitoring phase the computational time is even faster. Considering a new ongoing batch, the time spent to calculate the Tφ2t and Wφ2 t scores for each time-instant is about 0.00033 seconds on the average. These facts confirm how computationally applicable the method is. 5.5. Pratical issues We have showcase the capabilities of the proposed approach in monitoring a toy model representing a batch process in which the variable dynamics is described by a VAR(p), with p = 1, without intercept (Φ0 = 0). However, the proposed method can work easily on more complex processes, including the case p > 1 needed for VAR fitting and in the presence of intercepts. Indeed, the proposed approach is independent of p and Φ0 . In a practical sense, for a real reference data set we can estimate the VAR order (p) in a previous stage and, once p is defined, perform our VAR-based control charts by using equations (2) to (4) and (6), including as many Φj , for j = 1, · · · , p, as the number p. Additionally, considering that we are concerned only in the monitoring of the variables dynamic (regardless the variables means), even if the real data demands a non zero intercept, we can vanish this term by subtracting each variable from their means for each batch in a previous step. This operation fully preserves the data dynamics and turns Φ0 = 0, making the presented approach applicable without any modification. 6. Conclusion This paper presented a new approach to deal with batch processes using VAR models focused on the VAR coefficients instead of the residuals. The online and offline designs are described. We have shown that the online version based on coupled arrays allows powerful monitoring since the very first time-instant. The idea of coupled array established a new class of coupled time series with some specific characteristics. We have shown that the asymptotic theory for least squares estimation in this class of VAR process is very similar to the classical VAR case. 27
Through a simulated toy model representing a batch processes with two variables, we have shown the proposed model outperforms the residual-based control charts in both offline and online context. The proposed approach based on the estimated coefficients from Φ seems to work very well and hold more information about the variable dynamics than the classical one based on residuals. Although performing efficiently in the simulated examples presented above, some aspects of Dynamic VAR-based charts use may be challenging to practitioners, requiring further refinements in the method. Three of them are easy to solve and, although not explicitated in this article, are mentioned here: (i) extension of the proposed method to monitoring batch process with more (or many more) than two variables and when the VAR order is p > 1; (ii) development of diagnostics tools to support the interpretation of out-of-control signals; (iii) extension of the proposed methodology to modeling data in phase I when the underling process is unknown (as in a real dataset), including non stationarity, identification of the VAR order p and considering cross contemporary correlations (correlations of lag 0). All of these issues are easy to handle in the proposed approach and already currently being addressed by the authors.
Acknowledgements We thank Gabriela Cybis and Guilherme Pumi for fruitful discussions leading to this paper.
Appendix Title: Appendix for “Dynamic Var Model-Based Control Charts for Batch Process Monitoring” (pdf) R-code: R-codes used on simulation are available upon request.
28
References Camacho, J., Pic´ o, J., & Ferrer, A. (2009). The best approaches in the on-line monitoring of batch processes based on pca: Does the modelling structure matter? Analytica chimica acta, 642 , 59–68. Choi, S. W., Morris, J., & Lee, I.-B. (2008). Dynamic model-based batch process monitoring. Chemical Engineering Science, 63 , 622–636. Fuller, W. A. (1976). Introduction to statistical time series. Wiley. Johnson, R. A., & Wichern, D. W. (2007). Applied multivariate statistical analysis. Van den Kerkhof, P., Gins, G., Vanlaer, J., & Van Impe, J. F. (2012). Dynamic model-based fault diagnosis for (bio) chemical batch processes. Computers & chemical engineering, 40 , 12–21. Leoni, R. C., Costa, A. F. B., Franco, B. C., & Machado, M. A. G. (2015). The skipping strategy to reduce the effect of the autocorrelation on the t2 chart’s performance. The International Journal of Advanced Manufacturing Technology, 80 , 1547–1559. Lindvall, T. (2002). Lectures on the coupling method . Courier Corporation. L¨ utkepohl, H. (2005).
New introduction to multiple time series analysis.
Springer Science & Business Media. Montgomery, D. C. (2007). Introduction to statistical quality control . John Wiley & Sons. Nomikos, P., & MacGregor, J. F. (1995). Multivariate spc charts for monitoring batch processes. Technometrics, 37 , 41–59. Pan, X., & Jarrett, J. E. (2012). Why and how to use vector autoregressive models for quality control: the guideline and procedures. Quality & Quantity, 46 , 935–948. 29
Peng, J., Liu, H., Hu, Y., Xi, J., & Chen, H. (2014). Ascs online fault detection and isolation based on an improved mpca. Chinese Journal of Mechanical Engineering, 27 , 1047–1056. Sim˜ oes, F. D., Leoni, R. C., Machado, M. A. G., & Costa, A. F. B. (2016). Synthetic charts to control bivariate processes with autocorrelated data. Computers & Industrial Engineering, 97 , 15–25. Sims, C. A. (1980). Macroeconomics and reality. Econometrica: Journal of the Econometric Society, (pp. 1–48). Snoussi, A. (2011). Spc for short-run multivariate autocorrelated processes. Journal of Applied Statistics, 38 , 2303–2312. Team, R. C. et al. (2018). R: A language and environment for statistical computing. Vanhatalo, E., & Kulahci, M. (2015). The effect of autocorrelation on the hotelling t2 control chart. Quality and Reliability Engineering International , 31 , 1779–1796. Wang, J., Liu, W., Qiu, K., Yu, T., & Zhao, L. (2018). Dynamic hypersphere based support vector data description for batch process monitoring. Chemometrics and Intelligent Laboratory Systems, 172 , 17–32. Wold, S., Kettaneh, N., Frid´en, H., & Holmberg, A. (1998). Modelling and diagnostics of batch processes and analogous kinetic experiments. Chemometrics and intelligent laboratory systems, 44 , 331–340.
30