Control Engineering Practice 55 (2016) 185–196
Alarm management practices in natural gas processing plants
Vinícius Barroso Soares a, José Carlos Pinto a, Maurício Bezerra de Souza Jr. b,*
a Programa de Engenharia Química/COPPE, Universidade Federal do Rio de Janeiro, Cidade Universitária, CP: 68502, Rio de Janeiro, 21941-972 RJ, Brazil
b Departamento de Engenharia Química/Escola de Química, Universidade Federal do Rio de Janeiro, Cidade Universitária, Rio de Janeiro, 21941-909 RJ, Brazil
Article info
Article history: Received 7 March 2016. Received in revised form 12 May 2016. Accepted 4 July 2016.
Keywords: Natural gas processing plants; Alarm management; Correlation analysis; Principal component analysis; Cluster analysis
Abstract
In industrial data sets, groups of variables often move together. Monitoring all these variables may result in many nuisance alarms. However, it is possible to take advantage of redundant information to design and reduce the size of alarm sets. The present work reports the application of an alarm management protocol based on alarm prioritization to three large Natural Gas Processing Plants during a three-year period, and also investigates the use of different correlation analysis techniques as tools to assist in the further reduction of the number of alarms. The results show that the adopted practices enable the reduction of alarms. © 2016 Elsevier Ltd. All rights reserved.
* Corresponding author. E-mail address: [email protected] (M.B. de Souza Jr.).
http://dx.doi.org/10.1016/j.conengprac.2016.07.004
0967-0661/© 2016 Elsevier Ltd. All rights reserved.

1. Introduction
An alarm system is the collection of hardware and software that can provide alarm states, communicating them to operators and recording state changes. Alarm systems are critically important for safe and efficient operations of modern industrial plants, including oil refineries, petrochemical facilities and power plants (Bransby & Jenkinson, 1997; Rothenberg, 2009). These systems are used primarily as tools for detection of near misses, which can be defined as departures from normal operating ranges that are followed by subsequent returns to the desired process operation conditions (Pariyani, Seider, Oktem, & Soroush, 2010). Therefore, these systems are indeed safeguards that prevent near misses from deteriorating into accidents. Retrospective investigations of a large number of accidents support the important roles played by alarm systems during process operation (Venkatasubramanian, Rengaswamy, Yin, & Kavuri, 2003).
Alarm systems also play prominent roles in maintaining the high efficiency of plant operation. It is well known that deviations of process variables from normal/optimal operating zones usually have negative effects on the process performance, leading for example to off-spec products and excessive consumption of raw materials and energy. In spite of that, these systems may suffer from poor performance when too many alarms have to be handled by the operators (EEMUA, 2013). For all these reasons, industrial alarm systems are receiving increasing attention from both industrial and academic communities.
The importance of this issue in the industrial field can be measured by the large number of standards and guidelines regarding the design and use of alarm systems published by industrial societies and professional organizations, including the Nuclear Regulatory Commission, the Engineering Equipment and Materials Users' Association, the Standardization Association for Measurement and Control in Chemical Industries, the Electric Power Research Institute, the Abnormal Situations Management Consortium, the International Society of Automation, the International Electrotechnical Commission and the American Petroleum Institute, among many others (Wang, Fan, Chen, & Shah, 2015a). All of these standards and guidelines impose specific requirements on the performance of alarm systems and suggest the use of indicators based on frequency analyses, alarm rates, pattern distributions, operator response times and reaction times, among others (ISA, 2009; EEMUA, 2013).
Alarm occurrences can be classified into two major groups: true alarms and nuisance alarms. A true alarm indicates an abnormal condition associated with the process or equipment that requires an action within a limited time. A nuisance alarm does not require a specific action or response from operators, as it does not affect the process operation (Rothenberg, 2009). Thus, the key criterion for distinguishing true alarms from nuisance alarms is whether an operator response is required (EEMUA, 2013). In contrast to nuisance alarms, a true alarm requires operators to pay attention or to take action promptly; otherwise, the abnormal situations associated with true alarms will exert negative effects on operation safety and/or efficiency.
V.B. Soares et al. / Control Engineering Practice 55 (2016) 185–196
Chattering alarms are the most commonly encountered nuisance alarms and may account for 70% of alarm occurrences (Rothenberg, 2009; Hollifield & Habibi, 2010). A chattering alarm can be defined as an alarm that transitions between the alarmed state and the normal state with undesirably high frequency or with a constant time period. These alarms are typically generated by random noise and/or frequent disturbances on process variables, especially when the process operates in the vicinity of the alarm setpoint. Chattering alarms can also be induced by repeated on-off actions of badly tuned control loops (Bransby & Jenkinson, 1998; ISA, 2009; Wang & Chen, 2013; Wang & Chen, 2014). Nuisance alarms often lead to alarm overloading, while the simultaneous occurrence of a large number of true alarms can lead to alarm floods. Alarm overloading can be extremely detrimental to the confidence in and usefulness of alarm systems. First, if a significant number of alarms can be regarded as nuisance alarms, the alarm system may provide no useful information and only serve as a distraction to plant operators. As a result, a true alarm may be overlooked by process operators among the many active nuisance alarms. Second, even if all active alarms are true alarms, the alarm rate may be too high to be manageable by operators. When the alarm rate is too high, operators may have no other choice but to ignore some of the active alarms. In this case, the designed functionality of the alarm system can be completely discredited (Hollifield & Habibi, 2010; ISA, 2009; Yang, Duan, Shah, & Chen, 2014).
Many different techniques can be used to improve alarm system performance, including: (automatic) adjustment of setpoints and deadbands; application of filtering, transient suppression and de-bounce timers to repeating alarms; combination and simplification of redundant sets of alarms; eclipsing of multi-level alarms (such as high and high-high); application of counters and auto-shelving to repeating alarms; dynamic alarm re-prioritization; grouping of alarms that demand similar operator responses; automatic suppression of alarms according to the operating mode of the plant; and development of intelligent logics for identification of the most important alarms, among others (EEMUA, 2013).
However, while many strategies have been devised to improve the performance of an alarm system, strategies proposed for reducing the number of alarms configured in an industrial plant are scarcer. Perhaps this can be linked to a conservative operation approach, as it is frequently assumed that reducing the number of alarms can somehow compromise the safety of the process operation.
The technical literature indicates that different techniques can be used to investigate correlations among variables (EEMUA, 2013). However, when one concentrates specifically on the behavior of alarm states, analyses and comparisons of distinct variable correlation methods are seemingly scarce. This is particularly true when one is interested in reducing the number of alarm activations in real industrial plants as a tool of alarm management. Xie et al. (2006a, 2006b) presented a multivariate statistical approach to detect and diagnose faults in industrial plants with complex dynamics. Their work pointed out the difficulties related to monitoring correlated variables. Lieftucht, Kruger, and Irwin (2006) proposed multivariate statistical methods to remove auto-correlation and cross-correlation between variables, reducing the number of false alarms.
Natural Gas Processing Plants are treatment plants that operate under severe pressure (often above 9000 kPa) and temperature (ranging from 70 °C to 300 °C) conditions. Their products are mostly flammable and explosive. The process is dynamic and can change dramatically depending on the composition of the gas processed. Furthermore, these plants are usually the "bottleneck" of the production/operation fields, so that a plant shutdown can mean total loss of the activities of these fields, either at sea (offshore) or on land (onshore), with great economic losses. For all these reasons, the monitoring of the process in these units is extremely important.
This paper aims to contribute to alarm management practices by combining a theoretical statistical framework with long-term industrial implementation. For that purpose, the results of a long-term (3-year) alarm management program based on alarm prioritization, applied to 3 identical Natural Gas Processing Plants, are presented. Additionally, correlation methods, namely Correlation Analysis (Yang, Shah, & Xiao, 2010), Cluster Analysis (Higuchi, Yamamoto, Takai, Noda, & Nishitani, 2009; Yang, Shah, Xiao, & Chen, 2012; Kondaveeti, Izadi, Shah, Black, & Chen, 2012) and Principal Component Analysis (Izadi, Shah, Shook, & Chen, 2009; Chen, 2010), are applied, compared and investigated with the intent of further reducing the number of alarms. Insights and recommendations for industrial application arising from this analysis are presented.
This text is organized as follows. Section 2 introduces the fundamentals of correlation analyses. Section 3 presents and discusses the results of the industrial application of the proposed alarm management protocol and of the correlation analyses. Finally, the conclusions are described in Section 4.
2. Fundamentals
2.1. Correlation analysis
Alarms can be represented in terms of a binary sequence of zeros and ones, where zero (0) represents no alarm or no information and one (1) represents an alarm annunciation. Thus, each alarm tag name is represented by a sequence of 0's and 1's sampled over a given period of time. Most of the binary sequence is filled with 0's, except for the time instants when an alarm is presented to the operator (Chen, 2010; Kondaveeti et al., 2012; Wang, Li, Huang, & Chong, 2015b). This can be represented mathematically by
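$$
x_a(t) =
\begin{cases}
1, & \text{if } x(t) \ge (\le)\ \text{alarm setpoint} \\
0, & \text{if } x(t) < (>)\ \text{alarm setpoint}
\end{cases}
\tag{1}
$$
where $x(t)$ is the monitored process variable, $x_a(t)$ is the corresponding alarm state, and the relations in parentheses apply to low alarms.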
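As an illustration of Eq. (1), the short Python sketch below converts a sampled process signal into a binary alarm sequence. The function name `alarm_sequence`, the `high` flag and the sample readings are hypothetical and chosen only for this example; they do not come from the plant data.

```python
def alarm_sequence(samples, setpoint, high=True):
    """Map process samples to the 0/1 alarm sequence of Eq. (1).

    high=True  -> high alarm: 1 when the sample is >= setpoint
    high=False -> low alarm:  1 when the sample is <= setpoint
    """
    if high:
        return [1 if x >= setpoint else 0 for x in samples]
    return [1 if x <= setpoint else 0 for x in samples]


# Hypothetical temperature readings and setpoints (illustrative values only).
readings = [208.9, 210.4, 211.3, 211.0, 209.7]
high_alarm = alarm_sequence(readings, setpoint=211.0)
low_alarm = alarm_sequence(readings, setpoint=209.0, high=False)
```

Each monitored tag yields one such sequence, and these sequences are the objects compared in the correlation analyses that follow.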
Alarm sequences can be analyzed statistically. Two numbers are often used to summarize a probability distribution for a random variable X. The mean ($\mu_X$) is a measure of the center or middle of the probability distribution, and the variance ($\sigma^2_{XX}$) is a measure of the dispersion, or variability, of the distribution. These two measures do not uniquely identify a probability distribution; that is, two different distributions can have the same mean and variance. Still, these measures are simple, useful summaries of the probability distribution of X (Montgomery, 2005). As the unit of the variance is the square of the unit of the variable, the standard deviation ($\sigma_X$) is usually employed to represent data scattering.
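$$
\mu_X = \frac{1}{N}\sum_{i=1}^{N} X_i \tag{2}
$$

$$
\sigma^2_{XX} = \frac{1}{N-1}\sum_{i=1}^{N}\left(X_i - \mu_X\right)^2 \tag{3}
$$

$$
\sigma_X = \sqrt{\frac{1}{N-1}\sum_{i=1}^{N}\left(X_i - \mu_X\right)^2} \tag{4}
$$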
When two or more random variables are defined on a probability space, it is useful to describe how they vary together; that is, it is useful to measure the relationship between the variables. A common measure of the relationship between two random variables is the covariance (Martin & Crowley, 1995; Arts, Irwan, Janssen, & Augustus, 2002; Montgomery, 2005). The term correlation has a statistical meaning that is similar to the commonly understood concept of association. Specifically, it refers to a measure of the similarity between two (or more) paired data sets. Consider the process variables X and Y with averages $\mu_X$ and $\mu_Y$ and standard deviations $\sigma_X$ and $\sigma_Y$, respectively (Smith, 2014). The covariance and the Pearson correlation coefficient ($\rho_{XY}$) are given by
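$$
\mathrm{Cov}(X, Y) = \frac{1}{N-1}\sum_{i=1}^{N}\left(X_i - \mu_X\right)\left(Y_i - \mu_Y\right) \tag{5}
$$

$$
\rho_{XY} = \frac{\sum_{i=1}^{N}\left(X_i - \mu_X\right)\left(Y_i - \mu_Y\right)}{\sqrt{\sum_{i=1}^{N}\left(X_i - \mu_X\right)^2 \, \sum_{i=1}^{N}\left(Y_i - \mu_Y\right)^2}} \tag{6}
$$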
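A minimal sketch of Eq. (6) applied to binary alarm sequences is given below. The function name `pearson` and the three short sequences are assumptions made for illustration only.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of Eq. (6) for two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    # A sequence that never changes has zero variance, so the ratio is undefined.
    return sxy / math.sqrt(sxx * syy)


# Hypothetical alarm sequences: b repeats a, c is the complement of a.
a = [0, 0, 1, 1, 0, 0, 1, 0]
b = [0, 0, 1, 1, 0, 0, 1, 0]
c = [1, 1, 0, 0, 1, 1, 0, 1]
rho_ab = pearson(a, b)  # alarms that always activate together
rho_ac = pearson(a, c)  # alarms that are never active at the same time
```

Note that an alarm that never activates during the analyzed period has zero variance, so its correlation with any other alarm is undefined and must be handled separately.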
The covariance is always measured between 2 dimensions. If the data set has more than 2 dimensions, more than one covariance measurement can be calculated. For example, for a 3-dimensional data set (dimensions x, y, z), cov(x,y), cov(x,z), and cov (y,z) can be calculated. A useful way to get all the possible covariance values between all the different dimensions is to calculate them all and put them in a covariance matrix C.
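$$
C = \begin{pmatrix}
\mathrm{cov}(x, x) & \mathrm{cov}(x, y) & \mathrm{cov}(x, z) \\
\mathrm{cov}(y, x) & \mathrm{cov}(y, y) & \mathrm{cov}(y, z) \\
\mathrm{cov}(z, x) & \mathrm{cov}(z, y) & \mathrm{cov}(z, z)
\end{pmatrix}
\tag{7}
$$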
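The covariance matrix of Eq. (7) can be assembled directly from the pairwise covariances of Eq. (5). The sketch below is only illustrative: the helper names `covariance` and `covariance_matrix` and the three short data series are hypothetical.

```python
def covariance(x, y):
    """Sample covariance of Eq. (5), with the N - 1 denominator."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)


def covariance_matrix(columns):
    """Covariance matrix of Eq. (7): entry (i, j) is cov(column i, column j)."""
    return [[covariance(ci, cj) for cj in columns] for ci in columns]


# Hypothetical 3-dimensional data set (dimensions x, y, z).
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 6.0, 8.0]
z = [4.0, 3.0, 2.0, 1.0]
C = covariance_matrix([x, y, z])
```

The matrix is symmetric, and its diagonal holds the variances of Eq. (3), so only the entries above the diagonal carry new information about pairs of variables.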
Basically, if the absolute values of the correlation coefficients are sufficiently close to 1, then the analyzed variables can be regarded as redundant, since they describe a single experimental effect, and a single variable can be used for purposes of process monitoring. In other words, if two alarm states are strongly correlated, then one of the alarms can possibly be discarded at the plant site. As a consequence, the existence of correlated variables can help to reduce the number of alarm points in an industrial plant.
2.2. Principal component analysis
When process data sets contain a large number of variables, groups of variables often fluctuate together. One reason for this typical behavior is that more than one variable may respond to the same driving forces that govern the process trajectory. Besides, in many real industrial systems only a few driving forces affect the system, while many distinct instruments are used simultaneously to monitor the process operation. When this happens, at least in principle, a group of many variables can be replaced by a single new process variable, used to monitor the real driving force. Principal component analysis (PCA) is a quantitatively rigorous method for achieving this goal. In particular, the method generates a new set of variables, called principal components, which are linear combinations of the original variables obtained by projecting the available process data onto the eigenvectors of their covariance matrix. It is important to emphasize that the principal components are orthogonal to each other, so that they carry no redundant information among themselves (Mathworks, 2003). The concepts used in PCA computations involve the standard deviation, the covariance matrix, eigenvectors and eigenvalues.
Basically, one: 1) chooses the data set; 2) computes the mean and standard deviation of each dimension; 3) subtracts the mean from all values and divides the result by the standard deviation, in order to standardize the data fluctuations and remove measurement units; 4) computes the covariance matrix; 5) calculates the eigenvectors and eigenvalues of this matrix. The variances of the principal components are the eigenvalues of this matrix. Assuming that the eigenvalues are ordered as λ1 > λ2 > … > λn, the eigenvector associated with the i-th eigenvalue defines the i-th principal component (Smith, 2002).
The use of PCA for analysis of variable redundancy can be particularly useful because standard correlation analysis is bivariate (it considers one pair of variables at a time), while principal component analysis is multivariate. However, standard principal component analysis requires linear and normal behavior of the analyzed data set, which can constitute a limiting factor in many applications. Despite that, non-linear and non-normal versions of the standard principal component technique have been proposed in the literature for more demanding applications (Feital, Kruger, Dutra, Pinto, & Lima, 2013).
As each eigenvalue represents the process variability along the direction of the respective eigenvector (or new variable), ordering the calculated eigenvalues allows for selection of the number of driving forces (or eigenvectors or new process variables) that must be used to capture a desired amount of process variability (Jolliffe, 1986; Jackson, 1991; Macgregor & Kourtl, 1995). By examining plots of these few new variables, researchers often develop a deeper understanding of the driving forces that generated the original data. Basically, if the desired amount of process variability can be captured with n eigenvectors and if N alarms constitute the data set, then n new process alarms can be created and used to monitor the process and replace the original set of N alarms (Albazzaz & Wang, 2006; Xie et al., 2006a; Camacho, Pérez-Villegas, Rodríguez-Gómez, & Jiménez-Mañas, 2015).
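The five PCA steps listed above can be sketched with NumPy as follows. This is a minimal illustration under stated assumptions, not the implementation used in the study: the function name `pca`, the synthetic data (two signals driven by one common disturbance plus one independent signal) and the noise levels are all hypothetical.

```python
import numpy as np


def pca(data):
    """PCA of a (samples x variables) array, following steps 2) to 5).

    Returns the eigenvalues in decreasing order and the matching
    eigenvectors as columns.
    """
    mean = data.mean(axis=0)                 # step 2: mean of each variable
    std = data.std(axis=0, ddof=1)           # step 2: standard deviation
    z = (data - mean) / std                  # step 3: standardize the data
    cov = np.cov(z, rowvar=False)            # step 4: covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)   # step 5: eigen-decomposition
    order = np.argsort(eigvals)[::-1]        # largest eigenvalue first
    return eigvals[order], eigvecs[:, order]


# Step 1: synthetic data set (hypothetical, for illustration only).
rng = np.random.default_rng(0)
driver = rng.normal(size=500)
data = np.column_stack([
    driver + 0.05 * rng.normal(size=500),        # signal following the driver
    2.0 * driver + 0.05 * rng.normal(size=500),  # redundant, rescaled signal
    rng.normal(size=500),                        # independent signal
])
eigvals, eigvecs = pca(data)
explained = eigvals / eigvals.sum()  # fraction of variability per component
```

In this synthetic case the two redundant signals collapse into a single dominant component, so two components capture almost all of the variability of the three original variables; the smallest eigenvalue is close to zero.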
2.3. Cluster analysis
Cluster analysis, also called segmentation analysis or taxonomy analysis, is a way to create groups of objects, or clusters, in such a way that the profiles of objects in the same cluster can be regarded as similar and the profiles of objects in different clusters can be regarded as distinct. Many different fields of study, such as engineering, zoology, medicine, linguistics, anthropology, psychology and marketing, have contributed to the development and application of clustering techniques (Mathworks, 2003). The use of cluster analysis for analysis of variable redundancy can be particularly useful because standard correlation analysis is bivariate, while cluster analysis is multivariate. Besides, cluster analysis does not require linear or normal behavior of the analyzed data set, which can be particularly important in many applications.
In order to perform hierarchical cluster analysis on a data set, a standard procedure can be implemented. First, the degree of similarity or dissimilarity between pairs of objects in the data set must be characterized, normally with the help of a distance function. As a consequence, objects can be ordered in terms of their distance from other objects. Then, the objects must be grouped into a binary, hierarchical cluster tree, using the distance values as the grouping criterion. In this step, the degree of similarity or dissimilarity between the newly formed groups can also be characterized, so that the procedure can be repeated iteratively. A cluster is obtained when the minimum distances between objects in the cluster are smaller than a critical threshold, used to characterize distinct clusters (Morrison, 1976). Different functions can be used to represent distances and thresholds, giving rise to distinct clustering techniques (Macgregor & Kourtl, 1995). In the present work, the inverse correlation shown in Eq. (8) is used to measure the distance between data sets:
$$
d(x, y) = 1 - \sum_{i=1}^{N}\left(x_i - \mu_x\right)\left(y_i - \mu_y\right)
\left[\sum_{i=1}^{N}\left(x_i - \mu_x\right)^2 \sum_{i=1}^{N}\left(y_i - \mu_y\right)^2\right]^{-1/2}
\tag{8}
$$
The hierarchical, binary cluster tree created with the distance function can be understood more easily with the help of a dendrogram. Essentially, a dendrogram is a set of U-shaped lines that connect objects in a hierarchical tree, where the height of each line represents the distance between the two connected objects. Objects inside a single cluster are similar in some sense and can be regarded as redundant for purposes of process monitoring (Xie et al., 2006b; Camacho et al., 2015). Therefore, the number of alarms belonging to the same cluster can possibly be reduced at the plant site. In this work, Ward's grouping method (Ward, 1963) was used in order to assign statistical meaning to the analytical results.
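The grouping procedure described above can be sketched in a few lines of Python. The example uses the correlation-based distance of Eq. (8), but, to stay short and dependency-free, it merges clusters with average linkage rather than Ward's method, so it should be read as an illustration of the idea and not as the procedure used in this work. The signal names, values and the threshold of 0.2 are hypothetical.

```python
import math


def corr_distance(x, y):
    """Distance of Eq. (8): one minus the Pearson correlation."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return 1.0 - sxy / math.sqrt(sxx * syy)


def cluster(signals, threshold):
    """Agglomerative clustering with average linkage (not Ward's method).

    The two closest clusters are merged repeatedly until the smallest
    inter-cluster distance exceeds the threshold.
    """
    names = list(signals)
    d = {(a, b): corr_distance(signals[a], signals[b]) for a in names for b in names}
    clusters = [[name] for name in names]

    def linkage(c1, c2):
        return sum(d[a, b] for a in c1 for b in c2) / (len(c1) * len(c2))

    while len(clusters) > 1:
        pairs = [(linkage(clusters[i], clusters[j]), i, j)
                 for i in range(len(clusters))
                 for j in range(i + 1, len(clusters))]
        dist, i, j = min(pairs)
        if dist > threshold:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Hypothetical signals: two pressure-like and two temperature-like tags.
signals = {
    "P1": [1.0, 2.0, 3.0, 2.0, 1.0, 2.0],
    "P2": [1.1, 2.1, 2.9, 2.0, 1.2, 1.9],
    "T1": [5.0, 4.0, 6.0, 7.0, 5.0, 3.0],
    "T2": [5.2, 4.1, 6.1, 6.8, 5.1, 3.2],
}
groups = cluster(signals, threshold=0.2)
```

With these values the two pressure-like signals and the two temperature-like signals form separate clusters, each of which is a candidate for being monitored by a single alarm.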
3. Industrial results
3.1. Process description
In general, natural gas processing plants are dedicated to the production of liquid hydrocarbons, gaseous streams and liquefied natural gas (LNG) streams, using natural gas as raw material. The steps involved in natural gas processing are described in detail in reference books such as GPSA (1998). Three identical natural gas processing units (NGPU), based on the thermodynamic principle of turbo-expansion, were analyzed in the present study. The objective of each plant is to treat raw natural gas in order to obtain the following products: fuel gas (C1 and C2 cuts), liquefied petroleum gas (C3 and C4 cuts) and petrochemical naphtha (C5 and heavier). The units are fed with natural gas from the producing fields (onshore and offshore) and have nominal capacity to process 3.5 million m³/d of natural gas (at 20 °C and 1 atm), or 137,243 kg/h. In this case, the plant produces 2,927,599 m³/d (at 20 °C and 1 atm), or 90,942 kg/h, of fuel gas for sale (specified gas), 37,530 kg/h of liquefied petroleum gas (LPG), 3819 kg/h of petrochemical naphtha (C5+) and an ethane-rich stream of 93,966 m³/d (at 20 °C and 1 atm), or 4863 kg/h.
Each unit comprises 18 vessels, 9 air coolers, 14 heat exchangers, 10 pumps, 7 compressors, 1 turbo-expander, 3 distillation towers, 2 furnaces, 4 blowers and 5 filters. The unit is able to deal with large disturbances in the feed quality (ranging from high to low C3+ content). For this to be possible, the unit was designed for the feed with high C3+ content and, in order to be able to process the feed with low C3+ content, it contains recirculation lines and lines to bypass the heat exchangers. Basically, the natural gas treatment process in the NGPU involves three distinct stages: 1) gas drying and cooling; 2) heating; 3) compression. The process also has the following thermal systems: a propane cooling system and a heating oil system. Fig. 1 shows the simplified NGPU process flow diagram, highlighting some of the areas described above. A more detailed description of equipment and processing steps is not presented here due to proprietary constraints.
The digital alarms are configured in the Supervisory Control and Data Acquisition (SCADA) system of the plant, which receives real-time measurements of process variables. A total of 13,903 digital alarms were located at strategic points of the process (equipment and piping) and were visible on the screens of the control room. Among them, 4932 (35.47%) alarms were used to monitor temperature; 3491 (25.11%), pressure; 3373 (24.26%), level; 1340 (9.64%), differential pressure; 420 (3.02%), flow; 304 (2.19%), vibration; and 43 (0.31%), concentration.
Fig. 1. Simplified NGPU process flowsheet.
Due to the SCADA data storage capacity, it was possible to access the files and see the daily alarm activations for a selected period (from January 2013 to December 2015). During these three years, 1,847,346 digital alarm occurrences were registered in the plant. Clearly, the rates of alarm activation are much larger than the values suggested by standard guidelines. Alarm management practices were then adopted to reduce the number of alarm occurrences in the plant, as described below.
3.2. Assumptions for the alarm management protocol based on alarm prioritization
According to ISA (2009) and EEMUA (2013), the priority distribution of alarms can be performed in classes, allowing for optimization of the results. In industrial plants, some examples of alarm classes are operation/maintenance, quality/safety, unit, utilities/transfer/storage, etc. The highest priority alarms shall be distinguished audibly and visually from the lower priority ones. Critical priority alarms may also be announced in the same way as the high priority ones, according to the criteria adopted in the plant. Alarms must also be listed in terms of the current state, starting from the alarms that have not been acknowledged, and sorted chronologically. It must also be possible to filter or sort alarms by priority, state, group, class and type of alarm.
Alarms were prioritized based on the allowable time for the operator's response and on the impacts caused on the plant when no response was taken. These impacts were related to loss of production and assets, environmental damage and personnel safety, considering, within these categories, the alarms defined to comply with local legislation and the company's internal policies. The recommended values for the priority distribution of announced alarms are shown in Table 1. As recommended by ISA (2009) and EEMUA (2013), statistical alarm analysis must be part of the operational routine and must be carried out at least once a month. The recommended values for the performance indicators shown in Table 2 were adopted. Besides, according to the recommendations of ISA (2009) and EEMUA (2013), any required changes concerning setpoint values, alarm suppression, cancellation or inclusion of an alarm, strategies of digital system configuration, etc., were monitored and controlled.
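The indicators based on 10-min periods (average and maximum number of alarms per period, and percentage of periods with more than 10 alarms) can be computed from a list of alarm timestamps, as sketched below. The function name `alarm_rate_metrics`, the use of timestamps in seconds counted from the start of the log and the sample log are assumptions made for this example; they do not describe the SCADA export of the plant.

```python
from collections import Counter


def alarm_rate_metrics(timestamps_s, duration_s, window_s=600):
    """Alarm-rate indicators over consecutive 10-min (600 s) windows.

    timestamps_s: alarm activation times in seconds from the start of the log
    duration_s:   total length of the analyzed period in seconds
    """
    n_windows = max(1, duration_s // window_s)
    counts = Counter(int(t // window_s) for t in timestamps_s)
    per_window = [counts.get(k, 0) for k in range(n_windows)]
    return {
        "avg_per_10min": sum(per_window) / n_windows,
        "max_per_10min": max(per_window),
        "pct_windows_above_10": 100.0 * sum(c > 10 for c in per_window) / n_windows,
    }


# Hypothetical one-hour log: two quiet periods and one burst of 12 alarms.
log = [30, 95, 610, 615, 620] + [1200 + 5 * k for k in range(12)]
metrics = alarm_rate_metrics(log, duration_s=3600)
```

The returned values can then be compared with the acceptable levels adopted in the plant (Table 2). For scale, the 1,847,346 occurrences registered from January 2013 to December 2015 (1095 days) correspond to an average of roughly 1687 occurrences per day for the whole plant; the number of operating positions sharing this load is not stated here.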
The responsibility for the alarm system performance was assigned to the operation team and shared with the process, automation and maintenance teams. An operational group was formed with members from each discipline for performance monitoring, implementation of corrective actions and improvement of the alarm system. A group to manage the alarm system was created to ensure that the philosophy and practices adopted were applied uniformly in all existing plants and in those that may be designed in the future. Periodic audits were also conducted in order to assure compliance with the proposed procedures.

Table 1
Priority distribution adopted in the plant (EEMUA, 2013).
Priority distribution of announced alarms    Acceptable
Low                                          80%
Average                                      15%
High                                         5%
Critical                                     < 1%

Table 2
Metrics for alarm performance adopted in the plant (EEMUA, 2013).
Metric                                                                   Acceptable
Alarms announced by operating position                                   144
Alarms announced per hour                                                6
Alarms announced by period of 10 min                                     1
Percentage of hours with incidence higher than 30 alarms                 < 1%
Percentage of periods of 10 min with incidence higher than 10 alarms     < 1%
Maximum number of alarms in a period of 10 min                           < 10

3.3. Implementation of the alarm management protocol in the whole plant
The recommendations of ISA (2009) and EEMUA (2013) were adopted. According to these recommendations, strategies for alarm processing must be implemented at the Basic Process Control System (BPCS) configuration level for suppression of alarms in real time, enabling the rational availability of information to the operator, minimizing the number of alarms and increasing plant reliability. Some practical examples are given below.
For instance, actions associated with automatic on-off control of process equipment and with open-close operations of on-off valves must obey the following strategy: when the equipment is controlled and responds properly to the control command, no alarm must be activated. These actions should be characterized as special events. Besides, when there are two levels of alarms for the same process variable, the second alarm must suppress the first one. Thus, high-high (HH) alarms must suppress high (H) alarms, while low-low (LL) alarms must suppress low (L) alarms. This suppression must be applied to two alarms associated with one single instrument. Additionally, when a measuring point is associated with more than one sensor, it is advisable to configure the detection of divergent measurements between these readings. These deviations must be characterized as alerts when the response time is undetermined. The setting value for the detection of the deviation between the sensors must consider the maximum uncertainty between instruments and the need for a timer in order to avoid nuisance alarms. Only the alarms, or a summary of the alarms, that require action of the control room operator must be issued from the equipment supplied by third parties (e.g. compressors) to the alarm list in the BPCS.
Alarms must be set according to the state of the equipment. The following states may be characterized for particular pieces of equipment: under start-up, steady operation, shutdown and out-of-service. Thus, alarms that apply only to equipment under steady operation shall be suppressed when the equipment is in the start-up, shutdown or out-of-service states. The detection of the steady operating condition of a piece of equipment may be automatic, according to one or more operational variables, or may be informed by the operator.
Alarms generated by redundant instruments must be displayed as a single alarm when the abnormal condition occurs. All alarms, however, must be registered. Alarms that precede a trip event and are related to the trip must be suppressed when the trip occurs. When two or more alarms are closely related, only one of them must be announced. The use of delay times, deadbands and filtering is recommended to reduce the number of nuisance alarms. Tables 3–5 show the values used in the plant. The values were adjusted based on operational experience with the analyzed process variables.

Table 3
Recommended delay times based on signal type (ISA, 2009).
Signal type    Delay times (s)
Flow rate      15
Level          60
Pressure       15
Temperature    60
Table 4
Recommended starting point deadband based on signal type (ISA, 2009).
Signal type    Deadband (percent of operating range)
Flow rate      5%
Level          5%
Pressure       2%
Temperature    1%

Table 5
Recommended filtering based on signal type (ISA, 2009).
Signal type    Filtering (s)
Flow rate      2
Level          2
Pressure       1
Temperature    –
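To illustrate how a delay time and a deadband reduce chattering, the sketch below evaluates a high alarm on a noisy signal with and without these two mechanisms. It is a simplified illustration, not the BPCS logic of the plant: the function name `high_alarm_states`, the one-second sampling interval and the numerical values are hypothetical.

```python
def high_alarm_states(samples, setpoint, deadband, delay_s, dt_s=1.0):
    """High alarm with an on-delay timer and a deadband.

    The alarm is raised only after the signal has stayed at or above the
    setpoint for delay_s seconds, and it is cleared only when the signal
    falls below (setpoint - deadband).
    """
    active = False
    time_above = 0.0
    states = []
    for x in samples:
        if not active:
            # Accumulate time above the setpoint; any dip resets the timer.
            time_above = time_above + dt_s if x >= setpoint else 0.0
            if time_above >= delay_s:
                active = True
        elif x < setpoint - deadband:
            active = False
            time_above = 0.0
        states.append(1 if active else 0)
    return states


def activations(states):
    """Number of transitions from the normal state (0) to the alarmed state (1)."""
    return sum(1 for prev, cur in zip([0] + states, states) if cur == 1 and prev == 0)


# Hypothetical signal oscillating around a setpoint of 100 (one sample per second).
noisy = [99, 101, 99, 101, 101, 101, 100.5, 99.5, 98.9, 99.5]
plain = [1 if x >= 100 else 0 for x in noisy]
filtered = high_alarm_states(noisy, setpoint=100, deadband=1.0, delay_s=3)
```

In this example the plain comparison against the setpoint produces two activations, while the delayed alarm with deadband produces a single one. In practice, the delay and the deadband (expressed as a percentage of the operating range) would be taken from Tables 3 and 4 according to the signal type.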
Fig. 2. Alarm occurrences in the natural gas processing plant.
Fig. 2 presents the numbers of alarm occurrences in three consecutive years in the plant and constitutes a significant benchmark for studies regarding the operation of alarm systems, as similar reports are not common in the literature. The number of occurrences decreased significantly over the months due to the alarm management practices described above. It can be seen that the reduction of alarm occurrences was larger in 2013 than in 2014 and larger in 2014 than in 2015. The monthly number of alarm activations was reduced by 74%: 102,917 occurrences were observed in January 2013 (beginning of the procedures), while only approximately 1/4 of that (26,499) were observed in December 2015 (end of the analyzed period). This shows that, at first, alarm occurrences can be reduced simply by adjusting some configuration parameters, such as setpoints, delay times, deadbands, etc. Then it becomes necessary to invest in operational culture by introducing new procedures and attitudes, although this obviously takes longer. Finally, physical changes may also be needed (such as acquisition and replacement of process instruments), which require additional implementation time and cost. In this case, a cheaper and faster alternative to be considered is the reduction of the number of alarms used in the plant. The following case studies show how the number of alarms in the plant can be further reduced by investigating the overall behavior of process variables through Correlation Analysis, Principal Component Analysis and Cluster Analysis.
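As a check on the figures quoted for January 2013 and December 2015, the relative reduction in monthly alarm occurrences can be written out as

$$
\frac{102{,}917 - 26{,}499}{102{,}917} = \frac{76{,}418}{102{,}917} \approx 0.74,
\qquad
\frac{26{,}499}{102{,}917} \approx 0.26 \approx \frac{1}{4}.
$$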
3.4. Case study 01: the hot oil fired heater
A small part of the plant is used below for illustrative purposes, as it is simpler to represent the behavior of a few variables than the behavior of the whole plant. The analyzed system consists of a hot oil fired heater (see F-101 in Fig. 1). Hot oil fired heaters are very important pieces of equipment in the petrochemical and hydrocarbon industries. This equipment transfers the heat produced by combustion of the fuel to the process stream. In gas processing equipment, the fuel is usually natural gas; however, ethane, propane or light oils are sometimes used. The process stream varies widely, comprising natural gas, heavier hydrocarbons, water, glycol, amine solutions, heat transfer oils and molten salts. Approximately 65–90% of the total energy demand of the processing plant goes into process heating in the form of fired heaters. Typically, the tubes are arranged along the walls of the radiant section and in a separate tube bank located in the convection section. The radiant and convection sections are so termed because the predominant modes of heat transfer are radiation and convection, respectively. The purpose of the convection section, which is a type of cross-flow recuperator, is to recover heat and improve thermal efficiency. Process heaters and burners come in a variety of designs and configurations (GPSA, 1998; Bussman & Baukal, 2009; Axon, 2008).
Fig. 3 shows the location of the monitored process variables to facilitate understanding. The signals used for testing were selected from plant data during a period of stable operation with small variations. Information about 11 important process variables of the hot oil fired heater system was collected.
Tags 1 to 3 are used to monitor the pressure in different locations of the radiant section; Tags 4 to 7 are used to monitor the temperature in each pass of the fired heater where the thermal oil to be heated flows; Tags 8 to 11 are used to monitor the thermal oil flow in each pass of the fired heater. Table 6 shows the mean, standard deviation, minimum and maximum values of the analyzed signals. Figs. 4–6 show the time profiles of pressure, temperature and flow, respectively. It can be seen that the behavior is very similar within each group of variables, suggesting the existence of strong correlations. In this case, correlation is intimately connected to the relatively flat spatial profiles of the state variables inside the heater. When the number of variables is high and the process is more complex, it may be difficult to identify the most important process driving forces and the observed variable correlations. It seems obvious that process disturbances will lead to multiple alarm activations, meaning that the number of alarms can possibly be reduced.
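The idea that strongly correlated tags form groups that tend to alarm together can be illustrated with a minimal sketch. The data below are synthetic (three independent drivers standing in for pressure, temperature and flow, each observed by several noisy sensors), and the function name `correlated_groups` and the 0.8 threshold are assumptions made for illustration only; they are not taken from the plant data or from the original analysis.

```python
import numpy as np


def correlated_groups(data: np.ndarray, threshold: float = 0.8) -> list[list[int]]:
    """Group columns whose absolute Pearson correlation reaches `threshold`.

    Groups are the connected components of the graph that links every pair
    of strongly correlated columns.
    """
    corr = np.corrcoef(data, rowvar=False)
    n_vars = corr.shape[0]
    strong = np.abs(corr) >= threshold
    unvisited = set(range(n_vars))
    groups = []
    while unvisited:
        stack, group = [min(unvisited)], set()
        while stack:
            i = stack.pop()
            if i in group:
                continue
            group.add(i)
            stack.extend(j for j in range(n_vars) if strong[i, j] and j not in group)
        unvisited -= group
        groups.append(sorted(group))
    return groups


# Synthetic stand-in for the 11 heater tags: 3 pressure, 4 temperature, 4 flow.
rng = np.random.default_rng(0)
n_samples = 1000
columns = []
for n_sensors in (3, 4, 4):
    driver = rng.normal(size=n_samples)  # common cause shared by the sensors
    for _ in range(n_sensors):
        columns.append(driver + 0.1 * rng.normal(size=n_samples))
data = np.column_stack(columns)

groups = correlated_groups(data)
print(groups)
```

With real plant data, the threshold would need to be chosen with care, since moderately correlated variables may still carry independent safety information.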
Fig. 3. Location of process variables in the hot oil fired heaters.
Table 6 Descriptive statistics of process variables in the hot oil fired heater.

Tag      Unit     Mean      Std.    Min.      Max.
Tag1     mmH2O    8.85      1.30    14.97     5.01
Tag2     mmH2O    8.18      1.31    14.32     4.53
Tag3     mmH2O    6.79      1.21    12.49     3.57
Tag4     °C       209.73    0.62    208.24    211.22
Tag5     °C       211.06    0.61    209.55    212.42
Tag6     °C       208.46    0.61    207.01    209.98
Tag7     °C       210.85    0.62    209.25    212.32
Tag8     m³/h     86.21     0.64    83.63     89.23
Tag9     m³/h     85.28     0.57    83.16     88.10
Tag10    m³/h     82.31     0.56    80.31     84.92
Tag11    m³/h     80.84     0.61    78.62     83.53
Fig. 4. Pressure behavior in the hot oil fired heater. (Panels: tag1, tag2 and tag3, in mmH2O, versus time.)
Table 7 shows the linear correlation coefficients of the investigated variables. Strong correlation between variables of the same type can be observed, as might be expected. It is particularly interesting to notice that correlations between variables of different types are weak. Fig. 7 shows the color map that can be associated with the correlation analysis presented in Table 7. The color map is a matrix of real numbers between 0.0 and 1.0 where the highly correlated variables are indicated by intense colors and weakly correlated variables are indicated by light colors (Mathworks, 2003). Significant correlations are easier to identify in the color map than in the table, especially when large amounts of data must be analyzed. Three well-defined groups can clearly be perceived in the color map. These groups can be identified more clearly and systematically with the help of principal component and clustering techniques.

Fig. 8 shows the normal distribution plot of the process variables of the thermal oil system. This plot is useful for assessing whether the measured data follow a normal distribution. Many statistical procedures, including principal component analysis, assume that the underlying distribution of the data is normal (Mathworks, 2003). Therefore, analysis of normal probability plots can provide some assurance that the assumption of normality is not being violated (or provide an early warning about possible misuse of the proposed analytical technique). In the present case, the data cannot be fitted very precisely by the normal curve. Therefore, one may wonder whether statistical procedures based on the normal assumption should be used to provide information about the analyzed data set. This particular point is frequently neglected at the plant site, although it should be carefully considered by process engineers, especially given that analytical procedures that do not rely on the normality assumption can be implemented by process analysts.

Fig. 9 shows the Pareto plot of the percent variability explained by each principal component (mean pressure, temperature and flow) when principal component analysis is applied. It can be seen that three principal components are able to explain the total variability of the 11 original process variables: the percentage of the total variability explained by summing these three principal components is close to 100%. As a consequence, it seems fair to say that
Fig. 5. Temperature behavior in the hot oil fired heater. (Panels: tag4, tag5, tag6 and tag7, in °C, versus time.)
Fig. 6. Flow behavior in the hot oil fired heater. (Panels: tag8, tag9, tag10 and tag11, in m³/h, versus time.)

Table 7 Correlation coefficients for process variables in the hot oil fired heater.

         Tag1   Tag2   Tag3   Tag4   Tag5   Tag6   Tag7   Tag8   Tag9   Tag10  Tag11
Tag1     1.00
Tag2     0.99   1.00
Tag3     0.95   0.98   1.00
Tag4     0.35   0.38   0.39   1.00
Tag5     0.36   0.39   0.41   0.98   1.00
Tag6     0.34   0.37   0.39   0.99   0.99   1.00
Tag7     0.34   0.37   0.38   0.97   0.97   0.96   1.00
Tag8     0.10   0.09   0.08   0.03   0.04   0.03   0.03   1.00
Tag9     0.10   0.10   0.08   0.02   0.02   0.02   0.02   0.89   1.00
Tag10    0.09   0.08   0.07   0.00   0.01   0.01   0.00   0.88   0.95   1.00
Tag11    0.11   0.10   0.09   0.01   0.02   0.01   0.02   0.81   0.88   0.87   1.00
Fig. 7. Color map obtained for the correlation analysis in the hot oil fired heater. (Axes: tag1 to tag11 on both axes; color scale from -1 to 1.) (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

Fig. 8. Distribution of process measurements in the hot oil fired heater. Reference is the normal distribution. (Panels: pressure, temperature and flow rate; vertical axis: frequency.)
Fig. 9. Pareto plot of the percent variability explained by each principal component in the hot oil fired heater. (Axes: principal component, 1 to 3, versus variance explained, %.)
the implementation of three alarm settings is sufficient for this particular section of the plant. The first one is related to the feed flow rates (which can be manipulated through the feed valve), the second one is related to the stream temperature (which can be manipulated through the heat input or the feed flow rate of the fuel stream) and the third one is related to the heater pressure (which can be manipulated through the output fuel valve). Then, the new alarms might be defined in terms of the new transformed variables:

$$ z = \sum_{j=1}^{NV} v_j \left( \frac{x_j - \mu}{\sigma_X} \right) \qquad (9) $$

where z stands for the transformed variable, v_j represents the j-th component of the analyzed principal direction and the remaining variables are defined as described previously. In the particular case analyzed, the PCA recommends the usual averages (flow, temperature and pressure averages) as the transformed variables to be used for monitoring purposes and process alarming. Although this is simple to understand in the proposed example, the analysis of larger sections of the plant can provide less obvious representations of the operation states. Perhaps this fact can be used to recommend the use of bottom-up analytical procedures, which propose the analysis of smaller sections of the plant (local analysis) before the implementation of the more involved analyses of the full plant behavior (global analysis).

It is important to observe that reduction of the number of alarms can face significant resistance among operators and process engineers, especially if the replaced alarms are associated with a system with a history of accidents or are regarded as very important for process monitoring. Besides, PCA alarms may lack physical meaning when compared to the original alarms. In view of this possible difficulty, and in order to facilitate the acceptance of the new combined alarms at the plant site, one alternative may be to use the PCA alarms as the most visible alarms and to keep the original alarms on a second screen, so that the operators can refer to the physical alarms when PCA alarms are activated, or check them whenever they deem it necessary.

Fig. 10 shows the dendrogram of the investigated process variables. The clusters were computed with the help of Ward distances. The calculated cophenetic correlation coefficient (Sokal & Rohlf, 1962) was equal to 0.9979. It can be seen once more that three groups were formed: Group-1, containing tag1, tag2 and tag3; Group-2, containing tag4, tag5, tag6 and tag7; and Group-3, containing tag8, tag9, tag10 and tag11. It can also be seen that
Fig. 10. Dendrogram of the analyzed process variables in the hot oil fired heater. (Axes: tag versus inverse correlation distance; leaf order: 1, 2, 3, 4, 6, 5, 7, 8, 9, 10, 11.)
there is higher affinity between Group-1 and Group-2 and between Group-2 and Group-3. In other words, temperature measurements can affect (or be affected by) the remaining process variables more intensively than the available flow and pressure measurements. As discussed by Schwaab and Pinto (2007), weak correlation between variables might indicate true independence between the variables, existence of pronounced measurement errors, influence of other variables or a higher degree of nonlinear functional dependence between the variables. As mass and energy balances and thermodynamic constraints introduce strong correlations among flow, temperature and pressure measurements, the most probable explanation for the weak correlation between these variables is that they presented little variation during normal operation. So, in this case, a single alarm might be used for variables of the same type, but it is advisable to keep independent alarms for the three different types of variables. In order to deepen this discussion, a more complex system is addressed in the next example. The situation is also far more complex, as data for the process under severe faults are included in the correlation analysis.

3.5. Case study 02: the propane refrigeration system

Correlation, Principal Component and Cluster analyses were used to investigate the alarms configured for the propane refrigeration system, as illustrated in Fig. 11. This system allows the process to reach very low (negative) temperatures. Two compressors are available, although one is a stand-by compressor. As one can see in Fig. 11, liquid propane from the propane accumulator vessel (V-1238009) flows to a dehydrator (S-1238001) and then to the propane economizer (V-1238013), flowing through the coil of the depurator vessel (V-1238010) in the suction zone of the propane compressor. The gas that leaves vessels V-1238010 and V-1238013 flows to the propane compressor. The liquid that remains in V-1238010 is drained in the vessel, while the liquid that remains from V-1238013 flows to the chiller (P-123802). The two-phase stream enters the shell of the chiller, where it is vaporized. The propane vapors that leave the chiller flow to the depurator vessel in the suction zone of the propane compressor (V-1238010). Then, the propane compressor (C-1238001 A/B) compresses the propane gas. The hot gas that leaves the compressors is directed to the propane condensers (P-1238005 and P-1238004). After condensation, the liquid propane flows back to V-1238009.
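The principal component step applied in both case studies, together with the transformed variable of Eq. (9), can be sketched as follows. This is only an illustration on synthetic data, not the authors' implementation: the function names, the 95% variance target and the use of per-variable means and standard deviations for scaling are assumptions.

```python
import numpy as np


def pca_transformed_variables(data: np.ndarray, var_target: float = 0.95):
    """Return explained-variance fractions, retained component count and scores.

    Each score column is z = sum_j v_j * (x_j - mean_j) / std_j, in the
    spirit of Eq. (9), with v taken from the correlation matrix eigenvectors.
    """
    mean = data.mean(axis=0)
    std = data.std(axis=0, ddof=1)
    scaled = (data - mean) / std
    # The covariance of standardized data is the correlation matrix.
    corr = np.cov(scaled, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)  # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    explained = eigvals / eigvals.sum()
    # Smallest number of components whose cumulative share reaches the target.
    n_keep = int(np.searchsorted(np.cumsum(explained), var_target)) + 1
    n_keep = min(n_keep, data.shape[1])
    scores = scaled @ eigvecs[:, :n_keep]
    return explained, n_keep, scores


def synthetic_process_data(n_samples: int = 1000, seed: int = 0) -> np.ndarray:
    """Hypothetical data: three independent drivers seen by 3, 4 and 4 sensors."""
    rng = np.random.default_rng(seed)
    columns = []
    for n_sensors in (3, 4, 4):
        driver = rng.normal(size=n_samples)
        for _ in range(n_sensors):
            columns.append(driver + 0.1 * rng.normal(size=n_samples))
    return np.column_stack(columns)


explained, n_keep, scores = pca_transformed_variables(synthetic_process_data())
print(f"components kept: {n_keep}")
print(f"variance explained by kept components: {explained[:n_keep].sum():.3f}")
```

In this synthetic case the retained score columns play the role of the combined alarm variables. When correlations change between operating scenarios, as discussed for the propane refrigeration system, the number of retained components can be expected to change as well.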
Fig. 11. Simplified Flowchart of the Propane Refrigeration System.
Table 8 presents the most important variables to be monitored in the system. The faults that occur most frequently in the plant were surveyed. The following scenarios describe these problems: Scenario 1, normal operation of the system (no occurrence of faults); Fault Scenario 2, propane compressor stoppage; Fault Scenario 3, sale gas compressor stoppage; and Fault Scenario 4, instability of the level control loop of the propane economizer vessel (V-1238013). Analyses were performed for the four proposed scenarios; however, due to lack of space, only the figures related to Scenarios 1 and 4 are shown below.

Table 8 Important variables to be monitored in the propane refrigeration system.

Variable    Description
Var1        Flow of gas for sale
Var2        Output temperature of P-123802
Var3        Differential pressure in P-123802
Var4        Temperature of P-123802
Var5        Pressure in P-123802
Var6        Pressure in P-123802
Var7        Level in the shell of P-123802
Var8        Level in V-1238013
Var9        Level in V-1238013
Var10       Pressure in V-1238013
Var11       Level in V-1238009
Var12       Level in the base of V-1238009
Var13       Output temperature in P-1238005
Var14       Pressure in V-123809
Var15       Temperature in V-1238009
Var16       Fuel gas temperature
Var17       Fuel gas pressure
Var18       Suction pressure of C-1238001B
Var19       Discharge pressure of C-1238001B
Var20       Status of C-1238001B

Figs. 12 and 13 present the results of the Correlation Analysis of the process variables for periods of normal operation and of oscillations in the level control loop of the propane economizer vessel, respectively. The name “Var” was omitted from the graphs for the sake of readability; therefore, number i refers to Var i. During normal operation, some significant correlations may be identified in the color map of Fig. 12. However, Fig. 13 shows that the correlations between the process variables are dynamic and most of them change in response to the events experienced by the process. This reflects the multivariate and nonlinear features of the process, which limit the existence of long-range persistently strong correlations through different fault scenarios (or driving forces of process trajectories). In spite of that, Figs. 12 and 13 indicate the existence of persistently strong correlations (between Var8–Var9 and Var14–Var19), as observed in all scenarios investigated here. As a consequence, in spite of the modification of variable correlations along some typical process trajectories, these persistent strong correlations can be used to simplify the settings of the alarm system, as described in the previous example.

Principal component analyses were also performed for the four distinct operation scenarios. For Scenario 1, 10 principal components were identified; for Scenarios 2, 3 and 4, the number of principal components was equal to 7. This indicates the existence of distinct linear relationships among the variables in each scenario, which makes the reduction and combination of the alarms more difficult and the alarm reduction process more involved. However, this also suggests that alarm clustering can be used to classify the distinct
Fig. 12. Color map of variable correlation during normal operation in the propane refrigeration system. (Axes: variables 1 to 19 on both axes; color scale from -1 to 1.) (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
Fig. 13. Color map of variable correlation during Fault Scenario 4 in the propane refrigeration system. (Axes: variables 1 to 19 on both axes; color scale from -1 to 1.) (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
Fig. 14. Dendrogram during normal operation in the propane refrigeration system. (Axes: variable number versus inverse correlation distance; leaf order: 14, 19, 10, 13, 1, 3, 6, 17, 11, 15, 16, 2, 4, 5, 18, 7, 12, 8, 9.)
operation states (normal or faulty) of the process. In other words, the particular combination of alarmed states can provide information about the particular trajectory that the process is experiencing, allowing for possible fault identification and diagnosis. Based on the PCA results, cluster analyses were also performed for the four investigated scenarios. As one can see in Figs. 14 and 15, the appearance of the dendrograms depends on the process condition, as does the number of principal directions found in the PCA. The cophenetic correlation coefficients were equal to 0.6911, 0.7610, 0.7926 and 0.9195, respectively, so that the grouping correlation increased in the following order of events: (1) normal operation; (2) stoppage of the propane compressor; (3) stoppage of the compressor of gas for sale; (4) oscillations in the level control loop of the propane economizer vessel. The suggested variable clustering depended on the process operation condition and trajectory, possibly due to the nonlinear behavior of the process. This characteristic certainly reduces the efficiency of the alarm reduction process, although it opens the opportunity to use alarm clustering as a tool for fault diagnosis when the alarm correlations change with the process states.
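The clustering step described above (Ward linkage followed by the cophenetic correlation coefficient) can be sketched with SciPy as shown below. This is a minimal illustration on synthetic data, not the original computation: the dissimilarity d = 1 - r between variables is an assumed reading of the "inverse correlation distance" shown on the dendrogram axes, and the function name `cluster_variables` is hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import cophenet, fcluster, linkage
from scipy.spatial.distance import pdist


def cluster_variables(data: np.ndarray, n_groups: int):
    """Cluster the columns of `data` and report the cophenetic correlation."""
    # Correlation distance between variables (columns): d = 1 - r (assumption).
    distances = pdist(data.T, metric="correlation")
    tree = linkage(distances, method="ward")
    # Agreement between dendrogram heights and the original distances.
    coph_corr, _ = cophenet(tree, distances)
    labels = fcluster(tree, t=n_groups, criterion="maxclust")
    return labels, float(coph_corr)


# Synthetic stand-in: three independent drivers seen by 3, 4 and 4 sensors.
rng = np.random.default_rng(0)
n_samples = 1000
columns = []
for n_sensors in (3, 4, 4):
    driver = rng.normal(size=n_samples)
    for _ in range(n_sensors):
        columns.append(driver + 0.1 * rng.normal(size=n_samples))
data = np.column_stack(columns)

labels, coph_corr = cluster_variables(data, n_groups=3)
print("cluster labels:", labels)
print(f"cophenetic correlation: {coph_corr:.4f}")
```

Running the same function on data windows from different operating scenarios and comparing the resulting labels is one possible way to explore the fault diagnosis use suggested in the text.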
4. Conclusions
One of the inherent difficulties in monitoring industrial processes is the problem of high dimensionality. Fortunately, in data sets with many variables, groups of variables often move together. Variables are correlated due to known or unknown relationships and due to sensor redundancies in the plant. Because of these correlations, when an abnormality occurs in the plant, it affects many process variables simultaneously. Monitoring all these variables may result in many alarm occurrences. However, it is possible to take advantage of redundant information to design and reduce the size of alarm sets. In the present paper, alarm management was applied to a Natural Gas Processing Plant during a three-year period. Particularly, it was shown that the implementation of a sound alarm management protocol could reduce the number of alarm activations by almost 80% in a real industrial plant. Additionally, it was shown here that correlation analyses, cluster analyses and principal component analyses can be used as tools for further reduction of the number of alarms, as these techniques capture and allow for direct visualization of variable correlations present in historical process data sets. The main idea is that a large number of correlated alarms can be replaced by a more meaningful alarm that takes variable correlations into consideration. However, when the process is nonlinear and can experience different trajectories, variable correlations can change significantly during operation, making it more difficult to reduce the number of alarms through the combination of correlated variables. In this case, however, alarm clustering can be used as a tool for diagnosing process faults, a possibility that has been overlooked in the literature.
Fig. 15. Dendrogram during Fault Scenario 4 in the propane refrigeration system. (Axes: variable number versus inverse correlation distance; leaf order: 8, 9, 13, 6, 7, 15, 16, 14, 19, 11, 17, 1, 2, 12, 3, 4, 10, 5, 18.)

References
Albazzaz, H., & Wang, X. Z. (2006). Multivariate statistical batch process monitoring using dynamic independent component analysis. In: Proceedings of the 16th European symposium on computer aided process engineering and 9th international symposium on process systems engineering.
Arts, R. M., Irwan, R., Janssen, & Augustus, J. E. M. (2002). Efficient tracking of the cross-correlation coefficient. IEEE Transactions on Speech and Audio Processing, 10(6).
Axon, B. (2008). Improve combustion system efficiency. Chemical Engineering Progress, 105(8), 41.
Bransby, M. L., & Jenkinson, J. (1998). The management of alarm systems. Norwich: Health and Safety Executive.
Bussman, W. R., & Baukal, C. E. (2009). Ambient condition effects on process heater efficiency. Energy, 34, 1624–1635.
Camacho, J., Pérez-Villegas, A., Rodríguez-Gómez, R. A., & Jiménez-Mañas, E. (2015). Multivariate Exploratory Data Analysis (MEDA) toolbox for MATLAB. Chemometrics and Intelligent Laboratory Systems, 143, 49–57.
Chen, T. (2010). On reducing false alarms in multivariate statistical process control. Chemical Engineering Research and Design, 88, 430–436.
EEMUA-191 (2013). Alarm systems – A guide to design, management and procurement. London, UK: Engineering Equipment and Materials Users Association.
Feital, T., Kruger, U., Dutra, J., Pinto, J. C., & Lima, E. L. (2013). Modeling and performance monitoring of multivariate multimodal processes. AIChE Journal, 59(5), 1557–1569.
GPSA (1998). Gas processors suppliers association (7 ed.). Oklahoma: Gas Processors Suppliers Association.
Higuchi, F., Yamamoto, I., Takai, T., Noda, M., & Nishitani, H. (2009). Use of event correlation analysis to reduce number of alarms. In: Proceedings of the 10th International Symposium on Process Systems Engineering – PSE.
Hollifield, B., & Habibi, E. (2010). The alarm management handbook (2nd ed.). Houston, TX, USA: PAS.
ISA (2009). ANSI/ISA-18.2: Management of alarm systems for the process industries. Durham, NC, USA: International Society of Automation.
Izadi, I., Shah, S. L., Shook, D. S., & Chen, T. (2009). An introduction to alarm analysis and design. In: Proceedings of the 7th IFAC symposium on fault detection, supervision and safety of technical processes. Barcelona, Spain, June 30–July 3, 2009.
Jackson, J. E. (1991). A user's guide to principal components. New York, USA: John Wiley & Sons.
Jolliffe, I. T. (1986). Principal component analysis. New York, USA: Springer-Verlag.
Kondaveeti, S. R., Izadi, I., Shah, S. L., Black, T., & Chen, T. (2012). Graphical tools for routine assessment of industrial alarm systems. Computers & Chemical Engineering, 46, 39–47.
Lieftucht, D., Kruger, U., & Irwin, G. W. (2006). Improved reliability in diagnosing faults using multivariate statistics. Computers & Chemical Engineering, 30, 901–912.
MacGregor, J. F., & Kourti, T. (1995). Statistical process control of multivariate processes. Control Engineering Practice, 3(3), 403–414.
Martin, J., & Crowley, J. L. (1995). Comparison of correlation techniques. In: Conference on intelligent autonomous systems, IAS'95, Karlsruhe, March 1995.
Mathworks (2003). Statistics toolbox for use with MATLAB. Natick, MA: The MathWorks, Inc.
Montgomery, D. C. (2005). Introduction to statistical quality control (5 ed.). New York: John Wiley & Sons.
Morrison, D. F. (1976). Multivariate statistical methods, 415. New York: McGraw-Hill Book Company.
Pariyani, A., Seider, W. D., Oktem, U. G., & Soroush, M. (2010). Incidents investigation and dynamic analysis of large alarm databases in chemical plants: a fluidized-catalytic-cracking unit case study. Industrial & Engineering Chemistry Research, 49, 8062–8079.
Rothenberg, D. (2009). Alarm management for process control. New York, NY, USA: Momentum Press.
Schwaab, M., & Pinto, J. C. (2007). Optimum reference temperature for reparameterization of the Arrhenius equation. Part 1: problems involving one kinetic constant. Chemical Engineering Science, 62(10), 2750–2764.
Smith, L. I. (2002). A tutorial on principal components analysis.
Smith, M. J. (2014). Statistical analysis handbook. A comprehensive handbook of statistical concepts, techniques and software tools. 〈www.statsref.com〉.
Sokal, R. R., & Rohlf, F. J. (1962). The comparison of dendrograms by objective methods. Taxon, 11, 33–40.
Venkatasubramanian, V., Rengaswamy, R., Yin, K., & Kavuri, S. N. (2003). A review of process fault detection and diagnosis Part I: quantitative model-based methods. Computers & Chemical Engineering, 27, 293–311.
Wang, J., & Chen, T. (2013). An online method for detection and reduction of chattering alarms due to oscillation. Computers & Chemical Engineering, 54, 140–150.
Wang, J., & Chen, T. (2014). An online method to remove chattering and repeating alarms based on alarm durations and intervals. Computers & Chemical Engineering, 67, 43–52.
Wang, J., Fan, J., Chen, T., & Shah, S. L. (2015a). An overview of industrial alarm systems: main causes for alarm overloading, research status, and open problems. IEEE Transactions on Automation Science and Engineering.
Wang, J., Li, H., Huang, J., & Chong, S. U. (2015b). A data similarity based analysis to consequential alarms of industrial processes. Journal of Loss Prevention in the Process Industries.
Ward, J. H. (1963). Hierarchical grouping to optimize an objective function. Journal of the American Statistical Association, 58, 236–244.
Xie, L., Kruger, U., Lieftucht, D., Littler, T., Chen, Q., & Wang, S.-Q. (2006a). Statistical monitoring of dynamic multivariate processes. Part 1. Modeling autocorrelation and cross-correlation. Industrial & Engineering Chemistry Research, 45, 1659–1676.
Xie, L., Kruger, U., Lieftucht, D., Littler, T., Chen, Q., & Wang, S.-Q. (2006b). Statistical monitoring of dynamic multivariate processes – Part 2. Identifying fault magnitude and signature. Industrial & Engineering Chemistry Research, 45, 1677–1688.
Yang, F., Duan, P., Shah, S. L., & Chen, T. (2014). Capturing connectivity and causality in complex industrial processes. New York, NY, USA: Springer.
Yang, F., Shah, S. L., & Xiao, D. (2010). Correlation analysis of alarm data and alarm limit design for industrial processes. In: 2010 American Control Conference.
Yang, F., Shah, S. L., Xiao, D., & Chen, T. (2012). Improved correlation analysis and visualization of industrial alarm data. ISA Transactions, 51, 499–506.