Bayesian Network for quality control in the drilling process


Susana Ferreiro*, Basilio Sierra**, Eneko Gorritxategi***, Itziar Irigoien****

*Fundación TEKNIKER, Eibar, Guipúzcoa, Spain (Tel: +34.943.206.744; e-mail: [email protected])
**UPV-EHU, San Sebastián, Guipúzcoa, Spain (e-mail: [email protected])
***Fundación TEKNIKER, Eibar, Guipúzcoa, Spain (e-mail: [email protected])
****UPV-EHU, San Sebastián, Guipúzcoa, Spain (e-mail: [email protected])

Abstract: Nowadays, the aeronautic industry requires the automation of certain processes to minimize economic costs and to optimize resources, while ensuring the quality of these processes. One of the most important tasks in this sector is the drilling process, whose main problem is the occurrence of burr. Today, a manual burr elimination task is carried out after drilling and prior to riveting to guarantee the quality of the process; the permissible burr size, imposed by the aeronautic industry, is set at under 127 microns. This task increases manufacturing costs, and it must be replaced by a monitoring system that detects automatically and on-line when the burr is outside this limit and so reduces the number of holes that require deburring. This article shows the efficacy of Bayesian networks for predicting burr generation in the drilling process; the resulting model is easy to interpret and to integrate into the final system. Moreover, the article identifies the most influential parameters in the generation of burr in the process.

Keywords: Bayesian network, machine learning, classification, burr, drilling process, quality control, monitoring system.

1. INTRODUCTION

Nowadays, the storage, organization and retrieval of information have been automated thanks to database systems, and a huge quantity of information is available. Several statistical techniques have been used to analyze this information, but they are cryptic for people who are not experienced with them. Machine learning (Hernández, 2004) is a subfield of artificial intelligence whose aim is to develop algorithms, such as Bayesian networks (the model studied in this work), that allow machines to learn from data; that is, to develop programs able to induce models from data that improve their performance over time. For this reason it is regarded as a knowledge induction process.

Bayesian networks have been used for some years in specific fields such as medicine, mainly for diagnosis, but they have since evolved toward industrial fields, and many papers have been published on modelling industrial processes, as explained in (Correa, et al., 2009). In medicine, they are used to diagnose conditions such as prostate cancer and benign prostatic hyperplasia, and to screen for cervical cancer or liver disorders (Onisko, et al., 1998). They are also used for medical prognosis, attempting to predict the future state of the patient from the evidence (symptoms, signs, laboratory test results, etc.) and the treatment (Sierra, & Larrañaga, 1998). They are applied in mobile robotics (Lazkano, et al., 2007) and in other fields as well. Furthermore, industrial maintenance has evolved thanks to technologies such as Bayesian networks, which support decision making in fault diagnosis and identify problems based on fault prediction for non-critical machinery, as shown in (Gilabert, & Arnaiz, 2006), (Correa, et al., 2008), (Nieves, et al., 2009), (Arnaiz, & Arzamendi, 2003) or (Santos, et al., 2009).

The main contribution of this article is a Bayesian network model for detecting burr generation in the drilling process that improves on the results of the conventional method used today and that can be inserted into a monitoring system in order to detect burr generation automatically, on-line and in real time. Moreover, the article identifies the most influential parameters in the drilling process and compares the Bayesian network model with other machine learning algorithms.

The rest of the article is organized as follows. Section 2 presents the drilling process and explains the monitoring system to be developed. Section 3 summarizes the principles of Bayesian networks, and Section 4 introduces the Bayesian network model and compares it with other machine learning algorithms. Finally, Section 5 concludes with the most important findings.

2. DRILLING PROCESS DIFFICULTIES

Today, certain processes in the aeronautic industry have a high economic cost because of unprofitable operations that have not yet been eliminated and that reduce product quality. Drilling is one of the processes to be improved. Taking into account that a small or medium-sized aeroplane has more than 250,000 holes, a task prior to riveting such as visual inspection and burr elimination implies an excessive increase in costs.

Visual inspection and burr elimination are non-productive operations. They are carried out after drilling and should be eliminated or minimized as far as possible. In summary, it is necessary to replace this manual process with a monitoring system able to detect the generated burr automatically and on-line, as shown in the next figure:

Fig. 1. Elimination of unproductive operations.

No monitoring system has been developed so far, but the technology centre “C.I.C MARGUNE” patented (number WO2007/065959A1) a mathematical model, experimentally adjusted, capable of detecting in real time whether the size of the generated burr is within aeronautical limits or not. This first approach to burr detection was based on 5 parameters extracted from the whole internal signal of the machine, and the correct classification rate of this conventional model was 92%.

The present article aims to improve burr detection by using a Bayesian network model that improves on the correct classification rate obtained by the mathematical model and that could later be implanted in a monitoring system to predict burr generation automatically and on-line during the drilling process. The model is learnt from a dataset obtained through a design of experiments and is then inserted into the machine to detect when burr occurs during the drilling process.

3. BAYESIAN NETWORKS APPROACH

A Bayesian network (BN) is a model representation for reasoning under uncertainty. Formally, it is represented as a directed acyclic graph (DAG) in which each node represents a random variable and the edges represent (often causal) dependence relations between them. Each variable represents a unique event or hypothesis and has a finite set of mutually exclusive states, X={x1, … , xn}; there must be a state for each possible value, together with its conditional probabilities. In order to specify a Bayesian network and fully represent the joint probability distribution, and so take advantage of this paradigm for the representation of uncertainty, it is necessary to build a model (structure and parameters of the network) and to specify, for each node X, the probability distribution of X conditional upon X's parent nodes. Further information on learning BNs can be found in (Jordan, 1999) and (Neapolitan, 2004).

The joint probability distribution (global model) is specified through marginal and conditional distributions (local models), taking into account the conditional independence relations between the nodes and their parents. This modularity makes maintenance easy and reduces the number of parameters needed to specify the global model: the parameters are easier to estimate, storage needs are reduced and inference is more efficient.

Bayesian networks are used to answer probabilistic queries. For example, the network can be used to obtain updated knowledge of the state of a subset of variables when other variables (the evidence variables) are observed. This process of computing the posterior distribution of variables given evidence is called probabilistic inference, and it is useful in situations such as diagnosis (abductive reasoning, including the pattern known as “explaining away”) and prediction (deductive reasoning). An introduction to inference and advanced inference for BNs is available in (Jordan, 1999).

Bayesian networks are very useful because they are adaptable. It is possible to build an initial network with limited knowledge of a domain and to extend it as new knowledge becomes available. The most significant point, however, is that they can learn from experience: Bayesian networks can refine the (conditional) probabilities of the node states by taking real observations into account.
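The factorization and the two kinds of query described above can be illustrated with a minimal sketch. It assumes a hypothetical three-node chain (Drill -> Burr -> Torque) that is not the network learnt in this article, and every probability in it is invented for illustration. Inference is done by brute-force enumeration of the joint distribution, which is only practical for very small networks.

```python
from itertools import product

# Hypothetical three-node network: Drill -> Burr -> Torque.
# All probabilities below are invented for illustration only.
P_DRILL = {"new": 0.7, "worn": 0.3}
P_BURR = {  # P(Burr | Drill)
    "new": {"yes": 0.1, "no": 0.9},
    "worn": {"yes": 0.6, "no": 0.4},
}
P_TORQUE = {  # P(Torque | Burr)
    "yes": {"high": 0.8, "low": 0.2},
    "no": {"high": 0.3, "low": 0.7},
}

def joint(drill, burr, torque):
    """Joint probability as the product of the local (conditional) models."""
    return P_DRILL[drill] * P_BURR[drill][burr] * P_TORQUE[burr][torque]

def posterior(query, evidence):
    """P(query variable | evidence), summing the joint over all states."""
    names = ("drill", "burr", "torque")
    domains = (("new", "worn"), ("yes", "no"), ("high", "low"))
    scores = {}
    for values in product(*domains):
        state = dict(zip(names, values))
        # Keep only the states that agree with the observed evidence.
        if all(state[k] == v for k, v in evidence.items()):
            scores[state[query]] = scores.get(state[query], 0.0) + joint(**state)
    total = sum(scores.values())
    return {value: p / total for value, p in scores.items()}

prediction = posterior("burr", {"drill": "worn"})   # from cause to effect
diagnosis = posterior("drill", {"torque": "high"})  # from effect to cause
```

In this toy network the local models need 1 + 2 + 2 = 5 free parameters, whereas an unstructured joint table over three binary variables needs 7; the saving grows quickly with the number of variables.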
Furthermore, the fact that BNs combine statistical techniques and probabilistic graphical models offers some advantages over other techniques, as explained in (Byington, et al., 2002) and (Goode, & Roylance, 1999): they can deal with uncertainty, and they are an effective technique for solving diagnostic and prediction problems in situations where knowledge comes from different sources, because they are able to combine a priori knowledge and experimental knowledge.

4. BURR PREDICTION USING BAYESIAN NETWORKS

As mentioned above, this article seeks an efficient burr detection model learnt as a Bayesian network. Moreover, this model conveys the physical relationships among the variables of the machining process.

4.1 Predictive variables

A fundamental task prior to the development of this work is to study the sensitivity of different signals to burr, and to process them and use them to learn the Bayesian network, so that the model can then be used in the on-line monitoring system. This implies analyzing the signals and evaluating which of them carry the most information about the burr.

Initially, the internal signals of the machine were analyzed. These included the ‘torque of the spindle’, the ‘power force’ and the ‘advance force’, and the studies concluded that these signals present certain advantages:

- They are simple to acquire (no additional elements are required).
- They are non-intrusive, since no elements are added to the work piece.
- They are easy to integrate into the machine control.

Figure 2 shows an example of an internal signal captured during a drilling test. This signal is the torque of the electro-spindle during the drilling of a hole, from the electro-spindle acceleration to its deceleration. It presents four areas, represented in the following figure: “Spindle acceleration area” of the gear-head, “Approach to work piece area” to the material, “Cutting area” and “Spindle deceleration”.

Fig. 2. Electro-spindle signal.

Further study of the signals captured in the drilling process concluded that the shape of the electro-spindle torque signal in the time domain is related to the size of the burr, and it was observed that the most representative area corresponds to the “Cutting area”.

Finally, the predictive variables were defined as the parameters that define the process. These parameters can be divided into two groups: configuration and sensor parameters. Configuration parameters are related to the operational conditions, such as speed of cut, speed of advance, length in entrance and exit, thickness and type of drill, while sensor parameters are calculated from the “Cutting area” (Fig. 3) of the spindle signal (maximum, minimum, angle, height and width).

Fig. 3. Cutting area.

The monitoring system should start from a dataset from which the model is learnt; the model is later inserted into the machine to control the generation of burrs during the drilling process. Given this, a set of tests was made after extracting the most influential variables of the process defined above.

4.2 Experimental Setup

In order to learn the Bayesian network or other types of supervised classification model, a set of experiments based on a design of experiments was performed using the predictive variables (Table 1) plus the class ‘BURR’ to be predicted. This class was categorized according to the permissible burr size imposed by the aeronautical industry, which sets a maximum of 127 microns:

BURR=’yes’: non-admissible burr (≥ 127 microns)
BURR=’no’: admissible burr (<127 microns)

Table 1. Predictive variables

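| Variable | Origin | Type | Values |
|---|---|---|---|
| BRO | Config. | Discrete | SR; HARD |
| VC | Config. | Continuous | |
| TRM | Config. | Discrete | 8; 15; 20; 35 |
| AV1 | Config. | Continuous | |
| AV2 | Config. | Continuous | |
| REC | Config. | Discrete | 20; 35 |
| ESP | Config. | Discrete | 12; 25 |
| MIN | Sensor | Continuous | |
| MAX | Sensor | Continuous | |
| ANG | Sensor | Continuous | |
| ALT | Sensor | Continuous | |
| ANC | Sensor | Continuous | |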

Table 1 presents the predictive variables that were used for the design of experiments and the analysis. As mentioned above, this data set consists of configuration and sensor variables:

- BRO is the type of drill bit (HARD: hard rock; SR: soft rock).
- VC is the speed of cut (150-250 m/min).
- TRM represents the length in entrance to the work piece.
- AV1 and AV2 are the relative velocities between the work piece and the tool before and after cutting.
- REC corresponds to the output length of the tool.
- ESP is the thickness of the material (aluminium).
- MIN is the relative minimum measured on the output of the tool.
- MAX is the relative maximum measured before the output of the tool.
- ANG represents the slope of the curve measured during the output of the tool.
- ALT is the height of the disruptions in the post-output area.
- ANC corresponds to the width of the disruptions in the post-output area.
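As a minimal sketch of how one test of this data set might be represented, the record below carries the twelve predictive variables plus a helper that derives the class BURR from a measured burr size. The class name `DrillingTest`, the helper `burr_class`, the field types and the example values are all hypothetical; treating a burr of exactly 127 microns as non-admissible is an assumption based on the admissible range being stated as under 127 microns.

```python
from dataclasses import dataclass

BURR_LIMIT_MICRONS = 127.0  # permissible burr size stated in the text

@dataclass
class DrillingTest:
    """One drilling test: configuration variables, then sensor variables."""
    bro: str     # type of drill bit: "SR" or "HARD"
    vc: float    # speed of cut, m/min
    trm: float   # length in entrance to the work piece
    av1: float   # relative velocity before cutting
    av2: float   # relative velocity after cutting
    rec: float   # output length of the tool
    esp: float   # thickness of the material
    min_: float  # relative minimum on the output of the tool
    max_: float  # relative maximum before the output of the tool
    ang: float   # slope of the curve during the output of the tool
    alt: float   # height of the disruptions in the post-output area
    anc: float   # extent of the disruptions in the post-output area

def burr_class(burr_size_microns: float) -> str:
    """Return 'yes' for a non-admissible burr and 'no' for an admissible one."""
    # Assumption: the limit itself counts as non-admissible.
    return "yes" if burr_size_microns >= BURR_LIMIT_MICRONS else "no"

# Example record; the sensor values are invented for illustration.
example = DrillingTest(bro="HARD", vc=200.0, trm=15, av1=0.2, av2=0.2,
                       rec=20, esp=12, min_=0.1, max_=0.9, ang=35.0,
                       alt=0.4, anc=0.05)
label = burr_class(98.0)
```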

After defining the predictive variables which represent the drilling process and an initial data set, a group of experts in drilling processes selected and pre-processed the data, making the data set more representative and reliable. They detected irrelevant or unnecessary data, anomalous data (outliers), missing values, inconsistencies, etc. (Hand, Mannila, & Smyth, 2001). Each task involves a broad set of Data Mining (Hernández, 2004) techniques. Briefly, detecting this type of data (irrelevant, anomalous, missing or inconsistent) requires a thorough examination of the data by means of summaries of variables, histograms, dispersion plots, box plots or simple visual inspection. Then, a decision is made about what to do with the data: ignore it, delete the variable for all the tests, delete the test which contains it, or replace the value (e.g. with the mean). Data selection and processing is an important task that usually takes more than 50% of the analysis time, and the results obtained later depend to a certain extent on the quality of the initial data set.

4.3 Experimental Results

After the first pre-processing of the data, the data set (Fayyad, Piatetsky-Shapiro, & Smyth, 1996) consisted of 106 drilling tests, made up of the configuration variables of the process and the variables from the spindle signal. At this point, the objective was to obtain a more suitable classification model based on a Bayesian network, learning its structure and probabilities from the experimental dataset described above. This learning process was performed using the Weka software, a collection of machine learning algorithms written in Java and developed by the University of Waikato (New Zealand). The Bayesian network was learnt using the K2 search algorithm (Cooper, & Herskovits, 1992) to define the structure and the Simple Estimator (Bouckaert, 2005) to estimate the probabilities.

Figure 4 shows the structure learnt for the first Bayesian network, whose accuracy was 92.28% with a standard deviation of 1.3%. It illustrates the cause and effect relationships amongst the variables that define the drilling process. This information deserves more detailed study because it can reveal aspects of the process that were previously unknown.
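The probability estimation step can be sketched as counting state combinations in the data and smoothing the counts. The function below is an illustration in plain Python, not Weka's code: the function name `estimate_cpt`, the toy rows and the choice of BRO as the only parent of BURR are hypothetical, and the smoothing constant `alpha=0.5` is an assumption about the estimator's default setting.

```python
from collections import Counter

def estimate_cpt(rows, child, parents, child_states, alpha=0.5):
    """Estimate P(child | parents) from discrete rows with additive smoothing."""
    joint_counts = Counter()   # (parent configuration, child state) -> count
    parent_counts = Counter()  # parent configuration -> count
    for row in rows:
        config = tuple(row[p] for p in parents)
        joint_counts[(config, row[child])] += 1
        parent_counts[config] += 1
    cpt = {}
    for config, n in parent_counts.items():
        denominator = n + alpha * len(child_states)
        cpt[config] = {
            state: (joint_counts[(config, state)] + alpha) / denominator
            for state in child_states
        }
    return cpt

# Toy data, invented for illustration only.
rows = [
    {"BRO": "HARD", "BURR": "no"},
    {"BRO": "HARD", "BURR": "no"},
    {"BRO": "HARD", "BURR": "yes"},
    {"BRO": "SR", "BURR": "yes"},
]
cpt = estimate_cpt(rows, child="BURR", parents=["BRO"], child_states=["yes", "no"])
```

The smoothing keeps every estimated probability strictly positive, so a state combination that never appears in a small data set is not treated as impossible. Parent configurations that never occur in the data get no entry in this sketch.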

Fig. 4. First Bayesian network model.

Once the standard approach had been applied, we expected to improve the model, increasing its accuracy, validity, reliability and stability (Kazakov, & Kudenko, 2001), by means of a selection of variables (Mitchell, 1997). The aim of selecting a subset of variables is not to analyze the variables, since that task was carried out in a previous section; this is a second selection, which determines which variables are the most influential and which of them improve the model. It has some advantages:

- Noise elimination, which increases data precision and the predictive and explanatory ability of the model.
- Elimination of irrelevant data, which decreases the acquisition cost and the computational cost of the database.
- Elimination of redundancies, which avoids problems of inconsistency and duplication.

There are different types of selection criteria and measures, such as information gain, explained variance, correlation tests, filter methods, wrapper methods, etc. They are combined with search methods such as exhaustive search, genetic search, tabu search, greedy search, etc., to obtain the most representative set of variables. The present work combines the Wrapper criterion with the Best First method. The Wrapper criterion evaluates sets of variables with a learning scheme; in this work it is used to estimate the accuracy of the learnt Bayesian network for each set of variables. The Best First method, in turn, searches the space of subsets of variables by greedy hill climbing augmented with a backtracking facility.
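The combination of a wrapper criterion with a best-first search can be sketched as follows. This is a simplified forward search written for illustration, not Weka's implementation: the function `best_first_selection`, its stopping parameter `max_stale` and the scoring function `toy_accuracy` are hypothetical. In a real wrapper, `evaluate` would learn a Bayesian network on the candidate subset and return its cross-validated accuracy; here the scores are invented and chosen so that the toy search ends on the four variables the article reports.

```python
import heapq

def best_first_selection(features, evaluate, max_stale=5):
    """Forward best-first search over feature subsets.

    `evaluate(subset)` plays the role of the wrapper criterion and returns
    an accuracy estimate for a model learnt on `subset` (a frozenset).
    The search stops after `max_stale` consecutive expansions that do not
    improve on the best subset found so far.
    """
    start = frozenset()
    best_subset, best_score = start, evaluate(start)
    # Max-heap through negated scores; the counter breaks ties.
    frontier = [(-best_score, 0, start)]
    visited = {start}
    counter = 1
    stale = 0
    while frontier and stale < max_stale:
        # Backtracking: the best open subset may come from an earlier level.
        _, _, subset = heapq.heappop(frontier)
        improved = False
        for feature in features:
            child = subset | {feature}
            if feature in subset or child in visited:
                continue
            visited.add(child)
            score = evaluate(child)
            heapq.heappush(frontier, (-score, counter, child))
            counter += 1
            if score > best_score:
                best_subset, best_score = child, score
                improved = True
        stale = 0 if improved else stale + 1
    return best_subset, best_score

# Invented gains: four variables help, every other variable costs 0.01.
GAINS = {"BRO": 0.05, "VC": 0.04, "ANC": 0.03, "TRM": 0.02}

def toy_accuracy(subset):
    return 0.80 + sum(GAINS.get(name, -0.01) for name in subset)

VARIABLES = ["BRO", "VC", "TRM", "AV1", "AV2", "REC", "ESP",
             "MIN", "MAX", "ANG", "ALT", "ANC"]
selected, score = best_first_selection(VARIABLES, toy_accuracy)
```

Because each call to `evaluate` means learning and validating a model, the wrapper approach is costly, which is why the search keeps a set of visited subsets and stops after a fixed number of non-improving expansions.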

After testing this combination of criterion and search method, a second network was developed (Fig. 5) using the same algorithms for learning the network: K2 and SimpleEstimator. Its accuracy was 95.2381% and the standard deviation 0%. The variables identified as the most relevant in the drilling process were BRO, VC, ANC and TRM.

Fig. 5. Second Bayesian network model.

The second Bayesian network provides good accuracy, but it has to be compared with other machine learning algorithms such as classification trees, induction rules, distance based techniques or techniques based on probabilities. Other techniques, such as neural networks or regression models, have not been taken into account in the current work but would be interesting to study in future work.

Table 2 presents the results for the different techniques applied to the initial data set. The selection of variables was performed by means of the Wrapper criterion and the Best First method. The accuracy of these models is lower than that obtained by the Bayesian network and, in addition, the network provides a number of advantages described in the section above. Besides its accuracy, it is also worth noting that the Bayesian network yields 0.029% false negatives, a figure matched only by IB1 and RIDOR. False negatives are cases in which the model does not detect a burr which has in fact been generated. In some industrial processes, such as aeronautic drilling, in which false negatives cannot be tolerated because of very restrictive requirements, it is necessary to obtain a model that avoids them.

Table 2. Results of machine learning algorithms

| Type of classification | Algorithm | Mean value (%) | Standard Deviation (%) | False Negatives (%) |
|---|---|---|---|---|
| Classification trees | J48 | 94.09 | 1.25 | 0.041 |
| Classification trees | REPTree (J48) | 91.71 | 2.11 | 0.075 |
| Induction rules | JRIP | 92 | 1.57 | 0.053 |
| Induction rules | RIDOR | 94.86 | 0.92 | 0.029 |
| Distance based techniques | KNN (k=3) | 87.9 | 0.78 | 0.13 |
| Distance based techniques | IB1 | 94.57 | 0.9 | 0.029 |
| Techniques based on probabilities | Naive Bayes Simple | 89.52 | 0 | 0.043 |
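False negatives, in the sense used above (a burr that was generated but not detected), can be counted from the actual and predicted classes as in the sketch below. The function name `false_negative_share` is hypothetical, and dividing by the total number of tests is an assumption: the article does not state how the false negative column of Table 2 is normalized.

```python
def false_negative_share(actual, predicted, positive="yes"):
    """Share of all tests in which a real burr (actual == positive) was missed."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("at least one test is required")
    misses = sum(1 for a, p in zip(actual, predicted)
                 if a == positive and p != positive)
    return misses / len(actual)

# Invented labels for illustration: one burr out of five tests is missed.
actual = ["yes", "yes", "no", "no", "yes"]
predicted = ["yes", "no", "no", "yes", "yes"]
share = false_negative_share(actual, predicted)
```

Note that the fourth test in the example is a false positive (a burr predicted where none occurred); it lowers accuracy but does not count as a false negative.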

The benefits of Bayesian networks in comparison with other machine learning algorithms are numerous. They have the ability to adapt: they can modify not only their parameters (or specific instances) but also the model structure, which is hard to integrate into current modelling tools. Nevertheless, it is possible to integrate the structure and its abilities thanks to software such as Hugin (www.hugin.com), which allows the network to be used through libraries that can be imported from other frameworks. Furthermore, Bayesian networks make it possible to handle uncertain knowledge explicitly, as almost all human knowledge presents some type of uncertainty. They are founded on probability theory, which provides clear semantics and a sound theoretical foundation. Finally, their representation of knowledge is graphical and intuitive, and their reasoning is close to that of human beings.

4.4 Evaluation

Evaluation is a very important issue to bear in mind after learning the network, because the validity of the network depends on the quality of the evaluation. It has two objectives:

A. To estimate the real error rate of the prediction (with new validation samples): this rate should be calculated using data that have not been used for learning the model, because the error rate calculated from training samples underestimates the error rate to be expected on new samples.
B. To select one model amongst two or more models. The evaluation assesses whether one model is better than another.

Because of the importance of these two objectives, the following procedure was carried out to calculate good estimates of the error rate of the models:

1. To apply 10-fold cross-validation, a technique to estimate the performance of a predictive model. The technique randomly assigns the tests to 10 sets {d0, d1, .., d9} of equal size. Each set in turn is held out for testing while the model is trained on the other nine. The final accuracy is the average of the accuracies obtained on the 10 held-out sets.
2. To repeat the 10-fold cross-validation 10 times (using different seeds), obtaining 10 rates of correct classification.
3. To calculate the average of the 10 rates.

Finally, the best model was selected, or a ranking of models established, based on the correct classification rate calculated in step 3.
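The three steps of this procedure can be sketched in plain Python as follows. The functions `k_fold_accuracy` and `repeated_cv` are illustrative names, and the majority-class classifier is a hypothetical stand-in for the Bayesian network learner, used only so that the sketch runs on its own; the data are invented.

```python
import random
from statistics import mean

def k_fold_accuracy(rows, labels, fit, predict, k=10, seed=0):
    """Step 1: accuracy averaged over k folds, each held out once for testing."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # folds of near-equal size
    accuracies = []
    for held_out in folds:
        test_set = set(held_out)
        train = [i for i in indices if i not in test_set]
        model = fit([rows[i] for i in train], [labels[i] for i in train])
        hits = sum(predict(model, rows[i]) == labels[i] for i in held_out)
        accuracies.append(hits / len(held_out))
    return mean(accuracies)

def repeated_cv(rows, labels, fit, predict, repeats=10, k=10):
    """Steps 2 and 3: repeat k-fold CV with different seeds, then average."""
    rates = [k_fold_accuracy(rows, labels, fit, predict, k, seed)
             for seed in range(repeats)]
    return mean(rates), rates

# Hypothetical stand-in learner: always predicts the majority training class.
def fit_majority(train_rows, train_labels):
    return max(set(train_labels), key=train_labels.count)

def predict_majority(model, row):
    return model

rows = list(range(100))                  # invented data: 100 tests
labels = ["no"] * 80 + ["yes"] * 20
average, rates = repeated_cv(rows, labels, fit_majority, predict_majority)
```

The sketch needs at least `k` tests so that no fold is empty. Comparing two learners with the same seeds means they are evaluated on identical partitions, which makes the comparison in objective B fairer.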

5. CONCLUSIONS AND FUTURE WORK

The use of a Bayesian network classifier to optimize the drilling process is an important advance for the aeronautic industry. Being able to detect burr generation during the drilling process can decrease manufacturing costs by eliminating the visual inspection and deburring tasks prior to riveting. Moreover, as shown in the article, these networks make it possible to detect certain relationships between the variables that define the process and provide information about the behaviour of the process itself. A Bayesian network can be used not only to optimize the configuration of the variables of the drilling process but also to determine the effect of various combinations of those variables on the process.

The Bayesian network obtains results that improve on the performance of the conventional model and of other machine learning techniques, while also providing the additional advantages mentioned in the sections above, such as adaptation capability, uncertainty management and ease of understanding. What is more, bearing in mind the goal of developing a monitoring system to detect burr generation automatically and on-line, the Bayesian network can be easily integrated into the machine for the control of the drilling process thanks to dynamic libraries.

Even though the Bayesian network provides a high correct classification rate, a number of false negatives remain. Even after studying the configuration and sensor variables, there is a small percentage of false negatives due to stochastic components which cannot be controlled. The two misclassified tests are very close to the limit (127 microns), so introducing confidence limits may help to avoid false negatives. As future work, it is proposed to establish a margin between 100 and 127 microns. All the tests classified within this range would be inspected in order to ensure zero percent false negatives.

6. REFERENCES

Cooper, G.F., and Herskovits, E. (1992). A Bayesian Method for the Induction of Probabilistic Networks from Data. Machine Learning, vol. 9, pp. 309-347.
Fayyad, U., Piatetsky-Shapiro, G., and Smyth, P. (1996). From data mining to knowledge discovery in databases. AI Magazine.
Mitchell, T.M. (1997). Machine Learning. McGraw-Hill International Editions.
Onisko, A., et al. (1998). A probabilistic causal model for diagnosis of liver disorders. 7th Intelligent Information Systems Proceedings, pp. 379-387.
Sierra, B., and Larrañaga, P. (1998). Predicting survival in malignant skin melanoma using Bayesian Networks automatically induced by genetic algorithms. An empirical comparison between different approaches. Artificial Intelligence in Medicine, vol. 14, pp. 215-230.
Goode, K.B., and Roylance, B. (1999). Predicting the Time to Failure of Critical Components - A Software Package Strategy. Condition Monitoring and Diagnostic Engineering Management Proceedings, vol. 99, pp. 547-555.
Jordan, M.I. (1999). Learning in Graphical Models. First MIT Press edition.
Hand, D.J., Mannila, H., and Smyth, P. (2001). Principles of Data Mining. The MIT Press.
Kazakov, D., and Kudenko, D. (2001). Machine Learning and ILP for MAS. ACAI 2001. LNAI, vol. 2086, pp. 246-270.
Byington, C.S., et al. (2002). Prognostic enhancements to diagnostic systems for improved condition-based maintenance. 2002 IEEE Aerospace Conference Proceedings, vols 1-7, pp. 2815-2824.
Arnaiz, A., and Arzamendi, J. (2003). Adaptative diagnostic systems by means of Bayesian networks. 16th International Congress on Condition Monitoring and Diagnostic Engineering Management, pp. 155-164.
Hernández, J., Ramírez, M.J., and Ferri, C. (2004). Introducción a la Minería de Datos. Pearson, España.
Neapolitan, R.E. (2004). Learning Bayesian Networks. Pearson Prentice Hall.
Bouckaert, R. (2005). Bayesian Network Classifiers in Weka. Technical Report, Department of Computer Science, Waikato University, Hamilton.
Lazkano, E., et al. (2007). On the use of Bayesian Networks to develop behaviours for mobile robots. Robotics and Autonomous Systems, vol. 55, pp. 253-265.
Peña, B., Aramendi, G., and Rivero, M. (2007). Method for monitoring burr formation in processes involving the drilling of parts. WO2007/065959A1.
Correa, M., Bielza, C., Ramirez, M.D., and Alique, J.R. (2008). A Bayesian network model for surface roughness prediction in the machining process. International Journal of Systems Science, vol. 39(12), pp. 1181-1192.
Correa, M., Bielza, C., and Pamies-Teixeira, J. (2009). Comparison of Bayesian networks and artificial neural networks for quality detection in a machining process. Expert Systems with Applications, vol. 36, pp. 7270-7279.
Santos, I., Nieves, J., Penya, Y.K., and Bringas, P.G. (2009). Optimising Machine-Learning-Based Fault Prediction in Foundry Production. 10th International Work-Conference on Artificial Neural Networks (IWANN 2009), pp. 554-561.