8th IFAC Symposium on Advanced Control of Chemical Processes The International Federation of Automatic Control Singapore, July 10-13, 2012
Automatic Inspection of TFT-LCD Glass Substrates Using Optimized Support Vector Machines
Ali Yousefian Jazi*, J. Jay Liu*, Hokyung Lee**
* Department of Chemical Engineering, Pukyung National University, Busan, Korea (e-mail: [email protected]; [email protected])
** CRD, LG Chem, Daejeon, Korea (e-mail:
[email protected])
Abstract: The visual appearance of manufactured products is often an important quality attribute, particularly for products that are used mainly for display purposes or as the exterior part of other products. Thin film transistor-liquid crystal display (TFT-LCD) glass substrates are a representative case. In such cases, visual quality (i.e., visual appearance) has to be controlled or maintained along with the physical and mechanical quality attributes. This paper presents an industrial case study in which a new machine vision methodology is applied to the manufacturing of TFT-LCD glass substrates. In this case study, we developed a classification model using a support vector machine (SVM) whose parameters were optimized with the simulated annealing (SA) algorithm. We also used a parallel genetic algorithm to reduce the number of features used for classification. The results show that the SA-optimized SVM could be a viable alternative to manual classification of defects in the TFT-LCD glass substrate industry.
1. INTRODUCTION
The visual appearance of manufactured products is an important quality attribute. It is an essential determinant of product quality, especially when the products are used as the exterior part of other manufactured products. Representative examples include sheet glass and TFT-LCD glass substrates for liquid crystal display (LCD) panels. TFT-LCD has become the most popular flat panel display (FPD) during the past decade, and the manufacturing of high-quality TFT-LCD glass substrates with dimensions of more than 2×2 m² is pushing the envelope toward flat panels larger than 100 inches. However, one of the main weaknesses that remain in manufacturing larger glass substrates is surface defects such as surface warp and surface waviness. For example, sheets with large warps, of the order of a few hundred micrometers, pose severe quality problems to the LCD panel industry.
Increased competition in these markets drives LCD panel manufacturers, as well as their raw material suppliers such as sheet glass manufacturers, to make every effort to improve production yields. For this reason, inline automatic optical inspection (AOI) based on machine vision is currently one of the most efficient ways of inspecting defects, and its use is becoming more popular. However, some manufacturing processes, such as sheet glass manufacturing, in which more than a simple physical transformation of raw materials or parts occurs, have certain types of surface defects that can hardly be detected by typical AOI. In such manufacturing processes, defects are hard to define or measure; in other words, the defects have a stochastic appearance. For example, froth bubbles have very complex patterns with continuously varying shape, size, direction, etc. There can be no discrete classes of patterns because different patterns merge together to form more complex patterns (Liu et al., 2005). For this reason, inspection of such stochastic defects is left to trained human inspectors. However, inconsistency in human judgement remains to be resolved in order to secure reliable quality control, and the same situation is found in glass substrate manufacturing.
Another main difficulty in this type of problem arises from the fact that prior information or knowledge is very limited: accurate class labels are often unavailable, as is prior knowledge about the important aspects of visual appearance. This limited prior information, together with the stochastic nature of the visual appearance of the products, makes it much more difficult to apply machine vision to these problems. Recently, Liu (2004) proposed a general machine vision framework for overcoming these difficulties (i.e., the stochastic nature of products and processes and the limited prior information), and this pioneering work has initiated more than one hundred subsequent papers reporting successful industrial applications (Duchesne et al., 2011). Liu, Huang, and Lee (2008) proposed an inline defect detection system for a TFT-LCD array process, which was designed based on locally linear embedding (LLE) (Roweis & Saul, 2000) and support vector data description (SVDD) (Tax & Duin, 2004). Liu, Lin, Hsueh, and Lee (2009) further proposed a fuzzy SVDD ensemble-based system that can judge whether the inline defect in an input image is a target defect or not. Liu et al. (2008) used SVDD as the defect detector and showed a high accuracy of inline defect detection. However, a critical concern remains: Liu et al. (2008) used SVDD only for sampling inspection. If full inspection is needed, SVDD fails to meet this requirement because its classification speed is slow.
978-3-902823-05-2/12/$20.00 © 2012 IFAC
10.3182/20120710-4-SG-2026.00054
Support vector machines (SVMs) (Vapnik, 1998; Burges, 1998; Scholkopf et al., 1999) and other supervised learning techniques adopt the opposite approach. A support vector machine finds an optimal separating hyperplane between members and non-members of a given class in an abstract space. Choosing appropriate values for the SVM parameters is an important step in SVM analysis, as it has a great influence on performance and thus on prediction accuracy. In this sense, metaheuristics may be useful for discovering the optimal values of the SVM parameters for the best forecast and estimation performance (Zhang and Guo, 2009; Mo et al., 2010). The simulated annealing (SA) algorithm is one of the well-known metaheuristics; it can discover a good-quality solution to an optimization problem by trying random variations of the current solution (Kirkpatrick et al., 1983). Besalatpour et al. (2012) used SA to optimize the parameters of an SVM for predicting soil physical and mechanical properties.
The contribution of this paper is the development of an automatic inspection system for surface defects on TFT-LCD glass substrates and its application to real plant data. This paper is organized as follows. Section 2 presents the problem description, including a general glass substrate manufacturing process, surface defects, and their inspection at a real plant. Section 3 describes the modelling approaches and introduces the proposed method. Finally, experimental results and conclusions are presented in Sections 4 and 5, respectively.
2. PROBLEM DESCRIPTION
2.1 Manufacturing of glass substrates
There are two types of commercial processes (also called "hot" processes) for manufacturing TFT-LCD glass substrates: (1) the floating process (Pilkington, 1969) and (2) the fusion process (Dockerty, 1967). In the floating process, molten glass is continuously poured onto a molten tin bath, on which the glass spreads to form a continuous sheet of flat glass, and is then pulled to a lehr for annealing and further cooling. In the fusion process, the molten glass is delivered into a trough called an "isopipe," overfilling it until the glass flows evenly over both sides. The two streams then fuse at the bottom, where the glass is drawn down to form a continuous sheet of flat glass. The continuous sheet of flat glass is then rolled out of the hot process to be cut and inspected in a cold process. The cold process consists of cutting, chamfering, and surface polishing steps, with cleaning and inspection steps after each. A schematic diagram of the cold process is shown in Fig. 1. The final product, a TFT-LCD glass substrate, has a thickness of 0.7 mm and a size that varies according to the so-called TFT-LCD generation. For example, the size of generation 7 is 1870×2200 or 1950×2250 (in mm).
Fig. 1. The flow chart of the cold process
2.2 Surface defects and their inspection
An LCD panel consists of two glass substrates (called the color filter glass and the TFT glass) and a backlight unit. Liquid crystal fills the gap between the two glass substrates. Since the gap is of the order of a few micrometers, surface defects over a length scale of a few micrometers can cause local brightness variation on a TV screen, often called "mura" in the industry. According to the LCD panel industry (Mauch, 2000; Freischlad, 1996; Lapp, 1994; Zhu et al., 2000), the surface quality of glass substrates is determined by the following defects: (1) air bubbles and particles, (2) surface flaws or scratches, (3) surface roughness, (4) surface waviness, and (5) surface warp. Our focus is on the second and fourth defects, and this paper uses only the second defect, surface flaws or scratches, to illustrate our proposed method for automatic inspection. The other defects can be detected well using AOI or other equipment. For example, the first defect, air bubbles (inside a glass substrate) and particles (inside or on a glass substrate), can be detected very well using an inline AOI system known as a "particle counter."
Since a glass substrate is reflective as well as transparent, imaging glass substrates to detect surface defects is somewhat tricky. Therefore, images projected through the substrate (transmission images) and images reflected from it (reflection images) are often used for inspection among glass substrate manufacturers. Most surface defects can be seen easily on those images by the naked eye of an expert human inspector. Examples of surface flaws or scratches are shown in Fig. 2. These images show fractions of the entire glass substrate, and the number of these "sub-glass" images depends on the size or generation of the substrate. As mentioned earlier, some defects such as surface flaws/scratches and waviness are still judged by human inspectors because of their stochastic appearance. In general, however, quality inspection by skilled human inspectors is limited and open to discrepancy: in most cases, it is performed only at the final check for the purpose of picking out defective products. It is time-consuming and costly to train and maintain skilful human inspectors. Furthermore, inconsistency in human judgement is an open issue.
Fig. 2. Examples of the three types of surface flaws or scratches (panels: Defect type A, Defect type B, Defect type C).
3. METHODS AND ALGORITHMS
3.1 Feature Extraction
As the images of different surface defects on glass substrates show, detecting these defects is closely related to texture analysis, since image texture can be defined as a function of the spatial variation in pixel intensities (Tuceryan & Jain, 1998). Wavelet texture analysis is known as a very powerful state-of-the-art method for extracting textural features from images (Liu & Han, 2011). In this paper, we used the wavelet co-occurrence signature (Van de Wouwer, 1998), which is a direct extension of the grey level co-occurrence matrix (GLCM) of Haralick et al. (1973). In other words, wavelet co-occurrence signatures are higher-order statistics based on the co-occurrence matrices of two-dimensional (2-D) wavelet detail coefficients (also called sub-images). Among the 14 GLCM features proposed by Haralick et al. (1973), four features (angular second moment, contrast, energy, entropy) are extracted from the GLCMs of the wavelet sub-images after applying a five-level wavelet decomposition to the substrate sub-images using the bior1.3 wavelet function (biorthogonal wavelets with an order-one reconstruction filter and an order-three decomposition filter). We chose the design parameters of the wavelet transform by trial and error. For more detail, please refer to the references.
3.2 Feature Selection
The existence of irrelevant and redundant features may obscure the distribution of the truly relevant features for a target concept and hence degrade the models (John et al., 1994; Koller & Sahami, 1996). In addition, increasing the dimensionality of the feature space generally results in increased complexity of the interactions among the features and an increased degree of noise. In this study, we used a parallel genetic algorithm (PGA) to reduce the dimension of the feature space.
Let $C = \{x_1, x_2, \dots, x_p\}$ be the set containing all p possible features, and let $\Omega$ be the collection of all subsets of C. The goal of feature selection is to find, in some sense, the "best" $\omega \in \Omega$. In a genetic algorithm (GA), each individual ω is represented by a binary string of length p, which is treated as the genetic code (DNA) of ω. Starting with a randomly generated population of size m, $\{\omega_1, \omega_2, \dots, \omega_m\}$, a new generation is produced with three genetic operations: selection, reproduction, and mutation. Zhu & Chipman (2006) first demonstrated that the GA, although perfectly natural for the variable selection problem, is actually not easy to use or terribly effective, and then proposed a very simple modification. Their idea is to run a number of GAs in parallel without allowing each GA to fully converge, and to unify the information from all the individual GAs in the end. They also specified an appropriate stopping criterion, as follows. Given a collection of binary sequences of length p, let $r_j$ be the frequency with which the jth bit is equal to 1. Then the average "per bit" entropy of this collection is given by
$$\text{entropy} = -\frac{1}{p}\sum_{j=1}^{p}\left[ r_j \log_2(r_j) + (1 - r_j)\log_2(1 - r_j) \right] \qquad (1)$$
Therefore, the GA can be regarded as having converged when the entropy of the population is sufficiently close to 0, that is, below a prespecified threshold δ (e.g., δ = 0.05). They showed with a systematic simulation study that parallel evolution, or PGA, is competitive in its ability to recover the correct model. They also illustrated the strength and usefulness of parallel evolution with both simulated and real datasets and indicated its general applicability as a feature selection tool for more complex statistical models.
3.3 Support vector machines
Support vector machines (SVMs) are learning machines in which a linear function $f(x) = w^T x + b$ is used to solve the prediction problem. The best line is defined as the one that minimizes the following cost function (Q):
$$\min_{w,\, b,\, \xi,\, \xi^*} \; Q = \frac{1}{2}\|w\|^2 + C\sum_{i=1}^{N}(\xi_i + \xi_i^*) \qquad (2)$$
subject to:
$$y_i - (w^T x_i + b) \le \varepsilon + \xi_i, \quad (w^T x_i + b) - y_i \le \varepsilon + \xi_i^*, \quad \xi_i \ge 0, \; \xi_i^* \ge 0, \quad i = 1, \dots, N$$
where $\xi_i$ and $\xi_i^*$ are the corresponding positive and negative errors at the ith point, respectively, and N is the total number of samples. The first part of this cost function is a weight decay term, which regularizes the weight sizes by penalizing large weights. The second part is a penalty function that penalizes errors larger than ±ε using a so-called ε-insensitive loss function $L_\varepsilon$ for each of the N training points. The positive constant C determines the extent to which deviations larger than ε are tolerated. The third part of the equation is the set of constraints on the errors between the regression predictions ($w^T x_i + b$) and the true values ($y_i$).
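As a minimal sketch of how the cost in Eq. 2 is evaluated, the following Python code computes the ε-insensitive loss and the resulting objective for a given linear model. It relies on the standard fact that, at the optimum, the two slack variables of a sample sum to max(0, |y_i - f(x_i)| - ε). The function names and the toy data are hypothetical and are not taken from the plant study.

```python
def eps_insensitive_loss(y_true, y_pred, eps):
    """Return max(0, |y_true - y_pred| - eps): zero inside the eps-tube."""
    return max(0.0, abs(y_true - y_pred) - eps)


def primal_cost(w, b, X, y, C, eps):
    """Evaluate 0.5*||w||^2 + C * (sum of eps-insensitive losses) for a linear model."""
    reg = 0.5 * sum(wj * wj for wj in w)              # weight decay term
    total_loss = 0.0
    for xi, yi in zip(X, y):
        pred = sum(wj * xj for wj, xj in zip(w, xi)) + b   # f(x) = w^T x + b
        total_loss += eps_insensitive_loss(yi, pred, eps)  # slack of this sample
    return reg + C * total_loss


# Toy data (illustrative values only).
X = [[1.0, 2.0], [3.0, 0.0]]
y = [1.0, 2.0]
cost = primal_cost(w=[1.0, 0.0], b=0.0, X=X, y=y, C=10.0, eps=0.1)
print(cost)
```

In this toy case the first sample is predicted exactly and contributes nothing, while the second is off by 1.0 and therefore contributes 0.9 after the ε = 0.1 tube is subtracted. Larger C makes such deviations more expensive relative to the weight decay term.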
The minimization of Eq. 2 is a standard problem in optimization theory: minimization with constraints. It can be solved by applying Lagrangian theory, which yields the following dual problem:
$$\max_{a,\, a^*} \; W = -\frac{1}{2}\sum_{i=1}^{l}\sum_{j=1}^{l}(a_i - a_i^*)(a_j - a_j^*)\langle x_i, x_j\rangle + \sum_{i=1}^{l}\left[a_i(y_i - \varepsilon) - a_i^*(y_i + \varepsilon)\right] \qquad (3)$$
subject to:
$$\sum_{i=1}^{l}(a_i - a_i^*) = 0, \quad 0 \le a_i, a_i^* \le C, \quad i = 1, 2, \dots, l$$
where l is the number of training samples. From the above problem, $a_i$ and $a_i^*$ can be found, and the weight vector w can then be expressed as:
$$w = \sum_{i=1}^{l}(a_i - a_i^*)\, x_i \qquad (4)$$
The estimation function can then be written as:
$$f(x) = \sum_{i=1}^{l}(a_i - a_i^*)\, K(x_i, x) + b \qquad (5)$$
where $K(x_i, x)$ is called the kernel function. In this study, the radial basis function (RBF) was used:
$$K(x_i, x) = \exp\left(-\frac{\|x_i - x\|^2}{2\zeta^2}\right) \qquad (6)$$
where ζ is the kernel parameter (Burges, 1998; Cristianini and Taylor, 2000; Li et al., 2009).
3.4 Optimization of the SVM parameters using simulated annealing algorithm
Discovering the optimal values of the SVM parameters is important for achieving good forecast and estimation performance (Zhang and Guo, 2009; Mo et al., 2010). In this study, the simulated annealing (SA) algorithm was used to optimize the parameters of the SVM. More specifically, SA executes the following steps (Geng et al., 2007):
1. Choose an initial solution $x^{(0)}$ and compute the value of the objective function, $F(x^{(0)})$.
2. Initialize the incumbent solution (i.e., the best available solution), denoted by $(x^*, F^*)$, as:
$$(x^*, F^*) = (x^{(0)}, F(x^{(0)})) \qquad (7)$$
3. Until a stopping criterion is fulfilled, and for n starting from 0, do:
a) Draw a solution x at random in the neighbourhood $V(x^{(n)})$ of $x^{(n)}$.
b) If $F(x) \le F(x^{(n)})$, then $x^{(n+1)} = x$, and if $F(x) < F^*$, then $(x^*, F^*) = (x, F(x))$.
c) If $F(x) > F(x^{(n)})$, then draw a number p at random in [0, 1]; if $p \le p(n, x, x^{(n)})$, then $x^{(n+1)} = x$, else $x^{(n+1)} = x^{(n)}$.
The function $p(n, x, x^{(n)})$ is often taken to be a Boltzmann function inspired by thermodynamic models:
$$p(n, x, x^{(n)}) = \exp\left(-\frac{1}{T_n}\Delta F_n\right) \qquad (8)$$
where $\Delta F_n = F(x) - F(x^{(n)})$ and $T_n$ is the temperature at step n, which is a non-increasing function of the iteration counter n. In so-called geometric cooling schedules, the temperature is kept unchanged during each successive stage, where a stage consists of a constant number L of consecutive iterations. After each stage, the temperature is multiplied by a constant factor $\alpha \in (0, 1)$.
In the SA algorithm, if the candidate does not improve the current solution, there is still a possibility of transition according to the following probability function (Azizi and Zolfaghari, 2004):
$$P(\text{transition}) = \min\left\{1, \exp\left(-\frac{\Delta C_i}{T_i}\right)\right\} \qquad (9)$$
where $\Delta C_i$ is the change in the objective value caused by the candidate at iteration i and $T_i$ is the temperature at that iteration. In each iteration, this transition probability is compared with a uniform random number. If the probability is greater than or equal to the random number, the transition to the worse solution is accepted. If the transition from the current solution to the candidate solution is rejected, another solution in the neighbourhood is generated and evaluated.
4. EXPERIMENTAL RESULTS
In this study, a total of 1182 sub-glass images were collected and labelled by expert inspectors. There are four classes in total: on-specification sub-glass (labelled OK) and off-specification sub-glass with three types of defects (labelled A, B, and C). A training set of 826 sub-glass images was obtained from the 1182 images by random selection, and the remaining 356 images were used as a testing set. Ninety-six wavelet co-occurrence features were extracted from each image and used as the feature vector of a sub-glass image. Therefore, the training set is an 826×96 matrix and the test set is a 356×96 matrix.
In this investigation, the radial basis function is employed as the kernel function of the SVMs. This choice is motivated by the empirical finding that radial basis kernels tend to give good performance under general smoothness assumptions and should therefore be considered especially if no additional knowledge of the data is available. As there is no structured way to choose the optimal parameters of SVMs, the parameter values were optimized by the simulated annealing algorithm, which produces a near-optimal result. Table 1 shows the optimal values of the SVM parameters resulting from the SA analysis for the classification of the four classes of glass substrates.
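The simulated annealing procedure of Section 3.4 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in the study: the objective `toy_objective`, the neighbourhood function `gaussian_step`, and the defaults for the initial temperature, cooling factor, stage length, and number of stages are all hypothetical. In the setting of this paper, F would presumably be a classification error of the SVM evaluated at a candidate parameter vector, which is an assumption here.

```python
import math
import random


def simulated_annealing(F, x0, neighbour, T0=1.0, alpha=0.9, L=20, n_stages=50, seed=0):
    """Minimise F from x0 using a geometric cooling schedule.

    neighbour(x, rng) must return a random candidate near x.
    T0, alpha, L and n_stages are illustrative defaults, not values from the paper.
    """
    rng = random.Random(seed)
    x, fx = x0, F(x0)                 # current solution x^(n)
    best_x, best_f = x, fx            # incumbent (x*, F*)
    T = T0
    for _ in range(n_stages):
        for _ in range(L):            # temperature is constant within a stage
            cand = neighbour(x, rng)
            fc = F(cand)
            delta = fc - fx
            # Always accept improvements; accept worse moves with
            # probability exp(-delta / T) (Boltzmann acceptance rule).
            if delta <= 0 or rng.random() <= math.exp(-delta / T):
                x, fx = cand, fc
                if fx < best_f:
                    best_x, best_f = x, fx
        T *= alpha                    # geometric cooling, 0 < alpha < 1
    return best_x, best_f


def toy_objective(params):
    """Hypothetical smooth objective with its minimum at (3, -1)."""
    a, b = params
    return (a - 3.0) ** 2 + (b + 1.0) ** 2


def gaussian_step(x, rng, scale=0.5):
    """Hypothetical neighbourhood: perturb every coordinate with Gaussian noise."""
    return [xi + rng.gauss(0.0, scale) for xi in x]


best_x, best_f = simulated_annealing(toy_objective, [0.0, 0.0], gaussian_step)
print(best_x, best_f)
```

The incumbent is tracked separately from the current solution, so an accepted uphill move never loses the best point found so far.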
Table 1. The optimal values of the SVM parameters resulting from the simulated annealing algorithm
                            SVM parameter
                            Kernel parameter (ζ)   Insensitive parameter (ε)   Punishment coefficient (C)
Without Feature Selection   0.5                    0.05                        17
With Feature Selection      0.9                    0.009                       24

In this section, we provide the testing results for the automatic inspection system. Table 2 shows the performance of the proposed system with and without feature selection for the classification of TFT-LCD glass substrates. The training accuracy is the same with and without feature selection, but the test accuracy differs: the test accuracies of the SVM classification without and with feature selection are 77.53% and 83.13%, respectively. Thus, the optimized SVM with feature selection classifies the four classes with more satisfactory performance. The accuracy used to evaluate the performance of the proposed method is defined as:
$$\text{Accuracy} = \frac{T_{OK} + T_A + T_B + T_C}{n} \qquad (10)$$
where $T_{OK}$, $T_A$, $T_B$, and $T_C$ are the numbers of sample cases correctly classified in the classes OK, A, B, and C, respectively, and n is the total number of sample cases.

Table 2. Performance of the proposed method
Optimized SVM               Accuracy of Training (%)   Accuracy of Testing (%)
Without Feature Selection   100                        77.53
With Feature Selection      100                        83.13

Figure 3 shows the performance of our proposed method, obtained with the optimized SVM, in predicting each of the four classes. Lift and gain charts are useful tools for measuring the value of a predictive model. The basic idea of lift and gain is to sort the predicted target values in decreasing order of purity on some target category and then compare the proportion of cases with the category in each bin with the overall proportion. The lift and gain values show how much improvement the model provides in picking out the best of the cases. A gain chart displays the cumulative percentage of the target value on the vertical axis and the cumulative percentage of the population on the horizontal axis. Cumulative gain is the ratio of the expected outcome when the model is used to prioritize the prospects to the expected outcome of random selection. The straight diagonal line shows the expected return if no model is used for the population. The shaded area between the lines shows the improvement (gain) from the model. A gain of 1.00 means that no selective targeting is being done.
Fig. 3. Gain charts for the classification of the OK class and defect classes A, B, and C (panels: a) OK, b) Defect type A, c) Defect type B, d) Defect type C).
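The quantities plotted in a gain chart of the kind described above can be computed as in the following sketch. It is an illustration only: the function name `cumulative_gain`, the number of bins, and the toy scores and labels are hypothetical and do not reproduce the data behind Fig. 3. It assumes one score per case for the class of interest and at least as many cases as bins.

```python
def cumulative_gain(scores, is_target, n_bins=10):
    """Return one (cum. % population, cum. % targets captured, cum. lift) tuple per bin.

    scores: predicted score for the class of interest, one per case.
    is_target: 1 if the case truly belongs to that class, else 0.
    """
    n = len(scores)
    total_targets = sum(is_target)
    if n < n_bins or total_targets == 0:
        raise ValueError("need at least n_bins cases and one target case")
    # Sort case indices from the highest to the lowest predicted score.
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)
    rows = []
    for b in range(1, n_bins + 1):
        cutoff = round(b * n / n_bins)            # cases in the top b bins
        captured = sum(is_target[i] for i in order[:cutoff])
        pop_pct = 100.0 * cutoff / n              # horizontal axis
        gain_pct = 100.0 * captured / total_targets   # vertical axis
        rows.append((pop_pct, gain_pct, gain_pct / pop_pct))
    return rows


# Toy example (illustrative values only): 10 cases, 3 of the target class.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
is_target = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
rows = cumulative_gain(scores, is_target, n_bins=5)
for row in rows:
    print(row)
```

In the toy example the top 40% of cases already contain all three target cases, so the gain curve reaches 100% there and the cumulative lift falls back to 1.0 once the whole population is included, which corresponds to the diagonal "no model" line.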
5. CONCLUSION
This paper presents a method based on an optimized SVM for inspecting surface defects on TFT-LCD glass substrates, which is useful for inline inspection systems in the glass substrate manufacturing industry. In this study, using SA to find the best SVM parameters provided near-optimal classification performance. The experimental results on images from real production lines show that our method can provide competitive performance in detecting different types of surface defects.
ACKNOWLEDGEMENT
This work was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education, Science and Technology (2010-00003056).
REFERENCES
Azizi, N. and S. Zolfaghari (2004). Adaptive temperature control for simulated annealing: a comparative study. Computers and Operations Research, 31, 2439-2451.
Besalatpour, A.A., M.A. Hajabbasi, S. Ayoubi, A. Gharipour and A. Yousefian Jazi (2012). Prediction of soil physical and mechanical properties using optimized support vector machines. International Agrophysics, 26.
Burges, C.J.C. (1998). A tutorial on support vector machines for pattern recognition. Data Mining and Knowledge Discovery, 2(2), 121-167.
Cristianini, N. and J.S. Taylor (2000). An Introduction to Support Vector Machines and Other Kernel-Based Learning Methods. Cambridge University Press, New York.
Dockerty, S.M. (1967). Sheet forming apparatus. U.S. Patent 3,338,696.
Duchesne, C., J.J. Liu and J.F. MacGregor (2011). Multivariate image analysis in the process industries: a review, submitted.
Freischlad, K. (1996). Large flat panel profiler. Proc. of SPIE, Flatness, Roughness, and Discrete Defect Characterization for Computer Disks, Wafers, and Flat Panel Displays, 2862, 163-171.
Geng, X., J. Xu, J. Xiao and L. Pan (2007). A simple simulated annealing algorithm for the maximum clique problem. Information Sciences, 177, 5064-5071.
Haralick, R.M., K. Shanmugam and I. Dinstein (1973). Textural features for image classification. IEEE Transactions on Systems, Man, and Cybernetics, 3, 610-621.
Huang, Y., Y. Lan, S.J. Thomson, A. Fang, W.C. Hoffmann and R.E. Lacey (2010). Development of soft computing and applications in agricultural and biological engineering. Computers and Electronics in Agriculture, 71, 107-127.
John, G.H., R. Kohavi and K. Pfleger (1994). Irrelevant features and the subset selection problem. In ICML.
Kirkpatrick, S., C.D. Gelatt and M.P. Vecchi (1983). Optimization by simulated annealing. Science, 220, 671-680.
Koller, D. and M. Sahami (1996). Toward optimal feature selection. In ICML-96: Proceedings of the Thirteenth International Conference on Machine Learning, 284-292.
Lapp, J.C. (1994). Advanced glass substrates for flat panel displays. Proc. of SPIE, Advanced Flat Panel Display Technologies, 2174, 129-138.
Li, H., Y. Liang and Q. Xu (2009). Support vector machines and its applications in chemistry. Chemometrics and Intelligent Laboratory Systems, 95, 188-198.
Liu, J.J., J.F. MacGregor, C. Duchesne and G. Bartolacci (2005). Monitoring of flotation processes using multiresolutional multivariate image analysis. Minerals Engineering, 18(1), 65-76.
Liu, J. (2004). Machine vision for process industries: monitoring, control, and optimization of visual quality of processes and products. Ph.D. thesis, McMaster University, Canada.
Liu, J.J. and C. Han (2011). Wavelet texture analysis in process industries. Korean Journal of Chemical Engineering, 28(9), 1814-1823.
Liu, Y.H., Y.K. Huang and M.J. Lee (2008). Automatic inline-defect detection for a thin film transistor-liquid crystal display array process using locally linear embedding and support vector data description. Measurement Science and Technology, 19, 495-501.
Liu, Y.H., S.H. Lin, Y.L. Hsueh and M.J. Lee (2009). Automatic target defect identification for TFT-LCD array process inspection using kernel FCM based fuzzy SVDD ensemble. Expert Systems with Applications, 36, 1978-1998.
Mauch, R.H. (2000). Thin glass substrates for mobile applications. Proc. of SPIE, Inorganic Optical Materials II, 4102, 162-168.
Mo, Z., H. Xie, H. Liu and F. Li (2010). Parameter optimization of SVM based on HQGA. Proc. Sixth Int. Conf. Natural Computation, 2429-2433.
Pilkington, L.A.B. (1969). The float glass process. Proc. of the Royal Society of London A, 314, 1-25.
Roweis, S.T. and L.K. Saul (2000). Nonlinear dimensionality reduction by locally linear embedding. Science, 290, 2323-2326.
Scholkopf, B., C.J.C. Burges and A.J. Smola (1999). Advances in Kernel Methods: Support Vector Learning. MIT Press.
Tax, D. and R. Duin (2004). Support vector data description. Machine Learning, 54, 45-66.
Tuceryan, M. and A.K. Jain (1998). Texture analysis. In: Chen, C.H., Pau, L.F., Wang, P.S.P. (Eds), The Handbook of Pattern Recognition and Computer Vision (2nd edition), World Scientific Publishing Co.
Van De Wouwer, G. (1998). Wavelets for multiscale texture analysis. Ph.D. thesis, University of Antwerp, Belgium.
Vapnik, V. (1998). Statistical Learning Theory. Wiley, New York.
Zhang, X. and Y. Guo (2009). Optimization of SVM parameters based on PSO algorithm. Proc. Fifth Int. Conf. Natural Computation.
Zhu, H., Q. Lin and B. Zhang (2000). Analysis of system error in the measurement of liquid crystal empty cell gap by means of interferometry. Displays, 21, 121-126.