An intelligent system for wafer bin map defect diagnosis: An empirical study for semiconductor manufacturing


Engineering Applications of Artificial Intelligence journal homepage: www.elsevier.com/locate/engappai

Brief paper

Chiao-Wen Liu, Chen-Fu Chien*

Department of Industrial Engineering & Engineering Management, National Tsing Hua University, Hsinchu 30013, Taiwan

Article info

Article history: Received 2 August 2012; received in revised form 21 November 2012; accepted 25 November 2012

Keywords: Wafer bin map (WBM); Semiconductor manufacturing; Yield enhancement; Spatial statistics test; Cellular neural network (CNN); ART1 neural network

Abstract

Wafer bin maps (WBMs) that show specific spatial patterns can provide clues for identifying process failures in semiconductor manufacturing. In practice, most companies rely on experienced engineers to find the specific WBM patterns visually. However, as wafer size grows and integrated circuit (IC) feature size continuously shrinks, WBM patterns become complicated by differences in die size, wafer rotation, and the density of failed dies, so human judgments become inconsistent and unreliable. To fill this gap, this study aims to develop a knowledge-based intelligent system for WBM defect diagnosis for yield enhancement in wafer fabrication. The proposed system consists of three parts: the graphical user interface, the WBM clustering solution, and the knowledge database. In particular, the developed WBM clustering approach integrates a spatial statistics test, cellular neural network (CNN), adaptive resonance theory (ART) neural network, and moment invariant (MI) to cluster different patterns effectively. In addition, an interactive converse interface is developed to present the possible root causes in the order of similarity matching and to record the diagnosis know-how of the domain experts in the knowledge database. To validate the proposed WBM clustering solution, twelve different WBM patterns collected in real settings are used to demonstrate the performance of the proposed method in terms of purity, diversity, specificity, and efficiency. The results show the validity and practical viability of the proposed system. Indeed, the developed solution has been implemented in a leading semiconductor manufacturing company in Taiwan. The proposed WBM intelligent system can recognize specific failure patterns efficiently and also record the assignable root causes verified by the domain experts to enhance troubleshooting effectively. © 2012 Published by Elsevier Ltd.

* Corresponding author. Tel.: +886 3 5742648; fax: +886 3 5722685. E-mail address: [email protected] (C.-F. Chien).

0952-1976/$ - see front matter © 2012 Published by Elsevier Ltd. http://dx.doi.org/10.1016/j.engappai.2012.11.009

Please cite this article as: Liu, C.-W., Chien, C.-F., An intelligent system for wafer bin map defect diagnosis: An empirical study for semiconductor manufacturing. Eng. Appl. Artif. Intel. (2012), http://dx.doi.org/10.1016/j.engappai.2012.11.009

1. Introduction

The semiconductor manufacturing process is lengthy and technology intensive: it contains several hundred process steps with advanced tools to fabricate integrated circuits (ICs) on a silicon wafer in the wafer fabrication facility (fab). Semiconductor manufacturing is also very capital intensive; building a modern 12 in. wafer fab with 40 nm process technologies requires more than 4 billion USD. Therefore, fast ramp-up of new process technology and quick response to yield excursions for strict quality control are crucial to maintaining competitive advantages. For each fabricated wafer, the circuit probe (CP) test is performed on each of the gross dies on the wafer and can thus detect the specific failures with the corresponding bin values. The CP yield is one critical yield measure in semiconductor manufacturing (Cunningham et al., 1995), since only the known good dies of the CP test can be packaged into chips for further usage. WBMs are the spatial results of the CP test; they present specific patterns that experienced engineers can recognize in order to track the corresponding process failures in semiconductor manufacturing. Hence, the troubleshooting of low CP yield depends on two steps: recognizing the WBM patterns and identifying the assignable root causes. In practice, most companies rely on experienced engineers to analyze WBMs, although it is time consuming and unreliable for engineers to cluster the patterns by eyeball analysis. Furthermore, the identification of the possible root causes from the specific WBM failure patterns is affected by the domain knowledge and troubleshooting experience of the engineers.

A number of studies have applied data mining techniques for troubleshooting and yield enhancement in semiconductor manufacturing (Ooi et al., 2012; Chien et al., 2010, 2007; Wang, 2008; Chien and Chen, 2007; Wang et al., 2006; Palma et al., 2005; Chen et al., 2000). To enhance CP yield, Wu and Zhang (2010) proposed a novel fuzzy neural network that takes the number of defects per wafer, mean number of defects per chip, mean number of defects per unit area, clustering parameter, chip size and five critical electrical test parameters as the input variables. In addition, Self-Organization Map (SOM) clustering has been applied to cluster E-test, CP fail bins and metrology data to detect the failure patterns (Chien et al., 2003). Langford et al. (2001) presented a robust windowing method for the Poisson yield model to extract the systematic and random components of yield from wafer probe bin map data. For WBM clustering problems, Friedman et al. (1997) defined four defect clustering patterns, namely random, edge, top and bull's eye patterns, but gave no information about pattern types. For example, mask misalignment in the lithographic process generates a checkerboard pattern; abnormal temperature control in the rapid thermal annealing process (RTP) can generate a ring of failing chips around the edge of the wafer; the zone pattern often arises from thin film deposition (Chien et al., 2002). Hsu and Chien (2007) proposed a hybrid data mining approach that integrates spatial statistics and ART neural networks to extract patterns from WBMs and associate them with manufacturing defects. Wang (2009) proposed an approach that combines spatial statistics, kernel-based eigendecomposition and support vector clustering to estimate the number of defect clusters and to separate the convex and nonconvex defect clusters on WBMs.

However, as wafer size grows and IC feature size continuously shrinks, WBM patterns become complicated by the different die sizes, wafer rotation, and the varying density of failed dies. Motivated by the needs in real settings and improved via empirical studies, this research developed a WBM clustering solution that integrates the spatial statistics test, cellular neural network (CNN), ART1 neural network, and moment invariant (MI) to cluster WBM failure patterns automatically and efficiently. Furthermore, we constructed an intelligent system for WBM defect diagnosis, equipped with a user-friendly graphical interface to help the engineers detect the root cause and a knowledge database to record the know-how and domain knowledge. In particular, a number of WBMs of the twelve patterns defined by domain engineers were used for validation, in which four performance indexes were employed to compare the proposed approach with other clustering methods. The results show that the clustering performance of the proposed approach is superior to that of the other methods. Indeed, the developed solution has been implemented in a leading semiconductor manufacturing company in Taiwan.

The rest of this study is organized as follows: Section 2 introduces the CP yield and the WBMs. Section 3 outlines the proposed system. Section 4 describes an experiment for the proposed pattern clustering system. Section 5 concludes this study with discussions of the results and future research directions.

2. CP yield and wafer bin maps

Yield is a widely used performance measure in semiconductor manufacturing. Sampling tests are the way to monitor production quality during wafer fabrication. Circuit probe (CP) testing involves testing individual dies for functionality using different electrical probes and provides key information about the performance of the wafer fabrication process (Chien et al., 2011). Then, the wafers are diced, and the good dies are packaged into chips and shipped to the customer (Kumar et al., 2006; Hsu and Chien, 2007). There are three kinds of yield in semiconductor manufacturing: line yield, CP yield, and final test yield (Cunningham et al., 1995). Among them, CP yield is the critical factor of the manufacturing yield, and work on it is divided into two major categories: one is baseline yield improvement; the other is low-yield troubleshooting.

Wafer bin maps (WBMs) are the result of CP inspection of the dies on the wafer at the end of fabrication. They are multi-dimensional, have complex structures, and can provide essential information for engineers to identify problems in the manufacturing process. WBM patterns can provide information to monitor the process and product. Fig. 1 shows a typical WBM where the different symbols denote chips failing in different functional tests. The failure patterns of WBMs can be classified into three major categories (Stapper, 2000; Taam and Hamada, 1993):

(1) Random defect: No spatial clustering or pattern exists, and the defective chips are randomly distributed in the two-dimensional map. Random defects are usually caused by manufacturing environmental factors. Even in a near-sterile environment, particles cannot be removed completely. However, reducing the level of random defects can improve the overall productivity of wafer fabrication.

(2) Systematic defect: The positions of the defective chips on the wafer show spatial correlation, for example, ring, edge-fail, or checkerboard. Fig. 2 shows ten systematic patterns that are frequently seen in the fab, as defined by domain experts.

(3) Mixed defect: A random defect and a systematic defect appear in one map. Most wafer maps are of this type, as shown in Fig. 3. Engineers need to separate the random and systematic defects in the WBM, since the signature of the systematic defect can reveal the process problem (Friedman et al., 1997).

Fig. 1. Defect map and bin map.

Fig. 2. Six different root causes of wafer map patterns.


Fig. 3. Mixed defect pattern of systematic and random defects.

3. System design and implementation

As illustrated in Fig. 4, we developed a WBM diagnosis system that consists of three modules: the user interface, the knowledge management database, and the WBM patterns clustering system. There are two kinds of flow in the system: control and data. The control flow starts from the user, who controls the parameters of the WBM patterns clustering system and acquires the WBM pattern diagnosis knowledge via the user interface. The data flow shows the data path. The three modules provide data and manufacturing intelligence to the engineers via the graphical user interface. For WBM clustering, data in the database go to the WBM patterns clustering system, and the clustering results are summarized in a report to the user and recorded in the knowledge base.

[Fig. 4 diagram. Legend: control flow, data flow. Blocks: User; User Interface; Knowledge base; Database; Test Spatial pattern (Random, Mask Error, Specific Pattern); Noise Filter; Pattern Clustering; Clustering Result.]

Fig. 4. Conceptual design of the proposed WBM diagnosis system.

3.1. The user interface and knowledge management database

According to Fig. 4, the user interface module allows users to execute the following system operations: loading the analyzed dataset, adding or retracting decision knowledge, controlling the parameters of the WBM patterns clustering system, and monitoring the clustering results. The design of the user interface focuses on four points: (1) parameter setting, (2) graphical illustration of WBMs, (3) graphical report presentation, and (4) button clicks for keyword search in the knowledge database and for user comments. The graphical user interface is easy to use and helps the user identify the root cause quickly. The button-click function in the user interface helps to store the user's domain know-how and to search keywords in the knowledge base. When the knowledge base is well built, the intelligent system will be constructed by the statistical inference methods. In this research, we focus on the pattern clustering system for WBM clustering. The detailed procedure is described in the following subsection.

3.2. WBM patterns clustering procedure

3.2.1. Spatial independence test

In this section, we divide the data into three categories according to the positions of the bin values distributed on the WBMs. Following Taam and Hamada (1993), we applied join-count statistics based on a spatial clustering method to examine the spatial dependence of the WBM. For a given WBM, if die i is defective with respect to the target bin value, it is denoted as $Y_i = 1$ (Bad); for other bin values, $Y_i = 0$ (Other). Four statistics are defined as follows; for example, $N_{BB}$ counts the neighboring pairs in which die i and its neighbor die j are both bad. That is,

$$
\begin{cases}
N_{OO} = \sum\sum_{i<j} \delta_{ij}\,(1 - Y_i)(1 - Y_j) \\
N_{OB} = \sum\sum_{i<j} \delta_{ij}\,(1 - Y_i)\,Y_j \\
N_{BO} = \sum\sum_{i<j} \delta_{ij}\,Y_i\,(1 - Y_j) \\
N_{BB} = \sum\sum_{i<j} \delta_{ij}\,Y_i\,Y_j
\end{cases}
\tag{1}
$$

where $\delta_{ij}$ is a neighboring index: $\delta_{ij}$ equals 1 if die i and die j are neighbors; otherwise, $\delta_{ij}$ equals 0. Table 1 is a two-way contingency table summarizing the relations.

Table 1. Contingency table of the adjacent dies (i < j).

                  Die j: Other    Die j: Bad
Die i: Other      N_OO            N_OB
Die i: Bad        N_BO            N_BB

We examine the spatial dependence of the WBM by testing the following hypotheses:

H0: the dies are randomly distributed on the wafer (i.e., spatial independence).
H1: the dies are not randomly distributed on the wafer.

Based on the defined statistics of WBMs, a natural logarithm of the odds ratio is proposed as follows:

$$
\log\hat{\theta} = \log \frac{(N_{OO} + 0.5)(N_{BB} + 0.5)}{(N_{OB} + 0.5)(N_{BO} + 0.5)}
\tag{2}
$$

When the number of dies on a wafer is large, $\log\hat{\theta}$ is approximately normal with

$$
\mu_{\log\theta} = 0, \qquad
\sigma_{\log\theta} = \left( \frac{1}{N_{OO} + 0.5} + \frac{1}{N_{OB} + 0.5} + \frac{1}{N_{BO} + 0.5} + \frac{1}{N_{BB} + 0.5} \right)^{1/2}
\tag{3}
$$

Therefore, a value of $\log\hat{\theta}$ close to 0 ($\log\hat{\theta} \approx 0$) indicates spatial independence; that is, the defective dies are randomly distributed on the wafer. A positive value of $\log\hat{\theta}$ ($\log\hat{\theta} \gg 0$) indicates an attraction among the dies; that is, the defective dies are clustered on the wafer. A negative value of $\log\hat{\theta}$ ($\log\hat{\theta} \ll 0$) indicates a repulsion; that is, the map presents a systematic pattern with no bad–bad neighboring dies (Agresti, 1990). For example, if there is a flaw on the mask, the defect is scattered (i.e., a repulsive pattern) yet systematically repeated. In other words, the sign of $\log\hat{\theta}$ indicates the type of spatial dependence, while its magnitude denotes the degree of dependence. Then, we use ART1 to cluster the WBMs that present attraction among the dies (i.e., have high positive values of $\log\hat{\theta}$).

For illustration, let us examine the WBMs that have 268 dies in Table 2. According to the above equations, we can derive the corresponding values of $\log\hat{\theta}$.

Table 2. The odds ratio statistics of four different kinds of WBMs.

No.                        No. 1     No. 2     No. 3            No. 4
Different types of WBMs    [wafer map image for each column]
log θ̂                      0.102     0.006     2.876            −2.764
N_OO                       875       19        424              806
N_BB                       2         712       357              0
N_OB + N_BO                88        234       184              159
p value                    0.592     0.587     6.43 × 10^−13    0.998

Fig. 5. 5 × 5 grid (the central defective die is marked in black, gray denotes a defective die, and white denotes a good die).

In particular, WBM No. 1 and WBM No. 2 fail to reject the null hypothesis at the significance level α = 0.05. WBM No. 3 rejects the null hypothesis (p value = 6.43 × 10^−13) and shows a specific spatial pattern. WBM No. 4 also rejects the null hypothesis (p value = 0.998) and shows a repulsive pattern.
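The join-count statistics and the log odds ratio of Eqs. (1)–(3) can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names `join_counts` and `log_odds_ratio` are hypothetical, the 8-neighbor (king's move) adjacency is an assumption because the text does not state which dies count as neighbors, and pairs are ordered row by row so that die i precedes die j. Because Table 2 reports only the sum of the mixed joins, the last lines assume an even split of the 184 mixed joins of WBM No. 3.

```python
import numpy as np
from math import log, sqrt

def join_counts(bad, valid=None, king=True):
    """Count OO, OB, BO and BB joins between neighboring dies (Eq. (1)).

    bad   : 2-D array, 1 where the die shows the target bin value.
    valid : optional 2-D mask of positions that hold a die (assumption:
            every position is a die when omitted).
    king  : True for 8-neighbor adjacency (assumed), False for 4-neighbor.
    """
    bad = np.asarray(bad, dtype=bool)
    valid = np.ones_like(bad) if valid is None else np.asarray(valid, dtype=bool)
    # Each unordered pair is visited once, from the die that comes first
    # in row-major order (die i) to the later one (die j).
    offsets = [(0, 1), (1, 0)] + ([(1, 1), (1, -1)] if king else [])
    rows, cols = bad.shape
    counts = {"OO": 0, "OB": 0, "BO": 0, "BB": 0}
    for r in range(rows):
        for c in range(cols):
            if not valid[r, c]:
                continue
            for dr, dc in offsets:
                r2, c2 = r + dr, c + dc
                if 0 <= r2 < rows and 0 <= c2 < cols and valid[r2, c2]:
                    key = ("B" if bad[r, c] else "O") + ("B" if bad[r2, c2] else "O")
                    counts[key] += 1
    return counts

def log_odds_ratio(counts):
    """Return log(theta_hat) from Eq. (2), sigma from Eq. (3), and their ratio."""
    n = {k: v + 0.5 for k, v in counts.items()}
    log_theta = log(n["OO"] * n["BB"] / (n["OB"] * n["BO"]))
    sigma = sqrt(sum(1.0 / v for v in n.values()))
    return log_theta, sigma, log_theta / sigma

# WBM No. 3 of Table 2, assuming the 184 mixed joins split evenly.
log_theta, sigma, z = log_odds_ratio({"OO": 424, "OB": 92, "BO": 92, "BB": 357})
```

Under the even-split assumption the computed `log_theta` agrees with the 2.876 listed for WBM No. 3; no claim is made that the standardized value `z` reproduces the p values in Table 2.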

3.2.2. Noise elimination and pattern improvement

Since most WBMs are composites of random errors and systematic defects, the degrees of spatial dependence vary. However, the involved systematic defects can be reduced by identifying the assignable causes and performing the corresponding troubleshooting. To reduce the noise of random errors, we propose using the cellular neural network (CNN) to amplify the specific clustering patterns and to filter the random error before clustering. However, since the input data of the CNN are continuous, a weight is assigned to the neighbors of each die on the WBM before the CNN transformation. In this study, we consider the neighborhood with r = 2. After several experimental trials, the weight was defined as follows:

$$
Q = \begin{bmatrix}
0.006 & 0.006 & 0.006 & 0.006 & 0.006 \\
0.006 & 0.05 & 0.05 & 0.05 & 0.006 \\
0.006 & 0.05 & 0.1 & 0.05 & 0.006 \\
0.006 & 0.05 & 0.05 & 0.05 & 0.006 \\
0.006 & 0.006 & 0.006 & 0.006 & 0.006
\end{bmatrix}
\tag{4}
$$

For example, given the 5 × 5 grid in Fig. 5, the value of the central die (marked in black, with the other defective dies marked in gray) will be 0.224 after transformation by matrix Q. After transforming the values on the WBMs, we apply the cellular neural networks proposed by Chua and Yang (1988a, 1988b) to eliminate the noise effect and improve the pattern effect.
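The weighting step with matrix Q can be sketched as a weighted sum over each die's 5 × 5 neighborhood. This is an illustration under assumptions: the helper name `weight_wbm` is hypothetical, positions outside the map are treated as 0, and the example grid is a hypothetical configuration (not a transcription of Fig. 5) chosen so that the central die, two inner-ring dies and four outer-ring dies are defective.

```python
import numpy as np

# Weight matrix Q of Eq. (4): 0.1 at the center, 0.05 on the inner ring,
# 0.006 on the outer ring.
Q = np.full((5, 5), 0.006)
Q[1:4, 1:4] = 0.05
Q[2, 2] = 0.1

def weight_wbm(wbm, kernel=Q):
    """Replace each die by the Q-weighted sum of its neighborhood.

    wbm is a 2-D binary map (1 = defective die). Positions outside the
    map are padded with zeros, which is an assumption of this sketch.
    """
    wbm = np.asarray(wbm, dtype=float)
    k = kernel.shape[0] // 2
    padded = np.pad(wbm, k)
    rows, cols = wbm.shape
    out = np.zeros_like(wbm)
    for dr in range(kernel.shape[0]):
        for dc in range(kernel.shape[1]):
            out += kernel[dr, dc] * padded[dr:dr + rows, dc:dc + cols]
    return out

# Hypothetical 5 x 5 grid: center, two inner-ring and four outer-ring defects.
grid = np.zeros((5, 5))
grid[2, 2] = 1
grid[1, 2] = grid[2, 3] = 1
grid[0, 0] = grid[0, 4] = grid[4, 0] = grid[4, 4] = 1
center_value = weight_wbm(grid)[2, 2]   # 0.1 + 2 * 0.05 + 4 * 0.006
```

For this particular configuration the central value works out to 0.224, the same figure quoted for the Fig. 5 example, although the actual layout of Fig. 5 may differ.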

Fig. 6. The input and the output of an averaging operation. (a) Input pattern and (b) Output pattern.

Since the CNN theory was proposed for analog processing, we approximate the differential equation by a difference equation. In particular, given an initial state $v_x(0)$ and an input signal, the behavior of the system is determined by the interaction coefficients A(i, j), B(i, j) and a bias term $I_{ij}$, $1 \le i \le M$; $1 \le j \le N$. Here, A(i, j) is called the feedback template and B(i, j) the feed-forward template; together with the bias term $I_{ij} = c$ (where c is a constant), they constitute a cloning template. In the case of linear interaction, if the values of the interaction terms depend only on the relative positions within the neighborhood $N_r(i, j)$, the interaction pattern around a given cell is translation-invariant. Let t = nh (where h is a constant time step), let r be the neighborhood distance of $N_r(i, j)$, let C, $R_x$ and $R_y$ be real values and $C(i, j) \in N_r(i, j)$; then the derivative of $v_x(t)$ is approximated by the corresponding difference:

$$
C\left(v_x((n+1)h) - v_x(nh)\right) = h\left( \sum_{C(i,j)} A(i,j)\,v_y(nh) + \sum_{C(i,j)} B(i,j)\,v_y(nh) + I \right) - \frac{h}{R_x}\,v_x(nh)
\tag{5}
$$

where

$$
v_y(nh) = \frac{1}{2} R_y \left( |v_x(nh) + 1| - |v_x(nh) - 1| \right)
$$

Let $I_{ij} = \sum_{C(i,j)} B(i,j)\,v_y + I$. Suppressing the time step h from "nh" for simplicity, so that $v_x(n) \equiv v_x(nh)$ and $v_y(n) \equiv v_y(nh)$, we can recast (5) into the form

$$
v_x(n+1) = v_x(n) + \frac{h}{C}\left( -\frac{1}{R_x}\,v_x(n) + \sum_{C(i,j)} A(i,j)\,v_y(n) + I_{ij} \right)
\tag{6}
$$

If we substitute $v_y(nh) = f(v_x(nh))$, we obtain

$$
v_x(n+1) = v_x(n) + \frac{h}{C}\left( -\frac{1}{R_x}\,v_x(n) + \sum_{C(i,j)} A(i,j)\,f(v_x(n)) + I_{ij} \right)
\tag{7}
$$


For example, in the case of the smallest neighborhood (i.e., r = 1), for a given cell C(i, j), the interaction coefficients are as follows:

$$
A = \begin{bmatrix}
a_{i-1,j-1} & a_{i-1,j} & a_{i-1,j+1} \\
a_{i,j-1} & a_{i,j} & a_{i,j+1} \\
a_{i+1,j-1} & a_{i+1,j} & a_{i+1,j+1}
\end{bmatrix},
\qquad
B = \begin{bmatrix}
b_{i-1,j-1} & b_{i-1,j} & b_{i-1,j+1} \\
b_{i,j-1} & b_{i,j} & b_{i,j+1} \\
b_{i+1,j-1} & b_{i+1,j} & b_{i+1,j+1}
\end{bmatrix}
\tag{8}
$$

We define this image as loaded into the initial states $v_x(0)$ of the cells; then the mapping F(T) is an image processor with an output image $v_y(N)$. The template T(A, B, I) defines the image-processing operation as follows:

$$
A = \begin{bmatrix}
0 & 0.065 & 0 \\
0.065 & 0.5 & 0.065 \\
0 & 0.065 & 0
\end{bmatrix},
\qquad
B = \begin{bmatrix}
0 & 0 & 0 \\
0 & 0 & 0 \\
0 & 0 & 0
\end{bmatrix},
\qquad
I = 0
\tag{9}
$$

Then $v_y(t)$ tends to constant values in [0, 1], and our CNN image processor performs averaging over a 3 × 3 window. Fig. 6 illustrates an example: Fig. 6(a) shows the input WBM with noise and Fig. 6(b) shows the WBM after the CNN transformation.
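The discrete-time update of Eq. (7) with the template of Eq. (9) can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function names are hypothetical, and the step size h = 0.1, C = 1, R_x = 1, the unit output gain in f, the number of steps and the zero boundary are all assumptions that the text does not specify.

```python
import numpy as np

# Feedback template A of Eq. (9); B = 0 and I = 0, so only A is needed.
A = np.array([[0.0, 0.065, 0.0],
              [0.065, 0.5, 0.065],
              [0.0, 0.065, 0.0]])

def f(v):
    """Piecewise-linear CNN output function (unit gain assumed)."""
    return 0.5 * (np.abs(v + 1.0) - np.abs(v - 1.0))

def neighborhood_sum(img, template):
    """Sum of template-weighted neighbors for every cell (zero boundary)."""
    k = template.shape[0] // 2
    padded = np.pad(img, k)
    rows, cols = img.shape
    out = np.zeros_like(img, dtype=float)
    for dr in range(template.shape[0]):
        for dc in range(template.shape[1]):
            out += template[dr, dc] * padded[dr:dr + rows, dc:dc + cols]
    return out

def cnn_smooth(v0, template=A, bias=0.0, h=0.1, C=1.0, Rx=1.0, steps=10):
    """Iterate v(n+1) = v(n) + (h/C) * (-v(n)/Rx + sum A f(v(n)) + bias)."""
    v = np.asarray(v0, dtype=float).copy()
    for _ in range(steps):
        v = v + (h / C) * (-v / Rx + neighborhood_sum(f(v), template) + bias)
    return f(v)

# A 3 x 3 defect cluster and one isolated defective die.
wbm = np.zeros((9, 9))
wbm[1:4, 1:4] = 1.0
wbm[7, 7] = 1.0
out = cnn_smooth(wbm)
```

With these assumed settings the isolated die loses state faster than the die at the center of the cluster, which is the qualitative noise-suppression behavior the averaging template is meant to provide.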


between two layers of neurons. To manage varied inputs, ART has the following characteristics: (1) a balance between stability and plasticity, (2) match and reset, and (3) a balance between search and direct access. ART solves the stability-plasticity dilemma, in which learning new data leads to unstable conditions and loss of data. Several algorithms are derived from the original ART, including ART1 (Carpenter and Grossberg, 1987a; 1987b), ART2 (Carpenter and Grossberg, 1987b), ART3 (Carpenter and Grossberg, 1990), ARTMAP (Grossberg et al., 1992), and Fuzzy ARTMAP (Carpenter and Grossberg, 1990). This study employs the ART1 algorithm for WBM clustering, since the input data form a binary map. We transform the two-dimensional WBM inputs into one-dimensional binary vectors. The value of each die on the wafer is either 1 (denoting the target bin value) or 0 (otherwise).
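The way ART1 groups flattened binary maps can be sketched with a simplified fast-learning variant. This is a textbook-style sketch and not the modified ART1 used in this study: the function name `art1_cluster`, the choice parameter `beta` and the toy vectors are hypothetical, while the vigilance value 0.75 is the setting reported later in the experiments.

```python
import numpy as np

def art1_cluster(patterns, rho=0.75, beta=1e-6):
    """Cluster binary vectors with a simplified fast-learning ART1.

    For each input, categories are ranked by the choice function
    |x AND w| / (beta + |w|); the first one that passes the vigilance
    test |x AND w| / |x| >= rho resonates and its prototype is updated
    to x AND w. If none passes, a new category is created.
    """
    prototypes, labels = [], []
    for pattern in patterns:
        x = np.asarray(pattern, dtype=bool)
        norm_x = max(int(x.sum()), 1)          # guard against empty maps
        overlap = [int((x & w).sum()) for w in prototypes]
        order = sorted(range(len(prototypes)),
                       key=lambda j: overlap[j] / (beta + prototypes[j].sum()),
                       reverse=True)
        for j in order:
            if overlap[j] / norm_x >= rho:     # vigilance test (match)
                prototypes[j] = x & prototypes[j]
                labels.append(j)
                break
        else:                                  # every category was reset
            prototypes.append(x.copy())
            labels.append(len(prototypes) - 1)
    return labels, prototypes

# Three flattened toy maps: two overlapping patterns and one disjoint pattern.
maps = [
    [1, 1, 1, 1, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1, 1, 1],
]
labels, prototypes = art1_cluster(maps)
```

A two-dimensional WBM would be passed as `wbm.ravel()`. Raising `rho` makes the vigilance test stricter, so more categories are created.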

3.2.3. Pattern clustering

In pattern clustering, we use the moment invariant and ART1 as our clustering tools. The moment invariant places the same shape in one cluster even when the size or position of the shape changes. ART1 can learn by itself from the characteristics of each WBM. Under ART1, the same pattern with a different position or a different size is clustered into a different cluster.

3.2.3.1. Shape clustering by moment invariant

The moment invariant proposed by Hu (1962) is a shape recognition technique in image processing that is applied in several fields (Rizon et al., 2006; Nagarajan and Balasubramanie, 2008; Moallem et al., 2011). The invariant moment of an image is an important shape feature that is not changed by rotation or scaling. Assume that an image (e.g., a WBM) has M × N dies, the location of one die inside the image is denoted as (x, y), and its color value is denoted as f(x, y). Then, the moment $m_{pq}$ of the pth order in the x-axis direction and the qth order in the y-axis direction is:

$$
m_{pq} = \sum_{x=0}^{M} \sum_{y=0}^{N} x^p y^q f(x, y)
\tag{10}
$$

Assume that the central coordinate is $(\hat{x}, \hat{y})$; the normalized central moment $\eta_{pq}$ becomes:

$$
\eta_{pq} = \frac{\sum_{x=0}^{M} \sum_{y=0}^{N} (x - \hat{x})^p (y - \hat{y})^q f(x, y)}{\left[ \sum_{x=0}^{M} \sum_{y=0}^{N} f(x, y) \right]^{[(p+q)/2] + 1}}
\tag{11}
$$

Then, the seven invariant moments that we used to describe the shape features of an image can be defined. Furthermore, the seven invariant moments can be used as a similarity measurement to evaluate the distance between the shapes of two images. Consider two images $O_1$ and $O_2$ with the corresponding invariant moments $\phi_1^{O_1}, \ldots, \phi_7^{O_1}$ and $\phi_1^{O_2}, \ldots, \phi_7^{O_2}$. Then, the shape distance between $O_1$ and $O_2$ is defined as

$$
d_M(O_1, O_2) = \sqrt{\sum_{i=1}^{7} \left( \phi_i^{O_1} - \phi_i^{O_2} \right)^2}
$$

3.2.3.2. Pattern clustering by adaptive resonance theory (ART)

The adaptive resonance theory (ART) (Carpenter and Grossberg, 1987a; 1987b; Freeman and Skapura, 1991) has been applied in many areas including pattern recognition and spatial analysis. ART derives fundamentally from the adaptive resonant feedback

4. Pattern clustering evaluation

4.1. Data preparation

To estimate the validity, we interviewed the domain experts to identify the twelve different WBM patterns that commonly appear in historical data, as shown in Fig. 7. Using these specific patterns extracted from historical data and domain knowledge makes it easy to design the experiment systematically and to evaluate the clustering performance of the proposed approach. In the experiments, the defect yield range and the twelve patterns are decided based on real cases. Without loss of generality, we use synthetic data to simulate the defect values on the dies in place of the true values to protect the confidentiality of the case company. After identifying the twelve different WBM patterns, ten WBMs were duplicated for each of the twelve patterns with different degrees of random effects to generate 120 WBMs with true category data for estimating the validity of the proposed approach.

4.2. Criterion for evaluating the clustering result

Following Wei and Dong (2001), we used four indexes to evaluate the clustering results and estimate their validity. The indexes are defined as follows:

1. Purity: the ratio of the data that can be classified correctly.

$$
\text{Purity} = \sum_{i=1}^{C} \text{purity}(i) \times \frac{N_i}{N} = \frac{\sum_{i=1}^{C} n_i}{N}, \quad \text{and} \quad \text{purity}(i) = \frac{n_i}{N_i}
\tag{12}
$$

where C = the total number of categories after clustering, $n_i$ = the maximal number of classified maps in the evolved category i, $N_i$ = the total number of maps in the evolved category i, and N = the total number of maps.

2. Diversity: the ratio of the true categories that can be recognized.

$$
\text{Diversity} = \frac{t_c}{T_c}
\tag{13}
$$

where $t_c$ = the number of true categories covered by the evolved categories and $T_c$ = the number of true categories.

3. Specificity: the number of the involved true categories over the total number of evolved categories.

$$
\text{Specificity} = \frac{t_c}{T_e}
\tag{14}
$$

where $T_e$ = the total number of evolved categories.


Fig. 7. Twelve typical WBM patterns.

Fig. 8. The five input patterns with two true categories.

4. Efficiency: the number of true categories over the total number of evolved categories.

$$
\text{Efficiency} = \frac{t_e}{T_e}
\tag{15}
$$

where $t_e$ = the total number of classified categories.

We use the following example of five input patterns with two true categories, shown in Fig. 8, to illustrate the criteria for the clustering result. If the ART1 model clusters the input patterns into two groups, the purity is (3 + 2)/5 = 1, the diversity is 2/2 = 1, the specificity is 2/2 = 1 and the efficiency is 2/2 = 1. If the ART1 model clusters the input patterns into only one group, the purity is 3/5 = 0.6, the diversity is 1/2 = 0.5, the specificity is 1/2 = 0.5 and the efficiency is 1/2 = 0.5.

4.3. WBMs clustering experiments and results

In the experimental design, the relation between the failure rate and random noise is considered through a sensitivity analysis. We repeated each experiment three times, and four indexes, i.e., purity, diversity, specificity, and efficiency, are used to evaluate the clustering performance. We compare the proposed WBM clustering method with hierarchical clustering by the Ward method, nonhierarchical clustering by the K-means method, the self-organizing map (SOM) neural network, the spectral clustering method (Ng et al., 2002) and the algorithm proposed by Chien et al. (2002). In the parameter settings, the cluster number for the K-means and spectral clustering methods is set to twelve, and the SOM map is set to 5 × 5. In the proposed method, we employed the ART1 model as the clustering tool after the noise is eliminated. For the setting of the ART1 vigilance value, Hsu and Chien (2007) found that the lower the vigilance threshold value, the more patterns are extracted. However, a low vigilance threshold value may cause dissimilar maps to be assigned to the same cluster. Hence, this research fixed the vigilance value for ART1 at 0.75, which was derived from a number of numerical analyses of real data. Tables 3–7 show the comparison results. In particular, we find that the K-means method is stable no matter how the failure rate or the random noise changes. However, the K-means method cannot recognize the categories of type (e) and type (i).
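The four indexes of Section 4.2 can be computed from a list of true categories and a list of evolved (cluster) labels. This is a minimal sketch under stated assumptions: the function name `clustering_indexes` is hypothetical, a true category counts as covered when it is the majority category of at least one cluster, and efficiency follows the verbal definition (true categories over evolved categories), which is one reading of $t_e$ in Eq. (15). How $T_e$ is counted in the single-group case of the worked example is not fully specified in the text, so only purity and diversity are compared with the example for that case.

```python
from collections import Counter

def clustering_indexes(true_labels, cluster_labels):
    """Return purity, diversity, specificity and efficiency of a clustering."""
    members = {}
    for truth, cluster in zip(true_labels, cluster_labels):
        members.setdefault(cluster, []).append(truth)

    # Majority true category and its size n_i for every evolved category i.
    majority = [Counter(m).most_common(1)[0] for m in members.values()]

    N = len(true_labels)                          # total number of maps
    T_c = len(set(true_labels))                   # number of true categories
    T_e = len(members)                            # number of evolved categories
    t_c = len({label for label, _ in majority})   # covered true categories (assumed rule)

    return {
        "purity": sum(n_i for _, n_i in majority) / N,   # Eq. (12)
        "diversity": t_c / T_c,                          # Eq. (13)
        "specificity": t_c / T_e,                        # Eq. (14)
        "efficiency": T_c / T_e,                         # verbal definition (assumed)
    }

# Five input patterns with two true categories, as in the worked example.
truth = ["A", "A", "A", "B", "B"]
two_groups = clustering_indexes(truth, [0, 0, 0, 1, 1])
one_group = clustering_indexes(truth, [0, 0, 0, 0, 0])
```

For the two-group case all four indexes equal 1, and for the one-group case purity is 0.6 and diversity is 0.5, matching the values in the example.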

Please cite this article as: Liu, C.-W., Chien, C.-F., An intelligent system for wafer bin map defect diagnosis: An empirical study for semiconductor manufacturing. Eng. Appl. Artif. Intel. (2012), http://dx.doi.org/10.1016/j.engappai.2012.11.009i

C.-W. Liu, C.-F. Chien / Engineering Applications of Artificial Intelligence ] (]]]]) ]]]–]]]

Table 3 Clustering result on the index of purity. Fail rate

0.15

0.20

0.25

Noise

0.01

0.04

0.07

0.01

0.04

0.07

0.01

0.04

0.07

This approach Chien et al. (2002) Ward K-means SOM spectral

0.97 0.98 0.99 0.89 0.34 0.29

0.98 0.99 0.92 0.82 0.27 0.34

0.87 0.93 0.54 0.67 0.28 0.28

0.85 0.99 0.91 0.89 0.28 0.28

0.90 1.00 0.94 0.86 0.29 0.40

0.93 0.99 0.75 0.86 0.28 0.39

0.93 0.98 0.93 0.76 0.48 0.32

0.97 0.98 0.82 0.89 0.30 0.38

0.97 0.99 0.66 0.81 0.28 0.56

Table 4 Clustering result on the index of diversity. 0.20

7

groups. Efficiency will also be decreased when the clustering group increases. In Table 7, we find our method decreased the number of groups. The possible reason may be attributed to that we modified the ART1 algorithm. Also, the CNN transformation makes the pattern become more clearly rather than the evolution and degeneracy proposed by Chien et al. (2002). After the experiment, we summarized four points of our algorithm: 1. This study has good performance on high failure rate with fixed noise effect (i.e., higher purity value, diversity value and lower cluster groups.) 2. High random noise resulted in the increase of clustering group. 3. Diversity always has good performance. 4. The higher random noise, the lower efficiency.

Fail rate

0.15

0.25

Noise

0.01

0.04

0.07

0.01

0.04

0.07

0.01

0.04

0.07

4.4. Experimental summary

This approach Chien et al. (2002) Ward K-means SOM Spectral

1.00 1.00 1.00 0.89 0.75 0.67

0.92 1.00 0.83 0.83 0.92 0.67

0.83 1.00 0.83 0.89 0.92 0.67

0.92 0.96 1.00 0.89 0.92 0.58

0.97 1.00 0.92 0.86 0.92 0.83

1.00 1.00 0.83 0.86 0.92 0.75

1.00 1.00 1.00 0.78 0.83 0.83

1.00 1.00 0.83 0.89 1.00 0.75

1.00 1.00 0.75 0.81 1.00 0.67

Based on the experimental results, we verify that the proposed approach has practical viability. The results showed that, after modifying the ART1 and adding the shape condition of each pattern, we could reduce the clustering groups. However, more empirical studies are necessary to support this claim. In particular, we also find the specific characteristics among the 12 true WBM patterns when applying the proposed ART1:

1. Types (a) and (b) are usually mixed in the same clusters.
2. Type (j) is hard to distinguish from the other patterns in our model.
3. Types (d), (g), (k) and (9) are the easiest patterns for the ART1 model to cluster.

A possible reason for the second point is that the pattern region of type (j) is too large to be discerned clearly from the other patterns under the same conditions. Thus, the ART1 model could not easily separate type (j) from the others. Furthermore, we find that when the vigilance value is set high, the number of created clusters increases, the specificity decreases, and the purity of each group increases. Therefore, to keep the purity of each cluster high while decreasing the total number of created clusters, one possible approach is to extract the common pattern from each cluster and then cluster the common patterns again to combine them into a smaller number of clusters.

Table 5 Clustering result on the index of specificity.

Fail rate  Noise  This approach  Chien et al. (2002)  Ward
0.15       0.01   0.29           0.10                 0.60
0.15       0.04   0.16           0.10                 0.50
0.15       0.07   0.13           0.10                 0.50
0.20       0.01   0.86           0.33                 0.52
0.20       0.04   0.75           0.11                 0.46
0.20       0.07   0.41           0.10                 0.43
0.25       0.01   1.00           0.74                 0.34
0.25       0.04   0.86           0.39                 0.38
0.25       0.07   0.48           0.17                 0.36
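The vigilance trade-off described in Section 4.4, and the suggested second clustering pass over the common patterns, can be illustrated with a small sketch. It assumes a textbook fast-learning ART1 on binary vectors, not the modified ART1 of this study, and it takes the ART1 template (the bitwise intersection of a cluster's members) as the "common pattern". The function name `art1_cluster`, the parameter `beta`, and the toy vectors are hypothetical and serve only as an illustration.

```python
def art1_cluster(patterns, vigilance, beta=1.0):
    """Textbook fast-learning ART1 pass over binary vectors (illustrative only).

    Returns (labels, prototypes), where each prototype is the bitwise AND
    of the inputs assigned to that cluster.
    """
    prototypes = []
    labels = []
    for x in patterns:
        size_x = sum(x)

        def overlap(j):
            # Number of failed dies shared by the input and template j.
            return sum(a & b for a, b in zip(x, prototypes[j]))

        # Rank templates by the choice function |x AND w| / (beta + |w|).
        order = sorted(
            range(len(prototypes)),
            key=lambda j: overlap(j) / (beta + sum(prototypes[j])),
            reverse=True,
        )
        chosen = None
        for j in order:
            # Vigilance test: share of the input covered by the template.
            if size_x > 0 and overlap(j) / size_x >= vigilance:
                chosen = j
                break
        if chosen is None:
            # No template passed the test (an all-zero input never passes,
            # so it always opens its own cluster): create a new cluster.
            prototypes.append(list(x))
            chosen = len(prototypes) - 1
        else:
            # Fast learning: shrink the template to the intersection.
            prototypes[chosen] = [a & b for a, b in zip(x, prototypes[chosen])]
        labels.append(chosen)
    return labels, prototypes


# Hypothetical toy maps: two underlying patterns, each with one noisy copy.
wbms = [
    [1, 1, 1, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1, 1, 0],
    [0, 0, 0, 0, 1, 1, 1, 1],
]

low_labels, _ = art1_cluster(wbms, vigilance=0.6)
high_labels, high_protos = art1_cluster(wbms, vigilance=0.9)

# Second pass: cluster the templates themselves with a lower vigilance,
# then map every map to the merged group of its first-pass cluster.
merge, _ = art1_cluster(high_protos, vigilance=0.6)
merged_labels = [merge[k] for k in high_labels]

print(len(set(low_labels)), len(set(high_labels)), merged_labels)
```

On this toy data the low-vigilance run yields two clusters, the high-vigilance run yields four, and the second pass merges the four templates back into two groups. How well such a merge preserves purity on real WBMs would depend on the vigilance chosen for the second pass.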

Table 6 Clustering result on the index of efficiency.

Fail rate  Noise  This approach  Chien et al. (2002)  Ward  K-means  SOM   Spectral
0.15       0.01   0.27           0.10                 0.60  0.55     0.36  0.75
0.15       0.04   0.14           0.10                 0.42  0.50     0.44  0.92
0.15       0.07   0.03           0.10                 0.42  0.50     0.44  0.58
0.20       0.01   0.76           0.33                 0.52  0.53     0.44  0.58
0.20       0.04   0.63           0.11                 0.42  0.53     0.44  0.83
0.20       0.07   0.34           0.10                 0.36  0.50     0.44  0.50
0.25       0.01   0.97           0.74                 0.34  0.45     0.40  0.67
0.25       0.04   0.84           0.39                 0.31  0.53     0.48  0.58
0.25       0.07   0.48           0.17                 0.30  0.48     0.48  0.50

Table 7 Clustering result on groups.

Fail rate  Noise  This approach  Chien et al. (2002)
0.15       0.01   42             118
0.15       0.04   75             118
0.15       0.07   91             108
0.20       0.01   14             79
0.20       0.04   16             88
0.20       0.07   29             99
0.25       0.01   12             22
0.25       0.04   14             31
0.25       0.07   25             73

type (i). The Ward method is sensitive to random noise in terms of purity. The method proposed by Chien et al. (2002) had high purity and diversity but low efficiency. Finally, spectral clustering and SOM did not perform well on the WBM cases. As shown in Tables 3–7, regardless of the failure rate, serious random noise led ART1 to produce more clustering

5. Conclusion

Because WBMs are key evidence for finding the root causes of low CP yield, a systematic WBM analysis method is necessary. This research not only provides a computational analysis framework for WBMs, but also constructs a pattern diagnosis system with a friendly user interface that helps engineers detect root causes and records the know-how so that root causes can be identified effectively. The system significantly reduces the time needed for WBM pattern recognition and has thus been implemented in the case company. Furthermore, the developed system can accumulate root cause diagnosis experience and troubleshooting knowledge. The proposed WBM clustering method integrates several approaches to enhance the capability of WBM clustering. In this research, we focused on mixed-type patterns on WBMs. Rather than decomposing the patterns into individual components or extracting pattern features, this research used the CNN method to eliminate defect noise and enhance the patterns.


Then, MI and ART1 are employed to analyze the patterns. In particular, this study used twelve different patterns based on domain knowledge and historical defect data to validate the clustering performance. Four performance indexes (purity, diversity, specificity, and efficiency) were used for evaluation. Finally, two experiments, a sensitivity analysis on noise and pattern fail rate and a comparison with other clustering methods, were conducted to characterize the model and its performance. The results show that the proposed method achieves high clustering purity with fewer groups than existing methods, and that its diversity does not differ significantly from that of the other three methods. However, its specificity is less robust than that of the K-means and Ward methods. From an engineering perspective, purity and diversity are the main concerns for pattern detection efficiency. Future studies can address the specificity issue of the developed solution.
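To make the role of MI concrete, the following minimal sketch computes the first two of Hu's (1962) moment invariants for a binary map and checks that they are unchanged when the map is rotated by 90 degrees, which is the property that helps with wafer rotation. It is an illustration only: this section does not state which invariants the proposed system uses, and the names `hu_first_two` and `rotate90` and the toy grids are hypothetical.

```python
import math


def hu_first_two(grid):
    """Return (phi1, phi2), the first two Hu moment invariants of a binary grid."""
    cells = [(r, c) for r, row in enumerate(grid) for c, v in enumerate(row) if v]
    m00 = len(cells)
    if m00 == 0:
        raise ValueError("grid contains no failed dies")
    r_bar = sum(r for r, _ in cells) / m00
    c_bar = sum(c for _, c in cells) / m00

    def eta(p, q):
        # Normalized central moment: mu_pq / m00 ** (1 + (p + q) / 2).
        mu = sum((r - r_bar) ** p * (c - c_bar) ** q for r, c in cells)
        return mu / m00 ** (1 + (p + q) / 2)

    phi1 = eta(2, 0) + eta(0, 2)
    phi2 = (eta(2, 0) - eta(0, 2)) ** 2 + 4 * eta(1, 1) ** 2
    return phi1, phi2


def rotate90(grid):
    """Rotate a grid by 90 degrees clockwise."""
    return [list(row) for row in zip(*grid[::-1])]


# Hypothetical toy maps (1 = failed die).
line = [[1, 1, 1, 1]]
block = [[1, 1], [1, 1]]
edge = [
    [1, 1, 1, 1],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]

same_after_rotation = all(
    math.isclose(a, b, rel_tol=1e-9, abs_tol=1e-12)
    for a, b in zip(hu_first_two(edge), hu_first_two(rotate90(edge)))
)
print(hu_first_two(line), hu_first_two(block), same_after_rotation)
```

The elongated `line` pattern gives a larger first invariant than the compact `block`, while the rotated `edge` pattern gives the same pair of values as the original, so a clustering step fed with such features would not split one pattern into several clusters only because of orientation.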

Acknowledgements

This research is partially supported by National Science Council (NSC100-2628-E-007-017-MY3; NSC101-2811-E-007-009), the Advanced Manufacturing and Service Management Research Center of National Tsing Hua University (101N2073E1), and Macronix International, Ltd. in Taiwan.

References

Agresti, A., 1990. Categorical Data Analysis. John Wiley, New York.
Chen, N., Zhu, D.D., Wang, W., 2000. Intelligent materials processing by hyperspace data mining. Eng. Appl. Artif. Intell. 13 (5), 527–532.
Chien, C., Lee, P., Peng, C., 2003. Semiconductor manufacturing data mining for clustering and feature extraction. J. Inf. Manage. 10 (1), 63–84.
Chien, C.F., Lin, T., Liu, Q., Peng, C., Hsu, S., Huang, C., 2002. Developing a data mining method for wafer bin map clustering and an empirical study in a semiconductor manufacturing fab. J. Chin. Inst. Ind. Eng. 19 (2), 23–38.
Chien, C.F., Chen, L., 2007. Using rough set theory to recruit and retain high-potential talents for semiconductor manufacturing. IEEE Trans. Semicond. Manuf. 20 (4), 528–541.
Chien, C.F., Chen, Y., Peng, J., 2010. Manufacturing intelligence for semiconductor demand forecast based on technology diffusion and product life cycle. Int. J. Prod. Econ. 128 (2), 496–509.
Chien, C.F., Dauzere-Peres, S., Ehm, H., Fowler, J.W., Jiang, Z., Krishnaswamy, S., Lee, T.E., Moench, L., Uzsoy, R., 2011. Modeling and analysis of semiconductor manufacturing in a shrinking world: challenges and successes. Eur. J. Ind. Eng. 5 (3), 254–271.
Chien, C.F., Wang, W., Cheng, J., 2007. Data mining for yield enhancement in semiconductor manufacturing and an empirical study. Expert Syst. Appl. 33 (1), 192–198.
Chua, L.O., Yang, L., 1988a. Cellular neural networks: theory. IEEE Trans. Circuits Syst. 35, 1257–1271.
Chua, L.O., Yang, L., 1988b. Cellular neural networks: applications. IEEE Trans. Circuits Syst. 35, 1272–1290.
Cunningham, S.P., Spanos, C.J., Voros, K., 1995. Semiconductor yield improvement: results and best practices. IEEE Trans. Semicond. Manuf. 8 (2), 103–109.
Freeman, J.A., Skapura, D.M., 1991. Neural Networks: Algorithms, Applications, and Programming Techniques. Addison-Wesley, Wokingham, England.
Friedman, D.J., Hansen, M.H., Nair, V.N., James, D.A., 1997. Model-free estimation of defect clustering in integrated circuit fabrication. IEEE Trans. Semicond. Manuf. 10 (3), 344–359.
Carpenter, G.A., Grossberg, S., 1987a. The ART of adaptive pattern recognition by a self-organizing neural network. Computer 21 (3), 77–88.
Carpenter, G.A., Grossberg, S., 1987b. ART2: self-organization of stable category recognition codes for analog input patterns. Appl. Opt. 26 (23), 4919–4930.
Carpenter, G.A., Grossberg, S., 1990. ART3: hierarchical search using chemical transmitters in self-organizing pattern recognition architectures. Neural Networks 3, 129–152.
Grossberg, S., Markuzon, N., Reynolds, J.H., Rosen, D.B., 1992. Fuzzy ARTMAP: neural network architecture for incremental supervised learning of analog multidimensional maps. IEEE Trans. Neural Networks 3 (5), 698–713.
Hsu, S.-C., Chien, C.F., 2007. Hybrid data mining approach for pattern extraction from wafer bin map to improve yield in semiconductor manufacturing. Int. J. Prod. Econ. 107, 88–103.
Hu, M.K., 1962. Visual pattern recognition by moment invariants. IRE Trans. Inf. Theory 8, 179–187.
Kumar, N., Kennedy, K., Gildersleeve, K., Abelson, R., Mastrangelo, C.M., Montgomery, D.C., 2006. A review of yield modeling techniques for semiconductor manufacturing. Int. J. Prod. Res. 44 (23), 5019–5036.
Langford, R.E., Liou, J.J., Raghavan, V., 2001. The application and validation of a new robust windowing method for the Poisson yield model. In: Advanced Semiconductor Manufacturing Conference, IEEE/SEMI, 23–24 April 2001, Munich, Germany. IEEE, Piscataway, NJ, pp. 157–160.
Moallem, P., Mousavi, B.S., Monadjemi, S.A., 2011. A novel fuzzy rule base system for pose independent faces detection. Appl. Soft Comput. 11 (2), 1801–1810.
Nagarajan, B., Balasubramanie, P., 2008. Neural classifier system for object classification with cluttered background using invariant moment features. Int. J. Soft Comput. 3 (4), 302–307.
Ng, A., Jordan, M., Weiss, Y., 2002. On spectral clustering: analysis and an algorithm. In: Advances in Neural Information Processing Systems, 14. MIT Press, pp. 849–856.
Ooi, M.P.-L., Sok, H.K., Kuang, Y.C., Demidenko, S., Chan, C., 2012. Defect cluster recognition system for fabricated semiconductor wafers. Engineering Applications of Artificial Intelligence, in press, http://dx.doi.org/10.1016/j.engappai.2012.03.016.
Palma, F.D., Nicolao, G.D., Miraglia, G., Pasquinetti, E., Piccinini, F., 2005. Unsupervised spatial pattern classification of electrical-wafer-sorting maps in semiconductor manufacturing. Pattern Recognit. Lett. 26 (12), 1857–1865.
Rizon, M., Yazid, H., Saad, P., Md Shakaff, A.Y., Sadd, A.R., Mamat, M.R., Yaacob, S., Desa, H., Karthigayan, M., 2006. Object detection using geometric invariant moment. Am. J. Appl. Sci. 3 (6), 1876–1878.
Stapper, C.H., 2000. LSI yield modeling and process monitoring. IBM J. Res. Dev. 44 (2), 112–118.
Taam, W., Hamada, M., 1993. Detecting spatial effects from factorial experiments: an application from integrated-circuit manufacturing. Technometrics 35 (2), 149–160.
Wang, C.H., 2009. Outlier identification and market segmentation using kernel-based clustering techniques. Expert Syst. Appl. 36 (2), 3744–3750.
Wang, C.H., 2008. Recognition of semiconductor defect patterns using spatial filtering and spectral clustering. Expert Syst. Appl. 34 (3), 1914–1923.
Wang, C.H., Wang, S.J., Lee, W.D., 2006. Automatic identification of spatial defect patterns for semiconductor manufacturing. Int. J. Prod. Res. 44 (23), 5169–5185.
Wei, C.P., Dong, Y.X., 2001. A mining-based category evolution approach to managing online document categories. In: Proceedings of the 34th Annual Hawaii International Conference on System Sciences.
Wu, L., Zhang, J., 2010. Fuzzy neural network based yield prediction model for semiconductor manufacturing system. Int. J. Prod. Res. 48 (11), 3225–3243.

Please cite this article as: Liu, C.-W., Chien, C.-F., An intelligent system for wafer bin map defect diagnosis: An empirical study for semiconductor manufacturing. Eng. Appl. Artif. Intel. (2012), http://dx.doi.org/10.1016/j.engappai.2012.11.009