Big data analytics for cycle time related feature selection in the semiconductor wafer fabrication system


Journal Pre-proofs

Big data analytics for cycle time related feature selection in the semiconductor wafer fabrication system

Junliang Wang, Peng Zheng, Jie Zhang

PII: S0360-8352(20)30096-6
DOI: https://doi.org/10.1016/j.cie.2020.106362
Reference: CAIE 106362

To appear in: Computers & Industrial Engineering

Received Date: 19 August 2019
Revised Date: 31 December 2019
Accepted Date: 11 February 2020

Please cite this article as: Wang, J., Zheng, P., Zhang, J., Big data analytics for cycle time related feature selection in the semiconductor wafer fabrication system, Computers & Industrial Engineering (2020), doi: https://doi.org/10.1016/j.cie.2020.106362

This is a PDF file of an article that has undergone enhancements after acceptance, such as the addition of a cover page and metadata, and formatting for readability, but it is not yet the definitive version of record. This version will undergo additional copyediting, typesetting and review before it is published in its final form, but we are providing this version to give early visibility of the article. Please note that, during the production process, errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

© 2020 Published by Elsevier Ltd.

Big data analytics for cycle time related feature selection in the semiconductor wafer fabrication system

Junliang WANG¹
College of Mechanical Engineering, Donghua University, Shanghai 201620, China
Junliang Wang is a lecturer with College of Mechanical Engineering, Donghua University, Shanghai, China. He received the B.S. degree in scheduling of manufacturing systems from Wuhan University of Technology, Wuhan, China, in 2009 and the Ph.D. degree in big data driven operation for industrial engineering at School of Mechanical Engineering, Shanghai Jiao Tong University, Shanghai, China, in 2018. His current research focuses on big data analytics and operation in complex manufacturing systems.

Peng ZHENG²
School of Mechanical Engineering, Shanghai Jiao Tong University, Shanghai 200040, China
Peng Zheng is currently a PhD candidate at School of Mechanical Engineering, Shanghai Jiao Tong University, China. He received his BS degree from Wuhan University of Science and Technology, China, in 2016. His current research interest is big data driven production scheduling.

Jie ZHANG¹,#
College of Mechanical Engineering, Donghua University, Shanghai 201620, China
# Corresponding Author / E-mail: [email protected]
Jie Zhang is the dean of College of Mechanical Engineering at Donghua University in China. She got her PhD from Nanjing University of Aeronautics and Astronautics, China, in 1997. Before joining Donghua University, Prof. Zhang worked at the Institute of Intelligent Manufacturing and Information Engineering in Shanghai Jiao Tong University, China. Her research interests include industrial big data analysis, intelligent production scheduling, production control in intelligent manufacturing system, and intelligent quality analysis.

Acknowledgement
This work was supported by the Program of the National Natural Science Foundation of China under Grant No. 51905091, the Fundamental Research Funds for the Central Universities under Grant No. 2232019D3-34, and the Shanghai Sailing Program under Grant No. 19YF1401500.

Big data analytics for cycle time related feature selection in the semiconductor wafer fabrication system

Abstract: The dynamic wafer workshop gives rise to strong fluctuations of cycle time (CT), followed by earliness or tardiness penalties. Comprehending these fluctuations is a major challenge, as the cycle time interacts intricately with a massive number of parameters. This paper proposes a big data analytics method for feature selection that obtains the explanatory factors of CT and sheds light on its fluctuation. Firstly, correlation analysis is performed between every pair of candidate factors with the mutual information metric to construct the observed network. Secondly, network deconvolution is used to infer the direct dependence between each candidate factor and the CT by removing the effects of transitive relationships from the network. Additionally, a factor selection algorithm is designed to reduce the dimension of the candidate factors and form the CT explanatory network, which contains the key factors interacting with the CT of wafer lots. The experimental results demonstrate that the proposed method outperforms the reference methods in terms of classification and prediction accuracy. A case study is conducted with real data from a semiconductor wafer fabrication system: the big data captured for feature selection is analyzed, and the identified key factors prove effective in forecasting and interpreting the fluctuation of CT.

Keywords: big data analytics, cycle time, complex network, feature selection

1 Introduction

Wafer fabrication is the most complex phase of semiconductor manufacturing, characterized by a large production scale, a diversity of machine types (Qin and Zhang et al., 2013), complex process routes (Lu and Ramaswamy et al., 1994) and the highly dynamic environment of the semiconductor wafer fabrication system (SWFS) (Eling and Langerak et al., 2013; Uzsoy and Fowler et al., 2018). The cycle time (CT) of the wafer product is a key performance indicator for maintaining high on-time delivery (Çerekçi and Banerjee, 2015; Ekici and Özener et al., 2019). CT fluctuates greatly with the dynamic real-time material flow in the SWFS (Xie and Chien et al., 2013). In a typical 300 mm SWFS, the cycle time of a wafer fluctuates from 8,000 to more than 160,000 minutes. Revealing the CT fluctuation (RCTF) can provide important knowledge for production control, which is a critical issue in the operation of SWFSs.

To reveal the cycle time fluctuation, existing studies can be classified into two types: mechanism driven and data driven methods. The former detect the root cause of cycle time fluctuation by developing theoretical models (e.g. queuing model (Schelasin, 2011), plant simulation model (Hsieh and Chang et al., 2013), and Markov chain (Akhavan-Tabatabaei and Fathi et al., 2012)). Given the complicated and unpredictable material flow in the SWFS, it is hard to construct a sufficiently detailed model of the whole system. Fortunately, with the deployment of the industrial internet (Wang and Xu et al., 2020), a wealth of data is available for optimization purposes in the SWFS, and this data has the characteristics of big data: variety, volume, velocity, multi-source, multi-noise, and multi-scale. The emergence of massive data provides new opportunities for diagnosing the root cause of cycle time fluctuation with data driven methods.
Big data analytics recognizes explanatory key factors by correlation analysis without constructing a system model, which makes it well suited to understanding the fluctuation of cycle time in SWFSs (Zhong and Newman et al., 2016; Wang and Zhang et al., 2018; Gao and Gao et al., 2019; Gao and Gao et al., 2020). In recent years, extensive studies have been conducted on big data collection, integration and analysis (Zhong and Xu et al., 2015; Wang and Zheng et al., 2019; Wen and Li et al., 2019). For example, the correlation between two factors can be estimated by mutual information or the Pearson correlation coefficient (Wang and Zhang et al., 2018). However, the correlation between a candidate factor and cycle time evaluated by current big data analytics differs from the true situation, since the observed correlation contains not only direct dependence but also indirect dependence caused by transitive effects of correlations (Feizi and Marbach et al., 2013). As a result, the analyzed correlation may be inflated and mislead the key factor identification. To select cycle time related features accurately, the extraction of direct correlation should be further addressed in big data analytics.

Network deconvolution is a technique for extracting direct correlation from the observed network, where each node corresponds to a candidate factor and each edge is weighted by the correlation between two factors. It assumes that indirect flow weights can be approximated as the product of direct edge weights through transitive information flow, and that observed edge weights are the sum of direct and indirect flows. With the network deconvolution process, the indirect effect can be removed by reversing the network transitive process to recover the true direct network.

To reveal the direct correlation from the correlation observed by data correlation analysis, this paper proposes a complex network-based approach that extracts direct dependence through network deconvolution to identify the key explanatory factors for cycle time fluctuation. The observed network is first constructed by big data correlation analysis, where each node corresponds to a candidate factor and each edge is valued by the correlation between two factors.
In the correlation network, the indirect effects are expressed as the sum of direct correlations transmitted through the network over different numbers of steps. During the network deconvolution, the indirect effects are removed from the correlation network step by step to obtain the direct correlation. To demonstrate the effectiveness of this method, a case study with real industrial data from a SWFS in Shanghai is conducted to select 153 explanatory factors from more than one thousand candidate factors. Further analysis indicates that the removal of indirect correlation helps to identify the real key explanatory factors and is effective in the practice of RCTF.

The rest of this paper is structured as follows: first, the related studies about the reveal of cycle time fluctuation and key factor identification are detailed. Then, a complex network-based method is proposed for identifying explanatory variables to understand the fluctuation of CT. Subsequently, the numerical experiments and case study are conducted and the results are reported and discussed. Finally, conclusions and future work are outlined in Section 6.

2 Related Works

2.1 Literature about the reveal of cycle time fluctuation

In recent years, several approaches have been proposed for the RCTF of wafer lots (a wafer lot is a job unit in the wafer fab, which consists of 25 wafers with the same technical process). They typically fall into three classes: physical model analysing, statistical forecasting, and key factor identifying.

The physical model analysing methods rely on the refinement of causal models with physical considerations. With causal models (e.g. Petri nets, simulation models (Wein, 1988; Yang and Ankenman et al., 2008; Hsieh and Chang et al., 2013) and queuing models (Chung and Huang, 2002; Schelasin, 2011)), these methods approximate the influence of different factors on the CT of wafer lots. However, constructing high-fidelity models is a daunting task in the large-scale and complicated SWFS, which limits the application of these methods to the RCTF.

The statistical forecasting methods predict the CT to foresee its fluctuation through statistical models, such as back-propagation networks and deep learning models (Wang and Zhang et al., 2018). These methods forecast the CT of wafer lots with well-designed models and treat the mechanisms of CT fluctuation as a black box (Chien and Wang et al., 2007). As a result, they can forecast the CT fluctuation but cannot identify its cause.

The key factor identifying methods formulate the RCTF problem as a factor selection problem. They focus on identifying the key influential parameters and fitting the relationship between the influential parameters and CT. Kuo and his colleagues (Kuo and Chien et al., 2011) proposed a neural network model that exploits production data and tool data to predict the WIP levels of the tool sets for cycle time reduction. Wang and Zhang (Wang and Zhang, 2016) designed a mutual information model to measure the relationship between the influential factors and the CT of wafer lots. The key factor identifying methods enable big data analytics to analyze the correlation between factors with high time efficiency and accuracy, which is appropriate for the RCTF.

2.2 Literature about the key factor identification

To determine the key factors, the correlation analysis between candidate factors and wafer lots' CT is a critical task. Several approaches have been proposed to evaluate dependence among variables. For example, the regression-based methods (such as linear regression (Sha and Storch et al., 2007), logistic regression (Cheng and Varshney et al., 2006; Ginsberg and Mohebbi et al., 2008), and Gauss-Newton regression (Chien and Hsu et al., 2012)) have been extensively applied to characterize relationships among variables with a pre-selected type of regression function. In SWFSs, it is difficult to predetermine the type of regression function that fits the correlation between a factor and wafer lots' CT, since the correlation is usually too complicated to describe with a mathematical formula. Mutual information (MI) (Rossi and Lendasse et al., 2006; Vergara and Estévez, 2014) is a non-parametric method to measure the correlation between two variables based on the theory of entropy.
Based on the MI, several improved methods (Cheng and Qin et al., 2011) have been successfully applied to measure complicated and variable correlation in factor selection problems. However, the interactive relationship between factors and the CT of wafer lots is intricate (Qin and Zhang et al., 2013). For example, the utilisation and waiting queue length of a machine not only correlate with the CT of wafer lots but also interact with each other. The correlation between factors can be passed to the CT through second-order and third-order interactions, resulting in diffusion of the correlation. An independent factor could be mistaken for an explanatory variable of CT due to transitive effects of correlations (Husmeier, 2003). Moreover, even if a true relationship exists between a factor and CT, its strength may be overestimated owing to additional indirect relationships, and distinguishing the convoluted direct and indirect contributions is a daunting task (Cheng and Hsieh et al., 2017).

Several methods (Zoppoli and Morganella et al., 2010) have been proposed to infer direct dependence through improved correlation measure criteria (e.g. DISR and mRMR (Brown and Pocock et al., 2012)). These methods are designed to separate direct from indirect dependence in small-scale problems or for low-order interaction terms, which limits their applicability. Complex network deconvolution is very promising for extracting direct correlations in large-scale problems (Chen and Mundra et al., 2014), since it can remove the combined effects of arbitrarily many indirect paths (Feizi and Marbach et al., 2013). Thus, this paper investigates a correlated complex network with network deconvolution to infer direct dependence by removing the effects of transitive relationships that result from indirect effects.

3 Complex network for cycle time related feature selection

Aiming to reveal the fluctuation of cycle time, this section proposes a complex network-based method to identify the key factors. It consists of three parts: construction of the observed network, direct dependence extraction through network deconvolution, and factor selection for the CT explanatory network (shown in figure 1). First, an observed network is constructed by measuring the correlation between factors through data analysis; it contains both direct dependence and indirect dependence. Then, the direct network is extracted from the observed network through network deconvolution in three steps: linear scaling, matrix decomposition, and network deconvolution. Based on the direct dependence contained in the direct network, the explanatory factors are identified to form the correlation network that reveals the fluctuation of CT.

Figure 1. The proposed approach to decipher the fluctuation of cycle time

3.1 Construction of the observed network

Aiming to understand the CT fluctuation of wafer lots, an observed network is constructed to model the correlation between the candidate factors and the CT of wafer lots. The observed network is defined as a set of node-edge pairs $N = \{p_{11}, \ldots, p_{1n}, \ldots, p_{nn}\}$, where $p_{ij} = \{i, j, e_{ij}\}$ is a node-edge pair and $e_{ij}$ is the edge between node $i$ and node $j$. In this network, the nodes in each node-edge pair correspond to specific variables in the CT analysis. There are two kinds of nodes in the observed network. The first kind represents the candidate factors, which may affect the CT of wafer lots. The second kind is the CT of a wafer lot, which is the center node of the observed network.

This section first defines the candidate factors that may be closely correlated with the CT of wafer lots. Based on the previous study (Wang and Zhang, 2016), we determine more than one thousand factors to build the candidate factor set, which can be divided into two kinds: workshop status parameters and product characteristics. These factors include the processing time of each process, the priority of a wafer lot, the waiting queue length and utilization of each machine, and the total number of work in process in the manufacturing system.

The complex network $N$ is a weighted graph, where each edge is weighted by the correlation between two nodes. Corresponding to the two kinds of nodes, there are two kinds of correlations: $I(f_i; CT)$ and $I(f_i; f_j)$. $I(f_i; CT)$ is the correlation between the factor $f_i$ and CT, which indicates the amount of information that candidate factor $f_i$ carries about the CT. $I(f_i; f_j)$ is the correlation between two candidate factors. In this section, both kinds of correlations are estimated by the mutual information between two variables, $I(x; y)$, which is formulated as follows.

$$I(x; y) = H(x) - H(x|y) \tag{1}$$

where $H(x)$ is the information entropy of the candidate factor $x$, which measures the uncertainty of $x$. The entropy of a discrete random variable $x$ is defined as:

$$H(x) = -E[\log_2 p(x_i)] = -\sum_{i=1}^{n} p(x_i) \log_2 p(x_i) \tag{2}$$

where $x = \{x_1, x_2, \ldots, x_n\}$ and $p(x_i)$ is the probability of $x = x_i$ (Cover and Thomas, 1991). The $H(x|y)$ in equation (1) is the conditional entropy, which measures the remaining uncertainty of the candidate factor $f_i$ when the CT is known. The conditional entropy is calculated as follows.

$$H(x|y) = H(\{x, y\}) - H(y) \tag{3}$$

where $H(\{x, y\})$ is the joint entropy, which measures the total uncertainty contained in $x$ and $y$ together. The joint entropy of two random variables $x$ and $y$ is formulated in equation (4).

$$H(\{x, y\}) = -\sum_{i=1}^{n} \sum_{j=1}^{n} p(x_i, y_j) \log_2 p(x_i, y_j) \tag{4}$$

where $x = \{x_1, x_2, \ldots, x_n\}$, $y = \{y_1, y_2, \ldots, y_n\}$, and $p(x_i, y_j)$ is the joint probability that $x = x_i$ and $y = y_j$ (Cover and Thomas, 1991).

After the correlations between every two nodes are calculated, the network is constructed as follows:
1) Evaluate the correlations $I(f_i; CT)$ and $I(f_i; f_j)$ by the mutual information, and save the correlations in a list.
2) Take one unit from the list. If no node yet exists for a candidate factor or for the cycle time of the wafer lot in the unit, add the corresponding node.
3) Add the edge corresponding to the correlation between the two nodes in the unit.
4) Repeat steps 2) and 3) until all units in the list have been visited.

The adjacency matrix of the initially constructed observed network consisting of $n$ nodes is defined as follows:

$$N_o^{in} = \begin{bmatrix} 0 & o_{12} & \cdots & o_{1n} \\ o_{21} & 0 & \cdots & o_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ o_{n1} & o_{n2} & \cdots & 0 \end{bmatrix} \tag{5}$$

where $o_{ij}$ is the correlation between node $i$ and node $j$. Usually, node $n$ refers to the CT of wafer lots, and the diagonal elements of $N_o^{in}$ are set to 0. The $o_{ij}$ is formulated as follows:

$$o_{ij} = \begin{cases} I(f_i; CT) & i \in \{1, 2, \ldots, n-1\},\ j = n \\ I(f_j; CT) & j \in \{1, 2, \ldots, n-1\},\ i = n \\ I(f_i; f_j) & i \neq j \in \{1, 2, \ldots, n-1\} \\ 0 & i = j \in \{1, 2, \ldots, n\} \end{cases} \tag{6}$$
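The construction of the observed adjacency matrix from equations (1) to (6) can be sketched as follows. This is a minimal illustration rather than the authors' implementation: it assumes the factors have already been discretized (the paper does not state a binning scheme here), estimates probabilities by relative frequencies, and uses hypothetical function names and toy data.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy in bits of a discrete sample, as in equation (2)."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def mutual_information(x, y):
    """I(x;y) = H(x) - H(x|y), with H(x|y) = H({x,y}) - H(y) (equations 1, 3, 4)."""
    joint = entropy(list(zip(x, y)))          # joint entropy H({x,y})
    conditional = joint - entropy(y)          # conditional entropy H(x|y)
    return entropy(x) - conditional


def observed_adjacency(columns):
    """Symmetric matrix of equation (5); by convention the last column is the CT."""
    n = len(columns)
    matrix = [[0.0] * n for _ in range(n)]    # diagonal stays 0, as in equation (6)
    for i in range(n):
        for j in range(i + 1, n):
            matrix[i][j] = matrix[j][i] = mutual_information(columns[i], columns[j])
    return matrix


# Hypothetical, already-discretized toy data: two factors and the CT.
f1 = [0, 0, 1, 1, 0, 1, 0, 1]
f2 = [0, 1, 0, 1, 0, 1, 0, 1]
ct = [0, 0, 1, 1, 0, 1, 0, 1]                 # in this toy case the CT copies f1
N_in = observed_adjacency([f1, f2, ct])
```

In this toy case the entry for f1 and the CT carries the full entropy of the CT, while the entry for f2 is much weaker. Real factors such as queue lengths are continuous and would need a discretization or density estimation step first.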

3.2 Direct dependence extraction through network deconvolution

The estimated correlation in the observed network contains not only direct dependence but also indirect dependence. With indirect dependence, the fluctuation of CT may be misunderstood, since the correlation between factors is overestimated. The indirect effects can be expressed as the sum of direct correlations transmitted through the network over different numbers of steps. Take a typical five-node network as an example, which is shown in figure 2 (a-e). The observed correlation between factor 1 and CT consists of the direct correlation, the 2-order indirect correlation (1-2-CT), the 3-order indirect correlations (1-2-3-CT, 1-2-4-CT), the 4-order indirect correlation (1-2-3-4-CT), etc.

To obtain the direct dependence between the candidate factors and the CT from the observed network, a network deconvolution method is designed to remove the effects of transitive relationships. The complex network with only direct correlation is called the direct network. The adjacency matrix of the final direct network with only direct dependence is $N_d$, which is defined as follows.

$$N_d = \begin{bmatrix} 0 & d_{12} & \cdots & d_{1n} \\ d_{21} & 0 & \cdots & d_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ d_{n1} & d_{n2} & \cdots & 0 \end{bmatrix} \tag{7}$$

where $d_{ij}$ is the direct correlation between node $i$ and node $j$.

In the network deconvolution, the $n$-order indirect correlation is modeled by $N_d^n$. Take figure 2-f for example: the 2-order indirect correlation of the five-node network (the network in figure 2-a) is evaluated by $N_d^2$. Specifically, the 2-order indirect correlation (1-2-CT) is calculated as

$$x_{51} = \begin{bmatrix} d_{51} & d_{52} & d_{53} & d_{54} & 0 \end{bmatrix} \begin{bmatrix} 0 & d_{21} & 0 & 0 & d_{51} \end{bmatrix}^{T} = d_{52} d_{21}$$

in the network deconvolution, since the second factor connects to nodes 1 and 5 directly. So, the indirect effects along paths with an increasing number of edges can be written as the sum of an infinite series of transitive closure, as follows.

$$N_o = N_d + N_d^2 + N_d^3 + \cdots + N_d^m \tag{8}$$

In equation (8), $m \to \infty$, which means the direct dependence can be transmitted endlessly from node to node in the complex network. The infinite series converges when every eigenvalue of $N_d$ satisfies $|\lambda_i^d| < 1$.

Figure 2. The transitive indirect dependence in a five-node network

By using the infinite Taylor series, we can simplify equation (8) as follows.

$$N_o = N_d (I + N_d + N_d^2 + \cdots) = N_d (I - N_d)^{-1} \tag{9}$$

where $I$ is the identity matrix. Through a simple inverse transformation, we can express the direct network as follows.

$$N_d = N_o (I + N_o)^{-1} \tag{10}$$
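The inverse transformation from equation (9) to equation (10) can be spelled out as a short worked derivation. It assumes that $I - N_d$ and $I + N_o$ are invertible, which holds under the convergence condition $|\lambda_i^d| < 1$.

```latex
\begin{aligned}
N_o &= N_d (I - N_d)^{-1} \\
N_o (I - N_d) &= N_d \\
N_o &= N_d + N_o N_d = (I + N_o) N_d \\
N_d &= (I + N_o)^{-1} N_o = N_o (I + N_o)^{-1}
\end{aligned}
```

The last equality uses the fact that $N_o$ commutes with $(I + N_o)^{-1}$, since both are functions of the same matrix.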

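To enhance the computing efficiency, we introduce diagonalization to reduce the time complexity of the network deconvolution. The adjacency matrices of the observed network $N_o$ and the direct network $N_d$ are decomposed into their eigenvectors and eigenvalues, as shown in equations (11) and (12).

$$N_d = U_o \begin{bmatrix} \lambda_1^d & 0 & \cdots & 0 \\ 0 & \lambda_2^d & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n^d \end{bmatrix} U_o^{-1} \tag{11}$$

$$N_o = U_o \begin{bmatrix} \lambda_1^o & 0 & \cdots & 0 \\ 0 & \lambda_2^o & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n^o \end{bmatrix} U_o^{-1} \tag{12}$$

where $\lambda_i^d$ is the $i$th eigenvalue of $N_d$, $\lambda_i^o$ is the $i$th eigenvalue of $N_o$, and $U_o$ is the matrix of eigenvectors shared by $N_d$ and $N_o$.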

With equations (10) to (12), the eigenvalues of the direct network can be expressed by the eigenvalues of the observed network, as follows.

$$\lambda_i^d = \frac{\lambda_i^o}{1 + \lambda_i^o} \tag{13}$$

To guarantee the convergence condition $|\lambda_i^d| < 1$ in the network deconvolution, the initial observed adjacency matrix $N_o^{in}$ should be linearly scaled as follows.

$$N_o = \alpha N_o^{in} \tag{14}$$

With the linear scaling, the relationship between the eigenvalues of $N_o^{in}$ and $N_d$ is expressed as follows.

$$-1 < \lambda_i^d = \frac{\alpha \lambda_i^{o(in)}}{1 + \alpha \lambda_i^{o(in)}} < 1 \tag{15}$$

To guarantee the convergence condition $|\lambda_i^d| < 1$, the scaling parameter $\alpha$ should satisfy the condition:

$$\alpha \le \min\left( \frac{\beta}{(1 - \beta)\, \lambda_+^{o(in)}},\ \frac{-\beta}{(1 + \beta)\, \lambda_-^{o(in)}} \right) \tag{16}$$

where $\lambda_+^{o(in)}$ is the largest positive eigenvalue of $N_o^{in}$, $\lambda_-^{o(in)}$ is the smallest negative eigenvalue of $N_o^{in}$, and $\beta < 1$ is a scaling parameter.

The network deconvolution method works as follows:
1) The initial observed adjacency matrix $N_o^{in}$ is linearly scaled according to equation (14).
2) The linearly scaled observed adjacency matrix $N_o$ is decomposed into its eigenvectors and eigenvalues as in equation (12).
3) The eigenvalues $\lambda_i^d$ of the direct adjacency matrix are calculated with equation (13), and the direct adjacency matrix $N_d$ is obtained by combining the $\lambda_i^d$ and $U_o$ as in equation (11).
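The three deconvolution steps can be sketched with NumPy as below. This is an illustrative sketch under stated assumptions, not the authors' code: the observed matrix is assumed symmetric (so `numpy.linalg.eigh` applies and the inverse of the eigenvector matrix is its transpose), the function name and the optional `alpha` override are hypothetical, and the toy direct network is invented for the check.

```python
import numpy as np


def network_deconvolution(n_in, beta=0.9, alpha=None):
    """Scale, decompose and deconvolve a symmetric observed matrix (equations 11-16)."""
    n_in = np.asarray(n_in, dtype=float)
    n_in = (n_in + n_in.T) / 2.0                  # guard against round-off asymmetry
    eigvals, u = np.linalg.eigh(n_in)             # real eigenvalues, orthonormal eigenvectors
    if alpha is None:
        # Scaling bound of equation (16); a missing positive or negative
        # eigenvalue imposes no constraint.
        bounds = []
        if eigvals.max() > 0:
            bounds.append(beta / ((1 - beta) * eigvals.max()))
        if eigvals.min() < 0:
            bounds.append(-beta / ((1 + beta) * eigvals.min()))
        alpha = min(bounds) if bounds else 1.0
    lam_o = alpha * eigvals                       # eigenvalues of the scaled N_o (equation 14)
    lam_d = lam_o / (1.0 + lam_o)                 # equation (13)
    n_d = u @ np.diag(lam_d) @ u.T                # equation (11), with U^-1 = U^T
    return n_d, alpha


# Hypothetical 3-node chain: factor 1 - factor 2 - CT, with no direct 1-CT edge.
direct = np.array([[0.0, 0.4, 0.0],
                   [0.4, 0.0, 0.5],
                   [0.0, 0.5, 0.0]])
observed = direct @ np.linalg.inv(np.eye(3) - direct)   # equation (9)

exact, _ = network_deconvolution(observed, alpha=1.0)   # no scaling: recovers `direct`
scaled, alpha = network_deconvolution(observed, beta=0.9)
```

With `alpha=1.0` the toy direct network is recovered exactly, including the zero weight on the spurious factor 1 to CT edge. With the scaling of equation (16), the result is the deconvolution of the scaled matrix, and its eigenvalues are bounded by `beta` in absolute value. The diagonal of the result is not forced to zero in this sketch.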

After the network deconvolution, the direct network containing only direct dependence is obtained to describe the influential factors of CT. The observed network with key factors for the CT of wafer lots in the case study is shown in figure 3-a), and the direct network after network deconvolution is shown in figure 3-b). In the network, each node refers to an influential factor, and the edge weight is the correlation between the factor and CT.

Figure 3. The correlation network to reveal the fluctuation mechanism of CT

3.3 Factor selection for CT explanatory network

Based on the direct correlation measured by the mutual information network deconvolution (MI-ND) method, we design the MI-ND factor selection algorithm to distinguish the CT-related factors that are highly correlated with the fluctuation of CT. The MI-ND factor selection works as follows (shown in figure 4). First, the factors are sorted by direct correlation from high to low. Then, the algorithm adds one factor to the factor subset at each step and evaluates the subset with a prediction method. The MI-ND factor selection is a greedy algorithm, which determines the factor subset with the best prediction performance after all factors are evaluated.
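The greedy procedure described above can be sketched as follows. The function name, the `evaluate` callback (standing in for whatever prediction method scores a factor subset) and the toy scores are all hypothetical and only illustrate the control flow.

```python
def select_factors(direct_scores, evaluate):
    """Forward selection over factors ranked by their direct correlation with CT.

    direct_scores: dict mapping factor name -> direct correlation with CT.
    evaluate: callable taking a list of factor names and returning a score
              (higher is better), e.g. a validated prediction accuracy.
    """
    ranked = sorted(direct_scores, key=direct_scores.get, reverse=True)
    subset, best_subset, best_score = [], [], float("-inf")
    for name in ranked:
        subset.append(name)                 # add one factor per step
        score = evaluate(subset)
        if score > best_score:              # keep the best-performing prefix
            best_subset, best_score = list(subset), score
    return best_subset, best_score


# Hypothetical direct correlations and a toy scoring table keyed by subset size.
scores = {"queue_len": 0.8, "utilization": 0.5, "priority": 0.3, "noise": 0.01}
toy_accuracy = {1: 0.60, 2: 0.72, 3: 0.75, 4: 0.70}
best, score = select_factors(scores, lambda s: toy_accuracy[len(s)])
```

Because every prefix of the ranking is evaluated before the best one is returned, the cost is one model evaluation per candidate factor.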

Figure 4. Procedure description of MI-ND factor selection

With the CT-related factors selected by the MI-ND factor selection, the CT explanatory network is simplified from the direct correlation network $N_d$. The CT explanatory network contains all CT-related factors that closely interact with the CT of wafer lots. The fluctuation mechanism can be clearly described through the CT explanatory network. In the implementation, each CT-related factor is monitored by a control chart to detect abnormal situations. When the error goes above the upper control limit or below the lower control limit (e.g., a fitting error band of 5%), the cycle time fluctuation can be understood through these anomalous data points.

4 Experiment and analysis for feature selection

The proposed big data-driven approach reveals the fluctuation mechanism of CT through key factor identification in the presence of indirect dependence. Therefore, experiments on factor selection with standard datasets and a case study on wafer data are conducted in the following sections.

To evaluate the performance of the proposed feature selection method, the MI-ND method is compared with DISR and mRMR (Brown and Pocock et al., 2012), which are competitive in factor selection. Six public benchmark datasets provided by the UCI Machine Learning Repository are used in the experiments. Some extra factors are added to three of the datasets (Sonar, ZOO, Image segmentation) by weighting some original factors, in order to introduce indirect relationships. To measure the performance of the selected factors, a decision tree classifier was constructed to classify the records on the basis of the selected factors. The classification accuracy was estimated in the experiments with 10-fold cross-validation, and is formulated as follows.

$$\text{Accuracy} = \frac{n_c}{n_t} \tag{17}$$

where $n_c$ is the number of correctly classified samples, and $n_t$ is the total number of samples in the validating dataset.

The experiment is conducted in four steps.
1) The correlations between the attributes and the target are analyzed by DISR, mRMR and MI-ND. All attributes are ranked by the estimated correlation from high to low. Set the counting variable i to 1.
2) The validating dataset is built with the top i factors in the rank.
3) A decision tree model is constructed with 80% of the instances of the validating dataset, and the classification accuracy of the decision tree is evaluated with the remaining 20% of the instances to demonstrate the effectiveness of the selected attributes.
4) Set i=i+1 and go back to step 2); the loop ends when all factors have been selected into the validating dataset for classification.

Figure 5. The classification performance on six datasets

The classification results of the three methods on the six extended datasets with different numbers of selected attributes are shown in figure 5. The MI-ND method achieves the highest classification accuracy with fewer factors than the mRMR and DISR methods in the experiments on datasets extended with extra factors that introduce indirect relationships. As illustrated in figure 5-a, the MI-ND achieves its highest accuracy (0.77) with 24 factors, while the mRMR reaches its highest accuracy (0.75) with 30 factors, and the DISR reaches its highest accuracy (0.74) with 41 factors. Compared with the reference methods, the MI-ND integrates network deconvolution to remove the indirect dependence, which leads to better key factor identification on these extended datasets. The results indicate that the proposed method is more effective at identifying the explanatory variables in datasets containing both direct and indirect dependence, which is the situation found in the CT data of wafer lots.

5 Case study

In this section, the proposed method is evaluated using a lot transaction dataset extracted from the manufacturing execution system of a SWFS in Shanghai. This SWFS produces 8-inch wafers with a capacity of 200 thousand wafer lots. There are two hundred and twenty machine stations in this SWFS to accomplish the fabrication of integrated circuit products. Two thousand samples of wafer lots with 331 procedures were captured from the SWFS over two months. In the case study, the proposed method is implemented in two steps to reveal the CT fluctuation of wafer lots. Firstly, the CT explanatory factors are obtained through the proposed MI-ND method, and these factors are analyzed according to the status of the SWFS. Then, through these CT explanatory factors, the root cause of abnormal CT of wafer lots can be detected to reveal the CT fluctuation mechanism. Three abnormal wafer lots are thoroughly analyzed, and the reasons for the delay are illustrated through the CT explanatory factors.

5.1 Big data captured for feature selection

Collecting all candidate factors related to the CTs of wafer lots is the foundation of big data analysis. The CT of a wafer lot is the elapsed time between the start of its first process and the completion of its last process in the SWFS. According to factory physics, the cycle time of a job is closely related to the processing time, the utilization of equipment and the variability of the whole manufacturing process. In this paper, processing time means the processing time of each operation on the route of wafer manufacturing. Utilization of equipment includes the remaining workload of each station and of the whole automated material handling system. Variability is expressed by changes of both the whole system and the wafer lot, which include the priority of wafer lots, the WIP level in the shop and the size of the waiting queue for each operation. All six types of candidate factors are defined in this section as follows.

1) The processing time of each operation
The processing time of each operation is the difference between the track-out time and the track-in time of the operation, which can be calculated as follows:

$$TP_i = TP_i^{out} - TP_i^{in} \tag{18}$$

where $TP_i^{in}$ is the track-in time of the $i$th operation, and $TP_i^{out}$ is the track-out time of the $i$th operation.

2) The priority of each wafer lot
The priority of a wafer lot $Pr$ is exactly equal to the internal priority of the wafer lot in the SWFS.

3) The utilization of each machine
The utilization of a machine means the average utilization of the machine in the past 24 hours, which can be calculated as follows:

$$U_i = \frac{\sum_j \left(TP_j^{out} - TP_j^{in}\right)}{24\ \text{hours}} \qquad (19)$$

where $TP_j^{out}$ and $TP_j^{in}$ are the track-out time and the track-in time of the $j$th operation processed on this machine in the past 24 hours.

4) The size of waiting queue for each machine

The queue size $Q_i$ can be calculated by summing the processing time of the next procedure over all wafer lots waiting before machine $i$.

5) The workload of the automated material handling system (AMHS)

This factor is identified by the present workload of the AMHS, which can be computed as follows:

$$U_{AMHS} = \frac{N_{busy}}{N_{idle}} \qquad (20)$$

where $N_{busy}$ is the number of vehicles working at present, and $N_{idle}$ is the number of vehicles that are free at present.

6) The WIP level

The WIP level $N_{wip}$ can be measured by counting the wafer lots whose first process has been finished and whose final process has not been finished.

The dataset containing all candidate factors has the characteristics of big data, which is featured by variety, volume, velocity, multi-source, multi-noise, and multi-scale ("3V3M").

Variety means that the types and structures of the data are various. The candidate factors contain logistics system status, machine status, work-in-process status, order status, etc. The relationship between product priority and cycle time is a one-to-one correspondence. However, the WIP level corresponds with product cycle time in a one-to-many relationship, and the machine utilization corresponds with product cycle time in a many-to-many relationship.

Volume refers to the large size of the analytical datasets. The data for RCTF of each process operation is about one megabyte in volume. Every minute, a typical SWFS finishes more than one thousand operations, producing more than one thousand gigabytes of data per day. In this case study, the dataset for performance evaluation of two thousand wafer lots contains more than five million records for RCTF.

Velocity is defined by the requirement of short data processing time. There are massive primary and foreign keys in the CT-related dataset, resulting in long query time. The number of data requests is about ten thousand per day, which are served by on the order of a million database operations. To keep real-time performance, the Apache big data platform is implemented for the management and storage of big data.

Multi-source: the CT-related data are generated and stored in the original databases of the product data management system (PDM), the manufacturing execution system (MES), and the overhead hoist system (OHS). The different systems have various database structures and interfaces.

Multi-noise: the CT-related data has high noise because of the electromagnetic interference and harsh environment in SWFSs. In the table entitled "wafer lot transaction", about five thousand records contain missing values and fifteen hundred records contain abnormal values in every 8 million records.

Multi-scale: the time scale of CT-related data is various. The "wafer lot transaction" table is updated every second, the machine utilization data is evaluated and updated once per day, and the queuing length of each machine is updated whenever wafer lots newly arrive.

To perform the correlation analysis, a data preprocessing approach is applied to extract, transform and load all the data for candidate factors. The final dataset, containing all the candidate factors of 2000 wafer lots, is preprocessed from the databases of enterprise information systems. The obtained data of candidate factors for feature selection is detailed in Table 1.

Table 1: The data of candidate factors for feature selection
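As a minimal sketch of how the candidate factors of Eqs. (18)-(20) and the queue size could be computed from track-in and track-out timestamps, consider the following Python fragment. The function names, the clipping of operations to the 24-hour window, and the sample timestamps are illustrative assumptions and are not taken from the MES described in this paper.

```python
from datetime import datetime, timedelta

def processing_time(track_in, track_out):
    """Eq. (18): processing time of one operation, in minutes."""
    return (track_out - track_in).total_seconds() / 60.0

def machine_utilization(operations, now, window=timedelta(hours=24)):
    """Eq. (19): busy time in the past 24 hours divided by 24 hours.

    `operations` is a list of (track_in, track_out) pairs for one machine.
    Assumption: operations that straddle the window are clipped to it.
    """
    start = now - window
    busy = timedelta()
    for t_in, t_out in operations:
        lo, hi = max(t_in, start), min(t_out, now)
        if hi > lo:
            busy += hi - lo
    return busy / window  # timedelta / timedelta gives a float

def queue_size(next_step_minutes):
    """Item 4: sum of the next-procedure processing times of waiting lots."""
    return sum(next_step_minutes)

def amhs_workload(n_busy, n_idle):
    """Eq. (20): ratio of busy vehicles to idle vehicles."""
    if n_idle == 0:
        raise ValueError("Eq. (20) is undefined when no vehicle is idle")
    return n_busy / n_idle

# Hypothetical example: two 6-hour operations inside the last 24 hours.
now = datetime(2019, 6, 2, 8, 0)
ops = [
    (datetime(2019, 6, 1, 9, 0), datetime(2019, 6, 1, 15, 0)),
    (datetime(2019, 6, 2, 2, 0), datetime(2019, 6, 2, 8, 0)),
]
print(machine_utilization(ops, now))      # 12 busy hours out of 24
print(processing_time(*ops[0]))           # minutes spent in the first operation
print(amhs_workload(n_busy=13, n_idle=100))
```

Clipping to the window is only one possible convention; an implementation could equally count only the operations that both start and finish inside the window.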

5.2 The results of feature selection for the real case

By applying the proposed MI-ND method, 153 CT-related factors were selected from 1202 candidate factors, as summarized in Table 2. Among the 153 selected factors, 134 are from the category "the processing time of each operation (TP)", 10 belong to the type "the size of waiting queue for each machine (Queueing)", and the other 9 are: the utilization of a machine (Load, 7 factors), the priority of each wafer lot (Pr, 1 factor), and the WIP level (WIP, 1 factor). The factors belonging to the category "the processing time of each operation" account for 88 percent of all CT-related factors. This result is consistent with the workshop status of the specific SWFS, which is ramping up production capacity. In this SWFS, the instability of the process parameters leads to the fluctuation of processing time, which influences the CT of wafer lots greatly.

Table 2: CT-related factors selected by the proposed method

To further demonstrate the effectiveness of the proposed MI-ND method, the selected factors are applied as input variables of a backpropagation neural network to forecast the CT of wafer lots. The MI-ND method was compared with the mutual information (MI) method, which selects factors by the observed correlation containing both direct and indirect dependence. Moreover, a forecasting model taking all 1202 factors as input is another reference in the experiment. The prediction accuracy and precision were evaluated by the mean relative accuracy (MRA), which is formulated in equation (21). As shown in figure 6-a), the MRA of MI-ND is about sixty percentage points higher than that of the prediction model with all factors, which means that factor selection is necessary to reveal the CT fluctuation. Moreover, the MRA of MI-ND is about six percentage points higher than that of MI, which means that the direct relationship extraction through network deconvolution is beneficial for identifying the key factors of CT fluctuation.

$$MRA = 1 - \frac{\sum_{i=1}^{n} \left| ct_i - \widehat{ct}_i \right|}{\sum_{i=1}^{n} ct_i} \qquad (21)$$

where $ct_i$ means the true CT of the $i$th wafer lot, and $\widehat{ct}_i$ refers to the predicted CT of the $i$th wafer lot.
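The MRA of equation (21) can be sketched in a few lines of Python. The function name and the sample cycle times below are hypothetical and serve only to show the arithmetic.

```python
def mean_relative_accuracy(actual, predicted):
    """MRA of Eq. (21): 1 - sum(|ct_i - ct_hat_i|) / sum(ct_i)."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    abs_error = sum(abs(a - p) for a, p in zip(actual, predicted))
    return 1.0 - abs_error / sum(actual)

# Hypothetical cycle times in minutes for three wafer lots.
actual = [70000.0, 80000.0, 90000.0]
predicted = [63000.0, 84000.0, 90000.0]
# Absolute errors sum to 11000 against a total of 240000 minutes.
print(mean_relative_accuracy(actual, predicted))
```

Because the errors are normalized by the total true cycle time rather than lot by lot, long-running lots weigh more heavily in this measure than short ones.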

5.3 Reveal the cycle time fluctuation for real wafer lots

In the experimental wafer data, the average cycle time of all wafer lots is 72492 minutes. Three abnormal wafer lots (ID: D84904, D84565, and D84635) from three different batches, whose cycle times exceed 90,000 minutes, are selected for analysis (shown in figure 6-b)). The fluctuation of CT can be foreseen by monitoring the CT-related factors. The factor value is weighted according to the factor type, and the deviations of the factors of the three wafer lots from the average level are shown in figure 6-d), where "TP_All" means the sum of the processing time of all procedures.
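One simple way to screen a wafer lot against the average level is to compute the relative deviation of each factor. The sketch below assumes the deviation (value - average) / average, since the type-dependent weighting used for figure 6-d) is not spelled out here. The numbers are the processing times of D84904 and the average level reported in Table 3; the function name is hypothetical.

```python
def relative_deviation(lot, average):
    """Return (value - average) / average for every factor of a lot."""
    return {name: (value - average[name]) / average[name]
            for name, value in lot.items()}

# Processing times in minutes, taken from Table 3.
average = {"TP283": 39.71, "TP9": 161.17, "TP58": 59.85,
           "TP34": 42.13, "TP55": 37.88}
lot = {"TP283": 105.95, "TP9": 519.68, "TP58": 399.17,
       "TP34": 555.12, "TP55": 147.27}

# Rank the procedures from the largest to the smallest deviation.
ranked = sorted(relative_deviation(lot, average).items(),
                key=lambda item: item[1], reverse=True)
for name, deviation in ranked:
    print(f"{name}: {deviation:+.2f}")
```

Under this assumed measure, TP34 ranks first, which agrees with the observation that its processing time is more than ten times the average level.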

Figure 6. Case study to understand the cycle time fluctuation of wafer lot

The factor deviation analysis of wafer lot D84904 indicates that the processing time of some procedures is abnormal. The processing times of these procedures, which are contained in the CT-related factors, are listed in Table 3. It is obvious that this wafer lot takes unusually long processing time in the procedures TP283, TP9, TP58, TP34, and TP55. The processing time of "TP34" is even more than ten times the average level. The result suggests that this wafer lot may have been reworked in several procedures, and the processing time should be controlled to reduce the cycle time of this wafer lot.

Table 3: The processing time of the abnormal procedures of D84904

The factor deviation analysis of wafer lot D84565 suggests that the queuing length of some machines is higher than the average level. We then analyze the queuing time before the abnormal machines of D84565. This wafer lot spends about 2000 minutes more than the average level, as detailed in Table 4. The two machines are located in the photo area, which is a typical bottleneck in the SWFS. As for the other wafer lots (ID: D85492, D84912, and D85098) in the same batch as D84565, their CTs also exceed 8000 minutes, which indicates that the bottleneck also influences other wafer lots. For this reason, the dispatching and release rules should be adjusted to reduce the CT of wafer lots.

Table 4: The queuing time of the abnormal procedures of D84565

As shown in figure 6-d), the priority deviation of wafer lot D84904 is 0.44, which means the priority of wafer lot D84904 is much lower than the average level. In this case study, the priority value of a wafer lot is adjustable from 0 to 99 during production (where 0 is the highest priority and 99 is the lowest). A wafer lot with high priority has a small priority value and is preferred in dispatching. We extract the priority adjustment record of this wafer lot, which is shown in figure 6-c). The average priority value over all steps of D84904 is 68, which is 31% higher than the average level of all wafer lots. As shown in figure 6-c), the priority value for most steps of this wafer lot even exceeds 80, which means the priority of this wafer lot is low during dispatching. Hence, it is obvious that the low priority is a critical cause of the delay of wafer lot D84904.

5.4 Application issues in industrial practice

1) Integration with enterprise information systems

During the operation of manufacturing systems, the selected key factors can be applied to reveal the cycle time fluctuation, which is important for production planning and scheduling, order tracking, and work-in-process management. In the SWFS, the quality of fabricated circuits is closely correlated with residency time. Hence, the selected key factors can also support quality control of the fabrication process. In industrial implementations, the feature selection method can be programmed as a module of manufacturing execution systems.

2) Resilience of the feature selection

Manufacturing systems are always dynamic, and the changing workload, the shifted bottleneck, and other disturbances will influence the key factor selection. In dynamic systems, resilience is a critical indicator to measure the ability to recover from disturbances (Zhang and Lin, 2010; Zhang and Luttervelt, 2011). To cope with unforeseen circumstances (e.g., machine breakdowns, rush orders, etc.), the MRA defined in equation (21) can be estimated to evaluate the fitness of the proposed feature selection method. If the MRA satisfies the re-selection condition, the proposed MI-ND can be triggered to obtain freshly selected factors to improve the MRA and keep strong resilience.

6 Conclusion

This paper proposed a complex-network-based key factor identification approach to reveal the fluctuation mechanism of the cycle time, which involves four parts: candidate factor set construction, correlation analysis, direct dependence extraction, and CT-related factor identification. Different from previous works, this study designs a complex network deconvolution method to recognize and remove the indirect effects from the observed network to obtain the direct correlation between factors. A case study from a SWFS in Shanghai is conducted with industrial big data featured by "3V3M". The prediction accuracy of the model with the factors identified by direct correlation is about sixty percentage points higher than that of the model with all factors, and about six percentage points higher than that of the model with input factors identified by observed correlation. The results indicate that removing indirect effects over paths in the complex network is effective in key factor identification. In addition, the proposed method can be successfully implemented in the SWFS to explain the cycle time fluctuation of delayed wafer lots.

In future research, we will address the evolution of the correlation network during system operation. The different key factors under a dynamic system environment (such as an increasing WIP level) will be analyzed. Moreover, cycle time control models that consider the relationship between the tunable parameters and the cycle time of wafer lots will be studied in future work.

7 References:

Akhavan-Tabatabaei, R. and Y. Fathi, et al. (2012). "A Markov Chain Framework for Cycle Time Approximation of Toolsets." IEEE Transactions on Semiconductor Manufacturing 25 (4): 589-597.

Brown, G. and A. Pocock, et al. (2012). "Conditional likelihood maximisation: a unifying framework for information theoretic feature selection." Journal of Machine Learning Research 13 (1): 27-66.

Çerekçi, A. and A. Banerjee (2015). "Effect of upstream re-sequencing in controlling cycle time performance of batch processors." Computers & Industrial Engineering 88: 206-216.

Chen, H. and P. A. Mundra, et al. (2014). "Highly sensitive inference of time-delayed gene regulation by network deconvolution." BMC Systems Biology 8 (4): 1-10.

Cheng, F. and Y. Hsieh, et al. (2017). "A Scheme of High-Dimensional Key-Variable Search Algorithms for Yield Improvement." IEEE Robotics and Automation Letters 2 (1): 179-186.

Cheng, H. and Z. Qin, et al. (2011). "Conditional mutual information-based feature selection analyzing for synergy and redundancy." ETRI Journal 33 (2): 210-218.

Cheng, Q. and P. K. Varshney, et al. (2006). "Logistic Regression for Feature Selection and Soft Classification of Remote Sensing Data." IEEE Geoscience and Remote Sensing Letters 3 (4): 491-494.

Chien, C. and C. Hsu, et al. (2012). "Manufacturing intelligence to forecast and reduce semiconductor cycle time." Journal of Intelligent Manufacturing 23 (6): 2281-2294.

Chien, C. and W. Wang, et al. (2007). "Data mining for yield enhancement in semiconductor manufacturing and an empirical study." Expert Systems with Applications 33 (1): 192-198.

Chung, S. and H. Huang (2002). "Cycle time estimation for wafer fab with engineering lots." IIE Transactions 34 (2): 105-118.

Cover, T. M. and J. A. Thomas (1991). Elements of Information Theory. Wiley, Tsinghua University Press.

Ekici, A. and O. Ö. Özener, et al. (2019). "Cyclic ordering policies from capacitated suppliers under limited cycle time." Computers & Industrial Engineering 128: 336-345.

Eling, K. and F. Langerak, et al. (2013). "A Stage-Wise Approach to Exploring Performance Effects of Cycle Time Reduction." Journal of Product Innovation Management 30 (4): 626-641.

Feizi, S. and D. Marbach, et al. (2013). "Network deconvolution as a general method to distinguish direct dependencies in networks." Nature Biotechnology 31 (8): 726-733.

Gao, Y. and L. Gao, et al. (2019). "A zero-shot learning method for fault diagnosis under unknown working loads." Journal of Intelligent Manufacturing. DOI: 10.1007/s10845-019-01485-w.

Gao, Y. and L. Gao, et al. (2020). "A semi-supervised convolutional neural network-based method for steel surface defect recognition." Robotics and Computer-Integrated Manufacturing 61: 101825.

Ginsberg, J. and M. H. Mohebbi, et al. (2008). "Detecting influenza epidemics using search engine query data." Nature 457 (7232): 1012-1014.

Hsieh, L. Y. and K. Chang, et al. (2013). "Efficient development of cycle time response surfaces using progressive simulation metamodeling." International Journal of Production Research 52 (10): 3097-3109.

Husmeier, D. (2003). "Sensitivity and specificity of inferring genetic regulatory interactions from microarray experiments with dynamic bayesian networks." Bioinformatics 19.

Kuo, C. and C. Chien, et al. (2011). "Manufacturing intelligence to exploit the value of production and tool data to reduce cycle time." IEEE Transactions on Automation Science and Engineering 8 (1): 103-111.

Lu, S. C. and D. Ramaswamy, et al. (1994). "Efficient scheduling policies to reduce mean and variance of cycle-time in semiconductor manufacturing plants." IEEE Transactions on Semiconductor Manufacturing 7 (3): 374-388.

Qin, W. and J. Zhang, et al. (2013). "Dynamic dispatching for interbay material handling by using modified Hungarian algorithm and fuzzy-logic-based control." The International Journal of Advanced Manufacturing Technology 67 (1-4): 295-309.

Qin, W. and J. Zhang, et al. (2013). "Multiple-objective scheduling for interbay AMHS by using genetic-programming-based composite dispatching rules generator." Computers in Industry 64 (6): 694-707.

Rossi, F. and A. Lendasse, et al. (2006). "Mutual information for the selection of relevant variables in spectrometric nonlinear modelling." Chemometrics and Intelligent Laboratory Systems 80 (2): 215-226.

Schelasin, R. (2011). Using static capacity modeling and queuing theory equations to predict factory cycle time performance in semiconductor manufacturing. Winter Simulation Conference 2011, Phoenix, Arizona, IEEE.

Sha, D. Y. and R. L. Storch, et al. (2007). "Development of a regression-based method with case-based tuning to solve the due date assignment problem." International Journal of Production Research 45 (1): 65-82.

Uzsoy, R. and J. W. Fowler, et al. (2018). "A survey of semiconductor supply chain models Part II: demand planning, inventory management, and capacity planning." International Journal of Production Research. DOI: 10.1080/00207543.2018.1424363.

Vergara, J. R. and P. A. Estévez (2014). "A review of feature selection methods based on mutual information." Neural Computing and Applications 24 (1): 175-186.

Wang, J. and C. Xu, et al. (2020). "A collaborative architecture of the industrial internet platform for manufacturing systems." Robotics and Computer-Integrated Manufacturing 61: 101854.

Wang, J. and J. Zhang (2016). "Big data analytics for forecasting cycle time in semiconductor wafer fabrication system." International Journal of Production Research 54 (23): 7231-7244.

Wang, J. and J. Zhang, et al. (2018). "A Data Driven Cycle Time Prediction with Feature Selection in a Semiconductor Wafer Fabrication System." IEEE Transactions on Semiconductor Manufacturing 31 (1): 173-182.

Wang, J. and J. Zhang, et al. (2018). "Bilateral LSTM: A Two-Dimensional Long Short-Term Memory Model with Multiply Memory Units for Short-Term Cycle Time Forecasting in Re-entrant Manufacturing Systems." IEEE Transactions on Industrial Informatics 14 (2): 748-758.

Wang, J. and P. Zheng, et al. (2019). "Fog-IBDIS: Industrial Big Data Integration and Sharing with Fog Computing for Manufacturing Systems." Engineering. DOI: 10.1016/j.eng.2018.12.013.

Wein, L. M. (1988). "Scheduling semiconductor wafer fabrication." IEEE Transactions on Semiconductor Manufacturing 1 (3): 115-130.

Wen, L. and X. Li, et al. (2019). "A New Two-Level Hierarchical Diagnosis Network Based on Convolutional Neural Network." IEEE Transactions on Instrumentation and Measurement PP (99): 1-9.

Xie, Y. and C. Chien, et al. (2013). "A method for estimating the cycle time of business processes with many-to-many relationships among the resources and activities based on individual worklists." Computers & Industrial Engineering 65 (2): 194-206.

Yang, F. and B. E. Ankenman, et al. (2008). "Estimating cycle time percentile curves for manufacturing systems via simulation." INFORMS Journal on Computing 20 (4): 628-643.

Zhang, W. J. and C. A. V. Luttervelt (2011). "Toward a resilient manufacturing system." CIRP Annals - Manufacturing Technology 60 (1): 469-472.

Zhang, W. J. and Y. Lin (2010). "On the principle of design of resilient systems – application to enterprise information systems." Enterprise Information Systems 4 (2): 99-110.

Zhong, R. Y. and C. Xu, et al. (2015). "Big Data Analytics for Physical Internet-based intelligent manufacturing shop floors." International Journal of Production Research 55 (9): 2610-2621.

Zhong, R. Y. and S. T. Newman, et al. (2016). "Big Data for supply chain management in the service and manufacturing sectors: Challenges, opportunities, and future perspectives." Computers & Industrial Engineering 101: 572-591.

Zoppoli, P. and S. Morganella, et al. (2010). "TimeDelay-ARACNE: Reverse engineering of gene networks from time-course data by an information theoretic approach." BMC Bioinformatics 11 (1): 154.

Figure 1. The proposed approach to decipher the fluctuation of cycle time. Panels: 3.1 Construction of the observed network, giving the observed network (direct and indirect dependencies); 3.2 Direct dependencies extraction through network deconvolution, giving the direct network (only direct dependencies); 3.3 Factor selection for CT explanatory network, giving the CT explanatory network (direct dependencies with only key factors).

Figure 2. The transitive indirect dependencies in a five node network. Panels: a) the direct network $N_d$; b) direct correlation; c) 2-order indirect correlation; d) 3-order indirect correlation; e) 4-order indirect correlation; f) the estimation of 2-order indirect correlation in ND.
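The transitive effect shown in Figure 2 can be illustrated numerically. The fragment below assumes the usual network deconvolution model (Feizi et al., 2013), in which the k-order indirect effect between two nodes is the corresponding entry of the k-th power of the direct matrix, and the observed network is the sum over all orders. The five-node chain, its edge weight of 0.3, and the function names are hypothetical.

```python
import numpy as np

def indirect_effect(g_dir, order):
    """Effect carried by paths of exactly `order` edges: G_dir ** order."""
    return np.linalg.matrix_power(g_dir, order)

def observed_network(g_dir):
    """Closed form of G_dir + G_dir^2 + ... = G_dir (I - G_dir)^-1.

    The series only converges if the spectral radius of G_dir is below 1.
    """
    n = g_dir.shape[0]
    return g_dir @ np.linalg.inv(np.eye(n) - g_dir)

# Hypothetical chain 1-2-3-4-5 with a direct weight of 0.3 on every edge.
g_dir = np.zeros((5, 5))
for i in range(4):
    g_dir[i, i + 1] = g_dir[i + 1, i] = 0.3

second = indirect_effect(g_dir, 2)
# Nodes 1 and 3 share no edge, yet a two-step path through node 2 links them.
print(g_dir[0, 2], second[0, 2])

# A truncated series approaches the closed form of the observed network.
series = sum(indirect_effect(g_dir, k) for k in range(1, 60))
print(np.allclose(series, observed_network(g_dir)))
```

Network deconvolution runs this model in reverse: given the observed matrix, it solves for the direct matrix whose powers add up to it.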

Figure 3. The correlation network to reveal the fluctuation mechanism of CT. Panels: a) The observed network with CT-related factors; b) The direct network with CT-related factors.

Algorithm: MI-ND factor selection
Input: $F = \{f_1, f_2, \ldots, f_n\}$, CT
Output: the CT-related factors and the direct correlation between factors and CT
Step 1: initialize the subset $S$
Step 2: measure the mutual information $I(f_i; CT)$ for every factor
Step 3: construct the complex network $N_O$, and get the direct network $N_d$
Step 4: sort all factors by the direct correlation from high to low
Step 5: for $i = 1$ to $|F|$
Step 6:     predict the CT with the top $i$ factors
Step 7: end for
Step 8: optimize the subset $S$ with the highest prediction accuracy

Figure 4. Procedure description of MI-ND factor selection
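The steps of Figure 4 can be sketched in Python as follows. This is an illustration under stated assumptions rather than the authors' implementation: mutual information is estimated with a plain histogram, the direct network is obtained with the closed-form eigenvalue mapping of network deconvolution (Feizi et al., 2013) plus an assumed rescaling safeguard, and a least-squares linear model stands in for the backpropagation neural network. All function names and the synthetic data are hypothetical.

```python
import numpy as np

def mutual_information(x, y, bins=8):
    """Histogram estimate of I(x; y) in nats (assumed estimator)."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)   # marginal of y, shape (1, bins)
    mask = pxy > 0
    return float(np.sum(pxy[mask] * np.log(pxy[mask] / (px @ py)[mask])))

def observed_network(data, bins=8):
    """Steps 2-3: symmetric matrix of pairwise MI between all columns."""
    n = data.shape[1]
    g = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            g[i, j] = g[j, i] = mutual_information(data[:, i], data[:, j], bins)
    return g

def deconvolve(g_obs, beta=0.9):
    """Step 3: G_dir = G_obs (I + G_obs)^-1 via eigen-decomposition."""
    lam, vec = np.linalg.eigh(g_obs)
    # Assumed safeguard: shrink G_obs only if needed, so that the direct
    # network ends up with a spectral radius of at most beta.
    limits = [1.0]
    if lam.max() > 0:
        limits.append(beta / ((1 - beta) * lam.max()))
    if lam.min() < 0:
        limits.append(beta / ((1 + beta) * -lam.min()))
    lam = lam * min(limits)
    return (vec * (lam / (1 + lam))) @ vec.T

def fit_predict(x_train, y_train, x_test):
    """Least-squares linear model, a stand-in for the BP neural network."""
    a = np.column_stack([x_train, np.ones(len(x_train))])
    coef, *_ = np.linalg.lstsq(a, y_train, rcond=None)
    return np.column_stack([x_test, np.ones(len(x_test))]) @ coef

def mra(actual, predicted):
    """Mean relative accuracy of Eq. (21)."""
    return 1.0 - np.abs(actual - predicted).sum() / actual.sum()

def mi_nd_select(factors, ct, n_train):
    data = np.column_stack([factors, ct])            # CT is the last node
    g_dir = deconvolve(observed_network(data))
    order = np.argsort(-g_dir[:-1, -1])              # Step 4: high to low
    best_score, best_subset = -np.inf, []
    for i in range(1, len(order) + 1):               # Steps 5-7
        cols = order[:i]
        pred = fit_predict(factors[:n_train][:, cols], ct[:n_train],
                           factors[n_train:][:, cols])
        score = mra(ct[n_train:], pred)
        if score > best_score:                       # Step 8
            best_score, best_subset = score, cols.tolist()
    return best_subset, best_score

# Synthetic data (hypothetical): only factor 0 drives the cycle time.
rng = np.random.default_rng(0)
factors = rng.normal(size=(600, 3))
ct = 100.0 + 10.0 * factors[:, 0] + 0.5 * rng.normal(size=600)
subset, score = mi_nd_select(factors, ct, n_train=400)
print(subset, round(score, 3))
```

The prefix search keeps the ranking fixed and only decides how many of the top-ranked factors to retain, which mirrors Steps 5 to 8; a wrapper that re-ranks after each addition would be a different method.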

Figure 5. The classification performance on six datasets (accuracy versus number of factors for mRMR_D, DISR, and MI-ND). Panels: a) The classification performance on Sonar dataset; b) The classification performance on ZOO dataset; c) The classification performance on Image segmentation dataset; d) The classification performance on sensorless drive diagnosis dataset; e) The classification performance on statlog dataset; f) The classification performance on gesture phase dataset.

Figure 6. Case study to understand the cycle-time fluctuation of wafer lot. Panels: a) The mean relative accuracy of cycle time forecasting (MRA for All factors, MI, and MI-ND; data labels 31.00%, 84.65%, and 90.90%); b) The cycle time of the three typical abnormal wafer lots (cycle time in minutes versus wafer lot ID; data labels 91,485, 90,814, and 91,201); c) The priority adjust record of D84904 (priority value versus the number of priority adjustment); d) Factors deviation from the average level (axes TP, WIP, Pr, Load, TP_ALL, and QUEUING; wafer lots D84904 and D84565).

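Table 1: The data of candidate factors for feature selection

Column groups: TP1 to TP331, the processing time of each process (datetime); U1 to U220, the workload of each machine; Q1 to Q220, the queue size of each machine (datetime).

| No. | Cycle time (minute) | TP1 | TP2 | … | TP331 | U1 | U2 | … | U220 | Q1 | Q2 | … | Q220 | The WIP level | The workload of AMHS | The priority of wafer lot |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 87975.82 | 12.82 | 13.95 | … | 44.6 | 1.70 | 0.01 | … | 1.75 | 80 | 0 | … | 101 | 59695 | 0.13 | 40 |
| 2 | 90290.18 | 132.43 | 13.1 | … | 41.08 | 1.70 | 0.01 | … | 1.75 | 171 | 0 | … | 1675 | 59086 | 0.13 | 42 |
| 3 | 76673.37 | 0.3 | 25.48 | … | 44.57 | 1.53 | 253.62 | … | 2.54 | 0 | 833 | … | 398 | 53265 | 0.14 | 63 |
| 4 | 65437.47 | 2 | 24.32 | … | 44.15 | 1.30 | 7.12 | … | 169.40 | 637 | 0 | … | 2626 | 59758 | 0.14 | 61 |
| 5 | 69098.03 | 2.05 | 13.42 | … | 26.95 | 1.29 | 1.49 | … | 7.08 | 155 | 448 | … | 78 | 59758 | 0.14 | 81 |
| 6 | 69307.9 | 0.33 | 25.05 | … | 42.78 | 1.97 | 2.32 | … | 1.38 | 245 | 0 | … | 1734 | 56997 | 0.14 | 19 |
| …… | | | | | | | | | | | | | | | | |
| 1993 | 75552.07 | 0.68 | 24.68 | … | 44.17 | 1.41 | 1.91 | … | 7.11 | 12333 | 85 | … | 2742 | 55999 | 0.15 | 74 |
| 1994 | 83717.23 | 0.42 | 24.97 | … | 42.15 | 167.08 | 1.92 | … | 15.50 | 4445 | 22665 | … | 145 | 55817 | 0.15 | 28 |
| 1995 | 80046.12 | 86.35 | 25.13 | … | 27.48 | 270.66 | 79.60 | … | 1.35 | 17 | 41 | … | 41 | 56662 | 0.15 | 71 |
| 1996 | 64078.68 | 0.72 | 13.99 | … | 43.67 | 0.31 | 2.31 | … | 2.31 | 48 | 7 | … | 3757 | 57912 | 0.15 | 81 |
| 1997 | 91201.38 | 0.32 | 13.72 | … | 42.03 | 1.32 | 4.08 | … | 22.05 | 144 | 24 | … | 99 | 59729 | 0.15 | 23 |
| 1998 | 67006.22 | 0.52 | 13.18 | … | 37.12 | 1.30 | 0.59 | … | 1.39 | 0 | 99 | … | 17404 | 59715 | 0.16 | 7 |
| 1999 | 61431.03 | 92.93 | 13.2 | … | 43.1 | 0.87 | 5.01 | … | 30.55 | 20 | 324 | … | 2486 | 56513 | 0.15 | 79 |
| 2000 | 74059.45 | 0.45 | 25.3 | … | 44.48 | 1.58 | 1.56 | … | 6.49 | 136 | 40 | … | 0 | 55106 | 0.15 | 33 |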

Table 2: CT-related factors selected by the proposed method

| | Number of Factors | Total Score | Average Score |
|---|---|---|---|
| Pr | 1 | 0.448 | 0.448 |
| TP | 134 | 59.111 | 0.441 |
| Load | 7 | 3.085 | 0.441 |
| Queueing | 10 | 4.416 | 0.442 |
| WIP | 1 | 0.443 | 0.443 |

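Table 3: The processing time of the abnormal procedures of D84904

| | TP105 | TP290 | TP385 | TP283 | TP9 | TP58 |
|---|---|---|---|---|---|---|
| Average | 44.81 | 46.01 | 44.44 | 39.71 | 161.17 | 59.85 |
| D84904 | 68.37 | 74.23 | 63.92 | 105.95 | 519.68 | 399.17 |

| | TP240 | TP34 | TP232 | TP55 | TP345 | TP66 |
|---|---|---|---|---|---|---|
| Average | 44.13 | 42.13 | 39.06 | 37.88 | 53.92 | 30.69 |
| D84904 | 66.08 | 555.12 | 73.37 | 147.27 | 85.37 | 68.82 |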

Table 4: The queuing time of the abnormal procedures of D84565

| Machine ID | Average level | D84565 | Machine | Working area |
|---|---|---|---|---|
| Q5 | 1160.756 | 1912 | AWXCR02 | Photo |
| Q9 | 1093.315 | 2123 | ADSIN10 | Photo |

Highlights

Big data analytics is investigated for feature selection.
The direct correlation and indirect correlation are investigated.
A network deconvolution method is designed to infer the direct correlation.
A real case is conducted to understand the cycle time fluctuation of wafer lots.

CRediT author statement

Junliang Wang: Conceptualization, Methodology, Software. Peng Zheng: Data curation, Visualization, and Proofreading. Jie Zhang: Supervision and Editing.
