Robotics and Computer Integrated Manufacturing 58 (2019) 230–238
Robotics and Computer Integrated Manufacturing journal homepage: www.elsevier.com/locate/rcim
Full length Article
An adaptive CGAN/IRF-based rescheduling strategy for aircraft parts remanufacturing system under dynamic environment
Zheng Peng a, Wang Junliang b, Zhang Jie b,⁎, Yang Changqi c, Jin Yongqiao c
a School of Mechanical Engineering, Shanghai Jiao Tong University, Shanghai 200040, China
b College of Mechanical Engineering, Donghua University, Shanghai 201620, China
c Shanghai Spaceflight Precision Machinery Institute, Shanghai 201699, China
ARTICLE INFO
Keywords: Aircraft parts remanufacturing system; Dynamic environment; Rescheduling; Improved random forest
ABSTRACT
The production planning and control of aircraft remanufacturing systems is becoming a critical issue with the development and application of reusable aircraft parts. Because the demand for remanufactured products is unpredictable, the aircraft parts remanufacturing system (APRS) operates under various disruptions, which can seriously reduce the system's efficiency. To generate a timely and efficient response to the frequent disruptions in APRS, this paper presents an adaptive rescheduling policy consisting of a trigger and a rescheduler. The trigger evaluates the system performance loss caused by the disruptions with a specific Relative Performance Deviation Index (RPDI) and determines when the rescheduler is activated to re-optimize the schedule. The rescheduler then selects the optimal rescheduling method with an improved random forest (IRF), based on the system status described by a reduced set of important features. To deal with the data imbalance among classes in the remanufacturing system, this paper develops a problem-oriented conditional generative adversarial network (CGAN) for data augmentation. To demonstrate the effectiveness of the proposed method, an experiment compares it with three commonly used classification approaches. The results show the competitiveness of the proposed approach in reducing the number of rescheduling triggers and in improving rescheduling performance.
1. Introduction
With the development of aircraft parts reuse technology, the demand for remanufacturing of key aircraft parts will increase in the next few years [1]. An aircraft parts remanufacturing system (APRS) consists of three subsystems: disassembly, reprocessing and reassembly shops. As the intermediate phase of the system, the reprocessing shop is usually regarded as a bottleneck in the production process. To maintain high production efficiency, scheduling for reprocessing, which creates a production schedule for a given set of jobs and resources, is a critical issue in the operation of the APRS [2]. However, the arrival timing, quantity and quality of used products are usually unpredictable, since the life cycle of an aircraft part is relatively long and its mechanical wear is uncertain [3,4]. In addition, aircraft parts typically have very high machining accuracy requirements. As a result, a practical APRS operates in a dynamic environment in which frequent disruptions (e.g. machine breakdowns, rework and the arrival of new orders) occur. These disruptions result in changes in status parameters such as job processing time and available production resources,
which may upset the original schedule and even make it infeasible. In practice, managers need to react quickly to unexpected disruptions and update current schedules in time. Therefore, compared with traditional shop scheduling problems, scheduling of a remanufacturing system is more challenging. Static scheduling theory cannot respond to disruptions in a dynamic environment. In order to minimize the impact of disruptions on the performance of the dynamic APRS, this paper focuses on exploring a practical and effective rescheduling strategy. Production scheduling for dynamic environments can be divided into three categories: robust pro-active scheduling, completely reactive scheduling and predictive–reactive scheduling [5,6]. The reprocessing of aircraft parts uses a lot of special processing equipment, which requires better performance and stability of the schedule. For this reason, the rescheduling strategy proposed in this paper is based on predictive–reactive scheduling, which generates an optimized original schedule in advance and updates the schedule when a disruption occurs. In predictive–reactive scheduling, the original schedule can be obtained through various static scheduling algorithms that have been extensively studied [7,8]. The reactive scheduling procedure consists of two steps:
⁎ Corresponding author. E-mail address: [email protected] (J. Zhang).
https://doi.org/10.1016/j.rcim.2019.02.008 Received 31 August 2018; Received in revised form 25 February 2019; Accepted 28 February 2019 Available online 21 March 2019 0736-5845/ © 2019 Elsevier Ltd. All rights reserved.
(1) rescheduling trigger; (2) rescheduling implementation.
In a dynamic manufacturing environment, determining the trigger point of rescheduling is of great importance [9]. Three commonly used triggering mechanisms have been proposed in the literature: periodic, event-driven, and hybrid [10]. The periodic mechanism revises the schedule periodically and implements the resulting schedules on a rolling-horizon basis [11]. Compared with the other two mechanisms, periodic scheduling yields higher schedule stability because it is driven only by a preset period without considering specific disruptions. Accordingly, determining the period length is a key issue for periodic scheduling. The event-driven mechanism updates the schedule every time a disruption that alters the system status occurs. However, for large-scale dynamic systems with frequent disruptions, if the trigger responds to all disruptions without distinguishing them by degree of impact, it may lead to poor schedule stability and great computational complexity [12]. The hybrid policy revises the schedule periodically and also whenever a major disruption occurs [13]. Thus, if properly designed, it can achieve high stability while keeping computational complexity low. In practical applications, a feasible triggering mechanism needs to be designed for the specific problem, depending on the type of manufacturing system, the scheduling objective and the type of disruptions.
Rescheduling implementation updates the current schedule with a specific rescheduling algorithm. In recent years, many researchers have focused on different algorithms for rescheduling, and the common algorithms can be summarized into two strategies: full generation and repair-based [6]. The former reschedules the entire set of jobs, including unprocessed and newly inserted ones. Thus, almost all algorithms for static scheduling problems, including mathematical programming methods and various meta-heuristic algorithms, can be used here [3,14,15]. The latter only makes local adjustments to the current schedule in response to disruptions. The popular right shift rescheduling (RSR) method [16] keeps the job sequence of the original schedule and absorbs the impact of the disruption by postponing the start time of all operations that start after the disruption time. The match-up scheduling method [17] tries to find a so-called matchup point and reschedules the unprocessed operations before the matchup point. This method can effectively absorb the impact of the disruption by making full use of the equipment idle time in the original schedule. The repair-based methods preserve the original schedule as much as possible, and thus they are widely used in scenarios that require a quick reaction to disruptions.
Although the above-mentioned methods can be used to implement rescheduling, as reported in the literature, no rescheduling method suits all kinds of dynamic status [18–20]. An adaptive dynamic rule/method selection mechanism, as a data-mining-based technique, is more efficient and robust in a dynamic environment than the above-mentioned approaches because it can select the optimal method according to the real-time environment state. A series of classification algorithms for data mining, such as decision tree (DT), K-nearest neighbor (KNN) and support vector machines (SVM), have been widely used in dynamic method selection mechanisms. Li and Min [18] present an adaptive dispatching method that uses a linear regression model to find the relations between the dispatching rule and the real-time state information. Based on historical data, Shiue [19] adopts the SVM algorithm to make real-time scheduling decisions. Choi et al. [20] use the decision tree algorithm to select dispatching rules for operational issues in reentrant hybrid flow shops. In a practical dynamic environment, classic rescheduling methods are suitable for most scenarios, but in a few scenarios other methods perform better. Therefore, when dealing with this type of data set, conventional classification models are biased toward the majority classes and tend to misclassify minority class examples. The data imbalance among rescheduling method classes is a major challenge in this problem; it is ubiquitous in remanufacturing systems but rarely considered.
In this paper, an adaptive rescheduling strategy that combines an RPDI-based triggering mechanism with a CGAN/IRF-based classification model is proposed to deal with disruptions in dynamic APRS. Firstly, ten common disruptions, which have different degrees of impact on system performance, are defined and reclassified. Secondly, according to the different types of disruptions, a specific RPDI-based hybrid rescheduling triggering mechanism is designed. Then, a problem-related CGAN model is built to deal with the class imbalance in historical data. The processed data set is used to train an IRF model that captures the complicated nonlinear relationship between the real-time status and the scheduling method under a dynamic environment. Finally, case studies from an aircraft parts remanufacturing enterprise are solved to examine the performance of the proposed algorithm.
The remainder of this paper is organized as follows. Section 2 introduces the characteristics of the APRS and describes the proposed method. The RPDI-based hybrid triggering mechanism is presented in Section 3. Section 4 introduces the details of the adaptive rescheduling method. The numerical experiments and case study are designed, and the results are reported and discussed, in Section 5. Section 6 concludes the paper.
2. Problem description
2.1. Dynamic aircraft parts remanufacturing systems
An APRS contains a series of specific production processes that transform end-of-life aircraft components into “like-new” remanufactured products. Compared with traditional production, remanufacturing is an economically and environmentally preferable option since materials are recyclable and cheap [21]. Although the production process of APRS is similar to that of a general manufacturing facility, it has several distinctive features. First, aircraft parts remanufacturing is custom-oriented production, which means that changes in orders directly affect the production schedule. Second, compared with an ordinary processing workshop, the logistics system of APRS is less flexible, because aircraft parts are large in size and complicated to assemble. Therefore, APRS has higher requirements for the stability of the schedule. Third, the remanufacturing of aircraft parts is a high-tech process with high added value, so the production process uses a lot of expensive processing equipment. For example, thermal barrier coating equipment used to process aircraft engines costs up to millions of dollars, and the high depreciation costs of such equipment impose higher requirements on production efficiency. Such features bring new problems to production scheduling, which can be summarized as follows:
• More frequent changes in job priority. There are many types of disruptions in the reprocessing shop, including unavailable material caused by schedule changes in the upstream disassembly shop, and urgent job arrival or job cancellation caused by schedule changes in the downstream reassembly shop.
• Uncertain processing time and more reworked jobs. The quality and composition of returned components vary, and the processing technology for these components is complicated, which leads to wide variation in processing time. Furthermore, it is difficult to achieve high dimensional accuracy through one-time processing, so the repair rate is relatively high.
• APRS needs an optimized schedule to ensure high equipment utilization. An optimized schedule can reduce the equipment idle time, which affects the efficiency of the production system and the economy of the equipment.
In summary, APRS contains masses of uncertainties which have a great impact on production scheduling. Therefore, in order to obtain a feasible and effective schedule, the above properties must be fully considered.
Table 1 Disruptions in APRS.
Disruption | Type | Effect
Machine breakdown | Abrupt | Machine unavailable for a period
Tool breakdown | Abrupt | Machine unavailable for a period
Variation of set-up times | Gradual | Change in start/end time
Unavailability of raw material | Gradual | Change in start/end time
Delay in transport | Gradual | Change in start/end time
Variation of process time | Gradual | Change in end time
Change of priority | Gradual | Job is required earlier or later than its scheduled time
Cancellation of job order | Abrupt/Gradual | Job is no longer required to be produced
Arrival of a new job order | Abrupt/Gradual | New job has to be immediately inserted in the schedule
Rework | Abrupt | Some operations of the job are required to be redone
2.2. Definition and classification of disruptions
The highly dynamic manufacturing environment causes a lot of uncertainty in the production scheduling of the remanufacturing shop [22]. In order to make effective rescheduling policies for different types of disruptions, it is necessary to classify and analyze the disruptions. As shown in Table 1, ten common disruptions in APRS are considered in this paper. The basic hypotheses for the considered disruptions include the following.
1) Machine breakdown and tool breakdown occur randomly, but the repair duration can be obtained by technical detection.
2) Variations of times related to production resources can be obtained from the real-time materials information system.
3) A change in job priority can be reflected by the due date.
In general, disruptions have been classified into two types: resource-related disruptions and job-related disruptions [6]. However, this classification cannot fundamentally reveal the effects of different disruptions on production scheduling. According to their different impacts on the production environment, we reclassify disruptions into two categories: abrupt disruptions and gradual disruptions. An abrupt disruption directly impacts the original schedule. If it is not dealt with immediately, the original schedule is interrupted due to changes in resources or jobs. A gradual disruption changes the system mildly but can have severe consequences for production efficiency as it accumulates. Under the influence of a gradual disruption, jobs can still be processed in the sequence of the original schedule, but the performance may deteriorate. Some disruptions may belong to different categories under different conditions. For example, when a job that is being processed is cancelled, the disruption is abrupt. However, if the cancelled job has not yet been processed, the disruption is considered gradual because it only causes additional equipment idle time in the original schedule.
An abrupt disruption can be dealt with by an event-driven rescheduling strategy that triggers rescheduling when the disruption occurs. However, it is relatively difficult to determine a clear triggering condition for a gradual disruption. If gradual disruptions are not responded to in time, the equipment utilization may be low and the performance of the original schedule may be degraded. Conversely, if all gradual disruptions, whether slight or severe, are responded to in time, the status of the shop floor may change frequently and the computational burden of the scheduling system will be large. Therefore, in order to design an effective rescheduling strategy, managers should explore the tradeoff between the efficiency of the schedule and the stability of the production process. In fact, this is also the original motivation for the strategy proposed below.
2.3. Adaptive rescheduling strategy
The system status of APRS changes frequently. As reported in the literature [18–20], dynamically selecting the rescheduling method at each rescheduling point improves system performance more effectively than executing a single rescheduling method. The basic idea of the proposed rescheduling strategy is to provide managers with advice on when and how to deal with these disruptions. Such a strategy should be adaptive because it should be able to identify the real-time status and then determine the appropriate method to be used. The proposed adaptive rescheduling strategy consists of two parts: the triggering mechanism and the selection of the rescheduling method.
1) Triggering mechanism design
This stage determines the triggering mechanism based on the analysis of the category and impact of the disruptions. Different from the traditional periodic and event-driven approaches, this study focuses on the frequent gradual disruptions and designs a two-step rescheduling triggering mechanism. Firstly, the type and RPDI of the disruption are determined. Secondly, urgent or influential disruptions are responded to immediately, while disruptions with a small degree of influence are responded to together after a period of time. Through this mechanism, we can respond to different disruptions efficiently and in a timely manner.
2) Rescheduling method selection
This stage adaptively selects an appropriate rescheduling method based on the status at the rescheduling point. A data-driven adaptive rescheduling model is proposed, which includes a CGAN-based data augmentation model, an IRF classifier and a genetic algorithm-based parameter optimizer. It is worth noting that the core of the adaptive rescheduling strategy is to mine the adaptability of different rescheduling methods under different statuses rather than the performance of a single method.
3. RPDI-based hybrid triggering mechanism
In APRS, a large number of gradual disruptions occur frequently and affect the system to different degrees. In order to respond effectively to these disruptions, we propose a hybrid rescheduling triggering mechanism. The method consists of two steps: evaluation of the impact of disruptions and rescheduling point selection.
3.1. Disruption evaluation based on RPDI
Gradual disruptions lead to performance degradation of the schedule, and the degree of degradation changes as the disruptions accumulate. Considering the impact of the deviation of the scheduling optimization objective on the rescheduling decision, the concept of Relative Performance Deviation Index (RPDI)
is proposed. It can be defined as:

\[ \mathrm{RPDI} = \frac{\min(O_s) - \min(O_{s'})}{\min(O_s)} \tag{1} \]

where s is the original schedule, s′ is the affected original schedule, in which the system status has changed but the job sequence remains the same, and O_s is the makespan of schedule s. RPDI thus evaluates the degree of performance degradation of the original schedule if it is not updated when a gradual disruption occurs. Furthermore, we propose three kinds of response measures (RM) based on the impact of gradual disruptions: immediate rescheduling (IR), delayed rescheduling (DR), and neglect (NE). Before the disruption evaluation is carried out, a threshold pair (α, β) for RPDI is predetermined. During scheduling, when a disruption occurs, the corresponding response measure is selected by calculating RPDI from the real-time status information:

\[ \mathrm{RM} = \begin{cases} \mathrm{NE}, & \text{if } \mathrm{RPDI} < \alpha \\ \mathrm{DR}, & \text{if } \alpha \le \mathrm{RPDI} \le \beta \\ \mathrm{IR}, & \text{otherwise} \end{cases} \tag{2} \]

The threshold of RPDI directly affects the number of triggers and should be determined according to the problem specification. Therefore, different threshold values and the resulting performance of disruption evaluation are tested with extensive simulation experiments (see Section 5.2).

3.2. Selection of rescheduling point
After identifying the impact of disruptions, the rescheduling point for each type of disruption is selected. In Section 2.2, all disruptions were classified into two broad categories: abrupt and gradual disruptions. Gradual disruptions are further divided into three types based on their RPDI. In order to balance the performance and stability of the original schedule, we designed a hybrid rescheduling triggering mechanism that uses an event-driven strategy to deal with urgent disruptions and reduces the rescheduling frequency by defining a minimum rescheduling period (ΔTmin). It can be represented as:
Step 1. Implement the current schedule until a disruption occurs.
Step 2. Distinguish the type of the disruption. When an abrupt disruption occurs, it must be responded to immediately. When a gradual disruption occurs, the response depends on the RM of the current system status. If RM = IR, the response is the same as for an abrupt disruption. If RM = DR, we check whether the interval between the current time point and the last rescheduling (Δt) exceeds ΔTmin. If so, the disruption is responded to immediately. If not, rescheduling is triggered after ΔTmin − Δt. If RM = NE, the disruption is ignored because of its slight effect on the system.
Step 3. Trigger rescheduling method selection.
The system status parameters are updated based on the effects of the disruption. These parameters are then entered into the rescheduling strategy model as input information. (Note that once rescheduling is triggered, all DR-type disruptions accumulated during the previous interval are considered.)

4. Adaptive rescheduling strategy using the CGAN/IRF
4.1. Adaptive rescheduling strategy description
According to the description of Park et al. [23], a dynamic rescheduling strategy can be represented by a four-tuple, {F, O, M, D}, where F is the set of candidate features describing the system status, O represents the specific performance criteria, and M denotes the set of candidate rescheduling methods. At a real-time dynamic status with features F, the rescheduling method m ∈ M is activated by a data mining algorithm D with performance criterion O. The data mining algorithm D is used to implement a fast mapping from features F to methods M.
4.1.1. Candidate system status features
Defining the set of features is the core task in supervised machine learning and affects both the accuracy of classification and the computational efficiency. This study aims to identify appropriate system parameters that accurately and efficiently describe the current production scenario. Mirshekarian and Šormaz [24] studied the correlation of job-shop scheduling problem features with scheduling efficiency. Based on this, we developed a preliminary set of 35 features covering system status information for jobs and machines, as shown in Table 2.

Table 2 System features used in the study.
Domain | ID | Description
Overall | 1–2 | Number of unprocessed jobs and operations
Overall | 3–4 | Total processing time of unprocessed jobs and operations
Machines | 5–11 | Broken-down machine number (7 machines in total)
Machines | 12–13 | Time point of machine breakdown, repair duration
Machines | 14–18 | Mean, median, standard deviation (STD), min and max of machine utilization
Machines | 19–20 | Number and total process time of operations affected during repair duration
Jobs/operations | 21–22 | Number and total process time of cancelled jobs
Jobs/operations | 23 | Whether the cancelled task is being processed
Jobs/operations | 24–25 | Number and total process time of new jobs
Jobs/operations | 26–29 | Mean, median, min and max of priority of new jobs (ranking of due date)
Jobs/operations | 30–31 | Number and total process time of reworked operations
Jobs/operations | 32 | Number of operations with process time changes
Jobs/operations | 33 | The cumulative deviation of operation process time
Jobs/operations | 34 | Number of jobs with release time changes
Jobs/operations | 35 | The cumulative deviation of job release time

4.1.2. Performance criteria
According to the practical requirement, we set a combined objective O(Sori, Snew) that considers both stability and efficiency as the performance measurement. The stability index is defined as the mean deviation of the start times of the operations:

\[ \mathrm{SI}(S_{ori}, S_{new}) = \sigma \cdot \frac{\sum_{i=1}^{n} \left| ST_{i,new} - ST_{i,ori} \right|}{n} \tag{3} \]

where Sori and Snew are the original and new schedules, respectively; n is the total number of operations of the jobs; STi,new and STi,ori are the start times of operation i in the new and original schedules, respectively; and σ is a coefficient used to balance the magnitudes of the two sub-objectives. The efficiency index is defined as the mean difference between the equipment utilizations under the original and new schedules:

\[ \mathrm{EI}(S_{ori}, S_{new}) = \frac{\sum_{j=1}^{m} \left( P_{j,ori} - P_{j,new} \right)}{m} \tag{4} \]

where m is the number of machines, and Pj,ori and Pj,new are the equipment utilizations of machine j in the original and new schedules, respectively. The combined objective is

\[ O(S_{ori}, S_{new}) = \lambda_s \cdot \mathrm{SI}(S_{ori}, S_{new}) + \lambda_e \cdot \mathrm{EI}(S_{ori}, S_{new}) \tag{5} \]

where λs and λe are weight coefficients that represent the importance of the stability index and the efficiency index, respectively. The smaller O is, the
Robotics and Computer Integrated Manufacturing 58 (2019) 230–238
P. Zheng, et al.
After obtaining the loss function of discriminator and generator, we use the gradient descent method to train the neural network. In the actual training process, in order to prevent the generator from overfitting, the discriminator needs to be updated multiple times whenever the generator is updated. In this way, the counterbalance between the generator and the discriminator is maintained.
better the new schedule is. 4.1.3. Candidate rescheduling methods As reported in literatures, there is no rescheduling method that suits all kinds of system status [18]. In the case of considering complex performance criteria, the performance of different rescheduling methods is quite different under the same system status. In this study, we chose the full generation method, the RSR method [16] and three improved algorithms based on the match-up method [17] as candidate rescheduling methods. The basic idea of original match-up scheduling method is to find a so called matchup point and reschedule the unprocessed operations before matchup point. However, the estimation of the matchup point is complicated. To decrease the computational cost, we designed three improved MUR algorithms, namely MUR_1, MUR_2, and MUR_3, which respectively reschedule the subsequent 10%, 20%, and 30% operations. If the impact of the disruption was absorbed by the idle time during this period, the original schedule can be followed again. If not, we employ RSR to adjust the remaining schedule.
4.3. Adaptive random forest classifier We design an improved random forest (IRF) as the classifier to select optimal rescheduling method. The hyper-parameters of IRF are optimized by using GA. 4.3.1. Improved random forest classifier Random forest (RF) is a classification and prediction algorithm proposed by Breiman [29], which is composed of a number of decision trees. The training set of each decision tree is derived from the Bagging method and sampled in the original training set. During the growth of individual decision tree, each node randomly chooses best splitting feature from a subset of all features according to the information gain. Finally, each tree decides the class for each sample by voting. We designed an improved random forest (IRF) algorithm, which can effectively improve the diversity of trees and the convergence speed of the algorithm. The IRF selects the features with low correlation with the used features as the feature subset of new tree. In this way, IRF can effectively reduce the correlation between decision trees and improve the classification accuracy. In the conventional RF, the features set of the new tree is randomly selected from all features. Although this method is simple, it may cause many repeated features between two trees, which may lead to a reduction in the diversity of the trees. The motivation for IRF is the roulette game. The features subset of the first tree is obtained in a random way. After that, we sort all the features before choosing the features set that are used to split the new tree. The sorting rule is based on the number of times the feature appears in the feature sets of existing trees, which can be defined as:
4.2. Data augmentation of the raw dataset by CGAN The class imbalance naturally present in many industrial domains [25,26], where one or several classes have many more instances than the remaining classes. In this study, the training set contains 1415 total examples, with 1096 (77.5%) RSR method and 319 (22.5%) remaining four methods. In this case, a classifier can obtain high prediction accuracy by predicting all new samples as the RSR method. However, such a classifier is of no value because it cannot predict the remaining methods, which may perform significantly better than RSR in a small number of cases. A common technique used to combat class imbalance is to increase the size of raw dataset through data augmentation, that is, to construct virtual training examples from real training examples. Conditional generative adversarial networks (CGAN) [27] are an emerging technique for data augmentation, which can be characterized by training a pair of networks (generator G and discriminator D) in competition with each other and generating a large amount of simulation data by learning the distribution of real samples. Different from the original GAN, CGAN adds extra information c (c is the label of the sample in this article) to the input of generator and discriminator. To generate more training samples, we built a CGAN and used it for data augmentation (seeFig. 2). We construct a five-layer fully connected neural network with 20 neurons in each of the three hidden layers as the generator network. The input is a 40-dimensional noise vector that obeys a uniform distribution. The output of this network is a 35-dimensional simulation samples. The activation function of hidden layer is a softplus function of [28] and the activation function in the output layer is tanh. The discriminator network has a similar structure. The input consists of real samples and simulation samples generated by the generator. 
We used logistic function in the output layer to get the probability that the sample is from the real training set. In this study, we use the adversarial loss formulated in [27], which optimizes the mini-max of generator G and discriminator D:
$$\min_{G} \max_{D} L_{GAN}(G, D) \tag{6}$$

$$L_{GAN}(G) = \frac{1}{m} \sum_{i=1}^{m} \log\left(1 - D(G(z^{(i)}))\right) \tag{7}$$

$$L_{GAN}(D) = \frac{1}{m} \sum_{i=1}^{m} \left[\log D(x^{(i)}) + \log\left(1 - D(G(z^{(i)}))\right)\right] \tag{8}$$

where $D(x^{(i)})$ is the probability that the discriminator considers $x^{(i)}$ to be a real sample, and $1 - D(G(z^{(i)}))$ is the probability that the discriminator considers the generated sample to be fake.

$$\rho_i = \frac{1/n_i}{\sum_{j=1}^{k} 1/n_j} \tag{9}$$

where $k$ is the number of total features and $n_i$ is the number of times that feature $i$ appears in the feature sets of the existing trees. In this way, features that were previously used less often have a greater probability of being selected. While growing a new tree, we use a conventional entropy-based measure for node splitting. Finally, the plurality voting method is used to obtain the class of the sample as follows:

$$H(x) = c_{\arg\max_{j} \sum_{c=1}^{T} h_c^{j}(x)} \tag{10}$$

where $T$ is the number of trees and $h_c^{j}(x) \in \{0, 1\}$ is the output of tree $c$ on class $j$.

4.3.2. Genetic algorithm

There are three hyper-parameters in the proposed IRF model, which can be tuned to improve the prediction accuracy of the model (see Table 3). The procedure for GA-based hyper-parameter optimization is summarized as follows.

Step 1. Initialize the parameters of the GA, such as the population size, the maximum number of generations, and the crossover/mutation probabilities. Define the range of values for each of the three hyper-parameters.

Step 2. Randomly generate three-segment individuals with binary coding until the number of individuals reaches the population size. The length of the individual depends on the maximum values of the hyper-parameters.
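A minimal sketch of the GA procedure of Section 4.3.2 (binary-coded three-segment individuals, binary tournament selection, crossover, mutation, and a final binary-to-decimal conversion) is given below. The search ranges, the GA settings, the one-point crossover, the elitist population update, and the `fitness` function are all hypothetical. In particular, `fitness` is only a stand-in for the cross-validated IRF accuracy so that the example runs on its own.

```python
import random

# Assumed inclusive search ranges for the three hyper-parameters of Table 3.
RANGES = {"max_features": (1, 35), "n_estimators": (10, 200), "min_sample_leaf": (1, 20)}
# Segment length: number of bits needed to cover each range.
BITS = {name: (hi - lo).bit_length() for name, (lo, hi) in RANGES.items()}


def decode(individual):
    """Binary-to-decimal conversion of the three segments (Step 4)."""
    params, pos = {}, 0
    for name, (lo, hi) in RANGES.items():
        segment = individual[pos:pos + BITS[name]]
        pos += BITS[name]
        value = int("".join(map(str, segment)), 2)
        params[name] = min(lo + value, hi)  # clip codes that overshoot the range
    return params


def fitness(individual):
    """Hypothetical stand-in for the classification accuracy of an IRF model."""
    p = decode(individual)
    penalty = (abs(p["max_features"] - 12) / 35
               + abs(p["n_estimators"] - 120) / 200
               + abs(p["min_sample_leaf"] - 3) / 20)
    return 1.0 - penalty / 3


def tournament(population, scores, rng):
    """Binary tournament: the fitter of two randomly drawn individuals wins."""
    a, b = rng.randrange(len(population)), rng.randrange(len(population))
    return population[a] if scores[a] >= scores[b] else population[b]


def run_ga(pop_size=20, generations=30, p_cross=0.8, p_mut=0.02, seed=0):
    rng = random.Random(seed)  # Step 1: GA parameters are the function arguments
    length = sum(BITS.values())
    # Step 2: random binary individuals.
    population = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):  # Step 3
        scores = [fitness(ind) for ind in population]
        children = []
        while len(children) < pop_size:
            p1 = tournament(population, scores, rng)
            p2 = tournament(population, scores, rng)
            c1, c2 = p1[:], p2[:]
            if rng.random() < p_cross:  # one-point crossover (assumed operator)
                cut = rng.randrange(1, length)
                c1, c2 = p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]
            for child in (c1, c2):  # bit-flip mutation
                children.append([bit ^ 1 if rng.random() < p_mut else bit for bit in child])
        # Update the population: keep the fittest among parents and children.
        merged = sorted(population + children, key=fitness, reverse=True)
        population = merged[:pop_size]
        history.append(fitness(population[0]))
    return decode(population[0]), history  # Step 4


best_params, history = run_ga()
```

Replacing `fitness` with a function that trains the classifier and returns its validation accuracy turns the sketch into an actual tuner; the elitist update then guarantees that the best accuracy found never decreases between generations.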
Fig. 1. The flowchart of adaptive rescheduling strategy.
Fig. 2. The CGAN architecture.
Table 3
Hyper-parameters considered in this study.

| Hyper-parameter | Description |
| --- | --- |
| max_features | The maximum number of features in an individual tree |
| n_estimators | The number of trees |
| min_sample_leaf | The minimum leaf size |

Table 4
Case study description.

| Case | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| (λs, λe) | (0, 1) | (0.2, 0.8) | (0.4, 0.6) | (0.5, 0.5) | (0.6, 0.4) | (0.8, 0.2) | (1, 0) |

Table 5
Classification accuracy (%) for different methods.

| Case | AdaBoost | MLP | SVM | IRF |
| --- | --- | --- | --- | --- |
| 1 | 81.67 | 80.56 | 88.56 | 91.67 |
| 2 | 83.89 | 83.33 | 87.32 | 90.56 |
| 3 | 82.22 | 82.78 | 89.22 | 88.89 |
| 4 | 82.78 | 80.56 | 88.25 | 89.44 |
| 5 | 85.56 | 87.78 | 89.93 | 91.11 |
| 6 | 86.11 | 85.56 | 90.15 | 91.67 |
| 7 | 84.44 | 84.44 | 89.56 | 89.44 |
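Section 5.3 reports average improvements of IRF over AdaBoost and MLP of 7.9% and 8.2%. As a quick arithmetic check, the sketch below recomputes these figures from Table 5 under the assumption that they are relative gains in mean accuracy over the seven cases; the variable and function names are illustrative only.

```python
# Accuracy (%) for cases 1-7, copied from Table 5.
TABLE5 = {
    "AdaBoost": [81.67, 83.89, 82.22, 82.78, 85.56, 86.11, 84.44],
    "MLP": [80.56, 83.33, 82.78, 80.56, 87.78, 85.56, 84.44],
    "SVM": [88.56, 87.32, 89.22, 88.25, 89.93, 90.15, 89.56],
    "IRF": [91.67, 90.56, 88.89, 89.44, 91.11, 91.67, 89.44],
}


def mean(values):
    return sum(values) / len(values)


def relative_gain(method, baseline):
    """Relative improvement (%) of the mean accuracy of `method` over `baseline`."""
    return 100.0 * (mean(TABLE5[method]) / mean(TABLE5[baseline]) - 1.0)


gain_adaboost = relative_gain("IRF", "AdaBoost")
gain_mlp = relative_gain("IRF", "MLP")

# Best method in each case (index 0 is case 1).
best_per_case = [max(TABLE5, key=lambda name: TABLE5[name][i]) for i in range(7)]
```

Under this reading the gains round to 7.9% and 8.2%, and SVM is the best method only in cases 3 and 7, which agrees with the discussion of Table 5.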
Step 3. Repeat the following iteration until the maximum number of generations is reached.
1) Perform IRF on each individual in the population and calculate the classification accuracy.
2) Select individuals with the binary tournament selection method.
3) Generate new individuals through crossover and mutation operations.
4) Perform IRF and evaluate the classification accuracy of the new individuals.
5) Update the population based on the individuals' fitness.

Step 4. Output the individual with the best fitness. Perform a binary-to-decimal conversion to get the values of the three hyper-parameters.

5. Experiment and analysis

To test the performance of the rescheduling strategy proposed in this study, a series of computational experiments was carried out on a number of test instances, and the results are reported in this section.

5.1. Case study description

An industrial example is used to validate the proposed adaptive rescheduling strategy. The dataset used to train and test the model comes from the reprocessing shop of a part processing company. There are five main machine groups in the shop, i.e., part cutting, thermal spraying, laser processing, electro-brush plating and blank forming. In real-time production, the impact of disruptions on the schedule is mainly reflected in the bottleneck machines, so we only consider the rescheduling of the thermal spraying and laser processing machine groups. The original data of the reprocessing shop were recorded from June 12, 2017 to April 8, 2018. According to the practical requirement, we split the original data by the weekly scheduling period. The disruptions were obtained from the event log of the MES. With the proposed hybrid triggering mechanism, a total of 2021 rescheduling points were recorded during the period. Training examples were generated by executing a simulation run for every candidate rescheduling method under the same scenario of system status parameters recorded at the rescheduling point. These examples are used to build the relational model between the system status parameters and the optimal rescheduling method. After preprocessing, the comprehensive data set was randomly sampled and divided into a training set of 1617 examples (80% of the data) and a test set of 404 examples (20% of the data). To demonstrate the effectiveness of the proposed strategy, we designed 7 different cases by adjusting (λs, λe) in Eq. (5) (see Table 4).

5.2. Pilot test of hybrid triggering mechanism

The threshold (α, β) of the RPDI proposed in Section 3.1 is the key control parameter of the trigger, and its value directly affects the stability of the system. Based on the historical schedule data, we tested the trigger times of gradual disruptions under different threshold values to determine an appropriate threshold. According to the practical requirement, we let ΔTmin equal 180 min. Taking one week as a scheduling period and 60 min as the step size, we calculate the RPDI at each time point. Fig. 3 presents, per week and under different threshold values, the average total number of triggers caused by gradual disruptions and the average number of gradual disruptions accumulated in every interval. We let β = α + 0.02. The results show that when the threshold is greater than (0.16, 0.18), most disruptions are ignored and almost no rescheduling is triggered during the production process. In contrast, when the threshold is too small, the number of rescheduling triggers is
Fig. 3. Result of different threshold values.
Fig. 4. Performance of different methods.
Fig. 5. Confusion matrices of Case 1 by IRF.
significantly increased, resulting in frequent changes in the schedule. Therefore, we set the threshold value to (0.08, 0.10), under which the total number of triggers gradually stabilizes and each rescheduling can simultaneously respond to several accumulated gradual disruptions.

5.3. Performance measures of the CGAN/IRF

We first compare the performance of the proposed IRF with three other commonly used classification approaches: AdaBoost, multi-layer perceptron (MLP) and support vector machine (SVM). The hyper-parameters of IRF are optimized by GA. Since ours is a multi-class classification problem, we use the AdaBoost.M2 algorithm, and for SVM we use the one-versus-rest policy. The values of the other parameters of the comparative algorithms are set through cross validation. We report the classification accuracy of the different methods on each dataset. The classification accuracy is estimated in our experiments with 10-fold cross-validation and is formulated as follows.

$$\mathrm{Accuracy} = \frac{n_c}{n_t} \tag{11}$$

where $n_c$ is the number of correctly classified samples and $n_t$ is the total number of samples in the test set. In the experiment, we take a dataset and run each method 10 times. The results are given in Table 5 and Fig. 4. The results show that in most of the cases IRF outperforms the other methods. On average, the IRF method is significantly better than AdaBoost and MLP, with 7.9% and 8.2% improvement in classification accuracy, respectively. For case 3 and case 7, SVM performs best among the compared methods, but for these two data sets IRF is still significantly better than the other two algorithms. Note that the cases in the experiment are designed with different values of the coefficients in Eq. (5), so the results also show the excellent performance of the proposed method under different performance criteria.

Next, we evaluate the performance of the proposed CGAN. We use the geometric mean of the classification accuracy of every class (G-mean) as the performance metric in this study. G-mean is defined as [30]

$$G\text{-}mean = \left( \prod_{i=1}^{m} \frac{tr_i}{n_i} \right)^{\frac{1}{m}} \tag{12}$$

where $m$ is the number of classes, $n_i$ is the number of examples in class $i$, and $tr_i$ is the number of correctly classified examples in class $i$. Fig. 5 shows the confusion matrices of case 1 using the original data set and the data set augmented by CGAN. As mentioned in Section 4.2, the original data set is a class-imbalanced data set, in which samples of the first class (RSR method) account for a large proportion. This leads the classification algorithm to obtain high classification accuracy on the class with more samples, while new samples belonging to other classes are easily misclassified into that class. The G-mean of the confusion matrix shown in Fig. 5(a) is 86%, and more than 12.5% and 15% of the samples belonging to category 2 and category 4, respectively, are misclassified as category 0. Compared with the original data set, CGAN-based IRF achieves a 4.1% improvement in G-mean, and the classification accuracy of each class is more uniform and stable. In fact, statistical results show that CGAN can improve the G-mean by 3.5%–5% in all cases.
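To make the difference between the two metrics concrete, the following sketch evaluates Eq. (11) and Eq. (12) on a small confusion matrix. The matrix is hypothetical and is not taken from Fig. 5; the function names are illustrative.

```python
import math


def accuracy(confusion):
    """Eq. (11): correctly classified samples over all samples."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total


def g_mean(confusion):
    """Eq. (12): geometric mean of the per-class accuracies tr_i / n_i."""
    per_class = [row[i] / sum(row) for i, row in enumerate(confusion)]
    return math.prod(per_class) ** (1.0 / len(per_class))


# Hypothetical imbalanced 3-class example (rows: true class, columns: predicted class).
confusion = [
    [90, 5, 5],   # majority class: 90 of 100 correct
    [8, 10, 2],   # minority class: 10 of 20 correct
    [6, 2, 12],   # minority class: 12 of 20 correct
]

acc = accuracy(confusion)  # 112 / 140 = 0.8
gm = g_mean(confusion)     # (0.9 * 0.5 * 0.6) ** (1 / 3)
```

In this example the overall accuracy is dominated by the majority class, whereas the G-mean is pulled down by the poorly recognized minority classes, which is why it is the more informative metric for an imbalanced data set.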
6. Conclusions

This article presents an efficient and easily implemented rescheduling strategy for APRS, which can adaptively provide an optimized rescheduling method under different system statuses. The improvements of this method comprise three points: (1) a data-driven rescheduling model is created that takes into account the efficiency and stability of the remanufacturing process; (2) the characteristics of various disruptions are analyzed and a hybrid triggering mechanism is designed according to the effects of the disruptions; (3) a CGAN/IRF-based classifier is created to classify the class-imbalanced rescheduling methods, and a genetic algorithm is designed to adaptively optimize the classifier parameters. The experimental results show that the adaptive rescheduling strategy yields a better system performance than the other three commonly used methods. In addition, the same framework can be used to train customized models by adjusting the performance criteria.

Acknowledgment

This work was supported by the Joint Funds of the National Natural Science Foundation of China under Grant No. U1637211.

Supplementary material

Supplementary material associated with this article can be found, in the online version, at doi:10.1016/j.rcim.2019.02.008.

References

[1] R Huang, S Zhang, X Bai, et al., Multi-level structuralized model-based definition model based on machining features for manufacturing reuse of mechanical parts, Int. J. Adv. Manuf. Technol. 75 (5–8) (2014) 1035–1048.
[2] N Chari, C Diallo, U Venkatadri, et al., Production planning in the presence of remanufactured spare components: an application in the airline industry, Int. J. Adv. Manuf. Technol. 87 (1–4) (2016) 1–12.
[3] C Li, F Liu, H Cao, et al., A stochastic dynamic programming based model for uncertain production planning of re-manufacturing system, Int. J. Prod. Res. 47 (13) (2009) 3657–3668.
[4] M Liu, C Liu, L Xing, et al., Study on a tolerance grading allocation method under uncertainty and quality oriented for remanufactured parts, Int. J. Adv. Manuf. Technol. 87 (5–8) (2016) 1265–1272.
[5] E Vieira G, W Herrmann J, E Lin, Rescheduling Manufacturing Systems: a Framework of Strategies, Policies, and Methods, J. Sched. 6 (1) (2003) 39–62.
[6] D Ouelhadj, S Petrovic, A survey of dynamic scheduling in manufacturing systems, J. Sched. 12 (4) (2009) 417.
[7] R Ruiz, The hybrid flow shop scheduling problem, Eur. J. Oper. Res. 205 (1) (2010) 1–18.
[8] P Sharma, A review on job shop scheduling with setup times, Proc. Inst. Mech. Eng. Part B J. Eng. Manuf. (2016).
[9] C Wang, P Jiang, Manifold learning based rescheduling decision mechanism for recessive disturbances in RFID-driven job shops, J. Intell. Manuf. (2016) 1–16.
[10] L Church, R Uzsoy, Analysis of periodic and event-driven rescheduling policies in dynamic shops, Int. J. Computer Integr. Manuf. 5 (3) (1992) 153–163.
[11] G Vieira, J Herrmann, E Lin, Analytical models to predict the performance of a single-machine system under periodic and event-driven rescheduling strategies, Int. J. Prod. Res. 38 (8) (2000) 1899–1915.
[12] L Jin, C Zhang, X Shao, et al., A study on the impact of periodic and event-driven rescheduling on a manufacturing system: an integrated process planning and scheduling case, Proc. Inst. Mech. Eng. Part B J. Eng. Manuf. 231 (3) (2016) 490–504.
[13] F Qiao, M Ma Y, C Zhou M, et al., A Novel Rescheduling Method for Dynamic Semiconductor Manufacturing Systems, IEEE Trans. Syst. Man Cybernet. Syst. 99 (2018) 1–11.
[14] Z Gao K, N Suganthan P, K Pan Q, et al., Artificial bee colony algorithm for scheduling and rescheduling fuzzy flexible job shop problem with new job insertion, Knowl.-Based Syst. 109 (2016) 1–16.
[15] K Gao, F Yang, C Zhou M, et al., Flexible Job-Shop Rescheduling for New Job Insertion by Using Discrete Jaya Algorithm, IEEE Trans. Cybernet. 99 (2018) 1–12.
[16] J Abumaizar R, A Svestka J, Rescheduling job shops under random disruptions, Int. J. Prod. Res. 35 (7) (1997) 2065–2082.
[17] S Akturk M, E Gorgulu, Match-up scheduling under a machine breakdown, Eur. J. Oper. Res. 112 (1) (1999) 81–97.
[18] L Li, Z Min, An efficient adaptive dispatching method for semiconductor wafer fabrication facility, Int. J. Adv. Manuf. Technol. 84 (1–4) (2016) 315–325.
[19] R Shiue Y, Data-mining-based dynamic dispatching rule selection mechanism for shop floor control systems using a support vector machine approach, Int. J. Prod. Res. 47 (13) (2009) 3669–3690.
[20] S Choi H, S Kim J, H Lee D, Real-time scheduling for reentrant hybrid flow shops: a decision tree based mechanism and its application to a TFT-LCD line, Expert Syst. Appl. 38 (4) (2011) 3514–3521.
[21] Y Du, Li C, Implementing energy-saving and environmental-benign paradigm: machine tool remanufacturing by OEMs in China, J. Cleaner Prod. 66 (3) (2014) 272–279.
[22] Jean-Pierre Kenné, Dejax P, A Gharbi, Production planning of a hybrid manufacturing–remanufacturing system under uncertainty within a closed-loop supply chain, Int. J. Prod. Econ. 135 (1) (2012) 81–93.
[23] C Park S, N Raman, J Shaw M, Adaptive scheduling in dynamic flexible manufacturing systems: a dynamic rule selection approach, IEEE Trans. Robot. Automat. (1997) 486–502.
[24] S Mirshekarian, N Šormaz D, Correlation of job-shop scheduling problem features with scheduling efficiency, Expert Syst. Appl. 62 (2016) 131–147.
[25] J Lin S, C Chang, F Hsu M, Multiple extreme learning machines for a two-class imbalance corporate life cycle prediction, Knowl.-Based Syst. 39 (3) (2013) 214–223.
[26] S Maldonado, R Weber, F Famili, Feature selection for high-dimensional class-imbalanced data sets using Support Vector Machines, Inf. Sci. 286 (2014) 228–246.
[27] M Mirza, S Osindero, Conditional generative adversarial nets, Comput. Sci. (2014) 2672–2680.
[28] X Glorot, A Bordes, Y Bengio, Deep sparse rectifier neural networks, International Conference on Artificial Intelligence and Statistics, 2012, pp. 315–323.
[29] L Breiman, Random forests, machine learning 45, J. Clin. Microbiol. 2 (2001) 199–228.
[30] Y Sun, Y Wang, Y Wang, Boosting For Learning Multiple Classes With Imbalanced Class Distribution, International Conference On Data Mining, IEEE, 2007, pp. 592–602.