A hybrid local search algorithm for scheduling real-world job shops with batch-wise pending due dates

A hybrid local search algorithm for scheduling real-world job shops with batch-wise pending due dates

Engineering Applications of Artificial Intelligence 25 (2012) 209–221 Contents lists available at ScienceDirect Engineering Applications of Artificial...

730KB Sizes 0 Downloads 8 Views

Engineering Applications of Artificial Intelligence 25 (2012) 209–221

Contents lists available at ScienceDirect

Engineering Applications of Artificial Intelligence journal homepage: www.elsevier.com/locate/engappai

A hybrid local search algorithm for scheduling real-world job shops with batch-wise pending due dates Rui Zhang a,, Cheng Wu b a b

School of Economics and Management, Nanchang University, Nanchang 330031, PR China Department of Automation, Tsinghua University, Beijing 100084, PR China

a r t i c l e i n f o

abstract

Available online 25 May 2011

This paper aims at solving a real-world job shop scheduling problem with two characteristics, i.e., the existence of pending due dates and job batches. Due date quotation is an important decision process for contemporary companies that adopt the MTO (make to order) strategy. Although the assignment of due dates is usually performed separately with production scheduling, there exist strong interactions between the two tasks. Therefore, we integrate these two decisions into one optimization model. Meanwhile, each order placed by the customer defines a batch of jobs, for which the same due date should be set. Thus, the completion times of these jobs should be close to one another in order to reduce waiting time and cost. For this purpose, we propose a dispatching rule to synchronize their manufacturing progresses. A two-stage local search algorithm based on the PMBGA (probabilistic model-building genetic algorithm) and parameter perturbation is proposed to solve the integrated scheduling problem and its superiority is revealed by the applications to a real-world mechanical factory. & 2011 Elsevier Ltd. All rights reserved.

Keywords: Due date assignment Real-world job shop scheduling Probabilistic model-building genetic algorithm Local search Weighted tardiness

1. Introduction The standard job shop scheduling problem (JSSP) has been widely adopted as a model in the research of combinatorial optimization algorithms. There are several objective functions to be considered in theoretical investigations, of which the most frequently studied is the criterion of makespan (a.k.a. maximum completion time of all jobs) (Gonc- alves et al., 2005; Hasan et al., 2008, 2009). However, due-date-related performances are becoming increasingly important for manufacturing firms nowadays, because timely delivery of goods is closely related with service reputation (a source of sustainable profit). For this reason, the objective functions based on due dates are much more meaningful and relevant for practical scheduling than the makespan. The total weighted tardiness (TWT) criterion, which we will consider in this paper, is one such objective. Although some purely theoretical research has been done on JSSP with the objective of minimizing TWT (abbreviated as TWTJSSP) (Zhou et al., 2009; Essafi et al., 2008; Zhang and Wu, 2011), there are two aspects where the theoretical model significantly differs from real-world production environments. 1. The specification of due dates: In theoretical TWT-JSSP models, the due date of each job is an exogenously determined factor, which must be treated as an input to the algorithm used for

 Corresponding author.

E-mail address: [email protected] (R. Zhang). 0952-1976/$ - see front matter & 2011 Elsevier Ltd. All rights reserved. doi:10.1016/j.engappai.2011.04.014

solving the TWT-JSSP (Zhou et al., 2009). However, in order to cope with the fluctuating market demand under the background of economic crisis, most modern companies are adopting the make-to-order (MTO) production mode in place of make-to-stock. In MTO, the factory will not manufacture or assemble the final products until it has received official orders from its customers. Actually, when a customer places an order, the firm must ‘‘quote’’ a due date (sometimes referring to a lead time) together with a price for this order. Of course, if the customer is not satisfied with the initial due date (price) quotation, there may arise several rounds of negotiations between the firm and the customer until consensus is reached, or otherwise, the customer may simply turn to another supplier and this firm will lose the business. In other words, for real-world manufacturing firms, due dates are not solely determined by the outside customers but ultimately set according to the will of both the manufacturer and the customers. Sometimes, the role of the company itself is even greater if the company maintains a monopoly position in the industry. Finally, the point is that, due dates should be regarded as decision variables rather than unchangeable input when we consider real-world scheduling. 2. The existence of batches: In theoretical JSSP models, the jobs are assumed to be independent of one another, and each job can have mutually unrelated property values (esp. in randomly generated test instances). In real MTO environments, an order that the firm receives usually includes a number of (different) products ordered at the same time. In other words, a set of jobs arrive simultaneously at the firm and require to be processed

210

R. Zhang, C. Wu / Engineering Applications of Artificial Intelligence 25 (2012) 209–221

as a whole. For the convenience of the customer, it is generally demanded that the set of jobs should be delivered on the same day (i.e., they will have the same due date in the scheduling system). In addition, these jobs may have identical values for other properties, e.g., the weight which describes the importance level of the job to the manufacturer (identical weights because they all come from the same customer). Such a set of jobs is called a batch in terms of scheduling. The production scheduling process must guarantee that the completion times of a batch of jobs should not be far apart from each other, or else there will arise inventory holding costs for the jobs finished considerably earlier than the others in the same batch. Therefore, the existence of job batches requires some special treatment in the design of a real-world scheduling algorithm. From the above point (1), we see that due date quotation is actually a very important decision process on the part of the manufacturer (Lucas and Moses, 2005). The due dates for most orders are initially promised and then fixed by MTO firms themselves rather than the external customers. In this process, the due date setter has to make a trade-off between delivery speed and delivery reliability (Baykaso˘glu et al., 2008; Alpay and ¨ ug ¨ ull ¨ u, ¨ 2009): Yuz

 If the firm quotes a lead time which is notably shorter than



that of its competitors, then it will have more chance of obtaining the order. But at last it may probably fail to deliver on time because the tight due date cannot be achieved by the production scheduling system, and consequently, the firm’s goodwill might be damaged. On the other hand, if the firm quotes a long lead time, it can keep its timely delivery reputation with a higher level of confidence. But it may have already lost the order in fierce market competition.

So, the process of setting the due date can be defined as an optimization problem. For the optimization of theoretical job shop schedules, the only contributing factor is the quality of the scheduling algorithm, since the fixed due dates cannot be adjusted. Exactly for this reason, scheduling has become a remarkably important research field ever since the 1950s (Akers, 1956), and is currently one of the hottest topics in the operational research society (Jain and Meeran, 1999; Minella et al., 2008). However, in real-world scheduling, due dates are controllable decision variables, which certainly complicates the optimization task but also provides opportunity for enhancing the overall performance of the manufacturing system. Therefore, in practical scheduling, the quality of due date setting and the quality of scheduling simultaneously determine the performance of manufacturing. Previous research has shown that due date optimization is not independent of scheduling. Actually, if the due dates have not been set in a rational way, the final scheduling performance will be unsatisfactory, no matter how superior the scheduling algorithm is (Vig and Dooley, 1991). For this reason, we will consider the two decision problems integrally in one optimization model. From the above point (2), we see that it is necessary to incorporate the batch information into the scheduling algorithm. Especially, two aspects have to be considered:

 A mechanism has to be designed to synchronize the production of a batch (the aim is to make the completion time of each job in the same batch close enough). To this end, we have devised a new dispatching rule for practical scheduling, which attempts to make each job keep in step with the production progress of the other jobs in the same batch.

 Although the jobs have been grouped into batches and a batch of jobs must have the same due date setting, each job is still independently processed on the production line. Therefore, we must decide when to consider a batch of jobs together and when to examine each job separately. In this paper, we propose a two-stage algorithm, where the first phase optimizes the batch-wise due dates in a rough manner while the second phase fine-tunes the production schedule in a job-by-job mode. In this research, we consider a firm (the actual prototype is a large-scale speed reducer manufacturer in Pingyao, Shanxi Province of China) which has accepted a number of orders, each consisting of many jobs. The due dates for some of the jobs have already been assigned, possibly due to the fact that such orders have strategic importance to the firm, and therefore, their due dates have been set according to the customers’ wishes and thus cannot be modified. For the remaining jobs, the firm’s scheduler (scheduling system) has to decide their due dates based on comprehensive information about the current production status of the workshops. In sum, the integrated scheduler must handle due date assignment and job shop scheduling at the same time. The output of the scheduling system includes the optimized due date values for each batch of unassigned jobs and an optimized processing sequence of relevant jobs for each machine. The contribution of the paper is twofold. First, it presents an optimization model for characterizing the features of real-world job shop scheduling. Pending due dates and batch information are also taken into consideration. Second, it proposes a local search-based hybrid algorithm for solving this problem. The first stage consists of a special type of genetic algorithm which belongs to estimation of distribution algorithms, and the second-stage algorithm is a local search procedure based on systematic parameter perturbation. The rest of this paper is organized as follows. Section 2 presents a brief introduction to job shop scheduling meta-heuristics and the basic theories on estimation of distribution algorithms. Section 3 defines the integrated due date assignment and job shop scheduling problem. Section 4 describes the proposed optimization approach in detail. Section 5 presents the computational results on the real running data of the Pingyao speed reducer factory and some related discussions. Finally, Section 6 concludes the paper.

2. Literature review 2.1. The job shop scheduling problem and total weighted tardiness objective In a standard JSSP instance, a set of n jobs J ¼ fJj gnj¼ 1 are to be processed on a set of m machines M ¼ fMh gm h ¼ 1 under the following basic assumptions: (i) There is no machine breakdown. (ii) No preemption of operations is allowed. (iii) The transportation time between different machines and the setup time between different jobs can be neglected. (iv) Each machine can process at most one job at a time. (v) Each job may be processed by at most one machine at a time. Each job has a fixed processing route which traverses all the machines in a predetermined order (the manufacturing process of a job on a machine is called an operation). The duration time of each operation is fixed and known. In the theoretical model, a preset due date dj and a preset weight wj are given for each job. Due dates are the preferred latest finishing time of each job, and completion of the job after this specific time will result in losses such as a worsened reputation in customers. Weights reflect the importance level of the orders from different customers, larger values suggesting higher strategic importance. If we use Cj to

R. Zhang, C. Wu / Engineering Applications of Artificial Intelligence 25 (2012) 209–221

denote the completion time of job j, the minimization objective of JSSP can be makespan ðCmax ¼ maxnj¼ 1 fCj gÞ, maximum lateness ðL ¼ maxnj¼ 1 fCj dj gÞ, total weighted tardiness ðTWT ¼ Pmax n wj Tj where Tj ¼ maxf0,Cj dj gÞ, number of tardy jobs j ¼ 1P ðU ¼ nj¼ 1 Uj where Uj ¼ 1 if Cj 4dj and Uj ¼ 0 otherwise), etc. Up till now, most research on JSSP has been focused on the makespan criterion. However, due-date-related performances are becoming more significant in the customer-centered market environment nowadays. In this sense, the total weighted tardiness measure is able to capture the critical factors that affect the profits of a firm more accurately than makespan. Meanwhile, from the theoretical perspective, the total weighted tardiness is more difficult to optimize than makespan. According to the concepts of computational complexity (Pinedo, 2008), minimizing Cmax is only a special case of minimizing TWT, which means the complexity of solving JSSP with total weighted tardiness objective is much greater than that of solving JSSP with makespan. Despite its importance, the contributions on TWT-JSSP are relatively scarce in the literature. The only exact solution methodology is the branch-and-bound algorithm proposed by Singer and Pinedo (1998).

2.2. A brief introduction to local search methods for JSSP Local search (LS) is a generic methodology for searching the solution space. The underlying idea of LS is quite straightforward: a given solution may be improved by a consecutive series of small modifications. For each solution v in the solution space V, the neighborhood function N defines the neighborhood of v. In other words, N ðvÞ is the set of neighbor solutions which can be reached from v. In general, a local search algorithm defines a walk in V such that each visited solution is a neighbor of the previously visited solution (Vaessens et al., 1996). The quality of a local search algorithm is directly affected by two factors, i.e., the definition of the neighborhood function and the method of selecting the next solution from the neighborhood. As for the first issue, there is usually a trade-off in the size of neighborhood. A large neighborhood may include high-quality solutions, but finding such a good solution requires more computational time. A small neighborhood is easy to perform an exhaustive search, but the improvement by each neighborhood move may be too trivial. In fact, the definition of neighborhood is rather problem-dependent and objective-dependent. So even for JSSP, the neighborhood structure for minimizing makespan ðCmax Þ should be somewhat different from the neighborhood used for minimizing tardiness. The quality of neighborhood is critical for such difficult combinatorial optimization problems as JSSP and thus this topic attracts consistent research interest. As for the second issue, different criteria exist for selecting the next visited solution. The simplest of all is the hill-climbing approach which always selects an improving solution from the current neighborhood. However, such a strategy makes the search process converge to local optima, which usually have poor quality. Alternatively, recent meta-heuristics such as simulated annealing (SA), tabu search (TS) and genetic algorithms (GAs) use special mechanisms in order not to get trapped by local optima. GAs were inspired by the principles of natural selection and population evolution. GAs can be viewed as a special case of local search methods. Different from those serial local search algorithms (like SA and TS), GAs generate a new solution from two previous solutions (rather than one immediately previous solution), and maintain a population of solutions (rather than one solution) in the searching process. So GAs are said to have an inherent parallel property. In fact, GAs are perhaps the maturest meta-heuristic for both continuous function optimization and combinatorial optimization. A vast number of papers dealing with

211

the application of GAs to JSSP exist in the literature. Cheng and Gen (1996, 1999) provide a systematic survey on this topic. 2.3. Basics of estimation of distribution algorithms (EDA) Recently, there has been a growing interest in the evolutionary algorithms that explore the search space by building and utilizing probabilistic models of high-quality solutions. Indeed, these algorithms use the following two steps to replace the conventional crossover and mutation operators in GA: (1) Build a probabilistic model of the selected promising solutions. (2) Sample the built model to produce a new generation of candidate solutions. The evolutionary algorithms based on such a principle are referred to as estimation of distribution algorithms (EDAs) or probabilistic model-building genetic algorithms (PMBGAs). The major steps of a PMBGA implementation are listed as follows, where GN is the maximum number of generations. Step 1: Set the generation index g¼0. Initialize the population of the first generation, i.e., Pð0Þ . Step 2: Select a subset S of promising individuals from PðgÞ . Step 3: Establish a probabilistic model M which somehow describes the distribution characteristics of S. Step 4: Generate a set N of new individuals by sampling M. Step 5: Select the best jP ðgÞ j individuals from PðgÞ [ N, and assign them to the next-generation population P ðg þ 1Þ . Step 6: Let g’g þ1. If g oGN, return to Step 2. Otherwise, output the best solution in PðgÞ . The PMBGA is especially useful for the complex optimization problems in which the decision variables are correlated. For such problems, the realized value of a certain decision variable can produce an impact on the optimal value for another decision variable. Therefore, if these variables are optimized in a separate way (or one by one), traditional GA will be very likely to converge to local optimum. PMBGA improves traditional GA by modeling the relationship between decision variables. Clearly, the most crucial element in the design of PMBGA is the type of the adopted probabilistic model (in Step 3), which directly affects the algorithm’s capability of producing high-quality offspring solutions. In the artificial intelligence community, the commonly adopted graphical model for characterizing the relationship between a set of discrete variables is the Bayesian networks (Heckerman et al., 1995). A Bayesian network is a directed acyclic graph. A node in the Bayesian network indicates a variable under investigation (each variable may actually correspond to a coding gene for possible solutions in PMBGA), and an arc indicates the probabilistic causal relationship between the two nodes connected by it. The direction of the arc implies that the variable corresponding to the head node of the arc is conditioned by the variable corresponding to the tail node. An example is given in Fig. 1. The six random variables X1 ,X2 , . . . ,X6 modeled by the Bayesian network in Fig. 1 have the following joint distribution function: Pðx1 ,x2 , . . . ,x6 Þ ¼ Pðx1 ÞPðx2 jx1 ÞPðx3 jx1 ÞPðx4 jx2 ,x3 ÞPðx5 jx4 ÞPðx6 jx4 Þ, ð1Þ where xi denotes the possible realization of Xi ði ¼ 1, . . . ,6Þ, Pðxi Þ denotes PrðXi ¼ xi Þ and Pðxi jxj Þ denotes PrðXi ¼ xi jXj ¼ xj Þ. Eq. (1) holds because a pair of variables which are not directly connected by any arc are assumed to be mutually independent. From this example, it is evident that Bayesian networks can reduce the

212

R. Zhang, C. Wu / Engineering Applications of Artificial Intelligence 25 (2012) 209–221

The meaning of the notations used in the above formulation are listed below:

X1

 Decision variables in the optimization problem: X2

X3

X4

X5

 X6

Fig. 1. An example Bayesian network of six variables.

required storage space for describing the joint distribution of random variables. If X1 , . . . ,X6 are all binary variables, i.e., x1 , . . . ,x6 A f0,1g, then 26 ¼64 values will be needed to characterize the joint distribution Pðx1 , . . . ,x6 Þ. However, the Bayesian network can factor the joint distribution into conditional distributions by utilizing the existing relationship between the variables. Hence, according to the right-hand side of Eq. (1), only 2þ4þ4þ8þ 4þ 4¼26 values are necessary for describing their joint distribution. In general, the joint probabilistic distribution of an n-variate random vector X ¼ ðX1 , . . . ,Xn Þ described by a Bayesian network can Q be calculated as PðxÞ ¼ ni¼ 1 Pðxi jpaðxi ÞÞ. In this formulation, x ¼ ðx1 , . . . ,xn Þ is a vector of values for X; paðxi Þ is a set of values for the parents of the random variable Xi. (In a Bayesian network, if there exists a directed arc pointing from node Xj to Xi, then Xj is called a parent of Xi. Regarding the example in Fig. 1, X2 and X3 are parents of X4.) A detailed introduction to the PMBGA can be found in Pelikan et al. (2002). Some important advances and interesting applica˜ aga and Lozano (2002) and tions of EDA are covered in Larran Lozano et al. (2006). Successes in utilizing EDA to solve scheduling problems have been reported in Tsutsui and Miki (2002) and Jarboui et al. (2009) (for flow shop scheduling).





According to the definition of the objective function Z in Eq. (2), tardiness costs are incurred when the completion time of a job is later than its due date, no matter whether the due date is preset outside the discussed problem range (i.e., dj) or determined within this optimization process (i.e., dg Þ. Moreover, for each unassigned batch of jobs, if we set a due date which is within the time range ½0,bg  (which exceeds the customer’s expectation on lead time), the possibility of losing the customer is so small that it can be omitted (i.e., og ½ðdg bg Þ þ v is zero). Only when the quoted due date is large enough there will arise DDA-related penalties. Constraint (a) defines the completion time of each job, which is equal to the finishing time of its last operation. Constraint (b) guarantees the precedence relations between any two operations from the same job. Constraint (c) ensures that each machine can process at most one job at a time (without overlap in processing periods). Constraint (d) sets an upper bound for the quoted due dates. The major differences between the problem studied in this paper and the ones in other existing works can be stated as follows:

3. Problem description The newly defined model incorporates due date assignment (DDA) and TWT-JSSP into one optimization problem. At the beginning, there are a number of jobs waiting to be scheduled. This set J of unscheduled jobs can be further divided into two subsets according to their due date quotation status: the jobs with already designated due dates belong to subset J 1 , while the jobs whose due dates are to be determined constitute subset J 2 (so J ¼ J 1 [ J 2 Þ. Furthermore, the jobs within J 2 are grouped into totally G batches, i.e., J 2 ¼ B1 [ B2 [    [ BG , and each batch Bg contains bg jobs, i.e., Bg ¼ fjg,1 ,jg,2 , . . . ,jg, bg g. Then, the optimization model for the integrated scheduling problem can be mathematically characterized as follows (where x þ  maxfx,0gÞ: 8 G X G X X P > > > minZ ¼ wj ðCj dj Þ þ þ wg ðCj dg Þ þ þ og bg ½ðdg bg Þ þ v , > > > jAJ 1 > g ¼ 1j A Bg g¼1 > > > > > < s:t: 8 8j A J , ðaÞ Cj ¼ maxi A OðjÞ fsi þ pi g, > > > > > > < si1 þ pi1 r si2 , > 8ði1 -i2 Þ A AðJ Þ, ðbÞ > > > > > > > ðsi1 þ pi1 r si2 Þ3ðsi2 þ pi2 r si1 Þ, 8ði1 2i2 Þ A EðJ Þ, ðcÞ > > > > > > : : dg r dmax , g ¼ 1, . . . , G: ðdÞ g

ð2Þ

J si (corresponding to scheduling): the starting time of operation i. J dg (corresponding to DDA): a decision variable indicating the pending due date for batch g in J 2 (assuming all jobs in one batch have the same due date). Order/job related information: J OðjÞ: the set of job j’s operations. J pi: the processing time of operation i. J Cj: job j’s completion time, which should equal the finishing time of its final operation. J ‘‘ði1 -i2 Þ A AðJ Þ’’ denotes the fact that operations i1 and i2 belong to the same job and i1 is required by the technological constraint to be processed before i2. J ‘‘ði1 2i2 Þ A EðJ Þ’’ denotes the fact that operations i1 and i2 should be processed by the same machine but their processing order has not been decided. Parameters set by the manufacturer: J wg : the weight of each job in batch g, which reflects the degree of losses if missing the due date (assuming all jobs in one batch have the same weight). J og : a penalty coefficient which measures the potential cost of losing the customer if too large a due date is quoted for batch g (assuming all jobs in one batch have the same penalty coefficient). J v Z1: a constant which depicts the growing pattern of the cost caused by quoting a larger due date. Parameters set by the customers: J dj (for j A J 1 Þ: the predetermined due date for job j in J 1 (possibly due to the considerable importance of the related customer). J bg : the expected due date of batch g (expectation of the customer). max J dg : the maximum value for dg accepted by the customer.

:

(1) The problem discussed in this paper integrates the optimization of operation sequences together with the optimization of due date settings. By contrast, in most existing literature on DDA methods, the sequencing of operations is not an optimization target but is directly achieved by simple dispatching rules, and thus, optimality cannot be ensured. (2) Our model includes a portion of jobs with already fixed due dates. Thus, this optimization model may be applied to a dynamic rescheduling environment, where the jobs with

R. Zhang, C. Wu / Engineering Applications of Artificial Intelligence 25 (2012) 209–221

pending due dates correspond to those new arrivals, and the existing jobs have fixed due dates because these previously assigned due dates have already been notified to the customers and cannot be changed.

4. The algorithm design The proposed solution framework for the integrated optimization problem consists of two functional modules: (1) The PMBGA-based algorithm which implements coarse-granularity optimization of the due date settings for each batch in J 2 , named DDA-PMBGA; (2) The parameter perturbation (PP) based algorithm for finetuning the processing sequence of all the operations, named OSA-PP (operation-sequence adjustment through parameter perturbation).

4.1. The DDA-PMBGA algorithm In DDA, the due date setting for one job may have an impact on the optimal due date setting for another job. For example, if some important jobs have been assigned with tight due dates, then the reasonable due date quotation for other jobs should be postponed to certain extent (because the overall production capacity of the system is limited). Therefore, the optimal due date settings for each batch are not mutually independent. Based on this observation and the fact that PMBGA is excellent in modeling the relationship between the variables to be optimized, we use PMBGA in this stage for roughly optimizing the batch-wise due dates. In PMBGA, the due dates for different batches are optimized not in a separate manner but based on their potential interaction. Before the proposed optimization process, the due date for each batch in J 2 is factored as

dg ¼ kg  t  TP g :

ð3Þ P

P

In this expression, TPg ¼ j A Bg i A OðjÞ pi is the total processing time of all jobs in batch g; t A ð0,1 is the unit step length for each adjustment during the rough optimization of due dates. With such factorization, the target that DDA-PMBGA attempts to optimize is the set of due date coefficients, fkg jg ¼ 1, . . . , Gg, while it is not necessary to care about the absolute scale of job sizes. Meanwhile, such factorization also discretizes the continuous dg values such that the resulting kg on which PMBGA works are all integers (since Bayesian networks can only handle discrete variables). In the following, we will describe the key elements of DDAPMBGA.

This also reflects the coarse-granularity feature of the DDAPMBGA optimization process because only the values at these discrete points controlled by the fixed step size t are considered in the optimization for dg . A smaller step size will enlarge the search space for kg , resulting in more accurate due date setting but longer computational time. So a trade-off must be obtained with respect to t. 4.1.2. Selection In each iteration, we first sort all the individuals in the current population according to their fitness and then select the best e% of individuals to form the set S, which will subsequently be used to build the Bayesian network and produce a new generation of individuals. 4.1.3. The network structure A Bayesian network has to be built to describe the probabilistic distribution of favorable kg values based on the elite individuals in S. Each individuals can be characterized by a directed acyclic network as shown in Fig. 2. In this network, a node Ng,k max ðg A f1, . . . , Gg, and for each g, k A fkmin g , . . . ,kg gÞ indicates the fact that the due date coefficient kg has been set as k for batch g. The directed arc from node Ng,k to node Ng þ 1,k0 represents the dependency between the two nodes, so it characterizes the possible influence of setting kg ¼ k for batch g on the setting of batch ðg þ 1Þ’s due date coefficient kg þ 1 . Therefore, a directed path from a certain node in the first row to a certain node in the G-th row can completely describe an individual in the population (because a directed path records an instantiation of all the kg values). 4.1.4. The probabilistic model Since we adopt a fixed network structure in DDA-PMBGA, building the Bayesian network is equivalent to determining the values of all the conditional probabilities according to the selected solution set S. After that, new individuals will be produced by iteratively sampling these probabilistic distributions, expecting to obtain high-quality offsprings. Given a number of individuals (i.e., the training set S), an estimate of the conditional probabilities can be obtained simply by counting the frequencies of occurrence. Here we provide a concrete example to illustrate the calculation process. For a DDAPMBGA optimization instance with G ¼ 3 and t ¼1, let us further suppose kmin ¼ kmin ¼ kmin ¼ 1, kmax ¼ 3, kmax ¼ 2, kmax ¼ 4 and the 1 2 3 1 2 3 current S contains 40 individuals. In Fig. 3, the statistics of these individuals are displayed on a network as previously defined. The weight of each arc (the number placed on the arc) indicates the occurring frequency of this arc (counted for all the individuals in

4.1.1. Encoding The coding length for each individual in DDA-PMBGA is l ¼ G because G due date coefficients have to be optimized. Each coding digit is an integer which records the value of kg for batch g in J 2 . Based on the integral encoding scheme and the factorization of dg max by Eq. (3), as well as the requirement that ‘‘dg r dg ’’ and the simple observation that ‘‘dg Z bg ,’’1 we know that 8g, the feasible domain for kg is ( ) $ max %      dg bg bg bg max Þ, Þ : ð ¼ kmin þ 1, þ 2, . . . , ð ¼ k g g t  TP g t  TPg t  TP g t  TP g

1 This is because, according to the definition of the objective function, there is no reward even if the promised due date exceeds the customer’s expectation. Quoting a due date shorter than bg only exerts unnecessary pressure on the scheduling department.

213

Fig. 2. The structure of the Bayesian network used in DDA-PMBGA.

214

R. Zhang, C. Wu / Engineering Applications of Artificial Intelligence 25 (2012) 209–221

4.1.5. The sampling process The sampling process for generating a new individual begins from the root nodes of the Bayesian network and gradually instantiates the whole network by selecting the outgoing arc at each node based on the calculated conditional probabilities.

Fig. 3. An example network in DDA-PMBGA.

S). For example, if there exists an individual coded as ‘‘½3,1,2’’ (i.e., k1 ¼3, k2 ¼1 and k3 ¼2), then the path ‘‘N1,3 -N2,1 -N3,2 ’’ in the Bayesian network is used to record this individual, and consequently, the weights (counted frequencies) of the arcs N1,3 -N2,1 and N2,1 -N3,2 will be increased by 1, respectively. Note that, in the final network, the sum of the weights of all the incoming arcs of a certain node should be equal to the sum of the weights of all the outgoing arcs of the same node. This is because each individual corresponds to a complete path from the first row to the last row. By using frequency as an approximation for probability, the relevant (conditional) probabilities should be calculated as follows: PðN1,1 Þ ¼

8 þ2 , 40

PðN1,2 Þ ¼

7þ7 , 40

PðN1,3 Þ ¼

PðN2,1 jN1,1 Þ ¼

8 , 8 þ2

PðN2,2 jN1,1 Þ ¼

2 , 8þ 2

PðN2,1 jN1,2 Þ ¼

7 , 7 þ7

PðN2,2 jN1,2 Þ ¼

7 , 7þ 7

PðN2,1 jN1,3 Þ ¼

10 , 10 þ 6

PðN3,1 jN2,1 Þ ¼

2 , 2 þ3 þ 5 þ15

PðN2,2 jN1,3 Þ ¼

5 PðN3,3 jN2,1 Þ ¼ , 2 þ3 þ 5 þ15

10 þ 6 , 40

i A OðjÞ

which reflects the processing urgency level of the considered job (smaller yj implies higher urgency of job j). The conventional minimum slack (MS) rule always selects the job with the shortest slack time to be processed when several jobs are waiting in the machine’s buffer. Here we define the weighted minimum slack (WMS) rule. In WMS, the priority index for job j A J 1 is evaluated as

rj ¼ ð1þ w j Þð2y j Þ, where w j A ½0,1 is the normalized weight of job j, and y j A ½0,1 is the normalized slack time of job j, and normalization means x j ¼ ðxj xmin Þ=ðxmax xmin Þ, xmin ¼ minj A J 1 fxj g, xmax ¼ maxj A J 1 fxj g, x A fy,wg. Likewise, we can define a WMS priority index for a batch of jobs in J 2 . First, the slack time of batch g is defined as

yg ¼ dg LBCmax ðBg Þ, where LBCmax ðBg Þ is a lower bound of makespan for the set of jobs ðBg Þ. The lower bound of makespan can be obtained in a number of ways. Here we use the single machine relaxation to calculate the lower bound, which is a computationally fast method:

6 , 10 þ6

PðN3,2 jN2,1 Þ ¼

4.1.6. Decoding First, we calculate the due date of each batch in J 2 using the fkg gG g ¼ 1 values indicated by the current individual, i.e., dg ¼ kg  t  TP g . Then, considering the characteristics of the objective function and the requirement of real-world scheduling, we devise two dispatching rules that can be used to obtain a feasible schedule in relation to the current solution. (I) The weighted minimum slack rule: The slack time for job j A J 1 is defined as X yj ¼ dj  pi ,

3 , 2þ 3 þ 5þ 15

15 PðN3,4 jN2,1 Þ ¼ , 2þ 3 þ 5þ 15

PðN3,1 jN2,2 Þ ¼

0 , 0 þ 10 þ 4 þ1

PðN3,2 jN2,2 Þ ¼

10 , 0 þ 10 þ 4þ 1

PðN3,3 jN2,2 Þ ¼

4 , 0 þ 10 þ 4 þ1

PðN3,4 jN2,2 Þ ¼

1 : 0 þ 10 þ 4þ 1

According to the above calculation method, a connection can never be re-discovered in the DDA-PMBGA if the corresponding conditional probability is zero (e.g., from N2,2 to N3,1 Þ. To overcome this drawback, we can set the minimum count to 1. Taking N2,2 as an example, the conditional probabilities for the outgoing arcs will then become PðN3,1 jN2,2 Þ ¼

1 , 1 þ11 þ 5 þ2

PðN3,2 jN2,2 Þ ¼

11 , 1þ 11 þ 5 þ2

PðN3,3 jN2,2 Þ ¼

5 , 1 þ11 þ 5 þ2

PðN3,4 jN2,2 Þ ¼

2 : 1þ 11 þ 5 þ2

Now it is possible to discover N3,1 from N2,2 , though the probability is small.

Step 1: For each machine k ðk ¼ 1, . . . ,mÞ, perform the following steps: (1.1) Note the operation of job j ðj A Bg Þ that has to be processed on machine k as jk. Evaluate the release P time of operation jk as rjk ¼ i A JPðjk Þ pi , where JPðjk Þ denotes the set of job predecessor operations of jk. (1.2) Sort the bg operations (from each job in Bg Þ fjk jj A Bg g in a non-decreasing order of rjk , yielding an operation sequence fo1 ,o2 , . . . ,obg g2. bg (1.3) Schedule the sequence of operations foi gi ¼ succes1 sively on machine k in accordance with their release times, and obtain the finishing time of the last operation as Cmax ðkÞ. Alternatively, Cmax ðkÞ ¼ Pbg ½ðroi foi1 Þ þ þ poi , where foi denotes the fini¼1 ishing time of operation oi with fo0 90 and foi ¼ maxfroi ,foi1 g þ poi . Step 2: Let LBCmax ðBg Þ ¼ maxfCmax ð1Þ, . . . ,Cmax ðmÞg.

2 After the relaxation, we are actually considering the single machine scheduling problem with release times, formally known as 1jrj jCmax . The optimal schedule for this problem can be obtained by sorting all the jobs in non-decreasing order of their release times frj g (Brucker, 2007, Section 4.2).

R. Zhang, C. Wu / Engineering Applications of Artificial Intelligence 25 (2012) 209–221

215

Consequently, the priority index for batch g (Bg  J 2 Þ is similarly defined by

minimize PG P

rg ¼ ð1 þw g Þð2y g Þ,

date is noted as d^ . The optimal due date adjustment policy (i.e., obtaining d^ g for each batch in the decoding process) is described

where w g A ½0,1 is the normalized weight of batch g, and y g A ½0,1 is the normalized slack time of batch g, and here normalization means x g ¼ ðxg xmin Þ=ðxmax xmin Þ, xmin ¼ minG g ¼ 1 fxg g, xmax ¼ maxG g ¼ 1 fxg g, x A fy,wg. By such a definition, we mean that each job in Bg has the same WMS priority index, which is equal to rg . (II) The batch synchronization rule: In order to ensure that the jobs within a batch are completed closely in time, we define the batch synchronization (BS) rule. First, the production progress rate of job j A Bg at the current time t is defined as P i A OðjÞ\fijfi r tg pi P PRj ðtÞ ¼ , i A OðjÞ pi where fi denotes the finishing time of operation i. So the index PRj represents the proportion of completed operations to all operations in job j. Meanwhile, we denote by ROj ðtÞ the number of remaining (unscheduled) operations of job j A Bg at the current time t. ROj ðtÞ provides a different measurement for the unfinished workload than ð1PRj ðtÞÞ. If the number of remaining operations is large (even if each remaining operation requires very short processing time), there will be more uncertainties in the following manufacturing process (because each unscheduled operation will participate a competition with other candidate operations). So, larger ROj ðtÞ values indicate higher production urgency level. When two or more jobs from the same batch are competing for machine processing, it is beneficial to select the job whose production progress has lagged behind its counterparts, or the job which has a larger number of subsequent operations. The motivation is to synchronize the completion times of the jobs in a batch so as to minimize inventory costs. To achieve this, we define the priority index for the BS rule as

r0j ¼ ð1 þRO j Þð2PR j Þ, where RO j A ½0,1 and PR j A ½0,1 are, respectively, the normalized values of ROj and PRj , and here normalization means x j ¼ ðxj xmin Þ=ðxmax xmin Þ, xmin ¼ minj A Bg fxj g, xmax ¼ maxj A Bg fxj g, x A fRO,PRg. (III) Application of WMS and BS: In order to evaluate the fitness of an individual (i.e., a rough DDA policy), we try to obtain a complete and reasonable schedule under such a due date setting. In the process of constructing the schedule, WMS and BS are applied in a hierarchical manner. When a number of jobs are ready for processing, we first evaluate their priority values under WMS, i.e., rj or rg (recall that rg is the shared priority value for each job from the same batch in J 2 Þ, and identify the largest one (noted as rmax Þ. If rmax corresponds to a single job j A J 1 , then select job j as the next job to be processed. Otherwise if rmax is the shared value for several jobs from a certain batch, then evaluate the BS priority values for each of the corresponding jobs and select the job with the largest r0j to be processed next (if there is only one job from that batch, of course we do not need to calculate the BS priority). In this way, we can produce a feasible schedule and obtain each job’s completion time Cj. 4.1.7. Evaluation Once decoding is done, the completion time of each job (Cj) is known and this is used to evaluate the first item of Z (i.e., P þ j A J 1 wj ðCj dj Þ Þ for this individual. It is noticeable that, after the schedule has been fixed, the due date of job j ðj A J 2 Þ could then be modified in order to further

g¼1

the

last

j A Bg wg ðCj dg Þ

two items of the objective Z (i.e., P þ v þ G g ¼ 1 og bg ½ðdg bg Þ  Þ. The new due

þ

as follows. We now consider only one batch of jobs. The subscript ‘‘g’’ can be omitted in this part because the following procedure will be applied separately to each batch to get each d^ g ðg ¼ 1, . . . , GÞ. Recall that b jobs belong to this batch, and they have the same non-penalty due date b, the same penalty coefficient o and the same weight w. The objective is to determine a common due date d^ for these jobs such that the following cost value is minimized: Z b ¼ zb ðdÞ ¼ w

b X

ðCi dÞ þ þ bo½ðdbÞ þ v :

i¼1

If 8iA f1, . . . , bg, Ci rb, then we have zb ðbÞ ¼ 0. Therefore, we can set d^ ¼ b under this situation. In the following, we will assume that there exist at least one Ci which satisfy Ci 4 b. We first sort C1 , . . . ,Cb and b in a non-decreasing order, yielding a

sequence

C½bL  , . . . ,C½2 ,C½1 ,b,C½1 ,C½2 , . . . ,C½bR 

R

L

R

ðb þ b ¼ b,

L

b 40Þ. That is to say, there are b jobs completed before the R expectation due date b and b jobs completed after b. Thus, the b piecewise function z ðdÞ can be reformulated on each of the L R intervals ½C½i1 ,C½i  ði ¼ b , . . . , b Þ with C½bL 1 90 and C½0 9b. L b When iA fb , . . . ,0g, z ðdÞ can be rewritten for interval ½C½i1 ,C½i  as follows: 2 3 bR X R b 4 C½i0  ðb iÞd5: z½i ðdÞ ¼ w i0 ¼ i,i0 a 0 L

It is clear that, for any i ¼ b þ1, . . . ,0, zb½i ðdÞ is always a linear R

function with a negative slope ðb iÞw which increases with i. Besides, when d ZC½bR  4b, zb ðdÞ ¼ boðdbÞv has a positive slope (of tangent). Now that zb ðdÞ is downward sloping when d rb and upward sloping when d Z C R , the optimal d^ to minimize zb ðdÞ ½b 

can only be found in ½b,C½bR  . R

When iA f1, . . . , b g, zb ðdÞ can be rewritten for interval ½C½i1 ,C½i  as follows: 2 R 3 b X R b C½i0  ðb i þ1Þd5 þ boðdbÞv : z½i ðdÞ ¼ w4 i0 ¼ i

By differentiating zb½i ðdÞ: dzb½i dd

R

¼ ðb iþ 1Þw þ bvoðdbÞv1 ,

and assuming dzb½i =dd ¼ 0, we get " #1=ðv1Þ R ðb iþ 1Þw d½i ¼ b þ : bvo %

ð4Þ

Note that the d½i may or may not lie within the domain of zb½i ðdÞ, %

i.e., ½C½i1 ,C½i . Therefore, we have to use the following algorithm to find the global minimum of zb ðdÞ. Step 1: Initialize i¼1. R Step 2: If i 4 b , then return d^ ¼ C½bR  . Otherwise, evaluate d½i according Eq. (4). Step 3: If d½i is located within ½C½i1 ,C½i , then d½i is the global optimum. Thus, return d^ ¼ d½i . Step 4: If d½i 4 C½i , then set i’i þ1 and go to Step 2. %

%

%

%

%

216

R. Zhang, C. Wu / Engineering Applications of Artificial Intelligence 25 (2012) 209–221

Step 5: If d½i o C½i1 , then the global optimum is C½i1 . Thus, return d^ ¼ C½i1 . %

The algorithm output is definitely the global minimum of zb ðdÞ because d½i decreases with i. If for a certain i0, d½i0  A ½C½i0 1 ,C½i0  , then %

R

%

8i0 oi r b , d½i2 = ½C½i1 ,C½i  and indeed d½i o C½i1 . This situation can be depicted in Fig. 4(a). In addition, it is also possible that the output d^ is a ‘‘corner’’ solution (Step 5), which is depicted in Fig. 4(b). %

%

A special case is v ¼1, and each segment of zb ðdÞ is a linear R

function (see Fig. 5). On interval ½C½i1 ,C½i  ði A f1, . . . , b gÞ, zb ðdÞ can be written as R

R

zb½i ðdÞ ¼ ½boðb iþ 1Þwd þ w

b X

C½i0  bob:

i0 ¼ i R

Hence, the slope of this line segment is k½i ¼ boðb i þ 1Þw. The above minimization algorithm should be correspondingly modified as follows. Step 1: Initialize i¼ 1. R Step 2: If i 4 b , then return d^ ¼ C½bR  . Otherwise, calculate k½i ¼ boðbR i þ 1Þw. Step 3: If k½i ¼ 0, then return d^ ¼ C½i . Step 4: If k½i o 0, then set i’i þ1 and go to Step 2. Step 5: If k½i 4 0, then return d^ ¼ C½i1 .

schedule obtained under a given (fixed) due-date-related dispatching rule will change, which can be used as a local search mechanism. Furthermore, if the directions of successive perturbations are not completely random but determined partly based on the knowledge gained from previous experience, then this method can have satisfactory searching performance with respect to both solution quality and computational time (Agarwal et al., 2006). In order to guarantee the effectiveness of fine-tuning, we allow the due dates of the jobs in one batch to be perturbed independently. Before OSA-PP begins, the due date of each job j A J 2 is initially inherited from its belonging batch gðjÞ, i.e., dj ¼ dgðjÞ . Then, after we exert perturbations on each dj ðj A J 2 Þ independently, each job in j A J 2 will have separate and distinct WMS indices rj (because now yj ¼ dj LBCmax ðBgðjÞ Þ can be different for the jobs in one batch). Perturbing the due date of each job in one batch separately is a main feature of OSA-PP against DDA-PMBGA. This slight change brings a higher degree of freedom in search directions so it can provide a reasonably large neighborhood for local search. Below is the detailed procedure of such a local search algorithm—OSA-PP, which relies on random perturbations to explore the neighborhood and involves a parameter learning

Finally, after d^ g is obtained for each batch using the above procedure, the objective value Z could ultimately be calculated (substituting the dg in Z with d^ g Þ. Z is directly taken as the evaluation of the current individual, and a smaller Z suggests higher solution quality. 4.2. The OSA-PP algorithm After obtaining a rough due date value for each unassigned batch with DDA-PMBGA, we use a fine-granularity optimization procedure (called OSA-PP) to fine-tune the final solutions in an attempt to further improve the solution quality. The input of OSAPP is a set of kg ðg ¼ 1, . . . , GÞ values as output by DDA-PMBGA, which indicate that the due date for each job in batch g has been set as dg ¼ kg  t  TPg . On such a basis, OSA-PP conducts the search for a better schedule of all the operations that belong to J . The fundamental idea underlying OSA-PP is that, by means of perturbing the due date values of some jobs, the production

Fig. 4. Sketch of the function zb ðdÞ.

Fig. 5. The sketch of zb ðdÞ under v ¼1.

R. Zhang, C. Wu / Engineering Applications of Artificial Intelligence 25 (2012) 209–221

process similar to that of artificial neural networks for guiding the search direction. Step 1: Initialize the iteration index: u¼1. Let kðuÞ ¼ kgðjÞ (this is j implemented for all j A J 2 , and similarly hereinafter). Step 2: Evaluate the (perturbed) due dates in the current itera-

5.1.1. Algorithm related information The parameters of the proposed TSO have been determined by empirical studies and are listed as follows:

 Parameters of DDA-PMBGA:

ðuÞ

 t  TP gðjÞ (t is the same step size as used tion: dj ¼ kðuÞ j in DDA-PMBGA). Step 3: Apply the WMSþBS rule3 (in which the due date of each ðuÞ

job in J 2 is regarded as dj Þ to generate a feasible schedule (denoted by Su), and then calculate its objective value Zu using the evaluation procedure described in Section 4.1.7. Step 4: If Zu is the best objective value achieved so far, then set



Z ¼ Zu , S ¼ Su and k j ¼ kjðuÞ . Step 5: If u ¼ umax , go to Step 10. Step 6: If u 41 and the best objective value has been improved, then reinforcement is incurred by letting

kðuÞ ’kjðuÞ þ l j

217





ðkjðuÞ  kjðu1Þ Þ.

J The population size PS ¼ 50. J The generation number GN ¼ 500 (maximum), or determined by the externally imposed computational time limit. J The step size t ¼0.15. J The percentage of individuals selected for building the Bayesian network: e% ¼ 30%. Parameters of OSA-PP: J umax ¼ 1000 (maximum), or determined by the externally imposed computational time limit. J l ¼ 3. J Q¼15. J a ¼ 0:3. Other parameters: J The time allocation between DDA-PMBGA and OSA-PP: T% ¼ 75% of the total available computational time is allocated to DDA-PMBGA, and the rest to OSA-PP.

Step 7: If u 4Q and Z has not been improved during the most recent Q iterations, then let kjðuÞ ¼ k j . Step 8: Generate a random number x from the uniform distribution

U½1,1,

min Dj ¼ kmax gðjÞ kgðjÞ

and

let

þ 1Þ ¼ kðuÞ þ x  a  Dj kðu j j

where

represents the scale of the search range

for kj. Step 9: Let u’u þ 1 and go to Step 2. Step 10: Output the results: the best schedule Sn, and its corresponding optimal due date settings fd^ g gG g ¼ 1.

Meanwhile, for comparative purposes, we use two other algorithms to give the baseline for performance evaluation (the schedule builder in these two comparative algorithms is based on the WMS rule only, and ties are broken randomly):

 GLS (a hybrid genetic local search algorithm proposed by Essafi

The parameters of OSA-PP include:

   

umax : the total iteration number; l: the reinforcement step size; Q: the allowed number of iterations without any improvement; a A ð0,1: the amplitude of random perturbation.

In OSA-PP, Step 6 applies a reinforcement strategy when the current perturbation direction is beneficial for improving the overall objective value. Step 7 is a backtracking policy which restores the solution to the best-so-far when the latest Q perturbations do not result in any improvement. Step 8 imposes a random perturbation on the current solution to explore its neighborhood region. In Steps 6 and 8, the reinforced or perturbed max new kj values should be bounded by ½kmin gðjÞ ,kgðjÞ .

5. Computational results and analysis 5.1. Experimental setup To test the effectiveness of the proposed two-stage optimization algorithm (referred to as TSO below), experiments are carried out for real-world instances from the Pingyao speed-reducer factory located in Shanxi, China. All related algorithms have been implemented in Visual Cþþ 7 and tested on an Intel Core2 Duo 2.2 GHz/2 GB RAM/ Windows XP platform. Next, we will introduce the parameter setting for TSO and the characteristics of the scheduling instances. 3 Here WMSþ BS functions in the following way. Each time when a number of jobs are waiting, first identify the job j~ with the largest rj . If j~ A J 1 , select this job ~ for processing. Otherwise if j~ A J 2 and there are more than one job from batch gðjÞ in the buffer, select the job that has the largest r0j (BS index) to be processed next.



et al. (2008) for TWT-JSSP). As Section 4.1.7 shows, when the production schedule (i.e., operation sequence) is fixed, the optimal due date settings d^ g can be thought of as a function of the schedule. So the GLS adopted here only searches in the space of operation sequences (i.e., each solution in GLS is a sequence of all the operations). The optimal due date for each batch is determined in the evaluation stage before calculating the ultimate objective value. TS/SA (a nested combination of tabu search and simulated annealing proposed by Zhang et al. (2008) for JSSP). The TS/SA method used here has been modified as follows in order to better suit our problem. The outer iteration (TS) performs a search for the G due date settings of each batch in J 2 while the inner iteration (SA) optimizes the operation sequence under every set of due dates chosen by TS.

5.1.2. The real-world test instances These production instances are extracted from a large-scale speedreducer manufacturing enterprise in China. The factory manufactures many types of speed-reducers to be used in various industries4 and its engineering shop is typical job shop style in terms of technological layout and routes. The required processing time of each operation ranges from 0.5 to 150 min (with satisfactory precision). The number of schedulable machine cells is around 42 (according to the type of jobs to be processed). To give only a small example, the manufacturing route of a certain part type is illustrated in Fig. 6. From this example, some additional characteristics can be found in the real manufacturing data: (1) There exist reentrant jobs on the production line. For example, the ‘‘rough turning’’ and ‘‘finish turning’’ are performed on the same machine (ID: 101). This feature will not affect the functioning of our algorithm since we have used a rulebased schedule builder. (2) Different from standard JSSP, most jobs do 4 The major products of this factory include cycloidal gears, involute gears, single/double-circular tooth gears, satellite gears, worm speed reducers, electric pulleys, etc. in up to 3000 different models.

218

R. Zhang, C. Wu / Engineering Applications of Artificial Intelligence 25 (2012) 209–221

Fig. 6. The processing route of a part. Table 1 Computational results from the application in a speed-reducer factory. Scenario No. R-n

1 2 3 4 5 6 7 8 9 10

Size ðnðGÞ  mÞ

374(94)  42 410(95)  42 428(89)  42 457(89)  42 524(98)  42 536(73)  42 624(62)  42 672(75)  42 738(71)  42 781(84)  42

TSO

GLS

Best

Mean

Worst

Best

Mean

Worst

Best

Mean

Worst

1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000

1.053 1.065 1.059 1.060 1.062 1.098 1.056 1.074 1.078 1.066

1.088 1.090 1.082 1.104 1.117 1.152 1.133 1.116 1.129 1.147

1.073 1.065 1.081 1.138 1.108 1.124 1.083 1.079 1.092 1.095

1.192 1.194 1.213 1.208 1.244 1.282 1.250 1.249 1.238 1.217

1.428 1.387 1.414 1.406 1.381 1.437 1.395 1.388 1.415 1.390

1.062 1.054 1.067 1.075 1.071 1.096 1.085 1.073 1.087 1.089

1.186 1.178 1.200 1.192 1.196 1.217 1.209 1.212 1.226 1.183

1.317 1.339 1.285 1.361 1.410 1.328 1.387 1.334 1.374 1.361

not need to traverse all the machines to complete processing. Hence, the total number of operations in these instances is less than n  m. (3) Another common feature in real-world scheduling scenarios is that there exist quite a number of identical jobs. Therefore, although the number of jobs to be scheduled in each instance is large (compared to theoretical benchmark instances), the complexity for optimization is still within the affordable range. In the investigated speed-reducer factory, the planning staff (under the general management department) have to make both DDA and job sequencing decisions when new orders arrive. Consistent with our model, a ‘‘batch’’ of jobs corresponding to the same order have to be considered together. The test instances used in this section are extracted from the orders accepted by this factory during different months. The characteristics of these real-world instances are described as follows:

 v¼2, which means the penalty cost grows as a square of





TS/SA

ðdg bg Þ þ . This assumption is to depict the fact that the probability of losing orders increases acceleratedly with the promised lead time. max The maximum acceptable due date dg and the expectation due date bg for each job batch g are all set according to the customers’ preferences (which have been recorded in the process of order placement). The weight ðwg Þ and the due date penalty coefficient ðog Þ of each batch are measured by three levels: wg ¼ og ¼ 9 for most important customers (accounting for about 30% of orders for the studied reducer factory); wg ¼ og ¼ 5 for ordinary customers (accounting for about 55% of total orders); wg ¼ og ¼ 0:5 for the customers who hold a passive position in the supply chain (accounting for about 15% of total orders).

The allowed computational time for the three algorithms to solve each instance is 300 s. The aim of imposing the same time limit is to make the subsequent comparisons meaningful and reliable. 5.2. Computational results In this paper, we consider 10 test instances which are extracted from the investigated company’s ERP and MES systems.

For each instance, all the algorithms have been executed for 20 consecutive times independently. Then, the best result obtained from the 20 independent runs for each instance is displayed in the ‘‘best’’ column. Likewise, the worst result and the mean of the 20 output values for each instance are displayed in the ‘‘worst’’ and the ‘‘mean’’ column, respectively. Since absolute objective values are unimportant for comparative purposes, the above-mentioned data have been converted into relative numbers, taking the best result achieved by TSO as reference (i.e., the conversion is simply ‘‘the obtained actual value/the best objective value from TSO’’). The results are displayed in Table 1. In this paper, we have introduced a new dispatching rule named BS which is intended to synchronize the production of jobs in a batch. In order to examine the effectiveness of this rule, we define the following coordination index, which is calculated for the final production schedule output by TSO: 0 1 G X 1X 1 @ CI ¼ jCj C g jA,

G g ¼ 1 bg j A B

g

P where C g ¼ ð1=bg Þ j A Bg Cj is the average completion time of the jobs in batch g. Clearly, smaller CI suggests a higher level of coordination in the processing of a batch of jobs, and thus a lower level of product inventory. We also calculate the CI index for the two comparative algorithms and list the results on the 10 instances in Table 2. For each instance, the reported CI (in minutes) is the average value based on 20 independent runs just as the above. In order to provide more accurate information, we use the Mann–Whitney U test to compare the computational results. One group of tests are performed between TSO and GLS, and another group of tests are performed between TSO and TS/SA. Because each algorithm is run for 20 times independently, we have n1 ¼ n2 ¼ 20 in the Mann–Whitney U test. The null hypothesis is that there is no difference between the results of the two compared algorithms. Therefore, if the obtained U (the lesser of U1 and U2) is below 127, the null hypothesis can be rejected at the 5% level. Furthermore, if U o105, the null can be rejected at the 1% level. The values of U are listed in Table 3 (for comparison of objective values) and Table 4 (for comparison of the coordination

R. Zhang, C. Wu / Engineering Applications of Artificial Intelligence 25 (2012) 209–221

Table 2 Comparison of degree of coordination within batches. Scenario No. R- 1 2 3 4 5 6 7 8 9 10

Size ðnðGÞ  mÞ

TSO

GLS

TS/SA

374(94)  42 410(95)  42 428(89)  42 457(89)  42 524(98)  42 536(73)  42 624(62)  42 672(75)  42 738(71)  42 781(84)  42

28.7 26.1 31.5 30.7 28.8 35.3 43.8 42.2 40.2 39.4

41.6 38.4 36.5 45.7 53.6 59.0 73.9 77.3 81.5 71.6

35.7 36.1 33.6 34.9 33.2 37.7 53.4 49.2 56.1 52.4

Table 3 Mann–Whitney U tests on the objective values. Scenario No. R-n 1 2 3 4 5 6 7 8 9 10

UTSOGLS

UTSOTS=SA

92 87 89 84 81 77 75 80 76 78

88 90 91 89 85 81 78 84 80 86

Table 4 Mann–Whitney U tests on the values of CI. Scenario No. R-n 1 2 3 4 5 6 7 8 9 10

UTSOGLS

UTSOTS=SA

71 69 88 74 62 68 64 61 53 50

93 97 112 101 94 106 87 94 75 71

degree). Based on the statistical tests, we can conclude that TSO is significantly more effective than the two comparative methods. Besides the fact that TSO outperforms GLS and TS/SA on all instances as revealed by the computational results, the following comments can be made as an attempt to explain the behavior of the algorithms.

219

(2) In the TS/SA algorithm, the outer-layer TS is used to optimize the due date settings, but the optimization of due dates by TS is implicitly based on the assumption that the due dates of different batches are mutually independent. Since there actually exist strong interrelations between the optimal due date setting for each batch, the searching efficiency of TS is notably restricted compared with DDA-PMBGA. Meanwhile, when a set of due dates have been fixed, the inner-layer SA is faced with a TWT-JSSP which is strongly NP-hard, so SA cannot ensure high-quality solutions within short time. Therefore, TS/SA probably needs considerably long computational time to converge. However, it is seen from Table 1 that TS/SA is still able to achieve a remarkable improvement over GLS for these instances, which also shows that the optimization procedure working on the due date level is superior to the procedure searching on the operation-sequence level. (3) For the tested instances, we also find that larger improvement occurs when the number of batches ðGÞ is relatively smaller compared to the number of jobs (i.e., the average number of jobs in each batch, n=G, is relatively larger), e.g., for instance No. R-7. This can be attributed to the second-stage algorithm (OSA-PP) which allows separate perturbations to the due dates of the jobs within a batch and thereby conducts an effective local search for the operation sequence. Indeed, this procedure enables a finer ‘‘exploitation’’ which is complementary to the coarse-granularity ‘‘exploration’’ by DDAPMBGA. If there are more jobs in each batch, this mechanism tends to gain more improvement over TS/SA in which only a common due date for each batch is considered and optimized. (4) The comparison of CI verifies the efficacy of the proposed BS rule. GLS has the worst performance on this criterion because it has not effectively utilized the batch information in its search on operation sequences. The coordination degree of TS/ SA is between TSO and GLS but more close to GLS. A possible explanation is that TS/SA optimizes a common due date for a batch of jobs, and the common property values for jobs in the same batch at least provides some cue to synchronize the production progress for a batch when a due-date-related dispatching rule (WMS) is used.

5.3. Experiment on the algorithm parameters Just like many other local search-based algorithms, the proposed DDA-PMBGA and OSA-PP also involve several parameters which must be specified before the optimization process. In order to observe the influence of TSO parameters on the final solution quality, we select five important factors for experiment:

 T%: the proportion of computational time devoted to DDAPMBGA.

 e%: the proportion of individuals selected for building Bayesian networks.

(1) As introduced earlier, the optimization target of GLS is operation sequence and GLS regards the optimal due date setting as a function of a specific schedule. The drawback of this approach is that the distribution and the potential interaction of different batches’ due date settings have been completely neglected. In addition, the neighborhood structure based on operation sequences is considerably irregular (since d^ g and Z are discontinuous with respect to the job completion times Cj), which brings great trouble for operation-sequence-based local search algorithms. Therefore, GLS is less effective than TSO and also TS/ SA. It can be concluded that, for the investigated optimization problem, local search only on the operation level is insufficient to guarantee good solution quality.

 a: the perturbation amplitude in OSA-PP.  t: the step size of DDA-PMBGA.  l: the reinforcement factor. The optimization results (mean value of 20 independent runs) for a selected larger-scale instance (No. R-7) under different settings of the five parameters are shown in Figs. 7(a)–(e). In these experiments, when one of the parameters is being tested, all of the other parameters will be kept constant at their original values (as listed in Section 5.1.1). When we experiment on the parameters a and l of the second-stage optimizer (i.e., OSA-PP), we use the same output solution of DDA-PMBGA as the input of OSA-PP so as to make the results comparable. In Fig. 7, the data are also

220

R. Zhang, C. Wu / Engineering Applications of Artificial Intelligence 25 (2012) 209–221

Fig. 7. Impact of parameter setting on the final solution quality. (a) Impact of T% on the quality, (b) Impact of e% on the solution quality (c) Impact of a on the solution quality and (d) Impact of t on the solution quality.

displayed in relative numbers as the so-called performance ratio ðPRÞ, which is defined as PR ¼ Zm ðYÞ=Zm ðY Þ. Zm ðYÞ denotes the obtained mean objective value in 20 runs when the currently tested parameter is set to Y, and Y ¼ argminY A Y Zm ðYÞ is the best value for this parameter within the considered value range Y. The following remarks can be made concerning these results (each item corresponds to the discussion on one parameter): (1) When over 60% of the total computational time is spent on OSAPP, the overall optimization effect is not satisfactory. One obvious reason is that the searching ability of OSA-PP is too weak to perform a global exploration of the solution space. However, it definitely improves the solution output by DDA-PMBGA because, according to Fig. 7(a), the results also worsen on the other extreme when DDA-PMBGA takes up 100% of the available

computational time and thus OSA-PP is not used. Overall, the result is most desirable when the time allocation for DDA-PMBGA is 70–80%. At this ratio, DDA-PMBGA’s advantage in extensive exploration and OSA-PP’s advantage in intensive exploitation can be beneficially combined. (2) The number of solutions selected for building the Bayesian network also affects the solution quality. When more than 60% of the population are selected for S, the optimization performance is poor. This is because such a high proportion of individuals cannot represent the really ‘‘promising’’ solutions in the population. The overall trend displayed by Fig. 7(b) is that a smaller S will generally result in better optimization quality. It is recommended to adopt the best 20–40% of the population for Bayesian model building at each generation. A proportion below 10% will be quite unacceptable, because too few samples may lead to

R. Zhang, C. Wu / Engineering Applications of Artificial Intelligence 25 (2012) 209–221

over-fitting of the obtained Bayesian network and consequently give rise to premature convergence of DDA-PMBGA. (3) OSA-PP relies on random perturbations to search the neighborhood, and thus, the relative amplitude ðaÞ of these perturbations affects the searching efficiency. According to Fig. 7(c), too large a perturbation amplitude ða 4 0:6Þ will deteriorate the solution quality. Intuitively, persistent large perturbations will make the solution leap around all over the search range and it is impossible to fine-tune the current solution. Therefore, it is difficult to obtain substantial improvements on the current solution with a very large a. On the other hand, if a is too small (e.g., a o 0:05Þ, the result may also worsen because the available computation time for OSA-PP does not allow it to conduct an extremely finegranularity search. (4) The parameter t affects only DDA-PMBGA but not OSA-PP because the actual search step size of OSA-PP is controlled by a (notice that kj is not constrained to integer values in OSAPP). The influence of t exhibits a quite similar pattern to that of a, but to a much weaker extent (reflected by the smaller PR variation range) because the output of DDA-PMBGA can be further improved by OSA-PP. The final solution deteriorates a bit when t o0:2 because such a smaller step size hinders the global exploration behavior of DDA-PMBGA. The function of the adopted Bayesian network is to quickly obtain a clear-cut description about the optimal relative sizes of the pending due dates. If the step size is too small and thus there are too many possible values for each due date, the function of Bayesian learning weakens. (5) Reinforcement is a key step in OSA-PP, the basic idea for which is borrowed from the training of neural networks. Therefore, the reinforcing step size l naturally has an effect on the optimization result. By definition: kðuÞ ’kjðuÞ þ j ðu1Þ ðuÞ ðu1Þ l  ðkðuÞ k Þ ¼ ð l þ1Þk  l k , a moderate l will move j j j j the current solution towards a possibly rewarding direction and thus increase the probability of meeting a better solution. However, an overlarge l will not be helpful in promoting the search efficiency because it will move the solution too far away in a certain direction. Indeed, even if better solutions exist along that direction, driving too fast may miss them and may even intrude into a more complicated region which increases the difficulty of finding optimal directions in the subsequent attempts. 6. Conclusion Real-world scheduling problems differ from theoretical problems in several aspects. In this paper, we mainly focus on two features, that is, the pending due dates and the influence of job batches. We integrate due date assignment with job shop scheduling and define a new optimization model. To solve the problem, a two-stage algorithm is proposed, in which the first stage performs a rough optimization for the batch-wise due dates while the second stage performs a local search by perturbing the due date of each job separately. Meanwhile, a dispatching rule is proposed to make the completion times of the jobs in a batch close to each other in order to reduce product inventory. Applications to a real-world mechanical factory verified the efficacy of the proposed approach. The impact of parameter settings is also discussed. Future research will be conducted for incorporating more of real-world factors, for example, the effect of random events in the production line. From the operations management perspective, the occurring frequencies of such events as machine breakdown and defective products (requiring rework) can be regarded as

221

controllable decision variables. It would be interesting to consider such factors in the scheduling model and construct a simulation optimization problem, which will be another great step closer to real-world applications.

Acknowledgment The paper is supported by the National Natural Science Foundation of China (Nos. 60874071, 71001048). The authors would like to thank the anonymous referees for their detailed comments which help to improve the quality of this paper. References Agarwal, A., Colak, S., Eryarsoy, E., 2006. Improvement heuristic for the flow-shop scheduling problem: an adaptive-learning approach. Eur. J. Oper. Res. 169 (3), 801–815. Akers, S.B., 1956. A graphical approach to production scheduling problems. Oper. Res. 4 (2), 244–245. ¨ ug ¨ ull ¨ u, ¨ N., 2009. Dynamic job shop scheduling for missed due date Alpay, S- ., Yuz performance. Int. J. Prod. Res. 47 (15), 4047–4062. ¨ -ken, M., Unutmaz, Z.D., 2008. New approaches to due date Baykaso˘glu, A., Goc assignment in job shops. Eur. J. Oper. Res. 187 (1), 31–45. Brucker, P., 2007. Scheduling Algorithms, fifth ed., Springer, Berlin, Heidelberg. Cheng, R., Gen, M., 1996. A tutorial survey of job-shop scheduling problems using genetic algorithms—Part I: representation. Comput. Ind. Eng. 34 (4), 983–997. Cheng, R., Gen, M., 1999. A tutorial survey of job-shop scheduling problems using genetic algorithms—Part II: hybrid genetic search strategies. Comput. Ind. Eng. 36 (2), 343–364. Essafi, I., Mati, Y., Dauze re-Pe´re s, S., 2008. A genetic local search algorithm for minimizing total weighted tardiness in the job-shop scheduling problem. Comput. Oper. Res. 35 (8), 2599–2616. Gonc- alves, J.F., Mendes, J.J.M., Resende, M.G.C., 2005. A hybrid genetic algorithm for the job shop scheduling problem. Eur. J. Oper. Res. 167 (1), 77–95. Hasan, S.M.K., Sarker, R., Cornforth, D., 2008. GA with priority rules for solving jobshop scheduling problems. In: Proceedings of the 2008 IEEE Congress on Evolutionary Computation, pp. 1913–1920. Hasan, S.M.K., Sarker, R., Essam, D., Cornforth, D., 2009. Memetic algorithms for solving job-shop scheduling problems. Memetic Comput. 1 (1), 69–83. Heckerman, D., Geiger, D., Chickering, D.M., 1995. Learning Bayesian networks: the combination of knowledge and statistical data. Mach. Learn. 20 (3), 197–243. Jain, A.S., Meeran, S., 1999. Deterministic job-shop scheduling: past, present and future. Eur. J. Oper. Res. 113 (2), 390–434. Jarboui, B., Eddaly, M., Siarry, P., 2009. An estimation of distribution algorithm for minimizing the total flowtime in permutation flowshop scheduling problems. Comput. Oper. Res. 36 (9), 2638–2646. ˜ aga, P., Lozano, J.A., 2002. Estimation of Distribution Algorithms: A New Tool Larran for Evolutionary Optimization. Kluwer Academic Publishers, Boston. ˜ aga, P., Inza, I., Bengoetxea, E., 2006. Towards a New EvoluLozano, J.A., Larran tionary Computation: Advances in the Estimation of Distribution Algorithms. Springer, Berlin. Lucas, A.J., Moses, S.A., 2005. Scalability and performance of computational structures for real-time order promising. J. Intell. Manuf. 16 (1), 5–20. Minella, G., Ruiz, R., Ciavotta, M., 2008. A review and evaluation of multiobjective algorithms for the flowshop scheduling problem. INFORMS J. Comput. 20 (3), 451–471. Pelikan, M., Goldberg, D.E., Lobo, F.G., 2002. A survey of optimization by building and using probabilistic models. Comput. Optim. Appl. 21 (1), 5–20. Pinedo, M.L., 2008. Scheduling: Theory, Algorithms and Systems, third ed., Springer, New York. Singer, M., Pinedo, M., 1998. A computational study of branch and bound techniques for minimizing the total weighted tardiness in job shops. IIE Trans. 30 (2), 109–118. Tsutsui, S., Miki, M., 2002. Solving flow shop scheduling problems with probabilistic model-building genetic algorithms using edge histograms. In: Proceedings of the 4th Asia-Pacific Conference on Simulated Evolution and Learning, pp. 465–471. Vaessens, R.J.M., Aarts, E.H.L., Lenstra, J.K., 1996. Job shop scheduling by local search. INFORMS J. Comput. 8 (3), 302–317. Vig, M.M., Dooley, K.J., 1991. Dynamic rules for due date assignment. Int. J. Prod. Res. 29 (7), 1361–1377. Zhang, C.Y., Li, P.G., Rao, Y.Q., Guan, Z.L., 2008. A very fast TS/SA algorithm for the job shop scheduling problem. Comput. Oper. Res. 35 (1), 282–294. Zhang, R., Wu, C., 2011. A simulated annealing algorithm based on block properties for the job shop scheduling problem with total weighted tardiness objective. Comput. Oper. Res. 38 (5), 854–867. Zhou, H., Cheung, W., Leung, L.C., 2009. Minimizing weighted tardiness of job-shop scheduling using a hybrid genetic algorithm. Eur. J. Oper. Res. 194 (3), 637–649.