Computers & Operations Research 30 (2003) 1575 – 1593
Impact of the replacement heuristic in a grouping genetic algorithm

Evelyn C. Brown*, Robert T. Sumichrast

Business Information Technology, 1007 Pamplin Hall (0235), Virginia Tech, Blacksburg, VA 24061, USA
Pamplin College of Business, 1044 Pamplin Hall (0209), Virginia Tech, Blacksburg, VA 24061, USA

*Corresponding author. Tel.: +1-540-231-4535; fax: +1-540-231-3752. E-mail addresses: [email protected] (E.C. Brown), [email protected] (R.T. Sumichrast).

Received 1 August 2001; received in revised form 1 April 2002

Abstract

The grouping genetic algorithm (GGA), developed by Emmanuel Falkenauer, is a genetic algorithm whose encoding and operators are tailored to suit the special structure of grouping problems. In particular, the crossover operator for a GGA involves the development of heuristic procedures to restore group membership to any entities that may have been displaced by preceding actions of the operator. In this paper, we present evidence that the success of a GGA is heavily dependent on the replacement heuristic used as a part of the crossover operator. We demonstrate this by comparing the performance of a GGA that uses a naïve replacement heuristic (GGA0) to a GGA that includes an intelligent replacement heuristic (GGACF). We evaluate both the naïve and intelligent approaches by applying each of the two GGAs to a well-known grouping problem, the machine-part cell formation problem. The algorithms are tested on problems from the literature as well as randomly generated problems. Using two measures of effectiveness, grouping efficiency and grouping efficacy, our tests demonstrate that adding intelligence to the replacement heuristic enhances the performance of a GGA, particularly on the larger problems tested. Since the intelligence of the replacement heuristic is highly dependent on the particular grouping problem being solved, our research brings into question the robustness of the GGA.

Scope and purpose

Our research investigates the significance of the replacement heuristic used as a part of the crossover operator in a grouping genetic algorithm (GGA). We test two GGAs and compare their replacement heuristics using test problems from the well-known machine-part cell formation domain. The purpose of our research is three-fold. First, we compare and contrast the GGA with the standard GA to improve understanding of how they differ in problem representation and operation. Second, we provide evidence that the GGA is limited not only to problems where the objective is to form groups, but also to problems where it is practical to incorporate a substantial amount of problem-specific information. Third, we estimate the impact that the GGA replacement heuristic has on performance. Results indicate that the GGA performs up
to 40% worse when problem-specific knowledge is not incorporated into the replacement heuristic.

Keywords: Grouping genetic algorithm; Heuristic
1. Introduction

The genetic algorithm (GA), introduced by John Holland [1], is a heuristic search technique based on an analogy to natural selection and survival of the fittest. A GA employs a population of solutions and combines parts of the good solutions in an attempt to reach better solutions. Although the GA is a heuristic technique which cannot guarantee optimality, its superior performance on numerous complex problems has helped it to become a popular solution methodology [2].

Falkenauer [3] developed the grouping genetic algorithm (GGA) as a GA whose encoding and operators are tailored to accommodate the special structure of grouping problems. In his text, Genetic Algorithms and Grouping Problems [4], Falkenauer explains why the standard GA is not suited for grouping problems. He provides details on the encoding and operators of the GGA and includes application examples.

The focus of this paper is on the problem-dependent heuristics employed as a part of the GGA crossover operator. These heuristics are used to replace any items which are no longer associated with a group due to previous actions of the crossover operator. Our research demonstrates that the intelligence built into these heuristics is a key to successful performance of the GGA.
2. Literature review

In this section, we provide background on GAs and trace their development and use. Our purpose in providing information on GAs is only to summarize basic characteristics so that fundamental contrasts with the GGA can be drawn later. Details on the GA can be found in Holland [5], Goldberg [6], or Michalewicz [7]. Following the introduction to GAs, we discuss the grouping GA and recent GGA applications.

2.1. Genetic algorithms

GAs were developed by John Holland and his colleagues at the University of Michigan in the early 1970s. GAs are heuristic search algorithms that utilize an analogy to natural selection and survival of the fittest. These algorithms operate by combining parts of good solutions in hopes of producing better solutions.

A GA represents problem solutions as strings of values; these strings are called chromosomes. Each chromosome represents a possible solution to the problem. A chromosome is evaluated based on the objective function value of the solution it represents. This value, or a function of it, is assigned as the chromosome's fitness.
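To make these terms concrete, the following minimal Python sketch (ours, not from the paper) evaluates the fitness of each chromosome in a small population; the one-max objective and all names are purely illustrative.

```python
import random

def fitness(chromosome):
    # Toy objective ("one-max"): the count of 1-genes; a real GA would
    # use the problem's objective function value here.
    return sum(chromosome)

# A population of 20 chromosomes, each a string of 10 binary genes.
population = [[random.randint(0, 1) for _ in range(10)] for _ in range(20)]
best = max(population, key=fitness)
print(best, fitness(best))
```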
GAs work with a population of solutions rather than just one. They use a variety of operators to combine and modify parts of parent chromosomes, chosen from the current generation of the population, and produce offspring, which may be selected to survive into future generations. The offspring chromosomes ideally combine the advantages of both parents. Crossover and mutation, the most fundamental GA operators, allow for a balance of exploitation and exploration of the search space. Crossover combines parts of good solutions to exploit promising areas of the search space, while mutation works to recover lost material and explore new regions of the search space.

2.2. GA applications

One of the chief advantages of the GA over tailored heuristics is the robustness of the GA. As Goldberg [6] points out, "(it is) worthwhile sacrificing peak performance on a particular problem to achieve a relatively high level of performance across the spectrum of problems" (p. 5). Many applications of the GA in scholarly literature, as well as commercially available, general-purpose GA software such as Evolver [8], attest to the success of this approach. GAs have been shown to work well across a range of combinatorial optimization problems [2].

Although GAs are robust and have been successfully applied to a broad range of problems, many researchers have tailored GA operators or proposed hybrid GAs to achieve the best performance for a narrow problem domain. Published research in these areas is extensive, as most researchers recognize that top performance is a common criterion for an algorithm's acceptance. Examples of hybrid and tailored GA applications can also be found in Alander [2].

2.3. Grouping GAs

A grouping problem is a type of problem whose goal is to partition the members of a set U into one or more groups. The subsets ui of U that are created by this partitioning must be mutually disjoint, meaning each member of U is in exactly one group. In most situations, there are constraints on how the members of U may be partitioned into groups. Within these constraints, the objective is to create groups which optimize a given cost function. Examples of grouping problems include machine-part cell formation (MPCF), bin packing, assembly line balancing, and graph coloring.

Introduced by Falkenauer [3], the GGA is a GA designed specifically for grouping problems. The GGA incorporates problem-specific information in its encoding and operators. Falkenauer's GGA is not just a tailored GA; it is an algorithm with distinct differences from the standard GA. Section 3 provides details on these differences. A more complete treatment of the GGA can be found in Falkenauer's text Genetic Algorithms and Grouping Problems [4].

2.4. GGA applications

Falkenauer's early research [9] provides examples of the GGA applied to the bin packing problem. As an extension, Falkenauer [10] employs a local optimization technique to explore the enhancement of the bin packing GGA. Zulawinski [11] modifies Falkenauer's hybrid GGA for bin packing using a swapping heuristic of his own. Zulawinski also applies his swapping technique to the minimum makespan problem. Rekiek et al. [12] apply a grouping GA to the multi-product resource planning
problem. Later work by these authors utilizes a GGA to help with the assembly line balancing problem [13].

3. GA and GGA contrasts

Falkenauer [3] points to three major drawbacks of applying a classical GA to a grouping problem: (1) the standard encoding scheme is highly redundant; (2) the standard recombination operator may produce new population members that possess none of the qualities of the members used to produce them; (3) other standard operators may cause too much disruption to successful population members.

3.1. Encoding

As previously stated, with GAs, each solution is encoded by an entity known as a chromosome. The individual elements that compose the chromosome are called genes. As an example, consider a chromosome for a grouping problem. The chromosome ABCAB contains five genes and represents a solution that places the first and fourth items into the same group, the second and fifth items into the same group, and the third item into a group by itself. We could also represent this exact solution using the chromosome CABCA. It is easy to see the redundancy that Falkenauer identifies as a key inefficiency when using a standard GA encoding for a grouping problem.

With GGA encoding, Falkenauer does not eliminate redundancy, but instead shifts the focus of a chromosome to the composition of groups, not the names attached to the groups. The encoding scheme Falkenauer [4] developed for GGA chromosomes includes an additional group portion that contains a listing of the groups from the main (or object) portion of the chromosome. This modification to the standard chromosome allows the crossover and mutation operators to be performed on the group portion of the chromosome. The object portion of the chromosome identifies which objects form which groups; the names assigned to these groups (i.e., "A", "B", or "C") are not significant. This change in encoding is significant because, used in conjunction with the modified crossover operator, it allows groups to be modified as a whole, not just individual members to be changed. This becomes more clear when we investigate GGA crossover.

3.2. Crossover

Crossover is the primary combination operator associated with GAs. The objective of the crossover operator is to combine good qualities of two separate parent chromosomes and create an offspring that represents a better solution than either of the two parents. Details of the crossover operator may vary with each implementation; however, one of the most commonly used forms of crossover in GAs is standard single-point crossover [6].

The crossover operation is performed or not performed according to a random variable with a probability called the crossover rate or crossover probability. When single-point crossover is performed, two parents are selected, and a crosspoint is chosen randomly for each parent. An offspring is generated by combining the section to the left of the crosspoint from one parent with the section to the right of the crosspoint from the other parent. The process is repeated with the parents' roles
reversed to yield a second offspring. The result is two offspring, both of which contain a portion of each of the parents used to produce them. Researchers have proposed various ways to deal with a parent pair in the absence of crossover. In our algorithm, if crossover is not performed, the two parents are simply copied into the offspring generation. As an example, consider the two parents given below, with the "—" symbol used to indicate a randomly generated crosspoint.

parent one: AB—CB;    offspring one: ABAC;
parent two: BC—AC;    offspring two: BCCB.
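The mechanics take only a few lines of Python; the sketch below is our illustration, not code from the paper, and it reproduces the example above when the crosspoint falls after the second gene.

```python
import random

def single_point_crossover(p1, p2, crosspoint=None):
    # Choose one crosspoint, then exchange tails to create two offspring.
    if crosspoint is None:
        crosspoint = random.randint(1, len(p1) - 1)
    child1 = p1[:crosspoint] + p2[crosspoint:]
    child2 = p2[:crosspoint] + p1[crosspoint:]
    return child1, child2

print(single_point_crossover("ABCB", "BCAC", crosspoint=2))
# -> ('ABAC', 'BCCB')
```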
Crossover provides a means of combining parts of successful chromosomes. In many types of problems, patterns of gene values "close together" are meaningful, and preserving a section of genes from a good solution tends to lead to another good solution. The term "close together" refers to genes whose locations differ by one or a small number of locations relative to the size of the chromosome. The term does not refer to the genes' values. The reader interested in a more formal explanation should investigate the schemata theorem [6].

In a grouping problem, however, patterns of gene values "close together" are much less meaningful. Preserving these patterns is not an appropriate goal for a recombination operator. In a grouping problem, chromosomes representing high-quality solutions are likely to include groups that should be preserved during recombination. Consider the example above where the chromosomes represent solutions to a grouping problem. Each of the two parents has the second and fourth items in the same group, but neither offspring shares this feature. Also, each parent has items divided into three specific groups; however, offspring two has items in only two groups. Thus, the goal of passing chromosome characteristics from parent to offspring is not met when standard single-point crossover is applied to a chromosome for a grouping problem.

To address the special structure of grouping problems and the different goal of the recombination operator, Falkenauer created the GGA with a new, non-standard crossover operator. Falkenauer [4] proposes the following five-step crossover operator for the GGA. Note that operations are performed on the group portion of each chromosome to promote the retention of existing groups.

(1) Select two crosspoints for the group portion of each parent chromosome.
(2) Inject the cross-section of the second parent at the first crosspoint of the first parent.
(3) There may now exist items that are members of two groups, a group from parent one and a group from parent two's injected portion. Remove each such item from the group it was a member of in parent one. This means the group membership of parent one may be altered. Some groups from parent one may no longer contain the same objects they contained originally.
(4) If needed, apply local, problem-dependent heuristics in order to adapt the resulting groups. These adaptations will depend on the constraints and objective function of the problem.
(5) Apply points (2)-(4) to the pair of parents with their roles reversed.

Unlike the standard GA, the GGA explicitly calls for problem-dependent heuristics in step 4. With the MPCF problem, such heuristics are needed any time step 3 leads to groups which contain either no parts or no machines.
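One direction of this five-step operator can be sketched as follows, under the simplifying assumption that each parent is a list of item sets and with step 4 deferred to a caller-supplied repair function; the injection point and data layout are our simplifications, not Falkenauer's exact implementation.

```python
import random

def gga_crossover(parent1, parent2, repair):
    # Step 1: choose two crosspoints on the group portion of parent two.
    i, j = sorted(random.sample(range(len(parent2) + 1), 2))
    injected = [set(g) for g in parent2[i:j]]
    duplicates = set().union(*injected)
    # Steps 2-3: inject the cross-section (prepended here for simplicity)
    # and remove duplicated items from the groups they occupied in
    # parent one; the injected groups remain intact.
    altered = [set(g) - duplicates for g in parent1]
    offspring = injected + [g for g in altered if g]
    # Step 4: problem-dependent heuristics adapt the resulting groups,
    # e.g. dissolving infeasible groups and reassigning displaced items.
    return repair(offspring)
```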
3.3. Mutation

One other common operator employed by a GA is mutation. With mutation, the structure of a chromosome is altered slightly in an attempt to explore some new portion of the search space. Standard mutation involves selecting a gene randomly and changing its value. In the case of grouping problems, standard mutation is often too disruptive, greatly reducing a solution's quality [4]. Falkenauer [4] does not specify a generally applicable mutation operator for grouping problems; however, he specifies strategies. The three basic strategies that he proposes for GGA mutation are creating a new group, eliminating an existing group, or randomly shuffling selected items among their respective groups. If the mutation operator creates infeasible chromosomes, then problem-dependent heuristics are employed to restore feasibility.

In previous research, testing indicated that the inclusion of mutation did not improve the performance of a GGA for the MPCF problem [14]. In our present research, the primary goal is to assess the impact of the crossover operator's replacement heuristic. For these reasons, we set the mutation rate to 0 in all testing of GGA0 and GGACF.
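Although mutation is disabled in our experiments, the three strategies can be sketched as follows; this is an illustrative Python fragment of ours, assuming a chromosome's groups are sets of items and leaving feasibility repair to a separate step.

```python
import random

def gga_mutate(groups):
    strategy = random.choice(["create", "eliminate", "shuffle"])
    if strategy == "create" and any(len(g) > 1 for g in groups):
        donor = random.choice([g for g in groups if len(g) > 1])
        groups.append({donor.pop()})           # start a new group
    elif strategy == "eliminate" and len(groups) > 1:
        victim = groups.pop(random.randrange(len(groups)))
        for item in victim:                    # scatter its members
            random.choice(groups).add(item)
    else:                                      # shuffle one item per group
        moved = [g.pop() for g in groups if g]
        for item in moved:
            random.choice(groups).add(item)
    return groups                              # repair feasibility afterwards
```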
4. MPCF problem

We investigate the role of the crossover operator's replacement heuristic in GGA performance using the most widely studied grouping problem: the MPCF problem. The MPCF problem arises in the manufacturing industry when parts need to be grouped into families based on their processing requirements and machines need to be separated into groups based on their abilities to process specific part families. The objective of the MPCF problem is to create machine-part (MP) groups, called cells, that allow for the most efficient, and thus the most cost-effective, processing possible. For cells to operate efficiently, traffic between cells should be kept to a minimum and utilization of machines should be kept near maximum. Intercell traffic occurs when a part must leave its cell to be processed by a machine in another cell. Reduced utilization of a machine occurs when a machine must sit idle because it is not required for processing one or more parts in its cell. Further details on the MPCF problem can be found in Burbidge [15].

4.1. MPCF representation

The MPCF problem has been the topic of much research [15-24]. Researchers have focused on the block diagonalization of the given MP incidence matrix. A member of the MP matrix, aij, has a value of 1 if machine i is required by part j, and a value of 0 otherwise. As an example, consider an MPCF problem consisting of four machines and 10 parts. The MP incidence matrix in Fig. 1 indicates which machines are required by each of the parts. In the figure, blank cells represent zero values. Once groups have been identified by an MPCF solution, the columns and rows of the MP matrix can be arranged in a corresponding order and evaluated to determine the solution's efficiency. Optimal solutions to the MPCF problem are those that contain the fewest voids (zeros in the diagonal blocks) and the fewest exceptions (ones in the off-diagonal area).
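Given an incidence matrix and a candidate assignment of machines and parts to cells, voids and exceptions can be counted directly; the sketch below is ours, using 0-based indices and hypothetical argument names.

```python
def voids_and_exceptions(a, machine_cell, part_cell):
    # a[i][j] is 1 if machine i is required by part j, else 0.
    voids = exceptions = 0
    for i, row in enumerate(a):
        for j, need in enumerate(row):
            same_cell = machine_cell[i] == part_cell[j]
            if need and not same_cell:
                exceptions += 1     # a 1 outside the diagonal blocks
            elif not need and same_cell:
                voids += 1          # a 0 inside a diagonal block
    return voids, exceptions
```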
Fig. 1. Sample MP incidence matrix.
Fig. 2. Sample solution matrix.
Fig. 2 represents one possible solution to the 4 × 10 sample problem. This solution contains three groups. The first consists of machine 1 and parts 1, 7, and 10. The second is composed of machine 4 and parts 2, 4, 8, and 9. The third includes machines 2 and 3, and parts 3, 5, and 6.

The solution represented by Fig. 2 contains voids and exceptions. Voids occur when a part does not require one of the machines in its group. For example, part 7 does not require machine 1, and the sample solution matrix reflects this condition with a void in that position. In general, such voids indicate machine utilization below 100%. There are eight exceptional elements in the solution represented by Fig. 2. Exceptional elements arise when a part requires a machine from another group. In the solution matrix, they are indicated by ones located outside of the diagonal blocks. As an example, part 7 requires machine 4, and machine 4 is not in the same group as part 7. This means part 7 must move outside of its group to be processed. Exceptional elements indicate intercell traffic, which should be minimized in optimal solutions to the MPCF.

4.2. MPCF objective function

A measure of effectiveness must be chosen in order to evaluate and compare multiple solutions to the same MPCF problem. We compare solution quality based on both grouping efficiency [18] and grouping efficacy [22]. The grouping efficiency, η, is calculated as a weighted average of η1 and η2, using the formula:

$$\eta = q\eta_1 + (1 - q)\eta_2, \qquad (1)$$

where 0 ≤ q ≤ 1, and

$$\eta_1 = \frac{e_d}{\sum_{r=1}^{k} M_r N_r}, \qquad (2)$$

$$\eta_2 = 1 - \frac{e_o}{mn - \sum_{r=1}^{k} M_r N_r}, \qquad (3)$$
where ed is the total number of ones in the diagonal blocks, eo the total number of ones in the off-diagonal blocks, k the number of groups in the current solution, m the number of machines (rows), n the number of components (columns), Mr the number of machines in the rth cell, and Nr the number of components in the rth cell.

η1 expresses the ratio of the number of non-zero elements in the diagonal blocks to the total number of elements in the diagonal blocks. A value of η1 close to 1.0 indicates the machine utilization is close to 100%. η2 represents the ratio of the number of zeros in off-diagonal sections to the total number of elements in the off-diagonal sections. Values of η2 close to 1.0 indicate minimal intercell traffic. The weighting factor q "enables the analyst to alter the emphasis between utilization and intercell movement, depending on the specific requirements of the given problem" [18, p. 457].

Kumar and Chandrasekharan [22] developed a second, similar method for measuring the quality of different solutions to an MPCF problem. Their measure, grouping efficacy, considers the "number of operations" to be the number of ones in the original MP matrix. Let φ be defined as the ratio of exceptional elements to the number of operations, and ψ be defined as the ratio of voids to the number of operations. The grouping efficacy formula is given as

$$\Gamma = \frac{1 - \varphi}{1 + \psi} = \frac{1 - e_o/e}{1 + e_v/e} = \frac{e - e_o}{e + e_v} = 1 - \frac{e_o + e_v}{e + e_v}, \qquad (4)$$

where e is the number of operations, ev the number of voids, and eo the number of exceptions. With efficacy, the influence of exceptions and voids is not identical. This is an intentional variation from the efficiency formula because Kumar and Chandrasekharan [22] believe that, in real life, exceptional elements are much more significant than voids.
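Both measures are straightforward to compute once the block counts are known; the following sketch (our code, with illustrative argument names) mirrors Eqs. (1)-(4).

```python
def grouping_efficiency(e_d, e_o, m, n, cell_sizes, q=0.5):
    # cell_sizes is a list of (M_r, N_r) pairs, one per cell.
    diag = sum(M_r * N_r for M_r, N_r in cell_sizes)
    eta1 = e_d / diag                   # Eq. (2): utilization in the blocks
    eta2 = 1 - e_o / (m * n - diag)     # Eq. (3): sparsity outside them
    return q * eta1 + (1 - q) * eta2    # Eq. (1)

def grouping_efficacy(e, e_o, e_v):
    return (e - e_o) / (e + e_v)        # Eq. (4)
```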
5. GGA for the MPCF problem

In this section we present two algorithms, GGA0 and GGACF, and the details of their implementation. GGA0 is designed to follow Falkenauer's general proposal for a GGA as closely as possible. In particular, its replacement heuristic minimizes the incorporation of knowledge specific to the problem domain. On the other hand, GGACF includes an intelligent replacement heuristic that refers to the given incidence matrix when making decisions.

5.1. Chromosome encoding

When designing GGA0, we selected an encoding that fits naturally with the MPCF problem. As explained in Section 3.1, the GGA chromosome includes an object part and a group part. The natural representation for the MPCF lists the group of each component and machine in the object portion of the chromosome. The following notation is used for chromosome representation in GGA0:

c1 c2 c3 ... cP | m1 m2 m3 ... mM | g1 g2 g3 ... gG,

where

ci = group to which component i is assigned, i = 1, ..., P; 1 ≤ ci ≤ G;
mj = group to which machine j is assigned, j = 1, ..., M; 1 ≤ mj ≤ G;
gt = an existing group number, each listed once, t = 1, ..., G;
G = number of groups in the solution; G = max{m1, m2, ..., mM}; G ≤ min(M, P);
P = number of components in the problem;
M = number of machines in the problem.

The length of the object portion of the chromosome is constant for all chromosomes and depends on the number of components and machines in the problem. However, the group section may vary in length. Its length depends upon the number of groups into which the components and machines are divided.

To better understand the encoding, consider the problem of grouping 10 components and four machines into cells. Under the assumption of no outside problem constraints, one solution to this problem is the chromosome 2 1 3 1 3 3 2 1 1 2 | 2 3 3 1 | 2 1 3. In this chromosome, components 1, 7 and 10 are assigned to group 2. In addition, machine 1 is in group 2. Relating this more directly to the MPCF problem, a cell is designed with machine 1 processing components 1, 7 and 10. Similarly, components 2, 4, 8, and 9 are in a cell with machine 4; and components 3, 5, and 6 are in a cell with machines 2 and 3. Fig. 2 provides the MP matrix corresponding to this particular solution.
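Decoding this string back into cells is mechanical; the short sketch below (ours) parses the example chromosome, treating the group portion as implied by the object portion.

```python
def decode(chromosome):
    comps, machines, _ = chromosome.split("|")
    cells = {}
    for j, g in enumerate(comps.split(), start=1):
        cells.setdefault(g, ([], []))[0].append(j)   # component list
    for i, g in enumerate(machines.split(), start=1):
        cells.setdefault(g, ([], []))[1].append(i)   # machine list
    return cells

print(decode("2 1 3 1 3 3 2 1 1 2 | 2 3 3 1 | 2 1 3"))
# {'2': ([1, 7, 10], [1]), '1': ([2, 4, 8, 9], [4]), '3': ([3, 5, 6], [2, 3])}
```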
5.2. Initial population generation

As a first step, the GGA must generate an initial population of chromosomes. For GGA0, these values are created using random number generation. The algorithm ensures that all chromosomes in the initial population are feasible.

5.3. Fitness calculation and ranking

Once an initial, feasible population is created, the fitness of each chromosome is calculated. Fitness is based on an appropriate objective function such as the efficiency or efficacy formulas given in Section 4. The chromosomes are next sorted in ascending order by fitness, and rank-based fitness is assigned using Reeves' [25] formula:

$$\mathrm{rankfit}(z) = \frac{2z}{N(N + 1)}. \qquad (5)$$

In Eq. (5), z is the rank of the chromosome based on ascending order of fitness, and N is the total number of chromosomes in the population. Using rank-based fitness allows GGA0 to more clearly differentiate among chromosomes whose raw fitness scores are very similar. Also, once converted to a rank-based score, the total fitness of the population sums to one. This conversion helps to simplify the selection scheme used.

5.4. Selection of parent chromosomes

Before the crossover operator can be applied, a pair of parent chromosomes must be selected. For GGA0, this is done using rank-based roulette wheel selection [6].
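Eq. (5) and the wheel itself take only a few lines; this sketch is our paraphrase, keying rank-based fitness by object identity for simplicity.

```python
import random

def rank_fitness(population, raw_fitness):
    # Eq. (5): sort ascending by raw fitness; ranks z = 1..N scale so the
    # population's rank-based fitness sums to one.
    N = len(population)
    ranked = sorted(population, key=raw_fitness)
    return {id(c): 2 * (z + 1) / (N * (N + 1)) for z, c in enumerate(ranked)}

def roulette_select(population, rankfit):
    # Spin the wheel: return the chromosome whose cumulative fitness
    # interval contains a uniform random draw.
    r, cum = random.random(), 0.0
    for c in population:
        cum += rankfit[id(c)]
        if r <= cum:
            return c
    return population[-1]
```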
For GGA0, we require that parent one and parent two be different chromosomes. Once two unique parents are chosen, crossover begins.

5.5. Crossover

Utilizing the five basic steps of GGA crossover as given in Section 3.2, our first task is to determine two crosspoints for the group portion of each parent chromosome. Once again, random number generation is employed. To illustrate all of the steps of crossover, we will use an example. Consider again the problem of grouping 10 components and four machines into cells. Assume that the MP incidence matrix shown in Fig. 1 reflects the conditions for the example. Also, assume that the parents selected for crossover are as follows:

parent one: 2 1 3 1 3 3 2 1 1 2 | 2 3 3 1 | 2 1 3;
parent two: 1 3 2 1 2 3 1 2 1 3 | 3 2 1 3 | 1 3 2.
Recall that crossover is performed on the group portion of the chromosomes. With this in mind, we now consider the group portions of each parent. Randomly generated crosspoints have been selected before groups 2 and 3 in parent one and before groups 3 and 2 in parent two. These crosspoints cause the formation of a cross-section, the section between the crosspoints. GGA0 does not allow two crosspoints at the same location for the same parent, meaning that the cross-section always consists of at least one group. In this example, the cross-section of parent one consists of groups 2 and 1. For parent two, the cross-section consists only of group 3.

Our second task for GGA crossover is to create the first offspring. To do this, the cross-section of parent two is inserted after the first crosspoint of parent one. The result is offspring one = 3′ 2 1 3. Note that the first group is written with a prime (3′) to signify that it is part of the inserted section, not part of the original parent. Now the composition of each group of offspring one is listed, with braces used to separate the components list from the machines list. For example, group 2 includes components 1, 7, and 10, as well as machine 1:

group 3′: {2, 6, 10}, {1, 4};
group 2:  {1, 7, 10}, {1};
group 1:  {2, 4, 8, 9}, {4};
group 3:  {3, 5, 6}, {2, 3}.
Note that five items now occur twice: component 2, component 6, component 10, machine 1 and machine 4. Following Falkenauer's steps for crossover, as a third task we remove the duplicates from the groups they were in as a part of parent one. Thus, the injected section, 3′, remains intact and the other groups are subject to alteration. The result of this step is the following:

group 3′: {2, 6, 10}, {1, 4}  [unaltered];
group 2:  {1, 7}, { }         [component 10 and machine 1 removed];
group 1:  {4, 8, 9}, { }      [component 2 and machine 4 removed];
group 3:  {3, 5}, {2, 3}      [component 6 removed].
The solution resulting from this process may be infeasible, as this example illustrates. A necessary condition for feasibility in the MPCF problem is that each group contains at least one component
and one machine. Currently, components 1, 7, 4, 8, and 9 are assigned to groups with zero machines. As a result, groups 2 and 1 are dissolved. Their displaced members are assigned to existing groups using a replacement heuristic.

For GGA0, a naïve replacement heuristic is employed for the fourth task of crossover. This heuristic randomly assigns the displaced components (or machines) to an existing group, with each group having equal probability of being chosen. Since the only two existing groups in our example are groups 3′ and 3, each of the five displaced components has a 50% chance of being assigned to group 3′ and a 50% chance of being assigned to group 3. As the last step in the GGA crossover operator, we repeat the offspring creation process with the roles of the parents reversed.

5.6. Selection of next generation

The steps of selecting pairs of parents and creating children are repeated until N offspring are created. Assuming we start with an initial population of size N, we now have 2N chromosomes from which we must choose our next generation. GGA0 employs a rank-based roulette wheel selection strategy when determining which chromosomes survive into the next generation. This strategy involves scaling the raw fitness scores of each chromosome using Eq. (5) and employing random number generation to select chromosomes based on their rank-based fitness values. However, with next generation selection, GGA0 automatically passes the most fit chromosome to the next generation. Also, once a chromosome has been selected to be a part of the next generation, it cannot be selected again. This type of generational selection without replacement helps to maintain population diversity and avoid premature convergence.

5.7. Termination of algorithm

Once the next generation has been selected, the algorithm returns to the action of selecting parent pairs and the process repeats. GGA0 terminates once a fixed number of generations have occurred. This parameter value varies depending upon the experiment.

5.8. GGACF replacement heuristic

GGA0 and GGACF are identical in all aspects except one: the replacement heuristic used as a part of the crossover operator. With GGACF, the group into which a particular component is placed is not selected randomly. Rather, the replacement heuristic utilizes the information contained in the MP incidence matrix to place each displaced component into the group containing the most machines it needs. Similarly, the heuristic places each displaced machine into the group containing the most components that require it. Ties are broken arbitrarily. As a first step of the heuristic, the set of displaced items is randomly ordered before any reassignments are made. This is done because the order in which displaced components and machines are reassigned has an impact on succeeding reassignments. The steps of the replacement heuristic follow, as originally presented in Brown and Sumichrast [14]; a code sketch appears after the list.
Replacement heuristic for GGACF:

1. Create a randomly ordered list of all displaced components and machines.
2. Determine what type of item is at the top of the list (component: go to step 3; machine: go to step 7).
3. For the displaced component j, create S(j), the set of all machines used by component j. Refer to the MP incidence matrix for this information.
4. For each of the currently existing groups, determine the number of machines from S(j) in each of these groups. Identify the group, G*, with the maximum number of machines from S(j). Break ties arbitrarily.
5. Assign the displaced component j to group G*.
6. If displaced components or machines still exist, return to step 2; otherwise stop.
7. For the displaced machine i, create T(i), the set of all parts which require machine i. Refer to the MP incidence matrix for this information.
8. For each of the currently existing groups, determine the number of parts from T(i) in each of these groups. Identify the group, G**, with the maximum number of parts from T(i). Break ties arbitrarily.
9. Assign the displaced machine i to group G**.
10. If displaced components or machines still exist, return to step 2; otherwise stop.

This replacement heuristic is intended to help increase machine utilization and decrease intercell traffic. Since high utilization and low intercell traffic are goals of grouping efficiency and grouping efficacy, they fit logically as goals for the replacement heuristic.
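As promised above, here is a compact rendering of steps 1-10; it is our sketch, assuming `groups` maps a group id to a (component set, machine set) pair, `displaced` is a list of ('C', j) or ('M', i) tags, and `incidence` is the 0-based MP matrix.

```python
import random

def replace_displaced(groups, displaced, incidence):
    random.shuffle(displaced)                        # step 1
    for kind, idx in displaced:                      # steps 2, 6, 10
        if kind == 'C':                              # steps 3-5: component j
            S = {i for i in range(len(incidence)) if incidence[i][idx]}
            best = max(groups, key=lambda g: (len(S & groups[g][1]),
                                              random.random()))
            groups[best][0].add(idx)                 # assign j to G*
        else:                                        # steps 7-9: machine i
            T = {j for j in range(len(incidence[0])) if incidence[idx][j]}
            best = max(groups, key=lambda g: (len(T & groups[g][0]),
                                              random.random()))
            groups[best][1].add(idx)                 # assign i to G**
    return groups
```

Including `random.random()` in the sort key breaks ties arbitrarily, matching steps 4 and 8.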
Returning to the example begun in Section 5.1, suppose we now employ the heuristic of GGACF in order to restore feasibility after the third step of crossover. First, we must randomly order the list of displaced items, which are all components for our example. Suppose the random ordering is given by (9, 8, 4, 7, 1). To help determine where to insert each of these five components, we refer to the given MP matrix from Fig. 1.

From Fig. 1, we see that component 9 requires only machine 4. This being the case, component 9 is inserted into group 3′, since group 3′ currently contains machine 4. When replacing component 8, the MP matrix indicates component 8 requires machine 1 and machine 4. In this case, we place component 8 in group 3′ because that is the group containing both of the machines component 8 needs. When replacing component 4, the MP matrix indicates component 4 requires machines 3 and 4. We are forced to choose between groups 3′ and 3 for insertion of this component. Each group is given equal weight, and a random number is generated to determine the cell to which component 4 is assigned. For the example, we assume component 4 is assigned to group 3. Similar ties must be broken when placing component 7 and component 1 into new groups, as both components require one machine from each of the two groups. Assume we assign component 7 to group 3 and component 1 to group 3′. With all of the reassignments now made, the resulting feasible groups are:

group 3′: {2, 6, 10, 8, 9, 1}, {1, 4};
group 3:  {3, 5, 4, 7}, {2, 3}.
Table 1
Test problem sizes

Problem    Size
CR1        24 × 40
CR2        24 × 40
CR3        24 × 40
CR4        24 × 40
CR5        24 × 40
CR6        24 × 40
SRI        24 × 40
CAR        20 × 35
KIN        16 × 43
SAN        10 × 20
This leads to the resulting offspring one: 3′ 3′ 3 3 3 3′ 3 3′ 3′ 3′ | 3′ 3 3 3′ | 3′ 3. To avoid the confusion caused by the notation, we renumber the groups in order of appearance and obtain the final offspring chromosome: 1 1 2 2 2 1 2 1 1 1 | 1 2 2 1 | 1 2.

6. Experimentation and analysis

In this section we provide details of the four experiments used to test the performance of both GGA0 and GGACF. For the first three experiments, we utilize problems from the literature. For the last experiment, we use a set of randomly generated problems.

6.1. Experiment one

In our first experiment, we use seven problems from the literature and evaluate the performance of the two different algorithms based on grouping efficiency. Six of the problems analyzed are taken from Chandrasekharan and Rajagopalan [21] and one is taken from Sandbothe [24]. Table 1 presents the sizes for these problems. Fig. 3 presents the efficiency scores achieved by both GGA0 and GGACF for each of the seven problems. The figure also presents the time (in seconds) needed for each algorithm to terminate. Parameters used for experiment one include a crossover rate of 1.0, a population size of 50, and termination after 25 total generations.

As a first step in analyzing efficiency and efficacy values for all four of our experiments, we perform a Kolmogorov–Smirnov normality test. At the α = 0.05 level, it allows the assumption that all efficiency and efficacy data are normally distributed. Also, we examine the variances involved in each of our four experiments and note significant differences. Thus, we utilize a two-sample t-test of means assuming unequal variances. Since GGACF includes an intelligent heuristic, we expect its performance to be superior, and perform one-tailed tests for mean efficiency and efficacy.

Problems CR1–CR6 are analyzed as a group because they all have the same size. For this group of problems, the average grouping efficiency using GGA0 is 79.05. This increases to 96.22 when
Fig. 3. Efficiency results for GGA0 and GGACF.
the intelligent replacement heuristic of GGACF is used. The average percent difference (Δ) for this group of problems is 18%, where

$$\Delta = \frac{\mathrm{GGA_{CF}} - \mathrm{GGA_0}}{\mathrm{GGA_{CF}}} \times 100\%. \qquad (6)$$

Applying the previously described t-test to these data indicates that, at the α = 0.05 level, the performance of GGACF is significantly better than the performance of GGA0. From the literature, we find that when this same set of problems is solved using the method of Chandrasekharan and Rajagopalan [21], the average grouping efficiency is 84.24.

The average time to solve these problems is less when using GGACF than GGA0 (51 vs. 54 s). A Kolmogorov–Smirnov test allows us to assume that these times are normally distributed. We then perform the same t-test used on the efficiency scores and fail to demonstrate that there is a significant difference in the run times of the algorithms for these problems, at the α = 0.05 level.

The problem from Sandbothe (SAN) shows a smaller difference in performance as measured by Δ. The intelligent replacement heuristic improves efficiency by only 5%. This lower value of Δ is a result of the smaller problem size for this example, which involves only 10 machines and 20 components. With this small size, even the naïve replacement heuristic allows the GGA to find good solutions. The efficiency for GGA0 on this problem is 90.57, as compared with the average efficiency of 79.05 for the group of 24 × 40 problems. The efficiency for this small problem does not increase as much under GGACF as it does for the 24 × 40 problems. Statistical analysis of small problems is considered in experiment four.
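The percent difference of Eq. (6) and the one-tailed unequal-variance (Welch) t-test are easy to reproduce; the scores below are hypothetical stand-ins, not the paper's data, and SciPy's `ttest_ind` with `equal_var=False` gives the unequal-variance test (halving the two-sided p-value yields the one-tailed result).

```python
from scipy import stats

def percent_difference(score_cf, score_0):
    return (score_cf - score_0) / score_cf * 100.0   # Eq. (6)

gga_cf = [96.1, 96.5, 95.9, 96.4]   # hypothetical GGA_CF efficiencies
gga_0  = [79.2, 78.6, 80.1, 78.3]   # hypothetical GGA_0 efficiencies

t, p_two_sided = stats.ttest_ind(gga_cf, gga_0, equal_var=False)
print(percent_difference(sum(gga_cf) / 4, sum(gga_0) / 4))
print(t, p_two_sided / 2)            # one-tailed test of GGA_CF > GGA_0
```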
Fig. 4. Efficacy results for GGA0 and GGACF.
is grouping efficacy [22]. This measure was created to overcome the disadvantages of the grouping efficiency measure and is currently used in the literature to evaluate solutions to MPCF problems. Experiment two includes additional problems to provide extended algorithm analysis using the more popular efficacy measure.

Fig. 4 presents the efficacy scores achieved by both GGA0 and GGACF for each of the ten problems tested. The figure also presents the time (in seconds) needed for each algorithm to terminate. Parameters used for experiment two include a crossover rate of 1.0, a population size of 100, and termination after 25 total generations.

Problems CR1–CR6, along with the problems from Srinivasan (SRI), Carrie (CAR), and King (KIN), are grouped together in the analysis of experiment two because all have approximately the same size as measured by machines and components. The average grouping efficacy for this set of problems is only 36.73 using GGA0, compared to 63.41 using GGACF. Using an α value of 0.05, a t-test indicates that GGACF performs significantly better than GGA0. From the literature, we find that the average grouping efficacy for this same set of problems is 59.98 when using other solution techniques. So, the solution quality achieved by GGA0 is not only inferior to GGACF, but it is also well below the typical performance obtained by previous researchers who have tackled the same problems.

As with experiment one, the average time to solve these problems is less when using GGACF than GGA0 (82 vs. 86 s). A Kolmogorov–Smirnov test on the times from experiment two indicates that they are not normally distributed. The non-parametric Kruskal–Wallis test does not indicate that there is a significant difference in the run times of the two algorithms for this set of problems, at the α = 0.05 level.

6.3. Experiment three

Since experiments one and two indicate that the performance of GGA0 is significantly inferior to the performance of GGACF, we alter the termination criteria of GGA0 to determine if its performance can improve over more generations. We focus exclusively on efficacy as the measure of solution quality because it is more commonly used in current literature.
Fig. 5. Efficacy results for GGA0 with 250 generations and GGACF.
Experiment three utilizes the same parameters and problems used in experiment two, with the exception of the termination criteria. GGACF is not rerun in experiment three, but GGA0 is tested using 250 total generations, or ten times the number allowed for GGACF. The results are presented in Fig. 5. A t-test is performed using all but the smaller SAN problem. On these similar-size problems, GGACF outperforms GGA0 at the α = 0.05 level, even though GGA0 is allowed ten times as many generations.

As expected, the ten-fold increase in generations leads to a proportional increase in algorithm run time. The GGA0 average run time for 25 generations is 86 s, while the average run time for 250 generations is 858 s. GGACF runs for only 25 generations, with an average run time of 82 s.

The additional time improves the performance of GGA0. In experiment two, when both algorithms are run for 25 generations, the average value of Δ is 40%. In experiment three, when GGA0 is run for 250 generations, the average value of Δ is reduced to 26%. Fig. 6 presents the results when GGA0 is run for 25 generations (as in experiment two) and for 250 generations (as in experiment three). A t-test performed using all but the SAN data indicates that the improvement achieved by GGA0 when it is allowed to run for 250 generations is significant, at the α = 0.05 level. The SAN problem is again excluded from analysis due to its smaller size.

6.4. Experiment four

In the first three experiments, the quality of the solution produced by GGA0 for the small (10 × 20) problem is similar to that produced by GGACF. In order to test the hypothesis that GGA0 and GGACF produce similar results when applied to small problems, we randomly generate six small problems, all consisting of 10 machines and 20 parts.

Experiment four uses the following parameters: a crossover rate of 1.0, a population size of 100, and termination after 100 total generations. Note that the smaller problem size allows for an increase in total generations without incurring prohibitive run times. The results of experiment four are presented
Fig. 6. Efficacy results for GGA0 with 25 and 250 generations.
Fig. 7. Efficacy results for small, randomly generated problems.
in Fig. 7. Using the previously described t-test, we fail to demonstrate that the difference in efficacy scores for GGACF over GGA0 is significant at the α = 0.05 level.

It is reasonable to expect that GGACF will not lead to significantly better efficacy scores than GGA0 for these small problems. The algorithms compared are very similar, with the only difference occurring in the replacement strategies of the crossover operator. With smaller problems, there are fewer groups, and so using intelligence to select an appropriate group for replacement of a displaced part or machine leads to less of an advantage. More generally, the size of the problem is an approximate measure of the difficulty of finding high-quality solutions. These small problems are relatively easy, so any solution method should be capable of identifying solutions close to the optimal solution.

Based on a Kolmogorov–Smirnov test for the times in experiment four, we can assume they are normally distributed. We then perform the same t-test used on the efficacy scores. Results fail to
demonstrate that there is a significant difference in the run times of the two algorithms for this set of problems, at the α = 0.05 level.

7. Conclusions

The standard GA is robust and can be applied to a broad range of optimization problems. While it is sometimes possible to improve the performance of the GA using problem-specific knowledge, the degree of improvement available does not always warrant the development time and computational effort.

The grouping genetic algorithm is guided by the same principles as the GA: maintaining a population of solutions; recombining parts of current solutions to create new ones; and performing selection and recombination operations in a manner that makes it more likely that good traits from current solutions are included in future solutions. However, GAs and grouping genetic algorithms differ in a number of ways. The GGA is designed exclusively for grouping problems and includes a crossover operator that requires problem-specific information. In particular, problem-specific information is required in the replacement step of crossover.

It is possible to use the general characteristics of a class of problems to design a replacement heuristic. In this paper, the naïve heuristic used in GGA0 relies only on the fact that a feasible MPCF solution must have parts and machines in every group. However, such basic information is not sufficient to provide high-quality solutions. For standard test problems from the MPCF literature, GGA0 achieved only an average grouping efficiency of 79.05 and an average grouping efficacy of 36.73. This is much lower than other available methods.

The replacement heuristic used in GGACF includes more than just information about the class of problem. It includes information about the specific problem, through the machine-part incidence matrix, and an "intelligent" though simple set of logical rules for determining how to restore group membership to displaced elements. The revision in the replacement heuristic dramatically improves the performance of the GGA for all except very small problems. When performance is measured by grouping efficiency, the performance improves by 18%; when measured by grouping efficacy, the performance improves by an average of 40%. These improvements are shown to be statistically significant. The level of performance of GGACF is at or above the best known solutions for the standard test problems.

The need for problem-specific knowledge to apply the GGA limits its applicability. Unlike the GA, the grouping genetic algorithm is incomplete: a researcher or practitioner must customize the algorithm. Adding problem-specific knowledge to the steps of an algorithm has the potential to increase its complexity and computational burden. Based on the experiments of this research, however, there is no statistically significant difference between the average solution time for GGA0 and GGACF.

References

[1] Holland JH. Adaptation in natural and artificial systems: an introductory analysis with applications in biology, control, and artificial intelligence. Ann Arbor, MI: University of Michigan Press, 1975.
[2] Alander JT. An indexed bibliography of genetic algorithms: years 1957–1993. Compiled by Alander JT. Printed in Finland, 1994.
[3] Falkenauer E. The grouping genetic algorithms—widening the scope of the GAs. JORBEL—Belgian Journal of Operations Research, Statistics and Computer Science 1992;33:79–102.
[4] Falkenauer E. Genetic algorithms and grouping problems. New York: Wiley, 1998.
[5] Holland JH. Genetic algorithms. Scientific American 1992;267(1):66–72.
[6] Goldberg D. Genetic algorithms in search, optimization, and machine learning. Reading, MA: Addison-Wesley, 1989.
[7] Michalewicz Z. Genetic algorithms + data structures = evolution programs. New York: Springer, 1994.
[8] Evolver 4.0, Palisade, 2000, www.palisade.com.
[9] Falkenauer E. A new representation and operators for genetic algorithms applied to grouping problems. Evolutionary Computation 1994;2(2):123–44.
[10] Falkenauer E. A hybrid grouping genetic algorithm for bin packing. Journal of Heuristics 1996;2:5–30.
[11] Zulawinski B. The swapping heuristic. MS thesis. Michigan State University, USA, 1995.
[12] Rekiek B, Falkenauer E, Delchambre A. Multi-product resource planning. Proceedings of the IEEE International Symposium on Assembly and Task Planning, 1997.
[13] Rekiek B, De Lit P, Pellichero F, Falkenauer E, Delchambre A. Applying the equal piles problem to balance assembly lines. Proceedings of the IEEE International Symposium on Assembly and Task Planning, 1999.
[14] Brown EC, Sumichrast RT. CF-GGA: a grouping genetic algorithm for the cell formation problem. International Journal of Production Research 2001;39(16):3651–69.
[15] Burbidge JL. Production flow analysis. Production Engineer 1963;42:742–52.
[16] Carrie AS. Numerical taxonomy applied to group technology and plant layout. International Journal of Production Research 1973;11:399–416.
[17] King JR. Machine-component grouping in production flow analysis: an approach using a rank order clustering algorithm. International Journal of Production Research 1980;18:213–37.
[18] Chandrasekharan MP, Rajagopalan R. An ideal seed non-hierarchical clustering algorithm for cellular manufacturing. International Journal of Production Research 1986;24:451–64.
[19] Chandrasekharan MP, Rajagopalan R. MODROC: an extension of rank-order clustering for group technology. International Journal of Production Research 1986;24:1221–33.
[20] Chandrasekharan MP, Rajagopalan R. ZODIAC: an algorithm for concurrent formation of part-families and machine-cells. International Journal of Production Research 1987;25:835–50.
[21] Chandrasekharan MP, Rajagopalan R. GROUPABILITY: an analysis of the properties of binary data matrices for group technology. International Journal of Production Research 1989;27:1035–52.
[22] Kumar CS, Chandrasekharan MP. Grouping efficacy: a quantitative criterion for goodness of block diagonal forms of binary matrices in group technology. International Journal of Production Research 1990;28:233–43.
[23] Srinivasan G. A clustering algorithm for machine cell formation in group technology using minimum spanning trees. International Journal of Production Research 1994;32:2149–58.
[24] Sandbothe RA. Two observations on the grouping efficacy measure of block diagonal forms. International Journal of Production Research 1998;36(11):3217–22.
[25] Reeves CR. A genetic algorithm for flowshop sequencing. Computers and Operations Research 1995;22:5–13.
Evelyn C. Brown is an assistant professor in the Department of Business Information Technology at Virginia Tech. She received her master's degree in Operations Research from North Carolina State University and her Ph.D. in Systems Engineering from the University of Virginia. Her research interests include genetic algorithm development and optimization of manufacturing processes.

Robert T. Sumichrast is the Associate Dean for Graduate and International Programs and professor of Business Information Technology at Virginia Tech. He received his bachelor's degree in Physics from Purdue University and his Ph.D. in Management Science from Clemson University. His research interests lie in the development of heuristics and systems to support business decision making.