Advances in Engineering Software 33 (2002) 779–791 www.elsevier.com/locate/advengsoft
Niche identification techniques in multimodal genetic search with sharing scheme Chyi-Yeu Lin*, Wen-Hong Wu Department of Mechanical Engineering, National Taiwan University of Science and Technology, 43 Keelung Road, Section 4, Taipei 10672, Taiwan, ROC Accepted 2 August 2002
Abstract
Genetic algorithms with sharing have been applied successfully to many multimodal optimization problems. Traditional sharing schemes require the definition of a common sharing radius, but a predefined radius cannot fit most problems, in which design niches are of different sizes. Yin and Germay proposed a sharing scheme with cluster analysis methods, which can determine design clusters of different sizes. Since clusters do not necessarily coincide with niches, sharing with clustering techniques fails to provide maximum sharing effects. In this paper, a sharing scheme based on niche identification techniques (NIT) is proposed, which is capable of determining the center location and radius of each existing niche from the fitness topographical information of designs in the population. Genetic algorithms with NIT were tested and compared to GAs with the traditional sharing scheme and with sharing based on cluster analysis methods in four illustrative problems. Results of numerical experiments showed that the sharing scheme with NIT improved both search stability and effectiveness in locating multiple optima. The niche-based genetic algorithm and the multiple local search approach are compared in a fifth illustrative problem involving a discrete ten-variable bump function. © 2002 Elsevier Science Ltd. All rights reserved.
Keywords: Genetic algorithms; Multimodal optimization; Sharing; Niche identification; Topographical information
* Corresponding author. E-mail address: [email protected] (C.Y. Lin).

1. Introduction

Genetic algorithms (GAs) have received much attention from researchers and design engineers in the last decade owing to their capability of attaining the global optimum in problems with complicated design spaces. Genetic algorithms have been successfully applied to structural optimization problems in which the design space is non-convex and disjoint [1]. They have also been shown to possess superior global optimum attainment capabilities in problems that include discontinuous design variables such as integer and discrete variables [2]. Owing to their robustness, genetic algorithms have been applied successfully in many engineering fields [3–6].

On many occasions, it is desirable to obtain other relative optima in addition to the global optimum, because these relative optima serve as excellent alternative solutions. For example, when design constraints are changed and the global optimum becomes infeasible, the relative optima provide handy alternatives. In another case, the global optimal design may be too expensive or even impossible to manufacture, so that other lower-cost relative optima become attractive alternatives. While simple genetic algorithms aim to seek the global optimum only, with some simple modifications they can be used to tackle multiple relative optima in a single run of genetic search.

Cavicchio [7] proposed the preselection scheme in 1970 to preserve the diversity of genetic searches. A design created through gene variation replaces the parent that is most similar to the new design. Cavicchio's preselection scheme set the first milestone for multimodal optimization with genetic algorithms. Based on Cavicchio's work, De Jong [8] proposed the crowding scheme in his 1975 dissertation: after a new design is created, the old design in the population that is most similar to the new design is chosen to be replaced, where similarity is based on the number of matching alleles. Mahfoud [9] studied the preselection and crowding schemes and found them incapable of tackling more than two peaks because of replacement errors. Mahfoud later proposed the deterministic crowding algorithm, which reduces replacement errors [10]. In 1987, Goldberg and Richardson [11] proposed a sharing strategy in which a design is penalized for the presence of other designs in its neighborhood. It was based on the theory that
0965-9978/02/$ - see front matter © 2002 Elsevier Science Ltd. All rights reserved. PII: S0965-9978(02)00045-5
C.-Y. Lin, W.-H. Wu / Advances in Engineering Software 33 (2002) 779–791
organisms in a common niche compete for the same resource, so the amount of resource that each organism can obtain is reduced by the existence of other organisms in the same niche. As a result, a design no longer seeks the niche where the amount of resource is the greatest; instead, it seeks the niche with the maximum average resource, defined as the amount of resource divided by the number of designs residing in the niche. A balance between the number of designs in a niche and the height of the niche is thus achieved automatically. Deb and Goldberg [12] reported that both forms of the sharing scheme provided much better results than the crowding scheme. They also proposed the mating restriction scheme to strengthen the effect of the sharing scheme; it allows mating only between designs located within a given distance of each other.

The proper operation of the sharing scheme proposed by Goldberg and Richardson depends on one critical parameter, the sharing radius. Two designs are deemed to be in a common niche if their distance is less than the predetermined sharing radius. In most design problems, however, niches differ in shape and size, and the use of a single sharing radius will not provide the best results. Yin and Germay [13] proposed the use of cluster analysis methods to improve the effect and speed of the sharing scheme, and their study showed that the cluster-analysis-based sharing scheme costs less computational time and remains effective. Lin and Yang [14] proposed a sharing scheme using a different cluster identification technique based on a crowdedness function, which assigns a larger measure to a design when more designs reside around it. Beasley et al. [15] proposed a sequential niche technique to tackle multiple solutions, in which one genetic search seeks a relative optimum and a subsequent genetic search looks for an optimum in the design space excluding the niche where an optimum has already been attained. Lin et al. [16] proposed hybrid multimodal optimization approaches combining genetic algorithms and local search techniques, in which each cluster formed during a genetic search is sequentially isolated from the original design space and a local search is conducted on the subspace with the best design in the cluster as the initial design.

Yin and Germay's cluster analysis methods accommodate the sharing scheme better in problems with niches of varying sizes, owing to the automatic detection of clusters from design distribution information. Nevertheless, clusters cannot functionally replace niches. A cluster is simply a number of designs that stay together in a small neighborhood, but these designs do not necessarily belong to a common valley, whereas designs in a niche must share a common valley. Successful local searches from two different initial designs in the same niche eventually lead to the same local minimum; local searches starting from two designs in a cluster, however, do not always lead to a common local minimum. The sharing scheme is meant to be applied to niches, not clusters, and proper sharing should penalize a design when neighboring designs exist inside the common niche, not when they are merely close by without being in the same niche. It is therefore the goal of the present work to provide a niche identification technique (NIT) so that proper sharing can be implemented most effectively in genetic algorithms.

In the rest of this paper, the basic implementation of the sharing scheme and mating restriction is introduced first. Yin and Germay's sharing scheme with adaptive cluster analysis methods is described next, followed by the NIT. Four illustrative problems then compare traditional sharing, Yin and Germay's sharing, and sharing with NIT, and concluding remarks are provided at the end.
2. Sharing scheme and mating restriction

The sharing scheme proposed by Goldberg and Richardson [11] is based on the idea that the fitness of a design in a niche should be degraded by the presence of other designs in the same niche. In this scheme, two designs are considered to be located inside the same niche if the distance between them is smaller than a predetermined sharing radius. Any design, $X_j$, located within the sharing radius of the considered design, $X_i$, contributes a degrading sharing function, $Sh_{ij}$, defined as follows:

$$Sh_{ij} = \begin{cases} 1 - \left(\dfrac{d_{ij}}{\sigma_{Sh}}\right)^{\alpha}, & d_{ij} < \sigma_{Sh} \\ 0, & \text{otherwise} \end{cases} \qquad (1)$$

where $d_{ij}$ is the distance between the ith design, $X_i$, and the jth design, $X_j$. The sharing radius, $\sigma_{Sh}$, is a predetermined constant that decides whether the design $X_j$ causes degradation. The second constant, $\alpha$, set exclusively to 1.0 in this work, controls the severity of sharing. The sharing function is calculated with respect to every design in the population, including the ith design itself, and the sharing-modified fitness, $\tilde{f}_i$, of the ith design is then computed as follows:

$$\tilde{f}_i = \frac{f_i}{\sum_{j=1}^{N} Sh_{ij}} \qquad (2)$$
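As an illustration of Eqs. (1) and (2), the following minimal Python sketch computes the sharing-modified fitness for a small population. It is not taken from the EVOLVE code; the function names `sharing_value` and `shared_fitness` are hypothetical, and the Euclidean distance in normalized design space is an assumption.

```python
import math


def sharing_value(d, sigma_sh, alpha=1.0):
    """Eq. (1): sharing function for two designs separated by distance d."""
    if d < sigma_sh:
        return 1.0 - (d / sigma_sh) ** alpha
    return 0.0


def shared_fitness(designs, fitness, sigma_sh, alpha=1.0):
    """Eq. (2): divide each raw fitness by the sum of its sharing values.

    Assumption: designs are tuples of normalized variables and the
    distance d_ij is Euclidean.
    """
    shared = []
    for xi, fi in zip(designs, fitness):
        niche_count = sum(
            sharing_value(math.dist(xi, xj), sigma_sh, alpha) for xj in designs
        )
        # niche_count >= 1 because the design shares with itself (Sh_ii = 1).
        shared.append(fi / niche_count)
    return shared


# Two crowded designs and one isolated design, all with equal raw fitness.
designs = [(0.10,), (0.12,), (0.80,)]
fitness = [1.0, 1.0, 1.0]
result = shared_fitness(designs, fitness, sigma_sh=0.1)
```

In this example the two neighboring designs are each divided by a niche count of about 1.8, while the isolated design keeps its raw fitness.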
If more designs are located within the sharing radius of the ith design, the original fitness of the ith design is divided by a greater number, producing a smaller modified fitness. Crossover between designs in remote niches often disrupts convergence to the optimum in each niche. Deb and Goldberg [12] devised the mating restriction scheme, which aims to prevent designs in different niches from crossing with each other. Only two designs located within a distance smaller than a predefined mating radius should be allowed
to become mating partners. In practice, the sharing radius is often used as the mating radius.

Fig. 1. Design distribution.

3. Sharing with cluster analysis methods

To improve the effect of the sharing scheme on multimodal genetic searches, Yin and Germay [13] introduced two cluster analysis methods that automatically separate the population into a number of clusters of varying sizes. Each design cluster is considered a niche; the fitness of a design inside a cluster is therefore degraded by the presence of neighboring designs in the same cluster. The operating procedure of the suggested adaptive MacQueen's KMEAN cluster analysis method, which defines the clusters, is as follows.

1. Define three parameters: $d_{\max}$, $d_{\min}$, and k. ($d_{\max}$ is the maximum distance between a design and the centroid of the cluster that this design belongs to; $d_{\min}$ is the minimum distance between two centroids; and k is the initial guess of the number of clusters.)
2. Arbitrarily choose k designs from the population of N designs. Each of the k designs is considered a one-member cluster.
3. Calculate all pairwise distances among these one-member clusters. If the smallest distance is less than $d_{\min}$, merge the two associated clusters. Find the centroid of the merged cluster, and calculate the distance between the new centroid and each of the remaining clusters. Continue merging the nearest clusters until all centroids are separated by at least the distance $d_{\min}$.
4. Assign each of the remaining $N - k$ designs to the cluster with the nearest centroid. After each assignment, update the centroid of the gaining cluster and calculate its distance to the centroids of the other clusters. Merge the gaining cluster with its nearest cluster if the distance between the two centroids is less than $d_{\min}$, and continue merging as necessary until all centroids are separated by at least $d_{\min}$. If the distance between the new design and the nearest cluster is greater than $d_{\max}$, treat this design as a new one-member cluster.
5. After all designs are assigned, fix the location of each centroid as the seed point of a cluster. Reassign each of the N designs to its nearest seed point.
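The clustering steps listed above can be sketched in Python as follows. This is only an illustrative reading of the procedure, not Yin and Germay's implementation: the names `centroid`, `merge_close`, and `adaptive_kmean` are hypothetical, distances are assumed Euclidean, and after each assignment the sketch simply re-checks all centroid pairs instead of only the gaining cluster.

```python
import math
import random


def centroid(points):
    """Mean position of a list of equal-length coordinate tuples."""
    n = len(points)
    return tuple(sum(p[k] for p in points) / n for k in range(len(points[0])))


def merge_close(clusters, d_min):
    """Merge the two nearest clusters until all centroids are >= d_min apart."""
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = math.dist(centroid(clusters[a]), centroid(clusters[b]))
                if best is None or d < best[0]:
                    best = (d, a, b)
        if best[0] >= d_min:
            break
        _, a, b = best
        clusters[a].extend(clusters.pop(b))  # b > a, so index a stays valid
    return clusters


def adaptive_kmean(designs, k, d_min, d_max, rng=random):
    """Return (seed points, cluster label of each design)."""
    order = list(range(len(designs)))
    rng.shuffle(order)  # step 2: arbitrary choice of the k initial designs
    clusters = merge_close([[designs[i]] for i in order[:k]], d_min)  # step 3
    for i in order[k:]:  # step 4
        x = designs[i]
        dists = [math.dist(x, centroid(c)) for c in clusters]
        j = min(range(len(clusters)), key=dists.__getitem__)
        if dists[j] > d_max:
            clusters.append([x])  # too far from every cluster: new cluster
        else:
            clusters[j].append(x)
        merge_close(clusters, d_min)
    seeds = [centroid(c) for c in clusters]  # step 5: fix the seed points
    labels = [
        min(range(len(seeds)), key=lambda j: math.dist(x, seeds[j]))
        for x in designs
    ]
    return seeds, labels


# Two well-separated groups of designs in one dimension.
designs = [(0.0,), (0.02,), (0.04,), (1.0,), (1.02,), (1.04,)]
seeds, labels = adaptive_kmean(designs, k=2, d_min=0.1, d_max=0.3,
                               rng=random.Random(0))
```

For this small example the final number of clusters is two whichever designs are drawn as initial seeds, which illustrates that k is only an initial guess.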
The constant k serves only as an initial seed number and does not restrain the final number of clusters; the adaptive MacQueen's KMEAN cluster analysis method determines the number of clusters according to the design distribution. The other two parameters, $d_{\max}$ and $d_{\min}$, influence the result of the cluster analysis, and their values must be chosen properly. In practice, $d_{\max}$ is selected as a sufficiently large number, and $d_{\min}$ is taken as a small number.

After the adaptive cluster analysis is completed, the designs in the population are assigned to a number of clusters. A design is penalized only by the presence of other designs in the same cluster. The fitness $f_i$ of the ith design located in the jth cluster, which contains $N_j$ designs, is degraded as follows:

$$\tilde{f}_i = \frac{f_i}{S_i} \quad \text{with} \quad S_i = N_j \times \left[1 - \left(\frac{d_{ij}}{2D_{\max}}\right)^{\alpha}\right] \qquad (3)$$

where $d_{ij}$ is the distance between the ith design and the centroid of the jth cluster. The exponent, $\alpha$, is a magnitude-controlling constant that was chosen as 1.0 [13]. Mating restriction is easily implemented by allowing only designs in the same cluster to become mating partners. The elitist strategy, often used in traditional genetic algorithms to preserve the best design of the previous generation, can also be conveniently extended to preserve the best design in each cluster, which speeds up multimodal convergence. In Yin and Germay's sharing approach, a cluster is treated as equivalent to a niche, although the two can differ significantly.
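A minimal Python sketch of the cluster-based penalty of Eq. (3) is given below. It assumes that cluster labels and centroids are already available from a clustering step, that distances are Euclidean, and that no design lies farther than $D_{\max}$ from its centroid, so that $S_i$ stays positive. The name `cluster_shared_fitness` is hypothetical.

```python
import math


def cluster_shared_fitness(designs, fitness, labels, centroids, d_max, alpha=1.0):
    """Eq. (3): degrade fitness by cluster size and distance to the centroid.

    labels[i] is the cluster key of design i; centroids maps a cluster key
    to its centroid coordinates (both assumed to be given).
    """
    counts = {}
    for c in labels:
        counts[c] = counts.get(c, 0) + 1  # N_j for every cluster
    shared = []
    for x, f, c in zip(designs, fitness, labels):
        d = math.dist(x, centroids[c])
        s = counts[c] * (1.0 - (d / (2.0 * d_max)) ** alpha)
        shared.append(f / s)
    return shared


designs = [(0.0,), (0.2,), (1.0,)]
fitness = [1.0, 1.0, 1.0]
labels = [0, 0, 1]
centroids = {0: (0.1,), 1: (1.0,)}
result = cluster_shared_fitness(designs, fitness, labels, centroids, d_max=0.5)
```

Unlike Eq. (2), the penalty here depends only on the cluster a design belongs to, so a nearby design in a different cluster has no effect.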
4. Niche identification techniques

4.1. Niche identification

If a multimodal function is well sampled by a sufficiently large number of designs, the number of niches can be determined by observing the function topography based on the information gathered from the samples. As shown in Fig. 1, there are 12 design samples on the two-niche one-dimensional function. If the fitness function value of a design, such as $X_4$, is greater than that of both adjacent designs, $X_3$ and $X_5$, the design $X_4$ is considered a relative optimum. Note that the term relative optimum here refers to the best design among neighboring design samples, and it is not necessarily the theoretical optimum in the niche. On the other hand, if the fitness function value of a design, such as $X_9$, is less than that of both adjacent designs, $X_8$ and $X_{10}$, these two designs $X_8$ and $X_{10}$ are considered to belong to two different niches. If the fitness function value of a design is smaller than that of both
adjacent designs, these two adjacent designs likely belong to two different niches.

The NIT starts by assigning the design with the highest fitness function value as the center of the first niche. The next task is to locate the boundary of the niche. Each natural niche in a given function has its own shape and size. In this study, however, each niche is modeled by a hypersphere, which requires the definition of a center location and a radius, and the relative optimum is located at the center of the hypersphere. With the center of the first hyperspherical niche at hand, the next aim is to locate the design that separates the first niche from another one. If the design $X_4$ in Fig. 1 is deemed the center of the first hyperspherical niche, a procedure is needed so that the boundary design $X_9$ can be located.

Fig. 2 shows the variation of fitness function values along a design sequence consisting of all 12 designs in Fig. 1. The center design of the first niche is the design that has the largest fitness in the population and is placed at the top of the sequence. The position of any other design in the sequence depends on the rank of its distance to the center design in the design space: the closer a design is to the center, the closer it is placed to the center design in the sequence. The sequence is then searched for a design whose fitness function value is greater than that of the design located one position closer to the center design. As shown in Fig. 2, after a sequence of fitness comparisons between the design pairs $X_3$ and $X_4$, $X_5$ and $X_3$, $X_2$ and $X_5$, $X_6$ and $X_2$, and $X_1$ and $X_6$, the design $X_6$ is preliminarily qualified as a boundary design between two adjoining niches because its fitness is less than that of $X_1$. If the design $X_6$ is eventually accepted as the boundary design of the first niche centered at $X_4$, the radius of this hyperspherical niche is the distance between $X_4$ and $X_6$.

Fig. 2. Design sequence based on distance to the center design.

Referring back to Fig. 1, the selection of $X_6$ as the boundary design for the first niche is found to be incorrect. Misjudgments based on the above single rule occur easily in most problems, because the fitness of a design is not necessarily identical to that of another design located at an identical distance from the center but on the opposite side of the niche. A fitness increment tolerance ratio is therefore needed so that a small bump between $X_1$ and $X_6$ can be ignored. The fitness increment bump ratio $B_j$ for the jth design in the sequence is defined as follows:

$$B_j = \frac{F(d_{j+1}) - F(d_j)}{F_{\max} - F_{\min}} = \frac{\Delta F}{F_{\max} - F_{\min}} \qquad (4)$$
where $F(d_j)$ represents the fitness of the jth design in the sequence, and $F_{\max}$ and $F_{\min}$ are the largest and smallest fitness function values in the population. When the bump ratio is less than a predetermined tolerance ratio, $B^*$, the small fitness increment between two consecutive designs is ignored. A larger tolerance ratio generally results in a smaller number of niches with larger radii; conversely, a smaller tolerance ratio leads to a larger number of niches with smaller radii. In this study, a 10% bump ratio tolerance is used exclusively. If $B_j$ is less than or equal to $B^*$, the jth design in the sequence belongs to the same niche as the center design. If $B_j$ is greater than $B^*$, the jth design is used as the boundary design of the niche under consideration. After the boundary design of the first niche is determined, the radius of the first hyperspherical niche is defined as the distance between the center design and the boundary design.

Note that the entire niche identification model is based on the assumption of hyperspherical niches. If the design space contains niches whose shapes depart significantly from hyperspheres, the chance of incorrect niche identification increases. Before the entire NIT procedure can be described, it is necessary to discuss how to resolve the case in which two hyperspherical niches interfere with each other.

4.2. Niche interference modification

After a niche is newly determined, it is necessary to check whether it interferes with any of the niches already obtained. Two interference conditions require further niche modification: a large niche contains the center of a small niche, or two niches intersect each other but neither contains the center of the other. The detailed modification processes are defined as follows:

1. The large niche contains the center of the small niche: the radii of the two intersecting hyperspherical niches are $R_1$ and $R_2$ ($R_1 > R_2$), as shown in Fig. 3. The distance between the two centers is L. If the center of the small niche (with $R_2$) is contained inside the large niche ($R_1 > L$), the two niches are merged into one new niche with a new center and a new radius. The new center is located on the line connecting the two old centers, at a distance $L^*$ from the center of the large niche towards the center of the small niche. The offset distance $L^*$ and the radius $R_{new}$ of the new niche are defined as follows:

$$L^* = \frac{R_2}{R_1 + R_2} \times L \qquad (5)$$

$$R_{new} = \sqrt[3]{\frac{R_2}{R_1}} \times R_1 \qquad (6)$$

Fig. 3. Niche interference: a small niche and a large niche containing the small one merge to a new niche.

2. Two intersecting but non-containing niches: the radii of these two intersecting niches are $R_1$ and $R_2$, as shown in Fig. 4. The distance between the two centers is L. If neither niche contains the center of the other, that is, $R_1 < L$ and $R_2 < L$, both niches are reduced to smaller radii, $R_{1\text{-}new}$ and $R_{2\text{-}new}$, defined as follows:

$$R_{i\text{-}new} = \frac{R_i}{R_1 + R_2} \times L, \quad i = 1, 2 \qquad (7)$$
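The two interference cases can be illustrated with the short Python sketch below. The function name `resolve_interference` is hypothetical, distances are assumed Euclidean, and the merged radius uses the reading $R_{new} = R_1 (R_2/R_1)^{1/3}$ of Eq. (6), which should be treated as an assumption about the printed formula.

```python
import math


def resolve_interference(c1, r1, c2, r2):
    """Apply Eqs. (5)-(7) to two hyperspherical niches.

    Returns a list of (center, radius) pairs: one niche after a merge,
    two niches otherwise.
    """
    if r1 < r2:  # make niche 1 the larger one
        c1, r1, c2, r2 = c2, r2, c1, r1
    L = math.dist(c1, c2)
    if L >= r1 + r2:
        return [(c1, r1), (c2, r2)]  # the hyperspheres do not intersect
    if L < r1:
        # Case 1: the large niche contains the center of the small one.
        offset = r2 / (r1 + r2) * L  # Eq. (5)
        t = offset / L if L > 0 else 0.0
        center = tuple(a + t * (b - a) for a, b in zip(c1, c2))
        radius = (r2 / r1) ** (1.0 / 3.0) * r1  # Eq. (6), assumed reading
        return [(center, radius)]
    # Case 2: intersecting, but neither niche contains the other's center.
    scale = L / (r1 + r2)  # Eq. (7)
    return [(c1, r1 * scale), (c2, r2 * scale)]


merged = resolve_interference((0.0, 0.0), 4.0, (2.0, 0.0), 1.0)
shrunk = resolve_interference((0.0, 0.0), 3.0, (4.0, 0.0), 2.0)
```

In the second example the new radii sum to the center distance L, so the two shrunken hyperspheres just touch each other.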
4.3. Niche identification procedure

With the determination of the center and radius of a niche and the modification of two interfering niches introduced, the complete niche identification procedure can be defined as follows.

1. The design with the highest fitness in the population is used as the center of the first hyperspherical niche, $X_c$, and this design is marked.
2. Calculate the distance between each unmarked design and the center of the hypersphere, $X_c$, and construct a sequence ordered by distance to the center: the shorter the distance, the earlier the position in the sequence.
3. Sequentially check the bump ratio of each design until the bump ratio of one design exceeds the tolerance limit. Use this boundary design to define the radius of the new niche. Mark all designs in the niche.
4. The unmarked design with the highest fitness is used as the center of a new hyperspherical niche, $X_c$; repeat steps 2 and 3 until all designs are marked.
5. Dismiss any hyperspherical niche that contains fewer designs than a given percentage (10%) of the population.
6. Conduct interference modification on intersecting hyperspheres until no intersecting niches remain.
7. Redistribute each design to the hypersphere that contains it. Designs that are not contained in any hypersphere are classified as non-niche members.

Fig. 4. Niche interference: two non-containing niches shrink into two connecting niches.

4.4. Sharing implementation with NIT

After the gene variation processes, crossover and mutation, are completed, the genetic search starts the niche identification. At the end of niche identification and modification, both the number of niches and the assignment of design members to niches are available. If the ith design has a fitness function value of $f_i$ and belongs to the jth niche, which comprises $N_j$ designs, the sharing penalty for the ith design is defined as the number of designs in its niche, $N_j$. The sharing-modified fitness of the ith design is therefore defined as follows:

$$\tilde{f}_i = \frac{f_i}{N_j} \qquad (8)$$
By this definition, every design in the same niche has an identical sharing penalty. After sharing, a design that had a higher fitness function value than another design in the same niche still has the higher fitness. For designs that are not assigned to any niche, the sharing penalty is defined as the average of the sharing penalties of all m niches in the population. If the fitness of a non-niche design is $f_k$, the modified fitness is calculated as follows:

$$\tilde{f}_k = \frac{f_k}{\dfrac{1}{m}\sum_{j=1}^{m} N_j} \qquad (9)$$
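Eqs. (8) and (9) amount to a simple division by niche size, as the following Python sketch shows. The name `nit_shared_fitness` and the use of `None` to flag non-niche designs are illustrative choices, and leaving the fitness unchanged when no niche exists at all is an assumption not covered by the text.

```python
def nit_shared_fitness(fitness, niche_of):
    """Eqs. (8)-(9): divide fitness by the niche size.

    niche_of[i] is the niche index of design i, or None for a design that
    is not contained in any hypersphere.
    """
    sizes = {}
    for j in niche_of:
        if j is not None:
            sizes[j] = sizes.get(j, 0) + 1  # N_j
    # Average niche size, used as the penalty of non-niche designs (Eq. 9).
    # Assumption: with no niche at all, the penalty defaults to 1.
    mean_size = sum(sizes.values()) / len(sizes) if sizes else 1.0
    return [
        f / (sizes[j] if j is not None else mean_size)
        for f, j in zip(fitness, niche_of)
    ]


fitness = [4.0, 2.0, 3.0, 3.0]
niche_of = [0, 0, 1, None]
result = nit_shared_fitness(fitness, niche_of)
```

Because every member of a niche is divided by the same number, the fitness ranking inside a niche is preserved, as stated above.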
The mating restriction strategy is easily implemented by allowing a design to mate only with another design in the same niche. Mating restriction is often used in the late stage of the genetic evolution to speed up in-valley convergence. Furthermore, the elitist strategy, which is normally used to preserve the best design in the generation, is extended to preserve the best design in each of the niches identified in the population.
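To make the identification procedure of Section 4.3 more concrete, steps 1 to 4 can be sketched in Python as below. This is an illustrative reading only: the name `identify_niches` is hypothetical, distances are assumed Euclidean, the boundary design is assumed to belong to the niche it bounds, and when no bump exceeds the tolerance the farthest remaining design is taken as the boundary. Steps 5 to 7 (dismissal of small niches, interference modification, and redistribution) are omitted.

```python
import math


def identify_niches(designs, fitness, tol=0.10):
    """Steps 1-4 of the niche identification procedure.

    Returns a list of (center, radius) pairs, one per hyperspherical niche.
    """
    span = max(fitness) - min(fitness)  # F_max - F_min of the population
    unmarked = set(range(len(designs)))
    niches = []
    while unmarked:
        # Steps 1 and 4: the best unmarked design becomes a niche center.
        center = max(unmarked, key=lambda i: fitness[i])
        # Step 2: order the other unmarked designs by distance to the center.
        others = sorted(
            unmarked - {center},
            key=lambda i: math.dist(designs[i], designs[center]),
        )
        seq = [center] + others
        # Step 3: scan the bump ratio of Eq. (4) along the sequence.
        boundary = len(seq) - 1  # assumption: no bump means farthest design
        for j in range(len(seq) - 1):
            rise = fitness[seq[j + 1]] - fitness[seq[j]]
            bump = rise / span if span > 0 else 0.0
            if bump > tol:
                boundary = j
                break
        radius = math.dist(designs[seq[boundary]], designs[center])
        niches.append((designs[center], radius))
        unmarked -= set(seq[: boundary + 1])  # mark all designs in the niche
    return niches


# One-dimensional example with peaks near x = 0.30 and x = 0.80.
designs = [(0.30,), (0.25,), (0.37,), (0.20,), (0.45,),
           (0.10,), (0.55,), (0.70,), (0.80,), (0.95,)]
fitness = [1.0, 0.9, 0.85, 0.7, 0.6, 0.4, 0.3, 0.7, 0.9, 0.5]
niches = identify_niches(designs, fitness)
```

In this example the design at x = 0.55 is detected as the boundary of the first niche, because the next design in the sequence raises the fitness by more than 10% of the population's fitness range.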
5. Illustrative problems

The proposed sharing with NIT is tested on two numerical problems and two engineering problems, and its results are compared with those of two other sharing techniques: the traditional sharing by Goldberg and Richardson [11], and the sharing with the adaptive MacQueen's KMEAN cluster analysis method by Yin and Germay [13]. Each illustrative problem was executed 10 times by each algorithm with 10 different initial random seeds to reduce sampling errors, and all results of the genetic searches are based on the averaged performance of the 10 executions. For the traditional sharing scheme, three different sharing radii, $\sigma_{Sh}$, were selected in each problem. Similarly, three different sets of $D_{\max}$ and $D_{\min}$ were chosen to run Yin and Germay's genetic algorithm in each problem.

An ideal distribution pattern of designs in multimodal genetic algorithms can be calculated from the modified K-armed bandit problem introduced by Holland [17]. Designs in different niches form stable subpopulations in which the number of designs is proportional to the maximum fitness of the representative peak of the niche. To measure the capability of attaining multiple optima in niches of varied size and height, the chi-square-like distribution error measure [12] was used to rank the performance of the multimodal genetic algorithms. Consider a design space that comprises q niches, where the maximum fitness of the ith niche is $f_i$. The expected number of designs converging to the ith niche, $\mu_i$, out of a total of N designs in the design space is expressed as

$$\mu_i = \frac{f_i}{\sum_{k=1}^{q} f_k} \times N \qquad (10)$$

and the distribution variance $\sigma_i^2$ of the expected number $\mu_i$ for the ith niche is expressed as

$$\sigma_i^2 = N p_i (1 - p_i), \quad p_i = \frac{\mu_i}{N} \qquad (11)$$

The expected number of designs converging to none of the q niches is expressed as

$$\mu_{q+1} = 0 \qquad (12)$$

and the distribution variance of the expected number $\mu_{q+1}$ is expressed as

$$\sigma_{q+1}^2 = \sum_{i=1}^{q} \sigma_i^2 \qquad (13)$$

If $n_i$ designs converge to the ith niche, the number of designs that fail to converge to any niche can be computed as

$$n_{q+1} = N - \sum_{i=1}^{q} n_i \qquad (14)$$

The chi-square-like distribution error measure, $P_{chi}$, is defined as

$$P_{chi} = \sqrt{\sum_{i=1}^{q+1} \left(\frac{n_i - \mu_i}{\sigma_i}\right)^2} \qquad (15)$$

A smaller chi-square-like measure indicates a better design distribution. A large $P_{chi}$ value represents an uneven design distribution, meaning that too many designs converge to only some of the niches. In this study, a design is considered to have converged to the peak of a niche if its fitness is more than 80% of the fitness of the peak design in the niche.

All genetic searches were executed with a population size of 50, a crossover probability $P_c = 0.6$, and a mutation probability $P_m = 0.001$. Two-point crossover was used throughout. Genetic searches were terminated at the end of the 200th generation. Mating restriction took effect after the 50th generation. The elitist strategy was applied in all genetic algorithms. A multiple-strategy genetic algorithm, EVOLVE [18], written in Fortran, is used in this work to perform all genetic searches.

5.1. First numerical problem - F1

The first illustrative problem is a one-dimensional numerical function as follows:

$$F_1 = (1 - x)\,\sin^4\!\left(\frac{5\pi}{(1 + x)^4}\right), \quad 0 < x < 1 \qquad (16)$$

Fig. 5. Five-niche function F1.

There are five peaks in this function, and each niche has a different domain width, as shown in Fig. 5. The higher the
Fig. 6. Comparison of the chi-square-like deviation measure of Goldberg and Richardson’s sharing scheme with three sharing radii on F1.
peak, the smaller the niche's domain is. The lower and upper bounds of the variable are 0.0 and 1.0, and a 30-bit binary string is used to represent the variable. The objective function (fitness function) is bounded between 0.0 and 1.0. The genetic algorithm with Goldberg and Richardson's sharing scheme was tested with three different sharing radii, $\sigma_{Sh} = 0.1$, 0.2, and 0.3. Note that the sharing radii used in this work are always normalized by treating the range of each variable as a unit. The averaged convergence history of the chi-square-like function over 10 genetic searches with each of the sharing radii is shown in Fig. 6. The genetic search with Yin and Germay's sharing approach was also tested with three different sets of cluster analysis parameters, $(D_{\min}, D_{\max}) = (0.02, 0.06)$, (0.05, 0.10), and (0.10, 0.18). The
Fig. 8. Comparison of the chi-square-like deviation measure of all three sharing schemes on F1.
averaged convergence history of the chi-square-like function over 10 genetic searches with each of the three cluster parameter sets is shown in Fig. 7.

Fig. 8 shows three curves: the average of the three chi-square-like function histories in Fig. 6, the average of the three chi-square-like function histories in Fig. 7, and the average chi-square-like function history of 10 genetic searches with NIT. Different sharing radii and different cluster analysis parameters resulted in significantly different outputs. A fixed sharing radius did not suit this problem, in which the five niches are of varied sizes, and its results were therefore inferior to those of the other two sharing approaches. Yin and Germay's three cases produced convergence histories with large deviations. The case with the smaller $D_{\max}$ had the best chi-square-like function convergence history among all approaches, while the cases with larger $D_{\max}$ had clearly worse convergence histories. Averaged over the three parameter sets, the chi-square-like function convergence capability of Yin and Germay's sharing approach is at the same level as that of sharing with NIT.

The number of precise local optima located in each optimal basin by the different algorithms provides information for algorithmic comparison beyond the near-optimum design information given by the chi-square-like function. The numbers of precise local optima located in 10 runs of each of the seven algorithms are listed in Table 1. The NIT algorithm and the two Yin and Germay algorithms with smaller parameters performed much better than the other algorithms in locating the precise optima. The traditional sharing schemes by Goldberg and Richardson had great difficulty locating the 3rd and the 4th optima.

5.2. Second numerical problem - F2
Fig. 7. Comparison of the chi-square-like deviation measure of Yin and Germay’s sharing scheme with three (Dmax, Dmin) sets on F1.
The second numerical problem involves an exponential
Table 1
Number of precise local optima located during 10 runs for F1

                                     Number of relative optimum
Algorithm                            Local 1  Local 2  Local 3  Local 4  Local 5
NIT                                       10       10       10        9        5
Yin and Germay (0.02, 0.06)                9       10       10       10        6
Yin and Germay (0.05, 0.10)               10        9        5       10        9
Yin and Germay (0.10, 0.18)                8        6        3        8        6
Goldberg and Richardson (0.2)              9        4        0        0        6
Goldberg and Richardson (0.3)              9        8        1        1        6
Goldberg and Richardson (0.4)              9        7        0        0        9
function defined as follows:

$$F_2 = 10\exp\!\left(-0.03(x_1 - 3)^2 - 0.03(x_2 - 3)^2\right) - 8\exp\!\left(-0.08(x_1 + 5)^2 - 0.08(x_2 + 5)^2\right) - 7\exp\!\left(-0.08(x_1 - 4)^2 - 0.40(x_2 + 7)^2\right) \qquad (17)$$
This two-variable function comprises three niches of varied sizes and heights, as shown in Fig. 9. The highest peak is at the center of the valley in the upper part of the design space. The second largest peak is at the center of the valley located in the lower left corner. The lowest peak is in the middle of the flat valley located in the lower right corner. Each of the two design variables is bounded between −10 and 10 and is coded by a 22-bit binary substring. Traditional sharing was executed with three different sharing radii, $\sigma_{Sh} = 0.2$, 0.3, and 0.4. Yin and Germay's approach was executed with three cluster analysis parameter sets, $(D_{\min}, D_{\max}) = (0.10, 0.20)$, (0.10, 0.30), and (0.10, 0.40). Each genetic search with a specific sharing setting was conducted 10 times, and the average of the 10 runs is reported. As shown in Fig. 10, the two cases with smaller sharing radii had impressive convergence histories in the early stage of evolution, but after about the 80th generation all three cases converged to a disappointingly high value. As shown in
Fig. 9. Contours of three-niche function F2.
Fig. 11, Yin and Germay’s three cluster analysis parameter sets started with similar performance but after around the 75th generation the chi-square-like functions of two cases with larger Dmax values converged to a higher value of 6.0, and the function value of Dmax ¼ 2:0 case fluctuated around 4. In this problem, genetic searches with NIT sharing scheme had the best convergence history as shown in Fig. 12. Its chi-square-like function value almost coincided with that of Yin and Germay’s approach before around the 60th generation, but after that while the function values of NIT stayed fluctuating between 2.0 and 3.0, Yin and Germay’s function values went up to about 4.0. It is noted that mating restriction started to be applied to both genetic searches of Yin and Germay and NIT since the 50th generation. If design clusters identified by Yin and Germay’s approach failed to match three niches in the design space, the mating restriction may degrade the convergence rate. The numbers of precise local optimum uncovered in 10 runs of each of seven cases are listed in Table 2. The NIT algorithm and Yin and Germay algorithms performed more consistently than the Goldberg and Richardson algorithms, in terms of locating all precise optima. Goldberg and Richardson’s algorithms performed poorly in locating the 2nd and the 3rd optimum.
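The problem setup described above can be illustrated with a short Python sketch that decodes a 44-bit chromosome (two 22-bit substrings) into (x1, x2) and evaluates F2 as printed in Eq. (17). The linear bit-to-real mapping, the function names and the `SIGNS` tuple are assumptions made for illustration: the text states only the bounds and the substring length, and the term signs are transcribed as they appear in the equation.

```python
import math

N_BITS = 22                  # bits per design variable (from the text)
LOWER, UPPER = -10.0, 10.0   # side bounds of x1 and x2 (from the text)

# Signs of the three exponential terms, transcribed as printed in Eq. (17).
# If a different sign convention is intended, only this tuple changes.
SIGNS = (+1.0, -1.0, -1.0)


def decode(bits: str, lower: float = LOWER, upper: float = UPPER) -> float:
    """Map a binary substring to [lower, upper] (assumed linear mapping)."""
    if not bits or set(bits) - {"0", "1"}:
        raise ValueError("bits must be a non-empty string of 0s and 1s")
    return lower + (upper - lower) * int(bits, 2) / (2 ** len(bits) - 1)


def f2(x1: float, x2: float) -> float:
    """Evaluate F2 term by term, following Eq. (17)."""
    terms = (
        10.0 * math.exp(-0.03 * (x1 - 3.0) ** 2 - 0.03 * (x2 - 3.0) ** 2),
        8.0 * math.exp(-0.08 * (x1 + 5.0) ** 2 - 0.08 * (x2 + 5.0) ** 2),
        7.0 * math.exp(-0.08 * (x1 - 4.0) ** 2 - 0.40 * (x2 + 7.0) ** 2),
    )
    return sum(sign * term for sign, term in zip(SIGNS, terms))


def evaluate(chromosome: str) -> float:
    """Split a 2 * N_BITS chromosome into two substrings and evaluate F2."""
    if len(chromosome) != 2 * N_BITS:
        raise ValueError("expected a chromosome of 2 * N_BITS bits")
    x1 = decode(chromosome[:N_BITS])
    x2 = decode(chromosome[N_BITS:])
    return f2(x1, x2)


if __name__ == "__main__":
    step = (UPPER - LOWER) / (2 ** N_BITS - 1)
    print(f"resolution per variable: {step:.3e}")
    # Centers of the three exponential terms in Eq. (17).
    for center in [(3.0, 3.0), (-5.0, -5.0), (4.0, -7.0)]:
        print(center, round(f2(*center), 4))
```

With 22 bits over a range of 20, the assumed mapping gives a resolution of 20 / (2^22 - 1) per variable, which is far finer than the width of any of the three niches.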
Fig. 10. Comparison of the chi-square-like deviation measure of Goldberg and Richardson’s sharing scheme with three sharing radii on F2.
Table 2. Number of precise local optima located during 10 runs for F2
Sharing scheme (parameters)      Local 1  Local 2  Local 3
NIT                                   10        9       10
Yin and Germay (0.1, 0.2)             10       10       10
Yin and Germay (0.1, 0.3)             10       10       10
Yin and Germay (0.1, 0.4)             10        8        9
Goldberg and Richardson (0.2)         10        0        3
Goldberg and Richardson (0.3)         10        1        4
Goldberg and Richardson (0.4)         10        2        7
Fig. 11. Comparison of the chi-square-like deviation measure of Yin and Germay's sharing scheme with three (Dmax, Dmin) sets on F2.
Fig. 12. Comparison of the chi-square-like deviation measure of all three sharing schemes on F2.
5.3. Two-beam grillage problem: E1
The first engineering problem used to test the three different sharing schemes was the two-beam grillage problem [19] shown in Fig. 13. Each side of the shorter beam is divided into two sections of equal length whose cross-sectional areas are defined as A1 and A2, and each side of the longer beam is divided into two sections of equal length whose cross-sectional areas are defined as A3 and A4. Both beams are loaded with distributed forces of 1000 lb/in. In the first engineering problem, the four independent cross-sectional areas were simplified to two independent areas, x1 = A1 = A2 and x2 = A3 = A4. The two area variables were side bounded between 5 and 30 in.². The optimization seeks the minimum weight while the constraint is an allowable stress of 20,000 psi on all nodes. The problem formulation is as follows:
\[ \text{Minimize } OBJ = 100 x_1 + 120 x_2 \tag{18} \]
\[ \text{Subject to } \sigma_{\max} \le 20{,}000 \text{ psi} \tag{19} \]
This two-variable problem has three relative optima, as shown in Fig. 14. Each variable was represented by a 22-bit binary substring. The traditional sharing scheme used three different sharing radii, $\sigma_{sh}$ = 0.2, 0.3 and 0.4. Yin and Germay's sharing scheme used three cluster analysis parameter sets, (Dmin, Dmax) = (0.10, 0.20), (0.10, 0.30), (0.10, 0.40). Averaged convergence histories of 10 runs of the three traditional sharing cases and the three Yin and Germay cases are shown in Figs. 15 and 16, respectively. The average convergence histories of the traditional sharing scheme, Yin and Germay's scheme and the NIT sharing scheme are shown in Fig. 17 for comparison. The three sharing radii of the traditional sharing scheme produced similar convergence histories in which the function values fluctuated stably around 9.0. Unlike in the two previous problems, the smallest Dmax case (Dmax = 0.2) of Yin and Germay's approach led to the worst chi-square-like function value, 9.5. The two other cases (Dmax = 0.3 and 0.4) resulted in final function values around 6.5 and 8.0, respectively. As shown in Fig. 17, the NIT sharing scheme converged significantly better than the average of the traditional sharing cases and the average of Yin and Germay's cases. The three sharing schemes started with close performances, but the NIT approach quickly lowered the function value after around the 50th generation, when the mating restriction was initiated. After about the 90th generation, each of the three chi-square-like function values seemed to settle to a fixed value: 9.0 for the traditional sharing, 8.0 for Yin and Germay's sharing, and 5.5 for the NIT sharing. The numbers of precise local optima located in 10 runs of all seven algorithms are listed in Table 3. In this problem, all three sharing schemes performed almost perfectly: the 1st and the 3rd local optima were located in all runs, and the 2nd optimum was located in at least 90% of the runs.
Fig. 13. Two-beam grillage.
Fig. 14. Design space of the two-beam grillage problem E1. The three relative optima are marked as local 1, local 2 and local 3.
Table 3. Number of precise local optima located during 10 runs for E1
Sharing scheme (parameters)      Local 1  Local 2  Local 3
NIT                                   10       10       10
Yin and Germay (0.1, 0.2)             10        9       10
Yin and Germay (0.1, 0.3)             10        9       10
Yin and Germay (0.1, 0.4)             10        9       10
Goldberg and Richardson (0.2)         10       10       10
Goldberg and Richardson (0.3)         10       10       10
Goldberg and Richardson (0.4)         10       10       10
Fig. 15. Comparison of the chi-square-like deviation measure of Goldberg and Richardson's sharing scheme with three sharing radii on E1.
Fig. 16. Comparison of the chi-square-like deviation measure of Yin and Germay's sharing scheme with three (Dmax, Dmin) sets on E1.
Fig. 17. Comparison of the chi-square-like deviation measure of all three sharing schemes on E1.
5.4. Two-beam grillage problem: E2
The two-beam grillage problem shown in Fig. 13 was extended into a four-variable problem in which A1, A2, A3 and A4 were all considered as independent variables. Each variable was side bounded between 5 and 30 in.² and was represented by a 22-bit binary substring. The objective function is to minimize the total beam volume and the constraint is a maximum allowable stress on each node. The mathematical statement of this problem is as follows:
\[ \text{Minimize } OBJ = 50(A_1 + A_2) + 60(A_3 + A_4) \tag{20} \]
\[ \text{Subject to } \sigma_{\max} \le 20{,}000 \text{ psi} \tag{21} \]
A large number of gradient-based local searches starting from varied initial designs were conducted for this problem, and a total of four distinct relative optima were attained, as listed in Table 4. The fourth relative optimum is located in a small, shallow valley. If the fourth optimal design is rounded to (8.2, 10.7, 18.3, 23.1), the local search starting from this design converges to the third optimum. Traditional sharing used sharing radii of 0.4, 0.6 and 0.8, and Yin and Germay's approach used three parameter sets, (Dmin, Dmax) = (0.10, 0.40), (0.10, 0.60), (0.10, 0.80). Average convergence histories of 10 runs of the three traditional sharing cases and the three Yin and Germay cases are shown in Figs. 18 and 19, respectively. The three sharing radii produced similar convergence histories before the 100th generation and then diverged. The $\sigma_{sh}$ = 0.6 case had the best final function value, around 8.5, among the three radii. Coincidentally, the Dmax = 0.6 case of Yin and Germay's sharing approach performed best among the three Dmax cases. It started to take the lead after the 60th generation, and after the 100th generation its function value stayed at the converged value of about 8. Average convergence histories of the traditional sharing, Yin and Germay's sharing and the NIT sharing are shown in Fig. 20. Yin and Germay's sharing converged much faster than the traditional sharing before the 80th generation, but the advantage was slowly erased and almost lost by the end of the genetic search. The NIT approach and Yin and Germay's sharing were closely matched before the 100th generation; after that, only the NIT sharing continued to lower the chi-square-like function value.
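To make the role of the sharing radius in the traditional scheme concrete, the following Python sketch computes shared fitness values for a small population. It is a minimal illustration, not the authors' implementation: it assumes the commonly used power-law sharing function with exponent `alpha`, Euclidean distance between (normalized) designs, and a positive fitness that is to be maximized. None of these details are restated in this section, and all names are hypothetical.

```python
import math
from typing import List, Sequence


def sharing_value(distance: float, sigma_sh: float, alpha: float = 1.0) -> float:
    """Sharing function: 1 at zero distance, falling to 0 at the sharing radius."""
    if distance < sigma_sh:
        return 1.0 - (distance / sigma_sh) ** alpha
    return 0.0


def shared_fitness(
    population: Sequence[Sequence[float]],
    fitness: Sequence[float],
    sigma_sh: float,
    alpha: float = 1.0,
) -> List[float]:
    """Divide each raw fitness by its niche count (sum of sharing values)."""
    shared = []
    for x_i, f_i in zip(population, fitness):
        # The niche count is at least 1 because each design shares with itself.
        niche_count = sum(
            sharing_value(math.dist(x_i, x_j), sigma_sh, alpha)
            for x_j in population
        )
        shared.append(f_i / niche_count)
    return shared


if __name__ == "__main__":
    designs = [(0.10, 0.10), (0.12, 0.10), (0.80, 0.80)]
    raw = [1.0, 1.0, 1.0]
    for sigma in (0.2, 0.3, 0.4):
        print(sigma, [round(v, 3) for v in shared_fitness(designs, raw, sigma)])
```

Because a single `sigma_sh` is applied to every design, two designs in a narrow niche and two designs in a wide niche are penalized identically at the same distance, which is the limitation that the comparisons in this section probe.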
Fig. 18. Comparison of the chi-square-like deviation measure of Goldberg and Richardson’s sharing scheme with three sharing radii on E2.
5.5. Discrete ten-variable bump function
This last illustrative problem comprises a discrete ten-variable bump function that contains five local optima in the design space. The bump function is defined as follows:
\[ \text{Minimize } OBJ = 120 - \sum_{i=1}^{5} W_i \exp(-y_i) \tag{22} \]
where
\[ y_i = P_i \times \sum_{j=1}^{10} (x_j - L_{j,i})^2 \tag{23} \]
\[ L = \begin{bmatrix}
+3 & +3 & +3 & +3 & +3 & +3 & +3 & +3 & +3 & +3 \\
-3 & -3 & -3 & -3 & -3 & -3 & -3 & -3 & -3 & -3 \\
+4 & -3 & +4 & -3 & +4 & -3 & +4 & -3 & +4 & -3 \\
-4 & +3 & -4 & +3 & -4 & +3 & -4 & +3 & -4 & +3 \\
+4 & +4 & +4 & +4 & +4 & -4 & -4 & -4 & -4 & -4
\end{bmatrix} \tag{24} \]
\[ P = [0.03, 0.02, 0.02, 0.02, 0.01] \tag{25} \]
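The definition in Eqs. (22) to (25), together with the weights W of Eq. (26) and the discrete value set of Eq. (27), can be sketched in Python as follows. The sketch assumes that row i of the matrix in Eq. (24) holds the ten center coordinates of the i-th bump; the function and variable names are hypothetical. The two sample designs are local optima 1 and 5 as listed in Table 5.

```python
import math
from typing import List, Sequence

# Row i of Eq. (24): assumed to hold the ten center coordinates of bump i.
L_ROWS: List[List[float]] = [
    [3.0] * 10,
    [-3.0] * 10,
    [4.0, -3.0] * 5,
    [-4.0, 3.0] * 5,
    [4.0] * 5 + [-4.0] * 5,
]
P = [0.03, 0.02, 0.02, 0.02, 0.01]    # Eq. (25)
W = [100.0, 95.0, 90.0, 90.0, 85.0]   # Eq. (26)

# Allowed discrete values of Eq. (27): -5.0, -4.8, ..., 4.8, 5.0.
GRID = [round(-5.0 + 0.2 * k, 1) for k in range(51)]


def bump_objective(x: Sequence[float]) -> float:
    """Evaluate OBJ of Eq. (22) for a ten-variable design x."""
    if len(x) != 10:
        raise ValueError("the bump function expects ten design variables")
    total = 0.0
    for centers, p_i, w_i in zip(L_ROWS, P, W):
        # Eq. (23): scaled squared distance from x to the center of bump i.
        y_i = p_i * sum((x_j - c) ** 2 for x_j, c in zip(x, centers))
        total += w_i * math.exp(-y_i)
    return 120.0 - total


# Local optima 1 and 5 from Table 5: (coordinates, tabulated Obj.).
OPTIMUM_1 = ([3.0] * 5 + [2.8] * 5, 9.233)
OPTIMUM_5 = ([3.8, 3.6, 3.8, 3.6, 3.8, -4.0, -3.8, -4.0, -3.8, -4.0], 32.998)

if __name__ == "__main__":
    for x, tabulated in (OPTIMUM_1, OPTIMUM_5):
        on_grid = all(round(v, 1) in GRID for v in x)
        print(f"on grid: {on_grid}, computed: {bump_objective(x):.3f}, "
              f"tabulated: {tabulated}")
```

Comparing the computed values with the Obj. column of Table 5 is a quick way to confirm that a transcription of L, P and W is consistent with the tabulated optima.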
Table 4. Four relative optima found for the two-beam grillage problem E2
Relative optimum   A1 (in.²)  A2 (in.²)  A3 (in.²)  A4 (in.²)  OBJ (lb)
1st optimum            17.76      23.47       6.37       7.34   2883.70
2nd optimum             5.53       5.43      19.68      25.43   3254.74
3rd optimum            11.33      12.79      15.87      19.09   3303.06
4th optimum             8.15      10.66      18.27      23.13   3424.94
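As a quick consistency check on Table 4, the Python sketch below recomputes the objective of Eq. (20) for the four listed designs. Because the tabulated areas are rounded to two decimals, small differences from the tabulated OBJ values are expected. The sketch also illustrates that Eq. (20) reduces to Eq. (18) of problem E1 when A1 = A2 = x1 and A3 = A4 = x2. Function names are hypothetical, and the stress constraint of Eq. (21) is not modeled.

```python
from typing import List, Tuple

# Rows of Table 4: (A1, A2, A3, A4 in in.^2, tabulated OBJ).
TABLE_4: List[Tuple[float, float, float, float, float]] = [
    (17.76, 23.47, 6.37, 7.34, 2883.70),
    (5.53, 5.43, 19.68, 25.43, 3254.74),
    (11.33, 12.79, 15.87, 19.09, 3303.06),
    (8.15, 10.66, 18.27, 23.13, 3424.94),
]


def objective_e2(a1: float, a2: float, a3: float, a4: float) -> float:
    """Objective of the four-variable problem E2, Eq. (20)."""
    return 50.0 * (a1 + a2) + 60.0 * (a3 + a4)


def objective_e1(x1: float, x2: float) -> float:
    """Objective of the two-variable problem E1, Eq. (18)."""
    return 100.0 * x1 + 120.0 * x2


if __name__ == "__main__":
    for a1, a2, a3, a4, tabulated in TABLE_4:
        computed = objective_e2(a1, a2, a3, a4)
        print(f"computed {computed:8.2f}  tabulated {tabulated:8.2f}  "
              f"difference {computed - tabulated:+.2f}")
    # E1 is the special case A1 = A2 = x1, A3 = A4 = x2 of E2.
    print(objective_e1(7.0, 11.0), objective_e2(7.0, 7.0, 11.0, 11.0))
```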
Fig. 19. Comparison of the chi-square-like deviation measure of Yin and Germay’s sharing scheme with three (Dmax, Dmin) sets on E2.
Table 6. Performance comparison between multiple local searches and the niche-based GA method in the discrete ten-variable bump function problem
                  Number of runs locating local optimum   Average number of
Method              #1    #2    #3    #4    #5            function calls
Local searches       4    10     8     1     6            36,437
Niche-based GA       8    10    10     5    10            30,000
Local searches terminate when either all five local optima have been located or any local optimum has been located five times. The niche-based GA uses Goldberg and Richardson's sharing with radius r = 1.1, population size = 100, generations = 300.
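The termination rule for the multiple local search approach stated in the note to Table 6 can be sketched as follows. The sketch only models the stopping logic: it consumes a sequence of integer labels identifying which local optimum each successive local search converged to. How a converged design is matched to a label, and how function calls are counted, are outside its scope; the function name and arguments are hypothetical.

```python
from collections import Counter
from typing import Iterable, Tuple


def run_until_stop(
    found_sequence: Iterable[int],
    n_optima: int = 5,
    max_repeats: int = 5,
) -> Tuple[Counter, int]:
    """Apply the stopping rule to a stream of located-optimum labels.

    Stops when all n_optima distinct optima have been located, or when any
    single optimum has been located max_repeats times. Returns the hit
    counts per optimum and the number of local searches consumed.
    """
    counts: Counter = Counter()
    n_searches = 0
    for label in found_sequence:
        n_searches += 1
        counts[label] += 1
        if len(counts) == n_optima or counts[label] >= max_repeats:
            break
    return counts, n_searches


if __name__ == "__main__":
    # Optimum 2 is hit for the fifth time on the seventh search.
    counts, n = run_until_stop([2, 2, 4, 2, 1, 2, 2, 3])
    print(dict(counts), n)
```

Under this rule a run can stop before every optimum has been seen, which is one way a multi-start strategy can miss optima with small basins.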
\[ W = [100, 95, 90, 90, 85] \tag{26} \]
The ten variables x_j are discrete, and each can take any value from the following set:
\[ [-5, -4.8, -4.6, -4.4, -4.2, -4.0, -3.8, -3.6, -3.4, -3.2, \ldots, 4.2, 4.4, 4.6, 4.8, 5.0] \tag{27} \]
Multiple local searches and a niche-based genetic algorithm are used to solve this problem. The five local optima found for this discrete ten-variable bump function problem are listed in Table 5. A branch-and-bound approach is used in the local search to handle the discrete variables. The multiple local search approach executes local searches sequentially and terminates when either all five local optima have been located or any local optimum has been located five times. The multiple local search approach is conducted 10 times to obtain average performance. A niche-based genetic algorithm that uses Goldberg and Richardson's traditional sharing, with r = 1.1, is executed 10 times with a population size of 100, and each run terminates after a total of 300 generations. The numbers of times each of the five local optima was located in 10 runs of the two approaches are compared in Table 6. The genetic algorithm with sharing consistently locates more local optima than the multiple local search approach in this problem, with less computational cost. This advantage of the niche-based genetic algorithm can be extended further as the number of discrete design variables increases.
Table 5. Five local optima found for the discrete ten-variable bump function problem
Local optimum    X1    X2    X3    X4    X5    X6    X7    X8    X9   X10     Obj.
1                 3     3     3     3     3   2.8   2.8   2.8   2.8   2.8    9.233
2                -2    -2    -2    -2    -2  -3.2  -3.2  -3.2  -3.2  -3.2   16.989
3                 4    -2     4    -2     4  -3.2   3.4  -3.2   3.4  -3.2   19.816
4              -3.8   2.8  -3.8   2.8  -3.8   2.6    -4   2.6    -4   2.6   24.04
5               3.8   3.6   3.8   3.6   3.8    -4  -3.8    -4  -3.8    -4   32.998
Fig. 20. Comparison of the chi-square-like deviation measure of all three sharing schemes on E2.
6. Concluding remarks
The sharing scheme proposed by Goldberg and Richardson enabled genetic algorithms to pursue multiple relative optima simultaneously in multimodal optimization problems. Since a multimodal design space may contain a number of niches of varied sizes, the predetermined sharing radius, an essential sharing parameter, is an inherent drawback of the traditional sharing scheme. The selection of the sharing radius strongly influenced the search results of the multimodal genetic algorithm with traditional sharing. Yin and Germay's sharing scheme with cluster analysis methods is capable of identifying clusters of varied sizes. Their sharing scheme outperformed the traditional sharing scheme on many of the illustrative problems, but unsatisfactory search results often occurred when 'bad' cluster parameters were used. The sharing scheme with NIT required no parameter settings based on the sizes of niches in the design space. Proper niche identification increased the effects not only of the sharing and mating restriction schemes but also of the elitist strategy. The newly proposed sharing with NIT produced significantly more stable and superior search performance compared with the two competing schemes. How to utilize and implement the topographical information existing in the design populations in a more robust and effective way for multimodal genetic searches is a topic worthy of future investigation.
Acknowledgements
The authors would like to thank the National Science Council of Taiwan for its financial support of this research under grant NSC88-2212-E-011-001.
References
[1] Hajela P. Genetic search - an approach to the nonconvex optimization problem. AIAA J 1990;26(7):1205–10.
[2] Lin C-Y, Hajela P. Genetic algorithms in optimization problems with discrete and integer design variables. Engng Optim 1992;19(3):309–27.
[3] Le Riche R, Haftka RT. Optimization of laminate stacking sequence for buckling load maximization by genetic algorithms. AIAA J 1995;31(5):951–6.
[4] Rao SS, Pan T-S, Venkayya VB. Optimal placement of actuators in actively controlled structures using genetic algorithms. AIAA J 1991;29(6):942–3.
[5] Hajela P, Lee E, Lin C-Y. Genetic algorithms in structural topology optimization. Proceedings of the NATO Advanced Research Workshop on Topology Design of Structures, Sesimbra, Portugal; 1992.
[6] Chapman CD, Saitou K, Jakiela MJ. Genetic algorithms as an approach to configuration and topology design. J Mech Des 1994;116:1005–11.
[7] Cavicchio DJ. Adaptive search using simulated evolution. PhD Thesis. University of Michigan; 1970.
[8] De Jong KA. An analysis of the behavior of a class of genetic adaptive systems. PhD Thesis. University of Michigan. Dissertation Abstracts International 1975;36(10):5140B.
[9] Mahfoud SW. Crowding and preselection revisited. Proceedings of the Second International Conference on Parallel Problem Solving from Nature; 1992. p. 27–36.
[10] Mahfoud SW. Crossover interactions among niches. Proceedings of the First IEEE Conference on Evolutionary Computation, vol. 1; 1994. p. 188–93.
[11] Goldberg DE, Richardson J. Genetic algorithm with sharing for multimodal function optimization. Proceedings of the Second International Conference on Genetic Algorithms; 1987. p. 41–9.
[12] Deb K, Goldberg DE. An investigation of niche and species formation in genetic function optimization. Proceedings of the Third International Conference on Genetic Algorithms; 1989. p. 42–50.
[13] Yin X, Germay N. A fast genetic algorithm with sharing scheme using cluster analysis methods in multimodal function optimization. Proceedings of the International Conference on Artificial Neural Networks and Genetic Algorithms; 1993. p. 450–7.
[14] Lin C-Y, Yang Y-J. Cluster identification techniques in genetic algorithms for multimodal optimization. Comput-Aided Civil Infrastruct Engng 1998;13(1):53–62.
[15] Beasley D, Bull DR, Martin RR. A sequential niche technique for multimodal function optimization. Evol Comput 1993;1(2):101–25.
[16] Lin C-Y, Liou J-Y, Yang Y-J. Hybrid multimodal optimization with clustering genetic strategies. Engng Optim 1998;30:263–80.
[17] Holland JH. Adaptation in natural and artificial systems. Ann Arbor: The University of Michigan Press; 1975.
[18] Lin C-Y, Hajela P. Design optimization with advanced genetic search strategies. Adv Engng Software 1994;21:179–89.
[19] Kirsch U. Optimum structural design: concepts, methods, and applications. New York: McGraw-Hill; 1981.