Using characteristics of the optimisation problem to determine the Genetic Algorithm population size when the number of evaluations is limited

Using characteristics of the optimisation problem to determine the Genetic Algorithm population size when the number of evaluations is limited

Environmental Modelling & Software xxx (2014) 1e14 Contents lists available at ScienceDirect Environmental Modelling & Software journal homepage: ww...

1MB Sizes 1 Downloads 40 Views

Environmental Modelling & Software xxx (2014) 1e14

Contents lists available at ScienceDirect

Environmental Modelling & Software journal homepage: www.elsevier.com/locate/envsoft

Using characteristics of the optimisation problem to determine the Genetic Algorithm population size when the number of evaluations is limited* Matthew S. Gibbs a, b, *, Holger R. Maier a, Graeme C. Dandy a a b

The University of Adelaide, Adelaide, South Australia, 5005, Australia Department of Environment, Water and Natural Resources, GPO Box 1047, Adelaide, South Australia, 5001, Australia

a r t i c l e i n f o

a b s t r a c t

Article history: Received 31 March 2014 Received in revised form 11 August 2014 Accepted 13 August 2014 Available online xxx

The Genetic Algorithm (GA) parameter values that result in the best possible solutions being found are generally problem specific, and therefore expected to be related to the characteristics of the fitness function. In this work, statistics that characterise the fitness function have been related to the convergence of a GA population due to the repetitive application of tournament selection. Assuming that this operator has the dominant influence on the variance of the population, and that the computational time available is limited, the result can be used to determine a suitable population size. The methodology developed has been compared to other GA calibration methodologies, and was found to be the best of the different methods considered across a range of stopping criteria and problem formulations. This result demonstrates the potential usefulness of fitness function characteristics to inform the configuration of GAs, and in turn find the best possible solutions. © 2014 Elsevier Ltd. All rights reserved.

Keywords: Genetic Algorithms Water resources Calibration Fitness function analysis Convergence Water distribution systems

Software availability The code (in C language) of the GA implementation adopted and functions used to calculate fitness function statistics are available from the corresponding author on request. The Cherry Hill-Brushy Plains Network test function considered is available with the EPANET software, and the code (C language) to calculate the fitness function for this test function is also available from the corresponding author on request. 1. Introduction Genetic Algorithms (GAs) have demonstrated the ability to successfully solve many optimisation problems in the field of water resources. This success is not unfounded, as these algorithms can operate on any objective function, and it has been demonstrated that a GA can be guaranteed to identify the optimal solution of an

*

Special Issue on Evolutionary Algorithms. * Corresponding author. The University of Adelaide, Adelaide, South Australia, 5005, Australia. Tel.: þ61 8 8313 0515. E-mail addresses: [email protected], [email protected]. edu.au (M.S. Gibbs), [email protected] (H.R. Maier), graeme.dandy@ adelaide.edu.au (G.C. Dandy).

optimisation problem given enough time (Suzuki, 1995). However, the process of selecting and applying a GA still requires expert knowledge to reliably produce suitable solutions, as discussed in Maier et al. (2014). One aspect of the application of GAs requiring expert knowledge is the specification of a number of algorithm parameters, where common parameters include population size, probability of crossover, and probability of mutation. Several studies have shown that a GA can tolerate some variation in its parameter values without dramatically affecting its overall performance (Lobo, 2000). However, even though GAs are quite robust with respect to some parameter settings, this does not imply that GAs will work equally well regardless of the settings. As specified by the No Free Lunch Theorem (Wolpert and Macready, 1997), there is no one set of parameter values that will lead to the best results for all problems. As such, the most suitable GA parameters should be identified for each optimisation problem. A poorly selected set of parameter values may lead the GA to converge prematurely to sub-optimal solutions, or direct it to search the fitness function too widely, without making maximum use of current knowledge to find better solutions. As the best parameter values to use are problem specific, it might be expected that these best parameter values are related to the characteristics of the fitness function to be optimised.

http://dx.doi.org/10.1016/j.envsoft.2014.08.023 1364-8152/© 2014 Elsevier Ltd. All rights reserved.

Please cite this article in press as: Gibbs, M.S., et al., Using characteristics of the optimisation problem to determine the Genetic Algorithm population size when the number of evaluations is limited, Environmental Modelling & Software (2014), http://dx.doi.org/10.1016/ j.envsoft.2014.08.023

2

M.S. Gibbs et al. / Environmental Modelling & Software xxx (2014) 1e14

Typically, GA parameter values are found using a trial-and-error approach. However, the GA operators will interact with each other, and therefore the behaviour produced by the value of one parameter will be dependent on the selected values for all other parameters. Due to computational requirements, only a limited number of parameter combinations can be tested, and it is unlikely that a trialand-error approach will identify parameter values close to those that will produce the best performance from the selected algorithm. In order to address this problem, supervisory algorithms have been used to calibrate GA parameters (Grefenstette, 1986; Boeringer et al., 2005; Chen et al., 2012). However, this reduces computational efficiency and introduces the problem of which parameters to use to control the supervisory algorithm. Many studies have recognised this importance of the characteristics of the problem on the success of a given optimisation algorithm, and a number of statistics have been developed to quantify these characteristics. Examples include the fitness distance correlation (Jones and Forrest, 1995), correlation length (Weinberger, 1990), epistasis variance (Davidor, 1991), schema variance (Radcliffe and Surry, 1995), hyperplane ranking (Mathias et al., 1995), iterated local search (Merz, 2004) and fitness evolvability portraits (Smith et al., 2002). Gibbs et al. (2011) proposed a number of statistics for this purpose, including a spatial correlation to identify the structure in the fitness function, and statistics based on the mutual information in a sample of fitness function values to identify highly salient decision variables, as well as interactions between decision variables. To assist with the problem of finding suitable GA parameter values based on problem characteristics, Gibbs et al. (2008) introduced a method for determining the optimal population size based on the number of generations before the population could be expected to converge due to genetic drift, which is the convergence of the population due to the repetitive application of a random selection operator. However, this approach only provides an upper bound to the convergence of the population, as it would be expected that the selection pressure produced by the fitness function values would cause the population to converge quicker than it would due to repeatedly applying random selection alone. However, this convergence due to selection pressure is more difficult to quantify, as it is dependent on the characteristics of the objective function. In order to address this shortcoming, an approach to quantify convergence due to selection pressure based on characteristics of the fitness function is introduced in this paper. This information can be used to estimate the number of generations of a GA population that could be expected to occur before it converges to a set of very similar solutions. This approach is in line with the No Free Lunch Theorem, as the only way one strategy can outperform another is if it is specialised to the structure of the specific problem under consideration (Ho and Pepyne, 2002), where the structure of the specific problem is quantified through the fitness function statistics. Given this, the objectives of this work are to: 1. Develop an improved approach to select the GA population size for a given computational budget, based on the convergence of the GA population. A component of this objective is to develop a relationship between the reduction in population variance due to the selection operator and the characteristics of the problem (i.e. the fitness function). 2. Test the utility of the approach developed on a number of Water Distribution System case studies, by comparing its performance with that of a number of existing approaches. The following section outlines a number of previous studies that provided the basis for the approach adopted in this work. Following

on from this background, a relationship to estimate the rate of convergence of a GA population based on characteristics of the fitness function is derived in Section 3. This convergence rate can then be used to determine a suitable population size for a given fitness function, and when combined with other previous findings, a full GA calibration methodology is produced. The resulting methodology is compared to other approaches that have been proposed to determine GA parameter values in Section 4, using two Water Distribution System (WDS) optimisation problems as test problems. The main implications of these results are then discussed in Section 5, before concluding with a summary of the main contributions of this work. 2. Background The methodology adopted in this work is based on the findings from a number of previous studies, which are briefly outlined in this section. There are two main areas of research presented, the first being approaches to estimate the convergence of a GA population due to the selection operator, and the second approaches to quantify characteristics of the fitness function that are expected to influence the convergence of the population. 2.1. GA population convergence Estimating the convergence of a GA population is useful to inform the setting of the GA parameters, to ensure a good balance is achieved between exploring the search space and exploiting the good solutions identified in the number of function evaluations available (Lee et al., 2008). The concept applied in this study was to identify GA parameter values that result in a balance between this trade off, by allowing as much exploration of the search space as possible, but still exploiting the best solutions identified by end of the GA run. While mutation is used to inject random solutions, much of the original variation in solutions is contained in the initial population. Hence, the main objective of the proposed approach was to specify a population size that is large enough to avoid premature convergence. However, if the population size is too large, the GA will not have sufficient time to exploit the good solutions that have been identified. It is assumed that the search spaces of the types of optimisation problems tackled in the water resources field are large, so large that the optimal solution is never found, and more searching of the problem domain will lead to locating better solutions. For example, the most commonly studied problem in the field of WDS optimisation is the New York Tunnels optimisation problem (Schaake and Lai, 1969). Improved solutions have been identified for this problem over a number of decades (Marchi et al., 2014). In this work, the term convergence is used as defined by Louis and Rawlins (1993) for the case when most of the population, in terms of decision variable values, is identical, or in other words, diversity is minimal. However, it is noted that in most cases (i.e. where the fitness function is not flat) there will be a direct relationship between convergence in the fitness function values and the decision variable values. It has been assumed in this work that the commonly used two parent tournament selection operator is adopted, and this operator is the dominant influence on the change in population variance over a GA run. Depending on the other operators selected in the implementation of a GA, this assumption may or may not be appropriate. Studies that have investigated the design of crossover operators for real coded GAs (Kita and Yamamura, 1999; Someya, 2013) have proposed a design principle of “statistics preservation”. The principle requires that distribution of the offspring

Please cite this article in press as: Gibbs, M.S., et al., Using characteristics of the optimisation problem to determine the Genetic Algorithm population size when the number of evaluations is limited, Environmental Modelling & Software (2014), http://dx.doi.org/10.1016/ j.envsoft.2014.08.023

M.S. Gibbs et al. / Environmental Modelling & Software xxx (2014) 1e14

generated by crossover operators should preserve the statistics such as the mean vector and the covariance matrix of the parents. If this guideline is followed (which is the case for the crossover operator implemented in this work) the operator would not be expected to change the population variance substantially, even though new solutions are being generated. Although the mutation operator will alter the population variance, as new solutions are generated at random, it is assumed in this work that this operator is applied at a low probability, and as such its effect on the overall population variance is small. Selection operators other than two parent tournament selection have not been considered, as tournament selection has been found to outperform ranking and roulette wheel based selection operators (Goldberg and Deb, 1991; Oladele and Sadiku, 2013; Zhong et al., 2005). The GA tournament selection operator will decrease the variance in the fitness function values, due to the process of replacing the worst solutions by copies of the better ones. Based on quantitative genetic theory (Falconer, 1981), the decrease in variance is a constant value at each generation, k. This results in an exponential decay of the variance in the fitness function values in terms of the number of generations. Therefore, the number of generations until the GA converges to a single solution, gconv, is dependent on this rate of decay of the population variance. Even without the influence of the fitness function values in the selection operator, the population will eventually converge to a single solution, simply due to the repetitive application of the tournament selection operator in a GA. This process is termed genetic drift. Rogers and Prugel-Bennett (1999) developed analytic expressions for the change in population variance due to genetic drift, and for a GA with a tournament size of two, the decay in population variance from one generation to the next was determined to be:

k¼1

1 N

(1)

where N is the population size. Gibbs et al. (2008) used this relationship to determine that the population size that can be expected to converge within a given number of function evaluations can be derived using:

FE logðkÞ ¼ M  log N

rffiffiffiffiffiffi! l 12

(2)

Where FE is the number of function evaluations available, l is the number of decision variables, and M is a parameter related to the convergence of the final population, recommended to be 3. By solving the implicit equation in Equation (2) for N by substituting in Equation (1), the largest population size that can be expected to converge due to genetic drift in the number of function evaluations that are available for a given problem size can be determined. However, in most cases, the fitness function does contribute to the convergence of the GA population, and if this is the case, it would be expected that convergence will occur due to selection pressure before it will occur due to genetic drift. Convergence due to selection pressure is dependent on the fitness function, and therefore is much more difficult to quantify. Hence, the population size computed by Equations (1) and (2) can be considered to provide a minimum bound on the population size, where even with no selection pressure the GA can be expected to converge for a given number of function evaluations. Gibbs et al. (2011) empirically tested the hypothesis that the number of generations before a GA population will converge due to selection pressure is constant. The hypothesis was validated for a number of test functions, provided certain conditions were

3

met. These conditions included the presence of a structured fitness function, and either epistatic interactions between decision variables (where combinations of variables must be considered concurrently to locate better solutions) or the presence of at least one salient decision variable dominates the fitness function value. These functions were termed ‘Optimal Generation Functions’ (OGFs), as there was found to be a constant number of generations before the population converged, that could be then used to inform the best population size for a given stopping criterion (specified as the number of function evaluations available). For functions that did not fit this classification, Gibbs et al. (2011) found that the best solutions were located using as many generations as possible, by adopting a small population size. As such, these functions were termed ‘Maximal Generation Functions’ (MGF). A number of statistics to quantify characteristics of the fitness function were also outlined by Gibbs et al. (2011), in that case to assist with the classification between OGFs and MGFs. 2.2. Quantifying fitness function characteristics The previous section has highlighted the need to quantify characteristics of the fitness function, to determine the type of function (i.e. OGF or MGF), and also to provide information about the convergence of a GA due to selection pressure. Quantifying characteristics of fitness functions is not a new idea, with a number of measures proposed over the past two decades. A number of recent reviews provide an overview of the current status and future directions for the application and development of fitness function statistics (Maier et al., 2014; Pitzer and Affenzeller, 2012; Malan and Engelbrecht, 2013). One of the first statistics used to investigate the behaviour of GAs was the correlation length (Weinberger, 1990). Along with the correlation length, Gibbs et al. (2011) also calculated the average correlation in the fitness function over the correlation length, Rav, which provided information about the “roughness” of the fitness function for solutions that are near each other in the search space. It was observed by Gibbs et al. (2011) that an increase in the roughness of the fitness function did not affect the correlation length, but the value for Rav did decrease with an increase in roughness, indicating that this is a useful statistic for the characterisation of fitness functions. Another problem characteristic that fitness function statistics have attempted to quantify is the degree of epistatic interactions between variables. An epistatic function is more complex than a simple summation of the relationship between each decision variable and the fitness function value, and is therefore derived from combination of the decision variable values. Davidor (1991) proposed epistasis variance to measure the epistasis of a problem, which in simple terms is a measure of the distance between a separable approximation to the fitness function and the fitness function itself. While there have been a number of improvements to the original epistasis variance (Reeves and Wright, 1995; Mason, 1991), Naudts and Kallel (2000) outlined a number of concerns regarding the application of epistasis variance to quantify the epistasis of a fitness function, namely: i) the statistic can only provide information about the presence of epistasis, not a measure of the degree of epistasis; and ii) the epistasis measure is sensitive to nonlinear scaling of the fitness function, something a GA with tournament or ranking selection is entirely insensitive to. Gibbs et al. (2011) proposed a separability measure to quantify the epistasis between decision variables, where by sampling the search space in a structured way, interactions between pairs of decision variables could be identified. If there were no interactions between the variables, the two sets of values would be the same, where any

Please cite this article in press as: Gibbs, M.S., et al., Using characteristics of the optimisation problem to determine the Genetic Algorithm population size when the number of evaluations is limited, Environmental Modelling & Software (2014), http://dx.doi.org/10.1016/ j.envsoft.2014.08.023

4

M.S. Gibbs et al. / Environmental Modelling & Software xxx (2014) 1e14

Fig. 1. Methodology adopted to estimate the optimal population size for convergence due to selection pressure, with steps undertaken in this work in black, and existing methods presented in grey.

differences are caused by interactions between the variables. By interrogating the interactions between variables, a number of statistics can be derived, including: the number of interacting variables (or building blocks), mBB, the maximum number of variables that interact together, kBB, and the distance between interacting variables in the GA solution string, dBB. A final measure proposed by Gibbs et al. (2011) used Mutual Information to identity highly salient variables. The Normalised Mutual Information, NI (Sharma, 2000) was calculated between the fitness function value and each decision variable value, for a random sample of solutions over the search space. A Salience Measure, D, was then proposed as the difference between the maximum NI and minimum NI across the set of decision variables. This statistic provides information about the maximum dominance of one variable over another for a given fitness function, which is also expected to influence the convergence of the GA population (Thierens and Goldberg, 1994; Thierens et al., 1998).

3. The relationship between fitness function characteristics and population convergence The previous section provided an overview of two important topics for this work, an understanding of the expected convergence of a GA population, and statistics that can be used to quantify characteristics of a fitness function. The purpose of this study is to develop a relationship between the two, so that the expected GA convergence can be estimated based on information about the fitness function. The methodology adopted is shown in Fig. 1. The elements in grey in Fig. 1 represent: 1) the previous methodology of Gibbs et al. (2008), where the convergence of a GA population due to genetic drift was determined for the two parent tournament selection operator, then Equation (2) was used to solve for the population size, and 2) the fitness function statistics selected by Gibbs et al. (2011). The methodology in black in Fig. 1 is the research contribution of this work to improve the estimate of the

Please cite this article in press as: Gibbs, M.S., et al., Using characteristics of the optimisation problem to determine the Genetic Algorithm population size when the number of evaluations is limited, Environmental Modelling & Software (2014), http://dx.doi.org/10.1016/ j.envsoft.2014.08.023

M.S. Gibbs et al. / Environmental Modelling & Software xxx (2014) 1e14

convergence of the GA population due to selection pressure, where the following has been undertaken:  Test Fitness Functions: A total of 18 672 test functions were generated to enable the empirical relationship between fitness function characteristics and population convergence to be developed.  Observed Population Convergence: Tournament selection was applied to a population of solutions multiple times, to quantify the reduction in population variance due to the selection operator for the test functions considered.  Calculate Range of Fitness Function Statistics: A number of fitness function statistics were calculated for each test function, to quantify a range of characteristics of each function considered.  Select Related Fitness Function Statistics: An input selection method was used to determine which of the fitness function statistics considered are related to the observed population convergence.  Relationship Between Convergence and Statistics: Finally, a relationship was developed between the selected fitness function statistics and the population convergence, to provide a method to predict this convergence based on the statistics applied to any fitness function. From Fig. 1, it can be seen that the proposed methodology is used to provide an improved estimate of the convergence rate of a GA population. This convergence rate can then be substituted into Equation (2) to estimate the population size for a given convergence criterion (expressed as a number of function evaluations), addressing the first objective of this work. The remainder of this section outlines each step of the methodology in more detail. 3.1. Test Fitness Functions Functions of the form of a Fourier Series have been selected as the test functions to investigate the convergence due to selection pressure. These functions have been selected as a wide range of characteristics can be generated, complex functions can be obtained by the superposition of a number of terms with different amplitudes and frequencies, and they are computationally cheap to evaluate. Up to three cosine terms have been used to generate the test functions. The functional form of the test functions is given in Equation (3):

FðxÞ ¼

3 X l X

  jp Ai cos fi xj ;

5

a frequency of f1 ¼ 1. This allows the global structure to change from a ‘big bowl’ with A1 ¼ 2, to totally flat, with A1 ¼ 0. The remaining two terms for each test function have been used to control the degree of roughness and multimodality. The amplitude and frequency values considered for terms two and three are given in Table 1. It can be seen from Table 1 that terms two and three can have similar frequencies, to produce complex function characteristics, where the cosine terms are additive for some of the search space, and negating each other in other areas of the search space. The p parameter in Equation (3) has been included to allow the salience of the decision variables to be controlled, with values of p ¼ 0 and 1 considered. Along with the original form of the test function without epistatic interactions between decision variables, interaction cases varying the number of pairs of interacting variables, mBB, and distance in the solution string between these pairs of variables, dBB, were considered, with values of mBB ¼ 1 and dBB ¼ l  1, mBB ¼ 2 and dBB ¼ l/2, mBB ¼ l/2 and dBB ¼ 2 adopted. Interactions between decision variables were produced by rotating each of the terms of the test function in Equation (3) using a transformation matrix, M, proposed by Salomon (1996). An example test function is shown in Fig. 2, which has been produced using parameter values: l ¼ 2, A1 ¼ 0.5, A2 ¼ 2, A3 ¼ 1, f1 ¼ 1, f2 ¼ 8, f3 ¼ 9, and p ¼ 1. Every combination of the function parameters considered in Table 1 has been tested, along with each of the different interaction cases, however, combinations that produce the same function characteristics have not been reevaluated. Therefore, a total of 18 672 functions were used.

3.2. Observed Population Convergence The decay in population variance has been computed for each of the test functions considered to provide data to calibrate the empirical relationship to. The population standard deviation, spop, has been computed based on the population variance statistic proposed by Beyer and Deb (2001):

spop

vffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi u n l  u1 X X 2 ¼t x  xs;j ; n i¼1 j¼1 s;i;j

(4)

where xs are sampled solution strings, with decision variable j in the solution size of l, for each solution i in a population of size n. Initially, the population of solutions was randomly sampled from a

(3)

i¼1 j¼1

where Ai and fi are the amplitude and frequency of each term i, and l is the number of decision variables. The range considered for the decision variables was xj 2 [0, 2p]. The first cosine term was used to control the global structure of the fitness function. To do so, each of the amplitudes shown in Table 1 has been implemented, along with Table 1 Parameters values used for the test functions. Parameter

Values

l Ai f1 f2 f3 p

2, 0, 1 2, 3, 0,

4, 10, 30, 50, 80, 100 0.5, 1, 2 4, 8, 16 5, 9 1

Fig. 2. Example of the Fourier Series test functions used.

Please cite this article in press as: Gibbs, M.S., et al., Using characteristics of the optimisation problem to determine the Genetic Algorithm population size when the number of evaluations is limited, Environmental Modelling & Software (2014), http://dx.doi.org/10.1016/ j.envsoft.2014.08.023

6

M.S. Gibbs et al. / Environmental Modelling & Software xxx (2014) 1e14

uniform distribution over the search space, to represent the initial parent solutions in a GA population. Tournament selection was then applied to the sampled solutions, producing the children solutions. The variance in the population will decrease after the selection operator was applied, as after selection the worst solutions in the population were replaced by a number of copies of the better solutions in the population. spop is then recomputed from the children solutions using Equation (4). The observed decay in the standard deviation of the population due to selection, kobs, was then computed by:

kobs ¼

spop; children : spop; parents

(5)

A total of n ¼ 5000 samples in the search space have been used to represent a population of solutions. To remove any variability produced by the random sampling, the average over 30 different random data sets was taken as the kobs value for each test function. 3.3. Calculate fitness function statistics Each of the fitness function measures proposed by Gibbs et al. (2011) has been applied to each of the test functions outlined in Section 3.1. Along with the average correlation in the fitness function up to the correlation length, Rav, the total correlation in the fitness function, RT, has also been calculated. The value for Rav was computed as the average area under the correlation plot for distances less than the correlation length (Rl), as indicated by the grey bars in Fig. 3. The total positive correlation in the fitness function, RT, was computed in a similar fashion, however, in this case, the area under the whole positive section of the correlation plot was used, as indicated by both the grey and black bars. This statistic provides a useful insight into the global structure in the search space. For example, if RT > Rav, points that are far apart in the search space are positively correlated, indicating that a ‘big bowl’ structure exists for the fitness function being analysed. However, if RT ¼ Rav, then the plot of correlation against distance in the search space does not turn positive again after the correlation length, indicating that the function is likely to be globally flat. Before a relationship between the calculated values of the fitness function statistics and the kobs values can be developed, the data derived from the test functions were screened to include only the functions that are OGFs, as the other type of function identified by Gibbs et al. (2011), MGFs, are not expected to benefit from a larger population size to find better solutions. Of the original 18 672

test functions considered, 9347 remained as OGFs suitable to be used to develop the relationship between kobs and the characteristics of the fitness function. 3.4. Select Related Fitness Function Statistics In order to identify the statistics that explained some of the variance in the kobs values calculated for the different test functions, an input determination technique was used. The Partial Mutual Information (PMI) (Sharma, 2000; May et al., 2008a, b) was used as the input determination technique, due to the method's ability to detect non-linear relationships between two variables. The PMI was calculated between the values of kobs as the dependent variable, and each of the statistic values as well as the dimension of the problem, l, were used as the candidate predictor variables. The analysis was performed using the Gaussian reference bandwidth and a Gaussian kernel. A base 2 logarithm has been used to compute the entropy, therefore the unit of the PMI values is ‘bits’. For this work, significant inputs were identified using 95% confidence levels, determined by bootstrapping the data 1000 times. The results from the input determination procedure using the data for the 9347 test functions can be seen in Table 2, with the significance values indicated in brackets. All the statistics computed have a strong relationship with kobs, apart from those derived from the separability measure, namely mBB and kBB. Therefore, the statistics computed from the separability measure were not been included in the relationship to estimate kobs. 3.5. Relationship between population convergence and fitness function statistics The general form of the function to estimate the decay in population variance due to selection adopted was as follows:

kpred ¼ 1  aRbl RcT Rdav lðeþfDÞ ;

(6)

where: kpred is the predicted decay of population variance, Rl is the correlation length, RT is the total correlation, Rav is the average correlation, l is the problem size, D is the dominance statistic, and a, b, c, d, e and f are all parameters to be calibrated to fit kpred to kobs. The overall functional form of Equation (6) was based on the theoretical population sizing equation proposed by Harik et al. (1999), where each input is raised to a power, and the product of each input term is taken. The population sizing equation proposed by Harik et al. (1999) was focused on maximising the final solution quality found by the GA, and hence terms similar to mBB and kBB were included in the equation. However, as discussed above, these parameters were found not to influence the convergence of the population. The relationships proposed by Thierens and Goldberg (1994) and Thierens et al. (1998) have also been used to specify the functional form of the relationship. Their results suggested that the

Table 2 Input selection results.

Fig. 3. The computation of the correlation statistics from the spatial correlation measure.

Rank

Parameter

PI (Xi, Y)

1 2 3 4 5 6 7

Rav RT L R1 D mBB kBB

0.899 0.825 0.384 0.288 0.212 0.076 0.077

(0.177) (0.113) (0.101) (0.083) (0.089) (0.079) (0.086)

Please cite this article in press as: Gibbs, M.S., et al., Using characteristics of the optimisation problem to determine the Genetic Algorithm population size when the number of evaluations is limited, Environmental Modelling & Software (2014), http://dx.doi.org/10.1016/ j.envsoft.2014.08.023

M.S. Gibbs et al. / Environmental Modelling & Software xxx (2014) 1e14

pffi relationship between gconv and l was between Oð lÞ and O(l), depending on the salience of the decision variables. Therefore, the l and D parameters have been included in the form l(a þ bD). A non-linear generalised reduced gradient search method was used to determine the parameter values in the relationship, as it would be expected that the search space characteristics are smooth, with few local optima, and gradient methods are suitable for these purposes. A number of different starting points were used with the search method in order to have the best chance of determining the globally optimal set of coefficients. Based on these analyses, the following relationship has been fitted to the data collected from the test functions:

kpred ¼ 1  2:266

R1:168 R2:145 T l ð0:506þ0:346DÞ 1:074 Rav l

:

(7)

The plot of the kobs values experimentally computed from a sample of solutions from each test function against the kpred values estimated using Equation (7) is given in Fig. 4. The dashed grey line indicates the line of perfect prediction. The figure shows that very good agreement was achieved between the functional form used in Equation (7) and the experimentally derived data, with a coefficient of determination of R2 ¼ 0.812. The biggest errors in the predictions, seen for some functions with an experimentally observed kobs > 0.98, can be attributed to a slight underestimation of either Rl or RT. However, this represents a small number of the total function evaluations at this value of kobs, indicated by the high density of points closer to the line of perfect prediction. Equation (7) can be computed for any real-valued fitness function, after the spatial correlation and dominance fitness function statistics have been applied to that fitness function. Once a value of kpred is obtained, Equation (2) can be used to provide an estimate of the population size expected to converge due to selection pressure for a given number of function evaluations available to solve the problem. 4. Application of GA calibration methods to WDS optimisation The method of determining the GA population size proposed has been compared to three other approaches for setting the population size. These three methods were also considered by Gibbs et al. (2010b), including the similar but simpler method to

Fig. 4. Predicted and observed values of k for the Fourier Series test functions.

7

determine the population size based on convergence due to genetic drift (Gibbs et al., 2008), a commonly used value of N ¼ 100, and the parameterless approach, which involves starting with a small population size, and continually increasing the population size each time the GA stalls in locating better solutions. To set the remaining GA parameter values, both constant and self-adaptive parameter values have been considered, where the self adaptive approach finds values for the GA parameters as part of the solution string. These approaches are described in more detail below. The GA calibration methods have been compared on two optimisation problems. Both problems consider the optimal operation of a WDS, to minimise the costs involved in operating the system. The first is a network that has been optimised by a number of different approaches in the past, the Cherry Hill-Brush Plains network. This optimisation problem was used by Gibbs et al. (2010b) to compare the three existing GA population sizing methods considered. The second network is the Woronora WDS, located in Sydney, Australia. In this case, the optimisation problem is to determine the most cost effective operation of the pumps and valves in the network, while also considering the water quality produced by the operation of the WDS. Both problems are described in more detail below. 4.1. Genetic Algorithm and calibration methods 4.1.1. Genetic Algorithm The GA implementation of Gibbs et al. (2008) was used in this work. This included real coding to represent the decision variables, tournament selection with a tournament size of two, and an elitist strategy to preserve good solutions, where the worst solution in the population in each generation was replaced by an additional copy of the best solution found each generation. A one-point distributed crossover operator was used, as neighbourhood-based crossover operators such as this have been found to exploit the numerical nature of real coded GAs (Herrera et al., 1998). A uniformly distributed mutation operator was adopted, producing any value over the range of each decision variable. Therefore, for the operators adopted, the following parameters values are required: population size N, probability of crossover pc, probability of mutation pm, number of elite solutions e, and standard deviation of the distribution used for the crossover operator c. In this work pm is the probability that mutation is applied to a solution string, with the mutated variable selected randomly within the string, if applied. This GA conforms to the assumptions made in developing the relationship between population convergence and fitness function statistics, as two parent tournament selection is used, and the parent centered crossover operator has no net effect on the variance in the population. Also, as outlined in the following section, pm is low, and as such tournament selection has the major influence on the change in population variance from one generation to the next. 4.1.2. GA calibration methods The first approach considered was the simplest, adopting typical values. For this work, they were assumed to be N ¼ 100, pm ¼ 1 per €ck and Schwefel, 1993). string, pc ¼ 0.9, c ¼ 6, and e ¼ 1 (Ba The second and third methods are similar, with the difference being the method used to determine the population size. The first of these was that proposed in the previous section, called ‘Predicted’, substituting Equation (7) into Equation (2) to solve for the population size. The other method uses a similar approach, but instead Equation (1) was substituted into Equation (2), and is called ‘Drift’. For the remaining GA parameters, the values proposed by Gibbs et al. (2008) are adopted, pm ¼ 5/N, pc ¼ 1, c ¼ 6, and e ¼ 1. The final method considered was the ‘Parameterless GA calibration method. This GA calibration methodology was first

Please cite this article in press as: Gibbs, M.S., et al., Using characteristics of the optimisation problem to determine the Genetic Algorithm population size when the number of evaluations is limited, Environmental Modelling & Software (2014), http://dx.doi.org/10.1016/ j.envsoft.2014.08.023

8

M.S. Gibbs et al. / Environmental Modelling & Software xxx (2014) 1e14

proposed by Harik and Lobo (1999), as well as in a number of subsequent studies (Lobo and Goldberg, 2004; Reed and Yamaguchi, 2004; Minsker, 2005). As is the case in this work, Reed and Yamaguchi (2004) assumed that population size is the key parameter controlling the reliability and efficiency of GAs. The approach involves doubling the population size after there has been no improvement in the best solution found for a number of generations, and standard values are used for the remaining GA parameters. The initial starting population used in this work was N ¼ 10. As proposed by Reed and Yamaguchi (2004), the population size was increased after a minimum of g ¼ Nl generations, and after the improvement in the fitness function value from one generation to the next was less than 1%. When this occurred, the population size was doubled and reinitialised with random solutions, along with the best solution from the last GA run. The process was continued until the available number of FEs was reached. For this work, the parameter values proposed by Minsker (2005) for the parameterless GA were adopted, namely pc ¼ 0.5 and pm ¼ 1. As with the other approaches, c ¼ 6 and e ¼ 1 have been used for the remaining GA parameters. The parameters adopted for the four different methods are summarised in Table 3, and c ¼ 6 and e ¼ 1 are adopted for all methods. It can be seen that different values for pc and pm were used for the different population sizing methods. This approach was used to preserve the relationship between the GA parameter values as proposed in the literature, as the different values are not necessarily separable. The GA parameter values other than the population size used for the Drift relationship were also adopted for the Predicted method developed in this work, as these two approaches are conceptually similar. Also, it was assumed that population size is the key parameter controlling the reliability and efficiency of the GA (Reed and Yamaguchi, 2004), and as such the influence of the other GA parameter values on the solutions identified are expected to be secondary to the differences in population size. As mentioned previously, each of these methods to determine the population size have also been used in a self-adaptive framework. In these cases, values for pm, pc and c are included in each GA solution string to be optimised, along with the decision variables being solved. When crossover occurred, the crossover parameter values to be used were randomly selected with an equal probability from one of the two parent solutions to be crossed over. The range for each parameter considered was [0  pm  1], [0.5  pc  1], and [0.01  c  36]. A value of e ¼ 1 was again adopted, as along with the population size, the number of elite solutions is applied at the population level, not the individual solution level, and therefore there is no clear way for the GA to self adapt this value. 4.1.3. Analyses conducted The eight GA calibration methods outlined above, namely Predicted, Drift, Parameterless, and Typical, with both constant and self-adaptive parameter values, have been applied to both WDS optimisation problems considered (outlined below). Three different stopping criteria have been considered to test the ability of the methods to adapt to different conditions (i.e. short run times and long run times), with FE ¼ 5  103, 104, and 105 tested. Table 3 Parameter values adopted for the different GA calibration methods. Name

Population size (N)

pm

pc

Predicted Drift Parameterless Typical

Equations (5) and (2) Equations (1) and (2) Incrementally Doubled 100

5/N 5/N 1 1/l

1 1 0.5 0.9

To test for differences between the different methods, each GA calibration method was applied to each problem 13 times, repeated due to the stochastic nature of the GA. The number of 13 runs was selected as a trade-off between the number of replicates for statistically significant results, and the computation time required to repeat the GA runs, which was in the order of months for the results presented in this work. However, current research has investigated approaches to speed up WDS simulation models, through decomposition (Diao et al., 2014; Zheng and Zecchin, 2014) or metamodelling (Broad et al., 2010; Razavi et al., 2012) approaches. A two-sided t-test was used to test for a difference in the mean fitness function value found by the different GA calibration methods, with a < 0.05 used to identify significantly different results. Based on these results the eight methods have been ranked for each test problem and each stopping criterion. When the t-test did not result in a statistically significant difference in the mean solution found, the two calibration approaches were given the same ranking. To provide an overall ranking for each test problem, the average ranking across the different stopping criteria has also been calculated. 4.2. Test problems 4.2.1. Cherry Hill-Brushy Plains Network The network layout of the Cherry Hill-Brushy Plains portion of the South Central Connecticut Regional Water Authority network, USA, used to test the GA calibration methods, is given in Fig. 5. In the past, this network has been used to validate and test different water quality models, and has been optimised by a number of methods, including linear programming (Boccelli et al., 1998), a linear least-squares formulation (Propato and Uber, 2004) and a number of GAs (Munavalli and Kumar, 2003; Gibbs et al., 2010b, for example). The data for the system are the same as those used by Boccelli et al. (1998). The objective of the Cherry Hill-Brushy Plains optimisation problem is to minimise the mass of chlorine added to the system over a 24 h period. The decision variables are the mass of chlorine injected at each dosing point, in mg/min, for each of the four six-hour time periods when the hydraulics of the system change. The dosing points are represented by nodes A, B, C, D, E, and F in Fig. 5. This produces a total of 24 decision variables to be determined by the GA. The range considered for each decision variable was between 0 and 800 mg/min. To remove the influence of the initial chlorine concentrations in the network, the simulation was run until a 24 h repeating pattern was observed in the chlorine concentrations at the demand nodes. The constraint on the system is to maintain the chlorine concentrations in the network in the acceptable range of 0.2e4.0 mg/L. A penalty of 100 (mg/L)1 multiplied by the difference between the actual and required chlorine concentrations was applied at any nodes that violate this constraint each 10 min time step of the simulation. 4.2.2. Woronora WDS The Woronora WDS optimisation problem is a ‘real world’ WDS optimisation problem in Sydney, Australia. The layout of the Woronora WDS EPANET model can be seen in Fig. 6. The objective of the Woronora optimisation problem was to minimise the total cost of operating the system. The total cost consists of the electricity cost due to running the pumps in the system, and the cost of dosing calcium hypochlorite tablets in the reservoirs. The decision variables that could be changed to meet the objective were the operations for three pump stations, two valves and the tablet dosing in each reservoir to maintain water quality constraints. Each of the pumping stations consisted of two active pumps, which were controlled by the water level in the reservoir

Please cite this article in press as: Gibbs, M.S., et al., Using characteristics of the optimisation problem to determine the Genetic Algorithm population size when the number of evaluations is limited, Environmental Modelling & Software (2014), http://dx.doi.org/10.1016/ j.envsoft.2014.08.023

M.S. Gibbs et al. / Environmental Modelling & Software xxx (2014) 1e14

9

provided by Gibbs et al. (2010a). In order for the selection operator to compare two infeasible solutions, penalty factors have been applied to any constraints that have been violated. A penalty of 100 multiplied by the maximum constraint violation over the simulation was included in the fitness function value for any infeasible solutions. The exception to this was the constraint on the minimum total chlorine concentration, for which a penalty of 5 (mg/L)1 was multiplied by the constraint violation at each time step for which the chlorine concentration was below 1 mg/L at a demand node. Four different scenarios of the Woronora WDS system have been considered to be optimised. The first scenario initialised the model with reservoir levels to 70% full. The second scenario was the same as the first, however, the constraint on the chlorine concentrations was ignored. This scenario was considered to investigate the effect of including the water quality constraint on the best hydraulic operation of the system. The third scenario was initialised with reservoir levels that are more representative of those used in the system, which are typically much higher than the 70% full considered in scenario 1. It was expected that by reducing the overnight reservoir levels in scenario 1, compared to scenario 3, more pumping will be able to be shifted into the off-peak electricity period, therefore producing lower pumping costs. Finally, scenario 4 is the same as scenario 3, but as with scenario 2, the water quality constraint was ignored. For the scenarios that ignored the water quality constraint, a simulation time of 24 h was used, and with the water quality constraint included, a 57 h simulation was required to allow enough time for the influence of the tablet dosing to be observed. 4.3. Fitness function analysis The results from calculating the fitness function statistics for each test problem are outlined in this section. For each function, a sample of 5000 solutions was used to calculate the statistics.

Fig. 5. Schematic of the Cherry Hill-Brushy Plains Network.

downstream. Therefore, the decision variables for each pump were the water level in the reservoir to trigger the pump to turn on, and the water level in the reservoir to trigger the pump to turn off. The other hydraulic decisions to be made were the reservoir trigger levels for the AICV at Lucas Heights to open and close, and the trigger levels in the Menai reservoir to control the TCV upstream. Therefore there were a total of 16 different hydraulic decision variables. Different values for the hydraulic decision variables were permitted for the different periods that correspond to different electricity tariffs over the day. Therefore, there were total of four sets of the 16 hydraulic decision variables, producing 64 hydraulic decision variables. The final decisions that must be made was whether or not to dose calcium hypochlorite tablets in each of the seven reservoirs, producing a total of 71 decision variables for the Woronora WDS fitness function when the water quality constraint was considered. The objective of minimising the costs of operating the Woronora WDS was subject to a number of constraints, such as including minimum water levels in each reservoir, that the reservoir must return to at least the level it was at the start of the day, minimum flow from the WFP, minimum total chlorine concentrations, a maximum electricity capacity and a maximum number of pump switches per day. More details on the specific constraints are

4.3.1. Cherry Hill-Brushy Plains Network The two fitness function statistics required to calculate Equation (7) have been applied to the Cherry Hill-Brushy Plains fitness function over the defined search space of 0e800 mg/min for each decision variable. After applying the spatial correlation measure and the dominance measure the values determined for the necessary statistics were: Rav ¼ 0.2434, Rl ¼ 0.45, RT ¼ 0.2434 and D ¼ 0.0407. From these values of the spatial correlation and dominance measures, the value of kpred computed from Equation (7) was kpred ¼ 0.9626. Based on a problem size of l ¼ 24, the number of generations before the population will converge on this function was estimated to be gconv ¼ 190. Therefore, for the Predicted calibration methodology, the population sizes used for the different convergence criteria were N ¼ 30 (FE ¼ 5  103), N ¼ 50 (FE ¼ 104), and N ¼ 530 (FE ¼ 105). For the Drift GA calibration method, the population size was determined from Equation (2), based on l ¼ 24 and the different cases of FE considered were N ¼ 30 (FE ¼ 5  103), N ¼ 40 (FE ¼ 104), and N ¼ 120 (FE ¼ 105). For the shorter convergence times of FE ¼ 5  103 and 104, it can be seen that both the Predicted and Drift methods used very similar population sizes. However, for FE ¼ 105, the population size calculated by the Drift method was much smaller than that calculated by the Predicted method. 4.3.2. Woronora WDS The spatial correlation measure and the dominance measure were also applied to the Woronora WDS fitness function, and the values determined for the four different scenarios considered are provided in Table 4. It can be seen that the different variations of

Please cite this article in press as: Gibbs, M.S., et al., Using characteristics of the optimisation problem to determine the Genetic Algorithm population size when the number of evaluations is limited, Environmental Modelling & Software (2014), http://dx.doi.org/10.1016/ j.envsoft.2014.08.023

10

M.S. Gibbs et al. / Environmental Modelling & Software xxx (2014) 1e14

Fig. 6. Schematic of the Woronora WDS EPANET model.

the fitness function considered had little effect on the fitness function statistics computed, with the only noticeable difference being a decrease in the correlation length for scenario 4. The results for the dominance statistic indicate that there was not one decision variable that dominated the fitness function value. The results for the spatial correlation statistics indicated that the fitness function is a plane, as Rav ¼ RT. The values of kpred and gconv computed from the spatial correlation and dominance measure results are given in Table 4. The larger value predicted for gconv for Scenario 4 was produced by the higher value of kpred ¼ 0.993, which was due to the shorter correlation length for this version of the fitness function. The corresponding population sizes used by the Predicted method for each of the stopping criteria considered are also given in Table 4. For the Drift GA calibration method, the population sizes calculated were N ¼ 30 (FE ¼ 5  103), N ¼ 40 (FE ¼ 104), N ¼ 110 (FE ¼ 105). The different problem sizes of l ¼ 64 and 71 did not influence the population size after being rounded to the nearest 10 solutions. 4.4. Optimisation results 4.4.1. Cherry Hill-Brushy Plains Network The solutions found by the different GA calibration methods for the three different FE considered are given in Table 5 and the rankings of the methods for the different stopping criteria in Table 6, sorted by their mean ranking over the different stopping criteria. Overall, it can be seen that the Predicted and Drift methods

Table 5 Average fitness fiction value found (g/day) for the different GA calibration methods.

Table 4 Statistic values for the Woronora fitness function. Scenario

Rav

R1

RT

D

k

gconv

N 5  10

1 2 3 4

0.134 0.122 0.129 0.117

0.45 0.45 0.45 0.3

0.134 0.122 0.129 0.117

0.011 0.019 0.020 0.014

0.988 0.989 0.989 0.993

with constant parameter values found the best solutions on average, with no significant difference between the two methods. As might be expected with no mechanism to alter the population size based on the problem, the typical GA population size with constant values for the other parameters performed worst of the four methods tested. The population size for the Predicted and Drift methods are the same for FE ¼ 5  103 (N ¼ 30) and very similar for FE ¼ 104 (N ¼ 50 and 40, respectively). Hence, the results produced from these two methods for these stopping criteria were very similar. However, by increasing the population size to N ¼ 100, used by the Typical GA calibration method, the mean value found was much higher, being 10 times higher for the stopping criterion of FE ¼ 5  103, and three times higher for the stopping criterion of FE ¼ 104, implying that this larger population size has not had time to converge for the two shorter stopping criteria considered. Very similar solutions were found by all GA calibration methods for the longest stopping criterion of FE ¼ 105. This is somewhat surprising, as the Drift and Predicted methods used very different population sizes of N ¼ 120 and N ¼ 530, respectively. The Typical and Drift methods were used with very similar population sizes for FE ¼ 105, however the poorer performance of the Typical method can be attributed to the higher mutation rate adopted. The Parameterless method performed similarly to the Predicted and Drift methods for FE ¼ 104 and 105. The Parameterless method did produce the best results averaged over the 13 independent GA runs for the stopping criterion of FE ¼ 104. In this case, the

655 692 693 1159

10 10 10 10

3

10 20 10 10 10

4

10

5

150 140 140 90

FE

Parameters

Predicted

Drift

Parameterless

Typical

5e3

Constant Self-adapt. Constant Self-adapt. Constant Self-adapt.

3447 59 970 2294 12 232 1197 1245

3447 59 970 3152 12 746 1196 1763

14 315 13 643 1531 27 956 1318 1307

33 091 29 792 9144 10 417 1616 1500

1e4 1e5

Please cite this article in press as: Gibbs, M.S., et al., Using characteristics of the optimisation problem to determine the Genetic Algorithm population size when the number of evaluations is limited, Environmental Modelling & Software (2014), http://dx.doi.org/10.1016/ j.envsoft.2014.08.023

M.S. Gibbs et al. / Environmental Modelling & Software xxx (2014) 1e14 Table 6 Average rankings.

11

Table 8 Ranking of GA calibration methods for Woronora fitness function for FE ¼ 104.

Parameter setting

Mean

5  103

104

105

Predicted - constant values Drift - constant values Parameterless - constant values Parameterless - self adaptive Predicted - self adaptive Typical - self adaptive Drift - self adaptive Typical - constant values

1.67 1.67 3.50 4.83 5.17 5.83 6.67 6.67

1.50 1.50 3.50 3.50 6.50 6.50 6.50 6.50

2.00 2.00 2.00 6.00 6.00 6.00 6.00 6.00

1.50 1.50 5.00 5.00 3.00 5.00 7.50 7.50

differences were not statistically different, however, this result can be largely attributed to the higher variances for these methods. The Parameterless method performed slightly worse that the Predicted and Drift methods for FE ¼ 105, and from Table 6, it can be seen that the difference in average solution found was significant at a 95% confidence level. 4.4.2. Woronora WDS Given the number of combinations of GA calibration methods, stopping criteria and scenarios considered for the Woronora system, only the rankings of the different GA calibration methods have been presented, in Tables 7e9 for the stopping criteria of FE ¼ 5  103, FE ¼ 104, and FE ¼ 105, respectively. The overall ranking for each calibration method, averaged across stopping criteria, is given in Table 10. The ranked order of the different calibration methods is very similar to that obtained for the Cherry Hill - Brushy Plain problem, where the Predicted method with constant values produced the best results for all cases considered. The relative performance of the Predicted method was better for the convergence criterion where a smaller number of FEs was used, indicating the value of this approach when a limited computational budget was available. The Parameterless method with constant parameter values also produced very good results, as the only statistical difference between this method and the Predicted method was for the FE ¼ 5  103 stopping criterion. The Drift method with constant values for pm, pc, and c also had the same overall ranking as the Predicted and Parameterless calibration methods for the FE ¼ 105 stopping criterion. The Drift and Predicted calibration methods used very similar population sizes for FE ¼ 104 and 105, and therefore the ranks for these methods for these two stopping criteria are also very similar, with the only difference being that the Predicted method located a better solution on average for scenario 4 after FE ¼ 104. However, for FE ¼ 5  103 the Predicted method with constant parameter values consistently outperformed the Drift method, where in this case the Predicted method used N ¼ 10, compared to N ¼ 30 for the Drift method. The Typical GA parameter values produced statistically poorer results for all stopping criteria than the three other GA calibration methods with constant parameter values. This result further reinforces the need to determine the most suitable GA parameter values on a case by case basis. Table 7 Ranking of GA calibration methods for Woronora fitness function for FE ¼ 5  103.

Parameter setting

Predicted - constant values Parameterless - constant values Drift - constant values Typical - constant values Typical - self adaptive Drift - self adaptive Parameterless - self adaptive Predicted - self adaptive

Mean

2.5 2.5 3.5 4.2 4.2 5.9 6.2 6.9

Scenario 1

2

3

4

2.0 2.0 2.0 5.0 5.0 7.5 5.0 7.5

3.0 3.0 3.0 3.0 3.0 7.0 7.0 7.0

3.0 3.0 3.0 3.0 7.0 3.0 7.0 7.0

2.0 2.0 6.0 6.0 2.0 6.0 6.0 6.0

Table 9 Ranking of GA calibration methods for Woronora fitness function for FE ¼ 105. Parameter setting

Mean

Scenario 1

2

3

4

Predicted - constant values Drift - constant values Parameterless - constant values Typical - self adaptive Predicted - self adaptive Typical - constant values Drift - self adaptive Parameterless - self adaptive

3.8 3.8 3.8 4.6 4.8 4.8 4.8 5.9

2.5 2.5 6.5 2.5 6.5 6.5 6.5 2.5

3.5 3.5 3.5 7.0 3.5 3.5 3.5 8.0

4.0 4.0 4.0 4.0 4.0 4.0 4.0 8.0

5.0 5.0 1.0 5.0 5.0 5.0 5.0 5.0

Table 10 Overall ranking of GA calibration methods for Woronora fitness function. Parameter setting

Mean

5  103

104

105

Predicted - constant values Parameterless - constant values Drift - constant values Typical - self adaptive Typical - constant values Drift - self adaptive Predicted - self adaptive Parameterless - self adaptive

2.62 3.00 3.62 4.42 4.58 5.33 6.12 6.29

1.62 2.75 3.62 4.38 4.75 5.38 6.75 6.75

2.50 2.50 3.50 4.25 4.25 5.88 6.88 6.25

3.75 3.75 3.75 4.62 4.75 4.75 4.75 5.88

In general, the calibration of the GA did not have as great an impact on the performance of the GA on the Woronora WDS fitness function. This can be seen in the rankings of the different methods for each case in Tables 7e9, where generally there were only two or three different ranks shared between the eight GA calibration methods. This may be due to the lower average correlation computed for this fitness function of Rav z 0.13, where for the Cherry Hill-Brushy Plain fitness function the average correlation of the fitness function was Rav ¼ 0.24. Therefore, as there is less structure in the fitness function, the GA must rely more on randomly searching the search space, and the parameter values are less influential on the final solution found by the GA. 5. Discussion 5.1. Comparison of GA calibration methods

Parameter setting

Mean

Scenario 1

2

3

4

Predicted - constant values Parameterless - constant values Drift - constant values Typical - self adaptive Typical -constant values Drift - self adaptive Predicted - self adaptive Parameterless - self adaptive

1.6 2.8 3.6 4.4 4.8 5.4 6.8 6.8

1.0 2.5 2.5 6.5 4.0 6.5 6.5 6.5

2.0 5.0 2.0 2.0 5.0 5.0 7.5 7.5

1.5 1.5 4.0 7.0 4.0 4.0 7.0 7.0

2.0 2.0 6.0 2.0 6.0 6.0 6.0 6.0

The Predicted method found the best solutions in all cases considered (although not always uniquely), and the advantage of this approach was most evident when the computational budget was limited, i.e. the stopping criterion of FE ¼ 5  103. The main disadvantage to the Predicted method is that the fitness function statistics must be computed prior to the optimisation of the problem to determine the population size. This extra computational burden (5000 samples) has not been considered in the

Please cite this article in press as: Gibbs, M.S., et al., Using characteristics of the optimisation problem to determine the Genetic Algorithm population size when the number of evaluations is limited, Environmental Modelling & Software (2014), http://dx.doi.org/10.1016/ j.envsoft.2014.08.023

12

M.S. Gibbs et al. / Environmental Modelling & Software xxx (2014) 1e14

comparison of methods, under the assumption that it is a one-off cost for a given fitness function that may be optimised many times. As such, this method is likely to be most applicable to situations where the optimisation process is repeated frequently within a short time frame, for example in operational scenarios where different starting conditions influence the management of a system. The Parameterless calibration method is easier to implement than the Predicted method, as in this case, the information about the fitness function is obtained implicitly as the GA is solving the problem. However, the Parameterless method is not straight forward to apply either, as a number of modifications must be made to a standard GA to allow the population size to be doubled automatically, and to inject the best solution from the previous GA run into the initial population. Also, the results suggest that in general, one GA run with a suitably identified population size can find better solutions than multiple restarts of smaller population sizes for the same FE. Other studies (Cantú-Paz and Goldberg, 2003, for example) have also come to this conclusion. The Drift GA calibration method performed as well as the Predicted method for the Cherry Hill-Brushy Plains fitness function, and this approach was only slightly outperformed by both the Predicted and Parameterless methods on the Woronora fitness function. The Drift method has the advantage that the population size can be determined very easily prior to undertaking any optimisation runs. Therefore, the Drift method may provide a simple alternative to the Predicted calibration method to determine a suitable population size for applications where it is not feasible to compute the fitness function statistics. However, in situations where the available computational time is limited, the Predicted method is likely to perform better. Of the methods considered to determine the population size, the ‘typical’ GA parameter values performed worst. The typical GA parameter values used were the same for each function, whereas all other methods considered had a mechanism to change the GA parameter values with the function characteristics; either by increasing the population size with FE or l (Drift method), increasing the population size after only a smaller population size had converged (Parameterless method), or by relating the population size directly to the characteristics of the function (Predicted). Therefore, it is not surprising that the Typical method performed worst overall, as the GA parameter values were not tuned to each fitness function, thereby not providing different GA behaviour for the different functions. With the exception of the Typical method, the calibration methods that adopted self-adaptive values for the pm, pc, and c parameters produced the worst overall results of all the methods €ck et al. (2000), who considered. A similar result was found by Ba concluded that the performance of self-adapting the parameters pm and pc was disappointing when used on its own (i.e.; without any control of the population size). The most likely reason for this is that time spent on searching for good parameter values is time €ck et al., 2000). The taken away from finding the optimum (Ba exception to this is when there is no ability to alter the population size based on the problem being solved (i.e. the Typical method) and in this case some benefit was found by adopting self-adaptive parameter values, but this benefit was much less than using an approach to set the population size in some way based on the characteristics of the problem considered. 5.2. Potential of fitness function statistics The good performance of the Predicted method suggests that fitness function statistics can be useful to inform the calibration of GA parameter values. Many statistics to provide information about

different fitness function characteristics have been developed previously, however, for each example that indicates that any of these measures provide information about the difficulty of an optimisation problem, there appear to be as many counter examples displaying that statistic's unreliability. Generally, other studies have concluded that fitness function statistics are a poor indicator of GA performance. However, these studies did not consider the calibration of the algorithm, and hence the convergence observed was in some ways rather arbitrary. The Predicted method relates the GA population size to the characteristics of the problem, which fitness function statistics provide information about. This approach is different to that generally adopted when making use of these statistics, which typically attempts to relate fitness function statistic values to how closely a GA will converge to the optimum solution. However, this is not only a function of the fitness function characteristics, but also the GA operators adopted, the parameters for those operators and the convergence criterion. Maier et al. (2014) noted that an important question to answer is whether calculating fitness function statistics is worthwhile, or whether the function evolutions and processing time required for fitness function analysis are better spent on actually solving the problem. This is especially the case for water resources problems, where often the fitness function is based on the output of simulation models that have long computation times and a high dimensional search space that requires many evaluations in order to obtain sufficient samples for accurate estimation of fitness function statistics. While fitness function analysis is usually much more resource intensive than just solving a given problem, the resulting insights can be valuable for increasing problem understanding (Pitzer and Affenzeller, 2012). As such, fitness function analysis may not be the best use of resources in applied studies where the sole aim is to solve a single problem instance and the computational time is not strictly limited, as most approaches considered in this work performed similarly for the longest stopping criterion considered. However, this approach is likely to be highly beneficial in more research focused studies, where a deeper understanding of the performance of different algorithms on a subset of problem classes is of interest (Pitzer and Affenzeller, 2012). The application of the fitness function statistics to the WDS fitness functions considered in this paper provided a number of insights into the characteristics of these functions that are generally largely unknown. For the Woronora WDS fitness function, the values of the spatial correlation measure suggested that the search space was much less correlated than that of the Cherry Hills Brushy Plains network, with Rav z 0.125 compared to Rav ¼ 0.24. The correlation in the fitness function may indicate the expected influence of the population size on the quality of the solutions found, where for the more correlated functions, the population size had a larger effect. 5.3. Benefits and further research To demonstrate the benefit of the approaches considered, a comparison to a large scale GA parametric study that was applied to the Cherry Hill-Brushy Plain fitness function (Gibbs et al., 2005) is informative. By making use of the methodology proposed in this work, only one combination of the GA parameter values was tested for each method, as opposed to the 378 used by Gibbs et al. (2005). However, the solutions found were very similar to those found by the full GA parametric study, with a fraction of the computational effort. The best solution found by Gibbs et al. (2005) added a total of 1163 g/day of chlorine to the system, and the average fitness function value over the five different GA runs was 1184 g/day, compared to a best fitness function value of 1168 g/day found in this work, with average fitness function values of 1196 g/day for the

Please cite this article in press as: Gibbs, M.S., et al., Using characteristics of the optimisation problem to determine the Genetic Algorithm population size when the number of evaluations is limited, Environmental Modelling & Software (2014), http://dx.doi.org/10.1016/ j.envsoft.2014.08.023

M.S. Gibbs et al. / Environmental Modelling & Software xxx (2014) 1e14

Drift method and 1197 g/day for the Predicted method. It should also be noted that the GA implementation is slightly different between the studies, and only five GA runs were considered in that study, compared to 13 used in this work. It was assumed in this work that the only change in population variance was due to two parent tournament selection. The selection pressure would be expected to increase with larger tournament sizes in the selection operator, and as such the relationship derived in this work is applicable only for the case of the commonly used two parent tournament selection, and when the crossover operator is designed to conform to the principle of “statistics preservation”. Future work should aim to extend the relationship developed to consider other operators, and potentially other evolutionary algorithms. Similarly, the only GA coding scheme considered in this work was real coding. Further work should also investigate if similar results are obtained for GAs with other encodings, such as binary and integer coding. The relationship between fitness function statistics and the rate of population convergence was empirically derived from a large sample of functions (9347 test functions), constructed from periodic cosine functions. The fact that the resulting relationship identified a suitable population size for the two WDS optimisation case studies considered provides some confidence in the applicability of this empirical relationship to water resources related optimisation problems. However, further work should be undertaken on a broader range of fitness functions to test and revise the relationship relating the convergence of a population due to the application of tournament selection to the characteristics of fitness functions. Multi-objective approaches to find Pareto optimal solutions have become more popular when solving water resources optimisation problems, as opposed to lumping a number of different objectives into one fitness function value (Zheng and Zecchin, 2014, for example). However, the field of fitness landscape analysis on multi-objective problems is relatively new, and there was no mention of multi-objective approaches in two recent review papers on fitness landscape analysis (Pitzer and Affenzeller, 2012; Malan and Engelbrecht, 2013). As such, methods to select, or determine parameter values for, optimisation algorithms for these problems based on fitness function statistics is expected to require substantial further research. 6. Summary In this work, a relationship between fitness function statistics and the decay of variance in a GA population due to the application of tournament selection was developed. The decay rate was used to determine how many generations of the population could be expected to occur before it converged to very similar solutions. For a given computational budget (as a number of function evaluations), the population size could then be determined. As the population size is expected to be the most important parameter controlling the performance of a GA, this relationship provided a mechanism to tailor the population size directly to the characteristics of the problem being solved. The results suggested that the fitness function statistics could be useful for the calibration of GA parameter values. This in itself is a significant finding, as generally, previous studies have concluded that fitness function statistics are a poor indicator of GA performance. However, these studies did not consider the calibration of the algorithm, and hence the convergence observed was rather arbitrary. The relationship to predict the decay in population variance, kpred, has demonstrated that the number of generations before a GA will converge is a function of the characteristics of the problem, which fitness function statistics provide information

13

about. This approach is different to that typically adopted when making use of these statistics, which attempts to relate fitness function statistic values to how closely a GA will converge to the optimum solution. However, this is not only a function of the fitness function characteristics, but also the GA parameters and the convergence criterion. Four methods to determine the GA population size were tested on two WDS optimisation problems, each with both constant and self-adaptive values for the remaining GA parameters. The results demonstrate that for the GA implementation and the two test problems considered, the proposed ‘Predicted’ method with constant parameter values located the best solutions in all cases. The Predicted method performed particularly well when a small computational budget was imposed, suggesting that this method is likely to be most applicable to situations where the optimisation process is repeated frequently within a short time frame, for example in operational settings where different starting conditions influence the management of a system. The Parameterless and Drift methods with constant parameter values also provided good parameter values for the GA. Somewhat surprisingly, the calibration methods that used self-adaptive values for the pm, pc, and c parameters produced the worst overall results. The ranking of the performance of the different calibration methods also indicated that the typical GA parameter values lead to the poorest results, reinforcing the importance of determining the most suitable GA parameter values for each fitness function optimised. In summary, the comparison of the different calibration approaches demonstrated four main results: 1) a correctly calibrated GA is very beneficial when solutions are required with a limited computational budget or in a limited time frame; 2) a self-adaptive framework for GA parameters that are applied on a solution level (i.e. probability of crossover and mutation) is less effective than constant parameter values used for the whole GA run, provided reasonable values are used; 3) the proposed methodology based on fitness function statistics can be applied in practice; and 4) the proposed methodology produced the best results out of the different GA calibration methods compared for the case studies considered. Acknowledgements The authors would like to thank the anonymous reviewers for the helpful comments on this paper. References €ck, T., Eiben, A.E., van der Vaart, N., 16-20 Sep. 2000. An empirical study on GAs Ba “without parameters”. In: Schoenauer, M., Deb, K., Rudolph, G., Yao, X., Lutton, E., Merelo, J.J., Schwefel, H.-P. (Eds.), Parallel Problem Solving from NaturedPPSN VI, Vol. 1917 of Lecture Notes in Computer Science. Springer Verlag, Paris, France, pp. 315e324. €ck, T., Schwefel, H.-P., 1993. An overview of evolutionary algorithms for paramBa eter optimisation. Evol. Comput. 1 (1), 1e23. Beyer, H.G., Deb, K., 2001. On self-adaptive features in real-parameter evolutionary algorithms. IEEE Trans. Evol. Comput. 5 (3), 250e270. Boccelli, D.L., Tryby, M.E., Uber, J.G., Rossman, L.A., Zierolf, M.L., Polycarpou, M.M., 1998. Optimal scheduling of booster disinfection in water distribution systems. J. Water Resour. Plan. Manag. ASCE 124 (2), 99e111. Boeringer, D.W., Werner, D.H., Machuga, D.W., 2005. A simultaneous parameter adaptation scheme for genetic-algorithms with application to phased array synthesis. IEEE Trans. Antennas Propag. 53 (1), 356e371. Broad, D.R., Maier, H.R., Dandy, G.C., 2010. Optimal operation of complex water distribution systems using metamodels. J. Water Resour. Plan. Manag. ASCE 136 (4), 433e443. Cantú-Paz, E., Goldberg, D.E., 2003. Are multiple runs of genetic algorithms better than one? In: Cantú-Paz, E., Foster, J.A., Deb, K., Davis, L., Roy, R., O'Reilly, U.-M., Beyer, H.-G., Standish, R.K., Kendall, G., Wilson, S.W., Harman, M., Wegener, J., Dasgupta, D., Potter, M.A., Schultz, A.C., Dowsland, K.A., Jonoska, N., Miller, J.F.

Please cite this article in press as: Gibbs, M.S., et al., Using characteristics of the optimisation problem to determine the Genetic Algorithm population size when the number of evaluations is limited, Environmental Modelling & Software (2014), http://dx.doi.org/10.1016/ j.envsoft.2014.08.023

14

M.S. Gibbs et al. / Environmental Modelling & Software xxx (2014) 1e14

(Eds.), Genetic and Evolutionary ComputationdGECCO 2003, Vol. 2723 of Lecture Notes in Computer Science. Springer, pp. 801e812. Chen, L., Kao, S.J., Traore, S., 2012. Predicting and managing reservoir total phosphorus by using modified grammatical evolution coupled with a macro-genetic algorithm. Environ. Model. Softw. 38, 89e100. Davidor, Y., 1991. Epistasis variance: a viewpoint on GA-hardness. In: Rawlins, G. (Ed.), Foundations of Genetic Algorithms. Morgan Kaufmann, pp. 23e35. Diao, K., Wang, Z., Burger, G., Chen, C., Rauch, W., Zhou, Y., 2014. Speedup of water distribution simulation by domain decomposition. Environ. Model. Softw. 52, 253e263. Falconer, D.S., 1981. Introduction to Quantitative Genetics, second ed. Longman Scientific and Technical, Harlow, Essex, UK. Gibbs, M., Dandy, G., Maier, H., Nixon, J., 2005. Selection of genetic algorithm parameters for water distribution system optimization. In: Walton, R. (Ed.), Proceedings of the World Water & Environmental Resource Congress. ASCE, Anchorage, AK, USA, pp. 1e12. Gibbs, M.S., Dandy, G.C., Maier, H.R., 2008. A genetic algorithm calibration method based on convergence due to genetic drift. Inf. Sci. 178 (14), 2857e2869. Gibbs, M.S., Dandy, G.C., Maier, H.R., 2010a. Calibration and optimization of the pumping and disinfection of a real water supply system. J. Water Resour. Plan. Manag. ASCE 136 (4), 493e501. Gibbs, M.S., Maier, H.R., Dandy, G.C., 2010b. Comparison of genetic algorithm parameter setting methods for chlorine injection optimization. J. Water Resour. Plan. Manag. ASCE 136 (2), 288e291. Gibbs, M.S., Maier, H.R., Dandy, G.C., 2011. Relationship between problem characteristics and the optimal number of genetic algorithm generations. Eng. Optim. 43 (4), 349e376. Goldberg, D.E., Deb, K., 1991. A comparative analysis of selection schemes used in genetic algorithms. In: Rawlins, G.J.E. (Ed.), Foundations of Genetic Algorithms. Morgan Kaufmann, San Francisco, CA, pp. 69e93. Grefenstette, J., 1986. Optimization of control parameters for genetic algorithms. IEEE Trans. Syst. Man. Cybern. 16 (1), 122e128. Harik, G., Cant-Paz, E., Goldberg, D.E., Miller, B.L., 1999. The gambler's ruin problem, genetic algorithms, and the sizing of populations. Evol. Comput. 7 (3), 231e253. Harik, G.R., Lobo, F.G., 13-17 1999. A parameter-less genetic algorithm. In: Banzhaf, W., Daida, J., Eiben, A.E., Garzon, M.H., Honavar, V., Jakiela, M., Smith, R.E. (Eds.), Proceedings of the Genetic and Evolutionary Computation Conference, vol. 1. Morgan Kaufmann, Orlando, Florida, USA, pp. 258e265. Herrera, F., Lozano, M., Verdegay, J.L., 1998. Tackling real-coded genetic algorithms: operators and tools for behavioural analysis. Artif. Intell. Rev. 12 (4), 265e319. Ho, Y.C., Pepyne, D.L., 2002. Simple explanation of the no-free-lunch theorem and its implications. J. Optim. Theory Appl. 115 (3), 549e570. Jones, T., Forrest, S., 1995. Fitness distance correlation as a measure of problem difficulty for genetic algorithms. In: Eshelman, L. (Ed.), Proceedings of the 6th International Conference on Genetic Algorithms. Morgan Kaufmann Publishers Inc., San Francisco, CA, pp. 184e192. Kita, H., Yamamura, M., 1999. A functional specialization hypothesis for designing genetic algorithms. In: Systems, Man, and Cybernetics, 1999. IEEE SMC '99 Conference Proceedings. 1999 IEEE International Conference on, vol. 3, pp. 579e584. Lee, S., Soak, S., Kim, K., Park, H., Jeon, M., 2008. Statistical properties analysis of real world tournament selection in genetic algorithms. Appl. Intell. 28 (2), 195e205 times Cited: 15 Lee, S. Soak, S. Kim, K. Park, H. Jeon, M. Jeon, Moongu/A-10092012 15. Lobo, F.G., 2000. The Parameter-less Genetic Algorithm: Rational and Automated Parameter Selection for Simplified Genetic Algorithm Operation (PhD thesis). Universidade Nova de Lisboa, Lisbon, Portugal. Lobo, F.G., Goldberg, D.E., 2004. The parameter-less genetic algorithm in practice. Inf. Sci. 167 (1e4), 217e232. Louis, S.J., Rawlins, G.J.E., 1993. Predicting convergence time for genetic algorithms. In: Whitley, L.D. (Ed.), Proceedings of the Second Workshop on Foundations of Genetic Algorithms. Vail, Colorado, USA, July 26-29 1992. Morgan Kaufmann, pp. 141e161. Maier, H., Kapelan, Z., Kasprzyk, J., Kollat, J., Matott, L., da Conceio Cunha, M., Dandy, G., Gibbs, M., Keedwell, E., Marchi, A., Ostfeld, A., Savic, D., Solomatine, D., Vrugt, J., Zecchin, A., Minsker, B., Barbour, E., Kang, D., Kuczera, G., Pasha, F., 2014. Evolutionary algorithms and other metaheuristics in water resources: current status, research challenges and future directions. Environ. Model. Softw. (in this issue). Malan, K.M., Engelbrecht, A.P., 2013. A survey of techniques for characterising fitness landscapes and some possible ways forward. Inf. Sci. 241, 148e163. Marchi, A., Dandy, G., Wilkins, A., Rohrlach, H., 2014. Methodology for comparing evolutionary algorithms for optimization of water distribution systems. J. Water Resour. Plan. Manag. 140 (1), 22e31. Mason, A., 1991. Partition coefficients, static deception and deceptive problems for non-binary alphabets. In: Belew, R., Booker, L. (Eds.), Proceedings of the 4th International Conference on Genetic Algorithms. Morgan Kaufmann, San Mateo, CA, pp. 210e214.

Mathias, K., Whitley, D., Pyeatt, L., 1995. Hyperplane ranking in simple genetic algorithms. In: Eshelman, L. (Ed.), Proceedings of the 6th International Conference on Genetic Algorithms. Morgan Kaufmann, pp. 231e238. May, R.J., Dandy, G.C., Maier, H.R., Nixon, J.B., 2008a. Application of partial mutual information variable selection to ann forecasting of water quality in water distribution systems. Environ. Model. Softw. 23 (10e11), 1289e1299. May, R.J., Maier, H.R., Dandy, G.C., Fernando, T.M.K.G., 2008b. Non-linear variable selection for artificial neural networks using partial mutual information. Environ. Model. Softw. 23 (10e11), 1312e1326. Merz, P., 2004. Advanced fitness landscape analysis and the performance of memetic algorithms. Evol. Comput. 12 (3), 303e325. Minsker, B., 2005. Genetic algorithms. In: Kumar, P., Alameda, J., Bajcsy, P., Folk, M., Markus, M. (Eds.), Hydroinformatics: Data Integrative Approaches in Computation, Analysis and Modeling. CRC Press, Florida, USA, pp. 439e456. Munavalli, G.R., Kumar, M.S.M., 2003. Optimal scheduling of multiple chlorine sources in water distribution systems. J. Water Resour. Plan. Manag. ASCE 129 (6), 493e504. Naudts, B., Kallel, L., 2000. A comparison of predictive measures of problem difficulty in evolutionary algorithms. IEEE Trans. Evol. Comput. 4 (1), 1e15. Oladele, R.O., Sadiku, J.S., May 2013. Genetic algorithm performance with different selection methods in solving multi-objective network design problem. Int. J. Comput. Appl. 70 (12), 5e9 published by Foundation of Computer Science, New York, USA. Pitzer, E., Affenzeller, M., 2012. A comprehensive survey on fitness landscape analysis. In: Fodor, J., Klempous, R., Araujo, C.P.S. (Eds.), Recent Advances in Intelligent Engineering Systems, Vol. 378 of Studies in Computational Intelligence. Springer-Verlag, Berlin, pp. 161e191. Propato, M., Uber, J.G., 2004. Linear least-squares formulation for operation of booster disinfection systems. J. Water Resour. Plan. Manag. ASCE 130 (1), 53e62. Radcliffe, N., Surry, P., 1995. Fitness variance of formae and performance prediction. In: Whitley, L., Vose, M.D. (Eds.), Foundations of Genetic Algorithms 3. Morgan Kaufmann, pp. 51e72. Razavi, S., Tolson, B.A., Burn, D.H., 2012. Numerical assessment of metamodelling strategies in computationally intensive optimization. Environ. Model. Softw. 34, 67e86. Reed, P., Yamaguchi, S., 2004. Simplifying the parameterization of real-coded evolutionary algorithms. In: World Water and Environmental Resources Congress. EWRI/ASCE, Salt Lake City, Utah, USA, pp. 1e9. Reeves, C.R., Wright, C., 1995. An experimental design perspective on genetic algorithms. In: Whitley, D., Vose, M. (Eds.), Foundations of Genetic Algorithms, vol. 3. Morgan Kaufmann, San Francisco, CA, pp. 7e22. Rogers, A., Prugel-Bennett, A., November 1999. Genetic drift in genetic algorithm selection schemes. IEEE Trans. Evol. Comput. 3 (4), 298e303. Salomon, R., 1996. Re-evaluating genetic algorithm performance under coordinate rotation of benchmark functions: a survey of some theoretical and practical aspects of genetic algorithms. BioSystems 39, 263e278. Schaake, J., Lai, D., 1969. Linear Programming and Dynamic Programming Application to Water Distribution Network Design. M.I.T. report. Mit, Hydrodynamics Laboratory. Sharma, A., 2000. Seasonal to interannual rainfall probabilistic forecasts for improved water supply management: part 1-a strategy for system predictor identification. J. Hydrol. 239 (1e4), 232e239. Smith, T., Husbands, P., Layzell, P., O'Shea, M., 2002. Fitness landscapes and evolvability. Evol. Comput. 10 (1), 1e34. Someya, H., Dec 2013. Striking a mean- and parent-centric balance in real-valued crossover operators. Evol. Comput. IEEE Trans. 17 (6), 737e754. Suzuki, J., 1995. A Markov-chain analysis on simple genetic algorithms. IEEE Trans. Syst. Man. Cybern. 25 (4), 655e659. Thierens, D., Goldberg, D.E., 1994. Convergence models of genetic algorithm selection schemes. In: Davidor, Y., Schwefel, H., M€ anner, R. (Eds.), Parallel Problem Solving from NaturedPPSN III, Vol. 866 of Lecture Notes in Computer Science. Springer-Verlag, London, UK, pp. 119e129. Thierens, D., Goldberg, D.E., Pereira, A., 1998. Domino convergence, drift and the temporalsalience structure of problems. In: IEEE International Conference on Evolutionary Computation. IEEE Press, New York, NY, pp. 535e540. Weinberger, E., 1990. Correlated and uncorrelated fitness landscapes and how to tell the difference. Biol. Cybern. 63 (5), 325e336. Wolpert, D.H., Macready, W.G., April 1997. No free lunch theorems for optimization. IEEE Trans. Evol. Comput. 1 (1), 67e82. Zheng, F.F., Zecchin, A., 2014. An efficient decomposition and dual-stage multiobjective optimization method for water distribution systems with multiple supply sources. Environ. Model. Softw. 55, 143e155. Zhong, J., Hu, X., Gu, M., Zhang, J., Nov 2005. Comparison of performance between different selection strategies on simple genetic algorithms. In: Computational Intelligence for Modelling, Control and Automation, 2005 and International Conference on Intelligent Agents, Web Technologies and Internet Commerce, International Conference on, vol. 2, pp. 1115e1121.

Please cite this article in press as: Gibbs, M.S., et al., Using characteristics of the optimisation problem to determine the Genetic Algorithm population size when the number of evaluations is limited, Environmental Modelling & Software (2014), http://dx.doi.org/10.1016/ j.envsoft.2014.08.023