Computers chem. Engng, Vol. 20, Suppl., pp. S79-S84, 1996
Pergamon
S0098-1354(96)00024-5
Copyright © 1996 Elsevier Science Ltd. Printed in Great Britain. All rights reserved. 0098-1354/96 $15.00 + 0.00
SYNTHESIS AND OPTIMIZATION OF A NONIDEAL DISTILLATION SYSTEM USING A PARALLEL GENETIC ALGORITHM
ERIC S. FRAGA & TERESA R. SENOS MATIAS
Department of Chemical Engineering, University of Edinburgh, Edinburgh, United Kingdom EH9 3JL

Abstract - The automated synthesis of separation sequences for nonideal mixtures has only recently been attempted. The need for rigorous physical property estimation procedures, along with the possibly complex recycle structures, leads to the specification of a nonlinear programming problem, one in which both the objective function and the feasibility region are nonconvex. This paper describes the use of a stochastic optimization procedure, genetic algorithms, for the optimization of a preselected sequence of distillation units for the separation of a three component azeotropic mixture. By fixing the structure in terms of the units desired, the optimization problem is reduced to one of designing each of the relevant units, choosing the appropriate operating conditions, and finding the optimal heat exchange network. To tackle the high computational effort required, the implementation uses a distributed memory multicomputer in the form of a network of workstations.
INTRODUCTION

The synthesis of distillation sequences for the separation of nonideal mixtures is a computationally difficult task. For some time, design procedures based on geometric and/or heuristic principles have been available (Julka & Doherty, 1990; Wahnschafft et al., 1991; Malone & Doherty, 1995). Recently, however, algorithmic techniques have been introduced which can synthesize realistic processes (Senos Matias et al., 1995). These techniques make use of rigorous physical property estimation methods, allowing studies over ranges of operating conditions for each of the distillation units considered and enabling the synthesis of heat integrated processes. Although the method described by Senos Matias et al. uses short-cut iterative procedures, the computational effort is still substantial. Any further reduction in the time required is therefore desirable, especially in the context of an interactive design environment.

The nonideal distillation model described by Senos Matias et al. was incorporated into the CHIPS process synthesis package (Fraga & McKinnon, 1994). CHIPS is based on the use of discretization to convert an MINLP problem into a discrete graph search problem with nonlinear cost functions and discrete utilities, allowing the simultaneous synthesis of the heat exchange network. As the level of discretization is user specified, the high computational expense discourages the user from requesting fine discretizations. The first step in reducing the time requirements has already been taken: a fully rigorous method has been replaced, with no reduction in quality, by a short-cut method. This paper describes the next step: the use of an alternative constrained global optimization procedure using parallel computing techniques.
We present a genetic algorithm which uses the rigorous short-cut method for the design of each distillation unit, with a network of workstations and the PVM message passing library (Geist et al., 1993) as the parallel computing resource.
THE PROBLEM

The problem of synthesizing process flowsheets can be decomposed into three interrelated subproblems:

1. determining the sequence of units and their interconnections,
2. designing each unit in the sequence, while choosing suitable operating conditions (e.g. the operating pressure for a distillation column), and
3. designing the heat exchange network (HEN).

As the types of problem we are interested in solving involve very few different base structures (i.e. choices of processing units), this paper assumes that the choice of structure has already been made. It is a simple task to iterate over all possible sequences of units, a procedure suggested by Smith & Pantelides (1995) for synthesis in general. We are therefore more interested in the simultaneous choice of the HEN and the unit design parameters. Furthermore, methods for the synthesis of heat exchange networks have been extensively reported in the literature (Nishida et al., 1981), and we have decided to use the pinch method as part of the objective function.

As the short-cut procedure for the design of a column requires only the unit operating conditions as input, the problem has been reduced to one of minimizing a function with respect to the set of operating conditions for the units in the structure. The objective function itself is a design procedure whose evaluation determines the unit design parameters (reflux rate and number of stages) and the heat exchange network, along with the full annualized operating cost.
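The shape of this reduced problem can be pictured as follows. This is an illustrative sketch only, not the authors' code: `design_column` and `pinch_hen` are hypothetical stand-ins for the short-cut column design procedure and the pinch-method HEN costing, and the penalty constant is assumed.

```python
INFEASIBLE_COST = 1.0e12  # assumed: a very large cost returned for infeasible points

def evaluate_flowsheet(operating_conditions, design_column, pinch_hen):
    """Objective function for a fixed structure: design each column at
    the given operating conditions with the short-cut procedure, then
    synthesize the heat exchange network and return the full annualized
    cost.  An infeasible operating point (e.g. one that violates the
    azeotropic constraints) simply yields a very large cost."""
    designs = []
    for conditions in operating_conditions:
        design = design_column(conditions)   # short-cut column design
        if design is None:                   # infeasible operating point
            return INFEASIBLE_COST
        designs.append(design)
    return pinch_hen(designs)                # annualized cost after HEN synthesis
```

The optimizer therefore only ever sees a black-box cost function of the operating conditions; all design detail is internal to the evaluation.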
European Symposium on Computer Aided Process Engineering--6. Part A
FIGURE 1. Sequential version of a genetic algorithm.

procedure GeneticAlgorithm
    create initial population
    evaluate and sort population
    iterate:
        choose pair of members from population (using weighted random index)
        create two new children via crossover (crossover is defined by a mask)
        for each child do
            randomly mutate
            evaluate child and insert into population
        end for
        if population exceeds some maximum threshold then
            randomly prune to reduce population to np
        end if
    end iteration
end procedure
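The procedure of Figure 1 can be rendered as a short Python sketch. This is illustrative only, not the authors' implementation: the rank-based parent weighting is an assumption (the figure leaves the weighting unspecified), and pruning here is deterministic truncation rather than the weighted random pruning the paper describes.

```python
import random

def crossover(a, b):
    """Mask-defined crossover: a random bit mask chooses, gene by gene,
    which parent each child inherits from."""
    mask = [random.random() < 0.5 for _ in a]
    child1 = [x if m else y for m, x, y in zip(mask, a, b)]
    child2 = [y if m else x for m, x, y in zip(mask, a, b)]
    return child1, child2

def run_ga(objective, bounds, np_=30, ni=100, rm=0.1):
    """Steady-state GA of Figure 1: one mating per iteration, children
    inserted immediately, population pruned back to np_ once it reaches
    twice the initial size (the threshold used in this paper)."""
    pop = [[random.uniform(lo, hi) for lo, hi in bounds] for _ in range(np_)]
    costs = [objective(p) for p in pop]
    for _ in range(ni):
        # rank-based weights: lower cost (fitter) -> larger weight (assumed)
        ranked = sorted(range(len(pop)), key=lambda i: costs[i])
        weights = [0] * len(pop)
        for rank, i in enumerate(ranked):
            weights[i] = len(ranked) - rank
        i, j = random.choices(range(len(pop)), weights=weights, k=2)
        for child in crossover(pop[i], pop[j]):
            if random.random() < rm:                 # random mutation of one gene
                g = random.randrange(len(child))
                child[g] = random.uniform(*bounds[g])
            pop.append(child)
            costs.append(objective(child))
        if len(pop) >= 2 * np_:                      # prune back to np_
            keep = sorted(range(len(pop)), key=lambda k: costs[k])[:np_]
            pop = [pop[k] for k in keep]
            costs = [costs[k] for k in keep]
    best = min(range(len(pop)), key=lambda k: costs[k])
    return pop[best], costs[best]
```

A chromosome is simply a list of real-valued genes bounded by `bounds`, matching the encoding discussed below.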
One feature of this particular form of objective function is that the region of feasibility is hard to compute. It is in fact nonconvex, due both to the use of discrete utilities and to the constraints imposed by the azeotropic nature of the mixtures involved. Methods that rely on a priori knowledge of the region of feasibility will be difficult to use. In our case, we assume that the objective function returns a very large cost when an infeasible point is encountered.

GENETIC ALGORITHMS

Genetic algorithms are a form of stochastic optimization suitable for a wide range of problems; they are particularly intended for nonconvex problems where the possibility of being trapped in a local optimum is high. Recently, Stair & Fraga (1995) applied a genetic algorithm to the problem of optimizing a distillation sequence. The problem considered in that paper, however, was ideal, and one in which the region of feasibility was convex.

The genetic algorithm used by Stair & Fraga was based on the classical approach. A population of solutions (known as chromosomes) is created, each of which is assigned a fitness value based on the cost estimate of the process it represents. The procedure then iterates for a predefined number of generations. In each generation, a number of members of the population are chosen randomly to mate, with the random choice weighted towards the most fit members. Mating produces possibly new solutions, which may be randomly mutated (one of the genes that makes up the chromosome is changed at random). These solutions are then inserted into the population, deleting some previously existing ones. Deletion is likewise done with a weighted random function, choosing predominantly the least fit members of the population.
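The weighted random choice used both for mating (favouring the most fit) and for deletion (favouring the least fit) can be sketched as below. The rank-based weighting is an assumption for illustration; the exact weighting function is not specified in the text.

```python
import random

def weighted_pick(costs, favour_fit=True):
    """Return the index of one population member, chosen with probability
    weighted towards the most fit (lowest-cost) members when favour_fit
    is True, or towards the least fit when False (as used for deletion).
    A simple rank-based weighting is assumed here."""
    order = sorted(range(len(costs)), key=lambda i: costs[i])
    if not favour_fit:
        order.reverse()
    # the first entry of `order` gets the largest weight
    weights = range(len(order), 0, -1)
    return random.choices(order, weights=list(weights), k=1)[0]
```

Because the weighting is merely a bias, every member retains some chance of being selected, which preserves diversity in the population.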
The version of the genetic algorithm used in this paper is subtly different: in each iteration (generation) we choose only one pair of members to mate, immediately inserting the new solutions into the population. The population grows from iteration to iteration until it reaches some maximum (chosen to be 2 times the initial population size for the problems attempted in this paper). When that maximum is reached, the population is pruned back to the initial size. Again, the choice of members to mate and of members to kill is based on a weighted random function of the fitness value. The parameters for this method are the initial population size, np, the number of iterations to perform, ni, and the mutation rate, rm. The method is shown in algorithmic form in Figure 1.

The crucial aspect which determines the success of a genetic algorithm in solving a particular problem is the choice of encoding. A poor encoding will lead to a less than satisfactory cover of the search space, leading at best to a local optimum of the objective function. In the problem we are interested in, the mapping is intuitive: each design variable of interest is mapped to a gene in the chromosome, where a chromosome completely specifies a point in the search space. For a two column process, a chromosome will be described by, for example, the operating pressure and the reflux rate factor of each column, resulting in a four gene chromosome (or a 4-dimensional search space).

Parallel Implementation
The motivation for the parallel implementation of the genetic algorithm method is the high cost of the objective function evaluation. Due to the nature of genetic algorithms, each function evaluation is independent of any other and hence can be performed on separate processors. The approach is to farm out function evaluations as they are encountered. The main process waits only when all the processors in the task farm are busy.
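The task-farm pattern can be sketched as follows. PVM itself is not shown; a Python worker pool stands in for the farm of workstation processes purely to illustrate the structure, and the pool-based blocking behaviour is an analogy, not the authors' implementation.

```python
from concurrent.futures import ThreadPoolExecutor  # stand-in for the PVM task farm

def evaluate_farm(objective, candidates, n_workers=5):
    """Farm independent objective evaluations out to a pool of workers.
    Each evaluation is independent of the others, so they can proceed in
    parallel; the master, in effect, only waits when all workers in the
    farm are busy."""
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        return list(pool.map(objective, candidates))
```

With five workstations, up to five fitness evaluations are in flight at once, which is the source of the near-fivefold speedup reported below.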
FIGURE 2. Parallel version of the genetic algorithm.

procedure ParallelGeneticAlgorithm
    initialize population
    evaluate population in parallel
    iterate:
        choose pair from population
        crossover
        for each child do
            mutate randomly
            if all processors busy then
                wait
                insert new member into population
            end if
            assign child to parallel processor
        end for
        if population exceeds some maximum threshold then
            randomly prune to reduce population to np
        end if
    end iteration
    wait for any active processors to finish
end procedure
The parallel algorithm, shown in Figure 2, differs from the initial sequential version. The parallel version reintroduces the concept of a time lag, or gestation period, albeit a gestation that has no effect on the parents involved, as they can proceed to breed immediately. Breeding continues amongst the population while some processors are still computing the fitness of new members. The effect is that the choice of parents is more restricted at each step, which may in theory result in slower convergence.

This algorithm has been implemented using the PVM (Geist et al., 1993) message passing system. This enables us to use a set of networked Unix workstations as a parallel computing resource. Previous use of PVM with workstation clusters by one of the authors has shown that a high level of efficiency is possible, even under adverse load conditions (Fraga & McKinnon, 1995). The granularity of the proposed parallel version of the genetic algorithm is low enough that the same efficiency should be observed. The computing resource used for the results described in the next section is a set of 5 Sun SPARCstation 5 workstations running SunOS 4.1.3.

RESULTS

Distributed computing can be applied in two fundamentally different ways: it can be used to solve a given problem in less time, or it can be used to generate a better solution to the problem (Garcia & Herman, in press). Given the stochastic nature of genetic algorithms, and the difficulty of finding a guaranteed global optimum for the problem of azeotropic distillation, either approach is of interest, and we present results for both in this paper. In all cases, results were generated by performing 10 runs for each experiment, reporting the best, worst, and average results (with the standard deviation shown as error bars in the figures).
The random number generator (the C library random function in SunOS 4.1.3) is initialized by a variable number of calls (based on the last few digits of the current time when the program is started) before each experiment. To simplify the analysis, the population size is chosen to be an exact multiple of the number of processors used. The parameters for the sequential genetic algorithm were chosen somewhat arbitrarily: a mutation rate of 0.1, an initial population size of 30, and termination after 100 iterations. The parameters for the parallel experiments are either the same as those of the sequential version or a linear function of them.

Figure 4 shows the cost profile as a function of the number of iterations for five different approaches to the problem. At each iteration, the cost of the best solution in the population (which may vary from iteration to iteration) is used. From top to bottom, the first graph shows the sequential case; the second, the parallel case with the same parameters; the third, the parallel case with a population 5 times as large (an increase in line with the number of processors used); the fourth, the parallel case with 5 times as many iterations; and, finally, a parallel case with 5 times the population and 5 times as many iterations. CPU times (in wall clock seconds on a Sun SPARCstation 5 running SunOS 4.1.3) are reported in Table 1.

TABLE 1. Execution times for sequential and parallel versions.

Problem       np    ni    CPU time (s)
Sequential    30    100   6736
Parallel      30    100   1401
Parallel      150   100   2140
Parallel      30    500   5770
Parallel      150   500   6754

The best structure found is shown in Figure 3. The values of the design variables for the best solution found are given in Table 2. Also shown in the table is the cost of the best structure, a saving of approximately 30 per cent from the initial heat integrated structure.

FIGURE 3. Heat integrated structure for splitting ternary mixture of Acetone, Chloroform, and Benzene (flowsheet: Feed to columns F1 and F2, producing Acetone, Chloroform, and Benzene product streams).

We have used the cost correlations described by Rathore et al. (1974). The solution shows that the condenser duty of the second column is met entirely from the heat requirements of the first unit's reboiler. A little extra heating is required for the first column. A plant life of 10 years enables a significant change in capital requirements to lead to a lower annualized cost. Different cost correlations will, of course, lead to different results. However, the optimization procedure itself is not affected by the choice of cost correlations, due to the stochastic nature of genetic algorithms.

DISCUSSION

The results clearly show the stochastic nature of genetic algorithms. The final solutions found in the ten attempts at each problem show a spread of up to 10%. In the sequential case, we see that the spread in solutions stays constant after about 40 iterations. The first parallel case exhibits similar behaviour, showing that the difference in the underlying algorithm (due to the gestation period) does not greatly affect the performance. The stochastic nature of genetic algorithms is, however, apparent from the fact that the initial population is better (on average) in this case than in the sequential case. The computer times show that the parallel version is a little less than five times faster than the sequential case.

The remaining cases confirm that the best solutions found in both the sequential and initial parallel cases are reasonable. By increasing the number of iterations we see that we cannot do significantly better; increasing the population size simply ensures a better initial population and hence we reach the global optimum solution in fewer
iterations. Finally, by combining the increase in the number of iterations with the increase in population size, we can solve the problem more convincingly, with 5 processors, in the time required to solve it sequentially.

TABLE 2. Design parameters for two column configuration, as described by Senos Matias et al. (1995).

Design Parameter            Column 1    Column 2
Stages                      24          62
Reflux ratio                5.3         11.8
Pressure (atm)              0.66        2.1
Reboiler duty (GJ/hr)       7.9         9.5
Reboiler temperature (K)    337         378
Condenser duty (GJ/hr)      6.2         7.8
Condenser temperature (K)   317         358
Cost ($/yr)                 82,451 (total)

NOTATION

ni  Number of iterations for genetic algorithm.
np  Size of population.
rm  Mutation rate (0 <= rm <= 1).
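The speedup implied by the timings in Table 1 (sequential versus parallel, with identical parameters, on 5 workstations) can be checked directly; this small calculation simply reproduces the "little less than five times faster" observation.

```python
# Timings from Table 1: np = 30, ni = 100, sequential vs parallel (5 processors).
sequential_s = 6736.0
parallel_s = 1401.0

speedup = sequential_s / parallel_s      # ~4.8x on 5 processors
efficiency = speedup / 5                 # ~96% parallel efficiency
print(round(speedup, 2), round(efficiency, 2))  # 4.81 0.96
```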
REFERENCES

Fraga, E S & McKinnon, K I M, 1994, CHIPS: A process synthesis package. Chem. Engng. Res. & Des., 72(A3), 389-394.
Fraga, E S & McKinnon, K I M, 1995, A portable code for process synthesis using workstation clusters and distributed memory multicomputers. Comp. Chem. Engng., 19(6/7), 759-773.
Garcia, I & Herman, G T, in press, Global optimization by parallel constrained biased random search. In: State of the Art in Global Optimization: Computational Methods and Applications, C A Floudas & P M Pardalos, Editors.
Geist, A, Beguelin, A, Dongarra, J, Jiang, W, Manchek, R, & Sunderam, V, 1993, PVM 3.0 user's guide and reference manual. Oak Ridge National Laboratory Report TM-12187.
Julka, V & Doherty, M F, 1990, Geometric behaviour and minimum flows for nonideal multicomponent distillation. Chem. Engng. Sci., 45(7), 1801-1822.
Malone, M F & Doherty, M F, 1995, Separation system synthesis for nonideal liquid mixtures. AIChE Symposium Series No. 304, 91, 9-18.
Nishida, N, Stephanopoulos, G, & Westerberg, A W, 1981, A review of process synthesis. AIChE J., 27(3), 321-351.
Rathore, R N S, van Wormer, K A, & Powers, G J, 1974, Synthesis of distillation systems with energy integration. AIChE J., 20, 940-950.
Senos Matias, T R, Fraga, E S, & Ponton, J W, 1995, Nonideal distillation in automated synthesis. Comp. Chem. Engng., 19, S57-S62.
Smith, E M B & Pantelides, C C, 1995, Design of reaction/separation networks using rigorous models. Comp. Chem. Engng., 19, S83-S88.
Stair, C & Fraga, E S, 1995, Optimization of unit operating conditions for heat integrated processes using genetic algorithms. Proceedings of the 1995 I.Chem.E. Research Event, Institution of Chemical Engineers, Rugby, United Kingdom, 95-97.
Wahnschafft, O M, Jurian, T P, & Westerberg, A W, 1991, SPLIT: A separation process designer. Comp. Chem. Engng., 15, 565-581.
FIGURE 4. Cost evolution as a function of the number of iterations (average, best, and worst costs over the 10 runs, with error bars, for each of the five cases described in the text).