Acta Materialia 52 (2004) 299–305 www.actamat-journals.com

Applications of neural networks and genetic algorithms to CVI processes in carbon/carbon composites

Li Aijun, Li Hejun *, Li Kezhi, Gu Zhengbing

Super High Temperature Composites Key Laboratory, Northwestern Polytechnical University, Xi'an 710072, PR China

Received 17 June 2003; received in revised form 14 September 2003; accepted 17 September 2003

Abstract

A model of artificial neural networks and genetic algorithms is developed for the analysis and prediction of the correlation between CVI processing parameters and physical properties in carbon/carbon (C/C) composites. The input parameters of the artificial neural network (ANN) are the infiltration temperature, the pressure in the furnace, the volume ratio of propylene, and the fiber volume fraction. The outputs of the ANN model are the two most important physical properties, namely the density and density distribution of workpieces. After the ANN model based on BP algorithms is trained successfully, genetic algorithms (GAs) are used to optimize the input parameters of the model and select the best combinations of CVI processing parameters. A good generalization performance of the model is achieved. Moreover, some explanations of the predicted results from physical and chemical viewpoints are given. A graphical user interface is also developed for the integrated model.

© 2003 Acta Materialia Inc. Published by Elsevier Ltd. All rights reserved.

Keywords: Carbon/carbon composites; Artificial neural network; Genetic algorithms; CVI processing parameters; Graphical user interface

1. Introduction

High-performance carbon/carbon (C/C) composites are mainly processed by CVI techniques [1,2]. A variety of CVI techniques have been developed over the last few decades. However, the relationship between CVI processing factors and the physical properties of C/C composites has so far been studied empirically by trial-and-error methods, which is both costly and time consuming. For this reason, researchers aim to conduct fewer experiments while still reaching the desired goal. With the development of computer science, especially artificial intelligence technologies, some progress has been made in the fields of material processing and design [3–6]. Artificial neural networks (ANNs), which possess massive parallel computing capability and can map input/output relationships, have attracted much attention in research on material processing [7]. On the other hand,

Corresponding author. Tel.: +86-298-495-004; fax: +86-298-495240. E-mail address: [email protected] (L. Hejun).

ANNs are suitable for simulating correlations that are hard to describe by physical models, because they work as a 'black box'. In this paper, four parameters of the CVI process, namely the infiltration temperature, the pressure in the furnace, the volume ratio of propylene, and the fiber volume fraction, are selected for study, and the goal is to map the correlation between these parameters and the physical properties of carbon/carbon composites. In addition, genetic algorithms (GAs) [8], which perform well for optimization in complex nonlinear systems, are introduced to optimize the CVI processing factors of C/C composites on the basis of the ANN model.

2. Collecting samples

Although ANNs have been studied for several decades, how to select appropriate samples is still a problem. In order to obtain enough information, a full-factorial design method is adopted to collect the sample datasets. The first processing factor, the infiltration temperature, is covered with four levels, and the last three factors with three levels each. Following this method, there are

1359-6454/$30.00 © 2003 Acta Materialia Inc. Published by Elsevier Ltd. All rights reserved. doi:10.1016/j.actamat.2003.09.020


Table 1
The factor settings (including 81 full-factorial design samples) for training data

Level   Temperature (°C)   Pressure (atm)   Volume ratio (C3H6/(C3H6 + N2))   Fiber volume fraction
1       825                0.025            0.35                              0.45
2       925                0.035            0.45                              0.55
3       1025               0.045            0.55                              0.65
4       1125               –                –                                 –

Fig. 1. A neuron with a sigmoidal transfer function.

Table 2
The factor settings (including 16 full-factorial design samples) for testing data

Level   Temperature (°C)   Pressure (atm)   Volume ratio (C3H6/(C3H6 + N2))   Fiber volume fraction
1       875                0.03             0.40                              0.50
2       975                0.04             0.50                              0.60
3       1075               –                –                                 –

a total of 4 × 3³ training samples. In addition, a total of 3 × 2³ testing samples are collected at different factor levels in order to verify the generalization capability of the ANN model. The selected preform is a disk whose inner radius, outer radius, and height are 15, 25, and 30 mm, respectively. The adopted process is isothermal CVI (ICVI), because of its simplicity and practicality, and the densification time is set to 100 h. All sample datasets indicated in Tables 1 and 2 are obtained by the method described in the literature [6]. Tables 1 and 2 only show how the samples are taken and contain no actual figures, because there are more than 100 samples. In this paper, density uniformity is introduced to quantify the density distribution:

ρu = min(ρ) / max(ρ),   (1)

where min(ρ) and max(ρ) are the minimum and maximum densities of the workpieces, respectively.
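Eq. (1) amounts to a one-line computation; the following sketch shows it, with illustrative density values that are not data from the paper:

```python
import numpy as np

def density_uniformity(rho):
    """Density uniformity of Eq. (1): ratio of the minimum to the
    maximum density measured over a workpiece."""
    rho = np.asarray(rho, dtype=float)
    return float(rho.min() / rho.max())

# Illustrative point densities (g/cm^3) of one workpiece
samples = [1.45, 1.52, 1.58, 1.49]
print(round(density_uniformity(samples), 3))  # -> 0.918
```

A value of 1.0 means a perfectly uniform workpiece; values well below 1.0 indicate strong density gradients.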

Fig. 2. The four-layered feed-forward neural network (s1 = 4 and s2 = 10).

3. Modeling using a backpropagation neural network

3.1. Topology of a backpropagation neural network

A backpropagation (BP) neural network is usually a feed-forward, multi-layered perceptron (MLP) with a number of hidden layers, conventionally trained with gradient-descent techniques based on the error-correction learning rule. An example of a neuron with a sigmoidal transfer function is shown in Fig. 1; this simple processing unit is known as an elementary perceptron. In order to model the relationship between CVI processing factors and physical properties in C/C composites, a four-layered feed-forward neural network is presented in this paper, as shown in Fig. 2. The four input elementary perceptrons Xi (i = 1, 2, 3, 4) represent the values of the four factors, and Y1 and Y2 are the outputs of the ANN. Here, wi (i = 1, 2, 3) are the connection weight matrices; bi (i = 1, 2, 3) are the bias matrices of the neurons; H1 and H2 are the output matrices of the hidden layers; and f is the transfer function (Fig. 1), for which the log-sigmoid function is selected in this paper:

f(x) = 1 / (1 + e^(−x)).   (2)

3.2. The framework of the improved BP algorithm

Though the conventional training algorithm, steepest descent, is simple and easy to understand, it is often too slow for practical problems. In this paper the Levenberg–Marquardt algorithm, a standard numerical optimization technique, is adopted. Like the quasi-Newton methods, this algorithm is designed to approach second-order training speed without having to compute the Hessian matrix. A simple outline of the ANN training used in this study is given as follows:

Step 1. Design the hidden layers and regularize the desired output values Y into the range 0.2–0.8, because the transfer function is log-sigmoid; to ensure stability of the algorithm, regularize the


value of the input matrix X also into the range 0.2–0.8, as follows:

X' = 0.2 + 0.6 (X − min(X)) / (max(X) − min(X)).   (3)
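As an illustration, Step 1's scaling of Eq. (3) and the log-sigmoid of Eq. (2) that motivates the 0.2–0.8 range can be sketched as follows, using the Table 1 temperature levels as example input:

```python
import numpy as np

def logsig(x):
    """Log-sigmoid transfer function of Eq. (2): f(x) = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + np.exp(-x))

def regularize(X):
    """Scale each column of X linearly into [0.2, 0.8], per Eq. (3),
    so that targets stay well inside the log-sigmoid's output range."""
    X = np.asarray(X, dtype=float)
    lo, hi = X.min(axis=0), X.max(axis=0)
    return 0.2 + 0.6 * (X - lo) / (hi - lo)

T = np.array([[825.0], [925.0], [1025.0], [1125.0]])  # Table 1 temperatures
print(regularize(T).ravel())  # 825 -> 0.2, ..., 1125 -> 0.8
```

Keeping the regularized values away from 0 and 1 avoids the saturated, near-zero-gradient tails of the sigmoid during training.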

Step 2. Initialize the weight and bias matrices by taking a group of random numbers within (−1, 1) as the initial values of wi and bi (i = 1, 2, 3); these values should form a suitable starting point for training, i.e., they should ensure that the output of the transfer function is not saturated.

Step 3. Compute the outputs of all neurons layer by layer, beginning with the first hidden layer as shown in Fig. 2.

Step 4. Measure performance according to the sum of squared errors (SSE):

e^i = [Y1^i − T1^i, Y2^i − T2^i]^T,  i = 1, 2, ..., P,   (4)

E = Σ_{i=1}^{P} (e^i)^T e^i,   (5)

where Tj^i (j = 1, 2) is the desired value of the j-th output of the model and P represents the total number of patterns.

Step 5. Continue until the given stopping criteria are satisfied.

Step 6. Compute J through the standard backpropagation technique; J is the Jacobian matrix containing the first derivatives of the network errors with respect to the weights (including the biases):

J_i = ∂e / ∂w_i,  i = 1, 2, 3.   (6)

Step 7. Update the weight matrices (including the biases) w1, w2 and w3:

w_i^(k+1) = w_i^k − (J_i^T J_i + μI)^(−1) J_i^T e,   (7)

where J_i^T is the transpose of J_i; I is the identity matrix with the same dimensions as J_i^T J_i; and μ is an adjustable multiplier: as μ decreases towards zero, Eq. (7) approaches Newton's method, whereas for large μ it becomes gradient descent with a small step size. Newton's method is faster and more accurate near an error minimum, so the aim is to shift towards Newton's method as quickly as possible. Thus, μ is decreased after each successful step (a reduction in the performance function) and is increased only when a tentative step would increase the performance function.

Step 8. Go to Step 3 and repeat Steps 3–7.

The primary question for an ANN designer is how to obtain an optimal topology of the ANN [10], although the optimal topology depends on the specific sample data. The most common approach is still the trial-and-error method. In order to save training time, an improved strategy based on "early stopping with validation" is attempted in this paper. In addition, an optimal ANN with four neurons in the first hidden layer and 10 neurons in the second hidden layer is found from a large number of possible topologies with the help of the GUI system described in Section 6. After 531 iterations the SSE of the system is less than 0.01. Fig. 3 shows that the predicted outputs of the 24 testing

Fig. 3. Generalization performance of the ANN model.


Table 3
Comparison between modeling and experimental values

               Infiltration time (h)   Final bulk density (g/cm³)
Model          100                     1.26
Experiment     110                     1.28
Deviation (%)  10                      1.57

samples (Table 2) compare quite favorably with the desired output values, which indicates that the ANN model is sufficiently accurate and generalizes well.

3.3. Comparison between modeling and experimental results

To test the precision of the ANN model, a comparison is made between modeling and experimental results, shown in Table 3. The deviation of the final bulk density is minor, with an error of less than 10%. It is clear that the ANN modeling results are in good agreement with the experimental values.
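The training loop of Section 3.2 (the SSE of Eqs. (4) and (5) and the Levenberg–Marquardt update of Eq. (7)) can be sketched on a toy one-parameter problem; the data and the fixed μ below are illustrative and are not the paper's network or μ schedule:

```python
import numpy as np

def sse(Y, T):
    """Sum of squared errors over all patterns, Eqs. (4)-(5)."""
    e = np.asarray(Y, float) - np.asarray(T, float)
    return float(np.sum(e * e))

def lm_step(w, J, e, mu):
    """One Levenberg-Marquardt update, Eq. (7):
    w_new = w - (J^T J + mu*I)^(-1) J^T e."""
    JtJ = J.T @ J
    return w - np.linalg.solve(JtJ + mu * np.eye(JtJ.shape[0]), J.T @ e)

# Toy problem: fit y = w*x to data whose exact solution is w = 2
x = np.array([1.0, 2.0, 3.0])
t = np.array([2.0, 4.0, 6.0])
w = np.array([0.0])
for _ in range(20):
    e = w[0] * x - t            # residual vector
    J = x.reshape(-1, 1)        # Jacobian d(e)/dw
    w = lm_step(w, J, e, mu=0.01)
print(round(float(w[0]), 4), sse(w[0] * x, t) < 1e-10)  # -> 2.0 True
```

In a real implementation μ would be adapted per step as the paper describes: decreased after a successful step and increased after a rejected one.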

4. General discussion

After the ANN is trained successfully, it can be used to predict the correlation between CVI processing factors and physical properties in C/C composites. In the following sections, three-dimensional graphs are drawn whose independent variables are pairs selected from the four input parameters and whose dependent variable is the density or density uniformity of the workpieces.

4.1. Coupling effect of volume ratio (C3H6 and N2) and temperature on densification

Assuming the furnace pressure and fiber volume fraction are held at fixed values, Fig. 4 can be drawn.

It shows that as the temperature increases, the uniformity of densification decreases, especially at high temperatures, and the densification of workpieces infiltrated at higher temperatures is more sensitive to the volume ratio of propylene than at lower temperatures. Figs. 5(a) and (b) indicate that temperature has a significant impact on densification, presumably because temperature strongly affects the pyrolysis reactions of propylene (C3H6). With the assumption of first-order reactions, the rate constant ks of the gas-phase pyrolysis reaction is given by:

ks = A0 exp(−Ea / RT),   (8)

where A0 is the pre-exponential coefficient of Eq. (8) (Arrhenius); R is the ideal-gas constant; Ea is the activation energy; and T is the temperature (in kelvin) of the reaction. As seen from Eq. (8), under otherwise identical process conditions, higher temperatures lead to higher ks, so that pores near the surface of the preform close earlier, while more pores easily remain in the interior of the preform. This is part of the reason why the uniformity of densification decreases as the temperature increases.

4.2. Coupling effect of pressure in furnace and fiber volume fraction on densification

Fig. 6 shows the correlation between two CVI processing factors, the pressure of propylene (C3H6) and the fiber volume fraction, and one of the processing targets, the density. As presented in Fig. 6, preforms with a higher fiber volume fraction have higher initial densities, so under the same conditions they reach higher final densities in the given densification time; higher furnace pressures also lead to higher final densities, presumably because higher pressures result in a higher concentration of propylene that

Fig. 4. The effect of volume ratio (C3 H6 and N2 ) and temperature on densification.


Fig. 5. The effect of temperature on densification.

can penetrate into the pores of the preforms, so that a higher densification rate is obtained. Pressure should therefore receive particular attention during processing.
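The temperature sensitivity implied by Eq. (8) is easy to illustrate numerically; the pre-exponential factor and activation energy below are placeholders, since the paper gives no kinetic constants for propylene pyrolysis:

```python
import math

def rate_constant(A0, Ea, T):
    """Arrhenius rate constant of Eq. (8): ks = A0 * exp(-Ea / (R*T))."""
    R = 8.314  # ideal-gas constant, J/(mol K)
    return A0 * math.exp(-Ea / (R * T))

# Placeholder kinetics: A0 = 1e9 1/s, Ea = 200 kJ/mol (illustrative only)
A0, Ea = 1.0e9, 2.0e5
k_low = rate_constant(A0, Ea, 825.0 + 273.15)    # lowest Table 1 temperature
k_high = rate_constant(A0, Ea, 1125.0 + 273.15)  # highest Table 1 temperature
print(k_high / k_low > 50)  # deposition is far faster at the hot end
```

Even with modest placeholder kinetics, the rate constant grows by roughly two orders of magnitude over the Table 1 temperature range, which is consistent with the surface-sealing behavior discussed above.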

5. Processing parameter optimization using a GA

Genetic algorithms are highly effective, directed, global search and optimization algorithms based on the evolutionary process of biological organisms in nature. During the course of evolution, natural populations evolve according to the principles of natural selection and "survival of the fittest". In order to use a GA to solve a problem, the variable c is typically encoded into a string or chromosome structure that represents a possible solution to the given problem. The fitness of each individual (chromosome) is evaluated with respect to the given objective function. This initial population is then operated on by three main operators, namely reproduction, crossover, and mutation, to create a hopefully better population.

Fig. 6. The surface of pressure, the fiber volume fraction, and density.

5.1. Combining the ANN and genetic algorithms with real codification

GAs and ANNs can be combined in several ways [9]. For instance, GAs have mostly been used as an algorithm to generate the weights of an ANN or to optimize the architecture of an ANN (as shown in Fig. 7); sometimes ANNs have been used to build the fitness function of a GA, while the GA is used to optimize the inputs of the ANN (as shown in Fig. 8). In this paper, the latter combination is adopted. On the basis of the analysis in Section 4, the primary goal of the optimization is to obtain better density uniformity of the preforms under higher pressures and lower temperatures:

fit(c) = U_net(c) · P / (T + 0.001),   (9)

where c is an input vector of the ANN model net(), and U_net(c) is the density uniformity predicted by the ANN model. In order to prevent the denominator from reaching zero

Fig. 7. The neural network/genetic algorithm cycle for modeling.


Fig. 8. The neural network/genetic algorithm cycle for optimization.

by accident, 'T + 0.001' takes the place of 'T' as the denominator. In this paper a real-coded GA is introduced, and the chromosome structure is described by a vector c consisting of four genes g_l:

c = [g1, ..., gl, ..., g4],  l = 1, ..., 4,   (10)

where l is the locus of a gene in the chromosome; each gene represents one of the four CVI processing parameters and is initialized with a floating-point number drawn randomly from a uniform distribution. The GA used in this study proceeds as follows.

Step 1. Initialization – choose an initial population containing 31 solutions to be the current population. The format of each solution (chromosome) ci (i = 1, 2, ..., 31) is given by Eq. (10).

Step 2. Evaluation – each individual ci of the current population is evaluated by the fitness function (9) in order to assign it a probability of being redrawn in the next generation. The probability Pi is computed as follows:

Pi = fit(ci) / Σ_{j=1}^{31} fit(cj),  i = 1, ..., 31.   (11)

Step 3. Selection – two individuals are selected from the current population as members of the intermediate population, based on their assigned probabilities and the roulette-wheel selection strategy [11].

Step 4. Genetic operation – apply one genetic operator to the selected chromosomes. The operator is chosen equiprobably from three operators, namely reproduction, crossover, and mutation. Assume that c11 = [g1, g2, g3, g4] and c12 = [h1, h2, h3, h4] are the selected chromosomes.

(1) Reproduction – copy the selected individuals into the next generation without any change:

c21 = [g1, g2, g3, g4],  c22 = [h1, h2, h3, h4].

Moreover, to make sure that the best-performing chromosome always survives intact from one generation to the next, the elitist strategy is also adopted, which inserts the fittest chromosome of the current population into the new population.

(2) Simple crossover – a position i ∈ {1, 2, 3} is randomly chosen and two new chromosomes are built:

c21 = [g1, g2, ..., gi, h(i+1), ..., h4],
c22 = [h1, h2, ..., hi, g(i+1), ..., g4].

(3) Uniform mutation – two positions i ∈ {1, 2, 3, 4} and j ∈ {1, 2, 3, 4} are randomly selected and two new chromosomes are built independently:

gi' = random('uniform', 0.2, 0.8),
hj' = random('uniform', 0.2, 0.8),
c21 = [g1, g2, ..., gi', g(i+1), ..., g4],
c22 = [h1, h2, ..., hj', h(j+1), ..., h4].

Step 5. Repeat Steps 3 and 4 until the size of the intermediate population reaches 30.

Step 6. Termination – the algorithm terminates after a user-specified number of generations, and the elitist chromosome of the last generation is saved. In this study the inputs and outputs of the ANN model are regularized according to Eq. (3), so each gene of the elitist must be decoded by the inverse operation after the GA terminates.

5.2. Optimization of CVI processing parameters for carbon/carbon composites

Fig. 9 gives the evolution history of the elitist as well as its predicted outputs from the ANN model. The result of the optimization is shown in Table 4. Slightly more than two thirds of the chromosomes converge to the elitist, so the elitist can be regarded as the optimum, i.e., the optimum combination of CVI processing parameters for carbon/carbon composites is [825 °C, 0.045

Fig. 9. The evolution history of the elitist.


Table 4
The chromosome distribution of the last generation

Individuals    Temperature (°C)   Pressure (atm)   Volume ratio (C3H6/(C3H6 + N2))   Fiber volume fraction   Density (g/cm³)   Density uniformity
7#             1020               0.045            0.55                              0.60                    1.47              0.93
9#             825                0.033            0.40                              0.65                    1.46              0.90
21#            825                0.029            0.55                              0.59                    1.45              0.90
28#            825                0.036            0.55                              0.65                    1.52              0.91
22#, 27#       825                0.045            0.40                              0.65                    1.51              0.90
6#, 12#        825                0.045            0.39                              0.60                    1.45              0.89
Other 23 (a)   825                0.045            0.55                              0.65                    1.58              0.91

(a) Including the elitist.

atm, 0.55, 0.65], which is in fairly good agreement with [12].
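The real-coded GA of Section 5.1 can be sketched end to end: the fitness of Eq. (9), roulette-wheel selection per Eq. (11), the Step 4 operators, and the Step 6 decoding (the inverse of Eq. (3)). The gene ordering, the stand-in for the trained ANN, and all numeric values are assumptions for illustration only:

```python
import random

def fitness(c, net):
    """Eq. (9): fit(c) = U_net(c) * P / (T + 0.001), where T and P are the
    (regularized) temperature and pressure genes. Gene order is assumed."""
    T, P = c[0], c[1]
    return net(c) * P / (T + 0.001)

def roulette_select(pop, fits):
    """Roulette-wheel selection: draw a chromosome with probability
    proportional to its fitness, per Eq. (11)."""
    r = random.uniform(0.0, sum(fits))
    acc = 0.0
    for c, f in zip(pop, fits):
        acc += f
        if acc >= r:
            return c
    return pop[-1]

def simple_crossover(c1, c2):
    """Simple crossover: swap the tails after a random position i in {1, 2, 3}."""
    i = random.randint(1, 3)
    return c1[:i] + c2[i:], c2[:i] + c1[i:]

def uniform_mutation(c):
    """Uniform mutation: redraw one random gene uniformly from [0.2, 0.8]."""
    m = list(c)
    m[random.randrange(len(m))] = random.uniform(0.2, 0.8)
    return m

def decode(g, lo, hi):
    """Inverse of the Eq. (3) scaling: map a gene in [0.2, 0.8] back to [lo, hi]."""
    return lo + (g - 0.2) * (hi - lo) / 0.6

random.seed(0)
net = lambda c: 0.9                     # stand-in for the trained ANN
pop = [[random.uniform(0.2, 0.8) for _ in range(4)] for _ in range(31)]
fits = [fitness(c, net) for c in pop]
parent = roulette_select(pop, fits)
child1, child2 = simple_crossover(parent, uniform_mutation(parent))
print(len(child1), len(child2))          # -> 4 4
print(decode(0.2, 825.0, 1125.0))        # lower-bound gene -> 825.0
```

The real run would replace the stand-in `net` with the trained BP network, loop Steps 3 and 4 until the intermediate population reaches 30, and carry the elitist over unchanged each generation.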

6. Graphical user interface

A graphical user interface (GUI) is created (Fig. 3) for easy use of the integrated model. With it, users can change the number of hidden layers as well as the number of nodes per hidden layer very easily, and can also save and reopen a successfully trained network conveniently. Furthermore, the GUI allows easy comparison of the physical properties corresponding to different combinations of processing factors (Fig. 5). It should be noted that the program still works if the user merely changes the datasets and selects a suitable ANN architecture.

We have thus developed three-dimensional graphics to make the neural network model much more transparent. It is anticipated that such a system will offer advantages in visualizing the correlation between process conditions and the properties of C/C composites and in searching for optimum CVI processing parameters.

Acknowledgements This work has been supported by the National Natural Science Foundation of China under Grant No. 50072019 and the Natural Science Foundation of China for Distinguished Young Scholars under Grant No. 50225210.

7. Summary

A CVI process model based on a BP neural network and a GA has been developed to describe the densification of C/C composites. The model captures the relationship between the process conditions and both the bulk density and the density uniformity, making it very useful for process optimization. The ANN sub-model is a feed-forward structure trained with a modified backpropagation algorithm, the Levenberg–Marquardt algorithm. This approach allows the analysis of several critical known parameters. The GA sub-model is a real-codification algorithm whose fitness function is based on the ANN sub-model. The integration of the two sub-models can optimize the combination of these critical parameters over a wide range of operating conditions to obtain better properties of C/C composites.

References

[1] Hou Xianghui, Li Hejun, Chen Yixi, Li Kezhi. Carbon 1999;37:669–77.
[2] Birakayala Narayana, Evans Edward A. Carbon 2002;40:675–83.
[3] Muc A, Gerba W. Compos Struct 2001;54:275–81.
[4] Kim Jung-Seok, Kim Chun-Gon, Hong Chang-Sun. Compos Struct 1999;46:171–87.
[5] Zhang Weigang, Huttinger Klaus J. Carbon 2001;39:1013–22.
[6] Kezhi Li, Hejun Li, Kaiyu Jiang, Xianghui Hou. Sci China (Series E) 2000;43(1):77–85.
[7] Song RG, Zhang QZ. J Mater Process Technol 2001;117:84–8.
[8] Zhao Weixiang, Chen Dezhao, Hu Shangxu. Comput Chem Eng 2000;24:61–5.
[9] Blanco A, Delgado M, Pegalajar MC. Int J Approx Reason 2000;23:67–83.
[10] Walczak Steven, Cerpa Narciso. Inf Software Technol 1999;41:107–17.
[11] Holland JH. Genetic algorithms. Sci Am 1992;4:4–50.
[12] Jiang Kaiyu. PhD Thesis, Northwestern Polytechnical University, China, 2000.