Neural network representation of finite element method

Neural network representation of finite element method

Neural Networks, Vol 7, No 2, pp. 389-395, 1994 Copyrtght© 1994 EisevaerSemnce Ltd Pnnted m the USA. All rights reserved 0893-6080/94 $6.00 + 00 Perg...

609KB Sizes 1 Downloads 142 Views

Neural Networks, Vol 7, No 2, pp. 389-395, 1994 Copyrtght© 1994 EisevaerSemnce Ltd Pnnted m the USA. All rights reserved 0893-6080/94 $6.00 + 00

Pergamon

CONTRIBUTED ARTICLE

Neural Network Representation of Finite Element Method J U N T A K E U C H I AND YUKIO K O S U G I TokyoInsutute of Technology (Recetved 26 November 1992; revtsedand accepted 12 August 1993) Abstract--A fimte element method ( FEM)-based neural network for boundary value problems ts constdered. The neural network consists of node-umts and element-subnets whose synapttc weights are predetermmed usmg FEM's formulation procedure Using the network mverston techmque, unknown mputs of the network are updated to satisfy both the governmg law and the boundary conditmns, and consequently reach the solutton of the problem. We applied the network to the electricfield problem governed by Poisson's equatton The numertcal slmulatwn shows the validity of thts network

Keywords--Neural networks, Finite element method, Polsson's equation, Network inversion, Electrical impedance tomography, Inverse problem, Boundary value problem. Section 2 summarizes the FEM formulation of Poisson's equation. Definitions of a forward problem and an inverse problem are also made in the section. The equivalent networks for the forward problem and the inverse problem are presented in Sections 3 and 4, respectively. Numerical simulation result on the inverse problem is shown in Section 5.

1. I N T R O D U C T I O N Artificial neural networks have been studied and utilized in various science and engineering fields, mostly for classification or recognition problems. In this paper we consider neural network representation of finite element method ( F E M ) , with the application to the electric field problem governed by Poisson's equation. The FEM is a very powerful computational tool for boundary value problems (Stasa, 1985), which transforms a continuous system into an approximate discrete system with finite elements and nodes. Structural similarity between the FEM and a neural network with distributed elements motivated us to consider an equivalent network for the FEM. We place neural units on all the FEM elements and nodes, then connect them according to the system equation in discrete form. We mainly use an iterative technique of Linden and Kinderman (1989), which is referred to as "network inversion," to solve the problem with the network. The network inversion is a kind of "back-propagation learning" (Rumelhart et al., 1986), in which the output error is propagated backward to the inputs, and the inputs of the network are updated iteratively, instead of updating the weights, to decrease the output error. Each boundary condition is treated as a clamped input or target output of the network.

2. F E M D E S C R I P T I O N OF P O I S S O N ' S EQUATION We consider an inhomogeneous isotropic domain D with its surface S. Then the electric field is governed by Poisson's equation (Webster, 1990). d i v ( - a grad x) = Q

( 1)

where a, x, and Q are the conductivity, voltage, and current source within the domain D respectively. The surface S is classified by its boundary condition as,

So:

x = Xu,

q = qu;

incompletely defined boundary

Sl:

x =

xg,

q = q~;

Dirichlet boundary

$2:

x = x~,

q = qg;

Neumann boundary

$3:

x = xg,

q = qg;

over-defined boundary

where Requests for reprints should be sent to Dr. YukaoKosugi, Inter&sophnary Graduate Schoolof Scienceand Engineering Tokyo Instttute of Technology,4259, Nagatsuta, Mldori-ku, Yokohama227,

q = a grad x.n.

(2)

Subscripts ( # ) and ( u ) denote given value and unknown value, respectively; q, ( . ) , and n are the inward

Japan.

389

390

J Takeucht and Y Kosugt

boundary current flux, inner product, and outward unit vector normal to the boundary, respectively. We define in this paper a forward problem and an inverse problem. The forward problem is to find the voltage and current density distribution from the given conductivity distribution, current sources, and boundary conditions. The reverse problem is to reconstruct the conductivity distribution from the given current sources and boundary conditions. The reconstruction problem m electrical impedance tomography (EIT), discussed in Secnon 4, is a typical inverse problem. FEM is essentially a numerical approximation method to solve boundary value problems, usually in forward problem form. In the FEM, the domain is dlvided into finite number of elements (Figure 1 ), and the voltage distribution w~thin each element is approximated by interpolating the voltages of the nodes, or the nodal voltages. The voltage inside the ith element m Figure 1 is expressed as,

[

x,,]

x,~ = N,~'X~ = [n,j n,z n,3]

I

(3)

X~3J

where superscripts ( " ), ( ' ), and subscript ( , ) denote element, transpose, and /th element, respectively. N and X are the interpolation vector and nodal voltage vector, respectively, for example, Xf = [x, ~x,2 x,3]' is the element nodal voltage vector of t h e / t h element. Then the governing equation is rewritten in terms of the equation of the nodal voltages. There are several ways to derive the equation. For Poisson's equation we can utilize the variational approach, or the minimization of the functional, to derive this equation. The functional X for eqns ( 1) and (2) becomes

X= fo[½algradxl2-Qx}dv-

fs

2,33

(qx) ds

(4)

By substituting eqn ( 3 ) into eqn (4), functional X in the discrete form becomes as follows:

et [ONeVONe

m [fofl

) [ ' = 1=1 Z

[-~ffXt[~)--~xt-oNettx

e

e

}

dl)

where × and m are coordinate vector and the number of elements, respectively. Proceeding the integral, eqn (5) can be rearranged as X :

m ~

1 et K,X, e e - X , et f ,e) (~X,

(6)

Xll

I-th

IQ ~

element(e)

node

surface (S)

domain (D) FIGURE 1. An example of two-dimensional finite element model, in which the object domain is divided into finite number of threenode tdangular elements, The voltage in each element is expressed by interpolating the voltages of the surrounding three nodes.

tributions of Q and q are summed up into the nodal current vector f. The conductivity matrix K is a nonnegative definite, symmetric matrix and is determined by conductivity, nodal coordinates, and other structural factors. The necessary condition for minimizing the functional X is obtained by equating the derivative of X, with respect to nodal voltage vector X, to null vector 0. ox

0x

_

lox;'

e/0x;'v/,__l

t0x K' 0x j j x -

f~=0,

(8)

or

&V 0X

-

KaX

f~ : 0

(9)

In solving the above equation, boundary conditions are imposed by fixing the nodal voltages ~ = xje (J E 1, 2 . . . . p: p is the number of nodes); this is the case where the j t h node is on S~ (Dirichlet) or $3 (overdefined) boundaries. In this case, remaining nodal voltages on So (incompletely defined) or $2 (Neumann) boundaries are treated as unknown parameters, that is, xj = xju. Similarly, nodal currents o n S 2 or $3 boundaries are fixed asfj =fig (j E 1.2 . . . . . p), whereas the remaining nodal currents are treated as unknown ones asfj = fj, when the nodes are on So or Sl. In ordinary boundary value problems, the surface S consists of S~ (Dirichlet) and $2 (Neumann) boundaries. Thus, we have p unknown parameters xju orfj,, in total, in simultaneous eqn (8) or (9). Hence, we can obtain the unknown parameters by solving eqn (9), after ~mposing the nodal voltages xjg gwen.

t-I

or

X = IxtKax

- Xtf%

(7)

where K and fare the stiffness (or conductivity) matrix and the nodal force (or current) vector, respectively. The superscript (a) denotes an assemblage. The con-

3. NEURAL N E T W O R K E X P R E S S I O N OF FORWARD P R O B L E M 3.1.

Energy-Based

Approach

An approach to solve the forward problem is the energybased one, which updates unknown nodal voltages it-

NN Representatton of F E M

391 6

1

2

3. Each error is fed back to the corresponding input unit through the feedback path in Figure 2. 4. Each unknown nodal voltage is updated, using the error with a certain gradient-based nonlinear programming technique, for example,

3

(a) <

<

~

.,

,,\,, \f,

i ~

(Xj)n~=Xj-atj

>

, , / ,, /,.

't ((2)K~ ((a)K, ~'}

/ Xl ?X2 /3

~

>

(4)K,T)~

X5 ~ X6~

(b) FIGURE 2. (a) A simple two-dimensional FEM model with six nodes and four elements. (b) Neural network model for the forward problem of FEM description (a). The network represents eqn (8). A solid line circle denotes the linear-weightedsum unit. A broken line circle corresponds to each element, which we call element-subnet. Each pair of input and output units, connected by feedback path, corresponds to each node and has connection to the corresponding element-subnets.

eratively to decrease the functional, or the energy, in eqns (6) or (7). Figure 2 illustrates a network model for the forward problem. The network represents eqn (8), with nodal voltages and nodal currents expressed in the input and the output layer, respectively. For the extension of this network to the inverse problem, we use eqn (8), instead of eqn (9), to construct the network. Boundary conditions of the problem specify a part of the input and the output values. The partial derivatives of the energy, with respect to the nodal voltages, are obtained by eqn (8), which are referred to the output errors ej (j E 1, 2 . . . . . p) of the network. ej

=

ox

_ _

Oxj

=

I °x: t=l [ OXj

K

e

v--i

)

t ~ OX ] j x - fjg

(forjth node E $2 orS3).

(10)

Therefore, the algorithm proceeds as follows: 1. Clamp the known inputs, or the given nodal voltages, and initialize the unknown inputs with arbitrary values. 2. Each output unit is activated and the error cj is produced only at the output unit where the desired output fjg is given.

(forjth node E $2)

(11)

where the subscript (new)and a denote updated value and modification gain, respectively. 5. Return to step 2 if the error is larger than the predetermined convergence criterion. Note that, in this approach, unknown nodal voltage on So node (node on So boundary) will never be updated in the iteration after initialized with arbitrary value, because the error is not generated at corresponding output unit. On the other hand, the error on $3 node will not be used at all, because nodal voltage on $3 node is clamped. This implies that some condition or information is disregarded. Therefore, this approach does not work properly when So node or $3 node exists. Furthermore, in general FEM formulation, we cannot always find a functional to be minimized, but can derive only a simultaneous eqn like (8) or (9), using, for example, the method of weighted residuals. For solving such problems, we will introduce an equationbased approach in the next subsection.

3.2. Equation-BasedApproach This approach uses the network inversion technique (Linden & Kindermann, 1989) to update the unknown inputs of the network in solving for nodal voltages. Network reversion is a kind of back propagation learning, in which the output error is propagated backward to the input layer, and not the weights but the unspecified inputs are iteratively modified. This idea is based on the duality of weights and inputs in the general expression of the error function J as, J = l O t - g(Wl)l 2

(12)

where O, I, and W denote output vector, input vector, and weight matrix of the connection between the input and the first hidden layer, respectively. Subscript (t) and g denote target value and transfer function vector between the first hidden layer and the output layer, respectively. Thus, we can modify the input vector I in place of the weight matrix W with the back propagation technique. They applied the method to reconstruct the image at the input, from an incomplete image given to the output side. The error function of our network is defined as follows: j=l

E ~. 2 j~s2,s3

(13)

Partial derivative of J, with respect to unknown nodal voltage, is obtained by the following equation:

392

0xk

J Takeuchl and Y Kosugt

OXkL,=,[OX ' ~ O x ] J J T t \ a x , ]

~J' ,

forkthnodeES0orS2.

(14)

This can be calculated by back propagating the output error to the input. This approach differs from the algorithm described in the previous subsection, in that the unknown inputs of the network are updated by using the back propagated error. This approach allows the existence of So nodes, as long as the number of So nodes does not exceed that of $3 nodes. This means that when the number of unknown parameters, that is, unknown nodal voltages and currents, exceeds the number of equations in the simultaneous eqn (8) or (9), we cannot obtain a unique solution of the problem. By superimposing the corresponding input unit and the output unit, we can rearrange the network of Figure 2 into a mutually connected form, as illustrated in Figure 3. In this form, signal transfer occurs only locally, among adjacent units and subnets. This feature is important for building up a large-size network in hardware, because we can design the hardware architecture without thinking about additional circuits for broadcasting the signal to units in far positions. In solving the linear simultaneous eqn (9), both direct methods and iterative methods have been conventionally used, and the latter ones include relaxation methods and gradient methods, for example, the conjugate gradient method. We cannot say, so far, which one is best because each method has merits and demerits. The architecture of our network is advantageous to accelerate the calculation speed using the gradient method, when we think about the implementation of the network in hardware, although some modification wdl be needed to accommodate the whole method in hardware. 4. INVERSE P R O B L E M 4.1. Inverse Problem in EIT

Electrical impedance tomography (EIT) (Webster, 1990) is a noninvasive technique to reconstruct the conductivity distribution inside the object, by means of measuring the voltages and currents at the object surface. EIT has been used in geological applications, such as in mineral prospecting, and more widely in biomedical applications. Figure 4 illustrates a fourelectrode measuring system. In EIT, most reconstruction algorithms are based on FEM, in modeling the object. In solving the problem, the following assumptions are usually imposed: there is no current source inside the object, and the surface is isolated except where the current electrodes are attached. It is also assumed that the conductivity in each element, which we call element conductivity,

,iJ.\

/(~

0

node-unit

q4))... ~

t j. element-subnet

5;5)

FIGURE 3. Mutually connected form of the network of Figure 2b. Each pair of units, connected by feedback path in Figure 2b, is supedmpased to constitute a node-unit. The connections between node-units and element-subnets in this figure express both the connection from input layer to element-subnet layer and from element-subnet layer to output layer in Figure 2b.

is homogeneous. In this case, the maximum number of independent element conductivity values we can obtain is E ( E - 1)/2, where E is the number of electrodes (Webster, 1990 ). In the FEM modeling, we also use the above assumptions, but we do not directly solve the forward problem, that is, eqn (9), which the most conventionally used algorithm relies upon (Yorkey, Webster, & Tompkins, 1987). We show the neural network for an EIT problem in the next subsection. 4.2. Neural N e t w o r k for EIT

When the element conductivity is homogeneous, element conductivity matrix in eqn (6) or (8) becomes K[ = ~[L',

(15)

where a,~ is the element conductivity of the/th element, and L [ is a matrix that does not depend on a[. Then we can rearrange the element subnet, as illustrated in Figure 5. Thin means that new unknown parameters, that is, the element conductivities, are added to the system to form an intermediate input layer in the network for the forward problem illustrated in Figure 2. We use the network inversion technique to update the unknown nodal voltages and the element conductivities simultaneously. In general, an EIT problem requires multiple voltage-and-current measurement sets, or the training sets, for each current injection pattern. object surface

-

• : electrode

FIGURE 4. Four-electrode impedance measuring system. Current is injected through a pair of current electrodes. Differential voltage is measured at a pair of voltage electrodes.

393

N N Representation of F E M

LT

ing the direct methods. Then the element conductivities are modified in a direction to decrease the voltage error at the electrode. Thus, the unknown nodal voltages are regarded as dependent variables of the unknown conductivities. On the other hand, in our approach both variables are treated in equal. This will be helpful for designing a parallel processing architecture in a simple form. However, it is still an open question which method leads rapid convergence to the solution, in general.

/

FIGURE 5. The ith element-subnet of the neural network for EIT problem.

In this case, the error function E of the network for EIT problems becomes

T (16)

E = ~ Jk,

k=l

where T is the number of training sets, and J~ is the error function J, defined in eqn (13), of the kth training set. Partial derivatives of E, with respect to unknown nodal voltages, are obtained by eqn (14) for each training set. On the other hand, partial derivative of E with respect to the ith element's conductivity is obtained by the following equation: OE = ~ { k=l

Z

(e 19X/0xe

5. N U M E R I C A L S I M U L A T I O N In this section, we show two examples of the numerical simulation of our network for the EIT problem. Figure 6 shows the FEM models we used for the simulation. All the elements are three-node triangular. We used the opposite method (Webster, 1990 ) for current injection, in which the current is injected through a pair of opposite electrodes [e.g., electrode pairs ( 1, 10) (2, 11 ) • . . (9, 18) in Figure 6 ]. That is, we have nine training sets for each model. In both cases, the maximum number of independent element conductivities we can estimate is 117 (Webster, 1990), which is larger than the number of elements in the simulation model. Therefore, we can obtain a unique solution for the element conductivities.

e [oxe\tv'" X ) 6

jES2,S 3

where { }k means that the nodal voltages in the brackets are given by the kth training set. This can be calculated by summing up errors back propagated to the element conductivity input in Figure 5, for all training sets. Thus, the output errors are back propagated to affect all the inputs of the network and unknown parameters, differently from the algorithm shown in Section 3.2. Unknown nodal voltages and element conductivities are both updated at once after all the training sets are presented to the network. The same element conductivitles are supposed for all training sets, whereas unknown nodal voltages, which are not included in the training sets, depend on each current injection pattern. Thus, we must keep in memory the unknown nodal voltages together with the training set, and recall them when the same training set is presented to the network in the iterative procedure. Differently from the ordinary forward problems, we have a number of $3 nodes, which correspond to multifunctional electrodes where the voltage and the current are simultaneously measured, in the EIT problem. This additional information enables the network to learn additional unknown intermediate inputs, that is, the element conductivities. In most of the conventional methods in EIT problems, they solve eqn (9) for tentative conductivities us-

5

7

4

18 i~

electrode

11 0=05 o=10 /14

~15

0=20

(a)

(b) FIGURE 6. Two-dimensional circular (a) and rectangular (b) FEM m~:lels for numerical simulation. Circular m~:lel has 37 nodes, 54 elements, and 18 electrodes. Rectangular model has 30 nodes, 40 elements, and 18 electrodes. All the elements are supposed to have independent element conductivity. Three or two different conductivibes are supposed with arbitrary distributions.

394

J Takeuchl and Y Kosugt

For the preparation of the simulation, we evaluated the nodal voltages and currents for each current injection pattern by solving a simultaneous eqn (9) with Gaussian elimination. In constructing the training set, we assumed voltages were measurable at all the electrodes, including the ones for current injection. To evaluate the solution stability against the error in the voltage measurement, we added mean zero Gaussian noise to the nodal voltages in the rectangular model training set, in which the standard deviation of the noise distribution was set to 0.1% of the maximum nodal voltage. In the simulation, initial conductivities are all set to unity. To avoid excessive changes of the conductlvities, we applied the learning rule only to unknown nodal voltages for the first 50 iterations. Figure 7a shows the circular model conductivity distribution at various stages of the iteratmn. In the progress of the iterations, the solution converged to the exact image. Slow convergence at the central part may be attributable to the fact that the influence of the voltage measurement at the surface electrodes is gradually propagating from the surface to the center. The rectangular model also converged to the exact solutmn, in the absence of noise. However, some error induced by the noise appeared in the reconstructed image (Figure 7b), and it reached about 20% of the exact value at an element in this simulation. There are many causes of (~

1.0 0.0

(a)

20001teragon

10 0.0

(b) FIGURE 7. (a) Exact and reconstructed conductivity images of the circular model, at each iteration. (b) Reconstructed image of the rectangular model at 3000 iteration (having already converged) with random noise added in the training sets.

measurement error; among all, the electrode location error is known as one of the major factors. In this simulatmn, we used the Fletcher-Reeves conjugate gradient method (Himmelblau, 1972) for back propagation learning. For the implementation of the neural network in hardware, this method requires an additional control unit that is connected to all the input units. The unit should organize the one-dimensional search and the calculation of the momentum rate, using the present and the previous gradients in determining the next searching direction. 6. C O N C L U S I O N We have shown a neural network architecture applied to the boundary value problem of electric field. We organized the network to represent the equation derived from FEM formula. Then we considered two approaches, an energy-based approach and an equationbased approach, to solve the forward problem. In the equation-based approach, we used the network inversion technique to update the unknown nodal voltages lteratively. We extended this scheme to the reconstruction algorithm in EIT, introducing the element conductivity parameters into the system to constitute an intermediate input layer. Finally, we demonstrated the numerical simulation with satisfactory results in the accuracy. In the original network inversion, interlayer weights have to be well adjusted through the learning before updating of inputs takes place. In this preadjustmg phase, a large-sized network with many weights will require tremendous number of training data to satisfy the conditions for getting generalization ability ( Baum & Haussler, 1989). In our approach, the internodal weights are hard-wired according to physical conditions inherent to the FEM problem. This will allow us to get accurate solutions, irrespective of the conditions necessary for the generalization. In finite element analysis, however, the available number of elements, which is closely related to the accuracy of the solution, is limited and it causes some residual error in the solution. Uncertainty of physical parameters, including the electrode dislocation error in EIT problem, also degrades the accuracy of the solution. Using the neural network representation will facilitate decreasing such errors by the training with experimental responses of the real system. Nonhnear transfer functions and the adjustable weights of the network enable the applications to the extensive nonlinear problems, such as the magnetic flux distribution analysis in ferromagnetic materials in nonlinear region. Another advantage of the neural network representation of FEM is that the network may be able to explain some nonlinear relations among physical quantities, without explicitly defining the nonlinear function forms. When we think about the calculation of the

N N Representatton of F E M

magnetic flux distribution, for example, in a ferromagnetic material, the magnetic flux density does not linearly depend on the magnetic field applied. In this case, we may be able to explain the nonlinearity by introducing a number of sigmoid functions to stand for a nonlinear element, without explicitly knowing the nonlinear B-H curve. With the advantages mentioned above, the network representation of FEM can be a useful framework for further applications, in particular, when we have a hardware apparatus of the network to yield high-speed solution. REFERENCES Baum, E. B., & Haussler, D. (1989). What size net gaves vahd generahzatlon9 Neural Computatton, 1, 151-160 Himmelblau, D. M (1972). Apphed nonhnear programmmg New York' McGraw-Hill. Linden, A, & Kmdermann, J (1989). Inversmn of multdayer nets. Proceedmgs of lnternattonal Jomt Conferenceon NeuralNetworks (pp 425-430). Washington, DC. Rumelhart, D E., & McClelland, J. L (Eds.) (1986). Parallel dtstrtbutedprocessmg (Vol. 1). Cambridge MIT Press. Stasa, F L (1985). Apphed fimte element analysts for engmeers New York: CBS Webster, J. G (Ed.) (1990). Electrtcal tmpedance tomography New York. Adam Hdger Yorkey, T. J , Webster, J. G., & Tompklns, W J. (1987). Comparing reconstruction algorithms for electrical impedance tomography. IEEE Transacttonon BtomedwalEngmeermg, BME-34( 11 ), 843852.

395

0 W j k (g) (u) (a) S grad x X p K e g I l T (.) (t) (e) J, E n X=

output vector weight matrix j t h (node) kth (training set) given value unknown value assemblage surface, boundary gradient voltage functional number of nodes conductivity matrix error transfer function vector input vector ith (element) number of training sets inner product transpose element error function outward unit vector normal to the boundary (Xl, x2 . . . . . Xp) t nodal voltage vector N = (n~, n2 . . . . . ripe)t interpolation vector ( p e is the number of nodes in each element) fa = (fl, f2 . . . . . fp)t assemblage nodal current vector 0 = (0, 0 . . . . . 0)t null vector notation

NOMENCLATURE D div a Q q m x a

domain divergence conductivity current source within the domain inward boundary current flux number of elements coordinate vector modification coefficient

Oa__2 Obl Oa~ Oa ~

Oa2 Obl

...

Oan Obl

...

Oan

Ob2

Ob Oal Obm

Obm