
Copyright © IFAC 12th Triennial World Congress, Sydney, Australia, 1993

ASSOCIATIVE, SELF-PROGRAMMABLE NEURAL NETWORKS FOR MANUFACTURING PROCESSES

M.B. Zaremba
Département d'informatique, Université du Québec, Hull, Québec J8X 3X7, Canada

Abstract. The use of associative neural networks in real-time measurement and control processes encountered in manufacturing is discussed in this paper. In particular, the problem of constructing neural architectures capable of learning arbitrary associations very rapidly and of generalizing this knowledge to an arbitrary input state is addressed. Beginning with the problem of object separability and indexing, through computational operations and function approximation by neural networks, the paper provides a survey of network construction methods and their applications.

Keywords. Neural nets; manufacturing systems; machine learning; measurement systems; robotics.

1. INTRODUCTION

A growing number of manufacturing applications requiring real-time pattern recognition and classification, optimization, signal prediction, and other often NP-hard tasks involving a heavy computational burden make massively parallel processing and its implementation on neural networks increasingly important. Neuromorphic architectures offer several advantages from the practical application standpoint. Inherently distributed processing gives neural networks fault-tolerant behaviour. Neural net classifiers are able to compute higher-order decision boundaries under realistic constraints, without being subject to combinatorial explosion or to excessive requirements for the storage of learning samples.

Presently, feedforward neural architectures using the backpropagation training algorithm are applied in the vast majority of practical situations. The backpropagation training scheme provides a powerful, general tool for the determination of connection weights. Its generality, however, is obtained at the expense of training speed, occasional poor convergence, and a limit on the number of training inputs. This paper addresses the general problem of mapping stimulus-response associations directly onto an association matrix, with noniterative presentation of training inputs. The resulting neural architectures are able to learn arbitrary associations very rapidly and to generalize this knowledge to an arbitrary input state by an extension of the associativity task. A very important feature of the algorithms and methods dealt with in this paper is that they are particularly well suited to implementation on optical neural networks (Caulfield et al., 1989) and associative holographic memories (Lee et al., 1989; Mao et al., 1991), technologies that have recently generated considerable interest. The feasibility of holographic memories is supported by reports such as the storage of 500 images in volume holograms with no apparent crosstalk (Mok et al., 1991).

Three problem areas are discussed in this paper in more detail: stimulus-response associations, local computational capabilities of neuromorphic architectures, and function approximation. Examples of applications pertaining to manufacturing systems, particularly to robotics and measurement systems, are also provided.

2. ASSOCIATIVE FUNCTIONS

A typical example of indexing, an associative function in a strong sense, encountered in manufacturing systems is binary image recognition, a frequent element of assembly processes using machine vision systems. The network, upon presentation of the image of an object $X(k)$, activates the index $O_k$ corresponding to that object, and only that index. Since the decision regions in this case reduce to points, two layers of connections are able to deal with any indexing task.

Computation of the connection weights, being a type of regression problem (Poelzleitner and Wechsler, 1990), resolves to the computation of a matrix operator (Kohonen, 1988). A whole spectrum of associative networks can be considered, ranging from simple single-layer perceptron-like architectures to quite complex ones, such as holographic networks (Sutherland, 1992) performing operations on vectors representing the entire image.

A key issue affecting the efficiency of object recognition is the separability of the object family. The family {y(k)} is separable if it can be arranged in a separable sequence. The family is strongly separable if each arrangement in a sequence is a separable sequence. Many recognition problems can be reduced to problems of indexing strongly separable representations.


In the case of strong separability, the connection matrix is defined by the following recurrent formula:

$$M(k) = M(k-1) + y(k)^{T} e(k) \qquad (1)$$

where the correction term and the normalizing factor are

$$e(k) = [O_k] - \frac{1}{\sigma_k}\, y(k)\, M(k-1) \qquad (2)$$

$$\sigma_k = \mathrm{Card}(y(k)) \qquad (3)$$
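To make the construction concrete, the following minimal sketch applies (1)-(3) to a toy family of binary stimuli, each presented once; the disjoint-support stimuli, the one-hot index outputs, and the reading of (1) as a rank-one update are illustrative assumptions, not data from the paper.

```python
import numpy as np

def build_association_matrix(Y, O):
    """Single-pass construction of the connection matrix M per (1)-(3).

    Y : (n_patterns, n_inputs) binary stimuli y(k)
    O : (n_patterns, n_indices) desired index outputs [O_k]
    The network response to y is taken as (1/sigma) y M, sigma = Card(y).
    """
    M = np.zeros((Y.shape[1], O.shape[1]))
    for y, o in zip(Y, O):
        sigma = y.sum()              # Card(y(k)), eq. (3)
        e = o - (y @ M) / sigma      # correction term, eq. (2)
        M += np.outer(y, e)          # noniterative rank-one update, eq. (1)
    return M

# Strongly separable toy family: binary stimuli with disjoint supports.
Y = np.array([[1, 1, 0, 0, 0, 0],
              [0, 0, 1, 1, 0, 0],
              [0, 0, 0, 0, 1, 1]], dtype=float)
O = np.eye(3)                        # one-hot index O_k for object k
M = build_association_matrix(Y, O)
print((Y[1] @ M) / Y[1].sum())       # -> [0. 1. 0.]: activates index O_2 only
```

Because the stimuli do not overlap, each rank-one correction leaves the responses to the other stimuli untouched, so one presentation per pattern suffices for exact recall.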

The tests for separability and the self-programming procedures in the general case of separability are discussed by Zaremba and Porada (1992). In order to enhance object separability, an additional dipole processing layer that performs set-theoretical operations was proposed. The dipoles connect pairs of neighbouring input neurons, as defined by a connectivity graph Γ.

3. LOCAL PARALLEL PROCESSING

Connectionist networks with a dipole processing layer, discussed in the previous section, are able to convey geometrical information concerning the objects projected on the input layer. This information can be further used for control purposes.

3.1. Geometrical Information

An essential element of the calculation of a number of geometrical parameters is the determination of the normal vector.

Fig. 1. Short-range dipoles.

Assuming short-range connections in the graph (Fig. 1), elementary normal vectors are calculated as

$$n_i = f(y_E)[1,0] + f(y_W)[-1,0] + f(y_N)[0,1] + f(y_S)[0,-1] = [f(y_E) - f(y_W),\; f(y_N) - f(y_S)] = [a_i(x),\, b_i(x)] \qquad (4)$$

where, for example, $f(y_E)$ denotes the output of a unit with the transfer function $f(\cdot)$ processing the "East" dipole. The aggregate vector $N_P = [A, B]$ at point P is obtained by integration:

$$A_P = \sum_i \mu(i)\, a_i(x(P)), \qquad B_P = \sum_i \mu(i)\, b_i(x(P)) \qquad (5)$$

where $\mu(i)$ is the weight distribution over the processing window around P. Similarly, the tangent vector $T_P = [B, -A]$. The aggregate local curvature can be defined, for an eight-connectedness graph, as

$$C_P = \sum_i \mu(i)\, (\sigma_P - 3) \qquad (6)$$

The activation of a dipole equals (activation of P) minus (activation of i), i.e. $x_P - x_i$, and $\sigma_P$ corresponds to the number of obstacle cells (pixels) neighbouring point P.
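A small sketch of the dipole operations (4)-(6) on a binary obstacle image may help; the identity transfer function f and the degenerate one-point processing window (μ(i) = 1) are simplifying assumptions made for illustration.

```python
import numpy as np

def dipole_geometry(img):
    """Elementary normals (4) and, with a one-point window, the
    aggregate normal components (5) and curvature-like sum (6)
    for every non-peripheral pixel of a binary obstacle image."""
    x = img.astype(float)
    c = x[1:-1, 1:-1]                  # activations x_P, non-peripheral pixels
    # dipole outputs y = x_P - x_i for the four short-range dipoles
    yE = c - x[1:-1, 2:]
    yW = c - x[1:-1, :-2]
    yN = c - x[:-2, 1:-1]
    yS = c - x[2:, 1:-1]
    f = lambda y: y                    # identity transfer function (assumed)
    a = f(yE) - f(yW)                  # a_i(x), eq. (4)
    b = f(yN) - f(yS)                  # b_i(x), eq. (4)
    # sigma_P: number of obstacle cells among the 8 neighbours of P
    sigma = sum(np.roll(np.roll(x, dr, 0), dc, 1)
                for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                if (dr, dc) != (0, 0))[1:-1, 1:-1]
    A, B = a, b                        # eq. (5) with a one-point window
    C = sigma - 3.0                    # eq. (6) with the same window
    return A, B, C

img = np.zeros((7, 7), dtype=int)
img[2:5, 2:5] = 1                      # a square obstacle
A, B, C = dipole_geometry(img)
print(A[1, 1], B[1, 1])                # normal components at a corner pixel
```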

3.2. Control Law

Geometrical operations performed by the network on the internal representations of the obstacles may be used by an autonomous mobile robot or an AGV to maneuver among obstacles, the shape and location of which need not be known a priori. A control law that allows the robot to move toward a target position and to avoid the obstacles can be expressed in terms of the robot displacement vector as


(7)

The displacement of the robot is a function of a globally defined field of target-attracting forces $U_P$. The scalar coefficient $F_P$ defines the impact of $U_P$. It is worth noting that the vector field of forces $U_P$ can also be generated using dipole processing and applying a wavefront propagation algorithm.
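As an illustration of the wavefront idea mentioned above, the sketch below propagates a breadth-first wavefront from the target over the free cells and derives a unit attracting step at each point; the 4-connected propagation and this particular reading of $U_P$ are assumptions for illustration, not the paper's formulation.

```python
import numpy as np
from collections import deque

def wavefront_field(occ, target):
    """Breadth-first wavefront from the target over free cells; the
    attracting force U_P at a cell points to the neighbour with the
    smallest wavefront value (a unit step toward the target)."""
    H, W = occ.shape
    dist = np.full((H, W), np.inf)
    dist[target] = 0.0
    q = deque([target])
    steps = [(-1, 0), (1, 0), (0, -1), (0, 1)]   # 4-connectedness (assumed)
    while q:
        r, c = q.popleft()
        for dr, dc in steps:
            rr, cc = r + dr, c + dc
            if 0 <= rr < H and 0 <= cc < W and occ[rr, cc] == 0 \
                    and dist[rr, cc] == np.inf:
                dist[rr, cc] = dist[r, c] + 1
                q.append((rr, cc))
    def U(p):
        r, c = p
        nbrs = [(r + dr, c + dc) for dr, dc in steps
                if 0 <= r + dr < H and 0 <= c + dc < W]
        rr, cc = min(nbrs, key=lambda n: dist[n])
        return (rr - r, cc - c)
    return dist, U

occ = np.zeros((6, 6), dtype=int)
occ[2, 1:5] = 1                                  # a wall-like obstacle
dist, U = wavefront_field(occ, target=(5, 5))
print(U((0, 0)))                                 # unit step toward the target
```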

3.3. Controller Architecture

The controller is implemented as a multi-layer feedforward network with the weights predetermined according to the elementary operation performed by a particular module. The first layer contains dipole-type connections. With a network limited to short-range connections, there are 8n hidden units, n being the number of non-peripheral input neurons. The next layer of the network is a multiple copy of the non-peripheral part of the input layer. Each copy has a different weight pattern, and the transfer properties of its units depend on the type of operation performed by the sub-layer. Eight-connected binary units serve as border indicators; differently connected linear units are used for the computation of border curvature and normal vectors.
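The sizing of the first layer can be checked in a few lines: enumerating the eight dipoles of every non-peripheral neuron of an H x W retina yields exactly 8n hidden units. The index and weight encoding below are illustrative assumptions.

```python
def dipole_layer(H, W):
    """Enumerate the first-layer dipole connections of the controller:
    each non-peripheral input neuron P is paired with its 8 neighbours,
    yielding 8n hidden units with fixed weights (+1 on P, -1 on i)."""
    units = []
    for r in range(1, H - 1):
        for c in range(1, W - 1):
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    if (dr, dc) != (0, 0):
                        # dipole unit computing y = x_P - x_i
                        units.append(((r, c), (r + dr, c + dc), (+1, -1)))
    return units

units = dipole_layer(8, 8)
n = (8 - 2) * (8 - 2)          # non-peripheral input neurons
assert len(units) == 8 * n     # 8n hidden units, as stated in the text
print(len(units))              # 288
```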

Before the control signal is sent for execution by the robot, some logical operations have to be performed. They are related to stagnation points in the robot trajectory. Stagnation (or equilibrium) points can occur in two situations: when $T \cdot U = 0$ and when $N = 0$. Based on the verification of those conditions, a decision has to be made regarding the strategy for robot motion. This part of the controller performs rule-based operations, and could as well be implemented in the form of a neural network.
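The rule-based stagnation test can be sketched as a simple predicate on the aggregate quantities; reading the first condition as a vanishing dot product and introducing a tolerance eps are assumptions made here.

```python
import numpy as np

def stagnation(N, T, U, eps=1e-9):
    """Flag the two stagnation situations named in the text: the
    tangential component of the attracting force vanishes (T . U = 0),
    or the aggregate normal vanishes (N = 0)."""
    t_dot_u_zero = abs(np.dot(T, U)) < eps
    n_zero = np.linalg.norm(N) < eps
    return t_dot_u_zero or n_zero

print(stagnation(N=np.array([0.0, 0.0]),
                 T=np.array([1.0, 0.0]),
                 U=np.array([0.0, 1.0])))   # True: both conditions hold here
```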

4. COMPUTATIONAL NETWORKS

4.1. Function Approximation

An extension of the discrete indexing problem into the real-value domain leads to the function approximation problem. It has been proven that any continuous mapping, together with its derivatives (Funahashi, 1989; Hornik et al., 1990), can be approximately realized by multilayer feedforward neural networks with at least one hidden layer of neurons. However, the problem of efficient learning of such function approximation networks remains open. From the engineering point of view, the development of constructive methods for network design is a prerequisite for the practical use of the discussed methodology.
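As one example of a constructive, noniterative design in the spirit of this requirement (not the explicit formulas of Cardaliaguet and Euvrard), a single-hidden-layer approximator can be built by fixing sigmoidal units over the input range and solving a linear least-squares problem for the output weights; the unit count, gain, and placement below are illustrative assumptions.

```python
import numpy as np

def build_approximator(xs, ys, n_hidden=20, gamma=8.0):
    """Noniterative construction of a one-hidden-layer approximator:
    sigmoid hidden units with biases spread over the input range,
    output weights obtained by linear least squares (no backprop)."""
    centers = np.linspace(xs.min(), xs.max(), n_hidden)
    Phi = 1.0 / (1.0 + np.exp(-gamma * (xs[:, None] - centers[None, :])))
    w, *_ = np.linalg.lstsq(Phi, ys, rcond=None)
    def F(x):
        phi = 1.0 / (1.0 + np.exp(-gamma * (np.atleast_1d(x)[:, None] - centers)))
        return phi @ w
    return F

xs = np.linspace(0.0, 1.0, 50)
ys = np.sin(2 * np.pi * xs)               # continuous mapping to approximate
F = build_approximator(xs, ys)
print(float(np.max(np.abs(F(xs) - ys))))  # small uniform error on the samples
```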

4.2. Measurement Systems

Measurement systems are typical examples of an application where the approximation of a parametric function is of importance. Moreover, it is often required that the values of the function at the training points correspond precisely to the required values. This means that the network output must equal the calibrated values of the measurand at the calibration points.

The problem can be stated as follows. Suppose that each input neuron i receives a sample $x_i$ of the sensor signal x. Variation of the measured physical value v modifies the distribution, but the dependence between v and a sample $x_i(v)$ is not, in general, given analytically. The task of the neural processor is to evaluate v from the vector sample

$$x(v) = [x_i(v)], \quad i \in I \qquad (8)$$

where I denotes the input layer. A network has to be constructed that combines the functions $x_i(v)$ into a function F approximating the measurand v:

$$|F(x) - v| < \varepsilon \quad \text{for } x = x(v) \qquad (9)$$

Two approaches can be adopted to construct such an approximation network. In the first one, the network architecture (for a single measurand) is designed according to the pattern

$$x \rightarrow I_M \rightarrow H_N \rightarrow (\text{2nd hidden layer})_L \rightarrow O_1 \qquad (10)$$

where the size M of the input layer corresponds to the number of independent parameters affecting the measurand. The number of neurons in the hidden layer(s) depends on the required approximation error $\varepsilon$. The connection weights can be determined using explicit approximation formulas (Cardaliaguet and Euvrard, 1992).

The second approach consists in improving the approximation with each new sampling, as long as the successive inputs $x_i(v)$ are mutually linearly independent functions. The architecture of the network (Zaremba et al., 1993) can be described as

(11)

This type of network is less sensitive to sensor errors, and is particularly well suited to measurement systems where the sensor output is a continuous distribution of an analog signal over a planar domain. Such distributed signals are encountered in systems containing cameras and other array sensors.


The learning procedure involves the selection of the bias coefficients $x^0$ of the input layer neurons (12), and the calculation of the weight matrix W between layers I and H. Both procedures execute iteratively in steps k = 1, 2, ..., n. The choice of the bias coefficients is based on the condition of linear independence of the calibration values $y(v_k^0)$ with respect to $y_j(v_k^0)$, j < k, which occurs when

(13)

where U is the weight vector between layers H and O, and

$$u_j = y_j(v^0) \;\;\text{for } j = 1, \ldots, k-1; \qquad u_j = 0 \;\;\text{for } j = k, \ldots, n \qquad (14)$$

In this way, the current network can be used for testing $x^0$ and $v^0$ in step k. By analyzing the function in equation (13) for different levels of $x_k^0$, the best bias coefficient in neuron k can be chosen together with the best current calibration value. Thus, the method provides the user with ongoing insight into the approximation precision and with the means to determine an optimal selection of the successive training inputs. The calculation of the connection matrix W follows the rules (1)-(3), defined for continuous signals. Vector U corresponds to the vector of the calibration values (see (14)). The neural architecture (11) has been applied to the processing of far-field radiation patterns generated by a highly birefringent high-pressure polarimetric optical fibre sensor. After nine calibration steps the approximation errors were limited to a fraction of a percent.
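Since the exact forms of (11)-(13) are not reproduced here, the following sketch only illustrates the flavour of the second approach with a linear output stage: a new calibration sample is accepted only if it is linearly independent of those already accepted, and the weights are then recomputed so that the network reproduces the calibration values exactly at the calibration points. The rank test, the pseudoinverse solution, and the random data are all assumptions for illustration.

```python
import numpy as np

def incremental_calibration(samples, values, tol=1e-8):
    """Accept a calibration sample x(v_k) only if it is linearly
    independent of those already accepted, then recompute the weights
    so the output matches the calibration values exactly."""
    X, v = [], []
    for x, val in zip(samples, values):
        cand = np.array(X + [list(x)])
        # rank test = linear-independence condition on the new sample
        if np.linalg.matrix_rank(cand, tol=tol) == len(cand):
            X.append(list(x))
            v.append(val)
    X, v = np.array(X), np.array(v)
    w = np.linalg.pinv(X) @ v          # minimum-norm exact interpolant
    F = lambda x: np.asarray(x) @ w    # network output for a vector sample
    return F, X, v

rng = np.random.default_rng(0)
samples = rng.random((6, 4))           # vector samples x(v) from an array sensor
values = np.arange(6.0)                # calibrated measurand values
F, X, v = incremental_calibration(samples, values)
print(np.max(np.abs([F(x) for x in X] - v)))   # ~0: exact at calibration points
```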

5. CONCLUSIONS

Associative memory-type networks offer the advantage of fast learning, while maintaining a degree of generalization often sufficient for a broad class of manufacturing applications. Some examples of the use of such networks in machine vision, robotics, and measurement systems were given in this paper. A constructive method for the design of a function approximation network for the processing of distributed measurement signals was presented. Due to the complexity of 2-D sensor signal interpretation, neuromorphic processing appears to be a technology of choice for real-time measurement in distributed systems. Further work will concentrate on hybrid systems combining the capabilities of neural networks with Artificial Intelligence techniques. In terms of future hardware implementations, optical technologies are of particular interest.

6. ACKNOWLEDGMENTS

The work presented in this paper was supported by the Natural Sciences and Engineering Research Council, grant no. OGP 9227.

7. REFERENCES

Cardaliaguet, P. and G. Euvrard (1992). Approximation of a function and its derivative with a neural network. Neural Networks, 5, 207-220.
Caulfield, H.J., J. Kinser, and S.K. Rogers (1989). Optical neural networks. Proc. of the IEEE, 77, 1573-1583.
Funahashi, K. (1989). On the approximate realization of continuous mappings by neural networks. Neural Networks, 2, 3, 183-192.
Hornik, K., M. Stinchcombe, and H. White (1990). Universal approximation of an unknown mapping and its derivatives using multilayer feedforward networks. Neural Networks, 3, 551-560.
Kohonen, T. (1988). Self-Organization and Associative Memory. Springer-Verlag, New York.
Lee, L.S., H.M. Stoll, and M.C. Tackitt (1989). Continuous-time optical neural network associative memory. Opt. Lett., 14, 162-164.
Mao, Z.Q., D.R. Selviah, S. Tao, and J.E. Midwinter (1991). Holographic high order associative memory system. Proc. 3rd IEE Conf. on Holographic Systems, pp. 132-136.
Mok, F.H., M.C. Tackitt, and H.M. Stoll (1991). Storage of 500 high-resolution holograms in a LiNbO3 crystal. Opt. Lett., 16, 605-607.
Poelzleitner, W. and H. Wechsler (1990). Selective and focused invariant recognition using distributed associative memories (DAM). IEEE Trans. PAMI, 12, 8, 809-814.
Sutherland, J.G. (1992). The holographic neural method. In: Fuzzy, Holographic, and Parallel Intelligence (B. Soucek, Ed.), pp. 7-92. John Wiley & Sons.
Zaremba, M.B. and E. Porada (1992). Self-programming of neural networks for indexing and function approximation tasks. Appl. Math. and Comp. Sci., 2, 2, 251-276.
Zaremba, M.B., E. Porada, and W.J. Bock (1993). An approximation network for measurement systems. Proc. IEEE Conf. on Neural Networks, San Francisco, March 28 - April 1, 1993.