A Control Engineer's View of Artificial Intelligence


Copyright © IFAC Artificial Intelligence in Real-Time Control, Kuala Lumpur, Malaysia, 1997

Herbert E. Rauch

Lockheed Martin Advanced Technology Center (Dept. HI-6I, Bldg. 250), 3251 Hanover Street, Palo Alto, CA 94304, USA

[email protected]

ABSTRACT This paper looks at three areas in decision and control where the solution is improved through artificial intelligence. The first area is of historical interest (1975), where pattern recognition tools were used for process prediction and monitoring. The second area is current and shows how a particular neural network can improve process monitoring and control (with many more extensive applications). The third area is also current and lists some intelligent control applications to aerospace fault accommodation and flight control, with the specific example of nonlinear flight control using neural networks. Copyright © 1998 IFAC

1 Introduction

Complex systems need sophisticated control procedures to meet demanding performance requirements under adverse conditions. Conventional approaches often have difficulty meeting these stringent performance requirements. Intelligent control has been developed for challenging problems which cannot be solved by conventional approaches. Intelligent control uses a diverse collection of technologies and disciplines including pattern recognition, expert systems, neural networks, fuzzy logic, and genetic algorithms.

In an academic environment an engineer is taught how to solve relatively difficult problems in theory and relatively simple problems in practice. However, when working in an industrial environment, the engineer is required to solve problems which may be relatively straightforward in theory, but are quite difficult in practice. For example, in an academic environment, most decision and control problems start out with a given mathematical model. In many practical industrial problems the engineer starts with input-output data and there is no mathematical model. An important task is to develop a reasonable model. On the other hand, there may be a detailed nonlinear model, but terms in the model may be known imprecisely and may vary with time. An important task is then to develop a flexible approach to control.

For neural network applications see the June 1997 Special Issue of IEEE Transactions on Neural Networks on "Everyday Applications of Neural Networks", which contains 14 papers on neural network systems which are currently in practice [1]. Neural network control applications include a cold rolling mill process in a steel works, recovering from long term drift in a plasma etch process, hot dip galvanizing in a steel plant, and self calibration of a space robot. The April 1997 inaugural issue of IEEE Transactions on Evolutionary Computation has a survey paper as well as a paper on an adaptive evolutionary planner/navigator for mobile robots [2]. The April 1997 Special Issue of IEEE Control Systems Magazine on Intelligent Control contains five papers on control applications plus a sixth paper (with 173 references) which presents a systematic classification of neural-network-based control [3]. Two recent papers present overviews of intelligent fault diagnosis and autonomous control reconfiguration [4,5]. For an interesting treatment of "The History of Control" see the June 1996 Special Issue of IEEE Control Systems Magazine [6].

This paper presents a control engineer's view of three alternative approaches to decision and control. Section 2 presents material from the past, where the author and his colleagues used pattern recognition tools to develop models for process prediction and monitoring using input-output data [6-9]. Current process problems are similar although techniques are much improved. Section 3 discusses the Probabilistic Neural Network (PNN) for classification and the General Regression Neural Network (GRNN) for nonlinear modeling based on input-output data [10-12]. Equations for these two neural networks are summarized in the Appendix. These recent neural networks have extensive application and allow more precise determination of process models. Section 4 lists some recent aerospace applications of intelligent control [13-24], with emphasis on a nonlinear aircraft flight control application using neural networks [25]. It is hoped that revisiting the use of unconventional approaches in the past may give some added insight into recent developments.

2 Pattern Recognition Approach

Precise mathematical approaches to decision and control are successful when there are well defined models. When models are not that well known, then alternative approaches may prove useful. In many practical control systems there are a large number of variables. One of the first questions to ask is "How does a control engineer know which variables are important and which variables are not?" The control engineer must evaluate input-output data, and the evaluation is aided by engineering judgment. Twenty years ago the evaluation could also be aided by pattern recognition tools for exploratory data analysis [6-9]. This section presents one example where pattern recognition was applied to process monitoring, and one for non-destructive testing using ultrasonic imaging. Another example of process monitoring, not discussed here, was for manufacture of the space shuttle tiles [9].

It should be noted that a modern, sophisticated application of non-destructive testing using image analysis is described by Chang, et al. [26]. The classification system uses unified image processing and fuzzy-neural network methodology to classify cork stoppers into eight classes based on the degree of defects in the cork surface. Tests show a 6.7% rejection rate compared with the 40% rate provided by traditional systems.

2.1 Pattern Recognition Tools

A display tool for first level screening maps the multi-dimensional data into a two-dimensional plot. The desired projection tries to preserve as much as possible (in two dimensions) the distance between points in the multi-dimensional space. For example, 66 objects with sixteen measurements for each object are represented as vector points in sixteen-dimensional space. For projection purposes, the data is first normalized so the average of each variable is zero and the standard deviation is unity. A sample covariance matrix (16 by 16) is formed from the 66 objects. The eigenvectors of the symmetric covariance matrix represent (orthogonal) projections of the vectors in 16-dimensional space, and the associated eigenvalues represent the sample variance of these projections. The two-dimensional projection which has the maximum variance uses the two eigenvectors associated with the two largest eigenvalues.

The resulting two-dimensional projection of the 66 objects is shown in Figure 1 with each number representing a separate object. There are no labels on the horizontal and vertical axes; they represent projections onto the two eigenvectors associated with the largest eigenvalues. From prior information it is known that the 66 objects have been manufactured in three batches. The dashed lines drawn by hand show division of the 66 objects into the three batches: objects 1 through 25, objects 26 through 45, and objects 46 through 66. Visual inspection shows that two objects (number 14 in the upper left corner and number 17 in the lower right corner) could be outliers.
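The projection described in this section (normalize each variable, form the sample covariance matrix, project onto the two eigenvectors with the largest eigenvalues) could be sketched as follows. This is a minimal illustration, not the original 1975 software: the function name `project_to_plane` and the synthetic 66-by-16 data are assumptions introduced here.

```python
import numpy as np

def project_to_plane(data):
    """Return 2-D coordinates and the two largest eigenvalues.

    `data` has one row per object and one column per measurement.
    Assumes no column is constant (its standard deviation would be zero).
    """
    data = np.asarray(data, dtype=float)
    # Normalize each variable to zero mean and unit sample standard deviation.
    z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)
    # Sample covariance matrix of the normalized data (variables in columns).
    cov = np.cov(z, rowvar=False)
    # eigh is meant for symmetric matrices; eigenvalues come back ascending.
    eigvals, eigvecs = np.linalg.eigh(cov)
    top = np.argsort(eigvals)[::-1][:2]
    return z @ eigvecs[:, top], eigvals[top]

# Synthetic stand-in (hypothetical) for 66 objects with 16 measurements each.
rng = np.random.default_rng(0)
objects = rng.normal(size=(66, 16))
points, variances = project_to_plane(objects)
```

Each row of `points` is one object in the plane, so plotting the rows with their object numbers gives a display of the kind described for Figure 1. The sample variance of each plotted coordinate equals the corresponding eigenvalue, which is the property the text relies on.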


Assume that the first 25 objects, called class one (representing success), are compared with 15 objects called class two (objects 90 through 105, representing failure) as shown in Figure 2, using the same kind of two-dimensional plot as before. The object 26 inside the triangle represents a new object which is to be classified as success or failure. The dashed line drawn by hand encloses class two (representing failure).

One way to separate two classes in multi-dimensional space is by a plane. The Fisher linear discriminant finds the plane that best separates class one from class two (in the original 16-dimensional space), and then projects the 40 objects onto the vector normal to that plane. The line in Figure 3 shows the distance of each point from that plane (the Fisher projection), where the points in class one (success) are plotted above the line and the points in class two (failure) are plotted below the line. The new object appears to be in class one. The projection for the Fisher plot in Figure 3 is chosen to maximize the separation between class one and class two, and it has much better separation than the two-dimensional projection in Figure 2, which was chosen to maximize variance over all the points.

Current neural network techniques can be used to analyze this kind of multi-dimensional data. A recent adaptive neural network approach is discussed in Section 3.

2.2 Performance Prediction

The purpose of this analysis was (1) to obtain plausible explanations for the failure of tested manufactured items (which are igniters for rockets) and (2) to predict the reliability of items which have not yet been tested. The igniters are expensive and are destroyed during testing. Therefore it is desirable to test as few as possible. Experts in igniters had examined in detail the igniters exhibiting anomalous behavior, had developed a variety of theories, and concluded that no independent variable could, by itself, predict success or failure. It was hoped that the pattern recognition analysis could find some combination of measured variables which could be used for successful prediction. Successful prediction of the malfunctions could lead to a change in the manufacturing process. Also, if desired, igniters presumed defective might be recalled.

Igniters are made of materials which go through several processing procedures with a variety of ingredients added along the way to form the product. During manufacture, extensive measurements were taken of component characteristics, molding processes and material properties. The first goal of our effort was to acquire all the pertinent measurements and make them accessible to a computer data base. Gathering data is often the most difficult and time consuming part of the investigation. After diligent search we ended up with a data base consisting of 62 tested igniters, with 8 of the igniters exhibiting malfunctions during testing and up to 22 potential variables for each igniter. After setting up the data base, preliminary screening involved analysis of two-dimensional displays of the multi-dimensional data similar to Figs. 1 and 2 to help identify outlying points due to such things as incorrect transcribing of numbers. The next step was to form a histogram for each variable to see if (in spite of the experts' opinion) there was some single variable which could be used to distinguish the successes from the failures. Failing to do this, two-dimensional displays of selected multiple-variable sets were examined to see if there were visible natural groupings which might be used in the investigation.

Fisher linear discriminant plots, similar to Fig. 3, in combination with other related techniques were used to predict success or failure for a small number of untested igniters which were subsequently tested. Successful predictions for this small number of igniters resulted in increased confidence in our approach, and the Fisher linear discriminant techniques were used to rank the larger number of untested igniters. There were two distinct design sets, so the tested single lot igniters were used in ranking the untested single lot igniters, and similarly for the multiple lot igniters.

At the same time, additional related but independent work was carried out based on physical and mechanical considerations, as opposed to the Pattern Recognition analysis based strictly on numerical data. Engineering judgment was used to check, compare, and combine the information obtained from the related work with that from the Pattern Recognition analysis. Even though it took time before testing was completed and the ultimate efficacy of the Pattern Recognition analysis was evaluated, management felt there was considerable current value in the menu of different cuts at organizing and presenting the immense pile of otherwise unmanageable data. The decision makers were able to use the data structured by Pattern Recognition to gain insight into possible failure modes and future data collection requirements.
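The Fisher linear discriminant of Section 2.1, which this section uses to rank untested igniters, might be sketched as below. This is an illustration under stated assumptions rather than the procedure actually used: the function names, the midpoint threshold, and the synthetic data (25 "success" and 15 "failure" objects with 16 measurements) are all introduced here.

```python
import numpy as np

def fisher_direction(class_one, class_two):
    """Return the unit Fisher direction and a midpoint threshold."""
    a = np.asarray(class_one, dtype=float)
    b = np.asarray(class_two, dtype=float)
    mean_a, mean_b = a.mean(axis=0), b.mean(axis=0)
    # Pooled within-class scatter matrix (assumed nonsingular here).
    scatter = (a - mean_a).T @ (a - mean_a) + (b - mean_b).T @ (b - mean_b)
    # Direction that maximizes between-class over within-class separation.
    w = np.linalg.solve(scatter, mean_a - mean_b)
    w = w / np.linalg.norm(w)
    # Assumption: place the separating plane midway between projected means.
    threshold = 0.5 * (w @ mean_a + w @ mean_b)
    return w, threshold

def fisher_score(x, w, threshold):
    """Signed distance from the plane: positive leans toward class one."""
    return np.asarray(x, dtype=float) @ w - threshold

# Synthetic (hypothetical) data standing in for tested items.
rng = np.random.default_rng(1)
success = rng.normal(size=(25, 16)) + 1.0
failure = rng.normal(size=(15, 16))
w, threshold = fisher_direction(success, failure)

# Rank untested items from most success-like to most failure-like.
untested = rng.normal(size=(5, 16)) + 0.5
ranking = np.argsort(-fisher_score(untested, w, threshold))
```

Plotting `fisher_score` for each tested object gives a one-dimensional display in the spirit of Figure 3, and sorting the scores of untested objects gives a ranking of the kind described above.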

From two-dimensional displays similar to Figs. 1 and 2 we discovered there were significant differences in material properties between the earlier igniters and those which were manufactured later. Furthermore, the early igniters were each made from a single lot of raw material while the later igniters were made from two (or even three) lots of material. The change from single to multiple lots also coincided with a change in the manufacturing process. Using this information we divided the igniters into two sets which were called the single lot set and the multiple lot set.

2.3 Nondestructive Testing

This example involves a form of nondestructive testing using ultrasonic imaging. Here the problem is to quantify and formalize an intuitive procedure which has been used relatively successfully by acoustic experts in a laboratory environment. After the procedure is quantified and formalized, it can be implemented or automated using a computer in an on-line environment.

The overall purpose of the effort discussed here was to devise a method to evaluate the quality of material during its manufacture. Ultrasonic imaging was a candidate approach which had been used relatively successfully in the laboratory environment to rank or evaluate the quality of the subject material. In order to use the ultrasonic imaging approach, the measurement and design process must be formalized and quantified in a way that lends itself to on-line testing.


For the purpose of visual analysis by a human operator, ultrasonic data can be displayed on a photograph. The photograph shows the transmission of ultrasonic energy through a slice of material, with darker areas representing discontinuities with characteristic patterns which may contain anomalies. We start with an intuitive approach developed by the experts in ultrasonic imaging for evaluating the photographs. These experts demonstrated an ability to make valid judgments about the quality of the material, but they were not clear about how they arrived at their decisions. Hence, the first goal of our effort was to formalize the measurement and decision process used by the experts. A second goal was to show how the procedure might be automated.

After discussion with an ultrasonic imaging expert, a first cut description was prepared of significant features visible on the photograph of the material. We devised procedures for ranking the measurements related to these features so that these measurements could be made manually by a relatively untrained person. The final definition of the measurements was the result of an interactive process involving attempts to use the initial descriptions and additional discussions with the expert. Representative measurements include the location, size and number of features described as gradients, footprints, lines, wood grain structures and textured structures.

A combination of pattern recognition techniques was used to rank order a relatively small number of samples of the material for which test results were known. The Fisher linear discriminant was selected as the primary predictive tool, similar to Figure 3, while the other techniques were used to provide a consistency check. Because of a large number of possible measurements, and comparatively few test results, it was necessary to reduce the original set of 34 measurements to seven variables. This was accomplished by eliminating variables which failed to meet engineering judgment.

A final Fisher linear discriminant vector consisting of seven variables was derived and used to rank another small number of samples of the material before they were subjected to destructive testing. The ranking based on the pattern recognition techniques was essentially as successful in predicting the results of the destructive testing as that of the experts using intuitive judgment on ultrasonic images.

As an outcome of our work on this problem we were able to show that the use of pattern recognition techniques allowed us to: (1) replace expert intuitive judgment by manual judgment made by a relatively untrained person, and (2) use the relatively untrained human operator for testing in an on-line production environment. Further work needs to be done to see if the entire procedure can be automated.

3 Neural Network Approach

The Probabilistic Neural Network (PNN) is a feedforward neural network that can separate objects into two or more classes using Bayesian sample statistics from the training set [10-12]. The General Regression Neural Network (GRNN) is a related feedforward neural network that can be used to develop (non-linear) process models based on input-output vectors [11]. The basic form for both networks is characterized by one-pass learning and the use of the same width for the basis function for all dimensions in the learning space. The adaptive version of both networks is characterized by adapting separate widths for the basis function for each dimension [12]. The adaptive version can be used to eliminate unimportant variables from the decision process (to reduce errors). Because the adaptation is iterative, it sacrifices the one-pass learning, but it can achieve better accuracy. The implementation equations for both networks are summarized in the Appendix.

3.1 Probabilistic Neural Network

The probability calculations for the Probabilistic Neural Network are based on the assumption that each object represents a sample from the multi-dimensional probability space. Hence a probability distribution can be created from the samples. Bayesian prior information can be used to adjust the probability distribution.

The training data is represented as vector points in multi-dimensional space. The input variables are first normalized so that the standard deviation of each variable is unity. The smoothing parameter s, corresponding to the width of the basis function for each sample, is the one parameter which is adjusted. An extremely small value for s can lead to nearest neighbor classification, and an extremely large value can lead to a hyperplane surface determining classification. Fortunately, classification results are not that sensitive to reasonable values of s, and there are heuristic rules for choosing a near optimal value on the initial try.
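The basic PNN calculation of Section 3.1 might be sketched as follows, assuming Gaussian basis functions of width s centered on each training sample. The function name, the toy training set, and the option of passing one width per dimension (as in the adaptive version) are assumptions made for illustration; the inputs are taken to be already normalized.

```python
import numpy as np

def pnn_class_probabilities(x, train_x, train_labels, s=1.0, priors=None):
    """Estimate class probabilities for one input vector `x`.

    `s` may be a scalar (basic PNN) or one width per dimension.
    `priors`, if given, lists one prior per class in sorted label order.
    """
    train_x = np.asarray(train_x, dtype=float)
    train_labels = np.asarray(train_labels)
    x = np.asarray(x, dtype=float)
    widths = np.broadcast_to(np.asarray(s, dtype=float), x.shape)
    # One Gaussian basis function per training sample.
    diff = (train_x - x) / widths
    kernels = np.exp(-0.5 * np.sum(diff**2, axis=1))
    classes = np.unique(train_labels)
    # Average kernel response per class approximates the class density.
    density = np.array([kernels[train_labels == c].mean() for c in classes])
    if priors is not None:
        density = density * np.asarray(priors, dtype=float)
    # Note: if every kernel underflows to zero, this division yields NaN.
    return classes, density / density.sum()

# Toy (hypothetical) training set with two well separated classes.
train_x = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
train_labels = [0, 0, 0, 1, 1, 1]
classes, probs = pnn_class_probabilities([0.2, 0.1], train_x, train_labels, s=1.0)
```

Learning is one-pass in the sense that training only stores the samples; the smoothing parameter is the single quantity left to adjust.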

An interesting advantage is that a category is created which means the input has not been seen before. This category has particular meaning in many identification problems. For example, an airborne imager with 12 spectral channels was used to make ground images in the Sierra Mountains in California. The classification used one pixel samples (with 12 spectral bands or dimensions) to distinguish between clouds, snow, water, bare soil, and runoff, with no other contextual information. The training set consisted of 37 pixel samples for each category. Essentially all known areas were correct (including correct classification of snow and water in the shadow of clouds). The same neural network (without additional training) was given a new image to classify, taken near the Sierra Mountains over Mono Lake. The usual regions appeared to be correct but there were three large areas of unknown origin which turned out to be a region of lava, a small island covered with waste, and the asphalt of a highway. In this example, the classification scheme really

Key advantages of the Probabilistic Neural Network are that it does the best separation (in a probability sense) and that it can specify the probability that a new object should be associated with each class. Two recent reports from the National Institute of Standards and Technology (NIST) [27,28] show that the Probabilistic Neural Network has the lowest error rate with two difficult classification problems: recognition of fingerprints and recognition of handwritten numbers. With large databases, a disadvantage is that it requires one node for each training pattern, but clustering can be used to reduce the number of nodes in the network from one per sample to one per cluster center.

Adaptive PNN

The adaptive version of the Probabilistic Neural Network adjusts separate smoothing parameters s_i for each measured dimension. The adaptive version can be used for automatic feature selection with the price of increased training time. The adaptation uses gradient descent (varying the s_i one dimension at a time) with the holdout method for validation. Whenever the change in the s vector is too small to cause a change in classification accuracy, the sum of probabilities is used, which is a continuous criterion. The adaptation can also eliminate variables that have little positive effect on the classification. Fifteen cases are presented to show the improvement with the adaptive technique. For example, a data base called Aircraft Health Monitoring has 90 samples with 16 original features (dimensions) and basic PNN accuracy of 74%. The adaptive PNN reduces the number of features to 6 and increases the accuracy to 95%. A data base called Engine Misfire Prediction has 2520 samples with 4 original features and 77.5% accuracy with basic PNN. The adaptive PNN decreases the number of features to 3 and increases the accuracy to 87.4%. Improvement such as this might be achieved eventually through engineering judgment and trial and error, but the adaptive PNN does this automatically.

3.2 General Regression Neural Network

A variation on the Probabilistic Neural Network, called the General Regression Neural Network (GRNN), can be used to model a process with input-output data. An adaptive version of the General Regression Neural Network can eliminate unimportant variables, so it can determine which variables are important and which are not. With large databases, a disadvantage is that it requires one node for each training pattern, but clustering can be used to reduce the number of nodes in the network from one per sample to one per cluster center. For example, a data base called Phase Diversity 1 has 543 original samples and 245 original features (dimensions). After adaptation and clustering the number of samples is reduced to 264 (through clustering) and the number of features is reduced to 43, with the resulting root-mean-square (rms) error in prediction reduced by a factor of 4.
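A GRNN prediction can be sketched as a kernel-weighted average of the stored outputs, in line with the conditional-mean description given in the Appendix. The function name and the toy data are assumptions for illustration, and the per-dimension widths are only a stand-in for the adaptive version (the adaptation procedure itself is not shown).

```python
import numpy as np

def grnn_predict(x, train_x, train_y, s=1.0):
    """Conditional-mean estimate of the output for one input vector `x`.

    `s` may be a scalar (basic GRNN) or one width per input dimension.
    """
    train_x = np.asarray(train_x, dtype=float)
    train_y = np.asarray(train_y, dtype=float)
    x = np.asarray(x, dtype=float)
    widths = np.broadcast_to(np.asarray(s, dtype=float), x.shape)
    # Gaussian weight of each stored sample relative to the query point.
    diff = (train_x - x) / widths
    weights = np.exp(-0.5 * np.sum(diff**2, axis=1))
    # Normalizing constant: the sum of the weights over all samples.
    c = weights.sum()
    return (weights @ train_y) / c

# Toy (hypothetical) process: y = 2x sampled on a uniform grid.
grid = np.linspace(0.0, 1.0, 11).reshape(-1, 1)
outputs = 2.0 * grid[:, 0]
estimate = grnn_predict([0.5], grid, outputs, s=0.1)
```

Because the grid is symmetric about the query point and the process is linear, the estimate at 0.5 comes out at 1.0 up to rounding. Near the ends of the grid the same estimator would be biased toward the interior, which is one reason the choice of s matters.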

4 Aerospace Flight Control

4.1 Fault Detection and Flight Control

Two recent papers treat fault detection using neural networks. Da and Lin [13] use the state chi-square test and ARTMAP neural networks to determine when a failure occurs and where it is located. A simulation example treats soft failures in an integrated global positioning system/inertial navigation system. Napolitano, et al. [14] use one main neural network and a series of on-line learning neural networks, one for each sensor, to detect and identify sensor failures, and to replace the failed sensor signal by its estimate.

Three papers use neural networks for control of interceptor missiles. Cottrell, Vincent, and Sadati [15] synthesize a terminal guidance law for an interceptor kill vehicle that minimizes the incremental divert velocity along the trajectory. The closed loop guidance law, which is implemented as a neural network, performs better than augmented proportional navigation. Fu, et al. [16] propose an adaptive feedforward neural network for bank-to-turn missile autopilot design. A stable adaptive control law is derived using Lyapunov theory, with performance demonstrated through simulation. Geng and McCullough [17] use a fuzzy cerebellar model arithmetic computer (fuzzy CMAC) neural network in the design of advanced missile control systems.

Ha [18] uses a three-layer feedforward neural network to design a discrete time lateral-direction control law for a high performance aircraft. KrishnaKumar, et al. [19] use a genetic algorithm to optimize the attributes of a fuzzy logic controller, including parameters of the membership function and the rule structure. Simulation results for a wide-envelope F-18 longitudinal model show good robustness properties. Tseng and Chi [20] combine a neural network (to represent nonlinear characteristics) with a rule-based fuzzy logic controller to provide maximum achievable traction for an aircraft anti-lock brake system. Napolitano and Kincheloe [21] use an on-line learning neural network architecture to replace gain scheduling for the autopilot controller of a high-performance aircraft. The neural network training uses an extended back-propagation algorithm.

Sadhukhan and Feteih [22] obtain a decoupled response to pilot pitch rate and velocity commands using an exact inverse neural controller with full state feedback. The simulation on linearized longitudinal dynamics of the F8 aircraft shows ability to learn the inverse dynamics during the course of flight to respond to modeling uncertainty or battle damage. Balakrishnan and Biega [23] develop a dual neural network architecture, an action network and a critic network, for optimal control. Simulation of the longitudinal dynamics of an aircraft shows this approach yields optimal control over the entire range of training, so it can function as an autopilot. Lin and Maa [24] develop a self organizing fuzzy logic controller with two suites of fuzzy logic, one for control and one for learning. The simulation of the short period longitudinal model of the F-4E flight control system shows satisfactory performance under various flight conditions.

4.2 Nonlinear Flight Control

The remainder of this section describes work by Kim and Calise [25], who use feedback linearization with neural networks to implement aircraft flight control. The control of nonlinear systems by feedback linearization is well known, with a number of applications. The application to aircraft flight has included a number of actual flight tests. The application of feedback linearization discussed here uses two neural networks to implement aircraft flight control (in a simulation). The first neural network, trained off-line, implements the feedback linearization. The second neural network, adaptive on-line, compensates for discrepancies between the first neural network and the actual dynamics.

First, consider feedback linearization. Let the nonlinear dynamic system have the following form, where x is the state vector, x' and x'' are the first and second derivatives of the state vector, and d is the vector control variable (which for aircraft can be changes or deltas in the angles of the control surfaces).

$$x'' = f(x, x', d)$$

When x and d have the same dimension, it is called the square system. If the function f is invertible and if x and x' are measurable, the system can be transformed into the following system, where u is the pseudo control.

$$x'' = u$$

$$u = f(x, x', d)$$

The inverse transformation of the function f is expressed in the following form, where the notation f^{-1} indicates a function to be determined.

$$d = f^{-1}(x, x', u)$$

The first neural network (designated NN1) is used to implement this inverse transformation and produce an approximation (designated by g). The training can be conducted off-line, prior to flight, using input-output pairs generated from a mathematical model. If the function f is perfectly known, and if the realization g with the neural network is perfect, then the transformed system is linear. The second neural network (designated NN2) is used to implement adaptive control to stabilize the system when there are discrepancies between the desired and actual system. The discrepancy between the desired pseudo control (u_c, with subscript c) and the value of f based on the actual control (d_c) is represented by the notation D.

$$x'' = u_c + D$$

$$D = f(x, x', d_c) - u_c$$

The pseudo control u_c is composed of three parts: the desired change in state acceleration (designated x_c''), the proportional-derivative control (u_pd), and the adaptive control (u_ad). The proportional-derivative control has terms (with coefficients k_p and k_d) based on the difference between the desired state (x_c) and the actual state (x), which is designated by x*.

$$u_{pd} = k_p x^* + k_d x^{*\prime}$$

$$x^* = x_c - x$$

Substituting the pseudo control (including the expression for proportional-derivative control) into the nonlinear dynamic system and rearranging terms slightly gives the following dynamic system, with the driving term based on the difference between the adaptive control (u_ad) and the discrepancy (D).

$$x^{*\prime\prime} + k_p x^* + k_d x^{*\prime} = u_{ad}(x, x', u_c) - D$$

The adaptive control is composed of the sum of a set of scalar coefficients (represented by the vector w) multiplied by a set of basis functions (such as the neural network Radial Basis Functions). The basis functions (represented by the vector function b) depend on the state (x, x') and the desired pseudo control (u_c).

$$u_{ad} = w^T b(x, x', u_c)$$

It is assumed that the discrepancy D can also be represented by the sum of a different set of scalar coefficients multiplied by the same set of basis functions. The adaptive control continually monitors the vector error term e (composed of the vectors x* and x*') and modifies the scalar coefficients w so they more closely approximate (and cancel out) the discrepancy term. To prevent dither, there is a dead zone, so that if the error term is small, the adaptive coefficients are not changed. The rate of change of the adaptive coefficients (w') is given by the following, where g is the scalar step size term, e is the vector error term, and P is a positive definite matrix based on the system linear dynamics.

$$w' = -g\, e^T P\, b$$

The purpose of the first neural network (NN1) is to approximate inversion of the aircraft attitude dynamics. Careful examination of the dependence of aerodynamic moments on flight conditions allows development of an efficient topology to minimize the number of neurons and maintain accuracy, based on a combination of Radial Basis Functions and sigma-rho (polynomial) units. Radial Basis Functions (two-dimensional Gaussian) were used for two variables, Mach number and angle of attack; sigma-rho units were used for the other variables. The training data were generated at a variety of flight conditions from simulation software (4275 conditions for longitudinal motion, which has one control variable, effective elevator deflection, and 30,240 conditions for lateral motion, which has two control variables, differential tail deflection and rudder deflection).

The purpose of the second neural network (NN2) is to adapt to compensate for discrepancies in the actual dynamics. Radial Basis Functions (one-dimensional Gaussian) were used for one variable, Mach number, and sigma-rho units were used for the other three variables. There were 21 Radial Basis Functions and 40 sigma-rho units (5 times 4 times 2), so there were a total of 840 weights for adaptation. The (three) proportional-derivative controllers were chosen to have a 1% settling time of 0.5 seconds and a damping ratio of 0.707, so the adaptive controller must modify the weights rapidly to keep up with the traditional controller.
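The structure described in this section can be illustrated on a scalar toy problem. Everything specific in the sketch below is an assumption introduced for illustration: the plant `f`, the deliberately imperfect inverse `g_model` (standing in for NN1), the gains, the basis vector, and the identity matrix used for P. Two further assumptions are labeled in the code. The pseudo control is formed as x_c'' + u_pd - u_ad, a sign convention that makes the error dynamics be driven by u_ad - D. And because the summary writes the update as w' = -g e^T P b, the sketch contracts e^T P with an input vector (hypothetical name `b_in`) and scales the basis vector by the resulting scalar so that there is one rate per weight.

```python
import numpy as np

def f(x, xdot, d):
    """Hypothetical scalar plant: x'' = f(x, x', d)."""
    return -0.5 * xdot - x**3 + 2.0 * d

def f_inverse(x, xdot, u):
    """Exact inverse d = f^{-1}(x, x', u) of the plant above."""
    return (u + 0.5 * xdot + x**3) / 2.0

def g_model(x, xdot, u):
    """Imperfect inverse (stand-in for NN1): wrong damping coefficient."""
    return (u + 0.3 * xdot + x**3) / 2.0

def weight_rate(e, P, b_in, basis, gain, deadzone):
    """Assumed form of the update: w' = -gain * (e^T P b_in) * basis."""
    if np.linalg.norm(e) < deadzone:
        # Inside the dead zone the weights are left unchanged.
        return np.zeros_like(basis)
    return -gain * (e @ P @ b_in) * basis

# Current state, command, and (hypothetical) PD gains.
x, xdot = 0.4, -0.2
x_c, xdot_c, xddot_c = 1.0, 0.0, 0.0
kp, kd = 4.0, 2.8

e = np.array([x_c - x, xdot_c - xdot])   # error vector (x*, x*')
u_pd = kp * e[0] + kd * e[1]             # proportional-derivative control
basis = np.array([1.0, x, xdot])         # hypothetical basis functions
w = np.zeros(3)                          # adaptive weights, initially zero
u_ad = w @ basis
u_c = xddot_c + u_pd - u_ad              # assumed sign convention

# With the exact inverse the plant acceleration equals the pseudo control.
exact_accel = f(x, xdot, f_inverse(x, xdot, u_c))

# With the imperfect inverse a discrepancy D appears: x'' = u_c + D.
d_c = g_model(x, xdot, u_c)
D = f(x, xdot, d_c) - u_c

# One evaluation of the weight rate; P = I is a placeholder, not a solved
# Lyapunov equation.
w_dot = weight_rate(e, np.eye(2), np.array([0.0, 1.0]), basis,
                    gain=10.0, deadzone=0.01)
```

For this toy plant the discrepancy works out to D = -0.2 x', which is the kind of state-dependent error the adaptive term is meant to learn and cancel.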

A challenging maneuver was used to illustrate performance. The F-18 aircraft starts at a subsonic Mach number and reaches a supersonic value at the end, while starting with a roll rate command followed by a normal acceleration command, all at 100% throttle. The maneuver was simulated under nine conditions, with starting altitudes at 10,000 feet and 20,000 feet and starting speed between 0.5 and 1.0 Mach numbers. Simulations were performed with just the first neural network (NN1) and with both neural networks (NN1 and NN2) to illustrate the improved performance with the adaptive on-line control.


The measure of performance is the rms average over time of the error (in degrees) of the left stabilator deflection. With one neural network (NN1) the rms error ranged between 0.6 degrees and 2.1 degrees for the nine trials. With the adaptive control (NN1 and NN2) the rms error ranged between 0.015 degrees and 0.1 degrees, an improvement of an order of magnitude. The time histories were even more impressive, with the adaptive controller responding to changing conditions in less than a second. Two additional cases included flight outside the training region and flight with 30% loss of left stabilator effectiveness. In both these cases the adaptive control worked quite well, with rapid response time, while the single neural network had real difficulties.
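For reference, the rms measure used above can be computed from a sampled error history as in the following minimal sketch. It assumes uniformly spaced samples, and the function name `rms_error` and the sample values are hypothetical rather than data from the simulations.

```python
import numpy as np

def rms_error(error_deg):
    """Root-mean-square of a uniformly sampled error history, in degrees."""
    error_deg = np.asarray(error_deg, dtype=float)
    return float(np.sqrt(np.mean(error_deg ** 2)))

# Hypothetical stabilator deflection error samples (degrees).
example = rms_error([3.0, 4.0, 0.0, 0.0])
print(example)
```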

The authors draw three main conclusions from this work. First, feedback linearization is an alternative to gain scheduling. It holds the potential for simplifying the problem of flight control design for high performance aircraft. Second, with careful attention, it is possible to design neural networks that are capable of internally representing the inversion needed to make feedback linearization a viable approach to real-time implementation of full envelope flight control systems. Third, an on-line adaptive neural network is able to closely approximate perfect inversion without full knowledge of the uncertainties.

Appendix

This appendix presents equations for the Probabilistic Neural Network [PNN], the General Regression Neural Network [GRNN], and Radial Basis Functions [RBF].

The General Regression Neural Network is based on a training set of samples $X_i$ and $Y_i$ where $X_i$ is the i-th input vector and $Y_i$ is the i-th output vector. The calculations determine the conditional mean of the output vector $y$ based on the input vector $x$, where $s$ is a smoothing parameter, the summation is over all samples $i$, and the normalizing constant $c$ is obtained by summing over all samples.

$$y = \frac{1}{c} \sum_i Y_i \, p_i(x, X_i), \qquad c = \sum_i p_i(x, X_i)$$

$$p_i(x, X_i) = \exp\left[ -\frac{(x - X_i)^T (x - X_i)}{2 s^2} \right]$$

For this example the smoothing parameter $s$ is the same for all components of the vector $x$, but with the adaptive version, the smoothing parameter $s_j$ represents the value for the j-th component of the vector $x$.
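The General Regression Neural Network estimate described in this appendix (a kernel-weighted average of the training outputs, normalized by the sum of the kernel values) can be sketched in a few lines. This is a minimal illustration assuming a single smoothing parameter s shared by all input components; the function name `grnn_predict` and the toy data are hypothetical.

```python
import numpy as np

def grnn_predict(x, X, Y, s):
    """Conditional-mean estimate y = sum_i Y_i p_i / sum_i p_i.

    x : query input vector, shape (d,)
    X : training inputs, shape (n, d)
    Y : training outputs, shape (n, m)
    s : smoothing parameter (same for every component of x)
    """
    x = np.asarray(x, dtype=float)
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    diff = X - x
    p = np.exp(-np.sum(diff * diff, axis=1) / (2.0 * s ** 2))  # p_i(x, X_i)
    c = p.sum()                                                # normalizing constant
    return p @ Y / c

# Toy training set: three scalar inputs and outputs.
X = np.array([[0.0], [1.0], [2.0]])
Y = np.array([[0.0], [1.0], [4.0]])

y_sharp = grnn_predict([1.0], X, Y, s=0.1)    # small s: follows the nearest sample
y_smooth = grnn_predict([1.0], X, Y, s=1e6)   # very large s: tends to the mean of Y
```

The two calls illustrate the role of the smoothing parameter: a small s reproduces the nearest training output, while a very large s averages over all samples.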

The Probabilistic Neural Network starts with a training set of sample vectors $x_{ki}$ representing samples from a number of categories. The calculations first determine the unnormalized probability density for the k-th category at a sample vector point $x$ (designated $p^*_k(x)$), written as shown, where $x_{ki}$ is the vector representing the i-th training sample from the k-th category, $x$ is the new vector point being evaluated, $s$ is a smoothing parameter, the summation is over all samples $i$ in the k-th category, and the superscript $T$ represents the transpose of the vector $x$ so that $x^T x$ is a scalar. For this example the smoothing parameter $s$ is the same for all components of the vector $x$, but with the adaptive version, the smoothing parameter $s_j$ represents the value for the j-th component of the vector $x$. The normalized probability density for the k-th category is designated by $p_k$.

Estimation using Radial Basis Functions is also based on a training set of samples $X_i$ and $Y_i$ where $X_i$ is the i-th input vector and $Y_i$ is the i-th output vector. The calculations use basis functions (designated $b$) with the k-th center at location $x^*_k$. An associated constant $Y^*_k$ is calculated as the weight for the k-th center. The calculations determine the output vector $y$ based on the input vector $x$, where $s$ is a smoothing parameter (where usually $s_j$ represents the value for the j-th component of the vector $x$), the summation is over all samples $i$, and the normalizing constant $c$ is obtained by summing over all samples.

The constants $Y^*_k$ are the solution to a least squares problem that minimizes the difference between the estimates $Y$ and the outputs $Y_i$ at all the sample points. The $Y^*_k$ values are chosen to minimize

$$\sum_i \left[ Y_i - \sum_k Y^*_k \, b_k(X_i, x^*_k) \right]^T \left[ Y_i - \sum_k Y^*_k \, b_k(X_i, x^*_k) \right]$$
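The Probabilistic Neural Network densities and the Radial Basis Function least squares fit described in this appendix can be illustrated together, since both rest on the same Gaussian kernel. The sketch below makes several assumptions that are not stated in the text: the unnormalized density is taken as the sum of Gaussian kernels over the samples of a category, the normalized density is obtained by dividing by the sum over categories, the basis functions are unnormalized Gaussians, and a single smoothing parameter s is used. All function names and the toy data are hypothetical.

```python
import numpy as np

def gaussian_kernel(x, centers, s):
    """exp[-(x - c)^T (x - c) / (2 s^2)] for every row c of `centers`."""
    diff = np.asarray(centers, dtype=float) - np.asarray(x, dtype=float)
    return np.exp(-np.sum(diff * diff, axis=1) / (2.0 * s ** 2))

def pnn_densities(x, samples_by_category, s):
    """Assumed PNN form: p*_k(x) = sum_i kernel(x, x_ki); p_k = p*_k / sum_k p*_k."""
    p_star = np.array([gaussian_kernel(x, xk, s).sum() for xk in samples_by_category])
    return p_star, p_star / p_star.sum()

def rbf_fit(X, Y, centers, s):
    """Least squares weights Y*_k minimizing sum_i ||Y_i - sum_k Y*_k b_k(X_i, x*_k)||^2."""
    B = np.array([gaussian_kernel(xi, centers, s) for xi in X])  # (n_samples, n_centers)
    Y_star, *_ = np.linalg.lstsq(B, np.asarray(Y, dtype=float), rcond=None)
    return Y_star

def rbf_predict(x, centers, Y_star, s):
    """Estimate y = sum_k Y*_k b_k(x, x*_k) (no normalizing constant assumed)."""
    return gaussian_kernel(x, centers, s) @ Y_star

# PNN toy example: two categories in two dimensions.
category_a = np.array([[0.0, 0.0], [0.2, 0.1]])
category_b = np.array([[2.0, 2.0], [2.1, 1.9]])
p_star, p_norm = pnn_densities([0.1, 0.0], [category_a, category_b], s=0.5)

# RBF toy example: centers placed at the training inputs.
X = np.array([[0.0], [1.0], [2.0]])
Y = np.array([[0.0], [1.0], [4.0]])
Y_star = rbf_fit(X, Y, centers=X, s=0.7)
y_hat = rbf_predict([1.0], X, Y_star, s=0.7)
```

With the centers placed at the training inputs, the least squares fit reproduces the training outputs, whereas a kernel-weighted average of the outputs generally smooths them. A new point is assigned to the category with the largest normalized density.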

The "p" notation for the General Regression Neural Network and the "b" notation for Radial Basis Functions have been chosen to illustrate similarities and differences between the two approaches.

References

1. Special Issue on Everyday Applications of Neural Networks, IEEE Transactions on Neural Networks, June 1997.
2. IEEE Transactions on Evolutionary Computation, April 1997.
3. Special Issue on Intelligent Control, IEEE Control Systems Magazine, April 1997.
4. H. E. Rauch, "Intelligent Fault Diagnosis and Control Reconfiguration", IEEE Control Systems Magazine, June 1994.
5. H. E. Rauch, "Autonomous Control Reconfiguration", IEEE Control Systems Magazine, December 1995.
6. R. A. Hughes, D. M. Campbell, and K. Chew, "The Use of Pattern Recognition in the Validation, Processing, and Analysis of Test Data", Proceedings of 13th AIAA Aerospace Sciences Meeting, January 20-22, 1975 (AIAA Paper 75-88). An abbreviated version is published as a Synoptic under the title "Use of Pattern Recognition to Validate Test Data", AIAA Journal of Spacecraft, Vol. 12, No. 10, October 1975.
7. R. A. Hughes, H. E. Rauch, and M. A. Fisher, "Using Pattern Recognition in Engineering Analysis", AIAA/AAS Astrodynamics Conference, August 18-20, 1976 (AIAA Paper 76-802).
8. R. A. Hughes, M. A. Fisher, and H. E. Rauch, "Using Pattern Recognition in Product Assurance", Proceedings of 1977 Annual Reliability and Maintainability Symposium, January 18-20, 1977.
9. R. A. Hughes and H. E. Rauch, "Using Pattern Recognition on Space Shuttle Tiles", Proceedings of 32nd Annual Technical Conference of the American Society for Quality Control, May 8-10, 1978.
10. D. F. Specht, "Probabilistic Neural Networks", Neural Networks, Vol. 3, 1990.
11. D. F. Specht, "A General Regression Neural Network", IEEE Transactions on Neural Networks, November 1991.
12. D. F. Specht, "Probabilistic Neural Networks and General Regression Neural Networks", Chapter 3, Fuzzy Logic and Neural Network Handbook, C. Chen, Editor-in-Chief, McGraw-Hill, Inc., New York, 1996.
13. R. Da and C.-F. Lin, "Failure Diagnosis System Using ARTMAP Neural Networks", AIAA Journal of Guidance, Control and Dynamics, July-August 1995.
14. M. R. Napolitano, C. Neppach, V. Casdorph, S. Naylor, M. Innocenti, and G. Silvestri, "Neural Network-Based Scheme for Sensor Failure Detection, Identification, and Accommodation", AIAA Journal of Guidance, Control and Dynamics, November-December 1995.
15. R. Cottrell, T. Vincent, and S. Sadati, "Minimizing Interceptor Size Using Neural Networks for Terminal Guidance Law Synthesis", AIAA Journal of Guidance, Control and Dynamics, May-June 1996.
16. L.-C. Fu, W.-D. Chang, J.-H. Yang, and T.-S. Kuo, "Adaptive Robust Bank-to-Turn Missile Autopilot Design Using Neural Networks", AIAA Journal of Guidance, Control and Dynamics, March-April 1997.
17. Z. Geng and C. McCullough, "Missile Control Using Fuzzy Cerebellar Model Arithmetic Computer Neural Networks", AIAA Journal of Guidance, Control and Dynamics, May-June 1997.
18. C. Ha, "Neural Networks Approach to AIAA Aircraft Control Design Challenge", AIAA Journal of Guidance, Control and Dynamics, July-August 1995.
19. K. KrishnaKumar, P. Gonsalves, A. Satyadas, and G. Zacharias, "Hybrid Fuzzy Logic Flight Controller Synthesis via Pilot Modeling", AIAA Journal of Guidance, Control and Dynamics, September-October 1995.
20. H. Tseng and C. Chi, "Aircraft Antilock Brake System with Neural Networks and Fuzzy Logic", AIAA Journal of Guidance, Control and Dynamics, September-October 1995.
21. M. Napolitano and M. Kincheloe, "On-Line Learning Neural Network Controllers for Autopilot Systems", AIAA Journal of Guidance, Control and Dynamics, November-December 1995.
22. D. Sadhukhan and S. Feteih, "F8 Neurocontroller Based on Dynamic Inversion", AIAA Journal of Guidance, Control and Dynamics, January-February 1996.
23. S. Balakrishnan and V. Biega, "Adaptive-Critic-Based Neural Networks for Aircraft Optimal Control", AIAA Journal of Guidance, Control and Dynamics, July-August 1996.
24. C.-M. Lin and J.-H. Maa, "Flight Control System Design by Self-Organizing Fuzzy Logic Controllers", AIAA Journal of Guidance, Control and Dynamics, January-February 1997.
25. B. Kim and A. Calise, "Nonlinear Flight Control Using Neural Networks", AIAA Journal of Guidance, Control and Dynamics, January-February 1997.
26. J. Chang, G. Han, J. Valverde, N. Griswold, J. Duque-Carrillo, E. Sanchez-Sinencio, "Cork Quality Classification System using a Unified Image Processing and Fuzzy-Neural Network Methodology", IEEE Transactions on Neural Networks, June 1997.
27. G. Candela and R. Chellappa, "Comparative Performance of Classification Methods for Fingerprints", U.S. Department of Commerce, Report NISTIR 5163, April 1993. Email jeny@ magi.ncsl.nist.gov.
28. P. Grother and G. Candela, "Comparison of Handprinted Digit Classifiers", U.S. Department of Commerce, Report 5209, May 1993.

Fig. 1 Two-Dimensional Projection of Objects From Sixteen Dimensional Space. Plot label: new object to be classified.

Fig. 2 Two-Dimensional Projection of Objects From Class One (Success) and Class Two (Failure). Plot labels: new object inside triangle; Class One (Success) outside dashed line; Class Two (Failure).

Fig. 3 Fisher Linear Discriminant Plot to Separate Objects From Class One (Success) and Class Two (Failure). Plot labels: Class One (Success); Class Two (Failure).