Aerospace Science and Technology 93 (2019) 105325
Aerospace Science and Technology www.elsevier.com/locate/aescte
Morphing aircraft control based on switched nonlinear systems and adaptive dynamic programming
Qing Wang a,∗, Ligang Gong a,b, Chaoyang Dong c, Kewei Zhong d
a School of Automation Science and Electrical Engineering, Beihang University, Beijing, 100191, China
b Science and Technology on Space Intelligent Control Laboratory, Beijing Institute of Control Engineering, Beijing 100190, China
c School of Aeronautic Science and Engineering, Beihang University, Beijing, 100191, China
d Shanghai Aerospace Control Technology Institute, Shanghai, 201100, China
Article info
Article history: Received 16 April 2018; Received in revised form 3 July 2019; Accepted 1 August 2019; Available online 6 August 2019
Keywords: Morphing aircraft; Switched nonlinear systems; Backstepping; Adaptive dynamic programming; Extended state observer
Abstract
This paper investigates the control problem of a morphing aircraft with variable sweep wings based on switched nonlinear systems and adaptive dynamic programming (ADP). The longitudinal altitude motion of the morphing aircraft is first modeled as switched nonlinear systems in lower triangular form. The designed controller then comprises a basic part and a supplementary part. For the basic part, the backstepping technique is applied and a modified dynamic surface is introduced to overcome the ‘explosion of complexity’ problem. Disturbance observers inspired by the idea of the extended state observer are designed to estimate the internal uncertainties and external disturbances. The common virtual control laws of the backstepping method are developed from the disturbance observers and radial basis function neural networks. For the supplementary part, an ADP approach called action-dependent heuristic dynamic programming is used to further decrease the altitude tracking error; it generates an additional control input by observing the differences between the actual and desired values in the backstepping design. Finally, comparative simulations are conducted to demonstrate the improved control performance of the proposed approach. © 2019 Elsevier Masson SAS. All rights reserved.
1. Introduction

A morphing aircraft can change its aerodynamic shape to adapt to multipoint mission environments, which resolves flight performance conflicts between high-speed and low-speed regimes [1–3]. The morphing capability is mainly provided by the wing transition, which improves flight performance during the mission. However, the aircraft may become unstable due to large variations in mass distribution and applied aerodynamic forces during the morphing process, as well as internal uncertainties and external disturbances [4,5]. The complexity of the wing transition process poses great challenges to control system design and has received considerable attention in recent years.

Existing research on morphing aircraft control mainly concentrates on the longitudinal motion. The Jacobian linearization approach is adopted in [6] to obtain a linear parameter varying (LPV) model of the longitudinal dynamics of a folding-wing aircraft. The dynamic responses of the LPV model are consistent with the nonlinear dynamic equations. Based on the LPV model,
∗ Corresponding author. E-mail address: [email protected] (Q. Wang).
https://doi.org/10.1016/j.ast.2019.105325 1270-9638/© 2019 Elsevier Masson SAS. All rights reserved.
a combination of a gain-scheduled outer-loop controller and a classical inner-loop controller is further designed in [7] to ensure robustness to uncertainties. Similarly, the nonlinear dynamic equations of a gull-wing aircraft are linearized in [8] to establish an LPV model. The H∞ attitude controller is then obtained by state feedback and feedforward. The authors in [9] investigate a new LPV model via the tensor product modeling method for a morphing aircraft undergoing large changes of aerodynamic configuration. A related LPV controller is then devised on the basis of linear matrix inequalities. Sliding mode controllers have also been designed to deal with uncertainties. In [10], the sliding mode LPV controller synthesis is studied, where an adaptive reaching law is applied to ensure stability and robustness.

Since the methods in [6–10] are difficult to apply when the nonlinear characteristics of the morphing aircraft dynamics are considered, several nonlinear control methodologies have also been developed. In [11], the aircraft motion equations are transformed into normal form to simplify the analysis, where a high order chained differentiator is combined with neural networks (NNs) to determine the control input. In [12], adaptive dynamic surface control is proposed by using NNs to handle system uncertainties, and the input-output constraints are considered based on a barrier Lyapunov function. In [13], a smooth switching strategy is integrated
into the adaptive dynamic surface control to relax the constraints on the active region of the NNs, and a transformed tracking error is used to guarantee the prescribed performance. In [14], a nonlinear time-varying model is established for a telescopic-wing morphing aircraft, where the asymmetric wing telescoping is regarded as a part of the control input and a multi-loop sliding mode controller is then designed. In addition, research on moving mass control technology is surveyed in [15] with a view to improving both dynamics and control properties; this technology can be employed as a novel tool for morphing aircraft structures and can improve control performance via regulation of the internal moving masses. The coupled attitude and servo actuator dynamics are investigated in [16] for a moving mass flight vehicle, which provides guidelines on designing parameters to guarantee stability and high performance. Both [15] and [16] can be further exploited to analyze the coupled dynamics and related control methods for morphing aircraft when moving mass control technology is applied.

Despite the aforementioned progress in nonlinear control of morphing aircraft, the dramatic variations in mass distribution and aerodynamic forces during the morphing process are not treated appropriately in controller design. These variations are either regarded as unknown nonlinearities approximated by NNs in [11–13], or assumed completely known and canceled out in [14]. Both the approximation and the exact modeling of the variations are complicated due to their time-varying characteristics. To reduce this complexity, switched systems provide a practicable framework for modeling morphing aircraft dynamics that have switching features. Switched systems consist of a collection of subsystems and a switching law governing the specific active subsystem.
In [17], uncertain switched linear systems are adopted for modeling the longitudinal dynamics of a morphing aircraft with variable sweepback wings based on the Jacobian linearization approach along the transition trajectory. The finite-time boundedness of the systems is ensured by robust state feedback control design. The modeling capability of switched linear systems is further enlarged in [18,19], where switched LPV models of the morphing aircraft are established in the continuous and discrete domain, respectively. The authors in [18] propose smooth switching LPV controllers with overlapped scheduled parameter subsets, while the authors in [19] consider the non-fragile H ∞ control in the presence of data missing between the sensors, controllers and actuators. Based on the same switched LPV modeling approach in [19], the asynchronous switching between the controllers and subsystems is further investigated in [20] for finite-time H ∞ controller design. However, the above switched systems based approaches require linearization of the nonlinear dynamics, which may degrade control performance when the nonlinearities are severe. In view of this, switched nonlinear systems can be utilized as a substitution for modeling the morphing aircraft dynamics. Although the theoretical researches on stability analysis and controller design for switched nonlinear systems have been extensively explored in the past few years [21–25], to the best of the authors’ knowledge, the related applications of switched nonlinear systems to morphing aircraft control are rare. It should be mentioned that most of the previous efforts on morphing aircraft control deal with internal uncertainties and external disturbances via robustness of the designed controller. Meanwhile, the extended state observer (ESO) provides an effective disturbance rejection technique, which views the internal uncertainties and external disturbances as an extended state. 
The ESO can be used as a disturbance observer and is the key part of the active disturbance rejection control (ADRC) strategy, where the extended state is estimated by the ESO and then canceled out in real time by the controller. The ESO and ADRC schemes have found numerous engineering applications such as power converters [26], spacecraft system [27], underwater vehicles [28], etc. The authors
in [29–31] provide rigorous proofs of the convergence of ESO and ADRC, but the systems considered are in chained or lower triangular form, and there are few related results on ESO for switched nonlinear systems.

On the other hand, adaptive dynamic programming (ADP) has been studied as a remarkable technique for optimal control problems during the past decade. ADP is a data-driven control approach, where two NNs called the critic network and the action network are adopted to estimate the optimal performance index and the control input, respectively. In [32], the global adaptive optimal control of nonlinear polynomial systems is addressed by the proposed ADP methodology, where a relaxed policy iteration method is developed. In [33], a novel value iteration ADP scheme which eliminates the zero initial condition is proposed for undiscounted optimal control of nonlinear systems. In [34], a modified ADP architecture with integration of a filter-based action network is investigated for tracking control of nonlinear systems. Besides, the ADP scheme has been applied to many engineering problems including hypersonic vehicle tracking control [35], battery energy management [36], missile guidance [37] and so forth. In particular, the authors in [35] design a control strategy by combining sliding mode control with action-dependent heuristic dynamic programming (ADHDP), where the ADHDP outputs a supplementary control input to improve the control performance.

Motivated by the above discussions, this study deals with morphing aircraft control on the basis of switched nonlinear systems and ADP. Specifically, the longitudinal altitude motion of a morphing aircraft with variable sweep wings is modeled as switched nonlinear systems in lower triangular form. The designed controller is decomposed into a basic part and a supplementary part.
The backstepping technique is adopted to design the basic part, where disturbance observers inspired from the idea of ESO are designed to estimate the internal uncertainties and external disturbances. The common virtual control laws are obtained via a combination of the disturbance observers and radial basis function (RBF) NNs. Besides, the ‘explosion of complexity’ problem is avoided by the introduction of a modified dynamic surface at each step. Moreover, for the supplementary part, the ADHDP approach is used considering the online-learning merit, which generates an additional control input on account of the tracking errors of the basic part. The utilization of ADHDP can further decrease altitude tracking error. Finally, the proposed control methodology is compared with that contains the basic part only to show the improved control performance. The rest of the paper is organized as follows. The modeling of morphing aircraft by switched nonlinear systems is addressed in Section 2. The basic part of the controller based on the switched nonlinear systems and the supplementary part of the controller based on the ADHDP approach are described in Section 3. Comparative simulation studies are depicted in Section 4, and final conclusion completes this paper. 2. Model description We here first provide the nonlinear model of the morphing aircraft for description of the altitude motion, and the established nonlinear model is then further simplified to facilitate the controller design. 2.1. Modeling of the aircraft motion The altitude motion model of a morphing aircraft with variable sweep wings can be described as
$$
\begin{cases}
\dot h = V\sin\gamma,\\[2pt]
\dot\gamma = \dfrac{L + T\sin\alpha - F_{Ikz}}{mV} - \dfrac{g}{V}\cos\gamma,\\[6pt]
\dot\alpha = -\dfrac{L + T\sin\alpha - F_{Iz}}{mV} + \dfrac{g}{V}\cos\gamma + q,\\[6pt]
\dot q = \dfrac{-\dot I_y q - S_x g\cos\theta + M_A + T Z_T + M_{Iy}}{I_y},
\end{cases}
\tag{1}
$$

where $h$, $\gamma$, $\alpha$ and $q$ denote the altitude, flight path angle, angle of attack and pitch rate, respectively. $V$ is the velocity, $m$ and $I_y$ are the aircraft mass and inertia, $T$ is the engine thrust, $Z_T$ is thrust moment arm, $g$ is the acceleration of gravity, $L$ and $M_A$ are the lift and pitch moment, $F_{Ikz}$, $F_{Iz}$ and $M_{Iy}$ represent inertial forces and moment related to morphing process which can be written as

$$F_{Iz} = F_{Ikz} = S_x(\dot q\cos\alpha - q^2\sin\alpha) + 2\dot S_x q\cos\alpha + \ddot S_x\sin\alpha,$$

$$M_{Iy} = S_x(\dot V\sin\alpha + V\dot\alpha\cos\alpha - Vq\cos\alpha)$$

with $S_x$ being the static moment distribution in x axis of the body frame. The related definitions are given as follows:

$$
\begin{aligned}
S_x(\xi) &\approx 2m_1 r_{1x} + m_3 r_{3x},\\
L &\approx L_\alpha(\xi)\alpha + L_0(\xi) = \bar q S_w(\xi)\big(C_{L\alpha}(\xi)\alpha + C_{L0}(\xi)\big),\\
M_A &\approx M_{A\alpha}(\xi)\alpha + M_{A0}(\xi) + M_{A\delta_e}(\xi)\delta_e + M_{Aq}(\xi)\frac{q c_A(\xi)}{2V}\\
&= \bar q S_w(\xi) c_A(\xi)\Big(C_M^{\alpha}(\xi)\alpha + C_M^{0}(\xi) + C_M^{\delta_e}(\xi)\delta_e + C_M^{q}\frac{q c_A(\xi)}{2V}\Big),
\end{aligned}
$$

where $m_1$ and $m_3$ denote the mass of aircraft’s wing and fuselage, $r_{1x}$ and $r_{3x}$ denote the positions of corresponding components in the body frame, $\bar q = \tfrac{1}{2}\rho_h V^2$ denotes the dynamic pressure, $\rho_h$ denotes the air density, $\xi$ denotes the sweep angle, $S_w(\xi)$ denotes the wing surface, and $\delta_e$ denotes the elevator angle. The other related definitions can be found in [11,12]. It can be seen that the original motion model of the morphing aircraft can be regarded as a nonlinear parameter varying system while a lack of theoretical basis confines the direct controller design. In order to handle this issue, the nonlinear parameter varying system is further simplified to be switched nonlinear systems in lower triangular form for controller design with reasonable assumptions on the flight path angle and sweep angle.

Assumption 1. The flight path angle $\gamma$ is small, i.e. $\sin\gamma \approx \gamma$ [11–13].

From Eq. (1), when the sweep angle $\xi$ takes a series of values within the available range, the related altitude dynamics can be rewritten as

$$
\begin{cases}
\dot h = V\gamma,\\
\dot\gamma = f_{2,\sigma(t)} + g_{2,\sigma(t)}\theta + d_\gamma,\\
\dot\theta = q,\\
\dot q = f_{4,\sigma(t)} + g_{4,\sigma(t)}\delta_e + d_q,
\end{cases}
\tag{2}
$$

where

$$
\begin{aligned}
f_{2,\sigma(t)} &= \frac{L_{0,\sigma(t)} - L_{\alpha,\sigma(t)}\gamma}{mV} - \frac{g}{V}\cos\gamma, &
g_{2,\sigma(t)} &= \frac{L_{\alpha,\sigma(t)}}{mV}, &
d_\gamma &= \frac{T\sin\alpha - F_{Ikz}}{mV} + d'_\gamma,\\
f_{4,\sigma(t)} &= \frac{T Z_T + M_{A\alpha,\sigma(t)}\alpha + M_{A0,\sigma(t)}}{I_{y,\sigma(t)}} + \frac{M_{Aq,\sigma(t)}\, q\, c_{A,\sigma(t)}}{2 I_{y,\sigma(t)} V}, &
g_{4,\sigma(t)} &= \frac{M_{A\delta_e,\sigma(t)}}{I_{y,\sigma(t)}}, &
d_q &= \frac{M_{Iy} - \dot I_y q - S_x g\cos\theta}{I_y} + d'_q,
\end{aligned}
$$

$\sigma(t): [0,+\infty) \to S_M$ represents the switching signal related to the sweep angle $\xi$, and $S_M = \{1, 2, \cdots, M\}$ denotes the finite index set of the aircraft configurations for certain sweep angles. $d'_\gamma$ and $d'_q$ denote the equivalent disturbances caused by model uncertainties, exogenous perturbations, etc.

Remark 1. The flight path angle is assumed to be small such that $\sin\gamma \approx \gamma$, which facilitates the backstepping design process since the derivation of the virtual control law for $\gamma$ and the related stability analysis are thus simplified. We adopt this assumption along the line of extant references on longitudinal motion control of aircraft including [11–13]. However, the assumption on the flight path angle $\gamma$ is valid only when $\gamma$ is very small, which cannot be guaranteed via our method. We acknowledge that the flight path angle $\gamma$ may vary within a certain range and the controller performance could drop in highly nonlinear regions where $\gamma$ takes big values. We demonstrate the rationality of this assumption by simulation at present and a further analysis on the limits of the longitudinal channel of the system will be the topic of our future research. Besides, the switched nonlinear systems are adopted for controller design while the real motion model can be indeed regarded as a nonlinear parameter varying system. It is noted that both the controller design and stability analysis are based on switched nonlinear systems while the simulation model of the aircraft takes the nonlinear parameter varying form. The conservativeness of the controller is thus introduced since the models for description of motion and controller design are different. The direct controller design for the nonlinear parameter varying system is also an open problem worthy of further study.

2.2. Problem description
This paper addresses the altitude tracking control problem for a morphing aircraft with variable sweep wings. Morphing introduces large variations in mass distribution and aerodynamic forces, which are reflected by the nonlinear system model in Eq. (1). The specific expressions of the related terms are also provided, and it is noteworthy that a direct design of the controller for the nonlinear parameter varying system is very difficult. We note that the morphing effect is not handled appropriately in most of the existing studies, which are conducted within the framework of general nonlinear systems or switched linear systems. On one hand, the physical and aerodynamic variations of the aircraft are not exploited for general nonlinear systems, where they are either estimated by neural networks or assumed known and canceled out. The designed controller is hence complicated due to the time-varying characteristics of the variations. On the other hand, the linearization of the aircraft motion model is essential for generating switched linear systems and causes conservativeness of the devised controller in the presence of severe nonlinearities. To handle this issue, switched nonlinear systems are adopted to further simplify the original system model and facilitate the controller design. The switched nonlinear systems can also reflect the influence of morphing through the switching of a series of subsystems. In addition, the morphing aggravates the inconsistency between the model employed for motion simulation and the model adopted for controller design. As a result, altitude tracking for the morphing aircraft is more susceptible to disturbances, which makes the controller design a challenging problem and constitutes the main purpose of our investigation.
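The switching structure of Eq. (2) can be illustrated with a short sketch. It is only a reading aid under stated assumptions: the sweep-angle breakpoints, every numerical coefficient, and the names `switching_signal` and `altitude_rhs` are hypothetical placeholders that do not come from the paper, and the standard relation α = θ − γ is assumed for the angle of attack.

```python
import bisect
import math

G = 9.81  # gravitational acceleration, m/s^2

# Hypothetical sweep-angle breakpoints (deg) splitting the sweep range
# into M = 3 configurations; these values are not taken from the paper.
SWEEP_BREAKPOINTS = [20.0, 40.0]

# Placeholder coefficients for each subsystem index in S_M = {1, 2, 3}.
SUBSYSTEMS = {
    1: dict(L_alpha=9.0e4, L_0=2.0e3, M_alpha=-4.0e4, M_0=1.0e3,
            M_q=-6.0e4, M_delta=-3.0e4, c_A=1.2, I_y=8.0e3),
    2: dict(L_alpha=8.0e4, L_0=1.8e3, M_alpha=-3.5e4, M_0=0.9e3,
            M_q=-5.5e4, M_delta=-2.8e4, c_A=1.1, I_y=8.5e3),
    3: dict(L_alpha=7.0e4, L_0=1.5e3, M_alpha=-3.0e4, M_0=0.8e3,
            M_q=-5.0e4, M_delta=-2.5e4, c_A=1.0, I_y=9.0e3),
}


def switching_signal(xi_deg):
    """Map the sweep angle to the active subsystem index sigma in S_M."""
    return 1 + bisect.bisect_right(SWEEP_BREAKPOINTS, xi_deg)


def altitude_rhs(state, delta_e, xi_deg, V, m=900.0, thrust=4.0e3,
                 z_T=0.1, d_gamma=0.0, d_q=0.0):
    """Right-hand side of Eq. (2) for state = (h, gamma, theta, q)."""
    h, gamma, theta, q = state
    p = SUBSYSTEMS[switching_signal(xi_deg)]
    alpha = theta - gamma  # assumed relation between the angles
    f2 = (p["L_0"] - p["L_alpha"] * gamma) / (m * V) - G / V * math.cos(gamma)
    g2 = p["L_alpha"] / (m * V)
    f4 = ((thrust * z_T + p["M_alpha"] * alpha + p["M_0"]) / p["I_y"]
          + p["M_q"] * q * p["c_A"] / (2.0 * p["I_y"] * V))
    g4 = p["M_delta"] / p["I_y"]
    return (V * gamma,                   # h_dot
            f2 + g2 * theta + d_gamma,   # gamma_dot
            q,                           # theta_dot
            f4 + g4 * delta_e + d_q)     # q_dot
```

Only the active coefficient set changes when the sweep angle crosses a breakpoint, while the lower triangular structure of the four equations stays the same, which is the property the backstepping design relies on.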
Fig. 1. Decomposition of the controller.

The problem of our study can be stated as follows: design a controller based on the developed switched nonlinear systems such that the altitude tracks a given reference signal. In this study, the RBF NNs are utilized to approximate the nonlinear functions in Eq. (2). By the universal approximation property of the NNs, for a given constant $\bar\delta > 0$ and any continuous function $f(Z)$ defined on a compact set $\Omega_Z \subset \mathbb{R}^n$, there exists an RBF NN $W^T S(Z)$ such that $f(Z)$ can be expressed as

$$f(Z) = W^T S(Z) + \delta(Z), \qquad |\delta(Z)| \le \bar\delta, \tag{3}$$

where $Z \in \Omega_Z \subset \mathbb{R}^n$ denotes the input vector, $W = [w_1, w_2, \cdots, w_{\bar l}]^T \in \mathbb{R}^{\bar l}$ denotes the ideal weight vector, $\bar l > 0$ is the NN node number, and $\delta(Z)$ denotes the approximation error. $S(Z) = [s_1(Z), s_2(Z), \cdots, s_{\bar l}(Z)]^T$ denotes the basis function vector with $s_i(Z)$ being the following commonly used Gaussian function:

$$s_i(Z) = \exp\left(-\frac{(Z - \mu_i)^T (Z - \mu_i)}{\zeta_i}\right), \qquad i = 1, 2, \cdots, \bar l \tag{4}$$

with $\mu_i = [\mu_{i1}, \mu_{i2}, \cdots, \mu_{in}]^T$ being the center of the receptive field, and $\zeta_i$ being the width of the Gaussian function. The ideal weight vector $W$ in Eq. (3) is defined as

$$W := \arg\min_{\hat W \in \mathbb{R}^{\bar l}} \left\{ \sup_{Z \in \Omega_Z} \left| f(Z) - \hat W^T S(Z) \right| \right\}, \tag{5}$$

where $\hat W$ denotes the estimation of $W$. Besides, we recall the next lemma for handling the estimations of parameters before designing the controller.

Lemma 1. The following inequality holds for any $x \in \mathbb{R}$ and $\epsilon > 0$ [38]:

$$0 \le |x| - x\tanh\left(\frac{x}{\epsilon}\right) \le \epsilon', \qquad \epsilon' = 0.2785\,\epsilon. \tag{6}$$
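As a small numerical illustration of the Gaussian basis in Eq. (4) and of the bound in Lemma 1, the following sketch may help. The function names, the argument `eps` for the positive constant in the lemma, and the sample centers and widths in the usage note are assumptions made for illustration only.

```python
import math


def rbf_basis(z, centers, widths):
    """Basis vector S(Z) of Eq. (4): s_i = exp(-|Z - mu_i|^2 / zeta_i)."""
    return [
        math.exp(-sum((zj - mj) ** 2 for zj, mj in zip(z, mu)) / zeta)
        for mu, zeta in zip(centers, widths)
    ]


def rbf_output(weights, z, centers, widths):
    """Network output W^T S(Z) appearing in Eq. (3)."""
    return sum(w * s for w, s in zip(weights, rbf_basis(z, centers, widths)))


def tanh_gap(x, eps):
    """Left-hand quantity of Lemma 1: |x| - x * tanh(x / eps)."""
    return abs(x) - x * math.tanh(x / eps)
```

For example, `rbf_basis([0.0, 0.0], [[0.0, 0.0], [1.0, 0.0]], [1.0, 1.0])` evaluates the first node at its own center and the second node one unit away from its center, and `tanh_gap(x, eps)` can be scanned over a grid of `x` to confirm that it stays between 0 and `0.2785 * eps`.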
Remark 2. To summarize, we focus on employing an appropriate modeling approach for the morphing aircraft and devising a related controller design method. The advantage of the proposed technique is verified by comparative simulation studies for altitude tracking control. The capability to manage changes in the aerodynamic parameters is for the sake of evaluating the effectiveness of the control procedure and not the unique merit of our method. As a result, we adopt switched nonlinear systems for establishment of the altitude motion model that facilitate the controller design. Correspondingly, a combination of the backstepping method and ADHDP technique is proposed to develop the control input. Remark 3. The original motion model of the morphing aircraft can be regarded as a nonlinear parameter varying system while a lack of theoretical basis confines the direct controller design. In order to handle this issue, the nonlinear parameter varying system is further simplified to be switched nonlinear systems in lower triangular form for controller design with reasonable assumptions on the flight path angle and sweep angle. The switched nonlinear systems retain the morphing characteristics of the aircraft via the switching of subsystems. On the other hand, the controller is devised based on the switched nonlinear systems and the related stability analysis of the closed-loop system is also provided. For the basic part controller via the backstepping technique, it is noted that the difficulty lies in the switched systems and the continuous states. The related common virtual laws and the final control input are developed by an employment of neural networks to circumvent this difficulty. However, the focus of backstepping design is stability and the control performance is neglected, which motivate the design of supplementary part controller from the perspective of control performance. 
Fig. 2. Structure of the control scheme.

As a result, a supplementary part is further applied via the ADHDP method considering its online-learning merit. So far we have established the switched nonlinear system and defined the related control problem. Since the main idea of this paper is the combination of backstepping technique and the ADHDP method, the designed controller is also divided into two components which are designed via these two schemes, respectively. It
is noted that the backstepping technique can guarantee stability of the closed-loop system while the ADHDP method can expedite the control performance. However, the ADHDP method cannot ensure stability during the learning process. As a result, we devise the controller via superposition and the final control input is the sum of these two components. Specifically, the basic part component is generated by the backstepping technique and the supplementary part component is devised by the ADHDP method. In other words, the basic part component provides an environment where the learning process of the ADHDP method will not violate stability of the closed-loop system. The decomposition of the controller is depicted in Fig. 1. 3. Controller design The designed control input δe in this Section is split into two parts: the basic part δe1 and supplementary part δe2 , i.e., δe = δe1 + δe2 . For the basic part, the backstepping approach is applied to the switched nonlinear systems established in Section 2, where the internal uncertainties and external disturbances are estimated by designing disturbance observers inspired from the idea of ESO. The RBF NNs are utilized to devise common virtual control laws and a dynamic surface is introduced at each step to avoid the ‘explosion of complexity’ problem. For the supplementary part, the ADHDP is further adopted to decrease altitude tracking error and improve control performance, where the control input is generated based on the tracking errors of the basic part. The structure of the control scheme is provided in Fig. 2. Remark 4. As the altitude motion of the morphing aircraft has been modeled as switched nonlinear systems in lower triangular form, the related basic part controller is then generated via the backstepping method. A combination of the dynamic surface technique and radial basis function neural networks is exploited to devise the common virtual control laws and the final basic part control input. 
It is noteworthy that the adoption of neural networks alleviates the direct cancellation of switched nonlinear terms and facilitates the development of common virtual control laws. Moreover, the disturbance observers are employed to attenuate the uncertainties due to internal uncertainties and external disturbances. All signals of the closed-loop system are proved to be bounded with the basic part controller. For further improvement of control performance, the supplementary part controller is devised via the ADHDP method. The ADHDP technique can be regarded as an effective way of solving optimal control problems
which generates a control input such that the user-defined cost function is minimized. The control performance can be described by the user-defined cost function based on optimal control theory. Since stability is the focus of the basic part controller and control performance is neglected, the ADHDP scheme is exploited to design a supplementary control input and improve control performance. We here adopt the tracking error variables defined in the basic part controller design process as states of the ADHDP approach and develop a supplementary part controller such that these errors can be minimized.

Remark 5. For the basic part controller via the backstepping technique, it is noted that the difficulty lies in the switched systems and the continuous states. The related common virtual laws and the final control input are developed by employing NNs to circumvent this difficulty.
The following assumptions are made for the altitude dynamics (2).

Assumption 2. The velocity $V$ and its derivative $\dot V$ are bounded.

Assumption 3. The altitude reference signal $h_r$ is twice differentiable and bounded.

Assumption 4. The signs of $g_{2i}$ and $g_{4i}$ are fixed, and there exist positive constants $g_{2m}$, $g_{2M}$, $g_{4m}$ and $g_{4M}$ such that $g_{2m} \le |g_{2i}| \le g_{2M}$, $g_{4m} \le |g_{4i}| \le g_{4M}$, $\forall i \in S_M$.

Assumption 5. The disturbances $d_\gamma$ and $d_q$ are bounded and differentiable, i.e., there exist positive constants $N_\gamma$ and $N_q$ such that

$$|d_\gamma| \le N_\gamma, \quad |\dot d_\gamma| \le N_\gamma, \quad |d_q| \le N_q, \quad |\dot d_q| \le N_q.$$

Remark 6. Assumptions 2 and 3 are on the boundedness of $V$, $h_r$ and their derivatives, which are commonly adopted in extant references on aircraft longitudinal motion control [39,40]. Assumptions 2 and 3 are mainly for convenience of designing virtual control laws of the $h$ loop when the backstepping scheme is applied. Assumption 4 ensures the control singularity will not happen [46]. Assumption 5 is on the boundedness of $d_\gamma$, $d_q$ and their derivatives, which is for convenience of the convergence analysis of the disturbance observer and common in many references on disturbance observer based control [29,31,41].

3.1. Basic part design

To design the basic part, it is assumed $\delta_{e2} = 0$ in Section 3.1. Moreover, for any $i \in S_M$, it means that the sweep angle $\xi$ takes a certain value, and the corresponding basic part controller is designed by following the backstepping method.

Step 1 Define $e_0 = \int_0^t e_1(\upsilon)\,d\upsilon$, $e_1 = h - h_r$, then the time derivatives of $e_0$ and $e_1$ are

$$\dot e_0 = e_1, \qquad \dot e_1 = \dot h - \dot h_r = V\gamma - \dot h_r. \tag{7}$$

Denote the virtual control law of $e_1$ by $\gamma_r$, and introduce a new state variable $\bar\gamma_r$, whose dynamics will be given later. Define $\bar e_1 = [e_0, e_1]^T$, $\tilde\gamma_r = \bar\gamma_r - \gamma_r$, $e_2 = \gamma - \bar\gamma_r$, and choose the following Lyapunov function candidate

$$L_{V1} = \bar e_1^T P \bar e_1 + \frac{1}{2}\tilde\gamma_r^2, \tag{8}$$

where $P = \begin{bmatrix} p_{11} & p_{12} \\ p_{12} & p_{22} \end{bmatrix}$ is a positive definite matrix. Let the virtual control law $\gamma_r$ be designed as

$$\gamma_r = \frac{-k_0 e_0 - k_1 e_1 + \dot h_r}{V} \tag{9}$$

with $k_0$ and $k_1$ being positive constants, and the matrix $P$ is chosen to satisfy $PA + A^T P = -\bar k_1 I_2$, where $\bar k_1$ denotes a positive constant, $I_2$ denotes the identity matrix, and $A = \begin{bmatrix} 0 & 1 \\ -k_0 & -k_1 \end{bmatrix}$. The time derivative of $L_{V1}$ then reads

$$
\begin{aligned}
\dot L_{V1} &= -\bar k_1 e_0^2 - \bar k_1 e_1^2 + 2\bar e_1^T P\,[0 \;\; V(e_2 + \tilde\gamma_r)]^T + \tilde\gamma_r\dot{\tilde\gamma}_r\\
&= -\bar k_1 e_0^2 - \bar k_1 e_1^2 + 2V e_2(p_{12}e_0 + p_{22}e_1) + 2V\tilde\gamma_r(p_{12}e_0 + p_{22}e_1) + \tilde\gamma_r\dot{\tilde\gamma}_r.
\end{aligned}
\tag{10}
$$

From Eq. (10), the dynamics of $\bar\gamma_r$ is selected as

$$\tau_1\dot{\bar\gamma}_r + \bar\gamma_r = \gamma_r - 2\tau_1 V(p_{12}e_0 + p_{22}e_1), \tag{11}$$

where $\tau_1 > 0$ denotes the time constant. By Eq. (11), Assumptions 2, 3 and 5, we further obtain

$$
\begin{aligned}
\dot L_{V1} &= -\bar k_1 e_0^2 - \bar k_1 e_1^2 - \frac{1}{\tau_1}\tilde\gamma_r^2 + 2V e_2(p_{12}e_0 + p_{22}e_1) - \tilde\gamma_r\left(\frac{\partial\gamma_r}{\partial e_0}\dot e_0 + \frac{\partial\gamma_r}{\partial e_1}\dot e_1 + \frac{\partial\gamma_r}{\partial\dot h_r}\ddot h_r + \frac{\partial\gamma_r}{\partial V}\dot V\right)\\
&= -\bar k_1 e_0^2 - \bar k_1 e_1^2 - \frac{1}{\tau_1}\tilde\gamma_r^2 + 2V e_2(p_{12}e_0 + p_{22}e_1) - \tilde\gamma_r B_1(e_0, e_1, e_2, \tilde\gamma_r, h_r, \dot h_r, \ddot h_r, V, \dot V),
\end{aligned}
\tag{12}
$$

where $B_1(\cdot)$ is a continuous function related to the time derivative of $\gamma_r$.

Step 2 Differentiating $e_2$ with respect to time leads to

$$\dot e_2 = \dot\gamma - \dot{\bar\gamma}_r = f_{2,i} + g_{2,i}\theta + d_\gamma + \frac{1}{\tau_1}\tilde\gamma_r + 2V(p_{12}e_0 + p_{22}e_1). \tag{13}$$
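The quantities used in Step 1 can be computed in a few lines. The sketch below is illustrative only: the function names are hypothetical, and the closed-form entries of P are obtained here by solving P A + Aᵀ P = −k̄₁ I₂ entry by entry for the 2×2 matrix A given above, which is a derivation added for this example and not stated in the paper.

```python
def lyapunov_p(k0, k1, k_bar):
    """Entries (p11, p12, p22) of the symmetric P with P A + A^T P = -k_bar I_2."""
    p12 = k_bar / (2.0 * k0)                    # from the (1,1) entry
    p22 = k_bar * (1.0 + k0) / (2.0 * k0 * k1)  # from the (2,2) entry
    p11 = k1 * p12 + k0 * p22                   # from the off-diagonal entry
    return p11, p12, p22


def lyapunov_residual(p, k0, k1, k_bar):
    """Largest absolute entry of P A + A^T P + k_bar I_2 (should be ~0)."""
    p11, p12, p22 = p
    r11 = -2.0 * k0 * p12 + k_bar
    r12 = p11 - k1 * p12 - k0 * p22
    r22 = 2.0 * (p12 - k1 * p22) + k_bar
    return max(abs(r11), abs(r12), abs(r22))


def virtual_gamma(e0, e1, h_r_dot, V, k0, k1):
    """Virtual control law gamma_r of Eq. (9)."""
    return (-k0 * e0 - k1 * e1 + h_r_dot) / V


def gamma_bar_rate(gamma_bar, gamma_r, e0, e1, V, p12, p22, tau1):
    """Eq. (11) solved for the derivative of the filtered signal gamma_bar_r."""
    return (gamma_r - gamma_bar) / tau1 - 2.0 * V * (p12 * e0 + p22 * e1)
```

With k0 = 1, k1 = 2 and k̄₁ = 1 the helper returns p11 = 1.5, p12 = 0.5, p22 = 0.5, which is positive definite; the extra term in `gamma_bar_rate` is the modification of the first-order filter that cancels the cross term 2Vγ̃ᵣ(p₁₂e₀ + p₂₂e₁) in Eq. (10).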
For the γ loop, motivated by the idea of ESO, the disturbance observer is designed as follows:
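$$
\begin{cases}
\dot{\hat\gamma} = f_{2,i} + g_{2,i}\theta + \hat d_\gamma + \phi_{1\gamma}\!\left(\dfrac{\gamma - \hat\gamma}{\varepsilon_\gamma}\right),\\[8pt]
\dot{\hat d}_\gamma = \dfrac{1}{\varepsilon_\gamma}\,\phi_{2\gamma}\!\left(\dfrac{\gamma - \hat\gamma}{\varepsilon_\gamma}\right),
\end{cases}
\tag{14}
$$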
where γˆ and dˆ γ are the states of the disturbance observer, εγ is a small positive constant, φ1γ (·) and φ2γ (·) ∈ C (R, R) are the design functions. Remark 7. The point we want to express in this paper is that the ESO handles the disturbances in an active way and can be adopted for morphing aircraft control design. In other words, the ESO is regarded as a tool which is exploited in the backstepping design to devise the common virtual control laws and final control input in combination with neural networks. However, we do not deliberately stress the importance of ESO and the comparison between the ESO and other methods is not our emphasis. The main idea of this paper is that the combination of backstepping method and ADHDP scheme can be achieved to improve the control performance. As a result, the ESO only is an application in the control design.
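To make the observer structure of Eq. (14) concrete, here is a minimal simulation sketch for a scalar stand-in plant. Several things are assumptions for illustration and are not specified in the paper: the design functions are chosen linear, φ₁(r) = a1·r and φ₂(r) = a2·r, the known dynamics are replaced by the toy term −x, the disturbance is sin(t), and forward-Euler integration is used.

```python
import math


def simulate_linear_eso(t_end=5.0, dt=1e-3, eps=0.02, a1=2.0, a2=1.0):
    """Estimate the disturbance d(t) = sin(t) acting on x_dot = -x + d."""
    x, x_hat, d_hat = 0.0, 0.0, 0.0
    t = 0.0
    for _ in range(int(round(t_end / dt))):
        d = math.sin(t)           # true disturbance, unknown to the observer
        known = -x                # stand-in for the known part f + g * theta
        r = (x - x_hat) / eps     # scaled output estimation error
        x_dot = known + d                      # plant
        x_hat_dot = known + d_hat + a1 * r     # first line of Eq. (14)
        d_hat_dot = a2 * r / eps               # second line of Eq. (14)
        x += dt * x_dot
        x_hat += dt * x_hat_dot
        d_hat += dt * d_hat_dot
        t += dt
    return d_hat, math.sin(t)
```

Calling `simulate_linear_eso()` returns the final estimate together with the true disturbance value at the same instant; with the linear gains above the estimation error dynamics have a double pole at −1/eps, so a smaller `eps` gives a faster observer but needs a smaller integration step.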
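With the estimated value $\hat d_\gamma$ and similar to [25,46], an RBF NN is used to approximate the nonlinear term $f_{2,i} + \mathrm{sat}_{N_\gamma}(\hat d_\gamma) + \frac{1}{\tau_1}\tilde\gamma_r + 4V(p_{12}e_0 + p_{22}e_1)$ such that for any given constant $\bar\delta_{1,i} > 0$, one gets

$$f_{2,i} + \mathrm{sat}_{N_\gamma}(\hat d_\gamma) + \frac{1}{\tau_1}\tilde\gamma_r + 4V(p_{12}e_0 + p_{22}e_1) = W_{1,i}^T S_1(X_1) + \delta_{1,i}(X_1), \qquad |\delta_{1,i}(X_1)| \le \bar\delta_{1,i}, \tag{15}$$

where $W_{1,i}$ denotes the ideal weight vector, $X_1 = [V, \gamma, \hat d_\gamma, e_0, e_1, \tilde\gamma_r]^T$, $\delta_{1,i}(X_1)$ denotes the approximation error, we then have

$$\dot e_2 = W_{1,i}^T S_1(X_1) + \delta_{1,i}(X_1) + g_{2,i}\theta - 2V(p_{12}e_0 + p_{22}e_1) + d_\gamma - \mathrm{sat}_{N_\gamma}(\hat d_\gamma). \tag{16}$$

It follows that

$$e_2\big(W_{1,i}^T S_1(X_1) + \delta_{1,i}(X_1)\big) \le |e_2|\,\bar\vartheta_1\big(\|S_1(X_1)\| + 1\big) = |e_2|\,\bar\vartheta_1 Y_1(X_1), \tag{17}$$

where $\bar\vartheta_1 = \max\{\|W_{1,i}\|, \bar\delta_{1,i}, i \in S_M\}$, $Y_1(X_1) = \|S_1(X_1)\| + 1$. Applying Lemma 1 to Eq. (17) yields

$$e_2\big(W_{1,i}^T S_1(X_1) + \delta_{1,i}(X_1)\big) \le \bar\vartheta_1 e_2 Y_1 \tanh\left(\frac{e_2 Y_1}{\epsilon_1}\right) + \bar\vartheta_1\epsilon'_1 \tag{18}$$

with $\epsilon_1$ being a positive constant and $\epsilon'_1 = 0.2785\,\epsilon_1$. Denote the virtual control law of $e_2$ by $\theta_r$, and introduce a new state variable $\bar\theta_r$, which is obtained by the following modified first-order filter with a time constant $\tau_2$:

$$\tau_2\dot{\bar\theta}_r + \bar\theta_r = \theta_r - \tau_2 g_{2,i} e_2. \tag{19}$$

Define $\tilde\theta_r = \bar\theta_r - \theta_r$, $e_3 = \theta - \bar\theta_r$, $\tilde\vartheta_1 = \hat\vartheta_1 - \vartheta_1$, where $\vartheta_1$ is an unknown parameter related to $\bar\vartheta_1$, $\hat\vartheta_1$ is the estimation of $\vartheta_1$, and choose the Lyapunov function candidate as

$$L_{V2} = L_{V1} + \frac{1}{2}e_2^2 + \frac{1}{2}\tilde\theta_r^2 + \frac{g_{2m}}{2r_1}\tilde\vartheta_1^2, \tag{20}$$

where $r_1$ is a positive constant. Taking the time derivative of $L_{V2}$ leads to

$$
\begin{aligned}
\dot L_{V2} \le{}& \dot L_{V1} + e_2\Big[g_{2,i}(\theta_r + e_3 + \tilde\theta_r) + \bar\vartheta_1 Y_1\tanh\left(\frac{e_2 Y_1}{\epsilon_1}\right) - 2V(p_{12}e_0 + p_{22}e_1) + d_\gamma - \mathrm{sat}_{N_\gamma}(\hat d_\gamma)\Big]\\
& + \tilde\theta_r\Big(-\frac{1}{\tau_2}\tilde\theta_r - g_{2,i}e_2 - \dot\theta_r\Big) + \frac{g_{2m}}{r_1}\tilde\vartheta_1\dot{\hat\vartheta}_1 + \bar\vartheta_1\epsilon'_1\\
\le{}& -\bar k_1 e_0^2 - \bar k_1 e_1^2 - \frac{1}{\tau_1}\tilde\gamma_r^2 - \frac{1}{\tau_2}\tilde\theta_r^2 - \tilde\gamma_r B_1(\cdot) - \tilde\theta_r\dot\theta_r + e_2 g_{2,i}e_3\\
& + e_2\Big[g_{2,i}\theta_r + g_{2m}\vartheta_1 Y_1\tanh\left(\frac{e_2 Y_1}{\epsilon_1}\right)\Big] + e_2\big(d_\gamma - \mathrm{sat}_{N_\gamma}(\hat d_\gamma)\big) + \frac{g_{2m}}{r_1}\tilde\vartheta_1\dot{\hat\vartheta}_1 + \bar\vartheta_1\epsilon'_1
\end{aligned}
\tag{21}
$$

with $\vartheta_1 = g_{2m}^{-1}\bar\vartheta_1$. The related common virtual control law $\theta_r$ is then chosen as

$$\theta_r = -\mathrm{sign}(g_{2i})\left[\frac{k_2}{g_{2m}}e_2 + \hat\vartheta_1 Y_1\tanh\left(\frac{e_2 Y_1}{\epsilon_1}\right)\right], \tag{22}$$

where $k_2$ is a positive constant, $\mathrm{sign}(\cdot)$ denotes the signum function. The update law for $\hat\vartheta_1$ reads

$$\dot{\hat\vartheta}_1 = r_1 e_2 Y_1\tanh\left(\frac{e_2 Y_1}{\epsilon_1}\right) - \kappa_1\hat\vartheta_1, \tag{23}$$

where $\kappa_1$ is a positive constant and $\hat\vartheta_1(0)$ is chosen such that $\hat\vartheta_1 \ge 0$ always holds. Invoking Eqs. (22) and (23), one obtains

$$
\begin{aligned}
\dot L_{V2} \le{}& -\bar k_1 e_0^2 - \bar k_1 e_1^2 - \frac{1}{\tau_1}\tilde\gamma_r^2 - \frac{1}{\tau_2}\tilde\theta_r^2 - \tilde\gamma_r B_1(\cdot) - \tilde\theta_r\dot\theta_r + e_2 g_{2,i}e_3 + e_2 g_{2m}\vartheta_1 Y_1\tanh\left(\frac{e_2 Y_1}{\epsilon_1}\right)\\
& - \frac{|g_{2,i}|}{g_{2m}}k_2 e_2^2 - |g_{2,i}|\,\hat\vartheta_1 e_2 Y_1\tanh\left(\frac{e_2 Y_1}{\epsilon_1}\right) + e_2\big(d_\gamma - \mathrm{sat}_{N_\gamma}(\hat d_\gamma)\big)\\
& + \frac{g_{2m}}{r_1}\tilde\vartheta_1\Big[r_1 e_2 Y_1\tanh\left(\frac{e_2 Y_1}{\epsilon_1}\right) - \kappa_1\hat\vartheta_1\Big] + \bar\vartheta_1\epsilon'_1\\
\le{}& -\bar k_1 e_0^2 - \bar k_1 e_1^2 - k_2 e_2^2 - \frac{1}{\tau_1}\tilde\gamma_r^2 - \frac{1}{\tau_2}\tilde\theta_r^2 - \tilde\gamma_r B_1(\cdot)\\
& - \tilde\theta_r B_2\big(e_0, e_1, e_2, e_3, \tilde\gamma_r, \tilde\theta_r, \gamma - \hat\gamma, d_\gamma - \mathrm{sat}_{N_\gamma}(\hat d_\gamma), \hat\vartheta_1, h_r, \dot h_r, \ddot h_r, V, \dot V\big)\\
& - \frac{g_{2m}\kappa_1}{r_1}\tilde\vartheta_1\hat\vartheta_1 + e_2 g_{2,i}e_3 + e_2\big(d_\gamma - \mathrm{sat}_{N_\gamma}(\hat d_\gamma)\big) + \bar\vartheta_1\epsilon'_1,
\end{aligned}
\tag{24}
$$

where $B_2(\cdot)$ is a continuous function related to the time derivative of $\theta_r$.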
Step 3 Differentiating e 3 with respect to time yields
e˙ 3 = θ˙ − θ˙¯r = q +
1
τ2
θ˜r + g 2,i e 2 .
(23)
(25)
Denote the virtual control law of e 3 by qr , and introducing a new state variable q¯ r , which is obtained by the following modified first-order filter with a time constant τ3 :
τ3 q˙¯ r + q¯ r = qr − τ3 e3 .
(26)
Define q˜ r = q¯ r − qr , e 4 = q − q¯ r , and choosing the Lyapunov function candidate as
LV3 = LV2 +
1 2
e 23 +
1 2
q˜ 2r .
(27)
The time derivative of L V 3 is then given by
−k¯ 1 e 20 − k¯ 1 e 21 − γ˜r2 − θ˜r2 − γ˜r B 1 (·) − θ˜r θ˙r + e 2 g 2,i e 3 τ1 τ 2 e2 Y 1 + g 2,i θr + e 2 g 2m 1 Y 1 tanh r1
1
− k¯ 1 e 21
L˙ V 3 = L˙ V 2 + e 3 qr + e 4 + q˜ r +
r1
1
−k¯ 1 e 20
where B 2 (·) is a continuous function related to the time derivative of θr .
¯ 1 1 + (18)
1
τ2
ˆ 1 , hr , h˙ r , h¨ r , V , V˙ ) − − sat N γ (dˆ γ ),
¯ 1 Y 1 ( X 1 ), |e 2 |
γ˜r2 −
r1
(16)
1
θ˜r2 − γ˜r B 1 (·) − θ˜r θ˙r + e 2 g 2,i e 3 | g 2, i | e2 Y 1 − + e 2 g 2m 1 Y 1 tanh k2 e 2 1 g 2m ˆ 1 Y 1 tanh e 2 Y 1 + e 2 (dγ − sat N γ (dˆ γ )) − | g 2 , i | 1 g 2m ¯ 1 1 ˜ 1 r1 e 2 Y 1 tanh e 2 Y 1 − κ1 ˆ1 + +
γ˜r + 4V ( p 12 e0 + p 22 e1 )
= W 1T,i S 1 ( X 1 ) + δ1,i ( X 1 ), |δ1,i ( X 1 )| δ¯1,i ,
1
τ1
1
τ2
θ˜r + g 2,i e 2
1 + q˜ r − q˜ r − e 3 − q˙ r τ3 1 1 ˙ ˜ = L V 2 + e 3 qr + e 4 + θr + g 2,i e 2 + q˜ r − q˜ r − q˙ r .
τ2
τ3
(28) From Eq. (28), the virtual control law qr is designed as
qr = −k3 e 3 −
1
τ2
θ˜r ,
(29)
where k3 is a positive constant. A combination of Eqs. (28) and (29) comes to
L˙ V 3 = L˙ V 2 − k3 e 23 −
1
τ3
q˜ 2r + e 3 e 4 + g 2,i e 2 e 3
− q˜ r B 3 (e 0 , e 1 , e 2 , e 3 , e 4 , γ˜r , θ˜r , q˜ r , γ − γˆ , dγ ˆ 1 , hr , h˙ r , h¨ r , V , V˙ ), − sat N (dˆ γ ), γ
(30)
where B 3 (·) is a continuous function related to the time derivative of qr .
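The virtual control law (22), the update law (23) and the modified first-order filter (19) can be sketched numerically as follows. This is only a minimal illustration under stated assumptions: the function and argument names are hypothetical, `varpi` stands for the positive constant inside the tanh term, `Y1` is passed in as a precomputed value of $\|S_1(X_1)\| + 1$, and an explicit Euler step is assumed for the filter.

```python
import math


def virtual_control(e2, Y1, theta_hat, g2_sign, k2, g2m, varpi):
    """Virtual control law of the form (22).

    g2_sign is sign(g_{2,i}); theta_hat is the current estimate Theta_hat_1.
    """
    robust_term = theta_hat * Y1 * math.tanh(e2 * Y1 / varpi)
    return -g2_sign * (k2 / g2m * e2 + robust_term)


def theta_hat_dot(e2, Y1, theta_hat, r1, kappa1, varpi):
    """Right-hand side of the update law of the form (23)."""
    return r1 * (e2 * Y1 * math.tanh(e2 * Y1 / varpi) - kappa1 * theta_hat)


def filter_step(theta_bar, theta_r, e2, g2, tau2, dt):
    """One explicit Euler step of the modified filter (19):
    tau2 * d(theta_bar)/dt + theta_bar = theta_r - tau2 * g2 * e2.
    """
    theta_bar_dot = (theta_r - theta_bar) / tau2 - g2 * e2
    return theta_bar + dt * theta_bar_dot
```

Because $x\tanh(x/\varpi) \ge 0$ for every $x$, the right-hand side of the update law is non-negative whenever the estimate is zero, which is consistent with the requirement that $\hat\Theta_1 \ge 0$ holds when $\hat\Theta_1(0) \ge 0$.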
Step 4. Differentiating $e_4$ with respect to time leads to

$$\dot e_4 = \dot q - \dot{\bar q}_r = f_{4,i} + g_{4,i}\delta_{e1} + d_q + \frac{1}{\tau_3}\tilde q_r + e_3. \tag{31}$$

For the $q$ loop, the dynamics of the disturbance observer satisfies

$$\begin{cases}
\dot{\hat q} = f_{4,i} + g_{4,i}\delta_{e1} + \hat d_q + \phi_{1q}\!\left(\dfrac{q - \hat q}{\varepsilon_q}\right), \\[2mm]
\dot{\hat d}_q = \dfrac{1}{\varepsilon_q}\phi_{2q}\!\left(\dfrac{q - \hat q}{\varepsilon_q}\right),
\end{cases} \tag{32}$$

where $\hat q$ and $\hat d_q$ are the states of the disturbance observer, $\varepsilon_q$ is a small positive constant, and $\phi_{1q}(\cdot), \phi_{2q}(\cdot) \in C(\mathbb{R},\mathbb{R})$ are the design functions. With the estimated value $\hat d_q$, an RBF NN is utilized to approximate the nonlinear term $f_{4,i} + \mathrm{sat}_{N_q}(\hat d_q) + \frac{1}{\tau_3}\tilde q_r + 2e_3$ such that for any given constant $\bar\delta_{2,i} > 0$

$$f_{4,i} + \mathrm{sat}_{N_q}(\hat d_q) + \frac{1}{\tau_3}\tilde q_r + 2e_3 = W_{2,i}^T S_2(X_2) + \delta_{2,i}(X_2), \quad |\delta_{2,i}(X_2)| \le \bar\delta_{2,i}, \tag{33}$$

where $W_{2,i}$ denotes the ideal weight vector, $X_2 = [V, \gamma, \theta, q, \hat d_q, e_3, \tilde q_r]^T$, and $\delta_{2,i}(X_2)$ denotes the approximation error. It follows that

$$\dot e_4 = W_{2,i}^T S_2(X_2) + \delta_{2,i}(X_2) + g_{4,i}\delta_{e1} - e_3 + d_q - \mathrm{sat}_{N_q}(\hat d_q). \tag{34}$$

Similar to Eq. (17), one can obtain

$$e_4\big(W_{2,i}^T S_2(X_2) + \delta_{2,i}(X_2)\big) \le |e_4|\,\bar\Theta_2\big(\|S_2(X_2)\| + 1\big) \le |e_4|\,\bar\Theta_2 Y_2(X_2), \tag{35}$$

where $\bar\Theta_2 = \max\{\|W_{2,i}\|, \bar\delta_{2,i}, i \in S_M\}$ and $Y_2(X_2) = \|S_2(X_2)\| + 1$. Applying Lemma 1 to Eq. (35) yields

$$e_4\big(W_{2,i}^T S_2(X_2) + \delta_{2,i}(X_2)\big) \le \bar\Theta_2 e_4 Y_2 \tanh\!\left(\frac{e_4 Y_2}{\varpi_2}\right) + \bar\Theta_2\varpi_2 \tag{36}$$

with $\varpi_2$ being a positive constant. Define $\tilde\Theta_2 = \hat\Theta_2 - \Theta_2$, where $\Theta_2$ is an unknown parameter related to $\bar\Theta_2$ and $\hat\Theta_2$ is the estimation of $\Theta_2$, and take

$$L_{V4} = L_{V3} + \frac{1}{2}e_4^2 + \frac{g_{4m}}{2r_2}\tilde\Theta_2^2 \tag{37}$$

as the Lyapunov function candidate, where $r_2$ is a positive constant. The time derivative of $L_{V4}$ is then given by

$$\dot L_{V4} \le \dot L_{V3} + e_4 g_{4,i}\delta_{e1} + e_4\left[g_{4m}\Theta_2 Y_2 \tanh\!\left(\frac{e_4 Y_2}{\varpi_2}\right) - e_3 + d_q - \mathrm{sat}_{N_q}(\hat d_q)\right] + \bar\Theta_2\varpi_2 + \frac{g_{4m}}{r_2}\tilde\Theta_2\dot{\hat\Theta}_2 \tag{38}$$

with $\Theta_2 = g_{4m}^{-1}\bar\Theta_2$. The final basic control input $\delta_{e1}$ becomes

$$\delta_{e1} = -\mathrm{sign}(g_{4,i})\left[\frac{k_4}{g_{4m}}e_4 + \hat\Theta_2 Y_2 \tanh\!\left(\frac{e_4 Y_2}{\varpi_2}\right)\right], \tag{39}$$

where $k_4$ is a positive constant. The update law for $\hat\Theta_2$ is defined as

$$\dot{\hat\Theta}_2 = r_2\left[e_4 Y_2 \tanh\!\left(\frac{e_4 Y_2}{\varpi_2}\right) - \kappa_2\hat\Theta_2\right], \tag{40}$$

where $\kappa_2$ is a positive constant and $\hat\Theta_2(0)$ is chosen such that $\hat\Theta_2 \ge 0$ always holds. By substituting Eqs. (39) and (40) into Eq. (38), one obtains

$$\begin{aligned}
\dot L_{V4} \le{}& \dot L_{V3} - e_3 e_4 - \frac{|g_{4,i}|}{g_{4m}}k_4 e_4^2 + e_4 g_{4m}\Theta_2 Y_2 \tanh\!\left(\frac{e_4 Y_2}{\varpi_2}\right) - |g_{4,i}|\hat\Theta_2 e_4 Y_2 \tanh\!\left(\frac{e_4 Y_2}{\varpi_2}\right) \\
&+ e_4\big(d_q - \mathrm{sat}_{N_q}(\hat d_q)\big) + \bar\Theta_2\varpi_2 + \frac{g_{4m}}{r_2}\tilde\Theta_2 r_2\left[e_4 Y_2 \tanh\!\left(\frac{e_4 Y_2}{\varpi_2}\right) - \kappa_2\hat\Theta_2\right] \\
\le{}& \dot L_{V3} - k_4 e_4^2 - \frac{g_{4m}\kappa_2}{r_2}\tilde\Theta_2\hat\Theta_2 - e_3 e_4 + e_4\big(d_q - \mathrm{sat}_{N_q}(\hat d_q)\big) + \bar\Theta_2\varpi_2.
\end{aligned} \tag{41}$$

Remark 8. The basic part controller combines the ESO and the NN method. Specifically, the NN is adopted to obtain the common virtual control laws for the switched nonlinear systems, while the ESO generates the disturbance estimates. Direct cancellation of the nonlinear terms during the backstepping design is not achievable, since the system is switched while the states are continuous. To handle this issue, the NN approximates the switched nonlinear terms, which yields equivalent terms in which only the unknown ideal weight vectors are switched; the ESO is employed during this approximation. Then, the bound on the magnitude of the ideal weight vectors is exploited, together with inequality techniques, to generate the common virtual control laws. Finally, the obtained common virtual control laws and basic control input guarantee stability of the closed-loop system by Lyapunov stability theory.

Define the scaled estimation errors of the designed disturbance observers (14) and (32) as follows:

$$\eta_\gamma = [\eta_{\gamma 1}, \eta_{\gamma 2}]^T, \quad \eta_{\gamma 1} = \frac{\gamma - \hat\gamma}{\varepsilon_\gamma}, \quad \eta_{\gamma 2} = d_\gamma - \hat d_\gamma,$$

$$\eta_q = [\eta_{q1}, \eta_{q2}]^T, \quad \eta_{q1} = \frac{q - \hat q}{\varepsilon_q}, \quad \eta_{q2} = d_q - \hat d_q.$$

The dynamics of the scaled estimation errors $\eta_\iota$ can then be formulated as

$$\varepsilon_\iota\dot\eta_{\iota 1} = \eta_{\iota 2} - \phi_{1\iota}(\eta_{\iota 1}), \quad \varepsilon_\iota\dot\eta_{\iota 2} = \varepsilon_\iota\dot d_\iota - \phi_{2\iota}(\eta_{\iota 1}), \quad \iota = \gamma, q. \tag{42}$$

For Eq. (42), we make the following assumption:

Assumption 6. There exist positive definite, radially unbounded, and continuously differentiable functions $L_{V\iota}$ and $L_{W\iota}: \mathbb{R}^2 \to \mathbb{R}$ such that
1) $\lambda_{1\iota}\|y\|^2 \le L_{V\iota}(y) \le \lambda_{2\iota}\|y\|^2$, $\lambda_{3\iota}\|y\|^2 \le L_{W\iota}(y) \le \lambda_{4\iota}\|y\|^2$,
2) $\big(y_2 - \phi_{1\iota}(y_1)\big)\dfrac{\partial L_{V\iota}(y)}{\partial y_1} - \phi_{2\iota}(y_1)\dfrac{\partial L_{V\iota}(y)}{\partial y_2} \le -L_{W\iota}(y)$,
3) $\left|\dfrac{\partial L_{V\iota}(y)}{\partial y_2}\right| \le p_\iota\|y\|$,
where $y = [y_1, y_2]^T \in \mathbb{R}^2$, and $\lambda_{1\iota}$, $\lambda_{2\iota}$, $\lambda_{3\iota}$, $\lambda_{4\iota}$ and $p_\iota$ are positive constants.
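As a hedged illustration of the observer structure (32) and the error dynamics (42), the sketch below simulates an observer with linear design functions $\phi_1(\eta) = l_1\eta$ and $\phi_2(\eta) = l_2\eta$ on a scalar toy plant $\dot x = d$ with a constant disturbance. The toy plant, the constant disturbance, the explicit Euler integration and all names are assumptions made for illustration; the gains $l_1 = 3$, $l_2 = 2.25$ and $\varepsilon = 0.06$ are the values reported for the simulation in Section 4.

```python
def simulate_linear_eso(d_true=0.5, l1=3.0, l2=2.25, eps=0.06,
                        dt=1e-3, t_end=2.0):
    """Simulate a linear ESO on the toy plant x_dot = d_true.

    The known part of the dynamics (f + g*u) is taken as zero here,
    so the observer only has to recover the constant disturbance.
    Returns the final state error x - x_hat and the estimate d_hat.
    """
    x, x_hat, d_hat = 0.0, 0.0, 0.0
    for _ in range(int(round(t_end / dt))):
        eta1 = (x - x_hat) / eps          # scaled output error
        x_hat_dot = d_hat + l1 * eta1     # observer state equation
        d_hat_dot = l2 * eta1 / eps       # disturbance estimate equation
        x += dt * d_true                  # toy plant
        x_hat += dt * x_hat_dot
        d_hat += dt * d_hat_dot
    return x - x_hat, d_hat


state_error, d_estimate = simulate_linear_eso()
```

With these gains the error polynomial $\bar s^2 + 3\bar s + 2.25 = (\bar s + 1.5)^2$ has a double root at $-1.5$, so in unscaled time the estimation error decays at a rate of $1.5/\varepsilon = 25\ \mathrm{s}^{-1}$.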
Remark 9. Assumption 6 concerns the dynamics of the designed ESO and is analogous to the related assumption in [29–31,42]. It mainly ensures asymptotic stability of the zero equilibrium of the following system:

$$\dot\eta_{\iota 1} = \eta_{\iota 2} - \phi_{1\iota}(\eta_{\iota 1}), \quad \dot\eta_{\iota 2} = -\phi_{2\iota}(\eta_{\iota 1}). \tag{43}$$

In particular, when the design functions $\phi_{1\iota}(\cdot)$ and $\phi_{2\iota}(\cdot)$ take the linear form, the system becomes

$$\dot\eta_{\iota 1} = \eta_{\iota 2} - l_{1\iota}\eta_{\iota 1}, \quad \dot\eta_{\iota 2} = -l_{2\iota}\eta_{\iota 1}, \tag{44}$$

where $l_{1\iota}$ and $l_{2\iota}$ are positive constants. The polynomial $\bar s^2 + l_{1\iota}\bar s + l_{2\iota}$ is then Hurwitz, and the conditions in Assumption 6 are satisfied. More details can be found in [29–31,42].

For the basic part controller, we have the following theorem:
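Theorem 1. Consider the closed-loop switched nonlinear systems formed of the plant (2), the disturbance observers (14), (32), the modified dynamic surfaces (11), (19), (26), the parameter update laws (23), (40) and the basic control law (39). Suppose that $\delta_{e2} = 0$, Assumptions 1–6 are satisfied, and initial values of the closed-loop system are bounded. Then there exist design parameters $k_0$, $k_1$, $\bar k_1$, $k_2$, $k_3$, $k_4$, $\tau_1$, $\tau_2$, $\tau_3$, $\varepsilon_\gamma$, $\varepsilon_q$, $r_1$, $\varpi_1$, $\kappa_1$, $r_2$, $\varpi_2$, $\kappa_2$, and design functions $\phi_{1\gamma}(\cdot)$, $\phi_{2\gamma}(\cdot)$, $\phi_{1q}(\cdot)$, $\phi_{2q}(\cdot)$ such that all signals of the closed-loop system remain bounded, and the altitude tracking error $e_1$ converges to a small neighborhood of the origin determined by the design parameters.

Proof. Choose the Lyapunov function candidate for the closed-loop system as

$$L_V = L_{V4} + L_{V\gamma} + L_{Vq}.$$

Considering Eqs. (24), (30) and (41), the time derivative of $L_{V4}$ satisfies

$$\begin{aligned}
\dot L_{V4} \le{}& -\bar k_1 e_0^2 - \bar k_1 e_1^2 - k_2 e_2^2 - k_3 e_3^2 - k_4 e_4^2 - \frac{1}{\tau_1}\tilde\gamma_r^2 - \frac{1}{\tau_2}\tilde\theta_r^2 - \frac{1}{\tau_3}\tilde q_r^2 \\
&- \tilde\gamma_r B_1(\cdot) - \tilde\theta_r B_2(\cdot) - \tilde q_r B_3(\cdot) + 2g_{2,i}e_2 e_3 + e_2\big(d_\gamma - \mathrm{sat}_{N_\gamma}(\hat d_\gamma)\big) + e_4\big(d_q - \mathrm{sat}_{N_q}(\hat d_q)\big) \\
&- \frac{g_{2m}\kappa_1}{r_1}\tilde\Theta_1\hat\Theta_1 - \frac{g_{4m}\kappa_2}{r_2}\tilde\Theta_2\hat\Theta_2 + (\bar\Theta_1\varpi_1 + \bar\Theta_2\varpi_2).
\end{aligned} \tag{45}$$

By Young's inequality [43], we have

$$2g_{2,i}e_2 e_3 \le g_{2M}(e_2^2 + e_3^2). \tag{46}$$

Meanwhile, noticing the definition of $\tilde\Theta_j$ ($j = 1, 2$), it can be verified that

$$-\tilde\Theta_j\hat\Theta_j = -\tilde\Theta_j^2 - \tilde\Theta_j\Theta_j \le -\frac{1}{2}\tilde\Theta_j^2 + \frac{1}{2}\Theta_j^2. \tag{47}$$

Substituting Eqs. (46) and (47) into Eq. (45), one obtains

$$\begin{aligned}
\dot L_{V4} \le{}& -\bar k_1 e_0^2 - \bar k_1 e_1^2 - (k_2 - g_{2M})e_2^2 - (k_3 - g_{2M})e_3^2 - k_4 e_4^2 - \frac{1}{\tau_1}\tilde\gamma_r^2 - \frac{1}{\tau_2}\tilde\theta_r^2 - \frac{1}{\tau_3}\tilde q_r^2 \\
&- \frac{g_{2m}\kappa_1}{2r_1}\tilde\Theta_1^2 - \frac{g_{4m}\kappa_2}{2r_2}\tilde\Theta_2^2 - \tilde\gamma_r B_1(\cdot) - \tilde\theta_r B_2(\cdot) - \tilde q_r B_3(\cdot) \\
&+ e_2\big(d_\gamma - \mathrm{sat}_{N_\gamma}(\hat d_\gamma)\big) + e_4\big(d_q - \mathrm{sat}_{N_q}(\hat d_q)\big) + \frac{g_{2m}\kappa_1}{2r_1}\Theta_1^2 + \frac{g_{4m}\kappa_2}{2r_2}\Theta_2^2 + (\bar\Theta_1\varpi_1 + \bar\Theta_2\varpi_2).
\end{aligned} \tag{48}$$

By Assumptions 2 and 3, the sets $\Pi_0 = \{h_r^2 + \dot h_r^2 + \ddot h_r^2 \le \nu_0\}$ and $\Pi_1 = \{V^2 + \dot V^2 \le \nu_1\}$ are compact for some positive constants $\nu_0$ and $\nu_1$, respectively. Besides, for any given positive constants $\nu_2$, $\nu_3$, $\nu_4$ and $\nu_5$, the sets $\Pi_2 = \{\sum_{j=0}^{4} e_j^2 \le \nu_2\}$, $\Pi_3 = \{\tilde\gamma_r^2 + \tilde\theta_r^2 + \tilde q_r^2 \le \nu_3\}$, $\Pi_4 = \{\|\eta_\gamma\|^2 + \|\eta_q\|^2 \le \nu_4\}$ and $\Pi_5 = \{\sum_{j=1}^{2}\tilde\Theta_j^2 \le \nu_5\}$ are also compact. Noticing that

$$|d_\iota - \mathrm{sat}_{N_\iota}(\hat d_\iota)| \le |d_\iota - \hat d_\iota| + |\hat d_\iota - \mathrm{sat}_{N_\iota}(\hat d_\iota)| \le 2|\eta_{\iota 2}|, \quad \iota = \gamma, q,$$

$$|\hat\Theta_j| \le |\tilde\Theta_j| + |\Theta_j|, \quad j = 1, 2,$$

it can be seen that all arguments of the continuous functions $B_j(\cdot)$ ($j = 1, 2, 3$) are bounded on $\Pi_0 \times \Pi_1 \times \Pi_2 \times \Pi_3 \times \Pi_4 \times \Pi_5$, and there exist related positive constants $M_j$ such that $|B_j(\cdot)| \le M_j$. Applying Young's inequality [43] to Eq. (48) yields

$$\begin{aligned}
\dot L_{V4} \le{}& -\bar k_1 e_0^2 - \bar k_1 e_1^2 - (k_2 - g_{2M} - 1)e_2^2 - (k_3 - g_{2M})e_3^2 - (k_4 - 1)e_4^2 \\
&- \Big(\frac{1}{\tau_1} - \frac{1}{2}\Big)\tilde\gamma_r^2 - \Big(\frac{1}{\tau_2} - \frac{1}{2}\Big)\tilde\theta_r^2 - \Big(\frac{1}{\tau_3} - \frac{1}{2}\Big)\tilde q_r^2 - \frac{g_{2m}\kappa_1}{2r_1}\tilde\Theta_1^2 - \frac{g_{4m}\kappa_2}{2r_2}\tilde\Theta_2^2 \\
&+ \eta_{\gamma 2}^2 + \eta_{q2}^2 + \frac{1}{2}\sum_{j=1}^{3} M_j^2 + \frac{g_{2m}\kappa_1}{2r_1}\Theta_1^2 + \frac{g_{4m}\kappa_2}{2r_2}\Theta_2^2 + (\bar\Theta_1\varpi_1 + \bar\Theta_2\varpi_2).
\end{aligned} \tag{49}$$

On the other hand, the time derivative of $L_{V\iota}$ ($\iota = \gamma, q$) reads

$$\begin{aligned}
\dot L_{V\iota} &= \frac{1}{\varepsilon_\iota}\left[\big(\eta_{\iota 2} - \phi_{1\iota}(\eta_{\iota 1})\big)\frac{\partial L_{V\iota}(\eta_\iota)}{\partial\eta_{\iota 1}} - \phi_{2\iota}(\eta_{\iota 1})\frac{\partial L_{V\iota}(\eta_\iota)}{\partial\eta_{\iota 2}}\right] + \dot d_\iota\frac{\partial L_{V\iota}(\eta_\iota)}{\partial\eta_{\iota 2}} \\
&\le -\frac{\lambda_{3\iota}}{\varepsilon_\iota}\|\eta_\iota\|^2 + |\dot d_\iota|\left|\frac{\partial L_{V\iota}(\eta_\iota)}{\partial\eta_{\iota 2}}\right| \le -\frac{\lambda_{3\iota}}{\varepsilon_\iota}\|\eta_\iota\|^2 + N_\iota p_\iota\sqrt{\nu_4}.
\end{aligned} \tag{50}$$

By Eqs. (49) and (50), the time derivative of $L_V$ is given by

$$\begin{aligned}
\dot L_V \le{}& -\bar k_1 e_0^2 - \bar k_1 e_1^2 - (k_2 - g_{2M} - 1)e_2^2 - (k_3 - g_{2M})e_3^2 - (k_4 - 1)e_4^2 \\
&- \Big(\frac{1}{\tau_1} - \frac{1}{2}\Big)\tilde\gamma_r^2 - \Big(\frac{1}{\tau_2} - \frac{1}{2}\Big)\tilde\theta_r^2 - \Big(\frac{1}{\tau_3} - \frac{1}{2}\Big)\tilde q_r^2 - \frac{g_{2m}\kappa_1}{2r_1}\tilde\Theta_1^2 - \frac{g_{4m}\kappa_2}{2r_2}\tilde\Theta_2^2 \\
&- \Big(\frac{\lambda_{3\gamma}}{\varepsilon_\gamma} - 1\Big)\|\eta_\gamma\|^2 - \Big(\frac{\lambda_{3q}}{\varepsilon_q} - 1\Big)\|\eta_q\|^2 + \frac{1}{2}\sum_{j=1}^{3} M_j^2 + \frac{g_{2m}\kappa_1}{2r_1}\Theta_1^2 + \frac{g_{4m}\kappa_2}{2r_2}\Theta_2^2 \\
&+ (\bar\Theta_1\varpi_1 + \bar\Theta_2\varpi_2) + (N_\gamma p_\gamma + N_q p_q)\sqrt{\nu_4}.
\end{aligned} \tag{51}$$

Let

$$N_V = \frac{1}{2}\sum_{j=1}^{3} M_j^2 + \frac{g_{2m}\kappa_1}{2r_1}\Theta_1^2 + \frac{g_{4m}\kappa_2}{2r_2}\Theta_2^2 + (\bar\Theta_1\varpi_1 + \bar\Theta_2\varpi_2) + (N_\gamma p_\gamma + N_q p_q)\sqrt{\nu_4},$$

let $\lambda_{\max}(P)$ denote the maximum eigenvalue of $P$, and select the design parameters such that

$$k_0 > 0,\ k_1 > 0,\ \bar k_1 > 0,\ k_2 > g_{2M} + 1,\ k_3 > g_{2M},\ k_4 > 1,$$
$$\tau_1 < 2,\ \tau_2 < 2,\ \tau_3 < 2,\ \kappa_1 > 0,\ \kappa_2 > 0,\ 0 < \varepsilon_\gamma < \lambda_{3\gamma},\ 0 < \varepsilon_q < \lambda_{3q};$$

then we have

$$\dot L_V \le -\lambda_V L_V + N_V, \tag{52}$$

where

$$\lambda_V = \min\left\{\frac{\bar k_1}{\lambda_{\max}(P)},\ 2(k_2 - g_{2M} - 1),\ 2(k_3 - g_{2M}),\ 2(k_4 - 1),\ \frac{2}{\tau_1} - 1,\ \frac{2}{\tau_2} - 1,\ \frac{2}{\tau_3} - 1,\ \kappa_1,\ \kappa_2,\ \frac{\lambda_{3\gamma} - \varepsilon_\gamma}{\varepsilon_\gamma\lambda_{2\gamma}},\ \frac{\lambda_{3q} - \varepsilon_q}{\varepsilon_q\lambda_{2q}}\right\}.$$

Applying the comparison principle to Eq. (52) gives

$$L_V(t) \le \left(L_V(0) - \frac{N_V}{\lambda_V}\right)e^{-\lambda_V t} + \frac{N_V}{\lambda_V}. \tag{53}$$

As a result, all the signals in the closed-loop system are bounded, and the altitude tracking error $e_1$ satisfies

$$\lim_{t\to\infty}|e_1| \le \sqrt{\frac{N_V}{\lambda_{\min}(P)\lambda_V}} \tag{54}$$

with $\lambda_{\min}(P)$ being the minimum eigenvalue of $P$. This completes the proof of Theorem 1. □

3.2. Supplementary part design

To improve the control performance, the ADHDP approach is further applied to design the supplementary part controller, where the state vector is defined as $x(t) = [e_1(t), e_2(t), e_3(t), e_4(t)]^T$ and the control input is denoted by $u(t) = \delta_{e2}(t)$. The related utility function of the ADHDP is then given as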
$$s(x(t), u(t)) = [x^T(t), u^T(t)]\,R_r\,[x^T(t), u^T(t)]^T, \tag{55}$$

where $R_r$ is a positive definite diagonal matrix. The ADHDP algorithm is implemented in sampled form with a certain time step, where the cost function is formulated as

$$\begin{aligned}
J(x(t), u(t)) &= s(x(t), u(t)) + \rho\big(\beta s(x(t+1), u(t+1)) + \beta^2 s(x(t+2), u(t+2)) + \cdots\big) \\
&= s(x(t), u(t)) + \rho\sum_{j=t+1}^{\infty}\beta^{j-t}s(x(j), u(j)),
\end{aligned} \tag{56}$$

where $\rho$ is a positive constant and $\beta$ denotes the discount factor, which satisfies $0 < \beta < 1$. The goal of the ADHDP is to find the control input $u(t)$ which minimizes the cost function. Let the optimal cost function be denoted by $J^*(x(t), u(t))$, i.e.

$$J^*(x(t), u(t)) = \min_{u(t)}\left\{s(x(t), u(t)) + \rho\sum_{j=t+1}^{\infty}\beta^{j-t}s(x(j), u(j))\right\}. \tag{57}$$

By optimal control theory, $J^*(x(t), u(t))$ satisfies the following Bellman equation:

$$J^*(x(t), u(t)) = \min_{u(t)}\big\{s(x(t), u(t)) + (\rho - 1)\beta s(x(t+1), u(t+1)) + \beta J^*(x(t+1), u(t+1))\big\}. \tag{58}$$

Since the exact solution of Eq. (58) is difficult to obtain, NNs are utilized in the ADHDP to approximately solve the Bellman equation. The output of the critic network is the estimation of $J^*(x(t), u(t))$, which is denoted by $\hat J(x(t), u(t))$. The control policy $u(t)$ is generated by the action network based on $\hat J(x(t), u(t))$. Besides, it can be seen from Eq. (57) that if $e(t)$ converges to zero, then the ADHDP control input $u(t) = 0$, $J^*(x(t), u(t)) = 0$, and the altitude system states $h$, $\gamma$, $\theta$, $q$ track their desired values, respectively. On the other hand, when $e(t)$ deviates from zero, the ADHDP provides a supplementary control input to reduce the magnitude of $e(t)$ and improve the control performance.

In ADHDP, both the critic network and the action network are multi-layer perceptrons with one hidden layer. A common hyperbolic tangent function $z(t) = \dfrac{1 - e^{-t}}{1 + e^{-t}}$ is applied as the activation function of the hidden layer. Let the numbers of inputs, hidden nodes and outputs of the action network be denoted by $a_i$, $a_h$ and $a_o$, respectively. From the idea of ADHDP, the critic network takes all the inputs and outputs of the action network as inputs, and outputs the estimated cost function $\hat J(t)$. Hence, the critic network has $c_i = a_i + a_o$ inputs, $c_h$ hidden nodes, and one output. In this section, the indexes of the critic network and action network nodes are denoted by $i$, $j$ and $k$ for brevity, which will not cause confusion with Section 3.1. Let $\varphi_{cj}(t)$ denote the input to the $j$-th hidden node of the critic network, and $\sigma_{cj}(t)$ denote the related output; then we have

$$\begin{cases}
\varphi_{cj}(t) = \displaystyle\sum_{i=1}^{a_i} w_{cij}^{(1)}(t)x_i(t) + \sum_{i=1}^{a_o} w_{c(a_i+i)j}^{(1)}(t)u_i(t), \\[3mm]
\sigma_{cj}(t) = z(\varphi_{cj}(t)) = \dfrac{1 - e^{-\varphi_{cj}(t)}}{1 + e^{-\varphi_{cj}(t)}}, \\[3mm]
\hat J(t) = \displaystyle\sum_{j=1}^{c_h} w_{cj}^{(2)}(t)\sigma_{cj}(t),
\end{cases} \tag{59}$$

where $w_{cij}^{(1)}(t)$ denotes the weight of the link between the $i$-th input node and the $j$-th hidden node, $w_{cj}^{(2)}(t)$ denotes the weight of the link between the $j$-th hidden node and the output, and $\hat J(t) = \hat J(x(t), u(t))$. It has been proven in [44,45] that if the input-to-hidden layer weights are selected at random initially and kept fixed, the NN reconstruction errors can be made arbitrarily small when the number of hidden nodes is sufficiently large. Therefore, we fix the weights $w_{cij}^{(1)}(t)$ and only update $w_{cj}^{(2)}(t)$ during the learning process. From Eq. (58), the error function for training the critic network is defined as

$$e_c(t) = \beta\hat J(t) + (\rho - 1)\beta s(t) + s(t-1) - \hat J(t-1), \tag{60}$$

where $s(t) = s(x(t), u(t))$. The related objective function for weight updating is then given by

$$E_c(t) = \frac{1}{2}e_c^2(t). \tag{61}$$

The weights $w_{cj}^{(2)}(t)$ are updated by a gradient-based algorithm to minimize $E_c(t)$. By the chain rule, the weight updating policy is

$$w_{cj}^{(2)}(t+1) = w_{cj}^{(2)}(t) + \Delta w_{cj}^{(2)}(t), \tag{62}$$

where

$$\Delta w_{cj}^{(2)}(t) = -\mu_c\frac{\partial E_c(t)}{\partial w_{cj}^{(2)}(t)} = -\mu_c\frac{\partial E_c(t)}{\partial e_c(t)}\frac{\partial e_c(t)}{\partial\hat J(t)}\frac{\partial\hat J(t)}{\partial w_{cj}^{(2)}(t)} = -\beta\mu_c e_c(t)\sigma_{cj}(t)$$

with $\mu_c > 0$ being the learning rate. On the other hand, let $\varphi_{aj}(t)$ denote the input to the $j$-th hidden node of the action network, and $\sigma_{aj}(t)$ denote the related output; one obtains
$$\begin{cases}
\varphi_{aj}(t) = \displaystyle\sum_{i=1}^{a_i} w_{aij}^{(1)}(t)x_i(t), \\[3mm]
\sigma_{aj}(t) = z(\varphi_{aj}(t)) = \dfrac{1 - e^{-\varphi_{aj}(t)}}{1 + e^{-\varphi_{aj}(t)}}, \\[3mm]
u_k(t) = \displaystyle\sum_{j=1}^{a_h} w_{ajk}^{(2)}(t)\sigma_{aj}(t),
\end{cases} \tag{63}$$

where $w_{aij}^{(1)}(t)$ denotes the weight of the link between the $i$-th input node and the $j$-th hidden node, $w_{ajk}^{(2)}(t)$ denotes the weight of the link between the $j$-th hidden node and the $k$-th output node, and $u_k(t)$ denotes the $k$-th output.

Similar to the critic network, only the weights $w_{ajk}^{(2)}(t)$ are updated during the learning process. The error function for training the action network is

$$e_a(t) = \hat J(t) - U_c, \tag{64}$$

where $U_c$ denotes the desired ultimate cost objective. Since the goal of ADHDP is to let $e(t)$ converge to zero, $u(t) = 0$ and $\hat J(t) = 0$ as a result. Hence, we set $U_c = 0$ without loss of generality. The related objective function for weight updating is then given as

$$E_a(t) = \frac{1}{2}e_a^2(t). \tag{65}$$

The weight updating policy of $w_{ajk}^{(2)}(t)$ is to minimize $E_a(t)$ using a gradient-based algorithm:

$$w_{ajk}^{(2)}(t+1) = w_{ajk}^{(2)}(t) + \Delta w_{ajk}^{(2)}(t), \tag{66}$$

where

$$\begin{aligned}
\Delta w_{ajk}^{(2)}(t) &= -\mu_a\frac{\partial E_a(t)}{\partial w_{ajk}^{(2)}(t)} = -\mu_a\frac{\partial E_a(t)}{\partial e_a(t)}\frac{\partial e_a(t)}{\partial\hat J(t)}\frac{\partial\hat J(t)}{\partial w_{ajk}^{(2)}(t)} \\
&= -\mu_a\hat J(t)\left[\sum_{l=1}^{c_h} w_{cl}^{(2)}(t)\,\frac{1}{2}\big(1 - \sigma_{cl}^2(t)\big)\,w_{c(a_i+k)l}^{(1)}(t)\right]\sigma_{aj}(t)
\end{aligned}$$

with $\mu_a > 0$ being the learning rate.

The Lyapunov stability analysis of the ADHDP algorithm follows the lines of [35] and [47]. For brevity, we define the following variables:

$$w_c(t) = \big[w_{c1}^{(2)}(t), w_{c2}^{(2)}(t), \cdots, w_{cc_h}^{(2)}(t)\big]^T,$$

$$w_a(t) = \begin{bmatrix}
w_{a11}^{(2)}(t) & w_{a12}^{(2)}(t) & \cdots & w_{a1a_o}^{(2)}(t) \\
w_{a21}^{(2)}(t) & w_{a22}^{(2)}(t) & \cdots & w_{a2a_o}^{(2)}(t) \\
\vdots & \vdots & \ddots & \vdots \\
w_{aa_h1}^{(2)}(t) & w_{aa_h2}^{(2)}(t) & \cdots & w_{aa_ha_o}^{(2)}(t)
\end{bmatrix},$$

$$\sigma_c = [\sigma_{c1}, \sigma_{c2}, \cdots, \sigma_{cc_h}]^T, \quad \sigma_a = [\sigma_{a1}, \sigma_{a2}, \cdots, \sigma_{aa_h}]^T,$$

$$\tilde w_c(t) = w_c(t) - w_c^*, \quad \tilde w_a(t) = w_a(t) - w_a^*,$$

where $w_c^*$ and $w_a^*$ denote the optimal hidden-to-output layer weights of the critic network and action network, respectively. We then have $\hat J(t) = w_c^T(t)\sigma_c(t)$, $u(t) = w_a^T(t)\sigma_a(t)$, and the following theorem:

Theorem 2. Consider the critic network and action network established as Eqs. (59) and (63), respectively. The weight updating rules follow Eqs. (62) and (66). Then the weight estimation errors $\tilde w_c(t)$ and $\tilde w_a(t)$ are uniformly ultimately bounded under the following conditions:

$$\mu_c < \frac{1}{\beta^2\|\sigma_c(t)\|^2}, \quad \mu_a < \frac{1}{\|\sigma_a(t)\|^2\|w_c^T(t)A(t)\|^2},$$

where $A(t) \in \mathbb{R}^{c_h\times a_o}$, whose $(j, k)$ entry is $a_{jk}(t) = \frac{1}{2}\big(1 - \sigma_{cj}^2(t)\big)\times w_{c(a_i+k)j}^{(1)}(t)$, $j = 1, 2, \cdots, c_h$, $k = 1, 2, \cdots, a_o$.

Proof. From Eqs. (62) and (66), the dynamics of $\tilde w_c(t)$ and $\tilde w_a(t)$ can be expressed as

$$\begin{aligned}
\tilde w_c(t+1) &= w_c(t+1) - w_c^* \\
&= \tilde w_c(t) - \beta\mu_c\sigma_c(t)\big[\beta w_c^T(t)\sigma_c(t) + (\rho - 1)\beta s(t) + s(t-1) - w_c^T(t-1)\sigma_c(t-1)\big] \\
&= \big(I - \beta^2\mu_c\sigma_c(t)\sigma_c^T(t)\big)\tilde w_c(t) - \beta\mu_c\sigma_c(t)\big[\beta w_c^{*T}\sigma_c(t) + (\rho - 1)\beta s(t) + s(t-1) - w_c^T(t-1)\sigma_c(t-1)\big],
\end{aligned} \tag{67}$$

$$\tilde w_a(t+1) = w_a(t+1) - w_a^* = \tilde w_a(t) - \mu_a\big[w_c^T(t)\sigma_c(t)\big]\sigma_a(t)\big[w_c^T(t)A(t)\big]. \tag{68}$$

Choose the following Lyapunov function candidate:

$$L_{AC}(t) = L_c(t) + L_a(t), \tag{69}$$

where $L_c(t) = \frac{1}{\mu_c}\mathrm{tr}\big(\tilde w_c^T(t)\tilde w_c(t)\big)$ and $L_a(t) = \frac{1}{\mu_a}\mathrm{tr}\big(\tilde w_a^T(t)\tilde w_a(t)\big)$. By computing the first difference of $L_{AC}(t)$, we obtain

$$\Delta L_{AC}(t) = L_{AC}(t+1) - L_{AC}(t) = \Delta L_c(t) + \Delta L_a(t). \tag{70}$$

For $\Delta L_c(t)$, it can be verified that

$$\begin{aligned}
\Delta L_c(t) ={}& \frac{1}{\mu_c}\mathrm{tr}\big(\tilde w_c^T(t+1)\tilde w_c(t+1) - \tilde w_c^T(t)\tilde w_c(t)\big) \\
={}& \frac{1}{\mu_c}\Big\{\tilde w_c^T(t)\big(I - \beta^2\mu_c\sigma_c(t)\sigma_c^T(t)\big)^2\tilde w_c(t) - \tilde w_c^T(t)\tilde w_c(t) \\
&- 2\beta\mu_c\tilde w_c^T(t)\big(I - \beta^2\mu_c\sigma_c(t)\sigma_c^T(t)\big)\sigma_c(t)\big[\beta w_c^{*T}\sigma_c(t) + (\rho - 1)\beta s(t) + s(t-1) - w_c^T(t-1)\sigma_c(t-1)\big] \\
&+ \beta^2\mu_c^2\sigma_c^T(t)\sigma_c(t)\big[\beta w_c^{*T}\sigma_c(t) + (\rho - 1)\beta s(t) + s(t-1) - w_c^T(t-1)\sigma_c(t-1)\big]^2\Big\}.
\end{aligned} \tag{71}$$

Denote $\zeta_c(t) = \tilde w_c^T(t)\sigma_c(t)$ and $\omega_1(t) = w_c^{*T}\sigma_c(t) + (\rho - 1)s(t) + \beta^{-1}s(t-1) - \beta^{-1}w_c^T(t-1)\sigma_c(t-1)$. One can get from Eq. (71) that

$$\begin{aligned}
\Delta L_{c1}(t) &= \tilde w_c^T(t)\big(I - \beta^2\mu_c\sigma_c(t)\sigma_c^T(t)\big)^2\tilde w_c(t) - \tilde w_c^T(t)\tilde w_c(t) \\
&= \tilde w_c^T(t)\big[-2\beta^2\mu_c\sigma_c(t)\sigma_c^T(t) + \beta^4\mu_c^2\big(\sigma_c(t)\sigma_c^T(t)\big)^2\big]\tilde w_c(t) \\
&= -\beta^2\mu_c\|\zeta_c(t)\|^2 - \beta^2\mu_c\|\zeta_c(t)\|^2\big(1 - \beta^2\mu_c\|\sigma_c(t)\|^2\big),
\end{aligned}$$

$$\begin{aligned}
\Delta L_{c2}(t) &= -2\beta\mu_c\tilde w_c^T(t)\big(I - \beta^2\mu_c\sigma_c(t)\sigma_c^T(t)\big)\sigma_c(t)\big[\beta w_c^{*T}\sigma_c(t) + (\rho - 1)\beta s(t) + s(t-1) - w_c^T(t-1)\sigma_c(t-1)\big] \\
&= -2\beta^2\mu_c\zeta_c(t)\big(1 - \beta^2\mu_c\|\sigma_c(t)\|^2\big)\omega_1(t),
\end{aligned}$$

$$\begin{aligned}
\Delta L_{c3}(t) &= \beta^2\mu_c^2\sigma_c^T(t)\sigma_c(t)\big[\beta w_c^{*T}\sigma_c(t) + (\rho - 1)\beta s(t) + s(t-1) - w_c^T(t-1)\sigma_c(t-1)\big]^2 \\
&= \beta^4\mu_c^2\|\sigma_c(t)\|^2\|\omega_1(t)\|^2.
\end{aligned}$$

Then $\Delta L_c(t)$ can be further expressed as

$$\begin{aligned}
\Delta L_c(t) ={}& \frac{1}{\mu_c}\big(\Delta L_{c1}(t) + \Delta L_{c2}(t) + \Delta L_{c3}(t)\big) \\
={}& -\beta^2\|\zeta_c(t)\|^2 - \beta^2\big(1 - \beta^2\mu_c\|\sigma_c(t)\|^2\big)\|\zeta_c(t)\|^2 \\
&- 2\beta^2\big(1 - \beta^2\mu_c\|\sigma_c(t)\|^2\big)\zeta_c(t)\omega_1(t) + \beta^4\mu_c\|\sigma_c(t)\|^2\|\omega_1(t)\|^2 \\
\le{}& -\beta^2\|\zeta_c(t)\|^2 - \beta^2\big(1 - \beta^2\mu_c\|\sigma_c(t)\|^2\big)\|\zeta_c(t) + \omega_1(t)\|^2 + \beta^2\omega_1^2(t).
\end{aligned} \tag{72}$$

For $\Delta L_a(t)$, denote $\zeta_a(t) = \tilde w_a^T(t)\sigma_a(t)$; one gets

$$\begin{aligned}
\Delta L_a(t) ={}& \frac{1}{\mu_a}\mathrm{tr}\big(\tilde w_a^T(t+1)\tilde w_a(t+1) - \tilde w_a^T(t)\tilde w_a(t)\big) \\
={}& -2\,\mathrm{tr}\big\{[w_c^T(t)\sigma_c(t)]\tilde w_a^T(t)\sigma_a(t)[w_c^T(t)A(t)]\big\} \\
&+ \mu_a\mathrm{tr}\big\{[A^T(t)w_c(t)]\sigma_a^T(t)[w_c^T(t)\sigma_c(t)]^T[w_c^T(t)\sigma_c(t)]\sigma_a(t)[w_c^T(t)A(t)]\big\}.
\end{aligned} \tag{73}$$

The first term of Eq. (73) satisfies

$$\begin{aligned}
\Delta L_{a1}(t) &= -2\,\mathrm{tr}\big\{[w_c^T(t)\sigma_c(t)]\tilde w_a^T(t)\sigma_a(t)[w_c^T(t)A(t)]\big\} = -2[w_c^T(t)\sigma_c(t)][w_c^T(t)A(t)]\zeta_a(t) \\
&= -\|w_c^T(t)\sigma_c(t)\|^2 - \|w_c^T(t)A(t)\zeta_a(t)\|^2 + \|w_c^T(t)A(t)\zeta_a(t) - w_c^T(t)\sigma_c(t)\|^2.
\end{aligned}$$

For the second term of Eq. (73), the following equation can be obtained:

$$\Delta L_{a2}(t) = \mu_a\|w_c^T(t)\sigma_c(t)\|^2\|\sigma_a(t)\|^2\|w_c^T(t)A(t)\|^2.$$

It is not difficult to see that

$$\begin{aligned}
\Delta L_a(t) ={}& \Delta L_{a1}(t) + \Delta L_{a2}(t) \\
={}& -\big(1 - \mu_a\|\sigma_a(t)\|^2\|w_c^T(t)A(t)\|^2\big)\|w_c^T(t)\sigma_c(t)\|^2 - \|w_c^T(t)A(t)\zeta_a(t)\|^2 + \|w_c^T(t)A(t)\zeta_a(t) - w_c^T(t)\sigma_c(t)\|^2 \\
\le{}& -\big(1 - \mu_a\|\sigma_a(t)\|^2\|w_c^T(t)A(t)\|^2\big)\|\omega_2(t)\|^2 + 2\|\omega_2(t)\|^2 + \|\omega_3(t)\|^2,
\end{aligned} \tag{74}$$

where $\omega_2(t) = w_c^T(t)\sigma_c(t)$ and $\omega_3(t) = w_c^T(t)A(t)\zeta_a(t)$. Let $w_{cm}$, $\sigma_{cm}$, $w_{am}$, $\sigma_{am}$ and $s_m$ denote the upper bounds for $w_c(t)$, $\sigma_c(t)$, $w_a(t)$, $\sigma_a(t)$ and $s(t)$, respectively; then there exist positive constants $\omega_{im}$ such that $\|\omega_i(t)\| \le \omega_{im}$, $i = 1, 2, 3$. A combination of Eqs. (72) and (74) gives

$$\begin{aligned}
\Delta L_{AC}(t) ={}& \Delta L_c(t) + \Delta L_a(t) \\
\le{}& -\beta^2\|\zeta_c(t)\|^2 - \beta^2\big(1 - \beta^2\mu_c\|\sigma_c(t)\|^2\big)\|\zeta_c(t) + \omega_1(t)\|^2 - \big(1 - \mu_a\|\sigma_a(t)\|^2\|w_c^T(t)A(t)\|^2\big)\|\omega_2(t)\|^2 \\
&+ \beta^2\omega_1^2(t) + 2\|\omega_2(t)\|^2 + \|\omega_3(t)\|^2 \\
\le{}& -\beta^2\|\zeta_c(t)\|^2 - \beta^2\big(1 - \beta^2\mu_c\|\sigma_c(t)\|^2\big)\|\zeta_c(t) + \omega_1(t)\|^2 - \big(1 - \mu_a\|\sigma_a(t)\|^2\|w_c^T(t)A(t)\|^2\big)\|\omega_2(t)\|^2 \\
&+ \beta^2\omega_{1m}^2 + 2\omega_{2m}^2 + \omega_{3m}^2.
\end{aligned} \tag{75}$$

From Eq. (75), if $\mu_c < \dfrac{1}{\beta^2\|\sigma_c(t)\|^2}$ and $\mu_a < \dfrac{1}{\|\sigma_a(t)\|^2\|w_c^T(t)A(t)\|^2}$, then for any $\|\zeta_c(t)\|^2 > \omega_{1m}^2 + \frac{1}{\beta^2}(2\omega_{2m}^2 + \omega_{3m}^2)$, we get $\Delta L_{AC}(t) < 0$. Noticing that $\|\zeta_c(t)\| \le \sigma_{cm}\|\tilde w_c(t)\|$, one obtains

$$\|\tilde w_c(t)\| > \frac{1}{\sigma_{cm}}\sqrt{\omega_{1m}^2 + \frac{1}{\beta^2}\big(2\omega_{2m}^2 + \omega_{3m}^2\big)},$$

which means the weight estimation errors $\tilde w_c(t)$ and $\tilde w_a(t)$ are uniformly ultimately bounded. □
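The critic and action updates of Eqs. (59)–(66) can be sketched in plain Python as follows. This is an illustrative sketch rather than the authors' implementation: the function names, the list-of-lists weight layout (`W1[i][j]` links input `i` to hidden node `j`) and the helper `random_matrix` are assumptions, and only the hidden-to-output weights are updated, as described in the text.

```python
import math
import random


def z(phi):
    """Activation (1 - e^{-phi}) / (1 + e^{-phi}), equal to tanh(phi / 2)."""
    return math.tanh(0.5 * phi)


def hidden(W1, inputs):
    """Hidden-layer outputs for fixed input-to-hidden weights W1."""
    return [z(sum(W1[i][j] * inputs[i] for i in range(len(inputs))))
            for j in range(len(W1[0]))]


def action(Wa1, Wa2, x):
    """Action network (63): returns the control u and hidden outputs."""
    sig_a = hidden(Wa1, x)
    u = [sum(Wa2[j][k] * sig_a[j] for j in range(len(sig_a)))
         for k in range(len(Wa2[0]))]
    return u, sig_a


def critic(Wc1, wc2, x, u):
    """Critic network (59): returns J_hat and hidden outputs."""
    sig_c = hidden(Wc1, list(x) + list(u))
    return sum(w * s for w, s in zip(wc2, sig_c)), sig_c


def critic_step(Wc1, wc2, x, u, s_now, s_prev, J_prev, beta, rho, mu_c):
    """One critic update, Eqs. (60)-(62); returns new weights and e_c."""
    J, sig_c = critic(Wc1, wc2, x, u)
    e_c = beta * J + (rho - 1.0) * beta * s_now + s_prev - J_prev
    return [w - beta * mu_c * e_c * s for w, s in zip(wc2, sig_c)], e_c


def action_step(Wa1, Wa2, Wc1, wc2, x, mu_a):
    """One action update, Eqs. (64)-(66) with U_c = 0."""
    u, sig_a = action(Wa1, Wa2, x)
    J, sig_c = critic(Wc1, wc2, x, u)
    n_x = len(x)
    # dJ/du_k = sum_l wc2[l] * 0.5 * (1 - sig_c[l]^2) * Wc1[n_x + k][l]
    dJ_du = [sum(wc2[l] * 0.5 * (1.0 - sig_c[l] ** 2) * Wc1[n_x + k][l]
                 for l in range(len(wc2))) for k in range(len(u))]
    new_Wa2 = [[Wa2[j][k] - mu_a * J * dJ_du[k] * sig_a[j]
                for k in range(len(u))] for j in range(len(sig_a))]
    return new_Wa2, J


def random_matrix(rows, cols, rng, bound=0.3):
    """Weights drawn uniformly from [-bound, bound]."""
    return [[rng.uniform(-bound, bound) for _ in range(cols)]
            for _ in range(rows)]
```

In a closed-loop run, `action` would be evaluated at every sample on the error vector $x(t) = [e_1, e_2, e_3, e_4]^T$, followed by a bounded number of `critic_step` and `action_step` iterations; how those inner iterations are scheduled is not specified here and is left as an assumption of the user.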
Remark 10. The second component of the controller serves to further improve the control performance; there is no flight regime in which its involvement is required. The basic part controller alone is sufficient for stability of the closed-loop system and boundedness of the altitude tracking error. The supplementary part controller is therefore employed to enhance the performance of the basic part controller by exploiting the online-learning merit of the ADHDP method. Several items related to the second component of the controller should be taken into consideration:
(i) Feasibility of the control inputs. The magnitude of the supplementary control input should be small, since the main role in the control system is played by the basic part controller. The supplementary part controller can be regarded as a fine tuning of the basic part controller.
(ii) Importance of the tracking errors for a particular flight region. We do not attempt to identify the flight regions where the tracking errors are vital. The tracking errors are adopted mainly to demonstrate the effectiveness of the proposed approach; the main idea of this paper is that the ADHDP scheme can be employed to further decrease them.
(iii) Difference from the nominal approaches. The ideas behind the basic part controller and the supplementary part controller are different. The basic part controller starts from Lyapunov stability theory and develops the control law from a construction of Lyapunov functions; its design principle is stability. The supplementary part controller is devised via the ADHDP technique, which focuses on generating the optimal control input such that a user-defined cost function is minimized. The control performance is reflected by the cost function, so the supplementary part controller can be exploited to improve it.
(iv) Real-time computation. The ADHDP scheme is capable of online learning. However, only the convergence of the weight estimation errors can be guaranteed, and the optimal values may not be achieved. The computational load can be adjusted by tuning parameters such as the learning rates and the number of iterations. As a result, the ADHDP scheme satisfies the real-time requirement, while optimality of the solutions cannot be guaranteed theoretically.
(v) Verifiability of the algorithm for a real-time aerospace application. Such verification is not presented in this paper. We have planned a real-time aerospace application of the algorithm, which will constitute the simulation part of our next paper.

Fig. 3. The variation of $C_D$.
Fig. 4. The variation of $C_L$.
We use linear disturbance observers similar to the form of (44) with l1γ = 3, l2γ = 2.25, εγ = 0.06, l1q = 3, l2q = 2.25, εq = 0.06. For the NN W 1T,i S 1 ( X 1 ), we take N γ = 1 and use 150 nodes with the centers evenly spaced on [25, 30] × [−0.5, 0.5] × [−1, 1] × [−10, 10] × [−2, 2] × [−0.5, 0.5], where the width is 1.5. For the NN W 2T,i S 2 ( X 1 ), we choose N q = 5 and adopt 180 nodes with the centers evenly spaced on [25, 30] × [−0.5, 0.5] × [−0.5, 0.5] × [−0.5, 0.5] × [−5, 5] × [−0.5, 0.5] × [−1, 1] where the width is 1.5. For estimating 1 and 2 , we take r1 = 5, 1 = 0.5, κ1 = 0.02, ˆ 1 (0) = 0, ˆ 2 (0) = 0. r2 = 0.3, 2 = 0.5, κ2 = 1, For the ADHDP based supplementary part, the utility weight matrix R r = diag(0.5, 0.3, 0.3, 0.3, 0.1), the discount factor β = 0.95, ρ = 1.2. The critic network has 30 hidden nodes with an initial learning rate μc = 0.5 and the maximum iteration number being 120. The action network has 25 hidden nodes with an initial learning rate μa = 0.5 and the maximum iteration number being 100. The learning rates μa and μa are decreased by 0.01 every ten time steps until they are less than 0.005 and kept fixed, or be reset as initial values if |e 1 (t )| > 0.1. Both the weight values of the critic network and action network are initialized in [−0.3, 0.3] at random, and the input-to-hidden layer weights are then kept fixed. For comparison, the method in [12] is also simulated, and the related results are shown in Figs. 5–18, where basic part controller described in Section 3.1 is denoted by m1, the combination of basic part and supplementary part controller is denoted by m1+ADP, and the method in [12] is denoted by m2. Figs. 5 and 6 give the altitude tracking performance of the three methods. It can be seen that all three schemes can guarantee the boundedness of the altitude tracking error, while both the m1 and m1+ADP method can ensure better tracking performance and that the altitude tracking error converges towards zero. 
In particular, compared with the m1 method, the m1+ADP method has smaller overshoot and similar steady performance. The supplementary controller is mainly for improvement of the control performance which takes the error-related terms defined in the backstepping design method as states and the focus of our problem is altitude tracking. To summarize, the supplementary controller further reduces the altitude tracking error as shown in Fig. 6, which is the main idea of our work. The reason why there is a steady state error for the m2 method may be due to the presence of disturbances and the related controller gains are not big enough. Besides, the assumption on the bound of the flight path angle γ may also contribute to the tracking error. It is noted that we adopt the integration of tracking error to for derivation of the controller such that the steady state error is eliminated.
Remark 11. Compared with previous works, switched nonlinear systems are adopted to describe the longitudinal altitude motion, and the designed controller is divided into a basic part and a supplementary part; these two points constitute the main innovation of our work. For the basic part of the controller, neural networks are exploited to devise the common virtual control laws and the final control input. Disturbance observers are designed to attenuate the disturbances caused by internal and external uncertainties. For the supplementary part of the controller, the ADHDP scheme is utilized, and the states of the algorithm are based on the related error variables defined in the backstepping design process. Correspondingly, the supplementary part is developed to further decrease the tracking errors of the basic part and to improve the control performance. It is noted that the backstepping method is model-based and focuses on stability, while the ADHDP technique is data-based and focuses on optimality. To summarize, this work demonstrates that data-based control approaches can be combined with model-based control approaches to enhance control performance without violating the stability of the closed-loop system.
4. Numerical simulation
A comparative simulation is conducted to demonstrate the effectiveness of the proposed control scheme. The parameters and aerodynamic coefficients of the morphing aircraft model are mainly taken from [12], and the aerodynamic coefficients are subject to 20% uncertainties in the simulation. The nominal values of the aerodynamic coefficients $C_D$ and $C_L$ are given in Figs. 3 and 4, respectively. The initial values of the altitude system (2) are taken as $[h_0, \gamma_0, \alpha_0, q_0] = [1000\ \mathrm{m},\ 0^\circ,\ 0.4976^\circ,\ 0^\circ/\mathrm{s}]$. The reference signal $h_r$ varies from $h_0$ to 1100 m through the transfer function $\frac{0.04}{s^2 + 0.4s + 0.04}$. The sweep angle $\xi$ varies from $0^\circ$ to $45^\circ$ through the transfer function $\frac{0.01}{s^2 + 0.2s + 0.01}$, where, for the controller design, it is assumed that the system switching occurs every $5^\circ$ in the available $[0^\circ, 45^\circ]$ range. The controller parameters are selected as $k_0 = 0.2$, $k_1 = 1$, $\bar{k}_1 = 0.5$, $k_2 = 4$, $k_3 = 4$, $k_4 = 6$, $\tau_1 = 0.05$, $\tau_2 = 0.05$, $\tau_3 = 0.05$.
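The two command filters in this setup can be sketched as follows, under the assumption that both commands are steps applied at $t = 0$. Both transfer functions have a repeated real pole (at $s = -0.2$ and $s = -0.1$, respectively), so their unit-step response has the closed form $1 - (1 + \omega t)e^{-\omega t}$. The function names and the numbering of the switching modes (mode $k$ covering $[5k^\circ, 5(k+1)^\circ)$) are hypothetical choices made for illustration, not taken from the paper.

```python
import math


def critically_damped_step(t, omega):
    """Unit-step response of omega^2 / (s^2 + 2*omega*s + omega^2)."""
    return 1.0 - (1.0 + omega * t) * math.exp(-omega * t)


def altitude_reference(t, h0=1000.0, h_final=1100.0):
    # 0.04 / (s^2 + 0.4 s + 0.04) = omega^2 / (s + omega)^2 with omega = 0.2
    return h0 + (h_final - h0) * critically_damped_step(t, 0.2)


def sweep_angle(t, xi_final=45.0):
    # 0.01 / (s^2 + 0.2 s + 0.01) = omega^2 / (s + omega)^2 with omega = 0.1
    return xi_final * critically_damped_step(t, 0.1)


def switching_index(xi_deg, step_deg=5.0, n_modes=9):
    """Hypothetical mode numbering: one mode per 5-degree band of [0, 45]."""
    return min(int(xi_deg // step_deg), n_modes - 1)


# Sample the commands every 10 s over the first minute.
samples = [(t, altitude_reference(t), switching_index(sweep_angle(t)))
           for t in range(0, 61, 10)]
```

Because the sweep-angle filter is slower than the altitude filter, the switching index keeps changing after the altitude reference has largely settled.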
Fig. 5. Altitude tracking.
Fig. 6. Altitude tracking error.
Fig. 7. The response of the flight path angle γ. (For interpretation of the colors in the figure(s), the reader is referred to the web version of this article.)
Fig. 8. The response of the angle of attack α.
Fig. 9. The response of the pitch rate q.
Fig. 10. The control input δe.
Fig. 11. The response of ˆ 1.
Fig. 13. The disturbance and disturbance estimation in the γ loop (m1).
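The kind of disturbance estimation plotted in Fig. 13 can be illustrated with a generic linear extended state observer on a toy scalar plant. This is only a sketch of the underlying idea and not the observer designed in this paper: the plant $\dot x = u + d(t)$, the sinusoidal (hence differentiable) disturbance, the stabilizing input $u = -x$, the bandwidth parameter `omega_o` and the function name `run_eso` are all assumptions made for illustration.

```python
import math


def run_eso(omega_o=20.0, dt=1e-3, t_end=10.0):
    """Simulate a linear ESO for the assumed plant x_dot = u + d(t).

    Observer:  x_hat_dot = u + d_hat + beta1 * (x - x_hat)
               d_hat_dot = beta2 * (x - x_hat)
    with both observer poles placed at -omega_o.
    Returns a list of (t, d, d_hat) samples."""
    beta1, beta2 = 2.0 * omega_o, omega_o ** 2
    x, x_hat, d_hat = 0.0, 0.0, 0.0
    history = []
    for k in range(int(round(t_end / dt))):
        t = k * dt
        d = 0.5 * math.sin(0.5 * t)      # differentiable lumped disturbance
        u = -x                           # simple stabilizing input (assumed)
        innovation = x - x_hat           # output estimation error
        history.append((t, d, d_hat))
        # Forward-Euler update of plant and observer.
        x_dot = u + d
        x_hat_dot = u + d_hat + beta1 * innovation
        d_hat_dot = beta2 * innovation
        x += dt * x_dot
        x_hat += dt * x_hat_dot
        d_hat += dt * d_hat_dot
    return history


worst_error = max(abs(d - d_hat) for t, d, d_hat in run_eso() if t > 2.0)
```

In this toy setting the estimation error is driven by the derivative of the disturbance, so a larger `omega_o` tightens the error bound at the price of higher sensitivity to measurement noise, which the sketch does not model.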
Figs. 7–9 show the responses of the state variables γ, α and q of the altitude system. It can be observed that γ, α and q stay bounded during the tracking control process for all three control schemes. Fig. 10 gives the corresponding control inputs, where all three control inputs are of the same order and no larger than 18°. Besides, the responses of γ, α, q and δe of the m1+ADP method are similar to those of the m1 method. The peaks in Figs. 7–9 are mainly caused by the inconsistency between the model adopted for motion simulation and the model adopted for controller design. Specifically, the motion model used for simulation is described by Eq. (1) in order to approximate the real motion, and it is a nonlinear parameter-varying system. However, the model used for controller design and stability analysis is simplified into switched nonlinear systems, where the sweep angle takes discrete values and the flight path angle is assumed to be small. During the simulation, the variation of the sweep angle also aggravates this inconsistency, which makes the flight states depicted in Figs. 7–9 oscillate.
Remark 12. It is noted that Fig. 8 and Fig. 9 describe the related flight states and Fig. 10 depicts the control input. The specific values in Fig. 8, Fig. 9 and Fig. 10 do not reflect the altitude tracking control performance directly. We acknowledge that the influence of the supplementary controller on the related flight states and control input is not analyzed and will be the topic of our next paper.
For the m1 and m1+ADP methods, the responses of ˆ 1 and ˆ 2 are depicted in Figs. 11 and 12, where both ˆ 1 and ˆ 2 are bounded. We note that ˆ 1 and ˆ 2 are employed to devise the common virtual control laws of the backstepping method for the switched nonlinear systems. The boundedness of ˆ 1 and ˆ 2 is sufficient for stability of the closed-loop system, and they converge to steady values gradually during the simulation, which demonstrates the effectiveness of both the m1 and m1+ADP methods. Figs. 13–16 show the disturbances and the related estimations in the γ and q loops. It can be seen that the designed disturbance observers make the estimations track the disturbances within the first several seconds and that, after that, the estimation errors stay within small bounds even when switching occurs, which demonstrates the effectiveness of the designed disturbance observers. It is noted that the disturbances that can be handled by the disturbance observers should be differentiable and are not random variables of the kind covered by optimal filtering theory. As a result, the references on disturbance observers and extended state observers focus mainly on the convergence of estimation errors and do not discuss statistical results such as RMSE values [29–31]. We follow this practice here and do not emphasize the statistical performance of the disturbance observers, since the design of disturbance observers is not the main part of our work. However, we acknowledge that random measurement noise should be considered for disturbance observers in view of engineering applications, and the related statistical analysis can then be conducted. The control performance in the presence of random measurement noise will be the topic of our further investigation. Finally, for the m1+ADP method, the updating of the first five values of $w_a^{(2)}(t)$ is illustrated in Fig. 17, and Fig. 18 shows the response of the utility function $s(x(t), u(t))$. It can be observed that the weight values adjust adaptively and converge to constant values after a period of time. Correspondingly, the utility function gradually converges towards a small neighborhood of the origin, which exhibits the effectiveness of the learning process of the ADHDP.
Remark 13. Fig. 17 describes the updating of some weights when they are initialized randomly. The positive and negative components of the weights at the beginning of the simulation are mainly caused by the random initialization. Besides, it is noted that the required limits for the weights cannot be provided in theory, since we have only proved the boundedness of the weight estimation errors. The analysis of specific limits for weight updating via the gradient-based algorithm is still an open problem worthy of further study.
Fig. 12. The response of ˆ 2.
Fig. 14. The disturbance and disturbance estimation in the q loop (m1).
Fig. 15. The disturbance and disturbance estimation in the γ loop (m1+ADP).
Fig. 16. The disturbance and disturbance estimation in the q loop (m1+ADP).
Fig. 17. The weight updating of $w_a^{(2)}$.
Fig. 18. The response of the utility function s(x, u).
5. Conclusion
The longitudinal motion control of a morphing aircraft with variable sweep wings based on switched nonlinear systems and ADP is presented. The morphing aircraft dynamics are first modeled as switched nonlinear systems in lower triangular form. Then, a basic part and a supplementary part of the controller are designed. The backstepping technique, combined with a modified dynamic surface, is utilized for the basic part, thereby avoiding the 'explosion of complexity' problem. Besides, considering the existence of internal uncertainties and external disturbances, disturbance observers are designed and combined with RBF NNs to obtain the common virtual control laws. Moreover, the supplementary part is designed by the ADHDP approach, where the differences between the actual and desired values in the backstepping design are viewed as the states of the ADHDP. The simulations show that, compared with the controller that contains the basic part only, the proposed supplementary control input further reduces the altitude tracking error and improves the control performance.
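The gradient-based weight updating behind the ADHDP part can be illustrated with a heavily simplified critic. The sketch below assumes a linear critic $J = w^\top\phi$ and a commonly used ADHDP temporal-difference error $e_c(t) = \alpha J(t) - [J(t-1) - s(t)]$, where $s$ is the utility; the update laws and network structure actually used in this paper may differ. The feature vector `phi`, the discount `alpha`, the learning rate `rate` and all numerical values are hypothetical.

```python
def critic_td_error(w, phi_now, j_prev, utility, alpha=0.95):
    """e_c = alpha * J(t) - [J(t-1) - s(t)] for a linear critic J = w . phi."""
    j_now = sum(wi * pi for wi, pi in zip(w, phi_now))
    return alpha * j_now - (j_prev - utility)


def critic_gradient_step(w, phi_now, j_prev, utility, alpha=0.95, rate=0.1):
    """One gradient-descent step on E_c = 0.5 * e_c**2, holding J(t-1) fixed."""
    e_c = critic_td_error(w, phi_now, j_prev, utility, alpha)
    # dE_c/dw_i = e_c * alpha * phi_i for the linear critic.
    return [wi - rate * e_c * alpha * pi for wi, pi in zip(w, phi_now)]


# Illustrative data only: one fixed transition, replayed 50 times.
phi = [1.0, 0.5, 0.25]        # hypothetical features of (error state, control)
weights = [0.3, -0.2, 0.1]    # arbitrary initial weights
e_start = critic_td_error(weights, phi, j_prev=1.0, utility=0.2)
for _ in range(50):
    weights = critic_gradient_step(weights, phi, j_prev=1.0, utility=0.2)
e_end = critic_td_error(weights, phi, j_prev=1.0, utility=0.2)
```

On this fixed transition each step scales the error by a constant factor smaller than one, so the weight increments shrink as the error decays. This mirrors, in a much simpler setting, the way the weights in Fig. 17 settle to constant values.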
Declaration of competing interest
The authors declare that they have no conflict of interest with the present manuscript.
Acknowledgements
This work was supported by the National Natural Science Foundation of China under Grants 61374012, 61403028, 61873295 and 61833016, the Aeronautical Science Foundation of China (2016ZA51011) and the Shanghai Aerospace Science and Technology Innovation Foundation (SAST2017-096). The authors are also grateful to the anonymous reviewers for their constructive comments on the paper.
References
[1] F. Afonso, J. Vale, F. Lau, A. Suleman, Performance based multidisciplinary design optimization of morphing aircraft, Aerosp. Sci. Technol. 67 (2017) 1–12.
[2] R.M. Ajaj, C.S. Beaverstock, M.I. Friswell, Morphing aircraft: the need for a new design philosophy, Aerosp. Sci. Technol. 49 (2016) 154–166.
[3] T.A. Weisshaar, Morphing aircraft systems: historical perspectives and future challenges, J. Aircr. 50 (2) (2013) 337–353.
[4] J. Sun, Q. Guan, Y. Liu, J. Leng, Morphing aircraft based on smart materials and structures: a state-of-the-art review, J. Intell. Mater. Syst. Struct. 27 (17) (2016) 2289–2312.
[5] M. Wu, T. Xiao, H. Ang, H. Li, Optimal flight planning for a Z-shaped morphing-wing solar-powered unmanned aerial vehicle, J. Guid. Control Dyn. 41 (2) (2018) 497–505.
[6] T. Yue, L. Wang, J. Ai, Longitudinal linear parameter varying modeling and simulation of morphing aircraft, J. Aircr. 50 (6) (2013) 1673–1681.
[7] T. Yue, L. Wang, J. Ai, Gain self-scheduled H∞ control for morphing aircraft in the wing transition process based on an LPV model, Chin. J. Aeronaut. 26 (4) (2013) 909–917.
[8] T. Guo, Z. Hou, B. Zhu, Dynamic modeling and active morphing trajectory-attitude separation control approach for gull-wing aircraft, IEEE Access 5 (2017) 17006–17019.
[9] Z. He, M. Yin, Y.P. Lu, Tensor product model-based control of morphing aircraft in transition process, Proc. Inst. Mech. Eng., G J. Aerosp. Eng. 230 (2) (2016) 378–391.
[10] N. Wen, Z. Liu, L. Zhu, Linear-parameter-varying-based adaptive sliding mode control with bounded L2 gain performance for a morphing aircraft, Proc. Inst. Mech. Eng., G J. Aerosp. Eng. (2018) 0954410018764472.
[11] Z. Wu, J. Lu, J. Rajput, J. Shi, W. Ma, Adaptive neural control based on high order integral chained differentiator for morphing aircraft, Math. Probl. Eng. (2015) 2015.
[12] Z. Wu, J. Lu, Q. Zhou, J. Shi, Modified adaptive neural dynamic surface control for morphing aircraft with input and output constraints, Nonlinear Dyn. 87 (4) (2017) 2367–2383.
[13] Z. Wu, J. Lu, J. Shi, Y. Liu, Q. Zhou, Robust adaptive neural control of morphing aircraft with prescribed performance, Math. Probl. Eng. (2017) 2017.
[14] T. Yue, X. Zhang, L. Wang, J. Ai, Flight dynamic modeling and control for a telescopic wing morphing aircraft via asymmetric wing morphing, Aerosp. Sci. Technol. 70 (2017) 328–338.
[15] J. Li, C. Gao, C. Li, W. Jing, A survey on moving mass control technology, Aerosp. Sci. Technol. 82–83 (2018) 594–606.
[16] J. Li, C. Gao, W. Jing, Y. Fan, Nonlinear vibration analysis of a novel moving mass flight vehicle, Nonlinear Dyn. 90 (1) (2017) 733–748.
[17] T. Wang, C. Dong, Q. Wang, Finite-time boundedness control of morphing aircraft based on switched systems approach, Optik, Int. J. Light Electron Opt. 126 (23) (2015) 4436–4445.
[18] W. Jiang, C. Dong, Q. Wang, A systematic method of smooth switching LPV controllers design for a morphing aircraft, Chin. J. Aeronaut. 28 (6) (2015) 1640–1649.
[19] H. Cheng, C. Dong, W. Jiang, Q. Wang, Y. Hou, Non-fragile switched H∞ control for morphing aircraft with asynchronous switching, Chin. J. Aeronaut. 30 (3) (2017) 1127–1139.
[20] H. Cheng, W. Fu, C. Dong, Q. Wang, Y. Hou, Asynchronously finite-time H∞ control for morphing aircraft, Trans. Inst. Meas. Control (2018) 0142331217746737.
[21] X. Zhao, X. Zheng, B. Niu, L. Liu, Adaptive tracking control for a class of uncertain switched nonlinear systems, Automatica 52 (2015) 185–191.
[22] L. Long, J. Zhao, Adaptive output-feedback neural control of switched uncertain nonlinear systems with average dwell time, IEEE Trans. Neural Netw. Learn. Syst. 26 (7) (2015) 1350–1362.
[23] D. Zhai, L. An, J. Dong, Q. Zhang, Switched adaptive fuzzy tracking control for a class of switched nonlinear systems under arbitrary switching, IEEE Trans. Fuzzy Syst. 26 (2) (2018) 585–597.
[24] Y. Li, S. Tong, L. Liu, G. Feng, Adaptive output-feedback control design with prescribed performance for switched nonlinear systems, Automatica 80 (2017) 225–231.
[25] B. Niu, Y. Liu, G. Zong, Z. Han, J. Fu, Command filter-based adaptive neural tracking controller design for uncertain switched nonlinear output-constrained systems, IEEE Trans. Cybern. 47 (10) (2017) 3160–3171.
[26] J. Liu, S. Vazquez, L. Wu, A. Marquez, H. Gao, L.G. Franquelo, Extended state observer-based sliding-mode control for three-phase power converters, IEEE Trans. Ind. Electron. 64 (1) (2017) 22–31.
[27] B. Li, Q. Hu, G. Ma, Extended state observer based robust attitude control of spacecraft with input saturation, Aerosp. Sci. Technol. 50 (2016) 173–182.
[28] Z. Peng, J. Wang, Output-feedback path-following control of autonomous underwater vehicles based on an extended state observer and projection neural networks, IEEE Trans. Syst. Man Cybern. Syst. 48 (4) (2018) 535–544.
[29] B.Z. Guo, Z.L. Zhao, On the convergence of an extended state observer for nonlinear systems with uncertainty, Syst. Control Lett. 60 (6) (2011) 420–430.
[30] B.Z. Guo, Z.L. Zhao, On convergence of the nonlinear active disturbance rejection control for MIMO systems, SIAM J. Control Optim. 51 (2) (2013) 1727–1757.
[31] Z.L. Zhao, B.Z. Guo, Extended state observer for uncertain lower triangular nonlinear systems, Syst. Control Lett. 85 (2015) 100–108.
[32] Y. Jiang, Z.P. Jiang, Global adaptive dynamic programming for continuous-time nonlinear systems, IEEE Trans. Autom. Control 60 (11) (2015) 2917–2929.
[33] Q. Wei, D. Liu, H. Lin, Value iteration adaptive dynamic programming for optimal control of discrete-time nonlinear systems, IEEE Trans. Cybern. 46 (3) (2016) 840–853.
[34] C. Mu, Z. Ni, C. Sun, H. He, Data-driven tracking control with adaptive dynamic programming for a class of continuous-time nonlinear systems, IEEE Trans. Cybern. 47 (6) (2017) 1460–1470.
[35] C. Mu, Z. Ni, C. Sun, H. He, Air-breathing hypersonic vehicle tracking control based on adaptive dynamic programming, IEEE Trans. Neural Netw. Learn. Syst. 28 (3) (2017) 584–598.
[36] Q. Wei, D. Liu, F.L. Lewis, Y. Liu, J. Zhang, Mixed iterative adaptive dynamic programming for optimal battery energy control in smart residential microgrids, IEEE Trans. Ind. Electron. 64 (5) (2017) 4110–4120.
[37] J. Sun, C. Liu, Q. Ye, Robust differential game guidance laws design for uncertain interceptor-target engagement via adaptive dynamic programming, Int. J. Control 90 (5) (2017) 990–1004.
[38] M.M. Polycarpou, P.A. Ioannou, A robust adaptive nonlinear control design, Automatica 32 (3) (1996) 423–427.
[39] Q. Hu, Y. Meng, Adaptive backstepping control for air-breathing hypersonic vehicle with actuator dynamics, Aerosp. Sci. Technol. 67 (2017) 412–421.
[40] B. Xu, D. Wang, Y. Zhang, Z. Shi, DOB-based neural control of flexible hypersonic flight vehicle considering wind effects, IEEE Trans. Ind. Electron. 64 (11) (2017) 8676–8685.
[41] Q. Dong, Q. Zong, B. Tian, C. Zhang, W. Liu, Adaptive disturbance observer-based finite-time continuous fault-tolerant control for reentry RLV, Int. J. Robust Nonlinear Control 27 (18) (2017) 4275–4295.
[42] M. Ran, Q. Wang, C. Dong, Stabilization of a class of nonlinear systems with actuator saturation via active disturbance rejection control, Automatica 63 (2016) 302–310.
[43] M. Krstic, I. Kanellakopoulos, P.V. Kokotovic, Nonlinear and Adaptive Control Design, Wiley, New York, 1995.
[44] B. Igelnik, Y.H. Pao, Stochastic choice of basis functions in adaptive function approximation and the functional-link net, IEEE Trans. Neural Netw. Learn. Syst. 6 (6) (1995) 1320–1329.
[45] F. Liu, J. Sun, J. Si, W. Guo, S. Mei, A boundedness result for the direct heuristic dynamic programming, Neural Netw. 32 (2012) 229–235.
[46] Z. Liu, B. Chen, C. Lin, Adaptive neural backstepping for a class of switched nonlinear system without strict-feedback form, IEEE Trans. Syst. Man Cybern. Syst. 47 (7) (2017) 1315–1320.
[47] Y. Sokolov, R. Kozma, L.D. Werbos, P.J. Werbos, Complete stability analysis of a heuristic approximate dynamic programming control design, Automatica 59 (2015) 9–18.