Automatica 43 (2007) 1156 – 1164 www.elsevier.com/locate/automatica
System theory for numerical analysis夡 Kenji Kashima a,∗ , Yutaka Yamamoto b a Department of Mechanical and Environmental Informatics, Graduate School of Information Science and Engineering, Tokyo Institute of Technology, 2-12-1,
Oh-okayama, Meguro-ku, Tokyo 152-8552, Japan b Department of Applied Analysis and Complex Dynamical Systems, Graduate School of Informatics, Kyoto University, Yoshida-Honmachi, Sakyo-ku, Kyoto,
Kyoto 606-8501, Japan Received 15 January 2006; received in revised form 19 August 2006; accepted 26 December 2006 Available online 15 May 2007
Abstract Many numerical schemes can be suitably studied from a system theoretic point of view. This paper studies the relationship between the two disciplines, that is, numerical analysis and system theory. We first see that various iterative solution schemes for linear and nonlinear equations can be suitably transformed into the form of a closed-loop feedback system, and show the crucial role of the internal model principle in such a context. This leads to new stability criteria for Newton’s method. We then study Runge–Kutta type methods for solving differential equations, and also derive new stability criteria based on recent results on LMI. A numerical example is given to illustrate the advantage of the present theory. 䉷 2007 Elsevier Ltd. All rights reserved. Keywords: Numerical analysis; System theory; Stability of numerical methods; Newton’s method; Internal model principle; Lur’e control systems; Absolute stability; Linear matrix inequality
1. Introduction There exist many iterative numerical schemes for solving linear or nonlinear equations or differential equations, having different characteristics adopted to varied needs. These schemes are mostly represented by difference equations, and hold an important position. Although system theory is clearly a suitable tool for analyzing such dynamical systems, the study of numerical analysis from this viewpoint has not been quite popular. However, recently some authors have started system theoretic approaches toward numerical analysis; see, for example, Gustafsson, Lundh, and Söderlind (1988), Bhaya and Kaszkurewicz (2003, 2004), Kaszkurewicz, Bhaya, and Ramos (1995), Schaerer and Kaszkurewicz (2001); and Wakasa and Yamamoto (2001) and references therein. The crux of these 夡 This paper was partially presented at 16th IFAC World Congress, Prague, July 2005. This paper was recommended for publication in revised form by Associate Editor Siep Weiland under the direction of Editor Roberto Tempo. ∗ Corresponding author. Tel.: +81 3 5734 2646; fax: +81 3 5734 2646. E-mail addresses:
[email protected] (K. Kashima),
[email protected] (Y. Yamamoto).
0005-1098/$ - see front matter 䉷 2007 Elsevier Ltd. All rights reserved. doi:10.1016/j.automatica.2006.12.028
approaches lies in the fact that we can not only describe the behavior of numerical solutions obtained by iterative schemes in such a system theoretic framework, but also study various requirements such as convergence, stability and robustness against external errors from this viewpoint. This fact opens a great opportunity for system theory to provide numerical analysis with valuable new tools and concepts for analyzing or even synthesizing pertinent dynamical systems associated with it. We start this paper by analyzing the simple linear equation Ax =b, to show the generic idea here on one hand: that is, to interpret various iterative solution processes as feedback systems that are to track a constant (step) input b with the 0th order plant A. On the other hand, the important objective here is to show the crucial relevance of the internal model principle in this context. It arises from the objective of tracking b in spite of small computational or data errors arising in the process of computation, and this is how the internal model principle comes into play; see Section 2 for details. We then generalize this idea to nonlinear equations. A stability criteria for Newton’s iterative process is derived in a framework of nonlinear control system, especially system of Lur’e type. The internal model principle again plays an important role here. The present analysis
K. Kashima, Y. Yamamoto / Automatica 43 (2007) 1156 – 1164
enables us to relax conditions on the convergence region, and the result is compared with those by conventional analysis. In Section 2, we turn our attention to numerical integration methods of ordinary differential equations, particularly the analysis of the absolute stability region of Runge–Kutta type methods. The absolute stability at a point in the complex plane means that a corresponding linear test problem is stable. The absolute stability region governs the step size to guarantee accurate numerical solutions. While this is an important problem, it is also known to be difficult to describe relationships between this region and Runge–Kutta coefficients. Only for some special cases, algebraic conditions of the coefficients have been obtained, e.g., Scherer and Türke (1989). We here invoke a new generalized Kalman–Yakubovich–Popov (KYP) lemma derived by Iwasaki and Hara (2005), to obtain a more general characterization of this region in terms of a linear matrix inequality (LMI). This allows us to design the coefficients of a Runge–Kutta type method by optimizing the region of absolute stability. 2. Iterative schemes and the internal model principle
Fig. 1. Block diagram for iterative methods.
model principle and provides a unified, and simplified viewpoint. Let us first regard A as a 0th order plant to be controlled, and define the error signal ek := b − Ax k which is expected to converge to 0. We can rewrite the algorithm of the Jacobi method as xk+1 = − D −1 (A − D)xk + D −1 b = xk + D −1 (b − Ax k ) = xk + D −1 ek . Similarly, the other two methods can also be brought into a feedback form xk+1 = xk + ek
The objective of this section is to show the relevance of the internal model principle to various iterative schemes. 2.1. Iterative processes for linear equations We start with a simple linear equation Ax = b. The crux here is to place this into the framework of tracking systems, and show that the internal model principle plays a crucial role. Let A ∈ Rn×n , b ∈ Rn , and consider the linear equation Ax = b for x ∈ Rn . Suppose that A is nonsingular, and we wish to generate a sequence xk that converges to the solution A−1 b. Decompose A as A = D + E + F,
1157
(1)
where D, E and F are diagonal, strictly lower triangular and strictly upper triangular matrices, respectively. Then the Jacobi, Gauss–Seidel (GS), and successive over-relaxation (SOR) methods are given as follows (Quarteroni, Sacco, & Saleri, 2000): Jacobi : xk+1 = − D −1 (E + F )xk + D −1 b, GS : xk+1 = − (D + E)−1 F x k + (D + E)−1 b, SOR : xk+1 = (I + D −1 E)−1 {(1 − )I − D −1 F }xk + (D + E)−1 b, where is called a relaxation parameter. These formulas perhaps give a somewhat ad hoc impression. What is important here is to recognize that they can be brought into the form of a feedback system driven by the error Ax − b. This makes it possible to relate such schemes with the internal
with given by (D +E)−1 for GS and (D +E)−1 for SOR. It is readily obvious that these methods take the common form of Fig. 1, with step input signal u ≡ b. We now attempt to see a more intrinsic reason for this. Let us start by observing that any reasonable iterative solver should satisfy the following properties: • For arbitrary b, the method should work, i.e., the output yk = Ax k should track arbitrary constant b in order that xk converge to the exact solution. • This tracking property is robust, in presence of some data errors, the method should still converge. The celebrated internal model principle (Francis & Wonham, 1975) asserts that this robust tracking property is satisfied if and only if the following two conditions hold: (i) the feedback system is internally stable, i.e., the transfer matrix from u to y is stable and no unstable pole-zero cancellation exists; and (ii) the loop transfer matrix from e to y contains the internal model of exogenous signals, which is the step signal generator 1/(z − 1) in this case. If we further assume uncertainty in the plant, property (ii) forces the controller to contain the integrator 1/(z − 1), and the simplest of such a construction is that given in Fig. 1. Moreover, this construction rejects an arbitrary constant disturbance added to x in Fig. 1. This tells us that why most, if not all, iterative schemes assume the structure xk+1 = xk + correction term: it is a crucial consequence of the internal model principle. In Fig. 1, the open-loop transfer matrix clearly contains 1/(z − 1). Thus xk converges to A−1 b if and only if condition (i) above is satisfied, i.e., all eigenvalues of I − A lie in the open unit disc. This is consistent with the conventional results.
1158
K. Kashima, Y. Yamamoto / Automatica 43 (2007) 1156 – 1164
Fig. 2. Block diagram for Newton’s method with computational errors. Fig. 3. Lur’e system for Newton’s method.
Remark 1. The role of the internal model principle in this context was perhaps first pointed out in Wakasa and Yamamoto (2001) and some convergence analysis is given there. While Schaerer and Kaszkurewicz (2001) and Bhaya and Kaszkurewicz (2003) also reformulated such iterative schemes as feedback systems, it is important to note that the role of the internal model principle, as well as the robust tracking property, is not recognized, because of the important difference that the integrator is considered as part of the plant in their treatment. The internal model principle was also mentioned, but in a different context of controlling the step size of Runge–Kutta type methods in a series of publications started off by Gustafsson et al. (1988); see also Remark 12 in Section 3. We also note that Bhaya and Kaszkurewicz (2003) discussed the choice of for better convergence properties, and also the conjugate gradient method as a proportional-derivative control. Some other numerical schemes are also considered in Kaszkurewicz et al. (1995), Schaerer and Kaszkurewicz (2001) and Bhaya and Kaszkurewicz (2004). Naturally, one may expect to generalize the above result to nonlinear equations, and this is the theme of the next subsection. 2.2. Newton’s method and Lur’e system We now turn our attention to iterative methods for solving nonlinear equations of a single variable: f (x) = 0, particularly Newton’s method. We first suppose that f is continuously differentiable and that f (x) = 0 for all x ∈ R. Moreover we assume the existence of a solution, and denote it by x ∗ . Newton’s method generates a sequence {xk } which converges to x ∗ when starting from an adequate initial value x0 close to x ∗ (Quarteroni et al., 2000): xk+1 = xk −
f (xk ) , f (xk )
k = 0, 1, 2, . . . .
In practice, we also have to take the computational error dk into account: xk+1 = xk −
f (xk ) + dk , f (xk )
k = 0, 1, 2, . . . .
(2)
As before, we can similarly describe this dynamics as a nonlinear feedback system as shown in Fig. 2. Here the convergence of xk to x ∗ means that yk := f (xk )/f (xk ) converges to 0. Remark 2. In Bhaya and Kaszkurewicz (2003), stability conditions of this system have been shown by means of Lyapunov
Fig. 4. Error system for Newton’s method.
functions. However, only the first order approximation of f is considered. We now convert system (2) to a system of Lur’e type for stability analysis. In what follows, we denote f/f by g for simplicity, and define g(x) ˜ := g(x + x ∗ ). Notice that g(0) ˜ = 0. Assume that there exist a, b such that a
g(x) ˜ b x
for all x = 0.
(3)
The introduction of g˜ changes the feedback system as shown in Fig. 3 with step disturbance w ≡ −x ∗ because yk = g(xk ) is equal to g(x ˜ k − x ∗ ). The objective here is now modified to that of attenuating the step disturbance w in Fig. 3. It should be noted that g˜ is introduced just for stability analysis and it is not needed for execution of Newton’s method. Similarly to the case of linear equations in the previous subsection, the loop transfer function contains the internal model 1/(z − 1) of step signals, which is necessary to reject the unknown step disturbance w according to the internal model principle. To express this more precisely, we transform the system in Fig. 3 to an error system. With g˜ defined above, Eq. (2) becomes ek+1 = ek − g(e ˜ k ) + dk ,
(4)
where ek := xk − x ∗ is the error signal. Eq. (4) represents the feedback system in Fig. 4. Such a nonlinear system consisting of a linear system and a sector bounded nonlinear function is called a system of Lur’e type, and several stability criteria have been derived for this type of nonlinear systems; see Khalil (1992). Theorem 3. Consider the algorithm (2). Suppose that there exist constants a and b such that 0
f (x) b < 2, (x − x ∗ )f (x)
(5)
∗ for all value x0 and any {dk } such x = x . Then for any initial that k |dk |2 is finite, the sum k |ek |2 of the squared error is also finite. Furthermore, if dk ≡ 0, xk converges to the exact solution x ∗ .
K. Kashima, Y. Yamamoto / Automatica 43 (2007) 1156 – 1164
Proof. We have (3) from the assumption (5). According to stability analysis for Lur’e system, the first statement holds if 1 z+l−1
∈R
where a+b , 2
r=
locally (e.g., Quarteroni et al., 2000). This property corresponds to the fact that condition (5) holds only on a bounded set of R. We now apply Theorem 3 to enlarge the domain of convergence. Corollary 5. Suppose that the solution x ∗ is in I := [, ] and that there exist a and b satisfying (5) for all x ∈ I \{x ∗ }. If we choose x0 ∈ I and set f (xk ) , k = 0, 1, 2, . . . xk+1 = max , min , xk − f (xk )
is stable and r = max r , r sup j < 1, e + l − 1 l 2−l
l=
1159
then xk converges to the solution x ∗ .
b−a . 2
It can be shown that this is equivalent to 0 < a b < 2. Let us prove the second statement. Eq. (3) and (4) lead to |ek+1 |q|ek |
(6)
with q := max{|1 − a|, |1 − b|}. Since 0 q < 1 for 0 < a b < 2, the desired result follows. For any affine function f (except for constant functions), it can be easily verified g(x)/x ˜ ≡ 1. Theorem 3 guarantees that if f does not possess very strong nonlinearity in the sense that g(x)/x ˜ is close to 1, then xk does not diverge due to the computational error dk . Furthermore, when there exist no computational errors, numerical solutions converge to the exact solution for such a function f.
Proof. Since x ∗ is assumed to be in I, trivially f (xk ) ∗ ∗ − x q|xk − x ∗ | |xk+1 − x | xk − f (xk ) holds with 0 q < 1.
2.4. Comparison with conventional convergence conditions We now discuss the difference between Theorem 3 and the convergence condition based on the contraction principle; see Rudin (1976) and Appendix for details. Define by (x) := x −
f (x) . f (x)
(8)
Then Newton’s method is described as
2.3. Modification of the convergence theorem
xk+1 = (xk ).
Condition (5) may appear a little restrictive in that the constants a, b are bounded by 2 and that it is assumed to hold globally. We show two corollaries of Theorem 3 to remedy this. The first corollary is concerned with the case where the condition (5) does not hold for b < 2. Even if b 2 in (5), we can make xk converge to x ∗ by introducing a suitable gain in the iteration process.
According to the contraction principle, xk converges to the exact solution x ∗ if the function is a contraction on a closed interval.
Corollary 4. Suppose that there exist a and b satisfying
Proof. See Appendix.
0
f (x) b (x − x ∗ )f (x)
for all x = x ∗ .
(7)
Consider the iterative process xk+1 = xk − L
f (xk ) , f (xk )
k = 0, 1, 2, . . .
Theorem 6. Suppose that the solution x ∗ exists in a closed interval I and that the function f is twice continuously differentiable on I . If function is contractive on I , there exist a and b satisfying (5) for all x ∈ I \{x ∗ }.
This theorem means that our result includes the conventional one as a special case. Actually it is indeed a strict generalization. An example is as follows: take a function f as f (x) = x 3 + x 2 + x + 1.
with constant L such that 0 < L < 2/b. Then xk converges to the solution x ∗ .
Then g(x)=f (x)/f (x) satisfies the convergence condition (5) as shown in Fig. 5, whereas (x) = x − g(x) is not contractive. If were contractive, then 0 < g < 2 should hold; see also the proof of Theorem 6 in Appendix.
Proof. Similarly to the proof of the second statement of Theorem 3, we obtain (6) with q := max{|1 − aL|, |1 − bL|} < 1. This completes the proof.
3. Runge–Kutta type methods
Corollary 4 corresponds to stability analysis for the so-called modified Newton’s method; see Quarteroni et al. (2000). The other corollary deals with local stability. As is well known, the convergence of Newton’s method is guaranteed only
In this section the region of absolute stability of Runge–Kutta type methods is analyzed and optimized. This is a central theme in choosing such methods (see, e.g., Lambert, 1991), and we show that a system theoretic approach, in particular LMI, can effectively be adopted to optimize this region.
1160
K. Kashima, Y. Yamamoto / Automatica 43 (2007) 1156 – 1164
Fig. 6. Block diagram of Runge–Kutta type method.
and the nonlinear system F ⎡ hf (t + c h, X ) ⎤ n 1 n,1 ⎥ ⎢ .. ⎥ F : Xn → kn = ⎢ . ⎦ ⎣ hf (tn + cs h, Xn,s )
Fig. 5. An example where (x) is not contractive.
with 3.1. Absolute stability of Runge–Kutta type methods Methods of the Runge–Kutta type are widely used for the numerical integration of the initial value problem x(t) ˙ = f (t, x(t)),
x(0) = x0 ∈ Rm
(9)
with f : R × Rm → Rm . The algorithm with step size h is represented by xn+1 = xn +
s
bi kn,i ,
(10)
kn,i = hf (tn + ci h, Xn,i ),
Xn,i = xn +
s
aij kn,j ,
(11)
j =1
and tn = nh. Here xn should be an approximation of the exact solution x(tn ). We refer to the coefficients aij and bi as the Runge–Kutta coefficients. When aij =0 for all i j , the method is said to be explicit, and otherwise implicit. Let us bring this dynamics of the Runge–Kutta type methods into the form of a feedback system, and interpret absolute stability in a system theoretic terminology. Denote a computational error at each step by dn , that is, bi kn,i + dn
(12)
i=1
and Xn := := By ··· ··· straightforward computation, the dynamics determined by (11) and (12) can be represented by the nonlinear feedback system as shown in Fig. 6, consisting of the linear system G,
⎧ dn ⎪ T ´ ⎪ x , = x + ] [ Im b n+1 n ⎪ ⎨ kn G: (13)
⎪ xn Im 0 0 dn ⎪ ⎪ ⎩ = xn + 0 A´ Xn e´ kn T [ Xn,1
a11
⎢ ⎢ ⎢ a21 A := ⎢ ⎢ . ⎢ . ⎣ .
a12 ..
···
. ..
.
as1 · · · · · · ⎡ ⎤ 1 ⎢ ⎥ ⎢1⎥ ⎢ ⎥ e := ⎢ . ⎥ ∈ Rs , ⎢ .. ⎥ ⎣ ⎦ A´ := A ⊗ Im ,
where for i = 1, 2, . . . , s,
xn+1 = xn +
⎡
a1s
⎤
⎥ ⎥ a2s ⎥ ⎥, .. ⎥ ⎥ . ⎦ ass
⎡
b1
⎤
⎢ ⎥ ⎢ b2 ⎥ ⎢ ⎥ b := ⎢ . ⎥ , ⎢ . ⎥ ⎣ . ⎦ bs
1
i=1
s
(14)
T ]T , k Xn,s n
T [ kn,1
T ]T . kn,s
b´ := b ⊗ Im ,
e´ := e ⊗ Im ,
where ⊗ means the tensor (Kronecker) product; ⎤ ⎡ x11 Y · · · x1n Y ⎢ . .. ⎥ .. ⎥ . X⊗Y =⎢ . . ⎦ ⎣ . xm1 Y
· · · xmn Y
with xij denoting the (i, j ) element of X. It is difficult to obtain an error estimate between exact and numerical solutions for general nonlinear function f. A standard approach to this problem is to study its behavior for first order approximations, i.e., examine the behavior of numerical solution for the linear test problem: x˙ = J x,
x(0) = x0
(15)
for various J. If the considered Runge–Kutta type method behaves reasonably for a rich class of such linear problems, one might expect that it would also behave nicely for sufficiently smooth nonlinear systems. In the linear test problem, F in (14) becomes the following linear operator: F : Xn → kn = hJ`Xn
(16)
with J` := Is ⊗ J , and consequently the feedback system in Fig. 6 can be described as xn+1 = Acl xn + dn ,
(17)
K. Kashima, Y. Yamamoto / Automatica 43 (2007) 1156 – 1164
1161
where Acl = Im + b´ T hJ`(Ims − hA´ J`)−1 e´
(18)
and J` := Is ⊗J . Suppose that all eigenvalues of J have negative real part. Since the exact solution for (15) is clearly x(t) = exp(J t)x0 , the system (17) must be stable so that the resulting numerical solution {xn } decays as desired. Moreover, provided that J is diagonalizable, all eigenvalues of Acl lie in the open unit disc if and only if |P (h)| < 1 for any eigenvalue of J where P (z) is defined by P (z) := 1 + bT z(I − zA)−1 e.
(19)
In view of this, the absolute stability is defined as follows (Dekker & Verwer, 1984). Definition 7. A Runge–Kutta type method specified as above is absolutely stable at z ∈ C if |P (z)| < 1. The region of absolute stability is defined as R := {z ∈ C : |P (z)| < 1}. When applying the Runge–Kutta type method, the resulting approximation accuracy largely depends on the choice of the step size h. It seems that we should choose as small h as possible. However, when we consider round-off errors or deal with so-called stiff systems (Dekker & Verwer, 1984), we cannot take h overly small because the required number of iteration increases as h becomes smaller. The region R characterizes how large the step size h can be: in order to obtain the numerical solution for (15), we must choose h such that h belongs to R to guarantee a stable behavior of the numerical solution. Hence it is desirable that we have a large region of absolute stability. For a given set of Runge–Kutta coefficients, there exist several methods to depict the corresponding absolute stability region, e.g., the so-called boundary locus techniques (Lambert, 1991). It is, however, not easy to find coefficients such that the corresponding region satisfies some required properties. For this problem, some results have been derived in the following special cases: • A Runge–Kutta method is said to be A-stable, if R includes the open left half complex plane. Scherer and Türke (1989) derived an algebraic condition on the coefficients of A-stable methods. This result was accomplished by invoking the positive real lemma explicitly. Unfortunately, it turned out that A-stable methods always turn out to be implicit Runge–Kutta schemes that require complicated procedures to solve nonlinear equations. On the other hand, explicit Runge–Kutta methods, while not A-stable, are widely used for their simplicity. • The intersection of the absolute stability region and the real axis is called the stability interval. The coefficients which achieve the maximal absolute stability interval have been studied. However, as seen in Section 3.4, a large interval of stability does not necessarily mean a wide region of the absolute stability.
Fig. 7. Region (r1 , r2 , ).
In view of this, we here attempt to estimate the region of absolute stability quantitatively for those methods that are not necessarily A-stable. 3.2. Problem formulation for analysis of absolute stability region Let us formulate the problem. Define the region (r1 , r2 , ) := {z ∈ C : r1 |z| r2 , | arg z − | }, where r1 , r2 and satisfy 0 < r1 r2 and 0 /2 (Fig. 7). In particular, when = 0, this reduces to an interval in the real axis. To estimate the region of absolute stability, we consider the following: Problem 8. For given 0 < r1 r2 , 0 /2 and the Runge–Kutta coefficients A and b, determine whether the corresponding region of absolute stability R includes the region (r1 , r2 , ). If =0 this problem is a stability interval analysis. To see the meaning of three parameters r1 , r2 and , consider the linear test problem (15) again. We here take such that | arg − | , r2 h|| and sufficiently small r1 . If (r1 , r2 , ) ⊂ R for these parameters, then we have h ∈ R and corresponding numerical solutions decay as desired. Furthermore, especially when a distribution of eigenvalues corresponding to the linearized systems at required operating points is available, we can compute a (sub)maximal step size h by a bisection algorithm on r2 . 3.3. LMI conditions of coefficients Denote K(z) := P (z−1 ), i.e., K(z) = 1 + bT (zI − A)−1 e.
(20)
This K(z) can be viewed as a proper and rational transfer function. Clearly |P (z)| < 1,
z ∈ (r1 , r2 , )
if and only if |K(z)| < 1,
z ∈ (1/r2 , 1/r1 , ).
1162
K. Kashima, Y. Yamamoto / Automatica 43 (2007) 1156 – 1164
The latter condition is the bounded-realness of the system. Since many results on relationships between bounded-realness and system matrices are known, it is natural to apply such results to characterization of absolute stability. In particular, a generalized KYP Lemma, recently obtained by Iwasaki and Hara (2005) for the situations dealing with a finite frequency range, enables us to characterize system matrices so that the corresponding system be bounded-real on a class of curves in the complex plane. We can now give a solution to Problem 8 by using this lemma; see Appendix for the proof. Theorem 9. Let r1 , r2 , and the Runge–Kutta coefficients A and b be given. Suppose that the region (1/r2 , 1/r1 , ) contains no eigenvalues of matrix A. Then the region of absolute stability R includes (r1 , r2 , ) if and only if there exist Hermitian matrices Pi , Qi ∈ Cs×s (i = 1, 2, 3) such that Qi > 0, i = 1, 2, 3, ⎡ ⎤ Pi −Qi ⎦ F + < 0, i = 1, 2, 1 2 FT ⎣ −Qi − 2 Pi − cos Qi ri ri ⎡ ⎤ −Q3 e−j (j P 3 − rc Q3 ) ⎦ F + < 0, F T ⎣ j 1 −e (j P 3 + rc Q3 ) − Q3 r1 r2 where
A F := I
e
,
0
:=
bbT
b
bT
0
,
1 rc := 2
1 1 + r1 r2
,
and X > 0 (X < 0) means that Hermitian matrix X is positive (negative) definite. Remark 10. The shape of is not essential. For example, any polygon can be dealt with; see the proof in Appendix. Theorem 9 does not depend on a realization of K(z). As in the next subsection, it is often more useful to deal with the stability function directly rather than the Runge–Kutta coefficients. Hereafter we confine ourselves to explicit Runge–Kutta methods. This implies that A is strictly lower triangular and then K(z) is clearly a polynomial of z−1 by (20). Therefore the stability function P (z) can be rewritten as P (z) = 1 +
s
i zi ,
(21)
i=1
which means that the region of absolute stability of an explicit formula cannot be unbounded. Consequently, we obtain K(z) = 1 + (zI − Ak )−1 Bk ,
(22)
where ⎡0 ⎢ .. ⎢. ⎢ Ak := ⎢ ⎢ .. ⎣. 0
1 0
···
· · · 0⎤ .⎥ 1 .. ⎥ ⎥ ⎥, .. ⎥ . 1⎦ ··· 0
⎡ ⎤ 0 ⎢.⎥ ⎢ .. ⎥ ⎢ ⎥ Bk := ⎢ ⎥ , ⎢0⎥ ⎣ ⎦ 1
⎡
s
⎤T
⎢ . ⎥ ⎢ . ⎥ ⎢ . ⎥ := ⎢ ⎥ . ⎢ ⎥ ⎣ 2 ⎦ 1
Fig. 8. Regions of absolute stability: = 0 (solid), /10 (dash) and RKF45 (dotted).
Corollary 11. Let r1 , r2 , and the stability function P (z) in (21) given. Then |P (z)| < 1 for all z ∈ (r1 , r2 , ) if and only if there exist Hermitian matrices Pi and Qi (i = 1, 2, 3) such that matrix inequalities in Theorem 9 are satisfied where F and
are given by
T
T Ak Bk , := . F := I 0 0 Notice that F does not depend on and the pertinent matrix inequalities are characterized by , Pi and Qi (i = 1, 2, 3). Furthermore, by using the Schur complement, they are all transformed to LMIs with respect to these variables. Hence we can easily design which intends to maximize the corresponding absolute stability region. 3.4. An application Generally speaking, the stability function P (z) for an s-stage explicit Runge–Kutta method of order p is represented as (21) with 1 i = , i = 1, 2, . . . , p i! (Lambert, 1991). However, for a class of such methods, i for i = p + 1, . . . , s are free parameters and the Runge–Kutta coefficients are determined by them. As an application of the result obtained, we propose a new procedure for designing the Runge–Kutta coefficients that maximize the region of absolute stability. We illustrate an example of a 5-stage explicit Runge–Kutta method of order 4. We design the free parameter 5 in (21) by the procedure mentioned above; we take r1 =0.1 and maximize r2 . The obtained regions are shown in Fig. 8. Taking =0 yields 5 = 4.087 × 10−3 that optimizes the stability interval. In this case, the corresponding region has little width at z =−4.7. This is because only an interval of the real axis is evaluated. When = /10, the optimal 5 = 4.717 × 10−3 . We also see that the larger becomes, the greater width the corresponding region assumes, as expected. For comparison, the region of absolute 1 stability for RKF45 (5 = 104 ) is also shown in Fig. 8; see also Lambert (1991). This example illustrates the effectiveness of the present approach.
K. Kashima, Y. Yamamoto / Automatica 43 (2007) 1156 – 1164
Remark 12. In this section, only the region of absolute stability was investigated. An alternative existing solution is to change the step size h adaptively. This topic is discussed in Gustafsson et al. (1988) from a PI control point of view.
We have given some accounts on a system-theoretic viewpoint for numerical analysis methods. We have seen that such conventional methods can often be embraced in a unified framework of feedback systems, and also presented some new criteria for convergence. In the first half of this paper, it is seen that the internal model principle and stability analysis for Lur’e type systems have played a crucial role. Needless to say, these are very well known and important concepts in system theory contexts. In the second half we investigated the relationship between the Runge–Kutta coefficients and the corresponding stability region. This result was accomplished by invoking a generalized KYP lemma. These results clearly suggest that system theory can provide novel and powerful tools for numerical analysis. We expect that much wider applications are yet to be studied from this point of view. Acknowledgment The authors would like to thank Professor Jun-ichi Imura, Tokyo Institute of Technology, for valuable discussions and general support of this research. Appendix A. Proof of Theorem 6 Definition 13. A mapping defined on a closed interval I is said to be contractive if the following conditions are satisfied: (1) (x) ∈ I for any x ∈ I ; and (2) there exists a positive constant p < 1 such that (A.1)
for any x, y ∈ I . Proof of Theorem 6. Since f is twice continuously differentiable, in (8) is continuously differentiable on I. By the assumption that is contractive, there exists a positive constant p < 1 such that (A.1) is satisfied. It follows from these facts that | (x)| = |1 − g (x)| < p for any x ∈ I . Therefore, a := 1 − p and b := 1 + p satisfy 0 < a b < 2 and a < g (x) < b. for any x ∈ I . This implies that a<
Appendix B. Proof of Theorem 9 We give a proof of Theorem 9. First, for Hermitian matrices , ∈ C2×2 , define a subset in the complex plane by ( , ) := {z ∈ C : (z, ) = 0, (z, ) 0},
4. Conclusion
|(x) − (y)| < p|x − y|
1163
g(x) − g(x ∗ )
for any x ∈ I \{x ∗ }. Since g(x ∗ ) = 0, the desired result follows.
where, for z ∈ C and Hermitian matrix ∈ C2×2 ,
∗ z z . (z, ) := 1 1 Note that ( , ) represents curves in the complex plane, when and are chosen appropriately (Iwasaki & Hara, 2005). The following lemma, a special case of Theorem 3 in Iwasaki and Hara (2005), gives system matrices such that corresponding transfer functions have several properties, which are specified by a Hermitian matrix , on these curves. Lemma 14. Let matrices A ∈ Rn×n , B ∈ Rn×1 and Hermitian matrices , ∈ C2×2 and ∈ C(n+1)×(n+1) be given. Suppose that ( , ) represents bounded curves in the complex plane and that det(zI − A) = 0 for all z in ( , ). Then
(zI − A)−1 B ∗ (zI − A)−1 B <0 (B.1)
I I for all z ∈ ( , ) if and only if there exist Hermitian matrices P and Q satisfying Q > 0 and
A B ∗ A B ( ⊗ P + ⊗ Q) + < 0. I 0 I 0 By taking adequately, we can characterize the boundedrealness of K(z) given by (20) on curves ( , ). We are now ready to prove Theorem 9. Proof of Theorem 9. Since K(z) is analytic on (1/r2 , 1/r1 , ) by the assumption, |K(z)| < 1 on this domain if and only if this holds on its boundary, according to the maximum modulus theorem. This boundary consists of two arcs and two lines, as in Fig. 7. Here these can be represented in the form of ( , ) with adequately defined matrices , : {z ∈ C : |z| = 1/ri , arg z − } 1 0 0 −1 = , , 0 −ri−2 −1 −2ri−1 cos {z ∈ C : 1/r2 |z| 1/r1 , arg z = + } ⎛ ⎤⎞ ⎡
−1 −e−j rc 0 je−j = ⎝ ,⎣ 1 ⎦⎠ . −ej rc − −jej 0 r1 r 2 If we set B = e and
T T b 0 T 1 0 0 b
= , 0 1 0 1 0 −1 (B.1) is equivalent to |K(z)| < 1. Hence by applying Lemma 14, we obtain Theorem 9.
1164
K. Kashima, Y. Yamamoto / Automatica 43 (2007) 1156 – 1164
References Bhaya, A., & Kaszkurewicz, E. (2003). Iterative methods as dynamical systems with feedback control. In Proceedings of the 42nd IEEE conference on decision and control (pp. 2374–2380). Bhaya, A., & Kaszkurewicz, E. (2004). Newton algorithms via control Liapunov functions for polynomial zero finding. In Proceedings of the 43rd IEEE conference on decision and control (pp. 1629–1634). Dekker, K., & Verwer, J. G. (1984). Stability of Runge–Kutta methods for stiff nonlinear differential equations. Amsterdam: North-Holland. Francis, B. A., & Wonham, W. M. (1975). The internal model principle for linear multivariable regulators. Applied Mathematics and Optimization, 2, 170–194. Gustafsson, K., Lundh, M., & Söderlind, G. (1988). A PI stepsize controller for the numerical solution of ordinary differential equations. BIT, 28, 270–287. Iwasaki, T., & Hara, S. (2005). Generalized KYP lemma: Unified frequency domain inequalities with design applications. IEEE Transactions on Automatic Control, 50(1), 41–59. Kaszkurewicz, E., Bhaya, A., & Ramos, P. R. V. (1995). A control-theoretic view of diagonal preconditioners. International Journal of Systems Science, 26, 1659–1672. Khalil, H. (1992). Nonlinear systems. London: Macmillan. Lambert, J. D. (1991). Numerical methods for ordinary differential systems. New York: Wiley. Quarteroni, A., Sacco, R., & Saleri, F. (2000). Numerical mathematics. New York: Springer. Rudin, W. (1976). Principles of mathematical analysis. 3rd ed., New York: McGraw-Hill. Schaerer, C. E., & Kaszkurewicz, E. (2001). The shooting method for the solution of ordinary differential equations: A control-theoretical perspective. International Journal of Systems Science, 32, 1047–1053.
Scherer, R., & Türke, H. (1989). Algebraic characterization of A-stable Runge–Kutta methods. Applied Numerical Mathematics, 5, 133–144. Wakasa, Y., & Yamamoto, Y. (2001). An iterative method for solving linear equations by using robust control methods (in Japanese). In Proceedings of SICE first annual conference on control systems (pp. 451–454). Kenji Kashima was born in 1977 in Oita, Japan. He received his Bachelor’s degree in engineering and his Master’s and Doctoral degrees in informatics from Kyoto University, Japan, in 2000, 2002 and 2005, respectively. He is currently an assistant professor of the Graduate School of Information Science and Engineering, Tokyo Institute of Technology. His research interests include infinite-dimensional control theory, sampled-data control theory and numerical analysis. Yutaka Yamamoto received his Ph.D. degree in mathematics from the University of Florida, in 1978, under the guidance of Professor R.E. Kalman. He is currently a professor in Department of Applied Analysis and Complex Dynamical Systems, Graduate School of Informatics, Kyoto University. His current research interests include the theory of sampled-data control systems, its application to digital signal processing, realization and robust control of distributed parameter systems and repetitive control. He is a fellow of IEEE, and is the recipient of G.S. Axelby outstanding paper award of IEEE Control Systems Society in 1996. He is currently the Vice President for Publications of Control Systems Society of IEEE and also the Vice President of the Institute of Systems, Control and Information Engineers, Japan.