The PRP conjugate gradient algorithm with a modified WWP line search and its application in the image restoration problems

Gonglin Yuan, Junyu Lu, Zhan Wang

College of Mathematics and Information Science, Guangxi University, Nanning, Guangxi, PR China
Article info

Article history: Received 5 October 2019; Received in revised form 22 January 2020; Accepted 23 January 2020; Available online 28 January 2020.

Keywords: PRP method; Global convergence; Line search; Image restoration problems
Abstract

It is well known that the conjugate gradient algorithm is one of the most classic and useful methods for solving large-scale optimization problems, and the Polak-Ribière-Polyak (PRP) method is an important and effective conjugate gradient algorithm. However, the global convergence of the PRP conjugate gradient method cannot be achieved when the WWP line search technique is used for a nonconvex function. In this paper, we propose a modified WWP line search technique for the PRP conjugate gradient algorithm. The new line search technique guarantees the global convergence of the PRP method for general functions. The numerical results show that the PRP method with the new line search technique is practical and effective for image restoration problems.
1. Introduction

Consider

    min_{x ∈ ℝⁿ} f(x),    (1.1)

where f : ℝⁿ → ℝ, f ∈ C², is a continuously differentiable function. There are many methods that can be used for the above model, such as Newton methods, quasi-Newton methods, and nonlinear conjugate gradient (CG) methods [5,15,18,22,28], which play a significant role in (1.1) due to their low storage and simple computation [1,3,27,31]. The iterative formula of the CG method to solve (1.1) takes the following form:

    x_{k+1} = x_k + α_k d_k,  k = 0, 1, 2, ...,    (1.2)

where x_k is the k-th iteration point, the step size α_k > 0 is determined by line search techniques, and d_k is the search direction defined by
    d_k = −g_k,                    if k = 0,
          −g_k + β_k d_{k−1},      if k ≥ 1,    (1.3)

where β_k is a scalar and g_k = ∇f(x_k). There are six famous formulae for β_k, given below:

    β_k^{PRP} = g_k^T (g_k − g_{k−1}) / ‖g_{k−1}‖²,    β_k^{FR} = ‖g_k‖² / ‖g_{k−1}‖²,    β_k^{HS} = g_k^T (g_k − g_{k−1}) / (d_{k−1}^T (g_k − g_{k−1})),

    β_k^{CD} = ‖g_k‖² / (−d_{k−1}^T g_{k−1}),    β_k^{LS} = g_k^T (g_k − g_{k−1}) / (−d_{k−1}^T g_{k−1}),    β_k^{DY} = ‖g_k‖² / ((g_k − g_{k−1})^T d_{k−1}),
where g_{k−1} is the gradient ∇f(x_{k−1}) of f(x) at the point x_{k−1} and ‖·‖ denotes the Euclidean norm of a vector. The above methods are called the Polak-Ribière-Polyak (PRP) [23,24], Fletcher-Reeves (FR) [13], Hestenes-Stiefel (HS) [16], conjugate descent (CD) [12], Liu-Storey (LS) [21], and Dai-Yuan (DY) [8] CG methods. Among these methods, the numerical results of the PRP, LS, and HS methods are promising, but their convergence properties are not very good; for the other methods, the situation is the opposite. Because the numerical results of the PRP method are promising, some scholars have tried to study its convergence. Polak and Ribière [23] proved that the PRP method with an exact line search is globally convergent for strongly convex functions. With a descent condition on the search direction, Yuan [31] obtained similar results with the Wolfe line search. However, Powell [25] constructed a counterexample showing that the PRP method can cycle infinitely without approaching the solution, which means that this method is not globally convergent for general functions. Moreover, Dai [4] showed that the PRP method may produce an uphill search direction with a strong Wolfe line search. Powell analysed the PRP method and showed that global convergence may fail because β_k can be less than 0; thus, he suggested that β_k should not be less than 0 to ensure global convergence, which is significant when the PRP method is used to solve optimization problems [9,10,26,29]. Following his suggestion, Gilbert and Nocedal [14] proved that the PRP+ method with β_k^{PRP+} = max{0, β_k^{PRP}} is globally convergent for general nonconvex functions with a suitable line search. Other researchers modified β_k to obtain the global convergence of the algorithm for general functions; for example, three-term [32,38] or two-term PRP [2,35] modifications were often adopted to ensure global convergence for the objective functions. The weak Wolfe-Powell (WWP) line search plays an important role in optimization methods; it finds α_k such that
    f(x_k + α_k d_k) ≤ f_k + δ α_k g_k^T d_k

and

    g(x_k + α_k d_k)^T d_k ≥ σ g_k^T d_k,

where δ ∈ (0, 1/2) and σ ∈ (δ, 1). However, the global convergence of the PRP method with this line search technique has not been established for general functions. Therefore, some scholars choose to modify one of the above inequalities of the WWP line search (see [6,7,17,19,30], etc.) and obtain some results. Zhou et al. [39,40] analysed the PRP method by modifying the Armijo technique and obtained good convergence results. Yuan, Wei, and Lu [36] presented a new inexact line search technique as follows:
    f(x_k + α_k d_k) ≤ f_k + δ α_k g_k^T d_k + α_k min{−δ* g_k^T d_k, (δ α_k/2) ‖d_k‖²}

and

    g(x_k + α_k d_k)^T d_k ≥ σ g_k^T d_k + min{−δ* g_k^T d_k, δ α_k ‖d_k‖²},

where δ ∈ (0, 1/2), δ* ∈ (0, δ), and σ ∈ (δ, 1). By modifying the line search technique, Yuan, Wei and Lu partially resolved the open problem of the global convergence of the PRP method for general functions in the case min{−δ* g_k^T d_k, (δ α_k/2) ‖d_k‖²} = (δ α_k/2) ‖d_k‖². Some applications of the technique can be found in [20,33,37]. However, if min{−δ* g_k^T d_k, (δ α_k/2) ‖d_k‖²} = −δ* g_k^T d_k, the conclusion of global convergence does not hold, because the line search then reduces to the WWP line search. In this paper, we further study the convergence properties of the PRP method with a new modified version of the WWP line search technique, called the modified WWP (MWWP) line search, as follows:
    f(x_k + α_k d_k) ≤ f_k + δ α_k g_k^T d_k + (δ α_k²/2) ‖d_k‖² δ_1    (1.4)

and

    g(x_k + α_k d_k)^T d_k ≥ σ g_k^T d_k + δ α_k ‖d_k‖² δ_1,    (1.5)
where δ ∈ (0, 1/2), δ_1 is a sufficiently small positive scalar, and σ ∈ (δ, 1). The above line search will be discussed in detail in the next section. The main contributions of this paper are stated below.
• The global convergence of the PRP method for nonconvex functions is established with the modified WWP line search technique.
• The numerical results demonstrate that this line search technique is competitive with the WWP line search technique for the given problems.
• The engineering Muskingum model is used to illustrate interesting aspects of the given algorithm.
• The image restoration problems are also tested to show the performance of the presented algorithm.

This paper is organized as follows. The modified line search is introduced and the PRP algorithm is provided in Section 2. The global convergence of the PRP method is studied in Section 3. The numerical results of the given algorithm are reported in Section 4. Finally, we present a conclusion section. Throughout this paper, we let f_k = f(x_k), f_{k+1} = f(x_{k+1}), g_k = g(x_k), and g_{k+1} = g(x_{k+1}), and ‖·‖ denotes the Euclidean norm.

2. New line search and algorithm

In this section, we present the PRP algorithm and prove that it is well defined; in other words, before studying the global convergence of the PRP method with the MWWP line search, it is necessary to show that (1.4) and (1.5) are reasonable, namely, that there exists α_k > 0 such that (1.4) and (1.5) hold. We have the following theorem.

Theorem 2.1. Let f(x) ∈ C² be bounded from below and g_k^T d_k ≤ 0. Then, there exists a constant α (0 < α < ∞) satisfying (1.4) and (1.5).

Proof. Consider the following function:
    φ(α) = f(x_k + α d_k) − f_k − δ α g_k^T d_k − (δ α²/2) ‖d_k‖² δ_1.    (2.6)

Note that f_{k+1} and f_k are bounded from below and g_k^T d_k ≤ 0. It is clear that φ(0) = 0, and, for a sufficiently small scalar δ_1 > 0 and sufficiently small α > 0, we have

    φ(α) = f(x_k + α d_k) − f_k − δ α g_k^T d_k − (δ α²/2) ‖d_k‖² δ_1
         = (f_k + α g_k^T d_k + o(α)) − f_k − δ α g_k^T d_k − (δ α²/2) ‖d_k‖² δ_1
         = α (1 − δ) g_k^T d_k − (δ α²/2) ‖d_k‖² δ_1 + o(α) < 0.
On the other hand, based on the proof of Lemma 3.1 of [22], we know that there exists α′ > 0 such that f(x_k + α′ d_k) = f_k + δ_2 α′ g_k^T d_k; choosing δ > δ_2, we have

    φ(α′) = (δ_2 − δ) α′ g_k^T d_k − (δ α′²/2) ‖d_k‖² δ_1 ≥ 0.

It is clear that there is a constant ξ such that φ(ξ) = 0. Therefore, for the global minimizer of φ(α) in (0, ξ), the minimum value cannot occur at the endpoints, because φ(0) = φ(ξ) = 0 and there is an α ∈ (0, ξ) satisfying φ(α) < 0. Based on the above discussion, we deduce that there is a local minimizer ᾱ ∈ (0, ξ) satisfying φ(ᾱ) < 0 and φ′(ᾱ) ≥ 0. Then,
    φ(ᾱ) = f(x_k + ᾱ d_k) − f_k − δ ᾱ g_k^T d_k − (δ ᾱ²/2) ‖d_k‖² δ_1 < 0,

that is,

    f(x_k + ᾱ d_k) ≤ f_k + δ ᾱ g_k^T d_k + (δ ᾱ²/2) ‖d_k‖² δ_1.

Furthermore, we obtain

    φ′(ᾱ) = lim_{α→ᾱ} [φ(α) − φ(ᾱ)] / (α − ᾱ)
          = lim_{α→ᾱ} { [f(x_k + α d_k) − f_k − δ α g_k^T d_k − (δ α²/2) ‖d_k‖² δ_1] / (α − ᾱ)
                        − [f(x_k + ᾱ d_k) − f_k − δ ᾱ g_k^T d_k − (δ ᾱ²/2) ‖d_k‖² δ_1] / (α − ᾱ) }
          = lim_{α→ᾱ} { [f(x_k + α d_k) − f(x_k + ᾱ d_k) − δ (α − ᾱ) g_k^T d_k] / (α − ᾱ)
                        − [(δ α²/2) ‖d_k‖² δ_1 − (δ ᾱ²/2) ‖d_k‖² δ_1] / (α − ᾱ) }
          = lim_{α→ᾱ} [ (α − ᾱ) g(x_k + ᾱ d_k)^T d_k + O((α − ᾱ)²) ] / (α − ᾱ) − δ g_k^T d_k − δ ᾱ ‖d_k‖² δ_1
          = g(x_k + ᾱ d_k)^T d_k − δ g_k^T d_k − δ ᾱ ‖d_k‖² δ_1.

By φ′(ᾱ) ≥ 0, we have

    φ′(ᾱ) = g(x_k + ᾱ d_k)^T d_k − δ g_k^T d_k − δ ᾱ ‖d_k‖² δ_1 ≥ 0.

Then,
    g(x_k + ᾱ d_k)^T d_k ≥ δ g_k^T d_k + δ ᾱ ‖d_k‖² δ_1 ≥ σ g_k^T d_k + δ ᾱ ‖d_k‖² δ_1,

where σ > δ and g_k^T d_k ≤ 0. By the existence of a local minimizer and the continuity of φ(α), we obtain the existence of an appropriate interval of step sizes. The proof is complete. □

Remark i. The proof of the above theorem is similar to that in [36]; we nevertheless state the detailed process here, because the proofs differ in some aspects.

Based on the above theorem and the PRP formula, the presented PRP algorithm is listed as follows.

Algorithm 1 (PRP-MWWP Algorithm).
Step 1: Choose an initial point x_0 ∈ ℝⁿ, ε ∈ (0, 1), δ ∈ (0, 1/2), δ_1 ∈ (0, δ), σ ∈ (δ, 1).
Step 2: Set d_0 = −g_0 = −∇f(x_0), k := 0.
Step 3: If ‖g_k‖ ≤ ε, then the PRP algorithm stops.
Step 4: Compute the step size α_k using the modified WWP line search rule (1.4) and (1.5).
Step 5: Let x_{k+1} = x_k + α_k d_k. If ‖g_{k+1}‖ ≤ ε, then the PRP algorithm stops.
Step 6: Calculate the search direction

    d_{k+1} = −g_{k+1} + β_k^{PRP} d_k.    (2.7)

Step 7: Set k := k + 1, and go to Step 3.

Remark ii. (1) β_k^{PRP} = g_k^T (g_k − g_{k−1}) / ‖g_{k−1}‖² is the well-known PRP formula. (2) Theorem 2.1 proves that a step size satisfying the line search (1.4) and (1.5) exists, so the algorithm is well defined.
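To make the iteration concrete, the following Python sketch implements Algorithm 1 together with one possible way of computing a step size that satisfies (1.4) and (1.5). The bracketing/bisection strategy, the restart safeguard, the default parameter values, and the f/grad interface are our own illustrative choices and are not prescribed by the paper.

```python
import numpy as np

def mwwp_line_search(f, grad, xk, dk, fk, gk,
                     delta=0.1, sigma=0.9, delta1=1e-16, max_iter=50):
    """Find a step size satisfying the MWWP conditions (1.4)-(1.5).

    The bracketing/bisection strategy below is only one possible
    implementation; the paper does not prescribe a particular one.
    """
    gtd = float(gk @ dk)                  # g_k^T d_k (assumed <= 0)
    dd = float(dk @ dk)                   # ||d_k||^2
    lo, hi, alpha = 0.0, np.inf, 1.0
    for _ in range(max_iter):
        xa = xk + alpha * dk
        # Condition (1.4): sufficient decrease with the extra delta1 term.
        if f(xa) > fk + delta * alpha * gtd + 0.5 * delta * alpha**2 * dd * delta1:
            hi = alpha                                      # step too long
        # Condition (1.5): curvature condition with the delta1 term.
        elif float(grad(xa) @ dk) < sigma * gtd + delta * alpha * dd * delta1:
            lo = alpha                                      # step too short
        else:
            return alpha                                    # both conditions hold
        alpha = 2.0 * alpha if np.isinf(hi) else 0.5 * (lo + hi)
    return alpha                                            # fallback


def prp_mwwp(f, grad, x0, eps=1e-6, max_iter=1000):
    """Algorithm 1 (PRP-MWWP): PRP conjugate gradient with the MWWP line search."""
    xk = np.asarray(x0, dtype=float)
    gk = grad(xk)
    dk = -gk                                                # Step 2: d_0 = -g_0
    for _ in range(max_iter):
        if np.linalg.norm(gk) <= eps:                       # Steps 3 and 5: stopping test
            break
        if float(gk @ dk) >= 0.0:                           # practical restart safeguard,
            dk = -gk                                        # not part of Algorithm 1
        alpha = mwwp_line_search(f, grad, xk, dk, f(xk), gk)
        x_new = xk + alpha * dk                             # Step 5: x_{k+1} = x_k + alpha_k d_k
        g_new = grad(x_new)
        # PRP parameter: beta = g_{k+1}^T (g_{k+1} - g_k) / ||g_k||^2
        beta = float(g_new @ (g_new - gk)) / max(float(gk @ gk), 1e-300)
        dk = -g_new + beta * dk                             # Step 6, Eq. (2.7)
        xk, gk = x_new, g_new                               # Step 7
    return xk
```

Any smooth test function (for instance, those of Table 1 below) can be supplied through f and grad; since δ_1 is taken very small, the search behaves essentially like a standard weak Wolfe-Powell search with a tiny additional quadratic term.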
3. Convergence for the PRP method

To demonstrate the global convergence of Algorithm 1, the following assumption is required.

Assumption.
(i) The level set T_0 = {x | f(x) ≤ f(x_0)} is bounded.
(ii) In some neighbourhood N of T_0, f is differentiable, and its gradient g satisfies
    ‖g(x) − g(y)‖ ≤ L ‖x − y‖    (3.8)

for any x, y ∈ N, where L > 0 is a constant; namely, g is Lipschitz continuous.

Remark iii. Assumptions (i) and (ii) imply that there exists a constant G > 0 satisfying

    ‖g(x)‖ ≤ G,  x ∈ N.    (3.9)
Lemma 3.1. Let Assumptions (i) and (ii) hold. If there exists a constant υ > 0 satisfying

    ‖g(x_k)‖ ≥ υ,  ∀ k ≥ 0,    (3.10)

then

    ‖d(x_k)‖ ≤ ϑ,  ∀ k ≥ 0    (3.11)

holds, where ϑ > 0 is a constant.
Proof. By (1.4), we know that δ_1 is a sufficiently small scalar and g_k^T d_k ≤ 0, and then there exists a positive constant δ_3 < δ that guarantees −(δ − δ_3) α_k g_k^T d_k ≥ (δ α_k²/2) ‖d_k‖² δ_1; then,

    f(x_k + α_k d_k) ≤ f_k + δ α_k g_k^T d_k + (δ α_k²/2) ‖d_k‖² δ_1 ≤ f_k + δ_3 α_k g_k^T d_k,

so we have

    −δ_3 α_k g_k^T d_k ≤ f_k − f(x_k + α_k d_k).

Then, summing these inequalities from k = 0 to ∞ and using Assumption (i) gives

    Σ_{k=0}^{∞} (−δ_3 α_k g_k^T d_k) < ∞.    (3.12)
By (2.7), we obtain

    ‖d_{k+1}‖ ≤ ‖g_{k+1}‖ + |β_k^{PRP}| ‖d_k‖
             ≤ ‖g_{k+1}‖ + (‖g_{k+1}‖ ‖g_{k+1} − g_k‖ / ‖g_k‖²) ‖d_k‖
             ≤ G + (G L ‖s_k‖ / υ²) ‖d_k‖,    (3.13)
where s_k = x_{k+1} − x_k = α_k d_k, and the last inequality follows from (3.8), (3.9), and (3.10). By the proof of (3.12), we have

    α_k g_k^T d_k ≤ − (δ δ_1 / (2(δ − δ_3))) α_k² ‖d_k‖².    (3.14)

Thus, we obtain

    Σ_{k=0}^{∞} ‖s_k‖² = Σ_{k=0}^{∞} α_k² ‖d_k‖² ≤ (2(δ − δ_3)/(δ δ_1)) Σ_{k=0}^{∞} (−α_k g_k^T d_k) < ∞.
Then, we have ‖s_k‖² → 0 as k → ∞. This means that there exist a constant ω ∈ (0, 1) and a positive integer k_0 ≥ 0 satisfying

    G L ‖s_k‖ / υ² ≤ ω,  ∀ k ≥ k_0.

Therefore, for all k ≥ k_0, using (3.13), we obtain

    ‖d_{k+1}‖ ≤ G + ω ‖d_k‖ ≤ G (1 + ω + ω² + ⋯ + ω^{k−k_0−1}) + ω^{k−k_0} ‖d_{k_0}‖ ≤ G/(1 − ω) + ‖d_{k_0}‖.

Let ϑ = max{‖d_1‖, ‖d_2‖, …, ‖d_{k_0}‖, G/(1 − ω) + ‖d_{k_0}‖}; then we have

    ‖d(x_k)‖ ≤ ϑ,  ∀ k ≥ 0.

This completes the proof. □
Theorem 3.1. Suppose that the conditions of Lemma 3.1 hold. Then we obtain

    lim_{k→∞} ‖g_k‖ = 0.    (3.15)
Proof. The theorem will be proven by contradiction. Suppose that there exists a constant ς > 0 satisfying

    ‖g_k‖ ≥ ς,  ∀ k ≥ 0.

Similar to (3.12) (the terms of a convergent series tend to zero), using (3.14) gives

    (δ_3 δ δ_1 / (2(δ − δ_3))) α_k² ‖d_k‖² ≤ −δ_3 α_k g_k^T d_k → 0,  k → ∞.

Then, we have

    α_k² ‖d_k‖² → 0,  k → ∞.    (3.16)
The above relation implies the following two cases.

Case i: The step length α_k → 0 as k → ∞. By the line search (1.5) and the Taylor formula, we have

    g_k^T d_k + O(α_k ‖d_k‖²) = g(x_k + α_k d_k)^T d_k ≥ σ g_k^T d_k + δ α_k ‖d_k‖² δ_1 ≥ σ g_k^T d_k.

Using (3.14) leads to

    O(α_k ‖d_k‖²) ≥ −(1 − σ) g_k^T d_k ≥ (δ δ_1 (1 − σ) / (2(δ − δ_3))) α_k ‖d_k‖².

Then,

    O(α_k) ≥ δ δ_1 (1 − σ) / (2(δ − δ_3))

holds, which generates a contradiction with α_k → 0 as k → ∞; thus, the theorem is true in this case.
Case ii: The direction d_k → 0 as k → ∞. Using (2.7), (3.8), (3.9), and (3.16) gives

    0 ≤ ‖g_{k+1}‖ = ‖d_{k+1} − β_k^{PRP} d_k‖
                  ≤ ‖d_{k+1}‖ + (‖g_{k+1}‖ ‖g_{k+1} − g_k‖ / ‖g_k‖²) ‖d_k‖
                  ≤ ‖d_{k+1}‖ + (G L α_k ‖d_k‖ / ς²) ‖d_k‖ → 0,  k → ∞,

since ‖d_k‖ → 0 in this case and α_k ‖d_k‖ → 0 by (3.16); this contradicts ‖g_k‖ ≥ ς, so the theorem also holds in this case. In a word, we conclude that

    lim_{k→∞} ‖g_k‖ = 0

is true. This completes the proof. □
4. Numerical experiments

In this section, we report numerical results for the PRP method with the normal WWP line search technique and with the MWWP technique. Normal unconstrained optimization problems and image restoration problems are included. All of the numerical tests were run on a 2.30 GHz CPU with 8.00 GB of memory under the Windows 10 operating system.

4.1. Normal unconstrained optimization problems

The test problems are listed in Table 1. We not only show the performance of the PRP algorithm with the MWWP line search technique on the test problems in Table 1, but also show that δ_1 should be sufficiently small. The stopping rules, dimensions, and parameters used in the numerical experiments are as follows.

Stop rules (the Himmelblau stop rule [34]): If |f(x_k)| > e_1, let stop1 = |f(x_k) − f(x_{k+1})| / |f(x_k)|; otherwise, let stop1 = |f(x_k) − f(x_{k+1})|. The algorithm stops if ‖g(x)‖ < ε or stop1 < e_2 is satisfied, where e_1 = e_2 = 10⁻⁵ and ε = 10⁻⁶. The algorithm also stops if the number of iterations exceeds 1000.

Dimension: 3000, 6000, and 9000 variables.

Parameters: δ = 0.1, δ* = 0.05, δ_1 = 10⁻¹⁶ or δ_1 = 10⁻⁸, and σ = 0.9.

Experiment methods: The PRP-WWP method denotes the PRP method with the WWP line search, and the PRP-MWWP method is the PRP method with the MWWP line search.
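For clarity, a minimal sketch of this stopping test is given below; the function and variable names are our own.

```python
def himmelblau_stop(f_k, f_k1, g_norm, e1=1e-5, e2=1e-5, eps=1e-6):
    """Himmelblau-type stopping test used in Section 4.1 (illustrative sketch).

    f_k, f_k1 : objective values at x_k and x_{k+1}
    g_norm    : ||g(x_{k+1})||
    """
    # Relative decrease if |f(x_k)| is not too small, absolute decrease otherwise.
    stop1 = abs(f_k - f_k1) / abs(f_k) if abs(f_k) > e1 else abs(f_k - f_k1)
    return g_norm < eps or stop1 < e2
```

Here e1, e2, and eps correspond to e_1, e_2, and ε above.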
Table 1
Test problems.

Nr.  Test problem
1  Extended Freudenstein and Roth Function
2  Extended Trigonometric Function
3  Extended Rosenbrock Function
4  Extended White and Holst Function
5  Extended Beale Function
6  Extended Penalty Function
7  Perturbed Quadratic Function
8  Raydan 1 Function
9  Raydan 2 Function
10  Diagonal 1 Function
11  Diagonal 2 Function
12  Diagonal 3 Function
13  Hager Function
14  Generalized Tridiagonal 1 Function
15  Extended Tridiagonal 1 Function
16  Extended Three Exponential Terms Function
17  Generalized Tridiagonal 2 Function
18  Diagonal 4 Function
19  Diagonal 5 Function
20  Extended Himmelblau Function
21  Generalized PSC1 Function
22  Extended PSC1 Function
23  Extended Powell Function
24  Extended Block Diagonal BD1 Function
25  Extended Maratos Function
26  Extended Cliff Function
27  Quadratic Diagonal Perturbed Function
28  Extended Wood Function
29  Extended Hiebert Function
30  Quadratic Function QF1 Function
31  Extended Quadratic Penalty QP1 Function
32  Extended Quadratic Penalty QP2 Function
33  A Quadratic Function QF2 Function
34  Extended EP1 Function
35  Extended Tridiagonal-2 Function
36  BDQRTIC Function (CUTE)
37  TRIDIA Function (CUTE)
38  ARWHEAD Function (CUTE)
39  ARWHEAD Function (CUTE)
40  NONDQUAR Function (CUTE)
41  DQDRTIC Function (CUTE)
42  EG2 Function (CUTE)
43  DIXMAANA Function (CUTE)
44  DIXMAANB Function (CUTE)
45  DIXMAANC Function (CUTE)
46  DIXMAANE Function (CUTE)
47  Partial Perturbed Quadratic Function
48  Broyden Tridiagonal Function
49  Almost Perturbed Quadratic Function
50  Tridiagonal Perturbed Quadratic Function
51  EDENSCH Function (CUTE)
52  VARDIM Function (CUTE)
53  STAIRCASE S1 Function
54  LIARWHD Function (CUTE)
55  DIAGONAL 6 Function
56  DIXON3DQ Function (CUTE)
57  DIXMAANF Function (CUTE)
58  DIXMAANG Function (CUTE)
59  DIXMAANH Function (CUTE)
60  DIXMAANI Function (CUTE)
61  DIXMAANJ Function (CUTE)
62  DIXMAANK Function (CUTE)
63  DIXMAANL Function (CUTE)
64  DIXMAAND Function (CUTE)
65  ENGVAL1 Function (CUTE)
66  FLETCHCR Function (CUTE)
67  COSINE Function (CUTE)
68  Extended DENSCHNB Function (CUTE)
69  DENSCHNF Function (CUTE)
70  SINQUAD Function (CUTE)
71  BIGGSB1 Function (CUTE)
72  Partial Perturbed Quadratic PPQ2 Function
73  Scaled Quadratic SQ1 Function
74  Scaled Quadratic SQ2 Function
The columns of Table 1 and of the performance data have the following meanings. Nr.: the number of the tested problem. Test problem: the name of the problem. NI: the number of iterations. NFG: the total number of function and gradient evaluations. CPU: the calculation time in seconds.

The numerical results in Figures 1–3 were obtained with δ_1 = 10⁻¹⁶, and those in Figures 4–6 with δ_1 = 10⁻⁸. Comparing the CPU performance of the PRP-MWWP method in Fig. 3 and Fig. 6, it is clear that the PRP-MWWP method performs better when δ_1 = 10⁻¹⁶, so we conclude that δ_1 should be sufficiently small. In what follows we therefore focus on Figures 1–3. The three tested methods are the PRP method with the modified WWP line search (PRP-MWWP), the PRP method with the WWP line search (PRP-WWP), and the PRP method with the YWL line search [36] (PRP-YWL), respectively.

In the numerical results, the CPU time normally increases as the dimension of a problem grows, but the computer system sometimes causes the CPU time to become smaller, as for problems 4, 8, and 17 with the PRP-WWP method and problems 10 and 12 with the PRP-MWWP method. The method of Dolan and Moré [11] is used to analyse the profiles of these methods, and Figures 1–3 show the profiles relative to NI, NFG, and CPU time, respectively. Since these three figures exhibit a similar trend, we only analyse Fig. 3, which concerns the CPU time. One further point: from the theoretical analysis and proofs, the PRP-MWWP method has relatively stronger theoretical results, whereas Fig. 3 shows that the overall numerical behaviour of the three algorithms does not differ much. Fig. 3 indicates that the PRP-YWL method is the best of the three, but the robustness of the PRP-MWWP method is better than that of the PRP-WWP method. Therefore, the PRP-MWWP method is competitive with the PRP-WWP method and provides noticeable advantages.
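For completeness, the Dolan-Moré performance profiles shown in Figures 1–6 can be computed along the lines of the following sketch; the array layout and function name are our own choices.

```python
import numpy as np

def performance_profile(T, taus):
    """Dolan-More performance profile [11] (illustrative sketch).

    T    : (n_problems, n_solvers) array of performance measures
           (e.g. NI, NFG, or CPU time); failures can be coded as np.inf.
    taus : 1-D array of performance-ratio thresholds tau >= 1.
    Returns rho with rho[j, i] = fraction of problems that solver i
    solves within a factor taus[j] of the best solver.
    """
    T = np.asarray(T, dtype=float)
    best = T.min(axis=1, keepdims=True)           # best measure per problem
    ratios = T / best                             # performance ratios r_{p,s}
    rho = np.array([(ratios <= tau).mean(axis=0) for tau in taus])
    return rho
```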
Fig. 1. Performance profiles of these methods (NI) (δ_1 = 10⁻¹⁶).
Fig. 2. Performance profiles of these methods (NFG) (δ_1 = 10⁻¹⁶).
Fig. 3. Performance profiles of these methods (CPU) (δ_1 = 10⁻¹⁶).
4.2. Image restoration problems

This subsection deals with image restoration problems, in which the original image is recovered from an image corrupted by impulse noise; these problems are known to be difficult and arise in many application fields. The code is stopped if the condition

    |f_{k+1} − f_k| / |f_k| < 10⁻³  or  ‖x_{k+1} − x_k‖ / ‖x_k‖ < 10⁻³

holds. The experiments choose Lena (256 × 256) and Barbara (512 × 512) as the test images. The detailed performance is shown in Fig. IMI-1 and Fig. IMI-2. It is easy to see that these algorithms are all successful in restoring the two images.
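As a rough illustration of how such a test instance can be prepared, the sketch below corrupts a grayscale image with salt-and-pepper noise at a prescribed level and implements the relative-change stopping test stated above; the noise routine and all names are our own assumptions, and the restoration objective z itself is not reproduced here.

```python
import numpy as np

def add_salt_and_pepper(img, level, rng=None):
    """Corrupt a grayscale image (values in [0, 255]) with salt-and-pepper noise.

    level : fraction of corrupted pixels, e.g. 0.40 or 0.65 as in Section 4.2.
    """
    rng = np.random.default_rng() if rng is None else rng
    noisy = img.copy()
    mask = rng.random(img.shape) < level           # pixels to corrupt
    salt = rng.random(img.shape) < 0.5             # half salt, half pepper
    noisy[mask & salt] = 255
    noisy[mask & ~salt] = 0
    return noisy

def restoration_stop(f_k, f_k1, x_k, x_k1, tol=1e-3):
    """Relative-change stopping test used for the restoration runs (sketch)."""
    rel_f = abs(f_k1 - f_k) / max(abs(f_k), np.finfo(float).tiny)
    rel_x = np.linalg.norm(x_k1 - x_k) / max(np.linalg.norm(x_k), np.finfo(float).tiny)
    return rel_f < tol or rel_x < tol
```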
Fig. 4. Performance profiles of these methods (NI) (δ_1 = 10⁻⁸).
Fig. 5. Performance profiles of these methods (NFG) (δ_1 = 10⁻⁸).
Fig. 6. Performance profiles of these methods (CPU) (δ_1 = 10⁻⁸).
The CPU time spent is stated in Table IMI1 to compare the three algorithms, and the given algorithm is competitive in terms of CPU time. The results in Table IMI1 support at least two conclusions: (i) the PRP-MWWP, PRP-WWP, and PRP-YWL methods are all successful in restoring these images with suitable CPU time; and (ii) the presented algorithm is competitive with the other methods for both the 40% noise problems and the 65% noise problems.
Fig. IMI-1. Restoration of the Lena image by the PRP-MWWP method, the PRP-WWP method, and the PRP-YWL method. From left to right: a noisy image with 40% salt-and-pepper noise, followed by the restorations obtained by minimizing z with the PRP-MWWP method, the PRP-WWP method, and the PRP-YWL method.
Fig. IMI-2. Restoration of the Barbara image by the PRP-MWWP method, the PRP-WWP method, and the PRP-YWL method. From left to right: a noisy image with 65% salt-and-pepper noise, followed by the restorations obtained by minimizing z with the PRP-MWWP method, the PRP-WWP method, and the PRP-YWL method.
Table IMI1
The CPU time of these methods in seconds.

40% noise    Lena      Barbara   Total
PRP-MWWP     2.7812    2.8750    5.6562
PRP-WWP      2.9062    2.7969    5.7031
PRP-YWL      3.1562    3.5312    6.6878

65% noise    Lena      Barbara   Total
PRP-MWWP     4.2656    4.0468    8.3124
PRP-WWP      3.9531    4.0781    8.0312
PRP-YWL      4.3750    4.1406    8.5156
5. Conclusion

In this paper, we present a modified weak Wolfe-Powell (MWWP) line search and prove the global convergence of the PRP method with this technique for general functions. The numerical results show that the PRP-MWWP method is competitive with the normal PRP-WWP method for large-scale problems. The Muskingum model and the image restoration problem are analysed by the given algorithms, and the numerical performance is interesting. In the future, at least five points should be noticed. (a) We think that the normal PRP method with the normal WWP technique can also solve the Muskingum model, and we will test this method in the future. (b) The theory and numerical performance of the MWWP technique for other nonlinear conjugate gradient methods should be studied. (c) The theory and numerical performance of the MWWP technique for the quasi-Newton method is an interesting direction for future work. (d) Similar potential work is the theory and numerical performance of the MWWP technique with the trust region method. (e) More numerical experiments, including more practical problems, should be performed in the future to test the performance of the proposed algorithm.

Acknowledgements

The authors would like to thank the editor and the referees for their valuable comments, which greatly improved this paper.

References
[1] E.G. Birgin, J.M. Martínez, A spectral conjugate gradient method for unconstrained optimization, Appl. Math. Optim. 43 (2001) 117–128.
[2] W. Cheng, A two-term PRP-based descent method, Numer. Funct. Anal. Optim. 28 (2007) 1217–1230.
[3] A. Cohen, Rate of convergence of several conjugate gradient algorithms, SIAM J. Numer. Anal. 9 (1972) 248–259.
[4] Y. Dai, Analysis of Conjugate Gradient Methods, Ph.D. thesis, Institute of Computational Mathematics and Scientific/Engineering Computing, 1997.
[5] Y. Dai, New properties of a nonlinear conjugate gradient method, Numer. Math. 89 (2001) 83–98.
[6] Y. Dai, Conjugate gradient methods with Armijo-type line searches, Acta Math. Appl. Sin. 18 (2002) 123–130.
[7] Y. Dai, C. Kou, A nonlinear conjugate gradient algorithm with an optimal property and an improved Wolfe line search, SIAM J. Optim. 23 (2013) 296–320.
[8] Y. Dai, Y. Yuan, A nonlinear conjugate gradient with a strong global convergence property, SIAM J. Optim. 10 (1999) 177–182.
[9] Z.F. Dai, H. Zhu, Forecasting stock market returns by combining Sum-of-the-parts and ensemble empirical mode decomposition, Appl. Econ. (2019), https://doi.org/10.1080/00036846.2019.1688244.
[10] Y. Ding, Y. Xiao, J. Li, A class of conjugate gradient methods for convex constrained monotone equations, Optimization 66 (2017) 2309–2328.
[11] E.D. Dolan, J.J. Moré, Benchmarking optimization software with performance profiles, Math. Program. 91 (2002) 201–213.
[12] R. Fletcher, Practical Methods of Optimization, vol. I: Unconstrained Optimization, 2nd ed., John Wiley and Sons, Chichester, 1987.
[13] R. Fletcher, C.M. Reeves, Function minimization by conjugate gradients, Comput. J. 7 (1964) 149–154.
[14] J.C. Gilbert, J. Nocedal, Global convergence properties of conjugate gradient methods for optimization, SIAM J. Optim. 2 (1992) 21–42.
[15] L. Grippo, S. Lucidi, A globally convergent version of the Polak-Ribière-Polyak conjugate gradient method, Math. Program. 78 (1997) 375–391.
[16] M.R. Hestenes, E. Stiefel, Method of conjugate gradient for solving linear equations, J. Res. Natl. Bur. Stand. 49 (1952) 409–436.
[17] Y. Huang, C. Liu, Dai-Kou type conjugate gradient methods with a line search only using gradient, J. Inequal. Appl. 66 (2017) 1–17.
[18] K.M. Khoda, Y. Liu, C. Storey, Generalized Polak-Ribière algorithm, J. Optim. Theory Appl. 75 (1992) 345–354.
[19] C. Kou, An improved nonlinear conjugate gradient method with an optimal property, Sci. China Math. 57 (2014) 635–648.
[20] X. Li, S. Wang, Z. Jin, H. Pham, A conjugate gradient algorithm under Yuan-Wei-Lu line search technique for large-scale minimization optimization models, Math. Probl. Eng. 2018 (2018) 4729318.
[21] Y. Liu, C. Storey, Efficient generalized conjugate gradient algorithms, part 1: theory, J. Optim. Theory Appl. 69 (1991) 129–137.
[22] J. Nocedal, S.J. Wright, Numerical Optimization, 2nd ed., Springer Series in Operations Research, Springer, 2006.
[23] E. Polak, G. Ribière, Note sur la convergence de directions conjuguées, Rev. Fr. Inform. Rech. Opér., 3e Année 16 (1969) 35–43.
[24] B.T. Polyak, The conjugate gradient method in extreme problems, USSR Comput. Math. Math. Phys. 9 (1969) 94–112.
[25] M.J.D. Powell, Nonconvex Minimization Calculations and the Conjugate Gradient Method, Lecture Notes in Mathematics, vol. 1066, Springer-Verlag, Berlin, 1984, pp. 122–141.
[26] M.J.D. Powell, Convergence properties of algorithms for nonlinear optimization, SIAM Rev. 28 (1986) 487–500.
[27] D.F. Shanno, Conjugate gradient methods with inexact line searches, Math. Oper. Res. 3 (1978) 244–256.
[28] Z. Shi, Restricted PR conjugate gradient method and its global convergence, Adv. Math. 31 (2002) 47–55.
[29] Y. Xiao, H. Zhu, A conjugate gradient method to solve convex constrained monotone equations with applications in compressive sensing, J. Math. Anal. Appl. 405 (2013) 310–319.
[30] G. Yu, L. Guan, Z. Wei, Globally convergent Polak-Ribière-Polyak conjugate gradient methods under a modified Wolfe line search, Appl. Math. Comput. 215 (2009) 3082–3090.
[31] Y. Yuan, Analysis on the conjugate gradient method, Optim. Methods Softw. 2 (1993) 19–29.
[32] G. Yuan, T. Li, W. Hu, A conjugate gradient algorithm for large-scale nonlinear equations and image restoration problems, Appl. Numer. Math. 147 (2020) 129–141.
[33] G. Yuan, Z. Sheng, B. Wang, W. Hu, C. Li, The global convergence of a modified BFGS method for nonconvex functions, J. Comput. Appl. Math. 327 (2018) 274–294.
[34] Y. Yuan, W. Sun, Theory and Methods of Optimization, Science Press of China, Beijing, 1999.
[35] G. Yuan, X. Wang, Z. Sheng, Family weak conjugate gradient algorithms and their convergence analysis for nonconvex functions, Numer. Algorithms (2019) 1–22, https://doi.org/10.1007/s11075-019-00787-7.
[36] G. Yuan, Z. Wei, X. Lu, Global convergence of the BFGS method and the PRP method for general functions under a modified weak Wolfe-Powell line search, Appl. Math. Model. 47 (2017) 811–825.
[37] G. Yuan, Z. Wei, Y. Yang, The global convergence of the Polak-Ribière-Polyak conjugate gradient algorithm under inexact line search for nonconvex functions, J. Comput. Appl. Math. 362 (2019) 262–275.
[38] L. Zhang, W. Zhou, D. Li, A descent modified Polak-Ribière-Polyak conjugate gradient method and its global convergence, IMA J. Numer. Anal. 26 (2006) 629–640.
[39] W. Zhou, A short note on the global convergence of the unmodified PRP method, Optim. Lett. 7 (2013) 1367–1372.
[40] W. Zhou, D. Li, On the convergence properties of the unmodified PRP method with a non-descent line search, Optim. Methods Softw. 29 (2014) 484–496.