Nonlinear Analysis 71 (2009) e2305–e2315
An adaptive-step primal-dual interior point algorithm for linear optimization

Min Kyung Kim^a, Yong-Hoon Lee^a, Gyeong-Mi Cho^b,∗

^a Department of Mathematics, Pusan National University, Busan, South Korea
^b Department of Multimedia Engineering, Dongseo University, Busan 617-716, South Korea
∗ Corresponding author. E-mail address: [email protected] (G.-M. Cho).
MSC: 49M15; 65K05; 90C33

Keywords: Adaptive-step; Primal-dual interior point method; Large-update; Kernel function; Polynomial algorithm; Worst-case complexity; Linear optimization problem
Abstract. A common feature shared by most practical interior point methods is the use of Mehrotra's predictor–corrector algorithm [S. Mehrotra, On the implementation of a (primal-dual) interior point method, SIAM Journal on Optimization 2 (1992) 575–601], in which the predictor step is never actually taken but is used only to calculate an adaptive update; thus, instead of a predictor step and a corrector centering step, a single Newton step is made toward the adaptively chosen target. In this paper we propose a new adaptive single-step large-update primal-dual interior point algorithm with wide neighborhood for linear optimization (LO) problems, based on the simple kernel function first defined in [Y.Q. Bai, C. Roos, A primal-dual interior-point method based on a new kernel function with linear growth rate, in: Proceedings of Industrial Optimization Symposium and Optimization Day, Nov. 2002]. We show that the algorithm has O(q√(nτ) log(n/ε)) complexity, which is similar to the one in the above-mentioned reference.
1. Introduction

In this paper we consider the LO problem in the following standard form:
(P)  min{c^T x : Ax = b, x ≥ 0},

where A ∈ R^{m×n} with rank(A) = m, b ∈ R^m, c ∈ R^n, and its dual problem

(D)  max{b^T y : A^T y + s = c, s ≥ 0}.
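For concreteness, here is a minimal sketch of such a primal-dual pair in Python; the data A, b, c and the strictly feasible point are illustrative choices of ours, not taken from the paper.

```python
import numpy as np

# A toy instance of (P) and (D): min c^T x s.t. Ax = b, x >= 0,
# and max b^T y s.t. A^T y + s = c, s >= 0.
A = np.array([[1.0, 1.0, 1.0],
              [1.0, 2.0, 0.0]])   # m = 2, n = 3, rank(A) = 2
b = np.array([3.0, 3.0])
c = np.array([2.0, 3.0, 1.0])

# A strictly feasible primal-dual point (x, y, s): Ax = b, x > 0,
# A^T y + s = c, s > 0 (the interior point condition, IPC, of Section 2).
x = np.array([1.0, 1.0, 1.0])
y = np.array([0.5, 0.0])
s = c - A.T @ y                   # s = (1.5, 2.5, 0.5) > 0

assert np.allclose(A @ x, b) and np.all(x > 0) and np.all(s > 0)
print("duality gap x^T s:", x @ s)   # driven to 0 by an interior point method
```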
Since Karmarkar's seminal paper [1], a variety of interior point algorithms have been proposed and analyzed; we refer the reader to [2–4,14]. Primal-dual interior point methods (IPMs) for LO problems were first introduced by Kojima et al. [5] and Megiddo [6], and they have shown their power in solving large classes of optimization problems. In IPMs the choice of the proximity function is crucial, not only for the quality and elegance of the analysis but also for the performance of the algorithm. Most polynomial-time interior point algorithms for LO are based on the logarithmic barrier function. Recently, Peng et al. [7] introduced a class of self-regular proximity functions and designed primal-dual IPMs based on this class. They obtained an O(√n log n log(n/ε)) complexity result for large-update primal-dual IPMs for LO based on a specific subclass of these functions, which they used both to measure the proximity of the current iterate to the central path and to define the search direction. They also proposed a dynamic large-update version of the interior point
method (IPM) in a large neighborhood based on the specific self-regular barrier function ψ(t) = (t² − 1)/2 + (t^{−2} − 1)/2 [8]. Later on, Salahi and Terlaky [9] proposed an adaptive single-step large-update IPM based on the specific class of self-regular proximity functions ψ(t) = (t² − 1)/2 + (t^{1−q} − 1)/(q − 1), q > 1, and obtained the worst-case iteration bound O(qn^{(q+1)/(2q)} log(n/ε)). Pan and Li [10] proposed a self-adjusting primal-dual IPM for LO based on a new proximity function and reported numerical experiments without a theoretical analysis of the algorithm. Bai et al. [11] proposed new large-update primal-dual IPMs for LO based on a simple barrier function which is neither logarithmic nor self-regular, used it as the proximity function, obtained polynomial complexity, and simplified the analysis.

The choice of the barrier update parameter θ also plays an important role in both the theory and practice of IPMs. Usually, if θ is a constant independent of n, the dimension of the problem, e.g., θ = 1/2, then we call the algorithm a large-update method. If θ depends on n, e.g., θ = 1/(2√n), then the algorithm is called a small-update method. In the worst-case complexity theory, small-update primal-dual IPMs are better than large-update ones; in practice, however, large-update primal-dual IPMs are the most efficient methods [12].

In this paper we propose a variant of the algorithm in [11] for LO, an adaptive single-step large-update primal-dual interior point algorithm with wide neighborhood, and derive the worst-case complexity of the algorithm. We choose the target parameter µ depending on the current iterate and do not use any inner process to get recentered, which is a common feature shared by most algorithms implemented in IPMs. This is different from [11], where large updates with fixed ratios and some inner iterations to recenter are used; this makes the analysis different. Since our algorithm is based on a proximity function which has a linear growth term, the analysis also differs from that in [8,9].

This paper is organized as follows. In Section 2 we recall some basic concepts and the generic primal-dual interior point algorithm for LO. In Section 3 we introduce the proximity function and define a new adaptive-step primal-dual interior point algorithm. In Section 4 we analyze the worst-case complexity of the algorithm. Finally, we give concluding remarks in Section 5.

We use the following notation throughout the paper. R^n_+ denotes the set of n-dimensional nonnegative vectors and R^n_{++} the set of n-dimensional strictly positive vectors. xs denotes the componentwise (Hadamard) product of the vectors x and s. For x = (x_1, x_2, . . . , x_n)^T ∈ R^n, x_min = min{x_1, x_2, . . . , x_n}, i.e., the minimal component of x. x^T s is the scalar product of the vectors x and s. X is the diagonal matrix of the vector x, i.e., X = diag(x); ‖·‖ is the Euclidean norm; e is the n-dimensional vector of ones; and I is the n-dimensional identity matrix. J is the index set, say J = {1, 2, . . . , n}.

2. Preliminaries

In this section we give some definitions and the basic idea of the generic primal-dual interior point method.
Definition 2.1. A function f : D(⊂ R) → R is exponentially convex if and only if f(√(x_1 x_2)) ≤ (1/2)(f(x_1) + f(x_2)) for all x_1, x_2 ∈ D.

Without loss of generality we may assume that both (P) and (D) satisfy the interior point condition (IPC), i.e., there exists (x⁰, y⁰, s⁰) such that Ax⁰ = b, x⁰ > 0, A^T y⁰ + s⁰ = c, s⁰ > 0. For this the reader is referred to [2,4]. Finding an optimal solution of (P) and (D) is equivalent to solving the following KKT system:
Ax = b, x ≥ 0,  A^T y + s = c, s ≥ 0,  xs = 0.  (1)
The basic idea of primal-dual IPMs is to relax the complementarity condition, which is the third equation in (1), and obtain the following parameterized system:
Ax = b, x > 0,  A^T y + s = c, s > 0,  xs = µe,  (2)
where µ > 0. If the IPC holds, then the system (2) has a unique solution for each µ > 0. We denote by (x(µ), y(µ), s(µ)) the solution of (2) for given µ > 0. We call it the µ-center and the solution set {(x(µ), y(µ), s(µ)) | µ > 0} the central path of (P ) and (D). Primal-dual IPMs follow the central path approximately and approach the solution of (P ) and (D) as µ goes to zero [2]. To find the solution of (2) we apply the Newton method and then we get the following Newton system:
A∆x = 0,  A^T ∆y + ∆s = 0,  S∆x + X∆s = µe − xs.  (3)
Since rank(A) = m and (x, s) > 0, the Newton system (3) has a unique solution, which is called the Newton search direction (∆x, ∆y, ∆s). The generic primal-dual interior point algorithm works as follows. We assume that we are given a strictly feasible point (x, y, s) which is in a τ-neighborhood of the given µ-center [2]. We then decrease µ to µ⁺ = (1 − θ)µ for some fixed θ ∈ (0, 1) and solve the Newton system (3) with µ := µ⁺ to obtain the unique search direction (∆x, ∆y, ∆s). The positivity of the new iterate (x(α), y(α), s(α)), with x(α) := x + α∆x, y(α) := y + α∆y, s(α) := s + α∆s, is ensured by a suitable choice of the step size α. This procedure is repeated until we find a new iterate (x(α), y(α), s(α)) that is in a τ-neighborhood of the µ⁺-center; then we let µ := µ⁺ and (x, y, s) := (x(α), y(α), s(α)). Then µ is again reduced by the factor 1 − θ and we solve the Newton system (3) targeting the new µ⁺-center, and so on. This process is repeated until µ is small enough, e.g., nµ ≤ ε. The generic primal-dual algorithm is outlined as follows (a Python sketch follows the outline):

Generic Primal-Dual Algorithm for LO

Input:
  a proximity parameter τ > 0;
  an accuracy parameter ε > 0;
  a fixed barrier update parameter θ, 0 < θ < 1;
  a starting point (x⁰, s⁰) and µ⁰ such that Ψ(x⁰, s⁰, µ⁰) ≤ τ;
begin
  x := x⁰; s := s⁰; µ := µ⁰;
  while nµ ≥ ε do
  begin
    µ := (1 − θ)µ;
    while Ψ(x, s, µ) ≥ τ do
    begin
      solve the system (3) for ∆x, ∆y, and ∆s;
      determine a step size α;
      x := x + α∆x; y := y + α∆y; s := s + α∆s;
    end
  end
end
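The following Python sketch illustrates this generic scheme under simplifying assumptions of ours: Ψ is the classical logarithmic-barrier proximity introduced in Section 3, the Newton system (3) is solved via its normal equations, and the step is damped only to preserve positivity (the cited references use more careful step-size rules). All function names are ours.

```python
import numpy as np

def newton_direction(A, x, s, mu):
    """Solve (3): A dx = 0, A^T dy + ds = 0, S dx + X ds = mu*e - xs,
    via the normal equations (A S^-1 X A^T) dy = -A S^-1 (mu*e - xs)."""
    r = mu - x * s
    dy = np.linalg.solve(A @ np.diag(x / s) @ A.T, -A @ (r / s))
    ds = -A.T @ dy
    dx = (r - x * ds) / s
    return dx, dy, ds

def psi_log(v):
    """Logarithmic-barrier proximity: sum of (v_i^2 - 1)/2 - log(v_i)."""
    return np.sum((v**2 - 1) / 2 - np.log(v))

def generic_ipm(A, b, c, x, y, s, theta=0.5, tau=1.0, eps=1e-8):
    n = len(x)
    mu = x @ s / n
    while n * mu >= eps:
        mu *= 1 - theta                        # barrier update mu := (1-theta)mu
        while psi_log(np.sqrt(x * s / mu)) >= tau:  # recentering loop
            dx, dy, ds = newton_direction(A, x, s, mu)
            alpha = 1.0                        # damp only to keep (x, s) > 0
            while np.any(x + alpha * dx <= 0) or np.any(s + alpha * ds <= 0):
                alpha *= 0.5
            x, y, s = x + alpha * dx, y + alpha * dy, s + alpha * ds
    return x, y, s
```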
3. The algorithm

In this section we introduce some properties of the proximity function and define a new adaptive single-step large-update interior point algorithm based on wide neighborhoods. For notational convenience, we define the following:
v = √(xs/µ),  d_x = v∆x/x,  d_s = v∆s/s.  (4)
From (4), we can rewrite the Newton system (3) as follows:
Ād_x = 0,  Ā^T ∆y + d_s = 0,  d_x + d_s = v^{−1} − v,  (5)
where Ā = µ^{−1}AV^{−1}X and V^{−1} = diag(v^{−1}). Note that the right-hand side v^{−1} − v in the third equation in (5) is the steepest descent direction of the logarithmic barrier function Ψ_l(v) = Σ_{i=1}^{n} ψ_l(v_i), ψ_l(t) = (t² − 1)/2 − log t, i.e., d_x + d_s = −∇Ψ_l(v). This implies that solving the complementarity condition xs = µe in (2) is equivalent to minimizing the logarithmic barrier function. In this paper we replace the logarithmic barrier function Ψ_l(v) with Ψ_q(v) = Σ_{i=1}^{n} ψ_q(v_i), where
ψ_q(t) = t − 1 + (t^{1−q} − 1)/(q − 1),  q ≥ 3,  (6)
and call ψ_q(t) the kernel function. Since ψ_q(t) has a linear growth term, it is not self-regular in the sense of [7]. We also use Ψ_q(v) as a proximity function and write

Ψ_q(v) = Φ_q(x, s, µ) = µ^{−1/2}(x^{1/2})^T s^{1/2} − n + (q − 1)^{−1}µ^{(q−1)/2}(x^{(1−q)/2})^T s^{(1−q)/2} − n/(q − 1).

By replacing the kernel function we obtain the modified Newton system:
Ād_x = 0,  Ā^T ∆y + d_s = 0,  d_x + d_s = −∇Ψ_q(v),  (7)
where ∇Ψ_q(v) = e − v^{−q}. We also define the norm-based proximity measure σ(v) as follows:
σ(v) = ‖−∇Ψ_q(v)‖ = ‖d_x + d_s‖ = ‖v^{−q} − e‖.  (8)
For notational convenience we denote σ(v) by σ. Since d_x and d_s are orthogonal,

σ² = ‖d_x + d_s‖² = Σ_{i=1}^{n}([d_x]_i² + [d_s]_i²) = ‖(d_x^T, d_s^T)‖²,
where [d_x]_i and [d_s]_i are the ith components of the vectors d_x and d_s, respectively. Using (4), we can write the third equation in (7) as xsd_x/v + xsd_s/v = µ^{(q+1)/2}(xs)^{(1−q)/2} − µ^{1/2}(xs)^{1/2}. Thus we can rewrite the system (7) in the original space as follows:

A∆x = 0,  A^T ∆y + ∆s = 0,  S∆x + X∆s = µ^{(q+1)/2}(xs)^{(1−q)/2} − µ^{1/2}(xs)^{1/2}.  (9)
For ψ_q(t), q ≥ 3, we have

ψ_q′(t) = 1 − t^{−q},  ψ_q″(t) = qt^{−q−1},  ψ_q‴(t) = −q(q + 1)t^{−q−2}.

Since ψ_q″(t) > 0, ψ_q(t) is strictly convex. Hence Ψ_q(v) is strictly convex and attains its minimum at v = e. Then we have

Ψ_q(v) = 0 ⇔ σ(v) = 0 ⇔ v = e.

In the following we introduce key properties which are important in the analysis of the algorithm.

Lemma 3.1 (Section 2.5 in [13]). The kernel function ψ_q(t) in (6) satisfies the following properties:
(i) tψ_q″(t) + ψ_q′(t) > 0, t > 0;
(ii) ψ_q‴(t) < 0, t > 0;
(iii) ψ_q″(t)ψ_q′(βt) − βψ_q′(t)ψ_q″(βt) > 0, t > 1, β > 1.
By Lemma 3.1 and Lemma 2.3 in [11], the kernel function ψ_q(t) is exponentially convex. Define ϱ : [0, ∞) → [1, ∞) as the inverse function of ψ_q(t) for t ≥ 1.
In the following lemma we obtain a lower bound for σ.

Lemma 3.2. Let σ be the norm-based proximity measure defined in (8). Then we have

σ ≥ 1/2.  (10)

Proof. By Theorem 4.9 in [13] and the definition of ψ_q, with Ψ_q := Ψ_q(v), we have

σ ≥ ψ_q′(ϱ(Ψ_q)) = 1 − 1/ϱ(Ψ_q)^q.  (11)

Let s = ψ_q(t), t ≥ 1. Then by the definition of ϱ, ϱ(s) = t, and

1 + s = 1 + ψ_q(t) = t + (t^{1−q} − 1)/(q − 1).  (12)

Since t ≥ 1 and q > 1, t^{q−1} ≥ 1. Hence

t^{1−q}/(q − 1) ≤ 1/(q − 1).  (13)

From (12) and (13), 1 + s ≤ t = ϱ(s). So 1 + Ψ_q ≤ ϱ(Ψ_q) and hence 1/ϱ(Ψ_q)^q ≤ 1/(1 + Ψ_q)^q. Assuming Ψ_q ≥ τ ≥ 1, we have

1/ϱ(Ψ_q)^q ≤ 1/(1 + Ψ_q)^q ≤ 1/2^q.  (14)

From (11), (14), and q > 1, we have σ ≥ 1 − 1/2^q > 1/2.
Lemma 3.3. Let σ be the norm-based proximity measure defined in (8). Then we have

v_min ≥ (σ + 1)^{−1/q}.

Proof. First, consider the case v_min ≤ 1. Since q ≥ 3, we have

σ = ‖−∇Ψ_q(v)‖ = ‖v^{−q} − e‖ ≥ |v_min^{−q} − 1| = v_min^{−q} − 1.

Thus we have v_min ≥ (σ + 1)^{−1/q}. Secondly, for v_min > 1, since 1 ≥ (1 + σ)^{−1/q}, we have v_min ≥ (σ + 1)^{−1/q}. This completes the proof.
For (x, s) > 0, define µ̃ := ((x^{1/2})^T s^{1/2})²/n and µ̃_h := (n/((x^{(1−q)/2})^T s^{(1−q)/2}))^{2/(q−1)}. Note that for q = 3, µ̃_h = n/((x^{−1})^T s^{−1}) is the harmonic mean of the products x_i s_i, 1 ≤ i ≤ n, and µ̃ ≥ µ_a, where µ_a := x^T s/n is the arithmetic mean of the x_i s_i. In the following lemma we obtain the global minimizer of the proximity function Φ_q(x, s, µ) with respect to µ.

Lemma 3.4. For given (x, s) > 0, the proximity function Φ_q(x, s, µ), q ≥ 3, is a strictly convex function with respect to µ and has a global minimum at

µ* = ((x^{1/2})^T s^{1/2}/((x^{(1−q)/2})^T s^{(1−q)/2}))^{2/q}.

Proof. By the definition of the proximity function Φ_q(x, s, µ), we have Φ_q′(x, s, µ) = −(1/2)µ^{−3/2}(x^{1/2})^T s^{1/2} + (1/2)µ^{(q−3)/2}(x^{(1−q)/2})^T s^{(1−q)/2} and Φ_q″(x, s, µ) = (3/4)µ^{−5/2}(x^{1/2})^T s^{1/2} + ((q − 3)/4)µ^{(q−5)/2}(x^{(1−q)/2})^T s^{(1−q)/2} > 0. Thus Φ_q(x, s, µ) is strictly convex with respect to µ. By setting Φ_q′(x, s, µ) = 0, we get the global minimizer µ* = ((x^{1/2})^T s^{1/2}/((x^{(1−q)/2})^T s^{(1−q)/2}))^{2/q}.

Throughout the paper, we assume that τ > n.
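In Python, these means and the minimizer of Lemma 3.4 can be sketched as follows (illustrative helper names of ours):

```python
import numpy as np

def mu_tilde(x, s):
    """mu~ = ((x^(1/2))^T s^(1/2))^2 / n."""
    return np.sum(np.sqrt(x * s))**2 / len(x)

def mu_tilde_h(x, s, q=3):
    """mu~_h = (n / ((x^((1-q)/2))^T s^((1-q)/2)))^(2/(q-1));
    for q = 3 this is the harmonic mean of the products x_i s_i."""
    n = len(x)
    return (n / np.sum((x * s)**((1 - q) / 2)))**(2 / (q - 1))

def mu_star(x, s, q=3):
    """Global minimizer of mu -> Phi_q(x, s, mu) from Lemma 3.4."""
    return (np.sum(np.sqrt(x * s)) / np.sum((x * s)**((1 - q) / 2)))**(2 / q)
```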
Lemma 3.5. If µ̃ < τµ̃_h, then Φ_q(x, s, µ̃/τ) < 3√(nτ) − qn/(q − 1).

Proof. By assumption, µ̃ = τ̄µ̃_h for some τ̄ < τ. By the definition of the proximity function Φ_q(x, s, µ), we have

Φ_q(x, s, µ̃/τ) = (µ̃/τ)^{−1/2}(x^{1/2})^T s^{1/2} − n + (1/(q − 1))(µ̃/τ)^{(q−1)/2}(x^{(1−q)/2})^T s^{(1−q)/2} − n/(q − 1).  (15)
We can express the first term in (15) as follows:

(µ̃/τ)^{−1/2}(x^{1/2})^T s^{1/2} = (τ/µ̃)^{1/2}(x^{1/2})^T s^{1/2} = (√(nτ)/((x^{1/2})^T s^{1/2}))(x^{1/2})^T s^{1/2} = √(nτ).  (16)
Since τ̄/τ < 1, the third term in (15) can be bounded as follows:

(q − 1)^{−1}(µ̃/τ)^{(q−1)/2}(x^{(1−q)/2})^T s^{(1−q)/2} = (q − 1)^{−1}(τ̄µ̃_h/τ)^{(q−1)/2}(x^{(1−q)/2})^T s^{(1−q)/2}
= (q − 1)^{−1}(τ̄/τ)^{(q−1)/2}µ̃_h^{(q−1)/2}(x^{(1−q)/2})^T s^{(1−q)/2}
< (q − 1)^{−1}(n/((x^{(1−q)/2})^T s^{(1−q)/2}))(x^{(1−q)/2})^T s^{(1−q)/2} = n/(q − 1).  (17)

From (15), (16), and (17), we have

Φ_q(x, s, µ̃/τ) < √(nτ) − n < 3√(nτ) − qn/(q − 1).

The second inequality follows from the assumption τ > n and q ≥ 3.
Lemma 3.6. If Φ_q(x, s, µ̃/τ) < 3√(nτ) − qn/(q − 1), then Φ_q(x, s, µ̃) < 3√n τ^{q/2}.

Proof. By assumption, we have

Φ_q(x, s, µ̃/τ) = (µ̃/τ)^{−1/2}(x^{1/2})^T s^{1/2} − n + (1/(q − 1))(µ̃/τ)^{(q−1)/2}(x^{(1−q)/2})^T s^{(1−q)/2} − n/(q − 1) < 3√(nτ) − qn/(q − 1).

This implies that

(µ̃/τ)^{−1/2}(x^{1/2})^T s^{1/2} + (1/(q − 1))(µ̃/τ)^{(q−1)/2}(x^{(1−q)/2})^T s^{(1−q)/2} < 3√(nτ).

By the definition of µ̃, we have

µ̃^{−1/2}(x^{1/2})^T s^{1/2} = (√n/((x^{1/2})^T s^{1/2}))(x^{1/2})^T s^{1/2} = √n.  (18)

Since (µ̃/τ)^{−1/2}(x^{1/2})^T s^{1/2} > 0, we have

(1/(q − 1))(µ̃/τ)^{(q−1)/2}(x^{(1−q)/2})^T s^{(1−q)/2} < 3√(nτ).

Then we have

(µ̃^{(q−1)/2}/(q − 1))(x^{(1−q)/2})^T s^{(1−q)/2} < 3√(nτ) τ^{(q−1)/2} = 3√n τ^{q/2}.  (19)
By the definition of the proximity function Φ_q(x, s, µ̃), (18), and (19), we have

Φ_q(x, s, µ̃) = µ̃^{−1/2}(x^{1/2})^T s^{1/2} − n + (µ̃^{(q−1)/2}/(q − 1))(x^{(1−q)/2})^T s^{(1−q)/2} − n/(q − 1)
= √n − n + (µ̃^{(q−1)/2}/(q − 1))(x^{(1−q)/2})^T s^{(1−q)/2} − n/(q − 1)
< 3√n τ^{q/2}.

Now we define a new neighborhood of the central path as follows:

N(n, τ) := {(x, y, s) : (x, s) > 0, Ax = b, A^T y + s = c, Φ_q(x, s, µ̃) < 3√n τ^{q/2}},

where τ > n and q ≥ 3. Note that N(n, τ) is a wide neighborhood, since N_∞^−(ρ) ⊂ N(n, τ) for τ > (1 − ρ)^{−1}, ρ ∈ (0, 1). The wide neighborhood N_∞^− is defined as follows:

N_∞^−(ρ) := {(x, y, s) : (x, s) > 0, Ax = b, A^T y + s = c, ‖(v² − e)^−‖_∞ ≤ ρ, µ := µ̃},

where a^− = min{a, 0} and ρ ∈ (0, 1) is a constant independent of n. Indeed, if (x, s) ∈ N_∞^−(ρ), then ‖(v² − e)^−‖_∞ ≤ ρ, i.e., max_i |(v_i² − 1)^−| ≤ ρ. We denote J_+ := {i ∈ J | v_i ≥ 1} and J_− := J − J_+. Then for i ∈ J_−, 1 − v_i² < ρ, so we have v_i² > 1 − ρ and v_i^{1−q} < (1 − ρ)^{(1−q)/2}. For i ∈ J_+, v_i ≥ 1; since ρ ∈ (0, 1), v_i² > 1 − ρ, and hence v_i^{1−q} < (1 − ρ)^{(1−q)/2} for i ∈ J_+ as well. Thus we have

e^T v^{1−q} < n(1 − ρ)^{(1−q)/2}.  (20)

From (18), (20), and the fact that τ > n, we have for µ = µ̃

Φ_q(x, s, µ̃) = √n − n + (e^T v^{1−q} − n)/(q − 1) ≤ (n(1 − ρ)^{(1−q)/2} − n)/(q − 1) < n(τ^{(q−1)/2} − 1)/(q − 1) < 3√n τ^{q/2},

where τ > (1 − ρ)^{−1}. Therefore (x, s) ∈ N(n, τ).

Remark 3.7. In practice, IPMs with a large neighborhood are always more efficient than IPMs with a small neighborhood, but IPMs working with a small neighborhood have better worst-case complexity than IPMs with a large neighborhood [4].
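A membership test for N(n, τ) can be sketched as follows (an illustrative helper of ours, with µ := µ̃ computed from the iterate):

```python
import numpy as np

def Phi_q(x, s, mu, q=3):
    """Proximity Phi_q(x, s, mu) = Psi_q(sqrt(xs/mu))."""
    v = np.sqrt(x * s / mu)
    return np.sum(v - 1 + (v**(1 - q) - 1) / (q - 1))

def in_wide_neighborhood(A, b, c, x, y, s, tau, q=3):
    """Test (x, y, s) for membership in N(n, tau), with mu := mu~."""
    n = len(x)
    mu_tilde = np.sum(np.sqrt(x * s))**2 / n
    feasible = (np.allclose(A @ x, b) and np.allclose(A.T @ y + s, c)
                and np.all(x > 0) and np.all(s > 0))
    return feasible and Phi_q(x, s, mu_tilde, q) < 3 * np.sqrt(n) * tau**(q / 2)
```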
Since we operate in a large neighborhood, we take 3√(nτ) − qn/(q − 1) as the maximum allowed value of the proximity function Φ_q(x, s, µ) with respect to the target parameter µ, and we consider the equation Φ_q(x, s, µ) = 3√(nτ) − qn/(q − 1). Since Φ_q(x, s, µ) is strictly convex with respect to µ and 3√(nτ) − qn/(q − 1) > 0, this equation has two positive solutions. We use the smaller solution as the target parameter and denote it by µ_t; i.e., µ_t is the smaller root of

(x^{(1−q)/2})^T s^{(1−q)/2} µ_t^{q/2} − 3√(nτ)(q − 1)µ_t^{1/2} + (q − 1)(x^{1/2})^T s^{1/2} = 0.  (21)
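Numerically, the smaller root µ_t of (21) can be found by bracketing: with the notation of Lemma 3.8 below, h(µ) = aµ^{q/2} − bµ^{1/2} + c is strictly convex with h(0⁺) = c > 0, so when two positive roots exist the smaller one lies between 0 and the minimizer of h. A sketch using SciPy's brentq (the root finder is our choice):

```python
import numpy as np
from scipy.optimize import brentq

def target_mu(x, s, tau, q=3):
    """Smaller positive root mu_t of (21), with
    a = (x^((1-q)/2))^T s^((1-q)/2), b = 3*sqrt(n*tau)*(q-1),
    c = (q-1)*(x^(1/2))^T s^(1/2)."""
    n = len(x)
    a = np.sum((x * s)**((1 - q) / 2))
    b = 3 * np.sqrt(n * tau) * (q - 1)
    c = (q - 1) * np.sum(np.sqrt(x * s))
    h = lambda mu: a * mu**(q / 2) - b * np.sqrt(mu) + c
    mu_min = (b / (a * q))**(2 / (q - 1))  # minimizer of the strictly convex h
    # h(0+) = c > 0 and h(mu_min) < 0 whenever (21) has two positive roots,
    # so the smaller root is bracketed by (0, mu_min).
    return brentq(h, 1e-16 * mu_min, mu_min)
```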
Lemma 3.8. Let µ_t be the solution of (21) for any (x, s) > 0. If µ̃ < τµ̃_h, then we have

µ̃ < 9τµ_t.

Proof. Define h(µ) = (x^{(1−q)/2})^T s^{(1−q)/2} µ^{q/2} − 3√(nτ)(q − 1)µ^{1/2} + (q − 1)(x^{1/2})^T s^{1/2}. By the definition of h(µ) and Lemma 3.5, it is enough to show that h(µ̃/(9τ)) > 0. For notational convenience we denote a = (x^{(1−q)/2})^T s^{(1−q)/2} > 0, b = 3√(nτ)(q − 1) > 0, and c = (q − 1)(x^{1/2})^T s^{1/2} > 0. Then h(µ) = aµ^{q/2} − bµ^{1/2} + c, h′(µ) = (aq/2)µ^{(q−2)/2} − (b/2)µ^{−1/2}, and h″(µ) = (aq(q − 2)/4)µ^{(q−4)/2} + (b/4)µ^{−3/2} > 0. Thus h(µ) is strictly convex with respect to µ. By the definitions of µ̃ and µ̃_h, we have

h(µ̃/(9τ)) = nµ̃_h^{(1−q)/2}µ̃^{q/2}(9τ)^{−q/2} − 3√(nτ)(q − 1)(µ̃/(9τ))^{1/2} + (q − 1)(nµ̃)^{1/2}
= (9τ)^{−1/2}(nµ̃)^{1/2}(n^{1/2}µ̃_h^{(1−q)/2}µ̃^{(q−1)/2}(9τ)^{(1−q)/2} − 3√τ(q − 1) + (q − 1)(9τ)^{1/2})
= (9τ)^{−1/2}(nµ̃)^{1/2}µ̃_h^{(1−q)/2}n^{1/2}µ̃^{(q−1)/2}(9τ)^{(1−q)/2} > 0.
For x(α) := x + α∆x and s(α) := s + α∆s, we define µ̃(α) := ((x(α)^{1/2})^T s(α)^{1/2})²/n and µ̃_h(α) := (n/((x(α)^{(1−q)/2})^T s(α)^{(1−q)/2}))^{2/(q−1)}.
We define a new adaptive large-update algorithm as follows (a Python sketch follows the outline):

Algorithm

Input:
  a proximity parameter τ > n;
  an accuracy parameter ε > 0;
  a starting point (x, s) := (x⁰, s⁰) > 0 such that µ̃/µ̃_h < τ;
begin
  while ((x^{1/2})^T s^{1/2})² ≥ ε do
  begin
    µ := µ_t from (21);
    solve the system (9) for ∆x, ∆y, and ∆s;
    determine a feasible step size α such that
      Φ_q(x(α), s(α), µ) ≤ Φ_q(x, s, µ) − 1/(64k̄q) and µ̃(α)/µ̃_h(α) < τ;
    x := x(α); y := y(α); s := s(α);
  end
end
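A compact Python sketch of the whole loop under simplifying assumptions of ours: a plain backtracking search stands in for the default step size α̃ analyzed in Section 4, a simple decrease test replaces the guaranteed decrease of 1/(64k̄q), and the iteration guards are ours.

```python
import numpy as np
from scipy.optimize import brentq

def adaptive_ipm(A, b, c, x, y, s, tau, q=3, eps=1e-8, max_iter=500):
    """Sketch of the Algorithm: one Newton step per adaptively chosen target."""
    n = len(x)

    def phi(x, s, mu):                         # proximity Phi_q(x, s, mu)
        v = np.sqrt(x * s / mu)
        return np.sum(v - 1 + (v**(1 - q) - 1) / (q - 1))

    def mu_ratio(x, s):                        # mu~(x, s) / mu~_h(x, s)
        mt = np.sum(np.sqrt(x * s))**2 / n
        mh = (n / np.sum((x * s)**((1 - q) / 2)))**(2 / (q - 1))
        return mt / mh

    def target_mu(x, s):                       # smaller root of (21)
        a = np.sum((x * s)**((1 - q) / 2))
        bb = 3 * np.sqrt(n * tau) * (q - 1)
        cc = (q - 1) * np.sum(np.sqrt(x * s))
        h = lambda m: a * m**(q / 2) - bb * np.sqrt(m) + cc
        m_min = (bb / (a * q))**(2 / (q - 1))  # minimizer of the convex h
        return brentq(h, 1e-16 * m_min, m_min)

    for _ in range(max_iter):
        if np.sum(np.sqrt(x * s))**2 < eps:    # stopping rule of the Algorithm
            break
        mu = target_mu(x, s)
        # modified Newton system (9) via normal equations
        r = mu**((q + 1) / 2) * (x * s)**((1 - q) / 2) - np.sqrt(mu * x * s)
        dy = np.linalg.solve(A @ np.diag(x / s) @ A.T, -A @ (r / s))
        ds = -A.T @ dy
        dx = (r - x * ds) / s
        # backtrack until positivity, a proximity decrease, and the
        # neighborhood condition mu~(alpha)/mu~_h(alpha) < tau all hold
        alpha, phi0 = 1.0, phi(x, s, mu)
        for _ in range(60):
            xa, sa = x + alpha * dx, s + alpha * ds
            if (np.all(xa > 0) and np.all(sa > 0)
                    and phi(xa, sa, mu) < phi0 and mu_ratio(xa, sa) < tau):
                break
            alpha *= 0.5
        x, s, y = xa, sa, y + alpha * dy
    return x, y, s
```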
Remark 3.9. For the starting point, we can further assume that x⁰ = s⁰ = e by using the self-dual embedding model under the IPC. Then for (x, s) := (x⁰, s⁰) we have µ̃ = n and µ̃_h = 1. Since we assume that τ > n, we obtain µ̃/µ̃_h = n < τ.

4. Complexity analysis

In this section we compute the iteration bound for the algorithm. First, we compute a strictly feasible step size. At the start of the algorithm, we have (x, s) > 0 such that µ̃/µ̃_h < τ. For µ := µ_t, we solve the modified Newton system (9) and obtain the search direction (∆x, ∆s). After a damped step for fixed µ we have

x(α) = x + α∆x,  s(α) = s + α∆s.
Lemma 4.1. Let σ be the value in (8). Then we have

‖(x^{−1}∆x, s^{−1}∆s)‖ ≤ σ(σ + 1)^{1/q}.  (22)

Proof. Using (4), (8), and Lemma 3.3, we have

‖(x^{−1}∆x, s^{−1}∆s)‖ = ‖(v^{−1}d_x, v^{−1}d_s)‖ ≤ (1/v_min)√(‖d_x‖² + ‖d_s‖²) ≤ σ/v_min ≤ σ(σ + 1)^{1/q}.
Define α̌ := 1/(σ(σ + 1)^{1/q}). Then we have (x(α), s(α)) > 0 for any α ∈ (0, α̌]. Indeed, if ∆x ≥ 0, this is clear. Otherwise, for the index set J₀ := {i ∈ J : ∆x_i < 0}, by (22) we have

max_{i∈J₀}(−x_i^{−1}∆x_i) ≤ ‖−x^{−1}∆x‖ ≤ σ(σ + 1)^{1/q} = α̌^{−1}.

Thus we have min_{i∈J₀}(−x_i(∆x_i)^{−1}) ≥ α̌ ≥ α, and x_i + α∆x_i > 0 for any i ∈ J₀ and α ∈ (0, α̌]. Hence x + α∆x > 0 for any α ∈ (0, α̌]. In the same way, we can show that s + α∆s > 0 for any α ∈ (0, α̌]. From (4), we have x(α) = x(e + α∆x/x) = x(e + αd_x/v) = (x/v)(v + αd_x) and s(α) = s(e + α∆s/s) = s(e + αd_s/v) = (s/v)(v + αd_s). Thus we have

v(α)² = x(α)s(α)/µ = (v + αd_x)(v + αd_s).
By the exponential convexity of Ψ_q, we have

Ψ_q(v(α)) = Ψ_q(√((v + αd_x)(v + αd_s))) ≤ (1/2)(Ψ_q(v + αd_x) + Ψ_q(v + αd_s)).

Letting f(α) be the difference of the new and old proximity measures as a function of α, i.e., f(α) = Ψ_q(v(α)) − Ψ_q(v), we have f(α) ≤ f₁(α), where f₁(α) := (1/2)(Ψ_q(v + αd_x) + Ψ_q(v + αd_s)) − Ψ_q(v). Note that f(0) = f₁(0) = 0. Taking the derivative of f₁(α) with respect to α, we have

f₁′(α) = (1/2)Σ_{i=1}^{n}(ψ_q′(v_i + α[d_x]_i)[d_x]_i + ψ_q′(v_i + α[d_s]_i)[d_s]_i).

From (7) and (8), f₁′(0) = (1/2)∇Ψ_q(v)^T(d_x + d_s) = −(1/2)∇Ψ_q(v)^T∇Ψ_q(v) = −σ²/2. Differentiating f₁′(α) with respect to α, we obtain

f₁″(α) = (1/2)Σ_{i=1}^{n}(ψ_q″(v_i + α[d_x]_i)[d_x]_i² + ψ_q″(v_i + α[d_s]_i)[d_s]_i²).
Since the iterate (x, s) does not lie on the central path during an iteration, f₁″(α) > 0, and hence f₁(α) is strictly convex with respect to α. We cite some lemmas from [11] without proof.

Lemma 4.2 (Lemma 3.1 in [11]). We have f₁″(α) ≤ (σ²/2)ψ_q″(v_min − ασ).

Lemma 4.3 (Lemma 3.2 in [11]). We have f₁′(α) ≤ 0 if

−ψ_q′(v_min − ασ) + ψ_q′(v_min) ≤ σ.  (23)

Let ρ : [0, ∞) → (0, 1] be the inverse function of the restriction of −(1/2)ψ_q′(t) to the interval (0, 1]. Solving −(1/2)(1 − u^{−q}) = t for u ∈ (0, 1], i.e., u^{−q} = 2t + 1, gives for all t ≥ 0

ρ(t) = (1/(2t + 1))^{1/q}.  (24)
Using Lemma 3.1, we cite Lemma 3.3 in [11] without proof. In the following lemma we obtain a step size α such that the proximity measure decreases when we take a new iterate for fixed µ.

Lemma 4.4 (Lemma 3.3 in [11]). The largest step size α satisfying (23) is given by

α* := (1/σ)(ρ(σ/2) − ρ(σ)).  (25)

Lemma 4.5 ([11]). Let ρ and α* be the values defined in (24) and (25), respectively. Then we have

α* ≥ 1/(q(σ + 1)^{1/q}(2σ + 1)).

We denote

ᾱ = 1/(q(σ + 1)^{1/q}(2σ + 1)).  (26)
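The step-size quantities of this section can be computed directly from the scaled iterate v (a sketch of ours; it assumes v ≠ e, so that σ > 0):

```python
import numpy as np

def step_sizes(v, q=3):
    """For a scaled iterate v != e: sigma from (8), rho from (24),
    alpha* from (25), and the lower bound alpha-bar from (26)."""
    sg = np.linalg.norm(v**(-q) - 1)
    rho = lambda t: (1 / (2 * t + 1))**(1 / q)      # inverse of -(1/2)psi_q'
    alpha_star = (rho(sg / 2) - rho(sg)) / sg       # largest alpha with (23)
    alpha_bar = 1 / (q * (sg + 1)**(1 / q) * (2 * sg + 1))
    assert alpha_star >= alpha_bar                  # Lemma 4.5
    return sg, alpha_star, alpha_bar
```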
Lemma 4.6. If µ̃/µ̃_h < τ, then there exists some α̂ > 0 such that µ̃(α)/µ̃_h(α) < τ for α ≤ α̂.

Proof. Define k(α) = µ̃(α)/µ̃_h(α), with µ̃(0) = µ̃ and µ̃_h(0) = µ̃_h. Since (x(α), s(α)) is continuous with respect to α, k(α) is continuous in α. By the assumption k(0) = µ̃/µ̃_h < τ and the continuity of k(α), there exists some α̂ > 0 such that k(α) < τ for α ≤ α̂.
Since ᾱ < α̌, we define

α̃ = min{α̌, α̂, ᾱ}.  (27)

Then we have ᾱ ≥ α̃ and hence, for some k̄ ≥ 1,

α̃ = (1/k̄)ᾱ.  (28)
In the following theorem we obtain an upper bound for the difference f(α) between the new and old proximity measures.

Theorem 4.7. Let α̃ be the step size defined in (27). Then for σ in (8) and k̄ in (28), we have

f(α̃) ≤ −1/(64k̄q).  (29)

Proof. Using Lemma 3.6 in [11], (28), (26), and (10), we have

f(α̃) ≤ −(1/4)α̃σ² = −σ²/(4k̄q(σ + 1)^{1/q}(2σ + 1))
≤ −(1/(4k̄q))(σ²/((2σ + 1)^{(1+q)/q}))
≤ −(1/(4k̄q))(σ²/(16σ²)) = −1/(64k̄q).

Theorem 4.8. Let (∆x, ∆y, ∆s) be the solution of (9) with µ = µ_t. If µ̃ < τµ̃_h, then we have for any step size α ≤ α̃,
Φ_q(x(α), s(α), µ̃(α)) < 3√n τ^{q/2}.

Proof. By solving the system (9) with µ = µ_t, we obtain the unique search direction (∆x, ∆s). For a step size α with 0 < α ≤ α̃, we get the new iterates (x(α), s(α)). By the definition of µ_t in (21), we have for all α ≤ α̃ and for all iterations

Φ_q(x(α), s(α), µ_t(α)) = 3√(nτ) − qn/(q − 1).

By (27), Lemma 3.5, and Lemma 4.6, we have for all α ≤ α̃

Φ_q(x(α), s(α), µ̃(α)/τ) < 3√(nτ) − qn/(q − 1).

Then by Lemma 3.6, we obtain for all α ≤ α̃

Φ_q(x(α), s(α), µ̃(α)) < 3√n τ^{q/2}.
In the following lemma we estimate the change of the target parameter µ_t after an iteration, which is important for the complexity analysis.

Lemma 4.9. Let v₊ = v/√(1 − θ) for some θ ∈ (0, 1). Then we have

Ψ_q(v₊) ≤ (1 − θ)^{−1}Ψ_q(v) + θnq/((1 − θ)(q − 1)).

Proof. By the definition of the proximity function Ψ_q(v), we have

Ψ_q(v₊) = e^T v₊ − n + (e^T v₊^{1−q} − n)/(q − 1)
= ‖v₊^{1/2}‖² − n + (‖v₊^{(1−q)/2}‖² − n)/(q − 1)
= (1 − θ)^{−1/2}‖v^{1/2}‖² − n + ((1 − θ)^{(q−1)/2}‖v^{(1−q)/2}‖² − n)/(q − 1)
≤ (1 − θ)^{−1}‖v^{1/2}‖² − n + ((1 − θ)^{(q−1)/2}‖v^{(1−q)/2}‖² − n)/(q − 1)
= (1 − θ)^{−1}Ψ_q(v) + (‖v^{(1−q)/2}‖²/(q − 1))((1 − θ)^{(q−1)/2} − 1/(1 − θ)) + θnq/((1 − θ)(q − 1))
≤ (1 − θ)^{−1}Ψ_q(v) + θnq/((1 − θ)(q − 1)),

where the last inequality holds since (1 − θ)^{(q−1)/2} < 1/(1 − θ).

Theorem 4.10. Let (∆x, ∆y, ∆s) be the solution of (9) with µ := µ_t, and let α̃ be the default step size defined in (27). Then we have
Φ_q(x(α̃), s(α̃), (1 − η)µ_t) ≤ Φ_q(x, s, µ_t),

where η = 1/(192k̄q√(nτ)).
Proof. By Theorem 4.7, we have

Φ_q(x(α̃), s(α̃), µ_t) ≤ Φ_q(x, s, µ_t) − 1/(64k̄q).  (30)

Using Lemma 4.9, we have

Φ_q(x(α̃), s(α̃), (1 − η)µ_t) ≤ (1 − η)^{−1}Φ_q(x(α̃), s(α̃), µ_t) + ηnq/((1 − η)(q − 1)).  (31)

If

Φ_q(x, s, µ_t) + ηnq/(q − 1) − 1/(64k̄q) ≤ (1 − η)Φ_q(x, s, µ_t),  (32)

then by using (31), (30), and (32), we have

Φ_q(x(α̃), s(α̃), (1 − η)µ_t) ≤ (1 − η)^{−1}(Φ_q(x(α̃), s(α̃), µ_t) + ηnq/(q − 1))
≤ (1 − η)^{−1}(Φ_q(x, s, µ_t) − 1/(64k̄q) + ηnq/(q − 1))
≤ Φ_q(x, s, µ_t).

Now we compute η satisfying (32). From (32), we obtain

ηnq/(q − 1) + ηΦ_q(x, s, µ_t) ≤ 1/(64k̄q).  (33)

Since Φ_q(x, s, µ_t) = 3√(nτ) − qn/(q − 1), we can rewrite (33) as follows:

ηnq/(q − 1) + η(3√(nτ) − nq/(q − 1)) ≤ 1/(64k̄q),

i.e., 3η√(nτ) ≤ 1/(64k̄q). This implies that

η ≤ 1/(192k̄q√(nτ)).

If we take

η = 1/(192k̄q√(nτ)) < 1,

then we have Φ_q(x(α̃), s(α̃), (1 − η)µ_t) ≤ Φ_q(x, s, µ_t).
Let µ_t(α̃) be the target parameter value after one step, i.e., Φ_q(x(α̃), s(α̃), µ_t(α̃)) = 3√(nτ) − qn/(q − 1). Then by the definition of µ_t,

Φ_q(x, s, µ_t) = Φ_q(x(α̃), s(α̃), µ_t(α̃)).

By Theorem 4.10, we have Φ_q(x(α̃), s(α̃), (1 − η)µ_t) ≤ Φ_q(x, s, µ_t). Then we have

Φ_q(x(α̃), s(α̃), (1 − η)µ_t) ≤ Φ_q(x(α̃), s(α̃), µ_t(α̃)).

By Lemma 3.4, the proximity function Φ_q(x, s, µ) is strictly convex with respect to µ, and hence we have

µ_t(α̃) ≤ (1 − 1/(192k̄q√(nτ)))µ_t.  (34)

If we let µ_t(α̃) = (1 − θ)µ_t, then θ ≥ 1/(192k̄q√(nτ)). This implies that the algorithm is a large-update method.
In the following theorem we obtain the complexity bound of the algorithm.

Theorem 4.11. Let a linear optimization problem be given. Assume that τ > n and that a strictly feasible starting point with µ̃/µ̃_h < τ is available. Then the total number of iterations needed to obtain an approximate solution (x, s) with ((x^{1/2})^T s^{1/2})² ≤ ε is bounded by

192k̄q√(nτ) log(9nτµ_t⁰/ε),

where µ_t⁰ satisfies Φ_q(x⁰, s⁰, µ_t⁰) = 3√(nτ) − qn/(q − 1).

Proof. By Lemma 3.8 and the assumption, we have µ̃ < 9τµ_t. Hence, if 9τµ_t ≤ ε/n, then µ̃ = ((x^{1/2})^T s^{1/2})²/n ≤ 9τµ_t ≤ ε/n, and therefore ((x^{1/2})^T s^{1/2})² ≤ ε. So we need to compute the number of iterations needed to get 9τµ_t ≤ ε/n. By (34), µ_t(α̃) ≤ (1 − 1/(192k̄q√(nτ)))µ_t. We denote by µ_t^k the target parameter at the kth iteration. Then we have µ_t^k ≤ (1 − θ)^k µ_t⁰. By letting (1 − θ)^k µ_t⁰ ≤ ε/(9nτ), we have (1 − θ)^k ≤ ε/(9nτµ_t⁰), i.e., k log(1 − θ) ≤ log(ε/(9nτµ_t⁰)). Since log(1 − θ) < 0 and log(1 − θ) ≤ −θ, this certainly holds if k ≥ (1/θ) log(9nτµ_t⁰/ε). With θ ≥ 1/(192k̄q√(nτ)), the stated bound follows.

Remark 4.12. Since x^T s ≤ ((x^{1/2})^T s^{1/2})² ≤ ε, we get an ε-approximate solution.
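As a sanity check, the bound of Theorem 4.11 can be evaluated numerically; the inputs below, including k̄ = 1 and the value of µ_t⁰, are illustrative choices of ours, not data from the paper.

```python
import numpy as np

def iteration_bound(n, q, tau, mu_t0, eps, k_bar=1.0):
    """Worst-case bound of Theorem 4.11:
    192 * k_bar * q * sqrt(n*tau) * log(9*n*tau*mu_t0 / eps)."""
    return 192 * k_bar * q * np.sqrt(n * tau) * np.log(9 * n * tau * mu_t0 / eps)

# Illustrative numbers only (with the embedding start of Remark 3.9 in mind):
print(iteration_bound(n=100, q=3, tau=101, mu_t0=1.0, eps=1e-6))
```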
5. Concluding remarks

In this paper we proposed a new adaptive single-step large-update primal-dual interior point algorithm with wide neighborhood for LO problems. The search direction is adjusted adaptively according to the current iterate, and at each target parameter update only one Newton step is made. The algorithm has the worst-case iteration complexity O(q√(nτ) log(n/ε)).

Acknowledgment

This work was supported by the Korea Research Foundation Grant funded by the Korean Government (KRF-2007-531C00012).

References

[1] N.K. Karmarkar, A new polynomial-time algorithm for linear programming, Combinatorica 4 (1984) 373–395.
[2] C. Roos, T. Terlaky, J.Ph. Vial, Theory and Algorithms for Linear Optimization: An Interior Point Approach, John Wiley and Sons, Chichester, UK, 1997.
[3] S.J. Wright, Primal-Dual Interior-Point Methods, SIAM, Philadelphia, USA, 1997.
[4] Y. Ye, Interior Point Algorithms: Theory and Analysis, Wiley-Interscience, 1997.
[5] M. Kojima, S. Mizuno, A. Yoshise, A primal-dual interior point algorithm for linear programming, in: N. Megiddo (Ed.), Progress in Mathematical Programming: Interior Point and Related Methods, Springer Verlag, New York, 1989, pp. 29–47.
[6] N. Megiddo, Pathways to the optimal set in linear programming, in: N. Megiddo (Ed.), Progress in Mathematical Programming: Interior Point and Related Methods, Springer Verlag, New York, 1989, pp. 131–158.
[7] J. Peng, C. Roos, T. Terlaky, Self-Regularity: A New Paradigm for Primal-Dual Interior-Point Algorithms, Princeton University Press, 2002.
[8] J. Peng, T. Terlaky, A dynamic large-update primal-dual interior point method for linear optimization, Optimization Methods and Software 17 (6) (2002) 1077–1104.
[9] M. Salahi, T. Terlaky, An adaptive self-regular proximity based large-update IPM for linear optimization, Optimization Methods and Software 20 (1) (2005) 169–185.
[10] S. Pan, X. Li, A self-adjusting primal-dual interior point method for linear programs, Optimization Methods and Software 19 (2004) 389–397.
[11] Y.Q. Bai, C. Roos, A primal-dual interior-point method based on a new kernel function with linear growth rate, in: Proceedings of Industrial Optimization Symposium and Optimization Day, Nov. 2002.
[12] E.D. Andersen, J. Gondzio, Cs. Mészáros, X. Xu, Implementation of interior-point methods for large scale linear programming, in: T. Terlaky (Ed.), Interior Point Methods of Mathematical Programming, Kluwer Academic Publishers, Dordrecht, The Netherlands, 1996, pp. 189–252.
[13] Y.Q. Bai, M. El Ghami, C. Roos, A comparative study of kernel functions for primal-dual interior-point algorithms in linear optimization, SIAM Journal on Optimization 15 (2005) 101–128.
[14] S. Mehrotra, On the implementation of a (primal-dual) interior point method, SIAM Journal on Optimization 2 (1992) 575–601.