Optimality condition and algorithm with deviation integral for global optimization

J. Math. Anal. Appl. 357 (2009) 371–384

Yirong Yao (a), Liu Chen (a), Quan Zheng (a,b,*)

(a) Department of Mathematics, Shanghai University, Shanghai 200444, China
(b) Department of Mathematics, Columbus State University, Columbus, GA 31907, United States

Article history: Received 2 April 2009. Available online 18 April 2009. Submitted by Goong Chen.

Keywords: Global optimization; Integral global minimization; Global optimality condition; Robust analysis; Deviation integral; Stochastic implementation

Abstract. To study integral global minimization, a general form of the deviation integral is introduced and its properties are examined in this work. In terms of the deviation integral, an optimality condition and algorithms are given. The algorithms are implemented by a properly designed Monte Carlo simulation. Numerical tests are given to show the effectiveness of the method. © 2009 Elsevier Inc. All rights reserved.

✩ This research is supported by Key Discipline of Shanghai Municipality, J50101 and S30104.
* Corresponding author at: Department of Mathematics, Columbus State University, Columbus, GA 31907, United States. E-mail address: [email protected] (Q. Zheng).

1. Introduction

Let X be a topological space and f : X → R a real-valued function. Consider the following minimization problem:

    c∗ = inf_x f(x).    (1)

The problem of minimizing a function has been investigated since the seventeenth century, with the concepts of the derivative and the Lagrange multiplier. The gradient-based approach to optimization is the mainstream of that research. However, the requirement of differentiability restricts its application to many practical problems; moreover, it can only characterize and find local solutions of a general optimization problem. In this work we investigate a minimization problem with a discontinuous objective function using the integral approach.

Global optimization (1) has many real-world applications in natural science, social science and industry. It has attracted specialists in various fields, who have proposed a variety of methods to solve it. These methods can be classified into deterministic methods, stochastic methods and their combinations. A deterministic method exploits analytical properties of problem (1), typically differentiability, convexity and monotonicity, to generate a deterministic sequence of points converging to a global optimal solution. A stochastic method generates a sequence of points converging to a global optimal solution probabilistically; genetic algorithms, tabu search and simulated annealing are examples. Combining deterministic and stochastic methods yields the third class, of which the integral level-set method [2] is a representative. The integral level-set method forms two sequences:








    c_{k+1} = (1/μ(H_{c_k})) ∫_{H_{c_k}} f(x) dμ,    (2)

    H_{c_k} = {x: f(x) ≤ c_k},  k = 1, 2, . . . ,    (3)

where μ is the Lebesgue measure, and the stopping condition is

    v(c_k) = (1/μ(H_{c_k})) ∫_{H_{c_k}} (c_k − f(x))² dμ < ε.    (4)
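For intuition, here is a minimal one-dimensional sketch of iteration (2)–(4), with the level set H_c approximated by rejection sampling on an interval. This is our own illustration, not code from the paper; the test function, sample size and tolerance are arbitrary choices.

```python
import numpy as np

def mean_value_level_set(f, lo, hi, c0, eps=1e-8, n_samples=20_000, seed=0):
    """Mean value-level set iteration (2)-(4), estimated by Monte Carlo.

    The level set H_c = {x: f(x) <= c} from (3) is approximated by keeping
    the uniform samples whose values fall below c; integrals over H_c then
    become sample means.
    """
    rng = np.random.default_rng(seed)
    c = c0
    for _ in range(100):                  # iteration guard for this sketch
        fx = f(rng.uniform(lo, hi, n_samples))
        level = fx[fx <= c]               # Monte Carlo stand-in for H_c
        if level.size == 0:               # c fell below all sampled values
            break
        v = np.mean((c - level) ** 2)     # stopping quantity v(c_k) from (4)
        c = level.mean()                  # mean-value update (2)
        if v < eps:
            break
    return c

# Example: a multimodal function on [-2, 2]; c descends to an estimate
# of the global minimum value.
f = lambda x: x**2 + 0.1 * np.sin(20 * x)
print(mean_value_level_set(f, -2.0, 2.0, c0=4.0))
```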

Under certain conditions, the sequences {c_k} and {H_{c_k}} converge to the global minimum value and to the set of global minimizers, respectively. The level set H_{c_k} is difficult to determine exactly, so in implementations of the algorithm it is approximated by a properly designed Monte Carlo technique.

The integral approach to global minimization can handle discontinuous objective functions and disconnected constraint sets; a theory of robust analysis has been developed for it [8,9,6]. Without using robust analysis, Phu and Hoffmann [5] introduced the mean deviation function m(c) (see (10)) to study the essential infimum of a summable function. Wu et al. [7] studied a general form of m(c) by introducing higher-moment deviation functions. Inspired by these ideas, we define a general form of deviation integral V_φ(c) (see (8)). This function possesses properties useful for global optimization, such as convexity, monotonicity and differentiability. We then prove an optimality condition for global minimization in terms of V_φ(c).

In this work, we first recall basic concepts of robust sets, robust functions and the integral approach to global minimization. In Section 3 we introduce the deviation integral V_φ(c) and examine its properties. With these properties, an integral optimality condition is derived in Section 4. With the help of this optimality condition, integral algorithms are proposed and implemented by Monte Carlo simulation in Section 5. In Section 6, numerical tests (up to dimension 100) illustrate the effectiveness of the algorithm. We conclude our study in Section 7.

2. Robust sets, functions and Q-measure space

In this section we summarize several concepts and properties of the integral global minimization of robust discontinuous functions, which will be used in the following sections. For more details, see [8,9,6].

2.1. Robust sets and functions

Let X be a topological space. A subset D of X is said to be robust if

    cl D = cl int D,    (5)

where cl D denotes the closure of the set D and int D denotes the interior of D. A robust set consists of robust points of the set. A point x ∈ D is said to be a robust point of D if, for each neighborhood N(x) of x, N(x) ∩ int D ≠ ∅. A set D is robust if and only if each point of D is a robust one. A point x ∈ D is a robust point of D if and only if there exists a net {x_λ} ⊂ int D such that x_λ → x. The interior of a non-empty robust set is non-empty. A union of robust sets is robust. An intersection of two robust sets may be non-robust, but the intersection of an open set and a robust set is robust. A set D is robust if and only if ∂D = ∂ int D, where ∂D = cl D \ int D denotes the boundary of the set D. A function f : X → R is said to be upper robust if the set

    F_c = {x: f(x) < c}    (6)

is robust for each real number c. A sum or a product of two upper robust functions may fail to be upper robust; but the sum of an upper robust function and an upper semi-continuous (u.s.c.) function is upper robust (for the product case, non-negativity is also required). A function f is upper robust if and only if it is upper robust at each point; f is upper robust at a point x if x ∈ F_c implies that x is a robust point of F_c. An example of a function on R¹ that is not upper robust is

    f(x) = x²  for x ≠ 0,   f(x) = −1  for x = 0;

f is not upper robust at x = 0. A function f is upper robust if and only if it is upper approximatable: the set C of points of continuity of f is dense and, for each point x ∈ X, there is a net {x_λ} in C such that f(x) = lim sup_λ f(x_λ).


2.2. Q-measure spaces and integration

In order to investigate a minimization problem with an integral approach, a special class of measure spaces, called Q-measure spaces, is needed. Let X be a topological space, Ω a σ-field of subsets of X and μ a measure on Ω. A triple (X, Ω, μ) is called a Q-measure space iff:

1. each open set in X is measurable;
2. the measure μ(G) of a non-empty open set G in X is positive: μ(G) > 0;
3. the measure μ(K) of a compact set K in X is finite.

The n-dimensional Lebesgue measure space (Rⁿ, Ω, μ) is a Q-measure space; a non-degenerate Gaussian measure μ on a separable Hilbert space H, with the Borel sets as measurable sets, constitutes an infinite-dimensional Q-measure space. A specific optimization problem is related to a specific Q-measure space suitable for this approach. Once a measure space is given, integration is defined in the conventional way.

Since the interior of a non-empty robust set is non-empty, the Q-measure of a measurable set containing a non-empty robust set is always positive. This is an essential property for the integral approach to minimization. Hence, the following assumptions are usually required:

(A): f is lower semi-continuous (l.s.c.) and X is inf-compact.
(M): (X, Ω, μ) is a Q-measure space.
(R): f is a measurable upper robust function.

3. Deviation integral

Let f be a measurable upper robust function on a compact robust set D ⊆ X and min_{x∈D} f(x) = c∗. We have [8]

Lemma 3.1. For all c > c∗, we have

    μ(H_c ∩ D) > 0.    (7)

Definition 3.1. Let φ : R¹ → R¹ be a strictly increasing continuous function with φ(0) = 0. We define the deviation integral of f as follows:

    V_φ(c) = ∫_{H_c ∩ D} φ(c − f(x)) dμ,    (8)

where the integration is with respect to x over H_c ∩ D. When we wish to emphasize the constraint set D, we use the notation V_φ(c; D) = V_φ(c).

We now examine properties of the integral V_φ(c).

Proposition 3.1. The integral V_φ(c) has the following properties:

1. V_φ(c) > 0 for all c > c∗;
2. V_φ(c) = 0 for all c < c∗;
3. V_φ(c) is non-decreasing on (−∞, +∞) and strictly increasing on (c∗, +∞);
4. V_φ(c) is continuous.

Proof. 1. Let η = (c − c∗)/2; then μ(H_{c−η} ∩ D) > 0. Thus

    V_φ(c) ≥ ∫_{H_{c−η} ∩ D} φ(c − f(x)) dμ ≥ ∫_{H_{c−η} ∩ D} φ(η) dμ = φ(η)·μ(H_{c−η} ∩ D) > 0.

2. Since H_c ∩ D = ∅ when c < c∗, property 2 follows.


3. For c₂ > c₁ > c∗, we have

    V_φ(c₂) − V_φ(c₁) = ∫_{H_{c₂} ∩ D} φ(c₂ − f(x)) dμ − ∫_{H_{c₁} ∩ D} φ(c₁ − f(x)) dμ
        = ∫_{(H_{c₂} \ H_{c₁}) ∩ D} φ(c₂ − f(x)) dμ + ∫_{H_{c₁} ∩ D} [φ(c₂ − f(x)) − φ(c₁ − f(x))] dμ
        ≥ ∫_{H_{c₁} ∩ D} [φ(c₂ − f(x)) − φ(c₁ − f(x))] dμ > 0,

since φ(c₂ − f(x)) > φ(c₁ − f(x)) on H_{c₁} ∩ D and μ(H_{c₁} ∩ D) > 0.

4. For c > c∗ we have (assume Δc > 0; the proof of the Δc < 0 case is similar)

    V_φ(c + Δc) = ∫_{H_{c+Δc} ∩ D} φ(c + Δc − f(x)) dμ
        = ∫_{(H_{c+Δc} \ H_c) ∩ D} φ(c + Δc − f(x)) dμ + ∫_{H_c ∩ D} φ(c − f(x)) dμ + I
        ≤ V_φ(c) + A·[μ(H_{c+Δc} ∩ D) − μ(H_c ∩ D)] + I → V_φ(c),

where

    I = ∫_{H_c ∩ D} [φ(c + Δc − f(x)) − φ(c − f(x))] dμ → 0 as Δc → 0.

Here 0 ≤ c + Δc − f(x) ≤ Δc on (H_{c+Δc} \ H_c) ∩ D, so the integrand there is bounded by some constant A, and μ(H_{c+Δc} ∩ D) − μ(H_c ∩ D) → 0 by continuity of the measure μ.

V_φ(c) is also continuous at c∗. Let Δc > 0. We have 0 ≤ φ(c∗ + Δc − f(x)) ≤ φ(Δc) on H_{c∗+Δc} ∩ D; thus

    0 ≤ V_φ(c∗ + Δc) = ∫_{H_{c∗+Δc} ∩ D} φ(c∗ + Δc − f(x)) dμ ≤ φ(Δc)·μ(H_{c∗+Δc} ∩ D) → 0.

If Δc < 0, then V_φ(c∗ + Δc) = 0, so that

    lim_{Δc→0⁻} V_φ(c∗ + Δc) = 0.  □

Theorem 3.1. Suppose, in addition, that φ is differentiable on (−∞, +∞) and φ′(0) = 0. Then the integral V_φ(c) is differentiable on (−∞, +∞), and

    V_φ′(c) = V_{φ′}(c).    (9)

Moreover, V_φ(c) is convex.

Proof. Consider the case c > c∗. For Δc > 0, we have

    [V_φ(c + Δc) − V_φ(c)]/Δc = (1/Δc)[ ∫_{H_{c+Δc} ∩ D} φ(c + Δc − f(x)) dμ − ∫_{H_c ∩ D} φ(c − f(x)) dμ ] = I₁ + I₂,

where

    I₁ = (1/Δc) ∫_{(H_{c+Δc} \ H_c) ∩ D} φ(c + Δc − f(x)) dμ

and

    I₂ = (1/Δc) ∫_{H_c ∩ D} [φ(c + Δc − f(x)) − φ(c − f(x))] dμ.

We have c ≤ f(x) ≤ c + Δc on (H_{c+Δc} \ H_c) ∩ D, which implies 0 ≤ φ(c + Δc − f(x)) ≤ φ(Δc). Thus

    0 ≤ I₁ ≤ (1/Δc) ∫_{(H_{c+Δc} \ H_c) ∩ D} [φ(Δc) − φ(0)] dμ ≤ (φ′(0) + 1)·μ((H_{c+Δc} \ H_c) ∩ D) → 0 as Δc → 0.


Therefore, lim_{Δc→0} I₁ = 0. On the other hand,

    lim_{Δc→0} I₂ = ∫_{H_c ∩ D} φ′(c − f(x)) dμ = V_{φ′}(c),

where the equality can be obtained by L'Hospital's rule. Hence V_φ′(c) = V_{φ′}(c). The proof of the Δc < 0 case is similar. For c < c∗ we always have V_φ(c) = 0.

V_φ(c) is also differentiable at c∗. For Δc > 0 we have 0 ≤ c∗ + Δc − f(x) ≤ Δc on H_{c∗+Δc} ∩ D; thus

    0 ≤ [V_φ(c∗ + Δc) − V_φ(c∗)]/Δc ≤ (1/Δc) ∫_{H_{c∗+Δc} ∩ D} φ(Δc) dμ → φ′(0)·μ(H_{c∗+Δc} ∩ D) = 0.

If Δc < 0, we have V_φ(c∗ + Δc) = 0, so that

    lim_{Δc→0⁻} [V_φ(c∗ + Δc) − V_φ(c∗)]/Δc = 0.

Hence V_φ′(c∗) exists and V_φ′(c∗) = 0. Furthermore, V_φ′(c) = V_{φ′}(c) is non-negative and non-decreasing, which implies that V_φ(c) is convex [1].  □

Example 3.1. Let φ(x) = x^m, m = 1, 2, . . . ; then we have the higher-moment deviation integrals

    V_m(c) = ∫_{H_c ∩ D} (c − f(x))^m dμ,  m = 1, 2, . . . .

In particular, for m = 1, 2 we have the mean and variance deviation integrals

    m(c) = ∫_{H_c ∩ D} (c − f(x)) dμ  and  v(c) = ∫_{H_c ∩ D} (c − f(x))² dμ.

Example 3.2. Let φ(x) = e^x − 1 − x; then φ′(x) = e^x − 1 with φ(0) = φ′(0) = 0, and we have the exponential deviation integral

    V_e(c) = ∫_{H_c ∩ D} [e^{c − f(x)} − 1 − (c − f(x))] dμ

and

    V_e′(c) = ∫_{H_c ∩ D} [e^{c − f(x)} − 1] dμ.

Since φ′(x) = e^x − 1 > 0 and φ″(x) = e^x > 0 for all x > 0, the exponential deviation integral satisfies all of the conditions we need.

Example 3.3. Let φ(x) = ln(1 + x) − x + ½x²; then φ′(x) = 1/(1 + x) − 1 + x with φ(0) = φ′(0) = 0, and we have the logarithmic deviation integral

    V_ln(c) = ∫_{H_c ∩ D} [ln(1 + c − f(x)) − (c − f(x)) + ½(c − f(x))²] dμ

and

    V_ln′(c) = ∫_{H_c ∩ D} [1/(1 + c − f(x)) − 1 + (c − f(x))] dμ.

Since φ′(x) = 1/(1 + x) − 1 + x > 0 and φ″(x) = 1 − 1/(1 + x)² > 0 for all x > 0, the logarithmic deviation integral satisfies all of the conditions we need.
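To make the definitions concrete, the following minimal sketch (our own illustration, not from the paper) estimates V_φ(c) by uniform Monte Carlo sampling on a box D ⊂ Rⁿ with Lebesgue measure μ, for the example kernels above; the objective and box are arbitrary assumptions. The same estimator with φ′ in place of φ approximates V_{φ′}(c), so (9) can be checked numerically against a finite difference of V_φ.

```python
import numpy as np

def deviation_integral(f, phi, c, lo, hi, n_samples=100_000, seed=0):
    """Monte Carlo estimate of V_phi(c) = ∫_{H_c ∩ D} phi(c - f(x)) dμ (eq. (8)),
    where D = [lo, hi] is a box and μ is Lebesgue measure."""
    lo, hi = np.asarray(lo, float), np.asarray(hi, float)
    rng = np.random.default_rng(seed)
    x = rng.uniform(lo, hi, size=(n_samples, lo.size))
    fx = np.apply_along_axis(f, 1, x)
    t = c - fx[fx <= c]            # deviations c - f(x) on the level set H_c ∩ D
    vol = float(np.prod(hi - lo))  # μ(D); the level-set indicator handles the rest
    return vol * phi(t).sum() / n_samples if t.size else 0.0

# The example kernels of Examples 3.1-3.3:
phi_m  = lambda m: (lambda t: t**m)                 # higher moments, Example 3.1
phi_e  = lambda t: np.exp(t) - 1.0 - t              # exponential, Example 3.2
phi_ln = lambda t: np.log1p(t) - t + 0.5 * t**2     # logarithmic, Example 3.3

f = lambda x: np.sum(x**2)                          # toy objective on D = [-1, 1]^2
lo, hi = [-1.0, -1.0], [1.0, 1.0]
for name, phi in [("m=1", phi_m(1)), ("m=2", phi_m(2)), ("exp", phi_e), ("log", phi_ln)]:
    print(name, deviation_integral(f, phi, c=0.5, lo=lo, hi=hi))
```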

When φ(x) = x we have φ′(x) = 1, so φ′(0) = 1 ≠ 0 and Theorem 3.1 does not apply directly; however, we have the following proposition:

Proposition 3.2. For the mean deviation integral

    m(c) = ∫_{H_c ∩ D} (c − f(x)) dμ    (10)

we have m′(c) = μ(H_c ∩ D).

Proof. We have

    m(c + Δc) − m(c) = ∫_{H_{c+Δc} ∩ D} (c + Δc − f(x)) dμ − ∫_{H_c ∩ D} (c − f(x)) dμ
        = ∫_{(H_{c+Δc} \ H_c) ∩ D} (c + Δc − f(x)) dμ + ∫_{H_c ∩ D} (c + Δc − f(x)) dμ − ∫_{H_c ∩ D} (c − f(x)) dμ
        = ∫_{(H_{c+Δc} \ H_c) ∩ D} (c + Δc − f(x)) dμ + Δc·μ(H_c ∩ D)
        = I + Δc·μ(H_c ∩ D).

On the set H_{c+Δc} \ H_c we have c < f(x) ≤ c + Δc, so

    0 ≤ I ≤ ∫_{(H_{c+Δc} \ H_c) ∩ D} (c + Δc − c) dμ = Δc·μ((H_{c+Δc} \ H_c) ∩ D).

Thus

    μ(H_c ∩ D) ≤ [m(c + Δc) − m(c)]/Δc ≤ μ(H_c ∩ D) + μ((H_{c+Δc} \ H_c) ∩ D).

As Δc → 0, μ((H_{c+Δc} \ H_c) ∩ D) → 0; therefore the limit exists and equals μ(H_c ∩ D):

    m′(c) = lim_{Δc→0} [m(c + Δc) − m(c)]/Δc = μ(H_c ∩ D).  □

Then we have

Proposition 3.3. Under the assumptions of Theorem 3.1, we have, for k ≥ 1,

    V_k′(c) = k·V_{k−1}(c),  k = 1, 2, . . . .    (11)

If k ≥ 1 is an integer, this implies

    V_k^{(k)}(c) = k!·μ(H_c ∩ D).    (12)

In particular, v′(c) = 2m(c).


Since m(c) is continuous on (−∞, +∞), the derivative v′(c) = 2m(c) is continuous on (−∞, +∞). We then have

Corollary 3.1. The derivative v′(c) is continuous on (−∞, +∞).

We also have

Proposition 3.4. The variance integral v(c) is convex on (−∞, +∞) and strictly convex on (c∗, +∞).

4. Optimality condition and variable measures

We now examine optimality conditions of global minimization. Let {c_n}, with

    c_n > c∗ = min_{x∈D} f(x)  and  V_φ(c_n) > 0,

be a sequence of real numbers constructed by an algorithm, with lim_{n→∞} c_n = c∗.

Theorem 4.1. Under the assumptions (A), (R) and (M), c∗ is the global minimum value if and only if, for c_n ↓ c∗,

    lim_{n→∞} V_φ(c_n) = 0.    (13)

Proof. Necessity follows from the continuity of V_φ(c) and the fact that V_φ(c) = 0 when c < c∗.

Sufficiency of (13): Suppose c∗ is not the global minimum value of f but ĉ is. Then c∗ − ĉ = 2η > 0. We have

    V_φ(c∗) = ∫_{H_{c∗} ∩ D} φ(c∗ − f(x)) dμ = ∫_{(H_{c∗} \ H_{ĉ+η}) ∩ D} φ(c∗ − f(x)) dμ + ∫_{H_{ĉ+η} ∩ D} φ(c∗ − f(x)) dμ
        ≥ ∫_{H_{ĉ+η} ∩ D} φ(c∗ − ĉ − η) dμ = φ(η)·μ(H_{ĉ+η} ∩ D) > 0,

so, by continuity, lim_{n→∞} V_φ(c_n) = V_φ(c∗) > 0, which is a contradiction.  □

5. Algorithm and implementation

We know that c∗ = min_{x∈X} f(x) is the largest root of the equation V_φ(c) = 0, and we can solve this equation, say, by Newton's method, thanks to the properties of V_φ(c) and V_φ′(c). Based on this, we propose a conceptual algorithm for the global optimization problem (1).

Step 1. Let ε > 0 be a given small value. Take a point x₀ ∈ X, calculate c₀ = f(x₀) and set k := 0.

Step 2. Compute V_φ(c_k) and V_{φ′}(c_k) as follows:

    V_φ(c_k) = ∫_{H_{c_k}} φ(c_k − f(x)) dμ    (14)

and

    V_{φ′}(c_k) = ∫_{H_{c_k}} φ′(c_k − f(x)) dμ.    (15)

Let

    λ_k = V_φ(c_k) / V_{φ′}(c_k).    (16)

Step 3. If λ_k < ε, go to Step 4; else let

    c_{k+1} = c_k − λ_k,    (17)

set k := k + 1, and return to Step 2.


Step 4. Set c∗ := c_k and H∗ := H_{c_k}, where c∗ is the approximation of the global minimum value, and H∗ is the approximation of the set of global minimizers.

To justify the algorithm, we need the following lemma:

Lemma 5.1. If V_{φ′}(c₁) > 0, then V_{φ′}(c₂) > 0.

Proof. Since V_{φ′}(c₁) > 0, the Newton step

    c₂ = c₁ − V_φ(c₁)/V_{φ′}(c₁)

is well defined, i.e., V_φ(c₁) + V_{φ′}(c₁)(c₂ − c₁) = 0. On the other hand, because V_φ(c) is strictly convex on (c∗, c₀), we have V_φ(c₂) > V_φ(c₁) + V_{φ′}(c₁)(c₂ − c₁) = 0, which implies c₂ > c∗. Hence, by convexity again,

    V_{φ′}(c₂)(c₂ − c∗) ≥ V_φ(c₂) − V_φ(c∗) > 0,

which implies that V_{φ′}(c₂) > 0.  □
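As an illustration, here is a small numerical sketch (ours, not the paper's code) of the Newton-type iteration (14)–(17): both deviation integrals are estimated from one uniform sample on a box, so the common volume factor cancels in the ratio (16). Re-sampling and domain shrinking, as in the implementation described later in this section, are omitted for brevity; the kernel and test function are illustrative choices.

```python
import numpy as np

def newton_deviation(f, phi, dphi, lo, hi, eps=1e-9, n_samples=200_000, seed=1):
    """Conceptual algorithm of Section 5: c_{k+1} = c_k - V_phi(c_k)/V_{phi'}(c_k),
    with both integrals estimated by uniform Monte Carlo on the box [lo, hi]."""
    rng = np.random.default_rng(seed)
    lo, hi = np.asarray(lo, float), np.asarray(hi, float)
    x = rng.uniform(lo, hi, size=(n_samples, lo.size))
    fx = np.apply_along_axis(f, 1, x)
    c = fx[0]                       # Step 1: c_0 = f(x_0)
    while True:
        t = c - fx[fx <= c]         # deviations on the level set H_c
        V, dV = phi(t).sum(), dphi(t).sum()  # factor μ(D)/n cancels in (16)
        if dV == 0.0:               # level set exhausted by the fixed sample
            return c
        lam = V / dV                # Newton step λ_k, eq. (16)
        if lam < eps:               # Step 3 -> Step 4: stop, report c* ≈ c
            return c - lam
        c -= lam                    # eq. (17)

# Example with the exponential kernel of Example 3.2:
phi  = lambda t: np.exp(t) - 1.0 - t
dphi = lambda t: np.exp(t) - 1.0
f = lambda x: np.sum(x**2) + 0.1 * np.cos(5.0 * np.sum(x))
print(newton_deviation(f, phi, dphi, lo=[-2, -2], hi=[2, 2]))
```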

Since V_φ(c₂) > 0 implies c₂ > c∗, we then have

Corollary 5.1. If c₁ > c∗, then c₁ ≥ c₂ > c∗.

Letting ε = 0 in the algorithm, we obtain a decreasing sequence of level constants

    c₁ ≥ c₂ ≥ · · · ≥ c_k ≥ c_{k+1} ≥ · · · > c∗.    (18)

The limit lim_{k→∞} c_k = ĉ

exists.

Theorem 5.1 (Convergence theorem). The limit lim_{k→∞} c_k is the global minimum value c∗, and lim_{k→∞} H_{c_k} = H∗ is the set of global minimum points.

Proof. The existence of the limit of the sequence {c_k} implies

    lim_{k→∞} (c_k − c_{k+1}) = lim_{k→∞} λ_k = ĉ − ĉ = 0.

Furthermore, we have

    λ_k = V_φ(c_k)/V_{φ′}(c_k) ≥ V_φ(c_k)/V_{φ′}(c₁) ≥ 0,

which implies that the limit exists:

    lim_{k→∞} V_φ(c_k)/V_{φ′}(c₁) = 0,    (19)

or

    lim_{k→∞} V_φ(c_k) = V_φ(ĉ) = 0.    (20)

Therefore ĉ = c∗, the global minimum value, and H∗ = H_{c∗} is the set of global minimizers.  □

As a special case, taking φ(x) = x, we have the following algorithm:


Algorithm 1.

Step 1. Let ε > 0 be a given small value. Take a point x₀ ∈ X, calculate c₀ = f(x₀) and set k := 0.

Step 2. Compute m(c_k) and m′(c_k) as follows:

    m(c_k) = ∫_{H_{c_k}} (c_k − f(x)) dμ    (21)

and

    m′(c_k) = μ(H_{c_k}).    (22)

Let

    λ_k = m(c_k)/m′(c_k).    (23)

Step 3. If λ_k < ε, go to Step 4; else let

    c_{k+1} = c_k − λ_k,    (24)

set k := k + 1, and return to Step 2.

Step 4. Set c∗ := c_k and H∗ := H_{c_k}, where c∗ is the approximation of the global minimum value, and H∗ is the approximation of the set of global minimizers.

This algorithm is equivalent to the mean value-level set algorithm in [2,8].

Proposition 5.1.

    c_{k+1} = (1/μ(H_{c_k})) ∫_{H_{c_k}} f(x) dμ

is the mean value of f over its level set.

Proof. We have

    c_{k+1} = c_k − m(c_k)/μ(H_{c_k})
        = c_k − (1/μ(H_{c_k})) ∫_{H_{c_k}} (c_k − f(x)) dμ
        = c_k − (1/μ(H_{c_k})) ∫_{H_{c_k}} c_k dμ + (1/μ(H_{c_k})) ∫_{H_{c_k}} f(x) dμ
        = (1/μ(H_{c_k})) ∫_{H_{c_k}} f(x) dμ.  □

To justify the algorithm, we need the following lemma, which has been proved in [8,9]:

Lemma 5.2. If m′(c₁) = μ(H_{c₁}) > 0, then m′(c₂) = μ(H_{c₂}) > 0.

Corollary 5.2. If c₁ > c∗, then c₁ ≥ c₂ > c∗.

Letting ε = 0 in the algorithm, we obtain a decreasing sequence of level constants

    c₁ ≥ c₂ ≥ · · · ≥ c_k ≥ c_{k+1} ≥ · · · > c∗.    (25)

The limit lim_{k→∞} c_k = ĉ exists.


Theorem 5.2 (Convergence theorem). The limit lim_{k→∞} c_k is the global minimum value c∗, and lim_{k→∞} H_{c_k} = H∗ is the set of global minimum points.

Proof. The existence of the limit of the sequence {c_k} implies

    lim_{k→∞} (c_k − c_{k+1}) = lim_{k→∞} λ_k = ĉ − ĉ = 0.

Furthermore, we have

    λ_k = m(c_k)/m′(c_k) ≥ m(c_k)/m′(c₁) ≥ 0,

which implies that the limit exists:

    lim_{k→∞} m(c_k)/m′(c₁) = 0,    (26)

or

    lim_{k→∞} m(c_k) = m(ĉ) = 0.    (27)

Therefore ĉ = c∗, the global minimum value, and H∗ = H_{c∗} is the set of global minimizers.  □

In [4], φ(x) = x² is used to design a variance deviation based algorithm.

Algorithm 2.

Step 1. Let ε > 0 be a given small value. Take a point x₀ ∈ X, calculate c₀ = f(x₀) and set k := 0.

Step 2. Compute m(c_k) and v(c_k) as follows:

    m(c_k) = ∫_{H_{c_k}} (c_k − f(x)) dμ    (28)

and

    v(c_k) = ∫_{H_{c_k}} (c_k − f(x))² dμ.    (29)

Let

    λ_k = v(c_k)/(2m(c_k)).    (30)

Step 3. If λ_k < ε, go to Step 4; else let

    c_{k+1} = c_k − λ_k,    (31)

set k := k + 1, and return to Step 2.

Step 4. Set c∗ := c_k and H∗ := H_{c_k}, where c∗ is the approximation of the global minimum value, and H∗ is the approximation of the set of global minimizers.

Implementation of the algorithm. Let us consider a simple model of a global minimization problem. Suppose the constraint set D is a cuboid in Rⁿ,

    D = {x: a_i ≤ x_i ≤ b_i, i = 1, . . . , n},    (32)

and the objective function f is a lower semi-continuous and upper robust function with a unique global minimizer x∗ ∈ D. In other words, for a decreasing sequence {c_k} which converges to the global minimum value c∗, the size of the level sets satisfies

    ρ_k = ρ(H_{c_k}) = max_{x,y ∈ H_{c_k}} ‖x − y‖ → 0 as k → ∞.    (33)


We have

    c∗ = min_{x∈D} f(x) = min_{x∈H_{c_k} ∩ D} f(x) = min_{x∈D_k} f(x),    (34)

where D_k is the smallest cuboid containing the level set H_{c_k} ∩ D. Instead of computing m(c_k; D) and v(c_k; D), we compute m(c_k; D_k) and v(c_k; D_k). At each iteration, we try to find D_k instead of the level set H_{c_k}, where

    D_k = {x: a_i^k ≤ x_i ≤ b_i^k, i = 1, . . . , n},    (35)

    a_i^k = min{x_i: (x₁, . . . , x_i, . . . , x_n) ∈ H_{c_k}},    (36)

    b_i^k = max{x_i: (x₁, . . . , x_i, . . . , x_n) ∈ H_{c_k}}.    (37)

To implement the algorithm, we need to calculate the integrals in v(c_k) and m(c_k). We do this by Monte Carlo simulation. The implementation of Algorithm 2 is described as follows.

1. Approximation of c₀ and H_{c₀}. Let ξ = (ξ¹, . . . , ξⁿ) be an independent n-dimensional random vector uniformly distributed on [0, 1]ⁿ. Let

    x_i = a_i + (b_i − a_i)·ξ^i,  i = 1, . . . , n.    (38)

Then x = (x₁, . . . , x_n) is uniformly distributed on D. Take km samples and evaluate the function values f(x_j), j = 1, 2, . . . , km, at these sample points. Comparing the values of the function at these points, we obtain a set W of sample points corresponding to the t smallest function values FV[j], j = 1, 2, . . . , t, ordered by their function values, i.e.,

    FV[1] ≥ FV[2] ≥ · · · ≥ FV[t].    (39)

The set W is called an acceptance set and can be regarded as an approximation of the level set H_{c₀}, where c₀ = FV[1] is the largest of the values FV[j]. The positive integer t is called the statistical index. It is clear that f(x) ≤ c₀ for all x ∈ W. Also, the mean deviation m(c₀) and variance deviation v(c₀) of f over the level set H_{c₀} can be approximated by

    m₁ ≈ [(c₀ − FV[1]) + · · · + (c₀ − FV[t])]/t    (40)

and

    v₁ ≈ [(c₀ − FV[1])² + · · · + (c₀ − FV[t])²]/t.    (41)

Let

    λ = v₁/(2m₁)  and  c₁ = c₀ − λ.    (42)

2. Generating a new cuboid from W. The new cuboid domain of dimension n,

    D₁ = {x = (x₁, . . . , x_n): a_i¹ ≤ x_i ≤ b_i¹, i = 1, . . . , n},    (43)

can be generated statistically. The following procedure is proposed. Suppose that the random samples in W are τ₁, . . . , τ_t, where τ_j = (τ_j¹, . . . , τ_jⁿ), j = 1, . . . , t. Let

    σ₀^i = min{τ₁^i, . . . , τ_t^i}  and  σ₁^i = max{τ₁^i, . . . , τ_t^i},  i = 1, . . . , n.    (44)

We use

    a^i = σ₀^i − (σ₁^i − σ₀^i)/(t − 1)  and  b^i = σ₁^i + (σ₁^i − σ₀^i)/(t − 1)    (45)

as estimators to generate a_i¹ and b_i¹, i = 1, . . . , n.
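A direct transcription of (44)–(45) in code (our own sketch; the function name is ours):

```python
import numpy as np

def shrink_cuboid(W):
    """Estimate the new cuboid [a, b] from the acceptance set W, eqs. (44)-(45).

    W is a (t, n) array whose rows are the t best sample points tau_1, ..., tau_t.
    The coordinate-wise min/max are widened by (sigma1 - sigma0)/(t - 1) on each
    side, so the estimated cuboid slightly over-covers the sampled level set.
    """
    t = W.shape[0]
    sigma0 = W.min(axis=0)                  # sigma_0^i, i = 1..n   (44)
    sigma1 = W.max(axis=0)                  # sigma_1^i
    margin = (sigma1 - sigma0) / (t - 1)    # widening term in (45)
    return sigma0 - margin, sigma1 + margin
```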

3. Continuing the iterative process. The samples are now taken in the new domain D₁. Take a random sample point x = (x₁, . . . , x_n) in D₁, where

    x_i = a_i¹ + (b_i¹ − a_i¹)·ξ^i,  i = 1, . . . , n.    (46)


Evaluate f(x). If f(x) ≥ c_k, then drop this sample point; otherwise, update the sets {FV[j]} and W so that the new {FV[j]} consists of the t best function values obtained so far; the acceptance set W is updated accordingly. Repeating this procedure until FV[1] ≤ c₁, we obtain new FV and W.

4. Iterative solution. At each iteration, the smallest value FV[t] in the set {FV[j]} and the corresponding point in W can be regarded as an iterative solution.

5. Convergence criterion. At each iteration, we evaluate the deviation integrals m_k and v_k by the Monte Carlo technique. Let

    λ_k = v_k/(2m_k).    (47)

If λ_k is less than the given precision ε, the iterative process terminates, and the current iterative solution in Step 4 serves as an estimate of the global minimum value and the global minimizer.
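Putting the pieces together, the following is a minimal sketch of this stochastic implementation (ours, with illustrative parameter values for the sample sizes, statistical index and iteration guards; the paper's actual tuning may differ). It keeps an acceptance set W of the t best points, forms m_k and v_k as in (40)–(41), updates the level value by λ_k = v_k/(2m_k), and shrinks the search cuboid with the shrink_cuboid function sketched above. Note that here FV is kept in ascending order, so FV[-1] plays the role of FV[1] in the paper's descending convention.

```python
import numpy as np

def integral_minimize(f, lo, hi, eps=1e-12, t=20, batch=500, max_iter=200, seed=0):
    """Monte Carlo sketch of the Section 5 implementation of Algorithm 2."""
    rng = np.random.default_rng(seed)
    lo, hi = np.asarray(lo, float), np.asarray(hi, float)
    # 1. Initial sample on D; keep the t best points as the acceptance set W.
    X = rng.uniform(lo, hi, size=(batch * t, lo.size))
    FV = np.apply_along_axis(f, 1, X)
    order = np.argsort(FV)[:t]
    W, FV = X[order], FV[order]           # ascending: FV[-1] is the level value
    for _ in range(max_iter):
        c = FV[-1]                        # current level value c_k
        m = np.mean(c - FV)               # mean deviation, cf. (40)
        v = np.mean((c - FV) ** 2)        # variance deviation, cf. (41)
        lam = v / (2.0 * m) if m > 0 else 0.0
        if lam < eps:                     # 5. convergence criterion (47)
            break
        c_next = c - lam                  # level update, cf. (42)
        lo, hi = shrink_cuboid(W)         # 2. new cuboid from W, eqs. (44)-(45)
        tries = 0                         # 3. resample in the new cuboid until
        while FV[-1] > c_next and tries < 200:  # the worst kept value <= c_next
            tries += 1
            x = rng.uniform(lo, hi, size=(batch, lo.size))
            fx = np.apply_along_axis(f, 1, x)
            allF = np.concatenate([FV, fx])
            allX = np.vstack([W, x])
            order = np.argsort(allF)[:t]  # keep the t best points seen so far
            W, FV = allX[order], allF[order]
    # 4. iterative solution: best value and corresponding point
    return FV[0], W[0]

# Usage on a smooth toy objective in dimension 5:
f = lambda x: np.sum((x - 1.0) ** 2)
best, xbest = integral_minimize(f, lo=[-10] * 5, hi=[10] * 5)
print(best, xbest)
```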

6. Numerical tests

An important way to ascertain the performance of a global minimization algorithm is to see whether it passes numerical tests successfully. Many test problems for global minimization are available in the literature; see [10]. We select three of them for testing. They are complicated, high-dimensional problems.

Problem U.1. Source: [3]. Objective function:

    f(x) = (π/n)[sin²(πx₁) + Σ_{i=1}^{n−1} (x_i − 1.0)²(1 + 10.0 sin²(πx_{i+1})) + (x_n − 1.0)²].

Search domain:

    D = {(x₁, . . . , x_n) ∈ Rⁿ: −10.0 ≤ x_i ≤ 10.0, i = 1, . . . , n}.

Solution: x∗ = (1, . . . , 1), f∗ = 0.
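For reference, a direct transcription of this objective in code (our own sketch; the function name is ours, and it assumes n ≥ 2):

```python
import numpy as np

def levy_u1(x):
    """Problem U.1 objective, transcribed from the formula above."""
    x = np.asarray(x, float)
    n = x.size
    inner = np.sum((x[:-1] - 1.0) ** 2 * (1.0 + 10.0 * np.sin(np.pi * x[1:]) ** 2))
    return (np.pi / n) * (np.sin(np.pi * x[0]) ** 2 + inner + (x[-1] - 1.0) ** 2)

print(levy_u1(np.ones(10)))  # ~0.0 at the solution x* = (1, ..., 1)
```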

The following tableau gives the number of iterations N_i, the number of function evaluations N_f, the function value f∗ and λ for n = 2, 5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100. The stopping criterion for this problem is λ < 10^−20.

    n     N_i    N_f      f∗                λ
    2     79     2525     4.311 × 10^−31    8.682 × 10^−21
    5     116    3739     2.466 × 10^−21    8.971 × 10^−21
    10    202    6692     1.981 × 10^−20    9.214 × 10^−21
    20    345    14165    4.343 × 10^−20    9.936 × 10^−21
    30    448    28311    6.432 × 10^−20    8.387 × 10^−21
    40    520    44321    1.142 × 10^−19    9.916 × 10^−21
    50    606    79413    1.350 × 10^−19    9.461 × 10^−21
    60    677    91207    2.106 × 10^−19    9.843 × 10^−21
    70    763    133723   1.526 × 10^−19    8.820 × 10^−21
    80    844    213947   1.345 × 10^−19    8.818 × 10^−21
    90    907    235566   1.697 × 10^−19    9.425 × 10^−21
    100   973    280885   2.044 × 10^−19    7.972 × 10^−21

Problem U.2. Source: [3]. Objective function:

    f(x) = sin²(3πx₁) + Σ_{i=1}^{n−1} (x_i − 1.0)²(1.0 + sin²(3πx_{i+1})) + (x_n − 1.0)²(1.0 + sin²(2πx_n)).

Search domain:

    D = {(x₁, . . . , x_n) ∈ Rⁿ: −10.0 ≤ x_i ≤ 10.0, i = 1, . . . , n}.

Solution: x∗ = (1.0, . . . , 1.0), f∗ = 0.


Using this function, we consider the following discontinuous objective function [10]:

    g(x) = f(x) + [f(x)]/n,    (48)

where [y] denotes the integer part of y. Thus, the objective function is discontinuous.
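In code, the discontinuous modification (48) is simply the following (our own sketch; since f ≥ 0 here, np.floor plays the role of the integer-part bracket):

```python
import numpy as np

def g_u2(x):
    """Problem U.2 discontinuous objective g(x) = f(x) + [f(x)]/n, per (48)."""
    x = np.asarray(x, float)
    n = x.size
    f = (np.sin(3.0 * np.pi * x[0]) ** 2
         + np.sum((x[:-1] - 1.0) ** 2 * (1.0 + np.sin(3.0 * np.pi * x[1:]) ** 2))
         + (x[-1] - 1.0) ** 2 * (1.0 + np.sin(2.0 * np.pi * x[-1]) ** 2))
    return f + np.floor(f) / n   # [f(x)] = integer part of f(x)

print(g_u2(np.ones(10)))  # ~0.0 at x* = (1, ..., 1)
```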

The following tableau gives the number of iterations N_i, the number of function evaluations N_f, the function value f∗ and λ for n = 2, 5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100. The stopping criterion for this problem is λ < 10^−20.

    n     N_i    N_f      f∗                λ
    2     78     2392     1.3511 × 10^−31   5.770 × 10^−21
    5     120    3086     2.999 × 10^−21    9.886 × 10^−21
    10    207    6485     2.019 × 10^−20    8.392 × 10^−21
    20    357    14842    3.218 × 10^−20    8.209 × 10^−21
    30    460    27514    4.690 × 10^−20    8.437 × 10^−21
    40    555    44169    8.063 × 10^−20    9.858 × 10^−21
    50    637    78980    1.036 × 10^−19    7.367 × 10^−21
    60    726    97334    1.316 × 10^−19    9.556 × 10^−21
    70    812    193021   1.198 × 10^−19    8.088 × 10^−21
    80    891    223184   1.576 × 10^−19    9.887 × 10^−21
    90    971    261732   1.583 × 10^−19    9.840 × 10^−21
    100   1048   315618   1.802 × 10^−19    8.916 × 10^−21

Problem U.3. Source: [2]. Objective function:

    f(x) = 1.0 + (Σ_{i=1}^{n} |x_i|)/n + sgn(sin(n/Σ_{i=1}^{n} |x_i|) − 0.5)  for x ≠ 0,
    f(x) = 0  for x = 0.    (49)

Search domain:

    D = {(x₁, . . . , x_n): −2.0 ≤ x_i ≤ 2.0, i = 1, . . . , n}.

Solution: x∗ = (0, . . . , 0), f∗ = 0.

Remark. The function has an infinite number of discontinuity hypersurfaces. Its unique global minimizer is at the origin, where the objective function has a discontinuity of "the second kind." Because of the restriction on the argument values the sine function can take in floating-point evaluation, the computed function f takes the value zero when Σ_{i=1}^{n} |x_i|/n < 10^−10.
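A transcription in code (our own sketch; the 10^−10 guard implements the numerical convention described in the remark, and also avoids division by zero at the origin):

```python
import numpy as np

def f_u3(x):
    """Problem U.3 objective (49); discontinuity of the second kind at the origin."""
    x = np.asarray(x, float)
    s = np.sum(np.abs(x))
    if s / x.size < 1e-10:        # numerical convention from the remark above
        return 0.0
    return 1.0 + s / x.size + np.sign(np.sin(x.size / s) - 0.5)

print(f_u3(np.zeros(10)))  # 0.0 at the global minimizer x* = 0
```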

The following tableau gives the number of iterations N_i, the number of function evaluations N_f, the function value f∗ and λ for n = 2, 5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100. The stopping criterion for this problem is λ < 10^−20.

    n     N_i    N_f       f∗     λ
    2     91     2871      0.0    9.551 × 10^−21
    5     140    5501      0.0    7.144 × 10^−21
    10    194    9870      0.0    9.598 × 10^−21
    20    287    24007     0.0    5.996 × 10^−21
    30    366    54284     0.0    5.326 × 10^−21
    40    436    93461     0.0    6.600 × 10^−21
    50    506    193511    0.0    6.961 × 10^−21
    60    559    411348    0.0    6.260 × 10^−21
    70    627    400459    0.0    5.917 × 10^−21
    80    619    1019272   0.0    5.930 × 10^−21
    90    744    1659780   0.0    5.650 × 10^−21
    100   798    3054279   0.0    5.725 × 10^−21

7. Conclusion

In this paper, an optimality condition and algorithms based on the deviation integral are examined. Combining the classical Newton method for finding the largest zero of the deviation integral, the algorithm is implemented by Monte Carlo simulation. Numerical tests are given to show the effectiveness and accuracy of the algorithm. The test problems presented here illustrate several noteworthy points: they are high-dimensional and complicated; Problem U.2 is discontinuous with a large number of local minimizers; and Problem U.3 is more intricate still, as its global minimizer sits at a discontinuity of the second kind. We are not aware of an existing methodology that matches this performance. In coming research we will study constrained and discrete problems using the deviation integral.


References

[1] M.S. Bazaraa, et al., Nonlinear Programming: Theory and Algorithms, John Wiley and Sons, 1993.
[2] S.H. Chew, Q. Zheng, Integral Global Optimization: Theory, Implementation and Applications, Springer-Verlag, Berlin, 1988.
[3] A.V. Levy, A. Montalvo, The tunneling algorithm for the global minimization of functions, SIAM J. Sci. Stat. Comput. 6 (1985) 15–29.
[4] Z. Pan, D. Wu, W. Yu, Q. Zheng, A level-value estimation algorithm and its stochastic implementation for global optimization, preprint.
[5] H.X. Phu, A. Hoffmann, Essential supremum and supremum of summable functions, Numer. Funct. Anal. Optim. 17 (1–2) (1996) 167–180.
[6] S. Shi, Q. Zheng, D. Zhuang, Discontinuous robust mappings are approximatable, Trans. Amer. Math. Soc. 347 (1995) 4943–4957.
[7] D.-H. Wu, W.-Y. Yu, Q. Zheng, A sufficient and necessary condition for global optimization, preprint.
[8] Q. Zheng, Robust analysis and global minimization of a class of discontinuous functions (I) and (II), Acta Math. Appl. Sin. Engl. Ser. 6 (1990) 205–223, 317–337.
[9] Q. Zheng, Robust analysis and global optimization, Int. J. Comput. Math. Appl. 21 (6/7) (1991) 17–24.
[10] Q. Zheng, D. Zhuang, Integral global minimization: Algorithms, implementations and numerical tests, J. Global Optim. 7 (1995) 421–454.