Nonlinear Analysis 60 (2005) 1473 – 1484 www.elsevier.com/locate/na
Mean value in invexity analysis
Tadeusz Antczak∗
Faculty of Mathematics, University of Łódź, Banacha 22, 90-238 Łódź, Poland
Abstract
In this paper, a generalization of the mean value theorem is considered in the case of functions defined on an invex set with respect to $\eta$ (which is not necessarily connected).
© 2004 Elsevier Ltd. All rights reserved.
Keywords: Invex set with respect to $\eta$; Invex function with respect to $\eta$; $\eta$-path; Mean-value inequality; Mean-value theorem
∗ Tel.: +48 42 355949; fax: +48 42 354266. E-mail address: [email protected] (T. Antczak).
0362-546X/$ - see front matter © 2004 Elsevier Ltd. All rights reserved. doi:10.1016/j.na.2004.11.005
1. Introduction
In many studies of applied analysis, including optimization problems, differential equations, and approximation and convergence results in numerical analysis, one uses what is probably one of the most powerful theorems in differential calculus, the mean value theorem. The mean value theorem for differentiable functions plays an important role in analysis because estimates of function values can be derived from it. For a continuous function $f : [a, b] \to \mathbb{R}$ that is differentiable on $(a, b)$, the classical mean value theorem states the existence of some $c \in (a, b)$ such that $f(b) - f(a) = (b - a)^T \nabla f(c)$. In the nondifferentiable case, Lebourg [10] was the first to give an extension of this theorem for Lipschitz functions, using the Clarke subdifferential [4]. In recent years, many publications have appeared in which mean value theorems for various classes of nondifferentiable real- and vector-valued functions are derived. These theorems were stated by means of various generalizations of the notion of a gradient (or derivative) in differential calculus and of a subdifferential in convex analysis. A survey on mean value properties has been given by Hiriart-Urruty [9].
The purpose of this paper is to give a mean value theorem for differentiable (and twice differentiable) real-valued functions and for (nondifferentiable) Lipschitz functions under the assumption that the functions considered are defined on an invex set with respect to $\eta$. The definition of a set of this type was given by Ben-Israel and Mond [3], who considered (not necessarily differentiable) functions, called pre-invex with respect to $\eta$, which were defined on such sets. We shall also extend well-known results on the geometrical characterization of convexity by means of a so-called “mean-value inequality” to the case of pre-invex functions.
2. Preliminaries
We shall also use a definition of an invex set with respect to $\eta$.
Definition 1. Let $S$ be a nonempty subset of $\mathbb{R}^n$, $\eta : S \times S \to \mathbb{R}^n$, and let $u$ be an arbitrary point of $S$. Then the set $S$ is said to be invex at $u$ with respect to $\eta$ if, for each $x \in S$,
\[ u + \lambda\,\eta(x, u) \in S, \quad \forall \lambda \in [0, 1]. \tag{1} \]
$S$ is said to be an invex set with respect to $\eta$ if $S$ is invex at each $u \in S$ with respect to the same $\eta$.
Remark 2. Note that any set in $\mathbb{R}^n$ is invex with respect to $\eta(x, u) \equiv 0$ for all $x, u \in \mathbb{R}^n$.
Remark 3. The definition of an invex set has a clear geometric interpretation: it essentially says that there is a path starting from $u$ which is contained in $S$. We do not require that $x$ be one of the end points of the path. However, if we demand that $x$ be an end point of the path for every pair of points $x, u \in S$, then $\eta(x, u) = x - u$ and invexity reduces to convexity. Thus, every convex set is also invex with respect to $\eta(x, u) = x - u$, but the converse is not necessarily true.
Example 4. The following is an example of a nonconvex set $S$ in $\mathbb{R}^2$ which is a bounded invex set with respect to a nontrivial function $\eta : S \times S \to \mathbb{R}^2$. Let us consider the bounded set $S := ([-9, -2] \cup [1, 8]) \times ([-9, -2] \cup [1, 8])$. This set is a bounded invex set with respect to $\eta(x, u) = (\eta_1(x, u), \eta_2(x, u))$ given, for $i = 1, 2$, by
\[ \eta_i(x, u) = \begin{cases} x_i - u_i, & x_i \ge 0,\ u_i \ge 0, \\ -9 - u_i, & x_i \ge 0,\ u_i \le 0, \\ 1 - u_i, & x_i \le 0,\ u_i \ge 0, \\ x_i - u_i, & x_i \le 0,\ u_i \le 0. \end{cases} \]
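The invexity of the set in Example 4 can be checked numerically. The following is a minimal sketch, not part of the paper: the case split of $\eta$ is the reading that agrees with Example 7 (the inequality signs are an assumption), and the helper names `in_S`, `eta`, `path_point` and `check_invex` are hypothetical. It tests condition (1) on a finite grid only, so it illustrates the definition rather than proving it.

```python
# Sketch of Example 4: S = ([-9,-2] U [1,8]) x ([-9,-2] U [1,8]).
# The case split of eta is an assumed reading, chosen to agree with Example 7.

def in_component(t, tol=1e-12):
    """True if the scalar t lies in [-9, -2] or [1, 8] (up to rounding)."""
    return (-9 - tol <= t <= -2 + tol) or (1 - tol <= t <= 8 + tol)

def in_S(p):
    """Membership test for the product set S."""
    return all(in_component(t) for t in p)

def eta_1d(x, u):
    """One coordinate of eta(x, u)."""
    if x * u > 0:      # x and u in the same component: ordinary difference
        return x - u
    if u < 0:          # u in [-9,-2], x in [1,8]: move towards -9
        return -9 - u
    return 1 - u       # u in [1,8], x in [-9,-2]: move towards 1

def eta(x, u):
    return tuple(eta_1d(xi, ui) for xi, ui in zip(x, u))

def path_point(x, u, lam):
    """The point u + lam * eta(x, u)."""
    return tuple(ui + lam * ei for ui, ei in zip(u, eta(x, u)))

def check_invex(points, steps=20):
    """Check condition (1) for every ordered pair (x, u) drawn from `points`."""
    lams = [k / steps for k in range(steps + 1)]
    return all(in_S(path_point(x, u, lam))
               for x in points for u in points for lam in lams)

grid_1d = [-9, -6.5, -2, 1, 4.25, 8]
grid = [(s, t) for s in grid_1d for t in grid_1d]
print(check_invex(grid))
print(eta((-8, 3), (2, -5)))
```

For the pair used later in Example 7, the sketch gives the end point $u + \eta(x, u)$ via `path_point((-8, 3), (2, -5), 1)`.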
On the basis of the invex sets introduced above, we give the definition of an $\eta$-path as follows.
Definition 5. Let $S \subset \mathbb{R}^n$ be a nonempty invex set with respect to $\eta$, and let $x$ and $u$ be two arbitrary points of $S$. A set $P_{uv}$ is said to be a closed $\eta$-path joining the points $u$ and $v = u + \eta(x, u)$ (contained in $S$) if
\[ P_{uv} := \{ y = u + \lambda\,\eta(x, u) : \lambda \in [0, 1] \}. \]
Analogously, an open $\eta$-path joining the points $u$ and $v = u + \eta(x, u)$ (contained in $S$) is a set of the form
\[ P^0_{uv} := \{ y = u + \lambda\,\eta(x, u) : \lambda \in (0, 1) \}. \]
Remark 6. It is easy to see that if $\eta(x, u) = x - u$, then by definition $v = x$, and the set $P_{uv} = P_{ux} = \{ y = u + \lambda(x - u) : \lambda \in [0, 1] \}$ is the line segment with end points $u$ and $x$.
Example 7. We consider again the invex set of Example 4. Let $u = (2; -5)$ and $x = (-8; 3)$ be two points of $S$. In this case the closed $\eta$-path joining the points $u = (2; -5)$ and $v = u + \eta(x, u) = (1; -9)$ has the form
\[ P_{(2;-5)(1;-9)} := \{ y = (y_1, y_2) \in \mathbb{R}^2 : y_1 = 2 - \lambda \ \wedge\ y_2 = -5 - 4\lambda,\ \lambda \in [0, 1] \}. \]
We recall the definition of pre-invex functions with respect to $\eta$. This class of functions was introduced to optimization theory by Ben-Israel and Mond [3].
Definition 8. Let $S \subset \mathbb{R}^n$ be a nonempty invex set with respect to $\eta$. A function $f : S \to \mathbb{R}$ is said to be pre-invex with respect to $\eta$ if there exists a vector-valued function $\eta : S \times S \to \mathbb{R}^n$ such that the relation
\[ f(u + \lambda\,\eta(x, u)) \le \lambda f(x) + (1 - \lambda) f(u), \quad \forall x, u \in S,\ \forall \lambda \in [0, 1] \tag{2} \]
holds.
Invex functions were introduced to optimization theory by Hanson [8] (and so named by Craven [5]) as a very broad generalization of convex functions.
Definition 9. Let $S \subset \mathbb{R}^n$ be a nonempty invex set with respect to $\eta$. A differentiable function $f : S \to \mathbb{R}$ is said to be invex with respect to $\eta$ if there exists a vector-valued function $\eta : S \times S \to \mathbb{R}^n$ such that the relation
\[ f(x) - f(u) \ge [\eta(x, u)]^T \nabla f(u), \quad \forall x, u \in S \tag{3} \]
holds.
Remark 10. Every convex function is pre-invex (invex in the case of differentiability) with respect to $\eta(x, u) = x - u$, but the converse is not always true.
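To make inequality (2) concrete, the sketch below tests it on a one-dimensional analogue of Example 4. Everything here is an illustrative assumption rather than material from the paper: the set $[-9, -2] \cup [1, 8]$, the coordinate form of $\eta$ (read so that it agrees with Example 7), and the hypothetical function `f`, which measures the distance to the left end of the component containing its argument. The check runs over a finite grid only.

```python
# Hypothetical example: a function on [-9,-2] U [1,8] that satisfies (2)
# although its domain is neither convex nor connected.

def eta(x, u):
    """One-dimensional eta, assumed case split (consistent with Example 7)."""
    if x * u > 0:
        return x - u
    return (-9 - u) if u < 0 else (1 - u)

def f(t):
    """Distance from t to the left end point of its component."""
    return t + 9 if t < 0 else t - 1

def preinvex_gap(x, u, lam):
    """lam*f(x) + (1-lam)*f(u) - f(u + lam*eta(x,u)); (2) holds iff this is >= 0."""
    return lam * f(x) + (1 - lam) * f(u) - f(u + lam * eta(x, u))

points = [-9 + 0.5 * k for k in range(15)] + [1 + 0.5 * k for k in range(15)]
lams = [k / 10 for k in range(11)]
worst = min(preinvex_gap(x, u, lam)
            for x in points for u in points for lam in lams)
print(worst >= -1e-12)   # small tolerance for floating-point rounding
```

When $x$ and $u$ lie in the same component the gap is zero because `f` is affine there; when they lie in different components the gap equals $\lambda f(x) \ge 0$.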
3. Mean value inequality
The geometrical meaning of convexity is well known, and the following equivalent characterization is easily derived. It is useful to characterize a convex function $g : C \to \mathbb{R}$ defined on a nonempty convex set $C \subset \mathbb{R}^n$ by using a so-called “mean-value inequality”, i.e. for all $a, b \in C$, the following inequality:
\[ g(x) \le g(a) + \frac{g(b) - g(a)}{(b - a)^T (b - a)}\,(x - a)^T (b - a) \tag{4} \]
holds, where $x := \lambda a + (1 - \lambda) b$ for some $\lambda \in (0, 1)$.
Theorem 11. Let the differentiable function $g : C \to \mathbb{R}$ be defined on a nonempty convex set $C \subset \mathbb{R}^n$. For $g$ to be a convex function on $C$ it is necessary and sufficient that the mean-value inequality (4) holds.
It turns out that a mean-value inequality holds not only for the class of convex functions defined on convex sets. In an analogous way, the mean-value inequality can also be proved for the class of pre-invex functions with respect to $\eta$ defined on a nonempty invex set $S \subset \mathbb{R}^n$. Indeed, we assume that $f : S \to \mathbb{R}$ is a pre-invex function with respect to $\eta$ defined on a nonempty invex set $S$ with respect to $\eta$, and that the function $\eta : S \times S \to \mathbb{R}^n$ satisfies the following condition: $\eta(a, b) \ne 0$ for all $a, b \in S$ such that $a \ne b$. Then, for all $a, b \in S$, the inequality
\[ f(x) \le f(a) + \frac{f(b) - f(a)}{[\eta(b, a)]^T \eta(b, a)}\,(x - a)^T \eta(b, a) \tag{5} \]
holds, where $x := a + \lambda\,\eta(b, a)$ for some $\lambda \in (0, 1)$.
Theorem 12. Let $f : S \to \mathbb{R}$ be defined on a nonempty invex set $S \subset \mathbb{R}^n$ with respect to $\eta$, and let $\eta(a, b) \ne 0$ for all $a, b \in S$ such that $a \ne b$. It is necessary and sufficient for $f$ to be a pre-invex function with respect to $\eta$ on $S$ that the mean-value inequality (5) holds.
Proof. “⇒” Let $f : S \to \mathbb{R}$ be a pre-invex function with respect to $\eta$ defined on a nonempty invex set $S \subset \mathbb{R}^n$ with respect to $\eta$, and let $\eta(b, a) \ne 0$ for all $a, b \in S$ such that $b \ne a$. We take $a$ and $b$ in $S$, $\lambda \in (0, 1)$, and set $x := a + \lambda\,\eta(b, a)$. Since $S$ is invex with respect to $\eta$, it follows that $x \in S$. By the definition of $x$, we have
\[ \lambda = \frac{(x - a)^T \eta(b, a)}{[\eta(b, a)]^T \eta(b, a)} \quad \text{and} \quad 1 - \lambda = \frac{(\eta(b, a) - (x - a))^T \eta(b, a)}{[\eta(b, a)]^T \eta(b, a)}. \]
Since $f$ is pre-invex with respect to $\eta$ on $S$, the inequality
\[ f(a + \lambda\,\eta(b, a)) \le \lambda f(b) + (1 - \lambda) f(a) \tag{6} \]
holds. Substituting the computed values of $\lambda$ and $1 - \lambda$ into the inequality above, we obtain
\[ \begin{aligned} f(x) &\le \frac{(x - a)^T \eta(b, a)}{[\eta(b, a)]^T \eta(b, a)}\, f(b) + \frac{(\eta(b, a) - (x - a))^T \eta(b, a)}{[\eta(b, a)]^T \eta(b, a)}\, f(a) \\ &= \frac{(x - a)^T \eta(b, a)\, f(b) + (\eta(b, a) - (x - a))^T \eta(b, a)\, f(a)}{[\eta(b, a)]^T \eta(b, a)} \\ &= f(a) + \frac{f(b) - f(a)}{[\eta(b, a)]^T \eta(b, a)}\,(x - a)^T \eta(b, a). \end{aligned} \]
“⇐” Assume that the mean-value inequality (5) holds. Let $a, b \in S$ be arbitrary points and $x = a + \lambda\,\eta(b, a)$ for some $\lambda \in (0, 1)$; hence $x \in S$, and we have
\[ \begin{aligned} f(x) &\le f(a) + \frac{f(b) - f(a)}{[\eta(b, a)]^T \eta(b, a)}\,(x - a)^T \eta(b, a) \\ &= f(a) + \frac{f(b) - f(a)}{[\eta(b, a)]^T \eta(b, a)}\,(a + \lambda\,\eta(b, a) - a)^T \eta(b, a) \\ &= \lambda f(b) + (1 - \lambda) f(a). \end{aligned} \]
Taking into account the definition of $x$, we get the inequality (6), by which we conclude that $f$ is pre-invex with respect to $\eta$ on $S$.
Remark 13. In the case when $\eta(b, a) = b - a$, i.e. in the case when $f$ is a convex function defined on a convex set, we obtain the mean-value inequality (4).
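A single instance of inequality (5) can be worked out by hand. The data below are assumptions made for illustration and do not come from the paper: the points $a = (2, -5)$ and $b = (-8, 3)$ of Example 7, for which $\eta(b, a) = (-1, -4) \ne 0$, and the hypothetical separable function $f(y) = h(y_1) + h(y_2)$ with $h(t) = t + 9$ on $[-9, -2]$ and $h(t) = t - 1$ on $[1, 8]$. This $f$ satisfies (2) coordinate by coordinate, so (5) should hold for this pair.

```latex
\begin{align*}
x &= a + \tfrac12\,\eta(b,a) = (1.5,\,-7), \qquad \lambda = \tfrac12, \\
f(a) &= (2-1) + (-5+9) = 5, \qquad f(b) = (-8+9) + (3-1) = 3, \qquad f(x) = 0.5 + 2 = 2.5, \\
\frac{(x-a)^T \eta(b,a)}{[\eta(b,a)]^T \eta(b,a)}
  &= \frac{(-0.5)(-1) + (-2)(-4)}{1 + 16} = \frac{8.5}{17} = \tfrac12, \\
f(a) + \frac{f(b)-f(a)}{[\eta(b,a)]^T \eta(b,a)}\,(x-a)^T \eta(b,a)
  &= 5 + (3-5)\cdot\tfrac12 = 4 \;\ge\; 2.5 = f(x).
\end{align*}
```

The quotient in the third line recovers $\lambda$, which is exactly the identity used in the “⇒” part of the proof.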
4. Mean value theorem and its applications
The classical mean value theorem (for differentiable real-valued functions) states that for a given differentiable function $g : C \to \mathbb{R}$ defined on a nonempty open convex set $C \subset \mathbb{R}^n$ and any two points $a$ and $b$ in $C$ there exists a point $c$ between $a$ and $b$ such that
\[ g(b) - g(a) = (b - a)^T \nabla g(c), \tag{7} \]
where $\nabla g(c)$ denotes the gradient of $g$ at $c$. Now we give a mean value theorem for a differentiable function defined on a nonempty invex set $S \subset \mathbb{R}^n$ with respect to $\eta$.
Theorem 14. Let $S \subset \mathbb{R}^n$ be a nonempty invex set with respect to $\eta : S \times S \to \mathbb{R}^n$, and let $P_{ab}$ be an arbitrary $\eta$-path contained in int $S$. Moreover, we assume that $f : S \to \mathbb{R}$ is defined on $S$ and differentiable on int $S$. Then, for any $a, b \in S$, there exists $c \in P^0_{ab}$ such that the relation
\[ f(a + \eta(b, a)) - f(a) = [\eta(b, a)]^T \nabla f(c) \tag{8} \]
holds.
Proof. We define a function $g : [0, 1] \to \mathbb{R}$ as follows:
\[ g(\lambda) := f(a + \lambda\,\eta(b, a)) - f(a) - \lambda\,\bigl(f(a + \eta(b, a)) - f(a)\bigr). \tag{9} \]
Since $g(0) = g(1) = 0$, using Rolle’s theorem (see, for example, [6,7,15]), it follows that there exists $\lambda_0 \in (0, 1)$ such that the relation $g(1) - g(0) = g'(\lambda_0)(1 - 0)$ holds. By (9), we have
\[ 0 = g'(\lambda_0) = [\eta(b, a)]^T \nabla f(a + \lambda_0\,\eta(b, a)) - \bigl(f(a + \eta(b, a)) - f(a)\bigr). \]
Hence
\[ f(a + \eta(b, a)) - f(a) = [\eta(b, a)]^T \nabla f(a + \lambda_0\,\eta(b, a)). \]
We set $c = a + \lambda_0\,\eta(b, a)$. Since $\lambda_0 \in (0, 1)$, by definition $c \in P^0_{ab}$. This ends the proof of the theorem.
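The proof is constructive enough to be followed numerically: the intermediate point is a zero of the derivative of the auxiliary function in (9). The sketch below is an illustration under stated assumptions, not the author's procedure. It takes $a = (2, -5)$ and $\eta(b, a) = (-1, -4)$ from Example 7, uses a hypothetical smooth test function `f` that is differentiable on all of $\mathbb{R}^2$ (so the interior assumption causes no difficulty), and locates $\lambda_0$ by a grid scan followed by bisection.

```python
import math

# Assumed data: a and eta(b, a) as in Example 7; f is a hypothetical test function.
a = (2.0, -5.0)
eta_ba = (-1.0, -4.0)

def f(y):
    return math.exp(0.5 * y[0]) + y[1] ** 2

def grad_f(y):
    return (0.5 * math.exp(0.5 * y[0]), 2.0 * y[1])

def point(lam):
    """The point a + lam * eta(b, a) on the eta-path."""
    return tuple(ai + lam * ei for ai, ei in zip(a, eta_ba))

def g_prime(lam):
    """Derivative of g from (9): eta^T grad f(a + lam*eta) - (f(a + eta) - f(a))."""
    directional = sum(e * d for e, d in zip(eta_ba, grad_f(point(lam))))
    return directional - (f(point(1.0)) - f(a))

def find_root(h, steps=200, tol=1e-12):
    """Scan [0, 1] for a sign change of h, then bisect inside that bracket."""
    grid = [k / steps for k in range(steps + 1)]
    for lo, hi in zip(grid, grid[1:]):
        if h(lo) * h(hi) <= 0:
            while hi - lo > tol:
                mid = 0.5 * (lo + hi)
                if h(lo) * h(mid) <= 0:
                    hi = mid
                else:
                    lo = mid
            return 0.5 * (lo + hi)
    raise ValueError("no sign change found on the grid")

lam0 = find_root(g_prime)
c = point(lam0)
lhs = f(point(1.0)) - f(a)                               # f(a + eta) - f(a)
rhs = sum(e * d for e, d in zip(eta_ba, grad_f(c)))      # eta^T grad f(c)
print(lam0, c, abs(lhs - rhs))
```

A grid scan is used because Rolle's theorem guarantees an interior zero of the derivative but not opposite signs at the two end points, so bisecting directly on $[0, 1]$ could fail for other choices of `f`.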
Remark 15. Using the definition of the point $c$, we can write (8) in the form
\[ f(a + \eta(b, a)) - f(a) = [\eta(b, a)]^T \nabla f(a + \lambda_0\,\eta(b, a)). \]
In the case of twice differentiable functions defined on a nonempty convex set, Taylor’s theorem holds (see, for example, [11]). This theorem can be regarded as the mean value theorem for twice differentiable functions. In its classical version, Taylor’s theorem reads as follows.
Theorem 16. Let $C$ be a nonempty convex subset of $\mathbb{R}^n$, and let $f : C \to \mathbb{R}$ be a twice differentiable function on int $C$. Then, the relation
\[ f(b) = f(a) + (b - a)^T \nabla f(a) + \tfrac{1}{2}\,(b - a)^T \nabla^2 f(c)\,(b - a) \tag{10} \]
holds for all $a, b \in C$, where $c := a + \lambda(b - a)$ for some $\lambda \in (0, 1)$, and $\nabla^2 f(c)$ denotes the Hessian of $f$ at $c$.
It turns out that also in the case of twice differentiable functions defined on a nonempty invex set with respect to $\eta$ one can give an analogue of the classical Taylor’s theorem.
Theorem 17. Let $S \subset \mathbb{R}^n$ be a nonempty invex set with respect to $\eta : S \times S \to \mathbb{R}^n$, and let $P_{ab}$ be an arbitrary $\eta$-path contained in int $S$. Moreover, we assume that $f : S \to \mathbb{R}$ is defined on $S$ and twice differentiable on int $S$. Then, there exists $c \in P^0_{ab}$ such that the relation
\[ f(a + \eta(b, a)) = f(a) + [\eta(b, a)]^T \nabla f(a) + \tfrac{1}{2}\,[\eta(b, a)]^T \nabla^2 f(c)\,\eta(b, a) \tag{11} \]
holds.
Proof. The proof of the mean value theorem for functions defined on an invex set can be derived from the corresponding properties of a real function $g$ defined on $[0, 1]$. We define a function $g : [0, 1] \to \mathbb{R}$ by $g(\lambda) = f(a + \lambda\,\eta(b, a))$. Then $g$ is twice differentiable and
\[ g'(\lambda) = [\eta(b, a)]^T \nabla f(a + \lambda\,\eta(b, a)), \qquad g''(\lambda) = [\eta(b, a)]^T \nabla^2 f(a + \lambda\,\eta(b, a))\,\eta(b, a). \tag{12} \]
By Taylor’s theorem [2] for a real function defined on a convex set, it follows that
\[ g(1) = g(0) + g'(0) + \tfrac{1}{2}\, g''(\lambda_0) \tag{13} \]
for some $\lambda_0 \in (0, 1)$. We set $c = a + \lambda_0\,\eta(b, a)$. Since $\lambda_0 \in (0, 1)$, by definition $c \in P^0_{ab}$. From (12) together with (13), we get the required Taylor formula for functions defined on an invex set with respect to $\eta$.
Remark 18. Using the definition of the point $c$, we can write (11) in the form
\[ f(a + \eta(b, a)) = f(a) + [\eta(b, a)]^T \nabla f(a) + \tfrac{1}{2}\,[\eta(b, a)]^T \nabla^2 f(a + \lambda_0\,\eta(b, a))\,\eta(b, a). \]
Now we give some applications of the mean value theorem.
Theorem 19. Let $f : S \to \mathbb{R}$ be defined on a nonempty invex set $S \subset \mathbb{R}^n$ with respect to $\eta$, and differentiable on int $S$. Moreover, we assume that $u, x \in$ int $S$ are two arbitrary points such that $v = u + \eta(x, u) \in$ int $S$ and $P_{uv} \subset$ int $S$. Then
\[ |f(u + \eta(x, u)) - f(u)| \le |[\eta(x, u)]^T| \sup_{0 \le \lambda \le 1} |\nabla f(u + \lambda\,\eta(x, u))|. \tag{14} \]
Proof. Let $u \in$ int $S$ and $v = u + \eta(x, u) \in$ int $S$ be the end points of $P_{uv}$ contained in int $S$. We define $g : [0, 1] \to \mathbb{R}$ by $g(\lambda) = f(u + \lambda\,\eta(x, u))$. The function $g$ is differentiable at every point of $[0, 1]$, and its derivative has the form
\[ g'(\lambda) = [\eta(x, u)]^T \nabla f(u + \lambda\,\eta(x, u)). \tag{15} \]
By Theorem 8.5.2 [6], we have $|g(1) - g(0)| \le M(1 - 0)$, where $|g'(\lambda)| \le M$ for any $\lambda \in [0, 1]$. From the definition of $g$ and by the mean value theorem (Theorem 14), we obtain
\[ |f(u + \eta(x, u)) - f(u)| = |[\eta(x, u)]^T \nabla f(u + \lambda_0\,\eta(x, u))|. \tag{16} \]
Since $g'(\lambda)$ is bounded for any $\lambda \in [0, 1]$, it follows by (15) that the derivative of $f$ is also bounded on $P_{uv}$. Using this fact in (16), we get the conclusion of the theorem.
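As an illustration of the estimate (14), consider the following worked instance. It rests on assumptions that are not in the paper: the bars are read as Euclidean norms, the data $u = (2, -5)$, $\eta(x, u) = (-1, -4)$, $v = (1, -9)$ come from Example 7, and $f(y) = y_1^2 + y_2^2$ is a hypothetical test function. Since this $f$ is differentiable on all of $\mathbb{R}^2$, the fact that $v$ lies on the boundary of the set of Example 4 does not affect the computation.

```latex
\begin{align*}
|f(v) - f(u)| &= |(1 + 81) - (4 + 25)| = 53, \qquad |\eta(x,u)| = \sqrt{1 + 16} = \sqrt{17}, \\
\nabla f(u + \lambda\,\eta(x,u)) &= 2\,(2 - \lambda,\ -5 - 4\lambda), \qquad
|\nabla f(u + \lambda\,\eta(x,u))|^2 = 4\,(29 + 36\lambda + 17\lambda^2), \\
\sup_{0 \le \lambda \le 1} |\nabla f(u + \lambda\,\eta(x,u))| &= 2\sqrt{82} \quad (\text{attained at } \lambda = 1), \\
|\eta(x,u)| \sup_{0 \le \lambda \le 1} |\nabla f(u + \lambda\,\eta(x,u))| &= 2\sqrt{17 \cdot 82} = 2\sqrt{1394} \approx 74.67 \;\ge\; 53.
\end{align*}
```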
Theorem 20. Let $f : S \to \mathbb{R}$ be defined on a nonempty invex set $S \subset \mathbb{R}^n$ with respect to $\eta$ and differentiable on int $S$, and let $P_{uv} \subset$ int $S$. Then, for any $x_0 \in S$, the following inequality is true:
\[ |f(u + \eta(x, u)) - f(u) - [\eta(x, u)]^T \nabla f(x_0)| \le |[\eta(x, u)]^T| \sup_{z \in P_{uv}} |\nabla f(z) - \nabla f(x_0)|. \tag{17} \]
Proof. We define $g$ by $g(x) = f(x) - x^T \nabla f(x_0)$. The function $g$ is differentiable at any point of $S$, and its derivative has the form $\nabla g(x) = \nabla f(x) - \nabla f(x_0)$. By (14), we have
\[ |g(u + \eta(x, u)) - g(u)| \le |[\eta(x, u)]^T| \sup_{0 \le \lambda \le 1} |\nabla g(u + \lambda\,\eta(x, u))| \]
and so, by the definition of $g$, we obtain
\[ |f(u + \eta(x, u)) - f(u) - [\eta(x, u)]^T \nabla f(x_0)| \le |[\eta(x, u)]^T| \sup_{0 \le \lambda \le 1} |\nabla f(u + \lambda\,\eta(x, u)) - \nabla f(x_0)|. \]
We set $z = u + \lambda\,\eta(x, u)$ in the inequality above. Hence, and by hypothesis, we obtain (17), which completes the proof of the theorem.
Now we give a generalization of the well-known Lagrange mean value theorem to the case when the function considered is defined on an invex set.
Theorem 21. Let $f : S \to \mathbb{R}$ be a function defined on a nonempty invex set $S \subset \mathbb{R}$ with respect to $\eta$, with $\eta(x, u) \ne 0$ for $x \ne u$. Moreover, we assume that $f$ is continuous on $P_{uv} \subset$ int $S$ and differentiable on $P^0_{uv}$. Then, there exists a point $\xi \in P^0_{uv}$ such that the equality
\[ f(u + \eta(x, u)) - f(u) = f'(\xi)\,\eta(x, u) \]
holds for all $x \in P_{uv}$.
Proof. Follows along the lines of the proof of Theorem 14.
Now we generalize the well-known Cauchy theorem to the case when the functions considered are defined on an invex set.
Theorem 22. Let $P_{uv}$ be some $\eta$-path contained in a nonempty invex set $S \subset \mathbb{R}^n$ with respect to $\eta$, and let $f : S \to \mathbb{R}$, $g : S \to \mathbb{R}$ be defined on $S$. Moreover, we assume that the following conditions are satisfied:
(a) $f$ and $g$ are continuous on $P_{uv}$,
(b) $f$ and $g$ are differentiable on $P^0_{uv}$,
(c) $\nabla g(x) \ne 0$ for any $x \in P^0_{uv}$,
(d) $\eta(x, u) \ne 0$ whenever $x \ne u$.
Then there exists a point $\xi \in P^0_{uv}$ such that
\[ \frac{f(v) - f(u)}{g(v) - g(u)} = \frac{[\eta(x, u)]^T \nabla f(\xi)}{[\eta(x, u)]^T \nabla g(\xi)}. \]
In other words, there exists $\lambda_0 \in (0, 1)$ such that
\[ \frac{f(u + \eta(x, u)) - f(u)}{g(u + \eta(x, u)) - g(u)} = \frac{[\eta(x, u)]^T \nabla f(u + \lambda_0\,\eta(x, u))}{[\eta(x, u)]^T \nabla g(u + \lambda_0\,\eta(x, u))}. \]
Proof. By the mean value theorem (Theorem 14) together with the assumptions (c) and (d), we have $g(u + \eta(x, u)) \ne g(u)$. We define the function $\varphi : [0, 1] \to \mathbb{R}$ as follows:
\[ \varphi(\lambda) = f(u + \lambda\,\eta(x, u)) - f(u) - \frac{f(u + \eta(x, u)) - f(u)}{g(u + \eta(x, u)) - g(u)}\,\bigl(g(u + \lambda\,\eta(x, u)) - g(u)\bigr). \]
Since $\varphi(0) = 0 = \varphi(1)$ and
\[ \varphi'(\lambda) = [\eta(x, u)]^T \nabla f(u + \lambda\,\eta(x, u)) - \frac{f(u + \eta(x, u)) - f(u)}{g(u + \eta(x, u)) - g(u)}\,[\eta(x, u)]^T \nabla g(u + \lambda\,\eta(x, u)), \tag{18} \]
by Rolle’s theorem [6,7] there exists $\lambda_0 \in (0, 1)$ such that $\varphi'(\lambda_0) = 0$. Using this fact in (18), we get the result of the theorem.
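The Cauchy-type formula of Theorem 22 can be traced through a small worked example. All data here are illustrative assumptions rather than material from the paper: $u = (2, -5)$, $\eta(x, u) = (-1, -4)$ and $v = (1, -9)$ as in Example 7, with the hypothetical functions $f(y) = y_1^2 + y_2^2$ and $g(y) = y_1 + y_2$, so that $\nabla g = (1, 1) \ne 0$ everywhere.

```latex
\begin{align*}
\frac{f(v) - f(u)}{g(v) - g(u)} &= \frac{82 - 29}{(-8) - (-3)} = -\frac{53}{5}, \\
[\eta(x,u)]^T \nabla f(u + \lambda\,\eta(x,u)) &= (-1)\cdot 2(2 - \lambda) + (-4)\cdot 2(-5 - 4\lambda) = 36 + 34\lambda, \\
[\eta(x,u)]^T \nabla g(u + \lambda\,\eta(x,u)) &= (-1) + (-4) = -5, \\
\frac{36 + 34\lambda_0}{-5} = -\frac{53}{5} \quad &\Longrightarrow \quad \lambda_0 = \tfrac12, \qquad \xi = u + \tfrac12\,\eta(x,u) = (1.5,\,-7).
\end{align*}
```

For a quadratic $f$ and an affine $g$ the intermediate point is the midpoint of the $\eta$-path, as in the classical case on a line segment.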
5. Locally Lipschitz functions
In this section, we consider locally Lipschitz functions defined on a nonempty invex set with respect to $\eta$. Let $X$ denote a Banach space.
Definition 23. A function $f : X_0 \to \mathbb{R}$ is said to be locally Lipschitz at $x$ on a nonempty open subset $X_0$ of $X$ if there exists a neighborhood $\Omega$ of $x$ in $X$ such that the inequality $|f(y) - f(z)| \le K \|y - z\|$ is satisfied for all $y, z \in \Omega$, where $K \ge 0$.
Definition 24 [4]. Let $f : X_0 \to \mathbb{R}$ be a locally Lipschitz function at $x$ on a nonempty open subset $X_0$ of $X$. The generalized directional derivative of $f$ at $x$ in the direction $v \in X$, denoted by $f^0(x; v)$, is given by
\[ f^0(x; v) := \limsup_{y \to x,\ t \downarrow 0} \frac{f(y + t v) - f(y)}{t}, \]
where, of course, $y$ belongs to $X$ and $t$ to $(0, \infty)$. As usual, $X^*$ denotes the (continuous) dual of $X$.
Definition 25 [4]. The generalized gradient (in the sense of Clarke) of $f$ at $x$, denoted $\partial f(x)$, is the (nonempty) set of all $\zeta$ in $X^*$ satisfying $f^0(x; v) \ge \zeta^T v$ for all $v$ in $X$.
Proposition 26. For every $x \in X_0$, $v \in X$, and every real number $s$ satisfying $-f^0(x; -v) \le s \le f^0(x; v)$, there exists $\zeta \in \partial f(x)$ such that $\zeta^T v = s$. In particular, for every $x \in X_0$, $\partial f(x) \ne \emptyset$.
Throughout this section $f : S \to \mathbb{R}$ denotes a locally Lipschitz function defined on a nonempty open invex set $S \subset X$ with respect to $\eta$. To give the mean value theorem for a locally Lipschitz function $g$ defined on $S$, we use the notion of a $g$-critical point. This notion was introduced by Lebourg for an interval. We now generalize it to an $\eta$-path.
Definition 27. Let $P_{uv}$ ($P^0_{uv}$) denote an $\eta$-path (an open $\eta$-path) contained in $S \subset X$, and let $g$ be any continuous function defined on $P_{uv}$. We say that $z$ is a $g$-critical point for $P_{uv}$ if and only if there exist sequences $u_n$ and $v_n$ in $P_{uv}$ and $r_n > 0$ such that
(i) $g(v_n) - g(u_n) = r_n [g(v) - g(u)]$, $v_n - u_n = r_n (v - u)$,
(ii) $v_n \to z$, $u_n \to z$, $r_n \downarrow 0$.
If in the definition of a $g$-critical point for an $\eta$-path $P_{uv}$ we use the explicit form of its end point, i.e. $v = u + \eta(x, u)$, then the definition above can be written in the following form:
(i) $g(v_n) - g(u_n) = r_n [g(u + \eta(x, u)) - g(u)]$, $v_n - u_n = r_n\,\eta(x, u)$,
(ii) $v_n \to z$, $u_n \to z$, $r_n \downarrow 0$.
To prove the main result of this section we need the following proposition.
Proposition 28 (Existence of a critical point for an $\eta$-path). For every $\eta$-path $P_{uv}$ contained in $S$ and every continuous real-valued function $f$ defined on $P_{uv}$, there exists an “$f$-critical point” $z \in P^0_{uv}$.
Proof. The proof is similar to the proof of Proposition 1.6 in [10] and is omitted.
Now we give the main result of this section.
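Theorem 29 (Mean value theorem). For any locally Lipschitz function $f$ defined on $P_{uv}$ contained in an open invex set $S \subset X$ with respect to $\eta$, there exist $z \in P^0_{uv}$ and $\zeta \in \partial f(z)$ such that $f(v) - f(u) = \zeta^T \eta(x, u)$.
Proof. Let $z \in P^0_{uv}$ be an $f$-critical point for $P_{uv}$, which exists by Proposition 28. Then there exist sequences $u_n$ and $v_n$ in $P_{uv}$ and $t_n > 0$ such that
\[ f(u + \eta(x, u)) - f(u) = t_n^{-1} [f(v_n) - f(u_n)], \qquad v_n - u_n = t_n\,\eta(x, u), \]
where $u_n \to z$, $v_n \to z$, $t_n \downarrow 0$. Then
\[ \begin{aligned} f(u + \eta(x, u)) - f(u) &= \lim_{n \to \infty} \frac{f(v_n) - f(u_n)}{t_n} \\ &= \lim_{n \to \infty} \frac{f(u_n + t_n\,\eta(x, u)) - f(u_n)}{t_n} \\ &\le \limsup_{y \to z,\ \lambda \downarrow 0} \frac{f(y + \lambda\,\eta(x, u)) - f(y)}{\lambda} = f^0(z; \eta(x, u)). \end{aligned} \]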
By similar calculations, we obtain $f(u + \eta(x, u)) - f(u) \ge -f^0(z; -\eta(x, u))$. The theorem is then a direct consequence of Proposition 26.
6. Conclusion
The classical mean value theorem is stated for differentiable real-valued functions on connected convex sets. In the last few years, the mean value theorem for various classes of nondifferentiable functions has been the subject of many investigations. These results were stated by means of various generalizations of the notion of a gradient (or derivative) in differential calculus and of a subdifferential in convex analysis. In these cases, however, they were always derived on connected convex sets. In the present paper we used a different approach to derive a mean value theorem. Our results are applicable to a larger class of functions than the classical mean value theorem because of the sets on which they are proved. Indeed, the main motivation for the present paper was to weaken the assumptions on the set on which mean value results are established. Thus, we proved mean value results on invex sets, which are not necessarily connected or convex (see, for example, Example 4). Hence, as follows from the results obtained in the paper, the mean value theorem is also valid for real-valued functions defined on nonconnected nonconvex sets.
References
[2] M.S. Bazaraa, H.D. Sherali, C.M. Shetty, Nonlinear Programming Theory and Algorithms, Wiley, New York, 1991.
[3] A. Ben-Israel, B. Mond, What is invexity, J. Austral. Math. Soc. Ser. B 28 (1986) 1–9.
[4] F.H. Clarke, Optimization and Nonsmooth Analysis, Wiley, New York, 1983.
[5] B.D. Craven, Invex functions and constrained local minima, Bull. Austral. Math. Soc. 24 (1981) 357–366.
[6] J. Dieudonné, Foundations of Modern Analysis, Academic Press, New York, 1960.
[7] G.M. Fichtenholz, Rachunek różniczkowy i całkowy, Warszawa, 1962.
[8] M.A. Hanson, On sufficiency of the Kuhn–Tucker conditions, J. Math. Anal. Appl. 80 (2) (1981) 545–550.
[9] J.B. Hiriart-Urruty, Mean value theorems in nonsmooth analysis, Numer. Functional Anal. Optim. 2 (1) (1980) 1–30.
[10] G. Lebourg, Generic differentiability of Lipschitzian functions, Trans. Am. Math. Soc. 256 (1979) 125–144.
[11] O.L. Mangasarian, Nonlinear Programming, McGraw-Hill, New York, 1969.
[15] W. Rudin, Principles of Mathematical Analysis, McGraw-Hill, New York, 1976.