J. Math. Anal. Appl. 388 (2012) 753–759
Contents lists available at SciVerse ScienceDirect
Journal of Mathematical Analysis and Applications www.elsevier.com/locate/jmaa
Sequential decision problems on isolated time domains Ferhan M. Atıcı ∗ , Nezihe Turhan Western Kentucky University, Department of Mathematics and Computer Science, Bowling Green, KY 42101-3576, USA
a r t i c l e
i n f o
a b s t r a c t
Article history: Received 13 May 2011 Available online 5 October 2011 Submitted by A. Dontchev
In this paper, we shall study the deterministic dynamic sequence problem on isolated time domains. After introducing the Euler equations and the transversality condition, we shall prove that the Euler equations and transversality condition are sufficient for the existence of the optimal solution. We shall also introduce the Bellman equation on isolated time scales. This equation will generalize the well-known Bellman equation in the theory of dynamic programming. As an application in financial economics, we shall optimize a sequence problem of growth model on isolated time domains. © 2011 Elsevier Inc. All rights reserved.
Keywords: Dynamic programming Isolated time scales Bellman equation Optimization
1. Introduction Dynamic programming (DP) is a recursive method for solving sequential (or multistage) decision problems. The theory on DP enables economists and mathematicians to construct and solve a huge variety of sequential decision making problems both in deterministic (i.e. decision making under certainty) and stochastic (i.e. decision making under uncertainty) settings. A pioneering work has been done by a mathematician, Richard Bellman, who published his first paper [4] on the subject in 1952, and his first book [5] in 1957. The well-known Bellman equation, which is a necessary condition for optimal solution for a sequential decision problem, is named after him. Since Richard Bellman’s discovery of the theory, economists have been studying both discrete and continuous time models. Stokey and Lucas did groundwork in both deterministic and stochastic sequential decision problems [14]. Romer [12] studied many other economics models, such as, the Solow growth model, the Ramsey–Cass–Koopmans model, and the Diamond model on continuous time. Hassler analyzed and solved explicitly the discrete time growth model in [7]. For further reading, the book by Kamien and Schwartz [8] is an excellent source. In this paper, we intend to attract attention to a different view of the theory, which is called DP on isolated time domains where time is not uniformly distributed. A pioneering work in the intersection of economics and time scales has been done by Atıcı, Biles and Lebedinsky in [2]. Recently many papers published in this area of research [1,3,9]. In the paper by Malinowska, Martins and Torres [10], the weak maximality was obtained on infinite horizon by the use of the techniques of the theory of calculus of variations on time scales. Trading is not a uniformly discrete nor a continuous act. Let’s consider a stock market example. A stockholder purchases stocks of a particular company on varying time intervals. Assume that on the first day of the month, a stockholder purchased stocks in every hour, and in the following two days, the company did a good profit, which made the stockholder purchase several times in an hour. Unluckily, the company could not manage well the rest of the month, so the stockholder bought only one stock on the last day of the month. Since there is not a strict rule on the behavior of the stockholder, we cannot tell if purchases had been made on a regular and continuous base. On the contrary, depending on the management of the profit and goodwill of the company, the stockholder purchased on times which were not uniformly distributed. This
*
Corresponding author. E-mail addresses:
[email protected] (F.M. Atıcı),
[email protected] (N. Turhan).
0022-247X/$ – see front matter doi:10.1016/j.jmaa.2011.09.068
©
2011 Elsevier Inc. All rights reserved.
754
F.M. Atıcı, N. Turhan / J. Math. Anal. Appl. 388 (2012) 753–759
argument leads our research to time scales calculus, exclusively on isolated time intervals, and we formulate and solve the deterministic growth model on isolated time scales. The plan of the paper follows: In Section 2, we shall briefly summarize some basics of the time scale calculus only for isolated time domains. In Section 3, we shall introduce the deterministic dynamic sequence problem on isolated time domains and prove the main results of this paper. After introducing the Euler equations and the transversality condition, we shall prove that the Euler equations and transversality condition are sufficient for the existence of the optimal solution. We shall also introduce the Bellman equation on isolated time scales. This equation will generalize the well-known Bellman equation in the theory of DP. In Section 4, we shall optimize a sequence problem of growth model. 2. Isolated time scales Let T = {0 = t 0 , t 1 , t 2 , t 3 , . . .} be the set of positive real numbers such that t i < t j for i < j. Then for t i ∈ T, we define
σ (t i ) = t i+1 and ρ (t i ) = t i−1 and for every t ∈ T, μ(t ) = σ (t ) − t. If i = 0, we assume that ρ (t 0 ) = t 0 . With these properties,
the time domain T is known as an isolated time scale whose elements are isolated (see [6]). Examples of isolated time + h > 0 with μ(t ) = h, qZ ∪ {0} ) = 1, hZ+ = {hz: √z ∈ Z+ } where domains include the set of positive integers Z+ with μ(t√ √ + where q > 1 with μ(0) = 1, μ(t ) = (q − 1)t for t = 0 and Z with μ(t ) = t + 1 − t. Let f be a real-valued function defined on T. Then the -derivative of f is defined as
f (σ (t )) − f (t )
f (t ) =
μ(t )
where t ∈ T. It follows from this definition that for any given two real-valued functions f and g on a time scale T, the product rule is valid
f (t ) g (t )
= f (t ) g (t ) + f σ (t ) g (t ).
The -integral of f is defined as
T f (τ )τ =
μ(s) f (s),
s∈[t 0 , T )
t0
where t 0 , T ∈ T. We will use the notation e p (t , t 0 ) for the exponential function on the time domain T. If T := Z, then e p (t , t 0 ) = (1 + p )t −t0 . Here p is a constant so that 1 + p μ(t ) = 0 for all t ∈ T. p is defined to be − 1+μp(t ) p for all t ∈ Tκ . If 1 + p μ(t ) > 0, then e p (t , t 0 ) > 0 for all t ∈ T (for its proof, see [6]). 3. Dynamic sequence problem We define the deterministic sequence problem on isolated time scale T as
(SP)
sup
{x(t ),c (t )}t∞ =0
e β−1 (s, 0) F x(s), c (s) s, μ
s∈[0,∞)T
x σ (t ) = g t , x(t ), c (t ) , x(0) ∈ X given,
(3.1)
where (i) (ii) (iii) (iv) (v)
T ∩ [0, ∞) = [0, ∞)T , β is a discount factor with 0 < β < 1, x(t ) is the state variable, c (t ) is the control or choice variable, for each y, F (·, y ) is strictly increasing in its first variable and F is concave, continuously differentiable real-valued
objective function, (vi) X is the space of sequences {x(t )}t∞ =0 , (vii) g is concave and has continuous partial derivatives. We define the deterministic Bellman equation on T as
e β−1 (t , 0) V t , x(t ) μ
+ e β−1 (t , 0) F x(t ), c (t ) = 0 μ
where V (t , x(t )) is called the value function.
(3.2)
F.M. Atıcı, N. Turhan / J. Math. Anal. Appl. 388 (2012) 753–759
Lemma 3.1. Eq. (3.2) is equivalent to
V t , x(t ) = μ(t ) F x(t ), c (t ) + β V
755
σ (t ), x σ (t ) .
Proof. We first use the product rule on the left-hand side of Eq. (3.2), then we have
σ e t , x(t ) + e β−1 (t , 0) F x(t ), c (t ) = 0. β−1 (t , 0) V t , x(t ) + e β−1 (t , 0) V μ
μ
μ
Then applying the definition of delta derivative, we obtain
β −1 V σ (t , x(t )) − V (t , x(t )) + e β−1 (t , 0) F x(t ), c (t ) = 0. e β−1 (t , 0) V t , x(t ) + β e β−1 (t , 0) μ μ μ(t ) μ μ(t ) The equality
V t , x(t ) = μ(t ) F x(t ), c (t ) + β V σ t , x(t )
2
follows from algebra steps and the property that e β−1 (t , 0) > 0. μ
Theorem 3.1. Assume (SP ) attains its supremum at {x(t ), c (t )}t∞ =0 . Then
V t , x(t ) =
e β−1 (s, t ) F x(s), c (s) s
(3.3)
μ
s∈[t ,∞)T
solves the Bellman equation with initial condition at t = 0. Proof. It is sufficient to show that the value function (3.3) satisfies the Bellman equation (3.2). Then by plugging the value function, V (t , x(t )) on the right-hand side of Eq. (3.2), we have
μ(t ) F x(t ), c (t ) + β V σ (t ), x σ (t )
= μ(t ) F x(t ), c (t ) + β
e β−1 s, σ (t ) F x(s), c (s) s μ
s∈[σ (t ),∞)T
= μ(t ) F x(t ), c (t ) + β
e β−1 (s, 0)
s∈[σ (t ),∞)T
β = μ(t ) F x(t ), c (t ) + β
s∈[σ (t ),∞)T
e β−1 (σ (t ), 0)
F x(s), c (s) s
μ
e β−1 (s, 0) μ F x(s), c (s) s e β−1 (t , 0) μ
= μ(t ) F x(t ), c (t ) +
μ
e β−1 (s, t ) F x(s), c (s) s μ
s∈[σ (t ),∞)T
= μ(t ) F x(t ), c (t ) −
σ (t )
e β−1 (s, t ) F x(s), c (s) s +
e β−1 (s, t ) F x(s), c (s) s,
μ
μ
s∈[t ,∞)T
t
where we used some properties of the exponential function (see [11]). Note that σ (t )
e β−1 (s, t ) F x(s), c (s) s = μ(t )e β−1 (t , t ) F x(t ), c (t ) μ
t
Hence we have the desired result.
μ
= μ(t ) F x(t ), c (t ) .
2
Remark 3.1. In the literature, there is an equation called Hamilton–Jacobi–Bellman equation which is a main tool to optimize finite time problems on continuous time domains. On the other hand, Bellman equation was developed for infinite time horizon problems on uniformly distributed time domains. In the papers [13] and [15], the Hamilton–Jacobi–Bellman equation was introduced on any time scale.
756
F.M. Atıcı, N. Turhan / J. Math. Anal. Appl. 388 (2012) 753–759
Next we shall introduce the Euler equations and the transversality condition. The Euler equations of the Bellman equation follow from the Envelope Theorem:
β V x σ (t ), g t , x∗ (t ), c ∗ (t ) gc t , x∗ (t ), c ∗ (t ) + μ(t ) F c x∗ (t ), c ∗ (t ) = 0,
(3.4)
and
V x t , x∗ (t ) = β V x
σ (t ), g t , x∗ (t ), c ∗ (t ) g x t , x∗ (t ), c ∗ (t ) + μ(t ) F x x∗ (t ), c ∗ (t ) .
(3.5)
The transversality condition states that
lim e β−1 ( T , 0) V x T , x∗ ( T ) x∗ ( T ) = 0,
T →∞
where
(3.6)
μ
V x T , x (T ) = ∗
e β−1 (s, T ) F x x∗ (s), c ∗ (s) s. μ
s∈[ T ,∞)T
Remark 3.2. The proof to the transversality condition (3.6) is given in a more general setting in the paper [10]. The following theorem assures the sufficiency of the Euler equations and transversality condition for optimal solution of the sequential decision problem. Theorem 3.2. If the sequence {x∗ (t ), c ∗ (t )}t∞ =0 satisfies (3.4), (3.5) and (3.6), then it is optimal for the problem (SP). Proof. It is sufficient to show that, the difference between the objective functions in (SP ) evaluated at {x∗ (t ), c ∗ (t )} and at {x(t ), c (t )} is nonnegative. Therefore, we start by setting the difference as
(SP)∗ − (SP) = lim
T →∞ s∈[0, T )T
e β−1 (s, 0) F x∗ (s), c ∗ (s) − F x(s), c (s) s. μ
Since F is concave, continuous and differentiable, we have
∗
(SP) − (SP) lim
T →∞ s∈[0, T )T
e β−1 (s, 0) F x x∗ (s), c ∗ (s) x∗ (s) − x(s) + F c x∗ (s), c ∗ (s) c ∗ (s) − c (s) s μ
= C.
(3.7)
If we rewrite the Euler equations (3.4) and (3.5), then we obtain
β V x σ (s), g s, x∗ (s), c ∗ (s) gc s, x∗ (s), c ∗ (s) , μ(s)
β 1 V x σ (s), g s, x∗ (s), c ∗ (s) g x s, x∗ (s), c ∗ (s) + V x s, x∗ (s) . μ(s) μ(s)
F c x∗ (s), c ∗ (s) = − and
F x x∗ (s), c ∗ (s) = −
Now we substitute F c and F x on the right side of (3.7). Then we have
C = lim
T →∞ s∈[0, T )T
e β−1 (s, 0) μ
1
μ(s)
V x s, x∗ (s) x∗ (s) − x(s)
− β V x σ (s), g s, x∗ (s), c ∗ (s) g x s, x∗ (s), c ∗ (s) x∗ (s) − x(s) + gc s, x∗ (s), c ∗ (s) c ∗ (s) − c (s) s 1 ∗ ∗ lim e β−1 (s, 0) V x s, x (s) x (s) − x(s) μ T →∞ μ(s) s∈[0, T )T
− β V x σ (s), g s, x∗ (s), c ∗ (s) g s, x∗ (s), c ∗ (s) − g s, x(s), c (s) s 1 ∗ ∗ = lim e β−1 (s, 0) V x s, x (s) x (s) − x(s) − β V x σ (s), x∗ σ (s) x∗ σ (s) − x σ (s) s, μ T →∞ μ(s) s∈[0, T )T
since x(σ (t )) = g (t , x(t ), c (t )).
F.M. Atıcı, N. Turhan / J. Math. Anal. Appl. 388 (2012) 753–759
757
By plugging
x
σ (s) = x(s) + μ(s)x (s)
in the last expression above, we have
I = lim
e β−1 (s, 0)
T →∞ s∈[0, T )T
μ
1
μ(s)
V x s, x∗ (s) x∗ (s) − x(s)
− β V x σ (s), x∗ σ (s) x∗ (s) + μ(s)x∗ (s) − x(s) − μ(s)x (s) s 1 ∗ ∗ = lim e β−1 (s, 0) V x s, x (s) x (s) − x(s) μ T →∞ μ(s) s∈[0, T )T
∗ − β V x σ (s), x∗ σ (s) x (s) − x(s) + μ(s) x∗ (s) − x (s) s
1 β V x σ (s), x∗ σ (s) μ(s) x∗ (s) − x (s) = lim −e β−1 (s, 0) μ T →∞ μ(s) s∈[0, T )T
+ e β−1 (s, 0) μ
1
μ(s)
V x s, x∗ (s) − β V x
σ (s), x∗ σ (s)
x∗ (s) − x(s)
s.
By using the recurrence relation of the exponential function, i.e., e σβ−1 (t , 0) = β e β−1 (t , 0), we rewrite the above equation. μ
μ
Then, we have
I = lim
T →∞ s∈[0, T )T
−e β−1 σ (s), 0 V x σ (s), x∗ σ (s) x∗ (s) − x (s) μ
e σβ−1 (s, 0) V xσ (s, x∗ (s)) − e β−1 (s, 0) V x (s, x∗ (s)) μ
μ
−
μ(s)
x∗ (s) − x(s)
s
−e β−1 σ (s), 0 V x σ (s), x∗ σ (s) x∗ (s) − x (s) s
= lim
T →∞ s∈[0, T )T
μ
− lim
T →∞ s∈[0, T )T
e β−1 (s, 0) V x s, x∗ (s) μ
x∗ (s) − x(s) s.
In the first integral above, we use the integration by parts formula on time scales, i.e.,
u σ (t ) v (t )t = (uv )(t ) −
u (t ) v (t )t .
Now let
u
σ (t ) = −e β−1 σ (s), 0 V x σ (s), x∗ σ (s)
μ
and
v (t ) = x∗ (s) − x (s) ,
and apply integration by parts formula. Then, we have
I = lim
T →∞ s∈[0, T )T
−e β−1 σ (s), 0 V x σ (s), x∗ σ (s) x∗ (s) − x (s) s
− lim
T →∞ s∈[0, T )T
μ
e β−1 (s, 0) V x s, x∗ (s) μ
x∗ (s) − x(s) s
T = − lim e β−1 (t , 0) V x t , x (t ) x∗ (t ) − x(t ) 0 + lim T →∞
T →∞ s∈[0, T )T
μ
+ lim
T →∞ s∈[0, T )T
∗
e β−1 (s, 0) V x s, x∗ (s) μ
∗ − e β−1 (s, 0) V x s, x∗ (s) x (s) − x(s) s μ
= − lim e β−1 ( T , 0) V x T , x∗ ( T ) x∗ ( T ) − x( T ) − e β−1 (0, 0) V x 0, x∗ (0) x∗ (0) − x(0) . T →∞
μ
μ
x∗ (s) − x(s) s
758
F.M. Atıcı, N. Turhan / J. Math. Anal. Appl. 388 (2012) 753–759
Since x∗ (0) − x(0) = 0, we have
I = − lim e β−1 ( T , 0) V x T , x∗ ( T ) x∗ ( T ) − x( T ) μ T →∞ − lim e β−1 ( T , 0) V x T , x∗ ( T ) x∗ ( T ), T →∞
μ
where the last line uses the fact that F x 0 and x( T ) 0, for all t. It then follows from the transversality condition (3.6) that (SP )∗ − (SP ) 0. 2 4. An application in financial economics: Optimal growth model In this section, we optimize the social planner’s problem on isolated time scale T by the guessing method. In the social planner’s problem the objective function F is assumed to be as U which is called utility function. Let’s consider
sup
{k(σ (t ))}t∞ =0
k
e β−1 (s, 0)U k(s), c (s) s, μ
s∈[0,∞)T
σ (t ) = kα (t ) − c (t ),
k(0) ∈ X given,
(4.1)
where k(t ) is the state variable and c (t ) is the control variable. In this sequential decision problem, we use CRRA utility function, i.e., U (k(t ), c (t )) = ln(c (t )). Then, we have the Bellman equation as
V t , k(t ) = μ(t ) ln kα (t ) − k σ (t ) + β V σ (t ), k σ (t ) .
(4.2)
By guessing, we claim that the solution of this Bellman equation, V (t , k(t )), has the form
V t , k(t ) = A (t ) ln k(t ) + B (t ).
(4.3)
Theorem 4.1. For the sequence {k(σ (t ))}t∞ =0 , which maximizes (SP ), the value function V (t , k(t )) is equal to
V t , k(t ) = α e
1 −1
(t , 0)
1
1 − αβ
+ e 1 −1 (t , 0) B (0) − αβ μ
+ where
A (t ) = α e and
B (0) = 2
1 −1
αβ μ
e αβ−1 (s, 0)μ(s) ln k(t )
s∈[0,t )T
μ
e β−1 (s, 0) ln
β
μ
−
μ
s∈[0,t )T
μ(s)
μ(s) + β A (σ (s))
β A (σ (s)) β A (σ (s)) s , ln μ(s) μ(s) + β A (σ (s)) (t , 0)
ln(1 − α β) 1−β
1 1 − αβ
−
s∈[0,t )T
(4.4)
e αβ−1 (s, 0)μ(s)
(4.5)
μ
+
α β ln(α β) . (1 − β)(1 − α β)
Remark 4.1. Hassler [7] formulated the value function V (t , k(t )) as
ln(1 − α β) α α β ln(α β) + ln k(t ) + 1 − αβ 1−β (1 − β)(1 − α β) in discrete time. This follows immediately from Theorem 4.1 when T = Z. Since the value function V (t , k(t )), given with (4.3) is solution of the Bellman equation (4.2). Therefore, we have
A (t ) ln k(t ) + B (t ) = μ(t ) ln kα (t ) − k σ (t ) + β A σ (t ) ln k σ (t ) + B σ (t ) . Then by differentiating both sides of Eq. (4.6) with respect to k(σ (t )), we have
(4.6)
F.M. Atıcı, N. Turhan / J. Math. Anal. Appl. 388 (2012) 753–759
759
∂ ∂ A (t ) ln k(t ) + B (t ) = μ(t ) ln kα (t ) − k σ (t ) + β A σ (t ) ln k σ (t ) + B σ (t ) , ∂ k(σ (t )) ∂ k(σ (t )) 1 −1 + β A σ (t ) 0 = μ(t ) α . k (t ) − k(σ (t )) k(σ (t )) Using the budget constraint k(σ (t )) = kα (t ) − c (t ), we find the control variable c (t ) as
μ(t )
β A (σ (t )) = α , k (t ) − c (t ) μ(t )kα (t ) = μ(t ) + β A σ (t ) c (t ), μ(t ) c (t ) = kα (t ). μ(t ) + β A (σ (t )) c (t )
We find the state variable k(σ (t )) satisfies the following equation
k
σ (t ) = kα (t ) − c (t ) = kα (t ) −
μ(t )
μ(t ) + β A (σ (t )) β A (σ (t )) kα (t ). = μ(t ) + β A (σ (t ))
kα (t )
Remark 4.2. One can solve the Bellman equation, either by guessing or iteration of the value function. Since we formulate the sequential decision problem on infinite horizon, it is not practical to use iteration method. In our growth model, we guessed the value function as V (t , k(t )) = A (t ) ln(k(t )) + B (t ). We want to point out that the existence and uniqueness of the value function follows from the Contraction Mapping Theorem. References [1] [2] [3] [4] [5] [6] [7] [8] [9] [10] [11] [12] [13] [14] [15]
R. Almeida, D.F.M. Torres, Isoperimetric problems on time scales with nabla derivatives, J. Vib. Control 15 (6) (2009) 951–958. F.M. Atıcı, D.C. Biles, A. Lebedinsky, An application of time scales to economics, Math. Comput. Modelling 43 (2006) 718–726. F.M. Atıcı, D.C. Biles, A. Lebedinsky, A utility maximization problem on multiple time scales, Int. J. Dyn. Syst. Differ. Equ. 3 (1–2) (2011) 38–47. R.E. Bellman, On the theory of dynamic programming, Proc. Natl. Acad. Sci. USA 38 (8) (1952) 716–719. R.E. Bellman, Dynamic Programming, sixth ed., Princeton University Press, New Jersey, 1957. M. Bohner, A. Peterson, Dynamic Equations on Time Scales; An Introduction with Applications, Birkhäuser, Boston, 2001. J. Hassler, Dynamic optimization in discrete time, Mathematics II lectures, http://hassler-j.iies.su.se/Courses/matfuii/1999/lects199Bellman.pdf, November 22, 1999. M.I. Kamien, N.L. Schwartz, Dynamic Optimization: The Calculus of Variations and Optimal Control in Economics and Management, second ed., Elsevier, North-Holland, 1992. A.B. Malinowska, D.F.M. Torres, Euler–Lagrange equations for composition functionals in calculus of variations on time scales, Discrete Contin. Dyn. Syst. 29 (2) (2011) 577–593. A.B. Malinowska, N. Martins, D.F.M. Torres, Transversality conditions for infinite horizon variational problems on time scales, Optim. Lett. 5 (1) (2011) 41–53. E. Merrell, R. Ruger, J. Severs, First-order recurrence relations on isolated time scales, Panamer. Math. J. 14 (1) (2004) 83–104. D. Romer, Advanced Macroeconomics, third ed., McGraw–Hill, 1996. J. Seiffert, S. Sanyal, D.C. Wunsch, Hamilton–Jacobi–Bellman equations and approximate dynamic programming on time scales, IEEE Trans. Syst. Man Cybern., Part B, Cybern. 38 (4) (2008) 918–923. N. Stokey, R. Lucas, Recursive Methods in Economic Dynamics, Harvard, 1989. Z. Zhan, W. Wei, H. Xu, Hamilton–Jacobi–Bellman equations on time scales, Math. Comput. Modelling 49 (2009) 2019–2028.