Journal of Economic Dynamics & Control 37 (2013) 1284–1299
Contents lists available at SciVerse ScienceDirect
Journal of Economic Dynamics & Control journal homepage: www.elsevier.com/locate/jedc
Stochastic Stackelberg equilibria with applications to time-dependent newsvendor models Bernt Øksendal a,n,1, Leif Sandal b,2, Jan Ubøe b a b
Department of Mathematics, University of Oslo, P.O. Box 1053 Blindern, 0316 Oslo, Norway Norwegian School of Economics, Helleveien 30, 5045 Bergen, Norway
a r t i c l e in f o
abstract
Article history: Received 8 February 2012 Received in revised form 21 January 2013 Accepted 8 February 2013 Available online 6 March 2013
In this paper, we prove a maximum principle for general stochastic differential Stackelberg games, and apply the theory to continuous time newsvendor problems. In the newsvendor problem, a manufacturer sells goods to a retailer, and the objective of both parties is to maximize expected profits under a random demand rate. Our demand rate is an Itˆo–Le´vy process, and to increase realism information is delayed, e.g., due to production time. A special feature of our time-continuous model is that it allows for a price-dependent demand, thereby opening for strategies where pricing is used to manipulate the demand. & 2013 Elsevier B.V. All rights reserved.
JEL Classifications: C44 C61 C73 D81 Keywords: Stochastic differential games Delayed information Itoˆ –Le´vy processes Stackelberg equilibria Newsvendor models Optimal control of forward-backward stochastic differential equations
1. Introduction The one-period newsvendor model is a widely studied object that has attracted increasing interest in the last two decades. The basic setting is that a retailer wants to order a quantity q from a manufacturer. Demand D is a random variable, and the retailer wishes to select an order quantity q maximizing his expected profit. When the distribution of D is known, this problem is easily solved. The basic problem is very simple, but appears to have a never-ending number of variations. There is now a very large literature on such problems, and for further reading we refer to the survey papers by Cacho´n (2003) and Qin et al. (2011) and the numerous references therein. The (discrete) multiperiod newsvendor problem has been studied in detail by many authors, including Matsuyama (2004), Berling (2006), Bensoussan et al. (2007, 2009), Wang et al. (2010), just to quote some of the more recent
n
Corresponding author. E-mail addresses:
[email protected] (B. Øksendal),
[email protected] (L. Sandal),
[email protected] (J. Ubøe). 1 The research leading to these results has received funding from the European Research Council under the European Community’s Seventh Framework Programme (FP7/2007–2013)/ERC Grant agreement no [228087]. 2 The research leading to these results has received funding from NFR project 196433. 0165-1889/$ - see front matter & 2013 Elsevier B.V. All rights reserved. http://dx.doi.org/10.1016/j.jedc.2013.02.010
B. Øksendal et al. / Journal of Economic Dynamics & Control 37 (2013) 1284–1299
Nomenclature Main variables w
wholesale price per unit (chosen by the manufacturer)
q R D M S
1285
order quantity (rate chosen by the retailer) retail price per unit (chosen by the retailer) demand (random rate) production cost per unit (fixed) salvage price per unit (fixed)
contributions. Two papers whose approach is not unlike that used in our paper are Kogan (2003) and Kogan and Lou (2003), where the authors consider continuous time-scheduling problems. In many cases, demand is not known and the parties gain information through a sequence of observations. There is a huge literature on cases with partial information, e.g., Scarf (1958), Gallego and Moon (1993), Bensoussan et al. (2007), Perakis and Roels (2008), Wang et al. (2010), just to mention a few. When a sufficiently large number of observations have been made, the distribution of demand is fully revealed and can be used to optimize order quantities. This approach only works if the distribution of D is static, and leads to false conclusions if demand changes systematically over time. In this paper we will assume that the demand rate is a stochastic process Dt and we seek optimal decision rules for that case. In our paper, a retailer and a manufacturer write contracts for a specific delivery rate following a decision process in which the manufacturer is the leader who initially decides the wholesale price. Based on that wholesale price, the retailer decides on the delivery rate and the retail price. We assume a Stackelberg framework, and hence ignore cases where the retailer can negotiate the wholesale price. The contract is written at time td, and goods are received at time t. It is essential to assume that information is delayed. If there is no delay, the demand rate is known, and the retailer’s order rate is made equal to the demand rate. Information is delayed by a time d. One justification for this is that production takes time, and orders cannot be placed and effectuated instantly. It is natural to think about d as a production lead time. The single period newsvendor problem with price dependent demand is classical, see Whitin (1955). Mills (1959) refined the construction considering the case where demand uncertainty is added to the price-demand curve, while Karlin and Carr (1962) considered the case where demand uncertainty is multiplied with the price-demand curve. For a nice review of the problem with extensions see Petruzzi and Dada (1999). Stackelberg games for single period newsvendor problems with fixed retail price have been studied extensively by Lariviere and Porteus (2001), providing quite general conditions under which unique equilibria can be found. Multiperiod newsvendor problems with delayed information have been discussed in several papers, but none of these papers appears to make the theory operational. Bensoussan et al. (2009) use a time-discrete approach and generalize several information delay models. However, these are all under the assumption of independence of the delay process from inventory, demand, and the ordering process. They assert that removing this assumption would give rise to interesting as well as challenging research problems, and that a study of computation of the optimal base stock levels and their behavior with respect to problem parameters would be of interest. Computational issues are not explored in their paper, and they only consider decision problems for inventory managers, disregarding any game theoretical issues. Calzolari et al. (2011) discuss filtering of stochastic systems with fixed delay, indicating that problems with delay lead to nontrivial numerical difficulties even when the driving process is Brownian motion. In our paper, solutions to general delayed newsvendor equilibria are formulated in terms of coupled systems of stochastic differential equations. Our approach may hence be useful also in the general case where closed form solutions cannot be obtained. Stochastic differential games have been studied extensively in the literature. However, most of the works in this area have been based on dynamic programming and the associated Hamilton–Jacobi–Bellman–Isaacs type of equations for systems driven by Brownian motion only. More recently, papers on stochastic differential games based on the maximum principle (including jump diffusions) have appeared. See, e.g., Øksendal and Sulem (2012) and the references therein. This is the approach used in our paper, and as far as we know, the application to the newsvendor model is new. The advantage with the maximum approach is two-fold:
We can handle non-Markovian state equations and non-Markovian payoffs. We can deal with games with partial and asymmetric information. Fig. 1 shows a sample path of an Ornstein–Uhlenbeck process that is mean reverting around a level m ¼ 100. Even though the long-time average is 100, orders based on this average are clearly suboptimal. At, e.g., t ¼30, we observe a demand rate D30 ¼ 157. When the mean reversion rate is as slow as in Fig. 1, the information D30 ¼ 157 increases the odds that the demand rate is more than 100 at time t ¼37. If the delay d ¼ 7 (days), the retailer should hence try to exploit this extra information to improve performance. Based on the information available at time td, the manufacturer should offer the retailer a price per unit wt for items delivered at time t. Given the wholesale price wt and all available information, the retailer should decide on an order rate qt and a retail price Rt. The retail price can in principle lead to changes in demand, and in general the demand rate Dt is, hence, a function of Rt. However, such cases are hard to solve in terms of explicit expressions. We will also look at the simplified case where R is exogenously given and fixed. To carry out our construction, we will need to assume that items cannot be stored. That is of course a strong limitation, but applies to important cases like electricity markets and markets for fresh foods.
1286
B. Øksendal et al. / Journal of Economic Dynamics & Control 37 (2013) 1284–1299
Dt
200
δ 150
100
50
0
50
100
150
200
t
Fig. 1. An Ornstein–Uhlenbeck process with delayed information.
Assuming that both parties have full information about demand rate at time td, and that the manufacturer knows how much the retailer will order at any given unit price w, we are left with a Stackelberg game where the manufacturer is the leader and the retailer is the follower. To our knowledge, stochastic differential games of this sort have not been discussed in the literature previously. Before we can discuss game equilibria for the newsvendor problem, we must formulate and prove a maximum principle for general stochastic differential Stackelberg games. In the case where R is exogenously given and fixed, it seems reasonable to conjecture that our optimization problem could be reduced to solving a family of static newsvendor problems pointwise in t. Theorem 3.2.2 confirms that this approach provides the correct solution to the problem. Note, however, that our general framework is non-Markovian, and that solutions may depend on path properties of the demand. The paper is organized as follows. In Section 2, we set up a framework where we discuss general stochastic differential Stackelberg games. In Section 3, we use the machinery in Section 2 to consider a continuous-time newsvendor problem. In Section 4, we consider the special case where the demand rate is given by an Ornstein–Uhlenbeck process and provide explicit solutions for the unique equilibria that occur in that case. Examples with R-dependent demand are considered in Section 5. Finally, in Section 6 we offer some concluding remarks. 2. General stochastic differential Stackelberg games In this section, we will consider general stochastic differential Stackelberg games. In our framework, the state of the system is given by a stochastic process Xt. The game has two players. Player 1 (leader, denoted by L) can at time t choose a control uL(t) while player 2 (follower, denoted by F) can choose a control uF(t). The controls determine how Xt evolves in time. The performance for player i is assumed to be of the form Z T f i ðt,X t ,uL ðtÞ,uF ðtÞ, oÞ dt þ g i ðX T , oÞ , i ¼ L,F, ð1Þ J i ðuL ,uF Þ ¼ E d
where f i ðt,x,w,v, oÞ : ½0,T R Rl Rm O-R is a given F t -adapted process and g i ðx, oÞ : R O-R are given F T measurable random variables for each x,w,v; i ¼ L,F. We will assume that fi are C1 in v,w,x and that gi are C1 in x, i ¼ L,F. In our Stackelberg game, player 1 is the leader, and player 2 the follower. Hence when uL is revealed to the follower, the follower will choose uF to maximize J F ðuL ,uF Þ. The leader knows that the follower will act in this rational way. Suppose that for any given control uL there exists a map F (a ‘‘maximizer’’ map) that selects uF that maximizes JF ðuL ,uF Þ. The leader will hence choose uL ¼ unL such that uL /J L ðuL , FðuL ÞÞ is maximal for uL ¼ unL . In order to solve problems of this type we need to specify how the state of the system evolves in time. We will assume that the state of the system is given by a controlled jump diffusion of the form Z dX t ¼ mðt,X t ,uðtÞ, oÞ dt þ sðt,X t ,uðtÞ, oÞ dBt þ gðt,X t ,uðtÞ, x, oÞN~ ðdt,dxÞ ð2Þ R
Xð0Þ ¼ x 2 R, where the coefficients mðt,x,u, oÞ : ½0,T R U O-R, sðt,x,u, oÞ : ½0,T R U O-R Rn , gðt,x,u, x, oÞ : ½0,T R U R0 O-R are given continuous functions assumed to be continuously differentiable with respect to x and u, and ~ R0 ¼ R\f0g. Here Bt ¼ Bðt, oÞ; ðt, oÞ 2 ½0,1Þ O is a Brownian motion in Rn and Nðdt,d xÞ ¼ N~ ðdt,dx, oÞ is an independent compensated Poisson random measure on a filtered probability space ðO,F ,fF t gt Z 0 ,PÞ. See Øksendal and Sulem (2007) for more information about controlled jump diffusions. The set U ¼ UL UF is a given set of admissible control values uðt, oÞ. We assume that the control u ¼ uðt, oÞ consists of two components, u ¼ ðuL ,uF Þ, where the leader controls uL 2 Rl and the follower
B. Øksendal et al. / Journal of Economic Dynamics & Control 37 (2013) 1284–1299
1287
controls uF 2 Rm . We also assume that the information flow available to the players is given by the filtration fE t gt2½0,T , where for all t 2 ½0,T:
Et D F t
ð3Þ
For example, the case much studied in this paper is when E t ¼ F td
for all t 2 ½d,T,
ð4Þ
for some fixed information delay d 4 0. We assume that uL(t) and uF(t) are E t -predictable, and assume there is given a family AE ¼ AL,E AF,E of admissible controls contained in the set of E t -predictable processes. We now consider the following game theoretic situation: Suppose the leader decides her control process uL 2 AL,E . At any time t the value is immediately known to the follower. Therefore he chooses uF ¼ unF 2 AF,E such that uF /J F ðuL ,uF Þ is maximal for uF ¼ unF :
ð5Þ
Assume that there exists a measurable map F : AL,E -AF,E such that uF /J F ðuL ,uF Þ is maximal for uF ¼ unF ¼ FðuL Þ:
ð6Þ
The leader knows that the follower will act in this rational way. Therefore the leader will choose uL ¼ unL 2 AL,E such that uL /JL ðuL , FðuL ÞÞ is maximal for uL ¼ unL :
ð7Þ
The control u :¼ ðuL , FðuL ÞÞ 2 AL,E AF,E is called a Stackelberg equilibrium for the game defined by (1)–(2). In the newsvendor problem studied in this paper, the leader is the manufacturer who decides the wholesale price uL ¼ w for the ð2Þ retailer, who is the follower, and who decides the order rate uð1Þ F ¼ q and the retailer price uF ¼ R. Thus uF ¼ ðq,RÞ. We may summarize (5) and (7) as follows: n
n
n
max J F ðuL ,uF Þ ¼ J F ðuL , FðuL ÞÞ
ð8Þ
max JL ðuL , FðuL ÞÞ ¼ J L ðunL , FðunL ÞÞ
ð9Þ
uF 2AF,E
and uL 2AL,E
We see that (8) and (9) constitute two consecutive stochastic control problems with partial information, and hence we can, under some conditions, use the maximum principle for such problems as presented in Øksendal and Sulem (2012) (see also, e.g., Framstad et al., 2004; Baghery and Øksendal, 2007) to find a maximum principle for Stackelberg equilibria. To this end, we define the Hamiltonian HF ðt,x,u,aF ,bF ,cF ðÞ, oÞ : ½0,T R U R Rn þ 1 R O-R by Z ð10Þ HF ðt,x,u,aF ,bF ,cF ðÞ, oÞ ¼ f F ðt,x,u, oÞ þ mðt,x,u, oÞaF þ sðt,x,u, oÞbF þ gðt,x,u, x, oÞcF ðxÞnðdxÞ; R
where R is the set of functions cðÞ : R0 -R such that (10) converges, n is a Le´vy measure. For simplicity of notation the explicit dependence on o 2 O is suppressed in the following. The adjoint equation for HF in the unknown adjoint processes aF ðtÞ,bF ðtÞ, and cF ðt, xÞ is the following backward stochastic differential equation (BSDE): Z @HF ~ ðt,XðtÞ,uðtÞ,aF ðtÞ,bF ðtÞ,cF ðt,ÞÞ dt þbF ðtÞ dBt þ cF ðt, xÞNðdt,d xÞ; 0 rt r T ð11Þ daF ðtÞ ¼ @x R aF ðTÞ ¼ g 0F ðXðTÞÞ
ð12Þ
Here XðtÞ ¼ X u ðtÞ is the solution to (2) corresponding to the control u 2 AE . Next, assume that there exists a function f : ½0,T UL O-UF such that
FðuL ÞðtÞ ¼ fðt,ul ðtÞÞ i:e FðuL Þ ¼ fð,uL ðÞÞ f
ð13Þ nþ1
Define the Hamiltonian HL ðt,x,uL ,aL ,bL ,cL ðÞÞ : ½0,T R UL R R
R-R by Z f HL ðt,x,uL ,aL ,bL ,cL ðÞÞ ¼ f L ðt,x,uL , FðuL ÞÞ þ mðt,x,uL , FðuL ÞÞaL þ sðt,x,uL , FðuL ÞÞbL þ gðt,x,uL , FðuL Þ, xÞcL ðxÞnðdxÞ:
ð14Þ
R
f
The adjoint equation (for HL ) in the unknown processes aL ðtÞ,bL ðtÞ,cL ðt, xÞ is the following BSDE: Z f @H ~ daL ðtÞ ¼ L ðt,XðtÞ,uL ðtÞ, fðt,uL ðtÞÞ,aL ðtÞ,bL ðtÞ,cL ðt,ÞÞ dt þ bL ðtÞ dBt þ cL ðt, xÞNðdt,d xÞ; 0 r t rT @x R aL ðTÞ ¼ g 0L ðXðTÞÞ: uL , FðuL Þ
ð15Þ ð16Þ
ðtÞ is the solution to (2) corresponding to the control uðtÞ :¼ ðuL ðtÞ, fðt,uL ðtÞÞÞ; t 2 ½0,T, assuming that this Here XðtÞ ¼ X is admissible. We make the following assumptions: (A1) For all ui 2 Ai,E and all bounded bi 2 Ai,E there exists E 40 such that ui þsbi 2 Ai,E for all s 2 ðE, EÞ; i ¼ L,F:
1288
B. Øksendal et al. / Journal of Economic Dynamics & Control 37 (2013) 1284–1299
(A2) For all t 0 2 ½0,T and all bounded E t0 -measurable random variables ai , the control process bi ðtÞ defined by ai if t 2 ½t0 ,T bi ðtÞ ¼ ; t 2 ½0,T, 0 otherwise belongs to Ai,E ; i ¼ L,F. (A3) For all ui , bi 2 Ai,E with bi bounded, the derivative processes d xL ðtÞ ¼ ðX uL þ sbL ,uF ðtÞÞ ds s¼0
xF ðtÞ ¼
d uL ,uF þ sbF ðX ðtÞÞ ds s¼0
exist and belong to L2 ðl PÞ, where l denotes Lebesgue measure on ½0,T. We can now formulate our maximum principle for Stackelberg equilibria: Theorem 1 (Maximum principle). Assume that (13) and (A1)–(A3) hold. Put u ¼ ðuL ,uF Þ ¼ ðuL , FðuL ÞÞ where F : UL -UF , and let XðtÞ,ðai ,bi ,ci Þ be the corresponding solutions of (2), (11)–(12) (for i ¼ F) and (15)–(16) (for i ¼ L), respectively. Suppose that for all bounded bi 2 Ai,E , i ¼ L,F we have "Z ( ! 2 Z 2 T @s @s @g @g ðtÞxi ðiÞ þ ðt, zÞxi ðtÞ þ ðai ðtÞÞ2 ðtÞbi ðtÞ þ ðt, zÞbi ðtÞ nðdzÞ E @x @ui @ui R @x 0 Z 2 dt o 1: ð17Þ þ xi ðtÞ ðbi ðtÞÞ2 þ ðci ðt, zÞÞ2 nðdzÞ R
Then the following, (I) and (II), are equivalent. (I) d d ðJ F ðuL , FðuL Þ þ sbF ÞÞ ¼ ðJ L ðuL þ sbL , FðuL þ sbL ÞÞÞ ¼0 ds ds s¼0 s¼0 for all bounded bL 2 AL,E , bF 2 AF,E . (II) @ HF ðt,XðtÞ,uL ðtÞ,vF ,aF ðtÞ,bF ðtÞ,cF ðt,ÞÞE t E @vF vF
¼0
ð18Þ
ð19Þ
¼ FðuL Þ
for j ¼1, 2 and @ f HL ðt,XðtÞ,vL ,aL ðtÞ,bL ðtÞ,cL ðt,ÞÞE t ¼0 E @vL vL ¼ uL ðtÞ
ð20Þ
Proof. This follows by first applying the maximum principle for optimal control with respect to uF 2 AF,E of the state process X uL ,uF ðtÞ for fixed uL 2 AL,E , as presented in Øksendal and Sulem (2012). See also Framstad et al. (2004), Baghery and Øksendal (2007), and Øksendal and Sulem (2007). Next we apply the same maximum principle with respect to uL 2 AL,E of the state process X uL , FðuL Þ ðtÞ, for the given function F. We omit the details. & Corollary 1. Suppose ðuL , FðuL ÞÞ is a Stackelberg equilibrium for the game (1)–(2) and that (13), (A1)–(A3), and (17) are satisfied. Then the first order conditions (19)–(20) hold. 3. A continuous time newsvendor problem In this section, we will formulate a continuous time newsvendor problem and use the results in Section 2 to describe a set of explicit equations that we need to solve to find Stackelberg equilibria. We will assume that the demand rate for a good is given by a (possibly controlled) stochastic process Dt. A retailer is at time td offered a unit price wt for items to be delivered at time t. Here d 4 0 is the delay time. At time td, the retailer chooses an order rate qt. The retailer also decides a retail price Rt. We assume that items can be salvaged at a unit price S Z0, and that items cannot be stored, i.e., they must be sold instantly or salvaged. Remarks. The delay d can be interpreted as a production lead time, and it is natural to assume that wt and qt should both be settled at time td. In general the retail price Rt can be settled at a later stage. To simplify notation we assume that Rt, too, is settled at time td. The assumption that items cannot be stored is, of course, quite restrictive. Many important cases lead to assumptions of this kind; we mention in particular the electricity market and markets for fresh foods. Assuming that sale will take part in the time period d rt r T, the retailer will get an expected profit Z T J F ðw,q,RÞ ¼ E ðRt SÞ min½Dt ,qt ðwt SÞqt dt d
ð21Þ
B. Øksendal et al. / Journal of Economic Dynamics & Control 37 (2013) 1284–1299
1289
When the manufacturer has a fixed production cost per unit M, the manufacturer will get an expected profit Z T JL ðw,q,RÞ ¼ E ðwt MÞqt dt
ð22Þ
d
Technical remarks To solve these problems mathematically, it is convenient to apply an equivalent mathematical formulation: At time t the retailer orders the quantity t for immediate delivery, but the information at that time is the delayed information F td about the demand d units of time. Similarly, when the manufacturer delivers the ordered quantity qt at time t, the unit price wt is based on F td . From a practical point of view this formulation is entirely different, but leads to the same optimization problem. 3.1. Formalized information We will assume that our demand rate is given by a (possibly controlled) process of the form Z ~ dDt ¼ mðt,Dt ,Rt , oÞ dt þ sðt,Dt ,Rt , oÞ dBt þ gðt,Dt ,Rt , x, oÞNðdt,d xÞ; t 2 ½0,T R
D0 ¼ d0 2 R:
ð23Þ
~ are driving the stochastic differential equation in (23), and Brownian motion Bt and the compensated Poisson term Nðt,dzÞ it is hence natural to formalize information with respect to these objects. We therefore let F t denote the s-algebra ~ generated by Bs and Nðs,dzÞ, 0 r s rt. Intuitively F t contains all the information up to time t. When information is delayed, we instead consider the s-algebras E t :¼ F td
t 2 ½d,T
ð24Þ
Both the retailer and the manufacturer should base their actions on the delayed information. Technically that means that qt and wt should be E t -adapted, i.e., q and w should be E-predictable processes. In principle, the retail price Rt can be settled at a later stage. This case is possible to handle, but leads to complicated notation. We hence only consider the case where Rt is E-predictable. 3.2. Finding Stackelberg equilibria in the newsvendor model We now apply our general result for stochastic Stackelberg games to the newsvendor problem. In the newsvendor problem, we have the control u ¼ ðuL ,uF Þ where uL ¼ w is the wholesale price, and uF ¼ ðq,RÞ with q the order rate and R the retail price. Moreover Xt ¼Dt, f L ðt,XðtÞ,uðtÞÞ ¼ ðwt MÞqt ,
g L ¼ 0,
ð25Þ
f F ðt,XðtÞ,uðtÞÞ ¼ ðRt SÞ minðDt ,qt Þðwt SÞqt ,
and
g F ¼ 0:
ð26Þ
Therefore by (10) HF ðt,Dt ,qt ,Rt ,wt ,aF ðtÞ,bF ðtÞ,cF ðt,ÞÞ ¼ ðRt SÞ minðDt ,qt Þðwt SÞqt Z þ aF ðtÞmðt,Dt ,Rt Þ þ bF ðtÞsðt,Dt ,Rt Þ þ gðt,Dt ,Rt , xÞcF ðxÞnðdxÞ
ð27Þ
R
Similarly by (14), with uF ¼ fðuL Þ ¼ ðf1 ðwÞ, f2 ðwÞÞ ¼ ðqðwÞ,RðwÞÞ, f
HL ðt,Dt ,wt ,aL ðtÞ,bL ðtÞ,cL ðt,ÞÞ
ð28Þ
¼ ðwt MÞf1 ðt,wðtÞÞ þaL ðtÞmðt,Dt , f2 ðt,wðtÞÞÞ þ bL ðtÞsðt,Dt , f2 ðt,wðtÞÞÞÞ
ð29Þ
Z þ R
cL ðt, xÞgðt,Dt , f2 ðt,wðtÞÞ, xÞnðdxÞ
ð30Þ
Here we have assumed that the dynamics of Dt only depends on the control Rt ¼ f2 ðt,wðtÞÞ and has the general form Z ~ dDt ¼ mðt,Dt ,Rt Þ dt þ sðt,Dt ,Rt Þ dBt þ gðt,Dt ,Rt , xÞNðdt,d xÞ; t 2 ½0,T ð31Þ R
D0 ¼ d0 2 R
ð32Þ 1
where mðt,D,RÞ, sðt,D,RÞ and gðt,D,T, xÞ are continuous with respect to t and continuously differentiable (C ) with respect to D and R. We choose AL,E ,AF,E to be the set of all E-predictable processes with values in UL ¼ R and UF ¼ R2 respectively, where E t ¼ F td as above. Then we see that assumptions (A1)–(A3) hold, with xL ðtÞ and xF ðtÞ given by xL ðtÞ ¼ 0 for all t 2 ½0,T and Z @m @s @g ~ ðt,Dt ,Rt Þ dt þ ðt,Dt ,Rt Þ dBt þ ðt,Dt ,Rt , xÞNðdt,d xÞ dxF ðtÞ ¼ xF ðtÞ @D @D R @D
1290
B. Øksendal et al. / Journal of Economic Dynamics & Control 37 (2013) 1284–1299
þ bF ðtÞ
Z @m @s @g ~ ðt,Dt ,Rt Þ dt þ ðt,Dt ,Rt Þ dBt þ ðt,Dt ,Rt , xÞNðdt,d xÞ ; t 2 ½0,T, @R @R R @R
xF ð0Þ ¼ 0
ð33Þ ð34Þ
where denotes a vector product. To find a Stackelberg equilibrium we use Theorem 2.1. Hence by (19) we get the ^ ^ R^ t ¼ f2 ðt, wðtÞÞ: following first order conditions for the optimal values q^ t ¼ f1 ðt, wðtÞÞ,
and
E½ðR^ t SÞX ½0,Dt ðq^ t Þwt þS9E t ¼ 0
ð35Þ
Z @m ^ þ bF ðtÞ @s ðt,Dt , RÞ ^ þ cF ðt, xÞ @g ðt,Dt , R, ^ xÞnðdxÞE t ¼ 0: ðt,Dt , RÞ E minðDt , q^ t Þ þaF ðtÞ @R @R @R R
ð36Þ
Here X ½0,Dt ðq^ t Þ denotes the indicator function for the interval ½0,Dt , i.e., a function that has the value 1 if 0 r q^ t r Dt , and is zero otherwise. Let q^ t ¼ F1 ðwÞðtÞ, R^ t ¼ F2 ðwÞðtÞ be the solution of this coupled system. Next, by (20) we get the first-order condition @m ^ ^ ^ ^ ^ t MÞf01 ðt, wðtÞÞ ðt,Dt f2 ðt, wðtÞÞÞ þ f1 ðt, wðtÞÞ þ f02 ðt, wðtÞÞE aL ðtÞ ðw @R Z @s @g ^ ^ ðt,Dt , f2 ðt, wðtÞÞÞþ þ bL ðtÞ cL ðt, xÞ ðt,Dt , f2 ðt, wðtÞÞ, xÞnðdxÞE t ¼ 0 ð37Þ @R @R R ^ t . We summarize what we have proved in the following theorem. for the optimal value w Theorem 3.2.1. Suppose un is a Stackelberg equilibrium for the newsvendor problem with state Xt ¼Dt given by (31) and performance functionals Z T J L ðw,ðq,RÞÞ ¼ E ðwt MÞqt dt ðmanufacturer’s profitÞ, ð38Þ d
Z T ððRt SÞ minðDt ,qt Þðwt SÞqt Þ dt J F ðw,ðq,RÞÞ ¼ E
ðretailer’s profitÞ:
ð39Þ
d
Assume that (13), (A1)–(A3), and (17) hold. Let q^ t ¼ f1 ðt,wðtÞÞ, R^ t ¼ f2 ðt,wðtÞÞ be the solution of (35)–(36). Assume that fi 2 C 1 ^ t be the solution of (37). Then and that the conditions of Theorem 2.1 are satisfied. Let w ^ t ,ðf1 ðt, wðtÞÞ, ^ ^ f2 ðt, wðtÞÞÞ 2 AE : un ¼ ðw
In other words max fJ F ðw,ðq,RÞÞg ¼ JF ðw,ðf1 ð,w Þ, f2 ð,w ÞÞÞ
ð40Þ
ðq,RÞ2AF,E
and ^ f1 ð, w ^ Þ, f2 ð, w ^ ÞÞÞ: max fJ L ðw,ðF1 ðwÞ, F2 ðwÞÞÞg ¼ J L ðw,ð
w2AL,E
ð41Þ
Remark. Note that if R is fixed and cannot be chosen by the retailer, then (36) is irrelevant and we are left with (35) leading to the simpler equations in Theorem 3.2.2. In the special case when Dt does not depend on Rt, we get: Theorem 3.2.2. Assume that Dt has a continuous distribution, that Dt does not depend on Rt and that Rt ¼ R is exogenously given and fixed. For any given wt with S o M r wt rR consider the equation E½ðRSÞX ½0,Dt ðqt Þwt þ S9E t ¼ 0
ð42Þ
Let qt ¼ f1 ðt,wðtÞÞ denote the unique solution of (42), and assume that the function wt /E½ðwt MÞf1 ðt,wt Þ9E t
ð43Þ
^ is a unique Stackelberg equilibrium for the ^ t . Then with q^ t ¼ f1 ðt, wðtÞÞ ^ ^ qÞ the pair ðw, has a unique maximum at wt ¼ w newsvendor problem defined by (22) and (21). Proof. To see why (42) always has a unique solution, note that wt is E t -measurable and hence (42) is equivalent to E½X ½0,Dt ðqt Þ9E t ¼
wt S RS
ð44Þ
Existence and uniqueness of qt then follows from monotonicity of conditional expectation. Uniqueness of the Stackelberg ^ equilibrium follows from Theorem 3.2.1. To see that the candidate q^ t ¼ F1 ðwÞðtÞ is indeed a Stackelberg Equilibrium, we
B. Øksendal et al. / Journal of Economic Dynamics & Control 37 (2013) 1284–1299
1291
^ t is unique, any other wt will lead to strictly lower expected profit at time t. As argue as follows: Since the maximum w demand does not depend on wt, low expected profit at one point in time cannot be compensated by higher expected profits ^ t a.s. l P (l denotes Lebesgue measure on ½0,T) is false, any such strategy will lead later on. Hence if the statement wt ¼ w ^ is a Stackelberg ^ qÞ to strictly lower expected profits. The same argument applies for the retailer, and hence ðw, equilibrium. & To avoid degenerate cases we need to know that Dt has a continuous distribution. In the next sections we will consider special cases, and we will often be able to write down explicit solutions to (42) and prove that (43) has a unique maximum. Notice that (42) is an equation defined in terms of conditional expectation. Conditional statements of this type are in general difficult to compute, and the challenge is to state the result in terms of unconditional expectations. 4. Explicit solution formulas In this section we will assume that the conditions of Theorem 2.1 hold. 4.1. The Ornstein–Uhlenbeck process with constant coefficients In this section, we offer explicit formulas for the equilibria that occur when the demand rate is given by a constant coefficient Ornstein–Uhlenbeck process, i.e., the case where Dt is given by dDt ¼ aðmDt Þ dt þ s dBt
ð45Þ
where a, m, and s are constants. The Ornstein–Uhlenbeck process is important in many applications. In particular, it is commonly used as a model for the electricity market. The process is mean reverting around the constant level m, and the constant a decides the speed of mean reversion. The explicit solution to (45) is Z t Dt ¼ D0 eat þ mð1eat Þ þ seaðstÞ dBs ð46Þ 0
It is easy to see that Dt ¼ Dtd ead þ mð1ead Þ þ
Z
t
seaðstÞ dBs
ð47Þ
td
Because the last term is independent of E t with a normal distribution Nð0, s2 ð1e2ad Þ=2a), it is easy to find a closed form solution to (42). We let G½z denote the cumulative distribution of a standard normal distribution, and G1 ½z its inverse. The final result can be stated as follows: Proposition 4.1.1. For each y 2 R, let Fy : ½M,R-R denote the function rffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi 1e2ad 1 wS Fy ½w ¼ yead þ mð1ead Þ þ s 1 G RS 2a
ð48Þ
and let Cy : ½M,R-R denote the function Cy ½w ¼ ðwMÞFy ½w. If Fy ½M 40, the function Cy is quasiconcave and has a unique maximum with a strictly positive function value. At time td the parties should observe y ¼ Dtd , and a unique Stackelberg equilibrium is obtained at Fy ½Argmax½Cy if Fy ½M 4 0 Argmax½Cy if Fy ½M 4 0 wnt ¼ qnt ¼ M otherwise 0 otherwise
ð49Þ
To prove Proposition 4.1.1, we need the following lemma. Lemma 4.1.2. In this lemma G½x is the cumulative distribution function of the standard normal distribution. Let 0 r m r 1, and for each m consider the function hm : R-R defined by hm ½z ¼ zð1mG½zÞG0 ½z
ð50Þ
hm ½z o 0
ð51Þ
Then for all z 2 R
Proof of Lemma 4.1.2. Note that if z Z 0, then hm ½z r h0 ½z and if z r 0, then hm ½z r h1 ½z. It hence suffices to prove the lemma for m ¼0 and m¼1. Using G00 ½z ¼ z G0 ½z, it is easy to see that h00m ½z ¼ G0 ½z r 0. If m¼ 0, it is straightforward to check that h0 is strictly increasing, and that limz- þ 1 h0 ½z ¼ 0. If m¼1, it is straightforward to check that h1 ½z is strictly decreasing, and that limz-1 h1 ½z ¼ 0. This proves that h0 and h1 are strictly negative, completing the proof of the lemma. &
1292
B. Øksendal et al. / Journal of Economic Dynamics & Control 37 (2013) 1284–1299
Proof of Proposition 4.1.1. From (47), we easily see that the statement qt r Dt is equivalent to the inequality Z t qt ðDtd ead þ mð1ead Þ r seaðstÞ dBs
ð52Þ
td
The left-hand side is E t -measurable, while the right-hand side is normally distributed and independent of E t . Using the Itˆo isometry, we see that the right-hand side has expected value zero and variance s2 ð1e2ad Þ=2a. It is then straightforward to see that 2 3
a d a d 6qt Dtd e
þ mð1e Þ 7 7 rffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi ð53Þ E X ½0,Dt ðq^ t Þ9E t ¼ 1G6 4 5 s2 ð1e2ad Þ 2a and (48) follows trivially from (44). It remains to prove that the function Cy has a unique maximum if Fy ½M 40. First put y^ ¼
y ead þ mð1ead Þ rffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi 1e2ad s 2a
ð54Þ
and note that Cy is proportional to wS ðwMÞ y^ þ G1 1 RS
ð55Þ
We make a monotone change of variables using z ¼ G1 ½1ðwSÞ=ðRSÞ. With this change of variables we see that Cy is proportional to MS ðRSÞ 1G½z ðy^ þ zÞ ð56Þ RS Put m ¼ ðMSÞ=ðRSÞ, and note that Cy is proportional to ð1mG½zÞðy^ þ zÞ
ð57Þ 1
Fy ½M 4 0 is equivalent to y^ þ G ½1m 40, and the condition wZ M is equivalent to z rG then 0 r mr 1. For each fixed 0 r mr 1, y^ 2 R consider the function f m ½z ¼ ð1mG½zÞðy^ þ zÞ
on the interval y^ rz r G1 ½1m
1
½1m. Note that if S r M r R,
ð58Þ
If y^ þ G1 ½1m 4 0, the interval is nondegenerate and nonempty, and f 0m ½z ¼ G0 ½zðy^ þzÞ þ ð1mG½zÞ
ð59Þ
^ 40, and that f m ½y ^ ¼ f m ½G1 ½1m ¼ 0. These functions therefore have at least one strictly positive Note that f 0m ½y maximum. To prove that the maximum is unique, assume that f 0m ½z0 ¼ 0, and compute f 00m ½z0 . Using G00 ½z ¼ z g 0 ½z, it follows that: f 00m ½z0 ¼ z0 ð1mG½z0 Þ2G0 ½z0 o z0 ð1mG½z0 ÞG0 ½z0 o0,
ð60Þ
by Lemma 4.1.2. The function is thus quasiconcave and has a unique, strictly positive maximum. Existence of a unique Stackelberg equilibrium then follows from Theorem 3.2.2. & The condition Fy ½M 4 0 has an obvious interpretation. The manufacturer cannot offer a wholesale price w lower than the production cost M. If Fy ½M r0, it means that the retailer is unable to make a positive expected profit even at the lowest wholesale price the manufacturer can offer. When that occurs, the retailer’s best strategy is to order q¼0 units. When the retailer orders q¼0 units, the choice of w is arbitrary. However, the choice w¼M is the only strategy that is increasing and continuous in y. Given values for the parameters a, m, s,S,M,R, and d, the explicit expression in (48) makes it straightforward to construct the deterministic function y/Argmax½Cy numerically. Two different graphs of this function are shown in Fig. 2. Fig. 3 shows the corresponding function Fy ½Argmax½Cy . In the construction we used a delay d ¼ 7 and d ¼ 30, with the parameter values a ¼ 0:05
m ¼ 100 s ¼ 12 R ¼ 10 S ¼ 1 M ¼ 2
ð61Þ
Note that the manufacturing cost M¼2 is relatively low, and Fy ½M 40 is satisfied for all y4 0 in these cases. It is interesting to note that the equilibria change considerably when the delay increases from d ¼ 7 to d ¼ 30 (notice the scale on the y-axis).
B. Øksendal et al. / Journal of Economic Dynamics & Control 37 (2013) 1284–1299
8.5
1293
6.0
8.0 7.5
5.5
7.0 6.5 6.0
50 50
100
150
200
250
100
150
200
250
200
250
Dt
Dt
Fig. 2. wnt as a function of the observed demand rate D ¼ Dtd .
140 190
120
180 170
100
160
80
150
60
140 50
100
150
200
250
50
100
150
Fig. 3. qnt as a function of the observed demand rate D ¼ Dtd .
4.2. Further applications to the case with fixed R Explicit results like the one in Proposition 4.1.1 can be carried out in a number of different cases. A complete discussion of these cases would be too long to be include here, and is provided in a separate paper Øksendal et al. (2012). To demonstrate the usefulness of this theory, we briefly survey the results in Øksendal et al. (2012):
The Ornstein–Uhlenbeck process with time-variable (deterministic) coefficients: Existence, uniqueness, and explicit solutions for the equilibria.
Geometric Brownian motion with constant coefficients: Existence, uniqueness, and explicit solutions for the equilibria.
Interestingly, the equilibrium wholesale price wt is constant in this case, and the retailer orders a fixed fraction of the observed demand. Geometric Brownian motion with time-variable (deterministic) coefficients: Existence, uniqueness, and explicit solutions for the equilibria. In this case the equilibrium wholesale price is not constant. It is, however, given by a deterministic function, and as a consequence the manufacturer needs not observe demand to settle the price. Geometric Le´vy processes: Explicit solutions for the equilibria are offered for special cases with random coefficients, leading to non-Markovian solutions. Existence and uniqueness is established for some cases. Typically the manufacturer has an equilibrium wholesale price defined in terms of a deterministic function, and needs not observed demand. The retailer should observe both demand and t‘1he growth rate of demand as his optimal order q is a deterministic function of these two quantities.
5. R-dependent demand In this section we provide a solution to an example with R-dependent demand. This problem is more difficult than the case we handled in the previous section. We also discuss a more complicated example, raising some interesting issues for future research. 5.1. An example with R-dependent demand When demand depends on Rt, Theorem 3.2.2 no longer applies. High profits at some stage may become too costly later, due to reduced demand, and the problem can no longer be separated into independent one-periodic problems. In particular we shall see that (13) no longer holds. However, we can still apply the maximum principle for the optimization of JF (the follower problem), since this part does not need (13). To simplify the discussion, we note that in the particular case where the coefficients m, s, g do not depend on D, then the adjoint Eqs. (15)–(16) have the trivial solution
1294
B. Øksendal et al. / Journal of Economic Dynamics & Control 37 (2013) 1284–1299
aL ðtÞ ¼ bL ðtÞ ¼ 0. If dDt ¼ ðKRt Þ dt þ s dBt
ð62Þ
the second pair of adjoint variables is also solvable, i.e., (11)–(12) has the explicit solution Z T ðRs SÞX ½0,qs ðDs Þ ds9f t bF ðtÞ ¼ 0 aF ðtÞ ¼ E
ð63Þ
t
If we make the simplifying assumption that Rt is decided at time td, i.e., at the same time as wt and qt, then, maximizing the Hamiltonian HF as in Theorem 3.2.1, we arrive by (35) and (36) at the following first-order conditions for the optimal ^ t ,q ¼ q^ t ¼ F1 ðwÞðtÞ ^ ^ functions wt ¼ w and R ¼ R^ t ¼ F2 ðwÞðtÞ h i w S t , t 2 ½d,T, ð64Þ E X ½0,Dtþ ðqt Þ9E t ¼ Rt S Z T ðRs SÞX ½0,qs ðDs Þds9E t ¼ 0 E min½Dt ,qt
t 2 ½d,T
ð65Þ
t
The functions F1 and F2 are found solving (64)–(65), as explained in the following. It is interesting to note that while (64) can be solved pointwise in t, (65) depends on path properties in the remaining time period, reflecting that decisions taken at one point in time influence later performance. The optimal order quantity qt ¼ F1 ðwÞðtÞ can be found from the equations as follows: Using the same separation technique that we used in Section 4, we can express qt explicitly in terms of wt and Rt Z t pffiffiffiffiffiffi wt S : ð66Þ qt ¼ Dtd þ ðKRs Þ ds þ sd G1 1 Rt S td If we put t ¼ d, we obtain Z d pffiffiffiffiffiffi w S qd ¼ D0 þ : ðKRs Þ dsþ sd G1 1 d Rd S 0
ð67Þ
The interesting point here is that we need to know the prices Rt ,0 rt r d in the period prior to the sales period ½d,T. One option is to consider these values as exogenously given initial values, which is typical when handling differential equations with delay. Alternatively, these prior values can be considered part of the decision process. In that case, the choice Rt ¼0 if 0 rt r d is optimal as it leads to higher values of initial demand, clearly an advantage for both the retailer and the manufacturer. This strategy corresponds to advertising in the presales period, in which case a small number of items are given away free to stimulate demand. We now proceed to solve (64)–(65): By (64) we obtain
wt S E X ½0,qt ðDt Þ9E t ¼ 1 Rt S
ð68Þ
The function x/ht ðxÞ :¼ E½X ½0,x ðDt Þ9E t
ð69Þ 1 ht ðxÞ.
is strictly increasing and hence has an inverse Thus (68) can be written Rt wt y 1 1 ¼ ht qt ¼ ht yþ wt S y ¼ Rt wt ðRt wt Þ þ wt S
ð70Þ
If we substitute (68) into (65), we get Z T ws S dsE t ¼ E½min½Dt ,qt 9E t , ðRs SÞ 1 E R S s t or Z T ðRs ws Þ dsE t ¼ Y t , E
ð71Þ
t
where Y t :¼ E½min½Dt ,qt 9E t ¼ E½qt X ½0,Dt ðqt Þ9E t þ f t ðqt Þ ¼ qt
wt S þf t ðqt Þ, Rt S
ð72Þ
with f t ðxÞ ¼ E½Dt X ½0,x ðDt Þ9E t :
ð73Þ
B. Øksendal et al. / Journal of Economic Dynamics & Control 37 (2013) 1284–1299
1295
Hence, by (70) Y t ¼ F t ðwt ,Rt wt Þ,
ð74Þ
where 1
F t ðw,yÞ ¼ ht
y wS y 1 þf t ht : y þwS yþ wS y þ wS
ð75Þ
For each fixed t and w, let F 1 t ðw,Þ be a measurable left-inverse of the mapping y/F t ðw,yÞ, in the sense that F 1 t ðw,F t ðw,yÞÞ ¼ y
for all y 2 R:
ð76Þ
s 2 ½0,T:
ð77Þ
Then Rs ws ¼ F 1 s ðws ,Y s Þ;
Therefore Eq. (71) can be written as Z T F 1 t 2 ½d,T E s ðws ,Y s Þ dsE t ¼ Y t ;
ð78Þ
t
This is a backward stochastic differential equation (BSDE) in the unknown process Yt. It can be reformulated as follows: Find an E t -adapted process Yt and an E t -martingale Zt such that ( dY t ¼ F 1 t 2 ½d,T t ðwt ,Y t Þ dt þ dZ t ; ð79Þ YT ¼ 0 From known BSDE theory we obtain the existence and uniqueness of a solution for ðY t ,Z t Þ of such an equation under certain conditions on the driver process F 1 t ðwt ,Y t Þ. For example, it suffices that Z T 1 2 F 1 ð80Þ E t ðwt ,0Þ dt o 1 and y/F t ðwt ,yÞ is Lipschitz: d
See, e.g., Pardoux and Peng (1990), El Karoui et al. (1997), and Øksendal and Zhang (2012) and the references therein. Moreover, Yt and Zt can be obtained as a fixed point of a contraction operator and hence as a limit of an iterative procedure. This makes it possible to compute Yt numerically in some cases. In general, however, the solution of the BSDE (79) need not be unique, because F 1 t ðwt ,Þ is not necessarily unique, and, even if Ft is invertible it is not clear that the inverse satisfies (80). If we assume that the solution Y t ¼ Y t ðoÞ of (79) has been found, then the optimal Rt ¼ R^ t ðwÞ ¼ F2 ðwÞ is given by (77) and the optimal qt ¼ q^ t ðwÞ ¼ F1 ðwÞðtÞ is given by (70). Finally we turn to the manufacturer’s maximization problem. The performance functional for the manufacturer has the form Z T ^ ðwt MÞðqðwÞÞ ð81Þ JL ðw, FðwÞÞ ¼ E t dt : 0
Therefore, by (70) and by (77) the problem to maximize JL ðw, FðwÞÞ over all paths w, can be regarded as a problem of optimal stochastic control of a coupled system of forward-backward stochastic differential equations (FBSDEs), as follows: (Forward system) dDt ¼ ðKRt Þ dt þ s dBt ¼ ðKwt F 1 t ðwt ,Y t ÞÞ dt þ s dBt ; (Backward system) ( dY t ¼ F 1 t ðwt ,Y t Þ dt þ dZ t ;
t 2 ½d,T
YT ¼ 0 (Performance functional) "Z T
JðwÞ : ¼ E
0
1 ðwt MÞht
D0 2 R:
:
F 1 t ðwt ,Y t Þ F 1 t ðwt ,Y t Þ þ wt S
ð82Þ
ð83Þ
! # dt
ð84Þ
This is a special case of the following stochastic control problem of a coupled system of FBSDEs: (Forward system) dX t ¼ bðt,X t ,Y t ,ut Þ dt þ sðt,X t ,Y t ,ut Þ dBt X0 2 R
ð85Þ
1296
B. Øksendal et al. / Journal of Economic Dynamics & Control 37 (2013) 1284–1299
(Backward system) dY t ¼ gðt,ut ,Y t Þ dt þ dZ t
ð86Þ
YT ¼ 0 (Performance functional) Z T JðuÞ ¼ E f ðt,X t ,Y t ,ut Þ dt
ð87Þ
0
Here ut is our control. To handle this problem, we need an extension of the result in Øksendal and Sulem (2012) to systems with the coupling given in (85) and (87). For details see Øksendal and Sulem (2013). The extension gives the following solution procedure: Define the Hamiltonian 1 Hðt,x,y,w, l,p,qÞ ¼ ðwMÞq^ t ðwyÞ þ lF 1 t ðw,yÞ þ pðKwF t ðw,yÞÞ þqs
The adjoint processes lt ,pt ,qt are given by the following FB system: @H d 1 ðtÞ dt ¼ ðwt MÞðq^ 0t ðwt Y t Þ þ lt ðF t ðwt ,yÞÞy ¼ Y t dt; dlt ¼ @y dy
t Z0
ð88Þ
ð89Þ
l0 ¼ 0 dpt ¼
@H ðtÞ dt þqt dBt ¼ qt dBt @x
ð90Þ
pT ¼ 0: From (90) we get pt ¼ qt ¼ 0, and the first order condition for maximization of the functional w/Hðt,Dt ,w, lt ,pt ,qt Þ becomes d ^ t Y t Þ þ q^ t ðw ^ t Y t Þ þ l^ ðtÞ ^ ðF 1 ðw,Y t Þw ¼ w^t ¼ 0 ðwMÞ q^ 0t ðw dw t
ð91Þ
We formulate what we have proved in a proposition. Proposition 5.1.1. Suppose that the demand process is as in (62) and that E t ¼ F td ; t Z d. Suppose that an optimal solution ^ t , q^ t ¼ F1 ðwÞðtÞ, ^ ^ w and R^ t ¼ F2 ðwÞðtÞ of the Stackelberg game (21)–(22) exists. Then the retailer’s optimal order response qt ¼ F1 ðwÞðtÞ and optimal price Rt ¼ F2 ðwÞðtÞ, respectively, are given by Rt wt ð92Þ F1 ðwÞðtÞ ¼ h1 t Rt S
F2 ðwÞðtÞ ¼ wt þ F 1 t ðwt ,Y t Þ,
ð93Þ
tÞ is a solution of the BSDE (79) for some measurable left inverse F 1 where Y t ¼ Y ðw t t ðwt ,Þ of F t ðwt ,Þ. Accordingly, the ^ t is the solution wt of Eq. (91). manufacturer’s wholesale price w
Some remarks Even though the result in Proposition 5.1.1 only covers a special case, we believe that the solution features insights to more general cases. We see that once Rt is decided, the order quantity qt can be found via a pointwise optimization. This is true because the order size does not influence demand, and a suboptimal choice at time t cannot be compensated by improved performance later on. We expect this strategy to hold more generally. Once qt is eliminated from the equations, the optimal retail price is found via a transformation to a backward stochastic differential equation. We believe that similar transformations might work for other cases. It makes good sense that the optimal retail price satisfies a backward problem. As we approach time T, it becomes less important what happens later on. In the limiting stages we take all we can get, leading to an end-point constraint. If Ft is not invertible, our framework will allow for solutions that might jump to new levels. Solutions of this type are found regularly when solving ordinary stochastic control problems. Our setup appears to allow for a similar type of effect in a quite unexpected way. A possible conjecture is that there exist switching levels, i.e., when demand reaches a low level the retailer should stop selling and lower prices to increase demand (sell marginal quantities with marketing effects in mind), and start selling when demand reaches a high enough level. Non-uniqueness of F 1 could lead to switching effects t of this kind. This is an interesting problem which is left for future research.
B. Øksendal et al. / Journal of Economic Dynamics & Control 37 (2013) 1284–1299
1297
5.2. A second example allowing complete elimination of the adjoint equations Another model admitting a similar type of analysis is dDt ¼ Dt ðKRt Þ dt þ sDt dBt :
ð94Þ
This is a second example where the adjoint equations can be solved explicitly, eventually leading to a system of the following form for the unknowns qt ¼ F1 ðwÞ and Rt ¼ F2 ðwÞ: E½X ½0,Dt ðqt Þ9E t ¼
wt S Rt S
t 2 ½d,T
ð95Þ
Z T Dt ðRs SÞX ½0,qs ðDs ÞGðsÞ ds9E t ¼ 0 E min½Dt ,qt GðtÞ t
t 2 ½d,T
ð96Þ
where dGðtÞ ¼ GðtÞððKRt Þ dt þ s dBt Þ;
Gð0Þ ¼ 1
ð97Þ
i.e. Z
t
GðtÞ ¼ exp
0
1 KRs s2 dsþ sBt ; 2
t 2 ½0,T
ð98Þ
We see that even though the adjoint equations can be eliminated, the resulting system is an order of magnitude more complicated than (64)–(65). Proceeding as in Section 5.1 we define ht ,f t and Ft as in (69), (73), and (76), respectively, but note that the values of these functions are different now, since Dt is different. Then we get as in (70) that (95) gives y 1 , ð99Þ qt ¼ ht y þ wt S y ¼ Rt wt which substituted into (96) gives the following equation in the unknown Y t : ¼ E½min½Dt ,qt 9E t : Z T 1 F 1 ðw ,Y Þ G ðsÞ ds9E t 2 ½0,T: E s s t ¼ Yt; GðtÞ t s
ð100Þ
Compared with (78) this equation has an additional difficulty stemming from the process GðtÞ, which is not necessarily E t adapted. This makes the equation a Volterra BSDE with respect to E t . We do not know how to solve (100). It is an interesting topic for future research. For a related paper see Yong (2006). If Eq. (100) can be solved for Y, then the corresponding optimal retailer value q^ t ðwÞ ¼ F1 ðwÞðtÞ and price R^ t ¼ F2 ðwÞðtÞ is again given by (92) and (93), respectively, ^ t ¼ FL ðq^ t , R^ t Þ is given by (91). while the optimal manufacturer price w 5.3. An example with explicit solution In this section we consider a simplified case with R-dependent demand, but where the contract must be written upfront, i.e., that w,q,R are decided once and for all prior to the sales period. This corresponds to the case when E t ¼ E t ¼ F 0 ¼ f|, Og
for all t 2 ½d,T:
Hence E½Y9E t ¼ E½Y
for all t
It can be shown that the maximum principle can be modified to cover this situation. We do not give the proof here, but refer to the argument given in Section 10.4 in Øksendal and Sulem (2007): When the controls w,q,R are not allowed to depend on t, we consider the t-integrated Hamiltonians, given by (see (12)–(17)) Z T ~ F ðw,q,R,aF ,bF ,cF Þ :¼ H HF ðt,Dt ,w,q,R,aF ðtÞ,bF ðtÞ,cF ðtÞÞ dt d
Z
T
min½Dt ,q dtðwSÞqT þðKRÞ
¼ ðRSÞ d
Z
T
aF ðtÞ dt
ð101Þ
d
and ~ f ðw,aL ,bL ,cL Þ :¼ H L
Z
T
d
f
HL ðt,Dt ,w,aL ðtÞ,bL ðtÞ,cL ðtÞÞ dt
^ ¼ ðwMÞfL ðwÞT ¼ ðwMÞqðwÞT,
ð102Þ
^ ^ where fðwÞ ¼ ðfL ðwÞ, fF ðwÞÞ ¼ ðqðwÞ, RðwÞÞ is the maximizer with respect to q and R of Z T Z T ~ ðp,qÞ/E½Hðw,p,q,a min½Dt ,q dt ðwSÞqT þðKRÞE aF ðtÞ dt F ,bF ,cF Þ ¼ ðRSÞE d
d
ð103Þ
1298
B. Øksendal et al. / Journal of Economic Dynamics & Control 37 (2013) 1284–1299
^ chosen by the manufacturer, is then the maximizer of The optimal constant value w ¼ w f
~ ðw,aL ,bL ,cL Þ ¼ ðwMÞqðwÞT: ^ w/E½H L This gives the following first order conditions for the optimal (q,R) and w, (see (63)–(64)) Z T wS , X ½0,Dtþ ðqÞ dt ¼ E RS d Z E
T
Z min½Dt ,q dt ¼ ðRSÞE
d
T
d
Z
T
X ½0,q ðDs Þ ds dt ,
ð104Þ
ð105Þ
t
and ðwMÞ
d ^ ^ ^ ^ qðw, RðwÞÞ þ qðw, RðwÞÞ ¼ 0, dw
ð106Þ
^ ^ ^ where qðw,RÞ is the solution of (90) for given w, R. Substituting q ¼ qðw,RÞ into (91), we find the optimal R^ ¼ RðwÞ by solving (91), which can be reformulated to RT ^ dt ^ E½ min½Dt , qðw, RÞ R^ ¼ S þ Rd T , ð107Þ E½ d tX ½0, qðw, ^ ðDt Þ dt ^ RÞ by changing the order of integration. The main result can be summarized in the following theorem: ^ w ^ R, ^ are given as follows: For Theorem 5.3.1. If the values of q, R and w are required to be constant, then the optimal values q, ^ ^ ^ is found as the solution of (92). given (w,R) let qðw,RÞ be the solution of (90). Next let R^ ¼ RðwÞ be the solution of (93). Then w To get an impression how this works in a specific case, we consider the demand given by (62). In that example we have qD0 ðKRÞt pffiffi ð108Þ PðDt r qÞ ¼ G s t Eq. (89) takes the form Z T wS PðDt r qÞ dt ¼ T RS d
ð109Þ
This equation clearly has no analytical solution, but can be handled numerically. Eq. (93) leads to the equation pffi RT Rq z 2 ffieðzD0 ðKRÞtÞ =2s t dz þq ð1PðDt r qÞÞ dt d d pffiffiffiffiffiffiffiffiffiffiffiffiffi 2 p t s 2 R ¼ Sþ ð110Þ RT d t PðDt rqÞ dt which we also think is reasonable to handle, as given w this is an equation in 1-variable only. Tables of q ¼ qðR,wÞ and R ¼ RðwÞ can then be constructed numerically, and values from these tables can be used to find maximum of the 1-variable function w/ðwMÞqðRðwÞ,wÞT: Since this process is Markov, we see that the parties only need to know D0 to write the contract at time t ¼0. 6. Concluding remarks This paper has two main topics. First, we develop a new theory for stochastic differential Stackelberg games and second we apply that theory to continuous time newsvendor problems. In the continuous time newsvendor problem we offer a full description of the general case where our stochastic demand rate Dt is a function of the retail price Rt. The wholesale price wt and the order rate qt are decided based on information present at time td, while the retail price can in general be decided later, i.e., at time tdR where d 4 dR . This problem can be solved using Theorem 3.2.1. However, the solution is defined in terms of a coupled system of stochastic differential equations and in general such systems are hard to solve in terms of explicit expressions. The case where demand is independent of R, leads to the simpler version in Theorem 3.2.2. If demand is given by an Ornstein–Uhlenbeck process, there is a unique, closed form solution to the problem. In Section 5 we have discussed some examples with R-dependent demand. These cases are simple, but nonetheless they appear to capture important economic effects. It would hence be quite interesting if one could solve such problems using more refined expressions. A further analysis of these and similar examples poses real challenges, however, and much more work will be needed before we can understand these issues in full. This is, therefore, an interesting topic for future research.
B. Øksendal et al. / Journal of Economic Dynamics & Control 37 (2013) 1284–1299
1299
Acknowledgment The authors wish to thank Steve LeRoy for several useful comments on the paper. The authors gratefully acknowledge several valuable suggestions from the referees. References Baghery, F., Øksendal, B., 2007. A maximum principle for stochastic control with partial information. Stochastic Analysis and Applications 25, 705–717. Bensoussan, A., C - akanyıldırım, M., Sethi, S.P., 2007. A multiperiod newsvendor problem with partially observed demand. Mathematics of Operations Research 23 (2), 322–344. Bensoussan, A., C - akanyıldırım, M., Sethi, S.P., 2009. Optimal ordering policies for stochastic inventory problems with observed information delays. Production and Operations Management 18 (5), 546–559. Berling, P., 2006. Real options valuation principle in the multi-period base-stock problem. Omega 36, 1086–1095. Cacho´n, G.P., 2003. Supply chain coordination with contracts. In: de Kok, A.G., Graves, S.C. (Eds.), The Handbook of Operations Research and Management Science: Supply Chain Management: Design, Coordination and Operation, Elsevier, Amsterdam, pp. 229–340. (Chapter 6). Calzolari, A., Florchinger, P., Nappo, G., 2011. Nonlinear filtering for stochastic systems with fixed delay: approximation by a modified Milstein scheme. Computers and Mathematics with Applications 61 (9), 2498–2509. El Karoui, N., Peng, S., Quenez, M.-C., 1997. Backward stochastic differential equations in finance. Mathematical Finance 7, 1–71. Framstad, N., Øksendal, B., Sulem, A., 2004. Sufficient stochastic maximum principle for optimal control of jump diffusions and applications to finance. Journal of Optimization Theory and Applications 121, 77–98. Gallego, G., Moon, I., 1993. The distribution free newsboy problem: review and extensions. Journal of the Operational Research Society 44 (8), 825–834. Karlin, S., Carr, C.R., 1962. Prices and optimal inventory policy. In: Arrow, K., Karlin, S., Scarf, H. (Eds.), Studies in Applied Probability and Management Science, Stanford University Press, Stanford, pp. 159–172. Kogan, K., Lou, S., 2003. Multi-stage newsboy problem: a dynamical model. European Journal of Operational Research 149 (2), 448–458. Kogan, K., 2003. Scheduling parallel machines by the dynamic newboy problem. Computers & Operations Research 31 (3), 429–443. Lariviere, M.A., Porteus, E.L., 2001. Selling to the newsvendor: an analysis of price-only contracts. Manufacturing & Service Operations Management 3 (4), 293–305. Matsuyama, K., 2004. The multi-period newsboy problem. European Journal of Operational Research 171 (1), 170–188. Mills, E.S., 1959. Uncertainty and price theory. Quarterly Journal of Economics 73, 116–130. Øksendal, B., Sandal, L., Ubøe, J., 2012. Stackelberg equilibria in a continuous time vertical contracting model with uncertain demand and delayed information. Manuscript. Øksendal, B., Sulem, A., 2007. Applied Stochastic Control of Jump Diffusions, second ed. Springer, Berlin, Heidelberg, New York. Øksendal, B., Sulem, A., 2012. Forward-backward SDE games and stochastic control under model uncertainty. Journal of Optimization Theory and Applications. doi:10.1007/s10957-012-0166-7. Øksendal, B., Sulem, A., 2013. Stochastic control of FBSDEs, stochastic differential games and risk minimization. In: NHH Lecture Notes. Øksendal, B., Zhang, T., 2012. Backward stochastic differential equations with respect to general filtrations and applications to insider finance. Communications on Stochastic Analysis 6 (4), 703–722. Pardoux, E., Peng, S., 1990. Adapted solution of a backward stochastic differential equation. Systems & Control Letters 14, 55–61. Perakis, G., Roels, G., 2008. Regret in the newsvendor model with partial information. Operations Research 56 (1), 188–203. Petruzzi, C.N., Dada, M., 1999. Pricing and the newsvendor problem: a review with extensions. Operations Research 47, 183–194. Qin, Q., Wang, R., Vakharia, A.J., Chen, Y., Seref, M., 2011. The newsvendor problem: review and directions for future research. European Journal of Operational Research 213, 361–374. Scarf, H., 1958. A min-max solution of an inventory problem. In: Arrow, K., Karlin, S., Scarf, H. (Eds.), Studies in The Mathematical Theory of Inventory and Production, Stanford University Press, Stanford, pp. 201–209. Wang, H., Chen, B., Yan, H., 2010. Optimal inventory decisions in a multiperiod newsvendor model with partially observed Markovian supply capacities. European Journal of Operational Research 202 (2), 502–517. Whitin, T.M., 1955. Inventory control and price theory. Management Science 2, 61–68. Yong, J., 2006. Backward stochastic Volterra integral equations and some related problems. Stochastic Processes and their Applications 116, 779–795.