2nd IFAC Workshop on Control of Systems Governed 2nd IFAC Control 2ndPartial IFAC Workshop Workshop on Control of of Systems Systems Governed Governed by Differentialon Equations 2nd IFAC Workshop on Control of Systems Governed by Differential Equations by Partial Partial Differential Equations June 13-15, 2016. Bertinoro, Italy Available online at www.sciencedirect.com by Partial Differential Equations June June 13-15, 13-15, 2016. 2016. Bertinoro, Bertinoro, Italy Italy June 13-15, 2016. Bertinoro, Italy
ScienceDirect
IFAC-PapersOnLine 49-8 (2016) 266–271
Risk-averse Merton’s Portfolio Problem Risk-averse Merton’s Risk-averse Merton’s Portfolio Portfolio Problem Problem Laurent Pfeiffer ∗∗ Laurent Pfeiffer ∗∗ Laurent Laurent Pfeiffer Pfeiffer ∗ Institute for Mathematics and Scientific Computing, ∗ ∗ Institute for Mathematics and Scientific Computing, and Computing, ∗ Institute for Mathematics Karl-Franzens-Universit¨ at, Heinrichstraße 36, 8010 Graz, Austria Institute for Mathematics and Scientific Scientific Computing, Karl-Franzens-Universit¨ a t, Heinrichstraße 36, 8010 Graz, Austria Austria Karl-Franzens-Universit¨ a t, Heinrichstraße 36, 8010 Graz, (e-mail:
[email protected]). Karl-Franzens-Universit¨ a t, Heinrichstraße 36, 8010 Graz, Austria (e-mail:
[email protected]). (e-mail:
[email protected]). (e-mail:
[email protected]). Abstract: A one-dimensional version of a continuous-time Merton’s portfolio problem is Abstract: A one-dimensional of a continuous-time problem is Abstract: A version of Merton’s portfolio problem is solved numerically for different version cost functionals involving the Merton’s probabilityportfolio distribution of the Abstract: A one-dimensional one-dimensional version of aa continuous-time continuous-time Merton’s portfolio problem is solved numerically for different cost functionals involving the probability distribution of the solved numerically for different cost functionals involving the probability distribution of the state variable. Such functionals allow to take into account the risk aversion of the consumer. solved numerically for different cost functionals involving the probability distribution of the state associated variable. Such Such functionals allow to to take in intoa account account the risk of aversion of the the consumer. state variable. functionals allow take into risk aversion of The optimality conditions consist coupled the system the Fokker-Planck and state variable. Such functionals allow to take in intoa account the risk of aversion of the consumer. consumer. The associated optimality conditions consist coupled system the Fokker-Planck and The associated optimality conditions consist in a coupled system of the Fokker-Planck and the Hamilton-Jacobi-Bellman equations, as is customary for mean-field type optimal control The associated optimality conditions consist in a coupled system of the Fokker-Planck and the Hamilton-Jacobi-Bellman equations, as is customary for mean-field type optimal control the Hamilton-Jacobi-Bellman equations, as is customary for mean-field type optimal control problems. We solve this coupled system with an iterative method. the Hamilton-Jacobi-Bellman equations, as is customary for mean-field type optimal control problems. We solve this coupled system with an iterative method. problems. We solve system with iterative method. problems. We(International solve this this coupled coupled system with an anControl) iterative method. © 2016, IFAC Federation of Automatic Hosting by Elsevier Ltd. All rights reserved. Keywords: Fokker-Planck equation, mean-field type control, semi-Lagrangian schemes, Keywords: Fokker-Planck equation, mean-field type control, semi-Lagrangian schemes, Keywords: Fokker-Planck Merton’s portfolio. Keywords: Fokker-Planck equation, equation, mean-field mean-field type type control, control, semi-Lagrangian semi-Lagrangian schemes, schemes, Merton’s portfolio. portfolio. Merton’s Merton’s portfolio. 1. INTRODUCTION respectively in the resolution of the HJB equation and the 1. INTRODUCTION INTRODUCTION respectively in in the the resolution of of the HJB HJB equation and and the 1. respectively resolution Fokker-Planck equation. 1. INTRODUCTION respectively in the resolution of the the HJB equation equation and the the Fokker-Planck equation. Fokker-Planck equation. Fokker-Planck equation. 1.1 Context We use here the second algorithm proposed in Pfeiffer 1.1 Context Context We use use here here the secondincorporates algorithm proposed proposed in Pfeiffer Pfeiffer 1.1 We second algorithm in (2015a). This the algorithm a penalization term 1.1 Context We use here the secondincorporates algorithm proposed in Pfeiffer (2015a). This algorithm a penalization term (2015a). This algorithm incorporates a penalization term in the fixed point approach in order to ensure the con(2015a). Thispoint algorithm incorporates atopenalization term In this paper, we solve risk-averse Merton’s portfolio in in the fixed approach in order ensure the conthe fixed point approach in order to ensure the convergence of the method. The involved coefficient must In this paper, we solve risk-averse Merton’s portfolio In this paper, we solve risk-averse Merton’s portfolio in the fixed point approach in order to ensure the conproblems in dimension 1 adapted from Merton (1971). Our vergence vergence of the method. The involved coefficient must In this paper, we solve risk-averse Merton’s portfolio of the method. The involved coefficient must be tuned iteratively at each iteration. The problem is problems in dimension 1 adapted from Merton (1971). Our problems in dimension 1 adapted from Merton (1971). Our vergence of the method. The involved coefficient must model is the following:1 aadapted risky and a Merton non-risky assetOur are be be tuned tuned iteratively iteratively at each each iteration. iteration. Theasproblem problem is problems in dimension from (1971). at The is discretized with a semi-Lagrangian scheme, in Carlini model is the following: a risky and a non-risky asset are model is the following: aa risky and aaofnon-risky asset are be tuned iteratively at each iteration. Theasproblem is available. At any time, the manager the portfolio may discretized with a semi-Lagrangian scheme, in Carlini model is the following: risky and non-risky asset are discretized with a semi-Lagrangian scheme, as in Carlini ´with a (2013) available. At any any time, the manager manager of greater the portfolio portfolio may and Silva Alvarez for example. Note that the conavailable. At time, the of the may discretized semi-Lagrangian scheme, as in Carlini invest a part of the portfolio (possibly than 1) in ´ available. At any time, the manager of greater the portfolio may ´´ the accuracy and Silvaand Alvarez (2013) for for example. Note have that not the been conSilva Alvarez (2013) Note that the coninvest part ofthe the portfolio (possibly than 1) in and vergence of example. the algorithm invest aa of the portfolio (possibly greater 1) the risky asset, in the non-risky asset. than The value Silvaand Alvarez (2013) for example. Note have that not the been coninvest a part part ofthe theother portfolio (possibly greater than 1) in in and vergence the accuracy of the algorithm vergence and the accuracy of the algorithm have not been the risky asset, other in the non-risky asset. The value soand far.the accuracy of the algorithm have not been the risky asset, the other in the non-risky asset. The value of the portfolio has to be optimized at a given maturity T . proved vergence the risky asset, the other in the non-risky asset. The value proved so far. so far. of the theproblem portfoliotakes has to to beform optimized at aa given given maturity maturity T .. proved of portfolio has optimized at T The thebe of a mean-field (MFT) proved far. discuss the following cost functions, which testsoand of theproblem portfoliotakes has to beform optimized at a giventype maturity T . We The the of a mean-field type (MFT) The problem takes the form of a mean-field type (MFT) We test and discuss the following following cost functions, which optimal control problem when the used cost function is a We test and the are described in (Shapiro et al., cost 2014,functions, Chapter which 6): a The problem takes the form ofthe a mean-field type (MFT) We and discuss discuss the following cost functions, optimal control problem when usedofcost cost function is optimal control problem when the used are test described in (Shapiro (Shapiro et al., al., the 2014, Chapter which 6): aa function of the probability distribution thefunction value of is theaaa cost are described in et 2014, Chapter 6): involving the semi-deviation, Conditional Value optimal control problem when the used cost function is are described in (Shapiro et al., 2014, Chapter 6): a function of the probability distribution of the value of the function of the probability distribution of the value of the cost involving the semi-deviation, the Conditional Value portfolio. cost involving the semi-deviation, Risk (CVaR), and a cost withthe a Conditional penalization Value term function of the probability distribution of the value of the at cost involving the semi-deviation, the Conditional Value portfolio. portfolio. at Risk Risk (CVaR), (CVaR), and aa cost cost with with aa penalization penalization term term at and involving a target. portfolio. MFT optimal control problems basically consist of opti- at Risk (CVaR), involving a target. target. and a cost with a penalization term involving a MFT optimal control problems basically consist of optiMFT optimal control problems basically of mal control problems the Fokker-Planck equation. This We involving next a target. a general formulation of the problem. In MFT optimal controlof basically consist consist of optioptimal control control problems ofproblems the Fokker-Planck Fokker-Planck equation. This We give mal problems of the equation. This give next general formulation of the the problem. In parabolic linear partial differential equation indeed models We give next aaa general formulation of problem. In section 2, we recall optimality conditions and in section mal control problems of the Fokker-Planck equation. This We give2,next general formulation of the problem. In parabolic linear partial differential equation of indeed models section parabolic linear partial differential equation indeed models section we recall optimality conditions and in section the evolution of the probability distribution a stochastic 2, we recall optimality conditions and in section we describe the numerical method. Weand describe the parabolic linear partial differential equation of indeed models 3, section 2, we recall optimality conditions in section the evolution of the probability distribution a stochastic the evolution of the probability distribution of a stochastic 3, we describe the numerical method. We describe the process whichofisthe solution to a stochastic differential equa- portfolio 3, we numerical method. We describe the model the in section 4. We provide numerical the evolution probability distribution of a stochastic 3, we describe describe the numerical method. Wethe describe the process whichMFT is solution solution to aacontrol stochastic differential equa- portfolio process which is to stochastic differential equaportfolio model in5. section section 4. We We provide the numerical tion (SDE). optimal problems are related model in 4. provide the numerical results in section process which is solution to a stochastic differential equaportfolio model in section 4. We provide the numerical tion (SDE). MFT optimal control problems are related tion (SDE). optimal control problems are related results in in section 5. 5. to mean-field games, which have received much tion (SDE). MFT MFT optimal control problems areattention related results results in section section 5. to mean-field mean-field games, which have received much attention to games, which have received much attention in the last years (see Cardaliaguet (2012), Bensoussan to mean-field games, have received much attention in al. the(2013)). last years years (seewhich Cardaliaguet (2012), Bensoussan in the last (see Cardaliaguet (2012), Bensoussan et In this MFT control problems 1.2 Formulation of 1-dimensional MFT control problems in the(2013)). last years (seeframework, Cardaliaguet (2012), Bensoussan et al. In this framework, MFT control problems et al. (2013)). In this framework, MFT control 1.2 Formulation Formulation of of 1-dimensional 1-dimensional MFT MFT control control problems problems are called potential mean-field games, see the problems example 1.2 et al. (2013)). In this framework, MFT control problems are called potential mean-field games, see the example 1.2 Formulation of 1-dimensional MFT control problems are called potential mean-field games, see the example provided in Lachapelle et al. (2010). are calledinpotential mean-field games, see the example Let T > 0, let (Wt )t∈[0,T ] be a 1-dimensional standard provided Lachapelle et al. al. (2010). provided in et Let T T > 0, 0, let (W (Wtt ))t∈[0,T 1-dimensional provided in Lachapelle Lachapelle et al. (2010). (2010).problems the stochas- Brownian ]] be Let let bea aacompact 1-dimensional standard In this paper, we call “standard” t∈[0,T motion. U be subset standard of R, we Let T > > 0, let (WLet 1-dimensional standard t )t∈[0,T ] bea a In this paper, we call “standard” problems the stochasBrownian motion. Let U be compact subset of R, we In this paper, we call “standard” problems the stochasBrownian motion. Let U be a compact subset of R, tic optimal control problems for which the cost funcdenote by U the set of adapted processes (u ) t t∈[0,T In this paper, we callproblems “standard” problems the stochasBrownian motion. Let U be a processes compact (u subset of] taking R, we we tic optimal control for which the cost funcdenote by U the set of adapted ) taking t t∈[0,T ] tic optimal control problems for which the cost funcdenote by U the set of adapted processes (u ) taking t to tion is the expectation of a given function. Such probt∈[0,T values in UU which are adapted with respect (W]]t )taking t∈[0,t] . tic optimal control problems for which the cost funcdenote by the set of adapted processes (u ) t t∈[0,T tion is is the expectation of aabygiven given function. Such probprob- values in U adapted respect to tion of function. Such tt ) in which are adapted with respect to (W )t∈[0,t] lems canthe beexpectation directly solved solving the corresponding The drift b :which R × U are →R and thewith volatility : R(W ×U → R,... t∈[0,t] tion of abygiven function. Such prob- values values in U U which are adapted with respectσ to (W t )t∈[0,t] lems is canthe beexpectation directly solved solved solving the corresponding corresponding The drift b : R × U → R and the volatility σ : R × U → R, lems can be directly by solving the The drift b : R × U → R and the volatility σ : R × U → R, Hamilton-Jacobi-Bellman (HJB) equation, itself derived as well as ban: R initial condition x ¯0 ∈ R are given. ∈ U, lems can be directly solved(HJB) by solving the corresponding The drift ×uU → R and the volatility σ : RFor × Uu → R, Hamilton-Jacobi-Bellman equation, itself in derived as well as an initial condition x ¯¯00 ∈ R are given. For u ∈ U, Hamilton-Jacobi-Bellman (HJB) equation, itself derived as well as an initial condition x ∈ R are given. For u ∈ U, from a dynamic programming principle. As shown Pfeifwe denote by (X ) the solution to the following SDE: t∈[0,T ] Hamilton-Jacobi-Bellman (HJB) equation, itself derived tu as well as an initial condition x ¯ ∈ R are given. For u ∈ U, 0 from(2015b), dynamic programming principle. take As shown shown in PfeifPfeif- we denote (Xttuu ))t∈[0,T to the following SDE: from aa principle. As in ]] the denote by by the solution solution following SDE: fer theprogramming optimality conditions the following from a dynamic dynamic programming principle. take As shown in Pfeif- we we by (X (X )t∈[0,T solution to toXthe the following SDE: ] the dXdenote + σ(X ¯0 . (1) fer (2015b), (2015b), the optimality conditions the following t = b(Xt , ut )tdtt∈[0,T t , ut ) dWt , 0 = x fer the optimality conditions take the following general form for a MFT control problem: an optimal dX = b(X , u ) dt + σ(X , u ) dW , X = x ¯ . (1) fer (2015b), the optimality conditions take the following tt = b(Xtt , utt ) dt + σ(Xtt , utt ) dWtt , 0 0 dX X = x ¯ . (1) 0 0 general form for a MFT control problem: an optimal general form for MFT control an optimal dXtassume = b(Xtthe , ut )existence dt + σ(Xof dW , Xthat ¯for control process is aaalso optimal for problem: a standard problem. t , uL t )> 0 = x 0 . all x, y(1) We 0 tsuch in general form for MFT control problem: an optimal control process is also optimal for a standard problem. We assume the existence of L > 0 such that for all x, y in control process is also optimal for a standard problem. We assume the existence of L > 0 such that for all x, y The terminal condition of the corresponding HJB equation R, for all u and v in U , control process is also ofoptimal for a standard problem. We assume the existence of L > 0 such that for all x, y in in The terminal condition the corresponding HJB equation R, for all u and v in U , The terminal condition of the corresponding HJB equation R, for all u and v in U , itself dependscondition on the optimal control and may seen as R, for all u and v in U , The terminal of the corresponding HJBbe equation |b(x, u)| ≤ L(1 + |x| + |u|), (2) itself depends on the optimal control and may be seen as depends on control seen as aitself linearized Thisoptimal result motivates fixedbe point |b(x, u)| ≤ L(1 + |x| + |u|), (2) itself dependscost. on the the control and andaa may may seenapas |b(x, u)| L(1 + + (2) linearized cost. Thisoptimal result motivates fixedbe point ap|b(y, ≤ L(|y − x| + |v − u|). (3) |b(x, v) u)|−≤ ≤b(x, L(1u)| + |x| |x| + |u|), |u|), (2) aa linearized cost. This result motivates a fixed point approach, made of backward and forward passes, consisting a linearized cost. This result motivates a fixed point ap|b(y, v) − b(x, u)| ≤ L(|y − x| + |v − u|). (3) |b(y, v) − b(x, u)| ≤ L(|y − x| + |v − u|). (3) proach, made made of of backward backward and and forward forward passes, passes, consisting consisting proach, |b(y, v) − b(x, u)| ≤ L(|y − x| + |v − u|). (3) proach, made of backward and forward passes, consisting
Copyright © 2016, 2016 International Federation of 268Hosting by Elsevier Ltd. All rights reserved. 2405-8963 © IFAC (International Federation of Automatic Control) Copyright © 2016 International Federation of 268 Copyright © 2016 responsibility International of Federation of Federation of Automatic 268Control. Automatic Control Peer review under International Copyright © 2016 International Federation of 268 Automatic Control Automatic Control 10.1016/j.ifacol.2016.07.452 Automatic Control
268 268 268 268
IFAC CPDE 2016 June 13-15, 2016. Bertinoro, Italy
Laurent Pfeiffer / IFAC-PapersOnLine 49-8 (2016) 266–271
We assume that σ satisfies the same assumption. We ¯2 (R) the set of probabiliy measures on R denote by B having a variance smaller than R: ¯ m ∈ B2 (R) ⇐⇒ x2 dm(x) ≤ R. R
We also consider the Wasserstein distance associated with the L1 -norm, denoted here by d1 . For all u ∈ U, for all t ∈ [0, T ], we denote by mut the probability distribution of Xtu , uniquely characterized by the following property: for all continuous and bounded function φ : R → R, φ(x) dmut (x) = E φ(Xtu ) . R
The assumptions (2) and (3) ensure the existence of R ≥ 0 ¯ 2 (R). such that for all u ∈ U, for all t ∈ [0, T ], mut ∈ B
A control process u is said to be a feedback control if there exists a measurable function u : [0, T ] × R → U such that almost surely: ut = u(t, Xtu ). If u is Lipschitz continuous w.r.t. x, then (t, x) �→ mt (x) is a weak solution to the Fokker-Planck equation: 1 2 ∂t mut (x) − ∂xx (mut (x)σ(x, u(t, x))2 ) 2 (4) + ∂x (mut (x)b(x, u(t, x)) = 0. ¯2 (R) → R is given, assumed to A cost function χ : B be continuous with respect to d1 and differentiable in ¯2 (R), there exists a the following sense: for all m1 ∈ B continuous function x ∈ R �→ Dχ(m1 , x) ∈ R such that ¯2 (R), for θ ∈ [0, 1]: for all m2 ∈ B χ (1 − θ)m1 + θm2 = χ(m1 )+ (5) θ Dχ(m1 , x)(dm2 (x) − dm1 (x)) + o(θ). R
For all m, we assume that for some L > 0, |Dχ(m, x)| ≤ L(1 + x2 ).
(6)
Finally, we aim at solving the following problem: sup χ(muT ).
(P )
u∈U
2. OPTIMALITY CONDITIONS 2.1 Standard case In this paper, the “standard” case is the following problem, where a continuous function φ : R → R is given: sup φ(x) dmuT (x) = sup E φ(XTu ) . (P (φ)) u∈U
u∈U
R
As is well-known, it can be solved by dynamic programming. Define the value function V : [0, T ] × Rn → R as: V (t, x) = sup E φ(XTt,x,u ) , u∈U
where (Xst,x,u )s∈[t,T ] is the solution to: dXst,x,u
b(Xst,x,u) ds
σ(Xst,x,u ) dWs ,
Xtt,x,u
= + = x. The function V is solution to the following HJB equation: 1 2 −∂t V(t, x) = sup ∂x V(t, x)b(x, u) + ∂xx V(t, x)σ(x, u)2 , 2 u∈U V (T, x) = φ(x). The value of problem P (φ) is then V (0, x ¯0 ). 269
267
2.2 General case In the general case, an optimal solution u ¯ to P is also the solution a standard problem. Theorem 1. Let u ¯ ∈ U be a solution to P . Then, it is a solution to P (φ), with φ = Dχ(muT¯ , ·). A proof can be found in Pfeiffer (2015b). Note that problem P (φ) is well-defined if the growth of φ is quadratic; this is why assumption (6) is needed. Observe that when u ¯ is a solution to (P (φ)) with φ = Dχ(muT¯ , ·), the value function associated with P (φ) can be seen as an adjoint equation to the Fokker-Planck equation (4). We introduce the following notation: ′ ε(u) = sup Dχ(muT , x) d(muT (x) − muT (x)). u′ ∈U
R
Observe that ε(u) ≥ 0 and that the optimality condition provided by Theorem 1 is equivalent to ε(¯ u) = 0. Theorem motivates a fixed-point approach. A first attempt of algorithm could be the following: given a control uk at iteration k, the next control could be chosen as a solution k to problem P (φ), with φ = Dχ(muT , ·). In this way, we could expect to find a control u satisfying the optimality condition of Theorem 1. Such a method does not converge in general and requires the introduction of a penalization term, this approach is described in the next section in a discrete setting. 3. NUMERICAL METHOD 3.1 Discretization The first step of the method consists in discretizing the SDE (1) as a controlled Markov chain, which we denote by Xju . We fix two discretization parameters NX > 0 and NT > 0. The index j is a time index in {0, ..., NT } and corresponds to the time jδt, where δt = T /NT . A parameter A > 0 is also fixed, in order to bound the domain. The Markov chain takes values in the following discretized space set: X := {xk | k = 0, ..., NX }, where k is the space index and xk = −A + kδx, with δx = 2A/NX .
Discretized feedback controls are denoted as follows: u = (uj (k))j=0,...,NT −1, k=0,...,NX ∈ U NX ×NT . For all v ∈ U , a probability transition Pv (k, k ′ ) ≥ 0 ′ from k to k is computed with a semi-Lagrangian scheme, which we do not describe here, we refer to Pfeiffer (2015a) for details. For all v ∈ U , for all k = 0, ..., NX , it holds: NX
Pv (k, k ′ ) = 1.
k′ =0
A reflecting boundary condition is used to ensure this property. From now on, we associate with any vector m = X NX +1 (m(k))k=0,...,NX ∈ R+ such that N k=0 m(k) = 1 the probability measure defined by: NX
k=0
m(k)δxk ,
IFAC CPDE 2016 268 June 13-15, 2016. Bertinoro, Italy
Laurent Pfeiffer / IFAC-PapersOnLine 49-8 (2016) 266–271
where δxk is the Dirac measure centered at xk . The probability distribution of Xju is denoted by muj (here a vector of RNX +1 , using the above convention). In this context, the Fokker-Planck equation can be discretized as follows: for u ∈ U NX ×NT , muj+1 (k ′ ) =
NX �
muj (k)Puj (k) (k, k ′ ),
k=0
with the initial condition m0 (x) = δk,k¯ , where δ is the Kronecker symbol and where we assume that x ¯0 = xk¯ . The proposed scheme is consistent, in so far as it is the “dual” scheme to the semi-Lagrangian schemes used for solving the HJB equation – recall that the HJB equation is here the adjoint equation to the Fokker-Planck equation. 3.2 Penalized problem Let u ¯ ∈ U NX ×NT , α > 0 and φ : R → R. Consider the following problem: N� T −1 � � � �� � u 2 u u sup . E φ(XN ) ) − α (X ) − u ¯ (X u j j j j T u∈U NX×NT
j=0
(P (¯ u, φ, α)) Such a problem can be solved by dynamic programming: VNT (k) = φ(xk ) NX �� � �2 � Vj (k) = sup . (7) Pv (k, k ′ )Vj+1 (k ′ )−α v− u ¯j (k) v∈U
k′ =0 NX ×NT
is then optimal (for the problem A control u ∈ U P (¯ u, φ, α)) if and only if for all j and k, uj (k) maximizes the r.h.s. of (7). We finally re-define the coefficient ε(u) in this discrete setting. For u ∈ U NX ×NT , we now denote: � � u′ u E Dχ(muT )(XN − XN ) , ε(u) = sup T T u′ ∈U NX ×NT
3.3 Method We describe now the algorithm used to solve problem P . Two parameters η1 > 0 and η2 > 0 are given. 0
Choose u0 ∈ U NX ×NT , compute m0 = muNT . Choose α0 > 0, set ℓ = 0. while ε(uℓ ) > η1 do � � Backward phase: compute a solution uℓ+1 to � � P (uℓ , Dχ(mℓ , ·), αℓ ). � ℓ+1 � � Forward phase: compute mℓ+1 = muNT . � ℓ+1 ℓ � � if� χ(m ) ≥ χ(m ) − η2 then � � Set αℓ+1 = αℓ /2. � � else � � � � χ(mℓ+1 ) ≥ χ(mℓ ) − η2 do � � while � � � � Set αℓ = 2αℓ . � � � � � � Backward phase: compute a solution uℓ+1 to � � � ℓ ℓ � � � � � � P (u , Dχ(m , ·), αℓ ). ℓ+1 � � � � � � Forward phase: set mℓ+1 = muNT . � � � � end � � � � Set αℓ+1 = αℓ . � � end end
Let us comment on the role of problem P (¯ u, φ, α). At the iteration ℓ, in the backward phase, the cost to be maximized is then: T −1 � N� � � � �2 � u ℓ E Dχ(mℓ , XN uj (Xju ) − uℓj (Xju ) . ) − α E T j=0
The first term is a linearization of χ. The second term is a penalization term: the coefficient αℓ must be large enough so that the solution uℓ+1 remains close to uℓ . In this manner, the linearization of χ provides a good enough ℓ+1 approximation of χ and mℓ+1 = muNT satisfies: ℓ+1
χ(muNT ) ≥ χ(mℓNT ) − η2 . If this condition is satisfied at the first attempt of the iteration, then the penalization coefficient can be reduced (here, divided by 2). Otherwise, it has to be increased as much as necessary to ensure that uℓ+1 and uℓ are close enough; this is achieved by multiplying this coefficient by 2 at each iteration of the second while-loop. 4. MERTON’S PORTFOLIO A risky and a non-risky asset are available. The wealth at time t is denoted by Xt , the proportion of the wealth invested in the risky asset at time t is denoted by πt . The parameters of our model are the following: r rate of the non-risky asset µ rate of the risky asset σ volatility of the risky asset π − /π + lower/upper bound for πt The dynamics of Xt is therefore given by: dXtπ = [r + πt (µ − r)]Xtπ dt + Xtπ πt σ dWt , (8) − + under the constraints: πt ∈ [π , π ].
Let π ¯ be an optimal solution to the problem for a given cost function χ, with associated value function V (t, x). At time t, for x = Xtπ¯ , we have: � (µ − r)V (t, x) � x , if Vxx (t, x) < 0 proj 2 V (t, x) xσ − + xx [π ,π ] πt = π + , if (µ − r)Vx (t, x) > 0 and Vxx (t, x) ≥ 0 − π , if (µ − r)Vx (t, x) < 0 and Vxx (t, x) ≥ 0 (9) when the derivatives Vx and Vxx are well-defined. 5. NUMERICAL RESULTS We provide numerical results for three different cost functions, denoted by χ1 , χ2 , and χ3 . The involved probability measure m refers to the probability distribution of XTπ (solution to (8)). � The cost function χ(m) = R x dm(x) (the expectation of the value of the portfolio at the final time) has for optimal solution: πt = π + , which is the most profitable strategy, but also the “riskiest” in the sense that one has to invest as much as possible in the risky asset. The costs χ1 , χ2 , and χ3 are variants, incorporating penalization terms modelling a risk-aversion. For each of the tested cost functions, we present a graph of the probability distribution mut (x), a graph of the optimal control u, a graph of the value function associated with
270
IFAC CPDE 2016 June 13-15, 2016. Bertinoro, Italy
Laurent Pfeiffer / IFAC-PapersOnLine 49-8 (2016) 266–271
problem P (φ), with φ = Dχ(muT ), which can be seen as a dual variable, and convergence results. Note that the probability distribution which is represented is the one the discretized variable; one gets an approximal of the probability density function of mut by multiplying it by δx−1 . The chosen parameters are the following: r = 0.02, µ = 0.03, σ = 0.05, π − = −2, π + = 2, T = 5. The discretization parameters are: δx = δt = 0.02, with NX = 400 and NT = 500. We take here α0 = 1. For the considered examples, the choice of α0 does not seem to have played any role. The problem is solved for a reformulation involving the state variable Yt = ln(Xt ). However, we present numerical results for the variable Xt , which explains that the grid of the graph is not uniform (w.r.t. to the variable x). Note that in the four considered cases, the convergence is quick (the coefficient εℓ is smaller than 10−6 in less than 10 iterations). 5.1 Lower semi-standard deviation model ¯ 2 (R) is defined by: The lower semi-variance of m ∈ B 2 z−m ¯ − dm(z), Var− (m) = R ¯ = R y dm(y) is the where z− = min(z, 0), and where m expectation of m.
R
The first cost function that we test is the following (for some penalization parameter γ > 0): x dm(x) − γ Var− (m). χ1 (m) = R
γDVar− (m,x)
√
2
Var− (m)
.
The cost function χ1 can be seen as a combination of the expected value of the portfolio at time T and a term penalizing the lower semi-standard deviation, that is to say, a term penalizing the uncertainty of the lower tail of the distribution. The parameter γ is chosen equal to 1. The value function has the following structure: for x ≤ 1 (approximately), V (t, ·) is quadratic and concave: the optimal control is therefore decreasing, by (9). For x ≥ 1, V (t, ·) is linear and increasing: the control is then equal to π + . A risky strategy is adopted for small values of x, since it both increases the expected value of the portfolio and reduces the lower semi-deviation. 5.2 Conditional Value at Risk For a given β ∈ (0, 1), the Conditional Value at Risk (CVaR) can be defined by: χ2 (m) = CVaR(m) = inf x dm1 (x). m1 ,m2 ∈P1 (R) (1−β)m1 +βm2 =m
We denote by A(m) the following set: α ∈ R | m (−∞, α] ≥ 1 − β, m [α, +∞) ≥ β . It is easy to check that this set is a closed interval, in general reduced to a singleton (if m has a density for example). In this case, the unique contained value is called Value at Risk (VaR). Note also that in this case, VaR(m) 1 CVaR(m) = x dm(x). 1 − β −∞ The CVaR is not differentiable in general, but is convex and sub-differentiable in the following sense: for all m1 and ¯2 (R), for all θ ∈ [0, 1], for all α ∈ A(m1 ), m2 ∈ B CVaR (1 − θ)m2 + θm1 θ (x − α)− d(m2 (x) − m1 (x)). ≥ CVaR(m1 ) + 1−β R Note that the above formula is related to (Rockafellar and Uryasev, 2000, Theorem 1). In the algorithm, instead of using the derivative, we simply choose a sub-gradient. Numerical results for β = 0.9 are represented on figures 58. The CVaR takes into account the worst cases, therefore a profitable (but risky) strategy is used for small values of x. When x is greater than the VaR, the control is equal to 0, which is somehow the safest strategy. 5.3 Penalization term involving a target Given x¯ ∈ R, β ∈ (0, 1) and γ > 0, we consider the cost: 2 χ3 (m) = x dm(x) − γ 1[¯x,+∞) (x) dm(x) − β , R
Its derivative (in the sense of (5)) is given by: 2 z−m ¯ − dm(z) x. DVar− (m, x) = x − m ¯ −−2
We have: Dχ1 (m, x) = x −
269
R
Note that in the usual definition (for minimization problems), the operator inf is replaced by the operator sup. 271
−
R
where 1[¯x,+∞) (x) = 1 if x ≥ x ¯, 0 otherwise. The derivative is formally given by: Dχ3 (m, x) = x−2γ 1[¯x,+∞) (x) dm(x)−β 1[¯x,+∞) (x). R
−
The cost function χ3 is neither continuous, nor differentiable. However, if 1[¯x,+∞) is replaced by a more regular function (like for example 2 arctan(N x)/π − 1, with N large), the continuity and differentiability assumptions are satisfied. The cost function χ3 must be considered as a combination of the expected value and of a penalization term. The penalization term is such that there is no penalization if with probability at least β, the value of the portfolio is greater than x ¯. The penalization plays the role of a probability constraint. Numerical results are presented for x ¯ = 1.2, β = 1, γ = 1. In this setting, the probability constraint is not satisfied, and thus, the value function has a jump at the final time. As a result, for t ≥ 8 (approximately), the value function is convex (with respect to x) for small values of x, concave for large values of x. For t ≤ 8, it is concave. For values of x around 1.2, the concavity is the strongest (i.e. Vxx is small), as a result, by (9), the control is smaller. This means that a less risky strategy is employed only in the neighborhood of x ¯; for the other values, the penalization has no influence. ACKNOWLEDGEMENTS The author acknowledges the Austrian Science Fund for financial support under grant SFB F32 “Mathematical Optimization and Applications in Biomedical Sciences”. The author thanks the reviewers for useful comments.
IFAC CPDE 2016 270 June 13-15, 2016. Bertinoro, Italy
Laurent Pfeiffer / IFAC-PapersOnLine 49-8 (2016) 266–271
REFERENCES Bensoussan, A., Frehse, J., and Yam, P. (2013). Mean field games and mean field type control theory. Springer Briefs in Mathematics. Springer, New York. Cardaliaguet, P. (2012). Notes on Mean Field Games. ´ Carlini, E. and Silva Alvarez, F.J. (2013). SemiLagrangian schemes for mean field game models. In Decision and Control (CDC), 2013 IEEE 52nd Annual Conference on, 3115–3120. Lachapelle, A., Salomon, J., and Turinici, G. (2010). Computation of mean field equilibria in economics. Math. Models Methods Appl. Sci., 20(4), 567–588. Merton, R.C. (1971). Optimum consumption and portfolio rules in a continuous-time model. J. Econom. Theory, 3(4), 373–413. Pfeiffer, L. (2015a). Numerical methods for mean-field type optimal control problems. SFB Report. Pfeiffer, L. (2015b). Optimality conditions for mean-field type optimal control problems. SFB Report. Rockafellar, R.T. and Uryasev, S. (2000). Optimization of conditional value-at-risk. Journal of Risk, 2, 21–41. Shapiro, A., Dentcheva, D., and Ruszczy´ nski, A. (2014). Lectures on stochastic programming. MOS-SIAM Series on Optimization. SIAM; Mathematical Optimization Society.
Numerical results for χ1 with γ = 1
1
0.5
0 0 5
Time 10
0.5
1.5
1
2
Space
Fig. 1. Probability distribution
2 1 0 0 2 4 6
2 8
Time
1.5 10 0.5
1
Space
Fig. 2. Optimal control
1 0 −1 −2 0 2
5
Time
1.5 10 0.5
1
Space
Fig. 3. Dual variable (value function) ℓ 0 2 4 6 8 10
Cost: χ(mℓ ) 1,1728 1,1833 1,1835 1,1835 1,1835 1,1835
Fig. 4. Convergence results
272
Criterion: εℓ 1,4.10−2 2,9.10−4 5,1.10−5 1,6.10−5 6.10−6 3,8.10−6
IFAC CPDE 2016 June 13-15, 2016. Bertinoro, Italy
Laurent Pfeiffer / IFAC-PapersOnLine 49-8 (2016) 266–271
271
Numerical results for χ3
Numerical results for χ2
1
0.4
0.5
0.2
0 0
0 0
2
2 2
4 6
1.5 8
1
1 10 0.5
Time
1.5
5
Time
Space
Space
10 0.5
Fig. 9. Probability distribution
Fig. 5. Probability distribution
2 1 0 0 2
2 0
1 0 0.5
4 6
5 1
8 1.5
Space
Time
Time
2 10
10
1
0.5
2
1.5
Space
Fig. 10. Optimal control
Fig. 6. Optimal control
4
0 −0.2
2 −0.4 0 0 0
2 2
4 1.5
6 8
Time
5
1
1 10 0.5
Space
Time
Cost: χ(mℓ ) 1,1252 1,1257 1,1257 1,1257
10 0.5
Space
Fig. 11. Dual variable (value function)
Fig. 7. Dual variable (value function) ℓ 0 2 4 6
2 1.5
ℓ 0 2 4 6 8 10
Criterion: εℓ 5,0.10−4 5,5.10−7 6,0.10−9 ≈0
Fig. 8. Convergence results
Cost: χ(mℓ ) 0.8067 1,4114 1,4159 1,4159 1,4159 1,4159
Fig. 12. Convergence results
273
Criterion: εℓ 7,8.10−1 5,9.10−3 2,4.10−5 2,9.10−7 5,3.10−9 2,9.10−10