
Dynamic Programming

Quantitative Macroeconomics [Econ 5725]

Raül Santaeulàlia-Llopis

Washington University in St. Louis

Spring 2011



1 The Finite Horizon Case
Environment
Dynamic Programming Problem
Bellman’s Equation
Backward Induction Algorithm

2 The Infinite Horizon Case


Preliminaries for T → ∞
Bellman’s Equation
Basic Elements of Functional Analysis
Blackwell Sufficient Conditions
Contraction Mapping Theorem (CMT)
V is a Fixed Point
VFI Algorithm
Characterization of the Policy Function: The Euler Equation and TVC

3 Roadmap



The Finite Horizon Case

• Time is discrete and indexed by t = 0, 1, ..., T < ∞

• Environment is stochastic

• Uncertainty is introduced via $z_t$, an exogenous random variable (or shock)

• $z_t$ follows a Markov process with transition function

$$Q(z', z) = \Pr(z_{t+1} \le z' \mid z_t = z)$$

with $z_0$ given.

• We assume $z_t$, but not $z_{t+1}$, is known at time $t$.
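For concreteness, here is a minimal sketch of how such a shock process is typically represented and simulated once discretized; the two-state grid and transition matrix below are illustrative assumptions, not part of the slides.

import numpy as np

# Hypothetical two-state discretization of z_t: row i of P gives
# Pr(z_{t+1} = z_j | z_t = z_i), so each row sums to one.
z_grid = np.array([0.9, 1.1])
P = np.array([[0.8, 0.2],
              [0.3, 0.7]])

rng = np.random.default_rng(0)

def simulate(P, z0_index, T):
    """Draw a path z_0, ..., z_T from the discretized Markov chain."""
    path = [z0_index]
    for _ in range(T):
        path.append(rng.choice(len(P), p=P[path[-1]]))
    return z_grid[path]

print(simulate(P, z0_index=0, T=10))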



• The instantaneous return function is $u(x_t, c_t)$

• $u(x_t, c_t)$ is continuous and bounded in $x_t$ and $c_t$

• The state variable $x_t \in X \subset \mathbb{R}^m$, $\forall t$

• The control variable $c_t \in C(x_t, z_t) \subset \mathbb{R}^n$, $\forall t$



• The objective function is

$$E_0 \sum_{t=0}^{T} \beta^t u(x_t, c_t)$$

where

• $\beta < 1$ is the discount factor

• $E_0$ denotes the expectation conditional on information at $t = 0$

• Subject to the law of motion of $x_t$, the stochastic process $z_t$, and given $x_0$ and $z_0$.



• The law of motion of state $x$ is:

$$x_{t+1} = f(x_t, z_t, c_t)$$

with $x_0$ given.



• $(x_t, z_t)$ completely describes the state of the economy at every $t$.

• This, together with the additive separability of the objective function, implies that the action $c_t$ depends solely on the current states through a (possibly) time-varying function $g_t$,

$$g_t : X \times Z \to C, \quad \forall t$$

that is, $c_t = g_t(x_t, z_t)$.

• The function $g_t$ is the decision rule.



• The sequence $\pi_T = \{g_0, g_1, ..., g_T\}$ is the policy.

• If each $g_t(x_t, z_t) \in C(x_t, z_t)$, then $\pi_T \in \Pi$, i.e., the policy is feasible.

• If $g_t(x_t, z_t) = g(x_t, z_t) \in C(x_t, z_t)$ $\forall t$, the policy is stationary.



• The expected discounted value of a given feasible policy $\pi_T \in \Pi$ is

$$W_T(x_0, z_0, \pi_T) = E_0 \sum_{t=0}^{T} \beta^t u(x_t, g_t(x_t, z_t))$$

where

• $x_{t+1}$ follows $f(x_t, z_t, g_t(x_t, z_t))$

• $x_0, z_0$ are given

• the expectation is taken with respect to $Q(z', z)$



The Dynamic Programming Problem

An individual maximizes

$$\max_{g_t(x_t, z_t) \in C(x_t, z_t)} W_T(x_0, z_0, \pi_T)$$

subject to

$$x_{t+1} = f(x_t, z_t, g_t(x_t, z_t))$$

given $x_0$, $z_0$ and $Q(z', z)$.



Theorem of the Maximum

If
• the constraint correspondence $C(x_t, z_t)$ is non-empty, compact-valued and continuous,
• $u(\cdot)$ is continuous and bounded,
• $f(\cdot)$ is continuous, and
• $Q$ satisfies the Feller property,

then
• there exists a solution (optimal policy) to the problem above, $\pi_T^* = \{g_0^*, g_1^*, ..., g_T^*\}$, and
• the value function $V_T(x_0, z_0) = W_T(x_0, z_0, \pi_T^*)$ is also continuous.

Proof: See SLP, p. 62.



Precisely, the value function $V_T(x_0, z_0)$ is the expected discounted present value of the optimal policy $\pi_T^*$,

$$V_T(x_0, z_0) = E_0 \sum_{t=0}^{T} \beta^t u(x_t, g_t^*(x_t, z_t))$$

Corollary: If $C(x_t, z_t)$ is convex and $u(\cdot)$ and $f(\cdot)$ are strictly concave in $c_t$, then $g_t(x_t, z_t)$ is also continuous.



Toward Bellman’s Equation

• Given the existence of a solution, we can write the value function:

$$V_T(x_0, z_0) = \max_{\pi_T} E_0 \left\{ u(x_0, c_0) + \sum_{t=1}^{T} \beta^t u(x_t, c_t) \right\}$$

• By the law of iterated expectations, $E_0(x_1) = E_0(E_1(x_1))$, hence

$$V_T(x_0, z_0) = \max_{\pi_T} E_0 \left\{ u(x_0, c_0) + E_1 \sum_{t=1}^{T} \beta^t u(x_t, c_t) \right\}$$



• Then, we can cascade the max operator,

$$V_T(x_0, z_0) = \max_{c_0} E_0 \left\{ u(x_0, c_0) + \max_{\pi_{T-1}} E_1 \sum_{t=1}^{T} \beta^t u(x_t, c_t) \right\}$$

where $\pi_{T-1} = \{c_1, c_2, ..., c_T\}$

• Rearranging the discount factor,

$$V_T(x_0, z_0) = \max_{c_0} E_0 \left\{ u(x_0, c_0) + \beta \max_{\pi_{T-1}} E_1 \sum_{t=1}^{T} \beta^{t-1} u(x_t, c_t) \right\} = \max_{c_0} E_0 \left\{ u(x_0, c_0) + \beta \max_{\pi_{T-1}} W_{T-1}(x_1, z_1, \pi_{T-1}) \right\}$$



• If, analogously to above, we define the value function

$$V_{T-1}(x_1, z_1) = W_{T-1}(x_1, z_1, \pi_{T-1}^*)$$

as the expected present value of the optimal policy with $T - 1$ periods left,

• then we can write

$$V_T(x_0, z_0) = \max_{c_0} \left\{ u(x_0, c_0) + \beta E_0 V_{T-1}(x_1, z_1) \right\}$$



• Omitting time subscripts on $x$, $z$ and $c$, we can rewrite the previous expression for any number of periods $s \in \{1, 2, ..., T\}$ left to go:

$$V_s(x, z) = \max_{c \in C(x,z)} \left\{ u(x, c) + \beta E \, V_{s-1}(x', z') \right\}$$

where primes denote next-period values.



Bellman’s Equation with Finite Horizon

• Inserting the law of motion (for $x$) and using the definition of conditional expectation captured by the transition function for $z$, we arrive at

$$V_s(x, z) = \max_{c \in C(x,z)} \left\{ u(x, c) + \beta \int_Z V_{s-1}(f(x, z, c), z') \, dQ(z', z) \right\} \quad (1)$$

where $x = x_{T-s}$, $z = z_{T-s}$, and $z' = z_{T-s+1}$; $s$ is the number of periods left to go, $T$ is the total number of periods, and $T - s$ periods have passed since $t = 0$.



A few important remarks:

• Bellman's equation is useful because it reduces the choice of a sequence of decision rules to a sequence of choices for the control variable.

• Hence a dynamic problem is reduced to a sequence of static problems;

• that is, it is sufficient to solve the DP problem sequentially $T + 1$ times, as shown in the next section.



Bellman's Principle of Optimality

A consequence of this result is Bellman's principle of optimality:

• If the sequence of functions $\pi_T^* = \{g_0^*, g_1^*, ..., g_T^*\}$ is the optimal policy that maximizes $W_T(x_0, z_0, \pi_T)$,

• then, after $j$ periods with $j + s = T$, $\pi_s^* = \{g_{T-s}^*, g_{T-s+1}^*, ..., g_T^*\}$ is the optimal policy (with elements identical to the original one) that maximizes $W_s(x_j, z_j, \pi_s)$.

Policies with this property are called time-consistent.



Backward Induction Algorithm

This leads to the following algorithm to solve the DP problem:

• First, start from the last period, with 0 periods to go ($s = 0$). Then the problem is static and reads:

$$V_0(x_T, z_T) = \max_{c_T \in C(x_T, z_T)} u(x_T, c_T)$$

From here we obtain $g_T^*(x_T, z_T)$.

Then, given a specification of $u(\cdot)$, we have an explicit functional form for $V_0(x_T, z_T)$.



• Second, go back one period. With 1 period to go ($s = 1$), using the constraint $x_T = f(x_{T-1}, z_{T-1}, c_{T-1})$ and $Q$, we can write

$$V_1(x_{T-1}, z_{T-1}) = \max_{c_{T-1} \in C(x_{T-1}, z_{T-1})} \left\{ u(x_{T-1}, c_{T-1}) + \beta \int_Z V_0(f(x_{T-1}, z_{T-1}, c_{T-1}), z_T) \, dQ(z_T, z_{T-1}) \right\}$$



• Third, we continue going back until time 0 and collect the sequence of decision rules into the optimal policy vector.

• Finally, given the initial conditions at time 0, we can reconstruct the whole optimal path for the state and control, contingent on any realization of $\{z_t\}_{t=0}^T$. A sketch of the full procedure follows below.
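To make the algorithm concrete, here is a minimal sketch of backward induction on a discretized problem. The model (cake-eating with log utility, $x' = x - c$, and a two-state taste shock) and all numbers are illustrative assumptions, not taken from the slides.

import numpy as np

beta = 0.95
x_grid = np.linspace(1e-3, 1.0, 200)          # grid for the state x
z_grid = np.array([0.9, 1.1])                 # two-state shock
P = np.array([[0.8, 0.2], [0.3, 0.7]])        # Q as a transition matrix
T = 10

n_x, n_z = len(x_grid), len(z_grid)
V = np.zeros((T + 1, n_x, n_z))               # V[s]: value with s periods to go
policy = np.zeros((T + 1, n_x, n_z), dtype=int)

# s = 0: the last period is static; eat the whole cake
V[0] = z_grid[None, :] * np.log(x_grid[:, None])

# s = 1, ..., T: Bellman recursion V_s = max u + beta * E V_{s-1}
for s in range(1, T + 1):
    EV = V[s - 1] @ P.T                       # EV[k, j] = E[V_{s-1}(x_k, z') | z_j]
    for i, x in enumerate(x_grid):
        for j, z in enumerate(z_grid):
            c = x - x_grid                    # consumption implied by each x'
            vals = np.where(c > 0, z * np.log(np.maximum(c, 1e-12)), -np.inf)
            vals = vals + beta * EV[:, j]
            policy[s, i, j] = int(np.argmax(vals))
            V[s, i, j] = vals[policy[s, i, j]]

With $s$ periods to go, V[s] plays the role of $V_s$ in equation (1), and the loop is the recursion described on the previous slides.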



The Infinite Horizon Case

We take the same environment and primitives as before, except that $T \to \infty$.

Important consequences:

• We cannot proceed with backward induction, as there is no last period.

• The DP problem (conditional on the initial state) is the same in each period, since we always have the same number of periods left to go ($\infty$); hence the environment is stationary.

• Then, the value function will be time-invariant as well, $V(x, z)$.



Bellman’s Equation

In the infinite horizon case, we can write Bellman's equation as:

$$V(x, z) = \max_{c \in C(x,z)} \left\{ u(x, c) + \beta \int_Z V(f(x, z, c), z') \, dQ(z', z) \right\} \quad (2)$$

where we have dropped the time subindexes on the value function.

• The solution to this problem will be a stationary (time-invariant) decision rule $c^* = g^*(x, z)$.



It will be useful to think of (2) as a functional equation, i.e., an equation where the unknown is a function $\varphi(x, z)$ belonging to some functional space $\mathcal{C}$.

Then, more generally, Bellman's equation (2) can be written

$$T(\varphi) = \max_{c \in C(x,z)} \left\{ u(x, c) + \beta \int_Z \varphi(x', z') \, dQ(z', z) \right\}$$

where $T : \mathcal{C} \to \mathcal{C}$.



We are interested in knowing:

1 whether (2) has a solution, that is, a function $V$ belonging to the same functional space $\mathcal{C}$ that satisfies the fixed point property $V = T(V)$;

2 whether (2) can be thought of as the limit of the DP problem with finite horizon in (1) as $s \to \infty$.



Some Basic Elements of Functional Analysis

A metric space $(M, d)$ is a set $M$ together with a metric $d : M \times M \to \mathbb{R}_+$ satisfying the following conditions $\forall \varphi, \phi, \psi \in M$:

1 $d(\varphi, \phi) = 0 \iff \varphi = \phi$

2 $d(\varphi, \phi) = d(\phi, \varphi)$

3 $d(\varphi, \psi) \le d(\varphi, \phi) + d(\phi, \psi)$



Convergent Sequence. A sequence $\{\varphi_n\}$ in $M$ is said to converge to $\varphi \in M$ if $\forall \delta > 0$ $\exists N = N(\delta) > 0$ such that $d(\varphi_n, \varphi) < \delta$ if $n > N$.

Cauchy Sequence. A sequence $\{\varphi_n\}$ in $M$ with the property that $\forall \delta > 0$ $\exists N = N(\delta) > 0$ such that $d(\varphi_n, \varphi_m) < \delta$ if $n, m > N$.

Complete Metric Space. A metric space $(M, d)$ is complete if every Cauchy sequence in $(M, d)$ converges to a point in the space. (A complete normed vector space is called a Banach space.)
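As an example that will be used below (a standard result, stated here for reference): the space $M$ of continuous and bounded functions on $X \times Z$, equipped with the sup metric

$$d_\infty(\varphi, \phi) = \sup_{(x,z) \in X \times Z} \left| \varphi(x,z) - \phi(x,z) \right|,$$

is a complete metric space.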



Operator, T: A function $T$ mapping a metric space into itself is called an operator.

Contraction Mapping, T: Let $(M, d)$ be a complete metric space and $T : (M, d) \to (M, d)$. Then $T$ is said to be a contraction with modulus $\beta$ if there is a number $\beta \in (0, 1)$ such that

$$d(T(\varphi), T(\phi)) \le \beta \, d(\varphi, \phi), \quad \forall \varphi, \phi \in M. \quad (3)$$

That is, a contraction brings two points closer together: their images, $T(\varphi)$ and $T(\phi)$, are closer together than $\varphi$ and $\phi$.


Blackwell Sufficient Conditions (BSC)

BSC: Let $T$ be an operator on a metric space $(M, d_\infty)$, where $M$ is a space of functions with domain $X$ and $d_\infty$ is the sup metric. Then $T$ is a contraction mapping with modulus $\beta$ if it satisfies the following two conditions:

1 (monotonicity) $\varphi \le \phi \Rightarrow T\varphi \le T\phi$, $\forall \varphi, \phi \in M$

2 (discounting) $T(\varphi + a) \le T\varphi + \beta a$, $\forall a > 0$, $\varphi \in M$, for some $\beta \in (0, 1)$



CMT: Let $(M, d)$ be a complete metric space and let $T$ be a contraction mapping with modulus $\beta$. Then, it follows that:

1 $T$ has a unique fixed point $\varphi^*$ in $M$

2 For any $\varphi_0$ in $M$, the sequence $\varphi_{n+1} = T(\varphi_n)$ started at $\varphi_0$ converges to $\varphi^*$ in the metric $d$.

This is a very useful theorem because it ensures the existence and uniqueness of a fixed point under very general conditions and provides (through point 2) an algorithm to compute the fixed point, by simple iteration.

The CMT is also known as the Banach Fixed Point Theorem.
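A toy illustration of point 2 (my example, not from the slides): $T(v) = 0.5v + 1$ is a contraction on $\mathbb{R}$ with modulus $0.5$, and iterating it from any starting point converges to the unique fixed point $v^* = 2$.

def T(v):
    return 0.5 * v + 1.0      # contraction with modulus 0.5; fixed point v* = 2

v = 100.0                     # any starting point works (CMT, point 2)
for n in range(100):
    v_new = T(v)
    if abs(v_new - v) < 1e-12:
        break
    v = v_new

print(n, v_new)               # converges to v* = 2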



Show that the Value Function in Bellman's Equation (2) is a Fixed Point of a Contraction:

• First, consider the functional equation

$$T(\varphi) = \max_{c \in C(x,z)} \left\{ u(x, c) + \beta E \, \varphi(f(x, z, c), z') \right\}$$

that defines a mapping $T$ whose domain is the space of continuous and bounded functions $M$.

Two results:

• Given the assumption on the boundedness of $u$ and $\varphi$, it is immediate to show that $T\varphi$ is bounded as well.

• By the Theorem of the Maximum, we obtain that if $\varphi$ is continuous, then $T\varphi$ is continuous as well (just interpret $\varphi$ as $V_{s-1}$ and $T\varphi$ as $V_s$).



• Second, equip the space $M$ with the sup norm $d_\infty$. Since $(M, d_\infty)$ is a complete metric space (as you have proved earlier), we can use Blackwell's sufficient conditions to check whether $T$ is a contraction mapping:

• (monotonicity) Take $\varphi_1 \ge \varphi_2$; then

$$T(\varphi_1) = \max_c \{ u(x, c) + \beta E[\varphi_1(f(x, z, c), z')] \} \ge \max_c \{ u(x, c) + \beta E[\varphi_2(f(x, z, c), z')] \} = T(\varphi_2)$$

• (discounting) For any function $\varphi$, positive real number $a > 0$ and $\beta \in (0, 1)$, letting $c^*$ denote the maximizer:

$$T(\varphi + a) = u(x, c^*) + \beta E[\varphi(f(x, z, c^*), z') + a] = u(x, c^*) + \beta E[\varphi(f(x, z, c^*), z')] + \beta a = T(\varphi) + \beta a$$



Hence,

• It follows that $T$ is a contraction mapping on a complete metric space, so it satisfies the conditions of the CMT.

• We can therefore use the contraction mapping theorem to characterize the solution of the DP problem with infinite horizon.

• This is extremely useful because we can interpret the value function of the infinite horizon DP problem as the fixed point of a contraction mapping.



Three important results:

1 We know that the infinite horizon Bellman's equation (2) has a solution $V$, and this solution is unique under general conditions.

2 We can interpret the infinite horizon value function as the limit as $T \to \infty$ of a finite horizon DP problem.

3 The CMT provides a computational algorithm (its point 2 above): value function iteration (VFI).



Value Function Iteration (VFI) Algorithm

1 To compute our unknown $V$, which we have just shown is a fixed point, we can start iterating on (2) from any initial (continuous and bounded) function $\varphi_0$, and we are certain to converge to the solution $V$: the Value Function Iteration (VFI) algorithm.

2 Roughly speaking: let $V_0 = \zeta$ be the initial guess of the value function. Iterating on it, $V_1 = T(\zeta)$, $V_2 = T(V_1)$, ..., $V_{n+1} = T(V_n)$ will get us to a convergence point at some iteration $N$, where $V^* = V_{N+1} = T(V_N)$ is approximately $V_N$.
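A minimal VFI sketch on the same illustrative cake-eating model assumed in the backward-induction sketch above (log utility, $x' = x - c$, two-state taste shock); all numbers are assumptions for illustration only.

import numpy as np

beta = 0.95
x_grid = np.linspace(1e-3, 1.0, 200)
z_grid = np.array([0.9, 1.1])
P = np.array([[0.8, 0.2], [0.3, 0.7]])

# u[i, k, j]: return in state (x_i, z_j) when choosing next state x_k.
# Infeasible choices (c <= 0) get a large finite penalty so that the
# sup-norm convergence check below stays well defined.
c = x_grid[:, None, None] - x_grid[None, :, None]
u = np.where(c > 0,
             z_grid[None, None, :] * np.log(np.maximum(c, 1e-12)),
             -1e10)

V = np.zeros((len(x_grid), len(z_grid)))        # any bounded initial guess
for it in range(2000):
    EV = V @ P.T                                # EV[k, j] = E[V(x_k, z') | z_j]
    V_new = (u + beta * EV[None, :, :]).max(axis=1)   # apply the operator T
    if np.abs(V_new - V).max() < 1e-8:          # stop at a sup-norm tolerance
        V = V_new
        break
    V = V_new

g = (u + beta * (V @ P.T)[None, :, :]).argmax(axis=1)  # stationary policy g*(x, z)

By the CMT, the sup-norm distance between successive iterates shrinks at rate $\beta$, so the stopping rule above is guaranteed to be met.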



Contractions have the feature of guaranteeing that the fixed point V (weakly)
preserves the properties of the functions ϕ on which we iterate, such as
monotonicity, continuity and concavity.

For the deterministic case (see SLP chapter 9 for stochastic environments),

1 If, in addition to the assumptions in the contraction mapping theorem, i) $u(\cdot)$ is strictly increasing, and ii) the constraint set is monotone, then $V$ is strictly increasing. (SLP 4.7)

2 If, in addition to the assumptions in the contraction mapping theorem, i) $u(\cdot)$ is strictly concave, and ii) the constraint set is convex, then $V$ is strictly concave and the associated decision rule $g$ is single-valued and continuous. (SLP 4.8)

If SLP 4.8 holds (i.e., the decision rule is continuous), we can find the optimal decision rule using some continuous method of approximation; this means we may not need to solve for the decision rule at a large number of points (see the sketch below). If SLP 4.8 does not hold, there is no guarantee that the decision rule can be approximated with a continuous function, and we will have to use discretization, which takes more time.
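To illustrate the point (a hypothetical example; g_coarse below is a made-up stand-in for a solved rule): when the decision rule is continuous, it can be solved on a coarse grid and then evaluated anywhere by interpolation.

import numpy as np

x_coarse = np.linspace(0.0, 1.0, 10)             # few points where g was solved
g_coarse = 0.5 * x_coarse                        # made-up stand-in for g*(x)

x_fine = np.linspace(0.0, 1.0, 1000)
g_fine = np.interp(x_fine, x_coarse, g_coarse)   # piecewise-linear approximation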



A few remarks:

It is always nice to have continuity in the optimal decision rule and strict concavity of the value function, because then it is easier to search for optimal decisions. However, many interesting models do not have such properties:

• Models with non-convex costs of adjustment (e.g., a fixed cost of changing the size of the house).

• Models with non-trivial tax schedule functions.



Examples of trouble:

• There are no nice results for problems with time-inconsistent preferences (hyperbolic or quasi-geometric discounting), because agents tomorrow are solving a different problem than agents today. But you can still solve the problem using backward induction in the finite horizon case.

• The same applies in the presence of lack of commitment, i.e., time-inconsistent policies. Some tricks that add state variables for record keeping are proposed in Kydland and Prescott (1977).



Characterization of the Policy Function: The Euler Equation and TVC

The fixed point value function $V$ is associated with an optimal policy function $c^* = g^*(x, z)$.

We can characterize the optimal policy function with the usual methods of calculus, i.e., by differentiation, as in the static case or in the continuous-time optimal control problem.



• Suppose we can invert the function $x' = f(x, z, c)$ with respect to the choice variable $c$; then we obtain $c = f^{-1}(x, z, x')$.

• We can then substitute out the control variable $c$ in Bellman's equation, so that the choice becomes next period's state $x'$:

$$V(x, z) = \max_{x' \in X} \left\{ u(x, x', z) + \beta \int_Z V(x', z') \, dQ(z', z) \right\} \quad (4)$$



• If we knew that $V$ were differentiable, by taking the FOC with respect to $x'$ we would obtain:

$$u_{x'}(x, x', z) + \beta \int_Z V_{x'}(x', z') \, dQ(z', z) = 0 \quad (5)$$

• While we are free to make assumptions on $u$ and $f$, given these assumptions the differentiability of $V$ must be established.


Differentiability of V

The Benveniste and Scheinkman (1979) Theorem:

Let $V$ be a concave function defined on the set $X$, let $x_0 \in X$, and let $N(x_0)$ be a neighborhood of $x_0$. If there is a concave differentiable function $\Omega : N(x_0) \to \mathbb{R}$ such that $\Omega(x) \le V(x)$, $\forall x \in N(x_0)$, with equality holding at $x_0$, then $V$ is differentiable at $x_0$ and $V_x(x_0) = \Omega_x(x_0)$.

Proof: See SLP, p. 85.



• Let's apply this result to Bellman's equation. Define

$$\Omega(x, z) = u(x, g^*(x_0, z)) + \beta \int_Z V(f(x_0, z, g^*(x_0, z)), z') \, dQ(z', z)$$

• It follows that $\Omega(x_0, z) = V(x_0, z)$ and, $\forall x \in N(x_0)$:

$$\Omega(x, z) \le \max_{c \in C(x,z)} \left\{ u(x, c) + \beta \int_Z V(f(x, z, c), z') \, dQ(z', z) \right\} = V(x, z)$$

since $g^*(x_0, z)$ is not the optimal policy function for $x \ne x_0$.

• Hence, by the Benveniste and Scheinkman Theorem, $V$ is differentiable.



• Further, assuming that $u(\cdot)$ is concave and differentiable implies that $\Omega(x, z)$ is concave and differentiable as well, since the integral term is just a constant.

• We can therefore apply the envelope theorem to obtain:

$$V_{x'}(x', z') = u_{x'}(x', g^*(x', z'))$$

which tells us that the derivative of the value function is the partial derivative of the utility function with respect to the state variable, evaluated at the optimal value of the control.



• We can use this result to substitute out the derivative of the value function from (5), obtaining

$$u_{x'}(x, x', z) + \beta \int_Z u_{x'}(x', c') \, dQ(z', z) = 0 \quad (6)$$



The Euler Equation

• Euler equation: If we substitute out the next period control through $c' = \varphi(x', x'', z')$, we obtain

$$u_{x'}(x, x', z) + \beta \int_Z u_{x'}(x', x'', z') \, dQ(z', z) = 0 \quad (7)$$

• To stress that this equation is satisfied by the optimal sequence of states $\{x_t^*\}_{t=0}^{\infty}$, we can re-express (7) with time subscripts:

$$u_{x_{t+1}}(x_t^*, x_{t+1}^*, z_t) + \beta \int_Z u_{x_{t+1}}(x_{t+1}^*, x_{t+2}^*, z_{t+1}) \, dQ(z_{t+1}, z_t) = 0 \quad (8)$$

where the subscript $x_{t+1}$ denotes the partial derivative with respect to $x_{t+1}$, in whichever argument it appears. This is a second order nonlinear difference equation in the state variable.
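As a quick illustration (a standard textbook example, not from the slides): in a deterministic cake-eating problem with $u(x_t, x_{t+1}) = \log(x_t - x_{t+1})$, so that $c_t = x_t - x_{t+1}$, the Euler equation (8) reads

$$-\frac{1}{x_t - x_{t+1}} + \frac{\beta}{x_{t+1} - x_{t+2}} = 0 \quad \Longrightarrow \quad c_{t+1} = \beta \, c_t,$$

which is the familiar condition $u'(c_t) = \beta u'(c_{t+1})$.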



The Transversality Condition (TVC)

• The Euler equation is a second order nonlinear difference equation in the state variable.

• To be able to fully characterize the optimal dynamic path of the state $\{x_t^*\}_{t=0}^{\infty}$, we need two boundary conditions.

• These boundary conditions are:

• Initial conditions: $x_0$ and $z_0$ given

• Transversality Condition (TVC): $\lim_{t \to \infty} \beta^t u_c(x_t^*, c_t^*) \, x_t^* = 0$



Sufficiency of the Euler Equation and TVC

• It is possible to prove that if a sequence of states $\{x_t^*\}_{t=0}^{\infty}$ satisfies the Euler equation and the TVC, then it is optimal for the DP problem.

• To do so, we will show that the difference $D$ between the objective function evaluated at $\{x_t^*\}_{t=0}^{\infty}$ and at $\{x_t\}_{t=0}^{\infty}$, any alternative feasible sequence of states, is non-negative.

• In this particular proof, we abstract from the stochastic nature of the problem.



• First, let's assume concavity and differentiability of the return $u$. Then from concavity (the gradient inequality) it follows that:

$$D = \lim_{T \to \infty} \sum_{t=0}^{T} \beta^t \left[ u(x_t^*, x_{t+1}^*) - u(x_t, x_{t+1}) \right] \ge \lim_{T \to \infty} \sum_{t=0}^{T} \beta^t \left[ u_{x_t}(x_t^*, x_{t+1}^*)(x_t^* - x_t) + u_{x_{t+1}}(x_t^*, x_{t+1}^*)(x_{t+1}^* - x_{t+1}) \right]$$



• Second, since $x_0^* = x_0$ is given as the initial condition, we can rewrite the sum above as:

$$D \ge \lim_{T \to \infty} \left\{ \sum_{t=0}^{T-1} \beta^t \left[ u_{x_{t+1}}(x_t^*, x_{t+1}^*) + \beta \, u_{x_{t+1}}(x_{t+1}^*, x_{t+2}^*) \right] (x_{t+1}^* - x_{t+1}) + \beta^T u_{x_{T+1}}(x_T^*, x_{T+1}^*)(x_{T+1}^* - x_{T+1}) \right\}$$



• Third, since the term in brackets $[\cdot]$ is zero by the Euler equation,

$$D \ge \lim_{T \to \infty} \left\{ \beta^T u_{x_{T+1}}(x_T^*, x_{T+1}^*)(x_{T+1}^* - x_{T+1}) \right\}$$


• Fourth, using the Euler equation (8) to substitute in the last term, we obtain:

$$D \ge \lim_{T \to \infty} \left\{ \beta^T u_{x_{T+1}}(x_T^*, x_{T+1}^*)(x_{T+1}^* - x_{T+1}) \right\} = -\lim_{T \to \infty} \left\{ \beta^{T+1} u_{x_{T+1}}(x_{T+1}^*, x_{T+2}^*)(x_{T+1}^* - x_{T+1}) \right\}$$



• Fifth, since $u_{x_{T+1}} > 0$ and $x_{T+1} > 0$, we can subtract the non-negative last term below:

$$D \ge -\lim_{T \to \infty} \left\{ \beta^{T+1} u_{x_{T+1}}(x_{T+1}^*, x_{T+2}^*)(x_{T+1}^* - x_{T+1}) \right\} - \lim_{T \to \infty} \beta^{T+1} u_{x_{T+1}}(x_{T+1}^*, x_{T+2}^*) \, x_{T+1} = -\lim_{T \to \infty} \beta^{T+1} u_{x_{T+1}}(x_{T+1}^*, x_{T+2}^*) \, x_{T+1}^*$$

• Using the TVC, this last limit is zero, so we can finally establish that $D \ge 0$.



What's next?

In order to implement solution methods (backward induction, VFI, etc.), we first need to learn some numerical techniques:

• First, our objects of interest are functions: we deal with functional equations. We will learn how to approximate functions in a way that a computer can store and handle.

• Second, we will learn how to take derivatives and integrals when analytical solutions are unavailable.

• Third, we will review how to find roots to solve (systems of) non-linear equations.

• Finally, we will go over numerical optimization, which is sometimes useful as a direct approach.

Before going into numerical techniques, we will take a quick look at how to work with aggregate and survey data in the next class.

