Bellman Equation
For a discrete-time linear system with dynamics in the form

$$x_{k+1} = A x_k + B u_k,$$

let us assume we are trying to minimize the total cost of a trajectory

$$J = \sum_{k=0}^{N-1} \ell(x_k, u_k),$$

where the initial state $x_0$ is prescribed. The concept of the total cost can be generalized, for any $k \in \{0, 1, \dots, N-1\}$, to a cost-to-go

$$J_k = \sum_{i=k}^{N-1} \ell(x_i, u_i),$$

for which we may define a value function

$$V_k(x_k) = \min_{u_k, \dots, u_{N-1}} J_k$$

and an optimal control policy $u_k^* = \pi_k^*(x_k)$, the application of which results in the system following the optimal sequence of states $x_{k+1}^*, x_{k+2}^*, \dots, x_N^*$.
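As a concrete instance (the quadratic weights $Q$ and $R$ below are illustrative assumptions, not prescribed above), the finite-horizon linear quadratic regulator fits this template with the stage cost

$$\ell(x_k, u_k) = \tfrac{1}{2}\left( x_k^\top Q x_k + u_k^\top R u_k \right), \qquad Q \succeq 0, \quad R \succ 0,$$

in which case each value function turns out to be quadratic in the state, $V_k(x_k) = \tfrac{1}{2} x_k^\top P_k x_k$ for some matrix $P_k \succeq 0$.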
The so-called Bellman equation can then be derived by formulating the value function for step $k$ recursively, using the value function for step $k+1$. From its definition, it is apparent that

$$V_k(x_k) = \ell(x_k, u_k^*) + V_{k+1}(x_{k+1}^*), \tag{1}$$

where

$$x_{k+1}^* = A x_k + B u_k^*.$$

Assuming we don't know what the optimal input $u_k^*$ at step $k$ is, we may pose the right-hand side of (1) as an optimization problem:

$$V_k(x_k) = \min_{u_k} \left[ \ell(x_k, u_k) + V_{k+1}(A x_k + B u_k) \right]. \tag{2}$$

Equation (2) is then referred to as the Bellman equation.
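To make the recursion concrete, here is a minimal sketch in Python (NumPy) of solving (2) backward in time for the quadratic stage cost above, assuming zero terminal cost, $V_N(x_N) = 0$; the function name, horizon, and system matrices are illustrative, not taken from the text:

```python
import numpy as np

def backward_riccati(A, B, Q, R, N):
    """Solve the Bellman equation (2) backward in time for the stage cost
    l(x, u) = 0.5 * (x' Q x + u' R u).  With V_k(x) = 0.5 * x' P_k x, the
    minimization over u_k in (2) has the closed-form update below.
    (Zero terminal cost V_N = 0 is an assumption, as is all naming.)"""
    P = np.zeros_like(A)              # P_N = 0, i.e. V_N(x) = 0
    gains = []
    for _ in range(N):                # sweep over steps k = N-1, ..., 0
        # u_k^* = -K_k x_k minimizes the bracketed term in (2)
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        # value-function update: P_k = Q + A' P_{k+1} A - A' P_{k+1} B K_k
        P = Q + A.T @ P @ A - A.T @ P @ B @ K
        gains.append(K)
    gains.reverse()                   # gains[k] is now the gain for step k
    return P, gains

# Usage: a double-integrator example with illustrative weights.
A = np.array([[1.0, 0.1],
              [0.0, 1.0]])
B = np.array([[0.0],
              [0.1]])
Q, R = np.eye(2), np.array([[0.1]])
P0, gains = backward_riccati(A, B, Q, R, N=50)

x = np.array([1.0, 0.0])               # prescribed initial state x_0
print("V_0(x_0) =", 0.5 * x @ P0 @ x)  # optimal total cost from x_0
for K in gains:                        # apply the optimal policy forward
    x = A @ x - B @ (K @ x)
```

The loop body is exactly equation (2) specialized to quadratic value functions: setting the gradient of the bracketed term with respect to $u_k$ to zero yields $u_k^* = -(R + B^\top P_{k+1} B)^{-1} B^\top P_{k+1} A\, x_k$, and substituting back gives the update for $P_k$.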