Bellman Equation
For a discrete-time linear system with dynamics in the form

$$x_{k+1} = A x_k + B u_k,$$

let us assume we are trying to minimize the total cost of a trajectory

$$J = \sum_{k=0}^{N-1} \ell(x_k, u_k),$$

where the initial state $x_0$ is prescribed. The concept of the total cost can be generalized, for any $k \in \{0, 1, \dots, N-1\}$, to a cost-to-go

$$J_k = \sum_{i=k}^{N-1} \ell(x_i, u_i),$$

for which we may define a value function

$$V_k(x_k) = \min_{u_k, \dots, u_{N-1}} J_k$$

and an optimal control policy $u_k^* = \pi_k^*(x_k)$, the application of which results in the system following the optimal sequence of states $x_{k+1}^*, x_{k+2}^*, \dots, x_N^*$.
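As a concrete instance (the quadratic weights $Q$ and $R$ below are illustrative assumptions, not prescribed above), the finite-horizon linear quadratic regulator fits this template with the stage cost

$$\ell(x_k, u_k) = \tfrac{1}{2}\left( x_k^\top Q x_k + u_k^\top R u_k \right), \qquad Q \succeq 0, \quad R \succ 0,$$

in which case each value function turns out to be quadratic in the state, $V_k(x_k) = \tfrac{1}{2} x_k^\top P_k x_k$ for some matrix $P_k \succeq 0$.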
The so-called Bellman equation can then be derived by formulating the value function for step $k$ recursively, using the value function for step $k+1$. From its definition, it is apparent that

$$V_k(x_k) = \ell(x_k, u_k^*) + V_{k+1}(x_{k+1}^*), \tag{1}$$

where

$$x_{k+1}^* = A x_k + B u_k^*.$$

Assuming we don't know what the optimal input $u_k^*$ at step $k$ is, we may pose the right-hand side of (1) as an optimization problem:

$$V_k(x_k) = \min_{u_k} \left[ \ell(x_k, u_k) + V_{k+1}(A x_k + B u_k) \right]. \tag{2}$$

Equation (2) is then referred to as the Bellman equation.
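To make the recursion concrete, here is a minimal sketch in Python (NumPy) of solving (2) backward in time for the quadratic stage cost above, assuming zero terminal cost, $V_N(x_N) = 0$; the function name, horizon, and system matrices are illustrative, not taken from the text:

```python
import numpy as np

def backward_riccati(A, B, Q, R, N):
    """Solve the Bellman equation (2) backward in time for the stage cost
    l(x, u) = 0.5 * (x' Q x + u' R u).  With V_k(x) = 0.5 * x' P_k x, the
    minimization over u_k in (2) has the closed-form update below.
    (Zero terminal cost V_N = 0 is an assumption, as is all naming.)"""
    P = np.zeros_like(A)              # P_N = 0, i.e. V_N(x) = 0
    gains = []
    for _ in range(N):                # sweep over steps k = N-1, ..., 0
        # u_k^* = -K_k x_k minimizes the bracketed term in (2)
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        # value-function update: P_k = Q + A' P_{k+1} A - A' P_{k+1} B K_k
        P = Q + A.T @ P @ A - A.T @ P @ B @ K
        gains.append(K)
    gains.reverse()                   # gains[k] is now the gain for step k
    return P, gains

# Usage: a double-integrator example with illustrative weights.
A = np.array([[1.0, 0.1],
              [0.0, 1.0]])
B = np.array([[0.0],
              [0.1]])
Q, R = np.eye(2), np.array([[0.1]])
P0, gains = backward_riccati(A, B, Q, R, N=50)

x = np.array([1.0, 0.0])               # prescribed initial state x_0
print("V_0(x_0) =", 0.5 * x @ P0 @ x)  # optimal total cost from x_0
for K in gains:                        # apply the optimal policy forward
    x = A @ x - B @ (K @ x)
```

The loop body is exactly equation (2) specialized to quadratic value functions: setting the gradient of the bracketed term with respect to $u_k$ to zero yields $u_k^* = -(R + B^\top P_{k+1} B)^{-1} B^\top P_{k+1} A\, x_k$, and substituting back gives the update for $P_k$.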