According to [Bellman1966], "an optimal policy has the property that, whatever the initial state and initial decision are, the remaining decisions must constitute an optimal policy with regard to the state resulting from the first decision." Applied recursively, this means that the decisions taken at every state along an optimal trajectory must themselves be optimal with respect to the future evolution of the system. While the statement may sound rather abstract, Bellman's principle is the foundation of optimal control: it applies to both continuous-time and discrete-time systems, and to trajectories considered on finite as well as infinite time horizons.
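As a concrete discrete-time illustration (the notation below, with stage cost \(\ell\), terminal cost \(\ell_N\), dynamics \(f\), and cost-to-go \(V_k\), is chosen here only for this sketch and anticipates the Bellman equation reviewed below), the principle states that the optimal cost-to-go at stage \(k\) is obtained by optimizing the first decision and proceeding optimally thereafter:
\[
V_k(x) \;=\; \min_{u}\bigl\{\,\ell(x,u) + V_{k+1}\bigl(f(x,u)\bigr)\,\bigr\},
\qquad
V_N(x) = \ell_N(x).
\]
In words, once the first decision has been taken, the remaining problem is again an optimal control problem starting from the resulting state, and its solution must coincide with the tail of the original optimal trajectory.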
In the following subsections we review the Bellman equation and the Hamilton-Jacobi-Bellman (HJB) equation, which result from applying Bellman's principle to discrete-time and continuous-time optimal control problems, respectively. We then use both equations to derive variants of the linear-quadratic regulator (LQR), a staple of optimal control.
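As a preview of the kind of result the Bellman equation yields, the sketch below computes the finite-horizon discrete-time LQR feedback gains by propagating a quadratic cost-to-go backward in time. It is only a minimal numerical illustration under assumed names and data: the function name, system matrices, and horizon are hypothetical, and the derivation itself is deferred to the later subsections.

```python
import numpy as np


def finite_horizon_lqr(A, B, Q, R, Qf, N):
    """Backward recursion for the discrete-time finite-horizon LQR.

    Minimizes sum_{k=0}^{N-1} (x_k' Q x_k + u_k' R u_k) + x_N' Qf x_N
    subject to x_{k+1} = A x_k + B u_k, and returns the time-varying
    gains K_0, ..., K_{N-1} of the optimal feedback u_k = -K_k x_k.
    """
    P = Qf                       # cost-to-go is quadratic: V_k(x) = x' P_k x
    gains = []
    for _ in range(N):
        # Minimizing the one-step Bellman recursion over u gives the gain ...
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        # ... and the updated quadratic cost-to-go matrix.
        P = Q + A.T @ P @ A - A.T @ P @ B @ K
        gains.append(K)
    gains.reverse()              # gains were computed backward in time
    return gains


# Illustrative example only: a double integrator with assumed weights.
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.005], [0.1]])
Q, R, Qf = np.eye(2), np.array([[0.1]]), 10 * np.eye(2)
gains = finite_horizon_lqr(A, B, Q, R, Qf, N=50)
```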
To emphasize the relevance of Bellman's principle in present-day research, the differential dynamic programming (DDP) algorithm deserves mention: while the original algorithm was introduced in [Mayne1966], its extensions are the product of research in the current decade.