These are notes for tutorials of the subject "Optimal and Predictive Control System" taught at the Faculty of Mechanical Engineering at the Czech Technical University in Prague.
Dating back to the 1950s, optimal control is an ever-evolving field, currently fueled by the incredible amounts of computational power at our disposal. Although in this course we only cover the fundamentals as applied to linear systems, which can be successfully implemented on microcontrollers, cutting-edge approaches push hardware to its limits, allowing for the control of walking robots, autonomous vehicles, etc.
The materials are loosely divided into three blocks. We cover the basics of numerical optimization, which will become relevant for optimal control only at the very end of the course. Regardless, we will attempt to provide examples that demonstrate their utility as we go. Then we will move on to the main content of the course. Starting from the quite abstract concept of the Bellman principle, we cover the topic of the linear-quadratic regulator (LQR), from which we will transition to model predictive control (MPC).
We will apply each control strategy to the system of a bi-rotor, i.e. a quad-rotor simplified into a single vertical plane. As supplementary material we will show how its dynamic model can be derived.
DISCLAIMER: These notes have not gone through a peer review process and are likely to contain (hopefully only minor) mistakes.
The vast majority of numerical optimization problems can be written in the form $\min_x f(x)$ subject to $g(x) \le 0$ and $h(x) = 0$, where $x$ are the optimized variables, $f$ is the objective function, $g(x) \le 0$ are inequality constraints and $h(x) = 0$ are equality constraints. Here we assume that $f$, $g$, and $h$ are sufficiently smooth and that the constraints $g$ and $h$ define a non-empty set of feasible solutions.
Based on the properties of $f$, $g$, and $h$, different approaches can be taken to solve the optimization problem. For example, if the objective function is convex and the constraints are affine, the problem has a unique solution (optimal value of the objective function) that can be found using gradient descent. In this course we will focus on this type of problem, which is classified as convex.
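As a minimal illustration of gradient descent on a convex objective, the sketch below minimizes an unconstrained quadratic; the matrix and vector used here are arbitrary values chosen purely for this example.

```python
import numpy as np

# Minimal sketch: gradient descent on an unconstrained convex quadratic
# f(x) = 0.5 * x^T Q x - b^T x  (Q and b chosen here purely for illustration).
Q = np.array([[3.0, 1.0],
              [1.0, 2.0]])   # symmetric positive definite -> f is convex
b = np.array([1.0, -1.0])

x = np.zeros(2)              # initial guess
alpha = 0.1                  # fixed step size
for _ in range(200):
    grad = Q @ x - b         # gradient of f
    x -= alpha * grad

print(x, np.linalg.solve(Q, b))  # both should be (nearly) identical
```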
We will start with a general overview of Lagrangian Duality and KKT conditions which are applicable to both convex and non-convex optimization problems. Then we will describe two specific types of convex optimization problems: linear programming (LP) and quadratic programming (QP).
Lagrangian Duality
In the following section we will derive the so-called dual problem of a (primal) optimization problem, following the derivation demonstrated in a series of short lectures which accompany the book [Bierlaire2018]. Possibly the most important property of the dual problem (from a practical point of view) is that it is convex even if the primal optimization problem is not [Boyd2004].
Let us consider a constrained optimization problem in the form $\min_x f(x)$ subject to $g(x) \le 0$ and $h(x) = 0$, to which we will refer as the primal problem. With regard to this problem, let us define the Lagrangian $L(x, \lambda, \mu) = f(x) + \lambda^T g(x) + \mu^T h(x)$, where $\lambda$ and $\mu$ are Lagrange multipliers, and the dual function $q(\lambda, \mu) = \min_x L(x, \lambda, \mu)$.
Theorem
Let $x^*$ be the solution to the primal problem. If $\lambda \ge 0$, then $q(\lambda, \mu) \le f(x^*)$.
Proof
$q(\lambda, \mu) = \min_x L(x, \lambda, \mu) \le L(x^*, \lambda, \mu)$
(For any $\lambda$ and $\mu$, we can find an $x$ that minimizes the Lagrangian as well as or better than $x^*$, e.g. by violating the constraints.)
and
$L(x^*, \lambda, \mu) = f(x^*) + \lambda^T g(x^*) + \mu^T h(x^*) \le f(x^*)$
(Because if $g(x^*) \le 0$, $h(x^*) = 0$, and $\lambda \ge 0$, then $\lambda^T g(x^*) \le 0$ and $\mu^T h(x^*) = 0$.)
Corollary
Let be the solution to the primal problem and feasible w.r.t its constraints. If and , , then
Dual problem
As was proven, $q(\lambda, \mu)$ provides a lower bound on the solution of the primal problem when $\lambda \ge 0$. The dual problem can then be regarded as maximizing this lower bound by penalizing violations of the primal problem's constraints using the Lagrange multipliers: $\max_{\lambda, \mu} q(\lambda, \mu)$ subject to $\lambda \ge 0$ and $q(\lambda, \mu) > -\infty$.
The second condition states that we consider only $\lambda$ and $\mu$ for which the dual function is bounded. This is necessary for the problem to be well posed.
(Weak duality) theorem
If $x^*$ is the solution to the primal problem and $(\lambda^*, \mu^*)$ is the solution to the dual problem, then $q(\lambda^*, \mu^*) \le f(x^*)$.
Corollary
If one problem is unbounded the other is infeasible.
Corollary
If a feasible $x^*$ and multipliers $\lambda^* \ge 0$, $\mu^*$ exist such that $q(\lambda^*, \mu^*) = f(x^*)$, then they are optimal.
Karush-Kuhn-Tucker Conditions
The Karush-Kuhn-Tucker (KKT) conditions are first-order necessary conditions for finding the solution of an optimization problem in the form $\min_x f(x)$ subject to $g(x) \le 0$ and $h(x) = 0$. Furthermore, they are also sufficient conditions for convex problems, i.e. those where $f$ and $g$ are convex and $h$ is affine (linear) [Boyd2004].
A solution satisfying these conditions gives us not only $x^*$ but also the Lagrange multipliers $\lambda^*$ and $\mu^*$ present in the Lagrangian $L(x, \lambda, \mu) = f(x) + \lambda^T g(x) + \mu^T h(x)$.
The conditions are as follows:
- stationarity (1): $\nabla f(x^*) + \nabla g(x^*)\,\lambda^* + \nabla h(x^*)\,\mu^* = 0$ (The linear combination of the constraints' gradients has to be equal to the objective function's gradient.)
- primal feasibility (2-3): $g(x^*) \le 0$, $h(x^*) = 0$ (Constraints of the stated optimization problem have to be satisfied.)
- dual feasibility (4): $\lambda^* \ge 0$ (For a given $x$, the columns of $\nabla g(x)$ define a polyhedral cone in which the negative of the objective function's gradient must lie.)
- complementary slackness (5): $\lambda_i^*\, g_i(x^*) = 0$ (If $x^*$ lies strictly inside the feasible set w.r.t. the condition $g_i(x) \le 0$, then $g_i(x^*) < 0$ and therefore $\lambda_i^* = 0$.)
Physical interpretation
The KKT conditions can be intuitively derived through an analogy where the objective function is a potential field, for example of gravity or magnetism. Let us first address only the inequality-constrained case, where the inequality constraints can be regarded as physical barriers. If we then recall the Lagrangian, the terms containing the multipliers $\lambda$ can be viewed as contact forces acting against the influence of the potential field.
From this analogy we can immediately recover the dual feasibility and complementary slackness conditions (4-5), as contact forces can only be positive and do not act at a distance. The stationarity condition (1) (without the term related to the equality constraint) can then be interpreted as the contact forces and the pull of the potential field being in equilibrium, which must occur at a local minimum. Lastly, we are considering only solutions within the bounds of the inequality constraints, as if we were initially placed within them, satisfying the primal feasibility condition (2).
To incorporate equality constraints into this analogy we can simply reformulate each of them as a pair of inequality constraints, likening them to being sandwiched between two surfaces. As we are always in contact, there is no need for a complementary slackness condition, and due to its bi-directional nature, elements of $\mu$ can be both positive and negative. Therefore we must only include a primal feasibility condition (2) and add an additional term to the stationarity condition (1).
Linear Programming
Let us consider a constrained optimization problem in the form
As both the objective function and the constraints are linear, the problem is convex. For this reason, the KKT conditions are not only necessary but also sufficient, provided a point feasible w.r.t. the constraints exists.
KKT conditions
The KKT conditions of this problem can be stated as:
Dual problem
The dual problem of the LP problem can be written as where the Lagrangian can be manipulated into the form As we are only looking for bounded solutions (last constraint), the condition must be satisfied and therefore When substituted back into the original dual problem, after basic manipulations, we attain
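To make the LP form concrete, here is a minimal sketch using CVXPY (one of the modeling languages listed at the end of these notes); the cost vector $c$, constraint matrix $A$ and right-hand side $b$ are arbitrary illustrative values, and the particular inequality-form LP below is our own choice, not necessarily the exact form used above.

```python
import cvxpy as cp
import numpy as np

# Minimal sketch: a small LP, min c^T x  s.t.  A x <= b, x >= 0.
# The data below are arbitrary illustrative values.
c = np.array([-1.0, -2.0])
A = np.array([[-1.0, 1.0],
              [ 2.0, 1.0]])
b = np.array([1.0, 4.0])

x = cp.Variable(2)
problem = cp.Problem(cp.Minimize(c @ x), [A @ x <= b, x >= 0])
problem.solve()

print("optimal value:", problem.value)
print("x*:", x.value)
# The multipliers of the inequality constraints (dual variables) are also
# available, e.g. problem.constraints[0].dual_value.
```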
Quadratic Programming
Let us consider a constrained optimization problem in the form where .
As the objective function is quadratic and the constraints are linear, the problem is convex. For this reason, the KKT conditions are not only necessary but also sufficient, provided a point feasible w.r.t. the constraints exists.
KKT Conditions
The KKT conditions of this problem can be stated as:
Dual problem
The dual problem of the QP problem can be written as where the Lagrangian can be manipulated into the form Its critical point can be found by solving which yields When substituted back into the original dual problem, after basic manipulations, we attain
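Analogously, a small QP can be posed and solved with CVXPY; the matrices $P$, $q$, $G$, $h$, $A$, $b$ and their symbols are our own illustrative choices and need not match the notation used above.

```python
import cvxpy as cp
import numpy as np

# Minimal sketch: a small QP, min 0.5 x^T P x + q^T x  s.t.  G x <= h, A x == b.
# All data are arbitrary illustrative values; P is symmetric positive definite.
P = np.array([[4.0, 1.0],
              [1.0, 2.0]])
q = np.array([1.0, 1.0])
G = -np.eye(2)
h = np.zeros(2)            # encodes x >= 0
A = np.array([[1.0, 1.0]])
b = np.array([1.0])

x = cp.Variable(2)
objective = cp.Minimize(0.5 * cp.quad_form(x, P) + q @ x)
problem = cp.Problem(objective, [G @ x <= h, A @ x == b])
problem.solve()

print("optimal value:", problem.value)
print("x*:", x.value)
```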
According to [Bellman1966] "an optimal policy has the property that, whatever the initial state and initial decision are, the remaining decisions must constitute an optimal policy with regard to the state resulting from the first decision." When applied recursively, this means that decisions at all states of an optimal trajectory must appear to be optimal when considering only the future behavior of the system. While the statement might sound rather vague, Bellman's principle is the foundation of optimal control, applicable to both continuous-time and discrete-time systems and trajectories analyzed not only on finite but also infinite time horizons.
In the following subsections we will overview the Bellman and the Hamilton-Jacobi-Bellman (HJB) equations, which are the consequence of Bellman's principle when applied to the discrete-time and continuous-time optimal control problems. Regardless of the representation, the idea is to find a sequence of control inputs that minimizes the total cost of a trajectory, defined on a horizon. Instead of looking for the entire sequence at once, the Bellman principle allows us to break it down into a sequence of problems, described by the aforementioned equations. For their derivation we introduce the concept of the cost-to-go, i.e. the remainder of the total cost from any point on the horizon to its end, assuming a particular sequence of inputs. In the case of the optimal sequence of inputs we refer to the corresponding cost-to-go as the value function. The value function figures on the left-hand side of both equations, where it is expressed recursively.
This is a high-level overview which, without additional context, might be incomprehensible. Hopefully the following subsections will provide the necessary context in the form of rigorous derivations and later also applications. The derivations especially are likely to be sparse on text, so always feel free to jump back.
To emphasize the relevance of Bellman's principle in present-day research, a mention of the differential dynamic programming algorithm is called for. While the original algorithm was introduced in [Mayne1966], its extensions are a vibrant area of contemporary research.
Bellman Equation
For a discrete-time linear system, with dynamics in the form let us assume we are trying to minimize the total cost of a trajectory where the initial state is prescribed. The concept of the total cost can be generalized for any to a cost-to-go for which we may define a value function and an optimal control policy the application of which results in the system following the optimal sequence of states .
The so-called Bellman equation can then be derived by formulating the value function for step $k$ recursively, using the value function for step $k+1$. From its definition, it is apparent that where Assuming we don't know what the optimal input at step $k$ is, we may pose the right-hand side of (1) as an optimization problem Equation (2) is then referred to as the Bellman equation.
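To make the recursion tangible, the following sketch runs the backward pass on a small, entirely artificial finite problem; the states, inputs, dynamics and costs are randomly generated for illustration only.

```python
import numpy as np

# Minimal sketch of the Bellman recursion on an artificial finite problem:
# 5 states, 2 inputs, deterministic dynamics, horizon N = 4.
n_x, n_u, N = 5, 2, 4
rng = np.random.default_rng(0)

# next_state[x, u] and stage_cost[x, u] are made up for illustration.
next_state = rng.integers(0, n_x, size=(n_x, n_u))
stage_cost = rng.random((n_x, n_u))
final_cost = rng.random(n_x)

V = np.zeros((N + 1, n_x))          # value function V_k(x)
policy = np.zeros((N, n_x), dtype=int)
V[N] = final_cost                   # boundary condition at the horizon's end

# Backward pass: V_k(x) = min_u [ l(x, u) + V_{k+1}(f(x, u)) ]
for k in range(N - 1, -1, -1):
    q = stage_cost + V[k + 1][next_state]   # cost of each (state, input) pair
    policy[k] = np.argmin(q, axis=1)
    V[k] = np.min(q, axis=1)

print(V[0])        # optimal cost-to-go from every initial state
print(policy[0])   # optimal first input for every initial state
```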
Hamilton-Jacobi-Bellman Equation
Let us assume we are trying to minimize the total cost of a continuous-time system's trajectory , with dynamics in the form starting from the state .
The concept of the total cost can be generalized for any to a cost-to-go for which we may define a value function and optimal control policy the application of which results in the system following an optimal trajectory , .
For the previously defined value function the Hamilton-Jacobi-Bellman (HJB) equation can be derived1 as
An overview of the derivation is presented by Steven Brunton in one of his videos [Brunton2022-HJB] also available here. As a note, there is a small mistake, acknowledged by the presenter in the comments, at 9:11 of the video where "" should be replaced with "".
Linear-Quadratic Regulator
The linear-quadratic regulator gets its name from the linear model of the system and the quadratic cost function used for its design. For these, the optimal controller can be derived analytically and takes the form of linear state feedback making its implementation particularly simple.
The systems can be described in both the discrete-time and continuous-time domain and particular variants of the LQR allow for time-variant systems. Cost functions must then reflect the type of the system. For time-invariant systems, the cost function can either take the form of a time-invariant running cost integrated over an infinite horizon, or a sum of a final cost and a running cost (optionally time-variant) integrated over a finite horizon. If the system is time-variant, only the second option is available.
In the following sections we will cover four variants based on the time domain and the horizon:
- discrete-time x continuous-time,
- finite-horizon x infinite-horizon.
Many more variants exist including those for reference tracking, dead-beat control, and even nonlinear trajectory optimization.
Discrete-Time Infinite-Horizon Linear-Quadratic Regulator
For a linear time-invariant discrete-time system and a quadratic total cost of its trajectory the optimal controller can be derived based on the assumption that the value function on the infinite horizon takes the time-invariant form
When substituted into the Bellman Equation along with the system's dynamics we attain To find the minimum, we may take the gradient of its argument (which is by design quadratic and convex) with respect to , set it to zero and find the solution (optimal control input)
The input can then be substituted back into (1). As the equation must hold for all , through basic manipulations we then attain the discrete-time algebraic Riccati equation (DARE)
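Numerically, the DARE can be solved with standard library routines. The sketch below does so for an illustrative double-integrator model, assuming the standard notation $x_{k+1} = A x_k + B u_k$, stage cost $x^T Q x + u^T R u$ and feedback $u_k = -K x_k$; the numerical values are our own and only serve as an example.

```python
import numpy as np
from scipy.linalg import solve_discrete_are

# Minimal sketch of an infinite-horizon discrete-time LQR (standard notation
# assumed).  A, B, Q, R are illustrative values for a double integrator
# sampled at 0.1 s.
dt = 0.1
A = np.array([[1.0, dt],
              [0.0, 1.0]])
B = np.array([[0.5 * dt**2],
              [dt]])
Q = np.diag([1.0, 0.1])
R = np.array([[0.01]])

# Solve the DARE:  P = Q + A^T P A - A^T P B (R + B^T P B)^{-1} B^T P A
P = solve_discrete_are(A, B, Q, R)
K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)

# The closed-loop matrix A - B K should have all eigenvalues inside the unit circle.
print("K =", K)
print("closed-loop eigenvalues:", np.linalg.eigvals(A - B @ K))
```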
Discrete-Time Finite-Horizon Linear-Quadratic Regulator
For a linear time-variant discrete-time system and a quadratic total cost of its trajectory the optimal controller can be derived based on the assumption that the value function takes the form When substituted into the Bellman Equation along with the system's dynamics we attain To find the minimum, we may take the gradient of its argument (which is by design quadratic and convex) with respect to , set it to zero and find the solution (optimal control input)
The input can then be substituted back into (1). As the equation must hold for all , through basic manipulations we then attain the discrete-time dynamic Riccati equation (DDRE) for a finite horizon .
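A minimal sketch of the resulting backward recursion is shown below; it again assumes the standard symbols ($A$, $B$, $Q$, $R$, final weight $Q_f$) with illustrative values and produces a time-varying gain sequence $K_k$.

```python
import numpy as np

# Minimal sketch of the finite-horizon backward Riccati recursion, assuming
# x_{k+1} = A x_k + B u_k, stage cost x^T Q x + u^T R u, final cost x^T Q_f x,
# and the time-varying feedback u_k = -K_k x_k.  All matrices are illustrative.
dt = 0.1
A = np.array([[1.0, dt], [0.0, 1.0]])
B = np.array([[0.5 * dt**2], [dt]])
Q = np.diag([1.0, 0.1])
R = np.array([[0.01]])
Q_f = np.diag([10.0, 1.0])
N = 50                                   # horizon length

P = Q_f                                  # boundary condition P_N = Q_f
K = [None] * N
for k in range(N - 1, -1, -1):
    K[k] = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    # DDRE step: P_k = Q + A^T P_{k+1} A - A^T P_{k+1} B K_k
    P = Q + A.T @ P @ A - A.T @ P @ B @ K[k]

# Simulate the closed loop from an illustrative initial state.
x = np.array([1.0, 0.0])
for k in range(N):
    u = -K[k] @ x
    x = A @ x + B @ u
print("final state:", x)
```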
Continuous-Time Infinite-Horizon Linear-Quadratic Regulator
For a linear time-invariant continuous-time system and a quadratic total cost where and , of its trajectory , the optimal controller can be derived based on the assumption that the value function takes the form When substituted into the Hamilton-Jacobi-Bellman Equation along with the system's dynamics we attain To find the minimum, we may take the gradient of its argument (which is by design quadratic and convex) with respect to , set it to zero and find the solution (optimal control input)
The input can then be substituted back into (1). As the equation must hold for all , through basic manipulations we then attain the continuous-time algebraic Riccati equation (CARE)
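As with the DARE, library routines exist for the CARE. The sketch below assumes the usual symbols ($\dot{x} = A x + B u$, weights $Q$, $R$, gain $K = R^{-1} B^T P$) and an illustrative double-integrator model.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Minimal sketch of a continuous-time infinite-horizon LQR (standard notation
# assumed).  A, B, Q, R are illustrative values for a double integrator.
A = np.array([[0.0, 1.0],
              [0.0, 0.0]])
B = np.array([[0.0],
              [1.0]])
Q = np.diag([1.0, 0.1])
R = np.array([[0.01]])

# Solve the CARE:  A^T P + P A - P B R^{-1} B^T P + Q = 0
P = solve_continuous_are(A, B, Q, R)
K = np.linalg.solve(R, B.T @ P)

print("K =", K)
# The closed-loop eigenvalues should all have negative real parts.
print("closed-loop eigenvalues:", np.linalg.eigvals(A - B @ K))
```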
Continuous-Time Finite-Horizon Linear-Quadratic Regulator
For a linear time-variant continuous-time system and a quadratic total cost where , , and , of its trajectory , the optimal controller can be derived based on the assumption that the value function takes the form When substituted into the Hamilton-Jacobi-Bellman Equation along with the system's dynamics we attain To find the minimum, we may take the gradient of its argument (which is by design quadratic and convex) with respect to , set it to zero and find the solution (optimal control input)
The input can then be substituted back into (1). As the equation must hold for all , through basic manipulations we then attain the continuous-time differential Riccati equation (CDRE) for a finite horizon . Supplemented with the boundary condition , the CDRE forms an initial value problem (IVP) which can be solved using numerical integration. Numerical errors in the integration process may lead to the loss of positive semi-definiteness. To overcome this, instead of integrating directly, we may use its factorized form a.k.a. the "square-root form" where As must be invertible must be (at least numerically) positive definite. Consequently we may use the Cholesky factorization in order to form the boundary condition
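For illustration, the following sketch integrates the plain (non-square-root) CDRE backwards in time with SciPy; the assumed form $-\dot{P} = A^T P + P A - P B R^{-1} B^T P + Q$ with $P(t_f) = Q_f$, as well as all matrices and the horizon, are our own choices for this example.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Minimal sketch: integrating the (plain, non-square-root) CDRE backwards in
# time, assuming -dP/dt = A^T P + P A - P B R^{-1} B^T P + Q with P(t_f) = Q_f.
# Matrices and the horizon are illustrative.
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
Q = np.diag([1.0, 0.1])
R = np.array([[0.01]])
Q_f = np.diag([10.0, 1.0])
t_f = 2.0

def cdre_backwards(tau, p_flat):
    # Substituting tau = t_f - t turns the terminal condition into an initial
    # condition and flips the sign of the time derivative.
    P = p_flat.reshape(2, 2)
    dP = A.T @ P + P @ A - P @ B @ np.linalg.solve(R, B.T @ P) + Q
    return dP.ravel()

sol = solve_ivp(cdre_backwards, (0.0, t_f), Q_f.ravel())
P0 = sol.y[:, -1].reshape(2, 2)   # P at t = 0 (i.e. tau = t_f)
print("P(0) =", P0)
```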
Linear-Quadratic MPC
Model predictive control (MPC) is a control strategy which relies on the system's model to predict and optimize its trajectory online, periodically re-optimizing the trajectory as a form of feedback. If a linear representation of the system and quadratic running and final costs are used, it is essentially an extension of the finite-horizon LQR that includes constraints on the states and inputs. This is in fact what is commonly referred to as MPC. If the system's dynamics are nonlinear or the cost functions contain terms that are neither linear nor quadratic, the strategy is referred to as nonlinear MPC.
In this course we will limit ourselves only to the former case which can be transcribed into a QP problem. We will go through two formulations of this problem based on how the system's dynamics are incorporated into the optimization problem:
- explicitly constrained - both states and inputs are the decision variables,
- implicitly constrained - only inputs are decision variables.
Direct Linear-Quadratic MPC
The standard realization of an MPC controller is one that solves the optimization problem where the initial state is given1.
Canonical QP form
Optimized variables
Objective function's terms
Constraints' terms
If the dynamics represent the linearized dynamics of a nonlinear system (i.e. and are in fact and ), the value of must be considered in the choice of limits and .
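Below is a minimal sketch of the explicitly constrained formulation in CVXPY, with both states and inputs as decision variables; the model, weights, input limit, horizon and initial state are all illustrative values of our own choosing.

```python
import cvxpy as cp
import numpy as np

# Minimal sketch of the "explicitly constrained" linear-quadratic MPC problem:
# both the state and the input trajectories are decision variables and the
# dynamics enter as equality constraints.  All numbers are illustrative.
dt, N = 0.1, 20
A = np.array([[1.0, dt], [0.0, 1.0]])
B = np.array([[0.5 * dt**2], [dt]])
Q = np.diag([1.0, 0.1])
R = np.array([[0.01]])
Q_f = np.diag([10.0, 1.0])
u_max = 2.0
x0 = np.array([1.0, 0.0])

x = cp.Variable((2, N + 1))
u = cp.Variable((1, N))

cost = cp.quad_form(x[:, N], Q_f)                 # final cost
constraints = [x[:, 0] == x0]                     # prescribed initial state
for k in range(N):
    cost += cp.quad_form(x[:, k], Q) + cp.quad_form(u[:, k], R)
    constraints += [x[:, k + 1] == A @ x[:, k] + B @ u[:, k],   # dynamics
                    cp.abs(u[:, k]) <= u_max]                    # input limits
problem = cp.Problem(cp.Minimize(cost), constraints)
problem.solve()

# In closed loop, only the first input u[:, 0] would be applied before
# re-solving the problem from the newly measured state.
print("first input:", u[:, 0].value)
```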
Controllability/Reachability
Controllability and reachability are defined for discrete-time systems as:
- A system is controllable if from any state one can achieve the state using a series of inputs .
- A state is reachable if it can be achieved by applying a series of inputs when starting from the initial state .
For continuous-time systems they are similarly defined as:
- A system is controllable if from any state one can achieve the state using a control policy , .
- A state is reachable if it can be achieved by applying a control policy , when starting from the initial state .
For linear systems all states are reachable if the system is controllable.
Controllability of LTI Discrete-Time Systems
The first states of a linear time-invariant system, assuming an initial state and a series of inputs , can be expressed as Since every square matrix satisfies its own characteristic equation (the Cayley-Hamilton theorem), for we may express as a linear combination of the lower matrix powers of : Therefore, the state can be rewritten as which can be manipulated into the form where is the controllability matrix.
The same substitution can be applied also for subsequent timesteps up to infinity, changing only the particular form of the vector of inputs' linear combinations. This has two consequences:
- All states reachable in steps are also reachable in steps (with unlimited inputs).
- If is rank deficient, some directions in the state-space cannot be affected by the inputs and therefore the system is uncontrollable (a numerical rank check is sketched below).
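A minimal numerical check of this rank condition is sketched below; the matrices $A$ and $B$ are illustrative and deliberately chosen so that one mode cannot be influenced by the input.

```python
import numpy as np

# Minimal sketch: building the controllability matrix [B, AB, ..., A^{n-1}B]
# and checking its rank.  A and B are illustrative (one uncontrollable mode).
A = np.array([[1.0, 0.0],
              [0.0, 0.5]])
B = np.array([[1.0],
              [0.0]])
n = A.shape[0]

C = np.hstack([np.linalg.matrix_power(A, i) @ B for i in range(n)])
print("controllability matrix:\n", C)
print("controllable:", np.linalg.matrix_rank(C) == n)   # False for this pair
```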
Controllability of LTI Continuous-Time Systems
The state of a linear time-invariant system at time , starting from the initial state and influenced by a continuous input , can be expressed as Since every square matrix satisfies its own characteristic equation, the Cayley-Hamilton theorem has the consequence that the infinite series can be expressed using a finite number of terms After substituting it back into the second term of (1) we attain which can be further manipulated into the form where is the controllability matrix. If is rank deficient, some directions in the state-space cannot be affected by the inputs and therefore the system is uncontrollable.
Discretization of LTI Systems
Let us consider a linear time-invariant system in the form
Integration of the system's dynamics over an interval gives us To express its state at with respect to the state we may perform a series of manipulations If we then set , the integral can be manipulated into By substituting (2) into (1) we obtain where , , and .
If we use the first two terms of the Taylor expansion in (3) we get the system representation
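The sketch below compares the exact zero-order-hold discretization (computed via the matrix exponential of an augmented matrix) with the first-order Euler approximation; the system matrices and sampling period are illustrative values.

```python
import numpy as np
from scipy.linalg import expm

# Minimal sketch: discretizing x' = A x + B u with a zero-order hold on u.
# The exact discretization and the first-order (Euler) approximation are
# compared; A, B and the sampling period are illustrative.
A = np.array([[0.0, 1.0],
              [0.0, -0.5]])
B = np.array([[0.0],
              [1.0]])
dt = 0.1
n, m = A.shape[0], B.shape[1]

# Exact: the exponential of the augmented matrix [[A, B], [0, 0]] * dt yields
# [[A_d, B_d], [0, I]], so A_d and B_d can be read off its top block row.
M = expm(np.block([[A, B], [np.zeros((m, n + m))]]) * dt)
A_d, B_d = M[:n, :n], M[:n, n:]

# Euler (first two Taylor terms): A_d ~ I + dt*A,  B_d ~ dt*B.
A_euler = np.eye(n) + dt * A
B_euler = dt * B

print("A_d:\n", A_d, "\nA_euler:\n", A_euler)
print("B_d:\n", B_d, "\nB_euler:\n", B_euler)
```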
Optimal Luenberger Observer
Finite horizon
Let us consider an LTI system with additive Gaussian noise and a Luenberger observer The state estimate's error and its covariance can be expressed recursively as The gain can then be designed such that is minimized. This can be achieved by finding the stationary point , where Substituting (3) back into (2b) we attain the DDRE which governs the evolution of the state error's covariance.
The state of the observer then consists of and , its dynamics described by (1) and (4). This observer can also be used in the case where the dynamics of the system are time variant.
Infinite horizon
In the case where the covariances of the process and measurement noise are time-invariant, i.e. we can also design the gain of the observer based on the steady-state error covariance at . The covariance can be obtained by solving the DARE based on which we evaluate the gain
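A minimal sketch of this steady-state design is shown below; it exploits the duality with the LQR by handing $A^T$ and $C^T$ to SciPy's DARE solver, and the noise-covariance symbols ($V$ for process, $W$ for measurement) as well as all numerical values are our own illustrative assumptions.

```python
import numpy as np
from scipy.linalg import solve_discrete_are

# Minimal sketch: steady-state observer gain for x_{k+1} = A x_k + B u_k + w_k,
# y_k = C x_k + v_k, with process/measurement noise covariances V and W
# (symbols assumed here).  Using the LQR duality, the error-covariance DARE is
# solved with A^T and C^T in place of A and B.  Values are illustrative.
dt = 0.1
A = np.array([[1.0, dt], [0.0, 1.0]])
C = np.array([[1.0, 0.0]])
V = np.diag([1e-4, 1e-3])     # process noise covariance
W = np.array([[1e-2]])        # measurement noise covariance

P = solve_discrete_are(A.T, C.T, V, W)            # steady-state error covariance
L = A @ P @ C.T @ np.linalg.inv(C @ P @ C.T + W)  # predictor-form observer gain

print("L =", L)
# The estimation-error dynamics A - L C should be stable (eigenvalues inside
# the unit circle).
print("error dynamics eigenvalues:", np.linalg.eigvals(A - L @ C))
```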
Duality with LQR
It is of note that the form of (4) and (5) is identical to that of the LQR equations with different matrices:
| LQR | Luenberger |
| --- | --- |
| state matrix $A$ | $A^T$ |
| input matrix $B$ | $C^T$ |
| state cost weight | process noise covariance |
| input cost weight | measurement noise covariance |
Kalman Filter
Let us consider an LTI system with additive Gaussian noise In addition to the state estimate the state of the Kalman Filter includes the error covariance where is the state estimate's error.
The Kalman filter estimates the state in two steps, first creating an a priori state estimate and error covariance where and then performing a correction based on the measurement and Kalman gain , to attain the a posteriori state estimate and error covariance as The Kalman gain is designed to minimize the a posteriori error covariance by finding the stationary point of (1b) After substituting (2) into (1b) we can manipulate the equation into the "Joseph" form or alternatively the simplified form which is numerically less stable.
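One predict-correct iteration can be sketched as follows; the system matrices, the noise-covariance symbols ($V$, $W$) and all values are illustrative assumptions, and the covariance update uses the Joseph form mentioned above.

```python
import numpy as np

# Minimal sketch of one Kalman filter iteration (predict + update), assuming
# x_{k+1} = A x_k + B u_k + w_k,  y_k = C x_k + v_k, with noise covariances
# V (process) and W (measurement).  All symbols and values are illustrative.
dt = 0.1
A = np.array([[1.0, dt], [0.0, 1.0]])
B = np.array([[0.5 * dt**2], [dt]])
C = np.array([[1.0, 0.0]])
V = np.diag([1e-4, 1e-3])
W = np.array([[1e-2]])

def kalman_step(x_est, P, u, y):
    # Prediction: a priori estimate and covariance.
    x_pri = A @ x_est + B @ u
    P_pri = A @ P @ A.T + V
    # Update: Kalman gain, a posteriori estimate and covariance (Joseph form).
    K = P_pri @ C.T @ np.linalg.inv(C @ P_pri @ C.T + W)
    x_post = x_pri + K @ (y - C @ x_pri)
    I_KC = np.eye(2) - K @ C
    P_post = I_KC @ P_pri @ I_KC.T + K @ W @ K.T
    return x_post, P_post

x_est, P = np.zeros(2), np.eye(2)
x_est, P = kalman_step(x_est, P, u=np.array([0.1]), y=np.array([0.05]))
print(x_est, P)
```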
State-Space Representation of Mechanical Systems
Equations of Motion
Let us consider manipulator equations in the form whose left-hand side may for example be derived from the system's kinetic and potential energy expressed in generalized coordinates by expanding the Lagrange equations of the second kind where . First by separating the Lagrangian into its components and then applying the chain rule If the external forces are linearly dependent on inputs , such that , the terms of (1) as derived by this approach are
Continuous-time dynamics
The most commonly used state-space representation of a mechanical system in continuous time is attained by concatenating and to form the system's state
Linearization
The system's dynamics can be linearized by performing a Taylor expansion around a point which yields where also .
If is a fixed point, i.e. , first order partial derivatives of the system's dynamics simplify to as terms involving drop out because for all fixed points. Partial derivatives of also disappear as all of its terms contain second degree products of velocities and all velocities must be equal to zero at a fixed point.
Discretization via integration using the explicit Euler scheme
The most basic approach with which we may discretize the continuous-time dynamics of a nonlinear system is to integrate the system's state with a timestep of using the explicit Euler scheme
Linearization
Compared to more advanced integration methods, linearization of these discrete-time dynamics around a point is trivial:
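As a concrete, purely illustrative example, the sketch below applies the explicit Euler scheme and the corresponding discrete-time linearization to a simple pendulum, chosen only because its Jacobians are easy to write by hand.

```python
import numpy as np

# Minimal sketch: explicit-Euler discretization of nonlinear dynamics and the
# corresponding discrete-time linearization A_d = I + dt * df/dx,
# B_d = dt * df/du.  A simple pendulum is used purely for illustration.
g, l, dt = 9.81, 1.0, 0.01

def f(x, u):
    # Continuous-time dynamics x' = f(x, u); x = [angle, angular velocity].
    return np.array([x[1], -g / l * np.sin(x[0]) + u[0]])

def step_euler(x, u):
    # Explicit Euler: x_{k+1} = x_k + dt * f(x_k, u_k).
    return x + dt * f(x, u)

def linearize_discrete(x, u):
    # Jacobians of f, written analytically for this simple example.
    dfdx = np.array([[0.0, 1.0],
                     [-g / l * np.cos(x[0]), 0.0]])
    dfdu = np.array([[0.0],
                     [1.0]])
    return np.eye(2) + dt * dfdx, dt * dfdu

x_eq, u_eq = np.array([np.pi, 0.0]), np.array([0.0])   # upright fixed point
A_d, B_d = linearize_discrete(x_eq, u_eq)
print(A_d, B_d)
print(step_euler(x_eq, u_eq))   # a fixed point maps to itself
```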
Cart-Pole Equations of Motion
\
/\
/ \
/ \
l \
/ >< m_p
/ //
/ //
\/ //
\ //
\ //
| |------\--//------|
| | \//\ |
g | m_c ( ) theta |
| | |_/ |
v |========|========|
_________ooo___|___ooo_________
|
//|-----s----->|
For generalized coordinates the system's kinetic and potential energy are where The individual terms of the manipulator equations1 are then assuming a single input is acting in the direction of .
Linearization in the upright configuration
From the equations we may see that for the state and input vectors all points are stationary. In this configuration the relevant terms for the system's linearization are
The python script for the derivation of the following terms can be found here. A very similar form of these equations of motion can also be found in section 3.2.1 of [Underactuated2023].
Bi-Rotor Equations of Motion
As a bi-rotor (quadcopter simplified to 2D) is essentially a rigid body, its equations of motion can be easily derived as
Linearization in a hovering configuration
From the EoM we may see that for the state description all points are stationary points, for which the system's linearization is trivial
Derivation in manipulator equation form
For those who want to validate the equations by deriving the manipulator equations, kinetic and potential energy for generalized coordinates are which should yield The manipulation matrix can then be derived separately by evaluating thrust vectors and differentiating moment arms.
Optimization Modeling Languages
CVXPY
"Open source Python-embedded modeling language for convex optimization problems"
Pyomo
"Python-based, open-source optimization modeling language with a diverse set of optimization capabilities"
JuMP
"JuMP is a modeling language and collection of supporting packages for mathematical optimization in Julia"
Solvers
OSQP
"Numerical optimization package for solving convex quadratic programs"
- Uses the alternating direction method of multipliers (ADMM)
IPOPT
"Open source software package for large-scale nonlinear optimization"
- Uses the interior point (IP) method