Model Predictive Control
From the state-space model to a quadratic program — and a Python script you can actually run.
1. The idea in one paragraph
Model Predictive Control (MPC) is feedback control by repeated optimization. At every sample $k$ you measure the current state $x_k$, predict how the plant will behave over the next $N$ steps for any candidate input sequence $u_k, u_{k+1}, \dots, u_{k+N-1}$, pick the sequence that minimizes a cost, apply only the first input, and repeat at $k+1$. This is the receding-horizon principle. Constraints on inputs and states are handled explicitly as part of the optimization, which is the feature that distinguishes MPC from classical PID or LQR.
2. Plant model
We work in discrete time, with the linear time-invariant model
$$ x_{k+1} = A\,x_k + B\,u_k, \qquad y_k = C\,x_k. $$
$x_k \in \mathbb{R}^{n}$ is the state, $u_k \in \mathbb{R}^{m}$ the input. If you have a continuous-time model $\dot x = A_c x + B_c u$, discretize it (zero-order hold) with sampling time $T_s$: $A = e^{A_c T_s}$, $B = \int_0^{T_s} e^{A_c \tau} d\tau\, B_c$.
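The zero-order-hold formulas need not be evaluated by hand; scipy does it. A minimal sketch on a hypothetical double-integrator plant (the matrices here are illustrative, not this page's default plant):

```python
import numpy as np
from scipy.signal import cont2discrete

# Hypothetical continuous-time double integrator: x = [position; velocity]
A_c = np.array([[0.0, 1.0],
                [0.0, 0.0]])
B_c = np.array([[0.0],
                [1.0]])
Ts = 0.1

# Zero-order hold: A = exp(A_c*Ts), B = (integral of exp(A_c*tau) dtau) B_c
A, B, *_ = cont2discrete((A_c, B_c, np.eye(2), np.zeros((2, 1))), Ts, method='zoh')
# A ≈ [[1, 0.1], [0, 1]],  B ≈ [[0.005], [0.1]]
```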
3. Cost function
Over a prediction horizon of $N$ steps, define
$$ J = \sum_{i=0}^{N-1} \left[ (x_{k+i} - x_{\mathrm{ref}})^\top Q\,(x_{k+i} - x_{\mathrm{ref}}) + u_{k+i}^\top R\,u_{k+i} \right] + (x_{k+N} - x_{\mathrm{ref}})^\top P\,(x_{k+N} - x_{\mathrm{ref}}). $$
- $Q \succeq 0$ penalises tracking error — bigger $Q$ ⇒ tighter tracking, more aggressive input.
- $R \succ 0$ penalises control effort — bigger $R$ ⇒ smoother input, slower response.
- $P$ is the terminal cost (often the solution of the discrete algebraic Riccati equation, but $P = Q$ is a fine starting point).
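The Riccati-based choice of $P$ is one scipy call; a sketch on a hypothetical discretized plant (matrices and weights here are illustrative):

```python
import numpy as np
from scipy.linalg import solve_discrete_are

# Hypothetical 2-state discretized plant (double integrator, Ts = 0.1)
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.005], [0.1]])
Q = np.diag([10.0, 1.0])
R = np.array([[0.1]])

# Terminal cost from the discrete algebraic Riccati equation: with this P,
# the terminal term equals the infinite-horizon LQR cost-to-go.
P = solve_discrete_are(A, B, Q, R)
```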
4. Constraints
Hard constraints are just affine inequalities: $u_{\min} \le u_k \le u_{\max}$ and (optionally) $x_{\min} \le x_k \le x_{\max}$. Slew-rate, output, and obstacle-avoidance constraints have the same form once you express them through $u$ and the prediction equations.
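Stacked over the horizon, the input box becomes one pair of identity blocks. A sketch with toy dimensions and bounds (all values here are illustrative):

```python
import numpy as np

# Box input constraints over the horizon, written as A_ineq @ U <= b_ineq
N, m = 5, 1                 # horizon and input dimension (toy values)
u_min, u_max = -1.0, 2.0

I = np.eye(N * m)
A_ineq = np.vstack([I, -I])                       # U <= u_max  and  -U <= -u_min
b_ineq = np.concatenate([np.full(N * m, u_max),
                         np.full(N * m, -u_min)])
```

State bounds take the same shape once $x_{k+i}$ is written through the prediction equations, so their rows end up depending on the prediction matrices rather than the identity.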
5. Reduction to a QP
Stack the predicted states and the decision variables: $\mathbf{X} = [x_{k+1}; x_{k+2}; \dots; x_{k+N}]$ and $\mathbf{U} = [u_k; u_{k+1}; \dots; u_{k+N-1}]$. Iterating the dynamics gives
$$ \mathbf{X} = \Phi\,x_k + \Gamma\,\mathbf{U}, $$
where $\Phi$ stacks $A^i$ and $\Gamma$ is block lower-triangular with entries $A^{i-1-j} B$. Plug this into $J$ and you get the canonical QP
$$ \min_{\mathbf{U}}\; \tfrac{1}{2}\,\mathbf{U}^\top H\,\mathbf{U} + g^\top \mathbf{U} \quad \text{s.t.}\quad A_{\mathrm{ineq}} \mathbf{U} \le b_{\mathrm{ineq}}, $$
with $H = 2(\Gamma^\top \bar Q\, \Gamma + \bar R)$ and $g = 2\,\Gamma^\top \bar Q\,(\Phi x_k - \mathbf{X}_{\mathrm{ref}})$. Solve, take the first $m$ entries of $\mathbf{U}^\star$, apply, advance time, repeat.
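The $\Phi$/$\Gamma$ construction condenses into a short helper. A sketch (constraint stacking omitted; the function name is mine):

```python
import numpy as np

def build_condensed_qp(A, B, Q, R, P, N, x0, x_ref):
    """Condense the MPC cost into min 0.5 U'HU + g'U, following the text:
    X = Phi x0 + Gamma U, weights Q on steps 1..N-1 and P on the terminal step."""
    n, m = A.shape[0], B.shape[1]
    Phi = np.vstack([np.linalg.matrix_power(A, i + 1) for i in range(N)])
    Gamma = np.zeros((N * n, N * m))
    for i in range(N):
        for j in range(i + 1):
            Gamma[i*n:(i+1)*n, j*m:(j+1)*m] = np.linalg.matrix_power(A, i - j) @ B
    Qbar = np.kron(np.eye(N), Q)
    Qbar[-n:, -n:] = P                      # terminal block gets P
    Rbar = np.kron(np.eye(N), R)
    Xref = np.tile(x_ref, N)
    H = 2.0 * (Gamma.T @ Qbar @ Gamma + Rbar)
    g = 2.0 * Gamma.T @ Qbar @ (Phi @ x0 - Xref)
    return H, g
```

The $i = 0$ stage cost depends only on the measured $x_k$, so it is a constant and drops out of the optimization.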
Once you see MPC as “build a QP, solve it, throw away most of the answer”, the rest is engineering: pick $N$ big enough, $Q/R$ to taste, and a fast solver.
6. Try it
The panel below builds the QP for the parameters you choose and emits a Python script that solves it on real data. Edit any field and the code on the right updates live. The default plant is a damped mass–spring system $\ddot{q} + 2\zeta\omega_n \dot{q} + \omega_n^2 q = u$ with $\omega_n = 1$, $\zeta = 0.1$, sampled at $T_s = 0.1$ s.
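In state-space form with $x = [q, \dot q]^\top$, the default plant discretizes as follows (a sketch of what the generated script does):

```python
import numpy as np
from scipy.signal import cont2discrete

# Default plant from the text: q'' + 2*zeta*wn*q' + wn^2*q = u, x = [q, q']
wn, zeta, Ts = 1.0, 0.1, 0.1
A_c = np.array([[0.0,      1.0],
                [-wn**2, -2.0 * zeta * wn]])
B_c = np.array([[0.0],
                [1.0]])
A, B, *_ = cont2discrete((A_c, B_c, np.eye(2), np.zeros((2, 1))), Ts, method='zoh')
# lightly damped: both eigenvalues of A sit just inside the unit circle
```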
Choosing the horizons
Two horizons govern the QP: the prediction horizon $H_p$ (how far ahead we forecast the state) and the control horizon $H_u$ (how many distinct future moves $\Delta u_k$ remain free decision variables — after $H_u$, the input is held constant or zero). This page exposes a single $N$ and treats $H_u = H_p = N$, but the trade-offs are easier to see if you keep them mentally separate.
- $H_p$ — prediction horizon. A common rule of thumb is to pick $H_p \cdot T_s$ long enough to cover the dominant open-loop transient — roughly the settling time of the slowest mode you care about (e.g. $\sim 4/(\zeta\omega_n)$ for a second-order system, or $\sim 5\tau$ for a first-order one). Too short and the controller is myopic — it can drive the state in directions it would not choose if it could see further, and stability margins shrink. Too long wastes effort on predictions the model cannot trust and bloats the QP.
- $H_u$ — control horizon. Each extra free move adds $m$ decision variables, so the QP grows as $\mathcal{O}(H_u \cdot m)$ in size and the solve cost grows super-linearly (interior-point methods are roughly cubic in the variable count, active-set methods depend on the active-set churn). In practice $H_u$ is kept much smaller than $H_p$ — often $H_u \in [2, 5]$ with $H_p \in [10, 30]$ — through move blocking: freeze $\Delta u_k = 0$ for $k \ge H_u$. This keeps the QP small while still letting the predicted trajectory roll out far enough to feel constraints and the terminal cost.
- Trade-off in one line. $H_p$ controls how far we look; $H_u$ controls how much detail we plan. Increase $H_p$ for stability and constraint-awareness; keep $H_u$ small for real-time feasibility.
Standard references for these guidelines: Maciejowski, Predictive Control with Constraints (2002), §3; Rawlings, Mayne & Diehl, Model Predictive Control: Theory, Computation, and Design (2nd ed., 2017), §2.4; Borrelli, Bemporad & Morari, Predictive Control for Linear and Hybrid Systems (2017), §11.
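Move blocking amounts to a linear map from the $H_u$ free moves to the $H_p$ applied inputs, so it slots into the condensed QP by multiplying $\Gamma$ from the right. A sketch assuming hold-the-last-move blocking (the helper name is mine):

```python
import numpy as np

def blocking_matrix(Hp, Hu, m=1):
    """Map Hu free moves to Hp applied inputs, holding the last move.

    With U_full = T @ U_blocked, substituting Gamma @ T for Gamma shrinks
    the QP from Hp*m to Hu*m decision variables.
    """
    T = np.zeros((Hp, Hu))
    for i in range(Hp):
        T[i, min(i, Hu - 1)] = 1.0
    return np.kron(T, np.eye(m))

T = blocking_matrix(Hp=6, Hu=3)
# first 3 rows pass u0, u1, u2 through; the remaining rows repeat u2
```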
Controller parameters
7. Download & run
Once your parameters look right, click Generate. The dimensions are
validated against the state-space model, and a .py script is
produced. Download it, install the dependencies for your chosen solver,
and run.
How to run
```shell
# 1. install the core dependencies
$ pip install numpy scipy matplotlib
# 2. install the QP solver you picked
$ pip install cvxpy
# 3. run
$ python mpc_controller.py
```
Two windows pop up at the end: the state trajectories (with reference
lines) and the input signal (with $u_{\min}/u_{\max}$ marked). To use the
controller online, drop the mpc_step(x) function into your loop
— it takes a measurement and returns the next input, no setup per call.
8. Offset-free tracking
Run the controller above on a real plant and you'll often see a small but stubborn steady-state error: the model is never exactly the plant, sensors drift, loads change. Plain MPC has no integral action — the cost $\|x - x_{\mathrm{ref}}\|_Q^2$ is driven to zero for the model's predicted state, not for the real plant's.
The standard fix (Pannocchia & Rawlings, 2003) is to augment the model with a disturbance state and estimate it online. Pick an output disturbance $d$ that enters as $\hat y = C x + d$, and let it drift as $d_{k+1} = d_k$ (random-walk model). The augmented system is
$$ \begin{bmatrix} x_{k+1} \\ d_{k+1} \end{bmatrix} = \underbrace{\begin{bmatrix} A & 0 \\ 0 & I \end{bmatrix}}_{\tilde A} \begin{bmatrix} x_k \\ d_k \end{bmatrix} + \underbrace{\begin{bmatrix} B \\ 0 \end{bmatrix}}_{\tilde B} u_k, \qquad y_k = \underbrace{\begin{bmatrix} C & I \end{bmatrix}}_{\tilde C} \begin{bmatrix} x_k \\ d_k \end{bmatrix}. $$
Run a Kalman filter on $(\tilde A, \tilde B, \tilde C)$ and you get $\hat x_k$ and $\hat d_k$. Before each MPC solve, recompute the steady-state target $(x_{ss}, u_{ss})$ that makes the predicted output match the reference given the current disturbance estimate:
$$ \begin{bmatrix} I - A & -B \\ C & 0 \end{bmatrix} \begin{bmatrix} x_{ss} \\ u_{ss} \end{bmatrix} = \begin{bmatrix} 0 \\ r - \hat d_k \end{bmatrix}. $$
Then run MPC on the deviation variables $\Delta x = x - x_{ss}$, $\Delta u = u - u_{ss}$ and apply $u_k = u_{ss} + \Delta u_k^\star$. Because $\hat d_k$ absorbs the model mismatch, the closed-loop output converges to $r$ with zero offset — exactly the role the integral term plays in a PI controller. Sketch:
```python
# --- once, offline ---
nx, nu, ny = A.shape[0], B.shape[1], C.shape[0]
A_aug = np.block([[A, np.zeros((nx, ny))],
                  [np.zeros((ny, nx)), np.eye(ny)]])
B_aug = np.vstack([B, np.zeros((ny, nu))])
C_aug = np.hstack([C, np.eye(ny)])
# Kalman gain L for the augmented system (use scipy or your favourite DARE solver)
L = kalman_gain(A_aug, C_aug, Qw, Rv)
xhat = np.zeros(nx + ny)

# --- every step ---
y_meas = sensor_read()
xhat = A_aug @ xhat + B_aug @ u_prev + L @ (y_meas - C_aug @ xhat)
x_hat, d_hat = xhat[:nx], xhat[nx:]

# target calculation: hold y at r given d_hat
M = np.block([[np.eye(nx) - A, -B],
              [C, np.zeros((ny, nu))]])
rhs = np.concatenate([np.zeros(nx), r - d_hat])
xs_us = np.linalg.solve(M, rhs)
x_ss, u_ss = xs_us[:nx], xs_us[nx:]

# MPC on deviation; apply absolute input
du = mpc_step(x_hat - x_ss)
u = u_ss + du
u_prev = u
```
Two practical notes: (a) the augmented system must be observable — with $(A, C)$ observable this reduces to $\mathrm{rank}\!\begin{bmatrix} I-A & -B_d \\ C & C_d \end{bmatrix} = n_x + n_d$ (the Pannocchia–Rawlings / Maeder–Borrelli–Morari condition), where the output-disturbance model above has $B_d = 0$, $C_d = I$; (b) for input disturbances, set $B_d = B$, $C_d = 0$, which gives the version that rejects step loads rather than sensor offsets.
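The rank condition is a one-liner to verify numerically. A quick check for the output-disturbance model ($n_d = n_y$; sign flips on the blocks do not change the rank):

```python
import numpy as np

def offset_free_rank_ok(A, C):
    """Rank condition for offset-free tracking with an output disturbance:
    rank [[I-A, 0], [C, I]] must equal nx + ny."""
    nx, ny = A.shape[0], C.shape[0]
    M = np.block([[np.eye(nx) - A, np.zeros((nx, ny))],
                  [C,              np.eye(ny)       ]])
    return np.linalg.matrix_rank(M) == nx + ny
```

Note that the condition fails whenever $A$ has an eigenvalue at 1: an output-disturbance model cannot be used with an integrating plant, which is a known limitation of this disturbance choice.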
9. Embedded MPC
On a desktop, the QP solves in milliseconds and nobody cares. On a microcontroller running at 1 kHz, the solver has microseconds, no dynamic memory, and no exception handling. Three approaches dominate in practice:
9.1 Code generation (OSQP, qpOASES)
Build the parametric QP once in Python, then generate ANSI C source that solves it without runtime allocation. OSQP exposes this directly:
```python
# build the QP exactly as in the generated script above, then:
prob.codegen(
    'mpc_solver_c',            # output folder
    parameters='matrices',     # let q, l, u, P-values change at runtime
    python_ext_name='emosqp',
    force_rewrite=True,
    FLOAT=True,                # single precision — ~2x faster on Cortex-M
    LONG=False,
)
```
You get a self-contained C library with one entry point per operation: osqp_update_lin_cost(...) to refresh the cost, osqp_solve(...) to solve. Drop it into a CMSIS or Zephyr project, link, and call it from your control task.
Typical numbers on a Cortex-M7 @ 480 MHz, single precision: a 4-state /
1-input plant with $N = 20$ solves in 30–60 µs after a warm start, well
below a 1 kHz loop. qpOASES with online active-set
is often even faster when the active set changes slowly.
9.2 Explicit MPC
For small problems (state dim ≤ ~5, short horizon) you can solve the QP
offline as a multi-parametric QP. The optimal $u^\star(x)$ turns
out to be piecewise-affine over polyhedral regions of the state space.
At runtime you just locate the region and evaluate one matrix-vector
product — nanoseconds, no iterative solver. Tools: MPT3 (Matlab),
POP, hybrid toolboxes. Caveat: the number of regions can
explode with constraints, so this only fits the simplest problems.
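The runtime side of explicit MPC is just a region search plus one affine evaluation. A toy illustration for a 1-state plant — the two regions and gains below are invented for the sketch, not the output of an mp-QP solver:

```python
import numpy as np

# Each region i is {x : Hx @ x <= hx}, with control law u = Fx @ x + gx.
# Hypothetical table: an unconstrained law for x <= 0.5, saturation beyond.
regions = [
    (np.array([[ 1.0]]), np.array([ 0.5]), np.array([[-2.0]]), np.array([ 0.0])),
    (np.array([[-1.0]]), np.array([-0.5]), np.array([[ 0.0]]), np.array([-1.0])),
]

def explicit_mpc(x):
    for Hx, hx, Fx, gx in regions:
        if np.all(Hx @ x <= hx + 1e-9):
            return Fx @ x + gx       # one matrix-vector product, no solver
    raise ValueError("x outside the explored state-space partition")
```

In a real deployment the region search is the dominant cost, which is why tools ship binary search trees over the partition rather than the linear scan shown here.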
9.3 Acados & tinympc
For nonlinear MPC, soft constraints, or larger horizons,
acados
generates a real-time iteration SQP with HPIPM as the inner
solver and ships C code you can flash. tinympc is a newer
ADMM-based solver designed specifically for resource-constrained
microcontrollers (8 KB RAM, no FPU); the trade-off is fewer
constraint types and looser tolerances.
Rule of thumb: linear plant + box constraints ⇒ OSQP codegen; small problem + obsessive timing ⇒ explicit MPC; nonlinear or high-state ⇒ acados.
9.4 What changes in your code
- No malloc. Pre-allocate every buffer at startup; OSQP-codegen does this for you.
- Fixed iteration cap. Set max_iter so a worst-case solve still meets the deadline; accept the suboptimal answer if it overruns.
- Warm start always on. The previous step's solution is the best initial guess; this typically halves the iteration count.
- Single precision. Almost always fine for control; double the throughput on M4F/M7.
- Watchdog the solver. If it returns infeasible or hits max_iter with bad residuals, fall back to a safe input (last applied, or hold-to-zero with rate limit).
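The watchdog item can be sketched as a thin wrapper around the solve call. Everything here is an assumption for illustration: `solve(x)` stands in for your QP solver and is assumed to return `(u, status, residual)`, and the rate limit is a tuning value:

```python
import numpy as np

U_RATE = 0.1            # max allowed input change per step (assumed tuning value)
u_safe_prev = 0.0       # last input actually applied

def guarded_step(solve, x):
    """Watchdog wrapper per the checklist above; `solve` is hypothetical."""
    global u_safe_prev
    u, status, residual = solve(x)
    if status != "solved" or not np.isfinite(u) or residual > 1e-3:
        # fallback: ramp the last applied input toward zero, rate-limited
        u = u_safe_prev - np.clip(u_safe_prev, -U_RATE, U_RATE)
    u_safe_prev = u
    return u
```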
10. Further directions
- Soft constraints: add slack variables to state bounds so the QP stays feasible under disturbance.
- Move blocking: reduce the input dimension by holding $u$ constant over chunks of the horizon — fewer decision variables, faster solve.
- Nonlinear MPC: swap the QP for an NLP (CasADi + IPOPT, or acados RTI) when the plant is no longer linear.
- Stability guarantees: add a terminal constraint $x_{k+N} \in \mathcal{X}_f$ where $\mathcal{X}_f$ is a positively invariant set under the LQR feedback.
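The soft-constraint idea is mechanical once the QP is in stacked form: add one slack per softened row and penalise it. A minimal sketch with an $\ell_1$ penalty (the helper name and the value of rho are illustrative; large rho recovers the hard constraint whenever it is feasible):

```python
import numpy as np

def soften(A_ineq, b_ineq, rho=1e3):
    """Soften A_ineq @ U <= b_ineq with one slack per row.

    New decision vector z = [U; s], constraints A U - s <= b and s >= 0,
    and the linear cost gains rho * sum(s) on the slack block.
    """
    nc, nv = A_ineq.shape
    A_soft = np.block([[A_ineq,             -np.eye(nc)],
                       [np.zeros((nc, nv)), -np.eye(nc)]])
    b_soft = np.concatenate([b_ineq, np.zeros(nc)])
    g_slack = rho * np.ones(nc)   # append to g for the s block
    return A_soft, b_soft, g_slack
```

In practice only the state rows are softened; input bounds reflect real actuator limits and should stay hard.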