Optimization

How objectives, constraints, convexity, and certificates fit together in the mathematical language behind fitting, design, control, and machine learning.
Modified

April 26, 2026

Keywords

optimization, convex optimization, KKT, duality, gradient methods

1 Why This Module Matters

Optimization is the language of "fit the model, choose the design, allocate the resource, and control the system."

Once a problem becomes "pick variables to minimize a cost while respecting structure," you are doing optimization, even if the surrounding application is machine learning, signal processing, robotics, operations research, or scientific computing.

This module is where several earlier strands finally meet:

  • linear algebra gives the geometry of spaces, projections, and operators
  • probability and statistics explain what objective functions are trying to estimate or regularize
  • calculus explains local change, gradients, and curvature
  • algorithms turn optimality conditions into actual solvers

If this module is skipped, later material starts to feel like disconnected recipes: gradient descent becomes a trick, Lagrange multipliers become a symbol game, duality becomes mysterious, and ML training objectives look more arbitrary than they really are.

Prerequisites Linear Algebra, Single-Variable Calculus, and Multivariable Calculus should come first. Probability helps connect objectives to estimation and uncertainty.

Unlocks Convex modeling, duality, constrained ML, numerical optimization, control

Research Use Reading papers with objectives, constraints, certificates, regularization, and solver-based arguments

2 First Pass Through This Module

  1. Convex Sets and Separation
  2. Convex Functions and Subgradients
  3. Constrained Optimization, KKT, and Lagrangians
  4. Unconstrained First-Order Methods
  5. Duality and Certificates

This module is best read after you are already comfortable with derivatives, partial derivatives, gradients, and basic curvature intuition from the site’s calculus modules.

4 Core Concepts

5 Proof Patterns In This Module

  • First-order stationarity: at an optimum, feasible perturbations cannot improve the objective to first order.
  • Geometry of active constraints: optimality becomes a balance between the objective gradient and constraint normals.
  • Certificate thinking: primal feasibility plus dual information can prove optimality or suboptimality.
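The three patterns above can be seen concretely on a toy problem. The following is a minimal sketch, not taken from the module: for min f(x) = x² subject to x ≥ 1, the optimum x* = 1 and multiplier λ = 2 (both assumed here for illustration) satisfy stationarity, and no feasible perturbation improves the objective to first order.

```python
# Toy check of first-order stationarity and the gradient/normal balance
# for min f(x) = x^2 subject to g(x) = 1 - x <= 0 (assumed example).

def f_grad(x):
    return 2.0 * x   # gradient of the objective x^2

def g_grad(x):
    return -1.0      # gradient of the constraint 1 - x

x_star, lam = 1.0, 2.0   # known optimum and multiplier for this toy problem

# Stationarity: the objective gradient is balanced by the constraint normal.
stationarity = f_grad(x_star) + lam * g_grad(x_star)

# Feasible perturbations point right (x >= 1); the directional derivative
# along d = +1 is nonnegative, so no first-order improvement is possible.
directional = f_grad(x_star) * 1.0

print(stationarity)  # 0.0
print(directional)   # 2.0
```

Here λ = 2 is exactly the weight needed so that the objective gradient and the constraint normal cancel, which is the "balance" the second bullet describes.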

6 Applications

6.1 Machine Learning And Statistical Estimation

Loss minimization, regularization, constrained training, adversarial formulations, sparse estimation, and calibration all turn into optimization problems. KKT and duality appear in support vector machines, lasso-style methods, differentiable optimization layers, and modern implicit-bias analysis.
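One place where a subgradient optimality condition becomes an algorithm is the one-dimensional lasso problem min (1/2)(x − y)² + λ|x|, whose closed-form minimizer is the soft-thresholding operator. This is a standard illustration, not code from the module:

```python
import numpy as np

def soft_threshold(y, lam):
    # Closed-form minimizer of (1/2)(x - y)^2 + lam * |x|,
    # derived from the subgradient optimality condition.
    return float(np.sign(y)) * max(abs(y) - lam, 0.0)

y, lam = 3.0, 1.0
x = soft_threshold(y, lam)

# Optimality check: at x != 0, x - y + lam * sign(x) must equal 0;
# at x = 0, the condition is |y| <= lam.
print(x)                             # 2.0
print(x - y + lam * np.sign(x))      # 0.0
print(soft_threshold(0.5, 1.0))      # 0.0: small inputs are zeroed out
```

The last line shows where sparsity comes from: whenever |y| ≤ λ, the subgradient condition is satisfied at exactly zero, which is the mechanism behind lasso-style sparse estimation.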

6.2 Control, Design, And Scientific Computing

Trajectory planning, resource allocation, filter design, inverse problems, and engineering tradeoff studies all depend on writing the right objective and constraints, then extracting interpretable multipliers, sensitivities, and certificates from the solution.
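The claim that multipliers are interpretable sensitivities can be checked numerically on a small design problem. The example below is an assumed illustration: for the minimum-norm problem min ‖x‖² subject to aᵀx = b, both the solution and its multiplier are closed-form, and the multiplier predicts how the optimal value responds to a change in b.

```python
import numpy as np

# Minimum-norm design problem: min ||x||^2 subject to a @ x = b.
a = np.array([1.0, 2.0, 2.0])
b = 3.0

x_star = b * a / (a @ a)     # closed-form minimum-norm solution
nu = 2.0 * b / (a @ a)       # Lagrange multiplier for the equality constraint
val = x_star @ x_star        # optimal value b^2 / ||a||^2

# Sensitivity check: the multiplier equals d(optimal value)/db,
# verified here by a finite difference.
eps = 1e-6
val_eps = (b + eps) ** 2 / (a @ a)
print(round((val_eps - val) / eps, 4))  # ~0.6667
print(round(nu, 4))                     # ~0.6667
```

This is the "interpretable multipliers and sensitivities" point in miniature: ν tells the designer how much the optimal cost would change if the constraint level b were relaxed slightly.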

7 Go Deeper By Topic

Follow the five-page first pass in order:

  1. Convex Sets and Separation
  2. Convex Functions and Subgradients
  3. Constrained Optimization, KKT, and Lagrangians
  4. Unconstrained First-Order Methods
  5. Duality and Certificates

8 Optional Deep Dives After First Pass

9 Study Order

The intended first pass is the five-step sequence above.

You are ready to move deeper into the module when you can:

  • explain why convex feasible regions are special
  • explain why convex objectives turn local first-order information into global information
  • distinguish objective, variables, and constraints cleanly
  • explain why unconstrained and constrained optima obey different first-order logic
  • interpret a multiplier as more than a symbol attached to a constraint
  • explain why a dual feasible point can certify a lower bound
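The last checklist item can be made concrete with a toy dual function. This is an assumed example, not from the module: for min x² subject to x ≥ 1 (optimal value p* = 1), the Lagrangian is x² + λ(1 − x), and minimizing over x gives the dual function g(λ) = λ − λ²/4. Every dual-feasible λ ≥ 0 certifies a lower bound on p*:

```python
# Dual function of min x^2 subject to x >= 1 (assumed toy problem):
# g(lam) = min_x [x^2 + lam * (1 - x)] = lam - lam**2 / 4.

def dual(lam):
    assert lam >= 0, "dual feasibility requires lam >= 0"
    return lam - lam**2 / 4.0

p_star = 1.0  # true optimal value, attained at x = 1

# Weak duality: every dual-feasible point gives a valid lower bound.
for lam in [0.5, 1.0, 2.0, 3.0]:
    assert dual(lam) <= p_star + 1e-12

print(dual(1.0))  # 0.75: a valid but loose certificate
print(dual(2.0))  # 1.0: tight, i.e. zero duality gap
```

No knowledge of the primal solution is needed to produce these bounds; that is why a single dual-feasible point is already a certificate.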

10 Sources and Further Reading

Sources checked online on 2026-04-24:

  • Stanford EE364a course page
  • Boyd and Vandenberghe convex optimization book page
  • MIT 6.253 course page
  • MIT EE227A course page
  • NeurIPS 2019 differentiable convex optimization layers proceedings page