Convex Sets and Separation
convex set, convex hull, hyperplane, halfspace, separation theorem
1 Role
This page gives the geometric starting point of the optimization module.
It explains which feasible regions are well behaved enough to support global reasoning, and why hyperplanes can certify membership, non-membership, and optimality.
2 First-Pass Promise
Read this page first.
If you stop here, you should still understand:
- what convexity means geometrically
- why convex feasible sets are special for optimization
- what a separating hyperplane is trying to prove
- how this geometry reappears in classification, duality, and KKT arguments
3 Why It Matters
Optimization becomes dramatically easier once the feasible region is convex.
If a set is convex, then averaging two feasible points stays feasible. That single fact has deep consequences:
- when the objective is also convex, local minima are automatically global minima
- interpolation between candidate solutions stays inside the model
- hyperplanes can certify when a point is outside the region
- geometry, algebra, and computation start to agree with one another
This is why convex sets sit at the front of almost every serious optimization course. Before algorithms, duality, or KKT conditions, you need to know what kind of geometry those tools are talking about.
4 Prerequisite Recall
- an affine combination has coefficients summing to one
- a hyperplane is a set of the form \(\{x : a^\top x = b\}\)
- a halfspace is a set of the form \(\{x : a^\top x \le b\}\) or \(\{x : a^\top x \ge b\}\)
- projections and orthogonality already tell you how nearest-point geometry can create normal directions
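To make the last recall point concrete, here is a minimal Python sketch of projecting a point onto a hyperplane; the helper name is ours, and the formula is the standard one \(x - \frac{a^\top x - b}{\|a\|^2}\,a\). The residual between the point and its projection is parallel to the normal \(a\), which is exactly the nearest-point geometry this page builds on.

```python
def project_onto_hyperplane(a, b, x):
    """Nearest point to x on the hyperplane {y : a . y = b}.

    The residual x - p is parallel to the normal a: this is the
    nearest-point geometry recalled above. (Helper name is ours.)
    """
    ax = sum(ai * xi for ai, xi in zip(a, x))
    scale = (ax - b) / sum(ai * ai for ai in a)
    return tuple(xi - scale * ai for ai, xi in zip(a, x))

# Projecting (2, 0.5) onto the vertical line x = 1 moves it horizontally:
print(project_onto_hyperplane((1.0, 0.0), 1.0, (2.0, 0.5)))  # (1.0, 0.5)
```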
5 Intuition
A set is convex if every line segment between two of its points stays inside the set.
So convexity is really a statement that the set has no inward dents: the boundary never folds back to exclude a blend of two allowed points.
That matters because optimization constantly asks questions like:
- if two feasible designs are allowed, is a blended design still allowed?
- if a point is outside the feasible region, can I prove it with one simple inequality?
- if a boundary point is optimal, is there a supporting normal direction that explains why?
Convexity makes the answer to these questions much cleaner.
Separation is the next step. If a point lies outside a nice convex set, there is often a hyperplane that places the point on one side and the set on the other. That hyperplane is not just a picture. It is a certificate.
6 Formal Core
Definition 1 (Convex Set) A set \(C \subseteq \mathbb{R}^n\) is convex if for every \(x,y \in C\) and every \(\theta \in [0,1]\),
\[ \theta x + (1-\theta)y \in C. \]
Equivalently, the full line segment joining \(x\) and \(y\) stays inside \(C\).
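The segment condition can be probed numerically. The following sketch (the function name and the membership-oracle interface are our own illustration) samples points along a segment and asks a membership oracle at each one; a single escaping segment refutes convexity, but sampling can never prove it.

```python
import math

def segment_stays_inside(in_set, x, y, samples=200):
    """Sample theta*x + (1-theta)*y for theta in [0, 1] and ask the
    membership oracle in_set at each sample. One failing segment
    refutes convexity; passing samples never prove it."""
    for k in range(samples + 1):
        t = k / samples
        p = tuple(t * xi + (1 - t) * yi for xi, yi in zip(x, y))
        if not in_set(p):
            return False
    return True

unit_disk = lambda p: p[0] ** 2 + p[1] ** 2 <= 1.0        # convex
annulus = lambda p: 0.5 <= math.hypot(p[0], p[1]) <= 1.0  # not convex

print(segment_stays_inside(unit_disk, (-1.0, 0.0), (1.0, 0.0)))  # True
print(segment_stays_inside(annulus, (-1.0, 0.0), (1.0, 0.0)))    # False: crosses the hole
```

The annulus fails because the horizontal diameter passes through the excluded center, which is exactly an inward dent in the sense above.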
Important families of convex sets include:
- affine sets
- halfspaces and hyperplanes
- polyhedra, as intersections of finitely many halfspaces
- Euclidean balls and norm balls
- cones such as the positive semidefinite cone
- probability simplices
Proposition 1 (Basic Closure Rules) The following operations preserve convexity:
- intersections of convex sets
- affine images of convex sets
- inverse images of convex sets under affine maps
- convex hulls of arbitrary point sets
These rules are the backbone of optimization modeling.
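As a tiny illustration of the first closure rule, a polyhedron can be encoded directly as an intersection of halfspaces and tested one inequality at a time; the helper below is a sketch of ours, not a library API.

```python
def in_polyhedron(A, b, x, tol=1e-9):
    """Membership in the polyhedron {x : A x <= b}, checked one
    halfspace at a time. The intersection closure rule is what
    guarantees the resulting region is still convex."""
    return all(
        sum(aij * xj for aij, xj in zip(row, x)) <= bi + tol
        for row, bi in zip(A, b)
    )

# The unit square as the intersection of four halfspaces:
# -x <= 0, -y <= 0, x <= 1, y <= 1
A = [(-1.0, 0.0), (0.0, -1.0), (1.0, 0.0), (0.0, 1.0)]
b = [0.0, 0.0, 1.0, 1.0]

print(in_polyhedron(A, b, (0.5, 0.5)))  # True
print(in_polyhedron(A, b, (1.5, 0.5)))  # False
```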
Theorem 1 (Separating Hyperplane Principle) If \(C\) is a nonempty closed convex set and \(z \notin C\), then there exists a nonzero vector \(a\) and scalar \(b\) such that
\[ a^\top z > b \qquad \text{and} \qquad a^\top x \le b \quad \text{for all } x \in C. \]
So one hyperplane can certify that the point \(z\) lies outside the set \(C\).
This principle comes in several forms. For a first pass, the main thing to retain is:
- convex sets can often be certified by linear inequalities
- outside points can often be ruled out by one hyperplane
- later, KKT and duality reuse exactly this geometry
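One standard construction behind the theorem uses nearest points: if \(p\) is the projection of \(z\) onto \(C\), then \(a = z - p\) and \(b = a^\top p\) give a separating hyperplane. A minimal Python sketch under that construction, using the unit disk as the convex set (function names are ours, and the code is hardcoded to two dimensions):

```python
import math

def project_unit_disk(z):
    """Nearest point of the closed unit disk to z (radial projection)."""
    r = math.hypot(z[0], z[1])
    return z if r <= 1.0 else (z[0] / r, z[1] / r)

def separating_hyperplane(project, z):
    """Given a nearest-point oracle for a closed convex set C and a
    point z outside C, return (a, b) with a . z > b and a . x <= b
    on C, via a = z - proj_C(z) and b = a . proj_C(z)."""
    p = project(z)
    a = (z[0] - p[0], z[1] - p[1])
    return a, a[0] * p[0] + a[1] * p[1]

a, b = separating_hyperplane(project_unit_disk, (2.0, 0.0))
print(a, b)  # (1.0, 0.0) 1.0 -- the hyperplane x = 1
```

The correctness of the construction is the projection inequality \((z-p)^\top(x-p) \le 0\) for all \(x \in C\), which rearranges to \(a^\top x \le b\).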
7 Worked Example
Consider the set
\[ C = \{(x,y)\in\mathbb{R}^2 : x^2 + y^2 \le 1\}, \]
the unit disk.
This set is convex because the Euclidean norm ball is convex. Geometrically, any line segment joining two points in the disk stays inside the disk.
Now take the outside point
\[ z = (2,0). \]
We want a hyperplane that separates \(z\) from \(C\).
The nearest point in \(C\) to \(z\) is clearly
\[ x^\star = (1,0), \]
which lies on the boundary of the disk.
The boundary normal at that point points in the horizontal direction, so the natural separating hyperplane is
\[ x = 1. \]
In inner-product form, this is
\[ a^\top x = b \quad \text{with} \quad a = \begin{bmatrix} 1\\ 0 \end{bmatrix}, \qquad b=1. \]
Then for every \((x,y)\in C\),
\[ a^\top \begin{bmatrix} x\\ y \end{bmatrix} = x \le 1, \]
while for the outside point,
\[ a^\top z = 2 > 1. \]
So the hyperplane \(x=1\) separates the unit disk from the point \((2,0)\).
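These two inequalities are easy to spot-check numerically. The snippet below samples random points of the unit disk and confirms \(a^\top x \le b\) on every sample; this is a sanity check of the certificate, not a proof.

```python
import math, random

# Spot-check the certificate a = (1, 0), b = 1 on random disk points.
# A numerical sanity check, not a proof.
random.seed(0)
ok = True
for _ in range(1000):
    t = random.uniform(0.0, 2.0 * math.pi)
    r = math.sqrt(random.random())          # uniform sample of the unit disk
    x, y = r * math.cos(t), r * math.sin(t)
    ok = ok and (x <= 1.0)                  # a . (x, y) = x <= b = 1

print(ok)         # True: every sampled disk point satisfies the inequality
print(2.0 > 1.0)  # True: a . z = 2 > 1 for the outside point z = (2, 0)
```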
This example is simple, but it contains the entire pattern:
- convex set
- outside point
- nearest-point geometry
- normal direction
- hyperplane certificate
Later optimization arguments will package the same logic into dual variables, gradients, and KKT conditions.
8 Computation Lens
In computational optimization, convex geometry is rarely manipulated as a drawing. It is encoded through operations like:
- intersections of halfspaces
- cone constraints
- affine maps
- norm-ball constraints
- convex hulls or mixtures
That is why closure properties matter so much. They tell you when a complicated feasible region is still convex without requiring a picture.
Separation also has a computational meaning: if a point violates the region, a hyperplane or linear functional can often witness the violation. This is the geometric ancestor of feasibility certificates, dual certificates, and cutting-plane methods.
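A separation oracle makes this concrete: it either confirms feasibility or returns a violated linear inequality, which is exactly the interface cutting-plane methods consume. The sketch below (our own illustration, for the unit disk) relies on the Cauchy-Schwarz bound \(a^\top x \le \|a\|\|x\| \le 1\) for unit \(a\) and disk points \(x\).

```python
import math

def disk_separation_oracle(z):
    """Return None if z lies in the closed unit disk; otherwise return
    a cut (a, b) with a . z > b but a . x <= b for every disk point x."""
    r = math.hypot(z[0], z[1])
    if r <= 1.0:
        return None
    a = (z[0] / r, z[1] / r)  # unit outward normal at the projection of z
    return a, 1.0             # the halfspace a . x <= 1 contains the disk

print(disk_separation_oracle((0.5, 0.5)))  # None: feasible, no cut needed
print(disk_separation_oracle((2.0, 0.0)))  # ((1.0, 0.0), 1.0): the cut x <= 1
```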
9 Application Lens
Two very important applications come almost directly from this page.
First, in maximum-margin classification, a separating hyperplane is not just a geometric picture. It is the object being learned. Convex feasible sets and supporting hyperplanes explain why margin-based methods admit clean optimization formulations.
Second, in optimization theory itself, duality often works by producing a linear or affine functional that certifies infeasibility, suboptimality, or support at the optimum. So when later pages talk about dual variables or KKT multipliers, this page is the geometric background they are using.
10 Stop Here For First Pass
If you can now explain:
- what convexity means using line segments
- why halfspaces, balls, and polyhedra are central examples
- what a separating hyperplane is proving
- why this geometry matters for optimization and classification
then this page has done its main job.
11 Go Deeper
The strongest next pages are:
- Convex Functions and Subgradients for the objective-shape side of the same geometry
- Constrained Optimization, KKT, and Lagrangians for how boundary geometry becomes optimality conditions
- Optimization for Machine Learning for how these geometric ideas show up in practical objectives
12 Optional Paper Bridge
- OptNet: Differentiable Optimization as a Layer in Neural Networks - Paper bridge - watch how convex constraints and KKT systems become trainable layers in larger models. Checked 2026-04-24.
- Differentiable Convex Optimization Layers - Paper bridge - a modern bridge from convex geometry and parametrized convex programs to differentiable ML pipelines. Checked 2026-04-24.
13 Optional After First Pass
If you want more depth after the first pass:
- read the convex-sets lecture in Stanford SEE EE364A and watch how examples accumulate by closure rules
- revisit halfspaces, polyhedra, and positive semidefinite constraints in the Boyd and Vandenberghe book
- compare set separation here with support vectors and margins in modern classification pages
14 Common Mistakes
- checking convexity at a few sample points instead of using the line-segment definition
- confusing affine with convex
- assuming every nonconvex set can be repaired by adding one constraint
- thinking separation works in the same strong form for arbitrary nonclosed or nonconvex sets
- treating a hyperplane certificate as a picture only, rather than a reusable mathematical witness
15 Exercises
- Show directly from the definition that a halfspace \(\{x : a^\top x \le b\}\) is convex.
- Give an example of a set in \(\mathbb{R}^2\) that is not convex and explain which line segment leaves the set.
- For the unit disk and outside point \((0,2)\), write a separating hyperplane analogous to the worked example.
16 Sources and Further Reading
- Stanford EE364a: Convex Optimization I - First pass - current official course page for the canonical optimization backbone. Checked 2026-04-24.
- Stanford Engineering Everywhere: EE364A - First pass - official lecture archive with direct convex-set and supporting-hyperplane material. Checked 2026-04-24.
- Convex Optimization by Boyd and Vandenberghe - First pass - canonical official text for convex sets, convex functions, and separation geometry. Checked 2026-04-24.
- MIT 6.253: Convex Analysis and Optimization - Second pass - useful when you want more mathematical convex-analysis depth. Checked 2026-04-24.
- OptNet: Differentiable Optimization as a Layer in Neural Networks - Paper bridge - a practical bridge showing where convex geometry and optimality systems surface in modern ML. Checked 2026-04-24.
Sources checked online on 2026-04-24:
- Stanford EE364a course page
- Stanford SEE EE364A page
- Boyd and Vandenberghe convex optimization book page
- MIT 6.253 course page
- PMLR page for OptNet