Positive Semidefinite Matrices and Quadratic Forms
positive semidefinite, quadratic form, Gram matrix, covariance, Cholesky
1 Role
This is the second page of the Matrix Analysis module.
The norms page treated matrices as operators that stretch vectors.
This page treats a symmetric matrix as a quadratic object:
\[ x^\top A x. \]
That shift is what makes PSD structure central in optimization, covariance geometry, kernels, and spectral proofs.
2 First-Pass Promise
Read this page after Norms and Operator Norms.
If you stop here, you should still understand:
- what positive semidefinite means
- why PSD is a statement about quadratic forms, not just entries
- why Gram matrices and covariance matrices are automatically PSD
- why PSD structure appears constantly in optimization and ML theory
3 Why It Matters
Many of the most important matrices in modern theory are not arbitrary.
They are PSD:
- covariance matrices
- Gram and kernel matrices
- Hessians of convex quadratics
- graph Laplacians
- normal-equation matrices like \(X^\top X\)
PSD structure matters because it gives:
- nonnegative energies
- nonnegative eigenvalues
- factorization as a square-like object
- a partial order that supports comparison arguments later
4 Prerequisite Recall
- operator norms measure worst-case amplification
- symmetric matrices have real eigenvalues
- quadratic expressions like \(x^\top A x\) encode geometric information
- Hessians and quadratic objectives are good examples of where this language appears
5 Intuition
5.1 Quadratic Forms
Given a symmetric matrix \(A\), the expression
\[ x^\top A x \]
measures a direction-dependent energy.
If this quantity is always nonnegative, then the matrix never bends space in a genuinely negative direction.
That is the PSD condition.
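To make the energy picture concrete, here is a minimal NumPy sketch, with a hypothetical symmetric matrix chosen purely for illustration, that evaluates \(x^\top A x\) along several unit directions:

```python
import numpy as np

# A hypothetical symmetric PSD matrix (eigenvalues 1 and 3).
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# Evaluate the quadratic form along several unit directions.
for theta in np.linspace(0.0, np.pi, 5):
    x = np.array([np.cos(theta), np.sin(theta)])
    print(f"theta = {theta:4.2f} rad, energy = {x @ A @ x:.3f}")
# The energy depends on the direction but never drops below zero,
# which is exactly the PSD condition for this A.
```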
5.2 Why Symmetry Matters
PSD language belongs naturally to symmetric matrices.
Without symmetry, the quadratic-form picture and spectral picture no longer line up cleanly.
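One concrete reason: the quadratic form only sees the symmetric part of a matrix. Since \(x^\top A x\) is a scalar, it equals its own transpose \(x^\top A^\top x\), so \(x^\top A x = x^\top \tfrac{A+A^\top}{2}\, x\) for any square \(A\). A minimal numerical check of this fact:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))   # generic nonsymmetric matrix
S = (A + A.T) / 2                 # its symmetric part
x = rng.standard_normal(4)

# x^T A x is a scalar, so it equals its own transpose x^T A^T x;
# averaging the two shows the form only sees the symmetric part.
print(np.isclose(x @ A @ x, x @ S @ x))  # True
```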
5.3 Gram And Covariance Pictures
If
\[ A = B^\top B, \]
then
\[ x^\top A x = \|Bx\|_2^2 \ge 0. \]
That is the cleanest PSD intuition:
a PSD matrix is something that looks like a squared linear map.
This is why Gram matrices and covariance matrices keep showing up here.
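The identity above is easy to verify numerically. A minimal sketch with a randomly chosen \(B\):

```python
import numpy as np

rng = np.random.default_rng(1)
B = rng.standard_normal((3, 5))   # any rectangular B works
A = B.T @ B                       # 5 x 5 Gram-style matrix

x = rng.standard_normal(5)
# x^T A x is exactly the squared length of Bx, hence nonnegative.
print(np.isclose(x @ A @ x, np.linalg.norm(B @ x) ** 2))  # True
```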
6 Formal Core
Definition 1 (Positive Semidefinite) A symmetric matrix \(A\) is positive semidefinite if
\[ x^\top A x \ge 0 \qquad \text{for all }x. \]
If the inequality is strict for every nonzero \(x\), then \(A\) is positive definite.
Theorem 1 (Theorem Idea: Equivalent PSD Pictures) For a symmetric matrix \(A\), the following first-pass pictures line up:
- \(A\) is positive semidefinite
- every eigenvalue of \(A\) is nonnegative
- \(A\) can be written as \(B^\top B\) for some matrix \(B\)
These are the main equivalent ways of recognizing PSD structure.
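As a numerical sanity check of these pictures, the sketch below builds a PSD matrix, confirms its eigenvalues are nonnegative, and reconstructs a factor \(B\) with \(A = B^\top B\) from the spectral decomposition; the construction of \(B\) here is one standard choice, not the only one:

```python
import numpy as np

rng = np.random.default_rng(2)
M = rng.standard_normal((4, 4))
A = M.T @ M                       # PSD by construction

# Spectral picture: all eigenvalues nonnegative (up to rounding).
eigvals, Q = np.linalg.eigh(A)
print(np.all(eigvals >= -1e-10))  # True

# Factorization picture: rebuild B with A = B^T B from A = Q diag(w) Q^T.
B = np.diag(np.sqrt(np.clip(eigvals, 0.0, None))) @ Q.T
print(np.allclose(B.T @ B, A))    # True
```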
Theorem 2 (Theorem Idea: Gram And Covariance Matrices Are PSD) Matrices of the form
\[ G_{ij}=\langle v_i,v_j\rangle \]
and covariance-type matrices of the form
\[ \Sigma = \mathbb E[(X-\mu)(X-\mu)^\top] \]
are positive semidefinite.
That is why PSD language appears so naturally in machine learning and statistics.
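A hedged sketch of both constructions, using the empirical (sample) version of the covariance formula above:

```python
import numpy as np

rng = np.random.default_rng(3)

# Gram matrix G_ij = <v_i, v_j> for rows v_i of V, i.e. G = V V^T.
V = rng.standard_normal((6, 3))
G = V @ V.T
print(np.linalg.eigvalsh(G).min() >= -1e-10)       # True: PSD

# Empirical covariance: average outer product of centered samples.
X = rng.standard_normal((200, 4))                  # 200 samples in R^4
Xc = X - X.mean(axis=0)
Sigma = Xc.T @ Xc / len(X)
print(np.linalg.eigvalsh(Sigma).min() >= -1e-10)   # True: PSD
```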
Theorem 3 (Theorem Idea: Positive Definiteness Means Strict Curvature) If a quadratic objective has Hessian matrix \(A\) and \(A\) is positive definite, then the quadratic bends upward in every nonzero direction.
This is one of the simplest bridges from linear algebra to optimization.
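A minimal sketch of that bridge, with a hypothetical positive definite \(A\) and linear term \(b\): the quadratic \(f(x) = \tfrac12 x^\top A x - b^\top x\) curves strictly upward in every direction and has a unique minimizer at \(x^\star = A^{-1}b\), where the gradient \(Ax - b\) vanishes.

```python
import numpy as np

# Hypothetical positive definite Hessian and linear term.
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
b = np.array([1.0, -1.0])

# Strict curvature: d^T A d > 0 along every sampled nonzero direction.
rng = np.random.default_rng(4)
print(all(d @ A @ d > 0 for d in rng.standard_normal((100, 2))))  # True

# f(x) = 0.5 x^T A x - b^T x is minimized where A x = b.
f = lambda x: 0.5 * x @ A @ x - b @ x
x_star = np.linalg.solve(A, b)
print(f(x_star) < f(x_star + np.array([0.1, 0.0])))  # True: bends upward
```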
7 Worked Example
Consider
\[ A= \begin{bmatrix} 1 & 1\\ 1 & 1 \end{bmatrix}. \]
Then
\[ x^\top A x = \begin{bmatrix} x_1 & x_2 \end{bmatrix} \begin{bmatrix} 1 & 1\\ 1 & 1 \end{bmatrix} \begin{bmatrix} x_1\\ x_2 \end{bmatrix} = (x_1+x_2)^2. \]
So
\[ x^\top A x \ge 0 \qquad \text{for all }x, \]
which means \(A\) is PSD.
But it is not positive definite, because if \(x=(1,-1)\) then
\[ x^\top A x = 0 \]
even though \(x\neq 0\).
This is a good first-pass example because it shows:
- PSD allows zero-energy directions
- PD rules them out
- a PSD matrix can still be singular
It also has the Gram form
\[ A = v v^\top \qquad \text{with } v=(1,1)^\top, \]
so the \(B^\top B\) picture is visible immediately.
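The whole worked example can be verified in a few lines:

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [1.0, 1.0]])

x = np.array([2.0, -0.5])
print(np.isclose(x @ A @ x, (x[0] + x[1]) ** 2))  # True: the identity above

print(np.linalg.eigvalsh(A))   # [0. 2.]: PSD, not PD, and singular

z = np.array([1.0, -1.0])
print(z @ A @ z)               # 0.0: the zero-energy direction
```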
8 Computation Lens
When checking whether a matrix is PSD in practice, you usually do not test infinitely many vectors.
Instead, you use one of the equivalent pictures:
- verify symmetry
- inspect eigenvalues
- identify a Gram or covariance form
- in positive-definite cases, use a Cholesky factorization when appropriate
That is why PSD structure is both theoretically clean and computationally useful.
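A sketch of how these checks can look in practice; the tolerances are illustrative choices, not canonical values, and the Cholesky route tests positive definiteness rather than semidefiniteness:

```python
import numpy as np

def is_positive_definite(A, sym_tol=1e-10):
    """Practical PD test: Cholesky succeeds exactly when a symmetric
    matrix is (numerically) positive definite."""
    if not np.allclose(A, A.T, atol=sym_tol):
        return False
    try:
        np.linalg.cholesky(A)
        return True
    except np.linalg.LinAlgError:
        return False

def is_positive_semidefinite(A, tol=1e-10):
    """PSD test via eigenvalues, with a tolerance for rounding error."""
    if not np.allclose(A, A.T, atol=tol):
        return False
    return np.linalg.eigvalsh(A).min() >= -tol

A_pd  = np.array([[2.0, 1.0], [1.0, 2.0]])
A_psd = np.array([[1.0, 1.0], [1.0, 1.0]])   # singular, so PSD only
print(is_positive_definite(A_pd), is_positive_semidefinite(A_pd))    # True True
print(is_positive_definite(A_psd), is_positive_semidefinite(A_psd))  # False True
```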
9 Application Lens
9.1 Optimization
Convex quadratic objectives and many Hessian-based arguments are really PSD statements in disguise.
9.2 Statistics
Covariance matrices are PSD by construction, so variance geometry already lives in this language.
9.3 Machine Learning
Kernel matrices, Gram matrices, and normal-equation matrices like \(X^\top X\) are PSD, which is why PSD structure keeps appearing in regression, kernels, Gaussian processes, and high-dimensional theory.
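As one standard instance, the Gaussian (RBF) kernel is a positive definite kernel, so the kernel matrix it produces on any point set is PSD up to rounding; the bandwidth choice below is illustrative:

```python
import numpy as np

rng = np.random.default_rng(5)
X = rng.standard_normal((30, 2))   # 30 points in the plane

# RBF kernel matrix K_ij = exp(-||x_i - x_j||^2 / (2 sigma^2)).
sigma = 1.0
sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
K = np.exp(-sq_dists / (2 * sigma**2))
print(np.linalg.eigvalsh(K).min() >= -1e-10)        # True: K is PSD

# The normal-equation matrix X^T X is PSD for the same Gram reason.
print(np.linalg.eigvalsh(X.T @ X).min() >= -1e-10)  # True
```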
10 Stop Here For First Pass
If you can now explain:
- why PSD is a quadratic-form condition
- why nonnegative eigenvalues are the spectral picture of PSD structure
- why Gram and covariance matrices are naturally PSD
- why PSD does not mean invertible or strictly positive in every direction
then this page has done its job.
11 Go Deeper
This is the current stopping point of the live Matrix Analysis opening.
The next natural module steps are:
- Spectral Inequalities and Variational Principles
- Perturbation and Stability
Until those pages are live, the strongest adjacent references are collected in the two sections below.
12 Optional Deeper Reading After First Pass
The strongest current references connected to this page are:
- MIT 18.065 Lecture 5: Positive Definite and Semidefinite Matrices - official lecture page centered exactly on PSD ideas. Checked 2026-04-25.
- MIT 18.06SC Lecture 27: Positive Definite Matrices and Minima - official lecture page connecting PSD structure to minima and quadratic geometry. Checked 2026-04-25.
- MIT 18.06 lecture notes - official notes with clean PSD equivalences and examples. Checked 2026-04-25.
- Stanford EE364a: Convex Optimization I - official course page where PSD matrices and semidefinite constraints are standard language. Checked 2026-04-25.
13 Sources and Further Reading
- MIT 18.065 Lecture 5: Positive Definite and Semidefinite Matrices - First pass - official lecture page focused directly on PSD structure. Checked 2026-04-25.
- MIT 18.06SC Lecture 27: Positive Definite Matrices and Minima - First pass - official lecture page connecting positive definiteness to minima and quadratic forms. Checked 2026-04-25.
- MIT 18.06 lecture notes - First pass - official notes with clean PSD equivalences, examples, and spectral language. Checked 2026-04-25.
- Stanford EE364a: Convex Optimization I - Second pass - official optimization course page where PSD structure becomes part of broader convex-analysis language. Checked 2026-04-25.