Noise Models, Wiener Filtering, and MMSE Estimation
noise, MMSE, LMMSE, Wiener filter, power spectral density
1 Role
This is the fourth page of the Signal Processing and Estimation module.
Its job is to move from deterministic signals and sampling into noisy observations and principled estimation.
The earlier pages said:
- systems transform signals
- frequency views explain filters
- sampling explains analog-to-digital conversion
This page adds:
- signals and noise are often modeled probabilistically
- estimation quality can be measured by mean-square error
- Wiener filtering is the spectral linear-estimation version of that idea
2 First-Pass Promise
Read this page after Sampling, Aliasing, and Reconstruction.
If you stop here, you should still understand:
- why random noise models are useful
- what MSE, MMSE, and LMMSE mean
- why orthogonality is the core least-squares geometry
- why Wiener filtering balances signal power against noise power by frequency
3 Why It Matters
Real measurements are almost never clean.
They are corrupted by:
- sensor noise
- channel noise
- blur plus measurement error
- finite-resolution acquisition
- interference from other sources
So signal processing needs more than deterministic convolution and sampling.
It needs a language for:
- modeling uncertainty
- defining what a “good estimate” means
- designing filters that recover useful structure without amplifying noise too aggressively
This is the point where signal processing starts to overlap strongly with statistics, estimation, control, and modern ML.
4 Prerequisite Recall
- probability provides expectations, variances, covariance, and conditional expectation
- statistics already introduced mean-square loss and estimation error
- Fourier and spectral viewpoints tell us how signal energy can be distributed across frequencies
- LTI systems provide the natural class of filters we can use for denoising or recovery
5 Intuition
5.1 Noise Is A Model, Not Just A Nuisance Word
When we write
\[ y = x + v, \]
we are usually not claiming every fluctuation is literally random in a metaphysical sense.
We are choosing a probabilistic model that lets us reason about uncertainty and average error.
5.2 Mean-Square Error Measures Typical Estimation Quality
If \hat{x} is an estimate of x, then the mean-square error
\[ \mathbb{E}[(x-\hat{x})^2] \]
measures expected squared discrepancy.
It penalizes large mistakes more heavily than small ones and interacts cleanly with linear algebra and orthogonality.
5.3 MMSE And LMMSE Are Not The Same Object
MMSE means:
- choose the estimator with smallest mean-square error among all measurable estimators
LMMSE means:
- restrict attention to estimators that are linear in the observations
So LMMSE is usually easier to compute, but it is a restricted problem.
In Gaussian settings, that restricted problem often matches the full MMSE problem.
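A quick Monte Carlo sketch of that last claim, under an assumed jointly Gaussian model y = x + v with independent zero-mean components: in this setting the conditional mean E[x | y] is itself linear in y, so the LMMSE coefficient computed from second-order statistics recovers the full MMSE estimator.

```python
# Monte Carlo sketch: in a jointly Gaussian model, MMSE and LMMSE coincide.
# Assumed model: y = x + v with x ~ N(0, sx2), v ~ N(0, sv2), independent.
import numpy as np

rng = np.random.default_rng(0)
sx2, sv2 = 4.0, 1.0
n = 200_000

x = rng.normal(0.0, np.sqrt(sx2), n)
v = rng.normal(0.0, np.sqrt(sv2), n)
y = x + v

# LMMSE coefficient from second-order statistics: a = Cov(x, y) / Var(y).
a_sample = np.cov(x, y)[0, 1] / np.var(y)

# In the Gaussian case E[x | y] = a * y with the same coefficient.
a_theory = sx2 / (sx2 + sv2)

print(a_sample, a_theory)  # both close to 0.8
```

The sample coefficient converges to the closed-form ratio, which is exactly the conditional-mean slope in the Gaussian case.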
5.4 Orthogonality Is The Geometry Behind Least Squares
The first-pass geometry is:
- the estimation error should be orthogonal to the information used to form the estimate
That is the signal-processing version of projection in Hilbert space.
5.5 Wiener Filtering Is Frequency-Wise Linear Estimation
For wide-sense stationary random processes, the frequency domain often decouples the estimation problem.
At first pass, Wiener filtering says:
- trust frequencies where signal power dominates noise
- shrink frequencies where noise power dominates signal
So Wiener filtering is not just “smooth the data.”
It is a structured spectral tradeoff between fidelity and denoising.
6 Formal Core
Definition 1 (Definition: Random Signal Model) A random signal model treats the signal of interest, the noise, or both as random variables or random processes with specified first- and second-order structure.
At first pass, this usually means:
- means
- variances
- covariances
- sometimes power spectral densities
Definition 2 (Definition: Mean-Square Error) For an estimate \hat{x} of a random quantity x, the mean-square error is
\[ \operatorname{MSE}(\hat{x}) = \mathbb{E}\big[(x-\hat{x})^2\big]. \]
Vector versions use squared norm error.
Definition 3 (Definition: MMSE Estimator) An MMSE estimator minimizes mean-square error over all admissible estimators based on the observation.
At first pass, the load-bearing fact is:
\[ \hat{x}_{\mathrm{MMSE}} = \mathbb{E}[x \mid y]. \]
Definition 4 (Definition: LMMSE Estimator) An LMMSE estimator minimizes mean-square error among estimators that are linear functions of the observation.
This is usually easier to compute because it depends only on second-order statistics.
Theorem 1 (Theorem Idea: Orthogonality Principle) For least-squares-optimal estimation, the estimation error is orthogonal to the space generated by the data used to form the estimate.
At first pass, this is the projection rule that produces normal equations.
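The projection rule and the normal equations can be seen numerically. The sketch below assumes a toy model in which three noisy looks y_i = x + v_i are combined linearly; the weights solve the sample normal equations, so the residual is orthogonal to every data component up to machine precision.

```python
# Numerical sketch of the orthogonality principle: solve the normal
# equations for a linear estimate of x from a data vector y, then check
# that the residual is orthogonal to every component of y.
# Assumed toy model: y_i = x + v_i with independent noise, i = 1, 2, 3.
import numpy as np

rng = np.random.default_rng(1)
n, m = 100_000, 3
x = rng.normal(0.0, 2.0, n)
y = x[:, None] + rng.normal(0.0, 1.0, (n, m))  # three noisy looks at x

Ryy = (y.T @ y) / n            # sample correlation matrix of the data
ryx = (y.T @ x) / n            # cross-correlation with the target
w = np.linalg.solve(Ryy, ryx)  # normal equations: Ryy w = ryx

err = x - y @ w                # estimation residual

# Zero up to numerical precision, because w is the exact sample projection.
print(err @ y / n)
```

The cross-correlations vanish not approximately but to machine precision: solving the normal equations is the same operation as projecting x onto the span of the data.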
Definition 5 (Definition: Wide-Sense Stationarity) A random process is wide-sense stationary if its mean is constant and its autocovariance depends only on time difference, not absolute time.
This makes spectral analysis and LTI filtering especially clean.
Theorem 2 (Theorem Idea: Wiener Filtering) In a stationary linear-estimation setting, the Wiener filter is the linear filter that minimizes mean-square error between the desired signal and the filter output.
For the classic additive model y = x + v with uncorrelated signal and noise, the noncausal Wiener filter has the first-pass spectral form
\[ H(\omega) = \frac{S_{xx}(\omega)}{S_{xx}(\omega) + S_{vv}(\omega)}, \]
where S_{xx} and S_{vv} are the signal and noise power spectral densities.
The interpretation is immediate:
- H(\omega) is near 1 where signal dominates
- H(\omega) is near 0 where noise dominates
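The spectral rule can be sketched with an FFT. The demo below cheats by estimating the PSDs from the clean signal and noise themselves ("oracle" periodograms, unavailable in practice) so that the shrink-per-frequency behavior is easy to see; the sinusoid frequencies and noise level are arbitrary choices for illustration.

```python
# Sketch of the noncausal Wiener filter H(w) = Sxx / (Sxx + Svv),
# using "oracle" periodogram PSDs from the clean signal and the noise
# (a demo shortcut, not something available in practice).
import numpy as np

rng = np.random.default_rng(2)
n = 4096
t = np.arange(n)

# Two bin-aligned sinusoids plus white noise.
x = np.sin(2 * np.pi * 40 * t / n) + 0.5 * np.sin(2 * np.pi * 120 * t / n)
v = rng.normal(0.0, 1.0, n)
y = x + v

Sxx = np.abs(np.fft.fft(x)) ** 2 / n  # periodogram of the clean signal
Svv = np.abs(np.fft.fft(v)) ** 2 / n  # periodogram of the noise
H = Sxx / (Sxx + Svv)                 # near 1 at signal bins, near 0 elsewhere

x_hat = np.real(np.fft.ifft(H * np.fft.fft(y)))

mse_raw = np.mean((x - y) ** 2)
mse_wiener = np.mean((x - x_hat) ** 2)
print(mse_raw, mse_wiener)  # the Wiener output has much lower MSE
```

Because the signal energy sits in a few frequency bins, H passes those bins almost untouched and suppresses everything else, which is exactly the "trust where signal dominates, shrink where noise dominates" tradeoff.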
7 Worked Example
Suppose the observation is
\[ y = x + v, \]
where:
- x has mean 0 and variance \sigma_x^2
- v has mean 0 and variance \sigma_v^2
- x and v are uncorrelated
Look for the best linear estimator of the form
\[ \hat{x} = a y. \]
The mean-square error is
\[ \mathbb{E}\big[(x-a(x+v))^2\big] = (1-a)^2 \sigma_x^2 + a^2 \sigma_v^2. \]
Minimizing over a gives
\[ a^\star = \frac{\sigma_x^2}{\sigma_x^2 + \sigma_v^2}. \]
So
\[ \hat{x}_{\mathrm{LMMSE}} = \frac{\sigma_x^2}{\sigma_x^2 + \sigma_v^2} y. \]
This has the exact shape we want:
- if signal variance is much larger than noise variance, trust the measurement more
- if noise variance is much larger, shrink the measurement heavily
The Wiener filter is the frequency-dependent process version of this same tradeoff.
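The scalar tradeoff above is easy to verify numerically: evaluating the MSE expression over a grid of gains confirms that the minimizer is \sigma_x^2 / (\sigma_x^2 + \sigma_v^2), with the variances below chosen arbitrarily for illustration.

```python
# Numerical check of the worked example: the MSE
#   E[(x - a y)^2] = (1 - a)^2 * sx2 + a^2 * sv2
# is minimized at a* = sx2 / (sx2 + sv2).
import numpy as np

sx2, sv2 = 4.0, 1.0
a_star = sx2 / (sx2 + sv2)

def mse(a):
    """Mean-square error of the linear estimator x_hat = a * y."""
    return (1 - a) ** 2 * sx2 + a ** 2 * sv2

# Brute-force search over a fine grid of gains.
a_grid = np.linspace(0.0, 1.0, 10_001)
a_best = a_grid[np.argmin(mse(a_grid))]

print(a_star, a_best)  # both 0.8
print(mse(a_star))     # 0.8, versus mse(0) = 4 (ignore data) and mse(1) = 1 (trust data fully)
```

The optimal MSE, \sigma_x^2 \sigma_v^2 / (\sigma_x^2 + \sigma_v^2), is strictly smaller than both the prior variance and the raw-measurement error, which is the whole point of shrinkage.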
8 Computation Lens
When you meet a noisy-signal estimation problem, ask:
- what is random here: the signal, the noise, or both?
- what loss is being minimized?
- is the target full MMSE, linear LMMSE, or a causal filtering problem?
- are we working with scalar variables, vectors, or stationary processes?
- is the useful structure in covariance or in the power spectral density?
These questions tell you whether the right tool is conditional expectation, normal equations, Wiener filtering, or later-state estimation tools like Kalman filtering.
9 Application Lens
9.1 Communications
Receivers constantly estimate signals buried in channel noise and interference.
9.2 Imaging And Sensing
Denoising and deblurring problems often trade sharp recovery against noise amplification, which is exactly the terrain Wiener filtering was built for.
9.3 Sequential Estimation
The static and stationary ideas here become the bridge to state estimation and Kalman filtering on the next branch of the module tree.
10 Stop Here For First Pass
If you stop here, retain these five ideas:
- random signal models let us reason about noise systematically
- MMSE is the best estimator overall under squared loss
- LMMSE is the best estimator within the linear class
- orthogonality is the core geometry behind least-squares estimation
- Wiener filtering is frequency-dependent linear MMSE estimation for stationary signal-plus-noise models
11 Go Deeper
The strongest adjacent live pages are:
- Signal Processing and Estimation
- Sampling, Aliasing, and Reconstruction
- State Estimation, Smoothing, and Hidden-State Inference
- Control and Dynamics: Estimation, Kalman Filtering, and the Separation Principle
- Stochastic Control and Dynamic Programming: Stochastic Linear Systems, LQG, and the Separation Principle
- Probability
- Statistics
12 Optional Deeper Reading After First Pass
- MIT 6.011 readings page - official MIT reading map showing the arc from MMSE and LMMSE estimation to WSS processes and Wiener filtering. Checked 2026-04-25.
- MIT 6.011 Lecture 13: MMSE and LMMSE Estimation - official lecture notes on the vector and projection view of MMSE and LMMSE estimation. Checked 2026-04-25.
- MIT 6.011 Lecture 14: LMMSE Estimation, Orthogonality - official lecture notes for orthogonality and normal-equation structure. Checked 2026-04-25.
- MIT 6.011 Lecture 16: LTI Filtering of WSS Processes - official lecture notes on stationary processes and filtering structure. Checked 2026-04-25.
- MIT 6.011 Lecture 20: Wiener Filtering - official lecture notes dedicated to Wiener filtering. Checked 2026-04-25.
- Stanford EE278 course overview - official Stanford course home explicitly listing MMSE estimation, Wiener filtering, and Kalman filtering in the current course scope. Checked 2026-04-25.
- Stanford EE278 course outline - official outline showing MSE estimation, random processes, power spectral density, and Wiener filtering in sequence. Checked 2026-04-25.
13 Sources and Further Reading
- MIT 6.011 readings page - First pass - official topic map for MMSE, WSS processes, and Wiener filtering. Checked 2026-04-25.
- MIT 6.011 Lecture 13: MMSE and LMMSE Estimation - First pass - official lecture notes on projection and least-squares estimation geometry. Checked 2026-04-25.
- MIT 6.011 Lecture 14: LMMSE Estimation, Orthogonality - First pass - official lecture notes on orthogonality and normal equations. Checked 2026-04-25.
- MIT 6.011 Lecture 16: LTI Filtering of WSS Processes - First pass - official notes on the random-process and filtering bridge. Checked 2026-04-25.
- MIT 6.011 Lecture 20: Wiener Filtering - First pass - official lecture notes on Wiener filter construction and interpretation. Checked 2026-04-25.
- Stanford EE278 course overview - First pass - official Stanford course page connecting MMSE estimation, Wiener filtering, and Kalman filtering. Checked 2026-04-25.
- Stanford EE278 course outline - Second pass - official outline for the estimation, random-process, PSD, and Wiener-filtering sequence. Checked 2026-04-25.