Noise Models, Wiener Filtering, and MMSE Estimation

How random signal models, mean-square error, orthogonality, and Wiener filtering turn noisy observations into principled estimators.
Modified: April 26, 2026

Keywords

noise, MMSE, LMMSE, Wiener filter, power spectral density

1 Role

This is the fourth page of the Signal Processing and Estimation module.

Its job is to move from deterministic signals and sampling into noisy observations and principled estimation.

The earlier pages said:

  • systems transform signals
  • frequency views explain filters
  • sampling explains analog-to-digital conversion

This page adds:

  • signals and noise are often modeled probabilistically
  • estimation quality can be measured by mean-square error
  • Wiener filtering is the spectral linear-estimation version of that idea

2 First-Pass Promise

Read this page after Sampling, Aliasing, and Reconstruction.

If you stop here, you should still understand:

  • why random noise models are useful
  • what MSE, MMSE, and LMMSE mean
  • why orthogonality is the core least-squares geometry
  • why Wiener filtering balances signal power against noise power by frequency

3 Why It Matters

Real measurements are almost never clean.

They are corrupted by:

  • sensor noise
  • channel noise
  • blur plus measurement error
  • finite-resolution acquisition
  • interference from other sources

So signal processing needs more than deterministic convolution and sampling.

It needs a language for:

  • modeling uncertainty
  • defining what a “good estimate” means
  • designing filters that recover useful structure without amplifying noise too aggressively

This is the point where signal processing starts to overlap strongly with statistics, estimation, control, and modern ML.

4 Prerequisite Recall

  • probability provides expectations, variances, covariance, and conditional expectation
  • statistics already introduced mean-square loss and estimation error
  • Fourier and spectral viewpoints tell us how signal energy can be distributed across frequencies
  • LTI systems provide the natural class of filters we can use for denoising or recovery

5 Intuition

5.1 Noise Is A Model, Not Just A Nuisance Word

When we write

\[ y = x + v, \]

we are usually not claiming every fluctuation is literally random in a metaphysical sense.

We are choosing a probabilistic model that lets us reason about uncertainty and average error.

5.2 Mean-Square Error Measures Typical Estimation Quality

If \hat{x} is an estimate of x, then the mean-square error

\[ \mathbb{E}[(x-\hat{x})^2] \]

measures expected squared discrepancy.

It penalizes large mistakes more heavily than small ones and interacts cleanly with linear algebra and orthogonality.
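As a quick numerical sketch (an illustration of mine, not part of the page's argument), the expectation can be approximated by averaging squared errors over many samples:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=100_000)                      # true values
x_hat = x + rng.normal(scale=0.5, size=100_000)   # estimates whose error has std 0.5

# Empirical mean-square error: the average squared discrepancy.
mse = np.mean((x - x_hat) ** 2)
# The estimation error has variance 0.25, so mse should land near 0.25.
```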

5.3 MMSE And LMMSE Are Not The Same Object

MMSE means:

  • choose the estimator with smallest mean-square error among all measurable estimators

LMMSE means:

  • restrict attention to estimators that are linear in the observations

So LMMSE is usually easier to compute, but it is a restricted problem.

In Gaussian settings, that restricted problem often matches the full MMSE problem.
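To see this concretely, here is a small simulation (my own illustration, with assumed unit variances) of the jointly Gaussian case, where the conditional mean \mathbb{E}[x \mid y] = y/2 is itself a linear function of y:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 500_000
x = rng.normal(size=n)             # signal, variance 1
y = x + rng.normal(size=n)         # observation: signal plus unit-variance noise

# Estimate E[x | y] empirically by averaging x inside narrow bins of y.
bins = np.linspace(-2, 2, 21)
centers = 0.5 * (bins[:-1] + bins[1:])
idx = np.digitize(y, bins)
cond_mean = np.array([x[idx == i].mean() for i in range(1, len(bins))])

# In this Gaussian case E[x | y] = 0.5 * y: the full MMSE estimator is
# linear, so it coincides with the LMMSE estimator.
```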

5.4 Orthogonality Is The Geometry Behind Least Squares

The first-pass geometry is:

  • the estimation error should be orthogonal to the information used to form the estimate

That is the signal-processing version of projection in Hilbert space.
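A numerical sketch of the projection rule (my illustration; the shrinkage gain used here is the one derived in the worked example below):

```python
import numpy as np

rng = np.random.default_rng(1)
sigma_x, sigma_v = 2.0, 1.0
x = rng.normal(scale=sigma_x, size=200_000)   # signal
v = rng.normal(scale=sigma_v, size=200_000)   # noise, uncorrelated with x
y = x + v

a = sigma_x**2 / (sigma_x**2 + sigma_v**2)    # LMMSE shrinkage gain
err = x - a * y                               # estimation error

# Orthogonality: the error is uncorrelated with the data used to estimate.
# E[err * y] = (1 - a) * sigma_x^2 - a * sigma_v^2 = 0 at the optimal a.
```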

5.5 Wiener Filtering Is Frequency-Wise Linear Estimation

For wide-sense stationary random processes, the frequency domain often decouples the estimation problem.

At first pass, Wiener filtering says:

  • trust frequencies where signal power dominates noise
  • shrink frequencies where noise power dominates signal

So Wiener filtering is not just “smooth the data.”

It is a structured spectral tradeoff between fidelity and denoising.

6 Formal Core

Definition 1 (Random Signal Model) A random signal model treats the signal of interest, the noise, or both as random variables or random processes with specified first- and second-order structure.

At first pass, this usually means:

  • means
  • variances
  • covariances
  • sometimes power spectral densities

Definition 2 (Mean-Square Error) For an estimate \hat{x} of a random quantity x, the mean-square error is

\[ \operatorname{MSE}(\hat{x}) = \mathbb{E}\big[(x-\hat{x})^2\big]. \]

Vector versions use the expected squared norm of the error vector.

Definition 3 (MMSE Estimator) An MMSE estimator minimizes mean-square error over all admissible estimators based on the observation.

At first pass, the load-bearing fact is:

\[ \hat{x}_{\mathrm{MMSE}} = \mathbb{E}[x \mid y]. \]

Definition 4 (LMMSE Estimator) An LMMSE estimator minimizes mean-square error among estimators that are linear functions of the observation.

This is usually easier to compute because it depends only on second-order statistics.
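Because only second-order statistics enter, the scalar LMMSE gain can be estimated directly from sample moments; a minimal sketch under an assumed unit-variance signal model:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100_000
x = rng.normal(size=n)                    # signal, variance 1
y = x + rng.normal(scale=0.5, size=n)     # observation noise with variance 0.25

# LMMSE needs only second-order statistics: a = Cov(x, y) / Var(y).
a = np.cov(x, y)[0, 1] / np.var(y)
x_hat = a * y

# Theory for this model: a = 1 / (1 + 0.25) = 0.8.
```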

Theorem 1 (Orthogonality Principle) For mean-square-optimal estimation, the estimation error is orthogonal to the linear span of the data used to form the estimate.

At first pass, this is the projection rule that produces normal equations.

Definition 5 (Wide-Sense Stationarity) A random process is wide-sense stationary if its mean is constant and its autocovariance depends only on the time difference, not on absolute time.

This makes spectral analysis and LTI filtering especially clean.
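A small check of what stationarity buys (an illustration with an assumed MA(1) model, which is wide-sense stationary by construction): covariance estimates from different time windows should agree.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 200_000
w = rng.normal(size=n + 1)
z = w[1:] + 0.5 * w[:-1]       # MA(1) process: wide-sense stationary by construction

# Lag-1 autocovariance estimated over two disjoint time windows.
first = np.mean(z[: n // 2 - 1] * z[1 : n // 2])
second = np.mean(z[n // 2 : -1] * z[n // 2 + 1 :])

# Both windows estimate the same quantity (theory: 0.5 * Var(w) = 0.5),
# because the autocovariance depends only on the lag, not absolute time.
```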

Theorem 2 (Wiener Filtering) In a stationary linear-estimation setting, the Wiener filter is the linear filter that minimizes mean-square error between the desired signal and the filter output.

For the classic additive model y = x + v with uncorrelated signal and noise, the noncausal Wiener filter has the first-pass spectral form

\[ H(\omega) = \frac{S_{xx}(\omega)}{S_{xx}(\omega) + S_{vv}(\omega)}, \]

where S_{xx} and S_{vv} are the signal and noise power spectral densities.

The interpretation is immediate:

  • H(\omega) is near 1 at frequencies where signal power dominates
  • H(\omega) is near 0 at frequencies where noise power dominates
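The spectral tradeoff can be simulated end to end (a sketch of mine: the AR(1) signal model, the 0.95 pole, and the noise level are illustrative assumptions, and the noncausal filter is applied via a circular FFT for simplicity):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 4096
# Signal: AR(1) process x[k] = 0.95 x[k-1] + w[k], so most power sits at low frequencies.
w = rng.normal(size=n)
x = np.zeros(n)
for k in range(1, n):
    x[k] = 0.95 * x[k - 1] + w[k]
v = rng.normal(scale=2.0, size=n)     # white noise with variance 4
y = x + v

omega = 2 * np.pi * np.fft.fftfreq(n)
Sxx = 1.0 / np.abs(1 - 0.95 * np.exp(-1j * omega)) ** 2   # AR(1) power spectral density
Svv = 4.0 * np.ones(n)                                     # flat white-noise PSD

H = Sxx / (Sxx + Svv)                 # noncausal Wiener filter: near 1 where Sxx dominates
x_hat = np.real(np.fft.ifft(H * np.fft.fft(y)))

# The Wiener estimate should have lower MSE than the raw observation.
```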

7 Worked Example

Suppose the observation is

\[ y = x + v, \]

where:

  • x has mean 0 and variance \sigma_x^2
  • v has mean 0 and variance \sigma_v^2
  • x and v are uncorrelated

Look for the best linear estimator of the form

\[ \hat{x} = a y. \]

The mean-square error is

\[ \mathbb{E}\big[(x-a(x+v))^2\big] = (1-a)^2 \sigma_x^2 + a^2 \sigma_v^2. \]

Minimizing over a gives

\[ a^\star = \frac{\sigma_x^2}{\sigma_x^2 + \sigma_v^2}. \]

So

\[ \hat{x}_{\mathrm{LMMSE}} = \frac{\sigma_x^2}{\sigma_x^2 + \sigma_v^2} y. \]

This has the exact shape we want:

  • if signal variance is much larger than noise variance, trust the measurement more
  • if noise variance is much larger, shrink the measurement heavily

The Wiener filter is the frequency-dependent process version of this same tradeoff.
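The minimization step can be checked numerically; a small sketch with illustrative variances \sigma_x^2 = 4 and \sigma_v^2 = 1:

```python
import numpy as np

sigma_x2, sigma_v2 = 4.0, 1.0   # illustrative signal and noise variances

def mse(a):
    # (1 - a)^2 * sigma_x^2 + a^2 * sigma_v^2, the expression expanded above
    return (1 - a) ** 2 * sigma_x2 + a ** 2 * sigma_v2

a_star = sigma_x2 / (sigma_x2 + sigma_v2)   # closed-form minimizer: 0.8
a_grid = np.linspace(0, 1, 1001)

# The grid minimum matches the closed-form gain, and the optimal MSE is
# sigma_x^2 * sigma_v^2 / (sigma_x^2 + sigma_v^2) = 0.8.
```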

8 Computation Lens

When you meet a noisy-signal estimation problem, ask:

  1. what is random here: the signal, the noise, or both?
  2. what loss is being minimized?
  3. is the target full MMSE, linear LMMSE, or a causal filtering problem?
  4. are we working with scalar variables, vectors, or stationary processes?
  5. is the useful structure in covariance or in the power spectral density?

These questions tell you whether the right tool is conditional expectation, normal equations, Wiener filtering, or later state-estimation tools like Kalman filtering.

9 Application Lens

9.1 Communications

Receivers constantly estimate signals buried in channel noise and interference.

9.2 Imaging And Sensing

Denoising and deblurring problems often trade sharp recovery against noise amplification, which is exactly the terrain Wiener filtering was built for.

9.3 Sequential Estimation

The static and stationary ideas here become the bridge to state estimation and Kalman filtering on the next branch of the module tree.

10 Stop Here For First Pass

If you stop here, retain these five ideas:

  • random signal models let us reason about noise systematically
  • MMSE is the best estimator overall under squared loss
  • LMMSE is the best estimator within the linear class
  • orthogonality is the core geometry behind least-squares estimation
  • Wiener filtering is frequency-dependent linear MMSE estimation for stationary signal-plus-noise models

11 Go Deeper


12 Optional Deeper Reading After First Pass

13 Sources and Further Reading
