Inference in High Dimension
high-dimensional inference, debiasing, selective inference, confidence intervals, p-values
1 Role
This is the seventh page of the High-Dimensional Statistics module.
The previous pages built estimators, rates, covariance/PCA viewpoints, and minimax lower bounds.
This page asks the final first-pass question:
after regularization or selection, what can we still say with confidence about uncertainty?
That is the entry point to high-dimensional inference.
2 First-Pass Promise
Read this page after Minimax and Lower Bounds.
If you stop here, you should still understand:
- why classical regression inference does not automatically survive selection or regularization
- why lasso estimates are useful for prediction or support recovery but biased for direct inference
- why debiasing tries to recover approximately centered, normal coordinates
- why selective inference changes the target by conditioning on the selection event
3 Why It Matters
In classical regression, confidence intervals and p-values are tied to estimators with well-understood sampling distributions.
In high dimensions, many common pipelines do something different:
- select variables
- shrink coefficients
- regularize an optimization problem
- then apply classical inferential formulas as if the model had been fixed in advance
That shortcut is dangerous.
The same data have been used both:
- to choose a model or sparse representation
- and to quantify uncertainty inside that chosen model
So high-dimensional inference is not just “do the old t-test after lasso.”
It is a distinct theory problem.
4 Prerequisite Recall
- classical confidence intervals assume a stable target and a tractable estimator distribution
- lasso improves estimation and selection by shrinking coefficients, which introduces bias
- minimax theory tells us uncertainty targets can themselves be fundamentally hard
- optimization and KKT conditions help explain where lasso bias comes from
5 Intuition
5.1 Prediction Is Not Inference
A model can predict well and still give unreliable uncertainty statements about individual coefficients.
That is because prediction asks:
does \(X\widehat\beta\) produce good outputs?
Inference asks:
how uncertain are we about a specific target parameter?
Those are different goals.
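A minimal numerical sketch of that gap, assuming a design with two nearly collinear predictors (all sizes and noise levels here are illustrative assumptions, not taken from the sources below): the fitted model predicts well in sample, yet the least-squares standard errors for the individual coefficients are very large.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + 0.01 * rng.normal(size=n)        # nearly collinear with x1
X = np.column_stack([x1, x2])
y = x1 + 0.5 * rng.normal(size=n)          # true signal lives only on x1

# Ordinary least squares fit
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta_hat
sigma2_hat = resid @ resid / (n - X.shape[1])

# Prediction looks fine ...
print("in-sample prediction MSE:", resid @ resid / n)

# ... but coefficient-wise uncertainty is huge because of collinearity
cov_beta = sigma2_hat * np.linalg.inv(X.T @ X)
print("beta_hat:        ", beta_hat)
print("standard errors: ", np.sqrt(np.diag(cov_beta)))
```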
5.2 Why Naive Post-Selection Inference Fails
If a variable is selected because it looked strong in the sample, then treating it afterward as if it had been chosen in advance ignores the randomness of the selection step.
This makes intervals too optimistic and p-values too small.
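A small simulation sketch of that optimism (pure-noise design; the sample sizes and the 0.05 threshold are illustrative assumptions): pick the variable that looks strongest in the sample, then compute a classical p-value for it as if it had been chosen in advance. Under a fixed-in-advance model the p-values would be roughly uniform; after selection they concentrate near zero.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n, p, reps = 100, 200, 500
pvals = []

for _ in range(reps):
    X = rng.normal(size=(n, p))
    y = rng.normal(size=n)                 # no variable is truly related to y

    # Selection step: keep the variable that looks strongest in this sample
    corr = X.T @ y / n
    j = int(np.argmax(np.abs(corr)))

    # Naive step: classical t-test for that coefficient, ignoring selection
    xj = X[:, j]
    beta = xj @ y / (xj @ xj)
    resid = y - beta * xj
    se = np.sqrt(resid @ resid / (n - 1) / (xj @ xj))
    pvals.append(2 * stats.t.sf(abs(beta / se), df=n - 1))

print("fraction of naive p-values below 0.05:", np.mean(np.array(pvals) < 0.05))
```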
5.3 Two First-Pass Responses
Two common first-pass strategies are:
- debiasing / desparsifying: correct a regularized estimator so that individual coordinates behave more like centered, asymptotically normal quantities
- selective inference: condition on the fact that a particular model or active set was selected, and do inference for a conditional target
These do not answer exactly the same question.
That is one of the central conceptual lessons of the area.
6 Formal Core
Definition 1 (Definition: Inferential Target) In high-dimensional inference, the first question is:
what parameter are we trying to make a statement about?
Typical targets include:
- one coefficient
- one linear functional
- a low-dimensional subset of coordinates
- a parameter defined conditional on selection
The target must be specified before the interval or test can be interpreted.
Theorem 1 (Theorem Idea: Naive Post-Selection Least-Squares Inference Is Invalid) If one uses the same data to select variables and then runs ordinary least-squares inference as though that selected model were fixed in advance, the resulting confidence intervals and p-values generally do not have their nominal guarantees.
This is one of the main first-pass warnings of the subject.
Theorem 2 (Theorem Idea: Debiasing Tries to Restore Approximate Normality) A debiased or desparsified estimator starts with a regularized estimate such as the lasso and adds a correction term designed to reduce its bias.
At first pass, the shape is
\[ \widetilde\beta = \widehat\beta + \text{correction term}. \]
Under suitable assumptions, individual coordinates of \(\widetilde\beta\) can behave approximately like centered normal quantities, which then supports confidence intervals and tests for low-dimensional targets.
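A minimal sketch of that shape in code, assuming the common desparsified-lasso form \(\widetilde\beta = \widehat\beta + \tfrac{1}{n} M X^\top (y - X\widehat\beta)\), where \(M\) approximates the inverse of the sample covariance \(\widehat\Sigma = X^\top X / n\). The ridge-regularized inverse used for \(M\) here, and every tuning constant, are illustrative assumptions; careful constructions such as node-wise regressions are what make the normal approximation work in practice.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(2)
n, p, s = 200, 500, 5
X = rng.normal(size=(n, p))
beta_true = np.zeros(p)
beta_true[:s] = 1.0
y = X @ beta_true + rng.normal(size=n)

# Step 1: regularized (lasso) estimate -- biased toward zero by shrinkage
lam = 2 * np.sqrt(np.log(p) / n)
beta_hat = Lasso(alpha=lam, fit_intercept=False).fit(X, y).coef_

# Step 2: crude approximate inverse of Sigma_hat = X'X / n
# (ridge-regularized inverse, purely for illustration)
Sigma_hat = X.T @ X / n
M = np.linalg.inv(Sigma_hat + 0.1 * np.eye(p))

# Step 3: debiased / desparsified estimator
beta_tilde = beta_hat + M @ X.T @ (y - X @ beta_hat) / n

print("lasso estimate of beta_1:   ", beta_hat[0])
print("debiased estimate of beta_1:", beta_tilde[0])
print("true value of beta_1:       ", beta_true[0])
```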
Theorem 3 (Theorem Idea: Selective Inference Changes the Conditioning and Often the Target) Selective inference conditions on the event that a given model or active set was selected, and then performs inference relative to that conditional world.
This can produce valid post-selection inference, but the resulting target is not always the same as the unconditional population coefficient one may have started with.
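In symbols, the guarantee selective inference aims for is conditional coverage (the notation below is schematic, not tied to a particular method):
\[ \mathbb{P}\left( \theta_M \in \mathrm{CI} \;\middle|\; \widehat{M} = M \right) \ge 1 - \alpha, \]
where \(\widehat{M}\) is the data-dependent selected model and \(\theta_M\) is a target defined relative to \(M\), which need not coincide with the unconditional population coefficient.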
Theorem 4 (Theorem Idea: Inference Is Harder Than Prediction) Prediction error can often be small even when honest uncertainty quantification is delicate or impossible without stronger assumptions.
That is why valid high-dimensional inference usually needs more structure than point prediction alone.
7 Worked Example
Suppose a lasso fit selects variables j = 3, 8, 15 from a problem with p = 5000 and n = 200.
A naive workflow is:
- run lasso on the full data
- keep those three variables
- run ordinary least squares on the same data
- report classical confidence intervals for those three coefficients
The problem is that step 3 pretends the model {3,8,15} was fixed in advance.
But it was chosen because the same sample made those variables look special.
So the reported intervals ignore model-selection randomness.
A first-pass fix is not to trust the OLS formulas anyway.
It is to change the inferential procedure:
- either debias the penalized estimator and target a low-dimensional coordinate
- or condition on the selection event and do selective inference
The details are advanced, but the conceptual repair is already visible here.
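A third route, sample splitting (listed in the Computation Lens below), is the simplest one to sketch in code: select on one half of the data, then run classical OLS inference on the other half, which never participated in the selection. The sizes, the lasso penalty, and the true coefficients below are illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(3)
n, p = 200, 1000                          # illustrative p >> n regime
beta_true = np.zeros(p)
beta_true[[3, 8, 15]] = [1.0, -0.8, 0.6]
X = rng.normal(size=(n, p))
y = X @ beta_true + rng.normal(size=n)

# Split the sample: one half for selection, the other half for inference
half = n // 2
X_sel, y_sel = X[:half], y[:half]
X_inf, y_inf = X[half:], y[half:]

# Selection step uses only the first half
active = np.flatnonzero(Lasso(alpha=0.3, fit_intercept=False).fit(X_sel, y_sel).coef_)
print("selected variables:", active)

# Classical OLS inference on the held-out half, which never saw the selection
Xa = X_inf[:, active]
ols = np.linalg.lstsq(Xa, y_inf, rcond=None)[0]
resid = y_inf - Xa @ ols
sigma2 = resid @ resid / (Xa.shape[0] - Xa.shape[1])
se = np.sqrt(sigma2 * np.diag(np.linalg.inv(Xa.T @ Xa)))
for j, b, s in zip(active, ols, se):
    print(f"variable {j}: estimate {b:+.2f}, approx 95% CI ({b - 2*s:+.2f}, {b + 2*s:+.2f})")
```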
8 Computation Lens
When you read a high-dimensional inference result, ask:
- what is the target parameter?
- is the method debiasing, sample splitting, selective inference, or something else?
- what assumptions are needed on sparsity, noise, and the design covariance?
- is the guarantee marginal, simultaneous, or conditional on selection?
Those questions usually tell you what the interval or p-value really means.
9 Application Lens
9.1 Scientific Variable Claims
In genomics, neuroscience, and social science, people often want uncertainty on individual features. High-dimensional inference explains why sparse selection alone is not enough to justify those claims.
9.2 Post-Regularization ML Interpretability
Regularized models are often treated as interpretable because they are sparse. This page is the caution that interpretability and valid uncertainty are separate achievements.
9.3 Multiple Testing and Large-Scale Discovery
Many modern problems involve many possible signals at once. High-dimensional inference is one of the bridges from sparse estimation to large-scale testing, false discovery control, and uncertainty after selection.
10 Stop Here For First Pass
If you can now explain:
- why post-selection OLS inference is generally invalid
- why lasso bias is a direct obstacle to coefficient-wise inference
- why debiasing and selective inference are different responses
- why valid uncertainty quantification needs a carefully specified target
then this page has done its job.
11 Go Deeper
After this page, the strongest adjacent live pages are:
12 Optional Deeper Reading After First Pass
The strongest current references connected to this page are:
- Stanford STATS 202: High-dimensional regression - official notes with a clear warning about inference when p > n. Checked 2026-04-25.
- CMU 36-709 Lecture 14 - official lecture notes covering debiasing, the inferential goal, and the need to correct lasso bias. Checked 2026-04-25.
- Stanford: Confidence Intervals and Hypothesis Testing for High-Dimensional Regression - official project page explaining a debiasing-based route to confidence intervals and p-values. Checked 2026-04-25.
- Stanford Stats 364: Conditional inference II - official notes showing the conditional/selective-inference viewpoint after lasso selection. Checked 2026-04-25.
13 Sources and Further Reading
- Stanford STATS 202: High-dimensional regression - First pass - official notes for why classical inference breaks in the p > n setting. Checked 2026-04-25.
- CMU 36-709 Lecture 14 - First pass - official lecture notes for debiasing, low-dimensional inferential targets, and the bias-variance tradeoff. Checked 2026-04-25.
- Stanford: Confidence Intervals and Hypothesis Testing for High-Dimensional Regression - Second pass - official Stanford page for one influential debiasing route to coefficient-wise inference. Checked 2026-04-25.
- Stanford Stats 364: Conditional inference II - Second pass - official notes for the conditional/selective-inference viewpoint after model selection. Checked 2026-04-25.