AI / ML Theory Roadmap

A dependency-aware study path from mathematical foundations to theory-facing machine learning topics.
Modified: April 26, 2026

Keywords: roadmap, machine learning theory, learning theory, optimization, high-dimensional statistics

1 Purpose

This roadmap is for readers who do not just want to use ML tools, but want to read theory-facing papers, understand assumptions, and know why the main guarantees look the way they do.

It is not a universal ML curriculum. It is a dependency-aware path through the mathematics that most often supports ML theory.

2 Who This Is For

Use this roadmap if your goal is any of the following:

  • read papers with proofs, bounds, asymptotics, or concentration arguments
  • understand why optimization and generalization claims are stated the way they are
  • move toward learning theory, high-dimensional statistics, or deep learning theory
  • connect the current site’s math modules to ML research directions

If your goal is only to get a working model pipeline quickly, this is probably too theory-heavy as a first route.

3 Main Sequence

Use this as the default order.

  1. Proofs
  2. Logic
  3. Linear Algebra
  4. Probability
  5. Statistics
  6. Single-Variable Calculus
  7. Multivariable Calculus
  8. Optimization
  9. Real Analysis
  10. Learning Theory
  11. Matrix Analysis
  12. High-Dimensional Probability
  13. High-Dimensional Statistics

All thirteen stages in the sequence above are now live on the site, ending with High-Dimensional Statistics; Matrix Analysis and High-Dimensional Probability feed directly into a complete first-pass high-dimensional-statistics module. The site also has a full Numerical Methods module for the computation side of that stack. Together, that stack is enough to begin reading the cleaner end of ML-facing theory pages.

4 Why This Order Works

4.1 Proofs and Logic First

These pages train the habits that later theory papers assume without apology:

  • parse assumptions carefully
  • expose hidden quantifiers
  • translate between prose and symbolic structure
  • negate a statement correctly before trying to prove or refute it (see the example after this list)
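
As a concrete instance of the last habit, take the standard statement that a sequence \(a_n\) converges to \(a\):

    \(\forall \varepsilon > 0 \;\exists N \;\forall n \ge N : |a_n - a| < \varepsilon\)

Its correct negation flips every quantifier and the final comparison:

    \(\exists \varepsilon > 0 \;\forall N \;\exists n \ge N : |a_n - a| \ge \varepsilon\)

This second statement is exactly what a refutation must exhibit: one bad \(\varepsilon\) that works against every \(N\).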

4.2 Linear Algebra Before Most ML

ML is full of vectors, projections, low-rank structure, eigenmodes, and learned linear maps.

Without linear algebra, model descriptions become memorized recipes. With it, many architectures reduce to variations on a small number of reusable objects.
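
A minimal NumPy sketch makes the low-rank point concrete; the matrix sizes, the rank, and the noise level below are arbitrary choices, not anything canonical:

    import numpy as np

    rng = np.random.default_rng(0)

    # A synthetic "learned linear map": 50x30, approximately rank 3.
    A = (rng.normal(size=(50, 3)) @ rng.normal(size=(3, 30))
         + 0.01 * rng.normal(size=(50, 30)))

    # The SVD exposes the low-rank structure directly.
    U, s, Vt = np.linalg.svd(A)
    print(s[:5])  # the first three singular values dominate the rest

    # Best rank-3 approximation (Eckart-Young): keep the top singular triples.
    A3 = (U[:, :3] * s[:3]) @ Vt[:3, :]
    print(np.linalg.norm(A - A3) / np.linalg.norm(A))  # small relative error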

4.3 Probability Before Statistics

Theory-facing ML uses probability to talk about randomness, sampling, concentration, conditioning, and asymptotics.

Statistics then turns that language into estimators, validation, uncertainty, and generalization-facing ideas.
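
A short simulation shows what concentration buys. The sketch below (sample sizes, repetition count, and threshold are arbitrary) compares the observed deviation frequency of a sample mean against the Hoeffding bound \(2e^{-2nt^2}\) for \([0,1]\)-valued samples:

    import numpy as np

    rng = np.random.default_rng(0)

    t = 0.05  # deviation threshold
    for n in [10, 100, 1000, 10000]:
        # 500 repeated experiments: the mean of n Uniform[0,1] draws each.
        means = rng.uniform(0, 1, size=(500, n)).mean(axis=1)
        observed = np.mean(np.abs(means - 0.5) >= t)
        hoeffding = 2 * np.exp(-2 * n * t**2)  # bound on P(|mean - 1/2| >= t)
        print(n, observed, min(hoeffding, 1.0))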

5 Shared Core Bridge Into ML

Before branching, there is one short shared core:

  1. supervised learning setup
  2. loss functions and empirical risk (see the sketch after this list)
  3. train / validation / test logic
  4. calibrated confidence and uncertainty
  5. regularization and model complexity
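
The sketch below ties the first three items together on synthetic data: a squared loss, empirical risk as its sample average, and a train/validation split used to judge the selected model. The data-generating slope, noise level, and candidate grid are all arbitrary:

    import numpy as np

    rng = np.random.default_rng(0)

    # Synthetic regression data: y = 2x + noise.
    x = rng.uniform(-1, 1, 200)
    y = 2 * x + 0.3 * rng.normal(size=200)

    # Train / validation split.
    x_tr, y_tr = x[:150], y[:150]
    x_va, y_va = x[150:], y[150:]

    def empirical_risk(w, xs, ys):
        # Average squared loss of the predictor f(x) = w * x over a sample.
        return np.mean((w * xs - ys) ** 2)

    # Minimize training risk over a grid of slopes, then report held-out risk.
    candidates = np.linspace(0, 4, 81)
    w_hat = min(candidates, key=lambda w: empirical_risk(w, x_tr, y_tr))
    print(w_hat,
          empirical_risk(w_hat, x_tr, y_tr),
          empirical_risk(w_hat, x_va, y_va))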

Useful shared-core bridge pages are:

6 Branch Points

After the shared core, most readers should branch instead of forcing one long linear path.

6.1 Optimization Branch

Use this branch if you care about:

  • gradient descent and SGD (see the sketch after this list)
  • objective design
  • regularization
  • training dynamics
  • convex and nonconvex viewpoints
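
As referenced in the first item, here is a minimal least-squares sketch contrasting full-batch gradient descent with mini-batch SGD; the problem size, step size, batch size, and iteration count are arbitrary:

    import numpy as np

    rng = np.random.default_rng(0)

    # Objective: f(w) = (1/2n) ||Xw - y||^2 on synthetic data.
    n, d = 500, 5
    X = rng.normal(size=(n, d))
    w_star = rng.normal(size=d)
    y = X @ w_star + 0.1 * rng.normal(size=n)

    def grad(w, Xb, yb):
        # Gradient of the average squared loss over a batch.
        return Xb.T @ (Xb @ w - yb) / len(yb)

    w_gd, w_sgd, lr = np.zeros(d), np.zeros(d), 0.1
    for _ in range(200):
        w_gd = w_gd - lr * grad(w_gd, X, y)   # full-batch step
        i = rng.integers(0, n, size=10)       # random mini-batch of 10
        w_sgd = w_sgd - lr * grad(w_sgd, X[i], y[i])

    print(np.linalg.norm(w_gd - w_star), np.linalg.norm(w_sgd - w_star))

Both iterates approach \(w^\star\); the SGD path is noisier, which is exactly the kind of behavior this branch studies.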

6.2 Learning Theory Branch

Use this branch if you care about:

  • generalization guarantees (see the bound sketch after this list)
  • bias-variance structure
  • stability and capacity ideas
  • VC / Rademacher style reasoning
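
As referenced in the first item, the sketch below evaluates the classical uniform-convergence bound for a finite hypothesis class under 0-1 loss, which follows from Hoeffding's inequality plus a union bound; the class size and confidence level are arbitrary:

    import numpy as np

    # With probability at least 1 - delta, every h in a finite class H obeys
    #   |R(h) - R_hat(h)| <= sqrt(ln(2|H|/delta) / (2n)).
    def finite_class_bound(class_size, n, delta=0.05):
        return np.sqrt(np.log(2 * class_size / delta) / (2 * n))

    for n in [100, 1000, 10000]:
        print(n, finite_class_bound(class_size=10**6, n=n))

Note how weakly the bound depends on the class size (logarithmically) and how it shrinks like \(1/\sqrt{n}\); VC and Rademacher arguments replace \(\ln|H|\) with capacity measures that also work for infinite classes.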

6.3 High-Dimensional Statistics Branch

Use this branch if you care about:

  • many-features regimes
  • overparameterization
  • sparsity and shrinkage (see the sketch after this list)
  • inference and estimation when \(p\) is large relative to \(n\)
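
As referenced above, the sketch below shows shrinkage in the simplest many-features regime: sparse mean estimation with one noisy observation per coordinate. The dimension, sparsity level, and signal size are arbitrary:

    import numpy as np

    rng = np.random.default_rng(0)

    def soft_threshold(z, lam):
        # The lasso/shrinkage building block: move toward 0, clip at 0.
        return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

    # theta has k large entries among p coordinates, observed in noise.
    p, k = 1000, 10
    theta = np.zeros(p)
    theta[:k] = 5.0
    y = theta + rng.normal(size=p)

    # The universal threshold sqrt(2 log p) kills most pure-noise coordinates.
    lam = np.sqrt(2 * np.log(p))
    theta_hat = soft_threshold(y, lam)
    print(np.count_nonzero(theta_hat), "nonzero coordinates out of", p)
    print(np.sum((theta_hat - theta) ** 2),  # thresholded error
          np.sum((y - theta) ** 2))          # raw-observation error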

6.4 Deep Learning Theory Branch

Use this branch if you care about:

  • backpropagation and computation graphs (see the sketch after this list)
  • implicit bias of optimization
  • representation learning
  • scaling and overparameterized training
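
As referenced in the first item, here is backpropagation written out by hand on a two-parameter computation graph, checked against a finite difference; all parameter values are arbitrary:

    import numpy as np

    # Graph: y = w2 * tanh(w1 * x), squared loss against a target t.
    def forward_backward(w1, w2, x, t):
        a = w1 * x              # pre-activation
        h = np.tanh(a)          # hidden unit
        y = w2 * h              # output
        loss = 0.5 * (y - t) ** 2

        # Backward pass: one chain-rule step per edge of the graph.
        dy = y - t              # dloss/dy
        dw2 = dy * h            # dloss/dw2
        dh = dy * w2            # dloss/dh
        da = dh * (1 - h ** 2)  # tanh'(a) = 1 - tanh(a)^2
        dw1 = da * x            # dloss/dw1
        return loss, dw1, dw2

    loss, dw1, dw2 = forward_backward(0.5, -1.2, 2.0, 1.0)

    # Sanity check against a finite-difference approximation.
    eps = 1e-6
    numeric = (forward_backward(0.5 + eps, -1.2, 2.0, 1.0)[0] - loss) / eps
    print(dw1, numeric)  # should agree to several digits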

Useful bridge pages for this branch are:

6.5 Graph and Structured Learning Branch

Use this branch if you care about:

  • graph diffusion and neighborhood averaging (see the sketch after this list)
  • message-passing neural networks
  • molecular, relational, or recommendation graphs
  • spectral versus spatial graph design choices
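
As referenced in the first item, the sketch below runs neighborhood averaging, the basic propagation rule underneath many message-passing layers, on a small hand-built graph; the edge list and feature vector are arbitrary:

    import numpy as np

    # A tiny undirected graph on 5 nodes.
    edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0), (1, 3)]
    n = 5
    A = np.zeros((n, n))
    for i, j in edges:
        A[i, j] = A[j, i] = 1.0

    # Row-normalized adjacency with self-loops: one step of
    # neighborhood averaging per multiplication.
    A_hat = A + np.eye(n)
    P = A_hat / A_hat.sum(axis=1, keepdims=True)

    x = np.array([1.0, 0.0, 0.0, 0.0, 0.0])  # feature mass on node 0
    for step in range(3):
        x = P @ x                    # diffuse one hop
        print(step + 1, np.round(x, 3))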

Useful bridge pages for this branch are:

6.6 Kernel and Bayesian Nonparametrics Branch

Use this branch if you care about:

  • similarity-based nonlinear prediction
  • kernel ridge regression (see the sketch after this list)
  • Gaussian-process regression
  • uncertainty-aware function prediction
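
As referenced above, here is kernel ridge regression with an RBF kernel on synthetic one-dimensional data; the target function, kernel width, and ridge parameter are arbitrary:

    import numpy as np

    rng = np.random.default_rng(0)

    def rbf_kernel(xs, zs, gamma=10.0):
        # k(x, z) = exp(-gamma * (x - z)^2) for 1-d inputs.
        return np.exp(-gamma * (xs[:, None] - zs[None, :]) ** 2)

    # Noisy samples of a smooth target on [0, 1].
    x = np.sort(rng.uniform(0, 1, 40))
    y = np.sin(2 * np.pi * x) + 0.1 * rng.normal(size=40)

    # Fit: alpha = (K + lam I)^{-1} y; predict: f(z) = sum_i alpha_i k(z, x_i).
    lam = 1e-2
    K = rbf_kernel(x, x)
    alpha = np.linalg.solve(K + lam * np.eye(len(x)), y)

    z = np.linspace(0, 1, 5)
    print(np.round(rbf_kernel(z, x) @ alpha, 2))   # predictions
    print(np.round(np.sin(2 * np.pi * z), 2))      # noiseless truth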

Useful bridge pages for this branch are:

6.7 Generative Modeling Branch

Use this branch if you care about:

  • denoising as a learning objective (see the sketch after this list)
  • iterative sample generation
  • diffusion probabilistic models
  • score-based or stochastic-process views of generation
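
As referenced in the first item, the sketch below simulates the forward (noising) process \(x_t = \sqrt{\bar\alpha_t}\, x_0 + \sqrt{1-\bar\alpha_t}\,\varepsilon\) that denoising objectives are built on; the data distribution and noise levels are arbitrary:

    import numpy as np

    rng = np.random.default_rng(0)

    # "Data": a shifted 1-d Gaussian standing in for a real dataset.
    x0 = rng.normal(loc=3.0, scale=0.5, size=10000)

    # abar near 1: barely corrupted; abar near 0: almost pure noise.
    for abar in [0.99, 0.5, 0.01]:
        eps = rng.normal(size=x0.shape)
        xt = np.sqrt(abar) * x0 + np.sqrt(1 - abar) * eps
        # A denoiser is trained to recover eps (or x0) from (x_t, t).
        print(abar, round(xt.mean(), 2), round(xt.std(), 2))

As \(\bar\alpha_t \to 0\) the marginal of \(x_t\) approaches a standard normal, which is what lets generation start from pure noise and run the process in reverse.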

Useful bridge pages for this branch are:

7 Pages On The Site That Already Support This Roadmap

7.1 Strongest Current Math Support

7.2 Strongest Current ML Bridge Pages

8 Paper Reading Overlay

Do not wait until the entire roadmap is complete before touching papers.

Run a light paper-reading overlay in parallel:

  1. How to Read a Paper
  2. one ML-facing application page
  3. one matching paper-lab page

A good current sequence is:

  1. Vector Mixtures in Embeddings and Attention
  2. Paper Lab: Attention as Weighted Vector Mixture

9 Common Next Theory Directions

With the current foundations in place, plus the calculus bridge, the optimization path, the learning-theory spine, Matrix Analysis, High-Dimensional Probability, High-Dimensional Statistics, Numerical Methods, and Control and Dynamics, a strong adjacent theory module now live is:

10 Sources and Further Reading

  • CS229: Machine Learning - First pass - official current ML course hub with a broad math-aware overview of the field. Checked 2026-04-24.
  • CS 189 Syllabus - First pass - official Berkeley syllabus showing a modern intro-ML course with clear prerequisite expectations. Checked 2026-04-24.
  • EE364a: Convex Optimization I - Second pass - official optimization course that naturally supports the next major branch in the roadmap. Checked 2026-04-24.
  • Mathematics for Machine Learning - Second pass - strong math bridge for readers moving from foundations toward ML language. Checked 2026-04-24.
  • CS229T / Statistical Learning Theory - Paper bridge - a compact entry into theory-heavy ML beyond introductory courses. Checked 2026-04-24.