
Machine Learning Theory

This section covers the mathematical foundations of modern machine learning algorithms.

State-of-the-Art (SOTA) Roadmap

1. Modern Optimization Theory

  • Loss Landscapes: Mode Connectivity, Sharpness-Aware Minimization (SAM).
  • Convergence: Grokking (Delayed Generalization), Edge of Stability.
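The SAM idea above can be sketched in a few lines: perturb the weights a distance rho along the normalized gradient (the locally worst-case direction), then apply the gradient taken at that perturbed point to the original weights. A minimal NumPy sketch on a toy quadratic loss (the loss, learning rate, and rho are illustrative choices, not values from the SAM paper):

```python
import numpy as np

def sam_step(w, grad_fn, lr=0.1, rho=0.05):
    """One Sharpness-Aware Minimization (SAM) step.

    1. Ascent: move rho along the normalized gradient to the
       locally worst-case point w + eps.
    2. Descent: apply the gradient evaluated at w + eps to the
       original weights w.
    """
    g = grad_fn(w)
    eps = rho * g / (np.linalg.norm(g) + 1e-12)  # worst-case perturbation
    g_sharp = grad_fn(w + eps)                   # gradient at perturbed point
    return w - lr * g_sharp

# Toy quadratic loss L(w) = 0.5 * ||w||^2, whose gradient is w itself.
grad_fn = lambda w: w
w = np.array([1.0, -2.0])
for _ in range(50):
    w = sam_step(w, grad_fn)
print(w)  # settles in a small neighborhood of the minimum at the origin
```

Note the design choice: because the perturbation radius rho is fixed, SAM does not converge exactly to the minimizer but to a small neighborhood of it, which is visible even in this toy example.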

2. Deep Learning Foundations

  • Normalization: LayerNorm vs RMSNorm effects on signal propagation.
  • Scaling Laws: Kaplan power-law scaling, Chinchilla compute-optimal scaling.
  • Neural Tangent Kernel (NTK): training dynamics in the infinite-width limit.
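The LayerNorm/RMSNorm contrast is easy to state in code: LayerNorm subtracts the mean and divides by the standard deviation, while RMSNorm skips the mean-centering and divides by the root mean square alone, so the mean component of the signal propagates through. A minimal NumPy sketch (learnable gain and bias parameters omitted for clarity):

```python
import numpy as np

def layer_norm(x, eps=1e-6):
    # Center by the mean, then scale by the standard deviation.
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def rms_norm(x, eps=1e-6):
    # No mean subtraction: scale by the root mean square only,
    # so the mean direction of the input is preserved.
    rms = np.sqrt((x ** 2).mean(axis=-1, keepdims=True) + eps)
    return x / rms

x = np.array([[1.0, 2.0, 3.0, 4.0]])
print(layer_norm(x).mean())  # ~0: LayerNorm output is zero-mean
print(rms_norm(x).mean())    # nonzero: RMSNorm keeps the mean component
```

This difference is exactly what matters for signal propagation: under RMSNorm the per-token mean survives each layer, whereas LayerNorm re-centers it at every application.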

3. Generalization

  • Double Descent: Bias-Variance Trade-off revisited.
  • Benign Overfitting: why interpolating models can generalize without explicit regularization.

Key Resources