Machine Learning Theory
This section covers the mathematical foundations of modern machine learning algorithms.
State-of-the-Art (SOTA) Roadmap
1. Modern Optimization Theory
- Loss Landscapes: Mode Connectivity, Sharpness-Aware Minimization (SAM).
- Convergence: Grokking (Delayed Generalization), Edge of Stability.
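The SAM item above can be made concrete with a minimal NumPy sketch on a toy quadratic loss: ascend to the worst-case point inside a small ball around the current weights, then apply the gradient taken there. Function names and hyperparameters (`lr`, `rho`) are illustrative, not from the original paper's implementation.

```python
import numpy as np

def loss_and_grad(w, A, b):
    """Toy quadratic loss L(w) = 0.5 * ||A w - b||^2 and its gradient."""
    r = A @ w - b
    return 0.5 * float(r @ r), A.T @ r

def sam_step(w, A, b, lr=0.02, rho=0.05):
    """One Sharpness-Aware Minimization step.

    SAM first ascends to an adversarial point inside an L2 ball of radius
    rho, then applies the gradient taken there to the original w,
    biasing the trajectory toward flatter minima.
    """
    _, g = loss_and_grad(w, A, b)
    eps = rho * g / (np.linalg.norm(g) + 1e-12)  # worst-case perturbation
    _, g_adv = loss_and_grad(w + eps, A, b)      # gradient at perturbed point
    return w - lr * g_adv

rng = np.random.default_rng(0)
A = rng.normal(size=(20, 5))
w_star = rng.normal(size=5)
b = A @ w_star + 0.1 * rng.normal(size=20)       # noisy linear targets
w = np.zeros(5)
for _ in range(500):
    w = sam_step(w, A, b)
```

On this convex toy problem SAM just converges near the least-squares optimum; the flat-minimum bias only matters on non-convex landscapes with minima of varying sharpness.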
2. Deep Learning Foundations
- Normalization: effects of LayerNorm vs. RMSNorm on signal propagation.
- Scaling Laws: Kaplan scaling laws, Chinchilla compute-optimal scaling.
- Neural Tangent Kernel (NTK): the infinite-width limit of neural networks.
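The LayerNorm/RMSNorm contrast above fits in a few lines of NumPy, a minimal sketch of the standard definitions: LayerNorm centers and rescales each feature vector, while RMSNorm drops the mean subtraction (and bias) and rescales by the root-mean-square alone.

```python
import numpy as np

def layer_norm(x, gamma, beta, eps=1e-5):
    """LayerNorm: center and scale each feature vector, then affine-transform."""
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return gamma * (x - mu) / np.sqrt(var + eps) + beta

def rms_norm(x, gamma, eps=1e-5):
    """RMSNorm: rescale by the root-mean-square only -- no centering, no bias."""
    rms = np.sqrt((x ** 2).mean(axis=-1, keepdims=True) + eps)
    return gamma * x / rms

gamma, beta = np.ones(8), np.zeros(8)
x = np.random.default_rng(0).normal(size=(4, 8))
y_ln = layer_norm(x, gamma, beta)
y_rms = rms_norm(x, gamma)
```

The invariances differ: LayerNorm output is unchanged if a constant is added to every feature, while RMSNorm output is unchanged under positive rescaling of the input, which is one lens on their different signal-propagation behavior.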
3. Generalization
- Double Descent: the bias-variance trade-off revisited.
- Benign Overfitting: why overparameterized networks can interpolate noisy data yet still generalize without explicit regularization.
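Both phenomena above can be probed with a small random-features experiment, a minimal sketch under illustrative sizes and names: fit a minimum-norm least-squares solution at under-, critically-, and over-parameterized widths and compare test error.

```python
import numpy as np

rng = np.random.default_rng(0)
n_train, n_test, d = 40, 200, 10
X = rng.normal(size=(n_train, d))
X_test = rng.normal(size=(n_test, d))
w_true = rng.normal(size=d)
y = X @ w_true + 0.5 * rng.normal(size=n_train)   # noisy training labels
y_test = X_test @ w_true                          # clean test targets

def relu_features(Z, W):
    """Random ReLU feature map: phi(z) = max(z @ W, 0)."""
    return np.maximum(Z @ W, 0.0)

test_err = {}
for p in [10, 40, 400]:   # under-, critically-, over-parameterized
    W = rng.normal(size=(d, p)) / np.sqrt(d)
    F = relu_features(X, W)
    F_test = relu_features(X_test, W)
    coef = np.linalg.pinv(F) @ y                  # minimum-norm least-squares fit
    test_err[p] = float(np.mean((F_test @ coef - y_test) ** 2))
```

At p = 400 the model interpolates the noisy labels exactly, yet averaged over seeds its test error is typically lower than near the interpolation threshold p ≈ n_train, which is the second descent and a toy instance of benign overfitting.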
Key Resources
- Book: Deep Learning (Goodfellow et al.).
- Course: Practical Deep Learning for Coders (fast.ai).
- Visuals: Distill.pub (Interactive explanations).