Linear Regression Gradient Descent Benchmark¶
- Title: Linear Regression Gradient Descent Benchmark
- Judge / source: Canonical first-order optimization benchmark
- Original URL: https://cs229.stanford.edu/summer2020/cs229-notes1.pdf
- Secondary topics: Batch gradient descent, Squared loss, Linear regression
- Difficulty: medium
- Subtype: Fixed-step full-batch gradient descent on one-feature linear regression
- Status: solved
- Solution file: linearregressiongd.cpp
Why Practice This¶
This is the cleanest first in-repo anchor for Gradient Descent.
The benchmark is intentionally narrow:
- one feature
- one affine model
- one squared loss
- one batch gradient
- one deterministic update loop
So the whole lesson is exactly the lane itself:
- write the gradient
- update in the correct direction
- choose one stable learning rate
- watch the parameters move toward the optimum
Recognition Cue¶
Reach for this route when:
- the benchmark explicitly wants gradient descent
- the loss is smooth and differentiable
- one tiny regression model is enough to teach the update rule
The strongest smell is:
I want one honest first-order optimization benchmark, not a full ML or convex-optimization curriculum.
That is exactly this note.
Problem-Specific Route¶
This repo's benchmark uses:
- one integer `n`
- one learning rate `alpha`
- one epoch count `epochs`
- `n` pairs `(x_i, y_i)`
The route is:
- initialize `w = 0`, `b = 0`
- repeat `epochs` times:
  - compute `dw` and `db` from the full batch
  - update `w` and `b`
- print the final parameters
This is exactly the batch gradient-descent route in simulator form.
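A minimal sketch of one such epoch in C++, assuming 1/n-averaged gradients; the names and scaling here are illustrative, not taken from linearregressiongd.cpp:

```cpp
#include <vector>

// One full-batch epoch: accumulate dw, db over all points, then step.
// (w, b, dw, db, alpha mirror the route above; 1/n averaging is assumed.)
void epoch_step(const std::vector<double>& x, const std::vector<double>& y,
                double alpha, double& w, double& b) {
    const int n = static_cast<int>(x.size());
    double dw = 0.0, db = 0.0;
    for (int i = 0; i < n; ++i) {
        double err = w * x[i] + b - y[i];  // residual of the current line
        dw += err * x[i];
        db += err;
    }
    w -= alpha * dw / n;  // move against the gradient
    b -= alpha * db / n;
}
```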
Core Idea¶
The first route is not about solving linear regression in the smartest exact way.
It is about:
- the gradient as local direction information
- taking steps against that direction
- understanding the role of the learning rate
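In symbols, the generic fixed-step rule this note drills is

$$\theta_{t+1} = \theta_t - \alpha \, \nabla L(\theta_t)$$

where `alpha` is the learning rate; the concrete `w`, `b` case is spelled out below.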
1. Loss¶
Use:
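A squared loss over the full batch, written here with the 1/(2n) averaging convention (an assumption; the solution file may scale differently):

$$L(w, b) = \frac{1}{2n} \sum_{i=1}^{n} \left( w x_i + b - y_i \right)^2$$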
2. Gradient¶
Compute:
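The full-batch gradients of that loss, under the same convention:

$$dw = \frac{1}{n} \sum_{i=1}^{n} \left( w x_i + b - y_i \right) x_i, \qquad db = \frac{1}{n} \sum_{i=1}^{n} \left( w x_i + b - y_i \right)$$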
3. Update¶
Move against the gradient:
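With fixed learning rate `alpha`, each epoch applies:

$$w \leftarrow w - \alpha \, dw, \qquad b \leftarrow b - \alpha \, db$$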
Worked Example¶
For points:
(0,1), (1,3), (2,5), (3,7)
the ideal line is:
y = 2x + 1
Starting from zero, a moderate alpha and enough epochs move:
- `w` toward `2`
- `b` toward `1`
The benchmark prints those final learned parameters.
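As a concrete sanity check, the very first step from `w = b = 0` with an illustrative `alpha = 0.1` (not necessarily the benchmark's value) works out to:

$$dw = \tfrac{1}{4}\bigl[(-1)(0) + (-3)(1) + (-5)(2) + (-7)(3)\bigr] = -8.5, \qquad db = \tfrac{1}{4}(-1 - 3 - 5 - 7) = -4$$

$$w \leftarrow 0 - 0.1 \cdot (-8.5) = 0.85, \qquad b \leftarrow 0 - 0.1 \cdot (-4) = 0.4$$

Both parameters already move toward 2 and 1.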
Complexity¶
With n data points and T epochs:
- time = `O(Tn)`
- memory = `O(1)` beyond the input arrays
The benchmark is update-rule-first, not asymptotic-performance-first.
Input / Output Contract For This Repo¶
This repo's benchmark uses:
- one line: `n alpha epochs`
- then `n` lines, each with `x y`
The solution prints exactly one line:
w b
with fixed decimal formatting.
This is a repo-native benchmark, not one standard online judge task.
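A self-contained sketch of the full contract, assuming six decimal places for the fixed output formatting (the precision the real solution uses is not shown in this note):

```cpp
#include <iomanip>
#include <iostream>
#include <vector>

int main() {
    int n, epochs;
    double alpha;
    std::cin >> n >> alpha >> epochs;           // first line: n alpha epochs

    std::vector<double> x(n), y(n);
    for (int i = 0; i < n; ++i) std::cin >> x[i] >> y[i];  // n lines: x y

    double w = 0.0, b = 0.0;
    for (int e = 0; e < epochs; ++e) {
        double dw = 0.0, db = 0.0;
        for (int i = 0; i < n; ++i) {
            double err = w * x[i] + b - y[i];
            dw += err * x[i];
            db += err;
        }
        w -= alpha * dw / n;  // fixed-step update against the gradient
        b -= alpha * db / n;
    }

    // one output line: "w b" with fixed decimal formatting
    std::cout << std::fixed << std::setprecision(6) << w << " " << b << "\n";
    return 0;
}
```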
Pitfalls / Judge Notes¶
- This is batch gradient descent, not SGD.
- The update is against the gradient, not with it.
- Large `alpha` can diverge.
- This benchmark is deterministic and fixed-step by design.
- This note does not stand in for momentum, Adam, or full optimizer engineering.
Reusable Pattern¶
- Topic page: Gradient Descent
- Practice ladder: Gradient Descent ladder
- Starter template: gradient-descent-linear-regression-1d.cpp
- Notebook refresher: Gradient Descent hot sheet
- Compare points:
- Machine Learning Algorithms
- Optimization And Duality
Solutions¶
- Code: linearregressiongd.cpp