Machine Learning Algorithms¶
This lane exists as a textbook-breadth advanced lane, not as a claim that competitive programming suddenly needs a full machine-learning toolkit.
The repo's exact first route is intentionally narrow:
- binary classification
- linear separator
- deterministic example order
- mistake-driven online updates
- linearly separable training data
That means this page is not:
- a general ML course
- a statistics page
- a deep-learning page
- a Gradient Descent lane
It is the first clean algorithmic route for:
- the perceptron update rule
- linearly separable data
- online mistake-driven learning
- a lightweight bridge from algorithms into one classical ML algorithm
with Perceptron Classification Benchmark as the flagship benchmark.
This page sits next to:
- Online Algorithms because the perceptron is an online update rule driven by mistakes
- Gradient Descent when the real lesson is smooth loss minimization rather than mistake-driven separable classification
- Randomized Algorithms when the missing idea is probabilistic design rather than deterministic learning updates
- Algorithm Engineering when the question is scaling and systems practice, not the core learning rule
At A Glance¶
- Use when:
- you want the first algorithmic ML benchmark that still feels like an algorithms page
- the data is linearly separable or intentionally benchmarked as such
- the model is one hyperplane classifier with online updates
- the real lesson is the update rule, not statistical evaluation or deployment
- Prerequisites:
- Reasoning
- Online Algorithms as a useful sibling
- Boundary with nearby pages:
- use this page when the exact first route is perceptron on separable data
- use Online Algorithms when the primary lens is competitive ratio against offline OPT
- do not use this lane when the problem is really optimization with known full input, or when you need modern ML pipelines
- Strongest cues:
- binary labels such as -1 / +1
- one linear separator is the model
- the algorithm updates only on mistakes
- the benchmark assumes linearly separable training data
- Strongest anti-cues:
- the data is not separable and no surrogate-loss route is provided
- the real topic is Gradient Descent or convex optimization
- the task needs probabilities, calibration, or real-valued logistic outputs
- the statement is a full ML system problem, not one learning-rule benchmark
- Success after studying this page:
- you can state the perceptron update rule cleanly
- you know the exact role of linear separability in convergence
- you know why this lane is breadth, not CP-core
Problem Model And Notation¶
The exact starter route in this repo uses:
- feature vectors x_i in Z^d
- labels y_i in {-1, +1}
- one affine classifier f(x) = sign(w · x + b)
The exact update rule is:
- if example (x_i, y_i) is classified correctly, do nothing
- if it is misclassified or lies exactly on the boundary, i.e. y_i (w · x_i + b) <= 0, update w ← w + y_i x_i and b ← b + y_i
The repo's exact first route assumes the training data is linearly separable, so one full pass with no mistakes is the stopping condition.
This is deliberate.
Why:
- it keeps the algorithmic invariant honest
- it makes convergence behavior clean to explain
- it avoids pretending the first repo route is full empirical ML
Why This Exists Next To Online Algorithms¶
Online Algorithms teaches:
- future is hidden
- decisions are made online
- quality is measured against an offline optimum
This page teaches a different online-style benchmark:
- examples arrive in sequence
- the model changes only when it makes a mistake
- quality is not a competitive ratio
- quality is whether one separator can eventually classify the separable training set
So the division of labor is:
- online-algorithms page: adversarial future, benchmark against OPT
- machine-learning page: mistake-driven perceptron updates on labeled examples
Core Route In This Repo¶
The exact route is:
- start with w = 0 and b = 0
- scan training examples in deterministic input order
- whenever y_i (w · x_i + b) <= 0, apply one perceptron update
- stop after one full pass with no mistakes
The exact starter contract is intentionally narrow:
- binary labels only
- integer features
- deterministic order
- separable data assumed
- returned model supports only hard -1 / +1 prediction
The exact non-promises matter:
- no logistic regression
- no probability estimates
- no Gradient Descent
- no nonseparable convergence guarantee
- no claim that this is a contest-high-ROI lane
Core Invariants¶
1. The Model Is One Hyperplane¶
The whole exact route is about one linear separator.
If the real structure is nonlinear, kernelized, or probabilistic, this starter is no longer the right benchmark.
2. Updates Happen Only On Mistakes¶
The perceptron is not "nudge weights every step."
It is:
- predict
- check sign
- update only when the sign is wrong or zero
That mistake-driven discipline is the entire route.
3. Linear Separability Is The Honest Boundary¶
The first route assumes there exists some (w*, b*) that separates the two classes.
Without that assumption:
- convergence is no longer guaranteed
- "one clean pass means done" may never happen
So the lane must say this assumption out loud.
Worked Example: Perceptron Classification Benchmark¶
Use Perceptron Classification Benchmark.
Take training points in 2D:
- positive: (2, 1), (1, 2)
- negative: (-1, -2), (-2, -1)
These points are linearly separable, for example by the line x + y = 0, whose normal (1, 1) points toward the positive class.
Starting from zero weights, the perceptron updates until one full pass makes no mistakes.
After training, query points such as:
- (3, 1) should be predicted +1
- (-1, -3) should be predicted -1
The benchmark is intentionally about:
- update rule
- convergence on separable data
- hard-sign prediction
not about one statistically realistic dataset.
Variant Chooser¶
| Situation | Best first move | Why it fits | Where it fails |
|---|---|---|---|
| one linearly separable binary classifier with online mistake-driven updates is the whole lesson | use this lane | the perceptron is exactly the intended first route | weak if the data is not separable or probabilities matter |
| the main lesson is adversarial online benchmarking against hindsight optimum | use Online Algorithms | the benchmark is competitive ratio, not training updates | weak if you really need a classifier |
| the real method is convex loss minimization or Gradient Descent | do not use this first route | the repo does not advertise that as part of this starter | weak if you force perceptron onto a smoother optimization problem |
| the problem is standard contest modeling with full offline input | use the specialized CP family | ML framing is probably artificial | weak if the statement truly is a learning-rule benchmark |
Complexity And Tradeoffs¶
For n examples in dimension d, one pass costs:
O(nd)
If convergence happens after P passes, the total route is:
O(Pnd)
The important tradeoff is not asymptotic novelty.
It is:
- an extremely small implementation surface versus
- a very strong assumption boundary (linearly separable data)
That is why this lane belongs in advanced and breadth, not in CP core.
Main Traps¶
- forgetting the bias term
- updating on every example instead of only mistakes
- assuming convergence without checking separability assumptions
- treating the final weights as unique or canonical
- overclaiming from perceptron to "machine learning" in general
Reopen Path¶
- Topic page: Machine Learning Algorithms
- Practice ladder: Machine Learning Algorithms ladder
- Starter template: perceptron-linear-separable.cpp
- Notebook refresher: Machine Learning hot sheet
- Compare points:
- Online Algorithms
- Randomized Algorithms
- Perceptron Classification Benchmark