Machine Learning Algorithms¶
This lane exists as a textbook-breadth advanced lane, not as a claim that competitive programming suddenly needs a full machine-learning toolkit.
The repo's exact first route is intentionally narrow:
- binary classification
- linear separator
- deterministic example order
- mistake-driven online updates
- linearly separable training data
That means this page is not:
- a general ML course
- a statistics page
- a deep-learning page
- a Gradient Descent lane
It is the first clean algorithmic route for:
- the perceptron update rule
- linearly separable data
- online mistake-driven learning
- a lightweight bridge from algorithms into one classical ML algorithm
with Perceptron Classification Benchmark as the flagship benchmark.
This page sits next to:
- Online Algorithms because the perceptron is an online update rule driven by mistakes
- Gradient Descent when the real lesson is smooth loss minimization rather than mistake-driven separable classification
- Randomized Algorithms when the missing idea is probabilistic design rather than deterministic learning updates
- Algorithm Engineering when the question is scaling and systems practice, not the core learning rule
At A Glance¶
- Use when:
- you want the first algorithmic ML benchmark that still feels like an algorithms page
- the data is linearly separable or intentionally benchmarked as such
- the model is one hyperplane classifier with online updates
- the real lesson is the update rule, not statistical evaluation or deployment
- Prerequisites:
- Reasoning
- Online Algorithms as a useful sibling
- Boundary with nearby pages:
- use this page when the exact first route is perceptron on separable data
- use Online Algorithms when the primary lens is competitive ratio against offline OPT
- do not use this lane when the problem is really optimization with known full input, or when you need modern ML pipelines
- Strongest cues:
- binary labels such as -1 / +1
- one linear separator is the model
- the algorithm updates only on mistakes
- the benchmark assumes linearly separable training data
- Strongest anti-cues:
- the data is not separable and no surrogate-loss route is provided
- the real topic is Gradient Descent or convex optimization
- the task needs probabilities, calibration, or real-valued logistic outputs
- the statement is a full ML system problem, not one learning-rule benchmark
- Success after studying this page:
- you can state the perceptron update rule cleanly
- you know the exact role of linear separability in convergence
- you know why this lane is breadth, not CP-core
Problem Model And Notation¶
The exact starter route in this repo uses:
- feature vectors x_i in Z^d
- labels y_i in {-1, +1}
- one affine classifier f(x) = sign(w · x + b)
The exact update rule is:
- if example (x_i, y_i) is classified correctly, do nothing
- if it is misclassified or lies exactly on the boundary, i.e. y_i (w · x_i + b) <= 0, update w ← w + y_i x_i and b ← b + y_i
The repo's exact first route assumes the training data is linearly separable, so one full pass with no mistakes is the stopping condition.
This is deliberate.
Why:
- it keeps the algorithmic invariant honest
- it makes convergence behavior clean to explain
- it avoids pretending the first repo route is full empirical ML
Why This Exists Next To Online Algorithms¶
Online Algorithms teaches:
- future is hidden
- decisions are made online
- quality is measured against an offline optimum
This page teaches a different online-style benchmark:
- examples arrive in sequence
- the model changes only when it makes a mistake
- quality is not a competitive ratio
- quality is whether one separator can eventually classify the separable training set
So the division of labor is:
- online-algorithms page: adversarial future, benchmark against OPT
- machine-learning page: mistake-driven perceptron updates on labeled examples
Core Route In This Repo¶
The exact route is:
- start with w = 0 and b = 0
- scan training examples in deterministic input order
- whenever y_i (w · x_i + b) <= 0, apply one perceptron update
- stop after one full pass with no mistakes
The exact starter contract is intentionally narrow:
- binary labels only
- integer features
- deterministic order
- separable data assumed
- returned model supports only hard -1 / +1 prediction
The exact non-promises matter:
- no logistic regression
- no probability estimates
- no Gradient Descent
- no nonseparable convergence guarantee
- no claim that this is a contest-high-ROI lane
Core Invariants¶
1. The Model Is One Hyperplane¶
The whole exact route is about one linear separator.
If the real structure is nonlinear, kernelized, or probabilistic, this starter is no longer the right benchmark.
2. Updates Happen Only On Mistakes¶
The perceptron is not "nudge weights every step."
It is:
- predict
- check sign
- update only when the sign is wrong or zero
That mistake-driven discipline is the entire route.
3. Linear Separability Is The Honest Boundary¶
The first route assumes there exists some (w*, b*) that separates the two classes.
Without that assumption:
- convergence is no longer guaranteed
- "one clean pass means done" may never happen
So the lane must say this assumption out loud.
Worked Example: Perceptron Classification Benchmark¶
Use Perceptron Classification Benchmark.
Take training points in 2D:
- positive: (2, 1), (1, 2)
- negative: (-1, -2), (-2, -1)
These points are linearly separable, for example by the line x + y = 0, whose normal (1, 1) points toward the positive class.
Starting from zero weights, the perceptron updates until one full pass makes no mistakes.
After training, query points such as:
- (3, 1) should be predicted +1
- (-1, -3) should be predicted -1
The benchmark is intentionally about:
- update rule
- convergence on separable data
- hard-sign prediction
not about one statistically realistic dataset.
Variant Chooser¶
| Situation | Best first move | Why it fits | Where it fails |
|---|---|---|---|
| one linearly separable binary classifier with online mistake-driven updates is the whole lesson | use this lane | the perceptron is exactly the intended first route | weak if the data is not separable or probabilities matter |
| the main lesson is adversarial online benchmarking against hindsight optimum | use Online Algorithms | the benchmark is competitive ratio, not training updates | weak if you really need a classifier |
| the real method is convex loss minimization or Gradient Descent | do not use this first route | the repo does not advertise that as part of this starter | weak if you force perceptron onto a smoother optimization problem |
| the problem is standard contest modeling with full offline input | use the specialized CP family | ML framing is probably artificial | weak if the statement truly is a learning-rule benchmark |
Complexity And Tradeoffs¶
For n examples in dimension d, one pass costs:
O(nd)
If convergence happens after P passes, the total route is:
O(Pnd)
The important tradeoff is not asymptotic novelty.
It is:
- an extremely small implementation surface versus
- a very strong assumption boundary (linearly separable data)
That is why this lane belongs in advanced and breadth, not in CP core.
Main Traps¶
- forgetting the bias term
- updating on every example instead of only mistakes
- assuming convergence without checking separability assumptions
- treating the final weights as unique or canonical
- overclaiming from perceptron to "machine learning" in general
Reopen Path¶
- Topic page: Machine Learning Algorithms
- Practice ladder: Machine Learning Algorithms ladder
- Starter template: perceptron-linear-separable.cpp
- Notebook refresher: Machine Learning hot sheet
- Compare points:
- Online Algorithms
- Randomized Algorithms
- Perceptron Classification Benchmark