Stress Testing Workflow

This page is the shortest reusable route from “I think the idea is right” to “I have actual evidence that the implementation is trustworthy.”

Trigger: nontrivial fast solution, suspicious edge cases, or low trust after samples
Inputs needed: fast solution plus one of a brute-force checker, a property validator, or a small-case oracle
Output artifact: one reproducible mismatch or many clean small-case passes
Stop condition: the first mismatch is found and explained, or the local confidence loop is clean enough to move on
Pair with: Foundations cheatsheet, Local judge workflow, Template library

Use it when:

your fast solution is nontrivial enough that a few hand tests do not feel convincing
the bug is probably in implementation details, not in the high-level algorithm choice
you can write a slower checker, brute-force solver, or property validator on small cases

If you are still on the first few beginner notes, do not start here. A saved sample plus one deliberate edge case is the right first loop before you invest in generators and checkers.

Which Workflow To Use Right Now

Choose this page only if:

the problem is still an ordinary batch problem
saved samples already pass
trust is low because the fast solution is nontrivial

If the task is interactive or validator-heavy, switch to Local judge workflow instead.

Fast Loop

The default loop is:

write the clearest correct fast solution you can
write a tiny brute-force or slower checker for small limits
generate many small random tests
compare outputs or compare a shared invariant
stop at the first mismatch and do not continue until it is explained

If a problem has many valid answers, do not compare raw output strings. Compare a predicate or a normalized property instead.
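One way to compare a normalized property instead of raw strings is sketched below. It assumes a hypothetical task whose answer is a collection of tokens that may be printed in any order; that assumption does not hold for every problem with multiple valid answers. The names `normalized_tokens` and `same_up_to_order` are illustrative, not part of any existing template.

```cpp
#include <algorithm>
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Splits an output into whitespace-separated tokens and sorts them, so two
// outputs that list the same items in a different order compare equal.
std::vector<std::string> normalized_tokens(const std::string& output) {
    std::istringstream in(output);
    std::vector<std::string> tokens;
    for (std::string t; in >> t;) tokens.push_back(t);
    std::sort(tokens.begin(), tokens.end());
    return tokens;
}

// Assumption: the statement allows any order of the same multiset of tokens.
bool same_up_to_order(const std::string& a, const std::string& b) {
    return normalized_tokens(a) == normalized_tokens(b);
}
```

Whitespace and line breaks are ignored by this normalization, so it should only be used when the statement does not care about them either.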

Compile Baseline

Use the same baseline flags the repo already recommends:

c++ -std=c++20 -O2 -Wall -Wextra -pedantic fast.cpp -o fast
c++ -std=c++20 -O2 -Wall -Wextra -pedantic brute.cpp -o brute
c++ -std=c++20 -O2 -Wall -Wextra -pedantic gen.cpp -o gen

That does two jobs at once:

catches avoidable warnings early
keeps local verification closer to the contest build you actually use

Three Levels Of Confidence

1. Hand Cases

Start with a few hand-written tests before any automation:

smallest valid input
smallest invalid or edge-shaped input
all-equal / already-sorted / reverse-sorted cases
repeated values if the problem allows them
the exact boundary where your branch logic changes

This stage is about catching obvious model mistakes quickly.

2. Brute-Force Comparison

If the constraints allow a tiny checker, compare the fast solution against it.

Typical pattern:

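for seed in $(seq 1 2000); do
  ./gen "$seed" > input.txt            # the seed makes the case reproducible
  ./fast < input.txt > fast.txt
  ./brute < input.txt > brute.txt
  # stop at the first mismatch and report which seed produced it
  diff -u fast.txt brute.txt || { echo "mismatch at seed $seed"; break; }
done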

Use this when:

the answer is unique
the brute-force output format matches the fast one exactly
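The same comparison can also be kept inside one C++ program, which avoids output-format differences altogether. The sketch below assumes a hypothetical maximum-subarray-sum problem that is not taken from this page: `fast_max_subarray` stands in for the fast solution, `brute_max_subarray` for the slow checker, and `small_case` for the generator. All names and limits are illustrative.

```cpp
#include <algorithm>
#include <cassert>
#include <random>
#include <vector>

// Fast solution under test: Kadane's algorithm, O(n), non-empty subarray.
long long fast_max_subarray(const std::vector<int>& a) {
    assert(!a.empty());
    long long best = a[0], cur = a[0];
    for (std::size_t i = 1; i < a.size(); ++i) {
        cur = std::max<long long>(a[i], cur + a[i]);
        best = std::max(best, cur);
    }
    return best;
}

// Brute force: try every non-empty segment, O(n^2).
long long brute_max_subarray(const std::vector<int>& a) {
    assert(!a.empty());
    long long best = a[0];
    for (std::size_t l = 0; l < a.size(); ++l) {
        long long sum = 0;
        for (std::size_t r = l; r < a.size(); ++r) {
            sum += a[r];
            best = std::max(best, sum);
        }
    }
    return best;
}

// Small random case derived from the seed (assumed limits: n in 1..8, values in -5..5).
std::vector<int> small_case(unsigned seed) {
    std::mt19937 rng(seed);
    std::uniform_int_distribution<int> len(1, 8), val(-5, 5);
    std::vector<int> a(len(rng));
    for (int& x : a) x = val(rng);
    return a;
}

// Returns the first seed in [1, seeds] where the two solutions disagree, or -1.
int first_mismatch_seed(int seeds) {
    for (int seed = 1; seed <= seeds; ++seed) {
        std::vector<int> a = small_case(static_cast<unsigned>(seed));
        if (fast_max_subarray(a) != brute_max_subarray(a)) return seed;
    }
    return -1;
}
```

Keeping the limits tiny is deliberate: a failing case with eight elements is far easier to trace by hand than one with a thousand.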

3. Property-Based Checking

For constructive problems or problems with many valid outputs, compare properties instead of exact answers.

The checker should answer:

is the output well-formed?
does it satisfy the statement?
does it hit the required counts or invariants?

Examples:

permutation really uses 1..n exactly once
number of local maxima/minima matches the request
returned path is legal and has the claimed length
printed sequence of moves is valid at every step
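As a minimal sketch of the first two examples, assume a hypothetical task that asks for any permutation of 1..n with a requested number of interior local maxima and minima. The function names and the exact definition of "local extremum" (strict, interior positions only) are assumptions made for illustration; a real checker must follow the statement's definition.

```cpp
#include <cassert>
#include <vector>

// True if p uses each value 1..n exactly once, where n = p.size().
bool is_permutation_1_to_n(const std::vector<int>& p) {
    const int n = static_cast<int>(p.size());
    std::vector<bool> seen(n + 1, false);
    for (int x : p) {
        if (x < 1 || x > n || seen[x]) return false;
        seen[x] = true;
    }
    return true;
}

// Counts interior positions that are strict local maxima or minima.
int count_local_extrema(const std::vector<int>& p) {
    int count = 0;
    for (std::size_t i = 1; i + 1 < p.size(); ++i) {
        bool is_max = p[i] > p[i - 1] && p[i] > p[i + 1];
        bool is_min = p[i] < p[i - 1] && p[i] < p[i + 1];
        if (is_max || is_min) ++count;
    }
    return count;
}

// Accepts any output that is well-formed and hits the requested count.
bool output_is_valid(const std::vector<int>& p, int n, int wanted_extrema) {
    assert(n >= 0 && wanted_extrema >= 0);
    return static_cast<int>(p.size()) == n
        && is_permutation_1_to_n(p)
        && count_local_extrema(p) == wanted_extrema;
}
```

Because the checker never looks at a reference answer, it accepts every valid construction rather than only the one a brute force happens to print.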

Construction Problems

For constructive tasks, the best local verifier is usually a dedicated predicate checker.

The order should be:

write a validator for one output
use the validator on hand cases
only then generate random small inputs

Do not compare against one “reference construction” if the problem accepts many different valid answers.

Instead, validate:

domain correctness
invariants after every operation
final-state correctness
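The three checks can be sketched for a hypothetical task, "sort the array using at most `max_ops` adjacent swaps and print the swap positions". The task, the 0-based index convention, and the name `validate_swaps` are assumptions for illustration only. The validator replays the operations, checks each one before applying it, and checks the final state at the end.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Replays a list of adjacent swaps (0-based index i swaps a[i] and a[i+1]).
// Returns "" if the output is acceptable, otherwise the first violation found.
std::string validate_swaps(std::vector<int> a, const std::vector<int>& ops,
                           std::size_t max_ops) {
    if (ops.size() > max_ops) return "too many operations";
    for (std::size_t step = 0; step < ops.size(); ++step) {
        int i = ops[step];
        // Domain check before applying the operation.
        if (i < 0 || static_cast<std::size_t>(i) + 1 >= a.size())
            return "step " + std::to_string(step) + ": index out of range";
        std::swap(a[i], a[i + 1]);
    }
    // Final-state check.
    if (!std::is_sorted(a.begin(), a.end())) return "final array is not sorted";
    return "";
}
```

Returning the first violation as text, rather than a bare boolean, gives the mismatch report something concrete to print.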

Generator Habits

Good random generators:

stay inside the real constraints of the statement
hit edge regimes on purpose instead of relying only on uniform random noise
accept a seed so failures are reproducible

Once a mismatch appears, freeze that case into a permanent hand test or note snippet.
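A seeded generator with deliberate edge regimes might look like the sketch below. The input format (n on the first line, n values on the second), the limits, and the name `generate_case` are assumptions; they should be replaced by the real constraints of the statement.

```cpp
#include <cassert>
#include <random>
#include <sstream>
#include <string>

// Builds one small test as text: first line n, second line n values.
// The same seed yields the same test when run with the same compiled binary.
std::string generate_case(unsigned seed) {
    std::mt19937 rng(seed);
    const int max_n = 8, max_value = 10;  // assumed small limits, max_n <= max_value
    int n = std::uniform_int_distribution<int>(1, max_n)(rng);
    int regime = std::uniform_int_distribution<int>(0, 3)(rng);
    std::uniform_int_distribution<int> value(1, max_value);

    std::ostringstream out;
    out << n << "\n";
    int fixed = value(rng);
    for (int i = 0; i < n; ++i) {
        int x;
        switch (regime) {
            case 0: x = fixed; break;        // all equal
            case 1: x = i + 1; break;        // already sorted
            case 2: x = n - i; break;        // reverse sorted
            default: x = value(rng); break;  // uniform noise
        }
        out << x << (i + 1 < n ? ' ' : '\n');
    }
    return out.str();
}
```

A `main` that reads the seed from `argv[1]` and prints `generate_case(seed)` would turn this into the `gen` binary used in the shell loop. Note that the C++ standard does not fix the exact output of `std::uniform_int_distribution` across standard library implementations, so a frozen failing case is best saved as the generated input itself, not only as a seed.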

What To Print On Mismatch

When the first failure appears, print enough context to reproduce it instantly:

seed
generated input
fast output
checker output or brute output
one short explanation of the first violated invariant, if known

Do not keep testing past the first unexplained mismatch.
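The list above can be bundled into one helper so that every failure is reported the same way. This is a minimal sketch; the function name `mismatch_report` and the section labels are illustrative choices, not an established format.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Formats everything needed to replay one failing case.
// `note` is optional: the first violated invariant, if it is already known.
std::string mismatch_report(unsigned seed, const std::string& input,
                            const std::string& fast_out,
                            const std::string& expected_out,
                            const std::string& note = "") {
    std::ostringstream out;
    out << "seed: " << seed << "\n"
        << "--- input ---\n" << input << "\n"
        << "--- fast output ---\n" << fast_out << "\n"
        << "--- brute/checker output ---\n" << expected_out << "\n";
    if (!note.empty()) out << "first violated invariant: " << note << "\n";
    return out.str();
}
```

Writing the report to a file as well as the terminal makes it easy to turn the case into a permanent hand test later.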

Interactive And Judge-Like Tasks

When the task is interactive or protocol-heavy, local checking changes a bit:

verify flushing discipline
verify query count or operation count
simulate the judge with a small deterministic harness if possible
log the transcript of one failing run

For those tasks, local confidence often comes more from protocol simulation than from random brute force.
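A small deterministic harness can be sketched in-process for a hypothetical "guess the hidden number" task, which is an assumption and not a problem from this page. `FakeJudge` plays the judge, counts queries, and records a transcript; `solve` is the solution under test. This sketch does not exercise flushing, which only shows up when the solution and judge run as separate processes connected by pipes.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Deterministic stand-in for the judge: hidden number in [1, n].
struct FakeJudge {
    int secret;
    int queries = 0;
    std::vector<std::string> transcript;  // one line per query/answer pair

    explicit FakeJudge(int s) : secret(s) {}

    // Returns -1 if the secret is smaller than x, 1 if larger, 0 if equal.
    int ask(int x) {
        ++queries;
        int answer = (secret < x) ? -1 : (secret > x) ? 1 : 0;
        transcript.push_back("? " + std::to_string(x) + " -> " + std::to_string(answer));
        return answer;
    }
};

// Solution under test: binary search over [1, n].
int solve(FakeJudge& judge, int n) {
    int lo = 1, hi = n;
    while (lo < hi) {
        int mid = lo + (hi - lo) / 2;
        int r = judge.ask(mid);
        if (r == 0) return mid;
        if (r < 0) hi = mid - 1; else lo = mid + 1;
    }
    return lo;
}

// Runs every possible secret and checks both the answer and the query budget.
bool protocol_ok(int n, int max_queries) {
    for (int secret = 1; secret <= n; ++secret) {
        FakeJudge judge(secret);
        if (solve(judge, n) != secret || judge.queries > max_queries) return false;
    }
    return true;
}
```

Because the secret space is small, the harness enumerates every secret instead of sampling; on a failure, `judge.transcript` is the log to save.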

When To Stop

You are usually ready to submit when:

warnings are clean
hand cases pass
small brute-force comparison or property checking passes for many seeds
the boundary cases around each branch condition have been exercised deliberately

The goal is not “prove the program correct by testing.” The goal is to make it very hard for hidden implementation mistakes to survive.