Stress Testing Workflow¶
This page is the shortest reusable route from “I think the idea is right” to “I have actual evidence that the implementation is trustworthy.”
- Trigger: nontrivial fast solution, suspicious edge cases, or low trust after samples
- Inputs needed: fast solution plus either a brute-force checker, property validator, or small-case oracle
- Output artifact: one reproducible mismatch or many clean small-case passes
- Stop condition: the first mismatch is understood, or the local confidence loop is clean enough to move on
- Pair with: Foundations cheatsheet, Local judge workflow, Template library
Use it when:
- your fast solution is nontrivial enough that a few hand tests do not feel convincing
- the bug is probably in implementation details, not in the high-level algorithm choice
- you can write a slower checker, brute-force solver, or property validator on small cases
If you are still on the first few beginner notes, do not start here. A saved sample plus one deliberate edge case is the right first loop before you invest in generators and checkers.
Which Workflow To Use Right Now¶
Choose this page only if:
- the problem is still an ordinary batch problem
- saved samples already pass
- trust is low because the fast solution is nontrivial
If the task is interactive or validator-heavy, switch to Local judge workflow instead.
Fast Loop¶
The default loop is:
- write the clearest correct fast solution you can
- write a tiny brute-force or slower checker for small limits
- generate many small random tests
- compare outputs or compare a shared invariant
- on the first mismatch, stop generating new tests and explain it before moving on
If a problem has many valid answers, do not compare raw output strings. Compare a predicate or a normalized property instead.
Compile Baseline¶
Use the same baseline flags the repo already recommends:
c++ -std=c++20 -O2 -Wall -Wextra -pedantic fast.cpp -o fast
c++ -std=c++20 -O2 -Wall -Wextra -pedantic brute.cpp -o brute
c++ -std=c++20 -O2 -Wall -Wextra -pedantic gen.cpp -o gen
That does two jobs at once:
- catches avoidable warnings early
- keeps local verification closer to the contest build you actually use
Three Levels Of Confidence¶
1. Hand Cases¶
Start with a few hand-written tests before any automation:
- smallest valid input
- smallest invalid or edge-shaped input
- all-equal / already-sorted / reverse-sorted cases
- repeated values if the problem allows them
- the exact boundary where your branch logic changes
This stage is about catching obvious model mistakes quickly.
2. Brute-Force Comparison¶
If the constraints allow a tiny checker, compare the fast solution against it.
Typical pattern:
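for seed in $(seq 1 2000); do
  ./gen "$seed" > input.txt
  ./fast < input.txt > fast.txt
  ./brute < input.txt > brute.txt
  # On the first mismatch, report the seed so the case can be replayed, then stop.
  diff -u fast.txt brute.txt || { echo "mismatch at seed $seed"; break; }
done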
Use this when:
- the answer is unique
- the brute-force output format matches the fast one exactly
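The same comparison can also run inside one process, which avoids file I/O when both solutions fit in a single translation unit. The sketch below is only an illustration: the example problem (maximum sum of a non-empty contiguous subarray), the function names, and the size and value ranges are all assumptions, not part of this workflow.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <random>
#include <vector>

// Hypothetical example problem: maximum sum of a non-empty contiguous subarray.

// Fast solution under test: Kadane's algorithm, O(n).
long long fast_max_subarray(const std::vector<int>& a) {
    long long best = a[0], cur = a[0];
    for (std::size_t i = 1; i < a.size(); ++i) {
        cur = std::max<long long>(a[i], cur + a[i]);
        best = std::max(best, cur);
    }
    return best;
}

// Brute force: try every (l, r) pair, O(n^2). Only viable for small n.
long long brute_max_subarray(const std::vector<int>& a) {
    long long best = a[0];
    for (std::size_t l = 0; l < a.size(); ++l) {
        long long sum = 0;
        for (std::size_t r = l; r < a.size(); ++r) {
            sum += a[r];
            best = std::max(best, sum);
        }
    }
    return best;
}

// Returns the first seed in [1, seeds] where the two disagree, or nullopt if all agree.
std::optional<std::uint64_t> first_mismatch(std::uint64_t seeds) {
    for (std::uint64_t seed = 1; seed <= seeds; ++seed) {
        std::mt19937_64 rng(seed);
        const int n = std::uniform_int_distribution<int>(1, 8)(rng);  // small sizes on purpose
        std::vector<int> a(n);
        for (int& x : a) x = std::uniform_int_distribution<int>(-5, 5)(rng);
        if (fast_max_subarray(a) != brute_max_subarray(a)) return seed;
    }
    return std::nullopt;
}
```

Small sizes and a narrow value range make collisions, ties, and all-negative arrays likely, which is where implementation slips tend to hide.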
3. Property-Based Checking¶
For constructive problems or problems with many valid outputs, compare properties instead of exact answers.
The checker should answer:
- is the output well-formed?
- does it satisfy the statement?
- does it hit the required counts or invariants?
Examples:
- permutation really uses 1..n exactly once
- number of local maxima/minima matches the request
- returned path is legal and has the claimed length
- printed sequence of moves is valid at every step
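As a minimal sketch of the first two examples, assume a hypothetical task that asks for a permutation of 1..n with exactly k local extrema. The function names and the task itself are illustrative assumptions. The checker answers the three questions above without ever comparing against a reference answer.

```cpp
#include <cstddef>
#include <vector>

// True iff p contains each value 1..n exactly once, where n = p.size().
bool is_permutation_1_to_n(const std::vector<int>& p) {
    const int n = static_cast<int>(p.size());
    std::vector<bool> seen(n + 1, false);
    for (int v : p) {
        if (v < 1 || v > n || seen[v]) return false;
        seen[v] = true;
    }
    return true;
}

// Counts interior positions strictly greater or strictly smaller than both neighbours.
int count_local_extrema(const std::vector<int>& p) {
    int count = 0;
    for (std::size_t i = 1; i + 1 < p.size(); ++i) {
        const bool is_max = p[i] > p[i - 1] && p[i] > p[i + 1];
        const bool is_min = p[i] < p[i - 1] && p[i] < p[i + 1];
        if (is_max || is_min) ++count;
    }
    return count;
}

// Well-formed (right length), satisfies the statement (permutation), hits the count (k).
bool valid_answer(const std::vector<int>& p, int n, int k) {
    return static_cast<int>(p.size()) == n
        && is_permutation_1_to_n(p)
        && count_local_extrema(p) == k;
}
```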
Construction Problems¶
For constructive tasks, the best local verifier is usually a dedicated predicate checker.
The order should be:
- write a validator for one output
- use the validator on hand cases
- only then generate random small inputs
Do not compare against one “reference construction” if the problem accepts many different valid answers.
Instead, validate:
- domain correctness
- invariants after every operation
- final-state correctness
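One way to organise those three checks is a validator that replays the printed operations and returns the first reason for rejection. The task below is invented for illustration: each operation picks an index i and decrements a[i] and a[i+1], values must never go negative, and the array must end at all zeros. The name `validate_ops` and the message strings are likewise assumptions.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Returns an empty string if the answer is valid, otherwise a short reason.
// `a` is taken by value so the caller's input stays untouched while we replay.
std::string validate_ops(std::vector<int> a, const std::vector<int>& ops) {
    const int n = static_cast<int>(a.size());
    for (std::size_t step = 0; step < ops.size(); ++step) {
        const int i = ops[step];
        // Domain correctness: the index must name a real adjacent pair (0-based).
        if (i < 0 || i + 1 >= n) {
            return "step " + std::to_string(step) + ": index out of range";
        }
        --a[i];
        --a[i + 1];
        // Invariant after every operation: no value may go negative.
        if (a[i] < 0 || a[i + 1] < 0) {
            return "step " + std::to_string(step) + ": negative value";
        }
    }
    // Final-state correctness: everything must be zero.
    if (std::any_of(a.begin(), a.end(), [](int v) { return v != 0; })) {
        return "final array is not all zero";
    }
    return "";
}
```

Returning a reason instead of a bare boolean means the first violated invariant is already in hand when a random test fails.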
Generator Habits¶
Good random generators:
- stay inside the real constraints of the statement
- hit edge regimes on purpose, not just uniform random noise
- accept a seed so failures are reproducible
Once a mismatch appears, freeze that case into a permanent hand test or note snippet.
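A seeded generator with deliberate edge regimes might look like the sketch below. The input format (n on the first line, then n integers in [1, max_value]), the default limits, and the four regimes are assumptions chosen for illustration. A real `gen.cpp` would wrap this in a `main` that parses the seed from its first argument and prints the result.

```cpp
#include <algorithm>
#include <cstdint>
#include <random>
#include <sstream>
#include <string>

// Hypothetical format: first line n, second line n integers in [1, max_value].
// The same seed always produces the same text on a given toolchain.
std::string generate_case(std::uint64_t seed, int max_n = 8, int max_value = 10) {
    std::mt19937_64 rng(seed);
    auto rand_int = [&](int lo, int hi) {
        return std::uniform_int_distribution<int>(lo, hi)(rng);
    };

    const int n = rand_int(1, max_n);
    const int regime = rand_int(0, 3);  // pick one edge regime per test
    std::ostringstream out;
    out << n << '\n';
    for (int i = 0; i < n; ++i) {
        int v = 1;                                             // regime 0: all equal
        if (regime == 1) v = std::min(max_value, i + 1);       // already sorted
        else if (regime == 2) v = rand_int(1, 2);              // many repeated values
        else if (regime == 3) v = rand_int(1, max_value);      // uniform noise
        out << v << (i + 1 < n ? ' ' : '\n');
    }
    return out.str();
}
```

Note that `std::uniform_int_distribution` is not guaranteed to produce the same sequence across standard library implementations, so a seed is only reproducible on the same toolchain. Saving the generated input alongside the seed avoids that problem.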
What To Print On Mismatch¶
When the first failure appears, print enough context to reproduce it instantly:
- seed
- generated input
- fast output
- checker output or brute output
- one short explanation of the first violated invariant, if known
Do not keep testing past the first unexplained mismatch.
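The list above can be bundled into one report string, so a failing run leaves a single artifact to paste into a note. This is a sketch only: the function name `mismatch_report` and the section labels are made up for illustration.

```cpp
#include <cstdint>
#include <sstream>
#include <string>

// Makes sure a section ends with a newline so the next header starts on its own line.
std::string ensure_newline(std::string s) {
    if (s.empty() || s.back() != '\n') s += '\n';
    return s;
}

// Builds a self-contained failure report: seed, input, both outputs, optional note.
std::string mismatch_report(std::uint64_t seed, const std::string& input,
                            const std::string& fast_out, const std::string& expected_out,
                            const std::string& note = "") {
    std::ostringstream out;
    out << "seed: " << seed << '\n'
        << "--- input ---\n" << ensure_newline(input)
        << "--- fast ---\n" << ensure_newline(fast_out)
        << "--- expected ---\n" << ensure_newline(expected_out);
    if (!note.empty()) {
        out << "--- first violated invariant ---\n" << ensure_newline(note);
    }
    return out.str();
}
```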
Interactive And Judge-Like Tasks¶
When the task is interactive or protocol-heavy, local checking changes a bit:
- verify flushing discipline
- verify query count or operation count
- simulate the judge with a small deterministic harness if possible
- log the transcript of one failing run
For those tasks, local confidence often comes more from protocol simulation than from random brute force.
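A small deterministic harness can be sketched in-process as below. The task is hypothetical (find a hidden number in [1, n] by asking "is hidden <= x?"), and `FakeJudge`, `ask`, and `solve` are illustrative names. The harness enforces the query budget and records a transcript. It does not exercise flushing, which still has to be checked on the real program that talks over stdin and stdout.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Deterministic stand-in for the judge of a hypothetical "guess the hidden number" task.
class FakeJudge {
public:
    FakeJudge(int hidden, int max_queries) : hidden_(hidden), max_queries_(max_queries) {}

    // Answers "is hidden <= x?", enforces the query budget, and logs the exchange.
    bool ask(int x) {
        if (++queries_ > max_queries_) throw std::runtime_error("query limit exceeded");
        const bool answer = hidden_ <= x;
        transcript_.push_back("? " + std::to_string(x) + " -> " + (answer ? "yes" : "no"));
        return answer;
    }

    int queries() const { return queries_; }
    const std::vector<std::string>& transcript() const { return transcript_; }

private:
    int hidden_;
    int max_queries_;
    int queries_ = 0;
    std::vector<std::string> transcript_;
};

// Solution under test: binary search over [1, n].
int solve(FakeJudge& judge, int n) {
    int lo = 1, hi = n;
    while (lo < hi) {
        const int mid = lo + (hi - lo) / 2;
        if (judge.ask(mid)) hi = mid;
        else lo = mid + 1;
    }
    return lo;
}
```

Because the hidden value is a constructor argument, every value in a small range can be tried exhaustively, and the transcript of any failing run can be printed as-is.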
When To Stop¶
You are usually ready to submit when:
- warnings are clean
- hand cases pass
- small brute-force comparison or property checking passes for many seeds
- the boundary cases around each branch condition have been exercised deliberately
The goal is not “prove the program correct by testing.” The goal is to make hidden implementation mistakes very hard to survive.