Parallel Algorithms Hot Sheet¶
Use this page when the repo's exact parallel route is:
- one associative scan
- one
work / spanbenchmark - one Blelloch tree
- one sequential simulator that preserves the parallel structure
Do Not Use When¶
- the task is just an ordinary sequential prefix sum
- the real topic is SIMD / bitset packing
- the real topic is OpenMP, CUDA, locks, atomics, or runtime scheduling
- the problem needs a full parallel graph / sorting / dynamic-data-structure lane
Exact Route In This Repo¶
- read one array
- pad to one power of two with identities
- up-sweep partial sums
- set root to identity
- down-sweep prefixes
- output the first
nleaves as the exclusive scan
Recognition Cues¶
parallel prefix sumscanwork-efficientBrent's theoremup-sweep / down-sweep
Core Invariants¶
- the operator must be associative
- the exact first route is exclusive scan
work O(n)andspan O(log n)are the real benchmark values- the repo snippet is a structure-preserving simulator, not a multicore performance claim
Main Traps¶
- confusing exclusive scan with inclusive scan
- skipping padding and then hand-waving the tree shape
- treating the route as hardware programming instead of algorithmic breadth
- claiming real speedup from a sequential demo
Exact Starter In This Repo¶
- starter ->
blelloch-exclusive-scan.cpp - flagship rep -> Parallel Prefix Sum Benchmark
- classical sibling -> Prefix Sums
Reopen Paths¶
- full tutorial -> Parallel Algorithms
- parent router -> Advanced overview
- retrieve layer -> Build Kit and Template Library