Prefix Sums¶
Prefix sums are the first serious example of this contest pattern:
- preprocess once
- answer many static queries cheaply
The trick is not the formula itself.
The real trick is choosing a prefix definition so the target quantity becomes:
- one prefix
- or the difference of two prefixes
- or, in 2D, an inclusion-exclusion combination of four prefixes
Once you see a query as "everything up to here minus everything before there," the implementation is usually one line.
At A Glance¶
- Use when: the data is static and the query can be written as a difference of cumulative aggregates
- Core worldview: range questions become easy after you store all prefixes once
- Main tools: 1D prefix sums, frequency/count prefixes, modulo/count prefixes, 2D prefix sums
- Typical complexity:
O(n)preprocessing andO(1)per static 1D query - Main trap: inconsistent indexing or using prefix sums in a problem that actually needs online updates
Prerequisites¶
Helpful neighboring topics:
Problem Model And Notation¶
Let the array be:
The safest contest convention is:
so:
prefhas lengthn + 1pref[0] = 0pref[i]stores the sum of the firstielements
Then the sum on the half-open interval [l, r) is:
If the statement uses one-based inclusive intervals [l, r], you can still use the same idea after converting indices once.
From Brute Force To The Right Idea¶
Brute Force: Recompute Every Query¶
Suppose queries ask:
- sum on
[l, r] - number of zeros on
[l, r] - number of values satisfying a predicate in
[l, r]
The brute-force answer is:
- scan the interval every time
That costs:
per query, which is too slow when there are many queries.
The First Shift: Store Everything Up To Here¶
Instead of answering a range query directly, answer a simpler question once for every endpoint:
- what is the aggregate on the prefix ending here?
Now every future query can reuse that work.
The Second Shift: A Range Is A Difference Of Prefixes¶
For sums, this is the core identity:
So the hard work is not the subtraction. The hard work is defining a prefix whose meaning is stable and easy to trust.
The Third Shift: "Prefix Sum" Means "Any Additive Prefix Aggregate"¶
You are not limited to numeric sums.
The same pattern works for:
- counts
- frequencies
- weighted sums
- parity counts
- indicator arrays
- 2D rectangle sums
So the general question is:
- can I encode the thing I care about as something additive over prefixes?
If yes, prefix sums may be the right tool.
Core Invariants And Why They Work¶
1. Stable Prefix Meaning¶
The most important invariant is:
pref[i] has one precise meaning and never changes during the whole solution.
For the standard half-open convention:
pref[i] = sum of a[0..i-1]
Once that meaning is fixed, every range formula becomes routine.
2. Why Range Subtraction Works¶
Under the half-open convention:
and:
Subtracting cancels the common left prefix and leaves exactly:
This proof is tiny, but it matters because almost every indexing bug is a meaning bug, not an arithmetic bug.
3. Prefix Sums Work For Any Additive Quantity¶
Suppose b[i] is not the original array value, but an indicator:
Then:
counts how many positions in [l, r) satisfy the property.
This is why prefix sums immediately generalize from "sum" to:
- count positives
- count zeros
- count occurrences of one letter
- count how many values are at most some threshold after preprocessing
4. 2D Prefix Sums Are Just Inclusion-Exclusion¶
For a matrix, define:
as the aggregate on the rectangle from (1,1) to (x,y).
Then a query rectangle is:
This is the same prefix idea again, only now one overlap must be added back.
5. Why Prefix Sums Fail For Frequent Updates¶
If one array element changes, then all later prefixes change too.
So a single point update may force:
recomputation.
That is the exact boundary where you should stop using static prefix sums and move to:
- Difference Arrays for offline range updates
- Fenwick Tree or Segment Tree for online updates
Variant Chooser¶
Use 1D Prefix Sums When¶
- the array is static
- many range sums or range counts must be answered
- each answer can be expressed as a difference of two cumulative values
Canonical examples:
- static range sum
- static frequency count
- number of bad positions in a segment
Use Frequency Prefixes When¶
- the value domain is small or compressed
- queries ask "how many times does value
xoccur in[l, r]?"
Then you may store one prefix per value.
Use Prefixes Modulo Something When¶
- the target is about divisibility or equal-prefix classes
Canonical smell:
- subarray sums divisible by
m - equal count of two symbols
Then the useful invariant is often not the raw prefix sum, but the prefix sum modulo m or the difference between two counters.
Use 2D Prefix Sums When¶
- the data is a static grid
- rectangle sums or rectangle counts are queried
Do Not Use Plain Prefix Sums When¶
- updates appear between queries
- the answer depends on a moving window with no static preprocessing
- the aggregate is not additive in a prefix-friendly way
Then you should think about:
Worked Examples¶
Example 1: Static Range Sum Queries¶
This is the repo's canonical first note:
Build:
Then each sum query is:
under one-based inclusive indexing, or:
under zero-based half-open indexing.
The technique is identical; only the indexing convention changes.
Example 2: Count Zeros In A Range¶
Define:
Then the number of zeros in [l, r) is simply:
This is the cleanest reminder that prefix sums are not tied to actual sums of the original values.
Example 3: 2D Rectangle Sum¶
Let a matrix query ask for the sum over:
Use 2D prefix sums and answer with:
The minus-minus-plus pattern is not a trick to memorize blindly. It comes from:
- subtracting the top strip
- subtracting the left strip
- adding back the overlap that was subtracted twice
Example 4: Prefix Sums + Another Technique¶
Many problems do not end at "build prefix sums."
Instead, prefix sums are the cheap helper that feeds another idea:
- interval cost for DP
- feasibility check for binary search
- rectangle area/count queries inside geometry or grids
That is why prefix sums are often a support tool rather than the whole solution.
Example 5: Prefix Sums Modulo m¶
Suppose you want to count subarrays whose sum is divisible by m.
Let:
Then the subarray sum on [l, r) is divisible by m exactly when:
which is equivalent to:
So the problem becomes:
- count pairs of equal prefix-sum remainders
This is the first important jump beyond direct range queries:
- you are still using prefix sums
- but the answer comes from comparing prefix classes, not subtracting one explicit interval query
Algorithms And Pseudocode¶
Standard 1D Build¶
pref[0] = 0
for i from 0 to n-1:
pref[i + 1] = pref[i] + a[i]
Half-Open Query¶
sum(l, r):
return pref[r] - pref[l]
One-Based Inclusive Query¶
sum(l, r):
return pref[r] - pref[l - 1]
2D Build¶
pref[x][y] =
grid[x][y]
+ pref[x - 1][y]
+ pref[x][y - 1]
- pref[x - 1][y - 1]
Implementation Notes¶
- Pick one indexing convention and keep it everywhere.
- The safest 1D contest convention is often
pref[0..n]with half-open queries. - Use
long longwhenever sums may exceedint. - For 2D prefix sums, initialize the zero row and zero column so boundary formulas stay uniform.
- Prefix sums pair naturally with immutable arrays. If the array changes repeatedly, the technique is usually the wrong main tool.
Practice Archetypes¶
You should strongly suspect prefix sums when you see:
- many static range-sum queries
- range counts over a fixed array
- rectangle sums on a fixed grid
- interval cost precomputation for DP
Repo anchors:
Starter template:
Notebook refresher:
References And Repo Anchors¶
Course / tutorial style:
- USACO Guide: Introduction to Prefix Sums
- OI Wiki: Prefix Sum And Adjacent Difference
- Competitive Programmer's Handbook
Practice / repo anchors: