Sorting¶
Sorting is not just preprocessing.
In many contest problems, sorting is the move that creates the structure the proof needs:
- adjacency becomes meaningful
- greedy choices become comparable
- one-pass scanning becomes possible
- duplicates become grouped
- event order becomes canonical
That is why so many solutions secretly look like:
- sort
- maintain one small invariant
- scan once
At A Glance¶
- Use when: relative order matters more than original positions, or a greedy/sweep proof needs a canonical order
- Core worldview: sorting creates monotonicity and local structure
- Main tools:
sort, custom comparators, tie rules, adjacency reasoning, earliest-finish greedy, opposite-end scans - Typical complexity:
O(n \log n)preprocessing plus linear or near-linear postprocessing - Main trap: believing that sorting alone proves a greedy algorithm, or writing an inconsistent comparator
Prerequisites¶
Helpful neighboring topics:
Problem Model And Notation¶
Suppose the input objects are:
Sorting chooses an order according to one or more keys:
The most common uses are:
- sort by value
- sort by coordinate
- sort by finishing time
- sort by one key, then tie-break by another
The key question is always:
- what property becomes simple after the order is fixed?
From Brute Force To The Right Idea¶
Brute Force: Treat All Arrangements As Different¶
Many beginner problems look combinatorial only because the order is still arbitrary.
Examples:
- pairing children into gondolas
- selecting the most compatible intervals
- finding the closest pair on a line
Without order, it feels like:
- try many subsets
- try many pairings
- try many possible next choices
The First Shift: Sort To Make One Choice Obviously Hardest Or Safest¶
After sorting, one element often becomes the natural anchor:
- the earliest finishing interval
- the heaviest remaining child
- the smallest or largest remaining value
That anchor is what makes the proof go through.
The Second Shift: Nearby Elements Become The Only Interesting Ones¶
On a line, many global comparisons collapse after sorting.
For example, the minimum absolute difference among all pairs must occur between adjacent elements in sorted order.
So sorting can reduce:
all-pairs reasoning to one linear scan.
The Third Shift: Sorting And Scanning Is Often The Real Algorithm¶
The solution is often not:
- "use greedy"
It is:
- choose the correct order
- then greedy becomes forced
That distinction matters because the ordering is usually half of the proof.
Core Invariants And Why They Work¶
1. Comparator Meaning Must Match The Proof¶
If you sort by finishing time, the proof must use finishing time.
If you sort by weight, the proof must use weight.
This sounds obvious, but many wrong solutions come from:
- sorting by one key
- proving correctness as if another key had been used
So the first invariant is conceptual:
The chosen order must be the order the proof actually reasons about.
2. Exchange Argument Is The Main Greedy Proof¶
The standard sorting proof pattern is:
- take any optimal solution
- look at the first place where it differs from the sorted-order greedy choice
- swap in the greedy-compatible choice
- show the solution remains feasible and no worse
Repeat until the optimal solution agrees with the greedy one.
This is the proof behind interval scheduling and many other sort-first greedy problems.
3. Adjacency Reduction¶
Suppose:
If you want the minimum difference pair, it is enough to check adjacent pairs:
Why?
If the best pair were (a_i, a_j) with i + 1 < j, then some value lies between them, which yields an equal or smaller difference to one side.
Sorting turns a global search into a local scan.
4. Opposite-End Reasoning¶
After sorting, the two ends may represent:
- the most constrained remaining item
- the least costly possible partner
This is exactly why Ferris Wheel works:
- the heaviest remaining child is hardest to place
- pairing them with the lightest remaining child is the best remaining chance
5. Event Order Invariant¶
For sweep-like problems, sorting creates a forward timeline:
After processing all events up to position x,
the maintained state reflects exactly those events.
This is the bridge from basic sorting to offline tricks and sweep line.
Variant Chooser¶
Use Sort + Greedy When¶
- one local choice should be made in a canonical order
- an exchange argument naturally fits
Canonical examples:
- earliest finishing interval first
- smallest/largest feasible item first
- pair the most constrained element first
Use Sort + Adjacency Scan When¶
- the best candidate after sorting must be near a boundary or an adjacent element
Canonical examples:
- minimum difference
- duplicate detection
- coordinate compression preparation
Use Sort + Two Pointers When¶
- feasibility depends on the two ends or on a monotone scan
Canonical examples:
- pair sums
- minimizing containers under a two-item limit
- merging two sorted lists
Use Sort + Binary Search When¶
- the data becomes searchable after sorting
- many predecessor/successor or count queries follow
Use nth_element Or Selection Instead When¶
- you only need one order statistic, not a full sorted order
That is a useful boundary: do not over-sort if only the median or k-th item matters.
Worked Examples¶
Example 1: Movie Festival¶
This is the repo's cleanest interval-scheduling anchor:
Sort intervals by increasing finishing time.
Greedily pick the first interval compatible with the last chosen one.
Why is this safe?
Because among all currently compatible choices, the earliest finishing one leaves the most room for the future. Any optimal solution that starts with a later-finishing compatible interval can exchange it with the earlier-finishing one without reducing the number of intervals later.
Example 2: Ferris Wheel¶
This is the repo's first sort + opposite-end pairing anchor:
Sort the weights. Let:
ibe the lightest remaining childjbe the heaviest remaining child
If a[i] + a[j] fits, pair them.
If not, then a[j] cannot pair with anyone, because every other remaining child is at least as heavy as a[i].
That is a sorting proof, not merely a two-pointer trick.
Example 3: Minimum Difference¶
Sort the array.
Then check only adjacent pairs.
This is the standard example of sorting revealing locality: after ordering, global closeness becomes local closeness.
Example 4: Event Sorting¶
Suppose intervals, jobs, or points are processed by coordinate or time.
Sorting gives one forward order where you can maintain a small active state instead of comparing every pair.
This is the base pattern behind:
- sweep line
- offline query sweeps
- many greedy scheduling problems
One subtle but important detail is tie policy.
Suppose intervals are closed and you want to detect whether an interval ending at x overlaps one starting at x.
Then on equal coordinate x:
- processing
startbeforeendtreats them as overlapping - processing
endbeforestarttreats them as non-overlapping
So "sort events by coordinate" is not yet a full algorithm.
You also need:
- the correct equal-coordinate event order
and that tie rule must match the statement's boundary convention.
Algorithms And Pseudocode¶
Generic Sort With Tie Rule¶
sort items by:
primary key
then explicit secondary key
Interval Scheduling Skeleton¶
sort intervals by increasing end time
last_end = -INF
answer = 0
for interval in sorted order:
if interval.start >= last_end:
take interval
last_end = interval.end
answer += 1
Opposite-End Pairing Skeleton¶
sort array
i = 0
j = n - 1
answer = 0
while i <= j:
if i < j and a[i] + a[j] fits:
i += 1
j -= 1
answer += 1
Implementation Notes¶
- Your comparator must define a strict weak ordering. If
cmp(a, b)andcmp(b, a)can both be true, the sort is invalid. - If ties matter for proof or deterministic behavior, state them explicitly.
stable_sortis only needed when equal-key original order must be preserved. Most contest problems do not need it.- Sorting often destroys original indices, so store them when the output needs original positions.
- Use
long longif comparisons or sums after sorting may overflowint.
Practice Archetypes¶
You should strongly suspect sorting when you see:
- only relative order matters
- a greedy proof starts with "among all feasible choices, choose the earliest/smallest/largest..."
- nearby values become meaningful after ordering
- a scan becomes monotone only after sorting
Repo anchors:
Starter template:
Notebook refresher:
References And Repo Anchors¶
Course / proof style:
- Princeton: Greedy Algorithms I
- Virginia Tech: Interval Scheduling
- University of Washington: Greedy Algorithms / Interval Scheduling
Reference:
Practice / repo anchors: