Treap / Implicit Treap¶
Treap is the randomized split/merge family for two different but related jobs:
- key-based treap: ordered-set operations where explicit key order still matters
- implicit treap: sequence operations where the "key" is the element position and should not be stored explicitly
The repo now has two exact starter routes inside the same family:
- key-based route -> treap-key-ordered-set.cpp -> Salary Queries
- implicit route -> implicit-treap-sequence.cpp -> Cut and Paste
- compare points -> PBDS / Order Statistics Tree, Persistent Treap, Heaps And Ordered Sets, Lazy Segment Tree
This lane matters when arrays are too rigid and ordered sets are too weak:
- arrays make insert / erase / cut / paste by position cost
O(n) - ordinary ordered sets do not naturally model "the 37th element of the current sequence"
Treap fixes that by combining:
- BST order
- heap order on random priorities
and then reusing one pair of core operations:
splitmerge
At A Glance¶
- Use when: one mutable sequence needs insert / erase / cut / paste / split / merge by position, or an ordered set needs split/merge by key
- Core worldview: priorities choose the shape, while BST order or implicit position determines the in-order sequence
- Main tools: randomized priorities, subtree sizes, split, merge, and optional lazy tags later
- Typical complexity: expected
O(log n)per split/merge/insert/erase - Main trap: mixing up key-based split conventions and implicit split conventions as if they were the same API
Strong contest signals:
- "cut a substring and paste it somewhere else"
- "ordered set is right, but I want a self-hosted split/merge tree instead of PBDS"
- "insert / erase at position
i" - "split one sequence into two, then merge back in a different order"
- "a balanced BST with split/merge would be cleaner than fighting
std::set"
Strong anti-cues:
- the structure is fixed-length and only needs range aggregates -> Segment Tree or Lazy Segment Tree
- you only need predecessor / successor / erase-one in value order -> Heaps And Ordered Sets
- the array is static -> Wavelet Tree, Sparse Table, or offline lanes
- the task never splits or merges a sequence, only scans or updates fixed indices
Prerequisites¶
- Heaps And Ordered Sets
- Segment Tree
- Randomized Algorithms as the best conceptual neighbor for why random priorities keep the height small in expectation
Helpful compare points:
- Heaps And Ordered Sets: use this first when plain
set / multiset / priority_queuealready solves the problem - PBDS / Order Statistics Tree: use this first when GNU PBDS already gives a simpler exact route for rank /
k-th - Persistent Treap: use this when old sequence versions stay alive and split/merge itself must be non-destructive
- Lazy Segment Tree: use this when indices are fixed and only interval aggregates/updates matter
- Lexicographic Enumeration: sequence order reasoning sometimes appears, but the data-structure need is different
- Balanced BSTs For Contests: use this if you want the AVL / Red-Black / Scapegoat / SBT compare note and the broader map of textbook balanced-BST variants
Problem Model And Notation¶
There are two variants to keep separate.
1. Key-Based Treap¶
Each node stores:
- one explicit key
key - one random priority
prior
The in-order traversal is the sorted order by key.
This fits:
- ordered-set split by key
- merge of two disjoint key ranges
2. Implicit Treap¶
Each node stores:
- payload value
- random priority
- subtree size
But not the position key itself.
Instead, a node's implicit position is:
all nodes in its left subtree
+ all nodes in left subtrees of ancestors where this node sits in the right branch
+ those ancestors themselves
That means position can be recovered while descending, so insert/erase by position stays logarithmic.
The implicit starter in this repo uses zero-based half-open sequence ranges [l, r).
It supports:
insert(pos, c)erase(pos)cut_paste_to_end(l, r)to_string()
That is the exact route for Cut and Paste.
From Brute Force To The Right Idea¶
Brute Force On Arrays¶
Suppose a sequence must support:
- cut substring
[l, r) - paste it elsewhere
With a plain array or string, one operation can shift:
elements.
That is too slow once both:
n- number of operations
reach 2 * 10^5.
Why Ordered Sets Alone Do Not Fix It¶
An ordered set is good when the identity of an element is:
- its value key
But in sequence tasks, the identity is often:
- current position in the sequence
If one insertion near the front shifts all later positions, storing explicit indices becomes awkward.
The Structural Observation¶
If a structure can answer:
- first
kelements - remaining elements
as a split, then cut/paste becomes simple:
- split before
l - split before
r - rearrange the three pieces with merges
So the whole lane reduces to:
make split and merge cheap while keeping the tree balanced
The Treap Trick¶
Treap stores:
- BST/sequence order invariant
- heap invariant on random priorities
The priority decides shape probabilistically, which keeps expected height near logarithmic.
Then:
splitpeels one tree into twomergeglues two valid trees into one
Everything else is built from those two primitives.
Core Invariant And Why It Works¶
1. Two Invariants For Ordinary Treap¶
For key-based treap:
- BST property on explicit keys
- heap property on priorities
That means:
- in-order traversal is sorted by key
- the highest-priority node in a subtree becomes its root
2. Why Split Works¶
For key-based treap:
split(root, x)
returns:
- left tree = keys
< x - right tree = keys
>= x
Only one root-to-leaf path is rewired, so the operation stays logarithmic in expected height.
For implicit treap:
split(root, k)
returns:
- left tree = first
ksequence elements - right tree = the rest
The only difference is the routing test:
- use subtree sizes instead of explicit keys
3. Why Merge Works¶
If every key in A belongs before every key in B, then:
merge(A, B)
chooses the root with larger priority and recursively merges the remaining child.
Because both inputs already satisfy the treap invariants, the result does too.
For implicit treap, the same merge works because sequence order is:
- all of left sequence
- then all of right sequence
No explicit key comparison is needed.
4. Why Subtree Size Is Mandatory For Implicit Treap¶
Implicit key is not stored.
So at any node, the count:
tells us how many sequence elements come before the current node inside this subtree.
That is exactly the information split-by-position needs.
The invariant is:
size[v] = true number of live nodes in subtree v
Every merge and split must repair this with one pull.
5. Why Expected Height Stays Small¶
Random priorities make the treap behave like a randomized BST / heap merge structure.
The shape is not chosen by the adversarial insertion order alone, so expected height is logarithmic.
This is the family reason treap works at all.
Worked Examples¶
Example 1: Key-Based Split By Value¶
Suppose the treap stores keys:
Calling:
split(root, 9)
should produce:
- left =
{2, 5, 8} - right =
{11, 14}
This is the right treap flavor when:
- explicit key order is the point
- split/merge by key is cleaner than fighting
std::set
This repo's exact first key-based anchor is:
Example 2: Implicit Cut And Paste¶
Take the sequence:
AYBABTU
and cut positions [2, 5) zero-based:
YBA
Split into:
- left =
A - mid =
YBA - right =
BTU
Then merge:
left + right + mid
to get:
ABTUYBA
The sequence changed, but no element stored an explicit mutable index.
That is exactly why implicit treap exists.
Algorithm And Pseudocode¶
Key-Based Split / Merge¶
split_by_key(t, x):
if t is null:
return (null, null)
if t.key < x:
(a, b) = split_by_key(t.right, x)
t.right = a
pull(t)
return (t, b)
else:
(a, b) = split_by_key(t.left, x)
t.left = b
pull(t)
return (a, t)
merge(a, b):
if a is null: return b
if b is null: return a
if a.prior > b.prior:
a.right = merge(a.right, b)
pull(a)
return a
else:
b.left = merge(a, b.left)
pull(b)
return b
Implicit Split By Position¶
split_by_pos(t, k):
if t is null:
return (null, null)
left_size = size(t.left)
if k <= left_size:
(a, b) = split_by_pos(t.left, k)
t.left = b
pull(t)
return (a, t)
else:
(a, b) = split_by_pos(t.right, k - left_size - 1)
t.right = a
pull(t)
return (t, b)
Cut And Paste To End¶
For zero-based half-open [l, r):
(a, bc) = split(root, l)
(b, c) = split(bc, r - l)
root = merge(merge(a, c), b)
That one identity is the first exact pattern worth memorizing in this lane.
Variant Chooser¶
Use Key-Based Treap When¶
- explicit value order is still the intended order
- you need split / merge by key
std::setgives predecessor/successor, but not the structural operations cleanly enough
Use Implicit Treap When¶
- the real key is current sequence position
- sequence edits shift later positions
- cut / paste / insert / erase by position are central
This is the exact first implicit route in this repo.
Prefer PBDS When¶
- the task is fundamentally online rank /
k-th on one ordered set - GNU PBDS is allowed
- you do not need a self-hosted split/merge structure
The clean compare point is:
Prefer Ordered Sets When¶
- you only need predecessor / successor / erase-one / top-k style set behavior
- no sequence surgery or split/merge by structure is required
In C++, plain ordered sets are usually the lighter first choice.
Prefer Segment Tree / Lazy Segment Tree When¶
- the sequence length is fixed
- positions are stable
- interval aggregate/update is the real job
Treap shines when:
- sequence structure itself changes
not just values on a fixed index line.
Reopen The Topic Before Reusing The Starter When¶
- you need range sums, min/max, reverse tags, or affine lazy tags
- the problem is really a stronger implicit-treap-with-lazy lane
- you need the key-based starter, not the sequence starter
The two starters here are intentionally narrow:
- treap-key-ordered-set.cpp:
ordered-set-by-key, expected
O(log n)insert / erase / rank /k-th - implicit-treap-sequence.cpp: sequence surgery by position
Neither starter pretends to cover the whole family.
Implementation Notes¶
1. Use One pull Discipline¶
Every place that rewires children must repair:
- subtree size
If you later add aggregates, pull is where they belong too.
2. Keep Position Convention Explicit¶
The starter uses:
- zero-based positions
- half-open
[l, r)
Contest statements often use:
- one-based inclusive intervals
Convert at the call site.
3. Do Not Store Explicit Indices In Implicit Treap¶
That defeats the whole purpose.
The position is recovered from subtree sizes while descending.
4. Random Priorities Must Be Stable Enough¶
Use one RNG and assign one priority per node at creation time.
Do not recompute priorities later.
5. The First Starter Is Sequence-Surgery-First¶
The repo starter intentionally covers:
- build from one initial string
- insert one character at position
- erase one position
- move one substring to the end
That keeps the first route honest and easy to trust.
Practice Archetypes¶
First Rep In This Repo¶
Best Compare Points In This Repo¶
Good External Follow-Ups¶
- CSES - Substring Reversals
- CSES - Reversals and Sums
- Library Checker - Dynamic Sequence Range Affine Range Sum