XOR Basis / Linear Basis¶

This lane is for the moment when a xor problem stops being about:

one pair of numbers
one stored partner maximizing xor
one prefix xor trick

and becomes about:

the whole span of subset xors
linear independence over GF(2)
whether one value is representable by xor-ing some chosen elements

In contest terms, the key shift is:

stop thinking of the input as "many unrelated numbers"
treat each number as one binary vector
keep only a maximal linearly independent set
answer subset-xor questions from that compressed basis

This is the exact lane for the repo's first xor basis route.

At A Glance¶

Use when: the task is about xor of some subset of numbers, not xor against one chosen stored partner
Core worldview: the input numbers generate one vector space over GF(2), and the basis is a maximal linearly independent set of generators
Main tools: pivot bits, greedy insertion by highest set bit, representability test, greedy maximum-subset-xor query
Typical complexity: O(B) per insertion or query for bit width B
Main trap: confusing subset-xor basis queries with binary-trie queries against one stored element

Strong contest signals:

"choose any subset"
"maximum xor obtainable from the given numbers"
"can we xor some chosen elements to get x?"
"keep only independent xor generators"
"the statement smells like Gaussian elimination, but over bits"

Strong anti-cues:

the query is "best stored number for this current x" -> Binary Trie / XOR Queries
the task is one static range xor or one path xor with no cycle freedom -> prefix xor or tree xor may already solve it
the task is ordinary modular linear algebra rather than xor/bitwise structure

The repo's exact first route for this family is:

starter -> xor-basis.cpp
flagship note -> XMAX - XOR Maximization
compare point -> Binary Trie / XOR Queries

Prerequisites¶

Number Theory Basics only at the level of being comfortable with bitwise reasoning and compact algebraic invariants
C++ Language For Contests for safe long long bit operations

Helpful neighboring topics:

Binary Trie / XOR Queries: choose between "one stored partner" and "any subset"
Linear Recurrence / Matrix Exponentiation: a compare point for the broader linear-algebra worldview
Shortest Paths: useful later for xor-path problems with cycle-space reductions

Problem Model And Notation¶

Let the input numbers be:

\[ a_1, a_2, \dots, a_n \]

Interpret each number as a binary vector over GF(2).

The reachable values are:

\[ \left\{ \bigoplus_{i \in S} a_i \mid S \subseteq \{1,2,\dots,n\} \right\} \]

So the real object is not the list of numbers itself.

It is the xor-span generated by those numbers.

An xor basis stores a maximal set of linearly independent vectors from that span, usually organized by pivot bit:

one basis vector per highest set bit

With the repo starter convention:

basis[b] is the stored vector whose highest set bit is b, or 0 if none exists there

From Brute Force To The Right Idea¶

Brute Force¶

If the task asks for:

maximum xor over all subsets

then the direct route is:

try all 2^n subsets

That becomes impossible immediately.

Structural Observation¶

Different subsets can produce the same xor value.

So the right object is not:

all subsets

but:

the vector space generated by the numbers under xor

Inside that space, many numbers are redundant because they can already be expressed as xor of earlier ones.

The exact question becomes:

which numbers are truly independent generators of the xor span?

That is a basis question.

Why Greedy Highest-Bit Insertion Works¶

Take one number x.

Look at its highest set bit.

if no existing basis vector has that pivot, keep x as the new pivot vector there
otherwise xor it with the existing pivot vector and continue downward

This is just Gaussian elimination over GF(2), specialized to bitmasks.

Each xor step removes the current highest set bit, so the process strictly progresses.

Core Invariant And Why It Works¶

1. One Pivot Per Highest Set Bit¶

Maintain:

basis[b] = one vector whose highest set bit is b

for each bit position b.

Then no two stored vectors share the same highest set bit, which means:

the stored vectors are linearly independent

2. Insertion Preserves The Span¶

When inserting x, repeatedly xor away pivot bits that already exist.

Two cases remain:

x becomes 0
x ends with a highest set bit b not used yet

If x becomes 0, then:

the new number was already representable by the old basis

If x survives with new pivot b, then:

adding it enlarges the span by one dimension

So insertion never changes the generated xor-space incorrectly:

it either keeps the same span or extends it by one independent direction

3. Why Representability Testing Works¶

To test whether some value x is representable:

run the same elimination process against the stored basis

If it reduces to 0, then:

x belongs to the span

If not, then:

some pivot bit remains unmatched, so x is outside the span

4. Why Greedy Maximum-Xor Query Works¶

Suppose we want the maximum xor value obtainable from any subset.

Start with ans = 0 and scan pivot bits from high to low.

If:

ans xor basis[b] > ans

then take it.

This is correct because a higher bit dominates all lower bits.

Once we decide whether bit b can be made 1, no lower-bit choice can compensate for getting that decision wrong.

So the query is a greedy walk over pivot bits, exactly like the standard maximum-xor logic in Gaussian form.

Worked Examples¶

Example 1: Maximum Subset Xor¶

This repo's flagship note:

XMAX - XOR Maximization

Set:

1, 2, 4

These are already independent pivot vectors, so the maximum subset xor is:

1 xor 2 xor 4 = 7

This is the cleanest first benchmark because:

the only goal is maximum subset xor
the whole payoff comes directly from building the basis correctly

Example 2: Why Binary Trie Is A Different Family¶

Suppose the query is:

for current x, choose one stored number y maximizing x xor y

That is not an xor-basis question.

That is the lane of Binary Trie / XOR Queries, because the task is about one stored partner, not any subset.

The family split is:

binary trie -> maximize against one stored element
xor basis -> maximize over xor of any subset

Example 3: Graph Cycle-Space Stretch¶

In xor-path problems like:

find the minimum xor path from 1 to n

one DFS/tree xor gives one candidate path value, and every extra cycle contributes one xor vector.

Then the cycle-xor set is reduced to an xor basis, and that basis is used to improve the path value greedily.

That is a strong stretch use, but it is not the repo's exact first starter route.

Variant Chooser¶

Use XOR Basis When¶

the task is about xor of some subset
maximum subset xor or representability is the real query
a compact independent generator set is enough

That is the lane of this page.

Use Binary Trie When¶

the query is against one current external value x
you need one stored element maximizing xor with it
insert / erase / query happen online in a live multiset

Use Prefix-Xor Tricks When¶

the structure is about subarrays or tree paths with fixed endpoints and no extra subset freedom
one xor-difference identity already captures the whole task

Complexity And Tradeoffs¶

For fixed bit width B + 1:

insert: O(B)
representability test: O(B)
maximum subset xor: O(B)
memory: O(B)

Tradeoffs:

excellent when the whole span matters
much lighter than storing all numbers
not the right tool for one-partner online xor queries
advanced variants like kth-xor, merges, or rollback need extra structure and are outside the exact starter route here

Starter Fit¶

The starter in this repo is intentionally narrow:

long long xor basis
insert(x)
can_make(x)
max_xor()
dimension()

That exact route is enough for:

maximum subset xor
representability queries
many first linear-basis problems

If the task needs more, extend carefully only after the basic row-echelon worldview is stable:

reduced basis for kth-xor style queries
merging bases
graph cycle-basis reductions

Common Pitfalls¶

mixing up subset-xor with pair-xor
forgetting that duplicate or dependent numbers may add no new dimension
scanning bits in the wrong direction during greedy max query
using signed-bit assumptions carelessly when inputs are large
overclaiming that the starter already supports kth-xor or rollback/segment-tree merges

Repo Routes¶

exact starter -> xor-basis.cpp
quick recall -> XOR Basis hot sheet
flagship note -> XMAX - XOR Maximization
compare data-structure route -> Binary Trie / XOR Queries