Huffman / Data Compression Hot Sheet¶
Use this page when the real question is "build one optimal binary prefix-code tree by repeatedly merging the two lightest subtrees."
Do Not Use When¶
- the task is one scan with prefix feasibility or repair -> Prefix Constraints
- the point is a production compression format, bit packing, or container headers
- the task only needs a priority queue, not a proof of optimal merge structure
Choose By Signal¶
- one optimal merge / prefix-code tree from positive frequencies ->
Huffman - one heap-backed feasibility repair over ordered input ->
Prefix Constraints - one plain priority-queue process without coding-tree semantics -> reopen Heaps And Ordered Sets
Core Invariants¶
- the heap stores the current forest of subtree weights
- every merge increases total weighted path length by exactly
a + b - total optimum equals the sum of all merge costs
- one-symbol input needs an explicit empty-code convention
Main Traps¶
- confusing code assignment tie policy with the actual optimality invariant
- trying to teach LZW / run-length / file-format details in the first Huffman lane
- forgetting that the objective is weighted depth, not lexicographic prettiness of the codewords
Exact Starter Route¶
- exact starter ->
huffman-coding.cpp - flagship rep -> Huffman Coding Benchmark
- compare route -> Greedy overview
- compare route -> Heaps And Ordered Sets
Reopen Paths¶
- full lesson -> Huffman / Data Compression
- greedy family hub -> Greedy
- broader retrieval -> Build Kit