Regex Membership

Title: Regex Membership
Judge / source: Canonical Thompson-NFA benchmark
Original URL: https://algs4.cs.princeton.edu/54regexp/
Secondary topics: Thompson construction, Epsilon closure, Active-state-set simulation
Difficulty: medium
Subtype: Full-string regex membership under a small Thompson syntax
Status: solved
Solution file: regexmembership.cpp

Why Practice This

This is the repo's cleanest first flagship problem for Regular Expressions / Finite Automata.

The benchmark is intentionally canonical:

So the hard part is exactly the lane itself:

Reach for the regex-automaton worldview when:

The strongest smell is:

That is exactly this lane.

This benchmark does not want:

The clean route is:

That is exactly the first regex / finite-automata route.

The useful invariant is:

After reading any prefix of the text, the only thing that matters is the set of NFA states reachable after that prefix.

So the algorithm never has to backtrack over parse choices manually. It only advances one reachable-state frontier.

That is the whole Thompson lesson.
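
A minimal sketch of that idea, for illustration only and not the contents of regexmembership.cpp. The names `ThompsonNfa`, `matches`, and `EPS` are hypothetical, and the syntax handled here (literals, concatenation, `|`, `*`, parentheses) is an assumption that may differ from the benchmark's exact metacharacter set. It builds the NFA by recursive descent, then advances a single set of active states per input character, taking the epsilon closure after each step:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

const int EPS = -1;  // label of states that carry only epsilon edges

// Hypothetical helper; assumed syntax: literals, concatenation, '|', '*', '(' ')'.
struct ThompsonNfa {
    struct State { int label; int out1; int out2; };  // label: EPS or a literal byte
    typedef std::pair<int, int> Frag;                 // (entry state, exit state)

    std::vector<State> st;
    std::string re;
    std::size_t pos = 0;
    int start = -1, accept = -1;
    bool ok = true;  // set to false on malformed patterns

    explicit ThompsonNfa(const std::string& pattern) : re(pattern) {
        Frag f = parseAlt();
        if (pos != re.size()) ok = false;  // leftover input, e.g. an unmatched ')'
        start = f.first;
        accept = f.second;
    }

    int newState(int label = EPS) {
        st.push_back({label, -1, -1});
        return static_cast<int>(st.size()) - 1;
    }
    void addEps(int from, int to) {
        // Every Thompson state has at most two outgoing epsilon edges.
        assert(st[from].label == EPS && st[from].out2 < 0);
        if (st[from].out1 < 0) st[from].out1 = to; else st[from].out2 = to;
    }
    Frag parseAlt() {  // alt := concat ('|' concat)*
        Frag left = parseConcat();
        while (pos < re.size() && re[pos] == '|') {
            ++pos;
            Frag right = parseConcat();
            int s = newState(), e = newState();
            addEps(s, left.first);  addEps(s, right.first);
            addEps(left.second, e); addEps(right.second, e);
            left = Frag(s, e);
        }
        return left;
    }
    Frag parseConcat() {  // concat := star*  (possibly empty)
        int s = newState();
        Frag cur(s, s);
        while (pos < re.size() && re[pos] != '|' && re[pos] != ')') {
            Frag next = parseStar();
            addEps(cur.second, next.first);
            cur.second = next.second;
        }
        return cur;
    }
    Frag parseStar() {  // star := atom '*'*
        Frag f = parseAtom();
        while (pos < re.size() && re[pos] == '*') {
            ++pos;
            int s = newState(), e = newState();
            addEps(s, f.first);        addEps(s, e);
            addEps(f.second, f.first); addEps(f.second, e);
            f = Frag(s, e);
        }
        return f;
    }
    Frag parseAtom() {  // atom := '(' alt ')' | literal
        char c = re[pos++];
        if (c == '(') {
            Frag f = parseAlt();
            if (pos < re.size() && re[pos] == ')') ++pos; else ok = false;
            return f;
        }
        if (c == '*') ok = false;  // '*' with nothing to repeat
        int s = newState(static_cast<unsigned char>(c)), e = newState();
        st[s].out1 = e;
        return Frag(s, e);
    }
    // Expand the marked set along epsilon edges until nothing new is reachable.
    void closure(std::vector<int>& work, std::vector<char>& on) const {
        while (!work.empty()) {
            int s = work.back(); work.pop_back();
            if (st[s].label != EPS) continue;  // literal states have no epsilon edges
            const int outs[2] = {st[s].out1, st[s].out2};
            for (int t : outs)
                if (t >= 0 && !on[t]) { on[t] = 1; work.push_back(t); }
        }
    }
    // Full-string membership: one active-state set, advanced once per character.
    bool matches(const std::string& text) const {
        if (!ok) return false;
        std::vector<char> on(st.size(), 0);
        std::vector<int> work(1, start);
        on[start] = 1;
        closure(work, on);
        for (unsigned char ch : text) {
            std::vector<char> next(st.size(), 0);
            for (std::size_t s = 0; s < st.size(); ++s)
                if (on[s] && st[s].label == ch && !next[st[s].out1]) {
                    next[st[s].out1] = 1;
                    work.push_back(st[s].out1);
                }
            closure(work, next);
            on.swap(next);
        }
        return on[accept] != 0;
    }
};
```

Under these assumptions, `ThompsonNfa("(a|b)*abb").matches("aababb")` is true and `ThompsonNfa("(a|b)*abb").matches("ab")` is false. Note that `matches` never revisits an earlier character: all parse choices live inside the `on` set.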

With regex length m and text length n:
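
For reference, the standard textbook bounds for a Thompson construction followed by active-state-set simulation are given here as the general result, not as measurements of regexmembership.cpp. Here $|Q|$ denotes the number of NFA states.

$$
|Q| = O(m), \qquad T_{\text{build}} = O(m), \qquad T_{\text{match}} = O(mn), \qquad \text{extra space} = O(m)
$$

Each of the $n$ text characters triggers one pass over at most $O(m)$ active states and $O(m)$ epsilon edges, which is where the $O(mn)$ product comes from.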

The point of this benchmark is not to mimic industrial regex engines. The point is:

This repo's canonical benchmark uses:

The solution prints:

Supported metacharacters are exactly:

Everything else is treated as a literal.

Topic page: Regular Expressions / Finite Automata
Practice ladder: Regex / Finite Automata ladder
Starter template: regex-thompson-nfa.cpp
Notebook refresher: Regex / Finite Automata hot sheet
Compare points: KMP, Aho-Corasick, Suffix Automaton
This note adds: the canonical full-string Thompson-NFA route before grep-style substring wrappers or richer regex syntax.