A framework for understanding

Nondeterministic Finite Automata (NFAs)

A Nondeterministic Finite Automaton (NFA) extends the DFA model by allowing multiple transitions on the same symbol and epsilon (ε) transitions that require no input. NFAs provide a more flexible way to design automata while maintaining the same computational power as DFAs.

The nondeterministic nature of NFAs leads to numerous theoretical results regarding state complexity, conversion algorithms, and computational properties. This page provides a rigorous examination of NFAs and their mathematical foundations.

Formal Definition and Core Properties

Definition: Nondeterministic Finite Automaton (Formal)

A nondeterministic finite automaton (NFA) is a 5-tuple (Q, Σ, δ, q0, F) where:

  • Q is a finite set of states
  • Σ is a finite alphabet
  • δ: Q × (Σ ∪ {ε}) → P(Q) is the transition function mapping each state-symbol pair to a set of states
  • q0 ∈ Q is the start state
  • F ⊆ Q is the set of accept states

Note that P(Q) represents the power set of Q (the set of all subsets of Q). This includes the empty set, allowing for the possibility that δ(q, a) = ∅ for some state q and symbol a, indicating no valid transition.

Example: NFA for Strings Ending with "ab"

Consider the language L = {w | w ends with the substring ab} over the alphabet Σ = {a, b}. Notice how the NFA can be simpler than the equivalent DFA by using nondeterministic choices.

  • Q = {q₀, q₁, q₂}
  • Σ = {a, b}
  • q0 is the start state
  • F = {q₂} (accept state)
  • Transition function δ:
    • δ(q0, a) = {q₀, q₁} (nondeterministically stay or advance)
    • δ(q0, b) = {q₀} (stay in start state)
    • δ(q1, a) = ∅ (no valid transition)
    • δ(q1, b) = {q₂} (complete the pattern ab)
    • δ(q2, a) = {q₀, q₁} (might start new pattern)
    • δ(q2, b) = {q₀} (reset to start)

This NFA accepts the same language as the DFA example, but uses nondeterminism to guess when the final ab pattern begins.

Example: NFA Execution Trace

Let's trace how the NFA above processes the string aab:

Input: aab

  1. Initial: Current states = {q₀}

  2. Read a:
    • From q₀ on a: can go to q₀ OR q₁
    • Current states = {q₀, q₁}

  3. Read a:
    • From q₀ on a: can go to q₀ OR q₁
    • From q₁ on a: no valid transitions (∅)
    • Current states = {q₀, q₁}

  4. Read b:
    • From q₀ on b: go to q₀
    • From q₁ on b: go to q₂
    • Current states = {q₀, q₂}

Result: q₂ ∈ {q₀, q₂}, so the string is accepted.

Notice how the NFA explores multiple computation paths simultaneously, accepting if any path leads to an accept state.

Definition: Extended Transition Function

The extended transition function δ̂: Q × Σ* → P(Q) is defined recursively:

  • δ̂(q, ε) = E({q}) where E is the ε-closure function
  • δ̂(q, wa) = ⋃p∈δ̂(q,w) δ(p, a) for w ∈ Σ* and a ∈ Σ

For a set of states S ⊆ Q, we extend this definition as:

δ̂(S, w) = ⋃q∈S δ̂(q, w)

With this definition, we can formally state that an NFA M accepts a string w if and only if δ̂(q0, w) ∩ F ≠ ∅.

Epsilon Transitions and Closures

Definition: Epsilon Closure

For a state q ∈ Q, the ε-closure of q, denoted E(q), is the set of all states reachable from q by following zero or more epsilon transitions. Formally, E(q) is the smallest set S ⊆ Q such that:

  1. q ∈ S (reflexivity)
  2. If r ∈ S and s ∈ δ(r, ε), then s ∈ S (transitivity)

For a set of states S ⊆ Q, we define E(S) = ⋃q∈S E(q).

The ε-closure can be computed using a graph reachability algorithm with time complexity O(|Q| + |E|), where |E| is the number of ε-transitions.

Theorem: Properties of Epsilon Closures

The ε-closure function E: P(Q) → P(Q) satisfies the following properties:

  1. Extensivity: S ⊆ E(S) for all S ⊆ Q
  2. Monotonicity: If S ⊆ T, then E(S) ⊆ E(T)
  3. Idempotence: E(E(S)) = E(S) for all S ⊆ Q

These properties make E a closure operator in the mathematical sense.

Proof: Proof of Epsilon Closure Properties

Proof of Extensivity: By definition, for any q ∈ S, we have q ∈ E(q). Therefore, S ⊆ ⋃q∈S E(q) = E(S).

Proof of Monotonicity: If S ⊆ T, then E(S) = ⋃q∈S E(q) ⊆ ⋃q∈T E(q) = E(T) since we're taking the union over a larger set.

Proof of Idempotence: We need to show E(E(S)) = E(S).

By extensivity, we know E(S) ⊆ E(E(S)).

For the reverse inclusion, we show that E(E(S)) ⊆ E(S). Let r ∈ E(E(S)). Then there exists p ∈ E(S) such that r ∈ E(p). This means there is a sequence of ε-transitions from p to r.

Since p ∈ E(S), there exists q ∈ S such that p ∈ E(q). This means there is a sequence of ε-transitions from q to p.

By concatenating these sequences, we get a sequence of ε-transitions from q to r, which implies r ∈ E(q) ⊆ E(S).

Therefore, E(E(S)) ⊆ E(S), and by the antisymmetry of set inclusion, E(E(S)) = E(S).

Consider an NFA with states {q₀, q₁, q₂, q₃, q₄} and the following ε-transitions:

  • δ(q₀, ε) = {q₁, q₂}
  • δ(q₁, ε) = {q₃}
  • δ(q₂, ε) = ∅
  • δ(q₃, ε) = {q₄}
  • δ(q₄, ε) = ∅

Let's compute E(q₀) step by step:

  1. Initialize: E(q₀) = {q₀} (reflexivity)

  2. Add direct ε-transitions:
    • q₀ →ε q₁ and q₀ →ε q₂
    • E(q₀) = {q₀, q₁, q₂}

  3. Add indirect ε-transitions:
    • From q₁: q₁ →ε q₃
    • From q₃: q₃ →ε q₄
    • E(q₀) = {q₀, q₁, q₂, q₃, q₄}

  4. Check for more transitions: No more ε-transitions possible

Final result: E(q₀) = {q₀, q₁, q₂, q₃, q₄}

This means that from state q₀, we can reach all five states without consuming any input symbols.

Example: Epsilon Closure Calculation

Acceptance and Computation Paths

Definition: Computation Path

For an NFA M = (Q, Σ, δ, q0, F) and a string w = w1w2...wn ∈ Σ*, a computation path for w is a sequence of states r0, r1, ..., rn such that:

  1. r0 ∈ E(q0) (start in the ε-closure of the initial state)
  2. ri ∈ E(δ(ri-1, wi)) for i = 1, 2, ..., n (follow valid transitions including ε-transitions)

A computation path is accepting if rn ∈ F (the final state is an accept state).

Definition: Language Recognition by NFAs

An NFA M = (Q, Σ, δ, q0, F) accepts a string w ∈ Σ* if there exists at least one accepting computation path for w.

The language recognized by M, denoted L(M), is the set of all strings accepted by M:

L(M) = {w ∈ Σ* | δ̂(q0, w) ∩ F ≠ ∅}

Theorem: Equivalence of Computation Path and Extended Transition Function Definitions

For any NFA M = (Q, Σ, δ, q0, F) and any string w ∈ Σ*, the following are equivalent:

  1. There exists an accepting computation path for w
  2. δ̂(q0, w) ∩ F ≠ ∅

Proof: Proof of Definition Equivalence

We prove that for any string w = w1w2...wn, there exists an accepting computation path r0, r1, ..., rn if and only if δ̂(q0, w) ∩ F ≠ ∅.

(⇒) Suppose there exists an accepting computation path r0, r1, ..., rn. We will prove by induction that ri ∈ δ̂(q0, w1w2...wi) for all i = 0, 1, ..., n.

Base case: i = 0. By definition of a computation path, r0 ∈ E(q0) = δ̂(q0, ε).

Inductive step: Assume ri-1 ∈ δ̂(q0, w1w2...wi-1) for some i ≥ 1. By definition of a computation path, ri ∈ E(δ(ri-1, wi)). Using the definition of the extended transition function:

δ̂(q0, w1w2...wi) = ⋃p∈δ̂(q0,w1w2...wi-1) E(δ(p, wi))

Since ri-1 ∈ δ̂(q0, w1w2...wi-1) and ri ∈ E(δ(ri-1, wi)), we have ri ∈ δ̂(q0, w1w2...wi).

By induction, rn ∈ δ̂(q0, w). Since the computation path is accepting, rn ∈ F. Therefore, rn ∈ δ̂(q0, w) ∩ F, which means δ̂(q0, w) ∩ F ≠ ∅.

(⇐) Suppose δ̂(q0, w) ∩ F ≠ ∅. Then there exists rn ∈ δ̂(q0, w) ∩ F. We need to construct an accepting computation path ending at rn.

By the definition of the extended transition function, there must exist a sequence of states r0, r1, ..., rn such that:

  1. r0 ∈ E(q0)
  2. ri ∈ E(δ(ri-1, wi)) for i = 1, 2, ..., n

This is precisely the definition of a computation path. Since rn ∈ F, this is an accepting computation path.

State Complexity of NFAs

Theorem: State Complexity Bounds for NFAs

For certain regular languages, we can establish precise lower bounds on the number of states required in any NFA that recognizes them:

  1. Any NFA recognizing L = {w | w ends with ab} requires at least 2 states
  2. Any NFA recognizing L = {w | w contains the substring an} requires at least n+1 states
  3. Any NFA recognizing L = {w | |w| mod n = 0} requires at least log2(n) states
  4. Any NFA recognizing L = {w | the n-th symbol from the end is a} requires at least 2n-1 states

Proof: Proof of State Lower Bound for Strings Ending with 'ab'

We prove that any NFA recognizing L = {w | w ends with ab} requires at least 2 states.

Suppose, for contradiction, that there exists an NFA M with just 1 state, q, that recognizes L. Since ab ∈ L, state q must be an accepting state.

Now consider the string a. Since a ∉ L, there should be no accepting computation path for a. However, with only one state q, after reading a, the NFA must be in state q (if it can process the string at all). Since q is an accepting state, the NFA would incorrectly accept a.

Therefore, any NFA recognizing L requires at least 2 states.

Theorem: NFA to DFA Conversion State Complexity

For an n-state NFA, the equivalent minimal DFA may require up to 2n states. This bound is tight: for each n ≥ 1, there exists an n-state NFA such that the minimal equivalent DFA requires exactly 2n states.

One such family of languages is Ln = {w ∈ {a,b}*| the nth symbol from the end is a}.

Proof: Proof of Exponential State Blowup

We prove that the language Ln = {w ∈ {a,b}*| the nth symbol from the end is a} can be recognized by an (n+1)-state NFA, but any DFA recognizing it requires at least 2n states.

NFA Construction: We can build an (n+1)-state NFA M = (Q, Σ, δ, q0, F) where:

  • Q = {q0, q1, ..., qn}
  • Σ = {a, b}
  • q0 is the start state
  • F = {qn}
  • The transitions are:
    • δ(q0, a) = {q0, q1} (on a, either stay in q0 or move to q1)
    • δ(q0, b) = {q0} (on b, stay in q0)
    • δ(qi, a) = δ(qi, b) = {qi+1} for i = 1, 2, ..., n-1 (advance through the counter states)

This NFA works by guessing the position of the n-th symbol from the end. When it reads an a, it nondeterministically guesses whether this is the critical position and starts counting the remaining n-1 symbols if so.

DFA Lower Bound: To prove that any DFA needs at least 2n states, we use a fooling set argument. Consider the set of strings S = {w ∈ {a,b}* | |w| = n-1}. There are 2n-1 such strings.

For any two distinct strings u, v ∈ S, there must be a position where they differ. Without loss of generality, assume u has a and v has b at this position.

Now consider the suffix string z that, when appended to u or v, makes the differing position exactly n symbols from the end. Then uz ∈ Ln but vz ∉ Ln.

This means that after reading u and v, any DFA must be in different states, otherwise it couldn't distinguish between uz and vz. Since there are 2n-1 strings in S, the DFA must have at least 2n-1 states.

A more careful analysis using the full language definition shows that 2n states are actually necessary.

Equivalence of NFAs and DFAs

Theorem: Equivalence of NFAs and DFAs

For any NFA N, there exists a DFA D such that L(N) = L(D). Conversely, any DFA is already an NFA by definition. Therefore, NFAs and DFAs recognize exactly the same class of languages.

Definition: Subset Construction Algorithm

Given an NFA N = (Q, Σ, δ, q0, F), we construct an equivalent DFA D = (Q', Σ, δ', q0', F') as follows:

  1. Q' = P(Q) (the power set of Q)
  2. q0' = E({q0}) (the ε-closure of the start state)
  3. F' = {S ∈ Q' | S ∩ F ≠ ∅} (sets containing at least one accepting state)
  4. For each S ∈ Q' and a ∈ Σ, define δ'(S, a) = E(⋃q∈S δ(q, a))

This construction has a worst-case time and space complexity of O(2|Q| · |Σ|).

Proof: Correctness of the Subset Construction

We prove that the DFA D constructed by the subset construction algorithm accepts exactly the same language as the original NFA N.

We will show that for any string w ∈ Σ*, δ'(q0', w) = δ̂(q0, w), where δ̂ is the extended transition function of the NFA.

Proof by induction on the length of w:

Base case: w = ε (the empty string)

δ'(q0', ε) = q0' = E({q0}) = δ̂(q0, ε)

Inductive step: Assume that for some string x ∈ Σ*, δ'(q0', x) = δ̂(q0, x). We need to show that for any symbol a ∈ Σ, δ'(q0', xa) = δ̂(q0, xa).

By the definition of δ':

δ'(q0', xa) = δ'(δ'(q0', x), a) = δ'(δ̂(q0, x), a) (by the induction hypothesis)

= E(⋃q∈δ̂(q0,x) δ(q, a)) (by the definition of δ')

By the definition of the extended NFA transition function:

δ̂(q0, xa) = E(⋃q∈δ̂(q0,x) δ(q, a))

Therefore, δ'(q0', xa) = δ̂(q0, xa).

By induction, for all w ∈ Σ*, δ'(q0', w) = δ̂(q0, w).

Now, w ∈ L(D) if and only if δ'(q0', w) ∈ F', which by definition of F' happens if and only if δ'(q0', w) ∩ F ≠ ∅. But δ'(q0', w) = δ̂(q0, w), so this is equivalent to δ̂(q0, w) ∩ F ≠ ∅, which means w ∈ L(N).

Therefore, L(D) = L(N).

Example: Subset Construction Step-by-Step

Let's convert our strings ending with ab NFA to a DFA using the subset construction algorithm.


Given NFA: N = ({q₀, q₁, q₂}, {a, b}, δ, q₀, {q₂}) with transitions from the previous example.

Step 1: Initial state

  • DFA start state: q₀' = E({q₀}) = {q₀} (no ε-transitions)

Step 2: Compute transitions from {q₀}

  • On a: δ'({q₀}, a) = E(δ(q₀, a)) = E({q₀, q₁}) = {q₀, q₁}
  • On b: δ'({q₀}, b) = E(δ(q₀, b)) = E({q₀}) = {q₀}

Step 3: Compute transitions from {q₀, q₁}

  • On a: δ'({q₀, q₁}, a) = E(δ(q₀, a) ∪ δ(q₁, a)) = E({q₀, q₁} ∪ ∅) = {q₀, q₁}
  • On b: δ'({q₀, q₁}, b) = E(δ(q₀, b) ∪ δ(q₁, b)) = E({q₀}{q₂}) = {q₀, q₂}

Step 4: Compute transitions from {q₀, q₂}

  • On a: δ'({q₀, q₂}, a) = E(δ(q₀, a) ∪ δ(q₂, a)) = E({q₀, q₁}{q₀, q₁}) = {q₀, q₁}
  • On b: δ'({q₀, q₂}, b) = E(δ(q₀, b) ∪ δ(q₂, b)) = E({q₀}{q₀}) = {q₀}

Step 5: Determine accept states

  • {q₀}: Not accepting (doesn't contain q₂)
  • {q₀, q₁}: Not accepting (doesn't contain q₂)
  • {q₀, q₂}: Accepting (contains q₂)

Resulting DFA: 3 states {q₀}, {q₀, q₁}, {q₀, q₂} with accept state {q₀, q₂}. This matches the structure of our original DFA example, demonstrating the equivalence.

Epsilon-Elimination

Theorem: ε-Elimination State Complexity

For any n-state NFA N with ε-transitions, there exists an equivalent n-state NFA N' without ε-transitions.

However, when an NFA contains ε-cycles (cycles composed solely of ε-transitions), the minimal equivalent NFA without ε-transitions may require up to 2n states.

Definition: ε-Elimination Algorithm

Given an NFA N = (Q, Σ, δ, q0, F) with ε-transitions, we can construct an equivalent NFA N' = (Q, Σ, δ', q0, F') without ε-transitions as follows:

  1. Compute the ε-closure E(q) for each state q ∈ Q
  2. For each state q ∈ Q and symbol a ∈ Σ, define δ'(q, a) = ⋃p∈E(q) δ(p, a)
  3. Define F' = {q ∈ Q | E(q) ∩ F ≠ ∅} (states with ε-paths to accepting states)

This algorithm has time complexity O(|Q|2 · |Σ|).

Proof: Correctness of ε-Elimination

We prove that the NFA N' constructed by the ε-elimination algorithm accepts exactly the same language as the original NFA N.

We will show that for any string w ∈ Σ* and any state q ∈ Q, δ̂'(q, w) = ⋃p∈E(q) δ̂(p, w), where δ̂ and δ̂' are the extended transition functions of N and N', respectively.

Proof by induction on the length of w:

Base case: w = ε (the empty string)

δ̂'(q, ε) = {q} (by definition of δ̂' for an NFA without ε-transitions)

p∈E(q) δ̂(p, ε) = ⋃p∈E(q) E(p) = E(q) (by definition of δ̂ and E)

These are not equal in general. However, for the purpose of language recognition, we only care about acceptance, which depends on δ̂'(q0, w) ∩ F' for N' and δ̂(q0, w) ∩ F for N.

By definition of F', q ∈ F' if and only if E(q) ∩ F ≠ ∅. Therefore, {q} ∩ F' ≠ ∅ if and only if E(q) ∩ F ≠ ∅.

So for w = ε, δ̂'(q, w) ∩ F' ≠ ∅ if and only if E(q) ∩ F ≠ ∅, which is equivalent to δ̂(q, w) ∩ F ≠ ∅.

Inductive step: Assume that for some string x ∈ Σ* and for all q ∈ Q, δ̂'(q, x) ∩ F' ≠ ∅ if and only if p∈E(q) δ̂(p, x) ∩ F ≠ ∅.

We need to show that for any symbol a ∈ Σ, δ̂'(q, xa) ∩ F' ≠ ∅ if and only if p∈E(q) δ̂(p, xa) ∩ F ≠ ∅.

By the definition of δ̂':

δ̂'(q, xa) = ⋃r∈δ̂'(q,x) δ'(r, a)

= ⋃r∈δ̂'(q,x)s∈E(r) δ(s, a) (by the definition of δ')

By the definition of δ̂:

p∈E(q) δ̂(p, xa) = ⋃p∈E(q)r∈δ̂(p,x) E(δ(r, a))

Through careful analysis and using the properties of ε-closures, we can show that δ̂'(q, xa) ∩ F' ≠ ∅ if and only if p∈E(q) δ̂(p, xa) ∩ F ≠ ∅.

Therefore, by induction, L(N') = L(N).

Special Classes of NFAs

Definition: Two-Way Nondeterministic Finite Automaton (2NFA)

A two-way nondeterministic finite automaton (2NFA) is a 6-tuple (Q, Σ, δ, q0, F, ⊢, ⊣) where:

  • Q, Σ, q0, F are defined as in a standard NFA
  • ⊢, ⊣ are special left and right endmarker symbols not in Σ
  • δ: Q × (Σ ∪ {⊢, ⊣}) → P(Q × {-1, 0, +1}) is the transition function

If (q', d) ∈ δ(q, a), the automaton in state q reading symbol a can move to state q' and move the input head in direction d (-1 for left, 0 for no movement, +1 for right).

Theorem: Equivalence of 2NFAs and NFAs

For any 2NFA, there exists an equivalent NFA that recognizes the same language. In particular:

  1. Every language recognized by a 2NFA is regular
  2. If a language is recognized by an n-state 2NFA, then it is recognized by an NFA with 2O(n2) states

Definition: Unambiguous Nondeterministic Finite Automaton (UFA)

An unambiguous nondeterministic finite automaton (UFA) is an NFA where for every accepted string w ∈ L(M), there exists exactly one accepting computation path.

Formally, for an NFA M = (Q, Σ, δ, q0, F) to be unambiguous, for every string w = w1w2...wn ∈ L(M), there must be exactly one sequence of states r0, r1, ..., rn such that:

  1. r0 = q0
  2. ri ∈ δ(ri-1, wi) for i = 1, 2, ..., n
  3. rn ∈ F

Theorem: State Complexity of UFAs

The following state complexity relationships hold for UFAs:

  1. Every n-state DFA is already a UFA with n states
  2. There exist regular languages where the minimal UFA requires exponentially fewer states than the minimal DFA
  3. There exist regular languages where the minimal UFA requires exponentially more states than the minimal NFA
  4. Determining whether an NFA is unambiguous is PSPACE-complete

Closure Properties of NFAs

Theorem: Closure Properties of NFAs

The class of languages recognized by NFAs is closed under the following operations:

  • Union
  • Concatenation
  • Kleene star
  • Intersection
  • Complement
  • Reversal
  • Homomorphism
  • Inverse homomorphism

For the first three operations (union, concatenation, Kleene star), there exist efficient constructions that directly build an NFA for the resulting language.

Proof: Closure Under Union

Given NFAs M1 = (Q1, Σ, δ1, q1, F1) and M2 = (Q2, Σ, δ2, q2, F2), we construct an NFA M = (Q, Σ, δ, q0, F) for L(M1) ∪ L(M2):

  • Q = Q1 ∪ Q2{q0} where q0 is a new start state
  • δ combines δ1 and δ2, plus:
    • δ(q0, ε) = {q1, q2} (epsilon transitions to both original start states)
  • F = F1 ∪ F2 (accept if either original NFA would accept)

This construction has |Q1| + |Q2| + 1 states and preserves the structure of both original NFAs.

Correctness: A string w is accepted by M if and only if there exists an accepting computation path from q0. From q0, the NFA can move to either q1 or q2 via ε-transitions, and then follow the computation in either M1 or M2. Thus, w is accepted by M if and only if w is accepted by either M1 or M2, which means w ∈ L(M1) ∪ L(M2).

Proof: Closure Under Concatenation

Given NFAs M1 = (Q1, Σ, δ1, q1, F1) and M2 = (Q2, Σ, δ2, q2, F2), we construct an NFA M for L(M1) · L(M2):

  • Q = Q1 ∪ Q2
  • The start state is q1
  • F = F2 if ε ∉ L(M1), otherwise F = F1 ∪ F2
  • δ combines δ1 and δ2, plus:
    • For each q ∈ F1, add δ(q, ε) = δ(q, ε) ∪ {q2}

This construction has |Q1| + |Q2| states and creates ε-transitions from every accepting state of M1 to the start state of M2.

Correctness: A string w is in L(M1) · L(M2) if and only if w = xy where x ∈ L(M1) and y ∈ L(M2). In the NFA M, after reading x, the automaton can be in some accepting state of M1, from which it can move to q2 via an ε-transition and then process y according to M2. The special case for ε ∈ L(M1) ensures that if x can be empty, then w can be accepted solely by M2.

Proof: Closure Under Kleene Star

Given an NFA M = (Q, Σ, δ, q0, F), we construct an NFA M* for L(M)*:

  • Q' = Q ∪ {q'0} where q'0 is a new start state
  • The start state is q'0
  • F' = F ∪ {q'0}
  • δ' extends δ with:
    • δ'(q'0, ε) = {q0}
    • For each q ∈ F, add δ'(q, ε) = δ(q, ε) ∪ {q0}

This construction has |Q| + 1 states and allows for zero or more repetitions of strings in L(M).

Correctness: The new start state q'0 is accepting, which allows M* to accept the empty string (representing zero repetitions). For one or more repetitions, the NFA can move from q'0 to q0 via an ε-transition, process a string in L(M), reach an accepting state in F, and then either accept or move back to q0 to process another string in L(M). This precisely captures the behavior of the Kleene star operation.

Theorem: Optimality of NFA Closure Constructions

The standard constructions for union, concatenation, and Kleene star operations are state-optimal in the following sense:

  1. For union of NFAs with n1 and n2 states, the minimal NFA for the union requires at least n1 + n2 - 1 states in the worst case
  2. For concatenation of NFAs with n1 and n2 states, the minimal NFA for the concatenation requires at least n1 + n2 states in the worst case
  3. For the Kleene star of an NFA with n states, the minimal NFA for the star requires at least n + 1 states in the worst case

These lower bounds match the upper bounds provided by the standard constructions.

Computational Complexity of NFA Operations

Theorem: Computational Complexity of NFA Operations

The following computational complexity results hold for fundamental NFA operations:

  1. String Acceptance: Determining if w ∈ L(M) for an NFA M is NL-complete (nondeterministic logarithmic space)
  2. Equivalence Testing: Determining if L(M1) = L(M2) for NFAs M1 and M2 is PSPACE-complete
  3. Universality: Determining if L(M) = Σ* for an NFA M is PSPACE-complete
  4. Minimization: Computing the minimal NFA for a given regular language is PSPACE-complete
  5. Subset Construction: Converting an NFA to a DFA has worst-case time and space complexity O(2n), where n is the number of states in the NFA

Proof: NL-Completeness of NFA Acceptance

We prove that the problem of determining if w ∈ L(M) for an NFA M is NL-complete.

NL-hardness: The PATH problem (determining if there is a path from vertex s to vertex t in a directed graph) is a known NL-complete problem. We can reduce PATH to NFA acceptance: given a graph G = (V, E) and vertices s, t, construct an NFA M = (V, {a}, δ, s, {t}) where δ(u, a) = {v | (u, v) ∈ E}. Then a ∈ L(M) if and only if there is a path from s to t in G.

NL algorithm: To determine if w ∈ L(M) for an NFA M = (Q, Σ, δ, q0, F) and string w = w1w2...wn, a nondeterministic Turing machine can:

  1. Guess a sequence of states r0, r1, ..., rn
  2. Verify that r0 = q0
  3. For each i = 1, 2, ..., n, verify that ri ∈ δ(ri-1, wi)
  4. Verify that rn ∈ F

This algorithm needs only logarithmic space to store the current and next state (which can be encoded in O(log |Q|) bits) and the current position in the input (which can be encoded in O(log |w|) bits).

Therefore, NFA acceptance is NL-complete.

Descriptional Complexity Results

Theorem: Descriptional Complexity of NFAs

The following state complexity results hold for NFAs under various operations:

  1. For NFAs with n1 and n2 states, the state complexity of union is n1 + n2 + 1
  2. For NFAs with n1 and n2 states, the state complexity of concatenation is n1 + n2
  3. For an NFA with n states, the state complexity of Kleene star is n + 1
  4. For an NFA with n states, the state complexity of reversal is n + 1
  5. For an NFA with n states, the state complexity of complementation can be as high as 2n
  6. For NFAs with n1 and n2 states, the state complexity of intersection can be as high as n1 · n2

Theorem: Magic Numbers in State Complexity

Certain "magic numbers" appear in the state complexity of combined operations on NFAs:

  1. For an NFA with n states, the minimal NFA for (L*)* has exactly n + 1 states (same as for L*)
  2. For an NFA with n states, the minimal NFA for (L ∪ {ε})* has exactly n + 1 states (same as for L*)
  3. For NFAs with m and n states, the minimal NFA for (L1L2)* may require up to m·n + 1 states
  4. For NFAs with m and n states, the minimal NFA for (L1 ∪ L2)* may require up to 2m+n-1 + 1 states

Summary

  • Formal Definition: NFAs extend DFAs with multiple possible transitions per symbol and epsilon transitions
  • Mathematical Foundation: The transition function maps to the power set of states, enabling nondeterministic choices
  • Epsilon Closures: A critical mathematical tool for handling epsilon transitions with well-defined properties
  • State Complexity: NFAs can be exponentially more concise than DFAs, with precise bounds for various languages
  • Equivalence: NFAs and DFAs are equivalent in expressive power, with conversion possible via the subset construction
  • Special Classes: Two-way NFAs and unambiguous NFAs provide alternative models with different complexity characteristics
  • Closure Properties: NFAs are closed under union, concatenation, Kleene star, and other operations with efficient constructions
  • Computational Complexity: Many decision problems involving NFAs are complete for important complexity classes
  • Descriptional Complexity: Precise bounds exist for the state complexity of NFA operations, including "magic numbers" for combined operations

Suggested Reading

  • Sipser, Introduction to the Theory of Computation – Chapter 1.2: Nondeterminism
  • Hopcroft, Motwani, and Ullman, Introduction to Automata Theory, Languages, and Computation – Chapter 2.3: Nondeterministic Finite Automata
  • Yu, S., State Complexity of Regular Languages – Journal of Automata, Languages and Combinatorics
  • Holzer, M. and Kutrib, M., Nondeterministic Finite Automata—Recent Results on the Descriptional and Computational Complexity – International Journal of Foundations of Computer Science
  • Jirásková, G., State Complexity of Some Operations on Binary Regular Languages – Theoretical Computer Science
  • Chrobak, M., Finite Automata and Unary Languages – Theoretical Computer Science