Introduction

Hey everyone! We’ve seen how to check if a context-free grammar (CFG) produces any strings. Now let’s ask a bit more: does it produce a finite number of strings, or is the language infinite? Good news: this one’s also decidable, using a nifty trick involving grammar cycles.

Formal Setup

Recall a CFG

$$ G = (V, Σ, R, S) $$

with language

$$ L(G) = {,w \in Σ^* \mid S \xRightarrow{*} w}. $$

The Finiteness Problem asks whether |L(G)| is finite or infinite.

Example Grammars

# Grammar G1:
# S -> 'a' S | 'b'
# This generates {"b", "ab", "aab", ...}, which is infinite.

# Grammar G2:
# S -> 'a' 'b' | 'b' 'a'
# Only two strings {"ab","ba"}, so finite.

Algorithm (Detecting Infinity via Graphs)

  1. Construct a directed graph $D$ whose vertices are the nonterminals in $V$.

  2. Add an edge $A \to B$ if there is a production $A \to α B β$ for some $α, β$.

  3. Detect if there is any cycle in $D$ reachable from the start symbol $S$ that also leads to a terminal derivation.

    • First, compute the set of generating nonterminals $Gen$ as in the Emptiness algorithm.
    • Then restrict $D$ to vertices in $Gen$.
    • If there is a cycle in this restricted graph reachable from $S$, then $L(G)$ is infinite; otherwise, it’s finite.

This runs in polynomial time using standard graph algorithms.

Why It Matters

  • Language Analysis: Knowing a grammar’s output size can guide parser optimizations.
  • Grammar Design: Helps grammar writers ensure they don’t accidentally allow unbounded recursion.

Wrap-Up

That wraps up finiteness for CFGs, two for two on decidability! Next up: Intersection Emptiness for multiple CFGs, hint: that one’s a bit trickier.