A definitive reference

Advanced Undecidability

Building on our understanding of the basic undecidable problems, this module explores more sophisticated undecidability results and the powerful technique of reduction, which allows us to transfer undecidability from one problem to another.

We first examine the Halting Problem in depth, providing a comprehensive proof of its undecidability. We then investigate various types of reductions and use them to establish the undecidability of a gallery of problems including the acceptance problem, the emptiness problem, and the Post Correspondence Problem.

Finally, we present Rice's Theorem, a powerful general result that shows the undecidability of a wide class of problems involving properties of Turing Machine languages, and explore its far-reaching consequences for program analysis and verification.

The Halting Problem in Depth

While the Halting Problem was introduced in the Turing Machines module, here we provide a more comprehensive analysis, including alternative proofs and implications.

Formal Definition

Definition: The Halting Problem

The Halting Problem is the decision problem of determining, given a Turing Machine M and an input w, whether M halts when run on w. Formally, it is the language:

HALTTM = {⟨M, w⟩ | M is a TM and M halts on input w}

Where ⟨M, w⟩ represents an encoding of the Turing Machine M and the input string w.

Classic Proof of Undecidability

The classic proof that the Halting Problem is undecidable uses a technique known as diagonalization, similar to Cantor's proof that the real numbers are uncountable.

Theorem: Undecidability of the Halting Problem

The language HALTTM is undecidable. That is, there is no Turing Machine that, given ⟨M, w⟩, always halts and correctly determines whether M halts on input w.

Proof: Proof by Contradiction and Diagonalization

Suppose, for contradiction, that there exists a Turing Machine H that decides HALTTM. So for any Turing Machine M and input w:

  • H(⟨M, w⟩) = "yes" if M halts on w
  • H(⟨M, w⟩) = "no" if M does not halt on w

Now, we construct a new Turing Machine D that works as follows:

  1. On input ⟨M⟩ (a Turing Machine encoding)
  2. Run H(⟨M, ⟨M⟩⟩) to determine if M halts when given its own encoding as input
  3. If H says "yes" (i.e., M halts on ⟨M⟩), then D enters an infinite loop
  4. If H says "no" (i.e., M doesn't halt on ⟨M⟩), then D halts

Now consider what happens when we run D on its own encoding ⟨D⟩:

  • If D halts on ⟨D⟩, then H(⟨D, ⟨D⟩⟩) = "yes", which means D enters an infinite loop on ⟨D⟩ and doesn't halt
  • If D doesn't halt on ⟨D⟩, then H(⟨D, ⟨D⟩⟩) = "no", which means D halts on ⟨D⟩

In either case, we have a contradiction: D halts on ⟨D⟩ if and only ifD doesn't halt on ⟨D⟩. This contradicts our assumption that H exists. Therefore, HALTTM is undecidable.

Alternative Proof Approaches

There are several alternative approaches to proving the undecidability of the Halting Problem, each providing different insights.

Proof Using Fixed Points

An elegant approach using Kleene's Recursion Theorem:

Proof: Proof Using the Recursion Theorem

Suppose, for contradiction, that there exists a Turing Machine H that decides the Halting Problem.

Define a computable function f that takes a Turing Machine encoding ⟨M⟩ and produces a new Turing Machine that:

  1. On any input, simulates H(⟨M, ⟨M⟩⟩)
  2. If H says M halts on ⟨M⟩, enters an infinite loop
  3. If H says M doesn't halt on ⟨M⟩, halts immediately

By Kleene's Recursion Theorem, there exists a fixed point P such that P is equivalent to f(⟨P⟩).

Now, does P halt on its own code ⟨P⟩?

  • If P halts on ⟨P⟩, then H(⟨P, ⟨P⟩⟩) outputs "yes," so P enters an infinite loop on any input, which contradicts P halting on ⟨P⟩
  • If P doesn't halt on ⟨P⟩, then H(⟨P, ⟨P⟩⟩) outputs "no," so P halts immediately on any input, which contradicts P not halting on ⟨P⟩

This contradiction shows that H cannot exist, so the Halting Problem is undecidable.

Variants of the Halting Problem

There are several variants of the Halting Problem, all of which are undecidable:

The Self-Halting Problem

Definition: The Self-Halting Problem

The Self-Halting Problem is the decision problem of determining, given a Turing Machine M, whether M halts when run on its own encoding ⟨M⟩. Formally, it is the language:

SELFHALT = {⟨M⟩ | M is a TM and M halts on input ⟨M⟩}

Theorem: Undecidability of the Self-Halting Problem

The language SELFHALT is undecidable.

The Blank Tape Halting Problem

Definition: The Blank Tape Halting Problem

The Blank Tape Halting Problem is the decision problem of determining, given a Turing Machine M, whether M halts when run on the empty input ε. Formally, it is the language:

BLANKHALT = {⟨M⟩ | M is a TM and M halts on the empty input}

Theorem: Undecidability of the Blank Tape Halting Problem

The language BLANKHALT is undecidable.

Implications and Significance

The undecidability of the Halting Problem has profound implications across computer science and mathematics:

  • Program Analysis: It establishes fundamental limits on what we can determine about programs' behavior before running them. General program verification, optimization, and bug detection all face inherent limitations.
  • Mathematical Logic: It connects to Gödel's Incompleteness Theorems, showing limitations of formal systems.
  • Computability Theory: It serves as the starting point for proving other problems undecidable through reduction techniques.
  • Computer Security: It implies that perfect virus detection is impossible, as no algorithm can determine for all programs whether they contain malicious behavior.
  • Compiler Design: It shows that certain optimizations, like determining if a piece of code is redundant because it never terminates, cannot be performed in general.

The significance of the Halting Problem lies not just in what it tells us is impossible, but in how it shapes our understanding of computation, algorithms, and the limitations of automated reasoning.

Reduction Techniques

Reduction is a powerful technique for proving problems undecidable. By showing that an already-known undecidable problem (like the Halting Problem) can be reduced to another problem, we can prove that the second problem is also undecidable.

Types of Reductions

There are several types of reductions used in computability theory:

Definition: Many-One Reduction (Mapping Reduction)

A language A is many-one reducible (or mapping reducible) to a language B, denoted A ≤m B, if there exists a computable function f: Σ* → Σ* such that for all x ∈ Σ*:

x ∈ A if and only if f(x) ∈ B

Definition: Turing Reduction

A language A is Turing reducible to a language B, denoted A ≤T B, if there exists an oracle Turing Machine MB that decides A.

An oracle Turing Machine MB is a Turing Machine with access to an oracle for B. The oracle can answer membership queries for B in a single step.

Properties of Reductions

Reductions have several important properties that make them useful for proving undecidability:

Theorem: Preservation of Decidability

If A ≤m B (or A ≤T B) and B is decidable, then A is decidable.

Equivalently, if A ≤m B (or A ≤T B) and A is undecidable, then B is undecidable.

Lemma: Transitivity of Reductions

If A ≤m B and B ≤m C, then A ≤m C.

Similarly, if A ≤T B and B ≤T C, then A ≤T C.

Reduction Strategies

When constructing reductions to prove undecidability, several strategies are commonly employed:

  1. Simulation: Embed the computation of one problem within another. For example, to reduce the Halting Problem to another problem, embed the simulation of a Turing Machine within an instance of the target problem.
  2. Encoding: Represent the structure and behavior of one problem using the constructs of another problem. This is common when reducing to problems with rich structural elements like grammars or automata.
  3. Problem Modification: Transform a known undecidable problem by adding constraints or modifications that match the target problem's definition.
  4. Contraposition: Instead of directly reducing problem A to B, show that if B were decidable, you could use it to decide A (which is known to be undecidable).

Gallery of Undecidable Problems

Using reduction techniques, we can establish the undecidability of a wide variety of problems beyond the Halting Problem. Here we explore several important undecidable problems and sketch the reductions that prove their undecidability.

The Acceptance Problem

Definition: The Acceptance Problem

The Acceptance Problem is the decision problem of determining, given a Turing Machine M and an input w, whether M accepts w. Formally, it is the language:

ATM = {⟨M, w⟩ | M is a TM and M accepts w}

Theorem: Undecidability of the Acceptance Problem

The language ATM is undecidable.

Proof: Proof by Reduction from the Halting Problem

We will show that HALTTMm ATM, which implies that ATM is undecidable.

Given an instance ⟨M, w⟩ of HALTTM, we construct a Turing Machine M' that:

  1. On any input, simulates M on w
  2. If M halts on w (either by accepting or rejecting), M' accepts
  3. If M doesn't halt on w, then M' also doesn't halt

Now, M halts on w if and only if M' accepts any input.

Let f(⟨M, w⟩) = ⟨M', x⟩ for any fixed string x. Then ⟨M, w⟩ ∈ HALTTM if and only if ⟨M', x⟩ ∈ ATM.

This establishes a many-one reduction from HALTTM to ATM, proving that ATM is undecidable.

The Emptiness Problem

Definition: The Emptiness Problem

The Emptiness Problem is the decision problem of determining, given a Turing Machine M, whether the language recognized by M is empty. Formally, it is the language:

ETM = {⟨M⟩ | M is a TM and L(M) = ∅}

Theorem: Undecidability of the Emptiness Problem

The language ETM is undecidable.

Proof: Proof by Reduction from the Acceptance Problem

We will show that ATMm ETM, which implies that ETM is undecidable.

Given an instance ⟨M, w⟩ of ATM, we construct a Turing Machine M' that:

  1. On any input, first erases its tape and writes w
  2. Then simulates M on w
  3. Accepts if and only if M accepts w

If M accepts w, then M' accepts every input, so L(M') is not empty.

If M doesn't accept w (either by rejecting or running forever), then M' doesn't accept any input, so L(M') is empty.

Let f(⟨M, w⟩) = ⟨M'⟩. Then ⟨M, w⟩ ∈ ATM if and only if ⟨M'⟩ ∉ ETM.

This establishes a many-one reduction from ATM to the complement of ETM. Since ATM is undecidable, the complement of ETM is undecidable, which means ETM itself is undecidable.

The Equality Problem

Definition: The Equality Problem for Turing Machines

The Equality Problem is the decision problem of determining, given two Turing Machines M₁ and M₂, whether they recognize the same language. Formally, it is the language:

EQTM = {⟨M₁, M₂⟩ | M₁ and M₂ are TMs and L(M₁) = L(M₂)}

Theorem: Undecidability of the Equality Problem

The language EQTM is undecidable.

The Post Correspondence Problem

Definition: The Post Correspondence Problem (PCP)

The Post Correspondence Problem (PCP) is defined as follows: Given a finite set of pairs of strings{(x₁, y₁), (x₂, y₂), ..., (xₙ, yₙ)}, determine whether there exists a sequence of indices i₁, i₂, ..., iₖ such that xi₁xi₂...xiₖ = yi₁yi₂...yiₖ.

Theorem: Undecidability of the Post Correspondence Problem

The Post Correspondence Problem is undecidable.

Rice's Theorem and Its Consequences

Rice's Theorem is a powerful general result that shows the undecidability of a wide class of problems involving properties of Turing Machine languages. It greatly simplifies proving undecidability for many problems by establishing a general principle.

Statement of Rice's Theorem

Theorem: Rice's Theorem

Let P be any nontrivial property of the language recognized by a Turing Machine. Then the problem of determining whether a given Turing Machine M has property P is undecidable.

A property is nontrivial if there exists at least one Turing Machine that has the property and at least one that doesn't.A property of languages is a function that maps languages to {0, 1} (or equivalently, a set of languages).

Proof of Rice's Theorem

Proof: Proof of Rice's Theorem

Let P be any nontrivial property of languages recognized by Turing Machines. Then there exists a Turing Machine Myes such that L(Myes) has property P, and a Turing Machine Mno such that L(Mno) does not have property P.

Without loss of generality, assume that the empty language ∅ does not have property P(if it does, we can work with the complement of P instead).

We will reduce the Halting Problem to the problem of determining whether a Turing Machine's language has property P.

Given ⟨M, w⟩ from the Halting Problem, we construct a Turing Machine M' that:

  1. On input x, simulates M on w
  2. If M halts on w, M' simulates Myes on x
  3. If M doesn't halt on w, M' rejects immediately (so L(M') = ∅)

Now, if M halts on w, then L(M') = L(Myes), which has property P.

If M doesn't halt on w, then L(M') = ∅, which does not have property P.

Therefore, M halts on w if and only if L(M') has property P.

Since the Halting Problem is undecidable, the problem of determining whether a Turing Machine's language has property P is also undecidable.

Examples of Rice's Theorem Applications

Rice's Theorem immediately implies the undecidability of a wide range of problems:

  • Determining if a Turing Machine:
    • Accepts a specific string
    • Accepts an infinite language
    • Accepts all strings
    • Accepts a regular language
    • Accepts a context-free language
    • Accepts a recursive language
  • Determining if two Turing Machines:
    • Accept the same language
    • Accept languages where one is a subset of the other

All these problems are undecidable because they involve nontrivial properties of the languages recognized by Turing Machines, not properties of the machines themselves.

Practical Implications

Rice's Theorem has profound implications for program analysis, verification, and optimization:

  • Program Analysis: We cannot build a general-purpose analyzer that can determine nontrivial semantic properties of all programs. This includes determining whether a program has bugs, security vulnerabilities, or meets its specification.
  • Compiler Optimization: Certain optimizations, like detecting dead code or proving program equivalence, cannot be performed in general, as they involve properties of program behavior.
  • Program Verification: Fully automatic program verification for arbitrary properties is impossible. This is why practical verification systems either require human guidance or restrict themselves to specific classes of programs.
  • Type Systems: Type systems in programming languages can be seen as decidable approximations of undecidable semantic properties. Rich type systems that attempt to express too many semantic properties eventually become undecidable.

Rice-Shapiro Theorem: A Generalization

A generalization of Rice's Theorem to properties of computable functions is the Rice-Shapiro Theorem, which characterizes the properties that can be positively decided based on finite information.

Theorem: The Rice-Shapiro Theorem

A property P of the functions computed by programs is recursively enumerable if and only if:

  1. P is monotonic: if function f has property P and f ⊆ g (i.e., g extends f), then g has property P
  2. P is compact: if function f has property P, then some finite subfunction of f has property P

RE-completeness and the Recursively Enumerable Languages

Just as NP-completeness classifies the hardest problems in NP, RE-completeness identifies the hardest problems among recursively enumerable languages. These problems are undecidable but recognizable, and all other RE problems can be reduced to them.

Definition of RE-completeness

Definition: RE-completeness

A language L is RE-complete if:

  1. L is recursively enumerable (RE)
  2. For every recursively enumerable language A, A ≤m L (where m denotes many-one reducibility)

Examples of RE-complete Problems

Several key problems are known to be RE-complete:

The Acceptance Problem (ATM)

The Acceptance Problem for Turing Machines (ATM) is RE-complete. Recall that this is the problem of determining, given a Turing Machine M and an input w, whether M accepts w.

Theorem: RE-completeness of the Acceptance Problem

The language ATM = {⟨M, w⟩ | M is a TM and M accepts w} is RE-complete.

Proof: Proof of RE-completeness of the Acceptance Problem

We know ATM is recursively enumerable, as we can simulate M on w and accept if M accepts.

To show that it is RE-complete, we need to prove that for any recursively enumerable language L, L ≤m ATM.

Let L be a recursively enumerable language. Then there exists a Turing Machine ML that recognizes L.

For any input string x, we can map it to the pair ⟨ML, x⟩.

Now, x ∈ L if and only if ML accepts x, which is true if and only if ⟨ML, x⟩ ∈ ATM.

This establishes a many-one reduction from L to ATM, proving that ATM is RE-complete.

The Halting Problem (HALTTM)

The Halting Problem is also RE-complete. This shows that the Halting Problem and the Acceptance Problem are equivalently hard from the perspective of many-one reducibility.

Theorem: RE-completeness of the Halting Problem

The language HALTTM = {⟨M, w⟩ | M is a TM and M halts on w} is RE-complete.

Properties of RE-complete Problems

RE-complete problems share several important properties:

  • Undecidability: All RE-complete problems are undecidable. If any RE-complete problem were decidable, all RE problems would be decidable, which is not the case.
  • Recognizability: By definition, RE-complete problems are recognizable (recursively enumerable).
  • Equivalence: All RE-complete problems are many-one equivalent to each other. This means that one can be transformed into another via a computable function that preserves membership.
  • Complementation: The complement of an RE-complete problem is not recursively enumerable (not in RE). If it were, the problem would be both RE and co-RE, making it decidable.
  • Reducibility: Every recursively enumerable problem can be reduced to any RE-complete problem. This means that RE-complete problems are at least as hard as all other RE problems.

The Structure of Recursively Enumerable Languages

RE-completeness helps us understand the structure of recursively enumerable languages:

  • Decidable Languages: These form a proper subset of the recursively enumerable languages. All decidable languages are trivially also recognizable.
  • Intermediate Languages: There exist recursively enumerable languages that are neither decidable nor RE-complete. These occupy an intermediate position in terms of difficulty.
  • RE-complete Languages: These are the hardest recursively enumerable languages. All other recursively enumerable languages can be reduced to them.

This hierarchy within the recursively enumerable languages parallels the similar hierarchy within NP (P ⊂ NP-intermediate ⊂ NP-complete), though the contexts are different: one deals with undecidability, the other with computational complexity.

Summary

  • The Halting Problem: A fundamental undecidable problem proven through diagonalization that shows the inherent limits of what algorithms can determine about other algorithms.
  • Reduction Techniques: Many-one reductions and Turing reductions provide powerful tools for transferring undecidability results from one problem to another.
  • Gallery of Undecidable Problems: The Acceptance Problem, Emptiness Problem, Equality Problem, and Post Correspondence Problem are all undecidable, each demonstrating different aspects of undecidability.
  • Rice's Theorem: A sweeping result showing that any nontrivial property of the languages recognized by Turing Machines is undecidable, greatly simplifying many undecidability proofs.
  • RE-completeness: Characterizes the hardest problems among recursively enumerable languages, analogous to NP-completeness in complexity theory.
  • Practical Implications: These undecidability results place fundamental limits on program analysis, verification, compiler optimization, and many aspects of software engineering.

Suggested Reading

  • Sipser, Introduction to the Theory of Computation – Chapter 5: Reducibility
  • Sipser, Introduction to the Theory of Computation – Chapter 6: Advanced Topics in Computability Theory
  • Hopcroft, Motwani, and Ullman, Introduction to Automata Theory, Languages, and Computation – Chapter 9: Undecidability
  • Rice, Classes of Recursively Enumerable Sets and Their Decision Problems – Original paper introducing Rice's Theorem
  • Post, A Variant of a Recursively Unsolvable Problem – Original paper on the Post Correspondence Problem
  • Cutland, Computability: An Introduction to Recursive Function Theory