chapter ‹Introduction›
text ‹
The Cook-Levin theorem states that the problem \SAT{} of deciding the
satisfiability of Boolean formulas in conjunctive normal form is
$\NP$-complete~\cite{Cook,Levin}. This article formalizes a proof of this
theorem based on the textbook \emph{Computational Complexity:\ A Modern
Approach} by Arora and Barak~\cite{ccama}.
›
section ‹Outline›
text ‹
We start out in Chapter~\ref{s:TM} with a definition of multi-tape Turing
machines (TMs) slightly modified from Arora and Barak's definition. The
remainder of the chapter is devoted to constructing ever more complex machines
for arithmetic on binary numbers, for evaluating polynomials, and for basic
operations on lists of numbers and even lists of lists of numbers.
Specifying Turing machines and proving their correctness and running-time
bounds is laborious at the best of times. We slightly alleviate the seemingly inevitable
tedium of this by defining elementary reusable Turing machines and introducing
ways of composing them sequentially as well as in if-then-else branches and
while loops. Together with the representation of natural numbers and lists, we
thus get something faintly resembling a structured programming language.

In Chapter~\ref{s:TC} we introduce some basic concepts of complexity theory,
such as $\mathcal{P}$, $\NP$, and polynomial-time many-one reduction. Following
Arora and Barak, the complexity class $\NP$ is defined via verifier Turing
machines rather than nondeterministic machines, and so the deterministic TMs
introduced in the previous chapter suffice for all definitions. To flesh out the
chapter a little, we formalize obvious proofs of $\mathcal{P} \subseteq \NP$ and
the transitivity of the reducibility relation, although neither result is needed
for proving the Cook-Levin theorem.
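
For orientation, these are the standard textbook definitions~\cite[ch.~2]{ccama}
that Chapter~\ref{s:TC} formalizes (possibly differing in details): a language
$L \subseteq \{0,1\}^*$ is in $\NP$ iff there are a polynomial $p$ and a
polynomial-time TM $M$, the verifier, such that for every string $x$,
\[
  x \in L \longleftrightarrow \exists u \in \{0,1\}^{p(|x|)}.\ M(x, u) = 1;
\]
and $L$ is polynomial-time many-one reducible to $L'$ iff there is a
polynomial-time computable function $f$ with
$x \in L \longleftrightarrow f(x) \in L'$ for all $x$.
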
Chapter~\ref{s:Sat} introduces the problem \SAT{} as a language over bit
strings. Boolean formulas in conjunctive normal form (CNF) are represented as
lists of clauses, each clause being a list of literals encoded as binary numbers.
The list of lists of numbers ``data type'' defined in Chapter~\ref{s:TM} will
come in handy at this point.
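
To make this concrete, here is a hypothetical sketch of the abstract ``data
types'' involved (the actual definitions appear in Section~\ref{s:CNF} and may
differ in details); the binary encoding of literals then only concerns the
string representation of such a formula:

\begin{verbatim}
(* Hypothetical sketch of CNF formulas as nested lists.
   A literal is a signed variable index. *)
datatype literal = Neg nat | Pos nat
type_synonym clause = "literal list"
type_synonym formula = "clause list"
\end{verbatim}
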
The proof of the Cook-Levin theorem has two parts: Showing that \SAT{} is in
$\NP$ and showing that \SAT{} is $\NP$-hard, that is, that every language in
$\NP$ can be reduced to \SAT{} in polynomial time. The first part, also proved
in Chapter~\ref{s:Sat}, is fairly easy: For a satisfiable CNF formula, a
satisfying assignment can be given in roughly the size of the formula, because
only the variables in the formula need be assigned a truth value. Moreover,
whether an assignment satisfies a CNF formula can be verified easily.
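
Continuing the hypothetical sketch from above (again, the actual definitions
live in Section~\ref{s:CNF}), the check is essentially two nested list
traversals, one over the clauses and one over each clause's literals, which is
easily polynomial:

\begin{verbatim}
(* An assignment maps variable indices to truth values.  A CNF
   formula is satisfied iff every clause has a literal that
   evaluates to True under the assignment. *)
definition satisfies :: "(nat => bool) => formula => bool" where
  "satisfies a phi == (ALL c : set phi. EX l : set c.
     (case l of Pos i => a i | Neg i => ~ a i))"
\end{verbatim}

Only the finitely many variables occurring in the formula influence this
predicate, which is why an assignment restricted to them serves as a short
certificate.
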
The hard part is showing the $\NP$-hardness of \SAT{}. The first step
(Chapter~\ref{s:oblivious}) is to show that every polynomial-time computation on
a multi-tape TM can be performed in polynomial time on a two-tape
\emph{oblivious} TM. Oblivious means that the sequence of positions of the
Turing machine's tape heads depends only on the \emph{length} of the input.
Thus any language in $\NP$ has a polynomial-time two-tape oblivious verifier TM.
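
Spelled out (my notation, not the formal definition of
Chapter~\ref{s:oblivious}): writing $\mathit{pos}_j(M, x, t)$ for the position
of $M$'s $j$-th tape head after $t$ steps on input $x$, the machine $M$ is
oblivious iff
\[
  |x| = |y| \Longrightarrow \mathit{pos}_j(M, x, t) = \mathit{pos}_j(M, y, t)
  \quad \mbox{for all inputs } x, y, \mbox{ all steps } t,
  \mbox{ and all tapes } j.
\]
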
In Chapter~\ref{s:Reducing} the proof goes on to show how the computations of
such a machine can be mapped to CNF formulas such that a formula is
satisfiable if and only if the underlying computation was for a string in the
language paired with a certificate. Finally, in Chapters~\ref{s:Aux_TM}
and~\ref{s:Red_TM} we construct a Turing machine that carries out the
reduction in polynomial time.
›
section ‹Related work›
text ‹
The Cook-Levin theorem has been formalized before. Gamboa and
Cowles~\cite{Gamboa2004AMP} present a formalization in ACL2~\cite{acl2}. They
formalize $\NP$ and reducibility in terms of Turing machines, but analyze the
running time of the reduction from $\NP$ languages to \SAT{} in a
different, somewhat ad hoc model of computation, which they themselves call
``the major weakness'' of their formalization.

Employing Coq~\cite{coq}, Gäher and Kunze~\cite{Gher2021MechanisingCT} define
$\NP$ and reducibility in the computational model ``call-by-value
$\lambda$-calculus L'' introduced by Forster and
Smolka~\cite{Forster2017WeakCL}. They show the $\NP$-completeness of \SAT{} in
this framework. Turing machines appear in an intermediate problem in the chain
of reductions from $\NP$ languages to \SAT{}, but are not used to show the
polynomiality of the reduction. Nevertheless, this is likely the first
formalization of the Cook-Levin theorem where both the complexity-theoretic
concepts and the proof of the polynomiality of the reduction use the same model
of computation.

With regard to Isabelle, Xu~et al.~\cite{Universal_Turing_Machine-AFP} provide
a formalization of single-tape Turing machines with a fixed binary alphabet in
the computability theory setting and construct a universal TM. While I was
putting the finishing touches on this article, Dalvit and
Thiemann~\cite{Multitape_To_Singletape_TM-AFP} published a formalization of
(deterministic and nondeterministic) multi-tape and single-tape Turing machines
and showed how to simulate the former on the latter with quadratic slowdown.
Moreover, Thiemann and Schmidinger~\cite{Multiset_Ordering_NPC-AFP} prove the
$\NP$-completeness of the Multiset Ordering problem, without, however, proving
the polynomial-time computability of the reduction.

This article uses Turing machines as the model of computation both for the
complexity-theoretic concepts and for the running-time analysis of the
reduction. It is thus most similar to Gäher and Kunze's work, but has a more
elementary, if not brute-force, flavor to it.
›
section ‹The core concepts›
text ‹
The proof of the Cook-Levin theorem awaits us in Section~\ref{s:complete} on the
very last page of this article. The way there is filled with definitions of
Turing machines, correctness proofs for Turing machines, and proofs of
running-time bounds for Turing machines, all of which can easily drown out the more relevant
concepts. For instance, for verifying that the theorem on the last page really
is the Cook-Levin theorem, only a small fraction of this article is relevant,
namely the definitions of $\NP$-completeness and of \SAT{}. Recursively breaking
down these definitions yields:
\begin{itemize}
\item $\NP$-completeness: Section~\ref{s:TC-NP}
\begin{itemize}
\item languages: Section~\ref{s:TC-NP}
\item $\NP$-hard: Section~\ref{s:TC-NP}
\begin{itemize}
\item $\NP$: Section~\ref{s:TC-NP}
\begin{itemize}
\item Turing machines: Section~\ref{s:tm-basic-tm}
\item computing a function: Section~\ref{s:tm-basic-comp}
\item pairing strings: Section~\ref{s:tm-basic-pair}
\item Big-Oh, polynomial: Section~\ref{s:tm-basic-bigoh}
\end{itemize}
\item polynomial-time many-one reduction: Section~\ref{s:TC-NP}
\end{itemize}
\end{itemize}
\item \SAT{}: Section~\ref{s:sat-sat-repr}
\begin{itemize}
\item literal, clause, CNF formula, assignment, satisfiability: Section~\ref{s:CNF}
\item representing CNF formulas as strings: Section~\ref{s:sat-sat-repr}
\begin{itemize}
\item string: Section~\ref{s:tm-basic-tm}
\item CNF formula: Section~\ref{s:CNF}
\item mapping between symbols and strings: Section~\ref{s:tm-basic-comp}
\item mapping between binary and quaternary alphabets: Section~\ref{s:tm-quaternary-encoding}
\item lists of lists of natural numbers: Section~\ref{s:tm-numlistlist-repr}
\begin{itemize}
\item binary representation of natural numbers: Section~\ref{s:tm-arithmetic-binary}
\item lists of natural numbers: Section~\ref{s:tm-numlist-repr}
\end{itemize}
\end{itemize}
\end{itemize}
\end{itemize}
In other words, Sections~\ref{s:tm-basic}, \ref{s:tm-arithmetic-binary},
\ref{s:tm-numlist-repr}, \ref{s:tm-numlistlist-repr}, \ref{s:tm-quaternary-encoding},
\ref{s:TC-NP}, \ref{s:CNF}, and \ref{s:sat-sat-repr} cover all definitions for
formalizing the statement ``\SAT{} is $\NP$-complete''.
›
chapter ‹Turing machines\label{s:TM}›
text ‹
This chapter introduces Turing machines as a model for computing functions within
a running-time bound. Despite being quite intuitive, Turing machines are
notoriously tedious to work with. And so most of the rest of the chapter is
devoted to making this a little easier by providing means of combining TMs and a
library of reusable TMs for common tasks.

The basic idea (Sections~\ref{s:tm-basic} and~\ref{s:tm-trans}) is to treat
Turing machines as a kind of GOTO programming language. A state of a TM
corresponds to a line of code executing a rather complex command that, depending
on the symbols read, can write symbols, move tape heads, and jump to another
state (that is, line of code). States are identified by line numbers. This makes
it easy to execute TMs in sequence by concatenating two TM ``programs''. On top
of the GOTO implicit in all commands, we then define IF and WHILE in the
traditional way (Section~\ref{s:tm-combining}). This makes TMs more composable.
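
As a minimal sketch of the sequencing idea (an illustration only, not the
actual definitions of Section~\ref{s:tm-combining}): if a machine were nothing
but a list of jump targets, with states as line numbers, then sequential
composition would concatenate the programs and relocate the second program's
targets past the first.

\begin{verbatim}
(* Hypothetical toy model: a "machine" is a list of jump
   targets; state q executes line q of the program. *)
type_synonym machine = "nat list"

(* Sequencing appends the programs and shifts the second
   program's jump targets by the length of the first. *)
definition seq :: "machine => machine => machine" (infixl ";;" 55)
  where "M1 ;; M2 == M1 @ map (%q. q + length M1) M2"
\end{verbatim}

The real commands also read and write symbols and move heads, but the
relocation of jump targets works in the same spirit.
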
The interpretation of states as line numbers deprives TMs of the ability to
memorize values ``in states'', for example, the carry bit during a binary
addition. In Section~\ref{s:tm-memorizing} we recover some of this flexibility.

Being able to combine TMs is helpful, but we also need TMs to combine. This
takes up most of the remainder of the chapter. We start with simple operations,
such as moving a tape head to the next blank symbol or copying symbols between
tapes (Section~\ref{s:tm-elementary}). Extending our programming language
analogy for more complex TMs, we identify tapes with variables, so that a tape
contains a value of a specific type, such as a number or a list of numbers. In
the remaining Sections~\ref{s:tm-arithmetic} to~\ref{s:tm-wellformed} we define
these ``data types'' and devise TMs for operations over them.

It would be an exaggeration to say all this makes working with Turing machines
easy or fun. But at least it makes TMs somewhat more feasible to use for
complexity theory, as witnessed by the subsequent chapters.
›
section ‹Basic definitions\label{s:tm-basic}›
theory Basics
imports Main
begin
text ‹
While Turing machines are fairly simple, there are still a few parts to define,
especially if one allows multiple tapes and an arbitrary alphabet: states, tapes
(read-only or read-write), cells, tape heads, head movements, symbols, and
configurations. Beyond these are more semantic aspects like executing one or
many steps of a Turing machine, its running time, and what it means for a TM to
``compute a function''. Our approach to formalizing all this must look rather
crude compared to Dalvit and Thiemann's~\cite{Multitape_To_Singletape_TM-AFP},
but still it does get the job done.

For lack of a better place, this section also introduces a minimal version of
Big-Oh, polynomials, and a pairing function for strings.
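
The Big-Oh variant is essentially the textbook one (the formalized version
may differ in minor details): for functions $f, g$ on the naturals,
$f \in O(g)$ iff
\[
  \exists c\ n_0.\ \forall n \geq n_0.\ f(n) \leq c \cdot g(n).
\]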
›
subsection ‹Multi-tape Turing machines\label{s:tm-basic-tm}›
text ‹
Arora and Barak~\cite[p.~11]{ccama} define multi-tape Turing machines with these
features:
\begin{itemize}
\item There are $k \geq 2$ infinite one-directional tapes, and each has one head.
\item The first tape is the input tape and read-only; the other $k - 1$ tapes
can be written to.
\item The tape alphabet is a finite set $\Gamma$ containing at least the blank
symbol $\Box$, the start symbol $\triangleright$, and the symbols \textbf{0}
and \textbf{1}.
\item There is a finite set $Q$ of states with start state and halting state
$q_\mathit{start}, q_\mathit{halt} \in Q$.
\item The behavior is described by a transition function $\delta\colon\ Q
\times \Gamma^k \to Q \times \Gamma^{k-1} \times \{L, S, R\}^k$. If the TM is
in state $q$ and the symbols $g_1,\dots,g_k$ are under the $k$ tape heads
and $\delta(q, (g_1, \dots, g_k)) = (q', (g'_2, \dots, g'_k), (d_1, \dots,
d_k))$, then the TM writes $g'_2, \dots, g'_k$ to the writable tapes, moves
the tape heads in the directions (Left, Stay, or Right) indicated by $d_1,
\dots, d_k$, and switches to state $q'$ (see the example after this list).
\end{itemize}
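
As a small worked example (mine, not Arora and Barak's), a two-tape TM
($k = 2$) that copies its input to the second tape could have a transition
function containing, among others, the values
\[
  \delta(q_\mathit{start}, (\triangleright, \triangleright)) =
    (q_\mathit{start}, (\triangleright), (R, R)),
\]
\[
  \delta(q_\mathit{start}, (g, \Box)) = (q_\mathit{start}, (g), (R, R))
  \quad \mbox{for } g \in \{\mathbf{0}, \mathbf{1}\},
\]
\[
  \delta(q_\mathit{start}, (\Box, \Box)) = (q_\mathit{halt}, (\Box), (S, S)).
\]
The first rule keeps the start symbols aligned and moves both heads right, the
second copies one input symbol per step, and the third halts upon reaching the
blank that ends the input.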
›
subsubsection ‹Syntax›
text ‹
An obvious data type for the direction a tape head can move:
›