chapter ‹Satisfiability\label{s:Sat}› theory Satisfiability imports Wellformed NP begin text ‹ This chapter introduces the language \SAT{} and shows that it is in $\NP$, which constitutes the easier part of the Cook-Levin theorem. The other part, the $\NP$-hardness of \SAT{}, is what all the following chapters are concerned with. We first introduce Boolean formulas in conjunctive normal form and the concept of satisfiability. Then we define a way to represent such formulas as bit strings, leading to the definition of the language \SAT{} as a set of strings (Section~\ref{s:sat-sat}). For the proof that \SAT{} is in $\NP$, we construct a Turing machine that, given a CNF formula and a string representing a variable assignment, outputs \textbf{1} iff. the assignment satisfies the formula. The TM will run in polynomial time, and there are always assignments polynomial (in fact, linear) in the length of the formula (Section~\ref{s:Sat-np}). › section ‹The language \SAT{}\label{s:sat-sat}› text ‹ \SAT{} is the language of all strings representing satisfiable Boolean formulas in conjunctive normal form (CNF). This section introduces a minimal version of Boolean formulas in conjunctive normal form, including the concepts of assignments and satisfiability. › subsection ‹CNF formulas and satisfiability\label{s:CNF}› text ‹ Arora and Barak~\cite[p.~44]{ccama} define Boolean formulas in general as expressions over $\land$, $\lor$, $\lnot$, parentheses, and variables $v_1, v_2, \dots$ in the usual way. Boolean formulas in conjunctive normal form are defined as $\bigwedge_i\left(\bigvee_j v_{i_j}\right)$, where the $v_{i_j}$ are literals. This definition does not seem to allow for empty clauses. Also whether the ``empty CNF formula'' exists is somewhat doubtful. Nevertheless, our formalization allows for both empty clauses and the empty CNF formula, because this enables us to represent CNF formulas as lists of clauses and clauses as lists of literals without having to somehow forbid the empty list. This seems to be a popular approach for formalizing CNF formulas in the context of \SAT{} and $\NP$~\cite{Gher2021MechanisingCT,Multiset_Ordering_NPC-AFP}. We identify a variable $v_i$ with its index $i$, which can be any natural number. A \emph{literal} can either be positive or negative, representing a variable or negated variable, respectively. \null › datatype literal = Neg nat | Pos nat type_synonym clause = "literal list" type_synonym formula = "clause list" text ‹ An \emph{assignment} maps all variables, given by their index, to a Boolean: › type_synonym assignment = "nat ⇒ bool" abbreviation satisfies_literal :: "assignment ⇒ literal ⇒ bool" where "satisfies_literal α x ≡ case x of Neg n ⇒ ¬ α n | Pos n ⇒ α n" definition satisfies_clause :: "assignment ⇒ clause ⇒ bool" where "satisfies_clause α c ≡ ∃x∈set c. satisfies_literal α x" definition satisfies :: "assignment ⇒ formula ⇒ bool" (infix "⊨" 60) where "α ⊨ φ ≡ ∀c∈set φ. satisfies_clause α c" text ‹ As is customary, the empty clause is satisfied by no assignment, and the empty CNF formula is satisfied by every assignment. › proposition "¬ satisfies_clause α []" by (simp add: satisfies_clause_def) proposition "α ⊨ []" by (simp add: satisfies_def) lemma satisfies_clause_take: assumes "i < length clause" shows "satisfies_clause α (take (Suc i) clause) ⟷ satisfies_clause α (take i clause) ∨ satisfies_literal α (clause ! i)" using assms satisfies_clause_def by (auto simp add: take_Suc_conv_app_nth) lemma satisfies_take: assumes "i < length φ" shows "α ⊨ take (Suc i) φ ⟷ α ⊨ take i φ ∧ satisfies_clause α (φ ! i)" using assms satisfies_def by (auto simp add: take_Suc_conv_app_nth) lemma satisfies_append: assumes "α ⊨ φ⇩_{1}@ φ⇩_{2}" shows "α ⊨ φ⇩_{1}" and "α ⊨ φ⇩_{2}" using assms satisfies_def by simp_all lemma satisfies_append': assumes "α ⊨ φ⇩_{1}" and "α ⊨ φ⇩_{2}" shows "α ⊨ φ⇩_{1}@ φ⇩_{2}" using assms satisfies_def by auto lemma satisfies_concat_map: assumes "α ⊨ concat (map f [0..<k])" and "i < k" shows "α ⊨ f i" using assms satisfies_def by simp lemma satisfies_concat_map': assumes "⋀i. i < k ⟹ α ⊨ f i" shows "α ⊨ concat (map f [0..<k])" using assms satisfies_def by simp text ‹ The main ingredient for defining \SAT{} is the concept of \emph{satisfiable} CNF formula: › definition satisfiable :: "formula ⇒ bool" where "satisfiable φ ≡ ∃α. α ⊨ φ" text ‹ The set of all variables used in a CNF formula is finite. › definition variables :: "formula ⇒ nat set" where "variables φ ≡ {n. ∃c∈set φ. Neg n ∈ set c ∨ Pos n ∈ set c}" lemma finite_variables: "finite (variables φ)" proof - define voc :: "clause ⇒ nat set" where "voc c = {n. Neg n ∈ set c ∨ Pos n ∈ set c}" for c let ?vocs = "set (map voc φ)" have "finite (voc c)" for c proof (induction c) case Nil then show ?case using voc_def by simp next case (Cons a c) have "voc (a # c) = {n. Neg n ∈ set (a # c) ∨ Pos n ∈ set (a # c)}" using voc_def by simp also have "... = {n. Neg n ∈ set c ∨ Neg n = a ∨ Pos n ∈ set c ∨ Pos n = a}" by auto also have "... = {n. (Neg n ∈ set c ∨ Pos n ∈ set c) ∨ (Pos n = a ∨ Neg n = a)}" by auto also have "... = {n. Neg n ∈ set c ∨ Pos n ∈ set c} ∪ {n. Pos n = a ∨ Neg n = a}" by auto also have "... = voc c ∪ {n. Pos n = a ∨ Neg n = a}" using voc_def by simp finally have "voc (a # c) = voc c ∪ {n. Pos n = a ∨ Neg n = a}" . moreover have "finite {n. Pos n = a ∨ Neg n = a}" using finite_nat_set_iff_bounded by auto ultimately show ?case using Cons by simp qed moreover have "variables φ = ⋃?vocs" using variables_def voc_def by auto moreover have "finite ?vocs" by simp ultimately show ?thesis by simp qed lemma variables_append: "variables (φ⇩_{1}@ φ⇩_{2}) = variables φ⇩_{1}∪ variables φ⇩_{2}" using variables_def by auto text ‹ Arora and Barak~\cite[Claim~2.13]{ccama} define the \emph{size} of a CNF formula as the numbr of $\wedge / \vee$ symbols. We use a slightly different definition, namely the number of literals: › definition fsize :: "formula ⇒ nat" where "fsize φ ≡ sum_list (map length φ)" subsection ‹Predicates on assignments› text ‹ Every CNF formula is satisfied by a set of assignments. Conversely, for certain sets of assignments we can construct CNF formulas satisfied by exactly these assignments. This will be helpful later when we construct formulas for reducing arbitrary languages to \SAT{} (see Section~\ref{s:Reducing}). › subsubsection ‹Universality of CNF formulas› text ‹ A set (represented by a predicate) $F$ of assignments depends on the first $\ell$ variables iff.\ any two assignments that agree on the first $\ell$ variables are either both in the set or both outside of the set. › definition depon :: "nat ⇒ (assignment ⇒ bool) ⇒ bool" where "depon l F ≡ ∀α⇩_{1}α⇩_{2}. (∀i<l. α⇩_{1}i = α⇩_{2}i) ⟶ F α⇩_{1}= F α⇩_{2}" text ‹ Lists of all strings of the same length: › fun str_of_len :: "nat ⇒ string list" where "str_of_len 0 = [[]]" | "str_of_len (Suc l) = map ((#) 𝕆) (str_of_len l) @ map ((#) 𝕀) (str_of_len l)" lemma length_str_of_len: "length (str_of_len l) = 2 ^ l" by (induction l) simp_all lemma in_str_of_len_length: "xs ∈ set (str_of_len l) ⟹ length xs = l" by (induction l arbitrary: xs) auto lemma length_in_str_of_len: "length xs = l ⟹ xs ∈ set (str_of_len l)" proof (induction l arbitrary: xs) case 0 then show ?case by simp next case (Suc l) then obtain y ys where "xs = y # ys" by (meson length_Suc_conv) then have "length ys = l" using Suc by simp show ?case proof (cases y) case True then have "xs ∈ set (map ((#) 𝕀) (str_of_len l))" using `length ys = l` Suc `xs = y # ys` by simp then show ?thesis by simp next case False then have "xs ∈ set (map ((#) 𝕆) (str_of_len l))" using `length ys = l` Suc `xs = y # ys` by simp then show ?thesis by simp qed qed text ‹ A predicate $F$ depending on the first $\ell$ variables $v_0, \dots, v_{\ell-1}$ can be regarded as a truth table over $\ell$ variables. The next lemma shows that for every such truth table there exists a CNF formula with at most $2^\ell$ clauses and $\ell\cdot2^\ell$ literals over the first $\ell$ variables. This is the well-known fact that every Boolean function (over $\ell$ variables) can be represented by a CNF formula~\cite[Claim~2.13]{ccama}. › lemma depon_ex_formula: assumes "depon l F" shows "∃φ. fsize φ ≤ l * 2 ^ l ∧ length φ ≤ 2 ^ l ∧ variables φ ⊆ {..<l} ∧ (∀α. F α = α ⊨ φ)" proof - define cl where "cl = (λv. map (λi. if v ! i then Neg i else Pos i) [0..<l])" have cl1: "satisfies_clause a (cl v)" if "length v = l" and "v ≠ map a [0..<l]" for v a proof - obtain i where i: "i < l" "a i ≠ v ! i" using ‹length v = l› ‹v ≠ map a [0..<l]› by (smt (verit, best) atLeastLessThan_iff map_eq_conv map_nth set_upt) then have *: "cl v ! i = (if v ! i then Neg i else Pos i)" using cl_def by simp then have "case (cl v ! i) of Neg n ⇒ ¬ a n | Pos n ⇒ a n" using i(2) by simp then show ?thesis using cl_def * that(1) satisfies_clause_def i(1) by fastforce qed have cl2: "v ≠ map a [0..<l]" if "length v = l" and "satisfies_clause a (cl v)" for v a proof assume assm: "v = map a [0..<l]" from that(2) have "∃x∈set (cl v). case x of Neg n ⇒ ¬ a n | Pos n ⇒ a n" using satisfies_clause_def by simp then obtain i where i: "i < l" and "case (cl v ! i) of Neg n ⇒ ¬ a n | Pos n ⇒ a n" using cl_def by auto then have "v ! i ≠ a i" using cl_def by fastforce then show False using i assm by simp qed have filter_length_nth: "f (vs ! j)" if "vs = filter f sol" and "j < length vs" for vs sol :: "'a list" and f j using that nth_mem by (metis length_removeAll_less less_irrefl removeAll_filter_not) have sum_list_map: "sum_list (map g xs) ≤ k * length xs" if "⋀x. x ∈ set xs ⟹ g x = k" for xs :: "'a list" and g k using that proof (induction "length xs" arbitrary: xs) case 0 then show ?case by simp next case (Suc x) then obtain y ys where "xs = y # ys" by (metis length_Suc_conv) then have "length ys = x" using Suc by simp have "y ∈ set xs" using `xs = y # ys` by simp have "sum_list (map g xs) = sum_list (map g (y # ys))" using `xs = y # ys` by simp also have "... = g y + sum_list (map g ys)" by simp also have "... = k + sum_list (map g ys)" using Suc `y ∈ set xs` by simp also have "... ≤ k + k * length ys" using Suc `length ys = x` ‹xs = y # ys› by auto also have "... = k * length xs" by (metis Suc.hyps(2) ‹length ys = x› mult_Suc_right) finally show ?case by simp qed define vs where "vs = filter (λv. F (λi. if i < l then v ! i else False) = False) (str_of_len l)" define φ where "φ = map cl vs" have "a ⊨ φ" if "F a" for a proof - define v where "v = map a [0..<l]" then have "(λi. if i < l then v ! i else False) j = a j" if "j < l" for j by (simp add: that) then have *: "F (λi. if i < l then v ! i else False)" using assms(1) depon_def that by (smt (verit, ccfv_SIG)) have "satisfies_clause a c" if "c ∈ set φ" for c proof - obtain j where j: "c = φ ! j" "j < length φ" using φ_def `c ∈ set φ` by (metis in_set_conv_nth) then have "c = cl (vs ! j)" using φ_def by simp have "j < length vs" using φ_def j by simp then have "F (λi. if i < l then (vs ! j) ! i else False) = False" using vs_def filter_length_nth by blast then have "vs ! j ≠ v" using * by auto moreover have "length (vs ! j) = l" using vs_def length_str_of_len ‹j < length vs› by (smt (verit) filter_eq_nths in_str_of_len_length notin_set_nthsI nth_mem) ultimately have "satisfies_clause a (cl (vs ! j))" using v_def cl1 by simp then show ?thesis using `c = cl (vs ! j)` by simp qed then show ?thesis using satisfies_def by simp qed moreover have "F α" if "α ⊨ φ" for α proof (rule ccontr) assume assm: "¬ F α" define v where "v = map α [0..<l]" have *: "F (λi. if i < l then v ! i else False) = False" proof - have "(λi. if i < l then v ! i else False) j = α j" if "j < l" for j using v_def by (simp add: that) then show ?thesis using assm assms(1) depon_def by (smt (verit, best)) qed have "length v = l" using v_def by simp then obtain j where "j < length (str_of_len l)" and "v = str_of_len l ! j" by (metis in_set_conv_nth length_in_str_of_len) then have "v ∈ set vs" using vs_def * by fastforce then have "cl v ∈ set φ" using φ_def by simp then have "satisfies_clause α (cl v)" using that satisfies_def by simp then have "v ≠ map α [0..<l]" using `length v = l` cl2 by simp then show False using v_def by simp qed ultimately have "∀α. F α = α ⊨ φ" by auto moreover have "fsize φ ≤ l * 2 ^ l" proof - have "length c = l" if "c ∈ set φ" for c using that cl_def φ_def by auto then have "fsize φ ≤ l * length φ" unfolding fsize_def using sum_list_map by auto also have "... ≤ l * length (str_of_len l)" using φ_def vs_def by simp also have "... = l * 2 ^ l" using length_str_of_len by simp finally show ?thesis . qed moreover have "length φ ≤ 2 ^ l" proof - have "length φ ≤ length (str_of_len l)" using φ_def vs_def by simp also have "... = 2 ^ l" using length_str_of_len by simp finally show ?thesis . qed moreover have "variables φ ⊆ {..<l}" proof fix x assume "x ∈ variables φ" then obtain c where c: "c ∈ set φ" "Neg x ∈ set c ∨ Pos x ∈ set c" using variables_def by auto then obtain v where v: "v ∈ set (str_of_len l)" "c = cl v" using φ_def vs_def by auto then show "x ∈ {..<l}" using cl_def c by auto qed ultimately show ?thesis by auto qed subsubsection ‹Substitutions of variables› text ‹ We will sometimes consider CNF formulas over the first $\ell$ variables and derive other CNF formulas from them by substituting these variables. Such a substitution will be represented by a list $\sigma$ of length at least $\ell$, meaning that the variable $v_i$ is replaced by $v_{\sigma(i)}$. We will call this operation on formulas \emph{relabel}, and the corresponding one on literals \emph{rename}: › fun rename :: "nat list ⇒ literal ⇒ literal" where "rename σ (Neg i) = Neg (σ ! i)" | "rename σ (Pos i) = Pos (σ ! i)" definition relabel :: "nat list ⇒ formula ⇒ formula" where "relabel σ φ ≡ map (map (rename σ)) φ" lemma fsize_relabel: "fsize (relabel σ φ) = fsize φ" using relabel_def fsize_def by (metis length_concat length_map map_concat) text ‹ A substitution $\sigma$ can also be applied to an assignment and to a list of variable indices: › definition remap :: "nat list ⇒ assignment ⇒ assignment" where "remap σ α i ≡ if i < length σ then α (σ ! i) else α i" definition reseq :: "nat list ⇒ nat list ⇒ nat list" where "reseq σ vs ≡ map ((!) σ) vs" lemma length_reseq [simp]: "length (reseq σ vs) = length vs" using reseq_def by simp text ‹ Relabeling a formula and remapping an assignment are equivalent in a sense. › lemma satisfies_sigma: assumes "variables φ ⊆ {..<length σ}" shows "α ⊨ relabel σ φ ⟷ remap σ α ⊨ φ" proof assume sat: "α ⊨ relabel σ φ" have "satisfies_clause (remap σ α) c" if "c ∈ set φ" for c proof - obtain i where "i < length φ" "φ ! i = c" by (meson ‹c ∈ set φ› in_set_conv_nth) then have "satisfies_clause α (map (rename σ) c)" (is "satisfies_clause α ?c") using sat satisfies_def relabel_def by auto then obtain x where "x∈set ?c" "case x of Neg n ⇒ ¬ α n | Pos n ⇒ α n" using satisfies_clause_def by auto then obtain j where j: "j < length ?c" "case (?c ! j) of Neg n ⇒ ¬ α n | Pos n ⇒ α n" by (metis in_set_conv_nth) have "case c ! j of Neg n ⇒ ¬ (remap σ α) n | Pos n ⇒ (remap σ α) n" proof (cases "c ! j") case (Neg n) then have 1: "?c ! j = Neg (σ ! n)" using j(1) by simp have "n ∈ variables φ" using Neg j(1) nth_mem that variables_def by force then have "n < length σ" using assms by auto then show ?thesis using Neg 1 j(2) remap_def by auto next case (Pos n) then have 1: "?c ! j = Pos (σ ! n)" using j(1) by simp have "n ∈ variables φ" using Pos j(1) nth_mem that variables_def by force then have "n < length σ" using assms by auto then show ?thesis using Pos 1 j(2) remap_def by auto qed then show ?thesis using satisfies_clause_def j by auto qed then show "remap σ α ⊨ φ" using satisfies_def by simp next assume sat: "remap σ α ⊨ φ" have "satisfies_clause α c" if "c ∈ set (relabel σ φ)" for c proof - let ?phi = "relabel σ φ" let ?beta = "remap σ α" obtain i where i: "i < length ?phi" "?phi ! i = c" by (meson ‹c ∈ set ?phi› in_set_conv_nth) then have "satisfies_clause ?beta (φ ! i)" (is "satisfies_clause _ ?c") using sat satisfies_def relabel_def by simp then obtain x where "x∈set ?c" "case x of Neg n ⇒ ¬ ?beta n | Pos n ⇒ ?beta n" using satisfies_clause_def by auto then obtain j where j: "j < length ?c" "case (?c ! j) of Neg n ⇒ ¬ ?beta n | Pos n ⇒ ?beta n" by (metis in_set_conv_nth) then have ren: "c ! j = rename σ (?c ! j)" using i relabel_def by auto have "case c ! j of Neg n ⇒ ¬ α n | Pos n ⇒ α n" proof (cases "?c ! j") case (Neg n) then have *: "c ! j = Neg (σ ! n)" by (simp add: ren) have "n ∈ variables φ" using Neg i j variables_def that length_map mem_Collect_eq nth_mem relabel_def by force then have "n < length σ" using assms by auto moreover have "¬ (remap σ α) n" using j(2) Neg by simp ultimately have "¬ α (σ ! n)" using remap_def by simp then show ?thesis by (simp add: *) next case (Pos n) then have *: "c ! j = Pos (σ ! n)" by (simp add: ren) have "n ∈ variables φ" using Pos i j variables_def that length_map mem_Collect_eq nth_mem relabel_def by force then have "n < length σ" using assms by auto moreover have "(remap σ α) n" using j(2) Pos by simp ultimately have "α (σ ! n)" using remap_def by simp then show ?thesis by (simp add: *) qed moreover have "length c = length (φ ! i)" using relabel_def i by auto ultimately show ?thesis using satisfies_clause_def j by auto qed then show "α ⊨ relabel σ φ" by (simp add: satisfies_def) qed subsection ‹Representing CNF formulas as strings\label{s:sat-sat-repr}› text ‹ By identifying negated literals with even numbers and positive literals with odd numbers, we can identify literals with natural numbers. This yields a straightforward representation of a clause as a list of numbers and of a CNF formula as a list of lists of numbers. Such a list can, in turn, be represented as a symbol sequence over a quaternary alphabet as described in Section~\ref{s:tm-numlistlist}, which ultimately can be encoded over a binary alphabet (see Section~\ref{s:tm-quaternary}). This is essentially how we represent CNF formulas as strings. We have to introduce a bunch of functions for mapping between these representations. \null › fun literal_n :: "literal ⇒ nat" where "literal_n (Neg i) = 2 * i" | "literal_n (Pos i) = Suc (2 * i)" definition n_literal :: "nat ⇒ literal" where "n_literal n ≡ if even n then Neg (n div 2) else Pos (n div 2)" lemma n_literal_n: "n_literal (literal_n x) = x" using n_literal_def by (cases x) simp_all lemma literal_n_literal: "literal_n (n_literal n) = n" using n_literal_def by simp definition clause_n :: "clause ⇒ nat list" where "clause_n cl ≡ map literal_n cl" definition n_clause :: "nat list ⇒ clause" where "n_clause ns ≡ map n_literal ns" lemma n_clause_n: "n_clause (clause_n cl) = cl" using n_clause_def clause_n_def n_literal_n by (simp add: map_idI) lemma clause_n_clause: "clause_n (n_clause n) = n" using n_clause_def clause_n_def literal_n_literal by (simp add: map_idI) definition formula_n :: "formula ⇒ nat list list" where "formula_n φ ≡ map clause_n φ" definition n_formula :: "nat list list ⇒ formula" where "n_formula nss ≡ map n_clause nss" lemma n_formula_n: "n_formula (formula_n φ) = φ" using n_formula_def formula_n_def n_clause_n by (simp add: map_idI) lemma formula_n_formula: "formula_n (n_formula nss) = nss" using n_formula_def formula_n_def clause_n_clause by (simp add: map_idI) definition formula_zs :: "formula ⇒ symbol list" where "formula_zs φ ≡ numlistlist (formula_n φ)" text ‹ The mapping between formulas and well-formed symbol sequences for lists of lists of numbers is bijective. › lemma formula_n_inj: "formula_n φ⇩_{1}= formula_n φ⇩_{2}⟹ φ⇩_{1}= φ⇩_{2}" using n_formula_n by metis definition zs_formula :: "symbol list ⇒ formula" where "zs_formula zs ≡ THE φ. formula_zs φ = zs" lemma zs_formula: assumes "numlistlist_wf zs" shows "∃!φ. formula_zs φ = zs" proof - obtain nss where nss: "numlistlist nss = zs" using assms numlistlist_wf_def by auto let ?phi = "n_formula nss" have *: "formula_n ?phi = nss" using nss formula_n_formula by simp then have "formula_zs ?phi = zs" using nss formula_zs_def by simp then have "∃φ. formula_zs φ = zs" by auto moreover have "φ = ?phi" if "formula_zs φ = zs" for φ proof - have "numlistlist (formula_n φ) = zs" using that formula_zs_def by simp then have "nss = formula_n φ" using nss numlistlist_inj by simp then show ?thesis using formula_n_inj * by simp qed ultimately show ?thesis by auto qed lemma zs_formula_zs: "zs_formula (formula_zs φ) = φ" by (simp add: formula_n_inj formula_zs_def numlistlist_inj the_equality zs_formula_def) lemma formula_zs_formula: assumes "numlistlist_wf zs" shows "formula_zs (zs_formula zs) = zs" using assms zs_formula zs_formula_zs by metis text ‹ There will of course be Turing machines that perform computations on formulas. In order to bound their running time, we need bounds for the length of the symbol representation of formulas. › lemma nlength_literal_n_Pos: "nlength (literal_n (Pos n)) ≤ Suc (nlength n)" using nlength_times2plus1 by simp lemma nlength_literal_n_Neg: "nlength (literal_n (Neg n)) ≤ Suc (nlength n)" using nlength_times2 by simp lemma nlllength_formula_n: fixes V :: nat and φ :: formula assumes "⋀v. v ∈ variables φ ⟹ v ≤ V" shows "nlllength (formula_n φ) ≤ fsize φ * Suc (Suc (nlength V)) + length φ" using assms proof (induction φ) case Nil then show ?case using formula_n_def by simp next case (Cons cl φ) then have 0: "⋀v. v ∈ variables φ ⟹ v ≤ V" using variables_def by simp have 1: "n ≤ V" if "Pos n ∈ set cl" for n using that variables_def by (simp add: Cons.prems) have 2: "n ≤ V" if "Neg n ∈ set cl" for n using that variables_def by (simp add: Cons.prems) have 3: "nlength (literal_n v) ≤ Suc (nlength V)" if "v ∈ set cl" for v proof (cases v) case (Neg n) then have "nlength (literal_n v) ≤ Suc (nlength n)" using nlength_literal_n_Neg by blast moreover have "n ≤ V" using Neg that 2 by simp ultimately show ?thesis using nlength_mono by fastforce next case (Pos n) then have "nlength (literal_n v) ≤ Suc (nlength n)" using nlength_literal_n_Pos by blast moreover have "n ≤ V" using Pos that 1 by simp ultimately show ?thesis using nlength_mono by fastforce qed have "nllength (clause_n cl) = length (numlist (map literal_n cl))" using clause_n_def nllength_def by simp have "nllength (clause_n cl) = (∑n←(map literal_n cl). Suc (nlength n))" using clause_n_def nllength by simp also have "... = (∑v←cl. Suc (nlength (literal_n v)))" proof - have "map (λn. Suc (nlength n)) (map literal_n cl) = map (λv. Suc (nlength (literal_n v))) cl" by simp then show ?thesis by metis qed also have "... ≤ (∑v←cl. Suc (Suc (nlength V)))" using sum_list_mono[of cl "λv. Suc (nlength (literal_n v))" "λv. Suc (Suc (nlength V))"] 3 by simp also have "... = Suc (Suc (nlength V)) * length cl" using sum_list_const by blast finally have 4: "nllength (clause_n cl) ≤ Suc (Suc (nlength V)) * length cl" . have "concat (map (λns. numlist ns @ [5]) (map clause_n (cl # φ))) = (numlist (clause_n cl) @ [5]) @ concat (map (λns. numlist ns @ [5]) (map clause_n φ))" by simp then have "length (concat (map (λns. numlist ns @ [5]) (map clause_n (cl # φ)))) = length ((numlist (clause_n cl) @ [5]) @ concat (map (λns. numlist ns @ [5]) (map clause_n φ)))" by simp then have "nlllength (formula_n (cl # φ)) = length ((numlist (clause_n cl) @ [5]) @ concat (map (λns. numlist ns @ [5]) (map clause_n φ)))" using formula_n_def numlistlist_def nlllength_def by simp also have "... = length (numlist (clause_n cl) @ [5]) + length (concat (map (λns. numlist ns @ [5]) (map clause_n φ)))" by simp also have "... = length (numlist (clause_n cl) @ [5]) + nlllength (formula_n φ)" using formula_n_def numlistlist_def nlllength_def by simp also have "... = Suc (nllength (clause_n cl)) + nlllength (formula_n φ)" using nllength_def by simp also have "... ≤ Suc (Suc (Suc (nlength V)) * length cl) + nlllength (formula_n φ)" using 4 by simp also have "... ≤ Suc (Suc (Suc (nlength V)) * length cl) + fsize φ * Suc (Suc (nlength V)) + length φ" using Cons 0 by simp also have "... = fsize (cl # φ) * Suc (Suc (nlength V)) + length (cl # φ)" by (simp add: add_mult_distrib2 mult.commute fsize_def) finally show ?case by simp qed text ‹ Since \SAT{} is supposed to be a set of strings rather than symbol sequences, we need to map symbol sequences representing formulas to strings: › abbreviation formula_to_string :: "formula ⇒ string" where "formula_to_string φ ≡ symbols_to_string (binencode (numlistlist (formula_n φ)))" lemma formula_to_string_inj: assumes "formula_to_string φ⇩_{1}= formula_to_string φ⇩_{2}" shows "φ⇩_{1}= φ⇩_{2}" proof - let ?xs1 = "binencode (numlistlist (formula_n φ⇩_{1}))" let ?xs2 = "binencode (numlistlist (formula_n φ⇩_{2}))" have bin1: "binencodable (numlistlist (formula_n φ⇩_{1}))" by (simp add: Suc_le_eq numeral_2_eq_2 proper_symbols_numlistlist symbols_lt_numlistlist) then have "bit_symbols ?xs1" using bit_symbols_binencode by simp then have 1: "string_to_symbols (symbols_to_string ?xs1) = ?xs1" using bit_symbols_to_symbols by simp have bin2: "binencodable (numlistlist (formula_n φ⇩_{2}))" by (simp add: Suc_le_eq numeral_2_eq_2 proper_symbols_numlistlist symbols_lt_numlistlist) then have "bit_symbols ?xs2" using bit_symbols_binencode by simp then have "string_to_symbols (symbols_to_string ?xs2) = ?xs2" using bit_symbols_to_symbols by simp then have "?xs1 = ?xs2" using 1 assms by simp then have "numlistlist (formula_n φ⇩_{1}) = numlistlist (formula_n φ⇩_{2})" using binencode_inj bin1 bin2 by simp then have "formula_n φ⇩_{1}= formula_n φ⇩_{2}" using numlistlist_inj by simp then show ?thesis using formula_n_inj by simp qed text ‹ While @{const formula_to_string} maps every CNF formula to a string, not every string represents a CNF formula. We could just ignore such invalid strings and define \SAT{} to only contain well-formed strings. But this would implicitly place these invalid strings in the complement of \SAT{}. While this does not cause us any issues here, it would if we were to introduce co-$\NP$ and wanted to show that $\overline{\mathtt{SAT}}$ is in co-$\NP$, as we would then have to deal with the invalid strings. So it feels a little like cheating to ignore the invalid strings like this. Arora and Barak~\cite[p.~45 footnote~3]{ccama} recommend mapping invalid strings to ``some fixed formula''. A natural candidate for this fixed formula is the empty CNF, since an invalid string in a sense represents nothing, and the empty CNF formula is represented by the empty string. Since the empty CNF formula is satisfiable this implies that all invalid strings become elements of \SAT{}. We end up with the following definition of the protagonist of this article: › definition SAT :: language where "SAT ≡ {formula_to_string φ | φ. satisfiable φ} ∪ {x. ¬ (∃φ. x = formula_to_string φ)}" section ‹\SAT{} is in $\NP$\label{s:Sat-np}› text ‹ In order to show that \SAT{} is in $\NP$, we will construct a polynomial-time Turing machine $M$ and specify a polynomial function $p$ such that for every $x$, $x\in \SAT$ iff. there is a $u\in\bbOI^{p(|x|)}$ such that $M$ outputs \textbf{1} on $\langle x; u\rangle$. The idea is straightforward: Let $\phi$ be the formula represented by the string $x$. Interpret the string $u$ as a list of variables and interpret this as the assignment that assigns True to only the variables in the list. Then check if the assignment satisfies the formula. This check is ``obviously'' possible in polynomial time because $M$ simply has to iterate over all clauses and check if at least one literal per clause is true under the assignment. Checking if a literal is true is simply a matter of checking whether the literal's variable is in the list $u$. If the assignment satisfies $\phi$, output \textbf{1}, otherwise the empty symbol sequence. If $\phi$ is unsatisfiable then no assignment, hence no $u$ no matter the length will make $M$ output \textbf{1}. On the other hand, if $\phi$ is satisfiable then there will be a satisfying assignment where a subset of the variables in $\phi$ are assigned True. Hence there will be a list of variables of at most roughly the length of $\phi$. So setting the polynomial $p$ to something like $p(n) = n$ should suffice. In fact, as we shall see, $p(n) = n$ suffices. This is so because in our representation, the string $x$, being a list of lists, has slightly more overhead per number than the plain list $u$ has. Therefore listing all variables in $\phi$ is guaranteed to need fewer symbols than $x$ has. There are several technical details to work out. First of all, the input to $M$ need not be a well-formed pair. And if it is, the pair $\langle x, u\rangle$ has to be decoded into separate components $x$ and $u$. These have to be decoded again from the binary to the quaternary alphabet, which is only possible if both $x$ and $u$ comprise only bit symbols (\textbf{01}). Then $M$ needs to check if the decoded $x$ and $u$ are valid symbol sequences for lists (of lists) of numbers. In the case of $u$ this is particularly finicky because the definition of $\NP$ requires us to provide a string $u$ of exactly the length $p(|x|)$ and so we have to pad $u$ with extra symbols, which have to be stripped again before the validation can take place. In the first subsection we describe what the verifier TM has to do in terms of symbol sequences. In the subsections after that we devise a Turing machine that implements this behavior. › subsection ‹Verifying \SAT{} instances› text ‹ Our verifier Turing machine for \SAT{} will implement the following function; that is, on input @{term zs} it will output @{term "verify_sat zs"}. It performs a number of decodings and well-formedness checks and outputs either \textbf{1} or the empty symbol sequence. › definition verify_sat :: "symbol list ⇒ symbol list" where "verify_sat zs ≡ let ys = bindecode zs; xs = bindecode (first ys); vs = rstrip ♯ (bindecode (second ys)) in if even (length (first ys)) ∧ bit_symbols (first ys) ∧ numlistlist_wf xs then if bit_symbols (second ys) ∧ numlist_wf vs then if (λv. v ∈ set (zs_numlist vs)) ⊨ zs_formula xs then [𝟭] else [] else [] else [𝟭]" text ‹ Next we show that @{const verify_sat} behaves as is required of a verifier TM for \SAT. Its polynomial running time is the subject of the next subsection. › text ‹ First we consider the case that @{term zs} encodes a pair $\langle x, u\rangle$ of strings where $x$ does not represent a CNF formula. Such an $x$ is in \SAT{}, hence the verifier TM is supposed to output \textbf{1}. › lemma ex_phi_x: assumes "xs = string_to_symbols x" assumes "even (length xs)" and "numlistlist_wf (bindecode xs)" shows "∃φ. x = formula_to_string φ" proof - obtain nss where "numlistlist nss = bindecode xs" using assms(3) numlistlist_wf_def by auto moreover have "binencode (bindecode xs) = xs" using assms(1,2) binencode_decode by simp ultimately have "binencode (numlistlist nss) = xs" by simp then have "binencode (numlistlist (formula_n (n_formula nss))) = xs" using formula_n_formula by simp then have "formula_to_string (n_formula nss) = symbols_to_string xs" by simp then show ?thesis using assms(1) symbols_to_string_to_symbols by auto qed lemma verify_sat_not_wf_phi: assumes "zs = ⟨x; u⟩" and "¬ (∃φ. x = formula_to_string φ)" shows "verify_sat zs = [𝟭]" proof - define ys where "ys = bindecode zs" then have first_ys: "first ys = string_to_symbols x" using first_pair assms(1) by simp then have 2: "bit_symbols (first ys)" using assms(1) bit_symbols_first ys_def by presburger define xs where "xs = bindecode (first ys)" then have "¬ (even (length (first ys)) ∧ bit_symbols (first ys) ∧ numlistlist_wf xs)" using first_ys ex_phi_x assms(2) by auto then show "verify_sat zs = [𝟭]" unfolding verify_sat_def Let_def using ys_def xs_def by simp qed text ‹ The next case is that @{term zs} represents a pair $\langle x, u\rangle$ where $x$ represents an unsatisfiable CNF formula. This $x$ is thus not in \SAT{} and the verifier TM must output something different from \textbf{1}, such as the empty string, regardless of $u$. › lemma verify_sat_not_sat: fixes φ :: formula assumes "zs = ⟨formula_to_string φ; u⟩" and "¬ satisfiable φ" shows "verify_sat zs = []" proof - have bs_phi: "bit_symbols (binencode (formula_zs φ))" using bit_symbols_binencode formula_zs_def proper_symbols_numlistlist symbols_lt_numlistlist by (metis Suc_le_eq dual_order.refl numeral_2_eq_2) define ys where "ys = bindecode zs" then have "first ys = string_to_symbols (formula_to_string φ)" using first_pair assms(1) by simp then have first_ys: "first ys = binencode (formula_zs φ)" using bit_symbols_to_symbols bs_phi formula_zs_def by simp then have 2: "bit_symbols (first ys)" using assms(1) bit_symbols_first ys_def by presburger have 22: "even (length (first ys))" using first_ys by simp define xs where "xs = bindecode (first ys)" define vs where "vs = rstrip 5 (bindecode (second ys))" have wf_xs: "numlistlist_wf xs" using xs_def first_ys bindecode_encode formula_zs_def numlistlist_wf_def numlistlist_wf_has2' by (metis le_simps(3) numerals(2)) have φ: "zs_formula xs = φ" using xs_def first_ys "2" binencode_decode formula_to_string_inj formula_zs_def formula_zs_formula wf_xs by auto have "verify_sat zs = (if bit_symbols (second ys) ∧ numlist_wf vs then if (λv. v ∈ set (zs_numlist vs)) ⊨ φ then [3] else [] else [])" unfolding verify_sat_def Let_def using ys_def xs_def vs_def 2 22 wf_xs φ by simp then show "verify_sat zs = []" using assms(2) satisfiable_def by simp qed text ‹ Next we consider the case in which @{term zs} represents a pair $\langle x, u\rangle$ where $x$ represents a satisfiable CNF formula and $u$ a list of numbers padded at the right with @{text ♯} symbols. This $u$ thus represents a variable assignment, namely the one assigning True to exactly the variables in the list. The verifier TM is to output \textbf{1} iff.\ this assignment satisfies the CNF formula represented by $x$. First we show that stripping away @{text ♯} symbols does not damage a symbol sequence representing a list of numbers. › lemma rstrip_numlist_append: "rstrip ♯ (numlist vars @ replicate pad ♯) = numlist vars" (is "rstrip ♯ ?zs = ?ys") proof - have "(LEAST i. i ≤ length ?zs ∧ set (drop i ?zs) ⊆ {♯}) = length ?ys" proof (rule Least_equality) show "length ?ys ≤ length ?zs ∧ set (drop (length ?ys) ?zs) ⊆ {♯}" by auto show "length ?ys ≤ m" if "m ≤ length ?zs ∧ set (drop m ?zs) ⊆ {♯}" for m proof (rule ccontr) assume "¬ length ?ys ≤ m" then have "m < length ?ys" by simp then have "?ys ! m ∈ set (drop m ?ys)" by (metis Cons_nth_drop_Suc list.set_intros(1)) moreover have "set (drop m ?ys) ⊆ {♯}" using that by simp ultimately have "?ys ! m = ♯" by auto moreover have "?ys ! m < ♯" using symbols_lt_numlist `m < length ?ys` by simp ultimately show False by simp qed qed then show ?thesis using rstrip_def by simp qed lemma verify_sat_wf: fixes φ :: formula and pad :: nat assumes "zs = ⟨formula_to_string φ; symbols_to_string (binencode (numlist vars @ replicate pad ♯))⟩" shows "verify_sat zs = (if (λv. v ∈ set vars) ⊨ φ then [𝟭] else [])" proof - have bs_phi: "bit_symbols (binencode (formula_zs φ))" using bit_symbols_binencode formula_zs_def proper_symbols_numlistlist symbols_lt_numlistlist by (metis Suc_le_eq dual_order.refl numeral_2_eq_2) have "binencodable (numlist vars @ replicate pad ♯)" using proper_symbols_numlist symbols_lt_numlist binencodable_append[of "numlist vars" "replicate pad ♯"] by fastforce then have bs_vars: "bit_symbols (binencode (numlist vars @ replicate pad ♯))" using bit_symbols_binencode by simp define ys where "ys = bindecode zs" then have "first ys = string_to_symbols (formula_to_string φ)" using first_pair assms(1) by simp then have first_ys: "first ys = binencode (formula_zs φ)" using bit_symbols_to_symbols bs_phi formula_zs_def by simp then have bs_first: "bit_symbols (first ys)" using assms(1) bit_symbols_first ys_def by presburger have even: "even (length (first ys))" using first_ys by simp have second_ys: "second ys = binencode (numlist vars @ replicate pad ♯)" using bs_vars ys_def assms(1) bit_symbols_to_symbols second_pair by simp then have bs_second: "bit_symbols (second ys)" using bs_vars by simp define xs where "xs = bindecode (first ys)" define vs where "vs = rstrip ♯ (bindecode (second ys))" then have "vs = rstrip ♯ (numlist vars @ replicate pad ♯)" using second_ys ‹binencodable (numlist vars @ replicate pad ♯)› bindecode_encode by simp then have vs: "vs = numlist vars" using rstrip_numlist_append by simp have wf_xs: "numlistlist_wf xs" using xs_def first_ys bindecode_encode formula_zs_def numlistlist_wf_def numlistlist_wf_has2' by (metis le_simps(3) numerals(2)) have "verify_sat zs = (if even (length (first ys)) ∧ bit_symbols (first ys) ∧ numlistlist_wf xs then if bit_symbols (second ys) ∧ numlist_wf vs then if (λv. v ∈ set (zs_numlist vs)) ⊨ zs_formula xs then [𝟭] else [] else [] else [3])" unfolding verify_sat_def Let_def using bs_second ys_def xs_def vs_def by simp then have *: "verify_sat zs = (if bit_symbols (second ys) ∧ numlist_wf vs then if (λv. v ∈ set (zs_numlist vs)) ⊨ zs_formula xs then [𝟭] else [] else [])" unfolding verify_sat_def Let_def using even bs_first wf_xs by simp have "xs = formula_zs φ" using xs_def bindecode_encode formula_zs_def first_ys proper_symbols_numlistlist symbols_lt_numlistlist by (simp add: Suc_leI numerals(2)) then have φ: "φ = zs_formula xs" by (simp add: zs_formula_zs) have vars: "vars = zs_numlist vs" using vs numlist_wf_def numlist_zs_numlist zs_numlist_ex1 by blast then have wf_vs: "numlist_wf vs" using numlist_wf_def vs by auto have "verify_sat zs = (if (λv. v ∈ set (zs_numlist vs)) ⊨ zs_formula xs then [𝟭] else [])" using * bs_second wf_xs wf_vs by simp then show ?thesis using φ vars by simp qed text ‹ Finally we show that for every string $x$ representing a satisfiable CNF formula there is a list of numbers representing a satisfying assignment and represented by a string of length at most $|x|$. That means there is always a string of exactly the length of $x$ consisting of a satisfying assignment plus some padding symbols. › lemma nllength_remove1: assumes "n ∈ set ns" shows "nllength (n # remove1 n ns) = nllength ns" using assms nllength sum_list_map_remove1[of n ns "λn. Suc (nlength n)"] by simp lemma nllength_distinct_le: assumes "distinct ns" and "set ns ⊆ set ms" shows "nllength ns ≤ nllength ms" using assms proof (induction ms arbitrary: ns) case Nil then show ?case by simp next case (Cons m ms) show ?case proof (cases "m ∈ set ns") case True let ?ns = "remove1 m ns" have "set ?ns ⊆ set ms" using Cons by auto moreover have "distinct ?ns" using Cons by simp ultimately have *: "nllength ?ns ≤ nllength ms" using Cons by simp have "nllength ns = nllength (m # ?ns)" using True nllength_remove1 by simp also have "... = Suc (nlength m) + nllength ?ns" using nllength by simp also have "... ≤ Suc (nlength m) + nllength ms" using * by simp also have "... = nllength (m # ms)" using nllength by simp finally show ?thesis . next case False then have "set ns ⊆ set ms" using Cons by auto moreover have "distinct ns" using Cons by simp ultimately have "nllength ns ≤ nllength ms" using Cons by simp then show ?thesis using nllength by simp qed qed lemma nlllength_nllength_concat: "nlllength nss = nllength (concat nss) + length nss" using nlllength nllength by (induction nss) simp_all fun variable :: "literal ⇒ nat" where "variable (Neg i) = i" | "variable (Pos i) = i" lemma sum_list_eq: "ns = ms ⟹ sum_list ns = sum_list ms" by simp lemma nllength_clause_le: "nllength (clause_n cl) ≥ nllength (map variable cl)" proof - have "variable x ≤ literal_n x" for x by (cases x) simp_all then have *: "Suc (nlength (variable x)) ≤ Suc (nlength (literal_n x))" for x using nlength_mono by simp let ?ns = "map literal_n cl" have "nllength (clause_n cl) = nllength ?ns" using clause_n_def by presburger also have "... = (∑n←?ns. Suc (nlength n))" using nllength by simp also have "... = (∑x←cl. Suc (nlength (literal_n x)))" by (smt (verit, del_insts) length_map nth_equalityI nth_map) also have "... ≥ (∑x←cl. Suc (nlength (variable x)))" using * by (simp add: sum_list_mono) finally have "(∑x←cl. Suc (nlength (variable x))) ≤ nllength (clause_n cl)" by simp moreover have "(∑x←cl. Suc (nlength (variable x))) = nllength (map variable cl)" proof - have 1: "map (λx. Suc (nlength (variable x))) cl = map (λn. Suc (nlength n)) (map variable cl)" by simp then have "(∑x←cl. Suc (nlength (variable x))) = (∑n←(map variable cl). Suc (nlength n))" using sum_list_eq[OF 1] by simp then show ?thesis using nllength by simp qed ultimately show ?thesis by simp qed lemma nlllength_formula_ge: "nlllength (formula_n φ) ≥ nlllength (map (map variable) φ)" proof (induction φ) case Nil then show ?case by simp next case (Cons cl φ) have "nlllength (map (map variable) (cl # φ)) = nlllength (map (map variable) [cl]) + nlllength (map (map variable) φ)" using nlllength by simp also have "... = Suc (nllength (map variable cl)) + nlllength (map (map variable) φ)" using nlllength by simp also have "... ≤ Suc (nllength (map variable cl)) + nlllength (formula_n φ)" using Cons by simp also have "... ≤ Suc (nllength (clause_n cl)) + nlllength (formula_n φ)" using nllength_clause_le by simp also have "... = nlllength (formula_n (cl # φ))" using nlllength by (simp add: formula_n_def) finally show ?case . qed lemma variables_shorter_formula: fixes φ :: formula and vars :: "nat list" assumes "set vars ⊆ variables φ" and "distinct vars" shows "nllength vars ≤ nlllength (formula_n φ)" proof - let ?nss = "map (map variable) φ" have "nllength (concat ?nss) ≤ nlllength ?nss" using nlllength_nllength_concat by simp then have *: "nllength (concat ?nss) ≤ nlllength (formula_n φ)" using nlllength_formula_ge by (meson le_trans) have "set vars ⊆ set (concat ?nss)" proof fix n :: nat assume "n ∈ set vars" then have "n ∈ variables φ" using assms(1) by auto then obtain c where c: "c ∈ set φ" "Neg n ∈ set c ∨ Pos n ∈ set c" using variables_def by auto then obtain x where x: "x ∈ set c" "variable x = n" by auto then show "n ∈ set (concat (map (map variable) φ))" using c by auto qed then have "nllength vars ≤ nllength (concat ?nss)" using nllength_distinct_le assms(2) by simp then show ?thesis using * by simp qed lemma ex_assignment_linear_length: assumes "satisfiable φ" shows "∃vars. (λv. v ∈ set vars) ⊨ φ ∧ nllength vars ≤ nlllength (formula_n φ)" proof - obtain α where α: "α ⊨ φ" using assms satisfiable_def by auto define poss where "poss = {v. v ∈ variables φ ∧ α v}" then have "finite poss" using finite_variables by simp let ?beta = "λv. v ∈ poss" have sat: "?beta ⊨ φ" unfolding satisfies_def proof fix c :: clause assume "c ∈ set φ" then have "satisfies_clause α c" using satisfies_def α by simp then obtain x where x: "x ∈ set c" "satisfies_literal α x" using satisfies_clause_def by auto show "satisfies_clause ?beta c" proof (cases x) case (Neg n) then have "¬ α n" using x(2) by simp then have "n ∉ poss" using poss_def by simp then have "¬ ?beta n" by simp then have "satisfies_literal ?beta x" using Neg by simp then show ?thesis using satisfies_clause_def x(1) by auto next case (Pos n) then have "α n" using x(2) by simp then have "n ∈ poss" using poss_def Pos ‹c ∈ set φ› variables_def x(1) by auto then have "?beta n" by simp then have "satisfies_literal ?beta x" using Pos by simp then show ?thesis using satisfies_clause_def x(1) by auto qed qed obtain vars where vars: "set vars = poss" "distinct vars" using `finite poss` by (meson finite_distinct_list) moreover from this have "set vars ⊆ variables φ" using poss_def by simp ultimately have "nllength vars ≤ nlllength (formula_n φ)" using variables_shorter_formula by simp moreover have "(λv. v ∈ set vars) ⊨ φ" using vars(1) sat by simp ultimately show ?thesis by auto qed lemma ex_witness_linear_length: fixes φ :: formula assumes "satisfiable φ" shows "∃us. bit_symbols us ∧ length us = length (formula_to_string φ) ∧ verify_sat ⟨formula_to_string φ; symbols_to_string us⟩ = [𝟭]" proof - obtain vars where vars: "(λv. v ∈ set vars) ⊨ φ" "nllength vars ≤ nlllength (formula_n φ)" using assms ex_assignment_linear_length by auto define pad where "pad = nlllength (formula_n φ) - nllength vars" then have "nllength vars + pad = nlllength (formula_n φ)" using vars(2) by simp moreover define us where "us = numlist vars @ replicate pad ♯" ultimately have "length us = nlllength (formula_n φ)" by (simp add: nllength_def) then have "length (binencode us) = length (formula_to_string φ)" (is "length ?us = _") by (simp add: nlllength_def) moreover have "verify_sat ⟨formula_to_string φ; symbols_to_string ?us⟩ = [𝟭]" using us_def vars(1) assms verify_sat_wf by simp moreover have "bit_symbols ?us" proof - have "binencodable (numlist vars)" using proper_symbols_numlist symbols_lt_numlist by (metis Suc_leI lessI less_Suc_numeral numeral_2_eq_2 numeral_le_iff numeral_less_iff order_less_le_trans pred_numeral_simps(3) semiring_norm(74)) moreover have "binencodable (replicate pad ♯)" by simp ultimately have "binencodable us" using us_def binencodable_append by simp then show ?thesis using bit_symbols_binencode by simp qed ultimately show ?thesis by blast qed lemma bit_symbols_verify_sat: "bit_symbols (verify_sat zs)" unfolding verify_sat_def Let_def by simp subsection ‹A Turing machine for verifying formulas› text ‹ The core of the function @{const verify_sat} is the expression @{term " (λv. v ∈ set (zs_numlist vs)) ⊨ zs_formula xs"}, which checks if an assignment represented by a list of variable indices satisfies a CNF formula represented by a list of lists of literals. In this section we devise a Turing machine performing this check. Recall that the numbers 0 and 1 are represented by the empty symbol sequence and the symbol sequence \textbf{1}, respectively. The Turing machines in this section are described in terms of numbers. We start with a Turing machine that checks a clause. The TM accepts on tape $j_1$ a list of numbers representing an assignment $\alpha$ and on tape $j_2$ a list of numbers representing a clause. It outputs on tape $j_3$ the number 1 if $\alpha$ satisfies the clause, and otherwise 0. To do this the TM iterates over all literals in the clause and determines the underlying variable and the sign of the literal. If the literal is positive and the variable is in the list representing $\alpha$ or if the literal is negative and the variable is not in the list, the number 1 is written to the tape $j_3$. Otherwise the tape remains unchanged. We assume $j_3$ is initialized with 0, and so it will be 1 if and only if at least one literal is satisfied by $\alpha$. The TM requires five auxiliary tapes $j_3 + 1, \dots, j_3 + 5$. Tape $j_3 + 1$ stores the literals one at a time, and later the variable; tape $j_3 + 2$ stores the sign of the literal; tape $j_3 + 3$ stores whether the variable is contained in $\alpha$; tapes $j_3 + 4$ and $j_3 + 5$ are the auxiliary tapes of @{const tm_contains}. › definition tm_sat_clause :: "tapeidx ⇒ tapeidx ⇒ tapeidx ⇒ machine" where "tm_sat_clause j1 j2 j3 ≡ WHILE [] ; λrs. rs ! j2 ≠ □ DO tm_nextract 4 j2 (j3 + 1) ;; tm_mod2 (j3 + 1) (j3 + 2) ;; tm_div2 (j3 + 1) ;; tm_contains j1 (j3 + 1) (j3 + 3) ;; IF λrs. rs ! (j3 + 3) = □ ∧ rs ! (j3 + 2) = □ ∨ rs ! (j3 + 3) ≠ □ ∧ rs ! (j3 + 2) ≠ □ THEN tm_setn j3 1 ELSE [] ENDIF ;; tm_setn (j3 + 1) 0 ;; tm_setn (j3 + 2) 0 ;; tm_setn (j3 + 3) 0 DONE ;; tm_cr j2" lemma tm_sat_clause_tm: assumes "k ≥ 2" and "G ≥ 5" and "j3 + 5 < k" "0 < j1" "j1 < k" "j2 < k" "j1 < j3" shows "turing_machine k G (tm_sat_clause j1 j2 j3)" using tm_sat_clause_def tm_mod2_tm tm_div2_tm tm_nextract_tm tm_setn_tm tm_contains_tm Nil_tm tm_cr_tm assms turing_machine_loop_turing_machine turing_machine_branch_turing_machine by simp locale turing_machine_sat_clause = fixes j1 j2 j3 :: tapeidx begin definition "tmL1 ≡ tm_nextract 4 j2 (j3 + 1)" definition "tmL2 ≡ tmL1 ;; tm_mod2 (j3 + 1) (j3 + 2)" definition "tmL3 ≡ tmL2 ;; tm_div2 (j3 + 1)" definition "tmL4 ≡ tmL3 ;; tm_contains j1 (j3 + 1) (j3 + 3)" definition "tmI ≡ IF λrs. rs ! (j3 + 3) = □ ∧ rs ! (j3 + 2) = □ ∨ rs ! (j3 + 3) ≠ □ ∧ rs ! (j3 + 2) ≠ □ THEN tm_setn j3 1 ELSE [] ENDIF" definition "tmL5 ≡ tmL4 ;; tmI" definition "tmL6 ≡ tmL5 ;; tm_setn (j3 + 1) 0" definition "tmL7 ≡ tmL6 ;; tm_setn (j3 + 2) 0" definition "tmL8 ≡ tmL7 ;; tm_setn (j3 + 3) 0" definition "tmL ≡ WHILE [] ; λrs. rs ! j2 ≠ □ DO tmL8 DONE" definition "tm2 ≡ tmL ;; tm_cr j2" lemma tm2_eq_tm_sat_clause: "tm2 = tm_sat_clause j1 j2 j3" unfolding tm2_def tmL_def tmL8_def tmL7_def tmL6_def tmL5_def tmL4_def tmL3_def tmI_def tmL2_def tmL1_def tm_sat_clause_def by simp context fixes tps0 :: "tape list" and k :: nat and vars :: "nat list" and clause :: clause assumes jk: "0 < j1" "j1 ≠ j2" "j3 + 5 < k" "j1 < j3" "j2 < j3" "0 < j2" "length tps0 = k" assumes tps0: "tps0 ! j1 = nltape' vars 0" "tps0 ! j2 = nltape' (clause_n clause) 0" "tps0 ! j3 = (⌊0⌋⇩_{N}, 1)" "tps0 ! (j3 + 1) = (⌊0⌋⇩_{N}, 1)" "tps0 ! (j3 + 2) = (⌊0⌋⇩_{N}, 1)" "tps0 ! (j3 + 3) = (⌊0⌋⇩_{N}, 1)" "tps0 ! (j3 + 4) = (⌊0⌋⇩_{N}, 1)" "tps0 ! (j3 + 5) = (⌊0⌋⇩_{N}, 1)" begin abbreviation "sat_take t ≡ satisfies_clause (λv. v ∈ set vars) (take t clause)" definition tpsL :: "nat ⇒ tape list" where "tpsL t ≡ tps0 [j2 := nltape' (clause_n clause) t, j3 := (⌊sat_take t⌋⇩_{B}, 1)]" lemma tpsL0: "tpsL 0 = tps0" proof - have "nltape' (clause_n clause) 0 = tps0 ! j2" using tps0(2) by presburger moreover have "⌊sat_take 0⌋⇩_{B}= ⌊0⌋⇩_{N}" using satisfies_clause_def by simp ultimately show ?thesis using tpsL_def tps0 jk by (metis list_update_id) qed definition tpsL1 :: "nat ⇒ tape list" where "tpsL1 t ≡ tps0 [j2 := nltape' (clause_n clause) (Suc t), j3 := (⌊sat_take t⌋⇩_{B}, 1), j3 + 1 := (⌊literal_n (clause ! t)⌋⇩_{N}, 1)]" lemma tmL1 [transforms_intros]: assumes "ttt = 12 + 2 * nlength (clause_n clause ! t)" and "t < length (clause_n clause)" shows "transforms tmL1 (tpsL t) ttt (tpsL1 t)" unfolding tmL1_def proof (tform tps: assms tps0 tpsL_def tpsL1_def jk) have len: "t < length clause" using assms(2) clause_n_def by simp show "ttt = 12 + 2 * nlength 0 + 2 * nlength (clause_n clause ! t)" using assms(1) by simp have *: "j2 ≠ j3" using jk by simp have **: "clause_n clause ! t = literal_n (clause ! t)" using len by (simp add: clause_n_def) show "tpsL1 t = (tpsL t) [j2 := nltape' (clause_n clause) (Suc t), j3 + 1 := (⌊clause_n clause ! t⌋⇩_{N}, 1)]" unfolding tpsL_def tpsL1_def using list_update_swap[OF *, of tps0] by (simp add: **) qed definition tpsL2 :: "nat ⇒ tape list" where "tpsL2 t ≡ tps0 [j2 := nltape' (clause_n clause) (Suc t), j3 := (⌊sat_take t⌋⇩_{B}, 1), j3 + 1 := (⌊literal_n (clause ! t)⌋⇩_{N}, 1), j3 + 2 := (⌊literal_n (clause ! t) mod 2⌋⇩_{N}, 1)]" lemma tmL2 [transforms_intros]: assumes "ttt = 12 + 2 * nlength (clause_n clause ! t) + 1" and "t < length (clause_n clause)" shows "transforms tmL2 (tpsL t) ttt (tpsL2 t)" unfolding tmL2_def by (tform tps: assms tps0 tpsL2_def tpsL1_def jk) definition tpsL3 :: "nat ⇒ tape list" where "tpsL3 t ≡ tps0 [j2 := nltape' (clause_n clause) (Suc t), j3 := (⌊sat_take t⌋⇩_{B}, 1), j3 + 1 := (⌊literal_n (clause ! t) div 2⌋⇩_{N}, 1), j3 + 2 := (⌊literal_n (clause ! t) mod 2⌋⇩_{N}, 1)]" lemma tmL3 [transforms_intros]: assumes "ttt = 16 + 4 * nlength (clause_n clause ! t)" and "t < length (clause_n clause)" shows "transforms tmL3 (tpsL t) ttt (tpsL3 t)" unfolding tmL3_def proof (tform tps: assms(2) tps0 tpsL3_def tpsL2_def jk) have len: "t < length clause" using assms(2) clause_n_def by simp have **: "clause_n clause ! t = literal_n (clause ! t)" using len by (simp add: clause_n_def) show "ttt = 12 + 2 * nlength (clause_n clause ! t) + 1 + (2 * nlength (literal_n (clause ! t)) + 3)" using assms(1) ** by simp qed definition tpsL4 :: "nat ⇒ tape list" where "tpsL4 t ≡ tps0 [j2 := nltape' (clause_n clause) (Suc t), j3 := (⌊sat_take t⌋⇩_{B}, 1), j3 + 1 := (⌊literal_n (clause ! t) div 2⌋⇩_{N}, 1), j3 + 2 := (⌊literal_n (clause ! t) mod 2⌋⇩_{N}, 1), j3 + 3 := (⌊literal_n (clause ! t) div 2 ∈ set vars⌋⇩_{B}, 1)]" lemma tmL4 [transforms_intros]: assumes "ttt = 20 + 4 * nlength (clause_n clause ! t) + 67 * (nllength vars)⇧^{2}" and "t < length (clause_n clause)" shows "transforms tmL4 (tpsL t) ttt (tpsL4 t)" unfolding tmL4_def proof (tform tps: assms(2) tps0 tpsL4_def tpsL3_def jk time: assms(1)) have "tpsL3 t ! (j3 + 4) = (⌊0⌋⇩_{N}, 1)" using tpsL3_def tps0 jk by simp then show "tpsL3 t ! (j3 + 3 + 1) = (⌊0⌋⇩_{N}, 1)" by (metis ab_semigroup_add_class.add_ac(1) numeral_plus_one semiring_norm(2) semiring_norm(8)) have "tpsL3 t ! (j3 + 5) = (⌊0⌋⇩_{N}, 1)" using tpsL3_def tps0 jk by simp then show "tpsL3 t ! (j3 + 3 + 2) = (⌊0⌋⇩_{N}, 1)" by (simp add: numeral_Bit1) qed definition tpsL5 :: "nat ⇒ tape list" where "tpsL5 t ≡ tps0 [j2 := nltape' (clause_n clause) (Suc t), j3 := (⌊sat_take (Suc t)⌋⇩_{B}, 1), j3 + 1 := (⌊literal_n (clause ! t) div 2⌋⇩_{N}, 1), j3 + 2 := (⌊literal_n (clause ! t) mod 2⌋⇩_{N}, 1), j3 + 3 := (⌊literal_n (clause ! t) div 2 ∈ set vars⌋⇩_{B}, 1)]" lemma tmI [transforms_intros]: assumes "ttt = 16" and "t < length (clause_n clause)" shows "transforms tmI (tpsL4 t) ttt (tpsL5 t)" unfolding tmI_def proof (tform tps: jk tpsL4_def time: assms(1)) show "10 + 2 * nlength (if sat_take t then 1 else 0) + 2 * nlength 1 + 2 ≤ ttt" using assms(1) nlength_0 nlength_1_simp by simp have len: "t < length clause" using assms(2) by (simp add: clause_n_def) let ?l = "clause ! t" have 1: "read (tpsL4 t) ! (j3 + 3) = □ ⟷ literal_n ?l div 2 ∉ set vars" using tpsL4_def jk read_ncontents_eq_0[of "tpsL4 t" "j3 + 3"] by simp have 2: "read (tpsL4 t) ! (j3 + 2) = □ ⟷ literal_n ?l mod 2 = 0" using tpsL4_def jk read_ncontents_eq_0[of "tpsL4 t" "j3 + 2"] by simp let ?a = "λv. v ∈ set vars" let ?cond = "read (tpsL4 t) ! (j3 + 3) = □ ∧ read (tpsL4 t) ! (j3 + 2) = □ ∨ read (tpsL4 t) ! (j3 + 3) ≠ □ ∧ read (tpsL4 t) ! (j3 + 2) ≠ □" have *: "?cond ⟷ satisfies_literal ?a ?l" proof (cases ?l) case (Neg v) then have "literal_n ?l div 2 = v" "literal_n ?l mod 2 = 0" by simp_all moreover from this have "satisfies_literal ?a ?l ⟷ v ∉ set vars" using Neg by simp ultimately show ?thesis using 1 2 by simp next case (Pos v) then have "literal_n ?l div 2 = v" "literal_n ?l mod 2 = 1" by simp_all moreover from this have "satisfies_literal ?a ?l ⟷ v ∈ set vars" using Pos by simp ultimately show ?thesis using 1 2 by simp qed have **: "sat_take (Suc t) ⟷ sat_take t ∨ satisfies_literal ?a ?l" using satisfies_clause_take[OF len] by simp show "tpsL5 t = (tpsL4 t)[j3 := (⌊1⌋⇩_{N}, 1)]" if ?cond proof - have "(if sat_take (Suc t) then 1::nat else 0) = 1" using that * ** by simp then show ?thesis unfolding tpsL5_def tpsL4_def using that by (simp add: list_update_swap) qed show "tpsL5 t = (tpsL4 t)" if "¬ ?cond" proof - have "sat_take t = sat_take (Suc t)" using * ** that by simp then show ?thesis unfolding tpsL5_def tpsL4_def using that by (simp add: list_update_swap) qed qed lemma tmL5 [transforms_intros]: assumes "ttt = 36 + 4 * nlength (clause_n clause ! t) + 67 * (nllength vars)⇧^{2}" and "t < length (clause_n clause)" shows "transforms tmL5 (tpsL t) ttt (tpsL5 t)" unfolding tmL5_def by (tform tps: assms) definition tpsL6 :: "nat ⇒ tape list" where "tpsL6 t ≡ tps0 [j2 := nltape' (clause_n clause) (Suc t), j3 := (⌊sat_take (Suc t)⌋⇩_{B}, 1), j3 + 1 := (⌊0⌋⇩_{N}, 1), j3 + 2 := (⌊literal_n (clause ! t) mod 2⌋⇩_{N}, 1), j3 + 3 := (⌊literal_n (clause ! t) div 2 ∈ set vars⌋⇩_{B}, 1)]" lemma tmL6 [transforms_intros]: assumes "ttt = 46 + 4 * nlength (clause_n clause ! t) + 67 * (nllength vars)⇧^{2}+ 2 * nlength (literal_n (clause ! t) div 2)" and "t < length (clause_n clause)" shows "transforms tmL6 (tpsL t) ttt (tpsL6 t)" unfolding tmL6_def by (tform tps: assms tps0 tpsL6_def tpsL5_def jk) definition tpsL7 :: "nat ⇒ tape list" where "tpsL7 t ≡ tps0 [j2 := nltape' (clause_n clause) (Suc t), j3 := (⌊sat_take (Suc t)⌋⇩_{B}, 1), j3 + 1 := (⌊0⌋⇩_{N}, 1), j3 + 2 := (⌊0⌋⇩_{N}, 1), j3 + 3 := (⌊literal_n (clause ! t) div 2 ∈ set vars⌋⇩_{B}, 1)]" lemma tmL7 [transforms_intros]: assumes "ttt = 56 + 4 * nlength (clause_n clause ! t) + 67 * (nllength vars)⇧^{2}+ 2 * nlength (literal_n (clause ! t) div 2) + 2 * nlength (literal_n (clause ! t) mod 2)" and "t < length (clause_n clause)" shows "transforms tmL7 (tpsL t) ttt (tpsL7 t)" unfolding tmL7_def by (tform tps: assms tps0 tpsL7_def tpsL6_def jk) definition tpsL8 :: "nat ⇒ tape list" where "tpsL8 t ≡ tps0 [j2 := nltape' (clause_n clause) (Suc t), j3 := (⌊sat_take (Suc t)⌋⇩_{B}, 1), j3 + 1 := (⌊0⌋⇩_{N}, 1), j3 + 2 := (⌊0⌋⇩_{N}, 1), j3 + 3 := (⌊0⌋⇩_{N}, 1)]" lemma tmL8: assumes "ttt = 66 + 4 * nlength (clause_n clause ! t) + 67 * (nllength vars)⇧^{2}+ 2 * nlength (literal_n (clause ! t) div 2) + 2 * nlength (literal_n (clause ! t) mod 2) + 2 * nlength (if literal_n (clause ! t) div 2 ∈ set vars then 1 else 0)" and "t < length (clause_n clause)" shows "transforms tmL8 (tpsL t) ttt (tpsL8 t)" unfolding tmL8_def by (tform tps: assms tps0 tpsL8_def tpsL7_def jk) lemma tmL8': assumes "ttt = 70 + 6 * nllength (clause_n clause) + 67 * (nllength vars)⇧^{2}" and "t < length (clause_n clause)" shows "transforms tmL8 (tpsL t) ttt (tpsL8 t)" proof - let ?l = "literal_n (clause ! t)" let ?ll = "clause_n clause ! t" let ?t = "66 + 4 * nlength ?ll + 67 * (nllength vars)⇧^{2}+ 2 * nlength (?l div 2) + 2 * nlength (?l mod 2) + 2 * nlength (if ?l div 2 ∈ set vars then 1 else 0)" have "?t = 66 + 4 * nlength ?ll + 67 * (nllength vars)⇧^{2}+ 2 * nlength (?ll div 2) + 2 * nlength (?ll mod 2) + 2 * nlength (if ?ll div 2 ∈ set vars then 1 else 0)" using assms(2) clause_n_def by simp also have "... ≤ 66 + 4 * nlength ?ll + 67 * (nllength vars)⇧^{2}+ 2 * nlength ?ll + 2 * nlength (?ll mod 2) + 2 * nlength (if ?ll div 2 ∈ set vars then 1 else 0)" using nlength_mono[of "?ll div 2" "?ll"] by simp also have "... = 66 + 6 * nlength ?ll + 67 * (nllength vars)⇧^{2}+ 2 * nlength (?ll mod 2) + 2 * nlength (if ?ll div 2 ∈ set vars then 1 else 0)" by simp also have "... ≤ 66 + 6 * nlength ?ll + 67 * (nllength vars)⇧^{2}+ 2 * nlength 1 + 2 * nlength (if ?ll div 2 ∈ set vars then 1 else 0)" using nlength_mono by simp also have "... ≤ 66 + 6 * nlength ?ll + 67 * (nllength vars)⇧^{2}+ 2 * nlength 1 + 2 * nlength 1" using nlength_mono by simp also have "... = 70 + 6 * nlength ?ll + 67 * (nllength vars)⇧^{2}" using nlength_1_simp by simp also have "... ≤ 70 + 6 * nllength (clause_n clause) + 67 * (nllength vars)⇧^{2}" using assms(2) member_le_nllength by simp finally have "?t ≤ ttt" using assms(1) by simp then show ?thesis using assms tmL8 transforms_monotone by blast qed definition tpsL8' :: "nat ⇒ tape list" where "tpsL8' t ≡ tps0 [j2 := nltape' (clause_n clause) (Suc t), j3 := (⌊sat_take