Theory EigbyzDefs
theory EigbyzDefs
imports "../HOModel"
begin
section ‹Verification of the \eigbyz{} Consensus Algorithm›
text ‹
Lynch~\<^cite>‹"lynch:distributed"› presents \eigbyz{}, a version of
the \emph{exponential information gathering} algorithm tolerating
Byzantine faults, that works in $f$ rounds, and that was originally
introduced in~\<^cite>‹"bar-noy:shifting-gears"›.
We begin by introducing an anonymous type of processes of finite
cardinality that will instantiate the type variable ‹'proc›
of the generic HO model.
›
typedecl Proc
axiomatization where Proc_finite: "OFCLASS(Proc, finite_class)"
instance Proc :: finite by (rule Proc_finite)
abbreviation
"N ≡ card (UNIV::Proc set)"
text ‹
The algorithm is parameterized by $f$, which represents the number of
rounds and the height of the tree data structure (see below).
›
axiomatization f::nat
where f: "f < N"
subsection ‹Tree Data Structure›
text ‹
The algorithm relies on propagating information about the initially proposed
values among all the processes. This information is stored in trees whose
branches are labeled by lists of (distinct) processes. For example, the
interpretation of an entry ‹[p,q] ↦ Some v› is that the current
process heard from process ‹q› that it had heard from process ‹p›
that its proposed value is ‹v›. The value initially proposed by the process
itself is stored at the root of the tree.
We introduce the type of \emph{labels}, which encapsulate lists of distinct
process identifiers and whose length is at most ‹f+1›.
›
definition "Label = {xs::Proc list. length xs ≤ Suc f ∧ distinct xs}"
typedef Label = Label
by (auto simp: Label_def intro: exI[where x= "[]"])
text ‹There is a finite number of different labels.›
lemma finite_Label: "finite Label"
proof -
have "Label ⊆ {xs. set xs ⊆ (UNIV::Proc set) ∧ length xs ≤ Suc f}"
by (auto simp: Label_def)
moreover
have "finite {xs. set xs ⊆ (UNIV::Proc set) ∧ length xs ≤ Suc f}"
by (rule finite_lists_length_le) auto
ultimately
show ?thesis by (auto elim: finite_subset)
qed
lemma finite_UNIV_Label: "finite (UNIV::Label set)"
proof -
from finite_Label have "finite (Abs_Label ` Label)" by simp
moreover
{
fix l::Label
have "l ∈ Abs_Label ` Label"
by (rule Abs_Label_cases) auto
}
hence "(UNIV::Label set) = (Abs_Label ` Label)" by auto
ultimately show ?thesis by simp
qed
lemma finite_Label_set [iff]: "finite (S :: Label set)"
using finite_UNIV_Label by (auto intro: finite_subset)
text ‹Utility functions on labels.›
definition root_node where
"root_node ≡ Abs_Label []"
definition length_lbl where
"length_lbl l ≡ length (Rep_Label l)"
lemma length_lbl [intro]: "length_lbl l ≤ Suc f"
unfolding length_lbl_def using Label_def Rep_Label by auto
definition is_leaf where
"is_leaf l ≡ length_lbl l = Suc f"
definition last_lbl where
"last_lbl l ≡ last (Rep_Label l)"
definition butlast_lbl where
"butlast_lbl l ≡ Abs_Label (butlast (Rep_Label l))"
definition set_lbl where
"set_lbl l = set (Rep_Label l)"
text ‹
The children of a non-leaf label are all possible extensions of that label.
›
definition children where
"children l ≡
if is_leaf l
then {}
else { Abs_Label (Rep_Label l @ [p]) | p . p ∉ set_lbl l }"
subsection ‹Model of the Algorithm›
text ‹
The following record models the local state of a process.
›
record 'val pstate =
vals :: "Label ⇒ 'val option"
newvals :: "Label ⇒ 'val"
decide :: "'val option"
text ‹
Initially, no values are assigned to non-root labels, and an arbitrary value
is assigned to the root: that value is interpreted as the initial proposal of
the process. No decision has yet been taken, and the ‹newvals› field
is unconstrained.
›
definition EIG_initState where
"EIG_initState p st ≡
(∀l. (vals st l = None) = (l ≠ root_node))
∧ decide st = None"
type_synonym 'val Msg = "Label ⇒ 'val option"
text ‹
At every round, every process sends its current ‹vals› tree to all processes.
In fact, only the level of the tree corresponding to the round number
is used (cf. definition of ‹extend_vals› below).
›
definition EIG_sendMsg where
"EIG_sendMsg r p q st ≡ vals st"
text ‹
During the first ‹f-1› rounds, every process extends its
tree ‹vals› according to the values received in the round.
No decision is taken.
›
definition extend_vals where
"extend_vals r p st msgs st' ≡
vals st' = (λ l.
if length_lbl l = Suc r ∧ msgs (last_lbl l) ≠ None
then (the (msgs (last_lbl l))) (butlast_lbl l)
else if length_lbl l = Suc r ∧ msgs (last_lbl l) = None then None
else vals st l)"
definition next_main where
"next_main r p st msgs st' ≡ extend_vals r p st msgs st' ∧ decide st' = None"
text ‹
In the final round, in addition to extending the tree as described previously,
processes construct the tree ‹newvals›, starting
at the leaves. The values at the leaves are copied from ‹vals›,
except that missing values ‹None› are replaced by the default value
‹undefined›. Moving up, if there exists a majority value among the
children, it is assigned to the parent node, otherwise the parent node
receives the default value ‹undefined›. The decision is set to the
value computed for the root of the tree.
›
fun fixupval :: "'val option ⇒ 'val" where
"fixupval None = undefined"
| "fixupval (Some v) = v"
definition has_majority :: "'val ⇒ ('a ⇒ 'val) ⇒ 'a set ⇒ bool" where
"has_majority v g S ≡ card {e ∈ S. g e = v} > (card S) div 2"
definition check_newvals :: "'val pstate ⇒ bool" where
"check_newvals st ≡
∀l. is_leaf l ∧ newvals st l = fixupval (vals st l)
∨ ¬(is_leaf l) ∧
( (∃w. has_majority w (newvals st) (children l) ∧ newvals st l = w)
∨ (¬(∃w. has_majority w (newvals st) (children l))
∧ newvals st l = undefined))"
definition next_end where
"next_end r p st msgs st' ≡
extend_vals r p st msgs st'
∧ check_newvals st'
∧ decide st' = Some (newvals st' root_node)"
text ‹
The overall next-state relation is defined such that every process applies
‹nextMain› during rounds ‹0›, \ldots, ‹f-1›, and applies
‹nextEnd› during round ‹f›. After that, the algorithm terminates
and nothing changes anymore.
›
definition EIG_nextState where
"EIG_nextState r ≡
if r < f then next_main r
else if r = f then next_end r
else (λp st msgs st'. st' = st)"
subsection ‹Communication Predicate for \eigbyz›
text ‹
The secure kernel ‹SKr› w.r.t. given HO and SHO collections consists
of the process from which every process receives the correct message.
›
definition SKr :: "Proc HO ⇒ Proc HO ⇒ Proc set" where
"SKr HO SHO ≡ { q . ∀p. q ∈ HO p ∩ SHO p}"
text ‹
The secure kernel ‹SK› of an entire execution (i.e., for sequences of
HO and SHO collections) is the intersection of the secure kernels for
all rounds. Obviously, only the first ‹f› rounds really matter,
since the algorithm terminates after that.
›
definition SK :: "(nat ⇒ Proc HO) ⇒ (nat ⇒ Proc HO) ⇒ Proc set" where
"SK HOs SHOs ≡ {q. ∀r. q ∈ SKr (HOs r) (SHOs r)}"
text ‹
The round-by-round predicate requires that the secure kernel at every round
contains more than ‹(N+f) div 2› processes.
›
definition EIG_commPerRd where
"EIG_commPerRd HO SHO ≡ card (SKr HO SHO) > (N + f) div 2"
text ‹
The global predicate requires that the secure kernel for the entire
execution contains at least ‹N-f› processes. Messages from these
processes are always correctly received by all processes.
›
definition EIG_commGlobal where
"EIG_commGlobal HOs SHOs ≡ card (SK HOs SHOs) ≥ N - f"
text ‹
The above communication predicates differ from Lynch's presentation of
\eigbyz{}. In fact, the algorithm was originally designed for synchronous
systems with reliable links and at most ‹f› faulty processes.
In such a system, every process receives the correct message from at least
the non-faulty processes at every round, and therefore the global predicate
‹EIG_commGlobal› is satisfied. The standard correctness
proof assumes that $N>3f$, and therefore $N - f > (N + f) \div 2$.
Since moreover, for any $r$, we obviously have
\[
\bigg(\bigcap_{p \in \Pi, r' \in \nat}\!\!\!\! SHO(p,r')\bigg)
\ \subseteq\ \bigg(\bigcap_{p \in \Pi} SHO(p,r)\bigg),
\]
it follows that any execution of \eigbyz{} where $N>3f$ also satisfies
‹EIG_commPerRd› at any round. The standard correctness hypotheses thus imply
our communication predicates.
However, our proof shows that \eigbyz{} can indeed tolerate more
transient faults than the standard bound can express. For example, consider the
case where $N=5$ and $f=2$. Our predicates are satisfied in
executions where two processes exhibit transient faults, but never fail
simultaneously. Indeed, in such an execution, every process receives four
correct messages at every round, hence ‹EIG_commPerRd› always holds.
Also, ‹EIG_commGlobal› is satisfied because there are
three processes from which every process receives the correct messages at all
rounds. By our correctness proof, it follows that $EIGByz_f$ then
achieves Consensus, unlike what one could expect from the standard correctness
predicate. This observation underlines the interest of expressing assumptions
about transient faults, as in the HO model.
›
subsection ‹The \eigbyz{} Heard-Of Machine›
text ‹
We now define the non-coordinated SHO machine for \eigbyz{} by assembling
the algorithm definition and its communication-predicate.
›
definition EIG_SHOMachine where
"EIG_SHOMachine = ⦇
CinitState = (λ p st crd. EIG_initState p st),
sendMsg = EIG_sendMsg,
CnextState = (λ r p st msgs crd st'. EIG_nextState r p st msgs st'),
SHOcommPerRd = EIG_commPerRd,
SHOcommGlobal = EIG_commGlobal
⦈"
abbreviation "EIG_M ≡ (EIG_SHOMachine::(Proc, 'val pstate, 'val Msg) SHOMachine)"
end