Theory AutoCorresInfrastructure

(*
 * Copyright (c) 2022 Apple Inc. All rights reserved.
 *
 * SPDX-License-Identifier: BSD-2-Clause
 *)

chapter ‹Overview of AutoCorres \label{chap:AutoCorresInfrastructure}›

theory AutoCorresInfrastructure
imports AutoCorres
begin



text ‹
This theory elaborates on some of the (internal) AutoCorres infrastructure and is also supposed to
act as a testbench for this infrastructure, e.g. simplifier setup.

Following Isabelle tradition, user-relevant changes to AutoCorres are described in
the file 🗏‹../NEWS›.
›

section ‹Building Blocks›

text ‹AutoCorres has the following major building blocks and contributions.

\<^item> The C-parser (by Michael Norrish) takes C input files and parses them to SIMPL (by Norbert Schirmer) programs within Isabelle/HOL:
     C-parser: \autoref{chap:strictc}
     SIMPL: @{url ‹https://www-wjp.cs.uni-saarland.de/leute/private_homepages/nschirmer/pub/schirmer_phd.pdf›}
\<^item> The unified memory model (UMM) (by Harvey Tuch) models C types as HOL types with conversions between
  the typed view and the raw byte-level view (see the sketch after this block). Moreover, it provides a separation logic framework.
     UMM: @{url ‹https://trustworthy.systems/publications/papers/Tuch%3Aphd.pdf›}
\<^item> AutoCorres (by David Greenaway): @{url ‹https://trustworthy.systems/publications/nicta_full_text/8758.pdf›}
›
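text ‹A minimal sketch (with hypothetical names, not the actual UMM type classes) of the UMM idea:
a C type corresponds to a HOL type together with conversions to and from its raw byte
representation, where decoding inverts encoding:›

locale byte_repr_sketch =
  fixes to_bytes :: "'a ⇒ 8 word list"
    and from_bytes :: "8 word list ⇒ 'a"
  assumes from_to [simp]: "from_bytes (to_bytes v) = v"
begin

text ‹Encoding into bytes is injective, so no information is lost in the raw view.›
lemma inj_to_bytes: "inj to_bytes"
  by (rule inj_on_inverseI [where g = from_bytes]) simp

end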
text ‹
Some historic remarks that might give some insight into why things are as they are.
The motivation for AutoCorres was the C verification tasks coming from the seL4 project
@{url ‹https://sel4.systems›}, a verified microkernel, written in C.

The project started with high-level HOL specifications, refining them to monadic functions (within
HOL and Haskell) and then refining these to C code. The nondeterministic state monad
goes back to Cock et al. 🌐‹http://www.cse.unsw.edu.au/~kleing/papers/Cock_KS_08.pdf›; a minimal
sketch of such a monad is given below.
To link the C code with the monadic functions
within Isabelle/HOL the C-parser was written. The C-parser produces SIMPL programs, and
SIMPL is equipped with various semantics (big- and small-step) and a Hoare logic framework including
a verification condition generator.
The correspondence between the C functions represented in SIMPL and the monadic functions was manually
expressed within this framework.

AutoCorres was then built after (or in parallel with) this verification effort in order to speed
up the process for future projects.

While the main concepts described in the publications above are still valid, a lot
of things have evolved and been extended. In particular, the implementation details have
changed. In this document we also give some notes on these differences. Also see
🗏‹../NEWS›.
›
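text ‹A minimal sketch (with hypothetical names, not the actual definitions of the paper or of
AutoCorres) of such a nondeterministic state monad: a computation maps a state to a set of
result/state pairs together with a failure flag.›

experiment
begin

definition nd_return :: "'a ⇒ 's ⇒ ('a × 's) set × bool"
  where "nd_return x = (λs. ({(x, s)}, False))"

definition nd_bind ::
    "('s ⇒ ('a × 's) set × bool) ⇒ ('a ⇒ 's ⇒ ('b × 's) set × bool) ⇒ 's ⇒ ('b × 's) set × bool"
  where
    ― ‹collect all results of the continuation; fail if either part may fail›
    "nd_bind m f = (λs. ((⋃(a, t) ∈ fst (m s). fst (f a t)),
                         snd (m s) ∨ (∃(a, t) ∈ fst (m s). snd (f a t))))"

text ‹The left-unit law of the monad:›

lemma "nd_bind (nd_return x) f = f x"
  by (simp add: nd_return_def nd_bind_def fun_eq_iff prod_eq_iff)

end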

section ‹C Parser›

text ‹
Some remarks on the C-Parser:
\<^item> Lexical and syntactical analysis is implemented using ML-Lex / ML-Yacc coming from the MLton project
  🌐‹http://mlton.org/›. Originally MLton was used to generate the ML files from the grammars.
  Meanwhile this is all integrated into Isabelle/ML.
\<^item> The C parser first generates an ML data structure representing the program. It then does some
  transformations on that data structure (e.g. storing results of nested function calls into temporary
  variables), before translating it to SIMPL, in roughly the following steps:
   \<^item> Generating HOL types for the C types.
      The C parser performs a complete program analysis to determine the types used.
      It defines the types and performs the UMM proofs, in particular the conversions to and
      from raw byte lists and their properties.
   \<^item> Defining the state-space type: global variables, local variables, heap variables. The C parser
     does a complete program analysis of how global and heap variables are used, to decide whether
     to model them as part of the globals record or as part of the heap: when the address of a
     variable is never taken, it is stored as part of the global variable record, otherwise it is
     stored within the heap. Global variables are more convenient to reason about than heap variables.
   \<^item> Defining the SIMPL procedures.
   \<^item> Proving 'modifies' specifications for the procedures, i.e. frame conditions on global variables.
›

section ‹AutoCorres›

text ‹AutoCorres abstracts the SIMPL program to a monadic HOL function in various phases, producing
correspondence (a.k.a. simulation, a.k.a. refinement) theorems along the way. Note that the translation
(and the correspondence theorems) might only be partial in the following sense: a phase might
introduce additional guards into the code. Simulation only holds for programs that do not fail
on the abstract execution. Note that non-termination is currently also modelled as failure, so
simulation only holds for terminating abstract programs:
   Correspondence relation: @{const ac_corres}

Each phase is processed in similar stages:
\<^item> Raw translation and refinement proof.
\<^item> Optimisation of the output program, sometimes distinguished between a more lightweight
  "peephole" optimisation and a more heavyweight "flow" optimisation.
\<^item> Definition of a constant for the output of the translation for that phase.

The phases are:
\<^item> From SIMPL (deep embedding of statements, shallow embedding of expressions) to L1 (shallow
  embedding of statements and expressions). The transformation preserves the state-space representation
  and merely focuses on translating SIMPL statements to monadic functions:
   Definition of L1: 🗏‹../L1Defs.thy›
   Correspondence relation: @{const L1corres}
   Peephole optimisation: 🗏‹../L1Peephole.thy›
   Flow optimisation: 🗏‹../ExceptionRewrite.thy›
   Main ML file: 🗏‹../simpl_conv.ML›
\<^item> Local variable abstraction, from L1 to L2. In L2, local variables are represented as lambda
  abstractions. So the underlying state-space type changes, getting rid of the local variable record.
  As local variables are now treated as lambda abstractions, and thus names become meaningless,
  AutoCorres keeps the original names around as part of (logically redundant) name annotations within
  the L2 constants, e.g. @{term "L2_gets (λ_. x) [c_source_name_hint]"}. Lists of names are
  used, as variables might pile up in tuples. These annotations are removed in the final
  polishing phase of AutoCorres, attempting to rename the bound variables accordingly on the fly.
   Definition of L2: 🗏‹../L2Defs.thy›
   Correspondence relation: @{const L2corres}
   Exception rewriting, introducing nested exceptions: 🗏‹../L2ExceptionRewrite.thy›
   Peephole optimisation: 🗏‹../L2Peephole.thy›
   Main ML files: 🗏‹../local_var_extract.ML›, 🗏‹../l2_opt.ML›
  From now on we stay in the language L2, and also the optimisation stages are reused.
\<^item> In/Out parameter abstraction (IO). Pointer parameters are replaced by passing value parameters.
  This step may eliminate pointers to local variables.
   Theory: 🗏‹../In_Out_Parameters.thy›
   Correspondence relation: @{const IOcorres}
   Documentation: 🗏‹In_Out_Parameters_Ex.thy›
\<^item> Heap abstraction, referred to as heap lifting (HL).
  The monolithic, byte-oriented heap is abstracted to a typed split-heap model. For
  that the new type ‹lifted_globals› is introduced. Note that the separation logic of Tuch also
  attempts to give a typed view on top of the byte-level heap. In case of nested structure types
  it is even capable of expressing the various levels of typed-heap views, from bytes to individual
  structure fields to (nested) structures. AutoCorres takes a simpler, more pragmatic approach here
  and only provides a single split-heap abstraction on the level of complete types. This means there
  is a heap for every compound type and for every primitive type, but not for nested
  structures. However, AutoCorres allows mixing the abstraction layers (split heap vs. byte heap)
  at the granularity of functions.
  So you can model low-level functions like memory allocators on the untyped byte heap and then still
  use these functions within other functions in the split-heap abstraction.
  Meanwhile, genuine support for a split-heap approach with addressable nested structures was added. It is
  described in 🗏‹open_struct.thy›.
   Main theory: 🗏‹../HeapLift.thy›
   Correspondence relation: @{const L2Tcorres}
   Main ML files: 🗏‹../heap_lift_base.ML›, 🗏‹../heap_lift.ML›
\<^item> Word abstraction (WA): (un)signed n-bit words are converted to @{typ int} or @{typ nat}. This
  conversion only affects local variables (lambda abstractions), so the underlying state-space type
  is maintained. When reading from and storing to memory the values are converted accordingly
  (a small illustration of why such a conversion is sound follows after this list).
   Main theory: 🗏‹../WordAbstract.thy›
   Correspondence relation: @{const corresTA}
   Main ML file: 🗏‹../word_abstract.ML›
\<^item> Type strengthening (TS), sometimes also referred to as type lifting or type specialisation:
  a best-effort approach to simplify the structure of the monad as far as possible:
   \<^item> Pure functions.
   \<^item> Reader monad: read-only, state-dependent functions.
   \<^item> Option (reader) monad, in particular for potentially non-terminating functions (recursion / while).
     Non-termination is treated as failure in AutoCorres, and failure of guards is
     represented by the result @{term None}.
   \<^item> Nondeterministic state monad (exceptional control flow confined within function boundaries).
     Failure (of guards) and non-termination are represented by @{const Failure} as outcome.
   \<^item> Nondeterministic state monad with exit (exceptional control flow crossing function boundaries).
  New monad types can be registered within Isabelle/ML.
   Main theories: 🗏‹../TypeStrengthen.thy›, 🗏‹../Refines_Spec.thy›
   Correspondence relation: @{const refines}. Originally this was
    based on equality. Since (mutually) recursive functions are now implemented via the CCPO-based
    @{command fixed_point} package instead of an explicit measure parameter, we switched to
    simulation: equality is not @{const ccpo.admissible}, whereas simulation is.
   Main ML files: 🗏‹../type_strengthen.ML›, 🗏‹../monad_types.ML›, 🗏‹../monad_convert.ML›
\<^item> Polish. Performed as part of type strengthening. Here the remaining special and limited L2 definitions
  are expanded to 'ordinary' monadic functions. Also, the annotations with original C variable names are
  removed and bound variables are named accordingly.
›
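text ‹For instance, a no-overflow guard is what makes abstracting unsigned word addition to
@{typ nat} addition sound. The following standard word lemma (just an illustration, not the
actual AutoCorres rule set) captures this:›

experiment
begin

lemma "unat x + unat y < 2 ^ 32 ⟹ unat (x + y) = unat (x::32 word) + unat y"
  ― ‹word addition is addition modulo ‹2 ^ 32›; the guard makes the ‹mod› vanish›
  by (simp add: unat_word_ariths)

end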


section ‹AutoCorres Flow›

text ‹
The original AutoCorres implementation of David Greenaway was refined in several ways. In particular,
from a high-level perspective, the concepts of incremental build and parallel processing were
unified:
\<^item> Parallel processing is moved to the outermost invocation of AutoCorres. All tasks
  are calculated respecting the dependencies of the call graph and the various AutoCorres phases, cf.
  @{ML AutoCorres.parallel_autocorres}.
\<^item> Every task then invokes a distinct call to @{ML AutoCorres.do_autocorres}, exactly specifying
  the scope (which function clique) and which phase via the options.
\<^item> The results of an invocation are maintained in generic data, such that they are available to
  subsequent dependent calls.

The common parts of each phase are implemented in @{ML_structure AutoCorresUtil}.
The last phase, type strengthening (TS), historically played a special role and still does not
make much use of @{ML_structure AutoCorresUtil}, but conceptually it follows the same idea:

Functions are processed in a sequence of cliques, from the bottom of the call graph to the top.
Here a clique is either a single function or a group of mutually recursive functions (a strongly
connected component of the call graph).
After processing a phase, the clique might get split due to dead code elimination. This
general idea is implemented in @{ML AutoCorresUtil.convert_and_define_cliques}.
›

subsection ‹General Remarks›

text ‹
\<^item> The main AutoCorres files, putting everything together:
  🗏‹../AutoCorres.thy›, 🗏‹../autocorres.ML›
\<^item> There are two main approaches within a single phase to come up with an abstract program (output
  of the phase) and the correspondence proof to the concrete program (input of the phase):
   \<^item> Implicit synthesis of the abstract program from the concrete program by applying "intro" rules,
     where the abstract program is initialised with a schematic variable. Prototypical examples are
     the heap abstraction, word abstraction and type strengthening. (A toy illustration of this
     schematic-variable style follows after this list.)
   \<^item> First explicitly calculate the abstract program (fragment) within ML from the
     concrete program (fragment), and then perform the correspondence proof. A prototypical
     example is the local variable abstraction.
\<^item> The common aspects of the various stages within a single phase are (partially) generalised into
  library functions: 🗏‹../autocorres_util.ML›
›
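text ‹A toy illustration (plain HOL, unrelated to the actual AutoCorres rules) of the first
approach: the right-hand side is a schematic variable ‹?b› standing for the not-yet-known
result, which gets instantiated as a side effect of applying rules.›

experiment
begin

schematic_goal "x < (2::nat) ⟹ x < ?b"
  apply (erule less_trans) ― ‹leaves the subgoal ‹2 < ?b››
  apply (rule lessI)       ― ‹instantiates ‹?b› with ‹Suc 2››
  done

end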

subsection ‹Links to more documentation / examples›
text ‹
\<^item> Pointers to local variables: 🗏‹pointers_to_locals.thy›
\<^item> In-Out parameters: 🗏‹In_Out_Parameters_Ex.thy›
\<^item> Addressable fields and open types: 🗏‹open_struct.thy›
\<^item> Function pointers: 🗏‹fnptr.thy›
\<^item> Unions: 🗏‹union_ac.thy›
›

section ‹Overview on the Locales›

text ‹The representation of local variables in SIMPL and L1 was changed from records
to 'statespaces' as implemented by @{command statespace}, and finally to
positional variables as in ‹locals› in 🗏‹../c-parser/CLocals.thy›.
This allows for a uniform representation of local variables as a function from
positions (naturals) to byte lists; a minimal sketch of this idea is given below. Typing is
achieved by ML data organised in locales and bundles,
cf. 🗏‹../c-parser/CTranslationInfrastructure.thy›.
These are also used to do the L1 and the L2 correspondence proofs. From
L2 on, local variables are represented as lambda bindings.

We describe the various locales (and bundles) later on with our running examples.

To support function pointers even more locales were introduced. See 🗏‹fnptr.thy›.
›
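text ‹A minimal sketch (with hypothetical names, not the actual CLocals encoding) of positional
local variables: the variable store maps positions to raw byte lists; typed access is recovered
by per-variable encode / decode functions.›

experiment
begin

definition upd_local :: "nat ⇒ ('a ⇒ 8 word list) ⇒ 'a ⇒ (nat ⇒ 8 word list) ⇒ (nat ⇒ 8 word list)"
  where "upd_local p enc v vs = vs(p := enc v)"

definition get_local :: "nat ⇒ (8 word list ⇒ 'a) ⇒ (nat ⇒ 8 word list) ⇒ 'a"
  where "get_local p dec vs = dec (vs p)"

text ‹Reading back a just-written variable recovers the typed value, provided decoding
inverts encoding:›

lemma "(⋀x. dec (enc x) = x) ⟹ get_local p dec (upd_local p enc v vs) = v"
  by (simp add: get_local_def upd_local_def)

end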



section ‹Example Program›

install_C_file "autocorres_infrastructure_ex.c"
print_theorems
thm upd_lift_simps
thm fl_ti_simps

text ‹The variables consist of parameters (input and output) and the local variables. Note that
this is merely an ML declaration of the scope, which can be accessed via a bundle,
cf. 🗏‹../c-parser/CTranslationInfrastructure.thy›.
›
context includes add_variables
begin
term "´n :== 42"
end

text ‹Addresses of global variables or function pointers are kept in the ‹global_addresses› locale.›
print_locale autocorres_infrastructure_ex_global_addresses

text ‹This is everything necessary to define the body of a function. All bodies 
are defined in that locale.›
context autocorres_infrastructure_ex_global_addresses
  opening add_variables
begin
term add_body
thm add_body_def
end

text ‹The implementation locale holds the defining equation of the function in SIMPL and is closed
under callees. ›

context call_add_impl
begin
thm add_impl
thm call_add_impl
end

text ‹In case of a clique there is a clique locale and aliases for each function.›
print_locale even_odd_impl
print_locale even_impl
print_locale odd_impl

text ‹The locale importing all implementations is named like the file with suffix ‹_simpl›.›

print_locale "autocorres_infrastructure_ex_simpl"

subsection ‹Incremental Build›

autocorres [phase=L1, scope=add] "autocorres_infrastructure_ex.c"
autocorres [phase=L2, scope=add] "autocorres_infrastructure_ex.c"
autocorres [phase=HL, scope=add] "autocorres_infrastructure_ex.c"
autocorres [phase=WA, scope=add] "autocorres_infrastructure_ex.c"
autocorres [phase=TS, scope=add] "autocorres_infrastructure_ex.c"

subsection ‹All the rest›

autocorres "autocorres_infrastructure_ex.c"

context unsigned_to_signed_impl
begin
thm unsigned_to_signed_body_def
end
context autocorres_infrastructure_ex_all_corres 
begin
thm unsigned_to_signed'_def
text ‹The SIMPL versions produced by the C-parser.›

declare [[clocals_short_names]]
thm max_body_def
thm add_body_def
term "add_body::((globals, locals, 32 signed word) CProof.state, unit ptr, strictc_errortype) com"
thm call_add_body_def
thm seq_assign_body_def
thm call_seq_assign_body_def
thm inc_g_body_def
thm deref_body_def
thm deref_g_body_def
thm factorial_body_def
thm call_factorial_body_def
thm odd_body_def
thm even_body_def
thm dead_f_body_def
thm dead_h_body_def
thm dead_g_body_def

text ‹L1 versions. Monadic with the same state.
Note that recursive procedures are defined with the @{command fixed_point} package and hence
the ‹.simps› instead of the ‹_def› makes more sense to look at.
In this layer a first exception elimination optimisation takes place. The optimised definitions
carry an additional ‹opt› in their names.
›

declare [[show_types=false]]
thm l1_max'_def
thm l1_opt_max'_def

term "l1_add'::(unit, unit, (globals, locals, 32 signed word) CProof.state) exn_monad"
thm l1_add'_def
thm l1_opt_add'_def

thm l1_call_add'_def
thm l1_opt_call_add'_def

thm l1_seq_assign'_def
thm l1_opt_seq_assign'_def

thm l1_call_seq_assign'_def
thm l1_opt_call_seq_assign'_def

thm l1_inc_g'_def
thm l1_opt_inc_g'_def

thm l1_deref'_def
thm l1_opt_deref'_def

thm l1_deref_g'_def
thm l1_opt_deref_g'_def

thm l1_factorial'.simps
thm l1_opt_factorial'_def

thm l1_call_factorial'_def
thm l1_opt_call_factorial'_def

thm l1_odd'.simps
thm l1_opt_odd'_def

thm l1_even'.simps
thm l1_opt_even'_def

thm l1_dead_f'.simps
thm l1_opt_dead_f'_def

thm l1_dead_h'.simps
thm l1_opt_dead_h'_def

thm l1_dead_g'.simps
thm l1_opt_dead_g'_def


text ‹L2 version: lambda bindings for local variables›

thm l2_max'_def
term "l2_add'::32 signed word  32 signed word
       (32 signed word c_exntype, 32 signed word, globals) exn_monad"
thm l2_add'_def
thm l2_call_add'_def
thm l2_seq_assign'_def
thm l2_call_seq_assign'_def
thm l2_inc_g'_def
thm l2_deref'_def
thm l2_deref_g'_def
thm l2_factorial'.simps
thm l2_call_factorial'_def
thm l2_odd'.simps
thm l2_even'.simps
thm l2_dead_f'.simps
thm l2_dead_h'.simps
thm l2_dead_g'_def ―‹Note that @{term l2_dead_g'} is no longer part of the clique, because of dead code elimination.›


text ‹Heap abstraction. Note the change in the heap component of the global variables: the
type changes to @{typ lifted_globals} instead of @{typ globals}.›

print_record globals ― ‹contains the byte-level tagged heap: @{term "t_hrs_'"}›
print_record lifted_globals ― ‹contains the split heap with the single component @{term heap_w32},
                                as we only have pointers to unsigned.›

thm hl_max'_def
term "hl_add':: 32 signed word  32 signed word 
       (32 signed word c_exntype, 32 signed word, lifted_globals) exn_monad"
thm hl_add'_def
thm hl_call_add'_def
thm hl_seq_assign'_def
thm hl_call_seq_assign'_def
thm hl_inc_g'_def
thm hl_deref'_def
thm hl_deref_g'_def
thm hl_factorial'.simps
thm hl_call_factorial'_def
thm hl_odd'.simps
thm hl_even'.simps
thm hl_dead_f'.simps
thm hl_dead_h'.simps
thm hl_dead_g'_def 


text ‹Word abstraction.›
thm wa_max'_def
term "wa_add'::int  int  (32 signed word c_exntype, int, lifted_globals) exn_monad"
thm wa_add'_def
thm wa_call_add'_def
thm wa_seq_assign'_def
thm wa_call_seq_assign'_def
thm wa_inc_g'_def
thm wa_deref'_def
thm wa_deref_g'_def
thm wa_factorial'.simps
thm wa_call_factorial'_def
thm wa_odd'.simps
thm wa_even'.simps
thm wa_dead_f'.simps
thm wa_dead_h'.simps
thm wa_dead_g'_def 


text ‹Final definition.›
term "max'::int  int  int"
thm max'_def
term "add':: int  int  lifted_globals  int option"
thm add'_def
thm call_add'_def
thm seq_assign'_def
thm call_seq_assign'_def
thm inc_g'_def
thm deref'_def
thm deref_g'_def
thm factorial'.simps
thm call_factorial'_def
thm odd'.simps
thm even'.simps
thm dead_f'.simps
thm dead_h'.simps
thm dead_g'_def
text ‹Correspondence theorem.›
thm factorial'_ac_corres
thm call_factorial'_ac_corres

end

section ‹Simplification strategy and dealing with tuples in L2-optimization phases›

text ‹Consider a congruence rule from AutoCorres›

lemma
  assumes c_eq: "⋀r s. c r s = c' r s"
  assumes bdy_eq: "⋀r s. c' r s ⟹ run (A r) s = run (A' r) s"
  shows "L2_while c A = L2_while c' A'"
  using c_eq bdy_eq
  by (rule L2_while_cong)

text ‹
Note that @{term r} could be a tuple and we would like to "split" it. The condition @{term c} will
already be formulated with @{const case_prod}, e.g. @{term "λ(x,y,z) s. x < y"}.

When the congruence rule above is applied, the variable ‹?c'› has to be instantiated at some point.
First the congruence rule is applied as is. Then we use @{thm [source] split_paired_all} to split up
@{term r}. For the example we will end up with the subgoal:

 ‹⋀a b c s. (a < b) = ?c' (a, b, c) s›

Mere unification will fail here, as it does not work "modulo" @{const case_prod}. To find the proper
instantiation we developed @{ML Tuple_Tools.tuple_inst_tac}, which can be added to the simplifier as
a looper. It will instantiate ‹?c'› with the respective @{const case_prod} instance,
introducing a new schematic variable: ‹?c' = (λ(a, b, c). ?f a b c)›. Now the unification
will kick in and do the rest of the work.

However, although an instantiation is now found, it does not have the expected effect
on ‹bdy_eq›. The problem is that the premise of the implication is not simplified as it is added
to the "prems" of the simplifier: it will be ‹(λ(a, b, c) s. a < b) (a, b, c) s ≡ True› and not
just ‹a < b ≡ True›. The former format is unfortunately ineffective in the simplification of
the while body.

Fortunately the simpset can be instructed to apply a function to the premise before it is added.
So adding @{ML Tuple_Tools.mksimps} with @{ML Simplifier.set_mksimps} helps in that respect. As
the simplifier descends into the term it calls the function to do "beta-reduction" of
tuple applications before adding the premise.
›

lemma "L2_while (λ(x,y,z) s. 0 < (y::nat)) (λ(x,y,z). L2_seq (L2_guard (λ_. 0 < y)) (λ_. X)) x ns 
       = 
       L2_while (λ(x,y,z) s. 0 < y) (λ(x,y,z). L2_seq (L2_guard (λ_. True)) (λ_. X)) x ns"
 apply (tactic simp_tac ( @{context} 
           addloop ("tuple_inst_tac", Tuple_Tools.tuple_inst_tac)
           addsimps @{thms split_paired_all}
           |> Simplifier.add_cong @{thm L2_seq_guard_cong}
           |> Simplifier.add_cong @{thm L2_while_cong}
           |> Simplifier.set_mksimps (Tuple_Tools.mksimps)
) 1)
  done

text ‹
Unfortunately there is still an issue. Linear arithmetic is applied as a simproc by the simplifier,
e.g.
@{lemma "0 < (n::nat) ⟹ Suc 0 ≤ n" by simp}

However, this does not work as expected in the setup above:
›

unbundle clocals_string_embedding

lemma "L2_while (λ(x,y,z) s. 0 < (y::nat)) (λ(x,y,z). L2_seq (L2_guard (λ_. Suc 0  y  z = a)) (λ_. X)) x ns 
       = 
       L2_while (λ(x,y,z) s. 0 < y) (λ(x,y,z). L2_seq (L2_guard (λ_. z = a)) (λ_. X)) x ns"
apply (tactic simp_tac ( @{context} 
           addloop ("tuple_inst_tac", Tuple_Tools.tuple_inst_tac)
           addsimps @{thms split_paired_all}
           |> Simplifier.add_cong @{thm L2_seq_guard_cong}
           |> Simplifier.add_cong @{thm L2_while_cong}
           |> Simplifier.set_mksimps (Tuple_Tools.mksimps)
) 1)
  oops

  text ‹After some investigation into the interplay between the simplifier and the linear arithmetic
simproc it seems that linear arithmetic somehow ignores the transformed theorem that
@{ML Tuple_Tools.mksimps} yields. Without understanding all the details, I guess the reason is
the mismatch between the term in the hyps and the prop of the resulting theorem. As the simplifier
descends into an implication ‹P ⟹ Q› it generates a theorem from ‹P›, by also adding
P to the hyps, so we have ‹P [P]› as a theorem. When @{ML Tuple_Tools.mksimps} kicks in it only
simplifies the prop, not the hyps. So we end up with
   ‹a < b ≡ True [(λ(a, b, c) s. a < b) (a, b, c) s ≡ True]›
instead of
   ‹a < b ≡ True [a < b ≡ True]›.
This seems to irritate the linear arithmetic.
One solution could be to also simplify the hyps. We did not explore this path, as I would expect
some issues when the hyps are eventually discharged. Nevertheless there could be a solution
following that path.

We came up with a different solution. We get rid of the dependency on @{ML Tuple_Tools.mksimps} by
using ‹P =simp=> Q› to trigger simplification of P.
›

text ‹Note the various options to trace the simplification process.›
declare [[linarith_trace=false]]
declare [[simp_trace=false, simp_trace_depth_limit=100]]
declare [[simp_debug=false]]
declare [[show_hyps=false]]

lemma  
  assumes c_eq: "⋀r s. c r s = c' r s"
  assumes bdy_eq: "⋀r s. c' r s =simp=> run (A r) s = run (A' r) s"
  shows "L2_while c A = L2_while c' A' "
  using c_eq bdy_eq by (rule L2_while_simp_cong)

lemma "L2_while (λ(x,y,z) s. 0 < (y::nat)) (λ(x,y,z). L2_seq (L2_guard (λ_. Suc 0  y  z = a)) (λ_. X)) x ns 
       = 
       L2_while (λ(x,y,z) s. 0 < y) (λ(x,y,z). L2_seq (L2_guard (λ_. z = a)) (λ_. X)) x ns"
apply (tactic simp_tac ( @{context} 
           addloop ("tuple_inst_tac", Tuple_Tools.tuple_inst_tac)
           addsimps @{thms split_paired_all}
           |> Simplifier.add_cong @{thm L2_seq_guard_cong}
           |> Simplifier.add_cong @{thm L2_while_simp_cong}
) 1)
  done

text ‹The more standard congruence rule also works fine.›
lemma  
  assumes c_eq: "c = c'" 
  assumes bdy_eq: "⋀r s. c' r s =simp=> run (A r) s = run (A' r) s"
  shows "L2_while c A = L2_while c' A'"
  using c_eq bdy_eq by (rule L2_while_simp_cong')



lemma "L2_while (λ(x,y,z) s. 0 < (y::nat)) (λ(x,y,z). L2_seq (L2_guard (λ_. Suc 0  y  z = a)) (λ_. X)) x ns 
       = 
       L2_while (λ(x,y,z) s. 0 < y) (λ(x,y,z). L2_seq (L2_guard (λ_. z = a)) (λ_. X)) x ns"
apply (tactic simp_tac ( @{context} 
           addloop ("tuple_inst_tac", Tuple_Tools.tuple_inst_tac)
           addsimps @{thms split_paired_all}
           |> Simplifier.add_cong @{thm L2_seq_guard_cong}
           |> Simplifier.add_cong @{thm L2_while_simp_cong'}
) 1)
  done

text ‹To avoid repeated splitting of tuples (and diving into the subterms) with
@{thm split_paired_all} consider the following simproc setup.
›

lemma "L2_while (λ(x,y,z) s. 0 < (y::nat)) (λ(x,y,z). L2_seq (L2_guard (λ_. Suc 0  y  z = a)) (λ_. X)) x ns 
       = 
       L2_while (λ(x,y,z) s. 0 < y) (λ(x,y,z). L2_seq (L2_guard (λ_. z = a)) (λ_. X)) x ns"
apply (tactic simp_tac ( @{context} 
           addloop ("tuple_inst_tac", Tuple_Tools.tuple_inst_tac)
           addsimprocs [Tuple_Tools.split_tupled_all_simproc, Tuple_Tools.tuple_case_simproc]
           delsimps @{thms Product_Type.prod.case Product_Type.case_prod_conv}
           |> Simplifier.add_cong @{thm L2_seq_guard_cong}
           |> Simplifier.add_cong @{thm L2_while_simp_cong'}
) 1)
  done

text ‹This is also the setup of @{ML L2Opt.cleanup_ss}.›
lemma "L2_while (λ(x,y,z) s. 0 < (y::nat)) (λ(x,y,z). L2_seq (L2_guard (λ_. Suc 0  y  z = a)) (λ_. X)) x ns 
       = 
       L2_while (λ(x,y,z) s. 0 < y) (λ(x,y,z). L2_seq (L2_guard (λ_. z = a)) (λ_. X)) x ns"
apply (tactic asm_full_simp_tac (L2Opt.cleanup_ss @{context} [] FunctionInfo.PEEP) 1)
 done


lemma "L2_while (λ(x, y, z) s. y = 0)
     (λ(x, y, z).
         L2_seq (L2_gets (λs. y) [𝒮 ''ret'']) (λr. XXX21 r x))
     names =
    L2_while (λ(x, y, z) s. y = 0) (λ(x, y, z). L2_seq (L2_gets (λs. 0) [𝒮 ''ret'']) (λr. XXX21 r x)) names"
apply (tactic ‹asm_full_simp_tac (L2Opt.cleanup_ss @{context} [] FunctionInfo.PEEP) 1›)
  done


lemma "PROP SPLIT (r. ((λ(x,y,z). y < z  z=s) r)  P r)
  (x y z. y < z  z = s  P (x, y, s))"
  apply (tactic asm_full_simp_tac (put_simpset HOL_basic_ss @{context} 
 addsimprocs [Tuple_Tools.SPLIT_simproc, Tuple_Tools.tuple_case_simproc] |> Simplifier.add_cong @{thm SPLIT_cong}) 1)
  done
 
experiment
begin

schematic_goal foo:
"⋀x1 x2 x3 s. (case (x1, x2, x3) of (x, y, z) ⇒ λs. 0 < y) s ⟹
 (case (x1, x2, x3) of (x, y, z) ⇒ L2_seq (L2_guard (λ_. Suc 0 ≤ y ∧ z = a)) X) = ?A' (x1, x2, x3) s"
  apply (tactic ‹asm_full_simp_tac (@{context}
           addloop ("tuple_inst_tac", Tuple_Tools.tuple_inst_tac)
           addsimprocs [Tuple_Tools.SPLIT_simproc, Tuple_Tools.tuple_case_simproc]
           delsimps @{thms Product_Type.prod.case Product_Type.case_prod_conv}) 1›)
  done
text ‹‹?A'› should be properly instantiated.›
thm foo

end

declare [[simp_trace=true, simp_trace_depth_limit=100]]
lemma "L2_while (λ(x,y,z) s. 0 < (y::nat)) (λ(x,y,z). L2_seq (L2_guard (λ_. Suc 0  y  z = a)) X) x ns
       = 
       L2_while (λ(x,y,z) s. 0 < y) (λ(x,y,z). L2_seq (L2_guard (λ_. z = a)) X) x ns"
apply (tactic asm_full_simp_tac ( @{context} 
           addloop ("tuple_inst_tac", Tuple_Tools.tuple_inst_tac)
           addsimprocs [Tuple_Tools.SPLIT_simproc, Tuple_Tools.tuple_case_simproc]
           delsimps @{thms Product_Type.prod.case Product_Type.case_prod_conv}
           |> Simplifier.add_cong @{thm L2_seq_guard_cong}
           |> Simplifier.add_cong @{thm L2_while_cong_simp_split}
           |> Simplifier.add_cong @{thm SPLIT_cong}
) 1)
  done

text ‹Finally, the following setup implements a very controlled tuple splitting on
the while body.›

lemma 
  assumes c_eq: "c = c'" 
  assumes bdy_eq: "PROP SPLIT (⋀r s. c' r s ⟹ run (A r) s = run (A' r) s)"
  shows "L2_while c A = L2_while c' A'"
  using c_eq bdy_eq by (rule L2_while_cong_split)

text ‹With @{const SPLIT} we mark a position in the term that will trigger
@{ML Tuple_Tools.SPLIT_simproc}. By adding @{thm SPLIT_cong} as congruence rule we
prohibit the simplifier from diving into the loop body before we actually split the
result variable. Note that the simplifier usually works bottom-up; only congruence rules
are applied top-down.
›

lemma "L2_while (λ(x,y,(z::nat)) s. 0 < (y::nat)  y=x) (λ(x,y,z). L2_seq (L2_guard (λ_. Suc 0  y  z = a)) X) x ns
       = 
  L2_while (λ(x, y, z) s. 0 < y  y = x) (λ(x, x, z). L2_seq (L2_guard (λ_. z = a)) X) x ns"
apply (tactic asm_full_simp_tac ( @{context} 
           addloop ("tuple_inst_tac", Tuple_Tools.tuple_inst_tac)
           addsimprocs [Tuple_Tools.SPLIT_simproc, Tuple_Tools.tuple_case_simproc]
           delsimps @{thms Product_Type.prod.case Product_Type.case_prod_conv}
           |> Simplifier.add_cong @{thm L2_seq_guard_cong}
           |> Simplifier.add_cong @{thm L2_while_cong_simp_split}
           |> Simplifier.add_cong @{thm SPLIT_cong}
) 1)
  done


section ‹Simplification of conditions (guards, loops, conditionals)›

text ‹
The basic peephole optimisation for conditions tries to simplify "trivial" guards / conditions by
propagating the information of guards / conditions to subsequent guards / conditions. It is
called "peephole" optimisation (in contrast to "flow"-sensitive), because it only propagates the
state-independent information of guards, i.e. constraints on constant expressions or local variables,
not constraints on the state itself.

To facilitate this propagation with congruence rules and simprocs,
we introduce two special constants @{const L2_seq_gets} and @{const L2_seq_guard} that are introduced
as an intermediate step by a conversion. With this marking we can distinguish the cases by congruence
rules; without the marking both cases would be subject to a single congruence rule for @{const L2_seq}.
(Note that the simplifier only considers the head term of a congruence rule.)

We add the congruence rules @{thm L2_marked_seq_gets_cong} and @{thm L2_marked_seq_guard_cong}.
The congruence rule for the guard is a standard congruence rule that adds the guard as a precondition for
simplifying the second statement in the sequential composition. The congruence rule
@{thm L2_marked_seq_gets_cong} stops the simplifier from descending into the second term
of the sequential composition. Instead, the simproc @{ML L2Opt.l2_marked_gets_bind_simproc} is
triggered to analyse the situation.
It does the following:

\<^item> If the returned value in the first statement is simple, occurs only once in the second statement, or
is a structure constructor or update that is only applied to structure selections, then the value is
directly propagated to the second statement and the sequential composition is removed.
\<^item> It peeks into the prems of the simplifier to see if there are already "interesting" facts collected
from the congruence rules of guards / conditions. If not, it unfolds the marking and continues with
the ordinary sequential composition by simplifying the second statement. "Interesting" means there
is at least one premise that has impact on the new return value. E.g. if we have @{term "x + y < 5"} in the
prems, and we now return @{term "x + y"}, then the simproc introduces a new variable @{term r} for the
return value, augments the context with the equation @{term "r = x + y"} and derives the new premise
@{term "r < 5"}, which is put into the premises of the simplifier as well as the simpset. Then the
simplifier is called recursively on the second statement of the sequential composition with the
augmented context. When it is finished it uses the rule @{thm L2_marked_seq_gets_stop} to introduce
the constant @{const STOP}. The congruence rule @{thm STOP_cong} prohibits the simplifier from
descending into the just simplified second statement again.

Note that this setup works around the limitation of the simplifier that it does not allow us to modify
the context by something like a "congproc", similar to a "simproc". Keep in mind that a congruence
rule is applied top-down as the simplifier works its way down into a term, whereas a simproc (or any
other simplification rule) is applied bottom-up. So a simproc / simplification rule can build on
the fact that the subterms are already normalised. The simplifier makes use of this by doing some
bookkeeping with "term skeletons" to avoid resimplification of subterms. This mechanism falls
short in our setup of using a congruence rule to stop simplification of the second statement and
calling the simproc instead. That is why we explicitly introduce @{const STOP}. It will be removed
after simplification by an additional traversal of the term.
›


declare [[simp_trace=false, ML_print_depth=1000]]

text ‹Propagation of simple constant by unfolding.›

lemma "L2_seq_gets c [𝒮 ''r'']  (λr. (L2_guard (λ_. P r ))) =
       L2_guard (λ_. P c )"
  by (tactic ‹simp_tac (L2Opt.cleanup_ss @{context} [] FunctionInfo.PEEP) 1›)

text ‹Propagation of the term, as it only appears once in the second statement.›

lemma "L2_seq_gets (c + d) [𝒮 ''r'']  (λr. (L2_guard (λ_. P r ))) =
       L2_guard (λ_. P (c + d) )"
  by (tactic ‹simp_tac (L2Opt.cleanup_ss @{context} [] FunctionInfo.PEEP) 1›)

text ‹As nothing is in the prems yet, the marking is just removed and the second statement is simplified.›
lemma "L2_seq_gets (c + d) [𝒮 ''r'']  (λr. (L2_guard (λ_. P r ∧ (P r ⟶ Q r)))) =
       L2_seq (L2_gets (λ_. c + d) [𝒮 ''r'']) (λr. L2_guard (λ_. P r ∧ Q r))"
  by (tactic ‹simp_tac (L2Opt.cleanup_ss @{context} [] FunctionInfo.PEEP) 1›)

text ‹The first guard condition is propagated to the second guard, via the intermediate
assignment @{term "r = a + b"}.›
lemma "L2_seq_guard (λ_. a + b < 5) 
        (λ_. L2_seq_gets (a + b) [𝒮 ''r''] 
          (λr. L2_guard (λ_. r < 5  P))) =
       L2_seq_guard (λ_. a + b < 5)  
         (λ_. L2_guard (λ_. P))"
  by (tactic simp_tac (L2Opt.cleanup_ss @{context} [] FunctionInfo.PEEP) 1)


ML ‹
fun assert_rhs thm =
  Tuple_Tools.assert_cterm' @{context} (Thm.term_of (Thm.rhs_of thm))
›
ML_val ‹
val ct = @{cterm "L2_seq_guard (λ_. (a::int) + b < 5)
    (λ_. L2_seq_gets (a + b) [𝒮 ''r'']
      (λr. (L2_guard (λ_. r < 5 ∧ P))))"}
val test = Simplifier.asm_full_rewrite (L2Opt.cleanup_ss @{context} [] FunctionInfo.PEEP) ct
val _ = assert_rhs test
 @{cterm "STOP (L2_seq_guard (λ_. (a::int) + b < 5) (λ_. L2_guard (λ_. P)))"}
›


lemma "L2_seq_guard (λs. V1 + a - (r::int)  2)
        (λ_. L2_seq (L2_condition (λs. CC) (L2_seq_guard (λs. V1 + a - r  3) (λ_. X1)) X2) (λ_. X3)) =
       L2_seq_guard (λs. V1 + a - r  2)
        (λ_. L2_seq (L2_condition (λs. CC) (L2_seq_guard (λs. True) (λ_. X1)) X2) (λ_. X3))"
  by (tactic asm_full_simp_tac (L2Opt.cleanup_ss @{context} [] FunctionInfo.PEEP) 1)

ML_val ‹
val ct = @{cterm "L2_seq_guard (λs. (V1 + a - (r::int) ≤ 2))
          (λ_. L2_seq (L2_condition (λs. CC) (L2_seq_guard (λs. V1 + a - r ≤ 3) (λ_. X1)) X2) (λ_. X3))"}

val test = Simplifier.asm_full_rewrite (L2Opt.cleanup_ss @{context} [] FunctionInfo.PEEP) ct
val _ = assert_rhs test
  @{cterm "STOP (L2_seq_guard (λs. V1 + a - r ≤ 2)
     (λ_. L2_seq (L2_condition (λs. CC) (STOP (L2_seq_guard (λs. True) (λ_. X1))) X2) (λ_. X3)))"}
›

ML ‹
val ct = @{cterm "L2_seq_guard
         (λ_. ((g k)::int) + (a::int) - r ≤ n)
          (λ_.
            (L2_seq_guard (λs. (g k) + a - r ≤ n + 1) (λ_. X1)))"}

val test = Simplifier.asm_full_rewrite (Variable.set_body true (L2Opt.cleanup_ss @{context} [] FunctionInfo.PEEP)) ct
val _ = assert_rhs test @{cterm "STOP (L2_seq_guard (λ_. g k + a - r ≤ n) (λ_