Theory Sepref_Guide_Quickstart

section ‹Quickstart Guide›
theory Sepref_Guide_Quickstart
imports "../IICF/IICF"
begin
  subsection ‹Introduction›
  text ‹
    Sepref is an Isabelle/HOL tool to semi-automatically synthesize
    imperative code from abstract specifications.

    The synthesis works by replacing operations on abstract data 
    by operations on concrete data, leaving the structure of the program 
    (mostly) unchanged. Speref proves a refinement theorem, stating the 
    relation between the abstract and generated concrete specification. 
    The concrete specification can then be converted to executable code using 
    the Isabelle/HOL code generator.

    This quickstart guide is best appreciated in the Isabelle IDE (currently Isabelle/jedit),
    such that you can use cross-referencing and see intermediate proof states.
    ›

  subsubsection ‹Prerequisites›
  text ‹
    Sepref is a tool for experienced Isabelle/HOL users. So, this 
    quickstart guide assumes some familiarity with Isabelel/HOL, and will not 
    explain standard Isabelle/HOL techniques.

    Sepref is based on Imperative/HOL (@{theory "HOL-Imperative_HOL.Imperative_HOL"}) and the Isabelle Refinement Framework (@{theory Refine_Monadic.Refine_Monadic}).
    It makes extensive use of the Separation logic formalization for Imperative/HOL (@{theory Separation_Logic_Imperative_HOL.Sep_Main}).
    
    For a thorough introduction to these tools, we refer to their documentation.
    However, we try to explain their most basic features when we use them.
    ›


  subsection ‹First Example›
  text ‹As a first example, let's compute a minimum value in a non-empty list,
    wrt.~ some linear order.

    We start by specifying the problem:
    ›
  definition min_of_list :: "'a::linorder list  'a nres" where
    "min_of_list l  ASSERT (l[])  SPEC (λx. yset l. xy)"

  text ‹This specification asserts the precondition and then specifies 
    the valid results x›. The ⪢› operator is a bind-operator on monads.

    Note that the Isabelle Refinement Framework works with a set/exception monad
    over the type @{typ "_ nres"}, where @{term FAIL} is the exception, 
    and @{term "RES X"} specifies a set @{term X} of possible results.
    @{term SPEC} is just the predicate-version of @{term RES} 
    (actually @{term "SPEC Φ"} is a syntax abbreviation for @{term "RES (Collect Φ)"}).
    
    Thus, @{term min_of_list} will fail if the list is empty, and otherwise 
    nondeterministically return one of the minimal elements.
    ›

  subsubsection ‹Abstract Algorithm›
  text ‹
    Next, we develop an abstract algorithm for the problem. 
    A natural choice for a functional programmer is folding over the list,
    initializing the fold with the first element.
    ›
  definition min_of_list1 :: "'a::linorder list  'a nres" 
    where "min_of_list1 l  ASSERT (l[])  RETURN (fold min (tl l) (hd l))"

  text ‹Note that @{term RETURN} returns exactly one (deterministic) result. ›  

  text ‹We have to show that our implementation actually refines the specification›
  lemma min_of_list1_refine: "(min_of_list1,min_of_list)  Id  Idnres_rel"
    text ‹This lemma has to be read as follows: If the argument given to 
      @{const min_of_list1} and @{const min_of_list} are related 
      by @{const Id} (i.e.\ are identical), then the result of @{const min_of_list1} is
      a refinement of the result of @{const min_of_list}, wrt.\ relation @{const Id}.

      For an explanation, lets simplify the statement first:
      ›
    apply (clarsimp intro!: nres_relI)
    text ‹The @{typ "_ nres"} type defines the refinement ordering, which is a lifted subset ordering,
      with @{term FAIL} being the greatest element. This means, that we can assume a 
      non-empty list during the refinement proof 
      (otherwise, the RHS will be @{term FAIL}, and the statement becomes trivial)

      The Isabelle Refinement Framework provides various techniques to extract verification 
      conditions from given goals, we use the standard VCG here:
      ›
    unfolding min_of_list_def min_of_list1_def
    apply (refine_vcg)
    text ‹The VCG leaves us with a standard HOL goal, which is easily provable›
    by (auto simp: neq_Nil_conv Min.set_eq_fold[symmetric])

  text ‹A more concise proof of the same lemma omits the initial simplification, 
    which we only inserted to explain the refinement ordering: ›  
  lemma "(min_of_list1,min_of_list)  Id  Idnres_rel"  
    unfolding min_of_list_def[abs_def] min_of_list1_def[abs_def]
    apply (refine_vcg)
    by (auto simp: neq_Nil_conv Min.set_eq_fold[symmetric])

  subsubsection ‹Refined Abstract Algorithm›
  text ‹Now, we have a nice functional implementation. 
    However, we are interested in an imperative implementation.
    Ultimately, we want to implement the list by an array. 
    Thus, we replace folding over the list by indexing into the list,
    and also add an index-shift to get rid of the @{term hd} and @{term tl}.
    ›
  definition min_of_list2 :: "'a::linorder list  'a nres" 
    where "min_of_list2 l  ASSERT (l[])  RETURN (fold (λi. min (l!(i+1))) [0..<length l - 1] (l!0))"

  text ‹Proving refinement is straightforward, using the @{thm [source] fold_idx_conv} lemma.›    
  lemma min_of_list2_refine: "(min_of_list2, min_of_list1)Id  Idnres_rel"
    unfolding min_of_list2_def[abs_def] min_of_list1_def[abs_def]
    apply refine_vcg
    apply clarsimp_all
    apply (rewrite in "_=" fold_idx_conv)
    by (auto simp: nth_tl hd_conv_nth)

  subsubsection ‹Imperative Algorithm›  
  text ‹The version @{const min_of_list2} already looks like the desired imperative version,
    only that we have lists instead of arrays, and would like to replace the folding over 
    @{term "[0..<length l -1]"} by a for-loop. 

    This is exactly what the Sepref-tool does. The following command synthesizes 
    an imperative version min_of_list3› of the algorithm for natural numbers, 
    which uses an array instead of a list:
    › 
  sepref_definition min_of_list3 is min_of_list2 :: "(array_assn nat_assn)k a nat_assn"
    unfolding min_of_list2_def[abs_def] 
    by sepref

  text ‹The generated constant represents an Imperative/HOL program, and
    is executable: ›  
  thm min_of_list3_def  
  export_code min_of_list3 checking SML_imp

  text ‹Also note that the Sepref tool applied a deforestation optimization: 
    It recognizes a fold over @{term "[0..<n]"}, and implements it by the 
    tail-recursive function @{const "imp_for'"}, which uses a counter instead of 
    an intermediate list. 

    There are a couple of optimizations, which come in the form of two sets of 
    simplifier rules, which are applied one after the other:
    ›
  thm sepref_opt_simps
  thm sepref_opt_simps2
  text ‹They are just named theorem collections, e.g., sepref_opt_simps add/del› 
    can be used to modify them.›


  text ‹Moreover, a refinement theorem is generated, which states the correspondence between
    @{const min_of_list3} and @{const min_of_list2}: ›
  thm min_of_list3.refine
  text ‹It states the relations between the parameter and the result of 
    the concrete and abstract function. The parameter is related by 
    @{term "array_assn nat_assn"}. Here, @{term "array_assn A"} relates arrays 
    with lists, such that the elements are related @{term A} --- in our case by 
    nat_assn›, which relates natural numbers to themselves. 
    We also say that we @{emph ‹implement›} lists of nats by arrays of nats.
    The result is also implemented by natural numbers. 

    Moreover, the parameters may be stored on the heap, and we have to indicate whether
    the function keeps them intact or not. Here, we use the annotation _k (for @{emph ‹keep›}) to indicate 
    that the parameter is kept intact, and _d (for @{emph ‹destroy›}) to indicate that it is destroyed.
    ›

  subsubsection ‹Overall Correctness Statement›
  text ‹Finally, we can use transitivity of refinement to link our implementation to
    the specification. The @{attribute FCOMP} attribute is able to compose refinement 
    theorems:›
  theorem min_of_list3_correct: "(min_of_list3,min_of_list)  (array_assn nat_assn)k a nat_assn"
    using min_of_list3.refine[FCOMP min_of_list2_refine, FCOMP min_of_list1_refine] .

  text ‹While the above statement is suited to re-use the algorithm within the sepref-framework,
    a more low-level correctness theorem can be stated using separation logic.
    This has the advantage that understanding the statement depends on less 
    definitional overhead:›  
  lemma "l[]  <array_assn nat_assn l a> min_of_list3 a <λx. array_assn nat_assn l a * (yset l. xy)>t"
    text ‹The proof of this theorem has to unfold the several layers of the Sepref framework,
      down to the separation logic layer. An explanation of these layers is out of scope of this
      quickstart guide, we just present some proof techniques that often work. In the best case,
      the fully automatic proof will work:›
    by (sep_auto 
      simp: min_of_list_def pure_def pw_le_iff refine_pw_simps
      heap: min_of_list3_correct[THEN hfrefD, of l a, THEN hn_refineD, simplified])
    
  text ‹If the automatic method does not work, here is a more explicit proof, 
    that can be adapted for proving similar statements:›  
  lemma "l[]  <array_assn nat_assn l a> min_of_list3 a <λx. array_assn nat_assn l a * (yset l. xy)>t"
  proof -
    text ‹We inlined the definition of @{const min_of_list}. 
      This will yield two proof obligations later, which we discharge as auxiliary lemmas here
      ›
    assume [simp]: "l[]"
    have [simp]: "nofail (min_of_list l)" 
      by (auto simp: min_of_list_def refine_pw_simps)
    have 1: "x. RETURN x  min_of_list l  yset l. xy"  
      by (auto simp: min_of_list_def pw_le_iff refine_pw_simps)

    note rl = min_of_list3_correct[THEN hfrefD, of l a, THEN hn_refineD, simplified]
    text ‹This should yield a Hoare-triple for @{term "min_of_list3 a"}, 
      which can now be used to prove the desired statement via a consequence rule›
    show ?thesis
      apply (rule cons_rule[OF _ _ rl])
      text ‹The preconditions should match, however, @{method sep_auto} is also able to discharge
        more complicated implications here. Be sure to simplify with @{thm [source] pure_def},
        if you have parameters that are not stored on the heap (in our case, we don't, but include the
        simplification anyway.)› 
      apply (sep_auto simp: pure_def)  
      text ‹The heap-parts of the postcondition should also match. 
        The pure parts require the auxiliary statements that we proved above.›
      apply (sep_auto simp: pure_def dest!: 1)  
      done
    qed  
    

  subsubsection ‹Using the Algorithm› 
  text ‹As an example, we now want to use our algorithm to compute the minimum value
    of some concrete list. In order to use an algorithm, we have to declare both, 
    it's abstract version and its implementation to the Sepref tool. 
    ›
  sepref_register min_of_list
    ― ‹This command registers the abstract version, and generates 
        an @{emph ‹interface type›} for it. We will explain interface types later,  
        and only note that, by default, the interface type corresponds to the operation's
        HOL type.›
  declare min_of_list3_correct[sepref_fr_rules]  
    ― ‹This declares the implementation to Sepref›

  text ‹Now we can define the abstract version of our example algorithm.
    We compute the minimum value of pseudo-random lists of a given length
    ›  
  primrec rand_list_aux :: "nat  nat  nat list" where
    "rand_list_aux s 0 = []"
  | "rand_list_aux s (Suc n) = (let s = (1664525 * s + 1013904223) mod 2^32 in s # rand_list_aux s n)"
  definition "rand_list  rand_list_aux 42"

  definition "min_of_rand_list n = min_of_list (rand_list n)"

  text ‹And use Sepref to synthesize a concrete version›
  text ‹We use a feature of Sepref to combine imperative and purely functional code,
    and leave the generation of the list purely functional, then copy it into an array,
    and invoke our algorithm. We have to declare the @{const rand_list} operation:›
  sepref_register rand_list
  lemma [sepref_import_param]: "(rand_list,rand_list)nat_rel  nat_rellist_rel" by auto

  text ‹Here, we use a feature of Sepref to import parametricity theorems.
    Note that the parametricity theorem we provide here is trivial, as 
    @{const nat_rel} is identity, and @{const list_rel} as well as @{term "(→)"} 
    preserve identity. 
    However, we have to specify a parametricity theorem that reflects the 
    structure of the involved types.
  ›

  text ‹Finally, we can invoke Sepref›
  sepref_definition min_of_rand_list1 is "min_of_rand_list" :: "nat_assnk a nat_assn"
    unfolding min_of_rand_list_def[abs_def]
    text ‹We construct a plain list, however, the implementation of @{const min_of_list}
      expects an array. We have to insert a conversion, which is conveniently done
      with the @{method rewrite} method:
      ›
    apply (rewrite in "min_of_list " array_fold_custom_of_list)
    by sepref
  text ‹In the generated code, we see that the pure @{const rand_list} function 
    is invoked, its result is converted to an array, which is then passed to 
    @{const min_of_list3}.

    Note that @{command sepref_definition} prints the generated theorems to the 
    output on the end of the proof. Use the output panel, or hover the mouse over 
    the by-command to see this output.
  ›

  text ‹The generated algorithm can be exported›
  export_code min_of_rand_list1 checking SML OCaml? Haskell? Scala
  text ‹and executed›
  ML_val @{code min_of_rand_list1} (@{code nat_of_integer} 100) ()
  text ‹Note that Imperative/HOL for ML generates a function from unit, 
    and applying this function triggers execution.›

subsection ‹Binary Search Example›
text ‹As second example, we consider a simple binary search algorithm.
  We specify the abstract problem, i.e., finding an element in a sorted list.
›
definition "in_sorted_list x xs  ASSERT (sorted xs)  RETURN (xset xs)"

text ‹And give a standard iterative implementation:›
definition "in_sorted_list1_invar x xs  λ(l,u,found).
    (lu  ulength xs)
   (found  xset xs)
   (¬found  (xset (take l xs)  xset (drop u xs))
  )"

definition "in_sorted_list1 x xs  do {
  let l=0;
  let u=length xs;
  (_,_,r)  WHILEIT (in_sorted_list1_invar x xs)
    (λ(l,u,found). l<u  ¬found) (λ(l,u,found). do {
      let i = (l+u) div 2;
      ASSERT (i<length xs); ― ‹Added here to help synthesis to prove precondition for array indexing›
      let xi = xs!i;
      if x=xi then
        RETURN (l,u,True)
      else if x<xi then
        RETURN (l,i,False)
      else  
        RETURN (i+1,u,False)
  
    }) (l,u,False);
  RETURN r  
}"

text ‹Note that we can refine certain operations only if we can prove that their 
  preconditions are matched. For example, we can refine list indexing to array 
  indexing only if we can prove that the index is in range. This proof has to be 
  done during the synthesis procedure. However, such precondition proofs may be 
  hard, in particular for automatic methods, and we have to do them anyway when 
  proving correct our abstract implementation. Thus, it is a good idea to assert
  the preconditions in the abstract implementation. This way, they are immediately
  available during synthesis (recall, when refining an assertion, you may assume
  the asserted predicate @{thm le_ASSERTI}).
  
  An alternative is to use monadic list operations that already assert their precondition.
  The advantage is that you cannot forget to assert the precondition, the disadvantage
  is that the operation is monadic, and thus, nesting it into other operations is more cumbersome.
  In our case, the operation would be @{const mop_list_get} 
  (Look at it's simplified definition to get an impression what it does). 
›
thm mop_list_get_alt

text ‹We first prove the refinement correct›
context begin
private lemma isl1_measure: "wf (measure (λ(l,u,f). u-l + (if f then 0 else 1)))" by simp

private lemma neq_nlt_is_gt:
  fixes a b :: "'a::linorder"  
  shows "ab  ¬(a<b)  a>b" by simp

private lemma isl1_aux1:
  assumes "sorted xs"
  assumes "i<length xs"
  assumes "xs!i < x"
  shows "xset (take i xs)"
  using assms
  by (auto simp: take_set leD sorted_nth_mono)

private lemma isl1_aux2: 
  assumes "x  set (take n xs)"
  shows "xset (drop n xs)  xset xs"
  apply (rewrite in "_ = " append_take_drop_id[of n,symmetric])
  using assms
  by (auto simp del: append_take_drop_id)

lemma in_sorted_list1_refine: "(in_sorted_list1, in_sorted_list)Id  Id  Idnres_rel"
  unfolding in_sorted_list1_def[abs_def] in_sorted_list_def[abs_def]
  apply (refine_vcg isl1_measure)
  apply (vc_solve simp: in_sorted_list1_invar_def isl1_aux1 isl1_aux2 solve: asm_rl)
  apply (auto simp: take_set set_drop_conv leD sorted_nth_mono) []
  apply (auto simp: take_set leD sorted_nth_mono dest: neq_nlt_is_gt) []
  done
end  

text ‹First, let's synthesize an implementation where the list elements are natural numbers. 
  We will discuss later how to generalize the implementation for arbitrary types.

  For technical reasons, the Sepref tool works with uncurried functions. That is, every
  function has exactly one argument. You can use the @{term uncurry} function,
  and we also provide abbreviations @{term uncurry2} up to @{term uncurry5}.
  If a function has no parameters, @{term uncurry0} adds a unit parameter.
›
sepref_definition in_sorted_list2 is "uncurry in_sorted_list1" :: "nat_assnk *a (array_assn nat_assn)k a bool_assn"
  unfolding in_sorted_list1_def[abs_def]
  by sepref  

export_code in_sorted_list2 checking SML
lemmas in_sorted_list2_correct = in_sorted_list2.refine[FCOMP in_sorted_list1_refine]
  
subsection ‹Basic Troubleshooting›
text ‹
  In this section, we will explain how to investigate problems with the Sepref tool.
  Most cases where @{method sepref} fails are due to some 
  missing operations, unsolvable preconditions, or an odd setup. 
›

subsubsection ‹Example›
text ‹We start with an example. Recall the binary search algorithm. 
  This time, we forget to assert the precondition of the indexing operation.
›
definition "in_sorted_list1' x xs  do {
  let l=0;
  let u=length xs;
  (_,_,r)  WHILEIT (in_sorted_list1_invar x xs)
    (λ(l,u,found). l<u  ¬found) (λ(l,u,found). do {
      let i = (l+u) div 2;
      let xi = xs!i; ― ‹It's not trivial to show that i› is in range›
      if x=xi then
        RETURN (l,u,True)
      else if x<xi then
        RETURN (l,i,False)
      else  
        RETURN (i+1,u,False)
  
    }) (l,u,False);
  RETURN r  
}"

text ‹We try to synthesize the implementation. Note that @{command sepref_thm} behaves like 
  @{command sepref_definition}, but actually defines no constant. It only generates a refinement theorem.›
sepref_thm in_sorted_list2 is "uncurry in_sorted_list1'" :: "nat_assnk *a (array_assn nat_assn)k a bool_assn"
  unfolding in_sorted_list1'_def[abs_def]
  (* apply sepref  Fails *)
  ― ‹If @{method sepref} fails, you can use @{method sepref_dbg_keep} to get some more information.›
  apply sepref_dbg_keep
  ― ‹This prints a trace of the different phases of sepref, and stops when the first phase fails.
    It then returns the internal proof state of the tool, which can be inspected further.
    
    Here, the translation phase fails. The translation phase translates the control structures and operations of
    the abstract program to their concrete counterparts. To inspect the actual problem, we let translation run 
    until the operation where it fails:›
  supply [[goals_limit=1]] ― ‹There will be many subgoals during translation, and printing them takes very long with Isabelle :(›
  apply sepref_dbg_trans_keep
  ― ‹Things get stuck at a goal with predicate @{const hn_refine}. This is the internal refinement predicate,
    @{term "hn_refine Γ c Γ' R a"} means, that, for operands whose refinement is described by @{term Γ},
    the concrete program @{term c} refines the abstract program @{term a}, such that, afterwards, the operands
    are described by @{term Γ'}, and the results are refined by @{term R}.
    
    Inspecting the first subgoal reveals that we got stuck on refining the abstract operation
    @{term "RETURN $ (op_list_get $ b $ xf)"}. Note that the @{term "($)"} is just a constant for function 
    application, which is used to tame Isabelle's higher-order unification algorithms. You may use 
    unfolding APP_def›, or even simp› to get a clearer picture of the failed goal.

    If a translation step fails, it may be helpful to execute as much of the translation step as possible:›
  apply sepref_dbg_trans_step_keep
  ― ‹The translation step gets stuck at proving @{term "pre_list_get (b, xf)"}, which is the 
    precondition for list indexing.›
  apply (sepref_dbg_side_keep) ― ‹If you think the side-condition should be provable, this command 
    returns the left-over subgoals after some preprocessing and applying auto›
  (* apply sepref_dbg_side_unfold (* Preprocessing only*) *)
  oops  

subsubsection ‹Internals of Sepref›
text ‹
  Internally, @{method sepref} consists of multiple phases that are executed
  one after the other. Each phase comes with its own debugging method, which 
  only executes that phase. We illustrate this by repeating the refinement of
  @{const "min_of_list2"}. This time, we use @{command sepref_thm}, which only
  generates a refinement theorem, but defines no constants:
›

sepref_thm min_of_list3' is min_of_list2 :: "(array_assn nat_assn)k a nat_assn"
  ― ‹The sepref_thm› or sepref_definition› command assembles a schematic 
    goal statement.›
  unfolding min_of_list2_def[abs_def] 
  ― ‹The preprocessing phase converts the goal into 
    the @{const "hn_refine"}-form. Moreover, it adds interface type 
    annotations for the parameters. (for now, the interface type is just the HOL 
    type of the parameter, in our case, @{typ "nat list"})›
  apply sepref_dbg_preproc
  ― ‹The next phase applies a consequence rule for the postcondition and
    result. This is mainly for technical reasons.›
  apply sepref_dbg_cons_init
  ― ‹The next phase tries to identify the abstract operations, and inserts
    tag-constants for function application and abstraction. These tags are for 
    technical reasons, working around Isabelle/HOL's unifier idiosyncrasies.

    Operation identification assigns a single constant to each abstract operation,
    which is required for technical reasons. Note that there are terms in HOL, 
    which qualify as a single operation, but consists of multiple constants, 
    for example, @{term "{x}"}, which is just syntactic sugar for 
    @{term [source] "insert x {}"}. In our case, the operation identification 
    phase rewrites the assertion operations followed by a bind to a single 
    operation @{const op_ASSERT_bind}, and renames some operations to more 
    canonical names.›
  apply sepref_dbg_id
  ― ‹Now that it is clear which operations to execute, we have to specify an 
    execution order. Note that HOL has no notion of execution at all. However,
    if we want to translate to operations that depend on a heap, we need a notion 
    of execution order. We use the nres›-monad's bind operation as sequencing operator,
    and flatten all nested operations, using left-to-right evaluation order.›
  apply sepref_dbg_monadify
  ― ‹The next step just prepares the optimization phase,
    which will be executed on the translated program. It just applies the rule   
    @{thm TRANS_init}.›
  apply sepref_dbg_opt_init
  ― ‹The translation phase does the main job of translating the abstract program
    to the concrete one. It has rules how to translate abstract operations to
    concrete ones. For technical reasons, it differentiates between 
    operations, which have only first-order arguments (e.g., @{const length})   
    and combinators, which have also higher-order arguments (e.g., @{const fold}).

    The basic idea of translation is to repeatedly apply the translation rule for the
    topmost combinator/operator, and thus recursively translate the whole program.
    The rules may produce various types of side-conditions, which are resolved by the tool.›
  apply sepref_dbg_trans
  ― ‹The next phase applies some simplification rules to optimize the translated program.
    It essentially simplifies first with the rules @{thm [source] sepref_opt_simps}, and
    then with @{thm [source] sepref_opt_simps2}.›
  apply sepref_dbg_opt
  ― ‹The next two phases resolve the consequence rules introduced by the cons_init› phase.›
  apply sepref_dbg_cons_solve
  apply sepref_dbg_cons_solve
  ― ‹The translation phase and the consequence rule solvers may postpone some
    side conditions on yet-unknown refinement assertions. These are solved in the 
    last phase.›
  apply sepref_dbg_constraints
  done

text ‹In the next sections, we will explain, by example, how to troubleshoot 
  the various phases of the tool. We will focus on the phases that are most 
  likely to fail.›

subsubsection ‹Initialization›
text ‹A common mistake is to forget the keep/destroy markers for the
  refinement assertion, or specify a refinement assertion with a non-matching
  type. This results in a type-error on the command›

(* Forgot keep/destroy *)
(*sepref_thm min_of_list3' is min_of_list2 :: "(array_assn nat_assn) →a nat_assn"*)

(* Wrong type (@{term hs.assn} is for sets (hashset), not for lists) *)
(*sepref_thm min_of_list3' is min_of_list2 :: "(hs.assn nat_assn)ka nat_assn"*)

(* Operand must be function to nres *)
(*sepref_thm test is "λx. 2+x" :: "nat_assnka nat_assn"*)
(* Correct: *)
sepref_thm test_add_2 is "λx. RETURN (2+x)" :: "nat_assnk a nat_assn"
  by sepref

(* Type correct, but nonsense: Yields a proof failed message, as the tool
  expects a refinement assertion *)
(*sepref_thm min_of_list3' is min_of_list2 :: "undefined"*)

subsubsection ‹Translation Phase›
text ‹In most cases, the translation phase will fail. Let's try the following refinement:›
sepref_thm test is "λl. RETURN (l!1 + 2)" :: "(array_assn nat_assn)k a nat_assn"
  text ‹The @{method sepref} method will just fail. To investigate further, we use
    @{method sepref_dbg_keep}, which executes the phases until the first one fails.
    It returns with the proof state before the failed phase, and, moreover, outputs
    a trace of the phases, such that you can easily see which phase failed.
    ›
  apply sepref_dbg_keep
  ― ‹In the trace, we see that the translation phase failed. We are presented
    the tool's internal goal state just before translation. If a phase fails,
    the usual procedure is to start the phase in debug mode, and see how far it gets.
    The debug mode of the translation phase stops at the first operation or combinator
    it cannot translate. Note, it is a good idea to limit the visible goals, as printing 
    goals in Isabelle can be very, very slow :(›
  supply [[goals_limit = 1]]
  apply sepref_dbg_trans_keep
  ― ‹Here, we see that translation gets stuck at op_list_get›. This may have 
    two reasons: Either there is no rule for this operation, or a side condition 
    cannot be resolved. We apply a single translation step in debug mode, i.e., 
    the translation step is applied as far as possible, leaving unsolved side conditions:›
  apply sepref_dbg_trans_step_keep
  ― ‹This method reports that the "Apply rule" phase produced a wrong number of subgoals.
    This phase is expected to solve the goal, but left some unsolved side condition, which we
    are presented in the goal state. We can either guess  
    what @{term pre_list_get} means and why it cannot be solved, or try to partially
    solve the side condition:›
  apply sepref_dbg_side_keep
  ― ‹From the remaining subgoal, one can guess that there might be a problem 
    with too short lists, where index 1› does not exist.›
  (** You may use the following methods instead of sepref_dbg_side_keep to have 
    more control on how far the side-condition is solved. By default, you will see
    the result of auto after unfolding the internal tags.
  apply sepref_dbg_side_unfold apply simp
  *)
  oops
text ‹Inserting an assertion into the abstract program solves the problem:›
sepref_thm test is "λl. ASSERT (length l > 1)  RETURN (l!1 + 2)" :: "(array_assn nat_assn)k a nat_assn"
  by sepref

text ‹Here is an example for an unimplemented operation:›
sepref_thm test is "λl. RETURN (Min (set l))" :: "(array_assn nat_assn)k a nat_assn"
  supply [[goals_limit = 1]]
  apply sepref_dbg_keep
  apply sepref_dbg_trans_keep
  ― ‹Translation stops at the set› operation›
  apply sepref_dbg_trans_step_keep
  ― ‹This tactic reports that the "Apply rule" phase failed, which means that 
    there is no applicable rule for the set› operation on arrays.›
  oops  
  
subsection ‹The Isabelle Imperative Collection Framework (IICF)›
text ‹
  The IICF provides a library of imperative data structures, and some 
  management infrastructure. The main idea is to have interfaces and implementations.

  An interface specifies an abstract data type (e.g., @{typ "_ list"}) and some operations with preconditions 
  on it (e.g., @{term "(@)"} or @{term "nth"} with in-range precondition). 

  An implementation of an interface provides a refinement assertion from the abstract data type to
  some concrete data type, as well as implementations for (a subset of) the interface's operations.
  The implementation may add some more implementation specific preconditions.
  
  The default interfaces of the IICF are in the folder IICF/Intf›, and the standard implementations are in
  IICF/Impl›.
›

subsubsection ‹Map Example›
text ‹Let's implement a function that maps a finite set to an initial 
  segment of the natural numbers
›
definition "nat_seg_map s  
  ASSERT (finite s)  SPEC (λm. dom m = s  ran m = {0..<card s})"

text ‹We implement the function by iterating over the set, and building the map›
definition "nat_seg_map1 s  do {
  ASSERT (finite s);
  (m,_)  FOREACHi (λit (m,i). dom m = s-it  ran m = {0..<i}  i=card (s - it)) 
    s (λx (m,i). RETURN (m(xi),i+1)) (Map.empty,0);
  RETURN m
}"

lemma nat_seg_map1_refine: "(nat_seg_map1, nat_seg_map)  Id  Idnres_rel"
  apply (intro fun_relI)
  unfolding nat_seg_map1_def[abs_def] nat_seg_map_def[abs_def]
  apply (refine_vcg)
  apply (vc_solve simp: it_step_insert_iff solve: asm_rl dest: domD)
  done
  
text ‹We use hashsets @{term "hs.assn"} and hashmaps (@{term "hm.assn"}). ›
sepref_definition nat_seg_map2 is nat_seg_map1 :: "(hs.assn id_assn)k a hm.assn id_assn nat_assn"
  unfolding nat_seg_map1_def[abs_def]
  apply sepref_dbg_keep
  apply sepref_dbg_trans_keep
  ― ‹We got stuck at op_map_empty›. This is because Sepref is very conservative 
    when it comes to guessing implementations. Actually, no constructor operation 
    will be assigned a default operation, with some obvious exceptions for numbers and Booleans.›
  oops

text ‹
  Assignment of implementations to constructor operations is done by rewriting them to
  synonyms which are bound to a specific implementation. For hashmaps, we have 
  @{const op_hm_empty}, and the rules @{thm [source] hm.fold_custom_empty}.
›
sepref_definition nat_seg_map2 is nat_seg_map1 :: "(hs.assn id_assn)k a hm.assn id_assn nat_assn"
  unfolding nat_seg_map1_def[abs_def]
  ― ‹We can use the @{method rewrite} method for position-precise rewriting:›
  apply (rewrite in "FOREACHi _ _ _ " "hm.fold_custom_empty")
  by sepref

export_code nat_seg_map2 checking SML

lemmas nat_seg_map2_correct = nat_seg_map2.refine[FCOMP nat_seg_map1_refine]

subsection ‹Specification of Preconditions› (*Move up! *)
text ‹In this example, we will discuss how to specify precondition of operations, 
  which are required for refinement to work.
  Consider the following function, which increments all members of a list by one:
›
definition "incr_list l  map ((+) 1) l"
text ‹We might want to implement it as follows›
definition "incr_list1 l  fold (λi l. l[i:=1 + l!i]) [0..<length l] l"
  
lemma incr_list1_refine: "(incr_list1, incr_list)Id  Id"
proof (intro fun_relI; simp)
  fix l :: "'a list"
  { fix n m
    assume "nm" and "length l = m"
    hence "fold (λi l. l[i:=1+l!i]) [n..<m] l = take n l @ map (((+))1) (drop n l)"
      apply (induction  arbitrary: l rule: inc_induct)
      apply simp
      apply (clarsimp simp: upt_conv_Cons take_Suc_conv_app_nth)
      apply (auto simp add: list_eq_iff_nth_eq nth_Cons split: nat.split)
      done
  }
  from this[of 0 "length l"] show "incr_list1 l = incr_list l"
    unfolding incr_list_def incr_list1_def
    by simp
qed

text ‹Trying to refine this reveals a problem:›
sepref_thm incr_list2 is "RETURN o incr_list1" :: "(array_assn nat_assn)d a array_assn nat_assn"
  unfolding incr_list1_def[abs_def]
  apply sepref_dbg_keep
  apply sepref_dbg_trans_keep
  apply sepref_dbg_trans_step_keep
  apply sepref_dbg_side_keep
  ― ‹We get stuck at the precondition of @{const op_list_get}.
    Indeed, we cannot prove the generated precondition, as the translation process
    dropped any information from which we could conclude that the index is in range.›
  oops

  text ‹
    Of course, the fold loop has the invariant that the length of the list does not change,
    and thus, indexing is in range. We only cannot prove it during the automatic synthesis.

    Here, the only solution is to do a manual refinement into the nres-monad, 
    and adding an assertion that indexing is always in range.

    We use the @{const nfoldli} combinator, which generalizes @{const fold} in two directions:
     The function is inside the nres monad
     There is a continuation condition. If this is not satisfied, the fold returns immediately, 
      dropping the rest of the list.
    ›

definition "incr_list2 l  nfoldli 
  [0..<length l] 
  (λ_. True)  
  (λi l. ASSERT (i<length l)  RETURN (l[i:=1+l!i]))
  l"

text ‹
  Note: Often, it is simpler to prove refinement of the abstract specification, rather
  than proving refinement to some intermediate specification that may have already done
  refinements "in the wrong direction". In our case, proving refinement of @{const incr_list1}
  would require to generalize the statement to keep track of the list-length invariant,
  while proving refinement of @{const incr_list} directly is as easy as proving the original 
  refinement for @{const incr_list1}.
›
lemma incr_list2_refine: "(incr_list2,RETURN o incr_list)  Id  Idnres_rel"
proof (intro nres_relI fun_relI; simp)  
  fix l :: "'a list"
  show "incr_list2 l  RETURN (incr_list l)"
    unfolding incr_list2_def incr_list_def
    ― ‹@{const nfoldli} comes with an invariant proof rule. In order to use it, we have to specify
      the invariant manually:›
    apply (refine_vcg nfoldli_rule[where I="λl1 l2 s. s = map (((+))1) (take (length l1) l) @ drop (length l1) l"])
    apply (vc_solve 
      simp: upt_eq_append_conv upt_eq_Cons_conv
      simp: nth_append list_update_append upd_conv_take_nth_drop take_Suc_conv_app_nth
      solve: asm_rl
    )
    done
qed

sepref_definition incr_list3 is "incr_list2" :: "(array_assn nat_assn)d a array_assn nat_assn"
  unfolding incr_list2_def[abs_def]
  by sepref

lemmas incr_list3_correct = incr_list3.refine[FCOMP incr_list2_refine]

subsection ‹Linearity and Copying›
text ‹Consider the following implementation of an operation to swap to list 
  indexes. While it is perfectly valid in a functional setting, an imperative 
  implementation has a problem here: Once the update a index i› is done,
  the old value cannot be read from index i› any more. We try to implement the
  list with an array:›
sepref_thm swap_nonlinear is "uncurry2 (λl i j. do {
  ASSERT (i<length l  j<length l);
  RETURN (l[i:=l!j, j:=l!i])
})" :: "(array_assn id_assn)d *a nat_assnk *a nat_assnk a array_assn id_assn"
  supply [[goals_limit = 1]]
  apply sepref_dbg_keep
  apply sepref_dbg_trans_keep ― ‹(1) We get stuck at an @{const op_list_get} operation›
  apply sepref_dbg_trans_step_keep ― ‹(2) Further inspection reveals that the "recover pure" 
    phase fails, and we are left with a subgoal of the form 
    @{term "CONSTRAINT is_pure (array_assn id_assn)"}. Constraint side conditions are 
    deferrable side conditions: They are produced as side-conditions, and if they cannot 
    be solved immediately, they are deferred and processed later, latest at the end of the synthesis.
    However, definitely unsolvable constraints are not deferred, but halt the translation phase immediately,
    and this is what happened here: At (1) we can see that the refinement for the array we want to access is 
    @{term "hn_invalid (array_assn id_assn)"}. This means, the data structure was destroyed by some preceding 
    operation. The @{const hn_invalid} only keeps a record of this fact. When translating an operation that uses
    an invalidated parameter, the tool tries to restore the invalidated parameter: This only works if the data 
    structure was purely functional, i.e., not stored on the heap. This is where the @{term is_pure} constraint
    comes from. However, arrays are always stored on the heap, so this constraint is definitely unsolvable,
    and thus immediately rejected instead of being deferred. 

    Note: There are scenarios where a constraint gets deferred @{emph ‹before›} it becomes definitely unsolvable.
      In these cases, you only see the problem after the translation phase, and it may be somewhat tricky to figure
      out the reason.› (* TODO: Check for unsolvable constraints after each translation step, and refuse refinements that render
      any constraints unsolvable. Make this debuggable, e.g. by injecting those constraints as additional side 
      conditions! *)
  oops

text ‹The fix for our swap function is quite obvious. Using a temporary storage 
  for the intermediate value, we write:›
sepref_thm swap_with_tmp is "uncurry2 (λl i j. do {
  ASSERT (i<length l  j<length l);
  let tmp = l!i;
  RETURN (l[i:=l!j, j:=tmp])
})" :: "(array_assn id_assn)d *a nat_assnk *a nat_assnk a array_assn id_assn"
  by sepref

text ‹Note that also the argument must be marked as destroyed ()d. Otherwise, we get a similar error as above,
  but in a different phase: ›
sepref_thm swap_with_tmp is "uncurry2 (λl i j. do {
  ASSERT (i<length l  j<length l);
  let tmp = l!i;
  RETURN (l[i:=l!j, j:=tmp])
})" :: "(array_assn id_assn)k *a nat_assnk *a nat_assnk a array_assn id_assn"
  apply sepref_dbg_keep ― ‹We get stuck at a frame, which would require restoring an invalidated array›
  apply sepref_dbg_cons_solve_keep ― ‹Which would only work if arrays were pure›
  oops
  
text ‹If copying is really required, you have to insert it manually. 
  Reconsider the example @{const incr_list} from above. This time,
  we want to preserve the original data (note the ()k annotation):
›
sepref_thm incr_list3_preserve is "incr_list2" :: "(array_assn nat_assn)k a array_assn nat_assn"
  unfolding incr_list2_def[abs_def]
  ― ‹We explicitly insert a copy-operation on the list, before it is passed to the fold operation›
  apply (rewrite in "nfoldli _ _ _ " op_list_copy_def[symmetric])
  by sepref

subsection ‹Nesting of Data Structures›
text ‹
  Sepref and the IICF support nesting of data structures with some limitations:
     Only the container or its elements can be visible at the same time. 
      For example, if you have a product of two arrays, you can either see the
      two arrays, or the product. An operation like snd› would have to destroy 
      the product, loosing the first component. Inside a case distinction, you
      cannot access the compound object.
    
      These limitations are somewhat relaxed for pure data types, which can always 
      be restored.
     Most IICF data structures only support pure component types. 
      Exceptions are HOL-lists, and the list-based set and multiset implementations
      List_MsetO› and List_SetO› (Here, the O› stands for own›, which means 
      that the data-structure owns its elements.).

›

text ‹Works fine:›
sepref_thm product_ex1 is "uncurry0 (do {
    let p = (op_array_replicate 5 True, op_array_replicate 2 False);
    case p of (a1,a2)  RETURN (a1!2)
  })" :: "unit_assnk a bool_assn"
  by sepref


text ‹Fails: We cannot access compound type inside case distinction›
sepref_thm product_ex2 is "uncurry0 (do {
    let p = (op_array_replicate 5 True, op_array_replicate 2 False);
    case p of (a1,a2)  RETURN (snd p!1)
  })" :: "unit_assnk a bool_assn"
  apply sepref_dbg_keep
  apply sepref_dbg_trans_keep
  apply sepref_dbg_trans_step_keep
  oops

text ‹Works fine, as components of product are pure, such that product can be restored inside case.›
sepref_thm product_ex2 is "uncurry0 (do {
    let p = (op_list_replicate 5 True, op_list_replicate 2 False);
    case p of (a1,a2)  RETURN (snd p!1)
  })" :: "unit_assnk a bool_assn"
  by sepref_dbg_keep

text ‹Trying to create a list of arrays, first attempt: ›
sepref_thm set_of_arrays_ex is "uncurry0 (RETURN (op_list_append [] op_array_empty))" :: "unit_assnk a arl_assn (array_assn nat_assn)"
  unfolding "arl.fold_custom_empty"
  apply sepref_dbg_keep
  apply sepref_dbg_trans_keep
  apply sepref_dbg_trans_step_keep
  supply [[goals_limit = 1, unify_trace_failure]]
  (*apply (rule arl_append_hnr[to_hnr])*)
  ― ‹Many IICF data structures, in particular the array based ones, requires the element types
    to be of @{class default}. If this is not the case, Sepref will simply find no refinement for
    the operations. Be aware that type-class related mistakes are hard to debug in Isabelle/HOL,
    above we sketched how to apply the refinement rule that is supposed to match with unifier 
    tracing switched on. The @{attribute to_hnr} attribute is required to convert the rule from 
    the relational form to the internal @{const hn_refine} form. Note that some rules are already 
    in @{const hn_refine} form, and need not be converted, e.g., @{thm hn_Pair}.›
  oops

text ‹So lets choose a circular singly linked list (csll), which does not require its elements to be of default type class›
sepref_thm set_of_arrays_ex is "uncurry0 (RETURN (op_list_append [] op_array_empty))" :: "unit_assnk a csll.assn (array_assn nat_assn)"
  unfolding "csll.fold_custom_empty"
  apply sepref_dbg_keep
  apply sepref_dbg_trans_keep
  apply sepref_dbg_trans_step_keep
  ― ‹We end up with an unprovable purity-constraint: As many IICF types, csll 
    only supports pure member types. We expect this restriction to be lifted in 
    some future version.›
  oops

text ‹Finally, there are a few data structures that already support nested element types, for example, functional lists:›
sepref_thm set_of_arrays_ex is "uncurry0 (RETURN (op_list_append [] op_array_empty))" :: "unit_assnk a list_assn (array_assn nat_assn)"
  unfolding "HOL_list.fold_custom_empty"
  by sepref


subsection ‹Fixed-Size Data Structures›
text ‹For many algorithms, the required size of a data structure is already known,
  such that it is not necessary to use data structures with dynamic resizing.

  The Sepref-tool supports such data structures, however, with some limitations.
›

subsubsection ‹Running Example›
text ‹
  Assume we want to read a sequence of natural numbers in the range @{term "{0..<N}"},
  and drop duplicate numbers. The following abstract algorithm may work:
›

definition "remdup l  do {
  (s,r)  nfoldli l (λ_. True) 
    (λx (s,r). do {
      ASSERT (distinct r  set r  set l  s = set r); ― ‹Will be required to prove that list does not grow too long›
      if xs then RETURN (s,r) else RETURN (insert x s, r@[x])
    }) 
    ({},[]);
  RETURN r
}"

text ‹We want to use remdup› in our abstract code, so we have to register it.›
sepref_register remdup

text ‹The straightforward version with dynamic data-structures is: ›
sepref_definition remdup1 is "remdup" :: "(list_assn nat_assn)k a arl_assn nat_assn"
  unfolding remdup_def[abs_def]
  ― ‹Lets use a bit-vector for the set›
  apply (rewrite in "nfoldli _ _ _ " ias.fold_custom_empty)
  ― ‹And an array-list for the list›
  apply (rewrite in "nfoldli _ _ _ " arl.fold_custom_empty)
  by sepref

subsubsection ‹Initialization of Dynamic Data Structures›
text ‹Now let's fix an upper bound for the numbers in the list.
  Initializations and statically sized data structures must always be fixed variables,
  they cannot be computed inside the refined program. 

  TODO: Lift this restriction at least for initialization hints that do not occur 
    in the refinement assertions.
›
context fixes N :: nat begin

sepref_definition remdup1_initsz is "remdup" :: "(list_assn nat_assn)k a arl_assn nat_assn"
  unfolding remdup_def[abs_def]
  ― ‹Many of the dynamic array-based data structures in the IICF can be 
    pre-initialized to a certain size. THis initialization is only a hint, 
    and has no abstract consequences. The list data structure will still be 
    resized if it grows larger than the initialization size.›
  apply (rewrite in "nfoldli _ _ _ " ias_sz.fold_custom_empty[of N])
  apply (rewrite in "nfoldli _ _ _ " arl_sz.fold_custom_empty[of N])
  by sepref

end

text ‹To get a usable function, we may add the fixed N› as a parameter, effectively converting
  the initialization hint to a parameter, which, however, has no abstract meaning›

definition "remdup_initsz (N::nat)  remdup"
lemma remdup_init_hnr: 
  "(uncurry remdup1_initsz, uncurry remdup_initsz)  nat_assnk *a (list_assn nat_assn)k a arl_assn nat_assn"
  using remdup1_initsz.refine unfolding remdup_initsz_def[abs_def]
  unfolding hfref_def hn_refine_def
  by (auto simp: pure_def)


subsubsection ‹Static Data Structures›

text ‹We use a locale to hide local declarations. Note: This locale will never be interpreted,
  otherwise all the local setup, that does not make sense outside the locale, would become visible.
  TODO: This is probably some abuse of locales to emulate complex private setup, 
      including declaration of constants and lemmas.
›
locale my_remdup_impl_loc = 
  fixes N :: nat 
  assumes "N>0" ― ‹This assumption is not necessary, but used to illustrate the 
    general case, where the locale may have such assumptions›
begin
  text ‹For locale hierarchies, the following seems not to be available directly in Isabelle,
    however, it is useful when transferring stuff between the global theory and the locale›
  lemma my_remdup_impl_loc_this: "my_remdup_impl_loc N" by unfold_locales

  text ‹
    Note that this will often require to use N› as a usual constant, which 
    is refined. For pure refinements, we can use the @{attribute sepref_import_param}
    attribute, which will convert a parametricity theorem to a rule for Sepref:
    ›  
  sepref_register N
  lemma N_hnr[sepref_import_param]: "(N,N)nat_rel" by simp
  thm N_hnr
  text ‹Alternatively, we could directly prove the following rule, which, however, is 
    more cumbersome: ›
  lemma N_hnr': "(uncurry0 (return N), uncurry0 (RETURN N))unit_assnk a nat_assn"
    by sepref_to_hoare sep_auto

  text ‹Next, we use an array-list with a fixed maximum capacity. 
    Note that the capacity is part of the refinement assertion now.
  ›
  sepref_definition remdup1_fixed is "remdup" :: "(list_assn nat_assn)k a marl_assn N nat_assn"
    unfolding remdup_def[abs_def]
    apply (rewrite in "nfoldli _ _ _ " ias_sz.fold_custom_empty[of N])
    apply (rewrite in "nfoldli _ _ _ " marl_fold_custom_empty_sz[of N])
    supply [[goals_limit = 1]]
    apply sepref_dbg_keep
    apply sepref_dbg_trans_keep
    apply sepref_dbg_trans_step_keep
    ― ‹In order to append to the array list, we have to show that the size is not yet exceeded.
      This may require to add some assertions on the abstract level. We already have added
      some assertions in the definition of @{const remdup}.›
    oops
  
  text ‹Moreover, we add a precondition on the list›
  sepref_definition remdup1_fixed is "remdup" :: "[λl. set l  {0..<N}]a (list_assn nat_assn)k  marl_assn N nat_assn"
    unfolding remdup_def[abs_def]
    apply (rewrite in "nfoldli _ _ _ " ias_sz.fold_custom_empty[of N])
    apply (rewrite in "nfoldli _ _ _ " marl_fold_custom_empty_sz[of N])
    supply [[goals_limit = 1]]
    apply sepref_dbg_keep
    apply sepref_dbg_trans_keep
    apply sepref_dbg_trans_step_keep
    apply sepref_dbg_side_keep
    ― ‹We can start from this subgoal to find missing lemmas›
    oops

  text ‹We can prove the remaining subgoal, e.g., by @{method auto} with the following
    lemma declared as introduction rule:›  
  lemma aux1[intro]: " set l  {0..<N}; distinct l   length l < N"  
    apply (simp add: distinct_card[symmetric])
    apply (drule psubset_card_mono[rotated])
    apply auto
    done

  text ‹We use some standard boilerplate to define the constant globally, although
    being inside the locale. This is required for code-generation.›  
  sepref_thm remdup1_fixed is "remdup" :: "[λl. set l  {0..<N}]a (list_assn nat_assn)k  marl_assn N nat_assn"
    unfolding remdup_def[abs_def]
    apply (rewrite in "nfoldli _ _ _ " ias_sz.fold_custom_empty[of N])
    apply (rewrite in "nfoldli _ _ _ " marl_fold_custom_empty_sz[of N])
    by sepref
    
  concrete_definition (in -) remdup1_fixed uses "my_remdup_impl_loc.remdup1_fixed.refine_raw" is "(?f,_)_"
  prepare_code_thms (in -) remdup1_fixed_def
  lemmas remdup1_fixed_refine[sepref_fr_rules] = remdup1_fixed.refine[OF my_remdup_impl_loc_this] 
  text ‹The @{command concrete_definition} command defines the constant globally, without any locale assumptions. For this,
    it extracts the definition from the theorem, according to the specified pattern. Note, you have to
    include the uncurrying into the pattern, e.g., (uncurry ?f,_)∈_›.

    The @{command prepare_code_thms} command sets up code equations for recursion combinators that may have been synthesized. 
    This is required as the code generator works with equation systems, while the heap-monad works with 
    fixed-point combinators.
    
    Finally, the third lemma command imports the refinement lemma back into the locale, and registers it
    as refinement rule for Sepref.
    ›

  text ‹Now, we can refine @{const remdup} to @{term "remdup1_fixed N"} inside the 
    locale. The latter is a global constant with an unconditional definition, thus code
    can be generated for it.›  

  text ‹Inside the locale, we can do some more refinements: ›  
  definition "test_remdup  do {l  remdup [0..<N]; RETURN (length l) }"
  text ‹Note that the abstract @{const test_remdup} is just an abbreviation for 
    @{term "my_remdup_impl_loc.test_remdup N"}.
    Whenever we want Sepref to treat a compound term like a constant, we have to wrap the term into
    a @{const PR_CONST} tag. While @{command sepref_register} does this automatically, 
    the PR_CONST› has to occur in the refinement rule.›
  sepref_register "test_remdup"
  sepref_thm test_remdup1 is 
    "uncurry0 (PR_CONST test_remdup)" :: "unit_assnk a nat_assn"
    unfolding test_remdup_def PR_CONST_def
    by sepref
  concrete_definition (in -) test_remdup1 uses my_remdup_impl_loc.test_remdup1.refine_raw is "(uncurry0 ?f,_)_"
  prepare_code_thms (in -) test_remdup1_def
  lemmas test_remdup1_refine[sepref_fr_rules] = test_remdup1.refine[of N]

end    

text ‹Outside the locale, a refinement of @{term my_remdup_impl_loc.test_remdup} also makes sense,
  however, with an extra argument @{term N}.›

thm test_remdup1.refine
lemma test_remdup1_refine_aux: "(test_remdup1, my_remdup_impl_loc.test_remdup)  [my_remdup_impl_loc]a nat_assnk  nat_assn"
  using test_remdup1.refine
  unfolding hfref_def hn_refine_def
  by (auto simp: pure_def)

text ‹We can also write a more direct precondition, as long as it implies the locale›
lemma test_remdup1_refine: "(test_remdup1, my_remdup_impl_loc.test_remdup)  [λN. N>0]a nat_assnk  nat_assn"
  apply (rule hfref_cons[OF test_remdup1_refine_aux _ entt_refl entt_refl entt_refl])
  by unfold_locales
  
export_code test_remdup1 checking SML

text ‹We can also register the abstract constant and the refinement, to use it in further refinements›
sepref_register my_remdup_impl_loc.test_remdup
lemmas [sepref_fr_rules] = test_remdup1_refine


subsubsection ‹Static Data Structures with Custom Element Relations›

text ‹In the previous section, we have presented a refinement using an array-list
  without dynamic resizing. However, the argument that we actually could append 
  to this array was quite complicated.

  Another possibility is to use bounded refinement relations, i.e., 
  a refinement relation intersected with a condition for the abstract object.
  In our case, @{term "nbn_assn N"} relates natural numbers less than N› to themselves.

  We will repeat the above development, using the bounded relation approach:
›

definition "bremdup l  do {
  (s,r)  nfoldli l (λ_. True) 
    (λx (s,r). do {
      ASSERT (distinct r  s = set r); ― ‹Less assertions than last time›
      if xs then RETURN (s,r) else RETURN (insert x s, r@[x])
    }) 
    ({},[]);
  RETURN r
}"
sepref_register bremdup

locale my_bremdup_impl_loc = 
  fixes N :: nat 
  assumes "N>0" ― ‹This assumption is not necessary, but used to illustrate the 
    general case, where the locale may have such assumptions›
begin
  lemma my_bremdup_impl_loc_this: "my_bremdup_impl_loc N" by unfold_locales

  sepref_register N
  lemma N_hnr[sepref_import_param]: "(N,N)nat_rel" by simp

  text ‹Conceptually, what we insert in our list are elements, and
    these are less than N›.›
  abbreviation "elem_assn  nbn_assn N"

  lemma aux1[intro]: " set l  {0..<N}; distinct l   length l < N"  
    apply (simp add: distinct_card[symmetric])
    apply (drule psubset_card_mono[rotated])
    apply auto
    done

  sepref_thm remdup1_fixed is "remdup" :: "[λl. set l  {0..<N}]a (list_assn elem_assn)k  marl_assn N elem_assn"
    unfolding remdup_def[abs_def]
    apply (rewrite in "nfoldli _ _ _ " ias_sz.fold_custom_empty[of N])
    apply (rewrite in "nfoldli _ _ _ " marl_fold_custom_empty_sz[of N])
    by sepref
    
  concrete_definition (in -) bremdup1_fixed uses "my_bremdup_impl_loc.remdup1_fixed.refine_raw" is "(?f,_)_"
  prepare_code_thms (in -) bremdup1_fixed_def
  lemmas remdup1_fixed_refine[sepref_fr_rules] = bremdup1_fixed.refine[OF my_bremdup_impl_loc_this] 

  definition "test_remdup  do {l  remdup [0..<N]; RETURN (length l) }"
  sepref_register "test_remdup"

  text ‹This refinement depends on the (somewhat experimental) subtyping feature 
    to convert from @{term nat_assn} to @{term elem_assn}, based on context information›
  sepref_thm test_remdup1 is 
    "uncurry0 (PR_CONST test_remdup)" :: "unit_assnk a nat_assn"
    unfolding test_remdup_def PR_CONST_def
    by sepref

  concrete_definition (in -) test_bremdup1 uses my_bremdup_impl_loc.test_remdup1.refine_raw is "(uncurry0 ?f,_)_"
  prepare_code_thms (in -) test_bremdup1_def
  lemmas test_remdup1_refine[sepref_fr_rules] = test_bremdup1.refine[of N]

end    

lemma test_bremdup1_refine_aux: "(test_bremdup1, my_bremdup_impl_loc.test_remdup)  [my_bremdup_impl_loc]a nat_assnk  nat_assn"
  using test_bremdup1.refine
  unfolding hfref_def hn_refine_def
  by (auto simp: pure_def)

lemma test_bremdup1_refine: "(test_bremdup1, my_bremdup_impl_loc.test_remdup)  [λN. N>0]a nat_assnk  nat_assn"
  apply (rule hfref_cons[OF test_bremdup1_refine_aux _ entt_refl entt_refl entt_refl])
  by unfold_locales
  
export_code test_bremdup1 checking SML

text ‹We can also register the abstract constant and the refinement, to use it in further refinements›
sepref_register test_bremdup: my_bremdup_impl_loc.test_remdup ― ‹Specifying a base-name for 
    the theorems here, as default name clashes with existing names.›
lemmas [sepref_fr_rules] = test_bremdup1_refine

subsubsection ‹Fixed-Value Restriction›
text ‹Initialization only works with fixed values, not with dynamically computed values›
sepref_definition copy_list_to_array is "λl. do {
  let N = length l; ― ‹Introduce a let›, such that we have a single variable as size-init›
  let l' = op_arl_empty_sz N;
  nfoldli l (λx. True) (λx s. mop_list_append s x) l'
}" :: "(list_assn nat_assn)k a arl_assn nat_assn"
  apply sepref_dbg_keep
  apply sepref_dbg_trans_keep
  apply sepref_dbg_trans_step_keep
  supply [[unify_trace_failure, goals_limit=1]]
  (*apply (rule arl_sz.custom_hnr[to_hnr])*)
  ― ‹The problem manifests itself in trying to carry an abstract variable 
    (the argument to op_arl_empty_sz›) to the concrete program (the second argument of hn_refine›).
    However, the concrete program can only depend on the concrete variables, so unification fails.›
  oops


subsubsection ‹Matrix Example›

text ‹
  We first give an example for implementing point-wise matrix operations, using
  some utilities from the (very prototype) matrix library.

  Our matrix library uses functions @{typ "'a mtx"} (which is @{typ "nat×nat  'a"})
  as the abstract representation. The (currently only) implementation is by arrays,
  mapping points at coordinates out of range to @{term 0}.
›

text ‹Pointwise unary operations are those that modify every point
  of a matrix independently. Moreover, a zero-value must be mapped to a zero-value.
  As an example, we duplicate every value on the diagonal of a matrix
›

text ‹Abstractly, we apply the following function to every value.
  The first parameter are the coordinates.›
definition mtx_dup_diag_f:: "nat×nat  'a::{numeral,times,mult_zero}  'a"
  where "mtx_dup_diag_f  λ(i,j) x. if i=j then x*(2) else x"

text ‹We refine this function to a heap-function,
  using the identity mapping for values.›
context 
  fixes dummy :: "'a::{numeral,times,mult_zero}"
  notes [[sepref_register_adhoc "PR_CONST (2::'a)"]]
    ― ‹Note: The setup for numerals, like 2›, is a bit subtle in that
      numerals are always treated as constants, but have to be registered
      for any type they shall be used with. By default, they are only 
      registered for @{typ int} and @{typ nat}.›
  notes [sepref_import_param] = IdI[of "PR_CONST (2::'a)"]
  notes [sepref_import_param] = IdI[of "(*)::'a_", folded fun_rel_id_simp]
begin

sepref_definition mtx_dup_diag_f1 is "uncurry (RETURN oo (mtx_dup_diag_f::_'a_))" :: "(prod_assn nat_assn nat_assn)k*aid_assnk a id_assn"
  unfolding mtx_dup_diag_f_def
  by sepref

end

text ‹Then, we instantiate the corresponding locale, to get an implementation for 
  array matrices. Note that we restrict ourselves to square matrices here: ›
interpretation dup_diag: amtx_pointwise_unop_impl N N mtx_dup_diag_f id_assn mtx_dup_diag_f1
  apply standard
  applyS (simp add: mtx_dup_diag_f_def) []
  applyS (rule mtx_dup_diag_f1.refine)
  done

text ‹We introduce an abbreviation for the abstract operation.
  Note: We do not have to register it (this is done once and for all 
    for @{const mtx_pointwise_unop}), nor do we have to declare a refinement rule 
    (done by amtx_pointwise_unop_impl›-locale)   
› 
abbreviation "mtx_dup_diag  mtx_pointwise_unop mtx_dup_diag_f"

text ‹The operation is usable now:›
sepref_thm mtx_dup_test is "λm. RETURN (mtx_dup_diag (mtx_dup_diag m))" :: "(asmtx_assn N int_assn)d a asmtx_assn N int_assn"
  by sepref

text ‹Similarly, there are operations to combine to matrices, and to compare two matrices:›

interpretation pw_add: amtx_pointwise_binop_impl N M "(((+))::(_::monoid_add)  _)" id_assn "return oo ((+))"
  for N M
  apply standard
  apply simp
  apply (sepref_to_hoare) apply sep_auto ― ‹Alternative to 
    synthesize concrete operation, for simple ad-hoc refinements›
  done
abbreviation "mtx_add  mtx_pointwise_binop ((+))"

sepref_thm mtx_add_test is "uncurry2 (λm1 m2 m3. RETURN (mtx_add m1 (mtx_add m2 m3)))" 
  :: "(amtx_assn N M int_assn)d *a (amtx_assn N M int_assn)d *a (amtx_assn N M int_assn)k a amtx_assn N M int_assn"
  by sepref

text ‹A limitation here is, that the first operand is destroyed on a coarse-grained level.
  Although adding a matrix to itself would be valid, our tool does not support this.
  (However, you may use an unary operation)›
sepref_thm mtx_dup_alt_test is "(λm. RETURN (mtx_add m m))" 
  :: "(amtx_assn N M int_assn)d a amtx_assn N M int_assn"
  apply sepref_dbg_keep
  apply sepref_dbg_trans_keep
  ― ‹We get stuck at a @{const COPY} goal, indicating that a matrix has to be copied.›
  apply sepref_dbg_trans_step_keep
  ― ‹Which only works for pure refinements›
  oops

text ‹Of course, you can always copy the matrix manually:›
sepref_thm mtx_dup_alt_test is "(λm. RETURN (mtx_add (op_mtx_copy m) m))" 
  :: "(amtx_assn N M int_assn)k a amtx_assn N M int_assn"
  by sepref

text ‹A compare operation checks that all pairs of entries fulfill some property f›, and
  at least one entry fullfills a property g›.›
interpretation pw_lt: amtx_pointwise_cmpop_impl N M "((≤)::(_::order)  _)" "((≠)::(_::order)  _)" id_assn "return oo (≤)" "return oo (≠)"
  for N M
  apply standard
  apply simp
  apply simp
  apply (sepref_to_hoare) apply sep_auto
  apply (sepref_to_hoare) apply sep_auto
  done
abbreviation "mtx_lt  mtx_pointwise_cmpop (≤) (≠)"

sepref_thm test_mtx_cmp is "(λm. do { RETURN (mtx_lt (op_amtx_dfltNxM N M 0) m) })" :: "(amtx_assn N M int_assn)k a bool_assn"
  by sepref ― ‹Note: Better fold over single matrix (currently no locale for that), instead of creating a new matrix.›

text ‹In a final example, we store some coordinates in a set, and then
  use the stored coordinates to access the matrix again. This illustrates how 
  bounded relations can be used to maintain extra information, i.e., coordinates 
  being in range›

context
  fixes N M :: nat
  notes [[sepref_register_adhoc N M]]
  notes [sepref_import_param] = IdI[of N] IdI[of M]
begin
  text ‹We introduce an assertion for coordinates›
  abbreviation "co_assn  prod_assn (nbn_assn N) (nbn_assn M)"
  text ‹And one for integer matrices›
  abbreviation "mtx_assn  amtx_assn N M int_assn"

  definition "co_set_gen  do {
    nfoldli [0..<N] (λ_. True) (λi. nfoldli [0..<M] (λ_. True) (λj s. 
      if max i j - min i j  1 then RETURN (insert (i,j) s)
      else RETURN s
    )) {}
  }"

  sepref_definition co_set_gen1 is "uncurry0 co_set_gen" :: "unit_assnk a hs.assn co_assn"
    unfolding co_set_gen_def
    apply (rewrite "hs.fold_custom_empty")
    apply sepref_dbg_keep
    apply sepref_dbg_trans_keep
    ― ‹We run into the problem that the Sepref tool uses nat_assn› to refine natural
      numbers, and only later tries to convert it to nbn_assn›. However, at this point, the
      information is already lost.›
    oops

  text ‹We can use a feature of Sepref, to annotate the desired assertion directly 
    into the abstract program. For this, we use @{thm [source] annotate_assn}, 
    which inserts the (special) constant @{const ASSN_ANNOT}, which is just identity, 
    but enforces refinement with the given assertion.›  
  sepref_definition co_set_gen1 is "uncurry0 (PR_CONST co_set_gen)" :: "unit_assnk a hs.assn co_assn"
    unfolding co_set_gen_def PR_CONST_def
    apply (rewrite "hs.fold_custom_empty")
    apply (rewrite in "insert  _" annotate_assn[where A=co_assn])
      ― ‹Annotate the pair as coordinate before insertion›
    by sepref
  lemmas [sepref_fr_rules] = co_set_gen1.refine

  sepref_register "co_set_gen"

  text ‹Now we can use the entries from the set as coordinates, 
    without any worries about them being out of range›
  sepref_thm co_set_use is "(λm. do {
    co  co_set_gen;
    FOREACH co (λ(i,j) m. RETURN ( m((i,j) := 1))) m
  })" :: "mtx_assnd a mtx_assn"
    by sepref
    

end

subsection ‹Type Classes›
text ‹TBD›
subsection ‹Higher-Order›
text ‹TBD›

subsection ‹A-Posteriori Optimizations›
text ‹The theorem collection @{attribute sepref_opt_simps}
  and @{attribute sepref_opt_simps2} contain simplifier lemmas that are
  applied, in two stages, to the generated Imperative/HOL program.

  This is the place where some optimizations, such as deforestation, and
  simplifying monad-expressions using the monad laws, take place.
›
thm sepref_opt_simps
thm sepref_opt_simps2

subsection ‹Short-Circuit Evaluation›
text ‹Consider›
sepref_thm test_sc_eval is "RETURN o (λl. length l > 0  hd l)" :: "(list_assn bool_assn)k a bool_assn"
  apply sepref_dbg_keep
  apply sepref_dbg_trans_keep
  apply sepref_dbg_trans_step_keep
  ― ‹Got stuck, as the operands of ∧› are evaluated before applying the operator, i.e.,
    hd› is also applied to empty lists›
  oops

sepref_thm test_sc_eval is "RETURN o (λl. length l > 0  hd l)" :: "(list_assn bool_assn)k a bool_assn"
  unfolding short_circuit_conv ― ‹Enables short-circuit evaluation 
    by rewriting ∧›, ∨›, and ⟶› to if›-expressions›
  by sepref

end