Theory open_struct

(*
 * Copyright (c) 2022 Apple Inc. All rights reserved.
 *
 * SPDX-License-Identifier: BSD-2-Clause
 *)

section ‹Pointers into Structures in Split Heap \label{sec:open_struct}›

theory open_struct
  imports "AutoCorres"
begin

subsection ‹Overview›

text ‹
Heap lifting in AutoCorres transforms the monolithic byte-level heap typheap_mem with explicit
term level heap type description typheap_typ_desc and typing constraints, like constc_guard,
 consth_t_valid and termd t p to a more abstract split heap model with implicit HOL-typing.
The major advantage of the split heap model is that, updates to one particular heap do
not affect all other distinct heaps, removing the obligation to explicitly take care about aliasing.
Aliasing between different heaps is ruled out by the model already.

The question that arises is on which granularity do we split the byte-level heap?
The original AutoCorres splitted at the granularity of types. Each distinct c-type is mapped into
a separate heap. In that model you can still have intra-type aliasing (i.e. pointers of the
same type can alias), but no inter-type aliasing (i.e. pointers of different type are distinct, and
 cannot alias). As an example a pointer to a list structure refers to a different (split) heap as
a pointer to a tree structure. So when a function modifies lists, no pointer to a tree is affected.

Of course not every C-program can be represented in this split-heap model. In those cases
the abstraction fails.
In particular think of a low-level memory allocator, that allocates some raw-bytes and
delivers a void-pointer, that is then retyped to point to the actual structure. To be able
to combine those byte-level functions with high-level typed functions (in the split heap) one
can switch between both layers of abstraction by constexec_concrete and constexec_abstract.
The good thing is that this gives us a very expressive tool to switch between byte-level and
typed view. We can do the specification and proofs of different parts of the program on
the adequate level and still combine the results. But switching between the layers can be
tedious and thus should be limited to the low-level functions that really inherently need
byte-level reasoning.

Unfortunately the type granularity of the split heaps also rules out programs dealing with
pointers into a structure. For example a structure containing a int› field count› cannot pass a
int› pointer to that field count› to a another function, that might update it.
The complete structure has a separate
(split) heap that is distinct from the heap containing int›s. This would mean that we
have to resort to the byte-level heap to reason about those functions.

We extended the split heap model to support this common use-case. The core idea is that
in addition to split heaps on the granularity of types you can also specify split heaps on the
granularity of structure fields. Conceptually you 'open' the structure into its 
components (aka. fields) and each field gets a separate heap. Additionally you can specify which 
fields should be ‹addressable› and thus shared. In the example you can specify that count› is addressable,
so it is represented within the same heap as all other int›. This means that an int› pointer can
now point to a plain int› or to a count› field within a structure. You explicitly allow
this limited inter-type aliasing on int› pointers. All other fields that are not
explicitly marked as ‹addressable› still have their own heap and thus cannot alias with another
pointer of the same field type. Note that mainly for performance reasons we actually avoid to 
have a separate heap for each single field but try to minimise the number of heaps by clustering the 
fields that are treated the same. Some more details on this construction are below. 
›

subsection ‹Example Program and some Intuition›

text ‹We illustrate the approach with the following example specifying some structures,
with addressable and non-addressable fields. Note that for an addressable field that is
 of an array type, each element of the array is considered to be addressable.
›

install_C_file "open_struct.c"

text ‹›

declare [[show_types=false, show_sorts=false,
 verbose = 0, verbose_timing = 0, ML_print_depth = 1000, goals_limit = 20]]