Theory Category

(*  Title:       Category
    Author:      Eugene W. Stark <stark@cs.stonybrook.edu>, 2016
    Maintainer:  Eugene W. Stark <stark@cs.stonybrook.edu>
*)

chapter "Category"

theory Category
imports Main "HOL-Library.FuncSet"
begin

  text ‹
    This theory develops an ``object-free'' definition of category loosely following
    \<^cite>‹"AHS"›, Sec. 3.52-3.53.
    We define the notion ``category'' in terms of axioms that concern a single
    partial binary operation on a type, some of whose elements are to be regarded
    as the ``arrows'' of the category.

    The nonstandard definition of category has some advantages and disadvantages.
    An advantage is that only one piece of data (the composition operation) is required
    to specify a category, so the use of records is not required to bundle up several
    separate objects.  A related advantage is the fact that functors and natural
    transformations can be defined simply to be functions that satisfy certain axioms,
    rather than more complex composite objects.
    One disadvantage is that the notions of ``object'' and ``identity arrow'' are
    conflated, though this is easy to get used to.  Perhaps a more significant disadvantage
    is that each arrow of a category must carry along the information about its domain
    and codomain. This implies, for example, that the arrows of a category of sets and
    functions cannot be directly identified with functions, but rather only with functions that
    have been equipped with their domain and codomain sets.

    To represent the partiality of the composition operation of a category, we assume that the
    composition for a category has a unique zero element, which we call ‹null›,
    and we consider arrows to be ``composable'' if and only if their composite is non-null.
    Functors and natural transformations are required to map arrows to arrows and be
    ``extensional'' in the sense that they map non-arrows to null.  This is so that
    equality of functors and natural transformations coincides with their extensional equality
    as functions in HOL.
    The fact that we co-opt an element of the arrow type to serve as ‹null› means that
    it is not possible to define a category whose arrows exhaust the elements of a given type.
    This presents a disadvantage in some situations.  For example, we cannot construct a
    discrete category whose arrows are directly identified with the set of \emph{all}
    elements of a given type @{typ 'a}; instead, we must pass to a larger type
    (such as @{typ "'a option"}) so that there is an element available for use as ‹null›.
    The presence of ‹null›, however, is crucial to our being able to define a
    system of introduction and elimination rules that can be applied automatically to establish
    that a given expression denotes an arrow.  Without ‹null›, we would be able to
    define an introduction rule to infer, say, that the composition of composable arrows
    is composable, but not an elimination rule to infer that arrows are composable from
    the fact that their composite is an arrow.  Having the ability to do both is critical
    to the usability of the theory.
›

  text ‹
    A \emph{partial magma} is a partial binary operation ‹OP› defined on the set
    of elements at a type @{typ 'a}.  As discussed above,
    we assume the existence of a unique element ‹null› of type @{typ 'a}
    that is a zero for ‹OP›, and we use ‹null› to represent ``undefined''.
  ›

  locale partial_magma =
  fixes OP :: "'a ⇒ 'a ⇒ 'a"
  assumes ex_un_null: "∃!n. ∀t. OP n t = n ∧ OP t n = n"
  begin

    definition null :: 'a
    where "null = (THE n. ∀t. OP n t = n ∧ OP t n = n)"

    lemma null_eqI:
    assumes "⋀t. OP n t = n ∧ OP t n = n"
    shows "n = null"
      using assms null_def ex_un_null the1_equality [of "λn. ∀t. OP n t = n ∧ OP t n = n"]
      by auto
    
    lemma null_is_zero [simp]:
    shows "OP null t = null" and "OP t null = null"
      using null_def ex_un_null theI' [of "λn. ∀t. OP n t = n ∧ OP t n = n"]
      by auto

  end

  section "Partial Composition"

  text ‹
    A \emph{partial composition} is formally the same thing as a partial magma,
    except that we think of the operation as an operation of ``composition'',
    and we regard elements ‹f› and ‹g› of type @{typ 'a} as \emph{composable}
    if their composition is non-null.
›

  type_synonym 'a comp = "'a ⇒ 'a ⇒ 'a"

  locale partial_composition =
    partial_magma C
  for C :: "'a comp" (infixr ‹⋅› 55)
  begin

    text ‹
      An \emph{identity} is a self-composable element ‹a› such that composition of
      any other element ‹f› with ‹a› on either the left or the right results in
      ‹f› whenever the composition is defined.
›

    definition ide
    where "ide a ≡ a ⋅ a ≠ null ∧
                   (∀f. (f ⋅ a ≠ null ⟶ f ⋅ a = f) ∧ (a ⋅ f ≠ null ⟶ a ⋅ f = f))"

    text ‹
      A \emph{domain} of an element ‹f› is an identity ‹a› for which composition of
      ‹f› with ‹a› on the right is defined.
      The notion \emph{codomain} is defined similarly, using composition on the left.
      Note that, although these definitions are completely dual, the choice of terminology
      implies that we will think of composition as being written in traditional order,
      as opposed to diagram order.  It is pretty much essential to do it this way, to maintain
      compatibility with the notation for function application once we start working with
      functors and natural transformations.
›

    definition domains
    where "domains f ≡ {a. ide a ∧ f ⋅ a ≠ null}"

    definition codomains
    where "codomains f ≡ {b. ide b ∧ b ⋅ f ≠ null}"

    lemma domains_null:
    shows "domains null = {}"
      by (simp add: domains_def)

    lemma codomains_null:
    shows "codomains null = {}"
      by (simp add: codomains_def)

    lemma self_domain_iff_ide:
    shows "a ∈ domains a ⟷ ide a"
      using ide_def domains_def by auto

    lemma self_codomain_iff_ide:
    shows "a ∈ codomains a ⟷ ide a"
      using ide_def codomains_def by auto

    text ‹
      An element ‹f› is an \emph{arrow} if either it has a domain or it has a codomain.
      In an arbitrary partial magma it is possible for ‹f› to have one but not the other,
      but the ‹category› locale will include assumptions to rule this out.
›

    definition arr
    where "arr f ≡ domains f ≠ {} ∨ codomains f ≠ {}"

    lemma not_arr_null [simp]:
    shows "¬ arr null"
      by (simp add: arr_def domains_null codomains_null)

    text ‹
      Using the notions of domain and codomain, we can define \emph{homs}.
      The predicate @{term "in_hom f a b"} expresses ``@{term f} is an arrow from @{term a}
      to @{term b},'' and the term @{term "hom a b"} denotes the set of all such arrows.
      It is convenient to have both of these, though passing back and forth sometimes involves
      extra work.  We choose @{term "in_hom"} as the more fundamental notion.
›

    definition in_hom     (‹«_ : _ → _»›)
    where "«f : a → b» ≡ a ∈ domains f ∧ b ∈ codomains f"

    abbreviation hom
    where "hom a b ≡ {f. «f : a → b»}"

    lemma arrI:
    assumes "«f : a → b»"
    shows "arr f"
      using assms arr_def in_hom_def by auto

    lemma ide_in_hom [intro]:
    shows "ide a ⟷ «a : a → a»"
      using self_domain_iff_ide self_codomain_iff_ide in_hom_def ide_def by fastforce

    text ‹
      Arrows @{term "f"} @{term "g"} for which the composite @{term "g ⋅ f"} is defined
      are \emph{sequential}.
›

    abbreviation seq
    where "seq g f ≡ arr (g ⋅ f)"

    lemma comp_arr_ide:
    assumes "ide a" and "seq f a"
    shows "f ⋅ a = f"
      using assms ide_in_hom ide_def not_arr_null by metis

    lemma comp_ide_arr:
    assumes "ide b" and "seq b f"
    shows "b ⋅ f = f"
      using assms ide_in_hom ide_def not_arr_null by metis

    text ‹
      The \emph{domain} of an arrow @{term f} is an element chosen arbitrarily from the
      set of domains of @{term f} and the \emph{codomain} of @{term f} is an element chosen
      arbitrarily from the set of codomains.
›

    definition dom
    where "dom f = (if domains f ≠ {} then (SOME a. a ∈ domains f) else null)"

    definition cod
    where "cod f = (if codomains f ≠ {} then (SOME b. b ∈ codomains f) else null)"

    lemma dom_null [simp]:
    shows "dom null = null"
      by (simp add: dom_def domains_null)

    lemma cod_null [simp]:
    shows "cod null = null"
      by (simp add: cod_def codomains_null)

    lemma dom_in_domains:
    assumes "domains f ≠ {}"
    shows "dom f ∈ domains f"
      using assms dom_def someI [of "λa. a ∈ domains f"] by auto

    lemma cod_in_codomains:
    assumes "codomains f ≠ {}"
    shows "cod f ∈ codomains f"
      using assms cod_def someI [of "λb. b ∈ codomains f"] by auto

  end

  section "Categories"

  text‹
    A \emph{category} is defined to be a partial magma whose composition satisfies an
    extensionality condition, an associativity condition, and the requirement that every
    arrow have both a domain and a codomain.
    The associativity condition involves four ``matching conditions''
    (‹match_1›, ‹match_2›, ‹match_3›, and ‹match_4›)
    which constrain the domain of definition of the composition, and a fifth condition
    (‹comp_assoc'›) which states that the results of the two ways of composing
    three elements are equal.  In the presence of the ‹comp_assoc'› axiom
    ‹match_4› can be derived from ‹match_3› and vice versa.
›

  locale category = partial_composition +
  assumes ext: "g ⋅ f ≠ null ⟹ seq g f"
  and has_domain_iff_has_codomain: "domains f ≠ {} ⟷ codomains f ≠ {}"
  and match_1: "⟦ seq h g; seq (h ⋅ g) f ⟧ ⟹ seq g f"
  and match_2: "⟦ seq h (g ⋅ f); seq g f ⟧ ⟹ seq h g"
  and match_3: "⟦ seq g f; seq h g ⟧ ⟹ seq (h ⋅ g) f"
  and comp_assoc': "⟦ seq g f; seq h g ⟧ ⟹ (h ⋅ g) ⋅ f = h ⋅ g ⋅ f"
  begin

    text‹
      Associativity of composition holds unconditionally.  This was not the case in
      previous, weaker versions of this theory, and I did not notice this for some
      time after updating to the current axioms.  It is obviously an advantage that
      no additional hypotheses have to be verified in order to apply associativity,
      but a disadvantage is that this fact is now ``too readily applicable,''
      so that if it is made a default simplification it tends to get in the way of
      applying other simplifications that we would also like to be able to apply automatically.
      So, it now seems best not to make this fact a default simplification, but rather
      to invoke it explicitly where it is required.
›

    lemma comp_assoc:
    shows "(h ⋅ g) ⋅ f = h ⋅ g ⋅ f"
      by (metis comp_assoc' ex_un_null ext match_1 match_2)

    lemma match_4:
    assumes "seq g f" and "seq h g"
    shows "seq h (g ⋅ f)"
      using assms match_3 comp_assoc by auto

    lemma domains_comp:
    assumes "seq g f"
    shows "domains (g ⋅ f) = domains f"
    proof -
      have "domains (g ⋅ f) = {a. ide a ∧ seq (g ⋅ f) a}"
        using domains_def ext by auto
      also have "... = {a. ide a ∧ seq f a}"
        using assms ide_def match_1 match_3 by meson
      also have "... = domains f"
        using domains_def ext by auto
      finally show ?thesis by blast
    qed

    lemma codomains_comp:
    assumes "seq g f"
    shows "codomains (g ⋅ f) = codomains g"
    proof -
      have "codomains (g ⋅ f) = {b. ide b ∧ seq b (g ⋅ f)}"
        using codomains_def ext by auto
      also have "... = {b. ide b ∧ seq b g}"
        using assms ide_def match_2 match_4 by meson
      also have "... = codomains g"
        using codomains_def ext by auto
      finally show ?thesis by blast
    qed

    lemma has_domain_iff_arr:
    shows "domains f ≠ {} ⟷ arr f"
      by (simp add: arr_def has_domain_iff_has_codomain)

    lemma has_codomain_iff_arr:
    shows "codomains f ≠ {} ⟷ arr f"
      using has_domain_iff_arr has_domain_iff_has_codomain by auto

    text‹
      A consequence of the category axioms is that domains and codomains, if they exist,
      are unique.
›

    lemma domain_unique:
    assumes "a ∈ domains f" and "a' ∈ domains f"
    shows "a = a'"
    proof -
      have "ide a ∧ seq f a ∧ ide a' ∧ seq f a'"
        using assms domains_def ext by force
      thus ?thesis
        using match_1 ide_def not_arr_null by metis
    qed

    lemma codomain_unique:
    assumes "b ∈ codomains f" and "b' ∈ codomains f"
    shows "b = b'"
    proof -
      have "ide b ∧ seq b f ∧ ide b' ∧ seq b' f"
        using assms codomains_def ext by force
      thus ?thesis
        using match_2 ide_def not_arr_null by metis
    qed

    lemma domains_simp:
    assumes "arr f"
    shows "domains f = {dom f}"
      using assms dom_in_domains has_domain_iff_arr domain_unique by auto

    lemma codomains_simp:
    assumes "arr f"
    shows "codomains f = {cod f}"
      using assms cod_in_codomains has_codomain_iff_arr codomain_unique by auto

    lemma domains_char:
    shows "domains f = (if arr f then {dom f} else {})"
      using dom_in_domains has_domain_iff_arr domain_unique by auto

    lemma codomains_char:
    shows "codomains f = (if arr f then {cod f} else {})"
      using cod_in_codomains has_codomain_iff_arr codomain_unique by auto

    text‹
      A consequence of the following lemma is that the notion @{term "arr"} is redundant,
      given @{term "in_hom"}, @{term "dom"}, and @{term "cod"}.  However, I have retained it
      because I have not been able to find a set of usefully powerful simplification rules
      expressed only in terms of @{term "in_hom"} that does not result in looping in many
      situations.
›

    lemma arr_iff_in_hom:
    shows "arr f ⟷ «f : dom f → cod f»"
      using cod_in_codomains dom_in_domains has_domain_iff_arr has_codomain_iff_arr in_hom_def
      by auto

    lemma in_homI [intro]:
    assumes "arr f" and "dom f = a" and "cod f = b"
    shows "«f : a → b»"
      using assms cod_in_codomains dom_in_domains has_domain_iff_arr has_codomain_iff_arr
            in_hom_def
      by auto

    lemma in_homE [elim]:
    assumes "«f : a → b»"
    and "arr f ⟹ dom f = a ⟹ cod f = b ⟹ T"
    shows "T"
     using assms in_hom_def domains_char codomains_char has_domain_iff_arr
     by (metis empty_iff singleton_iff)

    text‹
      To obtain the ``only if'' direction in the next two results and in similar results later
      for composition and the application of functors and natural transformations,
      is the reason for assuming the existence of @{term null} as a special element of the
      arrow type, as opposed to, say, using option types to represent partiality.
      The presence of @{term null} allows us not only to make the ``upward'' inference that
      the domain of an arrow is again an arrow, but also to make the ``downward'' inference
      that if @{term "dom f"} is an arrow then so is @{term f}.  Similarly, we will be able
      to infer not only that if @{term f} and @{term g} are composable arrows then
      @{term "C g f"} is an arrow, but also that if @{term "C g f"} is an arrow then
      ‹f› and ‹g› are composable arrows.  These inferences allow most necessary
      facts about what terms denote arrows to be deduced automatically from minimal
      assumptions.  Typically all that is required is to assume or establish that certain
      terms denote arrows in particular homs at the point where those terms are first
      introduced, and then similar facts about related terms can be derived automatically.
      Without this feature, nearly every proof would involve many tedious additional steps
      to establish that each of the terms appearing in the proof (including all its subterms)
      in fact denote arrows.
›

    lemma arr_dom_iff_arr:
    shows "arr (dom f) ⟷ arr f"
      using dom_def dom_in_domains has_domain_iff_arr self_domain_iff_ide domains_def
      by fastforce

    lemma arr_cod_iff_arr:
    shows "arr (cod f) ⟷ arr f"
      using cod_def cod_in_codomains has_codomain_iff_arr self_codomain_iff_ide codomains_def
      by fastforce

    lemma arr_dom [simp]:
    assumes "arr f"
    shows "arr (dom f)"
      using assms arr_dom_iff_arr by simp

    lemma arr_cod [simp]:
    assumes "arr f"
    shows "arr (cod f)"
      using assms arr_cod_iff_arr by simp

    lemma seqI [simp]:
    assumes "arr f" and "arr g" and "dom g = cod f"
    shows "seq g f"
    proof -
      have "ide (cod f) ∧ seq (cod f) f"
        using assms(1) has_codomain_iff_arr codomains_def cod_in_codomains ext by blast
      moreover have "ide (cod f) ∧ seq g (cod f)"
        using assms(2-3) domains_def domains_simp ext by fastforce
      ultimately show ?thesis
        using match_4 ide_def ext by metis
    qed

    text ‹
      This version of ‹seqI› is useful as an introduction rule, but not as useful
      as a simplification, because it requires finding the intermediary term ‹b›.
      Sometimes \emph{auto} is able to do this, but other times it is more expedient
      just to invoke this rule and fill in the missing terms manually, especially
      when dealing with a chain of compositions.
    ›

    lemma seqI' [intro]:
    assumes "«f : a → b»" and "«g : b → c»"
    shows "seq g f"
      using assms by fastforce

    lemma compatible_iff_seq:
    shows "domains g ∩ codomains f ≠ {} ⟷ seq g f"
    proof
      show "domains g ∩ codomains f ≠ {} ⟹ seq g f"
        using cod_in_codomains dom_in_domains empty_iff has_domain_iff_arr has_codomain_iff_arr
              domain_unique codomain_unique
        by (metis Int_emptyI seqI)
      show "seq g f ⟹ domains g ∩ codomains f ≠ {}"
      proof -
        assume gf: "seq g f"
        have 1: "cod f ∈ codomains f"
          using gf has_domain_iff_arr domains_comp cod_in_codomains codomains_simp by blast
        have "ide (cod f) ∧ seq (cod f) f"
          using 1 codomains_def ext by auto
        hence "seq g (cod f)"
          using gf has_domain_iff_arr match_2 domains_null ide_def by metis
        thus ?thesis
          using domains_def 1 codomains_def by auto
      qed
    qed

    text‹
      The following is another example of a crucial ``downward'' rule that would not be possible
      without a reserved @{term null} value.
›

    lemma seqE [elim]:
    assumes "seq g f"
    and "arr f ⟹ arr g ⟹ dom g = cod f ⟹ T"
    shows "T"
      using assms cod_in_codomains compatible_iff_seq has_domain_iff_arr has_codomain_iff_arr
            domains_comp codomains_comp domains_char codomain_unique
      by (metis Int_emptyI singletonD)

    lemma comp_in_homI [intro]:
    assumes "«f : a → b»" and "«g : b → c»"
    shows "«g ⋅ f : a → c»"
      using assms(1-2) codomains_comp domains_comp in_hom_def seqI' by auto

    lemma comp_in_homI' [simp]:
    assumes "arr f" and "arr g" and "dom f = a" and "cod g = c" and "dom g = cod f"
    shows "«g ⋅ f : a → c»"
      using assms by auto

    lemma comp_in_homE [elim]:
    assumes "«g ⋅ f : a → c»"
    obtains b where "«f : a → b»" and "«g : b → c»"
      using assms in_hom_def domains_comp codomains_comp
      by (metis arrI in_homI seqE)

    text ‹
      The next two rules are useful as simplifications, but they slow down the
      simplifier too much to use them by default.  So it is necessary to guess when
      they are needed and cite them explicitly.  This is usually not too difficult.
    ›

    lemma comp_arr_dom:
    assumes "arr f" and "dom f = a"
    shows "f ⋅ a = f"
      using assms dom_in_domains has_domain_iff_arr domains_def ide_def by auto

    lemma comp_cod_arr:
    assumes "arr f" and "cod f = b"
    shows "b ⋅ f = f"
      using assms cod_in_codomains has_codomain_iff_arr ide_def codomains_def by auto

    lemma ide_char:
    shows "ide a ⟷ arr a ∧ dom a = a ∧ cod a = a"
      using ide_in_hom by auto

    text ‹
      In some contexts, this rule causes the simplifier to loop, but it is too useful
      not to have as a default simplification.  In cases where it is a problem, usually
      a method like \emph{blast} or \emph{force} will succeed if this rule is cited
      explicitly.
    ›

    lemma ideD [simp]:
    assumes "ide a"
    shows "arr a" and "dom a = a" and "cod a = a"
      using assms ide_char by auto

    lemma ide_dom [simp]:
    assumes "arr f"
    shows "ide (dom f)"
      using assms dom_in_domains has_domain_iff_arr domains_def by auto

    lemma ide_cod [simp]:
    assumes "arr f"
    shows "ide (cod f)"
      using assms cod_in_codomains has_codomain_iff_arr codomains_def by auto

    lemma dom_eqI:
    assumes "ide a" and "seq f a"
    shows "dom f = a"
      using assms cod_in_codomains codomain_unique ide_char
      by (metis seqE)

    lemma cod_eqI:
    assumes "ide b" and "seq b f"
    shows "cod f = b"
      using assms dom_in_domains domain_unique ide_char
      by (metis seqE)

    lemma dom_eqI':
    assumes "a ∈ domains f"
    shows "a = dom f"
      using assms dom_in_domains domain_unique by blast

    lemma cod_eqI':
    assumes "a ∈ codomains f"
    shows "a = cod f"
      using assms cod_in_codomains codomain_unique by blast

    lemma ide_char':
    shows "ide a ⟷ arr a ∧ (dom a = a ∨ cod a = a)"
      using ide_dom ide_cod ide_char by metis

    lemma dom_dom:
    shows "dom (dom f) = dom f"
      by (metis dom_null domains_char ideD(2) ide_dom dom_def)

    lemma cod_cod:
    shows "cod (cod f) = cod f"
      by (metis arr_cod_iff_arr cod_def has_codomain_iff_arr ideD(3) ide_cod)

    lemma dom_cod:
    shows "dom (cod f) = cod f"
      by (metis arr_cod_iff_arr cod_def has_codomain_iff_arr has_domain_iff_arr ideD(2)
          ide_cod dom_def)

    lemma cod_dom:
    shows "cod (dom f) = dom f"
      by (metis cod_null has_domain_iff_arr ideD(3) ide_dom dom_def)

    lemma dom_comp [simp]:
    assumes "seq g f"
    shows "dom (g ⋅ f) = dom f"
      using assms by (simp add: dom_def domains_comp)

    lemma cod_comp [simp]:
    assumes "seq g f"
    shows "cod (g ⋅ f) = cod g"
      using assms by (simp add: cod_def codomains_comp)

    lemma comp_ide_self [simp]:
    assumes "ide a"
    shows "a ⋅ a = a"
      using assms comp_arr_ide arrI by auto

    lemma ide_compE [elim]:
    assumes "ide (g ⋅ f)"
    and "seq g f ⟹ seq f g ⟹ g ⋅ f = dom f ⟹ g ⋅ f = cod g ⟹ T"
    shows "T"
      using assms dom_comp cod_comp ide_char ide_in_hom
      by (metis seqE seqI)

    text ‹
      The next two results are sometimes useful for performing manipulations at the
      head of a chain of composed arrows.  I have adopted the convention that such
      chains are canonically represented in right-associated form.  This makes it
      easy to perform manipulations at the ``tail'' of a chain, but more difficult
      to perform them at the ``head''.  These results take care of the rote manipulations
      using associativity that are needed to either permute or combine arrows at the
      head of a chain.
›

    lemma comp_permute:
    assumes "f ⋅ g = k ⋅ l" and "seq f g" and "seq g h"
    shows "f ⋅ g ⋅ h = k ⋅ l ⋅ h"
      using assms by (metis comp_assoc)

    lemma comp_reduce:
    assumes "f ⋅ g = k" and "seq f g" and "seq g h"
    shows "f ⋅ g ⋅ h = k ⋅ h"
      using assms comp_assoc by auto

    text‹
      Here we define some common configurations of arrows.
      These are defined as abbreviations, because we want all ``diagrammatic'' assumptions
      in a theorem to reduce readily to a conjunction of assertions of the basic forms
      @{term "arr f"}, @{term "dom f = X"}, @{term "cod f = Y"}, and @{term "in_hom f a b"}.
›

    abbreviation endo
    where "endo f ≡ seq f f"
     
    abbreviation antipar
    where "antipar f g ≡ seq g f ∧ seq f g"

    abbreviation span
    where "span f g ≡ arr f ∧ arr g ∧ dom f = dom g"

    abbreviation cospan
    where "cospan f g ≡ arr f ∧ arr g ∧ cod f = cod g"

    abbreviation par
    where "par f g ≡ arr f ∧ arr g ∧ dom f = dom g ∧ cod f = cod g"

  end

end