Inductive type definitions
An inductive type definition, sometimes called a datatype, has the following general structure.
This defines \(n\) mutually inductive types, named \(T_1 \ldots T_n\), called the type constructors. Each type constructor \(T_i\) has a number of parameters, the \(\overline{x_i : p_i}\), and a number of indexes, the \(\overline{y_i:q_i}\).
Each type constructor \(T_i\) has zero or more data constructors \(\overline{D_i:t_i}\). For each data constructor \(D_{ij}\), its type \(t_{ij}\) must be of the form \(\overline{z:s} \rightarrow T_i~\bar{x_i}~\bar{e}\), i.e., it must be a function type returning an instance of \(T_i\) with the same parameters \(\overline{x_i}\) as in the type constructor’s signature, but with any other well-typed terms \(\overline{e}\) for the index arguments. This is the main difference between a parameter and an index—a parameter of a type constructor cannot vary in the result type of the data constructors, while the indexes can.
Further, in each of the arguments \(\overline{z:s}\) of the data
constructor, none of the mutually defined type constructors
\(\overline{T}\) may appear to the left of an arrow. That is, all
occurrences of the type constructors must be strictly positive. This
is to ensure that the inductive definitions are well-founded, as
explained below. Without this restriction, it is easy to break
soundness by writing non-terminating functions with Tot
types.
Also related to ensuring logical consistency is the universe level of an inductive type definition. We’ll return to that later, once we’ve done a few examples.
Strictly positive definitions
As a strawman, consider embedding a small dynamically typed
programming language within F*. All terms in our language have the
same static type dyn
, although at runtime values could have
type Bool
, or Int
, or Function
.
One attempt at representing a language like this using a data type in F* is as follows:
noeq
type dyn =
| Bool : bool -> dyn
| Int : int -> dyn
| Function : (dyn -> dyn) -> dyn
The three cases of the data type represent our three kinds of runtime
values: Bool b
, Int b
, and Function f
. The Function
case, however, is problematic: The argument f
is itself a function
from dyn -> dyn
, and the constructor Function
allows promoting
a dyn -> dyn
function into the type dyn
itself, e.g., one can
represent the identity function in dyn
as Function (fun (x:dyn)
-> x)
. However, the Function
case is problematic: as we will see
below, it allows circular definitions that enable constructing
instances of dyn
without actually providing any base case. F*
rejects the definition of dyn
, saying “Inductive type dyn does not
satisfy the strict positivity condition”.
Consider again the general shape of an inductive type definition:
This definition is strictly positive when
for every type constructor \(T \in T_1, ..., T_n\),
and every data constructor \(D : t \in \overline{D_1}, ... \overline{D_n}\), where t is of the form \(x0:s_0 \rightarrow ... \rightarrow xn:s_n \rightarrow T_i ...\), and \(s_0, ..., s_n\) are the types of the fields of \(D\)
and for all instantiations \(\overline{v}\) of the type parameters \(\overline{p}\) of the type \(T\),
\(T\) does not appear to the left of any arrow in any \(s \in (s_0, ..., s_k)[\overline{v}/\overline{p}]\).
Our type dyn
violates this condition, since the defined typed
dyn
appears to the left of an arrow type in the dyn ->
dyn
-typed field of the Function
constructor.
To see what goes wrong if F* were to accept this definition, we can
suppress the error reported by using the option __no_positivity
and see what happens.
#push-options "--__no_positivity"
noeq
type dyn =
| Bool : bool -> dyn
| Int : int -> dyn
| Function : (dyn -> dyn) -> dyn
#pop-options
Note
F* maintains an internal stack of command line options. The
#push-options
pragma pushes additional options at the top of
the stack, while #pop-options
pops the stack. The pattern used
here instructs F* to typecheck dyn
only with the
__no_positivity
option enabled. As we will see, the
__no_positivity
option can be used to break soundness, so use
it only if you really know what you’re doing.
Now, having declared that dyn
is a well-formed inductive type,
despite not being strictly positive, we can break the soundness of
F*. In particular, we can write terms and claim they are total, when
in fact their execution will loop forever.
let loop' (f:dyn)
: dyn
= match f with
| Function g -> g f
| _ -> f
let loop
: dyn
= loop' (Function loop')
Here, the type of loop
claims that it is a term that always
evaluates in a finite number of steps to a value of type dyn
. Yet,
reducing it produces an infinite chain of calls to loop'
(Function loop')
. Admitting a non-positive definition like dyn
has allowed us to build a non-terminating loop.
Such loops can also allow one to prove False
(breaking soundness),
as the next example shows.
#push-options "--__no_positivity"
noeq
type non_positive =
| NP : (non_positive -> False) -> non_positive
#pop-options
let almost_false (f:non_positive)
: False
= let NP g = f in g f
let ff
: False
= almost_false (NP almost_false)
This example is very similar to dyn
, except NP
stores a
non-positive function that returns False
, which allows use to
prove ff : False
, i.e., in this example, not only does the
violation of strict positivity lead to an infinite loop at runtime, it
also renders the entire proof system of F* useless, since one can
prove False
.
Finally, in the example below, although the type also_non_pos
does
not syntactically appear to the left of an arrow in a field of the
ANP
constructor, an instantiation of the type parameter f
(e.g., with the type f_false
) does make it appear to the left of
an arrow—so this type too is deemed not strictly positive, and can be used
to prove False
.
#push-options "--__no_positivity"
noeq
type also_non_pos (f:Type -> Type) =
| ANP : f (also_non_pos f) -> also_non_pos f
#pop-options
let f_false
: Type -> Type
= fun a -> (a -> False)
let almost_false_again
: f_false (also_non_pos f_false)
= fun x -> let ANP h = x in h x
let ff_again
: False
= almost_false_again (ANP almost_false_again)
We hope you are convinced that non-strictly positive types should not
be admissible in inductive type definitions. In what follows, we will
no longer use the __no_positivity
option. In a later section, once
we’ve introduced the effect of divergence, we will see that
non-positive definitions can safely be used in a context where
programs are not expected to terminate, allowing one to safely model
things like the dyn
type, without compromising the soundness of
F*.
Strictly Positive Annotations
Sometimes it is useful to parameterize an inductive definition with a
type function, without introducing a non-positive definition as we did
in also_non_pos
above.
For example, the definition below introduces a type free f a
, a
form of a tree whose leaf nodes contain a
values, and whose
internal nodes branch according the type function f
.
noeq
type free (f:([@@@ strictly_positive] _:Type -> Type))
(a:Type)
: Type =
| Leaf : a -> free f a
| Branch : f (free f a) -> free f a
We can instantiate this generic free
to produce various kinds of
trees. Note: when instantiating free list a
in
variable_branching_list
below, we need to explicitly re-define the
list
type with a strict-positivity annotation: F* does not
correctly support rechecking type constructors to prove that they are
strictly positive when they are used at higher order.
let binary_tree (a:Type) = free (fun t -> t & t) a
let list_redef ([@@@strictly_positive] a:Type) = list a
let variable_branching_list a = free list_redef a
let infinite_branching_tree a = free (fun t -> nat -> t) a
However, we should only be allowed to instantate f
with type
functions that are strictly positive in their argument, since otherwise
we can build a proof of False
, as we did with
also_non_pos
. The @@@strictly_positive
attribute on the
formal parameter of f
enforces this.
If we were to try to instantiate free
with a non-strictly positive
type function,
let free_bad = free (fun t -> (t -> False)) int
then F* raises an error:
Binder (t: Type) is marked strictly positive, but its use in the definition is not
Unused Annotations
Sometimes one indexes a type by another type, though the index has no
semantic meaning. For example, in several F* developments that model
mutable state, the a heap reference is just a natural number modeling
its address in the heap. However, one might use the type let ref
(a:Type) = nat
to represent the type of a reference, even though the
type a
is not used in the definition. In such cases, it can be
useful to mark the parameter as unused, to inform F*’s positivity
checker that the type index is actually irrelevant. The snippet below
shows an example:
irreducible
let ref ([@@@unused] a:Type) = nat
noeq
type linked_list (a:Type) =
| LL : ref (a & linked_list a) -> linked_list a
noeq
type neg_unused =
| NU : ref (neg_unused -> bool) -> neg_unused
Here, we’ve marked the parameter of ref
with the unused
attribute. We’ve also marked ref
as irreducible
just to
ensure for this example that F* does not silently unfold the
definition of ref
.
Now, knowing that the parameter of ref
is unused, one can define
types like linked_list a
, where although linked_list a
appears
as an argument to the ref
type, the positivity checker accepts it,
since the parameter is unused. This is similar to the use of a
strictly_positive
annotation on a parameter.
However, with the unused
attribute, one can go further: e.g., the
type neg_unused
shows that even a negative occurrence of the
defined type is accepted, so long as it appears only as an
instantiation of an unused parameter.