# Inductive type definitions

An inductive type definition, sometimes called a datatype, has the following general structure.

$\begin{split}\mathsf{type}~T_1~\overline{(x_1:p_1)} : \overline{y_1:q_1} \rightarrow \mathsf{Type} = \overline{| D_1 : t_1} \\ \ldots\qquad\qquad\qquad\qquad\\ \mathsf{and}~T_n~\overline{(x_n:p_n)} : \overline{y_n:q_n} \rightarrow \mathsf{Type} = \overline{| D_n : t_n} \\\end{split}$

This defines $$n$$ mutually inductive types, named $$T_1 \ldots T_n$$, called the type constructors. Each type constructor $$T_i$$ has a number of parameters, the $$\overline{x_i : p_i}$$, and a number of indexes, the $$\overline{y_i:q_i}$$.

Each type constructor $$T_i$$ has zero or more data constructors $$\overline{D_i:t_i}$$. For each data constructor $$D_{ij}$$, its type $$t_{ij}$$ must be of the form $$\overline{z:s} \rightarrow T_i~\bar{x_i}~\bar{e}$$, i.e., it must be a function type returning an instance of $$T_i$$ with the same parameters $$\overline{x_i}$$ as in the type constructor’s signature, but with any other well-typed terms $$\overline{e}$$ for the index arguments. This is the main difference between a parameter and an index—a parameter of a type constructor cannot vary in the result type of the data constructors, while the indexes can.

Further, in each of the arguments $$\overline{z:s}$$ of the data constructor, none of the mutually defined type constructors $$\overline{T}$$ may appear to the left of an arrow. That is, all occurrences of the type constructors must be strictly positive. This is to ensure that the inductive definitions are well-founded, as explained below. Without this restriction, it is easy to break soundness by writing non-terminating functions with Tot types.

Also related to ensuring logical consistency is the universe level of an inductive type definition. We’ll return to that later, once we’ve done a few examples.

## Strictly positive definitions

As a strawman, consider embedding a small dynamically typed programming language within F*. All terms in our language have the same static type dyn, although at runtime values could have type Bool, or Int, or Function.

One attempt at representing a language like this using a data type in F* is as follows:

noeq
type dyn =
| Bool : bool -> dyn
| Int  : int -> dyn
| Function : (dyn -> dyn) -> dyn


The three cases of the data type represent our three kinds of runtime values: Bool b, Int b, and Function f. The Function case, however, is problematic: The argument f is itself a function from dyn -> dyn, and the constructor Function allows promoting a dyn -> dyn function into the type dyn itself, e.g., one can represent the identity function in dyn as Function (fun (x:dyn) -> x). However, the Function case is problematic: as we will see below, it allows circular definitions that enable constructing instances of dyn without actually providing any base case. F* rejects the definition of dyn, saying “Inductive type dyn does not satisfy the strict positivity condition”.

Consider again the general shape of an inductive type definition:

$\begin{split}\mathsf{type}~T_1~\overline{(x_1:p_1)} : \overline{y_1:q_1} \rightarrow \mathsf{Type} = \overline{| D_1 : t_1} \\ \ldots\qquad\qquad\qquad\qquad\\ \mathsf{and}~T_n~\overline{(x_n:p_n)} : \overline{y_n:q_n} \rightarrow \mathsf{Type} = \overline{| D_n : t_n} \\\end{split}$

This definition is strictly positive when

• for every type constructor $$T \in T_1, ..., T_n$$,

• and every data constructor $$D : t \in \overline{D_1}, ... \overline{D_n}$$, where t is of the form $$x0:s_0 \rightarrow ... \rightarrow xn:s_n \rightarrow T_i ...$$, and $$s_0, ..., s_n$$ are the types of the fields of $$D$$

• and for all instantiations $$\overline{v}$$ of the type parameters $$\overline{p}$$ of the type $$T$$,

• $$T$$ does not appear to the left of any arrow in any $$s \in (s_0, ..., s_k)[\overline{v}/\overline{p}]$$.

Our type dyn violates this condition, since the defined typed dyn appears to the left of an arrow type in the dyn -> dyn-typed field of the Function constructor.

To see what goes wrong if F* were to accept this definition, we can suppress the error reported by using the option __no_positivity and see what happens.

#push-options "--__no_positivity"
noeq
type dyn =
| Bool : bool -> dyn
| Int  : int -> dyn
| Function : (dyn -> dyn) -> dyn
#pop-options


Note

F* maintains an internal stack of command line options. The #push-options pragma pushes additional options at the top of the stack, while #pop-options pops the stack. The pattern used here instructs F* to typecheck dyn only with the __no_positivity option enabled. As we will see, the __no_positivity option can be used to break soundness, so use it only if you really know what you’re doing.

Now, having declared that dyn is a well-formed inductive type, despite not being strictly positive, we can break the soundness of F*. In particular, we can write terms and claim they are total, when in fact their execution will loop forever.

let loop' (f:dyn)
: dyn
= match f with
| Function g -> g f
| _ -> f

let loop
: dyn
= loop' (Function loop')


Here, the type of loop claims that it is a term that always evaluates in a finite number of steps to a value of type dyn. Yet, reducing it produces an infinite chain of calls to loop' (Function loop'). Admitting a non-positive definition like dyn has allowed us to build a non-terminating loop.

Such loops can also allow one to prove False (breaking soundness), as the next example shows.

#push-options "--__no_positivity"
noeq
type non_positive =
| NP : (non_positive -> False) -> non_positive
#pop-options

let almost_false (f:non_positive)
: False
= let NP g = f in g f

let ff
: False
= almost_false (NP almost_false)


This example is very similar to dyn, except NP stores a non-positive function that returns False, which allows use to prove ff : False, i.e., in this example, not only does the violation of strict positivity lead to an infinite loop at runtime, it also renders the entire proof system of F* useless, since one can prove False.

Finally, in the example below, although the type also_non_pos does not syntactically appear to the left of an arrow in a field of the ANP constructor, an instantiation of the type parameter f (e.g., with the type f_false) does make it appear to the left of an arrow—so this type too is deemed not strictly positive, and can be used to prove False.

#push-options "--__no_positivity"
noeq
type also_non_pos (f:Type -> Type) =
| ANP : f (also_non_pos f) -> also_non_pos f
#pop-options

let f_false
: Type -> Type
= fun a -> (a -> False)

let almost_false_again
: f_false (also_non_pos f_false)
= fun x -> let ANP h = x in h x

let ff_again
: False
= almost_false_again (ANP almost_false_again)


We hope you are convinced that non-strictly positive types should not be admissible in inductive type definitions. In what follows, we will no longer use the __no_positivity option. In a later section, once we’ve introduced the effect of divergence, we will see that non-positive definitions can safely be used in a context where programs are not expected to terminate, allowing one to safely model things like the dyn type, without compromising the soundness of F*.

### Strictly Positive Annotations

Sometimes it is useful to parameterize an inductive definition with a type function, without introducing a non-positive definition as we did in also_non_pos above.

For example, the definition below introduces a type free f a, a form of a tree whose leaf nodes contain a values, and whose internal nodes branch according the type function f.

noeq
type free (f:([@@@ strictly_positive] _:Type -> Type))
(a:Type)
: Type =
| Leaf : a -> free f a
| Branch : f (free f a) -> free f a


We can instantiate this generic free to produce various kinds of trees. Note: when instantiating free list a in variable_branching_list below, we need to explicitly re-define the list type with a strict-positivity annotation: F* does not correctly support rechecking type constructors to prove that they are strictly positive when they are used at higher order.

let binary_tree (a:Type) = free (fun t -> t & t) a
let list_redef ([@@@strictly_positive] a:Type) = list a
let variable_branching_list a = free list_redef a
let infinite_branching_tree a = free (fun t -> nat -> t) a


However, we should only be allowed to instantate f with type functions that are strictly positive in their argument, since otherwise we can build a proof of False, as we did with also_non_pos. The @@@strictly_positive attribute on the formal parameter of f enforces this.

If we were to try to instantiate free with a non-strictly positive type function,

let free_bad = free (fun t -> (t -> False)) int


then F* raises an error:

Binder (t: Type) is marked strictly positive, but its use in the definition is not


### Unused Annotations

Sometimes one indexes a type by another type, though the index has no semantic meaning. For example, in several F* developments that model mutable state, the a heap reference is just a natural number modeling its address in the heap. However, one might use the type let ref (a:Type) = nat to represent the type of a reference, even though the type a is not used in the definition. In such cases, it can be useful to mark the parameter as unused, to inform F*’s positivity checker that the type index is actually irrelevant. The snippet below shows an example:

irreducible
let ref ([@@@unused] a:Type) = nat
noeq

Here, we’ve marked the parameter of ref with the unused attribute. We’ve also marked ref as irreducible just to ensure for this example that F* does not silently unfold the definition of ref.
Now, knowing that the parameter of ref is unused, one can define types like linked_list a, where although linked_list a appears as an argument to the ref type, the positivity checker accepts it, since the parameter is unused. This is similar to the use of a strictly_positive annotation on a parameter.
However, with the unused attribute, one can go further: e.g., the type neg_unused shows that even a negative occurrence of the defined type is accepted, so long as it appears only as an instantiation of an unused parameter.