Inductive types and pattern matching

In this chapter, you’ll learn how to define new types in F*. These types are called inductive types, or, more informally, datatypes. We’ll also learn how to define functions over these inductive types by pattern matching and to prove properties about them.

Note

We’ll only cover the most basic forms of inductive types here. In particular, the types we show here will not make use of indexing or any other form of dependent types—we’ll leave that for a later chapter.

Enumerations

We’ve seen that unit is the type with just one element () and that bool is the type with two elements, true and false.

You can define your own types with an enumeration of elements, like so.

type three =
  | One_of_three : three
  | Two_of_three : three
  | Three_of_three : three

This introduces a new type three : Type, and three distinct constants One_of_three : three, Two_of_three : three, Three_of_three : three. These constants are also called “constructors” or “data constructors”. The name of a constructor must begin with an uppercase letter.

Note

In this case, it may seem redundant to have to write the type of each constructor repeatedly—of course they’re all just constructors of the type three. In this case, F* will allow you to just write

type three =
  | One_of_three
  | Two_of_three
  | Three_of_three

As we start to use indexed types, each constructor can build a different instance of the defined type, so it will be important to have a way to specify the result type of each constructor. For uniformity, throughout this book, we’ll always annotate the types of constructors, even when not strictly necessary.

F* can prove that they are distinct and that these are the only terms of type three.

let distinct = assert (One_of_three <> Two_of_three /\
                       Two_of_three <> Three_of_three /\
                       Three_of_three <> One_of_three)

let exhaustive (x:three) = assert (x = One_of_three \/
                                   x = Two_of_three \/
                                   x = Three_of_three)

To write functions that case analyze these new types, one uses the match construct. The syntax of match in F* is very similar to OCaml or F#. We’ll assume that you’re familiar with its basics. As we go, we’ll learn about more advanced ways to use match.

Here are some examples.

let is_one (x:three)
  : bool
  = match x with
    | One_of_three -> true
    | _ -> false

let is_two (x:three)
  : bool
  = match x with
    | Two_of_three -> true
    | _ -> false

let is_three (x:three)
  : bool
  = match x with
    | Three_of_three -> true
    | _ -> false

Discriminators

These functions test whether x : three matches a given constructor, returning a bool in each case. Since it’s so common to write functions that test whether a value of an inductive type matches one of its constructors, F* automatically generates these functions for you. For example, instead of writing

let three_as_int (x:three)
  : int
  = if is_one x then 1
    else if is_two x then 2
    else 3

One can write:

let three_as_int' (x:three)
  : int
  = if One_of_three? x then 1
    else if Two_of_three? x then 2
    else 3

In other words, for every constructor T of an inductive type t, F* generates a function named T? (called a “discriminator”) which tests if a v:t matches T.

Exhaustiveness

Of course, an even more direct way of writing three_as_int is

let three_as_int'' (x:three)
  : int
  = match x with
    | One_of_three -> 1
    | Two_of_three -> 2
    | Three_of_three -> 3

Every time you use a match, F* will make sure to prove that you are handling all possible cases. Try omitting one of the cases in three_as_int above and see what happens.

Exhaustiveness checking in F* is a semantic check and can use the SMT solver to prove that all cases are handled appropriately. For example, you can write this:

let only_two_as_int (x:three { not (Three_of_three? x) })
  : int
  = match x with
    | One_of_three -> 1
    | Two_of_three -> 2

The refinement on the argument allows F* to prove that the Three_of_three case in the pattern is not required, since that branch would be unreachable anyway.

Tuples

The next step from enumerations is to define composite types, e.g., types that are made from pairs, triples, quadruples, etc. of other types. Here’s how

type tup2 (a:Type)  (b:Type) =
  | Tup2 : fst:a -> snd:b -> tup2 a b

type tup3 a b c =
  | Tup3 : fst:a -> snd:b -> thd:c -> tup3 a b c

The type definition for tup2 a b states that for any types a : Type and b : Type, Tup2 : a -> b -> tup2 a b. That is, Tup2 is a constructor of tup2, such that given x:a and y:b, Tup2 x y : tup2 a b.

The other types tup3 and tup4 are similar—the type annotations on the bound variables can be inferred.

These are inductive types with just one case—so the discriminators Tup2?, Tup3?, and Tup4? aren’t particularly useful. But, we need a way to extract, or project, the components of a tuple. You can do that with a match.

let tup2_fst #a #b (x:tup2 a b)
  : a
  = match x with
    | Tup2 fst _ -> fst

let tup2_snd #a #b (x:tup2 a b)
  : b
  = match x with
    | Tup2 _ snd -> snd

let tup3_fst #a #b #c (x:tup3 a b c)
  : a
  = match x with
    | Tup3 fst _ _ -> fst

let tup3_snd #a #b #c (x:tup3 a b c)
  : b
  = match x with
    | Tup3 _ snd _ -> snd

let tup3_third #a #b #c (x:tup3 a b c)
  : c
  = match x with
    | Tup3 _ _ thd -> thd

Projectors

These projectors are common enough that F* auto-generates them for you. In particular, for any data constructor T of type x1:t1 -> ... -> xn:tn -> t, F* auto-generates the following function:

  • T?.xi : y:t{T? y} -> ti

That is, T?.xi is a function which when applied to a y:t in case T? y, returns the xi component of T x1 ... xn.

In the case of our tup2 and tup3 types, we have

  • Tup2?.fst, Tup2?.snd

  • Tup3?.fst, Tup3?.snd, Tup3?.thd

Syntax for tuples

Since tuples are so common, the module FStar.Pervasives.Native.fst defines tuple types up to arity 14. So, you shouldn’t have to define tup2 and tup3 etc. by yourself.

The tuple types in FStar.Pervasives.Native come with syntactic sugar.

  • You can write a & b instead of the tup2 a b; a & b & c instead of tup3 a b c; and so on, up to arity 14.

  • You can write x, y instead of Tup2 x y; x, y, z instead of Tup3 x y z; an so on, up to arity 14.

  • You can write x._1, x._2, x._3, etc. to project the field i of a tuple whose arity is at least i.

That said, if you’re using tuples beyond arity 4 or 5, it’s probably a good idea to define a record, as we’ll see next—since it can be hard to remember what the components of a large tuple represent.

Records

A record is just a tuple with user-chosen names for its fields and with special syntax for constructing then and projecting their fields. Here’s an example.

type point3D = { x:int; y:int; z:int}

let origin = { y=0; x=0; z=0 }

let dot (p0 p1:point3D) = p0.x * p1.x + p0.y * p1.y + p0.z * p1.z

let translate_X (p:point3D) (shift:int) = { p with x = p.x + shift}

let is_origin (x:point3D) =
  match x with
  | {z=0;y=0;x=0} -> true
  | _ -> false
  • A record type is defined using curly braces {}. See type point3D

  • A record value is also constructed using curly braces, with an assignment for each field of the record. The fields need not be given in order. See origin.

  • To access the fields of a record, you can use the dot notation p.x; See dot, which computes a dot product using dot notation.

  • Records also support the with notation to construct a new record whose fields are the same as the old record, except for those fields mentioned after the with. That is, translate_X p shift returns { x = p.x + shift; y = p.y; z = p.z}.

  • Records can also be used to pattern match a value. For example, in is_origin, we match the fields of the record (in any order) against some patterns.

Options

Another common type from F*’s standard library is the option type, which is useful to represent a possibly missing value.

Consider implementing a function to divide x / y, for two integers x and y. This function cannot be defined when y is zero, but it can be defined partially, by excluding the case where y = 0, as shown below. (Of course, one can also refine the domain of the function to forbid y = 0, but we’re just trying to illustrate the option type here.)

let try_divide (x y:int)
  : option int
  = if y = 0 then None else Some (x / y)

let divide (x:int) (y:int{y<>0}) = x / y

Like most other functional languages, F* does not have a null value. Whenever a value may possibly be null, one typically uses the option type, using None to signify null and Some v for the non-null case.

Unions, or the either type

FStar.Pervasives also defines the either type, shown below.

type either a b =
  | Inl : v: a -> either a b
  | Inr : v: b -> either a b

The type either a b represents a value that could either be Inl v with v:a, or Inr v with v:b. That is, either a b is a tagged union of the a and b. It’s easy to write functions to analyze the tag Inl (meaning it’s “in the left case”) or Inr (“in the right case”) and compute with the underlying values. Here’s an example:

let same_case #a #b #c #d (x:either a b) (y:either c d)
  : bool
  = match x, y with
    | Inl _, Inl _
    | Inr _, Inr _ -> true
    | _ -> false

let sum (x:either bool int) (y:either bool int{same_case x y})
  : z:either bool int{ Inl? z <==> Inl? x}
  = match x, y with
    | Inl xl, Inl yl -> Inl (xl || yl)
    | Inr xr, Inr yr -> Inr (xr + yr)

The same_case x y function decides if the two unions are both simultaneously in the left or right case.

Then, in sum x y, with a refinement that x and y are in the same case, we can handle just two cases (when they are both in left, or both in right) and F* can prove that the case analysis is exhaustive. In the left case, the underlying values are boolean, so we combine them with ||; in the right case, the underlying values are integers, so we combine them with +; and return them with the appropriate tag. The type of the result z:either bool int{ Inl? z <==> Inl? x} shows that the result has the same case as x (and hence also y). We could have written the result type as z:either bool int { same_case z x }.

Lists

All the types we’ve see far have been inductive only in a degenerate sense—the constructors do not refer to the types they construct. Now, for our first truly inductive type, a list.

Here’s the definition of list from Prims:

type list a =
  | Nil  : list a
  | Cons : hd:a -> tl:list a -> list a

The list type is available implicitly in all F* programs and we have special (but standard) syntax for the list constructors:

  • [] is Nil

  • [v1; ...; vn] is Cons v1 ... (Cons vn Nil)

  • hd :: tl is Cons hd tl.

You can always just write out the constructors like Nil and Cons explicitly, if you find that useful (e.g., to partially apply Cons hd : list a -> list a).

Length of a list

Let’s write some simple functions on lists, starting with computing the length of a list.

let rec length #a (l:list a)
  : nat
  = match l with
    | [] -> 0
    | _ :: tl -> 1 + length tl

The length function is recursive and implicitly polymorphic in a type a. For any list l : list a, length l returns a nat. The definition pattern matches on the list and calls length recursively on the tail of list, until the [] case is reached.

Exercises

Click here for the exercise file.

Here’s the definition of append, a function that concatenates two lists. Can you give it a type that proves it always returns a list whose length is the sum of the lengths of its arguments?

let rec append l1 l2
  = match l1 with
    | [] -> l2
    | hd :: tl -> hd :: append tl l2

Answer

val append (#a:Type) (l1 l2:list a)
  : l:list a { length l = length l1 + length l2 }