Linked Lists

In this chapter, we develop a linked list library. Along the way, we’ll see uses of recursive predicates, trades, and universal quantification.

Representing a Linked List

Let’s start by defining the type of a singly linked list:

noeq
type node (t:Type0) = {
    head : t;
    tail : llist t;
}

and node_ptr (t:Type0) = ref (node t)

//A nullable pointer to a node
and llist (t:Type0) = option (node_ptr t)

A node t contains a head:t and a tail:llist t, a nullable reference pointing to the rest of the list. Nullable references are represented by an option, as we saw before.

Next, we need a predicate to relate a linked list to a logical representation of the list, for use in specifications.

let rec is_list #t (x:llist t) (l:list t)
: Tot vprop (decreases l)
= match l with
  | [] -> pure (x == None)
  | head::tl -> 
    exists* (p:node_ptr t) (tail:llist t).
      pure (x == Some p) **
      pts_to p { head; tail } **
      is_list tail tl

The predicate is_list x l is a recursive predicate:

  • When l == [], the reference x must be null.

  • Otherwise, l == head :: tl, x must contains a valid reference p, where p points to { head; tail } and, recursively , we have is_list tail tl.

Boilerplate: Introducing and Eliminating is_list

We’ve seen recursive predicates in a previous chapter, and just as we did there, we need some helper ghost functions to work with is_list. We expect the Pulse checker will automate these boilerplate ghost lemmas in the future, but, for now, we are forced to write them by hand.

```pulse
ghost
fn elim_is_list_nil (#t:Type0) (x:llist t)
requires is_list x 'l ** pure ('l == [])
ensures pure (x == None)
{
   rewrite each 'l as Nil #t;
   unfold (is_list x [])
}
```

```pulse
ghost
fn intro_is_list_nil (#t:Type0) (x:llist t)
requires pure (x == None)
ensures is_list x []
{
    fold (is_list x []);
}
```

```pulse
ghost
fn elim_is_list_cons (#t:Type0) (x:llist t) (l:list t) (head:t) (tl:list t)
requires is_list x l ** pure (l == head::tl)
ensures (
  exists* (p:node_ptr t) (tail:llist t).
    pure (x == Some p) **
    pts_to p { head; tail } **
    is_list tail tl
)
{
  rewrite each l as (head::tl);
  unfold (is_list x (head::tl));
}
```
    

```pulse
ghost
fn intro_is_list_cons (#t:Type0) (x:llist t) (v:node_ptr t) (#node:node t) (#tl:list t)
requires
  pts_to v node **
  is_list node.tail tl **
  pure (x == Some v)
ensures
  is_list x (node.head::tl)
{
    rewrite (pts_to v node) as (pts_to v { head=node.head; tail=node.tail });
    fold (is_list x (node.head::tl));
}
```

Case analyzing a nullable pointer

When working with a linked list, the first thing we’ll do, typically, is to check whether a given x:llist t is null or not. However, the is_list x l predicate is defined by case analyzing l:list t rather than x:llist t, since that is makes it possible to write the predicate by recursing on the tail of l. So, below, we have a predicate is_list_cases x l that inverts is_list x l predicate based on whether or not x is null.

let is_list_cases #t (x:llist t) (l:list t)
: vprop 
= match x with
  | None -> pure (l == [])
  | Some v -> 
    exists* (n:node t) (tl:list t).
      pts_to v n **
      pure (l == n.head::tl) **
      is_list n.tail tl

Next, we define a ghost function to invert is_list into is_list_cases.

ghost
fn cases_of_is_list #t (x:llist t) (l:list t)
requires is_list x l
ensures is_list_cases x l
{
  match l {
    Nil -> { 
      unfold (is_list x []);
      fold (is_list_cases None l);
      rewrite each (None #(ref (node t))) as x;
    }
    Cons head tl -> { 
      unfold (is_list x (head::tl));
      with w tail. _;
      let v = Some?.v x;
      rewrite each w as v;
      rewrite each tail as (({ head; tail }).tail) in (is_list tail tl);
      fold (is_list_cases (Some v) l);
      rewrite each (Some #(ref (node t)) v) as x;
    }
  }
}

We also define two more ghost functions that package up the call to cases_of_is_list.

ghost
fn is_list_case_none (#t:Type) (x:llist t) (#l:list t)
requires is_list x l ** pure (x == None)
ensures is_list x l ** pure (l == [])
{
  cases_of_is_list x l;
  rewrite each x as (None #(ref (node t)));
  unfold (is_list_cases None l);
  fold (is_list x []);
}
ghost
fn is_list_case_some (#t:Type) (x:llist t) (v:node_ptr t) (#l:list t) 
requires is_list x l ** pure (x == Some v)
ensures
  exists* (node:node t) (tl:list t).
    pts_to v node **
    is_list node.tail tl **
    pure (l == node.head::tl)
{
  cases_of_is_list x l;
  rewrite each x as (Some v);
  unfold (is_list_cases (Some v) l);
}

Length, Recursively

With our helper functions in hand, let’s get to writing some real code, starting with a function to compute the length of an llist.

fn rec length (#t:Type0) (x:llist t)
requires is_list x 'l
returns n:nat
ensures is_list x 'l ** pure (n == List.Tot.length 'l)
{
  match x {
    None -> {
      is_list_case_none x;
      0
    }
    Some vl -> {
      is_list_case_some x vl;
      with _node _tl. _;
      let node = !vl;
      rewrite each _node as node;
      let n = length node.tail;
      intro_is_list_cons x vl;
      (1 + n)
    }
  }
}

The None case is simple.

Some points to note in the Some case:

  • We use with _node _tl. _ to “get our hands on” the existentially bound witnesses.

  • After reading let node = !vl, we need is_list node.tail _tl to make the recursive call. But the context contains is_list _node.tail _tl and node == _node. So, we need a rewrite.

  • We re-introduce the is_list predicate, and return 1 + n. While the intro_is_list_cons x vl is a ghost step and will be erased before execution, the addition is not—so, this function is not tail recursive.

Exercise 1

Write a tail-recursive version of length.

Exercise 2

Index the is_list predicate with a fractional permission. Write ghost functions to share and gather fractional is_list predicates.

Length, Iteratively, with Trades

What if we wanted to implement length using a while loop, as is more idiomatic in a language like C. It will take us a few steps to get there, and we’ll use the trade operator (@==>) to structure our proof.

Trade Tails

Our first step is to define tail_for_cons, a lemma stating that with permission on a node pointer (pts_to v n), we can build a trade transforming a permission on the tail into a permission for a cons cell starting at the given node.

ghost
fn tail_for_cons (#t:Type) (v:node_ptr t) (#n:node t) (tl:erased (list t))
requires 
  pts_to v n
ensures 
  (is_list n.tail tl @==> is_list (Some v) (n.head::tl))
{
  ghost
  fn aux ()
  requires 
    pts_to v n ** is_list n.tail tl
  ensures 
    is_list (Some v) (n.head::tl)
  {
    intro_is_list_cons (Some v) v
  };
  I.intro _ _ _ aux;
}

Tail of a list

Next, here’s a basic operation on a linked list: given a pointer to a cons cell, return a pointer to its tail. Here’s a small diagram:

x             tl
|             |
v             v
.---.---.     .---.---.
|   | --|---> |   | --|--> ...
.---.---.     .---.---.

We’re given a pointer x to the cons cell at the head of a list, and we want to return tl, the pointer to the next cell (or None, of x this is the end of the list). But, if we want to return a pointer to tl, we a permission accounting problem:

  • We cannot return permission to x to the caller, since then we would have two aliases pointing to the next cell in the list: the returned tl and x -> next.

  • But, we cannot consume the permission to x either, since we would like to return permission to x once the return tl goes out of scope.

The solution here is to use a trade. The type of tail below says that if x is a non-null pointer satisfying is_list x 'l, then tail returns a pointer y such that is_list y tl (where tl is the tail of 'l); and, one can trade is_list y tl to recover permission to is_list x 'l. The trade essentially says that you cannot have permission to is_list x 'l and is_list y tl at the same time, but once you give up permission on y, you can get back the original permission on x.

fn tail (#t:Type) (x:llist t)
requires is_list x 'l ** pure (Some? x)
returns y:llist t
ensures exists* tl.
    is_list y tl **
    (is_list y tl @==> is_list x 'l) **
    pure (Cons? 'l /\ tl == Cons?.tl 'l)
{ 
    let np = Some?.v x;
    is_list_case_some x np;
    with node tl. _;
    let nd = !np;
    rewrite each node as nd;
    tail_for_cons np tl;
    nd.tail
}

length_iter

The code below shows our iterative implementation of length. The basic idea is simple, though the proof takes a bit of doing. We initialize a current pointer cur to the head of the list; and ctr to 0. Then, while cur is not null, we move cur to the tail and increment ctr. Finally, we return the !ctr.

fn length_iter (#t:Type) (x: llist t)
requires is_list x 'l
returns n:nat
ensures is_list x 'l ** pure (n == List.Tot.length 'l)
{
  open I;
  let mut cur = x;
  let mut ctr = 0; 
  I.refl (is_list x 'l); //initialize the trade for the invariant
  while (
    let v = !cur;
    Some? v
  )
  invariant b.  
  exists* n ll suffix.
    pts_to ctr n **
    pts_to cur ll **
    is_list ll suffix **
    pure (n == List.Tot.length 'l - List.Tot.length suffix /\
          b == (Some? ll)) **
    (is_list ll suffix @==> is_list x 'l)
  {
    with _n _ll _suffix. _; //bind existential variables in the invariant
    let n = !ctr;
    let ll = !cur;
    rewrite each _ll as ll; //again, rewrite the context to use ll instead of _ll
    let next = tail ll;     //tail gives us back a trade
    with tl. _;
    I.trans (is_list next tl) _ _; //extend the trade, transitively
    cur := next;
    ctr := n + 1;
  };
  with _n ll _sfx. _;
  is_list_case_none ll; //this tells us that suffix=[]; so n == List.Tot.length 'l
  I.elim _ _;           //regain ownership of x, giving up ll
  let n = !ctr;
  n
}

Now, for the proof. The main part is the loop invariant, which says:

  • the current value of the counter is n;

  • cur holds a list pointer, ll where ll contains the list represented by suffix;

  • n is the the length of the prefix of the list traversed so far;

  • the loop continues as long as b is true, i.e., the list pointer l is not None;

  • and, the key bit: you can trade ownership on ll back for ownership on the original list x.

Some parts of this could be simplified, e.g., to avoid some of the rewrites.

One way to understand how trades have helped here is to compare length_iter to the recursive function length. In length, after each recursive call returns, we called a ghost function to repackage permission on the cons cell after taking out permission on the tail. The recursive function call stack kept track of all these pieces of ghost code that had to be executed. In the iterative version, we use the trade to package up all the ghost functions that need to be run, rather than using the call stack. When the loop terminates, we use I.elim to run all that ghost code at once.

Of course, the recursive length is much simpler in this case, but this pattern of using trades to package up ghost functions is quite broadly useful.

Append, Recursively

Here’s another recursive function on linked lists: append concatenates y on to the end of x.

It’s fairly straightforward: we recurse until we reach the last node of x (i.e., the tail field is None; and we set that field to point to y.

fn rec append (#t:Type0) (x y:llist t)
requires is_list x 'l1 ** is_list y 'l2 ** pure (Some? x)
ensures is_list x ('l1 @ 'l2)
{
  let np = Some?.v x;
  is_list_case_some x np;
  with _node _tl. _;
  let node = !np;
  rewrite each _node as node;
  match node.tail {
    None -> {
      is_list_case_none node.tail;
      elim_is_list_nil node.tail;
      np := { node with tail = y };
      rewrite each y as ({ node with tail = y }).tail in (is_list y 'l2);
      intro_is_list_cons x np; 
    }
    Some _ -> {
      append node.tail y;
      intro_is_list_cons x np;
    }
  }
}

The code is tail recursive in the Some _ case, but notice that we have a ghost function call after the recursive call. Like we did for length, can we implement an iterative version of append, factoring this ghost code on the stack into a trade?

Append, Iteratively

Let’s start by defining a more general version of the tail function from before. In comparison, the postcondition of tail_alt uses a universal quantifier to say, roughly, that whatever list tl' the returns y points to, it can be traded for a pointer to x that cons’s on to tl. Our previous function tail can be easily recovered by instantiating tl' to tl.

fn tail_alt (#t:Type) (x:llist t)
requires is_list x 'l ** pure (Some? x)
returns y:llist t
ensures exists* hd tl.
  is_list y tl **
  (forall* tl'. is_list y tl' @==> is_list x (hd::tl')) **
  pure ('l == hd::tl)
{ 
  let np = Some?.v x;
  is_list_case_some x np;
  with _node _tl. _;
  let node = !np;
  rewrite each _node as node;
  ghost
  fn aux (tl':list t)
    requires pts_to np node ** is_list node.tail tl'
    ensures is_list x (node.head::tl')
  {
    intro_is_list_cons x np;
  };
  FA.intro_forall_imp _ _ _ aux;
  node.tail
}

We’ll use these quantified trades in our invariant of append_iter, shown below. The main idea of the implementation is to use a while loop to traverse to the last element of the first list x; and then to set y as the next pointer of this last element.

fn append_iter (#t:Type) (x y:llist t)
requires is_list x 'l1 ** is_list y 'l2 ** pure (Some? x)
ensures is_list x ('l1 @ 'l2)
{
  let mut cur = x;
  //the base case, set up the initial invariant
  FA.intro emp (fun l -> I.refl (is_list x l));
  rewrite (forall* l. is_list x l @==> is_list x l)
      as  (forall* l. is_list x l @==> is_list x ([]@l));
  while (
    with _b ll pfx sfx. _;
    let l = !cur;
    rewrite each ll as l;
    let b = is_last_cell l; //check if we are at the last cell
    if b 
    { 
      false //yes, break out of the loop
    }
    else 
    {
      let next = tail_alt l;
      with hd tl. _;
      (* this is the key induction step *)
      FA.trans_compose 
          (is_list next) (is_list l) (is_list x)
          (fun tl -> hd :: tl)
          (fun tl -> pfx @ tl);
      //Use F* sugar for classical connectives to introduce a property
      //needed for the next rewrite
      (introduce forall tl. pfx @ (hd :: tl) == (pfx @ [hd]) @ tl
       with List.Tot.Properties.append_assoc pfx [hd] tl);
      rewrite (forall* tl. is_list next tl @==> is_list x (pfx@(hd::tl)))
           as (forall* tl. is_list next tl @==> is_list x ((pfx@[hd])@tl));
      cur := next;
      non_empty_list next; //need to prove that Some? next, for the invariant
      true
    }
  )
  invariant b.
  exists* ll pfx sfx.
    pts_to cur ll **   //cur holds the pointer to the current head of the traversal, ll
    is_list ll sfx **  //ll points to some suffix of the original list, since `pfx@sfx = 'l1` below
    //the main bit: whatever ll points to `sfx'`, trade it for x pointing to the concatenation ``pfx @ sfx'`` 
    (forall* sfx'. is_list ll sfx' @==> is_list x (pfx @ sfx')) ** 
    pure (
      (b==false ==> List.Tot.length sfx == 1) /\ //the loop ends when we reach the last cell
      pfx@sfx == 'l1 /\ //sfx is really the suffix
      Some? ll          //and the current list is always non-null
    )
  { () };
  with ll pfx sfx. _;
  let last = !cur;
  rewrite each ll as last;
  append_at_last_cell last y;
  FA.elim_forall_imp (is_list last) (fun sfx' -> is_list x (pfx @ sfx')) (sfx@'l2);
  List.Tot.Properties.append_assoc pfx sfx 'l2;
  ()
}

There are few interesting points to note.

  • The main part is the quantified trade in the invariant, which, as we traverse the list, encapsulates the ghost code that we need to run at the end to restore permission to the initial list pointer x.

  • The library function, FA.trans_compose has the following signature:

    ghost
    fn trans_compose (#a #b #c:Type0)
                     (p: a -> vprop)
                     (q: b -> vprop)
                     (r: c -> vprop)
                     (f: a -> GTot b)
                     (g: b -> GTot c)
    requires
      (forall* x. p x @==> q (f x)) **
      (forall* x. q x @==> r (g x))
    ensures
      forall* x. p x @==> r (g (f x))
    

    We use it in the key induction step as we move one step down the list—similar to what we had in length_iter, but this time with a quantifier.

  • Illustrating again that Pulse is a superset of pure F*, we make use of a bit of F* sugar in the introduce forall to prove a property needed for a Pulse rewrite.

  • Finally, at the end of the loop, we use FA.elim_forall_imp to restore permission on x, now pointing to a concatenated list, effectively running all the ghost code we accumulated as we traversed the list.

Perhaps the lesson from all this is that recursive programs are much easier to write and prove correct that iterative ones? That’s one takeaway. But, hopefully, you’ve seen how trades and quantifiers work and can be useful in some proofs—of course, we’ll use them not just for rewriting recursive as iterative code.

Exercise 3

Write a function to insert an element in a list and a specific position.

Exercise 4

Write a function to reverse a list.