Ghost Computations
Throughout the chapters on pure F*, we made routine use of the
Lemmas
and ghost functions to prove properties of our
programs. Lemmas, you will recall, are pure, total functions that
always return unit
, i.e., they have no computational significance
and are erased by the F* compiler. F* ghost functions are also pure, total functions, except that they are
allowed to inspect erased values in a controlled way—they too are
erased by the F* compiler.
As we’ve seen already, F* lemmas and ghost functions can be directly
used in Pulse code. But, these are only useful for describing
properties about the pure values in scope. Often, in Pulse, one needs
to write lemmas that speak about the state, manipulate vprops
,
etc. For this purpose, Pulse provides its own notion of ghost
computations (think of these as the analog of F* lemmas and ghost
functions, except they are specified using vprops
); and ghost
state (think of these as the analog of F* erased types, except ghost
state is mutable, though still computationally irrelevant). Ghost
computations are used everywhere in Pulse—we’ve already seen a few
example. Ghost state is especially useful in proofs of concurrent
programs.
Ghost Functions
Here’s a Pulse function that fails to check, with the error message below.
[@@expect_failure]
```pulse
fn incr_erased_non_ghost (x:erased int)
requires emp
returns y:int
ensures emp ** pure (y == x + 1)
{
let x = reveal x;
(x + 1)
}
```
Cannot bind ghost expression reveal x with ST computation
We should expect this to fail, since the program claims to be able to
compute an integer y
by incrementing an erased integer x
—the
x:erased int
doesn’t exist at runtime, so this program cannot be
compiled.
But, if we tag the function with the ghost
qualifier, then this
works:
ghost
fn incr_erased (x:erased int)
requires emp
returns y:int
ensures emp ** pure (y == x + 1)
{
let x = reveal x;
(x + 1)
}
The ghost
qualifier indicates the the Pulse checker that the
function is to be erased at runtime, so ghost
functions are
allowed to make use of F* functions with GTot
effect, like
FStar.Ghost.reveal
.
However, for this to be sound, no compilable code is allowed to depend
on the return value of a ghost
function. So, the following code
fails with the error below:
[@@expect_failure]
```pulse
fn use_incr_erased (x:erased int)
requires emp
returns y:int
ensures emp ** pure (y == x + 1)
{
incr_erased x;
}
```
Expected a term with a non-informative (e.g., erased) type; got int
That is, when calling a ghost
function from a non-ghost context,
the return type of the ghost function must be non-informative, e.g,
erased
, or unit
etc. The class of non-informative types and
the rules for allowing F* ghost computations to be used in total
contexts is described here, and the same
rules apply in Pulse.
To use of incr_erased
in non-ghost contexts, we have to erase its
result. There are a few ways of doing this.
Here’s a verbose but explicit way, where we define a nested ghost
function to wrap the call to incr_erased
.
fn use_incr_erased (x:erased int)
requires emp
returns y:erased int
ensures emp ** pure (y == x + 1)
{
ghost
fn wrap (x:erased int)
requires emp
returns y:erased int
ensures emp ** pure (y == x + 1)
{
let y = incr_erased x;
hide y
};
wrap x
}
The library also contains Pulse.Lib.Pervasives.call_ghost
that is
a higher-order combinator to erase the result of a ghost call.
fn use_incr_erased_alt (x:erased int)
requires emp
returns y:erased int
ensures emp ** pure (y == x + 1)
{
call_ghost incr_erased x;
}
The call_ghost
combinator can be used with ghost functions of
different arities, though it requires the applications to be curried
in the following way.
Suppose we have a binary ghost function, like add_erased
:
ghost
fn add_erased (x y:erased int)
requires emp
returns z:int
ensures emp ** pure (z == x + y)
{
let x = reveal x;
let y = reveal y;
(x + y)
}
To call it in a non-ghost context, one can do the following:
fn use_add_erased (x y:erased int)
requires emp
returns z:erased int
ensures emp ** pure (z == x + y)
{
call_ghost (add_erased x) y
}
That said, since ghost
functions must have non-informative return
types to be usable in non-ghost contexts, it’s usually best to define
them that way to start with, rather than having to wrap them at each
call site, as shown below:
ghost
fn add_erased_erased (x y:erased int)
requires emp
returns z:erased int
ensures emp ** pure (z == x + y)
{
let x = reveal x;
let y = reveal y;
hide (x + y)
}
Some Primitive Ghost Functions
Pulse ghost functions with emp
or pure _
pre and
postconditions are not that interesting—such functions can usually
be written with regular F* ghost functions.
Ghost functions are often used as proof steps to prove equivalences
among vprops
. We saw a few examples of ghost functions
before—they are ghost since their
implementations are compositions of ghost
functions from the Pulse
library.
The rewrite
primitive that we saw previously is in fact a defined function in the Pulse
library. Its signature looks like this:
ghost
fn __rewrite (p q:vprop)
requires p ** pure (p == q)
ensures q
Many of the other primitives like fold
, unfold
, etc. are
defined in terms of rewrite
and are ghost
computations.
Other primitives like introduce exists*
are also implemented in
terms of library ghost
functions, with signatures like the one
below:
ghost
fn intro_exists (#a:Type0) (p:a -> vprop) (x:erased a)
requires p x
ensures exists* x. p x
Recursive Predicates and Ghost Lemmas
We previously saw how to define custom predicates, e.g., for representation predicates on data
structures. Since a vprop
is just a regular type, one can also
define vprops
by recursion in F*. Working with these recursive
predicates in Pulse usually involves writing recursive ghost functions
as lemmas. We’ll look at a simple example of this here and revisit in
subsequent chapters as look at programming unbounded structures, like
linked lists.
Say you have a list of references and want to describe that they all
contain integers whose value is at most n
. The recursive predicate
all_at_most l n
does just that:
let rec all_at_most (l:list (ref nat)) (n:nat)
: vprop
= match l with
| [] -> emp
| hd::tl -> exists* (i:nat). pts_to hd i ** pure (i <= n) ** all_at_most tl n
As we did when working with nullable references, it’s useful to define a few helper ghost functions to introduce and eliminate this predicate, for each of its cases.
Recursive Ghost Lemmas
Pulse allows writing recursive ghost functions as lemmas for use in Pulse code. Like F* lemmas, recursive ghost functions must be proven to terminate on all inputs—otherwise, they would not be sound.
To see this in action, let’s write a ghost function to prove that
all_at_most l n
can be weakened to all_at_most l m
when n <=
m
.
ghost
fn rec weaken_at_most (l:list (ref nat)) (n:nat) (m:nat)
requires all_at_most l n ** pure (n <= m)
ensures all_at_most l m
decreases l
{
match l {
Nil -> {
unfold (all_at_most [] n);
fold (all_at_most [] n);
}
Cons hd tl -> {
unfold (all_at_most (hd::tl) n);
weaken_at_most tl n m;
fold (all_at_most (hd::tl) m);
}
}
}
A few points to note:
Recursive functions in Pulse are defined using
fn rec
.Ghost recursive functions must also have a
decreases
annotation—unlike in F*, Pulse does not yet attempt to infer a default decreases annotation. In this case, we are recursing on the listl
.List patterns in Pulse do not (yet) have the same syntactic sugar as in F*, i.e., you cannot write
[]
andhd::tl
as patterns.The proof itself is fairly straightforward:
In the
Nil
case, we eliminate theall_at_most
predicate atn
and introduce it atm
.In the
Cons
case, we eliminateall_at_most l n
,` use the induction hypothesis to weaken theall_at_most
predicate on thetl
; and then introduce it again, packaging it with assumption onhd
.
Mutable Ghost References
The underlying logic that Pulse is based on actually supports a very
general form of ghost state based on partial commutative monoids
(PCMs). Users can define their own ghost state abstractions in F*
using PCMs and use these in Pulse programs. The library
Pulse.Lib.GhostReference
provides the simplest and most common
form of ghost state: references to erased values with a
fractional-permission-based ownership discipline.
We’ll use the module abbreviation module GR =
Pulse.Lib.GhostReference
in what follows. The library is very
similar to Pulse.Lib.Reference
, in that it provides:
GR.ref a
: The main type of ghost references.GR.ref
is an erasable type and is hence considered non-informative.
GR.pts_to (#a:Type0) (r:GR.ref a) (#p:perm) (v:a) : vprop
is the main predicate provided by the library. Similar to the regularpts_to
, the permission index defaults to1.0R
.Unlike
ref a
(and more likebox a
), ghost referencesGR.ref a
are not lexically scoped: they are allocated usingGR.alloc
and freed usingGR.free
. Of course, neither allocation nor free’ing has any runtime cost—these are just ghost operations.Reading a ghost reference using
!r
returns anerased a
, whenr:GR.ref a
. Likewise, to updater
, it is enough to provide a new valuev:erased a
.Operations to
share
andgather
ghost references work just as withref
.
A somewhat contrived example
Most examples that require ghost state usually involve stating interesting invariants between multiple threads, or sometimes in a sequential setting to correlate knowledge among different components. We’ll see examples of that in a later chapter. For now, here’s a small example that gives a flavor of how ghost state can be used.
Suppose we want to give a function read/write access to a reference, but want to ensure that before it returns, it resets the value of the reference to its original value. The simplest way to do that would be to give the function the following signature:
fn uses_but_resets #a (x:ref a)
requires pts_to x 'v
ensures pts_to x 'v
Here’s another way to do it, this time with ghost references.
First, we define a predicate correlated
that holds full permission
to a reference and half permission to a ghost reference, forcing them
to hold the same value.
let correlated #a (x:ref a) (y:GR.ref a) (v:a)=
pts_to x v ** GR.pts_to y #0.5R v
Now, here’s the signature of a function use_temp
: at first glance,
from its signature alone, one might think that the witness v0
bound in the precondition is unrelated to the v1
bound in the
postcondition.
fn use_temp (x:ref int) (y:GR.ref int)
requires exists* v0. correlated x y v0
ensures exists* v1. correlated x y v1
But, use_temp
only has half-permission to the ghost reference and
cannot mutate it. So, although it can mutate the reference itself, in
order to return its postcondition, it must reset the reference to its
initial value.
{
unfold correlated;
let v = !x;
x := 17; //temporarily mutate x, give to to another function to use with full perm
x := v; //but, we're forced to set it back to its original value
fold correlated;
}
This property can be exploited by a caller to pass a reference to
use_temp
and be assured that the value is unchanged when it
returns.
fn use_correlated ()
requires emp
ensures emp
{
let mut x = 17;
let g = GR.alloc 17;
GR.share g;
fold correlated; // GR.pts_to g #0.5R 17 ** correlated x g 17
use_temp x g; // GR.pts_to g #0.5R 17 ** correlated x g ?v1
unfold correlated; // GR.pts_to g #0.5R 17 ** GR.pts_to g #0.5R ?v1 ** pts_to x ?v1
GR.gather g; //this is the crucial step
// GT.pts_to g 17 ** pure (?v1 == 17) ** pts_to x ?v1
assert (pts_to x 17);
GR.free g;
}