# Parallel Increment

In this chapter, we look at an example first studied by Susan Owicki
and David Gries, in a classic paper titled Verifying properties of
parallel programs: an axiomatic approach. The problem involves
proving that a program that atomically increments an integer reference
`r`

twice in parallel correctly adds 2 to `r`

. There are many ways
to do this—Owicki & Gries’ approach, adapted to a modern separation
logic, involves the use of additional ghost state and offers a modular
way to structure the proof.

While this is a very simple program, it captures the essence of some of the reasoning challenges posed by concurrency: two threads interact with a shared resource, contributing to it in an undetermined order, and one aims to reason about the overall behavior, ideally without resorting to directly analyzing each of the possible interleavings.

## Parallel Blocks

Pulse provides a few primitives for creating new threads. The most basic one is parallel composition, as shown below:

```
parallel
requires p1 and p2
ensures q1 and q2
{ e1 }
{ e2 }
```

The typing rule for this construct requires:

```
val e1 : stt a p1 q1
val e2 : stt b p2 q2
```

and the `parallel`

block then has the type:

```
stt (a & b) (p1 ** p2) (fun (x, y) -> q1 x ** q2 y)
```

In other words, if the current context can be split into separate
parts `p1`

and `p2`

satisfying the preconditions of `e1`

and
`e2`

, then the parallel block executes `e1`

and `e2`

in
parallel, waits for both of them to finish, and if they both do,
returns their results as a pair, with their postconditions on each
component.

Using `parallel`

, one can easily program the `par`

combinator
below:

```
fn par (#pf #pg #qf #qg:_)
(f: unit -> stt unit pf (fun _ -> qf))
(g: unit -> stt unit pg (fun _ -> qg))
requires pf ** pg
ensures qf ** qg
{
parallel
requires pf and pg
ensures qf and qg
{ f () }
{ g () };
()
}
```

As we saw in the introduction to Pulse, it’s easy to increment two separate references in parallel:

```
fn par_incr (x y:ref int)
requires pts_to x 'i ** pts_to y 'j
ensures pts_to x ('i + 1) ** pts_to y ('j + 1)
{
par (fun _ -> incr x)
(fun _ -> incr y)
}
```

But, what if we wanted to increment the same reference in two separate threads? That is, we wanted to program something like this:

```
fn add2 (x:ref int)
requires pts_to x 'i
ensures pts_to x ('i + 2)
{
par (fun _ -> incr x)
(fun _ -> incr x)
}
```

But, this program doesn’t check. The problem is that we have only a
single `pts_to x 'i`

, and we can’t split it to share among the
threads, since both threads require full permission to `x`

to update
it.

Further, for the program to correctly add `2`

to `x`

, each
increment operations must take place atomically, e.g., if the two
fragments below were executed in parallel, then they may both read the
initial value of `x`

first, bind it to ``v`

, and then both update
it to `v + 1`

.

```
let v = !x; || let v = !x;
x := v + 1; || x := v + 1;
```

Worse, without any synchronization, on modern processors with weak memory models, this program could exhibit a variety of other behaviors.

To enforce synchronization, we could use a lock, e.g., shown below:

```
fn attempt (x:ref int)
requires pts_to x 'i
ensures emp
{
let l = L.new_lock (exists* v. pts_to x v);
fn incr ()
requires emp
ensures emp
{
L.acquire l;
let v = !x;
x := v + 1;
L.release l
};
par incr incr
}
```

This program is type correct and free from data races. But, since the
lock holds the entire permission on `x`

, there’s no way to give this
function a meaningful postcondition, aside from `emp`

.

## A First Take, with Locks

Owicki and Gries’ idea was to augment the program with auxiliary variables, or ghost state, that are purely for specification purposes. Each thread gets its own piece of ghost state, and accounts for how much that thread has contributed to the current value of shared variable. Let’s see how this works in Pulse.

The main idea is captured by `lock_inv`

, the type of the predicate
protected by the lock:

```
let contributions (left right: GR.ref int) (i v:int)=
exists* (vl vr:int).
GR.pts_to left #one_half vl **
GR.pts_to right #one_half vr **
pure (v == i + vl + vr)
let lock_inv (x:ref int) (init:int) (left right:GR.ref int) =
exists* v.
pts_to x v **
contributions left right init v
```

Our strawman `lock`

in the `attempt`

shown before had type ```
lock
(exists* v. pts_to x v)
```

. This time, we add a conjunct that refines
the value `v`

, i.e., the predicate `contributions l r init v`

says
that the current value of `x`

protected by the lock (i.e., `v`

) is
equal to `init + vl + vr`

, where `init`

is the initial value of
`x`

; `vl`

is the value of the ghost state owned by the “left”
thread; and `vr`

is the value of the ghost state owned by the
“right” thread. In other words, the predicate ```
contributions l r init
v
```

shows that `v`

always reflects the values of the contributions
made by each thread.

Note, however, the `contributions`

predicate only holds
half-permission on the left and right ghost variables. The other half
permission is held outside the lock and allows us to keep track of
each threads contribution in our specifications.

Here’s the code for the left thread, `incr_left`

:

```
fn incr_left (x:ref int)
(#left:GR.ref int)
(#right:GR.ref int)
(#i:erased int)
(lock:L.lock (lock_inv x i left right))
requires GR.pts_to left #one_half 'vl
ensures GR.pts_to left #one_half ('vl + 1)
{
L.acquire lock;
unfold lock_inv;
unfold contributions;
let v = !x;
x := v + 1;
GR.gather left;
GR.write left ('vl + 1);
GR.share left;
fold (contributions left right i (v + 1));
fold lock_inv;
L.release lock
}
```

Its arguments include

`x`

and the`lock`

, but also both pieces of ghost state,`left`

and`right`

, and an erased value`i`

for the initial value of`x`

.Its precondition holds half permission on the ghost reference

`left`

Its postcondition returns half-permission to

`left`

, but proves that it was incremented, i.e., the contribution of the left thread to the value of`x`

increased by`1`

.

Notice that even though we only had half permission to `left`

, the
specifications says we have updated `left`

—that’s because we can
get the other half permission we need by acquiring the lock.

We acquire the lock and update increment the value stored in

`x`

.And then we follow the increment with several ghost steps:

Gain full permission on

`left`

by combining the half permission from the precondition with the half permission gained from the lock.Increment

`left`

.Share it again, returning half permission to the lock when we release it.

Finally, we

`GR.pts_to left #one_half (`vl + 1)`

left over to return to the caller in the postcondition.

The code of the right thread is symmetrical, but in this, our first take, we have to essentially repeat the code—we’ll see how to remedy this shortly.

```
fn incr_right (x:ref int)
(#left:GR.ref int)
(#right:GR.ref int)
(#i:erased int)
(lock:L.lock (lock_inv x i left right))
requires GR.pts_to right #one_half 'vl
ensures GR.pts_to right #one_half ('vl + 1)
{
L.acquire lock;
unfold lock_inv;
unfold contributions;
let v = !x;
x := v + 1;
GR.gather right;
GR.write right ('vl + 1);
GR.share right;
fold (contributions left right i (v + 1));
fold (lock_inv x i left right);
L.release lock
}
```

Finally, we can implement `add2`

with the specification we want:

```
fn add2 (x:ref int)
requires pts_to x 'i
ensures pts_to x ('i + 2)
{
let left = GR.alloc 0;
let right = GR.alloc 0;
GR.share left;
GR.share right;
fold (contributions left right 'i 'i);
fold (lock_inv x 'i left right);
let lock = L.new_lock (lock_inv x 'i left right);
par (fun _ -> incr_left x lock)
(fun _ -> incr_right x lock);
L.acquire lock;
unfold lock_inv;
unfold contributions;
GR.gather left;
GR.gather right;
GR.free left;
GR.free right;
}
```

We allocate

`left`

and`right`

ghost references, initializing them to`0`

.Then we split them, putting half permission to both in the lock, retaining the other half.

Then spawn two threads for

`incr_left`

and`incr_right`

, and get as a postcondition that contributions of both threads and increased by one each.Finally, we acquire the lock, get

`pts_to x v`

, for some`v`

, and`contributions left right i v`

. Once we gather up the permission on`left`

and`right`

, and now the`contributions left right i v`

tells us that`v == i + 1 + 1`

, which is what we need to conclude.

This version also has the problem that we allocate lock and then acquire it at the end, effectively leaking the memory associated with the lock. We’ll see how to fix that below.

## Modularity with higher-order ghost code

Our next attempt aims to write a single function `incr`

, rather than
`incr_left`

and `incr_right`

, and to give `incr`

a more
abstract, modular specification. The style we use here is based on an
idea proposed by Bart Jacobs and Frank Piessens in a paper titled
Expressive modular fine-grained concurrency specification.

The main idea is to observe that `incr_left`

and `incr_right`

only
deferred by the ghost code that they execute. But, Pulse is higher
order: so, why not parameterize a single function by `incr`

and let
the caller instantiate `incr`

twice, with different bits of ghost
code. Also, while we’re at it, why not also generalize the
specification of `incr`

so that it works with any user-chosen
abstract predicate, rather than `contributions`

and `left/right`

ghost state. Here’s how:

```
fn incr (x: ref int)
(#refine #aspec: int -> vprop)
(l:L.lock (exists* v. pts_to x v ** refine v))
(ghost_steps:
(v:int -> vq:int -> stt_ghost unit emp_inames
(refine v ** aspec vq ** pts_to x (v + 1))
(fun _ -> refine (v + 1) ** aspec (vq + 1) ** pts_to x (v + 1))))
requires aspec 'i
ensures aspec ('i + 1)
{
L.acquire l;
with _v. _;
let vx = !x;
rewrite each _v as vx;
x := vx + 1;
ghost_steps vx 'i;
L.release l;
}
```

As before, `incr`

requires `x`

and the `lock`

, but, this time,
it is parameterized by:

A predicate

`refine`

, which generalizes the`contributions`

predicate from before, and refines the value that`x`

points to.A predicate

`aspec`

, an abstract specification chosen by the caller, and serves as the main specification for`incr`

, which transitions from`aspec 'i`

to`aspec ('i + 1)`

.And, finally, the ghost function itself,

`ghost_steps`

, now specified abstractly in terms of the relationship between`refine`

,`aspec`

and`pts_to x`

—it says, effectively, that once`x`

has been updated, the abstract predicates`refine`

and`aspec`

can be updated too.

Having generalized `incr`

, we’ve now shifted the work to the
caller. But, `incr`

, now verified once and for all, can be used with
many different callers just by instantiating it difference. For
example, if we wanted to do a three-way parallel increment, we could
reuse our `incr`

as is. Whereas, our first take would have to be
completely revised, since `incr_left`

and `incr_right`

assume that
there are only two ghost references, not three.

Here’s one way to instantiate `incr`

, proving the same specification
as `add2`

.

```
fn add2_v2 (x: ref int)
requires pts_to x 'i
ensures pts_to x ('i + 2)
{
let left = GR.alloc 0;
let right = GR.alloc 0;
GR.share left;
GR.share right;
fold (contributions left right 'i 'i);
let lock = L.new_lock (
exists* (v:int).
pts_to x v ** contributions left right 'i v
);
ghost
fn step
(lr:GR.ref int)
(b:bool { if b then lr == left else lr == right })
(v vq:int)
requires
contributions left right 'i v **
GR.pts_to lr #one_half vq **
pts_to x (v + 1)
ensures
contributions left right 'i (v + 1) **
GR.pts_to lr #one_half (vq + 1) **
pts_to x (v + 1)
{
unfold contributions;
if b
{
with _p _v. rewrite (GR.pts_to lr #_p _v) as (GR.pts_to left #_p _v);
GR.gather left;
GR.write left (vq + 1);
GR.share left;
with _p _v. rewrite (GR.pts_to left #_p _v) as (GR.pts_to lr #_p _v);
fold (contributions left right 'i (v + 1));
}
else
{
with _p _v. rewrite (GR.pts_to lr #_p _v) as (GR.pts_to right #_p _v);
GR.gather right;
GR.write right (vq + 1);
GR.share right;
with _p _v. rewrite (GR.pts_to right #_p _v) as (GR.pts_to lr #_p _v);
fold (contributions left right 'i (v + 1));
}
};
par (fun _ -> incr x lock (step left true))
(fun _ -> incr x lock (step right false));
L.acquire lock; //we leak the lock here, but we can do better, below
unfold contributions;
GR.gather left;
GR.gather right;
GR.free left;
GR.free right;
}
```

The code is just a rearrangement of what we had before, factoring the
ghost code in `incr_left`

and `incr_right`

into a ghost function
`step`

. When we spawn our threads, we pass in the ghost code to
either update the left or the right contribution.

This code still has two issues:

The ghost

`step`

function is a bloated: we have essentially the same code and proof twice, once in each branch of the conditional. We can improve this by defining a custom bit of ghost state using Pulse’s support for partial commutative monoids—but that’s for another chapter.We still leak the memory associated with the lock at the end—we’ll remedy that next.

### Exercise

Instead of explicitly passing a ghost function, use a quantified trade.

## A version with invariants

As a final example, in this section, we’ll see how to program `add2`

using invariants and atomic operations, rather than locks.

Doing this properly will require working with bounded, machine
integers, e.g., `U32.t`

, since these are the only types that support
atomic operations. However, to illustrate the main ideas, we’ll assume
two atomic operations on unbounded integers—this will allow us to
not worry about possible integer overflow. We leave as an exercise the
problem of adapting this to `U32.t`

.

```
assume
val atomic_read (r:ref int) (#p:_) (#i:erased int)
: stt_atomic int emp_inames
(pts_to r #p i)
(fun v -> pts_to r #p i ** pure (reveal i == v))
assume
val cas (r:ref int) (u v:int) (#i:erased int)
: stt_atomic bool emp_inames
(pts_to r i)
(fun b ->
cond b (pts_to r v ** pure (reveal i == u))
(pts_to r i))
```

### Cancellable Invariants

The main idea of doing the `add2`

proof is to use an invariant token
`i:inv p`

instead of a `l:lock p`

. Just as in our previous code,
`add2`

starts with allocating an invariant, putting
`exists* v. pts_to x v ** contribution left right i v`

in the
invariant. Then call incr twice in different threads. However,
finally, to recover `pts_to x (v + 2)`

, where previously we would
acquire the lock, with a regular invariant, we’re stuck, since the
`pts_to x v`

permission is inside the invariant and we can’t take it
out to return to the caller.

An invariant token `i:inv p`

is a witness that the property `p`

is
true and remains true for the rest of a program’s execution. But, what
if we wanted to only enforce `p`

as an invariant for some finite
duration, and then to cancel it? This is what the library
`Pulse.Lib.CancellableInvariant`

provides. By allocating a ```
i:inv
(cancellable t p)
```

, we get an invariant which enforces `p`

only so
long as `active perm t`

is also provable. Here’s the relevant part
of the API:

```
ghost
fn create (v:vprop)
requires v
returns t:token
ensures cancellable t v ** active full_perm t
ghost
fn take (#p #t:_) (v:vprop)
requires cancellable t v ** active p t
ensures v ** active p t ** active full_perm t
ghost
fn restore (#t:_) (v:vprop)
requires v ** active full_perm t
ensures cancellable t v
fn cancel_inv (#t #v:_) (i:inv (cancellable t v))
requires active full_perm t
ensures v
```

A `cancellable t v`

is created from a proof of `v`

, also providing
a fractional-permission indexed predicate `active full_perm t`

,
which can be shared and gathered as usual. So long as one can prove
that the token `t`

is active, one can call `take`

to trade a
`cancellable t v`

for `v`

, and vice-versa. Finally, with full
permission on the `active`

predicate, `cancel_inv`

is a utility to
stop enforcing `v`

an invariant and to extract it from a ```
i:inv
(cancellable t v)
```

.

### An increment operation

Our first step is to build an increment operation from an
`atomic_read`

and a `cas`

. Here is its specification:

```
fn incr_atomic
(x: ref int)
(#p:perm)
(#t:erased C.token)
(#refine #aspec: int -> vprop)
(l:inv (C.cancellable t (exists* v. pts_to x v ** refine v)))
(f: (v:int -> vq:int -> stt_ghost unit emp_inames
(refine v ** aspec vq ** pts_to x (v + 1))
(fun _ -> refine (v + 1) ** aspec (vq + 1) ** pts_to x (v + 1))))
requires aspec 'i ** C.active p t
ensures aspec ('i + 1) ** C.active p t
```

The style of specification is similar to the generic style we used
with `incr`

, except now we use cancellable invariant instead of a
lock.

For its implementation, the main idea is to repeatedly read the
current value of `x`

, say `v`

; and then to `cas`

in `v+1`

if
the current value is still `v`

.

The `read`

function is relatively easy:

```
atomic
fn read ()
requires C.active p t
returns v:int
ensures C.active p t
opens (add_inv emp_inames l)
{
with_invariants l
{
C.take _;
let v = atomic_read x;
C.restore (exists* v. pts_to x v ** refine v);
v
}
};
```

We open the invariant

`l`

; then, knowing that the invariant is still active, we can`take`

it; then read the value`v`

;`restore`

the invariant; and return`v`

.

The main loop of `incr_atomic`

is next, shown below:

```
let mut continue = true;
fold (cond true (aspec 'i) (aspec ('i + 1)));
while (
with _b. _;
let b = !continue;
rewrite each _b as b;
b
)
invariant b.
pts_to continue b **
C.active p t **
cond b (aspec 'i) (aspec ('i + 1))
{
let v = read ();
let next =
with_invariants l
returns b1:bool
ensures cond b1 (aspec 'i) (aspec ('i + 1))
** pts_to continue true
** C.active p t
{
C.take _;
let b = cas x v (v + 1);
if b
{
elim_cond_true b _ _;
elim_cond_true true _ _;
f _ _;
intro_cond_false (aspec 'i) (aspec ('i + 1));
C.restore (exists* v. pts_to x v ** refine v);
false
}
else
{
with p q. rewrite (cond b p q) as q;
C.restore (exists* v. pts_to x v ** refine v);
true
}
};
continue := next
};
```

The loop invariant says:

the invariant remains active

the local variable

`continue`

determines if the loop iteration continuesand, so long as the loop continues, we still have

`aspec 'i`

, but when the loop ends we have`aspec ('i + 1)`

The body of the loop is also interesting and consists of two atomic
operations. We first `read`

the value of `x`

into `v`

. Then we
open the invariant again try to `cas`

in `v+1`

. If it succeeds, we
return `false`

from the `with_invariants`

block; otherwise
`true`

. And, finally, outside the `with_invariants`

block, we set
the `continue`

variable accordingly. Recall that `with_invariants`

allows at most a single atomic operation, so we having done a `cas`

,
we are not allowed to also set `continue`

inside the
`with_invariants`

block.

`add2_v3`

Finally, we implement our parallel increment again, `add2_v3`

, this
time using invariants, though it has the same specification as before.

```
fn add2_v3 (x: ref int)
requires pts_to x 'i
ensures pts_to x ('i + 2)
{
let left = GR.alloc 0;
let right = GR.alloc 0;
GR.share left;
GR.share right;
fold (contributions left right 'i 'i);
let tok = C.create2 (
exists* (v:int).
pts_to x v **
contributions left right 'i v
);
C.share tok;
let inv = new_invariant (C.cancellable tok _);
ghost
fn step
(lr:GR.ref int)
(b:bool { if b then lr == left else lr == right })
(v vq:int)
requires
contributions left right 'i v **
GR.pts_to lr #one_half vq **
pts_to x (v + 1)
ensures
contributions left right 'i (v + 1) **
GR.pts_to lr #one_half (vq + 1) **
pts_to x (v + 1)
{
unfold contributions;
if b
{
with _p _v. rewrite (GR.pts_to lr #_p _v) as (GR.pts_to left #_p _v);
GR.gather left;
GR.write left (vq + 1);
GR.share left;
with _p _v. rewrite (GR.pts_to left #_p _v) as (GR.pts_to lr #_p _v);
fold (contributions left right 'i (v + 1));
}
else
{
with _p _v. rewrite (GR.pts_to lr #_p _v) as (GR.pts_to right #_p _v);
GR.gather right;
GR.write right (vq + 1);
GR.share right;
with _p _v. rewrite (GR.pts_to right #_p _v) as (GR.pts_to lr #_p _v);
fold (contributions left right 'i (v + 1));
}
};
par (fun _ -> incr_atomic x inv (step left true))
(fun _ -> incr_atomic x inv (step right false));
C.gather tok;
C.cancel_inv inv;
unfold contributions;
GR.gather left;
GR.gather right;
GR.free left;
GR.free right;
}
```

The code too is very similar to `add2_v2`

, except instead of
allocating a lock, we allocate a cancellable invariant. And, at the
end, instead of acquiring, and leaking, the lock, we simply cancel the
invariant and we’re done.

## Exercise

Implement `add2`

on a `ref U32.t`

. You’ll need a precondition that
`'i + 2 < pow2 32`

and also to strengthen the invariant to prove
that each increment doesn’t overflow.