Understanding how F* uses Z3
As we have seen throughout, F* relies heavily on the Z3 SMT (Satifiability Modulo Theories) solver for proof automation. Often, on standalone examples at the scale covered in earlier chapters, the automation just works out of the box, but as one builds larger developments, proof automation can start becoming slower or unpredictable—at that stage, it becomes important to understand how F*’s encoding to SMT works to better control proofs.
At the most abstract level, one should realize that finding proofs in the SMT logic F* uses (first-order logic with uninterpreted functions and arithmetic) is an undecidable problem. As such, F* and the SMT solver relies on various heuristics and partial decision procedures, and a solver like Z3 does a remarkable job of being able to effectively solve the very large problems that F* presents to it, despite the theoretical undecidability. That said, the proof search that Z3 uses is computationally expensive and can be quite sensitive to the choice of heuristics and syntactic details of problem instances. As such, if one doesn’t chose the heuristics well, a small change in a query presented to Z3 can cause it to take a different search path, perhaps causing a proof to not be found at all, or to be found after consuming a very different amount of resources.
Some background and resources:
F*’s SMT encoding uses the SMT-LIB v2 language. We refer to the “SMT-LIB v2” language as SMT2.
Alejandro Aguirre wrote a tech report describing work in progress towards formalizing F*’s SMT encoding.
Michal Moskal’s Programming with Triggers describes how to pick triggers for quantifier instantiation and how to debug and profile the SMT solver, in the context of Vcc and the relation Hypervisor Verification project.
Leonardo de Moura and Nikolaj Bjorner describe how E-matching is implemented in Z3 (at least circa 2007).
A Primer on SMT2
SMT2 is a standardized input language supported by many SMT solvers. Its syntax is based on S-expressions, inspired by languages in the LISP family. We review some basic elements of its syntax here, particularly the parts that are used by F*’s SMT encoding.
Multi-sorted logic
The logic provided by the SMT solver is multi-sorted: the sorts provide a simple type system for the logic, ensuring, e.g., that terms from two different sorts can never be equal. A user can define a new sort
T
, as shown below:(declare-sort T)
Every sort comes with a built-in notion of equality. Given two terms
p
andq
of the same sortT
,(= p q)
is a term of sortBool
expressing their equality.Declaring uninterpreted functions
A new function symbol
F
, with arguments in sortssort_1 .. sort_n
and returning a result insort
is declared as shown below,(declare-fun F (sort_1 ... sort_n) sort)
The function symbol
F
is uninterpreted, meaning that the only information the solver has aboutF
is that it is a function, i.e., when applied to equal argumentsF
produces equal results.Theory symbols
Z3 provides support for several theories, notably integer and real arithmetic. For example, on terms
i
andj
ofInt
sort, the sort of unbounded integers, the following terms define the expected arithmetic functions:(+ i j) ; addition (- i j) ; subtraction (* i j) ; multiplication (div i j) ; Euclidean division (mod i j) ; Euclidean modulus
Logical connectives
SMT2 provides basic logical connectives as shown below, where
p
andq
are terms of sortBool
(and p q) ; conjunction (or p q) ; disjunction (not p) ; negation (implies p q) ; implication (iff p q) ; bi-implication
SMT2 also provides support for quantifiers, where the terms below represent a term
p
with the variablesx1 ... xn
universally and existentially quantified, respectively.(forall ((x1 sort_1) ... (xn sort_n)) p) (exists ((x1 sort_1) ... (xn sort_n)) p)
Attribute annotations
A term
p
can be decorated with attributes namesa_1 .. a_n
with valuesv_1 .. v_n
using the following syntax—the!
is NOT to be confused with logical negation.(! p :a_1 v_1 ... :a_n v_n)
A common usage is with quantifiers, as we’ll see below, e.g.,
(forall ((x Int)) (! (implies (>= x 0) (f x)) :qid some_identifier))
An SMT2 theory and check-sat
An SMT2 theory is a collection of sort and function symbol declarations, and assertions of facts about them. For example, here’s a simple theory declaring a function symbol
f
and an assumption thatf x y
is equivalent to(>= x y)
—note, unlike in F*, theassert
keyword in SMT2 assumes that a fact is true, rather than checking that it is valid, i.e.,assert
in SMT2 is likeassume
in F*.(declare-fun f (Int Int) Bool) (assert (forall ((x Int) (y Int)) (iff (>= y x) (f x y))))
In the context of this theory, one can ask whether some facts about
f
are valid. For example, to check iff
is a transitive function, one asserts the negation of the transitivity property forf
and then asks Z3 to check (using the(check-sat)
directive) if the resulting theory is satisfiable.(assert (not (forall ((x Int) (y Int) (z Int)) (implies (and (f x y) (f y z)) (f x z))))) (check-sat)
In this case, Z3 very quickly responds with
unsat
, meaning that there are no models for the theory that contain an interpretation off
compatible with both assertions, or, equivalently, the transitivity off
is true in all models. That is, we expect successful queries to returnunsat
.
A Brief Tour of F*’s SMT Encoding
Consider the following simple F* code:
let id x = x
let f (x:int) =
if x < 0
then assert (- (id x) >= 0)
else assert (id x >= 0)
To encode the proof obligation of this program to SMT, F* generates an SMT2 file with the following rough shape.
;; Some basic scaffoling
(declare-sort Term)
...
;; Encoding of some basic modules
(declare-fun Prims.bool () Term)
...
;; Encoding of background facts about the current module
(declare-fun id (Term) Term)
(assert (forall ((x Term)) (= (id x) x)))
;; Encoding the query, i.e., negated proof obligation
(assert (not (forall ((x Term))
(and (implies (lt x 0) (geq (minus (M.id x)) 0))
(implies (not (lt x 0)) (geq (M.id x) 0))))))
(check-sat)
;; Followed by some instrumentation for error reporting
;; in case the check-sat call fails (i.e., does not return unsat)
That was just just to give you a rough idea—the details of F*’s actual SMT encoding are a bit different, as we’ll see below.
To inspect F*’s SMT encoding, we’ll work through several small
examples and get F* to log the SMT2 theories that it generates. For
this, we’ll use the file shown below as a skeleton, starting with the
#push-options "--log_queries"
directive, which instructs F* to
print out its encoding to .smt2
file. The force_a_query
definition at the end ensures that F* actually produces at least one
query—without it, F* sends nothing to the Z3 and so prints no output
in the .smt2 file. If you run fstar.exe SMTEncoding.fst
on the
command line, you will find a file queries-SMTEncoding.smt2
in the
current directory.
module SMTEncoding
open FStar.Mul
#push-options "--log_queries"
let false_boolean = false
let true_boolean = true
let rec factorial (n:nat) : nat =
if n = 0 then 1
else n * factorial (n - 1)
let id (x:Type0) = x
let force_a_query = assert (id True)
Even for a tiny module like this, you’ll see that the .smt2 file is
very large. That’s because, by default, F* always includes the modules
prims.fst
, FStar.Pervasives.Native.fst
, and
FStar.Pervasives.fsti
as dependences of other modules. Encoding
these modules consumes about 150,000 lines of SMT2 definitions and
comments.
The encoding of each module is delimited in the .smt2 file by comments of the following kind:
;;; Start module Prims
...
;;; End module Prims (1334 decls; total size 431263)
;;; Start module FStar.Pervasives.Native
...
;;; End module FStar.Pervasives.Native (2643 decls; total size 2546449)
;;; Start interface FStar.Pervasives
...
;;; End interface FStar.Pervasives (2421 decls; total size 1123058)
where each End line also describes the number of declarations in the module and its length in characters.
Term
sort
Despite SMT2 being a multi-sorted logic, aside from the pervasive use
the SMT sort Bool
, F*’s encoding to SMT (mostly) uses just a
single sort: every pure (or ghost) F* term is encoded to the SMT
solver as an instance of an uninterpreted SMT sort called
Term
. This allows the encoding to be very general, handling F*’s
much richer type system, rather than trying to map F*’s complex type
system into the much simpler type system of SMT sorts.
Booleans
One of the most primitive sorts in the SMT solver is Bool
, the
sort of Booleans. All the logical connectives in SMT are operations on
the Bool
sort. To encode values of the F* type bool
to SMT, we
use the Bool
sort, but since all F* terms are encoded to the
Term
sort, we “box” the Bool
sort to promote it to Term
,
using the SMT2 definitions below.
(declare-fun BoxBool (Bool) Term)
(declare-fun BoxBool_proj_0 (Term) Bool)
(assert (! (forall ((@u0 Bool))
(! (= (BoxBool_proj_0 (BoxBool @u0))
@u0)
:pattern ((BoxBool @u0))
:qid projection_inverse_BoxBool_proj_0))
:named projection_inverse_BoxBool_proj_0))
This declares two uninterpreted functions BoxBool
and
BoxBool_proj_0
that go back and forth between the sorts Bool
and Term
.
The axiom named projection_inverse_BoxBool_proj_0
states that
BoxBool_proj_0
is the inverse of BoxBool
, or, equivalently,
that BoxBool
is an injective function from Bool
to
Term
.
The qid
is the quantifier identifier, usually equal to or derived
from the name of the assumption that includes it—qids come up when
we look at profiling quantifier instantiation.
Patterns for quantifier instantiation
The projection_inverse_BoxBool_proj_0
axiom on booleans shows our
first use of a quantified formula with a pattern, i.e., the part that
says :pattern ((BoxBool @u0))
. These patterns are the main
heuristic used to control the SMT solver’s proof search and will
feature repeatedly in the remainder of this chapter.
When exploring a theory, the SMT solver has a current partial model which contains an assignment for some of the variables in a theory to ground terms. All the terms that appear in this partial model are called active terms and these active terms play a role in quantifier instantiation.
Each universally quantified formula in scope is a term of the form below:
(forall ((x1 s1) ... (xn sn))
(! ( body )
:pattern ((p1) ... (pm))))
This quantified formula is inert and only plays a role in the solver’s
search once the bound variables x1 ... xn
are instantiated. The
terms p1 ... pm
are called patterns, and collectively, p1
... pm
must mention all the bound variables. To instantiate the
quantifier, the solver aims to find active terms v1...vm
that
match the patterns p1..pm
, where a match involves finding a
substitution S
for the bound variables x1...xm
, such that the
substituted patterns S(p1...pm)
are equal to the active terms
v1..vm
. Given such a substitution, the substituted term
S(body)
becomes active and may refine the partial model further.
Existentially quantified formulas are dual to universally quantified formulas. Whereas a universal formula in the context (i.e., in negative position, or as a hypothesis) is inert until its pattern is instantiated, an existential goal (or, in positive position) is inert until its pattern is instantiated. Existential quantifiers can be decorated with patterns that trigger instantiation when matched with active terms, just like universal quantifiers
Returning to projection_inverse_BoxBool_proj_0
, what this means is
that once the solver has an active term BoxBool b
, it can
instantiate the quantified formula to obtain (= (BoxBool_proj_0
(BoxBool b)) b)
.
Integers
The encoding of the F* type int
is similar to that of
bool
—the primitive SMT sort Int
(of unbounded mathematical
integers) are coerced to Term
using the injective function
BoxInt
.
(declare-fun BoxInt (Int) Term)
(declare-fun BoxInt_proj_0 (Term) Int)
(assert (! (forall ((@u0 Int))
(! (= (BoxInt_proj_0 (BoxInt @u0))
@u0)
:pattern ((BoxInt @u0))
:qid projection_inverse_BoxInt_proj_0))
:named projection_inverse_BoxInt_proj_0))
The primitive operations on integers are encoded by unboxing the
arguments and boxing the result. For example, here’s the encoding of
Prims.(+)
, the addition operator on integers.
(declare-fun Prims.op_Addition (Term Term) Term)
(assert (! (forall ((@x0 Term) (@x1 Term))
(! (= (Prims.op_Addition @x0
@x1)
(BoxInt (+ (BoxInt_proj_0 @x0)
(BoxInt_proj_0 @x1))))
:pattern ((Prims.op_Addition @x0
@x1))
:qid primitive_Prims.op_Addition))
:named primitive_Prims.op_Addition))
This declares an uninterpreted function Prims.op_Addition
, a
binary function on Term
, and an assumption relating it to the SMT
primitive operator from the integer arithmetic theory (+)
. The
pattern allows the SMT solver to instantiate this quantifier for every
active application of the Prims.op_Addition
.
The additional boxing introduces some overhead, e.g., proving x + y
== y + x
in F* amounts to proving Prims.op_Addition x y ==
Prims.op_Addition y x
in SMT2. This in turn involves instantiation
quantifiers, then reasoning in the theory of linear arithmetic, and
finally using the injectivity of the BoxInt
function to
conclude. However, this overhead is usually not perceptible, and the
uniformity of encoding everything to a single Term
sort simplifies
many other things. Nevertheless, F* provides a few options to control
the way integers and boxed and unboxed, described ahead.
Functions
Consider the F* function below:
let add3 (x y z:int) : int = x + y + z
Its encoding to SMT has several elements.
First, we have have a declaration of an uninterpreted ternary function
on Term
.
(declare-fun SMTEncoding.add3 (Term Term Term) Term)
The semantics of add3
is given using the assumption below, which
because of the pattern on the quantifier, can be interpreted as a
rewrite rule from left to right: every time the solver has
SMTEncoding.add3 x y z
as an active term, it can expand it to its
definition.
(assert (! (forall ((@x0 Term) (@x1 Term) (@x2 Term))
(! (= (SMTEncoding.add3 @x0
@x1
@x2)
(Prims.op_Addition (Prims.op_Addition @x0
@x1)
@x2))
:pattern ((SMTEncoding.add3 @x0
@x1
@x2))
:qid equation_SMTEncoding.add3))
:named equation_SMTEncoding.add3))
In addition to its definition, F* encodes the type of add3
to the
solver too, as seen by the assumption below. One of the key predicates
of F*’s SMT encoding is HasType
, which relates a term to its
type. The assumption typing_SMTEncoding.add3
encodes the typing of
the application based on the typing hypotheses on the arguments.
(assert (! (forall ((@x0 Term) (@x1 Term) (@x2 Term))
(! (implies (and (HasType @x0
Prims.int)
(HasType @x1
Prims.int)
(HasType @x2
Prims.int))
(HasType (SMTEncoding.add3 @x0 @x1 @x2)
Prims.int))
:pattern ((SMTEncoding.add3 @x0 @x1 @x2))
:qid typing_SMTEncoding.add3))
:named typing_SMTEncoding.add3))
This is all we’d need to encode add3
if it was never used at
higher order. However, F* treats functions values just like any other
value and allows them to be passed as arguments to, or returned as
results from, other functions. The SMT logic is, however, a
first-order logic and functions like add3
are not first-class
values. So, F* introduces another layer in the encoding to model
higher-order functions, but we don’t cover this here.
Recursive functions and fuel
Non-recursive functions are similar to macro definitions—F* just
encodes encodes their semantics to the SMT solver as a rewrite
rule. However, recursive functions, since they could be unfolded
indefinitely, are not so simple. Let’s look at the encoding of the
factorial
function shown below.
open FStar.Mul
let rec factorial (n:nat) : nat =
if n = 0 then 1
else n * factorial (n - 1)
First, we have, as before, an uninterpreted function symbol on Term
and an assumption about its typing.
(declare-fun SMTEncoding.factorial (Term) Term)
(assert (! (forall ((@x0 Term))
(! (implies (HasType @x0 Prims.nat)
(HasType (SMTEncoding.factorial @x0) Prims.nat))
:pattern ((SMTEncoding.factorial @x0))
:qid typing_SMTEncoding.factorial))
:named typing_SMTEncoding.factorial))
However, to define the semantics of factorial
we introduce a
second “fuel-instrumented” function symbol with an additional
parameter of Fuel
sort.
(declare-fun SMTEncoding.factorial.fuel_instrumented (Fuel Term) Term)
The Fuel
sort is declared at the very beginning of F*’s SMT
encoding and is a representation of unary integers, with two
constructors ZFuel
(for zero) and SFuel f
(for successor).
The main idea is to encode the definition of factorial
guarded by
patterns that only allow unfolding the definition if the fuel argument
of factorial.fuel_instrumented
is not zero, as shown below.
Further, the assumption defining the semantics of
factorial.fuel_instrumented
is guarded by a typing hypothesis on
the argument (HasType @x1 Prims.nat)
, since the recursive function
in F* is only well-founded on nat
, not on all terms. The
:weight
annotation is an SMT2 detail: setting it to zero ensures
that the SMT solver can instantiate this quantifier as often as
needed, so long as the the fuel instrumentation argument is non-zero.
Notice that the equation peels off one application of SFuel
, so
that the quantifier cannot be repeatedly instantiated infinitely.
(assert (! (forall ((@u0 Fuel) (@x1 Term))
(! (implies (HasType @x1 Prims.nat)
(= (SMTEncoding.factorial.fuel_instrumented (SFuel @u0) @x1)
(let ((@lb2 (Prims.op_Equality Prims.int @x1 (BoxInt 0))))
(ite (= @lb2 (BoxBool true))
(BoxInt 1)
(Prims.op_Multiply
@x1
(SMTEncoding.factorial.fuel_instrumented
@u0
(Prims.op_Subtraction @x1 (BoxInt 1))))))))
:weight 0
:pattern ((SMTEncoding.factorial.fuel_instrumented (SFuel @u0) @x1))
:qid equation_with_fuel_SMTEncoding.factorial.fuel_instrumented))
:named equation_with_fuel_SMTEncoding.factorial.fuel_instrumented))
We also need an assumption that tells the SMT solver that the fuel argument, aside from controlling the number of unfoldings, is semantically irrelevant.
(assert (! (forall ((@u0 Fuel) (@x1 Term))
(! (= (SMTEncoding.factorial.fuel_instrumented (SFuel @u0) @x1)
(SMTEncoding.factorial.fuel_instrumented ZFuel @x1))
:pattern ((SMTEncoding.factorial.fuel_instrumented (SFuel @u0) @x1))
:qid @fuel_irrelevance_SMTEncoding.factorial.fuel_instrumented))
:named @fuel_irrelevance_SMTEncoding.factorial.fuel_instrumented))
And, finally, we relate the original function to its fuel-instrumented counterpart.
(assert (! (forall ((@x0 Term))
(! (= (SMTEncoding.factorial @x0)
(SMTEncoding.factorial.fuel_instrumented MaxFuel @x0))
:pattern ((SMTEncoding.factorial @x0))
:qid @fuel_correspondence_SMTEncoding.factorial.fuel_instrumented))
:named @fuel_correspondence_SMTEncoding.factorial.fuel_instrumented))
This definition uses the constant MaxFuel
. The value of this
constant is determined by the F* options --initial_fuel n
and
--max_fuel m
. When F* issues a query to Z3, it tries the query
repeatedly with different values of MaxFuel
ranging between n
and m
. Additionally, the option --fuel n
sets both the initial
fuel and max fuel to n
.
This single value of MaxFuel
controls the number of unfoldings of
all recursive functions in scope. Of course, the patterns are
arranged so that if you have a query involving, say, List.map
,
quantified assumptions about an unrelated recursive function like
factorial
should never trigger. Neverthless, large values of
MaxFuel
greatly increase the search space for the SMT solver. If
your proof requires a setting greater than --fuel 2
, and if it
takes the SMT solver a long time to find the proof, then you should
think about whether things could be done differently.
However, with a low value of fuel
, the SMT solver cannot reason
about recursive functions beyond that bound. For instance, the
following fails, since the solver can unroll the definition only once
to conclude that factorial 1 == 1 * factorial 0
, but being unable
to unfold factorial 0
further, the proof fails.
#push-options "--fuel 1"
let _ = assert (factorial 1 == 1) (* fails *)
As with regular functions, the rest of the encoding of recursive functions has to do with handling higher-order uses.
Inductive datatypes and ifuel
Inductive datatypes in F* allow defining unbounded structures and, just like with recursive functions, F* encodes them to SMT by instrumenting them with fuel, to prevent infinite unfoldings. Let’s look at a very simple example, an F* type of unary natural numbers.
type unat =
| Z : unat
| S : (prec:unat) -> unat
Although Z3 offers support for a built-in theory of datatypes, F* does
not use it (aside for Fuel
), since F* datatypes are more
complex. Instead, F* rolls its own datatype encoding using
uninterpreted functions and the encoding of unat
begins by
declaring these functions.
(declare-fun SMTEncoding.unat () Term)
(declare-fun SMTEncoding.Z () Term)
(declare-fun SMTEncoding.S (Term) Term)
(declare-fun SMTEncoding.S_prec (Term) Term)
We have one function for the type unat
; one for each constructor
(Z
and S
); and one “projector” for each argument of each
constructor (here, only S_prec
, corresponding to the F* projector
S?.prec
).
The type unat
has its typing assumption, where Tm_type
is the
SMT encoding of the F* type Type
—note F* does not encode the
universe levels to SMT.
(assert (! (HasType SMTEncoding.unat Tm_type)
:named kinding_SMTEncoding.unat@tok))
The constructor S_prec
is assumed to be an inverse of S
. If
there were more than one argument to the constructor, each projector
would project out only the corresponding argument, encoding that the
constructor is injective on each of its arguments.
(assert (! (forall ((@x0 Term))
(! (= (SMTEncoding.S_prec (SMTEncoding.S @x0)) @x0)
:pattern ((SMTEncoding.S @x0))
:qid projection_inverse_SMTEncoding.S_prec))
:named projection_inverse_SMTEncoding.S_prec))
The encoding defines two macros is-SMTEncoding.Z
and
is-SMTEncoding.S
that define when the head-constructor of a term
is Z
and S
respectively. These two macros are used in the
definition of the inversion assumption of datatypes, namely that given
a term of type unat
, one can conclude that its head constructor
must be either Z
or S
. However, since the type unat
is
unbounded, we want to avoid applying this inversion indefinitely, so
it uses a quantifier with a pattern that requires non-zero fuel to
be triggered.
(assert (! (forall ((@u0 Fuel) (@x1 Term))
(! (implies (HasTypeFuel (SFuel @u0) @x1 SMTEncoding.unat)
(or (is-SMTEncoding.Z @x1)
(is-SMTEncoding.S @x1)))
:pattern ((HasTypeFuel (SFuel @u0) @x1 SMTEncoding.unat))
:qid fuel_guarded_inversion_SMTEncoding.unat))
:named fuel_guarded_inversion_SMTEncoding.unat))
Here, we see a use of HasTypeFuel
, a fuel-instrumented version of
the HasType
we’ve seen earlier. In fact, (HasType x t)
is just
a macro for (HasTypeFuel MaxIFuel x t)
, where much like for
recursive functions and fuel, the constant MaxIFuel
is defined by
the current value of the F* options --initial_ifuel
,
--max_ifuel
, and --ifuel
(where ifuel
stands for “inversion fuel”).
The key bit in ensuring that the inversion assumption above is not indefinitely applied is in the structure of the typing assumptions for the data constructors. These typing assumptions come in two forms, introduction and elimination.
The introduction form for the S
constructor is shown below. This
allows deriving that S x
has type unat
from the fact that
x
itself has type unat
. The pattern on the quantifier makes
this goal-directed: if (HasTypeFuel @u0 (SMTEncoding.S @x1)
SMTEncoding.unat)
is already an active term, then the quantifer
fires to make (HasTypeFuel @u0 @x1 SMTEncoding.unat)
an active
term, peeling off one application of the S
constructor. If we
were to use (HasTypeFuel @u0 @x1 SMTEncoding.unat)
as the pattern,
this would lead to an infinite quantifier instantiation loop, since
every each instantiation would lead a new, larger active term that
could instantiate the quantifier again. Note, using the introduction
form does not vary the fuel parameter, since the the number of
applications of the constructor S
decreases at each instantiation
anyway.
(assert (! (forall ((@u0 Fuel) (@x1 Term))
(! (implies (HasTypeFuel @u0 @x1 SMTEncoding.unat)
(HasTypeFuel @u0 (SMTEncoding.S @x1) SMTEncoding.unat))
:pattern ((HasTypeFuel @u0 (SMTEncoding.S @x1) SMTEncoding.unat))
:qid data_typing_intro_SMTEncoding.S@tok))
:named data_typing_intro_SMTEncoding.S@tok))
The elimination form allows concluding that the sub-terms of a
well-typed application of a constructor are well-typed too. This time
note that the conclusion of the rule decreases the fuel parameter by
one. If that were not the case, then we would get a quantifier
matching loop between data_elim_SMTEncoding.S
and
fuel_guarded_inversion_SMTEncoding.unat
, since each application of
the latter would contribute an active term of the form (HasTypeFuel
(SFuel _) (S (S_prec x)) unat)
, allowing the former to be triggered
again.
(assert (! (forall ((@u0 Fuel) (@x1 Term))
(! (implies (HasTypeFuel (SFuel @u0) (SMTEncoding.S @x1) SMTEncoding.unat)
(HasTypeFuel @u0 @x1 SMTEncoding.unat))
:pattern ((HasTypeFuel (SFuel @u0) (SMTEncoding.S @x1) SMTEncoding.unat))
:qid data_elim_SMTEncoding.S))
:named data_elim_SMTEncoding.S))
A final important element in the encoding of datatypes has to do with
the well-founded ordering used in termination proofs. The following
states that if S x1
is well-typed (with non-zero fuel) then x1
precedes S x1
in F*’s built-in sub-term ordering.
(assert (! (forall ((@u0 Fuel) (@x1 Term))
(! (implies (HasTypeFuel (SFuel @u0)
(SMTEncoding.S @x1)
SMTEncoding.unat)
(Valid (Prims.precedes Prims.lex_t Prims.lex_t
@x1 (SMTEncoding.S @x1))))
:pattern ((HasTypeFuel (SFuel @u0) (SMTEncoding.S @x1) SMTEncoding.unat))
:qid subterm_ordering_SMTEncoding.S))
:named subterm_ordering_SMTEncoding.S))
Once again, a lot of the rest of the datatype encoding has to do with handling higher order uses of the constructors.
As with recursive functions, the single value of MaxIFuel
controls
the number of inversions of all datatypes in scope. It’s a good idea
to try to use an ifuel
setting that is as low as possible for
your proofs, e.g., a value less than 2
, or even 0
, if
possible. However, as with fuel, a value of ifuel that is too low will
cause the solver to be unable to prove some facts. For example,
without any ifuel
, the solver cannot use the inversion assumption
to prove that the head of x
must be either S
or Z
, and F*
reports the error “Patterns are incomplete”.
#push-options "--ifuel 0"
let rec as_nat (x:unat) : nat =
match x with (* fails exhaustiveness check *)
| S x -> 1 + as_nat x (* fails termination check *)
| Z -> 0
Sometimes it is useful to let the solver arbitrarily invert an
inductive type. The FStar.Pervasives.allow_inversion
is a library
function that enables this, as shown below. Within that scope, the
ifuel guards on the unat
type are no longer imposed and SMT can
invert unat
freely—F* accepts the code below.
#push-options "--ifuel 0"
let rec as_nat (x:unat) : nat =
allow_inversion unat;
match x with
| S x -> 1 + as_nat x
| Z -> 0
This can be useful sometimes, e.g., one could set the ifuel to 0 and
allow inversion within a scope for only a few selected types, e.g.,
option
. However, it is rarely a good idea to use
allow_inversion
on an unbounded type (e.g., list
or even
unat
).
Logical Connectives
The logical connectives that F* offers are all derived forms. Given the encodings of datatypes and functions (and arrow types, which we haven’t shown), the encodings of all these connectives just fall out naturally. However, all these connectives also have built-in support in the SMT solver as part of its propositional core and support for E-matching-based quantifier instantiation. So, rather than leave them as derived forms, a vital optimization in F*’s SMT encoding is to recognize these connectives and to encode them directly to the corresponding forms in SMT.
The term p /\ q
in F* is encoded to (and [[p]] [[q]]])
where
[[p]]
and [[q]]
are the logical encodings of p
and q
respectively. However, the SMT connective and
is a binary function
on the SMT sort Bool
, whereas all we have been describing so far
is that every F* term p
is encoded to the SMT sort Term
. To
bridge the gap, the logical encoding of a term p
interprets the
Term
sort into Bool
by using a function Valid p
, which
deems a p : Term
to be valid if it is inhabited, as per the
definitions below.
(declare-fun Valid (Term) Bool)
(assert (forall ((e Term) (t Term))
(! (implies (HasType e t) (Valid t))
:pattern ((HasType e t) (Valid t))
:qid __prelude_valid_intro)))
The connectives p \/ q
, p ==> q
, p <==> q
, and ~p
are
similar.
The quantified forms forall
and exists
are mapped to the
corresponding quantifiers in SMT. For example,
let fact_positive = forall (x:nat). factorial x >= 1
is encoded to:
(forall ((@x1 Term))
(implies (HasType @x1 Prims.nat)
(>= (BoxInt_proj_0 (SMTEncoding.factorial @x1))
(BoxInt_proj_0 (BoxInt 1)))))
Note, this quantifier does not have any explicitly annotated
patterns. In this case, Z3’s syntactic trigger selection heuristics
pick a pattern: it is usually the smallest collection of sub-terms of
the body of the quantifier that collectively mention all the bound
variables. In this case, the choices for the pattern are
(SMTEncoding.factorial @x1)
and (HasType @x1 Prims.nat)
: Z3
picks both of these as patterns, allowing the quantifier to be
triggered if an active term matches either one of them.
For small developments, leaving the choice of pattern to Z3 is often fine, but as your project scales up, you probably want to be more careful about your choice of patterns. F* lets you write the pattern explicitly on a quantifier and translates it down to SMT, as shown below.
let fact_positive_2 = forall (x:nat).{:pattern (factorial x)} factorial x >= 1
This produces:
(forall ((@x1 Term))
(! (implies (HasType @x1 Prims.nat)
(>= (BoxInt_proj_0 (SMTEncoding.factorial @x1))
(BoxInt_proj_0 (BoxInt 1))))
:pattern ((SMTEncoding.factorial.fuel_instrumented ZFuel @x1))))
Note, since factorial
is fuel instrumented, the pattern is
translated to an application that requires no fuel, so that the
property also applies to any partial unrolling of factorial also.
Existential formulas are similar. For example, one can write:
let fact_unbounded = forall (n:nat). exists (x:nat). factorial x >= n
And it gets translated to:
(forall ((@x1 Term))
(implies (HasType @x1 Prims.nat)
(exists ((@x2 Term))
(and (HasType @x2 Prims.nat)
(>= (BoxInt_proj_0 (SMTEncoding.factorial @x2))
(BoxInt_proj_0 @x1))))))
Options for Z3 and the SMT Encoding
F* provides two ways of passing options to Z3.
The option --z3cliopt <string>
causes F* to pass the string as a
command-line option when starting the Z3 process. A typical usage
might be --z3cliopt 'smt.random_seed=17'
.
In contrast, --z3smtopt <string>
causes F* to send the string to
Z3 as part of its SMT2 output and this option is also reflected in the
.smt2 file that F* emits with --log_queries
. As such, it can be
more convenient to use this option if you want to debug or profile a
run of Z3 on an .smt2 file generated by F*. A typical usage would be
--z3smtopt '(set-option :smt.random_seed 17)'
. Note, it is
possible to abuse this option, e.g., one could use --z3smtopt
'(assert false)'
and all SMT queries would trivially pass. So, use
it with care.
F*’s SMT encoding also offers a few options.
--smtencoding.l_arith_repr native
This option requests F* to inline the definitions of the linear
arithmetic operators (+
and -
). For example, with this option
enabled, F* encodes the term x + 1 + 2
as the SMT2 term below.
(BoxInt (+ (BoxInt_proj_0 (BoxInt (+ (BoxInt_proj_0 @x0)
(BoxInt_proj_0 (BoxInt 1)))))
(BoxInt_proj_0 (BoxInt 2))))
--smtencoding.elim_box true
This option is often useful in combination with
smtencoding.l_arith_repr native
, enables an optimization to remove
redundant adjacent box/unbox pairs. So, adding this option to the
example above, the encoding of x + 1 + 2
becomes:
(BoxInt (+ (+ (BoxInt_proj_0 @x0) 1) 2))
--smtencoding.nl_arith_repr [native|wrapped|boxwrap]
This option controls the representation of non-linear arithmetic
functions (*, /, mod
) in the SMT encoding. The default is
boxwrap
which uses the encoding of Prims.op_Multiply,
Prims.op_Division, Prims.op_Modulus
analogous to
Prims.op_Addition
.
The native
setting is similar to the smtencoding.l_arith_repr
native
. When used in conjuction with smtencoding.elim_box true
,
the F* term x * 1 * 2
is encoded to:
(BoxInt (* (* (BoxInt_proj_0 @x0) 1) 2))
However, a third setting wrapped
is also available with provides
an intermediate level of wrapping. With this setting enabled, the
encoding of x * 1 * 2
becomes
(BoxInt (_mul (_mul (BoxInt_proj_0 @x0) 1) 2))
where _mul
is declared as shown below:
(declare-fun _mul (Int Int) Int)
(assert (forall ((x Int) (y Int)) (! (= (_mul x y) (* x y)) :pattern ((_mul x y)))))
Now, you may wonder why all these settings are useful. Surely, one
would think, --smtencoding.l_arith_repr native
--smtencoding.nl_arith_repr native --smtencoding.elim_box true
is
the best setting. However, it turns out that the additional layers of
wrapping and boxing actually enable some proofs to go through, and,
empirically, no setting strictly dominates all the others.
However, the following is a good rule of thumb if you are starting a new project:
Consider using
--z3smtopt '(set-option :smt.arith.nl false)'
. This entirely disables support for non-linear arithmetic theory reasoning in the SMT solver, since this can be very expensive and unpredictable. Instead, if you need to reason about non-linear arithmetic, consider using the lemmas fromFStar.Math.Lemmas
to do the non-linear steps in your proof manually. This will be more painstaking, but will lead to more stable proofs.For linear arithmetic, the setting
--smtencoding.l_arith_repr native --smtencoding.elim_box true
is a good one to consider, and may yield some proof performance boosts over the default setting.
Designing a Library with SMT Patterns
In this section, we look at the design of FStar.Set
, a module in
the standard library, examining, in particular, its use of SMT
patterns on lemmas for proof automation. The style used here is
representative of the style used in many proof-oriented
libraries—the interface of the module offers an abstract type, with
some constructors and some destructors, and lemmas that relate their
behavior.
To start with, for our interface, we set the fuel and ifuel both to zero—we will not need to reason about recursive functions or invert inductive types here.
module SimplifiedFStarSet
(** Computational sets (on eqtypes): membership is a boolean function *)
#set-options "--fuel 0 --ifuel 0"
Next, we introduce the signature of the main abstract type of this
module, set
:
val set (a:eqtype)
: Type0
Sets offer just a single operation called mem
that allows testing
whether or not a given element is in the set.
val mem (#a:eqtype) (x:a) (s:set a)
: bool
However, there are several ways to construct sets:
val empty (#a:eqtype)
: set a
val singleton (#a:eqtype) (x:a)
: set a
val union (#a:eqtype) (s0 s1: set a)
: set a
val intersect (#a:eqtype) (s0 s1: set a)
: set a
val complement (#a:eqtype) (s0:set a)
: set a
Finally, sets are equipped with a custom equivalence relation:
val equal (#a:eqtype) (s0 s1:set a)
: prop
The rest of our module offers lemmas that describe the behavior of
mem
when applied to each of the constructors.
val mem_empty (#a:eqtype) (x:a)
: Lemma
(ensures (not (mem x empty)))
[SMTPat (mem x empty)]
val mem_singleton (#a:eqtype) (x y:a)
: Lemma
(ensures (mem y (singleton x) == (x=y)))
[SMTPat (mem y (singleton x))]
val mem_union (#a:eqtype) (x:a) (s1 s2:set a)
: Lemma
(ensures (mem x (union s1 s2) == (mem x s1 || mem x s2)))
[SMTPat (mem x (union s1 s2))]
val mem_intersect (#a:eqtype) (x:a) (s1:set a) (s2:set a)
: Lemma
(ensures (mem x (intersect s1 s2) == (mem x s1 && mem x s2)))
[SMTPat (mem x (intersect s1 s2))]
val mem_complement (#a:eqtype) (x:a) (s:set a)
: Lemma
(ensures (mem x (complement s) == not (mem x s)))
[SMTPat (mem x (complement s))]
Each of these lemmas should be intuitive and familiar. The extra bit
to pay attention to is the SMTPat
annotations on each of the
lemmas. These annotations instruct F*’s SMT encoding to treat the
lemma like a universal quantifier guarded by the user-provided
pattern. For instance, the lemma mem_empty
is encoded to the SMT
solver as shown below.
(assert (! (forall ((@x0 Term) (@x1 Term))
(! (implies (and (HasType @x0 Prims.eqtype)
(HasType @x1 @x0))
(not (BoxBool_proj_0
(SimplifiedFStarSet.mem @x0
@x1
(SimplifiedFStarSet.empty @x0)))))
:pattern ((SimplifiedFStarSet.mem @x0
@x1
(SimplifiedFStarSet.empty @x0)))
:qid lemma_SimplifiedFStarSet.mem_empty))
:named lemma_SimplifiedFStarSet.mem_empty))
That is, from the perspective of the SMT encoding, the statement of
the lemma mem_empty
is analogous to the following assumption:
forall (a:eqtype) (x:a). {:pattern (mem x empty)} not (mem x empty)
As such, lemmas decorated with SMT patterns allow the user to inject
new, quantified hypotheses into the solver’s context, where each of
those hypotheses is justified by a proof in F* of the corresponding
lemma. This allows users of the FStar.Set
library to treat set
almost like a new built-in type, with proof automation to reason about
its operations. However, making this work well requires some careful
design of the patterns.
Consider mem_union
: the pattern chosen above allows the solver to
decompose an active term mem x (union s1 s2)
into mem x s1
and
mem x s2
, where both terms are smaller than the term we started
with. Suppose instead that we had written:
val mem_union (#a:eqtype) (x:a) (s1 s2:set a)
: Lemma
(ensures (mem x (union s1 s2) == (mem x s1 || mem x s2)))
[SMTPat (mem x s1); SMTPat (mem x s2)]
This translates to an SMT quantifier whose patterns are the pair of
terms mem x s1
and mem x s2
. This choice of pattern would
allow the solver to instantiate the quantifier with all pairs of
active terms of the form mem x s
, creating more active terms that
are themselves matching candidates. To be explicit, with a single
active term mem x s
, the solver would derive mem x (union s
s)
, mem x (union s (union s s))
, and so on. This is called a
matching loop and can be disastrous for solver performance. So,
carefully chosing the patterns on quantifiers and lemmas with
SMTPat
annotations is important.
Finally, to complete our interface, we provide two lemmas to
characterize equal
, the equivalence relation on sets. The first
says that sets that agree on the mem
function are equal
, and
the second says that equal
sets are provably equal (==)
, and
the patterns allow the solver to convert reasoning about equality into
membership and provable equality.
val equal_intro (#a:eqtype) (s1 s2: set a)
: Lemma
(requires (forall x. mem x s1 = mem x s2))
(ensures (equal s1 s2))
[SMTPat (equal s1 s2)]
val equal_elim (#a:eqtype) (s1 s2:set a)
: Lemma
(requires (equal s1 s2))
(ensures (s1 == s2))
[SMTPat (equal s1 s2)]
Of course, all these lemmas can be easily proven by F* under a
suitable representation of the abstract type set
, as shown in the
module implementation below.
module SimplifiedFStarSet
(** Computational sets (on eqtypes): membership is a boolean function *)
#set-options "--fuel 0 --ifuel 0"
open FStar.FunctionalExtensionality
module F = FStar.FunctionalExtensionality
let set (a:eqtype) = F.restricted_t a (fun _ -> bool)
(* destructors *)
let mem #a x s = s x
(* constructors *)
let empty #a = F.on_dom a (fun x -> false)
let singleton #a x = F.on_dom a (fun y -> y = x)
let union #a s1 s2 = F.on_dom a (fun x -> s1 x || s2 x)
let intersect #a s1 s2 = F.on_dom a (fun x -> s1 x && s2 x)
let complement #a s = F.on_dom a (fun x -> not (s x))
(* equivalence relation *)
let equal (#a:eqtype) (s1:set a) (s2:set a) = F.feq s1 s2
(* Properties *)
let mem_empty #a x = ()
let mem_singleton #a x y = ()
let mem_union #a x s1 s2 = ()
let mem_intersect #a x s1 s2 = ()
let mem_complement #a x s = ()
(* extensionality *)
let equal_intro #a s1 s2 = ()
let equal_elim #a s1 s2 = ()
Exercise
Extend the set library with another constructor with the signature shown below:
val from_fun (#a:eqtype) (f: a -> bool) : Tot (set a)
and prove a lemma that shows that a an element x
is in from_fun
f
if and only if f x = true
, decorating the lemma with the
appropriate SMT pattern.
This interface file and its implementation provides the definitions you need.
Answer
Look at FStar.Set.intension if you get stuck
Profiling Z3 and Solving Proof Performance Issues
At some point, you will write F* programs where proofs start to take much longer than you’d like: simple proofs fail to go through, or proofs that were once working start to fail as you make small changes to your program. Hopefully, you notice this early in your project and can try to figure out how to make it better before slogging through slow and unpredictable proofs. Contrary to the wisdom one often receives in software engineering where early optimization is discouraged, when developing proof-oriented libraries, it’s wise to pay attention to proof performance issues as soon as they come up, otherwise you’ll find that as you scale up further, proofs become so slow or brittle that your productivity decreases rapidly.
Query Statistics
Your first tool to start diagnosing solver performance is F*’s
--query_stats
option. We’ll start with some very simple artificial
examples.
With the options below, F* outputs the following statistics:
#push-options "--initial_fuel 0 --max_fuel 4 --ifuel 0 --query_stats"
let _ = assert (factorial 3 == 6)
(<input>(20,0-20,49)) Query-stats (SMTEncoding._test_query_stats, 1) failed
{reason-unknown=unknown because (incomplete quantifiers)} in 31 milliseconds
with fuel 0 and ifuel 0 and rlimit 2723280
statistics={mk-bool-var=7065 del-clause=242 num-checks=3 conflicts=5
binary-propagations=42 arith-fixed-eqs=4 arith-pseudo-nonlinear=1
propagations=10287 arith-assert-upper=21 arith-assert-lower=18
decisions=11 datatype-occurs-check=2 rlimit-count=2084689
arith-offset-eqs=2 quant-instantiations=208 mk-clause=3786
minimized-lits=3 memory=21.41 arith-pivots=6 max-generation=5
arith-conflicts=3 time=0.03 num-allocs=132027456 datatype-accessor-ax=3
max-memory=21.68 final-checks=2 arith-eq-adapter=15 added-eqs=711}
(<input>(20,0-20,49)) Query-stats (SMTEncoding._test_query_stats, 1) failed
{reason-unknown=unknown because (incomplete quantifiers)} in 47 milliseconds
with fuel 2 and ifuel 0 and rlimit 2723280
statistics={mk-bool-var=7354 del-clause=350 arith-max-min=10 interface-eqs=3
num-checks=4 conflicts=8 binary-propagations=56 arith-fixed-eqs=17
arith-pseudo-nonlinear=3 arith-bound-prop=2 propagations=13767
arith-assert-upper=46 arith-assert-lower=40 decisions=25
datatype-occurs-check=5 rlimit-count=2107946 arith-offset-eqs=6
quant-instantiations=326 mk-clause=4005 minimized-lits=4
memory=21.51 arith-pivots=20 max-generation=5 arith-add-rows=34
arith-conflicts=4 time=0.05 num-allocs=143036410 datatype-accessor-ax=5
max-memory=21.78 final-checks=6 arith-eq-adapter=31 added-eqs=1053}
(<input>(20,0-20,49)) Query-stats (SMTEncoding._test_query_stats, 1) succeeded
in 48 milliseconds with fuel 4 and ifuel 0 and rlimit 2723280
statistics={arith-max-min=26 num-checks=5 binary-propagations=70 arith-fixed-eqs=47
arith-assert-upper=78 arith-assert-lower=71 decisions=40
rlimit-count=2130332 max-generation=5 arith-nonlinear-bounds=2
time=0.05 max-memory=21.78 arith-eq-adapter=53 added-eqs=1517
mk-bool-var=7805 del-clause=805 interface-eqs=3 conflicts=16
arith-pseudo-nonlinear=6 arith-bound-prop=4 propagations=17271
datatype-occurs-check=5 arith-offset-eqs=20 quant-instantiations=481
mk-clause=4286 minimized-lits=38 memory=21.23 arith-pivots=65
arith-add-rows=114 arith-conflicts=5 num-allocs=149004462
datatype-accessor-ax=9 final-checks=7}
There’s a lot of information here:
We see three lines of output, each tagged with a source location and an internal query identifer (
(SMTEncoding._test_query_stats, 1)
, the first query for verifying_test_query_stats
).The first two attempts at the query failed, with Z3 reporting the reason for failure as
unknown because (incomplete quantifiers)
. This is a common response from Z3 when it fails to prove a query—since first-order logic is undecidable, when Z3 fails to find a proof, it reports “unknown” rather than claiming that the theory is satisfiable. The third attempt succeeded.The attempts used
0
,2
, and4
units of fuel. Notice that our query wasfactorial 3 == 6
and this clearly requires at least 4 units of fuel to succeed. In this case it didn’t matter much, since the two failed attempts took only47
and48
milliseconds. But, you may sometimes find that there are many attempts of a proof with low fuel settings and finally success with a higher fuel number. In such cases, you may try to find ways to rewrite your proof so that you are not relying on so many unrollings (if possible), or if you decide that you really need that much fuel, then setting the--fuel
option to that value can help avoid several slow failures and retries.The rest of the statistics report internal Z3 statistics.
The
rlimit
value is a logical resource limit that F* sets when calling Z3. Sometimes, as we will see shortly, a proof can be “cancelled” in case Z3 runs past this resource limit. You can increase therlimit
in this case, as we’ll see below.Of the remaning statistics, perhaps the main one of interest is
quant_instantiations
. This records a cumulative total of quantifiers instantiated by Z3 so far in the current session—here, each attempt seems to instantiate around 100–150 quantifiers. This is a very low number, since the query is so simple. You may be wondering why it is even as many as that, since 4 unfolding of factorial suffice, but remember that there are many other quantifiers involved in the encoding, e.g., those that prove thatBoxBool
is injective etc. A more typical query will see quantifier instantiations in the few thousands.
Note
Note, since the quant-instantiations
metric is cumulative, it
is often useful to precede a query with something like the following:
#push-options "--initial_fuel 0 --max_fuel 4 --ifuel 0 --query_stats"
#restart-solver
let _dummy = assert (factorial 0 == 1)
let _test_query_stats = assert (factorial 3 == 6)
The #restart-solver
creates a fresh Z3 process and the
dummy
query “warms up” the process by feeding it a trivial
query, which will run somewhat slow because of various
initialization costs in the solver. Then, the query stats reported
for the real test subject starts in this fresh session.
Working though a slow proof
Even a single poorly chosen quantified assumption in the prover’s context can make an otherwise simple proof take very long. To illustrate, consider the following variation on our example above:
assume Factorial_unbounded: forall (x:nat). exists (y:nat). factorial y > x
#push-options "--fuel 4 --ifuel 0 --query_stats"
#restart-solver
let _test_query_stats = assert (factorial 3 == 6)
We’ve now introduced the assumption Factorial_unbounded
into our
context. Recall from the SMT encoding of quantified formulas, from the
SMT solver’s perspective, this looks like the following:
(assert (! (forall ((@x0 Term))
(! (implies (HasType @x0 Prims.nat)
(exists ((@x1 Term))
(! (and (HasType @x1 Prims.nat)
(> (BoxInt_proj_0 (SMTEncoding.factorial @x1))
(BoxInt_proj_0 @x0)))
:qid assumption_SMTEncoding.Factorial_unbounded.1)))
:qid assumption_SMTEncoding.Factorial_unbounded))
:named assumption_SMTEncoding.Factorial_unbounded))
This quantifier has no explicit patterns, but Z3 picks the term
(HasType @x0 Prims.nat)
as the pattern for the forall
quantifier. This means that it can instantiate the quantifier for
active terms of type nat
. But, a single instantiation of the
quantifier, yields the existentially quantified formula. Existentials
are immediately skolemized by Z3, i.e., the
existentially bound variable is replaced by a fresh function symbol
that depends on all the variables in scope. So, a fresh term a @x0
corresponding @x1
is introduced, and immediately, the conjunct
HasType (a @x0) Prims.nat
becomes an active term and can be used to
instantiate the outer universal quantifier again. This “matching loop”
sends the solver into a long, fruitless search and the simple proof
about factorial 3 == 6
which previously succeeded in a few
milliseconds, now fails. Here’s are the query stats:
(<input>(18,0-18,49)) Query-stats (SMTEncoding._test_query_stats, 1) failed
{reason-unknown=unknown because canceled} in 5647 milliseconds
with fuel 4 and ifuel 0 and rlimit 2723280
statistics={ ... quant-instantiations=57046 ... }
A few things to notice:
The failure reason is “unknown because canceled”. That means the solver reached its resource limit and halted the proof search. Usually, when you see “canceled” as the reason, you could try raising the rlimit, as we’ll see shortly.
The failure took 5.6 seconds.
There were 57k quantifier instantiations, as compared to just the 100 or so we had earlier. We’ll soon seen how to pinpoint which quantifiers were instantiated too much.
Increasing the rlimit
We can first retry the proof by giving Z3 more resources—the directive below doubles the resource limit given to Z3.
#push-options "--z3rlimit_factor 2"
This time it took 14 seconds and failed. But if you try the same proof a second time, it succeeds. That’s not very satisfying.
Repeating Proofs with Quake
Although this is an artificial example, unstable proofs that work and then suddenly fail do happen. Z3 does guarantee that it is deterministic in a very strict sense, but even the smallest change to the input, e.g., a change in variable names, or even asking the same query twice in a succession in the same Z3 session, can result in different answers.
There is often a deeper root cause (in our case, it’s the
Factorial_unbounded
assumption, of course), but a first attempt at
determining whether or not a proof is “flaky” is to use the F* option
--quake
.
#push-options "--quake 5/k"
let _test_query_stats = assert (factorial 3 == 6)
This tries the query 5 times and reports the number of successes and failures.
In this case, F* reports the following:
Quake: query (SMTEncoding._test_query_stats, 1) succeeded 4/5 times (best fuel=4, best ifuel=0)
If you’re working to stabilize a proof, a good criterion is to see if
you can get the proof to go through with the --quake
option.
You can also try the proof by varying the Z3’s random seed and checking that it works with several choices of the seed.
#push-options "--z3smtopt '(set-option :smt.random_seed 1)'"
Profiling Quantifier Instantiation
We have a query that’s taking much longer than we’d like and from the query-stats we see that there are a lot of quantifier instances. Now, let’s see how to pin down which quantifier is to blame.
Get F* to log an .smt2 file, by adding the
--log_queries
option. It’s important to also add a#restart-solver
before just before the definition that you’re interested in profiling.#push-options "--fuel 4 --ifuel 0 --query_stats --log_queries --z3rlimit_factor 2" #restart-solver let _test_query_stats = assert (factorial 3 == 6)F* reports the name of the file that it wrote as part of the query-stats. For example:
(<input>(18,0-18,49)@queries-SMTEncoding-7.smt2) Query-stats ...Now, from a terminal, you run Z3 on this generated .smt2 file, while passing it the following option and save the output in a file.
z3 queries-SMTEncoding-7.smt2 smt.qi.profile=true > sample_qiprofileThe output contains several lines that begin with
[quantifier_instances]
, which is what we’re interested in.grep quantifier_instances sample_qiprofile | sort -k 4 -nThe last few lines of output look like this:
[quantifier_instances] bool_inversion : 352 : 10 : 11 [quantifier_instances] bool_typing : 720 : 10 : 11 [quantifier_instances] constructor_distinct_BoxBool : 720 : 10 : 11 [quantifier_instances] projection_inverse_BoxBool_proj_0 : 1772 : 10 : 11 [quantifier_instances] primitive_Prims.op_Equality : 2873 : 10 : 11 [quantifier_instances] int_typing : 3168 : 10 : 11 [quantifier_instances] constructor_distinct_BoxInt : 3812 : 10 : 11 [quantifier_instances] typing_SMTEncoding.factorial : 5490 : 10 : 11 [quantifier_instances] int_inversion : 5506 : 11 : 12 [quantifier_instances] @fuel_correspondence_SMTEncoding.factorial.fuel_instrumented : 5746 : 10 : 11 [quantifier_instances] Prims_pretyping_ae567c2fb75be05905677af440075565 : 5835 : 11 : 12 [quantifier_instances] projection_inverse_BoxInt_proj_0 : 6337 : 10 : 11 [quantifier_instances] primitive_Prims.op_Multiply : 6394 : 10 : 11 [quantifier_instances] primitive_Prims.op_Subtraction : 6394 : 10 : 11 [quantifier_instances] token_correspondence_SMTEncoding.factorial.fuel_instrumented : 7629 : 10 : 11 [quantifier_instances] @fuel_irrelevance_SMTEncoding.factorial.fuel_instrumented : 9249 : 10 : 11 [quantifier_instances] equation_with_fuel_SMTEncoding.factorial.fuel_instrumented : 13185 : 10 : 10 [quantifier_instances] refinement_interpretation_Tm_refine_542f9d4f129664613f2483a6c88bc7c2 : 15346 : 10 : 11 [quantifier_instances] assumption_SMTEncoding.Factorial_unbounded : 15890 : 10 : 11Each line mentions is of the form:
qid : number of instances : max generation : max costwhere,
qid is the identifer of quantifier in the .smt2 file
the number of times it was instantiated, which is the number we’re most interested in
the generation and cost are other internal measures, which Nikolaj Bjorner explains here
Interpreting the results
Clearly, as expected,
assumption_SMTEncoding.Factorial_unbounded
is instantiated the most.Next, if you search in the .smt2 file for “:qid refinement_interpretation_Tm_refine_542f9d4f129664613f2483a6c88bc7c2”, you’ll find the assumption that gives an interpretation to the
HasType x Prims.nat
predicate, where each instantiation ofFactorial_unbounded
yields another instance of this fact.Notice that
equation_with_fuel_SMTEncoding.factorial.fuel_instrumented
is also instantiated a lot. This is because aside from the matching loop due toHasType x Prims.nat
, each instantiation ofFactorial_unbounded
also yields an occurrence offactorial
as a new active term, which the solver then unrolls up to four times.We also see instantiations of quantifiers in
Prims
and other basic facts likeint_inversion
,bool_typing
etc. Sometimes, you may even find that these quantifiers fire the most. However, these quantifiers are inherent to F*’s SMT encoding: there’s not much you can do about it as a user. They are usually also not to blame for a slow proof—they fire a lot when other terms are instantiated too much. You should try to identify other quantifiers in your code or libraries that fire a lot and try to understand the root cause of that.
Z3 Axiom Profiler
The Z3 Axiom Profiler can also be used to find more detailed information about quantifier instantiation, including which terms we used for instantiation, dependence among the quantifiers in the form of instantiation chains, etc.
However, there seem to be some issues with using it at the moment with Z3 logs generated from F*.
Splitting Queries
In the next two sections, we look at a small example that Alex Rozanov reported, shown below. It exhibits similar proof problems to our artificial example with factorial. Instead of just identifying the problematic quantifier, we look at how to remedy the performance problem by revising the proof to be less reliant on Z3 quantifier instantiation.
module Alex
let unbounded (f: nat -> int) = forall (m: nat). exists (n:nat). abs (f n) > m
assume
val f : (f:(nat -> int){unbounded f})
let g : (nat -> int) = fun x -> f (x+1)
#push-options "--query_stats --fuel 0 --ifuel 0 --split_queries always"
let find_above_for_g (m:nat) : Lemma(exists (i:nat). abs(g i) > m) =
assert (unbounded f); // apply forall to m
eliminate exists (n:nat). abs(f n) > m
returns exists (i:nat). abs(g i) > m with _. begin
let m1 = abs(f n) in
assert (m1 > m); //prover hint
if n>=1 then assert (abs(g (n-1)) > m)
else begin
assert (n<=0); //arithmetics hint
eliminate exists (n1:nat). abs (f n1) > m1
returns exists (i:nat). abs(g i) > m with _.
begin
assert (n1>0);
assert (abs (g (n1-1)) > m);
()
end
end
end
The hypothesis is that unbounded f
has exactly the same problem as
the our unbounded hypothesis on factorial—the forall/exists
quantifier contains a matching loop.
This proof of find_above_for_g
succeeds, but it takes a while and
F* reports:
(Warning 349) The verification condition succeeded after splitting
it to localize potential errors, although the original non-split
verification condition failed. If you want to rely on splitting
queries for verifying your program please use the '--split_queries
always' option rather than relying on it implicitly.
By default, F* collects all the proof obligations in a top-level F* definition and presents it to Z3 in a single query with several conjuncts. Usually, this allows Z3 to efficiently solve all the conjuncts together, e.g., the proof search for one conjunct may yield clauses useful to complete the search for other clauses. However, sometimes, the converse can be true: the proof search for separate conjuncts can interfere with each other negatively, leading to the entire proof to fail even when every conjunct may be provable if tried separately. Additionally, when F* calls Z3, it applies the current rlimit setting for every query. If a query contains N conjuncts, splitting the conjuncts into N separate conjuncts is effectively a rlimit multiplier, since each query can separately consume resources as much as the current rlimit.
If the single query with several conjunct fails without Z3 reporting any further information that F* can reconstruct into a localized error message, F* splits the query into its conjuncts and tries each of them in isolation, so as to isolate the failing conjunct it any. However, sometimes, when tried in this mode, the proof of all conjuncts can succeed.
One way to respond to Warning 349 is to follow what it says and enable
--split_queries always
explicitly, at least for the program fragment in
question. This can sometimes stabilize a previously unstable
proof. However, it may also end up deferring an underlying
proof-performance problem. Besides, even putting stability aside,
splitting queries into their conjuncts results in somewhat slower
proofs.
Taking Control of Quantifier Instantiations with Opaque Definitions
Here is a revision of Alex’s program that addresses the quantifier instantiation problem. There are a few elements to the solution.
[@@"opaque_to_smt"]
let unbounded (f: nat → int) = forall (m: nat). exists (n:nat). abs (f n) > m
let instantiate_unbounded (f:nat → int { unbounded f }) (m:nat)
: Lemma (exists (n:nat). abs (f n) > m)
= reveal_opaque (`%unbounded) (unbounded f)
assume
val f : (z:(nat → int){unbounded z})
let g : (nat -> int) = fun x -> f (x+1)
#push-options "--query_stats --fuel 0 --ifuel 0"
let find_above_for_g (m:nat) : Lemma(exists (i:nat). abs(g i) > m) =
instantiate_unbounded f m;
eliminate exists (n:nat). abs(f n) > m
returns exists (i:nat). abs(g i) > m with _. begin
let m1 = abs(f n) in
if n>=1 then assert (abs(g (n-1)) > m)
else begin
instantiate_unbounded f m1;
eliminate exists (n1:nat). abs (f n1) > m1
returns exists (i:nat). abs(g i) > m with _.
begin
assert (abs (g (n1-1)) > m)
end
end
end
Marking definitions as opaque
The attribute
[@@"opaque_to_smt"]
on the definition ofunbounded
instructs F* to not encode that definition to the SMT solver. So, the problematic alternating quantifier is no longer in the global scope.Selectively revealing the definition within a scope
Of course, we still want to reason about the unbounded predicate. So, we provide a lemma,
instantiate_unbounded
, that allows the caller to explicity instantiate the assumption thatf
is unbounded on some lower boundm
.To prove the lemma, we use
FStar.Pervasives.reveal_opaque
: its first argument is the name of a symbol that should be revealed; its second argument is a term in which that definition should be revealed. It this case, it proves thatunbounded f
is equal toforall m. exists n. abs (f n) > m
.With this fact available in the local scope, Z3 can prove the lemma. You want to use
reveal_opaque
carefully, since with having revealed it, Z3 has the problematic alternating quantifier in scope and could go into a matching loop. But, here, since the conclusion of the lemma is exactly the body of the quantifier, Z3 quickly completes the proof. If even this proves to be problematic, then you may have to resort to tactics.Explicitly instantiate where needed
Now, with our instantiation lemma in hand, we can precisly instantiate the unboundedness hypothesis on
f
as needed.In the proof, there are two instantiations, at
m
andm1
.Note, we are still relying on some non-trivial quantifier instantiation by Z3. Notably, the two assertions are important to instantiate the existential quantifier in the
returns
clause. We’ll look at that in more detail shortly.But, by making the problematic definition opaque and instantiating it explicitly, our performance problem is gone—here’s what query-stats shows now.
(<input>(18,2-31,5)) Query-stats (AlexOpaque.find_above_for_g, 1) succeeded in 46 milliseconds
This wiki page provides more information on selectively revealing opaque definitions.
Other Ways to Explicitly Trigger Quantifiers
For completeness, we look at some other ways in which quantifier instantiation works.
An Artificial Trigger
Instead of making the definition of unbounded
opaque, we could
protect the universal quantifier with a pattern using some symbol
reserved for this purpose, as shown below.
let trigger (x:int) = True
let unbounded_alt (f: nat → int) = forall (m: nat). {:pattern (trigger m)} (exists (n:nat). abs (f n) > m)
assume
val ff : (z:(nat → int){unbounded_alt z})
let gg : (nat -> int) = fun x -> ff (x+1)
#push-options "--query_stats --fuel 0 --ifuel 0"
let find_above_for_gg (m:nat) : Lemma(exists (i:nat). abs(gg i) > m) =
assert (unbounded_alt ff);
assert (trigger m);
eliminate exists (n:nat). abs(ff n) > m
returns exists (i:nat). abs(gg i) > m with _. begin
let m1 = abs(ff n) in
if n>=1 then assert (abs(gg (n-1)) > m)
else begin
assert (trigger m1);
eliminate exists (n1:nat). abs (ff n1) > m1
returns exists (i:nat). abs(gg i) > m with _.
begin
assert (abs (gg (n1-1)) > m)
end
end
end
We define a new function
trigger x
that is trivially true.In
unbounded_alt
we decorate the universal quantifier with an explicit pattern,{:pattern (trigger x)}
. The pattern is not semantically relevant—it’s only there to control how the quantifier is instantiatedIn
find_above_for_gg
, whenever we want to instantiate the quantifier with a particular lower boundk
, we asserttrigger k
. That gives Z3 an active term that mentionstrigger
which it then uses to instantiate the quantifier with our choice ofk
.
This style is not particularly pleasant, because it involves polluting our definitions with semantically irrelevant triggers. The selectively revealing opaque definitions style is much preferred. However, artificial triggers can sometimes be useful.
Existential quantifiers
We have an existential formula in the goal exists (i:nat). abs(g i)
> m
and Z3 will try to solve this by finding an active term to
instantiate i
. In this case, the patterns Z3 picks is (g i)
as
well the predicate (HasType i Prims.nat)
, which the SMT encoding
introduces. Note, F* does not currently allow the existential
quantifier in a returns
annoation to be decorated with a
pattern—that will likely change in the future.
Since g i
is one of the patterns, by asserting abs (g (n - 1)) >
m
in one branch, and abs (g (n1 - 1)) > m
in the other, Z3 has
the terms it needs to instantiate the quantifier with n - 1
in one
case, and n1 - 1
in the other case.
In fact, any assertion that mentions the g (n - 1)
and
g (n1 - 1)
will do, even trivial ones, as the example below shows.
let find_above_for_g1 (m:nat) : Lemma(exists (i:nat). abs(g i) > m) =
instantiate_unbounded f m;
eliminate exists (n:nat). abs(f n) > m
returns exists (i:nat). abs(g i) > m with _. begin
let m1 = abs(f n) in
if n>=1 then assert (trigger (g (n-1)))
else begin
instantiate_unbounded f m1;
eliminate exists (n1:nat). abs (f n1) > m1
returns exists (i:nat). abs(g i) > m with _.
begin
assert (trigger (g (n1-1)))
end
end
end
We assert trigger (g (n - 1)))
and trigger (g (n1 - 1))
, this
gives Z3 active terms for g (n - 1))
and g (n1 - 1)
, which
suffices for the instantiation. Note, asserting trigger (n - 1)
is
not enough, since that doesn’t mention g
.
However, recall that there’s a second pattern that’s also applicable
(HasType i Prims.nat)
–we can get Z3 to instantiate the quantifier
if we can inject the predicate (HasType (n - 1) nat)
into Z3’s
context. By using trigger_nat
, as shown below, does the trick,
since F* inserts a proof obligation to show that the argument x
in
trigger_nat x
validates (HasType x Prims.nat)
.
let trigger_nat (x:nat) = True
let find_above_for_g2 (m:nat) : Lemma(exists (i:nat). abs(g i) > m) =
instantiate_unbounded f m;
eliminate exists (n:nat). abs(f n) > m
returns exists (i:nat). abs(g i) > m with _. begin
let m1 = abs(f n) in
if n>=1 then assert (trigger_nat (n-1))
else begin
instantiate_unbounded f m1;
eliminate exists (n1:nat). abs (f n1) > m1
returns exists (i:nat). abs(g i) > m with _.
begin
assert (trigger_nat (n1-1))
end
end
end
Of course, rather than relying on implicitly chosen triggers for the
existentials, one can be explicit about it and provide the instance
directly, as shown below, where the introduce exists ...
in each
branch directly provides the witness rather than relying on Z3 to find
it. This style is much preferred, if possible, than relying implicit
via various implicitly chosen patterns and artificial triggers.
let find_above_for_g' (m:nat) : Lemma(exists (i:nat). abs(g i) > m) =
instantiate_unbounded f m;
eliminate exists (n:nat). abs(f n) > m
returns _ // exists (i:nat). abs(g i) > m
with _. begin
let m1 = abs(f n) in
if n>=1 then (
introduce exists (i:nat). abs(g i) > m
with (n - 1)
and ()
)
else begin
instantiate_unbounded f m1;
eliminate exists (n1:nat). abs (f n1) > m1
returns _ //exists (i:nat). _ abs(g i) > m
with _.
begin
introduce exists (i:nat). abs (g i) > m
with (n1 - 1)
and ()
end
end
end
Here is a link to the the full file with all the variations we have explored.
Overhead due to a Large Context
Consider the following program:
module T = FStar.Tactics
module B = LowStar.Buffer
module SA = Steel.Array
open FStar.Seq
#push-options "--query_stats"
let warmup1 (x:bool { x == true }) = assert x
let test1 (a:Type) (s0 s1 s2: seq a)
: Lemma (Seq.append (Seq.append s0 s1) s2 `Seq.equal`
Seq.append s0 (Seq.append s1 s2))
= ()
The lemma test1
is a simple property about FStar.Seq
, but the
lemma occurs in a module that also depends on a large number of other
modules—in this case, about 177 modules from the F* standard
library. All those modules are encoded to the SMT solver producing
about 11MB of SMT2 definitions with nearly 20,000 assertions for the
solver to process. This makes for a large search space for the solver
to explore to find a proof, however, most of those assertions are
quantified formulas guarded by patterns and they remain inert unless
some active term triggers them. Nevertheless, all these definitions
impose a noticeable overhead to the solver. If you turn
--query_stats
on (after a single warm-up query), it takes Z3 about
300 milliseconds (and about 3000 quantifier instantiations) to find a
proof for test1
.
You probably won’t really notice the overhead of a proof that takes 300 milliseconds—the F* standard library doesn’t have many quantifiers in scope with things like bad quantifier alternation that lead to matching loops. However, as your development starts to depend on an ever larger stack of modules, there’s the danger that at some point, your proofs are impacted by some bad choice of quantifiers in some module that you have forgotten about. In that case, you may find that seemingly simple proofs take many seconds to go through. In this section, we’ll look at a few things you can do to diagnose such problems.
Filtering the context
The first thing we’ll look at is an F* option to remove facts from the context.
#push-options "--using_facts_from 'Prims FStar.Seq'"
let warmup2 (x:bool { x == true }) = assert x
let test2 (a:Type) (s0 s1 s2: seq a)
: Lemma (Seq.append (Seq.append s0 s1) s2 `Seq.equal`
Seq.append s0 (Seq.append s1 s2))
= ()
The --using_facts_from
option retains only facts from modules that
match the namespace-selector string provided. In this case, the
selector shrinks the context from 11MB and 20,000 assertions to around
1MB and 2,000 assertions and the query stats reports that the proof
now goes through in just 15 milliseconds—a sizeable speedup even
though the absolute numbers are still small.
Of course, deciding which facts to filter from your context is not
easy. For example, if you had only retained FStar.Seq
and forgot
to include Prims
, the proof would have failed. So, the
--using_facts_from
option isn’t often very useful.
Unsat Core and Hints
When Z3 finds a proof, it can report which facts from the context were relevant to the proof. This collection of facts is called the unsat core, because Z3 has proven that the facts from the context and the negated goal are unsatisfiable. F* has an option to record and replay the unsat core for each query and F* refers to the recorded unsat cores as “hints”.
Here’s how to use hints:
Record hints
fstar.exe --record_hints ContextPollution.fst
This produces a file called
ContextPollution.fst.hints
The format of a hints file is internal and subject to change, but it is a textual format and you can roughly see what it contains. Here’s a fragment from it:
[ "ContextPollution.test1", 1, 2, 1, [ "@MaxIFuel_assumption", "@query", "equation_Prims.nat", "int_inversion", "int_typing", "lemma_FStar.Seq.Base.lemma_eq_intro", "lemma_FStar.Seq.Base.lemma_index_app1", "lemma_FStar.Seq.Base.lemma_index_app2", "lemma_FStar.Seq.Base.lemma_len_append", "primitive_Prims.op_Addition", "primitive_Prims.op_Subtraction", "projection_inverse_BoxInt_proj_0", "refinement_interpretation_Tm_refine_542f9d4f129664613f2483a6c88bc7c2", "refinement_interpretation_Tm_refine_ac201cf927190d39c033967b63cb957b", "refinement_interpretation_Tm_refine_d83f8da8ef6c1cb9f71d1465c1bb1c55", "typing_FStar.Seq.Base.append", "typing_FStar.Seq.Base.length" ], 0, "3f144f59e410fbaa970cffb0e20df75d" ]
This is the hint entry for the query with whose id is
(ContextPollution.test1, 1)
The next two fields are the fuel and ifuel used for the query,
2
and1
in this case.Then, we have the names of all the facts in the unsat core for this query: you can see that it was only about 20 facts that were needed, out of the 20,000 that were originally present.
The second to last field is not used—it is always 0.
And the last field is a hash of the query that was issued.
Replaying hints
The following command requests F* to search for
ContextPollution.fst.hints
in the include path and when attempting to prove a query with a given id, it looks for a hint for that query in the hints file, uses the fuel and ifuel settings present in the hints, and prunes the context to include only the facts present in the unsat core.fstar.exe --use_hints ContextPollution.fst
Using the hints usually improves verification times substantially, but in this case, we see that the our proof now goes through in about 130 milliseconds, not nearly as fast as the 15 milliseconds we saw earlier. That’s because when using a hint, each query to Z3 spawns a new Z3 process initialized with just the facts in the unsat core, and that incurs some basic start-up time costs.
Many F* projects use hints as part of their build, including F*’s standard library. The .hints files are checked in to the repository and are periodically refreshed as proofs evolve. This helps improve the stability of proofs: it may take a while for a proof to go through, but once it does, you can record and replay the unsat core and subsequent attempts of the same proof (or even small variations of it) can go through quickly.
Other projects do not use hints: some people (perhaps rightfully) see hints as a way of masking underlying proof performance problems and prefer to make proofs work quickly and robustly without hints. If you can get your project to this state, without relying on hints, then so much the better for you!
Differential Profiling with qprofdiff
If you have a proof that takes very long without hints but goes through quickly with hints, then the hints might help you diagnose why the original proof was taking so long. This wiki page describes how to compare two Z3 quantifier instantiation profiles with a tool that comes with Z3 called qprofdiff.
Hints that fail to replay
Sometimes, Z3 will report an unsat core, but when F* uses it to try to replay a proof, Z3 will be unable to find a proof of unsat, and F* will fall back to trying the proof again in its original context. The failure to find a proof of unsat from a previously reported unsat core is not a Z3 unsoundness or bug—it’s because although the report core is really logically unsat, finding a proof of unsat may have relied on quantifier instantiation hints from facts that are not otherwise semantically relevant. The following example illustrates.
module HintReplay
assume
val p (x:int) : prop
assume
val q (x:int) : prop
assume
val r (x:int) : prop
assume P_Q : forall (x:int). {:pattern q x} q x ==> p x
assume Q_R : forall (x:int). {:pattern p x} q x ==> r x
let test (x:int { q x }) = assert (r x)
Say you run the following:
fstar --record_hints HintReplay.fst
fstar --query_stats --use_hints HintReplay.fst
You will see the following output from the second run:
(HintReplay.fst(15,27-15,39)) Query-stats (HintReplay.test, 1) failed
{reason-unknown=unknown because (incomplete quantifiers)} (with hint)
in 42 milliseconds ..
(HintReplay.fst(15,27-15,39)) Query-stats (HintReplay.test, 1) succeeded
in 740 milliseconds ...
The first attempt at the query failed when using the hint, and the second attempt at the query (without the hint) succeeded.
To see why, notice that to prove the assertion r x
from the
hypothesis q x
, logically, the assumption Q_R
suffices. Indeed, if you look in the hints file, you will see that it
only mentions HintReplay.Q_R
as part of the logical core. However,
Q_R
is guarded by a pattern p x
and in the absence of the
assumption P_Q
, there is no way for the solver to derive an active
term p x
to instantiate Q_R
—so, with just the unsat core, it
fails to complete the proof.
Failures for hint replay usually point to some unusual quantifier
triggering pattern in your proof. For instance, here we used p x
as a pattern, even though p x
doesn’t appear anywhere in
Q_R
—that’s not usually a good choice, though sometimes, e.g.,
when using artificial triggers it can
come up.
This wiki page on hints provides more information about diagnosing hint-replay failures, particularly in the context of the Low* libraries.