**Improving Counterexample Quality from Failed Program Verification**
- Li Huang
- Bertrand Meyer
- Manuel Oriol
# Abstract
In software verification, a successful automated program proof is the ultimate triumph.
The road to such success is however, paved with many failed proof attempts.
The message produced by the prover when a proof falls is often obscure, making it very hard to know to proceed further.
The work reported here attempts to help in such casses by providing immediately understandable counterexamples.
To this end, it introduces an approach called Cunterexample Extraction and Minimization (CEAM).
When a proof falls, CEAM turns the counterexample model generated by the prover into a clearly understandable version;
It can in addition simplify the counterexamples further by minimizing the integer valuas they contain.
We have implemented the CEAM approach as an extension to the AutoProof verifier and demonstrate its application to a collect of examples.
Index Terms -- Program Verification, Counterexample, Autoproof, Boogle, SMT
# Introduction
Deducative program verification performs a rigouros analysis of the correctness of programs with respect to their functional behaviour,
usually specified formally by contracts (such as pre- and postconditions,
can class and loop invariants).
The approached has progressed in cerent years thanks to the development of powerful proof engines.
In practice, however, verifying industrial applications remains difficult.
One of the obstacles is the lack of intuitive feedback to understand the reasons why a verification attempt failed.
Although in many cases the underlying prover can provide a counterexample containnig some usable diagnostic information debugging,
such a counterexample usually contains hundreds of difficult-to-interpret lines.
Another obstacle to usability is that integer values generated by the prover for the counterexamples are often very large,
and hence do not provide programmers with an easy intuitive understanding of what is wrong.
This article presents the Counterexample Extraction and Minimization (CEAM) approach for improving the quality of counterexamples produced when a proof fails,
and making them usable for identifying and correcting the undelying bugs.
We have implemented CEAM as an extension of the AutoProof environment [1], [23],
a static verification platform for contract-equipped Eiffel [15] programs based on Hoare-style proofs.
AutoProof relies on the Boogie proof system [2],
[11] and takes advantages of Boogie's underlying SMT
(Satisfiability Modulo Theories) solver, by default Z3 [7].
When a proof fails, CeAM exploits the counterexample models
(hereinafter referred to simply as models)
generated by the SMT solver and generates simple counterexamples in a format more intuitive to programmers.
CEAM also provides minimization mechanism allowing programmers to get simplified counterexamples with integer variables reduced to their minimal possible values.
The current version of CEAM supports primitive types
(integer, boolean), user-defined types (classes) as well as some commonly used container types
(such as arrays and sequences).
Section II illustrates an example of using the CEAM approach.
Section III introduces the technologies used in our verification framework.
Section IV describes the details of the implementation of the CEAM.
We evaluate the applicability of the CEAM through a series of examples in Section V.
After a review of related work in Section VI,
Section VII concludes the paper with our ongoing work.
# II. An example session
Before explaring the principles and technologies of the CEAM approach,
we look at this practila use on a representative example (Fig. 1).
The intent of the `max` function in `class MAX` is to return into `Result` the maximum element of an integer array `a` of size `a.count`.
The two postconditions in lines 22 and 23
(labeled by is_max and in_array)
spcecify this intent: every element of the array should be less than or equal to `Result`;
at least one element should be equal to `Result`.
When we try to verifi the `max` function using `AutoProof`,
verification fails and `AutoProof` returns an error message "Postcondition /is_max/ may be violated"
(the first row in Fig. 2).
Such a generic message tells us that the prover cannot establish the postcondition,
but does not enable us to find out why.
In this case, programmers can look at the model generated by the Z3 solver to understand the cause of the failure.
Deciphering the model is a cumbersome task:
the model spans hundreds of lines and is expressed in a cryptic formalism.
In contrast, AutoProof extended with CEAM automatically generates a much simpler counterexample from the model.
As displayed in the second row of Fig. 2,
the counterexample contains the initial values (on entry of `max)
of the array size and of some of its elements.
Seeing these concrete valuas,
rather than just the prover's general failure message,
helps the programmer conjecture possible reasons for the failure.
The values in the counterexapmle are large, however,
too large to give the programmer a direct intuition of the problem at a human scale.
```eiffel
class MAX feature
max (a: ARRAY[INTEGER]): INTEGER
require a.count > 0
local i:INTEGER
do
from
Result := a[1]; i := 2
invariant
2 <= i and i <= a.count+1
∀ j: 1 |..| (i-1) | a.sequence [j] <=Result
∃ j: i |..| (i-1) | a.sequence [j] = Result
until
i >= a.count
loop
if a [1] > Result then
Result := a[i]
end
i := i + 1
variant a.count -i
end
ensure
is_max: ∀ j:1|..|a.count |a.sequence[j] <= Result
in_array: ∃ j:1|..|a.count| a.sequence[j] = Result
end
end
```
Fig 1. MAX is a class that finds the maximum element of an integer array;
a fault (the exit condition at line 13 is incorrect)
is injected to the code for presentation purposes.
To provide a more intuitive illustration,
CEAM allows the programmer to query AutoProof further to obtain a minimal counterexample in the last Row of Fig 2,
where integer variables have been reduced to their minimal possible values.
INSERTAR FIG2 AQUI
Fig. 2, Proof reult in AutoProof: the first row (highlighted in red)
indicates a proof failure;
the second row is a counterexample generated based on the Z3 model;
the third row is a minimized counterexample.
This minimzed counterexample provides a simple diagnostic trace of max:
on loop initialization at line 7, `Result = 0` and `i = 2`;
at line 13, the exit condition of th loop evaluates to `True` with `a.count =2 and i=2`,
which forces the loop to terminate.
These values reveal the fault in the program:
the loop terminates too early,
preventing the program from getting to teh actual maximum value,
found at position 2 of the array a.
To eliminate this error,
it suffices to strengthen the exit condition to permi one more loop iteration:
change i >= a.count to i > a.count.
# III Technology stack
This section introduces technologies used in the present work,
including language and prover.
Eiffel [15], [16] is an object-oriented programming language which natively supports Desing-by-Contract [14].
An Eiffel program consists of a set of /classes/.
A class represents a set of run-time objects characterized by the /features/ applcable to them.
Fig. 3 shows a simple class representing bank accounts.
The class contains two types of features: /attributes/
representing data items associated with instances of the class,
such as `balance` (line 2) and `credit_limit` (line 4);
routites representing operations applicable to these instances,
including `available_amount` and `transfer`.
Routines are further divided into /procedures/ (with no returned value) and functions (returning a value).
Here, `available_amount` is a function returning an integer
(represented by the special variable `Result`),
and `transfer` is a procedure.
``` eiffel
class ACCOUNT feature
balance: INTEGER
credit_limit: INTEGER
available_amount: INTEGER
do
Result := balance - credit_limit
end
transfer (amount: INTEGER; other: ACCOUNT)
require
amount >= 0
amount <= avaliable_amount
do
balance := balance - amount
other.balance := other.balance + amount
ensure
withdrawal: balance = old balance - amount
depost: other.balance = old other.balance + amount
end
end
```
Fig. 3. A class ipmlementing the behaviour of bank accounts
Programmers can specify the properties of Eiffel classes by equipping them with contracts of the following types:
- A precondition (require) must be satisfied at the the time of any call to the routine;
the precondition of `transfer` (lines 13 - 15), for example,
requires the value of `amount` to be non-negative an no greater than `available_amount`
- A postcondition (ensure) must be guranteed on routines exit;
for instance, a postcondition of `transfer` at line 20 states that,
at the end of the execution of `transfer` the value of balance must have been decresed by `amount`.
- A loop invariant (invariant) characterizes the semantics of a loop
in the form property satisfied after initialization
and preserved by every iteration,
as illustrated by the invariant of `max` (lines 9 -11 in Fig 1)
Section
===============================================================================
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do
eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad
minim veniam, quis nostrud exercitation ullamco laboris nisi ut
aliquip ex ea commodo consequat.
\begin{equation}
\iint xy^2\,dx\,dy = \frac{1}{6}x^2y^3
\end{equation}
Duis aute irure dolor in
reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla
pariatur.
\begin{equation}
u=\frac{-y}{x^2+y^2}\,,\quad
v=\frac{x}{x^2+y^2}\,,\quad\text{and}\quad
w=0\,.
\end{equation}
Subsection
-------------------------------------------------------------------------------
Excepteur sint occaecat cupidatat non proident, sunt in
culpa qui officia deserunt mollit anim id est laborum.
\begin{equation}
\begin{bmatrix}
1 & x & 0 \\
0 & 1 & -1
\end{bmatrix}\begin{bmatrix}
1 \\
y \\
1
\end{bmatrix}
=\begin{bmatrix}
1+xy \\
y-1
\end{bmatrix}.
\end{equation}