1. Introduction
Craig's interpolation theorem [1] is a classical, well-known result in first-order logic; it says that whenever an implication
$$\phi \to \psi \qquad (1)$$
is valid, then there exists a formula $\theta$ such that the implications
$$\phi \to \theta, \qquad \theta \to \psi \qquad (2)$$
are valid too, and the formula $\theta$ contains at most the symbols occurring both in $\phi$ and in $\psi$. This theorem has been largely investigated both in propositional and predicate logics; a renewed interest in it has come from recent applications in verification [2,3,4]. The reason why interpolation became important in verification is that it helps to discover, in a completely automatic way, new predicates that might contribute to the construction of invariants. In fact, many model-checking problems are infinite-state, which means that the language needed in order to build, e.g., safety invariants, is quite rich and far from requiring only finitely many formulae up to logical equivalence. Popular methods for synthesizing invariants analyze spurious error traces. Suppose that the system to be verified is specified via a triple given by a tuple of variables, a formula describing the initial states, and a formula describing the state evolutions; suppose also that we are given a further formula describing the undesired 'error' states. Then, the system under examination cannot reach an error configuration in n steps iff the n-step unfolding of the transition formula, conjoined with the initial-state and error formulae, is not satisfiable (in the models of a suitable theory T). From the unsatisfiability proof, taking an interpolant, say, at the i-th iteration, one can produce a formula which, modulo T, is implied by the prefix of the unfolding up to step i and is inconsistent with the remaining suffix; both facts are expressed by T-valid implications (i.e., implications true in all models of T), spelled out below. This formula (and the literals it contains) can contribute to the refinement of the current candidate safety invariant. This fact is exploited in different ways during invariant search; it can also be combined with orthogonal techniques in existing implementations, as witnessed by a rich literature; see, e.g., [4,5,6,7,8,9,10], among many other contributions.
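To make the shape of these formulae concrete, here is a standard way of spelling out the bounded-reachability check and the role of the interpolant; the symbols $I$, $\tau$, $E$, $\theta$ are illustrative names (not fixed by this survey) for the initial-state, transition, error, and interpolant formulae, respectively. The n-step check is the unsatisfiability of
$$I(\underline{x}^0) \wedge \tau(\underline{x}^0, \underline{x}^1) \wedge \cdots \wedge \tau(\underline{x}^{n-1}, \underline{x}^n) \wedge E(\underline{x}^n),$$
and an interpolant $\theta$ taken at the i-th iteration satisfies, modulo T,
$$I(\underline{x}^0) \wedge \tau(\underline{x}^0, \underline{x}^1) \wedge \cdots \wedge \tau(\underline{x}^{i-1}, \underline{x}^i) \to \theta(\underline{x}^i), \qquad \theta(\underline{x}^i) \wedge \tau(\underline{x}^i, \underline{x}^{i+1}) \wedge \cdots \wedge E(\underline{x}^n) \to \bot,$$
so that $\theta$ over-approximates the states reachable in i steps while excluding the error states reachable in the remaining $n - i$ steps.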
One major problem encountered in the above applications concerns the shape of the interpolant. Usually, one considers implications such as (1), which are valid in all models of a given first-order theory T; but, in general, there is no guarantee that if $\phi$ and $\psi$ are both quantifier-free, then there is an interpolant $\theta$ such that the implications (2) are T-valid and such that $\theta$ is also quantifier-free. This is crucial because, very often, first-order theories commonly used in verification have a decidable quantifier-free fragment but are undecidable outside that fragment (this is the case, for instance, of the McCarthy theory of arrays; see Section 3 below); even if general first-order satisfiability remains decidable, the computational cost of a satisfiability test may increase considerably when moving from the quantifier-free fragment to arbitrary first-order formulae (this is the case, for instance, of Presburger arithmetic). This is why considerable effort has been put into designing theory-specific interpolation algorithms operating at the quantifier-free level [11,12,13,14,15,16,17] and into identifying suitable variants of theories axiomatizing common datatypes that enjoy quantifier-free interpolation [18,19,20].
Still, knowing that an isolated theory by itself has quantifier-free interpolation might not be sufficient for applications; in common benchmarks, it happens that arrays, sets, lists, etc. are always arrays, sets, lists of something (booleans, integers, reals, etc.), so that one must be sure that quantifier-free interpolation transfers to combined theories. This will be the main subject of the first part of the present survey paper.
In the second part of the paper, we consider a strong form of Craig interpolation, namely, uniform interpolation. We recall here what uniform interpolants are in the context of the quantifier-free fragment of a first-order theory T. We use notations such as $\phi(\underline{x})$ to say that at most the variables from the tuple $\underline{x}$ occur freely in $\phi$. Given a quantifier-free formula $\phi(\underline{e}, \underline{y})$, a uniform interpolant of $\phi$ (w.r.t. $\underline{e}$) is a quantifier-free formula $\phi'(\underline{y})$ satisfying the following two properties:
- $\phi(\underline{e}, \underline{y}) \to \phi'(\underline{y})$ is T-valid;
- For any further quantifier-free formula $\theta(\underline{y}, \underline{z})$ such that $\phi(\underline{e}, \underline{y}) \to \theta(\underline{y}, \underline{z})$ is T-valid, we have that the implication $\phi'(\underline{y}) \to \theta(\underline{y}, \underline{z})$ is T-valid, too.
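As a simple illustration (ours, not taken from the survey), consider the pure equality theory over a signature with a single unary function symbol f and the formula
$$\phi(e, y_1, y_2) \;\equiv\; y_1 = f(e) \wedge y_2 = f(e), \qquad \phi'(y_1, y_2) \;\equiv\; y_1 = y_2.$$
Indeed, $\phi$ entails $y_1 = y_2$; conversely, any quantifier-free $\theta(\underline{y}, \underline{z})$ entailed by $\phi$ is already entailed by $y_1 = y_2$, since a model where $y_1 = y_2$ holds can be extended with a fresh element $e_0$ on which f is defined by $f(e_0) = y_1$, and quantifier-free formulae are preserved along embeddings.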
Whenever uniform interpolants exist, one can compute an interpolant for an entailment such as $\phi(\underline{e}, \underline{y}) \to \theta(\underline{y}, \underline{z})$ in a way that is independent of $\theta$. Uniform interpolants have been widely studied in the context of non-classical propositional logics (a non-exhaustive list includes [21,22,23,24,25,26,27,28]). In the last decade, the automated reasoning community has also developed an increasing interest in uniform interpolants, this time for quantifier-free fragments of first-order theories [29,30]; in this literature, uniform interpolants are often called 'covers', but the two definitions are equivalent. In these contributions, examples of computations of uniform interpolants were supplied, and some algorithms were also sketched. The first formal proofs of the existence of uniform interpolants in $\mathcal{EUF}$ (the theory of pure equality in an arbitrary signature) were, however, published only in [31,32]. The usefulness of uniform interpolants in model checking was already stressed in [30] and further motivated by our recent line of research on the verification of data-aware processes [31,33,34,35]. In such applications, combination problems obviously arise, so in Section 5 below, we investigate transfer problems for uniform quantifier-free interpolation.
Structure of the Paper
Section 2 fixes notation and recalls basic definitions.
Section 3 investigates quantifier-free interpolation for combined theories in disjoint signatures and the semantic counterparts related to quantifier-free interpolation: amalgamation, strong amalgamation, and definability properties.
Section 4 extends this analysis to the case of non-disjoint signature theories and shows interesting applications to modal logic.
Section 5 introduces uniform interpolants, discusses their existence in $\mathcal{EUF}$, and shows how to transfer them to combined convex theories.
Section 6 concludes.
The paper is conceived as a survey paper, principally addressed to a mathematical audience; however, it should be taken into account that many motivations and examples that suggested the results included in the paper come from the software verification area; hence, we give at least some sketches of algorithmic aspects. Proofs are omitted (sometimes in favor of intuitive justifications); however, they can all be found in the original papers. More precisely, proofs of the results from
Section 3 are in [
36], proofs of the results from
Section 4 are in [
37], and proofs of the results from
Section 5 are in [
31,
32,
38,
39,
40].
2. Preliminaries
We assume that the reader is familiar with the basic notions concerning first-order logic; this includes syntactic notions such as signature, variable, term, atom, literal, formula, and sentence and semantic notions such as structure, substructure, truth, satisfiability, and validity. The equality symbol "=" is considered a logical symbol and, hence, is included in all signatures considered below; to exclude limit cases, we assume that our signatures always contain at least one individual constant symbol. When we use notations such as $E(\underline{x})$, we mean that the expression (term, literal, formula, etc.) E contains free variables only from the tuple $\underline{x}$. Concerning 'tuples', we make an important convention: When we speak of a 'tuple of variables', the tuple is meant to represent an arity; hence, it is assumed not to contain repetitions. The same convention does not apply when we speak of a 'tuple of terms', which might consequently contain repetitions. These conventions are useful for substitutions; we use them when denoting with $\phi(\underline{t}/\underline{x})$ the formula obtained from $\phi(\underline{x})$ by the simultaneous replacement of the 'tuple of variables' $\underline{x}$ with the 'tuple of terms' $\underline{t}$. For similar reasons, whenever we use a notation such as $\phi(\underline{x}, \underline{y})$, we assume not only that the tuples $\underline{x}$ and $\underline{y}$ are made of pairwise distinct elements, but also that $\underline{x}$ and $\underline{y}$ are disjoint as sets. A formula is universal (existential) iff it is obtained from a quantifier-free formula by prefixing it with a string of universal (or existential) quantifiers. A formula is ground iff it does not contain occurrences of variables (neither free nor bound).
From the semantic side, we refer to standard model-theoretic terminology [41] for basic notions such as structures, embeddings, diagrams, etc. $\Sigma$-structures are indicated with calligraphic letters $\mathcal{M}, \mathcal{N}, \ldots$; the support of a $\Sigma$-structure $\mathcal{M}$ is indicated with $|\mathcal{M}|$. A theory T in a signature $\Sigma$ is a set of $\Sigma$-sentences; the models of T are those $\Sigma$-structures in which all the sentences in T are true. A $\Sigma$-formula $\phi$ is T-satisfiable (or T-consistent) if there exists a model $\mathcal{M}$ of T such that $\phi$ is true in $\mathcal{M}$ under a suitable assignment $a$ to the free variables of $\phi$ (in symbols, $\mathcal{M} \models \phi(a)$); it is T-valid (in symbols, $\models_T \phi$) if its negation is not T-satisfiable. A theory T is universal iff all sentences in T are universal. A theory T admits quantifier elimination iff, for every formula $\phi(\underline{x})$, there is a quantifier-free formula $\phi'(\underline{x})$ such that $\models_T \phi \leftrightarrow \phi'$ (since we work in a computational logic context, we consider it part of the definition of a theory enjoying quantifier elimination that such a $\phi'$ is effectively computable from $\phi$). A formula $\phi$ T-entails a formula $\psi$ if the implication $\phi \to \psi$ is T-valid (in symbols, $\phi \vdash_T \psi$, or simply $\phi \vdash \psi$ when T is clear from the context). If $\Theta$ is a set of formulæ and $\phi$ is a formula, the notation $\Theta \vdash_T \phi$ means that there are $\theta_1, \ldots, \theta_n \in \Theta$ such that $\theta_1 \wedge \cdots \wedge \theta_n \vdash_T \phi$. The satisfiability modulo the theory T (SMT(T)) problem amounts to establishing the T-satisfiability of quantifier-free $\Sigma$-formulæ. Some theories have special standard names in the SMT literature (some of these names will be recalled during the paper). For pure equality theory, our conventions are as follows. We shall call $\mathcal{EUF}(\Sigma)$ the pure equality theory in the signature $\Sigma$; we may also use just $\mathcal{EUF}$ instead of $\mathcal{EUF}(\Sigma)$ in case there is no need to specify the signature $\Sigma$; however, in that case, $\Sigma$ is assumed to be proper, i.e., it must contain (besides free constants) at least a predicate or a function symbol different from equality.
2.1. Combinations of Theories
Stable infiniteness is a semantic ingredient often occurring in combination results; the requirement is rather mild, and most theories used in verification, such as theories axiomatizing fragments of arithmetics as well as common datatypes, satisfy it (but there are also notable exceptions, such as bitvector theories). The formal definition is as follows. A theory T is stably infinite iff every T-satisfiable quantifier-free formula (from the signature of T) is satisfiable in an infinite model of T. By compactness, it is immediate to show that T is stably infinite iff every model of T embeds into an infinite one.
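As a quick illustrative (non-)example, not taken from the paper, consider the theory
$$T_{\mathrm{triv}} := \{\, \forall x\, \forall y\; x = y \,\}.$$
The formula $x = c$ (for a constant c) is $T_{\mathrm{triv}}$-satisfiable, but only in one-element models, so $T_{\mathrm{triv}}$ is not stably infinite; by contrast, every model of $\mathcal{EUF}$ embeds into a larger one obtained by adding fresh elements (and extending the functions arbitrarily on them), hence $\mathcal{EUF}$ is stably infinite.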
Let $T_i$ (for $i = 1, 2$) be a stably infinite theory over the signature $\Sigma_i$ such that the SMT($T_i$) problem is decidable and $\Sigma_1$ and $\Sigma_2$ are disjoint (i.e., the only shared symbol is equality). Under these assumptions, the Nelson–Oppen combination method [42,43] tells us that the SMT problem for the combination $T_1 \cup T_2$ of the theories $T_1$ and $T_2$ (i.e., the union of their axioms) is decidable. In general, however, the combined SMT($T_1 \cup T_2$) problem may become undecidable, even when the SMT($T_i$) problems are decidable and the signatures are disjoint [44]; on the other hand, stable infiniteness is a sufficient but not necessary condition for the decidability transfer of SMT problems to a disjoint combination (for a survey on different sufficient criteria, see [45], and for recent developments, also see [46]).
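As a hedged illustration of how the combined procedure works (this specific constraint is ours, not from the survey), consider combining $\mathcal{EUF}$ with linear real arithmetic and the constraint
$$x \le y \;\wedge\; y \le x \;\wedge\; f(x) \neq f(y).$$
The arithmetic component derives the equality $x = y$ from the first two literals; propagating this equality to the $\mathcal{EUF}$ component makes $f(x) \neq f(y)$ inconsistent, so the combined constraint is unsatisfiable, even though each pure part is satisfiable in isolation.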
2.2. Interpolation Properties
In this paper, we are interested in specializing the Craig interpolation property to quantifier-free fragments of first-order theories. We give two definitions: The first one is more restricted, and the second one is more liberal.
Definition 1. [Plain quantifier-free interpolation] A theory T admits (plain) quantifier-free interpolation (or, equivalently, has quantifier-free interpolants) iff, for every pair of quantifier-free formulae $\phi, \psi$ such that $\psi \vdash_T \phi$, there exists a quantifier-free formula $\theta$, called an interpolant, such that: (i) $\psi \vdash_T \theta$, (ii) $\theta \vdash_T \phi$. (Notice that only the variables occurring in both ψ and ϕ can occur in θ.)
The following extension of the above definition is considered more natural (and also more useful in verification applications):
Definition 2. [General quantifier-free interpolation] Let T be a theory in a signature Σ; we say that T has the general quantifier-free interpolation property iff, for every signature $\Sigma'$ (disjoint from Σ) and for every pair of ground $(\Sigma \cup \Sigma')$-formulæ $\phi, \psi$ such that $\psi \vdash_T \phi$, there exists a ground formula θ, such that: (i) $\psi \vdash_T \theta$, (ii) $\theta \vdash_T \phi$, and (iii) all predicates, constants, and function symbols from $\Sigma'$ occurring in θ also occur both in ϕ and ψ.
Since free variables can be replaced by free constants without affecting entailment relations, it should be clear that the general quantifier-free interpolation property (Definition 2) implies the plain quantifier-free interpolation property (Definition 1).
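For a concrete (illustrative, not from the survey) instance of Definition 1 in $\mathcal{EUF}$, take
$$\psi \;\equiv\; a = b \wedge f(a) = c, \qquad \phi \;\equiv\; b = d \to f(d) = c.$$
Then $\psi \vdash_{\mathcal{EUF}} \phi$, and the formula $\theta \equiv f(b) = c$, which contains only the variables b and c common to ψ and ϕ (besides the signature symbol f), is an interpolant: $\psi \vdash_{\mathcal{EUF}} \theta$ and $\theta \vdash_{\mathcal{EUF}} \phi$.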
2.3. Amalgamation Properties
When stating amalgamability and strong amalgamability properties (see [
47] for a survey), people usually limit themselves to universal theories; actually, most theories we have in mind for several applications are universal; however, to some extent, we also want to handle general first-order theories in the paper. In order to do that, it is important to observe that a substructure of a model of a non-universal theory need not be a model of the theory. Thus, in our definitions, we must take into account substructures that are not necessarily submodels. This leads to the notions below, which we call ‘sub-amalgamability’ and ‘strong sub-amalgamability’:
Definition 3. Let T be a theory; we call a T-fork a triple $(\mathcal{M}_1, \mathcal{M}_2, \mathcal{A})$, where $\mathcal{M}_1, \mathcal{M}_2$ are models of T and $\mathcal{A}$ is their shared substructure. (By this, we mean that $\mathcal{A}$ is a substructure of both $\mathcal{M}_1$ and $\mathcal{M}_2$ and that $|\mathcal{M}_1| \cap |\mathcal{M}_2| = |\mathcal{A}|$.) A T-amalgam of such a fork is a triple $(\mathcal{M}, \mu_1, \mu_2)$, where $\mathcal{M}$ is a T-model and $\mu_1 : \mathcal{M}_1 \to \mathcal{M}$, $\mu_2 : \mathcal{M}_2 \to \mathcal{M}$ are embeddings whose restrictions to the support of $\mathcal{A}$ coincide. A theory T has the sub-amalgamation property iff every T-fork has a T-amalgam. A theory T has the strong sub-amalgamation property iff every T-fork has a T-amalgam satisfying the following additional condition: If for some $m_1 \in |\mathcal{M}_1|$ and $m_2 \in |\mathcal{M}_2|$, we have $\mu_1(m_1) = \mu_2(m_2)$, then there exists an element a in $|\mathcal{A}|$ such that $m_1 = a = m_2$.
When the theory T is universal, we may speak of 'amalgamation' and 'strong amalgamation' properties instead of 'sub-amalgamation' and 'strong sub-amalgamation' properties, respectively.
3. Strong Amalgamation and Combined Interpolation
The results presented in this section concern the relationships between syntactical notions, such as forms of interpolation, and their corresponding semantic counterparts, such as variants of amalgamation, and they are based on [
36]. An old result due to Bacsich [
48] connects quantifier-free interpolation and amalgamation for the case of universal theories; the result can be easily extended to arbitrary first-order theories replacing amalgamation with sub-amalgamation [
36].
Theorem 1 [36,48]. A theory T has the sub-amalgamation property iff it admits quantifier-free interpolants.
The above theorem can be used to find examples and counterexamples. For instance, it is easily seen that $\mathcal{EUF}$ (which is trivially universal) has amalgamation, and hence, it has quantifier-free interpolants; in fact, $\mathcal{EUF}$ also has the strong amalgamation property mentioned above. A simple example of a universal theory that does not enjoy amalgamation is the theory of a binary relation that is a partial function.
Less trivial examples and counterexamples are given by the variants of McCarthy's theory of arrays [49]. We consider three variants of this theory. The first variant is $\mathcal{AX}_{ext}$, which is the theory of arrays with extensionality. The signature of $\mathcal{AX}_{ext}$ contains the sort symbols ARRAY, INDEX, and ELEM and the function symbols $rd : \mathrm{ARRAY} \times \mathrm{INDEX} \to \mathrm{ELEM}$ and $wr : \mathrm{ARRAY} \times \mathrm{INDEX} \times \mathrm{ELEM} \to \mathrm{ARRAY}$. (Notice that Theorem 1, and in general all the results in this paper, extend to many-sorted signatures.) The set of axioms of $\mathcal{AX}_{ext}$ consists of the following three sentences:
$$\forall y, i, e.\;\; rd(wr(y, i, e), i) = e,$$
$$\forall y, i, j, e.\;\; i \neq j \to rd(wr(y, i, e), j) = rd(y, j),$$
$$\forall x, y.\;\; \big(\forall i.\; rd(x, i) = rd(y, i)\big) \to x = y.$$
Now, $\mathcal{AX}_{ext}$ enjoys the amalgamation property in the sense that, given two models $\mathcal{M}_1$ and $\mathcal{M}_2$ of $\mathcal{AX}_{ext}$ sharing a substructure $\mathcal{A}$ that is also a model of $\mathcal{AX}_{ext}$, there is a model of $\mathcal{AX}_{ext}$ endowed with embeddings from $\mathcal{M}_1$ and $\mathcal{M}_2$ agreeing on the support of $\mathcal{A}$. However, $\mathcal{AX}_{ext}$ is not universal, so this is not sufficient to guarantee quantifier-free interpolation. In fact, $\mathcal{AX}_{ext}$
is not sub-amalgamable, and quantifier-free interpolation fails for it, as shown by the following valid entailment, whose interpolants require a quantifier (the counterexample is due to R. Jhala and is reported in [50]):
$$x = wr(y, i, e) \;\vdash_{\mathcal{AX}_{ext}}\; rd(x, j) = rd(y, j) \vee rd(x, k) = rd(y, k) \vee j = k.$$
The theory $\mathcal{AX}_{diff}$ is obtained from $\mathcal{AX}_{ext}$ by skolemizing the extensionality axiom; hence, its language has an extra binary function $\mathrm{diff}$ and the following additional axiom, which replaces extensionality:
$$\forall x, y.\;\; x \neq y \to rd(x, \mathrm{diff}(x, y)) \neq rd(y, \mathrm{diff}(x, y)).$$
This theory is universal and (strongly) amalgamable [18,19]. This means that quantifier-free interpolants exist; for example, an interpolant of the two formulæ in the entailment above can be written without quantifiers in this theory as
$$x = wr(y, \mathrm{diff}(x, y), rd(x, \mathrm{diff}(x, y))).$$
The third variant of the array theory we want to mention is the theory of arrays with maxdiff, where the above axiom is strengthened so that $\mathrm{diff}$ returns the biggest index where the two arrays differ. This requires adding at least a symbol for a total ordering relation on the sort INDEX (we leave the reader to consult [20] for details). Under suitable mild hypotheses, it is possible to prove that this universal theory also has amalgamation and, hence, quantifier-free interpolation (but the proof is surprisingly much more delicate [20]).
Amalgamation and sub-amalgamation are not modular properties, in the sense that they can get lost when taking unions of theories, even under disjoint signatures. However, strong amalgamation, under stable infiniteness, is modular [36].
Theorem 2 [36]. Let $T_1$ and $T_2$ be two stably infinite theories over disjoint signatures $\Sigma_1$ and $\Sigma_2$. If both $T_1$ and $T_2$ have the strong sub-amalgamation property, then so does $T_1 \cup T_2$. Thus, in view of Theorem 1, $T_1 \cup T_2$ has quantifier-free interpolants.
Actually, strong sub-amalgamation is also a necessary condition for the transfer of the quantifier-free interpolation property, in the sense precisely stated in the following result.
Theorem 3 [36]. Let T be a theory admitting quantifier-free interpolation and let Σ be a proper signature disjoint from the signature of T. Then, $T \cup \mathcal{EUF}(\Sigma)$ has quantifier-free interpolation iff T has the strong sub-amalgamation property.
The intuitive reason why the above theorem holds is the following. Recall that since Σ is proper, it must contain at least a function or a predicate symbol different from equality. Suppose, e.g., it contains a unary predicate P; take models $\mathcal{M}_1, \mathcal{M}_2$ of T that cannot be strongly amalgamated over their common substructure $\mathcal{A}$. Then, expand them to structures in the enlarged signature in such a way that P holds precisely for the elements of the support of $\mathcal{M}_1$ that are not from the support of $\mathcal{A}$ (and, say, for no element of $\mathcal{M}_2$); clearly, sub-amalgamation fails for these expanded models; hence, $T \cup \mathcal{EUF}(\Sigma)$ lacks quantifier-free interpolation.
Strong amalgamation also characterizes the general quantifier-free interpolation property.
Theorem 4 [36]. A theory T has the general quantifier-free interpolation property iff T has the strong sub-amalgamation property.
3.1. Strong Amalgamation: A Syntactic Characterization
Strong amalgamation needs an ‘operational’ characterization to be useful when designing concrete combined interpolation algorithms. We reformulate strong amalgamation via a syntactic property (to be called the equality-interpolating property); this syntactic property, roughly speaking, says that disjunctions of variables’ equalities can be entailed, modulo T, by quantifier-free formulae only in the case that such equalities are mediated by explicitly defining terms.
Given two finite tuples $\underline{t} \equiv t_1, \ldots, t_n$ and $\underline{v} \equiv v_1, \ldots, v_m$ of terms, the notation $\underline{t} \cap \underline{v} \neq \emptyset$ stands for the disjunction $\bigvee_{i=1}^{n} \bigvee_{j=1}^{m} t_i = v_j$. We use $\underline{t}\,\underline{v}$ to denote the juxtaposition of the two tuples $\underline{t}$ and $\underline{v}$ of terms. So, for example, $\underline{t}\,\underline{v} \cap \underline{w} \neq \emptyset$ is equivalent to $\underline{t} \cap \underline{w} \neq \emptyset \vee \underline{v} \cap \underline{w} \neq \emptyset$. The next definition is taken from [36] [Definition 4.1 and Theorem 4.2(iii)]:
Definition 4. A theory T is equality interpolating iff it has the quantifier-free interpolation property and satisfies the following condition:
- • for every triple of tuples of variables $\underline{x}, \underline{y}_1, \underline{y}_2$ and for every pair of quantifier-free formulae $\delta_1(\underline{x}, \underline{y}_1)$, $\delta_2(\underline{x}, \underline{y}_2)$ such that
$$\delta_1(\underline{x}, \underline{y}_1) \wedge \delta_2(\underline{x}, \underline{y}_2) \;\vdash_T\; \underline{y}_1 \cap \underline{y}_2 \neq \emptyset, \qquad (9)$$
there exists a tuple of terms $\underline{v}(\underline{x})$ such that
$$\delta_1(\underline{x}, \underline{y}_1) \wedge \delta_2(\underline{x}, \underline{y}_2) \;\vdash_T\; \underline{y}_1\,\underline{y}_2 \cap \underline{v} \neq \emptyset. \qquad (10)$$
The following theorem states the syntactic counterpart of the strong amalgamation property.
Theorem 5 [36]. Given a theory T with quantifier-free interpolation, the following conditions are equivalent:
- (i) T is strongly sub-amalgamable;
- (ii) T is equality interpolating.
If a theory T has quantifier elimination, then it obviously also has quantifier-free interpolants, and hence, it is sub-amalgamable. However, quantifier elimination is not sufficient to get strong sub-amalgamation (see below for counterexamples). Nevertheless, if the theory is also universal, then quantifier elimination is sufficient.
Theorem 6 [36]. A universal theory admitting quantifier elimination is equality interpolating.
Proof. (Sketch) In order to prove this theorem, one needs to preliminarily show a testing-point quantifier elimination lemma; such a lemma says that if T is universal and has quantifier elimination, then for every quantifier-free formula $\phi(\underline{x}, \underline{y})$, there exists a tuple $\underline{t}_1(\underline{x}), \ldots, \underline{t}_k(\underline{x})$ of tuples of terms such that
$$\models_T\; \exists \underline{y}\, \phi(\underline{x}, \underline{y}) \;\leftrightarrow\; \bigvee_{i=1}^{k} \phi(\underline{x}, \underline{t}_i(\underline{x})). \qquad (11)$$
Taking this preliminary result for granted, we formally prove that a universal and quantifier-eliminable theory T satisfies the implication (9) ⇒ (10). Suppose that (9) holds; by the testing-point quantifier elimination lemma, there exist tuples of terms $\underline{t}_1(\underline{x}), \ldots, \underline{t}_k(\underline{x})$ such that $\exists \underline{y}_2\, \delta_2(\underline{x}, \underline{y}_2) \leftrightarrow \bigvee_{i=1}^{k} \delta_2(\underline{x}, \underline{t}_i(\underline{x}))$ is T-valid. For every $i = 1, \ldots, k$, if we replace $\underline{y}_2$ with $\underline{t}_i(\underline{x})$ in (9), we get $\delta_1(\underline{x}, \underline{y}_1) \wedge \delta_2(\underline{x}, \underline{t}_i) \vdash_T \underline{y}_1 \cap \underline{t}_i \neq \emptyset$ and, hence,
$$\delta_1(\underline{x}, \underline{y}_1) \wedge \bigvee_{i=1}^{k} \delta_2(\underline{x}, \underline{t}_i) \;\vdash_T\; \bigvee_{i=1}^{k} \underline{y}_1 \cap \underline{t}_i \neq \emptyset. \qquad (12)$$
Taking into account (12) and letting $\underline{v}(\underline{x})$ be the tuple $\underline{t}_1 \cdots \underline{t}_k$ obtained by juxtaposition, we get
$$\delta_1(\underline{x}, \underline{y}_1) \wedge \exists \underline{y}_2\, \delta_2(\underline{x}, \underline{y}_2) \;\vdash_T\; \underline{y}_1 \cap \underline{v} \neq \emptyset.$$
Removing the existential quantifier in the antecedent of the implication, we obtain $\delta_1(\underline{x}, \underline{y}_1) \wedge \delta_2(\underline{x}, \underline{y}_2) \vdash_T \underline{y}_1 \cap \underline{v} \neq \emptyset$ and, a fortiori, $\delta_1 \wedge \delta_2 \vdash_T \underline{y}_1\,\underline{y}_2 \cap \underline{v} \neq \emptyset$, as desired. □
Theorem 6 immediately yields a bunch of strongly amalgamating theories.
Example 1. The theory of recursive data structures $\mathcal{RDS}$ [51] requires a signature comprising two unary function symbols 'car' and 'cdr' and a binary function symbol 'cons'; the axioms of $\mathcal{RDS}$ include the projection axioms
$$car(cons(x, y)) = x, \qquad cdr(cons(x, y)) = y,$$
together with the acyclicity axioms $x \neq t$, where t is a term obtained by finitely many applications of car and cdr to the variable x (e.g., the latter axioms include $x \neq car(x)$, $x \neq cdr(x)$, $x \neq car(car(x))$, and so on). Clearly, $\mathcal{RDS}$ is universal; the fact that it admits elimination of quantifiers has been known since an old work by Mal'cev [52].
Example 2. The theory of integer difference logic $\mathcal{IDL}$
requires a signature comprising the constant symbol '0', the unary function symbols 's' (successor) and 'p' (predecessor), and the binary predicate symbol '<'; it is axiomatized by adding to the irreflexivity, transitivity, and linearity axioms for < a set of sentences relating 0, s, p, and < (stating, in particular, that s and p are mutually inverse and that s(x) is the immediate successor of x in the ordering). $\mathcal{IDL}$ is universal, and the fact that it admits elimination of quantifiers can be shown by slightly modifying the procedure for the similar theory of natural numbers with successor and ordering in [53]. Notice that the atoms of $\mathcal{IDL}$ are equivalent to formulae of the form $t_1 \bowtie s^j(t_2)$, where
- (a) $\bowtie$ is either = or < and j is an integer;
- (b) $t_1, t_2$ are variables or the constant 0;
- (c) for $j \ge 0$, $s^j(t)$ abbreviates the term obtained by applying s to t exactly j times and, for $j < 0$, the term obtained by applying p to t exactly $-j$ times.
Usually, $t_1 \bowtie s^j(t_2)$ is written as $t_1 - t_2 \bowtie j$ or as $t_1 \bowtie t_2 + j$; hence, the name of "integer difference logic."
Example 3. A further theory considered in [36] is a fragment of linear arithmetic over the integers that is slightly more expressive than $\mathcal{IDL}$. It can be defined as the theory whose axioms are the sentences that are true in the integers in the signature comprising the constant 0, the unary function symbols s, p, and −, and the binary predicate symbol <. It can be shown that this theory has a set of quantifier-eliminating universal axioms [36]; thus, it is equality interpolating.
Example 4. Linear Arithmetic over the Reals can be axiomatized as the theory of totally ordered divisible abelian groups [41]. It has quantifier elimination (e.g., via the Fourier–Motzkin procedure), but it is easily seen that it is not strongly sub-amalgamable (just consider two copies of the reals sharing the integers as a common substructure). However, if one includes multiplication by rational coefficients in the signature of the theory, one gets a universal set of axioms enjoying quantifier elimination, thus gaining strong amalgamation and the equality-interpolating property.
Example 5. The situation is somewhat similar for integer linear arithmetic. The theory of the integers under addition, 0, successor, and ordering does not have quantifier elimination; if we add infinitely many unary predicates for equivalence modulo n, we get Presburger arithmetic, which enjoys quantifier elimination. However, this is not yet sufficient for the equality-interpolating property; for that, we must add infinitely many unary function symbols for integer division by n, varying n (see [36] or [54] for details).
3.2. The Case of Convex Theories
A first-order theory T is said to be convex iff, for every conjunction of literals δ, if $\delta \vdash_T \bigvee_{i=1}^{n} x_i = y_i$, then there is $i \in \{1, \ldots, n\}$ such that $\delta \vdash_T x_i = y_i$. Among convex theories, we have universal Horn theories (see Section 4 below); another remarkable example of a convex theory is linear real arithmetic (this is where the name 'convex' comes from: the convexity of linear real arithmetic follows from the geometrical fact that if a convex set is contained in a finite union of hyperplanes, then it is contained in one of them). On the other hand, integer linear arithmetic (Example 5) and integer difference logic (Example 2) are non-convex theories.
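A standard illustration of non-convexity (ours, for concreteness): in integer linear arithmetic, the conjunction of literals
$$\delta \;\equiv\; 0 \le x \;\wedge\; x \le 1 \;\wedge\; y = 0 \;\wedge\; z = 1$$
entails the disjunction $x = y \vee x = z$, but it entails neither disjunct alone; in linear real arithmetic, the same δ does not even entail the disjunction, since x may take any value in the interval [0, 1].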
In convex theories, one can formulate the equality-interpolating condition in some interesting simpler ways. A formula is said to be primitive iff it is obtained by prefixing some existential quantifiers to a conjunction of literals.
Proposition 1 [36]. The following conditions are equivalent for a convex theory T with quantifier-free interpolation:
- (i) T is equality interpolating;
- (ii) For every pair of variables $y_1, y_2$ and for every pair of conjunctions of literals $\delta_1(\underline{x}, y_1)$, $\delta_2(\underline{x}, y_2)$ such that $\delta_1 \wedge \delta_2 \vdash_T y_1 = y_2$, there exists a term $t(\underline{x})$ such that $\delta_1 \wedge \delta_2 \vdash_T y_1 = t \wedge y_2 = t$;
- (iii) For every tuple of variables $\underline{x}$, for every further variable y, and for every primitive formula $\theta(\underline{x}, y)$ such that $\theta(\underline{x}, y) \wedge \theta(\underline{x}, y') \vdash_T y = y'$, there is a term $t(\underline{x})$ such that $\theta(\underline{x}, y) \vdash_T y = t(\underline{x})$.
Condition (ii) is due to Yorsh and Musuvathi; in fact, in [
55], they proved the combination Theorem 2 for the restricted case of convex theories using precisely condition (ii) instead of the semantic notion of strong sub-amalgamation. Condition (iii) is the Beth definability property formulated for the primitive fragment of the language—we shall call it the
primitive Beth definability property (modulo
T). We shall make use of this property in
Section 5.2 when we discuss an algorithm for computing combined uniform interpolants in the convex case.
3.3. Sketch of the Combined Interpolation Algorithm
Theorem 2 shows that the union of two stably infinite signature-disjoint strongly amalgamable theories has the quantifier-free interpolation property. However, it does not show how to compute quantifier-free interpolants given analogous input algorithms for the component theories. Such an algorithm is described in detail in [
36]; we give some indications here of how it works.
Below, we consider two theories $T_1, T_2$ in their respective signatures $\Sigma_1, \Sigma_2$; the two theories are both stably infinite and equality interpolating; moreover, the SMT($T_i$) problems are decidable and the signatures $\Sigma_1, \Sigma_2$ are disjoint. We also assume the availability of algorithms for $T_1$ and $T_2$ that are able to compute not only quantifier-free interpolants but also the tuples $\underline{v}$ of terms in Definition 4 for the equality-interpolating property. Since the SMT($T_i$) problem is decidable for $i = 1, 2$, it is always possible to build an equality-interpolating algorithm by enumeration; in practice, better algorithms can be designed (see [55] for some examples concerning convex theories, and see above for non-convex examples regarding some quantifier-eliminating arithmetic theories).
We can restate our problem as follows: We are given a finite set A and a finite set B of ground $(\Sigma_1 \cup \Sigma_2)$-formulae possibly containing additional free constants. We assume that $A \cup B$ is $(T_1 \cup T_2)$-unsatisfiable (here, by abuse of notation, we confuse a finite set of formulae with its conjunction). We must find a finite set of ground formulae C (containing at most the free constants occurring both in A and in B) such that $A \vdash_{T_1 \cup T_2} C$ and $C \cup B$ is $(T_1 \cup T_2)$-unsatisfiable. Applying standard purification procedures, we can assume that all literals in $A \cup B$ are pure, meaning that they cannot contain both an A-strict free constant and a B-strict free constant. Here, we call A-strict (or B-strict) a free constant occurring only in A (or only in B); we call it 'shared' if it occurs in both A and B. Finally, we call it A-local (or B-local) iff it is either shared or A-strict (or B-strict). A similar terminology is applied to terms, literals, and quantifier-free formulæ; they are said to be A-local, B-local, or shared iff they contain only constants that are A-local, B-local, or shared, respectively.
The algorithm uses the metarules framework introduced in [18]. This framework collects some manipulations that can be freely operated on pairs (A, B) without losing the possibility of computing an interpolant. For instance, if $A \vdash_{T_1 \cup T_2} A_1 \vee \cdots \vee A_n$, where the $A_i$ are A-local, then it is possible to non-deterministically replace A with $A \cup \{A_i\}$, compute all interpolants of the pairs $(A \cup \{A_i\}, B)$, and then recombine them into an interpolant of (A, B). A long list of metarules is supplied in [18,36]; they are rather simple transformations. Strictly speaking, metarules are not part of an interpolation algorithm; however, if every single transformation of a concrete interpolation algorithm can be reformulated as a combination of metarules, then the algorithm itself is automatically partially correct (that is, it gives a correct answer when it terminates), and only termination requires a proof in order to achieve its total correctness.
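To see why such a disjunctive metarule is sound, here is the underlying (elementary) recombination argument, spelled out in our own notation: if $A \vdash A_1 \vee \cdots \vee A_n$ and each $\theta_i$ interpolates $(A \cup \{A_i\},\, B)$, then $\theta_1 \vee \cdots \vee \theta_n$ interpolates (A, B), since
$$A \;\vdash\; \theta_1 \vee \cdots \vee \theta_n, \qquad B \wedge (\theta_1 \vee \cdots \vee \theta_n) \;\vdash\; \bot,$$
because A entails $\bigvee_i A_i$ and $A \wedge A_i \vdash \theta_i$ for each i, while each $B \wedge \theta_i$ is unsatisfiable; moreover, each $\theta_i$ contains only shared constants, hence so does the disjunction.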
The combined interpolation algorithm we are introducing follows this schema. It manipulates the pair (A, B) by applying transformation rules (justified by metarules) that generate a tree labeled by pairs (A′, B′). In the end, it will be possible to compute, via the input interpolation algorithms, an interpolant out of every leaf; such interpolants will then be re-combined to form an interpolant for the original unsatisfiable pair (A, B). While applying the transformation rules, it might happen that some A-strict free constant a 'becomes shared' because an equation explicitly defining it via a shared term is entailed by the current A; the same might happen for a B-strict constant. This 'term-sharing' technique is easily justified by a combination of metarules.
Now, one of the transformation rules simply guesses a Boolean assignment satisfying the current formula A (or B) and adds the corresponding set of literals to A (or B). A Boolean assignment can also guess equalities or disequalities among A-strict (or B-strict) constants, between an A-strict (or B-strict) constant and a shared constant, or between two shared constants. What the assignment cannot do is guess an equality/disequality between an A-strict and a B-strict constant, because no impure literal is tolerated in the interpolant to be built. So, it is assumed by default that A-strict and B-strict constants cannot be equal to each other. When this leads to an inconsistency, this is just because a relation such as
$$\delta_1 \wedge \delta_2 \;\vdash_{T_i}\; \underline{a} \cap \underline{b} \neq \emptyset$$
holds for $i = 1$ or $i = 2$ (here, $\underline{a} = a_1, \ldots, a_n$ are the A-strict constants, $\underline{b} = b_1, \ldots, b_m$ are the B-strict constants, and $\delta_1$, $\delta_2$ are those among the currently assigned literals that are $\Sigma_i$-literals). If this happens, by the equality-interpolating property, there are shared terms $\underline{v} = v_1, \ldots, v_k$ such that
$$\delta_1 \wedge \delta_2 \;\vdash_{T_i}\; \underline{a}\,\underline{b} \cap \underline{v} \neq \emptyset.$$
Invoking the available interpolation algorithm for $T_i$, we can compute a ground shared $\Sigma_i$-formula $\theta$ such that $\delta_1 \vdash_{T_i} \theta \vee (\underline{a} \cap \underline{v} \neq \emptyset)$ and $\theta \wedge \delta_2 \vdash_{T_i} \underline{b} \cap \underline{v} \neq \emptyset$. We choose among $n \cdot k + m \cdot k$ alternatives in order to non-deterministically update the pair (A, B) in the successor nodes. For the first $n \cdot k$ alternatives, we add some $a_i = v_l$ (for $i \le n$, $l \le k$) to A. For the last $m \cdot k$ alternatives, we add $\theta$ to A and some $b_j = v_l$ to B (for $j \le m$, $l \le k$). After such updates, the number of the A-strict or of the B-strict free constants decreases because we added an explicitly defining equation in all cases (see the above remark about 'term sharing'). Thus, in the end, it will be possible to assert (explicitly or implicitly) an equality or a disequality for every pair of free constants. Since this is precisely the condition for consistency in the combined Nelson–Oppen procedure [42], this cannot happen, because we assumed that $A \cup B$ was $(T_1 \cup T_2)$-unsatisfiable; hence, every leaf of the tree to be built must contain a contradiction either in the $\Sigma_1$-part or in the $\Sigma_2$-part of its labeling constraint, so that an interpolant can be extracted from every leaf.
4. Non-Disjoint Combinations
Whenever signatures are not disjoint, transfer results are harder to obtain. A crucial notion here is $T_0$-compatibility [56], which we are going to introduce in the following. This section is based on the results presented in [37,57].
Recall [41] that a universal theory $T_0$ has a model completion $T_0^*$ iff $T_0^*$ is a stronger theory in the same language of $T_0$ such that: (i) every model of $T_0$ can be embedded into a model of $T_0^*$; and (ii) $T_0^*$ has quantifier elimination. The definition of a model completion could be suitably extended to theories that are not universal, but we do not need to consider this more general case. Alternative equivalent definitions are possible (for instance, condition (ii) is equivalent to the fact that the union of $T_0^*$ and of the diagram of a model of $T_0$ is a complete theory).
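Two classical examples (standard facts from model theory, recalled here only for orientation) may help fix the notion:
$$\text{linear orders} \;\rightsquigarrow\; \text{dense linear orders without endpoints}, \qquad \text{integral domains} \;\rightsquigarrow\; \text{algebraically closed fields}.$$
In each case, every model of the universal base theory embeds into a model of the richer theory, and the richer theory eliminates quantifiers.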
Definition 5. Let T be a theory in the signature Σ and let $T_0$ be a universal theory in a subsignature $\Sigma_0 \subseteq \Sigma$ such that $T_0 \subseteq T$. We say that T is $T_0$-compatible iff there is a $\Sigma_0$-theory $T_0^*$ such that:
- (i) $T_0 \subseteq T_0^*$;
- (ii) $T_0^*$ is a model completion of $T_0$;
- (iii) Every model of T can be embedded, as a Σ-structure, into a model of $T \cup T_0^*$.
$T_0$-compatibility is a generalization of stable infiniteness; in fact, a theory T is stably infinite iff it is $T_0$-compatible, where $T_0$ is the pure equality theory in the empty signature.
In [56], it is shown that the decidability of the SMT problem transfers from two theories to their non-disjoint combination in case the two theories are both $T_0$-compatible with respect to a locally finite theory $T_0$ in their shared signature (this result has, as a special case, the transfer of the decidability of the global consequence relation to fusions of modal logics [58]; see below). More results, replacing local finiteness with a so-called 'noetherianity' condition, are given in [59].
Strong Amalgamation over a Horn Theory
In [
37], two results are given concerning combined quantifier-free interpolation in the case of non-disjoint signatures. We report the second one only, which is easier to formulate and more effective in the applications.
A Σ-theory T is universal Horn iff it can be axiomatized via formulæ of the form $\forall \underline{x}\, (A_1 \wedge \cdots \wedge A_n \to B)$, where the $A_i$ and B are all atoms (the standard definition of a universal Horn theory would also include the case where B is ⊥; we disregard this case for simplicity and because our applications to modal logic do not require it). In purely functional signatures, universal Horn theories axiomatize quasi-varieties.
One important fact is that the categories of models of universal Horn theories are co-complete [
60]; hence, in particular, pushouts exist.
Definition 6. Let T be a theory and let $(\mathcal{M}_1, \mathcal{M}_2, \mathcal{A})$ be a T-fork. A pushout of the fork is a triple $(\mathcal{M}, \mu_1, \mu_2)$, where $\mathcal{M}$ is a T-model and $\mu_1 : \mathcal{M}_1 \to \mathcal{M}$, $\mu_2 : \mathcal{M}_2 \to \mathcal{M}$ are Σ-homomorphisms whose restrictions to the support of $\mathcal{A}$ coincide, such that for every other triple $(\mathcal{N}, \nu_1, \nu_2)$ with the same properties, there is a unique homomorphism $\chi : \mathcal{M} \to \mathcal{N}$ (called the comparison homomorphism) such that $\nu_i = \chi \circ \mu_i$ ($i = 1, 2$). If the pushout of the T-fork is a T-amalgam (i.e., if $\mu_1, \mu_2$ are embeddings), it is called the minimal T-amalgam of the T-fork.
Notice that, even when the pushout is a T-amalgam, comparison morphisms need not be injective. This makes the next definition interesting.
Definition 7. Let T be a theory in a signature Σ; let $T_0$ be a universal Horn theory in a subsignature $\Sigma_0 \subseteq \Sigma$ with the amalgamation property. We say that T is $T_0$-strongly sub-amalgamable iff every T-fork has a T-amalgam such that the comparison morphism with respect to the minimal $T_0$-amalgam of the $\Sigma_0$-reduct of the T-fork is injective.
Notice that strong amalgamation is nothing but $T_0$-strong amalgamability, where $T_0$ is the pure equality theory in the empty signature. Thus, the following transfer result is a genuine generalization of Theorem 2:
Theorem 7 [37]. If $T_1, T_2$ are both $T_0$-compatible and $T_0$-strongly sub-amalgamable (for an amalgamable universal Horn theory $T_0$ in their common subsignature $\Sigma_0$), then so is $T_1 \cup T_2$.
The above theorem has interesting applications to modal logic. In the following, we let $\mathcal{BA}$ be the theory of Boolean algebras. Recall that a Boolean algebra is defined to be a bounded and complemented distributive lattice; since Boolean algebras have a meet-semilattice reduct, it is possible to introduce in them a partial ordering relation ≤ via the definition $x \le y$ iff $x \sqcap y = x$, where ⊓ is the meet operation. It is well known [41] that $\mathcal{BA}$ has a model completion, which is the theory of atomless Boolean algebras: A Boolean algebra is said to be atomless iff it has no non-zero ≤-minimal element.
A
BAO-equational theory is any theory
T whose signature extends the signature of Boolean algebras and whose axioms are all equations and include the Boolean algebra axioms. BAO stands for ‘Boolean algebras with operations’. BAO-equational theories arise as algebraic semantics of propositional modal logics [
61]; for instance, modal algebras (i.e., Boolean algebras endowed with a unary operator □ preserving meets and 1) are Lindenbaum algebras of propositional calculi based on the modal system
K. However, we do not assume here any ‘normality’ conditions on the operations of a BAO; hence, BAO are algebraic counterparts of classical modal logics in the sense of [
62] (in a classical modal logic, the only assumption made on the modal operators is that the replacement rule for equivalent formulae applies).
The fusion of two BAO-equational theories $T_1$ and $T_2$ is just their combination $T_1 \cup T_2$ (when speaking of the fusion of $T_1$ and $T_2$, we assume that $T_1$ and $T_2$ share only the Boolean algebras' operations and no other symbols). This notion of fusion matches the standard notion of fusion [58] of the modal logics that are the propositional-logic counterparts of the algebraic theories $T_1, T_2$.
Any BAO-equational theory T is $\mathcal{BA}$-compatible; to see this, it is sufficient to show that a model $\mathcal{M}$ of T embeds into a model of T whose Boolean reduct is atomless. This is done by taking the colimit of the chain $\mathcal{M}_0 \subseteq \mathcal{M}_1 \subseteq \cdots$ defined as follows: Let $\mathcal{M}_0$ be $\mathcal{M}$, let $\mathcal{M}_{n+1}$ be $\mathcal{M}_n \times \mathcal{M}_n$, and use the diagonal maps as embeddings $\mathcal{M}_n \hookrightarrow \mathcal{M}_{n+1}$.
Thus, in order to apply Theorem 7, we only need to characterize $\mathcal{BA}$-strong sub-amalgamability. Surprisingly, this is nothing but a well-known notion from the literature.
Definition 8. We say that a BAO-equational theory T has the superamalgamation property [63] iff for every T-fork $(\mathcal{M}_1, \mathcal{M}_2, \mathcal{A})$, there exists a T-amalgam $(\mathcal{M}, \mu_1, \mu_2)$ such that for every $m_1 \in |\mathcal{M}_1|$ and $m_2 \in |\mathcal{M}_2|$ such that $\mu_1(m_1) \le \mu_2(m_2)$, there exists $a \in |\mathcal{A}|$ such that $m_1 \le a$ holds in $\mathcal{M}_1$ and $a \le m_2$ holds in $\mathcal{M}_2$.
Theorem 8 [37]. A BAO-equational theory T has the superamalgamation property iff it is $\mathcal{BA}$-strongly amalgamable.
As is well known from [63], the superamalgamation property for varieties of modal algebras, in the case of normal modal logics, corresponds to (the local deducibility version of) the Craig interpolation theorem. Thus, Theorem 8 implies, in particular, Wolter's fusion transfer result [58] for the Craig interpolation theorem for normal modal logics. For non-normal modal logics, superamalgamation corresponds to a strong version of the Craig interpolation theorem (encompassing both the local and the global deducibility versions of it), called the comprehensive interpolation property in [37]. Theorem 8 above implies that this comprehensive interpolation property transfers to fusions in the non-normal case, too, as proved in [37].
5. Uniform Interpolants
This section presents results contained in [31,32,38,39,40,64,65]. First, we analyze a strong form of quantifier-free interpolation and its relationship with model completeness [31,32]; we then show that, for convex theories, the same hypotheses allowing the transfer of the existence of ordinary interpolants also allow the transfer of the existence of these stronger interpolants [39,40].
Fix a theory T and an existential formula $\exists \underline{e}\, \phi(\underline{e}, \underline{y})$; a quantifier-free formula $\psi(\underline{y})$ is said to be a T-cover [30] (or, simply, a cover) of $\exists \underline{e}\, \phi(\underline{e}, \underline{y})$ iff the following two conditions are satisfied: (i) $\exists \underline{e}\, \phi(\underline{e}, \underline{y}) \vdash_T \psi(\underline{y})$; (ii) for every formula $\theta(\underline{y}, \underline{z})$ such that $\exists \underline{e}\, \phi(\underline{e}, \underline{y}) \vdash_T \theta(\underline{y}, \underline{z})$, we have that $\psi(\underline{y}) \vdash_T \theta(\underline{y}, \underline{z})$. Sometimes, the cover $\psi$ (which is unique up to T-equivalence) is called a (quantifier-free) uniform interpolant. The reason for this terminology comes from the fact that an entailment like $\phi(\underline{e}, \underline{y}) \vdash_T \theta(\underline{y}, \underline{z})$ is equivalent (by the standard rule for existential quantifier introduction) to $\exists \underline{e}\, \phi(\underline{e}, \underline{y}) \vdash_T \theta(\underline{y}, \underline{z})$; hence, it is immediately seen that a cover $\psi$ of $\exists \underline{e}\, \phi$ can work as an interpolant for all entailments $\phi(\underline{e}, \underline{y}) \vdash_T \theta(\underline{y}, \underline{z})$ (varying all $\theta$ for which the entailment holds).
We say that a theory T has uniform quantifier-free interpolation iff every existential formula $\exists \underline{e}\, \phi(\underline{e}, \underline{y})$ (equivalently, every primitive formula) has a T-cover.
The following lemma supplies a semantic counterpart to the notion of a cover. What the lemma essentially says is that the cover of $\exists \underline{e}\, \phi(\underline{e}, \underline{y})$ expresses a 'solvability condition' for $\phi$ (seen as a kind of system of equations in the parameters $\underline{y}$); if this solvability condition is true, then it is possible to build (maybe in an extended model) a solution for $\underline{e}$, and vice versa.
Lemma 1 [31,32]. A formula $\psi(\underline{y})$ is a T-cover of $\exists \underline{e}\, \phi(\underline{e}, \underline{y})$ iff it satisfies the following two conditions:
- (i) $\exists \underline{e}\, \phi(\underline{e}, \underline{y}) \vdash_T \psi(\underline{y})$;
- (ii) For every model $\mathcal{M}$ of T and for every tuple $\underline{a}$ of elements from the support of $\mathcal{M}$ such that $\mathcal{M} \models \psi(\underline{a})$, it is possible to find another model $\mathcal{N}$ of T such that $\mathcal{M}$ embeds into $\mathcal{N}$ and $\mathcal{N} \models \exists \underline{e}\, \phi(\underline{e}, \underline{a})$.
In Section 4, we mentioned the model completion $T^*$ of a universal theory T; we recall from [41] that $T^*$ axiomatizes the models of T that are existentially closed for T, i.e., those models $\mathcal{M}$ of T for which an existential formula with parameters in $\mathcal{M}$ having a solution in an extension of $\mathcal{M}$, which is also a model of T, has a solution in $\mathcal{M}$ itself. If a theory has uniform interpolation, then every existential formula $\exists \underline{e}\, \phi(\underline{e}, \underline{y})$ has a T-cover, so it is possible to express the solvability condition of $\phi$ via the cover. In this way, we can axiomatize existentially closed models; we just say that 'whatever is solvable actually has a solution'. These intuitive considerations show why the following result comes as no surprise.
Theorem 9 [31,32,66]. Suppose that T is a universal theory. Then, T has a model completion $T^*$ iff T has uniform quantifier-free interpolation. If this happens, $T^*$ is axiomatized by adding to T the infinitely many sentences
$$\forall \underline{y}\, \big(\psi(\underline{y}) \to \exists \underline{e}\, \phi(\underline{e}, \underline{y})\big),$$
where $\exists \underline{e}\, \phi(\underline{e}, \underline{y})$ is an existential formula and ψ is a T-cover of it.
5.1. Uniform Interpolants in $\mathcal{EUF}$
Whereas it is clear that theories enjoying quantifier elimination also have uniform interpolation, it is less evident whether other theories used in verification have covers or not.
$\mathcal{EUF}$ is the typical theory used in verification that is not a theory axiomatizing arithmetic data and that does not enjoy quantifier elimination. That is why investigating uniform interpolation in $\mathcal{EUF}$ can be interesting. In fact, $\mathcal{EUF}$ does enjoy uniform interpolation; the result was stated on various occasions in the literature (including, e.g., [30]), but the first proofs were only published in the conference paper [31] and its journal version [32]. Alternative proofs are reported in [38,65]. Actually, these papers contain three different algorithms for computing uniform interpolants in $\mathcal{EUF}$. In this subsection, we only report the first algorithm from [38,64,65], which is the simplest one to explain.
The algorithm is based on transformation rules. We first need some definitions (for simplicity, we assume that the signature is functional).
A flat literal is a literal included in the following list:
$$a = b, \qquad a \neq b, \qquad f(a_1, \ldots, a_n) = b, \qquad (20)$$
where $a_1, \ldots, a_n, a$, and b are (not necessarily distinct) variables or constants. A formula is flat iff all literals occurring in it are flat; flat terms are terms occurring in the literals listed above in (20).
An explicit definition via a directed acyclic graph (abbreviated as a DAG-definition, or simply as a DAG) is any formula $\Delta(\underline{d}, \underline{y})$ of the following form (where $\underline{d} = d_1, \ldots, d_n$):
$$\bigwedge_{i=1}^{n} d_i = t_i(d_1, \ldots, d_{i-1}, \underline{y}). \qquad (21)$$
Thus, $\Delta$ provides an explicit definition of the $d_i$ in terms of the parameters $\underline{y}$. Given such a DAG-definition $\Delta$, we can, in fact, associate to it a substitution $\sigma_\Delta$ (mapping each $d_i$ to a term containing only the parameters $\underline{y}$) so that a formula such as $\exists \underline{d}\, (\Delta(\underline{d}, \underline{y}) \wedge \psi(\underline{d}, \underline{y}))$ is equivalent to $\psi\sigma_\Delta$. The formula $\psi\sigma_\Delta$ is said to be the unravelling of (21); notice that computing such an unravelling by explicitly performing the required substitutions causes an exponential blow-up.
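For a feel of the blow-up (our own illustrative DAG, with a binary function symbol f), consider
$$d_1 = f(y, y), \quad d_2 = f(d_1, d_1), \quad \ldots, \quad d_n = f(d_{n-1}, d_{n-1}):$$
the DAG-definition has size linear in n, but the term obtained by unravelling $d_n$ contains $2^n$ occurrences of the parameter y, which is the reason for keeping definitions in DAG form rather than eagerly substituting.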
We want to compute the cover of a primitive formula $\exists \underline{e}\, \phi(\underline{e}, \underline{y})$; we can assume without loss of generality that the constraint $\phi$ is flat. To see this, it is sufficient to apply (as a pre-processing step) the well-known congruence closure transformations, as explained, e.g., in [67] (these transformations have a linear cost).
The algorithm manipulates formulae in the following format:
$$\exists \underline{d}\; \exists \underline{e}\; \big(\Delta(\underline{y}, \underline{d}) \wedge \Gamma(\underline{y}, \underline{d}, \underline{e}) \wedge \Phi(\underline{y}, \underline{d})\big), \qquad (22)$$
where $\Delta$ is a DAG and $\Gamma, \Phi$ are flat constraints (notice that the $\underline{e}$ do not occur in $\Delta$ and $\Phi$). When writing formulae such as (22), we usually omit the existential quantifiers $\exists \underline{d}$ and $\exists \underline{e}$ for brevity. Initially, $\Delta$ and $\Phi$ are the empty conjunction. In (22), the variables $\underline{y}$ are called parameter variables, the variables $\underline{d}$ are called (explicitly) defined variables, and the variables $\underline{e}$ are called (truly) quantified variables. The algorithm does not modify the variables $\underline{y}$; on the other hand, it might cause some quantified variable to disappear or to be renamed as a defined variable. Below, the letters $a, b, u, v, \ldots$ range over $\underline{y} \cup \underline{d} \cup \underline{e}$.
Definition 9. A term t (or a literal L) is $\underline{e}$-free when there is no occurrence of any of the variables $\underline{e}$ in t (or in L). Two flat terms of the kinds
$$t \equiv f(u_1, \ldots, u_n), \qquad u \equiv f(v_1, \ldots, v_n) \qquad (23)$$
are said to be compatible iff, for every $i$, either $u_i$ is identical to $v_i$ or both $u_i$ and $v_i$ are $\underline{e}$-free. The difference set of two compatible terms as above is the set of disequalities $u_i \neq v_i$, where $u_i$ is not identical to $v_i$.
The algorithm (taken from [
38,
64,
65]) applies the rules below in any order, except the last one, which has lower priority. The last rule splits the execution of the algorithm into several branches; each branch will produce a different disjunct in the output formula.
- (1) Simplification Rules:
- (1.0) If an atom such as $u = u$ belongs to $\Gamma$, just remove it; if a literal such as $u \neq u$ occurs somewhere, delete it, replace the whole working formula with ⊥, and stop;
- (1.i) If t is not a variable and $\Gamma$ contains both $u = t$ and $v = t$, remove the former and replace it with $u = v$.
- (1.ii) If $\Gamma$ contains $e_i = u$ with $u$ distinct from $e_i$, remove it and replace $e_i$ with $u$ everywhere.
- (2) DAG Update Rule: If $\Gamma$ contains $e_i = t$, where t is an $\underline{e}$-free flat term, remove it, rename $e_i$ as $d_k$ everywhere (for fresh $d_k$), and add $d_k = t$ to $\Delta$. More formally:
$$\Delta \wedge (e_i = t \wedge \Gamma) \wedge \Phi \;\Longrightarrow\; (\Delta \wedge d_k = t) \wedge \Gamma[d_k/e_i] \wedge \Phi.$$
- (3) $\underline{e}$-Free Literal Rule: If $\Gamma$ contains a literal L that is $\underline{e}$-free, move it to $\Phi$. More formally:
$$\Delta \wedge (L \wedge \Gamma) \wedge \Phi \;\Longrightarrow\; \Delta \wedge \Gamma \wedge (\Phi \wedge L).$$
- (4) Splitting Rule: If $\Gamma$ contains a pair of atoms $a = t$ and $b = u$, where t and u are compatible flat terms as in (23) (thus, in particular, t and u are of the kinds $f(u_1, \ldots, u_n)$ and $f(v_1, \ldots, v_n)$, respectively), and no disequality from the difference set of $t, u$ occurs in $\Gamma$ or in $\Phi$, then apply one of the following alternatives:
- (4.0) Remove from $\Gamma$ the atom $b = u$, add to $\Gamma$ the atom $a = b$, and add to $\Gamma$ all equalities $u_l = v_l$ such that $u_l \neq v_l$ is in the difference set of $t, u$;
- (4.1) Add to $\Gamma$ one of the disequalities from the difference set of $t, u$ (notice that the difference set cannot be empty; otherwise, Rule (1.i) applies).
Theorem 10 [38,64,65]. Suppose that we apply the above algorithm to the primitive formula $\exists \underline{e}\, \phi(\underline{e}, \underline{y})$ and that the algorithm terminates with its branches in the states
$$\exists \underline{d}\; \exists \underline{e}\; (\Delta_i \wedge \Gamma_i \wedge \Phi_i) \qquad (i = 1, \ldots, k).$$
Then, the cover of $\exists \underline{e}\, \phi(\underline{e}, \underline{y})$ in $\mathcal{EUF}$ is the disjunction of the unravellings of the formulæ $\exists \underline{d}\, (\Delta_i \wedge \Phi_i)$ while varying $i = 1, \ldots, k$.
The proof (shown in [38,65]) is based on Lemma 1 and essentially shows that if we conjoin the Robinson diagram of a model satisfying one of the output disjuncts (relative to a certain assignment to the variables $\underline{y}$) with the corresponding terminal constraint, we get a canonical rewrite system (under a suitable reduction ordering; see [68] for information on rewrite systems).
Example 6. This example is analyzed in [38,64,65]. Let us compute a cover of the formula . We first need to flatten it; this produces the set of literalswhere we have two newly introduced variables to be eliminated, too. After applying Splitting (4.0), the equality is removed and the new equalities , are introduced. Now, (2) renames as by (2). We apply (4.0) again; this removes and adds the equalities , ; moreover, the variable is renamed as . We can now apply (3) to the literal . The branch terminates with . This gives as a first disjunct of the uniform interpolant. The other branches give , , and as further disjuncts, so that our cover turns out to be logically equivalent to . The above algorithm has exponential complexity (the branches have quadratic size); notice, however, that, if the signature only contains unary function symbols, there is no need to apply the Splitting Rule, and hence, the complexity is polynomial; the case of a signature with only unary function symbols is important in the applications to data-aware verification because it allows the formalization of read-only databases with primary and foreign keys [
35].
5.2. Combined Uniform Interpolants
We now investigate combined uniform interpolants by starting from the convex case and by showing the algorithm presented in [39,40]. Let us fix a convex, stably infinite, equality-interpolating universal theory admitting a model completion. Let T be such a theory, let Σ be its signature, and let $T^*$ be its model completion. Consider a conjunction of Σ-literals $\phi(\underline{e}, \underline{y})$, where $\underline{e} = e_1, \ldots, e_n$ (recall that the tuple $\underline{e}$ is disjoint from the tuple $\underline{y}$ according to the conventions from Section 2). For $i = 1, \ldots, n$, we let the formula $\mathrm{ImplDef}_{\phi, e_i}(\underline{y})$ be the quantifier-free formula equivalent in $T^*$ to the formula
$$\forall \underline{e}\, \forall \underline{e}'\, \big(\phi(\underline{e}, \underline{y}) \wedge \phi(\underline{e}', \underline{y}) \to e_i = e_i'\big),$$
where the $\underline{e}'$ are renamed copies of the $\underline{e}$. The following lemma (taken from [39,40]) comes from the primitive Beth definability property (recall the paragraph following Proposition 1):
Lemma 2. Let $\bigvee_j D_j(\underline{y})$ be the disjunctive normal form (DNF) of $\mathrm{ImplDef}_{\phi, e_i}(\underline{y})$. Then, for every j, there is a Σ-term $t_{ij}(\underline{y})$ such that
$$D_j(\underline{y}) \wedge \phi(\underline{e}, \underline{y}) \;\vdash_T\; e_i = t_{ij}(\underline{y}).$$
The above lemma is the key technical ingredient for the proof of the following result.
Theorem 11 [39,40]. Let $T_1, T_2$ be convex, stably infinite, equality-interpolating, universal theories over disjoint signatures with uniform quantifier-free interpolation. Then, $T_1 \cup T_2$ has uniform quantifier-free interpolation.
We now present the algorithm from [39,40] to compute covers in $T_1 \cup T_2$ when the hypotheses of the above theorem are satisfied and the SMT($T_i$) satisfiability problems are decidable. We show how to compute the cover of a primitive formula $\exists \underline{e}\, \phi(\underline{e}, \underline{y})$, where we freely assume that the literals in $\phi$ are all flat: if we let $\Sigma_1$ be the signature of $T_1$ and $\Sigma_2$ be the signature of $T_2$, flatness means in particular that such literals are either $\Sigma_1$-literals or $\Sigma_2$-literals or both (the latter can obviously be the case only for equalities or negated equalities involving variables). The idea behind the algorithm is that the input cover algorithms can be applied separately, once all potential definability phenomena have been identified.
A working formula is a formula of the kind
$$\exists \underline{d}\; \exists \underline{e}\; \big(\Delta(\underline{y}, \underline{d}) \wedge \phi_1(\underline{e}, \underline{d}, \underline{y}) \wedge \phi_2(\underline{e}, \underline{d}, \underline{y})\big), \qquad (28)$$
where $\Delta$ is a DAG, $\phi_1$ is a conjunction of $\Sigma_1$-literals, and $\phi_2$ is a conjunction of $\Sigma_2$-literals. We assume that $\phi_1$ and $\phi_2$ in a working formula (28) always contain the literals $e_i \neq e_j$ (for distinct $e_i, e_j$ from $\underline{e}$) as a conjunct; this can be forced at the initialization stage by making a case-split followed by replacements of equals by equals. Contrary to what we did in the $\mathcal{EUF}$ case above, here, we do not need to separate the literals that do not contain the truly existential variables $\underline{e}$ from the other ones.
A working formula such as (28) is said to be terminal iff, for every existential variable $e_i \in \underline{e}$, we have that
$$\phi_1 \;\vdash_{T_1}\; \neg\, \mathrm{ImplDef}_{\phi_1, e_i} \quad \text{and} \quad \phi_2 \;\vdash_{T_2}\; \neg\, \mathrm{ImplDef}_{\phi_2, e_i}. \qquad (29)$$
Intuitively, in a terminal working formula, all variables that are not parameters are either explicitly definable or recognized not to be implicitly definable by both theories. Notice that the validity tests for the implications (29) can be effectively discharged using the quantifier-free satisfiability procedures in $T_1$ and $T_2$.
We first observe (see [39,40] for details) that every working formula is equivalent (modulo $T_1 \cup T_2$) to a disjunction of terminal working formulæ. Such a disjunction of terminal working formulæ can be computed as follows: One exhaustively applies the following transformations in all possible ways (the output is the disjunction of the different outcomes).
- (1) Update the working formula by adding to $\phi_1$ a disjunct from the DNF of $\neg\,\mathrm{ImplDef}_{\phi_1, e_i}$ and by adding to $\phi_2$ a disjunct from the DNF of $\neg\,\mathrm{ImplDef}_{\phi_2, e_i}$;
- (2.i) Select $e_i$ and $h \in \{1, 2\}$; then, update $\phi_h$ by adding to it a disjunct $D_j(\underline{y}, \underline{d})$ from the DNF of $\mathrm{ImplDef}_{\phi_h, e_i}$; the equality $e_i = t_{ij}$ (where $t_{ij}$ is the term mentioned in Lemma 2) is added to $\Delta$; the variable $e_i$ becomes, in this way, part of the defined variables.
To conclude, we need the final fact (again, shown in [39,40]) that the cover of a working formula (28) that is terminal is given by the unravelling of the explicit definitions of the variables $\underline{d}$ from the formula
$$\Delta(\underline{y}, \underline{d}) \wedge \theta_1(\underline{d}, \underline{y}) \wedge \theta_2(\underline{d}, \underline{y}),$$
where $\theta_1$ is the $T_1$-cover of $\exists \underline{e}\, \phi_1(\underline{e}, \underline{d}, \underline{y})$ and $\theta_2$ is the $T_2$-cover of $\exists \underline{e}\, \phi_2(\underline{e}, \underline{d}, \underline{y})$.
A remarkable corollary of the above theorem says that the existence of uniform interpolants is preserved when adding free function symbols to a convex, stably infinite, equality-interpolating, universal theory with uniform interpolants (this is because the combination with $\mathcal{EUF}$ enjoys the hypotheses of Theorem 11). Unfortunately, the convexity hypothesis is indispensable for this result to hold, as the following counterexample from [39,40] shows.
We take as $T_1$ the integer difference logic $\mathcal{IDL}$ of Example 2; notice that this theory is stably infinite, universal, and has quantifier elimination (thus, it coincides with its own model completion). This theory is not convex; however, it is equality interpolating, as seen in Section 3 above. As $T_2$, we take $\mathcal{EUF}(\Sigma_f)$, where $\Sigma_f$ has just one unary free function symbol f (this f is supposed not to belong to the signature of $\mathcal{IDL}$).
Proposition 2 [39,40]. Let $T_1, T_2$ be as above; then there are primitive formulæ in the combined signature that do not have a cover in $T_1 \cup T_2$. The counterexample still applies when replacing integer difference logic with linear integer arithmetic.
6. Conclusions
We investigated transfer results concerning the existence of quantifier-free (ordinary and uniform) interpolants to combined first-order theories. The investigation used semantic and model-theoretic tools in an essential way in order to obtain appropriate conceptualizations for justifying concrete algorithms.
Some problems are left open; in particular, the results concerning the case of non-disjoint signatures are far from being exhaustive. Indeed, in the case of non-disjoint signatures, sufficient conditions for the transfer of uniform interpolants are completely missing.
Advanced combination problems tend to be rather difficult in nature; however, very often, applications show unexpectedly interesting research perspectives that are worth pursuing. For instance, in [
40], a strong result (working only under the stable infiniteness hypothesis) for the transfer of existence of uniform interpolants is obtained in the case of special many-sorted disjoint signature combinations (called ‘tame combinations’) arising in the area of verification of data-aware processes [
34,
35,
69].
Applications are also important for testing the feasibility of the algorithms suggested by theoretical research in concrete implementations. As briefly mentioned in the introduction, frameworks for the verification of data-aware processes [
35,
69,
70] provide a particularly interesting setting where (combined) uniform interpolation plays a crucial role from the theoretical, the methodological/algorithmic, and the operational perspectives (see [
71] for an exhaustive introduction to this topic). In this context, complex dynamic systems that can interact with a persistent data storage are verified against some property of interest via sophisticated techniques based on SMT-solving and on automated reasoning; specifically, the presence of the ‘data’ component, which is usually represented as relational databases that the ‘process’ component can query and update, requires the development of suitable techniques for eliminating (to some extent) quantifiers binding variables that range over the content of such databases. This task can be effectively and efficiently performed by computing (combined) uniform interpolants [
31,
32,
39,
40]. This motivated the implementation of algorithms for computing combined uniform interpolants in the state-of-the-art MCMT model checker [
72]. We demonstrated the feasibility of this approach in [
71] by testing MCMT against a benchmark of concrete data-aware processes, and we showed in [
69,
70,
71,
73,
74] how these techniques turn out to be extremely useful for developing operational verification frameworks for modeling and verifying business processes enriched with concrete data that emerge in real-world scenarios and in business process management [
75] within contemporary organizations.