1. Introduction
In the field of logic, encompassing both mathematical and computational aspects, the question of whether to prioritize axioms or rules greatly influences the study of various problems. This dilemma affects areas like logic programming design, different axiomatizations in arithmetics, and automated reasoning, among others.
Designing rules that enhance demonstration by substituting axioms is a viable approach to addressing provability in a theory (e.g., see [
1]), provided that the impact of such substitution on the theorems of the latter is analyzed. For example, it is essential to manage induction in equational reasoning [
2] or noetherian induction by rewriting [
3], among others. Specific representation formalisms, such as geometric logic, can also benefit from the introduction of rules to simplify the axiom set [
4,
5].
In some applied fields, rule-based reasoning plays a pivotal role in various realms of mathematical or computational logic, the Semantic Web, and expert systems, among others, where rules provide a flexible and potent framework for representing and applying knowledge in a computer system.
Studying the (proof–theoretic) strength of a deductive system when we replace axioms for rules is not only interesting theoretically but also useful for tasks like designing and implementing computational proof systems.
This paper presents some theoretical tools and results that can be useful in enhancing our understanding of the (proof–theoretic) strength of systems representing knowledge by means of conditional axioms or, instead, by means of rules.
1.1. From Conditional Axioms to Rules
In this work, we address the following general question. Assume that a certain mathematical principle can be formalized in a first-order language by a set
E of conditional formulas of the form
(for instance, an induction principle). Then, given a first-order theory
T expressing some basic properties of the functions and relations involved,
T can be extended by adding the (universal closures of the) formulas in
E as new axioms. In this way, we obtain a new theory that we denote by
. However, it is also possible (and in some cases perhaps more convenient, see some examples below) to extend
T by using the involved principle to produce a new inference rule. This can be carried out, for example, by considering for each
in
E an instance
of a new inference rule denoted by
. Let
be the closure of
T under first-order logic and applications of this new rule. This approach can also be seen as a weaker version of the typical conversion process from a Hilbert-style axiomatic system to a (Gentzen) sequent system. After leaving out the external quantifiers, the conversion primarily involves identifying a sound decomposition for each axiom to generate the rules.
In general, will be a proper extension of , but how much stronger than can be? What is the exact relationship between both theories? And, are there other natural rules associated with a set E of conditional axioms that provide interesting information about ?
Before delving into the answers to these questions, let us present the scenario from which we extract both motivation and methods to deal with the previous problems: the study of conservation results between formal arithmetic theories. By a conservation result between two theories and , we mean a proposition stating that for some class of sentences and every sentence , if proves , then so does (and in such a case, we say that is -conservative over ).
Classical axiom schemes axiomatizing Peano arithmetics (say,
–induction,
–collection, and uniform
–reflection,
) share basic axiomatization and conservativity properties. It is a well-known fact that it is possible to develop a uniform treatment of conservation results for these schemes and, as a matter of fact, several (mainly proof–theoretic) methods providing uniform derivations of the basic conservation results are known. These methods include Herbrand Analysis as developed by W. Sieg in [
6,
7], S. Buss’ witnessing method both in Peano arithmetic and in bounded arithmetic (see [
8,
9]), the model–theoretic approach to Herbrand Analysis developed by J. Avigad in [
10], and several approaches based on applications of cut elimination, see for instance Parsons’ [
11], or more recently, the approach followed by L. Beklemishev in [
12,
13] (also closely related to Sieg’s method).
However, the basic ideas of these methods are, to a great extent, independent of any specific arithmetic notion and, therefore, could be developed for arbitrary (countable) first-order languages and theories. In order to explore this intuition, we can isolate the main characteristics of these arithmetical contexts.
Most axiom schemes that are considered in first-order arithmetic produce (when restricted to some class of formulas) sets of conditional axioms; that is, formulas of the form
where
denotes a sequence
of possible free variables. A classical example from formal arithmetics is the
–induction scheme,
, given by the formulas
where
varies within the class of formulas
in the arithmetic hierarchy. We can also introduce an inference rule,
, associated with
E in a natural way. Namely,
E–Rule denotes the inference rule whose instances are
for each
. The rule corresponding to the previous
–induction scheme is denoted by
and consists of instances of the form
where
.
Then, given a base theory T, we are interested in conservation properties of the extensions of T, , and (the closure of T under first-order logic and applications of .) The analysis of these conservation properties typically proceeds from some level of syntactical complexity (represented for some class of formulas ) and (for some class of formulas containing ) studies of –conservation results between the theories associated with E over the base theory T.
It is a classical theorem in the proof theory of first-order arithmetics (proved by Parsons [
11]) that (over a base theory,
, axiomatized by some elementary arithmetic facts) the scheme
is
-conservative over
. More precisely,
is a
-conservative extension of
. In this work, we isolate conditions under which a similar result to Parsons’ theorem can be derived for a given set of conditional axioms
E. We show that if the formulas in the set of conditional axioms
E have a suitable syntactical complexity, then the conservation properties of the theories associated with
E can be derived from simple conditions on the
. In order to derive these general counterparts of Parson’s theorem, the following auxiliary notions are introduced.
If, for each
, the formulas
and
have a low syntactical complexity with respect to the basic level
(see Definition 6), then we say that
E is a set of
normal conditional axioms with respect to
. Let us denote by
the set of universal closures of boolean combinations of formulas in
. Then, the set of
consequences of
can be described using an auxiliary inference rule
, with instances
where
and
is a conjunction of formulas that belong to
or are negations of some formulas in
. Using a model–theoretic argument (based upon simple properties of a general notion of an existentially closed model, following ideas developed by Avigad in [
10]) we prove that for every theory
T (axiomatized by formulas with restricted syntactical complexity, see Corollary 1),
Let us remark here that the auxiliary
is a natural device in the proof–theoretic approach to conservation results for arithmetical theories via cut elimination. As a matter of fact, the rule
was considered by J. C. Shepherdson in his analysis of open induction (see lemma 2.3 in [
14]). A very similar rule was also used for
induction by Parsons (see section §3 in [
11]) and more recently by Beklemishev in his work on
–collection (see [
12]) and
–induction (see [
13]). Our proof of Corollary 1 (as a consequence of Proposition 2) provides a model–theoretic interpretation of these proof–theoretic arguments.
Corollary 1 suggests the following notion (see Definition 7): we say that E is weakly Π-reducible modulo T if and are equivalent for every theory extending T. Putting it all together, we can state a version of Parson’s theorem in a more general context where the axiomatization of T has again a restricted syntactical complexity (see Theorem 1 for this and other related conservation results):
This result, and more generally Theorem 1, can be considered as a reformulation of some aspects of Kaye’s model–theoretic work in [
15] where, using Henkin constructions, Kaye derived conservation results in the spirit of Theorem 1. As pointed out in the survey paper [
16], a uniform model–theoretic treatment of conservation results between several induction schemes and their parameter-free versions can be obtained using these ideas. The model–theoretic core of this uniform approach was developed in a general setting in [
17]. The main conceptual difference between Kaye’s approach and our exposition in this paper is, on top of the use of existentially closed models, the emphasis on the role played by inference rules in these results that, we think, allows for a more systematic presentation.
In a similar vein, in
Section 5, we reinterpret, using the notion of a set of normal conditional sentences, a conservation result obtained by Kaye (see Theorem 1.4 in [
17]). Namely, we show that (see Theorem 3), given a theory
T and a finite set
E of normal conditional sentences with respect to
, then
For the reader’s convenience,
Table 1 summarizes the axiomatizations of the theories we will work with in this paper. Given a first-order language
L, let
E denote a set of conditional axioms of the form
and let
denote a basic fragment of
L (as defined in Definition 3). The first two theories are given by a base theory
T together with a set of axioms. The remaining theories are constructed using (Hilbert-style) inference rules and thus are defined by the closure of the base theory
T under applications of the corresponding rule.
Table 2 summarizes the main conservation results obtained in this paper. For a given base theory
T, the theory
is conservative over each of the following subtheories (it is important to note that, for the first three conservation results, we also require the base theory
T to be
-axiomatizable, whereas the remaining four hold for base theories
T of arbitrary syntactical complexity).
1.2. Aim and Structure of the Paper
In this paper, we prove some conservation results between, on the one hand, an arbitrary theory axiomatized (over a basic theory
T) by a set of conditional axioms
E and, on the other hand, theories axiomatized (again over
T) by (nested) applications of the rules
and
. Although the results included here are not essentially new and the corresponding arithmetical versions are well-known, we think that the model–theoretic approach that we develop in their proofs is rather simple and clear. This simplicity aids in making the whole presentation of these kinds of results in a general context very easy to follow. In a fundamental sense, this paper revisits a substantial part of the work of Kaye in [
16,
17] through the light of the methods introduced more recently by Avigad in [
10]. It also reformulates Kaye’s results in terms of natural inference rules, as can be found in more standard proof–theoretic works [
12,
18]. We hope this reformulation, together with the model–theoretic approach we adopt here, can contribute to making the logical content of these results more accessible to a wider audience, including researchers working on topics with no direct connection with formal arithmetics.
The structure of this paper is as follows. The next section,
Section 2, specifies the basic notions and notation used in the paper. In
Section 3, we present the basic model–theoretic device used throughout the paper to derive conservation results. Here, a generalization of the notion of an existentially closed model plays a central role. The notion of a normal conditional axiom is introduced in
Section 4, where some conservation results between the theory axiomatized by conditional axioms and the one obtained by considering the associated rules are established. The specific case of a finite set of conditional sentences is analyzed in
Section 5. This paper concludes with some considerations about the results obtained and possible lines of future research.
2. Inference Rules and Conditional Axioms
We always work in classical first-order logic with equality. Let us fix a countable first-order language L. A formula is a literal of L if it is atomic or negated atomic.
As usual, a theory T is a set of sentences of L closed under logical consequence (that is, for each formula , if , then ). An axiomatization of T is a set of formulas such that .
Through this paper, we shall deal with different sets of formulas built up from a basic distinguished set using several syntactical operations. So, we shall begin by defining these operations. Given a set of formulas , the following notation will be used:
is the set of formulas .
is the set (the sets , , ,…are defined accordingly using the appropriate blocks of initial quantifiers).
(resp. ) is the set of all finite conjunctions (resp. disjunctions) of formulas of . is the closure of under disjunctions and conjunctions.
denotes the set of boolean combinations of formulas in .
As usual, a tuple of elements (or variables) is abbreviated by , and we write to indicate that the free variables of are among the tuple .
For a given base theory T, extensions of T can typically be axiomatized in two ways: (1) by adding a new set of sentences E to T and closing under logical consequence, or (2) by closing T under (first-order logic and) applications of some new inference rules. In this paper, we shall explore the relationship between both extensions.
Firstly, we recall some basic notions and terminology on inference rules introduced by Beklemishev in [
18]. By an inference rule, we mean a set of instances, that is, expressions of the form
where
are formulas. If
R is an inference rule, then
denotes the closure of
T under first-order logic and
unnested applications of
R (that is, a proof in
may contain several applications of
R but they are not to occur on the same branch within the proof). By recursion on
, we define
and
. The
closure of
T under first-order logic and applications of
R is
.
Definition 1. Let and be rules and let U be a theory.
- 1.
The rule is derivable from modulo U if for every extension T of U, extends .
- 2.
The rules and are equivalent modulo U if for every extension T of U, (that is, they are equivalent theories).
- 3.
The rule is reducible to modulo U if for every extension T of U, extends .
- 4.
and are congruent modulo U if for every extension T of U, .
Many significant mathematical and combinatorial principles are expressed in the form of implications, where the satisfaction of a certain condition A entails the presence of another condition B. The formal representation of these principles corresponds to formulas whose outermost connective is the logical implication symbol →. This motivates the following definition.
Definition 2. We say that a set E of L-formulas is a set of conditional axioms if every element of E is a formula of the form (recall that for a formula φ, we write to mean that the free variables of φ are among the variables ).
Let
T be an
L-theory and let
E be a set of conditional axioms. Then, by definition,
is the theory axiomatized by
T plus the universal closure of every formula in
E, i.e., the theory given by
T plus
for each
.
A natural inference rule,
, can be associated with
E by considering the instances
for each
.
In this paper, we explore the relationship between the theories , , and for a given set of conditional axioms.
In order to be able to determine more precisely the relative proof–theoretic strength of these theories, we shall fix some level of syntactical complexity for the formulas considered. This will be performed through the notion of a basic fragment of a first-order language:
Definition 3. A set of formulas of L, Π, is a basic fragment of L if Π satisfies the following conditions:
- 1.
Every atomic formula of L belongs to Π.
- 2.
If and θ is a subformula of φ, then .
- 3.
If and t is a term of L, then .
In short, a basic fragment is a set of formulas that includes all atomic formulas and is closed under subformulas and term substitution.
Let us enumerate a few natural examples of basic fragments.
The sets and of formulas of L (where denotes the class of open formulas of L and for each , and ).
The set of all literals of L (a formula is a literal if it is atomic or negated atomic).
The set comprising all clauses (i.e., disjunctions of literals) of L.
In the context of arithmetic languages, other examples are the classes in the arithmetical hierarchy
and
,
, (see [
19] or [
20]); the sets
and
,
, in the
hierarchy of bounded formulas (see [
21]); and the sets
and
,
, of strictly bounded formulas considered in bounded arithmetic (see [
20]).
Given a basic fragment
, we associate
E with a new set of conditional axioms denoted by
and given by the set of all formulas
where
, and for each
,
. We will also consider its associated inference rule,
, given by the set of instances
where
and for each
,
.
Let us observe that is equivalent to (both are different axiomatizations of the same theory) but, in general, the theories and are not equivalent. Indeed, always extends (to simulate an application of , one applies the corresponding instance of with , which belongs to for any basic fragment). However, in general, may not necessarily be an extension of .
3. A Model–Theoretic Standpoint
In this section, we introduce the machinery that we will use to derive our conservation results between theories and , where E is a set of conditional axioms.
The methods we use in this paper are model–theoretic in nature and essentially follow the methodology introduced by Avigad in [
10] who, in turn, mentions that in the context of bounded arithmetic, this methodology has been used in Zambella [
22], where it is attributed to unpublished work by Albert Visser.
Central to our approach is the notion of an
-
closed model of a theory
T, where
is a fixed but arbitrary basic fragment. This notion is a natural generalization of the well-known concept of an existentially closed model and extends the concept of a Herbrand-saturated model introduced in [
10].
All the languages and models considered through this paper are countable. Given two
L structures
and
and a set of formulas
, we shall write
to express that
is a
-elementary extension of
; that is,
is a substructure of
, and for each
and each
, we have
Definition 4. Given a basic fragment Π
of a first-order language L and an L structure , the Π
-diagram of is the set of sentences of the language given by Remark 1. Let Π be a basic fragment of a language L and let be an L structure.
- 1.
If is an L structure and is a substructure of , then - 2.
For every L-theory T, the following conditions are equivalent:
- (a)
is consistent.
- (b)
There exists such that .
Remark 2. Let us observe that, for every two L structures and , we have The next definition introduces a straightforward generalization of the notion of an existentially closed model.
Definition 5. Let be an L structure. We say that is an -closed model of T if , and for each , implies
Observe that taking
, we obtain the classical notion of an existentially closed model from Model Theory; taking
, we obtain the notion of a Herbrand-saturated model from [
10]; taking
, we obtain the notion of a 1-closed model from [
23]; and taking
, we obtain the notion of an
-closed model from [
24].
The usual chain argument for constructing existentially closed models provides us with an existence lemma. The proof of Lemma 1 is rather standard but we include a sketch so that the reader can check where the properties defining the notion of a basic fragment are needed.
Lemma 1. Let T be a -axiomatizable consistent theory. Then, for each , there exists an -closed model of T, , such that .
Proof. Let
be a model of
T. In the first step, we will construct a model of
T,
, satisfying
, and for each
with
and for all
and
,
To this end, we will form a chain of models of
T of power
,
and take
.
Let be an enumeration of all formulas in with parameters from .
: Put .
: If is consistent, there exists such that and . Define to be . If is inconsistent, define to be .
Let us check that satisfies the required properties.
. As usual, by induction on the syntactical complexity of the formulas in , it is easy to see that the union of the chain is a -elementary extension of each model in the chain (here we use the assumption that a basic fragment is closed under subformulas and term substitution).
. It follows from the fact that -axiomatizable theories are preserved under unions of -elementary chains. In fact, let be an axiom of T, where , and consider . Pick such that . Since is a model of T, there are such that . Since and , . Hence, , as required.
Consider with , and such that . Pick such that is . Clearly, is consistent and so by construction. But then since .
Repeating the construction
times, we obtain a chain of models of
T:
such that any
sentence with constants from
that holds in some extension of
, which is a model of
T, holds in
as well. Take
. It is clear that
and
is an
-closed model of
T. □
Our basic device to prove the conservation results is the next lemma, which is a general version of Theorem 3.4 in [
10].
Lemma 2. Let T be a -axiomatizable theory and let be a theory such that every -closed model of T is a model of . Then, is -conservative over T.
Proof. Let be a sentence such that . We must show that .
Suppose that . Then, there exists . Since is an sentence, is a -axiomatized, consistent theory and, by Lemma 1 and Remark 2, there exists an -closed model of T, , such that . Firstly, by the assumption on the theory , is a model of . Secondly, put , with , and pick , satisfying that . Since , , and so . Then, , a contradiction. □
In order to apply Lemma 2, we will need the following result that mirrors theorem 3.3 outlined in [
10]. It establishes a connection between validity within an
-closed model of
T and the provability within the theory
T itself.
Proposition 1. Let be an -closed model of T and , such that . Then, there exist and such that Proof. Since and , there are and such that is and . Since is -closed, is inconsistent. Indeed, assume that is consistent. By Remark 1, there exists such that . It follows from the fact that is an -closed model of T that . But then since , which contradicts the fact that .
Therefore,
In particular,
and hence
. By compactness, there exist
such that
. Since the constant symbols
do not appear in the language of the theory
T, we obtain
as required. □
4. Normal Conditional Axioms
After introducing the basic model–theoretic machinery in the previous section, we are now ready to establish our first general conservation theorem between axioms and inference rules. To this end, we first need the following simple yet useful lemma.
Lemma 3. Let T be a theory and let E be a set of conditional axioms. Then, for every basic fragment Π, and are congruent modulo T.
Proof. Since
, it is enough to show that for every theory
U extending
T,
extends
. Even more, since
is closed under conjunctions and negation, it is enough to show that for all
and
, if
, then
Given
, there are
such that
Then, since
, we have
Then,
n (unnested) applications of
give us
and, therefore,
as required. □
The next Proposition establishes a general conservation theorem between a base theory T augmented with a set of conditional axioms E and the associated inference rule theory , where is an appropriate basic fragment. Namely, under very general conditions, it is possible to replace the use of an axiom from E by the use of an inference rule at the price of adding certain side formulas from the class during the application of the inference rule.
Proposition 2. Let T be a theory, let Π be a basic fragment, and let E be a set of conditional axioms such that
(S1) For every , is T-provably equivalent to an formula; and
(S2) is -axiomatizable.
Then, is -conservative over .
Proof. By Lemma 3, –Rule and –Rule are congruent and hence it is sufficient to show that is -conservative over –Rule. Note that is also a basic fragment and that . By condition () –Rule is -axiomatizable. Hence, by Lemma 2 for the basic fragment , it suffices to prove that every -closed model of –Rule is a model of .
Let be an -closed model of –Rule. Consider and such that . We must show that .
By condition (
), there exists
such that
and so
. By Proposition 1 for the basic fragment
, there exist
and
, satisfying
and
Then,
and, by an application of
, we obtain
Therefore,
since
–Rule and
. □
Please note that conditions (S1) and (S2) of the above Proposition are satisfied by every theory and every set of conditional axioms with suitable syntactical complexity. Namely, if T is a -axiomatizable theory and E is a set of conditional axioms such that for all we have and , then T and E satisfy (S1) and (S2). According to this, in what follows we focus on -axiomatizable theories and sets of conditional axioms of restricted syntactical complexity. This motivates the following definitions:
Definition 6. A formula is a normal conditional axiom with respect to Π if (modulo logical equivalence) and .
If, instead, for some theory T, is T-provably equivalent to a formula and is T-provably equivalent to a formula, then we say that is a normal conditional axiom with respect to Π over T.
Remark 3. In the context of formal arithmetic, there are a good number of combinatorial or logical principles that can be naturally expressed as a set of normal conditional axioms with respect to a suitable basic fragment Π. For instance, the induction principle and the collection or Replacement principles are prominent examples.
In a non-arithmetical context, an interesting example of normal conditional axioms could be the geometric ones (cf. [25]). A geometric
axiom is a formula following the geometric axiom scheme below:where each is an atom and each is a conjunction of a list of atoms , and none of the variables in any are free in the ’s. It is easy to check that a set of geometric axioms, E, is a set of normal conditional axioms with respect to the basic fragment consisting of the atomic formulas of the language, At.
In view of the discussion preceding Definition 6 and as an immediate corollary of Proposition 2, the following result is obtained.
Corollary 1. Let Π be a basic fragment and let T be a -axiomatizable theory. Let E be a set of normal conditional axioms with respect to Π over T. Then, is -conservative over –Rule.
The previous corollary establishes a broadly applicable conservation result between a set of conditional axioms and the naturally associated inference rule –Rule. Remarkably, in Corollary 1, we only imposed certain syntactical conditions on the quantifier complexity of the involved theories. Therefore, this conservation phenomenon remains independent of the specific combinatorial or mathematical principles that the set of conditional axioms E could express.
Remark 4. It is important to notice that these conservation results are properties of the given axiomatizations of the theories . That is, there are sets of conditional axioms and such that and are equivalent theories but the associated inference rules and significantly differ in strength.
A set of geometric axioms E can be used to illuminate this aspect of the approach developed in this work. Let us observe that if E is a set of geometric axioms, then each element in E is a conditional axiom where is a conjunction of atomic formulas. As a consequence,is an instance of (with ) and it follows that, for every theory T, is equivalent to , rendering trivial every conservation result between both theories. However, let us observe that if we putthen and D is also a set of normal conditional axioms with respect to Π.
By Proposition 2, is -conservative over , but now is a theory presumably weaker than (since the applications of rule only produce formulas).
Moreover, Corollary 1 suggests a natural scenario in which the conservativity of over the directly associated inference rule –Rule can be established, namely when E–Rule and –Rule are shown to be equivalent rules. This prompts the following definition.
Definition 7. Let T be a theory and let E be a set of conditional axioms. We say that
Paradigmatic examples of
-reducible sets of conditional axioms are provided by the different versions of the induction principle in first-order arithmetics, usually formulated by means of a scheme. In particular, the open induction scheme gives us a simple but very clear example that was already studied in the early 1960s by Shepherdson (see [
14]). Let us consider a first-order language
L extending the language of first-order arithmetics. Let
T be a theory in the language
L axiomatized by
sentences and let
E be the set of conditional sentences generated by the scheme
where
varies within the set
of all open formulas of
L. Then,
To see this, it is enough to notice that, given
such that, for some extension
U of
T,
then the sentence
can be derived in
from a single instance of
as follows:
Let
be the formula
. Then,
and, therefore,
, but this last sentence is easily seen to be equivalent to
.
In [
14], the theory
was denoted by IAO and
by RIO. Bearing in mind Corollary 1 and
, we obtain an alternative proof of Theorem 2.2 in [
14] stating that IAO is
-conservative over RIO.
Now we are ready to prove our first general conservation theorem of a theory from
over the associated inference rule theory
–Rule. As a by-product, we will also obtain the conservativity of
over a certain
parameter-restricted version of that theory. In fact, given a set of conditional axioms
E, we define the set of
sentences(
U stands for
uniform, for in order to apply an axiom of
, the antecedent
must be uniformly true, i.e., true for all values of the parameters
). It is clear that
implies
, which, in turn, implies
–Rule
Theorem 1. Let T be a -axiomatizable theory and let E be a set of normal conditional axioms with respect to Π over T. If E is weakly Π-reducible modulo T, then
- 1.
is -conservative over .
- 2.
is -conservative over .
- 3.
In fact, if a theory D satisfies that every extension of is closed under , then is -conservative over .
Proof. Part (1) directly follows from Proposition 2. Note that conditions () and () in the statement of Proposition 2 are satisfied since E is a set of normal conditional axioms with respect to over T and –Rule and –Rule are equivalent since E is assumed to be weakly -reducible modulo T.
Let us prove part (2). Let be a sentence such that . Then, is consistent and -axiomatizable; hence, by Lemma 1, there exists an -closed model of , say .
Observe that is closed under E–Rule. By weak -reducibility modulo T and Lemma 3, is also closed under –Rule. Hence, reasoning as in the proof of Proposition 2, we obtain that . In particular, and so .
As for part (3), let us observe that extends (for otherwise there would be such that and so would not be closed under E-Rule, contradicting the hypothesis on D). Hence, part (3) follows from part (2). □
Remark 5. Theorem 1 provides a general method for proving the conservativity of a set of axioms E over the associated inference rule E–Rule: (i) expressing E as a set of normal conditional axioms with respect to an appropriate basic fragment Π, and (ii) showing that E is weakly Π-reducible (i.e., –Rule is derivable from E–Rule).
In the realm of arithmetic, significant results can be derived from this approach. For instance, it can be readily demonstrated that the theory of -induction can be formulated as a set of normal conditional axioms with respect to the basic fragment . Subsequently, the resulting –Rule can be derived from (or, more precisely, reduced to) –IR modulo . This leads to a proof of the well-known fact regarding the conservativity of over . Through this approach, numerous other significant conservation results for arithmetic theories can be proved.
In the setting of formal number theory, a natural question regarding the proof strength of an arithmetic theory
T is to characterize the
consequences of the theory; that is, the set of all theorems of
T of a fixed quantifier complexity
. To fix notation, given a theory
T and a set of formulas
in the language of
T, we denote
Two prototypical results in this regard are the well-known facts that
characterizes the
consequences of
-induction
and that
characterizes
.
In [
17], Kaye already observed that these fundamental facts can be extended to a broader, arithmetic-free context, and that they can be established by using simple model–theoretic arguments. Our Theorem 1 provides an alternative proof of Kaye’s observation. Specifically, suppose that
T is a
-axiomatizable theory and that
E is a set of conditional axioms satisfying that if
, then both
and
are in
(possibly modulo
T). Then,
–Rule is
-axiomatizable and
is
-axiomatizable. Therefore, if
E satisfies the assumptions of Theorem 1, we obtain characterizations of
and
. Namely,
–Rule captures, precisely, the
consequences of
, and
captures, precisely, the
consequences of
.
To close this section, we show how to derive from Theorem 1 another result of Kaye regarding general
L-theories. Namely, Theorem 1.1. of [
17] establishes that if
T is any
-axiomatizable
L-theory (
), then
Here we obtain a slightly more general version of this result: if
is a basic fragment of a language
L and
T is a
-axiomatizable
L-theory, then
and
coincide (note that Kaye’s result can be recovered by taking
).
Theorem 2. If T is a -axiomatizable theory, then Proof. Without a loss of generality, it suffices to prove the result under the assumption that the basic fragment is closed under boolean combinations. Assume and let T be any -axiomatized theory. We must show that and are equivalent theories.
First of all, observe that
T can be axiomatized by a set of conditional axioms as follows. Let
denote the theory in the language of
T with no non-logical axioms and define
where
denotes the (false) sentence
. It is clear that
. Now consider the new set of conditional axioms
; that is,
E is the set of conditional axioms given by
where
,
, and
. We also have
and now it is immediate to verify that
E is a set of normal conditional axioms with respect to
, which, in addition, is
-reducible modulo
. By Theorem 1, we obtain that
T is
-conservative over
, which, by definition, is given by the set of sentences
where
,
, and
. But it is straightforward to check that
can be rewritten as a set of sentences which, modulo logical equivalence, are in
. Namely,
where
,
, and
. Consequently,
implies
, which, in turn, implies
by conservativity. For the opposite direction, observe that (modulo logical equivalence) every
sentence can be rewritten as an
sentence. □
5. Finite Sets of Conditional Sentences
In the previous section, we obtained a number of conservation theorems of
over
or
for a general set of normal conditional axioms
E. In this section, we will prove finer conservation results for the particular case where
E is a finite set of normal sentences. In other words, we are interested in cases where
E can be expressed as
where
and all
s and
s are sentences, i.e., they contain no free variables.
Again, the original motivation for considering these particular sets of conditional axioms comes from results in the context of formal arithmetics. A well-studied fragment of first-order Peano arithmetic is the scheme of parameter-free
-induction
given by a basic algebraic theory
together with
where
; that is,
and contains no other free variables than the induction variable
x. Note that
can be seen as a set of normal conditional
sentences with respect to
. It is well-known that
and its parameter-free counterpart
share the same
consequences (indeed, the
consequences are also preserved), but
enjoys the following nice property:
Let be a sentence. If for some we have , then .
The previous property is a well-known conservation theorem for fragments of arithmetic obtained (independently) by Z. Adamowicz, T. Bigorajska, G. Mints, and also by Z. Ratajczyk. The result generalizes to for each , but we cannot expect to have a similar result for (parametric) since is well-known to be finitely axiomatizable.
At first sight, the previous result for
could seem to be a very particular property of the induction scheme in the formal arithmetic setting. However, and this was already observed by Kaye in [
17], this property again corresponds to a very general purely logical fact for theories described in terms of conditional sentences, see theorem 1.4 in [
17] (let us observe that similar results in the context of bounded arithmetic theories have been obtained by Jeřábek in [
26].)
In the present section, we shall prove a (slightly more general) version of theorem 1.4 in [
17] using our methodology. Namely, we shall obtain a conservation theorem relating the number of conditional sentences needed to derive a
formula from
E and the depth of the nested applications of the corresponding
, see Theorem 3 below.
Through this section,
will denote an arbitrary basic fragment. We shall begin with an analysis of
when
E is a set of conditional sentences but
E is not necessarily a finite set. Let us observe that, since
E is a set of sentences and, by Lemma 3,
is congruent with
, it is straightforward to check that
is congruent with the following rule (that we shall denote by
):
This motivates the introduction of a kind of dual version of
that we shall denote by
. Instances of this new rule are
Please notice that the subscript ∃ in the name of the –Rule indicates that, as shown in Theorem 3 below, under certain assumptions, is -conservative over applications of this rule. Similarly, the subscript ∀ in the name of the –Rule indicates that, under certain assumptions, is -conservative over applications of –Rule.
Lemma 4. Let T be a theory and let E be a set of conditional sentences.
- 1.
For every sentence , .
- 2.
For every sentence ,
Proof. (1) We only prove that, for all sentences , extends (the opposite direction is trivial).
Let
be such that
, for some sentence
. Since
is
for some
(and we can assume that the variables in
are all different from the ones in
), we obtain
But recall that, by Lemma 3,
is congruent with
, and therefore it follows that
. Thus,
and the result follows.
(2) We prove that, given , extends .
Let
be a sentence such that
for some
. Since
and
are (respectively)
and
for some
(and we can assume that the variables in
are all different from the ones in
), we obtain
So,
, and the result follows. □
The first interesting fact about sets of conditional sentences is the following improvement of Proposition 2.
Lemma 5. Let E be a set of conditional sentences such that if , then . Then, for every theory T, is -conservative over .
Proof. Let denote the theory axiomatized by . We shall prove that every -closed model of is a model of . Thus, the result will follow from Lemma 2.
Let be an -closed model of . First of all, note that
(•) is consistent.
Proof of (•): For otherwise by compactness, there would exist and satisfying and . Since , we would have , a contradiction.
Hence, there exists such that . Let us show that .
Pick such that . It follows from and the fact that is an -closed model of that . Since and , we also have . But note that every sentence can be rewritten (modulo logical equivalence) as an sentence.
Therefore, by applying Proposition 1, we obtain that there exist
and
such that
and
. Then,
(recall that
and
are congruent rules) and so
. But,
since
. Therefore,
, as required.
We have thus shown that there is such that and, therefore, , as required. □
We now turn to the study of the case where E is a finite set of conditional sentences. First, we introduce some notation. If , where is a conditional sentence, then will be denoted by and will be denoted by . The next lemma shows that for every conditional sentence , nested applications of (or ) collapse to unnested applications of the rule.
Lemma 6. Assume ψ is a conditional sentence of the form .
- 1.
If , then .
- 2.
If , then
Proof. (1): By Lemma 3, and are congruent, and thus we can assume, without loss of generality, that is closed under boolean combinations (that is, ). In addition, we also take advantage of the fact that, since we are dealing with conditional sentences, each instance of can be easily transformed into an equivalent instance of . Thus, in the following proofs, we shall deal with instances of , although we refer to them as instances of .
First, we show that
unnested applications of
can be replaced by a single unnested application of this rule: Let
be an
sentence equivalent to
, where
are sentences such that for each
, we have
. Then,
, and as a consequence, since for each
,
we obtain that these
k applications of
corresponding to
can be replaced by the instance given by the sentence
.
Now we show how to deal with nested applications of
. Let
be sentences such that
Let be a sentence equivalent to . Let us see that .
Indeed, we argue using T and assume that holds. Firstly, if holds, then we obtain since by our hypothesis. Secondly, if does not hold, then we have and . But, since by hypothesis , it follows that . As a consequence, we obtain again, as required.
We showed that and therefore one application of is enough to derive , which is equivalent to . So, two nested applications of can be replaced by one unnested application of the rule, and the equivalence between and follows.
(2) A straightforward modification of the previous proof allows us to derive part (2). We must notice that is congruent with , as can be easily seen. □
Now we consider a finite set of conditional sentences. In this case, applications of the corresponding (respectively, ) can be described in terms of the set of rules (respectively, ), . We shall study the interaction of these m rules and derive a collapse result.
Proposition 3. Let be a finite set of conditional sentences with cardinal m. Then, for every theory T,
- 1.
If for every , , then .
- 2.
If for every , , then .
Proof. (1) As we pointed out in the proof of Lemma 6, we can assume, without loss of generality, that is closed under boolean combinations. In addition, we deal with instances of as instances of . Now, the proof proceeds by induction on , the number of sentences in the set E.
: In this case, the result is just part (1) in Lemma 6.
: Consider where for each , is a conditional sentence with . We assume that the result holds for every set of conditional sentences with cardinal m.
In order to derive the result for
E, it is enough to show that
is closed under
for each
. Let us assume that
for some
p and
. Put
. It is enough to show that
, since then it will follow that
, as required.
We observe that
is equivalent to an
sentence and, as a consequence, by Lemma 4,
Now the crucial point is the following fact:
Claim:
Proof of Claim: We shall prove this by showing, by induction on , that for all , extends :
Consider
. If
, then, obviously,
; hence, let us assume that
. Then,
Now, by the induction hypothesis (on
m),
, and so
In view of the claim above, we obtain , and therefore, . This shows that and concludes the proof.
(2): It is easy to adapt the proof of item (1). We omit the details. □
Remark 6. Let us note that Proposition 3 also holds for every finite set of conditional sentences, E, satisfying that for all , .
For instance, we can deal with part (2) of Proposition 3 just by puttingand having in mind that for all T, if , then we obtain , since and . Therefore, , and it follows thatwhere . We can deal with part (1) in a similar way. We are now in a position to prove the main result of the present section. From Proposition 3 and Lemma 5, we derive the following version of theorem 1.4 of [
17].
Theorem 3. Let T be a theory and let E be a finite set of conditional sentences such that for every , . Let m be the number of elements of E. Then,
- 1.
is -conservative over .
- 2.
is -conservative over .
Proof. In view of Remark 6, part (1) follows from Lemma 5 and Proposition 3.
We give a direct proof for part (2).
Let be a sentence such that . We shall prove that by induction on :
: If
and
, then
. As a consequence,
. Since
, we obtain
and, therefore,
.
: Let
be a set of sentences. First, we consider the case where
for all
. Let
be a sentence such that
. Then,
and, in particular, it follows that
. For each
, let
be the theory
where
. Then,
, and reasoning as in case
, we obtain
. If
, then by the induction hypothesis,
. We proved in this way that for all
l,
. As a consequence,
and it follows that
.
For the general case, we put
and observe that
. Then, by the previous restricted case,
is
-conservative over
(where
). Hence,
is
-conservative over
which is, obviously, a subtheory of
. □
Corollary 2. Let E be a set of normal conditional axioms with respect to Π
over T such that E is Π
-reducible modulo T. If F is a finite subset of with m elements or F is a finite set of m sentences included in E, then 6. Conclusions and Future Work
Both in pure and in applied logic, the question of whether it is more convenient to formalize a certain mathematical principle as an axiom or as an inference rule is important and ubiquitous.
In this work, we developed a general logical framework that allows for replacing axioms with corresponding inference rules without greatly affecting the proof–theoretical strength, preserving theorems up to a certain level of quantifier complexity. While these results are familiar to logicians working in arithmetic formal theories, we believe they could also benefit a broader audience in logic.
The proof methods we use are conceptually very simple: we essentially combine syntactical manipulations and basic model–theoretic constructions. This should make the article accessible to a wide audience.
Several avenues for future research suggest themselves. Firstly, it is natural to ask whether the main results of the present paper also hold for other settings different from classical first-order logics (such as, for example, intuitionistic logic or minimal logic). In order to explore this line of future work, one should have to isolate first a suitable notion that could play the role of the -closed models for these new settings. Secondly, and still from a theoretical point of view, it would also be of interest to explore possible applications of the obtained results to other formal theories different from the arithmetical ones (for example, theories axiomatized by geometric axioms could serve as a first field of study). Finally, from an applied perspective, it would be desirable to investigate possible applications to rule-based reasoning in computational logic. In areas such as logic programming, expert systems, or the Semantic Web, rule-based knowledge representation is a crucial and powerful tool, and implementations of rule-based systems naturally emerge. A formal analysis of the axioms-as-rules strategy could be of interest in these fields.
We close this paper with some pointers to the kind of applications that we think deserve further exploration.
6.1. Description Logics
Conditional axioms are frequently used as "general axioms" in description logics. For example, in
logic, consider the axiom
which, when translated to first-order logic (with
A and
B being concept names translated to unary predicates and roles
r,
, and
as binary relations), becomes
which is equivalent to the normal condition axiom (defined in
Section 4):
An example of the utility of introducing rules would be as a mechanism for simplification through design patterns. Consider the following example extracted from [
27]:
Note that, since it is not possible to quantify over concepts in typical description logics, expressing the fact that every subclass of animals only has a child of the same subclass is not feasible. Nevertheless, by employing a certain type of (meta)rules [
27], it is possible to obtain a representation of such information, which allows for the elimination of certain axioms from the ontology:
which could be added to the standard tableau-based consistency class algorithm as a rule that acts on classes:
The question that arises with this transformation is what degree of conservativity is maintained between the original ontology and the new representation .
In the framework of first-order logic, we could deal with g as an axiom scheme, where X ranges over a convenient class of formulas. In this way, we could obtain a set of conditional axioms E and applications of the (meta)rule R that will correspond to applications of the . Nevertheless, although we do not discard that some conservation results could be derived by a more or less direct application of the machinery developed in this paper, the chief question here would be: Is it possible to derive, in the setting of description logics, some results and techniques similar to the ones developed in this paper for first-order theories? In particular, can such an approach be useful in settling the exact conservation between the ontologies and (or other similar cases)?
6.2. Coherent/Geometric Logic
In Remark 3, we briefly discussed the notion of a (finitary) geometric axiom (also known as a coherent formula). There we pointed out that a set
E of geometric axioms provides us with an example of a normal set of conditional axioms with respect to the basic fragment
of all atomic formulas of the language. We also noticed that if we put
then
, and
D is also a set of normal conditional axioms with respect to
. Therefore, by Proposition 2,
is
-conservative over
.
It is a theorem of Barr that if a geometric sentence is derivable from a geometric theory using classical (first-order) logic, then it is also derivable using intuitionistic logic. As noted in [
25], this result can be easily derived using a cut–elimination argument. However, can the model–theoretic methods that we presented in this paper be adapted to derive Barr’s theorem or some related result?
Towards a positive answer to this problem, the following question could be considered. In [
28], Coste, Lombardi, and Roy deal with
dynamical theories axiomatized by geometric axioms (in [
28], the term "dynamical axiom" is used to refer to geometric axioms). A key notion in that work is the notion of a
dynamical proof (closely related to a derivation of a basic sequence in a cut-free system with mathematical rules, as pointed out in [
25]). Theorem 1.1 in [
28] shows that there is a construction associating a dynamical proof to each classical proof of an atomic formula
B from a set of atomic formulas
R. Is it possible to transform a proof of a geometric sentence in
in (some kind of) a dynamical proof?
6.3. Inference Rules and Automated Reasoning
The examples we just mentioned are interesting not only from a theoretical point of view. The inference rules we described in these examples can be implemented as tools for automated reasoning. Automated theorem proving is an important research field with applications in Artificial Intelligence and Program Verification (just to mention two application areas). Inference rules provide the base for efficient implementations of the reasoning principles expressed by conditional axioms. In addition to providing a guide to the search for appropriate inference rules, the results presented in this paper can play a fundamental role in the analysis of the formal properties (correctness, completeness, conservativeness,…) of the systems finally developed.
A classic example of such use is provided by the Boyer–Moore Nqthm theorem prover [
29] and the closely related system ACL2 [
30], developed by Kaufmann and Moore, used in the modeling and verification of computer hardware and software. In these systems, the (noetherian) induction principle is implemented as an inference rule that provides a powerful tool to derive properties of functions defined by recursion. Model–theoretic methods play an important role in the proofs of the correctness properties of ACL2, as shown in [
31].