CHAPTER I
THE LOWER PREDICATE CALCULUS
1.1. General Introduction. The Metamathematics of Algebra is concerned with the analysis and development of Algebra by the methods of Mathematical Logic. Model theory deals with the relations between the properties of sentences or sets of sentences specified in a formal language on one hand, and of the mathematical structures or sets of structures which satisfy these sentences, on the other hand. Since the methods employed in Model theory are frequently algebraic in spirit if not in detail and since the algebraic theories of fields, rings, groups, etc. are in view of their transparent structure well suited to a detailed logical analysis, Model theory and the Metamathematics of Algebra supplement one another in a natural way. Indeed, in many cases it is hard to decide whether a particular topic belongs properly to the former subject or to the latter. The algebraic approach has penetrated into many other branches of Mathematics, and it was inevitable that the Metamathematics of Algebra should follow suit. Again, since the language of Topology has proved equally ubiquitous it can be used also in connection with the matters dealt with in this book. In fact, the task of drawing precise boundary lines within Mathematics has become notoriously difficult, and the above definitions are intended only to give a general indication of our purpose. However, in order to keep this book within reasonable proportions, we shall choose most of our examples from Algebra and we shall dispense with the use of topological language. The plan of the book is as follows. In the present chapter, we set up the formal language of the Lower predicate calculus, which is fundamental for the sequel. We then discuss the relations between the sentences of our formal language and the mathematical structures in which they hold. Finally, we establish the extended completeness theorem of the Lower predicate calculus and we derive some of its immediate consequences. 
The second chapter begins with a discussion of some fundamental concepts which are relevant to algebraic systems such as the notions of equality, of extension, of isomorphism, of homomorphism. This is
followed by the specification of axiomatic systems of some common algebraic concepts. The chapter concludes with a number of fairly direct applications of the extended completeness theorem, including Malcev’s theory of normal chains. In the third chapter, we discuss a number of typical problems of Model theory. These include the set-theoretic characterizations of systems of structures which are given by axioms in prenex normal form with certain special types of prefixes (e.g. with existential quantifiers only). Systems of structures which are closed under intersection are also considered. The fourth chapter deals with various notions of completeness, including model-completeness and relative model-completeness. Some of the most important instances of these notions are considered in detail. The first two sections of the fifth chapter are concerned with Beth’s theorem on definability in the Lower predicate calculus and with related results. This is followed by the discussion of another kind of definability problem which arises in Algebra, and by a metamathematical analysis of the notion of an algebraically closed extension and of related concepts. The chapter concludes with applications to Differential Algebra and to Hilbert’s theorem on the zeros of polynomials, the Nullstellensatz. Chapter VI contains logical analyses and generalizations of various standard algebraic concepts such as the notion of polynomial extension and of separability. The seventh chapter deals with the metamathematical theory of ideals, and the eighth chapter is concerned with the varieties of metamathematical ideals. It is shown that this includes the theory of algebraic varieties as a special case. Other applications deal with differential ideals and with a generalization of Artin’s theorem on Hilbert’s seventeenth problem concerning the representation of definite functions as sums of squares. 
The use of function symbols and their application to the elimination of quantifiers is discussed in the ninth and last chapter. This is followed by a brief description of the ultraproduct construction and of its fundamental properties. The chapter concludes with an introduction to Non-standard analysis. It is shown that the metamathematical approach provides a tool for the development of function theory in certain nonarchimedean fields, and that the resulting methods can be used for the proof of theorems in classical Analysis. The reader is expected to be familiar with the elements of the Lower predicate calculus up to the transformation of a sentence into prenex
normal form. On the mathematical side, we shall make use of some standard results concerning groups, rings, and fields. We shall use the following set-theoretic notation: $a \in A$ denotes the membership relation, a is an element of A. $A = B$ and $A \neq B$ state that the sets A and B are equal (co-extensive, contain the same elements) or unequal, respectively. $A \subset B$ and $B \supset A$ indicate that A is a subset of B, including the possibility that A equals B. The union, intersection, and difference of two sets A and B will be denoted by $A \cup B$, $A \cap B$, and $A - B$ respectively. The union and intersection of a set of sets $\{A_\nu\}$ will be denoted by $\bigcup_\nu A_\nu$ and by $\bigcap_\nu A_\nu$. The cardinal number of a set A will be denoted by $|A|$. The sign of equality will be used also in order to define one symbol in terms of another symbol, or in terms of a group of symbols, e.g., $X = [[A] \supset [B]]$.
1.2. Rules of Formation. We set up a formal language L by means of the following rules: The atomic symbols of L are
The (individual) object symbols, usually denoted by a, b, c, ..., small italics, near the beginning of the alphabet, with or without subscripts, occasionally by other symbols such as numerals, 0, 1, 2, ... to conform with common usage. They constitute a well-defined set of arbitrary transfinite cardinal number.
The dummy symbols or variables, denoted by u, v, w, x, ...; these symbols are supposed to constitute a countable set.
The relative symbols, divided into disjoint classes $R_n$, n = 0, 1, 2, ... (relative symbols of order n, or n-place relative symbols). Relative symbols of order $n \geq 1$ will be denoted by A( , , ...), B( , ), ... (capital italics followed by n empty spaces, separated by commas, in round brackets). Relative symbols of order 0 will be denoted simply by A, B, C, ... The classes $R_n$ constitute well-defined sets whose cardinals are specified but arbitrary.
The connectives. These are the five symbols $\sim$ (negation), $\vee$ (disjunction), $\wedge$ (conjunction), $\supset$ (implication), and $\equiv$ (equivalence).
The quantifiers. These are the universal quantifier, denoted by $(\forall\ )$, and the existential quantifier, denoted by $(\exists\ )$.
The separation symbols, denoted by [ (left square bracket) and ] (right square bracket).
This completes the list of atomic symbols.
Atomic formulae are obtained by filling the empty spaces of a relative symbol of order n, n = 1, 2, ... with object symbols or variables. For instance if A( , , ) is a 3-place relation, then A(a, b, a) and A(b, a, z) are both atomic formulae. 0-place relative symbols are atomic formulae by themselves, by definition.
Well-formed formulae, briefly 'wff' or 'formulae', are now defined inductively. They will be denoted by capital italics taken from the end of the alphabet, V, W, X, ... Observe that these symbols do not belong to the formal language L.
1.2.1. Atomic formulae bracketed by square brackets are wff.
1.2.2. If X is a wff then $[\sim X]$ also is a wff. If X and Y are wff, then $[X \vee Y]$, $[X \wedge Y]$, $[X \supset Y]$, and $[X \equiv Y]$ are all wff.
1.2.3. If X is a wff then $[(\forall y) X]$ and $[(\exists y) X]$ are both wff, provided X does not already contain one of $(\forall y)$ or $(\exists y)$. Thus, $[(\exists y)[A(a, x)]]$ and $[(\exists y)[A(a, y)]]$ are both wff while $[(\forall y)[(\exists y)[A(y)]]]$ is not a wff.
It will be understood without detailed explanation what is meant by the phrase "X contains $(\forall y)$", etc. We shall also say that a wff X contains a wff Y if X is constructed from Y and other wff by the (repeated) use of 1.2.2. and 1.2.3. We shall assume throughout that the language with which we are concerned contains at least one wff.
Wff are divided into complete formulae, or sentences, and incomplete formulae, or predicates, as follows. A wff X is called complete if whenever a variable is contained in it, say y, then y is contained in a wff Z such that Z occurs in X in one of the forms $[(\forall y) Z]$ or $[(\exists y) Z]$. If y occurs in X more than once (excepting the cases where it occurs within the brackets of a quantifier) then the above condition is supposed to be satisfied whenever y occurs. For example, $[(\forall y)[[A(y)] \wedge [B(y)]]]$ and $[[(\forall y)[A(y)]] \supset [(\exists y)[C(y, a)]]]$ are both complete, while $[[(\forall y)[A(y)]] \wedge [B(y)]]$ is incomplete. We may express this definition in a different way by introducing the notion of the scope of (an individual occurrence of) a quantifier, e.g. $(\exists y)$ within a wff X. This is the wff contained in X which begins with the left bracket following immediately upon $(\exists y)$ and ends with the corresponding right bracket. X is then complete if every variable contained in a relation within X is within the scope of a quantifier
with the same variable. Any variable in X which does not have this property is called free (or more precisely, the occurrence of the variable in question is free).
We define the order of a wff as the number of pairs of square brackets contained in it. Thus, the order of $[A(a)]$ is 1 and the order of $[[(\forall x)[B(x)]] \vee [C(a)]]$ is 4. The order of a wff constructed by negation or quantification (see 1.2.2. and 1.2.3. above) exceeds by 1 the order of the formula from which it is constructed. The order of a formula constructed by disjunction, conjunction, implication, or equivalence exceeds by 1 the sum of the orders of the two formulae from which it is constructed.
The above is a version of the language of the Lower (restricted) predicate calculus which is distinguished by its straightforwardness at the cost of some lack of economy. Thus, it is known that three of the connectives (conjunction, implication, and equivalence) and one of the two quantifiers can be expressed in terms of the remaining symbols. Moreover, in our formulation a large number of brackets is required even for relatively simple expressions. An advantage of this notation is that it indicates quite clearly the mode of construction of a formula. However, later we shall simplify our notation by adopting the following rules.
In the successive construction of a wff from a set of atomic formulae, the following square brackets may be omitted.
Square brackets enclosing atomic formulae.
Square brackets following a negation provided they enclose a negation. For example, $[\sim[\sim X]]$ may be replaced by $[\sim\sim X]$.
Square brackets preceding or following the symbol of conjunction provided they enclose a negation; square brackets preceding or following the symbol of disjunction, provided they enclose a conjunction or a negation. For example, $[[X \wedge Y] \vee [\sim Z]]$ may be replaced by $[X \wedge Y \vee \sim Z]$.
Square brackets preceding or following the sign of implication or of equivalence, provided they enclose a disjunction or a conjunction or a negation.
In any sequence of quantifications such that both the quantifiers on the left and the brackets on the right follow immediately upon one another, all the square brackets may be omitted, with the exception of the innermost and the outermost pairs (which may however be removable by virtue of another rule). For example, $[(\forall x)[(\exists y)[(\forall z)[X]]]]$ becomes $[(\forall x)(\exists y)(\forall z)[X]]$.
Finally, the outermost brackets in a wff may also be dropped. For
example
$[(\forall x)[(\forall y)[(\forall z)[[[A(x, y)] \wedge [A(y, z)]] \supset [A(x, z)]]]]]$
may be written in simplified form as
$(\forall x)(\forall y)(\forall z)[A(x, y) \wedge A(y, z) \supset A(x, z)]$.
The above rules are framed in such a way that any formula which has been simplified by the use of some or all of them can be restored to its fully bracketed form without ambiguity. Rules for the omission of brackets in successive conjunctions or disjunctions are not included, since they involve the validity of the associative law for these operations. However, later on we shall introduce the further simplification of denoting the conjunction and disjunction of any number of wff taken in any order of association by $[X_1 \wedge X_2 \wedge \dots \wedge X_n]$ and $[X_1 \vee X_2 \vee \dots \vee X_n]$, respectively. Any statement or argument involving such expressions will then be meant to apply whenever the latter are replaced by corresponding fully bracketed expressions. Thus, $[X_1 \vee X_2 \vee X_3]$ will be replaceable by $[X_1 \vee [X_2 \vee X_3]]$ or by $[[X_1 \vee X_2] \vee X_3]$ where the particular choice of the fully bracketed expression is irrelevant. Moreover, different fully bracketed expressions may be selected for the same simplified formula if the latter occurs more than once.
Finally, we may mention that some of the subsequent work is simplified if we rule out wff in which the same variable appears more than once within the brackets of a quantifier (e.g. x in $(\exists x)$). This can be done without limiting the scope of our language, but the practice will not be adopted here.
1.3. Rules of deduction. From the set of sentences in L we now select a subset, to be called the set of theorems of L, by a purely formal procedure. Given any sentences X, Y, Z in L, the following are theorems.
1.3.1.
$[X \supset [Y \supset X]]$
$[[X \supset [X \supset Y]] \supset [X \supset Y]]$
$[[X \supset Y] \supset [[Y \supset Z] \supset [X \supset Z]]]$
$[[X \wedge Y] \supset X]$
$[[X \wedge Y] \supset Y]$
$[[X \supset Y] \supset [[X \supset Z] \supset [X \supset [Y \wedge Z]]]]$
$[X \supset [X \vee Y]]$
$[Y \supset [X \vee Y]]$
$[[X \supset Z] \supset [[Y \supset Z] \supset [[X \vee Y] \supset Z]]]$
$[[X \equiv Y] \supset [X \supset Y]]$
$[[X \equiv Y] \supset [Y \supset X]]$
$[[X \supset Y] \supset [[Y \supset X] \supset [X \equiv Y]]]$
$[[X \supset Y] \supset [[\sim Y] \supset [\sim X]]]$
$[X \supset [\sim[\sim X]]]$
$[[\sim[\sim X]] \supset X]$
In simplified notation, the sixth of the above sentences, for example, becomes $[X \supset Y] \supset [[X \supset Z] \supset [X \supset Y \wedge Z]]$.
We shall sometimes single out an object symbol or free variable with reference to a wff X by writing X(a) or X(y) in place of X. In the same way we may display several object symbols or variables, without necessarily mentioning all object symbols or free variables contained in X. Given X(a) we then mean by X(b) the result of substituting b for a wherever a occurs in X, with a similar notation for the substitution of variables. With these conventions, the following are supposed to be theorems, for arbitrary X, a, y.
1.3.2.
$[[(\forall y) X(y)] \supset X(a)]$
$[X(a) \supset [(\exists y) X(y)]]$
provided these are sentences in accordance with the rules of formation. Further theorems are obtained by the application of the following three rules of inference:
1.3.3. If X and $[X \supset Y]$ are theorems, then Y is a theorem. This is the rule of modus ponens.
If $[X \supset Y(a)]$ is a theorem, where X does not contain a, then $[X \supset [(\forall z) Y(z)]]$ is a theorem, provided it is a sentence.
If $[Y(a) \supset X]$ is a theorem, where X does not contain a, then $[[(\exists z) Y(z)] \supset X]$ is a theorem, provided it is a sentence.
It should be understood that the symbols a and z mentioned in these rules signify arbitrary object symbols or variables, respectively. All rules of substitution are then deducible and need not be introduced as postulates. Two of these which are of frequent use are as follows
1.3.4. From any theorem X which contains a quantifier with variable y, another theorem is obtained by replacing y by any other variable z both in the quantifier and in its scope, provided that scope is not part of the scope of a quantifier which contains z.
1.3.5. In any theorem, we may replace a formula which is obtained by bracketing a relative symbol of order 0 by an arbitrary sentence. The result is a theorem provided it is a wff at all. Two more standard results of the Lower predicate calculus which will be required in the sequel are as follows:
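1.3.6. If
$[[[\dots[X_1 \wedge X_2] \wedge X_3] \wedge \dots \wedge X_n] \supset Y_m]$, m = 1, 2, ..., k
and
$[[[\dots[Y_1 \wedge Y_2] \wedge Y_3] \wedge \dots \wedge Y_k] \supset Z]$
are all theorems then
$[[[\dots[X_1 \wedge X_2] \wedge X_3] \wedge \dots \wedge X_n] \supset Z]$
is a theorem.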
1.3.7. To every sentence X there exists a sentence X' which contains the same object and relative symbols as X such that $X \equiv X'$ is a theorem, and such that X' is in prenex normal form, i.e.
$X' = q_1 q_2 \dots q_n Y$
where the $q_k$ denote quantifiers while Y, the matrix of the sentence, does not contain any further quantifiers. We include the possibility n = 0, in which case X' contains neither quantifiers nor variables.
Let K be a set of sentences in L. We say that a sentence Y is deducible from K, in symbols $K \vdash Y$, if there exists a finite sequence of sentences $X_1, X_2, \dots, X_n$ in K such that
$[[[\dots[X_1 \wedge X_2] \wedge X_3] \wedge \dots \wedge X_n] \supset Y]$
is a theorem in L. By a natural convention we include the possibility n = 0, in which case it is understood that Y itself is a theorem. The set of sentences in L which are deducible from K will be denoted by S(K). S(K) includes K as well as all theorems of L. Using 1.3.6. we may show that for all K, S(S(K)) = S(K). If $K' \subset S(K)$ then K' is said to be deducible from K, in symbols $K \vdash K'$. A set of sentences K is called contradictory if S(K) includes all sentences of L, otherwise K is consistent. K is contradictory if and only if S(K) contains a sentence of the form $[X \wedge [\sim X]]$.
In addition to the theorems which were obtained above as a certain subclass of the sentences of L, we shall also have occasion to formulate
theorems outside L, in the usual sense of the term. By way of distinction these are sometimes called meta-theorems. We shall not use this term and shall instead rely on the context for elucidation.
1.4. Semantic interpretation. We now come to the semantic, or descriptive, interpretation of the sentences of the given language. A mathematical structure M which can be described by sentences of L is of the following type. It consists of a set of (individual) objects or individuals which (like the object symbols) will be denoted by small italics a, b, ... and of sets of relations of order n (e.g. A( ), B( , ), ...) such that for every relation A of order n defined in M and for every ordered n-ple $a_1, \dots, a_n$ of different or identical constants of M, the instance $A(a_1, \dots, a_n)$ of the relation either holds or does not hold (in M). We shall indicate the situation also by saying that A holds, or does not hold, at $(a_1, \dots, a_n)$. We do not identify the relations as such with sets of ordered n-ples of individual objects of M so that two relations may hold at the same n-ples. We also include the possibility that relations of order 0 belong to M. Such a relation then holds, or does not hold in M, without reference to the individual objects of M. Relations of order 0 do not appear in the familiar mathematical structures which will be considered later. However, they do appear as the elements of structures defined in connection with the propositional calculus. On the other hand such structures do not normally contain any objects, or relations of positive order.
Let C be a one-to-one correspondence which maps the individual objects of M on a subset of the set of object symbols of a language L and which, at the same time, maps the relations of M on relative symbols of L of the same order. Let K be the set of wff of L whose object and relative symbols all appear in C. We say that these wff are defined in M under C.
To every atomic formula X in K which does not contain any variables there corresponds, by C, the expression X' of a relation between certain individuals of M, which either holds or does not hold in M. The following rules then determine, by definition, whether a sentence of K holds or does not hold in M (under the correspondence C).
1.4.1. Let Y be a sentence of order 1 in K, Y = [X] where X is an atomic formula. Then Y holds in M if and only if the expression X', which corresponds to X under C, holds in M.
1.4.2. Given two sentences Y and Z in K,
$[Y \vee Z]$ holds in M if and only if at least one of the two sentences Y or Z holds in M; $[Y \wedge Z]$ holds in M if and only if both Y and Z hold in M; $[Y \supset Z]$ holds in M if Z holds in M, and also if neither Y nor Z holds in M; $[Y \equiv Z]$ holds in M if both Y and Z hold in M and also if neither Y nor Z holds in M; and finally, $[\sim Y]$ holds in M if and only if Y does not hold in it.
1.4.3. Given a wff Y = Y(z) in which z is not quantified (does not appear within a quantifier) and no other variable is free, $[(\forall z) Y(z)]$ holds in M if and only if Y(a) holds in M for all object symbols a in L which correspond to objects in M; $[(\exists z) Y(z)]$ holds in M if and only if Y(a) holds in M for at least one a in L which corresponds to an object in M.
Since each of the above rules 1.4.2., 1.4.3. bases the decision whether a sentence does or does not hold in M on a sentence of lower order while by 1.4.1. the decision whether or not a sentence of order 1 holds in M depends directly on M, it follows that these rules determine uniquely for any Y in K whether Y holds or does not hold in M (under C). However, the rule 1.4.3. is not effective in any constructive sense since it makes the decision whether or not Y holds in M depend on the corresponding question for a set of sentences which may be infinite.
Now let K be any set of sentences in L and let C be a one-to-one correspondence which maps the object symbols occurring in sentences of K (if any) on objects of M, and the relative symbols of K on relations of M of corresponding order. Extend C in some way to a correspondence C' as considered above, i.e. such that all objects and relations of M are mapped in one-to-one correspondence on object and relative symbols of L. Then it is not difficult to see that the answer to the question whether or not a sentence of K holds in M depends only on the correspondence C and not on the particular choice of the extension C'. However, since the cardinal number of the set of object symbols of L is limited it may happen that the number of objects in M is so large that no correspondence C' of the required type exists. This can be avoided by considering, in connection with any given structure M, only languages which possess a sufficiently large pool of object symbols from the outset.
Alternatively, L as given may be embedded in a more comprehensive language L', the particular choice of L' being again irrelevant. If all sentences of a set K hold in a structure M under a correspondence C then we say that M is a model of K (under C). If K contains only a single sentence Y, then we shall say also that M is a model of Y.
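For a finite structure the rules 1.4.1. to 1.4.3. can be traced mechanically, and the following Python sketch may help to fix ideas. It is an illustration only and not part of the formal development. It assumes that C is the identity, so that the objects of M serve as their own object symbols, and that no object symbol coincides with a variable. The encoding of wff as nested tuples and the names order, substitute and holds are hypothetical choices made for this example.

```python
# Illustrative sketch only; the tuple encoding and all names are assumptions.
# A wff is a nested tuple:
#   ("atom", "A", ("a", "b"))                 the bracketed atomic formula [A(a, b)]
#   ("not", X)                                negation
#   ("or" | "and" | "implies" | "iff", X, Y)  binary connectives
#   ("forall" | "exists", "y", X)             quantification over the variable y


def order(wff):
    """Number of pairs of square brackets in the fully bracketed wff."""
    tag = wff[0]
    if tag == "atom":
        return 1
    if tag in ("not", "forall", "exists"):
        return 1 + order(wff[-1])
    return 1 + order(wff[1]) + order(wff[2])


def substitute(wff, var, obj):
    """Replace the variable var by the object symbol obj throughout wff.

    By rule 1.2.3 the body of a quantifier with variable y contains no
    further quantifier with y, so no renaming is needed here.
    """
    tag = wff[0]
    if tag == "atom":
        _, rel, args = wff
        return ("atom", rel, tuple(obj if t == var else t for t in args))
    if tag == "not":
        return ("not", substitute(wff[1], var, obj))
    if tag in ("forall", "exists"):
        return (tag, wff[1], substitute(wff[2], var, obj))
    return (tag, substitute(wff[1], var, obj), substitute(wff[2], var, obj))


def holds(wff, objects, relations):
    """Decide whether a sentence holds, following 1.4.1, 1.4.2 and 1.4.3."""
    tag = wff[0]
    if tag == "atom":  # 1.4.1: look the instance up in the structure
        _, rel, args = wff
        return tuple(args) in relations[rel]
    if tag == "not":
        return not holds(wff[1], objects, relations)
    if tag in ("forall", "exists"):  # 1.4.3: one instance per object of M
        _, var, body = wff
        instances = [holds(substitute(body, var, a), objects, relations)
                     for a in objects]
        return all(instances) if tag == "forall" else any(instances)
    left = holds(wff[1], objects, relations)  # 1.4.2: the connectives
    right = holds(wff[2], objects, relations)
    if tag == "or":
        return left or right
    if tag == "and":
        return left and right
    if tag == "implies":
        return right or not left
    if tag == "iff":
        return left == right
    raise ValueError("unknown connective: %r" % (tag,))


# A structure with two objects and one relation of order 2.
objects = {"a", "b"}
relations = {"A": {("a", "b"), ("b", "a")}}

every_x_has_y = ("forall", "x", ("exists", "y", ("atom", "A", ("x", "y"))))
some_x_loops = ("exists", "x", ("atom", "A", ("x", "x")))

print(order(every_x_has_y))                      # 3
print(holds(every_x_has_y, objects, relations))  # True
print(holds(some_x_loops, objects, relations))   # False
```

Each recursive call of holds is made on a wff of lower order, which is the reason given above for the uniqueness of the decision. For an infinite structure the loop over the objects would not terminate, in agreement with the remark that 1.4.3. is not effective.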
So far, we have distinguished strictly between object and relative symbols on one hand, and individual objects and relations on the other. However, it frequently simplifies matters a great deal to suppose that the objects and relations of a structure M coincide with object and relative symbols of a language L in which M is described. In that case it is usual to suppose that these objects and relations coincide individually with the corresponding object and relative symbols of L, i.e., C is the identity.
For certain purposes, it is useful to consider a language which is based on the same types of atomic symbols as the language considered above and whose set of wff is extended by the introduction of certain infinitary rules of formation. There exists a considerable body of information on such languages, but we shall make use of them only in some special cases. Thus, if $X_1, X_2, X_3, \dots$ is a countable sequence of wff we may include among the wff also the infinite disjunction $[X_1 \vee X_2 \vee X_3 \vee \dots]$ and the infinite conjunction $[X_1 \wedge X_2 \wedge X_3 \wedge \dots]$. If all object and relative symbols of the $X_i$ are related to the objects and relations of a structure M by a correspondence C as before, then we say that the above infinite disjunction holds in M if at least one $X_i$ holds in M, and the infinite conjunction holds in M if all $X_i$ hold in M.
1.5. The Relation between Deductive and Descriptive Concepts. Let K be the set of sentences of L which is defined in a structure M under a correspondence C. Then
1.5.1. Every theorem of L which is included in K holds in M.
We shall omit the proof of this theorem, which is obtained by checking through the rules of deduction (1.3). Some care is required in dealing with the rules of inference 1.3.3. Next
1.5.2. Every sentence Y of K that is deducible from a set of sentences which hold in M must itself hold in M.
Indeed, suppose that Y is deducible from a set of sentences $X_1, X_2, \dots, X_n$ which hold in M. The sentence
$[[[\dots[X_1 \wedge X_2] \wedge X_3] \dots \wedge X_n] \supset Y]$
then is a theorem and so must hold in M. But if so, then Y must hold in M, by 1.4.2.
A contradictory set K cannot possess a model. In fact, by assumption, we consider only languages which contain at least one statement. Since K is contradictory it cannot be empty. Let Y be an element of K and let
$Z = [Y \wedge [\sim Y]]$. If K is defined and holds in a structure M under a correspondence C then Z, which is deducible from the contradictory set K, holds in M under the same correspondence, by 1.5.2. But if so then Y both holds and does not hold in M, which is impossible.
From now on we shall usually omit references to the correspondence C which establishes the connection between a set of sentences, K, and a structure M. Thus, when stating that the sentences of K are (or, K is) defined in M, we shall take it for granted that this involves a particular correspondence C. We are going to prove
1.5.3. THEOREM. If a sentence X of L holds in every structure in which it is defined, then X is a theorem.
This is, essentially, Gödel's completeness theorem for the Lower predicate calculus.
1.5.4. THEOREM (EXTENDED COMPLETENESS THEOREM OF THE LOWER PREDICATE CALCULUS). Every consistent set of sentences K in a language L possesses a model.
1.5.5. THEOREM. If a set of sentences K and a sentence Y are such that Y is defined and holds in any structure which is a model of K then Y is deducible from K. Hence, Y is deducible from a finite subset of K.
Theorems 1.5.5. and 1.5.3. can both be reduced to 1.5.4. Suppose that the sentence X mentioned in 1.5.3. is not a theorem of L. In that case the sentence $[\sim X]$ (or, more precisely, the unit set containing $[\sim X]$) cannot be contradictory. For if it were, then $[[\sim X] \supset Y]$ would be a theorem for all Y in L, hence $[[\sim X] \supset X]$, $[[\sim[\sim X]] \vee X]$, $[X \vee X]$, and finally X would all be theorems in L, by the rules of the calculus of propositions. This is contrary to assumption and so $[\sim X]$ is consistent and possesses a model M, by 1.5.4. But if so then X is defined, but does not hold in M, contrary to the hypothesis of 1.5.3. This proves 1.5.3.
Coming to 1.5.5., suppose that Y is not deducible from K. It follows that for any finite number of sentences $X_1, \dots, X_n$ of K, the conjunction
$[[[\dots[X_1 \wedge X_2] \wedge X_3] \dots \wedge X_n] \wedge [\sim Y]]$
is not contradictory (otherwise $[[[\dots[X_1 \wedge X_2] \wedge X_3] \dots \wedge X_n] \supset Y]$ would be a theorem, i.e. Y would be deducible from K). But this means that the set $\{X_1, \dots, X_n, \sim Y\}$ is consistent, and hence that $K \cup \{\sim Y\}$ is consistent. If so, then by 1.5.4. there exists a model M of $K \cup \{\sim Y\}$. M is a model of K in which $[\sim Y]$ holds, so that Y does not hold in M, contrary to assumption. This proves 1.5.5.
To prove 1.5.4., suppose first that the sentences of K do not include any quantifiers or object symbols. That is to say, K consists of sentences built up from relative symbols of order 0 by means of connectives (and brackets). A model M of K then is a set of relations of order 0 in one-to-one correspondence with the relative symbols of K such that any relation of the set either holds or does not hold in M, and such that the rules of 1.4.2. then imply that the sentences of K all hold in M. To simplify the argument we shall identify these relations with the corresponding relative symbols. Let S be the set of relative symbols of K. Then the question whether any $A \in S$ holds or does not hold in a structure M as indicated may be expressed by a valuation function $\varphi(A)$ defined in S and ranging over the 'truth values' 0 (holds) and 1 (does not hold). Reference to 1.4.2. shows that the question whether or not a sentence which is built up from elements of S holds in M can then be decided by the standard truth table evaluation if we identify 0 with 'true' and 1 with 'false'. Thus, if K contains only a single sentence X, the fact that, K being consistent, we may assign values $\varphi(A) = 0$ or $= 1$ to the relative symbols of X so as to ensure that X holds may be accepted as a well-known result of the calculus of propositions. If K contains a finite number of sentences only, this may be reduced to the case of a single sentence by taking the conjunction of the elements of K in any arbitrary order.
In order to prove the theorem for K of arbitrary infinite cardinal (countable or non-countable) we require the following auxiliary consideration. Without changing our notation we may, temporarily, regard S as an abstract set the character of whose elements, A, B, C, ... is irrelevant. By a partial valuation of S we mean a function of one variable which is defined on a subset V of S and takes values in the two-element set $\{0, 1\}$.
The valuation $\varphi$ is called total if its domain of definition is the entire set S. We write $D\varphi$ for the domain of definition of a partial valuation $\varphi$. Also, if U is any subset of S, we write $\varphi|U$ for the restriction of $\varphi$ to the set $D\varphi \cap U$. With this terminology, we propose to prove the following
1.5.6. SPECIAL VALUATION LEMMA. Let $\Phi = \{\varphi_\nu\}$ be a set of partial valuations of S with index set $I = \{\nu\}$, such that for every finite subset U of S there exists a $\varphi_\nu \in \Phi$ which includes U in its domain of definition. Then there exists a total valuation $\psi$ of S such that for every finite subset U of S there exists a $\varphi_\nu \in \Phi$ which includes U in its domain of definition and such that $\psi|U = \varphi_\nu|U$, i.e. $\psi$ coincides with $\varphi_\nu$ on U.
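The statement of the lemma can be made concrete for a finite set S by the following Python sketch. It is an illustration only: when S is finite the lemma is immediate (take U = S), so the sketch serves merely to fix the meaning of the hypothesis and of the conclusion, not to replace the proof. Partial valuations are represented as dictionaries with values 0 and 1, and the function names are hypothetical choices made for this example.

```python
from itertools import combinations, product

# Illustrative sketch only; representation and names are assumptions.
# A partial valuation of S is a dict mapping some elements of S to 0 or 1.


def finite_subsets(symbols):
    """Yield every subset U of the (finite) set of symbols."""
    items = sorted(symbols)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield set(combo)


def hypothesis_holds(family, symbols):
    """Every subset U lies in the domain of some member of the family."""
    return all(any(u <= set(phi) for phi in family)
               for u in finite_subsets(symbols))


def conclusion_holds(psi, family, symbols):
    """For every U some member of the family covers U and agrees with psi on U."""
    for u in finite_subsets(symbols):
        if not any(u <= set(phi) and all(phi[a] == psi[a] for a in u)
                   for phi in family):
            return False
    return True


def find_total_valuation(family, symbols):
    """Search all total valuations of S for one satisfying the conclusion."""
    items = sorted(symbols)
    for values in product((0, 1), repeat=len(items)):
        psi = dict(zip(items, values))
        if conclusion_holds(psi, family, symbols):
            return psi
    return None


S = {"A", "B"}
Phi = [{"A": 0}, {"A": 0, "B": 1}]

print(hypothesis_holds(Phi, S))      # True
print(find_total_valuation(Phi, S))  # {'A': 0, 'B': 1}
```

The exhaustive search is available only because S is finite here. For infinite S no such search exists, and an appeal to a maximality principle such as Zorn's lemma takes its place.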
For the proof, we shall call a partial valuation yl of S admissible if for any finite subset U of S there exists a pvE @ such that U c Dp, and yl U = p,lDy/ n U,i.e. yl coincides with pv on the intersection of U and Dyl. Given two partial valuations p and yl we shall call cy an extension of p if D p c Dyl and yllDp = p, in symbols p < cy. Let Y be the set of admissible partial valuations of S. Yis not empty for, by the hypothesis of the lemma it contains the empty partial valuation, i.e. the partial valuation whose domain of definition is empty. Also, Y is partially ordered by the relation of extension, <, defined above. Any non-empty totally ordered subset Y' of Y possesses an upper bound y'. Indeed, if Y = {wp) then we may define the domain of definition of w' as the union of the domains of definition of the yo, and the value of yl' for any argument within its domain of definition as the joint value of all ylh which are defined for that argument. Accordingly, by Zorn's lemma, Y contains at least one maximal element, V/O say. We wish to show that ylo is a total valuation of S. Suppose on the contrary that S - Dylo is not empty, and let A be an element of that set. Define a partial valuation yl1 of S, with domain of definition Dylo U (A}, by y l 1 = ylo on Dylo and yll(A) = 0. Since ylo is maximal, yl1 cannot be admissible. That is to say, there exists a finite subset V of S such that the conditions V c Dp, and ~ 1 V1 = VvlDyl1 n V are not satisfied for any pp.E @. Define a partial valuation y2 of S, with domain of definition Dylo U {A}, by V / Z = cyo on Dylo and ylz(A) = 1. Again, ylz cannot be admissible, and so there exists a finite subset W of S such that the conditions W c Dp, and ylzl W = q V ( D v zn W are not satisfied for any pvE @. We note that Y must contain A, otherwise the existence of a pv as required for admissibility follows from the fact that ylo is admissible. Similarly W must contain A. Now consider the set U = V U W. 
Since ψ_0 is admissible there exists a φ_ρ ∈ Φ which contains U in its domain of definition such that ψ_0 | U = φ_ρ | Dψ_0 ∩ U. A belongs to the domain of definition of φ_ρ, since A ∈ U, and so either φ_ρ(A) = 0 or φ_ρ(A) = 1. But in the former case we then have ψ_1 | V = φ_ρ | Dψ_1 ∩ V and V ⊂ Dφ_ρ, while in the latter case ψ_2 | W = φ_ρ | Dψ_2 ∩ W and W ⊂ Dφ_ρ. This contradicts our earlier conclusion that no element of Φ with these properties can exist. Accordingly S - Dψ_0 is empty, ψ_0 is a total valuation, and since ψ_0 is admissible it follows that ψ_0 satisfies the conditions stated in 1.5.6. This completes the proof of the lemma.
We return to the proof of 1.5.4. for the case that K is infinite and
consists of sentences without quantifiers or object symbols. Thus, K contains relative symbols of order 0 only. Denoting by S the set of relative symbols which appear in K, let S' be any finite subset of S, and let K' be the set of sentences of K whose relative symbols belong entirely to S'. Since the number of elements of S' is finite it follows that, while the number of sentences of K' may be infinite, they must all be equivalent, in the sense of the calculus of propositions, to the sentences of a finite subset of K'. We conclude that there exists a total valuation φ_ν of S' (i.e. a partial valuation of S) such that φ_ν attributes the value 0 ('true' or 'holds') to all elements of K' according to the familiar truth table evaluation. For elements of S' which are not contained in any sentences of K' the value of φ_ν may be chosen arbitrarily, e.g. φ_ν = 0. Let Φ = {φ_ν} be the set of all partial valuations of S obtained in this way, and let ψ_0 be a total valuation of S with the properties stated in 1.5.6. Let X be any sentence of K. We maintain that the values which are attributed by ψ_0 to the relative symbols of X yield the value 0 ('true' or 'holds') for X. Indeed, if V is the set of relative symbols which occur in X then for some φ_ν ∈ Φ, V ⊂ Dφ_ν = S' and ψ_0 = φ_ν on V. It follows that X belongs to the set K' defined above for the given finite set S'. And since φ_ν yields 0 for X the same applies to ψ_0, which is equal to φ_ν on V. We conclude that ψ_0 yields 0 ('holds') for all sentences of K. This establishes 1.5.4. for the case that the sentences of K are built up from relative symbols of order 0 only.
Suppose next that K may include relative symbols of arbitrary order, while still not containing any quantifiers or variables. As before, we shall use the relative and object symbols of K as the relations and objects of the structure M which satisfies K (i.e. in which K holds). Let S be the set of atomic formulae which occur in the sentences of K.
Choose S' as a set of relative symbols of order 0 which are in one-to-one correspondence C with the elements of S. Two atomic formulae are regarded as different if they contain different relative symbols, or if they contain the same relative symbol but differ at some point in the object symbols or variables that occupy corresponding places. By our definition of a language L, it is in fact not certain a priori that there are enough relative symbols of order 0 available for S'. However, in that case we may replace L by a more extensive language. We agree once and for all to carry out such extensions tacitly whenever a problem of this kind arises. Now let K' be the set of sentences X' obtained from the sentences X of K by replacing all atomic formulae of S by the corresponding relative
symbols of S'. We claim that K' is consistent. This is true, trivially, if K and K' are empty. In the general case, let A be an element of S and let A' be the corresponding element of S'. If K' is contradictory, then it contains a finite number of sentences X'_1, ..., X'_n, corresponding to sentences X_1, ..., X_n in K, such that the sentence
[X'_1 ∧ [X'_2 ∧ [... ∧ X'_n] ...]] ⊃ Y'
is a theorem, where Y' = [[A'] ∧ [∼[A']]]. Hence, by the rule of substitution 1.3.5., the sentence
[X_1 ∧ [X_2 ∧ [... ∧ X_n] ...]] ⊃ Y
is a theorem, where Y = [[A] ∧ [∼[A]]], A being the atomic formula that corresponds to A'. But this shows that K is contradictory, contrary to assumption. We conclude that K' also is consistent. It follows from what has been proved already that K' possesses a model M', i.e. there exists a valuation of S' which makes all elements of K' 'true'. For any atomic formula R(a_1, ..., a_n) of S, we now define that R(a_1, ..., a_n) holds in M if and only if the corresponding relative symbol of S' holds in M'. For any R(a_1, ..., a_n) of M which does not belong to S, and hence does not occur in K, we define arbitrarily that R(a_1, ..., a_n) holds in M. Since the decision whether a sentence X of K holds in M depends only on the question which of the atomic formulae of K hold in M, and since this connection is the same in M' as it is in M, it follows that all sentences of K hold in M, i.e. M is a model of K.
We now drop all restrictions on the form of the sentences of K. However, in view of the cases disposed of previously, we may suppose that K is not empty and that it contains at least one sentence that involves a quantifier. Replace every sentence X of K by a sentence X' in prenex normal form, such as exists by 1.3.7., and let K' be the set of all sentences X' which are obtained in this way. Since X' ⊃ X is a theorem for all X ∈ K it follows that every model of K' is also a model of K. And since X ⊃ X' is a theorem for all X ∈ K, and K is consistent, it follows (compare 5.3.6.) that K' also is consistent. Accordingly, we shall from now on confine ourselves to the case that all sentences of K are in prenex normal form from the outset. We count the quantifiers in any sentence in prenex normal form in their "natural" order, from left to right. In order to exemplify our procedure, we shall suppose that K includes a sentence of the form
1.5.7. [(∃x)[(∀y)[(∃z)[(∀u)[(∀w) Q(x, y, z, u, w)]]]]]
where Q does not contain any further quantifiers. We shall use the relative symbols of K as the relations of a model M. The set of individual objects of M will be introduced in due course. A sequence of sets of sentences {K_0, K_1, K_2, ...} and a sequence of sets of object symbols {P_0, P_1, P_2, ...} are now defined by induction as follows. K_0 = K, while P_0 is the set of object symbols contained in the sentences of K. However, if K does not contain any object symbols then we define that P_0 contains the single object symbol a, picked at random. K_1 contains all sentences of K_0. Moreover, if X is a sentence of K_0 which begins with an existential quantifier then K_1 contains a sentence X* which is obtained from X by deleting the existential quantifier (and the brackets which belong to it) and by replacing the corresponding variable everywhere in the remaining wff by a new object symbol. Thus, if X is of the form 1.5.7. then we may select as X* the sentence
1.5.8. [(∀y)[(∃z)[(∀u)[(∀w) Q(b, y, z, u, w)]]]]
provided b does not occur in P_0. Moreover, it will be understood that we introduce distinct new object symbols for distinct sentences of K_0. This completes the definition of K_1. P_1 is obtained as the union of P_0 and of the new object symbols introduced just now. K_2 contains all sentences of K_1. Moreover, if X is a sentence of K_1 which begins with a universal quantifier, then K_2 contains all sentences which are obtained from X by deleting the quantifier and by replacing the corresponding variable in the remaining wff by elements of P_1. Thus, we obtain from 1.5.8. the sentences
[(∃z)[(∀u)[(∀w) Q(b, a, z, u, w)]]]
where a varies over all elements of P_1. This completes the definition of K_2. P_2 is simply given by P_2 = P_1. In general, we obtain K_n and P_n from K_{n-1} and P_{n-1} for odd n in just the same way as K_1 and P_1 were obtained from K_0 and P_0, and we obtain K_n and P_n from K_{n-1} and P_{n-1} for even n in the way in which we obtained K_2 and P_2 from K_1 and P_1. Let K' = ⋃_n {K_n} = K_0 ∪ K_1 ∪ K_2 ∪ ... and P' = ⋃_n {P_n} = P_0 ∪ P_1 ∪ P_2 ∪ .... The set P' will serve as the set of individual objects of the required structure M.
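The inductive definition of the K_n and P_n can be traced on a small example. The following Python sketch is an illustration only and makes several assumptions that are not in the text: a prenex sentence is encoded as a pair (prefix, atom), the quantifiers are written "E" and "A", the matrix is a single atomic formula, new object symbols are named b1, b2, ..., and variable names are assumed never to coincide with object symbols. It follows the definition literally, so every odd step introduces a fresh symbol for every sentence that begins with an existential quantifier.

```python
from itertools import count

# Assumed encoding: sentence = (prefix, atom), where prefix is a tuple of
# (quantifier, variable) pairs and atom = (relation, term, ..., term).

def strip_leading(sentence, const):
    """Delete the leading quantifier and replace its variable by `const`."""
    prefix, atom = sentence
    _, var = prefix[0]
    new_atom = (atom[0],) + tuple(const if t == var else t for t in atom[1:])
    return (prefix[1:], new_atom)

def object_symbols(K):
    """P_0: the object symbols of K, or the single symbol 'a' if there are none."""
    P = set()
    for prefix, atom in K:
        bound = {var for _, var in prefix}
        P |= {t for t in atom[1:] if t not in bound}
    return P or {"a"}

def odd_step(K, P, fresh):
    """Odd n: add a witness sentence for each sentence beginning with 'E'."""
    K_new, P_new = set(K), set(P)
    for sentence in sorted(K):
        if sentence[0] and sentence[0][0][0] == "E":
            b = next(fresh)
            P_new.add(b)
            K_new.add(strip_leading(sentence, b))
    return K_new, P_new

def even_step(K, P):
    """Even n: instantiate each sentence beginning with 'A' by every element of P."""
    K_new = set(K)
    for sentence in K:
        if sentence[0] and sentence[0][0][0] == "A":
            K_new |= {strip_leading(sentence, a) for a in P}
    return K_new, set(P)

def build(K0, steps):
    """Return (K_steps, P_steps) starting from K_0 = K0."""
    K, P = set(K0), object_symbols(K0)
    P0 = set(P)
    fresh = (c for c in (f"b{i}" for i in count(1)) if c not in P0)
    for n in range(1, steps + 1):
        K, P = odd_step(K, P, fresh) if n % 2 else even_step(K, P)
    return K, P

# K consists of the single sentence [(Ex)[(Ay) Q(x, y)]].
K0 = {((("E", "x"), ("A", "y")), ("Q", "x", "y"))}
K2, P2 = build(K0, 2)
quantifier_free = {atom for prefix, atom in K2 if not prefix}
```

For this K the sketch gives P_2 = {a, b1}, and the quantifier-free sentences of K_2 are Q(b1, a) and Q(b1, b1). Continuing the iteration only enlarges both sets, in agreement with the ascending chains described above.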
We propose to show that K' is consistent. Since K' can be contradictory only if a finite subset of K' is already contradictory, and since K' is the union of an ascending chain of sets K_n, it will be sufficient to prove that all K_n are consistent. If not, then there exists a first n, n = m say, such that K_m is contradictory. Also, m ≥ 1 since K_0 = K is consistent, by assumption. Suppose first that m is even. Then there exist sentences Y_1, ..., Y_k ∈ K_{m-1} and Z_1, ..., Z_l ∈ K_m - K_{m-1} such that the set {Y_1, ..., Y_k, Z_1, ..., Z_l} is contradictory. l must be positive, otherwise K_{m-1} would be already contradictory. By the definition of K_m there exist sentences V_1, ..., V_l ∈ K_{m-1} which begin with universal quantifiers,
1.5.9. V_i = [(∀x) S_i(x)], i = 1, ..., l,
such that Z_i = S_i(a_i), where the a_i are certain object symbols which belong to P_m. Then the sentences [V_i ⊃ Z_i] are theorems, by 1.3.2. Now if {Y_1, ..., Y_k, Z_1, ..., Z_l} is contradictory, then there exists a sentence W = [[A] ∧ [∼[A]]], A a relative symbol of order 0, which is deducible from {Y_1, ..., Y_k, Z_1, ..., Z_l}, and this is equivalent to the statement that
1.5.10. [Y_1 ∧ [Y_2 ∧ [... ∧ Y_k] ...]] ⊃ W
is deducible from {Z_1, ..., Z_l} (or, if k = 0, that W is itself deducible from {Z_1, ..., Z_l}). But since the sentences [V_i ⊃ Z_i] are theorems, this implies that the sentence 1.5.10. is deducible from {V_1, ..., V_l}, and hence that W is deducible from {Y_1, ..., Y_k, V_1, ..., V_l} = H, say. Then H is contradictory, although it is a subset of K_{m-1}, which is consistent. This is impossible and shows that K_m cannot be contradictory.
Suppose next that m is odd. Then there exist sentences Y_1, ..., Y_k ∈ K_{m-1}, Z_1, ..., Z_l ∈ K_m - K_{m-1}, l > 0, such that {Y_1, ..., Y_k, Z_1, ..., Z_l} is contradictory. By the definition of K_m there now exist sentences V_1, ..., V_l ∈ K_{m-1} which begin with existential quantifiers,
1.5.11. V_i = [(∃y) S_i(y)], i = 1, ..., l,
such that Z_i = S_i(a_i), where the a_i are certain object symbols which belong to P_m but not to P_{m-1}. Moreover, we may suppose that the Z_i are distinct and so the V_i and the a_i are distinct. Since {Y_1, ..., Y_k, Z_1, ..., Z_l} is contradictory there exists a sentence W = [[A] ∧ [∼[A]]] as before such that W is deducible from {Y_1, ..., Y_k, Z_1, ..., Z_l} and hence, by the rules of the calculus of propositions, such that
1.5.12. [[Y_1 ∧ [Y_2 ∧ [... ∧ [Y_k ∧ [Z_2 ∧ [... ∧ Z_l] ...]] ⊃ W] = U
is deducible from Z_1. Thus Z_1 ⊃ U is a theorem and hence, by the third rule of 1.3.3., V_1 ⊃ U is a theorem. It follows that {Y_1, ..., Y_k, V_1, Z_2, ..., Z_l} is contradictory. Applying the same argument to Z_2, ..., Z_l in turn, we conclude that {Y_1, ..., Y_k, V_1, ..., V_l} is contradictory. But this set is included in K_{m-1}, and so our assumption was wrong: K_m must be consistent. This concludes the proof of our assertion that K' is consistent.
Let K* be the set of sentences of K' which do not contain any quantifiers. K* is consistent since it is a subset of K', and the relative symbols in K* are the same as in K' and in K, while the set of object symbols in K* is P'. It therefore follows from an earlier argument that K* possesses a model M whose relations are the relative symbols just mentioned and whose set of objects is P'. We propose to show that M is a model of K; more precisely, that M is a model of K', which includes K. This will establish 1.5.4. in its entirety.
In order to prove that the sentences of K' all hold in M we use induction on the number of quantifiers in the prefix. For sentences without quantifiers the assertion is clearly valid, for such sentences are included in K*, and M is a model of K*. Suppose then that it is true for all sentences of K' with k < n quantifiers, n ≥ 1, and let X be a sentence of K' that contains exactly n quantifiers. Consider first the possibility that X begins with an existential quantifier, X = [(∃y) S(y)]. Since X belongs to K' it belongs to all K_m from some m onward, and in particular it belongs to some K_m with even subscript m. But if so then K_{m+1} contains a sentence S(a) which holds in M since it has only n - 1 quantifiers. It then follows from the semantic interpretation of X (1.4.3.) that [(∃y) S(y)], which is X, also holds in M. Suppose next that X contains exactly n quantifiers, of which the first is universal, X = [(∀y) S(y)]. Then we have to show only that S(a) holds in M for any a ∈ P'.
But it follows from the construction of K' and P' that there exists an odd subscript m such that X ∈ K_m and a ∈ P_m (and the same is then true for all greater subscripts). Hence, by the definition of K_{m+1}, S(a) is contained in that set and hence in K'. But S(a) contains n - 1 quantifiers only and so S(a) holds in M, as required. This completes the proof of 1.5.4.
We shall now estimate the number of objects required for the model M
of K, in relation to the number of sentences in K. If K is finite then all K_n and all P_n are finite and so P' is, at most, countable. If K is infinite, of cardinal k, then all K_n, P_n have at most k elements, and so P' has at most ℵ_0 k = k elements. Hence if we define the cardinal number of a model as the cardinal of its set of objects, we have the following
1.5.13. THEOREM. If the consistent set of sentences K is finite then it possesses a finite or countable model M. If K is infinite then it possesses a model whose cardinal does not exceed the cardinal of K.
For finite or countable K this is the theorem of Löwenheim and Skolem.
The notion of a structure can be defined in a slightly different way which appears to be somewhat more natural from a set-theoretical point of view. In this definition a relation of order n within a structure is identified with a subset of the set of ordered n-ples of individuals of the structure. It is then no longer possible that two distinct relations hold for the same n-ples (for in that case the relations are not distinct set-theoretically) but on the other hand it may happen that the correspondence between relative symbols and relations is many-one. While we may still use object symbols as individual objects it is now impossible for a relative symbol to take the place of a relation. In order to base our own approach on set-theoretical concepts we have to think of objects and of relations of different orders as the elements of distinct sets of individuals in some (absolute or axiomatic) Set Theory. The statement that R(a_1, ..., a_n) holds in a structure M then signifies that the sequence of n + 1 individuals R, a_1, ..., a_n belongs to a preassigned set.
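The two ways of regarding a structure can be compared in a short Python sketch. It is an illustration only: the class name Structure, the helper as_preassigned_set and the relative symbols "Lt" and "Less" are assumptions made for the example and do not occur in the text. In the set-theoretic definition each relative symbol denotes a set of n-ples, so that two symbols may denote the same relation; in the approach of this chapter one records instead the set of sequences (R, a_1, ..., a_n) which hold.

```python
class Structure:
    """A structure in the set-theoretic sense: each relation is a set of n-ples."""

    def __init__(self, individuals, interpretation):
        self.individuals = frozenset(individuals)
        # Each relative symbol is mapped to the relation (a set of tuples) it denotes.
        self.interpretation = {
            symbol: frozenset(tuple(t) for t in tuples)
            for symbol, tuples in interpretation.items()
        }

    def holds(self, symbol, *args):
        """R(a_1, ..., a_n) holds iff (a_1, ..., a_n) belongs to the relation denoted by R."""
        return tuple(args) in self.interpretation[symbol]


def as_preassigned_set(structure):
    """The encoding used in this chapter: the set of sequences (R, a_1, ..., a_n) that hold."""
    return {
        (symbol,) + t
        for symbol, relation in structure.interpretation.items()
        for t in relation
    }


# Two relative symbols denoting one and the same relation on {0, 1, 2}.
less = {(a, b) for a in range(3) for b in range(3) if a < b}
M = Structure(range(3), {"Lt": less, "Less": less})
```

Here M.interpretation["Lt"] and M.interpretation["Less"] are equal as sets, which is the many-one correspondence mentioned above, whereas as_preassigned_set(M) keeps the sequences beginning with "Lt" apart from those beginning with "Less".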
1.6. Sets of Sentences and Their Varieties. Unless an explicit remark is made to the contrary, we shall suppose from now on that the relative and object symbols coincide with the relations and individual objects (constants) which they denote, and we shall refer to them simply as relations and individuals, or constants. Given a consistent set of sentences K in a language L we may, as we have seen, suppose that K has a model M but we may not, without making special assumptions on L, suppose that all the individuals of M are contained in L. However, it is clear that if we choose a 'universe' of individuals D which is large enough (e.g. that contains as many individuals in addition to the individuals of L as there are sentences in L) then to any given consistent set of sentences K in L we may find a model M such that all individuals of M belong to D. This
ensures the existence of models to all consistent sets in a single universe of individuals and saves us from the embarrassment of ill-defined totalities.
A set of sentences K in a given language L will be called a T-system (T for Tarski) if it includes all sentences which can be deduced from it, in symbols if S(K) = K. For example, the set of all theorems in L is a T-system. A set of structures V (based on a definite universe of individuals, as explained above) will be called a variety (of structures) if V consists of all structures which are models of a set of sentences K in L. We shall also say in this case that V is the variety of K, in symbols K → V. The theory of systems of sentences and of varieties of structures will be developed later (Chapters VII and VIII) in a more elaborate setting. In the present section we wish to establish a particular result on varieties which is closely related to 1.5.4.
1.6.1. COMPACTNESS THEOREM FOR VARIETIES OF STRUCTURES. Let {V_ν} be a set of varieties of structures such that the intersection of any finite number of elements of {V_ν} is not empty. Then the intersection ⋂_ν {V_ν} is not empty.
Since the V_ν are varieties there exist sets of sentences K_ν such that for every ν, K_ν → V_ν, i.e. V_ν is the variety of K_ν. Let K = ⋃_ν {K_ν}. Then every model of K is a model of every K_ν and hence belongs to every V_ν. Thus, in order to prove 1.6.1. we only have to show that K possesses a model. This, by 1.5.4., requires only that K is consistent, i.e. that every finite subset of K is consistent. Let {X_1, ..., X_n} be any finite subset of K. Then there exist sets K_ν, e.g. K_1, ..., K_n, such that X_i ∈ K_i, i = 1, ..., n. Now let M be an element of V_1 ∩ ... ∩ V_n, where K_i → V_i, i = 1, ..., n. Such a structure M exists by the assumption of 1.6.1. Then the sentences X_i hold in M, i = 1, ..., n. It follows that {X_1, ..., X_n} is consistent. Thus, K is consistent and the theorem is proved. Closely related to 1.6.1. is the following
1.6.2. PRINCIPLE OF LOCALIZATION. Let K be a set of sentences such that every finite subset of K possesses a model. Then K possesses a model.
Indeed, every finite subset of K is consistent since it possesses a model. Hence K is consistent. Hence K possesses a model, by 1.5.4. This proves 1.6.2.
There is an essential difference between Theorem 1.5.4. on one hand and
Theorems 1.6.1. and 1.6.2. on the other. Whereas 1.5.4. establishes a connection between the deductive or syntactical properties of a set of sentences on one hand, and the semantic or model theoretic properties of the set on the other, 1.6.1. and 1.6.2. refer only to the models of a set and do not contain any mention of its deductive properties. It turns out that for many applications 1.6.2. may be used in place of 1.5.4. The question arises whether the former theorems cannot be proved directly without reference to the rules of deduction. This is indeed possible, e.g. on the basis of the special valuation lemma 1.5.6. However, even in cases when the use of the rules of deduction is not essential, it may still be instructive. Since the very nature of the calculus of deduction shows that any deduction can involve a finite number of sentences only, the consistency of any set must be determined by the consistency of its finite subsets. Thus, the equivalence of consistency and of the existence of a model provides a natural explanation of 1.6.2. and 1.6.1.
1.7. Problems
1.7.1. Show how to modify the theory of this chapter if functions (function symbols) are included, such as p(x, y) for x + y. (Compare section 9.1, below).
1.7.2. Show how to modify the theory of this chapter if object symbols are excluded.
1.7.3. Derive 1.6.2. and 1.6.1. (in that order) without using the calculus of deduction.
1.7.4. Modify the theory of semantic interpretation (section 1.4.) so as to suit the set-theoretic definition of a structure (end of section 1.5.).
References. The calculus introduced in sections 1.2. and 1.3. is a modified version of Hilbert-Bernays 1934/1939. 1.5.3. is proved in Gödel 1930. Theorems 1.5.4., 1.6.1., 1.6.2. are equivalent if 1.5.3. is taken for granted. First came 1.6.2., which is due to Malcev (stated in Malcev 1941). Proofs of 1.5.4. will be found in Henkin 1949, A. Robinson 1951; compare also Rasiowa-Sikorski 1950, Beth 1951. 1.6.1. is stated in Tarski 1952. For the theorem of Löwenheim-Skolem (with a semantic definition of consistency) see Löwenheim 1915, Skolem 1920. The set-theoretic definition of a structure is given in Tarski 1952, where a structure is called a relational system. 1.5.6. is a special case of 9.3.2., below. It follows immediately from a lemma in Rado 1949. Deductively closed systems are considered in Tarski 1935/1936.