Theodore Sider
June 2, 2008
Preface
Another important source was Ed Gettier’s 1988 modal logic class at the Uni-
versity of Massachusetts. My notes from that course formed the basis of the
first incarnation of this work.
I am also deeply grateful for feedback from colleagues, and from students
in courses on this material. In particular, Marcello Antosh, Josh Armstrong,
Gabe Greenberg, Angela Harper, Sami Laine, Gregory Lavers, Alex Morgan,
Jeff Russell, Brock Sides, Jason Turner, Crystal Tychonievich, Jennifer Wang,
Brian Weatherson, and Evan Williams: thank you.
Contents

Preface

1 Nature of Logic
   1.1 Logical consequence and logical truth
   1.2 Form and abstraction
   1.3 Formal logic
   1.4 Correctness and application
   1.5 The nature of logical consequence
   1.6 Extensions, deviations, variations
      1.6.1 Extensions
      1.6.2 Deviations
      1.6.3 Variations
   1.7 Metalogic, metalanguages, and formalization
   1.8 Set theory

2 Propositional Logic
   2.1 Grammar of PL
   2.2 The semantic approach to logic
   2.3 Semantics of PL
   2.4 Natural deduction in PL
      2.4.1 Sequents
      2.4.2 Rules
      2.4.3 Sequent proofs
      2.4.4 Example sequent proofs
   2.5 Axiomatic proofs in PL
      2.5.1 Example axiomatic proofs
   2.6 Soundness of PL
   2.7 Completeness of PL

4 Predicate Logic
   4.1 Grammar of predicate logic
   4.2 Semantics of predicate logic
   4.3 Establishing validity and invalidity

8 Counterfactuals
   8.1 Natural language counterfactuals
      8.1.1 Not truth-functional
      8.1.2 Can be contingent
      8.1.3 No augmentation
      8.1.4 No contraposition
      8.1.5 Some implications
      8.1.6 Context dependence
   8.2 The Lewis/Stalnaker approach
   8.3 Stalnaker’s system (SC)
      8.3.1 Syntax of SC
      8.3.2 Semantics of SC
   8.4 Validity proofs in SC
   8.5 Countermodels in SC
   8.6 Logical features of SC
      8.6.1 Not truth-functional
      8.6.2 Can be contingent
      8.6.3 No augmentation
      8.6.4 No contraposition
      8.6.5 Some implications
      8.6.6 No exportation
      8.6.7 No importation
      8.6.8 No hypothetical syllogism (transitivity)
      8.6.9 No transposition
   8.7 Lewis’s criticisms of Stalnaker’s theory
   8.8 Lewis’s system
   8.9 The problem of disjunctive antecedents

References
Chapter 1
Nature of Logic
Since you are reading this book, you are probably already familiar with
some logic. You probably know how to translate English sentences into
symbolic notation—into propositional logic:
Likewise, we say that “it’s not the case that snow is white and snow is not white”
is a logical truth because it has the form: it’s not the case that φ and not-φ.
We need to think hard about the idea of form. Apparently, we got the
alleged form of Argument A by replacing some words with Greek letters and
leaving other words as they were. We replaced the sentences ‘John is happy’ and
‘Ted is happy’ with φ and ψ, respectively, but left the expressions ‘It’s not the
case that’ and ‘or’ as they were, resulting in the schematic form displayed above.
Let’s call that form “Form 1”. What’s so special about Form 1? Couldn’t we
make other choices for what to leave and what to replace? For instance, if we
replace the predicate ‘is happy’ with the schematic letter α, leaving the rest
intact, we get this:
And if we replace the ‘or’ with the schematic letter γ and leave the rest intact,
then we get this:
So, what did we mean, when we said that Argument A is logically correct in
virtue of its form? What is Argument A’s form? Is it Form 1, Form 2, or Form
3?
There is no such thing as the form of an argument. When we assign an
argument a form, what we are doing is focusing on certain words and ignoring
others. We leave intact the words we’re focusing on, and we insert schematic
letters for the rest. Thus, in assigning Argument A Form 1, we’re focusing on
the words (phrases) ‘it is not the case that’ and ‘or’, and ignoring other words.
More generally, in (standard) propositional logic, we focus on the phrases
‘if…then’, ‘if and only if’, ‘and’, ‘or’, and so on, and ignore others. We do this
in order to investigate the relations of logical consequence that hold in virtue
of these words’ meaning. The fact that Argument A is logically correct depends
just on the meaning of the phrases ‘it is not the case that’ and ‘or’; it does not
depend on the meanings of the sentences ‘John is happy’ and ‘Ted is happy’.
We can substitute any sentences we like for ‘φ’ and ‘ψ’ in Form 1 and still get
a valid argument.
In predicate logic, on the other hand, we focus on further words: ‘all’ and
‘some’. Broadening our focus in this way allows us to capture a wider range
of logical consequences and logical truths. For example “If Ted is happy then
someone is happy” is a logical truth in virtue of the meaning of ‘someone’, but
not merely in virtue of the meanings of the characteristic words of propositional
logic.
Call the words on which we’re focusing—that is, the words that we leave
intact when we construct the forms of sentences and arguments—the logical
constants. (We can speak of natural language logical constants—‘and’, ‘or’, etc.
for propositional logic; ‘all’ and ‘some’ in addition for predicate logic—as well
as symbolic logical constants: ∧, ∨, etc. for propositional logic; ∀ and ∃ in
addition for predicate logic.) What we’ve seen is that the forms we assign
depend on what we’re considering to be the logical constants.
We call these expressions logical constants because we interpret them in a
constant way in logic, in contrast to other terms. For example, ∧ is a logical
constant; in propositional logic, it always stands for conjunction. There are
fixed rules governing ∧, in proof systems (the rule that from P ∧Q one can
infer P , for example), in the rules for constructing truth tables, and so on.
Moreover, these rules are distinctive for ∧: there are different rules for other
logical constants such as ∨. In contrast, the terms in logic that are not logical
constants do not have fixed, particular rules governing their meanings. For
example, there are no special rules governing what one can do with a P as
α is a bachelor
Therefore, α is unmarried
Accordingly, we could treat the predicates ‘is a bachelor’ and ‘is unmarried’
as logical constants, and develop a corresponding logic. We could introduce
special symbolic logical constants for these predicates, we could introduce
distinctive rules governing these predicates in proofs. (The rule of “bachelor-
elimination”, for instance, might allow one to infer “α is unmarried” from “α
is a bachelor”.) As with the choices of the previous paragraph, this choice of
what to treat as a logical constant is also not ruled out by the concept of form.
And it would be more productive than the choices of the last paragraph. Still,
it would be far less productive than the usual choices of logical constants in
predicate and propositional logic. The word ‘bachelor’ doesn’t have as general
application as the words commonly treated as logical constants in propositional
and predicate logic; the latter are ubiquitous.
At least, this remark about “generality” is one idea about what should be
considered a “logical constant”, and hence one idea about the scope of what
is usually thought of as “logic”. Where to draw the boundaries of logic—and
indeed, whether the logic/nonlogic boundary is an important one to draw—is
an open philosophical question about logic. At any rate, in this course, one
thing we’ll do is study systems that expand the list of logical constants from
standard propositional and predicate logic.
P
(Q→R)∨(Q→∼S)
P ↔(P ∧Q)
The symbols ∧, ∨, etc., are used to represent the English words ‘and’, ‘or’, and
so on (the logical constants for propositional logic), and the sentence letters
P, Q, etc., are used to represent declarative English sentences.
Why ‘formal’? Because we stipulate, in a mathematically rigorous way,
a grammar for the language; that is, we stipulate a mathematically rigorous
definition of the idea of a sentence of this language. Moreover, since we are
only interested in the logical behavior of the chosen logical constants ‘and’,
‘or’, and so on, we choose special symbols (∧, ∨ . . . ) for these words only; we
use P, Q, R, . . . indifferently to represent any English sentence whose internal
logical structure we are willing to ignore.[3]
We go on, then, to study (as always, in a mathematically rigorous way) vari-
ous concepts that apply to the sentences in formal languages. In propositional
logic, for example, one constructs a mathematically rigorous definition of a
[3] Natural languages like English also have a grammar, and the grammar can be studied
using mathematical techniques. But the grammar is much more complicated, and is discovered
rather than stipulated; and natural languages lack abstractions like the sentence letters.
tautology (“all Trues in the truth table”), and a rigorous definition of a prov-
able formula (e.g., in terms of a system of deduction, using rules of inference,
assumptions, and so on).
Of course, the real goal is to apply the notions of logical consequence and
logical truth to sentences of English and other natural languages. The formal
languages are merely a tool; we need to apply the tool.
translation of the former English sentence from the translations of the latter
English sentences in the formal system.
In this book I won’t spend much time on philosophical questions about
which formal systems are correct. My goal is rather to introduce those for-
malisms that are ubiquitous in philosophy, to give you the tools you need to
address such philosophical questions yourself. Still, from time to time, we’ll dip
just a bit into these philosophical questions, in order to motivate our choices
of logical systems to study.
· necessarily
· it will be the case that
· most
· it is morally wrong that
1.6.1 Extensions
Here we add to standard logic. We add both:
· new symbols
· new cases of logical consequence and logical truth that we can model
1.6.2 Deviations
Here we change, rather than add. We retain the same symbols from standard
logic, but we alter standard logic’s proof theory and semantics. We therefore
change what we say about the logical consequences and logical truths that
involve the symbols.
Why do this? Perhaps because we think that standard logicians are wrong
about what the right logic for English is. If we want to correctly model logical
consequence in English, therefore, we must construct systems that behave
differently from standard logic.
For example, in the standard semantics for propositional logic, every for-
mula is either true or false. But some have argued that natural language sen-
tences like the following are neither true nor false:
1.6.3 Variations
Here we also change standard logic, but we change the notation without
changing the content of logic. We study alternate ways of expressing the same
thing.
∼(P ∧Q)
∼P ∨∼Q
are two different ways of saying the same thing. We will study other ways of
saying what those two sentences say, including:
P |Q
∼∧P Q
In the first case, | is a new symbol for “not both”. In the second case (“Polish
notation”), the ∼ and the ∧ mean what they mean in standard logic; but instead
of going between the P and the Q, the ∧ goes before P and Q. The value of
this, as we’ll see, is that we no longer will need parentheses.
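To see why parentheses become dispensable, consider evaluating a Polish-notation formula: since each connective comes before its arguments, the grouping is never ambiguous. Here is a small sketch in Python (the function name is mine, and the ASCII symbols ~ and & stand in for ∼ and ∧):

```python
def eval_polish(tokens, interp):
    """Evaluate a Polish-notation formula left to right.
    tokens: a list like ['~', '&', 'P', 'Q'] (i.e., the formula ∼∧PQ);
    interp: a dict assigning 0 or 1 to each sentence letter.
    Returns (truth value, remaining tokens)."""
    head, rest = tokens[0], tokens[1:]
    if head == '~':                            # one argument follows
        v, rest = eval_polish(rest, interp)
        return 1 - v, rest
    if head == '&':                            # two arguments follow, in order
        v1, rest = eval_polish(rest, interp)
        v2, rest = eval_polish(rest, interp)
        return min(v1, v2), rest
    return interp[head], rest                  # a sentence letter

# ∼∧PQ, i.e. ∼(P∧Q), with P true and Q false:
value, _ = eval_polish(['~', '&', 'P', 'Q'], {'P': 1, 'Q': 0})
```

Note that the evaluator never consults parentheses; the prefix position of each connective settles the structure by itself.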
These are really interesting claims! They show that the method of truth tables
and the method of constructing derivations amount to the same thing, as
applied to symbolic formulas of propositional logic. One can establish similar
results for standard predicate logic.
A couple remarks about proving things in metalogic.
First: what do we mean by “proving”? We do not mean: constructing a
derivation in the logical system we’re investigating. We’re trying to construct a
proof about the system. We do this in English, and we do it with informal (though
rigorous!) reasoning of the sort one would encounter in a mathematics book.
Logicians often distinguish the “object language” from the “metalanguage”.
The object language is the language that’s being studied—the language of
propositional logic, for example. Sentences of this object language look like
this:
P ∧Q
∼(P ∨Q)↔R
The metalanguage is the language we use to talk about the object language.
In the case of the present book, the metalanguage is English. Here are some
example sentences of the metalanguage:
those sets that are not members of themselves. For short, R is the set of non-
self-members. Russell asks the following question: is R a member of itself?
There are two possibilities:
A ∩ B = {u : u ∈ A and u ∈ B}
A ∪ B = {u : u ∈ A or u ∈ B}
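These definitions carry over directly to any setting with finite sets. A quick illustration in Python, where & and | happen to compute ∩ and ∪:

```python
A = {1, 2, 3}
B = {2, 3, 4}

# A ∩ B: the u such that u is in A and u is in B
intersection = {u for u in A | B if u in A and u in B}
# A ∪ B: the u such that u is in A or u is in B
union = {u for u in A | B if u in A or u in B}

assert intersection == (A & B) == {2, 3}
assert union == (A | B) == {1, 2, 3, 4}
```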
Sets have members, but they don’t contain them in any particular order. For
example, the set containing me and Bill Clinton doesn’t have a “first” member.
This is reflected in the fact that “{Ted, Clinton}” and “{Clinton, Ted}” are
two different names for the same set—the set containing just Clinton and Ted.
But sometimes we need to talk about a set-like thing containing Clinton and
Ted, but in a certain order. For this purpose, logicians use ordered sets. Two-
membered ordered sets are called ordered pairs. To name the ordered pair of
Clinton and Ted, we use: “〈Clinton, Ted〉”. Here, the order is significant, for
〈Clinton, Ted〉 and 〈Ted, Clinton〉 are not the same thing. The three-membered
ordered set of u, v, and w (in that order) is written: 〈u, v, w〉; and similarly for
ordered sets of any finite size. An n-membered ordered set is called an n-tuple.
Let’s even allow 1-tuples: let’s define the 1-tuple 〈u〉 as being the object u itself.
In addition to sets, and ordered sets, we’ll need a further related concept:
that of a function. A function is a rule that “takes in” an object or objects,
and “spits out” a further object. For example, the addition function is a rule
that takes in two numbers, and spits out their sum. As with sets and ordered
sets, functions are not limited to mathematical entities: they can “take in” and
“spit out” any objects whatsoever. We can speak of the father-of function, for
example, which is a rule that takes in a person, and spits out the father of that
person. And later in this book we will be considering functions that take in
and spit out linguistic entities: sentences and parts of sentences from formal
languages.
Each function has a fixed number of “places”: a fixed number of objects
it must take in before it is ready to spit out something. You need to give
the addition function two arguments (numbers) in order to get it to spit out
something, so it is called a two-place function. You only need to give the father-
of function one object, on the other hand, to get it to spit out something, so it
is a one-place function.
The objects that the function takes in are called its arguments, and the object
it spits out is called its value. Suppose f is an n-place function, and u1 . . . un are
n of its arguments; one then writes “ f (u1 . . . un )” for the value of function f as
applied to arguments u1 . . . un . f (u1 . . . un ) is the object that f spits out, if you
feed it u1 . . . un . For example, where f is the father-of function, since Ron is
my father, we can write: f (Ted) = Ron; and, where a is the addition function,
we can write: a(2, 3) = 5.
There’s a trick for “reducing” talk of both ordered pairs and functions to
talk of sets. One first defines 〈u, v〉 as the set {{u}, {u, v}}; one defines 〈u, v, w〉
as 〈u, 〈v, w〉〉, and similarly for n-membered ordered sets, for each positive
integer n. And, finally, one defines an n-place function as a set, f , of (n + 1)-
tuples obeying the constraint that if 〈u1 , . . . , un , v〉 and 〈u1 , . . . , un , w〉 are both
members of f , then v = w.
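The pair-encoding trick can be checked mechanically. Here is a sketch in Python using frozensets (the helper names are mine):

```python
def pair(u, v):
    """Encode the ordered pair <u, v> as the set {{u}, {u, v}}."""
    return frozenset([frozenset([u]), frozenset([u, v])])

def triple(u, v, w):
    """Encode <u, v, w> as <u, <v, w>>."""
    return pair(u, pair(v, w))

# Order now matters, even though only (unordered) sets are involved:
assert pair(1, 2) == pair(1, 2)
assert pair(1, 2) != pair(2, 1)
assert triple(1, 2, 3) != triple(3, 2, 1)
```

The point of the encoding is exactly what the assertions record: two encoded pairs are identical just in case they have the same members in the same order.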
Chapter 2

Propositional Logic
2.1 Grammar of PL
Modern logic has made great strides by treating the language of logic as a
mathematical object. To do so, grammar needs to be developed rigorously.
(Our study of a new logical system will always begin with grammar.)
If all you want to do is understand the language of logic informally, and be
able to use it effectively, you don’t really need to get so careful about grammar.
For even if you haven’t ever seen the grammar of propositional logic formalized,
you can recognize that things like this make sense:
P →Q
R ∧ (∼S↔P )
while strings like these do not:
→P QR∼
(P ∼Q∼(∨
But to make any headway in metalogic, we will need more than an intuitive
understanding of what makes sense and what does not; we will need a precise
definition that has the consequence that only the strings of symbols in the first
group “make sense”.
Grammatical formulas (i.e., ones that “make sense”) are called well-formed
formulas, or “wffs” for short. We define these by first carefully defining exactly
which symbols are allowed to occur in wffs (the “primitive vocabulary”), and
second, carefully defining exactly which strings of these symbols count as wffs:
Primitive vocabulary:
· sentence letters: P, Q, R, . . . , with or without numerical subscripts
· connectives: →, ∼
· parentheses: ( , )

Definition of wff:
i) Sentence letters are wffs
ii) If φ and ψ are wffs, then ∼φ and (φ→ψ) are also wffs
iii) Only strings that can be shown to be wffs using i) and ii) are wffs
Think of this procedure in reverse: we begin with the smallest wffs (sentence
letters), and build up complex wffs using clause ii). Example: we can use clauses
i) and ii) to show that the expression (∼P →(P →Q)) is a wff:
· P and Q are wffs (clause i))
· so, since P is a wff, ∼P is also a wff (clause ii))
· so, since P and Q are both wffs, (P →Q) is also a wff (clause ii))
· so, since ∼P and (P →Q) are both wffs, (∼P →(P →Q)) is also a wff (clause
ii))
What’s the point of clause iii)? Clauses i) and ii) provide only sufficient
conditions for being a wff, and therefore do not on their own exclude non-
sense combinations of primitive vocabulary like P ∼Q∼R, or even strings like
(P ∨147)→⊕ that include disallowed symbols. Clause iii) rules these strings out,
since there is no way to build up either of these strings from clauses i) and ii),
in the way that we built up the wff (∼P →(P →Q)).
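The recursive definition can be turned directly into a recursive decision procedure. Here is a sketch in Python (the function name is mine, and the ASCII symbols ~ and -> stand in for ∼ and →):

```python
def is_wff(s):
    """Decide whether string s is a wff of PL (primitives: ~ and ->)."""
    if len(s) == 1 and s.isalpha() and s.isupper():
        return True                                  # clause i): a sentence letter
    if s.startswith("~"):
        return is_wff(s[1:])                         # clause ii): ~phi
    if s.startswith("(") and s.endswith(")"):
        inner = s[1:-1]
        depth = 0
        for i, c in enumerate(inner):                # find the main -> at depth 0
            if c == "(":
                depth += 1
            elif c == ")":
                depth -= 1
            elif depth == 0 and inner.startswith("->", i):
                return is_wff(inner[:i]) and is_wff(inner[i + 2:])
        return False
    return False   # clause iii): nothing else counts as a wff
```

Clause iii) corresponds to the final `return False`: any string not built up by the first two clauses is rejected, just as P∼Q∼R was rejected above.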
What happened to ∧, ∨, and ↔? Our definition of a wff mentions only
→ and ∼; it therefore counts expressions like P ∧Q, P ∨Q, and P ↔Q as not
being wffs. Answer: we can define the ∧, ∨, and ↔ in terms of ∼ and →:
Definitions of ∧, ∨, and ↔:
· φ∧ψ is short for ∼(φ→∼ψ)
· φ∨ψ is short for (∼φ→ψ)
· φ↔ψ is short for ∼((φ→ψ)→∼(ψ→φ))

And other alternate choices are possible. We’ll talk about this later.
So: → and ∼ are our primitive connectives; the others are defined. Why
do we choose only a small number of primitive connectives? Because, as we
will see, it makes meta-proofs easier.
is a sentence that is “true no matter what”, and the idea that one sentence is a
logical consequence of some other sentences iff there is “no way” for the latter
sentences to be true without the former sentence being true. We will use our
definitions to make these rough statements more precise: we will say that one
formula is a logical consequence of others iff there is no configuration in which
the latter formulas are true but the former is not, and that a formula is a logical
truth iff it is true in all configurations.
2.3 Semantics of PL
Our semantics for propositional logic is really just truth tables, only presented
a little more carefully than in introductory logic books. What a truth table of a
formula does is depict how the truth value of that formula is determined by the
truth values of its sentence letters, for each possible combination of truth values
for its sentence letters. To do this nonpictorially, we need to define a notion
corresponding to “a possible combination of truth values for sentence letters”.
→ | 1   0
1 | 1   0
0 | 1   1

∼ |
1 | 0
0 | 1
VI (α) = I (α), for each sentence letter α
VI (φ→ψ) = 1 iff either VI (φ) = 0 or VI (ψ) = 1
VI (∼φ) = 1 iff VI (φ) = 0
We have another recursive definition: the valuation function’s values for com-
plex formulas are determined by its values for smaller formulas; and this pro-
cedure bottoms out in the values for sentence letters, which are determined
directly by the interpretation function I .
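The recursive definition of the valuation function can itself be written out as a recursive program. A sketch in Python (representation and names are mine: wffs are nested tuples, with ~ and -> standing in for ∼ and →):

```python
def val(wff, interp):
    """The valuation V_I determined by an interpretation I (a dict
    assigning 0 or 1 to sentence letters). A wff is either a letter
    like 'P', a negation ('~', phi), or a conditional (phi, '->', psi)."""
    if isinstance(wff, str):
        return interp[wff]                       # V_I(alpha) = I(alpha)
    if wff[0] == '~':
        # V_I(~phi) = 1 iff V_I(phi) = 0
        return 1 if val(wff[1], interp) == 0 else 0
    phi, _, psi = wff
    # V_I(phi -> psi) = 1 iff either V_I(phi) = 0 or V_I(psi) = 1
    return 1 if val(phi, interp) == 0 or val(psi, interp) == 1 else 0

# The recursion bottoms out in I's assignment to the sentence letters:
assert val(('P', '->', 'Q'), {'P': 1, 'Q': 0}) == 0
assert val(('~', ('P', '->', 'Q')), {'P': 1, 'Q': 0}) == 1
```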
Notice also that in the definition of a valuation function I use the English
logical connectives ‘either…or’, and ‘iff ’. I used these English connectives
rather than the logical connectives ∨ and ↔, because at that point I was not
writing down wffs of the language of study (in this case, the language of propo-
sitional logic). I was rather using sentences of English—our metalanguage, the
informal language we’re using to discuss the formal language of propositional
logic—to construct my definition of the valuation function. My definition
needed to employ the logical notions of disjunction and bi-implication, the
English words for which are ‘either…or’ and ‘iff’.
1
The → table, for example, shows what truth value φ→ψ takes on depending on the truth
values of its parts. Rows correspond to truth values for φ, columns to truth values for ψ. Thus,
to ascertain the truth value of φ→ψ when φ is 1 and ψ is 0, we look in the 1 row and the 0
column. The listed value there is 0—the conditional is false in this case. The ∼ table lacks
multiple columns because ∼ is a one-place connective.
One might again worry that something circular is going on. We defined the
symbols for disjunction and bi-implication, ∨ and ↔, in terms of ∼ and → in
section 2.1, and now we’ve defined the valuation function in terms of disjunction
and bi-implication. So haven’t we given a circular definition of disjunction
and bi-implication? No. When we define the valuation function, we’re not
trying to define logical concepts such as negation, conjunction, disjunction,
implication, and bi-implication, and so on, at all. A reductive definition of these
very basic concepts is probably impossible (though one can define some of them
in terms of the others). What we are doing is starting with the assumption that
we already understand the logical concepts, and then using those notions to
provide a formalized semantics for a logical language. This can be put in terms
of object- and meta-language: we use metalanguage connectives, such as ‘iff’
and ‘or’, which we simply take ourselves to understand, to provide a semantics
for the object language connectives ∼, →, etc.
Back to the definition of the valuation function. The definition applies
only to official wffs, which can contain only the primitive connectives → and
∼. But sentences containing ∧, ∨, and ↔ are abbreviations for official wffs,
and therefore they too are governed by the definition. In fact, given the
abbreviations defined in section 2.1, one can show that the definition assigns
the intuitively correct truth values to sentences containing ∧, ∨, and ↔; one
can show that for any PL-interpretation I , and any wffs ψ and χ ,
I’ll show that the first statement is true here; the others are exercises for the
reader. I’ll write out my proof in excessive detail, to make it clear exactly how
the reasoning works.
Proof that ∧ gets the right truth condition. Let ψ and χ be any wffs. The expres-
sion ψ∧χ is an abbreviation for the expression ∼(ψ→∼χ ). So we want to
show that, for any PL-interpretation I , VI (∼(ψ→∼χ )) = 1 iff VI (ψ) = 1 and
VI (χ ) = 1. Now, in order to show that a statement α holds iff a statement
β holds, we must first show that if α holds, then β holds (the “forwards ⇒
direction”); then we must show that if β holds then α holds (the “backwards
⇐direction”):
Let’s reflect on what we’ve done so far. We have defined the notion of
a PL-interpretation, which assigns 1s and 0s to sentence letters of the sym-
bolic language of propositional logic. And we have also defined, for any PL-
interpretation, a corresponding PL-valuation function, which extends the
interpretation’s assignment of 1s and 0s to complex wffs of PL. Note that we
have been informally speaking of these assignments as assignments of truth
values. That’s because the assignments of 1s and 0s accurately model the truth
values of statements in English that are represented in the obvious way by
PL-wffs. For example, the ∼ of propositional logic is supposed to model the
English phrase ‘it is not the case that’. Accordingly, just as an English sentence
[2] The careful reader will note that here (and henceforth), I treat “VI (α) = 0” and “VI (α) is
not 1” interchangeably (for any wff α). (Similarly for “VI (α) = 1” and “VI (α) is not 0”.) This is
justified as follows. First, if VI (α) is 0, then it can’t also be that VI (α) is 1—VI was stipulated
to be a function. Second, since it was stipulated that VI assigns either 0 or 1 to each wff, if
VI (α) is not 1, then VI (α) must be 0.
“It is not the case that φ” is true iff φ is false, one of our valuation functions
assigns 1 to ∼φ iff it assigns 0 to φ.
Semantics in logic, recall, generally defines two things: configurations and
truth-in-a-configuration. In the propositional logic semantics we have laid
out, the configurations are the interpretation functions, and the valuation
function defines truth-in-a-configuration. Each interpretation function gives
a complete assignment of truth values to the sentence letters. Thus, insofar
as the sentence letters are concerned, an interpretation function completely
specifies a possible configuration of the world. And for any interpretation
function, its corresponding valuation function specifies, for each complex wff,
what truth value that wff has in that interpretation. Thus, for each wff (φ) and
each configuration (I ), we have specified the truth value of that wff in that
configuration (VI (φ)).
Onward. We are now in a position to define the semantic versions of the
notions of logical truth and logical consequence for PL. The semantic notion
of a logical truth is that of a valid formula:
if a formula has the form (φ→ψ), then we assigned it an appropriate truth value
based on the truth values of φ and ψ. But suppose we had a formula in our
language that looked as follows:
P →P →P
and suppose that P has truth value 0. What is the truth value of the whole? We
can’t tell, because of the missing parentheses. For if the parentheses look like
this:
(P →P )→P
then the truth value is 0, whereas if the parentheses look like this:
P →(P →P )
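then the truth value is 1. The two groupings can be computed directly (a sketch; the helper name cond is mine):

```python
def cond(a, b):
    return 1 if a == 0 or b == 1 else 0   # the truth function for ->

P = 0
assert cond(cond(P, P), P) == 0   # (P->P)->P
assert cond(P, cond(P, P)) == 1   # P->(P->P)
```

The two parenthesizations disagree when P is 0, which is exactly why the unparenthesized string P→P→P has no determinate truth value.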
1 P →(Q→R)
2 P ∧Q
3 P 2, ∧E
4 Q 2, ∧E
5 Q→R 1, 3 →E
6 R 4, 5 →E
7 (P ∧Q)→R 2-6, →I
or like this:
1. P →(Q→R) Pr.
2. show (P ∧Q)→R CD
3. P ∧Q As.
4. show R DD
5. P 3, ∧E
6. Q 3, ∧E
7. Q→R 1, 5 →E
8. R 6, 7→E
from the derivations familiar from introductory books. Our version of the
above derivation will look like this:
1. P →(Q→R) ⊢ P →(Q→R)            RA
2. P ∧Q ⊢ P ∧Q                    RA
3. P ∧Q ⊢ P                       2, ∧E
4. P ∧Q ⊢ Q                       2, ∧E
5. P →(Q→R), P ∧Q ⊢ Q→R           1, 3 →E
6. P →(Q→R), P ∧Q ⊢ R             4, 5 →E
7. P →(Q→R) ⊢ (P ∧Q)→R            6, →I
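The premise-set bookkeeping that distinguishes this style of derivation can be sketched in code. This is only an illustration of how premises accumulate under →E and get discharged under →I, not a proof checker; all names and the ASCII string notation for wffs (& for ∧, -> for →) are mine:

```python
# A sequent is a pair (premises, conclusion), premises a frozenset of wffs.
def ra(phi):
    """RA: phi |- phi."""
    return (frozenset([phi]), phi)

def and_e(seq, conjunct):
    """∧E: from Gamma |- phi∧psi, infer Gamma |- phi (or Gamma |- psi)."""
    return (seq[0], conjunct)

def arrow_e(seq_cond, seq_ant, consequent):
    """→E: from Gamma |- phi→psi and Delta |- phi, infer Gamma ∪ Delta |- psi."""
    return (seq_cond[0] | seq_ant[0], consequent)

def arrow_i(seq, antecedent, conditional):
    """→I: from Gamma ∪ {phi} |- psi, infer Gamma |- phi→psi (phi is discharged)."""
    return (seq[0] - {antecedent}, conditional)

# Replaying the derivation above, step by step:
s1 = ra("P->(Q->R)")
s2 = ra("P&Q")
s3 = and_e(s2, "P")
s4 = and_e(s2, "Q")
s5 = arrow_e(s1, s3, "Q->R")
s6 = arrow_e(s5, s4, "R")
s7 = arrow_i(s6, "P&Q", "(P&Q)->R")
assert s7 == (frozenset(["P->(Q->R)"]), "(P&Q)->R")
```

Notice how line 6 carries both premises, and how →I at line 7 strips P∧Q out of the premise set, exactly as in the derivation above.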
2.4.1 Sequents
Natural deduction systems model the kind of reasoning one employs in everyday
life. How does that reasoning work? In its simplest form, one reasons in a
step-by-step fashion from premises to a conclusion, each step being sanctioned
by a rule of inference. For example, suppose that one already knows the premise
that P ∧(P →Q) is true. One can then reason one’s way to the conclusion that
Q is also true, as follows:
In this kind of proof, each step is a tiny, indisputably correct, logical inference.
Consider the moves from 1 to 2 and from 1 to 3, for example. These are
indisputably correct because a conjunctive statement clearly logically implies
either of its conjuncts. Likewise for the move from 2 and 3 to 4: it is clear
that a conditional statement, plus its antecedent, together imply its consequent.
Natural deduction systems consist in part of simple general principles like these
(“a conjunctive statement logically implies either of its conjuncts”); they are
known as rules of inference.
In addition to rules of inference, ordinary reasoning employs a further
technique: the use of assumptions. In order to establish a conditional claim “if P
then Q”, one would ordinarily i) assume P , ii) reason one’s way to Q, and then
iii) on that basis conclude that the conditional claim “if P then Q” is true. Once
P ’s assumption is shown to lead to Q, the conditional claim “if P then Q” may
be concluded. Another example: to establish a claim of the form “not-P ”, one
would ordinarily i) assume P , ii) reason one’s way to a contradiction, and iii)
on that basis conclude that “not-P ” is true. Once P ’s assumption is shown to
lead to a contradiction, “not-P ” may be concluded. The first sort of reasoning
is called conditional proof, the second, reductio ad absurdum.
When one reasons with assumptions, one writes down statements that one
does not know to be true. When you write down P as an assumption, with the
goal of proving the conditional “if P then Q”, you do not know P to be true.
You’re merely assuming P for the sake of establishing the conditional “if P
then Q”. Outside the context of this proof, the assumption need not hold; once
you’ve reasoned your way to Q on the basis of the assumption of P , and so
concluded that the conditional “if P then Q” is true, you stop assuming P . To
model this sort of reasoning formally, we need a way to keep track of how the
conclusions we establish depend on the assumptions we have made. Natural
deduction systems in introductory textbooks tend to do this geometrically (by
placement on the page), with special markers (e.g., ‘show’), and by drawing
lines or boxes around parts of the proof once the assumptions that led to those
parts are no longer operative. We will do it differently: we will keep track of the
dependence of conclusions on assumptions by writing down explicitly, for each
conclusion, which assumptions it depends on. We will do this by constructing
our derivations out of sequents.
A sequent looks like this:
Γ ` φ

Here Γ, a set of formulas, contains the sequent's premises, and the single formula φ is its conclusion. We have not officially defined the symbol ` (the "turnstile") as expressing
a relation of logical consequence between its premises and its conclusion, but
the idea nevertheless makes sense. Let’s introduce an informal notion of logical
correctness for sequents: the sequent Γ ` φ is logically correct if the formula φ
is a logical consequence of the formulas in Γ. Thus, one is entitled to conclude
the conclusion of a logically correct sequent from its premises. The idea, then,
of constructing a sequent proof of a sequent is to show that that sequent is
logically correct—to show, that is, that its conclusion is a logical consequence
of its premises.
From our investigation of the semantics of propositional logic, we already
have the makings of a semantic criterion for when a sequent is logically correct:
the sequent Γ ` φ is logically correct iff φ is a semantic consequence of Γ.
What we will be doing in this section is giving a new, proof-theoretic, criterion
for the logical correctness of sequents.
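Since a sequent contains only finitely many sentence letters, this semantic criterion can be checked by brute force: enumerate every assignment of truth values to the relevant sentence letters, and verify that no assignment makes everything in Γ true while making φ false. Here is a small Python sketch of that check; the tuple encoding of formulas and the function names are my own illustration, not anything official:

```python
from itertools import product

# Formulas are nested tuples: a sentence letter is a string; ('~', A) is ~A,
# ('&', A, B) is A AND B, ('v', A, B) is A OR B, ('->', A, B) is A -> B.

def letters(f):
    """The set of sentence letters occurring in formula f."""
    if isinstance(f, str):
        return {f}
    return set().union(*(letters(sub) for sub in f[1:]))

def value(f, v):
    """Truth value of f under assignment v (a dict from letters to bools)."""
    if isinstance(f, str):
        return v[f]
    op, *args = f
    if op == '~':
        return not value(args[0], v)
    a, b = (value(x, v) for x in args)
    return {'&': a and b, 'v': a or b, '->': (not a) or b}[op]

def semantic_consequence(gamma, phi):
    """True iff every assignment making all of gamma true makes phi true."""
    ls = sorted(set.union(letters(phi), *(letters(g) for g in gamma)))
    for vals in product([True, False], repeat=len(ls)):
        v = dict(zip(ls, vals))
        if all(value(g, v) for g in gamma) and not value(phi, v):
            return False
    return True

# The sequent P AND Q ` Q AND P is logically correct on this criterion:
assert semantic_consequence([('&', 'P', 'Q')], ('&', 'Q', 'P'))
# but P OR Q ` P is not:
assert not semantic_consequence([('v', 'P', 'Q')], 'P')
```

Of course, this only settles the finite case, and the number of assignments grows as 2 to the number of sentence letters; the point here is just that the semantic criterion is a mechanical one.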
2.4.2 Rules
The first step in developing our system is to write down sequent rules. A sequent
rule is a permission to move from certain sequents to certain other sequents. Our
goal is to construct rules with the following feature: if the “from” sequents are
all logically correct sequents, then any of the “to” sequents will be guaranteed
to be a logically correct sequent. Call such sequent rules “logical-correctness
preserving”.
Consider, as an example, the first rule of our system “∧ introduction”, or
“∧I” for short. We picture this sequent rule thus:
Γ ` φ      ∆ ` ψ
----------------- ∧I
   Γ, ∆ ` φ∧ψ
Above the line go the "from" sequents; below the line go the "to" sequents.
(The comma between Γ and ∆ in the “to” sequent simply means that the
premises of this sequent are all the members of Γ plus all the members of ∆.
Strictly speaking it would be more correct to write this in set-theoretic notation
as: Γ ∪ ∆ ` φ∧ψ.) Thus, ∧I permits us to move from the sequents Γ ` φ and
∆ ` ψ to the sequent Γ, ∆ ` φ∧ψ. For any sequent rule, we say that any of the
“to” sequents (Γ, ∆ ` φ∧ψ in this case) follows from the “from” sequents (in this
case Γ ` φ and ∆ ` ψ) via the rule.
It seems intuitively clear that ∧I preserves logical correctness. For if some
assumptions Γ logically imply φ, and some assumptions ∆ logically imply ψ,
then (since φ∧ψ intuitively follows from φ and ψ taken together) the conclusion
φ∧ψ should indeed logically follow from all the assumptions together, the ones
in Γ and the ones in ∆.
Our next sequent rule is ∧E:
    Γ ` φ∧ψ
----------------- ∧E
Γ ` φ      Γ ` ψ
This lets one move from the sequent Γ ` φ∧ψ to either the sequent Γ ` φ or
the sequent Γ ` ψ (or both). This, too, appears to preserve logical correctness.
If the members of Γ imply the conjunction φ∧ψ, then (since φ∧ψ intuitively
implies both φ and ψ individually) it must be that the members of Γ imply φ,
and they must also imply ψ.
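It can help to see these rules as purely mechanical operations on sequents. In the Python sketch below (my own encoding, not part of the official system: a sequent is a pair of a frozenset of premises and a conclusion, with tuples like ('&', A, B) standing for conjunctions), ∧I and ∧E become functions from "from" sequents to "to" sequents:

```python
# A sequent is modeled as a pair (premises, conclusion), where the
# premises form a frozenset; formulas are strings or nested tuples,
# with ('&', A, B) standing for a conjunction.

def conj_intro(seq1, seq2):
    """From the sequents for phi and psi, move to one for their conjunction,
    pooling the premise sets."""
    (gamma, phi), (delta, psi) = seq1, seq2
    return (gamma | delta, ('&', phi, psi))

def conj_elim(seq):
    """From a sequent with a conjunctive conclusion, move to the two
    sequents for its conjuncts, keeping the same premises."""
    gamma, conclusion = seq
    op, phi, psi = conclusion
    assert op == '&', "this rule applies only to conjunctive conclusions"
    return (gamma, phi), (gamma, psi)

# A mechanical derivation, starting from the assumption sequent for P AND Q:
pq = ('&', 'P', 'Q')
s1 = (frozenset({pq}), pq)     # the assumption sequent
s2, s3 = conj_elim(s1)         # sequents concluding P, and concluding Q
s4 = conj_intro(s3, s2)        # sequent concluding Q AND P
assert s4 == (frozenset({pq}), ('&', 'Q', 'P'))
```

Each function yields logically correct sequents whenever its inputs are logically correct, which is just the preservation property argued for above.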
The rule ∧I is known as an introduction rule for ∧, since it allows us to move
to a sequent whose major connective is the ∧. Likewise, the rule ∧E is known
as an elimination rule for ∧, since it allows us to move from a sequent whose
major connective is the ∧. In fact our sequent system contains introduction and
elimination rules for the other connectives as well: ∼, ∨, and → (let’s forget
the ↔ here). We'll present those rules in turn.
First ∨I and ∨E:
  Γ ` φ               Γ ` φ
--------- ∨I       --------- ∨I
Γ ` φ∨ψ           Γ ` ψ∨φ

Γ ` φ∨ψ      ∆1 , φ ` χ      ∆2 , ψ ` χ
---------------------------------------- ∨E
            Γ, ∆1 , ∆2 ` χ
Let’s think about what ∨E means. Remember the intuitive meaning of a sequent:
its conclusion is a logical consequence of its premises. Another (related) way to
think of it is that Γ ` φ means that one can establish that φ if one assumes the
members of Γ. So, if the sequent Γ ` φ∨ψ is logically correct, that means we’ve
got the disjunction φ∨ψ, assuming the formulas in Γ. Now, suppose we can
reason to a new formula χ , assuming φ, plus perhaps some other assumptions
∆1 . And suppose we can also reason to χ from ψ, plus perhaps some other
assumptions ∆2 . Then, since either φ or ψ (plus the assumptions in ∆1 and ∆2 )
leads to χ , and we know that φ∨ψ is true (conditional on the assumptions in Γ),
we ought to be able to infer χ itself, assuming the assumptions we needed along
the way (∆1 and ∆2 ), plus the assumptions we needed to get φ∨ψ, namely, Γ.
Next, we have double negation:
  Γ ` φ               Γ ` ∼∼φ
---------- DN        ---------- DN
Γ ` ∼∼φ               Γ ` φ
1. P ∧Q ` P ∧Q As
2. P ∧Q ` P 1, ∧E
3. P ∧Q ` Q 1, ∧E
4. P ∧Q ` Q∧P 2, 3 ∧I
Though it isn’t strictly required, we write a line number to the left of each
sequent in the series, and to the right of each line we write the sequent rule
that justifies it, together with the line or lines (if any) that contained the “from”
sequents required by the sequent rule in question. (The rule of assumptions
requires no “from” sequents, recall.)
(It’s important to distinguish what we’re now calling proofs, namely, sequent
proofs, from the kinds of informal arguments I gave in section 2.3, and will
give elsewhere in this book. Sequent proofs (and also the axiomatic proofs we
will introduce in section 2.5) are formalized object-language proofs. The sentences
in sequent proofs are sentences in the object language; they are wffs of PL.
Moreover, we gave a rigorous definition of what a sequent proof is. Further,
sequent proofs are restrictive in that only the system’s official rules may be
used. For contrast, consider the argument I gave in section 2.3 that any PL-
valuation assigns 1 to φ∧ψ iff it assigns 1 to φ and 1 to ψ. That argument was
an informal metalanguage proof. The sentences in the argument were sentences
of English, and the argument used informal (i.e., not formalized) techniques of
reasoning. “Informal” doesn’t imply lack of rigor. The argument was perfectly
rigorous: it conforms to the standards of good argumentation that generally
prevail in mathematics. We’re free to use any reasonable pattern of reasoning,
for example “universal proof” (to establish something of the form “everything
is thus-and-so”, we consider an arbitrary thing and show that it is thus-and-so).
We may “skip steps” if it’s clear how the argument is supposed to go. In short,
what we must do is convince a well-informed and mathematically sophisticated
reader that the result we’re after is indeed true.)
Next we introduce the notion of a "provable sequent". The idea is that
each sequent proof culminates in the proof of some sequent. Thus we offer the
following definition: a sequent is a provable sequent iff it is the last line of
some sequent proof. For example, the following proof shows that the sequent
P ∧Q ` Q∧P is provable:
1. P ∧Q ` P ∧Q As
2. P ∧Q ` P 1, ∧E
3. P ∧Q ` Q 1, ∧E
4. P ∧Q ` Q∧P 2, 3 ∧I
Notice the strategy. We first use the rule of assumptions to enter the premise
of the sequent we’re trying to prove: P ∧Q. We then use the rules of inference
to infer the consequent of that sequent: Q∧P . Since our initial assumption of
P ∧Q was dependent on the formula P ∧Q, our subsequent inferences remain
dependent on that same assumption, and so the final formula concluded, Q∧P ,
remains dependent on that assumption.
Let’s write our proofs out in a simpler way. Instead of writing out entire
sequents, let’s write out only their conclusions. We can indicate the premises
of the sequent using line numbers; the line numbers indicating the premises
of the sequent will go to the left of the number indicating the sequent itself.
Rewriting the previous proof in this way yields:
1 (1) P ∧Q As
1 (2) P 1, ∧E
1 (3) Q 1, ∧E
1 (4) Q∧P 2, 3 ∧I
Next let's prove the sequent P →Q, Q→R ` P →R, first with the sequents
written out in full:
1. P →Q ` P →Q As
2. Q→R ` Q→R As
3. P ` P As
4. P →Q, P ` Q 1,3 →E
5. P →Q, Q→R, P ` R 2,4 →E
6. P →Q, Q→R ` P →R 5,→I
And now rewritten in the simpler format:
1 (1) P →Q As
2 (2) Q→R As
3 (3) P As
1,3 (4) Q 1, 3 →E
1,2,3 (5) R 2, 4 →E
1,2 (6) P →R 5, →I
Let’s think about this example. We’re trying to establish P →R on the basis of
two formulas, P →Q and Q→R, so we start by assuming the latter two formulas.
Then, since the formula we’re trying to establish is a conditional, we assume
the antecedent of the conditional, in line 3. We then proceed, on that basis,
to reason our way to R, the consequent of the conditional we’re trying to
prove. (Notice how in lines 4 and 5, we add more line numbers on the very
left. Whenever we use →E, we increase dependencies: when we infer Q from
P and P →Q, our conclusion Q depends on all the formulas that P and P →Q
depended on, namely, the formulas on lines 1 and 3. Look back to the statement
of the rule →E: the conclusion ψ depends on all the formulas that φ and φ→ψ
depended on: Γ and ∆.) That brings us to line 5. At that point, we’ve shown
that R can be proven, on the basis of various assumptions, including P . The
rule →I (that is, the rule of conditional proof) then lets us conclude that the
conditional P →R follows merely on the basis of the other assumptions; that
rule, note, lets us in line 6 drop line 3 from the list of assumptions on which
P →R depends.
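The bookkeeping in the left-hand column is entirely mechanical, and can be mimicked with a few helper functions. In this Python sketch (my own encoding; the helper names are illustrative, not part of the system), a proof line is a pair of a dependency set and a formula, →E pools dependencies, and →I discharges one:

```python
# A proof line is (deps, formula): deps is the set of line numbers of the
# assumptions the formula depends on.

def assume(n, formula):
    """An assumption entered at line n depends only on itself."""
    return ({n}, formula)

def arrow_elim(cond_line, ante_line):
    """From a conditional and its antecedent, infer the consequent,
    pooling the two dependency sets."""
    deps1, (op, phi, psi) = cond_line
    deps2, ante = ante_line
    assert op == '->' and ante == phi, "need a conditional and its antecedent"
    return (deps1 | deps2, psi)

def arrow_intro(line, assumption_number, assumption_formula):
    """Discharge an assumption: conclude the conditional, dropping the
    assumption's line number from the dependencies."""
    deps, psi = line
    return (deps - {assumption_number}, ('->', assumption_formula, psi))

# Reproducing the proof of P->Q, Q->R ` P->R:
l1 = assume(1, ('->', 'P', 'Q'))   # 1     (1) P->Q   As
l2 = assume(2, ('->', 'Q', 'R'))   # 2     (2) Q->R   As
l3 = assume(3, 'P')                # 3     (3) P      As
l4 = arrow_elim(l1, l3)            # 1,3   (4) Q      1, 3 ->E
l5 = arrow_elim(l2, l4)            # 1,2,3 (5) R      2, 4 ->E
l6 = arrow_intro(l5, 3, 'P')       # 1,2   (6) P->R   5, ->I
assert l6 == ({1, 2}, ('->', 'P', 'R'))
```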
Next let’s establish an instance of DeMorgan’s Law, ∼(P ∨Q) ` ∼P ∧∼Q:
1 (1) ∼(P ∨Q) As
2 (2) P As (for reductio)
2 (3) P ∨Q 2, ∨I
1,2 (4) (P ∨Q)∧∼(P ∨Q) 1, 3 ∧I
1 (5) ∼P 4, RAA
6 (6) Q As (for reductio)
6 (7) P ∨Q 6, ∨I
1,6 (8) (P ∨Q)∧∼(P ∨Q) 1, 7 ∧I
1 (9) ∼Q 8, RAA
1 (10) ∼P ∧∼Q 5, 9 ∧I
Next let’s establish ∅ ` P ∨∼P :
1 (1) ∼(P ∨∼P ) As
2 (2) P As (for reductio)
2 (3) P ∨∼P 2, ∨I
2,1 (4) (P ∨∼P )∧∼(P ∨∼P ) 1, 3 ∧I
1 (5) ∼P 4, RAA
6 (6) ∼P As (for reductio)
6 (7) P ∨∼P 6, ∨I
6,1 (8) (P ∨∼P )∧∼(P ∨∼P ) 1, 7 ∧I
1 (9) ∼∼P 8, RAA
1 (10) ∼P ∧∼∼P 5, 9 ∧I
∅ (11) ∼∼(P ∨∼P ) 10, RAA
∅ (12) P ∨∼P 11, DN
Comment: my overall goal was to assume ∼(P ∨∼P ) and then derive a contradiction.
And my route to the contradiction was to separately establish ∼P
and ∼∼P (lines 5 and 9), conjoin them by ∧I (line 10), and then draw the final
conclusions by RAA and DN (lines 11 and 12).

Next, let's establish the sequent P ∨Q, ∼P ` Q:
1 (1) P ∨Q As
2 (2) ∼P As
3 (3) Q As (for use with ∨E)
4 (4) P As (for use with ∨E)
5 (5) ∼Q As (for reductio)
4,5 (6) ∼Q∧P 4,5 ∧I
4,5 (7) P 6, ∧E
2,4,5 (8) P ∧∼P 2,7 ∧I
2,4 (9) ∼∼Q 8, RAA
2,4 (10) Q 9, DN
1,2 (11) Q 1,3,10 ∨E
The basic idea of this proof is to use ∨E on line 1 to get Q. That calls, in
turn, for showing that each disjunct of line 1, P and Q, leads to Q. Showing
that Q leads to Q is easy; that was line 3. Showing that P leads to Q took lines
4-10; line 10 states the result of that reasoning, namely that Q follows from P
(as well as line 2). I began at line 4 by assuming P . Then my strategy was to
establish Q by reductio, so I assumed ∼Q in line 5. At this point, I basically
had my contradiction: at line 2 I had ∼P and at line 4 I had P . (You might
think I had another contradiction: Q at line 3 and ∼Q at line 5. But at the end
of the proof, I don’t want my conclusion to depend on line 3, whereas I don’t
mind it depending on line 2, since that’s one of the premises of the sequent
I’m trying to establish.) So I want to put P and ∼P together, to get P ∧∼P ,
and then conclude ∼∼Q by RAA. But there is a minor hitch. Look carefully at
how RAA is formulated. It says that if we have Γ,φ ` ψ∧∼ψ, we can conclude
Γ ` ∼φ. The first of these two sequents includes φ in its premises. That means
that in order to conclude ∼φ, the contradiction ψ∧∼ψ needs to depend on φ.
So in the present case, in order to finish the reductio argument and conclude
∼∼Q, the contradiction P ∧∼P needs to depend on the reductio assumption
∼Q (line 5). But if I had just used ∧I to put lines 2 and 4 together, the resulting
contradiction would have depended only on lines 2 and 4. To get around this, I used
a little trick. Whenever you have a sequent Γ ` φ, you can always add any
formula ψ you like to the premises on which φ depends, using the following
method:
i.     Γ ` φ           (begin with this)
i + 1. ψ ` ψ           As (ψ is any chosen formula)
i + 2. Γ, ψ ` φ∧ψ      i, i + 1 ∧I
i + 3. Γ, ψ ` φ        i + 2 ∧E
Lines 4, 6 and 7 in the proof employ this trick: initially, at line 4, P depends
only on 4, but then by line 7, P also depends on 5. That way, the move from 8
to 9 by RAA is justified.
Exercise: construct sequent proofs establishing each of the following
sequents:
a) P, Q, R ` P
b) P →(Q→R) ` (Q∧∼R)→∼P
2.5 Axiomatic proofs in PL

Let's now turn to a very different kind of proof system for propositional
logic: an axiomatic system. Here a proof is a finite sequence of formulas,
not sequents (there is no longer any
reasoning with assumptions), the last of which is the conclusion of the proof.
Each line in the proof must be justified in one of two ways: it may be inferred
by a rule of inference from earlier lines in the proof, or it may be an axiom.
An axiom is a certain kind of formula, a formula that one is allowed to enter
into a proof without any justification at all. Axioms are the “starting points” of
proofs, the foundation on which proofs rest. Since axioms are to play this role,
the axioms in a good axiomatic system ought to be indisputable logical truths.
For example, “P →P ” would be a good axiom—it’s obviously a logical truth.
(As it happens, we won’t choose this particular axiom; we’ll instead choose
other axioms from which this one may be proved.) Similarly, for each rule of
inference in a good axiomatic system, there should be no question but that the
premises of the rule logically imply its conclusion.
Actually we’ll employ a slightly more general notion of a proof: a proof
from a given set of wffs Γ. A proof from Γ will be allowed to contain members
of Γ, in addition to axioms and wffs that follow from earlier lines by a rule.
Think of the members of Γ as premises, which in the context of a proof from
Γ are temporarily treated as axioms, in that they are allowed to be entered
into the proof without any justification. The intuitive point of a proof from
Γ is to demonstrate its conclusion on the assumption that the members of Γ are
true, in contrast to a proof simpliciter (i.e. a proof in the sense of the previous
paragraph), whose point is to demonstrate its conclusion unconditionally. (Note
that we can regard a proof simpliciter as a proof from the empty set ∅.)
Formally, to apply the axiomatic method, we must choose i) a set of rules,
and ii) a set of axioms. An axiom is simply any chosen sentence (though as we
saw, in a good axiomatic system the axioms will be clear logical truths). A rule
is simply a permission to infer one sort of sentence from other sentences. For
example, the rule modus ponens can be stated thus: “From φ→ψ and φ you may
infer ψ”, and pictured as follows:
φ→ψ      φ
----------- MP
     ψ
(Modus ponens is the analog of the sequent rule →E.) Given any chosen axioms
and rules, we can define the following concepts:
Definition of axiomatic proof from a set: Where Γ is a set of wffs and φ is
a wff, an axiomatic proof from Γ is a finite sequence of wffs whose last line is
φ, in which each line either i) is an axiom, ii) is a member of Γ, or iii) follows
from earlier wffs in the sequence via a rule.
For our axiomatic system for PL, we choose just one rule, modus ponens,
and take as axioms all instances of the following three schemas:
φ → (ψ→φ) (A1)
(φ→(ψ→χ )) → ((φ→ψ)→(φ→χ )) (A2)
(∼ψ→∼φ) → ((∼ψ→φ)→ψ) (A3)
Thus, a PL-theorem is any formula that is the last line of a sequence of formulas,
each of which is either an A1, A2, or A3 axiom, or follows from earlier formulas
in the sequence by modus ponens. And a formula is PL-provable from some
set Γ if it is the last line of a sequence of formulas, each of which is either a
member of Γ, an A1, A2, or A3 axiom, or follows from earlier formulas in the
sequence by modus ponens.
The axiom “schemas” A1-A3 are not themselves axioms. They are, rather,
“recipes” for constructing axioms. Take A1, for example:
φ→(ψ→φ)
4. See Mendelson (1987, p. 29).
This string of symbols isn’t itself an axiom because it isn’t a wff; it isn’t a wff
because it contains Greek letters, which aren’t allowed in wffs (since they’re
not on the list of PL primitive vocabulary). φ and ψ are variables of our
metalanguage; you only get an axiom when you replace these variables with
wffs. P →(Q→P ), for example, is an axiom; it results from A1 by replacing φ
with P and ψ with Q. (Note: since you can put in any wff for these variables,
and there are infinitely many wffs, there are infinitely many axioms.)
A few points of clarification about how to construct axioms from schemas.
First point: you can stick in the same wff for two different Greek letters. Thus
you can let both φ and ψ in A1 be P , and construct the axiom P →(P →P ).
(But of course, you don’t have to stick in the same thing for φ as for ψ.) Sec-
ond point: you can stick in complex formulas for the Greek letters. Thus,
(P →Q)→(∼(R→S)→(P →Q)) is an axiom (I put in P →Q for φ and ∼(R→S)
for ψ in A1). Third point: within a single axiom, you cannot substitute different
wffs for a single Greek letter. For example, P →(Q→R) is not an axiom; you
can’t let the first φ in A1 be P and the second φ be R. Final point: even though
you can’t substitute different wffs for a single Greek letter within a single axiom,
you can let a Greek letter become one wff when making one axiom, and let
it become a different wff when making another axiom; and you can use each
of these axioms within a single axiomatic proof. For example, each of the
following is an instance of A1, and one could include both in a single axiomatic
proof:
P →(Q→P )
∼P →((Q→R)→∼P )
In the first case, I made φ be P and ψ be Q; in the second case I made φ be
∼P and ψ be Q→R. This is fine because I kept φ and ψ constant within each
axiom.
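These points can be made vivid by treating a schema as a recipe in code. In the sketch below (using a tuple encoding of wffs of my own devising, with ('->', A, B) for the conditional), one function builds an A1 instance from any two chosen wffs, and another checks the crucial constraint that both occurrences of the Greek letter φ are filled by the very same wff:

```python
def instance_A1(phi, psi):
    """Build an A1 instance: phi -> (psi -> phi). Any wffs may be
    substituted, including the same wff for both letters."""
    return ('->', phi, ('->', psi, phi))

def is_A1_instance(f):
    """Recognize wffs of the A1 shape; the two occurrences of the first
    Greek letter must be filled by the very same wff."""
    return (isinstance(f, tuple) and f[0] == '->'
            and isinstance(f[2], tuple) and f[2][0] == '->'
            and f[2][2] == f[1])

assert is_A1_instance(instance_A1('P', 'Q'))   # the axiom P->(Q->P)
assert is_A1_instance(instance_A1('P', 'P'))   # same wff for both letters
assert is_A1_instance(instance_A1(('->', 'P', 'Q'),
                                  ('~', ('->', 'R', 'S'))))  # complex wffs
assert not is_A1_instance(('->', 'P', ('->', 'Q', 'R')))     # P->(Q->R): not an axiom
```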
The definitions we have given in this section constitute another way of
making precise the proof-theoretic conception of the core logical notions, as
applied to propositional logic. A logical truth, on this conception, is a PL-
theorem; one formula is a logical consequence of others iff it is PL-provable
from them.
Let's get a feel for the system by constructing some axiomatic proofs,
beginning with a proof of the theorem (P →Q)→(P →P ):
1. P →(Q→P ) (A1)
2. (P →(Q→P ))→((P →Q)→(P →P )) (A2)
3. (P →Q)→(P →P ) 1,2 MP
The existence of this proof shows that (P →Q)→(P →P ) is a theorem.
Building on the previous proof, we can construct a proof of P →P from
{P →Q}:
1. P →(Q→P ) (A1)
2. (P →(Q→P ))→((P →Q)→(P →P )) (A2)
3. (P →Q)→(P →P ) 1,2 MP
4. P →Q member of {(P →Q)}
5. P →P 3, 4 MP
Thus, we have shown that {P →Q} ` P →P .
(When we’re talking about provability from a set, let’s adopt the convention
of writing “φ1 . . . φn ` ψ” instead of “{φ1 . . . φn } ` ψ”, and writing “Γ, φ1 . . . φn ”
instead of “Γ ∪ {φ1 . . . φn }”. That is, let’s drop the set-braces on the left hand
side of ` in these circumstances. In this new notation, what we showed in the
previous paragraph was: P →Q ` P →P .)
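The definition of an axiomatic proof from a set is precise enough to check mechanically. The Python sketch below is my own illustration (as a simplification, the axioms are passed in as an explicit list of instances rather than recognized by pattern-matching the schemas); it verifies the five-line proof of P→P from {P→Q} given above:

```python
def is_proof_from(lines, gamma, axioms):
    """Check that each line is an axiom, a member of gamma, or follows
    from two earlier lines by modus ponens."""
    for i, f in enumerate(lines):
        earlier = lines[:i]
        by_mp = any(('->', x, f) in earlier for x in earlier)
        if f not in axioms and f not in gamma and not by_mp:
            return False
    return True

# The five-line proof of P->P from {P->Q}:
pq, pp = ('->', 'P', 'Q'), ('->', 'P', 'P')
a1 = ('->', 'P', ('->', 'Q', 'P'))   # the A1 instance on line 1
a2 = ('->', a1, ('->', pq, pp))      # the A2 instance on line 2
proof = [a1, a2, ('->', pq, pp), pq, pp]
assert is_proof_from(proof, {pq}, [a1, a2])
# Dropping the premise P->Q breaks the proof:
assert not is_proof_from(proof, set(), [a1, a2])
```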
The next example is a little harder: (R→P )→(R→(Q→P ))
1. [R→(P →(Q→P ))]→[(R→P )→(R→(Q→P ))] A2
2. P →(Q→P ) A1
3. [P →(Q→P )]→[R→(P →(Q→P ))] A1
4. R→(P →(Q→P )) 2,3 MP
5. (R→P )→(R→(Q→P )) 1,4 MP
Here’s how I approached this problem. What I was trying to prove, namely
(R→P )→(R→(Q→P )), is a conditional whose antecedent and consequent both
begin: (R→. That looks like the consequent of A2. So I wrote out an instance
of A2 whose consequent was the formula I was trying to prove; that gave me
line 1 of the proof. Then I tried to figure out a way to get the antecedent of
line 1; namely, R→(P →(Q→P )). And that turned out to be pretty easy. The
consequent of this formula, P →(Q→P ) is an axiom (line 2 of the proof). And
if you can get a formula φ, then you choose anything you like—say, R,—and
then get R→φ, by using A1 and modus ponens; that’s what I did in lines 3 and
4.
Exercise 2.3 Establish each of the following facts. For these prob-
lems, do not use the “toolkit” from the following sections; i.e.,
construct the axiomatic proofs “from scratch”. However, you may
use a fact you prove in an earlier problem in later problems.
a) ` P →P
b) ` (∼P →P )→P
c) ∼∼P ` P
In fact, we’ll regularly want to make a move like that of lines 3 and 4 from
the preceding proofs—whenever we have φ on its own, and we want to move
to ψ→φ. Let’s call this move “adding an antecedent”; this is how it is done:
i.     φ              (begin with this)
i + 1. φ→(ψ→φ)        A1
i + 2. ψ→φ            i, i + 1 MP
In future proofs, instead of repeating such steps, let’s just move directly from φ
to ψ→φ, with the justification “adding an antecedent”.
The preceding proof was a bit tricky, and most proofs are trickier still.
Moreover, the proofs quickly get very long. Practically speaking, the best way
to make progress in an axiomatic system like this is by building up a toolkit.
The toolkit consists of theorems and techniques for doing bits of proofs which
are applicable in a wide range of situations. Then, when approaching a new
problem, one can look to see whether the problem can be reduced to a few
chunks, each of which can be accomplished by using the toolkit. Further, one
can cut down on writing by citing bits of the toolkit, rather than writing down
entire proofs.
So far, we have just one tool in our toolkit: “adding an antecedent”. Let’s
add another: the “MP technique”. Here’s what the technique will let us do.
Suppose we can separately prove φ→ψ and φ→(ψ→χ ). The MP technique
then shows us how to construct a proof of φ→χ . I call this the MP technique
because its effect is that you can do modus ponens “within the consequent of
the conditional φ→”. Here’s how the MP technique works:
1. φ→ψ                             (insert a proof of this here)
2. φ→(ψ→χ)                         (insert a proof of this here)
3. [φ→(ψ→χ)]→[(φ→ψ)→(φ→χ)]        A2
4. (φ→ψ)→(φ→χ)                    2, 3 MP
5. φ→χ                            1, 4 MP
Note that the lines in this “proof schema” are schemas (they contain Greek
letters), rather than wffs. It therefore isn’t a proof at all; rather, it becomes a
proof once you fill in wffs for the φ, ψ, and χ . We constructed a proof schema
because we want the MP technique to be applicable whenever we want to move
from formulas of the form φ→ψ and φ→(ψ→χ ), to a formula of the form
φ→χ , no matter what φ, ψ, and χ may be.
And while we’re on the topic of proof schemas, note also that whenever
one constructs a proof of a formula containing sentence letters, one could just
as well have constructed a similar proof schema. Corresponding to the proof
of (R→P )→(R→(Q→P )), for example, there is this proof schema:
1. [φ→(ψ→(χ→ψ))]→[(φ→ψ)→(φ→(χ→ψ))]     A2
2. ψ→(χ→ψ)                              A1
3. [ψ→(χ→ψ)]→[φ→(ψ→(χ→ψ))]             A1
4. φ→(ψ→(χ→ψ))                         2, 3 MP
5. (φ→ψ)→(φ→(χ→ψ))                     1, 4 MP
It’s usually more useful to think in terms of proof schemas, rather than proofs,
because they can go into our toolkit, if they have general applicability. The
proof schema we just constructed, for example, shows that anything of the
form (φ→ψ)→(φ→(χ →ψ)) is a theorem. As it happens, this is a fairly intuitive
theorem schema. Think of it as the principle of “weakening the consequent”.
χ →ψ is logically weaker than ψ, so if φ leads to ψ, φ must also lead to χ →ψ.
That sounds like a pattern that might well recur, so let’s put it into the toolkit,
under the label “weakening the consequent”. If we’re ever in the midst of a
proof and could really use a line of the form (φ→ψ)→(φ→(χ →ψ)), then we
can simply write that line down, and annotate on the right “weakening the
consequent”. Given the proof sketch above, we know that we could always
in principle insert a five-line proof of that line; to save writing we simply won't
bother. (Note that once we do this—omitting those five lines—the proofs we
are constructing will cease to be official proofs, since not every line will be
either an axiom or a line that follows from earlier lines by MP. They will be
instead proof sketches, which are in essence metalanguage arguments to the
effect that there exists some proof or other of the desired type. An ambitious
reader could always construct an official proof on the basis of the proof sketch,
by taking each of the bits, filling in the details using the toolkit, and assembling
the results into one proof.)
Next I want to add to our toolkit the principle of “strengthening the an-
tecedent”: [(φ→ψ)→χ ]→(ψ→χ ). The intuitive idea is that if φ→ψ leads to
χ , then ψ ought to lead to χ , since ψ is logically stronger than φ→ψ. This
proof will be harder still; we’ll need to break it into bits and use the toolkit to
complete it. Here’s a sketch of the overall proof:
a. [(φ→ψ)→χ]→[ψ→(φ→ψ)]                 (proved separately below)
b. [(φ→ψ)→χ]→[(ψ→(φ→ψ))→(ψ→χ)]        (proved separately below)
c. [(φ→ψ)→χ]→(ψ→χ)                    a, b, MP technique
All that remains is to supply separate proofs of lines a and b. Step a is pretty
easy. Its consequent, ψ→(φ→ψ) is an instance of A1, so we can prove it in one
line, then use “adding an antecedent” to get a.
Line b is a bit harder. It has the form: (α→β)→[(γ →α)→(γ →β)]. Call
this “adding antecedents” (and put it into the toolkit too), since it lets you add
the same antecedent (γ ) to both the antecedent and consequent of a condi-
tional (α→β). The following proof sketch for adding antecedents uses the MP
technique again!
1. (α→β)→[γ→(α→β)]                      A1
2. [γ→(α→β)]→[(γ→α)→(γ→β)]             A2
3. (α→β)→{[γ→(α→β)]→[(γ→α)→(γ→β)]}     2, adding an antecedent
4. (α→β)→[(γ→α)→(γ→β)]                 1, 3, MP technique
Given this theorem, we can always move from φ→ψ and ψ→χ to φ→χ thus:
i.     φ→ψ                      (begin with this)
i + 1. ψ→χ                      (and this)
i + 2. (ψ→χ)→[(φ→ψ)→(φ→χ)]     adding antecedents
i + 3. (φ→ψ)→(φ→χ)             i + 1, i + 2 MP
i + 4. φ→χ                     i, i + 3 MP

Call this move "transitivity", and add it to the toolkit as well. Next,
"swapping antecedents", [φ→(ψ→χ)]→[ψ→(φ→χ)]:
1. [φ→(ψ→χ )]→[(φ→ψ)→(φ→χ )] A2
2. [(φ→ψ)→(φ→χ )]→[ψ→(φ→χ )] strengthening the antecedent
3. [φ→(ψ→χ )]→[ψ→(φ→χ )] 1,2 transitivity
Next, "contraposition 1", (∼ψ→∼φ)→(φ→ψ):
1. (∼ψ→∼φ)→[(∼ψ→φ)→ψ] A3
2. [(∼ψ→φ)→ψ]→(φ→ψ) strengthening the antecedent
3. (∼ψ→∼φ)→(φ→ψ) 1, 2 transitivity
And "contraposition 2", the converse direction (φ→ψ)→(∼ψ→∼φ):
1. (φ→ψ)→[(ψ→∼∼ψ)→(φ→∼∼ψ)] transitivity
2. (ψ→∼∼ψ)→[(φ→ψ)→(φ→∼∼ψ)] 1, swapping
antecedents
3. ψ→∼∼ψ exercise 2.4c
4. (φ→ψ)→(φ→∼∼ψ) 2, 3 MP
5. ∼∼φ→φ exercise 2.4b
6. (∼∼φ→φ)→[(φ→∼∼ψ)→(∼∼φ→∼∼ψ)] transitivity
7. (φ→∼∼ψ)→(∼∼φ→∼∼ψ) 5, 6 MP
8. (∼∼φ→∼∼ψ)→(∼ψ→∼φ) contraposition 1
9. (φ→ψ)→(∼ψ→∼φ) 4, 7, 8 transitivity (2x)
Finally, one more entry for the toolkit, ∼φ→(φ→ψ):
1. ∼φ→(∼ψ→∼φ) A1
2. (∼ψ→∼φ)→(φ→ψ) contraposition 1
3. ∼φ→(φ→ψ) 1, 2 transitivity
Exercise 2.4 Establish each of the following. For these you may
use the toolkit.
2.6 Soundness of PL
In this chapter we have discussed two approaches to propositional logic: the
proof-theoretic approach and the semantic approach. In each case, we in-
troduced formal notions of logical truth and logical consequence. For the
semantic approach, these notions involved truth in PL-interpretations. For the
proof-theoretic approach, they involved provability in our axiomatic system.
Analogous soundness and completeness results could also be established for
the sequent system of section
2.4. The soundness proof, for instance, would proceed by proving by induction
that whenever sequent Γ ` φ is provable, φ is a semantic consequence of Γ.
The main thing would be to show that each rule of inference (As, RAA, ∧I, ∧E,
etc.) preserves semantic consequence. But note how much more involved this
proof would be, since there are so many rules of inference. The paucity of rules
in the axiomatic system made the construction of proofs within that system a
real pain in the neck, but now we see how it makes metalogical life easier.
Before we leave this section, let me summarize and clarify the nature of
proofs by induction. Induction is the method of proof to use whenever one
is trying to prove that each entity of a certain sort has a certain feature F ,
where each such entity is generated from certain “starting points” by a finite
number of successive “operations”. To do this, one establishes two things: a)
that the starting points have feature F , and b) that the operations preserve
feature F —i.e., that if the inputs to the operations have feature F then the
output also has feature F .
In logic, it is important to distinguish two different cases where proofs by
induction are needed. One case is where one is establishing a fact of the form:
every theorem has a certain feature F . (The proof of soundness is an example of
this case.) Here’s why induction is applicable: a theorem is defined as the last
line of a proof. So the fact to be established is that every line in every proof has
feature F . Now, a proof is defined as a finite sequence, where each member is
either an axiom or follows from earlier lines by the rule modus ponens. The
axioms are the “starting points” and modus ponens is the “operation”. So if
we want to show that every line in every proof has feature F , all we need to
do is show that a) the axioms all have feature F , and b) show that if you start
with formulas that have feature F , and you apply modus ponens, then what you
get is something with feature F . More carefully, b) means: if φ has feature F ,
and φ→ψ has feature F , then ψ has feature F . Once a) and b) are established,
one can conclude by induction that all lines in all proofs have feature F . When
one gives this first sort of inductive argument, for the conclusion that every
theorem φ has a certain feature, it is sometimes called “induction on the proof
of φ” or “induction on the length of φ’s proof”.
A second case in which induction may be used is when one is trying to
establish a fact of the form: every formula has a certain feature F . (The proof
that every wff has a finite number of sentence letters is an example of this
case.) Here’s why induction is applicable: all formulas are built out of sentence
letters (the “starting points”) by successive applications of the rules of formation
(“operations”) (the rules of formation, recall, say that if φ and ψ are formulas,
then so are (φ→ψ) and ∼φ.) So, to show that all formulas have feature F ,
we must merely show that a) all the sentence letters have feature F , and b)
show that if φ and ψ both have feature F , then both (φ→ψ) and ∼φ also
will have feature F . When one gives this second sort of inductive argument,
for the conclusion that every formula φ has a certain feature, it is sometimes
called “induction on the construction of φ”, or “induction on the number of
connectives in φ”.
Inductions in logic can take yet other forms, but these two are particularly
common.
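Induction on the construction of a formula has a mirror image in programming: recursive definitions over wffs, with one clause per rule of formation. A small sketch (using a tuple encoding of wffs of my own devising):

```python
def num_connectives(f):
    """Count the connectives in a wff by recursion on its construction.
    Encoding: sentence letters are strings; ('~', A) is the negation of A;
    ('->', A, B) is the conditional from A to B."""
    if isinstance(f, str):                 # base case: a sentence letter
        return 0
    if f[0] == '~':                        # clause for negations
        return 1 + num_connectives(f[1])
    # clause for conditionals
    return 1 + num_connectives(f[1]) + num_connectives(f[2])

assert num_connectives('P') == 0
assert num_connectives(('~', ('->', 'P', 'Q'))) == 2
```

The claim that num_connectives returns a finite number for every wff is itself proved by induction on construction: the claim holds for the base case, and is preserved by the clauses corresponding to the rules of formation.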
If you’re ever proving something by induction, it’s important to identify
what sort of inductive proof you’re constructing. What are the entities you’re
dealing with? What is the feature F ? What are the starting points, and what
are the operations generating new entities from the starting points? If you’re
trying to construct an inductive proof and get stuck, you should return to these
questions and make sure you’re clear about their answers.
φ→φ
(φ→ψ)→(ψ→φ)
Exercise 2.7 Prove the following form of soundness: for any set
of formulas, Γ , and any formula φ, if Γ ` φ then Γ φ (i.e., if φ is
provable from Γ then φ is a semantic consequence of Γ.)
2.7 Completeness of PL
In this section we’ll prove completeness for propositional logic. It will be a bit
more difficult than the preceding sections, and may be skipped without much
loss. If you decide to work through the more difficult sections dealing with
metalogic later in the book (for example sections 6.5 and 6.6), you might first
return to this section.
Before we prove completeness, we’ll need to prove a helpful theorem (which
is interesting in its own right) and a lemma.
As you learned in section 2.5.1 (perhaps to your dismay), constructing
axiomatic proofs is much harder than constructing sequent proofs. It’s hard to
prove things when you’re not allowed to use conditional proof! Nevertheless,
one can prove a metalogical theorem about our axiomatic system that is closely
related to conditional proof:
Deduction theorem: If Γ ∪ {φ} ` ψ, then Γ ` φ→ψ
That is: whenever there exists a proof from (Γ and) φ to ψ, then there also
exists a proof of φ→ψ (from Γ).
Suppose we want to prove φ→ψ. Our axiomatic system does not allow
us to assume φ in a conditional proof of φ→ψ. But once we’ve proved the
deduction theorem, we’ll be able to do the next best thing. Suppose we write
down a proof of ψ from {φ}. That is, we write down a proof in which each line
is either i) a member of {φ} (that is, φ itself), or ii) an axiom, or iii) follows
from earlier lines in the proof by modus ponens. The deduction theorem then
lets us conclude that some proof of φ→ψ exists. We won’t have constructed such
a proof ourselves; we only constructed the proof from φ to ψ. Nevertheless
the deduction theorem assures us that it exists. More generally, whenever we
can construct a proof of ψ from φ plus some other premises (the formulas in
some set Γ), then the deduction theorem assures us that some proof of φ→ψ
from those other premises also exists.
Proof of the deduction theorem. Suppose Γ, φ ` ψ. That is, there is some proof,
call it “Proof A”, of ψ from Γ ∪ {φ}. Such a proof looks like this:
1. α1
2. α2
.
.
n. ψ
where each αi is either a member of Γ ∪ {φ}, an axiom, or follows from earlier
lines in the proof by MP. Our strategy will be to establish that:
(*) For each line αi of proof A, Γ ` φ→αi
We already know that each line of proof A is provable from Γ ∪ {φ}; what (*) says
is that if you stick “φ→” in front of any of those lines, the result is provable
from Γ all by itself. Once we succeed in establishing (*) then we will have
proved the deduction theorem. For the last line of proof A is ψ; (*) then tells
us that φ→ψ is provable from Γ.
(*) says that each line of proof A has a certain feature, namely, the feature of:
being provable from Γ when prefixed with “φ→”. Just as in the proof of soundness,
this calls for the method of proof by induction, and in particular, induction on
the construction of proof A (the proof of ψ). Here goes.
What we’re going to do is show that whenever a line is added to proof A,
then it has the feature—provided, that is, that all earlier lines in the proof have
the feature. There are three cases in which a line αi could have been added
to proof A. The first case is where αi is an axiom. We must show that αi has
the feature—that is, show that Γ ` φ→αi . Well, we can prove φ→αi from Γ as
follows:
1. αi axiom
2. φ→αi adding an antecedent
This is not an official proof, of course; it’s a proof sketch. And note that we
didn’t need to use any members of Γ in the proof. That’s OK; if you look back
at the definition of a proof from a set, you’ll see that this counts officially as a
proof from Γ.
The second case in which a line αi could have been added to proof A is
where αi is a member of Γ ∪ {φ}. This subdivides into two subcases. The first
is where αi is φ itself. Here, φ→αi is φ→φ, which is shown in exercise 2.3a to
be a theorem, i.e., provable from no premises at all. So it is obviously provable
from Γ. The second subcase is where αi ∈ Γ. But here we can prove φ→αi
from Γ as follows:
1. αi member of Γ
2. φ→αi adding an antecedent
The first two cases were “base” cases of our inductive proof, because we
didn’t need to assume anything about earlier lines in proof A. The third case in
which a line αi could have been added to proof A leads us to the inductive part
of our proof: the case in which αi follows from two earlier lines of the proof
by MP. Here we simply assume that those earlier lines of the proof have the
feature we’re interested in (this assumption is the “inductive hypothesis”; the
feature, recall, is: being provable from Γ when prefixed with “φ→”) and we show
that αi has the feature as well.
So: we’re considering the case where αi follows from earlier lines in the
proof by modus ponens. That means that the earlier lines have to have the
forms χ →αi and χ . Furthermore, the inductive hypothesis tells us that the
result of prefixing either of these earlier lines with “φ→” is provable from Γ.
That is, we know that some proof from Γ culminates in φ→(χ →αi ), and some
other proof from Γ culminates in φ→χ . We can then string these two proofs
together into a new proof, and then continue that new proof as follows:
⋮
k. φ→(χ→αi )
⋮
l. φ→χ
l+1. φ→αi    k, l (via axiom A2 and MP, twice)
This is a proof of φ→αi from Γ.
Thus, in all three cases, whenever αi was added to proof A, there always
existed some proof of φ→αi from Γ. By induction, (*) is established; and this
in turn completes the proof of the deduction theorem.
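Viewed procedurally, the induction just given is an algorithm: fed a proof of ψ from Γ ∪ {φ} (with each line tagged by its justification), it outputs lines establishing φ→αi for every line αi, ending with φ→ψ. Here is a minimal Python sketch of that transformation; the tuple encoding, the tag names, and the labels 'A1', 'A2', 'thm' are mine, and the axiom instances are recorded but not checked:

```python
# Formulas as nested tuples: a sentence letter is a string,
# ('->', a, b) is the conditional a -> b.
def imp(a, b):
    return ('->', a, b)

def transform(proof, phi):
    """proof: list of (formula, tag) pairs, where tag is 'axiom',
    'premise' (a member of Gamma), 'hyp' (the formula phi itself),
    or ('MP', j, k) citing earlier line indices (j the conditional,
    k its antecedent). Returns a new list of (formula, tag) lines
    whose last formula is phi -> psi, built exactly as in the three
    cases of the induction."""
    out = []
    where = {}                      # bookkeeping: old index -> new index
    for i, (alpha, tag) in enumerate(proof):
        target = imp(phi, alpha)
        if tag == 'hyp':            # case: alpha is phi itself
            out.append((imp(phi, phi), 'thm'))      # exercise 2.3a
        elif tag in ('axiom', 'premise'):
            out.append((alpha, tag))
            out.append((imp(alpha, target), 'A1'))  # adding an antecedent
            out.append((target, 'MP'))
        else:                       # alpha came by MP from chi->alpha, chi
            _, j, k = tag
            chi = proof[k][0]
            out.append((imp(imp(phi, imp(chi, alpha)),
                            imp(imp(phi, chi), target)), 'A2'))
            out.append((imp(imp(phi, chi), target), 'MP'))
            out.append((target, 'MP'))
        where[i] = len(out) - 1
    return out

# A two-line proof of Q from {P -> Q} plus the hypothesis P ...
proof = [('P', 'hyp'), (('->', 'P', 'Q'), 'premise'), ('Q', ('MP', 1, 0))]
# ... becomes a proof from {P -> Q} alone whose last line is P -> Q.
print(transform(proof, 'P')[-1][0])   # ('->', 'P', 'Q')
```

This mirrors the proof exactly: the 'hyp' branch is the φ→φ subcase, the 'axiom'/'premise' branch is "adding an antecedent," and the MP branch uses the A2 instance displayed in the inductive step.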
Next we’ll prove the following lemma:
Lemma: Let I be any PL-interpretation, let φ be any wff, let s1 . . . sn be the
sentence letters in φ, and where ψ is any formula, define ψ′ as being ψ itself if
ψ is true in I , and as being ∼ψ if ψ is false in I . Then s1′ . . . sn′ ` φ′
A very rough way of thinking about what the Lemma says is this: the truth value of a
formula is provably settled by the truth values of its sentence letters (“provably”
in the sense of provability in our axiomatic system).
Proof of Lemma. Let I , φ, and s1 . . . sn be as described in Lemma. We must
show that φ′ is provable from {s1′ . . . sn′}. We'll show this by induction. With an
eye toward setting up the inductive proof correctly, think of Lemma as saying:
every formula has the feature being a formula whose primed version is provable from
the primed versions of its sentence letters. This makes it clear that the assertion
we’re trying to prove has the form “every formula has a certain feature”, and
thus calls for proof by induction on the formula’s construction (rather than
proof by induction on a proof, as in the previous two inductive proofs). So,
we’ll need to show that all sentence letters have the feature (base case), and
then show that if α and β have the feature (inductive hypothesis), then both
∼α and α→β must have the feature as well.
Base case: suppose φ is a sentence letter. Then there is just one si , which is
φ itself. So what we need to show is: φ′ ` φ′. But that's trivial; we can give a
one-line proof of φ′ from {φ′}:

1. φ′    member of {φ′}
Now assume the inductive hypothesis: that both α and β have the feature.
That is, where s1 . . . sn are the sentence letters in α, and t1 . . . tm are the sentence
letters in β, we are assuming:

(a) s1′ . . . sn′ ` α′
(b) t1′ . . . tm′ ` β′
First we must show that ∼α has the feature. Since ∼α has the same sentence
letters as α (namely, s1 . . . sn ), this means showing that s1′ . . . sn′ ` (∼α)′. Now,
(∼α)′ is either ∼α or ∼∼α depending on whether ∼α is true or false (in I ; I'll
suppress this from now on). We'll consider these cases separately.
· In the former case, α is false; and so α′ is ∼α. Then (a) tells us that
s1′ . . . sn′ ` ∼α. Since (∼α)′ is ∼α in this case, we've already shown what
we wanted: s1′ . . . sn′ ` (∼α)′.
· In the latter case, α is true, and so α′ is just α. So what (a) tells us in this
case is: s1′ . . . sn′ ` α. Furthermore, since in this case (∼α)′ is ∼∼α, what
we're trying to establish is s1′ . . . sn′ ` ∼∼α. So, to construct a proof of
∼∼α from {s1′ . . . sn′}, begin with a proof of α from {s1′ . . . sn′} (we know
that such a proof exists from what (a) told us). Then insert a proof of
α→∼∼α (exercise 2.4c). Finish the proof by concluding ∼∼α by MP.
Next we must show that α→β has the feature. The sentence letters in
α→β are s1 . . . sn , t1 . . . tm , so what we must show is: s1′ . . . sn′, t1′ . . . tm′ ` (α→β)′.
We'll consider three separate cases:
· First case: β is true. Then α→β is also true, and so (α→β)′ is just α→β.
We may then construct the desired proof of α→β from {s1′ . . . sn′, t1′ . . . tm′}
as follows. Here β′ is just β, so (b) tells us that t1′ . . . tm′ ` β. So we may
begin our desired proof with a proof of β from {t1′ . . . tm′} (ipso facto this
is a proof from {s1′ . . . sn′, t1′ . . . tm′}). We then use the technique of “adding
an antecedent” to move to α→β.
In the remaining case, α is true and β is false, so that α→β is false and
(α→β)′ is ∼(α→β). We begin with a proof of α and a proof of ∼β (each
provable from the primed sentence letters, by (a) and (b)), and continue:

⋮
i. α
⋮
j. ∼β
j+1. [∼∼(α→β)→∼β]→[(∼∼(α→β)→β)→∼(α→β)]    A3
j+2. ∼∼(α→β)→∼β    j, adding an antecedent
j+3. (∼∼(α→β)→β)→∼(α→β)    j+1, j+2, MP
j+4. α→[∼∼(α→β)→β]    see below
j+5. ∼∼(α→β)→β    i, j+4, MP
j+6. ∼(α→β)    j+3, j+5, MP
As for step j +4, consider the following proof of β from {α, ∼∼(α→β)}:
Completeness: if ⊨ φ then ` φ
proof of sn →φ, continuing with a proof of ∼sn →φ, and then continuing as
follows:
⋮
i. sn→φ
⋮
j. ∼sn→φ
j+1. ∼φ→∼sn    i, contraposition 2, MP
j+2. ∼φ→∼∼sn    j, contraposition 2, MP
j+3. (∼φ→∼∼sn )→[(∼φ→∼sn )→φ]    A3
j+4. φ    j+1, j+2, j+3, MP (×2)
Stage 2c: Following the strategy of 1c, show on the basis of stages 2a and
2b that s1′ . . . sn−2′ ` φ.
Stage 2 removed one more of the si′s on the left of the `. Stage 3 will remove
another one, and so we will have: s1′ . . . sn−3′ ` φ. Each stage removes another
one; and so after the last stage, stage nc, they will all be gone and we will have
shown that ` φ.
Chapter 3
Variations and Deviations from PL
As promised, we will not stop with the standard logics familiar from in-
troductory textbooks. In this chapter we examine some philosophically
important variations and deviations from standard propositional logic.
f (1) = 0
f (0) = 1
This is called a one-place function because it takes only one truth value as input.
In fact, we have a name for this truth function: negation. And we express that
truth function with our symbol ∼. So: negation is a truth function that we can
express in propositional logic.
CHAPTER 3. VARIATIONS AND DEVIATIONS FROM PL 64
g (1, 1) = 1
g (1, 0) = 0
g (0, 1) = 0
g (0, 0) = 0
Conjunction is a two-place truth function, which means that it takes two truth
values as inputs. We have a symbol for this truth function as well: ∧.
Here’s another truth function:
i(1, 1) = 0
i(1, 0) = 1
i(0, 1) = 1
i(0, 0) = 1
Think of this truth function as “not both”. Unlike the negation and conjunction
truth functions, we don’t have a single symbol for this truth function. Never-
theless, it too can be expressed in propositional logic. If we want to express
“not-both (P , Q)”, we can just write:
∼(P ∧Q)
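We can check mechanically that ∼(P∧Q) expresses the “not both” function i. A small sketch (the Python names neg, conj, and not_both are mine, not the text's):

```python
# Truth values as 1/0; each truth function is an ordinary function.
neg = lambda p: 1 - p
conj = lambda p, q: p * q

def not_both(p, q):
    """The truth value of ~(P ^ Q): the function i from the text."""
    return neg(conj(p, q))

# Check against the table for i given above:
for (p, q), want in {(1, 1): 0, (1, 0): 1, (0, 1): 1, (0, 0): 1}.items():
    assert not_both(p, q) == want
```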
In fact, it’s not hard to show that any truth function (of any finite number
of places) can be expressed in propositional logic using just the ∧, ∨, and ∼.
Proof. The proof will be informal; but before giving it, we need a precise
definition of what it means to say that a truth function “can be expressed” in
propositional logic.
Definition of expressability: n-place truth function h can be expressed in
propositional logic iff there is some sentence of propositional logic, φ, con-
taining n sentence letters, P1 . . . Pn , which has the following feature: whenever
P1 . . . Pn have the truth values t1 . . . tn , respectively, then the whole sentence φ
has the truth value h(t1 . . . tn )
Now for the proof. I’ll begin by illustrating the idea with an example. Suppose
we want to express the following three-place truth-function:
f (1, 1, 1) = 0
f (1, 1, 0) = 1
f (1, 0, 1) = 0
f (1, 0, 0) = 1
f (0, 1, 1) = 0
f (0, 1, 0) = 0
f (0, 0, 1) = 1
f (0, 0, 0) = 0
We must construct a sentence with three sentence letters, P1 , P2 , and P3 , whose
truth table “matches” function f . Now, if we ignore everything but the numbers
in the above picture of function f , we can think of it as a kind of truth table
for the sentence we’re after. The first column of numbers represents the truth
values of P1 , the second column, the truth values of P2 , and the third column,
the truth values of P3 ; and the far right column represents the truth values that
the desired formula should have. Each row represents a possible combination
of truth values for these sentence letters. Thus, the second row (“ f (1, 1, 0) = 1”)
is the combination where P1 is 1, P2 is 1, and P3 is 0; the fact that the fourth
column in this row is 1 indicates that the desired formula should be true here.
Since function f returns the value 1 in just three cases (rows two, four, and
seven), the sentence we’re after should be true in exactly those three cases: (a)
when P1 , P2 , P3 take on the three truth values in the second row (i.e., 1, 1, 0);
(b) when P1 , P2 , P3 take on the three truth values in the fourth row (1, 0, 0); and
(c) when P1 , P2 , P3 take on the three truth values in the seventh row (0, 0, 1) .
Now, we can construct a sentence that is true in case (a) and false otherwise:
P1 ∧P2 ∧∼P3 . We can also construct a sentence that’s true in case (b) and false
otherwise: P1 ∧∼P2 ∧∼P3 . And we can also construct a sentence that’s true in
case (c) and false otherwise: ∼P1 ∧∼P2 ∧P3 . But then we can simply disjoin these
three sentences to get the sentence we want:
(P1 ∧P2 ∧∼P3 ) ∨ (P1 ∧∼P2 ∧∼P3 ) ∨ (∼P1 ∧∼P2 ∧P3 )
(Strictly speaking the three-way conjunctions, and the three-way disjunction,
need parentheses added, but since it doesn’t matter where they’re added—
conjunction and disjunction are associative—I’ve left them off.)
This strategy is in fact purely general. Any n-place truth function, f , can
be represented by a chart like the one above. Each row in the chart consists of
a certain combination of n truth values, followed by the truth value returned
by f for those n inputs. For each such row, construct a conjunction whose
i th conjunct is Pi if the i th truth value in the row is 1, and ∼Pi if the i th truth
value in the row is 0. Notice that the conjunction just constructed is true if and
only if its sentence letters have the truth values corresponding to the row in
question. The desired formula is then simply the disjunction of all and only
the conjunctions for rows where the function f returns the value 1.¹ Since the
conjunction for a given row is true iff its sentence letters have the truth values
corresponding to the row in question, the resulting disjunction is true iff its
sentence letters have truth values corresponding to one of the rows where f
returns the value true, which is what we want.
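The construction in this proof is entirely mechanical, so it can be sketched as a short program. This is an illustration, not part of the text; it uses ~, ^, and v as ASCII stand-ins for ∼, ∧, and ∨, and the helper name dnf_sentence is mine:

```python
from itertools import product

def dnf_sentence(f, n):
    """Given an n-place truth function f (taking n arguments in {0,1}),
    build a sentence in P1...Pn using only ~, ^, v, by the method in
    the proof: one conjunction per row where f returns 1, disjoined."""
    disjuncts = []
    for row in product((1, 0), repeat=n):
        if f(*row) == 1:
            # i-th conjunct is Pi if the i-th input is 1, else ~Pi.
            conj = '^'.join(f'P{i+1}' if v == 1 else f'~P{i+1}'
                            for i, v in enumerate(row))
            disjuncts.append('(' + conj + ')')
    if not disjuncts:          # the special case noted in footnote 1
        return 'P1^~P1'
    return ' v '.join(disjuncts)

# The three-place example from the text:
f = lambda a, b, c: {(1,1,1): 0, (1,1,0): 1, (1,0,1): 0, (1,0,0): 1,
                     (0,1,1): 0, (0,1,0): 0, (0,0,1): 1, (0,0,0): 0}[(a, b, c)]
print(dnf_sentence(f, 3))
# (P1^P2^~P3) v (P1^~P2^~P3) v (~P1^~P2^P3)
```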
Say that a set of connectives is adequate iff every truth function can be
expressed by some sentence containing only those connectives. What we just
showed was that the set {∧, ∨, ∼} is adequate. We can then use this fact to
prove that other sets of connectives are adequate. For example, it is easy to
prove that φ∨ψ has the same truth table as (is true relative to exactly the same
PL-interpretations as) ∼(∼φ∧∼ψ). But that means that for any sentence χ
whose only connectives are ∧, ∨, and ∼, we can construct another sentence χ 0
with the same truth table but whose only connectives are ∧ and ∼: simply begin
with χ and use the equivalence between φ∨ψ and ∼(∼φ∧∼ψ) to eliminate
all occurrences of ∨ in favor of occurrences of ∧ and ∼. But now consider
any truth function f . Since {∧, ∨, ∼} is adequate, f can be expressed by some
sentence χ ; but χ has the same truth table as some sentence χ 0 whose only
connectives are ∧ , and ∼; hence f can be expressed by χ 0 as well. So {∧, ∼} is
adequate.
Similar arguments can be given to show that other connective sets are
adequate as well. For example, the ∧ can be eliminated in favor of the → and
the ∼ (since φ∧ψ has the same truth table as ∼(φ→∼ψ)); therefore, since
{∧, ∼} is adequate, {→, ∼} is also adequate.
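Both equivalences appealed to here can be confirmed by brute force over the four combinations of truth values; a quick sketch (the function names are mine):

```python
from itertools import product

neg  = lambda a: 1 - a
conj = lambda a, b: a * b
disj = lambda a, b: max(a, b)
cond = lambda a, b: max(1 - a, b)   # material conditional

for p, q in product((1, 0), repeat=2):
    # phi v psi has the same table as ~(~phi ^ ~psi)
    assert disj(p, q) == neg(conj(neg(p), neg(q)))
    # phi ^ psi has the same table as ~(phi -> ~psi)
    assert conj(p, q) == neg(cond(p, neg(q)))
```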
¹Special case: if there are no such rows—i.e., if the function returns 0 for all inputs—
then let the formula be simply any logically false formula containing P1 . . . Pn , for example
P1∧∼P1∧P2∧P3∧ · · · ∧Pn .
(+) For any sentence, φ, containing just sentence letter P and the connectives
∧ and →, φ is true in any PL-interpretation in which P is true
We’ll again use the method of induction. We want to show that (+) holds for all
sentences. So we first prove that (+) is true for all sentences with no connectives
(i.e., for sentences that are just sentence letters). This is the base case, and
is very easy here, since if φ has no connectives, then obviously φ is just the
sentence letter P itself, in which case, clearly, φ is true in any PL-interpretation
in which P is true. Next we assume the inductive hypothesis:
(ih) (+) is true for sentences φ and ψ. (That is, in any interpretation in which
P is true, both φ and ψ are true.)
And we try to show, on the basis of this assumption, that (+) is true for φ∧ψ
and for φ→ψ. This is easy to do. First we show that (+) is true for φ∧ψ—
that is, φ∧ψ is true in any interpretation in which P is true. But we know
by the inductive hypothesis that φ and ψ are individually true in any such
interpretation. But then, we know from the truth table for ∧ that φ∧ψ is also
true in any such interpretation. The reasoning is exactly parallel for φ→ψ: the
inductive hypothesis tells us that whenever P is true, so are φ and ψ, and then
we know that in this case φ→ψ must also then be true, by the truth table for
→. Therefore, by induction, the result is proved.
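The claim (+) can be spot-checked by enumerating every sentence built from P with ∧ and → up to a small depth and evaluating it with P true. A sketch, with my own encoding of sentences as nested tuples:

```python
def sentences(depth):
    """All sentences built from the letter P with ^ and -> whose
    nesting of connectives is at most `depth` deep."""
    if depth == 0:
        return ['P']
    smaller = sentences(depth - 1)
    out = list(smaller)
    for a in smaller:
        for b in smaller:
            out.append(('^', a, b))
            out.append(('->', a, b))
    return out

def value(s, p):
    """Classical truth value of sentence s when P has value p."""
    if s == 'P':
        return p
    op, a, b = s
    va, vb = value(a, p), value(b, p)
    return va * vb if op == '^' else max(1 - va, vb)

# (+): every such sentence is true when P is true.
assert all(value(s, 1) == 1 for s in sentences(2))
```

This is only a check for small sentences, of course; the induction in the text is what covers all of them.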
% 1 0
1 0 1
0 1 0
Can all the truth functions be expressed using just the %? Justify
your answer.
| 1 0
1 0 1
0 1 1
Now here’s an exciting thing about |: it’s an adequate connective all on its own.
You can express all the truth functions using just |!
Here’s how we can prove this. We showed above that {→, ∼} is adequate;
so all we need to do is show how to define the → and the ∼ using just the |.
Defining ∼ is easy; φ|φ has the same truth table as ∼φ. As for φ→ψ, think of
it this way. φ→ψ is equivalent to ∼(φ∧∼ψ), i.e., φ|∼ψ. But given the method
just given for defining ∼ in terms of |, we know that ∼ψ is equivalent to ψ|ψ.
Thus, φ→ψ has the same truth table as: φ|(ψ|ψ).
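Both definitions can be checked exhaustively; a sketch (nand, neg, and cond are my names for the |, ∼, and → truth functions):

```python
from itertools import product

nand = lambda a, b: 1 - a * b       # the Sheffer stroke |
neg  = lambda a: 1 - a
cond = lambda a, b: max(1 - a, b)   # material conditional

for p in (1, 0):
    assert nand(p, p) == neg(p)               # phi|phi is ~phi
for p, q in product((1, 0), repeat=2):
    assert nand(p, nand(q, q)) == cond(p, q)  # phi|(psi|psi) is phi->psi
```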
Exercise 3.2 For each of the following two truth functions, f and
g , first find a sentence that expresses it in standard propositional
logic (i.e., with ∼, ∧, ∨, ↔, →); then find a sentence that expresses
it using just the Sheffer stroke:
f (1, 1) = 1 g (1, 1, 1) = 1
f (1, 0) = 0 g (1, 1, 0) = 0
f (0, 1) = 0 g (1, 0, 1) = 1
f (0, 0) = 1 g (1, 0, 0) = 1
g (0, 1, 1) = 1
g (0, 1, 0) = 1
g (0, 0, 1) = 0
g (0, 0, 0) = 1
Exercise 3.3 Show that all truth functions can be defined using
just ↓ (nor). The truth table for ↓ is the following:
↓ 1 0
1 0 0
0 0 1
∼φ    ∧φψ    ∨φψ    →φψ    ↔φψ
What’s the point? This notation eliminates the need for parentheses. With the
usual notation, in which we put the connectives between the sentences they
connect, we need parentheses to distinguish, e.g.:
(P ∧Q)→R
P ∧(Q→R)
But with Polish notation, these are distinguished without parentheses; they
become:
→∧PQR
∧P→QR
respectively.
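Because each connective announces its own arity, Polish notation can be parsed by a few lines of recursion. A sketch that converts Polish formulas to the usual infix form, using ~, ^, v, >, = as ASCII stand-ins for ∼, ∧, ∨, →, ↔ (the function name is mine):

```python
def to_infix(s):
    """Parse a Polish-notation formula (one-character sentence
    letters; connectives ~, ^, v, >, =) and return the fully
    parenthesized infix form plus the unconsumed remainder."""
    head, rest = s[0], s[1:]
    if head == '~':                       # one-place: consume one formula
        sub, rest = to_infix(rest)
        return '~' + sub, rest
    if head in '^v>=':                    # two-place: consume two formulas
        left, rest = to_infix(rest)
        right, rest = to_infix(rest)
        return f'({left}{head}{right})', rest
    return head, rest                     # a sentence letter

print(to_infix('>^PQR')[0])   # ((P^Q)>R)
print(to_infix('^P>QR')[0])   # (P^(Q>R))
```

No parentheses appear in the input, yet the two formulas come out unambiguously different, which is the point made above.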
a) P ↔∼P
b) (P →(Q→(R→∼∼(S∨T ))))
#. There are a number of things one could take # to mean (e.g., “meaningless”,
or “undefined”, or “unknown”).
Standard logic is “bivalent”: there are exactly two truth values. So, moving
from standard logic to a system that admits a third truth value is called “denying
bivalence”. One could deny bivalence and go even further, admitting four, five,
or even infinitely many truth values. But we'll only discuss trivalent systems—i.e.,
systems with exactly three truth values.
Why would one want to admit a third truth value? There are various
philosophical reasons one might give. One concerns vagueness. A person with
one dollar is not rich. A person with a million dollars is rich. Somewhere in the
middle, there are some people that are hard to classify. Perhaps a person with
$100,000 is such a person. They seem neither definitely rich nor definitely
not rich. So there’s pressure to say that the statement “this person is rich” is
capable of being neither definitely true nor definitely false. It’s vague.
Others say we need a third truth value for statements about the future. If
it is in some sense “not yet determined” whether there will be a sea battle
tomorrow, then (it is argued) the sentence:
Let’s define validity and semantic consequence for Łukasiewicz’s system much
like we did for standard PL:
Łukasiewicz definitions of validity and consequence:
A formula φ is valid iff φ is true (has the value 1) in every trivalent
interpretation; and Γ ⊨ φ iff φ is true in every trivalent interpretation
in which every member of Γ is true.
Notice that there are now two ways a formula can fail to be valid. It can
be 0 under some trivalent interpretation, or it can be # under some trivalent
interpretation. “Valid” (under this definition) means always true; it does not
mean never false. (Similarly, the defined notion of semantic consequence is
that of truth-preservation, not nonfalsity-preservation.) The definition leaves
it open that a formula might be never-false, and still not be always-true: such a
formula would be sometimes # and sometimes 1, but never 0.
Here is the intuitive idea behind the Kleene tables. Let’s call the truth values
0 and 1 the “classical” truth values. If a formula’s halves have only classical truth
values, then the truth value of the whole formula is just the classical truth value
determined by the classical truth values of the halves. But if one or both halves
are #, then we must consider the result of turning each # into one of the classical
truth values. If the entire formula would sometimes be 1 and sometimes be 0
after doing this, then the entire formula is #. But if the entire formula always
takes the same truth value, X, no matter which classical truth value any #s are
turned into, then the entire formula gets this truth value X. Intuitively: if there
is “enough information” in the classical truth values of a formula’s parts to settle
on one particular classical truth value, then that truth value is the formula’s
truth value.
Take the truth table for φ→ψ, for example. When φ is 0 and ψ is #, the
whole formula is 1—because the false antecedent is sufficient to make the whole
formula true, no matter what classical truth value we convert ψ to. On the
other hand, when φ is 1 and ψ is #, then the whole formula is #. The reason is
that what classical truth value we substitute in for ψ’s # affects the truth value of
the whole. If the # becomes a 0 then the whole thing is 0; but if the # becomes
a 1 then the whole thing is 1.
There are two important differences between Łukasiewicz’s and Kleene’s
systems. The first is that, unlike Łukasiewicz’s system, Kleene’s system makes
the formula P →P invalid. The reason is that in Kleene's system, #→# is #; thus,
P →P isn't true in all trivalent interpretations (it is # in any interpretation where P is #).
In fact, it’s easy to show that there are no valid formulas in Kleene’s system.
Proof. Consider the valuation that makes every sentence letter #. Here’s an
inductive proof that every wff is # in this interpretation. Base case: all the
sentence letters are # in this interpretation. (That’s obvious.) Inductive step:
assume that φ and ψ are both # in this interpretation. We need now to show
that ∼φ, φ∧ψ, φ∨ψ, and φ→ψ are all # in this interpretation. But that's easy—just
look at the truth tables for ∼, ∧, ∨ and →. ∼# is #, #∧# is #, #∨# is #, and #→# is #.
Even though there are no valid formulas in Kleene’s system, there are still
cases of semantic consequence. Semantic consequence for Kleene’s system is
defined as truth-preservation: Γ ⊨Kleene φ iff φ is true whenever every member
of Γ is true, given Kleene's truth tables. Then P ∧Q ⊨Kleene P , since the only
way for P ∧Q to be true is for P to be true and Q to be true.
The second (related) difference is that in Kleene's system, → is interdefinable
with the ∼ and ∨, in that φ→ψ has exactly the same truth table as
∼φ∨ψ. (Look at the truth tables to verify that this is true.) But that’s not true
for Łukasiewicz’s system. In Łukasiewicz’s system, when φ and ψ are both #,
then φ→ψ is 1, but ∼φ∨ψ is #.
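Both sets of tables fit in a few lines of code, with None playing the role of #, and the two differences just described can then be checked directly. A sketch (the function names are mine):

```python
# Three-valued connectives, with None playing the role of #.
def k_neg(a):
    return None if a is None else 1 - a

def k_disj(a, b):                  # strong Kleene disjunction
    if a == 1 or b == 1:           # a true disjunct settles it
        return 1
    if a == 0 and b == 0:
        return 0
    return None                    # not enough classical information

def k_cond(a, b):                  # strong Kleene ->, i.e. ~a v b
    return k_disj(k_neg(a), b)

def l_cond(a, b):                  # Lukasiewicz ->
    if a is None and b is None:
        return 1                   # the one cell where the tables differ
    return k_cond(a, b)

assert k_cond(None, None) is None  # so P->P is not Kleene-valid
assert l_cond(None, None) == 1     # but it is Lukasiewicz-valid
# And when phi, psi are both #, ~phi v psi is # on both systems,
# so it diverges from Lukasiewicz's ->:
assert k_disj(k_neg(None), None) is None
```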
So basically, the classical bit of each truth table is what you’d expect; but
everything gets boring if any constituent formula is a #.
One way to think about these tables is to think of the # as indicating nonsense.
The sentence “The sun is purple and blevledgekl;rz”, one might naturally think,
is neither true nor false because it is nonsense. It is nonsense even though it
has a part that isn’t nonsense.
3.3.4 Supervaluationism
Recall the guiding thought behind the strong Kleene tables: if a formula’s
classical truth values fix a particular truth value, then that is the value that the
formula takes on. There is a way to take this idea a step further, which results
in a new and interesting way of thinking about three-valued logic.
According to the strong Kleene tables, we get a classical truth value for
φ◯ψ, where ◯ is any (two-place) connective, only when we have “enough classical
information” in the truth values of φ and ψ to fix a classical truth value for
φ◯ψ. Consider φ∧ψ for example: if either φ or ψ is false, then since
falsehood of a conjunct is classically sufficient for the falsehood of the whole
conjunction, the entire formula is false. But if, on the other hand, both φ and
ψ are #, then neither φ nor ψ has a classical truth value, we do not have enough
classical information to settle on a classical truth value for φ∧ψ, and so the
whole formula is #.
But now consider a special case of the situation just considered, where φ is
P , ψ is ∼P , and P is #. According to the strong Kleene tables, the conjunction
P ∧∼P is #, since it is the conjunction of two formulas that are #. But there is a
way of thinking about truth values of complex sentences according to which
the truth value ought to be 0, not #: no matter what classical truth value P were
to take on, the whole sentence P ∧∼P would be 0—therefore, one might think,
P ∧∼P ought to be 0. If P were 0 then P ∧∼P would be 0∧∼0—that is 0; and if
P were 1 then P ∧∼P would be 1∧∼1—0 again.
The general thought here is this: suppose a sentence φ contains some
sentence letters P1 . . . Pn that are #. If φ would be false no matter how we assign
classical truth values to P1 . . . Pn —that is, no matter how we precisified φ—then
φ is in fact false. Further, if φ would be true no matter how we precisified it,
then φ is in fact true. But if precisifying φ would sometimes make it true and
sometimes make it false, then φ in fact is #.
The idea here can be thought of as an extension of the idea behind the
strong Kleene tables. Consider a formula φ◯ψ, where ◯ is any connective.
If there is enough classical information in the truth values of φ and ψ to fix on a
particular classical truth value, then the strong Kleene tables assign φ◯ψ that
truth value. Our new idea goes further, and says: if there is enough classical
information within φ and ψ to fix a particular classical truth value, then φ◯ψ
gets that truth value. Information “within” φ and ψ includes, not only the
truth values of φ and ψ, but also a certain sort of information about sentence
letters that occur in both φ and ψ. For example, in P ∧∼P , when P is #, there
is insufficient classical information in the truth values of P and of ∼P to settle
on a truth value for the whole formula P ∧∼P (since each is #). But when we
look inside P and ∼P , we get more classical information: we can use the fact
that P occurs in each to reason as we did above: whenever we turn P to 0, we
turn ∼P to 1, and so P ∧∼P becomes 0; and whenever we turn P to 1 we turn
∼P to 0, and so again, P ∧∼P becomes 0.
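The precisification test is easy to mechanize: assign every #-valued sentence letter each classical value in turn, evaluate classically, and see whether all results agree. A sketch (the encoding and helper names are mine, and only ∼ and ∧ are implemented):

```python
from itertools import product

def classical(formula, assignment):
    """Evaluate a formula (a sentence letter is a string; ('~', a)
    and ('^', a, b) are negation and conjunction) under a classical
    0/1 assignment to its letters."""
    if isinstance(formula, str):
        return assignment[formula]
    if formula[0] == '~':
        return 1 - classical(formula[1], assignment)
    _, a, b = formula
    return classical(a, assignment) * classical(b, assignment)

def supervaluate(formula, letters, trivalent):
    """trivalent maps each letter to 1, 0, or None (#). Return 1 or 0
    if every precisification agrees on that value, else None."""
    gaps = [p for p in letters if trivalent[p] is None]
    values = set()
    for choice in product((1, 0), repeat=len(gaps)):
        assignment = dict(trivalent, **dict(zip(gaps, choice)))
        values.add(classical(formula, assignment))
    return values.pop() if len(values) == 1 else None

# P ^ ~P with P #: false on every precisification, hence 0, not #.
print(supervaluate(('^', 'P', ('~', 'P')), ['P'], {'P': None}))   # 0
# P ^ Q with P, Q both #: precisifications disagree, hence #.
print(supervaluate(('^', 'P', 'Q'), ['P', 'Q'],
                   {'P': None, 'Q': None}))                       # None
```

The contrast in the last two lines is exactly the one just discussed: both are conjunctions of #-valued parts, yet they receive different values.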
This new idea—that a formula has a classical truth value iff every way of
precisifying it results in that truth value—is known as supervaluationism. Let us
lay out this idea formally.
Where I is a trivalent interpretation and C is a PL-interpretation (i.e., a
bivalent interpretation in the sense of section 2.3), say that C is a precisification
of I iff: whenever I assigns a sentence letter a classical truth value (i.e., 1 or 0),
C assigns that sentence letter the same classical value. Thus, precisifications
of I agree with I on the classical truth values, but in addition—being PL-
Consider P ∧Q and P ∧∼P , where P and Q are both #. Each is a conjunction
of two formulas that are #; but supervaluationism assigns P ∧Q the value #
while assigning P ∧∼P the value 0. So, given supervaluationism, the ∧ isn't
truth-functional. Similar arguments can be given to show that the → and the
∨ aren't truth-functional either, given supervaluationism.
3.4 Intuitionism
Intuitionism in the philosophy of mathematics is a view according to which
there are no mind-independent mathematical facts. Rather, mathematical facts
and entities are mental constructs that owe their existence to the mental activity
of mathematicians constructing proofs. This philosophy of mathematics leads
intuitionists to a distinctive form of logic: intuitionist logic.
Let P be the statement: The sequence 0123456789 occurs somewhere in the
decimal expansion of π. How should we think about its meaning? For the classical
mathematician, the answer is straightforward. P is a statement about a part of
mathematical reality, namely, the infinite decimal expansion of π. Either the
sequence 0123456789 occurs somewhere in that expansion, in which case P is
true, or it does not, in which case P is false and ∼P is true.
For the intuitionist, this whole picture is mistaken, premised as it is on the
reality of an infinite decimal expansion of π. Our minds are finite, and so only
the finite initial segment of π’s decimal expansion that we have constructed so
far is real. The intuitionist’s alternate picture of P ’s meaning, and indeed of
meaning generally (for mathematical statements) is a radical one.5
The classical mathematician, comfortable with the idea of a realm of mind-
independent entities, thinks of meaning in terms of truth and falsity. As we saw,
she thinks of P as being true or false depending on the facts about π’s decimal
⁵One intuitionist picture, anyway, on which see Dummett (1973). What follows is a crude
sketch. It does not do justice to the actual intuitionist position, which is, as they say, subtle.
Likewise, intuitionists do not accept the law of the excluded middle, φ∨∼φ,
as a logical truth. To be a logical truth, according to an intuitionist, a sentence
should be provable from no premises whatsoever. But to prove P ∨∼P , for
example, would require either exhibiting a case of 0123456789 in π’s decimal
expansion, or proving that the assumption that 0123456789 occurs in π’s decimal
expansion leads to a contradiction. We’re not in a position to do either.
Though we won’t consider intuitionist predicate logic, one of its most
striking features is easy to grasp informally. Intuitionists say that an existentially
quantified sentence is proved iff one of its instances has been proved. Therefore
they reject the inference from ∼∀xF x to ∃x∼F x, for one might be able to
prove a contradiction from the assumption of ∀xF x without being able to
prove any instance of ∃x∼F x.
We have so far been considering a putative philosophical justification for
intuitionist propositional logic. That justification has been rough and ready;
but intuitionist propositional logic itself is easy to present, perfectly precise,
and is a coherent system regardless of what one thinks of its philosophical
underpinnings. Two simple modifications to the natural deduction system of
section 2.4 generate a natural deduction system for intuitionistic propositional
logic. First, we need to split up the double-negation rule, DN, into two halves,
“double-negation introduction” and “double-negation elimination”:
Γ ` ∼∼φ Γ`φ
DNE DNI
Γ`φ Γ ` ∼∼φ
In our original system from section 2.4 we were allowed to use both DNE and
DNI; but in the intuitionist system, we are only allowed to use DNI; DNE is
not allowed. Second, to make up for the dropped rule DNE, our intuitionist
system adds the rule “ex falso”:
Γ ` φ∧∼φ
EF
Γ`ψ
Note that EF can be proved in the original system: simply use RAA and then
DNE. So, intuitionist logic results from a system for classical logic by simply
dropping one rule (DNE) and adding another rule that was previously provable
(EF). It follows that every intuitionistically provable sequent is also classically
provable (because every intuitionistic proof can be converted to a classical
proof).
Notice how dropping DNE blocks proofs of various classical theorems the
intuitionist wants to avoid. The proof of ∅ ` P ∨∼P in section 2.4, for instance,
used DNE. Of course, for all we’ve said so far, there might be some other way
to prove this sequent. Only when we have a semantics for intuitionistic logic,
and a soundness proof relative to that semantics, can we show that this sequent
cannot be proven without DNE. We will discuss a semantics for intuitionism
in section 7.2.
It is interesting to note that even though intuitionists reject the inference
from ∼∼P to P , they accept the inference from ∼∼∼P to ∼P , since its proof
only requires the half of DN that they accept, namely the inference from P to
∼∼P :
1 (1) ∼∼∼P As
2 (2) P As (for reductio)
2 (3) ∼∼P 2, DN (accepted version)
1,2 (4) ∼∼P ∧ ∼∼∼P 1,3 ∧I
1 (5) ∼P 4, RAA
Note that you can’t use this sort of proof to establish ∼∼P ` P . Given the way
RAA is stated, its application always results in a formula beginning with the ∼.
Chapter 4
Predicate Logic
Let's now turn from propositional logic to the “predicate calculus” (PC),
as it is sometimes called. As with propositional logic, we're going to
formalize predicate logic. We’ll first do grammar, and then move to semantics.
We won’t consider proof theory at all.1
Primitive vocabulary:
· logical: →, ∼, ∀
· nonlogical:
for each n > 0, n-place predicates F , G, . . . ; variables x, y, z, . . . ;
constants a, b , c, . . .
· parentheses
1
Proof systems for predicate logic, in both axiomatic and natural-deduction form, are
straightforward, and can be found in standard logic textbooks.
No symbol of one type is a symbol of any other type. Let’s call any variable or
constant a term.
Definition of wff:
We’ll call formulas that are wffs in virtue of clause i) “atomic” formulas. When
a formula has no free variables, we’ll say that it is a closed formula, or sentence;
otherwise it is an open formula. (“Free” means that the variable doesn’t “belong”
to any quantifier in the formula. For example, in ∀yRxy, the variable x is free,
whereas the variable y is “bound” to the quantifier ∀y. ‘Free’ and ‘bound’ can
be precisely defined, but I won’t bother.)
We have the same defined logical terms: ∧, ∨, ↔. We also add the following
definition of the existential quantifier:
F x don’t have truth values at all. The variable ‘x’ doesn’t stand for any one
thing, and so ‘F x’ doesn’t have a truth value.
The solution to this problem is due to the Polish logician Alfred Tarski. It
begins with a new conception of a configuration, that of a model:
As before, we can give recursive clauses for the truth values of negations
and conditionals. φ→ψ, for example, will be true iff either φ is false or ψ is
true.
But this becomes tricky when we try to specify the truth value of ∀xF x. It
should, intuitively, be true if and only if ‘F x’ is true, no matter what we put
in in place of ‘x’. But this is vague. Do we mean “whatever name (constant)
we put in place of ‘x”’? No, because we don’t want to assume that we’ve got a
name for everything in the domain, and what if F x is true for all the objects we
have names for, but false for one of the nameless things! Do we mean, “true no
matter what object from the domain we put in place of ‘x”’? No; objects from
the domain aren’t part of our primitive vocabulary, so the result of replacing ‘x’
with an object from the domain won’t be a formula!2
Tarski’s solution to this problem goes as follows. Initially, we don’t consider
truth values of formulas absolutely. Rather, we let the variables refer to certain
things in the domain temporarily. Then, we’ll say that ∀xF x will be true iff
for all objects u in the domain D: F x is true while x temporarily refers to u.
We implement this idea of temporary reference with the idea of a “variable
assignment”:
The variable assignments give the “temporary” meanings to the variables; when
g (x) = u, then u is the temporary denotation of x.
We need a further bit of notation. Let u be some object in D, let g be some
variable assignment, and let α be a variable. We then define “g^u_α” to be the
variable assignment that is just like g , except that it assigns u to α. (If g already
assigns u to α then g^u_α will be the same function as g .)
Note the following important fact about variable assignments: g^u_α, when
applied to α, must give the value u. (Work through the definitions to see that
this is so.) That is:
g^u_α(α) = u
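These definitions are easy to mirror computationally. Here is a minimal Python sketch (my illustration, not part of the text) that models a variable assignment as a dictionary and g^u_α as a shifted copy:

```python
# A variable assignment over a domain, modeled as a dict from variable
# names to objects. (The particular names and values are hypothetical.)

def shifted(g, alpha, u):
    """Return the assignment g^u_alpha: just like g, except alpha gets u."""
    h = dict(g)   # copy, so g itself is unchanged
    h[alpha] = u
    return h

g = {"x": 1, "y": 2}
h = shifted(g, "x", 7)

# The "important fact": g^u_alpha applied to alpha gives u.
assert h["x"] == 7
# Other variables keep their old temporary denotations.
assert h["y"] == 2
# If g already assigns u to alpha, g^u_alpha is the same function as g.
assert shifted(g, "x", 1) == g
```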
One more bit of apparatus. Given any model M (= 〈D, I 〉), and given any
variable assignment, g , and given any term (i.e., variable or name) α, we define
2 Unless the domain happens to contain members of our primitive vocabulary!
[α]M,g = I(α), if α is a constant
[α]M,g = g(α), if α is a variable
(In understanding clause i), recall that the one-tuple containing just u, 〈u〉, is
just u itself. Thus, in the case where Π is F , some one-place predicate, clause i)
says that VM,g(F α) = 1 iff [α]M,g ∈ I (F ).)
So far we have defined the notion of truth in a model relative to a variable
assignment. But what we really want is a notion of truth in a model, period—that
is, absolute truth in a model. (We want this because we want to define, e.g., a
valid formula as one that is true in all models.) So, let’s define absolute truth in
a model in this way:
It might seem that this is too strict a requirement—why must φ be true relative
to each variable assignment? But in fact, it’s not too strict at all. The kinds of
formulas we’re really interested in are formulas without free variables (we’re
interested in formulas like F a, ∀xF x, ∀x(F x→Gx); not formulas like F x,
∀x Rxy, etc.) And if a formula has no free variables, then if there’s even a single
variable assignment relative to which it is true, then it is true relative to every
variable assignment. (And so, we could just as well have defined truth in a
model as truth relative to some variable assignment.) I won’t prove this fact,
but it’s not too hard to prove; one would simply need to prove (by induction)
that, for any wff φ and model M , if variable assignments g and h agree on all
variables free in φ, then VM , g (φ) = VM ,h (φ).
Now we can give semantic definitions of the core logical notions:
Since our new definition of the valuation function treats the propositional
connectives → and ∼ in the same way as the propositional logic valuation did,
it’s easy to see that it also treats the defined connectives ∧, ∨, and ↔ in the
same way:
VM ,g (φ∧ψ) = 1 iff VM ,g (φ) = 1 and VM ,g (ψ) = 1
VM ,g (φ∨ψ) = 1 iff VM ,g (φ) = 1 or VM , g (ψ) = 1
VM ,g (φ↔ψ) = 1 iff VM ,g (φ) = VM ,g (ψ)
Moreover, we can also prove that the valuation function treats ∃ as it should
(given its intended meaning):
VM,g(∃αφ) = 1 iff there is some u ∈ D such that VM,g^u_α(φ) = 1
This can be established as follows. The definition of ∃αφ is: ∼∀α∼φ. So, we
must show that for any model, and any variable assignment g based on that
model, VM,g(∼∀α∼φ) = 1 iff there is some u ∈ D such that VM,g^u_α(φ) = 1. (In
arguments like these, I’ll sometimes stop writing the subscript M in order to
reduce clutter. It should be obvious from the context what the relevant model
is.) Here’s the argument:
· Given the clause for ∼, this can be rewritten as: “… iff for some u ∈ D,
Vg^u_α(φ) = 1”
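The equivalence just argued for can also be spot-checked over a finite model. Here is a small Python sketch of a Tarski-style valuation (my illustration; the mini-model, the tuple encoding of wffs, and the names are all hypothetical), confirming that ∃xF x and ∼∀x∼F x get the same value:

```python
# A toy Tarski-style valuation for one-place predications, negation,
# and the quantifiers, over a finite domain.

D = {0, 1, 2}
I = {"F": {1, 2}}                        # interpretation of predicate F

def shifted(g, alpha, u):
    h = dict(g); h[alpha] = u
    return h

def V(wff, g):
    op = wff[0]
    if op == "atom":                     # ("atom", "F", "x")
        _, pred, var = wff
        return 1 if g[var] in I[pred] else 0
    if op == "not":
        return 1 - V(wff[1], g)
    if op == "all":                      # ("all", "x", body)
        _, alpha, body = wff
        return 1 if all(V(body, shifted(g, alpha, u)) == 1 for u in D) else 0
    if op == "some":                     # derived clause, checked below
        _, alpha, body = wff
        return 1 if any(V(body, shifted(g, alpha, u)) == 1 for u in D) else 0
    raise ValueError(op)

g = {"x": 0}
Fx = ("atom", "F", "x")
# Derived clause for ∃: ∃xFx gets the same value as ∼∀x∼Fx.
assert V(("some", "x", Fx), g) == V(("not", ("all", "x", ("not", Fx))), g) == 1
```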
Example 4.1: Show that ∀xF x→F a is valid. That is, show that for any
model 〈D, I 〉, and any variable assignment g , Vg (∀xF x→F a) = 1:
vi) By the definition of the denotation of a variable, [x]g^{I(a)}_x = g^{I(a)}_x(x), i.e.,
I (a)
The claim in step iv) that I (a) ∈ D comes from the definition of an inter-
pretation function: the interpretation of a constant is always a member of the
domain. Notice that “I (a)” is a term of our metalanguage; that’s why, when
I’m given that “for any u ∈ D” in step iii), I can set u equal to I (a) to obtain
step iv).
viii) But [x]g^{uu}_{xy} and [y]g^{uu}_{xy} are each just u. Hence 〈u, u〉 ∈ I (R), contradicting
iv).
d) ∃x∀yRxy→∀y∃xRxy
We’ve seen how to establish that particular formulas are valid. How do
we show that a formula is invalid? We need to simply exhibit a single model
in which the formula is false. (The definition of validity specifies that a valid
formula is true in all models; therefore, it only takes one model in which a
formula is false to make that formula invalid.) So let’s take one example; let’s
show that the formula (∃xF x∧∃xGx)→∃x(F x∧Gx) isn’t valid. To do this, we
must produce a model in which this formula is false. All we need is a single
model, since in order for the formula to be valid, it must be true in all models.
My model will contain letters in its domain:
D = {u, v}
I (F ) = {u}
I (G) = {v}
It is intuitively clear that the formula is false in this model. In this model,
something has F (namely, u), and something has G (namely, v), but nothing
has both.
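The intuitive verdict can be verified mechanically. A Python sketch (my illustration) of the countermodel just given:

```python
# Checking the countermodel: D = {u, v}, I(F) = {u}, I(G) = {v}.
# Antecedent (∃xFx ∧ ∃xGx) is true; consequent ∃x(Fx∧Gx) is false.

D = {"u", "v"}
F = {"u"}
G = {"v"}

antecedent = any(x in F for x in D) and any(x in G for x in D)
consequent = any(x in F and x in G for x in D)

assert antecedent is True
assert consequent is False
# So the whole conditional is false in this model; the formula is invalid.
assert (not antecedent or consequent) is False
```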
One further example: let’s show that ∀x∃yRxy ⊭ ∃y∀xRxy. We must show
that the first formula does not semantically imply the second. So we must come
up with a model in which the first formula is true and the second is false. It helps
to think about natural language sentences that these formulas might represent.
If R symbolizes “respects”, then the first formula says that “everyone respects
someone or other”, and the second says that “there is someone whom everyone
respects”. Clearly, the first can be true while the second is false: suppose that
each person respects a different person, so that no one person is respected by
everyone. A simple case of this occurs when there are just two people, each of
whom respects the other, but neither of whom respects him/herself:
(Diagram: two dots, representing u and v, with an arrow from each dot to the
other and no arrow from either dot to itself.)
D = {u, v}
I (R) = {〈u, v〉, 〈v, u〉}
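This model, too, can be checked with a few lines of Python (my illustration, not the text's):

```python
# The two-person "respect" model: each respects the other, neither him/herself.
D = {"u", "v"}
R = {("u", "v"), ("v", "u")}

# ∀x∃yRxy: everyone respects someone or other — true here.
premise = all(any((x, y) in R for y in D) for x in D)
# ∃y∀xRxy: someone is respected by everyone — false here.
conclusion = any(all((x, y) in R for x in D) for y in D)

assert premise and not conclusion
```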
a) ⊭ ∀x(F x→Gx)→∀x(Gx→F x)
c) Rab ⊭ ∃xRx x
5.1 Identity
“Standard” predicate logic is usually taken to include the identity sign (“=”).
“a=b ” means that a and b are one and the same thing.
CHAPTER 5. EXTENSIONS OF PREDICATE LOGIC
like “V(φ) = 1”. This shouldn’t generally cause confusion, but if there’s a
danger of misunderstanding, I’ll clarify by writing things like: “…= (i.e., is the
same object as)…”.
v) So, given the clause for “=”, [x]g^{uu}_{xy} is not the same object as [y]g^{uu}_{xy}
vi) but [x]g^{uu}_{xy} and [y]g^{uu}_{xy} are the same object. [x]g^{uu}_{xy} is g^{uu}_{xy}(x), i.e., u; and
[y]g^{uu}_{xy} is g^{uu}_{xy}(y), i.e., u.
a) F ab ⊨ ∀x(x=a→F x b )
t =c
∀x(M x→∼x=s)
(It will be convenient to abbreviate ∼α=β as α≠β. Thus, the second symbol-
ization can be rewritten as: ∀x(M x→x≠s).) But many other sentences involve
the concept of identity in subtler ways.
Consider, for example, “Every lawyer hates every other lawyer”. The ‘other’
signifies nonidentity; we have, therefore:
∀x(Lx→∀y[(Ly∧x≠y)→H xy])
Consider next “Only Ted can change grades”. This means: “no one other than
Ted can change grades”, and may therefore be symbolized as:
∼∃x(x≠t ∧C x)
This says that there are two different objects, x and y, each of which are di-
nosaurs. To say “There are at least three dinosaurs” we say:
Indeed, for any n, one can construct a sentence φn that symbolizes “there are
at least n F s”:
φn : ∃x1 . . . ∃xn (F x1 ∧ · · · ∧F xn ∧ δ)
where δ is the conjunction of all sentences “xi ≠ x j ” where i and j are integers
between 1 and n (inclusive) and i < j . (The sentence δ says in effect that no
two of the variables x1 . . . xn stand for the same object.)
Since we can construct each φn , we can symbolize other sentences involving
number as well. To say that there are at most n F s, we write: ∼φn+1 . To say
that there are between n and m F s (where m > n), we write: φn ∧∼φ m+1 . To
say that there are exactly n F s, we write: φn ∧∼φn+1 .
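The bookkeeping here is just arithmetic on the number of F s, which a few lines of Python (my illustration, not the text's) make explicit:

```python
# The numerical constructions, read as conditions on the count of Fs.
# phi_n ("there are at least n Fs") is true iff the count is >= n, so:

def at_least(n, count): return count >= n                     # phi_n
def at_most(n, count):  return not at_least(n + 1, count)     # ~phi_{n+1}
def exactly(n, count):  return at_least(n, count) and not at_least(n + 1, count)
def between(n, m, count): return at_least(n, count) and not at_least(m + 1, count)

# With exactly two Fs in the domain:
assert at_least(2, 2) and not at_least(3, 2)
assert at_most(2, 2) and not at_most(1, 2)
assert exactly(2, 2) and not exactly(2, 3)
assert between(1, 3, 2)
```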
These methods for constructing sentences involving number will always
work; but one can often construct shorter numerical symbolizations by other
methods. For example, to say “there are exactly two dinosaurs”, instead of
saying “there are at least two dinosaurs, and it’s not the case that there are at
least three dinosaurs”, we could say instead:
b) The only truly great player who plays in the NBA is Allen
Iverson
P f (a)
c = s(a, b )
(here “c” symbolizes “3”). We can put variables into the blanks of function
symbols, too. Thus, we can symbolize “Someone’s father was a politician” as
∃xP f (x)
Example 5.2: Symbolize the following sentences using predicate logic with
identity and function symbols:
Primitive vocabulary:
· logical: →, ∼, ∀, =
· nonlogical:
· parentheses
The definition of a wff, actually, stays the same; all that needs to change is the
definition of a “term”. Before, terms were just names or variables. Now, we
need to allow for f (a), f ( f (a)), etc., to be terms. This is done by the following
recursive definition of a term:1
Definition of terms:
Note the recursive nature of this definition: the denotation of a complex term
is defined in terms of the denotations of its smaller parts. Let’s think carefully
about what the final clause says. It says that, in order to calculate the denotation
of the complex term f (α1 . . . αn ) (relative to assignment g ), we must first figure
out what I ( f ) is—that is, what the interpretation function I assigns to the
function symbol f . This object, the new definition of a model tells us, is an
n-place function on the domain. We then take this function, I ( f ), and apply
it to n arguments: namely, the denotations (relative to g ) of the terms α1 . . . αn .
The result is our desired denotation of f (α1 . . . αn ).
It may help to think about a simple case. Suppose that f is a one-place
function symbol; suppose our domain consists of the set of natural numbers;
suppose that the name a denotes the number 3 in this model (i.e., I (a) = 3),
and suppose that f denotes the successor function (i.e., I ( f ) is the function,
successor, that assigns to any natural number n the number n + 1.) In that case,
the definition tells us that:
[ f (a)] g = I ( f )([a] g )
          = I ( f )(I (a))
          = successor(3)
          = 4
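The same computation can be run mechanically. A sketch of the recursive denotation clause, with the model of the example hard-coded (the tuple encoding of complex terms is my own device, not the text's):

```python
# Recursive denotation of terms, mirroring the worked example:
# domain = natural numbers, I(a) = 3, I(f) = successor.

I_const = {"a": 3}
I_func = {"f": lambda n: n + 1}            # the successor function

def denote(term, g):
    """[term]_g. A term is a name/variable string or an ('f', arg) pair."""
    if isinstance(term, str):
        # variables look up g; constants fall through to I
        return g[term] if term in g else I_const[term]
    f, arg = term
    # final clause: apply I(f) to the denotation of the smaller term
    return I_func[f](denote(arg, g))

g = {}                                      # no variables needed here
assert denote(("f", "a"), g) == 4           # [f(a)] = successor(3) = 4
assert denote(("f", ("f", "a")), g) == 5    # [f(f(a))] = 5
```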
ii) Suppose for reductio that ∃xP f (x) is false in this model; i.e., for some
variable assignment g , Vg (∃xP f (x)) = 0
vi) [ f (x)]g^{I(c)}_x is just I ( f )([x]g^{I(c)}_x), and [x]g^{I(c)}_x is just g^{I(c)}_x(x)—i.e., I (c).
Primitive vocabulary:
· logical: →, ∼, ∀, =, ι
· nonlogical:
· for each n > 0, n-place predicates F , G . . ., with or without subscripts
· for each n > 0, n-place function symbols f , g ,…, with or without
subscripts
· variables x, y . . . with or without subscripts
· individual constants (names) a, b . . ., with or without subscripts
· parentheses
Notice how we needed to combine the recursive definitions of term and wff
into a single recursive definition of wffs and terms together. The reason is that
we need the notion of a wff to define what counts as a term containing the ι
operator (clause ii); but we need the notion of a term to define what counts as
a wff (clause iv). The way we accomplish this is not circular. The reason it isn’t
is that we can always decide, using these rules, whether a given string counts as
a wff or term by looking at whether smaller strings count as wffs or terms. And
the smallest strings are said to be wffs or terms in non-circular ways.
structure (namely: ιx(B x∧C x).) But even without the ι, there is a way to
symbolize whole sentences containing ‘the black cat’, using just standard predicate
logic plus identity. We could, for example, symbolize “The black cat is happy” as:
That is, “there is something such that: i) it is a black cat, ii) nothing else is a
black cat, and iii) it is happy”.
This method for symbolizing sentences containing ‘the’ is called “Russell’s
theory of descriptions”, in honor of its inventor Bertrand Russell, the 19th and
20th century philosopher and logician.2 The general idea is to symbolize: “the
φ is ψ” as ∃x[φ(x) ∧ ∀y(φ(y)→x=y) ∧ ψ(x)]. This method can be iterated so
as to apply to sentences with two or more definite descriptions, such as “The
8-foot tall man drove the 20-foot long limousine”, which becomes, letting ‘E’
stand for ‘is eight feet tall’ and ‘T ’ stand for ‘is twenty feet long’:
? Or does it mean “It is not the case that the President is bald”, which is
symbolized thus:
So the first implies that there’s a unique president. The second merely denies
that there is a unique president who is bald. That doesn’t imply that there’s
a unique president. It would be true if there’s a unique president who is not
bald, but it would also be true in two other cases: the case in which there are
no presidents at all, and the case in which there is more than one president.
A similar issue arises with the sentence “The round square does not exist”.
We might think to symbolize it:
letting “E” stand for “exists”. In other words, we might give the description
wide scope. But this is wrong, because it says there is a certain round square that
doesn’t exist, and that’s a contradiction. This way of symbolizing the sentence
corresponds to reading the sentence as saying “The thing that is a round square
is such that it does not exist”. But that isn’t the most natural way to read the
sentence. The sentence would usually be interpreted to mean: “It is not true
that the round square exists”—that is, as the negation of “the round square
exists”:
with the ∼ out in front. Here we’ve given the description narrow scope. Notice
also that saying that x exists at the end is redundant, so we could simplify to:
Again, notice the moral of these last two examples: if a definite description
occurs in a sentence with a ‘not’, the sentence may be ambiguous: does the
‘not’ apply to the entire rest of the sentence, or merely to the predicate?
If we are willing to use Russell’s method for translating definite descriptions,
we can drop ι from our language. We would, in effect, not be treating “the F ”
as a referring phrase. We would instead be paraphrasing sentences that contain
“the F ” into sentences that don’t. “The black cat is happy” got paraphrased
as: “there is something that is a black cat, is such that nothing else is a black
cat, and is happy”. See?—no occurrence of “the black cat” in the paraphrased
sentence.
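Russell's truth condition is easy to test over small finite models. In this sketch (the sets F and H are my hypothetical examples), the function returns the truth value of ∃x(F x ∧ ∀y(F y→y=x) ∧ H x):

```python
# Russell's truth condition for "The F is H", over a tiny finite model.

def the_F_is_H(D, F, H):
    return any(x in F
               and all((y not in F) or y == x for y in D)   # nothing else is F
               and x in H
               for x in D)

D = {1, 2, 3}
assert the_F_is_H(D, F={2}, H={2, 3})        # unique F, and it is H
assert not the_F_is_H(D, F=set(), H=D)       # no F at all
assert not the_F_is_H(D, F={1, 2}, H=D)      # more than one F
assert not the_F_is_H(D, F={2}, H={3})       # unique F, but it isn't H
```

Note how the last three cases match the discussion above: the Russellian sentence is false both when there is no F and when there is more than one.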
In fact, once we use Russell’s method, we can get rid of function symbols too.
Given function symbols, we treated “father” as a function symbol, symbolized
it with “ f ”, and symbolized the sentence “George W. Bush’s father was a
P ιxF x b
But given Russell’s method, we can symbolize the whole thing without using
either function symbols or the ι:
We can get rid of all function symbols this way, if we want. Here’s the method:
· In any sentence containing the term “ f (α1 . . . αn )”, replace each occur-
rence of this term with “the x such that R(x, α1 . . . αn )”.
For example, let’s go back to: “Every even number is the sum of two prime
numbers”. Instead of introducing a function symbol s (x, y) for “the sum of x
and y”, let’s introduce a predicate letter R(z, x, y) for “z is a sum of x and y”.
We then use Russell’s method to symbolize the whole sentence thus:
The end of the formula (beginning with ∃w) says “the sum of y and z is
identical to x”—that is, that there exists some w such that w is a sum of y
and z, and there is no sum of y and z other than w, and w = x.
Like those sentences that are representable in standard logic, these sentences
involve quantificational notions: most things, some critics, and so on. In this
section we introduce a broader conception of what a quantifier is, and new
quantifiers that allow us to symbolize these sentences.
VM ,g (∀αφ) = 1 iff φM ,g ,α = D
VM ,g (∃αφ) = 1 iff φM ,g ,α 6= ∅
But if we can rewrite the semantic clauses for the familiar quantifiers ∀ and
∃ in this way—as conditions on φM ,g ,α —then why not introduce new symbols
of the same grammatical type as ∀ and ∃, whose semantics is parallel to ∀ and
∃ except in laying down different conditions on φM ,g ,α ? These would be new
kinds of quantifiers. For instance, for any integer n, we could introduce a
quantifier ∃n , to be read as “there exists at least n”. That is, ∃n φ means: “there
are at least n φs.” The definitions of a wff, and of truth in a model, would be
updated with the following clauses:
· if α is a variable and φ is a wff, then ∃n αφ is a wff
· VM , g (∃n αφ) = 1 iff |φM , g ,α | ≥ n
The expression |A| stands for the “cardinality” of set A—i.e., the number of
members of A. Thus, this definition says that ∃n αφ is true iff the cardinality of
φM ,g ,α is greater than or equal to n—i.e., this set has at least n members.
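This clause can be prototyped directly: compute the set φM,g,α and compare its cardinality with n. A sketch (hypothetical domain and predicate, my illustration):

```python
# The generalized-quantifier clause: ∃_n αφ is true iff |φ^{M,g,α}| ≥ n,
# where φ^{M,g,α} is the set of objects in D satisfying φ.

def extension(D, satisfies):
    """phi^{M,g,alpha}: the set of u in D for which phi holds of u."""
    return {u for u in D if satisfies(u)}

def exists_n(n, D, satisfies):
    return len(extension(D, satisfies)) >= n

D = {0, 1, 2, 3}
is_even = lambda u: u % 2 == 0         # plays the role of φ
assert exists_n(1, D, is_even)
assert exists_n(2, D, is_even)
assert not exists_n(3, D, is_even)     # only two evens in D
```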
Now, the introduction of the symbols ∃n does not increase the expressive
power of predicate logic, for as we saw in section 5.1.3, we can symbolize
“there are at least n F s” using just standard predicate logic (plus “=”). The
new notation is merely a space-saver. But other such additions are not mere
space-savers. For example, by analogy with the symbols ∃n , we can introduce a
symbol ∃∞ , meaning “there are infinitely many”:
So we should require the following of RQ . Consider any set, D, and any one-
one function, f , from D onto another set D′. Then, if a subset X of D bears
RQ to D, the set f [X ] must bear RQ to D′. ( f [X ] is the image of X under
function f —i.e., {u : u ∈ D′ and u = f (v), for some v ∈ X }. It is the subset of
D′ onto which f “projects” X .)
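The image operation itself is a one-liner. A sketch (the particular function is my hypothetical example):

```python
# The image f[X] of a set X under a function f: the set of values f(v)
# for v in X. Note that v ranges over X, not over all of D.

def image(f, X):
    return {f(v) for v in X}

f = lambda n: n + 10                  # a one-one function, for illustration
assert image(f, {1, 2}) == {11, 12}
assert image(f, set()) == set()
```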
Exercise 5.9 Let the quantifier ∃prime mean “there are a prime
number of”. Using the notation of generalized quantifiers, write
out the semantics of this quantifier.
Everything is material
Something is spiritual
Here, the predicates (verb phrases) ‘is material’ and ‘is spiritual’ correspond
to the open sentences of logic; it is to these that ‘everything’ and ‘something’
attach.
But in fact, monadic quantifiers in natural language are atypical. ‘Every’
and ‘some’ typically occur as follows:
The first is to be read: “all φs are ψ”; the second is to be read “there is a φ that
is a ψ”. The clauses for these new binary quantifiers in the definition of the
valuation function for a PC-model are:
VM ,g ((∀α:φ)ψ) = 1 iff φM ,g ,α ⊆ ψM ,g ,α
VM ,g ((∃α:φ)ψ) = 1 iff φM ,g ,α ∩ ψM ,g ,α 6= ∅
That is, (the α:φ)ψ is true iff i) there is exactly one φ, and ii) every φ is a
ψ. This truth condition, notice, is exactly the truth condition for Russell’s
symbolization of “the φ is a ψ”; hence the name the.
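These clauses are pure set-theoretic tests, which makes them easy to prototype. A sketch (the sets are my hypothetical examples, not the text's):

```python
# The binary quantifier clauses as set tests: (∀α:φ)ψ is subsethood,
# (∃α:φ)ψ is nonempty intersection, and "the" adds uniqueness.

def every(phi, psi): return phi <= psi             # phi ⊆ psi
def some(phi, psi):  return bool(phi & psi)        # phi ∩ psi ≠ ∅
def the(phi, psi):   return len(phi) == 1 and phi <= psi

lawyers = {"a", "b"}
happy = {"a", "b", "c"}
assert every(lawyers, happy)
assert some(lawyers, {"b"})
assert not the(lawyers, happy)        # not exactly one lawyer
assert the({"a"}, happy)              # exactly one, and it is happy
```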
As with the introduction of the monadic quantifiers ∃n , the introduction of
the binary existential and universal quantifiers, and of the, does not increase the
expressive power of first order logic, for the same effect can be achieved with
monadic quantifiers. (∀α:φ)ψ, (∃α:φ)ψ, and (the α:φ)ψ become, respectively:
∀α(φ→ψ)
∃α(φ∧ψ)
∃α(φ ∧ ∀β(φβ →β=α) ∧ ψ)
(where φβ is φ with free αs changed to βs.) But, as with the monadic quantifiers
∃∞ and most, there are binary quantifiers one can introduce that genuinely
increase expressive power. For example, most occurrences of ‘most’ in English
are binary, e.g.:
The binary most2 increases our expressive power, even relative to the monadic
most: not every sentence expressible with the former is equivalent to a sentence
expressible with the latter.3
You may invent any generalized quantifiers you need, provided you
write out their semantics.
∃X X a
∃X ∃yXy
(where g^U_π is the variable assignment just like g except in assigning U to π.)
Notice that, as with the generalized monadic quantifiers, no alteration to the
definition of a PC-model is needed. All we need to do is change grammar and
the definition of the valuation function.
Second-order logic is different from first-order logic in many ways. For
instance, one can define the identity predicate in second-order logic:
(*) (GK2 ) is true in 〈D, I 〉 iff D has a nonempty subset, X , such that i)
X ⊆ I (C ), and ii) whenever 〈u, v〉 ∈ I (A) and u ∈ X , then v ∈ X as
well and v is not u.
2φ: “It is necessary that φ”, “Necessarily, φ”, “It must be that φ”
3φ: “It is possible that φ”, “Possibly, φ”, “It could be that φ”, “It can be that
φ”, “It might be that φ”
CHAPTER 6. PROPOSITIONAL MODAL LOGIC
so doesn’t happen in any possible world. But within this limit, we can imagine
all sorts of possible worlds: possible worlds with talking donkeys, possible
worlds in which I am ten feet tall, and so on. “Complete” means simply that
no detail is left out—possible worlds are completely specific scenarios. There
is no possible world in which I am “somewhere between ten and eleven feet
tall” without being some particular height.1 Likewise, in any possible world in
which I am exactly ten feet, six inches tall (say), I must have some particular
weight, must live in some particular place, and so on. One of these possible
worlds is the actual world—this is the complete and possible scenario that in
fact obtains. The rest of them are merely possible—they do not obtain, but
would have obtained if things had gone differently.
In terms of possible worlds, we can think of our modal operators thus:
Primitive vocabulary:
· Sentence letters: P, Q, R . . . , with or without numerical subscripts
· Connectives: →, ∼, 2
· Parentheses
Definition of wff:
· Sentence letters are wffs
· If φ and ψ are wffs then φ→ψ, ∼φ, and 2φ are also wffs
· nothing else is a wff
The 2 is the only new primitive connective. But just as we were able to
define ∧, ∨, and ↔, we can define new nonprimitive modal connectives:
· “3φ” (“Possibly φ”) is short for “∼2∼φ”
· “φ⇒ψ” (“φ strictly implies ψ”) is short for “2(φ→ψ)”
I can come to the party, but I can’t stay late. (“can” = “is not
inconvenient”)
Humans can travel to the moon, but not Mars. (“can” = “is
achievable with current technology”)
Objects can move almost as fast as the speed of light, but nothing
can travel faster than light. (“can” = “is consistent with
the laws of nature”)
Objects could have traveled faster than the speed of light (if
the laws of nature had been different), but no matter what the
laws had been, nothing could have traveled faster than itself.
(“can” = “metaphysical possibility”)
You can borrow but you can’t steal. (“can” = “morally ac-
ceptable”)
So when representing English sentences using the 2 and the 3, one should
keep in mind that these expressions can be used to express different strengths
of necessity and possibility. (Though we won’t do this, one could introduce
different symbols for different sorts of possibility and necessity.)
The different strengths of possibility and necessity can be made vivid by
thinking, again, in terms of possible worlds. As we saw, we can think of the 2
and the 3 as quantifiers over possible worlds (the former a universal quantifier,
the latter an existential quantifier). The very broad sort of possibility and
necessity, metaphysical possibility and necessity, can be thought of as a completely
unrestricted quantifier: a statement is necessarily true iff it is true in all possible
worlds whatsoever. The other kinds of possibility and necessity can be thought
of as resulting from various restrictions on the quantifiers over possible worlds.
Thus, when ‘can’ signifies achievability given current technology, it means:
true in some possible world in which technology has not progressed beyond where it has
progressed in fact at the current time; when ‘can’ means moral acceptability, it
means: true in some possible world in which nothing morally forbidden occurs; and so
on.
with sentences to form a new sentence, the truth value of the resulting sentence
is determined by the truth value of the component sentences. Many think
that ‘and’ is truth-functional, since they think that an English sentence of the
form “φ and ψ” is true iff φ and ψ are both true. But ‘necessarily’ is not truth-
functional. Suppose I tell you the truth value of φ; will you be able to tell me
the truth value of this sentence? Well, if φ is false then presumably you can (it
is false), but if φ is true, then you still don’t know. If φ is “Ted is a philosopher”
then “Necessarily φ” is false, but if φ is “Either Ted is a philosopher or he isn’t
a philosopher” then “Necessarily φ” is true. So the truth value of “Necessarily
φ” isn’t determined by the truth value of φ. Similarly, ‘possibly’ isn’t truth-
functional either: ‘I might have been six feet tall’ is true, whereas ‘I might have
been a round square’ is false, despite the fact that ‘I am six feet tall’ and ‘I am a
round square’ each have the same truth value (they’re both false.)
Since the 2 and the 3 are supposed to represent ‘necessarily’ and ‘possibly’,
respectively, and since the latter aren’t truth-functional, we can’t use the method
of truth tables to construct the semantics for the 2 and the 3. For the method
of truth tables assumes truth-functionality. Truth tables are just pictures of truth
functions: they specify what truth value a complex sentence has as a function of
what truth values its parts have. Imagine trying to construct a truth table for
the 2. It’s presumably clear (though see the discussion of systems K, D, and T
below) that 2φ should be false if φ is false, but what about when φ is true?:
φ | 2φ
1 | ?
0 | 0
There’s nothing we can put in this slot in the truth table, since when φ is true,
sometimes 2φ is true and sometimes it is false.
Our challenge is clear: we need a semantics for the 2 and the 3 other than
the method of truth tables.
6.3.1 Relations
Before we investigate how to overcome this challenge, a digression is necessary,
to introduce the concept of a relation. A relation is just a feature of multiple
objects taken together. The taller-than relation is one example: when one
person is taller than another, that’s a feature of those two objects taken together.
Another example is the less-than relation for numbers. When one number
is less than another, that’s a feature of those two numbers taken together.
“Binary” relations apply to two objects at a time. The taller-than and less-
than relations are binary relations, or “two-place” relations as we might say.
We can also speak of three-place relations, four-place relations, and so on.
An example of a three-place relation would be the betweenness relation for
numbers: the relation that holds between 2, 5, and 23 for example.
Recall our discussion of ordered sets from section 1.8. In addition to their
use in constructing models, ordered sets are also useful for giving an official
definition of what a relation is.
For example, the taller-than relation may be taken to be the set of ordered pairs
〈u, v〉 such that u is a taller person than v. The less-than relation for positive
integers is the set of ordered pairs 〈m, n〉 such that m is a positive integer less
than n, another positive integer. That is, it is the following set:
In other words, the domain of R is the set of all things that bear R to something;
the range is the set of all things that something bears R to; and R is over A iff
the members of the tuples in R are all drawn from A.
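The official definition of a relation as a set of ordered pairs translates directly into code. A sketch (my illustration, with less-than on a small set as the example):

```python
# A binary relation as a set of ordered pairs, with its domain and range.

R = {(1, 2), (1, 3), (2, 3)}          # less-than, restricted to {1, 2, 3}

domain = {u for (u, v) in R}          # things that bear R to something
rng    = {v for (u, v) in R}          # things that something bears R to

assert domain == {1, 2}               # 3 is less than nothing here
assert rng == {2, 3}                  # nothing is less than 1
# R is "over" A iff every member of every pair is drawn from A:
assert all(u in {1, 2, 3} and v in {1, 2, 3} for (u, v) in R)
```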
Binary relations come in different types, depending on the patterns in which
they hold. Here are some types of binary relations that we will need to think
about:
Notice that we relativize some of these relation types to a given set A. The
notion of reflexivity, for example, is defined relative to a set. We do
this because the alternative would be to say that a relation is reflexive simpliciter
if everything bears R to itself; but that would require the domain and range
of any reflexive relation to be the set of absolutely all objects. It’s better to
introduce the notion of being reflexive relative to a set, which is applicable to
relations with smaller domains and ranges. (I will sometimes omit the qualifier
‘in A’ when it is clear which set that is.) Why don’t symmetry and transitivity
have to be relativized to a set?—because they only say what must happen if
R holds among certain things. Symmetry, for example, says merely that if R
holds between u and v, then it must also hold between v and u, and so we can
say that a relation is symmetric absolutely, without implying that everything is
in its domain and range.
by truth tables within each world; ∼φ, for example, will be true at a world iff φ
is false at that world. But the truth value of 2φ at a world won’t be determined
by the truth value of φ at that world; the truth value of φ at other worlds will
also be relevant.
Specifically, 2φ will count as true at a world iff φ is true at every world that
is “accessible” from the first world. What does “accessible” mean? Each model
will come equipped with a binary relation, R, that holds between possible
worlds; we will say that world v is “accessible from” world w when R wv. The
intuitive idea is that R wv if and only if v is possible relative to w. That is, if you
live in world w, then from your perspective, the events in world v are possible.
The idea that what is possible might vary depending on what possible
world you live in might at first seem strange, but it isn’t really. “It is physically
impossible to travel faster than the speed of light” is true in the actual world,
but false in worlds where the laws of nature allow faster-than-light travel.
On to the semantics. We first define a general notion of a MPL model,
which we’ll then use to give a semantics for each of our systems:
Definition of model: An MPL-model is an ordered triple, 〈W , R, I 〉, where:
· W is a non-empty set of objects (“possible worlds”)
· R is a binary relation over W (“accessibility relation”)
· I is a two-place function that assigns a 0 or 1 to each sentence letter,
relative to (“at”, or “in”) each world—that is, for any sentence letter α,
and any w ∈ W , I (α, w) is either 0 or 1. (“interpretation function”)
Each MPL-model contains a set W of possible worlds, and an accessibility
relation R. 〈W , R〉 is sometimes called the model’s frame. Think of the frame
as a map of the “structure” of the model’s space of possible worlds: it contains
information about how many worlds there are, and which worlds are accessible
from which. In addition to a frame, each model also contains an interpretation
function I , which assigns truth values to sentence letters.
A model’s interpretation function assigns truth values only to sentence
letters. But the sum total of all the truth values of sentence letters relative to
worlds determines the truth values of all complex wffs, again relative to worlds.
It is the job of the model’s valuation function to specify exactly how these truth
values get determined:
Definition of valuation: Where M (= 〈W , R, I 〉) is any MPL-model, the
valuation for M , VM , is defined as the two-place function that assigns either 0 or 1 to each wff relative to each member of W , subject to the following constraints, where α is any sentence letter, φ and ψ are any wffs, and w is any member of W :

VM (α, w) = I (α, w)
VM (∼φ, w) = 1 iff VM (φ, w) = 0
VM (φ→ψ, w) = 1 iff either VM (φ, w) = 0 or VM (ψ, w) = 1
VM (2φ, w) = 1 iff for every v ∈ W , if R wv then VM (φ, v) = 1

CHAPTER 6. PROPOSITIONAL MODAL LOGIC 129
What about the truth values for complex formulas that contain ∧, ∨, ↔, and
3? Given the definition of these defined connectives in terms of the primitive
connectives, it is easy to prove that the following derived conditions hold:
VM (φ∧ψ, w) = 1 iff VM (φ, w) = 1 and VM (ψ, w) = 1
VM (φ∨ψ, w) = 1 iff VM (φ, w) = 1 or VM (ψ, w) = 1
VM (φ↔ψ, w) = 1 iff VM (φ, w) = VM (ψ, w)
VM (3φ, w) = 1 iff for some v ∈ W , R wv and VM (φ, v) = 1
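The valuation clauses are fully algorithmic, so they can be rendered directly in code. Below is a minimal sketch in Python; the encoding of wffs as nested tuples is my own assumption, not the text's notation (the 2 of the text appears as "box", and 3 and the other defined connectives are reduced to the primitives):

```python
# A Kripke model is a triple (W, R, I): W a set of worlds, R a set of
# ordered pairs over W, and I a dict sending (sentence letter, world)
# to 0 or 1. Wffs are nested tuples built from ~, -> and box.

def V(model, formula, w):
    """The valuation function V_M, computed recursively at world w."""
    W, R, I = model
    op = formula[0]
    if op == "~":
        return 1 - V(model, formula[1], w)
    if op == "->":
        return 1 if V(model, formula[1], w) == 0 or V(model, formula[2], w) == 1 else 0
    if op == "box":
        # 2f is true at w iff f is true at every world accessible from w
        return 1 if all(V(model, formula[1], v) for v in W if (w, v) in R) else 0
    return I[(op, w)]  # base case: a sentence letter

# The defined connectives, reduced to the primitives:
def dia(f): return ("~", ("box", ("~", f)))        # 3f  =  ~2~f
def conj(f, g): return ("~", ("->", f, ("~", g)))  # f & g
def disj(f, g): return ("->", ("~", f), g)         # f v g

# A two-world example: world 0 sees world 1, and P is true only at 1.
M = ({0, 1}, {(0, 1)}, {("P", 0): 0, ("P", 1): 1})
assert V(M, ("box", ("P",)), 0) == 1   # 2P is true at 0
assert V(M, dia(("P",)), 0) == 1       # 3P is true at 0
```

Note that at world 1, which sees no world at all, 2P comes out vacuously true, exactly as the universally quantified truth condition requires.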
So far, we have introduced a general notion of an MPL model, and have
defined the notion of a wff’s being true at a world in an MPL model. Next, let
us consider how to define validity.
Remember that our overall strategy is C. I. Lewis’s: we want to construct
different modal systems, since it isn’t obvious which formulas ought to count as
logical truths. The systems will be named: K, D, T, B, S4, S5. Each system will
come with its own definition of a model. As a result, different formulas will
come out valid in the different systems. For example, as we’ll see, the formula
2P →22P is going to come out valid in S4 and S5, but not in the other systems.
Here are the definitions. Say that a wff is valid in a given MPL-model iff it is true at every world in that model. Then:

· φ is K-valid iff φ is valid in every MPL-model
· φ is D-valid iff φ is valid in every MPL-model whose accessibility relation is serial
· φ is T-valid iff φ is valid in every MPL-model whose accessibility relation is reflexive
· φ is B-valid iff φ is valid in every MPL-model whose accessibility relation is reflexive and symmetric
· φ is S4-valid iff φ is valid in every MPL-model whose accessibility relation is reflexive and transitive
· φ is S5-valid iff φ is valid in every MPL-model whose accessibility relation is reflexive, symmetric, and transitive
Notice that for each system, the valid formulas are defined as the formulas
that are valid in every model in which the accessibility relation has a certain
formal feature. The systems differ from from one another by what that formal
feature is. For T it is reflexivity: a formula is T-valid iff it is valid in every
model in which the accessibility relation is reflexive. For S4 the formal feature
is reflexivity + transitivity. Other systems correspond to other formal features.
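These formal features of the accessibility relation are easy to test mechanically; here is a sketch (the helper names are mine, not the text's):

```python
# Tests for the frame properties that distinguish the systems.
# R is a set of ordered pairs over the set of worlds W.

def reflexive(W, R):  return all((w, w) in R for w in W)
def serial(W, R):     return all(any((w, v) in R for v in W) for w in W)
def symmetric(R):     return all((v, w) in R for (w, v) in R)
def transitive(R):    return all((w, u) in R
                                 for (w, v) in R for (x, u) in R if v == x)

# A reflexive, transitive, but non-symmetric frame:
W, R = {0, 1}, {(0, 0), (0, 1), (1, 1)}
assert reflexive(W, R) and serial(W, R) and transitive(R)
assert not symmetric(R)
```

Notice that any reflexive relation passes the seriality test as well, which mirrors the fact (used below) that every T-model is a D-model.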
As before, we’ll use the notation for validity. But since we have many
modal systems, if we claim that a formula is valid, we’ll need to indicate which
system we’re talking about. Let’s do that by subscripting with the name of
the system; thus, “T φ” means that φ is T-valid.
It’s important to get clear on the status of possible-worlds lingo here. Where
〈W , R, I 〉 is a model, we call the members of W “worlds”, and we call R the
“accessibility” relation. Now, there is no question that “possible worlds” is a
vivid way to think about necessity and possibility. But officially, W is nothing
but a nonempty set, any old nonempty set. Its members needn’t be the kinds
of things metaphysicians call possible worlds: they can be numbers, people,
bananas—whatever you like. Similarly, R is just defined to be any old binary
relation on W ; it needn't have anything to do with the metaphysics of modality.
Officially, then, the possible-worlds talk we use to describe our models is just
talk, not heavy-duty metaphysics. Still, models are usually intended to model
something—to depict some aspect of the dependence of truth on the world.
So if modal sentences of English containing ‘necessarily’ and ‘possibly’ aren’t
made true by anything like possible worlds, it’s hard to see why possible worlds
models would shed any light on their meaning, or why truth-in-all-possible-
worlds-models would be a good way of modeling (genuine) validity for modal
statements. At any rate, this philosophical issue should be kept in mind. Back,
now, to the formalism.
Example 6.1: The formula 2(P ∨∼P ) is K-valid. To show this formula is
K-valid, we must show that it is valid in every MPL-model, since validity-in-all-
MPL-models is the definition of K-validity. Being valid in a model means being
true at every world in the model. So, consider any MPL-model 〈W , R, I 〉,
and let w be any world in W . We must prove that VM (2(P ∨∼P ), w) = 1. (As
before, I’ll start to omit the subscript M on VM when it’s clear which model
we’re talking about.)
i) Suppose for reductio that VM (2(P ∨∼P ), w) = 0
ii) So, by the truth condition for the 2 in the definition of the valuation
function, there is some world, v, such that R wv and V(P ∨∼P, v) = 0
iii) Given the truth condition for the ∨, V(P, v) = 0 and V(∼P, v) = 0
iv) Since V(∼P, v) = 0, given the truth condition for the ∼, V(P, v) = 1. But
that’s impossible; V(P, v) can’t be both 0 and 1.
Example 6.2: Show that ⊨T (32(P →Q)∧2P ) → 3Q. We must show that
V((32(P →Q)∧2P )→3Q, w) = 1, where V is the valuation for an arbitrarily
chosen T-model and w is any world in that model.
i) Suppose for reductio that V((32(P →Q)∧2P )→3Q, w) = 0
ii) Then, by the truth condition for the →, V(32(P →Q)∧2P, w) = 1, and…
iii) …V(3Q, w) = 0
iv) From ii), 32(P →Q) is true at w, and so V(2(P →Q), v) = 1, for some
world, call it v, such that R wv
v) From ii), V(2P, w) = 1. So, by the truth condition for the 2, P is true in
every world accessible from w; since R wv, it follows that V(P, v) = 1.
vi) From iv), P →Q is true in every world accessible from v; since our model
is a T-model, R is reflexive. So R v v; and so V(P →Q, v) = 1
vii) From v) and vi), by the truth condition for the →, V(Q, v) = 1
viii) Given iii), Q is false at every world accessible from w; this contradicts
vii)
The last example just showed that the formula (32(P →Q)∧2P )→3Q is
valid in T. Suppose we were interested in showing that this formula is also valid
in S4. What more would we have to do? Nothing! To be S4-valid is to be
valid in every S4-model; but a quick look at the definitions shows that every
S4-model is a T-model. So, since we already know that the formula is valid
in all T-models, we already know that it must be valid in all S4-models (and
hence, S4-valid), without doing a separate proof.
Think of it another way. To do a proof that the formula is S4-valid, we need
to do a proof in which we are allowed to assume that the accessibility relation is
both transitive and reflexive. And the proof above did just that. We didn’t ever
use the fact that the accessibility relation is transitive—we only used the fact
that it is reflexive (in step vi). But we don't need to use everything we're allowed
to assume.
In contrast, the proof above doesn’t establish that this formula is, say, K-valid.
To be K-valid, the formula would need to be valid in all models. But some
models don’t have reflexive accessibility relations, whereas the proof we gave
assumed that the accessibility relation was reflexive. And the formula isn't
in fact K-valid, as we'll show how to demonstrate in the next section.
          S5
         ↗  ↖
        B    S4
         ↖  ↗
           T
           ↑
           D
           ↑
           K
An arrow from one system to another indicates that validity in the first system
implies validity in the second system. For example, if a formula is D-valid, then
it’s also T-valid. The reason is that if something is valid in all D-models, then,
since every T-model is also a D-model (since reflexivity implies seriality), it
must be valid in all T-models as well.
S5 is the strongest system, since it has the most valid formulas. (That’s
because it has the fewest models—it’s easier to be S5-valid because there are
fewer potentially falsifying models.)
Notice that the diagram isn’t linear. That’s because of the following. Both B
and S4 are stronger than T; each contains all the T-valid formulas. But neither
B nor S4 is stronger than the other—each contains valid formulas that the
other doesn’t. (They of course overlap, because each contains all the T-valid
formulas.) S5 is stronger than each; S5 contains all the valid formulas of each.
These relationships between the systems will be exhibited below.
Suppose you are given a formula, and for each system in which it is valid,
you want to give a semantic proof of its validity. This needn’t require multiple
semantic proofs—as we have seen, one semantic proof can do the job. To prove
that a certain formula is valid in a number of systems, it suffices to prove that it
is valid in the weakest possible system. Then, that very proof will automatically
be a proof that it is valid in all stronger systems. For example, a proof that a
formula is valid in K would itself be a proof that the formula is D, T, B, S4,
and S5-valid. Why? Because every model of any kind is a K-model, so K-valid
formulas are always valid in all other systems.
In general, then, to show what systems a formula is valid in, it suffices to
give a single semantic proof of it, namely, a semantic proof in the weakest
system in which the formula is valid.
6.3.4 Countermodels
We have a definition of validity for the various systems, and we’ve shown how
to establish validity of particular formulas. Now we’ll investigate establishing
invalidity.
Let’s show that the formula 3P →2P is not K-valid. A formula is K-valid if
it is valid in all K-models, so all we must do is find one K-model in which it
isn’t valid. What follows is a procedure for doing this:4
r 3P →2P
4
This procedure is from Cresswell and Hughes (1996).
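The boxes-and-arrows method can also be mechanized: because K places no constraints on the accessibility relation, a program can simply enumerate all small models and look for one that falsifies 3P →2P at some world. A rough sketch under an assumed encoding (worlds are 0…n−1, R a set of pairs, P a tuple of truth values, none of which is the text's notation):

```python
from itertools import product

def find_countermodel(n):
    """Search every model with n worlds for one where 3P -> 2P is false."""
    W = range(n)
    pairs = [(w, v) for w in W for v in W]
    for bits in product([0, 1], repeat=len(pairs)):
        R = {p for p, b in zip(pairs, bits) if b}
        for P in product([0, 1], repeat=n):
            for w in W:
                diaP = any((w, v) in R and P[v] for v in W)  # 3P at w
                boxP = all(P[v] for v in W if (w, v) in R)   # 2P at w
                if diaP and not boxP:  # antecedent true, consequent false
                    return (set(W), R, P, w)
    return None

assert find_countermodel(2) is not None  # a two-world countermodel exists
```

This brute-force search finds, among others, exactly the simplified two-world models discussed below.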
Now, since the box represents a world, we should have some way of representing
the accessibility relation. What worlds are possible, relative to r; what worlds
does r “see”? Well, to represent one world (box) seeing another, we’ll draw
an arrow from the first to the second. However in the case of this particular
model, we don’t need to make this world r see anything. After all, we’re trying
to construct a K-model, and the accessibility relation of a K-model doesn’t
even need to be serial—no world needs to see any worlds at all. So, we’ll forget
about arrows for the time being.
[Diagram: a box for world r, with 0 entered under the → of 3P →2P ]

[Diagram: the same box, with 3P →2P now annotated 1 0 0 (1 under 3P , 0 under the →, 0 under 2P )]
Enter asterisks
When we assign a truth value to a modal formula, we thereby commit ourselves
to assigning certain other truth values to various formulas at various worlds.
For example, when we make 3P true at r, we commit ourselves to making P
true at some world that r sees. To remind ourselves of this commitment, we’ll
put an asterisk (*) below 3P . An asterisk below indicates a commitment to there
being some world of a certain sort. Similarly, since 2P is false at r, this means
that P must be false in some world r sees (if it were true in all such worlds,
then by the semantic clause for the 2, 2P would be true at r). We again have a
commitment to there being some world of a certain sort, so we enter an asterisk
below 2P as well:
[Diagram: world r, with 3P →2P annotated 1 0 0 and asterisks under 3P and under 2P ]

[Diagram: the same world r, now with arrows to two new worlds, a and b; P is annotated 1 in a and 0 in b]
What I’ve done is added two more worlds to the diagram: a and b. P is true in
a, but false in b. I have thereby satisfied my obligations to the asterisks on my
diagram, for r does indeed see a world in which P is true, and another in which
P is false.
The set of worlds We first must specify the set of worlds, W . W is simply
the set of worlds I invoked:
W = {r, a, b}
But what are r, a, and b? Let’s just take them to be the letters ‘r’, ‘a’, and ‘b’. No
reason not to—the members of W , recall, can be any things whatsoever.
That is, we write out the set of all ordered pairs 〈w1 , w2 〉 such that w1 “sees”
w2 . In our diagram, r sees a and b, and no other world sees anything, so:

R = {〈r, a〉, 〈r, b 〉}

Finally, the interpretation function I must assign truth values to sentence
letters at worlds, matching the diagram:

I (P, a) = 1
I (P, b) = 0
I (α, w) = 0 for all other sentence letters α and worlds w
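The official model can be checked mechanically. Here is a sketch (the encoding is my own assumption; worlds are represented by the strings "r", "a", "b"):

```python
# The official countermodel, transcribed:
W = {"r", "a", "b"}
R = {("r", "a"), ("r", "b")}
I = {("P", w): (1 if w == "a" else 0) for w in W}

def v_P(w):    return I[("P", w)]
def v_diaP(w): return int(any((w, u) in R and v_P(u) for u in W))  # 3P
def v_boxP(w): return int(all(v_P(u) for u in W if (w, u) in R))   # 2P
def v_cond(w): return int(v_diaP(w) == 0 or v_boxP(w) == 1)        # 3P -> 2P

assert v_cond("r") == 0  # the formula is indeed false at r
```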
Firstly, if you are aiming at a system that places requirements on the accessibility
relation, make sure that the relation you have written down satisfies them
(for instance, that it is reflexive). (In our case, we were only trying to construct a K-model, and
so for us this step is trivial.) Secondly, make sure that the formula in question
really does come out false at one of the worlds in your model.
Simplifying models
Sometimes a model can be simplified. Consider the diagram of the final version
of the model above:
[Diagram: world r with 3P →2P annotated 1 0 0 and asterisks under 3P and 2P ; arrows from r to world a, where P is annotated 1, and to world b, where P is annotated 0]
We needn’t have used three worlds in the model. When we discharged the first
asterisk, we needed to put in a world that r sees, in which P is true. But we
needn’t have made that a new world—we could have simply have made P true in
r. Of course we couldn’t haven’t done that for both asterisks, because that would
have made P both true and false at r. So, we could make one simplification:
[Diagram: world r, with P annotated 1 and 3P →2P annotated 1 0 0, an arrow from r to itself, and an arrow from r to world b, where P is annotated 0]
The official model would then look as follows:
W = {r, b }
R = {〈r, r 〉, 〈r, b 〉}
I (P, r) = 1, all others 0
Nothing stops us from also letting b see itself:

[Diagram: as before, but with an added arrow from b to itself]
Official model:
W = {r, b }
R = {〈r, r 〉, 〈r, b 〉, 〈b , b 〉}
I (P, r) = 1, all others 0
That was easy—adding the fact that b sees itself didn’t require changing any-
thing else in the model.
Suppose we want now to show that 3P →2P isn’t T-valid. Well, we’ve
already done so! Why? Because we’ve already produced a T-model in which
this formula is false. Look back at the most recent model. Its accessibility
relation is reflexive. So it’s a T-model already. In fact, that accessibility relation
is also already transitive, so it’s already an S4-model.
So far we have established that ⊭K,D,T,S4 3P →2P. What about B and S5?
It’s easy to revise our model to make the accessibility relation symmetric:
[Diagram: as before, but with the arrow between r and b now pointing in both directions]
Official model:
W = {r, b }
R = {〈r, r〉, 〈r, b〉, 〈b, b〉, 〈b, r〉}
I (P, r) = 1, all others 0
Now, we’ve got a B-model, too. What’s more, we’ve also got an S5-model:
notice that the accessibility relation is an equivalence relation. (In fact, it’s also
a total relation.)
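The claim that this accessibility relation is an equivalence relation, and indeed a total one, is easy to verify mechanically; a sketch:

```python
# The final model's accessibility relation, transcribed:
W = {"r", "b"}
R = {("r", "r"), ("r", "b"), ("b", "b"), ("b", "r")}

assert all((w, w) in R for w in W)                                 # reflexive
assert all((v, w) in R for (w, v) in R)                            # symmetric
assert all((w, u) in R for (w, v) in R for (x, u) in R if v == x)  # transitive
assert R == {(w, v) for w in W for v in W}                         # total
```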
So, we’ve succeeded in establishing that 3P →2P is not valid in any of our
systems. Notice that we could have done this more quickly, if we had given the
final model in the first place. After all, this model is an S5, S4, B, T, D, and
K-model. So one model establishes that the formula isn’t valid in any of the
systems.
In general, in order to establish that a formula is invalid in a number of
systems, try to produce a model for the strongest system (i.e., the system with
the most requirements on models). If you do, then you’ll automatically have a
model for the weaker systems. Keep in mind the diagram of systems:
          S5
         ↗  ↖
        B    S4
         ↖  ↗
           T
           ↑
           D
           ↑
           K
An arrow from one system to another, recall, indicates that validity in the first
system implies validity in the second. The arrows also indicate facts about
invalidity, but in reverse: when an arrow points from one system to another,
then invalidity in the second system implies invalidity in the first. For example,
if a wff is invalid in T, then it is invalid in D. (That’s because every T-model is
a D-model; a countermodel in T is therefore a countermodel in D.)
When our task is to discover which systems a given formula is invalid in,
usually only one countermodel will be needed—a countermodel in the strongest
system in which the formula is invalid.
Above asterisks Let’s try to get a countermodel for 32P →23P in all the
systems in which it is invalid, and a semantic validity proof in all the systems in
which it is valid. We always start with countermodelling before doing semantic
validity proofs, and when doing countermodelling, we start by trying for a
K-model. After the first few steps, we have:
[Diagram: world r with 32P →23P annotated 1 0 0 and asterisks under the initial 3 of the antecedent and the initial 2 of the consequent; arrows from r to worlds a and b; in a, 2P is annotated 1; in b, 3P is annotated 0]
At this point, we’ve got a true 2, and a false 3. Take the first: a true 2P . This
doesn’t commit us to adding a world in which P is true; rather, it commits us
to making P true in every world that a sees. Similarly, a zero over a 3, over
3P in world b in this case, commits us to making P false in every world that
b sees. We indicate such commitments, commitments in every world seen, by
putting asterisks above the relevant modal operators:
[Diagram: as before, but now with an asterisk above the 2 of 2P in world a and an asterisk above the 3 of 3P in world b]
Now, how can we discharge these asterisks? In this case, when trying to
construct a K-model, we don’t need to do anything. Since a, for example,
doesn’t see any world, then automatically P is true in every world it sees; the
statement “for every world, w, if Raw then V(P, w) = 1” is vacuously true.
Same goes for b—P is automatically false in all worlds it sees. So, we've got a
K-model in which 32P →23P is false.
Now let’s turn the model into a D-model. Every world must now see at
least one world. Let’s try:
[Diagram: as before, plus two new worlds: a now sees a world c, where P is annotated 1, and b sees a world d, where P is annotated 0; c and d each see themselves]
I added worlds c and d, so that a and b would each see at least one world.
(Further, worlds c and d each had to see a world, to keep the relation serial.
I could have added still more worlds that c and d saw, but then they would
themselves need to see some worlds…So I just let c and d see themselves.) But
once c and d were added, discharging the upper asterisks in worlds a and b
required making P true in c and false in d (since a sees c and b sees d).
Let’s now try for a T-model. This will involve, among other things, letting
a and b see themselves. But this gets rid of the need for worlds c and d, since
they were added just to make the relation serial. I’ll try:
[Diagram: worlds r, a, and b, each with an arrow to itself, and arrows from r to a and from r to b; in r, 32P →23P is annotated 1 0 0 with asterisks as before; in a, 2P and P are each annotated 1; in b, 3P and P are each annotated 0]
When I added arrows, I needed to make sure that I correctly discharged the
asterisks. This required nothing of world r, since there were no top asterisks
there. There were top asterisks in worlds a and b; but it turned out to be easy
to discharge these asterisks—I just needed to let P be true in a, but false in b.
Notice that I could have moved straight to this T-model—which is itself a
D-model—rather than first going through the earlier mere-D-model. However,
this won’t always be possible—sometimes you’ll be able to get a D-model, but
no T-model.
At this point let’s verify that our model does indeed assign the value 0 to
our formula 32P →23P . First notice that 2P is true in a (since a only sees
one world—itself—and P is true there). But r sees a. So 32P is true at r. Now,
consider b. b only sees one world, itself, and P is false there. So 3P must also
be false there. But r sees b. So 23P is false at r. But now, the antecedent of
32P →23P is true, while its consequent is false, at r. So that conditional is
false at r. Which is what we wanted.
Onward. Our model is not a B-model, since a, for example, doesn’t see r,
despite the fact that r sees a. So let’s try to make this into a B-model. This
involves making the relation symmetric. Here’s how it looks before I try to
discharge the top asterisks in a and b:
[Diagram: as before, but with the arrows between r and a, and between r and b, now pointing in both directions]
Now I need to make sure that all top asterisks are discharged. For example,
since a now sees r, I’ll need to make sure that P is true at r. However, since
b sees r too, P needs to be false at r. But P can’t be both true and false at r.
So we’re stuck, in trying to get a B-model in which this formula is false. This
suggests that maybe it is impossible—that is, perhaps this formula is true in all
worlds in all B-models—that is, perhaps the formula is B-valid. So, the thing
to do is try to prove this: by supplying a semantic validity proof.
So, let 〈W , R, I 〉 be any model in which R is reflexive and symmetric, let
V be its valuation function, and let w be any member of W ; we must show that
V(32P →23P, w) = 1.
i) Suppose for reductio that V(32P →23P, w) = 0
ii) Then, by the truth condition for the →, V(32P, w) = 1, and…
iii) …V(23P, w) = 0
iv) From ii), there is some world, call it v, such that R wv and V(2P, v) = 1
v) By symmetry, R v w.
vi) From iv), via the truth condition for 2, we know that P is true at every
world accessible from v; and so, by v), V(P, w) = 1.
vii) By iii), there is some world, call it u, such that R w u and V(3P, u) = 0.
viii) By symmetry, R uw.
ix) By vii), P is false in every world accessible from u; and so by viii), V(P, w) =
0, contradicting vi)
So the formula is B-valid. For the record, here is the official version of the
T-countermodel constructed earlier:

W = {r, a, b}
R = {〈r, r〉, 〈a, a〉, 〈b, b〉, 〈r, a〉, 〈r, b〉}
I (P, a) = 1, all others 0
This model is itself also an S4-, D-, and K-model (since its accessibility relation is
serial and transitive as well as reflexive), so: ⊭K,D,T,S4 32P →23P.
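The T-countermodel can again be checked by direct computation; a sketch, with the model transcribed under an assumed encoding:

```python
# The three-world T-model for 32P -> 23P, transcribed:
W = {"r", "a", "b"}
R = {("r", "r"), ("a", "a"), ("b", "b"), ("r", "a"), ("r", "b")}
P = {"r": 0, "a": 1, "b": 0}  # I(P, a) = 1, all others 0

def boxP(w):    return int(all(P[u] for u in W if (w, u) in R))     # 2P
def diaP(w):    return int(any(P[u] for u in W if (w, u) in R))     # 3P
def diaboxP(w): return int(any(boxP(u) for u in W if (w, u) in R))  # 32P
def boxdiaP(w): return int(all(diaP(u) for u in W if (w, u) in R))  # 23P

# Antecedent true, consequent false at r: 32P -> 23P fails there.
assert diaboxP("r") == 1 and boxdiaP("r") == 0
```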
A further example: the formula 32P →3232P. Going straight for a T-model,
we end up with the following (notice how commitments to specific truth values
for different formulas at a single world are recorded by placing the formulas
side by side in the box):

[Diagram: world r, with 32P →3232P annotated 1 0 0, a bottom asterisk under the antecedent's 3 and an asterisk above the consequent's initial 3; r sees worlds a and b, a sees b, b sees a further world c, and every world sees itself. In a, P is annotated 1, 2P is annotated 1 (with an asterisk above the 2), and 232P is annotated 0 (with a bottom asterisk under the 2); in b, P is annotated 1 and 32P is annotated 0 (with an asterisk above the 3); in c, P is annotated 0 and 2P is annotated 0 (with a bottom asterisk under the 2)]
Official model:
W = {r, a, b , c}
R = {〈r, r 〉, 〈a, a〉, 〈b , b 〉, 〈c, c〉, 〈r, a〉, 〈r, b 〉, 〈a, b 〉, 〈b , c〉}
I (P, a) = I (P, b) = 1, all others 0
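A mechanical check of this model, as a sketch. Note one assumption built into it: for the antecedent 32P to come out true at r, the interpretation must make P true at world a as well as at world b, and the check below proceeds on that reading:

```python
# The four-world model for 32P -> 3232P, transcribed:
W = {"r", "a", "b", "c"}
R = {(w, w) for w in W} | {("r", "a"), ("r", "b"), ("a", "b"), ("b", "c")}
P = {"r": 0, "a": 1, "b": 1, "c": 0}  # assumes P true at a as well as b

def box(f, w): return int(all(f(u) for u in W if (w, u) in R))
def dia(f, w): return int(any(f(u) for u in W if (w, u) in R))

antecedent = dia(lambda u: box(P.get, u), "r")  # 32P at r
consequent = dia(lambda u: box(lambda t: dia(lambda s: box(P.get, s), t), u), "r")  # 3232P at r
assert antecedent == 1 and consequent == 0  # the conditional is false at r
```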
Now consider what happens when we try to turn this model into a B-model.
World b must see back to world a. But then the false 32P in b conflicts with the
true 2P in a. So it's time for a validity proof. In constructing this validity proof,
we can be guided by the failed attempt to construct a countermodel (assuming all
of our choices in constructing that countermodel were forced). In the following
proof that the formula is B-valid, I chose variables for worlds that match up
with the countermodel above:
i) Suppose for reductio that V(32P →3232P, r ) = 0, for some world r in some B-model
ii) Then, by the truth condition for the →, V(32P, r ) = 1, and…
iii) …V(3232P, r ) = 0
iv) From ii), there's some world, call it a, such that V(2P, a) = 1 and R ra
v) From iii), since R ra, V(232P, a) = 0
vi) And so, there's some world, call it b , such that V(32P, b ) = 0 and Rab
vii) By symmetry, R b a. And so, given vi), V(2P, a) = 0. This contradicts iv)
We now have a T-model for the formula, and a proof that it is B-valid. The
B-validity proof shows the formula to be S5-valid as well; the T-model shows it to
be T-, D-, and K-invalid. We still don't know about S4. So let's return to the
T-model above, and see what happens when we try to make its accessibility
relation transitive. World a must then see world c, which is impossible since
2P is true in a and P is false in c. So we're ready for an S4-validity proof (the
proof looks like the B-validity proof at first, but then diverges):
i) Suppose for reductio that V(32P →3232P, r ) = 0, for some world r in some S4-model
ii) Then, by the truth condition for the →, V(32P, r ) = 1, and…
iii) …V(3232P, r ) = 0
iv) From ii), there's some world, call it a, such that V(2P, a) = 1 and R ra
v) From iii), since R ra, V(232P, a) = 0
vi) And so, there's some world, call it b , such that V(32P, b ) = 0 and Rab
vii) From vi), since R is reflexive, R b b , and so V(2P, b ) = 0
viii) And so, there's some world, call it c, such that V(P, c) = 0 and R b c.
ix) From vi) and viii), given transitivity, we have Rac. And so, given iv),
V(P, c) = 1, contradicting viii)
Daggers
There’s another kind of step in constructing models. When we make a condi-
tional false, we’re forced to enter certain truth values for its components: 1 for
the antecedent, 0 for the consequent. But consider making a disjunction true. A
disjunction can be true in more than one way. The first disjunct might be true,
or the second might be true, or both could be true. So we have a choice for
how to go about making a disjunction true. Similarly for making a conditional
true, a conjunction false, or a biconditional either true or false.
When one has a choice about which truth values to give the constituents
of a propositional compound, it’s best to delay making the choice as long as
possible. After all, some other part of the model might force you to make one
choice rather than the other. If you investigate the rest of the countermodel,
and nothing has forced your hand, you may need then to make a guess: try one
of the truth value combinations open to you, and see whether you can finish
the countermodel. If not, go back and try another combination.
To remind ourselves of these choice points, we will place a dagger (†) un-
derneath the major connective of the formula in question. Consider, as an
example, constructing a countermodel for the formula 3(3P ∨2Q)→(3P ∨Q).
Throwing caution to the wind and going straight for a T-model, we have after
a few steps:
[Diagram: world r, seeing itself and a new world a. In r, 3(3P ∨2Q)→(3P ∨Q) is annotated 1 0 0 0 0: the antecedent is true (with a bottom asterisk under its initial 3), the conditional is false, and 3P and Q are false (with an asterisk above the 3 of 3P ). In a, 3P ∨2Q is annotated 1, with a dagger under the ∨, and P is annotated 0]
We still have to decide how to make 3P ∨2Q true in world a: which disjunct
to make true? Well, making 2Q true won't require adding another world to
the model, so let's do that. We have, then, a T-model:
[Diagram: as before, but in a the disjunction 3P ∨2Q is now annotated 1 1: 2Q is annotated 1 (with an asterisk above the 2), and Q is annotated 1]
W = {r, a}
R = {〈r, r〉, 〈a, a〉, 〈r, a〉}
I (Q, a) = 1, all else 0
OK, let’s try now to upgrade this to a B-model. We can’t simply leave
everything as-is while letting world a see back to world r, since 2Q is true
in a and Q is false in r. But there’s another possibility. We weren’t forced to
discharge the dagger in world a by making 2Q true. So let’s explore the other
possibility; let’s make 3P true:
[Diagram: worlds r, a, and b , each seeing itself, with double-headed arrows between r and a and between a and b . In r, the formula is annotated as before. In a, 3P ∨2Q is annotated 1, the dagger now discharged by making 3P true (with a bottom asterisk under its 3), and Q is annotated 0. In b , P is annotated 1]
W = {r, a, b}
R = {〈r, r〉, 〈a, a〉, 〈b, b〉, 〈r, a〉, 〈a, r〉, 〈a, b〉, 〈b, a〉}
I (P, b) = 1, all else 0
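And here is a mechanical check of this B-model (a sketch, with the same assumed encoding as before):

```python
# The three-world B-model, transcribed:
W = {"r", "a", "b"}
R = {(w, w) for w in W} | {("r", "a"), ("a", "r"), ("a", "b"), ("b", "a")}
P = {"r": 0, "a": 0, "b": 1}
Q = {"r": 0, "a": 0, "b": 0}

def dia(f, w): return int(any(f(u) for u in W if (w, u) in R))
def box(f, w): return int(all(f(u) for u in W if (w, u) in R))

assert all((v, w) in R for (w, v) in R)                     # R is symmetric
disjunct   = lambda w: int(dia(P.get, w) or box(Q.get, w))  # 3P v 2Q
antecedent = dia(disjunct, "r")                             # 3(3P v 2Q) at r
consequent = int(dia(P.get, "r") or Q["r"])                 # 3P v Q at r
assert antecedent == 1 and consequent == 0                  # false at r
```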
Can we go on to get an S5-model? No: making the relation transitive as well
would let r see b , making 3P true at r. In fact the formula is S5-valid:

i) Suppose for reductio that in some world r in some S5-model, V(3(3P ∨2Q)→(3P ∨Q), r ) =
0. Then V(3(3P ∨2Q), r ) = 1 and…
ii) …V(3P ∨Q, r ) = 0. So V(3P, r ) = 0 and V(Q, r ) = 0
iii) Given i), for some world a, R ra and V(3P ∨2Q, a) = 1. So, either
V(3P, a) = 1 or V(2Q, a) = 1
iv) Suppose first that V(3P, a) = 1. Then for some world b , Rab and
V(P, b ) = 1. By transitivity, R r b ; so V(3P, r ) = 1, contradicting ii)
v) So, given iii):
a) Suppose V(2Q, a) = 1.
b) R is symmetric. So, given iii), Ra r ; and so, given a), V(Q, r ) = 1
c) But given ii), V(Q, r ) = 0—contradiction.
Summary of steps
Here, then, is a final list of the steps for constructing countermodels:
5. Enter asterisks
g) 332P ↔2P
h) 33P →23P
variables (“φ” and “ψ” here) that are not part of the primitive vocabulary of
the object language.
When we defined the notion of validity, what we defined was the notion of
a valid formula. (That’s because validity is defined in terms of truth in a model,
which itself was defined only for formulas.) So it’s not, strictly speaking, correct
to apply the notions of validity or invalidity to schemas.
However, it’s often interesting to show that every instance of a given schema
is valid. (Instances of schemas are formulas, and so the notion of validity can
be properly applied to them.) It’s easy, for example, to show that every instance
of the schema 2(φ→φ) is valid in each of our modal systems. (Let φ be any
MPL-wff, and take any world w in any model. Since the rules for evaluating
propositional compounds within possible worlds are the classical ones, φ→φ
must be true at w, no matter what truth value φ has at w. Hence 2(φ→φ) is
true in any world in any model, and so is valid in each system.)
There is, therefore, a kind of indirect notion of schema-validity: validity
of all instances. How about the invalidity of schemas? Here we must take
great care. In particular, the notion of a schema, all of whose instances are
invalid, is not a particularly interesting notion. Take, for instance, the schema
3φ→2φ. We showed earlier that a certain instance of this schema, namely
3P →2P is invalid in each of our systems. However, the schema 3φ→2φ also
has plenty of instances that are valid in various systems. The following formula,
for example, is an instance of 3φ→2φ, and can easily be shown to be valid in
each of our systems:
3(P →P )→2(P →P )
Thus, even intuitively terrible schemas like 3φ→2φ have some valid instances.
(For an extreme example of this, consider the schema φ. Even this has some
valid instances: P →P , for one.) So it’s not interesting to inquire into whether
each instance of a schema is invalid. What is interesting is to inquire into
whether a given schema has some instances that are invalid. We can show, for
example, that the schema 3φ→2φ has some invalid instances (3P →2P , for
one), and hence is in this way unlike the schema 2(φ→φ).
So when dealing with schemas, it will often be of interest to ascertain
whether each instance of the schema is valid; it will rarely (if ever) be of interest
to ascertain whether each instance of the schema is invalid.
6.4.1 System K
Our first system, K, is the weakest system—i.e., the system with the fewest
theorems.
Axiomatic system K:
· Rules: modus ponens and necessitation:
MP: from φ→ψ and φ, infer ψ
NEC: from φ, infer 2φ

· Axioms: all instances of the following schemas:

φ→(ψ→φ) (A1)
(φ→(ψ→χ ))→((φ→ψ)→(φ→χ )) (A2)
(∼ψ→∼φ)→((∼ψ→φ)→ψ) (A3)
2(φ→ψ)→(2φ→2ψ) (K)
Here, for example, is a K-proof of the wff 2((P →Q)→(P →P )):

1. P →(Q→P ) (A1)
2. (P →(Q→P ))→((P →Q)→(P →P )) (A2)
3. (P →Q)→(P →P ) 1,2 MP
4. 2((P →Q)→(P →P )) 3, NEC
Using this technique, we can prove anything of the form 2φ, where φ
is provable in PL. And, since the PL axioms are complete (section 2.7), that
means that we can prove 2φ whenever φ is a tautology—i.e., a valid wff of
PL. But constructing proofs from the PL axioms is a pain in the neck!—and
anyway not what we want to focus on in this chapter. So let’s introduce the
following time-saving shortcut. Instead of writing out proofs of tautologies,
let’s instead allow ourselves to write any PL tautology at any point in a proof,
annotating simply “PL”.⁵ Thus, the previous proof could be shortened to:
1. (P →Q)→(P →P ) PL
2. 2((P →Q)→(P →P )) 1, NEC
We can even extend the shortcut to certain modal wffs; consider, for example,
the line:

2P →2P PL
Why am I making such a fuss about this? Didn’t I just say in the previous
paragraph that we can write down any tautology at any time, with the annotation
“PL”? Well, strictly speaking, 2P →2P isn’t a tautology. A tautology is a valid
wff of PL, and 2P →2P isn’t even a wff of PL (since it contains a 2). But it
is the result of beginning with some PL-tautology (Q→Q, in this case) and
uniformly changing sentence letters to chosen modal wffs (in this case, Qs to
2P s); hence any proof of the PL tautology may be converted into a proof of
it; hence the “PL” annotation is just as justified here as it is in the case of a
genuine tautology. So in general, MPL wffs that result from PL tautologies in
this way may be written down and annotated “PL”.
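The substitution idea behind the extended “PL” annotation can be sketched in code. The helper below is hypothetical (and crude: sentence letters are single characters, and replacements are not checked for well-formedness), but it captures the point that such lines are substitution instances of genuine tautologies:

```python
def substitute(form, mapping):
    """Uniformly replace each sentence letter in `form` (a string) with
    the wff given by `mapping`; all other symbols are left untouched."""
    return "".join(mapping.get(c, c) for c in form)

# Q -> Q is a genuine PL tautology; substituting 2P for Q gives 2P -> 2P.
assert substitute("Q->Q", {"Q": "2P"}) == "2P->2P"
```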
Back to investigating what we can prove in K. As we’ve seen, we can prove
that tautologies are necessary—we can prove 2φ whenever φ is a tautology.
One can also prove in K that contradictions are impossible. For instance,
∼3(P ∧∼P ) is a theorem of K:
⁵How do you know whether something is a tautology? Figure it out any way you like: do a
truth table, or a natural deduction derivation—whatever.
1. ∼(P ∧∼P ) PL
2. 2∼(P ∧∼P ) 1, NEC
3. 2∼(P ∧∼P )→∼∼2∼(P ∧∼P ) PL
4. ∼∼2∼(P ∧∼P ) 2, 3, MP
Since 3φ is defined as ∼2∼φ, line 4 is the unabbreviated form of ∼3(P ∧∼P ).
The proof can be compressed by letting a single “PL” step do the work of
lines 3 and 4:

1. ∼(P ∧∼P ) PL
2. 2∼(P ∧∼P ) 1, NEC
3. ∼∼2∼(P ∧∼P ) 2, PL
One more basic technique: given 2(φ→ψ), the K axiom plus MP yields
2φ→2ψ:

i . 2(φ→ψ)
i + 1. 2(φ→ψ)→(2φ→2ψ) K axiom
i + 2. 2φ→2ψ i, i + 1, MP
First prove φ→ψ, then necessitate it to get 2(φ→ψ), then distribute the 2
over the arrow to get 2φ→2ψ. This procedure is one of the core K-strategies,
and is featured in the following proof of 2(P ∧Q)→(2P ∧2Q):
1. (P ∧Q)→P PL
2. 2[(P ∧Q)→P ] 1, NEC
3. 2[(P ∧Q)→P ]→[2(P ∧Q)→2P ] K axiom
4. 2(P ∧Q)→2P 2,3 MP
5. 2(P ∧Q)→2Q Insert steps similar to 1-4
6. 2(P ∧Q)→(2P ∧2Q) 4,5, PL
Notice that the preceding proof, like all of our proofs since we introduced
the time-saving shortcuts, is not a K-proof in the official defined sense. Lines 1
and 5 are not axioms, nor do they follow from earlier lines by MP or NEC;
similarly for line 6.⁶ So what kind of “proof” is it? It's a metalanguage proof:
an attempt to convince the reader, by acceptable standards of rigor, that some
real K-proof exists. A reader could use this metalanguage proof as a blueprint
for constructing a real proof. She would begin by replacing line 1 with a proof
from the PL axioms of the conditional (P ∧Q)→P . (As we know from chapter
??, this could be a real pain in the neck!—but the completeness of PL assures us
that it is possible.) She would then replace line 5 with lines parallel to lines 1-4,
but which begin with a proof of (P ∧Q)→Q rather than (P ∧Q)→P . Finally,
in place of line 6, she would insert a proof from the PL axioms of the sen-
tence (2(P ∧Q)→2P )→[(2(P ∧Q)→2Q)→(2(P ∧Q)→(2P ∧2Q))], and then
use modus ponens twice to infer 2(P ∧Q)→(2P ∧2Q).
Let’s introduce another time-saving shortcut, which we’ll use more and
more as we progress: doing two (or more) steps at once. This shortcut is
featured in the following proof of (2P ∨2Q)→2(P ∨Q):
1. P →(P ∨Q) PL
2. 2(P →(P ∨Q)) 1, NEC
3. 2(Q→(P ∨Q)) PL, NEC
4. 2P →2(P ∨Q) 2, K
5. 2Q→2(P ∨Q) 3, K
6. (2P ∨2Q)→2(P ∨Q) 4,5 PL
⁶A further (even pickier) reason: the symbol ∧ isn't allowed in wffs; the sentences in the
proof are mere abbreviations for official MPL-wffs.
One further comment about this last proof: it illustrates a strategy that is
common in modal proofs. We were trying to prove a conditional formula
whose antecedent is a disjunction of two modal formulas. But the modal
techniques we had developed didn’t deliver formulas of this form: they only
showed us how to put 2s in front of PL-tautologies and how to distribute 2s
over →s, and so only yield formulas of the form 2φ and 2φ→2ψ, whereas the
formula we were trying to prove looks different. To overcome this problem,
what we did was to use the modal techniques to prove two conditionals, namely
2P →2(P ∨Q) and 2Q→2(P ∨Q), from which the desired formula, namely
(2P ∨2Q)→2(P ∨Q), follows by propositional logic. The trick, in general, is
this: remember that you have PL at your disposal. Simply look for one or
more modal formulas you know how to prove which, by PL, imply the formula
you want. Assemble the desired formulas, and then write down your desired
formula, annotating “PL”. In doing so, it may be helpful to recall PL inferences
like the following: from φ→χ and ψ→χ , one may infer (φ∨ψ)→χ by PL.
The next example illustrates our next major modal proof technique: com-
bining two 2 statements to get a single 2 statement. Let us construct a K-proof
of (2P ∧2Q)→2(P ∧Q):
1. P →(Q→(P ∧Q)) PL
2. 2[P →(Q→(P ∧Q))] NEC
3. 2P →2(Q→(P ∧Q)) 2, K
4. 2(Q→(P ∧Q))→[2Q→2(P ∧Q)] K axiom
5. 2P →[2Q→2(P ∧Q)] 3,4 PL
6. (2P ∧2Q)→2(P ∧Q) 5, PL
(If you wanted to, you could skip step 5, and just go straight to 6 by propositional
logic, since 6 is a propositional logical consequence of 3 and 4; I put it in for
perspicuity.)
The general technique illustrated by the last problem applies anytime you
want to move from several 2 statements to a further 2 statement, where the in-
side parts of the first 2 statements imply the inside part of the final 2 statement.
More carefully: it applies whenever you want to prove a formula of the form
2φ1 →(2φ2 → · · · (2φn →2ψ) . . . ), provided you are able to prove the formula
φ1 →(φ2 → · · · (φn →ψ) . . . ). (The previous proof was an instance of this because
it involved moving from 2P and 2Q to 2(P ∧Q); and this is a case where one
can move from the inside parts of the first two formulas (namely, P and Q), to
the inside part of the third formula (namely, P ∧Q)—by PL.) To do this, one
begins by proving the conditional φ1 →(φ2 → · · · (φn →ψ) . . . ), necessitating it to
get 2[φ1 →(φ2 → · · · (φn →ψ) . . . )], and then distributing the 2 over the arrows
repeatedly using K-axioms and PL to get 2φ1 →(2φ2 → · · · (2φn →2ψ) . . . ).
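Since this recipe is completely mechanical, it can be sketched as a short program. The following Python sketch (the function names, the string representation of formulas, and the use of “□” for the 2 are all my own illustration, not part of the official system) generates the metalanguage-proof blueprint for a given list of antecedents:

```python
# A sketch of the blueprint-generating recipe. Formulas are plain
# strings; the output is the skeleton of a metalanguage proof, not
# an official K-proof.
def nested(antecedents, consequent):
    """Build φ1→(φ2→…(φn→ψ)…) as a string."""
    formula = consequent
    for phi in reversed(antecedents):
        formula = f"({phi}→{formula})"
    return formula

def box_distribution_blueprint(antecedents, consequent):
    """Assuming φ1→(φ2→…(φn→ψ)…) is PL-provable, list the steps that
    yield □φ1→(□φ2→…(□φn→□ψ)…): prove the conditional, necessitate
    it, then distribute the □ across one antecedent at a time."""
    base = nested(antecedents, consequent)
    lines = [(base, "PL"), (f"□{base}", "NEC")]
    for i in range(1, len(antecedents) + 1):
        boxed = [f"□{phi}" for phi in antecedents[:i]]
        rest = nested(antecedents[i:], consequent)
        lines.append((nested(boxed, f"□{rest}"), "K, PL"))
    return lines

for formula, why in box_distribution_blueprint(["P", "Q"], "(P∧Q)"):
    print(formula, "\t", why)
```

For antecedents P, Q and consequent P ∧Q, the final generated line is (□P→(□Q→□(P∧Q))), matching the shape of the proof of (2P ∧2Q)→2(P ∧Q) above.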
One cautionary note in connection with this last proof. One might think to
make it more intuitive by using conditional proof: assume 2P ∧2Q, derive
2(P ∧Q), and then discharge the assumption. But this is not a legal proof, since
our axiomatic system allows neither assumptions nor conditional proof.
In fact, our decision to omit conditional proof was not at all arbitrary. Given
our rule of necessitation, we couldn’t add conditional proof to our system. If we
did, proofs like the following would become legal:

1. P Assumption
2. 2P 1, NEC
3. P →2P 1–2, Conditional proof

Thus, P →2P would turn out to be a K-theorem. But we don’t want that: after
all, a statement P might be true without being necessarily true.
Once we have a soundness proof (section 6.5), we’ll be able to show that
P →2P isn’t a K-theorem. But as we just saw, one can construct a K-proof from
{P } of 2P (recall the notion of a proof from a set Γ, from section 2.5.) It follows
that the deduction theorem (section 2.7), which says that if there exists a proof
of ψ from {φ}, then there exists a proof of φ→ψ, fails for K (it likewise fails for
all the modal systems we will consider.) So there will be no conditional proof
in our axiomatic modal systems. (Of course, to convince yourself that a given
formula is really a tautology of propositional logic, you may sketch a proof of it
to yourself using conditional proof in some standard natural deduction system
for nonmodal propositional logic; and then you may write that formula down
in one of our axiomatic MPL proofs, annotating “PL”.)
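To see concretely why P →2P had better not be valid, here is a minimal sketch in Python of a two-world countermodel, evaluated with the possible-worlds semantics of section 6.3 (the choice of worlds, accessibility relation, and interpretation, and the encoding of formulas as nested tuples, are all my own):

```python
# A two-world countermodel to P→2P, using the possible-worlds
# semantics of section 6.3.
W = {"w", "v"}
R = {("w", "v")}                      # v is accessible from w
I = {("P", "w"): 1, ("P", "v"): 0}    # P is true at w, false at v

def V(formula, world):
    """Valuation: 1 if the formula is true at the world, else 0."""
    op = formula[0]
    if op == "atom":
        return I[(formula[1], world)]
    if op == "not":
        return 1 - V(formula[1], world)
    if op == "imp":
        return max(1 - V(formula[1], world), V(formula[2], world))
    if op == "box":  # true iff true at every accessible world
        return min([V(formula[1], u) for u in W if (world, u) in R],
                   default=1)

P = ("atom", "P")
print(V(("imp", P, ("box", P)), "w"))   # 0: P→2P is false at w
```

P is true at w, but false at the world v accessible from w; so 2P is false at w while P is true there, and P →2P is false at w. Once soundness is proved (section 6.5), this countermodel shows that P →2P is not a K-theorem.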
Back to techniques for constructing proofs in K. The following proof of
22(P ∧Q)→22P illustrates a technique for proving formulas with “nested”
modal operators:
1. (P ∧Q)→P PL
2. 2(P ∧Q)→2P 1, NEC, K
3. 2[2(P ∧Q)→2P ] 2, NEC
4. 22(P ∧Q)→22P 3, K
The following “modal negation” theorems (MN), which relate the 2 and the 3,
are also provable in K:

`K ∼2φ→3∼φ `K 3∼φ→∼2φ
`K ∼3φ→2∼φ `K 2∼φ→∼3φ

Here, for instance, is a proof of the first:

1. ∼∼φ→φ PL
2. 2∼∼φ→2φ 1, NEC, K
3. ∼2φ→∼2∼∼φ 2, PL

Line 3 is just the definitional expansion of ∼2φ→3∼φ, since 3 abbreviates ∼2∼.
It will also be worthwhile to know that an analog of the K axiom for the 3
is a K-theorem:
2(φ→ψ)→(3φ→3ψ) (K3)

Unabbreviating the 3, K3 is:

2(φ→ψ)→(∼2∼φ→∼2∼ψ)

Consider, then, the following formula:

2(φ→ψ)→(2∼ψ→2∼φ)
This is equivalent, given PL, to what we want to show; and it looks like the
result of necessitating a tautology and then distributing the 2 over the → a
couple times—just the kind of thing we know how to do in K. Here, then, is
the desired proof of K3:
1. (φ→ψ)→ (∼ψ→∼φ) PL
2. 2(φ→ψ)→2(∼ψ→∼φ) 1, NEC, K
3. 2(∼ψ→∼φ)→(2∼ψ→2∼φ) K
4. 2(φ→ψ)→(2∼ψ→2∼φ) 2,3 PL
5. 2(φ→ψ)→(∼2∼φ→∼2∼ψ) 4,PL
In doing proofs, let’s also allow ourselves to refer to earlier theorems proved,
rather than repeating their proofs. The importance of K3 may be illustrated
by the following proof of 2P →(3Q→3(P ∧Q)):
1. P →[Q→(P ∧Q)] PL
2. 2P →2[Q→(P ∧Q)] 1, NEC, K
3. 2[Q→(P ∧Q)]→[3Q→3(P ∧Q)] K3
4. 2P →[3Q→3(P ∧Q)] 2,3, PL
K3 lets us prove formulas of the form O1 φ1 →(O2 φ2 → · · · (On φn →3ψ) . . . ),
where the Oi s are modal operators, all but one of which are 2s. (Thus, the
remaining Oi is the 3.) This can be done, provided that ψ is provable in K from the
the φi s. The basic strategy is to prove a nested conditional, the antecedents
of which are the φi s, and the consequent of which is ψ; necessitate it; then
repeatedly distribute the 2 over the →s, once using K3, the rest of the times
using K. But there is one catch. We need to make the application of K3 last,
after all the applications of K. This in turn requires the conditional we use to
have the φi that is underneath the 3 as the last of the antecedents. For instance,
suppose that φ3 is the one underneath the 3. Thus, what we are trying to
prove is:

2φ1 →(2φ2 →(3φ3 →(2φ4 → · · · (2φn →3ψ) . . . )

Since the application of K3 must come last, the conditional we prove and
necessitate must instead have φ3 as its last antecedent:

φ1 →(φ2 →(φn →(φ4 → · · · (φn−1 →(φ3 →ψ) . . . )

In other words, one must swap one of the other φi s (I arbitrarily chose φn )
with φ3 . What one obtains at the end will therefore have the modal statements
out of order:

2φ1 →(2φ2 →(2φn →(2φ4 → · · · (2φn−1 →(3φ3 →3ψ) . . . )

But that problem is easily solved; this is equivalent in PL to what we’re trying
to get. (Recall that φ→(ψ→χ ) is logically equivalent in PL to ψ→(φ→χ ).)
Why do we need to save K3 for last? The strategy of successively distribut-
ing the box over all the nested conditionals comes to a halt as soon as the K3
theorem is used. Let me illustrate with an example. Suppose we wish to prove
`K 3P →(2Q→3(P ∧Q)). We might think to begin as follows:
1. P →(Q→(P ∧Q)) PL
2. 2[P →(Q→(P ∧Q))] 1, Nec
3. 3P →3(Q→(P ∧Q)) K3, 2, MP
4. ?
We are now stuck: continuing would require distributing the 3 in line 3 over
the →, and neither K nor K3 gets us this. The remedy is to begin the proof with a
different conditional:
1. Q→(P →(P ∧Q)) PL
2. 2(Q→(P →(P ∧Q))) 1, Nec
3. 2Q→2(P →(P ∧Q)) K, 2, MP
4. 2(P →(P ∧Q))→(3P →3(P ∧Q)) K3
5. 2Q→(3P →3(P ∧Q)) 3, 4, PL
6. 3P →(2Q→3(P ∧Q)) 5, PL
Relatedly, one can’t prove in K that tautologies are possible or that contradictions
aren’t necessary. (We’ll be able to demonstrate this after section 6.5.)
c) ∼3(Q∧R)↔2(Q→∼R)
h) 3P →(2Q→3Q)
6.4.2 System D
System D results from adding a new axiom schema to system K:
Axiomatic system D:
· Rules: MP, NEC
· Axioms: the A1, A2, A3, and K schemas, plus the D-schema:
2φ→3φ (D)
Notice that since system D includes all the K axioms and rules, we retain all
the K-theorems. The addition of the D-schema just adds more theorems. In
fact, all of our systems will build on K in this way, by adding new axioms to K.
With the D-schema in place, we can now prove that tautologies are possible:
1. P ∨∼P PL
2. 2(P ∨∼P ) 1, NEC
3. 2(P ∨∼P )→3(P ∨∼P ) D
4. 3(P ∨∼P ) 2,3 MP
Here, for another example, is a D-proof of 22P →23P :

1. 2P →3P D
2. 2(2P →3P ) 1, NEC
3. 22P →23P 2, K
Like K, system D is very weak. As we will see later, we can’t prove 2φ→φ
in D. Therefore, D doesn’t seem to be a correct logic for metaphysical, or
nomic, or technological necessity, for surely, if something is metaphysically,
nomically, or technologically necessary, then it must be true. (If something is
true in all metaphysically possible worlds, or all nomically possible worlds, or
all technologically possible worlds, then surely it must be true in the actual
world, and so must be plain old true.) But perhaps there is some interest in
D anyway; perhaps D is a correct logic for moral necessity. Suppose we read
2φ as “One ought to make φ be the case”, and, correspondingly, read 3φ as
“One is permitted to make φ be the case”. Then the fact that 2φ→φ cannot
be proved in D would be a virtue, for from the fact that something ought be
done, it certainly doesn’t follow that it is done. The D-axiom, on the other
hand, would correspond to the principle that if something ought to be done
then it is permitted to be done, which does seem like a logical truth. But I won’t
go any further into the question of whether D in fact does give a correct logic
for moral necessity.
a) ∼(2P ∧2∼P )
6.4.3 System T
T is the first system we have considered that has any plausibility of being a
correct logic for a wide range of concepts of necessity (metaphysical necessity,
for example):
Axiomatic system T:
· Rules: MP, NEC
· Axioms: the A1, A2, A3, and K schemas, plus the T-schema:
2φ→φ (T)
Recall that in the case of K, we proved a theorem schema, K3, which was
the analog for the 3 of the K-axiom schema. Let’s do the same thing here; let’s
prove a theorem schema T3, which is the analog for the 3 of the T axiom
schema:
T3: φ→3φ
1. 2∼φ→∼φ T
2. φ→∼2∼φ 1, PL
Line 2 is just the definitional expansion of φ→3φ, since 3 abbreviates ∼2∼. Thus, we have established that for every
wff φ, `T φ→3φ. So let’s allow ourselves to write down formulas of the form
φ→3φ, annotating simply “T3”.
Notice that instances of the D-schema are now theorems: 2φ→φ is a T
axiom, and we just proved that φ→3φ is a theorem; from these two, by PL, we
can prove 2φ→3φ. Thus, T is an extension of D: every theorem of D remains
a theorem of T. (Since D was an extension of K, T too is an extension of K.)
b) [2P ∧32(P →Q)]→3Q
6.4.4 System B
Our systems so far don’t allow us to prove anything interesting about iterated
modalities, i.e., sentences with consecutive boxes or diamonds. Which such
sentences should be theorems? The B axiom schema decides some of these
questions for us; here is system B:
Axiomatic system B:
· Rules: MP, NEC
· Axioms: the A1, A2, A3, K, and T schemas, plus the B-schema:

32φ→φ (B)
As before, we can prove a theorem schema that is the analog for the 3 of the B axiom schema, B3: φ→23φ:
1. 32∼φ→∼φ B
2. φ→∼32∼φ 1, PL
3. ∼32∼φ↔2∼2∼φ MN
4. ∼2∼φ→3φ PL (since 3 abbreviates ∼2∼)
5. 2∼2∼φ→23φ 4, NEC, K, MP
6. φ→23φ 2, 3, 5, PL
a) 32P ↔3232P
6.4.5 System S4
The characteristic axiom of our next system, S4, is a different principle gov-
erning iterated modalities:
2φ→22φ (S4)
S4 contains the S4-schema but does not contain the B-schema. Symmetri-
cally, B lacks the S4-schema, but of course contains the B-schema. As a result,
some instances of the B-schema are not provable in S4, and some instances of
the S4-schema are not provable in B (we’ll be able to show this after section
6.5). Hence, although S4 and B are each extensions of T, neither B nor S4 is
an extension of the other.
As before, we have a theorem schema that is the analog for the 3 of the S4
axiom schema:
S43: 33φ→3φ:
1. 2∼φ→22∼φ S4
2. 2∼φ→∼3φ MN
3. 22∼φ→2∼3φ 2, NEC, K, MP
4. 2∼3φ→∼33φ MN
5. ∼3φ→2∼φ MN
6. 33φ→3φ 5,1,3,4, PL
Example 6.7: Show that `S4 (3P ∧2Q)→3(P ∧2Q). This problem is rea-
sonably difficult. My approach is as follows. We saw in the K section above that
the following sort of thing may always be proved: 2φ→(3ψ→3χ ), whenever
the conditional φ→(ψ→χ ) can be proved. So we need to try to work the
problem into this form. As-is, the problem doesn’t quite have this form. But
something very related does have this form, namely: 22Q→(3P →3(P ∧2Q))
(since the conditional 2Q→(P →(P ∧2Q)) is a tautology). This thought in-
spires the following proof:

1. 2Q→(P →(P ∧2Q)) PL
2. 22Q→(3P →3(P ∧2Q)) 1, NEC, K, K3, PL
3. 2Q→22Q S4
4. 2Q→(3P →3(P ∧2Q)) 3, 2, PL
5. (3P ∧2Q)→3(P ∧2Q) 4, PL
a) 2P →232P
b) 2323P →23P
c) 32P →3232P
6.4.6 System S5
Here, instead of the B or S4 schemas, we add the S5 schema to T:
Axiomatic system S5:
· Rules: MP, NEC
· Axioms: the A1, A2, A3, K, and T schemas, plus the S5-schema:
32φ→2φ (S5)
S53: 3φ→23φ
1. 32∼φ→2∼φ S5
2. ∼3φ→2∼φ MN
3. 3∼3φ→32∼φ 2, NEC, K3, MP
4. ∼23φ→3∼3φ MN
5. 2∼φ→∼3φ MN
6. 3φ→23φ 4,3,1,5, PL
Next, note that the B and S4 axioms are now derivable as theorems. The B
axiom, 32φ→φ, is trivial:
1. 32φ→2φ S5
2. 2φ→φ T
3. 32φ→φ 1,2 PL
And now the S4 axiom, 2φ→22φ. This is a little harder. I used the B3
theorem, which we can now appeal to since the theoremhood of the B-schema
has been established.
1. 2φ→232φ B3
2. 32φ→2φ S5
3. 2(32φ→2φ) 2, Nec
4. 232φ→22φ 3, K, MP
5. 2φ→22φ 4, 1, PL
c) 2(2P →2Q)∨2(2Q→2P )
In showing this, we will appeal to the rule of substitution of equivalents: if `S α↔β, then `S χ ↔χ (α/β), where χ (α/β) results from χ by replacing occurrences of α with β.
That is, whenever a formula has a string of modal operators in front, it is always
equivalent to the result of deleting all the modal operators except the innermost
one. For example, 223232232323φ and 3φ are provably equivalent in
S5 (i.e., 223232232323φ↔3φ is a theorem of S5). This follows from
the fact that the following equivalences are all theorems of S5:
32φ↔2φ (a)
22φ↔2φ (b)
23φ↔3φ (c)
33φ↔3φ (d)
The left-to-right direction of (a) is just S5; the right-to-left is T3; (b) is T and
S4; (c) is T and S53; and (d) is S43 and T3. Thus, by repeated applications
of these equivalences, using substitution of equivalents, we can reduce strings
of modal operators to the innermost operator. (It is straightforward to convert
this argument into a more rigorous inductive proof.)
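The reduction just described is a simple rewriting procedure: each of (a)-(d) replaces an adjacent pair of modal operators with that pair's inner member, so any string collapses, step by step, to a single operator. A Python sketch (writing "2" and "3" for the box and diamond, as in the text; the function name is mine):

```python
# The S5 collapse of modal strings as a rewriting procedure.
# Each of (a)-(d) keeps only the inner (right-hand) member of a
# pair of adjacent modal operators.
RULES = {"32": "2",   # (a) 32φ↔2φ
         "22": "2",   # (b) 22φ↔2φ
         "23": "3",   # (c) 23φ↔3φ
         "33": "3"}   # (d) 33φ↔3φ

def reduce_modal_prefix(ops):
    """Reduce a nonempty string of modal operators (applied to some
    wff φ) to the single operator it is S5-equivalent to."""
    while len(ops) > 1:
        ops = RULES[ops[:2]] + ops[2:]   # collapse the leftmost pair
    return ops

print(reduce_modal_prefix("223232232323"))   # 3, as in the text
```

Since every rule keeps only the inner operator of a pair, the result is always the string's innermost (last) operator, which is exactly the claim that a string of modal operators is S5-equivalent to its innermost one.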
Our study of modal logic has reversed the historical order. We began with semantics,
because that is the more intuitive approach. Historically (as we noted earlier),
the axiomatic systems came first, in the work of C. I. Lewis. Given the uncer-
tainty over what formulas ought to be counted as axioms, modal logic was in
disarray. The discovery by the teenaged Saul Kripke in the late 1950s of the
possible-worlds semantics we studied in section 6.3, and of the correspondence
between simple constraints (reflexivity, transitivity, etc.) on the accessibility
relation in his models and Lewis’s axiomatic systems, was a major advance in
the history of modal logic.
The soundness and completeness theorems have practical as well as the-
oretical value. First, once we’ve proved soundness, we will for the first time
have a method for establishing that a given formula is not a theorem: construct
a countermodel for that formula, thus establishing that the formula is not valid,
and then conclude via soundness that the formula is not a theorem. Second,
given completeness, if we want to know that a given formula is a theorem, it
suffices to show that it is valid. Since semantic validity proofs are comparatively
easy to construct, it’s nice to be able to use them rather than axiomatic proofs.
Let’s begin with soundness. We’re going to prove a general theorem, which
we’ll use in several soundness proofs. First we’ll need a piece of terminology.
Where Γ is any set of modal wffs, let’s call “K+Γ” the axiomatic system that
consists of the same rules of inference as K (MP and NEC), and which has
as axioms the axioms of K (instances of the K- and PL- schemas), plus the
members of Γ. Here, then, is the theorem:

Theorem 6.1 Let Γ be a set of MPL-wffs, and let M be any MPL-model in which every member of Γ is valid. Then every theorem of K+Γ is valid in M
Modal systems of the form K+Γ are commonly called normal. Normal
modal systems contain all the K-theorems, plus possibly more. What Theorem
6.1 gives us is a method for constructing a soundness proof for any normal
system. Since all the systems we have studied here (K, D, etc.) are normal, this
method is sufficiently general for us. Here’s how the method works for system
T. System T has the same rules of inference as K, and its axioms are all the
axioms of K, plus the instances of the T-schema. In the “K+Γ” notation, T = K
+ {2φ→φ : φ is an MPL wff}. To establish soundness for T, all we need to do
is show that every instance of the T-schema is valid in all reflexive models; for
we may then conclude by Theorem 6.1 that every theorem of T is valid in all
reflexive models. This method can be applied to each of our systems: for any
system, S, to establish S’s soundness it will suffice to show that S’s “extra-K”
axioms are valid in all of the S-models.
Theorem 6.1 follows from two lemmas we will need to prove:
Lemma 6.2 All PL and K-axioms are valid in all MPL-models
Lemma 6.3 For every MPL-model, M , MP and Necessitation preserve validity
in M
Proof of Theorem 6.1 from the lemmas. Assume that every wff in Γ is valid in a
given MPL-model M , and consider any theorem φ of K+Γ. That theorem is
a last line in a proof in which each line is either an axiom of K+Γ, or follows from
earlier lines in the proof by MP or NEC. But axioms of K+Γ are either PL
axioms, K axioms, or members of Γ. The first two classes of axioms are valid in
all MPL-models, by Lemma 6.2, and so are valid in M ; and the final class of
axioms are valid in M by hypothesis. Thus, all axioms in the proof are valid
in M . Moreover, by Lemma 6.3, the rules of inference in the proof preserve
validity in M . Therefore, by induction, every line in the proof is valid in M .
Hence the last line in the proof, φ, is valid in M .
We now need to prove the lemmas.
Proof of Lemma 6.2. From our proof of soundness for PL (section 2.6), we know
that the PL truth tables generate the value 1 for each PL axiom, no matter
what truth value its immediate constituents have. But here in MPL, the truth
values of conditionals and negations are determined at a given world by the
truth values at that world of its immediate constituents via the PL truth tables.
So any PL axiom must have truth value 1 at any world, regardless of what truth
values its immediate constituents have. PL-axioms, therefore, are true at every
world in every model, and so are valid in every model. We need now to show
that any K axiom—i.e., any formula of the form 2(φ→ψ)→ (2φ→2ψ)—is
valid in any model:
i) Suppose for reductio that for some MPL-model 〈W , R, I 〉 with valuation V, and some w ∈ W , V(2(φ→ψ)→(2φ→2ψ), w) = 0
ii) Then, given the truth condition for the →, V(2(φ→ψ), w) = 1…
iii) …and V((2φ→2ψ), w) = 0
iv) Given iii), V(2φ, w) = 1…
v) …and V(2ψ, w) = 0
vi) Given v), for some v, R wv and V(ψ, v) = 0
vii) Given iv), since R wv, V(φ, v) = 1
viii) Given ii), since R wv, V(φ→ψ, v) = 1
ix) Lines vi), vii), and viii) contradict, given the truth condition for the →
Proof of Lemma 6.3. We must show that the rules MP and NEC preserve va-
lidity in any given model. That is, we must show that if the inputs to one of
these rules is valid in some model, then that rule’s output must also be valid in
that model.
First MP. Let φ and φ→ψ be valid in model 〈W , R, I 〉; we must show that
ψ is also valid in that model. That is, where V is this model’s valuation, and
w is any member of W , we must show that V(ψ, w) = 1. Since φ and φ→ψ
are valid in this model, V(φ→ψ, w) = 1, and V(φ, w) = 1; but by the truth
condition for →, V(ψ, w) must also be 1.
Next NEC. Suppose φ is valid in model M . We must show that 2φ is
valid in M , i.e., that 2φ is true at each world in M , i.e., that for each world,
w, φ is true at every world accessible from w. But since φ is valid in M , φ
is true in every world in M , and hence is true at every world accessible from
w.
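Lemmas 6.2 and 6.3 can be spot-checked by brute force: enumerate every model over a fixed two-element set of worlds and verify that the K axiom is true at every world and that necessitation preserves validity. A Python sketch (the tuple encoding of formulas is my own; exhausting two-world models is a sanity check, not a substitute for the proofs above):

```python
from itertools import product

# Brute-force check over every model with two worlds.
W = ("w", "v")

def V(formula, world, R, I):
    """The valuation of the semantics of section 6.3."""
    op = formula[0]
    if op == "atom":
        return I[(formula[1], world)]
    if op == "not":
        return 1 - V(formula[1], world, R, I)
    if op == "imp":
        return max(1 - V(formula[1], world, R, I),
                   V(formula[2], world, R, I))
    if op == "box":  # true iff true at every accessible world
        return min([V(formula[1], u, R, I) for u in W if (world, u) in R],
                   default=1)

P, Q = ("atom", "P"), ("atom", "Q")
K_axiom = ("imp", ("box", ("imp", P, Q)),
                  ("imp", ("box", P), ("box", Q)))

# every accessibility relation and every interpretation over W
pairs = [(a, b) for a in W for b in W]
atoms_at = [(s, u) for s in ("P", "Q") for u in W]
models = [({p for p, bit in zip(pairs, rbits) if bit},
           dict(zip(atoms_at, vbits)))
          for rbits in product([0, 1], repeat=4)
          for vbits in product([0, 1], repeat=4)]

# Lemma 6.2 spot-check: the K axiom is true at every world of every model.
assert all(V(K_axiom, u, R, I) == 1 for (R, I) in models for u in W)

# Lemma 6.3 spot-check: where P→P is valid in a model, so is 2(P→P).
taut = ("imp", P, P)
assert all(V(("box", taut), u, R, I) == 1
           for (R, I) in models for u in W
           if all(V(taut, x, R, I) == 1 for x in W))

# By contrast, P→2P fails somewhere, so (given soundness) it is not
# a K-theorem.
assert any(V(("imp", P, ("box", P)), u, R, I) == 0
           for (R, I) in models for u in W)
print("checked", len(models), "models")
```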
6.5.1 Soundness of K
We can now construct soundness proofs for the individual systems. I’ll do this
for some of the systems, and leave the verification of soundness for the other
systems as exercises.
First K. In the “K+Γ” notation, K is just K+∅, and so it follows immediately
from Theorem 6.1 that every theorem of K is valid in every MPL-model. So
K is sound.
6.5.2 Soundness of T
T is K+Γ, where Γ is the set of all instances of the T-schema. So, given Theorem
6.1, to show that every theorem of T is valid in all T-models, it suffices to show
that all instances of the T-schema are valid in all T-models:
i) Suppose for reductio that in some T-model with valuation V, for some world w, V(2φ→φ, w) = 0
ii) Then V(2φ, w) = 1…
iii) …and V(φ, w) = 0
iv) Since the model is a T-model, R is reflexive, so R ww; given ii), it follows that V(φ, w) = 1, contradicting iii)
6.5.3 Soundness of B
B is K+ Γ, where Γ is the set of all instances of the T- and B- schemas. Given
Theorem 6.1, it suffices to show that every instance of the B-schema and every
instance of the T-schema is valid in every B-model. So, choose an arbitrary
model with a reflexive and symmetric accessibility relation, whose valuation is
V, and let w be any world in that model. We must show that V counts each
instance of the T-schema and the B-schema as being true at w. The proof
of the previous section shows that the T-axioms are true at w. Now for the
B-axioms:
i) Suppose for reductio that V(32φ→φ, w) = 0
ii) Then V(32φ, w) = 1…
iii) …and V(φ, w) = 0
iv) Given ii), for some v, R wv and V(2φ, v) = 1
v) Since R is symmetric, R v w; so, given iv), V(φ, w) = 1, contradicting iii)
Our strategy will be to show, for each system S, that every wff valid in the
canonical model for S is a theorem of S. This sufficient condition for theorem-
hood can then be used to give completeness proofs, as the following example
brings out. Suppose we can demonstrate that the accessibility relation in the
canonical model for T is reflexive. Then, since T-valid formulas are by definition
true in every world in every model with a reflexive accessibility relation, we
know that every T-valid formula is valid in the canonical model for T. But
then the fact just stated tells us that every T-valid formula is a theorem of T.
So we would have established completeness for T.
The trick for constructing canonical models will be to let the worlds in these
models be sets of formulas (remember, worlds are allowed to be anything we
like). And we’re going to construct the interpretation function of the canonical
model in such a way that a formula will be true at a world iff the formula is a
member of the set that is the world. Working out this idea will occupy us for
awhile.
Which worlds are accessible from a given world w? Answer: take the wffs in
w that begin with a 2, strip one 2 off of the front of each, and collect the results;
a world w 0 is accessible from w iff it contains them all. The
definition of accessibility, therefore, says that R ww 0 iff for each wff 2φ that is
a member of w, the wff φ is a member of w 0 .
The definition of accessibility in the canonical model says nothing about
formal properties like transitivity, reflexivity, and so on. As a result, it is not
true by definition that the canonical model for S is an S-model. T-models,
for example, must have reflexive accessibility relations, whereas the definition
of the accessibility relation in the canonical model for T says nothing about
reflexivity. As we will eventually see, the canonical model for each system S
turns out to be an S-model, but this fact must be proven; it’s not built into the
definition of a canonical model.
An atomic wff (sentence letter) is defined to be true at a world iff it is a
member of that world. Thus, for atomic wffs, truth and membership coincide.
What we really need to know, however, is that truth and membership coincide
for all wffs, including complex wffs. Proving this turns out to be a big task,
which will occupy us for several sections. We’ll need first to assemble some
firepower: a number of preliminary lemmas and theorems which will eventually
be used to prove that membership and truth coincide for all wffs in canonical
models. We’ll then finally be able to give completeness proofs.
Lemma 6.4 Where Γ is any maximal S-consistent set of wffs:

6.4a for each wff φ, exactly one of φ, ∼φ is a member of Γ
6.4b φ→ψ ∈ Γ iff either φ is not a member of Γ or ψ ∈ Γ
6.4c if φ ∈ Γ and φ→ψ ∈ Γ, then ψ ∈ Γ
6.4d if `S φ then φ ∈ Γ
Proof of Lemma 6.4a. We know from the definition of maximality that at least
one of φ or ∼φ is in Γ. But it cannot be that both are in Γ, for then Γ would be
S-inconsistent (it would contain the finite subset {φ, ∼φ}; but since all modal
systems incorporate propositional logic, it is a theorem of S that ∼(φ∧∼φ).)
Proof of Lemma 6.4b. Suppose first that φ→ψ is in Γ, and suppose for reductio
that φ is in Γ but ψ is not. Then, since Γ is maximal, ∼ψ is in Γ; but now
Γ is S-inconsistent by containing the subset {φ, φ→ψ, ∼ψ}. Suppose for the
other direction that either φ is not in Γ or ψ is in Γ, and suppose for reductio
that φ→ψ isn’t in Γ. Since Γ is maximal, ∼(φ→ψ) ∈ Γ. Now, if φ ∈ / Γ then
∼φ ∈ Γ, but then Γ would contain the S-inconsistent subset {∼(φ→ψ), ∼φ}.
And if on the other hand, ψ ∈ Γ then Γ again contains an S-inconsistent subset:
{∼(φ→ψ), ψ}. Either possibility contradicts Γ’s S-consistency.
Theorem 6.5 If ∆ is an S-consistent set of wffs, then there exists some maximal
S-consistent set of wffs, Γ, such that ∆ ⊆ Γ
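The construction used to prove Theorem 6.5 can be run in miniature for nonmodal propositional logic, where consistency of a finite set can be brute-forced by truth tables. A Python sketch (the four-wff list and the truth-table satisfiability test stand in for the official enumeration of all wffs and for S-consistency; the encoding is mine):

```python
from itertools import product

# A miniature of the Γ0, Γ1, … construction, over two sentence letters.
ATOMS = ("P", "Q")

def value(f, row):
    # wffs: ("atom", name), ("not", f), ("imp", f, g)
    if f[0] == "atom":
        return row[f[1]]
    if f[0] == "not":
        return 1 - value(f[1], row)
    return max(1 - value(f[1], row), value(f[2], row))

def consistent(gamma):
    """Some truth-value assignment makes every member of gamma true."""
    return any(all(value(f, dict(zip(ATOMS, bits))) == 1 for f in gamma)
               for bits in product([0, 1], repeat=len(ATOMS)))

P, Q = ("atom", "P"), ("atom", "Q")
wff_list = [P, ("imp", P, Q), Q, ("not", Q)]   # φ1, φ2, φ3, φ4

gamma = {("imp", P, Q)}          # Γ0 = ∆, assumed consistent
for phi in wff_list:             # Γn+1 adds φn+1 if that stays
    if consistent(gamma | {phi}):    # consistent, else adds ∼φn+1
        gamma = gamma | {phi}
    else:
        gamma = gamma | {("not", phi)}

assert consistent(gamma)         # each Γi, hence Γ, is consistent
for phi in wff_list:             # and Γ decides every listed wff
    assert phi in gamma or ("not", phi) in gamma
```

Here ∼Q cannot be consistently added (Γ already contains P, P →Q, and Q by that stage), so its negation ∼∼Q is added instead, just as in the proof.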
Proof of Theorem 6.5. Let φ1 , φ2 , . . . be a list of all the wffs of MPL. (See the
note below for one way to construct such a list.) We construct Γ from ∆ by
going through this list one-by-one, at each point adding either φi or ∼φi .
Here’s how we do this more carefully. Let’s begin by defining an infinite
sequence of sets, Γ0 , Γ1 , . . . :
· Γ0 is ∆
· Γn+1 is Γn ∪ {φn+1 } if that is S-consistent; otherwise Γn+1 is Γn ∪ {∼φn+1 }
Note the recursive nature of the definition: the next member of the sequence
of sets, Γn+1 , is defined as a function of the previous member of the sequence,
Γn .
Next let’s prove that each member in this sequence—that is, each Γi —is an
S-consistent set. We do this inductively, by first showing that Γ0 is S-consistent,
and then showing that if Γn is S-consistent, then so is Γn+1 .
Obviously, Γ0 is S-consistent, since ∆ was stipulated to be S-consistent.
Next, suppose that Γn is S-consistent; we must show that Γn+1 is S-consistent.
Look at the definition of Γn+1 . What Γn+1 gets defined as depends on whether
Γn ∪{φn+1 } is S-consistent. If Γn ∪{φn+1 } is S-consistent, then Γn+1 gets defined
as that very set, Γn ∪ {φn+1 }, and so of course is S-consistent. So we’re OK in
that case.
The remaining possibility is that Γn ∪ {φn+1 } is S-inconsistent. In that case,
Γn+1 gets defined as Γn ∪{∼φn+1 }. So we must show that in this case, Γn ∪{∼φn+1 }
is S-consistent. Suppose for reductio that it isn’t. The conjunction of some
finite subset of its members must then be provably false in S. Since Γn was
S-consistent, the finite subset must contain ∼φn+1 , and so there exist ψ1 . . . ψ m ∈
Γn such that `S ∼(ψ1 ∧ · · · ∧ψ m ∧∼φn+1 ). But Γn ∪ {φn+1 } was also S-inconsistent,
so there likewise exist χ1 . . . χ j ∈ Γn such that `S ∼(χ1 ∧ · · · ∧χ j ∧φn+1 ). From
these two theorems it follows by PL that `S ∼(ψ1 ∧ · · · ∧ψ m ∧χ1 ∧ · · · ∧χ j ), which
contradicts the S-consistency of Γn . So each Γi is S-consistent.
Finally, let Γ be the union of all the Γi s. Γ includes ∆, and one of φi , ∼φi
is a member of Γ for each wff φi , so Γ is maximal. And Γ is S-consistent, since
any finite subset of Γ is included in some Γi , and each Γi is S-consistent.

Note: here is one way to construct the required list of all the wffs of MPL.
Order the primitive expressions of MPL as follows, listing each expression’s
position underneath it:

( ) ∼ → 2 P1 P2 . . .
1 2 3 4 5 6 7 . . .

(E.g., the position of the 2 is 5.) Now, where φ is any wff, call the rating of φ
the sum of the positions of the occurrences of its primitive expressions. (The
rating for the wff (P1 →P1 ), for example, is 1 + 6 + 4 + 6 + 2 = 19.) We can now
construct the listing of all the wffs of MPL by an infinite series of stages: stage 1,
stage 2, etc. In stage n, we append to our growing list all the wffs of rating n, in
alphabetical order. The notion of alphabetical order here is the usual one, given
the ordering of the primitive expressions laid out above. (E.g., just as ‘and’
comes before ‘nad’ in alphabetical order, since ‘a’ precedes ‘n’ in the usual
ordering of the English alphabet, ∼2P2 comes before 2∼P2 in alphabetical
order, since ∼ comes before the 2 in the ordering of the alphabet of MPL. Note
that each of these wffs is inserted into the list in stage 15, since each has rating
15.) In stages 1-5 no wffs are added at all, since every wff must contain at least
one sentence letter, and P1 is the sentence letter with the smallest position. In
stage 6 there is one wff: P1 . Thus, the first member of our list of wffs is P1 . In
stage 7 there is one wff: P2 , so P2 is the second member of the list. Each
subsequent stage contains only finitely many wffs, so each stage adds finitely
many wffs to the list, and each wff gets added at some stage; so every wff
eventually appears at some finite position in the list.
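The enumeration just described is concrete enough to compute. A Python sketch of the rating function (wffs are represented as lists of primitive-expression tokens; this representation is mine):

```python
# The rating function from the enumeration of MPL wffs.
def position(token):
    """Position of a primitive expression: ( ) ∼ → 2 P1 P2 … sit at
    positions 1 2 3 4 5 6 7 …"""
    fixed = {"(": 1, ")": 2, "∼": 3, "→": 4, "2": 5}
    if token in fixed:
        return fixed[token]
    return 5 + int(token[1:])   # sentence letter Pn has position 5+n

def rating(tokens):
    """The sum of the positions of the occurrences of a wff's
    primitive expressions."""
    return sum(position(t) for t in tokens)

print(rating(["(", "P1", "→", "P1", ")"]))   # 19, as in the note
print(rating(["∼", "2", "P2"]))              # 15
print(rating(["2", "∼", "P2"]))              # 15
```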
6.6.6 “Mesh”
Our ultimate goal is to show that in canonical models, a wff is true at a world
iff it is a member of that world. If we’re going to be able to show this, we’d
better be able to show things like this:

(2) if 2φ is a member of a world w, then φ is a member of every world w 0 such that R ww 0

(3) if 3φ is a member of a world w, then φ is a member of some world w 0 such that R ww 0

We’ll need to be able to show (2) and (3) because it’s part of the definition of
truth in any MPL-model (whether canonical or not) that 2φ is true at w iff φ
is true at each world accessible from w, and that 3φ is true at w iff φ is true at
some world accessible from w. Think of it this way: (2) and (3) say that the
modal statements that are members of a world w in a canonical model “mesh”
with the members of the other worlds in that canonical model. This sort of
mesh had better hold if truth and membership are going to coincide.
(2) we know to be true straightaway, since it follows from the definition of
the accessibility relation in canonical models. The definition of the canonical
model for S, recall, stipulated that w 0 is accessible from w iff for each wff 2φ
in w, the wff φ is a member of w 0 . (3), on the other hand, doesn’t follow
immediately from our definitions; we’ll need to prove it. Actually, it will be
convenient to prove something slightly different which involves only the 2.
Where ∆ is any set of wffs, let 2− (∆) be the set of all wffs φ such that 2φ ∈ ∆.
Then:

Lemma 6.6 If ∆ is a maximal S-consistent set of wffs and ∼2φ ∈ ∆, then there is a maximal S-consistent set of wffs Γ such that ∼φ ∈ Γ and 2− (∆) ⊆ Γ

(Given the definition of accessibility in the canonical model and the definition
of the 3 in terms of the 2, Lemma 6.6 basically amounts to (3).)
Proof of Lemma 6.6. Let ∆ be as described. The first (and biggest) step is to
establish:

(*) 2− (∆) ∪ {∼φ} is S-consistent
Suppose for reductio that (*) is false. By the definition of S-inconsistency, for
some χ1 . . . χ m ∈ 2− (∆) ∪ {∼φ} the following is a theorem of S:
∼(χ1 ∧ · · · ∧χ m )

By PL, the following is then also a theorem of S:

∼(χ1 ∧ · · · ∧χ m ∧∼φ)
Now go through the list χ1 . . . χ m , and if it contains any wffs that are not
members of 2− (∆), drop them from the list. Call the resulting list ψ1 . . . ψn .
Each of the ψi s, note, is a member of 2− (∆).11 The only wff that could have
been dropped, in moving from the χi s to the ψi s, is ∼φ (since each χi was
a member of 2− (∆) ∪ {∼φ}); the following wff is therefore a PL-semantic-
consequence of the previous wff, and so is itself a theorem of S:

∼(ψ1 ∧ · · · ∧ψn ∧∼φ)
Next, begin a proof in S with a proof of ∼(ψ1 ∧ · · · ∧ψn ∧∼φ), and then continue
as follows:
.
.
i. ∼(ψ1 ∧ · · · ∧ψn ∧∼φ)
i + 1. ψ1 →(ψ2 → · · · (ψn →φ) . . . ) i, PL
i + 2. 2[ψ1 →(ψ2 → · · · (ψn →φ) . . . )] i + 1, NEC
.
.
j. 2ψ1 →(2ψ2 → · · · (2ψn →2φ) . . . ) i + 2 …, K, PL (×n)
j + 1. ∼(2ψ1 ∧ · · · ∧2ψn ∧∼2φ) j , PL
This proof establishes that `S ∼(2ψ1 ∧ · · · ∧2ψn ∧∼2φ). But since 2ψ1 …2ψn ,
and ∼2φ are all in ∆, this contradicts ∆’s S-consistency (2ψ1 …2ψn are mem-
bers of ∆ because ψ1 …ψn are members of 2− (∆).)
We’ve established (*): 2− (∆) ∪ {∼φ} is S-consistent. It therefore has a
maximal S-consistent extension, Γ, by Theorem 6.5. Since 2− (∆) ∪ {∼φ} ⊆ Γ,
we know that 2− (∆) ⊆ Γ and that ∼φ ∈ Γ. Γ is therefore our desired set.
11 If 2− (∆) is empty then there will be no ψi s. In that case, let’s regard “ψ1 ∧ · · · ∧ψn ∧∼φ” as standing for ∼φ.
Theorem 6.7 Where M is the canonical model for a system S, VM (φ, w) = 1 iff φ ∈ w, for each wff φ and each world w in M

Proof of Theorem 6.7. We’ll use induction. The base case is when φ has zero
connectives—i.e., φ is a sentence letter. In that case, the result is immediate:
by the definition of the canonical model, I (φ, w) = 1 iff φ ∈ w; but by the
definition of the valuation function, VM (φ, w) = 1 iff I (φ, w) = 1.
Now the inductive step. We suppose (ih) that the result holds for φ, ψ, and
show that it holds for ∼φ, φ→ψ, and 2φ as well. First, ∼: we must show that
∼φ is true at w iff ∼φ ∈ w:
i) ∼φ ∈ w iff φ is not a member of w (6.4a)
ii) φ is not a member of w iff φ is not true at w (ih)
iii) φ is not true at w iff ∼φ is true at w (truth condition for the ∼)
Completeness of D
Let us show that in the canonical model for D, the accessibility relation, R, is
serial. Let w be any world in that model. We showed above that 3(P →P ) is a
theorem of D; so by lemma 6.4d, 3(P →P ) ∈ w, and so by theorem 6.7, 3(P →P )
is true at w. Given the truth condition for the 3, it follows that some world is
accessible from w. Since w was arbitrary, R is serial.
Completeness of T
All we need to do is to prove that the accessibility relation in the canonical model
for T is reflexive; given that, every T-valid formula is valid in the canonical
model for T, and hence by corollary 6.8, every T-valid formula is a T-theorem.
Let φ be any wff. `T 2φ→φ, so, where w is any world in the canonical
model for T, by lemma 6.4d, 2φ→φ ∈ w. By lemma 6.4c, if 2φ ∈ w, then so
is φ. Formula φ was arbitrarily chosen, so we have: for any φ, if 2φ ∈ w then
φ ∈ w. But this is the definition of R ww. World w was arbitrarily chosen, so
R is reflexive.
Completeness of B
We must show that the accessibility relation in the canonical model for B is
reflexive and symmetric. Reflexivity can be demonstrated in the same way as it
was for T, since every T-theorem is a B-theorem.
Now for symmetry: in the canonical model for B, suppose that R wv. We
must show that R v w—that is, that for any 2ψ in v, ψ ∈ w. So, suppose that
2ψ ∈ v. By theorem 6.7, 2ψ is true at v; since R wv, by the definition of
3 it follows that 32ψ is true at w, and hence is a member of w by theorem
6.7. Since `B 32ψ→ψ, by lemma 6.4d, 32ψ→ψ ∈ w, and so, by lemma 6.4c,
ψ ∈ w.
Completeness of S4
We must show that the accessibility relation in the canonical model for S4 is
reflexive and transitive. Again, reflexivity can be demonstrated as it was for
T; transitivity remains. Suppose R wv and R v u. We must show R w u—that
is, for any 2ψ ∈ w, ψ ∈ u. If 2ψ ∈ w, since `S4 2ψ→22ψ, by lemma 6.4d,
Completeness of S5
We must show that the accessibility relation in the canonical model for S5 is
reflexive, symmetric, and transitive. But since each T, B, and S4 theorem is
an S5 theorem, the proofs of reflexivity, symmetry, and transitivity from the
previous three sections apply here.
As we have seen, possible worlds are useful for giving a semantics for propositional
modal logic. Possible worlds are useful in other areas of logic as
well. In this chapter we will briefly examine two other uses for possible worlds:
semantics for tense logic, and semantics for intuitionist propositional logic.
CHAPTER 7. VARIATIONS ON MPL 192
Comments:
· we add a new argument place to each predicate, for the time at which the object
satisfies the predicate. Thus, instead of saying “C x”—“x is a child”—we
say “C x t ”: “x is a child at t ”
· the quantifier ∃x is atemporal, ranging over all objects at all times. That’s
how we can say that there is a thing, x, that is a dinosaur, and which, at
some previous time, trampled a mammal.
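Quine's strategy is easy to mimic in code. The following is a toy illustration of my own (the objects, times, and predicate data are all invented): predicates get an extra place for times, encoded as sets of tuples, and the quantifiers range over all objects at all times.

```python
# A toy sketch (mine, not the text's) of Quine's strategy: each predicate
# carries an extra argument place for a time, and quantifiers are atemporal.
# "A dinosaur once trampled a mammal": there are x, y, and a time t earlier
# than now such that x is a dinosaur, y is a mammal, and x trampled y at t.

NOW = 2020
dinosaurs = {'dino1'}                          # extension of D
mammals = {'mam1'}                             # extension of M
trampled = {('dino1', 'mam1', -66_000_000)}    # (x, y, t): x trampled y at t

once_trampled = any(
    x in dinosaurs and y in mammals and t < NOW
    for (x, y, t) in trampled
)
print(once_trampled)  # True
```

Note that the existential quantifier finds the dinosaur even though it exists only at a long-past time; that is exactly the "space-like" conception of time the text goes on to discuss.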
So: we can use Quine’s strategy to represent temporal notions using standard
predicate logic. But some philosophers reject the conception of time that is
presupposed by Quine’s strategy. First, Quine presupposes that the past, present,
and future are equally real. After all, his symbolization of “A dinosaur once
trampled a mammal” says that there is such a thing as a dinosaur. Quine’s view is
that time is “space-like”. Other times are as real as the present, just temporally
distant, just as other places are equally real but spatially distant. Second, Quine
presupposes a distinctive metaphysics of change. Quine accounts for change
by adding argument places to temporary predicates like ‘is a child’ and ‘is an
adult’. For him, the statement ‘Ted is an adult’ is incomplete in something
like the way ‘Philadelphia is north of’ is incomplete: its predicate has an unfilled
argument place. When all of a sentence’s argument places are filled, it can no
longer change its truth value; as a result (according to some), Quine’s approach
leaves no room for genuine change.
Arthur Prior (1967; 1968) and others reject Quine’s picture of time. Ac-
cording to Prior, rather than reducing notions of past, present, and future to
notions about what is true at times, we must instead include certain special
temporal expressions—sentential tense operators—in our most basic languages,
and develop an account of their logic. Thus he initiated the study of tense logic.
One of Prior’s tense operators was P, symbolizing “it was the case that”.
Grammatically, P behaves like the ∼ and the □: it attaches to a complete
sentence and forms another complete sentence. Thus, if R symbolizes “it is
raining”, then P R symbolizes “it was raining”. If a sentence letter occurs by
itself, outside of the scope of all temporal operators, then for Prior it is to
be read as present-tensed. Thus, it was appropriate to let R symbolize “It is
raining”—i.e., it is now raining.
Suppose we symbolize “there exists a dinosaur” as ∃xD x. Prior would then
symbolize “There once existed a dinosaur” as:
P ∃xD x
And according to Prior, P ∃xD x is not to be analyzed as saying that there exist
dinosaurs located in the past. For him, there is no further analysis of P ∃xD x.
Prior’s attitude toward P is like everyone else’s attitude toward the ∼: no one
thinks that ∼∃xU x, “there are no unicorns”, is to be analyzed as saying that
there exist unreal unicorns. Further, Prior can represent the fact that I am
now, but have not always been, an adult, without adding argument places for
times to predicates. Symbolizing ‘is an adult’ with ‘A’, and ‘Ted’ with ‘t ’, Prior
would write: At ∧ P∼At (“Ted is an adult, but it was the case that: Ted isn’t an
adult”). For Prior, the sentence At (“Ted is an adult”) is a complete statement,
but nevertheless can alter its truth value.
Gφ: “it is, and always will be the case that, φ”
Hφ: “it is, and always has been the case that, φ”
Fφ: “it either is, or will at some point in the future be the case that, φ”
Pφ: “it either is, or was at some point in the past the case that, φ”
Whether we take the tense operators as including the present moment will
affect what kind of logic we develop for them. For example, the “2-like”
operators G and H will obey the T-principle (Gφ and Hφ will imply φ) if they
are interpreted as including the present moment, but not otherwise.
clauses for the tense operators. Let’s take just G and H as primitive; here are
the clauses:
VM (Gφ, t ) = 1 iff for every t′ such that t ≤ t′, VM (φ, t′) = 1
VM (Hφ, t ) = 1 iff for every t′ such that t′ ≤ t , VM (φ, t′) = 1
If we define F and P as ∼G∼ and ∼H∼, respectively, then we get the following
derived clauses:
VM (Fφ, t ) = 1 iff for some t′ such that t ≤ t′, VM (φ, t′) = 1
VM (Pφ, t ) = 1 iff for some t′ such that t′ ≤ t , VM (φ, t′) = 1
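These clauses are easy to spot-check computationally. The sketch below is my own illustration (not from the text): it evaluates G and H, plus the derived F and P, over a small finite frame whose times are 0 through 3, ordered by the reflexive relation ≤.

```python
# A minimal sketch (my own, hypothetical encoding): evaluating the tense
# operators over a finite PTL-model with times 0..3 ordered by <=.
# Formulas are nested tuples, e.g. ('G', ('atom', 'R')).

TIMES = [0, 1, 2, 3]

def val(I, phi, t):
    """Truth value (0/1) of phi at time t; I maps (letter, time) -> 0/1."""
    op = phi[0]
    if op == 'atom':
        return I.get((phi[1], t), 0)
    if op == 'not':
        return 1 - val(I, phi[1], t)
    if op == 'imp':
        return max(1 - val(I, phi[1], t), val(I, phi[2], t))
    if op == 'G':   # 1 at t iff phi is 1 at every t' with t <= t'
        return int(all(val(I, phi[1], u) for u in TIMES if t <= u))
    if op == 'H':   # 1 at t iff phi is 1 at every t' with t' <= t
        return int(all(val(I, phi[1], u) for u in TIMES if u <= t))
    if op == 'F':   # derived clause for ~G~
        return int(any(val(I, phi[1], u) for u in TIMES if t <= u))
    if op == 'P':   # derived clause for ~H~
        return int(any(val(I, phi[1], u) for u in TIMES if u <= t))
    raise ValueError(op)

# "It is raining" true at times 0, 1, and 3, but not at 2:
I = {('R', 0): 1, ('R', 1): 1, ('R', 3): 1}
R = ('atom', 'R')
print(val(I, ('H', R), 0))  # 1: every time <= 0 (just 0 itself) has rain
print(val(I, ('H', R), 2))  # 0: it isn't raining at 2 itself
```

Since H is read inclusively ("it is, and always has been"), HR fails at time 2 simply because R fails there.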
Call an MPL-model, thought of in this way, a “PTL-model” (for “Priorean
Tense Logic”). And say that a wff is PTL-valid iff it is true in every time in every
PTL-model. Given our discussion of system K from chapter 6, we already
know a lot about PTL-validity. The truth condition for the G is the same as
the truth condition for the □ in MPL. Thus, for each K-valid formula φ of
MPL, there is a PTL-valid formula of tense logic: simply replace each □ in
φ with G. Replacing □s with Gs in the K-valid formula □(P ∧Q)→□P results
in the PTL-valid formula G(P ∧Q)→GP , for example. Similarly, replacing □s
with Hs in a K-valid formula also results in a PTL-valid formula.
But there are further cases of PTL-validity that depend on the interaction
between different tense operators, and hence have no direct analog in MPL.
For example, we can demonstrate that ⊨PTL φ→GPφ:
i) Suppose for reductio that in some PTL-model M (= 〈T , ≤, I 〉) and
some t ∈ T , VM (φ→GPφ, t ) = 0. (I henceforth drop the subscript M .)
ii) So V(φ, t ) = 1 and …
iii) …V(GPφ, t ) = 0.
iv) Given iii), by the truth condition for G: for some t′ ∈ T , t ≤ t′ and
V(Pφ, t′) = 0
v) Given iv), by the (derived) truth condition for P: for every t″ ∈ T , if
t″ ≤ t′ then V(φ, t″) = 0
vi) Letting t″ in v) be t , given that t ≤ t′ (from iv)), we have: V(φ, t ) = 0,
contradicting ii).
Similarly, one can show that ⊨PTL φ→HFφ.
Gφ→GGφ
Hφ→HHφ
There are other interesting constraints on ≤ that one might impose. One might
impose reflexivity, for example. This is natural to impose if we are construing
the tense operators as including the present moment; not otherwise. Imposing
reflexivity validates the “T-schemas” Gφ→φ and Hφ→φ.
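A quick standalone check of this point (my own illustration; the two-time frame is invented): under a reflexive ordering, GP→P cannot fail, but under an irreflexive ordering GP can be vacuously true at the last time even though P is false there.

```python
# A small check (mine, not the text's): the T-schema G(phi) -> phi holds
# under a reflexive ordering but can fail under an irreflexive one.
# Times: {0, 1}; P is true only at time 0.

TIMES = [0, 1]
P_true_at = {0}

def G(order, t):
    """GP at t: P true at every t' such that (t, t') stands in the order."""
    return all(u in P_true_at for u in TIMES if order(t, u))

reflexive = lambda t, u: t <= u     # includes the present moment
irreflexive = lambda t, u: t < u    # excludes it

# Under <=, GP at 1 requires P at 1 itself, so GP -> P cannot fail there:
print(G(reflexive, 1))    # False: P is false at 1, so GP is false too
# Under <, GP at 1 is vacuously true (there are no later times), yet P is
# false at 1, so GP -> P fails:
print(G(irreflexive, 1))  # True
```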
One might also impose “connectivity” of some sort.
So, we might require that the ≤ relation be strongly connected (in T ), or,
alternatively, merely weakly connected. This would be to disallow “incom-
parable” pairs of times—pairs of times neither of which bears the ≤ relation
to the other. The stronger requirement disallows all incomparable pairs; the
weaker requirement merely disallows incomparable pairs when each member
of the pair is after or before some one time. Thus, the weaker requirement
disallows “branches” in the temporal order but allows distinct timelines wholly
disconnected from one another, whereas the stronger requirement insures that
all times are part of a single non-branching structure. Each sort validates every
instance of the following schemas:
G(Gφ→ψ) ∨ G(Gψ→φ)
H(Hφ→ψ) ∨ H(Hψ→φ)
iii) so there exist times t′ and t″ such that t ≤ t′ and t ≤ t″, and V(Gφ→ψ, t′) =
0 and V(Gψ→φ, t″) = 0
There are other constraints one might impose, for example anti-symmetry
(no distinct times bear ≤ to each other), density (between any two times there is
another time), or eternality (there exists neither a first nor a last time). In some
cases, imposing a constraint results in an interesting schema being validated.
Further, some constraints are more philosophically controversial than others.
Notice that one should not impose symmetry on ≤. Obviously if one time
is at least as early as another, then the second time needn’t be at least as early
as the first. Moreover, imposing symmetry would validate the “B” schemas
FGφ→φ and PHφ→φ; but these clearly ought not to be validated. Take the
first, for example: it doesn’t follow from it will be the case that it is always going to
be the case that I’m dead that I’m (now) dead.
So far we have been interpreting the tense operators as including the present
moment. That led us to call the temporal ordering relation in our models “≤”,
and require that it be reflexive. What if we instead interpreted the tense
operators as not including the present moment? We would then call the
temporal ordering relation “<”, and think of it as the earlier-than relation; and
we would no longer require that it be reflexive. Indeed, it would be natural to
require that it be irreflexive: that it never be the case that t < t .
We have considered only the semantic approach to tense logic. What of a
proof-theoretic approach? Given the similarity between tense logic and modal
logic, it should be no surprise that axiom systems similar to those of section
6.4 can be developed for tense logic. Moreover, the techniques developed in
sections 6.5-6.6 can be used to give soundness and completeness proofs for
tense-logical axiom systems, relative to the possible-worlds semantics that we
have developed in this section.
semantics for propositional modal logic, and so it will include valuation func-
tions that assign the values 1 and 0 to formulas relative to the members of a
set W . But the idea is to now think of the members of W as stages in the
construction of proofs, rather than as possible worlds, and to think of 1 and 0 as
“proof statuses”, rather than truth values. That is, we are to think of V(φ, w) = 1
as meaning that formula φ has been proved at stage w.
Let us treat the ∧ and the ∨ as primitive connectives. Here is Kripke’s
semantics for intuitionist propositional logic. (To emphasize the different way
we are regarding the “worlds”, we rename W “S ”, for stages in the construction
of proofs, and we will use the variables s, s′, etc., for its members.)
VM (α, s) = I (α, s)
VM (φ∧ψ, s) = 1 iff VM (φ, s) = 1 and VM (ψ, s) = 1
VM (φ∨ψ, s) = 1 iff VM (φ, s) = 1 or VM (ψ, s) = 1
VM (∼φ, s) = 1 iff for every s′ such that Rs s′, VM (φ, s′) = 0
VM (φ→ψ, s) = 1 iff for every s′ such that Rs s′, either VM (φ, s′) = 0
or VM (ψ, s′) = 1
Note that the truth conditions for the → and the ∼ at stage s no longer
depend exclusively on what s is like; they are sensitive to what happens at
stages accessible from s. Unlike the ∧ and the ∨, → and ∼ are not “truth
functional” (relative to a stage); they behave like modal operators.
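Here is a minimal sketch, in my own encoding (only the clauses themselves come from the text), of these valuation conditions. The usage example brings out the modal behavior of the →: it can be assigned 0 at a stage where its antecedent is also assigned 0, which is impossible for the material conditional.

```python
# A minimal sketch (my own encoding) of Kripke's semantics for intuitionist
# propositional logic. Stages and accessibility are listed explicitly;
# I maps (letter, stage) -> 0/1 and is assumed to obey heredity.

def val(stages, R, I, phi, s):
    op = phi[0]
    if op == 'atom':
        return I.get((phi[1], s), 0)
    if op == 'and':
        return min(val(stages, R, I, phi[1], s), val(stages, R, I, phi[2], s))
    if op == 'or':
        return max(val(stages, R, I, phi[1], s), val(stages, R, I, phi[2], s))
    if op == 'not':   # 1 iff phi gets 0 at every accessible stage
        return int(all(val(stages, R, I, phi[1], t) == 0
                       for t in stages if (s, t) in R))
    if op == 'imp':   # 1 iff no accessible stage proves phi without psi
        return int(all(val(stages, R, I, phi[1], t) == 0
                       or val(stages, R, I, phi[2], t) == 1
                       for t in stages if (s, t) in R))
    raise ValueError(op)

# Two stages; P becomes proved only at the later stage s1, Q never:
S = ['s0', 's1']
R = {('s0', 's0'), ('s0', 's1'), ('s1', 's1')}
I = {('P', 's1'): 1}
P, Q = ('atom', 'P'), ('atom', 'Q')
# Materially, P -> Q would be 1 at s0 (P is 0 there); intuitionistically it
# is 0, because the accessible stage s1 proves P but not Q:
print(val(S, R, I, ('imp', P, Q), 's0'))   # 0
```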
Let us think intuitively about these models. We are to think of each member
of S as a stage in the construction of mathematical proofs. At any stage, one
has come up with proofs of some things but not others. When V assigns 1 to a
formula at a stage, that means intuitively that as of that state of information,
the formula has been proven. The assignment of 0 means that the formula has
not been proven thus far (though it might nevertheless be proven in the future).
The holding of the accessibility relation R represents which future stages
are possible, given one’s current stage. If s′ is accessible from s, that means that
s′ contains all the proofs in s, plus perhaps more. Given this understanding,
reflexivity and transitivity are obviously correct to impose, as is the heredity
condition, since (on the somewhat idealized conception of proof we are oper-
ating with) one does not lose proved information when constructing further
proofs. But the accessibility relation will not in general be symmetric: for
sometimes one will come across a new proof that one did not formerly have.
Let’s also think through why the truth conditions for →, ∧, ∨ and ∼ are
intuitively correct. Intuitionists, recall, associate with each propositional con-
nective, a conception of what proofs of formulas built using that connective
must be like:
of ψ, then there could never be a possible future in which one has a proof of
φ but not one of ψ. Conversely, if one lacks such a method, then it should be
possible one day to have a proof of φ without being able to convert it into a
proof of ψ, and thus without then having a proof of ψ.
We can now define intuitionist validity and semantic consequence in the
obvious way:
7.2.2 Examples
Given the semantics just introduced, it’s straightforward to demonstrate facts
about validity and semantic consequence.
Example 7.1: Show that Q ⊨I P →Q. (I’ll omit the qualifier “I” from now
on.) Take any model and any stage s; assume that V(Q, s) = 1 and V(P →Q, s) =
0. Thus, for some s′, Rs s′ and V(P, s′) = 1 and V(Q, s′) = 0. But this violates
heredity.
one can in principle prove? Clearly not, for then any formula assigned 1 at any accessible
stage should already be assigned 1 at that stage. But if stages are not idealized in this way,
then why suppose that the assignment of 0 at a stage to ∼φ (failure to prove that φ leads to a
contradiction) insures that there is some future stage at which φ is proved? A similar worry
confronts the valuation condition for →.
Example 7.3: Show that ⊭ P ∨∼P . Here’s a model in which P ∨∼P is assigned
0 at stage r :
S : {r, a}
R : {〈r, r 〉, 〈a, a〉, 〈r, a〉}
I (P, a) = 1, all other atomics 0 everywhere
[Diagram: stage r at the bottom, with an arrow to stage a above it; at r , the
formulas P , ∼P , and P ∨∼P are all assigned 0; at a, P is assigned 1.] Since P
is assigned 1 at the stage a accessible from r , ∼P is assigned 0 at r ; and P itself
is assigned 0 at r ; so P ∨∼P is assigned 0 at r .
[A second diagram uses the same model to show that ∼∼P ⊭ P (example 7.4):
∼P is assigned 0 at both r and a, so ∼∼P is assigned 1 at r , while P is assigned
0 there.]
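The model above can be double-checked by brute force. This is my own encoding; the model itself, with S = {r, a}, R = {〈r, r〉, 〈a, a〉, 〈r, a〉}, and I(P, a) = 1, is the one given in the text.

```python
# Double-checking the stated model by direct computation (a sketch in my
# own encoding; S, R, and I are the model given in the text).
S = ['r', 'a']
R = {('r', 'r'), ('a', 'a'), ('r', 'a')}
I = {('P', 'a'): 1}   # all other atomics 0 everywhere

def v(phi, s):
    if phi[0] == 'atom':
        return I.get((phi[1], s), 0)
    if phi[0] == 'or':
        return max(v(phi[1], s), v(phi[2], s))
    if phi[0] == 'not':   # Kripke clause: phi gets 0 at every accessible stage
        return int(all(v(phi[1], t) == 0 for t in S if (s, t) in R))

P = ('atom', 'P')
print(v(('or', P, ('not', P)), 'r'))   # 0: example 7.3, P v ~P fails at r
print(v(('not', ('not', P)), 'r'))     # 1: ~~P is "proved" at r...
print(v(P, 'r'))                       # 0: ...but P itself is not
```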
7.2.3 Soundness
Recall our proof system for intuitionism from section 3.4. What I’d like to
do next is show that that proof system is sound, relative to our semantics for
intuitionism. But first we’ll need to prove an intermediate theorem:
Generalized heredity: The heredity condition holds for all formulas. That
is, for any wff φ, whether atomic or not, and any stages s and s′ in any intuitionist
model, if V(φ, s) = 1 and Rs s′, then V(φ, s′) = 1.
Proof. The proof is by induction. The base case is just the official heredity
condition. Next we make the inductive hypothesis (ih): heredity is true for
formulas φ and ψ; we must now show that heredity also holds for ∼φ, φ→ψ,
φ∧ψ, and φ∨ψ. I’ll do this for φ∧ψ, and leave the rest as exercises.
∧: Suppose for reductio that V(φ∧ψ, s) = 1, Rs s′, and V(φ∧ψ, s′) = 0.
Given the first, V(φ, s) = 1 and V(ψ, s) = 1. By (ih), V(φ, s′) = 1 and
V(ψ, s′) = 1, and so V(φ∧ψ, s′) = 1—contradiction.
Now for soundness. What does soundness mean in the present context?
The proof system in section 3.4 is a proof system for sequents, not individual
formulas. So first, we need a notion of intuitionist validity for sequents.
Proof. This will be an inductive proof. Since a provable sequent is the last
sequent of some proof, all we need to show is that every sequent in any proof
is I-valid. And to do that, all we need to show is that the rule of assumptions
generates I-valid sequents (base case), and all the other rules preserve I-validity
(induction step). For any set, Γ, valuation function V, and stage s, let’s write
“V(Γ, s) = 1” to mean that V(γ , s) = 1 for each γ ∈ Γ.
Base case: the rule of assumptions generates sequents of the form φ ⊢ φ,
which are clearly I-valid.
Induction step: we show that the other sequent rules from section 3.4
preserve I-validity.
∧I: Here we assume that the inputs to ∧I are I-valid, and show that its
output is I-valid. That is, we assume that Γ ⊢ φ and ∆ ⊢ ψ are I-valid sequents,
and we must show that it follows that Γ, ∆ ⊢ φ∧ψ is also I-valid. So, consider
any model with valuation V and any stage s such that V(Γ ∪ ∆, s) = 1, and suppose
for reductio that V(φ∧ψ, s) = 0. Since Γ ⊢ φ is I-valid, V(φ, s) = 1; since ∆ ⊢ ψ
is I-valid, V(ψ, s) = 1; contradiction.
Exercise 7.5 Show that ∧E, ∨I, DNI, RAA, →I, →E, and EF
preserve I-validity.
I can now justify an assertion I made, but did not prove, in section 3.4. I
asserted there that the sequent ∅ ⊢ P ∨∼P is not intuitionistically provable.
Given the soundness proof, to demonstrate that a sequent is not intuitionisti-
cally provable, it suffices to show that its premises do not I-semantically-imply
its conclusion. But in example 7.3 we showed that ⊭ P ∨∼P , which is equivalent
to saying that ∅ ⊭ P ∨∼P .
Similarly, we showed in example 7.4 that ∼∼P ⊭ P . Thus, by the soundness
theorem, the sequent ∼∼P ⊢ P isn’t provable. (Recall how, in constructing our
proof system for intuitionism in section 3.4, we dropped the rule of double-
negation elimination.)
Chapter 8
Counterfactuals
There are certain conditionals in natural language that are not well-
represented either by propositional logic’s material conditional or by
modal logic’s strict conditional. In this chapter we consider “counterfactual”
conditionals—conditionals that (loosely speaking) have the form:
CHAPTER 8. COUNTERFACTUALS 208
What should the logic of this new connective be, if it is to accurately represent
natural language counterfactuals?
∼P ⊭ P □→Q
Q ⊭ P □→Q
For consider: I did not strike the match; but it doesn’t logically follow that
if I had struck the match, it would have turned into a feather. So if □→ is to
represent ‘if it had been that…, it would have been that…’, ∼P should not
semantically imply P □→Q. Similarly, George W. Bush (somehow) won the last
United States presidential election, but it doesn’t follow that if the newspapers
had discovered beforehand that Bush had an affair with Al Gore, he would
still have won. So our semantics had better not count P □→Q as a semantic
consequence of Q either.
These implications hold for the material conditional, however (for any φ
and ψ):
∼φ        ψ
φ→ψ      φ→ψ
theory might have been true; in a possible world in which there is a conspiracy,
it would be true that if Oswald hadn’t shot Kennedy, someone else would
have. Thus, our logic should allow counterfactuals to be contingent statements.
Just because a counterfactual is true, it should not follow logically that it is
necessarily true; and just because a counterfactual is false, it should not follow
logically that it is necessarily false. Our semantics for □→, that is, should have
the following features:
One reason this is important is that it shows an obstacle to using the strict
conditional ⇒ to represent natural language counterfactuals. For remember
that φ⇒ψ is defined as 2(φ→ψ). As a result:
8.1.3 No augmentation
The → and the ⇒ obey the argument form augmentation
φ→ψ φ⇒ψ
(φ∧χ )→ψ (φ∧χ )⇒ψ
That is, φ→ψ ⊨PL (φ∧χ )→ψ and φ⇒ψ ⊨K,… (φ∧χ )⇒ψ. However, natural
language counterfactuals famously do not obey augmentation. Consider:
So, our next desideratum is that the corresponding argument should not hold
good for □→ (that is, P □→Q ⊭ (P ∧R)□→Q).
8.1.4 No contraposition
→ and ⇒ obey contraposition:
φ→ψ φ⇒ψ
∼ψ→∼φ ∼ψ⇒∼φ
But counterfactuals do not. Suppose I’m on the firing squad, and we shoot
someone dead. My gun was loaded, but so were those of the others. Then the
premise of the following argument is true, while its conclusion is false:
φ□→ψ
φ→ψ
φ   φ□→ψ        ∼ψ   φ□→ψ
ψ                ∼φ
The reason is, of course, that modus ponens and modus tollens are valid for
the →. (Note that it’s not inconsistent to say that modus tollens holds for the
□→ and also that contraposition fails.)
Another implication: the strict conditional should imply the counterfactual:
φ⇒ψ
φ□→ψ
To see that these implications should hold, consider first the argument from
the strict conditional to the counterfactual conditional. Surely, if φ entails—
necessitates—ψ, then if φ were indeed true, ψ would be as well. As for the
counterfactual implying the material, suppose that you think that if φ were
true, ψ would also be true. Now suppose that someone tells you that φ is true,
but that ψ is false. Wouldn’t you then need to give up your original claim that
if φ were to be true, then ψ would be true? It seems so. So, the statement
φ□→ψ isn’t consistent with φ∧∼ψ—that is, it isn’t consistent with the denial
of φ→ψ.
Would Syracuse be warm in the winter? Would Williams hit over .300?
No one answer is correct, once and for all. Which answer is correct depends
on the linguistic context. Whether a counterfactual is true or whether it is false
depends in part on what the speaker means to be saying, and what her audience
takes her to be saying, when she utters the counterfactual. Would Syracuse be
warm?—in some contexts, it would be correct to say yes, and in others, to say
no. When we imagine Syracuse being warm, we imagine reality being different
in certain respects from actuality. In particular, we imagine Syracuse as being
in Louisiana. In other respects, we imagine a situation that is a lot like reality—
we don’t imagine a situation, for example, in which Syracuse and Louisiana
are both located in China. Now, when considering counterfactuals, there is
a question of what parts of reality we hold constant. In the Syracuse-Louisiana
case, we seem to have at least two choices. Do we hold constant the location of
Syracuse, or do we hold constant the borders of Louisiana? The truth value of
the counterfactual depends on which we hold constant.
What determines which things are to be held constant, when we evaluate
the truth value of a counterfactual? In large part: the context of utterance of
the counterfactual. Suppose I am in the middle of the following conversation:
“Syracuse restaurants struggle to survive because the climate there is so bad:
no one wants to go out to eat in the winter. If Syracuse were in Louisiana,
its restaurants would do much better.” In such a context, an utterance of the
counterfactual “If Syracuse were in Louisiana, Syracuse winters would be warm”
would be regarded as true. But if this counterfactual were uttered in the midst of
the following conversation, it would be regarded as false: “You know, Louisiana
is statistically the warmest state in the country. Good thing Syracuse isn’t in
Louisiana, because that would ruin the statistic.”
Does just saying a sentence, intending it to be true, make it true? Well,
sort of! When a certain sentence has a meaning that is partly determined by
context, then when a person utters that sentence with the intention of saying
something true, that tends to create a context in which the sentence is true.
Compare ‘flat’—we’ll say “the table is flat”, and thereby utter a truth. But when
a scientist looks at the same table and says “you know, macroscopic objects are
far from being flat. Take that table, for instance. It isn’t flat at all—when viewed
under a microscope, it can be seen to have a very irregular surface”. The term
‘flat’ has a certain amount of vagueness—how flat does a thing have to be to
count as being “flat”? Well, the amount required is determined by context.2
2. See Lewis (1979).
When we consider the possible world that would be actual if kangaroos had
no tails, we do not depart gratuitously from actuality. For example, we do
not consider a world in which kangaroos have wings, or crutches. We do not
consider a world with different laws of nature, in which there is no gravity. We
keep the kangaroos as they actually are, but remove the tails, and we keep the
laws of nature as they actually are. It seems that the kangaroos would then fall
over.
Take the examples of the previous section, in which I got you to give
differing answers to certain sentences. Consider:
How does the contextual dependence of this sentence work, on the Lewis-
Stalnaker view? By supplying different standards of comparison of similarity.
Think about similarity, for a moment: things can be similar in certain respects,
while not being similar in other respects. A blue square is similar to a blue circle
in respect of color, not in respect of shape. Now, when we answer affirmatively
to this counterfactual, according to Lewis and Stalnaker, when we consider
the possible world most similar to the actual world in which Syracuse is in
Louisiana, we are using a kind of similarity that weights heavily Louisiana’s
actual borders. When we count the counterfactual false, we are using a kind of
similarity that weights very heavily Syracuse’s actual location.
8.3.1 Syntax of SC
The primitive vocabulary of SC is that of propositional modal logic, plus the
connective 2→. Here’s the grammar:
Definition of SC-wff:
8.3.2 Semantics of SC
Where R is a three-place relation, let’s abbreviate “Rxyz” as “Rz xy”. And,
where u is any object, let “Ru” be the two-place relation that holds between
objects x and y iff Ru xy. (Think of Ru as the two-place relation that results
from “plugging up” one place of the three-place relation R with object u.)
We can now define SC-models:
(Recall that a binary relation R is “strongly connected” in set A iff for each
u, v ∈ A, either Ruv or Rv u, and “anti-symmetric” iff u = v whenever both
Ru v and Rv u.)
i) VM (α, w) = I (α, w)
ii) VM (∼φ, w) = 1 iff VM (φ, w) = 0
iii) VM (φ→ψ, w) = 1 iff either VM (φ, w) = 0 or VM (ψ, w) = 1
iv) VM (□φ, w) = 1 iff for any v, VM (φ, v) = 1
v) VM (φ□→ψ, w) = 1 iff for any x, IF [VM (φ, x) = 1 and for any y such that
VM (φ, y) = 1, x ⪯w y] THEN VM (ψ, x) = 1
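Clause v) can be spot-checked in code. The sketch below is my own encoding (not the book's): because the nearness ordering is assumed total and anti-symmetric, it can be represented by listing, for each base world, all the worlds from nearest to furthest, and the closest antecedent-world (if any) is then unique.

```python
# A hedged sketch (my own encoding) of the counterfactual clause:
# phi []-> psi is true at w iff psi is true at every closest-to-w phi-world.
# Since the ordering here is total and anti-symmetric, "every closest"
# reduces to "the unique closest, if any".

def val(order_from, I, phi, w):
    op = phi[0]
    if op == 'atom':
        return I.get((phi[1], w), 0)
    if op == 'not':
        return 1 - val(order_from, I, phi[1], w)
    if op == 'and':
        return min(val(order_from, I, phi[1], w), val(order_from, I, phi[2], w))
    if op == 'imp':
        return max(1 - val(order_from, I, phi[1], w), val(order_from, I, phi[2], w))
    if op == 'cf':  # the counterfactual []->
        # the closest-to-w world where the antecedent holds, if any
        nearest = next((x for x in order_from[w]
                        if val(order_from, I, phi[1], x)), None)
        return 1 if nearest is None else val(order_from, I, phi[2], nearest)
    raise ValueError(op)

# Example 8.1's formula, (P & Q) -> (P []-> Q), in a tiny two-world model:
order_from = {'w': ['w', 'v'], 'v': ['v', 'w']}  # each world nearest to itself
I = {('P', 'w'): 1, ('Q', 'w'): 1, ('P', 'v'): 1}
P, Q = ('atom', 'P'), ('atom', 'Q')
fml = ('imp', ('and', P, Q), ('cf', P, Q))
print(val(order_from, I, fml, 'w'))  # 1: w itself is the closest P-world
```

Listing each world's ordering with itself first builds in constraint C4 (every world is at least as close to itself as any other).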
Rough proof. Let φ be any formula that’s valid in all total models, and let M be
any equivalence relation model. We need to show that φ is true in an arbitrary
world r ∈ W (M ’s set of worlds). Now, any equivalence relation partitions
its domain into non-overlapping subsets in which each world sees every other
world. So W is divided up into one or more non-overlapping subsets. One of
these, W r , contains r . Now, consider a model, M 0 , just like M , but whose
set of worlds is just W r . M 0 is a total model, so φ is valid in it by hypothesis.
Thus, in this model, φ is true at r . But then φ is true at r in M , as well.
Why? Roughly: the truth value of φ at r in M isn’t affected by what goes
on outside r ’s partition, since chains of modal operators just take us to worlds
seen by r , and worlds seen by worlds seen by r , and… Such chains will never
have us “look at” anything out of r ’s partition, since these worlds are utterly
unconnected to r via the accessibility relation. So φ’s truth value at r in M is
determined by what goes on in Wr , and so is the same as its truth value at r in
M′.
So, we get the same class of valid formulas whether we require the accessibility
relation to be total, or an equivalence relation. Things are easier if we make
it a total relation, because then we can simply drop talk of the accessibility
relation and define necessity as truth at all worlds. The corresponding clause
for possibility is:
The derived clauses for the other connectives remain the same:
I say “we can think of ⪯ as a similarity relation”, but take this with a grain
of salt—just as our definitions allow the members of W to be any old things, so
⪯ is allowed to be any old relation over W . Just as the members of W could be
fish, so ⪯ could be any old relation over fish. (But as before, if the
truth conditions for natural language counterfactuals have nothing in common
with the truth conditions for □→ statements in our models, the interest in our
semantics is diminished, since our models wouldn’t be modeling the behavior of
natural language counterfactuals.)
The constraints on the formal properties of the nearness relation—certain
of them, at least—seem plausible if ⪯ is to be thought of as
a similarity relation. C1 simply says that it makes sense to compare any two
worlds in respect of similarity to a given world. C2 has a transparent meaning.
C3 means “no ties”—it says that, relative to a given base world w, it is never
the case that there are two separate worlds x and y such that each is at least as
close to w as the other. C4 is the “base” axiom—it says that every world is at
least as close to itself as every other. Given C3, it has the further implication
that every world is closer to itself than every other. (We define “x is closer to w
than y is” (x ≺w y) to mean x ⪯w y and not: y ⪯w x.) C5 is called the “limit”
assumption: according to it, for any formula φ and any base world w, there
is some world that is a closest world to w in which φ is true (that is, unless φ
isn’t true at any worlds at all). This rules out the following possibility: there
are no closest φ worlds, only an infinite chain of φ worlds, each of which is
closer than the previous. Certain of these assumptions have been challenged,
especially C3 and C5. We will consider those issues below.
Note how condition C5 in the definition of an SC-model made reference
to the valuation function that we went on to define. This is in contrast to
our earlier definitions of models, in which the definition of a model made
no reference to the valuation function. The reason for this difference is that
constraint C5 (the limit assumption) is a constraint that relates the nearness
relation to the truth values of all formulas, complex or otherwise: it says that
any formula φ that is true somewhere is true in some closest-to-w world.
Given our definitions, we can define validity and semantic consequence:
Definitions of validity and semantic consequence:
· φ is SC-valid (SC φ) iff φ is true at every world in every SC-model
· Γ SC-semantically implies φ (Γ SC φ) iff for every SC-model and every
world w in that model, if every member of Γ is true at w then φ is also
true at w
Example 8.1: Let’s show that the formula (P ∧Q)→(P □→Q) is SC-valid.
We pick an arbitrary SC-model, 〈W , ⪯, I 〉, pick an arbitrary world r ∈ W , and
show that this formula is true at r :
iii) the truth condition for □→ says that P □→Q is true at r iff for every
closest P -world (to r ), Q is true as well. So since P □→Q is false at r ,
there must be a closest-to-r P -world at which Q is false—that is, there
is some world a such that:
a) V(P, a) = 1
b) for any x, if V(P, x) = 1 then a ⪯r x
c) V(Q, a) = 0
Example 8.2: Show that ⊨SC [(P □→Q)∧((P ∧Q)□→R)] → [P □→R]. (This
formula is worth taking note of, because it is valid despite its similarity to the
invalid formula [(P □→Q)∧(Q□→R)] → [P □→R].)
i) Suppose for reductio that P □→Q and (P ∧Q)□→R are true at r , but
P □→R is false there.
iii) Since P □→Q is true at r , Q is true in all the nearest-to-r P worlds, and
so V(Q, a) = 1.
a) φ⇒ψ ⊨SC φ□→ψ
b) φ□→ψ ⊨SC φ→ψ
8.5 Countermodels in SC
In this section we’ll learn how to construct countermodels in SC. Along the
way we’ll also look at how to decide whether a given formula is SC-valid or
SC-invalid. As with plain old modal logic, the best strategy is to attempt to
come up with a countermodel. If you fail, then you can use your failed attempt
to guide the construction of a validity proof.
We can use diagrams like those from section 6.3.4 to represent SC-countermodels.
The diagrams will be a little different, though. They will still contain
boxes (rounded now, to distinguish them from the old countermodels) in which
we put formulas; and we again indicate truth values of formulas with small
numbers above the formulas. But since there is no accessibility relation, we
don’t need the arrows between the boxes. And since we need to represent the
nearness relation, we will arrange the boxes vertically. At the bottom goes a box
for the world, r , of our model in which we’re trying to make a given formula
false. We string the other worlds in the diagram above this bottom world r :
the further away a world is from r in the ⪯r ordering, the further above r we
place it in the diagram. Thus, a countermodel for the formula ∼P →(P □→Q)
might look as follows:
[Diagram: three rounded boxes stacked vertically. Top box: world b , with P
and Q both assigned 1. Middle box: world a, with P assigned 1 and Q assigned
0. Bottom box: world r , with ∼P assigned 1, and P and P □→Q assigned 0. A
“no P ” marker runs alongside the stretch between r and a.]
In this diagram, the world we’re primarily focusing on is the bottom world,
world r. The nearest world to r is world r itself. The next nearest world to r is
the next world moving up from the bottom: world a. The furthest world from
r is world b. Notice that P is false at world r, and true at worlds a and b. Thus,
a is the nearest world to r in which P is true. Since Q is false at world a, that
makes the counterfactual P 2→Q false at world r . Since ∼P is true at r, the
entire material conditional ∼P →(P 2→Q) is false at world r, as desired. (World
b isn’t needed in this countermodel; I included it merely for illustration.) The
“no P ” sign to the left of worlds a and r is a reminder to ourselves in case we
want to add further worlds to the diagram: don’t include any worlds between a
and r in which P is true. Otherwise world a would no longer be the nearest P
world.
What strategy should one use for constructing SC-countermodels? As we
saw in section 6.3.4, a good policy is to make “forced” moves first. For example,
if you are committed to making a material conditional false at a world, go
ahead and make its antecedent true and consequent false in that world, right
away. In fact, a false counterfactual also forces certain moves. It follows from
the truth condition for the 2→ that if φ2→ψ is false at world w, then there
exists a nearest-to-w φ world at which ψ is false. So if you put a 0 overtop of
a counterfactual φ2→ψ in some world w, it’s good to do the following two
things right away. First, add a nearest-to-w world in which φ is true (if such a
world isn’t already present in your diagram). And second, make ψ false there.
True counterfactuals don’t force your hand quite so much, since there are
two ways for a counterfactual to be true. If φ2→ψ is true at w, then ψ must be
true at every nearest-to-w φ world. This could happen not only if there exists
a nearest-to-w φ world in which ψ is true, but also if there are no nearest-to-w
φ worlds at all, the vacuous case in which φ is true at no world.
Let's apply these strategies to find a countermodel for the invalid formula
[(P 2→Q)∧(Q2→R)] → [P 2→R]. To make it false at r we make its antecedent
true and its consequent false there: P 2→Q and Q2→R must be true at r, and
P 2→R false.
In keeping with the advice I gave a moment ago, let’s deal with the false
counterfactual first: let’s make P 2→R false in r. This means that we need to
add a nearest-to-r P world in which R is false. At this point, nothing prevents
us from making this world r itself, but that might collide with other things we
might want to do later, so I’ll make this nearest-to-r P world a distinct world
from r:
a:   P (1)   Q (1)   R (0)
---- (no P below a) ----
r:   P (0)   P 2→Q (1)   Q2→R (1)   P 2→R (0)   [(P 2→Q)∧(Q2→R)]→(P 2→R) (0)
“No P ” reminds me not to add any P -worlds between a and r. Since world r is
in the “no P zone”, I made P false there.
Notice that I made Q true in a. This is because P 2→Q is true in r . This
formula says that Q is true in the nearest-to-r P world; and a is the nearest-to-r
P world. In general, whenever you add a new world to one of these diagrams,
you should go back to all the counterfactuals in the bottom world and see
whether they require their consequents to have certain truth values in the new
world.
Now for the final counterfactual Q2→R. This can be true in two ways—
either there is no Q world at all (the vacuous case), or there is a nearest-to-r
Q world in which R is true. Q is already true in world a, so the vacuous
case is ruled out. So we must include a nearest-to-r Q world, call it “b”, and
make R true there. Where will we put this new world b? There are three
possibilities. World b could be farther away from, identical to, or closer to r
than a. (These are the only three possibilities, given anti-symmetry.) Let’s try
the first possibility:
b:   Q (1)   R (1)
---- (no Q below b) ----
a:   P (1)   Q (1)   R (0)
---- (no P below a) ----
r:   P (0)   Q (0)   [(P 2→Q)∧(Q2→R)]→(P 2→R) (0)
This doesn’t work, because world a is in the no-Q zone, but Q is true at world
a. Put another way: in this diagram, b isn’t the nearest-to-r Q world; world a
is. And so, since R is false at world a, the counterfactual Q2→R would come
out false at world r, whereas we want it to be true: we've got Q true at a nearer
world, namely a.
Likewise, we can’t make world b be identical to world a, since we need to
make R true in b and R is already false in a.
But the final possibility works out just fine—let world b be closer to r than
a:
a:   P (1)   Q (1)   R (0)
---- (no P below a) ----
b:   P (0)   Q (1)   R (1)
---- (no Q below b) ----
r:   P (0)   Q (0)   [(P 2→Q)∧(Q2→R)]→(P 2→R) (0)
Notice that I made P false in b, since b is in the no P zone. Here’s the official
model:
W = {r, a, b}
≼r = {〈b, a〉, …}
I (P, a) = I (Q, a) = I (Q, b) = I (R, b) = 1, all others 0
In giving the similarity relation for this official model, I left out a lot. First,
I left out some of the elements of ≼r. Fully written out, it would be:

≼r = {〈r, b〉, 〈b, a〉, 〈r, a〉, 〈r, r〉, 〈a, a〉, 〈b, b〉}

I left out 〈r, b〉 because it gets included automatically given the "base" assump-
tion (C4). Also, the element 〈r, a〉 is required to make ≼r transitive. The
elements 〈r, r〉, 〈a, a〉, and 〈b, b〉 were entered to make that relation reflexive.
(Why must it be reflexive? Because reflexivity comes from strong connectivity.
Let w and x be any members of W ; we get (x ≼w x or x ≼w x) from strong
connectivity of ≼w, and hence x ≼w x.) My policy will be to write out enough
of ≼r so that the rest can be inferred, given the definition of an SC-model. Sec-
ondly, this isn't a complete specification of ≼ itself; it is just ≼r. To be complete,
we'd need to write out ≼a and ≼b. But in this case, these latter two parts of ≼
don't matter, so I omitted them. (Later we'll consider cases where we need to
consider more of ≼ than simply ≼r.)
Example 8.3: Is the formula (P 2→R)→((P ∧Q)2→R) valid or invalid? (This
is the formula corresponding to the inference pattern of augmentation.) The
following countermodel shows that it is invalid:

a:   P ∧Q (1)   R (0)
---- (no P ∧Q below a) ----
b:   P (1)   Q (0)   R (1)
---- (no P below b) ----
r:   P 2→R (1)   (P ∧Q)2→R (0)   (P 2→R)→[(P ∧Q)2→R] (0)
I began with the false counterfactual, (P ∧Q)2→R. This forced the existence of a nearest
P ∧Q world, in which R was false. But since P ∧Q was true there, P was true
there; this ruled out the true P 2→R in r being vacuously true. So I was forced
to consider the nearest P world. It couldn’t be farther out than a, since P is
true in a. It couldn’t be a, since R was already false there. So I had to put it
nearer than a. Notice that I had to make Q false at b. Why? Well, it was in the
“no P ∧Q zone”, and I had made P true in it. Here’s the official model:
W = {r, a, b}
≼r = {〈b, a〉, …}
I (P, a) = I (Q, a) = I (P, b) = I (R, b) = 1, all else 0
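The augmentation countermodel can be checked the same way. Again this is my own sketch, not the text's; here the counterfactual evaluator takes sets of worlds, so that a conjunctive antecedent like P ∧Q is just a set intersection.

```python
# Checking that (P 2→ R) → ((P∧Q) 2→ R) is false at r in this model.
# A pair (x, y) means x ≼r y (x at least as close to r as y).
leq_r = {("r", "b"), ("b", "a"), ("r", "a"),
         ("r", "r"), ("a", "a"), ("b", "b")}
I = {"P": {"a", "b"}, "Q": {"a"}, "R": {"b"}}

def cf(ant_worlds, cons_worlds, leq):
    # True iff there is no antecedent-world, or the closest one satisfies
    # the consequent.
    nearest = [x for x in ant_worlds
               if all((x, y) in leq for y in ant_worlds)]
    return not ant_worlds or nearest[0] in cons_worlds

assert cf(I["P"], I["R"], leq_r)               # P 2→ R: true at r
assert not cf(I["P"] & I["Q"], I["R"], leq_r)  # (P∧Q) 2→ R: false at r
```

The nearest P world is b, where R holds; but the nearest P ∧Q world is a, where R fails, so augmentation fails in this model.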
Example 8.4: Determine whether ⊨SC 3P →[(P 2→Q)→∼(P 2→∼Q)]. An
attempt to find a countermodel fails at the following point:
a:   P (1)   Q (1)   ∼Q (1)
---- (no P below a) ----
r:   3P (1)   P 2→Q (1)   P 2→∼Q (1)   3P →[(P 2→Q)→∼(P 2→∼Q)] (0)
At world a, I’ve got Q being both true and false. A word about how I got
to that point. I noticed that I had to make two counterfactuals true: P 2→Q
and P 2→∼Q. Now, this isn’t a contradiction all by itself. Remember that
counterfactuals are vacuously true if their antecedents are impossible. So if P
were impossible, then both of these would indeed be true, without any problem.
But 3P has to be true at r. This rules out those counterfactuals' being vacuously
true. Since P is possible, the limit assumption has the result that there is a closest
P world. This then with the two true counterfactuals created the contradiction.
This reasoning is embodied in the following semantic validity proof:
iii) Since 3P is true at r , P is true at some world. So, by the limit assumption,
we have: there exists a world, a, such that V(P, a) = 1 and for any x, if
V(P, x) = 1 then a ≼r x. For short, a is a closest-to-r P world.
iv) The truth condition for 2→, applied to P 2→Q, gives us that Q is true
at all the closest-to-r P worlds.
Note the use of the limit assumption. It is the limit assumption we must use
when we need to know that there is a nearest φ-world, in cases where we can’t
get this knowledge from other things in the proof.
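The role of the limit assumption can be made concrete with a small sketch of my own (not the text's): in a finite model with a strongly connected ordering, a nearest φ-world can always be found by minimizing over the ordering, so the limit assumption is automatic; it only does real work when there are infinitely many worlds. The world names and ranks below are made up for illustration.

```python
# In a finite model, a nearest phi-world always exists, so the limit
# assumption holds automatically. Rank worlds by position in the ≼r
# ordering (0 = nearest to r).
rank = {"r": 0, "b": 1, "a": 2, "c": 3}

def nearest_phi_world(phi_worlds):
    """Return a closest-to-r phi-world, or None in the vacuous case."""
    if not phi_worlds:
        return None
    return min(phi_worlds, key=lambda w: rank[w])

assert nearest_phi_world({"a", "c"}) == "a"
assert nearest_phi_world(set()) is None   # no phi-world at all
```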
Cases where one counterfactual is nested within another call for something
new. Let’s consider how to show that [P 2→(Q2→R)]→[(P ∧Q)2→R] is SC-
invalid (this is the formula corresponding to “importation”). We begin by
making the formula false in r, the actual world of the model. This means making
the antecedent true and the consequent false. Now, since the consequent is a
false counterfactual, we are forced to make there be a nearest P ∧Q world in
which R is false:
a:   P ∧Q (1)   R (0)
---- (no P ∧Q below a) ----
b:   P (1)   Q (0)   Q2→R (1)
---- (no P below b) ----
r:   P 2→(Q2→R) (1)   (P ∧Q)2→R (0)   [P 2→(Q2→R)]→[(P ∧Q)2→R] (0)
To make the antecedent P 2→(Q2→R) true at r, the counterfactual Q2→R
must be true at b, the nearest-to-r P world. Evaluating Q2→R at b requires
b's own nearness relation; so we start a second diagram, the view from b, and
add a nearest-to-b Q world, c:

c:   Q (1)   R (1)
---- (no Q below c) ----
b:   P (1)   Q (0)   Q2→R (1)
I made there be a nearest-to-b Q world, and made R true there. Notice that
I kept the old truth values of b from the other diagram. This is because this
new diagram is a diagram of the same worlds as the old diagram; the difference
is that the new diagram represents the nearness relation ≼b, whereas the old
one represented a different relation: ≼r. Now, this diagram isn't finished. The
diagram is that of the ≼b relation, and that relation relates all the worlds in the
model. So, worlds r and a have to show up somewhere here. The safest practice
is to put them far away from b, so that there isn’t any possibility of conflict
with the no Q zone that has been established. Thus, the final appearance of
this part of the diagram is as follows:
r:   (far away; truth values as in the view from r)

a:   (far away)

c:   Q (1)   R (1)
---- (no Q below c) ----
b:   P (1)   Q (0)   Q2→R (1)
The old truth values from worlds r and a are still in effect (remember that this
is another diagram of the same model, but representing a different nearness
relation), but I left them out because they've already been written on the other
part of the diagram.
Notice that the order of the worlds in the r-diagram does not in any way
affect the order of the worlds on the b-diagram; the nearness relations of
different worlds are independent of one another.
It might, for example, seem odd that b is physically closer to a than to c in the
view from r, but not in the view from b. But remember that in any diagram,
only some of the features are intended to be genuinely representative. These
diagrams are in ink, but this is not intended to convey the idea that the worlds in
the model are made of ink. This feature of the diagram isn’t intended to convey
information about the model. Analogously, the fact that b is physically closer
to a than to c in the view from r is not intended to convey the information that,
in the model, b ≼a c. In fact, the diagram of the view from r is only intended to
convey information about ≼r; it doesn't carry any information about ≼a, ≼b, or
≼c.
Back to the countermodel. That other part of the diagram, the view from r,
must be updated to include world c. The safest procedure is to put c far away
on the model to minimize possibility of conflict. Thus, the final picture of the
view from r is:
c:   (far away; truth values as in the view from b)

a:   P ∧Q (1)   R (0)
---- (no P ∧Q below a) ----
b:   P (1)   Q (0)   Q2→R (1)
---- (no P below b) ----
r:   P 2→(Q2→R) (1)   (P ∧Q)2→R (0)   [P 2→(Q2→R)]→[(P ∧Q)2→R] (0)
Again, I haven’t re-written the truth values in world c, because they’re already
in the other diagram, but they are to be understood as carrying over. Now for
the official model:
W = {r, a, b, c}
≼r = {〈b, a〉, 〈a, c〉, …}
≼b = {〈c, a〉, 〈a, r〉, …}
I (P, a) = I (Q, a) = I (P, b) = I (Q, c) = I (R, c) = 1, all else 0
Notice that we needed to specify two of ≼'s subrelations: ≼r and ≼b. Remember
that any model has got to contain ≼i for every world i in the model. For
example, if we were to write out this model completely officially, we'd have to
specify ≼a and ≼c. But we don't bother with those parts of ≼ that don't matter.
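The nested evaluation can also be sketched in code. This is my own illustration, not the text's: each world carries its own nearness ordering, written here as a list from nearest to farthest, and the counterfactual clause is applied recursively.

```python
# Sketch of the nested-counterfactual countermodel for importation:
# [P 2→ (Q 2→ R)] → [(P∧Q) 2→ R], evaluated at r.
order = {"r": ["r", "b", "a", "c"],   # the ≼r ordering
         "b": ["b", "c", "a", "r"]}   # the ≼b ordering (≼a, ≼c don't matter)
I = {"P": {"a", "b"}, "Q": {"a", "c"}, "R": {"c"}}

def cf(ant, cons, w):
    """ant 2→ cons at w: scan w's ordering from nearest to farthest;
    the first ant-world found is the nearest one."""
    for x in order[w]:
        if ant(x):
            return cons(x)
    return True  # vacuous case: no ant-world

P = lambda w: w in I["P"]
Q = lambda w: w in I["Q"]
R = lambda w: w in I["R"]

antecedent = cf(P, lambda w: cf(Q, R, w), "r")    # P 2→ (Q 2→ R) at r
consequent = cf(lambda w: P(w) and Q(w), R, "r")  # (P∧Q) 2→ R at r
assert antecedent and not consequent              # importation fails
```

The inner call uses b's own ordering, just as the second diagram does: the nearest-to-r P world is b, and the nearest-to-b Q world is c, where R holds.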
b) [P 2→(Q→R)]→[(P ∧Q)2→R]
c) [P 2→(Q2→R)]→[Q2→(P 2→R)]
∼P ⊭SC P 2→Q
Q ⊭SC P 2→Q
Our semantics does indeed have this result, because the similarity metrics based
on different worlds can be very different. For example: consider a model with
worlds r and a, in which Q is true in the nearest-to-r P world, but in which Q
is false at the nearest-to-a P world. P 2→Q is true at r and false at a, whence
2(P 2→Q) is false at r.
8.6.3 No augmentation
In example 8.3 we produced a model containing a world in which P 2→Q was true
but (P ∧R)2→Q was false. Thus, P 2→Q ⊭SC (P ∧R)2→Q.
8.6.4 No contraposition
Let's show that P 2→Q ⊭SC ∼Q2→∼P :
a:   P (1)   ∼P (0)   Q (0)   ∼Q (1)
---- (no ∼Q below a) ----
b:   P (1)   Q (1)
---- (no P below b) ----
r:   P 2→Q (1)   ∼Q2→∼P (0)
8.6.6 No exportation
We have shown that the SC-semantics reproduces the logical features of natural
language counterfactuals discussed in section 8.1. In the next few sections we
discuss some further logical features of the SC-semantics, and compare them
with the logical features of the →, the ⇒, and natural language counterfactuals.
The → obeys exportation:
(φ∧ψ)→χ
φ→(ψ→χ )
But the ⇒ doesn't in any system; (P ∧Q)⇒R ⊭S5 P ⇒(Q⇒R). Nor does
the 2→; it can easily be shown with a countermodel that (P ∧Q)2→R ⊭SC
P 2→(Q2→R).
Does the natural language counterfactual obey exportation? Here is an
argument that it does not. The following is true:

If Bill had married both Laura and Hillary, he would have been a bigamist.

Suppose Bill had married Laura. Would it then have been true that: if he had
married Hillary, he would have been a bigamist? Well, let’s ask for comparison:
what would the world have been like, had George W. Bush married Hillary
Rodham Clinton? Would Bush have been a bigamist? Here the natural answer
is no. George W. Bush is in fact married to Laura Bush; but when imagining him
married to Hillary Rodham Clinton, we don’t hold constant his actual marriage.
We imagine him being married to Hillary instead. If this is true for Bush, then
one might think it’s also true for Bill in the counterfactual circumstance in
which he’s married to Laura: it would then have been true of him that, if he
had married Hillary, he wouldn’t have still been married to Laura, and hence
would not have been a bigamist.
It’s unclear whether this is a good argument, though, since it assumes that
ordinary standards for evaluating unembedded counterfactuals (“If George had
married Hillary, he would have been a bigamist”) apply to counterfactuals
embedded within other counterfactuals (“If Bill had married Hillary, he would
have been a bigamist” as embedded within “If Bill had married Laura then…”.)
Contrary to the assumption, it seems most natural to evaluate the consequent of
an embedded counterfactual by holding its antecedent constant. But a defender
of the SC semantics might argue that the second displayed counterfactual
above has a reading on which it is false (recall the context-dependence of
counterfactuals), and hence that we need a semantics that allows for the failure
of exportation.
8.6.7 No importation
Importation holds for →, and for ⇒ in T and stronger systems:
φ→(ψ→χ )          φ⇒(ψ⇒χ )
(φ∧ψ)→χ           (φ∧ψ)⇒χ
but not for the 2→: above we produced an SC-model with a world in which
the conditional [P 2→(Q2→R)]→[(P ∧Q)2→R] was false.
The status of importation for natural language counterfactuals is similar to
the status of exportation. One can argue that the following is true, at least on
one reading:
8.6.9 No transposition
Transposition governs the →:
φ→(ψ→χ )
ψ→(φ→χ )
but not the ⇒ (in any of our modal systems); P ⇒(Q⇒R) ⊭S5 Q⇒(P ⇒R). Nor
does it govern the 2→; it's easy to show that P 2→(Q2→R) ⊭SC Q2→(P 2→R).
The status of transposition for natural language counterfactuals is sim-
ilar to that of importation and exportation. If we can ignore the effects of
embedding on the evaluation of counterfactuals, then we have the following
counterexample to transposition. It is true that:
So far we have only discussed features of Stalnaker's system that are shared by Lewis's.
Let's turn, now, to the differences.
Lewis challenges Stalnaker's assumption that the nearness relation ≼w is always
anti-symmetric. Real similarity relations permit ties; so it seems implausible to
rule out the possibility of two worlds being equally similar to a given world.
The challenge to Stalnaker here is most straightforward if Stalnaker intends
to be giving truth conditions for natural language counterfactuals, rather than
merely doing model theory. In that case, the set W in an SC-model must be the
set of genuine possible worlds, and ≼ must be a relation of genuine similarity,
in which case it ought to admit ties. But even if Stalnaker is not doing this,
the objection may yet have bite, to the extent that the semantics of natural
language conditionals is like similarity-theoretic semantics.5
The validity of certain formulas depends on the "no ties" assumption; the
following two wffs are SC-valid, but are challenged by Lewis:

(P 2→Q) ∨ (P 2→∼Q)                       (conditional excluded middle)
[P 2→(Q∨R)] → [(P 2→Q)∨(P 2→R)]          (distribution)

Take the first one, for example. Suppose you gave up anti-symmetry, thereby
allowing ties. Then the following would be a countermodel for the law of
conditional excluded middle:
a:   P (1)   Q (0)   ∼Q (1)        b:   P (1)   Q (1)        (a and b tied for nearest)
---- (no P below a and b) ----
r:   P 2→Q (0)   P 2→∼Q (0)   (P 2→Q)∨(P 2→∼Q) (0)
Remember that P 2→Q is true only if Q is true in all the nearest P worlds.
In this model, Q is true in one of the nearest P worlds, but not all, so that
counterfactual is false at r. Similarly for P 2→∼Q.
A similar model shows that distribution fails if the “no ties” assumption is
given up.
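Here is a sketch of my own (not the text's) of how ties break conditional excluded middle. Once two nearest P worlds are tied, the tie-tolerant truth condition ("ψ true at all nearest φ worlds") makes both counterfactuals false:

```python
# a and b are tied nearest P-worlds; Q holds at b only.
# A pair (x, y) means x is at least as close to r as y.
leq_r = {("r", "a"), ("r", "b"), ("a", "b"), ("b", "a"),
         ("r", "r"), ("a", "a"), ("b", "b")}
I = {"P": {"a", "b"}, "Q": {"b"}}

def cf(ant_worlds, cons, leq):
    # True iff cons holds at ALL nearest ant-worlds (tie-tolerant clause).
    nearest = [x for x in ant_worlds
               if all((x, y) in leq for y in ant_worlds)]
    return all(cons(x) for x in nearest)

assert not cf(I["P"], lambda w: w in I["Q"], leq_r)      # P 2→ Q false at r
assert not cf(I["P"], lambda w: w not in I["Q"], leq_r)  # P 2→ ∼Q false at r
```

Both disjuncts of conditional excluded middle fail at r, so the disjunction fails too.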
So, should we give up conditional excluded middle? As Lewis concedes,
the principle is initially plausible. An equivalent formulation of conditional
excluded middle is ∼(P 2→Q)→(P 2→∼Q).

5 For an interesting response to Lewis, see Stalnaker (1981).
But everyone agrees that (P 2→∼Q)→∼(P 2→Q) is always true, at least, when
P is possibly true. So, in cases where P is possibly true anyway, the question
of whether conditional excluded middle is valid is the question of whether
∼(P 2→Q) and P 2→∼Q are equivalent to each other. And it does indeed
seem that in ordinary usage, one expresses the negation of a counterfactual
by negating its consequent. To deny the counterfactual “if she had played,
she would have won", one says "no, she wouldn’t have!”, meaning “if she had
played, she would not have won”.
And take the other formula validated by Stalnaker’s theory, distribution. In
reply to: “if the coin had been flipped, it would have come out either heads or
tails”, one might ask: “which would it have been, heads or tails?”. The thinking
behind the reply is that “if the coin had been flipped, it would have come up
heads”, or “if the coin had been flipped, it would have come up tails” must be
true.
So there’s some plausibility to both these formulas. But Lewis says two
things. The first is metaphysical: if we’re going to accept the similarity analysis,
we’ve got to give them up, because ties just are possible. The second is purely
semantic: the intuitions aren’t completely compelling. About the coin-flipping
case, Lewis denies that if the coin had been flipped, it would have come up
heads, and he also denies that if the coin had been flipped, it would have come
up tails. Rather, he says, if it had been flipped, it might have come up heads.
And if it had been flipped, it might have come up tails. But neither outcome is
such that it would have resulted, had the coin been flipped.
Concerning excluded middle, Lewis says:
It is not the case that if Bizet and Verdi were compatriots, Bizet would be
Italian; and it is not the case that if Bizet and Verdi were compatriots, Bizet
would not be Italian; nevertheless, if Bizet and Verdi were compatriots,
Bizet either would or would not be Italian. (Counterfactuals, p. 80.)
Lewis can follow this up by noting that if Bizet and Verdi were compatriots,
Bizet might be Italian, but it’s not the case that if they were compatriots, he
would be Italian.
Here is a related complaint of Lewis’s about Stalnaker’s semantics. In the
last little bit, I've used English phrases of the form "if it were the case that φ,
then it might have been the case that ψ". This conditional Lewis calls "the
'might' counterfactual"; he symbolizes it as φ3→ψ, and defines it thus:

φ3→ψ =df ∼(φ2→∼ψ)
Lewis criticizes Stalnaker's system on the grounds that this definition of 3→ doesn't
work there. Why not? Well, since internal negation is valid in
Stalnaker’s system, φ3→ψ would always imply φ2→ψ—not good, since the
might-conditional in English seems weaker than the would-conditional. So,
Lewis’s definition of 3→ doesn’t work in Stalnaker’s system. Moreover, there
doesn’t seem to be any other plausible definition. So, Stalnaker can’t define
3→.6
Lewis also objects to Stalnaker's limit assumption. The following line is
less than one inch long:

______

Now consider the counterfactual "if the line were over one inch long, it would
be over ten inches long". Seems false. But if we use Stalnaker's truth conditions as truth conditions for
natural language counterfactuals, and take our intuitive judgments of similarity
seriously, we seem to get the result that it is true! The reason is that there
doesn’t seem to be a closest world in which the line is more than one inch long.
For every world in which the line is, say, 1 + k inches long, there's another
world in which the line has a length closer to its actual length but still more
than one inch long: say, 1 + k/2 inches. So there doesn't seem to be any closest
world in which the line is over one inch long.
In light of these criticisms, Lewis proposes a new similarity-based seman-
tics for counterfactuals, which assumes neither anti-symmetry nor the limit
assumption. Let’s look at that system.
6 Lewis (1973, p. 80).
· W is a nonempty set
· I is a function that assigns either 0 or 1 to each sentence letter relative
to each member of W
· ≼ is a three-place relation over W
· The valuation function, VM , for M (see below) and ≼ satisfy the following
constraints, where α is any sentence letter, φ and ψ are any wffs, and w is any
member of W :
· VM (α, w) = I (α, w)
· VM (∼φ, w) = 1 iff VM (φ, w) = 0
· VM (φ→ψ, w) = 1 iff either VM (φ, w) = 0 or VM (ψ, w) = 1
· VM (2φ, w) = 1 iff for any v, VM (φ, v) = 1
· VM (φ2→ψ, w) = 1 iff EITHER φ is true at no worlds, OR: there is
some world, x, such that VM (φ, x) = 1 and for all y, if y ≼w x then
VM (φ→ψ, y) = 1
It may be verified that every LC-valid wff is SC-valid.9 The converse is not
true, as the discussion of conditional excluded middle in the previous section
shows.
Comments on all this: First, notice that the limit and anti-symmetry con-
ditions are simply dropped. Second, the Base condition is modified; now it
says that no other world is as close to a given world as it is to itself. Before, it said that each world
is at least as close to itself as any other. Stalnaker’s Base condition, plus anti-
symmetry, entails the present Base condition. But Lewis’s system doesn’t have
anti-symmetry, so the Base condition must be stated in the stronger form.10
Third, let’s think about what the truth condition for the 2→ says. First,
there’s the vacuous case: if φ is necessarily false then φ2→ψ comes out true.
But if φ is possibly true, then what the clause says is this: φ2→ψ is true at
w iff there’s some φ world where ψ is true, such that no matter how much
closer to w you go, you’ll never get a φ world where ψ is false. If there is a
nearest-to-w φ world, then this implies that φ2→ψ is true at w iff ψ is true in
all the nearest-to-w φ worlds.
So, thinking of these as truth-conditions for natural-language counterfac-
tuals for a moment, recall the sentence:
9 Let's say that an LC model is "Stalnaker-acceptable" iff it obeys the limit and anti-symmetry
assumptions. Suppose that φ is LC-valid. Then it’s true in all Stalnaker acceptable LC-models.
Now, notice that in Stalnaker-acceptable models, Lewis’s truth-conditions for formulas yield
the same results as Stalnaker’s (exercise 8.3). So, φ must be true in all SC-models.
10 Why do we want to prohibit worlds being just as close to w as w is to itself? So that P ∧Q
semantically implies P 2→Q. Otherwise P ∧Q could be true at w while P ∧∼Q was true at
some world as close to w as w is to itself, in which case P 2→Q would turn out false at w.
If the line were over one inch long, it would be over ten
inches long.
There’s no nearest world in which the line is over one inch long, only an infinite
series of worlds where the line has lengths getting closer and closer to one
inch long. But this doesn’t make the counterfactual true. A counterfactual is
vacuously true if its antecedent is impossible, but this antecedent is possible. So the only
way the counterfactual could be true is if the second part of the definition is
satisfied—if, that is, there is some world, x, such that the antecedent is true
at x, and the material conditional (antecedent→consequent) is true at every
world at least as similar to the actual world as is x. Since the “at least as similar
as” relation is reflexive, this can be rewritten thus:
· for some world, x, the antecedent and consequent are both true at x, and
the material conditional (antecedent→consequent) is true at every world
at least as similar to the actual world as is x
So, is there any such world, x? No. For let x be any world at which the
antecedent and consequent are both true—i.e., any world in which the line is
over ten inches long. We can always find a world that is more similar to the
actual world than x in which the material conditional (antecedent→consequent)
is false: just choose a world just like x but in which the line is only, say, two
inches long.
Let’s see how Lewis’s theory works in the case of a true counterfactual, for
instance:
If I were more than six feet tall, then I would be less than
nine feet tall
(I am, in fact, less than six feet tall.) The situation here is similar to the previous
example in that there is no nearest world in which the antecedent is true. But
now, we can find a world x, in which the antecedent and consequent are both
true, and such that the material conditional (antecedent→ consequent) is true
in every world at least as similar to the actual world as is x. Simply take x to
be a world just like the actual world but in which I am, say, six-feet-one. Any
world that is at least as similar to the actual world as this world must be one in
which I’m less than nine feet tall; so in any such world the material conditional
(I’m more than six feet tall→I’m less than nine feet tall) is true.
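Both verdicts can be reproduced with a toy finite model of my own devising (not the text's): worlds are labeled by line lengths, and similarity to the actual world is just closeness of length. The sample of lengths and the helper names are assumptions of the sketch.

```python
# Lewis's truth condition on a toy model whose worlds are line-lengths.
lengths = [0.9, 1.5, 2.0, 5.0, 11.0, 20.0]   # finite sample of worlds
actual = 0.9                                  # the line is under one inch

def at_least_as_close(x, y):
    """x is at least as similar to the actual world as y is."""
    return abs(x - actual) <= abs(y - actual)

def lewis_cf(ant, cons):
    """Lewis: vacuously true if no ant-world; else true iff some ant-world x
    is such that ant→cons holds at every world at least as close as x."""
    ant_worlds = [w for w in lengths if ant(w)]
    if not ant_worlds:
        return True
    return any(all((not ant(y)) or cons(y)
                   for y in lengths if at_least_as_close(y, x))
               for x in ant_worlds)

# "If the line were over one inch long, it would be over ten inches long":
# false, since a closer over-one-inch world falsifies the material conditional.
assert not lewis_cf(lambda w: w > 1, lambda w: w > 10)
# By contrast, "...it would be under ten inches long" comes out true here.
assert lewis_cf(lambda w: w > 1, lambda w: w < 10)
```

With only finitely many worlds a nearest antecedent-world always exists, so the sketch can't reproduce the infinite descent itself; it only illustrates how each candidate witness is defeated by a closer world in the sample.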
From this we may obtain a derived clause for the truth conditions of φ3→ψ:

VM (φ3→ψ, w) = 1 iff VM (φ, x) = 1 for some world x, and for each world y
such that VM (φ, y) = 1, there is some world z such that z ≼w y and
VM (φ∧ψ, z) = 1

That is, φ3→ψ is true at w iff φ is possible, and for any φ world, there's a
world as close or closer to w in which φ and ψ are both true. In cases where
there is a nearest φ world, this means that ψ must be true in at least one of the
nearest φ worlds.
Exercise 8.3 Show that in any Lewis model in which the limit
and anti-symmetry conditions hold, Lewis’s truth conditions reduce
to Stalnaker’s. That is, in any such model, a wff counts as being
true at a given world given Lewis’s definition of truth in a model if
and only if it counts as being true at that world given Stalnaker’s
definition.
In general, one is entitled to conclude from “If P or Q had been the case,
then R would have been the case” that “if P had been the case, R would have
been the case”. If Butch Cassidy and the Sundance Kid could have survived by
surrendering, they certainly would not say to each other “If we had surrendered
or tried to run away, we would have been shot”.
Is this a problem for Lewis and Stalnaker? Some have argued this, but
others respond as follows. One must take great care in translating from natural
language into logic. For example,12 no one would want to criticize the law
∼∼P →P on the grounds that “There ain’t no way I’m doing that” doesn’t
imply that I might do that. And there are notorious peculiar things about the
behavior of ‘or’ in similar contexts. Consider:
One can argue that this does not have the form:
After all, suppose that you are permitted to stay, but not to go. If you stay, you
can’t help doing the following act: staying ∨ going. So, surely, you’re permitted
to do that. So, the second sentence is true. But the first isn’t; if someone uttered
it to you when you were in jail, they’d be lying to you! It really means:
Similarly, “If either P or Q were true then R would be true” seems usually to
mean “If P were true then R would be true, and if Q were true then R would be
true”. We can’t just expect natural language to translate directly into our logical
language—sometimes the surface structure of natural language is misleading.
12 The example is adapted from Loewer (1976).
Chapter 9
Quantified Modal Logic
CHAPTER 9. QUANTIFIED MODAL LOGIC 244
structure; the move to modal propositional logic let us analyze modal structure.
Moving to QML lets us do all three at once, as with:
The second, de dicto, sentence makes the true claim that in any possible world,
anyone that is in that world a bachelor is, in that world, male. The first, de
re, sentence makes the false claim that if any object, u, is a bachelor in the
actual world, then that object u is necessarily a bachelor—i.e., the object u is a
bachelor in all possible worlds.
What do the following English sentences mean?
Surface grammar suggests that they would mean the de re claim that each
bachelor is such that he is necessarily male. But in fact, it’s very natural to
hear these sentences as making the de dicto claim that it’s necessary that all
bachelors are male.
The de re/de dicto distinction also emerges with definite descriptions. This
may be illustrated by using Russell’s theory of descriptions (section 5.3.3). Re-
call how Russell’s method generated two possible symbolizations for sentences
containing definite descriptions and negations. “The striped bear is not dan-
gerous”, for example, can be symbolized as either of the following, depending
on whether the definite description is given wide or narrow scope relative to
the negation operator:
(The second denies the existence of something that is both i) the one and only
striped bear, and ii) dangerous; the first says that there exists something that
is the one and only striped bear, and adds that this bear is non-dangerous.) A
similar phenomenon arises with sentences containing definite descriptions and
modal operators. There are two symbolizations of “The number of the planets
is necessarily odd” (letting “N x” mean that x numbers the planets):
(Let's count Pluto as a planet.) The second is de dicto; it says that it's necessary
that: there is one and only one number of the planets, and that number is odd.
This claim is false, since there could have been eight planets. The first is de
re; it says that (in fact) there is one and only one number of the planets, and
that that number is necessarily odd. That’s true, I suppose: the number nine
(the number that in fact numbers the planets) is necessarily odd.
Natural language sentences containing both definite descriptions and modal
operators are perhaps ambiguous. “The number of the planets is necessarily
odd” is naturally heard as expressing a de re claim; but “The American president
is necessarily an American citizen” can be heard as expressing a de dicto claim.
Recall that our semantics for modal propositional logic assigned truth values to
sentence letters relative to possible worlds. We have something similar here: we
relativize the interpretation of predicates to possible worlds. The interpretation
of a two-place predicate, R, for example is a set of ordered triples, two members
of which are in the domain, and one member of which is a possible world.
When 〈u1 , u2 , w〉 is in the interpretation of R, that represents R’s applying to
u1 and u2 in possible world w. In a possible worlds setting, this relativization
makes intuitive sense: a predicate can apply to some objects in one possible
world but fail to apply to those same objects in some other possible world.
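A minimal sketch of this relativization (my own; the objects, worlds, and names are made up for illustration):

```python
# A two-place predicate's interpretation as a set of triples <u1, u2, w>,
# so it can apply to a pair of objects at one world and fail to apply to
# the same pair at another.
I_R = {("u", "v", "w1")}   # R applies to <u, v> at world w1 only

def R_holds(u1, u2, w):
    return (u1, u2, w) in I_R

assert R_holds("u", "v", "w1")
assert not R_holds("u", "v", "w2")   # same objects, different world
```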
Notice that the interpretations of constants are not relativized in any way to
possible worlds. The interpretation I assigns simply a member of the domain
to a name. This reflects the common belief that natural language proper
names—which constants are intended to represent—are rigid designators, i.e.,
terms that have the same denotation relative to every possible world (see Kripke
(1972).) We’ll discuss the significance of this feature of our semantics below.
On to the definition of the valuation function for an SQML-model. First,
we keep the definition of a variable assignment from nonmodal predicate logic
(section 4). Our variable assignments therefore assign members of the domain
to variables absolutely, rather than relative to worlds. (This is an appropriate
choice given our choice to assign constants absolute semantic values.) But the
valuation function will now relativize truth values to possible worlds (as well
as to variable assignments). After all, the sentence ‘F a’, if it represents “Ted is
tall”, should vary in truth value from world to world.
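The world-relative treatment of predicates and the absolute treatment of names can be made concrete with a small model checker. This is a sketch of my own, not the book’s notation: the tuple encoding of formulas, the function name `value`, and the toy model are all assumptions for illustration. Since SQML models have no accessibility relation, the modal clauses quantify over all worlds.

```python
# A minimal SQML model checker, sketching the semantics just described.
# Formulas are nested tuples, e.g. ('box', ('atom', 'F', ('a',))).

def value(model, g, wff, w):
    """Truth value of wff at world w, under variable assignment g."""
    W, D, I = model['W'], model['D'], model['I']
    op = wff[0]
    if op == 'atom':                        # e.g. ('atom', 'F', ('a', 'x'))
        _, pred, terms = wff
        # names are interpreted absolutely (via I); variables via g
        denote = lambda t: I[t] if t in I else g[t]
        return tuple(map(denote, terms)) + (w,) in I[pred]
    if op == 'not':
        return not value(model, g, wff[1], w)
    if op == 'arrow':
        return (not value(model, g, wff[1], w)) or value(model, g, wff[2], w)
    if op == 'box':     # true at w iff true at every world (no accessibility)
        return all(value(model, g, wff[1], v) for v in W)
    if op == 'dia':     # true at w iff true at some world
        return any(value(model, g, wff[1], v) for v in W)
    if op in ('forall', 'exists'):          # quantify over the one domain D
        _, var, body = wff
        test = all if op == 'forall' else any
        return test(value(model, dict(g, **{var: u}), body, w) for u in D)
    raise ValueError(op)

# 'F a' varies in truth value from world to world, as the text notes:
M = {'W': {'r', 'a'}, 'D': {'u'},
     'I': {'a': 'u', 'F': {('u', 'r')}}}   # F applies to u at r only
print(value(M, {}, ('atom', 'F', ('a',)), 'r'),
      value(M, {}, ('atom', 'F', ('a',)), 'a'))
```

Note how the interpretation of the name carries no world coordinate, while the predicate’s extension does: this is exactly the rigid-designation point made above.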
The derived clauses are what you’d expect, including the following one for
3:

· VM,g(3φ, w) = 1 iff for some v ∈ W, VM,g(φ, v) = 1

(Every world counts, since SQML models have no accessibility relation.)
Finally, we have:
i) suppose for reductio that (for some model, world r , and variable assign-
ment g ,) Vg (3∃x(x=a∧2F x)→F a, r ) = 0. Thus Vg (3∃x(x=a∧2F x), r ) =
1 and …
ii) …Vg (F a, r ) = 0
Notice that in line xii) I inferred that Vg assigned “F a” truth at r . I could have
subscripted ‘V’ with any variable assignment, since the truth condition for the
formula “F a” is the same, regardless of the variable assignment; I picked g
because that’s what I needed to get the contradiction.
[Diagram: world r, with (3F a∧3Ga)→3(F a∧Ga) marked false: the antecedent and its conjuncts 3F a and 3Ga marked true, each with an understar, and the consequent 3(F a∧Ga) marked false, with an overstar.]
The understars make us create two new worlds:
[Diagram: as above, plus two new worlds, a and b, with F a marked true in a and Ga marked true in b.]
We must then discharge the overstar from the false diamond in each world
(since every world is accessible to every other world in our models):
[Diagram: the overstar is now discharged in every world: F a∧Ga is marked false in r, a, and b; in a, where F a is true, Ga is false; in b, where Ga is true, F a is false; and in r, F a is marked false.]
(I had to make either F a or Ga false in r—I chose F a arbitrarily.) Now, we’ve
indicated the truth values that we want the atomics to have. How do we make
the atomics have these truth values in the picture?
We do this by introducing a domain for the model, and stipulating what the
names refer to and what objects are in the extensions of the predicates. Let’s
use letters like ‘u’ and ‘v’ as the members of the domain in our models. Now, if
we let the name ‘a’ refer to (the letter) u, and let the extension of F in world r
be {} (the empty set), then the truth value of ‘F a’ in world r will be 0 (false),
since the denotation of a isn’t in the extension of F at world r. Likewise, we
need to put u in the extension of F (but not in the extension of G) in world
a, and put u in the extension of G (but not in the extension of F ) in world b.
This all may be indicated on the diagram as follows:
[Diagram: as above, with a: u indicated in boldface at the top of the model, F: {} listed in world r, F: {u} and G: {} listed in world a, and F: {} and G: {u} listed in world b.]
Within each world I’ve included a specification of the extension of each
predicate. But the specification of the referent of the name ‘a’ does not go within
any world; it was rather indicated (in boldface) at the top of the model. This is
because names, unlike predicates, get assigned semantic values absolutely in a
model, not relative to worlds.
Time for the official model:
W = {r, a, b}
D = {u}
I (a) = u
I (F ) = {〈u, a〉}
I (G) = {〈u, b〉}
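We can check mechanically that this official model does its job; here is a sketch (the helper names are mine), exploiting the fact that in SQML a 3-formula is true at a world iff its immediate subformula is true at some world:

```python
# Check the official countermodel to (3Fa ∧ 3Ga) → 3(Fa ∧ Ga).
W = {'r', 'a', 'b'}
I_F = {('u', 'a')}        # I(F)
I_G = {('u', 'b')}        # I(G)
a_denotes = 'u'           # I(a)

Fa = lambda w: (a_denotes, w) in I_F
Ga = lambda w: (a_denotes, w) in I_G
dia = lambda p: any(p(v) for v in W)     # 3: true at some world

antecedent = dia(Fa) and dia(Ga)                 # 3Fa ∧ 3Ga
consequent = dia(lambda w: Fa(w) and Ga(w))      # 3(Fa ∧ Ga)
# antecedent true, consequent false: the conditional is false at r
print(antecedent, consequent)
```

Since neither 3-claim depends on the world of evaluation, the conditional comes out false at every world, in particular at r.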
[Diagram: world r, with 2∃xF x→∃x2F x marked false: the antecedent 2∃xF x true with an overstar, the consequent ∃x2F x false with an overplus.]
The overstar above the 2 in the antecedent must be discharged in r itself, since,
remember, every world sees every world in these models. That gives us a true
existential. Now, a true existential is a bit like a true 3—the true ∃xF x means
that there must be some object u from the domain that’s in the extension of F
in r. I’ll put a + under true ∃s and false ∀s, to indicate a commitment to some
instance of some sort or other. Analogously, I’ll indicate a commitment to all
instances of a given type (which would arise from a true ∀ or a false ∃) with a +
above the connective in question.
OK, how do we make ∃xF x true in r? By making “F x” true for some
value of x. Let’s put the letter u in the domain, and make “F x” true when u is
assigned to x. We’ll indicate this by putting a 1 overtop of “F[u/x]” in the diagram.
Now, “F[u/x]” isn’t a formula of our language—what it indicates is that “F x” is to
be true when u is assigned to x. And to make this come true, we treat it as an
atomic—we put u in the extension of F at r:
[Diagram: world r, as before, plus F[u/x] marked true and F: {u} listed in r.]
Good. Now we’ve got to attend to the overplus, the + sign overtop the false
∃x2F x. Since it’s a false ∃, we’ve got to make 2F x false for every object in the
domain (otherwise—if there were something in the domain for which 2F x was
true—∃x2F x would be true after all). So far, we’ve got only one object in our
domain, u, so we’ve got to make 2F x false, when u is assigned to the variable
‘x’. We’ll indicate this on the diagram by putting a 0 overtop of “2F[u/x]”:
[Diagram: world r, with 2∃xF x→∃x2F x, F[u/x] marked true, 2F[u/x] marked false with an understar, and F: {u} listed.]
Ok, now we have an understar, which means we should add a new world to
our model. When doing so, we’ll need to discharge the overstar from the
antecedent. We get:
[Diagram: as above, plus a new world a, in which ∃xF x is marked true with an underplus, F[u/x] is marked false, F[v/x] is marked true, and F: {v} is listed.]
This move requires some explanation. Why the v? Well, I was required to
make F x false, with u assigned to x, which means keeping u out of the
extension of F at a. Easy enough, right? Just make F ’s extension {}? Well,
no—because of the true 2 in r, I’ve got to make ∃xF x true in a. But that means
that something’s got to be in F ’s extension in a! It can’t be u, so I’ll add a new
object, v, to the domain, and put it in F ’s extension in a.
But adding v to the domain of the model adds a complication. We had
an overplus in r—over the false ∃. That meant that, in r, for every member of
the domain, 2F x is false. So, 2F x is false in r when v is assigned to x. That
creates another understar, requiring the creation of a new world. The model
then looks as follows:
[Diagram: world r, with 2∃xF x→∃x2F x, F[u/x] true, and 2F[u/x] and 2F[v/x] both false, each with an understar, F: {u} listed; world a, with ∃xF x true (underplus), F[u/x] false, F[v/x] true, and F: {v} listed; and world b, with ∃xF x true (underplus), F[v/x] false, F[u/x] true, and F: {u} listed.]
(Notice that we needn’t have made another world b—we could simply have
discharged the understar on r.)
Ok, here’s the official model:
W = {r, a, b}
D = {u, v}
I (F ) = {〈u, r〉, 〈u, b〉, 〈v, a〉}
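Again we can verify the model directly; a sketch (helper names assumed), unpacking 2∃xF x as “at every world, something is F there” and ∃x2F x as “something is F at every world”:

```python
# Check the official countermodel to 2∃xFx → ∃x2Fx.
W = {'r', 'a', 'b'}
D = {'u', 'v'}
I_F = {('u', 'r'), ('u', 'b'), ('v', 'a')}   # I(F)

# 2∃xFx: at every world, some object is in F's extension there
box_ex = all(any((d, w) in I_F for d in D) for w in W)
# ∃x2Fx: some object is in F's extension at every world
ex_box = any(all((d, w) in I_F for w in W) for d in D)
print(box_ex, ex_box)   # antecedent true, consequent false
```

The antecedent is witnessed by u at r and b and by v at a, but no single object is F everywhere, just as the diagram construction required.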
Exercise 9.1 For each formula, give a validity proof if the wff is
SQML-valid, and a countermodel if it is invalid.
a) 3∀xF x→∃x3F x
b) ∃x3Rax→32∃x∃yRxy
∀x∀y(x=y→2(x=y))
When we try to make the formula false by putting a 0 over the initial ∀, we get
an underplus. So we’ve got to make the inside part, ∀y(x=y→2x=y), false
for some value of x. We do this by putting some object u in the domain, and
letting that be the value of x for which ∀y(x=y→2x=y) is false. We get:
[Diagram: world r, with ∀x∀y(x=y→2(x=y)) marked false with an underplus, and its instance ∀y(x=y→2(x=y)), with u assigned to x, also marked false with an underplus.]
Now we need to do the same thing for our new false universal: ∀y(x=y→2x=y).
For some value of y, the inside conditional has to be false. But that means that
the antecedent must be true. So the value for y has to be u again. We get:
[Diagram: as above, plus the instance x=y→2(x=y), with u assigned to both x and y, marked false: its antecedent x=y marked true, its consequent 2(x=y) marked false with an understar.]
The understar now requires creation of a world in which x=y is false, when
both x and y are assigned u. But there cannot be any such world! An identity
sentence is true (at any world) if the denotations of the terms are identical. Our
attempt to find a countermodel has failed; we must do a validity proof. Consider
any SQML model 〈W , D, I 〉, any r ∈ W , and any variable assignment g ; we’ll
show that Vg (∀x∀y(x=y→2x=y), r ) = 1:
v) …Vg[u/x, v/y](2(x=y), r) = 0
Notice at the end how the particular world at which the identity sentence was
false didn’t matter. The truth condition for an identity sentence is simply that
the terms denote the same thing; it doesn’t matter what world this is evaluated
relative to.¹

[Footnote 1: A note about variables. In validity proofs, I’m using italicized ‘u’ and ‘v’ as variables to range over objects in the domain of the model I’m considering. So, a sentence like ‘u = v’ might be true, just as the sentence ‘x=y’ of our object language can be true. But when I’m doing countermodels, I’m using the roman letters ‘u’ and ‘v’ as themselves being members of the domain, not as variables ranging over members of the domain. Since the letters ‘u’ and ‘v’ are different letters, they are different members of the domain. Thus, in a countermodel with letters in the domain, if the denotation of a name ‘a’ is the letter ‘u’, and the denotation …]
∀x2F x→2∀xF x
[Diagram: world r, with ∀x2F x→2∀xF x marked false: antecedent ∀x2F x true with an overplus, consequent 2∀xF x false with an understar; and world a, with ∀xF x marked false with an underplus, F[u/x] marked false, and F: {} listed.]
When you have a choice between discharging over-things and under-things,
whether plusses or stars, always do the under things first. In this case, this
means discharging the understar and ignoring the over-plus for the moment.
So, discharging the understar gave us world a, in which we made a universal
false. This gave an underplus, and forced us to make an instance false. So I put
object u in our domain, and kept it out of the extension of F in a. This makes
F x false in a, when x is assigned u.
But now, I need to discharge the overplus in r. I must make 2F x true for
every member of the domain, including u, which is now in the domain. But
then this requires F x to be true, when u is assigned to x, in a:
[Diagram: world r, with ∀x2F x→2∀xF x as before, 2F[u/x] marked true, and F: {u} listed; world a, with ∀xF x false (underplus) and F[u/x] marked both false and true—a contradiction—with F: {?} listed.]
So, we fail to get a model. Time for a validity proof; let’s show that every
instance of the Barcan schema is valid:
v) from i), for every member of the domain, and so for u in particular,
Vg[u/α](2φ, r) = 1.
vi) thus, for every world, and so for w in particular, Vg[u/α](φ, w) = 1. Contradicts iv).
The validity of the Barcan formula in our semantics is infamous because the
Barcan formula seems, intuitively, to be invalid. To see why, we need to think a
bit about the intuitive significance of the relative order of quantifiers and modal
operators. Consider the difference between the following two sentences:
3∃xF x
∃x3F x
In general, a sentence of the form 3φ says that it’s possible for the component
sentence, φ, to be true. So the first of our two sentences, 3∃xF x, says that
it’s possible for “∃xF x” to be true. That is: it’s possible for there to exist an F .
What about the second sentence? In general, a sentence that begins without
a modal operator in front makes a statement about the actual world. Thus,
a statement that begins with “∃x . . . ” is saying that there exists, in the actual
world, an object x, such that…. Our second statement, then, says that there
actually exists an object, x, that is possibly F . It matters, therefore, whether
the ∃ comes after or before the 3. If the ∃ comes first, then the statement is
saying that there actually exists a certain sort of object (namely, an object that
could have been a certain way.) But if it comes second, after the 3, then the
statement is merely saying that there could have existed a certain sort of object.
There is a similar contrast between the following two statements:
2∀xF x
∀x2F x
The first says that it’s necessary that: everything is F . That is, in every possible
world, every object that exists in that world is F in that world. The objects
ranged over by the ∀, so to speak, are drawn from the worlds the 2 introduces,
because the ∀ occurs inside the scope of the 2. The second statement, by
contrast, says that: every actual object is necessarily F . That is, every object that
exists in the actual world is F in every possible world. The second statement
concerns just actually existing objects because the ∀ occurs in the front of the
formula, not inside the scope of the 2.
With all this in mind, return to the Barcan formula, ∀x2F x→2∀xF x. It
says: if every actually existing object is necessarily F , then it’s necessary that
everything is F .
Now we can see why this claim is questionable. Even if every actual thing is
necessarily F , there could still be worlds containing non-F things, so long as
those non-F things don’t exist in the actual world. Suppose, for instance, that
every object in the actual world is necessarily a material object. Then, letting
F stand for “is a material object”, ∀x2F x is true. Nevertheless, 2∀xF x seems
false—it would presumably be possible for there to exist an immaterial object:
a ghost, say. Possible worlds containing ghosts would simply need to contain
objects that do not exist in the actual world (since all the objects in the actual
world are necessarily material.)
This objection to the validity of the Barcan formula is obviously based on
the idea that what objects exist can vary from possible world to possible world.
But this sort of variation is not represented in the SQML definition of a model.
Each such model contains a single domain, D, rather than different domains
for different possible worlds. The truth condition we specified for a quantified
sentence ∀αφ, at a world w, was simply that φ is true at w of every member of
D—the quantifier ranges over the same domain, regardless of which possible
world is being described. That is why the Barcan formula turns out valid under
our definition.
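That the constant-domain semantics validates the Barcan formula can also be confirmed by brute force over a small model space. The encoding is my own illustration (one unary predicate, two worlds, two objects); with a single domain shared by all worlds, no interpretation of F falsifies the formula anywhere:

```python
# Brute-force search for a constant-domain countermodel to ∀x2Fx → 2∀xFx.
from itertools import chain, combinations

W = [0, 1]          # worlds
D = ['u', 'v']      # one constant domain, shared by all worlds
pairs = [(d, w) for d in D for w in W]
powerset = lambda s: chain.from_iterable(
    combinations(s, r) for r in range(len(s) + 1))

def barcan_true(F, w):
    ante = all(all((d, v) in F for v in W) for d in D)   # ∀x2Fx at w
    cons = all(all((d, v) in F for d in D) for v in W)   # 2∀xFx at w
    return (not ante) or cons

# Every interpretation of F, at every world, satisfies the Barcan formula:
ok = all(barcan_true(set(F), w) for F in powerset(pairs) for w in W)
print(ok)
```

The search is trivial for a reason the text is about to give: with a single domain, both antecedent and consequent amount to “F ’s extension contains every object at every world.”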
This feature of SQML models is problematic for an even more direct reason:
the sentence ∀x2∃y y = x, i.e., “everything necessarily exists”, turns out valid:
v) So, since u ∈ D, we have Vg[u/x, u/y](y=x, w′) = 0. But that can’t be, given the
clause for ‘=’ in the definition of the valuation function.
It’s clear that this formula turns out valid for the same reason that the Barcan
formula turns out valid: SQML models have a single domain common to each
possible world.
We have already discussed the Barcan schema. The third schema raises no philo-
sophical problems for SQML, since, quite properly, it has instances that turn out
invalid: as we saw above, there are SQML models in which 2∃xF x→∃x2F x
is false. Let’s look at the other two schemas.
First, the converse Barcan schema. Like the Barcan schema, each of its
instances is valid given the SQML semantics (I’ll leave this to the reader to
demonstrate), and like the Barcan schema, this verdict faces a philosophical
challenge. The antecedent says that in every world, everything that exists in
that world is φ. Existents are thus always φ. It might still be that some object
isn’t necessarily φ: perhaps some object that is φ in every world in which it
exists, fails to be φ in worlds in which it doesn’t exist. This talk of an object
being φ in a world in which it doesn’t exist may seem strange, but consider
the following instance of the converse Barcan schema, substituting “∃y y=x”
(think: “x exists”) for φ:
2∀x∃y y=x→∀x2∃y y=x
This formula seems to be false. Its antecedent is clearly true; but its consequent
says that every object in the actual world exists necessarily, and hence seems
intuitively to be false.
Each instance of the fourth schema, ∃α2φ→2∃αφ, is also validated by
the SQML semantics (again, an exercise for the reader); and again, this is
philosophically questionable. Let’s suppose that physical objects are necessarily
physical. Then, ∃x2P x seems true, letting P mean ‘is physical’. But 2∃xP x
seems false—it seems possible that there are no physical objects. This coun-
terexample requires that there be worlds with fewer objects than those that
actually exist, whereas the counterexample to the Barcan formula involved the
possibility that there be more objects than those that actually exist.
presumably have no mass (nor any spatial location, nor any other physical
feature.) Ordinary quantification is restricted to normal things. So if we want
to translate an ordinary claim into the language of QML, we must introduce
a predicate for the normal things, “N ”, and use it to restrict quantifiers. But
now, consider the following ordinary English statement:
which says:
[Diagram: world r, with ∀x(N x→2F x)→2∀x(N x→F x) marked false (antecedent true with an overplus, consequent false with an understar); the instance N[u/x]→2F[u/x] marked true (its antecedent false, since N: {} in r); and world a, with ∀x(N x→F x) marked false with an underplus, the instance N[u/x]→F[u/x] marked false (antecedent true, consequent false), and N: {u}, F: {} listed.]
So in a sense, the ordinary intuitions that were alleged to undermine the Barcan
schema are in fact consistent with Constancy.
The defender of Constancy can defend the converse Barcan schema and the
fourth schema in similar fashion. The objection to the converse Barcan schema
assumed the falsity of ∀x2∃y y=x. “Sheer prejudice!”, according to the friend
of constancy. “And recall further that an ordinary utterance of ‘Everything exists
necessarily’ expresses, not ∀x2∃y y=x, but rather ∀x(N x→2∃y(N y∧y=x)),
(N for ‘normal’), the falsity of which is perfectly compatible with Constancy.
It’s possible to fail to be normal; all that’s impossible is to utterly fail to exist.
Likewise for the fourth schema.”
This defense of SQML is hard to take. Let “G” stand for a kind of object
that, in fact, has no members, but which could have had members. Perhaps ghost
is such a kind. ∀x2∼Gx→2∀x∼Gx is an instance of the Barcan schema, and
so true according to the defender of Constancy. Since there could have existed
ghosts, the consequent of this conditional is false. Therefore, its antecedent
∀x2∼Gx must be false. That is, there exists something that could have been a
ghost. But this is a very surprising result. The alleged possible ghost couldn’t
be any material object, presumably, assuming it would be impossible for any
material object to be a ghost. The defender of Constancy, then, is committed to
the existence of objects which we wouldn’t otherwise have dreamed of accepting:
things that could have been ghosts, things that could have been dragons, things
that could have been gods, and so on.
The defender of Constancy might try to defend this conclusion by remind-
ing us that these “possible-ghosts”, “possible-dragons”, and so on, are not
normal objects. They aren’t in space and time, presumably, which explains why
no one has ever seen, heard, felt, or smelled one. He might even say that they
are non-actual, or even that they do not exist (though there are such things). We
are quite correct, he might say, to scoff at the idea that some normal/actual/existing objects
are capable of being ghosts; but what’s the big deal about saying that some non-
normal/non-actual/non-existing objects have these capabilities? This move,
too, will be considered philosophically suspect by many. Many philosophers
regard the idea that there are some non-existent things, or some non-actual
things, as being anywhere from obviously false to conceptually incoherent, or
subversive, or worse.³ And how does it help to point out that the objects aren’t
normal? The postulation of non-normal objects—objects above and beyond
the objects that the rest of us believe in—was exactly what I was claiming is
philosophically suspect!
[Footnote 3: See Quine (1948); Lycan (1979).]

On the other hand, Constancy’s defenders can point to certain powerful
arguments in its favor. Here’s a quick sketch of one such argument. First, the
following seems to be a logical truth:
Ted = Ted
And from this, by existential generalization, we may infer:
∃y y = Ted
This latter formula, too, is therefore a logical truth. But if φ is a logical truth
then so is 2φ (recall the rule of necessitation from chapter 6). So we may infer
that the following is a logical truth:
2∃y y = Ted
Next, notice that nothing in the argument for 2∃y y = Ted depended on any
special features of me. We may therefore conclude that the reasoning holds
good for every object; and so ∀x2∃y y = x is indeed a logical truth. Since,
therefore, every object exists necessarily, it should come as no surprise that
there are things that might have been ghosts, dragons, and so on—for if there
had been a ghost, it would have necessarily existed, and thus must actually exist.
This and other related arguments have apparently wild conclusions, but they
cannot be lightly dismissed, for it is no mean feat to say exactly where they go
wrong (if they go wrong at all!).⁴
The third one on the list was invalid before, and so is still invalid now. As for
the Barcan formula, here is a countermodel:
[Diagram: world r, with the Barcan formula marked false, Dr: {u} and F: {u} listed; and an accessible world a, with ∀xF x marked false, Da: {u, v} and F: {u} listed.]
Official model:
W = {r, a}
R = {〈r, r〉, 〈r, a〉, 〈a, a〉}
D = {u, v}
Dr = {u}
Da = {u, v}
I (F ) = {〈u, r〉, 〈u, a〉}
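In this variable-domain model the failure of the Barcan formula can be checked directly. A sketch (the encoding is my own assumption): quantifiers at a world range over that world’s domain, and the 2 quantifies over R-accessible worlds.

```python
# Check that ∀x2Fx → 2∀xFx fails at r in the variable-domain model above.
W = {'r', 'a'}
R = {('r', 'r'), ('r', 'a'), ('a', 'a')}
Dom = {'r': {'u'}, 'a': {'u', 'v'}}          # world-relative domains
I_F = {('u', 'r'), ('u', 'a')}               # I(F)

box = lambda p, w: all(p(v) for v in W if (w, v) in R)
forall_F = lambda w: all((d, w) in I_F for d in Dom[w])   # ∀xFx at w

# ∀x2Fx at r: the quantifier ranges over Dom['r'] = {u} only
ante = all(box(lambda v, d=d: (d, v) in I_F, 'r') for d in Dom['r'])
cons = box(forall_F, 'r')                    # 2∀xFx at r
print(ante, cons)   # True False: antecedent true, consequent false
```

The antecedent holds because the only actual object, u, is F everywhere; the consequent fails because the extra object v in a’s expanded domain is not F there.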
a) 2∀xF x→∀x2F x
b) ∃x2F x→2∃xF x
c) ∀x2∃y y=x
For example, the first of these, the counterexample to the Barcan formula,
required a model in which the domain expanded; world a was accessible from
world r, and had a larger domain. But suppose we made the decreasing domains
requirement:
if Rwv, then Dv ⊆ Dw
The counterexample would then go away. Indeed, every instance of the Barcan
schema would then become VDQML-valid, which may be proved as follows:
if Rwv, then Dw ⊆ Dv (the increasing domains requirement)
Even after imposing the increasing domains requirement, the Barcan for-
mula remains VDQML-invalid; and after imposing the decreasing domains
requirement, the converse Barcan formula and also ∃x2F x→2∃xF x remain
VDQML-invalid (the original countermodels for these formulas establish this.)
However, in systems in which the accessibility relation is symmetric, this col-
lapses: imposing either of these requirements results in imposing the other.
That is, in B or S5, imposing either the increasing or the decreasing domains
requirement results in imposing both, and hence results in all three formulas
being validated.
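The collapse under symmetry is easy to see: if R is symmetric and Rwv, then also Rvw, so Dw ⊆ Dv and Dv ⊆ Dw must come together. This can also be confirmed by brute force over all two-world frames (the encoding is an assumption of mine):

```python
# Over symmetric frames, the increasing and decreasing domains
# requirements coincide.
from itertools import chain, combinations, product

W = [0, 1]
arrows = [(w, v) for w in W for v in W]
subsets = lambda s: chain.from_iterable(
    combinations(s, r) for r in range(len(s) + 1))
domains = [frozenset(), frozenset({'u'}), frozenset({'u', 'v'})]

checked = 0
for R in map(set, subsets(arrows)):
    if any((v, w) not in R for (w, v) in R):
        continue                      # keep only symmetric relations
    for D0, D1 in product(domains, repeat=2):
        D = {0: D0, 1: D1}
        increasing = all(D[w] <= D[v] for (w, v) in R)
        decreasing = all(D[v] <= D[w] for (w, v) in R)
        assert increasing == decreasing   # the two requirements coincide
        checked += 1
print(checked)
```

The assertion never fires: on symmetric frames, imposing either requirement forces domains to match along every accessibility arrow.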
a) 2∀αφ→∀α2φ
b) ∃α2φ→2∃αφ
thus barring objects from having properties at worlds where they don’t ex-
ist. But some would argue that this goes too far. The new clause validates
∀x2(F x→∃y y=x). “An object must exist in order to be F ”—sounds clearly
true if F stands for ‘is human’, but what if F stands for ‘is famous’? If Baconians
had been right and there had been no such person as Shakespeare, perhaps
Shakespeare might still have been famous.
The issues here are complex.⁷ But whether or not we should adopt the new
clause, it looks as though there are some existence-entailing English predicates
π: predicates π such that nothing can be a π without existing. ‘Is human’ seems
to be such a predicate. So we’re back to our original worry about VDQML-
semantics: its truth condition for 2φ requires truth of φ at all worlds, which is
allegedly too strong, at least when φ is a sentence like πa, where π is existence-
entailing.
One could modify the clause for the 2 in the definition of the valuation
function, so that in order for 2F a to be true, a only needs to be F in worlds in
which it exists:
VM,g(2φ, w) = 1 iff for each v ∈ W, if Rwv, and if [α]M,g ∈ Dv for each
name or free variable α occurring in φ, then VM,g(φ, v) = 1
(“Free variable” here means a variable not bound by any quantifier in φ.) This
would indeed have the result that 2F a gets to be true provided a is F in every
world in which it exists. But be careful what you wish for. Along with this result
comes the following: even if a doesn’t necessarily exist, the sentence 2∃x x=a
comes out true. For according to the new clause, in order for 2∃x x=a to be
true, it must merely be the case that ∃x x=a is true in every world in which a
exists, and of course this is indeed the case.
If 2∃x x=a comes out true even if a doesn’t necessarily exist, then 2∃x x=a
doesn’t say that a necessarily exists. Indeed, it doesn’t look like we have any way
of saying that a necessarily exists, using the language of QML, if the 2 has the
meaning provided for it by the new clause.
A notion of necessity according to which “Necessarily φ” requires truth in
all possible worlds is sometimes called a notion of strong necessity. In contrast,
a notion of weak necessity is one according to which “Necessarily φ” requires
merely that φ be true in all worlds in which objects named within φ exist. The
new clause for the 2 corresponds to weak necessity, whereas our original clause
corresponds to strong necessity.
As we saw, if the 2 expresses weak necessity, then one cannot even express
the idea that a thing necessarily exists. That’s because one needs strong necessity
[Footnote 7: The question is that of so-called “serious actualism” (Plantinga, 1983).]
to say that a thing necessarily exists: in order to necessarily exist, you need to
exist at all worlds, not just at all worlds at which you exist! So this is a serious
deficiency of having the 2 of QML express weak necessity. But if we allow the
2 to express strong necessity instead, there is no corresponding deficiency, for
one can still express weak necessity using the strong 2 and other connectives.
For example, to say that a is weakly necessarily F (that is, that a is F in every
world in which it exists), one can say: 2(∃x x=a→F a).
So it would seem that we should stick with our original truth condition for
the 2, and live with the fact that statements like 2F a turn out false if a fails
to be F at worlds in which it doesn’t exist. Those who think that “Necessarily,
Ted is human” is true despite Ted’s possible nonexistence can always translate
this natural language sentence into the language of QML as 2(∃x x=a→F a)
(which requires a to be F only at worlds at which it exists) rather than as 2F a
(which requires a to be F at all worlds).
Chapter 10
Two-Dimensional Modal Logic
10.1 Actuality
The word ‘actually’, in one of its senses anyway, can be thought of as a one-place
sentence operator: “Actually, φ.”
‘Actually’ might at first seem redundant. “Actually, snow is white” basically
amounts to: “snow is white”. But the actuality operator interacts with modal
operators in interesting ways. The following two sentences, for example, clearly
have different meanings:
Necessarily, if snow is white, then snow is white
Necessarily, if snow is white, then snow is actually white
The first sentence expresses the triviality that snow is white in any possible
world in which snow is white. But the second sentence makes the nontrivial
statement that if snow is white in any world, then snow is white in the actual
world.
So, ‘actually’ is nonredundant, and consequently, worth thinking about.
Let’s add a symbol to modal logic for it. “@φ” will symbolize “Actually, φ”.
We can now symbolize the pair of sentences above as 2(S→S) and 2(S→@S),
CHAPTER 10. TWO-DIMENSIONAL MODAL LOGIC 273
One could add a designated world to models for quantified modal logic in a
parallel way.
The old definition of validity for a system (section 6.3), recall, never em-
ployed the notion of truth in a model; rather, it proceeded via the notion of
validity in a frame. The nice thing about the new definition is that it’s parallel
to the way validity is usually defined in model theory: one first defines truth in a
model, and then defines validity as truth in all models. But the new definition
doesn’t differ in any substantive way from the old definition, in that it yields
exactly the same class of valid formulas:

[Footnote 1: In certain special cases, we could do without the new symbol @. For example, instead of symbolizing “Necessarily, if snow is white then snow is actually white” as 2(S→@S), we could symbolize it as 3S→S. But the @ is not in general eliminable; see Hodes (1984b,a).]
Proof. It’s obvious that everything valid on the old definition is valid on the
new definition (the old definition says that validity is truth in all worlds in
all models; the addition of the designated world w@ doesn’t play any role in
defining truth at worlds, so each of the new models has the same distribution
of truth values as one of the old models.) Moreover, suppose that a formula is
invalid on the old definition—i.e., suppose that φ is false at some world, w, in
some model M . Now construct a model of the new variety that’s just like M
except that its designated world is w. φ will be false in this model, and so φ
turns out invalid under the new definition.
iv) Given the latter, there is some world, call it “a”, such that Rw@a and
Vg(∀x(Gx∨@F x), a) = 0. And so, there is some object, call it “u”, in the
model’s domain, D, such that Vg[u/x](Gx∨@F x, a) = 0
vii) Given the latter, Vg[u/x](F x, w@) = 0 (by the clause in the truth definition
for @)
viii) Given ii), for every object in D, and so for u in particular, Vg[u/x](F x∨2Gx, w@) = 1.
ix) And so, either Vg[u/x](F x, w@) = 1 or Vg[u/x](2Gx, w@) = 1
The formula turns out false in this model, which means that it turns out false
at w@: the consequent is false at w@ because at world a, something (namely, u)
is neither G nor F; but the antecedent is true there: since u is F at w@, it’s
necessary that u is either G or actually F.
10.2 ×
Adding @ to the language of quantified modal logic is a step in the right
direction, since it allows us to express certain kinds of comparisons between
possible worlds that we couldn’t express otherwise. But it doesn’t go far enough;
we need a further addition.2 Consider this sentence:
It might have been the case that, if all those then rich
might all have been poor, then someone is happy
What it’s saying, in possible worlds terms, is this:
Note what the × does: it changes the reference world. When evaluating a formula,
it says to forget about the old reference world, and make the new reference
world whatever the current world of evaluation happens to be.
We can define validity and consequence thus:
Valid formulas are thus defined as those that are true at every pair of worlds of
the form 〈w, w〉; semantic consequence is truth-preservation at every such pair.
Notice, however, that these aren’t the only notions of validity and con-
sequence that one could introduce. There is also the notion of truth, and
truth-preservation, at every pair of worlds:³
Validity and general validity, and consequence and general consequence, come
apart in various ways, as we’ll see below.
As we saw, moving to this new language increases the flexibility of the @;
we can symbolize
It might have been the case that, if all those then rich
might all have been poor, then someone is happy
[Footnote 3: The term ‘general validity’ is from Davies and Humberstone (1980); the first definition of validity corresponds to their “real-world validity”.]
as
3×(3∀x(@Rx→P x)→∃xH x)
Moreover, it costs us nothing. For we can replace any sentence φ of the old
language with ×φ in the new language (i.e. we just put the × operator at the
front of the sentence).⁴ For example, instead of symbolizing
×3∀x(@Rx→P x)
Example 10.3: Show that if φ is valid, so is @φ. Suppose for reductio that
φ is valid but @φ is not. That means that in some model and some world,
w (and some assignment g , but I’ll suppress this since it isn’t relevant here),
V(@φ, w, w) = 0. Thus, given the truth condition for @, V(φ, w, w) = 0. But
that violates the validity of φ.
Example 10.4: Show that every instance of φ↔@φ is 2D-valid, but not
every instance of 2(φ↔@φ) is. (Moral: any proof theory for this logic had
better not include the rule of necessitation!) For the first, the truth condition
for @ insures that for any world w in any model (and any variable assignment),
V(@φ, w, w) = 1 iff V(φ, w, w) = 1, and so V(φ↔@φ, w, w) = 1. Thus,
φ↔@φ is valid.
But some instances of 2(φ↔@φ) aren’t valid. Let φ be ‘F a’; here’s a
countermodel:
W = {c, d}
D = {u}
I (a) = u
I (F ) = {〈u, c〉}
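We can confirm this countermodel with a small two-dimensional evaluation; a sketch with assumed helper names, where formulas are evaluated at pairs 〈v, w〉 of reference world and world of evaluation:

```python
# Countermodel check for 2(Fa↔@Fa), with I(a) = u and I(F) = {⟨u, c⟩}.
W = {'c', 'd'}
I_F = {('u', 'c')}

Fa = lambda v, w: ('u', w) in I_F     # Fa at ⟨v,w⟩: look at evaluation world w
at_Fa = lambda v, w: Fa(v, v)         # @Fa at ⟨v,w⟩: look at reference world v
bi = lambda v, w: Fa(v, w) == at_Fa(v, w)          # Fa↔@Fa
box_bi = lambda v, w: all(bi(v, w2) for w2 in W)   # 2(Fa↔@Fa): vary w

valid_bi = all(bi(w, w) for w in W)   # φ↔@φ holds at every pair ⟨w,w⟩
print(valid_bi, box_bi('c', 'c'))     # True False
```

At 〈c, c〉 the biconditional holds, but boxing it requires it to hold at 〈c, d〉 as well, where Fa is false while @Fa (anchored to the reference world c) remains true: exactly the moral about necessitation drawn above.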
[Footnote 4: This amounts to the same thing as the old symbolization in the following sense. Let φ be any wff of the old language. Thus, φ may have some occurrences of @, but it has no occurrences of ×. Then, for every SQML-model M = 〈W, D, I〉, and any v, w ∈ W, ×φ is true at 〈v, w〉 in M iff φ is true in the designated-world SQML model 〈W, w, D, I〉.]
b) 2×∀x3@F x→2∀xF x
10.3 Fixedly
The two-dimensional approach to semantics—evaluating formulas at pairs of
worlds rather than single worlds—raises an intriguing possibility. The 2 is a
universal quantifier over the world of evaluation; we might, by analogy, follow
Davies and Humberstone (1980) and introduce an operator that is a universal
quantifier over the reference world. Davies and Humberstone call this operator
F, and read “Fφ” as “fixedly, φ”. Grammatically, F is a one-place sentential
operator. Its semantic clause is this:
· VM,g(Fφ, v, w) = 1 iff for every v′ ∈ W, VM,g(φ, v′, w) = 1
All the other two-dimensional semantic definitions, including the definitions
of validity and consequence, remain the same.5
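The two-dimensional clauses lend themselves to a direct implementation. Here is a minimal sketch in Python of the clauses for 2, @, and F (the tuple encoding of formulas and the function names are illustrative, not from the text), used to confirm that F@φ→φ holds at every diagonal pair in a small model:

```python
# Minimal two-dimensional propositional evaluator (a sketch; the tuple
# encoding of formulas is illustrative). A model is a set of worlds W plus
# an interpretation I mapping (sentence_letter, world) to a truth value.
# Formulas are evaluated at a pair (v, w): v is the reference world, w the
# world of evaluation.

def val(I, W, fml, v, w):
    op = fml[0]
    if op == 'atom':                 # I gives each letter a value at w
        return I[(fml[1], w)]
    if op == 'not':
        return 1 - val(I, W, fml[1], v, w)
    if op == 'if':                   # material conditional
        return max(1 - val(I, W, fml[1], v, w), val(I, W, fml[2], v, w))
    if op == 'box':                  # quantifies over the world of evaluation
        return min(val(I, W, fml[1], v, w2) for w2 in W)
    if op == 'at':                   # @: shifts evaluation to the reference world
        return val(I, W, fml[1], v, v)
    if op == 'F':                    # fixedly: quantifies over the reference world
        return min(val(I, W, fml[1], v2, w) for v2 in W)
    raise ValueError(op)

# F@φ→φ is true at every diagonal pair (w, w): if φ holds at every (v, v),
# it holds at (w, w) in particular.
W = {'c', 'd'}
I = {('G', 'c'): 1, ('G', 'd'): 0}
phi = ('atom', 'G')
fap = ('if', ('F', ('at', phi)), phi)
assert all(val(I, W, fap, w, w) == 1 for w in W)
```

The `box` clause quantifies over all worlds, reflecting the constant-domain SQML setting of this chapter, which has no accessibility relation.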
Humberstone and Davies point out that given F, @, and 2, we can introduce
two new operators: F@ and F2. It’s easy to show that:
Example 10.5: F@φ→φ is 2D-valid for each wff φ (exercise 10.2). But not every instance of this schema is generally valid. The formula F@(@Ga↔Ga)→(@Ga↔Ga) is not generally valid, for example. General validity requires truth at all pairs 〈v, w〉 in all models. But in the following model, Vg (F@(@Ga↔Ga)→(@Ga↔Ga), c, d) = 0 (for any variable assignment g ):
W = {c, d}
D = {u}
I (a) = u
I (G) = {〈u, c〉}
In this model, the referent of ‘a’ is in the extension of ‘G’ in world c, but not
in world d. That means that @Ga is true at 〈c, d〉 whereas Ga is false at 〈c, d〉,
and so @Ga↔Ga is false at 〈c, d〉. But F@φ means that φ is true at all pairs
of the form 〈v, v〉, and the formula @Ga↔Ga is true at any such pair (in any
model). Thus, F@(@Ga↔Ga) is true at 〈c, d〉 in this model, and so the conditional F@(@Ga↔Ga)→(@Ga↔Ga) is false there.
Example 4:
Hesperus = Phosphorus
B (the standard meter bar) is one meter long
notion of contingency:
where bar B is the standard meter bar, and the “descriptive names” ‘one meter’
and ‘Julius’ are said to be “rigid designators” whose references are “fixed” by the
descriptions ‘the length of bar B’ and ‘the inventor of the zip’, respectively. Now,
whether or not these sentences, understood as sentences of everyday English,
are indeed genuinely contingent and a priori depends on delicate issues in the
philosophy of language concerning descriptive names, rigid designation, and
reference fixing. Rather than going into all that, let’s construct some examples
that are similar to Kripke’s and Evans’s. Let’s simply stipulate that ‘one meter’
and ‘Julius’ are to abbreviate “actualized descriptions”: ‘the actual length of bar
B’ and ‘the actual inventor of the zip’. With a little creative reconstruing in the
first case, the sentences then have the form: “the actual G is G”:
∃x(Gx∧∀y(Gy→y=x)) → ∃x(@Gx∧∀y(@Gy→y=x)∧Gx)
Any sentence of this form is 2D-valid (though not generally 2D-valid), and is
superficially contingent. So we have further examples of the contingent a priori
in the neighborhood of the examples of Kripke and Evans.
Various philosophers want to concede that these sentences are contingent
in one sense—namely, in the sense of superficial contingency. But, they claim,
this is a relatively unimportant sense (hence the term ‘superficial contingency’,
which was coined by Evans). In another sense, they’re not contingent at all.
Evans calls the second sense of contingency “deep contingency”, and defines it
thus (1979, p. 185):
If a deeply contingent statement is true, there will exist some state of
affairs of which we can say both that had it not existed the statement
would not have been true, and that it might not have existed.
The intended meaning of ‘the statement would not have been true’ is that the
statement, as uttered with its actual meaning, would not have been true. The
idea is supposed to be that ‘Julius invented the zip’ is not deeply contingent,
because we can’t locate the required state of affairs, since in any situation in
which ‘Julius invented the zip’ is uttered with its actual meaning, it is uttered
truly. So the Julius example is not one of a deeply contingent a priori truth.
Evans’s notion of deep contingency is far from clear. One of the nice things
about the two-dimensional modal framework is that it allows us to give a
clear definition of deep contingency. Davies and Humberstone (1980) give a
definition of deep contingency which is parallel to the definition of superficial
contingency, but with F@ in place of 2:
Under this definition, the examples we have given are not deeply contingent.
To be sure, this definition is only as clear as the two-dimensional notions of
fixedness and actuality. The formal structure of the two-dimensional framework
is of course clear, but one can raise philosophical questions about how that
formalism is to be interpreted. But at least the formalism provides a clear
framework for the philosophical debate to occur.
Our discussion of the necessary a posteriori will be parallel to that of the
contingent a priori. Just as we defined superficial contingency as the falsity of
the 2, so we can define superficial necessity as the truth of the 2:
How shall we construe a posteriority? Let’s follow our earlier strategy, and take
the failure to be 2D-valid as our guide.
But here we must take a bit more care. It’s quite a trivial matter to construct
models in which 2D-invalid sentences are necessarily true; and we don’t need
the two-dimensional framework to do it. We clearly don’t want to say that ‘Everything is a lawyer’ is an example of the necessary a posteriori. But let F stand for ‘is a lawyer’; we can construct a model in which the predicate F is true of every member of the domain at every world, so that ∀xF x is true, and superficially necessary, at every world, despite the fact that it is not 2D-valid.
But this is too cheap. We began by letting the predicate F stand for a predicate
of English, but then constructed our model without attending to the modal
fact that it’s simply not the case that it’s necessarily true that everything is a
lawyer. If F is indeed to stand for ‘is a lawyer’, we would need to include in any
has the form ‘If the actual F and the actual G exist then they are identical’,
which was discussed in the previous paragraph. We may then construct a
realistic model in which F and G each have a single object in their extension in
some world, w, but in which they have different objects in their extensions in
other worlds. In such a model, the sentence
(2HP) 2(If Hesperus and Phosphorus exist then they are identical)
is true at 〈w, w〉, and so we again have our desired example: (HP) is superficially
necessary, despite the fact that it is a posteriori (2D-invalid).
Isn’t it strange that (HP) is both a posteriori and necessary? The two-
dimensional response is: no, it’s not, since although it is superficially necessary,
it isn’t deeply necessary in the following sense:
It isn’t deeply necessary because in any realistic model (given what F and G currently stand for), there must be worlds and objects, other than c and u, that are configured as they are in the model below:
W = {c, d}
D = {u, v}
I (F ) = {〈u, c〉, 〈u, d〉}
I (G) = {〈u, c〉, 〈v, d〉}
In this model, even though (2HP) is true at 〈c, c〉, still, F@(HP), i.e.:
is false at 〈c, c〉 (and indeed, at every pair of worlds), since (HP) is false at 〈d, d〉.
And so, (HP) is not deeply necessary in this model.
One might try to take this two-dimensional line further, and claim that in
every case of the necessary a posteriori (or the contingent a priori), the necessity
(contingency) is merely superficial. But defending this stronger line would
require more than we have in place so far. To take one example, return again
to ‘Hesperus = Phosphorus’, but now, instead of thinking of ‘Hesperus’ and
‘Phosphorus’ as abbreviations for actualized descriptions, let us represent them
by names in the logical sense (i.e., the expressions called “names” in the defini-
tion of well-formed formulas, which are assigned denotations by interpretation
functions in models). Thus, ‘Hesperus = Phosphorus’ is now represented as:
a=b

Now consider the following model:
W = {c, d}
D = {u, v}
I (a) = u
I (b ) = u
The model is apparently realistic; it falsifies no relevant modal facts. But the
sentence a=b is deeply necessary (at any world in the model). And yet it is a
posteriori (2D-invalid).
Exercise 5.6 We must show that for any PC+DD model 〈D, I , E 〉, and any
variable assignment g , [α] g (relative to this model) is either E or a member of
D. We’ll do this by induction on the grammar of α. So, we’ll show that the
result holds when α is a variable, constant, or ι term (base cases), and then show
that, assuming the result holds for simpler terms (inductive hypothesis), it also
holds for complex terms made up of the simpler terms using a function symbol.
Base cases. If α is a variable then [α]g is g (α), which is a member of D given the definition of a variable assignment. If α is a constant then [α]g is I (α), which is a member of D given the definition of a model’s interpretation function. If α has the form ιβφ then [α]g is either the unique u ∈ D such that V_{g^β_u}(φ) = 1 (if there is such a u) or E (if there isn’t). So in all three cases, [α]g is either E or a member of D. (Note that even though ι terms are syntactically complex, we treated them here as a base case of our inductive proof. That’s because we had no need for any inductive hypothesis; we could simply show directly that the result holds for all ι terms.)
Next we assume the inductive hypothesis (that each of [α1 ]g , . . . , [αn ]g is either E or a member of D) and show that the same goes for the complex term f (α1 . . . αn ). Well,
[ f (α1 . . . αn )] g is defined as I ( f )([α1 ] g . . . [αn ] g ). And the inductive hypothesis
tells us that each [αi ] g is either E or a member of D. And we know from the
definition of a model that I ( f ) is a function that maps any n-tuple of members
APPENDIX A. ANSWERS TO SELECTED EXERCISES 290
(Tableau for 2[P →3(Q→R)]→3[Q→(2P →3R)], refuted at root world r with new worlds a and b; diagram omitted.)
Official model:
W = {r, a, b}
R = {〈r, a〉, 〈a, b〉, 〈a, a〉, 〈b, b〉}
I (P, a) = I (Q, a) = I (P, b) = 1, all else 0
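The official model can also be checked mechanically. Below is a small one-dimensional Kripke evaluator (a sketch; the tuple encoding of formulas is illustrative) confirming that this model makes the antecedent of 2[P →3(Q→R)]→3[Q→(2P →3R)] true and its consequent false at r:

```python
# Kripke-model evaluator (a sketch; formula encoding is illustrative).
# R is the accessibility relation, I the interpretation.

def V(I, R, fml, w):
    op = fml[0]
    if op == 'atom':
        return I.get((fml[1], w), 0)
    if op == 'not':
        return 1 - V(I, R, fml[1], w)
    if op == 'if':
        return max(1 - V(I, R, fml[1], w), V(I, R, fml[2], w))
    if op == 'box':   # true at w iff true at every world w accesses
        return min((V(I, R, fml[1], x) for (y, x) in R if y == w), default=1)
    if op == 'dia':   # true at w iff true at some world w accesses
        return max((V(I, R, fml[1], x) for (y, x) in R if y == w), default=0)
    raise ValueError(op)

R = {('r', 'a'), ('a', 'b'), ('a', 'a'), ('b', 'b')}
I = {('P', 'a'): 1, ('Q', 'a'): 1, ('P', 'b'): 1}
P, Q, Rl = ('atom', 'P'), ('atom', 'Q'), ('atom', 'R')
ante = ('box', ('if', P, ('dia', ('if', Q, Rl))))
cons = ('dia', ('if', Q, ('if', ('box', P), ('dia', Rl))))
assert V(I, R, ante, 'r') == 1 and V(I, R, cons, 'r') == 0
```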
xv) That means that V(3R, a) = 0; so, since Raa (reflexivity), V(R, a) = 0.
xvi) Lines xiii), xv), and xi) contradict (truth condition for →).
(Tableau for 2(P ↔Q)→2(2P ↔2Q), with worlds r, a, and b; diagram omitted.)
Official model:
W = {r, a, b}
R = {〈r, r〉, 〈a, a〉, 〈b, b〉, 〈r, a〉, 〈a, r〉, 〈r, b〉, 〈b, r〉}
I (P, r) = I (Q, r) = I (P, a) = I (Q, a) = I (P, b) = 1, all else 0
i) Suppose for reductio that in some world r of some S4-model, the formula
is false.
v) Given iv), 2P and 2Q must have different truth values in world a. With-
out loss of generality (given the symmetry between P and Q elsewhere
in the problem), let’s suppose that …
vii) …V(2Q, a) = 0.
xi) But then, given ii), V(P ↔Q, b ) = 1. This contradicts viii) and ix).
(Tableau for 332P ↔2P , with worlds r, a, and b; diagram omitted.)
Official model:
W = {r, a, b}
R = {〈r, r〉, 〈a, a〉, 〈b, b〉, 〈r, a〉, 〈a, r〉, 〈r, b〉, 〈b, r〉}
I (P, r) = I (P, a) = 1, all else 0
S4-countermodel:
(S4 tableau for 332P ↔2P , with worlds r, a, and b; diagram omitted.)
Official model:
W = {r, a, b}
R = {〈r, r〉, 〈a, a〉, 〈b, b〉, 〈r, a〉, 〈r, b〉}
I (P, a) = 1, all else 0
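It is worth confirming that this frame really satisfies the S4 conditions. A quick sketch (sets of pairs, as above) checking reflexivity and transitivity, and noting that symmetry fails, so the model is an S4-model but not a B-model:

```python
# Frame-condition check (a sketch) for the official S4-countermodel.

W = {'r', 'a', 'b'}
R = {('r', 'r'), ('a', 'a'), ('b', 'b'), ('r', 'a'), ('r', 'b')}

reflexive = all((w, w) in R for w in W)
transitive = all((x, z) in R
                 for (x, y) in R for (y2, z) in R if y == y2)
symmetric = all((y, x) in R for (x, y) in R)
assert reflexive and transitive and not symmetric
```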
S5-validity proof:
i) We must show that in any world r of any S5-model, 332P and 2P have
the same truth value.
iii) …V(2P, r ) = 0.
x) V(332P, r ) = 0 and …
xi) …V(2P, r ) = 1.
1. (P ∧Q)→P PL
2. 2[(P ∧Q)→P ] 1, NEC
3. 2[(P ∧Q)→P ]→[3(P ∧Q)→3P ] K3
4. 3(P ∧Q)→3P 2,3 MP
5. 3(P ∧Q)→3Q Similar to 1-4
6. 3(P ∧Q)→(3P ∧3Q) 4, 5, PL
1. ∼(Q∧R)→(Q→∼R) PL
2. 2[∼(Q∧R)→(Q→∼R)] 1, NEC
3. 2∼(Q∧R)→2(Q→∼R) 2, K, MP
4. 2∼(Q∧R)↔∼3(Q∧R) MN (modal negation, proved in book)
5. (Q→∼R)→∼(Q∧R) PL
6. 2(Q→∼R)→2∼(Q∧R) 5, NEC, K, MP
7. ∼3(Q∧R)↔2(Q→∼R) 3, 4, 6 PL
Exercise 6.4g We’re to show that ⊢K 3(P →Q)↔(2P →3Q). This one’s a bit tough. The trick for the first half is getting the right order for the PL tautologies, and for the second half, getting the right PL strategy.
1. P →[(P →Q)→Q] PL
2. 2P →2[(P →Q)→Q] 1, NEC, K, MP
3. 2[(P →Q)→Q]→[3(P →Q)→3Q] K3
4. 2P →[3(P →Q)→3Q] 2, 3, PL
5. 3(P →Q)→(2P →3Q) 4, PL
I must now prove the right-to-left direction, namely, (2P →3Q)→3(P →Q). Note that the antecedent of this conditional is PL-equivalent to ∼2P ∨3Q. So my goal will be to get two conditionals, ∼2P →3(P →Q) and 3Q→3(P →Q), from which the desired conditional follows by PL.
6. ∼P →(P →Q) PL
7. 3∼P →3(P →Q) 6, NEC, K3, MP
8. 3∼P →∼2P MN
9. Q→(P →Q) PL
10. 3Q→3(P →Q) 9, NEC, K3, MP
11. (2P →3Q)→3(P →Q) 7, 8, 10, PL
12. 3(P →Q)↔(2P →3Q) 5, 11, PL
show that VI (φ) = 1. (VI , recall, is the classical valuation for I .) Consider
the intuitionist model with just one stage, r, in which formulas have the same
valuations as they have in the classical interpretation—i.e., 〈{r}, {〈r, r〉}, I *〉,
where I *(α, r) = I (α) for each sentence letter α. It’s easy to check that since
the intuitionist model has only one stage, the classical and intuitionist truth
conditions collapse in this case, so that for every wff φ, VI * (φ, r) = VI (φ). So,
since every member of Γ is true in I , every member of Γ is true at r in the
intuitionist model. Since Γ ⊨I φ, it follows that φ is true at r in the intuitionist model; and so, φ is true in the classical interpretation—i.e., VI (φ) = 1.
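The collapse claimed here can be verified mechanically for particular formulas. A sketch (formula encoding illustrative) comparing the intuitionist clauses at a single reflexive stage with the classical clauses:

```python
# One-stage Kripke collapse (a sketch): with a single stage r accessible
# only to itself, the intuitionist clauses for ∼ and → reduce to the
# classical ones, so every formula gets its classical value at r.

from itertools import product

def classical(I, fml):
    op = fml[0]
    if op == 'atom': return I[fml[1]]
    if op == 'not':  return 1 - classical(I, fml[1])
    if op == 'if':   return max(1 - classical(I, fml[1]), classical(I, fml[2]))

def intuit(Istar, R, fml, w):
    op = fml[0]
    if op == 'atom':
        return Istar[(fml[1], w)]
    if op == 'not':   # true at w iff false at every accessible stage
        return int(all(intuit(Istar, R, fml[1], x) == 0
                       for (y, x) in R if y == w))
    if op == 'if':    # true at w iff no accessible stage verifies the
                      # antecedent while falsifying the consequent
        return int(all(not (intuit(Istar, R, fml[1], x) == 1 and
                            intuit(Istar, R, fml[2], x) == 0)
                       for (y, x) in R if y == w))

R = {('r', 'r')}
fml = ('if', ('not', ('atom', 'P')), ('if', ('atom', 'Q'), ('atom', 'P')))
for p, q in product((0, 1), repeat=2):
    I = {'P': p, 'Q': q}
    Istar = {('P', 'r'): p, ('Q', 'r'): q}
    assert intuit(Istar, R, fml, 'r') == classical(I, fml)
```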
and we must show that the following are also generally valid: ∼φ→F∼φ, (φ→ψ)→F(φ→ψ), ∀αφ→F∀αφ, 2φ→F2φ, Fφ→FFφ, and ×φ→F×φ:
∼ : Suppose otherwise—suppose V(∼φ→F∼φ, v, w) = 0 for some v, w. So V(∼φ, v, w) = 1 and V(F∼φ, v, w) = 0. So V(φ, v, w) = 0, and for some v′, V(∼φ, v′, w) = 0; and so V(φ, v′, w) = 1. By (ih), V(φ→Fφ, v′, w) = 1, and so V(Fφ, v′, w) = 1, and so V(φ, v, w) = 1—contradiction.
→ : Suppose for some v, w, V((φ→ψ)→F(φ→ψ), v, w) = 0. So (i) V(φ→ψ, v, w) = 1 and V(F(φ→ψ), v, w) = 0. So, for some world, call it u, V(φ→ψ, u, w) = 0, and so V(φ, u, w) = 1 and (ii) V(ψ, u, w) = 0. Given the former and the inductive hypothesis, V(Fφ, u, w) = 1, and so V(φ, v, w) = 1. And so, given (i), V(ψ, v, w) = 1, and so, given the inductive hypothesis, V(Fψ, v, w) = 1, and so V(ψ, u, w) = 1, which contradicts (ii).
∀ : Suppose for some v, w, Vg (∀αφ, v, w) = 1, but Vg (F∀αφ, v, w) = 0. Given the latter, for some v′, Vg (∀αφ, v′, w) = 0; and so, for some u in the domain, V_{g^α_u}(φ, v′, w) = 0. Given the former, V_{g^α_u}(φ, v, w) = 1; given (ih) it follows that V_{g^α_u}(Fφ, v, w) = 1, and so V_{g^α_u}(φ, v′, w) = 1. Contradiction.
2 : suppose (i) V(2φ, v, w) = 1 and (ii) V(F2φ, v, w) = 0, for some v, w. From (ii), V(2φ, v′, w) = 0 for some v′, and so V(φ, v′, w′) = 0 for some w′. Given (i), V(φ, v, w′) = 1; and so, given (ih), V(Fφ, v, w′) = 1, and so V(φ, v′, w′) = 1. Contradiction.
F: suppose V(Fφ, v, w) = 1 and V(FFφ, v, w) = 0, for some v, w. From the latter, V(Fφ, v′, w) = 0 for some v′, and so V(φ, v″, w) = 0 for some v″, which contradicts the former.
×: suppose Vg (×φ, v, w) = 1 but Vg (F×φ, v, w) = 0, for some v, w. Given the latter, Vg (×φ, v′, w) = 0 for some v′, and so Vg (φ, w, w) = 0, which contradicts the former.
Appendix B
Exercise 2.1 We’re to show that the defined symbols ∨ and ↔ get the right
truth conditions. We must first show that V (ψ∨χ ) = 1 iff either V (ψ) = 1 or
V (χ ) = 1, for any valuation V . ψ ∨ χ is short for ∼ψ→χ , so we need to show
that for any V ,V (∼ψ→χ ) = 1 iff V (ψ) = 1 or V (χ ) = 1.
First suppose that V (∼ψ→χ ) = 1. We must now show that V (ψ) = 1
or V (χ ) = 1. So suppose for reductio that this is not the case—i.e., suppose
that V (ψ) = 0 and V (χ ) = 0. Since V (ψ) = 0, V (∼ψ) = 1, by the clause in the definition of the valuation function for the ∼. And then, since V (∼ψ) = 1 and V (χ ) = 0, V (∼ψ→χ ) = 0, by the clause in the definition of the valuation function for the →. That contradicts the initial supposition that V (∼ψ→χ ) = 1.
Next suppose that either V (ψ) = 1 or V (χ ) = 1, and suppose for reductio
that V (∼ψ→χ ) = 0. Given the latter, V (∼ψ) = 1 and V (χ ) = 0 (clause for → );
and then given the clause for ∼, V (ψ) = 0. But if both V (ψ) = 0 and V (χ ) = 0,
that contradicts the initial supposition that either V (ψ) = 1 or V (χ ) = 1.
Next we must show that V (ψ↔χ ) = 1 iff V (ψ) = V (χ ) (for any V ).
“ψ↔χ ” is short for “(ψ→χ )∧(χ →ψ)”. The ∧ is still not part of our basic vocabulary; however, we showed in class that, given how ∧ is defined, V (α∧β) = 1 iff V (α) = 1 and V (β) = 1. Given this fact, V (ψ↔χ ) = 1 iff V (ψ→χ ) = 1
and V (χ →ψ) = 1. But it is true that both V (ψ→χ ) = 1 and also V (χ →ψ) = 1,
iff ψ and χ have the same truth value in V —i.e., iff V (ψ) = V (χ ). For if they
have different truth values then one of the conditionals ψ→χ or χ →ψ must
be false (whichever one is 1→0); and conversely, if ψ and χ have the same truth
value then each of these conditionals must be true (since both 0→0 and 1→1
APPENDIX B. ANSWERS TO REMAINING EXERCISES 299
are 1.)
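Both claims can be confirmed by brute truth tables. A short sketch, using 0 and 1 for truth values as in the text:

```python
# Truth-table check (a sketch) of the two claims in this answer:
# ∼ψ→χ has ∨'s table, and (ψ→χ)∧(χ→ψ) has ↔'s table.

from itertools import product

def impl(a, b): return max(1 - a, b)   # clause for →
def neg(a):     return 1 - a           # clause for ∼

for p, q in product((0, 1), repeat=2):
    # ψ∨χ defined as ∼ψ→χ
    assert impl(neg(p), q) == max(p, q)
    # ψ↔χ defined as (ψ→χ)∧(χ→ψ)
    assert min(impl(p, q), impl(q, p)) == int(p == q)
```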
1. ∼∼P Premise
2. ∼∼P →(∼P →∼∼P ) A1
3. ∼P →∼∼P 1,2, MP
4. (∼P →∼∼P )→[(∼P →∼P )→P ] A3
5. [(∼P →∼P )→P ] 3,4 MP
6. ∼P →∼P Repeat the proof of #1
7. P 5,6 MP
Exercise 2.5 The system we are to consider uses the same definition of wffs
and the same rule (MP), but has different axioms:
φ→φ
(φ→ψ)→(ψ→φ)
Let’s first prove that a) every theorem of this system has an even number of
“∼”s:
A theorem of a system is the last line of a proof. So I’ll prove by induction
that every line of every proof in this system has an even number of ∼s. To do
that, I need to prove i) that every axiom of the system has an even number of
∼s (base case), and ii) that if we assume that φ and φ→ψ each have an even
number of tildes, then it follows that what you get from those formulas by
MP—i.e., ψ—must also have an even number of ∼s (inductive step).
Base case: that’s easy. In each axiom schema, each Greek letter occurs twice.
In the first schema, φ occurs twice, and in the second schema, both φ and ψ
occur twice. So whenever we construct an axiom from either schema, each ∼
that occurs in any wff that we stick in for φ or for ψ will appear in the axiom
twice. Thus, that axiom will have an even number of ∼s.
Inductive step: assume the inductive hypothesis: that φ and φ→ψ each
have an even number of ∼s. Let n be the number of ∼s in φ, and let m be the
number of ∼s in φ→ψ. The inductive hypothesis tells us that both n and m
are even. That means that m − n is even. But m − n is the number of ∼s in ψ.
So ψ has an even number of ∼s.
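The parity argument can be illustrated concretely. A sketch, using '~' for ∼ and '>' for →; the sample wffs are my own:

```python
# Parity check (a sketch) for part a): axioms built from the two schemas
# have an even number of '~'s, and MP preserves the property, since the
# tildes in the conclusion ψ are those of φ→ψ minus those of φ.

def tildes(s): return s.count('~')

phi, psi = '~~P', '~(P>~Q)'               # sample wffs, each with 2 tildes
ax1 = f'({phi}>{phi})'                    # instance of φ→φ
ax2 = f'(({phi}>{psi})>({psi}>{phi}))'    # instance of (φ→ψ)→(ψ→φ)
assert tildes(ax1) % 2 == 0 and tildes(ax2) % 2 == 0

# MP step: from φ and φ→ψ infer ψ
n, m = tildes(phi), tildes(f'({phi}>{psi})')
assert tildes(psi) == m - n               # even m, even n give even m − n
```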
Next I’ll show that b) not every theorem of this system is valid. To do this,
I just need to produce a single theorem of this system and a single valuation
in which this theorem is false. I choose the theorem (P →Q)→(Q→P ). That’s
a theorem of the system because it’s an axiom (second axiom schema). The
valuation I choose is one in which Q is 1 and P is 0. In this valuation P →Q is
true (because P is 0), and Q→P is 0 (because Q is 1 and P is 0); so that means
that the whole thing is 0.
NOTE: you can’t just say that the axiom schema (φ→ψ)→(ψ→φ) is invalid.
First, it’s just a schema, not a wff, so the notion of validity doesn’t apply to
it. Second, there are instances of this schema that are valid, for instance:
[(P →P )→Q]→[Q→(P →P )].
Exercise 2.6 We are to show (for regular propositional logic) that the truth
value of a formula depends only on the truth values of the sentence letters in
that formula.
Let φ be any wff and let V and V′ be valuations that agree on the sentence letters in φ (i.e., for any sentence letter α, if α is in φ then V (α) = V′(α)). Show that V (φ) = V′(φ).

Let V and V′ be as described. Let’s show by induction that every formula φ containing only sentence letters on which V and V′ agree is such that V (φ) = V′(φ). Since we’re trying to show something of the form “all formulas φ are blah blah blah”, our base case is to show that all atomic formulas are blah blah blah; and our induction step will be to show that if ψ and χ are blah blah blah, then so are ∼ψ and ψ→χ .

Base case: we must show that if φ is atomic then V (φ) = V′(φ). But it’s
Exercise 2.7 We must show that for any set of formulas, Γ, and any formula φ, if Γ ⊢ φ then Γ ⊨ φ (i.e., if φ is provable from Γ then φ is a semantic consequence of Γ). Like the proof of the original version of soundness, let’s
do this by induction. Here we’re not proving that every formula has a certain
property; we’re trying to prove that anything that is provable from Γ has a certain
property. So our inductive proof will concern the successive addition of lines
to a growing proof according to the rules of proof, not the successive addition
of more formulas to a growing formula by the rules of grammar.
Remember that a formula φ is provable from Γ iff there exists a proof from Γ
(i.e., a proof in which each line is either an axiom, a member of Γ , or follows
from earlier lines in the proof by MP) whose last line is φ. So let’s prove by
induction that the last line of every proof from Γ is a semantic consequence of Γ . And
we do that, in essence, by showing that every time you add to a proof from Γ,
you must always add a formula that is a semantic consequence of Γ. Formulas
you add to the proof fall into two categories: i) axioms and members of Γ , and
ii) formulas following by MP from earlier lines.
Base case: we must show that axioms and members of Γ are semantic
consequences of Γ. What does it mean to say that a formula ψ is a semantic
consequence of Γ ? It means that for any valuation, V , if every member of Γ
is true in V , then so is ψ. Well, it’s then obvious that any member of Γ is a
semantic consequence of Γ (obviously, a member of Γ is true in any valuation
that counts all of Γ’s members true.) What about axioms—are they all true
(First argument down the left column, second argument across the top:)

%  | 1  0
1  | 0  1
0  | 1  0
f (1, 1) = 1 g (1, 1, 1) = 1
f (1, 0) = 0 g (1, 1, 0) = 0
f (0, 1) = 0 g (1, 0, 1) = 1
f (0, 0) = 1 g (1, 0, 0) = 1
g (0, 1, 1) = 1
g (0, 1, 0) = 1
g (0, 0, 1) = 0
g (0, 0, 0) = 1
Thus, the whole thing is false exactly when one of the following is true:
P ∧Q∧∼R
∼P ∧∼Q∧R
Thus, the whole thing is true when (*) is false—i.e., when the negation of (*) is
true. But the negation of (*) is equivalent to:
The only remaining thing to do is eliminate the ∧s with their equivalents using
the |, via the equivalence mentioned above: φ∧ψ is equivalent to (φ|ψ)|(φ|ψ).
That would take forever, and I’m getting a bit lazy, so I won’t write it out.
Note: a simpler expression of function g is: (P ↔Q)→(Q↔R), which
could then be expressed using the Sheffer stroke.
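Both the falsity-row claim and the simpler expression for g can be checked by brute force. A sketch:

```python
# Check (a sketch) of the two claims about g: it is false exactly on the
# rows P∧Q∧∼R and ∼P∧∼Q∧R, and it matches (P↔Q)→(Q↔R).

from itertools import product

g = {(1, 1, 1): 1, (1, 1, 0): 0, (1, 0, 1): 1, (1, 0, 0): 1,
     (0, 1, 1): 1, (0, 1, 0): 1, (0, 0, 1): 0, (0, 0, 0): 1}

for p, q, r in product((0, 1), repeat=3):
    false_row = (p and q and not r) or (not p and not q and r)
    assert (g[(p, q, r)] == 0) == bool(false_row)
    simpler = max(1 - int(p == q), int(q == r))   # (P↔Q)→(Q↔R)
    assert g[(p, q, r)] == simpler
```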
(First argument down the left column, second argument across the top:)

↓  | 1  0
1  | 0  0
0  | 0  1
We know that all the truth functions can be defined using just ∼ and ∨. So all
we need to do is show that ∼ and ∨ can be defined using ↓.
First, ∼φ can be defined as φ ↓ φ. (The ↓ generates a false sentence from
two trues, and a true sentence from two falses.)
Second, note that φ ↓ ψ is equivalent to ∼(φ∨ψ). So φ∨ψ is equivalent to
∼(φ ↓ ψ), and hence to (φ ↓ ψ) ↓ (φ ↓ ψ).
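These two definitions are easily verified by truth table. A sketch:

```python
# Truth-table check (a sketch) that ↓ (NOR) yields ∼ and ∨ as claimed:
# ∼φ is φ↓φ, and φ∨ψ is (φ↓ψ)↓(φ↓ψ).

from itertools import product

def nor(a, b): return int(not (a or b))

for p in (0, 1):
    assert nor(p, p) == 1 - p                      # ∼φ  =  φ↓φ
for p, q in product((0, 1), repeat=2):
    assert nor(nor(p, q), nor(p, q)) == max(p, q)  # φ∨ψ = (φ↓ψ)↓(φ↓ψ)
```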
of I . So, since SI (φ) and SI (ψ) are both #, we know that there exist four
precisifications of I : B, C , D, and E , such that:
VB (φ) = 1 VC (φ) = 0
VD (ψ) = 1 VE (ψ) = 0
Exercise 4.1a What we are trying to show is that ∀x(F x→(F x∨Gx)) is
valid—i.e., that this formula is true in every model—i.e., that for any model
M (= 〈D, I 〉), and any assignment to the variables g defined on that model,
V g ,M (∀x(F x→(F x∨Gx))) = 1. (I’ll leave the subscript M implicit from now
on.) So, suppose for reductio that this is not true—that is, suppose for some
model and some g defined on that model, we have:
i) So, for some u ∈ D, V_{g^x_u}(F x→(F x∨Gx)) = 0. Call one such u, “u”.
i) Suppose for reductio that for some model and some g , V g (∀x(F x∧Gx)→
(∀xF x∧∀xGx)) = 0
iv) Suppose the former (i.e., V_g (∀xF x) = 0). Then for some u in the domain, V_{g^x_u}(F x) = 0. From the first part of ii), we know that for every object in the domain, and so for u in particular, V_{g^x_u}(F x∧Gx) = 1. From the clause in the definition of V for ∧, we know that V_{g^x_u}(F x) = 1. Contradiction.

v) Suppose the latter (i.e., V_g (∀xGx) = 0). Then, for some v in the domain, V_{g^x_v}(Gx) = 0. From the first part of ii), we know that for every object in the domain, and so for v in particular, V_{g^x_v}(F x∧Gx) = 1. So V_{g^x_v}(Gx) = 1. Contradiction.
Exercise 4.1c What we’re trying to show is that the set of formulas
{∀x(F x→Gx), ∀x(Gx→H x)} logically implies the sentence ∀x(F x→H x).
That is, we’re trying to show that ∀x(F x→H x) is true in every model, M , in
which all the premises in the set are true. So, we proceed as follows: suppose
for reductio that in some model, M , each of the premises are true and the
conclusion is false. We then reason as follows:
i) Since the conclusion is false in this model, we know that for some g ,
V g (∀x(F x→H x)) = 0.
ii) Since the premises are true, we know that for each variable assignment, and so for g in particular, V_g (∀x(F x→Gx)) = 1 and V_g (∀x(Gx→H x)) = 1.

iii) From i), for some u ∈ D, V_{g^x_u}(F x→H x) = 0 (clause for ∀). Call it “u”.
a) V_{g^x_u}(F x) = 0 or V_{g^x_u}(Gx) = 1; and

b) V_{g^x_u}(Gx) = 0 or V_{g^x_u}(H x) = 1
iii) Given the former, for some member of the domain, call it u, V_{g^x_u}(∀yRxy) = 1

iv) Given the latter, for some member of the domain, call it v, V_{g^y_v}(∃xRxy) = 0

v) From line iii), we know that for each member of the domain, and so for v in particular, V_{g^{xy}_{uv}}(Rxy) = 1

vi) From line iv) we know that for each member of the domain, and so for u in particular, V_{g^{yx}_{vu}}(Rxy) = 0.

vii) The function g^{xy}_{uv} is the same function as the function g^{yx}_{vu} (each is the function just like g , except that it assigns u to x and v to y). So the previous two lines contradict.
D = {0, 1}
I (F ) = {0}
I (G) = {0, 1}
D = {0, 1}
I (F ) = {0}
I (G) = {0}
Exercise 4.2c To show that Rab does not semantically imply ∃xRx x, we
need to find a model in which the first is true and the second is false. Here is
such a model:
D = {0, 1}
I (a) = 0
I (b ) = 1
I (R) = {〈0, 1〉}
• → •
But now, given the second premise, this new thing has to R something. It
can’t R back to the first thing, because given transitivity, that first thing would
then need to R itself. So we need to add a third thing:
• → • → •
Also, given transitivity, the first thing Rs the third thing. But now this third
thing needs to R something. It can’t R itself, or any of the things earlier in the
sequence, because each of those things Rs it; so given transitivity, if the third
thing Rs any of those things, it would have to R itself.
And so on. We can never stop with any finite model, since the second
premise will always force us to add another object.
But we can have an infinite model, in which each object Rs all the later
objects:
• → • → • → • → . . .
In essence, R is interpreted in this model as meaning “is less than”. The first
premise is true in the model because “is less than” is a transitive relation. The
second premise is true because for each natural number there exists a greater
natural number. The conclusion is false because no number is less than itself.
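The claim that no finite model will do can be checked by brute force for small domains. A sketch over domains of up to three objects (larger domains follow the same pattern):

```python
# Brute-force check (a sketch) that no small model makes both premises true
# and the conclusion false: no relation on a domain of up to 3 objects is
# transitive and serial ("everything Rs something") yet irreflexive.

from itertools import product

def counterexample_exists(n):
    dom = range(n)
    pairs = [(x, y) for x in dom for y in dom]
    for bits in product((0, 1), repeat=len(pairs)):
        R = {p for p, b in zip(pairs, bits) if b}
        transitive = all((x, z) in R
                         for (x, y) in R for (y2, z) in R if y == y2)
        serial = all(any((x, y) in R for y in dom) for x in dom)
        irreflexive = all((x, x) not in R for x in dom)
        if transitive and serial and irreflexive:
            return True
    return False

assert not any(counterexample_exists(n) for n in (1, 2, 3))
```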
Exercise 5.1a F ab ⊨ ∀x(x=a→F x b ):
i) Suppose for reductio that for some model and some g , V g (F ab ) = 1, but
…
ii) …V g (∀x(x=a→F x b )) = 0.
g) And [b]_{g^x_u} is I (b )

h) So we have: 〈I (a), I (b )〉 ∉ I (F )

c) so 〈I (a), I (b )〉 ∈ I (F )

d) this contradicts line h)
Exercise 5.1b We are to show that ∃x∃y∃z(F x∧F y∧F z∧x≠y∧x≠z∧y≠z), ∀x(F x→(Gx∨H x)) ⊭ ∃x∃y∃z(Gx∧Gy∧Gz∧x≠y∧x≠z∧y≠z). We need a model in which the two premises:
Exercise 5.2b “The only truly great player who plays in the NBA is Allen
Iverson”: ∀x[(Gx∧P xn)→x=i ]
for ‘is a solitary confinement cell’, ‘S xy z’ stand for ‘x shares y with z’ (it’s a
three place predicate), and ‘I ’ stand for ‘is in’:
Exercise 5.2d The shortest symbolization of “there are at least five dinosaurs”
I could find:
Exercise 5.4a To show that ⊨ ∀xF x→F f (a), let M be any model, let g be any assignment in that model.
D = {0, 1, 2}
I ( f ) = the function g such that g (0) = 1, g (1) = 2, g (2) = 0
Exercise 5.5a We must show that ⊨ ∀xL(x, ιyF xy)→∀x∃yLxy. It’s easy to
get confused by the complexity of the antecedent here, “∀xL(x, ιyF xy)”. This
just has the form: ∀xLxα , where α is “ιyF xy”. L is a two-place predicate;
it applies to the terms x and α. If you think of “F xy” as meaning that x is a
father of y, and “Lxy” as meaning that x loves y, then ∀xL(x, ιyF xy) means
“everyone x loves the y that he (x) is the father of”.
Now for the proof. Suppose for reductio that in some model, and some
assignment g in that model:
iii) Given the second, for some u ∈ D, V_{g^x_u}(∃yLxy) = 0. Call this u “u”.

iv) Given the first, for every v ∈ D, V_{g^x_v}(L(x, ιyF xy)) = 1

a) [x]_{g^x_u} is g^x_u (x)—i.e., u.
Aside: It doesn’t really matter for this problem, but we can infer something about v. Remember that E , the emptiness marker, is never in the extension of any predicate. That goes for two-place predicates like L, as well as one-place predicates. What that means is that for any ordered pair 〈o1 , o2 〉, if 〈o1 , o2 〉 ∈ I (L), then neither o1 nor o2 can be E . Thus, since 〈[x]_{g^x_u}, [ιyF xy]_{g^x_u}〉 ∈ I (L), we can conclude that [ιyF xy]_{g^x_u}—i.e., v—is not E . What’s more, given the definition of denotation for ι terms, there must exist exactly one object v ∈ D such that V_{g^{xy}_{uv}}(F xy) = 1, and [ιyF xy]_{g^x_u} is this v. (If there weren’t exactly one such v, then [ιyF xy]_{g^x_u} would be E .) Summary: we know that v is not the emptiness marker, but rather is the one and only object in the domain such that V_{g^{xy}_{uv}}(F xy) = 1.
viii) Now, from line iii) we have: for every o ∈ D, V_{g^{xy}_{uo}}(Lxy) = 0
Exercise 5.5b We must show that ⊭ GιxF x→F ιxGx. To make this formula
false in a model, we need to make GιxF x true and F ιxGx false. Let’s think
about the denotation of ιxF x. To make GιxF x true, the denotation of ιxF x
must be in the extension of G; that means that it can’t be the emptiness marker.
So let’s let the denotation of ιxF x be the number 0. Now, 0 must be the one
and only object in the extension of F (since it is not the emptiness marker and
is the denotation of ιxF x.) So we have this so far:
D = {E , 0, . . . ?
I (F ) = {0}
I (G) = {0, . . . ?
Now let’s ask: can we stop there? Can we let G’s extension just contain 0 and
nothing else? The answer is no. For if 0 is the one and only object in G’s
extension, then 0 will be the denotation of ιxGx. But since 0 is in the extension
of F , that would make F ιxGx be true, whereas we want it to be false. So we
need to add something else to G’s extension:
D = {E , 0, 1}
I (F ) = {0}
I (G) = {0, 1}
Now, the denotation of ιxGx is the emptiness marker, E . Since E is not in the
extension of F , F ιxGx is false, which is what we want.
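The reasoning about ι denotations can be mimicked in code. A sketch (the function name and encoding are my own) of the denotation rule (unique satisfier if there is one, E otherwise) applied to the model just constructed:

```python
# Denotation of ι terms (a sketch): ιxφ denotes the unique object
# satisfying φ if there is exactly one, and the emptiness marker E
# otherwise. This reproduces the model just constructed.

E = 'E'                                   # emptiness marker

def iota(ext):
    """Denotation of an ι term whose matrix has extension ext."""
    return next(iter(ext)) if len(ext) == 1 else E

F_ext, G_ext = {0}, {0, 1}
assert iota(F_ext) == 0                   # ιxF x denotes 0, so GιxF x is true
assert iota(G_ext) == E                   # ιxGx denotes E
assert iota(G_ext) not in F_ext           # so F ιxGx is false, as desired
```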
Exercise 5.7a “If a person commits a crime, then the judge that sentences
him/her wears a wig”: ∀x[(P x∧∃y(C y∧M xy)) → ∃y(W y∧Eιz(J z∧S z x)y)]
(“E x1 x2 ” = “x1 wears x2 ”)
Exercise 5.7b “The tallest spy is a spy": Sιx(S x∧∀y((S y∧y≠x)→T xy))
Exercise 5.8 “The ten-feet-tall man is not happy”, symbolized first using
the ι, and then (under two readings) using Russell’s method:
∼H ιx(T x∧M x)
∼∃x(T x∧M x ∧ ∀y((T y∧M y)→y=x) ∧ H x)
∃x(T x∧M x ∧ ∀y((T y∧M y)→y=x) ∧ ∼H x)
The first Russellian symbolization says that it’s not true that: there is exactly
one ten-feet-tall man who is happy. The second says that there is exactly one
ten-feet-tall man, and he is not happy. So if there isn’t exactly one ten-feet-tall
man (whether because no man is ten-feet-tall, or because more than one man
is ten-feet-tall), then the first is true while the second is false. Given that the
null individual is not in the extension of any predicate, the ι symbolization is
also true if there is not exactly one ten-feet-tall man; so the first Russellian
symbolization is like the ι symbolization.
Exercise 5.9 The semantics of ∃prime: For any model M and any variable
assignment g, VM,g(∃prime αφ) = 1 iff |φM,g,α| is prime — i.e., iff the number
of objects that satisfy φ (relative to M, g, and α) is prime.
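This clause can be sketched computationally. In the Python sketch below (function names `is_prime` and `exists_prime` are illustrative, and the matrix φ is modeled as a Python predicate), the quantified formula counts as true exactly when the number of satisfiers is prime:

```python
def is_prime(n):
    """Trial-division primality test, adequate for small domains."""
    return n >= 2 and all(n % k for k in range(2, int(n ** 0.5) + 1))

def exists_prime(domain, phi):
    """V(∃prime α φ) = 1 iff |{u in D : φ holds of u}| is prime."""
    return is_prime(sum(1 for u in domain if phi(u)))

D = {0, 1, 2, 3, 4, 5}
assert exists_prime(D, lambda u: u % 2 == 0)  # three even numbers: 3 is prime
assert not exists_prime(D, lambda u: u < 4)   # four satisfiers: 4 is not prime
```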
[Diagram: countermodel for 2(P ∨3Q)→(2P ∨3Q), false at r; details in the
official model below.]
Official model:
W = {r, a, b}
R = {〈r, r〉, 〈a, a〉, 〈b, b〉, 〈r, a〉, 〈a, r〉, 〈a, b〉, 〈b, a〉}
I (P, r) = I (Q, b) = 1, all else 0
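The official model can be verified by brute force. Here is a small Python sketch (not part of the text; the helper names `box` and `dia` are illustrative) evaluating the formula at r:

```python
W = {"r", "a", "b"}
R = {("r","r"), ("a","a"), ("b","b"), ("r","a"), ("a","r"), ("a","b"), ("b","a")}
I = {("P", "r"): 1, ("Q", "b"): 1}  # all else 0

def val(letter, w):
    return I.get((letter, w), 0)

def box(phi, w):   # 2φ: φ true at every world w sees
    return all(phi(v) for (u, v) in R if u == w)

def dia(phi, w):   # 3φ: φ true at some world w sees
    return any(phi(v) for (u, v) in R if u == w)

antecedent = lambda w: box(lambda v: val("P", v) or dia(lambda x: val("Q", x), v), w)
consequent = lambda w: box(lambda v: val("P", v), w) or dia(lambda v: val("Q", v), w)

# 2(P∨3Q) is true at r, but 2P∨3Q is false there:
assert antecedent("r") and not consequent("r")
```

Note that R here is reflexive and symmetric but not transitive, which is consistent with the S4-validity proof that follows.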
i) Suppose for reductio that in some S4-model 〈W, R, I〉, for some world
r ∈ W, V(2(P ∨3Q)→(2P ∨3Q), r) = 0
iv) Given iii), V(2P, r) = 0. So for some world, call it a, Rra and V(P, a) = 0
[Diagram: countermodel for 3(P ∧3Q)→(23P →32Q), false at r; details in
the official model below.]
Official model:
W = {r, a}
R = {〈r, r〉, 〈a, a〉, 〈r, a〉, 〈a, r〉}
I (P, r) = I (Q, a) = 1; all else 0
[Diagram: countermodel for 2(P ∧Q)→22(3P →3Q), false at r; details in
the official model below.]
Official model:
W = {r, a, b}
R = {〈r, r〉, 〈a, a〉, 〈b, b〉, 〈r, a〉, 〈a, b〉}
I (P, r) = I (Q, r) = I (P, a) = I (Q, a) = I (P, b) = 1, all else 0
i) Suppose for reductio that the formula is false at some world r in some
S4-model
iv) Given ii), for some world, call it a, Rra and V(2(3P →3Q), a) = 0. And
so, for some world, call it b, Rab and V(3P →3Q, b) = 0
vi) …V(3Q, b) = 0
vii) Since Rra and Rab, by transitivity Rrb. Given ii), V(P ∧Q, b) = 1, and
so V(Q, b) = 1.
B-validity proof: like the S4-validity proof through the first 6 steps; then:
viii) Since R ra, given ii), V(P ∧Q, a) = 1, and so V(Q, a) = 1, contradicting
vii)
[Diagram: countermodel for 2(2P →Q)→2(2P →2Q), false at r; details in
the official model below.]
Official model:
W = {r, a, b}
R = {〈r, r〉, 〈a, a〉, 〈b, b〉, 〈r, a〉, 〈a, r〉, 〈r, b〉, 〈b, r〉}
I (P, r) = I (Q, r) = I (P, a) = I (Q, a) = I (P, b) = 1, all else 0
i) Suppose for reductio that the formula is false at some world r of some
S4-model.
vi) …V(2Q, a) = 0.
[Diagram: S4-countermodel for 33P →23P , false at r; details in the official
model below.]
Official model:
W = {r, a, b}
R = {〈r, r〉, 〈a, a〉, 〈b, b〉, 〈r, a〉, 〈r, b〉}
I (P, a) = 1, all else 0
B-countermodel:
[Diagram: the formula 33P →23P is false at r; details in the official model
below.]
Official model:
W = {r, a, b}
R = {〈r, r〉, 〈a, a〉, 〈b, b〉, 〈r, a〉, 〈r, b〉, 〈a, r〉, 〈b, r〉}
I (P, a) = 1, all else 0
S5 validity proof:
i) Suppose for reductio that the formula is false at some world r in some
S5 model
iii) …V(23P, r ) = 0
[Diagram: S4-countermodel for 2[2(P →2P )→2P ]→(32P →2P ), false at r;
details in the official model below.]
Official model:
W = {r, a, b}
R = {〈r, r〉, 〈a, a〉, 〈b, b〉, 〈r, a〉, 〈r, b〉, 〈b, r〉, 〈b, a〉}
I (P, r) = I (P, a) = 1, all else 0
[Diagram: B-countermodel for 2[2(P →2P )→2P ]→(32P →2P ), false at r;
details in the official model below.]
Official model:
W = {r, a, b}
R = {〈r, r〉, 〈a, a〉, 〈b, b〉, 〈r, a〉, 〈a, r〉, 〈r, b〉, 〈b, r〉}
I (P, r) = I (P, a) = 1, all else 0
i) Suppose for reductio that the formula is false in some world, r , of some
S5 model
1. Q→(P →Q) PL
2. 2Q→2(P →Q) 1, NEC, K, MP
3. 2(P →Q)→(3P →3Q) K3
4. 3P →(2Q→3Q) 2, 3, PL
1. 2P →3P D
2. 2P →∼2∼P rewrite of 1 given def of 3
3. ∼(2P ∧2∼P ) 2, PL
1. (P ∧Q)→∼(P →∼Q) PL
2. 2(P ∧Q)→2∼(P →∼Q) 1, NEC, K, MP
3. 2∼(P →∼Q)→3∼(P →∼Q) D
4. 3∼(P →∼Q)↔∼2(P →∼Q) MN
5. ∼[2(P ∧Q)∧2(P →∼Q)] 2, 3, 4, PL
6. 2∼[2(P ∧Q)∧2(P →∼Q)] 5, NEC
7. 3∼[2(P ∧Q)∧2(P →∼Q)] D, 6, MP
8. 3∼[2(P ∧Q)∧2(P →∼Q)]↔ MN
∼2[2(P ∧Q)∧2(P →∼Q)]
9. ∼2[2(P ∧Q)∧2(P →∼Q)] 7, 8, PL
1. 2P →P T
2. 2P →(P ∨Q) 1, PL
3. 32P →3(P ∨Q) 2, NEC, K3, MP
1. 23P →3P T
2. 323P →33P 1, NEC, K3, MP
3. 33P →3P S43
4. 323P →3P 2, 3, PL
5. 2323P →23P 4, NEC, K, MP
1. 323P →3P B
2. 2323P →23P 1, NEC, K, MP
1. 2P →32P T3
2. 22P →232P 1, NEC, K, MP
3. 2P →22P S4
4. 2P →232P 2, 3, PL
5. 32P →3232P 4, NEC, K3, MP
Exercise 6.9a We are to show that ⊢S5 (2P ∨3Q)↔2(P ∨3Q). My strat-
egy for the first half, in lines 1–7, uses the fact from propositional logic that
(φ∨ψ)→χ follows from φ→χ and ψ→χ. My strategy for the second half uses
MN, plus the fact from propositional logic that χ→(φ∨ψ) is equivalent to
χ→(∼ψ→φ):
1. P →(P ∨3Q) PL
2. 2P →2(P ∨3Q) 1, Nec, K, MP
3. 3Q→(P ∨3Q) PL
4. 23Q→2(P ∨3Q) 3, Nec, K, MP
5. 3Q→23Q S53
6. 3Q→2(P ∨3Q) 4, 5, PL
7. (2P ∨3Q)→2(P ∨3Q) 2, 6, PL (done left-to-right; now
for the other direction. Goal: get
2(P ∨3Q)→(2∼Q→2P ))
8. 2∼Q↔∼3Q MN
9. (P ∨3Q)→(2∼Q→P ) 8, PL
10. 2(P ∨3Q)→2(2∼Q→P ) 9, NEC, K, MP
11. 2(2∼Q→P )→(22∼Q→2P ) K
12. 2∼Q→22∼Q S4
13. 2(P ∨3Q)→(2∼Q→2P ) 10, 11, 12, PL
14. 2(P ∨3Q)→(2P ∨3Q) 13, 8, PL
15. (2P ∨3Q)↔2(P ∨3Q) 7, 14, PL
1. ∼(2P →2Q)→2P PL
2. 3∼(2P →2Q)→32P 1, NEC, K3, MP
3. 32P →2P S5
4. 2P →22P S4
5. 2P →(2Q→2P ) PL
6. 22P →2(2Q→2P ) 5, NEC, K, MP
7. 3∼(2P →2Q)↔∼2(2P →2Q) MN
8. ∼2(2P →2Q)→2(2Q→2P ) 7, 2, 3, 4, 6, PL
9. 2(2P →2Q)∨2(2Q→2P ) 8, PL
Exercise 6.11 Where S is any normal modal system, we must show that if
∆ is an S-consistent set of wffs containing the formula 3φ, then 2− (∆) ∪ {φ} is
also S-consistent.
3φ is an abbreviation of ∼2∼φ; so what we’re given is this: S is a normal
modal system, ∆ is an S-consistent set of wffs containing the formula ∼2∼φ.
By Lemma 6.6, 2− (∆) ∪ {∼∼φ} is S-consistent. Now suppose for reductio
that 2− (∆) ∪ {φ} is not S-consistent. So given the definition of S-consistency,
for some ψ1 . . . ψn in 2− (∆) ∪ {φ}, ⊢S ∼(ψ1 ∧ · · · ∧ψn ). Since S includes PL,
⊢S ∼(ψ1 ∧ · · · ∧ψn ∧φ). If φ is one of the ψs, then the rest of the ψs are members
of 2− (∆); and if φ is not among the ψs, then all of the ψs are members of 2− (∆).
Either way, for some δ1 . . . δ m in 2− (∆), ⊢S ∼(δ1 ∧ · · · ∧δ m ∧φ). Since S includes
PL, we have: ⊢S ∼(δ1 ∧ · · · ∧δ m ∧∼∼φ), which violates the S-consistency of
2− (∆) ∪ {∼∼φ}.
Exercise 6.12 We are to demonstrate completeness for the system that results
from adding to K every axiom of the form 3φ→2φ, where the frames for this
system are defined as those whose accessibility relation meets the condition
that every world can see at most one world. Let’s first show that
(*) in the canonical model for the strange system, every world sees at most
one world.
To do this, suppose for reductio that for some world, w, in this canonical
model, Rwv and Rwv′ and v≠v′. Now, since v≠v′, and v and v′ are maximal
consistent sets of sentences, there must be some sentence, φ, that is a member
of one set but not the other. Without loss of generality, suppose that φ ∈ v and
φ ∉ v′. Then, by theorem 6.7, V(φ, v) = 1 and V(φ, v′) = 0. Since Rwv and
Rwv′, that means that V(3φ, w) = 1 (since φ is true in some world accessible
from w) and V(2φ, w) = 0 (since φ isn’t true at all worlds accessible from w).
So V(3φ→2φ, w) = 0. But 3φ→2φ is a theorem of the strange system, and
so by 6.4d is a member of w, and so by theorem 6.7 is true at w. Contradiction.
Now we use (*) to prove completeness for the strange system. Suppose φ
is valid in the strange system. That is, φ is true in any world of any model in
which the accessibility relation is such that every world sees at most one world.
Given (*), the canonical model for the strange system is such a model. So φ is
true at every world in the canonical model—i.e., is valid in the canonical model
for this system. By corollary 6.8, φ is a theorem of the strange system.
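The frame condition driving this completeness result can be spot-checked by brute force. The following Python sketch (illustrative, not part of the proof) enumerates every three-world frame in which each world sees at most one world, together with every valuation of P, and confirms that 3P →2P holds at every world:

```python
from itertools import product

def check():
    worlds = [0, 1, 2]
    # "sees at most one world": each world has a single successor or none
    for succ in product(worlds + [None], repeat=3):
        R = {(w, succ[w]) for w in worlds if succ[w] is not None}
        for valuation in product([0, 1], repeat=3):
            for w in worlds:
                dia = any(valuation[v] for (u, v) in R if u == w)  # 3P at w
                box = all(valuation[v] for (u, v) in R if u == w)  # 2P at w
                if dia and not box:
                    return False
    return True

assert check()  # 3P→2P is never falsified on such frames
```

The check succeeds because when a world sees at most one world, 3P at w requires that the lone successor verifies P, and then 2P holds trivially.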
Exercise 7.1 We’re to show that φ ⊨I ψ iff ⊨I φ→ψ. For the left-to-right
direction, suppose φ ⊨I ψ, and suppose for reductio that V(φ→ψ, s) = 0. Then
for some s′ (that s sees), V(φ, s′) = 1 and V(ψ, s′) = 0; this contradicts φ ⊨I ψ.
For the other direction, suppose ⊨I φ→ψ, and suppose for reductio that
V(φ, s) = 1 while V(ψ, s) = 0. By ⊨I φ→ψ, V(φ→ψ, s) = 1. By reflexivity, either
V(φ, s) = 0 or V(ψ, s) = 1. Contradiction.
[Diagram: intuitionistic countermodel showing that ∼P ∨∼Q does not follow
from ∼(P ∧Q). Stages: r, which accesses a and b; P is true at a, Q is true at
b. ∼(P ∧Q) is true at r, since P ∧Q fails at r, a, and b; but ∼P and ∼Q are
both false at r, so ∼P ∨∼Q is false at r.]
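The countermodel can be verified with a small evaluator for intuitionistic Kripke semantics (a Python sketch; the names `neg`, `sees`, and `conj` are illustrative):

```python
STAGES = {"r", "a", "b"}
R = {("r","r"), ("a","a"), ("b","b"), ("r","a"), ("r","b")}  # reflexive, transitive
ATOMS = {("P", "a"): 1, ("Q", "b"): 1}  # hereditary: a and b are terminal stages

def sees(s):
    return [t for (u, t) in R if u == s]

def neg(phi, s):   # intuitionistic ∼φ: φ fails at every accessible stage
    return all(not phi(t) for t in sees(s))

P = lambda s: ATOMS.get(("P", s), 0)
Q = lambda s: ATOMS.get(("Q", s), 0)
conj = lambda s: P(s) and Q(s)

assert neg(conj, "r")                    # ∼(P∧Q) is true at r
assert not (neg(P, "r") or neg(Q, "r"))  # ∼P∨∼Q is false at r
```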
[Diagram: intuitionistic countermodel showing that (P →Q)∨(P →R) does not
follow from P →(Q∨R). Stages: r, which accesses a and b; at a, P and R are
true but Q is false; at b, P and Q are true but R is false. So P →(Q∨R) is
true at r, but P →Q fails at r (via a) and P →R fails at r (via b).]
Exercise 7.4 We are assuming the inductive hypothesis (ih) that heredity
holds for formulas φ and ψ, and we must show that heredity then must also
hold for ∼φ, φ→ψ, and φ∨ψ.
∼: Suppose for reductio that V(∼φ, s) = 1, Rss′, and V(∼φ, s′) = 0. Given
the latter, for some s″, Rs′s″ and V(φ, s″) = 1. By transitivity, Rss″. This
contradicts V(∼φ, s) = 1.
→: Suppose for reductio that V(φ→ψ, s) = 1, Rss′, and V(φ→ψ, s′) = 0.
Given the latter, for some s″, Rs′s″ and V(φ, s″) = 1 and V(ψ, s″) = 0; but by
transitivity, Rss″, which contradicts the fact that V(φ→ψ, s) = 1.
∨: Suppose for reductio that V(φ∨ψ, s) = 1, Rss′, and V(φ∨ψ, s′) = 0.
Given the former, either V(φ, s) = 1 or V(ψ, s) = 1; and so, given (ih), either φ
or ψ is 1 in s′. That violates V(φ∨ψ, s′) = 0.
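The three induction cases can be checked exhaustively in a tiny setting. The Python sketch below (illustrative; not part of the proof) runs through every pair of hereditary atom valuations on the one nontrivial two-stage frame and confirms that ∼P , P →Q, and P ∨Q remain hereditary:

```python
from itertools import product

def hereditary(val, R):
    """Heredity: once a formula is 1 at a stage, it stays 1 at later stages."""
    return all(val[t] for (s, t) in R if val[s])

def check():
    R = {(0, 0), (1, 1), (0, 1)}  # reflexive, transitive two-stage frame
    sees = {0: [0, 1], 1: [1]}
    for P, Q in product(product([0, 1], repeat=2), repeat=2):
        if not (hereditary(P, R) and hereditary(Q, R)):
            continue  # the inductive hypothesis (ih)
        neg = tuple(int(all(not P[t] for t in sees[s])) for s in (0, 1))
        imp = tuple(int(all(not P[t] or Q[t] for t in sees[s])) for s in (0, 1))
        dis = tuple(int(P[s] or Q[s]) for s in (0, 1))
        if not all(hereditary(f, R) for f in (neg, imp, dis)):
            return False
    return True

assert check()
```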
Exercise 7.5 We must show that ∧E, ∨I, DNI, RAA, →I, →E, and EF pre-
serve I-validity.
∧E: Assume that Γ ⊢ φ∧ψ is I-valid, and suppose for reductio that V(Γ, s) =
1 and V(φ, s) = 0, for some stage s in some model. By the I-validity of Γ ⊢ φ∧ψ,
V(φ∧ψ, s) = 1, so V(φ, s) = 1. Contradiction. The case of ψ is parallel.
∨I: Assume that Γ ⊢ φ is I-valid and suppose for reductio that V(Γ, s) = 1
but V(φ∨ψ, s) = 0. Thus V(φ, s) = 0; but by the I-validity of Γ ⊢ φ, V(φ, s) = 1.
Contradiction.
DNI (if Γ ⊢ φ then Γ ⊢ ∼∼φ): assume Γ ⊢ φ is I-valid, and suppose for
reductio that V(Γ, s) = 1 but V(∼∼φ, s) = 0. From the latter, for some s′, Rss′
and V(∼φ, s′) = 1. So V(φ, s′) = 0 (since R is reflexive). From the former and
the fact that Γ ⊢ φ, V(φ, s) = 1. This violates general heredity.
RAA: Suppose that Γ, φ ⊢ ψ∧∼ψ is I-valid, and suppose for reductio that
V(Γ, s) = 1 but V(∼φ, s) = 0. Then for some s′, Rss′ and V(φ, s′) = 1. Since
V(Γ, s) = 1 (i.e., all members of Γ are 1 at s), by general heredity V(Γ, s′) = 1 (all
members of Γ are 1 at s′). Thus, since Γ, φ ⊢ ψ∧∼ψ is I-valid, V(ψ∧∼ψ, s′) = 1.
But that is impossible. (If V(ψ∧∼ψ, s′) = 1, then V(ψ, s′) = 1 and V(∼ψ, s′) = 1;
but from the latter and the reflexivity of R it follows that V(ψ, s′) = 0.)
→I: Suppose that Γ, φ ⊢ ψ is I-valid, and suppose for reductio that V(Γ, s) = 1
but V(φ→ψ, s) = 0. Given the latter, for some s′, Rss′ and V(φ, s′) = 1 and
V(ψ, s′) = 0. Given general heredity, V(Γ, s′) = 1. And so, given that Γ, φ ⊢ ψ is
I-valid, V(ψ, s′) = 1. Contradiction.
→E: Suppose that Γ ⊢ φ and ∆ ⊢ φ→ψ are both I-valid, and suppose for
reductio that V(Γ ∪ ∆, s) = 1 but V(ψ, s) = 0. Since Γ ⊢ φ and ∆ ⊢ φ→ψ are
I-valid, V(φ, s) = 1 and V(φ→ψ, s) = 1. Given the latter, and given that R is
reflexive, either V(φ, s) = 0 or V(ψ, s) = 1. Contradiction.
EF (ex falso): Suppose Γ ⊢ φ∧∼φ is I-valid, and suppose for reductio that
V(Γ, s) = 1 but V(ψ, s) = 0. Given the former and the I-validity of Γ ⊢ φ∧∼φ,
V(φ∧∼φ, s) = 1, which is impossible.
i) Suppose φ⇒ψ is true at r , and suppose for reductio that φ2→ψ is false
at r .
iii) But that can’t be. “φ⇒ψ” means 2(φ→ψ). So φ→ψ is true at every
world. So there can’t be a world like a, in which φ is true and ψ is false.
iv) …ψ is false at r
vi) Given iii) and v), r is a closest-to-r φ world. So, given i), ψ is true at r .
Contradicts iv).
v) …V(∼(P 2→∼Q), w) = 0.
vi) Given ii) and the limit assumption, there is some nearest-to-w P -world,
call it v.
x) …V(P 2→Q, w) = 0
xi) Given x), there is some nearest-to-w P -world, call it v′, such that
V(Q, v′) = 0.
xii) Given ix), V(P 2→∼Q, w) = 0. So there’s some nearest-to-w P -world,
call it v″, such that V(∼Q, v″) = 0.
(Note that this wff is invalid given Lewis’s semantics, as the countermodel from
section 8.7 shows.)
Exercise 8.2b ⊭SC [P 2→(Q→R)]→[(P ∧Q)2→R]:
[Diagram: similarity ordering from r: nearest is a (where P is true and Q false,
so Q→R is true), then b (where P ∧Q is true and R false). The nearest P -world
to r is a, so P 2→(Q→R) is true at r; the nearest P ∧Q-world is b, so
(P ∧Q)2→R is false at r.]
Official model:
W = {r, a, b}
≼r = {〈a, b〉, . . . }
I (P, b) = I (Q, b) = I (P, a) = 1; all else 0
[Diagram: countermodel for [P 2→(Q2→R)]→[Q2→(P 2→R)], false at r;
views from r, a, and b are shown. From r the worlds are ordered a, b, c, d; at
a (the nearest P -world to r) the nearest Q-world is c, where R is true, so
Q2→R is true at a and the antecedent is true at r; at b (the nearest Q-world
to r) the nearest P -world is d, where R is false, so P 2→R is false at b and the
consequent is false at r.]
Official model:
W = {r, a, b, c, d}
≼r = {〈a, b〉, 〈b, c〉, 〈c, d〉, . . . }
≼a = {〈c, b〉, 〈b, r〉, 〈r, d〉, . . . }
≼b = {〈d, a〉, 〈a, r〉, 〈r, c〉, . . . }
I (P, a) = I (Q, b) = I (Q, c) = I (R, c) = I (P, d) = 1, all else 0
Exercise 8.3 We must show that in Lewis models where the limit and anti-
symmetry conditions hold, Lewis’s truth conditions reduce to Stalnaker’s. Con-
sider any Lewis model 〈W, ≼, I〉 in which the limit and anti-symmetry con-
ditions hold. Let VL and VS be the Lewis and Stalnaker valuation functions,
respectively, for this model. We must show that these are the same functions.
We’ll show by induction that for any wff φ and any world w, VS (φ, w) =
VL (φ, w). Base case: show that VS and VL assign the same truth values to
sentence letters at each world. This follows from the fact that both functions
by definition assign the truth values I (α, w) for sentence letters α.
Induction step: assuming the inductive hypothesis:
(ih) VS and VL assign the same truth values at each world to wffs φ and ψ
we must show that VS and VL also assign the same truth values at each world
to ∼φ, φ→ψ, 2φ, and φ2→ψ. This is easy in the first three cases, since i)
the clauses in the definitions of VS and VL for the ∼, →, and 2 are identical,
and define the truth values of the complex formulas ∼φ, φ→ψ, and 2φ at a
given world as a function of the truth values of φ and ψ at that world and other
worlds; and ii) (ih) tells us that φ and ψ have the same VS and VL values at all
worlds.
It remains to show that, for a given world w, VS (φ2→ψ, w) = VL (φ2→ψ, w).
Given Stalnaker’s truth conditions, we know that VS (φ2→ψ, w) = 1 iff:
(S) for any x, IF [VS (φ, x) = 1 and for any y such that VS (φ, y) = 1, x ≼w y]
THEN VS (ψ, x) = 1
and, given Lewis’s truth conditions, VL (φ2→ψ, w) = 1 iff:
(L) EITHER φ is trueL at no worlds, OR: there is some world, x, such that
VL (φ, x) = 1 and for all y, if y ≼w x then VL (φ→ψ, y) = 1
ii) Suppose for reductio that (L) isn’t true. Then each disjunct of (L) is false;
so:
iv) …NOT: “there is some world, x, such that VL (φ, x) = 1 and for all y, if
y ≼w x then VL (φ→ψ, y) = 1”. So: for every world, x, if VL (φ, x) = 1
then for some y, y ≼w x and VL (φ→ψ, y) = 0
v) Given iii) and the limit condition (which we are assuming holds in this
model), there is some world, call it a, that is a nearest-to-w world in
which φ is trueL .
ix) Given (ih) and vii), for all y, if VS (φ, y) = 1 then a ≼w y. (It’s crucial to
the success of this step that (ih) tells us that φ has the same value under
VS and VL at all worlds.)
xi) Given iv) and vi), there is some world, call it b, such that b ≼w a and …
ii) Suppose for reductio that (S) isn’t true. So for some world, call it a,
VS (φ, a) = 1 and…
vi) v) tells us that φ is trueL at some world. So by i), there is some world,
call it b, such that VL (φ, b) = 1 and …
viii) Given v) and iv), VL (φ→ψ, a) = 0. So from vii), it is not the case that
a ≼w b. But given vi) and the ih, VS (φ, b) = 1; and so, given iii), a ≼w b.
Contradiction.
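The equivalence of (S) and (L) under the limit and anti-symmetry conditions can be spot-checked by brute force in a toy setting. In the Python sketch below (illustrative names; distinct ranks make the ordering antisymmetric, and finiteness secures the limit condition), every valuation of φ and ψ over three worlds gives the two conditions the same verdict:

```python
from itertools import product

WORLDS = [0, 1, 2]
rank = {0: 0, 1: 1, 2: 2}  # x ≼w y iff rank[x] <= rank[y]; antisymmetric

def leq(x, y):
    return rank[x] <= rank[y]

def stalnaker(phi, psi):
    # (S): any φ-world at least as close as every φ-world is a ψ-world
    return all(psi[x] for x in WORLDS
               if phi[x] and all(leq(x, y) for y in WORLDS if phi[y]))

def lewis(phi, psi):
    # (L): no φ-world, or some φ-world x with φ→ψ throughout {y : y ≼w x}
    return (not any(phi)) or any(
        phi[x] and all(not phi[y] or psi[y] for y in WORLDS if leq(y, x))
        for x in WORLDS)

for phi, psi in product(product([0, 1], repeat=3), repeat=2):
    assert stalnaker(phi, psi) == lewis(phi, psi)
```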
iii) …Vg (∃x3F x, w) = 0
vi) The definition of a QML model specifies that the domain D cannot be
empty. So, D has at least one member; call some member of D “u”.
viii) From v), Vg^x_u (3F x, w) = 0 (where g^x_u is g with x reassigned to u),
from which it follows that for every member of W, and so for w′ in
particular, Vg^x_u (F x, w′) = 0, contradicting vii)
[Diagram: at world c, ∃x∃yRxy and its instances ∃yRxy (with x assigned u)
and Rxy (with x and y assigned u) are all false; details in the official model
below.]
Official model:
W = {r, c}
D = {u}
I (a) = u
I (R) = {〈u, u, r〉}
Exercise 9.2 Formulas 9.1b and 9.1c are SQML-invalid, and so remain
invalid in the variable domain semantics. But whereas 9.1a is SQML-valid, it
is VDQML-invalid:
APPENDIX B. ANSWERS TO REMAINING EXERCISES 347
[Diagram: 3∀xF x→∃x3F x is false at r: at a, everything in Da is F, so
3∀xF x is true at r; but nothing in Dr is F at any world, so ∃x3F x is false
at r.]
Official model:
W = {r, a}
Dr = {v}
Da = {u}
I (F ) = {〈u, a〉}
[Diagram: the formula is false at r.]
Official model:
W = {r, a}
R = {〈r, r〉, 〈r, a〉, 〈a, a〉}
D = {u, v}
Dr = {u, v}
Da = {v}
I (F ) = {〈u, r〉, 〈v, r〉, 〈v, a〉}
[Diagram: the formula is false at r.]
Official model:
W = {r, a}
R = {〈r, r〉, 〈r, a〉, 〈a, a〉}
D = {u, v}
Dr = {u, v}
Da = {v}
I (F ) = {〈u, r〉, 〈u, a〉}
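The variable-domain countermodel to 3∀xF x→∃x3F x can be checked mechanically. The Python sketch below uses the first official model above; since that model does not list an accessibility relation, the sketch assumes (as the diagram suggests) that r sees a and that R is reflexive:

```python
W = {"r", "a"}
R = {("r", "r"), ("a", "a"), ("r", "a")}  # assumed; not stated in the model
Dom = {"r": {"v"}, "a": {"u"}}            # variable domains
F = {("u", "a")}                          # I(F)

def sees(w):
    return [v for (u, v) in R if u == w]

def forall_F(w):
    """∀xFx at w: the quantifier ranges over w's own domain."""
    return all((d, w) in F for d in Dom[w])

dia_forall = any(forall_F(v) for v in sees("r"))                 # 3∀xFx at r
exists_dia = any(any((d, v) in F for v in sees("r"))             # ∃x3Fx at r
                 for d in Dom["r"])

assert dia_forall and not exists_dia  # the conditional is false at r
```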
Exercise 9.3c The model in exercise 9.3a shows that ⊭VDQML ∀x2∃y y=x.
v) By i), Vg (∀αφ, v) = 1
Exercise 10.1a We are to show that φ→2@φ is valid. Consider any model,
world w (and variable assignment), and suppose for reductio that V(φ, w, w) = 1
but V(2@φ, w, w) = 0. Given the latter, there is some world, v, such that
V(@φ, w, v) = 0. And so, given the truth condition for @, V(φ, w, w) = 0.
Contradiction.
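The doubly indexed reasoning here can be checked by brute force for atomic φ. In the Python sketch below (illustrative; the clauses assumed are V(@φ, w, v) = V(φ, w, w) and V(2φ, w, v) = 1 iff φ is 1 at (w, v′) for every v′), P →2@P comes out true at every diagonal point (w, w) of every two-world model:

```python
from itertools import product

WORLDS = [0, 1]

def diagonal_valid():
    """Check P→2@P at every (w, w) in every two-world model."""
    for I in product([0, 1], repeat=2):        # all valuations of P
        def V_P(w_actual, v):                  # atomic: the actual index is idle
            return I[v]
        def V_at_P(w_actual, v):               # @P: evaluate P at the actual world
            return V_P(w_actual, w_actual)
        def V_box_at_P(w_actual, v):           # 2@P: @P at every shifted world
            return all(V_at_P(w_actual, u) for u in WORLDS)
        for w in WORLDS:
            if V_P(w, w) and not V_box_at_P(w, w):
                return False
    return True

assert diagonal_valid()
```

The check succeeds for the reason given in the answer: shifting the world of evaluation under 2 leaves the actual-world index fixed, and @ sends evaluation back to that fixed index.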
Exercise 10.1b We are to show that 2×∀x3@F x→2∀xF x.
i) Suppose for reductio that for some world w, some variable assignment
g , and some model, Vg (2×∀x3@F x, w, w) = 1 and …
iii) Given the latter, for some world, call it “a”, Vg (∀xF x, w, a) = 0.
vii) Thus, for every object in the domain, and so for u in particular,
Vg^x_u (3@F x, a, a) = 1
xi) But given iv), 〈[x]g^x_u , a〉 ∉ I (F ). Contradiction