Explanation Versus Prediction: Which Carries More Evidential

University Press Scholarship Online

Oxford Scholarship Online

The Book of Evidence


Peter Achinstein

Print publication date: 2001


Print ISBN-13: 9780195143898
Published to Oxford Scholarship Online: November 2003
DOI: 10.1093/0195143892.001.0001

DOI:10.1093/0195143892.003.0010

Abstract and Keywords


According to one standard view, a prediction of a new fact always counts as stronger
evidence for a hypothesis than an explanation of a known one. According to another view,
it is the reverse. Both views are shown to be mistaken. What is important for evidence is
not whether it was predicted or explained, but (in the sort of cases used by each side)
what selection procedures were used to obtain the evidence. Stephen Brush's defense of
explanationism and Patrick Maher's defense of predictivism are critically examined, as
is Clark Glymour's problem of old evidence.
Keywords: Brush, evidence, explanationism, Glymour, Maher, old evidence, prediction, predictivism,
scientific explanation, selection procedures

1. The Historical Thesis of Evidence


According to a standard view, predictions of new phenomena provide stronger evidence
for a theory than explanations of old ones. More precisely, a theory that predicts

Page 1 of 23
PRINTED FROM OXFORD SCHOLARSHIP ONLINE (www.oxfordscholarship.com). (c) Copyright Oxford University Press, 2015.
All Rights Reserved. Under the terms of the licence agreement, an individual user may print out a PDF of a single chapter of a
monograph in OSO for personal use (for details see http://www.oxfordscholarship.com/page/privacy-policy). Subscriber: Pontificia
Universidad Catolica del Peru (PUCP); date: 30 April 2015

phenomena that did not prompt the initial formulation of that theory is better supported
by those phenomena than is a theory by known phenomena that generated the theory in
the first place and that the theory was used to explain. So say various philosophers of
science, including William Whewell in the nineteenth century and Karl Popper in the
twentieth, to mention just two.1
Stephen Brush takes issue with this on historical grounds.2 He argues that, generally
speaking, scientists do not regard new phenomena predicted by a theory, even ones of a
kind totally different from those that prompted the theory in the first place, as providing
better evidential support for that theory than is provided by already known phenomena
explained by the theory. By contrast, Brush claims, there are cases, including general
relativity and the periodic law of elements, in which scientists tend to consider known
phenomena explained by a theory as constituting much stronger evidence than novel
predictions.3
(p.211) Both the predictionist and the explanationist are committed to an interesting
historical thesis about putative evidence e and a hypothesis h, viz.
Historical thesis: Whether e if true is evidence that h, or how strong that evidence
is, depends on certain historical facts about e, h, or their relationship.4
For example, whether e if true is evidence that h, or how strong it is, depends on
whether e was known to be true before or after h was formulated. Various historical
positions are possible, as Alan Musgrave noted years ago in a very interesting article.5
On a simple predictionist view (which Musgrave classifies as purely temporal), e is
evidence that h only if e was not known to be true when h was first proposed. On another
view (which Musgrave attributes to Elie Zahar and calls heuristic), e is evidence that h
only if when h was first formulated it was not devised in order to explain e. On yet a third
historical view (which Musgrave himself accepts), e is evidence that h only if e cannot be
explained by a predecessor theory, that is, by a competing theory which was devised
by scientists prior to the formulation of h.
These are three examples of historical views concerned with the temporal order in which
e and h were formulated or known to be true. But other historical positions are possible.
For example, it might be held that whether e if true is evidence that h, and how strong
that evidence is, always depends on historical facts concerning how the results reported
in e were obtained, for example, what sampling methods were in fact used by those who
reported that e (more on this in section 2).
The historical thesis is not that e or h are themselves propositions about particular
historical events. (For example, e might be that light can be diffracted, and h might be that
light is a wave motion.) Rather the thesis is that even if neither e nor h describes a
particular event or set of events that occurred, whether e, if true, is evidence that h
depends upon the occurrence of some particular event or set of events pertaining to
when or how e, h, or the relationship between them came to be formulated or believed,
or how the results in e were obtained.

Is the historical thesis true or false? It is clearly true in the case of subjective evidence.
Whether e is some person's evidence that h depends on certain historical facts about e,
h, and their relationship, viz. whether that person in fact has or had certain beliefs about
e, h, and their relationship. However, defenders of the historical thesis are not speaking
of subjective evidence. They are not speaking of something that someone takes to be
evidence but of something that is (objective) evidence. In cases of the latter sort, I
propose to argue that the historical thesis is sometimes true, and sometimes false,
depending on the type of evidence in question. I will show how this comports with my
own theory of potential and veridical evidence. Then I will consider what implications, if
any, this has for the (p.212) debate between Brush and the predictionists and for the
problem of old evidence raised by Clark Glymour.
Before beginning, however, let me mention a curious but interesting fact about certain
well-known philosophical theories or definitions of objective evidence (other than mine),
including Carnap's theory of confirmation,6 Hempel's satisfaction theory,7 Glymour's
bootstrap account,8 and the usual hypothetico-deductive account. These theories are
incompatible with the historical thesis.9 They hold that whether, or the extent to which, e
is evidence that h is an a priori fact about the relationship between e and h. It is in no way
affected by empirical issues such as the time at which h was first proposed, or e was first
known, or by the intentions with which h was formulated, or how information reported in
e was obtained. Defenders of these views must reject both the predictionist and the
explanationist claims about evidence. They must say that whether, or the extent to which,
e supports h has nothing to do with whether e was first formulated as a novel prediction
from h or whether e was known before h and h was constructed to explain it.
Accordingly, we have two extreme or absolutist positions. There is the position, reflected
in the historical thesis, that evidence is always historical (in the sense indicated). And
there is a contrasting position, reflected in a priori views, that evidence is never historical.
Does the truth lie at either extreme? Or is it somewhere in the middle? In what follows I
will pursue these questions with respect to my concepts of potential and veridical
evidence, the ones most crucial to scientists (although what I will say is also applicable to
ES-evidence).

2. Selection Procedures
Suppose that an investigator decides to test the efficacy of a drug D in relieving
symptoms S. The hypothesis under consideration is
h: Drug D relieves symptoms S in approximately 95% of the cases.
As I have emphasized in chapter 9, whether (and the extent to which) some test result is
evidence that a certain hypothesis is true depends on the selection procedure used to
obtain that result. Here are two of the many possible selection procedures for testing h:
SP1: Choose a sample of 2000 persons of different ages, both sexes, who have
symptoms S in varying degrees; divide them arbitrarily into two groups; give one
group drug D and the other a placebo; determine how many in each group have
their symptoms relieved.

SP2: Choose a sample of 2000 females aged 5, all of whom have symptoms S in a
very mild form; proceed as in SP1.
(p.213) Now suppose that our investigator obtains the following result:
e: In a group of 1000 persons with symptoms S taking drug D, 950 persons had
relief of S; in a control group of 1000 S-sufferers taking a placebo instead of D, none
had symptoms S relieved.
If result e was obtained by following SP2, then e, although true, would not be
particularly good evidence that h, certainly not as strong as that obtained by following
SP1. The reason, of course, is that SP1, by contrast with SP2, gives a sample that is varied
with respect to factors that may well be relevant: age, sex, and severity of symptoms.
(Hypothesis h does not restrict itself to 5-year-old girls with mild symptoms, but asserts a
cure rate for the general population of sufferers with varying degrees of the symptoms in
question.)
This means that if the result described in e is obtained, then whether that result so
described constitutes evidence that h, and how strong that evidence is, depends on a
historical fact about e, viz. how in fact e was obtained. If e resulted from following SP1,
then e is pretty strong evidence that h; if e was obtained by following SP2, then e is
pretty weak evidence that h, if it is evidence at all. Just by looking at e and h, and even by
ascertaining that e is true, we are unable to determine to what extent, if any, e, if true,
supports h. We need to invoke history.
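The dependence on the selection procedure can be illustrated with a small numerical sketch. Everything in it is an assumption made for illustration: the two strata, their population shares, the relief rates in the two hypothetical "worlds" (one in which h is true, one in which D works well only on 5-year-old girls with mild symptoms), and the function name `expected_relief_rate`. It compares expected sample results only; it is not meant as a statement of the account of potential and veridical evidence defended in this book.

```python
# Hypothetical strata of S-sufferers and each stratum's share of the population.
POPULATION_SHARES = {
    "girls aged 5, mild symptoms": 0.05,
    "all other sufferers": 0.95,
}

# Relief rate of drug D per stratum in two hypothetical worlds (assumed numbers).
WORLD_H_TRUE = {"girls aged 5, mild symptoms": 0.95, "all other sufferers": 0.95}
WORLD_H_FALSE = {"girls aged 5, mild symptoms": 0.95, "all other sufferers": 0.40}

# Sampling weights: how heavily each selection procedure draws from each stratum.
SP1_WEIGHTS = dict(POPULATION_SHARES)  # varied sample, proportional to the population
SP2_WEIGHTS = {"girls aged 5, mild symptoms": 1.0, "all other sufferers": 0.0}


def expected_relief_rate(world, sampling_weights):
    """Expected fraction of the treated group that gets relief under a procedure."""
    total = sum(sampling_weights.values())
    return sum(
        (weight / total) * world[stratum]
        for stratum, weight in sampling_weights.items()
    )


if __name__ == "__main__":
    for sp_name, weights in (("SP1", SP1_WEIGHTS), ("SP2", SP2_WEIGHTS)):
        for world_name, world in (("h true", WORLD_H_TRUE), ("h false", WORLD_H_FALSE)):
            rate = expected_relief_rate(world, weights)
            print(f"{sp_name}, {world_name}: expected relief rate {rate:.4f}")
```

On these assumed numbers, a 95% relief rate is what SP2 leads us to expect whether or not h is true, so obtaining it by SP2 does little to discriminate between the two worlds. Under SP1 the expected rates differ sharply, which is why the same result e carries more weight when obtained that way.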
Here is a third selection procedure:
SP3: Choose a sample of 2000 persons all of whom have S in varying degrees;
divide them arbitrarily into two groups; give one group drugs D and D′ (where D′
relieves symptoms S in 95% of the cases and blocks possible curative effects of D
when taken together); give the other group a placebo.
If this was used to generate e, then result e fails to provide any support for h. And, again,
that this is so can be ascertained only by learning a historical fact about e, viz. that what e
(truly) reports was obtained by following SP3.10
So far then we seem to have support for the historical thesis about evidence. Can we
generalize from examples like this to all cases? Can we say that for any e, and any
hypothesis h, whether, or to what extent, e, if true, is evidence that h depends upon
historical facts about how e was obtained?
Consider another type of case, in which the aim is to test the hypothesis
h: John will win the lottery.
At the present time suppose that this can be done only indirectly by obtaining
information about who bought tickets and how many. Two investigators proceed to obtain
information of this sort, each one following a different selection procedure: (p.214)

SP4: Determine who bought tickets, and how many, by asking lottery officials.
SP5: Determine this by consulting the local newspaper, which publishes this
information as a service to its readers.
These results are obtained:
e1: Investigator 1 reports that in the lottery 1000 tickets were sold, of which John
owns 999, and no further tickets will be sold.
e2: Investigator 2 reports the same thing.
Whether e1 or e2 or both are evidence that h depends on what selection procedure
was employed by the investigators in obtaining the information in their reports. Suppose
that investigator 1 followed SP4, while investigator 2 followed SP5. And suppose that
newspapers, by contrast to lottery officials, are usually unreliable in such reports. Then,
although e1, if true, is strong evidence that h, e2 is not. In any case, to determine
whether, or the extent to which, e1 or e2 supports h we again need to determine a
historical fact concerning how the ticket information reported in e1 and e2 was obtained.
Now, however, let us distill the information reported in e1 and e2 into the following:
e: In the lottery 1000 tickets were sold, of which John owns 999, and no further
tickets will be sold.
In this case, whether e, if true, is evidence that h, and how strong it is, does not depend
on historical facts concerning how information reported in e was obtained, or concerning
when or how e, h, or their relationship came to be formulated, believed, or known. To be
sure, whether, or the extent to which, e, if true, supports h does depend upon other
historical facts, such as whether the lottery is fair, whether certain conditions will
interfere, and so forth. But these are not historical facts of the kind relevant for the
historical thesis of evidence. In this case, unlike the drug case, in order for e, if true, to
be evidence that h, it is irrelevant how or when the information in e was obtained, or
even whether it was obtained. Accordingly, we have a case that violates the historical
thesis of evidence.
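A minimal sketch of the point, assuming a fair lottery in which every ticket sold is equally likely to be drawn (the fairness assumption and the function name `win_probability` are introduced here for illustration). The probability that John wins is computed from the ticket counts stated in e alone; no argument records who reported the counts or by what procedure.

```python
from fractions import Fraction


def win_probability(tickets_owned, tickets_sold):
    """Chance of winning a fair lottery, given only the ticket counts stated in e."""
    if tickets_sold <= 0 or not 0 <= tickets_owned <= tickets_sold:
        raise ValueError("ticket counts must satisfy 0 <= owned <= sold and sold > 0")
    return Fraction(tickets_owned, tickets_sold)


if __name__ == "__main__":
    # John owns 999 of the 1000 tickets sold.
    print(win_probability(999, 1000))
```

High probability is used here only to make the independence from selection procedures vivid; whether the lottery is in fact fair remains an empirical matter, as noted above.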
Since examples similar to each of the above can be readily constructed, we may conclude
that there are cases that satisfy the historical thesis of evidence, and others that fail to
satisfy it. With respect to a hypothesis h, if e speaks of observations or tests made that
yield certain results, then whether e if true is evidence that h, and if so, how strong that
evidence is, depends on what selection procedure was employed in making these
observations or tests. That is a historical fact of the kind that conforms to the historical
thesis of evidence. However, there are many cases, such as the last one noted, in which e
does not speak of observations or tests, but of certain facts that are described
independently of observations or tests. Such cases do not conform to the historical thesis.
What implications, if any, does this hold for whether predictions or explanations provide
stronger evidence?

(p.215) 3. Predictions Versus Explanations
Let us return to the original question proposed by Brush. Do novel facts predicted by a
theory provide stronger evidence for that theory than known facts explained by the
theory, as Whewell and Popper claim? Or is the reverse true?
A preliminary point is worth making. As should be obvious from discussions in earlier
chapters, the mere fact that some theory or hypothesis h was (or indeed can be) used
successfully to predict some novel e does not suffice to make e evidence that h. Nor
does the fact that some theory or hypothesis h was (or can be) used to explain some
known fact e suffice to make e evidence that h. Suppose there is a lottery whose 1 million
tickets go on sale on Tuesday; only one ticket can be purchased by any given person; and
the drawing will be made on Friday. My hypothesis is that you will win the lottery.
Suppose I use this hypothesis to predict that you will buy 1 ticket. The fact that my
prediction turned out to be correct is not evidence that you will win the lottery. Suppose
that on Sunday it comes to be known that you are in a good mood. And suppose I use the
hypothesis that you won the lottery to explain this known fact. This would not suffice to
make the fact that you are in a good mood evidence that you won the lottery. It is readily
shown that on the account of (potential and veridical) evidence I propose, neither the fact
that e was (or can be) correctly predicted from a hypothesis h, nor the fact that h was (or
can be) used to explain a known fact e, suffices to make e evidence that h.
Accordingly, what the predictionist and explanationist may want to say is this.
Whether e is a novel fact correctly predicted from a hypothesis h, or whether e is a
known fact that h was invoked to explain, is relevant for the question of whether e is
evidence that h, and particularly for the question of how strong that evidence is. On the
predictionist view, a novel fact correctly predicted by a hypothesis provides stronger
evidence than does an already known fact explained by the hypothesis. For an
explanationist it is the reverse. Which view is correct?
My answer is this: Neither one. Sometimes a prediction provides better evidence for a
hypothesis, sometimes an explanation does, and sometimes they are equally good. Which
obtains has nothing to do with the fact that it is a prediction of novel facts or that it is an
explanation of known ones.
To show this, I will begin with a case that violates the historical thesis of evidence. Here it
should be easy to show that whether the putative evidence is known before or after the
hypothesis is formulated is irrelevant for whether it is evidence that h or how strong that
evidence is. Let
h: This coin is fair, that is, if tossed in random ways under normal conditions it will
land heads approximately half the time in the long run.
e: This coin is physically symmetrical, and in a series of 1000 random tosses
under normal conditions it landed heads approximately half the time.
Let us suppose that e is empirically complete with respect to h (that is, whether e is
evidence that h, and how strong that evidence is, does not depend on empirical (p.216)
facts other than e).11 In particular, whether e is evidence that h does not depend on
when, how, or even whether, e comes to be known, or on whether e was known first and
h then formulated, or on whether h was conceived first and e then stated as a prediction
from it. Putative evidence e supports hypothesis h and does so (equally well) whether or
not e is known before or after h was initially formulated, indeed whether or not h was
ever formulated or e is ever known to be true or any selection procedure was used to
obtain e.
The same holds in a case that is analogous except that e is not empirically complete with
respect to h. Let
e: In the lottery 1000 tickets were sold, of which John owned 999 at the time of
the drawing.
h: John won.
Whether e is evidence that h, and if so how strong it is, depends on the truth of certain
facts other than e, such as whether the lottery was fair. But it does not depend on
whether e was known before or after h was formulated, or on whether e was ever
known, or on what selection procedure, if any, was employed to obtain e.
So let us focus instead on cases that satisfy the historical thesis of evidence. We might
suppose that at least in such cases explanations (or predictions) are always better for
evidence. Return once again to our drug hypothesis
h: Drug D relieves symptoms S in approximately 95% of the cases.
Consider two evidence claims, the first a prediction about an unknown future event, the
second a report about something already known:
e1: In the next clinical trial of 1000 patients who suffer from symptoms S and who
take D approximately 950 will get some relief.
e2: In a trial that has already taken place involving 1000 patients with S who took
D (we know that) approximately 950 got some relief.
On the prediction view, e1 (if true) is stronger evidence for h than is e2. On the
explanation view, it is the reverse. And to sharpen the case, let us suppose that e2, by
contrast to e1, was not only known to be true prior to the formulation of h, but that h
was formulated with the intention of explaining e2. Which view is correct? Neither one.
Let us take the prediction case e1 first. Whether e1 if true is evidence that h, and how
strong it is, depends on the selection procedure to be used in the next clinical trial.
Suppose this selection procedure calls for choosing just 5-year-old girls with very mild
symptoms who in addition to D are also taking drug D′, which ameliorates symptoms S in
95% of the cases and potentially blocks D from doing so. Then e1 would be very weak
evidence that h is true, if it is evidence at all. This is so, despite the fact that e1 is a
correct prediction from h, one not used in generating h in the first place. By contrast,
suppose that the selection procedure used in the past trial mentioned in e2 is much
better with respect to h. For example, it calls for choosing humans of both sexes, of
different ages, with (p.217) symptoms of varying degrees, who are not also taking drug
D′ or any other that can prevent D from working. Then e2 would be evidence that h,
indeed much stronger evidence than e1. In such a case, a known fact explained by h
would provide more support for h than a newly predicted fact would.
Obviously the situations here can be reversed. We might suppose that the selection
procedure used to generate the prediction of e1 is the one cited in the previous
paragraph as being used to generate e2 (and vice versa). In this situation a newly
predicted fact would provide more support for h than an already explained one.
In these cases whether some fact is evidence, or how strong that evidence is, has nothing
to do with whether it is being explained or predicted. It has to do with the selection
procedure used to generate that evidence. In one situation (whether it involves
something that is explained or predicted) we have a putative evidence statement
generated by a selection procedure that is a good one relative to h; in the other case we
have a flawed selection procedure. This is what matters for evidence, not whether the
putative evidence is being explained or predicted.

4. A Response: Maher's Account


A response of the predictionist and explanationist will now be considered in this
section and the two that follow. It involves formulating the information concerning
whether e is a prediction or an explanation as part of the evidence statement itself or at
least as part of the background information.12
In the example above, let
e: In a clinical trial of 1000 patients who suffer from symptoms S and who take
drug D, approximately 95% got some relief
b1: e is a prediction from h
b2: h was devised to explain e
h: Drug D relieves symptoms S in approximately 95% of the cases.
According to the predictionist,
e&b1 is stronger evidence that h than is e&b2 (or e is stronger evidence that h
given b1 than it is given b2).
According to the explanationist,
e&b2 is stronger evidence that h than is e&b1 (or e is stronger evidence that h
given b2 than it is given b1).
Patrick Maher offers a predictionist response of this sort, which he formulates as
follows.13 Suppose that a hypothesis h is generated on some occasion by (p.218) a
method M. Let Mh be that M generated a hypothesis that entails h. Let O be that (as
Maher puts it) e was available to M when it generated h. Then, according to Maher, the
predictionist thesis is this:

(PT) p(h/Mh&e&¬O) > p(h/Mh&e&O).14
The probability on the left is conditional on the assumption that e was not used to
generate h. (This is the prediction case.) The probability on the right is conditional on
the assumption that e was used to generate h (which Maher calls accommodation).
Maher does not claim that (PT) holds universally. But he does prove a theorem showing
that it holds under certain conditions, which, he claims, usually obtain in science.15
Before discussing any of these conditions, I will give an example Maher himself offers, and
then show how he would deal with the drug case above. Maher's example involves coin
tossing and it purports to show that prediction provides stronger evidence than
accommodation.16 He imagines two cases, as follows:
(a) Accommodation. A fair coin is randomly tossed 99 times by an experimenter, and the
outcome of each toss is recorded in a sentence e. The hypothesis h is a conjunction of e
with the proposition (h′) that the coin will land heads on the 100th toss. Let b contain the
information that the results of the 99 tosses were first recorded by the experimenter,
following which the experimenter formulated the conjunctive hypothesis h to
accommodate e. (This is the experimenter's method M of generating hypothesis h.)
According to Maher,
(1) p(h/e&b) = p(h′/e&b) = 1/2.
That is, given that on each of the first 99 tosses the coin landed in the manner described
by e and given the accommodation described in b, the probability that the coin will land
heads on the 100th toss and that in the first 99 tosses it landed in the manner described
by e is just the probability that it will land heads on the 100th toss, given that it landed the
way it did on the first 99 tosses and given accommodation. Assuming the coin is fair and
tossed randomly, the latter probability should be 1/2.
(b) Prediction. Propositions e, h, and h′ are as above, but b is changed. Instead of first
tossing the coin 99 times and recording the results, the experimenter predicts the
results of the first 99 tosses, viz. e, and also predicts that the 100th toss will yield heads.
(Call this conjunctive fact b(e).)17 Then the coin is tossed 99 times and the results e are
exactly as predicted. Now, according to Maher,
(2) p(h/e&b(e)) = p(h′/e&b(e)) = approximately 1.
(p.219) Let CPi be that the experimenter correctly predicts the result of the ith toss.
Where h′ is that the 100th toss will be heads, let P(h′) be that the experimenter predicts
that the 100th toss will be heads. Then e&b(e) is equivalent to CP1 . . . CP99&P(h′). So
from (2) we have
(3) p(h′/CP1 . . . CP99&P(h′)) = approximately 1.
In this case, given that the experimenter successfully predicts the results of the first 99
tosses, viz. e, and predicts that the 100th toss will be heads, the probability that his
prediction (h′) of heads on the 100th toss will be correct is close to 1. The only difference
between (1) and (3) is that in (1) the data recorded in e are accommodated by the
experimenter using the conjunctive hypothesis h (= h′&e), while in (3) the data recorded
in e are correctly predicted by the experimenter in advance of the experiment.
Maher attributes the difference in probabilities to the fact that in the successful
prediction case, but not the accommodation case, there is a strong reason to suppose
that the experimenter has a reliable method of predicting coin tosses. This is the basis for
his claim that predictions provide stronger evidence than accommodations.
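The reasoning Maher appeals to can be sketched with a toy Bayesian calculation. The sketch rests on assumptions that are not Maher's own formulation: the experimenter is either a "reliable predictor" (each prediction is correct with probability 1) or a mere guesser (each prediction is correct with probability 1/2), and the prior probability of reliability is set, arbitrarily, at one in a billion. The function names are hypothetical.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def posterior_reliable(prior_reliable, n_correct,
                       p_correct_if_reliable=Fraction(1),
                       p_correct_if_guessing=HALF):
    """Bayes' theorem: probability the predictor is reliable after n correct predictions."""
    reliable_term = prior_reliable * p_correct_if_reliable ** n_correct
    guessing_term = (1 - prior_reliable) * p_correct_if_guessing ** n_correct
    return reliable_term / (reliable_term + guessing_term)


def prob_next_prediction_correct(prior_reliable, n_correct,
                                 p_correct_if_reliable=Fraction(1),
                                 p_correct_if_guessing=HALF):
    """Probability that the next prediction is correct, averaging over both possibilities."""
    post = posterior_reliable(prior_reliable, n_correct,
                              p_correct_if_reliable, p_correct_if_guessing)
    return post * p_correct_if_reliable + (1 - post) * p_correct_if_guessing


if __name__ == "__main__":
    prior = Fraction(1, 10**9)  # assumed, deliberately tiny prior of reliability
    # Prediction case: 99 correct predictions in a row.
    print(float(prob_next_prediction_correct(prior, 99)))
    # Accommodation case: the 99 results were recorded, not predicted, so nothing
    # bears on the experimenter's reliability and a fair coin gives 1/2.
    print(float(HALF))
```

Even with a very small prior, 99 successful predictions push the probability of reliability, and hence of a correct prediction on the 100th toss, close to 1. In the accommodation case no predictions were made, so nothing raises the probability of heads on the 100th toss above 1/2.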

5. Maher and the Drug Case


Returning now to my drug case, here is how Maher's account would proceed. (In what
follows I alter the example to conform to this account.) Suppose there are 11 clinical trials,
each to determine the effectiveness of drug D in relieving symptoms S. Let
e1: In the first 10 clinical trials involving 1000 S-sufferers who took drug D, 95%
got relief in 9 out of the 10 trials. In one of the ten trials, 75% got relief.
e2: In the 11th trial involving 1000 S-sufferers who took drug D, 95% got relief.
M: If the relative frequency of 95% success in ten clinical trials involving 1000
S-sufferers is .9, infer the same 95% success rate in another trial involving 1000
S-sufferers who took D.
Me2: M was used to generate a hypothesis that entails e2.
Mh: M was used to generate a hypothesis that entails h.
Oe(i): ei was available to M when M generated ei.
Oh: ei was available to M when M generated h.
h: e1&e2.
Now, according to the predictionist thesis,
(PT) p(h/Mh&e1&¬Oh) > p(h/Mh&e1&Oh)
The probability on the left represents the prediction case, the one on the right
represents accommodation. That is, if e1 is a prediction, then h has a higher probability
than if e1 is accommodated (that is, if e1 was available to M when M generated h).
(p.220) In the prediction case, Maher claims, the fact that e1, which is entailed by h,
was correctly predicted (indicated by e1&¬Oh) boosts the reliability of the method M
used to generate h. This in turn boosts the probability of h. By contrast, in the
accommodation case, the fact that e1 was accommodated (indicated by e1&Oh) does
not boost the reliability of the method M used to generate h, or does not boost it as
much as the prediction case does. Hence in this case h's probability is not as high as it is
in the prediction case.
One of the central assumptions of Maher's general theorem concerns the lack of such a
boost in the case of accommodation. Maher speaks of the reliability R of a method for
generating a hypothesis. This is the (objective) probability that the hypothesis generated
by this method will be correct. Let the expected value of the reliability of a method M for
generating a hypothesis h be denoted by E(R_M(h)).18 Consider M above, a method that
will generate e_2. Maher makes an assumption for his theorem which has the following
consequence [p. 336, see (22)]:
(1) E(R_M(e_2)/M_e2 & e_2 & O_e2 & e_1) = E(R_M(e_2)/e_1).
He assumes that if e_2 is simply accommodated (but not predicted), then the expected
value of the reliability of method M with respect to e_2 is not increased.
Now in the drug case above, this assumption does not hold. According to e_2, the result
in the 11th trial satisfies M. This should boost the expected value of the probability that a
hypothesis generated by M is correct, whether the results in the 11th trial were
accommodated or predicted.
Maher's response is to agree that in this sort of case the predictionist thesis (PT) does
not hold.19 In such a case, he says, it is certain what the method M would predict, that is,
(2) p(M_e2/e_2 & O_e2 & e_1) = p(M_e2/e_2 & ¬O_e2 & e_1) = 1.
This violates an assumption for his theorem from which (1) above follows. Although he
agrees that (PT) does not hold when (2) is true, he claims that in the usual scientific cases
(2) is false. Maher writes:
Only in very special cases can we predict with certainty what hypotheses scientists
will generate.20
In response, note first that (2) is concerned not with the probability that a scientist will
generate the hypothesis e_2 but with the probability that method M will generate e_2. I
agree that the former probability is not 1, but it is the latter probability that is of concern
in (2). So is Maher claiming that in typical cases of prediction, where a scientist uses a
method for generating a predictive hypothesis h, the probability that the method
generates h is less than 1? The answer seems to be yes. The only justification Maher
offers for this answer is that typically in predictive (p.221) cases the method being used
is not understood well enough to allow it to be certain that the method generates the
prediction in question.
As mentioned in note 14, Maher employs a subjective interpretation of probability in his
predictionist thesis (PT). So his present claim would be that a typical scientist who makes a
prediction using a method M does not understand M well enough to be completely sure
that M yields that prediction.21 The scope and interest of Maher's predictionist thesis
(PT), then, depends on how typical it is for scientists using a method to predict the truth
of a hypothesis h not to understand the method well enough to know for sure whether h
is actually predicted by that method. To say the least, this is very different from the usual
predictionist idea. More importantly, it is one whose wide applicability seems
questionable. Frequently, a scientist will be in a situation in which the method of
prediction consists in deriving the prediction deductively from a theory or hypothesis,
where the scientist knows whether or not the derivation is correct (without necessarily
knowing whether what is derived is true).

6. Balmer's Formula
It may be useful to cite a simple historical example, viz. Balmer's formula, that is similar in
certain respects to one employed by Maher.22 When light from hydrogen is analyzed
using a spectroscope, it is seen to consist in a series of sharp lines of definite wavelengths.
In 1885, Johann Jakob Balmer introduced a general formula that entailed the wavelengths
of the four lines known by him at that time. The formula can be represented as follows:

1/λ = R(1/4 - 1/n^2)
where λ is the wavelength of a given line, R is a constant, and n = 3, 4, 5, 6 for the four
lines. Balmer does not claim to be explaining why the lines occur or have the wavelengths
they do, but simply to be "represent[ing] the wavelengths of the different lines in a
satisfactory manner."23 This seems to be a case satisfying Maher's notion of
accommodation.
(p.222) Now Balmer indicates that he used his formula to obtain the wavelength of a
fifth line (letting n = 7). He says he knew nothing of such a line when he performed the
calculation, and was later informed that it exists and satisfies the formula. So the fifth line
was, from his standpoint, a prediction that turned out to be correct. Moreover, he
reports being informed that many more hydrogen lines are known, which have been
measured by Vogel and Huggins in the violet and ultraviolet parts of the hydrogen
spectrum and the spectrum of the white stars (p. 362). What impresses Balmer,
however, is not the fact that he has made a successful prediction, but simply the fact that
all of the lines, whether accommodated or predicted, satisfy his formula. He writes:
From these comparisons it appears that the formula holds also for the fifth
hydrogen line. . . . It further appears that Vogel's hydrogen lines and the
corresponding Huggins lines of the white stars can be presented by the formula
very satisfactorily. We may almost certainly assume that the other lines of the white
stars which Huggins found further on in the ultraviolet part of the spectrum will be
expressed by the formula. (p. 362)
As far as Balmer is concerned, it is the fact that the various lines, whether first known and
later accommodated or first predicted and later known, all satisfy his formula that
provides strong evidence for the last claim in the above passage.
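The arithmetic behind the formula can be checked directly. What follows is a minimal sketch in Python, not something Balmer or the text supplies: the function name `balmer_wavelength_nm` is hypothetical, the formula is taken in the form 1/λ = R(1/4 - 1/n^2), and the numerical value used for the constant R (roughly 1.09678 × 10^7 per metre, the Rydberg constant for hydrogen) is an assumption brought in from standard reference tables.

```python
# ASSUMPTION: value of the constant R for hydrogen, in 1/metre (not given in the text).
RYDBERG_HYDROGEN_PER_M = 1.09678e7


def balmer_wavelength_nm(n: int) -> float:
    """Wavelength in nanometres of the hydrogen line with index n (n >= 3)."""
    if n < 3:
        raise ValueError("the formula is applied here only for n = 3, 4, 5, ...")
    inverse_wavelength = RYDBERG_HYDROGEN_PER_M * (1 / 4 - 1 / n**2)
    return 1e9 / inverse_wavelength  # convert metres to nanometres


# n = 3..6 are the four lines Balmer accommodated; n = 7 is the fifth line he calculated.
lines = {n: balmer_wavelength_nm(n) for n in range(3, 8)}
for n, wavelength in lines.items():
    label = "accommodated" if n <= 6 else "predicted"
    print(f"n = {n}: {wavelength:.1f} nm ({label})")
```

The calculation is identical for every n. Nothing in it records whether a given line was known before the formula was written down; the "accommodated" and "predicted" labels are attached only for display.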
Let B(i) mean that line i satisfies Balmer's formula. Balmer's claim is that
(3) p(B(5)/B(1) . . . B(4)) is very high.
This has nothing to do with accommodation or prediction. However, let the method for
generating hypotheses of the form B(i) be as follows:
M: Use Balmer's formula to obtain B(i).
The probability in which Maher is interested for accommodation is
(4) p(h/M&e&O),
that for prediction is
(5) p(h/M&e&¬O),
where h is B(5)&e, e is B(1) . . . B(4), and O says that e was available to M when it
generated h.
If Maher's predictionist thesis (PT) holds for this case, then the probability in (5) should
be greater than that in (4). But, I submit, they are equal. The relevant consideration is
that in both cases lines 1 through 4 satisfy Balmer's formula, and that the hypothesis of
interest concerning line 5 was generated using the Balmer formula. The truth-value of
claims B(1) . . . B(4) is relevant, but it is irrelevant whether or not the truth-values of
these claims were available to method M when it generated B(5).
To be sure, in such a case Maher will say that Balmer's method M is such that whether
B(i) is a prediction using M is known for certain.24 But I submit, it is just (p.223) these
kinds of predictions, viz. deductions from hypotheses and/or applications of quantitative
formulas to new cases, that are frequently meant by those who defend a predictionist
thesis. Moreover, I submit, they are not atypical in science.

7. Brush Redux
Brush is clearly denying a general predictionist thesis. By contrast, he cites cases in which
scientists themselves regarded known evidence explained by a theory as stronger
support for that theory than new evidence that was successfully predicted. And he
seems to imply that this was reasonable. He offers an explanation for this claim, viz. that
with explanations of the known phenomena, by contrast with successful predictions of the
new ones, scientists had time to consider alternative theories that would generate these
phenomena. Now, even if Brush does not do so, I want to extend this idea and consider a
more general explanationist view that is committed to the following three theses that
Brush invokes for some cases:
(1) A selection procedure for testing a hypothesis h is flawed, or at least inferior
to another, other things being equal, if it fails to call for explicit consideration of
competitors to h.
(2) The longer time scientists have to consider whether there are plausible
competitors to h the more likely they are to find some if they exist.
(3) With putative evidence already known before the formulation of h scientists
have (had) more time to consider whether there are plausible competitors to h
than is the case with novel predictions.
I would challenge at least the first and third theses. In the drug example of section 2,
selection procedure SP_1 for the drug hypothesis does not call for explicitly considering
competitors to that hypothesis. Yet it does not seem flawed on that account, or inferior to
one that does. However, even supposing it were inferior, whether or not a selection
procedure calls for a consideration of competitors is completely irrelevant to whether the
putative evidence claim is a prediction or a known fact being explained. In the case of a
prediction, no less than that of an explanation, the selection procedure may call for a
consideration of competitors.
For example, in our drug case, where h is "Drug D relieves symptoms S in approximately
95% of the cases," and e is the prediction "In the next clinical trial of 1000 patients
suffering from symptoms S who take D, approximately 950 will (p.224) get some relief,"
the selection procedure to be used for the next clinical trial might include the rule
In conducting this next trial, determine whether the patients are also taking some
other drug which relieves S in approximately 95% of the cases and which blocks
any effectiveness D might have.
Such a selection procedure calls for the explicit consideration of a competitor to explain e,
viz. that it will be some other drug, not D, that will relieve symptoms S in the next trial.
This is so even though e is a prediction. Moreover, to respond to the third thesis about
time for considering competitors, an investigator planning a future trial can have as much
time as she likes to develop a selection procedure calling for a consideration of a
competing hypothesis. More generally, in designing a novel experiment to test some
hypothesis h, as much time may be spent in precluding competing hypotheses that will
explain the test results as is spent in considering competing hypotheses for old data.

8. Thomson Versus Hertz: The Wave Theory of Light


Let me invoke two scientific examples employed earlier. The first involves the dispute
between Heinrich Hertz and J. J. Thomson over the nature of cathode rays, discussed in
chapter 2 above. Recall that in 1883 Hertz observed that the cathode rays in his
experiments were not deflected by an electric field. He took this to be strong evidence
that cathode rays are not charged particles but some type of ether waves. In 1897
Thomson repeated Hertz's experiments but with a much higher evacuation of gas in the
cathode tube than Hertz had been able to obtain. Thomson believed that when cathode
rays pass through a gas they make it a conductor, which screens off the electric force
from the charged particles comprising the cathode rays. This screening off effect will be
reduced if the gas in the tube is more thoroughly evacuated. In Thomson's 1897
experiments electrical deflection of the cathode rays was detected, which Thomson took
to be strong evidence that cathode rays are charged particles.
I want to consider the evidential report of Hertz in 1883, not of Thomson in 1897. Let
1. e = In Hertz's cathode ray experiments of 1883, no electrical deflection of
cathode rays was detected.
2. h = Cathode rays are not electrically charged.
Hertz considered e to be strong (veridical) evidence that h. In 1897 Thomson claimed
that Hertz's results as reported in e did not provide strong evidence that h, since
Hertz's experimental setup was flawed: He was employing insufficiently evacuated
tubes. To use my previous terminology, Thomson was claiming that Hertz's selection
procedure for testing h was inadequate, and hence that e is not (potential or veridical)
evidence that h.
Here we can pick up on a point emphasized by Brush. Hertz, we might say, failed to use
a selection procedure calling for considering a competitor to h to explain his results (viz.
that cathode rays are charged particles, but that the tubes (p.225) Hertz was using
were not sufficiently evacuated to allow an electrical force to act on these particles). But
(and this is the point I want to emphasize) in determining whether, or to what extent, e is
(potential or veridical) evidence that h, it is irrelevant whether Hertz's e was a novel
prediction from an already formulated hypothesis h or an already known fact to be
explained by h. Hertz writes that in performing the relevant experiments he was trying to
answer two questions:
Firstly: Do the cathode rays give rise to electrostatic forces in their
neighbourhood? Secondly: In their course are they affected by external
electrostatic forces?25
In his paper he did not predict what his experiments would show. Nor were the results
of his experiments treated by him as facts known before he had formulated his
hypothesis h. Once he obtained his experimental results he then claimed that they
supported his theory:
As far as the accuracy of the experiment allows, we can conclude with certainty that
no electrostatic effect due to the cathode rays can be perceived. (p. 251)
To be sure, we might say that Hertz's theory itself predicted some such results, even if
Hertz himself did not (that is, even if Hertz did not himself draw his conclusion before
getting his experimental results). But even if we speak this way, Hertz did not claim or
imply that his experimental results provide better (or weaker) evidence for his theory
because the theory predicted them before they were obtained. Nor did Thomson in his
criticism of Hertz allude to one or the other possibility. Whichever it was, whether a
prediction or an explanation or neither, Hertz (Thomson was claiming) should have used
a better selection procedure. This is what is criticizable in Hertz, not whether he was
predicting a novel fact or explaining a known one.
The second example involves the nineteenth century argument for the wave theory of
light given in chapter 7, section 7. The argument began with two claims: (i) that light
travels from one point to another with a finite velocity; (ii) that in other known cases, such
as sound waves, water waves, and projectiles, modes of transfer involving finite velocities
are via classical waves in a medium or classical particles. From (i) and (ii), it is concluded
that (h) light is either a wave motion in a medium or a stream of particles. In probabilistic
terms,
(1) p(h/(i)&(ii)) is close to 1.
Indeed, it might be claimed that
(2) p(there is an explanatory connection between h and (i)/(i)&(ii)) > 1/2,
so that (since (i) and (ii) were deemed true in the nineteenth century), (i) constitutes
potential evidence that h, given (ii). The claim that it does constitute evidence that light is
either a classical wave or a classical stream of particles was made by wave theorists as
part of their eliminative argument for the wave theory.
Now, what I believe can legitimately be said about this case is that although (p.226) (i)
constitutes ES-evidence that h (where the epistemic situation is the one holding for wave
theorists during the first four decades of the nineteenth century), it is not potential
evidence that h. The reason it is not has nothing to do with whether wave theorists
employed h to explain the already known fact that light travels from one point to another
in a finite time or to predict new cases in which this is so. The reason is that the selection
procedure is biased, although this was not known by wave theorists and could not be
known until the twentieth century. The items selected to defend the claim that modes of
transfer involve classical waves and particles, such as water waves, sound waves, and
projectiles, are all items from the macroworld of classical waves and particles; there are
none from the subatomic world subject to the laws of quantum mechanics which preclude
classical waves and particles. Accordingly, (1) and (2) are criticizable, not the fact that
wave theorists were explaining known facts (or predicting new ones).
I end this section with a quote from John Maynard Keynes, whose book on probability
contains lots of insights. Here is one:
The peculiar virtue of prediction or predesignation is altogether imaginary. The
number of instances examined and the analogy between them are the essential
points, and the question as to whether a particular hypothesis happens to be
propounded before or after their examination is quite irrelevant.26

9. The Problem of Old Evidence


Years ago Clark Glymour raised a fundamental objection to the popular positive
relevance definition of evidence.27 Suppose that the probability of e is 1. If it is, then,
assuming that the probability of h is greater than 0, p(e/h) = 1. Now, according to Bayes'
theorem,
(1) p(h/e) = p(h) × p(e/h)/p(e).
Therefore, if p(e) = 1, then from (1) we obtain
(2) p(h/e) = p(h).
That is, the probability of h, given e, is the same as its prior probability. Now on the
positive relevance definition,
(3) e is evidence that h if and only if p(h/e) > p(h)
So if p(e) = 1 and p(h) > 0, then from (2) and (3), e cannot be evidence that h.
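The step from (1) to (2) can be illustrated numerically. The following is a minimal sketch in Python under stated assumptions: the function names `posterior` and `is_positive_relevance_evidence` are hypothetical, and every probability value fed to them is an invented number chosen only for illustration.

```python
def posterior(prior_h: float, likelihood_e_given_h: float, prob_e: float) -> float:
    """Bayes' theorem as in (1): p(h/e) = p(h) * p(e/h) / p(e)."""
    if prob_e <= 0:
        raise ValueError("p(h/e) is undefined when p(e) = 0")
    return prior_h * likelihood_e_given_h / prob_e


def is_positive_relevance_evidence(prior_h: float,
                                   likelihood_e_given_h: float,
                                   prob_e: float) -> bool:
    """Definition (3): e is evidence that h if and only if p(h/e) > p(h)."""
    return posterior(prior_h, likelihood_e_given_h, prob_e) > prior_h


# Old evidence: p(e) = 1, hence p(e/h) = 1 for any h with p(h) > 0.
# The three priors are hypothetical; the verdict is the same for each.
old = [is_positive_relevance_evidence(p_h, 1.0, 1.0) for p_h in (0.01, 0.3, 0.9)]

# Contrast case in which e is not certain (all three numbers hypothetical).
new = is_positive_relevance_evidence(0.3, 0.9, 0.5)

print("old evidence counts as evidence:", old)
print("uncertain e counts as evidence:", new)
```

Whatever prior is chosen for h, the old-evidence case returns the prior unchanged, so definition (3) can never be satisfied there.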
Why is this a problem? Suppose, says Glymour, that prior to the introduction of
hypothesis h, e was known with certainty to obtain. Glymour concludes that e's
probability is then 1. (I will return to this claim in a moment.) If e's truth was known prior
to the introduction of h, then e is old evidence with respect to h. It follows from the
positive relevance definition of evidence that old evidence with respect to a hypothesis
cannot be evidence that the hypothesis is true, since (p.227) it cannot increase the
probability of the hypothesis. This strikes many as absurd, since frequently phenomena
considered evidence in favor of a theory were known with certainty to obtain prior to the
formulation of the theory.
In section 3 of this chapter we considered whether predictions of novel facts provide
stronger evidence for a theory than explanations of old ones, or whether the reverse is
true. Predictionists (such as Whewell and Popper) might welcome Glymour's problem by
responding that if e is already known to be true prior to the formulation of h, then it
cannot be (very much) evidence for h, even if it is derivable from h. For them, in order
that e be evidence that h, or at least substantial evidence, it must be a new prediction
and not a fact known prior to the formulation of h.
Now to make this claim is to espouse the historical thesis of evidence formulated in
section 1. To know whether e is a new prediction and not a fact known prior to the
formulation of h one must know historical facts about e and h, viz. whether and when e
is known and whether it is known prior to when h was proposed. In my earlier discussion
I rejected the historical thesis as a universal principle. To be sure, there are cases in
which determining whether, or to what extent, e supports h requires determining the
truth of some historical fact about e, h, or their relationship. But there are also cases
where this is not so at all. To claim that only predictions provide (substantial) evidence for
a hypothesis is to espouse a mistaken historical thesis.
Glymour considered the problem of old evidence sufficient to show that one should not
accept (3), the standard positive relevance definition of evidence. Since in chapter 4 I
have given many other reasons to reject this definition, what I want to focus on here is
not the question of whether Glymour's objection is devastating to the positive relevance
account but on what one is to say about cases in which the probability of the putative
evidence is 1.
However, before doing so I return to a central claim of Glymour's argument, viz. that if e
is known to be true (whether or not this was prior to the formulation of h), then p(e) = 1.
Is this true? It does, indeed, hold for subjective probability. If at a certain time t I know
that e is true, then at t my subjective probability for e is maximal.28 Does it hold for the
concept of epistemic probability that I espouse?
My answer is that it depends on what relativization, if any, is being assumed in the
probability statement. If e is known to be true, then it follows that it is true. And if the
probability in question is relativized to e, or to the fact that e is known to be true, then, to
be sure, the probability of e is 1 (p_e(e) = 1). But that is trivial and uninteresting. With
other less trivial, more interesting, relativizations the probability is not 1, even though it
is known that e is true. For example, suppose e is as follows:
e: This coin, which was tossed 100 times on January 1, 2000, landed heads 95% of
the time.
(p.228) Suppose some person knows that e is true. We may want to consider the
probability of e disregarding this knowledge, that is, p_d(Ke)(e), in which d(Ke) means
"disregard the fact that e is known to be true." In this case the probability may be less
than 1, even though it is known that e is true.29 Accordingly, with epistemic probability,
from the fact that e is known to be true, it does not necessarily follow that the probability
of e is 1. It depends on what is, or is not, being assumed.
I turn now to the question of whether Glymour's problem can be put in a form which
does not introduce the idea that e is a fact (not) known prior to the formulation of h.
Using epistemic probabilities, suppose that p_b(e) = 1. This is not a historical claim. It
does not say that anyone in fact knows or came to know that e is true, or that anyone did
so before or after the formulation of h. All it says is that, in view of b, the degree of
reasonableness of believing e is maximal. Could it be the case that, given b, e is potential
or veridical evidence that h? Or does the fact that p_b(e) = 1 preclude this possibility?
My definitions of potential and veridical evidence do not preclude this possibility. It is
possible for the following conditions for potential evidence to be satisfied:
1. e and b are true
2. e does not entail h
3. p_b(h/e) × p_b(there is an explanatory connection between h and e/h&e) > 1/2,
even if p_b(e) = 1.
To demonstrate this suppose the background information b contains the fact that there is
a lottery consisting of 1000 tickets one of which will be drawn at random, and that John's
wife Mary purchased 950 of these tickets as a present for John, who deposited them in
his safe deposit box with a legal document saying that the tickets belong to him. Let e be
that John owns 950 tickets in this lottery. Let h be that John will win. If e and b are true,
then the above conditions for e's being potential evidence that h are satisfied. Yet p_b(e) =
1, since in this case the background b (which we may suppose contains standard legal
principles of ownership) entails e. Given b, the degree of reasonableness of believing e
(that John owns 950 tickets in this lottery) is maximal. Yet given b, e is potential evidence
that h. And, if h is true, it is veridical evidence as well. The fact that, in the light of the
background information b, it is maximally reasonable to believe e does not at all count
against e's being evidence that h. Indeed, in this case, in the light of b, e is very strong
evidence that h.
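Condition 3 can be checked numerically for the lottery case. The sketch below is only illustrative. It assumes that the threshold in condition 3 is 1/2, and the value given to the probability of an explanatory connection (`p_connection`) is a hypothetical number chosen for the example, since the text does not assign one. The function name `meets_condition_3` is likewise invented for this sketch.

```python
from fractions import Fraction

# ASSUMPTION: the threshold in condition 3 is taken to be 1/2.
THRESHOLD = Fraction(1, 2)


def meets_condition_3(p_h_given_e: Fraction, p_explanatory_connection: Fraction) -> bool:
    """Condition 3: the product of the two probabilities must exceed the threshold."""
    return p_h_given_e * p_explanatory_connection > THRESHOLD


# From the example: John owns 950 of the 1000 tickets and one is drawn at random.
p_win_given_ownership = Fraction(950, 1000)

# ASSUMPTION (hypothetical value): probability that, if John owns the tickets and
# wins, there is an explanatory connection between the two facts.
p_connection = Fraction(99, 100)

lottery_ok = meets_condition_3(p_win_given_ownership, p_connection)
print("condition 3 satisfied in the lottery case:", lottery_ok)
```

Note that p_b(e) does not appear anywhere in the check, so the fact that it equals 1 cannot by itself make the condition fail.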
Now, let us consider what, if anything, happens if it is unreasonable to believe e. If p(e) =
0, then p(h/e) is undefined. So let us consider a case in which e has low, but nonzero,
probability. Predictionists champion such cases, because the lower the probability of
the prediction e, the higher the posterior probability of h on e.
Let the background information b be that a fair coin will be randomly tossed 100 times.
Let e be that the coin will land heads 100 times in a row. Let h be the hypothesis that the
Devil will intervene after each random toss, causing the coin (p.229) to land heads each
time during the first 100 tosses. In this case the hypothesis h entails (the prediction) e.
Assuming b, the probability of e is very low: p_b(e) = (1/2)^100. Yet, given b, it seems
far-fetched to say that e, if true, is potential evidence that h. Indeed, the definition of potential
evidence precludes this, even if e and b are both true, since (given normal background
information) the probability of the devil hypothesis h, even on the assumption of e, is
extremely low. The fact that the coin will land heads 100 times in a row, even if true, is not
potential evidence that the Devil will intervene.
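The numbers in the Devil example can be made concrete. The following Python sketch is offered under explicit assumptions. It uses the standard fact that when h entails e, p(e/h) = 1, so Bayes' theorem reduces to p(h/e) = p(h)/p(e). The prior assigned to the Devil hypothesis (`p_h`) is a purely hypothetical figure, since the text says only that the probability of h is extremely low; and the threshold for high probability is taken here to be 1/2.

```python
from fractions import Fraction

# p_b(e): a fair coin, tossed randomly, lands heads 100 times in a row.
p_e = Fraction(1, 2) ** 100

# ASSUMPTION (hypothetical value): prior probability of the Devil hypothesis given b.
# Because h entails e, coherence requires p(h) <= p(e).
p_h = Fraction(1, 10**40)
assert p_h <= p_e

# h entails e, so p(e/h) = 1 and p(h/e) = p(h) / p(e).
p_h_given_e = p_h / p_e

print("p_b(e)       =", float(p_e))
print("p_b(h/e)     =", float(p_h_given_e))
print("p_b(h/e) > 1/2:", p_h_given_e > Fraction(1, 2))
```

On these assumed figures the very low probability of e does raise the probability of h a great deal relative to its prior, yet the result remains far below the threshold.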
Can we follow the predictionist and say at least this: Where h entails some prediction
e, the lower the probability of e the stronger the evidence that e confers upon h? No, we
cannot. All we can say is that the lower the probability of e, in such a case, the higher the
probability of h on e. But it does not follow from this that the lower the probability of e the
stronger the evidence that e confers upon h, since e may confer no evidence upon h.
Thus, in the previous example, let us change e to
e': The coin will land heads 1000 times in a row.
Let h and b be the same as before. Now we have p_b(e') = (1/2)^1000, which is a much lower
probability than p_b(e) = (1/2)^100. Yet that does not make e' stronger evidence that h than e
is, since, on my conception, e' is not evidence that h. The threshold for high probability
required for evidence has not been reached.
What can be said is this. If e is evidence that h, then the lower the probability of e the
stronger is the evidence that e confers upon h. For example, let b contain the information
that this coin is perfectly symmetrical. Let h be the hypothesis that it will land heads
approximately half the time. Let e_1 be the information that when tossed randomly the
first 100 times it landed heads between 45 and 55 times. Let e_2 be the information that
when tossed randomly the next 1000 times it landed heads exactly 500 times. We might
say that, given b, both e_1 and e_2 count as evidence that h. Now in this case p_b(e_2) <
p_b(e_1), so that, indeed, p_b(h/e_2) > p_b(h/e_1). But in this case also, in the light of b, e_2 is
stronger evidence that h than is e_1.
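The comparison between the probabilities of the two reports can be verified exactly. The sketch below, in Python, rests on a modelling assumption that goes beyond the text: "perfectly symmetrical" is read as a heads probability of exactly 1/2 on each toss, with tosses independent. The function name `prob_heads_between` is hypothetical. The sketch checks only the inequality between the probabilities of the two reports, not the posterior probabilities of h.

```python
from fractions import Fraction
from math import comb


def prob_heads_between(tosses: int, low: int, high: int) -> Fraction:
    """Exact probability that a fair coin lands heads between low and high times, inclusive."""
    favourable = sum(comb(tosses, k) for k in range(low, high + 1))
    return Fraction(favourable, 2**tosses)


p_e1 = prob_heads_between(100, 45, 55)     # 45 to 55 heads in 100 tosses
p_e2 = prob_heads_between(1000, 500, 500)  # exactly 500 heads in 1000 tosses

print(f"probability of the first report:  about {float(p_e1):.4f}")
print(f"probability of the second report: about {float(p_e2):.4f}")
print("second report less probable than the first:", p_e2 < p_e1)
```

Exact fractions are used rather than floating-point arithmetic so that the comparison does not depend on rounding.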
Accordingly, whether e has very high or very low probability does not necessarily affect
whether, or the extent to which, e is evidence that h. Nor is it in general true that if h
predicts e, the lower is e's probability the stronger is e's evidence that h.

10. Conclusions
1. According to the historical thesis of evidence, whether e if true is evidence that h, or
how strong that evidence is, depends on certain historical facts about e, h, or their
relationship (for example, on whether e was known before or after h was formulated).
Although this thesis holds for subjective evidence, it does not hold universally for the
concepts of objective evidence I have introduced. Focusing on potential and veridical
evidence, depending on the particular evidence claim and the selection procedure
employed, in some cases the historical thesis holds, in others it does not.
2. Sometimes a novel fact that is predicted provides better evidence for a
hypothesis than a known fact that is explained. Sometimes the reverse is true. Which
obtains has nothing to do with whether it is a prediction or an explanation, but rather, in
the cases in question, with the selection procedure used to generate the evidence. This is
illustrated in the case of Hertz's claim that the results of his 1883 cathode ray
experiments provide evidence that cathode rays are not electrically charged. It is also
illustrated in the wave theorists' argument that light consists of classical waves or classical
particles.
3. A response of Patrick Maher is examined which involves formulating the information
concerning whether e is a prediction or a fact explained as part of the evidence statement
itself or as part of the background information. This new formulation will not suffice to
establish the superior power of prediction over explanation (or accommodation), or
vice versa, in typical scientific cases of interest to predictionists and explanationists.
4. Glymour raises the problem of old evidence in order to reject the positive relevance
definition. According to that definition, e cannot be evidence that h when e is old
evidence known to be true. Predictionists may welcome this result, since for them, in
order for e to be substantial evidence that h, e must be a new prediction, not a fact
known prior to the formulation of h. To say this is to espouse the historical thesis of
evidence as universal, which I do not. I discuss Glymour's problem in a form that does
not commit one to the historical thesis. If the degree of reasonableness of believing the
putative evidence e is maximal (corresponding to old evidence), can e be potential (or
veridical) evidence that h? I demonstrate that it can be. So there is no comfort here for
the predictionist. (Nor does the fact that the probability of e is maximal, where e is
explainable by derivation from h, guarantee that e is potential evidence that h; so there is
no comfort here for the explanationist either.) A second question is this: If h entails some
prediction e, does the lower the probability of e mean the stronger the evidence that e
confers upon h? The answer again is no. So again there is no comfort for the predictionist.
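
The old-evidence consequence mentioned in conclusion 4 can be spelled out in a few lines. The derivation below is a sketch under two assumptions that are labeled in the comments: the positive relevance definition is taken in its standard form (e is evidence that h if and only if p(h/e) > p(h)), and old evidence known to be true is assigned probability 1.

```latex
% Assumption 1 (standard formulation of positive relevance):
%   e is evidence that h  iff  p(h/e) > p(h).
% Assumption 2: old evidence known to be true has p(e) = 1.
\begin{align*}
p(h/e) &= \frac{p(h \,\&\, e)}{p(e)} && \text{definition of conditional probability} \\
       &= p(h \,\&\, e)              && \text{since } p(e) = 1 \\
       &= p(h)                       && \text{since } p(h \,\&\, \neg e) \le p(\neg e) = 0
\end{align*}
% So p(h/e) > p(h) fails, and on this definition e is not evidence that h.
```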
Notes:
(1.) William Whewell, The Philosophy of the Inductive Sciences (New York: Johnson
reprint, 1967; from the 1847 ed.); Karl Popper, The Logic of Scientific Discovery
(London: Hutchinson, 1959).
(2.) Stephen Brush, "Prediction and Theory Evaluation: The Case of Light Bending,"
Science, 246 (1989), 1124–1129.
(3.) To what extent Brush wants to generalize this explanationist position is a question I
leave for him to answer. There are passages that strongly suggest a more general
position. For example: "There is even some reason to suspect that a successful
explanation of a fact that other theories have already failed to explain satisfactorily (for
example, the Mercury perihelion) is more convincing than the prediction of a new fact, at
least until the competing theories have had their chance (and failed) to explain it" (p.
1127). In what follows, I consider a generalized explanationist thesis.
(4.) e and h here are propositions. As noted in chapter 2, this is the customary practice of
philosophers who speak of e as being evidence that h, even though it is the fact that e is
true that is evidence. The historical thesis, then, concerns the propositions e and h or
their relationship.
(5.) Alan Musgrave, "Logical versus Historical Theories of Confirmation," British Journal
for the Philosophy of Science, 25 (1974), 1–23.
(6.) Rudolf Carnap, Logical Foundations of Probability.
(7.) Carl G. Hempel, Aspects of Scientific Explanation.
(8.) Clark Glymour, Theory and Evidence (Princeton: Princeton University Press, 1980).
(9.) See Laura J. Snyder, "Is Evidence Historical?," in Scientific Methods: Conceptual and
Historical Problems, Peter Achinstein and Laura J. Snyder, eds. (Malabar, Florida:
Krieger, 1994).
(10.) It might be objected that if this is so, then, contrary to what I have been saying, the
concept of evidence involved, potential or veridical evidence, cannot be objective.
Whether e is potential (or veridical) evidence that h in such cases will depend on what
selection procedure was employed, which, in turn, depends on what some person(s)
believed about e and h, viz. upon what was believed about how e was generated. My
reply is that in the cases in question whether e is potential evidence that h depends on
what selection procedure was in fact employed, not on what beliefs its employer(s) may
have had about it.
(11.) Microconditions are being disregarded. See chapter 5.
(12.) See Eric Barnes, "Social Predictivism," Erkenntnis, 45 (1996), 69–89.
(13.) Patrick Maher, "Prediction, Accommodation, and the Logic of Discovery," PSA,
1988, vol. 1, 273–285; "How Prediction Enhances Confirmation," in J.M. Dunn and A.
Gupta, eds., Truth or Consequences (Netherlands: Kluwer Academic Publishers, 1990),
327–343.
(14.) Maher construes p here as the subjective probability for some rational person who
has not yet learned the truth-values of $M_h$, e, and O. See "How Prediction Enhances
Confirmation," p. 327. With suitable relativization, probability in (PT) could also be
understood in an objective epistemic sense of the sort introduced in chapter 5.
(15.) "How Prediction Enhances Confirmation," p. 328.
(16.) "Prediction, Accommodation, and the Logic of Discovery," p. 275. See also his
"Howson and Franklin on Prediction," Philosophy of Science, 60 (1993), 329–340.
(17.) The experimenter is using some method, however random, for generating
predictions. Although Maher does not say so explicitly, I assume he would claim that the
method being used can but need not be given in b(e).
(18.) If a function has a finite number of values, the expected value is the sum of all
products consisting of a given value and the probability of that value. If a function (such
as probability) has infinitely many values, the expected value is defined using an integral.
(19.) Personal correspondence. See his "Howson and Franklin on Prediction," Philosophy
of Science, 60 (June 1993), 329–340; see pp. 339–340. I am indebted to Patrick Maher
for help in trying to get me to express his views accurately in these sections; I hope I
have done so.
(20.) "Howson and Franklin on Prediction," p. 340.
(21.) With an objective concept of probability, of the sort I defend in chapter 5, this
conclusion would not be permitted, unless a special relativization is introduced that
entails disregarding information indicating whether M yields the prediction in question.
(22.) Maher cites predictions from Mendeleyev's periodic table of the elements.
Mendeleyev placed the elements in groups based on atomic weights (rather than on
other bases) and showed that periodic repetitions of properties occur in the groups and
can be used to predict new elements. He did so explicitly claiming that atomic weights
physically determine and can explain properties of the elements and compounds. (See his
"The Relations between the Properties of Elements and their Atomic Weights," reprinted
in Henry A. Boorse and Lloyd Motz, eds., The World of the Atom (New York: Basic
Books, 1966, vol. 1), p. 306.) By contrast, Balmer makes no physical or explanatory claims
regarding his formula. Balmer's formula accommodates the data and is also predictive,
but it is more clearly nonexplanatory than Maher's example. Stephen Brush,
by contrast to Maher, claims that the periodic table was regarded by scientists as
providing stronger explanatory than predictive evidence.
(23.) Johann Jakob Balmer, "The Hydrogen Spectral Series," reprinted in William Francis
Magie, A Source Book in Physics (Cambridge: Harvard University Press, 1965),
360–365; quote on p. 360.
(24.) Maher claims that his Mendeleyev example is one in which his present uncertainty
thesis holds. He writes: "Nobody, without the benefit of hindsight, could be certain that
Mendeleyev would propose the hypothesis he did" ("Howson and Franklin on
Prediction," p. 340). But, again, this is not the probability of concern. The question is
whether, with respect to a prediction Mendeleyev made using his periodic table, he was
in any doubt about whether his table yields that prediction. After constructing his
periodic table based on atomic weights, Mendeleyev writes: ". . . it appears to be certain
when we look at the proposed table, that in some rows the corresponding members are
missing; this appears especially clearly, e.g., for the row of calcium; in which there are
missing the members analogous to sodium and lithium" (p. 310). Also, at the end of his
paper he writes: "The discovery of numerous unknown elements is still to be expected,
for instance, of elements similar to Al and Si having atomic weights from 65–75" (p. 312).
Mendeleyev seems to have no doubt that these predictions are generated by his table.
(25.) Heinrich Hertz, Miscellaneous Papers (London: Macmillan, 1896).
(26.) J.M. Keynes, A Treatise on Probability (London: Macmillan, 1921).
(27.) Clark Glymour, Theory and Evidence, p. 86.
(28.) Although subjectivists accept this, some wish to solve Glymour's problem by
refusing to use one's actual degree of belief in e in determining p(e), using instead some
counterfactual degree of belief, such as one's degree of belief in e on everything one
knows minus e. See Colin Howson and Peter Urbach, Scientific Reasoning, p. 404.
(29.) This epistemic probability is analogous to a subjective one that would be used by
subjectivists offering a counterfactual solution to Glymour's problem mentioned in the
previous footnote.
