What Is Probability?
The notion of "the probability of something" is one of those ideas, like "point" and
"time," that we can't define exactly, but that are useful nonetheless. The following
should give a good working understanding of the concept.
Events
First, some related terminology: The "somethings" that we consider the probabilities of
are usually called events. For example, we may talk about the event that the number
showing on a die we have rolled is 5; or the event that it will rain tomorrow; or the
event that someone in a certain group will contract a certain disease within the next
five years.
Four Perspectives on Probability
1. Classical
If a process has n equally likely outcomes, and the event A consists of exactly m of
these outcomes, we say that the probability of A is m/n. We may write this as
"P(A) = m/n" for short.
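The m/n rule can be sketched in a few lines of Python. This is only an illustration: the function name classical_probability is made up for this example, and the rule applies only when the outcomes are finitely many and equally likely.

```python
from fractions import Fraction

def classical_probability(event, outcomes):
    """Return m/n, where n counts all outcomes and m counts those in the event.

    Assumes the outcomes are finitely many and equally likely.
    """
    outcomes = set(outcomes)
    favorable = outcomes & set(event)   # outcomes that belong to the event
    return Fraction(len(favorable), len(outcomes))

die = range(1, 7)                              # the six faces of a fair die
print(classical_probability({5}, die))         # 1/6
print(classical_probability({1, 3, 5}, die))   # 1/2
```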
This perspective has the advantage that it is conceptually simple for many situations.
However, it is limited, since many situations do not have finitely many equally likely
outcomes. Tossing a weighted die is an example where we have finitely many
outcomes, but they are not equally likely. Studying people's incomes over time would
be a situation where we need to consider infinitely many possible outcomes, since
there is no way to say what a maximum possible income would be, especially if we are
interested in the future.
2. Empirical (sometimes called "A posteriori" or "Frequentist")
This perspective defines probability via a thought experiment.
To get the idea, suppose that we have a die which we are told is weighted, but we
don't know how it is weighted. We could get a rough idea of the probability of each
outcome by tossing the die a large number of times and using the proportion of times
that the die gives that outcome to estimate the probability of that outcome.
This idea is formalized to define the probability of the event A as
P(A) = the limit as n approaches infinity of m/n,
where n is the number of times the process (e.g., tossing the die) is performed,
and m is the number of times the outcome A happens.
(Notice that m and n stand for different things in this definition from what they meant
in Perspective 1.)
In other words, imagine tossing the die 100 times, 1000 times, 10,000 times, ... . Each
time we expect to get a better and better approximation to the true probability of the
event A. The mathematical way of describing this is that the true probability is the
limit of the approximations, as the number of tosses "approaches infinity" (that just
means that the number of tosses gets bigger and bigger indefinitely).
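The thought experiment can be imitated on a computer. The sketch below assumes a hypothetical weighting in which face 6 is three times as likely as each other face, so the true P(6) is 3/8. The function name running_proportions and the seed are choices made for this illustration only.

```python
import random

def running_proportions(weights, face, checkpoints, seed=0):
    """Toss a weighted die and record m/n for `face` at each checkpoint n."""
    rng = random.Random(seed)
    faces = list(range(1, len(weights) + 1))
    tosses = rng.choices(faces, weights=weights, k=max(checkpoints))
    hits = 0        # m: number of tosses so far that showed `face`
    results = {}
    for n, outcome in enumerate(tosses, start=1):
        if outcome == face:
            hits += 1
        if n in checkpoints:
            results[n] = hits / n
    return results

# Hypothetical weighting (an assumption): true P(6) = 3/8 = 0.375.
weights = [1, 1, 1, 1, 1, 3]
estimates = running_proportions(weights, 6, {100, 1000, 10_000, 100_000})
for n, estimate in estimates.items():
    print(f"n = {n:>6}: m/n = {estimate:.4f}")
```

A real weighted die would not come with known weights. They are supplied here only so that the estimates can be compared with the value they should approach.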
This view of probability generalizes the first view: If we indeed have a fair die, we
expect that the number we will get from this definition is the same as we will get from
the first definition (e.g., P(getting 1) = 1/6; P(getting an odd number) = 1/2). In
addition, this second definition also works for cases when outcomes are not equally
likely, such as the weighted die. It also works in cases where it doesn't make sense to
talk about the probability of an individual outcome. For example, we may consider
randomly picking a positive integer (1, 2, 3, ...) and ask, "What is the probability that
the number we pick is odd?" Intuitively, the answer should be 1/2, since every other
integer (when counted in order) is odd. To apply this definition, we consider randomly
picking 100 integers, then 1000 integers, then 10,000 integers, ... . Each time we
calculate what fraction of these chosen integers are odd. The resulting sequence of
fractions should give better and better approximations to 1/2.
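A rough sketch of this calculation follows. A computer cannot pick uniformly from all positive integers, so the code assumes an arbitrary upper bound of one billion. That bound, the seed, and the function name fraction_odd are all choices made for this illustration.

```python
import random

def fraction_odd(picks, upper=10**9, seed=1):
    """Fraction of odd values among `picks` integers drawn uniformly from 1..upper."""
    rng = random.Random(seed)
    odd = sum(rng.randint(1, upper) % 2 for _ in range(picks))
    return odd / picks

for picks in (100, 1000, 10_000):
    print(picks, fraction_odd(picks))
```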
However, the empirical perspective does have some disadvantages. First, it involves a
thought experiment. In some cases, the experiment could never in practice be carried
out more than once. Consider, for example, the probability that the Dow Jones average
will go up tomorrow. There is only one today and one tomorrow. Going from today to
tomorrow is not at all like rolling a die. We can only imagine all possibilities of going
from today to a tomorrow (whatever that means). We can't actually get an
approximation.
A second disadvantage of the empirical perspective is that it leaves open the question
of how large n has to be before we get a good approximation. As n increases, the
proportion m/n may wobble away from the true value and then wobble back toward it,
so the approach to the limit is not even a steady process.
The empirical view of probability is the one that is used in most statistical inference
procedures. These are called frequentist statistics. The frequentist view is what gives
credibility to standard estimates based on sampling. For example, if we choose a large
enough random sample from a population (for example, if we randomly choose a
sample of 1000 students from the population of all 50,000 students enrolled in the
university), then the average of some measurement (for example, college expenses) for
the sample is a reasonable estimate of the average for the population.
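The sampling example can be imitated with made-up data. In the sketch below, the population of 50,000 expense figures is generated artificially (normally distributed around 20,000 dollars, a pure assumption), and a simple random sample of 1000 is drawn from it.

```python
import random
import statistics

rng = random.Random(42)

# Hypothetical population: yearly expenses (in dollars) for 50,000 students.
# The distribution and its parameters are invented for illustration.
population = [rng.gauss(20_000, 5_000) for _ in range(50_000)]

# Simple random sample of 1000 students, drawn without replacement.
sample = rng.sample(population, 1000)

population_mean = statistics.fmean(population)
sample_mean = statistics.fmean(sample)
print(f"population mean: {population_mean:,.0f}")
print(f"sample mean:     {sample_mean:,.0f}")
```

In practice the population mean is unknown. It is computed here only to show how close the sample mean tends to come.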
3. Subjective
Subjective probability is an individual person's measure of belief that an event will
occur. With this view of probability, it makes perfectly good sense intuitively to talk
about the probability that the Dow Jones average will go up tomorrow. You can quite
rationally take your subjective view to agree with the classical or empirical views when
they apply, so the subjective perspective can be taken as an expansion of these other
views.
However, subjective probability also has its downsides. First, since it is subjective, one
person's probability (e.g., that the Dow Jones will go up tomorrow) may differ from
another's. This is disturbing to many people. Still, it models the reality that often people
do differ in their judgments of probability.
The second downside is that subjective probabilities must obey certain "coherence"
(consistency) conditions in order to be workable. For example, if you believe that the
probability that the Dow Jones will go up tomorrow is 60%, then to be consistent you
cannot believe that the probability that the Dow Jones will go down tomorrow is also
60%. It is easy to fall into subjective probabilities that are not coherent.
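One such condition can be checked mechanically: beliefs about events that cannot happen together must not add up to more than 1. The following is a minimal sketch of that single check, not a full test of coherence, and the function name is_coherent is invented for this example.

```python
def is_coherent(beliefs, tolerance=1e-9):
    """Check beliefs about mutually exclusive events for basic coherence.

    `beliefs` maps each event name to a subjective probability. Because the
    events cannot happen together, each value must lie between 0 and 1 and
    the values must not sum to more than 1.
    """
    values = beliefs.values()
    in_range = all(0 <= p <= 1 for p in values)
    return in_range and sum(values) <= 1 + tolerance

print(is_coherent({"up": 0.60, "down": 0.60}))  # False: 0.60 + 0.60 exceeds 1
print(is_coherent({"up": 0.60, "down": 0.35}))  # True: 0.05 is left for "unchanged"
```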
The subjective perspective of probability fits well with Bayesian statistics, which are an
alternative to the more common frequentist statistical methods. (This website will
mainly focus on frequentist statistics.)
4. Axiomatic
This is a unifying perspective. The coherence conditions needed for subjective
probability can be proved to hold for the classical and empirical definitions. The
axiomatic perspective codifies these coherence conditions, so it can be used with any of
the above three perspectives.
The axiomatic perspective says that probability is any function (we'll call it P) from
events to numbers satisfying the three conditions (axioms) below. (Just what
constitutes events will depend on the situation where probability is being used.)
The three axioms of probability:
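I. 0 ≤ P(E) ≤ 1 for every allowable event E. (In other words, 0 is the smallest
allowable probability and 1 is the largest allowable probability.)
II. The certain event has probability 1. (The certain event is the event that some
outcome occurs.)
III. The probability of the union of mutually exclusive events is the sum of the
probabilities of the individual events.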
If we have a fair die, the axioms of probability require that each number comes up with
probability 1/6: Since the die is fair, each number comes up with the same probability.
Since the outcomes "1 comes up," "2 comes up," ..., "6 comes up" are mutually exclusive
and their union is the certain event, Axiom III says that
P(1 comes up) + P(2 comes up) + ... + P(6 comes up) = P(the certain event),
which is 1 (by Axiom II). Since all six probabilities on the left are equal, that common
probability must be 1/6.
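The fair-die argument can be checked with exact fractions. The sketch below assigns 1/6 to each face and verifies the properties used above for this one finite example. The helper P is written for this illustration, with events represented as sets of faces.

```python
from fractions import Fraction
from itertools import combinations

faces = [1, 2, 3, 4, 5, 6]
p = Fraction(1, 6)          # common probability of each face

def P(event):
    """Probability of an event (a set of faces) on a fair die."""
    return p * len(set(event) & set(faces))

# Every event, i.e. every subset of the six faces, has probability in [0, 1].
events = [set(c) for r in range(7) for c in combinations(faces, r)]
assert all(0 <= P(e) <= 1 for e in events)

# The certain event (some face comes up) has probability 1.
assert P(faces) == 1

# Probabilities of mutually exclusive events add.
assert P({1, 3, 5}) == P({1}) + P({3}) + P({5})

print(P({1}), P({1, 3, 5}))   # 1/6 1/2
```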