
In God we trust. All others must bring data.

Robert Hayden, Plymouth State College

Binomial and Geometric Distributions

Binomial Distributions

Frequently we encounter situations where there are only two outcomes of interest:
tossing a coin to yield heads or tails, attempting a free throw in basketball, which will
either be successful or not, predicting the sex of an unborn child (either male or female),
or quality testing of parts, which will either meet requirements or not. In each case we can
describe the two outcomes as either a success or a failure, depending on how the
experiment is defined.

When four specific conditions are satisfied, an experiment is called a BINOMIAL
setting, which will produce a BINOMIAL DISTRIBUTION. The four requirements are:
1) each observation falls into one of two categories, called a success or a failure
2) there is a fixed number of observations
3) the observations are all independent
4) the probability of success (p) is the same for each observation

Statistics jargon: If the experiment is a binomial setting, then the random variable X =
number of successes is called a binomial random variable, and the probability
distribution of X is called a binomial distribution.

BINOMIAL DISTRIBUTION DEFINED:

The distribution of the count X of successes in the binomial setting is the binomial
distribution with parameters n and p. The parameter n is the number of observations, and
p is the probability of a success on any one observation. The possible values of X are the
whole numbers from 0 to n. As an abbreviation, we say X is B(n,p).

The binomial distributions are an important class of discrete probability distributions.


See pages 440-441 for examples. The TI 83 can calculate binomial probabilities as
described in Ex. 8.5, page 442.

pdf (probability distribution function, specifically binomial pdf)...

Given a discrete random variable X, the probability distribution function assigns
a probability to each value of X. The probabilities must satisfy the rules for
probabilities studied earlier.

Frequently we want to find the probability that a random variable takes a range of
values. This calls for the cumulative binomial probability: the cdf, or specifically the
binomial cdf.

cdf (cumulative (probability) distribution function, specifically binomial cdf)...

Given a random variable X, the cumulative distribution function (cdf) of X
calculates the SUM of the probabilities for 0, 1, 2, ... up to a given value k. That
is, it calculates $P(X \le k)$, the probability of obtaining at most k successes in n trials.

In addition to being helpful in answering questions involving wording such as "find the
probability that it takes at most 6 trials," the cdf is also particularly useful for calculating
the probability that it takes more than a certain number of trials to see the first success
using the complement rule...

$$P(X > n) = 1 - P(X \le n), \quad n = 2, 3, 4, \ldots$$

Binomial formulas exist to compute these probabilities by hand. We must first consider
the binomial coefficient...

The number of ways of arranging k successes among n observations is given by
the binomial coefficient

$$\binom{n}{k} = \frac{n!}{k!\,(n-k)!}$$
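
As a quick worked check of the binomial coefficient formula (the values n = 4 and k = 2
are chosen here only for illustration), the number of ways to arrange 2 successes among
4 observations is

$$\binom{4}{2} = \frac{4!}{2!\,(4-2)!} = \frac{24}{2 \cdot 2} = 6$$

Writing S for success and F for failure, the six arrangements are SSFF, SFSF, SFFS,
FSSF, FSFS, and FFSS.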

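Binomial Probability

If X has the binomial distribution with n observations and probability p of success on
each observation, the possible values of X are 0, 1, 2, ..., n. If k is any one of these
values, the binomial coefficient counts the arrangements of k successes, and

$$P(X = k) = \binom{n}{k} p^k (1-p)^{n-k}$$
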
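
The binomial coefficient and the binomial probability formula can be sketched in Python
as a rough stand-in for the calculator's binomial pdf and cdf functions. The function
names binomial_pdf and binomial_cdf are made up for this illustration, and the example
of 4 tosses of a fair coin is an assumption, not taken from the textbook. The probability
is computed as C(n, k) p^k (1-p)^(n-k), and the cdf simply sums it from 0 up to k.

```python
from math import comb


def binomial_pdf(k, n, p):
    """P(X = k) for X ~ B(n, p): C(n, k) * p^k * (1 - p)^(n - k)."""
    if k < 0 or k > n:
        return 0.0
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binomial_cdf(k, n, p):
    """P(X <= k): sum of the pdf over 0, 1, ..., k."""
    return sum(binomial_pdf(i, n, p) for i in range(k + 1))


# Hypothetical example: X = number of heads in 4 tosses of a fair coin.
n, p = 4, 0.5
print(binomial_pdf(2, n, p))      # exactly 2 heads: 6 * 0.5^4 = 0.375
print(binomial_cdf(2, n, p))      # at most 2 heads: (1 + 4 + 6) / 16 = 0.6875
print(1 - binomial_cdf(2, n, p))  # more than 2 heads, by the complement rule: 0.3125
```

The last line shows the complement rule in action: "more than 2" is everything that is
not "at most 2".
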
Geometric Distributions

Having spent much time studying binomial distributions (what qualifies, which formulas
to use, etc.), you should find learning about geometric distributions easy.

Let's start by contrasting the two distributions and settings.


Binomial: has a FIXED number of trials before the experiment begins and X counts the
number of successes obtained in that fixed number.
Geometric: has a fixed number of successes (ONE...the FIRST) and counts the number
of trials needed to obtain that first success. It is theoretically possible to proceed
indefinitely without ever obtaining a success.
Examples would be
1) flip a coin UNTIL you get your first head
2) roll a die UNTIL you get your first 3
3) attempt a three-point shot in basketball UNTIL you make your first basket

A random variable X is geometric provided that the following conditions are met:
(a-c are same as binomial)
a) each observation falls into one of just two categories, called success or failure
b) probability of a success, p, is the same for each observation
c) observations are all independent
NEW
d) the variable of interest is the number of trials required to obtain the FIRST success.

Recognizing the existence of either a binomial or geometric distribution is essential to
know how to proceed with your data analysis. Here is an example that should help
explain how to VERIFY a geometric setting.

An experiment consists of rolling a single die. The event of interest is rolling a 3: this is
called a success. The random variable is defined as X = number of trials UNTIL a 3
occurs. To VERIFY that this is a geometric setting, note that rolling a 3 will represent a
success, and rolling any other number will represent a failure. The probability of rolling
a 3 on each roll is the same: 1/6. The observations are independent. A trial consists of
rolling the die once. We roll the die until a 3 appears. Since all of the requirements are
satisfied, this experiment describes a geometric setting.

Rule for calculating geometric probabilities:


If X has a geometric distribution with probability p of success and (1-p) of failure on each
observation, the possible values of X are 1, 2, 3, .... If n is any one of these values, the
probability that the first success occurs on the nth trial is

$$P(X = n) = (1-p)^{n-1} p$$

This rule can be used to construct a probability distribution table for X = number of rolls
of a die until a 3 occurs from our earlier example. We'll use the TI 83 to do this now.
When the distribution of X is graphed as a probability distribution histogram, it will
appear to be strongly skewed to the right. This will ALWAYS be the case. Try to determine
why from the formula!
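
For readers without a TI 83 at hand, the same kind of table can be sketched in Python.
This is only an illustration: the function name geometric_pdf and the choice to stop the
table at 10 rolls are assumptions made here, and the table could be extended indefinitely.

```python
def geometric_pdf(n, p):
    """P(X = n) = (1 - p)^(n - 1) * p: first success on trial n."""
    if n < 1:
        return 0.0
    return (1 - p) ** (n - 1) * p


p = 1 / 6  # probability of rolling a 3 on any one roll

# Table of P(X = n) for the first 10 rolls (the cutoff of 10 is arbitrary).
table = [(n, geometric_pdf(n, p)) for n in range(1, 11)]
for n, prob in table:
    print(f"{n:>2}  {prob:.4f}")
```

Each probability is the previous one multiplied by (1 - p) = 5/6, so the bars of the
histogram shrink steadily from the tallest bar at X = 1. That is one way to see the
right skew.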

The mean (expected value) and standard deviation of a geometric random variable can be
calculated using these formulas:

If X is a geometric random variable with probability of success p on each trial, then the
mean of the random variable, that is the expected number of trials required to get the first
success, is

$\mu = 1/p$, and the variance of X is $(1-p)/p^2$,
whose square root yields the standard deviation.
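
Applied to the die-rolling example from earlier, where p = 1/6, these formulas give the
following (a worked check, with the final decimal rounded):

$$\mu = \frac{1}{1/6} = 6, \qquad \sigma^2 = \frac{1 - 1/6}{(1/6)^2} = \frac{5/6}{1/36} = 30, \qquad \sigma = \sqrt{30} \approx 5.48$$

So on average it takes 6 rolls to see the first 3, with a standard deviation of about
5.5 rolls.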

One more rule to go....

P(X > n), or the probability that it takes MORE than a certain number of trials to achieve
the first success:

$$P(X > n) = (1-p)^n$$
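
One way to see where this rule comes from: the first success takes more than n trials
exactly when the first n trials are all failures, and by independence that has
probability (1-p)^n. The same result follows from the complement rule and the geometric
probabilities, sketched here as a check:

$$P(X > n) = 1 - P(X \le n) = 1 - \sum_{k=1}^{n} (1-p)^{k-1} p = 1 - p \cdot \frac{1 - (1-p)^n}{1 - (1-p)} = (1-p)^n$$

For the die example with p = 1/6, the probability that it takes more than 6 rolls to get
the first 3 is (5/6)^6, or about 0.335.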

Since we are seeking the first success at whatever trial it occurs, geometric simulations
are called "waiting time" simulations. Makes sense!
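
A waiting-time simulation can be sketched in a few lines of Python. This is an
illustrative sketch, not the textbook's procedure: the function name rolls_until_three,
the seed, and the number of repetitions are all assumptions chosen here. It repeats the
die-rolling experiment many times and compares the average waiting time with the
expected value 1/p = 6.

```python
import random


def rolls_until_three(rng):
    """Roll a fair die until a 3 appears; return the number of rolls needed."""
    rolls = 0
    while True:
        rolls += 1
        if rng.randint(1, 6) == 3:
            return rolls


rng = random.Random(2024)  # fixed seed (arbitrary) so the run is repeatable
repetitions = 100_000
waits = [rolls_until_three(rng) for _ in range(repetitions)]

mean_wait = sum(waits) / repetitions
share_over_6 = sum(w > 6 for w in waits) / repetitions

print(f"average number of rolls: {mean_wait:.2f}")  # should be near 1/p = 6
print(f"share taking more than 6 rolls: {share_over_6:.3f}")  # near (5/6)^6, about 0.335
```

With this many repetitions the simulated values should land close to the theoretical
ones, though they will vary slightly from seed to seed.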
