Dr Charlotte Price
Oct 2013
Binomial distribution
In Lecture 6, we considered the following
scenario:
A company which manufactures self-help audio
books wants to launch a new product relating to
claustrophobia (the irrational fear of confined spaces*)
It is estimated that around 10% of the population
may experience claustrophobia during their lifetime*
*http://www.anxietyuk.org.uk
Problems
As the sample size increases:
Using the Binomial formula takes too long
The values you need might not all be
included in statistical tables
Even sophisticated calculators have limits!
[Figure: probability distribution of Bin(60, 0.1)]
Normal approximation
As the sample size (or number of trials)
increases, the Binomial distribution starts to look
more like the Normal distribution
For instance:
if we had asked 100 people, "Do you suffer from
claustrophobia?", the corresponding Binomial
distribution would look like this...
[Figure: probability distribution of Bin(100, 0.1)]
Influence of n and p
It's not just the sample size (n) that
influences the shape of the Binomial
distribution
The probability of success (p) also plays
a part
Let's investigate.
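To see the effect of p numerically, here is a minimal sketch using only Python's standard library. It tabulates the Binomial probabilities for n = 10 with p = 0.2, 0.5 and 0.9 and reports the most likely number of successes. The helper name `binomial_pmf` is an illustrative choice, not something from the lecture.

```python
from math import comb


def binomial_pmf(n, p):
    """Return [P(X = 0), ..., P(X = n)] for X ~ Bin(n, p)."""
    return [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]


for p in (0.2, 0.5, 0.9):
    pmf = binomial_pmf(10, p)
    # Index of the largest probability, i.e. the most likely number of successes
    mode = max(range(len(pmf)), key=pmf.__getitem__)
    print(f"Bin(10, {p}): most likely value = {mode}, P = {pmf[mode]:.3f}")
```

With p = 0.5 the probabilities are symmetric about 5; with p = 0.2 or p = 0.9 the peak moves towards one end of the range.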
[Figures: probability distributions of Bin(10, 0.2), Bin(10, 0.5) and Bin(10, 0.9), plotted against number of successes]
[Figures: probability distributions of Bin(100, 0.2), Bin(100, 0.5) and Bin(100, 0.9)]
General guidance
Normal approximation to the Binomial
distribution can be considered when:
The sample size is large
The probability of success (p) is not too close
to zero or one
Normal parameters
How do we translate the Binomial information to
fit with a Normal curve problem?
A Binomial distribution depends on n and p
A Normal distribution depends on $\mu$ and $\sigma$
Use the Normal distribution with:
$\mu = np$
$\sigma^2 = np(1-p) = npq$, where $q = 1 - p$
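As a quick illustration of this translation, the sketch below computes the Normal mean, variance and standard deviation from n and p. The function name `normal_params_from_binomial` is hypothetical; the values n = 60 and p = 0.1 come from the claustrophobia scenario.

```python
from math import sqrt


def normal_params_from_binomial(n, p):
    """Return (mean, variance, standard deviation) of the approximating Normal."""
    mean = n * p
    variance = n * p * (1 - p)  # npq with q = 1 - p
    return mean, variance, sqrt(variance)


mean, variance, sd = normal_params_from_binomial(60, 0.1)
print(f"mean = {mean:.1f}, variance = {variance:.1f}, sd = {sd:.3f}")
```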
In general:
$P(X = x)$ is approximated by $P(x - 0.5 \le Y \le x + 0.5)$
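With Y ~ N(np, npq):
$P(X < x) \approx P(Y \le x - 0.5)$
$P(X \le x) \approx P(Y \le x + 0.5)$
$P(X = x) \approx P(x - 0.5 \le Y \le x + 0.5)$
$P(X > x) \approx P(Y \ge x + 0.5)$
$P(X \ge x) \approx P(Y \ge x - 0.5)$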
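The continuity-correction rules can be sketched as a small helper. This is only an illustration: the function name `approx_binomial_prob` and the string codes for the inequality are assumptions, and `statistics.NormalDist` (Python 3.8+) stands in for the z tables.

```python
from statistics import NormalDist


def approx_binomial_prob(n, p, x, relation):
    """Approximate P(X relation x) for X ~ Bin(n, p) using Y ~ N(np, npq)."""
    y = NormalDist(mu=n * p, sigma=(n * p * (1 - p)) ** 0.5)
    if relation == "<":
        return y.cdf(x - 0.5)
    if relation == "<=":
        return y.cdf(x + 0.5)
    if relation == "==":
        return y.cdf(x + 0.5) - y.cdf(x - 0.5)
    if relation == ">":
        return 1 - y.cdf(x + 0.5)
    if relation == ">=":
        return 1 - y.cdf(x - 0.5)
    raise ValueError(f"unknown relation: {relation!r}")


# Example: P(X >= 10) for X ~ Bin(60, 0.1)
print(f"{approx_binomial_prob(60, 0.1, 10, '>='):.4f}")
```

Note that the corrected "<" and ">=" cases share the boundary x - 0.5, so their probabilities sum to one, as they should.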
Claustrophobia again...
If the company asks 60 people, "Do you suffer
from claustrophobia?", what is the probability
that at least 10 of them will say yes?
Remember, p = 0.1
Can the Normal approximation be used here? Yes!
Claustrophobia again...
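X ~ Bin(60, 0.1) is approximated by Y ~ N(6, 5.4)
We want to calculate $P(X \ge 10)$
This includes the values 10, 11, 12, ...
With the continuity correction, this becomes $P(Y \ge 9.5)$
$z = \frac{9.5 - 6}{\sqrt{5.4}} = 1.51$
From z tables:
$P(Z \ge 1.51) = 1 - P(Z < 1.51) = 1 - 0.9345 = 0.0655$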
So the probability that at least 10 people suffer
from claustrophobia is roughly 7%
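As a check on this calculation, the sketch below repeats it in Python and also sums the Binomial formula directly for comparison. It assumes `statistics.NormalDist` (Python 3.8+) in place of z tables; the variable names are illustrative.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 60, 0.1
mu = n * p             # 6
var = n * p * (1 - p)  # 5.4

# Normal approximation with continuity correction: P(Y >= 9.5)
z = (9.5 - mu) / sqrt(var)
approx = 1 - NormalDist().cdf(z)

# Exact Binomial: P(X >= 10) = 1 - P(X <= 9)
exact = 1 - sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(10))

print(f"z = {z:.2f}")
print(f"Normal approximation: {approx:.4f}")
print(f"Exact Binomial:       {exact:.4f}")
```

The approximate value differs slightly from the table-based 0.0655 because z is not rounded to two decimal places before the lookup.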
Example: Poisson
The mean number of arrivals per hour at a
post office is 25
What is the probability that there are at
least 30 arrivals in any given hour?
If X = Number of arrivals in one hour, we
want to find $P(X \ge 30)$
[Figure: probability distribution of Pois(25)]
Normal approximation
Rule of thumb*: If $\lambda > 10$ then the Poisson
distribution can be approximated by the Normal
distribution
A Poisson distribution depends on $\lambda$
A Normal distribution depends on $\mu$ and $\sigma$
Use the Normal distribution with:
$\mu = \lambda$ and $\sigma^2 = \lambda$
*Note: some people use $\lambda > 20$
Continuity correction
The Poisson distribution is a discrete
distribution so we must remember to apply
the continuity correction when using the
Normal approximation
The same rules apply as outlined in the
previous slides...
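For the post office example, X ~ Pois(25) is approximated by Y ~ N(25, 25)
With the continuity correction, $P(X \ge 30)$ becomes $P(Y \ge 29.5)$
$z = \frac{29.5 - 25}{\sqrt{25}} = 0.90$
From z tables:
$P(Z \ge 0.90) = 1 - P(Z < 0.90) = 1 - 0.8159 = 0.1841$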
So the probability of at least 30 arrivals in any
given hour is approximately 18.4%
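The same calculation can be sketched in Python, alongside a direct sum of Poisson probabilities for comparison. This assumes `statistics.NormalDist` (Python 3.8+) in place of z tables; the variable names are illustrative.

```python
from math import exp, sqrt
from statistics import NormalDist

lam = 25  # mean arrivals per hour

# Normal approximation with continuity correction: P(Y >= 29.5), Y ~ N(25, 25)
z = (29.5 - lam) / sqrt(lam)
approx = 1 - NormalDist().cdf(z)

# Exact Poisson: P(X >= 30) = 1 - P(X <= 29).
# Each term is built from the previous one to avoid large factorials.
term = exp(-lam)  # P(X = 0)
cdf_29 = term
for r in range(1, 30):
    term *= lam / r  # P(X = r) = P(X = r - 1) * lam / r
    cdf_29 += term
exact = 1 - cdf_29

print(f"z = {z:.2f}")
print(f"Normal approximation: {approx:.4f}")
print(f"Exact Poisson:        {exact:.4f}")
```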
Example: Binomial
Suppose the probability of a bank making a
mistake in processing a deposit is 0.0003
If 10,000 deposits are scrutinised, what is the
probability that more than 6 had mistakes?
Let X = Number of mistakes made
n = number of trials = 10,000
p = probability of mistake = 0.0003
If X ~ Bin(10000,0.0003), we want to find P(X > 6)
Use an approximation?
Even if you're good at cancelling out factorials,
dealing with the Binomial formula is messy with
values such as n = 10,000
With the very small probability, p = 0.0003:
$np = 10{,}000 \times 0.0003 = 3 < 5$
The Normal approximation to the Binomial is
therefore no good
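A check of this kind could be sketched as follows. The threshold of 5 for np is taken from the slide; the matching check on n(1 - p) is an added assumption, reflecting the guidance that p should not be too close to one either. The function name `normal_approx_ok` is hypothetical.

```python
def normal_approx_ok(n, p, threshold=5):
    """Rule-of-thumb check: both np and n(1 - p) should reach the threshold."""
    # The n(1 - p) condition is an assumption mirroring the np condition
    return n * p >= threshold and n * (1 - p) >= threshold


print(normal_approx_ok(10_000, 0.0003))  # bank example: np = 3
print(normal_approx_ok(60, 0.1))         # claustrophobia example: np = 6
```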
[Figures: probability distribution of Bin(10000, 0.0003), plotted against number of mistakes (0 to 12)]
Poisson approximation
With $\lambda = np$:
$P(Y = r) = \frac{e^{-np}(np)^r}{r!}$
$P(X > 6) \approx 1 - \sum_{r=0}^{6} \frac{e^{-3}\,3^r}{r!} = 0.0335$
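To see how close the Poisson approximation is for the bank example, the sketch below evaluates it and compares it with the Binomial formula summed directly, which Python's exact integer arithmetic in `math.comb` makes feasible. Variable names are illustrative.

```python
from math import comb, exp, factorial

n, p = 10_000, 0.0003
lam = n * p  # 3

# Poisson approximation: P(Y > 6) = 1 - sum of P(Y = r) for r = 0..6
poisson_cdf_6 = sum(exp(-lam) * lam**r / factorial(r) for r in range(7))
approx = 1 - poisson_cdf_6

# Exact Binomial: P(X > 6) = 1 - sum of P(X = r) for r = 0..6
binomial_cdf_6 = sum(comb(n, r) * p**r * (1 - p) ** (n - r) for r in range(7))
exact = 1 - binomial_cdf_6

print(f"Poisson approximation: {approx:.4f}")
print(f"Exact Binomial:        {exact:.4f}")
```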
Learning points
Sometimes it's easier to use one distribution to calculate
the answer to problems stemming from a different
distribution
We can approximate both the Binomial and Poisson
distributions by the Normal distribution
This is used a lot in practice
It is subject to certain conditions