Oct 2013

Dr Charlotte Price

Aims for lecture 7


Learn that you can use one distribution to
solve problems relating to a different
distribution
Approximate a Binomial distribution by a
Normal distribution
Approximate a Poisson distribution by a
Normal distribution
Approximate a Binomial distribution using a
Poisson distribution

From Binomial to Normal


Binomial distribution
In Lecture 6, we considered the following
scenario:
A company which manufactures self-help audio
books wants to launch a new product relating to
claustrophobia
The irrational fear of confined spaces*
It is estimated that around 10% of the population
may experience claustrophobia during their lifetime*
*http://www.anxietyuk.org.uk

Consider the following...


If the company asks 60 people, "Do you suffer
from claustrophobia?", what is the probability
that at least 10 of them will say yes?

Information and strategy


n = number of trials = 60
p = probability of saying yes = 0.1
If X ~ Bin(60,0.1), we want to find P(X ≥ 10)
Either calculate: P(X=10) + P(X=11) + ... + P(X=60)
or calculate: 1 - [P(X=0) + P(X=1) + ... + P(X=9)]
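
The complement strategy above can be sketched in a few lines of Python. This is only an illustration using the standard library; the function names binomial_pmf and binomial_at_least are made up for this example and are not part of the lecture.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Bin(n, p), straight from the Binomial formula."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binomial_at_least(k, n, p):
    """P(X >= k), computed as 1 - [P(X=0) + ... + P(X=k-1)]."""
    return 1 - sum(binomial_pmf(i, n, p) for i in range(k))

# Claustrophobia example: n = 60, p = 0.1, at least 10 people say yes
prob = binomial_at_least(10, 60, 0.1)
print(f"P(X >= 10) = {prob:.4f}")
```

A computer handles the ten terms easily, but by hand or with tables the same calculation quickly becomes tedious as n grows.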

Problems
As the sample size increases:
Using the Binomial formula takes too long
The values you need might not all be
included in statistical tables
Even sophisticated calculators have limits!

[Figures: probability plots of Bin(60,0.1)]

Normal approximation
As the sample size (or number of trials)
increases, the Binomial distribution starts to look
more like the Normal distribution
For instance:
if we had asked 100 people, "Do you suffer from
claustrophobia?", the corresponding Binomial
distribution looks like this...

[Figures: probability plots of Bin(100,0.1)]

Influence of n and p
It's not just the sample size (n) that
influences the shape of the Binomial
distribution
The probability of success (p) also plays
a part
Let's investigate.

[Figures: probability plots of Bin(10,0.2), Bin(10,0.5) and Bin(10,0.9); horizontal axis: number of successes]

Large n, different values of p


[Figures: probability plots of Bin(100,0.2), Bin(100,0.5) and Bin(100,0.9)]

General guidance
Normal approximation to the Binomial
distribution can be considered when:
The sample size is large
The probability of success (p) is not too close
to zero or one

Rule of thumb*: np > 5 and n(1-p) > 5


*Note: there are other rules of thumb!
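
The rule of thumb is easy to check mechanically. The sketch below assumes the np > 5 and n(1-p) > 5 version quoted above; the function name normal_approx_ok and the threshold parameter are illustrative only, and a different rule of thumb would need a different threshold.

```python
def normal_approx_ok(n, p, threshold=5):
    """Rule of thumb: both np and n(1-p) must exceed the threshold."""
    return n * p > threshold and n * (1 - p) > threshold

# Claustrophobia survey: np = 6 and n(1-p) = 54
print(normal_approx_ok(60, 0.1))
# A small sample with p close to one: n(1-p) is only about 1
print(normal_approx_ok(10, 0.9))
```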


Normal parameters
How do we translate the Binomial information to
fit with a Normal curve problem?
A Binomial distribution depends on n and p
A Normal distribution depends on μ and σ
Use the Normal distribution with:
μ = np
σ² = np(1-p) = npq, where q = 1-p
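
As a small worked illustration of this translation, the following sketch converts Binomial parameters into the matching Normal parameters. The helper name binomial_to_normal is an assumption made for this example.

```python
from math import sqrt

def binomial_to_normal(n, p):
    """Return (mean, variance, standard deviation) of the approximating Normal."""
    mean = n * p
    variance = n * p * (1 - p)   # npq with q = 1 - p
    return mean, variance, sqrt(variance)

mean, variance, sd = binomial_to_normal(60, 0.1)
print(f"mean = {mean:.1f}, variance = {variance:.1f}, sd = {sd:.3f}")
```

Note that the second parameter in N(np, npq) is the variance, so the square root is needed before computing a z score.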

Continuity correction factor


When using the Normal approximation to the
Binomial distribution, we can improve the
approximation by using a continuity correction factor
Important because we are using a continuous
distribution to approximate a discrete distribution
Fill in the gaps


Continuity correction factor


Add or subtract 0.5 to/from the value(s) of the
Binomial random variable as needed:
Example
P(X = 25) approximated by P(24.5 ≤ Y ≤ 25.5)
where X ~ Bin(n,p) and Y ~ N(np,npq)

In general:
P(X = x) approximated by P(x-0.5 ≤ Y ≤ x+0.5)

Semantics are important


To use the continuity correction properly:
List the values that are included in your
problem
Be careful to observe the nature of any
inequalities (i.e. <, ≤, > or ≥)
Take away 0.5 from the smallest value and/or
add 0.5 to the largest value

Semantics are important


Suppose X ~ Bin(n,p) and Y ~ N(np,npq)
Examples
P(X ≥ 4) approximated by P(Y ???)
P(X > 4) approximated by P(Y ???)
P(X ≤ 4) approximated by P(Y ???)
P(X < 4) approximated by P(Y ???)
P(2 < X < 6) approximated by P(??? ≤ Y ≤ ???)

Continuity correction: rules


X ~ Bin(n,p)      Y ~ N(np,npq)
P(X < x)          P(Y ≤ x - 0.5)
P(X ≤ x)          P(Y ≤ x + 0.5)
P(X = x)          P(x - 0.5 ≤ Y ≤ x + 0.5)
P(X > x)          P(Y ≥ x + 0.5)
P(X ≥ x)          P(Y ≥ x - 0.5)
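
The rules can also be written as a small lookup. The sketch below is one possible encoding: the function name continuity_corrected_interval and the string codes for the inequalities are choices made for this example. It returns the interval (lower, upper) over which the Normal variable Y should be evaluated.

```python
from math import inf

def continuity_corrected_interval(relation, x):
    """Map P(X relation x) for discrete X to bounds (lower, upper) for Y."""
    if relation == "<":
        return (-inf, x - 0.5)      # largest included value is x - 1
    if relation == "<=":
        return (-inf, x + 0.5)      # largest included value is x
    if relation == "==":
        return (x - 0.5, x + 0.5)   # only x itself is included
    if relation == ">":
        return (x + 0.5, inf)       # smallest included value is x + 1
    if relation == ">=":
        return (x - 0.5, inf)       # smallest included value is x
    raise ValueError(f"unknown relation: {relation!r}")

print(continuity_corrected_interval(">=", 10))   # "at least 10"
print(continuity_corrected_interval("==", 25))
```

Each branch follows the same idea: list the integer values that are included, then subtract 0.5 from the smallest and/or add 0.5 to the largest.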

Claustrophobia again...
If the company asks 60 people, "Do you suffer
from claustrophobia?", what is the probability
that at least 10 of them will say yes?
Remember, p = 0.1

Can we use the Normal approximation?


np = 60 × 0.1 = 6 > 5
n(1-p) = 60 × 0.9 = 54 > 5

Yes!

Claustrophobia again...
X ~ Bin(60,0.1) is approximated by Y~N(6,5.4)
We want to calculate P(X ≥ 10)
This includes the values 10, 11, 12,....

Hence we need to find P(Y ≥ 9.5)


This is just a Normal distribution problem....

Using the Normal Distribution


To find P(Y ≥ 9.5), first obtain the z score:

$$z = \frac{9.5 - 6}{\sqrt{5.4}} = 1.51$$

From z tables:
P(Z ≥ 1.51) = 1 - P(Z < 1.51) = 1 - 0.9345 = 0.0655
So the probability that at least 10 people suffer
from claustrophobia is roughly 7%
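
The same calculation can be reproduced without z tables. The sketch below builds the standard Normal cumulative probability from math.erf; the helper name normal_cdf is an assumption for this example. Because it uses the unrounded z score, the result can differ slightly in the third decimal place from the table-based 0.0655.

```python
from math import erf, sqrt

def normal_cdf(z):
    """P(Z < z) for a standard Normal variable Z."""
    return 0.5 * (1 + erf(z / sqrt(2)))

n, p = 60, 0.1
mu = n * p                        # 6
sigma = sqrt(n * p * (1 - p))     # square root of 5.4

z = (9.5 - mu) / sigma            # continuity-corrected cut-off
upper_tail = 1 - normal_cdf(z)    # P(Y >= 9.5)
print(f"z = {z:.2f}, P(Y >= 9.5) = {upper_tail:.4f}")
```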

You may be interested to know that the exact Binomial
probability in this case is 0.0731 or 7.3%, so the Normal
approximation is not too bad even though p is close to zero


From Poisson to Normal


Example: Poisson
The mean number of arrivals per hour at a
post office is 25
What is the probability that there are at
least 30 arrivals in any given hour?
If X = Number of arrivals in one hour, we
want to find P(X ≥ 30)

[Figures: probability plots of Pois(25)]

The effect of lambda


As λ (the mean) increases, the Poisson
distribution looks more Normal:


Normal approximation
Rule of thumb*: If λ > 10 then the Poisson
distribution can be approximated by the Normal
distribution
A Poisson distribution depends on λ
A Normal distribution depends on μ and σ
Use the Normal distribution with:
μ = λ and σ² = λ
*Note: some people use λ > 20
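
A minimal sketch of this rule and the parameter translation follows. The function names poisson_normal_ok and poisson_to_normal are illustrative, and the threshold is left as a parameter because, as noted, some people prefer 20 to 10.

```python
from math import sqrt

def poisson_normal_ok(lam, threshold=10):
    """Rule of thumb: the Normal approximation is reasonable if lambda > threshold."""
    return lam > threshold

def poisson_to_normal(lam):
    """Return (mean, variance, standard deviation) of the approximating Normal."""
    return lam, lam, sqrt(lam)

print(poisson_normal_ok(25))                 # passes with the default threshold of 10
print(poisson_normal_ok(15, threshold=20))   # fails under the stricter rule
print(poisson_to_normal(25))
```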


Continuity correction
The Poisson distribution is a discrete
distribution so we must remember to apply
the continuity correction when using the
Normal approximation
The same rules apply as outlined in the
previous slides...


Back to the post office...


We want to calculate P(X ≥ 30)
This includes the values 30, 31, 32,...
Since λ = 25 > 10, we can use the Normal
approximation to the Poisson distribution
If Y ~ N(25, 25), then after applying the
continuity correction we want to find P(Y ≥ 29.5)
We're back to a Normal problem...

Using the Normal Distribution


To find P(Y ≥ 29.5), first obtain the z score:

$$z = \frac{29.5 - 25}{\sqrt{25}} = 0.90$$

From z tables:
P(Z ≥ 0.90) = 1 - P(Z < 0.90) = 1 - 0.8159 = 0.1841
So the probability of at least 30 arrivals in any
given hour is approximately 18.4%
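
For readers who want to check the post office figures numerically, here is a short sketch that computes both the Normal approximation and the exact Poisson tail probability for comparison. The helper names normal_cdf and poisson_pmf are assumptions for this example; only the Python standard library is used.

```python
from math import erf, exp, factorial, sqrt

def normal_cdf(z):
    """P(Z < z) for a standard Normal variable Z."""
    return 0.5 * (1 + erf(z / sqrt(2)))

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Pois(lam)."""
    return exp(-lam) * lam**k / factorial(k)

lam = 25
z = (29.5 - lam) / sqrt(lam)      # continuity-corrected z score
approx = 1 - normal_cdf(z)        # Normal approximation to P(X >= 30)
exact = 1 - sum(poisson_pmf(k, lam) for k in range(30))   # 1 - P(X <= 29)

print(f"z = {z:.2f}, approx = {approx:.4f}, exact = {exact:.4f}")
```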

In this case, the exact Poisson probability is 0.1821
or 18.2%, so the Normal approximation is very good
for this value of lambda

From Binomial to Poisson


Example: Binomial
Suppose the probability of a bank making a
mistake in processing a deposit is 0.0003
If 10,000 deposits are scrutinised, what is the
probability that more than 6 had mistakes?
Let X = Number of mistakes made
n = number of trials = 10,000
p = probability of mistake = 0.0003
If X ~ Bin(10000,0.0003), we want to find P(X > 6)

Use an approximation?
Even if you're good at cancelling out factorials,
dealing with the Binomial formula is messy with
values such as n = 10,000
With the very small probability, p = 0.0003:
np = 10,000 × 0.0003 = 3 < 5
The Normal approximation to the Binomial is
therefore no good

[Figure: probability plot of Bin(10000,0.0003); horizontal axis: number of mistakes]

[Figures: probability plots of Bin(10000,0.0003) and Pois(3); horizontal axis: number of mistakes]

Poisson to the rescue!


In this case, the Poisson distribution is clearly
a very good approximation
Rule of thumb*: if n ≥ 20 and p ≤ 0.05 the
Binomial distribution can be approximated by
the Poisson
If X ~ Bin(n,p) approximate by Y ~ Pois(np)
*Note: This is just one rule of thumb; there are others
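
This rule can be sketched as a simple check that also returns the Poisson mean to use. The function name poisson_approx_ok and its keyword parameters are illustrative, and the thresholds assume the n ≥ 20, p ≤ 0.05 version of the rule of thumb.

```python
def poisson_approx_ok(n, p, min_n=20, max_p=0.05):
    """Rule of thumb: large n together with small p."""
    return n >= min_n and p <= max_p

n, p = 10_000, 0.0003            # bank deposit example
if poisson_approx_ok(n, p):
    lam = n * p                  # approximate Bin(n, p) by Pois(np)
    print(f"Use Pois({lam:.0f})")
```

The claustrophobia survey (n = 60, p = 0.1) fails this check because p is too large, which is why the Normal approximation was used there instead.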


Poisson approximation
With λ = np:

$$P(Y = r) = \frac{e^{-np}\,(np)^{r}}{r!}$$

Why is this any better than using the
Binomial distribution?
The calculations are much simpler.

Back to the bank problem


P(X > 6) where X ~ Bin(10000,0.0003)
Approximate by P(Y > 6) for Y ~ Pois(3)
P(Y > 6) = 1 - [P(Y=0) + ... + P(Y=6)]

$$= 1 - \sum_{r=0}^{6} \frac{e^{-3}\, 3^{r}}{r!} = 0.0335$$

So the probability of more than 6 mistakes is
roughly 3.4%
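
As a rough check on the bank problem, the sketch below evaluates the Poisson sum and, for comparison, the exact Binomial tail that the approximation replaces. The helper names poisson_pmf and binomial_pmf are assumptions for this example.

```python
from math import comb, exp, factorial

def poisson_pmf(r, lam):
    """P(Y = r) for Y ~ Pois(lam)."""
    return exp(-lam) * lam**r / factorial(r)

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Bin(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 10_000, 0.0003
lam = n * p                      # Poisson mean, np = 3

# P(more than 6 mistakes) = 1 - [P(0) + ... + P(6)]
poisson_tail = 1 - sum(poisson_pmf(r, lam) for r in range(7))
binomial_tail = 1 - sum(binomial_pmf(k, n, p) for k in range(7))

print(f"Poisson approximation: {poisson_tail:.4f}")
print(f"Exact Binomial:        {binomial_tail:.4f}")
```

The Poisson terms involve only e^(-3), small powers of 3 and small factorials, which is what makes the approximation manageable by hand.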

Learning points
Sometimes it's easier to use one distribution to calculate
the answer to problems stemming from a different
distribution
We can approximate both the Binomial and Poisson
distributions by the Normal distribution
This is used a lot in practice
It is subject to certain conditions

We can approximate the Binomial distribution by the
Poisson distribution (under certain conditions)
Helps to simplify the calculations
