
Statistical Analysis Chapter 4

Normal Distribution

What is the normal curve?

In Chapter 2 we talked about histograms and modes.

A normal distribution is when a set of values for one variable, when displayed in a histogram (or line graph), has one peak (mode) and looks like a bell. Here is an example using height:

Characteristics of the Normal Curve


a. Bell shaped, fading at the tails. In other words, more values are in the middle, and odd or unusual values fall at the tails.
b. All (100%) of the data fits on the curve, with 50% before the mean and 50% after.
c. About 68% of the data falls within -1 and +1 standard deviations of the mean.
d. About 95% of the data falls between -2 and +2 standard deviations of the mean.
e. The percentage of data between any two points is equal to the probability of randomly selecting a value between the two points (remember classical probability from Ch. 3).
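The percentages in this list can be checked numerically. The following is a minimal sketch that uses `NormalDist` from Python's standard library in place of the printed normal curve table; the helper name `area_between` is made up for this illustration.

```python
from statistics import NormalDist

# Standard normal curve: mean 0, standard deviation 1.
std_normal = NormalDist(mu=0.0, sigma=1.0)

def area_between(z_low, z_high):
    """Fraction of the data lying between two z-scores (hypothetical helper)."""
    return std_normal.cdf(z_high) - std_normal.cdf(z_low)

print(f"below the mean: {std_normal.cdf(0):.2%}")
print(f"within 1 SD:    {area_between(-1, 1):.2%}")
print(f"within 2 SDs:   {area_between(-2, 2):.2%}")
```

The exact figures come out slightly above 68% and 95%, which is why those two characteristics are best read as approximations.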

Standard Deviations and Z-Score

Z-score = the number of standard deviations a value lies away from the mean.

$z = \frac{x - \mu}{\sigma}$

(x = the data value for which we want to know the z-score, μ = the mean, σ = the standard deviation)
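As a small illustration, the z-score calculation can be written as a one-line function. The height figures below (mean 64 inches, standard deviation 2.5 inches) are hypothetical numbers chosen for the example and do not come from the chapter.

```python
def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies away from the mean."""
    if std_dev <= 0:
        raise ValueError("standard deviation must be positive")
    return (x - mean) / std_dev

# Hypothetical data: heights with mean 64 inches and SD 2.5 inches.
print(z_score(69, 64, 2.5))  # 2.0, i.e. two standard deviations above the mean
print(z_score(61.5, 64, 2.5))  # -1.0, i.e. one standard deviation below the mean
```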

We use the characteristics of the normal curve, and the z-score, to find out the probability of a particular event or value occurring (remember classical probability from Chapter 3).

Solving Normal Curve Problems Using Z-Scores
(steps listed at bottom of p. 111)
1. Draw a normal curve, showing values for standard deviations -2 through +2
2. Shade the area in question
3. Calculate the z-scores and cutoffs (percentages asked for)
4. Use the z-scores and cutoffs to solve the normal curve problem

Find Percentages on the Normal Curve Table

Let's do these questions as a class.
a. What is the percentage of data from z = 0 to z = 0.1?
b. What is the percentage of data from z = 0 to z = 2.16?
c. What is the percentage of data from z = -1.11 to z = 1.11?
d. What is the percentage of data above z = 1.24?
e. What is the percentage of data below z = -0.6?

Answers
a. .0398 = 3.98%
b. .4846 = 48.46%
c. .3665 + .3665 = .733 = 73.3%
d. .50 - .3925 = .1075 = 10.75%
e. .50 - .2257 = .2743 = 27.43%
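The table lookups for these five questions can be reproduced in code. This is a sketch that assumes the table lists the area between the mean and z, which is how the answers above use it; `table_area` is a hypothetical helper name.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

def table_area(z):
    """Area between the mean (z = 0) and z, as a normal curve table lists it."""
    return std_normal.cdf(abs(z)) - 0.5

answers = {
    "a": table_area(0.1),                       # from z = 0 to z = 0.1
    "b": table_area(2.16),                      # from z = 0 to z = 2.16
    "c": table_area(-1.11) + table_area(1.11),  # both sides of the mean
    "d": 0.5 - table_area(1.24),                # upper half minus the table area
    "e": 0.5 - table_area(-0.6),                # lower half minus the table area
}

for label, area in answers.items():
    print(f"{label}. {area:.4f} = {area:.2%}")
```

Questions d and e subtract from .50 because the table only covers the stretch between the mean and z, while the question asks for the tail beyond z.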

Working backwards from percentages

When working backwards from percentages, we still use the normal table, but look for the percentage to give us the z-score.

a. What is the z-score associated with 10.2% of the data?
b. What are the z-scores for the middle 30% of the normal curve?
c. What is the z-score of data in the upper 25% of the normal curve?

Answers
a. z = 0.26
b. z = -.39 to z = .39
c. z = 0.67
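Working backwards is the inverse of the table lookup. The sketch below uses `NormalDist.inv_cdf` from the standard library instead of scanning the table; it assumes each percentage is the area between the mean and z, and it reads question c the same way the answer above does (25% lying between the mean and the cutoff). The helper name `z_for_table_area` is made up for this example.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

def z_for_table_area(area):
    """Positive z-score whose area between the mean and z equals `area`."""
    if not 0 <= area < 0.5:
        raise ValueError("area between the mean and z must be in [0, 0.5)")
    return std_normal.inv_cdf(0.5 + area)

z_a = z_for_table_area(0.102)     # 10.2% between the mean and z
z_b = z_for_table_area(0.30 / 2)  # middle 30% means 15% on each side
z_c = z_for_table_area(0.25)      # 25% between the mean and the cutoff

print(f"a. z = {z_a:.2f}")
print(f"b. z = {-z_b:.2f} to z = {z_b:.2f}")
print(f"c. z = {z_c:.2f}")
```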

Let's do Question 4.2

Use the normal curve table to determine the percentage of data in the normal curve
a. Between z = 0 and z = .82
b. Above z = 1.15
c. Between z = -1.09 and z = .47
d. Between z = 1.53 and z = 2.78

Work backward in the normal curve table to solve the following:
e. 32% of the data in the normal curve can be found between z = 0 and z = ?
f. Find the z-score associated with the lower 5% of the data.
g. Find the z-scores associated with the middle 98% of the data.

Question 4.2 Answers
a. 29.39%
b. 12.51%
c. 54.29%
d. 6.03%
e. Between z = 0 and z = .92, or between z = 0 and z = -.92

Question 4.7

Use the normal curve table to determine the percentage of data in the normal curve
a. Between z = 0 and z = .38
b. Above z = -1.45
c. Above z = 1.45
d. Between z = .77 and z = 1.92
e. Between z = -.25 and z = 2.27
f. Between z = -1.63 and z = -2.89

Work backward in the normal curve table to solve the following:
g. 15% of the data in the normal curve can be found between z = 0 and z = ?
h. Find the z-score associated with the upper 73.57% of the data.
i. Find the z-scores associated with the middle 95% of the data.

Question 4.7 Answers
a. 14.80%
b. 92.65%
c. 7.35%
d. 19.32%
e. 58.71%
f. 4.97%
g. z = .39 or -.39
h. z = -.63
i. Between z = -1.96 and z = +1.96

Binomial Distributions and Sampling

Binomial means two categories in a population:
Males and females
Sports game players vs. non-sports game players
Incomes over 40,000 vs. incomes under 40,000

Quick note: Remember, for binomial distributions, we would visualize this data through a pie chart because we do not have enough categories for a histogram.

Sampling from a Two-Category Population

With two-category populations, we can describe the population by p, the percentage of values in one category.

This is the same p from the last chapter on probability (classical probability):

$P(\text{event}) = \frac{s}{n}$

s = number of chances for success
n = total equally likely possibilities

We know (actually, statisticians know) that if we randomly sample from a population, then the sample proportion is approximately equal to the population proportion: $p_s \approx p$

Sampling Distribution

In order to know the odds of getting certain values from this particular binomial sample, we have to know the sampling distribution from this population.

Under certain conditions, the sampling distribution for a binomial value is normal (i.e., the distribution follows the normal curve).

When the sampling distribution is normal, we can make predictions using our table and our z-scores.

Sampling from a Binomial Distribution

Suppose we defined a population (full-time FIT students, who either shop at Hot Topic or do not), and we have made our measure of interest into a binomial distribution: those who shop at Hot Topic and those who do not.

Suppose over the last 10 years, marketers have surveyed the FIT population hundreds of times and found that the proportion of Hot Topic shoppers is p = .13. (The proportion of non-Hot Topic shoppers is 1 - p = .87.)

Sampling from a Binomial Distribution

But suppose sometime later, your manager asks you to lead another study. This time, you don't have enough money to survey the whole population, and you have to get a sample.

We can assume, because so many studies have been done in the past, that the true value for Hot Topic shoppers is p = .13. Thus, because we know that $p_s \approx p$, your sample should have approximately the same value.

Sampling from a Binomial Distribution

For each sample, we can use the number sampled and the p value from the population to predict the total number of Hot Topic shoppers. This is called the expected value.

Expected value = np

Thus, if we collected a sample of 200 FIT students, how many students would we expect to be Hot Topic shoppers?
np = (200)(.13) = 26

This expected value is the mean of your sample.

Binomial Distribution and the Normal Curve

Now, we need to decide if we can use the normal curve to solve problems.

If np > 5 and n(1 - p) > 5, then the sampling distribution will be approximately normally distributed.

So, our sample was 200 students.
Is np > 5?
Is n(1 - p) > 5?
Yes and yes.
np = (200)(.13) = 26
n(1 - p) = (200)(1 - .13) = (200)(.87) = 174
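The two-part check can be captured in a short function. This is a sketch only; the function name `normal_approximation_ok` and the second example (a sample of just 20 students) are made up for illustration.

```python
def normal_approximation_ok(n, p, threshold=5):
    """Apply the rule of thumb: both np and n(1 - p) must exceed the threshold."""
    expected_successes = n * p        # np
    expected_failures = n * (1 - p)   # n(1 - p)
    return expected_successes > threshold and expected_failures > threshold

# The sample from the slides: np = 26 and n(1 - p) = 174, so both pass.
print(normal_approximation_ok(200, 0.13))  # True

# Hypothetical smaller sample: np = 2.6, which fails the first test.
print(normal_approximation_ok(20, 0.13))   # False
```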

Binomial Distribution and the Normal Curve

What do we mean when we say that a sampling distribution is normal?

Just as someone's age is one value among many ages that we tally to make a histogram, we can tally many samples, get the p values of those samples, and construct a histogram from these values.

If we took, say, 1,000 samples and tallied the p values for Hot Topic shoppers, then those values, when turned into a histogram, should form a normal curve, just as if we took the heights of 1,000 women and tallied those values to get a normal curve.
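The idea of tallying many samples can be tried out with a small simulation. The sketch below draws 1,000 random samples of 200 students each, assuming p = .13 as in the slides, and prints a rough text histogram of the sample proportions. The seed, the bin width, and the helper name `count_shoppers` are arbitrary choices for this illustration.

```python
import random
from collections import Counter
from statistics import mean, stdev

random.seed(4)  # fixed seed (arbitrary choice) so reruns give the same tallies

P = 0.13            # population proportion of Hot Topic shoppers, from the slides
N = 200             # students per sample
NUM_SAMPLES = 1000  # number of repeated samples

def count_shoppers(n, p):
    """Draw one random sample of n students and count the shoppers in it."""
    return sum(1 for _ in range(n) if random.random() < p)

counts = [count_shoppers(N, P) for _ in range(NUM_SAMPLES)]
proportions = [k / N for k in counts]

# Text histogram. With N = 200, two shoppers equal one percentage point,
# so k // 2 is the whole-percent bin that the sample proportion falls in.
tally = Counter(k // 2 for k in counts)
for percent in sorted(tally):
    print(f"{percent:>2}% {'#' * (tally[percent] // 5)}")

print(f"mean of sample proportions: {mean(proportions):.3f}")
print(f"SD of sample proportions:   {stdev(proportions):.3f}")
```

The bars should pile up around 13% and thin out on both sides, which is the bell shape the slide describes.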

How to use the Binomial Distribution and the Normal Curve
1. Get the mean (μ). The mean is the expected value: $\mu = np$
2. Get the standard deviation: $\sigma = \sqrt{np(1 - p)}$
3. Draw a normal curve using the mean and standard deviation
4. Use the continuity correction factor: add or subtract half a unit to or from the value we want to solve for
5. Get the z-scores: $z = \frac{x - \mu}{\sigma}$
6. Use the normal curve table to solve the problem
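The six steps can be sketched as one function. This is an illustration under stated assumptions, not the textbook's procedure verbatim: `NormalDist.cdf` from the standard library stands in for the printed table (so results may differ from table answers in the last digit), and the function name and its `low`/`high` parameters are made up for this example.

```python
from math import sqrt
from statistics import NormalDist

def binomial_normal_probability(n, p, low=None, high=None):
    """Approximate P(low <= X <= high), where X counts successes in n trials.

    Leave `low` or `high` as None for an open-ended question
    ("30 or more" -> low=30, "25 or fewer" -> high=25).
    """
    mean = n * p                        # step 1: expected value
    std_dev = sqrt(n * p * (1 - p))     # step 2: standard deviation
    curve = NormalDist(mean, std_dev)   # step 3: the curve we would draw
    # Step 4: continuity correction, half a unit outward on each closed end.
    # Steps 5-6: cdf() does the z-score conversion and the area lookup.
    area_below_high = curve.cdf(high + 0.5) if high is not None else 1.0
    area_below_low = curve.cdf(low - 0.5) if low is not None else 0.0
    return area_below_high - area_below_low

# 200 students, p = .13: probability of 30 or more shoppers.
print(f"{binomial_normal_probability(200, 0.13, low=30):.4f}")
```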

Why the continuity correction factor?

This is only for discrete values (where values occupy only distinct points). For example, in our study, there is no such thing as a half or 3/4 Hot Topic shopper. Either you are a shopper or not. Looking at how histograms are presented, you can see why we have to use the correction factor.
1. For the probability of getting a value equal to or greater than (≥) x, you must subtract a half-unit.
2. For the probability of getting a value equal to or less than (≤) x, you must add a half-unit.
3. For the probability of getting the exact value, you must get the z-scores for a half-unit above and a half-unit below.
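The three rules can be written down as a small lookup. In this sketch, the function name `continuity_bounds` and the labels `"at_least"`, `"at_most"`, and `"exactly"` are made up for illustration; `None` marks a side of the curve that is left open.

```python
def continuity_bounds(x, kind):
    """Return (lower, upper) cutoffs on the normal curve for a discrete count x.

    kind is "at_least" (x or more), "at_most" (x or fewer), or "exactly".
    A cutoff of None means the area runs on into that tail.
    """
    if kind == "at_least":
        return (x - 0.5, None)     # rule 1: subtract a half-unit
    if kind == "at_most":
        return (None, x + 0.5)     # rule 2: add a half-unit
    if kind == "exactly":
        return (x - 0.5, x + 0.5)  # rule 3: half a unit on each side
    raise ValueError(f"unknown kind: {kind!r}")

print(continuity_bounds(30, "at_least"))  # (29.5, None)
print(continuity_bounds(25, "at_most"))   # (None, 25.5)
print(continuity_bounds(13, "exactly"))   # (12.5, 13.5)
```

In each case the cutoff moves outward so that the whole histogram bar for x is included in the shaded area.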

Now let's answer a Hot Topic Question

If you collected a sample of 200 FIT students:
a. What is the probability that exactly 13 will be Hot Topic shoppers?
b. What is the probability that you will have 30 or more Hot Topic shoppers?
c. What is the probability that you will have 25 or fewer Hot Topic shoppers?


Answer
1. Get the mean: μ = expected value = np = (200)(.13) = 26
2. Get the standard deviation: $\sigma = \sqrt{np(1 - p)} = \sqrt{26(1 - .13)} = \sqrt{26(.87)} = \sqrt{22.62} \approx 4.76$
3. Draw a normal curve using the mean and standard deviation.
4. Use the continuity correction factor to correct x: (a) 12.5 and 13.5, (b) 29.5, (c) 25.5
5. Get the z-scores: (a) -2.83 and -2.62, (b) .735, (c) -.105
6. Solve the problem: (a) .4977 - .4956 = .0021, or about 0.2%, (b) .50 - .2704 ≈ .23, or 23%, (c) .50 - .0438 = .4562, or about 46% (using the table value for z = -.11)
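The normal curve answers are approximations to the underlying binomial distribution. As a cross-check that goes slightly beyond the chapter's method, the exact binomial probabilities can be summed directly. This is a sketch with hypothetical helper names (`binomial_pmf`, `binomial_probability`), using only `math.comb` from the standard library.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Exact probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binomial_probability(n, p, low, high):
    """Exact P(low <= X <= high), summing the individual probabilities."""
    return sum(binomial_pmf(k, n, p) for k in range(low, high + 1))

N, P = 200, 0.13  # sample size and population proportion from the slides

exactly_13 = binomial_probability(N, P, 13, 13)
thirty_or_more = binomial_probability(N, P, 30, N)
twenty_five_or_fewer = binomial_probability(N, P, 0, 25)

print(f"(a) exactly 13:  {exactly_13:.4f}")
print(f"(b) 30 or more:  {thirty_or_more:.4f}")
print(f"(c) 25 or fewer: {twenty_five_or_fewer:.4f}")
```

The exact values should land near the normal curve answers without matching them digit for digit; the gap tends to be proportionally largest far out in the tail, as in question (a).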

Now let's do Question 4.16 as a class

In a marketing population of phone calls, 3% produced a sale. If this population proportion (p = 3%) can be applied to future phone calls, then out of 500 randomly monitored phone calls:
a. How many would you expect to produce a sale?
b. What is the probability of getting 11 to 14 sales?
c. What is the probability of getting 12 or fewer sales?


Question 4.16 Answers
a. Expected value = np = 500(.03) = 15
b. 32.93%
c. 25.46%
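The answers to b and c are given without their working. One way to reach them, following the six steps from earlier in the chapter, is sketched here; the table areas quoted are standard normal curve table values for the rounded z-scores shown.

```latex
\begin{align*}
\mu &= np = 500(.03) = 15 \\
\sigma &= \sqrt{np(1 - p)} = \sqrt{15(.97)} = \sqrt{14.55} \approx 3.81 \\[6pt]
\text{(b) } x &= 10.5 \text{ and } 14.5 \text{ (continuity correction for 11 to 14)} \\
z_1 &= \frac{10.5 - 15}{3.81} \approx -1.18 \quad (\text{table area } .3810) \\
z_2 &= \frac{14.5 - 15}{3.81} \approx -0.13 \quad (\text{table area } .0517) \\
P &= .3810 - .0517 = .3293 = 32.93\% \\[6pt]
\text{(c) } x &= 12.5 \text{ (continuity correction for 12 or fewer)} \\
z &= \frac{12.5 - 15}{3.81} \approx -0.66 \quad (\text{table area } .2454) \\
P &= .50 - .2454 = .2546 = 25.46\%
\end{align*}
```

In (b) both cutoffs fall below the mean, so the smaller table area is subtracted from the larger one rather than added.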
