
FUNDAMENTAL SAMPLING DISTRIBUTIONS AND DATA DESCRIPTION

20th November
Previous lecture: Sampling, its methods and errors
Today's lecture:
1. Sample mean, sample variance, sample standard deviation
2. Data display and graphical methods:
   Box-and-whisker plot (boxplot)
   Quantile plot
   Detection of deviation from normality
   Normal quantile-quantile plot
   Normal probability plotting
3. Sampling distributions
   Sampling distribution of means
   Central Limit Theorem
   Sampling distribution of the difference between two averages
4. t-distribution
5. F-distribution


WHAT IS A SAMPLING DISTRIBUTION?
Sampling is the process of selecting a number of
observations (subjects) from all the observations (subjects)
in a particular group or population.
A sampling distribution is the frequency distribution of a
statistic over many samples of the same size.
When the statistic is the sample mean, it is called the
sampling distribution of the mean.

FEATURES OF SAMPLING DISTRIBUTION
The four features of a sampling distribution are:

1) The statistic of interest (proportion, SD, or mean)
2) Random selection of the sample
3) Size of the random sample (very important)
4) The characteristics of the population being sampled
1. SAMPLE MEAN, SAMPLE VARIANCE, SAMPLE STANDARD DEVIATION

The sample mean $\bar{X}$ from a group of observations is an estimate
of the population mean $\mu$. Given a sample of size $n$,
consider $n$ independent random variables $X_1, X_2, \ldots, X_n$, each
corresponding to one randomly selected observation. Each of
these variables has the distribution of the population, with
mean $\mu$ and standard deviation $\sigma$. The sample mean is
$\bar{X} = (X_1 + X_2 + \cdots + X_n)/n$.

By the properties of means and variances of random variables,
the mean and variance of the sample mean are the following:

$$E(\bar{X}) = \mu, \qquad \mathrm{Var}(\bar{X}) = \frac{\sigma^2}{n}$$

EXAMPLE
Although the mean of the distribution of $\bar{X}$ is identical to the mean of the population
distribution, the variance is much smaller for large sample sizes.

For example, suppose the random variable X records a randomly selected student's
score on a national test, where the population distribution for the score is normal
with mean 70 and standard deviation 5 (N(70,5)). Given a simple random sample
(SRS) of 200 students, the distribution of the sample mean score has mean 70 and
standard deviation 5/sqrt(200) = 5/14.14 = 0.35.
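
The arithmetic in this example can be checked with a short Python sketch. It computes the theoretical standard deviation of the sample mean and compares it with a simulation. The seed, the number of repetitions, and all variable names are illustrative assumptions, not part of the lecture.

```python
import math
import random
import statistics

# Population parameters from the example: test scores ~ N(70, 5).
mu, sigma, n = 70.0, 5.0, 200

# Theoretical standard deviation of the sample mean: sigma / sqrt(n).
se_theory = sigma / math.sqrt(n)

# Simulation check: draw many samples of size n and look at the spread
# of their means (seed and 2,000 repetitions are arbitrary choices).
rng = random.Random(0)
sample_means = [
    statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
    for _ in range(2000)
]
se_simulated = statistics.stdev(sample_means)

print(f"theoretical SD of sample mean: {se_theory:.3f}")
print(f"simulated SD of sample mean:   {se_simulated:.3f}")
```

The two printed values should agree closely, which illustrates that the spread of the sample mean shrinks by a factor of sqrt(n) relative to the population.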

CHARACTERISTICS OF SAMPLING
DISTRIBUTION

Central Limit Theorem
When random samples of size n are taken from a population, the
distribution of the sample means approaches the normal
distribution as n increases.

As a rule of thumb, when the sample size is 30 or more, the
sampling distribution of the mean is treated as approximately
normal. The size of the random sample is therefore a very
important feature of a sampling distribution.
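
A small simulation can make this behaviour visible. The sketch below draws samples from a strongly right-skewed population (an exponential distribution, chosen here only for illustration) and measures how the skewness of the sample means shrinks as n grows. The helper names, seed, and repetition count are assumptions made for this example.

```python
import random
import statistics

def skewness(values):
    """Sample skewness: the mean cubed z-score (near 0 for a symmetric shape)."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean(((v - m) / s) ** 3 for v in values)

def sample_mean_skewness(n, reps=5000, seed=1):
    """Skewness of `reps` sample means, each computed from n exponential draws."""
    rng = random.Random(seed)
    means = [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(reps)
    ]
    return skewness(means)

# The population itself (n = 1) is heavily skewed; the means for n = 30
# are much closer to symmetric.
for n in (1, 5, 30):
    print(n, round(sample_mean_skewness(n), 2))
```

The skewness does not reach zero at n = 30, so the rule of thumb describes an approximation rather than an exact result.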




CENTRAL LIMIT THEOREM

Given:
1. The random variable x has a distribution (which may
or may not be normal) with mean $\mu$ and standard
deviation $\sigma$.
2. Samples all of the same size n are randomly selected
from the population of x values.

Conclusions:
1. The distribution of sample means $\bar{x}$ will, as the
sample size increases, approach a normal
distribution.
2. The mean of the sample means will be the
population mean $\mu$.
3. The standard deviation of the sample means
will be $\sigma/\sqrt{n}$.
PRACTICE QS.
A1. The mean of the sampling distribution of the mean is the mean of the population
from which the scores were sampled, in this case 14.
A2. The population has a mean of 30 and a standard deviation of 6. The sample size of
your sampling distribution is N=9. What is the variance of the sampling distribution
of the mean? It is the population variance divided by N: 36/9 = 4.
A3. The standard error is the standard deviation of the population divided by the square
root of N. In this case, 12/4 = 3
A4. According to the central limit theorem, regardless of the shape of the parent
population, the sampling distribution of the mean approaches a normal distribution
as N increases. In this case, a sample size of 30 is sufficiently large to cause the
sampling distribution of the mean to look about normal.
A5. This problem is asking about the sampling distribution of the mean: Mean = 75, SD
= 10/sqrt(25) = 10/5 = 2, Skew = about 0 because the central limit theorem states
that the sampling distribution of the mean would be about normal with a large
enough N.
PRACTICAL RULES
COMMONLY USED:
1. For samples of size n larger than 30, the distribution of the
sample means can be approximated reasonably well by a normal
distribution. The approximation gets better as the sample size n
becomes larger.
2. If the original population is itself normally distributed, then the
sample means will be normally distributed for any sample size n
(not just the values of n larger than 30).
NOTATION
the mean of the sample means:

$$\mu_{\bar{x}} = \mu$$

the standard deviation of the sample means
(often called the standard error of the mean):

$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$
[Figure 5-19: Distribution of 200 digits from Social Security numbers (last 4 digits from 50 students)]
[Figure 5-20: Distribution of 50 sample means for 50 students]
As the sample size increases, the
sampling distribution of sample
means approaches a normal
distribution.
EXAMPLE 1

Given that the population of men has normally distributed
weights with a mean of 172 lb and a standard deviation of 29 lb:

a) If one man is randomly selected, find the probability
that his weight is greater than 167 lb.

$$z = \frac{167 - 172}{29} = -0.17$$

$P(x > 167) = P(z > -0.17) = 0.5675$

b) If 12 different men are randomly selected, find the
probability that their mean weight is greater than 167 lb.

$$z = \frac{167 - 172}{29/\sqrt{12}} = -0.60$$

$P(\bar{x} > 167) = P(z > -0.60) = 0.7257$

It is much easier for an individual to deviate from the
mean than it is for a group of 12 to deviate from the mean.
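
Both parts of Example 1 can be reproduced with Python's standard library. This is a minimal sketch with illustrative variable names. Because it does not round z to two decimals, the results differ from the table values 0.5675 and 0.7257 in the third decimal place.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 172.0, 29.0   # population mean and SD of men's weights (lb)
threshold = 167.0

# a) One man: X ~ N(mu, sigma)
p_individual = 1 - NormalDist(mu, sigma).cdf(threshold)

# b) Mean of n = 12 men: the sample mean ~ N(mu, sigma / sqrt(n))
n = 12
p_mean = 1 - NormalDist(mu, sigma / sqrt(n)).cdf(threshold)

print(f"P(x > 167)     = {p_individual:.4f}")
print(f"P(x-bar > 167) = {p_mean:.4f}")
```

The only change between the two parts is the standard deviation passed to `NormalDist`: the population value for one man, and the standard error for the mean of 12 men.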
NON-NORMAL POPULATIONS
What can we say about the shape of the sampling
distribution of $\bar{x}$ when the population from which the
sample is selected is not normal?
[Figure: histogram of baseball salaries. x-axis: Salary ($1,000's); y-axis: Frequency]
THE IMPORTANCE OF THE CENTRAL
LIMIT THEOREM
When we select simple random samples of
size n, the sample means we find will vary
from sample to sample. We can model the
distribution of these sample means with a
probability model that is

$$N\left(\mu, \frac{\sigma}{\sqrt{n}}\right)$$
HOW LARGE SHOULD N BE?
For the purpose of applying the central limit
theorem, we will consider a sample size to be
large when n > 30.
SUMMARY
Population: mean $\mu$; stand. dev. $\sigma$; shape of
population dist. is unknown; value of $\mu$ is
unknown; select random sample of size n.
Sampling distribution of $\bar{x}$:
mean $\mu$; stand. dev. $\sigma/\sqrt{n}$;
always true!
By the Central Limit Theorem:
the shape of the sampling distribution is
approx. normal, that is,
$\bar{x} \sim N(\mu, \sigma/\sqrt{n})$
EXAMPLE
A random sample of n = 64 observations is
drawn from a population with mean $\mu = 15$
and standard deviation $\sigma = 4$.
a. $E(\bar{X}) = \mu = 15$; $SD(\bar{X}) = \frac{SD(X)}{\sqrt{n}} = \frac{4}{8} = 0.5$
b. The shape of the sampling distribution model for
$\bar{x}$ is approx. normal (by the CLT) with
mean $E(\bar{X}) = 15$ and $SD(\bar{X}) = 0.5$. The answer
depends on the sample size since $SD(\bar{X}) = \frac{SD(X)}{\sqrt{n}}$.
GRAPHICALLY

Shape of population
dist. not known
EXAMPLE (CONT.)
c. $\bar{x} = 15.5$;

$$z = \frac{\bar{x} - \mu}{SD(\bar{X})} = \frac{15.5 - 15}{0.5} = \frac{0.5}{0.5} = 1$$

This means that $\bar{x} = 15.5$ is one standard
deviation above the mean $E(\bar{X}) = 15$.
EXAMPLE 2
The probability distribution of 6-month
incomes of account executives has mean
$20,000 and standard deviation $5,000.
a) A single executive's income is $20,000.
Can it be said that this executive's income
exceeds 50% of all account executive
incomes?
ANSWER: No. P(X < $20,000) = ? No information is
given about the shape of the distribution of X; we do
not know the median of 6-month incomes.
EXAMPLE 2 (CONT.)
b) n = 64 account executives are randomly
selected. What is the probability that the
sample mean exceeds $20,500?

ANSWER: E(x) = $20,000, SD(x) = $5,000

$$E(\bar{x}) = 20{,}000, \qquad SD(\bar{x}) = \frac{SD(x)}{\sqrt{n}} = \frac{5{,}000}{\sqrt{64}} = 625$$

By the CLT, $\bar{X} \sim N(20{,}000,\ 625)$

$$P(\bar{X} > 20{,}500) = P\left(\frac{\bar{X} - 20{,}000}{625} > \frac{20{,}500 - 20{,}000}{625}\right) = P(z > 0.8) = 1 - 0.7881 = 0.2119$$
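
The standardization step in part b) can be written out explicitly. The sketch below builds the standard normal CDF from `math.erf` instead of a z-table; the function name `phi` and the variable names are illustrative choices.

```python
from math import erf, sqrt

def phi(z):
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

mu, sigma, n = 20_000.0, 5_000.0, 64
se = sigma / sqrt(n)            # standard error: 5000 / 8 = 625
z = (20_500.0 - mu) / se        # standardized threshold: 500 / 625 = 0.8
p_exceeds = 1.0 - phi(z)        # upper-tail probability

print(f"SE = {se:.0f}, z = {z:.2f}, P(x-bar > 20,500) = {p_exceeds:.4f}")
```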
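EXAMPLE 3
A sample of size n = 16 is drawn from a
normally distributed population with mean
E(x) = 20 and SD(x) = 8.

$$X \sim N(20, 8); \qquad \bar{X} \sim N\left(20, \frac{8}{\sqrt{16}}\right) = N(20, 2)$$

a) $P(\bar{X} > 24) = P\left(\frac{\bar{X} - 20}{2} > \frac{24 - 20}{2}\right) = P(z > 2) = 1 - 0.9772 = 0.0228$

b) $P(16 \le \bar{X} \le 24) = P\left(\frac{16 - 20}{2} \le z \le \frac{24 - 20}{2}\right) = P(-2 \le z \le 2) = 0.9772 - 0.0228 = 0.9544$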
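
An interval probability such as part b) is a difference of two CDF values. A minimal sketch using `statistics.NormalDist`, with illustrative variable names:

```python
from math import sqrt
from statistics import NormalDist

# Sampling distribution of the mean for n = 16: N(20, 8 / sqrt(16)) = N(20, 2)
xbar_dist = NormalDist(mu=20.0, sigma=8.0 / sqrt(16))

p_a = 1 - xbar_dist.cdf(24.0)                    # P(x-bar > 24)
p_b = xbar_dist.cdf(24.0) - xbar_dist.cdf(16.0)  # P(16 <= x-bar <= 24)

print(f"a) {p_a:.4f}")
print(f"b) {p_b:.4f}")
```

Small differences from the table-based answers in the last decimal place come from the rounding of table entries.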
EXAMPLE 3 (CONT.)
c. Do we need the Central Limit Theorem to
solve part a or part b?

NO. We are given that the population is
normal, so the sampling distribution of the
mean will also be normal for any sample size n.
The CLT is not needed.
EXAMPLE 4
Battery life X ~ N(20, 10). Guarantee: avg.
battery life in a case of 24 exceeds 16 hrs.
Find the probability that a randomly
selected case meets the guarantee.

$$E(\bar{x}) = 20; \qquad SD(\bar{x}) = \frac{10}{\sqrt{24}} = 2.04; \qquad \bar{X} \sim N(20, 2.04)$$

$$P(\bar{X} > 16) = P\left(\frac{\bar{X} - 20}{2.04} > \frac{16 - 20}{2.04}\right) = P(z > -1.96) = 1 - 0.0250 = 0.9750$$
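
As a cross-check on the calculation, the guarantee can also be simulated directly: draw many cases of 24 batteries and count how often the case average exceeds 16 hours. This is a Monte Carlo sketch, not the method used in the lecture; the seed and the number of simulated cases are arbitrary assumptions.

```python
import random
import statistics

rng = random.Random(42)            # arbitrary seed, for repeatability
mu, sigma, case_size = 20.0, 10.0, 24
n_cases = 20_000                   # number of simulated cases (arbitrary)

meets = 0
for _ in range(n_cases):
    # Average life of the 24 batteries in one simulated case
    case_mean = statistics.fmean(rng.gauss(mu, sigma) for _ in range(case_size))
    if case_mean > 16.0:
        meets += 1

fraction_meeting = meets / n_cases
print(f"simulated P(x-bar > 16) = {fraction_meeting:.4f}")
```

The simulated fraction should land close to the computed value of 0.9750, within simulation noise.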
EXAMPLE 5
Cans of salmon are supposed to have a net
weight of 6 oz. The canner says that the
net weight is a random variable with
mean $\mu = 6.05$ oz. and stand. dev. $\sigma = .18$
oz.
Suppose you take a random sample of 36
cans and calculate the sample mean
weight to be 5.97 oz.
Find the probability that the mean weight
of the sample is less than or equal to 5.97
oz.
POPULATION X: AMOUNT OF SALMON IN A CAN
E(X) = 6.05 oz., SD(X) = .18 oz.
$\bar{X}$ sampling dist: $E(\bar{x}) = 6.05$, $SD(\bar{x}) = .18/6 = .03$
By the CLT, the $\bar{X}$ sampling dist. is approx. normal.
$P(\bar{X} \le 5.97) = P(z \le [5.97 - 6.05]/.03)$
$= P(z \le -.08/.03) = P(z \le -2.67) = .0038$
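
The same steps in Python, as a minimal sketch with illustrative variable names: compute the standard error, standardize the observed sample mean, and read off the lower-tail probability.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 6.05, 0.18, 36      # canner's claimed mean and SD (oz.), sample size
observed_mean = 5.97

se = sigma / sqrt(n)               # standard error: 0.18 / 6 = 0.03
z = (observed_mean - mu) / se      # about -2.67
p_low = NormalDist().cdf(z)        # P(x-bar <= 5.97) under the canner's claim

print(f"SE = {se:.3f}, z = {z:.2f}, P(x-bar <= 5.97) = {p_low:.4f}")
```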

How could you use this answer?

Suppose you work for a consumer
watchdog group.
If you sampled the weights of 36 cans and
obtained a sample mean $\bar{x} \le 5.97$ oz., what
would you think?
Since $P(\bar{x} \le 5.97) = .0038$, either
  you observed a rare event (recall: 5.97 oz. is
  2.67 stand. dev. below the mean) and the mean
  fill E(x) is in fact 6.05 oz. (the value claimed by
  the canner), or
  the true mean fill is less than 6.05 oz. (the
  canner is lying).

EXAMPLE 6
X: weekly income. E(x) = 600, SD(x) = 100
n = 25; $\bar{X}$ sampling dist: $E(\bar{x}) = 600$,
$SD(\bar{x}) = 100/5 = 20$
$P(\bar{X} \le 550) = P(z \le [550 - 600]/20)$
$= P(z \le -50/20) = P(z \le -2.50) = .0062$

Suspicious of the claim that the average is $600;
the evidence is that the average income is less.

EXAMPLE 7
12% of students at NCSU are left-handed. What
is the probability that in a sample of 50
students, the sample proportion that are
left-handed is less than 11%?

$$E(\hat{p}) = p = 0.12; \qquad SD(\hat{p}) = \sqrt{\frac{0.12 \times 0.88}{50}} = 0.046$$

By the CLT, $\hat{p} \sim N(0.12, 0.046)$

$$P(\hat{p} < 0.11) = P\left(\frac{\hat{p} - 0.12}{0.046} < \frac{0.11 - 0.12}{0.046}\right) = P(z < -0.22) = 0.4129$$
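
The sample-proportion calculation follows the same pattern as the sample-mean examples, with the standard deviation sqrt(p(1-p)/n). A minimal sketch with illustrative variable names; because it keeps full precision instead of rounding the SD to .046 and z to -.22, the result differs from .4129 in the third decimal place.

```python
from math import sqrt
from statistics import NormalDist

p, n = 0.12, 50                    # population proportion and sample size
sd_phat = sqrt(p * (1 - p) / n)    # SD of the sample proportion, about 0.046

# Normal model for the sample proportion: p-hat ~ N(p, sd_phat)
prob_below = NormalDist(p, sd_phat).cdf(0.11)

print(f"SD(p-hat) = {sd_phat:.3f}, P(p-hat < 0.11) = {prob_below:.4f}")
```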
