
Business Statistics

Learning Objectives

Determine when to use sampling instead of a census.
Distinguish between random and nonrandom sampling.
Decide when and how to use various sampling techniques.
Be aware of the different types of errors that can occur in a study.
Understand the impact of the Central Limit Theorem on statistical analysis.
Use the sampling distributions of x̄ and p̂.

Reasons for Sampling


Sampling: a means of gathering useful information about a population.

Information is gathered from the sample, and conclusions are drawn about the population.

Sampling has several advantages over a census:

Sampling can save money.
Sampling can save time.
For given resources, sampling can broaden the scope of the study.
Because the research process is sometimes destructive, sampling can save product.
If accessing the whole population is impossible, sampling is the only option.

Reasons for Taking a Census

Eliminate the possibility that a random sample is not representative of the population.
The person authorizing the study is uncomfortable with sample information.

Sampling Frame

Every research study has a target population that consists of the individuals, institutions, or entities that are the objects of investigation.
The sample is taken from a population list, map, directory, or other source used to represent the population, called the sampling frame.
Sampling is done from the frame, not from the target population.
Ideally, a one-to-one correspondence exists between frame units and population units.
Frames may have overregistration or underregistration.

Random Versus Nonrandom Sampling

Nonrandom sampling (nonprobability sampling): Every unit of the population does not have the same probability of being included in the sample.
Random sampling (probability sampling): Every unit of the population has the same probability of being included in the sample.

Random Sampling Techniques

Simple random sample: the basis for the other random sampling techniques.

Each unit of the frame is numbered from 1 to N.
A random number generator (or a table of random numbers) can be used to select n items from the frame.

Random Sampling Techniques

Stratified Random Sample

Proportionate (the % of the sample taken from each stratum is proportionate to the % that each stratum is within the whole population)
Disproportionate (the % of the sample taken from each stratum is not proportionate to the % that each stratum is within the whole population)

Systematic Random Sample


Cluster (or Area) Sampling

Simple Random Sample: Numbered Population Members

N = 30
n = 6

01 Alaska Airlines
02 Alcoa
03 Ashland
04 Bank of America
05 BellSouth
06 Chevron
07 Citigroup
08 Clorox
09 Delta Air Lines
10 Disney
11 DuPont
12 Exxon Mobil
13 General Dynamics
14 General Electric
15 General Mills
16 Halliburton
17 IBM
18 Kellogg
19 Kmart
20 Lowe's
21 Lucent
22 Mattel
23 Mead
24 Microsoft
25 Occidental Petroleum
26 JCPenney
27 Procter & Gamble
28 Ryder
29 Sears
30 Time Warner

Simple Random Sampling: Random Number Table

N = 30
n = 6

[Table of random digits; the layout is not recoverable from the extracted text.]
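The selection of n = 6 units from a frame numbered 01 to 30 can also be sketched with a pseudo-random number generator in place of a printed table. This is a minimal illustration, not the procedure used on the slide; the function name `simple_random_sample` and the seed value are assumptions made for the example.

```python
import random

def simple_random_sample(frame_numbers, n, seed=None):
    """Pick n distinct unit numbers; every unit is equally likely to be chosen."""
    rng = random.Random(seed)  # the seed only makes this sketch reproducible
    return sorted(rng.sample(list(frame_numbers), n))

# Frame numbered 1..30 (N = 30), sample size n = 6.
# The seed 2024 is an arbitrary, hypothetical choice.
selected = simple_random_sample(range(1, 31), n=6, seed=2024)
print([f"{number:02d}" for number in selected])
```

Sampling without replacement mirrors the table method, where a number that has already been drawn is skipped.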

Stratified Random Sample

Stratified random sampling: the population is divided into non-overlapping subpopulations called strata.

The researcher extracts a simple random sample from each subpopulation.
Stratified random sampling has the potential for reducing sampling error.

Stratified Random Sample

Sampling error: the sample does not represent the population.

Stratified random sampling has the potential to match the sample closely to the population.
Stratified sampling is more costly.
Each stratum should be relatively homogeneous; strata are often formed on variables such as race, gender, or religion.

Stratified Random Sample

Proportionate: the percentage of the sample taken from each stratum is proportionate to the percentage that each stratum is within the population.
Disproportionate: the proportions of the strata within the sample are different from the proportions of the strata within the population.
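Proportionate allocation can be sketched as a short calculation. The stratum names and sizes below are hypothetical and chosen only so the arithmetic is easy to follow; the function name `proportionate_allocation` is also an assumption of this example.

```python
def proportionate_allocation(stratum_sizes, n):
    """Split a total sample of size n across strata in proportion to stratum size."""
    total = sum(stratum_sizes.values())
    # Rounding can make the parts miss n by a unit or two in general;
    # the hypothetical sizes below divide evenly.
    return {name: round(n * size / total) for name, size in stratum_sizes.items()}

# Hypothetical population of 10,000 units in three strata.
strata = {"A": 5000, "B": 3000, "C": 2000}
allocation = proportionate_allocation(strata, n=100)
print(allocation)  # {'A': 50, 'B': 30, 'C': 20}
```

A simple random sample of the allocated size would then be drawn within each stratum. Any allocation that departs from these shares is disproportionate.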

Systematic Sampling

Used because of its convenience and ease of administration.
Population elements are an ordered sequence (at least conceptually).
With systematic sampling, every kth item is selected to produce a sample of size n from a population of size N.

$$k = \frac{N}{n}$$

where:
n = sample size
N = population size
k = size of selection interval

Systematic Sampling

The first sample element is selected at random from the first k elements of the frame.
Thereafter, sample elements are selected at a constant interval, k, from the ordered sequence frame.
Advantages of systematic sampling

The sample is evenly distributed across the frame.
It is easily determined whether the sampling plan has been followed.
Systematic sampling is based on the assumption that the source of the population is random.

Systematic Sampling: Example

Purchase orders for the previous fiscal year are serialized 1 to 10,000 (N = 10,000).
A sample of fifty (n = 50) purchase orders is needed for an audit.
k = 10,000/50 = 200

Systematic Sampling: Example

The first sample element is randomly selected from the first 200 purchase orders. Assume the 45th purchase order was selected.
Sample elements: 45, 245, 445, 645, . . .
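The purchase order example can be reproduced with a few lines of Python. This is a sketch that assumes N is an exact multiple of n; the function name `systematic_sample` is hypothetical.

```python
import random

def systematic_sample(N, n, start=None, seed=None):
    """Select every kth serial number, beginning at a start within the first k."""
    k = N // n  # selection interval; assumes N is an exact multiple of n
    if start is None:
        start = random.Random(seed).randint(1, k)  # random start in 1..k
    return [start + i * k for i in range(n)]

# N = 10,000 purchase orders, n = 50, and the 45th order drawn as the start.
orders = systematic_sample(N=10_000, n=50, start=45)
print(orders[:4])  # [45, 245, 445, 645]
print(orders[-1])  # 9845
```

Leaving `start` unset draws the starting element at random from the first k = 200 orders, as the example describes.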

Cluster Sampling

Cluster sampling involves dividing the population into non-overlapping areas, or clusters.

It identifies clusters that tend to be internally heterogeneous.
Each cluster is a microcosm of the population.

If the clusters are too large, a second set of clusters is taken from each original cluster.

This is two-stage sampling.

Cluster Sampling

Advantages

More convenient for geographically dispersed populations
Reduced travel costs to contact sample elements
Simplified administration of the survey
Unavailability of a sampling frame prohibits using other random sampling methods

Cluster Sampling

Disadvantages

Statistically less efficient when the cluster elements are similar
Costs and problems of statistical analysis are greater than for simple random sampling

Nonrandom Sampling

Nonrandom sampling: sampling techniques used to select elements from the population by any mechanism that does not involve a random selection process.

These techniques are not desirable for use in gathering data to be analyzed by inferential statistics.
Sampling error cannot be determined objectively from these techniques.

Types of Nonrandom Sampling Techniques

Convenience sampling: Elements of the sample are selected for the convenience of the researcher (readily available, nearby, willing to participate).
Judgment sampling: Elements of the sample are chosen by the judgment of the researcher.
Quota sampling: Quotas set the size of the samples to be obtained from subgroups, based on the proportions of the subclasses in the population.
Snowball sampling: Survey subjects are selected based on referral from other survey respondents.

Errors

Data from nonrandom samples are not appropriate for analysis by inferential statistical methods.
Sampling error occurs when the sample is not representative of the population.
Nonsampling errors: all errors other than sampling errors

Missing data, recording, data entry, and analysis errors
Poorly conceived concepts, unclear definitions, and defective questionnaires
Response errors, which occur when people do not know, will not say, or overstate in their answers

Sampling Distribution of the Mean x̄


Proper analysis and interpretation of a sample
statistic requires knowledge of its distribution.

[Figure: Process of inferential statistics. Start with the population (parameter μ), select a random sample, calculate the sample statistic x̄, and use x̄ to estimate μ.]

Sampling Distribution of x̄

The sampling distribution of x̄ is the frequency distribution of sample means, computed after randomly selecting samples of a given size from a population with a particular distribution.
Sample means for samples taken from populations with different distributions appear to be approximately normally distributed, especially as the sample size becomes larger.

Central Limit Theorem

The central limit theorem allows one to study populations with differently shaped distributions.
The central limit theorem creates the potential for applying the normal distribution to many problems when the sample size is sufficiently large.

Central Limit Theorem

An advantage of the central limit theorem is that sample data drawn from populations that are not normally distributed, or from populations of unknown shape, can also be analyzed, because the sample means are approximately normally distributed when the sample size is sufficiently large.

Central Limit Theorem

As sample size increases, the distribution of sample means narrows.
This is due to the standard deviation of the mean, σ/√n.
The standard deviation of the mean decreases as sample size increases.
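The narrowing can be checked with a small simulation. The sketch below is only an illustration: the exponential population (mean 1, standard deviation 1), the seed, the sample sizes, and the number of repetitions are all assumptions chosen for the example, and the function name `std_of_sample_means` is hypothetical.

```python
import random
import statistics
from math import sqrt

def std_of_sample_means(n, repetitions, rng):
    """Draw `repetitions` samples of size n and return the std dev of their means."""
    means = [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(repetitions)
    ]
    return statistics.stdev(means)

# Skewed, non-normal population: exponential with mean 1 and sigma 1 (assumed).
rng = random.Random(12345)  # arbitrary seed, only for reproducibility
results = {n: std_of_sample_means(n, 2000, rng) for n in (5, 30, 100)}
for n, observed in results.items():
    print(f"n={n:3d}  observed={observed:.3f}  sigma/sqrt(n)={1 / sqrt(n):.3f}")
```

The observed standard deviation of the sample means should track σ/√n closely and shrink as n grows, even though the population itself is far from normal.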

Sampling from a Normal Population

The distribution of sample means is normal for any sample size.
If x̄ is the mean of a random sample of size n from a normal population with mean μ and standard deviation σ, the distribution of x̄ is a normal distribution with mean and standard deviation:

$$\mu_{\bar{x}} = \mu, \qquad \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$

Z Formula for Sample Means

$$Z = \frac{\bar{x} - \mu_{\bar{x}}}{\sigma_{\bar{x}}} = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$$

Tire Store Example


Suppose, for example, that the mean expenditure per customer at a tire store is $85.00, with a standard deviation of $9.00. If a random sample of 40 customers is taken, what is the probability that the sample average expenditure per customer for this sample will be $87.00 or more? Because the sample size is greater than 30, the central limit theorem can be used, and the sample means are approximately normally distributed. With μ = $85.00, σ = $9.00, and the z formula for sample means, z is computed as shown:

Solution to Tire Store Example


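Population parameters: μ = 85, σ = 9
Sample size: n = 40

$$
\begin{aligned}
P(\bar{x} \ge 87) &= P\left(Z \ge \frac{87 - \mu_{\bar{x}}}{\sigma_{\bar{x}}}\right) \\
&= P\left(Z \ge \frac{87 - \mu}{\sigma / \sqrt{n}}\right) \\
&= P\left(Z \ge \frac{87 - 85}{9 / \sqrt{40}}\right) \\
&= P(Z \ge 1.41) \\
&= .5 - P(0 \le Z \le 1.41) \\
&= .5 - .4207 \\
&= .0793
\end{aligned}
$$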
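The tire store calculation can be cross-checked numerically. This is a minimal sketch using Python's standard library `statistics.NormalDist` in place of a printed z table; the variable names are assumptions of this example.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 85.0, 9.0, 40      # population mean, std dev, sample size
std_error = sigma / sqrt(n)       # standard deviation of the sample mean
z = (87.0 - mu) / std_error       # z score for a sample mean of $87.00

# Upper-tail probability, first with z rounded to two decimals as in a
# printed z table, then with the unrounded z.
p_table = 1 - NormalDist().cdf(round(z, 2))
p_exact = 1 - NormalDist().cdf(z)

print(round(std_error, 2), round(z, 2))  # 1.42 1.41
print(round(p_table, 4))                 # 0.0793
```

With the unrounded z the probability comes out near .0799; the small gap from .0793 is due only to rounding z to 1.41 before the table lookup.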

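Graphic Solution to Tire Store Example

[Figure: the sampling distribution of x̄, centered at μ = 85 with σ/√n = 9/√40 ≈ 1.42, shown above the standard normal curve. Z = (x̄ - μ)/(σ/√n) = (87 - 85)/(9/√40) = 2/1.42 = 1.41. Each curve has area .5000 on either side of its center and .4207 between the center and the cutoff; the tail areas to the right of x̄ = 87 and of Z = 1.41 are equal, each .0793.]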

Demonstration Problem
Suppose that during any hour in a large department store, the average number of shoppers is 448, with a standard deviation of 21 shoppers. What is the probability that a random sample of 49 different shopping hours will yield a sample mean between 441 and 446 shoppers?

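Graphic Solution for Demonstration Problem

$$Z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \frac{441 - 448}{21 / \sqrt{49}} = -2.33 \qquad Z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \frac{446 - 448}{21 / \sqrt{49}} = -0.67$$

$$P(441 \le \bar{x} \le 446) = .4901 - .2486 = .2415$$

[Figure: the sampling distribution of x̄, centered at μ = 448 with σ/√n = 21/√49 = 3, shown above the standard normal curve. The area between 441 and 448 (Z from -2.33 to 0) is .4901, the area between 446 and 448 (Z from -.67 to 0) is .2486, and the area between 441 and 446 is .2415.]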
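The same probability can be computed directly from the sampling distribution of x̄. This is a sketch using the standard library `statistics.NormalDist`; the variable names are assumptions of this example.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 448.0, 21.0, 49    # shoppers per hour, std dev, hours sampled
std_error = sigma / sqrt(n)       # 21 / 7 = 3.0

# Sampling distribution of the sample mean: normal(mu, sigma / sqrt(n)).
sampling_dist = NormalDist(mu, std_error)
p_between = sampling_dist.cdf(446) - sampling_dist.cdf(441)

print(std_error)                  # 3.0
print(round(p_between, 3))
```

Because the z values are not rounded to two decimals here, the result is roughly .243 rather than the table-based .2415.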

Z Formula for Sample Means of a Finite Population

$$Z = \frac{\bar{x} - \mu}{\dfrac{\sigma}{\sqrt{n}} \sqrt{\dfrac{N - n}{N - 1}}}$$

where $\sqrt{\frac{N - n}{N - 1}}$ is called the finite correction factor.
When the sample size is less than 5% of the finite population size (n/N < .05), the finite correction factor does not significantly modify the solution.
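The effect of the finite correction factor can be seen by evaluating it for a small and a large sampling fraction. The population sizes below are hypothetical and are not taken from the slides; the function names are also assumptions of this sketch.

```python
from math import sqrt

def finite_correction_factor(N, n):
    """sqrt((N - n) / (N - 1)) for a sample of size n from a population of size N."""
    return sqrt((N - n) / (N - 1))

def z_finite(x_bar, mu, sigma, n, N):
    """Z for a sample mean when sampling from a finite population of size N."""
    return (x_bar - mu) / ((sigma / sqrt(n)) * finite_correction_factor(N, n))

# Hypothetical population sizes for a sample of n = 40.
small_fraction = finite_correction_factor(N=10_000, n=40)  # n/N = 0.4%
large_fraction = finite_correction_factor(N=200, n=40)     # n/N = 20%
print(round(small_fraction, 4), round(large_fraction, 4))
```

When n/N is well under .05 the factor is very close to 1, so Z is almost unchanged. At n/N = .20 the factor is about 0.90, which increases Z noticeably.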

Sampling Distribution of p̂

Sample Proportion

$$\hat{p} = \frac{X}{n}$$

where:
X = number of items in a sample that possess the characteristic
n = number of items in the sample

Sampling Distribution
Approximately normal if nP > 5 and nQ > 5 (P is the population proportion and Q = 1 - P).
The mean of the distribution is P.
The standard deviation of the distribution is

$$\sigma_{\hat{p}} = \sqrt{\frac{P \cdot Q}{n}}$$

Sampling Distribution of p̂

p̂ is a sample proportion.
Whereas the mean is computed by averaging a set of values, the sample proportion is computed by dividing the frequency with which a given characteristic occurs in a sample by the number of items in the sample (p̂ = X/n).

Z Formula for Sample Proportions

$$Z = \frac{\hat{p} - P}{\sqrt{\dfrac{P \cdot Q}{n}}}$$

where:
p̂ = sample proportion
n = sample size
P = population proportion
Q = 1 - P
n · P > 5
n · Q > 5

Demonstration Problem
If 10% of a population of parts is defective, what is the probability of randomly selecting 80 parts and finding that 12 or more parts are defective?

Solution for Demonstration Problem


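Population parameters
P = 0.10
Q = 1 - P = 1 - .10 = .90
Sample
n = 80
X = 12

$$\hat{p} = \frac{X}{n} = \frac{12}{80} = 0.15$$

$$
\begin{aligned}
P(\hat{p} \ge .15) &= P\left(Z \ge \frac{.15 - P}{\sqrt{\dfrac{P \cdot Q}{n}}}\right) \\
&= P\left(Z \ge \frac{.15 - .10}{\sqrt{\dfrac{(.10)(.90)}{80}}}\right) \\
&= P\left(Z \ge \frac{0.05}{0.0335}\right) \\
&= P(Z \ge 1.49) \\
&= .5 - P(0 \le Z \le 1.49) \\
&= .5 - .4319 \\
&= .0681
\end{aligned}
$$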
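The defective parts calculation can be cross-checked in a few lines. This sketch uses the standard library `statistics.NormalDist` rather than a z table; the variable names are assumptions of this example.

```python
from math import sqrt
from statistics import NormalDist

P, n, X = 0.10, 80, 12            # population proportion, sample size, defectives
Q = 1 - P

# Normal approximation condition: nP > 5 and nQ > 5.
if not (n * P > 5 and n * Q > 5):
    raise ValueError("normal approximation is not appropriate")

p_hat = X / n                     # sample proportion, 0.15
std_error = sqrt(P * Q / n)       # standard deviation of p-hat
z = (p_hat - P) / std_error
prob = 1 - NormalDist().cdf(z)    # P(p-hat >= .15)

print(round(std_error, 4), round(z, 2))  # 0.0335 1.49
print(round(prob, 3))                    # 0.068
```

The result agrees with the table-based .0681 to three decimals.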

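Graphic Solution for Demonstration Problem

[Figure: the sampling distribution of p̂, centered at P = 0.10 with standard deviation √(PQ/n) = 0.0335, shown above the standard normal curve. Z = (p̂ - P)/√(PQ/n) = (0.15 - 0.10)/√((.10)(.90)/80) = 0.05/0.0335 = 1.49. Each curve has area .5000 on either side of its center and .4319 between the center and the cutoff; the tail areas to the right of p̂ = 0.15 and of Z = 1.49 are each .0681.]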
