Business Statistics II
SEMESTER II
By K. Sreedhara Babu
sreedhar.students@yahoo.com
What we have seen so far
Focus for this Semester
Internal Assessment (60 Marks)
Class Test: 30 Marks
Project work: 10 Marks
Lab Examination: 10 Marks
Assignments: 10 Marks
SESSION 1
Topics for this visit
Hypothesis Testing Recap
Proportional problems
1 sample and 2 sample
Hypothesis Testing
Inferential Decision Making
Population: use parameters to summarize features
Sample: use statistics to summarize features
Inference on the population from the sample
Examples
Inferential Decision Algorithm
Alternative Hypothesis
One-sided test: the claim is tested in a specific direction
Two-sided test: the value differs from the claim, in either direction
Inferential Decision Algorithm
Formulating Hypotheses
Null Hypothesis
Alternative Hypothesis
Selecting a confidence level
Decision maker's concern
How much risk the decision maker is willing to tolerate
Industrial decision making: 95% or 99%
At 99%, about 99 of 100 decisions are correct and 1 is wrong, so we are more confident
95% in organizations, 99% in legal reviews
So the level is P = .05 or P = .01
Level of Significance
Level of Risk
Risk of failing to reject a hypothesis when it is FALSE
Risk of rejecting a hypothesis when it is TRUE
Types of errors
Type 1: rejecting a null hypothesis that is true
Type 2: failing to reject a null hypothesis that is false
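The link between the level of significance and the risk of a Type 1 error can be illustrated with a small simulation. This is only a sketch under stated assumptions: the function name type1_error_rate, the two-sided z test with critical value 1.96, the sample size of 25 and the number of trials are all choices made for this illustration and do not come from the slides.

```python
import random
import statistics

def type1_error_rate(n_trials=4000, sample_size=25, z_crit=1.96, seed=1):
    """Fraction of true null hypotheses rejected by a two-sided z test."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_trials):
        # H0 is true here: the data really come from a mean-0, S.D.-1 population
        sample = [rng.gauss(0.0, 1.0) for _ in range(sample_size)]
        # z statistic for H0: mean = 0 with known S.D. = 1
        z = statistics.mean(sample) * sample_size ** 0.5
        if abs(z) > z_crit:
            rejections += 1  # rejecting a true H0 is a Type 1 error
    return rejections / n_trials

rate = type1_error_rate()
print(rate)  # should land near the chosen significance level of 0.05
```

When the null hypothesis is true, the share of wrong rejections stays close to the chosen level, which is why .05 or .01 is read as the risk the decision maker accepts.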
Inferential Decision Algorithm
Formulating Hypotheses
Selecting a confidence level
Selecting a decision making tool (important)
Sample size & frame size
If the distribution is known: parametric tool
If the distribution is not known: non-parametric tool
Sample size may be a factor
One-sample
Two-sample
Decision Making Tool
Rejection Region
Critical value vs. test statistic
Test statistic: mean values
Inferential Problems
Proportion Problems
Variance Problems
Mean Problems
The Proportion Problems
"So many out of so many"
Examples
3 out of 5 doctors prescribe a certain aspirin
So many defective parts per shift
So many votes cast for Socialists, Democrats, etc., out of many
The chi-squared distribution is well suited to solving these
Chi-squared Distribution
A variable is said to have a chi-square distribution if its distribution has the shape of a chi-square curve
It results when independent variables are
Normally distributed (standard normal)
Squared
Summed
It is denoted by the symbol χ²
Chi-squared Distribution
$f(\chi^2) = c\,(\chi^2)^{(v/2) - 1}\, e^{-\chi^2/2}$
e = 2.71828
v = number of degrees of freedom
c = a constant depending on v
MINITAB Environment
Normally distributed sequence
Mean: 0 and S.D.: 1
Properties of χ² Curves
Total area under a χ²-curve equals 1.
The variable has a lower limit of 0 and extends in the positive direction.
It is a continuous probability distribution.
It has only one parameter, v.
Its shape depends on v.
When v is small, the distribution is skewed.
When v is large, the distribution approaches the normal distribution.
Mean = v
Variance = 2v
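The construction described earlier (independent normal values with mean 0 and S.D. 1, squared and summed) and the mean and variance listed above can be checked with a short simulation. This is a minimal sketch in Python rather than MINITAB; the function name simulate_chi_square, the choice v = 3 and the number of draws are assumptions made for the illustration.

```python
import random
import statistics

def simulate_chi_square(v, n_draws, seed=0):
    """Draw n_draws values, each the sum of v squared standard normal values."""
    rng = random.Random(seed)
    return [sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(v))
            for _ in range(n_draws)]

draws = simulate_chi_square(v=3, n_draws=20000)
sample_mean = statistics.mean(draws)
sample_var = statistics.variance(draws)

print(round(sample_mean, 2))  # should be close to v = 3
print(round(sample_var, 2))   # should be close to 2v = 6
print(min(draws) >= 0)        # sums of squares are never negative
```

Increasing v in the call makes a histogram of the draws look less skewed, in line with the shape properties listed above.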
Chi-square distribution
Critical value approach
P value approach
Chi-square characterization factors
Degrees of freedom
Confidence level
We have to find the critical values based on these two
Rejection Region
Finding Critical Value Example
What is the critical χ² value if df = 2 and α = 0.05?
χ² Table (Portion): Upper Tail Area
DF   .995    .95     .05
1    ...     0.004   3.841
2    0.010   0.103   5.991
α = .05
df = k - 1 = 2
Critical value: χ² = 5.991; reject H0 when the test statistic is greater than 5.991
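The table lookup above can be cross-checked numerically. The sketch below is only an illustration and is not part of the course material, which relies on printed tables and MINITAB. The helper names chi2_sf and chi2_critical are made up here; the upper-tail area is computed with a standard series for the incomplete gamma function, and the critical value is then located by bisection.

```python
import math

def chi2_sf(x, df):
    """Upper-tail area P(X > x) for a chi-square variable with df degrees of freedom."""
    if x <= 0:
        return 1.0
    a, t = df / 2.0, x / 2.0
    # Series for the regularized lower incomplete gamma function P(a, t)
    term = 1.0 / a
    total = term
    n = 0
    while abs(term) > 1e-15 * abs(total):
        n += 1
        term *= t / (a + n)
        total += term
    lower = total * math.exp(-t + a * math.log(t) - math.lgamma(a))
    return 1.0 - lower

def chi2_critical(alpha, df):
    """Value whose upper-tail area equals alpha, found by bisection."""
    lo, hi = 0.0, 1.0
    while chi2_sf(hi, df) > alpha:   # widen the bracket until it contains the answer
        hi *= 2.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if chi2_sf(mid, df) > alpha:
            lo = mid                 # tail area still too large, move right
        else:
            hi = mid
    return (lo + hi) / 2.0

print(round(chi2_critical(0.05, 2), 3))  # table entry: 5.991
print(round(chi2_critical(0.05, 1), 3))  # table entry: 3.841
```

For df = 2 the upper-tail area reduces to exp(-x/2), so the critical value is -2 ln(0.05), which agrees with the table entry 5.991.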
χ²-value
Chi-square Test
A test on qualitative variable(s)
Also known as Pearson's chi-square test
Evaluates how well the observed frequencies are described by the expected frequencies
Qualitative / categorical data
A single observation is a word: a class or category
Hypothesis Test on Qualitative data
Qualitative data
Proportion, 1 pop.: Z test
Proportion, 2 pop.: Z test
Proportion, 2 or more pop.: χ² test
Independence: χ² test
Goodness of Fit Test
Categorical Data Analysis
Independent / explanatory variable: categorical
Dependent / response variable: categorical
Notation to obtain test statistic
Rows represent the explanatory variable (r levels)
Columns represent the response variable (c levels)
        1     2     ...   c     Total
1       n11   n12   ...   n1c   n1.
2       n21   n22   ...   n2c   n2.
...
r       nr1   nr2   ...   nrc   nr.
Total   n.1   n.2   ...   n.c   n..
Special Cases
2x2 (each variable has 2 levels) Contingency Table
Contingency Tables
Paper distribution
The Hindu (HIN)
Hindustan Times (HT)
Indian Express (IEX)
Business Times (BT)
Times of India (TOI)
Assessment
Paper   Acc    Inac   Tot
HIN     168    73     241
HT      230    73     303
IEX     254    53     307
BT      379    58     437
TOI     652    124    776
Tot     1683   381    2064
Another Example
         Daughter   Son   Total
Father   30         20    50
Mother   20         30    50
Total    50         50    100
Is there any relation between the affections mentioned?
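One way to approach the question is the chi-square test of independence. The slides only name this test, so the following is a sketch under the usual textbook assumptions: each expected count is (row total x column total) / grand total, and df = (rows - 1)(columns - 1). The function name independence_statistic is made up for the illustration.

```python
def independence_statistic(table):
    """Chi-square statistic and degrees of freedom for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # expected count if rows and columns were unrelated
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df

# Rows: Father, Mother. Columns: Daughter, Son.
table = [[30, 20],
         [20, 30]]
stat, df = independence_statistic(table)
print(stat, df)  # 4.0 1
```

With df = 1 the critical value at the .05 level is 3.841, so if that level is assumed, a statistic of 4.0 falls in the rejection region and suggests the two variables are related.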
Chi-square Tests
Goodness of Fit: one variable
Independence: two variables
Goodness of Fit Test
One-way chi-square test
Take a sample frequency distribution
Relative frequencies observed in the sample
Relative frequencies hypothesized to be true in the population
How well do the two agree?
Consider a case
Goodness of Fit
Rats' behavior is random
Results obtained after conducting an experiment
A
B
C
D
Solution
STEP 3 Computing Critical Value
$\alpha = .05$
$df = 3$
$\chi^2_{crit} = 7.815$
STEP 3 Computing Test Statistic
$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$
$f_o$ = Observed Frequency
$f_e$ = Estimated (expected) Frequency
Consider a case
You may have heard, "Stay with your first answer on a multiple-choice
test." Is changing answers more likely to be helpful or harmful? To
examine this, Best (1979) studied the responses of 261 students in an
introductory psychology course. He recorded the number of
right-to-wrong, wrong-to-right, and wrong-to-wrong answer changes for
each student. More wrong-to-right changes than right-to-wrong changes
were made by 195 of the students, who were thus helped by changing
answers; 27 students made more right-to-wrong changes than
wrong-to-right changes and thus hurt themselves. Using a .05 level of
significance, test the hypothesis that the proportions of right-to-wrong
and wrong-to-right changes are equal.
Solution
STEP 1 Hypothesis statement
$H_0: P_{\text{right to wrong}} = .50,\ P_{\text{wrong to right}} = .50$
$H_A: H_0$ is false.
STEP 2 Confidence Interval Estimation
STEP 3 Computing Test Statistics
Observed Frequency
The obtained frequency for each category
            right-to-wrong   wrong-to-right
Observed    27               195
Expected Frequency
The hypothesized frequency for each category, given that the null
hypothesis is true
Expected proportion multiplied by number of observations
            right-to-wrong   wrong-to-right
Expected    .5*222 = 111     .5*222 = 111
Goodness of Fit
Calculate the test statistic.
            right-to-wrong   wrong-to-right
Observed    27               195
Expected    111              111
$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e} = \frac{(27 - 111)^2}{111} + \frac{(195 - 111)^2}{111} = \frac{7056}{111} + \frac{7056}{111} = 63.57 + 63.57 = 127.14$
Goodness of Fit
STEP 4 Decision situation
Reject H0, since 127.14 > 3.84 (the critical value for df = 1 at the .05 level)
Interpret your results
The proportions of right-to-wrong changes and wrong-to-right changes are not equal
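The arithmetic of this example can be reproduced in a few lines. This is a minimal sketch; the function name chi_square_statistic is made up for the illustration, and the numbers are the observed counts (27 and 195) and the hypothesized proportion (.50) from the case above.

```python
def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

observed = [27, 195]                 # right-to-wrong, wrong-to-right
n = sum(observed)                    # 222 students with a clear direction
expected = [0.5 * n, 0.5 * n]        # equal proportions under H0
stat = chi_square_statistic(observed, expected)

print(expected)        # [111.0, 111.0]
print(round(stat, 2))  # 127.14
print(stat > 3.84)     # True, so H0 is rejected at the .05 level
```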
Chi-square distribution
P value approach
P-Value
[Figure: χ² curve; the P-value is the area to the right of the observed value χ² = 4.219]
Consider a case
In 1860, Gregor Mendel's experiment: the start of the modern study of heredity
Texture of pea seed: smooth / wrinkled
SW, WW, WS, SS: possibilities like coin tosses
H0: wrinkled 0.25, smooth 0.75
HA: H0 is false
So use the chi-square goodness-of-fit test
Total: 7324 = 1850 + 5474
As 3.841 is greater than the test statistic, we fail to reject the null hypothesis
As the P-value is greater than α, we fail to reject the null hypothesis
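Both conclusions can be reproduced numerically. The sketch below rests on one assumption that the slide does not spell out: of the 7324 seeds, 5474 are taken as smooth and 1850 as wrinkled, which matches Mendel's published counts. Because there are two categories, df = 1, and for df = 1 the upper-tail area of the chi-square curve can be written with the complementary error function.

```python
import math

# Assumed assignment of the two counts (not stated explicitly on the slide)
observed = {"smooth": 5474, "wrinkled": 1850}
hypothesized = {"smooth": 0.75, "wrinkled": 0.25}   # H0 proportions

n = sum(observed.values())                           # 7324 seeds
stat = sum((observed[k] - n * hypothesized[k]) ** 2 / (n * hypothesized[k])
           for k in observed)

# For df = 1, P(chi-square > x) = erfc(sqrt(x / 2))
p_value = math.erfc(math.sqrt(stat / 2.0))

print(round(stat, 3))      # test statistic, well below 3.841
print(round(p_value, 2))   # P-value, well above 0.05
print(stat < 3.841, p_value > 0.05)
```

The critical value approach and the P-value approach give the same decision here: the data are consistent with the 3:1 ratio in H0.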
A Special Case
Crimes in India
The CBI compiles data on crime and crime rates and
publishes the information in Crime in India. A
violent crime is classified by the CBI as murder,
forcible rape, robbery, or aggravated assault.
Crimes in India (1995)
Type of violent crime   Relative frequency
Murder                  0.012
Forcible rape           0.054
Robbery                 0.323
Agg. assault            0.611
Total                   1.000
Crimes in India (2006)
Type of violent crime   Frequency
Murder                  9
Forcible rape           26
Robbery                 144
Agg. assault            321
Total                   500
Crimes in India
Do the data provide sufficient evidence to conclude
that the 2006 distribution of violent crimes has
changed from the 1995 distribution?
Solution
Type of violent crime   Relative frequency p   Expected frequency np = E
Murder                  0.012                  (500)(0.012) = 6.0
Forcible rape           0.054                  (500)(0.054) = 27.0
Robbery                 0.323                  (500)(0.323) = 161.5
Agg. assault            0.611                  (500)(0.611) = 305.5
Type of violent crime   Observed O   Expected E   Difference O - E   (O - E)²   (O - E)²/E
Murder                  9            6.0          3.0                9.00       1.500
Forcible rape           26           27.0         -1.0               1.00       0.037
Robbery                 144          161.5        -17.5              306.25     1.896
Agg. assault            321          305.5        15.5               240.25     0.786
Total                   500          500.0        0.0
Σ (O - E)²/E = 4.219
χ² Goodness-of-Fit Test
(Critical-Value Approach)
Assumptions
1. All expected frequencies are 1 or greater.
2. At most 20% of the expected frequencies are less than 5.
Step 1 The null and alternative hypotheses are:
H0: the variable under consideration has the specified distribution.
Ha: the variable under consideration does not have the specified distribution.
Step 2 Use E = np to calculate the expected frequency for each possible value of the variable under consideration (n = sample size; p = relative frequency).
Step 3 Determine whether the expected frequencies satisfy the assumptions. If not, do not use the procedure.
Step 4 Decide on the significance level, α.
Step 5 Compute the test statistic, χ² = Σ (O - E)²/E.
Step 6 Use the table to find the critical value, χ²α, with df = k - 1, where k is the number of possible values for the variable under consideration.
Step 7 If the value of the test statistic falls in the rejection region, reject H0; otherwise, do not reject H0.
Step 8 Interpret the results of the hypothesis test.
χ² Goodness-of-Fit Test
(P-Value Approach)
Assumptions
1. All expected frequencies are 1 or greater.
2. At most 20% of the expected frequencies are less than 5.
Step 1 The null and alternative hypotheses are:
H0: the variable under consideration has the specified distribution.
Ha: the variable under consideration does not have the specified distribution.
Step 2 Use E = np to calculate the expected frequency for each possible value of the variable under consideration (n = sample size; p = relative frequency).
Step 3 Determine whether the expected frequencies satisfy the assumptions. If not, do not use the procedure.
Step 4 Decide on the significance level, α.
Step 5 Compute the test statistic, χ² = Σ (O - E)²/E.
Step 6 With df = k - 1, where k is the number of possible values for the variable under consideration, obtain the P-value from a software package.
Step 7 If P < α, reject H0; otherwise, do not reject H0.
Step 8 Interpret the results of the hypothesis test.
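The steps of the P-value approach can be sketched as one function. This is an illustration under stated assumptions, not the software package the slides refer to: the names goodness_of_fit and chi2_sf are made up here, the P-value comes from a standard series for the incomplete gamma function, and the usage example applies the procedure to the violent crime data with an assumed significance level of 0.05.

```python
import math

def chi2_sf(x, df):
    """Upper-tail area P(X > x) for a chi-square variable with df degrees of freedom."""
    if x <= 0:
        return 1.0
    a, t = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 0
    while abs(term) > 1e-15 * abs(total):
        n += 1
        term *= t / (a + n)
        total += term
    return 1.0 - total * math.exp(-t + a * math.log(t) - math.lgamma(a))

def goodness_of_fit(observed, proportions, alpha=0.05):
    n = sum(observed)
    expected = [n * p for p in proportions]            # Step 2: E = np
    # Step 3: all E >= 1, and at most 20% of the E values below 5
    small = sum(1 for e in expected if e < 5)
    if min(expected) < 1 or small > 0.20 * len(expected):
        raise ValueError("expected frequencies do not satisfy the assumptions")
    # Step 5: test statistic
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1                             # Step 6: df = k - 1
    p_value = chi2_sf(stat, df)
    # Step 7: reject H0 when P < alpha
    return {"statistic": stat, "df": df, "p_value": p_value,
            "reject_h0": p_value < alpha}

# Violent crime data: 2006 counts against the 1995 relative frequencies
result = goodness_of_fit([9, 26, 144, 321], [0.012, 0.054, 0.323, 0.611])
print(round(result["statistic"], 2), result["df"])
print(round(result["p_value"], 2), result["reject_h0"])
```

For these data the P-value is larger than 0.05, so under the assumed significance level the sample does not provide sufficient evidence that the distribution has changed.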
Learning by problem solving
The American Automobile Manufacturers Association compiles
data on U.S. car sales by type of car. The following is the 1990
distribution:
Type of car   Small   Midsize   Large   Luxury
Percentage    32.8    44.8      9.4     13
A random sample of last year's car sales yielded the following data:
Type of car   Small   Midsize   Large   Luxury
Frequency     133     249       47      71
Car Type   O     p       E = np   (O - E)²/E
Small      133   0.328   164      5.860
Midsize    249   0.448   224      2.790
Large      47    0.094   47       0.000
Luxury     71    0.130   65       0.554
Total      500           500      9.204
Learning by problem solving
A children's raincoat manufacturer wants to know whether customers
prefer any specific color over other colors in children's raincoats. He
selects a random sample of 50 raincoats sold and notes the colors:
Color      Yellow   Red   Green   Blue
No. Sold   17       13    8       12
At α = 0.10, is there a color preference for the raincoats?
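One possible way to work this problem is sketched below. It assumes that "no color preference" means each of the four colors is expected equally often (12.5 of the 50 raincoats), and it uses 6.251, the standard table value for df = 3 with an upper-tail area of 0.10, as the critical value. The variable names are made up for the illustration.

```python
sold = {"Yellow": 17, "Red": 13, "Green": 8, "Blue": 12}

n = sum(sold.values())            # 50 raincoats
expected = n / len(sold)          # 12.5 per color if there is no preference
stat = sum((count - expected) ** 2 / expected for count in sold.values())

CRITICAL_10_DF3 = 6.251           # table value for alpha = 0.10, df = 4 - 1 = 3
reject_h0 = stat > CRITICAL_10_DF3

print(round(stat, 2), reject_h0)  # 3.28 False
```

Since 3.28 is below 6.251, H0 is not rejected under these assumptions: the sample does not show a color preference at the 0.10 level.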
Thank you all
