by: Sudheer pai

Session 11
2. In a sample of 8 observations, the sum of squared deviations of items from the
mean was 84.4. In another sample of 10 observations, the value was found to be 102.6.
Test whether the difference is significant at the 5% level. You are given that at the
5% level, the critical value of F for $\nu_1 = 7$ and $\nu_2 = 9$ degrees of freedom is
3.29, and for $\nu_1 = 8$ and $\nu_2 = 10$ degrees of freedom its value is 3.07.
Solutions: Let us take the hypothesis that the difference in the variances of the two
samples is not significant. We are given

$$n_1 = 8,\quad \sum (X_1 - \bar{X}_1)^2 = 84.4; \qquad n_2 = 10,\quad \sum (X_2 - \bar{X}_2)^2 = 102.6$$

$$F = \frac{S_1^2}{S_2^2}$$

$$S_1^2 = \frac{\sum (X_1 - \bar{X}_1)^2}{n_1 - 1} = \frac{84.4}{7} = 12.06$$

$$S_2^2 = \frac{\sum (X_2 - \bar{X}_2)^2}{n_2 - 1} = \frac{102.6}{9} = 11.4$$

$$F = \frac{12.06}{11.4} = 1.06$$

For $\nu_1 = 7$ and $\nu_2 = 9$, $F_{0.05} = 3.29$.
The calculated value of F is less than the table value. Hence we accept the hypothesis
and conclude that the difference in the variances of the two samples is not significant
at the 5% level.
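
The arithmetic of this variance-ratio test can be checked with a short Python sketch. The function name `variance_ratio` and the convention of placing the larger estimate in the numerator are assumptions made here for illustration; the critical value 3.29 is simply the table value quoted in the problem.

```python
def variance_ratio(ss1, n1, ss2, n2):
    """Return (F, df_num, df_den) from two sums of squared deviations."""
    s1_sq = ss1 / (n1 - 1)  # variance estimate of the first sample
    s2_sq = ss2 / (n2 - 1)  # variance estimate of the second sample
    # Assumption: the larger estimate goes in the numerator, so F >= 1.
    if s1_sq >= s2_sq:
        return s1_sq / s2_sq, n1 - 1, n2 - 1
    return s2_sq / s1_sq, n2 - 1, n1 - 1


f_value, df_num, df_den = variance_ratio(84.4, 8, 102.6, 10)
critical = 3.29  # table value of F for (7, 9) degrees of freedom at the 5% level
print(round(f_value, 2), df_num, df_den)
print("significant" if f_value > critical else "not significant")
```

With the figures of the problem the sketch gives F of about 1.06 on (7, 9) degrees of freedom, which is below 3.29.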
ANALYSIS OF VARIANCE
The analysis of variance, frequently referred to by the contraction ANOVA, is a
statistical technique specially designed to test whether the means of more than two
quantitative populations are equal.
Problems:
The three samples below have been obtained from normal populations with equal
variance. Test the hypothesis that the sample means are equal:

    Sample 1   Sample 2   Sample 3
        8          7         12
       10          5          9
        7         10         13
       14          9         12
       11          9         14

The table value of F at the 5% level of significance for $\nu_1 = 2$ and $\nu_2 = 12$ is 3.88.
Solutions:

              X1     X2     X3
               8      7     12
              10      5      9
               7     10     13
              14      9     12
              11      9     14
    Total     50     40     60
    Mean      10      8     12

The grand mean is

$$\bar{\bar{X}} = \frac{10 + 8 + 12}{3} = 10$$
VARIANCE BETWEEN SAMPLES

             $(\bar{X}_1 - \bar{\bar{X}})^2$   $(\bar{X}_2 - \bar{\bar{X}})^2$   $(\bar{X}_3 - \bar{\bar{X}})^2$
                    0                 4                 4
                    0                 4                 4
                    0                 4                 4
                    0                 4                 4
                    0                 4                 4
    Total           0                20                20

Sum of squares between samples = 0 + 20 + 20 = 40
VARIANCE WITHIN SAMPLES

              X1   $(X_1 - \bar{X}_1)^2$    X2   $(X_2 - \bar{X}_2)^2$    X3   $(X_3 - \bar{X}_3)^2$
               8        4          7        1         12        0
              10        0          5        9          9        9
               7        9         10        4         13        1
              14       16          9        1         12        0
              11        1          9        1         14        4
    Total              30                  16                  14

Sum of squares within samples = 30 + 16 + 14 = 60
ANOVA TABLE

    Source of variation   Sum of squares   Degrees of freedom   Mean square
    Between samples             40                  2                20
    Within samples              60                 12                 5
    Total                      100                 14

$$F = \frac{20}{5} = 4$$

The calculated value of F (4) is greater than the table value (3.88). The hypothesis is
rejected. Hence there is a significant difference in the sample means.
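
The one-way computation above can be sketched in Python as follows. The function name `one_way_anova` is an assumption introduced for illustration; the sketch takes the grand mean over all observations, which coincides with the mean of the sample means when the samples are of equal size, as they are here.

```python
def one_way_anova(samples):
    """Return (SS between, SS within, df between, df within, F)."""
    all_values = [x for sample in samples for x in sample]
    grand_mean = sum(all_values) / len(all_values)
    means = [sum(sample) / len(sample) for sample in samples]
    # Between samples: each sample mean against the grand mean, weighted by size.
    ss_between = sum(len(s) * (m - grand_mean) ** 2 for s, m in zip(samples, means))
    # Within samples: each observation against its own sample mean.
    ss_within = sum((x - m) ** 2 for s, m in zip(samples, means) for x in s)
    df_between = len(samples) - 1
    df_within = len(all_values) - len(samples)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, df_between, df_within, f_ratio


result = one_way_anova([[8, 10, 7, 14, 11], [7, 5, 10, 9, 9], [12, 9, 13, 12, 14]])
print(result)
```

For the three samples of the problem this reproduces the sums of squares 40 and 60, the degrees of freedom 2 and 12, and F = 4.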
ANALYSIS OF VARIANCE IN TWO-WAY CLASSIFICATION MODEL
In the one-factor analysis of variance explained above, the treatments constitute
different levels of a single factor which is controlled in the experiment. There are,
however, many situations in which the response variable of interest may be affected by
more than one factor. For example, sales of Maxfactor Cosmetics, in addition to being
affected by the point-of-sale display, might also be affected by the price charged, the
size and/or location of the store, or the number of competitive products sold by the
store. Similarly, petrol mileage may be affected by the type of car driven, the way it is
driven, road conditions and other factors in addition to the brand of petrol used.
When it is believed that two independent factors might have an effect on the response
variable of interest, it is possible to design the test so that an analysis of variance can
be used to test for the effects of the two factors simultaneously. Such a test is called a
two-factor analysis of variance. With two-factor analysis of variance, we can test two
sets of hypotheses with the same data at the same time.
In a two-way classification the data are classified according to two different criteria
or factors. The procedure for analysis of variance is somewhat different from the one
followed while dealing with problems of one-way classification. In a two-way
classification the analysis of variance table takes the following form.
    Source of variation         Sum of squares   Degrees of freedom   Mean sum of squares       Ratio of F
    Between samples (columns)        SSC              (c-1)           MSC = SSC/(c-1)           MSC/MSE
    Between rows                     SSR              (r-1)           MSR = SSR/(r-1)           MSR/MSE
    Residual or error                SSE              (c-1)(r-1)      MSE = SSE/[(r-1)(c-1)]
    Total                            SST              n-1
SSC = Sum of squares between columns
SSR = Sum of squares between rows
SSE = Sum of squares due to error
SST = Total sum of squares
The sum of squares for the source 'Residual' is obtained by subtracting from the total
sum of squares the sum of squares between columns and rows, i.e.,
SSE = SST - [SSC + SSR]
The total number of degrees of freedom = n - 1 or cr - 1,
where c refers to the number of columns, and
r refers to the number of rows.
Number of degrees of freedom between columns = (c - 1)
Number of degrees of freedom between rows = (r - 1)
Number of degrees of freedom for residual = (c - 1)(r - 1)
The total sum of squares, the sum of squares between columns and the sum of squares
between rows are obtained in the same way as before.
Residual or error sum of squares = Total sum of squares - Sum of squares between
columns - Sum of squares between rows.
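
Written out with a correction factor, these sums of squares take the following form. The symbols $x_{ij}$ (the item in row $i$ and column $j$), $C_j$ (the total of column $j$, made up of $r$ items), $R_i$ (the total of row $i$, made up of $c$ items), $T$ (the grand total) and $N = rc$ are notation assumed here for illustration.

```latex
\begin{align*}
\text{Correction factor} &= \frac{T^2}{N} \\
SST &= \sum_{i=1}^{r}\sum_{j=1}^{c} x_{ij}^2 - \frac{T^2}{N} \\
SSC &= \sum_{j=1}^{c} \frac{C_j^2}{r} - \frac{T^2}{N} \\
SSR &= \sum_{i=1}^{r} \frac{R_i^2}{c} - \frac{T^2}{N} \\
SSE &= SST - SSC - SSR
\end{align*}
```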
The F values are calculated as follows:

$$F(\nu_1, \nu_2) = \frac{MSC}{MSE}, \quad \text{where } \nu_1 = (c-1) \text{ and } \nu_2 = (c-1)(r-1)$$

$$F(\nu_1, \nu_2) = \frac{MSR}{MSE}, \quad \text{where } \nu_1 = (r-1) \text{ and } \nu_2 = (c-1)(r-1)$$

It should be carefully noted that $\nu_1$ may not be the same in both cases: in one case
$\nu_1 = (c-1)$ and in the other case $\nu_1 = (r-1)$.
The calculated values of F are compared with the table values. If the calculated value
of F is greater than the table value at the pre-assigned level of significance, the null
hypothesis is rejected; otherwise it is accepted.
It would be clear from the above that in problems involving two-way classification, the
residual is the measuring rod for testing significance. It represents the magnitude of
variation due to forces called chance. The following examples illustrate the
procedure.
Problems:
A tea company appoints four salesmen A, B, C and D and observes their sales in three
seasons: summer, winter and monsoon. The figures (in lakhs) are given in the
following table:

    Seasons             Salesmen                  Season totals
                        A      B      C      D
    Summer             36     36     21     35        128
    Winter             28     29     31     32        120
    Monsoon            26     28     29     29        112
    Salesmen's totals  90     93     81     96        360

(i) Do the salesmen significantly differ in performance?
(ii) Is there a significant difference between the seasons?
Solutions:
The above data are classified according to two criteria: (i) salesmen and (ii) seasons.
In order to simplify calculations we code the data by subtracting 30 from each figure.
The data in the coded form are given below:

    Seasons             Salesmen                  Total
                        A      B      C      D
    Summer             +6     +6     -9     +5         +8
    Winter             -2     -1     +1     +2          0
    Monsoon            -4     -2     -1     -1         -8
    Total               0     +3     -9     +6     Grand total T = 0
$$\text{Correction factor} = \frac{T^2}{N} = \frac{(0)^2}{12} = 0$$

(the number of items, N, is 12)
Sum of squares between salesmen
This will be obtained by squaring the salesmen's totals, dividing each squared total by
the number of items included in it, adding these figures and then subtracting the
correction factor from them.
Thus, sum of squares between salesmen:

$$\frac{(0)^2}{3} + \frac{(3)^2}{3} + \frac{(-9)^2}{3} + \frac{(6)^2}{3} - \frac{T^2}{N} = 0 + 3 + 27 + 12 - 0 = 42$$

$$\nu = (4 - 1) = 3$$
Sum of squares between seasons
This will be obtained by dividing the squares of the season totals by the number of
items that make up each total, adding all such figures and subtracting the correction
factor therefrom. Thus, sum of squares between seasons:

$$\frac{(8)^2}{4} + \frac{(0)^2}{4} + \frac{(-8)^2}{4} - \frac{T^2}{N} = 16 + 0 + 16 - 0 = 32$$

$$\nu = (3 - 1) = 2$$
Total sum of squares
This will be obtained by adding the squares of all items in the table and subtracting
the correction factor therefrom, thus:

$$\text{Total sum of squares} = (6)^2 + (-2)^2 + (-4)^2 + (6)^2 + (-1)^2 + (-2)^2 + (-9)^2 + (1)^2 + (-1)^2 + (5)^2 + (2)^2 + (-1)^2 - \frac{T^2}{N}$$

$$= 210 - 0 = 210$$

$$\nu = (12 - 1) = 11$$
The above information will be presented in the following table of Analysis of
Variance:

    Source of variation           Sum of squares   Degrees of freedom   Mean sum of squares
    Between columns (salesmen)          42                  3                 14
    Between rows (seasons)              32                  2                 16
    Residual                           136                  6                 22.67
    Total                              210                 11

Let us take the hypothesis that there is no difference between the sales of the
salesmen or between the seasons. In other words, the three independent estimates of
variance are estimates of the variance of a common population.
Now first compare the salesmen variance estimate with the residual variance estimate
(the residual estimate, being the larger, is placed in the numerator); thus

$$F = \frac{22.67}{14} = 1.619$$

The table value of F for $\nu_1 = 3$ and $\nu_2 = 6$ at the 5% level of significance is 4.76.
The calculated value is less than the table value and we conclude that the sales of
different salesmen do not differ significantly.
Now let us compare the season variance estimate with the residual variance estimate;
thus

$$F = \frac{22.67}{16} = 1.417$$

The critical value of F for $\nu_1 = 2$ and $\nu_2 = 6$ at the 5% level of significance is 5.14.
The calculated value is less than this and hence there is no significant difference in the
seasons as far as the sales are concerned.
Thus, the test shows that the salesmen and the seasons are alike so far as the sales are
concerned.
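
The two-way computation can be sketched in Python as below. The function name `two_way_anova` and the dictionary keys are assumptions introduced for illustration. The sketch forms the ratios as MSC/MSE and MSR/MSE, following the general table of the two-way model, so for this data it returns values below 1; their reciprocals are the ratios 1.619 and 1.417 used in the worked solution, where the larger estimate is placed in the numerator.

```python
def two_way_anova(table):
    """table is a list of rows; every row holds one value per column."""
    r, c = len(table), len(table[0])
    total = sum(sum(row) for row in table)
    cf = total ** 2 / (r * c)                                 # correction factor T^2/N
    sst = sum(x * x for row in table for x in row) - cf       # total
    ssr = sum(sum(row) ** 2 for row in table) / c - cf        # between rows
    ssc = sum(sum(col) ** 2 for col in zip(*table)) / r - cf  # between columns
    sse = sst - ssc - ssr                                     # residual
    mse = sse / ((r - 1) * (c - 1))
    return {
        "SSC": ssc, "SSR": ssr, "SSE": sse, "SST": sst,
        "F_columns": (ssc / (c - 1)) / mse,
        "F_rows": (ssr / (r - 1)) / mse,
    }


# Rows are seasons (summer, winter, monsoon); columns are salesmen A, B, C, D.
sales = [[36, 36, 21, 35], [28, 29, 31, 32], [26, 28, 29, 29]]
tea = two_way_anova(sales)
print(tea)
```

The sketch works on the uncoded figures; subtracting 30 from every item changes the correction factor but not the sums of squares.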
Problems:
The following data represent the number of units of production per day turned out by
5 different workers using 4 different types of machines:

    Workers        Machine type
                  A      B      C      D
       1         44     38     47     36
       2         46     40     52     43
       3         34     36     44     32
       4         43     38     46     33
       5         38     42     49     39

A. Test whether the mean productivity is the same for the different machine types.
B. Test whether the 5 men differ with respect to mean productivity.
Solutions:
Let us take the hypothesis that (a) the mean productivity is the same for the four
different machines, and (b) the 5 men do not differ with respect to mean productivity.
To simplify calculations let us subtract 40 from each value. The coded data are given
below:

    Workers        Machine type               Total
                  A      B      C      D
       1         +4     -2     +7     -4        +5
       2         +6      0    +12     +3       +21
       3         -6     -4     +4     -8       -14
       4         +3     -2     +6     -7         0
       5         -2     +2     +9     -1        +8
    Total        +5     -6    +38    -17     T = 20
$$\text{Correction factor} = \frac{T^2}{N} = \frac{(20)^2}{20} = \frac{400}{20} = 20$$
Sum of squares between machines

$$= \frac{(5)^2}{5} + \frac{(-6)^2}{5} + \frac{(38)^2}{5} + \frac{(-17)^2}{5} - \text{Correction factor}$$

$$= (5 + 7.2 + 288.8 + 57.8) - 20 = 358.8 - 20 = 338.8$$

$$\nu = (c - 1) = (4 - 1) = 3$$
Sum of squares between workers

$$= \frac{(5)^2}{4} + \frac{(21)^2}{4} + \frac{(-14)^2}{4} + \frac{(0)^2}{4} + \frac{(8)^2}{4} - \frac{T^2}{N}$$

$$= \frac{25}{4} + \frac{441}{4} + \frac{196}{4} + \frac{0}{4} + \frac{64}{4} - 20$$

$$= (6.25 + 110.25 + 49 + 0 + 16) - 20 = 181.5 - 20 = 161.5$$

$$\nu = (r - 1) = (5 - 1) = 4$$
Total sum of squares

$$= [(4)^2 + (6)^2 + (-6)^2 + (3)^2 + (-2)^2 + (-2)^2 + (0)^2 + (-4)^2 + (-2)^2 + (2)^2$$
$$\quad + (7)^2 + (12)^2 + (4)^2 + (6)^2 + (9)^2 + (-4)^2 + (3)^2 + (-8)^2 + (-7)^2 + (-1)^2] - \frac{T^2}{N}$$

$$= [16 + 36 + 36 + 9 + 4 + 4 + 0 + 16 + 4 + 4 + 49 + 144 + 16 + 36 + 81 + 16 + 9 + 64 + 49 + 1] - 20$$

$$= 594 - 20 = 574$$
Residual or Remainder = Total sum of squares - (Sum of squares between machines
+ Sum of squares between workers)
= 574 - 338.8 - 161.5 = 73.7
Degrees of freedom for remainder = 19 - 3 - 4 = 12, or
(c-1)(r-1) = (3 x 4) = 12
    Source of variation       S.S.     d.f.     M.S.       Variance ratio or F
    Between machine types     338.8      3     112.933     112.933/6.142 = 18.387
    Between workers           161.5      4      40.375      40.375/6.142 = 6.574
    Remainder or residual      73.7     12       6.142
    Total                     574       19
(a) For $\nu_1 = 3$ and $\nu_2 = 12$, $F_{0.05} = 3.49$.
Since the calculated value (18.4) is greater than the table value, we conclude that
the mean productivity is not the same for the four different types of machines.
(b) For $\nu_1 = 4$ and $\nu_2 = 12$, $F_{0.05} = 3.26$.
The calculated value (6.57) is greater than the table value; hence the workers differ
with respect to mean productivity.
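
Coding the data by subtracting a constant is only a convenience: it leaves every sum of squares unchanged. A minimal Python sketch of this check, using the production figures of the problem, is given below; the function name `sums_of_squares` is an assumption introduced for illustration.

```python
def sums_of_squares(table):
    """Return (SS columns, SS rows, SS residual, SS total) for a two-way table."""
    r, c = len(table), len(table[0])
    total = sum(map(sum, table))
    cf = total ** 2 / (r * c)  # correction factor T^2/N
    sst = sum(x * x for row in table for x in row) - cf
    ssr = sum(sum(row) ** 2 for row in table) / c - cf
    ssc = sum(sum(col) ** 2 for col in zip(*table)) / r - cf
    return ssc, ssr, sst - ssc - ssr, sst


# Rows are workers 1 to 5; columns are machine types A, B, C, D.
output = [
    [44, 38, 47, 36],
    [46, 40, 52, 43],
    [34, 36, 44, 32],
    [43, 38, 46, 33],
    [38, 42, 49, 39],
]
coded = [[x - 40 for x in row] for row in output]

raw_ss = sums_of_squares(output)
coded_ss = sums_of_squares(coded)
print([round(v, 1) for v in raw_ss])
print([round(v, 1) for v in coded_ss])
```

Both calls should agree, up to floating-point rounding, with the figures 338.8, 161.5, 73.7 and 574 of the table above.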
Application of the t-distribution
The following are some examples to illustrate the way in which Student's
t-distribution is generally used to test the significance of various results obtained
from small samples.
1. To test the Significance of the Mean of a Random Sample. In determining
whether the mean of a sample drawn from a normal population deviates significantly
from a stated value (the hypothetical value of the population mean), when the variance
of the population is unknown, we calculate the statistic:

$$t = \frac{(\bar{X} - \mu)\sqrt{n}}{S}, \qquad S = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}}$$

where $\bar{X}$ = the mean of the sample
$\mu$ = the actual or hypothetical mean of the population
n = the sample size
S = the standard deviation of the sample
Problems
The manufacturer of a certain make of electric bulbs claims that his bulbs have a
mean life of 25 months with a standard deviation of 5 months. A random sample of 6
such bulbs gave the following values.
Life in months: 24, 26, 30, 20, 20, 18
Can you regard the producer's claim to be valid at the 1% level of significance? (Given
that the table values of the appropriate test statistic at the said level are 4.032, 3.707
and 3.499 for 5, 6 and 7 degrees of freedom respectively.)
Solutions: Let us take the hypothesis that there is no significant difference in the
mean life of bulbs in the sample and that of the population. Applying the t-test:

$$t = \frac{|\bar{X} - \mu|\sqrt{n}}{S}, \qquad S = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}}$$
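CALCULATION OF $\bar{X}$ AND S

         X       $x = (X - \bar{X})$      $x^2$
        24            +1             1
        26            +3             9
        30            +7            49
        20            -3             9
        20            -3             9
        18            -5            25
    $\sum X = 138$               $\sum x^2 = 102$

$$\bar{X} = \frac{\sum X}{n} = \frac{138}{6} = 23$$

$$S = \sqrt{\frac{\sum x^2}{n - 1}} = \sqrt{\frac{102}{5}} = \sqrt{20.4} = 4.517$$

$$t = \frac{|23 - 25|\sqrt{6}}{4.517} = \frac{2 \times 2.449}{4.517} = 1.084$$

$\nu = n - 1 = 6 - 1 = 5$. For $\nu = 5$, $t_{0.01} = 4.032$.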
The calculated value of t is less than the table value. The hypothesis is accepted.
Hence, the producer's claim is valid at the 1% level of significance.
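
The one-sample test can be sketched in Python as follows. The function name `one_sample_t` is an assumption introduced for illustration, and 4.032 is the table value quoted in the problem for 5 degrees of freedom.

```python
import math


def one_sample_t(values, mu):
    """Return (|t|, degrees of freedom) for H0: population mean equals mu."""
    n = len(values)
    mean = sum(values) / n
    # Standard deviation with n - 1 in the denominator, as in the formula for S.
    s = math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))
    return abs(mean - mu) * math.sqrt(n) / s, n - 1


t_value, df = one_sample_t([24, 26, 30, 20, 20, 18], 25)
table_value = 4.032  # t at the 1% level for 5 degrees of freedom
print(round(t_value, 3), df, t_value < table_value)
```

Carried at full precision the statistic comes out at about 1.085; the hand calculation gives 1.084 because the intermediate values are rounded.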
Problems
A random sample of size 16 has 53 as mean. The sum of the squares of the deviations
taken from the mean is 135. Can this sample be regarded as taken from a population
having 56 as mean? Obtain 95% and 99% confidence limits of the mean of the
population. (For $\nu = 15$, $t_{0.05} = 2.13$; for $\nu = 15$, $t_{0.01} = 2.95$.)
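Solutions: Let us take the hypothesis that there is no significant difference between
the sample mean and the hypothetical population mean. Applying the t-test:

$$t = \frac{|\bar{X} - \mu|\sqrt{n}}{S}$$

$$\bar{X} = 53,\quad \mu = 56,\quad n = 16,\quad \sum (X - \bar{X})^2 = 135$$

$$S = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}} = \sqrt{\frac{135}{15}} = 3$$

$$t = \frac{|53 - 56|\sqrt{16}}{3} = \frac{3 \times 4}{3} = 4$$

$\nu = 16 - 1 = 15$. For $\nu = 15$, $t_{0.05} = 2.13$.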
The calculated value of t is more than the table value. The hypothesis is rejected.
Hence, the sample has not come from a population having 56 as mean.
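95% confidence limits of the population mean

$$\bar{X} \pm \frac{S}{\sqrt{n}}\, t_{0.05} = 53 \pm \frac{3}{\sqrt{16}} \times 2.13 = 53 \pm 1.6 = 51.4 \text{ to } 54.6$$

99% confidence limits of the population mean

$$\bar{X} \pm \frac{S}{\sqrt{n}}\, t_{0.01} = 53 \pm \frac{3}{\sqrt{16}} \times 2.95 = 53 \pm \frac{3}{4} \times 2.95 = 53 \pm 2.212 = 50.788 \text{ to } 55.212$$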
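
The confidence limits can be reproduced with the short Python sketch below. The function name `confidence_limits` is an assumption introduced for illustration; the multipliers 2.13 and 2.95 are the table values quoted in the problem.

```python
import math


def confidence_limits(mean, s, n, t_table):
    """Return (lower, upper) limits: mean -/+ t * S / sqrt(n)."""
    half_width = t_table * s / math.sqrt(n)
    return mean - half_width, mean + half_width


s = math.sqrt(135 / 15)  # S from the sum of squared deviations, n - 1 = 15
low95, high95 = confidence_limits(53, s, 16, 2.13)
low99, high99 = confidence_limits(53, s, 16, 2.95)
print(low95, high95)
print(low99, high99)
```

Without rounding the half-width 1.5975 up to 1.6, the 95% limits come out at about 51.40 and 54.60, in agreement with the figures above.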

2. Testing Difference Between Means of Two Samples (Independent Samples). Given
two independent random samples of size $n_1$ and $n_2$ with means $\bar{X}_1$ and $\bar{X}_2$ and
standard deviations $S_1$ and $S_2$, we may be interested in testing the hypothesis that the
samples come from the same normal population. To carry out the test, we calculate
the statistic as follows:

$$t = \frac{\bar{X}_1 - \bar{X}_2}{S} \sqrt{\frac{n_1 n_2}{n_1 + n_2}}$$

where $\bar{X}_1$ = mean of the first sample
$\bar{X}_2$ = mean of the second sample
$n_1$ = number of observations in the first sample
$n_2$ = number of observations in the second sample
S = combined standard deviation
The value of S is calculated by the following formula:

$$S = \sqrt{\frac{\sum (X_1 - \bar{X}_1)^2 + \sum (X_2 - \bar{X}_2)^2}{n_1 + n_2 - 2}}$$

If the bias correction due to small sample size is ignored, the pooled estimate of the
standard deviation can be obtained by:

$$S = \sqrt{\frac{n_1 S_1^2 + n_2 S_2^2}{n_1 + n_2}}$$

Problems:
Two types of drugs were used on 5 and 7 patients for reducing their weight.
Drug A was imported and drug B indigenous. The decrease in weight after using
the drugs for six months was as follows:
Drug A : 10 12 13 11 14
Drug B : 8 9 12 14 15 10 9
Is there a significant difference in the efficacy of the two drugs? If not, which drug
should you buy? (For $\nu = 10$, $t_{0.05} = 2.228$.)
Solution: Let us take the hypothesis that there is no significant difference in the
efficacy of the two drugs. Applying the t-test:

$$t = \frac{\bar{X}_1 - \bar{X}_2}{S} \sqrt{\frac{n_1 n_2}{n_1 + n_2}}$$

      X1    $(X_1 - \bar{X}_1)$   $(X_1 - \bar{X}_1)^2$      X2    $(X_2 - \bar{X}_2)$   $(X_2 - \bar{X}_2)^2$
      10       -2          4           8       -3          9
      12        0          0           9       -2          4
      13       +1          1          12       +1          1
      11       -1          1          14       +3          9
      14       +2          4          15       +4         16
                                      10       -1          1
                                       9       -2          4
    $\sum X_1 = 60$     $\sum (X_1 - \bar{X}_1)^2 = 10$     $\sum X_2 = 77$     $\sum (X_2 - \bar{X}_2)^2 = 44$
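It is advisable to take account of bias, so S is computed with $n_1 + n_2 - 2$ in the
denominator:

$$\bar{X}_1 = \frac{\sum X_1}{n_1} = \frac{60}{5} = 12; \qquad \bar{X}_2 = \frac{\sum X_2}{n_2} = \frac{77}{7} = 11$$

$$S = \sqrt{\frac{\sum (X_1 - \bar{X}_1)^2 + \sum (X_2 - \bar{X}_2)^2}{n_1 + n_2 - 2}} = \sqrt{\frac{10 + 44}{5 + 7 - 2}} = \sqrt{\frac{54}{10}} = 2.324$$

$$t = \frac{\bar{X}_1 - \bar{X}_2}{S} \sqrt{\frac{n_1 n_2}{n_1 + n_2}} = \frac{12 - 11}{2.324} \sqrt{\frac{5 \times 7}{5 + 7}} = \frac{1.708}{2.324} = 0.735$$

$\nu = n_1 + n_2 - 2 = 5 + 7 - 2 = 10$. For $\nu = 10$, $t_{0.05} = 2.228$.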
The calculated value of t is less than the table value, so the hypothesis is accepted.
Hence, there is no significant difference in the efficacy of the two drugs. Since drug B
is indigenous and there is no difference in the efficacy of the imported and indigenous
drugs, we should buy the indigenous drug, i.e., B.
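
The two-sample test with the pooled standard deviation can be sketched in Python as below. The function name `pooled_t` is an assumption introduced for illustration; the sketch uses $n_1 + n_2 - 2$ in the denominator of S, as in the solution above.

```python
import math


def pooled_t(sample1, sample2):
    """Return (t, degrees of freedom, pooled S) for two independent samples."""
    n1, n2 = len(sample1), len(sample2)
    m1, m2 = sum(sample1) / n1, sum(sample2) / n2
    ss1 = sum((x - m1) ** 2 for x in sample1)  # squared deviations, sample 1
    ss2 = sum((x - m2) ** 2 for x in sample2)  # squared deviations, sample 2
    s = math.sqrt((ss1 + ss2) / (n1 + n2 - 2))
    t = (m1 - m2) / s * math.sqrt(n1 * n2 / (n1 + n2))
    return t, n1 + n2 - 2, s


drug_a = [10, 12, 13, 11, 14]
drug_b = [8, 9, 12, 14, 15, 10, 9]
t_drugs, df_drugs, s_drugs = pooled_t(drug_a, drug_b)
print(round(t_drugs, 3), df_drugs, round(s_drugs, 3))
```

For the drug data this gives t of about 0.735 on 10 degrees of freedom, well below the table value 2.228.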
Problems:
For a random sample of 10 persons fed on diet A, the increases in weight in pounds in a
certain period were:
10, 6, 16, 17, 13, 12, 8, 14, 15, 9
For another random sample of 12 persons fed on diet B, the increases in the same period
were:
7, 13, 22, 15, 12, 14, 18, 8, 21, 23, 10, 17
Test whether the diets A and B differ significantly as regards their effect on increase in
weight. Given the following:

    Degrees of freedom        19     20     21     22     23
    Value of t at 5% level   2.09   2.09   2.08   2.07   2.07

Solutions: Let us take the null hypothesis that A and B do not differ significantly
with regard to their effect on increase in weight. Applying the t-test:
$$t = \frac{\bar{X}_1 - \bar{X}_2}{S} \sqrt{\frac{n_1 n_2}{n_1 + n_2}}$$

$$S = \sqrt{\frac{\sum (X_1 - \bar{X}_1)^2 + \sum (X_2 - \bar{X}_2)^2}{n_1 + n_2 - 2}}$$
Calculating the required values:

          Persons fed on diet A                       Persons fed on diet B
    Increase     Deviation                      Increase     Deviation
    in weight    from mean 12                   in weight    from mean 15
      $X_1$      $(X_1 - \bar{X}_1)$   $(X_1 - \bar{X}_1)^2$        $X_2$      $(X_2 - \bar{X}_2)$   $(X_2 - \bar{X}_2)^2$
       10           -2            4              7           -8           64
        6           -6           36             13           -2            4
       16           +4           16             22           +7           49
       17           +5           25             15            0            0
       13           +1            1             12           -3            9
       12            0            0             14           -1            1
        8           -4           16             18           +3            9
       14           +2            4              8           -7           49
       15           +3            9             21           +6           36
        9           -3            9             23           +8           64
                                                10           -5           25
                                                17           +2            4
    Total 120        0          120         Total 180         0          314
Mean increase in weight of 10 persons fed on diet A:

$$\bar{X}_1 = \frac{\sum X_1}{n_1} = \frac{120}{10} = 12 \text{ pounds}$$

Mean increase in weight of 12 persons fed on diet B:
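$$\bar{X}_2 = \frac{\sum X_2}{n_2} = \frac{180}{12} = 15 \text{ pounds}$$

$$S = \sqrt{\frac{\sum (X_1 - \bar{X}_1)^2 + \sum (X_2 - \bar{X}_2)^2}{n_1 + n_2 - 2}} = \sqrt{\frac{120 + 314}{10 + 12 - 2}} = \sqrt{\frac{434}{20}} = 4.66$$

$\bar{X}_1 = 12$, $\bar{X}_2 = 15$, $n_1 = 10$, $n_2 = 12$, $S = 4.66$. Substituting the values in the above
formula:

$$t = \frac{|12 - 15|}{4.66} \sqrt{\frac{10 \times 12}{10 + 12}} = \frac{3}{4.66} \times 2.34 = 1.51$$

$\nu = n_1 + n_2 - 2 = 10 + 12 - 2 = 20$.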
For $\nu = 20$, the table value of t at the 5 percent level is 2.09. The calculated value is
less than the table value and hence the experiment provides no evidence against the
hypothesis. We, therefore, conclude that diets A and B do not differ significantly as
regards their effect on increase in weight.
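
The factor $\sqrt{n_1 n_2/(n_1 + n_2)}$ used in the two examples above is the same as dividing by the standard error $S\sqrt{1/n_1 + 1/n_2}$, a form in which the statistic is also commonly written. The short derivation below is added for illustration only.

```latex
\begin{align*}
\frac{1}{n_1} + \frac{1}{n_2} &= \frac{n_1 + n_2}{n_1 n_2} \\
\Rightarrow\quad \sqrt{\frac{n_1 n_2}{n_1 + n_2}} &= \frac{1}{\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} \\
\Rightarrow\quad t = \frac{\bar{X}_1 - \bar{X}_2}{S}\sqrt{\frac{n_1 n_2}{n_1 + n_2}}
  &= \frac{\bar{X}_1 - \bar{X}_2}{S\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}
\end{align*}
```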
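3. Testing Difference between Means of Two Samples (Dependent Samples or
Matched Paired Observations). The t statistic is:

$$t = \frac{\bar{d} - 0}{S / \sqrt{n}} \quad \text{or} \quad t = \frac{\bar{d}\sqrt{n}}{S}$$

where $\bar{d}$ = the mean of the differences
S = the standard deviation of the differences
The value of S is calculated as follows:

$$S = \sqrt{\frac{\sum (d - \bar{d})^2}{n - 1}} \quad \text{or} \quad S = \sqrt{\frac{\sum d^2 - n(\bar{d})^2}{n - 1}}$$

It should be noted that t is based on n - 1 degrees of freedom.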
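
The two expressions for S agree because the sums of squares in their numerators are equal. A short derivation, added here for illustration and using only $\sum d = n\bar{d}$:

```latex
\begin{align*}
\sum (d - \bar{d})^2 &= \sum d^2 - 2\bar{d}\sum d + n\bar{d}^2 \\
                     &= \sum d^2 - 2n\bar{d}^2 + n\bar{d}^2 \\
                     &= \sum d^2 - n\bar{d}^2
\end{align*}
```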
Problems
To verify whether a course in accounting improved performance, a similar test was
given to 12 participants both before and after the course. The original marks,
recorded in alphabetical order of the participants, were 44, 40, 61, 52, 32, 44, 70,
41, 67, 72, 53 and 72. After the course, the marks were, in the same order, 53, 38, 69,
57, 46, 39, 73, 48, 73, 74, 60 and 78. Was the course useful?
Solutions: Let us take the hypothesis that there is no difference in the marks obtained
before and after the course, i.e., the course has not been useful.
Applying the t-test (difference formula):

$$t = \frac{\bar{d}\sqrt{n}}{S}$$
    Participants   Before       After        d (2nd - 1st test)    $d^2$
                   (1st test)   (2nd test)
         A            44           53              +9               81
         B            40           38              -2                4
         C            61           69              +8               64
         D            52           57              +5               25
         E            32           46             +14              196
         F            44           39              -5               25
         G            70           73              +3                9
         H            41           48              +7               49
         I            67           73              +6               36
         J            72           74              +2                4
         K            53           60              +7               49
         L            72           78              +6               36
                                              $\sum d = 60$      $\sum d^2 = 578$
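$$\bar{d} = \frac{\sum d}{n} = \frac{60}{12} = 5$$

$$S = \sqrt{\frac{\sum d^2 - n(\bar{d})^2}{n - 1}} = \sqrt{\frac{578 - 12(5)^2}{12 - 1}} = \sqrt{\frac{278}{11}} = 5.03$$

$$t = \frac{5\sqrt{12}}{5.03} = \frac{5 \times 3.464}{5.03} = 3.443$$

$\nu = n - 1 = 12 - 1 = 11$. For $\nu = 11$, $t_{0.05} = 2.201$.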
The calculated value of t is greater than the table value. The hypothesis is rejected.
Hence the course has been useful.
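
The paired-difference test can be sketched in Python as follows. The function name `paired_t` is an assumption introduced for illustration. The sketch takes 67 as the 'before' mark of participant I and 53 as the 'after' mark of participant A, the values implied by the differences +6 and +9 in the table.

```python
import math


def paired_t(before, after):
    """Return (t, degrees of freedom) for matched pairs, with d = after - before."""
    d = [a - b for a, b in zip(after, before)]
    n = len(d)
    d_bar = sum(d) / n
    # S from the shortcut form: (sum of d^2 - n * d_bar^2) / (n - 1).
    s = math.sqrt((sum(x * x for x in d) - n * d_bar ** 2) / (n - 1))
    return d_bar * math.sqrt(n) / s, n - 1


before = [44, 40, 61, 52, 32, 44, 70, 41, 67, 72, 53, 72]
after = [53, 38, 69, 57, 46, 39, 73, 48, 73, 74, 60, 78]
t_course, df_course = paired_t(before, after)
print(round(t_course, 3), df_course)
```

At full precision the statistic is about 3.445; the hand calculation gives 3.443 because S is rounded to 5.03. Either way it exceeds the table value 2.201.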
Problems:
A drug is given to 10 patients and the increments in their blood pressure were
recorded to be 3, 6, -2, 4, -3, 4, 6, 0, 0, 2. Is it reasonable to believe that the drug has
no effect on change of blood pressure? (5% value of t for 9 d.f. = 2.26)
Solution: Let us take the hypothesis that the drug has no effect on change of blood
pressure. Applying the difference test:

$$t = \frac{\bar{d}\sqrt{n}}{S}$$
          d        $d^2$
          3          9
          6         36
         -2          4
          4         16
         -3          9
          4         16
          6         36
          0          0
          0          0
          2          4
    $\sum d = 20$   $\sum d^2 = 130$
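$$\bar{d} = \frac{\sum d}{n} = \frac{20}{10} = 2$$

$$S = \sqrt{\frac{\sum d^2 - n(\bar{d})^2}{n - 1}} = \sqrt{\frac{130 - 10(2)^2}{10 - 1}} = \sqrt{\frac{90}{9}} = 3.162$$

$$t = \frac{2\sqrt{10}}{3.162} = \frac{2 \times 3.162}{3.162} = 2$$

$\nu = n - 1 = 10 - 1 = 9$. For $\nu = 9$, $t_{0.05} = 2.26$.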
The calculated value of t is less than the table value. The hypothesis is accepted.
Hence it is reasonable to believe that the drug has no effect on change of blood
pressure.
