
The R Project for Comparisons of Several Multivariate Means
Chu-yu Chung Hang Du Yi Su Xiangmin Zhang
December 7, 2009
Abstract
Comparisons of multivariate means involve hypothesis testing, constructing simultaneous confidence intervals (SCI) and decomposing variances under certain conditions. In this project, we write five individual R functions to perform such tasks, including paired comparison, a repeated measure design for comparing treatments, comparing mean vectors from two multivariate populations and comparing several multivariate population means. Our R functions are designed to simplify the computation and to produce as much information as is needed in practice.
1 Introduction
Multivariate hypothesis testing is different from univariate testing in many ways. It makes use of the multivariate normal assumption, which is more appropriate in many practical settings. It allows many possible alternatives. The advantages of multivariate testing include preserving the overall significance level α and testing with greater power.
In Sections 1.1 to 1.5, we introduce how to formulate the tests used when comparing multivariate means. For each test, either the critical value for rejection or the simultaneous confidence intervals are presented. The notation is adapted from Chapter 6 of Applied Multivariate Statistical Analysis (6th ed.) by Johnson R. A. and Wichern D. W.
1.1 Paired Comparison
Paired comparison is used to analyze measurements taken under two different sets of experimental conditions on the same units, in order to assess whether the responses differ significantly between the two sets.
In the multivariate paired comparison procedure, we label the responses as $X_{111}$ (variable 1 under treatment 1 in the first unit), ..., $X_{2np}$ (variable p under treatment 2 in the nth unit), covering p responses, two treatments, and n experimental units. The p paired-difference random variables for the jth unit are then
\[
\begin{aligned}
D_{j1} &= X_{1j1} - X_{2j1} \\
D_{j2} &= X_{1j2} - X_{2j2} \\
&\;\;\vdots \\
D_{jp} &= X_{1jp} - X_{2jp}
\end{aligned}
\tag{1}
\]
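Let $D_j = (D_{j1}, D_{j2}, \ldots, D_{jp})'$ for $j = 1, 2, \ldots, n$, then
\[
E(D_j) = \delta =
\begin{pmatrix} \delta_1 \\ \delta_2 \\ \vdots \\ \delta_p \end{pmatrix}
\tag{2}
\]
\[
\mathrm{Cov}(D_j) = \Sigma_d
\tag{3}
\]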
The null hypothesis is that the two treatments have the same mean, that is, $\delta = 0$; the alternative is that the treatments have different means, $\delta \neq 0$.
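If we further assume that $D_1, D_2, \ldots, D_n$ are independent $N_p(\delta, \Sigma_d)$ random vectors, then
\[
T^2 = n\,(\bar{D} - \delta)'\, S_d^{-1}\, (\bar{D} - \delta)
\tag{4}
\]
where
\[
\bar{D} = \frac{1}{n}\sum_{j=1}^{n} D_j
\quad\text{and}\quad
S_d = \frac{1}{n-1}\sum_{j=1}^{n} (D_j - \bar{D})(D_j - \bar{D})',
\]
is distributed as an $\frac{(n-1)p}{n-p} F_{p,\,n-p}$ random variable.
We therefore reject $H_0: \delta = 0$ at level $\alpha$ if
\[
T^2 = n\,\bar{D}'\, S_d^{-1}\, \bar{D} > \frac{(n-1)p}{n-p}\, F_{p,\,n-p}(\alpha)
\tag{5}
\]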
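The $100(1-\alpha)\%$ simultaneous confidence intervals for the individual mean differences are
\[
\delta_i:\quad \bar{d}_i \pm \sqrt{\frac{(n-1)p}{n-p}\, F_{p,\,n-p}(\alpha)}\;\sqrt{\frac{s_{d_i}^2}{n}}
\tag{6}
\]
where $\bar{d}_i$ is the ith element of $\bar{D}$ and $s_{d_i}^2$ is the ith diagonal element of $S_d$.
The Bonferroni $100(1-\alpha)\%$ simultaneous confidence intervals for the individual mean differences are
\[
\delta_i:\quad \bar{d}_i \pm t_{n-1}\!\left(\frac{\alpha}{2p}\right)\sqrt{\frac{s_{d_i}^2}{n}}
\tag{7}
\]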
1.2 Repeated Measure Design Comparison
This is another generalization of the univariate paired t-statistic, arising in situations where q treatments are compared with respect to a single response variable.
Assume the jth observation is
\[
X_j =
\begin{pmatrix} X_{j1} \\ X_{j2} \\ \vdots \\ X_{jq} \end{pmatrix}
\tag{8}
\]
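Assume the observations come from a $N_q(\mu, \Sigma)$ population. Let C be a contrast matrix. An $\alpha$-level test of $H_0: C\mu = 0$ (equal treatment means) versus $H_1: C\mu \neq 0$ is:
Reject $H_0$ if
\[
T^2 = n\,(C\bar{x})'\,(C S C')^{-1}\,(C\bar{x}) > \frac{(n-1)(q-1)}{n-q+1}\, F_{q-1,\,n-q+1}(\alpha)
\tag{9}
\]
where $F_{q-1,\,n-q+1}(\alpha)$ is the upper $(100\alpha)$th percentile of an F-distribution with $q-1$ and $n-q+1$ d.f., and
\[
\bar{x} = \frac{1}{n}\sum_{j=1}^{n} x_j
\quad\text{and}\quad
S = \frac{1}{n-1}\sum_{j=1}^{n} (x_j - \bar{x})(x_j - \bar{x})'.
\]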
The $100(1-\alpha)\%$ simultaneous confidence intervals for single contrasts $c'\mu$, for any contrast vectors $c$ of interest, are
\[
c'\mu:\quad c'\bar{x} \pm \sqrt{\frac{(n-1)(q-1)}{n-q+1}\, F_{q-1,\,n-q+1}(\alpha)}\;\sqrt{\frac{c'Sc}{n}}
\]
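As a small illustration of what a contrast matrix looks like, the sketch below builds one possible choice of C, the successive-difference contrasts. The helper name successive_contrasts is hypothetical and is not one of the project's functions; any full-rank matrix whose rows sum to zero could be used in its place.

```r
# Hypothetical helper (not part of the project code): a (q - 1) x q matrix
# whose ith row contrasts treatment i + 1 with treatment i.
successive_contrasts <- function(q) {
  C <- matrix(0, nrow = q - 1, ncol = q)
  for (i in seq_len(q - 1)) {
    C[i, i] <- -1
    C[i, i + 1] <- 1
  }
  C
}

C <- successive_contrasts(4)
print(C)

# Each row sums to zero, so C %*% mu is the zero vector whenever all
# treatment means are equal, which is exactly the null hypothesis tested.
mu_equal <- rep(5, 4)
print(C %*% mu_equal)
```

A matrix built this way has q columns, matching the layout expected for a contrast matrix that multiplies the q-dimensional mean vector.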
1.3 Comparison of Two Multivariate Population Means
In this part we compare the responses from one set of experimental settings (population 1) with independent responses from another set of experimental settings (population 2). If $X_{11}, X_{12}, \ldots, X_{1n_1}$ is a random sample of size $n_1$ from $N_p(\mu_1, \Sigma)$ and $X_{21}, X_{22}, \ldots, X_{2n_2}$ is an independent random sample of size $n_2$ from $N_p(\mu_2, \Sigma)$, the likelihood ratio test of
\[
H_0: \mu_1 - \mu_2 = 0
\tag{10}
\]
is based on the statistic
\[
T^2 = \left[\bar{X}_1 - \bar{X}_2 - (\mu_1 - \mu_2)\right]'
\left[\left(\frac{1}{n_1} + \frac{1}{n_2}\right) S_{pooled}\right]^{-1}
\left[\bar{X}_1 - \bar{X}_2 - (\mu_1 - \mu_2)\right]
\tag{11}
\]
which is distributed as
\[
\frac{(n_1 + n_2 - 2)\,p}{n_1 + n_2 - p - 1}\, F_{p,\,n_1+n_2-p-1}
\tag{12}
\]
where
\[
S_{pooled} = \frac{n_1 - 1}{n_1 + n_2 - 2}\, S_1 + \frac{n_2 - 1}{n_1 + n_2 - 2}\, S_2
\tag{13}
\]
and $(n_1 - 1) S_1$ is distributed as $W_{n_1-1}(\Sigma)$ and $(n_2 - 1) S_2$ is distributed as $W_{n_2-1}(\Sigma)$.
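The $100(1-\alpha)\%$ simultaneous confidence interval for $\mu_{1i} - \mu_{2i}$ is
\[
\left(\bar{X}_{1i} - \bar{X}_{2i}\right) \pm c\,\sqrt{\left(\frac{1}{n_1} + \frac{1}{n_2}\right) s_{ii,pooled}}
\tag{14}
\]
where
\[
c^2 = \frac{(n_1 + n_2 - 2)\,p}{n_1 + n_2 - p - 1}\, F_{p,\,n_1+n_2-p-1}(\alpha)
\]
and $s_{ii,pooled}$ is the ith diagonal element of $S_{pooled}$.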
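The two-sample quantities above can be sketched in a few lines of R. The data below are made-up illustrative values (not one of the textbook datasets) and the variable names are chosen only for this example; the sketch pools the two sample covariance matrices, forms the T² statistic for testing equal means, and builds the T²-based simultaneous intervals.

```r
# Illustrative data only: two samples of size 4 on p = 2 variables.
x1 <- cbind(c(1, 2, 4, 7), c(2, 1, 5, 6))
x2 <- cbind(c(3, 5, 6, 10), c(1, 4, 4, 9))

n1 <- nrow(x1); n2 <- nrow(x2); p <- ncol(x1)
S1 <- cov(x1)
S2 <- cov(x2)

# Pooled covariance: weights (n1 - 1) and (n2 - 1) over n1 + n2 - 2.
Spooled <- ((n1 - 1) * S1 + (n2 - 1) * S2) / (n1 + n2 - 2)

# T^2 statistic for H0: mu1 - mu2 = 0 and its critical value at level 0.05.
diff <- colMeans(x1) - colMeans(x2)
tsq  <- drop(t(diff) %*% solve((1 / n1 + 1 / n2) * Spooled) %*% diff)
csq  <- (n1 + n2 - 2) * p / (n1 + n2 - p - 1) * qf(0.95, p, n1 + n2 - p - 1)

# T^2-based simultaneous intervals for the component differences.
half <- sqrt(csq) * sqrt((1 / n1 + 1 / n2) * diag(Spooled))
ci <- cbind(Estimate = diff, LowerCI = diff - half, UpperCI = diff + half)
print(ci)
```

With equal sample sizes, as here, the pooled matrix is simply the average of the two sample covariance matrices, which gives a quick sanity check on the weights.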
1.4 Comparison of Several Multivariate Population Means
Multivariate analysis of variance (MANOVA) is a generalized form of univariate analysis of variance (ANOVA). The MANOVA table is used to separate the total sum of squares and cross products into a part due to treatment effects and a part due to residuals. We will not delve into details here. A complete explanation of the variance decomposition and the summary tables of MANOVA can be found in Chapter 6 of Applied Multivariate Statistical Analysis (6th ed.) by Johnson R. A. and Wichern D. W.
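The decomposition can be checked numerically on a toy example. The sketch below uses a small illustrative dataset (eight observations, two variables, three groups) and variable names chosen only for this example; it computes the treatment (B) and error (W) sum of squares and cross products matrices and confirms that they add up to the total.

```r
# Small illustrative dataset: 2 response variables, 3 groups of sizes 3, 2, 3.
Y   <- cbind(c(9, 6, 9, 0, 2, 3, 1, 2), c(3, 2, 7, 4, 0, 8, 9, 7))
grp <- c(1, 1, 1, 2, 2, 3, 3, 3)

p    <- ncol(Y)
ybar <- colMeans(Y)                      # overall mean vector
B <- matrix(0, p, p)                     # treatment (between) SS&CP
W <- matrix(0, p, p)                     # error (within) SS&CP

for (k in unique(grp)) {
  Yk <- Y[grp == k, , drop = FALSE]
  mk <- colMeans(Yk)                     # mean vector of group k
  B  <- B + nrow(Yk) * tcrossprod(mk - ybar)
  W  <- W + crossprod(sweep(Yk, 2, mk))  # deviations from the group mean
}

Tot <- crossprod(sweep(Y, 2, ybar))      # total SS&CP about the overall mean
print(B); print(W); print(Tot)
print(all.equal(B + W, Tot))
```

The associated degrees of freedom split in the same way: g - 1 for treatments, N - g for error and N - 1 in total.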
1.5 Treatment Effect Comparison
In treatment effect comparison, we first test whether the treatment effects are the same. When the hypothesis of equal treatment effects is rejected, we construct simultaneous confidence intervals for the components of the differences of the mean vectors. Treatment effect comparison is closely related to MANOVA. Again, a complete discussion can be found in Chapter 6 of Applied Multivariate Statistical Analysis (6th ed.) by Johnson R. A. and Wichern D. W.
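As a rough sketch of how such an interval is formed, the function below computes the Bonferroni half-width for the difference between treatments k and l on a single variable, following the same formula used in the appendix code. The function name bonf_halfwidth and the numeric inputs are illustrative assumptions, not part of the project.

```r
# Hypothetical helper: half-width of a Bonferroni interval for tau_ki - tau_li.
#   wii    : ith diagonal element of the error SS&CP matrix W
#   N, g, p: total sample size, number of treatments, number of variables
#   nk, nl : sample sizes of the two treatments being compared
bonf_halfwidth <- function(wii, N, g, p, nk, nl, level = 0.95) {
  m <- p * g * (g - 1) / 2                            # number of intervals
  tcrit <- qt(1 - (1 - level) / (2 * m), df = N - g)  # Bonferroni t quantile
  tcrit * sqrt(wii / (N - g) * (1 / nk + 1 / nl))
}

# Illustrative inputs only.
hw   <- bonf_halfwidth(wii = 24, N = 8, g = 3, p = 2, nk = 3, nl = 3)
hw99 <- bonf_halfwidth(wii = 24, N = 8, g = 3, p = 2, nk = 3, nl = 3, level = 0.99)
print(c(hw, hw99))
```

The interval is the estimated difference plus or minus this half-width; raising the confidence level or the number of comparisons widens it.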
2 Examples
For illustration, we run our functions on several datasets, all of which accompany Chapter 6 of the book Applied Multivariate Statistical Analysis (6th ed.).
2.1 Paired Comparison
2.1.1 Example 1 (T6-1.dat)
Sample x_11j x_12j x_21j x_22j
1 6 27 25 15
2 6 23 28 13
3 18 64 36 22
4 8 44 35 29
5 11 30 15 31
6 34 75 44 64
7 28 26 42 30
8 71 124 54 64
9 43 54 34 56
10 33 30 29 20
11 20 14 39 21
Table 1: T6-1.dat
In the above table, the first two columns are from treatment 1 and the last two columns are from treatment 2.
The R output from running our function paired on this dataset is as follows,
reject null hypothesis, nonzero mean difference exists
T Squared Based Simultaneous CI for difference
Estimate LowerCI UpperCI
1 -9.363636 -22.453272 3.726000
2 13.272727 -5.700119 32.245574
Bonferroni Based Simultaneous CI for difference
Estimate LowerCI UpperCI
1 -9.363636 -20.573107 1.845835
2 13.272727 -2.974903 29.520358
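The estimates in this output can be reproduced directly from Table 1. The sketch below is a standalone check rather than the paired function itself; it assumes a confidence level of 0.95, which the text does not state explicitly, and recomputes the mean differences, the T² statistic and its critical value.

```r
# Data transcribed from Table 1 (T6-1.dat).
# x1: the two variables under treatment 1; x2: the same variables under treatment 2.
x1 <- cbind(c(6, 6, 18, 8, 11, 34, 28, 71, 43, 33, 20),
            c(27, 23, 64, 44, 30, 75, 26, 124, 54, 30, 14))
x2 <- cbind(c(25, 28, 36, 35, 15, 44, 42, 54, 34, 29, 39),
            c(15, 13, 22, 29, 31, 64, 30, 64, 56, 20, 21))

n <- nrow(x1); p <- ncol(x1)
d    <- x1 - x2               # paired differences, one row per unit
dbar <- colMeans(d)           # estimated mean differences
Sd   <- cov(d)                # sample covariance of the differences

# T^2 for H0: delta = 0 and the critical value (level 0.95 is an assumption).
tsq <- drop(n * t(dbar) %*% solve(Sd) %*% dbar)
csq <- (n - 1) * p / (n - p) * qf(0.95, p, n - p)

print(dbar)
print(c(T2 = tsq, critical = csq))
```

The mean differences agree with the Estimate column above, and the statistic exceeds the critical value, in line with the reported rejection.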
2.2 Repeated Measure Design Comparison
2.2.1 Example 2 (T6-2.dat)
Sample x_1 x_2 x_3 x_4
1 426 609 556 600
2 253 236 392 395
3 359 433 349 357
4 432 431 522 600
5 405 426 513 513
6 324 438 507 539
7 310 312 410 456
8 326 326 350 504
9 375 447 547 548
10 286 286 403 422
11 349 382 473 497
12 429 410 488 547
13 348 377 447 514
14 412 473 472 446
15 347 326 455 468
16 434 458 637 524
17 364 367 432 469
18 420 395 508 531
19 397 556 645 625
Table 2: T6-2.dat
In the above table, each column represents data from an individual treatment.
The R output from running our function repmeasure on this dataset is as follows,
reject null hypothesis of equal treatment means
contrast matrix
[,1] [,2] [,3] [,4]
[1,] -1 1 -1 1
[2,] -1 -1 1 1
[3,] -1 1 1 -1
Simultaneous CI for contrasts
Estimate LowerCI UpperCI
1 -206.32812 -282.19953 -130.4567
2 -306.92188 -415.73637 -198.1074
3 22.42188 -31.82305 76.6668
2.3 Comparison of Two Multivariate Population Means
2.3.1 Example 3 (T6-9.dat)
Sample x_1 x_2 x_3 gender
1 98 81 38 female
2 103 84 38 female
. . . . .
. . . . .
. . . . .
23 162 124 61 female
24 177 132 67 female
25 93 74 37 male
26 94 78 35 male
. . . . .
. . . . .
. . . . .
47 131 95 46 male
48 135 106 47 male
Table 3: T6-9.dat
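In the above table, the first 24 rows are data from population one (gender = female), and the last 24 rows are data from population two (gender = male).
The R output from running our function twopop on this dataset is as follows. The reported mean vectors are on the natural-log scale, which indicates that the measurements were log-transformed before the function was called.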
mean vector of population one
4.900659 4.622909 3.940286
mean vector of population two
4.725444 4.477574 3.703186
reject equality of mean vectors
The coeffcient of the linear combination
of most responsible for rejection is
-43.72677 -8.710687 67.54641
T Squared Based Simultaneous CI for the difference
Estimate LowerCI UpperCI
1 0.1752157 0.05776762 0.2926638
2 0.1453352 0.05411666 0.2365537
3 0.2371000 0.12906223 0.3451377
Bonferroni Based Simultaneous CI for the difference
Estimate LowerCI UpperCI
1 0.1752157 0.07702893 0.2734025
2 0.1453352 0.06907636 0.2215940
3 0.2371000 0.14678026 0.3274197
2.3.2 Example 4 (T6-12.dat)
Sample x_1 x_2 x_3 x_4 gender
1 0.34 3.71 2.87 30.87 male
2 0.39 5.08 3.38 43.85 male
. . . . . .
. . . . . .
. . . . . .
24 0.34 4.27 4.00 50.35 male
25 0.40 4.58 2.82 32.48 male
26 0.29 5.04 1.93 33.85 female
27 0.28 3.95 2.51 35.82 female
. . . . . .
. . . . . .
. . . . . .
49 0.37 5.23 2.48 34.86 female
50 0.35 5.37 2.25 35.07 female
Table 4: T6-12.dat
In the above table, the first 25 rows are data from population one (gender = male), and the last 25 rows are data from population two (gender = female).
The R output from running our function twopop on this dataset is as follows,
mean vector of population one
0.3136 5.1788 2.3152 38.1548
mean vector of population two
0.3972 5.3296 3.6876 49.4204
reject equality of mean vectors
The coeffcient of the linear combination
of most responsible for rejection is
-99.39898 6.375999 6.228141 -0.7908238
T Squared Based Simultaneous CI for the difference
Estimate LowerCI UpperCI
1 -0.0836 -0.1697234 0.002523361
2 -0.1508 -1.4650835 1.163483457
3 -1.3724 -1.8760572 -0.868742824
4 -11.2656 -17.1438597 -5.387340281
Bonferroni Based Simultaneous CI for the difference
Estimate LowerCI UpperCI
1 -0.0836 -0.1509852 -0.01621484
2 -0.1508 -1.1791296 0.87752962
3 -1.3724 -1.7664745 -0.97832550
4 -11.2656 -15.8649035 -6.66629645
2.4 Comparison of Several Multivariate Population Means
2.4.1 Example 5 (T6-12.dat)
Continuing Example 4, we now demonstrate the results of doing MANOVA on the same dataset, T6-12.dat. The R output from running our function MANOVA is as follows,
Overall mean vector
[,1] [,2] [,3] [,4]
[1,] 0.3554 5.2542 3.0014 43.7876
Treatment sample size
[,1] [,2]
[1,] 25 25
Treatment effect matrix
[,1] [,2]
[1,] -0.0418 0.0418
[2,] -0.0754 0.0754
[3,] -0.6862 0.6862
[4,] -5.6328 5.6328
One-Way MANOVA Table
Treatment SS&CP matrix
[,1] [,2] [,3] [,4]
[1,] 0.087362 0.157586 1.434158 11.77255
[2,] 0.157586 0.284258 2.586974 21.23566
[3,] 1.434158 2.586974 23.543522 193.26137
[4,] 11.772552 21.235656 193.261368 1586.42179
Error SS&CP matrix
[,1] [,2] [,3] [,4]
[1,] 0.404480 5.378180 0.854764 4.328096
[2,] 5.378180 94.196160 2.597532 113.078548
[3,] 0.854764 2.597532 13.833280 105.750500
[4,] 4.328096 113.078548 105.750500 1884.311320
Total SS&CP matrix
[,1] [,2] [,3] [,4]
[1,] 0.491842 5.535766 2.288922 16.10065
[2,] 5.535766 94.480418 5.184506 134.31420
[3,] 2.288922 5.184506 37.376802 299.01187
[4,] 16.100648 134.314204 299.011868 3470.73311
Degrees of Freedom
Treatment Error Total
1 1 48 49
Bonferroni Based Simultaneous CI for Treatments Difference
Trt.1 Trt.2 Estimate LowerCI UpperCI
1 1 -1 -0.084 -0.151 -0.016
2 1 -1 -0.151 -1.179 0.878
3 1 -1 -1.372 -1.766 -0.978
4 1 -1 -11.266 -15.865 -6.666
2.4.2 Example 6 (T6-13.dat)
Sample x_1 x_2 x_3 x_4 Group
1 131 138 89 49 1
2 125 131 92 48 1
. . . . . .
. . . . . .
. . . . . .
29 131 136 114 54 1
30 124 138 101 46 1
31 124 138 101 48 2
32 133 134 97 48 2
. . . . . .
. . . . . .
. . . . . .
59 135 132 98 54 2
60 130 128 101 51 2
61 137 141 96 52 3
62 129 133 93 47 3
. . . . . .
. . . . . .
. . . . . .
89 138 133 100 55 3
90 138 133 91 46 3
Table 5: T6-13.dat
In the above table, the first 30 rows are data from group 1, the next 30 rows are data from group 2 and the last 30 rows are data from group 3.
The R output from running our function MANOVA on this dataset is as follows,
Overall mean vector
[,1] [,2] [,3] [,4]
[1,] 132.7528 133.3146 98.19101 50.46067
Treatment sample size
[,1] [,2] [,3]
[1,] 29 30 30
Treatment effect matrix
[,1] [,2] [,3]
[1,] -1.3734986 -0.3861423 1.7138577
[2,] 0.1336691 -0.6146067 0.4853933
[3,] 1.3262301 0.8756554 -2.1576779
[4,] 0.1255327 -0.2273408 0.1059925
One-Way MANOVA Table
Treatment SS&CP matrix
[,1] [,2] [,3] [,4]
[1,] 147.300878 26.752383 -173.908098 3.083107
[2,] 26.752383 18.918597 -42.424177 6.221813
[3,] -173.908098 -42.424177 213.678096 -8.005024
[4,] 3.083107 6.221813 -8.005024 2.344543
Error SS&CP matrix
[,1] [,2] [,3] [,4]
[1,] 1785.2609 174.1690 125.11034 289.05172
[2,] 174.1690 1904.2724 225.07586 178.87931
[3,] 125.1103 225.0759 2046.07471 -17.82644
[4,] 289.0517 178.8793 -17.82644 837.76782
Total SS&CP matrix
[,1] [,2] [,3] [,4]
[1,] 1932.56180 200.9213 -48.79775 292.13483
[2,] 200.92135 1923.1910 182.65169 185.10112
[3,] -48.79775 182.6517 2259.75281 -25.83146
[4,] 292.13483 185.1011 -25.83146 840.11236
Degrees of Freedom
Treatment Error Total
1 2 86 88
Bonferroni Based Simultaneous CI for Treatments Difference
Trt.1 Trt.2 Trt.3 Estimate LowerCI UpperCI
1 1 -1 0 -0.987 -4.451 2.476
2 1 0 -1 -3.087 -6.551 0.376
3 0 1 -1 -2.100 -5.563 1.363
4 1 -1 0 0.748 -2.829 4.325
5 1 0 -1 -0.352 -3.929 3.225
6 0 1 -1 -1.100 -4.677 2.477
7 1 -1 0 0.451 -3.257 4.158
8 1 0 -1 3.484 -0.224 7.191
9 0 1 -1 3.033 -0.674 6.741
10 1 -1 0 0.353 -2.020 2.725
11 1 0 -1 0.020 -2.353 2.392
12 0 1 -1 -0.333 -2.706 2.039
2.5 Treatment Effect Comparison
Example 7 and Example 8 use the datasets T6-12.dat and T6-13.dat, respectively (the same as in the last subsection).
2.5.1 Example 7 (T6-12.dat)
The R output from running our function trt.effect is as follows,
Bonferroni Based Simultaneous CI for Treatments Difference
Trt.1 Trt.2 Estimate LowerCI UpperCI
1 1 -1 -0.084 -0.151 -0.016
2 1 -1 -0.151 -1.179 0.878
3 1 -1 -1.372 -1.766 -0.978
4 1 -1 -11.266 -15.865 -6.666
2.5.2 Example 8 (T6-13.dat)
The R output from running our function trt.effect is as follows,
Bonferroni Based Simultaneous CI for Treatments Difference
Trt.1 Trt.2 Trt.3 Estimate LowerCI UpperCI
1 1 -1 0 -1.000 -4.442 2.442
2 1 0 -1 -3.100 -6.542 0.342
3 0 1 -1 -2.100 -5.542 1.342
4 1 -1 0 0.900 -2.674 4.474
5 1 0 -1 -0.200 -3.774 3.374
6 0 1 -1 -1.100 -4.674 2.474
7 1 -1 0 0.100 -3.680 3.880
8 1 0 -1 3.133 -0.647 6.913
9 0 1 -1 3.033 -0.747 6.813
10 1 -1 0 0.300 -2.061 2.661
11 1 0 -1 -0.033 -2.395 2.328
12 0 1 -1 -0.333 -2.695 2.028
3 Appendix (R code)
3.1 Paired Comparison
In this part, x1 is an n × p numeric matrix or data frame of responses under treatment 1, where n is the number of experimental units and p is the number of responses; x2 is an n × p numeric matrix or data frame of responses under treatment 2, with the same n and p; and the input level is the confidence level of the intervals.
paired <- function(x1, x2, level)
{
    # x1, x2: n x p responses under treatments 1 and 2; level: confidence level
    x1 <- as.matrix(x1)
    x2 <- as.matrix(x2)
    p <- ncol(x1)
    n <- nrow(x1)
    d <- x1 - x2                          # paired differences
    dbar <- unname(colMeans(d))
    s <- cov(d)
    # Hotelling's T^2 for H0: delta = 0 and its critical value
    tsq <- drop(n * t(dbar) %*% solve(s) %*% dbar)
    csq <- (n - 1) * p/(n - p) * qf(level, p, n - p)
    if (tsq > csq)
        cat("\n reject null hypothesis, nonzero mean difference exists \n")
    else cat("\n do not reject null hypothesis of zero mean difference \n")
    # half-widths of the T^2-based and Bonferroni-based intervals
    se <- unname(sqrt(diag(s)/n))
    ht <- sqrt(csq) * se
    hb <- qt(1 - (1 - level)/(2 * p), n - 1) * se
    scit <- data.frame(Estimate = dbar, LowerCI = dbar - ht, UpperCI = dbar + ht)
    scib <- data.frame(Estimate = dbar, LowerCI = dbar - hb, UpperCI = dbar + hb)
    cat("\n T Squared Based Simultaneous CI for difference \n")
    print(scit)
    cat("\n Bonferroni Based Simultaneous CI for difference \n")
    print(scib)
}
3.2 Repeated Measure Design Comparison
In this part, x is an n × q matrix or data frame, where n is the number of experimental units and q is the number of treatments, and C is the contrast matrix. The input level is the confidence level of the intervals.
repmeasure <- function(x, C, level)
{
    # x: n x q data, one column per treatment; C: contrast matrix with q columns
    x <- as.matrix(x)
    q <- ncol(x)
    n <- nrow(x)
    xbar.new <- unname(drop(C %*% colMeans(x)))   # estimated contrasts C xbar
    s.new <- C %*% cov(x) %*% t(C)                # C S C'
    tsq <- drop(n * t(xbar.new) %*% solve(s.new) %*% xbar.new)
    csq <- (n - 1) * (q - 1)/(n - q + 1) * qf(level, q - 1, n - q + 1)
    if (tsq > csq)
        cat("\n reject null hypothesis of equal treatment means \n\n")
    else cat("\n do not reject null hypothesis \n\n")
    # simultaneous intervals for the contrasts given by the rows of C
    half <- unname(sqrt(csq/n * diag(s.new)))
    sci <- data.frame(Estimate = xbar.new, LowerCI = xbar.new - half,
        UpperCI = xbar.new + half)
    cat(" contrast matrix \n")
    print(C)
    cat("\n Simultaneous CI for contrasts \n")
    print(sci)
}
3.3 Comparison of Two Multivariate Population Means
In this part, x1 is an n1 × p numeric matrix or data frame of data from population one, where n1 is the sample size and p is the number of responses. x2 is an n2 × p numeric matrix or data frame of data from population two, where n2 is the sample size and p is the number of responses. The input level is the confidence level of the intervals.
twopop <- function(x1, x2, level)
{
    # x1: n1 x p data from population one; x2: n2 x p data from population two
    x1 <- as.matrix(x1)
    x2 <- as.matrix(x2)
    p <- ncol(x1)
    n1 <- nrow(x1)
    n2 <- nrow(x2)
    x1bar <- unname(colMeans(x1))
    x2bar <- unname(colMeans(x2))
    cat("\n mean vector of population one \n", x1bar)
    cat("\n\n mean vector of population two \n", x2bar)
    # pooled covariance matrix, assuming a common covariance in both populations
    s.pool <- ((n1 - 1) * cov(x1) + (n2 - 1) * cov(x2))/(n1 + n2 - 2)
    diff <- x1bar - x2bar
    tsq <- drop(t(diff) %*% solve((1/n1 + 1/n2) * s.pool) %*% diff)
    csq <- (n1 + n2 - 2) * p/(n1 + n2 - p - 1) * qf(level, p,
        n1 + n2 - p - 1)
    if (tsq > csq) {
        cat("\n\n reject equality of mean vectors\n\n")
        cat("The coeffcient of the linear combination \n of most responsible for rejection is \n\n",
            solve(s.pool) %*% diff)
    }
    else cat("\n\n do not reject equality of mean vectors\n\n")
    # half-widths of the T^2-based and Bonferroni-based intervals
    se <- unname(sqrt((1/n1 + 1/n2) * diag(s.pool)))
    ht <- sqrt(csq) * se
    hb <- qt(1 - (1 - level)/(2 * p), n1 + n2 - 2) * se
    scit <- data.frame(Estimate = diff, LowerCI = diff - ht, UpperCI = diff + ht)
    scib <- data.frame(Estimate = diff, LowerCI = diff - hb, UpperCI = diff + hb)
    cat("\n\n T Squared Based Simultaneous CI for the difference \n")
    print(scit)
    cat("\n\n Bonferroni Based Simultaneous CI for the difference \n")
    print(scib)
}
3.4 Comparison of Several Multivariate Population Means
In this part, Y is an N × p numeric matrix or data frame of data, where N is the total sample size and p is the number of variables. X is an N × 1 numeric matrix or data frame of treatment (group) labels. The input level is the confidence level of the intervals. C is the matrix of contrasts (one row per treatment, one column per contrast) used to compare the treatment effects.
MANOVA <- function(Y, X, level, C)
{
    # Y: N x p responses; X: group labels; C: g x d matrix, one contrast per column
    Y <- as.matrix(Y)
    X <- as.integer(as.factor(unlist(X)))      # recode the groups as 1, ..., g
    p <- ncol(Y)
    g <- max(X)
    N <- length(X)
    meanvec <- matrix(colMeans(Y), ncol = 1)
    n <- matrix(0, nrow = g, ncol = 1)
    trt.effect <- matrix(0, nrow = p, ncol = g)
    W <- matrix(0, nrow = p, ncol = p)
    B <- matrix(0, nrow = p, ncol = p)
    for (k in 1:g) {
        # colMeans replaces mean() on a subset, which no longer returns column means
        Yk <- Y[X == k, , drop = FALSE]
        n[k] <- nrow(Yk)
        trt.effect[, k] <- colMeans(Yk) - meanvec
        W <- W + (n[k] - 1) * cov(Yk)
        B <- B + n[k] * trt.effect[, k] %*% t(trt.effect[, k])
    }
    Tot <- W + B                               # avoid masking T (TRUE)
    df1 <- g - 1
    df2 <- N - g          # error d.f. is the sum of (n[k] - 1), not (n[k] - g)
    df3 <- N - 1
    # Bonferroni critical value for p * g * (g - 1)/2 two-sided intervals
    tcrit <- qt(1 - (1 - level)/(p * g * (g - 1)), N - g)
    d <- g * (g - 1)/2
    A <- matrix(C, ncol = d)
    # estimates ordered by variable, then by contrast
    tau <- as.vector(t(A) %*% t(trt.effect))
    # each contrast uses the sample sizes of the treatments it involves
    fac <- drop(t(A^2) %*% (1/n))
    half <- as.vector(tcrit * sqrt(outer(fac, diag(W)/(N - g))))
    cat(" Overall mean vector \n")
    print(t(meanvec))
    cat("\n Treatment sample size \n")
    print(t(n))
    cat("\n Treatment effect matrix \n")
    print(trt.effect)
    cat("\n One-Way MANOVA Table \n")
    cat("\n Treatment SS&CP matrix\n")
    print(B)
    cat("\n Error SS&CP matrix\n")
    print(W)
    cat("\n Total SS&CP matrix\n")
    print(Tot)
    cat("\n Degrees of Freedom\n")
    df <- data.frame(Treatment = df1, Error = df2, Total = df3)
    print(df)
    cat("\n Bonferroni Based Simultaneous CI for Treatments Difference \n")
    Bonferroni.SCI <- data.frame(Trt = t(A)[rep(1:d, times = p), , drop = FALSE],
        Estimate = round(tau, 3), LowerCI = round(tau - half, 3),
        UpperCI = round(tau + half, 3))
    print(Bonferroni.SCI)
}
3.5 Treatment Effect Comparison
In this part, Y is an N × p numeric matrix or data frame of data, where N is the total sample size and p is the number of variables. X is an N × 1 numeric matrix or data frame of treatment (group) labels. The input level is the confidence level of the intervals. C is the matrix of contrasts (one row per treatment, one column per contrast) used to compare the treatment effects.
# trt.effect calls var.decomp, which is not listed in this appendix. The version
# below is a reconstruction (an assumption) based on the MANOVA function of
# Section 3.4; it returns only the pieces that trt.effect uses.
var.decomp <- function(Y, X)
{
    Y <- as.matrix(Y)
    X <- as.integer(as.factor(unlist(X)))      # recode the groups as 1, ..., g
    p <- ncol(Y)
    g <- max(X)
    trt.size <- numeric(g)
    trt.effect <- matrix(0, nrow = p, ncol = g)
    W <- matrix(0, nrow = p, ncol = p)
    for (k in 1:g) {
        Yk <- Y[X == k, , drop = FALSE]
        trt.size[k] <- nrow(Yk)
        trt.effect[, k] <- colMeans(Yk) - colMeans(Y)
        W <- W + (trt.size[k] - 1) * cov(Yk)
    }
    list(trt.effect = trt.effect, W = W, trt.size = trt.size)
}
trt.effect <- function(Y, X, level, C)
{
    vd <- var.decomp(Y, X)      # decompose once instead of inside the loops
    p <- nrow(vd$trt.effect)
    g <- length(vd$trt.size)
    N <- sum(vd$trt.size)
    # Bonferroni critical value for p * g * (g - 1)/2 two-sided intervals
    tcrit <- qt(1 - (1 - level)/(p * g * (g - 1)), N - g)
    d <- g * (g - 1)/2
    A <- matrix(C, ncol = d)
    # estimates ordered by variable, then by contrast
    tau <- as.vector(t(A) %*% t(vd$trt.effect))
    # each contrast uses the sample sizes of the treatments it involves
    fac <- drop(t(A^2) %*% (1/vd$trt.size))
    half <- as.vector(tcrit * sqrt(outer(fac, diag(vd$W)/(N - g))))
    cat("\n Bonferroni Based Simultaneous CI for Treatments Difference \n")
    Bonferroni.SCI <- data.frame(Trt = t(A)[rep(1:d, times = p), , drop = FALSE],
        Estimate = round(tau, 3), LowerCI = round(tau - half, 3),
        UpperCI = round(tau + half, 3))
    print(Bonferroni.SCI)
}
4 References
Johnson R. A. and Wichern D. W. (2007). Applied Multivariate Statistical Analysis (6th ed.). Englewood Cliffs, New Jersey: Prentice Hall.
Rencher A. C. (1998). Multivariate Statistical Inference and Applications. John Wiley & Sons, Inc.
Morrison D. F. (1976). Multivariate Statistical Methods (2nd ed.). McGraw-Hill Book Company.