CHAPTER 12

Nonlinear ANCOVA

12.1 INTRODUCTION
The relationship between the covariate and the dependent variable scores is not
always linear. Because an assumption underlying the ANCOVA model is that the
within-group relationship between X and Y is linear, researchers should be aware of
the problem of nonlinearity. If ANCOVA is employed when the data are nonlinear, the
power of the F-test is decreased and the adjusted means may be poor representations
of the treatment effects.
Two reasons for nonlinear relationships between X and Y are inherent nonlinearity
of characteristics and scaling error. It is quite possible that the basic characteristics
being measured are not linearly related. For example, the relationship between extroversion (X) and industrial sales performance (Y) could be predicted to be nonlinear.
Those salespeople with very low extroversion scores may have poor sales performance because they have difficulty interacting with clients. Those with very high
extroversion scores may be viewed as overly social and not serious about their work.
Hence, very low or very high extroversion scores may be associated with low sales
performance, whereas intermediate extroversion scores may be associated with high
sales performance.
Another example of expected nonlinearity might be found between certain measures of motivation (X) and performance (Y). Psychologists working in the area of
motivation sometimes hypothesize that there is an optimal level of motivation or
arousal for an individual working on a specific task. At very low or very high levels
of arousal, performance is lower than at the optimal level of arousal. In both examples, the relationship between X and Y scores is expected to be nonlinear because
the relationship between the basic characteristics underlying the observed (measured)
scores is expected to be nonlinear. This distinction between the measured and underlying or basic scores is important. It is quite possible that the relationship between
observed X and Y scores is nonlinear when the relationship between the basic X and Y
characteristics is linear. When this occurs, the problems of scaling error are involved.
There are several types of scaling errors that can produce nonlinearity, but probably
the most frequently encountered type results in either ceiling or floor effects. In
either case the problem is that the instrumentation or scale used in the measurement of
either the X or the Y variable (or both) may not be adequate to reflect real differences
in the characteristics being measured. For example, if most of the subjects employed
in a study obtain nearly the highest possible score on a measure, there are likely to be
unmeasured differences among those who get the same high score. The measurement
procedure simply does not have sufficient ceiling to reflect differences among
the subjects on the characteristics being measured. Suppose most subjects get a
score of 50 on a 50-point pretest that is employed as a covariate; the test is much
too easy for the subjects included in the experiment. If the scores on this measure
are plotted against scores on a posttest that is of the appropriate difficulty level,
nonlinearity will be observable. Here the inherent relationship between the X and Y
characteristics is linear, but the obtained relationship between the observed measures
is not linear. Hence, one reason for nonlinearity in the XY relationship is scaling error
or inappropriate measurement. Regardless of the reason for nonlinearity, the linear
ANCOVA model is inappropriate if the degree of nonlinearity is severe.

12.2 DEALING WITH NONLINEARITY


A routine aspect of any data analysis is to plot the data. This preliminary step involves
plotting the Y scores against the X scores for each group. Severe nonlinearity will
generally be obvious in both the trend observed in the scatter plot and in the shape
of the marginal distributions. More sensitive approaches for identifying nonlinearity
include visual inspection of the residuals of the ANCOVA model and fitting various
alternative models to the data. Once it has been decided that nonlinearity is problematic, the next step is to either (1) seek a transformation of the original X and/or Y
scores that will result in a linear relationship for the transformed data or (2) fit an
appropriate polynomial ANCOVA model to the original data.
Data Transformations
If the relationship between X and Y is nonlinear but monotonic (i.e., Y increases when
X increases but the function is not linear), a transformation of X should be attempted.
Logarithmic, square root, and reciprocal transformations are most commonly used
because they usually yield the desired linearity. Advanced treatments of regression
analysis should be consulted for details on these and other types of transformation
(e.g., Cohen et al., 2003).
Once a transformation has been selected, ANCOVA is carried out in the usual way
on the transformed data. For example, if there is reason to believe that the relationship
between log_e X and Y is linear, ANCOVA is carried out using log_e X as the covariate.
It must be pointed out in the interpretation of the analysis, however, that log_e X rather
than X was the covariate.
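As an illustration outside the text's Minitab workflow, the following Python sketch (using the statsmodels formula interface) shows one way such an analysis might be set up; the data frame and its column names y, x, and group are hypothetical placeholders, not variables from the book.

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

def log_covariate_ancova(df: pd.DataFrame):
    """Fit a one-way ANCOVA using log_e(x), rather than x, as the covariate.

    `df` is assumed to hold columns 'y' (outcome), 'x' (a positive covariate),
    and 'group' (the treatment factor); these names are illustrative only.
    """
    # The treatment factor is tested after adjusting for the transformed covariate.
    return smf.ols("y ~ C(group) + np.log(x)", data=df).fit()

Any report of such an analysis should state, as noted above, that log_e X and not X served as the covariate.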

A method of determining whether a transformation has improved the fit of the
model to the data is to plot the scores and compute ANCOVA for both untransformed
and transformed data. A comparison of the plots and ANCOVAs will reveal the effect
of the transformation.
Polynomial ANCOVA Models
If the relationship between X and Y is not monotonic, a simple transformation will
not result in linearity. In the nonlinear-monotonic situation, the values of Y increase
as the value of X increases. In the nonlinear-nonmonotonic situation, Y increases as X
increases only up to a point, and then Y decreases as X increases. If we transform X to
log_e X for the nonmonotonic situation, the log_e X values increase as X increases and
nonlinearity is still present when log_e X and Y are plotted. The simplest alternative
in this case is to fit a second-degree polynomial (quadratic) ANCOVA model. This
model is written as

Y_ij = μ + α_j + β_1(X_ij − X̄..) + β_2(X²_ij − X̄²..) + ε_ij,

where

Y_ij is the dependent variable score of ith individual in jth group;
μ is the population mean on Y;
α_j is the effect of treatment j;
β_1 is the linear effect regression coefficient;
X_ij is the covariate score for ith individual in jth group;
X̄.. is the mean of all observations on covariate;
β_2 is the curvature effect coefficient;
X²_ij is the squared covariate score for ith individual in jth group;
X̄².. is the mean of squared observations on covariate (i.e., Σ_{i=1}^{N} X²_ij / N); and
ε_ij is the error component associated with ith individual in jth group.
This model differs from the linear model in that it contains the curvature effect
term β_2(X²_ij − X̄²..). If the dependent variable scores are a quadratic rather than a
linear function of the covariate, this model will provide a better fit and will generally
yield greater power with respect to tests on adjusted means.
The quadratic ANCOVA is computed by using X and X² as if they were two
covariates in a multiple covariate analysis. The main ANCOVA test, the homogeneity
of regression test, the computation of adjusted means, and multiple comparison tests
are all carried out as with an ordinary two-covariate ANCOVA. If the relationship between X and Y is more complex than a quadratic function, a higher degree polynomial
may be useful. The third-degree polynomial (cubic) ANCOVA model is written as

Y_ij = μ + α_j + β_1(X_ij − X̄..) + β_2(X²_ij − X̄²..) + β_3(X³_ij − X̄³..) + ε_ij.

This model will provide a good fit if the relationship between the covariate
and the dependent variable is a cubic function. Cubic ANCOVA is carried out by
employing X, X², and X³ as covariates in a multiple covariance analysis. Higher
degree polynomials can be employed for more complex functions, but it is very
unusual to encounter such situations.
Higher degree polynomial models virtually always fit sample data better than do
simpler polynomial models, but this does not mean that the more complex models
are preferable to the simpler ones. Care must be taken not to employ a more complex
model than is required; there are essentially two reasons to keep the model as simple
as possible. First, a degree of freedom is lost from the ANCOVA error mean square
(i.e., MS_Resw) for each additional term in the ANCOVA model. If the number of
subjects is not large, the loss of degrees of freedom can easily offset the sum-of-squares
advantage of a better fit afforded by the more complex model. Even though
the sum-of-squares residual is smaller with more complex models, the mean-square
error can be considerably larger with complex models. The consequences of the
larger error term are less precise estimates of the adjusted means, and, correspondingly, less precise tests on the difference between adjusted means. This problem is
illustrated in Section 12.3. The second reason for not employing a more complex
model than is required is the law of parsimony. If a linear model fits the data almost
as well as a quadratic model, the simpler model should usually be chosen because
the interpretation and generalization of results is more straightforward.
Two additional points on the use of polynomial regression models are relevant
to the polynomial ANCOVA described here. First, it is not necessary that the covariate be a fixed variable. This point was made earlier in the discussion of assumptions for ANCOVA but is reiterated here for nonlinear ANCOVA because, as
Cramer and Appelbaum (1978) observed, it is sometimes mistakenly believed that
polynomial regression is appropriate only with X fixed. Second, the parameters of
the polynomial regression are sometimes difficult to estimate with certain multiple
regression computer programs because these programs will not, with certain data
sets, yield the inverse of the required matrix. This problem develops because X,
X², X³, and so on are all highly correlated. These computational difficulties can
generally be reduced by transforming the raw X scores to deviation scores (i.e.,
centered scores) before the regression analysis is carried out. That is, in quadratic
ANCOVA, for example, (X − X̄) and (X − X̄)² rather than X and X² should be
used as the covariates. Additional details on this problem in the context of conventional regression analysis can be found in Bradley and Srivastava (1979) and
Budescu (1980).
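A minimal Python sketch of this centering idea is given below (numpy and statsmodels are assumed; the function name and arguments are illustrative, not from the text):

import numpy as np
import statsmodels.api as sm

def centered_quadratic_design(X: np.ndarray, D: np.ndarray) -> np.ndarray:
    """Quadratic ANCOVA design matrix built from centered covariate terms.

    Using (X - Xbar) and (X - Xbar)**2 instead of X and X**2 reduces the
    near-collinearity that can make the normal-equations matrix hard to invert.
    """
    Xc = X - X.mean()                       # deviation (centered) scores
    return sm.add_constant(np.column_stack([D, Xc, Xc ** 2]))

Because the centered columns span the same model space as X and X², the treatment test and the adjusted means are unchanged by this reparameterization; only the numerical conditioning of the problem improves.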

12.3 COMPUTATION AND EXAMPLE OF FITTING POLYNOMIAL MODELS
The computation and rationale for quadratic ANCOVA are essentially the same as
for multiple ANCOVA. Consider the following data:

(1) Experimental Group      (2) Control Group
     X        Y                  X        Y
    13       18                 11       13
     7       14                  2        1
    17        7                 19        2
    14       14                 15        9
     3        8                  8       10
    12       19                 11       15

Previous research or theoretical considerations may suggest that the relationship
between X and Y is best described as a quadratic function. A scatter plot of these
data appears to support a quadratic model. Hence, the experimenter has a reasonable
basis for deciding to employ the quadratic ANCOVA model. The computation of
the complete quadratic ANCOVA through the general linear regression procedure is
based on the following variables:
(1)   (2)    (3)    (4)    (5)   (6)
 D     X     X²     DX    DX²    Y
 1    13    169     13    169   18
 1     7     49      7     49   14
 1    17    289     17    289    7
 1    14    196     14    196   14
 1     3      9      3      9    8
 1    12    144     12    144   19
 0    11    121      0      0   13
 0     2      4      0      0    1
 0    19    361      0      0    2
 0    15    225      0      0    9
 0     8     64      0      0   10
 0    11    121      0      0   15
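For readers who prefer to reproduce the computations outside Minitab, a minimal Python (numpy) sketch that builds these six columns from the raw data is shown below; the variable names are illustrative.

import numpy as np

# Raw scores from the example data (experimental group first, then control).
X = np.array([13, 7, 17, 14, 3, 12, 11, 2, 19, 15, 8, 11], dtype=float)  # covariate
Y = np.array([18, 14, 7, 14, 8, 19, 13, 1, 2, 9, 10, 15], dtype=float)   # dependent variable
D = np.array([1] * 6 + [0] * 6, dtype=float)                             # group dummy

# Columns (1)-(6): D, X, X^2, D*X, D*X^2, Y
design = np.column_stack([D, X, X ** 2, D * X, D * X ** 2, Y])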

We now proceed as if we were performing a multiple covariance analysis using
X and X² as the covariates. As before, the main test on adjusted treatment effects is
based on the coefficients of multiple determination R²_{yX} and R²_{yD,X}.
The term R²_{yX} represents the proportion of the total variability explained by the
quadratic regression (i.e., the regression of Y on X and X²), whereas R²_{yD,X} represents
the proportion of the total variability explained by the quadratic regression and
the treatments. Hence, the difference between the two coefficients represents the
proportion of the variability accounted for by the treatments that is independent of
that accounted for by quadratic regression. The proportion of unexplained variability
is, of course, 1 − R²_{yD,X}.

Column 1 in this example is the only dummy variable (because there are only J − 1
dummy variables), columns 2 and 3 are the covariate columns, columns 4 and 5 are
the interaction columns (not used in the main analysis), and column 6 contains the
dependent variable scores. The regression analyses yield the following:
R²_{yD,X} = R²_{y123} = 0.918903

and

R²_{yX} = R²_{y23} = 0.799091.

Difference or unique contribution of dummy variable beyond quadratic regression = 0.119812.

Total sum of squares = 361.67. The general form of the quadratic ANCOVA
summary is as follows:
Source                  SS                            df           MS                     F
Adjusted treatment      (R²_{yD,X} − R²_{yX})SST      J − 1        SS_AT/(J − 1)          MS_AT/MS_Resw
Quadratic residual_w    (1 − R²_{yD,X})SST            N − J − 2    SS_Resw/(N − J − 2)
Quadratic residual_t    (1 − R²_{yX})SST              N − 1 − 2

The quadratic ANCOVA summary for the example data is as follows:


Source                  SS                              df    MS       F
Adjusted treatment      (0.119812)361.67 = 43.33         1    43.33    11.82 (p = .009)
Quadratic residual_w    (1 − 0.918903)361.67 = 29.33     8     3.67
Quadratic residual_t    (1 − 0.799091)361.67 = 72.66     9
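The following Python sketch (numpy, statsmodels, and scipy assumed) reproduces this F test from the R² values of the full and reduced regressions; it is an illustrative alternative to the Minitab runs shown later, not the author's code.

import numpy as np
import statsmodels.api as sm
from scipy import stats

X = np.array([13, 7, 17, 14, 3, 12, 11, 2, 19, 15, 8, 11], dtype=float)
Y = np.array([18, 14, 7, 14, 8, 19, 13, 1, 2, 9, 10, 15], dtype=float)
D = np.array([1] * 6 + [0] * 6, dtype=float)
N, J = len(Y), 2

# Full model: dummy plus quadratic covariates; reduced model: quadratic covariates only.
full = sm.OLS(Y, sm.add_constant(np.column_stack([D, X, X ** 2]))).fit()
reduced = sm.OLS(Y, sm.add_constant(np.column_stack([X, X ** 2]))).fit()

sst = np.sum((Y - Y.mean()) ** 2)                        # about 361.67
ss_at = (full.rsquared - reduced.rsquared) * sst         # about 43.33
ss_resw = (1 - full.rsquared) * sst                      # about 29.33
F = (ss_at / (J - 1)) / (ss_resw / (N - J - 2))          # about 11.82
p = stats.f.sf(F, J - 1, N - J - 2)                      # about .009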

Adjusted means and multiple comparison procedures are also dealt with as they
are under the multiple ANCOVA model. The adjusted means for the example data
are obtained through the regression equation associated with R²_{y123}. The intercept and
regression weights are

b_0 = −5.847359
b_1 = 3.83111
b_2 = 3.66943
b_3 = −0.17533

The group 1 dummy score, the grand mean covariate score, and the grand mean
of the squared covariate scores are 1, 11, and 146, respectively. Hence, Ȳ_1 adj =
−5.847359 + 3.83111(1) + 3.66943(11) − 0.17533(146) = 12.75. The group 2
dummy score, the grand mean covariate score, and the grand mean of the squared
covariate scores are 0, 11, and 146, respectively. Hence, Ȳ_2 adj = −5.847359 +
3.83111(0) + 3.66943(11) − 0.17533(146) = 8.92.
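Continuing the sketch above (reusing full, X, and D from that block), the adjusted means can be obtained by evaluating the fitted equation at each group's dummy code and at the grand means of X and X²:

# Coefficient order follows the design matrix: constant, D, X, X**2.
b0, b_d, b_x, b_x2 = full.params
x_bar, x2_bar = X.mean(), (X ** 2).mean()                # 11 and 146

y1_adj = b0 + b_d * 1 + b_x * x_bar + b_x2 * x2_bar      # about 12.75 (experimental)
y2_adj = b0 + b_d * 0 + b_x * x_bar + b_x2 * x2_bar      # about 8.92 (control)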

Just as the test of the homogeneity of regression planes is an important adjunct to
the main F test in multiple ANCOVA, the test of the homogeneity of the quadratic
regressions for the separate groups should be carried out in quadratic ANCOVA. This
test is computed in the same manner as the test of the homogeneity of regression
planes.
The form of the summary is as follows:
Source                                   SS                               df          MS        F
Heterogeneity of quadratic regression    (R²_{yD,X,DX} − R²_{yD,X})SST    2(J − 1)    MS_het    MS_het/MS_Resi
Quadratic residual_i                     (1 − R²_{yD,X,DX})SST            N − 3J      MS_Resi
Quadratic residual_w                     (1 − R²_{yD,X})SST               N − J − 2

A more general form, appropriate for testing the homogeneity of any degree
(denoted as C) polynomial regression, is as follows:
Source                                    SS                               df              MS        F
Heterogeneity of polynomial regression    (R²_{yD,X,DX} − R²_{yD,X})SST    C(J − 1)        MS_het    MS_het/MS_Resi
Polynomial residual_i                     (1 − R²_{yD,X,DX})SST            N − J(C + 1)    MS_Resi
Polynomial residual_w                     (1 − R²_{yD,X})SST               N − J − C

For the example data, the necessary quantities are


R²_{yD,X,DX} = R²_{y12345} = 0.944817

and

R²_{yD,X} = R²_{y123} = 0.918903.


Difference or heterogeneity of regression = 0.025914.
Total sum of squares = 361.67.
The summary is as follows:
Source                                    SS                               df    MS      F
Heterogeneity of polynomial regression    (0.025914)361.67 = 9.37           2    4.68    1.41 (p = .32)
Polynomial residual_i                     (1 − 0.944817)361.67 = 19.96      6    3.33
Polynomial residual_w                     (1 − 0.918903)361.67 = 29.33      8
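A self-contained Python sketch of this homogeneity test (again illustrative, not the author's Minitab code) adds the D·X and D·X² products to the quadratic ANCOVA model and compares R² values:

import numpy as np
import statsmodels.api as sm
from scipy import stats

X = np.array([13, 7, 17, 14, 3, 12, 11, 2, 19, 15, 8, 11], dtype=float)
Y = np.array([18, 14, 7, 14, 8, 19, 13, 1, 2, 9, 10, 15], dtype=float)
D = np.array([1] * 6 + [0] * 6, dtype=float)
N, J, C = len(Y), 2, 2                                   # C = degree of the polynomial

quad = sm.OLS(Y, sm.add_constant(np.column_stack([D, X, X ** 2]))).fit()
het = sm.OLS(Y, sm.add_constant(np.column_stack([D, X, X ** 2, D * X, D * X ** 2]))).fit()

sst = np.sum((Y - Y.mean()) ** 2)
ss_het = (het.rsquared - quad.rsquared) * sst            # about 9.37
ss_resi = (1 - het.rsquared) * sst                       # about 19.96
F_het = (ss_het / (C * (J - 1))) / (ss_resi / (N - J * (C + 1)))   # about 1.41
p_het = stats.f.sf(F_het, C * (J - 1), N - J * (C + 1))            # about .32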

The obtained F-value is clearly not significant; we conclude that there is little
evidence to argue that the population quadratic regressions for the experimental
and control groups are different. The quadratic ANCOVA model is accepted as a
reasonable representation of the data.
Comparison of Quadratic ANCOVA with Other Models
It was mentioned earlier that the complexity of the model employed should be
sufficient to adequately describe the data but that it should not be more complex than
is required. The results of applying four different models to the data of the example
problem are tabulated as follows:
Model               Obtained F    Degrees of Freedom    p-value
ANOVA                  2.62             1, 10            .137
Linear ANCOVA          2.38             1, 9             .157
Quadratic ANCOVA      11.82             1, 8             .009
Cubic ANCOVA           9.96             1, 7             .016

The F of the simplest model, ANOVA, when compared with the linear ANCOVA
F, illustrates the fact that ANOVA can be more powerful than ANCOVA when the
correlation between the covariate and the dependent variable is low. The F of the
most complex of the four models, cubic ANCOVA, when compared with the quadratic
F, illustrates the fact that more complex models do not necessarily lead to greater
precision. The greatest precision is obtained with the model that is neither too simple
nor more complex than is necessary for an adequate fit.
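The whole comparison can be reproduced with a short Python loop over polynomial degrees; this sketch (statsmodels and scipy assumed) is illustrative and is not the Minitab code used in the text:

import numpy as np
import statsmodels.api as sm
from scipy import stats

X = np.array([13, 7, 17, 14, 3, 12, 11, 2, 19, 15, 8, 11], dtype=float)
Y = np.array([18, 14, 7, 14, 8, 19, 13, 1, 2, 9, 10, 15], dtype=float)
D = np.array([1] * 6 + [0] * 6, dtype=float)
N, J = len(Y), 2

for degree in range(4):                    # 0 = ANOVA, 1 = linear, 2 = quadratic, 3 = cubic
    covs = [X ** p for p in range(1, degree + 1)]
    full = sm.OLS(Y, sm.add_constant(np.column_stack([D] + covs))).fit()
    r2_reduced = (sm.OLS(Y, sm.add_constant(np.column_stack(covs))).fit().rsquared
                  if covs else 0.0)
    df_err = N - J - degree
    F = ((full.rsquared - r2_reduced) / (J - 1)) / ((1 - full.rsquared) / df_err)
    print(f"degree {degree}: F = {F:.2f} on ({J - 1}, {df_err}) df, "
          f"p = {stats.f.sf(F, J - 1, df_err):.3f}")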
Minitab Input and Output
Input for estimating the linear ANCOVA model:
MTB > ancova Y=d;
SUBC>   covariate X;
SUBC>   means d;
SUBC>   residuals c7.

Output for linear ANCOVA:


ANCOVA: Y versus d

Factor  Levels  Values
d            2  0, 1

Analysis of Covariance for Y
Source      DF  Adj SS     MS     F      P
Covariates   1    3.41   3.41  0.11  0.749
d            1   75.00  75.00  2.38  0.157
Error        9  283.25  31.47
Total       11  361.67

S = 5.61004   R-Sq = 21.68%   R-Sq(adj) = 4.28%

Covariate    Coef  SE Coef       T      P
X          0.1067    0.324  0.3293  0.749

Adjusted Means
d  N       Y
0  6   8.333
1  6  13.333
MTB > Plot 'ANCOVA Residuals'*'X';
SUBC>   Symbol 'd'.

[Figure: Scatterplot of linear ANCOVA residuals versus X, with separate plotting symbols for d = 0 and d = 1.]

It is obvious from inspecting the plot of the residuals of the linear ANCOVA model
shown above that this model is inappropriate. A quadratic model appears to be a good
contender so it is estimated next.
Input to compute quadratic ANCOVA. The variable d is a (1, 0) dummy variable
indicating group membership, c2 = the covariate X, and c3 = X².
MTB > ancova Y=d;
SUBC>   covariates c2 c3;
SUBC>   means d;
SUBC>   residuals c8.

ANCOVA: Y versus d

Factor  Levels  Values
d            2  0, 1

Analysis of Covariance for Y
Source      DF  Adj SS      MS      F      P
Covariates   2  257.34  128.67  35.10  0.000
d            1   43.33   43.33  11.82  0.009
Error        8   29.33    3.67
Total       11  361.67

S = 1.91472   R-Sq = 91.89%   R-Sq(adj) = 88.85%

Covariate     Coef  SE Coef       T      P
X           3.6694   0.4421   8.299  0.000
X*X        -0.1753   0.0211  -8.322  0.000

Adjusted Means
d  N       Y
0  6   8.918
1  6  12.749
MTB > Plot 'Quad ANCOVA Residuals'*'X';
SUBC>   Symbol 'd'.

[Figure: Scatterplot of quadratic ANCOVA residuals versus X, with separate plotting symbols for d = 0 and d = 1.]


Note that the residuals of the quadratic ANCOVA model indicate no additional
forms of nonlinearity or other departures from assumptions. This is confirmed by
estimating the cubic ANCOVA model. Note in the output that the p-value on the
cubic coefficient is .77.
Input for estimating the cubic ANCOVA model:
MTB > Let c9 = X*X*X
MTB > ancova Y=d;
SUBC>   covariates c2 c3 c9;
SUBC>   means d;
SUBC>   residuals c10.

Output for cubic ANCOVA model:


ANCOVA: Y versus d

Factor  Levels  Values
d            2  0, 1

Analysis of Covariance for Y
Source      DF   Adj SS      MS      F      P
Covariates   3  257.720  85.907  20.77  0.001
d            1   41.179  41.179   9.96  0.016
Error        7   28.947   4.135
Total       11  361.667

S = 2.03354   R-Sq = 92.00%   R-Sq(adj) = 87.42%

Covariate      Coef  SE Coef        T      P
X            3.2073   1.5909   2.0161  0.084
X*X         -0.1231   0.1733  -0.7104  0.500
X*X*X       -0.0016   0.0054  -0.3040  0.770

Adjusted Means
d  N       Y
0  6   8.945
1  6  12.722

12.4 SUMMARY
The assumption of the conventional ANCOVA model that the covariate and the dependent variable are linearly related will not always be met. Severe nonlinearity
generally can be easily identified by inspecting the XY scatter plot within groups.
If the relationship is nonlinear but monotonic, it is likely that a simple transformation
(generally of the X variable) can be found that will yield a linear relationship
between transformed X and Y. Analysis of covariance is then applied by using the
transformed variable as the covariate. If the relationship is not monotonic, the simple
transformation approach will not be satisfactory, and the more complex approach of
employing some polynomial of X should be attempted. Generally, a quadratic or cubic
ANCOVA model will fit the data. Complex polynomial models should be employed
only if simpler ones are obviously inadequate. Simpler models are preferred because
results based on complex models are (1) more difficult to interpret and generalize
and (2) less stable. When polynomial ANCOVA models are clearly called for, the
computation involves a straightforward extension of multiple ANCOVA.
