
T Tests, ANOVA, and Regression Analysis

Here is a one-sample t test of the null hypothesis that mu = 0:


DATA ONESAMPLE; INPUT Y @@;
CARDS;
1 2 3 4 5 6 7 8 9 10
PROC MEANS T PRT; RUN;

------------------------------------------------------------------------------------------------
The SAS System

The MEANS Procedure

Analysis Variable : Y

t Value    Pr > |t|
   5.74      0.0003
------------------------------------------------------------------------------------------------

Now an ANOVA on the same data but with no grouping variable:


PROC ANOVA; MODEL Y = ; run;
------------------------------------------------------------------------------------------------
The SAS System

The ANOVA Procedure

Dependent Variable: Y

                                 Sum of
Source               DF         Squares     Mean Square    F Value    Pr > F
Model                 1     302.5000000     302.5000000      33.00    0.0003
Error                 9      82.5000000       9.1666667
Uncorrected Total    10     385.0000000

R-Square     Coeff Var      Root MSE        Y Mean
0.000000      55.04819      3.027650      5.500000

Source               DF        Anova SS     Mean Square    F Value    Pr > F
Intercept             1     302.5000000     302.5000000      33.00    0.0003
------------------------------------------------------------------------------------------------

Notice that the ANOVA F is simply the square of the one-sample t, and the one-tailed p from the ANOVA is identical to the two-tailed p from the t.
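
The arithmetic behind this equivalence can be checked by hand. The following is a minimal Python sketch, not part of the original SAS program; the function name `one_sample_t` is invented for illustration. It computes the one-sample t for the same ten scores, squares it, and also builds F from the sums of squares the way the ANOVA table does.

```python
import math

def one_sample_t(scores, mu0=0.0):
    """Return (t, df) for a one-sample t test of H0: mean = mu0."""
    n = len(scores)
    mean = sum(scores) / n
    # Sum of squared deviations about the sample mean (the ANOVA error SS).
    ss_error = sum((y - mean) ** 2 for y in scores)
    variance = ss_error / (n - 1)          # error mean square
    std_error = math.sqrt(variance / n)    # standard error of the mean
    return (mean - mu0) / std_error, n - 1

y = list(range(1, 11))                     # the scores 1 through 10
t, df = one_sample_t(y)
f_from_t = t ** 2                          # F on (1, df) degrees of freedom

# The ANOVA builds the same F from sums of squares:
# model (intercept) SS = n * mean^2, error SS = squared deviations from the mean.
n = len(y)
mean = sum(y) / n
ss_model = n * mean ** 2
ss_error = sum((v - mean) ** 2 for v in y)
f_from_anova = ss_model / (ss_error / df)

print(round(t, 2), round(f_from_t, 2), round(f_from_anova, 2))
```

Rounded to two decimals, t is 5.74 and both F values are 33.0, in agreement with the SAS output. The sketch does not compute p values, because the Python standard library has no t or F distribution functions.
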
Now a regression analysis with the model Y = intercept + error:
PROC REG; MODEL Y = ; run;

------------------------------------------------------------------------------------------------
The REG Procedure
Model: MODEL1
Dependent Variable: Y

                               Sum of          Mean
Source               DF       Squares        Square    F Value    Pr > F
Model                 0             0             .
Error                 9      82.50000       9.16667
Corrected Total       9      82.50000

Root MSE            3.02765    R-Square    0.0000
Dependent Mean      5.50000    Adj R-Sq    0.0000
Coeff Var          55.04819

Parameter Estimates

                      Parameter      Standard
Variable     DF        Estimate         Error    t Value    Pr > |t|
Intercept     1         5.50000       0.95743       5.74      0.0003
------------------------------------------------------------------------------------------------

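Notice that the ANOVA is replicated: the t for the intercept (5.74) is the one-sample t, its square is the ANOVA F (33.00), and the p (0.0003) is unchanged.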


Now consider a two-independent-groups t test with pooled variances; the null
hypothesis is mu1 - mu2 = 0:
DATA TWOSAMPLE; INPUT X Y @@;
CARDS;
1 1 1 2 1 3 1 4 1 5
2 6 2 7 2 8 2 9 2 10
PROC TTEST; CLASS X; VAR Y; RUN;

------------------------------------------------------------------------------------------------
The SAS System

T-Tests

Variable    Method    Variances    DF    t Value    Pr > |t|
Y           Pooled    Equal         8      -5.00      0.0011
------------------------------------------------------------------------------------------------

Now an ANOVA on the same data:


PROC ANOVA; CLASS X; MODEL Y = X; RUN;

------------------------------------------------------------------------------------------------
The ANOVA Procedure

Dependent Variable: Y

                                 Sum of
Source               DF         Squares     Mean Square    F Value    Pr > F
Model                 1     62.50000000     62.50000000      25.00    0.0011
Error                 8     20.00000000      2.50000000
Corrected Total       9     82.50000000

R-Square     Coeff Var      Root MSE        Y Mean
0.757576      28.74798      1.581139      5.500000

Source               DF        Anova SS     Mean Square    F Value    Pr > F
X                     1     62.50000000     62.50000000      25.00    0.0011
------------------------------------------------------------------------------------------------

Notice that the ANOVA F is simply the square of the independent samples t, and
the one-tailed ANOVA p is identical to the two-tailed p from the t.
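
The same check can be made for the two-group case. This is a small Python sketch under the usual textbook formulas, not part of the original SAS program; the helper names `mean` and `ss_within` are invented for illustration. It shows that the pooled variance in the t test is the ANOVA error mean square, which is why t squared equals F.

```python
import math

group1 = [1, 2, 3, 4, 5]     # Y scores where X = 1
group2 = [6, 7, 8, 9, 10]    # Y scores where X = 2

def mean(values):
    return sum(values) / len(values)

def ss_within(values):
    """Sum of squared deviations of the values about their own mean."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)

n1, n2 = len(group1), len(group2)
df_error = n1 + n2 - 2

# Pooled-variance t: the pooled variance is the ANOVA error mean square.
ms_error = (ss_within(group1) + ss_within(group2)) / df_error
std_error = math.sqrt(ms_error * (1 / n1 + 1 / n2))
t = (mean(group1) - mean(group2)) / std_error

# One-way ANOVA F: between-groups mean square over error mean square.
grand_mean = mean(group1 + group2)
ss_model = (n1 * (mean(group1) - grand_mean) ** 2
            + n2 * (mean(group2) - grand_mean) ** 2)
f = (ss_model / 1) / ms_error   # 1 model df because there are two groups

print(round(t, 2), round(f, 2), round(t ** 2, 2))
```

Here t comes out as -5.0 and both F and t squared as 25.0, with a model SS of 62.5 and an error mean square of 2.5, as in the SAS output.
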

And finally replication of the ANOVA with a regression analysis:


PROC REG; MODEL Y = X; run;

------------------------------------------------------------------------------------------------
The SAS System

The REG Procedure
Model: MODEL1
Dependent Variable: Y

Number of Observations Read    10
Number of Observations Used    10

Analysis of Variance

                               Sum of          Mean
Source               DF       Squares        Square    F Value    Pr > F
Model                 1      62.50000      62.50000      25.00    0.0011
Error                 8      20.00000       2.50000
Corrected Total       9      82.50000

Root MSE            1.58114    R-Square    0.7576
Dependent Mean      5.50000    Adj R-Sq    0.7273
Coeff Var          28.74798

Parameter Estimates

                      Parameter      Standard
Variable     DF        Estimate         Error    t Value    Pr > |t|
Intercept     1        -2.00000       1.58114      -1.26      0.2415
X             1         5.00000       1.00000       5.00      0.0011
------------------------------------------------------------------------------------------------
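
The regression estimates can also be reproduced with ordinary least squares formulas. The following Python sketch is an illustration only, not the author's program. It treats the group code X (1 or 2) as a numeric predictor, as PROC REG does, and computes the slope, the intercept, and the t for the slope.

```python
import math

x = [1] * 5 + [2] * 5            # group code used as the predictor
y = list(range(1, 11))           # the same ten scores

n = len(y)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Least-squares slope and intercept.
sxx = sum((xi - mean_x) ** 2 for xi in x)
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Error mean square from the residuals, on n - 2 degrees of freedom.
ss_error = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
ms_error = ss_error / (n - 2)

# t for the slope; its square is the regression (and ANOVA) F.
se_slope = math.sqrt(ms_error / sxx)
t_slope = slope / se_slope

print(slope, intercept, round(t_slope, 2), round(t_slope ** 2, 2))
```

The slope of 5 is the difference between the two group means (8 and 3), so its t of 5.00 has the same magnitude as the independent samples t. The sign differs only because PROC TTEST subtracts the second group mean from the first.
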

OK, but what if we have more than two groups? Show me that the ANOVA is a
regression analysis in that case.
Here is the SAS program, with data:
data Lotus;
input Dose N; Do I=1 to N; Input Illness @@; output; end;
cards;
0 20
101 101 101 104 104 105 110 111 111 113 114 79 89 91 94 95 96 99 99 99
10 20
100 65 65 67 68 80 81 82 85 87 87 88 88 91 92 94 95 94 96 96
20 20
64 75 75 76 77 79 79 80 80 81 81 81 82 83 83 85 87 88 90 96
30 20
100 105 108 80 82 85 87 87 87 89 90 90 92 92 92 95 95 97 98 99
40 20
101 102 102 105 108 109 112 119 119 123 82 89 92 94 94 95 95 97 98 99
*****************************************************************************;
proc GLM data=Lotus; class Dose;
model Illness = Dose / ss1;
title 'Here we have a traditional one-way independent samples ANOVA'; run;
*****************************************************************************;
data Polynomial; set Lotus; Quadratic=Dose*Dose; Cubic=Dose**3;
Quartic=Dose**4;

proc GLM data=Polynomial; model Illness = Dose Quadratic Cubic Quartic / ss1;
title 'Here we have a polynomial regression analysis.'; run;

*****************************************************************************;
Here is the output:
Here we have a traditional one-way independent samples ANOVA

The GLM Procedure

Dependent Variable: Illness

                                 Sum of
Source               DF         Squares     Mean Square    F Value    Pr > F
Model                 4      6791.54000      1697.88500      20.78    <.0001
Error                95      7762.70000        81.71263
Corrected Total      99     14554.24000

R-Square     Coeff Var      Root MSE    Illness Mean
0.466637      9.799983      9.039504        92.24000

Source               DF       Type I SS     Mean Square    F Value    Pr > F
Dose                  4     6791.540000     1697.885000      20.78    <.0001
------------------------------------------------------------------------------------------------

Here we have a polynomial regression analysis.

The GLM Procedure

Number of observations    100

Dependent Variable: Illness

                                 Sum of
Source               DF         Squares     Mean Square    F Value    Pr > F
Model                 4      6791.54000      1697.88500      20.78    <.0001
Error                95      7762.70000        81.71263
Corrected Total      99     14554.24000

Note that the polynomial regression produced exactly the same F, p, SS, and MS as the
traditional ANOVA.

R-Square     Coeff Var      Root MSE    Illness Mean
0.466637      9.799983      9.039504        92.24000

Source               DF       Type I SS     Mean Square    F Value    Pr > F
Dose                  1      174.845000      174.845000       2.14    0.1468
Quadratic             1     6100.889286     6100.889286      74.66    <.0001
Cubic                 1      389.205000      389.205000       4.76    0.0315
Quartic               1      126.600714      126.600714       1.55    0.2163
------------------------------------------------------------------------------------------------
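
One way to see why the two analyses agree: with five equally spaced doses and equal group sizes, the four sequential (Type I) sums of squares are the sums of squares for the linear, quadratic, cubic, and quartic trend contrasts among the group means, and together they use up all four between-groups degrees of freedom. The Python sketch below is an illustration, not part of the original program. The orthogonal polynomial coefficients are the standard tabled values for five equally spaced levels and are an assumption added here.

```python
# Illness scores by dose, copied from the data step in the SAS program.
illness = {
    0:  [101, 101, 101, 104, 104, 105, 110, 111, 111, 113,
         114, 79, 89, 91, 94, 95, 96, 99, 99, 99],
    10: [100, 65, 65, 67, 68, 80, 81, 82, 85, 87,
         87, 88, 88, 91, 92, 94, 95, 94, 96, 96],
    20: [64, 75, 75, 76, 77, 79, 79, 80, 80, 81,
         81, 81, 82, 83, 83, 85, 87, 88, 90, 96],
    30: [100, 105, 108, 80, 82, 85, 87, 87, 87, 89,
         90, 90, 92, 92, 92, 95, 95, 97, 98, 99],
    40: [101, 102, 102, 105, 108, 109, 112, 119, 119, 123,
         82, 89, 92, 94, 94, 95, 95, 97, 98, 99],
}

# Orthogonal polynomial coefficients for five equally spaced levels
# (standard tabled values; an addition, not part of the original program).
contrasts = {
    "Dose (linear)": [-2, -1, 0, 1, 2],
    "Quadratic":     [2, -1, -2, -1, 2],
    "Cubic":         [-1, 2, 0, -2, 1],
    "Quartic":       [1, -4, 6, -4, 1],
}

doses = sorted(illness)
n = len(illness[doses[0]])                       # 20 scores per group
means = [sum(illness[d]) / n for d in doses]
grand_mean = sum(means) / len(means)             # valid because n is equal

# Between-groups (Model) and within-groups (Error) sums of squares.
ss_between = n * sum((m - grand_mean) ** 2 for m in means)
ss_error = sum((v - m) ** 2 for d, m in zip(doses, means) for v in illness[d])
ms_error = ss_error / (len(doses) * (n - 1))     # 95 error df
f_model = (ss_between / (len(doses) - 1)) / ms_error

# Each trend's SS: n * (sum of c_j * mean_j)^2 / (sum of c_j^2).
trend_ss = {}
for name, coefs in contrasts.items():
    estimate = sum(c * m for c, m in zip(coefs, means))
    trend_ss[name] = n * estimate ** 2 / sum(c * c for c in coefs)

for name, ss in trend_ss.items():
    print(f"{name:14s} {ss:12.6f} F = {ss / ms_error:6.2f}")
print(f"{'Sum of trends':14s} {sum(trend_ss.values()):12.6f}")
print(f"{'Between groups':14s} {ss_between:12.6f} F = {f_model:6.2f}")
```

The four trend sums of squares match the Type I SS column of the polynomial regression, and their total equals the Model SS of 6791.54 from the one-way ANOVA. This shortcut relies on equal spacing and equal n; with unequal group sizes the tabled coefficients would no longer reproduce the sequential sums of squares.
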

Return to Wuensch's Stats Lessons Page


November, 2006
