Interpreting Summary Output from

Excel
$\hat{Y}_i = \hat{B}_0 + \hat{B}_1 P + \hat{B}_2 W$
$\hat{Y}_i = 474.05 + 1.46 P + 26.32 W$
SUMMARY OUTPUT

Regression Statistics
Multiple R           0.540656024
R Square             0.292308937
Adjusted R Square    0.281504493
Standard Error       176.6190143
Observations         134

ANOVA
             df     SS             MS             F              Significance F
Regression   2      1687891.751    843945.8754    27.05451057    1.46138E-10
Residual     131    4086450.184    31194.27621
Total        133    5774341.935

             Coefficients    Standard Error    t Stat         P-value        Lower 95%       Upper 95%
Intercept    474.0476233     43.51108281       10.89487075    4.477E-20      387.9723729     560.1228737
P            1.457121257     1.172416643       1.242835698    0.21614834     -0.862197169    3.776439684
WATER        26.31733728     4.891860687       5.379821498    3.32369E-07    16.64007562     35.99459894
Interpreting Summary Output from
Excel
Regression Statistics
Multiple R 0.540656024
Multiple R: The correlation between $Y_i$ and $\hat{Y}_i$ is 54.1%

Interpreting Summary Output from
Excel
Regression Statistics
Multiple R 0.540656024
R Square 0.292308937
29.23% of the variation in Cotton Lint Yields is explained
by the independent variables: P & W
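As a quick check, R Square can be recomputed from the ANOVA sums of squares in the summary output (SS Regression divided by SS Total), and Multiple R is its square root. The sketch below is illustrative only; the variable names are not Excel's.

```python
import math

# Sums of squares copied from the ANOVA block of the Excel summary output
ss_regression = 1687891.751
ss_total = 5774341.935

r_square = ss_regression / ss_total   # share of the variation in Y explained
multiple_r = math.sqrt(r_square)      # correlation between actual and fitted Y

print(f"R Square   = {r_square:.6f}")
print(f"Multiple R = {multiple_r:.6f}")
```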


Interpreting Summary Output from
Excel
Regression Statistics
Multiple R 0.540656024
R Square 0.292308937
Adjusted R Square 0.281504493
Adjusted R Square corrects R Square for the number of independent
variables. It is used to judge whether an additional independent
variable improves the model.
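The adjustment can be reproduced from R Square, the number of observations n, and the number of independent variables k. This is a minimal sketch using the standard adjusted R Square formula; the variable names are illustrative.

```python
n = 134                  # observations
k = 2                    # independent variables (P and W)
r_square = 0.292308937   # from the Regression Statistics block

# Penalize R Square for the degrees of freedom used by the k regressors
adjusted_r_square = 1 - (1 - r_square) * (n - 1) / (n - k - 1)
print(f"Adjusted R Square = {adjusted_r_square:.6f}")
```

Adding a variable always raises R Square, but it raises Adjusted R Square only if the gain in fit outweighs the lost degree of freedom.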


Interpreting Summary Output from
Excel
Regression Statistics
Multiple R 0.540656024
R Square 0.292308937
Adjusted R Square 0.281504493
Standard Error 176.6190143
The Standard Error is the typical size of the error you would expect
between the predicted and actual values of the dependent variable.
Thus, 176.62 means that a cotton lint yield prediction is expected
to be off by about 176.62 lbs/ac.
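The Standard Error reported by Excel is the square root of the residual mean square from the ANOVA block. A short sketch of that calculation, with illustrative variable names:

```python
import math

ss_residual = 4086450.184   # Residual SS from the ANOVA block
df_residual = 131           # n - k - 1 = 134 - 2 - 1

ms_residual = ss_residual / df_residual      # Residual MS
standard_error = math.sqrt(ms_residual)      # Standard Error of the regression
print(f"Residual MS    = {ms_residual:.4f}")
print(f"Standard Error = {standard_error:.4f} lbs/ac")
```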

Interpreting Summary Output from
Excel
Regression Statistics
Multiple R 0.540656024
R Square 0.292308937
Adjusted R Square 0.281504493
Standard Error 176.6190143
Observations 134
AAEC 4302

ADVANCED
STATISTICAL METHODS IN
AGRICULTURAL RESEARCH
Chapter 12:
Hypothesis Testing
Statistical Hypothesis Testing
Two complementary hypotheses:
Null hypothesis $H_0$
Alternative hypothesis $H_1$

Three sets of hypotheses:
$H_0: B_j = B_j^0$
$H_1: B_j \neq B_j^0$

$H_0: B_j = B_j^0$
$H_1: B_j > B_j^0$

$H_0: B_j = B_j^0$
$H_1: B_j < B_j^0$

Statistical Hypothesis Testing
Basic significance test:
$H_0: B_j = 0$
$H_1: B_j \neq 0$
Decision rule:
Reject $H_0$ if $|\hat{B}_j^*| > B_j^c$

    Reject H_0      |    Do not reject H_0    |      Reject H_0
<-------------------+------------+------------+------------------->
                 -B_j^c          0          B_j^c

Statistical Hypothesis Testing


2 types of mistakes:

                      H_0 is true         H_0 is false
                      (H_1 is false)      (H_1 is true)
________________________________________________________
Reject H_0            Type I error        Correct decision
________________________________________________________
Do not reject H_0     Correct decision    Type II error
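The meaning of a Type I error can be illustrated with a small simulation: if the null hypothesis is true and we reject whenever a standard normal test statistic exceeds the critical value, we should wrongly reject about a share alpha of the time. This is only a sketch; the seed, the number of trials, and alpha = 0.05 are arbitrary choices, not values from the slides.

```python
import random
from statistics import NormalDist

random.seed(0)                               # fixed seed so the run is repeatable
alpha = 0.05
z_c = NormalDist().inv_cdf(1 - alpha / 2)    # two-tailed critical value

trials = 20000
# Each draw is a test statistic generated while H0 is true
rejections = sum(abs(random.gauss(0, 1)) > z_c for _ in range(trials))
type_i_rate = rejections / trials
print(f"Share of true nulls rejected: {type_i_rate:.3f} (alpha = {alpha})")
```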


Statistical Hypothesis Testing
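Consider test statistics defined as:

$Z = \frac{\hat{B}_j - B_j}{\sigma_{\hat{B}_j}}$, which under $H_0: B_j = 0$ becomes $Z^* = \frac{\hat{B}_j}{\sigma_{\hat{B}_j}}$

Decision rule: Reject $H_0$ if $|Z^*| > Z_c$

Linear transformation that yields a random
variable Z that has a normal distribution ($\mu = 0$,
$\sigma = 1$)

Critical value $Z_c$ is determined from $\Pr(|Z| \geq Z_c) = \alpha$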
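The critical value for a two-tailed Z test can be computed directly from the standard normal distribution instead of being read from a table. The helper name `z_critical` below is hypothetical; the sketch relies only on Python's standard-library `statistics.NormalDist`.

```python
from statistics import NormalDist

def z_critical(alpha: float) -> float:
    """Two-tailed critical value Z_c such that Pr(|Z| >= Z_c) = alpha."""
    return NormalDist(mu=0, sigma=1).inv_cdf(1 - alpha / 2)

for alpha in (0.10, 0.05, 0.01):
    print(f"alpha = {alpha:.2f}  Z_c = {z_critical(alpha):.3f}")
```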
Statistical Hypothesis Testing
The t-statistic is defined as

$t_j = \frac{\hat{B}_j}{S[\hat{B}_j]}$

Decision rule: Reject $H_0$ if $|t^*| > t_c$
Statistical Hypothesis Testing
To calculate k+1 t-statistics:

$t_j = \frac{\hat{B}_j - B_j}{S[\hat{B}_j]}, \quad j = 0, \ldots, k$

where $\hat{B}_j$ is the value of $B_j$ estimated using the
OLS formulas, $B_j$ is any assumed true (population) value of
$B_j$, and $S[\hat{B}_j]$ ($j = 0, \ldots, k$) is the standard
error of $\hat{B}_j$
Statistical Hypothesis Testing
The t-statistics are used to test the null hypothesis that the true
unknown population value of $B_j$ is equal to its assumed true
population value (above)

The tests are conducted based on the fact that if the null hypothesis is
correct, the corresponding t-statistic follows a t distribution with
n-k-1 degrees of freedom
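As an illustration, the k+1 = 3 t-statistics and the degrees of freedom for the cotton yield model can be recomputed from the coefficient estimates and standard errors in the Excel output. The dictionary layout and names below are illustrative, and the assumed true value under the null is zero for every coefficient.

```python
# (estimate, standard error) pairs copied from the Excel coefficient table
coefficients = {
    "Intercept": (474.0476233, 43.51108281),
    "P": (1.457121257, 1.172416643),
    "W": (26.31733728, 4.891860687),
}
n, k = 134, 2
df = n - k - 1          # degrees of freedom of the t distribution
hypothesized = 0.0      # assumed true value of each B_j under H0

t_stats = {name: (b - hypothesized) / se for name, (b, se) in coefficients.items()}
for name, t in t_stats.items():
    print(f"{name:<10} t = {t:.3f}  (df = {df})")
```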

Statistical Hypothesis Testing

The t-statistics are also included in the
Excel output

Why use a t test, instead of a z test?

A Z test needs 100+ observations; thus, we usually use the t test,
regardless of the number of observations.
Interpreting Summary Output from
Excel
                          Coefficients    Standard Error    t Stat         P-value
Intercept ($\hat{B}_0$)   474.0476233     43.51108281       10.89487075    4.477E-20
P ($\hat{B}_1$)           1.457121257     1.172416643       1.242835698    0.21614834
W ($\hat{B}_2$)           26.31733728     4.891860687       5.379821498    3.32369E-07

$t_{\hat{B}_1} = \frac{\hat{B}_1 - B_1}{SE_{\hat{B}_1}} = \frac{1.457 - 0}{1.172} = 1.243$
Statistical Hypothesis Testing
Example:
$Y_i = B_0 + B_1 X_1 + B_2 X_2 + U_i$
$\hat{Y}_i = \hat{B}_0 + \hat{B}_1 X_1 + \hat{B}_2 X_2$
$\hat{Y}_i = 474.05 + 1.46 X_1 + 26.32 X_2$
Where: $Y_i$ = Cotton Yields (lbs/ac)
$X_1$ = Phosphorus Fertilizer (lbs/ac)
$X_2$ = Irrigation Water (in/ac)
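[Figure: sampling distribution $P(\hat{B}_1)$ of $\hat{B}_1$, centered at $B_1 = 1.50$, with the estimate $\hat{B}_1 = 1.46$ marked]
Assume: $B_1 = 1.50$, S.E. $= \sigma(\hat{B}_1) = 1.20$
$\hat{B}_1 \sim N(B_1, \sigma^2) \Rightarrow \hat{B}_1 \sim N(1.50, 1.20^2)$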
Statistical Hypothesis Testing
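[Figure: sampling distribution $P(\hat{B}_1)$ of $\hat{B}_1$, centered at $B_1 = 0$, with the estimate $\hat{B}_1 = 1.46$ marked]
Assume: $B_1 = 0$, S.E. $= \sigma(\hat{B}_1) = 1.20$
$\hat{B}_1 \sim N(B_1, \sigma^2) \Rightarrow \hat{B}_1 \sim N(0, 1.20^2)$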
Statistical Hypothesis Testing
What can we conclude about $B_1$?
Since 1.46 is inside the probability distribution,
we cannot be certain that $B_1$ is not zero.
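Statistical Hypothesis Testing
[Figure: sampling distribution $P(\hat{B}_2)$ of $\hat{B}_2$, centered at $B_2 = 0$, with $3\sigma = 15$ marked and the estimate $\hat{B}_2 = 26.32$ lying beyond it]
Assume: $B_2 = 0$, S.E. $= \sigma(\hat{B}_2) = 5$
$\hat{B}_2 \sim N(B_2, \sigma^2) \Rightarrow \hat{B}_2 \sim N(0, 5^2)$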
Statistical Hypothesis Testing
$\hat{B}_2$ is clearly outside the distribution. Therefore, $\hat{B}_2$ is likely
not to belong to this distribution, i.e. $B_2$ is likely not to be equal
to zero.
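The graphical argument can be put in numbers by asking how likely an estimate at least this far from zero would be if the true coefficient were zero. The sketch below uses the rounded standard errors assumed on the slides (1.20 and 5) and a normal distribution; the function name `two_sided_tail` is hypothetical.

```python
from statistics import NormalDist

def two_sided_tail(estimate: float, se: float, null_value: float = 0.0) -> float:
    """Probability of an estimate at least this far from null_value under N(null_value, se^2)."""
    z = abs(estimate - null_value) / se
    return 2 * (1 - NormalDist().cdf(z))

p_b1 = two_sided_tail(1.46, 1.20)   # about 1.2 standard errors from zero
p_b2 = two_sided_tail(26.32, 5.0)   # more than 5 standard errors from zero
print(f"B1: tail probability = {p_b1:.3f}")
print(f"B2: tail probability = {p_b2:.2e}")
```

A large tail probability for $\hat{B}_1$ means the estimate is consistent with a true value of zero; a tiny one for $\hat{B}_2$ means it is not.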
Statistical Hypothesis Testing
Strictly speaking, the t-statistics are only
valid under the following additional
conditions:
1. The error term follows a normal
distribution with a zero mean and a
constant variance for all n
observations, i.e.: $U_i \sim N(0, \sigma^2)$
A zero mean occurs if no relevant
independent variables are left out of
the multiple regression model
Statistical Hypothesis Testing
The dependent variable follows a
normal distribution with a constant
variance across observations
2. The values taken by the dependent
variable in different observations are
not correlated to each other
If $U_i$ (and thus $Y_i$) are not normally
distributed, the t-statistics are roughly
valid if the sample is large enough (more
than 250 observations)
Statistical Hypothesis Testing
The steps of a t-test are:
1. State the hypotheses
2. Choose the level of significance
3. Calculate the value of the test statistic t*
4. Find a critical value from the table (Table A.3)
5. Apply the decision rule
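The steps above can be sketched as a small function. The critical value is not computed here: it has to be looked up in a t table for the chosen significance level and degrees of freedom, and 1.657 is the value the slides use for a two-tailed test at 0.10 significance. The function name `t_test` is hypothetical.

```python
def t_test(estimate, std_error, t_critical, null_value=0.0):
    """Two-tailed t-test; t_critical must come from a t table for the chosen alpha and df."""
    t_star = (estimate - null_value) / std_error   # calculate the test statistic t*
    reject = abs(t_star) > t_critical              # apply the decision rule
    return t_star, reject

# Hypotheses: H0: B1 = 0 against H1: B1 != 0, significance level 0.10
t_star, reject = t_test(1.457121257, 1.172416643, t_critical=1.657)
print(f"P: t* = {t_star:.3f}, reject H0: {reject}")

# Same test for irrigation water (W)
t_star_w, reject_w = t_test(26.31733728, 4.891860687, t_critical=1.657)
print(f"W: t* = {t_star_w:.3f}, reject H0: {reject_w}")
```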
Statistical Hypothesis Testing
In practice, $\alpha$ values are typically 0.10,
0.05 or 0.01, depending on the nature and
objectives of the research:

these indicate three possible levels of
statistical certainty when rejecting $H_0$
(90, 95 and 99%)
Statistical Hypothesis Testing
The decision rule is:

If $|t_j^*| >$ critical t-table value (at
desired $\alpha$ and n-k-1 degrees of
freedom), reject $H_0$ and:
Conclude that $B_j$ is statistically
different from zero
Conclude that $X_j$ affects Y with a
certainty level of $(1 - \alpha)$
Statistical Hypothesis Testing
The decision rules are:

In a one-tailed alternative:
$H_0$ is the same
$H_a: B_j < 0$
The decision rule is:
If $t_j^* < -$(critical t-table value), reject $H_0$
Some General Remarks
When reporting the results of a regression
analysis, it is customary to report either the
standard errors or the t-values in
parentheses below the corresponding
parameter estimate.

$\hat{Y}_i = 474.05 + 1.46 X_1 + 26.32 X_2$
(43.511)*** (1.172) (4.892)***

Where: * Significant at the 90% level, i.e. $\alpha = 0.10$
** Significant at the 95% level, i.e. $\alpha = 0.05$
*** Significant at the 99% level, i.e. $\alpha = 0.01$
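The star notation can be generated mechanically from the P-values in the Excel output. This is a minimal sketch assuming the legend above; the function name `stars` and the row layout are illustrative.

```python
def stars(p_value: float) -> str:
    """Significance stars: * for 0.10, ** for 0.05, *** for 0.01."""
    if p_value < 0.01:
        return "***"
    if p_value < 0.05:
        return "**"
    if p_value < 0.10:
        return "*"
    return ""

# (coefficient, standard error, P-value) copied from the Excel output
rows = {
    "Intercept": (474.0476233, 43.51108281, 4.477e-20),
    "X1": (1.457121257, 1.172416643, 0.21614834),
    "X2": (26.31733728, 4.891860687, 3.32369e-07),
}
for name, (b, se, p) in rows.items():
    print(f"{name:<10} {b:8.2f}  ({se:.3f}){stars(p)}")
```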
Some General Remarks
It is also customary to always conduct a
basic test for the statistical significance of
each of the model's parameters:
test: $H_0: B_j = 0$ for $j = 1, \ldots, k$
Statistical Hypothesis Testing
Example:

$\hat{Y}_i = 474.05 + 1.46 X_1 + 26.32 X_2$

Where: $Y_i$ = Cotton Yields (lbs/ac)
$X_1$ = Phosphorus Fertilizer (lbs/ac)
$X_2$ = Irrigation Water (in/ac)
Statistical Hypothesis Testing

$\hat{Y}_i = 474.05 + 1.46 X_1 + 26.32 X_2$

based on 134 observations
Question: Is $B_1 = 0$?

Test: $H_0: B_1 = 0$
$H_a: B_1 \neq 0$ (two-tailed test)
             Coefficients    Standard Error    t Stat         P-value
Intercept    474.0476233     43.51108281       10.89487075    4.477E-20
P            1.457121257     1.172416643       1.242835698    0.21614834
W            26.31733728     4.891860687       5.379821498    3.32369E-07
Statistical Hypothesis Testing

$\hat{Y}_i = 474.05 + 1.46 X_1 + 26.32 X_2$

Two-tailed test:
$t_1^* = (\hat{B}_1 - B_1)/(S.E.) = (1.457 - 0)/(1.172) = 1.243$
df = (n - k - 1) = (134 - 2 - 1) = 131
Next we must find $t_c$ from Table A.3
Using $\alpha = 0.10$ and df $\approx$ 125 we find $t_c \approx 1.657$
Statistical Hypothesis Testing
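[Figure: t distribution P(t), centered at t = 0]
Statistical Hypothesis Testing
[Figure: t distribution with rejection regions of area $\alpha/2 = 0.05$ in each tail, beyond $-t_c = -1.657$ and $t_c = 1.657$; the calculated $t_1^* = 1.243$ falls between them, in the "do not reject" region]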
Statistical Hypothesis Testing
Since: $|t_1^*| < t_c$, 1.243 < 1.657
We cannot reject the null hypothesis ($H_0$), for $\alpha = 0.10$
(two-tailed test) and df = 131.
Statistical Hypothesis Testing

$\hat{Y}_i = 474.05 + 1.46 X_1 + 26.32 X_2$

based on 134 observations
Question: Is $B_1 = 0$?

Test: $H_0: B_1 = 0$
$H_a: B_1 > 0$ (one-tailed test)
             Coefficients    Standard Error    t Stat         P-value
Intercept    474.0476233     43.51108281       10.89487075    4.477E-20
P            1.457121257     1.172416643       1.242835698    0.21614834
W            26.31733728     4.891860687       5.379821498    3.32369E-07
Statistical Hypothesis Testing

$\hat{Y}_i = 474.05 + 1.46 X_1 + 26.32 X_2$

One-tailed test:
$t_1^* = (\hat{B}_1 - B_1)/(S.E.) = (1.457 - 0)/(1.172) = 1.243$
df = (n - k - 1) = (134 - 2 - 1) = 131
Next we must find $t_c$ from Table A.3
Using $\alpha = 0.10$ and df $\approx$ 125 we find $t_c \approx 1.288$
Statistical Hypothesis Testing
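[Figure: t distribution P(t), centered at t = 0]
Statistical Hypothesis Testing
[Figure: t distribution with a single rejection region in the right tail, beyond $t_c = 1.288$; the calculated $t_1^* = 1.243$ falls just below it]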
Statistical Hypothesis Testing
Since: $t_1^* < t_c$, 1.243 < 1.288
We cannot reject the null hypothesis ($H_0$), for $\alpha = 0.10$
(one-tailed test) and df = 131.
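The same conclusions can be reached from the P-value column instead of the t table. Excel's regression output reports two-tailed P-values; when the estimate has the sign stated in the alternative hypothesis, the one-tailed P-value is half of that. A sketch under those assumptions, with illustrative variable names:

```python
alpha = 0.10
p_two_tailed = 0.21614834    # "P-value" column for P in the Excel output
estimate = 1.457121257       # positive, matching the sign in Ha: B1 > 0

# Halve the two-tailed P-value only when the estimate agrees in sign with Ha
p_one_tailed = p_two_tailed / 2 if estimate > 0 else 1 - p_two_tailed / 2

reject_two_tailed = p_two_tailed < alpha
reject_one_tailed = p_one_tailed < alpha
print(f"two-tailed: p = {p_two_tailed:.3f}, reject H0: {reject_two_tailed}")
print(f"one-tailed: p = {p_one_tailed:.3f}, reject H0: {reject_one_tailed}")
```

Both P-values exceed 0.10, which agrees with the table-based tests: the null hypothesis is not rejected in either case, although the one-tailed test comes much closer.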
Some General Remarks
A rule of thumb is that:

If $|\hat{B}_j| > 2 S[\hat{B}_j]$ (i.e. $|t_j^*| > 2$)

$B_j$ is statistically different from zero, at
least at the 95% level of statistical
certainty.
($\alpha = 0.05$ level of statistical
significance)
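Applied to the cotton yield model, the rule of thumb gives the same verdicts as the formal tests. The sketch below simply compares each t Stat from the Excel output with 2; the variable names are illustrative.

```python
# t Stat column copied from the Excel output
t_stats = {"Intercept": 10.89487075, "P": 1.242835698, "W": 5.379821498}

# Rule of thumb: |t*| > 2 suggests significance at roughly the 95% level
significant = {name: abs(t) > 2 for name, t in t_stats.items()}
for name, flag in significant.items():
    print(f"{name:<10} |t*| > 2: {flag}")
```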
Some General Remarks
One-tail test vs. two-tail test
Advantage
If you properly justify that $X_j$ has only a
positive (negative) effect on the dependent
variable $Y_i$, then the one-tail test will help
you reject the null hypothesis.
Under a one-tail test, the critical t-value is
smaller than the critical t-value under a
two-tail test.

Some General Remarks
One-tail test vs. two-tail test
Disadvantage
If you decide that $X_j$ has only a
positive effect on Y, then you cannot
change your decision after running the
regression.

Some General Remarks
Two-tail test vs. one-tail test
Advantage
It is more flexible than the one-tailed
test because $X_j$ can have either a
positive or negative effect on Y.
Some General Remarks
Two-tail test vs. one-tail test
Disadvantage
It is more difficult to reject the null
hypothesis ($H_0$).
