Coefficient of Determination
Unit 3

Coefficient of Determination, r^2
Once we've decided it's appropriate to use
a line, we need to think about assessing the
accuracy of predictions.

Coefficient of Determination, r^2
Suppose we wish to predict the price of homes in a
particular city. We take a random sample of 20
houses to get y = price and x = size (our housing
data).
Clearly, we are going to get some variability in the price,
since houses differ in price.
How much of this variability in price can be explained by the
fact that price is related to size and houses differ in size?
If a lot of the variation in price can be accounted for by
house size, a prediction of price based on house size will be
a big improvement over a prediction not based on house
size.
If we did not use house size, our best guess for the price of a
house would be the average price of our sample (y-bar).

Coefficient of Determination, r^2
The Coefficient of Determination, r^2, is
the proportion of variation in y that can
be attributed to (or explained by) the
approximate linear relationship between
x and y.

Coefficient of Determination, r^2
r^2 is useful because:
it gives the proportion of the variance
(fluctuation) of one variable that is
predictable from the other variable
it tells us how much of the variability in
the y's can be explained by the fact that
they are related to x

Let's look at a formula


We find the total variation in y (SSTotal)
$SSTotal = \sum (y_i - \bar{y})^2$
It is also called SSM (Sum of Squares about the
mean)
It is the variation around the mean; it looks like
variance
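As a rough illustration of the SSTotal formula, here is a minimal Python sketch. The prices are made-up numbers (not the housing data from the slides), and the function name ss_total is only an illustrative choice.

```python
import statistics

# Hypothetical prices (in $1000s) for five houses; NOT data from the slides.
prices = [150.0, 200.0, 250.0, 300.0, 350.0]

def ss_total(y):
    """Sum of squared deviations of y about its mean (SSTotal, also called SSM)."""
    y_bar = sum(y) / len(y)
    return sum((yi - y_bar) ** 2 for yi in y)

total = ss_total(prices)
print(total)  # 25000.0

# SSTotal "looks like variance": dividing it by n - 1 gives the sample variance.
print(total / (len(prices) - 1), statistics.variance(prices))
```

The last line is one way to see the "looks like variance" remark: SSTotal is the numerator of the sample variance.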

Formula continued
We then find the Sum of Squared
Residuals (SSR)
$SSR = \sum (y_i - \hat{y}_i)^2$
It is also called SSE, or sum of squares of error
This is sometimes referred to as a measure of
the unexplained variation, or the amount of
variation in y that cannot be attributed to the
linear relationship between x and y
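The sum of squared residuals can be sketched in Python as follows. The data are invented for illustration (sizes in 1000s of square feet, prices in $100,000s), and the helper names least_squares_line and ss_residual are assumptions, not anything defined in the slides.

```python
# Hypothetical data, NOT from the slides:
# size in 1000s of sq ft, price in $100,000s.
sizes = [1.0, 2.0, 3.0, 4.0, 5.0]
prices = [2.0, 4.0, 5.0, 4.0, 5.0]

def least_squares_line(x, y):
    """Return (intercept, slope) of the least-squares regression line of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

def ss_residual(x, y):
    """Sum of squared residuals (SSR, also called SSE) about the fitted line."""
    a, b = least_squares_line(x, y)
    return sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))

print(round(ss_residual(sizes, prices), 4))  # 2.4
```

For these made-up numbers the fitted line is y-hat = 2.2 + 0.6x, and the residuals are -0.8, 0.6, 1.0, -0.6 and -0.2, whose squares add up to 2.4.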

Formula continued
This gives us
$r^2 = 1 - \frac{SSR}{SSTotal}$
If I multiply by 100, I get the percentage of
y variation attributable to the approximate
linear relationship between x and y.
The book uses the formula:
$r^2 = \frac{SSM - SSE}{SSM}$
This is the same quantity, since SSM = SSTotal
and SSE = SSR.
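To check that the two ways of writing the formula agree, here is a small Python sketch. It uses invented data (not the slides' housing sample), and the function name sums_of_squares is hypothetical.

```python
# Hypothetical data, NOT from the slides:
# size in 1000s of sq ft, price in $100,000s.
sizes = [1.0, 2.0, 3.0, 4.0, 5.0]
prices = [2.0, 4.0, 5.0, 4.0, 5.0]

def sums_of_squares(x, y):
    """Return (SSTotal, SSR) for the least-squares line of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
             / sum((xi - x_bar) ** 2 for xi in x))
    intercept = y_bar - slope * x_bar
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    return ss_tot, ss_res

ss_tot, ss_res = sums_of_squares(sizes, prices)

r2_first = 1 - ss_res / ss_tot          # 1 - SSR / SSTotal
r2_book = (ss_tot - ss_res) / ss_tot    # (SSM - SSE) / SSM

print(round(r2_first, 4), round(r2_book, 4))  # 0.6 0.6
print(f"{100 * r2_first:.0f}% of the variation in price is explained by size")
```

With these made-up numbers SSTotal is 6 and SSR is 2.4, so both forms give r^2 = 0.6, or 60% when multiplied by 100.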

Couple of Examples:

The variation of each observation (y) from the line (y-hat) is small.
The line explains the variation in y very well
High r, high r^2

The variation of each observation (y) from the line (y-hat) is
not really small.
The line doesn't explain the variation in y as well.
Poor r, poor r^2

Example
Suppose from our strong example that
r = 0.9; then r^2 = 0.81
This means that 81% of the variation in the y
variable is accounted for by the linear
relationship between x and y
Suppose for the other model:
r = -0.4; then r^2 = 0.16
This means that only 16% of the variation in the
y variable is accounted for by the linear
relationship
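The arithmetic in these two examples can be reproduced with a short Python sketch. The function name percent_explained is an illustrative assumption.

```python
def percent_explained(r):
    """Percentage of the variation in y accounted for by the linear relationship."""
    return 100 * r ** 2

for r in (0.9, -0.4):
    print(f"r = {r:+.1f} -> r^2 = {r ** 2:.2f} "
          f"({percent_explained(r):.0f}% of the variation in y)")
```

Squaring removes the sign, so r = -0.4 and r = 0.4 give the same r^2. The direction of the relationship has to be read from r or from the slope, not from r^2.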

Some points
Always use in context
Must interpret r^2 with our sentence. Do
not say:
"The regression equation can predict 81% of
the data points"
"81% of data points lie on the LSRL"
"LSRL accounts for 81% of the data points"

Properties of r^2
Properties to note:
r^2 ranges in value from 0 to 1.0
The magnitude of r^2 is proportional to the
strength of the linear relation between x and y
The location of the r^2 value relative to 0 and 1.0
indicates how close the linear relation is to a
perfect linear relation versus no linear relation

Examples
An r^2 value of 0.75 indicates that the linear relation
is three-quarters of the distance between:
No linear relation between x and y
A perfect linear relation between x and y.

If the r^2 value between motivation to learn and
classroom achievement equals 0.16 for females
and 0.04 for males, we can conclude that the linear
relation between these two variables is 4 times as
strong for females as it is for males.
An r^2 value between systolic blood pressure and
age equal to 0.38 implies that 38% of the variability
in systolic blood pressure is accounted for by its
linear relation with age.

Standard Deviation about the LSRL

$s_e = \sqrt{\frac{SSR}{n - 2}}$
This measures the typical amount by
which an observation deviates from
the LSRL (analogous to sample
standard deviation)
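A minimal Python sketch of the standard deviation about the LSRL, using invented data (not the slides' housing sample). The function name std_dev_about_line is hypothetical.

```python
import math

# Hypothetical data, NOT from the slides:
# size in 1000s of sq ft, price in $100,000s.
sizes = [1.0, 2.0, 3.0, 4.0, 5.0]
prices = [2.0, 4.0, 5.0, 4.0, 5.0]

def std_dev_about_line(x, y):
    """Standard deviation about the least-squares line: sqrt(SSR / (n - 2))."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
             / sum((xi - x_bar) ** 2 for xi in x))
    intercept = y_bar - slope * x_bar
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    return math.sqrt(ss_res / (n - 2))

s_e = std_dev_about_line(sizes, prices)
print(round(s_e, 3))  # 0.894
```

Here SSR is 2.4 and n - 2 is 3, so s_e is the square root of 0.8. In the units of this made-up example, a typical observation sits about 0.89 (roughly $89,000) away from the line.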

Example

Homework
Textbook pp. 190-196
# 15, 16, 31, 32, 47

Anova Table worksheet


Chapter 9 Project due October 12th!!
Unit 3 Test October 14th!!

Example
3.36.pdf
anova tables.pdf
anova answers.pdf
