
CHAPTER 2

REGRESSION AND CORRELATION
Agenda

Objective
Introduction
r-Pearson’s Coefficient of Correlation
Spearman’s Coefficient of Rank Correlation
Linear Regression
The End
Objective
At the end of this chapter, you should be able to:

1. Calculate r-Pearson’s coefficient of correlation and determine its type.
2. Calculate Spearman’s coefficient of rank correlation and determine its type.
3. Find the regression line and estimate y given x.
Introduction
• In this chapter, you will study the relationship between two variables in mathematical form through the concepts of correlation and regression.
• For the coefficient of correlation, you’ll learn two methods of calculation:
i. r-Pearson’s coefficient of correlation
ii. Spearman’s coefficient of rank correlation
• For regression, you’ll learn how to obtain a straight-line equation, called the regression equation, using the least squares method.
Introduction – Cont…
Once data on two variables have been collected, for example the height (x-value) and weight (y-value) of adults, we want to see whether there is any relationship between the two variables. Plotting the points on a graph, called a scatter diagram (Figure 1), shows the distribution pattern of the data. From the pattern, we can deduce whether a relationship exists and, if so, whether it is positive linear, negative linear or non-linear.

Figure 1 : Scatter Diagram


Introduction – Cont…
A scatter diagram can only show whether a correlation exists and what type it is. To measure the strength of the relationship between the two variables, a statistic called __________________________ is used.

There are two methods to calculate the coefficient of correlation:
1. r-Pearson’s coefficient of correlation
2. Spearman’s coefficient of rank correlation
• The value of the coefficient of correlation lies between –1.0 and 1.0.

• A value very close to –1.0 means the two variables have a strong negative relationship. A negative correlation means that as the x-value increases, the y-value decreases.

• A value very close to 1.0 means the two variables have a strong positive relationship. A positive correlation means that as the x-value increases, the y-value also increases.
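To see how the sign and size of the coefficient turn into a description of its type, here is a minimal Python sketch. The function name describe_correlation and the cutoffs 0.8 and 0.5 are assumptions made for illustration only; the chapter itself says just that values very close to -1.0 or 1.0 indicate a strong relationship.

```python
def describe_correlation(r, strong_cutoff=0.8, moderate_cutoff=0.5):
    """Return a short description of a correlation coefficient r.

    The two cutoffs are assumed, illustrative thresholds, not values
    taken from the chapter.
    """
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1.0 and 1.0")
    if r == 0:
        return "no linear correlation"

    # The sign gives the direction of the relationship.
    direction = "positive" if r > 0 else "negative"

    # The distance from zero gives the strength.
    size = abs(r)
    if size >= strong_cutoff:
        strength = "strong"
    elif size >= moderate_cutoff:
        strength = "moderate"
    else:
        strength = "weak"
    return f"{strength} {direction} correlation"


print(describe_correlation(0.96))
print(describe_correlation(-0.35))
```

Different textbooks draw the line between weak, moderate and strong at different places, so treat the labels as a guide rather than a rule.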

(### refer to corr & relation_few examples file)

Introduction – Cont…

Figure 2: Scatter diagrams for various values of r


r-Pearson’s Coefficient of Correlation
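This method measures the strength of the linear relationship between two quantitative variables only. It is suitable when the data contain no extreme (very small or very large) values. The formula is

$$r = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left[n\sum x^2 - \left(\sum x\right)^2\right]\left[n\sum y^2 - \left(\sum y\right)^2\right]}}$$

where n is the number of (x, y) pairs.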
Example 1
Calculate the r–Pearson’s coefficient of correlation of the following data.
Determine the type of coefficient of correlation obtained.
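A calculation of this kind can be checked with a short Python sketch that builds r from the column sums you would put in a working table. The function name pearson_r and the height and weight values are hypothetical; they are not the data of Example 1.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r from raw sums:

    r = (n*Sxy - Sx*Sy) / sqrt((n*Sxx - Sx^2) * (n*Syy - Sy^2))
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 pairs")
    n = len(xs)

    # Column totals of the usual working table: x, y, xy, x^2, y^2.
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)

    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    if denominator == 0:
        raise ValueError("r is undefined when x or y is constant")
    return numerator / denominator


# Hypothetical heights (cm) and weights (kg), for illustration only.
heights = [150, 155, 160, 165, 170, 175]
weights = [52, 55, 61, 63, 70, 72]
print(round(pearson_r(heights, weights), 3))
```

Points that lie exactly on a rising straight line give r = 1, and points exactly on a falling line give r = -1, which is a quick sanity check for any implementation.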
Spearman’s Coefficient of Rank Correlation
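Spearman’s coefficient of rank correlation measures a linear or non-linear relationship between two variables by comparing their ranks. It can also be used to measure the degree of relationship for qualitative variables that can be ranked. The formula is

$$r_s = 1 - \frac{6\sum d^2}{n(n^2 - 1)}$$

where d is the difference between the rank of x and the rank of y for each pair, and n is the number of pairs.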
Example 2
Calculate the Spearman’s coefficient of rank correlation for the following
data. Then, determine its type.

Solution:
Rank x from the smallest (rx = 1) to the largest. If two or more values of x are equal, each receives the average of the ranks they would otherwise occupy. There are three values of x = 107, which occupy ranks 3, 4 and 5. Hence, the rank assigned to each x = 107 is (3 + 4 + 5)/3 = 4.
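The ranking rule for tied values can be sketched in Python as follows. The function names average_ranks and spearman_rs are hypothetical, and the x values are made up so that three of them equal 107 and two values are smaller. The sketch uses the common form r_s = 1 - 6Σd²/(n(n² - 1)) with d the difference in ranks; when ties are present this form is only an approximation, although it is the one usually applied at this level.

```python
def average_ranks(values):
    """Rank from smallest (rank 1) upward; tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the last position holding the same value as position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        # Positions i..j correspond to ranks i+1..j+1; average them.
        shared = (i + 1 + j + 1) / 2
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rs(xs, ys):
    """r_s = 1 - 6*sum(d^2) / (n*(n^2 - 1)), with d = rank(x) - rank(y)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 pairs")
    n = len(xs)
    rank_x, rank_y = average_ranks(xs), average_ranks(ys)
    sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


# Hypothetical x values: 100 and 105 take ranks 1 and 2, the three 107s
# would occupy ranks 3, 4 and 5, so each receives (3 + 4 + 5) / 3 = 4.
xs = [110, 107, 100, 107, 105, 107]
print(average_ranks(xs))  # [6.0, 4.0, 1.0, 4.0, 2.0, 4.0]
```

The same average_ranks function is applied to y, so ties in either variable are handled in the same way.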
Linear Regression
When two variables have a strong relationship, we can estimate the value of one variable given the value of the other. The method used for estimation is the least squares regression line. The line has the form
y = ax + b.
In this equation y is ____________
x is ______________
a is the gradient of the straight line
b is the y-intercept.
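Least Squares Linear Regression
• The least squares linear regression line is the best-fit line on the scatter diagram. It is the line that minimizes the sum of the squared vertical distances from the plotted points to the line.
• The equation of the regression line of y on x is y = ax + b, where

$$a = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - \left(\sum x\right)^2}, \qquad b = \bar{y} - a\bar{x}$$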
Example 5-3
Find the least squares linear regression line of y on x for the following data.
Then estimate the value of y when x = 5.
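Fitting a line and then estimating y can be sketched in Python as below. The function name least_squares_line and the data are hypothetical, not the values of this example. The gradient is computed in the deviation form a = Σ(x - x̄)(y - ȳ) / Σ(x - x̄)², which is algebraically the same as the sums-based form, and the intercept is b = ȳ - a·x̄.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for the least squares line y = a*x + b."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 pairs")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Sums of products and squares of the deviations from the means.
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    if s_xx == 0:
        raise ValueError("the gradient is undefined when all x are equal")

    a = s_xy / s_xx          # gradient
    b = mean_y - a * mean_x  # y-intercept
    return a, b


# Hypothetical data, for illustration only.
xs = [1, 2, 3, 4, 6]
ys = [3, 4, 6, 7, 10]
a, b = least_squares_line(xs, ys)

# Estimate y by substituting x = 5 into the fitted line.
y_at_5 = a * 5 + b
print(f"y = {a:.4f}x + {b:.4f}; estimate at x = 5: {y_at_5:.4f}")
```

An estimate is most trustworthy when x lies inside the range of the observed x values, as x = 5 does here.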
TRY
Refer to the table.
a. Find the least squares regression line for the data on incomes and food expenditures of seven households given in the table above. Use income as the independent variable and food expenditure as the dependent variable.
[a = 0.2642, b = 1.1414 ; y = 0.2642x + 1.1414]
b. Calculate the correlation coefficient for the example on incomes and food expenditures of seven households.
[r = 0.96 (strong positive correlation)]
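One way to check answers like these is to compute the column sums and substitute them into the formulas. In the Python sketch below, the seven (income, food expenditure) pairs are an assumption: the table is not reproduced in this text, and these values were chosen because they reproduce the quoted answers. Replace them with the values from your own table.

```python
import math

# ASSUMED data for seven households (not taken from the table itself).
income = [35, 49, 21, 39, 15, 28, 25]   # x, independent variable
food = [9, 15, 7, 11, 5, 8, 9]          # y, dependent variable

n = len(income)
sum_x, sum_y = sum(income), sum(food)
sum_xy = sum(x * y for x, y in zip(income, food))
sum_x2 = sum(x * x for x in income)
sum_y2 = sum(y * y for y in food)

# Gradient a and intercept b of y = a*x + b.
a = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
b = sum_y / n - a * sum_x / n

# Intercept obtained when a is first rounded to 4 decimal places,
# as is common in hand calculation.
b_from_rounded_a = sum_y / n - round(a, 4) * sum_x / n

# Pearson's r from the same sums.
r = (n * sum_xy - sum_x * sum_y) / math.sqrt(
    (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
)

print(f"a = {a:.4f}, b = {b:.4f}, r = {r:.2f}")
print(f"b using rounded a = {b_from_rounded_a:.4f}")
```

With these assumed values, b agrees with the quoted 1.1414 only when a is rounded to 0.2642 before b is computed; carrying full precision changes b in the third decimal place. Small differences of this kind are rounding effects, not errors in the method.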
Websites:
• http://www.public.iastate.edu/~dnett/S401/ncorrelation.pdf

Copyright UniKL 2005 3/7/2018
