
CORRELATION

COMPILED BY
MUTHAMA, JAPHETH MUTINDA

INTRODUCTION

Objectives of the presentation


After going through this presentation, the listener should be able to:
1. Present the results of analysed research data.
2. Interpret the relationship between research variables effectively.
3. Draw implications or inferences from the variables in the study model.

Definition
Correlation (r) is a statistical measure of how two variables move in relation to each other.
It measures the relative strength of the relationship between two variables.
Correlation is expressed as the correlation coefficient, which ranges between -1 and +1.

Coefficient of correlation

The coefficient of correlation is a technique for determining the degree of correlation between two or more variables across the different values of the study variables.
The correlation, if any, found through this approach is then used in a statistical method (regression analysis) that formulates a mathematical model of the relationship amongst the variables. That model can be used to predict the values of the dependent variable, given the values of the independent variable.

Coefficient of Correlation Analysis

The sample correlation coefficient (r) measures the degree of linearity in the relationship between X and Y.

-1 ≤ r ≤ +1
r close to -1: strong negative relationship
r close to +1: strong positive relationship
r = 0: indicates no linear relationship between the research variables

The + and - signs indicate positive and negative linear correlations respectively.

Interpreting Correlation Coefficient (r)

1) Strong correlation: r > 0.70 or r < -0.70
2) Moderate correlation: r is between 0.30 and 0.70, or r is between -0.30 and -0.70
3) Weak correlation: r is between 0 and 0.30, or r is between 0 and -0.30
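The cut-offs listed above can be applied mechanically to the absolute value of r. The following is a minimal sketch, not part of the original slides: the function names are hypothetical, and treating values of exactly 0.30 and 0.70 as "moderate" is an assumption, since the slide does not say which band the boundary values belong to.

```python
def correlation_strength(r):
    """Return 'strong', 'moderate' or 'weak' for a coefficient r in [-1, +1]."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")
    size = abs(r)  # strength depends on the magnitude, not the sign
    if size > 0.70:
        return "strong"
    if size >= 0.30:  # assumption: exactly 0.30 and 0.70 count as moderate
        return "moderate"
    return "weak"


def correlation_direction(r):
    """Return the direction implied by the sign of r."""
    if r > 0:
        return "positive"
    if r < 0:
        return "negative"
    return "none"


# Example coefficients chosen only for illustration.
for r in (0.92, -0.45, 0.10):
    print(r, correlation_strength(r), correlation_direction(r))
```

Strength and direction are reported separately because the sign of r says nothing about how strong the relationship is.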

Methods of studying Correlation

Correlation can be determined using the following methods:
1. Scatter Diagram Method
2. Karl Pearson's Coefficient of Correlation Method
3. Spearman's Rank Correlation Method

SCATTER DIAGRAMS
This is a graph in which the individual data points are plotted in two dimensions.

[Figure: scatter diagrams showing a very good fit and a moderate fit]

A strong relationship simply means a good linear fit.
Points clustered closely around a line show a strong correlation: the line is a good predictor (a good fit) of the data. The more spread out the points, the weaker the correlation and the poorer the fit.
The line is a REGRESSION line (Y = a + bX).

Coefficient of determination and the regression line


NOTE:
1. The coefficient of determination (r²) is a measure of how well the regression line represents the data. It gives the proportion of the total variation in Y that is explained by the line of best fit.
2. If the regression line passed exactly through every point on the scatter plot, it would explain all of the variation.
3. The further the points lie from the line, the less of the variation it is able to explain.

Cont

For example, in the case of variables X and Y:
If r = 0.922, then r² = 0.850.
This means that 85% of the total variation in Y can be explained by the linear relationship between X and Y (as described by the regression equation).
The other 15% of the total variation in Y therefore remains unexplained.
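The link between r² and "explained variation" can be checked numerically. The sketch below is an addition to the slides: it fits the least-squares line Y = a + bX and compares r² with the share of variation the line explains. It uses the five data pairs from the Pearson illustration in this presentation rather than the r = 0.922 example, so the figures differ (r² comes out at about 0.845). The variable names are assumptions made for the sketch.

```python
import math

# Data taken from the Pearson illustration in this presentation.
X = [6, 2, 10, 4, 8]
Y = [9, 11, 5, 8, 7]

n = len(X)
mean_x = sum(X) / n
mean_y = sum(Y) / n

# Sums of squares and cross-products of deviations from the means.
sxx = sum((x - mean_x) ** 2 for x in X)
syy = sum((y - mean_y) ** 2 for y in Y)  # total variation in Y
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(X, Y))

# Least-squares regression line Y = a + bX.
b = sxy / sxx
a = mean_y - b * mean_x

# Variation in Y left unexplained by the line (sum of squared residuals).
sse = sum((y - (a + b * x)) ** 2 for x, y in zip(X, Y))

r = sxy / math.sqrt(sxx * syy)
explained_share = 1 - sse / syy

print(f"Y = {a:.2f} + ({b:.2f})X")
print(f"r = {r:.3f}, r^2 = {r ** 2:.3f}, explained share = {explained_share:.3f}")
```

The two quantities agree: squaring r gives the same number as one minus the unexplained share of the variation in Y.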

Karl Pearson's coefficient of correlation (or simple correlation)
This is the most widely used method of measuring the degree of relationship between two variables.

It is defined as a measure of the strength of the linear relationship between two variables, computed as the (sample) covariance of the variables divided by the product of their (sample) standard deviations.
This coefficient assumes the following:
(i) that there is a linear relationship between the two variables;
(ii) that the two variables are causally related, which means that one of the variables is independent and the other one is dependent;
(iii) that a large number of independent causes are operating in both variables so as to produce a normal distribution.

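Karl Pearson's coefficient of correlation can be worked out thus:

$$ r = \frac{\operatorname{cov}(x, y)}{\sigma_x \, \sigma_y} $$

OR

$$ r_{xy} = \frac{n \sum XY - \sum X \sum Y}{\sqrt{\left[ n \sum X^2 - \left( \sum X \right)^2 \right] \left[ n \sum Y^2 - \left( \sum Y \right)^2 \right]}} $$

- Shared variability of the X and Y variables: on the top
- Individual variability of the X and Y variables: at the bottom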
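The two ways of working out the coefficient (covariance over standard deviations, and the raw-sums formula) give the same value. The sketch below is an addition to the slides and shows this in plain Python. The function names and the `hours` / `score` data are hypothetical, invented only to exercise the formulas.

```python
import math


def pearson_from_covariance(xs, ys):
    """r = cov(x, y) / (s_x * s_y), using sample (n - 1) formulas."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    return cov / (sd_x * sd_y)


def pearson_from_raw_sums(xs, ys):
    """r from the sums of X, Y, XY, X^2 and Y^2 (no means needed)."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y  # shared variability
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )  # individual variability
    return numerator / denominator


# Hypothetical data, invented for this sketch.
hours = [1, 2, 3, 4, 5]
score = [2, 4, 5, 4, 5]

r_cov = pearson_from_covariance(hours, score)
r_raw = pearson_from_raw_sums(hours, score)
print(round(r_cov, 4), round(r_raw, 4))
```

The raw-sums version is convenient for hand calculation because it avoids computing deviations from the means.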

Illustration
From the following data, find the coefficient of correlation by the Karl Pearson method.
X: 6, 2, 10, 4, 8
Y: 9, 11, 5, 8, 7

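Solution (cont.)

$$ \bar{X} = \frac{\sum X}{N} = \frac{30}{5} = 6 $$

$$ \bar{Y} = \frac{\sum Y}{N} = \frac{40}{5} = 8 $$

With x = X - 6 and y = Y - 8 denoting deviations from the means:

$$ r = \frac{\sum xy}{\sqrt{\sum x^2 \cdot \sum y^2}} = \frac{-26}{\sqrt{40 \times 20}} = \frac{-26}{\sqrt{800}} = -0.92 $$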
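The arithmetic of this illustration can be checked with a few lines of Python. This sketch is an addition to the slides: it recomputes the means, the deviation sums and r from the X and Y values given in the illustration. The variable names are assumptions.

```python
import math

# Data from the illustration.
X = [6, 2, 10, 4, 8]
Y = [9, 11, 5, 8, 7]

n = len(X)
mean_x = sum(X) / n  # 30 / 5
mean_y = sum(Y) / n  # 40 / 5

# Deviations of each value from its mean.
dx = [x - mean_x for x in X]
dy = [y - mean_y for y in Y]

sum_dxdy = sum(u * v for u, v in zip(dx, dy))
sum_dx2 = sum(u * u for u in dx)
sum_dy2 = sum(v * v for v in dy)

r = sum_dxdy / math.sqrt(sum_dx2 * sum_dy2)

print("means:", mean_x, mean_y)
print("sum xy:", sum_dxdy, "sum x^2:", sum_dx2, "sum y^2:", sum_dy2)
print("r =", round(r, 2))
```

The negative sign shows that Y tends to fall as X rises in this data set.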

Spearman's rank coefficient


This is a technique for determining the degree of correlation between two variables in the case of ordinal data, where ranks are given to the different values of the variables.
The main objective of the coefficient is to determine the extent to which the two sets of rankings are similar or dissimilar.
This method is typically used to determine correlation when the data are not available in numerical form.
Thus, when the values of the two variables are converted to their ranks and the correlation is then obtained, that correlation is known as rank correlation.

Computation of Rank Correlation


Spearman's rank correlation coefficient can be calculated when:
- Actual ranks are given
- Ranks are not given, but grades are given and are not repeated
- Ranks are not given, and grades are given and are repeated

$$ R = 1 - \frac{6 \sum D^2}{N (N^2 - 1)} $$

where
D = R_x - R_y
R_x = rank of X
R_y = rank of Y
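For the case where actual ranks are given, the formula translates directly into code. The sketch below is an addition to the slides. The function name and the two judges' rankings are hypothetical. It assumes there are no tied ranks, since the slides do not show the adjustment needed when grades are repeated.

```python
def spearman_from_ranks(rank_x, rank_y):
    """R = 1 - 6 * sum(D^2) / (N * (N^2 - 1)) for two lists of untied ranks."""
    if len(rank_x) != len(rank_y):
        raise ValueError("both rankings must have the same length")
    n = len(rank_x)
    # D is the difference between the two ranks of each item.
    sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


# Hypothetical ranks given by two judges to five contestants.
judge_a = [1, 2, 3, 4, 5]
judge_b = [2, 1, 4, 3, 5]

print(round(spearman_from_ranks(judge_a, judge_b), 2))
```

Identical rankings give R = +1 and exactly reversed rankings give R = -1, which is a quick sanity check on any implementation.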

Illustration
Calculate Spearman's rank correlation coefficient between advertisement cost and sales from the following data:
Advertisement cost: 39, 65, 62, 90, 82, 75, 25, 98, 36, 78
Sales (Shs): 47, 53, 58, 86, 62, 68, 60, 91, 51, 84

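Ranks are assigned with 1 for the largest value.

X     Y     R-x   R-y   D     D²
39    47    8     10    -2    4
65    53    6     8     -2    4
62    58    7     7     0     0
90    86    2     2     0     0
82    62    3     5     -2    4
75    68    5     4     1     1
25    60    10    6     4     16
98    91    1     1     0     0
36    51    9     9     0     0
78    84    4     3     1     1
                        ΣD² = 30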

Cont.
$$ R = 1 - \frac{6 \sum D^2}{N^3 - N} $$

$$ R = 1 - \frac{6(30)}{10^3 - 10} $$

$$ R = 1 - \frac{180}{990} $$

$$ R = 0.82 $$
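The whole illustration, from raw values to R, can be reproduced in code. This sketch is an addition to the slides. It assumes that rank 1 goes to the largest value, which is consistent with the illustration's total of 30 for the squared rank differences, and that no values are repeated. The helper name `descending_ranks` is hypothetical.

```python
def descending_ranks(values):
    """Rank 1 goes to the largest value; assumes no repeated values."""
    ordered = sorted(values, reverse=True)
    return [ordered.index(v) + 1 for v in values]


# Data from the illustration.
advert = [39, 65, 62, 90, 82, 75, 25, 98, 36, 78]
sales = [47, 53, 58, 86, 62, 68, 60, 91, 51, 84]

rank_x = descending_ranks(advert)
rank_y = descending_ranks(sales)

# Rank differences D and the sum of their squares.
d = [rx - ry for rx, ry in zip(rank_x, rank_y)]
sum_d2 = sum(di ** 2 for di in d)

n = len(advert)
R = 1 - 6 * sum_d2 / (n ** 3 - n)

print("R-x:", rank_x)
print("R-y:", rank_y)
print("sum of D^2:", sum_d2)
print("R =", round(R, 2))
```

Ranking in ascending order instead would give the same R, because only the differences between paired ranks enter the formula.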

Nonlinear Relationships

In correlation analysis, not all relationships are linear.


In cases where there is clear evidence of a nonlinear relationship, DO NOT use Pearson's Product Moment Correlation (r) to summarize the strength of the relationship between Y and X.

[Figure: scatter graph of a nonlinear correlation]
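A small numerical sketch shows why r is misleading for nonlinear relationships. This is an addition to the slides, with invented data: Y is determined exactly by X through Y = X², yet Pearson's r is zero because the relationship has no linear component over this range. The function name `pearson_r` is an assumption.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r from deviations about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Invented data: Y depends exactly on X, but not linearly.
x = [-3, -2, -1, 0, 1, 2, 3]
y = [v ** 2 for v in x]

r = pearson_r(x, y)
print("r =", r)
```

A value of r near zero therefore means "no linear relationship", not "no relationship", which is why the scatter diagram should be inspected first.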

Conclusions
Correlation is the linear association between two numeric variables, e.g. variables X and Y.
The correlation (r) ranges from -1 to +1, that is, -1 ≤ r ≤ +1.
If r < 0, there is a negative correlation between X and Y, i.e. as X increases, Y generally decreases.
If r > 0, there is a positive correlation between X and Y, i.e. as X increases, Y generally increases.
The closer r is to 0, the weaker the linear association between X and Y.

A diagram explaining different strengths of correlations


The value of r ranges between -1 and +1.
The value of r denotes the strength of the association, as illustrated by the following scale.

r = -1: perfect indirect (negative) correlation
-1 to -0.75: strong
-0.75 to -0.25: intermediate
-0.25 to 0: weak
r = 0: no relation
0 to 0.25: weak
0.25 to 0.75: intermediate
0.75 to 1: strong
r = +1: perfect direct (positive) correlation

Example of graphs and their interpretation


Negative and positive correlations

No Relationship (r = .00)
Information about Explanatory Flexibility tells you nothing about Emotional
Insight

[Figure: scatter plot of Explanatory Flexibility and ASIS - Emotional Insight]


THANK YOU