
# CORRELATION

COMPILED BY
MUTHAMA, JAPHETH MUTINDA

INTRODUCTION

## Objectives of the presentation

After going through this presentation, the listener is expected to be able to:
1. Present the results of analysed research data.
2. Interpret the relationship between research variables effectively.
3. Draw implications or inferences from the variables in the study model.

Definition
Correlation (r) is the statistical measure of how two variables move in relation to each other.
It measures the relative strength of the relationship between two variables.
Correlation is computed as what is known as the correlation coefficient, which ranges between -1 and +1.

Coefficient of correlation

## The coefficient of correlation measures the correlation between two or more variables across the different values of the study variables

The correlation, if any, found through this approach is applied in a statistical method that formulates a mathematical model of the relationship amongst the variables. This model can be used to predict the values of the dependent variable, given the values of the independent variable.

## The sample correlation coefficient (r) measures the degree of linearity in the relationship between X and Y.

-1 ≤ r ≤ +1
r near -1: strong negative relationship
r near +1: strong positive relationship
r = 0: indicates no linear relationship between the research variables

The + and - signs denote positive and negative linear correlations respectively.

1) Strong correlation: r > 0.70 or r < -0.70
2) Moderate correlation: r is between 0.30 and 0.70, or r is between -0.30 and -0.70
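
The cut-offs above can be turned into a small helper. This is only a sketch: the function name `correlation_strength` is hypothetical, and the label "weak" for values below 0.30 in absolute size is an assumption, since the list above names only the strong and moderate bands.

```python
def correlation_strength(r: float) -> str:
    """Describe r using the 0.30 and 0.70 cut-offs quoted in the list above."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")
    if r == 0:
        return "no linear relationship"
    size = abs(r)
    if size > 0.70:
        label = "strong"
    elif size >= 0.30:
        label = "moderate"
    else:
        # Assumption: the text gives no name for this band; "weak" is our label.
        label = "weak"
    direction = "positive" if r > 0 else "negative"
    return f"{label} {direction}"


print(correlation_strength(-0.92))  # strong negative
print(correlation_strength(0.45))   # moderate positive
```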

## Spearman's Rank Correlation Method

SCATTER DIAGRAMS
This is a graph in which the individual data points are plotted in two dimensions, as presented below:

[Figure: scatter diagrams showing a very good fit and a moderate fit]

Points clustered closely around a line show a strong correlation. The line is a good predictor (good fit) of the data. The more spread out the points, the weaker the correlation and the poorer the fit.
The line is a REGRESSION line (Y = a + bX).
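
The slides do not say how the line Y = a + bX is obtained. A common choice is ordinary least squares, and the sketch below assumes that method. The function name `fit_line` and the sample data are hypothetical and chosen only so that the result is easy to check by hand.

```python
def fit_line(xs, ys):
    """Return (a, b) for the least-squares line Y = a + bX."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long lists with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sum of squared X deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all X values are identical; slope is undefined")
    b = sxy / sxx              # slope
    a = mean_y - b * mean_x    # intercept
    return a, b


# Hypothetical data lying exactly on Y = 1 + 2X.
a, b = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(a, b)  # 1.0 2.0
```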

## Coefficient of determination and the regression line

NOTE:
1. The coefficient of determination is a measure of how well the regression line represents the data: it gives the proportion of the variation in Y that is explained by the line of best fit.
2. If the regression line passed exactly through every point on the scatter plot, it would explain all of the variation.
3. The further the line is from the points, the less of the variation it is able to explain.

Cont

## For example, in the case of variables X and Y:

If r = 0.922, then r² = 0.850.
This means that 85% of the total variation in Y can be explained by the linear relationship between X and Y (as described by the regression equation).
The other 15% of the total variation in Y therefore remains unexplained.
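
The arithmetic in this example can be reproduced in a few lines. The helper name `variation_shares` is hypothetical; it simply squares r and reports the explained and unexplained shares.

```python
def variation_shares(r: float):
    """Return (explained, unexplained) shares of the variation in Y for a given r."""
    r_squared = r ** 2
    return r_squared, 1 - r_squared


explained, unexplained = variation_shares(0.922)
print(f"r^2 = {explained:.3f}")                 # r^2 = 0.850
print(f"explained:   {explained * 100:.0f}%")   # explained:   85%
print(f"unexplained: {unexplained * 100:.0f}%") # unexplained: 15%
```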

## Karl Pearson's coefficient of correlation (or simple correlation)

This is the most widely used method of measuring the degree of relationship between two variables.

It is defined as the measure of the strength of the linear relationship between two variables, computed as the (sample) covariance of the variables divided by the product of their (sample) standard deviations.
This coefficient assumes the following:

(i) that there is a linear relationship between the two variables;
(ii) that the two variables are causally related, which means that one of the variables is independent and the other one is dependent;
(iii) that a large number of independent causes are operating in both variables so as to produce a normal distribution.

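## Karl Pearson's coefficient of correlation can be worked out thus

$$r = \frac{\operatorname{cov}(x, y)}{\sigma_x \cdot \sigma_y}$$

OR

$$r_{xy} = \frac{n\sum XY - \sum X \sum Y}{\sqrt{\left[n\sum X^2 - \left(\sum X\right)^2\right]\left[n\sum Y^2 - \left(\sum Y\right)^2\right]}}$$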

- Shared variability of the X and Y variables: on the top (numerator)
- Individual variability of the X and Y variables: at the bottom (denominator)
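
The covariance form and the raw-sums form of Pearson's coefficient give the same value. A short derivation is sketched here, writing $\bar{X} = \sum X / n$ and $\bar{Y} = \sum Y / n$; the deviation notation is an assumption introduced only for this sketch.

$$\sum (X - \bar{X})(Y - \bar{Y}) = \sum XY - \frac{\sum X \sum Y}{n} = \frac{1}{n}\left(n\sum XY - \sum X \sum Y\right)$$

$$\sum (X - \bar{X})^2 = \frac{1}{n}\left[n\sum X^2 - \left(\sum X\right)^2\right], \qquad \sum (Y - \bar{Y})^2 = \frac{1}{n}\left[n\sum Y^2 - \left(\sum Y\right)^2\right]$$

$$r = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sqrt{\sum (X - \bar{X})^2 \sum (Y - \bar{Y})^2}} = \frac{n\sum XY - \sum X \sum Y}{\sqrt{\left[n\sum X^2 - \left(\sum X\right)^2\right]\left[n\sum Y^2 - \left(\sum Y\right)^2\right]}}$$

The factor $1/n$ appears once on the top and once (under the square root, as $1/n^2$) at the bottom, so it cancels.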

Illustration
From the following data, find the coefficient of correlation by Karl Pearson's method:
X: 6, 2, 10, 4, 8
Y: 9, 11, 5, 8, 7

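Sol. cont.

$$\bar{X} = \frac{\sum X}{N} = \frac{30}{5} = 6 \qquad \bar{Y} = \frac{\sum Y}{N} = \frac{40}{5} = 8$$

With deviations $x = X - \bar{X}$ and $y = Y - \bar{Y}$:

$$r = \frac{\sum xy}{\sqrt{\sum x^2 \cdot \sum y^2}} = \frac{-26}{\sqrt{40 \times 20}} = \frac{-26}{\sqrt{800}} = -0.92$$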
## Spearman's rank coefficient

This is the technique of determining the degree of correlation between two variables in the case of ordinal data, where ranks are given to the different values of the variables.
The main objective of the coefficient is to determine the extent to which the two sets of rankings are similar or dissimilar.
This method is mainly used to determine correlation when the data are not available in numerical form.
Thus, when the values of the two variables are converted to their ranks and the correlation is obtained, the correlation is known as rank correlation.

## Computation of Rank Correlation

Spearman's rank correlation coefficient can be calculated when:
- Actual ranks are given
- Ranks are not given, but grades are given and not repeated
- Ranks are not given, and grades are given and repeated

$$R = 1 - \frac{6\sum D^2}{N(N^2 - 1)}$$

where
$D = R_x - R_y$
$R_x$ = rank of X
$R_y$ = rank of Y

Illustration
Calculate Spearman's rank correlation coefficient between:
Advertisement cost: 39, 65, 62, 90, 82, 75, 25, 98, 36, 78
Sales (Shs): 47, 53, 58, 86, 62, 68, 60, 91, 51, 84

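| X | Y | R-x | R-y | D | D² |
|----|----|-----|-----|----|----|
| 39 | 47 | 8 | 10 | -2 | 4 |
| 65 | 53 | 6 | 8 | -2 | 4 |
| 62 | 58 | 7 | 7 | 0 | 0 |
| 90 | 86 | 2 | 2 | 0 | 0 |
| 82 | 62 | 3 | 5 | -2 | 4 |
| 75 | 68 | 5 | 4 | 1 | 1 |
| 25 | 60 | 10 | 6 | 4 | 16 |
| 98 | 91 | 1 | 1 | 0 | 0 |
| 36 | 51 | 9 | 9 | 0 | 0 |
| 78 | 84 | 4 | 3 | 1 | 1 |
| | | | | ΣD² | 30 |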

Cont.

$$R = 1 - \frac{6\sum D^2}{N^3 - N} = 1 - \frac{6(30)}{10^3 - 10} = 1 - \frac{180}{990} = 0.82$$
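
The rank correlation for the advertisement and sales data can be reproduced as follows. This is a minimal sketch: the function names are hypothetical, rank 1 is given to the largest value, and the code assumes no repeated values, so it does not cover the case of tied grades.

```python
def ranks_descending(values):
    """Rank 1 goes to the largest value. Assumes no repeated values (no ties)."""
    if len(set(values)) != len(values):
        raise ValueError("repeated values need a tie correction, not handled here")
    ordered = sorted(values, reverse=True)
    position = {value: index + 1 for index, value in enumerate(ordered)}
    return [position[value] for value in values]


def spearman_rank_correlation(xs, ys):
    """R = 1 - 6 * sum(D^2) / (N * (N^2 - 1)), with D = Rx - Ry."""
    n = len(xs)
    rx = ranks_descending(xs)
    ry = ranks_descending(ys)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


cost = [39, 65, 62, 90, 82, 75, 25, 98, 36, 78]
sales = [47, 53, 58, 86, 62, 68, 60, 91, 51, 84]
print(ranks_descending(cost))   # [8, 6, 7, 2, 3, 5, 10, 1, 9, 4]
print(ranks_descending(sales))  # [10, 8, 7, 2, 5, 4, 6, 1, 9, 3]
print(round(spearman_rank_correlation(cost, sales), 2))  # 0.82
```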

Nonlinear Relationships

## In correlation analysis, not all relationships are linear.

In cases where there is clear evidence of a nonlinear relationship, DO NOT use Pearson's Product Moment Correlation (r) to summarize the strength of the relationship between Y and X.
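
A small example shows why. In the hypothetical data below, Y is completely determined by X (Y = X²), yet Pearson's r comes out as zero because the relationship is not linear. The function name `pearson_r` and the data are assumptions made for this sketch.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's r from deviations about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_x2 = sum((x - mean_x) ** 2 for x in xs)
    sum_y2 = sum((y - mean_y) ** 2 for y in ys)
    return sum_xy / sqrt(sum_x2 * sum_y2)


# Hypothetical data: a perfect U-shaped (quadratic) relationship.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]
print(round(abs(pearson_r(xs, ys)), 2))  # 0.0
```

A scatter diagram of these points would show the curve clearly, which is why plotting the data before computing r is worthwhile.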

## Non linear correlation Scatter graph

Conclusions
Correlation is the linear association between two numeric variables, e.g. variables X and Y.
The correlation (r) ranges from -1 to +1, i.e. -1 ≤ r ≤ +1.
If r < 0, then there is a negative correlation between X and Y, i.e. as X increases Y generally decreases.
If r > 0, then there is a positive correlation between X and Y, i.e. as X increases Y generally increases.
The closer r is to 0, the weaker the linear association between X and Y.

## A diagram explaining different strengths of correlations

The value of r ranges between ( -1) and ( +1)
The value of r denotes the strength of the association as illustrated
by the following diagram.

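[Diagram: scale of r from -1 to +1]
- r = -1: perfect correlation (indirect)
- -1 to -0.75: strong
- -0.75 to -0.25: intermediate
- -0.25 to 0: weak
- r = 0: no relation
- 0 to 0.25: weak
- 0.25 to 0.75: intermediate
- 0.75 to 1: strong
- r = +1: perfect correlation (direct)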

## Example of graphs and their interpretation

[Figure: scatter plots of negative and positive correlations]
[Figure: No Relationship (r = .00); axes: Explanatory Flexibility and ASIS - Emotional Insight]


THANK YOU