Session 10
Introduction
Is there an association between two or more variables? If yes, what is the form and degree of that relationship?
Is the relationship strong or significant enough to be useful in arriving at a desirable conclusion?
Can the relationship be used for predictive purposes, that is, to predict the most likely value of a dependent variable corresponding to a given value of the independent variable or variables?
Definition
Correlation exists between two variables when one of them is related to the other in some way.
Assumptions
1) The sample of paired data (x,y) is a random sample.
2) The pairs of (x,y) data have a bivariate normal distribution.
Spearmans Method of
Figure: how the strength of the association between two variables is represented by the coefficient of correlation. The scale is marked at -1.00, -0.50, 0, +0.50 and +1.00. Labels on the negative side: perfect negative, strong negative, weak negative correlation. At 0: no correlation. Labels on the positive side: weak positive, moderate positive, strong positive, perfect positive correlation.
Definition
A scatterplot is a graph in which the paired (x,y) sample data are plotted with a horizontal x axis and a vertical y axis. Each individual (x,y) pair is plotted as a single point.
Scatter Plots (x on the horizontal axis, y on the vertical axis): (a) Positive, (b) Strong positive, (c) Perfect positive, (d) Negative, (e) Strong negative, (f) Perfect negative, (g) No correlation, (h) Nonlinear correlation. Panels (g) and (h) show no linear correlation.
Correlation Coefficient r
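$$r = \frac{\sum f d_x d_y - \frac{(\sum f d_x)(\sum f d_y)}{N}}{SD_x \cdot SD_y}$$

$$SD_x = \sqrt{\sum f d_x^2 - \frac{(\sum f d_x)^2}{N}}, \qquad SD_y = \sqrt{\sum f d_y^2 - \frac{(\sum f d_y)^2}{N}}$$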
Example 1
Find the coefficient of correlation between height (X) and weight (Y) from the following data. Also, obtain the two regression lines.

Height  Weight
61      62
65      55
68      70
62      60
60      53
Answer 1
r = 0.65
X - 63.2 = 0.32(Y - 60)
Y - 60 = 1.33(X - 63.2)
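The figures in Answer 1 can be checked with a short script. This is a minimal sketch, assuming the ten numbers in Example 1 are read as five (height, weight) pairs, which is the reading that reproduces the means 63.2 and 60 used in the answer. The function name `pearson_and_regression` is illustrative only.

```python
from math import sqrt

# Assumed pairing of the Example 1 data: (height X, weight Y)
heights = [61, 65, 68, 62, 60]
weights = [62, 55, 70, 60, 53]

def pearson_and_regression(x, y):
    """Return means, Pearson's r, and both regression coefficients."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of products and squares of deviations from the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    r = sxy / sqrt(sxx * syy)
    b_yx = sxy / sxx  # slope of the regression line of Y on X
    b_xy = sxy / syy  # slope of the regression line of X on Y
    return mean_x, mean_y, r, b_yx, b_xy

mean_x, mean_y, r, b_yx, b_xy = pearson_and_regression(heights, weights)
print(f"r = {r:.2f}")
print(f"X - {mean_x:.1f} = {b_xy:.2f}(Y - {mean_y:.0f})")
print(f"Y - {mean_y:.0f} = {b_yx:.2f}(X - {mean_x:.1f})")
```

Note that r² equals the product of the two regression coefficients, which is a quick consistency check on any pair of fitted regression lines.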
Example 2
Given the two regression lines
4x - 5y + 33 = 0
20x - 9y - 107 = 0
and the variance of x being 9, calculate:
1) Mean x and mean y
2) Correlation coefficient of x and y
3) SD of y
Answer 2
Mean x = 13 and mean y = 17 (the point where the two regression lines intersect)
r = 0.6
Variance of y = 16, so SD of y = 4
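The steps behind Answer 2 can be sketched as follows, assuming the two lines are 4x - 5y + 33 = 0 (taken as y on x) and 20x - 9y - 107 = 0 (taken as x on y). That assignment is an assumption, checked in the code by requiring the product of the two regression coefficients to be at most 1. All variable names are illustrative.

```python
from math import sqrt

# Coefficients of a*x + b*y + c = 0 for each line
a1, b1, c1 = 4.0, -5.0, 33.0     # assumed regression line of y on x
a2, b2, c2 = 20.0, -9.0, -107.0  # assumed regression line of x on y
var_x = 9.0

# Both regression lines pass through (mean x, mean y): solve by Cramer's rule
det = a1 * b2 - a2 * b1
mean_x = ((-c1) * b2 - (-c2) * b1) / det
mean_y = (a1 * (-c2) - a2 * (-c1)) / det

# Regression coefficients read off each line
b_yx = -a1 / b1  # y = (4x + 33) / 5   -> slope 0.8
b_xy = -b2 / a2  # x = (9y + 107) / 20 -> slope 0.45

# A valid assignment must give r^2 = b_yx * b_xy <= 1
assert b_yx * b_xy <= 1.0

# r takes the common sign of the two regression coefficients
r = sqrt(b_yx * b_xy) if b_yx > 0 else -sqrt(b_yx * b_xy)

# b_yx = r * sd_y / sd_x, so sd_y = b_yx * sd_x / r
sd_x = sqrt(var_x)
sd_y = b_yx * sd_x / r

print(mean_x, mean_y, round(r, 2), round(sd_y, 2), round(sd_y ** 2, 2))
```

Swapping the roles of the two lines would give coefficients 20/9 and 5/4, whose product exceeds 1, so that assignment is ruled out.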
r = +1: perfect positive correlation
r = -1: perfect negative correlation
r = 0: uncorrelated
Standard Error: $S.E. = \frac{1 - r^2}{\sqrt{N}}$, where N is the number of pairs of observations
Probable Error: $P.E. = 0.6745 \times S.E.$
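A minimal sketch of the standard error and probable error of r, assuming the usual textbook form S.E. = (1 - r²)/√N. The values r = 0.6 and N = 25 are hypothetical and chosen only to make the arithmetic easy to follow; they do not come from the examples above.

```python
from math import sqrt

def standard_error(r, n):
    """Standard error of r for n pairs of observations."""
    return (1 - r ** 2) / sqrt(n)

def probable_error(r, n):
    """Probable error of r: 0.6745 times the standard error."""
    return 0.6745 * standard_error(r, n)

# Hypothetical inputs for illustration only
r, n = 0.6, 25
se = standard_error(r, n)  # (1 - 0.36) / 5 = 0.128
pe = probable_error(r, n)  # 0.6745 * 0.128 = 0.086336
print(f"S.E. = {se:.3f}, P.E. = {pe:.4f}")
```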
Advantages
This method is easy to understand and its application is simpler than Pearson's method.
This method is useful for correlation analysis when variables are expressed in qualitative terms like beauty, intelligence, honesty, efficiency, and so on.
This method is appropriate to measure the association between two variables if the data type is at least ordinal scaled (ranked).
The sample data of values of two variables is converted into ranks, either in ascending order or descending order, for calculating the degree of correlation between the two variables.
Disadvantages
Values of both variables are assumed to be normally distributed and to describe a linear rather than a nonlinear relationship.
A large computational time is required when the number of pairs of values of the two variables exceeds 30.
This method cannot be applied to measure the association between two variables when the data are grouped.
$$\text{Rho} = 1 - \frac{6 \sum D^2}{N(N^2 - 1)}$$

Hits  Rank  HR  Rank   D   D²
1     10    3   8      2   4
2     9     4   7      2   4
3     8     5   6      2   4
4     7     1   10    -3   9
5     6     7   4      2   4
6     5     6   5      0   0
7     4     2   9     -5   25
8     3     10  1      2   4
9     2     9   2      0   0
10    1     8   3     -2   4

N = 10, ΣD² = 58

Rho = 1 - [6(58) / 10(10² - 1)]
Rho = 1 - [348 / 10(100 - 1)]
Rho = 1 - [348 / 990]
Rho = 1 - 0.352
Rho = 0.648
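The rank-correlation arithmetic for the Hits/HR data can be reproduced with the sketch below. It assumes rank 1 goes to the largest value and that there are no ties, as in this data set; tied values would need averaged ranks, which this sketch does not handle. The helper name `descending_ranks` is illustrative.

```python
hits = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
home_runs = [3, 4, 5, 1, 7, 6, 2, 10, 9, 8]

def descending_ranks(values):
    """Rank 1 = largest value. Assumes all values are distinct."""
    order = sorted(values, reverse=True)
    return [order.index(v) + 1 for v in values]

rank_hits = descending_ranks(hits)
rank_hr = descending_ranks(home_runs)

# D is the difference between the two ranks of each pair
d = [a - b for a, b in zip(rank_hits, rank_hr)]
sum_d2 = sum(v * v for v in d)

n = len(hits)
rho = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))
print(d, sum_d2, round(rho, 3))
```

The rank differences always sum to zero, which is a useful check on a hand-built D column.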
Pearson's r

Hits  HR  xy
1     3   3
2     4   8
3     5   15
4     1   4
5     7   35
6     6   36
7     2   14
8     10  80
9     9   81
10    8   80
Σxy = 356, so Σxy/n = 356/10 = 35.6
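A short sketch of Pearson's r for the same Hits/HR data, using the raw-score form built from Σx, Σy, Σxy, Σx² and Σy². Because both columns here are the integers 1 to 10 with no ties, Pearson's r on the values coincides with Spearman's rho on the ranks.

```python
from math import sqrt

hits = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
home_runs = [3, 4, 5, 1, 7, 6, 2, 10, 9, 8]

n = len(hits)
sum_x = sum(hits)
sum_y = sum(home_runs)
sum_xy = sum(x * y for x, y in zip(hits, home_runs))
sum_x2 = sum(x * x for x in hits)
sum_y2 = sum(y * y for y in home_runs)

# Raw-score form of Pearson's r
numerator = n * sum_xy - sum_x * sum_y
denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
r = numerator / denominator
print(sum_xy, round(r, 3))
```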
Example 3
Age of wives (rows) by age of husbands (columns)

           15-25  25-35  35-45  45-55  55-65  65-75  Total
15-25          1      2      -      -      -      -      3
25-35          1     12      4      -      -      -     17
35-45          -      1     10      3      -      -     14
45-55          -      -      1      6      2      -      9
55-65          -      -      -      1      4      1      6
65-75          -      -      -      -      2      2      4
Total          2     15     15     10      8      3     53
Sol.3

X is the column variable and Y the row variable. The class 35-45 is taken as the origin for both, so dx and dy run from -2 to +3 in steps of one class width.

Y \ X        15-25  25-35  35-45  45-55  55-65  65-75     f   fdy  fdy²  fdxdy
dx              -2     -1      0     +1     +2     +3
15-25 (-2)       1      2      -      -      -      -     3    -6    12      8
25-35 (-1)       1     12      4      -      -      -    17   -17    17     14
35-45 (0)        -      1     10      3      -      -    14     0     0      0
45-55 (+1)       -      -      1      6      2      -     9     9     9     10
55-65 (+2)       -      -      -      1      4      1     6    12    24     24
65-75 (+3)       -      -      -      -      2      2     4    12    36     30
f                2     15     15     10      8      3    53    10    98     86
fdx             -4    -15      0     10     16      9    16
fdx²             8     15      0     10     32     27    92
fdxdy            6     16      0      8     32     24    86

N = 53, Σfdx = 16, Σfdy = 10, Σfdx² = 92, Σfdy² = 98, Σfdxdy = 86

$$r = \frac{86 - \frac{(16)(10)}{53}}{\sqrt{92 - \frac{16^2}{53}} \cdot \sqrt{98 - \frac{10^2}{53}}} = \frac{82.98}{9.34 \times 9.80} = 0.907$$
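The grouped-data calculation for Example 3 can be checked with the sketch below. It assumes the bivariate frequency table reads as laid out in the `freq` list (rows are the Y classes, columns are the X classes, both 15-25 up to 65-75) and that the class 35-45 is the origin for the step deviations. Variable names are illustrative.

```python
from math import sqrt

# Assumed reading of the Example 3 table: freq[i][j] is the count for
# Y class i (rows) and X class j (columns); 0 marks an empty cell.
freq = [
    [1, 2, 0, 0, 0, 0],
    [1, 12, 4, 0, 0, 0],
    [0, 1, 10, 3, 0, 0],
    [0, 0, 1, 6, 2, 0],
    [0, 0, 0, 1, 4, 1],
    [0, 0, 0, 0, 2, 2],
]
# Step deviations from the assumed origin class 35-45
d = [-2, -1, 0, 1, 2, 3]

row_totals = [sum(row) for row in freq]        # f for each Y class
col_totals = [sum(col) for col in zip(*freq)]  # f for each X class
n = sum(row_totals)

sum_fdx = sum(f * dx for f, dx in zip(col_totals, d))
sum_fdy = sum(f * dy for f, dy in zip(row_totals, d))
sum_fdx2 = sum(f * dx * dx for f, dx in zip(col_totals, d))
sum_fdy2 = sum(f * dy * dy for f, dy in zip(row_totals, d))
sum_fdxdy = sum(
    freq[i][j] * d[i] * d[j] for i in range(6) for j in range(6)
)

numerator = sum_fdxdy - sum_fdx * sum_fdy / n
sd_x = sqrt(sum_fdx2 - sum_fdx ** 2 / n)
sd_y = sqrt(sum_fdy2 - sum_fdy ** 2 / n)
r = numerator / (sd_x * sd_y)
print(n, sum_fdx, sum_fdy, sum_fdx2, sum_fdy2, sum_fdxdy, round(r, 3))
```

Step deviations change only the origin and scale of each variable, so r computed this way equals r computed from the class midpoints.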
x          0.27  1.41  2.19  2.83  2.19  1.81  0.85  3.05
Household  2     3     3     6     4     2     1     5

r = 0.842, r² = 0.71
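The pair of figures r = 0.842 and r² = 0.71 can be reproduced from the eight data pairs. The sketch below assumes the decimal values form the x variable and the Household values form y. It also illustrates why r² is read as the share of the variation in y accounted for by the least-squares line, by computing 1 - SSE/SST directly.

```python
from math import sqrt

# Assumed pairing: x is the decimal-valued variable, y the Household values
x = [0.27, 1.41, 2.19, 2.83, 2.19, 1.81, 0.85, 3.05]
y = [2, 3, 3, 6, 4, 2, 1, 5]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n
sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
sxx = sum((a - mean_x) ** 2 for a in x)
syy = sum((b - mean_y) ** 2 for b in y)  # total variation in y (SST)

r = sxy / sqrt(sxx * syy)

# Least-squares line of y on x and its unexplained variation (SSE)
slope = sxy / sxx
intercept = mean_y - slope * mean_x
sse = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))

r_squared = 1 - sse / syy  # proportion of variation in y explained
print(round(r, 3), round(r_squared, 2))
```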
Correlation means the relationship between two or more variables, measuring the direction and degree of their linear relationship. Regression analysis aims at establishing the functional relationship.
"Correlation does not imply causation" is a phrase used in the sciences and statistics to emphasize that correlation between two variables does not imply there is a cause-and-effect relationship between the two. Its converse, "correlation proves causation", is a logical fallacy by which two events that occur together are claimed to have a cause-and-effect relationship. For example: A occurs in correlation with B. Therefore, A causes B.
This is a logical fallacy because there are at least four other possibilities:
1) B may be the cause of A, or
2) some unknown third factor is actually the cause of the relationship between A and B, or
3) the "relationship" is so complex it can be labeled coincidental (i.e., two events occurring at the same time that have no simple relationship to each other besides the fact that they are occurring at the same time).
4) B may be the cause of A at the same time as A is the cause of B (contradicting that the only relationship between A and B is that A causes B). This describes a self-reinforcing system.