Learning Objectives
Definition of Correlation
Need of Correlation
Computing Correlation
Correlation Coefficient (Pearson's correlation)
3 Characteristics of Relationship
Properties of Correlation
Coefficient of Determination
Steps of a Hypothesis test with Correlation
Spearman Rank Correlation Coefficient (rs)
Definition of Correlation
Computing Correlation
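\[
\mathrm{Correl}(X,Y) = \frac{\mathrm{Cov}(X,Y)}{\mathrm{SD}(X)\,\mathrm{SD}(Y)}
\]
where Cov(X,Y) is the covariance of X and Y, and SD(X) and SD(Y) are their standard deviations.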
[Figure: scale of correlation strength with the labels Very Strong Positive Association, Moderate Positive Association, Weak Positive Association, No Linear Association, Negligible Negative Association, Moderate Negative Association, Very Strong Negative Association]
3 Characteristics of Relationship
Form
Linear
Curvilinear
No pattern
Direction (Based on sign)
Positive
Negative
Strength (Based on Numerical Value)
Closer to 1
Closer to 0
[Figure: scatter plots illustrating a Linear Relationship, No Linear Relationship, and No Relationship (Zero Correlation)]
Direction (Based on sign)
Sign of the CORRELATION COEFFICIENT (+ or -)
POSITIVE: X and Y tend to change in the SAME DIRECTION
- Higher X, Higher Y
- Lower X, Lower Y
NEGATIVE: X and Y tend to change in OPPOSITE DIRECTIONS
- Higher X, Lower Y
- Lower X, Higher Y
POSITIVE DIRECTION
Time Studying in Hours (X)    Quiz Grade (y)
0                             65
0                             60
1                             70
2                             75
3                             80
3                             85
4                             90
4                             90
5                             95
5                             98
6                             99
6                             100
Strong positive relationship between studying and quiz grades (r = .989)
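The reported r for the studying data can be checked directly from the definition of correlation (covariance divided by the product of the standard deviations). The sketch below is a minimal illustration; the function name `pearson_r` is our own choice, not something from the slides.

```python
import math

def pearson_r(xs, ys):
    """Pearson r: covariance divided by the product of standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations.
    # The 1/(n - 1) factors cancel in the ratio, so they are omitted.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Studying data from the table above.
hours = [0, 0, 1, 2, 3, 3, 4, 4, 5, 5, 6, 6]
grades = [65, 60, 70, 75, 80, 85, 90, 90, 95, 98, 99, 100]

r = pearson_r(hours, grades)
print(f"r = {r:.3f}")  # r = 0.989
```

The positive sign says grades tend to rise with hours studied; the value close to 1 says the points lie close to a straight line.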
NEGATIVE DIRECTION
Hours of Cardio per Week (X)    % Body Fat (y)
0                               22
0                               20
0                               19.2
2                               18
2                               18.1
3                               17.7
3                               16
4                               15
6                               14
8                               13
10                              11
12                              10
Strong negative relationship between cardio exercise and body fat (r = -.968)
Strength (Based on Numerical Value)
Absolute Value of the Correlation Coefficient (i.e., |r|, between 0 and 1)
Closer to 1
More Consistent Relationship
Data Points in Scatter-Plot are More Linear
Closer to 0
Less Consistent Relationship
Data Points in Scatter-Plot are Less Linear and More Scattered
r²     r      Effect Size / Correlation Strength
.01    .10    Small / Weak
.09    .30    Medium / Moderate
.25    .50    Large / Strong
.50    .70    Very Large / Very Strong
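As a rough illustration of how the benchmarks in the effect-size table can be applied, the sketch below labels a correlation by the benchmark its absolute value is closest to. The nearest-benchmark rule and the name `correlation_strength` are assumptions made for this example; the slides list only the benchmark values, not cut-off ranges.

```python
# Benchmark values of |r| and their labels, taken from the table above.
BENCHMARKS = [
    (0.10, "Small / Weak"),
    (0.30, "Medium / Moderate"),
    (0.50, "Large / Strong"),
    (0.70, "Very Large / Very Strong"),
]

def correlation_strength(r):
    """Return the label of the benchmark closest to |r| (assumed rule)."""
    size = abs(r)  # strength ignores the sign; the sign only gives direction
    _, label = min(BENCHMARKS, key=lambda pair: abs(size - pair[0]))
    return label

for r in (0.989, -0.968, 0.283, -0.085):
    print(r, "->", correlation_strength(r))
```

Under this rule the four example correlations in this deck come out as very strong, very strong, moderate, and weak, matching the wording of their captions.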
STRONGEST STRENGTH
Daily Calories (X)    Weight (y)
1000                  100
1500                  150
2000                  200
2500                  250
3000                  300
3500                  350
4000                  400
4500                  450
5000                  500
Number of Siblings (X)    Number of Children (y)
1                         1
1                         1
1                         2
1                         2
1                         3
2                         1
2                         1
2                         4
2                         5
2                         5
3                         5
4                         1
4                         2
5                         5
5                         3
Moderate positive relationship between number of siblings and number of children (r = .283)
WEAK STRENGTH
Anatomy and Physiology Grade (X)    Percentage of Passing Stations on OSCE (y)
60                                  60
63                                  85
65                                  40
70                                  90
71                                  60
73                                  80
74                                  55
79                                  100
79                                  79
80                                  70
81                                  50
82                                  79
83                                  82
85                                  40
87                                  85
90                                  40
91                                  100
95                                  75
97                                  30
Weak negative correlation between final grade in anatomy and physiology and percentage of passing stations on the objective structured clinical examination (r = -.085)
Properties of Correlation
- Measure of Linear Association
- Categorical Data
- Related to Sample Size
- Sensitive to OUTLIERS
Coefficient of Determination
Proportion of common variation of two variables
Example: r² = 67%
67% of variation in x is related to variation in y.
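The arithmetic is just squaring r, or taking a square root to go the other way. The short sketch below uses the r = .989 reported for the studying data and the 67% example above; the function name is illustrative only.

```python
import math

def coefficient_of_determination(r):
    """Proportion of common variation: the square of the correlation."""
    return r ** 2

# From r to r^2, using r = .989 from the studying example.
r_squared = coefficient_of_determination(0.989)
print(f"r^2 = {r_squared:.1%}")  # r^2 = 97.8%

# From r^2 back to the size of r, using the 67% example.
# The square root gives only |r|; the sign (direction) is lost when squaring.
abs_r = math.sqrt(0.67)
print(f"|r| = {abs_r:.3f}")  # |r| = 0.819
```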
Example: Rising Hills Manufacturing
Rising Hills Manufacturing wishes to study the relationship
between the number of workers and the number of tables
produced in its plant.
X = number of workers
y = number of tables produced
How would you describe
this relationship?
r = .989
Two-Tailed
H0: ρ = 0    H1: ρ ≠ 0
\[
r = \frac{\mathrm{Cov}(X,Y)}{\mathrm{SD}(X)\,\mathrm{SD}(Y)}
\]
Step 4. Make a Decision and State the Conclusion
If the observed (calculated) r falls in the critical region:
- REJECT H0: the correlation is significant (i.e., the
relationship between X and Y is not likely due to chance, and you
would expect to see a relationship between X and Y in the
population)
If the observed r does not fall in the critical region:
- FAIL TO REJECT H0 (this is not the same as accepting it)
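One common way to carry out this decision, sketched below, is to convert r into a t statistic with n - 2 degrees of freedom and compare it with a critical t value. This is an assumption about method: the slides compare r with a critical region directly and do not show the t conversion. The critical value 2.228 (two-tailed, alpha = .05, df = 10) is taken from a standard t table, and r and n come from the studying example (12 pairs, r = .989).

```python
import math

def correlation_t_statistic(r, n):
    """t statistic for testing H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

r, n = 0.989, 12      # studying example: 12 (X, y) pairs
T_CRITICAL = 2.228    # assumption: two-tailed, alpha = .05, df = n - 2 = 10

t = correlation_t_statistic(r, n)
reject_h0 = abs(t) > T_CRITICAL  # two-tailed: compare the absolute value

print(f"t = {t:.2f}")
print("Reject H0" if reject_h0 else "Fail to reject H0")
```

Here t is far beyond the critical value, so H0 is rejected and the correlation is treated as significant.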
Spearman Rank Correlation Coefficient (rs)
It is a non-parametric measure of correlation.
This procedure makes use of the two sets of ranks that may
be assigned to the sample values of X and Y.
Spearman Rank correlation coefficient could be computed
in the following cases:
- Both variables are quantitative.
- Both variables are qualitative ordinal.
- One variable is quantitative and the other is qualitative
ordinal.
Formula for Spearman Rank Correlation Coefficient (rs)
1. In your third column, rank the data in your first column from 1 to n (the
number of data pairs you have). Give the lowest number a rank of 1, the next
lowest number a rank of 2, and so on.
2. In your fourth column, do the same as in step 1, but rank the
second column instead.
If two (or more) pieces of data in one column are the same, find the mean of
the ranks they would have received had they been ranked normally, then give
each of them this mean rank.
3. In your fifth column, calculate the difference d between the two ranks in
each row.
4. In your sixth column, square each difference to get d².
5. Add up all the values in the "d²" column. This value is Σd².
Example: Σd² = 6
6. Insert this value into the simplified Spearman's Rank
Correlation Coefficient formula, rs = 1 - 6Σd² / (n(n² - 1)), and replace
"n" with the number of pairs of data you have to calculate the answer.
7. Interpret your result. It can vary between -1 and 1.
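The ranking steps can be sketched in code as follows. This is a minimal illustration, not the slides' own worked example: the data values and function names are made up, and the simplified formula rs = 1 - 6Σd² / (n(n² - 1)) is exact only when there are no tied ranks.

```python
def average_ranks(values):
    """Rank from 1 (lowest) to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of tied values starting at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman_rs(xs, ys):
    """Simplified Spearman formula: 1 - 6 * sum(d^2) / (n * (n^2 - 1))."""
    n = len(xs)
    rank_x = average_ranks(xs)
    rank_y = average_ranks(ys)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

# Hypothetical data with no ties.
xs = [10, 20, 30, 40, 50]
ys = [1, 3, 2, 5, 4]
rs = spearman_rs(xs, ys)
print(f"rs = {rs:.2f}")  # rs = 0.80

# Tie handling: the two 3s would occupy ranks 1 and 2, so each gets 1.5.
print(average_ranks([5, 3, 3, 8]))  # [3.0, 1.5, 1.5, 4.0]
```

In the no-ties example the rank differences are 0, -1, 1, -1, 1, so Σd² = 4 and rs = 1 - 24/120 = 0.8.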
Pearson Correlation vs Spearman Rank
Pearson Correlation
- Pearson correlation is the "true" correlation between variables, a measure
of their tendency to rise and fall together.
- The Pearson correlation evaluates the linear relationship between two
continuous variables.
Spearman Rank
- Spearman correlation means you rank the data (1st through nth) and then
take the correlation of the ranks instead of the actual data.
- The Spearman correlation coefficient is based on the ranked values for each
variable rather than the raw data.
If you think the relationship is linear, Pearson is better.
If not, Spearman is better.
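To illustrate the difference, the sketch below applies both measures to a made-up relationship that always increases but is not linear (y = x³). Spearman is computed here as the Pearson correlation of the ranks, as described above. The data, the helper names, and the no-ties shortcut in `simple_ranks` are assumptions for this example only.

```python
import math

def pearson_r(xs, ys):
    """Pearson r: covariance divided by the product of standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def simple_ranks(values):
    """Rank from 1 (lowest) to n. Assumes no tied values."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]

# Hypothetical data: y always rises with x, but along a curve.
x = [1, 2, 3, 4, 5, 6]
y = [v ** 3 for v in x]

pearson = pearson_r(x, y)
spearman = pearson_r(simple_ranks(x), simple_ranks(y))

print(f"Pearson  r  = {pearson:.3f}")
print(f"Spearman rs = {spearman:.3f}")
```

Spearman comes out at exactly 1 because the ranks agree perfectly, while Pearson falls below 1 because the points do not lie on a straight line.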