
Quantitative Reasoning

Association
KEY CONCEPTS

H. Ghaedamini (PhD)
(Ali)

stahgh@nus.edu.sg
Instructor
Department of Statistics and Applied Probability
1- Relationship between two variables

Deterministic relationship
The value of the dependent variable can be determined from the value of the independent variable, e.g. Y = 3X + 2.

Statistical relationship
The average pattern of one variable can be described with the value of the other variable.

Categorical variables
Contingency table
Odds Ratio and Risk Ratio

Numerical variables
Scatter diagram
Linear correlation coefficient

For more information on types of variables:
https://statistics.laerd.com/statistical-guides/types-of-variable.php

2- Association between categorical variables, 2*2 contingency table

                    Outcome
Exposure    O1      O2      Total
E1          a       c       a+c
E2          b       d       b+d
Total       a+b     c+d     a+b+c+d

Each cell is a count (number of individuals).



Risk of developing O1 if exposed to E1 = a/(a+c)
Risk of developing O1 if exposed to E2 = b/(b+d)
Risk Ratio = [a/(a+c)] / [b/(b+d)]

Odds of O1 among E1 = a/c
Odds of O1 among E2 = b/d
Odds Ratio = (a/c) / (b/d) = ad/(bc)
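
As a minimal sketch of the two ratios defined above, the following Python snippet computes them from the four cell counts of the 2*2 table. The function names and the counts are hypothetical and chosen only for illustration; they do not come from the slides.

```python
def risk_ratio(a, b, c, d):
    """Risk of O1 under E1 divided by risk of O1 under E2."""
    risk_e1 = a / (a + c)  # E1 row: a individuals with O1, c with O2
    risk_e2 = b / (b + d)  # E2 row: b individuals with O1, d with O2
    return risk_e1 / risk_e2


def odds_ratio(a, b, c, d):
    """Odds of O1 among E1 divided by odds of O1 among E2."""
    return (a / c) / (b / d)  # algebraically equal to (a*d) / (b*c)


# Hypothetical counts, for illustration only.
a, c = 30, 70  # exposed to E1: 30 with O1, 70 with O2
b, d = 10, 90  # exposed to E2: 10 with O1, 90 with O2

rr = risk_ratio(a, b, c, d)
or_ = odds_ratio(a, b, c, d)
print(f"Risk Ratio = {rr:.2f}")
print(f"Odds Ratio = {or_:.2f}")
```

With these made-up counts the Risk Ratio is 0.30/0.10 = 3 and the Odds Ratio is (30*90)/(10*70), about 3.86. Both differ from 1, which points to an association between exposure and outcome in this hypothetical table.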

3- Design of the study (Exam point!)

COHORT STUDY
Starts from exposure
Example: 100 smokers, 100 non-smokers
Looks into the future
Expensive
Risk Ratio and Odds Ratio can be computed
Association is there if OR or RR is not equal to 1

CASE CONTROL STUDY
Starts from outcome or disease
Example: 100 with cancer, 100 without cancer
Looks at the background (past exposure)
Convenient
Only Odds Ratio can be computed
Association is there if OR is not equal to 1

CROSS SECTIONAL STUDY
Starts from a sample of the population
Example: 1000 random people
Looks at their disease and exposure status
Convenient
Odds Ratio and Risk Ratio can be computed
Association is there if OR or RR is not equal to 1

Let me think deeper: Why can the Risk Ratio not be computed from a case-control study?
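
One way to explore the question about case-control studies is the sketch below. It assumes a hypothetical population whose full 2*2 table is known, then builds a case-control sample that keeps every case but only one in ten non-cases. All counts, names and the sampling fraction are assumptions made for illustration. In a case-control design the investigator chooses how many cases and controls to recruit, so a/(a+c) no longer estimates the risk in the population, whereas the Odds Ratio is unaffected by that choice.

```python
def risk_ratio(a, b, c, d):
    """[a/(a+c)] / [b/(b+d)], using the cell labels of the 2*2 table."""
    return (a / (a + c)) / (b / (b + d))


def odds_ratio(a, b, c, d):
    """(a/c) / (b/d), written in the equivalent form ad/(bc)."""
    return (a * d) / (b * c)


# Hypothetical whole population (as a cohort study would observe it).
pop_a, pop_c = 30, 970  # E1: 30 with the disease, 970 without
pop_b, pop_d = 10, 990  # E2: 10 with the disease, 990 without

rr_population = risk_ratio(pop_a, pop_b, pop_c, pop_d)
or_population = odds_ratio(pop_a, pop_b, pop_c, pop_d)

# Hypothetical case-control sample: keep all cases, 1 in 10 non-cases.
cc_a, cc_b = pop_a, pop_b
cc_c, cc_d = pop_c // 10, pop_d // 10  # 97 and 99 controls

rr_case_control = risk_ratio(cc_a, cc_b, cc_c, cc_d)
or_case_control = odds_ratio(cc_a, cc_b, cc_c, cc_d)

print(f"Population:   RR = {rr_population:.2f}, OR = {or_population:.2f}")
print(f"Case-control: RR = {rr_case_control:.2f}, OR = {or_case_control:.2f}")
```

In this made-up example the Odds Ratio is the same in the population and in the case-control sample, while the "Risk Ratio" computed from the sample changes with the chosen number of controls and so does not reflect the population value of 3.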
4- Association between continuous variables, linear correlation coefficient, r

Bivariate data (X, Y)
Scatter diagram
Linear correlation coefficient, r
Measures how close the points are to a line.
-1 ≤ r ≤ +1
The closer the value is to +1 or -1, the stronger the linear association.

Representative line
The line for which the total distance from the points to the line is smallest (for the usual least-squares line, the sum of squared vertical distances is minimized).
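
The following is a minimal sketch of how r and a representative line might be computed from bivariate data. It assumes the representative line is the ordinary least-squares line, which the slides do not state explicitly. The function names and the data are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Linear correlation coefficient r of paired data."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def least_squares_line(xs, ys):
    """Slope and intercept minimizing the sum of squared vertical distances."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys_scattered = [2, 4, 5, 4, 5]       # statistical relationship
ys_exact = [3 * x + 2 for x in xs]   # deterministic relationship Y = 3X + 2

r_scattered = pearson_r(xs, ys_scattered)
r_exact = pearson_r(xs, ys_exact)
slope, intercept = least_squares_line(xs, ys_scattered)

print(f"r (scattered points) = {r_scattered:.3f}")
print(f"r (points on a line) = {r_exact:.3f}")
print(f"least-squares line: y = {slope:.2f} x + {intercept:.2f}")
```

For the scattered data r is 6/sqrt(60), about 0.775, while points lying exactly on Y = 3X + 2 give r = 1, the strongest possible positive linear association.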

5- Linear correlation coefficient

Ecological correlation
Correlation based on aggregated data, such as group averages or rates.
The association tends to be overstated when it is based on aggregated data.

Ecological fallacy
Drawing inferences about the correlation among individuals from aggregated data.

Atomistic fallacy
Generalizing a correlation observed among individuals to the aggregate-level correlation.

Attenuation effect
When the range of one variable is restricted, the correlation coefficient obtained tends to understate the strength of association between the two variables.

Regression towards mediocrity
In virtually all test-retest situations, the bottom group on the first test will, on average, show some improvement on the second test, and the top group will, on average, fall back.
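
Two of the effects above, ecological correlation and attenuation, can be illustrated with a small sketch. The nine data points, the three groups and the restricted range below are all hypothetical and were constructed only to make the effects visible; real data need not behave this cleanly.

```python
import math


def pearson_r(xs, ys):
    """Linear correlation coefficient r of paired data."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical individuals in three groups: (x values, y values).
groups = {
    "G1": ([1, 2, 3], [2, 1, 3]),
    "G2": ([4, 5, 6], [5, 4, 6]),
    "G3": ([7, 8, 9], [8, 7, 9]),
}

# Correlation across all individuals.
all_x = [x for xs, _ in groups.values() for x in xs]
all_y = [y for _, ys in groups.values() for y in ys]
r_individual = pearson_r(all_x, all_y)

# Ecological correlation: correlate the group averages instead.
mean_x = [sum(xs) / len(xs) for xs, _ in groups.values()]
mean_y = [sum(ys) / len(ys) for _, ys in groups.values()]
r_ecological = pearson_r(mean_x, mean_y)

# Attenuation: keep only individuals with x restricted to 4..6.
restricted = [(x, y) for x, y in zip(all_x, all_y) if 4 <= x <= 6]
r_restricted = pearson_r([x for x, _ in restricted],
                         [y for _, y in restricted])

print(f"individuals:      r = {r_individual:.2f}")
print(f"group averages:   r = {r_ecological:.2f}")
print(f"restricted range: r = {r_restricted:.2f}")
```

In this constructed example the group averages give r = 1.00, the individuals give r = 0.95, and the range-restricted subset gives r = 0.50. The aggregated data overstate the individual-level association, and restricting the range understates it.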

Disclaimer
This document serves as supplementary reading material and is not to be seen as any module's syllabus.
This document is subject to errors and changes at any time, and the author is not liable for any effects of the changes.
