Cohort Study: Steps, Types, Advantages and Challenges

Cohort Study
Subodh S Gupta
Dr. Sushila Nayar School of Public Health MGIMS, Sewagram
Type of study | Alternate name | Unit of study

Observational studies
  Descriptive studies
  Analytical studies
    Ecological | Correlational | Populations
    Cross-sectional | Prevalence | Individuals
    Case-control | Case-reference | Individuals
    Cohort | Follow-up/ Longitudinal | Individuals

Experimental/ intervention studies
    Randomized controlled trials | Clinical trials | Patients
    Field trials | | Healthy persons
    Community trials | Community intervention studies | Communities
Origin of word cohort

- The word cohort has its origin in the Latin cohors
- Cohors refers to warriors and gives the notion of a group of persons proceeding together in time
- A cohort is a group of persons with a common statistical characteristic, e.g. age, birth date
Definition & Synonyms

Definition
The cohort study is an observational epidemiological study which, after the manner of an experiment, attempts to study the relationship between a purported cause (exposure) and the subsequent risk of developing disease.
Synonyms
- Follow-up study
- Longitudinal study
- Prospective study
- Incidence study
The cohort design

- Groups are exposure based: the group or groups of persons to be studied are defined in terms of characteristics manifest prior to the appearance of the disease under investigation
- The study is conceptually longitudinal: the study groups so defined are observed over a period of time to determine the frequency of disease among them
- A definite beginning and end
The cohort design

- Efficient for examining common outcomes
- Indicated when:
  - There is good evidence of an association between exposure and disease
  - Exposure is rare but incidence of disease is high among the exposed
  - Follow-up is easy and the cohort is stable
  - Ample funds are available
The cohort design

- Many different outcomes can be studied for the same exposure
- The dynamic nature of many risk factors and their relations in time to disease occurrence can be captured (this cannot be done in a cross-sectional study, and only with difficulty in a case-control study)
- Shows associations (not cause and effect)
- Can estimate incidence within risk factor groups
- Cannot estimate prevalence of the risk factor
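Case control study

[Figure: Population -> Cases (people with disease) and Controls (people without disease); each group is traced back to Exposed / Not exposed. Axis: Time. The direction of enquiry runs backward, from disease to exposure.]

Cohort study

[Figure: Population -> People without the outcome -> Exposed / Not exposed; each group is followed to Diseased / Not diseased. Axis: Time. The direction of enquiry runs forward, from exposure to disease.]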
Types of cohort study

- Historical/ Retrospective/ Non-concurrent
- Prospective/ Concurrent
The distinction between retrospective and prospective cohort studies is important, not because of any conceptual difference or differences in interpretability of findings, but because of relevance to some practical issues, mostly the ability to control confounding.
Point in time when enquiry begins?

[Figure: Population -> People without the outcome -> Exposed / Not exposed -> Diseased / Not diseased, drawn along a time axis with the direction of enquiry. The diagram is repeated under each of the following titles:]
- Both exposures and outcomes measured prospectively
- Exposures measured retrospectively and outcomes prospectively
- Both exposures and outcomes measured retrospectively
Advantages
- Direct estimate of risk and rate of disease occurrence over time
- An efficient means of studying rare exposures
- Can assess multiple outcomes of a single exposure
- Establishes the temporal relationship between exposure and outcome: exposure definitely precedes the outcome
- Avoids recall bias and survival bias
- Does not require strict random assignment of subjects
- Can be done with original data or secondary data
Best observational design to establish association
Disadvantages
- Very large sample sizes, especially for rare outcomes
- Expensive and time-consuming
- Attrition problem (loss to follow-up)
- Differences in the quality of measurement of exposure or disease between the cohorts may introduce misclassification (information bias)
- Cannot infer causal relation
- Very specific finding
- Complexity of data analysis
- Ethical issues
- Study effects
Alternate designs and concerns

- Two separate cohorts: exposed and unexposed subjects
- Omission of non-factor group
- Use of external comparison
- Use of mortality rather than morbidity as outcome
- Event notification arises from routine statistics, rather than special observations
- Comparison of several groups
- Competing causes of death
Cohort Study: Steps
Steps in conducting cohort study

1. Identification of study population and initial steps
2. Measurement of exposure
3. Selection of study and comparison cohorts
4. Follow-up (for outcome measurement)
5. Data analysis
Types of cohorts
- Closed or fixed cohorts:
  - Fixed group of persons followed from a certain point in time until a defined endpoint
  - Starting point: exposure-defining event
  - Endpoint: occurrence of the disease, loss to follow-up, death
  - The exposure is an event which occurs only once
- Open or dynamic cohorts:
  - Subjects may enter or leave the study at any time
  - Exposure status may change over time
Cohorts
- General population cohorts: population groups offering special resources for follow-up or data linkage are chosen, and the individuals are subsequently allocated according to their exposure status
- Special exposure cohorts: samples chosen on the basis of a particular exposure
- Exposures may be a particular event, a permanent state or a reversible state
General population cohorts (groups offering special resources)

- Groups with readily available health records
- Certain professional categories
- Obstetric populations
- Volunteer groups
- Geographically identified cohorts
- Record linkage
Special exposure cohorts (groups offering special resources)

- Exposed to certain factor or event
- Occupational groups
- Based on qualitative characteristics
Population-based Cohort Studies

Advantages
- Estimation of distributions and prevalence rates of relevant variables
- Risk factor distributions
- Ideal setting in which to carry out unbiased evaluation of relations
Selection of comparison group

- Internal comparison
  - Only one cohort identified
  - Later on, classified into study and comparison cohorts based on exposure
- External comparison
  - More than one cohort identified
  - e.g. a cohort of radiologists compared with ophthalmologists
- Comparison with general population rates
  - If no comparison group is available, we can compare the rates of the study cohort with those of the general population
  - e.g. cancer rate of uranium miners compared with the cancer rate in the general population
Ideal Cohort
- Stable cohort
- Cooperative cohort
- Committed cohort
- Well-informed cohort
Exposure measurement
- Exposures: exogenous and/ or endogenous
  - Reference period
  - Frequency of follow-up
- Challenge of prospective data collection
  - Changes in instrument over time
  - Use of repeated measures
  - Data collection costs
Sources of information
- Records
- Cohort members: self-administered questionnaires, interviews, telephone interviews, mailed questionnaires
- Medical examination and biomarkers: clinic examinations and lab tests
- Measures of the environment: level of air pollution, quality of drinking water, airborne radiation
- Multiple methods
Follow-up: Types of outcomes

Discrete events
- Single events
  - Mortality
  - First occurrence of a disease or health-related outcome
- Multiple occurrences
  - Disease outcome
  - Transition between states of health/ disease
  - Transitions between functional states
- Level of a marker
Exercise 1
An investigator wants to discover whether or not being overweight in adolescence increases the risk of cardiovascular mortality in adulthood.
a) Assuming historical records are available, would a prospective or retrospective study be more practical?
b) Who would comprise the investigator's cohort under study?
c) Who would comprise the investigator's exposed and unexposed groups in this cohort?
Group Exercise
Design a cohort study:
- Outline the steps you will need to carry out for this study
- Describe the special efforts you may need to make for follow-up of the study subjects
- Describe the care you will need to take to reduce measurement bias
- Calculate the sample size
Challenges in conducting Cohort Study
Challenge 1: Multiple dimensions of time in cohort study

[Figure: timeline from start of study to end of study, with Age and Calendar period as time dimensions, and Exposure 1, Exposure 2, ..., Exposure i and Covariate 1, Covariate 2, ..., Covariate i measured along it]
Challenge 2: Retaining cohort study members

- Loss to follow-up
  - Dropouts
  - Cannot be traced
- Of more concern: those who cannot be traced; they may have moved because they have developed the disease
Effect of Nonresponse
- Nonresponse: a major problem
- Differential nonresponse will distort the true relationship between exposure and outcome
Nonresponse: random or selective?

- Exposure data: find out whether nonrespondents are different from the respondents
  - Intensive efforts within the study design
  - Follow-up of the nonrespondents as well as respondents
Challenge 3: Large Modern Cohort Studies

- Huge requirements of resources and manpower
- Management of a huge database
- Follow-up
- Exposure information
- Data quality?
- Collection of biologic samples?
Challenge 4: Long term follow-up

- Operational problems
- Cumulative risk getting closer to one
Cohort Study Analysis
(Relation between exposure and outcome)

Standard 2 x 2 table

                     Disease present   Disease absent   Total
Exposure present     a                 b                a+b
Exposure absent      c                 d                c+d
Total                a+c               b+d              N
Two types of measures for rate

- Cumulative incidence = proportion of study subjects getting the outcome during the study period
- Incidence rate = new cases / person-time under observation
1. Cumulative incidence rate:

Number of new cases of disease occurring over a specified period of time, as a proportion of the population at risk at the start of that period.
EXAMPLE
A surveillance system for Hospital acquired infection among the postoperative patients in a month.
Example
[Figure not reproduced. Data labels: 9, 6, 14, 14, 24, 19, 14, 4, 5, 19, 21, 6; axis ticks: 10, 15, 20, 25, 30]
2. Incidence density:
Number of new cases of disease occurring over a specified period of time, divided by the total time at risk contributed by the population throughout the interval.
Incidence density requires us to add up the period of time each individual was present in the population and was at risk of becoming a new case of disease. Incidence density characteristically uses person-years at risk as the denominator. (The time unit can also be person-months, days, or even hours, depending on the disease process being studied.)
USES OF INCIDENCE DENSITY AND CUMULATIVE INCIDENCE
Incidence density gives the best estimate of the true risk of acquiring disease at any moment in time. Cumulative incidence gives the best estimate of how many people will eventually get the disease in an enumerated population.
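
The difference between the two measures can be sketched in a few lines of Python. The five patients below are invented purely for illustration (they are not the surveillance data from the example slide); each record holds the days a patient was at risk and whether the patient became a new case.

```python
# Hypothetical follow-up records: (days at risk, became a new case?)
# These five patients are invented for illustration only.
patients = [
    (30, False),
    (12, True),
    (30, False),
    (20, True),
    (8, False),  # left observation after 8 days without the outcome
]

new_cases = sum(1 for _, is_case in patients if is_case)

# Cumulative incidence: new cases / persons at risk at the start.
cumulative_incidence = new_cases / len(patients)

# Incidence density: new cases / total person-time at risk.
person_days = sum(days for days, _ in patients)
incidence_density = new_cases / person_days

print(f"Cumulative incidence: {cumulative_incidence:.2f}")
print(f"Incidence density: {incidence_density * 1000:.1f} per 1000 person-days")
```

The cumulative incidence ignores how long each person was observed, while the incidence density weights every person by the time actually spent at risk.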

Standard 2 x 2 table

                     Peripheral vascular disease
Cigarette smoking    Present    Absent    Total
Present              15         1712      1727
Absent               41         3188      3229
Total                56         4900      4956

l x 2 table

Cholesterol quintile   Disease present   Disease absent   Total
1st                    15                798              813
2nd                    20                794              814
3rd                    26                791              817
4th                    41                785              826
5th                    48                777              825
Total                  150               3945             4095
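
As a sketch of how an l x 2 table can be read, the snippet below computes the risk in each cholesterol quintile and the relative risk against the lowest quintile. The counts are the ones in the table; the choice of the 1st quintile as the reference group is an assumption made here for illustration.

```python
# Diseased subjects and totals per cholesterol quintile (1st to 5th),
# taken from the l x 2 table.
cases = [15, 20, 26, 41, 48]
totals = [813, 814, 817, 826, 825]

# Risk (cumulative incidence) in each quintile.
risks = [c / n for c, n in zip(cases, totals)]

# Relative risk of each quintile against the 1st quintile
# (reference group chosen here as an assumption).
relative_risks = [r / risks[0] for r in risks]

for quintile, (risk, rr) in enumerate(zip(risks, relative_risks), start=1):
    print(f"Quintile {quintile}: risk = {risk:.4f}, RR vs 1st = {rr:.2f}")
```

Treating an ordered exposure this way gives one relative risk per level, which makes a dose-response pattern visible when the risks rise steadily across levels.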
Comparing risks in different groups

- Relative risk or risk ratio (RR)
- Attributable risk or risk difference (AR)
- Attributable risk percent (AR%)
- Population attributable risk (PAR)
- Population attributable risk percent (PAR%)
- Odds ratio (OR)
Relative risk OR Risk ratio

- Ratio of the risk among exposed to the risk among unexposed: Risk (Exp) / Risk (Unexp)
- Risk of disease among exposed = a / (a + b)
- Risk of disease among unexposed = c / (c + d)
- RR = [a / (a + b)] / [c / (c + d)]
- Under the null hypothesis, the risk ratio equals one
- SE =
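
A minimal Python sketch of the risk ratio calculation follows. The SE expression is not given above, so the code assumes a commonly used large-sample formula for the standard error of ln(RR), namely sqrt(1/a - 1/(a+b) + 1/c - 1/(c+d)). The function name and the counts are hypothetical.

```python
import math


def risk_ratio_ci(a, b, c, d, z=1.96):
    """Risk ratio and approximate CI from a 2 x 2 cohort table.

    a, b: exposed with / without disease
    c, d: unexposed with / without disease
    """
    risk_exp = a / (a + b)
    risk_unexp = c / (c + d)
    rr = risk_exp / risk_unexp
    # Standard error of ln(RR): an assumed, commonly used formula,
    # not taken from the slides.
    se_log_rr = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, se_log_rr, lower, upper


# Hypothetical counts: 30 of 100 exposed and 10 of 100 unexposed develop disease.
rr, se, lower, upper = risk_ratio_ci(30, 70, 10, 90)
print(f"RR = {rr:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

The interval is built on the log scale because the sampling distribution of ln(RR) is closer to normal than that of RR itself; an interval that excludes one is evidence against the null hypothesis.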
Risk difference vs. Relative risk

[Figure: bar chart of absolute risk, lung cancer deaths per 100,000 adult males per year. Smokers: 191; Non-smokers: 8.7. The ratio of the two bars is the relative risk: 22.]
Attributable risk OR Risk difference

(Absolute differences in risks or rates)
- Also known as attributable risk
- Risk (Exp) - Risk (Unexp)
- Risk of disease among exposed = a / (a + b)
- Risk of disease among unexposed = c / (c + d)
- Risk difference = [a / (a + b)] - [c / (c + d)]
- Under the null hypothesis, the risk difference equals zero
Risk difference vs. Relative risk

[Figure: the same bar chart of absolute risks (Smokers: 191; Non-smokers: 8.7), with the gap between the two bars marked as the risk difference.]
Attributable risk percent among exposed

- Among the exposed, what percent of the total risk for disease is due to the exposure?
- AR% (Exposed) = [Risk (Exp) - Risk (Unexp)] / Risk (Exp) x 100
  = (RR - 1) / RR x 100
  = (OR - 1) / OR x 100 (if risk is small)
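
The following sketch applies these definitions to the lung cancer death rates quoted in the slides (191 per 100,000 per year for smokers and 8.7 for non-smokers). Variable names are illustrative only.

```python
# Lung cancer deaths per 100,000 adult males per year (from the slides).
risk_exposed = 191.0    # smokers
risk_unexposed = 8.7    # non-smokers

relative_risk = risk_exposed / risk_unexposed

# Attributable risk (risk difference): absolute excess risk in the exposed.
attributable_risk = risk_exposed - risk_unexposed

# Attributable risk percent among the exposed, computed two equivalent ways.
ar_percent = attributable_risk / risk_exposed * 100
ar_percent_from_rr = (relative_risk - 1) / relative_risk * 100

print(f"RR  = {relative_risk:.1f}")
print(f"AR  = {attributable_risk:.1f} per 100,000 per year")
print(f"AR% = {ar_percent:.1f}%")
```

The relative risk describes the strength of the association, while the attributable risk and AR% describe how much disease among the exposed could be ascribed to the exposure if the association were causal.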
Attributable Risk Percent

[Figure: bar chart of absolute risks (Smokers: 191; Non-smokers: 8.7; relative risk: 22). The smokers' bar, of height p0 x RR, is split into the risk due to background, p0 (the non-smokers' risk), and the risk due to exposure, p0 x (RR - 1).]

Attributable risk percent = (RR - 1) / RR x 100
Population attributable risk

- In the general population, how much of the total risk for disease is due to the risk factor?
- PAR = Risk (Total) - Risk (Unexp)
- Risk (Total) = [Proportion of population exposed x Risk (Exp)] + [Proportion of population unexposed x Risk (Unexp)]
Population attributable risk percent

- Among the general population, what percent of the total risk for disease is due to the risk factor?
- PAR% = [Risk (Total) - Risk (Unexp)] / Risk (Total) x 100
  = [Pe (RR - 1)] / [1 + Pe (RR - 1)] x 100
  where Pe is the proportion of the population exposed
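
A short sketch can confirm that the two expressions for PAR% agree. The risks are the lung cancer death rates quoted in the slides; the proportion exposed (30%) is a hypothetical value chosen only for illustration.

```python
# Lung cancer deaths per 100,000 adult males per year (from the slides).
risk_exposed = 191.0
risk_unexposed = 8.7

# Proportion of the population exposed: a hypothetical value.
p_exposed = 0.30

# Total population risk is the exposure-weighted average of the two risks.
risk_total = p_exposed * risk_exposed + (1 - p_exposed) * risk_unexposed

# Population attributable risk and its percent form.
par = risk_total - risk_unexposed
par_percent = par / risk_total * 100

# Same quantity from the relative risk and the proportion exposed.
rr = risk_exposed / risk_unexposed
par_percent_from_rr = p_exposed * (rr - 1) / (1 + p_exposed * (rr - 1)) * 100

print(f"Risk (Total) = {risk_total:.2f} per 100,000 per year")
print(f"PAR = {par:.2f} per 100,000 per year")
print(f"PAR% = {par_percent:.1f}%")
```

Unlike AR%, the population measure depends on how common the exposure is: the same relative risk yields a smaller PAR% when fewer people are exposed.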
Population attributable risk percent

[Figure: absolute risk of lung cancer death per 100,000 adult males per year for smokers (proportion Pe of the population) and non-smokers (proportion 1 - Pe), with regions labelled RR, Pe(RR - 1) and (RR - 1)(1 - Pe).]

Population attributable risk percent = [Pe (RR - 1)] / [1 + Pe (RR - 1)] x 100
Risk Reduction
- Risk (T/t) = a / (a + b)
- Risk (Exp) = c / (c + d)
- RR = Risk (T/t) / Risk (Exp)
- ARR = Risk (Exp) - Risk (T/t)
- RRR = [Risk (Exp) - Risk (T/t)] / Risk (Exp) = 1 - Risk (T/t) / Risk (Exp) = 1 - RR
- NNT = 1 / ARR = 1 / [Risk (Exp) x RRR]
- NNH
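
These risk reduction measures can be checked with a small worked example. The counts below are hypothetical (10 of 100 treated and 20 of 100 untreated subjects develop the outcome), and the variable names are illustrative.

```python
# Hypothetical counts, for illustration only.
a, b = 10, 90   # treated: with / without the outcome
c, d = 20, 80   # comparison group: with / without the outcome

risk_treated = a / (a + b)
risk_comparison = c / (c + d)

rr = risk_treated / risk_comparison      # relative risk
arr = risk_comparison - risk_treated     # absolute risk reduction
rrr = arr / risk_comparison              # relative risk reduction = 1 - RR
nnt = 1 / arr                            # number needed to treat

print(f"RR = {rr:.2f}, ARR = {arr:.2f}, RRR = {rrr:.2f}, NNT = {nnt:.0f}")
```

The same relative risk reduction implies a very different NNT when the baseline risk changes, which is why the absolute measures are reported alongside the relative ones.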
Analytical considerations
- Concurrent follow-up
- Varying follow-up dates
- Moving baseline dates
- Withdrawals
- Competing causes of death
Analytical considerations
- Concurrent follow-up
  - Simple risk-based analyses
  - Survival analysis
- Varying follow-up dates
  - Simple risk analysis for all events up to, but not exceeding, the minimum elapsed time
  - Survival analysis
- Moving baseline dates
  - Ignore and measure elapsed time since recruitment
  - Survival analysis
- Withdrawals
- Competing causes of failure
Advanced methods
- Standardization
- Stratification
- Life tables
- Multivariate analysis and Cox regression
Exercise 2
A cohort study is set up to explore the relationship between visual impairment and the risk of injuries from falls among the elderly. A total of 400 visually impaired (VI) persons >70 yrs are compared against 400 controls without VI. Over a 5-year follow-up period, 80 VI persons and 20 non-VI persons have injuries from falls.
a) Construct a 2x2 table from the information above
b) Calculate the following with their CI:
- Cumulative incidence rate for exposed and unexposed
- Relative risk
- Attributable risk and attributable risk percent
Exercise 3
A retrospective cohort study explores the relationship between perimenopausal exogenous estrogen use and the risk of coronary heart disease (CHD). A total of 5000 exposed and 5000 unexposed women are enrolled and followed for 15 years for the development of myocardial infarction (MI). A total of 200 estrogen users and 300 nonusers had MIs.
Exercise 3 (Contd.)
a) The risk (CI) of an MI among estrogen users
b) The risk (CI) of an MI among nonusers of estrogen
c) The relative risk (CIR) for MI
d) Based on the results of this study, is estrogen use a causative or protective factor for MI?
Exercise 4
- Shaper et al. (1988)
- A random sample of 7729 middle-aged British men
- Each man was asked, at baseline, about his alcohol consumption
- Over the next 7.5 years, death certificates were collected for any subject who died

Alcohol consumption group (units/wk)   Subjects   Deaths
None                                   466        41
Occasional (<1)                        1845       142
Light (1-15)                           2544       143
Moderate (16-42)                       2042       116
Heavy (>42)                            832        62
Exercise 4 (Contd.)
a) Calculate the risk and the relative risk for each alcohol consumption group.
b) Why might a conclusion based on the above table be misleading? Given adequate funding, describe how?
Exercise 5
In a cohort study of 34387 menopausal women in Iowa, intakes of certain vitamins were assessed in 1986. In the period up to the end of 1992, 879 of these women were newly diagnosed with breast cancer. The table below shows data for two vitamins, classified according to ranked categories of intake.
PY = person-years

                  Vitamin C            Vitamin E
Intake category   Events    PY         Events    PY
1 (low)           507       124,373    570       143,117
2                 217       57,268     129       33,950
3                 76        19,357     71        19,536
4                 55        17,013     28        6,942
5 (high)          24        7,711      81        22,176
Exercise 5 (Contd.)
a) For each vitamin, calculate the relative rates (with 95% confidence intervals), taking the low-consumption group as the base. Do your results suggest any beneficial (or otherwise) effect of additional vitamin C or E intake?
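
One possible way to compute a relative rate from events and person-years is sketched below. It deliberately uses invented counts rather than the exercise data, and it assumes the common large-sample approximation SE[ln(rate ratio)] = sqrt(1/events1 + 1/events0). The function name is hypothetical.

```python
import math


def rate_ratio_ci(events_1, py_1, events_0, py_0, z=1.96):
    """Rate ratio of group 1 vs. base group 0, with an approximate CI.

    Assumes SE[ln(rate ratio)] = sqrt(1/events_1 + 1/events_0).
    """
    rate_1 = events_1 / py_1
    rate_0 = events_0 / py_0
    ratio = rate_1 / rate_0
    se_log = math.sqrt(1 / events_1 + 1 / events_0)
    lower = math.exp(math.log(ratio) - z * se_log)
    upper = math.exp(math.log(ratio) + z * se_log)
    return ratio, se_log, lower, upper


# Invented counts: 40 events in 10,000 person-years vs. 60 events
# in 10,000 person-years in the base group.
ratio, se_log, lower, upper = rate_ratio_ci(40, 10_000, 60, 10_000)
print(f"Rate ratio = {ratio:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

With person-time data the denominator is person-years rather than persons, so the comparison is a ratio of rates, not of risks.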
Types of bias
- Selection bias
- Follow-up bias
- Information bias
- Confounding bias
- Post hoc bias
Selection bias
- Group studied does not reflect the same distribution of factors (such as age, sex, SES, behavior etc.) as occurs in the general population
  - Effect of volunteering
  - Whole spectrum of independent variables not represented in the study group
  - Presence of incipient disease
  - Distribution of covariates
  - Survival cohorts: cohorts ascertained long after exposure
Follow-up bias
- Also known as migration bias
- In nearly all large studies some members of the original cohort drop out of the study
- If drop-outs occur randomly, such that characteristics of lost subjects in one group are on average similar to those who remain in the group, no bias is introduced
- But ordinarily the characteristics of the lost subjects are not the same
Example of lost to follow-up

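Original cohort

Exposure (irradiation)   Diseased   Total
+                        50         10000
-                        100        20000
Total                    150        30000

RR = (50/10000) / (100/20000) = 1

Cohort remaining after loss to follow-up

Exposure (irradiation)   Diseased   Total
+                        30         4000
-                        30         8000
Total                    60         12000

RR = (30/4000) / (30/8000) = 2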
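
The arithmetic of this example can be reproduced with a short sketch. The counts are those of the irradiation example; the helper function and variable names are illustrative. It also shows why the bias arises: the proportion of cases retained differs between the exposed and unexposed groups even though the overall retention is the same.

```python
def risk_ratio(cases_exp, n_exp, cases_unexp, n_unexp):
    """Risk ratio from case counts and group sizes."""
    return (cases_exp / n_exp) / (cases_unexp / n_unexp)


# Full cohort: (cases, total) for irradiated and non-irradiated subjects.
full = {"exposed": (50, 10000), "unexposed": (100, 20000)}
# Subjects still under observation after loss to follow-up.
remaining = {"exposed": (30, 4000), "unexposed": (30, 8000)}

rr_full = risk_ratio(*full["exposed"], *full["unexposed"])
rr_remaining = risk_ratio(*remaining["exposed"], *remaining["unexposed"])
print(f"RR in full cohort: {rr_full:.1f}")
print(f"RR after loss to follow-up: {rr_remaining:.1f}")

# Retention of cases vs. retention of all subjects, by exposure group.
for group in ("exposed", "unexposed"):
    cases_kept = remaining[group][0] / full[group][0]
    total_kept = remaining[group][1] / full[group][1]
    print(f"{group}: {cases_kept:.0%} of cases kept, {total_kept:.0%} of subjects kept")
```

Both groups keep 40% of their subjects, but the exposed group keeps a larger share of its cases, so a true relative risk of 1 appears as 2.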
Example. healthy worker effect

- Question: association between formaldehyde exposure and eye irritation
- Subjects: factory workers exposed to formaldehyde
- Bias: those who suffer most from eye irritation are likely to leave the job at their own request or on medical advice
- Result: remaining workers are less affected; the association is diluted
Measurement / (Mis) classification

- Exposure misclassification occurs when exposed subjects are incorrectly classified as unexposed, or vice versa
- Disease misclassification occurs when diseased subjects are incorrectly classified as non-diseased, or vice versa
Misclassification bias: due to measurement errors

- Systematic bias
- Measurement errors
  - Non-differential: observed relative risk is biased towards the null hypothesis
  - Differential: study results cannot be interpreted, because the observed relative risk may be biased towards the null, away from the null, or cross over the null value compared with the true relative risk
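
The pull towards the null under non-differential misclassification can be illustrated with expected counts. Everything in this sketch is hypothetical: the cohort sizes, the true risks (giving a true RR of 2), and the sensitivity and specificity of the exposure measurement, which are assumed to be the same for diseased and non-diseased subjects.

```python
def observed_rr(n_exp, n_unexp, risk_exp, risk_unexp, sensitivity, specificity):
    """Expected observed RR when exposure is misclassified non-differentially."""
    cases_exp = n_exp * risk_exp
    cases_unexp = n_unexp * risk_unexp

    # Subjects (and cases) classified as exposed: correctly classified
    # exposed plus falsely classified unexposed.
    obs_exp_n = sensitivity * n_exp + (1 - specificity) * n_unexp
    obs_exp_cases = sensitivity * cases_exp + (1 - specificity) * cases_unexp

    # Subjects (and cases) classified as unexposed.
    obs_unexp_n = (1 - sensitivity) * n_exp + specificity * n_unexp
    obs_unexp_cases = (1 - sensitivity) * cases_exp + specificity * cases_unexp

    return (obs_exp_cases / obs_exp_n) / (obs_unexp_cases / obs_unexp_n)


# Hypothetical cohort: 1000 exposed (risk 0.2), 1000 unexposed (risk 0.1).
true_rr = 0.2 / 0.1
rr_obs = observed_rr(1000, 1000, 0.2, 0.1, sensitivity=0.8, specificity=0.9)
print(f"True RR = {true_rr:.2f}, observed RR = {rr_obs:.2f}")
```

Because the errors mix exposed and unexposed subjects equally among cases and non-cases, the two observed groups become more alike and the observed RR moves towards one.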
Sources of measurement errors

- Selection/ design of the instrument to measure the exposure
- Omissions in the protocol for use of the instrument
- Poor execution of the study protocol
- Inherent subject characteristics
- Drift in accuracy of exposure measures over time
- Data processing and creation of exposure variables
Reassignment to exposure category

- Changes in dichotomous exposure, if not taken into consideration, will tend to make the strength of an observed association lower than that which actually existed
  - Latency is likely to be short
  - Exposure accumulates over time during the study
  - Very accurate results desirable
- Reassignment may not be possible
  - Closed cohort as a rule
  - Latency is very long
  - Duration of follow-up is very long
- Separate examination of outcome in those who changed exposure status during the study
Confounding bias
- Other factors which are associated with both outcome and exposure variables do not have the same distribution in the exposed and unexposed groups
Examples confounding
[Figure: COFFEE DRINKING -> HEART DISEASE, with SMOKING linked to both]
- Coffee drinkers are more likely to smoke
- Smoking increases the risk of heart disease
Resolving Confounding Bias

- Standardization
- Stratification
- Multivariate adjustment
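
Stratification can be illustrated with the coffee, smoking and heart disease example. The counts below are hypothetical and were chosen so that coffee has no effect within either smoking stratum; the pooled estimate uses the Mantel-Haenszel risk ratio, which is one common stratified estimator and is named here as an assumption rather than as the method intended by the slides.

```python
# Hypothetical counts per smoking stratum:
# (coffee cases, coffee non-cases, no-coffee cases, no-coffee non-cases)
strata = {
    "smokers": (80, 720, 20, 180),
    "non-smokers": (4, 196, 16, 784),
}


def risk_ratio(a, b, c, d):
    """Risk ratio from a 2 x 2 table (a, b exposed; c, d unexposed)."""
    return (a / (a + b)) / (c / (c + d))


# Crude RR: collapse the strata into a single 2 x 2 table.
a, b, c, d = (sum(cells) for cells in zip(*strata.values()))
crude_rr = risk_ratio(a, b, c, d)

# Mantel-Haenszel RR: weighted combination of the stratum-specific tables.
numerator = sum(a * (c + d) / (a + b + c + d) for a, b, c, d in strata.values())
denominator = sum(c * (a + b) / (a + b + c + d) for a, b, c, d in strata.values())
mh_rr = numerator / denominator

print(f"Crude RR = {crude_rr:.2f}")
for name, cells in strata.items():
    print(f"RR among {name} = {risk_ratio(*cells):.2f}")
print(f"Mantel-Haenszel RR = {mh_rr:.2f}")
```

The crude association between coffee and heart disease arises only because coffee drinkers are more often smokers; within each smoking stratum, and in the adjusted estimate, it disappears.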
Post hoc bias

- Use of data from a cohort study to make observations that were not part of the original study intent
Thank you
Internal & External validity
7/13/2007
PRD-91
