
A Short Introduction

to Epidemiology

Neil Pearce

Occasional Report Series No 2

Centre for Public Health Research


Massey University Wellington Campus
Private Box 756
Wellington, New Zealand

Phone: 64-4-3800-606
Fax: 64-4-3800-600
E-mail: cphr@massey.ac.nz
Website: http://www.publichealth.ac.nz/

Copies of this publication can be purchased in hard copy
through our website (NZ$20, US$10, €10), or downloaded for
free in pdf form from the website.

June 2003

ISBN 0-473-09560-2

ISSN 1176-1237

To Irihapeti Ramsden

Preface

Who needs another introductory epidemiology text? Certainly, there are many introductory epidemiology books currently in print, and many of them are excellent. Nevertheless, there are four reasons why I believe that this new text is justified.

Firstly, it is much shorter than most introductory texts, many of which contain more material than is required for a short introductory course. This is a short introduction to epidemiology, and is not intended to be comprehensive.

Secondly, I have endeavoured to show clearly how the different basic epidemiologic methods fit together in a logical and systematic manner. For example, I attempt to show how the different possible study designs relate to each other, and how they are different approaches to a common task. Similarly, I attempt to show how the different study design issues (confounding and other types of bias) relate to each other, and how the principles and methods of data analysis are consistent across different study designs and data types.

Thirdly, in this context, rather than attempt a comprehensive review of available methods (e.g. multiple methods for estimating confidence intervals for the summary risk ratio), I have attempted to select only one standard method for each application, which is reasonably robust and accurate, and which is consistent and coherent with the other methods presented in the text.

Finally, the field of epidemiology is changing rapidly, not only with regards to its basic methods, but also with regards to the hypotheses which these methods are used to investigate. In particular, in recent years there has been a revival in public health applications of epidemiology, not only at the national level, but also at the international level, as epidemiologists tackle global problems such as climate change. This text does not attempt to review the more complex measures used to consider such issues. However, it does provide a coherent and systematic summary of the basic methods in the field, which can be used as a logical base for the teaching and development of research into these more complex issues.

Chapter 1 gives a brief introduction to the field, with an emphasis on the broad range of applications and situations in which epidemiologic methods have been used historically, and will continue to be used in the future.

Part 1 then addresses study design options. Chapters 2 and 3 discuss the basic study designs involving studies with dichotomous outcome measures. Chapter 2 discusses incidence studies (including cohort studies) and describes the basic study design and the basic effect measures (i.e. incidence rates and rate ratios). It then presents incidence case-control studies as a more efficient means of obtaining the same findings. Chapter 3 similarly discusses prevalence studies, and prevalence case-control studies. Chapter 4 then considers study designs incorporating other axes of classification, continuous outcome measures (e.g. blood pressure) such as cross-sectional studies and longitudinal studies, or more complex study designs such as ecologic and multi-level studies. Chapter 5 addresses issues of measurement of exposure and disease.

Part 2 then addresses study design issues. Chapter 6 discusses issues of study size and precision. Chapter 7 considers general issues of validity, namely selection bias, information bias, and confounding. Chapter 8 discusses effect modification.

Finally, Part 3 considers what happens after the data are collected, with chapter 9 discussing data analysis and chapter 10 issues of interpretation.

I should stress that this book provides no more than a very preliminary introduction to the field. In doing so I have attempted to use a wide range of examples, which give some indication of the broad range of situations in which epidemiologic methods can be used. However, there are undoubtedly many other types of epidemiologic hypotheses and epidemiologic studies which are not represented in this book. In particular, my focus is on the use of epidemiology in public health, particularly with regard to non-communicable disease, and I include few examples from clinical epidemiology or from communicable disease outbreak investigations. Nevertheless, I hope that the book will be of interest not only to epidemiologists, but also to others who have other training but are involved in epidemiologic research, including public health professionals, policy makers, and clinical researchers.

Neil Pearce
Centre for Public Health Research
Massey University Wellington Campus
Private Box 756
Wellington, New Zealand

Acknowledgements

During the writing of this text, my salary was funded by the Health Research Council of New Zealand. I wish to thank Sander Greenland and Jonny Myers for their comments on the draft manuscript. I also wish to thank Massey University for support for my research programme.

Contents

1. Introduction
   Germs and miasmas
   Risk factor epidemiology
   Epidemiology in the 21st century

PART 1: STUDY DESIGN OPTIONS

2. Incidence studies
   Incidence studies
   Incidence case-control studies

3. Prevalence studies
   Prevalence studies
   Prevalence case-control studies

4. More complex study designs
   Other axes of classification
   Continuous outcome measures
   Ecologic and multilevel studies

5. Measurement of exposure and health status
   Exposure
   Health status

PART 2: STUDY DESIGN ISSUES

6. Precision
   Basic statistics
   Study size and power

7. Validity
   Confounding
   Selection bias
   Information bias

8. Effect modification
   Concepts of interaction
   Additive and multiplicative models
   Joint effects

PART 3: ANALYSIS AND INTERPRETATION OF STUDIES

9. Data analysis
   Basic principles
   Basic analyses
   Controlling for confounding

10. Interpretation
   Appraisal of a single study
   Appraisal of all of the available evidence

CHAPTER 1. Introduction
(In: Pearce N. A Short Introduction to Epidemiology. Wellington: CPHR, 2003)

Public health is primarily concerned with the prevention of disease in human populations. It differs from clinical medicine both in its emphasis on prevention rather than treatment, and in its focus on populations rather than individual patients (table 1.1). Epidemiology is the branch of public health which attempts to discover the causes of disease in order to make disease prevention possible. Epidemiological methods can be used in other contexts (particularly in clinical research), but this short introductory text focuses on the use of epidemiology in public health, i.e. on its use as part of the wider process of discovering the causes of disease and preventing its occurrence in human populations.

In this context, epidemiology has been defined as (Last, 1988):

"the study of the distribution and determinants of health-related states or events in specified populations, and the application of this study to control of health problems"

This broad definition could in theory include a broad range of research methodologies including qualitative research and quantitative randomised controlled trials. Some epidemiologists recognise the complementary nature of the former (McKinlay, 1993), and some texts include the latter in their definition of epidemiology. However, the key feature of epidemiological studies is that they are quantitative (rather than qualitative) observational (rather than experimental) studies of the determinants of disease in human populations (rather than individuals). This will be my focus here, while recognising the value, and complementary nature, of other research methodologies. The observational approach is a major strength of epidemiology as it enables a study to be conducted in a situation where a randomized trial would be unethical or impractical (because of the large numbers of subjects required). It is also the main limitation of epidemiological studies in that the lack of randomization means that the groups being compared may differ with respect to various causes of disease (other than the main exposure under investigation). Thus, epidemiological studies, in general, experience the same potential problems as randomized controlled trials, but may suffer additional problems of bias because exposure has not been randomly allocated and there may be differences in baseline disease risk between the populations being compared.
controlled trials. Some epidemiologists

Table 1.1
The defining features of public health: populations and prevention

              Prevention               Treatment
----------------------------------------------------------------------
Populations   Public health            Health systems research
Individuals   Primary health care/     Medicine (including primary
              Health education         health care)

1.1 Germs and Miasmas

Epidemiology is as old as public health itself, and it is not difficult to find epidemiological observations made by physicians dating back to Hippocrates who observed that:

    Whoever wishes to investigate medicine properly should proceed thus: in the first place to consider the seasons of the year, and what effects each of them produces ... when one comes into a city in which he is a stranger, he should consider its situation, how it lies as to the winds and the rising of the sun ... One should consider most attentively the waters which the inhabitants use ... and the ground and the mode in which the inhabitants live, and what are their pursuits, whether they are fond of drinking and eating to excess, and given to indolence, or are fond of exercise and labor. (Hippocrates, 1938; quoted in Hennekens and Buring, 1987)

Many other examples of epidemiological reasoning were published through the ages. However, epidemiology was founded as an independent discipline in a number of Western countries in parallel with the industrial revolution of the 19th century. In Anglophone countries it is considered to have been founded by the work of Chadwick, Engels, Snow and others who exposed the appalling social conditions during the industrial revolution, and the work of Farr and others who revealed major socioeconomic differences in disease in the 19th century. At that time, epidemiology was generally regarded as a branch of public health and focused on the causes and prevention of disease in populations, in comparison with the clinical sciences which were branches of medicine and focussed on disease pathology and treatment of disease in individuals. Thus, the emphasis was on the prevention of disease and the health needs of the population as a whole. In this context, the fundamental importance of population-level factors (the urban environment, housing, socioeconomic factors, etc) was clearly acknowledged (Terris, 1987).

Table 1.2
Deaths and death rates from cholera in London 1854 in households supplied by the
Southwark and Vauxhall Water Company and by the Lambeth Water Company

                                      Cholera      Deaths per
                          Houses      deaths       10,000 houses
----------------------------------------------------------------
Southwark and Vauxhall    40,046      1,263        315
Lambeth Company           26,107      98           37
Rest of London            256,423     1,422        59
----------------------------------------------------------------
Source: (Snow, 1936; quoted in Winkelstein, 1995)
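
The rates for the two water companies in table 1.2 can be checked with a few lines of arithmetic. The following is only a minimal sketch: the house and death counts are copied from the table, while the function and variable names are illustrative assumptions and are not part of the original text.

```python
# Counts copied from table 1.2: (houses supplied, cholera deaths).
# All names below are hypothetical and used for illustration only.
HOUSES_AND_DEATHS = {
    "Southwark and Vauxhall": (40_046, 1_263),
    "Lambeth Company": (26_107, 98),
}


def deaths_per_10000_houses(houses: int, deaths: int) -> float:
    """Return the crude death rate expressed per 10,000 houses."""
    return deaths / houses * 10_000


# Death rate for each water company.
rates = {
    name: deaths_per_10000_houses(houses, deaths)
    for name, (houses, deaths) in HOUSES_AND_DEATHS.items()
}

# Ratio of the two rates: how many times higher the death rate was in
# houses supplied by Southwark and Vauxhall than in those supplied by Lambeth.
rate_ratio = rates["Southwark and Vauxhall"] / rates["Lambeth Company"]

for name, rate in rates.items():
    print(f"{name}: {rate:.1f} deaths per 10,000 houses")
print(f"Rate ratio: {rate_ratio:.1f}")
```

Dividing one rate by the other gives a simple rate ratio, a measure that is taken up in chapter 2. With these counts the ratio is roughly 8.4.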

Perhaps the most commonly quoted epidemiologic legend is that of Snow who studied the causes of cholera in London in the mid-19th century (Winkelstein, 1995). Snow was able to establish that the cholera death rate was much higher in areas supplied by the Southwark and Vauxhall Company which took water from the Thames downstream from London (i.e. after it had been contaminated with sewage) than in areas supplied by the Lambeth Company which took water from upstream, with the death rates being intermediate in areas served by both companies. Subsequently, Snow (1936) studied the area supplied by both companies, and within this area walked the streets to determine for each house in which a cholera death had occurred, which company supplied the water. The death rate was more than eight times as high in houses supplied with water containing sewage (table 1.2).

Although epidemiologists and other researchers continue to battle over Snow's legacy and its implications for epidemiology today (Cameron and Jones, 1983; Loomis and Wing, 1991; Samet, 2000; Vandenbroucke, 1994), it is clear that Snow was able to discover, and establish convincing proof for, the mode of transmission of cholera, and to take preventive action several decades before the biological basis of his observations was understood. Thus, it was not until several decades after the work of Snow that Pasteur and others established the role of the transmission of specific pathogens in what became known as the infectious diseases, and it was another century, in most instances, before effective vaccines or antibiotic treatments became available. Nevertheless, a dramatic decline in mortality from these diseases occurred from the mid-nineteenth century long before the development of modern pharmaceuticals. This has been attributed to improvements in nutrition, sanitation, and general living conditions (McKeown, 1979) although it has been argued that specific public health interventions on factors such as urban congestion actually played the major role (Szreter, 1988).

1.2 Risk Factor Epidemiology

This decline in the importance of communicable disease was accompanied by an increase in morbidity and mortality from non-communicable diseases such as heart disease, cancer, diabetes, and respiratory disease. This led to major developments in the theory and practice of epidemiology, particularly in the second half of the 20th century. There has been a particular emphasis on aspects of individual lifestyle (diet, exercise, etc) and in the last decade the human genome project has brought an accelerated interest in the role of genetic factors (Beaty and Khoury, 2000).

Thus, epidemiology became widely recognized with the establishment of tobacco smoking as a cause of lung cancer in the early 1950s (Doll and Hill, 1950; Wynder and Graham, 1950), although this association had already been established in Germany in the 1930s (Schairer and Schöniger, 2001). Subsequent decades have seen major discoveries relating to other causes of chronic disease such as asbestos, ionizing radiation, viruses, diet, outdoor air pollution, indoor air pollution, water pollution, and genetic factors. These epidemiologic successes have in some cases led to successful preventive interventions without the need for major social or political change. For example, occupational carcinogens can, with some difficulty, be controlled through regulatory measures, and exposures to known occupational carcinogens have been reduced in industrialized countries in recent decades. Another example is the successful World Health Organisation (WHO) campaign against smallpox. More recently, some countries have passed legislation to restrict advertising of tobacco and smoking in public places and have adopted health promotion programmes aimed at changes in "lifestyle".

Individual lifestyle factors would ideally be investigated using a randomised controlled trial, but this is often unethical or impractical (e.g. tobacco smoking). Thus, it is necessary to do observational studies and epidemiology has made major contributions to the understanding of the role of individual lifestyle factors and health. Because such factors would ideally be investigated in randomised controlled trials, and in fact would be ideally suited to such trials if it were not for the ethical and practical constraints, epidemiologic theory and practice has, quite appropriately, been based on the theory and practice of randomised trials. Thus, the aim of an epidemiologic study investigating the effect of a specific risk factor (e.g. smoking) on a particular disease (e.g. lung cancer) is to obtain the same findings that would have been obtained from a randomised controlled trial. Of course, an epidemiologic study will usually experience more problems of bias than a randomised controlled trial, but the randomised trial is the gold standard.

This approach has led to major developments in epidemiologic theory (presented most elegantly and comprehensively in Rothman and Greenland, 1998). In particular, there have been major developments in the theory of cohort studies (which mimic a randomised trial, but without the randomisation) and case-control studies (which attempt to obtain the same findings as a full cohort study, but in a more efficient manner). It is these basic methods, which follow a randomised controlled trial paradigm, which receive most of the attention in this short introductory text. However, while presenting these basic methods, it is important to also recognise their limitations, and to also consider different or more complex methods that may be more appropriate when epidemiology is used in the public health context.

1.3 Epidemiology in the 21st Century

In the last decade there has been increasing concern expressed about the limitations of the risk factor approach, and considerable debate about the future direction of epidemiology (Saracci, 1999). In particular, it has been argued that there has been an overemphasis on aspects of individual lifestyle, and little attention paid to the population-level determinants of health (Susser and Susser, 1996a, 1996b; Pearce, 1996; McMichael, 1999). Furthermore, the success of risk factor epidemiology has been more temporary and more limited than might have been expected (Pearce, 1996). For example, the limited success of legislative measures in industrialised countries has led the tobacco industry to shift its promotional activities to developing countries so that more people are exposed to tobacco smoke than ever before (Barry, 1991; Tominaga, 1986). Similar shifts have occurred for some occupational carcinogens (Pearce et al, 1994). Thus, on a global basis the "achievement" of the public health movement has often been to move public health problems from rich countries to poor countries, and from rich to poor populations within the industrialized countries.

It should be acknowledged that not all epidemiologists share these concerns (e.g. Savitz, 1994; Rothman et al, 1998; Poole and Rothman, 1998), and some have regarded these discussions as an attack on the field itself, rather than as an attempt to broaden its vision. Nevertheless, the debate has progressed and there is an increasing recognition of the importance of taking a more global approach to epidemiologic research and of the importance of maintaining an appropriate balance and interaction between macro-level (population), individual-level (e.g. lifestyle), and micro-level (e.g. genetic) research.

There are three crucial concepts which have received increasing attention in this regard.

The Importance of Context

The first, and most important, issue is the need to consider the population context when conducting epidemiologic studies. Even if one is focusing on individual lifestyle risk factors, there is good reason to conduct studies at the population level (Rose, 1992). Moreover, every population has its own history, culture, and economic and social divisions which influence how and why people are exposed to specific risk factors, and how they respond to such exposures. For example, New Zealand (Aotearoa) was colonised by Great Britain more than 150 years ago, resulting in major loss of life by the indigenous people (the Māori). It is commonly assumed that this loss of life occurred primarily due to the arrival of infectious diseases to which Māori had no natural immunity. However, a more careful analysis of the history of colonisation throughout the Pacific reveals that the indigenous people mainly suffered major mortality from imported infectious diseases when their land was taken (Kunitz, 1994), thus disrupting their economic base, food supply and social networks. This example is not merely of historical interest, since it is these same infectious diseases that have returned in strength in Eastern Europe in the last decade, after lying dormant for nearly a century (Bobak and Marmot, 1996). Similarly, the effects of occupational carcinogens may be greater in developing countries where workers may be relatively young or may be affected by malnutrition or other diseases (Pearce et al, 1994).

These issues are likely to become more important because, not only is epidemiology changing, but the world that epidemiologists study is also rapidly changing. We are seeing the effects of economic globalization, structural adjustment (Pearce et al, 1994) and climate change (McMichael, 1993, 1995), and the last few decades have seen the occurrence of the informational revolution which is having effects as great as the previous agricultural and industrial revolutions (Castells, 1996).

In industrialized countries, this is likely to prolong life expectancy for some, but not all, sections of the population. In developing countries, the benefits have been even more mixed (Pearce et al, 1994), while the countries of Eastern Europe are experiencing the largest sudden drop in life expectancy that has been observed in peacetime in recorded human history (Bobak and Marmot, 1996) with a major rise in alcoholism and forgotten diseases such as tuberculosis and cholera.

This increased interest in population-level determinants of health has been particularly marked by increased interest in techniques such as multilevel modelling which allow individual lifestyle risk factors to be considered in context and in parallel with macro-level determinants of health (Greenland, 2000). Such a shift in approach is important, not only because of the need to emphasize the role of diversity and local knowledge (Kunitz, 1994), but also because of the more general moves within science to consider macro-level systems and processes (Cohen and Stewart, 1994) rather than taking a solely reductionist approach (Pearce, 1996).

Problem-Based Epidemiology

A second issue is that a problem-based approach may be particularly valuable in encouraging epidemiologists to focus on the major public health problems and to take the population context into account (Pearce, 2001; Thacker and Buffington, 2001). A problem-based approach to teaching clinical medicine has been increasingly adopted in medical schools around the world. The value of this approach is that theories and methods are taught in the context of solving real-life problems. Starting with the problem at the population level provides a reality check on existing etiological theories and identifies the major public health problems which new theories must be able to explain. A fruitful research process can then be generated with positive interaction between epidemiologists and other researchers. Studying real public health problems in their historical and social context does not exclude learning about sophisticated methods of study design and data analysis (in fact, it necessitates it), but it may help to ensure that the appropriate questions are asked (Pearce, 1999).

Appropriate Technology

A related issue is the need to use appropriate technology to address the most important public health research questions. In particular, as attention moves upstream to the population level (McKinlay, 1993), new methods will need to be developed (McMichael, 1995). One example of this, noted above, is the recent rise in interest in multilevel modelling (Blakely and Woodward, 2000; Pearce, 2000), although it is important to stress that it is an increase in multilevel thinking in the development of epidemiologic hypotheses and the design of studies that is required, rather than just the use of new statistical techniques of data analysis. The appropriateness of any research methodology depends on the phenomenon under study: its magnitude, the setting, the current state of theory and knowledge, the availability of valid measurement tools, and the proposed uses of the information to be gathered, as well as the community resources and skills available and the prevailing norms and values at the national, regional or local level (Pearce and McKinlay, 1998). Thus, there has been increased interest in the interface between epidemiology and social science (Krieger, 2000), and in the development of theoretical and methodological frameworks appropriate for epidemiologic studies in developing countries (Barreto et al, 2001). As noted above, this short introductory text focuses on the most basic epidemiologic methods, but I attempt to refer to more complex issues, and the potential use of more complex methods, where this is appropriate.
introductory text focuses on the most

Summary

Public health is primarily concerned with the prevention of disease in human populations, and epidemiology is the branch of public health which attempts to discover the causes of disease in order to make disease prevention possible. It thus differs from clinical medicine both in its emphasis on prevention (rather than treatment) and in its focus on populations (rather than individual patients). Thus, the epidemiological approach to a particular disease is intended to identify high-risk subgroups within the population, to determine the causes of such excess risks, and to determine the effectiveness of subsequent preventive measures. Although the epidemiological approach has been used for more than a century for the study of communicable diseases, epidemiology has considerably grown in scope and sophistication in the last few decades as it has been increasingly applied to the study of non-communicable diseases. At the beginning of the 21st century, the field of epidemiology is changing rapidly, not only with regards to its basic methods, but also with regards to the hypotheses which these methods are used to investigate. In particular, in recent years there has been a revival in public health applications of epidemiology, not only at the national level, but also at the international level, as epidemiologists tackle global problems such as climate change. This text does not attempt to review the more complex methods used to study such issues. However, it does provide a coherent and systematic summary of the basic methods in the field, which can be used as a logical base for the teaching and development of research into these more complex issues.
decades as it has been increasingly

References

Barreto ML, Almeida-Filho N, Breilh J (2001). Epidemiology is more than discourse: critical thoughts from Latin America. J Epidemiol Comm Health 55: 158-9.

Barry M (1991). The influence of the U.S. tobacco industry on the health, economy, and environment of developing countries. New Engl J Med 324: 917-20.

Beaty TH, Khoury MJ (2000). Interface of genetics and epidemiology. Epidemiologic Reviews 22: 120-5.

Blakely T, Woodward AJ (2000). Ecological effects in multi-level studies. J Epidemiol Comm Health 54: 367-74.

Bobak M, Marmot M (1996). East-West mortality divide and its potential explanations: proposed research agenda. Br Med J 312: 421-5.

Cameron D, Jones IG (1983). John Snow, the Broad Street pump and modern epidemiology. Int J Epidemiol 12: 393-6.

Castells M (1996). The information age: Economy, society and culture. Vol 1. The rise of the network society. Oxford: Blackwell.

Cohen J, Stewart I (1994). The collapse of chaos: discovering simplicity in a complex world. London: Penguin.

Doll R, Hill AB (1950). Smoking and carcinoma of the lung. Br Med J 2: 739-48.

Greenland S (2000). Principles of multilevel modelling. Int J Epidemiol 29: 158-67.

Hennekens CH, Buring JE (1987). Epidemiology in medicine. Boston: Little, Brown.

Hippocrates (1938). On airs, waters and places. Med Classics 3: 19.

Krieger N (2000). Epidemiology and social sciences: towards a critical reengagement in the 21st century. Epidemiologic Reviews 22: 155-63.

Kunitz S (1994). Disease and social diversity. New York: Oxford University Press.

Last JM (ed) (1988). A dictionary of epidemiology. New York: Oxford University Press.

Loomis D, Wing S (1991). Is molecular epidemiology a germ theory for the end of the twentieth century? Int J Epidemiol 19: 1-3.

McKeown T (1979). The role of medicine. Princeton, NJ: Princeton University Press.

McKinlay JB (1993). The promotion of health through planned sociopolitical change: challenges for research and policy. Soc Sci Med 36: 109-17.

McMichael AJ (1993). Planetary overload: global environmental change and the health of the human species. Cambridge: Cambridge University Press.

McMichael AJ (1995). The health of persons, populations, and planets: epidemiology comes full circle. Epidemiol 6: 633-5.

McMichael AJ (1999). Prisoners of the proximate: loosening the constraints on epidemiology in an age of change. Am J Epidemiol 149: 887-97.

Pearce N (1996). Traditional epidemiology, modern epidemiology, and public health. AJPH 86: 678-83.

Pearce N (1999). Epidemiology as a population science. Int J Epidemiol 28: S1015-8.

Pearce N (2000). The ecologic fallacy strikes back. J Epidemiol Comm Health 54: 326-7.

Pearce N (2001). The future of epidemiology: a problem-based approach using evidence-based methods. Australasian Epidemiologist 8.1: 3-7.

Pearce N, McKinlay J (1998). Back to the future in epidemiology and public health. J Clin Epidemiol 51: 643-6.

Pearce NE, Matos E, Vainio H, Boffetta P, Kogevinas M (eds) (1994). Occupational cancer in developing countries. Lyon: IARC.

Poole C, Rothman KJ (1998). Our conscientious objection to the epidemiology wars. J Epidemiol Comm Health 52: 613-4.

Rose G (1992). The strategy of preventive medicine. Oxford: Oxford University Press.

Rothman KJ, Adami H-O, Trichopoulos D (1998). Should the mission of epidemiology include the eradication of poverty? Lancet 352: 810-3.

Rothman KJ, Greenland S (1998). Modern epidemiology. 2nd ed. Philadelphia: Lippincott-Raven.

Samet JM (2000). Epidemiology and policy: the pump handle meets the new millennium. Epidemiologic Reviews 22: 145-54.

Saracci R (1999). Epidemiology in progress: thoughts, tensions and targets. Int J Epidemiol 28: S997-9.

Savitz DA (1994). In defense of black box epidemiology. Epidemiology 5: 550-2.

Schairer E, Schöniger E (2001). Lung cancer and tobacco consumption. Int J Epidemiol 30: 24-7.

Snow J (1936). On the mode of communication of cholera. (Reprint). New York: The Commonwealth Fund, pp 11-39.

Susser M, Susser E (1996a). Choosing a future for epidemiology: I. Eras and paradigms. Am J Publ Health 86: 668-73.

Susser M, Susser E (1996b). Choosing a future for epidemiology: II. From black boxes to Chinese boxes. Am J Publ Health 86: 674-8.

Szreter S (1988). The importance of social intervention in Britain's mortality decline c.1850-1914: a reinterpretation of the role of public health. Soc Hist Med 1: 1-37.

Terris M (1987). Epidemiology and the public health movement. J Publ Health Policy 7: 315-29.

Thacker SB, Buffington J (2001). Applied epidemiology for the 21st century. Int J Epidemiol 30: 320-5.

Tominaga S (1986). Spread of smoking to the developing countries. In: Zaridze D, Peto R (eds). Tobacco: a major international health hazard. Lyon: IARC, pp 125-33.

Vandenbroucke JP (1994). New public health and old rhetoric. Br Med J 308: 994-5.

Winkelstein W (1995). A new perspective on John Snow's communicable disease theory. Am J Epidemiol 142: S3-9.

Wynder EL, Graham EA (1950). Tobacco smoking as a possible etiologic factor in bronchiogenic carcinoma. J Am Med Assoc 143: 329-38.

Part I

Study Design Options

CHAPTER 2. Incidence Studies
(In: Pearce N. A Short Introduction to Epidemiology. Wellington: CPHR, 2003)

In this chapter and the next one I review the possible study designs for the simple situation where individuals are exposed to a particular risk factor (e.g. a particular chemical) and when a dichotomous outcome is under study (e.g. being alive or dead, or having or not having a particular disease). Thus, the aim is to estimate the effect of a (dichotomous) exposure on the occurrence of a (dichotomous) disease outcome or health state.

It should first be emphasized that all epidemiologic studies are (or should be) based on a particular source population (also called the study population or base population) followed over a particular risk period. Within this framework a fundamental distinction is between studies of disease incidence (i.e. the number of new cases of disease over time) and studies of disease prevalence (i.e. the number of people with the disease at a particular point in time). Studies involving dichotomous outcomes can then be classified according to two questions:

a. Are we studying incidence or prevalence?
b. Is there sampling on the basis of outcome?

The responses to these two questions yield four basic types of epidemiologic studies (Morgenstern and Thomas, 1993; Pearce, 1998):

1. Incidence studies
2. Incidence case-control studies
3. Prevalence studies
4. Prevalence case-control studies

These four study types represent cells in a two-way cross-classification (table 2.1). Such studies may be conducted to describe the occurrence of disease (e.g. to estimate the burden of diabetes in the community by conducting a prevalence survey), or to estimate the effect of a particular exposure on disease (e.g. to estimate whether the incidence of new cases of diabetes is greater in people with a high fat diet than in people with a low fat diet) in order to find out how we can prevent the disease occurring. In the latter situation we are comparing the occurrence of disease in an exposed group with that in a non-exposed group, and we are estimating the effect of exposure on the occurrence of the disease, while controlling for other known causes of the disease.

Table 2.1
The four basic study types in studies involving a dichotomous health outcome

                              Sampling on outcome
                 ------------------------------------------------------------
                 No                    Yes
-----------------------------------------------------------------------------
Study outcome
  Incidence      Incidence studies     Incidence case-control studies
  Prevalence     Prevalence studies    Prevalence case-control studies
-----------------------------------------------------------------------------

Thus, we might conclude that lung cancer is five times more common in asbestos workers than in other workers, even after we have controlled for differences in age, gender, and smoking. In some instances we may have multiple categories of exposure (high, medium, low) or individual exposure scores, but we will start with the simple situation in which individuals are classified as exposed or non-exposed.

In this chapter I discuss incidence studies, and in the following chapter I consider prevalence studies. In chapter 4, I then consider studies involving more complex measurements of health status (e.g. continuous lung function or blood pressure measurements) and more complex study designs (ecologic and multilevel studies). As noted in chapter 1, the latter situation is perhaps the norm, rather than the exception, when conducting studies in the public health context. However, for logical and practical reasons I will first address the simpler situation of a dichotomous exposure (in individuals) and a dichotomous health outcome measure.

2.1 Incidence Studies

The most comprehensive approach involves collecting data on the experience of the entire source population over the risk period in order to estimate disease incidence (the development of a disease for the first time) or mortality (i.e. death, which is a particular type of incidence measure). Figure 2.1 shows the experience of a source population in which all persons are followed from a particular date. For simplicity, I will initially assume that the source population is confined to persons born in a particular year, i.e. a birth cohort. In the hypothetical study shown in figure 2.1, the outcome under study is the "event" of developing a particular disease. However, the concept of incidence applies equally to studies of other health events, such as hospitalisation or death. The key feature of incidence studies is that they involve an event (e.g. developing a disease for the first time) which occurs at a particular point in time, rather than a state (e.g. having a disease) which can exist over an extended period of time.

In the hypothetical study shown in figure 2.1, people enter the study when they are born, and some of them subsequently develop disease. Of these, some subsequently "lose" their disease (although they may "regain" it at a later date), and some have the condition all their lives; some persons die from the disease under study, but most eventually die from another cause. However, the information is "censored" since the study cannot last indefinitely; i.e. follow-up stops by a particular age, at which time some members of the study population have died, and some have been lost to follow-up for other reasons (e.g. emigration). For example, several people in figure 2.1 were censored before follow-up finished, either because they died of the disease we were studying (if we were studying the incidence of disease, rather than deaths, they would be censored as soon as they developed the disease), they died of something else, or because they were lost to follow-up. Each person only contributes person-time to the study until they are
censored, and after that we stop counting them. This approach is followed because we may not get a fair comparison between the exposed and the non-exposed groups if they have been followed for different lengths of time, e.g. if one group has many more people lost to follow-up than the other group.

However, the person-time approach would be necessary even if no-one was lost to follow-up and both groups were followed for the same length of time. For example, consider a cohort study of 1,000 exposed and 1,000 non-exposed people in which no-one was lost to follow-up and everyone was followed until they died. Assume also that the exposure causes some deaths so the exposed group, on the average, died at a younger age than the non-exposed group. If we only calculated the percentage of people who died, then it would be 100% in both groups, and we would see no difference. However, if we take into account the person-time contributed by each group, then we would see that both groups had the same number of deaths (1,000), but that in the exposed group these deaths occurred earlier and the person-time contributed was therefore lower. Thus, the average age at death would be lower in the exposed group or, to say the same thing another way, the death rate (deaths divided by person-years) would be higher. To see this, we need to consider not only how many people were in each group, but how much person-time they contributed, i.e. how long they were followed for.
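
The argument about person-time can be illustrated with a small numerical sketch. Only the group sizes of 1,000 come from the text; the ages at death (60 for every exposed person, 75 for every non-exposed person) are invented purely for illustration, and the helper name `summarise` is hypothetical.

```python
# Hypothetical illustration: both groups are followed from birth until death.
# The ages at death are assumptions made up for this sketch, not data from the text.
exposed_ages_at_death = [60.0] * 1000       # assumed: every exposed person dies at 60
non_exposed_ages_at_death = [75.0] * 1000   # assumed: every non-exposed person dies at 75


def summarise(ages_at_death):
    """Return (proportion dead, person-years, deaths per person-year)."""
    deaths = len(ages_at_death)              # everyone is followed until death
    person_years = sum(ages_at_death)        # follow-up runs from birth to death
    return deaths / len(ages_at_death), person_years, deaths / person_years


prop_exp, py_exp, rate_exp = summarise(exposed_ages_at_death)
prop_non, py_non, rate_non = summarise(non_exposed_ages_at_death)

print(f"Proportion dead: exposed {prop_exp:.0%}, non-exposed {prop_non:.0%}")
print(f"Person-years:    exposed {py_exp:,.0f}, non-exposed {py_non:,.0f}")
print(f"Death rate:      exposed {rate_exp:.4f}, non-exposed {rate_non:.4f} per person-year")
print(f"Rate ratio:      {rate_exp / rate_non:.2f}")
```

Under these assumed ages, the proportion dying is identical in the two groups, while the death rate is higher in the exposed group because the same number of deaths is spread over fewer person-years.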

Figure 2.1
Occurrence of disease in a hypothetical population followed from birth

[Figure: individual follow-up lines running from birth to end of follow-up. Legend: death from disease under study; other death; lost to follow up; at risk; disease symptoms; severe symptoms]

Example 2.1
Martinez et al (1995) studied 1246 newborns in the Tucson, Arizona area enrolled between May 1980 and October 1984. Parents were contacted shortly after the children were born, and completed a questionnaire about their history of respiratory illness, smoking habits, and education. Further parental questionnaires were completed during the child's second year of life and again at six years. At the age of six years, 51.5% of the children had never wheezed, 19.9% had had at least one lower respiratory tract illness with wheezing during the first three years of life but had no wheezing at six years, 15.0% had no wheezing before the age of three years but had wheezing at six years, and 13.7% had wheezing both before three years of age and at six years. The authors concluded that the majority of infants with wheezing have transient conditions and do not have increased risks of asthma or allergies later in life.

In some circumstances, a study might be conducted to study the "natural history" of a disease (e.g. diabetes). In such clinical epidemiology studies, the population (denominator) under study comprises people who already have a particular disease or condition, and the goal is to ascertain which factors affect the disease prognosis. More typically, one might be interested in a particular hypothesis about developing disease, such as "a high cholesterol diet increases the risk of developing ischaemic heart disease". In this situation, the population under study comprises healthy individuals and we are interested in factors that determine who develops the disease under study (and who doesn't). The data generated by such an incidence study involve comparing exposed and non-exposed groups and are similar to those generated by a randomised controlled trial, except that dietary exposure has not been randomly allocated.

Incidence studies ideally measure exposures, confounders and outcome times on all population members. When the source population has been formally defined and enumerated (e.g. a group of workers exposed to a particular chemical) then the study may be termed a cohort study or follow-up study (Rothman and Greenland, 1998) and the former terminology will be used here. Incidence studies also include studies where the source population has been defined but a cohort has not been formally enumerated by the investigator. Perhaps the most common examples are descriptive studies, e.g. of national death rates. In fact, as Rothman and Greenland (1998) note, no qualitative distinction distinguishes descriptive variables from the variables that are studied in analytic studies of risk factors. Thus, the distinction between descriptive incidence studies and analytic incidence studies is at best only a distinction based on data source (e.g. obtaining information from routine records rather than collecting the information specifically for the study).

Similarly, there is no fundamental distinction between incidence studies based on a broad population (e.g. all workers at a particular factory, or all
persons living in a particular geographical area) and incidence studies involving sampling on the basis of exposure, since the latter procedure merely redefines the source population (cohort) (Miettinen, 1985).

Measures of Disease Occurrence

I will briefly review the basic measures of disease occurrence that are used in incidence studies, using the notation depicted in table 2.2 which shows the findings of a hypothetical incidence study of 20,000 persons followed for 10 years (statistical analyses using these measures are discussed further in chapter 9).

Three measures of disease incidence are commonly used in incidence studies.

Perhaps the most common measure of disease occurrence is the person-time incidence rate (or hazard rate, force of mortality or incidence density (Miettinen, 1985)) which is a measure of the disease occurrence per unit population time, and has the reciprocal of time as its dimension. In this example (table 2.2), there were 952 cases of disease diagnosed in the non-exposed group during the ten years of follow-up, which involved a total of 95,163 person-years; this is less than the total possible person-time of 100,000 person-years since people who developed the disease before the end of the ten-year period were no longer at risk of developing it, and stopped contributing person-years at that time (for simplicity I have ignored the problem of people whose disease disappears and then reoccurs over time, and I have assumed that we are studying the incidence of the first occurrence of disease). Thus, the incidence rate in the non-exposed group (b/Y0) was 952/95,163 = 0.0100 per person-year (or 1000 per 100,000 person-years).

A second measure of disease occurrence is the incidence proportion or average risk which is the proportion of people who experience the outcome of interest at any time during the follow-up period (the incidence proportion is often called the cumulative incidence, but the latter term is also used to refer to cumulative hazards (Breslow and Day, 1987)). Since it is a proportion it is dimensionless, but it is necessary to specify the time period over which it is being measured. In this instance, there were 952 incident cases among the 10,000 people in the non-exposed group, and the incidence proportion (b/N0) was therefore 952/10,000 = 0.0952 over the ten year follow-up period. When the outcome of interest is rare over the follow-up period (e.g. an incidence proportion of less than 10%), then the incidence proportion is approximately equal to the incidence rate multiplied by the length of time that the population has been followed (in the example, this product is 0.1000 whereas the incidence proportion is 0.0952). I have assumed, for simplicity, that no-one died or was lost to follow-up during the study period (and therefore stopped contributing person-years to the study). However, as noted above, when this assumption is not valid (i.e. when a significant proportion of people have died or have been lost to follow-up), then the incidence proportion cannot be estimated directly, but must be estimated indirectly from the incidence rate (which takes into account that follow-up was not complete) or from life tables (which stratify on follow-up time).

A third possible measure of disease occurrence is the incidence odds (Greenland, 1987) which is the ratio of the number of people who experience the outcome (b) to the number of people who do not experience the outcome (d). As for the incidence proportion, the incidence odds is dimensionless, but it is necessary to specify the time period over which it is being measured. In this example, the incidence odds (b/d) is 952/9,048 = 0.1052. When the outcome is rare over the follow-up period then the incidence odds is approximately equal to the incidence proportion. Once again, if loss to follow-up is significant, then the incidence odds cannot be estimated directly, but must be estimated indirectly from the incidence rate (via the incidence proportion, or via life-table methods). The incidence odds is not very interesting or useful as a measure of disease occurrence, but it is presented here because the incidence odds is used to calculate the incidence odds ratio which is estimated in certain case-control studies (see below).

These three measures of disease occurrence all involve the same numerator: the number of incident cases of disease (b). They differ in whether their denominators represent person-years at risk (Y0), persons at risk (N0), or survivors (d).
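
As a minimal sketch, the three occurrence measures for the non-exposed group can be reproduced from the counts quoted in the text. The symbols b, N0, Y0 and d follow the chapter's notation; the remaining variable names are illustrative choices.

```python
# Non-exposed group of the hypothetical cohort, followed for 10 years.
b = 952          # incident cases
N0 = 10_000      # persons at risk at the start of follow-up
Y0 = 95_163      # person-years at risk
d = N0 - b       # survivors (people who never developed the disease)
years = 10

incidence_rate = b / Y0            # per person-year
incidence_proportion = b / N0      # average risk over the 10 years
incidence_odds = b / d             # cases per survivor over the 10 years

print(f"Incidence rate:       {incidence_rate:.4f} per person-year")
print(f"Incidence proportion: {incidence_proportion:.4f} over {years} years")
print(f"Incidence odds:       {incidence_odds:.4f} over {years} years")

# Rare-disease approximation: rate x follow-up time is close to the proportion.
print(f"Rate x time:          {incidence_rate * years:.4f}")
```

The same numerator is divided by three different denominators, which is why the three figures diverge as the outcome becomes more common.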

Table 2.2
Findings from a hypothetical cohort study of 20,000 persons followed for 10 years

                          Exposed         Non-exposed     Ratio
----------------------------------------------------------------
Cases                      1,813 (a)         952 (b)
Non-cases                  8,187 (c)       9,048 (d)
----------------------------------------------------------------
Initial population size   10,000 (N1)     10,000 (N0)
----------------------------------------------------------------
Person-years              90,635 (Y1)     95,163 (Y0)
----------------------------------------------------------------
Incidence rate            0.0200 (I1)     0.0100 (I0)     2.00
Incidence proportion      0.1813 (R1)     0.0952 (R0)     1.90
(average risk)
Incidence odds            0.2214 (O1)     0.1052 (O0)     2.11
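
The Ratio column of table 2.2 can be checked with a short sketch that divides each occurrence measure in the exposed group by the corresponding measure in the non-exposed group. One caveat: computed from the rounded whole-number counts, the odds ratio prints as 2.10 rather than the 2.11 shown in the table. The difference presumably arises because the table was derived from unrounded figures; that explanation is an assumption, not something stated in the text.

```python
# Counts and person-years from table 2.2 (hypothetical cohort).
a, b = 1_813, 952            # cases: exposed, non-exposed
c, d = 8_187, 9_048          # non-cases: exposed, non-exposed
N1, N0 = a + c, b + d        # initial population sizes (10,000 each)
Y1, Y0 = 90_635, 95_163      # person-years at risk

rate_ratio = (a / Y1) / (b / Y0)   # ratio of incidence rates
risk_ratio = (a / N1) / (b / N0)   # ratio of incidence proportions
odds_ratio = (a / c) / (b / d)     # ratio of incidence odds

print(f"Rate ratio: {rate_ratio:.2f}")
print(f"Risk ratio: {risk_ratio:.2f}")
print(f"Odds ratio: {odds_ratio:.2f}")
```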

Measures of Effect in Incidence Studies

Corresponding to these three measures of disease occurrence, there are three principal ratio measures of effect which can be used in incidence studies. The measure of interest is often the rate ratio (incidence density ratio), the ratio of the incidence rate in the exposed group (a/Y1) to that in the non-exposed group (b/Y0). In the example in table 2.2, the incidence rates are 0.02 per person-year in the exposed group and 0.01 per person-year in the non-exposed group, and the rate ratio is therefore 2.00.

A second commonly used effect measure is the risk ratio (incidence proportion ratio or cumulative incidence ratio) which is the ratio of the incidence proportion in the exposed group (a/N1) to that in the non-exposed group (b/N0). In this example, the risk ratio is 0.1813/0.0952 = 1.90. When the outcome is rare over the follow-up period the risk ratio is approximately equal to the rate ratio.

A third possible effect measure is the incidence odds ratio which is the ratio of the incidence odds in the exposed group (a/c) to that in the non-exposed group (b/d). In this example the odds ratio is 0.2214/0.1052 = 2.11. When the outcome is rare over the study period the incidence odds ratio is approximately equal to the incidence rate ratio.

These three multiplicative effect measures are sometimes referred to under the generic term of relative risk. Each involves the ratio of a measure of disease occurrence in the exposed group to that in the non-exposed group. The various measures of disease occurrence all involve the same numerators (incident cases), but differ in whether their denominators are based on person-years, persons, or survivors (people who do not develop the disease at any time during the follow-up period). They are all approximately equal when the disease is rare during the follow-up period (e.g. an incidence proportion of less than 10%). However, the odds ratio has been severely criticised as an effect measure (Greenland, 1987; Miettinen and Cook, 1981), and has little intrinsic meaning in incidence studies, but it is presented here because it is the standard effect measure in incidence case-control studies (see below).

Finally, it should be noted that an analogous approach can be used to calculate measures of effect based on differences rather than ratios, in particular the rate difference and the risk difference. Ratio measures are usually of greater interest in etiologic research, because they have more convenient statistical properties, and it is easier to assess the strength of effect and the possible role of various sources of bias when using ratio measures (Cornfield et al, 1951). Thus, I will concentrate on the use of ratio measures in the remainder of this text. However, other measures (e.g. risk difference, attributable fraction) may be of value in certain circumstances, such as evaluating the public health impact of a particular exposure, and I encourage readers to consult standard texts for a comprehensive review of these measures (e.g. Rothman and Greenland, 1998).

Conducting An Incidence Study

Analyses of routine records

Perhaps the simplest type of incidence study involves descriptive analyses using routine mortality or incidence records for a defined geographic population. For example, most countries
have comprehensive death registration schemes, as well as regular national censuses, a population register, or other methods of estimating population numbers. These can then be used, as the numerator and denominator respectively, to calculate overall national death rates, as well as the death rates by age-group and gender. In some countries, information may also be available to calculate death rates by other demographic variables such as ethnicity, socio-economic status, employment status, occupation or geographical area. However, the validity of such analyses may be questionable, because in most countries death certificates (or other routine records such as cancer registration records) are not linked directly to the corresponding population records. Thus, problems may occur if factors such as ethnicity are coded differently on the death records and on the population records. Nevertheless, such descriptive analyses have played a major role in identifying public health problems and suggesting priorities for public health research.

Community-based cohort studies

The limitations of analyses based on routine records usually mean that a specific cohort must be constructed for many epidemiologic studies. For studies investigating environmental factors, or general lifestyle (diet, exercise, etc) a cohort study may be based on a particular community which is followed (usually prospectively) over time. For example, a cohort may be based on all persons aged 20 years or more living in a particular city or county in a particular year. This would usually require a special survey to be conducted at the start of the follow-up period, with further surveys being conducted at regular intervals. The study information might then be linked to death registration and hospital admission records to identify deaths or major health events. Study participants would contribute person-time (to the denominators used in calculating mortality or incidence rates) from the start of the study until they died, emigrated (and therefore were no longer traceable) or the follow-up period finished. The incidence rates (or mortality rates) for a particular condition might then be compared between those exposed and those not exposed (e.g. to a high cholesterol diet).

More specific cohorts

Cohorts may also be constructed on the basis of more specific exposures. Perhaps the most common example of this approach involves studies that are based on workers in a particular factory or industry (Checkoway et al, 1989). Such studies may be based on historical records, enabling follow-up to be conducted retrospectively. Typically, such a historical cohort study might involve all workers who worked for at least one month in the factory at any time during 1970-1999. The list of such workers can be enumerated using personnel records which also provide information on their job titles and departments (which can be used to estimate their historical exposures). All study participants are followed over time by linking the study information with national death records, or incidence records (e.g. a national cancer registry) as well as with other record systems (e.g. social security records) to confirm vital status in those who are not found to have died during the follow-up period. Workers enter the study on the date that the study starts (1/1/70), or the date that
they first meet the eligibility criteria (i.e. employment for one month), whichever is the latest date. They stop contributing person-time when they die, emigrate, or the study finishes (31/12/99), whichever is the earliest. Such studies may involve an internal comparison (i.e. between those exposed and those not exposed to a particular factor in the workplace), or may involve an external comparison with national mortality (or incidence) rates.
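
The entry and exit rules for such a historical cohort can be sketched as follows. The study dates (1/1/70 and 31/12/99) are taken from the text; the function `person_years`, its parameters, the use of 365.25 days per year, and the three example workers are all hypothetical choices made for illustration.

```python
from datetime import date

STUDY_START = date(1970, 1, 1)    # study period given in the text
STUDY_END = date(1999, 12, 31)


def person_years(eligible_on, death=None, emigration=None):
    """Person-years contributed by one worker (hypothetical helper).

    eligible_on: date the worker first met the eligibility criteria.
    death, emigration: dates of those events, or None if they did not occur.
    """
    entry = max(STUDY_START, eligible_on)                 # whichever is latest
    exits = [STUDY_END] + [e for e in (death, emigration) if e is not None]
    exit_date = min(exits)                                # whichever is earliest
    if exit_date <= entry:
        return 0.0                                        # never under observation
    return (exit_date - entry).days / 365.25              # assumed year length


# Invented workers, for illustration only.
whole_study = person_years(date(1965, 3, 1))
died_during = person_years(date(1980, 6, 1), death=date(1990, 6, 1))
emigrated = person_years(date(1995, 1, 1), emigration=date(1996, 1, 1))
print(round(whole_study, 2), round(died_during, 2), round(emigrated, 2))
```

Summing these contributions within the exposed and non-exposed groups gives the person-time denominators used in the rates.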

Example 2.2

The Renfrew/Paisley study was based on two adjacent urban burghs considered to be typical of the West of Scotland. During 1972-1976, men and women aged between 45 and 64 and identified by door-to-door census as living in Renfrew and Paisley were invited to take part. The response rate was 80% (7052 men and 8354 women). Participants completed a questionnaire which included self-reported smoking history, occupation, address, age, gender, and respiratory symptoms. Study participants were flagged at the National Health Service Central Register in Edinburgh and followed for 20 years. Hart et al (2001) reported that high lung cancer mortality risks were seen for manual compared with non-manual workers. The risk reduced when adjusted for smoking, and reduced further when adjusted for lung function, phlegm and (area) deprivation category. They concluded that the social class difference in lung cancer mortality was explained by poor lung health, deprivation and poor socio-economic conditions throughout life, in addition to smoking.

Example 2.3

Rafnsson et al (2001) studied cancer incidence in a cohort of 1690 flight attendants working with two airline companies in Iceland. The total number of person-years of follow-up was 27,148. Among the 1532 women flight attendants, there were 64 cases of cancer, whereas 51.6 were expected on the basis of national cancer incidence rates (RR=1.2). There was a particularly elevated risk for breast cancer in those who had been hired in 1971 or later and therefore had had the heaviest exposure to cosmic radiation at a young age (RR=4.1). The authors concluded that the association may be due to cosmic radiation or disturbance of circadian rhythm.
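
As a rough sketch of an external comparison, the RR of 1.2 in example 2.3 is consistent with dividing the observed number of cancers by the number expected from national rates. The second half of the block shows how an expected number is typically built up from stratum-specific person-years and reference rates; every number in that half is invented, since the example does not report the strata or rates that were used.

```python
# Observed and expected numbers reported in example 2.3.
observed = 64
expected = 51.6
ratio = observed / expected
print(f"Observed/expected: {ratio:.2f}")

# How an expected number is typically built up (all numbers below are invented):
# cohort person-years in each age group times the national rate for that group.
person_years_by_age = {"20-39": 12_000, "40-59": 6_000}      # hypothetical
national_rate_by_age = {"20-39": 0.001, "40-59": 0.004}      # hypothetical, per person-year
expected_hypothetical = sum(
    person_years_by_age[age] * national_rate_by_age[age]
    for age in person_years_by_age
)
print(f"Hypothetical expected cases: {expected_hypothetical:.1f}")
```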

2.2 Incidence Case-Control Studies

Incidence studies are the most comprehensive approach to studying the causes of disease, since they use all of the information about the source population over the risk period. However, they are very expensive in terms of time and resources. For example, the hypothetical study presented in table 2.2 would involve enrolling 20,000 people and collecting exposure information (on both past and present exposure) for all of them. The same findings can be obtained more efficiently by using a case-control design.

An incidence case-control study involves studying all (or a sample) of the incident cases of the disease that occurred in the source population over the risk period, and a control group sampled from the same population over the same period (the possible methods of sampling controls are described below).

Table 2.3 shows the data from a hypothetical case-control study, which involved studying all of the 2,765 incident cases which would have been identified in the full incidence study, and a sample of 2,765 controls (one for each case). Such a case-control study would achieve the same findings as the full incidence study, but would be much more efficient, since it would involve ascertaining the exposure histories of 5,530 people (2,765 cases and 2,765 controls) rather than 20,000. When the outcome under study is very rare, an even more remarkable gain in efficiency can be achieved with very little reduction in the precision of the effect estimate.

Table 2.3
Findings from a hypothetical incidence case-control study based on the cohort in table 2.2

                               Exposed        Non-exposed     Odds Ratio
--------------------------------------------------------------------------
Cases                          1,813 (a)        952 (b)
Controls:
  from survivors
  (cumulative sampling)        1,313 (c)      1,452 (d)       2.11
  from source population
  (case-cohort sampling)       1,383 (c)      1,383 (d)       1.90
  from person-years
  (density sampling)           1,349 (c)      1,416 (d)       2.00
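
The odds ratios in table 2.3 can be reproduced with a short sketch: the cases are the same under every scheme, and only the exposure odds among the controls change with the sampling method. The counts are those of table 2.3; the dictionary layout and variable names are illustrative choices.

```python
# Cases are the same under every sampling scheme (table 2.3).
cases_exposed, cases_non_exposed = 1_813, 952

# Control counts (exposed, non-exposed) from table 2.3.
controls = {
    "cumulative sampling (survivors)": (1_313, 1_452),
    "case-cohort sampling (source population)": (1_383, 1_383),
    "density sampling (person-years)": (1_349, 1_416),
}

odds_ratios = {}
for scheme, (ctrl_exposed, ctrl_non_exposed) in controls.items():
    case_exposure_odds = cases_exposed / cases_non_exposed
    control_exposure_odds = ctrl_exposed / ctrl_non_exposed
    odds_ratios[scheme] = case_exposure_odds / control_exposure_odds
    print(f"{scheme}: OR = {odds_ratios[scheme]:.2f}")
```

The three results line up with the incidence odds ratio, risk ratio and rate ratio of the source cohort in table 2.2.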

Measures of Effect in Incidence Case-Control Studies

In case-control studies, the relative risk is estimated using the odds ratio.

Suppose that a case-control study is conducted in the study population shown in table 2.2; such a study might involve all of the 2,765 incident cases and a group of 2,765 controls (table 2.3). The effect measure which the odds ratio obtained from this case-control study will estimate depends on the manner in which controls are selected. Once again, there are three main options (Miettinen, 1985; Pearce, 1993; Rothman and Greenland, 1998).

One option, called cumulative (or cumulative incidence) sampling, is to select controls from those who do not experience the outcome during the follow-up period, i.e. the survivors (those who did not develop the disease at any time during the follow-up period). In this instance, the ratio of exposed to non-exposed controls will estimate the exposure odds (c/d = 8187/9048 = 1313/1452) of the survivors, and the odds ratio obtained in the case-control study will therefore estimate the incidence odds ratio in the source population over the study period (2.11). Early presentations of the case-control approach usually assumed this context (Cornfield, 1951), and it was emphasised that the odds ratio was approximately equal to the risk ratio when the disease was rare.

It was later recognised that controls can be sampled from the entire source population (those at risk at the beginning of follow-up), rather than just from the survivors (those at risk at the end of follow-up). This approach, which was previously used by Thomas (1972) and Kupper et al (1975), has more recently been termed case-cohort sampling (Prentice, 1986), or case-base sampling (Miettinen, 1982). In this instance, the ratio of exposed to non-exposed controls will estimate the exposure odds in the source population of persons at risk at the start of follow-up (N1/N0 = 10000/10000 = 1383/1383), and the odds ratio obtained in the case-control study will therefore estimate the risk ratio in the source population over the study period (1.90). In this instance the method of calculation of the odds ratio is the same as for any other case-control study, but minor changes are needed in the standard methods for calculating confidence intervals and p-values to take into account that some cases may also be selected as controls (Greenland, 1986).

The third approach is to select controls longitudinally throughout the course of the study (Sheehe, 1962; Miettinen, 1976); this is sometimes described as risk-set sampling (Robins et al, 1986), sampling from the study base (the person-time experience) (Miettinen, 1985), or density sampling (Kleinbaum et al, 1982). In this instance, the ratio of exposed to non-exposed controls will estimate the exposure odds in the person-time (Y1/Y0 = 90635/95163 = 1349/1416), and the odds ratio obtained in the case-control study will therefore estimate the rate ratio in the study population over the study period (2.00).

Case-control studies have traditionally been presented in terms of cumulative sampling (e.g. Cornfield, 1951), but most case-control studies actually involve density sampling (Miettinen, 1976), often with matching on a time variable such as calendar time or age, and therefore estimate the rate ratio without the need for any rare disease assumption (Sheehe, 1962; Miettinen, 1976; Greenland and Thomas, 1982).

Conducting an Incidence Case-Control Study

An incidence case-control study should be based on a specified source population and risk period. The task in such a population-based case-control study is then to identify all cases of the outcome under study that are
generated by the source population over the risk period. Controls are then sampled at random from the source population, ideally by density matching.

In some instances, cases may be identified from a particular disease register which is not comprehensive with respect to the population of any defined geographical area. This may be a formal register (such as a Cancer Register) or a similar data source (e.g. admission records for a particular hospital). In such a registry-based study the task is to identify the source population for the register (e.g. all persons who would have been admitted to the hospital if they had developed the disease under study). This obviously poses more problems in the selection of controls than is the case for a population-based study. One possibility is to select controls from people appearing in the same register for other health conditions (e.g. admissions to the hospital for other causes), but this may result in bias if the other health conditions are also caused (or prevented) by the exposure under study. For this reason, the population-based approach is preferable, although registry-based studies may still be valuable when population-based studies are not practicable, provided that careful consideration is given to possible sources of bias.

Once the cases and controls have been selected, information on previous exposures is then obtained for both groups. In some instances this may be from historical records, e.g. personnel records that contain work history information. Perhaps more commonly, exposure information may be obtained from questionnaires. It is this latter feature of case-control studies which has left them open to criticism as being particularly prone to bias, e.g. because the recall of past exposures (e.g. eating meat, drinking alcohol, spraying pesticides) may be different between cases of disease and healthy controls. However, collecting exposure information from questionnaires is not an inherent feature of case-control studies, and is sometimes a feature of cohort studies. Thus, there is nothing inherently biased in the case-control design; rather what is important is the validity of the exposure information that is collected.
the population-based approach is

Example 2.4

Gustavsson et al (2001) studied the risk of myocardial infarction from occupational exposure to motor exhaust, other combustion products, organic solvents, lead, and dynamite. They identified first-time, nonfatal myocardial infarctions among men and women aged 45-70 years in Stockholm County from 1992-1994. They selected controls from the general population living in the same County during the same period (i.e. density matching), matched for sex, age, year, and hospital catchment area. The odds ratio (estimating the rate ratio) of myocardial infarction was 2.11 (95% CI 1.23-3.60) among those highly exposed occupationally, and 1.42 (95% CI 1.05-1.92) in those moderately exposed, compared with persons not occupationally exposed to combustion products from organic material.

Summary

When a dichotomous outcome is under study (e.g. being alive or dead, or having or not having a disease) a fundamental distinction is between studies of incidence and studies of prevalence. Thus, four main types of studies can be identified: incidence studies, incidence case-control studies, prevalence studies, and prevalence case-control studies (Morgenstern and Thomas, 1993; Pearce, 1998). These various study types differ according to whether they involve incidence or prevalence data and whether or not they involve sampling on the basis of the outcome under study. Incidence studies involve collecting and analysing data on the exposure and disease experience of the entire source population. They may resemble randomized trials, but they may involve additional problems of confounding because exposure has not been randomly assigned. The other potential study designs all involve sampling from the source population, and may therefore include additional biases arising from the sampling process (chapter 7). In particular, incidence case-control studies involve sampling on the basis of outcome, i.e. they usually involve all incident cases generated by the source population and a control group (of non-cases) sampled at random from the source population.

References

Breslow NE, Day NE (1987). Statistical methods in cancer research. Vol II: The analysis of cohort studies. Lyon, France: IARC.

Checkoway H, Pearce NE, Crawford-Brown DJ (1989). Research methods in occupational epidemiology. New York: Oxford University Press.

Cornfield J (1951). A method of estimating comparative rates from clinical data: applications to cancer of the lung, breast and cervix. JNCI 11: 1269-75.

Greenland S (1986). Adjustment of risk ratios in case-base studies (hybrid epidemiologic designs). Stat Med 5: 579-84.

Greenland S (1987). Interpretation and choice of effect measures in epidemiologic analyses. Am J Epidemiol 125: 761-8.

Greenland S, Thomas DC (1982). On the need for the rare disease assumption in case-control studies. Am J Epidemiol 116: 547-53.

Gustavsson P, Plato N, Hallqvist J, et al (2001). A population-based case-referent study of myocardial infarction and occupational exposure to motor exhaust, other combustion products, organic solvents, lead and dynamite. Epidemiol 12: 222-8.

Hart CL, Hole DJ, Gillis CR, et al (2001). Social class differences in lung cancer mortality: risk factor explanations using two Scottish cohort studies. Int J Epidemiol 30: 268-74.

Kleinbaum DG, Kupper LL, Morgenstern H (1982). Epidemiologic research. Principles and quantitative methods. Belmont, CA: Lifetime Learning Publications.

Kupper LL, McMichael AJ, Spirtas R (1975). A hybrid epidemiologic design useful in estimating relative risk. J Am Stat Assoc 70: 524-8.

Martinez FD, Wright AJ, Taussig LM, et al (1995). Asthma and wheezing in the first six years of life. New Engl J Med 332: 133-8.

Miettinen OS (1976). Estimability and estimation in case-referent studies. Am J Epidemiol 103: 226-35.

Miettinen OS, Cook EF (1981). Confounding: essence and detection. Am J Epidemiol 114: 593-603.

Miettinen O (1982). Design options in epidemiologic research: an update. Scand J Work Environ Health 8(suppl 1): 7-14.

Miettinen OS (1985). Theoretical epidemiology. New York: Wiley.

Morgenstern H, Thomas D (1993). Principles of study design in environmental epidemiology. Environ Health Perspectives 101: S23-S38.

Pearce N (1993). What does the odds ratio estimate in a case-control study? Int J Epidemiol 22: 1189-92.

Pearce N (1998). The four basic epidemiologic study types. J Epidemiol Biostat 3: 171-7.

Prentice RL (1986). A case-cohort design for epidemiologic cohort studies and disease prevention trials. Biometrika 73: 1-11.

Rafnsson V, Tulinius H, Jónasson JG, Hrafnkelsson J (2001). Risk of breast cancer in female flight attendants: a population-based study (Iceland). Cancer Causes and Control 12: 95-101.

Robins JM, Breslow NE, Greenland S (1986). Estimation of the Mantel-Haenszel variance consistent with both sparse-data and large-strata limiting models. Biometrics 42: 311-23.

Rothman KJ, Greenland S (1998). Modern epidemiology. 2nd ed. Philadelphia: Lippincott-Raven.

Sheehe PR (1962). Dynamic risk analysis of matched pair studies of disease. Biometrics 18: 323-41.

Thomas DB (1972). The relationship of oral contraceptives to cervical carcinogenesis. Obstet Gynecol 40: 508-18.

CHAPTER 3. Prevalence Studies
(In: Pearce N. A Short Introduction to Epidemiology. Wellington: CPHR, 2003)

Incidence studies are ideal for studying events such as mortality or cancer incidence, since they involve collecting and analysing all of the relevant information on the source population and we can get better information on when exposure and disease occurred. However, incidence studies involve lengthy periods of follow-up and large resources, in terms of both time and funding, and it may be difficult to identify incident cases of non-fatal chronic conditions such as diabetes. Thus, in some settings (e.g. some developing countries) and/or for some conditions (e.g. chronic non-fatal disease) prevalence studies are the only option. Furthermore, in some instances we may be more interested in factors which affect the current burden of disease in the population. Consequently, although incidence studies are usually preferable, there is also an important role for prevalence studies, both for practical reasons, and because such studies enable the assessment of the level of morbidity and the population disease burden for a non-fatal condition.

3.1. Prevalence Studies

The term prevalence denotes the number of cases of the disease under study existing in the source population at a particular time. This can be defined as point prevalence estimated at one point in time, or period prevalence which denotes the number of cases that existed during some time interval (e.g. one year).

The prevalence is a proportion, and the statistical methods for calculating a confidence interval for the prevalence are identical to those presented above for calculating a confidence interval for the incidence proportion (chapter 9).

In some instances, the aim of a prevalence study may simply be to compare the disease prevalence among a specific population with that in other communities or countries. This may be done, for example, in order to discover differences in disease prevalence and to thus suggest possible risk factors for the disease. These further studies may involve testing specific hypotheses by comparing prevalence in subgroups of people who have or have not been exposed to a particular risk factor (e.g. passive smoking) in the past.

Prevalence studies often represent a considerable saving in resources compared with incidence studies, since it is only necessary to evaluate disease prevalence at one point in time, rather than continually searching for incident cases over an extended period of time. On the other hand, this gain in efficiency is achieved at the cost of greater risk of biased inferences, since it may be much more difficult to understand the temporal relationship between various exposures and the occurrence of disease. For example, an exposure that increases the risk of death in people with pre-existing chronic heart disease will be negatively associated with the prevalence of heart disease (in people who are alive!), and will therefore appear to be protective against heart disease in a prevalence study.
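Because the prevalence is a proportion, a confidence interval for it can be sketched in a few lines of Python. The example below is only an illustration: it assumes a simple large-sample (Wald) interval, which may not be the method presented in chapter 9, and the function name and the counts (476 cases among 10,000 people) are hypothetical.

```python
import math

def wald_interval(cases, n, z=1.96):
    """Prevalence and an approximate 95% interval (Wald method, an assumption)."""
    p = cases / n                       # prevalence as a proportion
    se = math.sqrt(p * (1 - p) / n)     # large-sample standard error
    return p, p - z * se, p + z * se

# Hypothetical survey: 476 prevalent cases among 10,000 people surveyed.
p, lower, upper = wald_interval(476, 10_000)
print(f"prevalence {p:.4f}, 95% CI {lower:.4f} to {upper:.4f}")
```

The Wald interval behaves poorly when the prevalence is close to 0 or 1 or when the sample is small, so other intervals may be preferable in those settings.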

Example 3.1

The International Study of Asthma and Allergies in Childhood (ISAAC) (Asher et al, 1995; Pearce et al, 1993) involved a simple Phase I global asthma symptom prevalence survey and a more in-depth Phase II survey. The emphasis was on obtaining the maximum possible participation across the world in order to obtain a global overview of childhood asthma prevalence, and the Phase I questionnaire modules were designed to be simple and to require minimal resources to administer. In addition, a video questionnaire involving the audio-visual presentation of clinical signs and symptoms of asthma was developed in order to minimise translation problems. The population of interest was schoolchildren aged 6-7 years and 13-14 years within specified geographical areas. The older age-group was chosen to reflect the period when morbidity from asthma is common and to enable the use of self-completed questionnaires. The younger age-group was chosen to give a reflection of the early childhood years, and involves parent-completion of questionnaires. The Phase I findings, involving more than 700,000 children, showed striking international differences in asthma symptom prevalence (ISAAC Steering Committee, 1998a, 1998b). Figure 3.1 shows the findings for current wheeze (i.e. wheeze in the previous 12 months). There are a number of interesting features of the figure: (i) there is a particularly high prevalence of reported asthma symptoms in English-speaking countries; (ii) centres in Latin America also had particularly high symptom prevalence; (iii) there is also high asthma prevalence in Western Europe, with lower prevalences in Eastern and Southern Europe - for example, there is a clear Northwest-Southeast gradient within Europe, with the highest prevalence in the world being in the United Kingdom, and some of the lowest prevalences in Albania and Greece; (iv) Africa and Asia generally showed relatively low asthma prevalence. These striking findings call into question many of the established theories of asthma causation, and have played a major role in the development of new theories of asthma causation in recent years (Douwes and Pearce, 2003).

Figure 3.1

Twelve month period prevalence of asthma symptoms in 13-14 year old children in
Phase I of the International Study of Asthma and Allergies in Childhood (ISAAC)

Source: ISAAC Steering Committee (1998b)

[World map not reproduced. Legend categories: ≥20%; 10 to <20%; 5 to <10%; <5%]

Measures of Effect in Prevalence Studies

Figure 3.2 shows the relationship between incidence and prevalence of disease in a steady state population. Assume that the population is in a steady state (stationary) over time, in that the numbers within each subpopulation defined by exposure, disease and covariates do not change with time (this usually requires that incidence rates and exposure and disease status are unrelated to the immigration and emigration rates and population size), and that average disease duration (D) does not change over time. Then, if we denote the prevalence of disease in the study population by P, the prevalence odds is equal to the incidence rate (I) times the average disease duration (Alho, 1992):

$$\frac{P}{1-P} = I \times D$$
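As a rough numerical illustration of this relationship, the following Python sketch converts an incidence rate and an average disease duration into a prevalence. The function name and the input values are hypothetical; only the identity P/(1-P) = I x D comes from the text, and it holds only under the steady-state assumptions described above.

```python
def steady_state_prevalence(incidence_rate, mean_duration):
    """Prevalence implied by P/(1-P) = I x D in a steady-state population."""
    odds = incidence_rate * mean_duration   # prevalence odds = I x D
    return odds / (1.0 + odds)              # convert the odds back to a proportion

# Hypothetical inputs: 1 new case per 100 person-years, lasting 5 years on average.
p = steady_state_prevalence(0.01, 5.0)
print(round(p, 4))  # 0.0476
```

Note that the prevalence (0.0476) is slightly lower than the prevalence odds (0.05); the two are close only when the disease is rare.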

Figure 3.2

Relationship between prevalence and incidence in a steady state population

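[Diagram not reproduced: two compartments, non-asthmatic persons [N(1-P)] and asthma cases [NP]. New cases move from the first compartment to the second at a rate of N(1-P) x I, and cases leave the prevalence pool at a rate of NP/D, so that P/(1-P) = I x D. P=prevalence, I=incidence, D=duration, N=population.]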

Now suppose that we compare two populations (indexed by 1=exposed and 0=non-exposed) and that both satisfy the above conditions. Then, the prevalence odds is directly proportional to the disease incidence, and the prevalence odds ratio (POR) satisfies the equation:

$$\mathrm{POR} = \frac{P_1/(1-P_1)}{P_0/(1-P_0)} = \frac{I_1 D_1}{I_0 D_0}$$

An increased prevalence odds ratio may thus reflect the influence of factors that increase the duration of disease, as well as those that increase disease incidence. However, in the special case where the average duration of disease is the same in the exposed and non-exposed groups (i.e. D1 = D0), then the prevalence odds ratio satisfies the equation:

$$\mathrm{POR} = \frac{I_1}{I_0}$$

i.e. under the above assumptions, the prevalence odds ratio directly estimates the incidence rate ratio. However, it should be emphasised that prevalence depends on both incidence and average disease duration, and a difference in prevalence between two groups could entirely depend on differences in disease duration (e.g. because of factors which prolong or exacerbate symptoms) rather than differences in incidence. Changes in incidence rates, disease duration and population sizes over time can also bias the POR away from the rate ratio, as can migration into and out of the population at risk or the prevalence pool.
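The dependence of the prevalence odds ratio on disease duration can be made concrete with a short Python sketch. The function name and all numerical values below are hypothetical and chosen only for illustration; the formula is the steady-state relationship POR = I1D1/I0D0 given above.

```python
def prevalence_odds_ratio(i1, d1, i0, d0):
    """POR = (I1 x D1) / (I0 x D0) under the steady-state assumptions."""
    return (i1 * d1) / (i0 * d0)

# Hypothetical incidence rates (per person-year); the rate ratio is 2.
i1, i0 = 0.02, 0.01

# Equal average duration in both groups: the POR equals the rate ratio.
por_equal = prevalence_odds_ratio(i1, 5.0, i0, 5.0)

# Exposure also prolongs disease (7.5 vs 5 years): the POR overstates the rate ratio.
por_longer = prevalence_odds_ratio(i1, 7.5, i0, 5.0)

print(round(por_equal, 2), round(por_longer, 2))  # 2.0 3.0
```

In the second scenario the exposure doubles incidence but the prevalence odds ratio is 3, because the longer duration in the exposed group inflates the prevalence pool.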

Table 3.1
Findings from a hypothetical prevalence study of 20,000 persons
Exposed Non-exposed Ratio
--------------------------------------------------------------------------------------
Cases 909 (a) 476 (b)
Non-cases 9,091 (c) 9,524 (d)
--------------------------------------------------------------------------------------
Total population 10,000 (N1) 10,000 (N0)
--------------------------------------------------------------------------------------
Prevalence 0.0909 (P1) 0.0476 (P0) 1.91
Prevalence odds 0.1000 (O1) 0.0500 (O0) 2.00
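The figures in Table 3.1 can be checked with a few lines of Python. This is a sketch only: the counts are taken from the table, the five-year average duration is the assumption stated in the text, and the variable names are introduced here for illustration. The implied incidence rates are obtained by rearranging P/(1-P) = I x D and are not printed in the table itself.

```python
# Counts taken from Table 3.1
a, b = 909, 476          # prevalent cases: exposed, non-exposed
c, d = 9091, 9524        # non-cases: exposed, non-exposed
DURATION_YEARS = 5.0     # average disease duration assumed in the text

p1 = a / (a + c)                  # prevalence in the exposed
p0 = b / (b + d)                  # prevalence in the non-exposed
prevalence_ratio = p1 / p0

odds1 = a / c                     # prevalence odds in the exposed
odds0 = b / d                     # prevalence odds in the non-exposed
por = odds1 / odds0               # prevalence odds ratio

# Incidence rates implied by odds = I x D (steady state assumed)
implied_i1 = odds1 / DURATION_YEARS
implied_i0 = odds0 / DURATION_YEARS

print(f"prevalence ratio {prevalence_ratio:.2f}, POR {por:.2f}")
print(f"implied incidence rates {implied_i1:.4f} and {implied_i0:.4f} per year")
```

The prevalence ratio (1.91) is smaller than the prevalence odds ratio (2.00); under these assumptions it is the odds ratio, not the prevalence ratio, that matches the ratio of the implied incidence rates.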

Table 3.1 shows data from a prevalence study of 20,000 people. This is based on the incidence study represented in table 2.2 (chapter 2), with the assumptions that, for both populations, the incidence rate and population size are constant over time, that the average duration of disease is five years, and that there is no migration of people with the disease into or out of the population (such assumptions may not be realistic, but are made here for purposes of illustration). In this situation, the number of cases who "lose" the disease each year is balanced by the number of new cases generated from the source population. For example, in the non-exposed group, there are 476 prevalent cases, and 95 (20%) of these "lose" their disease each year; this is balanced by the 95 people who develop the disease each year (0.0100 of the susceptible population of 9524 people). With the additional assumption that the average duration of disease is the same in the exposed and non-exposed groups, then the prevalence odds ratio (2.00) validly estimates the incidence rate ratio (see table 2.2).

Conducting a Prevalence Study

Prevalence studies usually involve surveys in a source population defined by a geographic region or a particular exposure (e.g. an industry or factory). As with an incidence study, it is important that this source population is well-defined and that a high response rate is obtained. The basic issues involved in collecting data for a prevalence survey are therefore not markedly different from those in any other random population survey. However, particular consideration should be given to the methods of measuring disease status, since the methods that are most appropriate in clinical practice may not be appropriate or applicable in epidemiologic surveys. These issues are discussed in more depth in chapter 5.

Example 3.2

Wilks et al (1999) conducted a survey of the prevalence of diabetes in the population of Spanish Town, Jamaica. A random population sample was recruited by door-to-door canvassing (n=1,303) and oral glucose tolerance testing was conducted after an overnight fast (response rate = 60%). The prevalence of Type 2 diabetes mellitus was 15.7% among women and 9.8% among men. The sex patterns were consistent with the fourfold excess of diabetes in women compared to men, but obesity could not entirely account for the high prevalences observed which exceed those previously reported among European populations.

3.2. Prevalence Case-Control Studies

Just as an incidence case-control study can be used to obtain the same findings as a full incidence study, a prevalence case-control study can be used to obtain the same findings as a full prevalence study in a more efficient manner. In particular, if obtaining exposure information is difficult or costly (e.g. if it involves lengthy interviews, or expensive testing of serum samples), then it may be more efficient to conduct a prevalence case-control study by obtaining exposure information on all of the prevalent cases and a sample of controls selected at random from the non-cases.

Measures of Effect in Prevalence Case-Control Studies

Suppose that a nested case-control study is conducted in the study population (table 3.1), involving all of the 1,385 prevalent cases and a group of 1,385 controls (table 3.2). The usual approach to selecting controls is to select them from the non-cases. In this instance, the ratio of exposed to non-exposed controls will estimate the exposure odds (c/d) of the non-cases, and the odds ratio obtained in the prevalence case-control study will therefore estimate the prevalence odds ratio in the source population (2.00), which in turn estimates the incidence rate ratio provided that the above assumptions are satisfied in the exposed and non-exposed populations.

Conducting a Prevalence Case-Control Study

A prevalence case-control study can be based on routine records (see example 3.3) or conducted as a second phase of a specific prevalence survey. Whereas incidence case-control studies involve at least three possible methods of selecting controls, in a prevalence case-control study there is only one valid option, i.e. controls should be selected at random from the non-cases. For both groups, information on historical and current exposures may be obtained, as well as information on potential confounders. The methods of obtaining such information are generally similar to those used in incidence case-control studies, and include questionnaires, biological measurements, and examination of historical records (e.g. personnel and work history records).

Table 3.2
Findings from a hypothetical prevalence case-control study based on the population
represented in table 3.1
Exposed Non-exposed Ratio
--------------------------------------------------------------------------------------
Cases 909 (a) 476 (b)
Controls 676 (c) 709 (d)
--------------------------------------------------------------------------------------
Prevalence odds 1.34 (O1) 0.67 (O0) 2.00
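The control counts and the odds ratio in Table 3.2 can be reproduced from Table 3.1 with a short Python sketch. It assumes, for illustration, that the 1,385 controls split between exposed and non-exposed exactly in proportion to the non-cases in the source population (a real random sample would vary around these expected values); the variable names are introduced here and are not from the text.

```python
# Source population from Table 3.1
cases_exposed, cases_unexposed = 909, 476
noncases_exposed, noncases_unexposed = 9091, 9524

# Expected split of 1,385 controls sampled at random from the non-cases
n_controls = 1385
fraction_exposed = noncases_exposed / (noncases_exposed + noncases_unexposed)
controls_exposed = round(n_controls * fraction_exposed)
controls_unexposed = n_controls - controls_exposed

# Case-control odds ratio (cross-product ratio)
odds_ratio = (cases_exposed * controls_unexposed) / (cases_unexposed * controls_exposed)

print(controls_exposed, controls_unexposed)   # 676 709
print(round(odds_ratio, 2))                   # 2.0
```

Because the controls reflect the exposure odds of the non-cases, the case-control odds ratio reproduces the prevalence odds ratio of the full prevalence study while requiring exposure information on 2,770 rather than 20,000 people.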

Example 3.3

Studies of congenital malformations usually involve estimating the prevalence of malformations at birth (i.e. this is a prevalence rather than an incidence measure). Garcia et al (1999) conducted a (prevalence) case-control study of occupational exposure to pesticides and congenital malformations in Comunidad Valenciana, Spain. A total of 261 cases and 261 controls were selected from those infants born in eight public hospitals during 1993-1994. For mothers who were involved in agricultural activities in the month before conception and the first trimester of pregnancy, the adjusted prevalence odds ratio for congenital malformations was 3.2 (95% CI 1.1-9.0). There was no such association with paternal agricultural work.

Summary

When a dichotomous outcome is under study (e.g. being alive or dead, or having or not having a disease) four main types of studies can be identified: incidence studies, incidence case-control studies, prevalence studies, and prevalence case-control studies (Morgenstern and Thomas, 1993; Pearce, 1998). Prevalence studies involve measuring the prevalence of the disease in the source population at a particular time, rather than the incidence of the disease over time. Prevalence case-control studies involve sampling on the basis of outcome, i.e. they usually involve all prevalent cases in the source population and a control group (of non-cases) sampled from the source population.

References

Alho JM (1992). On prevalence, incidence, and duration in general stable populations. Biometrics 48: 587-92.

Asher I, Keil U, Anderson HR, et al (1995). International study of asthma and allergies in childhood (ISAAC): rationale and methods. Eur Resp J 8: 483-91.

Douwes J, Pearce N (2003). Asthma and the Westernization package. Int J Epidemiol 31: 1098-1102.

Garcia AM, Fletcher T, Benavides FG, Orts E (1999). Parental agricultural work and selected congenital malformations. Am J Epidemiol 149: 64-74.

ISAAC Steering Committee (1998a). Worldwide variation in prevalence of symptoms of asthma, allergic rhinoconjunctivitis and atopic eczema: ISAAC. Lancet 351: 1225-32.

ISAAC Steering Committee (1998b). Worldwide variations in the prevalence of asthma symptoms: International Study of Asthma and Allergies in Childhood (ISAAC). Eur Respir J 12: 315-35.

Morgenstern H, Thomas D (1993). Principles of study design in environmental epidemiology. Environ Health Perspectives 101: S23-S38.

Pearce N (1998). The four basic epidemiologic study types. J Epidemiol Biostat 3: 171-7.

Pearce NE, Weiland S, Keil U, et al (1993). Self-reported prevalence of asthma symptoms in children in Australia, England, Germany and New Zealand: an international comparison using the ISAAC protocol. Eur Resp J 6: 1455-61.

Wilks R, Rotimi C, Bennett F, et al (1999). Diabetes in the Caribbean: results of a population survey from Spanish Town, Jamaica. Diabetic Medicine 16: 875-83.

CHAPTER 4. More Complex Study Designs
(In: Pearce N. A Short Introduction to Epidemiology. Wellington: CPHR, 2003)

In the previous two chapters I reviewed the possible study designs for the simple situation where individuals are exposed to a particular risk factor (e.g. a particular chemical) and when a dichotomous outcome is under study (e.g. being alive or dead, or having or not having a particular disease). I now consider studies involving other axes of classification, continuous measurements of health status (e.g. continuous lung function or blood pressure measurements) and more complex study designs (ecologic and multilevel studies).

4.1: Other Axes of Classification

The four basic study types discussed in chapters 2 and 3 are defined in terms of: (a) the type of outcome under study (incidence or prevalence); and (b) whether there is sampling on the basis of outcome. They do not involve any consideration of the nature of the exposure data. This provides additional axes of classification.

Continuous Exposure Data

Firstly, it should be noted that in discussing the above classification we have assumed that exposure is dichotomous (i.e. study participants are exposed or not exposed). In reality, there may be multiple exposure categories (e.g. high, medium and low exposure), or exposure may be measured as a continuous variable (see chapter 5). However, although this requires minor changes to the data analysis (see chapter 9), it does not alter the four-fold categorisation of study design options presented above.

The Timing of Collection of Exposure Information

Perhaps the feature that has received the most attention in various classification schemes is the timing of the collection of exposure information. This has dominated discussions of directionality, particularly with regard to case-control studies. In fact, for all of the four basic study types, exposure information can be collected prospectively or retrospectively. For example, an incidence study or incidence case-control study of occupational cancer may collect exposure information prospectively, or use historical information that was collected prospectively but abstracted retrospectively by the investigator (e.g. occupational hygiene monitoring records), or use exposure information that was collected retrospectively (e.g. recall of duration and intensity of pesticide use). An unfortunate aspect of some discussions of the merits of case-control studies is that they have often been labelled as retrospective studies, when this is in fact not an inherent part of their design. The potential problem of bias due to exposure ascertainment errors (e.g. recall bias) arises from the retrospective collection of exposure information, irrespective of whether the study is an incidence, incidence case-control, prevalence, or prevalence case-control study.

Sources of Exposure Information

Another set of issues that occur in practice involve the sources of exposure information (e.g. routine records, job-exposure matrices, questionnaires, biological samples). However, as noted above, these issues are important in understanding sources of bias but are not fundamental to the classification of study types since, as with issues of directionality, they do not affect the parameterization of the exposure-outcome association.

The Level of Measurement of Exposure

A third additional axis of classification involves the level of measurement of exposure. In particular, in ecologic studies exposure information may be collected on a group rather than on individuals (e.g. average level of meat consumption), although other information may still be available for individuals (e.g. age, gender). This situation is discussed in section 4.3.

4.2: Continuous Outcome Measures

Cross-Sectional Studies

In chapters 2 and 3, the health outcome under study was a state (e.g. having or not having hypertension). Studies could involve observing the incidence of the event of acquiring the disease state (e.g. the incidence of being diagnosed with hypertension), or the prevalence of the disease state (e.g. the prevalence of hypertension). More generally, the health state under study may have multiple categories (e.g. non-hypertensive, mild hypertension, moderate hypertension, severe hypertension) or may be represented by a continuous measurement (e.g. blood pressure). Since these measurements are taken at a particular point in time, such studies are often referred to as cross-sectional studies. Prevalence studies (see chapter 3) are a subgroup of cross-sectional studies in which the disease outcome is dichotomous.

Although cross-sectional studies are sometimes described as studies in which exposure and disease information is collected at the same point in time (e.g. Kramer and Boivin, 1988; Last, 1988), this is not in fact an inherent feature of such studies. In most cross-sectional studies (including prevalence studies), information on exposure will be physically collected by the investigator at the same time that information on disease is collected. Nonetheless, exposure information may include factors that do not change over time (e.g. gender) or change in a predictable manner (e.g. age) as well as factors that do change over time. The latter may have been measured at the time of data collection (e.g. current levels of airborne dust exposure), or at a previous time (e.g. from historical records on past exposure levels) or integrated over time. The key feature of cross-sectional studies is that they involve studying disease at a particular point in time. Exposure information can be collected for current and/or historical exposures, and a wide variety of exposure assessment methods can be used within this general category of study (these are discussed further in chapter 5).

Just as a prevalence case-control study can be based on a prevalence survey, a cross-sectional study can also involve sampling on the basis of the disease outcome. For example, a cross-sectional study of bronchial hyperresponsiveness (BHR) could involve testing all study participants for BHR and then categorising the test results into severe BHR, mild BHR, and no BHR, and then obtaining exposure information on all severe BHR cases and from random samples of the other two groups.

Measures of Effect in Cross-Sectional Studies

In a simple cross-sectional study involving continuous outcome data, the basic methods of statistical analysis involve comparing the mean level of the outcome in exposed and non-exposed groups, e.g. the mean levels of blood pressure in exposed and non-exposed people. Standard statistical methods of analysis for comparing means (perhaps after a suitable transformation to normalise the data), and calculating confidence intervals (and associated p-values) for differences between means, can be used to analyse such studies (see chapter 9). More generally, regression methods can be used to model the relationship between the level of exposure (measured as a continuous variable) and the level of the outcome measure (also measured as a continuous variable) (e.g. Armitage et al, 2002).

Example 4.1

Nersesyan et al (2001) studied chromosome aberrations in lymphocytes of persons exposed to an earthquake in Armenia. They collected blood samples from 41 victims of the 1988 earthquake and from 47 reference blood donors. Those exposed to the earthquake had a higher proportion of cells with chromosome aberrations (3.1% (SD 2.1)) than the referents (1.7% (SD 1.3)). The differences persisted when the data were adjusted for age and gender. The authors suggested that the findings could be due either to environmental exposures related to the earthquake or to severe psychogenic stress. They noted that studies in wild rodents living in seismic regions have shown similar findings.
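A comparison of means of this kind can be sketched in Python from the summary statistics quoted in Example 4.1. This is an illustration only and not the authors' analysis: it is unadjusted, it assumes the reported SDs are between-person standard deviations, and it uses a large-sample normal approximation with unequal variances. The function name is hypothetical.

```python
import math

def mean_difference_ci(m1, sd1, n1, m0, sd0, n0, z=1.96):
    """Difference in means with an approximate 95% interval (unequal variances)."""
    diff = m1 - m0
    se = math.sqrt(sd1**2 / n1 + sd0**2 / n0)   # standard error of the difference
    return diff, diff - z * se, diff + z * se

# Summary figures from Example 4.1: exposed (n=41) vs referents (n=47),
# percentage of cells with chromosome aberrations.
diff, lower, upper = mean_difference_ci(3.1, 2.1, 41, 1.7, 1.3, 47)
print(f"difference {diff:.1f}, approximate 95% CI {lower:.2f} to {upper:.2f}")
```

Under these assumptions the interval excludes zero, which is consistent with the difference described in the example; a t-based interval would be slightly wider with groups of this size.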

Longitudinal Studies

Longitudinal studies (cohort studies) involve repeated observation of study participants over time (Pearce et al, 1998). Incidence studies (chapter 2) are a subgroup of longitudinal study in which the outcome measure is dichotomous. More generally, longitudinal studies may involve repeated assessment of categorical or continuous outcome measures over time (e.g. a series of linked cross-sectional studies in the same population). They thus can involve incidence data, a series of prevalence surveys, or a series of cross-sectional continuous outcome measures.

General longitudinal studies

A simple longitudinal study may involve comparing the disease outcome measure, or more usually changes in the measure over time, between exposed and non-exposed groups. For example, rather than comparing the incidence of hypertension (as in an incidence study), or the prevalence at a particular time (as in a prevalence study), or the mean blood pressure at a particular point in time (as in a cross-sectional study), a longitudinal study might involve measuring baseline blood pressure in exposed and non-exposed persons and then comparing changes in mean blood pressure (i.e. the change from the baseline measure) over time in the two groups. Such a comparison of means can be made using standard statistical methods for comparing means and calculating confidence intervals and associated p-values for the difference between the means (Armitage et al, 2002; Beaglehole et al, 1993). More generally, regression methods (Diggle et al, 1994) might be used to model the relationship between the level of exposure (measured as a continuous variable) and the level of the outcome measure (also measured as a continuous variable, in this instance the change in blood pressure).

Example 4.2

The Tokelau Island Migrant Study (Wessen et al, 1992) examined the effects of migration on development of Western diseases within a population which initially had a low incidence of these conditions. Round I surveys were conducted in the Tokelau Islands in 1968/1971, and these were repeated (Round II) in both the Tokelau Islands (1976) and in New Zealand (1975-7). A regression analysis of changes in blood pressure between Round I and Round II (adjusted for age) found that the mean annual increase in blood pressure was greater in those who had migrated than in those who had not: the mean differences were 1.43 for systolic and 1.15 for diastolic in men, and 0.66 and 0.46 respectively in women. These differences in rates of annual increase in blood pressure were maintained in subsequent surveys in men, but not in women.

Time series

One special type of longitudinal study is that of time series comparisons in which variations in exposure levels and symptom levels are assessed over time with each individual serving as their own control. Thus, the comparison of exposed and non-exposed involves the same persons evaluated at different times, rather than different groups of persons being compared (often at the same time) as in other longitudinal studies. The advantage of the time series approach is that it reduces or eliminates confounding (see chapter 6) by factors which vary among subjects but not over time (e.g. genetic factors), or whose day to day variation is unrelated to the main exposure (Pope and Schwartz, 1996). On the other hand, time series data often require special statistical techniques because any two factors that show a time trend will be correlated (Diggle et al, 1994). For example, even a three-month study of lung function in children will generally show an upward trend due to growth, as well as learning effects (Pope and Schwartz, 1996). A further problem is that the change in a measure over time may depend on the baseline value, e.g. changes in lung function over time may depend on the baseline level (Schouten and Tager, 1996).

Time series can involve dichotomous (binary) data, continuous data, or counts of events (e.g. hospital admissions) (Pope and Schwartz, 1996), and the changes in these values may be measured over minutes, hours, days, weeks, months or years (Dockery and Brunekreef, 1996). In many instances, such data can be analysed using the standard statistical techniques outlined above. For example, a study of daily levels of air pollution and asthma hospital admission rates can be conceptualised as a study of the incidence of hospital admission in a population exposed to air pollution compared with that in a population not exposed to air pollution. The key difference is that only a single population is involved, and it is regarded as exposed on high pollution days and as non-exposed on low pollution days. Provided that the person-time of exposure is appropriately defined and assessed, then the basic methods of analysis are not markedly different from other studies involving comparisons of exposed and non-exposed groups.

However, the analysis of time series may be complicated because the data for an individual are not independent and serial data are often correlated (Sherrill and Viegi, 1996), i.e. the value of a continuous outcome measure on a particular day may be correlated with the value for the previous day. Furthermore, previous exposure may be as relevant as, or more relevant than, current exposure. For example, the effects of air pollution may depend on exposure on preceding days as well as on the current day (Pope and Schwartz, 1996).
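The single-population comparison of high and low pollution days can be sketched in Python as follows. This is an illustration only and not the method of any study cited here: the daily figures, the threshold of 50 and the function names are hypothetical, person-time is counted in person-days, and the confidence interval treats days as independent, so it ignores the serial correlation that the second function measures.

```python
from math import exp, log, sqrt


def rate_ratio(days, threshold):
    """Rate ratio for high- versus low-pollution days in one population.

    `days` is a list of (pollution_level, admissions, population_at_risk)
    tuples, one per day. Returns (rate_ratio, lower, upper).
    """
    cases_high = cases_low = pt_high = pt_low = 0
    for level, admissions, population in days:
        if level >= threshold:      # the population counts as exposed today
            cases_high += admissions
            pt_high += population
        else:                       # the same population, non-exposed today
            cases_low += admissions
            pt_low += population
    rr = (cases_high / pt_high) / (cases_low / pt_low)
    se_log = sqrt(1 / cases_high + 1 / cases_low)
    return rr, exp(log(rr) - 1.96 * se_log), exp(log(rr) + 1.96 * se_log)


def lag1_autocorrelation(series):
    """Correlation between each day's value and the previous day's value."""
    m = sum(series) / len(series)
    numerator = sum((a - m) * (b - m) for a, b in zip(series, series[1:]))
    return numerator / sum((x - m) ** 2 for x in series)


# Hypothetical daily data: (pollution level, admissions, population at risk)
days = [(20, 4, 100000), (25, 5, 100000), (60, 9, 100000), (70, 10, 100000),
        (65, 8, 100000), (30, 6, 100000), (22, 5, 100000), (55, 9, 100000)]

rr, lower, upper = rate_ratio(days, threshold=50)
autocorr = lag1_autocorrelation([admissions for _, admissions, _ in days])
print(f"Rate ratio {rr:.2f} (95% CI {lower:.2f} to {upper:.2f})")
print(f"Lag-1 autocorrelation of daily admissions: {autocorr:.2f}")
```

A lag-1 autocorrelation well away from zero is a warning that the simple interval is too narrow and that methods designed for correlated data (Diggle et al, 1994) are needed.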

Example 4.3

Hoek et al (2001) studied associations between daily variations in air pollution and mortality in The Netherlands during 1986-1994. The authors found (table 4.1) that heart disease deaths were increased during periods with high levels of ozone, black smoke, particulate matter 10 microns in diameter (PM10), carbon monoxide (CO), sulfur dioxide (SO2) and nitrogen dioxide (NO2). As with previously published studies, the effects depended on exposures on the previous few days, and were weaker when the analysis only considered exposures on a particular day without using any lag period (Schwartz, 2000). The authors reported that deaths due to heart failure, arrhythmia, cerebrovascular causes and thrombotic causes were more strongly associated with air pollution than were cardiovascular deaths in general. In particular, heart failure deaths, which made up 10% of all cardiovascular deaths, were responsible for about 30% of the excess cardiovascular deaths related to air pollution from particulate matter, SO2, CO, and NO2.

Table 4.1

Relative risks* (and 95% CIs) of cardiovascular disease mortality associated with air
pollution concentrations in the Netherlands

Pollutant                   Total CVD mortality     Heart failure mortality
---------------------------------------------------------------------------
Ozone (1 day lag)           1.055 (1.032-1.079)     1.079 (1.009-1.154)
Black smoke (7 day mean)    1.029 (1.013-1.046)     1.081 (1.031-1.134)
PM10 (7 day mean)           1.012 (0.984-1.041)     1.036 (0.960-1.118)
CO (7 day mean)             1.026 (0.993-1.060)     1.109 (1.012-1.216)
SO2 (7 day mean)            1.029 (1.012-1.046)     1.098 (1.043-1.156)
NO2 (7 day mean)            1.023 (1.009-1.036)     1.064 (1.024-1.106)
---------------------------------------------------------------------------
*Relative risks per 1st to 99th percentile pollution difference: per 150 µg/m3 for ozone (8-hour maximum of the previous day), per 120 µg/m3 for CO, per 80 µg/m3 for PM10, per 30 µg/m3 for NO2, and per 40 µg/m3 for black smoke and SO2, all as 7-day moving averages
Source: Hoek et al (2001)
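The exposure definitions used in table 4.1 (a one-day lag and a seven-day mean) can be derived from a daily pollution series as in the following Python sketch. The function names and the example series are hypothetical, and the moving average is assumed here to cover the current day and the preceding days; the exact averaging window used by Hoek et al (2001) is not specified in this text.

```python
def lagged(series, lag):
    """Exposure `lag` days earlier; None where no earlier value exists."""
    if lag == 0:
        return list(series)
    return [None] * lag + list(series[:len(series) - lag])


def moving_average(series, window):
    """Mean of the current day and the preceding (window - 1) days.

    Days without a full window of history are returned as None.
    """
    averages = []
    for i in range(len(series)):
        if i + 1 < window:
            averages.append(None)
        else:
            averages.append(sum(series[i + 1 - window:i + 1]) / window)
    return averages


# Hypothetical daily ozone concentrations (µg/m3)
ozone = [40, 55, 70, 65, 50, 45, 60, 80, 75]

ozone_lag1 = lagged(ozone, 1)            # exposure on the previous day
ozone_mean7 = moving_average(ozone, 7)   # seven-day mean exposure
print(ozone_lag1)
print(ozone_mean7)
```

Each day's deaths would then be related to the lagged or averaged exposure for that day rather than to the same-day concentration alone.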

4.3 Ecologic and Multilevel Studies

The basic study designs described in chapter 2 involved the measurement of exposure and disease in individuals. In this section, I consider more complex study designs in which exposures are measured in populations instead of, or in addition to, individuals.

Ecologic Studies

In ecologic studies exposure information may be collected on a group rather than on individuals. In the past, ecologic studies have been regarded as an inexpensive but unreliable method for studying individual-level risk factors for disease. For example, rather than go to the time and expense to establish a cohort study or case-control study of fat intake and breast cancer, one could simply use national dietary and cancer incidence data and, with minimal time and expense, show a strong correlation internationally between fat intake and breast cancer. In this situation, an ecologic study does not represent a fundamentally different study design, but merely a particular variant of the four basic study designs described in chapter 2 in which information on average levels of exposure in populations is used as a surrogate measure of exposure in individuals.

This approach has been quite rightly regarded as inadequate and unreliable because of the many additional forms of bias that can occur in such studies compared with studies of individuals within a population. In particular, not only will measures of exposure in populations often be poor surrogates for exposures in individuals, but the ecologic fallacy (see below) can occur in that factors that are associated with national disease rates may not be associated with disease in individuals (Greenland and Robins, 1994). Thus, ecologic studies have recently been regarded as a relic of the pre-modern phase of epidemiology before it became firmly established with a methodologic paradigm based on the theory of randomized controlled trials of individuals.

However, population-level studies are now experiencing a revival for two important reasons (Pearce, 2000).

Firstly, it is increasingly recognised that, even when studying individual-level risk factors, population-level studies play an essential role in defining the most important public health problems to be addressed, and in generating hypotheses as to their potential causes. Many important individual-level risk factors for disease simply do not vary enough within populations to enable their effects to be identified or studied (Rose, 1992). More importantly, such studies are a key component of the continual cycle of theory and hypothesis generation and testing (Pearce, 2000). Historically, the key area in which epidemiologists have been able to add value has been through this population focus (Pearce, 1996, 1999). For example, many of the recent discoveries on the causes of cancer (including dietary factors and colon cancer, hepatitis B and liver cancer, aflatoxins and liver cancer, human papilloma virus and cervical cancer) have their origins, directly or indirectly, in the systematic international comparisons of cancer incidence conducted in the 1950s and 1960s (Doll et al, 1966). These suggested hypotheses concerning the possible causes of the international patterns, which were investigated in more depth in further studies. In some instances these hypotheses were consistent with biological knowledge at the time, but in other instances they were new and striking, and might not have been proposed, or investigated further, if the population level analyses had not been done.

Example 4.4

The International Study of Asthma and Allergies in Childhood (ISAAC) (Asher et al, 1995; Pearce et al, 1993) was described in example 3.1. Figure 4.1 shows the findings for current wheeze (i.e. wheeze in the previous 12 months) and its association with tuberculosis notification rates (von Mutius et al, 2000). It shows a negative association between tuberculosis rates and asthma prevalence. This is not compelling evidence in itself (because of the major shortcomings of ecologic analyses that are described below), but it is generally consistent with the hygiene hypothesis that suggests that asthma prevalence is increasing in Western countries because of the loss of a protective effect from infections such as tuberculosis in early life.

A second reason that ecologic studies are experiencing a revival is that it is increasingly being recognised that some risk factors for disease genuinely operate at the population level (Pearce, 2000). In some instances they may directly cause disease, but perhaps more commonly they may cause disease as effect modifiers or determinants of exposure to individual-level risk factors. For example, being poor in a rich country or neighbourhood may be worse than having the same income level in a poor country or neighbourhood, because of problems of social exclusion and lack of access to services and resources (Yen and Kaplan, 1999). The failure to take account of the importance of population context, as an effect modifier and determinant of individual-level exposures, could be termed the individualistic fallacy (Diez-Roux, 1998) in which the major population determinants of health are ignored and undue attention is focussed on individual characteristics. In this situation, the associations between these individual characteristics and health can be validly estimated, but their importance relative to other potential interventions, and the importance of the context of such interventions, may be ignored.

Figure 4.1

Association of tuberculosis notification rates for the period 1980-1982 (in countries with
valid tuberculosis notification data) and the prevalence of asthma symptoms in 13-14
year old children in the International Study of Asthma and Allergies in Childhood (ISAAC)

[Figure 4.1: scatter plot. Vertical axis: wheeze in the last 12 months (%), written questionnaire, scale 0-40. Horizontal axis: tuberculosis notification rate per 100,000, scale 0-80.]

Source: von Mutius et al (2000)

Example 4.5

Wilkinson (1992) has analysed measures of income inequality and found them to be positively associated with national mortality rates in a number of Western countries. This is a true ecologic exposure since the level of income inequality is a characteristic of a country, and not of an individual. If this evidence is correct, this is clearly of crucial importance since it implies that development in itself may not automatically be good for health, and that the way in which the Gross National Product (GNP) is 'shared' may be as important as its absolute level. It should be noted, however, that this evidence has been disputed by other researchers (e.g. Lynch et al, 2000; Pearce and Davey Smith, 2003) who have argued that the level of income inequality in a country, or in a state, is a surrogate measure for other socioeconomic factors, including the provision of public education and health services, as well as social welfare services.

Ecologic Fallacies

While stressing the potential value of ecologic analyses, it is also important to recognise their limitations. In particular, ecologic studies are a very poor means of assessing the effects of individual exposures (e.g. diet or tobacco smoking) since confounding (and effect modification) can occur at the individual level, the country (population) level, or both (Morgenstern, 1998). For example, almost any disease that is associated with affluence and westernisation has in the past been associated at the national level with sales of television sets, and nowadays is probably associated at the national level with rates of internet usage. This does not mean that watching television causes every type of disease, but rather that in many instances the association between sales of television sets and disease at the national level is confounded by other exposures (at both the national and individual level). A hypothetical example is given in example 4.6. Another problem is that individual level effects can confound ecologic estimates of population-level effects (Greenland, 2001). These problems of cross-level inference are avoided (or reduced) in multilevel analyses (see below).

Example 4.6

Table 4.2 shows the data for a hypothetical ecological analysis. The numbers of cases and population numbers (and hence disease rates), as well as the percentage of the population exposed, are known for each country. Thus, the numbers of people exposed and non-exposed within each country are known, but it is not known how many cases were exposed and how many were not; thus it is not possible to estimate the rates in the exposed and non-exposed groups within each country. The country-level data indicate a negative association between exposure and disease at the country level: if a regression is performed on the country-level data it indicates (comparing 100% exposure with 0% exposure) a relative risk of 0.5. However, it is not known whether this association applies to individuals, since the data are not available.

Tables 4.3-4.5 give three different scenarios, each of which could generate the data in table 4.2. In table 4.3, there is no confounding at the country level (because the rate in the non-exposed is the same - 200 per 100,000 - in each country), although there could of course still be uncontrolled confounding at the individual level. Thus, the ecologic analysis correctly estimates the individual-level relative risk of 0.5. In table 4.4, there is confounding at the country level (because the rate in the non-exposed differs by country) and there is in fact no association at the individual level. In table 4.5, there is effect modification at the country level, and the relative risk is above 1.0, but of differing magnitude, in all three countries. These three very different situations (a protective effect, no effect, a positive effect which is different in each country) all yield the same country-level data shown in table 4.2.

Table 4.2

Hypothetical example of an ecologic analysis

                Country 1            Country 2            Country 3
                (35% exposed)        (50% exposed)        (65% exposed)
                Cases       Rate     Cases       Rate     Cases       Rate
---------------------------------------------------------------------------
Exposed         ?/7000      ?        ?/10000     ?        ?/13000     ?
Non-exposed     ?/13000     ?        ?/10000     ?        ?/7000      ?
---------------------------------------------------------------------------
Total           33/20000    165      30/20000    150      27/20000    135

Rates are per 100,000 population.

Source: Adapted from Morgenstern (1998)

Table 4.3
Hypothetical example of an ecologic analysis:
No confounding by country

                Country 1            Country 2            Country 3
                (35% exposed)        (50% exposed)        (65% exposed)
                Cases       Rate     Cases       Rate     Cases       Rate
---------------------------------------------------------------------------
Exposed         7/7000      100      10/10000    100      13/13000    100
Non-exposed     26/13000    200      20/10000    200      14/7000     200
---------------------------------------------------------------------------
Total           33/20000    165      30/20000    150      27/20000    135
---------------------------------------------------------------------------
Ratio                       0.5                  0.5                  0.5

Rates are per 100,000 population.

Source: Adapted from Morgenstern (1998)

Table 4.4

Hypothetical example of an ecologic analysis:
Confounding by country

                Country 1            Country 2            Country 3
                (35% exposed)        (50% exposed)        (65% exposed)
                Cases       Rate     Cases       Rate     Cases       Rate
---------------------------------------------------------------------------
Exposed         12/7000     171      15/10000    150      18/13000    139
Non-exposed     21/13000    162      15/10000    150      9/7000      129
---------------------------------------------------------------------------
Total           33/20000    165      30/20000    150      27/20000    135
---------------------------------------------------------------------------
Ratio                       1.1                  1.0                  1.1

Rates are per 100,000 population.

Source: Adapted from Morgenstern (1998)

Table 4.5

Hypothetical example of an ecologic analysis:
Effect modification by country

                Country 1            Country 2            Country 3
                (35% exposed)        (50% exposed)        (65% exposed)
                Cases       Rate     Cases       Rate     Cases       Rate
---------------------------------------------------------------------------
Exposed         20/7000     286      20/10000    200      20/13000    154
Non-exposed     13/13000    100      10/10000    100      7/7000      100
---------------------------------------------------------------------------
Total           33/20000    165      30/20000    150      27/20000    135
---------------------------------------------------------------------------
Ratio                       2.9                  2.0                  1.5

Rates are per 100,000 population.

Source: Adapted from Morgenstern (1998)
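The arithmetic behind example 4.6 can be checked with a short Python sketch. The case counts are taken from tables 4.2-4.5; the function and variable names are illustrative only, and an ordinary least-squares line through the three country rates is assumed for the ecologic regression.

```python
def least_squares(xs, ys):
    """Intercept and slope of an ordinary least-squares line."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return mean_y - slope * mean_x, slope


population = 20000                          # each country (table 4.2)
proportion_exposed = [0.35, 0.50, 0.65]
total_cases = [33, 30, 27]
rates = [cases / population * 100000 for cases in total_cases]

# Ecologic estimate: predicted rate at 100% exposure versus 0% exposure
intercept, slope = least_squares(proportion_exposed, rates)
ecologic_rr = (intercept + slope) / intercept

# (exposed cases, non-exposed cases) per country in tables 4.3-4.5
scenarios = {
    "table 4.3": [(7, 26), (10, 20), (13, 14)],
    "table 4.4": [(12, 21), (15, 15), (18, 9)],
    "table 4.5": [(20, 13), (20, 10), (20, 7)],
}


def individual_rrs(cases):
    """Within-country rate ratios (exposed versus non-exposed)."""
    ratios = []
    for proportion, (exposed, non_exposed) in zip(proportion_exposed, cases):
        n_exposed = round(population * proportion)
        n_non_exposed = population - n_exposed
        ratios.append((exposed / n_exposed) / (non_exposed / n_non_exposed))
    return ratios


print(f"Ecologic relative risk: {ecologic_rr:.2f}")
for name, cases in scenarios.items():
    print(name, [round(ratio, 1) for ratio in individual_rrs(cases)])
```

Every scenario adds up to the same country totals, so the ecologic estimate is identical in all three, while the within-country ratios differ; only in the first scenario does the ecologic estimate match the individual-level effect.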

Multilevel Studies

If individual as well as population-level data are available, then the problems of cross-level confounding and effect modification (illustrated in example 4.6) can be avoided (or reduced) by using multilevel modelling (Greenland, 2000, 2002). This enables the simultaneous consideration of individual level effects (e.g. individual income) and population-level effects (e.g. per capita national income, or income inequality). This approach therefore combines the best features of individual level analyses and population-level analyses. In particular, it enables us to take the population context of exposure into account (Pearce, 2000). However, it should be stressed that multilevel modelling is complex, and requires intensive consideration of possible biases at the population level, as well as at the individual level (Blakely and Woodward, 2000).
This approach therefore combines the

Example 4.7

Yen and Kaplan (1999) conducted a multi-level analysis of neighbourhood social environment and risk of death in the Alameda County Study, comprising 6,928 non-institutionalised adult residents of the County recruited in 1965. Mortality risks were significantly higher in neighbourhoods with a low social environment, even after account was taken of individual income level, education, ethnicity, perceived health status, smoking status, body mass index, and alcohol consumption. The authors concluded that the findings demonstrate the importance of area characteristics as a health risk factor.

Summary

The basic study designs presented in chapters 2 and 3 can be extended in two ways: by the inclusion of continuous outcome measures; and by the use of exposure information on populations rather than individuals.

Cross-sectional studies can include a variety of measurements of the health outcome under study (e.g. lung function or blood pressure measurements). Prevalence studies are a subgroup of cross-sectional studies in which the outcome measure is dichotomous. Similarly, longitudinal studies can involve incidence data, but may also involve a series of cross-sectional measurements. Incidence studies are a subgroup of longitudinal studies in which the outcome measure is dichotomous. Time series studies are a particular type of longitudinal study in which each subject serves as his or her own control.

Ecologic studies play an important role in the process of hypothesis generation and testing, but they pose additional problems of bias when attempting to estimate the effects of exposures in individuals. These problems are avoided (or reduced) in multilevel analyses, which permit us to take the population context of exposure into account.

References

Armitage P, Berry G, Matthews JNS (2002). Statistical methods in medical research. 4th ed. Oxford: Blackwell.

Asher I, Keil U, Anderson HR, Beasley R, et al (1995). International study of asthma and allergies in childhood (ISAAC): rationale and methods. Eur Resp J 8: 483-91.

Beaglehole R, Bonita R, Kjellstrom T (1993). Basic epidemiology. Geneva: WHO.

Blakely T, Woodward AJ (2000). Ecological effects in multi-level studies. J Epidemiol Comm Health 54: 367-74.

Diez-Roux AV (1998). Bringing context back into epidemiology: variables and fallacies in multilevel analysis. Am J Publ Health 88: 216-22.

Diggle PJ, Liang K-Y, Zeger SL (1994). Analysis of longitudinal data. Oxford: Clarendon.

Dockery DW, Brunekreef B (1996). Longitudinal studies of air pollution effects on lung function. Am J Respir Crit Care Med 154: S250-S256.

Doll R, Payne P, Waterhouse J (eds) (1966). Cancer Incidence in Five Continents: A Technical Report. Berlin: Springer-Verlag (for UICC).

Greenland S, Robins J (1994). Ecologic studies - biases, misconceptions, and counterexamples. Am J Epidemiol 139: 747-60.

Greenland S (2000). Principles of multilevel modelling. Int J Epidemiol 29: 158-67.

Greenland S (2001). Ecologic versus individual-level sources of bias in ecologic estimates of contextual health effects. Int J Epidemiol 30: 1343-50.

Greenland S (2002). A review of multilevel theory for ecologic analyses. Stat Med 21: 389-95.

Hoek G, Brunekreef B, Fischer P, van Wijnen J (2001). The association between air pollution and heart failure, arrhythmia, embolism, thrombosis, and other cardiovascular causes of death in a time series study. Epidemiol 12: 355-57.

Kramer MS, Boivin J-F (1988). The importance of directionality in epidemiologic research design. J Clin Epidemiol 41: 717-8.

Last JM (ed) (1988). A dictionary of epidemiology. New York: Oxford University Press.

Lynch J, Due P, Muntaner C, Davey Smith G (2000). Social capital - is it a good investment strategy for public health? J Epidemiol Comm Health 54: 404-8.

Morgenstern H (1998). Ecologic studies. In: Rothman K, Greenland S. Modern epidemiology. Philadelphia: Lippincott-Raven, pp 459-80.

Nersesyan AK, Boffetta P, Sarkisyan TF, et al (2001). Chromosome aberrations in lymphocytes of persons exposed to an earthquake in Armenia. Scand J Work Environ Health 27: 120-4.

Pearce N (1996). Traditional epidemiology, modern epidemiology, and public health. Am J Publ Health 86: 678-83.

Pearce N (1999). Epidemiology as a population science. Int J Epidemiol 28: S1015-8.

Pearce N (2000). The ecologic fallacy strikes back. J Epidemiol Comm Health 54: 326-7.

Pearce N, Davey Smith G (2003). Is social capital the key to inequalities in health? Am J Publ Health 93: 122-9.

Pearce NE, Weiland S, Keil U, et al (1993). Self-reported prevalence of asthma symptoms in children in Australia, England, Germany and New Zealand: an international comparison using the ISAAC protocol. Eur Resp J 6: 1455-61.

Pearce N, Beasley R, Burgess C, Crane J (1998). Asthma epidemiology: principles and methods. New York: Oxford University Press.

Pearce N, Douwes J, Beasley R (2000). The rise and rise of asthma: a new paradigm for the new millennium? J Epidemiol Biostat 5: 5-16.

Pope CA, Schwartz J (1996). Time series for the analysis of pulmonary health data. Am J Respir Crit Care Med 154: S229-S233.

Rose G (1992). The strategy of preventive medicine. Oxford: Oxford University Press.

Schouten JP, Tager IB (1996). Interpretation of longitudinal studies: an overview. Am J Respir Crit Care Med 154: S278-S284.

Schwartz J (2000). The distributed lag between air pollution and daily deaths. Epidemiol 11: 320-6.

Sherrill D, Viegi G (1996). On modeling longitudinal pulmonary function data. Am J Respir Crit Care Med 154: S217-S222.

Von Mutius E, Pearce N, Beasley R, Cheng S, Von Ehrenstein O, Björkstén B, Weiland S, on behalf of the ISAAC Steering Committee (2000). International patterns of tuberculosis and the prevalence of symptoms of asthma, rhinitis and eczema. Thorax 55: 449-53.

Wessen AF, Hooper A, Huntsman J, et al (1992). Migration and health in a small society: The case of Tokelau. Oxford: Clarendon Press, pp 318-57.

Wilkinson RG (1992). Income distribution and life expectancy. Br Med J 304: 165-8.

Yen IH, Kaplan GA (1999). Neighbourhood social environment and risk of death: multilevel evidence from the Alameda County Study. Am J Epidemiol 149: 898-907.
Interpretation of longitudinal studies:

CHAPTER 5: Measurement of Exposure and
Health Status
(In: Pearce N. A Short Introduction to Epidemiology. Wellington: CPHR, 2003)

In the previous three chapters I outlined the general study design options that can be used in epidemiologic studies. These all involve measuring exposure and disease in a particular source population over a particular risk period. In this chapter I briefly review the various options for measuring exposure and disease status.

5.1: Exposure

As discussed in chapter 1, epidemiological studies involve a wide variety of exposures ranging from the population level to the individual and micro-levels. The term exposure is thus used generically to refer to any factor that is under study, and exposures may include population factors (e.g. income inequality), individual-level socio-economic factors (e.g. income), physical environmental factors (e.g. air pollution), aspects of individual lifestyle (e.g. diet), as well as exposures measured at the level of the body (e.g. total body burden of dioxin), organ (e.g. the concentration of asbestos in the lung), cell, or molecule (e.g. DNA adducts). These various situations are discussed here briefly; a more detailed discussion can be found in Armstrong et al (1992).

Exposure and Dose

Strictly speaking, the term exposure refers to the presence of a substance (e.g. fine particulate matter) in the external environment, whereas the term dose refers to the amount of substance that reaches susceptible targets within the body, such as the airways. In some situations (e.g. in a coal mine) measurements of external exposures may be strongly correlated with internal dose, whereas in other situations (e.g. environmental lead exposure) the dose may depend on individual lifestyle and activities and may therefore be only weakly correlated with the environmental exposure levels.

Exposure levels can be assessed with regard to the intensity of the substance in the environment (e.g. dust concentration in the air) and the duration of time for which exposure occurs. The risk of developing disease may be much greater if the duration of exposure is long and/or the exposure is intense, and the total cumulative exposure may therefore be important. For protracted etiologic processes, the time-pattern of exposure may be important and it is possible to assess this by examining the separate effects of exposures in various time windows prior to the occurrence and recognition of clinical disease (Pearce, 1992). For example, in cancer studies recent exposures may not be relevant since the cancer may have first become established some years previously (Pearce, 1988). Similarly, recent work suggests that occupational asthma is most likely to occur after about 1-3 years of exposure to a sensitising agent (Antó et al, 1996).

General Approaches to Exposure Assessment

Methods of exposure measurement include personal interviews or self-administered questionnaires (completed either by the study participant or by a proxy respondent), diaries, observation, routine records, physical or chemical measurements on the environment, or physical or chemical measurements on the person (Armstrong et al, 1992). For example, table 5.1 summarizes the types of exposure data most commonly used in occupational epidemiology studies (Checkoway et al, 1989). Measurements on the person can relate either to exogenous exposure (e.g. airborne dust) or internal dose (e.g. plasma cotinine); the other measurement options (e.g. questionnaires) all relate to exogenous exposures.

Demographic Factors

In most instances, information on demographic factors such as age, gender and ethnicity can be obtained in a straightforward manner from routine health care records or with questionnaires. In studies focusing on ethnicity, the etiologically relevant definition will depend on the extent to which an ethnic difference is considered to be due to genetic and/or cultural and environmental factors, but the available information will vary from country to country depending on historical and cultural considerations. For example, in New Zealand, a Māori person is defined as one who has Māori ancestry and chooses to identify as Māori (Pomare et al, 1992), whereas some other countries use solely biologically-based definitions (Polednak, 1989).

Socio-economic status poses more significant measurement problems. It can be measured in a variety of ways, including occupation, income, and education (Liberatos et al, 1988; Berkman and MacIntyre, 1997). These measures may pose problems in some demographic groups; for example, occupation and income may be poor measures of socio-economic status in women, for whom the total family situation may reflect their socio-economic status better than their individual situation, and measures of socio-economic status in children must be based on the situation of the parents or the total family situation. Nevertheless, the various measures of socio-economic status are strongly correlated with each other, and asthma epidemiology studies are usually based on whichever measures are available, unless socio-economic status is the main focus of the research and it is necessary to obtain more detailed information.

Questionnaires

Traditionally, exposure to most non-biological risk factors (e.g. tobacco smoking) has been measured with questionnaires, and this approach has a long history of successful use in epidemiology (Armstrong et al, 1992). Questionnaires may be self-administered (e.g. postal questionnaires) or interviewer-administered (e.g. in telephone or face-to-face interviews) and may be completed by the study subject or by a proxy (e.g. parental completion of questionnaires in a study of children, or completion by the spouse of deceased cases). The validity of questionnaire data also depends on the structure, format, content and wording of questionnaires, as well as methods of administration and selection and training of interviewers (Armstrong et al, 1992).

Table 5.1

Types of exposure data commonly used in occupational epidemiology studies
(Source: Adapted from Checkoway et al, 1989)

Ever employed in the industry
Duration of employment in the industry
Ordinally ranked jobs or tasks
Job-exposure matrices
Quantified personal measurements
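When duration of employment and quantified measurements are both available, they can be combined into the cumulative exposure and time-window measures discussed under Exposure and Dose. The following Python sketch is one simple way of doing so: the work history, the intensity values (taken here to be in mg/m3) and the function name are hypothetical, and exposure intensity is assumed to be constant within each job period.

```python
def cumulative_exposure(history, reference_year, window=None):
    """Sum of intensity x duration over a work history.

    `history` is a list of (start_year, end_year, intensity) job periods,
    with end_year exclusive. `window`, if given, is a pair
    (min_years_before, max_years_before) relative to `reference_year`;
    only exposure received inside that time window is counted.
    """
    total = 0.0
    for start, end, intensity in history:
        first, last = start, min(end, reference_year)
        if window is not None:
            nearest, farthest = window
            first = max(first, reference_year - farthest)
            last = min(last, reference_year - nearest)
        if last > first:            # the job period overlaps the window
            total += intensity * (last - first)
    return total


# Hypothetical work history: ten years at 2.0 mg/m3, then fifteen at 0.5
history = [(1960, 1970, 2.0), (1970, 1985, 0.5)]

lifetime = cumulative_exposure(history, reference_year=1990)
recent = cumulative_exposure(history, 1990, window=(0, 10))
earlier = cumulative_exposure(history, 1990, window=(10, 30))
print(f"Lifetime: {lifetime} mg/m3-years")
print(f"0-10 years before 1990: {recent} mg/m3-years")
print(f"10-30 years before 1990: {earlier} mg/m3-years")
```

Separating recent from earlier exposure in this way allows their effects to be examined separately, for example where recent exposure is thought to be irrelevant to a cancer that was established some years previously.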

Example 5.1

Raum et al (2001) studied the impact of maternal socio-economic status on intrauterine growth in the former West and East Germany. Information on socio-demographic or lifestyle factors and pregnancy outcome was available for 3,374 live-born singletons from West Germany (1987/88) and 3,070 from East Germany (1990/91). Women were recruited during pregnancy and given a self-administered 30-page questionnaire covering socio-demographic, psychosocial, nutritional, environmental and occupational factors. The two school systems were not identical, but in each system maternal educational level was grouped into five categories. Women with the lowest education had a significantly elevated risk of small-for-gestational-age (SGA) newborns compared to women with the highest education in both the West (OR = 2.58, 95% CI 1.17-5.67) and the East (OR = 2.77, 95% CI 1.54-5.00). The authors concluded that social inequalities existed and caused health inequalities in both the West and the former socialist country of East Germany.

Example 5.2

Vartia (2001) studied the consequences of workplace bullying in the
municipal sector in Helsinki, Finland. Every 35th member of the
Municipal Officials Union was selected and 1037 (65.5%) responded to a
postal questionnaire. A definition of bullying was provided and study
participants were asked if they felt themselves subjected to such
behaviour, or if they had observed someone else at their workplace being
bullied. They were also asked about the frequency and duration of such
acts. Both the targets of bullying and the observers reported more
general stress and mental stress reactions than did respondents from
workplaces with no bullying. The targets of bullying used sleep-inducing
drugs and sedatives more often than did the respondents who were not
bullied.

Environmental Measurements and Job-Exposure Matrices

In many studies, e.g. community-based case-control studies,
questionnaires are the only source of exposure information. However, in
some instances, particularly in occupational studies, questionnaires may
be combined with environmental exposure measurements (e.g. industrial
hygiene surveys) to obtain a quantitative estimate of individual
exposures. Table 5.2 shows environmental measurements in an asbestos
textile plant in South Carolina (Dement et al, 1983; Checkoway et al,
1989). It shows that for each job title, exposure levels decreased over
time, but increased again during the 1966-75 time period. Within each
time period, the highest exposures were in raw fiber handling and the
lowest were in general area workers. This historical exposure
information can be combined with information from employment records to
obtain exposure estimates for individual workers. For example, table 5.3
shows the cumulative exposure for a worker who worked as a card operator
during 1933-1938 and then worked in clean-up during 1939-1948.

Example 5.3

Saracci et al (1984a) conducted a historical cohort study of mortality
and cancer incidence of workers exposed to man-made vitreous fibres at
13 European plants. At 12 of the plants an environmental survey was
conducted to measure present concentrations of fibres in air samples.
This was used to create a job-exposure matrix. Within each plant,
job/plant areas were grouped into six main occupational categories: not
specified, office, preproduction, production, secondary processes and
maintenance. For each worker a cumulative exposure index was created by
multiplying the time spent in each job category by the mean
concentration of respirable fibres in the job category. The relative
risk of lung cancer was elevated, particularly in the group with 30
years or more since first employment (RR=1.92, 95% CI 1.17-3.07). There
was a tendency for the risk to increase with cumulative exposure, but
the pattern was not consistent.

Table 5.2

Asbestos concentrations (fibers/cc) in job categories in an asbestos textile plant
(Source: Adapted from Checkoway et al, 1989)

Job category          1930-35   1936-45   1946-65   1966-75
General area            10.8       5.3       2.4       4.3
Card operators          13.3       6.5       2.9       5.3
Clean-up                18.1       8.8       4.0       7.2
Raw fiber handling      22.8      11.0       5.0       9.0

Table 5.3

Example of an exposure history of an individual worker

Job             Years     Mean exposure   Cumulative exposure
Card operator   1933-35       10.8              32.4
Card operator   1936-38        6.5              41.9
Clean-up        1939-45        8.8             103.5
Clean-up        1946-48        4.0             115.5
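
The combination of a job-exposure matrix with an employment record can be sketched in a few lines of Python. This is only an illustrative sketch: the dictionary layout, the function name `cumulative_exposure` and the convention of counting each calendar year in full (both end years inclusive) are assumptions made here, not part of the original study. Because the concentrations are read directly from Table 5.2, the running total need not reproduce the figures printed in Table 5.3.

```python
# Job-exposure matrix taken from Table 5.2: mean asbestos concentration
# (fibers/cc) by job category and calendar period (first year, last year,
# both inclusive). Only the two jobs needed for the example are included.
JEM = {
    "Card operators": [(1930, 1935, 13.3), (1936, 1945, 6.5),
                       (1946, 1965, 2.9), (1966, 1975, 5.3)],
    "Clean-up": [(1930, 1935, 18.1), (1936, 1945, 8.8),
                 (1946, 1965, 4.0), (1966, 1975, 7.2)],
}


def cumulative_exposure(work_history, jem):
    """Sum (years worked x mean concentration) over a work history.

    work_history: list of (job, first_year, last_year) tuples.
    Assumption: every calendar year from first_year to last_year counts
    as one full year of exposure at that period's mean concentration.
    """
    total = 0.0
    for job, start, end in work_history:
        for year in range(start, end + 1):
            for first, last, concentration in jem[job]:
                if first <= year <= last:
                    total += concentration
                    break
            else:
                raise ValueError(f"No measurement for {job} in {year}")
    return total


# Hypothetical worker matching the description in the text:
# card operator 1933-1938, then clean-up 1939-1948.
history = [("Card operators", 1933, 1938), ("Clean-up", 1939, 1948)]
total = cumulative_exposure(history, JEM)
print(f"Cumulative exposure: {total:.1f} fiber-years/cc")
```

Looping over single years keeps the handling of period boundaries explicit; a study with exact start and end dates would weight partial years instead.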

Quantified Personal Measurements

In some instances, quantified personal exposure measurements may be
available, e.g. in radiation workers wearing radiation dosimeters
(Checkoway et al, 1989). This information is invaluable when it is
available, but it is rarely available for historical exposures with the
exception of some industries such as the nuclear power industry. Such
information can of course be collected prospectively. This is rarely
practical for cohort studies of rare diseases with long latency periods
(e.g. cancer), but is more appropriate for cohort studies of relatively
common conditions. For example, infant cohort studies of respiratory
disease frequently prospectively collect information on individual
levels of allergen exposure (e.g. Lau et al, 2001).

Quantified personal exposure measurements can also be used in
case-control studies to estimate historical exposures. However, a
potential problem in this situation is that exposure may have changed
over time, or study participants may change their behaviour as a result
of having been diagnosed with disease. This has been a particular issue
in case-control studies of electromagnetic field exposure and childhood
leukemia where it has been argued that current personal exposure
measurements may be inferior to wire code information (i.e. whether the
wiring to the house is underground, or by overhead wires, etc) in
estimating historical exposures (Neutra and del Pizzo, 1996).

Example 5.4

Wing et al (1991) conducted a historical cohort mortality study among
workers at Oak Ridge National Laboratory, Tennessee. Individual
exposures to external penetrating radiation, primarily gamma rays, were
measured using pocket ionising chambers from 1943 until June 1944, film
badges from then until 1975, and thermoluminescent dosimeters since
1975. This information was used to estimate individual exposures over
time. After accounting for age, birth cohort, socio-economic status, and
active worker status, external radiation with a 20-year exposure lag
(i.e. exposures were only considered up until 20 years previously) was
associated with an increased risk of death (2.68% increase per 10 mSv
cumulative exposure), particularly from cancer (4.94% increase per 10
mSv).

Biomarkers

More recently, there has been increasing emphasis on the use of
molecular markers of internal dose (Schulte, 1993). However, there are a
number of major limitations of currently available biomarkers of
exposure (Armstrong et al, 1992), particularly with regard to historical
exposures (Pearce et al, 1995). For example, serum levels of
micronutrients reflect recent rather than historical dietary intake
(Willett, 1990). Some biomarkers are better than others in this respect
(particularly markers of exposure to biological agents), but even the
best markers of chemical exposures usually reflect only the last few
weeks or months of exposure. On the other hand, with some biomarkers it
may be possible to estimate historical levels provided that certain
assumptions are met. For example, it may be possible to estimate
historical levels of exposure to pesticides (or contaminants) from
current serum levels provided that the exposure period is known, and the
half-life is known. Similarly, information on recent exposures can be
used if it is reasonable to assume that exposure levels (or at least
relative exposure levels) have remained stable over time (this may be
particularly relevant in occupational studies), and have not been
affected by lifestyle changes, or by the occurrence of the disease.
However, if the aim is to measure historical exposures, then historical
information on exposure surrogates may be more valid than direct
measurements of current exposure or dose levels. This situation has long
been recognised in occupational epidemiology, where the use of work
history records in combination with a job-exposure matrix (based on
historical exposure measurements of work areas rather than individuals)
is usually considered to be more valid than current exposure
measurements (whether based on environmental measurements or biomarkers)
if the aim is to estimate historical exposure levels (Checkoway et al,
1989). On the other hand, some biomarkers have potential value in
validation of questionnaires which can then be used to estimate
historical exposures. Furthermore, biomarkers of internal dose may have
relatively good validity in studies involving an acute effect of
exposure.

A more fundamental problem of measuring internal dose with a biomarker
is that it is not always clear whether one is measuring the exposure,
the biological effect, or some stage of the disease process itself
(Saracci, 1984b). Thus the findings may be uninterpretable in terms of
the causal association between exposure and disease. When it is known
that the biologically effective dose is the most appropriate measure,
then the use of appropriate biomarkers clearly has some scientific
advantages. However, choosing the appropriate biomarker is a major
dilemma, and biomarkers are frequently chosen on the basis of an
incomplete or erroneous understanding of the etiologic process (or
simply because a particular marker can be measured). An environmental
exposure (e.g. tobacco smoke) may involve hundreds of different
chemicals, each of which may produce hundreds of measurable biological
responses (there are exceptions to this, of course, such as
environmental lead exposure, but most environmental exposure involves
complex mixtures). A biomarker typically measures one of the biological
responses to one of the chemicals. If the chosen biomarker measures the
key etiological factor, then it may yield relatively good exposure data;
however, if a biomarker is chosen which has little relationship to the
etiological component of the complex exposure mixture then the biomarker
will yield relatively poor exposure data.

A further major problem with the use of biomarkers is that the resulting
expense and complexity may drastically reduce the study size, even in a
case-control study, and therefore greatly reduce the statistical power
for detecting an association between exposure and disease.
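
The idea of estimating a historical level from a current serum level when the half-life is known can be illustrated with a short Python sketch. It rests on assumptions that the text does not state and that may not hold for a given agent: first-order (exponential) elimination, a constant half-life, and no further exposure after the period of interest. The function name and the numbers are hypothetical.

```python
def back_extrapolate(current_level, years_elapsed, half_life_years):
    """Estimate a past serum level from a current measurement.

    Assumes simple exponential decay with a constant half-life and no
    exposure during the elapsed interval (both are simplifying
    assumptions made for this illustration).
    """
    if half_life_years <= 0:
        raise ValueError("half_life_years must be positive")
    if years_elapsed < 0:
        raise ValueError("years_elapsed must not be negative")
    # Each half-life that has passed halves the level, so going back in
    # time doubles it once per half-life.
    return current_level * 2 ** (years_elapsed / half_life_years)


# Hypothetical example: 5.0 units measured today, exposure ended
# 14 years ago, assumed half-life of 7 years (two half-lives).
past_level = back_extrapolate(5.0, 14, 7)
print(past_level)  # 20.0
```

The estimate is highly sensitive to the assumed half-life, which is one reason such back-calculations are only as good as the assumptions behind them.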

Example 5.5

Ross et al (1992) studied urinary aflatoxin biomarkers and risk of
hepatocellular carcinoma as part of an ongoing prospective study of
18,244 middle-aged men in Shanghai. After 35,299 person-years of
follow-up, a nested case-control study was conducted based on the 22
identified cases of liver cancer, and 140 density-matched controls
(matched for age and neighbourhood of residence). The cases of liver
cancer were more likely than controls to have detectable concentrations
of aflatoxin metabolites (OR = 2.4, 95% CI 1.0-5.9).

Thus, questionnaires and environmental measurements will continue to
play a major role in exposure assessment in epidemiology, but biomarkers
may be expected to become increasingly useful over time, as new
techniques are developed. The emphasis should be on using appropriate
technology to obtain the most practical and valid estimate of the
etiologically relevant exposure. The appropriate approach
(questionnaires, environmental measurements or biological measurements)
will vary from study to study, and from exposure to exposure within the
same study, or within the same complex chemical mixture (e.g. in tobacco
smoke).

5.2: Health Status

The measurement of health status will only be considered briefly here.
The most important consideration is that the type of information
required for epidemiological studies may be different from that which is
required in clinical practice. As with exposure data, the key issue is
that information should be of similar quality for the various groups
being compared. For example, suppose that the bladder cancer incidence
in a particular geographical area is being compared with national
incidence rates; then it would be inappropriate to conduct a
pathological review and reclassification of the cases of the cancer
identified in the area, since such a reclassification had not been made
for the national data and the information would not be comparable.
Rather, the cancer cases in the area should be classified exactly as
they had been classified in routine national cancer statistics. Thus,
the emphasis should be on the comparability of information across the
various groups being compared.

The types of health outcome data used in epidemiological studies
include: mortality; disease registers; health service records; and
morbidity surveys. These can be grouped into data based on routinely
collected records, and morbidity data that are collected for a specific
epidemiologic study.

Routine Records

Most countries maintain comprehensive death registration systems at the
national or regional levels, and cause of death information for
identified deaths can be obtained by requesting copies of death
certificates from national, state, or municipal vital statistics
offices. In most instances the causes of death are coded by a nosologist
trained in the rules specified in the International Classification of
Diseases (ICD) volumes compiled by the World Health Organisation.
Revisions to the ICD coding are made about every ten years, and in some
instances the ICD code for a particular cause of death may change
(Checkoway et al, 1989).

Some countries or states also maintain incidence registers for
conditions such as cancer, congenital malformations or epilepsy. These
have most commonly been established for cancer registration and the
International Agency for Research on Cancer (IARC) has been attempting
to encourage the establishment of cancer registries and to standardise
methods of cancer registration throughout the world (Jensen et al,
1991). Provided that registration is relatively complete, then cancer
registrations can provide valuable additional health status information
(and increase the number of identified cases) in a cohort study.
Furthermore, cancer registries are invaluable for identifying newly
diagnosed cases who can be interviewed (while they are still alive) for
population-based case-control studies.

Many Western countries have notification systems for occupational
diseases. For example, in the United Kingdom the Surveillance of Work
Related and Occupational Respiratory Disease (SWORD) project was
established in 1989 as a national surveillance scheme for occupational
respiratory disease (Meredith et al, 1991).

As discussed in chapter 2, other routinely collected records can be used
for determining health status in cohort studies, or to create informal
registers for identifying cases for case-control studies; these include
hospital admission records, health insurance claims, health maintenance
organisation (HMO) records, and family doctor (general practitioner)
records.

Example 5.6

Jones et al (1998) performed a record linkage study of pre-natal and
early life risk factors for childhood onset diabetes mellitus. They
identified 160 boys and 155 girls born during 1965-1986 who had been
admitted to hospital in Oxfordshire, England with a diagnosis of
diabetes during 1965-1987. For each case, up to eight controls were
chosen from records for live births in the same area, matched on sex,
year of birth and hospital or place of birth. They then linked the
hospital record for each child to all of that child's hospital records
and to his or her mother's maternity record. There were no significant
associations between subsequent diabetes and birthweight, gestational
age, birthweight for gestational age, maternal age and parity. There
were non-significantly increased risks with not breastfeeding (OR=1.33,
95% CI 0.76-2.34) and with diabetes recorded in the mother during
pregnancy (OR=5.87, 95% CI 0.90-38.3), and a significantly raised risk
with pre-eclampsia or eclampsia during pregnancy (OR=1.48, 95% CI
1.05-2.10). They hypothesized that pre-eclampsia may be the result of an
immunogenetic incompatibility between mother and fetus, and that this
early immunological disturbance may be related to the incidence of
diabetes later in life.

Morbidity Surveys

In some circumstances, routine records may not be available for the
health outcome under study, or may not be sufficiently complete or
accurate for use in epidemiological studies. Although this could in
theory apply to mortality records, more commonly this is an issue for
non-fatal conditions, particularly chronic diseases such as respiratory
disease and diabetes. In such cases a morbidity survey may be needed.
Such morbidity surveys may involve clinical examinations (e.g. a
clinical history and peak flow measurements for asthma), more invasive
testing (e.g. blood tests for diabetes), questionnaires, or a
combination of these methods.

To take the example of asthma, the essential feature of the condition
(at least in clinical and epidemiological terms) is variable airflow
obstruction which can be reversed by treatment or is self-limiting
(Pearce et al, 1998). This poses several problems with the use of
"diagnosed asthma" in asthma prevalence studies, since the diagnosis of
"variable airflow obstruction" usually requires several medical
consultations over an extended period. It is therefore not surprising
that several studies have found the prevalence of physician-diagnosed
asthma to be substantially lower than the prevalence of asthma symptoms.
Such problems of differences in diagnostic practice could be minimised
by using a standardised protocol for asthma diagnosis in prevalence
studies. However, this is rarely a realistic option since it requires
repeated contacts between the study participants and physicians, and
this is not possible or affordable in large-scale epidemiological
studies. Thus, most epidemiological studies must, by necessity, focus on
factors which are related to, or symptomatic of, asthma but which can be
readily assessed on a particular day. The main options in this regard
are symptoms and physiological measurements (Pearce et al, 1998). In
particular, standardised symptoms questionnaires have been developed for
use in adults (Burney et al, 1994) and children (Asher et al, 1995).

Example 5.7

Dowse et al (1990) studied the prevalence of non-insulin dependent
diabetes mellitus (NIDDM) in adults aged 25-74 years in Mauritius. A
random sample of 5,892 individuals was chosen and 5,080 (83.4%)
participated. They used a 75g oral glucose tolerance test with fasting
and 2-h post load blood collection. Glucose tolerance was classified
according to the World Health Organisation (WHO) criteria (World Health
Organisation, 1985). The prevalence of NIDDM was similar in men (12.1%)
and women (11.7%). Age and sex-standardised prevalence was similar in
Hindu Indians (12.4%), Muslim Indians (13.3%), Creoles (10.4%) and
Chinese (11.9%). The authors commented that the findings in Indians were
similar to those in other studies of Indian migrant communities, but the
findings in Creoles and Chinese were unexpected. Potent environmental
factors shared between ethnic groups in Mauritius may be responsible for
the epidemic of glucose intolerance.

Health status can also be measured by more general morbidity and quality
of life questionnaires. Perhaps the most widely used questionnaire has
been the Medical Outcomes Study Short Form (SF-36) (Ware, 1993). This
includes scales to measure physical functioning, role functioning,
bodily pain, mental health, and general health perceptions. The SF-36
scales have been widely used in clinical research in a wide variety of
populations to assess overall health status.

Summary

Methods of exposure measurement include personal interviews or
self-administered questionnaires (completed either by the study
participant or by a proxy respondent), diaries, observation, routine
records, physical or chemical measurements on the environment, or
physical or chemical measurements on the person. Measurements on the
person can relate either to exogenous exposure (e.g. airborne dust) or
internal dose (e.g. plasma cotinine); the other measurement options
(e.g. questionnaires) all relate to exogenous exposures. Traditionally,
exposure to most non-biological risk factors (e.g. cigarette smoking)
has been measured with questionnaires (either self-administered or
interviewer-administered), and this approach has a long history of
successful use in epidemiology. Questionnaires may be combined with
environmental exposure measurements (e.g. pollen counts, industrial
hygiene surveys) to obtain a quantitative estimate of individual
exposures. More recently, there has been increasing emphasis on the use
of molecular markers of internal dose (Schulte, 1993). However,
questionnaires and environmental measurements have good validity and
reproducibility with regard to current exposures and are likely to be
superior to biological markers with respect to historical exposures. The
emphasis should be on using appropriate technology to obtain the most
practical and valid estimate of the etiologically relevant exposure.

Similar considerations apply to the collection of information on health
status. Once again, it is important that the information obtained should
be of comparable quality in the exposed and non-exposed populations.
With this proviso, the specific methods used will differ according to
the hypothesis and population under study, but the main options include
use of routine records (mortality, incidence, hospital admission, health
insurance, general practitioner, etc) and the mounting of a special
morbidity survey (using clinical examinations, biological testing or
questionnaires).

References

Antó JM, Sunyer J, Newman-Taylor AJ (1996). Comparison of soybean
epidemic asthma and occupational asthma. Thorax 51: 743-9.

Armstrong BK, White E, Saracci R (1992). Principles of exposure
measurement in epidemiology. New York: Oxford University Press.

Asher I, Keil U, Anderson HR, et al (1995). International Study of
Asthma and Allergies in Childhood (ISAAC): rationale and methods. Eur
Resp J 8: 483-91.

Berkman LF, MacIntyre S (1997). The measurement of social class in
health studies: old measures and new formulations. In: Kogevinas M,
Pearce N, Susser M, Boffetta P (eds). Socioeconomic factors and cancer.
Lyon: IARC, pp 51-64.

Burney PGJ, Luczynska C, Chinn S, Jarvis D (1994). The European
Community Respiratory Health Survey. Eur Resp J 7: 954-60.

Checkoway HA, Pearce NE, Crawford-Brown DJ (1989). Research methods in
occupational epidemiology. New York: Oxford University Press.

Dement JM, Harris RL, Symons MJ, Shy CM (1983). Exposures and mortality
among chrysotile asbestos workers. Part II: Mortality. American Journal
of Industrial Medicine 4: 421-433.

Dowse GK, Gareeboo H, Zimmet PZ, et al (1990). High prevalence of NIDDM
and impaired glucose tolerance in Indian, Creole, and Chinese
Mauritians. Diabetes 39: 390-6.

Jensen OM, Parkin DM, MacLennan R, et al (1991). Cancer registration:
principles and methods. Lyon: IARC.

Jones ME, Swerdlow AJ, Gill LE, Goldacre MJ (1998). Pre-natal and early
life risk factors for childhood onset diabetes mellitus: a record
linkage study. Int J Epidemiol 27: 444-9.

Lau S, Illi S, Sommerfeld C, et al (2001). Early exposure to house-dust
mite and cat allergens and development of childhood asthma: a cohort
study. Multicentre Allergy Study Group. Lancet 356: 1392-7.

Liberatos P, Link BG, Kelsey JL (1988). The measurement of social class
in epidemiology. Epidemiologic Reviews 10: 87-121.

Meredith SK, Taylor VM, McDonald JC (1991). Occupational respiratory
disease in the UK 1989: a report by the SWORD project group. Br J Ind
Med 48: 292-8.

Neutra RR, del Pizzo V (1996). When wire codes predict cancer better
than spot measurements of magnetic fields. Epidemiol 7: 217-8.

Pearce N (1988). Multistage modeling of lung cancer mortality in
asbestos textile workers. Int J Epidemiol 17: 747-52.

Pearce N (1992). Methodological problems of time-related variables in
occupational cohort studies. Rev Epidém et Santé Publ 40: S43-S54.

Pearce N, Beasley R, Burgess C, Crane J (1998). Asthma epidemiology:
principles and methods. New York: Oxford University Press.

Pearce N, Sanjose S, Boffetta P, et al (1995). Limitations of biomarkers
of exposure in cancer epidemiology. Epidemiol 6: 190-4.

Polednak AP (1989). Racial and ethnic differences in disease. New York:
Oxford University Press.

Pomare E, Tutengaehe H, Ramsden I, et al (1992). Asthma in Maori people.
NZ Med J 105: 469-70.

Raum E, Arabin B, Schlaud M, et al (2001). The impact of maternal
education on intrauterine growth: a comparison of former West and East
Germany. Int J Epidemiol 30: 81-7.

Ross RK, Yuan J-M, Yu MC, et al (1992). Urinary aflatoxin biomarkers and
risk of hepatocellular carcinoma. Lancet 339: 943-6.

Saracci R, Simonato L, Acheson ED, et al (1984a). Mortality and
incidence of cancer of workers in the man made vitreous fibres producing
industry: an international investigation at 13 European plants. Br J Ind
Med 41: 425-36.

Saracci R (1984b). Assessing exposure of individuals in the
identification of disease determinants. In: Berlin A, Draper M, Hemminki
K, Vainio H (eds). Monitoring human exposure to carcinogenic and
mutagenic agents. Lyon: IARC.

Schulte PA (1993). A conceptual and historical framework for molecular
epidemiology. In: Schulte P, Perera FP. Molecular epidemiology:
principles and practices. New York: Academic Press, pp 3-44.

Vartia M (2001). Consequences of workplace bullying with respect to the
well-being of its targets and the observers of bullying. Scand J Work
Environ Health 27: 63-9.

Ware JE (1993). SF-36 Health Survey, Manual and Interpretation Guide.
Boston: The Health Institute.

World Health Organisation (WHO) (1985). WHO Study Group: Diabetes
mellitus. Technical Report Series no 727. Geneva: World Health
Organisation.

Willett W (1990). Nutritional epidemiology. New York: Oxford University
Press.

Wing S, Shy CM, Wood JL, et al (1991). Mortality among workers at Oak
Ridge National Laboratory: evidence of radiation effects in follow-up
through 1984. JAMA 265: 1397-1402.

Part II

Study Design Issues

CHAPTER 6: Precision
(In: Pearce N. A Short Introduction to Epidemiology. Wellington: CPHR, 2003)

Random error will occur in any epidemiologic study, just as it occurs in
experimental studies. It is often referred to as chance, although it can
perhaps more reasonably be regarded as "ignorance" (although it is not
the only thing that we may be ignorant about as our study may be biased
by unknown confounders, measurement error, etc). For example, if we toss
a coin 50 times, then ideally we might be able to predict the outcome of
each toss based on the speed, spin, and trajectory of the coin. In
practice, we do not have all of the necessary information (because of
ignorance), or the computing power to use it (because of chaotic
behaviour), and we therefore regard the outcome of each toss as a chance
phenomenon. However, we may note that, on the average, 50% of the tosses
are heads and therefore we may say that a particular toss has a 50%
chance of producing a head.

Similarly, suppose that 50 lung cancer deaths occurred among 10,000
people aged 35-39 exposed to a particular factor during one year. Then,
if each person had exactly the same cumulative exposure, we might expect
two subgroups of 5,000 people each to experience 25 deaths during the
one-year period. However, just as 50 tosses of a coin will not usually
produce exactly 25 heads and 25 tails, neither will there be exactly 25
deaths in each group. This occurs because of differences in exposure to
other risk factors for lung cancer, and differences in individual
susceptibility between the two groups. Ideally, we should attempt to
gather information on all known risk factors (potential confounders),
and to adjust for these in the analysis (see chapter 9). However, there
will always be other unknown or unmeasurable risk factors operating, and
hence the disease rates in particular subgroups will fluctuate about the
average. This will occur even if each subgroup has exactly the same
exposure history.

Even in an experimental study, in which participants are randomised into
"exposed" and "non-exposed" groups, there will be "random" differences
in background risk between the compared groups, but these will diminish
in importance (i.e. the random differences will tend to even out) as the
study size grows. In epidemiological studies, because of the lack of
randomisation, there is no guarantee that differences in baseline
(background) risk will "even out" between the exposure groups as the
study size grows.

The basic principles of analysis of epidemiologic data are discussed in
chapter 9. However, at this stage it is important to discuss some basic
statistical principles and methods since they are relevant to the
calculation of the appropriate study size.
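
The remark that 50 tosses of a coin will not usually produce exactly 25 heads can be checked with a short calculation. The sketch below treats each toss as an independent event with probability 0.5, which is the usual binomial model and an assumption of this illustration; the function name `binomial_pmf` is simply a label chosen here.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials,
    each with success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Probability of exactly 25 heads in 50 tosses of a fair coin.
p_exact = binomial_pmf(25, 50, 0.5)

# Probability that the number of heads falls between 20 and 30 inclusive.
p_near = sum(binomial_pmf(k, 50, 0.5) for k in range(20, 31))

print(f"P(exactly 25 heads)  = {p_exact:.3f}")  # about 0.11
print(f"P(20 to 30 heads)    = {p_near:.3f}")
```

Under this model an exact 25/25 split is the single most likely outcome, yet it occurs in only about one series of 50 tosses in nine. The same reasoning applies if 50 deaths are thought of as falling at random into two equal subgroups.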

6.1: Basic Statistics

Basic Concepts normally distributed, the means of the


samples will be approximately normally
Data can be summarized in various distributed provided that the samples are
forms, including frequency tables, sufficiently large (how "large" depends
histograms, bar charts, cross-tabulations on how non-normally distributed the
and pie charts. However, it is usually population is). The standard deviation of
also useful to give a summary measure the sample means is termed the
of central tendency. The mean (or standard error of the mean. Since the
average) is the most commonly used means are approximately normally
measure of central tendency, because of distributed, about 95% of sample means
its convenient statistical properties. The will lie within 1.96 standard errors of the
next step is data smoothing which overall population mean. Usually, a
involves the combination of the data with study only involves one sample, but the
a statistical model. In the simplest case, standard error can be estimated by
this involves assuming a particular statistical distribution in order to obtain a summary measure of variability of the data. The most common measure of variability is the standard deviation (Armitage et al, 2002). The standard deviation is especially useful when the underlying data distribution is approximately normal (i.e. symmetric with a special type of bell-shape). If the data are not normally distributed, then they can often be made approximately normally distributed by an appropriate transformation (e.g. a log transformation), but these transformations may distort the scientific meaning of the findings, and make them difficult to interpret.

Usually it is not possible to study the entire population in which one is interested (theoretically, this is almost always infinite since we usually wish to generalise our findings not only to the population we are studying, but also to other populations). It is therefore necessary to consider a random sample and to relate its characteristics to the total population. If repeated samples are taken from the same population, then the mean will vary between samples. Even if the underlying population is not

dividing the standard deviation of the sample by the square root of the number of people in the sample.

Most epidemiological studies involve categorical rather than continuous outcome data. For example, in a particular area one might estimate the proportion of births involving congenital malformations over a particular time period (this is actually the prevalence at birth - it is very difficult to calculate the incidence of congenital malformations because this requires information on abortions and stillbirths as well as live births). This involves the calculation of a proportion (p). Under the binomial distribution, if the sample is sufficiently large, the sampling distribution will approximate to the normal distribution with mean (p) and standard deviation:

$$ s = \left( p(1-p)/n \right)^{0.5} $$

where the 0.5 indicates the square root of the expression in parentheses. Thus, one can calculate the proportion with malformations (i.e. the mean score for a population in which a malformation scores 1 and a completely healthy baby scores 0), and the standard deviation of this proportion (i.e. the standard error of
the mean score), and if the sample is sufficiently large one can analyze these estimates based on the normal distribution.

Testing and Estimation

Usually, in epidemiologic studies, we wish to measure the difference in disease occurrence between groups exposed and not exposed to a particular factor. For example, if we have estimated the proportion of pregnancies involving congenital malformations in an area with high nitrate levels in drinking water, then we would wish to compare this to the corresponding proportion in an area with low nitrate levels (or with the proportion in all births nationally). In doing so, we not only wish to estimate the size of the observed association, but also whether an association as large as this is likely to have arisen by chance, if in fact there is no causal association between exposure and disease. The p-value is the probability that differences as large as or larger than those observed could have arisen by chance if the null hypothesis (of no association between exposure and disease) is correct. In the past, it has been common to test the statistical significance of the study findings by seeing whether the p-value is less than an arbitrary value (e.g. p<0.05). The limitations of statistical significance testing are discussed in chapter 9. However, even if we do not intend to use p-values when reporting the findings of a study, the statistical principles involved are nevertheless relevant to determining the appropriate study size.
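
As a rough illustration of the calculations described above, the following Python sketch computes the standard error of a proportion and a two-sided p-value for the difference between two proportions under the normal approximation. The counts (45 malformations in 1,000 births in a high-nitrate area, 30 in 1,000 in a low-nitrate area) are hypothetical, and the pooled-variance test is one common choice rather than a method specified in the text.

```python
import math


def proportion_se(p, n):
    """Standard error of a proportion: (p(1-p)/n)^0.5."""
    return math.sqrt(p * (1.0 - p) / n)


def normal_cdf(z):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def two_proportion_p_value(x1, n1, x0, n0):
    """Two-sided p-value for the null hypothesis of equal proportions.

    Uses the pooled normal approximation (an assumption of this sketch).
    """
    p1, p0 = x1 / n1, x0 / n0
    pooled = (x1 + x0) / (n1 + n0)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n0))
    z = (p1 - p0) / se
    return 2.0 * (1.0 - normal_cdf(abs(z)))


# Hypothetical counts, for illustration only.
se_high = proportion_se(45 / 1000, 1000)
p_value = two_proportion_p_value(45, 1000, 30, 1000)
print(f"SE of proportion in high-nitrate area: {se_high:.4f}")
print(f"Two-sided p-value for the difference: {p_value:.3f}")
```

With these invented counts the difference would not reach the conventional p<0.05 threshold, even though the observed proportion is 50% higher in the high-nitrate area, which is why study size matters in the next section.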

6.2: Study Size and Power

The most effective means of reducing random error is by increasing the study size, so that the precision of the measure of association (the effect estimate) will be increased, i.e. the confidence intervals will be narrower. Random error thus differs from systematic error (see chapter 7) which cannot be reduced simply by increasing the study size.

A second factor that can affect precision, given a fixed total study size, is the relative size of the reference group (the unexposed group in a cohort study, or the controls in a case-control study). When exposure is not associated with disease (i.e. the true relative risk is 1.0), and the costs (of recruitment, data collection, etc) of index and reference subjects are the same, then a 1:1 ratio is most efficient for a given total study size. When exposure increases the risk of the outcome, or referents are cheaper to include in the study than index subjects, then a larger ratio may be more efficient. The optimal reference:index ratio is rarely greater than 2:1 for a simple unstratified analysis (Walter, 1977) with equal index and referent costs, but a larger average ratio may be desirable in order to assure an adequate ratio in each stratum for stratified analyses.

The ideal study would be infinitely large, but practical considerations set limits on the number of participants that can be included. Given these limits, it is desirable to find out, before commencing the study, whether it is large enough to be informative. One method is to calculate the "power" of the study. This depends on five factors:

• the cutoff value (i.e. alpha level) below which the p-value from the study would be considered statistically significant. This value is usually set at 0.05 or 5%;
• the disease rate in the non-exposed group in a cohort study or the exposure prevalence of the controls in a case-control study;
• the expected relative risk (i.e. the specified value of the relative risk under the alternative (non-null) hypothesis);
• the ratio of the sizes of the two groups being studied;
• the total number of study participants.

Once these quantities have been determined, standard formulas are then available to calculate the statistical power of a proposed study (Walter, 1977; Schlesselman, 1982). The standard normal deviate corresponding to the power of the study (derived from Rothman and Boice, 1982) is then:

$$ Z_\beta = \frac{N_0^{0.5}\,|P_1 - P_0|\,B^{0.5} - Z_\alpha B}{K^{0.5}} $$

where:
$Z_\beta$ = standard normal deviate corresponding to a given statistical power
$Z_\alpha$ = standard normal deviate corresponding to an alpha level (the largest p-value that would be considered "statistically significant")
$N_0$ = number of persons in the reference group (i.e. the non-exposed group in a cohort study, or the controls in a case-control study)
$P_1$ = outcome proportion in study group
$P_0$ = outcome proportion in the reference group
$A$ = allocation ratio of referent to study group (i.e., the relative size of the two groups)
$B = (1-P_0)(P_1 + (A-1)P_0) + P_0(1-P_1)$
$C = (1-P_0)(AP_1 - (A-1)P_0) + AP_0(1-P_1)$
$K = BC - A(P_1-P_0)^2$
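
The formula above can be sketched in Python as follows. This is a minimal illustration, not a replacement for the published programmes: the function names are invented here, and the numerator is read as a difference between the two terms, an interpretation that reproduces the worked value of 0.994 in Example 6.1. The inputs used are those of Example 6.1.

```python
import math


def normal_cdf(z):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def power_deviate(n0, p1, p0, a=1.0, z_alpha=1.96):
    """Standard normal deviate corresponding to the power of the study.

    n0: number of persons in the reference group
    p1, p0: outcome proportions in the study and reference groups
    a: allocation ratio of referent to study group
    z_alpha: deviate for the chosen alpha level (1.96 for two-tailed 0.05)
    """
    b = (1 - p0) * (p1 + (a - 1) * p0) + p0 * (1 - p1)
    c = (1 - p0) * (a * p1 - (a - 1) * p0) + a * p0 * (1 - p1)
    k = b * c - a * (p1 - p0) ** 2
    numerator = math.sqrt(n0) * abs(p1 - p0) * math.sqrt(b) - z_alpha * b
    return numerator / math.sqrt(k)


# Inputs from Example 6.1: 5,000 per group, 50 vs 25 expected cases.
z_beta = power_deviate(n0=5000, p1=0.010, p0=0.005, a=1.0)
power = normal_cdf(z_beta)
print(f"Z_beta = {z_beta:.3f}, power = {power:.2f}")
```

The deviate comes out at about 0.994, and converting it with the normal distribution function gives a power in the low-to-mid 80% range, in line with the figure quoted in the example.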

Standard calculator and microcomputer programmes incorporating procedures for power calculations are widely available. In particular, EPI-INFO (Dean et al, 1990) can be downloaded for free from http://www.cdc.gov/epiinfo/, and Rothman's Episheet programme (Rothman, 2002) can be downloaded for free from http://www.oup-usa.org/epi/rothman/

Example 6.1

Consider a proposed study of 5,000 exposed persons and 5,000 non-exposed persons. Suppose that on the basis of mortality rates in a comparable group of workers, the expected number of cases of the disease of interest is 25 in the non-exposed group. However, we expect that exposure will double the risk of disease, so the number of cases observed will be 50 in the exposed group.

Then:
$Z_\alpha$ = 1.96 (if a two-tailed significance test, for an alpha-level of 0.05, is to be used)
$N_0$ = 5,000
$P_1$ = 0.010 (= 50/5000)
$P_0$ = 0.005 (= 25/5000)
$A$ = 1

Using the equation above, the standard normal deviate corresponding to the power of the study to detect a statistically significant lung cancer excess in the exposed group is:

$$ Z_\beta = \frac{5000^{0.5}\,(0.010 - 0.005)\,(0.0149)^{0.5} - 1.96 \times 0.0149}{0.000197^{0.5}} = 0.994 $$

From tables for the (one-sided) standard normal distribution, it can be seen that this corresponds to a power of 83%. This means that if 100 similar studies of this size were performed, then we would expect 83 of them to show a statistically significant (p<0.05) excess of cases in the exposed group.

An alternative approach is to carry out a standard analysis of the hypothesized results. If we make the assumptions given above, then the relative risk would be 2.0, with a 90% confidence interval of 1.4-3.0. This approach only has an indirect relationship to the power calculations. For example, if the lower 95% confidence limit is 1.0, then the power for a two-tailed test (of p<0.05) would be only 50%. This simulated confidence interval gives the additional information that the observed relative risk could be as large as 3.0 or as low as 1.4 if the observed relative risk is 2.0.
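
The "alternative approach" in Example 6.1 can be sketched by analysing the hypothesized counts directly. The sketch below assumes a Wald-type interval on the log scale, which is a common textbook method; the example does not say which interval method was used, so the limits (especially the lower one) may differ slightly from those quoted.

```python
import math


def risk_ratio_ci(cases1, n1, cases0, n0, z=1.645):
    """Risk ratio with a Wald confidence interval on the log scale.

    z = 1.645 gives a 90% interval; z = 1.96 gives a 95% interval.
    The choice of the Wald method is an assumption of this sketch.
    """
    rr = (cases1 / n1) / (cases0 / n0)
    se_log = math.sqrt(1 / cases1 - 1 / n1 + 1 / cases0 - 1 / n0)
    lower = rr * math.exp(-z * se_log)
    upper = rr * math.exp(z * se_log)
    return rr, lower, upper


# Hypothesized results from Example 6.1: 50/5,000 exposed vs 25/5,000 non-exposed.
rr, lower, upper = risk_ratio_ci(50, 5000, 25, 5000)
print(f"RR = {rr:.1f}, 90% CI {lower:.1f}-{upper:.1f}")
```

Changing `z` to 1.96 widens the interval, which shows how the expected precision depends on the confidence level chosen as well as on the study size.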
Related approaches are to estimate the minimum sample sizes required to detect an association (e.g., relative risk) of specified magnitudes (Beaumont and Breslow, 1981), and to estimate the minimum detectable association for a given alpha level, power and study size (Armstrong, 1987).

Occasionally, the outcome is measured as a continuous rather than a dichotomous variable (e.g. blood pressure). In this situation the standard normal deviate corresponding to the study power is:

$$ Z_\beta = \frac{N_0^{0.5}\,(\mu_1 - \mu_0)}{s\,(A + 1)^{0.5}} - Z_\alpha $$

where:
$\mu_1$ = mean outcome measure in exposed group
$\mu_0$ = mean outcome measure in reference group
$s$ = estimated standard deviation of outcome measure

The power is not the probability that the study will estimate the size of the association correctly. Rather, it is the probability that the study will yield a "statistically significant" finding when an association of the postulated size exists. The observed association could be greater or less than expected, but still be "statistically significant". The overemphasis on statistical significance is the source of many of the limitations of power calculations. Many features such as the significance level are completely arbitrary; issues of confounding, misclassification and effect modification are generally ignored (although appropriate methods are available - see Schlesselman, 1982; Greenland, 1983); and the size of the expected association is often just a guess. Nevertheless, power calculations are an essential aspect of planning a study since, despite all their assumptions and uncertainties, they provide a useful general indication as to whether a proposed study will be large enough to satisfy the objectives of the study.

Estimating the expected precision can also be useful (Rothman and Greenland, 1998). This can be done by "inventing" the results, based on the same assumptions used in power calculations, and carrying out an analysis involving calculations of effect estimates and confidence limits. This approach has particular advantages when the exposure is expected to have no association with disease, since the concept of power is not applicable but precision is still of concern. However, this approach should be used with considerable caution, as the results may be misleading unless interpreted carefully. In particular, a study with an expected lower limit equal to a particular value (e.g. 1.0) will have only a 50% chance of yielding an observed lower confidence limit above that value.

In practice, the study size depends on the number of available participants and the available resources. Within these limitations it is desirable to make the study as large as possible, taking into account the trade-off between including more participants and gathering more detailed information about a smaller number of participants (Greenland, 1988). Hence, power calculations can only serve as a rough guide as to whether a feasible study is large enough to be worthwhile. Even if such calculations suggest that a particular study would have very low power, the study may still be worthwhile if exposure information is collected in a form which will permit the study to contribute to the broader pool of information concerning a particular issue. For example, the International Agency for Research on Cancer (IARC) has organised several international collaborative studies such as those of occupational exposure to man-made mineral fibers (Simonato et al, 1986) and phenoxy herbicides and contaminants (Saracci et al, 1991). The man-made mineral fiber study involved pooling the findings from individual cohort studies of 13 European factories. Most of the individual cohorts were too small to be informative in themselves, but each contributed to the overall pool of data.

Once a study has been completed, there is little value in retrospectively performing power calculations since the confidence limits of the observed measure of effect provide the best indication of the range of likely values for the true association (Smith and Bates, 1992; Goodman and Berlin, 1994). In the next chapter, random error will be ignored, and the discussion will concentrate on issues of systematic error.
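
For the continuous-outcome case described earlier in this section, a minimal Python sketch is given below. It assumes the deviate is the square root of the reference group size times the difference in means, divided by the standard deviation times the square root of (A + 1), minus the alpha-level deviate; the blood pressure figures are hypothetical.

```python
import math


def normal_cdf(z):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def power_deviate_continuous(n0, mean1, mean0, sd, a=1.0, z_alpha=1.96):
    """Power deviate for a continuous outcome (assumed reading of the formula).

    n0: number of persons in the reference group
    mean1, mean0: expected means in the exposed and reference groups
    sd: estimated standard deviation of the outcome
    a: allocation ratio of referent to study group
    """
    return math.sqrt(n0) * (mean1 - mean0) / (sd * math.sqrt(a + 1.0)) - z_alpha


# Hypothetical study: 100 per group, 5 mmHg expected difference, SD of 12 mmHg.
z_beta = power_deviate_continuous(n0=100, mean1=135.0, mean0=130.0, sd=12.0)
print(f"Z_beta = {z_beta:.3f}, power = {normal_cdf(z_beta):.2f}")
```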

Summary

Random error will occur in any epidemiologic study, just as it occurs in experimental studies. The most effective means of reducing random error is by increasing the study size, so that the precision of the effect estimate will be increased. Random error thus differs from systematic error which cannot be reduced simply by increasing the study size. The ideal study would be infinitely large, but practical considerations set limits on the number of participants that can be included. Given these limits, it is desirable to find out, before commencing the study, whether it is large enough to be informative. One method is to calculate the "power" of the study. In practice, the study size depends on the number of available participants and the available resources. Within these limitations it is desirable to make the study as large as possible, taking into account the trade-off between including more participants and gathering more detailed information about a smaller number of participants. Hence, power calculations can only serve as a rough guide as to whether a feasible study is large enough to be worthwhile.

References

Armitage P, Berry G, Matthews JNS (2002). Statistical methods in medical research. 4th ed. Oxford: Blackwell.
Armstrong B (1987). A simple estimator of minimum detectable relative risk, sample size, or power in cohort studies. Am J Epidemiol 125: 356-358.
Beaumont JJ, Breslow NE (1981). Power considerations in epidemiologic studies of vinyl chloride workers. Am J Epidemiol 114: 725-734.
Dean J, Dean A, Burton A, Dicker R (1990). Epi Info. Version 5.01. Atlanta, GA: CDC.
Goodman SN, Berlin JA (1994). The use of predicted confidence intervals when planning experiments and the misuse of power when interpreting results. Ann Intern Med 121: 200-6.
Greenland S (1983). Tests for interaction in epidemiologic studies: a review and a study of power. Statist Med 2: 243-51.
Greenland S (1988). Statistical uncertainty due to misclassification: implications for validation substudies. J Clin Epidemiol 41: 1167-74.
Rothman KJ, Boice JD (1983). Epidemiologic Analysis with a Programmable Calculator. Epidemiology Resources, Inc.: Boston, MA.
Rothman KJ (2002). Epidemiology: an introduction. New York, Oxford University Press.
Rothman KJ, Greenland S (1998). Modern epidemiology. 2nd ed. Philadelphia: Lippincott-Raven.
Saracci R, Kogevinas M, Bertazzi P, et al (1991). Cancer mortality in an international cohort of workers exposed to chlorophenoxy herbicides and chlorophenols. Lancet 338: 1027-32.
Schlesselman JJ (1982). Case-control studies: design, conduct, analysis. New York: Oxford University Press.
Simonato L, Fletcher AC, Cherrie J, et al (1986). Scandinavian Journal of Work, Environment and Health 12: (suppl 1) 34-47.
Smith AH, Bates M (1992). Confidence limit analyses should replace power calculations in the interpretation of epidemiologic studies. Epidemiol 3: 449-52.
Walter SD (1977). Determination of significant relative risks and optimal sampling procedures in prospective and retrospective studies of various sizes. Am J Epidemiol 105: 387-97.

CHAPTER 7: Validity
(In: Pearce N. A Short Introduction to Epidemiology. Wellington: CPHR, 2003)

Systematic error (lack of validity) is distinguished from random error (lack of precision) in that it would be present even with an infinitely large study, whereas random error can be reduced by increasing the study size. Thus, systematic error, or bias, occurs if there is a systematic difference between what the study is actually estimating and what it is intended to estimate.

There are many different types of bias, but in studies of cause and effect most biases fall into one of three major categories (Rothman and Greenland, 1998): confounding; selection bias; and information bias. In general terms, these refer to biases arising from differences in baseline disease risk between the exposed and non-exposed subpopulations of the source population (confounding), biases resulting from the manner in which study participants are selected from the source population (selection bias), and biases resulting from the misclassification of these study participants with respect to exposure or disease (information bias).

7.1: Confounding

Confounding occurs when the exposed and non-exposed groups (in the source population) are not comparable due to inherent differences in background disease risk (Greenland and Robins, 1986) because of differences in the distribution of other risk factors between the exposed and non-exposed groups. For example, this could occur if we were studying the risk of heart disease in people who exercise frequently and those who do not, and if the people who exercised frequently smoked less than those who did not exercise; thus they might have a lower risk of heart disease because they smoked less, and not because they exercised more. Similar problems can occur in randomised trials because randomisation may fail, leaving the treatment groups with different characteristics (and different baseline disease risk) at the time that they enter the study, and because of differential loss and non-compliance across treatment groups. However, there is more concern about non-comparability in epidemiological studies because of the absence of randomisation. The concept of confounding thus generally refers to the source population, although confounding can also be introduced (or removed) by the manner in which study participants are selected from the source population.

If no other biases are present, three conditions are necessary for a factor to be a confounder (Rothman and Greenland, 1998).

First, a confounder is a factor which is predictive of disease in the absence of the exposure under study. Note that a confounder need not be a genuine cause of disease, but merely "predictive". Hence, surrogates for causal factors (e.g. age) may be regarded as potential confounders, even though they are rarely directly causal factors.

Second, a confounder is associated with exposure in the source population at the start of follow-up (i.e. at baseline). In case-control studies this implies that a confounder will tend to be associated with exposure among the controls. An association can occur among the cases simply because the study factor and a potential confounder are both risk factors for the disease, but this does not cause confounding in itself unless the association also exists in the source population.

Third, a variable which is affected by the exposure or the disease (e.g. an intermediate in the causal pathway between exposure and disease, or a symptom of disease) should not be treated as a confounder because to do so could introduce serious bias into the results (Greenland and Neutra, 1981; Robins, 1987; Weinberg, 1993). For example, in a study of high fat diet and colon cancer, it would be inappropriate to control for serum cholesterol levels if it was considered that high serum cholesterol levels were a consequence of a high fat diet, and hence a part of the causal chain leading from diet to colon cancer. On the other hand, if serum cholesterol itself was of primary interest, then this should be studied directly, and high fat diet would be regarded as a potential confounder if it also involved exposure to other risk factors for colon cancer. Evaluating this type of possibility requires information external to the study to determine whether a factor is likely to be a part of the causal chain. Intermediate variables can sometimes be used in the analysis, but special techniques are then required to avoid adding bias (Robins, 1989; Robins et al, 1992; Robins et al, 2000).

Example 7.1

Table 7.1 presents a hypothetical example of confounding by tobacco smoking in a prevalence case-control study. One-half of the study participants are "exposed" to the risk factor of interest and one-half are not. However, two-thirds of the exposed people are smokers compared with one-third of the non-exposed people. Thus, although exposure is not associated with disease either within the subgroup of smokers (POR=1.0) or within the subgroup of non-smokers (POR=1.0), it is associated with disease overall (POR=1.38) when the two subgroups are combined. This occurs because smoking is associated with the exposure (as noted above) and is an independent risk factor for the disease (40% of non-exposed smokers have the disease compared with 20% of non-exposed non-smokers). Thus, smoking is a confounder and the crude prevalence odds ratio of 1.38 is invalid because it is not adjusted for smoking.

Table 7.1

Hypothetical example of confounding by tobacco smoking in a prevalence case-control study

                        Smokers                 Non-smokers             Total
                        Exposed   Non-exposed   Exposed   Non-exposed   Exposed   Non-exposed
Cases                   800       400           200       400           1,000     800
Non-cases               1,200     600           800       1,600         2,000     2,200
Total                   2,000     1,000         1,000     2,000         3,000     3,000
Prevalence (%)          40        40            20        20            33.3      26.7
Prevalence odds ratio   1.0                     1.0                     1.38
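
The prevalence odds ratios in Table 7.1 can be checked with a few lines of Python. This is only an arithmetic sketch; the function name and the ordering of the counts are choices made here for illustration.

```python
def odds_ratio(exp_cases, unexp_cases, exp_noncases, unexp_noncases):
    """Prevalence odds ratio from the four cells of a 2x2 table."""
    return (exp_cases * unexp_noncases) / (unexp_cases * exp_noncases)


# Counts from Table 7.1, in the order expected by odds_ratio().
smokers = (800, 400, 1200, 600)
non_smokers = (200, 400, 800, 1600)

# Collapsing over smoking status gives the crude (unadjusted) table.
crude = tuple(s + n for s, n in zip(smokers, non_smokers))

por_smokers = odds_ratio(*smokers)
por_non_smokers = odds_ratio(*non_smokers)
por_crude = odds_ratio(*crude)

print(f"Smokers: {por_smokers:.2f}")
print(f"Non-smokers: {por_non_smokers:.2f}")
print(f"Crude: {por_crude:.2f}")
```

The crude table has 800 non-exposed cases (400 + 400), which is what yields the crude prevalence odds ratio of 1.38 despite both stratum-specific values being 1.0.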

Control of Confounding

Misclassification of a confounder leads to a loss of ability to control confounding, although control may still be useful provided that misclassification of the confounder is non-differential (Greenland, 1980). Misclassification of exposure poses a greater problem because factors which influence misclassification may appear to be confounders, but control of these factors may increase the net bias (Greenland and Robins, 1985). In general, control of confounding requires careful use of a priori knowledge, as well as inference from the observed data.

Control in the study design

Confounding can be controlled in the study design, or in the analysis, or both. Control at the design stage involves three main methods (Rothman and Greenland, 1998).

The first method is randomization, i.e., random allocation to exposure categories, but this is rarely an option in epidemiology which generally involves observational studies (it is debatable whether randomised studies are part of epidemiology or whether they constitute a separate methodology).

A second method of control at the design stage is to restrict the study to narrow ranges of values of the potential confounders, e.g., by restricting the study to white males aged 35-54. This approach has a number of conceptual and computational advantages, but may severely restrict the number of potential study subjects and the generalizability of the study, as effects in younger or older people will not be observable.

A third method of control involves matching study subjects on potential confounders. For example, in a cohort study one would match a white male non-exposed subject aged 35-39 with an exposed white male aged 35-39. This will prevent age-sex-race confounding in a cohort study, but is seldom done because it may be very expensive. Matching can also be expensive in case-control studies, and does not prevent confounding in such studies, but does facilitate its control in the analysis. Matching may actually reduce precision in a case-control study if it is done on a factor which is associated with exposure, but is not a risk factor for the disease of interest. However, matching on a strong risk factor will usually increase the precision of effect estimates.

Control in the Analysis

Confounding can also be controlled in the analysis, although it may be desirable to match on potential confounders in the design to optimize the efficiency of the analysis. The analysis ideally should control simultaneously for all confounding factors. Control of confounding in the analysis involves stratifying the data according to the levels of the confounder(s) and calculating an effect estimate which summarizes the association across strata of the confounders. It is usually not possible to control simultaneously for more than 2 or 3 confounders in a stratified analysis. For example, in a cohort study, finer stratification will often lead to many strata containing no exposed or no non-exposed persons. Such strata are uninformative, thus fine stratification is wasteful of information. This problem can be mitigated to some extent by the use of multiple regression, which allows for simultaneous control of more confounders by "smoothing" the data across confounder strata.

Example 7.2

If the data presented in example 7.1 (table 7.1) are analysed separately in smokers and non-smokers, then the prevalence odds ratio is 1.0 in each of the two subgroups (i.e. 1.0 in smokers and 1.0 in non-smokers). Taking a weighted average of these two stratum-specific estimates (see chapter 9) then yields an overall smoking-adjusted prevalence odds ratio of 1.0.
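
One way to form such a weighted average is the Mantel-Haenszel summary odds ratio. The example does not say which weighting scheme is intended (it defers to chapter 9), so the choice of Mantel-Haenszel weights in this sketch is an assumption; the counts are those of Table 7.1.

```python
def mantel_haenszel_or(strata):
    """Mantel-Haenszel summary odds ratio across strata.

    Each stratum is (a, b, c, d):
      a = exposed cases, b = non-exposed cases,
      c = exposed non-cases, d = non-exposed non-cases.
    """
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator


# Strata from Table 7.1: smokers, then non-smokers.
strata = [
    (800, 400, 1200, 600),
    (200, 400, 800, 1600),
]

adjusted_por = mantel_haenszel_or(strata)
print(f"Smoking-adjusted prevalence odds ratio: {adjusted_por:.2f}")
```

Because both stratum-specific odds ratios equal 1.0, any weighted average of them is also 1.0, so the result does not depend on the particular weights in this case.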

In general, control of confounding requires careful use of a priori knowledge, together with assessment of the extent to which the effect estimate changes when the factor is controlled in the analysis. Most epidemiologists prefer to make a decision based on the latter criterion, although it can be misleading, particularly if misclassification is present (Greenland and Robins, 1985). The decision to control for a presumed confounder can certainly be made with more confidence if there is supporting prior knowledge that the factor is predictive of disease.

Misclassification of a confounder leads to a loss of ability to control confounding, although control may still be useful provided that misclassification of the confounder was nondifferential (unbiased) (Greenland, 1980). Misclassification of exposure is more problematic, since factors which influence misclassification may appear to be confounders, but control of these factors may increase the net bias (Greenland and Robins, 1985).

Example 7.3

Suppose that a cohort study of lung cancer involves a comparison with national mortality rates in a country where 50% of the population are non-smokers, 40% are moderate smokers with a 10-fold risk of lung cancer (compared to non-smokers), and 10% are heavy smokers with a 20-fold risk of lung cancer. Then, it can be calculated that the national lung cancer incidence rate will be 6.5 (= 0.50 x 1.0 + 0.40 x 10 + 0.10 x 20) times the rate in non-smokers. Suppose that it was considered most unlikely that the cohort under study contained more than 50% moderate smokers and 20% heavy smokers. Then, the incidence rate in the study cohort would be 9.3 (= 0.30 x 1.0 + 0.50 x 10 + 0.20 x 20) times the rate in non-smokers. Hence, the observed incidence rate would be biased upwards by a factor of 9.3/6.5 = 1.4, i.e. it would be 1.4 times higher than the national rate due to confounding by smoking. Table 7.2 gives a range of such calculations presented by Axelson (1978) using data from Sweden. The last column indicates the likely bias in the observed rate ratio due to confounding by smoking (a value of 1.00 indicates no bias).

Table 7.2

Estimated crude rate ratios in relation to fraction of smokers in various hypothetical populations

Population fraction (%)                                   Bias in
Nonsmokers   Moderate smokers(a)   Heavy smokers(a)       relative risk
100          --                    --                     0.15
80           20                    --                     0.43
70           30                    --                     0.57
60           35                    5                      0.78
50           40                    10                     1.00(b)
40           45                    15                     1.22
30           50                    20                     1.43
20           55                    25                     1.65
10           60                    30                     1.86
--           65                    35                     2.08
--           25                    75                     2.69
--           --                    100                    3.08

Source: Axelson (1978)
(a) Two different risk levels are assumed for smokers: 10 times for moderate smokers; and 20 times for heavy smokers.
(b) Reference population with rates similar to those in general population in countries such as Sweden.

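The calculation behind Example 7.3 and Table 7.2 can be sketched in Python as follows. The function names are invented for illustration; the relative risks of 10 and 20 and the reference population of 50% non-smokers, 40% moderate smokers and 10% heavy smokers are those stated in the example.

```python
def rate_vs_nonsmokers(frac_non, frac_moderate, frac_heavy,
                       rr_moderate=10.0, rr_heavy=20.0):
    """Population lung cancer rate as a multiple of the rate in non-smokers."""
    return frac_non * 1.0 + frac_moderate * rr_moderate + frac_heavy * rr_heavy


def smoking_bias(frac_non, frac_moderate, frac_heavy,
                 reference=(0.50, 0.40, 0.10)):
    """Bias in the crude rate ratio from smoking differences alone."""
    return (rate_vs_nonsmokers(frac_non, frac_moderate, frac_heavy)
            / rate_vs_nonsmokers(*reference))


national = rate_vs_nonsmokers(0.50, 0.40, 0.10)
cohort = rate_vs_nonsmokers(0.30, 0.50, 0.20)
bias = smoking_bias(0.30, 0.50, 0.20)

print(f"National rate: {national:.1f} times the non-smoker rate")
print(f"Cohort rate: {cohort:.1f} times the non-smoker rate")
print(f"Bias factor: {bias:.2f}")
```

For a cohort of 30% non-smokers, 50% moderate smokers and 20% heavy smokers, the cohort rate works out at 9.3 times the non-smoker rate and the bias factor at 1.43, matching the corresponding row of Table 7.2.
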
Assessment of Confounding

When one lacks data on a suspected confounder (and thus cannot control confounding directly) it is still desirable to assess the likely direction and magnitude of the confounding it produces. It may be possible to obtain information on a surrogate for the confounder of interest (for example, social class is associated with many lifestyle factors such as smoking, and may therefore be a useful surrogate for some lifestyle-related confounders). Even though confounder control will be imperfect in this situation, it is still possible to examine whether the exposure effect estimate changes when the surrogate is controlled in the analysis, and to assess the strength and direction of the change. For example, if the relative risk actually increases (e.g. from 2.0 to 2.5), or remains stable (e.g. at 2.0) when social class is controlled for, then this is evidence that the observed excess risk is not due to confounding by smoking, since social class is correlated with smoking (Kogevinas et al, 1997), and control for social class involves partial control for smoking.

Alternatively, it may be possible to obtain accurate confounder information for a subgroup of participants in the study, and to assess the effects of confounder control in this subgroup. A related approach, known as two-stage sampling, involves obtaining confounder information for a sample of the source population (or a sample of the controls in a case-control study). For example, in a study of asthma in children, it may not be possible to obtain information on humidity levels in the home in all children. However, it may still be possible to obtain humidity measurements for a sample of the exposed and non-exposed groups in order to check that the average level of humidity in the home is similar in the two groups. Such limited information, if taken in all exposure-disease subgroups, can also be used to directly control confounding (White, 1982; Walker, 1982; Rothman and Greenland, 1998).

Finally, even if it is not possible to obtain confounder information for any study participants, it may still be possible to estimate how strong the confounding is likely to be from particular risk factors. For example, this is often done in occupational studies, where tobacco smoking is a potential confounder, but smoking information is rarely available; in fact, although smoking is one of the strongest risk factors for lung cancer, with relative risks of 10 or 20, it appears that smoking rarely exerts a confounding effect of greater than 1.5 times in studies of occupational disease (Axelson, 1978; Siemiatycki, 1988), because few occupations are strongly associated with smoking, although this degree of confounding may still be important in some contexts.

7.2: Selection Bias

Whereas confounding generally involves biases that are inherent in the source population, and therefore would occur even if everyone in the source population took part in the study, selection bias involves biases arising from the procedures by which the study participants are selected from the source population. Thus, selection bias is not an issue in a cohort study involving complete follow-up, since in this case the study cohort comprises the entire source population. However, selection bias can occur if participation in the study or follow-up is incomplete. For example, in a cohort mortality study, if a national population registry (or some surrogate for this such as the United States Social Security system) were not available, then it might be necessary to attempt to contact each worker or his next-of-kin to verify vital status (i.e. whether the worker was still alive). Bias could occur if the response rate was higher in the most heavily exposed persons who had been diagnosed with disease than in other persons.

Example 7.4

Wrensch et al (2000) conducted a case-control study of 476 adults newly diagnosed with glioma in the San Francisco Bay Area between August 1991 and April 1994, and 462 age-, gender- and ethnicity-matched controls. In addition, limited information was obtained during a brief telephone interview with 101 controls who declined participation in the lengthy in-person interview. Controls who participated in the full interview were more likely than controls who only completed the telephone interview to report head injury. Thus there was evidence of a selection bias in the recruitment of controls. The odds ratio for cases versus controls who completed the full interview was 0.9, whereas when both control groups were combined the odds ratio was 1.3.

Although we should recognize the possible biases arising from subject selection, it is important to note that epidemiologic studies need not be based on representative samples to avoid bias. For example, in a cohort study persons who develop disease might be more likely to be lost to follow-up than persons who did not develop disease; however, this would not affect the relative risk estimate provided that loss to follow-up applied equally to the exposed and non-exposed populations (Criqui, 1979). Analogously, case-control studies have differing selection probabilities as an integral part of their design, in that the selection probability of diseased persons is usually close to 1.0 provided that most persons with disease are identified, whereas that for non-diseased persons is substantially less; however, this does not affect the relative risk estimate provided that these selection probabilities apply equally within each exposure group.

Additional forms of selection bias can occur in case-control studies because these involve sampling from the source population. In particular, selection bias can occur in a case-control study (involving either incident or prevalent cases) if controls are chosen in a non-representative manner, e.g. if exposed people were more likely to be selected as controls than non-exposed people.

Minimizing Selection Bias

If selection bias has occurred in the enumeration of the exposed group, it may still be possible to avoid bias by choosing an appropriate non-exposed comparison group. For example, if the exposed group does not include all workers in a particular industry, but is restricted to union members (because the records are available), then the non-exposed comparison group could be other workers in the same geographical area who are members of the same union, and/or a similar union.

Control of Selection Bias

Selection bias can sometimes be controlled in the analysis by identifying factors which are related to subject selection and controlling for them as confounders (provided that these factors are not affected by the study exposure or disease). For example, if white-collar workers are more likely to be selected for (or participate in) a study than manual workers (and white-collar work is negatively or positively related to the exposure of interest), then this bias can be partially controlled by collecting information on social class and controlling for social class in the analysis as a confounder.
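The claim that differing selection probabilities for cases and controls need not bias the odds ratio, as long as they do not depend on exposure, can be checked numerically. The following is a minimal sketch: the source population counts and the selection probabilities are invented for illustration and do not come from the text.

```python
def odds_ratio(exp_cases, unexp_cases, exp_controls, unexp_controls):
    """Exposure odds ratio from the four cells of a 2x2 table."""
    return (exp_cases / unexp_cases) / (exp_controls / unexp_controls)


# Hypothetical source population (illustrative counts only).
SOURCE = {
    "exp_cases": 200,
    "unexp_cases": 100,
    "exp_noncases": 20_000,
    "unexp_noncases": 30_000,
}


def expected_sample_or(p_exp_case, p_unexp_case, p_exp_control, p_unexp_control):
    """Odds ratio expected when each cell is sampled with its own probability."""
    return odds_ratio(
        SOURCE["exp_cases"] * p_exp_case,
        SOURCE["unexp_cases"] * p_unexp_case,
        SOURCE["exp_noncases"] * p_exp_control,
        SOURCE["unexp_noncases"] * p_unexp_control,
    )


true_or = odds_ratio(200, 100, 20_000, 30_000)

# Cases are selected with probability 0.9 and controls with 0.01,
# regardless of exposure: the odds ratio is unchanged.
selection_unrelated_to_exposure = expected_sample_or(0.9, 0.9, 0.01, 0.01)

# Exposed controls are twice as likely to be selected as non-exposed
# controls: the odds ratio is halved.
exposed_controls_oversampled = expected_sample_or(0.9, 0.9, 0.02, 0.01)

print(f"true OR: {true_or:.2f}")
print(f"selection unrelated to exposure: {selection_unrelated_to_exposure:.2f}")
print(f"exposed controls oversampled: {exposed_controls_oversampled:.2f}")
```

In this sketch the very different sampling fractions for cases and controls cancel out of the odds ratio; only the exposure-dependent control selection in the last scenario moves the estimate.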

7.3: Information Bias

Information bias involves misclassification of the study participants with respect to disease or exposure status. Thus, the concept of information bias refers to those people actually included in the study, whereas selection bias refers to the selection of the study participants from the source population, and confounding generally refers to non-comparability of subgroups within the source population. Information bias involves misclassification of the study subjects with respect to exposure, confounders, or disease.

It is customary to consider two types of misclassification: non-differential and differential misclassification.

Non-Differential Misclassification

Non-differential misclassification occurs when the probability of misclassification of exposure is the same for cases and non-cases (or when the probability of misclassification of disease is the same for exposed and non-exposed persons). This can occur if exposed and non-exposed persons are equally likely to be misclassified according to disease outcome, or if diseased and non-diseased persons are equally likely to be misclassified according to exposure. Non-differential misclassification of exposure usually (but not always) biases the relative risk estimate towards the null value of 1.0 (Copeland et al, 1977; Dosemeci et al, 1990). Hence, non-differential misclassification tends to produce "false negative" findings and is of particular concern in studies which find a negligible association between exposure and disease. One important condition is needed to ensure that exposure misclassification produces bias towards the null, however: the exposure classification errors must be independent of other errors. Without this condition, non-differential exposure misclassification can produce bias in any direction (Chavance et al, 1992; Kristensen, 1992).

Example 7.5

In many cohort studies some exposed persons will be classified as non-exposed, and vice versa. Table 7.3 illustrates this situation with hypothetical data from a study of lung cancer incidence in asbestos workers. Suppose the true incidence rates are 100 per 100,000 person-years in the high exposure group, and 10 per 100,000 person-years in the low exposure group, and the relative risk is thus 10. If 15% of high exposed persons are incorrectly classified, then 15 of every 100 deaths and 15,000 of every 100,000 person-years will be incorrectly allocated to the low exposure group. Similarly, if 10% of low exposed persons are incorrectly classified, then 1 of every 10 deaths and 10,000 of every 100,000 person-years will be incorrectly allocated to the high exposure group. As a result, the observed incidence rates per 100,000 person-years will be 91 and 23 respectively, and the observed relative risk will be 4.0 instead of 10.0. Due to non-differential misclassification, incidence rates in the high exposed group have been biased downwards, and incidence rates in the low exposure group have been biased upwards.

Table 7.3
Hypothetical data from a cohort study in which 15% of highly exposed persons and
10% of low exposed persons are incorrectly classified.

                        Actual                 Observed
                  -------------------   ------------------------------------------------------
                  High       Low        High Exposure              Low Exposure
                  Exposure   Exposure
----------------------------------------------------------------------------------------------
Deaths            100        10         85 + 1 = 86                9 + 15 = 24
Person-years      100,000    100,000    85,000 + 10,000 = 95,000   90,000 + 15,000 = 105,000
----------------------------------------------------------------------------------------------
Incidence rate    100        10         91                         23
per 100,000
person-years
----------------------------------------------------------------------------------------------
Rate ratio        10.0                  4.0

Furthermore, there are several other situations in which non-differential misclassification will not produce a bias towards the null.

Firstly, when the specificity of the method of identifying the disease under study is 100%, but the sensitivity is less than 100%, then the risk difference will be biased towards the null, but the risk ratio (or rate ratio) will not be biased by the misclassification. For example, if only 80% of the deaths are identified in a study, but this under-ascertainment applies equally to the exposed and non-exposed groups, then this will not affect the relative risk estimate.

Secondly, the effect estimate may be biased away from the null for some exposure categories when there are multiple exposure categories (see example 7.6).

Finally, when there is positive confounding, and there is non-differential misclassification of the confounder, then confounding control will be incomplete and the adjusted effect estimate will consequently be biased away from the null.
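The first exception above (perfect specificity with imperfect sensitivity) can be illustrated with a short calculation. This is a sketch under stated assumptions: the true risks of 2% and 1% are hypothetical, the 80% sensitivity mirrors the under-ascertainment example in the text, and the 99% specificity in the last step is an invented contrast case.

```python
def observed_risk(true_risk, sensitivity, specificity=1.0):
    """Expected observed risk when disease is ascertained imperfectly.

    True cases are detected with probability `sensitivity`; non-cases are
    wrongly labelled as cases with probability (1 - specificity).
    """
    return true_risk * sensitivity + (1 - true_risk) * (1 - specificity)


# Hypothetical true risks in the exposed and non-exposed groups.
risk_exposed, risk_unexposed = 0.020, 0.010

# 80% of cases identified in both groups, no false positives.
obs_exposed = observed_risk(risk_exposed, sensitivity=0.8)
obs_unexposed = observed_risk(risk_unexposed, sensitivity=0.8)

true_rr = risk_exposed / risk_unexposed   # 2.0
obs_rr = obs_exposed / obs_unexposed      # still 2.0
true_rd = risk_exposed - risk_unexposed   # 0.010
obs_rd = obs_exposed - obs_unexposed      # 0.008, biased towards the null

# Contrast case: once specificity drops below 100%, the ratio is biased too.
rr_imperfect_specificity = (
    observed_risk(risk_exposed, 0.8, specificity=0.99)
    / observed_risk(risk_unexposed, 0.8, specificity=0.99)
)

print(f"risk ratio: true {true_rr:.2f}, observed {obs_rr:.2f}")
print(f"risk difference: true {true_rd:.3f}, observed {obs_rd:.3f}")
print(f"risk ratio with 99% specificity: {rr_imperfect_specificity:.2f}")
```

The ratio survives because the same sensitivity multiplies both risks and cancels, whereas the difference is scaled down by that factor.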

Example 7.6

Table 7.4 gives hypothetical data from a cohort study in which the findings for the high and low exposure groups are the same as in example 7.5, but there is also a non-exposed group for which there is no misclassification. In this instance, the non-differential misclassification between the high and low exposure groups produces a bias away from the null when the low exposure group is compared to the non-exposed group: the relative risk is 4.6 instead of 2.0.

Table 7.4
Hypothetical data from a cohort study in which 15% of highly exposed persons and 10% of
low exposed persons are incorrectly classified, but the non-exposed are correctly classified

                           Actual                              Observed
                -----------------------------------   -----------------------------------
                High       Low        Non-Exposed     High       Low        Non-Exposed
-----------------------------------------------------------------------------------------
Deaths          100        10         5               86         24         5
Person-years    100,000    100,000    100,000         95,000     105,000    100,000
-----------------------------------------------------------------------------------------
Rate            100        10         5               91         23         5
-----------------------------------------------------------------------------------------
Rate ratio      20.0       2.0        1.0             18.1       4.6        1.0
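The figures in Tables 7.3 and 7.4 can be reproduced with a few lines of arithmetic. The sketch below uses the deaths and person-years from the tables; the function names are illustrative choices, not part of the original text.

```python
def misclassify(high, low, p_high_to_low, p_low_to_high):
    """Reallocate deaths and person-years between two exposure groups.

    `high` and `low` are (deaths, person_years) tuples. A fraction
    `p_high_to_low` of the high group is recorded as low exposure, and a
    fraction `p_low_to_high` of the low group is recorded as high exposure.
    """
    high_deaths, high_py = high
    low_deaths, low_py = low
    obs_high = (
        high_deaths * (1 - p_high_to_low) + low_deaths * p_low_to_high,
        high_py * (1 - p_high_to_low) + low_py * p_low_to_high,
    )
    obs_low = (
        low_deaths * (1 - p_low_to_high) + high_deaths * p_high_to_low,
        low_py * (1 - p_low_to_high) + high_py * p_high_to_low,
    )
    return obs_high, obs_low


def rate_per_100k(deaths, person_years):
    """Incidence rate per 100,000 person-years."""
    return deaths / person_years * 100_000


# True data from Tables 7.3 and 7.4.
high, low, non_exposed = (100, 100_000), (10, 100_000), (5, 100_000)

obs_high, obs_low = misclassify(high, low, p_high_to_low=0.15, p_low_to_high=0.10)

rate_high = rate_per_100k(*obs_high)        # about 91
rate_low = rate_per_100k(*obs_low)          # about 23
rate_non = rate_per_100k(*non_exposed)      # 5, no misclassification

rr_high_vs_low = rate_high / rate_low       # about 4.0 (true value 10.0)
rr_high_vs_non = rate_high / rate_non       # about 18.1 (true value 20.0)
rr_low_vs_non = rate_low / rate_non         # about 4.6 (true value 2.0)

print(f"high vs low: {rr_high_vs_low:.1f}")
print(f"high vs non-exposed: {rr_high_vs_non:.1f}")
print(f"low vs non-exposed: {rr_low_vs_non:.1f}")
```

The low exposure group gains deaths from the high exposure group faster than it gains person-years, which is why its rate ratio against the correctly classified non-exposed group moves away from the null.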

One special type of non-differential misclassification occurs when the study outcome is not well-defined and includes a wide range of etiologically unrelated outcomes (e.g., all deaths). This may obscure the effect of exposure on one specific disease since a large increase in risk for this disease may only produce a small increase in risk for the overall group of diseases under study. A similar bias can occur when the exposure measure is not well defined and includes a wide range of etiologically unrelated exposures, possibly due to a non-specific exposure definition or due to the inclusion of exposures which could not have caused the disease of interest because they occurred after, or shortly before, diagnosis. It could be argued that these phenomena do not represent misclassification because these are not errors in measurement. However, they do involve misclassification in the sense that the etiologically relevant exposure (or disease) has not been measured appropriately.

Differential Misclassification

Differential misclassification occurs when the probability of misclassification of exposure is different in diseased and non-diseased persons, or the probability of misclassification of disease is different in exposed and non-exposed persons. This can bias the observed effect estimate either toward or away from the null value. For example, in a nested case-control study of lung cancer, with a control group selected from among non-diseased members of the cohort, the recall of occupational exposures in controls might be different from that of the cases. In this situation, differential misclassification would occur, and it could bias the odds ratio towards or away from the null, depending on whether members of the cohort who did not develop lung cancer were more or less likely to recall such exposure than the cases.

Example 7.7

Table 7.5 shows data from a hypothetical case-control study in which 70 of the 100 cases and 50 of the 100 controls have actually been exposed to some chemical. The true odds ratio is thus (70/30) / (50/50) = 2.3. If 90% (63) of the 70 exposed cases, but only 60% (30) of the 50 exposed controls are classified correctly, then the observed odds ratio would be (63/37) / (30/70) = 4.0.

Table 7.5

Hypothetical data from a case-control study in which 90% of exposed cases and 60% of
exposed controls are correctly classified

                      Actual                     Observed
               ----------------------     ----------------------
               Exposed   Non-exposed      Exposed   Non-exposed
----------------------------------------------------------------
Cases          70        30               63        37
Controls       50        50               30        70
----------------------------------------------------------------
Odds ratio     2.3                        4.0

As can be noted from example 7.7, misclassification can drastically affect the validity of a study. Given limited resources, it will often be more desirable to reduce information bias by obtaining more detailed information on a limited number of subjects than to reduce random error by including more subjects. However, a certain amount of misclassification is unavoidable, and it is usually desirable to ensure that it is towards the null value (as usually occurs with non-differential exposure misclassification) to minimize the chance of false positive results.

Example 7.8

In the case-control study of lung cancer in Example 7.7, the misclassification could be made non-differential by selecting controls from cohort members with other types of cancer, or other diseases, in order that their recall of exposure would be more similar to that of the cases. As before, 63 (90%) of the exposed cases would recall exposure, but now 45 (90%) of the 50 exposed controls would recall their exposure. The observed odds ratio would be (63/37)/(45/55) = 2.1. This estimate is still biased in comparison with the correct value of 2.3. However, the bias is non-differential, is much smaller than before, and is in a predictable direction, towards the null. However, it should be noted that making a bias non-differential will not always make it smaller, and that the direction of bias from non-differential misclassification is sometimes predictable in advance.
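The odds ratios in Examples 7.7 and 7.8 can be checked with the sketch below. It assumes, as the examples implicitly do, that only truly exposed subjects are misclassified (some fail to recall their exposure) and that no unexposed subject reports exposure; the function name is an illustrative choice.

```python
def observed_odds_ratio(cases_exp, cases_unexp, controls_exp, controls_unexp,
                        recall_cases, recall_controls):
    """Odds ratio when only a fraction of exposed subjects recall exposure.

    Exposed subjects who do not recall exposure are counted as non-exposed.
    Non-exposed subjects are assumed never to report exposure.
    """
    a = cases_exp * recall_cases                              # cases recorded as exposed
    b = cases_unexp + cases_exp * (1 - recall_cases)          # cases recorded as non-exposed
    c = controls_exp * recall_controls                        # controls recorded as exposed
    d = controls_unexp + controls_exp * (1 - recall_controls) # controls recorded as non-exposed
    return (a / b) / (c / d)


# True exposure distribution from Table 7.5.
table = dict(cases_exp=70, cases_unexp=30, controls_exp=50, controls_unexp=50)

true_or = observed_odds_ratio(**table, recall_cases=1.0, recall_controls=1.0)
# Example 7.7: differential recall (90% of cases, 60% of controls).
differential_or = observed_odds_ratio(**table, recall_cases=0.9, recall_controls=0.6)
# Example 7.8: non-differential recall (90% in both groups).
non_differential_or = observed_odds_ratio(**table, recall_cases=0.9, recall_controls=0.9)

print(f"true OR: {true_or:.1f}")                               # 2.3
print(f"differential recall: {differential_or:.1f}")           # 4.0
print(f"non-differential recall: {non_differential_or:.1f}")   # 2.1
```

With equal recall in cases and controls the estimate falls slightly below the true value, whereas poorer recall among controls inflates it.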

Minimizing Information Bias

Misclassification can drastically affect the validity of a study. It is often helpful to ensure that the misclassification is non-differential, by ensuring that exposure information is collected in an identical manner in cases and non-cases (or that disease information is collected in an identical manner in the exposed and non-exposed groups). In this situation, if it is independent of other errors, exposure misclassification tends to produce false negative findings and is thus of greatest concern in studies which have not found an important effect of exposure. Thus, in general it is important to ensure that information bias is non-differential and, within this constraint, to keep it as small as possible. Thus, it can be argued that the aim of data collection is not to collect perfect information, but to collect information in a similar manner from the groups being compared, even if this means ignoring more detailed exposure information if this is not available for both groups. However, this is not always the case (Greenland and Robins, 1985).

Assessment of Information Bias

Information bias is usually of most concern in historical cohort studies or case-control studies when information is obtained by personal interview. Despite these concerns, relatively little information is generally available on the accuracy of recall of exposures. When possible, it is important to attempt to validate the classification of exposure or disease, e.g., by comparing interview results with other data sources such as employer records, and to assess the potential magnitude of bias due to misclassification of exposure.

Relationship of Selection and Information Bias to Confounding

Selection bias and confounding are not always clearly demarcated. In particular, selection bias can sometimes be viewed as a type of confounding, since both can be reduced by controlling for surrogates for the determinants of the bias (e.g. social class). Unfortunately, selection affected by exposure and disease generates a bias that cannot be reduced in this fashion. Some consider any bias that can be controlled in the analysis as confounding. Other biases are then categorized according to whether they arise from the selection of study subjects (selection bias), or their classification (information bias).

Summary

The greatest concern in epidemiological studies usually relates to confounding, because exposure has not been randomly allocated, and the groups under study may therefore be noncomparable with respect to their baseline disease risk. However, to be a significant confounder, a factor must be strongly predictive of disease and strongly associated with exposure. Thus, although confounding is constantly a source of concern, the strength of confounding is often considerably less than might be expected (it should be appreciated however, that this appearance may be illusory, for nondifferential misclassification of a confounder which is common will usually make the confounding appear smaller than it really is).

Provided that information has been collected in a standardized manner (and its accuracy is unrelated to other errors), then misclassification will be non-differential, and any bias it produces will usually be towards the null value. In this situation, misclassification tends to produce false negative findings and is thus of greatest concern in studies which have not found an important effect of exposure; it is of much less concern in studies with positive findings, since these findings are likely to have been even more strongly positive if misclassification had not occurred.

Again, one should appreciate the limitations of these observations: it may be difficult to be sure that the exposure and disease misclassification is nondifferential, and nondifferential misclassification of a confounder can lead to bias away from the null if the confounder produces confounding away from the null.

References

Axelson O (1978). Aspects on confounding in occupational health epidemiology. Scand J Work Environ Health 4: 85-9.

Chavance M, Dellatolas G, Lellouch J (1992). Correlated nondifferential misclassifications of disease and exposure: application to a cross-sectional study of the relationship between handedness and immune disorders. Int J Epidemiol 21: 537-46.

Copeland KT, Checkoway H, McMichael AJ, et al (1977). Bias due to misclassification in the estimation of relative risk. Am J Epidemiol 105: 488-95.

Criqui MH (1979). Response bias and risk ratios in epidemiologic studies. Am J Epidemiol 109: 394-9.

Dosemeci M, Wacholder S, Lubin JH (1990). Does nondifferential misclassification of exposure always bias a true effect toward the null value? Am J Epidemiol 132: 746-8.

Greenland S (1980). The effect of misclassification in the presence of covariates. Am J Epidemiol 112: 564-9.

Greenland S, Neutra R (1981). An analysis of detection bias and proposed corrections in the study of estrogens and endometrial cancer. J Chron Dis 34: 433-8.

Greenland S, Robins JM (1985). Confounding and misclassification. Am J Epidemiol 122: 495-506.

Greenland S, Robins JM (1986). Identifiability, exchangeability and epidemiological confounding. Int J Epidemiol 15: 412-8.

Kogevinas M, Pearce N, Susser M, Boffetta P (1997). Social inequalities and cancer. In: Kogevinas M, Pearce N, Susser M, Boffetta P (eds). Social inequalities and cancer. Lyon: IARC, pp 1-15.

Kristensen P (1992). Bias from nondifferential but dependent misclassification of exposure and outcome. Epidemiol 3: 210-5.

Robins J (1987). A graphical approach to the identification and estimation of causal parameters in mortality studies with sustained exposure periods. J Chron Dis 40 (suppl 2): 139S-161S.

Robins J (1989). The control of confounding by intermediate variables. Stat Med 8: 679-701.

Robins JM, Blevins D, Ritter G, et al (1992). G-estimation of the effect of prophylaxis therapy for pneumocystis carinii pneumonia on the survival of AIDS patients. Epidemiol 3: 319-36.

Robins JM, Hernán MA, Brumback B (2000). Marginal structural models and causal inference in epidemiology. Epidemiol 11: 550-62.

Rothman KJ, Greenland S (1998). Modern epidemiology. 2nd ed. Philadelphia: Lippincott-Raven.

Siemiatycki J, Wacholder S, Dewar R, et al (1988). Smoking and degree of occupational exposure: Are internal analyses in cohort studies likely to be confounded by smoking status?

Walker AM (1982). Anamorphic analysis: sampling and estimation for covariate effects when both exposure and disease are known. Biometrics 38: 1025-32.

Weinberg CR (1993). Toward a clearer definition of confounding. Am J Epidemiol 137: 1-8.

White JE (1982). A two-stage design for the study of the relationship between a rare exposure and a rare disease. Am J Epidemiol 115: 119-28.

Wrensch M, Miike R, Neuhaus J (2000). Are prior head injuries or diagnostic X-rays associated with glioma in adults? The effects of control selection bias. Neuroepidemiology 19: 234-44.

CHAPTER 8: Effect Modification
(In: Pearce N. A Short Introduction to Epidemiology. Wellington: CPHR, 2003)

In the previous chapter I discussed the problem of confounding, which occurs when the exposed and non-exposed subpopulations of the source population are inherently different in background disease risk. This should not be confused with effect modification, which occurs when the measure of the effect of the study factor depends on the level of another factor in the study population (Miettinen, 1974). The term statistical interaction denotes a similar phenomenon in the observed data. However, the terms interaction and effect modification are also used in a variety of other contexts, with a variety of meanings. In particular, the term interaction has different meanings for biostatisticians, lawyers, clinicians, public health professionals, epidemiologists and biologists.

Example 8.1

Katsouyanni et al (1993) studied the effects of air pollution and high temperature in the causation of excess mortality during a major heat wave in Greece in July 1987. They found that the effects of the heat wave were modified by the presence (or absence) of high air pollution levels. The increase in deaths on extremely hot days was 97% in Athens (where air pollution levels are high), but was 33% in other urban areas and 27% in non-urban areas. Further analyses suggested that the threshold of effect of various air pollutants appeared to be lower on extremely hot days.
that the effects of the but was 33% in other

8.1: Concepts of Interaction

The different concepts of interaction will be illustrated with data from a hypothetical study of the risk of lung cancer per 1,000 population (e.g. over a five year period) in relation to exposure to cigarette smoke and asbestos (Table 8.1). The risk difference due to smoking is 30 per 1,000 in asbestos workers and 9 per 1,000 in other people. On the other hand, the rate ratio for smoking is 7.0 in asbestos workers and 10.0 in other people. I will now consider how this data might be interpreted by different researchers and policy makers. In each instance, it is recognized that it is important to prevent or reduce both asbestos exposure and smoking. However, in this case the asbestos exposure has already occurred and the factory has now closed, so our focus is on smoking. We want to know whether the effect of smoking is modified by asbestos exposure, i.e. do smoking and asbestos exposure interact?

Two Biostatisticians

Suppose that we first consult a biostatistician about how to interpret this data. The first biostatistician we talk to uses relative risk measures of effect. They note that the relative risk for smoking and lung cancer is 7.0 (35/5) in asbestos workers and 10.0 (10/1) in other people. Thus, the effect of smoking on lung cancer is less in asbestos workers and there is therefore a negative statistical interaction between the effects of smoking and asbestos (table 8.2). They may even fit a multiplicative model with an interaction term and show that the interaction term is negative.

We can see the logic of this argument, but are somewhat surprised by the conclusion, since we can see the very high rates in people who both smoke and are exposed to asbestos. We therefore consult a second biostatistician. This alternative biostatistician uses the risk difference as the effect measure. They note that the risk difference for smoking and lung cancer is 30 per 1,000 (35 - 5) in asbestos workers and 9 per 1,000 (10 - 1) in other people. Thus, the effect of smoking is greater in asbestos workers and there is a positive statistical interaction between the effects of smoking and asbestos (table 8.2). They may even fit an additive model with an interaction term and show that the interaction term is positive.

We eventually get our two biostatistical consultants together and they argue that there is no contradiction in the advice they have given us. Effect modification and statistical interaction are merely statistical concepts which depend on the methods used. In fact, all secondary risk factors modify either the rate ratio or the rate difference, and uniformity over one measure implies non-uniformity over the other (Koopman, 1981; Steenland and Thun, 1986), e.g. an apparent additive joint effect implies a departure from a multiplicative model. Several authors (e.g. Kupper and Hogan, 1978; Walter and Holford, 1978) have demonstrated the dependence of statistical interaction on the underlying statistical measure of effect, and have therefore argued that the assessment of interaction is "model-dependent".

Table 8.1

Lung cancer risk per 1,000 people (and RR) in relation to exposure to cigarette smoke
and asbestos
Asbestos
Yes No
------------------------------------------------------
Smoking Yes 35/1000 (35.0) 10/1000 (10.0)
No 5/1000 (5.0) 1/1000 (1.0)
------------------------------------------------------
Rate difference 30/1000 9/1000
Rate ratio 7.0 10.0
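The two biostatisticians' calculations can be reproduced directly from Table 8.1. The sketch below is illustrative only; the dictionary layout and the two "contrast" quantities are naming choices made here, not notation from the text.

```python
# Five-year lung cancer risk per 1,000 people, taken from Table 8.1.
RISK = {
    ("asbestos", "smoker"): 35,
    ("asbestos", "non-smoker"): 5,
    ("no asbestos", "smoker"): 10,
    ("no asbestos", "non-smoker"): 1,
}


def smoking_effects(stratum):
    """Return (risk difference per 1,000, risk ratio) for smoking in one stratum."""
    smokers = RISK[(stratum, "smoker")]
    non_smokers = RISK[(stratum, "non-smoker")]
    return smokers - non_smokers, smokers / non_smokers


rd_asbestos, rr_asbestos = smoking_effects("asbestos")
rd_others, rr_others = smoking_effects("no asbestos")

# Additive scale: a contrast above 0 means a positive interaction.
additive_contrast = rd_asbestos - rd_others        # 30 - 9 = 21
# Multiplicative scale: a contrast below 1 means a negative interaction.
multiplicative_contrast = rr_asbestos / rr_others  # 7.0 / 10.0 = 0.7

print(f"risk difference: {rd_asbestos} vs {rd_others} per 1,000")
print(f"risk ratio: {rr_asbestos:.1f} vs {rr_others:.1f}")
print(f"additive contrast: {additive_contrast}")
print(f"multiplicative contrast: {multiplicative_contrast:.1f}")
```

The same four numbers give a positive interaction on the additive scale and a negative one on the multiplicative scale, which is the sense in which the assessment is "model-dependent".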

A Lawyer

Next we consult a lawyer (I do not advise this as a real course of action; this is just a hypothetical consultation!). She/he is also concerned about the effect of smoking, but the effect they are interested in is "what is the probability that my client's lung cancer was caused by their smoking?" If we look at the asbestos workers, we find that if they smoked their risk of lung cancer was 35 per 1,000 whereas it was 5 per 1,000 if they didn't smoke. Thus, assuming there is no confounding by other factors, then of every 35 lung cancers occurring in the smokers, 5 would have happened anyway, and 30 are additional cases due to smoking. Thus, for an individual lung cancer case, the probability that smoking caused the cancer is 100*30/35 which is 86% (this is just 100*(R-1)/R where R is the relative risk of 7.0). The corresponding estimate for other people (not exposed to asbestos) is 100*9/10 which is 90%. Thus, the probability of causation by smoking is slightly less in asbestos workers and there is therefore a negative interaction between the effects of smoking and asbestos (table 8.2). It should be noted that this lawyer's approach is a little simplistic (Greenland, 1999), but the key issue here is that the effect that is being measured, and the inference about interaction, is different from that of the two biostatisticians, although it is more consistent with that of the biostatistician who uses the relative risk as the measure of effect.
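The lawyer's figures follow from the formula 100*(R-1)/R applied to the relative risks in Table 8.1. A minimal sketch, with a function name chosen here for illustration:

```python
def probability_of_causation(relative_risk):
    """Percentage of exposed cases attributed to the exposure: 100 * (R - 1) / R."""
    return 100 * (relative_risk - 1) / relative_risk


# Relative risks for smoking from Table 8.1.
pc_asbestos_workers = probability_of_causation(35 / 5)  # R = 7.0
pc_other_people = probability_of_causation(10 / 1)      # R = 10.0

print(f"asbestos workers: {pc_asbestos_workers:.0f}%")  # 86%
print(f"other people: {pc_other_people:.0f}%")          # 90%
```

Because the measure depends only on the relative risk, it necessarily ranks the two groups in the same order as the first biostatistician's analysis.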

Table 8.2

The approaches of different consultants to interpreting the data in table 8.1

                                             Size of effect
                                        --------------------------
                                                                    Inherent
                     Effect             Asbestos                    statistical       Is there an
Consultant           measure            workers       Others        model             interaction?  Direction?
--------------------------------------------------------------------------------------------------------------
Biostatistician 1    Relative risk      7.0           10.0          Relative risk     Yes           -ve

Biostatistician 2    Risk difference    30/1000       9/1000        Risk difference   Yes           +ve

Lawyer               Probability of     86%           90%           Relative risk     Yes           -ve
                     causation

Clinician            Individual risk    30 per        9 per         Risk difference   Yes           +ve
                                        1,000         1,000

Public health        Deaths             30 per        9 per         Risk difference   Yes           +ve
worker               prevented          1,000         1,000

Epidemiologist       Combination of     21 cases      Not           Risk difference   Yes           +ve
                     factors to cause   out of 35     applicable
                     disease            (60% are
                                        due to the
                                        combination
                                        of exposures)

A Clinician

Next we consult with a clinician. She/he says "I advise my patients to give up smoking, and I tell them that if they do manage to stop then they will reduce their risk of lung cancer. They ask 'by how much?' So I want to know what the reduction in their individual risk will be if they give up smoking." Well, if their patient is an asbestos worker then they will reduce their risk by 30 per 1,000 (over five years) by giving up smoking; other people will reduce their risk by 9 per 1,000 (once again, this is a little simplistic since this does not tell us exactly how many years of life they will gain). Thus, the effect of smoking is greater in asbestos workers and there is therefore a positive statistical interaction between the effects of smoking and asbestos (table 8.2).

A Public Health Worker

The public health worker that we consult has a similar approach to the clinician, except that they are concerned about the population rather than about individual patients. They say "I want to conduct population smoking prevention campaigns and persuade people to give up smoking and that if they do then they will reduce their risk of lung cancer. I only have a limited amount of resources so I want to know if I can prevent more cases of lung cancer by focusing on asbestos workers, or by doing my campaigns in the same number of people in the general population." If they prevent 1,000 asbestos workers from smoking, then (once there has been time for the reduction in risk to start occurring) they will have prevented 30 lung cancer cases each year. If they prevent 1,000 other people from smoking then each year they will have prevented 9 cases of lung cancer. Thus, the effect of smoking is greater in asbestos workers and there is therefore a positive statistical interaction between the effects of smoking and asbestos (table 8.2).

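Figure 8.1

Numbers of cases occurring through background factors, asbestos alone, smoking alone, and their combination in people exposed to both factors

Mechanism              Component causes    Cases
-----------------------------------------------------
Background             U                   1/35 (3%)
Asbestos               U, A                4/35 (11%)
Smoking                U, S                9/35 (26%)
Asbestos & Smoking     U, A, S             21/35 (60%)

(U = unknown background exposures, A = asbestos, S = smoking)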
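The case counts in figure 8.1 follow from the risks in Table 8.1 once additivity is taken as the state of "no interaction". The sketch below reproduces them; the constant and dictionary names are illustrative choices, not notation from the text.

```python
# Five-year lung cancer risk per 1,000 people, from Table 8.1.
RISK_NEITHER = 1
RISK_ASBESTOS_ONLY = 5
RISK_SMOKING_ONLY = 10
RISK_BOTH = 35

# Cases per 1,000 among people exposed to both factors, by mechanism.
cases = {
    "background": RISK_NEITHER,
    "asbestos alone": RISK_ASBESTOS_ONLY - RISK_NEITHER,
    "smoking alone": RISK_SMOKING_ONLY - RISK_NEITHER,
}
# Whatever is left over after the additive expectation is due to the combination.
additive_expectation = sum(cases.values())  # 1 + 4 + 9 = 14
cases["asbestos and smoking"] = RISK_BOTH - additive_expectation

shares = {mechanism: 100 * n / RISK_BOTH for mechanism, n in cases.items()}

preventable_by_smoking = shares["smoking alone"] + shares["asbestos and smoking"]
preventable_by_asbestos = shares["asbestos alone"] + shares["asbestos and smoking"]

for mechanism, n in cases.items():
    print(f"{mechanism}: {n}/{RISK_BOTH} ({shares[mechanism]:.0f}%)")
print(f"preventable by removing smoking: {preventable_by_smoking:.0f}%")
print(f"preventable by removing asbestos: {preventable_by_asbestos:.0f}%")
```

The two preventable fractions add up to more than 100% because the 21 cases arising from the combination are counted under both exposures.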

An Epidemiologist

I have argued in chapter 1 that epidemiology is part of public health, and therefore I might be quite content to accept the public health worker's approach. However, as an epidemiologist I do want to know more about the causation of disease, since what I learn may be relevant to other exposures or other diseases. Thus, I may be particularly interested in the combination of smoking and asbestos to produce cases of lung cancer. Rothman and Greenland (1998) have thus adopted an unambiguous epidemiological definition of interaction in which two factors are not "independent" if they are component causes in the same sufficient cause. This concept of independence of effects leads to the adoption of additivity of incidence rates as the state of "no interaction". Thus, the fact that the lung cancer rate in the group exposed to both factors (35/1000) is greater than the sum of the baseline risk (1/1000) plus the effect of asbestos alone (5/1000 - 1/1000) plus the effect of smoking alone (10/1000 - 1/1000) indicates that there are some cases of disease that are occurring due to the combination of exposures and which would not have occurred if either of the exposures had been eliminated. We can do the same calculations using the relative risks (relative to the group with exposure to neither factor) rather than incidence rates: the joint effect is 35.0 times, whereas it would be 1+(5.0-1)+(10.0-1)=14.0 if it were additive. This situation is summarized in figure 8.1. It shows that in the group exposed to both factors, 1 case (3%) occurred through unknown background exposures (U), 4 cases (11%) through mechanisms involving asbestos exposure alone (and not smoking) together with unknown background exposures (U), 9 cases (26%) occurred through mechanisms involving smoking alone (and not asbestos) together with unknown background exposures (U), and 21 cases (60%) occurred through mechanisms involving both factors together with unknown background exposures (U). This means that 86% of the cases (26% + 60%) could have been prevented by preventing smoking, whereas 71% (11% + 60%) could have been prevented by preventing asbestos exposure. Thus, the attributable risks for the individual factors of smoking (86%) and asbestos (71%) sum to more than 100% because of the cases that occur through mechanisms involving both exposures and which consequently could be prevented by preventing either exposure.

One apparent exception should be noted (Koopman, 1977). If two factors (A and B) belong to different sufficient causes, but a third factor (C) belongs to both sufficient causes, then A and B are competing for a single pool of susceptible individuals (those who have C). Consequently the joint effect of A and B will be less than additive (Miettinen (1982) reaches a similar conclusion based on a model of individual outcomes). However, this phenomenon can be incorporated directly into the causal constellation model by clarifying a previous ambiguity in the description of antagonism in the model's terms. Specifically, the absence of B can be included in the causal constellation involving A, and vice versa. Then, two factors would not be "independent" if the presence or absence of the factors (or particular levels of both factors) were component causes in the same sufficient cause (Greenland and Poole, 1988; Rothman and Greenland, 1998).

A Biologist

Finally, it should be stressed that this epidemiological concept of independence of effects is distinct from some biological concepts of independence. For example, Siemiatycki and Thomas (1981) give a definition in which two factors are considered to be biologically independent "if the qualitative nature of the mechanism of action of each is not affected by the presence or absence of the other". However, this concept does not lead to an unambiguous definition of independence of effects, and thus does not produce clear analytic implications. Rothman's concept of independence is at a more abstract conceptual level in which a particular biologic model, rather than being accepted as the "baseline", is itself evaluated in terms of the co-participation of factors in a sufficient cause. For example, two factors which act at different stages of a multistage process are not independent since they are joint components of at least one sufficient cause. This occurs irrespective of whether they affect each other's qualitative mechanism of action (the ambiguity in Siemiatycki and Thomas' formulation stems from the ambiguity of this concept).

8.2 Multiplicative and Additive Models

Rothman's approach is attractive because it is based on epidemiological concepts which have a clear biologic interpretation, and because it leads to an unambiguous definition of independence of effects which is identical to that obtained through public health considerations (Rothman et al, 1980). However, the analytic implications of these concepts are not straightforward, since assessing independence of effects is usually only one of the analytic goals of an epidemiological study. Rather, there are several other considerations which often favour the use of multiplicative models.

First, multiplicative models have convenient statistical properties. Estimation in non-multiplicative models may have problems of convergence, and inference based on the asymptotic standard errors may be flawed unless the study size is very large (Moolgavkar and Venzon, 1987).

Second, it has been argued that multiplicative models facilitate the assessment of the extent of unknown confounding or bias (Cornfield et al, 1959), although this is not always the case.

Third, if it is desired to keep statistical interaction (effect modification) to a minimum, then a multiplicative model may be more appropriate. It is not uncommon for risk factors to have approximately multiplicative effects (Saracci, 1987). This presumably occurs because they are a part of common causal processes, although other sufficient causes usually also operate, and exact multiplicativity may not occur. Nevertheless, in this situation there may be less masking of heterogeneity in calculating an overall rate ratio than in calculating an overall rate difference; there are also many instances of non-multiplicative departures from additivity, however (Selikoff et al, 1980; Saracci, 1987).

8.3: Joint Effects

These considerations imply an apparent dilemma. How can an analysis be conducted which combines the advantages of ratio measures of effect with the assessment of independence in terms of a departure from additivity? These apparently contradictory goals can be reconciled in analyses which concentrate on the estimation of separate and joint effects (Pearce, 1989).

Thus, when studying asbestos, smoking and lung cancer, relative risks might be presented for smoking (in non-asbestos workers), asbestos exposure (in non-smokers) and exposure to both factors, relative to persons exposed to neither factor. These relative risks would be adjusted for all other factors (e.g. age) which are potential confounders, but not of immediate interest as effect modifiers.

The estimation of separate and joint effects may be difficult when the factors of interest are closely correlated, and there are therefore only small numbers of people who are exposed to either factor alone. However, when it is feasible, this approach combines the best features of multiplicative models and additive independence assessment, but also permits readers with other concepts of independence to draw their own conclusions (as in table 8.1).

When the assessment of joint effects is a fundamental goal of the study, it can be accomplished by calculating stratum-specific effect estimates, as in Example 8.1 above. On the other hand, it is less clear how to proceed when effect modification is occurring, but assessment of joint effects is not an analytical goal. Conventional statistical analysis strategies are based on the principle that it is not appropriate to calculate an overall effect estimate if interaction is present. However, this principle is commonly ignored if the difference in stratum-specific effect estimates is not too great. In fact, standardized rate ratios (see chapter 9) have been developed for precisely this situation, and will consistently estimate meaningful epidemiological parameters even under heterogeneity (Greenland, 1982). Nevertheless, some authors have proposed modeling strategies in which the first step in the analysis involves testing for statistical interaction. A related approach has been the development of generalized families of models which include the additive and multiplicative models as special cases. An alternative general strategy can be based on epidemiological considerations (Pearce, 1989). The key difference is that interaction is assessed (rather than tested) in terms of a departure from additivity in order to elaborate an observed effect, rather than being tested for departure from an arbitrary effect measure as an essential initial analytic step. This procedure can be achieved within the confines of statistically convenient multiplicative models through the analysis of separate and joint effects.
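As a rough illustration of assessing (rather than testing) a departure from additivity once separate and joint relative risks are available, the following Python sketch uses the asbestos and smoking relative risks quoted earlier in this chapter (5.0, 10.0 and 35.0). The function name and the "share" calculation are illustrative assumptions, not part of the original text.

```python
def expected_joint_rr(rr_a, rr_b):
    """Joint relative risk expected under additive and multiplicative models.

    Both inputs are relative risks for one exposure alone, relative to
    persons exposed to neither factor.
    """
    additive = 1 + (rr_a - 1) + (rr_b - 1)
    multiplicative = rr_a * rr_b
    return additive, multiplicative


# Relative risks from the asbestos and smoking example in the text.
rr_asbestos = 5.0   # asbestos alone
rr_smoking = 10.0   # smoking alone
rr_both = 35.0      # observed joint effect

additive, multiplicative = expected_joint_rr(rr_asbestos, rr_smoking)

# Excess of the observed joint effect over additivity, in absolute terms
# and as a share of the joint relative risk.
excess = rr_both - additive
share_above_additivity = excess / rr_both

print(f"expected if additive:       {additive:.1f}")
print(f"expected if multiplicative: {multiplicative:.1f}")
print(f"observed joint effect:      {rr_both:.1f}")
print(f"excess over additivity:     {excess:.1f} ({share_above_additivity:.0%})")
```

In this example the observed joint effect (35.0) lies above the additive expectation (14.0) but below the multiplicative one (50.0), which is the kind of pattern that presenting separate and joint effects lets readers judge for themselves.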

Summary

The terms interaction and effect modification are used in a variety of contexts, with a variety of meanings. In particular, the term interaction has different meanings for biostatisticians, lawyers, clinicians, public health professionals, epidemiologists and biologists. In each instance, they are interested in the same question, namely does the effect of exposure A depend on whether exposure B is also present (or absent)? However, the word effect has different meanings in different contexts. In contrast to definitions based on statistical concepts, Rothman has adopted an unambiguous epidemiological definition of interaction in which two factors are not "independent" if they are component causes in the same sufficient cause. This leads to the adoption of additivity of incidence rates as the state of "no interaction". However, there are other considerations which generally favor the use of multiplicative models. This implies an apparent dilemma as to how an analysis can be conducted which combines the advantages of ratio measures of effect with the assessment of independence in terms of a departure from additivity. These apparently contradictory goals can be reconciled through the analysis of separate and joint effects.

References

Cornfield J, Haenszel W, Hammond EC, et al (1959). Smoking and lung cancer: recent evidence and a discussion of some questions. JNCI 22: 173-203.

Greenland S (1982). Interpretation and estimation of summary ratios under heterogeneity. Statist Med 1: 217-27.

Greenland S, Poole C (1988). Invariants and noninvariants in the concept of interdependent effects. Scand J Work Environ Health 14: 125-9.

Greenland S (1999). Relation of probability of causation to relative risk and doubling dose: a methodologic error that has become a social problem. AJPH 89: 1166-9.

Katsouyanni K, Pantazopoulou A, Touloumi G, et al (1993). Evidence for interaction between air pollution and high temperature in the causation of excess mortality. Arch Environ Health 48: 235-42.

Koopman JS (1977). Causal models and sources of interaction. Am J Epidemiol 106: 439-44.

Koopman JS (1981). Interaction between discrete causes. Am J Epidemiol 113: 716-24.

Kupper LL, Hogan MD (1978). Interaction in epidemiologic studies. Am J Epidemiol 108: 447-53.

Miettinen OS (1974). Confounding and effect modification. Am J Epidemiol 100: 350-3.

Miettinen OS (1982). Causal and preventive interdependence: elementary principles. Scand J Work Environ Health 8: 159-68.

Moolgavkar SH, Venzon DJ (1987). General relative risk models for epidemiologic studies. Am J Epidemiol 126: 949-61.

Pearce NE (1989). Analytic implications of epidemiological concepts of interaction. Int J Epidemiol 18: 976-80.

Rothman KJ, Greenland S, Walker AM (1980). Concepts of interaction. Am J Epidemiol 112: 467-70.

Rothman KJ, Greenland S (1998). Modern epidemiology. 2nd ed. Philadelphia: Lippincott-Raven.

Saracci R (1987). The interactions of tobacco smoking and other agents in cancer etiology. Epidemiologic Reviews 9: 175-93.

Selikoff I, Seidman H, Hammond E (1980). Mortality effects of cigarette smoking among amosite asbestos factory workers. JNCI 65: 507-13.

Siemiatycki J, Thomas DC (1981). Biological models and statistical interactions: an example from multistage carcinogenesis. Int J Epidemiol 10: 383-7.

Steenland K, Thun M (1986). Interaction between tobacco smoking and occupational exposures in the causation of lung cancer. Journal of Occupational Medicine 28: 110-8.

Walter SD, Holford TR (1978). Additive, multiplicative and other models for disease risks. Am J Epidemiol 108: 341-6.

Part III

Analysis and Interpretation of Studies

CHAPTER 9: Data Analysis
(In: Pearce N. A Short Introduction to Epidemiology. Wellington: CPHR, 2003)

In this chapter I describe the basic principles of data analysis in epidemiologic studies including the estimation of effects and calculation of confidence intervals while controlling for potential confounders. I only cover the basic methods for dichotomous exposures and dichotomous health outcomes (chapters 2 and 3) and I do not consider more complex study designs (chapter 4). Readers requiring a more formal and detailed statistical presentation are referred to standard texts (particularly Rothman and Greenland, 1998).

9.1: Basic Principles

Data Management

With the rapid advances in computer technology in recent years, almost any epidemiological study can be analysed on a personal computer (PC). In addition, a wide variety of software is available for data entry, data analysis and graphical presentation of data on PCs (much of which is not available for mainframe computers). One particularly useful package is EPI-INFO (Dean et al, 1990), which is available through WHO (Geneva) and CDC (Atlanta) and can be downloaded from http://www.cdc.gov/epiinfo. This package is particularly useful for data entry and editing, and can be used on small laptop computers in the field as well as on desktop computers. However, the same facilities are available in many other packages, some of which are more sophisticated both statistically and in terms of data management (e.g. Stata (Hills and De Stavola, 2002)). A catalogue of epidemiological resources, including epidemiological software which is available free of charge, or at minimal cost, has been produced by the Epidemiology Monitor, and this publication also has a regular feature reviewing such software (see the Epidemiology Monitor Website at http://www.epimonitor.net/). There is also an excellent epidemiology Excel spreadsheet (Episheet) available, which can be used to do most of the analyses described in this chapter (Rothman, 2002). It can be downloaded from http://www.oup-usa.org/epi/rothman/.

Given the huge amount of work usually involved in collecting data for epidemiologic studies, it is essential to examine the raw data very carefully for errors and to make every attempt to avoid errors in the transfer of data from questionnaires onto the computer. In most cases, the first step is to translate some of the information into numerical (or alphabetical) codes, following a set

of coding instructions that should have been prepared prior to data collection. For instance, a detailed occupational history may have been taken in a semi-narrative form, and must be subsequently coded. It is usually preferable to do this when entering the data directly onto a PC, since this minimizes transcription errors.

Once the data are coded and entered, programmes should be run that seek strange data, contradictions, and impossible data (e.g. a systolic blood pressure of 40 mm Hg). These programmes should not be restricted to a search for logic errors or impermissible symbols. They should also include procedures that identify values that lie outside plausible limits. The values being queried should be listed, and decisions on how the "errors" are dealt with should be documented. With many packages, this process can be conducted during the actual data entry since the range of permissible values (for numeric variables) or legal codes (for alphanumeric variables) can be specified, as well as variables which must not be left blank, conditional jumps (e.g. if the answer is "NO" the computer skips to the next relevant question), repeat fields (so that the value of a variable is set by default to that of the last record entered or displayed), and logical links between variables. The best method of data checking is to enter all of the data twice, and to compare the two files for discrepancies. This approach, combined with extensive edit checks at the time of data entry, should minimize errors.

Even with double data entry and sophisticated checking procedures, errors may occur, and it is therefore important to run further edit checks before data analysis begins. It is particularly important to finish all edit checks and to have a final version of the data file before any data analysis is done, both to avoid confusion, and also to avoid any possibility of the data coding and checking being influenced by the results of preliminary analyses. Once the data have been entered and edited, there is usually a major task of data management. This typically involves the use of a computer package to transform the data, compute new variables, and prepare new files suitable for statistical analysis.

Data Analysis

The basic aim of the analysis of a single study is to estimate the effect of exposure on the outcome under study while controlling for confounding and minimizing other possible sources of bias. In addition, when confounding and other sources of bias cannot be removed, then it is important to assess their likely strength and direction. This latter task was discussed in chapter 7. In this chapter I focus on the control of confounding.

Effect estimation

The basic effect measures, and methods of controlling confounding, are described below. Usually, in epidemiologic studies, we wish to measure the difference in disease occurrence between groups exposed and not exposed to a particular factor.

The analysis ideally should control simultaneously for all confounding factors. Control of confounding in the analysis involves stratifying the data according to the levels of the confounder(s) and calculating an effect estimate which summarizes the information across strata of the confounder(s). For example, controlling for age (grouped into 5 categories) and gender (with 2 categories) might involve grouping the data into the 10 (= 5 x 2) confounder strata and calculating a

summary effect estimate which is a weighted average of the stratum-specific effect estimates.

Confidence intervals

As well as estimating the effect of an exposure, it is also important to estimate the statistical precision of the effect estimate. The confidence interval (usually the 95% confidence interval) provides a range of values in which it is plausible (provided that there is no uncontrolled confounding or other bias) that the true effect may lie. If the statistical model is correct, and there is no bias, then the confidence intervals derived from an infinite series of study repetitions would contain the true effect with a frequency no less than its confidence level (Rothman and Greenland, 1998).

The usual practice is to use 90% or 95% confidence intervals, but these values are completely arbitrary. Given a large enough sample, an approximate 95% confidence interval for the true population mean is:

$$m \pm 1.96\,SE$$

where m is the observed mean of the sample, and SE is its standard error, estimated from the standard deviation of the sample divided by the square root of the sample size.

This confidence interval depends on two quantities (m and SE) which are estimated from the sample itself, and different results will be obtained from different samples. Provided that the samples are sufficiently large, then 95% of the time, the confidence interval estimated from the sample would contain the true population mean. One should note, however, that this is no guarantee that the interval from one's data contains the true value.

In most instances, epidemiologic data involve binomial (i.e. with persons in the denominator) or Poisson (i.e. with person-years in the denominator) outcome variables and ratio measures of effect. The estimated relative risk (rate ratio, risk ratio, odds ratio) has an approximate log normal distribution, and the ln(RR) can be written as the difference of the logarithms of the two compared risks:

$$\ln(RR) = \ln(R_1/R_0) = \ln(R_1) - \ln(R_0)$$

Thus (assuming no bias) the 95% confidence interval for the natural log (ln) of the relative risk is:

$$\ln(RR) \pm 1.96\,SE$$

Thus the confidence interval for the relative risk itself is:

$$RR\,e^{\pm 1.96\,SE}$$

P-Values

As discussed in chapter 6, the p-value is the probability that a test statistic as large as or larger than that observed could have arisen by chance if there is no bias and if the null hypothesis (of no association between exposure and disease) is correct. The test statistic defines the p-value and usually has the form:

$$z = D/SE$$

where D is the observed difference and SE is the standard error of the difference.

This provides a test statistic (z) which can be used to calculate the probability (p-value) that a difference as large as that observed would have occurred by

chance if the null hypothesis (that there is no difference in reality) were true.

In the past, p-values have often been used to describe the results of a study as "significant" or "not significant" on the basis of decision rules involving an arbitrary alpha level as a cutoff for significance (e.g. alpha=0.05). However, it is now recognised that there are major problems with this approach (Rothman and Greenland, 1998).

First, the p-value associated with a difference in outcome between two groups depends on two factors: the size of the difference; and the size of the study. A very small difference may be statistically significant if the study is very large, whereas a very large difference may not be significant if the study is very small. p-values thus combine two phenomena which should be kept separate: the size of the effect; and the size of the study used to measure it.

A second problem with significance testing is more fundamental. The purpose of significance testing is to reach a decision. However, in environmental research, decisions should ideally not be based on the results of a single study, but should be based on information from all available studies, as well as non-statistical considerations such as the plausibility and coherence of the effect in the light of current theoretical and empirical knowledge (see chapter 10).

The problems of significance testing can be avoided by recognizing that the principal aim of an individual study should be to estimate the size of the effect rather than just to decide whether or not an effect is present. The point estimate should be accompanied by a confidence interval (the interval estimate) which indicates the precision of the point estimate by providing a range of values within which it is most plausible that the true treatment effect may lie if no bias were present (Gardner and Altman, 1986; Rothman and Greenland, 1998). The point estimate reflects the size of the effect, whereas the confidence interval reflects the study size on which this effect estimate is based. This approach also facilitates the comparison of the study findings with those of previous studies. Note that all conventional statistical methods assume no bias is present. Because this assumption is rarely if ever correct, further considerations beyond the statistics presented here are always needed (see chapter 10).
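The point that a p-value mixes effect size with study size can be sketched numerically. The following Python example is only an illustration under stated assumptions: the two studies and their counts are hypothetical, the test statistic is taken as z = D/SE with D = ln(RR), and the standard error of ln(RR) is the Poisson form (1/a + 1/b)^0.5 given in section 9.2.

```python
import math


def rate_ratio_summary(a, b, y1, y0):
    """Rate ratio, 95% confidence limits and two-sided p-value.

    a, b   : cases in the exposed and non-exposed groups
    y1, y0 : person-years in the exposed and non-exposed groups
    """
    rr = (a / y1) / (b / y0)
    se = math.sqrt(1 / a + 1 / b)          # SE of ln(RR) under a Poisson model
    z = math.log(rr) / se                  # z = D/SE with D = ln(RR)
    p = math.erfc(abs(z) / math.sqrt(2))   # two-sided normal p-value
    lower = rr * math.exp(-1.96 * se)
    upper = rr * math.exp(1.96 * se)
    return rr, lower, upper, p


# Hypothetical studies with equal person-time in both groups.
small = rate_ratio_summary(a=20, b=10, y1=1_000, y0=1_000)
large = rate_ratio_summary(a=10_500, b=10_000, y1=1_000_000, y0=1_000_000)

for name, (rr, lower, upper, p) in [("small study", small), ("large study", large)]:
    print(f"{name}: RR = {rr:.2f}, 95% CI {lower:.2f} to {upper:.2f}, p = {p:.4f}")
```

In this sketch the small study has a doubling of the rate but a wide interval that includes 1.0, while the large study has a rate ratio of only 1.05 with a narrow interval that excludes 1.0. Reporting the point estimate with its interval keeps the two pieces of information separate.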

9.2: Basic Analyses

Measures of Disease Occurrence

The basic measures of disease occurrence and association have been introduced in chapter 2. In this section I consider them in more depth and show how to calculate confidence intervals for the commonly used measures. In the next section I extend these methods to adjust for potential confounders. I will only present large sample methods of analysis which have sample size requirements for valid

use. To avoid statistical bias, more complex techniques are required for analyses of studies involving very small numbers or sparse stratifications (Greenland et al, 2000). Once again, readers are referred to standard texts (particularly Rothman and Greenland, 1998) for a more comprehensive review of these methods. I will emphasise confidence intervals, but will also present methods for calculating p-values.

Table 9.1 shows the findings of a hypothetical incidence study of 20,000 persons followed for 10 years. As noted in chapter 2, three measures of disease incidence are commonly used in incidence studies.

The observed incidence rate in the non-exposed group (table 9.1) has the form:

$$I_0 = \frac{\text{cases}}{\text{person-time}} = \frac{b}{Y_0}$$

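Table 9.1

Findings from a hypothetical cohort study of 20,000 persons followed for 10 years

                                       Exposed        Non-exposed    Ratio
---------------------------------------------------------------------------
Cases                                   1,813 (a)        952 (b)
Non-cases                               8,187 (c)      9,048 (d)
Initial population size                10,000 (N1)    10,000 (N0)
---------------------------------------------------------------------------
Person-years                           90,635 (Y1)    95,163 (Y0)
Incidence rate                         0.0200 (I1)    0.0100 (I0)    2.00
Incidence proportion (average risk)    0.1813 (R1)    0.0952 (R0)    1.90
Incidence odds                         0.2214 (O1)    0.1052 (O0)    2.10
---------------------------------------------------------------------------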

The natural logarithm of I0 has an approximate standard error (under the Poisson model for random variation in b) of:

$$SE[\ln(I_0)] = (1/b)^{0.5}$$

and an approximate 95% confidence interval for the incidence rate is thus:

$$I_0\,e^{\pm 1.96\,SE}$$

The observed incidence proportion in the non-exposed group has the form:

$$R_0 = \frac{\text{cases}}{\text{persons}} = \frac{b}{N_0}$$

The natural logarithm of R0 has an approximate standard error (under the binomial model for random variation in b) of:

$$SE[\ln(R_0)] = (1/b - 1/N_0)^{0.5}$$

and an approximate 95% confidence interval for the incidence proportion is thus:

$$R_0\,e^{\pm 1.96\,SE}$$

The observed incidence odds in the non-exposed group has the form:

$$O_0 = \frac{\text{cases}}{\text{non-cases}} = \frac{b}{d}$$

The natural log of the incidence odds (ln(O0)) has (under a binomial model) an approximate standard error of:

$$SE[\ln(O_0)] = (1/b + 1/d)^{0.5}$$

and a 95% confidence interval for O0 is:

$$O_0\,e^{\pm 1.96\,SE}$$

These three measures of disease occurrence all involve the same numerator: the number of incident cases of disease (b). They differ in whether their denominators represent person-years at risk (Y0), persons at risk (N0), or survivors (d).

Measures of Effect

Corresponding to these three measures of disease occurrence, there are three principal ratio measures of effect which can be used in incidence studies: the rate ratio, the risk ratio, and the odds ratio. In incidence case-control studies, the measure of effect is always the odds ratio (though what this is estimating depends on how the controls were chosen). In prevalence studies, the effect measure is usually the prevalence odds ratio, and the statistical methods are identical to those used in incidence case-control studies.

The observed incidence rate ratio has the form (table 9.1):

$$RR = \frac{I_1}{I_0} = \frac{a/Y_1}{b/Y_0}$$

An approximate p-value for the null hypothesis that the rate ratio equals the null value of 1.0 can be obtained using the person-time version of the Mantel-Haenszel chi-square (Breslow and Day, 1987). This test statistic compares the observed number of exposed cases with the number expected under the null hypothesis that I1 = I0:

$$\chi^2 = \frac{[\mathrm{Obs}(a) - \mathrm{Exp}(a)]^2}{\mathrm{Var}(\mathrm{Exp}(a))} = \frac{[a - Y_1 M_1/T]^2}{M_1 Y_1 Y_0/T^2}$$

where M1, Y1, Y0 and T are as depicted in table 9.1. Here M1 = a + b is the total number of cases and T = Y1 + Y0 is the total person-time.
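The three occurrence measures for the non-exposed group, with their approximate 95% confidence intervals, can be computed directly from table 9.1. The Python sketch below is an illustration only; the helper name `log_scale_ci` is an assumption introduced here, and the formulas are the large-sample ones given in this section.

```python
import math

# Non-exposed group of table 9.1
b = 952        # cases
d = 9_048      # non-cases
N0 = 10_000    # persons at risk at the start of follow-up
Y0 = 95_163    # person-years at risk


def log_scale_ci(estimate, se_log, z=1.96):
    """Confidence interval of the form estimate * exp(+/- z * SE)."""
    return estimate * math.exp(-z * se_log), estimate * math.exp(z * se_log)


# Incidence rate: Poisson model for b
I0 = b / Y0
I0_ci = log_scale_ci(I0, math.sqrt(1 / b))

# Incidence proportion (average risk): binomial model for b
R0 = b / N0
R0_ci = log_scale_ci(R0, math.sqrt(1 / b - 1 / N0))

# Incidence odds: binomial model
O0 = b / d
O0_ci = log_scale_ci(O0, math.sqrt(1 / b + 1 / d))

for label, est, (lower, upper) in [
    ("incidence rate", I0, I0_ci),
    ("incidence proportion", R0, R0_ci),
    ("incidence odds", O0, O0_ci),
]:
    print(f"{label}: {est:.4f} (95% CI {lower:.4f} to {upper:.4f})")
```

The point estimates reproduce the non-exposed column of table 9.1 (0.0100, 0.0952 and 0.1052); the intervals are multiplicative around each estimate because they are constructed on the log scale.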

The natural logarithm of the rate ratio has (under a Poisson model for a and b) an approximate standard error of:

$$SE[\ln(RR)] = (1/a + 1/b)^{0.5}$$

An approximate 95% confidence interval for the rate ratio is then given by (Rothman and Greenland, 1998):

$$RR\,e^{\pm 1.96\,SE}$$

The risk ratio has the form:

$$RR = \frac{R_1}{R_0} = \frac{a/N_1}{b/N_0}$$

An approximate p-value for the null hypothesis that the risk ratio equals the null value of 1.0 can be obtained using the Mantel-Haenszel chi-square (Mantel and Haenszel, 1959):

$$\chi^2 = \frac{[\mathrm{Obs}(a) - \mathrm{Exp}(a)]^2}{\mathrm{Var}(\mathrm{Exp}(a))} = \frac{[a - N_1 M_1/T]^2}{M_1 M_0 N_1 N_0/[T^2(T-1)]}$$

where M1, M0, N1, N0 and T are as depicted in table 9.1. Here M1 = a + b and M0 = c + d are the total numbers of cases and non-cases, and T = N1 + N0 is the total number of persons.

The natural logarithm of the risk ratio has (under a binomial model for a and b) an approximate standard error of:

$$SE[\ln(RR)] = (1/a - 1/N_1 + 1/b - 1/N_0)^{0.5}$$

An approximate 95% confidence interval for the risk ratio is then given by:

$$RR\,e^{\pm 1.96\,SE}$$

The incidence odds ratio has the form:

$$OR = \frac{O_1}{O_0} = \frac{a/c}{b/d} = \frac{ad}{bc}$$

An approximate p-value for the hypothesis that the odds ratio equals the null value of 1.0 can be obtained from the Mantel-Haenszel chi-square (Mantel and Haenszel, 1959):

$$\chi^2 = \frac{[\mathrm{Obs}(a) - \mathrm{Exp}(a)]^2}{\mathrm{Var}(\mathrm{Exp}(a))} = \frac{[a - N_1 M_1/T]^2}{M_1 M_0 N_1 N_0/[T^2(T-1)]}$$

where M1, M0, N1, N0 and T are as depicted in table 9.1.

The natural logarithm of the odds ratio has (under a binomial model) an approximate standard error of:

$$SE[\ln(OR)] = (1/a + 1/b + 1/c + 1/d)^{0.5}$$

An approximate 95% confidence interval for the odds ratio is then given by:

$$OR\,e^{\pm 1.96\,SE}$$
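The three ratio measures, their confidence intervals and the Mantel-Haenszel chi-squares can be worked through with the data of table 9.1. The following Python sketch is illustrative only: the margins M1 = a + b and M0 = c + d and the totals T are computed from the table cells (an assumption about notation), and the p-value helper uses the fact that a one degree-of-freedom chi-square is the square of a standard normal deviate.

```python
import math

# Table 9.1
a, b = 1_813, 952          # cases: exposed, non-exposed
c, d = 8_187, 9_048        # non-cases: exposed, non-exposed
N1, N0 = 10_000, 10_000    # persons at start of follow-up
Y1, Y0 = 90_635, 95_163    # person-years

M1, M0 = a + b, c + d      # total cases and non-cases (assumed notation)


def log_scale_ci(estimate, se_log, z=1.96):
    """Confidence interval of the form estimate * exp(+/- z * SE)."""
    return estimate * math.exp(-z * se_log), estimate * math.exp(z * se_log)


def chi2_p_value(chi2):
    """Upper-tail p-value for a one degree-of-freedom chi-square."""
    return math.erfc(math.sqrt(chi2 / 2))


# Rate ratio (person-time data)
T_py = Y1 + Y0
rate_ratio = (a / Y1) / (b / Y0)
rate_ci = log_scale_ci(rate_ratio, math.sqrt(1 / a + 1 / b))
chi2_rate = (a - Y1 * M1 / T_py) ** 2 / (M1 * Y1 * Y0 / T_py**2)

# Risk ratio and odds ratio (count data share the same chi-square)
T = N1 + N0
risk_ratio = (a / N1) / (b / N0)
risk_ci = log_scale_ci(risk_ratio, math.sqrt(1 / a - 1 / N1 + 1 / b - 1 / N0))
odds_ratio = (a * d) / (b * c)
odds_ci = log_scale_ci(odds_ratio, math.sqrt(1 / a + 1 / b + 1 / c + 1 / d))
chi2_count = (a - N1 * M1 / T) ** 2 / (M1 * M0 * N1 * N0 / (T**2 * (T - 1)))

for label, est, (lower, upper), chi2 in [
    ("rate ratio", rate_ratio, rate_ci, chi2_rate),
    ("risk ratio", risk_ratio, risk_ci, chi2_count),
    ("odds ratio", odds_ratio, odds_ci, chi2_count),
]:
    print(f"{label}: {est:.2f} (95% CI {lower:.2f} to {upper:.2f}), "
          f"chi-square = {chi2:.1f}, p = {chi2_p_value(chi2):.1e}")
```

With a cohort of this size the chi-squares are very large and the p-values are vanishingly small, so the confidence intervals carry most of the useful information.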

9.3: Control of Confounding

In general, control of confounding requires careful use of a priori knowledge, together with assessment of the extent to which the effect estimate changes when the factor is controlled in the analysis. Most epidemiologists prefer to make a decision based on the latter criterion, although it can be misleading, particularly if misclassification is present (Greenland and Robins, 1985a). The decision to control for a presumed confounder can certainly be made with more confidence if there is supporting prior knowledge that the factor is predictive of disease.

There are two methods of calculating a summary effect estimate to control confounding: pooling and standardisation (Rothman and Greenland, 1998).

Pooling

Pooling involves calculating a summary effect estimate assuming stratum-specific effects are equal. There are a number of different methods of obtaining pooled effect estimates, but a commonly used method which is both simple and close to being statistically optimal (even when there are small numbers in all strata) is the method of Mantel and Haenszel (1959).

The Mantel-Haenszel summary rate ratio has the form:

$$RR = \frac{\sum a_i Y_{0i}/T_i}{\sum b_i Y_{1i}/T_i}$$

where Ti = Y1i + Y0i

An approximate p-value for the null hypothesis that the summary rate ratio is 1.0 can be obtained from the person-time version of the one degree-of-freedom Mantel-Haenszel summary chi-square (Shore et al, 1976):

$$\chi^2 = \frac{[\sum \mathrm{Obs}(a) - \sum \mathrm{Exp}(a)]^2}{\sum \mathrm{Var}(\mathrm{Exp}(a))} = \frac{[\sum a_i - \sum Y_{1i} M_{1i}/T_i]^2}{\sum M_{1i} Y_{1i} Y_{0i}/T_i^2}$$

where M1i, Y1i, Y0i and Ti are as depicted in table 9.1, calculated within stratum i.

An approximate standard error for the natural log of the rate ratio is (Greenland and Robins, 1985b):

$$SE = \frac{[\sum M_{1i} Y_{1i} Y_{0i}/T_i^2]^{0.5}}{[(\sum a_i Y_{0i}/T_i)(\sum b_i Y_{1i}/T_i)]^{0.5}}$$

Thus, an approximate 95% confidence interval for the summary rate ratio is then given by:

$$RR\,e^{\pm 1.96\,SE}$$

The Mantel-Haenszel summary risk ratio has the form:

$$RR = \frac{\sum a_i N_{0i}/T_i}{\sum b_i N_{1i}/T_i}$$
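The Mantel-Haenszel rate ratio calculations above can be sketched in a few lines of Python. The two age strata below are hypothetical data invented for this illustration (they are not from the text); they were chosen so that the stratum-specific rate ratios are both 2.0 while the crude rate ratio is strongly confounded by age.

```python
import math

# Hypothetical person-time data in two age strata (not from the text).
# Each tuple is (a_i, Y1_i, b_i, Y0_i): exposed cases, exposed person-years,
# non-exposed cases, non-exposed person-years.
strata = [
    (8, 4_000, 20, 20_000),   # younger stratum, rate ratio 2.0
    (80, 8_000, 10, 2_000),   # older stratum, rate ratio 2.0
]

num = den = var_sum = observed = expected = 0.0
for a, y1, b, y0 in strata:
    t = y1 + y0                 # T_i = Y1_i + Y0_i
    m1 = a + b                  # M1_i, total cases in the stratum
    num += a * y0 / t
    den += b * y1 / t
    var_sum += m1 * y1 * y0 / t**2
    observed += a
    expected += y1 * m1 / t

rr_mh = num / den
se = math.sqrt(var_sum) / math.sqrt(num * den)   # SE of ln(RR_MH)
lower = rr_mh * math.exp(-1.96 * se)
upper = rr_mh * math.exp(1.96 * se)

chi2 = (observed - expected) ** 2 / var_sum      # one degree of freedom
p_value = math.erfc(math.sqrt(chi2 / 2))

# Crude (unstratified) rate ratio for comparison
total_a = sum(s[0] for s in strata)
total_y1 = sum(s[1] for s in strata)
total_b = sum(s[2] for s in strata)
total_y0 = sum(s[3] for s in strata)
crude_rr = (total_a / total_y1) / (total_b / total_y0)

print(f"crude rate ratio:            {crude_rr:.2f}")
print(f"Mantel-Haenszel rate ratio:  {rr_mh:.2f} (95% CI {lower:.2f} to {upper:.2f})")
print(f"summary chi-square:          {chi2:.2f}, p = {p_value:.4f}")
```

In this constructed example the crude rate ratio is above 5 because the exposed group is older, whereas the pooled estimate recovers the common stratum-specific value of 2.0.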

An approximate p-value for the hypothesis that the summary risk ratio is 1.0 can be obtained from the one degree-of-freedom Mantel-Haenszel summary chi-square (Mantel and Haenszel, 1959):

$$\chi^2 = \frac{[\sum \mathrm{Obs}(a) - \sum \mathrm{Exp}(a)]^2}{\sum \mathrm{Var}(\mathrm{Exp}(a))} = \frac{[\sum a_i - \sum N_{1i} M_{1i}/T_i]^2}{\sum M_{1i} M_{0i} N_{1i} N_{0i}/[T_i^2(T_i-1)]}$$

where M1i, M0i, N1i, N0i and Ti are as depicted in table 9.1.

An approximate standard error for the natural log of the risk ratio is (Greenland and Robins, 1985b):

$$SE = \frac{[\sum (M_{1i} N_{1i} N_{0i}/T_i^2 - a_i b_i/T_i)]^{0.5}}{[(\sum a_i N_{0i}/T_i)(\sum b_i N_{1i}/T_i)]^{0.5}}$$

Thus, an approximate 95% confidence interval for the summary risk ratio is then given by:

$$RR\,e^{\pm 1.96\,SE}$$

The Mantel-Haenszel summary odds ratio has the form:

$$OR = \frac{\sum a_i d_i/T_i}{\sum b_i c_i/T_i}$$

An approximate p-value for the hypothesis that the summary odds ratio is 1.0 can be obtained from the one degree-of-freedom Mantel-Haenszel summary chi-square (Mantel and Haenszel, 1959):

$$\chi^2 = \frac{[\sum \mathrm{Obs}(a) - \sum \mathrm{Exp}(a)]^2}{\sum \mathrm{Var}(\mathrm{Exp}(a))} = \frac{[\sum a_i - \sum N_{1i} M_{1i}/T_i]^2}{\sum M_{1i} M_{0i} N_{1i} N_{0i}/[T_i^2(T_i-1)]}$$

where M1i, M0i, N1i, N0i and Ti are as depicted in table 9.1.

An approximate standard error for the natural log of the odds ratio (under a binomial or hypergeometric model) is (Robins et al, 1986):

$$SE = \left[ \frac{\sum PR}{2R_+^2} + \frac{\sum (PS + QR)}{2R_+S_+} + \frac{\sum QS}{2S_+^2} \right]^{0.5}$$

where:

P = (ai + di)/Ti
Q = (bi + ci)/Ti
R = aidi/Ti
S = bici/Ti
R+ = ΣR
S+ = ΣS

Thus, an approximate 95% confidence interval for the summary odds ratio is then given by:

$$OR\,e^{\pm 1.96\,SE}$$

Standardisation

Standardisation is an alternative approach to obtaining a summary effect estimate (Miettinen, 1974; Rothman and Greenland, 1998). Pooling involves calculating the effect estimate under the assumption that the measure (e.g. the rate ratio) would be the same (uniform) across strata if random error were absent. In contrast, standardisation involves taking a weighted average of the disease occurrence across strata (e.g. the standardized rate) and then comparing the standardized occurrence measure between exposed and non-exposed (e.g. the standardized rate ratio) with no assumptions of uniformity of effect. Standardisation is more prone than pooling to suffer from statistical instability due to small numbers in

specific strata; by comparison, pooling with Mantel-Haenszel estimators is robust and in general its statistical stability depends on the overall numbers rather than the numbers in specific strata. However, direct standardisation has practical advantages when more than two groups are being compared, e.g. when comparing multiple exposure groups or making comparisons between multiple countries or regions, and does not require the assumption of constant effects across strata.

The standardized rate has the form:

$$R = \frac{\sum w_i R_i}{\sum w_i}$$

The natural log of the standardized rate has an approximate standard error (under the Poisson model for random error) of:

$$SE = \frac{[\sum w_i^2 R_i/Y_i]^{0.5}}{R \sum w_i}$$

where Yi is the person-time in stratum i. An approximate 95% confidence interval for the standardized rate is thus:

$$R\,e^{\pm 1.96\,SE}$$

The standardized risk has the form:

$$R = \frac{\sum w_i R_i}{\sum w_i}$$

The natural log of the standardized risk has an approximate standard error (under the binomial model for random error) of:

$$SE = \frac{[\sum w_i^2 R_i(1-R_i)/N_i]^{0.5}}{R \sum w_i}$$

where Ni is the number of persons in stratum i. An approximate 95% confidence interval for the standardized risk is thus:

$$R\,e^{\pm 1.96\,SE}$$

Standardisation is not usually used for odds, since the odds is only used in the context of a case-control study, where the odds ratio is the effect measure of interest, but standardized odds ratios can be computed from case-control data (Miettinen, 1985; Rothman and Greenland, 1998).

A common choice of weights in international comparisons is Segi's World Population (Segi, 1960) shown in table 9.2, although it does reflect a developed countries bias in its age structure. In etiologic studies a better approach is to use the structure of the overall source population as the weights when calculating standardized rates or risks in subgroups of the source population. When one is specifically interested in the effects that exposure had, or would have, on a particular subpopulation, then weights should be taken from that subpopulation.

Multiple Regression

Multiple regression allows for the simultaneous control of more confounders by "smoothing" the data across confounder strata. In particular, rate ratios (based on person-time data) can be modelled using Poisson

log-linear rate regression, risk ratios can Table 9.2
be modelled using binomial log-linear risk
regression, and odds ratios can be Segis World population
modelled using binomial logistic
regression (Pearce et al, 1988; Rothman Age-group Population
and Greenland, 1998). -----------------------------
0-4 years 12,000
Similarly, continuous outcome variables 5-9 years 10,000
(e.g. in a cross-sectional study) can be 10-14 years 9,000
modelled with standard multiple linear 15-19 years 9,000
regression methods. These models all 20-24 years 8,000
have similar forms, with minor variations 25-29 years 8,000
to take into account the different data 30-34 years 6,000
types. They provide powerful tools when 35-39 years 6,000
used appropriately, but are often used 40-44 years 6,000
inappropriately, and should always be 45-49 years 6,000
used in combination with the more 50-54 years 5,000
straightforward methods presented here 55-59 years 4,000
(Rothman and Greenland, 1998). 60-64 years 4,000
Mathematical modelling methods and 65-69 years 3,000
issues are reviewed in depth in a number 70-74 years 2,000
of standard texts (e.g. Breslow and Day, 75-59 years 1,000
1980, 1987; Checkoway et al, 1989; 80-84 years 500
Clayton and Hills, 1993; Rothman and 85+ years 500
Greenland, 1998), and will not be -----------------------------
discussed in detail here. Total 100,000
-----------------------------
Source: Segi (1960)
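
The direct standardisation formulas given earlier in this section can be sketched in a few lines of Python. This is a minimal illustration rather than a definitive implementation: the function name `standardized_rate`, the variable names and the case counts are all hypothetical, and only the weights come from table 9.2.

```python
import math

# Segi's World Population weights from table 9.2 (ages 0-4 through 85+).
SEGI_WEIGHTS = [12000, 10000, 9000, 9000, 8000, 8000, 6000, 6000, 6000,
                6000, 5000, 4000, 4000, 3000, 2000, 1000, 500, 500]


def standardized_rate(cases, person_time, weights):
    """Directly standardized rate with an approximate 95% interval.

    cases[i] and person_time[i] are the events and person-time (Y_i) in
    stratum i; weights[i] is the standard population weight w_i.
    Returns (rate, lower limit, upper limit).
    """
    rates = [d / y for d, y in zip(cases, person_time)]  # stratum rates R_i
    sum_w = sum(weights)
    r = sum(w * ri for w, ri in zip(weights, rates)) / sum_w
    # SE of ln(R) under the Poisson model:
    # [sum w_i^2 R_i / Y_i]^0.5 / (R * sum w_i)
    var_num = sum(w ** 2 * ri / y
                  for w, ri, y in zip(weights, rates, person_time))
    se = math.sqrt(var_num) / (r * sum_w)
    factor = math.exp(1.96 * se)
    return r, r / factor, r * factor


# Hypothetical data, invented for illustration only: 10,000 person-years
# in each age group, with case counts rising with age.
example_person_time = [10000.0] * len(SEGI_WEIGHTS)
example_cases = [1, 1, 1, 2, 2, 3, 4, 6, 9, 13, 18, 25, 33, 42, 52, 60, 65, 70]

rate, lower, upper = standardized_rate(example_cases, example_person_time,
                                       SEGI_WEIGHTS)
print(f"Standardized rate per 100,000 person-years: {rate * 1e5:.1f} "
      f"(95% CI {lower * 1e5:.1f} to {upper * 1e5:.1f})")
```

The interval is calculated on the log scale, as in the text, so it is asymmetric around the rate and cannot fall below zero. A standardized risk could be handled in the same way by replacing the Poisson variance term with the binomial one.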

Summary

The basic aim of the analysis of a single study is to estimate the effect of exposure on the outcome under study while controlling for confounding and minimizing other possible sources of bias. In addition, when confounding and other sources of bias cannot be removed, then it is important to assess their likely strength and direction. Control of confounding in the analysis involves stratifying the data according to the levels of the confounder(s) and calculating an effect estimate which summarizes the information across strata of the confounder(s). In general, control of confounding requires careful use of a priori knowledge, together with assessment of the extent to which the effect estimate changes when the factor is controlled in the analysis. There are two basic methods of calculating a summary effect estimate to control confounding: pooling and standardisation. Multiple regression allows for the simultaneous control of more confounders by "smoothing" the data across confounder strata. It provides a powerful tool when used appropriately, but is often used inappropriately, and should always be used in combination with the more straightforward methods presented here.
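
The comparison between a crude effect estimate and one that is controlled for a stratification factor can be illustrated with a short sketch. It assumes that the pooled estimate is the Mantel-Haenszel rate ratio (Mantel and Haenszel, 1959). That choice, the function names and the data are illustrative assumptions, not taken from the text.

```python
# Each stratum: (exposed cases, exposed person-time,
#                unexposed cases, unexposed person-time).
# Hypothetical data in which the stratifying factor is a confounder:
# it raises the disease rate and is more common among the exposed.
STRATA = [
    (20, 1000.0, 40, 4000.0),    # low-risk stratum, mostly unexposed
    (400, 4000.0, 50, 1000.0),   # high-risk stratum, mostly exposed
]


def crude_rate_ratio(strata):
    """Rate ratio that ignores the stratification factor."""
    a = sum(s[0] for s in strata)
    y1 = sum(s[1] for s in strata)
    b = sum(s[2] for s in strata)
    y0 = sum(s[3] for s in strata)
    return (a / y1) / (b / y0)


def mantel_haenszel_rate_ratio(strata):
    """Pooled (Mantel-Haenszel) rate ratio across strata."""
    num = sum(a * y0 / (y1 + y0) for a, y1, b, y0 in strata)
    den = sum(b * y1 / (y1 + y0) for a, y1, b, y0 in strata)
    return num / den


crude = crude_rate_ratio(STRATA)
pooled = mantel_haenszel_rate_ratio(STRATA)
print(f"Crude rate ratio:  {crude:.2f}")
print(f"Pooled rate ratio: {pooled:.2f}")
```

In these invented data the rate ratio is 2.0 within each stratum, the pooled estimate is 2.0, and the crude estimate is about 4.67. The change in the estimate when the factor is controlled is the signal that the factor is acting as a confounder.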
References

Breslow NE, Day NE (1980). Statistical methods in cancer research. Vol I: The analysis of case-control studies. Lyon, France: IARC.

Breslow NE, Day NE (1987). Statistical methods in cancer research. Vol II: The analysis of cohort studies. Lyon, France: IARC.

Checkoway HA, Pearce NE, Crawford-Brown DJ (1989). Research methods in occupational epidemiology. New York: Oxford University Press.

Clayton D, Hills M (1993). Statistical models in epidemiology. Oxford: Oxford Scientific Publications.

Dean J, Dean A, Burton A, Dicker R (1990). Epi Info. Version 5.01. Atlanta, GA: CDC.

Gardner MJ, Altman DG (1986). Confidence intervals rather than p values: estimation rather than hypothesis testing. Br Med J 292: 746-50.

Greenland S, Robins JM (1985a). Confounding and misclassification. Am J Epidemiol 122: 495-506.

Greenland S, Robins JM (1985b). Estimation of a common effect parameter from sparse follow-up data. Biometrics 41: 55-68.

Greenland S, Schwartzbaum JA, Finkle WD (2000). Problems due to small samples and sparse data in conditional logistic regression analysis. Am J Epidemiol 191: 530-9.

Hills M, De Stavola BL (2002). A short introduction to Stata for biostatistics. London: Timberlake.

Mantel N, Haenszel W (1959). Statistical aspects of the analysis of data from retrospective studies of disease. J Natl Cancer Inst 22: 719-48.

Miettinen OS (1974). Standardization of risk ratios. Am J Epidemiol 96: 383-8.

Miettinen OS (1985). Theoretical epidemiology. New York: Wiley and Sons.

Pearce NE, Checkoway HA, Dement JM (1988). Exponential models for analyses of time-related factors: illustrated with asbestos textile worker mortality data. J Occ Med 30: 517-22.

Robins JM, Breslow NE, Greenland S (1986). Estimation of the Mantel-Haenszel variance consistent with both sparse-data and large-strata limiting models. Biometrics 42: 311-23.

Rothman KJ (2002). Epidemiology: an introduction. New York: Oxford University Press.

Rothman KJ, Greenland S (1998). Modern epidemiology. 2nd ed. Philadelphia: Lippincott-Raven.

Segi M (1960). Cancer mortality for selected sites in 24 countries (1950-1957). Sendai, Japan: Department of Public Health, Tohoku University School of Medicine.

Shore RE, Pasternak BS, Curnen MG (1976). Relating influenza epidemics to childhood leukaemia in tumor registries without a defined population base. Am J Epidemiol 103: 527-35.

CHAPTER 10: Interpretation
[In: Pearce N. A Short Introduction to Epidemiology. Wellington: CPHR, 2003]

In this chapter I first consider the issues involved in interpreting the findings of a single epidemiological study. I then consider problems of interpretation of all of the available evidence. Interpreting the findings of a single study includes considering the strength and precision of the effect estimate and the possibility that it may have been affected by various possible biases (confounding, selection bias, information bias). If it is concluded that the observed associations are likely to be valid, then attention shifts to more general causal inference, which should be based on all available information. In both situations, it should be stressed that epidemiological studies almost always contain potential biases, and the focus should be on assessing the likely direction and magnitude of the biases, and whether they could explain the observed associations.

10.1: Appraisal of a Single Study

It is easy to criticize an epidemiological study. Populations do not usually randomize themselves by exposure status, do not always respond to requests to participate in epidemiological studies, may supply incomplete or inaccurate exposure histories for known or possible risk factors, and cannot be asked about unknown risk factors. Thus, although some studies are clearly better than others, it is important to emphasize that perfect epidemiological studies do not exist. Furthermore, it is usually not possible, nor desirable, to reach conclusions on the basis of the findings of a single study, and it is essential to consider all of the available evidence.

Nevertheless, when confronted with a new study, perhaps with unexpected findings, it is valuable to first consider possible explanations for the observed associations found (or a lack of association) in the study, before proceeding to consider other evidence. However, the emphasis should not be on simply preparing a list of possible biases (e.g. Feinstein, 1988). Rather, it is essential to attempt to assess the likely strength and direction of each possible bias, and to assess whether these biases (and their possible interactions) could explain the observed associations.

What is the magnitude and precision of the effect estimate?

As discussed in chapter 6, random error (lack of precision) will occur in any epidemiologic study, just as it occurs in experimental studies. The possible role of random error is often addressed through the question "could the association be due to chance alone?", and this issue is usually assessed by calculating the p-value. This is the probability (assuming that there are no biases) that a test statistic at least as large as that actually observed would be found in a study if the null hypothesis were true, i.e. if there was in reality no causal effect of exposure. However, recent reviews have stressed the limitations of p-values and significance testing (Rothman, 1978; Gardner and Altman, 1986; Poole, 1987; Pearce and Jackson, 1988). Foremost among these is that significance testing attempts to reach a decision on the basis of the data from a single study, whereas what is more important is the strength and precision of the effect estimate and whether the findings of a particular study are consistent with those of previous studies. These issues are better addressed by calculating confidence intervals rather than p-values (Gardner and Altman, 1986; Rothman and Greenland, 1998). Similarly, the possibility that the lack of a statistically significant association could be due to lack of precision (lack of study power) is more appropriately addressed by considering the confidence interval of the effect estimate rather than by making post hoc power calculations (Smith and Bates, 1992).

What are the likely strengths and directions of possible biases?

Systematic error is distinguished from random error in that it would be present even with an infinitely large study, whereas random error can be reduced by increasing the study size. Thus, systematic error, or "bias", occurs if there is a systematic difference between what the study is actually estimating and what it is intended to estimate. The types of bias (confounding, selection bias, information bias) have already been discussed in chapter 7. In the current context the key issue is that any epidemiologic study will involve biases. The problem is not to identify possible biases (these will almost always exist), but rather to ascertain what direction they are likely to be in, and how strong they are likely to be.

Confounding

In assessing whether an observed association could be due to confounding, the first consideration is whether all potential confounders have been appropriately controlled for or appropriately assessed (e.g. by collecting and using confounder information in a sample of study participants). If not, it is essential to assess the potential strength and direction of uncontrolled confounding.

In some areas of epidemiologic research, e.g. occupational and environmental studies, the strength of uncontrolled confounding is often less than might be expected. For example, Axelson (1978) has shown that for plausible estimates of the smoking prevalence in occupational populations, confounding by smoking can rarely account for a relative risk of lung cancer of greater than 1.5. Similarly, Siemiatycki et al (1988) have found that confounding by smoking is generally even weaker for internal comparisons (in which exposed workers are compared with non-exposed workers in the same factory or industry). On the other hand, the potential for confounding can be severe in studies of lifestyle and related factors (e.g. diet, nutrition, exercise).

It is unreasonable to simply assume that a strong association could be due to confounding by unknown risk factors, since to be a strong confounder a factor must be a very strong risk factor as well as being strongly associated with exposure. For example, if an occupational study found a relative risk of 2.0 for lung cancer in exposed workers, it is highly unlikely that this could be due to confounding by smoking, and it would be unreasonable to dismiss the study findings merely because smoking information had not been available. On the other hand, small relative risks (e.g. those in the range of 0.7-1.5, as frequently occur in dietary studies) are not so difficult to explain by lack of measurement, or poor measurement and control, of confounders.

Selection bias

Whereas confounding generally involves biases inherent in the source population, selection bias involves biases arising from the procedures by which the study subjects are chosen from the source population. As with confounding, if it is not possible to directly control for selection bias, it still may be possible to assess its likely strength and direction. It is unreasonable to dismiss the findings of a particular study because of possible selection bias, without at least attempting to assess which direction the possible selection bias would have been in, and how strong it might have been.

Information bias

With regard to information bias, the key issue is whether misclassification is likely to have been differential or non-differential. In the latter case, the bias will usually be in a known direction, i.e. towards the null. If misclassification has been differential, then it is important to attempt to assess what direction the bias is likely to have been in. The important issue is not whether information bias could have occurred (this is almost always the case since there are almost always problems of misclassification of exposure and/or disease) but rather the likely direction and strength of such bias. In particular, if a study has yielded a positive finding (i.e. an effect estimate markedly different from the null value) then it is not valid to dismiss it because of the possibility of non-differential misclassification, or differential misclassification that is likely (although not guaranteed) to produce a bias towards the null.

Summary of Issues of Systematic Error

In summary, when assessing whether the findings of a particular study could be due to such biases, the important issue is not whether such biases are likely to have occurred (since they will almost always be present to some extent), but rather what their direction and strength are likely to be, and whether, taken together, they could explain the observed association. In particular, epidemiological studies are often criticized on the grounds that observed associations could be due to uncontrolled confounding or errors in the classification of exposure or disease. However, the likely strength of uncontrolled confounding is sometimes less than might be expected, and non-differential misclassification of exposure will usually (though not always) produce a tendency for false negative findings rather than false positive findings.

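
The argument that confounding by smoking can account for only a modest relative risk can be made concrete with a short calculation. The sketch below uses the standard external-adjustment formula for a single dichotomous confounder. The function name, the smoking prevalences and the relative risk of 10 for smoking and lung cancer are illustrative assumptions, not figures taken from Axelson (1978).

```python
def confounding_rate_ratio(prev_exposed, prev_unexposed, rr_confounder):
    """Rate ratio produced by a dichotomous confounder alone.

    prev_exposed / prev_unexposed: prevalence of the confounder (e.g.
    smoking) among exposed and non-exposed people.
    rr_confounder: relative risk of disease for the confounder.
    Assumes the exposure itself has no effect and that there is no other
    source of bias.
    """
    return ((prev_exposed * (rr_confounder - 1) + 1)
            / (prev_unexposed * (rr_confounder - 1) + 1))


# Hypothetical scenario: 70% of exposed workers smoke compared with 50%
# of the comparison group, and smoking carries a tenfold risk.
bias = confounding_rate_ratio(0.7, 0.5, 10.0)
print(f"Rate ratio due to confounding alone: {bias:.2f}")

# Even if every exposed worker smoked, the spurious rate ratio would
# stay below 2 in this scenario.
extreme = confounding_rate_ratio(1.0, 0.5, 10.0)
print(f"With 100% smoking among the exposed: {extreme:.2f}")

# Dividing an observed relative risk by the bias factor gives a rough
# externally adjusted estimate.
observed_rr = 2.0
print(f"Observed 2.0, adjusted for smoking: {observed_rr / bias:.2f}")
```

Under these assumed values a 20 percentage point difference in smoking prevalence yields a spurious rate ratio of about 1.33, so an observed relative risk of 2.0 would still be about 1.5 after adjustment.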
10.2: Appraisal of All of the Available Evidence

If it is concluded that the association in a particular study is unlikely to be primarily due to bias and chance, attention then shifts to assessing whether this association exists more generally, and whether the association is likely to be causal. This should involve a review of all of the available evidence including non-epidemiological studies. A systematic quantitative review of the epidemiological evidence may involve a formal meta-analysis with statistical pooling of information from the various studies (e.g. Dickerson and Berlin, 1992; Rothman and Greenland, 1998). However, such a summary of the various study findings is just one step in the process of causal inference. A systematic approach to causal inference was elaborated by Hill (1965) and has since been widely used and adapted (e.g. Beaglehole et al, 1993). I will divide these considerations into those that involve systematic review of the epidemiological evidence (including meta-analyses) and those that also involve consideration of evidence from animal or mechanistic studies.

Evidence From Epidemiological Studies

Considerations for assessing the epidemiological evidence include temporality, specificity, consistency, strength of association and whether there is evidence of a dose-response relationship (Hill, 1965).

Temporality is crucial; the cause must precede the effect. This is usually self-evident, but difficulties may arise in studies (particularly case-control studies) when measurements of exposure and effect are made at the same time (e.g. by questionnaire, blood tests, etc).

The criterion of specificity has been criticised (e.g. Rothman and Greenland, 1998), on the grounds that there are many instances of exposures that have multiple (i.e. non-specific) effects. These include tobacco smoke and ionizing radiation, both of which cause many different types of cancer. Nevertheless, the specificity of the effect may be relevant in assessing the possibility of various biases. For example, if an exposure is associated with esophageal cancer but is not associated with lung cancer, then the association is unlikely to be due to confounding by smoking.

Consistency is demonstrated by several studies giving similar results, and corresponds to the statistical concept of homogeneity across studies (Rothman and Greenland, 1998). This is particularly important when a variety of designs are used in different settings, since the likelihood that all studies are suffering from the same biases may thereby be reduced. On the other hand, a lack of consistency does not exclude a causal association, because different exposure levels and other conditions may alter the effect of exposure in certain studies.

The strength of association is important in that a relative risk that is far from the null value of 1.0 is more likely to be causal than a weak association, which could be more easily explained by confounding or other biases. However, the fact that an association is weak does not preclude it from being causal; rather it means that it is more difficult to exclude alternative explanations for the observed association.

A dose-response relationship occurs when changes in the level of exposure are associated with changes in the prevalence or incidence of the effect that one would expect from biologic considerations. The absence of an expected dose-response relationship provides evidence against a causal relationship, while the presence of an expected relationship narrows the scope of biases that could explain the relationship.

Experimental evidence provides strong evidence of causality, but this is rarely available for occupational exposures.

Meta-Analysis

In the past, epidemiological evidence has been assessed in literature reviews, but in recent years there has been an increasing emphasis on formal meta-analysis, i.e. systematic quantitative reviews. One benefit of a meta-analysis is that it can reduce the probability of false negative results because of small numbers in specific studies (Egger and Davey-Smith, 1997), and may enable the effect of an exposure to be estimated with greater precision than is possible in a single study. Furthermore, although a meta-analysis should ideally be based on individual data, relatively simple methods are available for meta-analyses of published studies in which the study (rather than the individual) is the unit of statistical analysis (Rothman and Greenland, 1998). Such methods can be used to address the causal considerations outlined above, in particular the overall strength of association and the shape and strength of the dose-response curve. Just as importantly, statistical methods can also be used to assess consistency between studies, but because statistical tests for homogeneity often have relatively low power, it is more appropriate to examine the magnitude of variation instead of relying on formal statistical tests (Rothman and Greenland, 1998).

The limitations of meta-analyses should also be emphasized (Greenland, 1994; Egger and Davey-Smith, 1997; Egger et al, 1997). Strikingly different results can be obtained depending on which studies are included in a meta-analysis. Publication bias is of particular concern, given the tendency of journals to publish positive findings and for the publication of negative findings to be delayed (Egger and Davey-Smith, 1998), but naive graphical approaches to its assessment can be misleading (Greenland, 1994).

Even when an unbiased and comprehensive list of studies is included in a meta-analysis, there still remain the same problems of selection bias, information bias, and confounding that need to be addressed in assessing individual studies. Thus, a systematic quantitative review (i.e. meta-analysis) is like a report of a single study in that both quantitative and narrative elements are required to produce a balanced picture (Rothman and Greenland, 1998). Essentially the same issues need to be addressed as in a report of a single study: what is the overall magnitude and precision of the effect estimate (if it is considered appropriate to calculate a summary effect estimate), and what are the likely strengths and directions of possible biases?

An advantage of meta-analysis is that these issues can often be better addressed by contrasting the findings of studies based on different populations, or using different study designs. Thus, possible systematic biases can be addressed with actual data from specific studies rather than by hypothetical examples. For example, in a study of an occupational exposure and lung cancer, there might be concern that an observed association was due to confounding by smoking. If smoking data had not been available, then the best that could be done would be to attempt to assess the likely extent of confounding by smoking (see chapter 7), for example by sensitivity analysis (Rothman and Greenland, 1998). However, in a meta-analysis, if smoking information were available for some (but not all) studies then these studies could be examined to assess the likely strength and direction of confounding by smoking (if any).

Similarly, studies of exposure to phenoxy herbicides and the development of soft tissue sarcoma and non-Hodgkin's lymphoma have produced widely differing findings, and it has been suggested that the high relative risks obtained in the Swedish studies could be due to recall bias (a particular type of information bias) in that cases of cancer (soft tissue sarcoma or non-Hodgkin's lymphoma) were compared with healthy general population controls, and that patients with cancer may be more likely to recall previous chemical exposures. This hypothesis was tested in specific studies (e.g. Hardell et al, 1979, 1981), but can also be tested more generally by comparing the findings of studies that used general population controls with those that used other cancer controls. In particular, one New Zealand study (Pearce et al, 1986) used both types of controls and found similar results with each, indicating that recall bias was not an important problem in this study.

In summary, a key advantage of meta-analysis is that pooling findings from studies will increase the numbers available for analysis and will therefore reduce random error. However, it will not necessarily reduce systematic error, and may even increase it (because of publication bias). Nevertheless, a careful meta-analysis will enable various possible biases to be addressed, using actual data from specific studies, rather than hypothetical examples. Such a meta-analysis will therefore facilitate the assessment of the causal considerations listed above, and in some instances will provide a valid summary estimate of the overall strength of association and the shape and strength of the dose-response curve (Greenland, 2003).

Combination of Epidemiological Evidence With Evidence From Other Sources

Epidemiological evidence should be considered together with all other available evidence, including animal experiments. An association is plausible if it is consistent with other knowledge, whereas the epidemiological evidence is coherent if it is not inconsistent with other knowledge. For instance, laboratory experiments may have shown that a particular environmental exposure can cause cancer in laboratory animals, and this would make more plausible the hypothesis that this exposure could cause cancer in humans. However, biological plausibility is a relative concept; many epidemiological associations were considered implausible when they were first discovered but were subsequently confirmed by other evidence, e.g. the relation of lice to typhus. Lack of plausibility may simply reflect lack of knowledge (medical, biological, or social) which is continually changing and evolving.

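
The statistical pooling of published study results discussed under Meta-Analysis above can be sketched as follows. The text does not specify a pooling method, so this example assumes a fixed-effect, inverse-variance weighted average of log relative risks, with standard errors recovered from the published 95% confidence limits. The function name and the three study results are hypothetical.

```python
import math


def pool_fixed_effect(studies):
    """Inverse-variance pooling of relative risks from published studies.

    studies: list of (relative risk, lower 95% limit, upper 95% limit).
    Returns (pooled RR, lower limit, upper limit, Q), where Q is the
    weighted sum of squared deviations from the pooled log RR.
    """
    log_rr = [math.log(rr) for rr, lo, hi in studies]
    # SE of ln(RR) recovered from the width of the 95% interval.
    se = [(math.log(hi) - math.log(lo)) / (2 * 1.96) for rr, lo, hi in studies]
    weights = [1 / s ** 2 for s in se]
    pooled = sum(w * y for w, y in zip(weights, log_rr)) / sum(weights)
    pooled_se = math.sqrt(1 / sum(weights))
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, log_rr))
    return (math.exp(pooled),
            math.exp(pooled - 1.96 * pooled_se),
            math.exp(pooled + 1.96 * pooled_se),
            q)


# Hypothetical study results, invented for illustration only.
example_studies = [(1.8, 1.1, 2.9), (1.2, 0.9, 1.6), (4.0, 1.5, 10.7)]

rr, lower, upper, q = pool_fixed_effect(example_studies)
print(f"Pooled RR {rr:.2f} (95% CI {lower:.2f} to {upper:.2f}), Q = {q:.2f}")
for study_rr, lo, hi in example_studies:
    print(f"  study RR {study_rr:.1f} ({lo:.1f} to {hi:.1f})")
```

Printing the individual estimates alongside the pooled value reflects the point made in the text: because tests of homogeneity have low power, the spread of the study results deserves attention in its own right, and a single summary estimate is not always appropriate.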
Summary

The task of interpreting the findings of a single epidemiological study should be differentiated from that of interpreting all of the available evidence. Interpreting the findings of a single study includes considering the strength and precision of the effect estimate and the possibility that it may have been affected by various possible biases (confounding, selection bias, information bias). The important issue is not whether such biases are likely to have occurred (since they will almost always be present to some extent), but rather what their direction and strength are likely to be, and whether together they could explain the observed association. If the observed associations seem likely to be valid, then attention shifts to more general causal inference, which should be based on all available information. This includes assessing the specificity, strength and consistency of the association and the dose-response across all epidemiological studies. This may include the use of meta-analysis, but it is often not appropriate to derive a single summary effect estimate across all studies. Rather, a meta-analysis can be used to examine hypotheses about reasons for differences between study findings and the likely magnitude of possible biases. Furthermore, causal inference also necessitates considering non-epidemiological evidence from other sources (animal studies, mechanistic studies) in the consideration of more general causal criteria including the plausibility and coherence of the overall evidence.

Despite the continual need to assess possible biases, and to consider possible imperfections in the epidemiological data, it is also important to ensure that preventive action occurs when this is warranted, albeit on the basis of imperfect data. As Hill (1965) writes:

"All scientific work is incomplete - whether it be observational or experimental. All scientific work is liable to be upset or modified by advancing knowledge. That does not confer upon us a freedom to ignore the knowledge that we already have, or to postpone the action that it appears to demand at a given time."

References

Axelson O (1978). Aspects on confounding in occupational health epidemiology. Scand J Work Environ Health 4: 85-9.

Beaglehole R, Bonita R, Kjellstrom T (1993). Basic epidemiology. Geneva: WHO.

Dickerson K, Berlin JA (1992). Meta-analysis: state-of-the-science. Epidemiologic Reviews 14: 154-76.

Egger M, Davey-Smith G (1997). Meta-analysis: principles and promise. Br Med J 315: 1371-4.

Egger M, Davey-Smith G, Phillips A (1997). Meta-analysis: principles and procedures. Br Med J 315: 1533-7.

Egger M, Davey-Smith G (1998). Meta-analysis: bias in location and selection of studies. Br Med J 316: 61-6.

Feinstein AR (1988). Scientific standards in epidemiologic studies of the menace of daily life. Science 242: 1257-63.

Gardner MJ, Altman DG (1986). Confidence intervals rather than p values: estimation rather than hypothesis testing. Br Med J 292: 746-50.

Greenland S (1994). A critical look at some popular meta-analytic methods. Am J Epidemiol 140: 290-6.

Greenland S (2003). The impact of prior distributions for uncontrolled confounding and response bias: a case study of the relation of wire codes and magnetic fields to childhood leukemia. J Am Statist Assoc 98: 1-8.

Hardell L, Sandstrom A (1979). Case-control study: soft-tissue sarcomas and exposure to phenoxyacetic acids or chlorophenols. Br J Cancer 39: 711-7.

Hardell L, Eriksson M, Lenner P, Lundgren E (1981). Malignant lymphoma and exposure to chemicals, especially organic solvents, chlorophenols and phenoxy acids: a case-control study. Br J Cancer 43: 169-76.

Hill AB (1965). The environment and disease: association or causation? Proc R Soc Med 58: 295-300.

Pearce NE, Smith AH, Howard JK, et al (1986). Non-Hodgkin's lymphoma and exposure to phenoxyherbicides, chlorophenols, fencing work and meat works employment: a case-control study. Brit J Ind Med 43: 75-83.

Pearce NE, Jackson RT (1988). Statistical testing and estimation in medical research. NZ Med J 101: 569-70.

Poole C (1987). Beyond the confidence interval. AJPH 77: 195-9.

Rothman KJ (1978). A show of confidence. N Engl J Med 299: 1362-3.

Rothman KJ, Greenland S (1998). Modern epidemiology. 2nd ed. Philadelphia: Lippincott-Raven.

Siemiatycki J, Wacholder S, Dewar R, et al (1988). Smoking and degree of occupational exposure: Are internal analyses in cohort studies likely to be confounded by smoking status? American Journal of Industrial Medicine 13: 59-69.

Smith AH, Bates M (1992). Confidence limit analyses should replace power calculations in the interpretation of epidemiologic studies. Epidemiol 3: 449-52.