38
The Use of Meta-analysis in
Pharmacoepidemiology
JESSE A. BERLIN
University of Pennsylvania School of Medicine, Center for Clinical Epidemiology and Biostatistics, University of
Pennsylvania Medical Center, Philadelphia, PA, USA

Pharmacoepidemiology, Third Edition. Edited by B. L. Strom. © 2000 John Wiley & Sons, Ltd.

INTRODUCTION

Meta-analysis has been defined as "the statistical analysis of a collection of analytic results for the purpose of integrating the findings."1 Other definitions have included qualitative, as well as quantitative, analyses.2 Meta-analysis is used to identify sources of variation among study findings and, when appropriate, to provide an overall measure of effect as a summary of those findings.3 While epidemiologists have been cautious about adopting meta-analysis,4–6 the need to make the most efficient and intelligent use of existing data prior to (or instead of) embarking on a large, primary data collection effort, has dictated a progressively more accepting approach.6–12

Meta-analysis may be regarded as a "state-of-the-art" literature review, employing statistical methods in conjunction with a thorough and systematic qualitative review.13 The distinguishing feature of meta-analysis, as opposed to the usual qualitative literature review, is its systematic, structured, and presumably objective presentation and analysis of available data. The traditional review has been increasingly recognized as being subjective.13–16 With the support of leading scientists17 and journal editors,18 there has been growing acceptance of the concept that the literature review can be approached as a more rigorous scientific endeavor, specifically, an observational study with the same requirements for planning, prespecification of definitions, use of eligibility definitions, etc., as any other observational study. In recent years, the terms research synthesis and systematic review have been used to describe the structured review process, in general, while meta-analysis has been reserved for the quantitative aspects of the process. For the purposes of this chapter, we shall use meta-analysis in the more general sense. Meta-analysis provides the conceptual and quantitative framework for such rigorous literature reviews; similar measures from comparable studies are tabulated systematically and the effect measures are combined when appropriate.

Several activities may be included under the above definition of meta-analysis. Perhaps the most popular conception of meta-analysis, for
most clinically oriented researchers, is the summary of a group of randomized clinical trials dealing with a particular therapy for a particular disease. An example of this approach would be a study that examined the effects of aspirin following myocardial infarction. Typically, this type of meta-analysis would present an overall measure of the efficacy of treatment, e.g., a summary odds ratio. Summary measures may be presented for different subsets of trials involving specific types of patients, e.g., studies restricted to men versus studies that include both men and women. More sophisticated meta-analyses also examine the variability of results among trials and, when results have been conflicting, attempt to uncover the sources of the disagreements.19

More recently, meta-analyses of nonexperimental epidemiologic studies have been performed20–28 and articles have been written describing the methodologic considerations specific to those meta-analyses.3, 6, 11–13, 29–38 In general, both the meta-analyses of nonexperimental studies and the associated methodologic articles tend to focus more on the exploration of reasons for disagreement among the results of prior studies, including the possibility of bias. Given the greater diversity of designs of nonexperimental studies, it is logical to find more disagreement among nonexperimental studies than among randomized trials.

This chapter summarizes many of the major conceptual and methodologic issues surrounding meta-analysis and offers the views of one meta-analyst about possible avenues for future research in this field.

CLINICAL PROBLEMS TO BE ADDRESSED BY PHARMACOEPIDEMIOLOGY RESEARCH

There are a number of reasons why a pharmacoepidemiologist might be interested in conducting a meta-analysis. These include the study of uncommon negative outcomes of therapies (adverse drug reactions) free of the confounding and bias of nonexperimental studies, the exploration of reasons for inconsistencies of results across previous studies, the exploration of subgroups of patients in whom therapy may be more or less effective, the combination of studies involved in the approval process for new therapies, and the study of positive effects of therapies, as in the investigation of new indications for existing therapies, particularly when the outcomes being studied are uncommon or the past studies have been small.

The investigation of adverse effects has been a recurring theme throughout this book, as it is a major focus of pharmacoepidemiology. It is most often, but not always, pursued through nonexperimental studies. The difficulties in studying these events have also been detailed throughout this book. One major challenge involves obtaining information on adverse reactions that is unconfounded by indication (see Chapter 43). These adverse events often occur only rarely, making their evaluation still more difficult. The results of nonexperimental studies of whether such events are associated with a particular drug may be conflicting, leaving a confusing picture for the practicing clinician and the policy makers to interpret. Meta-analysis, by combining results from many randomized studies, can address the problem of rare events and rectify the associated lack of adequate statistical power in a setting free of the confounding and bias of nonexperimental studies. For example, Chalmers and colleagues used meta-analysis of randomized clinical trials to explore possible gastrointestinal side-effects of nonsteroidal anti-inflammatory drugs (NSAIDs).39 These studies individually had almost no power to detect any association between NSAIDs and adverse gastrointestinal outcomes, but collectively the number of subjects was adequate both to show some important associations and to show the rarity of most complications. The details of this NSAID meta-analysis will be presented later in this chapter.

When reports of several investigations of a specific adverse drug reaction disagree, whether randomized or nonexperimental in design, meta-analysis can also be used to help resolve these disagreements. These disagreements among studies may arise from differences in the choice of endpoints, the exact definition of exposure, the eligibility criteria for study subjects, the methods
of obtaining information, other differences in protocols, or a host of other reasons possibly related to the quality of the studies. While it is not possible to produce a definitive answer to every research question, the exploration of the reasons for heterogeneity among studies' results may at least provide valuable guidance concerning the design of future studies.

The exploration of subgroups of patients in whom therapy may be more or less effective is a controversial question in individual randomized trials. Most trials are not designed with sample sizes adequate to address efficacy in subgroups. The finding of statistically significant differences between the effects of therapy in different subgroups, particularly when those groups were not defined prospectively, raises the question of whether those are spurious findings. Conversely, the lack of statistical significance for clinically important differences between prospectively defined subgroups can often be attributed to a lack of statistical power. Such clinically meaningful but statistically nonsignificant findings are difficult to interpret. Meta-analysis can be used to explore these questions with improved statistical power.

The use of meta-analysis in the approval process for new drugs represents another potential application, although experience in this area is as yet rather limited. However, many of the methodologic issues arising in the context of new drug approval also arise in the investigation of new indications for pharmaceutical products that have previously been approved for other purposes. For some therapies, such as streptokinase in the treatment of myocardial infarction, meta-analysis could have been used to summarize evidence prior to embarking on a very large scale, multicenter, randomized trial.40

METHODOLOGIC PROBLEMS TO BE ADDRESSED BY PHARMACOEPIDEMIOLOGY RESEARCH

As the skeptical reader might imagine, many methodologic issues can arise in the context of performing a meta-analysis. Many, but not all, of these problems relate to the process of combining studies that are often diverse with respect to specific aspects of design or protocol, some of which may be of questionable quality.

QUALITY OF THE ORIGINAL STUDIES

Meta-analysis seems particularly prone to the "garbage in, garbage out" phenomenon. Combining a group of poorly done studies can produce a precise summary result built on a very weak foundation. This apparent precision may lend undue credibility to a result that truly should not be used as a basis for formulating clinical or policy strategies.5 However, if the quality judgment is subtly influenced by the direction or magnitude of the findings of the study, excluding studies based on such a subjective judgment about their quality could open the meta-analytic process to a serious form of bias.

COMBINABILITY OF STUDIES

Clearly, no one would suggest combining studies that are so diverse that a summary would be nonsensical. For example, one would not combine studies of aspirin in the primary prevention of coronary heart disease with studies of aspirin given after myocardial infarction. Beyond obvious examples like this, however, the choices may not be so clear. Should studies with different patient populations be combined? How different can those populations be before it becomes unacceptable to combine the studies? Should nonrandomized studies be combined with randomized studies? Should nonrandomized studies ever be used in a meta-analysis? These are questions that cannot be answered without generating some controversy.

PUBLICATION BIAS

Unpublished material cannot be retrieved by literature searches and is likely to be difficult to find referenced in published articles. Publication bias occurs when study results are not published, or their publication is delayed, because of the results.41–50 The usual pattern is that statistically significant results are published more easily than
nonsignificant results, although this bias may not be as severe for randomized studies as it is for nonrandomized studies.42, 51 While one could simply decide not to include unpublished studies in a meta-analysis, since those data have often not been peer reviewed,52 unpublished data can represent a large proportion of all available data.53 If the results of unpublished studies are systematically different from those of published studies, particularly with respect to the magnitude and/or direction of the findings, their omission from a meta-analysis would yield a biased summary estimate (assuming that the quality of the unpublished studies is at least equal to the quality of the published studies).

Publication bias is a potentially serious limitation to any meta-analysis. The retrospective identification of completed unpublished trials is clearly possible53 in some instances, but generally is not practical. One study54 used a survey of investigators to attempt to identify unpublished studies. The authors surveyed 42 000 obstetricians and pediatricians, asking whether they had participated in any unpublished trials completed more than two years previously, i.e., during the period prior to the end of 1984. They identified only 18 such studies, despite an overall response rate of 94% to their survey.

Other forms of bias, related to publication bias, have also been identified.47 These include reference bias, i.e., preferential citation of significant findings,55 language bias, i.e., exclusion of studies in languages other than English,56, 57 and bias related to source of funding.42, 43, 58–60

BIAS IN THE ABSTRACTION OF DATA

Meta-analysis, by virtue of being conducted after the data are available, is a form of retrospective research and is thus subject to the potential biases inherent in such research.61 In the meta-analysis of gastrointestinal side-effects of NSAIDs mentioned above39 and described more fully below, Chalmers and colleagues examined over 500 randomized studies. They measured the agreement of different individuals when reading Methods sections of papers that had been masked as to their source and the results. There were disagreements on 10–20% of items, which had to be resolved in conference with a third person. These disagreements arose from errors on the part of the reader and from lack of clarity of the presentation of material in the original articles. Whatever its source, when such variability exists, the opportunity for observer bias may exist as well.61

In a number of instances, more than one meta-analysis has been performed in the same general area of disease and treatment. A review of 20 of these instances52 showed that, for almost all disease/treatment areas, there were differences between two meta-analyses of the same topic in the acceptance and rejection of papers to be included. While there was only one case (out of the 20) of extreme disagreement regarding efficacy, there were several cases in which one or more analyses showed a statistically significant result while the other(s) showed only a trend. These disagreements were not easily explainable. For example, differences between meta-analyses of the same topic in the acceptance and rejection of papers did not always lead to differences in conclusions.

More generally, the acceptance or rejection of different sets of studies can drastically change conclusions. This is illustrated by several meta-analyses of whether or not corticosteroid drugs cause peptic ulcer. The first published paper argued that corticosteroids did not cause peptic ulcer, because the p-value for the meta-analysis was only 0.07.62 Five years later, a second analysis, by a second set of authors, included a larger number of studies and found evidence for an association with a p-value of less than 0.001.63 Re-analysis of the data from the second meta-analysis by the authors of the first meta-analysis, with the addition of several more studies, gave a p-value of 0.40.64 Another meta-analysis done by the second set of authors gave a revised p-value of 0.01.65 Despite efforts to make meta-analysis an objective, reproducible activity, there is evidently some judgment involved.

In a separate commentary,66 DerSimonian reanalyzed data from one meta-analysis and one clinical review of parenteral nutrition with branched-chain amino acids in hepatic encephalopathy. She pointed to differences in the data
extracted by the two sets of authors67, 68 for the same endpoints from the same original papers. When combined statistically, the data extracted by the two sets of authors led to substantively different conclusions about the efficacy of therapy.

CURRENTLY AVAILABLE SOLUTIONS

This section will first present the general principles of meta-analysis and a general framework for the methods typically employed in a meta-analysis. Much of this material has been presented in review articles in major clinical journals,7, 9, 10 so only the most important points will be highlighted here. In the second part of this section, specific solutions to the methodologic issues raised in the previous section are presented. Finally, case studies of applications that should be of interest to pharmacoepidemiologists will be presented, illustrating approaches to some of the clinical and methodologic problems raised earlier.

STEPS INVOLVED IN PERFORMING A META-ANALYSIS (SEE TABLE 38.1)

Table 38.1. General steps involved in conducting a meta-analysis

(1) Define the purpose
(2) Perform the literature search
(3) Establish inclusion/exclusion criteria
(4) Collect the data
(5) Perform statistical analyses
(6) Formulate conclusions and recommendations

Define the Purpose

While this is an obvious component of any research, it is particularly important to define precisely the primary and secondary objectives of a meta-analysis. The important primary question might be "Are NSAIDs associated with an increased risk of gastrointestinal side effects?" Another might be "Are corticosteroids effective in the treatment of alcoholic hepatitis?" Secondary objectives might include the identification of subgroups in which a treatment appears to be uniquely more or less effective. For NSAIDs, estimating the absolute risk difference (and, thus, the public health implications) as well as the relative risk (and, thus, the etiologic implications) might be a secondary objective.

Perform the Literature Search

While computerized searches of the literature can facilitate the retrieval of all relevant published studies, these searches are not always reliable. Several studies have examined problems with the use of electronic searches.69–71 Use of search terms that are too nonspecific can result in large numbers of mostly irrelevant citations that need to be reviewed to determine relevance. Use of too many restrictions can result in missing a substantial number of relevant publications. For example, in preparing for meta-analyses of neonatal hyperbilirubinemia, MEDLINE was searched for relevant clinical trials.69 A search by a trained librarian identified only 29% of known trials in the Oxford Database of Perinatal Trials.72 It is generally suggested that a professional librarian with training and experience in searches of clinical topics be consulted, although, as just cited, even a trained librarian may not perform perfectly. Other methods of searching, such as review of the reference sections of retrieved publications found to be relevant, and hand searches of relevant journals, are also recommended.

Establish Inclusion/Exclusion Criteria

A set of rules for including and excluding studies from the meta-analysis should be defined during the planning stage of the meta-analysis and should be based on the specific hypotheses being tested in the analysis. One might, for example, wish to limit consideration to randomized studies with more than some minimum number of patients. In a meta-analysis of epidemiologic studies, one might wish to include studies of incident cases only, excluding studies of prevalent cases, assuming that the relationship between exposure and outcome could be different in the two types of study.

Practical considerations may, of course, force changes in the inclusion criteria. For example,
one might find no randomized studies of a particular new indication for an existing therapeutic agent, thus forcing consideration of nonrandomized studies.

In establishing inclusion/exclusion criteria, one is also necessarily defining the question being addressed by the meta-analysis. If broad inclusion criteria are established, then a broad, and perhaps more generalizable, hypothesis may be tested. The use of broad entry criteria also permits the examination of the effects of research design on outcome (e.g., do randomized and nonrandomized studies tend to show different effects of therapy?) or the exploration of subgroup effects. In the example of aspirin administered following myocardial infarction, restriction of the meta-analysis to studies using more than a certain dose of aspirin would not permit an exploratory, cross-study comparison of dose–response effects, which might prove illuminating.

A key point is that exclusion criteria should be based on a priori considerations of design of the original studies and completeness of the reports and specifically not on the results of the studies. To exclude studies solely on the basis of results that contradict the majority of the other studies will clearly introduce bias into the process.11 While that may seem obvious, the temptation to try to justify such exclusions on a post hoc basis may be strong, particularly when a clinically plausible basis for the exclusion can be found. Such exclusions made after having seen the data, and the effect of individual studies on the pooled result, may form the basis for legitimate sensitivity analyses (comparing pooled results with and without that particular study included), but should not be viewed as primary exclusion criteria.

Another important note is that studies may often generate more than one published paper. For example, later reports might update analyses previously published, or might report on outcomes not addressed in earlier papers. It is essential, for two reasons, that only one report on the same patients be accepted into the meta-analysis. First, the validity of the statistical methods depends on the assumption that the different studies represent different groups of individuals. Second, the inclusion of a study more than once would assign undue weight to that study in the summary measure. A caution is that it is not always obvious that the same patients have been described in two different publications. Contacting the authors may be of some help in determining whether there is duplication, although some authors may perceive the inquiry as questioning their academic integrity. The issue of multiple publications based on the same study has been addressed in more detail by Huston and Moher.73

Collect the Data

When the relevant studies have been identified and retrieved, the important information regarding study design and outcome needs to be extracted. Typically, data abstraction forms are developed, pilot tested on a few articles, and revised as needed. As in any research, it is necessary to strike a balance between the completeness of the information abstracted and the amount of time needed to extract that information. Careful specification in the protocol for the meta-analysis of the design features and patient characteristics that will be of clinical or academic interest may help avoid over- or undercollecting information. It is generally advisable, when possible, to collect raw data on outcome measures, e.g., numbers treated and number of events in each group, rather than derived measures such as odds ratios, which may not be the outcome measures of interest in the meta-analysis or may have been calculated incorrectly by the original authors.

Many articles on "how to do a meta-analysis" (e.g.,9, 10) recommend that the meta-analyst assess the quality of the studies being considered in a meta-analysis. One might wish to use a measure of study quality as part of the weight assigned to each study in the analysis, as an exclusion criterion (e.g., excluding studies with quality scores below some arbitrary threshold), or as a stratification factor allowing the separate estimation of effects for good quality and poor quality studies.74, 75 Chalmers and colleagues have developed a quality assessment scoring system for randomized trials.76 Several groups have opted for other, far shorter and simpler, systems (e.g.,77–80). Issues related to quality scoring have been discussed more generally
by Moher and colleagues,81 and an annotated checklist of quality scoring systems is available.82 Most of these systems were proposed as very general systems that could be applied to clinical trials covering a wide range of therapies and endpoints. Scoring systems designed for epidemiologic studies have been developed as well, in the context of evaluating studies of specific exposure–disease relationships (e.g.,21).

The argument has been made, however, that general scoring systems are arbitrary in their assignment of weights to particular aspects of study design, and that such systems risk losing information, and can even be misleading.38, 83, 84 Juni and colleagues,84 for example, examined studies comparing low molecular weight heparin with standard heparin with respect to prevention of postoperative thrombosis. They used 25 different quality assessment scales to identify high quality trials. For six scales, the studies identified as being of high quality showed little to no benefit of low molecular weight heparin, while for seven scales, the "high quality" studies showed a significant advantage of low molecular weight heparin. This apparent contradiction raises questions about the validity of such scales as methods for stratifying studies.

Thus, in a given meta-analysis, one might wish to examine specific aspects of study design that are unique to that clinical or statistical situation.38, 83–85 For example, Schulz and colleagues85 found that trials in which the concealment of randomized allocation was inadequate on average produced larger estimates of treatment effects compared with trials in which allocation was adequately concealed. This specific finding was not detected when these same authors looked for an overall association between quality score and treatment effect. In the analysis of low molecular weight heparin, Juni and colleagues84 found that studies with unmasked outcome assessment showed larger, and presumably biased, benefits of low molecular weight heparin than studies using masked assessment of outcome.

Two procedural recommendations have been made regarding the actual techniques for data extraction. One is that studies should be read independently by two readers. The justification for this comes from meta-analyses in which modest but important inter-reader variability has been demonstrated.39, 61 A second recommendation is that readers be masked to certain information in studies, such as the identity of the authors and the institutions at which a study was conducted, and masked to the specific treatment assignments.52 While masking has a high degree of intuitive appeal, the effectiveness of masking in avoiding bias has not been demonstrated. Only one randomized trial examines the issue of the effect of masking on the results of meta-analyses.86 This study compared the results of the same meta-analyses performed independently by separate teams of meta-analysts, with one team masked and the other unmasked. The masked and unmasked teams produced nearly identical results on a series of five meta-analyses, lending little support to the need for masking.

Perform Statistical Analyses

In most situations, the statistical methods for the actual combination of results across studies are fairly straightforward. If one is interested in combining odds ratios or other estimates of relative risk across studies, for example, some form of weighted average of within-study results is appropriate, and several of these exist.87 A popular example of this is the Mantel–Haenszel procedure, in which odds ratios are combined across studies with weights proportional to the inverse of the variance of the within-study odds ratio.29, 31, 87 Other approaches include inverse variance weighted averages of study-specific estimates of multivariate adjusted relative risks29, 87 and exact stratified odds ratios.88 One popular method, sometimes called the "one-step" method, which is similar to the Mantel–Haenszel method,89 has been shown to be biased under some circumstances. Since this method offers no clear advantage over the Mantel–Haenszel method, which is more robust, the Mantel–Haenszel method may be preferable in many circumstances.31 Choices are never simple, however. In a simulation study, Deeks and colleagues90 showed, at an international meeting, that in situations where there are rare events, and
consequently frequent zero cells in contingency There are also methods for combining studies
tables, the one-step method tended to perform that do not make the assumption of a common
better than other alternatives, including Mantel ± treatment effect across all studies. These are the so-
Haenszel and exact methods. called ``random effects'' models, which allow for
One basic principle in most analytic approaches the possibility that the underlying true treatment
is that the comparisons between treated (exposed) effect, which each study is estimating, may not be
and untreated (unexposed) patients are typically the same for all studies, even when examining only
made within a study prior to combination across studies with similar designs, protocols, and patient
studies. In the combination of randomized trial populations. Hidden or unmeasured sources of
results, this amounts to preserving the randomiza- among-study variability of results are taken into
tion within each study prior to combination. In all account by these random effect models through
of the procedures developed for stratified data, the incorporation of such variability into the
``study'' plays the role of the stratifying variable. weighting scheme when computing a weighted
In general, more weight is assigned to large studies average summary estimate. The practical conse-
than to small studies because of the increased quence of the random effect models is to produce
precision of larger studies. A second basic principle wider confidence intervals than would otherwise
to note is that these methods generally assume that be produced by the traditional methods.91, 92 This
the studies are all estimating a single, common approach is particularly useful when there is
effect, e.g., a common odds ratio. In other words, heterogeneity among study results, and explora-
the underlying treatment effect (whether beneficial tory analyses have failed to uncover any known
or harmful) that all studies are estimating is sources of observed heterogeneity. However, ran-
assumed to be the same for all studies. Any dom effect models should not be viewed as a
variability among study results is assumed to be panacea for unexplained heterogeneity. One dan-
random and is ignored in producing a summary ger is that a summary measure of heterogeneous
estimate of the effect.91, 92 studies may not really apply to any particular
In any meta-analysis, however, the possible study population or study design, i.e., they lose
existence of heterogeneity among study designs information by averaging over potentially impor-
and results should be examined, and may warrant tant study and population characteristics.38
a set of exploratory analyses designed to investi- A second practical effect of random effect
gate the sources of that heterogeneity. Methods for models, which is only apparent from examining
detecting and describing heterogeneity are de- the mathematics involved, is that they tend to
scribed in detail by Hardy and Thompson.93 One assign relatively higher weights to small studies
might stratify the studies according to patient than the traditional methods would assign.91 This
characteristics or study design features and in- equalization of weights may have unwanted
vestigate heterogeneity within and across strata. consequences in some circumstances (see section
To the extent that the stratification explains the on ``Publication bias,'' below), and can lead to
heterogeneity, the combined results would differ counterintuitive results, with very small studies
between strata and the heterogeneity within the making contributions to the summary equal to
strata would be reduced compared to the overall those of very large studies. A thorough discussion
result. In addition to stratification, regression of the interpretation and application of fixed
methods such as weighted least squares linear versus random effect models is presented by
regression could be used to explore sources of Hedges and Vevea.98
heterogeneity.3, 29, 94-96 These might be important Bayesian statistical methods are also being
when various components of study design are proposed with increasing frequency in the statis-
correlated with each other, acting as potential tical literature.99-104 These include the so-called
confounders. Graphical methods for meta-analysis ``confidence profile'' method developed by Eddy
have also been proposed, that focus on issues and colleagues.105, 106 These methods can incorpo-
related to heterogeneity.97 rate into the analysis the investigator's prior beliefs
about the size of an effect or about the factors biasing the observed effects. When the investigator has no prior beliefs about the effect, the results of the observed studies are sometimes used to estimate the components of the "prior" distribution. Thus, the final answers reflect the observed data very closely. In practice, when the investigator does not specify prior beliefs, the summary results are similar to those from standard methods, especially the random effect models described above.

Another approach107 is based on summing the statistical evidence provided by each study for each value of the effect measure (e.g., the odds ratio). The value of the odds ratio with the maximum evidence (the maximum likelihood estimate) has usually proved the same as the pooled estimate produced by other methods, in the situations where this method has been used.107 By providing a mathematical and graphical picture of what is occurring at other values of the effect estimate, this method also provides information on the contribution of each study to the total.

An important word of caution is that statistical tests of heterogeneity, i.e., formal statistical tests of the variability among the studies, suffer from a notorious lack of statistical power.108 Thus, a finding of significant heterogeneity may safely be interpreted as meaning the studies are not all estimating the same parameter. A lack of statistical significance, however, may not mean that heterogeneity is not important in a data set or that sources of variability should not be explored.

Formulate Conclusions and Recommendations

As with all research, the conclusions of a meta-analysis should be clearly summarized, with appropriate interpretation of the strengths and weaknesses of the meta-analysis. Authors should clearly state how generalizable the result is, how definitive it is, and should outline the areas that need future research. Any hypotheses generated by the meta-analysis should be stated as such, and not as conclusions.

APPROACHES TO SELECTED METHODOLOGIC PROBLEMS IN META-ANALYSIS

Combinability of Results from Diverse Studies: Heterogeneity is your Friend

The underlying question in any meta-analysis is whether it is clinically and statistically reasonable to estimate an average effect of therapy, either positive or negative. If one errs on the side of being too inclusive, and the studies differ too greatly, there is the possibility that the average effect may not apply to any particular subgroup of patients.38, 109 Conversely, diversity of designs and results may provide an opportunity to understand the factors that modify the effectiveness (or toxicity) of a drug. In fact, it has been argued that because of the potential for bias in observational epidemiologic studies, exploring heterogeneity should be the main point of meta-analyses of such studies, rather than producing a single summary measure.6, 33, 38, 110

In addition to considering the clinical and methodologic differences among studies, prior to the actual combination of results, it is essential to evaluate the extent to which study results are combinable from a strictly statistical viewpoint. This would usually involve some kind of statistical test of the variability (heterogeneity) of results among studies. If the variability of odds ratios (on the natural logarithm scale), for example, is greater than that which could be attributed to sampling variability alone, one should question the wisdom and meaning of a combined result. Efforts should subsequently focus on exploring the reasons for the variability of results, rather than proceeding to combine them.

The generality of the question posed will clearly influence the generalizability of the result, but also may affect whether the primary studies involved are viewed as combinable or not. Because the set of available studies is likely to be heterogeneous with respect to design features, the choice of a more general question may be preferred to a very specific one. For example, Dickersin and Berlin13 suggest that a more general question might be "Is taking aspirin associated with decreased mortality
in patients who have already had a myocardial infarction?" rather than "Is administration of 325 milligrams of aspirin per day, started within seven days of a first documented myocardial infarction and taken for at least six months, in the absence of other preventive treatments, in patients followed for at least one year, associated with a decrease in subsequent cardiovascular mortality?" If addressing the second question, the meta-analyst may quickly find himself or herself with only one available study. Diversity of study designs, on the other hand, can provide a more generalizable result than restriction to a very narrowly defined group of studies. Issues of study design, such as dose or duration of therapy, and how study design relates to study results, could be addressed through a series of exploratory analyses.13

As an example of the type of analysis that could be used to investigate study design issues, Berlin and colleagues,3 in a methodologic paper, examine data originally presented by Romieu and colleagues20 on the relationship between duration of oral contraceptive use and risk of breast cancer. They show, using regression methods and stratified analyses, that case-control studies involving mostly premenopausal women show a marked and quite homogeneous increase in risk with increasing duration of use of oral contraceptives. Studies including postmenopausal cases, conversely, show no such increase in risk, on average. Furthermore, the average effect is difficult to interpret, since the results of the postmenopausal studies are quite heterogeneous, with some studies showing an increase in risk with increasing duration of oral contraceptive use and others showing a decrease. One technical difficulty with meta-analysis of group-level data is also illustrated by this example. Although some studies present age-specific results, or are restricted to certain age groups, many studies do not present subgroup analyses that are of interest in a meta-analysis. Thus, for example, some studies that include postmenopausal women also include premenopausal women, and the data within such studies may not be presented by menopausal status.

In the situation just described, there were strong biological reasons for performing separate analyses by menopausal status. It is sometimes instructive to perform more exploratory analyses of meta-analytic data as well. These may provide valuable insights into the biology of the problem and/or may generate hypotheses for future confirmation. Morgenstern et al.28 found that the association between neuroleptic medication and tardive dyskinesia was stronger in studies conducted in the United States than in studies conducted elsewhere. They used regression methods to show that this association was not simply the product of confounding by other study design features. The authors suggest that the US study samples may have had a higher baseline frequency of unmeasured factors (e.g., affective disorders such as schizophrenia) than the exposed groups in other countries. As with any exploratory analysis, due caution must be exercised in the interpretation of such a posteriori hypotheses, even though they may be based on very sound biological reasoning.

Publication Bias

As discussed above, when the primary source of data for a meta-analysis is published data, there is usually a danger that the published studies represent a biased subset of all the studies that have been done. In general, it is more likely that studies with statistically significant findings will be published than studies with no significant findings. A practical technique for determining the potential for publication bias is the "funnel plot," first proposed by Light and Pillemer.111 The method involves plotting the effect size (e.g., the risk difference) against a measure of study size, such as the sample size, or the inverse of the variance of the individual effect sizes. If there is no publication bias, the points should produce a kind of funnel shape, with a scatter of points centered around the true value of the effect size, and with the degree of scatter narrowing as the variances decrease. If publication bias is a problem, the funnel would look as though a bite had been taken out, with very few (if any) points around the point indicating no effect (e.g., odds ratio of 1.0) for studies with large variances. This method requires a sufficient number of studies to permit the visualization of a funnel shape to the data. If the funnel plot does indicate the existence of publication bias, then one
or more of the correction methods described below should be considered. In the presence of publication bias, the responsible meta-analyst should also evaluate the ethics of presenting a summary result that is likely to represent an overestimate of the effect in question.

Two examples of funnel plots are given in Figures 38.1 and 38.2. These plots represent studies of psychoeducational programs for surgical patients.111, 112 In the first plot, only the published studies are represented. The funnel appears to have a "bite" taken out of it where the small studies showing no effect of these programs should be. In the second plot, the unpublished studies, including doctoral dissertations, are included, and the former "bite" is now filled with these unpublished studies.

Several mathematical approaches to the problem of publication bias have also been proposed. An early method, first described by Rosenthal,113 is the calculation of a "fail-safe N" when the result of the meta-analysis is a statistically significant rejection of the null hypothesis. This method, in a kind of sensitivity analysis, uses the Z-statistics from the individual studies included in a meta-analysis to calculate the number of unpublished studies with a Z-statistic of exactly 0 that would be required to exist, in order for the combined Z-score (published + unpublished studies) to become nonsignificant. Because this method focuses only on Z-statistics, and ignores the estimation of effects (e.g., odds ratios), it is of limited utility. That is, the fail-safe N approach focuses only on the statistical significance of the combined result and does not help provide an overall estimate of the effect that is "adjusted" for publication bias.

A number of related methods to deal with potential unpublished studies have been developed in recent years. These include other methods for estimating the number of unpublished studies,114, 115 formal methods to test for the presence of publication bias,116-118 and methods to adjust summary estimates to account for unpublished studies,114, 119-122 but several of those methods make some fairly strong assumptions about the specific mechanism producing the publication bias. The available data are used to estimate both the average effect size for the comparison of groups and the probabilities of publication. (Note that only published data are used to estimate the probability of publication, strengthening the dependence of such methods on the underlying

Figure 38.1. Funnel plot for published studies only: analysis of data from the review by Devine and Cook112 of
psychoeducational programs for surgical patients. Reprinted by permission of the publishers from Summing Up: the
Science of Reviewing Research by Richard J. Light and David B. Pillemer, Cambridge, MA: Harvard University Press,
copyright 1984 by the President and Fellows of Harvard College.
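
The fail-safe N calculation attributed to Rosenthal in the text can be sketched in a few lines. This is only an illustration under stated assumptions: the study Z-statistics are combined as their sum divided by the square root of the number of studies, the critical value is a one-sided 1.645, and the five Z-values are invented for the example. The function names are hypothetical.

```python
import math

def combined_z(z_values, n_unpublished=0):
    """Combined Z after adding n_unpublished studies that each have Z = 0."""
    k = len(z_values) + n_unpublished
    return sum(z_values) / math.sqrt(k)

def fail_safe_n(z_values, z_crit=1.645):
    """Smallest number of Z = 0 studies that makes the combined Z nonsignificant.

    Solves sum(Z) / sqrt(k + N) <= z_crit for N, which gives
    N >= (sum(Z) / z_crit)**2 - k.
    """
    k = len(z_values)
    if combined_z(z_values) <= z_crit:
        return 0  # the combined result is already nonsignificant
    n = (sum(z_values) / z_crit) ** 2 - k
    return max(0, math.ceil(n))

# Hypothetical Z-statistics from five published studies (illustration only).
z_published = [2.1, 1.8, 2.5, 0.9, 2.2]
n_fs = fail_safe_n(z_published)
```

As the text stresses, a number obtained this way speaks only to statistical significance; it says nothing about the size of the effect after allowing for unpublished studies.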

Figure 38.2. Funnel plot for published (open boxes) and unpublished (closed triangles) studies combined: analysis of
data from the review by Devine and Cook112 of psychoeducational programs for surgical patients. Reprinted by
permission of the publishers from Summing Up: the Science of Reviewing Research by Richard J. Light and David B.
Pillemer, Cambridge, MA: Harvard University Press, copyright 1984 by the President and Fellows of Harvard College.
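
The contrast between the two funnel plots can be mimicked numerically. The sketch below is a rough illustration, not a published method: each study is an (effect size, variance) pair, precision is taken as the inverse of the variance, and a study counts as "small" when its variance is at least 0.05, which is an arbitrary cutoff chosen for the example. All study values are invented and are not the data plotted in Figures 38.1 and 38.2.

```python
from statistics import mean

def funnel_points(studies):
    """Return (effect, precision) pairs for plotting, with precision = 1 / variance."""
    return [(effect, 1.0 / variance) for effect, variance in studies]

def small_vs_large_gap(studies, variance_cutoff=0.05):
    """Mean effect of small (high-variance) studies minus that of large studies.

    A clearly positive gap is what a "bite" out of the funnel produces:
    small studies near the null value are missing, so the remaining small
    studies show systematically larger effects.
    """
    small = [e for e, v in studies if v >= variance_cutoff]
    large = [e for e, v in studies if v < variance_cutoff]
    return mean(small) - mean(large)

# Hypothetical (effect size, variance) pairs; not data from any real review.
published = [(0.30, 0.010), (0.28, 0.012), (0.35, 0.020),
             (0.55, 0.080), (0.70, 0.100), (0.62, 0.090)]
unpublished = [(0.02, 0.090), (-0.05, 0.110), (0.10, 0.085)]

gap_published = small_vs_large_gap(published)             # bite present
gap_all = small_vs_large_gap(published + unpublished)     # bite filled in
points = funnel_points(published + unpublished)           # coordinates for a plot
```

With these invented numbers the gap is large for the published studies alone and nearly vanishes once the unpublished studies are added, which is the pattern the two figures are meant to convey.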

assumptions regarding the nature of the relationship between study findings and publication probabilities.)

An additional methodologic caution relates to the use of random effect models for combining results. When the results of the studies being analyzed are heterogeneous and a random effect model is being used to combine those results, one of the properties of the model, described above, is to assign relatively higher weights to small studies than would otherwise be assigned by more traditional methods of combining data. If publication bias is a problem in a particular data set, one consequence implied by the funnel plot is that small studies would tend to show larger effects than large studies. Thus, if publication bias is present, one of the reasons for heterogeneity of study results is that the small studies show systematically larger effects than the large studies. The assignment of higher relative weights to the small studies could, when publication bias is present, lead to a biased summary result.38 In fact, this appears to be exactly the situation presented by Poole and Greenland in an examination of studies of water chlorination and cancer.35 Random effect summary estimates of the relative risk for various cancers were larger than corresponding fixed effect summaries. This was apparently due to the assignment of higher relative weights to small studies which, in this case, showed relatively larger effects, that may not be representative of the findings of all small studies.

A proposed solution to the problem of publication bias is the use of prospective registration of studies at their inception, prior to the availability of results.47, 123-129 Going one step further, several prospectively planned meta-analyses are either being planned or have been conducted.130-133

CASE STUDIES OF APPLICATIONS OF META-ANALYSIS

Investigation of Adverse Effects

As mentioned earlier, the investigation of adverse or unwanted effects of existing therapies is an important application of meta-analysis. As discussed in Chapter 3, adverse events associated with pharmaceutical products are often so uncommon as to be difficult to study. In particular, the usual premarketing randomized studies frequently have too few patients to provide any useful information
on the incidence of uncommon adverse events. By the same token, individual studies may have low statistical power to address particular questions. Meta-analysis provides the benefit of vastly increased statistical power to investigate adverse events. In fact, since 1982, the safety evaluation of drugs in the US has included pooled analyses.134

The assessment of the excess risk of gastrointestinal side-effects associated with nonsteroidal anti-inflammatory drugs (NSAIDs) provides an excellent example of a situation in which meta-analysis has been helpful. Four different meta-analytic approaches to this problem will be reviewed here.

Chalmers and colleagues examined data from randomized trials of NSAIDs.39 They argued that the typical epidemiologic approaches to investigating NSAIDs as risk factors for gastrointestinal side-effects, i.e., cohort or case-control studies, are subject to too many potential biases. Randomized trials, on the other hand, would provide internally valid comparisons of NSAID users to nonusers. Presumably, although not stated explicitly, the combination of results from numerous studies, with varied entry and exclusion criteria, would alleviate the problem of the potential lack of generalizability from patients enrolled in a particular trial. The pooling of results from numerous studies would permit the assessment of rare events.

The authors performed a meta-analysis of randomized trials, excluding trials involving topical usage of NSAIDs, those that examined pharmacological endpoints only, studies of newborns, trials involving fewer than four days of treatment, trials in which patients were taking NSAIDs within the three days before randomization, and trials of drugs for dysmenorrhea (because of the short duration of the drug regimen and the confounding gastrointestinal symptoms from the dysmenorrhea). The meta-analysis was limited to those trials in which the anti-inflammatory drug was compared with a placebo, no drug, or a drug with no anti-inflammatory property. Photocopies of the "Methods" sections of 525 potentially relevant studies (blinded as to author, journal, and time and place of study, as well as all allusions to results) were read by two independent observers who determined inclusion suitability according to the above criteria.

Data were extracted for the following endpoints: nausea, indigestion or dyspepsia, gross gastrointestinal bleeding, suspected ulcer only, proven ulcer, gastric side-effects, and unspecified gastric side-effects. Factors that might be related to the incidence of these endpoints were also extracted from the studies: disease under study, drug ingested, dose and duration of drug, age of patients, sex of patients, and date of publication. The data were analyzed by crude pooling (i.e., ignoring the stratification by study and simply collapsing over studies), by unweighted averaging of the within-study risk differences, and by a weighted average of the risk differences. Additional analyses were performed to determine whether any factors seemed to be associated with a study showing a harmful effect of NSAIDs.

As a methodologic aside, the authors examined inter-reader disagreements. Overall, a disagreement rate of 19% was observed for the final decision on inclusion or exclusion of studies. These disagreements were resolved in conference.

There were 100 randomized trials of nonaspirin NSAIDs included in the final analysis, containing 123 comparisons with a no-treatment control group, which usually received a placebo. A total of 12 853 patients were included in these trials, with a mean duration of treatment of about 67 days (median 21 days) and a mean age of 46 years. For the sake of brevity, the aspirin trials will not be discussed here. The data revealed a generally low risk of gastrointestinal side-effects. For example, only two patients were reported with proven ulcers out of 6460 treated patients, with none in the controls. In the ten studies explicitly mentioning gross upper-gastrointestinal hemorrhage, the risk was 8/1103 (0.73%) in the control patients and 24/1157 (2.1%) in treated patients, giving a crude relative risk of 2.8. The length of followup for these ten studies was not specifically mentioned by the authors of the meta-analysis. However, the analysis of duration of therapy showed that duration was longer for studies showing a harmful effect of NSAIDs (geometric mean = 81 d) than for studies showing no effect of NSAIDs (geometric mean = 25 d) for the gross hemorrhage endpoint, consistent with a duration-response effect.
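
The crude figures quoted for gross upper-gastrointestinal hemorrhage can be checked directly from the counts, and the three ways of pooling risk differences mentioned in the text can be contrasted on a toy example. This is a minimal sketch under stated assumptions: the four toy trials are invented, and the weighted average uses total study size as the weight, which is an assumption because the text does not say which weights were used. The quoted counts give a ratio of about 2.86, which the text reports as 2.8.

```python
def risk(events, n):
    """Crude risk: proportion of patients with the event."""
    return events / n

# Counts quoted in the text for gross upper-gastrointestinal hemorrhage.
risk_control = risk(8, 1103)     # about 0.73%
risk_treated = risk(24, 1157)    # about 2.1%
crude_rr = risk_treated / risk_control
crude_rd = risk_treated - risk_control

def pooled_risk_differences(studies):
    """Three summaries of the risk difference (treated minus control).

    Each study is (events_treated, n_treated, events_control, n_control).
    """
    # Crude pooling: collapse all studies into a single 2 x 2 table.
    events_t = sum(s[0] for s in studies)
    n_t = sum(s[1] for s in studies)
    events_c = sum(s[2] for s in studies)
    n_c = sum(s[3] for s in studies)
    crude = events_t / n_t - events_c / n_c
    # Within-study risk differences; a trial with no events contributes 0.
    rds = [s[0] / s[1] - s[2] / s[3] for s in studies]
    unweighted = sum(rds) / len(rds)
    # Weighted average, using total study size as the weight (an assumption).
    sizes = [s[1] + s[3] for s in studies]
    weighted = sum(w * d for w, d in zip(sizes, rds)) / sum(sizes)
    return crude, unweighted, weighted

# Hypothetical trials (not from Chalmers et al.); two have no events at all.
toy_trials = [(2, 100, 0, 100), (0, 50, 0, 50), (0, 40, 0, 40), (3, 200, 1, 200)]
crude, unweighted, weighted = pooled_risk_differences(toy_trials)
```

Risk differences stay defined for trials with no events in either arm, so every toy trial enters all three summaries; the unweighted average is pulled toward zero by the two small event-free trials.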
This meta-analysis was faced with some interesting statistical and other methodologic questions. There were numerous studies that did not explicitly mention side-effects in general or did not mention particular side-effects, even though others were mentioned. The authors chose to do a kind of sensitivity analysis by analyzing all studies, assuming that the risk of an unreported side-effect was zero, and separately analyzing results from only those studies explicitly mentioning a particular side-effect.

Another issue was the extensive number of studies with no occurrences of a particular endpoint in either the treated or the control group. The usual pooling procedures, e.g., the Mantel-Haenszel procedure,87 essentially ignore such studies, since they contribute no information, under one interpretation, concerning the common odds ratio. On the other hand, if over 90 of 100 separate trials report no proven ulcers in either the treated or the control groups, then another interpretation of those results is that the risk of an ulcer is fairly low. Chalmers and colleagues chose to work with risk differences to address this issue, allowing studies with no events in either group to enter the calculations. This is the type of situation considered by Deeks and colleagues,90 whose results suggest that the one-step method would have been the most appropriate for these studies with frequent occurrence of zero cells.

Another meta-analytic approach to the problem of side-effects of NSAIDs was used by Gabriel and colleagues,135 who examined the results of 16 nonexperimental studies (nine case-control and seven cohort) of serious gastrointestinal complications related to use of NSAIDs. The studies had to have a comparison group and provide an estimate of risk for serious gastrointestinal complications (defined as bleeding, perforation, or other adverse gastrointestinal events resulting in hospitalization or death) in NSAID users compared with nonusers, regardless of underlying disease. They excluded studies if the primary goal was to assess effectiveness.

The odds ratio found by these authors for gastrointestinal bleeding, based on nine studies reporting this endpoint, was 2.39 (CI 2.11, 2.70). The authors performed separate analyses for case-control and cohort studies. Although these separate summaries are only reported graphically and the exact values are difficult to read, the summary odds ratio from the cohort studies is clearly closer to unity than the result from the case-control studies. These authors also found that the size of the odds ratio was related to the duration of NSAID use. Interestingly, though, the highest odds ratios were obtained from studies in which the duration of NSAID consumption was less than 1 month. (Note: Gabriel et al. only presented this finding without adjustment for study design, case-control versus cohort. Although they performed a multiple regression to examine interstudy heterogeneity, the findings of that model with respect to the individual potential sources of heterogeneity were not presented. It is possible that the studies with under one month of NSAID use were also predominantly case-control studies, but that cannot be determined from their paper, leaving the underlying source of heterogeneity somewhat ambiguous.)

The consistency of results for gastrointestinal bleeding between the two meta-analyses is of interest and lends some support to a causal association. Several points are important in considering the above results. In a cohort study of gastrointestinal bleeding and NSAIDs, Carson and colleagues136 found a quadratic duration-response relation. They argue that this is compatible with an increasing risk with increasing duration of NSAID use, as suggested by Chalmers et al.,39 until many of those patients who would develop gastrointestinal bleeding from NSAIDs were removed from the cohort and then the risk declined. This reasoning may explain the apparently anomalous finding by Gabriel et al.135 of highest odds ratios for studies with less than one month of NSAID use.

Bollini and colleagues137 also examined epidemiologic studies that investigated the association between NSAIDs and severe upper gastrointestinal tract disease, including hematemesis, melena, peptic ulcer, ulcer perforation, and death attributable to these outcomes. The studies had to compare groups exposed and unexposed to NSAIDs. Of the 34 studies they examined, seven were cohort, eight case-control with community
controls, and 19 case-control with hospital controls. The type of study design was associated with varying estimates of the relative risk. Case-control studies with hospital controls had the highest average relative risk (4.4 [3.3-6.0]) and the cohort studies had the lowest (2.0 [1.2-3.2]). They found that studies with satisfactory methods yielded on average a lower relative risk (2.6 [1.8-3.9]) as compared with studies whose methods were unsatisfactory (4.2 [3.1-5.6]).

In perhaps the most comprehensive and clinically useful of the systematic reviews in the area of NSAIDs side effects, Henry and colleagues138 addressed the issue of comparative relative risks of serious gastrointestinal complications with individual NSAIDs. Their stated motivation for this approach was that one strategy for reducing NSAID toxicity in populations would be to choose, as first line therapy, a drug and dose with a comparatively low risk of gastrointestinal side-effects.

The authors used meta-analytic methods to examine the range of relative risks for particular NSAIDs and explore the extent to which differences in toxicity could be related to different doses, or to different susceptibility among patients receiving the various drugs. To do this, they identified case-control or cohort studies of relationships between use of specific NSAIDs in the community and development of serious peptic ulcer complications requiring hospital admission. In estimating pooled relative risks, analyses were restricted to studies that compared another drug with ibuprofen as the reference. They used unadjusted relative risks based on 2 × 2 tables in the pooling.

The authors found 12 studies examining 14 NSAIDs, including two unpublished reports. Eleven of the studies were case-control studies. The estimated relative risks for specific drugs versus ibuprofen ranged from 1.6 (95% CI 1.0, 2.5) for fenoprofen, to 9.2 (95% CI 4.0, 21) for azapropazone. All of the relative risks were significantly greater than 1.0. Using a weighted ranking system, which incorporated study size into the weights, the authors found that ibuprofen had the lowest rank (least toxicity), followed by diclofenac. Aspirin and naproxen had intermediate risks, while azapropazone, tolmetin, and ketoprofen had the highest risks. High dose ibuprofen (i.e., greater than 1600 mg daily) was also associated with an elevated relative risk.

It is important to keep in mind that the conclusions reached by Henry et al. were based on indirect comparisons of the various drugs with ibuprofen. They claimed to find little evidence that the relative rankings were due to confounding by patient susceptibility. Despite any shortcomings of their approach, as the authors point out, clinical and regulatory decisions have to be made on some type of scientific basis, and these are the only data available. Risks need to be weighed against benefit, and the authors highlight the known variability across patients in clinical response to particular drugs. Thus, it seems that this systematic review provided useful information for clinical decision making.

In qualitative reviews of the literature on the gastrointestinal side effects of NSAIDs, Taragin et al.139 and Carson and Strom140 point out differences in study designs that could lead to differences in results. For example, bleeding could be defined as all bleeding, fatal bleeding, bleeding requiring hospitalization, bleeding requiring transfusions, or bleeding requiring surgery. Several procedures exist for the detection of gastrointestinal bleeding. The clinical relevance of the different methods is sometimes unclear. Case-control studies may show higher odds ratios because of the likelihood of recall bias; patients with bleeding requiring hospitalization might be more likely to recall NSAID use than controls, particularly if probing by interviewers, or by health care providers prior to interview, is more extensive for cases than for controls. This possibility is supported by the data from the meta-analysis of Gabriel et al. Cohort studies based on claims data, such as that conducted by Carson and colleagues136 described above, sometimes use unvalidated outcomes. To the extent that false events may be documented for both the exposed and unexposed cohorts, the relative risk observed in such studies would show less of an effect of exposure. Of course, these cohort studies may exaggerate the apparent effect of exposure if spurious diagnoses of gastrointestinal events are more likely to occur when a patient
has a history of NSAID use. Further variability may be generated among study results by the inclusion of many different kinds of NSAID, some of which may have more potential to cause gastrointestinal side-effects than others.

Thus, another benefit of meta-analysis is the ability to examine findings according to study characteristics and study design, leading to hypotheses about subgroups or particular therapies of special interest and suggestions for the design of subsequent studies. Meta-analysis can quantify differences related to study design that the traditional review can only observe in qualitative terms.

There are numerous other examples of the application of meta-analysis to the evaluation of adverse effects of pharmaceutical therapies. These include the following.

1. Two meta-analyses have been published on the effects of prophylactic lidocaine in acute myocardial infarction.141-143 These studies showed that, although lidocaine effectively prevented ventricular fibrillation,143 there seemed to be an excess in mortality among those patients randomly allocated to lidocaine compared with those allocated to placebo. In a related paper, using meta-analytic regression methods, Antman and Berlin144 calculate that, given the low baseline incidence of ventricular fibrillation in the current coronary care unit environment, 400 patients would require treatment with lidocaine to prevent a single episode of ventricular fibrillation. Considering this estimate in addition to the possibly increased risk of mortality, the authors suggest that prophylactic lidocaine should not be given routinely.

2. In a meta-analysis of randomized trials regarding the efficacy and safety of quinidine therapy for the maintenance of sinus rhythm after cardioversion,145 the authors show that quinidine is, indeed, effective at maintaining sinus rhythm, but that mortality seems to be elevated in quinidine patients (mortality odds ratio = 2.98, 95% CI 1.1, 8.3, based on 12 deaths among patients randomized to quinidine versus only three among patients randomized to placebo). In a subsequent paper, the authors examine the relationship between study design and outcome.146 The nonrandomized studies tended to show less benefit of quinidine with respect to maintenance of sinus rhythm compared with the randomized studies. The odds ratios for mortality also varied according to study design, although there were few deaths overall: OR = 3.5 (1.0, 12.4) for randomized studies; OR = 9.9 (0.8, 123.2) for nonrandomized studies. The overall mortality risk was 2.0% (34/1709) for quinidine-treated patients and 0.6% (4/681) for all control patients.

3. A third example is the meta-analysis of nonexperimental studies of oral contraceptives and the risk of breast cancer discussed above.20 An association between increasing odds ratio and increasing duration of oral contraceptive use was found in case-control studies in which the cases were mostly premenopausal (defined as age limit of 45 years old or less). In a subsequent methodologic paper, Berlin and colleagues3 use these same data to show that the magnitude of the odds ratio depends not only on the menopausal status of the cases but on the calendar years during which cases were accrued, presumably because of the changing formulations of oral contraceptives.

New Indications for Existing Therapies

Meta-analysis has also been used to assess the effectiveness of existing therapies for new indications. As an example, the efficacy of antilymphocyte antibodies in the perioperative period of cadaveric kidney transplantation (induction therapy) had not, until recently, been conclusively demonstrated. Individual studies, both randomized and nonexperimental, had failed individually to show a significant benefit of induction therapy with respect to allograft survival. Szczech and colleagues performed a meta-analysis of the published data from the randomized trials of induction therapy147 in adults receiving cadaveric renal transplants. That analysis, using survival analytic methods on the group level (published) data, showed a statistically significant 31% lower
rate of allograft failure at two years in patients receiving induction therapy.

In a subsequent analysis of the individual patient data from five of the seven randomized trials of induction therapy, Szczech and colleagues examined the effect of induction therapy beyond two years and in subgroups of patients with risk factors for early allograft failure.148 The subgroup analyses are examined in the next section. The five studies included in the individual patient analyses gave results for the two year analysis virtually identical to those obtained from the full set of seven studies using the published data, i.e., a relative rate of 0.69 favoring induction therapy. When extended to five years, the rate of allograft survival was 69.0% in patients receiving induction therapy and 64.4% in those not receiving induction therapy (p = 0.13). Thus, the overall benefit demonstrated at two years was smaller and it was no longer significant at five years.

Differential Effects Among Subgroups of Patients

In the analysis of individual patient data by Szczech and colleagues, the authors were able to examine the specific effects of induction therapy in subgroups of patients at high risk for allograft failure. Before proceeding to analyses within particular subgroups, the authors tested statistical interactions between each of the relevant patient characteristics and induction therapy. One of the patient characteristics of interest was the panel reactive antibody level (PRA), an indicator of immune system presensitization. Patients with PRA levels less than 20% were considered unsensitized, while those with PRA of 20% or higher were considered presensitized. At two years, the effect of induction therapy differed in presensitized and unsensitized patients (p = 0.03 for interaction). The rate ratio at two years was 0.12 (CI 0.03–0.44, p = 0.002) in presensitized patients (85 patients with 15 failures) and 0.74 (CI 0.50–1.09, p = 0.13) in unsensitized patients (511 patients with 100 failures). This interaction was still significant at five years (p = 0.009 for interaction), with a rate ratio of 0.20 (CI 0.09–0.47, p < 0.001) in presensitized patients (85 patients with 33 failures) and 0.97 (CI 0.71–1.32, p > 0.2) in unsensitized patients (510 patients with 163 failures). The authors found no other significant interactions between induction therapy and any other variable.148

Several advantages of meta-analysis, and particularly individual patient analyses, are demonstrated by this example. The improved precision provided by large numbers of patients is an important benefit. Having individual level data allowed an analysis that could go beyond the simple, unadjusted analyses to which most meta-analyses of published data are limited. The availability of patient characteristics permitted not only adjustment for those characteristics, but also examination of subgroup effects in larger numbers of patients than would typically be included in a single trial. Although one might wish to confirm these subgroup results in an independent data set, the patient level analyses strongly suggest that induction therapy is effective in the 14% of patients who are presensitized. If confirmed, these results could mean that induction therapy could be targeted to the group in which it is highly effective, while avoiding needless treatment and potential toxicity in other patients.

In another example of a meta-analysis addressing subgroup issues, Midgette and colleagues149 examined the use of intravenous streptokinase in patients with suspected acute myocardial infarction. The authors found a summary relative risk of 0.72 (0.65, 0.79), favoring streptokinase, in patients with suspected acute anterior myocardial infarction, and a summary risk ratio of 0.87 (0.76, 1.01) in those with a suspected inferior infarction. The authors conclude that there is a protective effect of intravenous streptokinase in anterior infarction, but that the protective effect in inferior infarction is smaller and less certain.

Selection From Among Several Alternative Therapies

In a meta-analysis of therapies for the prevention of supraventricular arrhythmias after coronary bypass surgery, Andrews et al.150 looked separately at verapamil, digoxin, and β-adrenoceptor blockers as prophylactic agents. Only randomized trials
were included. Neither digoxin nor verapamil reduced the risk of supraventricular arrhythmias after coronary artery bypass surgery (digoxin: OR = 0.97, CI = 0.62, 1.49; verapamil: OR = 0.91, CI = 0.57, 1.46). The risk of a supraventricular arrhythmia in patients treated with β-blockers was dramatically reduced (OR = 0.28, CI = 0.21, 0.36), although significant heterogeneity among the study results was present. The authors explored this heterogeneity by examining separately studies of different β-blockers, and by summarizing separately preoperative and postoperative treatment. While these separate summaries suggested varying degrees of heterogeneity within subgroups of studies, all of the summaries showed statistically significant benefits of β-blockers. The authors drew no firm conclusions from the subgroup analyses other than to suggest directions for future research.

As another example, for several decades, heparin has been used as the primary antithrombotic drug for the initial treatment of venous thromboembolism. There has been some controversy over the optimal mode of administration of heparin: intermittent intravenous, continuous intravenous, or subcutaneous injection. While continuous infusion has been shown to be safer than intermittent injection, and equally effective, continuous infusion has disadvantages, such as the need for hospitalization for most patients, possibly prolonged immobilization, enhanced risk for sepsis related to the infusion cannula, and possible increased cost.151 Hommes and colleagues performed a meta-analysis of randomized trials of intravenous versus subcutaneous heparin administration in the initial treatment of deep vein thrombosis.151 They found eight studies meeting their inclusion criteria. The overall summary relative risk for efficacy was 0.62 (0.39, 0.98), indicating a benefit from the use of subcutaneous compared with intravenous administration. The analysis of safety (i.e., risk of major hemorrhage) also showed a modest benefit of subcutaneous injection (relative risk = 0.79, CI 0.42, 1.46). In the analysis of efficacy, as in others described above, there was highly significant among study heterogeneity. The source of this heterogeneity was apparently a single study showing a significant benefit from intravenous administration.152 The authors of the meta-analysis speculate that the particular study failed to achieve therapeutic levels of heparin in the subcutaneous group.

A somewhat different approach to subgroup analyses has emerged in recent years. This strategy views the risk of events in the control group of a trial (baseline risk) as a general indicator of severity of disease in the treated population. The relationship between treatment benefit and baseline risk can then be estimated, i.e., examining whether there is an interaction between treatment and baseline risk.101, 102, 153 A number of statistical issues arise in such analyses, including the inherent association induced by regressing treatment benefit (e.g., the log relative risk, which is calculated using the baseline risk) on baseline risk, and the fact that the baseline risk, except in very large studies, is affected by sampling variability. These approaches all use group level (published) data, examining whether the risk in the population is associated with the magnitude of treatment benefit. This may not necessarily yield the same result as looking at the interaction on an individual patient basis between treatment and individual level estimates of risk. It is also clearly different clinically from examining specific patient characteristics, as opposed to calculating estimates of risk that depend on several characteristics. Information may be lost by using multivariable risk estimates, as opposed to potentially biologically specific patient characteristics.

Saving Time and Money If You Believe a Meta-Analysis

One of the potential benefits of meta-analysis is the potential to shorten the time between a medical research finding and the clinical implementation of a new therapy. This is a concern not only for the development of new drugs, but for the exploration of new indications for existing therapies. A group at Harvard University has advocated the routine use of what they have termed ``cumulative meta-analysis,'' i.e., performing a new meta-analysis each time the results of a new clinical trial are published.40, 154 Antman et al.40 applied this technique in combination with a classification
scheme of the treatment recommendations for myocardial infarction found in review articles and textbook chapters. They found many discrepancies between the evidence contained in the published randomized trials and the timeliness of the recommendations.

As an example, Antman and colleagues analyzed data from 17 trials of β-blockers for the prevention of death in the years following a myocardial infarction.40 On the left hand side of Figure 38.3, reproduced from their paper, the data are presented as a traditional meta-analysis, with individual study results presented along with the summary odds ratio arbitrarily estimated after 17 trials had been completed. On the right hand side of Figure 38.3, the same data are presented as a cumulative meta-analysis, with an updated summary estimate calculated after the completion of each new trial. The cumulative meta-analysis clearly shows that the updated pooled estimate became statistically significant in 1977 and has remained so ever since.

The process of cumulative meta-analysis was applied by these authors to eight therapies for acute myocardial infarction. In five of the six instances in which the cumulative meta-analyses

Figure 38.3. Results of 17 randomized control trials of the effects of oral β-blockers for secondary prevention of mortality
in patients surviving a myocardial infarction presented as two types of meta-analysis. On the left is the traditional one,
revealing many trials with nonsignificant results but a highly significant estimate of the pooled results on the bottom of the
panel. On the right, the same data are presented as cumulative meta-analyses, illustrating that the updated pooled
estimate became statistically significant in 1977 and has remained so up to the present. Note that the scale is changed on
the right-hand graph to improve clarity of the confidence intervals. Reprinted with permission from Antman et al., Journal
of the American Medical Association; July 8, 1992; volume 268; pages 240–248. Copyright 1992, American Medical
Association.
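
The cumulative pooling illustrated in the right-hand panel of Figure 38.3 can be sketched in a few lines of code. The sketch below is only an illustration under stated assumptions: it uses fixed-effect inverse-variance pooling of log odds ratios with Woolf's variance, which is not necessarily the method used by Antman et al., and the function name `cumulative_meta_analysis` and the four trials are hypothetical, invented solely to show the mechanics.

```python
import math


def cumulative_meta_analysis(trials):
    """Pool odds ratios cumulatively, adding one trial at a time in year order.

    Each trial is (year, events_treated, n_treated, events_control, n_control).
    Returns a list of (year, pooled_or, ci_low, ci_high), one entry per trial added.
    Pooling is fixed-effect inverse-variance on the log odds ratio scale.
    """
    results = []
    sum_weights = 0.0           # running sum of 1 / variance
    sum_weighted_log_or = 0.0   # running sum of log(OR) / variance
    for year, a, n_treated, c, n_control in sorted(trials):
        b = n_treated - a       # treated patients without the event
        d = n_control - c       # control patients without the event
        if 0 in (a, b, c, d):   # continuity correction for empty cells
            a, b, c, d = a + 0.5, b + 0.5, c + 0.5, d + 0.5
        log_or = math.log((a * d) / (b * c))
        variance = 1 / a + 1 / b + 1 / c + 1 / d   # Woolf's approximation
        sum_weights += 1 / variance
        sum_weighted_log_or += log_or / variance
        pooled = sum_weighted_log_or / sum_weights
        se = math.sqrt(1 / sum_weights)
        results.append((year,
                        math.exp(pooled),
                        math.exp(pooled - 1.96 * se),
                        math.exp(pooled + 1.96 * se)))
    return results


# Hypothetical trials, invented for illustration only (not data from any cited study).
trials = [
    (1972, 10, 100, 14, 100),
    (1974, 25, 300, 33, 300),
    (1977, 60, 900, 85, 900),
    (1981, 98, 1900, 152, 1900),
]

for year, pooled_or, low, high in cumulative_meta_analysis(trials):
    flag = "CI excludes 1" if high < 1 or low > 1 else "CI includes 1"
    print(f"{year}: OR = {pooled_or:.2f} ({low:.2f}, {high:.2f})  {flag}")
```

Each printed row corresponds to one line of a cumulative plot: the pooled estimate and its 95% confidence interval after the trials published up to that year have been combined. The interval is unadjusted for repeated looks at the accumulating data.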
revealed the therapies to be of statistically significant benefit in reducing in-hospital mortality, it was several years before experts recommended the therapy with any consistency. An important example was thrombolytic therapy, which did not begin to be recommended by more than half the experts, even for specific indications, until 13 years after the cumulative meta-analysis would have shown therapy to be effective. Six years passed between the publication, in a major journal,40, 155 of the first meta-analysis showing an impressive reduction in mortality by thrombolytic therapy and the year in which the majority of the experts whose opinions were studied by the authors recommended it for routine or specific use. In 1985, a 20% reduction in mortality was established at the p < 0.001 level (OR = 0.78; CI 0.69, 0.90). A total of 14 reviews after 1985 did not mention the treatment or felt that it was still experimental. The authors concluded that identifying and interpreting the therapeutic trials in a given field is extremely difficult, so that clinical experts need access to better databases and new statistical techniques (such as cumulative meta-analysis).

Some caution may be advised in interpreting cumulative meta-analyses. The issue of multiple statistical tests, for example, is considered by some to be an important consideration. The problem is that testing and estimation procedures may need to make adjustments for the increased probability of a spurious positive finding (type I, or α error) introduced by the use of repeated statistical tests.100, 156 At the least, one might wish to consider using a more stringent criterion for statistical significance than the traditional p < 0.05 cutoff. Another consideration is that estimates of treatment effect may not be stable over time, perhaps due to changing clinical environments. In the β-blocker example, there is an apparent ``drift'' of the effect estimate back toward the null in more recent years, i.e., treatment appears to be less effective in the most recent studies. Thus, it may be important to re-evaluate therapies as other treatment strategies evolve for the same conditions.

A final caution with regard to interpreting cumulative meta-analyses relates to the continuing need for well designed randomized controlled trials. New indications for existing therapies, for example, are often suggested by nonexperimental studies, including cohort and case-control studies and nonrandomized phase 2 clinical trials. The results of these studies are not always confirmed by subsequent, properly designed randomized trials. For example, consider the case of β-carotene in the prevention of cancer. A series of observational studies (see Ziegler et al.157 for a review) examined the relation between dietary intake of foods rich in β-carotene and the risk of lung cancer. Overall, they showed a relatively consistent association between diets rich in β-carotene and reduced risk of lung cancer. Subsequent randomized trials of this specific nutrient as a supplement have failed to confirm a protective effect against lung cancer.158–160

THE FUTURE

The examples above have raised several important issues that will need to be addressed in the future. A set of issues not fully addressed above relates to the availability of individual level data. From the above examples, it becomes increasingly apparent that the pursuit of questions about subgroups of patients is often an informative and important element of a well conducted meta-analysis, at least for certain therapies. One should certainly exercise due caution in the interpretation of subgroup effects, emphasizing those that are specified a priori with biological justification. By assembling large numbers of patients, meta-analysis can at least begin to address the problems related to statistical instability of subgroup effects. It is too often the case, however, that results are not reported separately for subgroups of patients. Typically, some trials will exclude particular patients while others will not exclude them. At the level of grouped data from published reports, one is faced with analyzing the two groups of studies separately as the only way of addressing the subgroup question, or using meta-regression techniques on what amounts to ecological data. As a trivial hypothetical example, suppose one wanted to perform separate analyses of the effect of treatment X in men and women. Among the existing randomized trials, six exclude women and four do not, but the four also include men. Ideally,
one would like to obtain information on the effect in men alone from all of the ten trials, since all include men. Similarly, one could obtain a separate estimate of the effect of treatment in women from the four studies including women. It is possible to use the group level data only to show, for example, that studies that include women tend to show different effects than studies excluding women, but one cannot perform a separate analysis of women. One might alternatively regress the treatment effect measure (e.g., log relative risk) against the percent male (or female), but that is still less satisfactory than obtaining patient level data. For further practical discussions of the use of individual patient data, see Stewart and Clarke161 and Stewart and Parmar.162 Mechanisms for the sharing of person level data need to be promoted.

In the development of cumulative meta-analysis, some of the most important issues will be philosophical ones. Some of the same issues apply to the approval process for new drugs. How much evidence is required before a therapy can be accepted as efficacious? Should we require the existence of a certain minimum number of trials showing a statistically significant benefit of a therapy? Suppose ten studies all show a 20% (or thereabouts) reduction in mortality in patients treated with drug X compared to placebo, but none of the individual studies shows a statistically significant effect. If the combined analysis shows a highly statistically significant 20% reduction in mortality due to treatment X, and 20% is considered clinically significant, should the combined analysis be sufficient evidence for the acceptance of drug X as beneficial? What would an additional, large clinical trial contribute?

In this context, it is worth noting that several empirical studies have examined discrepancies between large trials and meta-analyses of the same therapies.163–168 The assumption made by some of the authors of these studies, that larger studies are necessarily better studies, may not be valid. Replication of a finding by independent studies must certainly be a key element to establishing efficacy, as compared with a single trial. Large trials may also be poorly designed. For example, for BCG vaccine, a huge trial using passive followup, and therefore missing (by the authors' own admission) at least 50% of all cases of tuberculosis, failed to show the protective effect found in a number of other studies with more complete followup.169, 170

When there is little or no heterogeneity of results among trials, and the likelihood of serious publication bias is minimal, one might be willing to accept meta-analytic evidence as helping to establish effectiveness. It is less obvious what to do with the results of a meta-analysis when there is substantial heterogeneity. If the heterogeneity is adequately explained in the analysis in terms of subgroup effects, or trial quality, meta-analysis might still be an acceptable part of demonstrating effectiveness, but such a conclusion might be conditional on the type of patient or other factors.

Similarly, the technique of cumulative meta-analysis could be applied to the analysis of adverse events. As nonexperimental studies of adverse effects are completed, the same approach could be applied. The likelihood seems to be, however, that such meta-analyses would be faced with much more serious issues of heterogeneity of findings than meta-analyses of randomized studies typically have to confront. The acceptance of meta-analytic results in this context might be extremely slow. (Consider the slow pace, demonstrated by Antman et al.,40 with which new therapies are accepted even when the evidence is provided by randomized trials.) Even if a meta-analysis of, for example, oral contraceptive use and breast cancer risk were to show a convincing, consistent duration-response relationship, the issue of what to do with that information is complex. If the relative risk for 10 years of use is 1.5, is that sufficient to warrant removal of oral contraceptives from the market? Would 2.5 be a sufficiently high relative risk? What other factors, e.g., family history, etc., need to be considered when prescribing oral contraceptives? While these are clearly more general issues, not restricted to the interpretation of meta-analyses, the additional precision provided by meta-analyses makes their interpretation all the more difficult.

The concept of prospective meta-analysis also merits further attention. Along with registration of trials, this closely related strategy has been advocated as a means of avoiding publication bias.130–133 It may be possible, however, to go
beyond simply planning the logistics of multiple trials and the collection of common data elements to allow pooling of results upon the completion of all trials. It may be possible to go further toward planning the scientific questions to be addressed. As a simple example, by regulation, sex and age (adult versus pediatric) would need to be addressed for a new analgesic. In addition, it would be important to consider indication (emergency department, postoperative, etc.) and dose (cumulative dose, daily dose, need for a loading dose, etc.). How best to design the series of studies to address all of these questions, either simultaneously or sequentially, needs further consideration (see Berlin and Colditz for a more complete discussion of this issue).33

While there are no easy answers to many of these questions, it is clear that meta-analysis will play an increasingly important role in the formulation of treatment and policy recommendations. Thus, the quality of the meta-analyses performed is of the utmost importance and needs to be reviewed by the scientific community in an open, published forum. Meta-analyses, if they are carefully interpreted in view of their strengths and weaknesses, should prove to be extremely helpful in pharmacoepidemiology research.

REFERENCES

1. DerSimonian R, Laird N. Meta-analysis in clinical trials. Control Clin Trials 1986; 7: 177–88.
2. Last JM. A Dictionary of Epidemiology. New York: Oxford University Press, 1988.
3. Berlin JA, Longnecker MP, Greenland S. Meta-analysis of epidemiologic dose-response data. Epidemiology 1993; 4: 218–28.
4. Feinstein AR. Para-analysis, faute de mieux, and the perils of riding on a data barge. J Clin Epidemiol 1989; 42: 929–35.
5. Shapiro S. Is meta-analysis a valid approach to the evaluation of small effects in observational studies? J Clin Epidemiol 1997; 50: 223–29.
6. Egger M, Schneider M, Davey Smith G. Spurious precision? Meta-analysis of observational studies. Br Med J 1998; 316: 140–4.
7. Thacker SB. Meta-analysis. A quantitative approach to research integration. J Am Med Assoc 1988; 259: 1685–9.
8. Buffler PA. The evaluation of negative epidemiologic studies: the importance of all available evidence in risk characterization. Regul Toxicol Pharmacol 1989; 9: 34–43.
9. Sacks HS, Berrier J, Reitman D, Ancona-Berk VA, Chalmers TC. Meta-analyses of randomized controlled trials. New Engl J Med 1987; 316: 450–5.
10. L'Abbe KA, Detsky AS, O'Rourke K. Meta-analysis in clinical research. Ann Intern Med 1987; 107: 224–33.
11. Colditz GA, Burdick E, Mosteller F. Heterogeneity in meta-analysis of data from epidemiologic studies: a commentary. Am J Epidemiol 1995; 142: 371–82.
12. Berlin JA. Invited commentary: benefits of heterogeneity in meta-analysis of data from epidemiologic studies. Am J Epidemiol 1995; 142: 383–7.
13. Dickersin K, Berlin JA. Meta-analysis: state-of-the-science. Epidemiol Rev 1992; 14: 154–76.
14. Mulrow CD. The medical review article: state of the science. Ann Intern Med 1987; 106: 485–8.
15. Oxman AD, Guyatt GH. Guidelines for reading literature reviews. Can Med Assoc J 1988; 138: 697–703.
16. Chalmers I. Improving the quality and dissemination of clinical research. In: Lock S, ed., The Future of Medical Journals: in Commemoration of 150 Years of the British Medical Journal. London: British Medical Society, 1991; 127–46.
17. Kass EH. Reviewing reviews. In: Coping with the biomedical literature: a primer for the scientist and the clinician. Warren KS, ed. New York: Praeger, 1981. 71–91.
18. Huth EJ. Needed: review articles with more scientific rigor. Ann Intern Med 1987; 106: 470–1.
19. Canner PL. An overview of six clinical trials of aspirin in coronary heart disease. Stat Med 1987; 6: 255–67.
20. Romieu I, Berlin JA, Colditz G. Oral contraceptives and breast cancer. Review and meta-analysis. Cancer 1990; 66: 2253–63.
21. Longnecker MP, Berlin JA, Orza MJ, Chalmers TC. A meta-analysis of alcohol consumption in relation to risk of breast cancer. J Am Med Assoc 1988; 260: 652–6.
22. Longnecker MP, Orza MJ, Adams ME, Vioque J, Chalmers TC. A meta-analysis of alcoholic beverage consumption in relation to risk of colorectal cancer. Cancer Causes Control 1990; 1: 59–68.
23. Wong O, Raabe GK. Critical review of cancer epidemiology in petroleum industry employees, with a quantitative meta-analysis by cancer site. Am J Ind Med 1989; 15: 283–310.
24. Frumkin H, Berlin J. Asbestos exposure and gastrointestinal malignancy: review and meta-analysis. Am J Ind Med 1988; 14: 79–95 [published erratum appears in Am J Ind Med 1988; 14: 493].
25. Berlin JA, Colditz GA. A meta-analysis of physical activity in the prevention of coronary heart disease. Am J Epidemiol 1990; 132: 612–28.
26. Gray A, Berlin JA, McKinlay JB, Longcope C. An examination of research design effects on the association of testosterone and male aging: results of a meta-analysis. J Clin Epidemiol 1991; 44: 671–84.
27. Booth-Kewley S, Friedman HS. Psychological predictors of heart disease: a quantitative review. Psychol Bull 1987; 101: 343–62.
28. Morgenstern H, Glazer WM, Niedzwiecki D, Nourjah P. The impact of neuroleptic medication on tardive dyskinesia: a meta-analysis of published studies. Am J Public Health 1987; 77: 717–24.
29. Greenland S. Quantitative methods in the review of epidemiologic literature. Epidemiol Rev 1987; 9: 1–30.
30. Greenland S, Longnecker MP. Methods for trend estimation from summarized dose-response data, with applications to meta-analysis. Am J Epidemiol 1992; 135: 1301–9.
31. Greenland S, Salvan A. Bias in the one-step method for pooling study results. Stat Med 1990; 9: 247–52.
32. Fleiss JL, Gross AJ. Meta-analysis in epidemiology, with special reference to studies of the association between exposure to environmental tobacco smoke and lung cancer: a critique. J Clin Epidemiol 1991; 44: 127–39.
33. Berlin JA, Colditz GA. The role of meta-analysis in the regulatory process for foods, drugs, and devices. J Am Med Assoc 1999; 281: 830–4.
34. O'Neill RT, Anello C. Does research synthesis have a place in drug regulatory policy? Synopsis of issues: assessment of efficacy and drug approval. Clin Res Regul Affairs 1996; 13: 23–9.
35. Poole C, Greenland S. Random-effects meta-analyses are not always conservative. Am J Epidemiol 1999; 150: 469–75.
36. Jones DR. Meta-analysis of observational epidemiological studies: a review. J R Soc Med 1992; 85: 165–8.
37. Einarson TR, Leeder JS, Koren G. A method for meta-analysis of epidemiological studies. Drug Intell Clin Pharm 1988; 22: 813–24.
38. Greenland S. Invited commentary: a critical look at some popular meta-analytic methods. Am J Epidemiol 1994; 140: 290–6.
39. Chalmers TC, Berrier J, Hewitt P, Berlin J, Reitman D, Nagalingam R, Sacks H. Meta-analysis of randomized controlled trials as a method of estimating rare complications of non-steroidal anti-inflammatory drug therapy. Aliment Pharmacol Ther 1988; 2: 9–26.
40. Antman EM, Lau J, Kupelnick B, Mosteller F, Chalmers TC. A comparison of results of meta-analyses of randomized control trials and recommendations of clinical experts. Treatments for myocardial infarction. J Am Med Assoc 1992; 268: 240–8.
41. Begg CB, Berlin JA. Publication bias and dissemination of clinical research. J Natl Cancer Inst 1989; 81: 107–15.
42. Easterbrook PJ, Berlin JA, Gopalan R, Matthews DR. Publication bias in clinical research. Lancet 1991; 337: 867–72.
43. Dickersin K, Min YI, Meinert CL. Factors influencing publication of research results. Follow-up of applications submitted to two institutional review boards. J Am Med Assoc 1992; 267: 374–8.
44. Dickersin K. The existence of publication bias and risk factors for its occurrence. J Am Med Assoc 1990; 263: 1385–9.
45. Chalmers I. Underreporting research is scientific misconduct. J Am Med Assoc 1990; 263: 1405–8.
46. Chalmers TC, Frank CS, Reitman D. Minimizing the three stages of publication bias. J Am Med Assoc 1990; 263: 1392–5.
47. Moher D, Berlin J. Improving the reporting of randomised controlled trials. In: Non-random reflections on health services research. Maynard A, Chalmers I, eds., London: BMJ, 1997. 250–71.
48. Dickersin K, Min YI. NIH clinical trials and publication bias. Online J Curr Clin Trials 1993; Doc. No. 50.
49. Stern JM, Simes RJ. Publication bias: evidence of delayed publication in a cohort study of clinical research projects. Br Med J 1997; 315: 640–5.
50. Ioannidis JP. Effect of the statistical significance of results on the time to completion and publication of randomized efficacy trials. J Am Med Assoc 1998; 279: 281–6.
51. Berlin JA, Begg CB, Louis TA. An assessment of publication bias using a sample of published clinical trials. J Am Stat Assoc 1989; 84: 381–92.
52. Chalmers TC, Berrier J, Sacks HS, Levin H, Reitman D, Nagalingam R. Meta-analysis of clinical trials as a scientific discipline. II: Replicate variability and comparison of studies that agree and disagree. Stat Med 1987; 6: 733–44.
53. Early Breast Cancer Trialists' Collaborative Group. Treatment of Early Breast Cancer. Vol I. Worldwide Evidence. New York, NY: Oxford University Press, 1990.
54. Hetherington J, Dickersin K, Chalmers I, Meinert CL. Retrospective and prospective identification of unpublished controlled trials: lessons from a survey of obstetricians and pediatricians. Pediatrics 1989; 84: 374–80.
55. Gotzsche PC. Reference bias in reports of drug trials. Br Med J Clin Res Edn 1987; 295: 654–6.
56. Egger M, Zellweger-Zahner T, Schneider M, Junker C, Lengeler C, Antes G. Language bias in randomised controlled trials published in English and German. Lancet 1997; 350: 326–9.
57. Moher D, Fortin P, Jadad AR, Juni P, Klassen T, Le Lorier J, Liberati A, Linde K, Penna A. Completeness of reporting of trials published in languages other than English: implications for conduct and reporting of systematic reviews. Lancet 1996; 347: 363–6.
58. Davidson RA. Source of funding and outcome of clinical trials. J Gen Intern Med 1986; 1: 155–8.
59. Hemminki E. Study of information submitted by drug companies to licensing authorities. Br Med J 1980; 280: 833–6.
60. Cho MK, Bero LA. The quality of drug studies published in symposium proceedings. Ann Intern Med 1996; 124: 485–9.
61. Chalmers TC. Problems induced by meta-analyses. Stat Med 1991; 10: 971–9.
62. Conn HO, Blitzer BL. Nonassociation of adrenocorticosteroid therapy and peptic ulcer. New Engl J Med 1976; 294: 473–9.
63. Messer J, Reitman D, Sacks HS, Smith H Jr, Chalmers TC. Association of adrenocorticosteroid therapy and peptic-ulcer disease. New Engl J Med 1983; 309: 21–4.
64. Conn HO, Poynard T. Adrenocorticosteroid administration and peptic ulcer: a critical analysis. J Chron Dis 1985; 38: 457–68.
65. Chalmers TC. Meta-analysis in clinical medicine. Trans Am Clin Climatol Assoc 1987; 99: 144–50.
66. DerSimonian R. Parenteral nutrition with branched-chain amino acids in hepatic encephalopathy: meta analysis. Hepatology 1990; 11: 1083–4.
67. Erikkson LS, Conn HO. Branched-chain amino acids in hepatic encephalopathy. Gastroenterology 1990; 99: 604–7 [published erratum appears in Gastroenterology 1990; 99: 1547].
68. Naylor CD, O'Rourke K, Detsky AS, Baker JP. Parenteral nutrition with branched-chain amino acids in hepatic encephalopathy. A meta-analysis. Gastroenterology 1989; 97: 1033–42.
69. Dickersin K, Hewitt P, Mutch L, Chalmers I, Chalmers TC. Perusing the literature: comparison of MEDLINE searching with a perinatal trials database. Control Clin Trials 1985; 6: 306–17.
70. Poynard T, Conn HO. The retrieval of randomized clinical trials in liver disease from the medical literature. A comparison of MEDLARS and manual methods. Control Clin Trials 1985; 6: 271–9.
71. Bernstein F. The retrieval of randomized clinical trials in liver diseases from the medical literature: manual versus MEDLARS searches. Control Clin Trials 1988; 9: 23–31.
72. Chalmers I, ed. Oxford database of perinatal trials. Oxford: Oxford University Press, 1988.
73. Huston P, Moher D. Redundancy, disaggregation, and the integrity of medical research. Lancet 1996; 347: 1024–6.
74. Detsky AS, Naylor CD, O'Rourke K, McGeer AJ, L'Abbe KA. Incorporating variations in the quality of individual randomized trials into meta-analysis. J Clin Epidemiol 1992; 45: 255–65.
75. Berard A, Bravo G. Combining studies using effect sizes and quality scores: application to bone loss in postmenopausal women. J Clin Epidemiol 1998; 51: 801–7.
76. Chalmers TC, Smith H Jr, Blackburn B, Silverman B, Schroeder B, Reitman D, Ambroz A. A method for assessing the quality of a randomized control trial. Control Clin Trials 1981; 2: 31–49.
77. Imperiale TF, McCullough AJ. Do corticosteroids reduce mortality from alcoholic hepatitis? A meta-analysis of the randomized trials. Ann Intern Med 1990; 113: 299–307.
78. Prendiville W, Elbourne D, Chalmers I. The effects of routine oxytocic administration in the management of the third stage of labour: an overview of the evidence from controlled trials. Br J Obstet Gynaecol 1988; 95: 3–16.
79. Mahon WA, Daniel EE. A method for the assessment of reports of drug trials. Can Med Assoc J 1964; 90: 565–9.
80. Verhagen AP, de Vet HCW, de Bie RA, Kessels AGH, Boers M, Bouter LM, Knipschild PG. The Delphi list: a criteria list for quality assessment of randomized clinical trials for conducting systematic reviews developed by Delphi consensus. J Clin Epidemiol 1998; 51: 1235–41.
81. Moher D, Jadad AR, Tugwell P. Assessing the quality of randomized controlled trials. Current issues and future directions. Int J Technol Assess Health Care 1996; 12: 195–208.
82. Moher D, Jadad AR, Nichol G, Penman M, Tugwell P, Walsh S. Assessing the quality of randomized controlled trials: an annotated bibliography of scales and checklists. Control Clin Trials 1995; 16: 62–73.
83. Greenland S. Quality scores are useless and potentially misleading. Am J Epidemiol 1994; 140: 300–1.
84. Juni P, Witschi A, Block R, Egger M. The hazards of scoring the quality of clinical trials: lessons for meta-analysis. J Am Med Assoc 1999; 282: 1054–60.
85. Schulz KF, Chalmers I, Hayes RJ, Altman DG. Empirical evidence of bias. Dimensions of methodological quality associated with estimates of treatment effects in controlled trials. J Am Med Assoc 1995; 273: 408–12.
86. Berlin JA, Miles CG, Cirigliano MD, Conill AM, Goldmann DR, Horowitz DA, Jones F, Scott E, Hanchak NA, Williams SV. Does blinding of readers affect the results of metaanalyses? Results of a randomized trial. Online J Curr Clin Trials 1997; Doc. No. 205.
87. Kleinbaum DG, Kupper LL, Morgenstern H. Epidemiologic research: principles and quantitative methods. New York: Van Nostrand Reinhold, 1982.
88. Rothman KJ. Modern epidemiology. Boston, MA: Little, Brown, 1986.
89. Yusuf S, Peto R, Lewis J, Collins R, Sleight P. Beta blockade during and after myocardial infarction: an overview of the randomized trials. Prog Cardiovasc Dis 1985; 27: 335–71.
90. Deeks J, Bradburn M, Localio R, Berlin J. Much ado about nothing: statistical models for meta-analysis with rare events. 6th International Cochrane Colloquium, Baltimore, MD, 1998.
91. Berlin JA, Laird NM, Sacks HS, Chalmers TC. A comparison of statistical methods for combining event rates from clinical trials. Stat Med 1989; 8: 141–51.
92. Fleiss JL. The statistical basis of meta-analysis. Stat Methods Med Res 1993; 2: 121–45.
93. Hardy RJ, Thompson SG. Detecting and describing heterogeneity in meta-analysis. Stat Med 1998; 17: 841–56.
94. Berlin JA, Antman EM. Advantages and limitations of metaanalytic regressions of clinical trials data. Online J Curr Clin Trials 1994; Doc. No. 134.
95. Berkey CS, Hoaglin DC, Mosteller F, Colditz GA. A random-effects regression model for meta-analysis. Stat Med 1995; 14: 395–411.
96. Vanhonacker WR. Meta-analysis and response surface extrapolation: a least squares approach. Am Stat 1996; 50: 294–9.
97. Walker AM, Martin-Moreno JM, Artalejo FR. Odd man out: a graphical approach to meta-analysis. Am J Public Health 1988; 78: 961–6.
98. Hedges LV, Vevea JL. Fixed- and random-effects models in meta-analysis. Psychological Methods 1998; 3: 486–504.
99. Carlin JB. Meta-analysis for 2 × 2 tables: a Bayesian approach. Stat Med 1992; 11: 141–58.
100. Whitehead A, Whitehead J. A general parametric approach to the meta-analysis of randomized clinical trials. Stat Med 1991; 10: 1665–77.
101. Thompson SG, Smith TC, Sharp SJ. Investigating underlying risk as a source of heterogeneity in meta-analysis. Stat Med 1997; 16: 2741–58.
102. McIntosh MW. The population risk as an explanatory variable in research synthesis of clinical trials. Stat Med 1996; 15: 1713–28.
103. Larose DT, Dey DK. Grouped random effects models for Bayesian meta-analysis. Stat Med 1997; 16: 1817–29.
104. Smith TC, Spiegelhalter DJ, Thomas A. Bayesian approaches to random-effects meta-analysis: a comparative study. Stat Med 1995; 14: 2685–99.
105. Eddy DM, Hasselblad V, Shachter R. An introduction to a Bayesian method for meta-analysis: the confidence profile method. Med Decis Making 1990; 10: 15–23.
106. Eddy DM, Hasselblad V, Shachter R. Meta-analysis by the confidence profile method: the statistical synthesis of evidence. Boston, MA: Academic, 1990.
107. Goodman SN. Meta-analysis and evidence. Control Clin Trials 1989; 10: 188–204 [published erratum appears in Control Clin Trials 1989; 10: 435].
108. Fleiss JL. Analysis of data from multiclinic trials. Control Clin Trials 1986; 7: 267–75.
109. Simon R. Overviews of randomized clinical trials. Cancer Treat Rep 1987; 71: 3–5.
110. Davey Smith G, Egger M, Phillips AN. Meta-analysis. Beyond the grand mean? Br Med J 1997; 315: 1610–14.
111. Light RJ, Pillemer DB. Summing up: the science of reviewing research. Cambridge, MA: Harvard University Press, 1984.
112. Devine EC, Cook TD. Effects of psycho-educational interventions on length of hospital stay: a meta-analytic review of 34 studies. In: Light RJ, ed., Evaluation Studies Review Annual, Vol. 8. Beverly Hills, CA: Sage, 1983.
113. Rosenthal R. The file drawer problem and tolerance for null results. Psychol Bull 1979; 86: 638–41.
114. Iyengar S, Greenhouse JB. Selection models and the file-drawer problem. Stat Sci 1988; 3: 109–17.
115. Gleser LJ, Olkin I. Models for estimating the number of unpublished studies. Stat Med 1996; 15: 2493–507.
116. Egger M, Davey Smith G, Schneider M, Minder C. Bias in meta-analysis detected by a simple, graphical test. Br Med J 1997; 315: 629–34.
117. Begg CB, Mazumdar M. Operating characteristics of a rank correlation test for publication bias. Biometrics 1994; 50: 1088–101.
118. Dear KBG, Begg CB. An approach for assessing publication bias prior to performing a meta-analysis. Stat Sci 1992; 7: 237–45.
119. Hedges LV, Olkin I. Statistical Methods for Meta-Analysis. Orlando, FL: Academic, 1985.
120. Vevea JL, Hedges LV. A general linear model for estimating effect size in the presence of publication bias. Psychometrika 1995; 60: 419–35.
121. Hedges LV, Vevea JL. Estimating effect size under publication bias: small sample properties and robustness of a random effects selection model. J Educ Behav Stat 1996; 21: 299–332.
122. Givens GH, Smith DD, Tweedie RL. Publication bias in meta-analysis: a Bayesian data-augmentation approach to account for issues exemplified in
the passive smoking debate. Stat Sci 1997; 12: 221–50.
123. Meinert CL. Toward prospective registration of clinical trials. Control Clin Trials 1988; 9: 1–5.
124. Simes RJ. Publication bias: the case for an international registry of clinical trials. J Clin Oncol 1986; 4: 1529–41.
125. Dickersin K. Report from the panel on the Case for Registers of Clinical Trials at the Eighth Annual Meeting of the Society for Clinical Trials. Control Clin Trials 1988; 9: 76–81.
126. Dickersin K. Why register clinical trials? – revisited. Control Clin Trials 1992; 13: 170–7.
127. Savulescu J, Chalmers I, Blunt J. Are research ethics committees behaving unethically? Some suggestions for improving performance and accountability. Br Med J 1996; 313: 1390–3.
128. Anonymous. Making clinical trialists register. Lancet 1991; 338: 244–5.
129. Chalmers I, Dickersin K, Chalmers TC. Getting to grips with Archie Cochrane's agenda. Br Med J 1992; 305: 786–8.
130. Valsecchi MG, Masera G. A new challenge in clinical research in childhood ALL: the prospective meta-analysis strategy for intergroup collaboration. Ann Oncol 1996; 7: 1005–8.
131. Margitic SE, Morgan TM, Sager MA, Furberg CD. Lessons learned from a prospective meta-analysis. J Am Geriatr Soc 1995; 43: 435–9.
132. Simes RJ. Prospective meta-analysis of cholesterol-lowering studies: the Prospective Pravastatin Pooling (PPP) Project and the Cholesterol Treatment Trialists (CTT) Collaboration. Am J Cardiol 1995; 76: 122C–126C.
133. Whitehead A. A prospectively planned cumulative meta-analysis applied to a series of concurrent clinical trials. Stat Med 1997; 16: 2901–13.
134. Temple RJ. The regulatory evolution of the integrated safety summary. Drug Information J 1991; 25: 485–92.
135. Gabriel SE, Jaakkimainen L, Bombardier C. Risk for serious gastrointestinal complications related to use of nonsteroidal anti-inflammatory drugs. A meta-analysis. Ann Intern Med 1991; 115: 787–96.
136. Carson JL, Strom BL, Soper KA, West SL, Morse ML. The association of nonsteroidal anti-inflammatory drugs with upper gastrointestinal tract bleeding. Arch Intern Med 1987; 147: 85–8.
137. Bollini P, Garcia Rodriguez L, Perez Gutthann S, Walker AM. The impact of research quality and study design on epidemiologic estimates of the effect of nonsteroidal anti-inflammatory drugs on upper gastrointestinal tract disease. Arch Intern Med 1992; 152: 1289–95.
138. Henry D, Lim LL-Y, Garcia Rodriguez LA, Perez Gutthann S, Carson JL, Griffin M, Savage R, Logan R, Moride Y, Hawkey C, Hill S, Fries JT. Variability in risk of gastrointestinal complications with individual non-steroidal anti-inflammatory drugs: results of a collaborative meta-analysis. Br Med J 1996; 312: 1563–6.
139. Taragin MI, Carson JL, Strom BL. Gastrointestinal side effects of the nonsteroidal anti-inflammatory drugs. Dig Dis 1990; 8: 269–80.
140. Carson JL, Strom BL. The gastrointestinal toxicity of the non-steroidal anti-inflammatory drugs. In: Side-Effects of Anti-Inflammatory Drugs 3. Rainsford KD, Velo GP, eds. Boston, MA: Kluwer, 1992; 1–8.
141. Hine LK, Laird NM, Hewitt P, Chalmers TC. Meta-analysis of empirical long-term antiarrhythmic therapy after myocardial infarction. J Am Med Assoc 1989; 262: 3037–40.
142. Hine LK, Laird N, Hewitt P, Chalmers TC. Meta-analytic evidence against prophylactic use of lidocaine in acute myocardial infarction. Arch Intern Med 1989; 149: 2694–8.
143. MacMahon S, Collins R, Peto R, Koster RW, Yusuf S. Effects of prophylactic lidocaine in suspected acute myocardial infarction. An overview of results from the randomized, controlled trials. J Am Med Assoc 1988; 260: 1910–16.
144. Antman EM, Berlin JA. Declining incidence of ventricular fibrillation in myocardial infarction. Implications for the prophylactic use of lidocaine. Circulation 1992; 86: 764–73.
145. Coplen SE, Antman EM, Berlin JA, Hewitt P, Chalmers TC. Efficacy and safety of quinidine therapy for maintenance of sinus rhythm after cardioversion. A meta-analysis of randomized control trials. Circulation 1990; 82: 1106–16 [published erratum appears in Circulation 1991; 83: 714].
146. Reimold SC, Chalmers TC, Berlin JA, Antman EM. Assessment of the efficacy and safety of antiarrhythmic therapy for chronic atrial fibrillation: observations on the role of trial design and implications of drug-related mortality. Am Heart J 1992; 124: 924–32.
147. Szczech LA, Berlin JA, Aradhye S, Grossman RA, Feldman HI. Effect of anti-lymphocyte induction therapy on renal allograft survival: a meta-analysis. J Am Soc Nephrol 1997; 8: 1771–7.
148. Szczech LA, Berlin JA, Feldman HI. The effect of antilymphocyte induction therapy on renal allograft survival. A meta-analysis of individual patient-level data. Anti-Lymphocyte Antibody Induction Therapy Study Group. Ann Intern Med 1998; 128: 817–26.
149. Midgette AS, O'Connor GT, Baron JA, Bell J. Effect of intravenous streptokinase on early mortality in patients with suspected acute myocardial infarction. A meta-analysis by anatomic location of infarction. Ann Intern Med 1990; 113:
961–8 [published erratum appears in Ann Intern Med 1991; 114: 522].
150. Andrews TC, Reimold SC, Berlin JA, Antman EM. Prevention of supraventricular arrhythmias after coronary artery bypass surgery. A meta-analysis of randomized control trials. Circulation 1991; 84: 236–44.
151. Hommes DW, Bura A, Mazzolai L, Buller HR, ten Cate JW. Subcutaneous heparin compared with continuous intravenous heparin administration in the initial treatment of deep vein thrombosis. A meta-analysis. Ann Intern Med 1992; 116: 279–84.
152. Hull RD, Raskob GE, Hirsh J, Jay RM, Leclerc JR, Geerts WH, Rosenbloom D, Sackett DL, Anderson C, Harrison L. Continuous intravenous heparin compared with intermittent subcutaneous heparin in the initial treatment of proximal-vein thrombosis. New Engl J Med 1986; 315: 1109–14.
153. Cook RJ, Walter SD. A logistic model for trend in 2 × 2 × kappa tables with applications to meta-analyses. Biometrics 1997; 53: 352–7.
154. Lau J, Antman EM, Jimenez-Silva J, Kupelnick B, Mosteller F, Chalmers TC. Cumulative meta-analysis of therapeutic trials for myocardial infarction. New Engl J Med 1992; 327: 248–54.
155. Stampfer MJ, Goldhaber SZ, Yusuf S, Peto R, Hennekens CH. Effect of intravenous streptokinase on acute myocardial infarction: pooled results from randomized trials. New Engl J Med 1982; 307: 1180–2.
156. Berkey CS, Mosteller F, Lau J, Antman EM. Uncertainty of the time of first significance in random effects cumulative meta-analysis. Control Clin Trials 1996; 17: 357–71.
157. Ziegler RG, Mayne ST, Swanson CA. Nutrition and lung cancer. Cancer Causes Control 1996; 7: 157–77.
158. The Alpha-Tocopherol/Beta Carotene Cancer Prevention Study Group. The effect of vitamin E and beta carotene on the incidence of lung cancer and other cancers in male smokers. New Engl J Med 1994; 330: 1029–35.
159. Hennekens CH, Buring JE, Manson JE, Stampfer M, Rosner B, Cook NR, Belanger C, LaMotte F, Gaziano JM, Ridker PM, Willett W, Peto R. Lack of effect of long-term supplementation with beta carotene on the incidence of malignant neoplasms and cardiovascular disease. New Engl J Med 1996; 334: 1145–9.
160. Omenn GS, Goodman GE, Thornquist MD, Balmes J, Cullen MR, Glass A, Keogh JP, Meyskens FL, Valanis B, Williams JH, Barnhart S, Hammar S. Effects of a combination of beta carotene and vitamin A on lung cancer and cardiovascular disease. New Engl J Med 1996; 334: 1150–5.
161. Stewart LA, Clarke MJ. Practical methodology of meta-analyses (overviews) using updated individual patient data. Cochrane Working Group. Stat Med 1995; 14: 2057–79.
162. Stewart LA, Parmar MK. Meta-analysis of the literature or of individual patient data: is there a difference? Lancet 1993; 341: 418–22.
163. LeLorier J, Gregoire G, Benhaddad A, Lapierre J, Derderian F. Discrepancies between meta-analyses and subsequent large randomized, controlled trials. New Engl J Med 1997; 337: 536–42.
164. Ioannidis JP, Cappelleri JC, Lau J. Issues in comparisons between meta-analyses and large trials. J Am Med Assoc 1998; 279: 1089–93.
165. Cappelleri JC, Ioannidis JP, Schmid CH, de Ferranti SD, Aubert M, Chalmers TC, Lau J. Large trials vs meta-analysis of smaller trials: how do their results compare? J Am Med Assoc 1996; 276: 1332–8.
166. Peto R, Collins R, Gray R. Large-scale randomized evidence: large, simple trials and overviews of trials. J Clin Epidemiol 1995; 48: 23–40.
167. Villar J, Piaggio G, Carroli G, Donner A. Factors affecting the comparability of meta-analyses and largest trials results in perinatology. J Clin Epidemiol 1997; 50: 997–1002.
168. Borzak S, Ridker PM. Discordance between meta-analyses and large-scale randomized, controlled trials. Examples from the management of acute myocardial infarction. Ann Intern Med 1995; 123: 873–7.
169. Colditz GA, Brewer TF, Berkey CS, Wilson ME, Burdick E, Fineberg HV, Mosteller F. The efficacy of bacillus Calmette–Guerin vaccination in the prevention of tuberculosis: meta-analysis of the published literature. J Am Med Assoc 1994; 271: 698–702.
170. Colditz GA, Berkey CS, Mosteller F, Brewer TF, Wilson ME, Burdick E, Fineberg HV. The efficacy of bacillus Calmette–Guerin vaccination of newborns and infants in the prevention of tuberculosis: meta-analyses of the published literature. Pediatrics 1995; 96: 29–35.
