Study Design. Method guidelines for systematic reviews of trials of treatments for neck and back pain.
Objective. To help review authors design, conduct and
report systematic reviews of trials in this field.
Summary of Background Data. In 1997, the Cochrane
Back Review Group published Method Guidelines for Systematic Reviews, which was updated in 2003. Since then,
new methodologic evidence has emerged and standards
have changed. Coupled with the upcoming revisions to
the software and methods required by The Cochrane Collaboration, it was clear that revisions were needed to the
existing guidelines.
Methods. The Cochrane Back Review Group editorial
and advisory boards met in June 2006 to review the relevant new methodologic evidence and determine how it
should be incorporated. Based on the discussion, the
guidelines were revised and circulated for comment. As
sections of the new Cochrane Handbook for Systematic
Reviews of Interventions were made available, the guidelines were checked for consistency. A working draft was
made available to review authors in The Cochrane Library
2008, issue 3.
Results. The final recommendations are divided into 7
categories: objectives, literature search, inclusion criteria,
risk of bias assessment, data extraction, data analysis,
and updating your review. Each recommendation is classified into minimum criteria (mandatory) and further
guidance (optional). Instead of recommending Levels of
Evidence, this update adopts the GRADE approach to determine the overall quality of the evidence for important patient-centered outcomes across studies and includes a new section on updating reviews.
Conclusion. Citations of previous versions of the
method guidelines in published scientific articles (1997:
254 citations; 2003: 209 citations, searched February 10,
2009) suggest that others may find these guidelines useful to plan, conduct, or evaluate systematic reviews in the
field of spinal disorders.
Key words: systematic reviews, meta-analysis, Cochrane Collaboration, method guidelines, back pain, neck
pain. Spine 2009;34:1929–1941
From the *Institute for Work and Health, Toronto, Ontario, Canada;
University of Toronto, Toronto, Ontario, Canada; Toronto Rehabilitation Institute, Toronto, Ontario, Canada; and VU University,
Amsterdam, the Netherlands.
The manuscript submitted does not contain information about medical
device(s)/drug(s).
No funds were received in support of this work. No benefits in any
form have been or will be received from a commercial party related
directly or indirectly to the subject of this manuscript.
Supported by operational funds from The Institute for Work & Health,
Canadian Institutes of Health Research (CIHR), Canadian Agency for
Drugs and Technologies in Health to Cochrane Back Review Group.
These guidelines expand on the methodology outlined in: Bombardier
C, van Tulder MW, Pennick V, Bronfort G, Corbin T, Deyo RA, de Bie
R, Furlan AD, Guillemin F, Malmivaara A, Peul W, Schoene M, Shekelle PG, Tomlinson G. Cochrane Back Group. About The Cochrane
Collaboration (Cochrane Review Groups (CRGs)) 2008, Issue 3. Art.
No.: BACK. Copyright Cochrane Collaboration, reproduced with permission.
The following are the editorial board members of the Cochrane Back
Review Group: Co-editors: Claire Bombardier and Maurits van Tulder; Managing editor: Victoria Pennick; Editors: Gert Bronfort, Rob
de Bie, Terry Corbin, Rick Deyo, Andrea Furlan, Francis Guillemin,
Antti Malmivaara, Wilco Peul, Mark Schoene, Paul Shekelle, George
Tomlinson.
Address correspondence and reprint requests to Andrea D. Furlan,
Institute for Work & Health, 481 University Av, Suite 800, Toronto,
Ontario, Canada; E-mail: afurlan@iwh.on.ca
Method Guidelines
Review Objective. Reviews with the Cochrane Back Review
Group start with a clinically relevant question that is clearly
defined in the objectives. The objectives should outline the intervention and participants. The Editorial Board recommends
that reviews focus specifically on (sub)acute or chronic back or
neck pain. It is also recommended that reviews focus separately
on nonspecific back or neck pain, sciatica or radicular symptoms, or specific causes (e.g., spinal stenosis, scoliosis). In addition, review authors should outline the comparisons that will
be evaluated in the review (Figure 1).
Literature Search
Minimum Criteria. One of the main principles underpinning a
systematic review is to include all available evidence. Therefore, once the research question has been defined, the literature
search is the next, very important step in conducting a systematic review. The starting point for the literature search is to
decide which articles should be retrieved, ensuring that as many
relevant trials as possible are identified. The search strategy
should relate directly to the research question(s) of the review
at issue and should be based on the inclusion criteria with
respect to study design, participants, interventions, and outcomes (see Inclusion Criteria section). Searching only MEDLINE
is clearly insufficient, since it has been shown that, in general,
only approximately half of the available RCTs will be identified
if MEDLINE is the only database searched.8 It has been suggested that at least MEDLINE and EMBASE must be used to
ensure a comprehensive literature search, because overlap between these databases is small.9–11 Especially in the field of low
back pain, EMBASE has been shown to retrieve more clinical
trials than MEDLINE.12
Therefore, we recommend the following as a minimum
search strategy:
1. A computer-aided search of the MEDLINE and EMBASE databases since their inception for new reviews
and since the date of the previous search for updates of
reviews.7,8 The highly sensitive search strategies for retrieval of reports of controlled trials should be run in conjunction with a specific search for spinal disorders and the
intervention at issue (Appendix 1, Supplemental Digital
Content 1, available at: http://links.lww.com/BRS/A373
and Appendix 2, Supplemental Digital Content 2, available
at: http://links.lww.com/BRS/A374). It has been demonstrated that simple search strategies (i.e., strategies with a
few terms) are not adequate for systematic reviews.13
2. A search of the Cochrane Central Register of Controlled
Trials (CENTRAL) that is included in the most recent
issue of The Cochrane Library.
3. A search of the CBRG Trials Register by contacting the
editorial base of the Cochrane Back Review Group.
4. Screening references listed in relevant systematic reviews
and identified RCTs.
The search strategy should not be limited by language.
Unless they have easy access to a health sciences librarian who
is experienced in searching electronic databases, we suggest that
review authors contact the CBRG (Cochrane@iwh.on.ca) for assistance in developing and conducting the literature search. We
recommend that 2 review authors independently apply the inclusion criteria to select the potentially relevant trials from the titles,
abstracts, and keywords of the references retrieved by the literature search. Articles selected in this first round, articles for which
disagreement exists, and articles for which title, abstract, and keywords provide insufficient information for a decision should be
obtained so that the final decision about whether they meet the
inclusion criteria is based on the full paper. A consensus method
should be used to select the potentially relevant trials at both
steps. If disagreements persist, a third review author should be
consulted.
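The first-round selection rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names Vote, needs_full_text, and first_round are hypothetical and are not part of any CBRG tool, and the sketch covers only the decision to obtain the full paper, not the consensus discussion or the consultation of a third review author.

```python
from __future__ import annotations

from enum import Enum


class Vote(Enum):
    """One review author's judgment from title, abstract, and keywords."""
    INCLUDE = "include"
    EXCLUDE = "exclude"
    UNCLEAR = "unclear"  # insufficient information for a decision


def needs_full_text(vote_a: Vote, vote_b: Vote) -> bool:
    """Return True if the full paper should be obtained after the first round."""
    if vote_a is Vote.UNCLEAR or vote_b is Vote.UNCLEAR:
        return True  # title/abstract/keywords were not sufficient
    if vote_a is not vote_b:
        return True  # the two review authors disagree
    return vote_a is Vote.INCLUDE  # both review authors selected the article


def first_round(votes: dict[str, tuple[Vote, Vote]]) -> list[str]:
    """List the reference identifiers whose full text should be retrieved."""
    return [ref for ref, (a, b) in votes.items() if needs_full_text(a, b)]
```

Only references that both review authors clearly exclude are dropped at this stage; every other combination leads to retrieval of the full paper, on which the final decision is based.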
Reviews should be submitted within a year of the latest search
date. Because some reviews can take longer than a year to complete, the CBRG recommends that the authors update the search
Inclusion Criteria
Minimum Criteria
Study Design. RCTs with clearly reported and appropriate
randomization should be included. If the article only reports
that the trial is a randomized trial or that the participants were
randomly allocated to the intervention groups without a clear
description of the method of randomization, the authors
should be contacted for further information. Examples of ap-
Table 1. Taxonomy of Study Design of Studies Assessing the Effects of Health-Care Interventions
Experimental studies with control group (clinical
trials or trials): The investigator has control
over the decision concerning the allocation of
participants to different intervention groups.
Cohort study
Cross-sectional study
Case series
Case reports
Data Extraction
Minimum Criteria. At least 2 review authors should independently extract the data. Data describing study characteristics
that include characteristics of participants, interventions, comparisons, outcomes, analysis, results, and study sponsorship
should be extracted and presented in a table (see inclusion
criteria for full details). Cointerventions and other confounders
should be described in as much detail as possible to enable
accurate comparison.
If one of the review authors is an author or coauthor of one
of the included trials, this person should not be involved in any
decisions regarding the data extraction of the trial at issue.
Further Guidance
The CBRG recommends that authors use a standardized form
for data extraction that will facilitate the comparison process.
It is advisable to pilot test the data extraction form to minimize
misinterpretations or later disagreements. If there are disagreements, consensus should be achieved by discussion among the
review authors. If disagreements persist, an independent person
should be consulted. If the article does not contain sufficient
information, the authors may be contacted.
Data extraction forms will vary across different systematic
reviews, but there will also be similarities among the forms
needed for reviews on back and neck pain. Because designing a
data extraction form is time-consuming, and given the important function of data extraction forms, it may be helpful to
learn from the experiences of others. Examples of data
extraction forms used in other reviews can be obtained from
the CBRG website: www.cochrane.iwh.on.ca.
Data Analysis
Minimum Criteria. Regardless of whether the authors use a
quantitative analysis (meta-analysis) or not, the results from
studies should only be combined when they are judged to be
sufficiently clinically similar to yield meaningful results. This
means review authors should avoid combining studies that are
clinically heterogeneous for populations, interventions, comparisons, or outcomes. A meta-analysis should be conducted
whenever trials measuring a specific outcome at similar follow-up (short-term and/or long-term) report sufficient data to
do so. When a meta-analysis is performed with only a subset of
trials, review authors should assess whether the results of the
studies not reported quantitatively are consistent with the
meta-analysis. The analysis should include an explicit description of the comparisons (Figure 1).
Short-term follow-up refers to outcomes that are measured
closest to 4 weeks after randomization; it could be as short as 7
days in a trial of analgesics and as long as 12 weeks in a trial of
exercise therapy. Intermediate follow-up refers to measures taken
closest to 6 months. Long-term follow-up refers to measures taken
closest to 1 year. Long-term surgical outcomes should be measured at 5 years. Unless otherwise stated, outcomes are assumed to
be measured after the treatment is completed.
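As an illustration of the "closest to" rule for follow-up categories, the following Python sketch picks, for each category, the reported time point nearest to its target. The conversion of 6 months to 26 weeks and 1 year to 52 weeks, and the names TARGET_WEEKS and closest_measurement, are assumptions made for this example only. The sketch does not handle the 5-year target for surgical outcomes, and it does not check whether the chosen time point is clinically plausible for the category.

```python
# Target follow-up in weeks; 26 and 52 are assumed equivalents of
# "6 months" and "1 year" for this sketch.
TARGET_WEEKS = {"short": 4, "intermediate": 26, "long": 52}


def closest_measurement(weeks_measured, category):
    """Return the measured time point (in weeks) closest to the target.

    When two time points are equally close, the one listed first wins.
    """
    if not weeks_measured:
        raise ValueError("at least one measured time point is required")
    target = TARGET_WEEKS[category]
    return min(weeks_measured, key=lambda weeks: abs(weeks - target))


# Example: a trial that measured outcomes at 1, 6, 12, 24, and 52 weeks.
time_points = [1, 6, 12, 24, 52]
selected = {name: closest_measurement(time_points, name) for name in TARGET_WEEKS}
```

In this example the 6-week measurement would serve as short-term, the 24-week measurement as intermediate, and the 52-week measurement as long-term follow-up.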
The Editorial Board refers the reader to Chapter 9 of the
Cochrane Handbook for Systematic Reviews of Interventions7
for further guidance on data analysis.
The primary analysis of the review should only be based on the
results from RCTs (Table 1). If review authors include designs
Table 3. Criteria for a Judgment of Yes for the Sources of Risk of Bias
1. A random (unpredictable) assignment sequence. Examples of adequate methods are coin toss (for studies with 2 groups), rolling a die (for studies with 2 or more groups), drawing of balls of different colors, drawing of ballots with the study group labels from a dark bag, computer-generated random sequence, pre-ordered sealed envelopes, sequentially-ordered vials, telephone call to a central office, and pre-ordered list of treatment assignments. Examples of inadequate methods are: alternation, birth date, social insurance/security number, date in which they are invited to participate in the study, and hospital registration number.
2. Assignment generated by an independent person not responsible for determining the eligibility of the patients. This person has no information about the persons included in the trial and has no influence on the assignment sequence or on the decision about eligibility of the patient.
3. This item should be scored yes if the index and control groups are indistinguishable for the patients or if the success of blinding was tested among the patients and it was successful.
4. This item should be scored yes if the index and control groups are indistinguishable for the care providers or if the success of blinding was tested among the care providers and it was successful.
5. Adequacy of blinding should be assessed for the primary outcomes. This item should be scored yes if the success of blinding was tested among the outcome assessors and it was successful or:
   - for patient-reported outcomes in which the patient is the outcome assessor (e.g., pain, disability): the blinding procedure is adequate for outcome assessors if participant blinding is scored yes
   - for outcome criteria assessed during scheduled visit and that supposes a contact between participants and outcome assessors (e.g., clinical examination): the blinding procedure is adequate if patients are blinded, and the treatment or adverse effects of the treatment cannot be noticed during clinical examination
   - for outcome criteria that do not suppose a contact with participants (e.g., radiography, magnetic resonance imaging): the blinding procedure is adequate if the treatment or adverse effects of the treatment cannot be noticed when assessing the main outcome
   - for outcome criteria that are clinical or therapeutic events that will be determined by the interaction between patients and care providers (e.g., co-interventions, hospitalization length, treatment failure), in which the care provider is the outcome assessor: the blinding procedure is adequate for outcome assessors if item 4 (caregivers) is scored yes
   - for outcome criteria that are assessed from data of the medical forms: the blinding procedure is adequate if the treatment or adverse effects of the treatment cannot be noticed on the extracted data
6. The number of participants who were included in the study but did not complete the observation period or were not included in the analysis must be described and reasons given. If the percentage of withdrawals and drop-outs does not exceed 20% for short-term follow-up and 30% for long-term follow-up and does not lead to substantial bias a yes is scored. (N.B. these percentages are arbitrary, not supported by literature).
7. All randomized patients are reported/analyzed in the group they were allocated to by randomization for the most important moments of effect measurement (minus missing values) irrespective of non-compliance and co-interventions.
8. In order to receive a yes, the review author determines if all the results from all pre-specified outcomes have been adequately reported in the published report of the trial. This information is either obtained by comparing the protocol and the report, or in the absence of the protocol, assessing that the published report includes enough information to make this judgment.
9. In order to receive a yes, groups have to be similar at baseline regarding demographic factors, duration and severity of complaints, percentage of patients with neurological symptoms, and value of main outcome measure(s).
10. This item should be scored yes if there were no co-interventions or they were similar between the index and control groups.
11. The reviewer determines if the compliance with the interventions is acceptable, based on the reported intensity, duration, number and frequency of sessions for both the index intervention and control intervention(s). For example, physiotherapy treatment is usually administered over several sessions; therefore it is necessary to assess how many sessions each patient attended. For single-session interventions (e.g., surgery), this item is irrelevant.
12. Timing of outcome assessment should be identical for all intervention groups and for all important outcome assessments.
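The numerical part of the withdrawal and drop-out criterion in Table 3 can be checked mechanically, as in the hedged Python sketch below. The names LIMIT_PERCENT and dropout_within_limit are hypothetical. The sketch covers only the 20% and 30% thresholds stated in the table; the judgment of whether the losses lead to substantial bias, and whether reasons are given, remains with the review authors. No threshold is stated for intermediate follow-up, so none is assumed here.

```python
# Maximum acceptable percentage of withdrawals and drop-outs, as stated
# in Table 3 (the table itself notes that these values are arbitrary).
LIMIT_PERCENT = {"short": 20, "long": 30}


def dropout_within_limit(randomized: int, analyzed: int, follow_up: str) -> bool:
    """Return True if the drop-out percentage does not exceed the limit.

    Integer arithmetic is used so that a rate of exactly 20% or 30%
    is not misclassified by floating-point rounding.
    """
    if randomized <= 0 or not 0 <= analyzed <= randomized:
        raise ValueError("expected 0 <= analyzed <= randomized and randomized > 0")
    dropped = randomized - analyzed
    return dropped * 100 <= LIMIT_PERCENT[follow_up] * randomized
```

For example, a trial that randomized 100 participants and analyzed 80 at short-term follow-up sits exactly on the 20% limit and passes this numerical check; the item is scored yes only if the review authors also judge that the losses do not introduce substantial bias.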
Further Guidance
Quantitative Analysis. If it is clinically relevant and statistically justified to combine the results, statistical pooling should
be performed that provides an overall estimate of effect, with a
95% confidence interval for each outcome.40,41 The Editorial
Board recommends contacting a statistician before performing
a quantitative analysis. A meta-analysis should start by examining potential publication and other biases with a funnel plot
to explore asymmetry among trial results.42 If asymmetry is
present, potential reasons should be explored. However, funnel
plots may be misleading and should be interpreted cautiously.43
Formal statistical tests also exist, but there is no consensus
regarding the strengths and weaknesses of these tests.44–46
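To make the idea of an overall estimate with a 95% confidence interval concrete, the following Python sketch pools study effects with inverse-variance weights under a fixed-effect model. The choice of a fixed-effect model is an assumption made for illustration, since the guidelines do not prescribe one here, and the function name pool_fixed_effect is hypothetical. Inputs are per-study effect estimates (e.g., mean differences) and their standard errors.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def pool_fixed_effect(effects, std_errors):
    """Inverse-variance fixed-effect pooling.

    Returns (pooled estimate, lower 95% limit, upper 95% limit).
    """
    if not effects or len(effects) != len(std_errors):
        raise ValueError("effects and std_errors must be non-empty and equal in length")
    # Each study is weighted by the inverse of its variance.
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, effects)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled, pooled - Z_95 * pooled_se, pooled + Z_95 * pooled_se
```

A real analysis would normally be run in dedicated meta-analysis software and, as recommended above, in consultation with a statistician; this sketch only shows what the pooled estimate and its interval represent.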
For the meta-analysis of dichotomous outcomes, the relative
risk, risk difference, or odds ratio can be used to summarize the
effect. Empirical evidence from 125 meta-analyses showed that
summary odds ratios and risk differences usually lead to similar
for the heterogeneity, does not explain it, and does not take it
away. Careful analysis of heterogeneity, that is, of study characteristics that might explain differences among the results, is
always important.49 The characteristics of participants, types
of interventions, and the exact outcome values should be
clearly articulated for each group of study results that are combined. Sensitivity analyses should be performed to examine the
impact of variation in risk of bias or individual validity criteria
(refer to the Assessing Risk of Bias section).
Sometimes it may be difficult for review authors to decide
whether it is clinically relevant to combine the results from a
group of studies in a meta-analysis; for example, studies of
participants with different types of treatments, different comparison groups, or different clinical characteristics. There are
no simple answers here, and review authors must be explicit
about their decisions so that others may judge for themselves
whether their choices were clinically sensible.
A related but separate issue concerns statistical homogeneity. A test for the statistical homogeneity of studies may be
performed to evaluate whether the differences among the results of the studies are greater than those that would be found
by chance alone. However, the test is not very powerful, and
failure to reject the hypothesis of homogeneity is not proof that
the studies are homogeneous. If the hypothesis of homogeneity
is rejected, or if the review team decides, on clinical grounds,
that the studies are too heterogeneous to support statistical
combinations, then the potential sources of heterogeneity
should be examined, because the observed differences might be
caused by factors other than chance, such as different risks of
bias, characteristics of participants, interventions, control
groups, or outcomes. If the heterogeneity can be explained,
review authors should present the results of each relevant subgroup separately. Subgroup analyses should be kept to a minimum
and should be defined a priori, because subgroup analyses can be informative but also misleading.50
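One common form of the test for statistical homogeneity is Cochran's Q, sketched below in Python together with the I² statistic. The guidelines do not name a specific test, so both statistics are assumptions chosen for illustration, and the function names are hypothetical. Q is referred to a chi-square distribution with k − 1 degrees of freedom (k being the number of studies); the p-value is omitted here to keep the example within the standard library.

```python
def cochran_q(effects, std_errors):
    """Return Cochran's Q and its degrees of freedom (k - 1)."""
    if len(effects) < 2 or len(effects) != len(std_errors):
        raise ValueError("need at least 2 studies and matching input lengths")
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    # Weighted sum of squared deviations from the pooled estimate.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    return q, len(effects) - 1


def i_squared(q, df):
    """Percentage of variability beyond what chance alone would explain."""
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0
```

As the text cautions, a small Q (or a nonsignificant test) does not prove that the studies are homogeneous, because the test has low power when few trials are available.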
Readers are referred to Chapters 9 and 10 in the Cochrane
Handbook for Systematic Reviews of Interventions7 for more
details on data analysis.
Clinical Relevance
Further Guidance. The CBRG recommends including an assessment of clinical relevance of study results in systematic
reviews. The conclusions about the effectiveness of the intervention should contain all the important information needed to
enable users to make a decision about the applicability of the
results to their population. The clinical relevance of the studies
should be independently assessed by at least 2 review authors.
In the 2003 Updated Method Guidelines, the Editorial
Board recommended 5 questions to assess the clinical relevance
of each included study.56,57 In 2006, Malmivaara et al, in consultation with the Editorial Board, reviewed the set of 5 questions and articulated the details in the evaluation of applicability and clinical relevance of results of RCTs. The final
consensus consisted of 40 items. For the most part, these items
are characteristics of the population, interventions, comparisons, analysis, and results that review authors are advised to
extract from the studies. These details should be used to answer
the 5 questions (Table 4). For more details and examples on
how to assess each item, review authors are encouraged to read
the original study by Malmivaara et al.58 There is ongoing
research examining how to determine important clinical differences in pain reduction and functional improvement. At
present, there is consensus regarding minimal clinically important changes for pain and function in back pain.59 Authors are advised to consult the literature that also includes key references on neck pain59–64 and include both statistical and clinical importance in their discussion (Table 4).59–64
Notes to Table 4:
*For low-back pain, consider 30% on VAS/NRS for pain as clinically significant,59,62 and 2 to 3 points (or 8 to 12%) on the Roland-Morris Disability Questionnaire for function.59,60
*For neck pain, consider 3.5 to 5 U on the 50-U Neck Pain Disability Index or 7 to 10% change63,64 for function and 2.5 on a 10-U NRS (25% change) for pain.63
*For effect size, most authors use Cohen's 3 levels.61
Small: WMD less than 10% of the scale (e.g., 10 mm on a 100 mm VAS); SMD or d scores less than 0.5; relative risk less than 1.25 or greater than 0.8 (depending on whether it reports risk of benefit or risk of harm).
Medium: WMD 10 to 20% of the scale; SMD or d scores from 0.5 to 0.8; relative risk between 1.25 and 2.0, or 0.5 and 0.8.
Large: WMD greater than 20% of the scale; SMD or d scores greater than 0.8; relative risk greater than 2.0 or less than 0.5.
VAS indicates Visual Analog Scale; NRS, Numerical Rating Scale; SMD, standardized mean difference; WMD, weighted mean difference.
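The small, medium, and large effect-size levels noted for Table 4 can be expressed as a simple classification, sketched here in Python. The function names are hypothetical, and the handling of values that fall exactly on a boundary (0.5 and 0.8 for SMD; 10% and 20% of the scale for WMD) is an assumption: boundary values are assigned to the medium level.

```python
def classify_smd(smd: float) -> str:
    """Classify a standardized mean difference (or d score) by magnitude."""
    size = abs(smd)  # the direction of the effect does not affect its level
    if size < 0.5:
        return "small"
    if size <= 0.8:
        return "medium"
    return "large"


def classify_wmd(wmd: float, scale_range: float) -> str:
    """Classify a weighted mean difference as a percentage of the scale."""
    if scale_range <= 0:
        raise ValueError("scale_range must be positive")
    percent = abs(wmd) / scale_range * 100.0
    if percent < 10:
        return "small"
    if percent <= 20:
        return "medium"
    return "large"
```

For instance, a difference of 15 mm on a 100 mm VAS falls in the medium level under this sketch. Such a label describes only the magnitude of the effect; it does not replace the comparison against the minimal clinically important changes cited above.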
The answers to these questions should be used to inform the
discussion of the final results and conclusions; for example, in
the discussion section, clinical relevance could be included as
follows: "There was high-quality evidence from 10 RCTs (2000
participants) that intervention A is more effective than no treatment for reducing pain in the long term for individuals with
chronic low back pain. However, since none of the trials described the program in detail, it is difficult to determine how to
provide this treatment to your patients and which types of
exercise healthcare providers should provide to patients" (this
example is not based on real data).
Conclusion
Minimum Criteria
Results should be listed in the same order as the comparisons and outcomes were set out in the protocol. To improve
References
1. Moher D, Cook DJ, Eastwood S, et al. Improving the quality of reports of
meta-analyses of randomised controlled trials: the QUOROM statement.
Quality of Reporting of Meta-analyses. Lancet 1999;354:1896–900.
2. Assendelft WJ, Koes BW, Knipschild PG, et al. The relationship between
methodological quality and conclusions in reviews of spinal manipulation.
JAMA 1995;274:1942–8.
3. Furlan AD, Clarke J, Esmail R, et al. A critical review of reviews on the
treatment of chronic low back pain. Spine 2001;26:E155–62.
4. Hoving JL, Gross AR, Gasner D, et al. A critical appraisal of review articles
on the effectiveness of conservative treatment for neck pain. Spine 2001;26:
196–205.
5. van Tulder MW, Assendelft WJ, Koes BW, et al. Method guidelines for
systematic reviews in the Cochrane collaboration back review group for
spinal disorders. Spine 1997;22:2323–30.
6. van Tulder M, Furlan A, Bombardier C, et al. Updated method guidelines for
systematic reviews in the Cochrane collaboration back review group. Spine
2003;28:1290–9.
7. Higgins J, Green S, eds. Cochrane Handbook for Systematic Reviews of
Interventions Version 5.0.0 [updated February 2008]. The Cochrane Collaboration; 2008.
8. Glanville JM, Lefebvre C, Miles JN, et al. How to identify randomized
controlled trials in MEDLINE: ten years on. J Med Libr Assoc 2006;94:
130–6.
9. Minozzi S, Pistotti V, Forni M. Searching for rehabilitation articles on MEDLINE and EMBASE: an example with cross-over design. Arch Phys Med
Rehabil 2000;81:720–2.
10. Sampson M, Barrowman NJ, Moher D, et al. Should meta-analysts search
Embase in addition to Medline? J Clin Epidemiol 2003;56:943–55.