
LITERATURE REVIEW

MANUAL EXAMINATION OF THE SPINE: A SYSTEMATIC CRITICAL LITERATURE REVIEW OF REPRODUCIBILITY


Mette Jensen Stochkendahl, DC,a Henrik Wulff Christensen, DC, MD, PhD,b Jan Hartvigsen, DC, PhD,c Werner Vach, PhD,d Mitchell Haas, DC, MA,e Lise Hestbaek, DC, PhD,f Alan Adams, DC, MS, MSEd,g and Gert Bronfort, DC, PhD h

ABSTRACT
Objective: Poor reproducibility of spinal palpation has been reported in previously published literature, and authors of recent reviews have criticized study quality. This article critically analyzes the literature pertaining to the inter- and intraobserver reproducibility of spinal palpation to investigate the consistency of study results and assess the level of evidence for reproducibility. Methods: A systematic review and meta-analysis were performed on relevant literature published from 1965 to 2005, identified using the electronic databases MEDLINE, MANTIS, and CINAHL and by checking reference lists. Descriptive data from included articles were extracted independently by 2 reviewers. A 6-point scale was constructed to assess the methodological quality of original studies. A meta-analysis was conducted among the high-quality studies to investigate the consistency of data, separately on motion palpation, static palpation, osseous pain, soft tissue pain, soft tissue changes, and global assessment. A standardized method was used to determine the level of evidence. Results: The quality score of 48 included studies ranged from 0% to 100%. There was strong evidence that the interobserver reproducibility of osseous and soft tissue pain is clinically acceptable (κ ≥ 0.4) and that intraobserver reproducibility of soft tissue pain and global assessment is clinically acceptable. Other spinal procedures are either not reproducible or the evidence is conflicting or preliminary. (J Manipulative Physiol Ther 2006;29:475-485) Key Indexing Terms: Reproducibility of Results; Palpation; Literature Review; Diagnostic Tests; Spine; Meta-Analysis

Biomechanical dysfunction is thought to be an important contributor to spinal pain, and manual palpation is a widely used procedure for the diagnosis of such dysfunctions among providers of manual medicine.1-3 Contrary to the expectations of many clinicians, unacceptable levels of reproducibility have been

a Research Fellow, Nordic Institute of Chiropractic and Clinical Biomechanics, Part of Clinical Locomotion Science, Odense, Denmark. b Senior Researcher, Nordic Institute of Chiropractic and Clinical Biomechanics, Part of Clinical Locomotion Science, Odense, Denmark. c Senior Researcher, Nordic Institute of Chiropractic and Clinical Biomechanics, Part of Clinical Locomotion Science, Odense, Denmark; and Associate Professor, Institute of Sports Science and Clinical Biomechanics, Part of Clinical Locomotion Science, University of Southern Denmark, Denmark. d Professor, The Department of Statistics, University of Southern Denmark, Denmark. e Professor, Center for Outcomes Studies, Western States Chiropractic College, Portland, Ore. f Senior Researcher, The Back Research Center, Backcenter Funen; and Part of Clinical Locomotion Science, University of Southern Denmark, Denmark.

g Professor, Texas Chiropractic College, Pasadena, Tex. h Professor, Department of Research, Wolfe-Harris Center for Clinical Studies, Northwestern Health Sciences University, Bloomington, Minn. This study was funded by the Nordic Institute of Chiropractic and Clinical Biomechanics, Odense, Denmark and the Foundation for Chiropractic Education and Research, grant no. 03-09-01. Submit requests for reprints to: Mette Jensen Stochkendahl, DC, Nordic Institute of Chiropractic and Clinical Biomechanics, Research Department, Klosterbakken 20, DK-5000 Odense C, Denmark (e-mail: m.jensen@nikkb.dk). Paper submitted September 15, 2005; in revised form February 2, 2006. 0161-4754/$32.00 Copyright © 2006 by National University of Health Sciences. doi:10.1016/j.jmpt.2006.06.011


Stochkendahl et al Spinal Palpation: A Systematic Review

Journal of Manipulative and Physiological Therapeutics July/August 2006

shown in the majority of the previously published literature, and authors of newer reviews have questioned the utility of manual examination procedures in spinal diagnosis altogether.4-7 Severe criticism has been directed at the design of the original studies, including the use of asymptomatic subjects,4,5 inexperienced observers,5 parallel testing,4 unclear definitions of positive findings and rating scales,4,6 weak description of study results,4,5,7 and the need for improvement in overall study quality.4,7 Furthermore, the dependence of Cohen's κ (the most widely used statistical method in studies on reproducibility) on the prevalence of positive findings and on the composition of the study population has been the subject of discussion.8,9 Unfortunately, these reviews themselves have important limitations. For instance, some deal with only a minority of manual examination procedures, such as chiropractic procedures only,4 1 spinal region,4,6,10 or motion palpation only.5 In only 3 reviews was a predefined quality system applied to assess study quality,4,6,7 and in none of the reviews were the number of studies, the methodological quality, and the consistency of the outcomes all considered, as recommended by van Tulder and others.11-13 Finally, in none of these reviews was the impact of the predefined criteria on the conclusions tested. Therefore, the value of palpation as a diagnostic tool is, at present, still unknown, and so is the ability of practitioners of manual therapy to reliably diagnose spinal dysfunctions using palpation. We therefore decided that another systematic review taking into account the above issues was warranted. Furthermore, a meta-analysis including comparable studies of adequate methodological standard and an assessment of the consistency of study outcomes would be highly useful.
The purpose of this paper is therefore to systematically review and critically assess the design and statistical methodology of the literature pertaining to reproducibility of spinal palpation adopting standardized criteria for judging diagnostic studies. A meta-analysis was conducted to evaluate consistency of study outcomes. Finally, the level of evidence for the reproducibility of spinal palpation was determined.

METHODS
Definitions
Palpation was defined according to Bergmann and Petersen,1 and results of the original articles were analyzed according to the palpation procedure, using the following annotations: motion palpation (MP), static palpation (SP) (palpation for alignment and/or structure), osseous pain (OP) (pain generated from palpation of osseous structures), soft tissue pain (STP), soft tissue changes (STC), and global assessment (GA) (the latter was introduced to describe the use of 2 or more of the above procedures to make 1 single judgement on the presence/absence of mechanical dysfunction). Each palpation procedure could be applied under 5 conditions (standing, sitting, prone, supine, or side lying) and at different segmental levels. Consequently, a palpation procedure applied under a specific condition at 1 or more segmental levels is denoted a test. A paper could consider a single test or several tests and only 1 palpation procedure or several palpation procedures. Reproducibility refers to the ability of a single observer to find the same result using the same diagnostic procedure in the same patient at 2 separate moments in time (intraobserver agreement) and/or the ability of 2 observers to find the same result of a given diagnostic procedure in a patient (interobserver agreement).14

Fig 1. Inclusion and exclusion criteria.

Study Selection
Studies were identified by a comprehensive search of the MANTIS (1966-2005), CINAHL (1982-2005), and MEDLINE (1965-2005) databases using the index terms reproducibility, reliability, or observer variation in combination with palpation, motion palpation, physical examination procedures, or spine in text and abstracts. Bibliographies of retrieved documents were checked for any additional studies. The principal investigator (MJS) screened the documents retrieved from this search twice to determine eligibility according to inclusion and exclusion criteria, as listed in Figure 1.

Data Extraction
Using a checklist, data from included documents were extracted and recorded independently by 2 of the authors (MJS and HWC). Completed checklists were then compared, and discordances were resolved by discussion until consensus was reached. If consensus could not be reached, a third investigator (JH) was available to mediate.

Journal of Manipulative and Physiological Therapeutics Volume 29, Number 6


Table 1. Basic characteristics of the articles selected for the systematic review

No. of articles
Region                Inter (n = 48)   Intra (n = 19)
Cervical              16               3
Thoracic              5                2
Lumbar                19               8
SI joints             8                6

No. of tests considered
Palpation procedure   Inter (n = 58)   Intra (n = 26)
MP                    28               15
SP                    3                0
OP                    6                1
STP                   11               5
STC                   3                0
GA                    7                5

Fig 2. Operational definitions of the quality criteria.

Assessment of Methodological Quality of Trials


No standardized and validated method for assessing the quality of reproducibility studies exists. Therefore, a 6-point scale was constructed based on recognized requirements for clinical trials of reproducibility and standard recommendations for systematic reviews of test accuracy.12,15,16 The operational definitions of the quality criteria are described in Figure 2. A study was considered high-quality if the methodological quality score, expressed as a percentage of the maximum score, was 50% or higher and low-quality if the score was less than 50%. The quality score reflects the relevance and appropriateness of 3 separate dimensions that may affect interpretation of results: study population, study design, and statistical analysis. The quality scoring of the trials was performed independently by 2 reviewers (MJS and HWC). Differences in scores were resolved through consensus by the 2 reviewers. The quality scores of the individual trials were used as part of the evidence determination.
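The quality classification described above can be sketched in a few lines of Python. This is only an illustration: the constant `MAX_POINTS`, the function names, and the assumption that the 6-point scale awards whole points from 0 to 6 are ours, not the review's; the actual criteria are those defined in Figure 2.

```python
MAX_POINTS = 6           # assumed: raw scores run from 0 to 6 on the 6-point scale
HIGH_QUALITY_CUT = 50.0  # percent of the maximum score, as stated in the text


def quality_percent(points: int, max_points: int = MAX_POINTS) -> float:
    """Express a raw quality score as a percentage of the maximum score."""
    if not 0 <= points <= max_points:
        raise ValueError("points must lie between 0 and max_points")
    return 100.0 * points / max_points


def quality_label(points: int, cut: float = HIGH_QUALITY_CUT) -> str:
    """Return 'high' for scores at or above the cut point, otherwise 'low'."""
    return "high" if quality_percent(points) >= cut else "low"


if __name__ == "__main__":
    # Print the label for every possible raw score.
    for pts in range(MAX_POINTS + 1):
        print(pts, round(quality_percent(pts)), quality_label(pts))
```

Keeping the cut point as a parameter makes it straightforward to repeat the classification with other cut points.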

Fig 3. Flow chart of study inclusion in the meta-analysis of interobserver reproducibility studies.

Meta-Analysis
To assess the consistency of study outcomes in articles included in the systematic review, a meta-analysis was conducted. Not eligible for inclusion in the meta-analysis were (1) low-quality studies (<50%), (2) studies not using a binary classification of the test outcome, (3) studies not reporting any results at all, (4) studies using a binary outcome but not reporting κ values, and (5) studies not reporting an adequate description of the palpation procedure. When possible, single results from included studies (κ and confidence intervals [CI]) were drawn directly from the original articles. If CIs were not reported in the original studies, CIs were calculated according to Altman17 if the necessary information (prevalence and sample size) was available. Results for individual segmental levels not in sequence were included separately in the analysis. In case of multiple reproducibility results reported for several pairs of


Fig 4. Meta-analysis: intraobserver reproducibility.

observers or several spinal segments in sequence, we took the average of the reported κ values and computed a CI, again by applying the Altman formula with the original sample size. This is a conservative approach that ignores a possible gain in precision from taking the average. We displayed all available original results in a forest plot. No formal modeling and analysis of heterogeneity was performed because (1) information on the precision of the single results was not available in all studies, (2) we partially used a conservative assessment in the single studies, and (3) multiple results within a study cannot be regarded as independent. Overall κ values were computed by first taking the mean κ value within each study and then averaging these mean κ values. Confidence intervals for the overall κ values are based on the empirical variation of the mean κ values and were only computed if at least 4 studies contributed a mean κ value. In a secondary analysis, the association between several study characteristics and the mean κ value of the study was tested by an analysis of covariance, including the type

Fig 5. Meta-analysis: interobserver reproducibility.


Table 2. Results of studies using ICC or κw, and of low-quality studies, included in the level of evidence of interobserver reproducibility

Palpation procedure   Results: κw or ICC                                Results: low quality
MP                    ICC: 0.09-0.25 (47), 0.4-0.73 (49)                κ: 0.05 (37), 0.01 (56), 0.17-0.17 (57)
                      κw: 0.16-0.49 (21), 0.42-0.75 (40)
OP and STP            ICC: OP 0.27-0.85 (49), OP 0.22-0.80 (20)         κ: OP 0.00-1.0 (36), STP 0.35-0.87 (36)
                      κw: OP 0.47-0.52 (25), STP 0.24-0.56 (25)
STC                   -                                                 κ: 0.07 (33)
SP                    -                                                 κ: 0.14-0.37 (36)

κw represents weighted κ. Numbers in parentheses are in-text reference numbers. -, none reported.

Fig 6. Flow chart of study inclusion in the assessment of level of evidence of interobserver reproducibility studies.

of palpation, separately for the intra- and interobserver results. The study characteristics were as follows: publication year, definition of positive findings, segmental region, standardization (ie, agreement on procedure, written instructions, and training sessions), application condition, occupation, experience, symptomatic status of test population, and multiple tests.
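The pooling steps described in the Meta-Analysis section can be illustrated with a short Python sketch. Several points are assumptions on our part rather than details stated in the text: the standard error uses the simple large-sample formula SE = sqrt(p_o(1 - p_o) / (n(1 - p_e)^2)); both observers are assumed to share the same prevalence, so that expected agreement is p_e = p^2 + (1 - p)^2; and the overall interval uses a normal quantile of 1.96. The function names and the example data are hypothetical.

```python
import math
from statistics import mean, stdev

Z95 = 1.96  # assumed normal quantile for a 95% interval


def kappa_ci(kappa, prevalence, n, z=Z95):
    """Approximate CI for kappa from kappa, prevalence, and sample size.

    Assumes both observers have the same prevalence of positive findings.
    """
    p_e = prevalence ** 2 + (1 - prevalence) ** 2  # expected agreement
    if p_e >= 1.0:
        raise ValueError("kappa is undefined when prevalence is 0 or 1")
    p_o = kappa * (1 - p_e) + p_e                  # observed agreement implied by kappa
    se = math.sqrt(p_o * (1 - p_o) / (n * (1 - p_e) ** 2))
    return kappa - z * se, kappa + z * se


def overall_kappa(results_by_study, min_studies=4, z=Z95):
    """Average kappa within each study, then across studies.

    A CI based on the variation of the study means is returned only when at
    least `min_studies` studies contribute; otherwise the CI is None.
    """
    study_means = [mean(values) for values in results_by_study.values()]
    overall = mean(study_means)
    if len(study_means) < min_studies:
        return overall, None
    se = stdev(study_means) / math.sqrt(len(study_means))
    return overall, (overall - z * se, overall + z * se)


if __name__ == "__main__":
    # Hypothetical kappa values, several results per study.
    example = {"A": [0.2, 0.4], "B": [0.5], "C": [0.3, 0.5], "D": [0.6]}
    print(kappa_ci(0.5, 0.5, 100))
    print(overall_kappa(example))
```

Averaging within a study first keeps a study that reports many results from dominating the overall value, which matches the reasoning that multiple results within a study cannot be regarded as independent.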

Assessment of the Level of Evidence


Criteria for determining the level of evidence for reproducibility of spinal palpation were adapted from the Agency for Health Care Policy and Research's guidelines for acute low back pain.18 This method has been used to assess the level of evidence of risk factors for low back pain in systematic reviews of epidemiological studies.13,19 The method takes into account all available included studies which describe a palpation procedure, report results, and use a valid statistical method (κ, κw, or intraclass correlation coefficient [ICC]).8 The system evaluates the evidence by taking into account (1) the number of studies, (2) the methodological quality expressed by quality scores, and (3) the consistency of the study outcomes. Consistency was checked by visual inspection of the forest plots. The rating system was applied to each palpation procedure. Five categories were used to describe evidence levels:
- Strong evidence: provided by generally consistent findings in multiple (≥2) high-quality studies
- Moderate evidence: provided by generally consistent findings in 1 high-quality study and 1 or more low-quality studies or in multiple (≥2) low-quality studies
- Preliminary evidence: only 1 study available
- Conflicting evidence: inconsistent findings in multiple (≥2) studies
- No evidence: no studies were identified

The level of acceptable reproducibility has traditionally, and somewhat arbitrarily, been set at κ > 0.4 in studies of manual medicine,8,20-25 and thus, a κ value above 0.4 was considered clinically acceptable reproducibility in this review. Levels of clinically acceptable reproducibility expressed in κw or ICC were arbitrarily chosen at 0.4 and 0.8, respectively.
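For readers less familiar with κ, the following Python sketch computes Cohen's κ from a 4-fold table and compares it with the acceptability cut point. The function names and the example counts are hypothetical, and the comparison uses κ ≥ 0.4, the form given in the abstract. The two example tables share the same observed agreement (90%) but differ in the prevalence of positive findings, which illustrates the prevalence dependence of κ mentioned in the introduction.

```python
def cohen_kappa(a: int, b: int, c: int, d: int) -> float:
    """Cohen's kappa for a 4-fold table of two observers.

    a: both observers positive     b: only observer 1 positive
    c: only observer 2 positive    d: both observers negative
    """
    n = a + b + c + d
    if n == 0:
        raise ValueError("empty table")
    p_o = (a + d) / n                                        # observed agreement
    p_e = ((a + b) * (a + c) + (c + d) * (b + d)) / n ** 2   # chance agreement
    if p_e == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_o - p_e) / (1 - p_e)


def is_acceptable(kappa: float, cut: float = 0.4) -> bool:
    """Clinically acceptable reproducibility under the chosen cut point."""
    return kappa >= cut


if __name__ == "__main__":
    balanced = cohen_kappa(40, 5, 5, 50)  # positives found in about 45% of subjects
    rare = cohen_kappa(2, 5, 5, 88)       # positives found in about 7% of subjects
    print(round(balanced, 3), is_acceptable(balanced))
    print(round(rare, 3), is_acceptable(rare))
```

Both tables have 90 agreements out of 100 subjects, yet only the first reaches the cut point, because chance agreement is much higher when positive findings are rare.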

Sensitivity Analysis
To test the robustness of the assumptions behind the weighting of the evidence, the prespecified cut points for adequate methodological quality (50%) and minimal clinically acceptable reproducibility (κ ≥ 0.4) were subjected to increases and decreases of the cut points of ±25% in the quality score and ±0.1 in reproducibility.
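A rough Python sketch of how such a sensitivity analysis could be organized is given below. It encodes the five evidence categories as a function and re-grades the same set of studies under each combination of cut points. Two simplifications are our own assumptions: each study is reduced to a single (quality score, κ) pair, and findings count as consistent when all κ values fall on the same side of the acceptability cut point, whereas the review judged consistency by visual inspection of forest plots. All names and the example data are hypothetical.

```python
from typing import Dict, List, Tuple

Study = Tuple[float, float]  # (quality score in percent, kappa)

QUALITY_CUTS = (25.0, 50.0, 75.0)  # 50% with decreases and increases of 25%
KAPPA_CUTS = (0.3, 0.4, 0.5)       # 0.4 with decreases and increases of 0.1


def evidence_level(studies: List[Study],
                   quality_cut: float = 50.0,
                   kappa_cut: float = 0.4) -> str:
    """Grade the evidence for one palpation procedure."""
    if not studies:
        return "no evidence"
    if len(studies) == 1:
        return "preliminary"
    # Assumed consistency rule: all kappas on the same side of the cut point.
    acceptable = [kappa >= kappa_cut for _, kappa in studies]
    if any(acceptable) and not all(acceptable):
        return "conflicting"
    n_high = sum(quality >= quality_cut for quality, _ in studies)
    return "strong" if n_high >= 2 else "moderate"


def sensitivity(studies: List[Study]) -> Dict[Tuple[float, float], str]:
    """Re-grade the evidence under every combination of cut points."""
    return {(q, k): evidence_level(studies, q, k)
            for q in QUALITY_CUTS for k in KAPPA_CUTS}


if __name__ == "__main__":
    example = [(83.0, 0.55), (67.0, 0.62), (33.0, 0.45)]  # hypothetical studies
    for cuts, level in sensitivity(example).items():
        print(cuts, level)
```

With these hypothetical studies the grade is strong at the prespecified cut points, drops to moderate when the quality cut point is raised to 75%, and becomes conflicting when the κ cut point is raised to 0.5.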

RESULTS
Results of the Literature Search
More than 900 publications were retrieved, and 48 original articles published between 1980 and 2005 were included according to the inclusion criteria.20-67 In all 48 studies, interobserver reproducibility was reported, and in 19 studies, intraobserver reproducibility was also reported (Appendices A and B, available online at www.mosby.com/jmpt). All predefined categories of palpation, spinal segments, and application conditions were evaluated. In 25 articles, a single test was evaluated, and in 22 articles, multiple tests (parallel testing) were assessed. Classification of the palpation procedure was not possible in 1 study due to insufficient description.63 Altogether, 58 tests were considered for


Table 3. Articles included in the meta-analysis and the assessment of level of evidence in categories of palpation procedures

Columns: (A) total number of articles in systematic review (n = HQ/LQ); (B) No. of HQ articles eligible for meta-analysis; (C) No. of test results used in the meta-analysis; (D) No. of articles eligible for level of evidence (n = HQ/LQ); (E) conflicting evidence; (F) average κ value from the meta-analysis (95% CI)*; (G) level of evidence.

Interobserver
Procedure   A (30/18)   B (n = 22)   C (n = 57)   D (25/6)   E     F                  G
OP          8/2         5            5            8/1        No    0.53 (0.32-0.74)   Strong
STP         8/2         7            11           8/1        No    0.42 (0.29-0.55)   Strong
MP          22/14       16           27           20/3       No    0.17 (0.10-0.24)   Strong
STC         5/2         3            3            3/1        No    0.03               Strong
SP          4/1         3            3            3/1        Yes   -                  Conf
GA          4/1         4            7            4/0        Yes   -                  Conf

Intraobserver
Procedure   A (8/11)    B (n = 8)    C (n = 26)   D (11/3)   E     F                  G
OP          1/1         1            1            1/0        -     0.91               Pre
STP         2/1         2            5            2/0        No    0.65               Strong
MP          7/8         6            15           6/2        No    0.35 (0.13-0.58)   Strong
STC         0/0         0            0            0          -     -                  No
SP          0/0         0            0            0          -     -                  No
GA          2/1         2            5            2/1        No    0.44               Strong

HQ, High-quality; LQ, low-quality; Pre, preliminary; Conf, conflicting; -, no entry. * Calculated if 4 or more results were available.

interobserver reproducibility and 26 tests for intraobserver reproducibility (Table 1). Motion palpation was the most frequently investigated palpation procedure, followed by studies of palpation for pain.

Methodological Quality
The methodological quality of the studies ranged from 0% to 100% (Appendices C and D, available online at www.mosby.com/jmpt). Overall, 30 studies (63%) were of high quality; however, only 8 of 19 studies (42%) investigating intraobserver reproducibility were high-quality. The proportion of high-quality studies was higher among articles investigating the cervical and thoracic spine than among articles investigating the lumbar spine and the sacroiliac (SI) joints (67% vs 59%). A trend for increasing quality was seen for more recent articles. The average quality score increased from 27% in articles published before 1988, to 48% in articles published between 1988 and 1995, and to 54% in articles published after 1996.

Meta-Analysis
Of 48 original studies addressing interobserver reproducibility, 22 were considered both high-quality and eligible for inclusion in the meta-analysis according to the predetermined criteria. Twenty-six articles were not included (Fig 3). Figures 4 and 5 give an overview of the single results available for the meta-analysis. Eight original studies addressing intraobserver reproducibility were included in the meta-analysis (Fig 4). Eleven studies were not eligible. Ten studies were low-quality,34,37,48,53,60,61,63-66 and 1 paper did not use a binary classification of the test outcome.55 Results were only available for 4 procedures (STP, OP, MP, and GA). Within each procedure, results seem to be comparable and point to midrange to high-range κ values, except for the study of Meijne et al.39

With respect to interobserver reproducibility, most of the results for STP indicate midrange reproducibility (Fig 5). The exception is the result from Boline,58 which showed low-range reproducibility; however, the κ estimate was very imprecise here (wide CI). For STC, the results suggest low-range reproducibility, whereas SP shows inconsistent results. Results of OP all suggest mid- to high-range κ values. Most of the results for MP suggest low reproducibility. κ values were inconsistent for GA but had wide, overlapping confidence intervals. We found no significant effect of year of publication, segmental region, standardization of procedures, observer profession or experience, symptomatic status of test population, or number of tests performed on the κ values (data not shown). Thus, our investigation showed that most study characteristics had little influence on the study results. A notable exception was seen when comparing the application conditions, where sitting palpation was associated with slightly smaller κ values and standing palpation was associated with distinctly smaller κ values. These differences were significant (P = .042) for the interobserver studies, and the same tendency could be seen in the intraobserver studies (nonsignificant). In the intraobserver analysis, we also observed a tendency toward lower mean κ values in studies without parallel testing (κ = 0.23) than in studies with parallel testing (κ = 0.61) (nonsignificant).

Evidence of Reproducibility
Thirty-one articles were available for the assessment of level of evidence, including 6 studies not reporting a binary outcome (Fig 6).20,21,25,40,47,49 Results from the 6 studies using weighted κ or ICC were not directly comparable to the studies using κ, but all 6 studies showed results with similar trends of low interobserver agreement on MP and higher interobserver agreement on evaluation of pain (Table 2). Similarly, we also included 5 low-quality studies, which showed similar trends (Table 2).33,36,37,56,57 Taking all 31 studies together, strong evidence of clinically acceptable intraobserver reproducibility (κ ≥ 0.4) was found for STP and GA (Table 3). Strong evidence for clinically acceptable interobserver reproducibility was found for OP and STP according to the predefined criteria for assessment of levels of evidence. Strong evidence of clinically unacceptable reproducibility was found for intraobserver MP and interobserver MP and STC. Conflicting evidence was found for interobserver reproducibility of SP and GA. Preliminary evidence of clinically acceptable reproducibility was found for intraobserver OP, and no evidence was found for intraobserver SP and STC.

Sensitivity Analysis
In the meta-analysis, only high-quality studies were included. If low-quality studies reporting binary outcomes and κ values or high-quality studies using κw or ICC had been included, the results would have been unaffected (data not shown). Raising the cut point for adequate methodological quality from 50% to 75%, or any amount of decrease in the cut point, did not affect the weight of the evidence or the overall conclusions, except for intraobserver MP and intraobserver GA, where an increase to 75% would result in conflicting evidence derived from only 2 studies for intraobserver MP and moderate evidence for clinically acceptable intraobserver GA. Raising the cut point for clinical acceptability has an obvious impact, with results for pain being most robust due to high overall κ values.

DISCUSSION
Summary of Results
After reviewing studies dealing with reproducibility of manual palpation of the entire spine, including the SI joints, we found strong evidence for clinically acceptable reproducibility both within and between observers for palpation of osseous and soft tissue pain and within the same observer for GA. Strong evidence for clinically unacceptable levels of reproducibility for intra- and interobserver MP and STC was found. Intraobserver reproducibility was consistently higher than interobserver reproducibility, and reproducibility of palpation for pain response was consistently higher than reproducibility of palpation for motion. The most recent and comprehensive review evaluating the reproducibility of spinal palpation by Seffinger et al7 applied different inclusion and general review criteria, and thus, only 27 of 44 articles and 9 of 19 high-quality articles included in this review were evaluated. Furthermore, we included several more recent publications and articles dealing with the SI joints and GA, and we evaluated single results from multiple test regimens. Our conclusions are based on predefined criteria and an evaluation of consistency of high-quality studies, a method not previously applied, whereas the conclusions by Seffinger et al7 were based on both high- and low-quality studies without an evaluation of consistency. The authors concluded that pain provocation tests are most reliable, and soft tissue paraspinal palpatory diagnostic tests are not reliable. Among the 12 highest-quality articles, pain provocation, motion, and landmark location tests were reliable within the same observer, but not always among observers under similar conditions. Overall, examiner discipline, experience level, consensus on procedures used, training, or the use of symptomatic subjects did not improve reliability. This is in agreement with our findings. Furthermore, we conclude that palpation of pain is reproducible both within and among observers, whereas MP may be reproducible within the same observer.

Methodological and Clinical Considerations
The experimental design of reproducibility studies has been criticized in previous reviews,4-7,68-71 and we found that 26 of 48 articles were of low methodological quality, used invalid statistical methods, or reported palpation procedures or test results insufficiently. Comparability of the studies included in a review is an important requirement to ensure valid generalizations. We ensured comparability with respect to the palpation procedures used, but the studies were rather heterogeneous with respect to characteristics such as definition of positive findings, segmental region, standardization, occupation, experience, symptomatic status of test population, and parallel testing. However, our investigation showed that most study characteristics had little influence on the study results, with the exception of the application condition. In particular, standing palpation was associated with very low κ values. Among the reviewed studies, standing palpation is used solely in the "Gillet test" of SI biomechanical dysfunction, and only 2 studies reporting this condition were included in our analysis.39,59 However, both contributed to the evaluation of the inter- and intraobserver agreement of MP. If we remove these 2 studies, then the average κ for the interobserver agreement increases to 0.19 (0.13-0.26), and the intraobserver agreement increases to 0.44 (0.14-0.73), such that the intraobserver agreement of MP can be regarded as acceptable.
Poor reproducibility of MP may reflect the design of reproducibility studies, rather than the quality of the palpation procedure.29,30,72 Greater reproducibility may be attained by allowing positive findings in a neighboring spinal segment to count in assessing agreement.29 However, this implies that we define a new, different diagnostic test which, then, requires a clinical rationale of test meaningfulness, beyond just an increase in κ values.8 Further, parallel testing (test regimens) seems to aid the observer in making the clinical decision, thus enhancing reproducibility;30,42 a tendency we could also observe in our data. The acceptable


intraobserver reproducibility for GA is also in line with this finding. However, when evaluating a combination of tests, information is only given about the reproducibility of the single test as part of this exact combination of tests.14,73 Moreover, we must be aware that conclusions on a single test from a study involving several tests may be only valid if the test is applied as part of this exact combination of tests. From a clinical perspective, increased reproducibility with parallel testing indicates that at this point, clinicians should not base their diagnosis on a single clinical examination finding such as palpation but, rather, conduct a range of tests. It is, however, premature to make clinical guidelines on how to use palpation because many aspects of palpation, such as the validity, still need to be investigated. The reproducibility of palpation for pain response is consistently higher than palpation for motion and, consistently, substantially higher within an observer than among different observers. However, both palpatory pain studies and intraobserver studies in general have inherent problems with blinding of observers. In intraobserver studies, conscious and unconscious cues may render blinding of the observers impossible, and the independence of measures can not be guaranteed. In palpatory pain studies, blinding of subjects is impossible. Both situations imply the risk of overestimating reproducibility. It should also be noted that intraobserver reproducibility is somewhat higher than interobserver reproducibility by definition (depending on the magnitude of observer by subject interaction).74 A dilemma between high internal validity and clinical applicability arises when designing studies of reproducibility. For example, training studies contrast maximal (ideal) reproducibility with actual reproducibility in practice. 
To enhance the internal validity, rigid testing conditions should be set up with consideration given to blinding, randomization, standardization and training, and parallel testing. However, rigid enforcement of testing conditions often diverges from the clinical situation and, hence, may reduce the external validity. In a clinical situation, a mix of both asymptomatic and symptomatic patients will most likely present to practitioners of manual medicine. Therefore, the study population should consist of a mix of both symptomatic and asymptomatic subjects so that the reproducibility of the testing procedure has a relation to the characteristics of the study population.14 Finally, in spite of their use in everyday clinical routines, test procedures do not necessarily evaluate the clinical entity they are intended to evaluate, and it is therefore important to discuss the content of the test procedure.14,75

Statistical Considerations
κ is widely accepted as the statistical method of choice for evaluating agreement between 2 observers for a binary classification.8 It is, however, not without problems to use κ as the sole measure of observer agreement because information is lost when a 4-fold table is summarized into 1 number. Consequently, if a moderate κ value is obtained in a study of reproducibility, we do not know whether it is due to a difference in prevalence estimates between observers or whether observers lack agreement in spite of similar prevalence. κ has been criticized for its dependence on the prevalence of positive findings, which limits its usefulness in meta-analyses, because studies with varying prevalence are typically compared. However, the composition of the study population may have greater impact on κ than the prevalence of positive findings.9 Both a binary outcome and a reported κ value were required for studies to be part of our meta-analysis. However, binary outcomes may vary according to the definition of positive findings (ie, prevalence is directly dependent on the definition of positive findings). For example, if the observer is asked to identify any hypomobile segment(s) in a spinal region, the prevalence can vary from 0% to 100%, depending on the study population. If the observer is to identify the most hypomobile segment, the overall prevalence of positive findings will be 100%, but at any particular segment under investigation, the prevalence of the most hypomobile can be 0% to 100%. However, we found no association between the prevalence of positive findings and κ values. This supports the view that the composition of the study populations is probably of greater importance than the prevalence of positive findings, as suggested by Vach.9 Different words and schemes have been used to evaluate the strength of reproducibility, but there are no definitive guidelines for interpreting good concordance.8,76 Moreover, little research has been done to establish minimal, clinically acceptable reproducibility, and perhaps more important than qualifying the strength of concordance, the quantitative reproducibility indices need to be evaluated in terms of their clinical application.8

Limitations of this Review
Different methodologies have been advocated for systematic reviews of trials addressing therapeutic efficacy,12 but little consensus exists when it comes to assessing the quality of reproducibility studies. We have chosen to evaluate the strength of evidence based on a best-evidence synthesis method, and this is one of the main differences between this review and previously published reviews on the same topic. Heterogeneity across studies, in terms of test procedures, inclusion criteria, study design, and presentation of results, may be masked by the best-evidence approach. Considerable heterogeneity in study characteristics was noted across studies included in this review. However, despite this heterogeneity, the meta-analysis showed very consistent overall findings and only moderate impact of the specific design characteristics on the study outcomes. The exclusion from the meta-analysis of studies that did not report a binary outcome is another important difference between this and previous reviews. To compare studies of reproducibility, the same type of outcome and method of statistics must be applied. On this account, we had to

Journal of Manipulative and Physiological Therapeutics Volume 29, Number 6

Stochkendahl et al Spinal Palpation: A Systematic Review

483

exclude 5 high-quality studies from the meta-analysis. Results from these studies are not directly comparable to the included studies, but all 5 articles show results with similar trends of low interobserver agreement on MP and higher interobserver agreement on evaluation of pain; they were included in the level of evidence assessment. The restricted number of articles causes the strength of evidence to be preliminary or nonexistent in 3 categories. In return, the power of the conclusions with respect to pain and motion testing is compelling. However, results were, in some categories, based on a relatively small number of original studies, making the conclusions very sensitive to just a few future high-quality studies with different results. A j value was reported in all high-quality studies using a binary classification. Hence, there was no need to calculate these from a published 4-fold table. No attempts were made to retrieve additional, original results or materials from the primary authors. Although every effort was made to find all published reproducibility studies, selection bias may have occurred because we included only English-language articles. Publication bias may have resulted in an overestimation of test reproducibility because studies arriving at positive conclusions are more likely to get published.77,78 Furthermore, reviewer bias is also a possible limitation of this review. Reviewers were not blinded to the authors or the results of the individual trials when the methodological scoring was performed because of our familiarity with the literature. Despite acceptable study quality according to our criteria, many trials still had methodological limitations or, at best, inadequate reporting of methods. Nonetheless, reproducibility of spinal manual palpation has been very thoroughly investigated and more than 40 original articles have been evaluated in this review. 
However, to shed light on the clinical usefulness of palpation, the validity needs to be investigated, and new innovative research that addresses the concomitant problems of selecting a golden standard in motion testing is warranted. Future research should also address the question of palpation in the overall assessment of neck and back pain patients and the importance of palpation as part of the complete clinical evaluation of patients.
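The point made under Statistical Considerations, that information is lost when a 4-fold table is summarized into 1 number, can be illustrated with a minimal Python sketch of Cohen's kappa computed from the 4 cell counts. The function name `cohen_kappa` and all cell counts below are hypothetical and chosen only for illustration; they are not data from any of the reviewed studies.

```python
def cohen_kappa(a, b, c, d):
    """Cohen's kappa for a 4-fold (2 x 2) agreement table.

    a: both observers positive
    b: observer 1 positive, observer 2 negative
    c: observer 1 negative, observer 2 positive
    d: both observers negative
    """
    n = a + b + c + d
    if n == 0:
        raise ValueError("empty table")
    p_observed = (a + d) / n                 # percent agreement
    prev_1 = (a + b) / n                     # prevalence according to observer 1
    prev_2 = (a + c) / n                     # prevalence according to observer 2
    p_expected = prev_1 * prev_2 + (1 - prev_1) * (1 - prev_2)
    if p_expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 100%")
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical tables of 100 subjects each, given as (a, b, c, d).
tables = {
    # Observers report the same prevalence (50%) but disagree on 30 subjects.
    "similar prevalence": (35, 15, 15, 35),
    # Observers report different prevalence (50% vs 30%).
    "different prevalence": (25, 25, 5, 45),
    # Same percent agreement (80%) at two prevalence levels.
    "80% agreement, 50% positive": (40, 10, 10, 40),
    "80% agreement, 15% positive": (5, 10, 10, 75),
}

for name, cells in tables.items():
    print(f"{name}: kappa = {cohen_kappa(*cells):.2f}")
```

Under these assumed counts, the first 2 tables give the same kappa (about 0.40) although the pattern of disagreement differs, and the last 2 tables share 80% agreement but give clearly different kappa values (about 0.60 and about 0.22). This is why reporting the 4-fold table, or at least prevalence, alongside kappa is informative.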

CONCLUSIONS
Palpation for pain is reproducible at a clinically acceptable level, both within the same observer and among observers. Palpation for GA is reproducible within the same observer but not among different observers. The level of evidence to support these conclusions is strong. The reproducibility of MP, STC, and SP is not clinically acceptable. The level of evidence is strong for interobserver reproducibility of MP and STC, whereas no evidence or conflicting evidence exists for SP and intraobserver reproducibility of STC. Results are overall robust with respect to the predefined levels of acceptable quality. However, the results are sensitive to changes in the preset level of clinically acceptable reproducibility and to the number of included studies.

Practical Applications
• Palpation for pain is reproducible between observers at a clinically acceptable level.
• Most spinal palpatory procedures investigated are reproducible within the same observer but not between observers.

REFERENCES
1. Bergmann TF, Petersen DH. Joint principles and procedures. In: Bergmann TF, Petersen DH, Lawrence DJ, editors. Chiropractic technique: principles and procedures. New York: Churchill Livingstone Inc; 1993. p. 51-121.
2. Schafer RC, Faye LJ. Introduction to the dynamic chiropractic paradigm. In: Schafer RC, Faye LJ, editors. Motion palpation and chiropractic technique. 1st ed. Huntington Beach, Calif: The motion palpation institute; 1989. p. 1-41.
3. Maitland GD. Vertebral manipulation. 3rd ed. London: Butterworths; 1977.
4. Hestbaek L, Leboeuf-Yde C. Are chiropractic tests for the lumbo-pelvic spine reliable and valid? A systematic critical literature review. J Manipulative Physiol Ther 2000;23:258-75.
5. Huijbregts PA. Spinal motion palpation: a review of reliability studies. J Man Manip Ther 2002;10:24-39.
6. van der Wurff P, Hagmeijer RH, Meyne W. Clinical tests of the sacroiliac joint. A systemic methodological review. Part 1: reliability. Man Ther 2000;5:30-6.
7. Seffinger MA, Najm WI, Mishra SI, Adams A, Dickerson VM, Murphy LS, et al. Reliability of spinal palpation for diagnosis of back and neck pain: a systematic review of the literature. Spine 2004;29:E413-25.
8. Haas M. Statistical methodology for reliability studies. J Manipulative Physiol Ther 1991;14:119-32.
9. Vach W. The dependence of Cohen's kappa on the prevalence does not matter. J Clin Epidemiol 2005;58:655-61.
10. Vaughan B. Inter-examiner reliability in detecting cervical spine dysfunction: a short review. J Osteopath Med 2002;5:24-7.
11. van Tulder MW, Assendelft WJ, Koes BW, et al. Method guidelines for systematic reviews in the Cochrane collaboration back review group for spinal disorders. Spine 1997;22:2323-30.
12. Clarke M, Oxman AD. Cochrane reviewers handbook 4.2.0. Oxford: Cochrane Collaboration; 2003 [cited 2004 Jun 1].
13. Hoogendoorn WE, van Poppel MN, Bongers PM, Koes BW, Bouter LM. Systematic review of psychosocial factors at work and private life as risk factors for back pain. Spine 2000;25:2114-25.
14. Patijn J. Reproducibility and validity studies of diagnostic procedures in manual/musculoskeletal medicine. International Federation for Manual/Musculoskeletal Medicine Scientific committee. Protocol Formats; 2004.
15. Deeks JJ. Systematic reviews in health care: systematic reviews of evaluations of diagnostic and screening tests. BMJ 2001;323:157-62.
16. Irwig L, Macaskill P, Glasziou P, et al. Meta-analytic methods for diagnostic test accuracy. J Clin Epidemiol 1995;48:119-30.


17. Altman DG. Some common problems in medical research. In: Altman DG, editor. Practical statistics for medical research. London: Chapman & Hall; 1991. p. 396-439.
18. Bigos S, Bowyer O, Braen G, et al. Acute low back problems in adults. Clinical Practice Guideline No. 14. AHCPR Publication No. 95-0642. Rockville (Md): Agency for Health Care Policy and Research, Public Health Service, U.S. Department of Health and Human Services; 1994 December. Available from: www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=hstat6.chapter.25870.
19. Hartvigsen J, Lings S, Leboeuf-Yde C, Bakketeig L. Psychosocial factors at work in relation to low back pain and consequences of low back pain; a systematic, critical review of prospective cohort studies. Occup Environ Med 2004;61:e2.
20. Pool JJ, Hoving JL, De Vet HC, van Mameren H, Bouter LM. The interexaminer reproducibility of physical examination of the cervical spine. J Manipulative Physiol Ther 2004;27:84-90.
21. Fjellner A, Bexander C, Faleij R, Strender LE. Interexaminer reliability in physical examination of the cervical spine. J Manipulative Physiol Ther 1999;22:511-6.
22. Strender LE, Lundin M, Nell K. Interexaminer reliability in physical examination of the neck. J Manipulative Physiol Ther 1997;20:516-20.
23. Strender LE, Sjoblom A, Sundell K, Ludwig R, Taube A. Interexaminer reliability in physical examination of patients with low back pain. Spine 1997;22:814-20.
24. Keating JC, Bergmann TF, Jacobs GE, Finer BA, Larson K. Interexaminer reliability of eight evaluative dimensions of lumbar segmental abnormality. J Manipulative Physiol Ther 1990;13:463-70.
25. Viikari-Juntura E. Interexaminer reliability of observations in physical examinations of the neck. Phys Ther 1987;67:1526-32.
26. Sebastian D, Chovvath R. Reliability of palpation assessment in non-neutral dysfunctions of the lumbar spine. Orthop Phys Ther Pract 2004;16:23-6.
27. Hicks GE, Fritz JM, Delitto A, Mishock J. Interrater reliability of clinical examination measures for identification of lumbar segmental instability. Arch Phys Med Rehabil 2003;84:1858-64.
28. Downey B, Nicholas T, Niere K. Can manipulative physiotherapists agree on which lumbar level to treat based on palpation? Physiotherapy 2003;89:74-81.
29. Christensen HW, Vach W, Manniche C, Haghfelt T, Hartvigsen L, Høilund-Carlsen PF. Palpation of the upper thoracic spine: an observer reliability study. J Manipulative Physiol Ther 2002;25:285-92.
30. Horneij E, Hemborg B, Johnsson B, Ekdahl C. Clinical tests on impairment level related to low back pain: a study of test reliability. J Rehabil Med 2002;34:176-82.
31. Marcotte J, Normand MC, Black P. The kinematics of motion palpation and its effect on the reliability for cervical spine rotation. J Manipulative Physiol Ther 2002;25:E7.
32. Comeaux Z, Eland D, Chila A, Pheley A, Tate M. Measurement challenges in physical diagnosis: refining interrater palpation, perception and communication. J Bodyw Mov Ther 2001;5:245-53.
33. Ghoukassian M, Nicholls B, McLaughlin P. Inter-examiner reliability of the Johnson and Friedman percussion scan of the thoracic spine. J Osteopath Med 2001;4:15-20.
34. French SD, Green S, Forbes A. Reliability of chiropractic methods commonly used to detect manipulable lesions in patients with chronic low-back pain. J Manipulative Physiol Ther 2000;23:231-8.
35. Smedmark V, Wallin M. Inter-examiner reliability in assessing passive intervertebral motion of the cervical spine. Man Ther 2000;5:97-101.
36. van Suijlekom HA, de Vet HC, van den Berg SG, Weber WE. Interobserver reliability in physical examination of the cervical spine in patients with headache. Headache 2000;40:581-6.
37. Vincent-Smith B, Gibbons P. Inter-examiner and intraexaminer reliability of standing flexion test. Man Ther 1999;4:87-93.
38. Hawk C, Phongphua C, Bleecker J, Swank L, Lopez D, Rubley T. Preliminary study of the reliability of assessment procedures for indications for chiropractic adjustments of the lumbar spine. J Manipulative Physiol Ther 1999;22:382-9.
39. Meijne W, van Neerbos K, Aufdemkampe G, van der Wurff P. Intraexaminer and interexaminer reliability of the Gillet test. J Manipulative Physiol Ther 1999;22:4-9.
40. Lundberg G, Gerdle B. The relationships between spinal sagittal configuration, joint mobility, general low back mobility and segmental mobility in female homecare personnel. Scand J Rehabil Med 1999;31:197-206.
41. Cattrysse E, Swinkels RAH, Oostendorp RAB, Duquet W. Upper cervical instability: are clinical tests reliable? Man Ther 1997;2:91-7.
42. Jull G, Zito G. Inter-examiner reliability to detect painful upper cervical joint dysfunction. Aust J Physiother 1997;43:125-9.
43. McPartland JM, Goodridge JP. Counterstrain and traditional osteopathic examination of the cervical spine compared. J Bodyw Mov Ther 1997;1:173-8.
44. Tuchin P, Hart J, Colman R, Johnson C, Gee A, Edwards I, et al. Interexaminer reliability of chiropractic evaluation for cervical spine problems: a pilot study. Chiropr J Aust 1996;5:23-9.
45. Haas M. Reliability of manual end-play palpation of the thoracic spine. Chiropr Tech 1995;7:120-4.
46. Lindsay DM. Interrater reliability of manual therapy assessment techniques. Phys Ther Can 1995;47:173-80.
47. Binkley J, Stratford PW, Gill C. Interrater reliability of lumbar accessory motion mobility testing. Phys Ther 1995;75:786-92.
48. Inscoe EL, Witt PL, Gross MT, Mitchell RU. Reliability in evaluating passive intervertebral motion of the lumbar spine. J Man Manip Ther 1995;3:135-43.
49. Maher C, Adams R. Reliability of pain and stiffness assessments in clinical manual lumbar spine examination. Phys Ther 1994;74:801-9.
50. Hubka MJ, Phelan SP. Interexaminer reliability of palpation for cervical spine tenderness. J Manipulative Physiol Ther 1994;17:591-5.
51. Paydar D, Thiel H, Gemmell H. Intra- and interexaminer reliability of certain pelvic palpatory procedures and the sitting flexion test for sacroiliac joint mobility and dysfunction. J Neuromusculoskel Syst 1994;2:65-9.
52. Boline PD, Haas M, Meyer JJ, Kassak K, Nelson C, Keating JC. Interexaminer reliability of eight evaluative dimensions of lumbar segmental abnormality: part II. J Manipulative Physiol Ther 1993;16:363-74.
53. Mior SA, McGregor M, Schut B. The role of experience in clinical accuracy. J Manipulative Physiol Ther 1990;13:68-71.
54. Leboeuf C. Chiropractic examination procedures: a reliability and consistency study. J Aust Chiropr Assoc 1989;19:101-4.
55. Herzog W, Read LJ, Conway PJ, Shaw LD, McEwen MC. Reliability of motion palpation procedures to detect sacroiliac joint fixations. J Manipulative Physiol Ther 1989;12:86-92.
56. Nansel DD, Peneff AL, Jansen RD, Cooperstein R. Interexaminer concordance in detecting joint-play asymmetries in the cervical spines of otherwise asymptomatic subjects. J Manipulative Physiol Ther 1989;12:428-33.
57. Mootz RD, Keating JC, Kontz HP, Milus TB, Jacobs GE. Intra- and interobserver reliability of passive motion palpation of the lumbar spine. J Manipulative Physiol Ther 1989;12:440-5.


58. Boline PD. Interexaminer reliability of palpatory evaluations of the lumbar spine. Am J Chiropr Med 1988;1:5-11.
59. Carmichael JP. Inter- and intra-examiner reliability of palpation for sacroiliac joint dysfunction. J Manipulative Physiol Ther 1987;10:164-71.
60. Love RM, Brodeur RR. Inter- and intra-examiner reliability of motion palpation for the thoracolumbar spine. J Manipulative Physiol Ther 1987;10:1-4.
61. Bergström E, Courtis G. An inter- and intra-examiner reliability study of motion palpation of the lumbar spine in lateral flexion in the seated position. Eur J Chiropr 1986;34:121-41.
62. Mior SA, King R. Intra and interexaminer reliability of motion palpation in the cervical spine. J Can Chiropr Assoc 1985;29:195-9.
63. Deboer KF, Harmon R, Tuttle CD, Wallace H. Reliability study of detection of somatic dysfunctions in the cervical spine. J Manipulative Physiol Ther 1985;8:9-16.
64. Potter NA, Rothstein JM. Intertester reliability for selected clinical tests of the sacroiliac joint. Phys Ther 1985;65:1671-5.
65. Johnston WL, Allan BR, Hendra JL, Neff DR, Rosen ME, Sills LD, et al. Interexaminer study of palpation in detecting location of spinal segmental dysfunction. J Am Osteopath Assoc 1983;82:839-45.
66. Gonella C, Paris SV, Kutner M. Reliability in evaluating passive intervertebral motion. Phys Ther 1982;62:436-44.
67. Wiles MR. Reproducibility and interexaminer correlation of motion palpation findings of the sacroiliac joints. J Can Chiropr Assoc 1980;24:59-69.
68. Oldreive WL. Manual therapy rounds. A critical review of the literature on tests of the sacroiliac joint. J Man Manip Ther 1995;3:157-61.
69. Keating JC. Inter-examiner reliability of motion palpation of the lumbar spine: a review of quantitative literature. Am J Chiropr Med 1989;2:107-10.
70. Panzer DM. The reliability of lumbar motion palpation. J Manipulative Physiol Ther 1992;15:518-24.
71. Haas M. The reliability of reliability. J Manipulative Physiol Ther 1991;14:199-208.
72. Humphreys K, Delahaye M, Peterson CK. An investigation into the validity of cervical spine motion palpation using subjects with congenital block vertebrae as a "gold standard". BMC Musculoskelet Disord 2004;5:19.
73. van Deursen L, Patijn J, Ockhuysen A, Vortman BJ. The value of some clinical tests of the sacro-iliac joint. Man Med 1990;5:96-9.
74. Feldt LS, McKee ME. Estimation of the reliability of skill tests. Res Q 1958;29:279-93.
75. Haas M, Groupp E, Panzer D, Partna L, Lumsden S, Aickin M. Efficacy of cervical endplay assessment as an indicator for spinal manipulation. Spine 2003;28:1091-6.
76. Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics 1977;33:159-74.
77. Huxley R, Neil A, Collins R. Unravelling the fetal origins hypothesis: is there really an inverse association between birthweight and subsequent blood pressure? Lancet 2002;360:659-65.
78. Chan AW, Hrobjartsson A, Haahr MT, Gotzsche PC, Altman DG. Empirical evidence for selective reporting of outcomes in randomized trials: comparison of protocols to published articles. JAMA 2004;291:2457-65.


APPENDIX A
Test procedure MP STP Segmental level/ patient position T1-T8 Sitting + prone T7-L5 prone T11-L5 + SI observers own choice SI standing T12-S1 Observers own choice SI Standing Cx supine + sitting T12-S1 Side posture SI Sitting SI Study population (no. [M/F], category, symptomatic status) 107 (68/39) Outpatient Sympt + Asympt 84 (sex, NR) Gen pop Sympt + Asympt 19 (14/5) Recruitment NR Sympt 9 (5/4) Edu/staff Asympt 18 (14/4) Edu/staff Sympt + Asympt 41 (41/0) Edu/staff Sympt + Asympt 11 (sex NR) Research Status NR 6 (2/4) Edu/staff Sympt 32 (17/15) Edu/staff Asympt N15 (sex NR) Recruitment NR Status NR 45 (29/16) Gen pop Sympt 11 (sex NR) Prim Care Sympt + Asympt 60 (sex NR) Edu/staff Status NR 32 (32/0) Edu/staff Status NR 54 (sex NR) Edu/staff Asympt 100 (sex NR) Edu/staff Status NR 40 (40/0) Research + Edu/staff Asympt 62 (sex NR) Edu/staff Status NR 5 (0/5) Edu/staff Asympt Examiners (no., occupation, experience) 2 Chiropractors; experience NR 3 Physiotherapists, 18-25 y 5 Chiropractors 5-18 y 9 Osteopathic stud 4-5 y 4 Chiropractors 2 N 20 y 2 b 3 y 2 Physiotherapy stud experience NR 4 Manual practitioners 1.5-13 y 2 Physiotherapists 4-5 y 2 Chiropractic stud 1 y 74 Chiropractic stud Experience NR 2 Chiropractors N5 y 4 Chiropractic stud Experience NR 10 Chiropractors 1-11 y 2 Chiropractors 7 + 10 y 8 Chiropractic stud 1 y 10 stud. 1-3 y 2 Chiropractic stud. Experience NR 3 Chiropractors 2 Chiropractic stud Experience NR 5 Physiotherapists 3-20 y

Reference Christensen et al29 Horneij et al30 French et al34 Vincent-Smith and Gibbons37 Hawk et al38 Meijne et al39 Cattrysse et al41 Inscoe et al48 Paydar et al51 Mior et al53

Standardization +

MP STP GA MP GA MP GA MP MP OP MP

+ + + + + +/

Leboeuf54 Herzog et al55 Mootz et al57 Love and Brodeur60 Carmichael59 Bergström and Courtis61 Deboer et al63 Mior and King62 Gonella et al66

MP OP STP MP MP MP MP MP Insuff descrip MP MP

Lx + SI sitting SI Standing Lx Sitting T1-L5 Sitting SI Standing Lx Sitting Cx Sitting C1 Supine T12-S1

NR + + + NR +

Cx, cervical spine; Tx, thoracic spine; NR, not reported; NA, not applicable; Sympt, symptomatic; Asympt, asymptomatic; Prim Care, primary care; Edu/staff, educational (students) or staff members; Gen pop, general population; Outpatient, outpatient clinic; Research, research setting; Stud, student; M/F, male/female; PA, percentage agreement; CI, confidence interval; Neuro, neurologic testing, such as sensitivity, reflexes, muscular strength; Clin, clinical testing, such as active and passive range of motion, axial compression test, manual traction test, straight leg raise, and shoulder abduction test.


Additional procedures Muscle length

Definition of positive findings/acceptable reliability Abnormality j N 0.5 Pain

Statistics (type, prevalence/ CI reported) j (expanded j ): +/+ j : /+

Summary of results/j (PA) MP: 0.13-0.45 (0.60-0.68) (82%-88%); STP: 0.34-0.57 (0.63-0.77) (81%-88%) MP: 0.56-0.78 (78%-89%); STP: 0.64-0.78 (83%-89%) 0.21 to 1.00 (30%-100%) 0.46 (42%) segment: 0.1 to 0.85 unit: 0.1 to 0.77 0.03-0.08 (71%-83%) 0.27 to 1.0 (63.6%-100%) MP: 0.29 (58%) OP: 0.91 (97%) NR

Quality score 100% 50.0%

History posture x-ray Neuro Clin Manual examination 3 tests of instability Posture

Joint in need of adjustment; allows F 1 segment Unsymmetrical movement, LN b R Joint in need of adjustment (segment and functional unit) Fixation Instability Mobility Restriction tenderness Fixation

j : / j : / j : +/ j : /+ j : / Percent agreement j : /se j : /

25.0% 25.0% 50.0% 75.0% 75.0% 0% 50.0% 25.0%

Gait analysis

NR Fixation, 3-point scale Fixation Most hypomobile motor unit Fixation Fixation Fixation Pain Muscle Fixation Mobility, 7-point scale

Percent agreement Percentage agreement, m2 j : +/ Pearson j : +/se Percent agreement j j : +/ Mean, SD

0.09 to 0.48 0.31 (90%) 0.37-0.52 (71%-79%)

25.0% 50.0% 25.0% 0% 50.0% 0% 25.0% 50.0% 0%


APPENDIX B
Test procedure MP OP MP OP Segmental level/ patient position Cx Supine Lx Prone Study population (number (m/f), category, symptomatic status) 32 (12/20) Primary care Sympt 63 (25/38) Outpatient + Research Sympt Examiners (number, occupation, experience) 2 Physiotherapists Experience NR 3 Physiotherapist 1 Physiotherapist/chiropractor 3-8 y 6 Physiotherapists 3-11 y 2 Physiotherapists 5-8 y 2 Chiropractors Experience NR 3 Physiotherapists 18-25 y 24 Chiropractic stud + 1 Chiropractor Experience NR 3 Occupation NR N10 y 10 Osteopathic Stud 2 y 5 Chiropractors 5-18 y 2 Physiotherapists N25 y

Reference Pool et al20 Hicks et al27

Standardization + +

Downey et al28 Sebastian and Chovvath26 Christensen et al29 Horneij et al30 Marcotte et al31 Comeaux et al32 Ghoukassian et al33 French et al34 Smedmark and Wallin35 Van Suijlekom et al36 Vincent-Smith and Gibbons 37 Hawk et al38 Meijne et al39 Fjellner et al21 Lundberg and Gerdle40 Strender et al22

MP MP MP STP MP STP MP MP STC STC GA MP

Lx Prone

SP OP STP MP GA MP MP MP

60 (28/32) Prim Care Sympt L5 Sitting + prone 31 (sex NR) Recruitment NR Sympt T1-T8 Sitting + prone 107 (68/39) Outpatient Sympt + Asympt T7-L5 Prone 84 (sex NR) Gen pop Sympt + Asympt Cx Supine 3 (sex NR) Edu/staff Asympt C2-T8 Sitting 54 (27/28) Gen pop Status NR Tx Sitting 19 (19/0) Recruitment NR Asympt T11-L5 + SI 19 (14/5) Recruitment Observers own choice NR Sympt 61 (15/46) Prim. C1-3 + C7-T1 care Sympt Sitting + prone + side lying Cx Position NR 24 (13/11) Outpatient + Research Sympt SI Standing 9 (5/4) Edu/staff Asympt T12-S1 Observers own choice SI Standing C0-C5 Sitting + supine Lx Side posture 18 (14/4) Edu/staff Sympt + Asympt 41 (41/0) Edu/staff Symptom + Asympt 48 (8/40) Edu/staff + Gen pop Asympt 156 (0/156) Gen pop Status NR 50 (13/37) Gen pop Sympt + Asympt

+ + + + + +

2 Neurologists Experience NR 9 Osteopathic stud. 4-5 y 4 Chiropractors 2 N 20 y 2 b 3 y 2 Physiotherapy stud. Experience NR 2 Physiotherapists 6 + 12 y

+ + +

MP SP OP C0-C3 Supine STP STC

3 Physiotherapists + Experience NR 2 Physiotherapists 21 + 23 y +

Strender et al23

MP STP

Lx Prone

71 (28/43) Outpatient + Prim Care Sympt

2 Physiotherapists 2 Physicians Experience NR

Cattrysse et al41

GA

Cx Supine + sitting

11 (sex NR) Research Status NR

4 Manual practitioners 1.5-13 y


Additional procedures Clin Clin General mobility test History Clin Muscle length History Posture X-ray Neuro Clin 4 tests of mobility

Definition of positive findings/ acceptable reliability Mobility Pain, 11-point scale j N 0.4, ICC N0.75 Mobility Pain

Statistics (type, prevalence/ CI reported) j : and ICC (2.1) +/ j : +/+

Summary of results/j (PA) MP: -0.09-0.63 (48%-90%) OP: 0.22-0.80 (40.6%-87.4%) MP: -0.02-0.26 (52%-69% ) OP: 0.25-0.55 (65%-87%) 0.37 0.69 MP: 0.03-0.0 (0.22-0.24) (68%-80%) STP: 0.38 (0.67-0.70) (77%-79%) MP: 0.12-0.49 (61%-77%) STP: 0.31-0.88 (80%-95%) 0.337-0.682 (81%-90%) NR 0.07 0.16 to 0.25 (48%-64%) 0.28-0.43 (79%-87%)

Quality score 50% 50%

Most symptomatic level Dysfunction Abnormality j N 0.5 Pain Fixation Inclination V 68 The most dysfunctional segment The most significant area of tissue tension Joint in need of adjustment Allows F 1segment Stiffness (reduced mobility)

j : +/+ j : +/ j (expanded j ): +/+ j : /+ j : +/se j : +/ j : / j : / j : /

50% 16.7% 100% 66.7% 16.7% 50.0% 33.3% 50.0% 66.7%

History Clin Tender points Manual examination Clin Posture Clin Clin

Facet joint pain Impairment Unsymmetrical movement, LNbR Joint in need of adjustment (segment and functional unit) Fixation If not normal j N0.4 Mobility, 5-point scale Mobility Consistency Pain Difference between L/R, the most pronounced side j N 0.4 Mobility Normality versus pathology j N 0.4

j : / j : / j : +/ j : /+ j (w): +/+ j (w): /+ j : +/+

SP: 0.14-0.37 OP: 0.0-1.0 STP: 0.35-0.87 0.05 (42%) segment: 0.42 to 0.44 unit: 0.39 to 0.54 0.05 to 0.0 (76%-77%) 0.16 to 0.49 (41%-92%) 0.42-0.75 MP: 0.05-0.15 (26%-44%) SP: 0.24 (70%) OP: 0.37 (58%) STP: 0.31-0.52 (62%-68%) STC: .18 (36%) MP: PT: 0.38-0.75 (72%-88%) MD: -0.08-0.24 (48%-62%) STP PT: 0.27-0.56 (72%-86%) MD: 0.22-0.40 (71%-76%) 0.64 to 1.0 (18%-100%)

33.3% 16.7% 66.7% 66.7% 66.7% 66.7% 75.0%

Clin Neuro

j : +/+

66.7%

3 tests of instability

Instability

j : /

83.3%


APPENDIX B. continued
Test procedure GA MP SP STC GA MP
46

Reference Jull and Zito42 McPartland and Goodridge43 Tuchin et al44 Haas
45

Segmental level/ patient position C0-C3 Position NR C0-C3 Position NR

C1-C7 Position NR T3-T12 Sitting Lx + SI Supine + prone L1-S1 Prone T12-S1 Side posture Lx Prone C0-C7 Sitting SI Sitting Lx prone

Lindsay

MP MP MP MP OP SP MP OP OP STP

Binkley et al47 Inscoe et al48 Maher and Adams49 Hubka and Phelan50 Paydar et al51 Boline et al52

Study population (number (m/f), category, symptomatic status) 40 (12/28) Out patient Sympt + Asympt 7 + 11 (1/6 + 5/6) Research + Edu/staff Sympt + Asympt 53 (sex NR) Edu/staff Sympt + Asympt 73 (2/3 males) Edu/staff Sympt/ Asympt 8 (sex NR) Gen pop Asympt 18 (9/9) Outpatient Sympt 6 (2/4) Edu/staff Sympt 90 (34/56) Prim Care Sympt 30 (11/19) Private Clinic Sympt 32 (17/15) Edu/staff Asympt 28 (+/+)Prim Care Sympt

Examiners (number, occupation, experience) 7 Physiotherapists Experience NR 2 Osteopaths 10 + 40 y 36 Osteopathic stud 8 Chiropractors 2-14 y 2 Chiropractors N15 y 2 Physiotherapists 6 + 10 y 6 Physiotherapists 6-13 y 2 Physiotherapists 4-5 y 6 Physiotherapists 8-21 y 2 Chiropractors 1 + 5 y 2 Chiropractic stud. 1 y 3 Chiropractors Experience NR

Standardization NR

+ + + + NR

Keating et al24

MP SP OP Lx Prone + sitting STP STC

46 (20/26) Recruitment NR Sympt + Asympt

3 Chiropractors 2 -10 y

Mior et al53

MP

SI

N15 (sex NR) Recruitment NR Status NR 45 (29/16) Gen pop Sympt 11 (sex NR) Prim Care Sympt + Asympt 270 (Approximately 50% males) Edu/ staff Asympt 60 (sex NR) Edu/staff Status NR 50 (27/23) Edu/staff + outpatient + Prim Care Sympt + Asympt 54 (sex NR) Edu/staff Asympt 32 (32/0) Edu/staff Status NR 69 (29/23) Outpatient Sympt

Leboeuf54 Herzog et al55 Nansel et al56

MP OP STP MP MP

Lx + SI Sitting SI Standing Middle + lower Cx Sitting + supine Lx Sitting Lx Sitting

74 Chiropractic stud. Experience NR 2 Chiropractors N5 y 4 Chiropractic stud Experience NR 10 Chiropractors 1-11 y 4 Chiropractors Experience NR 2 Chiropractors 7 + 10 y 2 Chiropractors Experience NR 10 stud. 1-3 y 8 Chiropractic stud 1 y

+/

NR + +

Mootz et al57 Boline58

MP MP STP STC MP MP OP STP

+ +

Carmichael59 Love and Brodeur60 Viikari-Juntura25

SI Standing T1-L5 Sitting Cx Seated

1 Physician 1 Physiotherapist + Experience NR


Additional procedures Manual examination

Manual examination

Definition of positive findings/ acceptable reliability Most dysfunctional segment Order of magnitude Dysfunction. Facet joint tenderness. Tissue texture. (Rating 0-10) Vertebral dysfunction

Statistics (type, prevalence/ CI reported) j : / j : /

Summary of results/j (PA) 0.25-1.0 MP: 0.34 (67%) SP: 0.53 (77%) STC: 0.19 (70%)

Quality score 66.7% 58.3%

Logistic regression m2 j : /SE j : +/ ICC /+ Percent Agreement ICC (1,1) +/+ j : +/+ j : /se j : +/

16.7%

Posture Clin Muscle length Posture Posture Dermothemography Surface electromyography Posture Dermothemography Temperature

End play restriction Beyond slight anomaly Motion, 9-point scale Mobility Stiffness, 11-point scale Pain, 11-point scale The most tender spot Restriction Tenderness Presence of abnormality

0.14 Lx: 0.30 to 0.0 (14%-50%) SI: 0.0-0.60 (75%-86%) 0.09-0.25 MP: 0.40 to 0.73 OP: 0.27-0.85 0.68 (77%) MP: 0.09 (34%) OP: 0.73 (91%) OP: 0.48-0.90 (75-96%) STP: 0.40-0.78 (89%)

100% 66.7% 33.3% 16.7% 58.3% 75.0% 50.0% 50.0%

Misalignment Pain Fixation j N 0.4

j : +/

Fixation

j : /

MP: 0.07-0.09 SP: 0.0 OP: 0.48 STP: 0.30 STC: 0.07 NR

75.0%

16.7%

Gait analysis

NR Fixation, 3-point scale The side of greatest resistance (LN bR) marked segment. Fixation Presence of severe abnormality, fixation Fixation Most hypomobile motor unit Tendersness Rating (0-3) j N 0.4

Percent agreement Percentage agreement, m2 j : +/

0.01 (46%-54%)

16.7% 50.0% 16.7%

j : +/ j : +/

0.17 to 0.17 MP: 0.05 to 0.31 (78-91%) STP: 0.03 to 0.49 (90-96%) STC: 0.10-0.31 (70%) 0.02 (85%) OP: 0.47-0.52 STP: 0.24-0.56

33.3% 66.7%

Neuro Clin

j : +/se Pearson j (w): +/

50.0% 16.7% 50.0%


APPENDIX B continued
Test procedure MP Segmental level/ patient position Lx Sitting C1 Supine Cx Sitting SI Standing + sitting + side posture + prone C7-T12 Standing T12-S1 SI Study population (number (m/f), category, symptomatic status) 100 (sex NR) Edu/staff Status NR 62 (sex NR) Edu/staff Status NR 40 (40/0) Research + Edu/staff Asympt 17 (10/7) Outpatient Sympt Examiners (number, occupation, experience) 2 Chiropractic stud. Experience NR 2 Chiropractic stud Experience NR 3 Chiropractors Experience NR 8 Physiotherapists 2-18 y

Reference Bergström and Courtis61 Mior and King62 MP Deboer et al63 Potter and Rothstein64 Johnston et al65 Gonella et al66 Wiles67

Standardization NR +

Insuff descrip MP

STC MP MP

30 (sex NR) Edu/staff Status NR 5 (0/5) Edu/staff Asympt 46 (sex NR) Edu/staff Asympt

1 Osteopaths 5 Osteopathic stud Experience NR 5 Physiotherapists 3-20 y 12 Chiropractors average 2.75 y

NR + NR


Additional procedures 13 SI joint tests

Definition of positive findings/ acceptable reliability Fixation Fixation Fixation Pain Muscle Restriction

Statistics (type, prevalence/CI reported) Percent agreement κ: +/− κ Percentage agreement, χ² Percent agreement Mean, SD Percentage agreement, Pearson

Summary of results/κ (PA) 0.15 (61%)

Quality score 0% 50.0% 50.0% 33.3%

Decreased rebound/ dullness Mobility, 7-point scale Restriction, 5-point scale

(79%-86%)

0% 16.7% 0%


APPENDIX C. Intra-observer reproducibility studies


| Reference | Case mix | Blinding of observers to confounding info | Subject blinding | κ/ICC | Total (max 4 points) | Total percentage |
|---|---|---|---|---|---|---|
| Christensen et al29 | 1 | 1 | 1 | 1 | 4 | 100.00 |
| Horneij et al30 | 1 | 0 | 0 | 1 | 2 | 50.00 |
| French et al34 | 0 | 0 | 0 | 1 | 1 | 25.00 |
| Vincent-Smith and Gibbons37 | 0 | 0 | 0 | 1 | 1 | 25.00 |
| Hawk et al38 | 1 | 0 | 0 | 1 | 2 | 50.00 |
| Meijne et al39 | 0 | 1 | 1 | 1 | 3 | 75.00 |
| Cattrysse et al41 | 0 | 1 | 1 | 1 | 3 | 75.00 |
| Inscoe et al48 | 0 | 0 | 0 | 0 | 0 | 0.00 |
| Paydar et al51 | 1 | 0 | 0 | 1 | 2 | 50.00 |
| Mior et al53 | 0 | 0 | 0 | 1 | 1 | 25.00 |
| Leboeuf54 | 1 | 0 | 0 | 0 | 1 | 25.00 |
| Herzog et al55 | 1 | 0 | 1 | 0 | 2 | 50.00 |
| Mootz et al57 | 0 | 0 | 0 | 1 | 1 | 25.00 |
| Love and Brodeur60 | 0 | 0 | 0 | 0 | 0 | 0.00 |
| Carmichael59 | 0 | 0 | 1 | 1 | 2 | 50.00 |
| Bergström and Courtis61 | 0 | 0 | 0 | 0 | 0 | 0.00 |
| Deboer et al63 | 0 | 0 | 1 | 0 | 1 | 25.00 |
| Mior and King62 | 0 | 0 | 1 | 1 | 2 | 50.00 |
| Gonella et al66 | 0 | 0 | 0 | 0 | 0 | 0.00 |
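The totals and percentages in Appendix C follow directly from the item scores: each criterion contributes its points, and the sum is expressed as a share of the maximum (4 points for intraobserver studies; the interobserver scale in Appendix D uses 6 points and includes half-point scores). A minimal Python sketch of that arithmetic is given here; the function name `quality_score` is illustrative and does not come from the article.

```python
def quality_score(item_scores, max_points):
    """Return (total points, percentage of the maximum) for one study."""
    total = sum(item_scores)
    if not 0 <= total <= max_points:
        raise ValueError("total must lie between 0 and max_points")
    return total, round(100 * total / max_points, 2)


# Item scores copied from appendix rows.
# Appendix C order: case mix, blinding to confounding info,
# subject blinding, kappa/ICC.
christensen_intra = [1, 1, 1, 1]
meijne_intra = [0, 1, 1, 1]

# Appendix D order: randomized order, case mix, blinding to other
# observers, blinding to confounding info, subject blinding, kappa/ICC.
strender22_inter = [1, 1, 1, 0, 0.5, 1]

print(quality_score(christensen_intra, 4))  # (4, 100.0)
print(quality_score(meijne_intra, 4))       # (3, 75.0)
print(quality_score(strender22_inter, 6))   # (4.5, 75.0)
```

The printed values match the "Total" and "Total percentage" entries reported for those three studies.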

APPENDIX D. Inter-observer reproducibility studies


| Reference | Randomized order of observer | Case mix | Blinding of observers to other observers | Blinding of observers to confounding info | Subject blinding | κ/ICC | Total (max 6 points) | Total percentage |
|---|---|---|---|---|---|---|---|---|
| Pool et al20 | 0 | 1 | 1 | 0 | 0 | 1 | 3 | 50.00 |
| Hicks et al27 | 0 | 1 | 1 | 0 | 0 | 1 | 3 | 50.00 |
| Downey et al28 | 0 | 1 | 1 | 0 | 0 | 1 | 3 | 50.00 |
| Sebastian and Chovvath26 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 16.67 |
| Christensen et al29 | 1 | 1 | 1 | 1 | 1 | 1 | 6 | 100.00 |
| Horneij et al30 | 1 | 1 | 1 | 0 | 0 | 1 | 4 | 66.67 |
| Marcotte et al31 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 16.67 |
| Comeaux et al32 | 0 | 0 | 1 | 1 | 1 | 0 | 3 | 50.00 |
| Ghoukassian et al33 | 0 | 0 | 1 | 0 | 0 | 1 | 2 | 33.33 |
| French et al34 | 1 | 0 | 1 | 0 | 0 | 1 | 3 | 50.00 |
| Smedmark and Wallin35 | 1 | 1 | 1 | 0 | 0 | 1 | 4 | 66.67 |
| Van Suijlekom et al36 | 0 | 1 | 0 | 0 | 0 | 1 | 2 | 33.33 |
| Vincent-Smith and Gibbons37 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 16.67 |
| Hawk et al38 | 1 | 1 | 1 | 0 | 0 | 1 | 4 | 66.67 |
| Meijne et al39 | 0 | 0 | 1 | 1 | 1 | 1 | 4 | 66.67 |
| Fjellner et al21 | 1 | 0 | 1 | 0 | 1 | 1 | 4 | 66.67 |
| Lundberg and Gerdle40 | 1 | 1 | 1 | 0 | 0 | 1 | 4 | 66.67 |
| Strender et al22 | 1 | 1 | 1 | 0 | 0.5 | 1 | 4.5 | 75.00 |
| Strender et al23 | 1 | 0 | 1 | 0 | 1 | 1 | 4 | 66.67 |
| Cattrysse et al41 | 1 | 0 | 1 | 1 | 1 | 1 | 5 | 83.33 |
| Jull and Zito42 | 0 | 1 | 1 | 1 | 0 | 1 | 4 | 66.67 |
| McPartland and Goodridge43 | 1 | 1 | 1 | 0 | 0.5 | 0 | 3.5 | 58.33 |
| Tuchin et al44 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 16.67 |
| Haas45 | 1 | 1 | 1 | 1 | 1 | 1 | 6 | 100.00 |
| Lindsay46 | 1 | 0 | 1 | 0 | 1 | 1 | 4 | 66.67 |
| Binkley et al47 | 0 | 1 | 0 | 0 | 0 | 1 | 2 | 33.33 |
| Inscoe et al48 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 16.67 |
| Maher and Adams49 | 1 | 0 | 1 | 0 | 0.5 | 1 | 3.5 | 58.33 |


APPENDIX D. continued
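| Reference | Randomized order of observer | Case mix | Blinding of observers to other observers | Blinding of observers to confounding info | Subject blinding | κ/ICC | Total (max 6 points) | Total percentage |
|---|---|---|---|---|---|---|---|---|
| Hubka and Phelan50 | 1 | 1 | 1 | 0 | 0.5 | 1 | 4.5 | 75.00 |
| Paydar et al51 | 1 | 1 | 0 | 0 | 0 | 1 | 3 | 50.00 |
| Boline et al52 | 1 | 1 | 0 | 0 | 0 | 1 | 3 | 50.00 |
| Keating et al24 | 1 | 1 | 1 | 0 | 0.5 | 1 | 4.5 | 75.00 |
| Mior et al53 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 16.67 |
| Leboeuf54 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 16.67 |
| Herzog et al55 | 0 | 1 | 1 | 0 | 1 | 0 | 3 | 50.00 |
| Nansel et al56 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 16.67 |
| Mootz et al57 | 1 | 0 | 0 | 0 | 0 | 1 | 2 | 33.33 |
| Boline58 | 1 | 1 | 1 | 0 | 0 | 1 | 4 | 66.67 |
| Carmichael59 | 0 | 0 | 1 | 0 | 1 | 1 | 3 | 50.00 |
| Love and Brodeur60 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 16.67 |
| Viikari-Juntura25 | 1 | 1 | 0 | 0 | 0 | 1 | 3 | 50.00 |
| Bergström and Courtis61 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.00 |
| Mior and King62 | 0 | 0 | 1 | 0 | 1 | 1 | 3 | 50.00 |
| Deboer et al63 | 1 | 0 | 1 | 0 | 1 | 0 | 3 | 50.00 |
| Potter and Rothstein64 | 0 | 1 | 1 | 0 | 0 | 0 | 2 | 33.33 |
| Johnston et al65 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.00 |
| Gonella et al66 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 16.67 |
| Wiles67 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.00 |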
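Appendix D includes a criterion for whether a study reported κ or an ICC. As a hedged illustration of how κ can differ from percent agreement, the sketch below computes both for a 2×2 table of two observers' positive and negative findings, using the standard Cohen's κ formula. The counts are hypothetical and are not taken from any reviewed study; `cohen_kappa` is an illustrative helper name.

```python
def cohen_kappa(a, b, c, d):
    """Percent agreement and Cohen's kappa for a 2x2 table.

    a: both observers positive      b: observer 1 positive, observer 2 negative
    c: observer 1 negative, 2 positive      d: both observers negative
    """
    n = a + b + c + d
    p_observed = (a + d) / n
    # Chance agreement from each observer's marginal proportions.
    p_expected = ((a + b) * (a + c) + (c + d) * (b + d)) / n**2
    kappa = (p_observed - p_expected) / (1 - p_expected)
    return p_observed, kappa


# Hypothetical counts, chosen only to illustrate the calculation.
balanced = cohen_kappa(20, 5, 10, 15)   # findings split fairly evenly
rare = cohen_kappa(1, 4, 4, 91)         # positive findings are rare

print(f"balanced: PA={balanced[0]:.0%}, kappa={balanced[1]:.2f}")
print(f"rare:     PA={rare[0]:.0%}, kappa={rare[1]:.2f}")
```

In the second hypothetical table the observers agree on 92% of subjects, yet κ is about 0.16, well below the κ ≥ 0.4 level the review treats as clinically acceptable, because most of that agreement is expected by chance when positive findings are rare.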
