
Data Collection

Validity and Reliability


Ratna Kurnia Illahi, M.Pharm., Apt.
Departemen Farmasi Komunitas
Program Studi Farmasi
Universitas Brawijaya
Outline
• Data collection methods
• Measurement validity and reliability
Data Collection
• Different sources of data
• Phases in the conduct of a survey
• Response rates
Sources of data
• Data may be accessed through existing sources or collected during the course of the study
• Examples of existing sources:
- Medical records from hospital
- Databases, such as cancer
registries, birth & death registries
- Data collected for the purposes of a research project, e.g. the Busselton Health Study, Framingham Heart Study
Advantages of existing source data
• Data are often cheaper and quicker to
obtain
• Usually no burden on subjects being
studied
• Existing sources may be the only source of the data that is needed
• ‘Population’-level data, where needed, may only be available from existing sources
Disadvantages of existing source data
• Quality of data may be in question
• Data may not have been collected according to an established or consistent protocol, so comparability may be an issue
• All the measures/variables needed may not have been collected
• For these reasons, it is often necessary to collect new data and therefore to design an appropriate data collection instrument
Modes of data collection
• Data may be collected by different modes
- Self-complete surveys (hard-copy or
online)
- Interviews
- Telephone surveys
- Observation
Cont.
• A data collection instrument may be a
- questionnaire to be completed by the
respondent
- form onto which data will be transferred
- form to be filled in during a structured interview
Format of questionnaire
• Easy to read and visually appealing
• If long, separate sections with pictures, headings, and transitional sentences to assist and encourage respondents to complete it
• Begin with a few easy questions
• Follow a logical order
• Don’t make it too long or too
complex
Conduct of survey
• The following phases are usually
conducted:
- Pre-testing
- Pilot testing
- Field work
- Data preparation
Pre-testing
• Aim to make sure questions are
understandable and to identify any gross
errors
• Initial test of questionnaire with subjects
similar to target group
• Obtain expert opinion (e.g. for face and content validity)
• Make sure all investigators on project are
satisfied with questionnaire
Pilot testing
• Aim to:
- identify any problems with
questionnaire
- determine time taken to complete
questionnaire
- test reliability and validity
- trial administration methods & protocol
• Administer to subjects from similar
population to main study
Questionnaire administration
• Data collection must follow a
standardised procedure to avoid
measurement bias and ensure
comparability of data collected from
different subjects
• A protocol for data collection is developed
and piloted
• People who will collect data are trained in
protocol
Examples of things covered in protocol

• Consent process
• Instructions for participants for
completion of questionnaire
• Instructions for administration of
questionnaire/conduct of interview to
standardise conditions for all participants
• Instructions on the protocol and conditions for taking measurements, if physical measurements are to be collected
Data preparation
• Hard copy questionnaire responses
entered electronically
• For the sake of data quality, experienced data entry personnel are used where possible and/or data are entered twice and compared (see the sketch below)
• Data cleaned prior to analysis, i.e. checked for errors made when the data were entered
• Back-up copies of data made
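A hedged sketch of the double-entry check using pandas; the file names, the subject_id column, and the idea of two entry passes stored as CSV files are assumptions for illustration, not part of the original study materials.

```python
# Compare two independent entry passes of the same questionnaires to
# catch keying errors (hypothetical file and column names).
import pandas as pd

entry1 = pd.read_csv("entry_first_pass.csv", index_col="subject_id")
entry2 = pd.read_csv("entry_second_pass.csv", index_col="subject_id")

# compare() returns only the cells where the two frames disagree
# (both frames must share the same row and column labels), so each
# row flags a discrepancy to resolve against the hard copy.
discrepancies = entry1.compare(entry2)
print(discrepancies)
```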
Response rates
• A response rate above 80% is recommended to avoid non-response bias (a simple calculation is sketched below)
• Maximise response rates by:
- Making it easy for the respondents
- Offering small incentives where possible
- Explaining the importance of the
research and the value of the person’s
participation
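A minimal sketch of the response-rate calculation itself, with invented numbers:

```python
# Response rate = completed returns / eligible sample, as a percentage
# (the counts below are made up for illustration).
questionnaires_sent = 500
completed_returns = 412

response_rate = completed_returns / questionnaires_sent * 100
print(f"Response rate: {response_rate:.1f}%")  # 82.4%, above the 80% guideline
```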
Measurement
Validity and Reliability
What will we cover?
• Types of measurement error
• Introduction to measurement
reliability and validity
• Different types of validity and how
each are tested
• Different forms of reliability and
how they are measured
Examples of measurement
• Bathroom scales to measure weight
• Sphygmomanometer to measure blood pressure
• Bent-knee push up test to measure muscular
strength
• Beck Depression Inventory to measure levels of
depression
• Likert-scale questionnaire items to measure
satisfaction rates
• Short Form 36 to measure quality of life
When measuring..
• Measures need to be valid and reliable: that is, they measure what they are supposed to, and in a consistent way
• The usefulness of data depends on the extent to which the data are accurate and reflect the attributes being measured
• Thus, it is important to eliminate or reduce measurement error where possible
• An ‘instrument’ may be a physical instrument, a single question, or a scale (a set of items)
Types of error
• Error can be systematic or random
• Random errors
- Are unpredictable, affect all
measurements, are ‘non-differential’
- Net effect is that overall results are less precise but not systematically shifted in a particular direction
- Occur due to fatigue, inattention, or simple mistakes
• E.g. performance on a test may be affected by mood
Types of error
• Systematic errors
- Predictable and occur in one direction
- Results depart systematically from the true values
- Also known as (constant) bias
• E.g. performance on a test affected by noise outside the room
• E.g. a scale that is not calibrated (both error types are simulated in the sketch below)
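The simulation below illustrates the two error types, assuming a true weight of 70.0 kg, random noise with a 0.5 kg standard deviation, and a hypothetical uncalibrated scale that reads 1.5 kg high; all values are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=42)
true_value = 70.0  # kg
n = 1000

# Random error only: unpredictable noise in both directions
random_only = true_value + rng.normal(0.0, 0.5, n)
# Random error plus a constant systematic error (uncalibrated scale)
with_bias = true_value + 1.5 + rng.normal(0.0, 0.5, n)

print(f"Random error only:    mean = {random_only.mean():.2f} kg")  # ~70.0, accurate
print(f"With systematic bias: mean = {with_bias.mean():.2f} kg")    # ~71.5, shifted
```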
Validity and Reliability
• Validity is the extent to which an
instrument measures what it is intended
to measure
- The lack of bias or systematic error
• Reliability is the extent to which a
measurement is consistent and free from
error
- Also referred to as consistency, stability,
reproducibility, precision, dependability
Validity and Reliability
(Figure: the relationship between validity and reliability)
Validity and Reliability
• Measurements can be
- reliable and valid (we consistently get
the correct answer)
- reliable but not valid (we consistently get
the wrong answer!)
- neither reliable nor valid (we are in trouble!)
• Measurements that are valid but not reliable vary greatly, but their average will give a valid value for the attribute
Measurement Validity
• Types of measurement validity
- Face validity
- Content validity
- Criterion validity
- Construct validity
Measurement Validity
• The extent to which an instrument
measures what it is meant to measure
- the degree to which the results of a
measurement correspond to the true
state of the phenomenon being measured
• Validity: lack of bias or systematic error
• E.g. to what extent does a scale designed to measure QoL actually measure QoL?
Face Validity
• Extent to which an instrument ‘at face
value’ measures what it is intended to
measure
• Weakest form of validity
• All instruments should at least have this
• No statistic calculated – evaluated
through judgement and common sense
• Assessed by piloting the instrument with
target group and through expert opinion
Content Validity
• Applicable when measuring a
multifaceted construct
• Is the instrument comprehensive?
• No statistic calculated – evaluated
through judgement
• Assessed by piloting the instrument
with target group and through
expert opinion
Criterion Validity
• Applicable when there is a criterion (‘gold standard’) method of evaluating the construct
• A criterion usually exists when the phenomenon is directly observable
• But obtaining a criterion measurement may
be difficult, risky, too invasive or expensive to
use, so an alternative is needed
• E.g. results from an examination or blood test to diagnose a condition
Criterion Validity
• May be predictive or concurrent
• Predictive: the test has the ability to predict a later outcome
- E.g. TEE results as a predictor of a student’s success at university
• Concurrent: tests administered at the same time give highly similar results (as sketched below)
- E.g. results from a shortened version of an IQ test correlate highly with results from a (longer) established IQ test
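A minimal sketch of testing concurrent validity by correlating the two sets of scores; the scores below are invented for illustration.

```python
from scipy.stats import pearsonr

# Scores from the shortened test and the established test,
# administered to the same people at the same time (made-up data).
short_form = [98, 105, 112, 120, 95, 101, 130, 88]
full_form  = [100, 107, 110, 124, 93, 104, 128, 90]

r, p = pearsonr(short_form, full_form)
print(f"r = {r:.2f}, p = {p:.3f}")  # a high r supports concurrent validity
```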
Construct Validity
• The extent to which an instrument
measures a theoretical construct or trait
• Applicable when there is no criterion
against which to assess the instrument
• Phenomenon is not directly observable
• Demonstrated by showing that measures from this instrument correlate (or do not correlate) as expected with other constructs, according to theoretical/hypothesised relationships
Types of Reliability
• Test-retest reliability
• Intra-rater reliability
• Inter-rater reliability
• Internal consistency
Test-retest Reliability
• Stability over time
• A reliable instrument will obtain the same
results when administered repeatedly
• Usually applies to self-report instruments
and physiological measures where raters
are not involved
• Would the same group of respondents score similarly on the same measure at two different points in time?
Test-retest Reliability
• A questionnaire will have reliability if
- the questions are unambiguous and at an appropriate reading level
- respondents are given clear and consistent
directions for completing the questionnaire
- conditions for the respondents are optimal and consistent, e.g. free from distraction and competing influences
• Test in a group from the same population but not part of the main study
Rater reliability
• Intra-rater reliability: consistency of measurement by the same person over time
• Inter-rater reliability: consistency of measurements taken by different raters measuring the same subjects
• Raters need to be trained, especially when subjective judgement is involved or when the measurement procedure is unfamiliar to raters
Intra-rater reliability
• Measured by having the rater score the same construct on two or more occasions
• Assumes the instrument and construct remain stable over time
• Avoid carryover, memory and practice
effects
• Even if the rater is an expert, their measurements may be affected by fatigue or by their previous rating
Inter-rater reliability
• Need to establish this if different raters will be used in the research study
• We would like to know that the raters are interchangeable, i.e. the score obtained will be the same no matter who does the rating
• Intra-rater reliability needs to be
established for each rater first
• Best assessed when all raters measure under the same conditions
• Ratings must be done independently
How to obtain rater reliability?
• Measurements taken according to a standardised protocol using objective grading criteria
• Training of raters
• Clear communication of procedures for
making observations and the criteria for
making judgements
• Monitor performance over course of
study (changes in motivation, skill)
• Provide feedback on performance
Measures to test reliability
• Assess test-retest reliability and rater reliability
using
- an intra-class correlation (ICC) for continuous
variables and
- a kappa for categorical variables
• There are two forms of ICC: one measures agreement, the other consistency (both appear in the sketch below)
• Kappa measures the level of agreement between
2 sets of scores over and above the agreement
that would occur by chance
• Kappa for nominal and weighted kappa for
ordinal data
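A hedged sketch of both statistics, using pingouin’s intraclass_corr for the ICC and scikit-learn’s cohen_kappa_score for (weighted) kappa; all ratings are invented for illustration.

```python
import pandas as pd
import pingouin as pg
from sklearn.metrics import cohen_kappa_score

# ICC for continuous ratings: long format, one row per subject-rater pair
long = pd.DataFrame({
    "subject": [1, 1, 2, 2, 3, 3, 4, 4],
    "rater":   ["A", "B"] * 4,
    "score":   [7.0, 7.5, 5.0, 5.5, 9.0, 8.5, 6.0, 6.5],
})
icc = pg.intraclass_corr(data=long, targets="subject",
                         raters="rater", ratings="score")
print(icc[["Type", "ICC"]])  # includes both agreement and consistency forms

# Kappa for two raters' categorical judgements (chance-corrected agreement);
# weights="linear" gives a weighted kappa suitable for ordinal categories
rater1 = [0, 1, 1, 2, 0, 2, 1, 0]
rater2 = [0, 1, 2, 2, 0, 1, 1, 0]
print(cohen_kappa_score(rater1, rater2))
print(cohen_kappa_score(rater1, rater2, weights="linear"))
```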
Internal consistency
• Applicable to scales, i.e. a number of items measuring one construct that is hard to capture with a single item
• A measure of the homogeneity of the items
• The extent to which the items within the
scale are correlated and therefore are
measuring the same construct
• Usually measured using Cronbach’s alpha (values between 0.7 and 0.9 recommended; see the sketch below)
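A minimal sketch computing Cronbach’s alpha from its definition, alpha = k/(k-1) * (1 - sum of item variances / variance of the total score); the 6-respondent, 5-item score matrix is invented for illustration.

```python
import numpy as np

# Rows = respondents, columns = scale items (made-up Likert scores)
scores = np.array([
    [4, 4, 3, 4, 5],
    [2, 3, 2, 2, 3],
    [5, 5, 4, 5, 5],
    [3, 3, 3, 2, 3],
    [4, 5, 4, 4, 4],
    [1, 2, 1, 2, 2],
])

k = scores.shape[1]
item_vars = scores.var(axis=0, ddof=1)      # variance of each item
total_var = scores.sum(axis=1).var(ddof=1)  # variance of the summed scale
alpha = (k / (k - 1)) * (1 - item_vars.sum() / total_var)
print(f"Cronbach's alpha = {alpha:.2f}")  # 0.7-0.9 is the usual target range
```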
Context is important
• Reliability and validity may be specific to the context in which they are measured
• E.g. validity may need to be demonstrated again when an instrument is used in a different culture, when it is translated, or when it is used with a different age group or patient population
