biostatistics
BY
DALIA AHMED
ASSISTANT PROFESSOR OF PUBLIC HEALTH AND
COMMUNITY MEDICINE
dr dalia ahmed
Day 1: (theoretical)
Introduction
Variable types and data presentation
Statistical parameters
Day 2: (practical)
Rules for data entry
Data coding and entry
Data cleaning
Day 3: (practical)
Frequency and crosstabulation
Statistical parameters calculations
Graphs
Objectives
Acquiring knowledge about variable types, data presentation and statistical parameters.
Acquiring skills in data coding, entry and cleaning.
Acquiring skills in data presentation and parameter calculations.
Clinical epidemiology
Definition of Research
5I
www.bradfordvts.co.uk
Steps of research
1.
2.
3.
4.
5.
Ethical considerations
6.
7.
MEDICAL STATISTICS
Definitions:
Variables (vary + measurable)
A variable is a type of information/characteristic that varies among the studied cases (study units) and can be measured.
Quantitative variables
Qualitative variables: the measurement is expressed as a description.
Qualitative variables could be:
a- Nominal variables: the categories have no specific order.
Example: sex (males and females).
b- Ordinal variables: the categories can be arranged in a certain order.
Example: disease condition (mild, moderate, severe).
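A minimal sketch of how the two kinds of qualitative variables might be coded for data entry. The numeric codes and the names `SEX_CODES` and `SEVERITY_CODES` are assumptions chosen for illustration, not codes given in the course.

```python
# Hypothetical codes for data entry (illustrative, not from the course).
SEX_CODES = {"male": 1, "female": 2}                      # nominal: codes are labels only
SEVERITY_CODES = {"mild": 1, "moderate": 2, "severe": 3}  # ordinal: codes keep the order

cases = ["severe", "mild", "moderate", "mild"]

# Sorting by code arranges an ordinal variable from least to most severe.
ordered_cases = sorted(cases, key=SEVERITY_CODES.get)
print(ordered_cases)
```

Sorting by the sex codes would run just as well, but the resulting order would carry no meaning, which is the practical difference between nominal and ordinal variables.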
A. Background variables
B. The outcome variables
C. The confounders
Confounding
[Figure: gambling and cancer, with smoking, alcohol and other factors as the confounders carrying the true association]
Variables:
Solution: match groups according to possible confounders.
Variables:
Tables:
Graphs:
DM is the dependent variable.
No.  Sex     Age  Result  Degree
1    Male    35   Fail    50
2    Female  30   Pass    70
3    Male    40   Pass    90
4    Male    45   Pass    80
5    Female  33   Fail    40
6    Female  34   Fail    55
7    Female  32   Pass    60
8    Female  28   Fail    50
9    Male    25   Fail    45
10   Male    30   Pass    70
11   Male    40   Pass    72
12   Male    42   Pass    75
13   Male    45   Pass    80
14   Male    41   Fail    55
15   Female  55   Pass    66
16   Female  44   Pass    67
17   Male    36   Pass    90
Types of tables
1) Simple tables showing a single variable
a. Tables with data on a qualitative variable (nominal) (e.g. percent distribution of the studied sample by sex).
b. Tables with data on a quantitative variable (continuous) (e.g. percent distribution of the studied sample by age).
2) Contingency tables or cross tabulation of two variables.
In such tables two variables are presented, e.g. obesity and diabetes.
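The two table types can be sketched with Python's standard library. The records below are hypothetical and exist only to show the mechanics; `records`, `sex_counts` and `crosstab` are illustrative names, and a statistical package would normally produce these tables.

```python
from collections import Counter

# Hypothetical records (sex, exam result), for illustration only.
records = [
    ("male", "pass"), ("male", "fail"), ("female", "pass"),
    ("female", "fail"), ("male", "pass"), ("female", "pass"),
]

# Simple table: frequency and percent distribution of one variable (sex).
sex_counts = Counter(sex for sex, _ in records)
total = len(records)
for sex, n in sex_counts.items():
    print(f"{sex:<8}{n:>3}{100 * n / total:>8.1f}")

# Contingency table: joint counts of two variables (sex by result).
crosstab = Counter(records)
for sex in ("male", "female"):
    row = [crosstab[(sex, result)] for result in ("pass", "fail")]
    print(sex, row)
```

Each cell of the contingency table is the count of one combination of the two variables, so the row totals reproduce the simple table for sex.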
Sex      Number  Percent
Males    11      61.1
Females  7       38.9
Total    18      100.0
Quantitative data
[Table: frequency by age group; interval classes (open end) 25-, 30-, 35-, 40-, 45-, 50-, 55-; total 18]
[Table: sex by exam result; total 18]
Graphs
[Figure: bar chart and pie chart of sex (male, female); X axis: sex; Y axis: number (0 to 12)]
[Figure: percent of mild, moderate and severe cases among males and females]
Maps
to present information
[Figure: bar chart, percent scale 0% to 45%; bars: no felt side effects, no interference with daily work, good method of administration]
Graphs for quantitative data
Histogram
Frequency polygon
Frequency curve
Box plot
Error bar
Scatter plot
For discrete variables (e.g. the number of children per family), the number representing the values should be centered below each bar to emphasize the discrete nature of the variable.
Quantitative data
[Figure: histogram, frequency polygon, frequency curve and box plot; X axis: age (years), class midpoints 22.5 to 52.5; Y axis: frequency]
Fig. 10: Age distribution of medical statistics course students (Kasr El-Eini 1998).
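A histogram is built by counting observations within class intervals. The sketch below assumes 5-year classes, which is consistent with axis midpoints such as 22.5 and 27.5; the ages themselves are hypothetical and `class_start` is an illustrative helper.

```python
from collections import Counter

# Hypothetical ages in years (illustration only).
ages = [21, 24, 26, 28, 29, 31, 33, 34, 36, 41, 47, 52]

CLASS_WIDTH = 5  # assumed width, giving midpoints such as 22.5 and 27.5

def class_start(age):
    """Lower limit of the 5-year class that contains this age."""
    return (age // CLASS_WIDTH) * CLASS_WIDTH

frequencies = Counter(class_start(a) for a in ages)

# One text bar per class, labelled with the class midpoint.
for start in sorted(frequencies):
    midpoint = start + CLASS_WIDTH / 2
    print(f"{midpoint:>5.1f} | {'#' * frequencies[start]}")
```

Joining the tops of the bars at the class midpoints gives the frequency polygon, and smoothing that line gives the frequency curve.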
Error bar
Scatter diagrams
[Figure: scatter plots; X axis: independent variable; Y axis: dependent variable]
+1: X increases, Y increases
X changes, Y does not follow
-1: X increases, Y decreases
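The values +1 and -1 on the scatter diagrams are read here as the limits of a correlation coefficient. As a sketch under that assumption, the Pearson coefficient can be computed for three hypothetical patterns; `pearson_r` and the data are illustrative and not taken from the slides.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

x = [1, 2, 3, 4, 5]
r_up = pearson_r(x, [2, 4, 6, 8, 10])    # Y increases with X
r_down = pearson_r(x, [10, 8, 6, 4, 2])  # Y decreases as X increases
r_none = pearson_r(x, [3, 1, 4, 1, 3])   # Y does not follow X
print(r_up, r_down, r_none)
```

The three results sit at the two extremes and at zero, matching the three scatter patterns described above.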
Line graphs
A line graph is particularly useful for numerical data
if you wish to show a trend over time.
Parameters
They give a good summary of the data.
They are used for statistical comparison and testing.
Ratio
Quantitative
Central Tendency
Dispersion
$\text{Ratio} = \dfrac{\text{Part}}{\text{Part}} = \dfrac{a}{b}$
$\text{Proportion} = \dfrac{\text{Part}}{\text{Total}}$ (multiplied by 100 when expressed as a percent)
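As a worked illustration, the ratio and proportion can be computed from the sex distribution used earlier in the course (11 males and 7 females out of 18). The variable names are illustrative.

```python
males, females = 11, 7          # counts from the sex distribution table
total = males + females         # 18

ratio = males / females         # part / part
proportion = males / total      # part / total
percent = 100 * proportion

print(f"ratio {ratio:.2f}, proportion {proportion:.3f}, percent {percent:.1f}")
```

The numerator of a proportion is included in its denominator, whereas the two parts of a ratio are separate groups.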
a. The midrange:
$\text{Midrange} = \dfrac{\text{smallest observation} + \text{largest observation}}{2}$
[Table: number of children in each of 10 houses]
$\text{Midrange} = (1 + 6) / 2 = 3.5$ children
b. The mode:
c. The median:
(The value in the middle of an arranged group of values)
The value that has 50% of the observations equal to or more than it and the other 50% of observations equal to or less than it.
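The three measures above, together with the arithmetic mean, can be sketched with Python's `statistics` module. The ten values are an assumption: they agree with the midrange example (smallest 1, largest 6) but are not the data table from the slides.

```python
from statistics import mean, median, multimode

# Assumed number of children in 10 houses; chosen to agree with the
# midrange example (smallest 1, largest 6), not taken from the slides.
children = [1, 1, 2, 2, 3, 3, 3, 4, 5, 6]

midrange = (min(children) + max(children)) / 2
modes = multimode(children)       # most frequent value(s)
middle = median(children)         # mean of the two middle values when n is even
average = mean(children)

print(midrange, modes, middle, average)
```

`multimode` returns a list because a data set may have more than one most frequent value.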
The median:
[Table: the 10 houses arranged by number of children]
After arranging the houses, the middle house is the one with 5 children.
Parameter        Advantages    Disadvantages
Mid range
Mode
Median
Arithmetic mean
a. Range:
Mean = 3
Family  Children  Difference
1       1         -2
2       1         -2
3       2         -1
4       2         -1
5       3         0
6       3         0
7       3         0
8       4         1
9       5         2
10      6         3
$\sum |x - \bar{X}| = 12$
c- Variance
In the previous calculation the negative signs were ignored. Mathematicians do not accept this; instead they suggested squaring each difference to remove the negative sign, then dividing the sum by the number of observations minus one.
$\sum (x - \bar{X})^2 = 4 + 4 + 1 + 1 + 0 + 0 + 0 + 1 + 4 + 9 = 24$
$\text{Variance} = \dfrac{\sum (x - \bar{X})^2}{n - 1} = \dfrac{24}{9} = 2.67$
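The dispersion calculations can be traced step by step in a short sketch. The ten values are reconstructed from the deviations and squared deviations listed in the example (mean 3), so treat them as an assumption. Dividing the sum of absolute deviations by n to obtain the average deviation is also an assumption, since the slides do not show that step.

```python
from math import sqrt
from statistics import variance

# Children per family, reconstructed from the deviations in the example
# above (mean = 3); treat the exact values as an assumption.
children = [1, 1, 2, 2, 3, 3, 3, 4, 5, 6]
n = len(children)
mean = sum(children) / n

deviations = [x - mean for x in children]
sum_abs = sum(abs(d) for d in deviations)   # sum of deviations, signs ignored
adam = sum_abs / n                          # average deviation about the mean (assumed divisor n)
sum_sq = sum(d ** 2 for d in deviations)    # sum of squared deviations
var = sum_sq / (n - 1)                      # divide by n - 1
sd = sqrt(var)

# Cross-check against the standard library's sample variance.
assert abs(var - variance(children)) < 1e-9
print(sum_abs, sum_sq, round(var, 2), round(sd, 2))
```

Squaring removes the negative signs without discarding them, and taking the square root at the end returns the result to the original unit (children).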
x - mean    Square of deviations
-1          1
1           1
1           1
-3          9
3           9
-2          4
2           4
3           9
-4          16
0           0
Total 0     54
It is a measure of spread.
Notice that the larger the deviations (positive or negative), the larger the variance.
n = 10
Variance = 54 / 9 = 6
$SD = \sqrt{\text{Variance}} = \sqrt{6} \approx \pm 2.45$
Central Tendency (Location)
Mid range
Mode
Median
Arithmetic Mean
Dispersion
Range (Minimum, Maximum)
Percentiles, quartiles
ADAM
Variance
Standard deviation
Coefficient of variation
Standard error of the mean
95% confidence interval of the mean
No.         Mean    SD      SE     95% Confidence Interval for Mean
5           115.8   10.87   4.86   102.3 to 129.3
5           112.0   7.58    3.39   102.6 to 121.4
5           121.0   15.97   7.14   101.2 to 140.8
5           118.0   17.89   8.00   95.8 to 140.2
5           129.0   7.42    3.32   119.8 to 138.2
Total 25    119.2   12.98   2.59   113.8 to 124.5
$\text{Standard error of first sample mean} = \sqrt{\dfrac{\text{Variance}}{n}} = \sqrt{\dfrac{118.2}{5}} = 4.86$
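The first row of the table can be reproduced as a check. This sketch assumes the interval was built with Student's t distribution, which is consistent with the tabulated limits; the critical value 2.776 (two-sided 95%, 4 degrees of freedom) comes from standard t tables and is not stated in the slides.

```python
from math import sqrt

# First sample from the table: mean 115.8, SD 10.87, n = 5.
mean, sd, n = 115.8, 10.87, 5

se = sd / sqrt(n)                      # standard error of the mean

# Two-sided 95% critical value of Student's t with n - 1 = 4 degrees of
# freedom, taken from standard t tables (an assumption about the method).
T_CRIT_DF4 = 2.776

lower = mean - T_CRIT_DF4 * se
upper = mean + T_CRIT_DF4 * se
print(f"SE = {se:.2f}, 95% CI = {lower:.1f} to {upper:.1f}")
```

A larger sample has a smaller standard error, which is why the total row (n = 25) has a narrower interval than any single sample of 5.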
[Figure: frequency curve; X axis: mm/Hg (105 to 135); Y axis: frequency; 68% of values between -SD and +SD; confidence interval 95%]
Dispersion