
Measures of Dispersion

Chapter 5
Measures of Dispersion
Synonym for variability
Often called spread or scatter
Indicator of consistency among a data set
Indicates how closely data are clustered about
a measure of central tendency
Why Study Dispersion?
A small value for a measure of dispersion indicates that the data
are clustered closely (the mean is therefore representative of the data)
A large measure of dispersion indicates that the mean is not
reliable (it is not representative of the data)
Daily Computer Production (closely clustered): 47 48 49 50 51 52 53
Daily Computer Production (widely spread): 44 45 46 47 48 49 50 51 52 53 54 55 56
Purpose of Measuring Variation
To test the reliability of an average
To serve as a basis for control of variability
To compare two or more series with regard
to variability
To facilitate further statistical
analysis.
Properties of a good measure of variation
It should be simple to understand and easy
to calculate.
It should be rigidly defined.
It should be based on all observations.
It should be amenable to further algebraic
treatment.
It must have sampling stability.
It should not be affected by extreme
observations.
Absolute and relative measurement
Commonly used measures of dispersion
Range
Interquartile Range
Mean Deviation
Standard Deviation
The Range
The simplest measure of dispersion is the range.
Indicates how spread out the data are
For ungrouped data, the range is the difference
between the highest and lowest values in a set of
data.
RANGE = Highest Value - Lowest Value
Dependent on two extreme values
Coefficient of Range
Relative (unit-free) measure based on the range
Coefficient of range = (Max - Min) / (Max + Min)
The Range Example
RANGE = Highest Value - Lowest Value
EXAMPLE: A sample of five accounting
graduates revealed the following starting salaries:
$22,000, $28,000, $31,000, $23,000, $24,000.
The range is $31,000 - $22,000 = $9,000.
Dispersion Example
Number of minutes 20 clients waited to see a consultant

Consultant X    Consultant Y
05   15         11   12
12   03         10   13
04   19         11   10
37   11         09   13
06   34         09   11
Consultant X:
Sees some clients almost
immediately
Others wait over 1/2 hour
Highly inconsistent
Consultant Y:
Clients wait about 10 minutes
9 minutes least wait and 13
minutes most
Highly consistent
Solution
1. Coefficient of range for Consultant X
= (Max - Min) / (Max + Min)
= (37 - 03) / (37 + 03) = 34/40 = 0.85
2. Coefficient of range for Consultant Y
= (Max - Min) / (Max + Min)
= (13 - 09) / (13 + 09) = 4/22 = 0.18
Consultant X is inconsistent and Consultant Y is
consistent in their job.
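The two coefficients can be checked with a short Python sketch. The waiting times are copied from the table in the Dispersion Example; the function names are illustrative only.

```python
# Waiting times (minutes) copied from the Dispersion Example table.
consultant_x = [5, 15, 12, 3, 4, 19, 37, 11, 6, 34]
consultant_y = [11, 12, 10, 13, 11, 10, 9, 13, 9, 11]


def value_range(data):
    """Range = highest value - lowest value."""
    return max(data) - min(data)


def coefficient_of_range(data):
    """Coefficient of range = (Max - Min) / (Max + Min)."""
    return (max(data) - min(data)) / (max(data) + min(data))


for name, waits in (("X", consultant_x), ("Y", consultant_y)):
    print(name, value_range(waits), round(coefficient_of_range(waits), 2))
```

A coefficient close to 1 (Consultant X) signals a wide spread relative to the size of the values; a coefficient close to 0 (Consultant Y) signals consistency.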
The Interquartile Range
Modified version of the range
Positional measure of dispersion
Range of the middle 50% of observations, scores or ranks
Advantages over the range:
Not sensitive to extreme values in a data set
Not sensitive to the sample size
Calculation:
Put the values in order from low to high
Divide the set of values into quarters (1/4s)
For the values in the middle 50% -- subtract the lower
value from the higher value, i.e. Q3 - Q1
Interquartile Range
Interquartile range = Q3 - Q1
Semi-interquartile range or quartile deviation is
defined as
= (Q3 - Q1)/2
Coefficient of quartile deviation is
= (Q3 - Q1)/(Q3 + Q1)
Interquartile Range Example
The number of complaints received by the manager of a
supermarket was recorded for each of the last 10
working days.
21, 15, 18, 5, 10, 17, 21, 19, 25 & 28
Sorted data
5, 10, 15, 17, 18, 19, 21, 21, 25 & 28
Q1 = (n + 1)/4 = 11/4 = 2.75, taken as the 3rd observation, so Q1 = 15
Q3 = 3(n + 1)/4 = 33/4 = 8.25, taken as the 8th observation, so Q3 = 21
Interquartile range = 21 - 15 = 6 complaints
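A minimal Python sketch of the positional method used in this example: take the value at position k(n + 1)/4, rounded to the nearest rank. The rounding step is read off the worked example; other textbooks and libraries interpolate between neighbouring values instead and can give slightly different quartiles. The function name is illustrative.

```python
# Complaints per day from the Interquartile Range Example.
complaints = [21, 15, 18, 5, 10, 17, 21, 19, 25, 28]


def quartile_by_position(data, k):
    """Value at position k(n + 1)/4, rounded to the nearest rank (1-based)."""
    ordered = sorted(data)
    position = k * (len(ordered) + 1) / 4
    rank = int(position + 0.5)  # round half up to a whole rank
    return ordered[rank - 1]


q1 = quartile_by_position(complaints, 1)
q3 = quartile_by_position(complaints, 3)
print(q1, q3, q3 - q1)
```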
For continuous data, one can use a graph.
Using an Ogive:
[Figure: Ogive plotting cumulative frequency (CF) against X, both axes marked from 20 to 100. The first quartile Q1 is read off at N/4, N/2 = 50 is marked, and the third quartile Q3 is read off at 3N/4.]
Calculating exactly: Q1
Using the formula:
X           f     CF
0 < 20      15    15
20 < 40     60    75
40 < 100    25    100
N/4 = 25th item
This is in the group 20 < 40
Lower limit (l) is 20
Width of group (i) is 20
Frequency of group (f) is 60
CF of previous group (F) is 15
Formula is:
$Q_1 = l + i \left( \frac{N/4 - F}{f} \right)$
First Quartile
$Q_1 = 20 + 20 \left( \frac{25 - 15}{60} \right) = 20 + 20 \times \frac{10}{60} = 20 + 3.333 = 23.333$
This means that 25% of the data is below 23.333 (p.133- q.9)
Calculating exactly: Q3
Third Quartile
X           f     CF
0 < 20      15    15
20 < 40     60    75
40 < 100    25    100
3N/4 = 75th item
This is in the group 20 < 40
Lower limit (l) is 20
Width of group (i) is 20
Frequency of group (f) is 60
CF of previous group (F) is 15
Formula is:
$Q_3 = l + i \left( \frac{3N/4 - F}{f} \right)$
$Q_3 = 20 + 20 \left( \frac{75 - 15}{60} \right) = 20 + 20 \times \frac{60}{60} = 20 + 20 = 40$
So 25% of the data is above this point (i.e. 40).
Interquartile Range and Coefficient of Q. D.
Interquartile range = 40 - 23.333 = 16.67
Semi-interquartile range or quartile deviation is
defined as
= (Q3 - Q1)/2 = 16.67/2 = 8.33
Coefficient of quartile deviation is
= (Q3 - Q1)/(Q3 + Q1) = 16.67/63.33 = 0.26
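The interpolation formula for grouped data can be sketched in Python as follows. The frequency table is the one from the slides; the function name and the tuple layout are assumptions made for illustration.

```python
# Frequency table from the slides: (lower limit, upper limit, frequency).
classes = [(0, 20, 15), (20, 40, 60), (40, 100, 25)]


def grouped_quartile(classes, k):
    """Interpolated k-th quartile: l + i * (k*N/4 - F) / f."""
    total = sum(freq for _, _, freq in classes)
    target = k * total / 4  # the (kN/4)-th item
    cumulative = 0  # F: cumulative frequency of the previous groups
    for lower, upper, freq in classes:
        if cumulative + freq >= target:  # the target item falls in this group
            width = upper - lower
            return lower + width * (target - cumulative) / freq
        cumulative += freq
    raise ValueError("k must be 1, 2 or 3")


q1 = grouped_quartile(classes, 1)
q3 = grouped_quartile(classes, 3)
iqr = q3 - q1
quartile_deviation = iqr / 2
coefficient_qd = (q3 - q1) / (q3 + q1)
print(round(q1, 3), q3, round(iqr, 2), round(quartile_deviation, 2), round(coefficient_qd, 2))
```

Both quartiles land in the group 20 < 40 here because the cumulative frequency reaches exactly 75 at the end of that group.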
Mean Deviation
The mean deviation takes into consideration all of
the values.
Mean Deviation: The arithmetic mean of the
absolute values of the deviations from the mean,
median or mode.
$MD = \frac{\sum |x - \bar{x}|}{n}$
Where: x = the value of each observation; $\bar{x}$ = the arithmetic mean of the values;
n = the number of observations; | | = the absolute value (the signs of the deviations are disregarded)
Mean Deviation Example
The weights of a sample of crates containing
books for the bookstore are (in kgs.) 103, 97, 101,
106, 103.
$\bar{x}$ = 510/5 = 102 kgs.
$\sum |x - \bar{x}|$ = 1+5+1+4+1 = 12
MD = 12/5 = 2.4 kgs
On average, the weights of the crates deviate by
2.4 kgs. from the mean weight of 102 kgs.
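The crate example can be reproduced with a few lines of Python. This sketch measures deviations from the arithmetic mean, as the example does; the function name is illustrative.

```python
from statistics import mean

# Crate weights (kgs.) from the Mean Deviation Example.
weights = [103, 97, 101, 106, 103]


def mean_deviation(data):
    """Arithmetic mean of the absolute deviations from the mean."""
    centre = mean(data)
    return sum(abs(x - centre) for x in data) / len(data)


print(mean(weights), mean_deviation(weights))
```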
Frequency Distribution Mean Deviation
If the data are in the form of a frequency
distribution, the mean deviation can be calculated
using the following formula:
$MD = \frac{\sum f|x - \bar{x}|}{\sum f}$
Where: f = the frequency of an observation x
n = $\sum f$ = the sum of the frequencies
Frequency Distribution MD Example
Exercise 13.3(f) p. 336
Number of outstanding accounts (x)   Frequency (f)   fx   $|x - \bar{x}|$   $f|x - \bar{x}|$
0                                    1               0    2                 2
1                                    9               9    1                 9
2                                    7               14   0                 0
3                                    3               9    1                 3
4                                    4               16   2                 8
Total: N = 24, $\sum fx$ = 48, $\sum f|x - \bar{x}|$ = 22
mean: $\bar{x} = \frac{\sum fx}{\sum f}$ = 48/24 = 2
$MD = \frac{\sum f|x - \bar{x}|}{\sum f}$ = 22/24 = 0.92
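A minimal sketch of the same calculation in Python, with the frequency table stored as a dictionary (value mapped to frequency). The names are illustrative, not part of the exercise.

```python
# Number of outstanding accounts -> frequency, from the example table.
accounts = {0: 1, 1: 9, 2: 7, 3: 3, 4: 4}


def frequency_mean(freq):
    """Mean of a frequency distribution: sum(f*x) / sum(f)."""
    return sum(x * f for x, f in freq.items()) / sum(freq.values())


def frequency_mean_deviation(freq):
    """Mean deviation of a frequency distribution: sum(f*|x - mean|) / sum(f)."""
    centre = frequency_mean(freq)
    return sum(f * abs(x - centre) for x, f in freq.items()) / sum(freq.values())


print(frequency_mean(accounts), round(frequency_mean_deviation(accounts), 2))
```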
Example 3:
Calculate the mean deviation from the median and from the mode of the
given data:
Class Interval Frequency
2-4 3
4-6 4
6-8 2
8-10 1
Standard Deviation
Standard deviation is the most commonly used
measure of dispersion
Similar to the mean deviation, the standard
deviation takes into account the value of every
observation
The values of the mean deviation and the standard
deviation should be relatively similar
Standard Deviation
The population standard deviation uses the squares
of the residuals
Steps:
Find the sum of the squares of the residuals
Divide by the number of observations to find their mean
Then take the square root of that mean
Formula:
$\sigma = \sqrt{\frac{\sum (x - \bar{x})^2}{n}}$
The Standard Deviation (S or SD)


Most frequently used measure of dispersion
It is the root-mean-square distance of the observed values
from the mean value for a set of data
Basic rule -- more spread will yield a larger SD
Calculation:
Calculate the arithmetic mean (AM)
Subtract the AM from each individual value
Square each deviation -- multiply it times itself
Sum (total) the squared deviations
Divide the total by the number of values (N)
Calculate the square root of the result
SD = $\sqrt{\frac{\text{Sum of squares of individual deviations from arithmetic mean}}{\text{Number of items}}}$
Example:
Scores   Deviations From Mean   Squares of Deviations
01       -13                    169
03       -11                    121
05       -09                    81
06       -08                    64
11       -03                    9
12       -02                    4
15       +01                    1
19       +05                    25
34       +20                    400
37       +23                    529
Total: 143                      1403
M = 143/10 = 14.3, rounded to 14
No. of scores = 10
SD = $\sqrt{1403/10}$ = 11.8
Example:
Scores   Deviations From Mean   Squares of Deviations
09       -02                    4
09       -02                    4
10       -01                    1
10       -01                    1
11       00                     0
11       00                     0
11       00                     0
12       +01                    1
13       +02                    4
13       +02                    4
Total: 109                      19
M = 109/10 = 10.9, rounded to 11
No. of scores = 10
SD = $\sqrt{19/10}$ = 1.4
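The calculation steps can be sketched in Python and run on the two sets of scores. Note one difference from the worked tables: this sketch uses the unrounded mean (14.3 and 10.9) rather than the rounded 14 and 11, which does not change the result to one decimal place. The function name is illustrative.

```python
from math import sqrt

# Scores from the two worked examples.
scores_wide = [1, 3, 5, 6, 11, 12, 15, 19, 34, 37]
scores_tight = [9, 9, 10, 10, 11, 11, 11, 12, 13, 13]


def population_sd(data):
    """Square root of the mean squared deviation from the arithmetic mean."""
    n = len(data)
    am = sum(data) / n                       # arithmetic mean
    squared = [(x - am) ** 2 for x in data]  # squared deviations
    return sqrt(sum(squared) / n)            # divide by N, then take the root


print(round(population_sd(scores_wide), 1), round(population_sd(scores_tight), 1))
```

The standard library function `statistics.pstdev` computes the same population standard deviation.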
The Coefficient of Variation
The coefficient of variation is a measure of
relative variability
It is used to measure the changes that have taken place
in a population over time
It is used to compare the variability of two populations that are
expressed in different units of measurement
It is expressed as a percentage
Formula:
$CV = \frac{SD}{\bar{x}} \times 100$
Where:
$\bar{x}$ = the mean of the sample; SD = the standard deviation of the sample
Exercise 2
A Quality Control Laboratory received samples of electric bulbs
for testing their lives, from two suppliers. The results
were as follows:
Length of Life (in hrs.)   Company A   Company B
1500 - 2000                16          18
2000 - 2500                26          22
2500 - 3000                8           8
Total                      50          48
i) Which company's bulbs have the greater length of life?
ii) Which company's bulbs are more uniform in length of life?
Solution
Length of Life (in hrs.)   Middle Point (xi)   Company A (fi)   Company B (fi)
1500 - 2000                1750                16               18
2000 - 2500                2250                26               22
2500 - 3000                2750                8                8
Total                                          50               48
Company A
Basic Statistics
Sum( fx) 108,500.00
Count (Sum f) 50.00
Arithmetic Mean 2,170.00
Standard Deviation 337.05
Variance 113,600.00
Coefficient of Variation 16%
Company B
Basic Statistics
Sum(fx) 103,000.00
Count (Sum f) 48.00
Arithmetic Mean 2,145.83
Standard Deviation 352.94
Variance 124,565.97
Coefficient of Variation 16%
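Answer
(i) Bulbs of Company A (mean 2,170 hrs.) have a longer
life than those of Company B (mean 2,145.83 hrs.)
(ii) The two companies are nearly the same for uniformity (CV
about 16% for both; before rounding, A is 15.5% and B is 16.4%, so A is marginally more uniform)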
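The summary statistics for both companies can be recomputed with a short Python sketch. It uses class midpoints and the population (divide by N) variance, which matches the figures quoted in the solution; the function name is illustrative.

```python
from math import sqrt

# Class midpoints and frequencies from the solution table.
midpoints = [1750, 2250, 2750]
company_a = [16, 26, 8]
company_b = [18, 22, 8]


def grouped_stats(xs, fs):
    """Return (mean, standard deviation, coefficient of variation in %)."""
    n = sum(fs)
    mean = sum(f * x for x, f in zip(xs, fs)) / n
    variance = sum(f * (x - mean) ** 2 for x, f in zip(xs, fs)) / n
    sd = sqrt(variance)
    return mean, sd, 100 * sd / mean


mean_a, sd_a, cv_a = grouped_stats(midpoints, company_a)
mean_b, sd_b, cv_b = grouped_stats(midpoints, company_b)
print(round(mean_a, 2), round(sd_a, 2), round(cv_a, 1))
print(round(mean_b, 2), round(sd_b, 2), round(cv_b, 1))
```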
Combined Variance (For different means)
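$\sigma^2 = \frac{n_1(\sigma_1^2 + d_1^2) + n_2(\sigma_2^2 + d_2^2)}{n_1 + n_2}$
where $d_1 = \bar{x}_1 - \bar{x}$ and $d_2 = \bar{x}_2 - \bar{x}$ are the deviations of each group mean from the combined mean $\bar{x}$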
Exercise 3
The mean and s.d. of the lives of tyres manufactured by
the two factories of Durable Tyre Company, each making 50,000
tyres annually, are given
below. Calculate the combined mean and standard deviation of
the life of all the 100,000 tyres produced in a year.
Factory   Sample Size   Mean (000 Kms)   SD (000 Kms)
1         50            60               8
2         50            50               7
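One way to work this exercise is to apply the combined-variance formula for different means directly. The sketch below is an illustration, not the slide author's solution; it takes each group as a tuple (n, mean, sd) and treats the combined mean as the size-weighted average of the group means.

```python
from math import sqrt


def combine(groups):
    """Combined mean and SD for groups given as (n, mean, sd) tuples."""
    total = sum(n for n, _, _ in groups)
    combined_mean = sum(n * m for n, m, _ in groups) / total
    # n_i * (sigma_i^2 + d_i^2), with d_i = group mean - combined mean
    variance = sum(
        n * (sd ** 2 + (m - combined_mean) ** 2) for n, m, sd in groups
    ) / total
    return combined_mean, sqrt(variance)


# Values from the Exercise 3 table (means and SDs in thousands of km).
factories = [(50, 60, 8), (50, 50, 7)]
combined_mean, combined_sd = combine(factories)
print(combined_mean, round(combined_sd, 2))
```

Because the two factories have equal weight, using 50 or 50,000 as the group size gives the same combined mean and standard deviation.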
Combined Variance (For same means)
$\sigma^2 = \frac{n_1 \sigma_1^2 + n_2 \sigma_2^2}{n_1 + n_2}$
Choosing Measures of Dispersion
Range: use the range sparingly as the
measure of dispersion
Interquartile Range: use when the median is the measure of central
tendency
Standard Deviation: use when the mean is the measure of central
tendency
Relationship between measures of
dispersion
Mean deviation ≈ (4/5) standard deviation
Quartile deviation ≈ (2/3) standard deviation
(approximate relations that hold for a normal distribution)
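[Figure: Normal distribution curve, 1 standard deviation. About 68% of values lie within ±1 SD of the mean: 2.2 to 25.8 around a mean of 14, and 9.6 to 12.4 around a mean of 11.]
[Figure: Normal distribution curve, 2 standard deviations. About 95% of values lie within ±2 SD of the mean: axis marked 01 to 37 around a mean of 14, and 8.2 to 13.8 around a mean of 11.]
[Figure: Normal distribution curve, 3 standard deviations. About 99.7% of values lie within ±3 SD of the mean: axis marked 01 to 37 around a mean of 14, and 6.8 to 15.2 around a mean of 11.]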
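[Figure: Bell-shaped curve showing the relationship between μ and σ. The horizontal axis is marked μ-3σ, μ-2σ, μ-1σ, μ, μ+1σ, μ+2σ, μ+3σ; about 68% of values lie within μ±1σ, 95% within μ±2σ and 99.7% within μ±3σ.]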
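The interval limits and percentages shown on the curves can be reproduced with a short Python sketch. It assumes the data are normally distributed and uses the rounded means and standard deviations from the two worked examples (14 and 11.8, 11 and 1.4); the function names are illustrative.

```python
from statistics import NormalDist


def sd_intervals(mean, sd, ks=(1, 2, 3)):
    """Limits mean - k*SD and mean + k*SD for each k, rounded to 1 decimal."""
    return {k: (round(mean - k * sd, 1), round(mean + k * sd, 1)) for k in ks}


def coverage(k):
    """Share of a normal distribution lying within k SDs of the mean."""
    standard = NormalDist()  # mean 0, standard deviation 1
    return standard.cdf(k) - standard.cdf(-k)


wide = sd_intervals(14, 11.8)
tight = sd_intervals(11, 1.4)
print(wide[1], tight[1], tight[2], tight[3])
print([round(100 * coverage(k), 1) for k in (1, 2, 3)])
```

For the widely spread scores the limits beyond 1 SD fall outside the observed data (the lower limit becomes negative), which is why those figures mark the lowest and highest scores instead.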
