
Table of Contents

1. Meaning and purpose
   Objectives of measuring dispersion
   Properties of an ideal measure of dispersion
2. Definition
3. Absolute and relative measures of dispersion
4. Range
   4.1. Properties of range
   4.2. Uses of range
   4.3. Limitations
5. Quartile deviation
   5.1. Merits of quartile deviation
   5.2. Limitations
6. Mean deviation
   6.2. Properties of mean deviation
   6.3. Limitations
   6.4. Uses
   6.5. Mean squared deviation
7. Variance
   7.1. Properties of variance
   7.2. Combined variance
8. Standard deviation
   8.1. Characteristics of standard deviation
   8.2. Difference between mean deviation and standard deviation
   8.3. Mathematical properties of standard deviation
9. Coefficient of variation
   9.1. Definition
   9.2. Properties
10. Relation between measures of dispersion
11. Conclusion

1. Meaning and purpose


Measures of central tendency give a single value that represents a set of values in a distribution. An average is a central value that typifies the entire distribution, but it cannot convey how scattered the individual items are around that central value. Other measures are therefore needed to capture the variability of the items. Variability means the scattering of items around the central value, and a measure of variability quantifies that scatter. Such measures are necessary because averages alone do not give a clear picture of the data. The mean does not tell us whether the observations are close together or far apart. The median is a positional average and says nothing about the variability of the observations in the series. The mode is the most frequently occurring value and is independent of the other values in the set. Two distributions with the same average may differ in how their items scatter around the central value. Thus, to have a clear picture of the data, one needs a measure of dispersion, or variability, among the observations in the set. Measures of variability help determine the reliability of an average, help compare two or more series with regard to consistency, and are a useful tool for further statistical analysis.

Objectives of Measuring Dispersion


The major objectives of measuring dispersion are as follows: to determine the reliability of a measure of central tendency; to compare the consistency of two or more series; to determine the causes of variability and control them; to support the use of other statistical tools; to control quality; and to assist in the analysis of time series.

Properties of an ideal measure of dispersion


The following are the necessary properties of an ideal measure of dispersion: It should be rigidly defined. It should be easy to calculate and easy to understand. It should be based on all the observations. It should be amenable to further mathematical treatment. It should be least affected by fluctuations of sampling. It should not be affected much by extreme observations.

While measures of central tendency are used to estimate "normal" values of a dataset, measures of dispersion describe the spread of the data, or its variation around a central value. Two distinct samples may have the same mean or median but completely different levels of variability, or vice versa. A proper description of a set of data should include both of these characteristics. There are various methods for measuring the dispersion of a dataset, each with its own advantages and disadvantages. Together, averages and measures of dispersion help describe a frequency distribution.

2. Definition
A central value does not provide information about the scattering of values in a set of data. The statistical measures that express this scattering of items in numerical terms are known as measures of variability or dispersion. Before discussing the various measures of dispersion, it is worth pointing out that there are no hard and fast rules for choosing a particular measure. A measure of dispersion is good if it possesses the same properties that are required of a good measure of central tendency.

3. Absolute and relative measures of dispersion


Absolute measures of dispersion have the same units as the given series, so they cannot be used to compare two series measured in different units. The ratio of an absolute measure of dispersion to a value in the same units (such as an average) is independent of units and is called a relative measure of dispersion. Relative measures of dispersion allow comparison even between series with different units. The commonly used measures of dispersion are range, quartile deviation, mean deviation, variance, standard deviation, and coefficient of variation. These measures are described in detail below.

4. Range
Range is the simplest measure of dispersion. It is defined as the difference between the largest and the smallest value in a data set. In a grouped frequency distribution, range is computed either by subtracting the lower limit of the smallest class from the upper limit of the largest class, or by subtracting the mid-value of the smallest class from the mid-value of the largest class. The formula for range is

$$\text{Range} = L - S$$

where L = largest item and S = smallest item. On the one hand, range is a simple measure of dispersion that is easy to understand and compute. On the other hand, it may be the most misleading of all measures of dispersion, because it ignores every value other than the two extremes of the distribution. Range is an absolute measure of dispersion. Its relative measure is called the coefficient of range, given by the formula below.

$$\text{Coefficient of range} = \frac{L - S}{L + S}$$

where L = largest item and S = smallest item.

The smaller the coefficient of range, the less variable (more consistent) the series.

4.1. Properties of range


It is the easiest measure of dispersion to compute; almost no calculation is involved. It is based only on the extreme values. It is susceptible to sampling fluctuation. It is not suitable for further mathematical treatment. It cannot be used in the case of open-end distributions.

4.2. Uses of range


Range is very frequently used in quality control, where it sets limits on the variation in the quality of manufactured items. All the items that fall within a predetermined range of quality (shape, size, weight, etc.) are accepted, while those that fall outside the limits are rejected. Range is used to track variation in money rates, share values, exchange rates, gold prices, etc. In addition, range is used in weather forecasting (minimum and maximum temperature).

4.3. Limitations:
Range is not based on each and every item of the distribution. It is subject to fluctuations of considerable magnitude from sample to sample.

Range cannot tell us anything about the character of the distribution within the two extreme observations.

Range cannot be computed in case of open end distributions.

Problem: Q. The following are the prices of shares of Amish Co. Ltd. from Monday to Saturday:

Day          Monday  Tuesday  Wednesday  Thursday  Friday  Saturday
Price (Rs.)  200     210      208        160       220     250

Calculate the range and its coefficient.

Solution: Here L = 250 and S = 160, so

$$\text{Range} = L - S = 250 - 160 = \text{Rs. } 90$$

$$\text{Coefficient of range} = \frac{L - S}{L + S} = \frac{250 - 160}{250 + 160} = \frac{90}{410} \approx 0.22$$
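The range calculation above can be checked with a short Python sketch. The function names `value_range` and `coefficient_of_range` are illustrative choices, not part of the original notes.

```python
# Share prices (Rs.) from Monday to Saturday, taken from the problem above.
prices = [200, 210, 208, 160, 220, 250]


def value_range(values):
    """Range = largest item - smallest item."""
    return max(values) - min(values)


def coefficient_of_range(values):
    """Coefficient of range = (L - S) / (L + S)."""
    largest, smallest = max(values), min(values)
    return (largest - smallest) / (largest + smallest)


print("Range:", value_range(prices))
print("Coefficient of range:", round(coefficient_of_range(prices), 2))
```

Because both functions look only at the two extreme values, changing any of the four middle prices leaves the result unchanged, which is exactly the limitation described in section 4.3.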

5. Quartile deviation
Quartile deviation is another measure of variability. The difference Q3 - Q1 is the interquartile range. Half of the interquartile range is the quartile deviation, also called the semi-interquartile range, and it is denoted by Q.D. When the values represent some sort of ranking, the semi-interquartile range provides a measure of dispersion within that distribution. The formula for quartile deviation is

$$\text{Q.D.} = \frac{Q_3 - Q_1}{2}$$

The larger the computed value of Q.D., the greater the variability of the series. Compared to the range, quartile deviation is a more reliable measure of dispersion. Since 50% of the cases lie between the first and the third quartiles, quartile deviation measures the dispersion of the middle 50% of the data in a series. It is not the full difference between Q3 and Q1 but one half of that distance. Quartile deviation is an absolute measure of dispersion. Dividing it by the average of the two quartiles gives a relative measure called the coefficient of quartile deviation, which has the formula

$$\text{Coefficient of Q.D.} = \frac{Q_3 - Q_1}{Q_3 + Q_1}$$


Coefficient of quartile deviation is useful to compare the variability among the middle 50% of data for the series with different units of measurement.

5.1. Merits of quartile deviation


It is a better measure of dispersion than range. Since it excludes the lowest and highest 25% of values, it is not affected by extreme items. It can be calculated for grouped data with open-end intervals.

5.2. Limitations:
It is not capable of further algebraic treatment. It is susceptible to sampling fluctuations. The measure does not take into account the individual values lying between Q1 and Q3, so it gives little idea of the variation even within the middle 50% of values, although it is reasonably informative when the values are uniformly distributed between Q1 and Q3. It is sometimes not regarded as a true measure of dispersion, because it does not show the scatter of values around a central value; it is rather a measure of the partitioning of the distribution. Hence, it is not commonly used for statistical inference.

Problem: Q. Compute the quartile deviation and its coefficient from the following data:

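Marks            10  20  30  40  50  60
No. of students   4   7  15   8   7   2

Solution: Calculation of Q.D. and coefficient of Q.D.

Marks  Frequency  Cumulative frequency
10      4          4
20      7         11
30     15         26
40      8         34
50      7         41
60      2         43

Here N = 43.

$$Q_1 = \text{size of the } \frac{N + 1}{4}\text{th item} = \text{size of the 11th item} = 20$$

$$Q_3 = \text{size of the } \frac{3(N + 1)}{4}\text{th item} = \text{size of the 33rd item} = 40$$

$$\text{Q.D.} = \frac{Q_3 - Q_1}{2} = \frac{40 - 20}{2} = 10$$

$$\text{Coefficient of Q.D.} = \frac{Q_3 - Q_1}{Q_3 + Q_1} = \frac{40 - 20}{40 + 20} = \frac{20}{60} \approx 0.33$$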
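The cumulative-frequency lookup used in this solution can be sketched in Python as follows. The helper `item_at` is a hypothetical name, and the sketch assumes the discrete-series convention used above, in which Q1 is the (N + 1)/4-th item and Q3 the 3(N + 1)/4-th item.

```python
from itertools import accumulate

# Marks and frequencies from the solution table above.
marks = [10, 20, 30, 40, 50, 60]
freq = [4, 7, 15, 8, 7, 2]


def item_at(position, values, frequencies):
    """Return the value of the item at a 1-based position.

    The value is the first one whose cumulative frequency reaches the
    requested position (discrete-series convention).
    """
    for value, cumulative in zip(values, accumulate(frequencies)):
        if position <= cumulative:
            return value
    raise ValueError("position exceeds the total frequency")


n = sum(freq)                                  # total frequency N
q1 = item_at((n + 1) / 4, marks, freq)         # first quartile
q3 = item_at(3 * (n + 1) / 4, marks, freq)     # third quartile
qd = (q3 - q1) / 2                             # quartile deviation
coeff_qd = (q3 - q1) / (q3 + q1)               # coefficient of Q.D.

print(q1, q3, qd, round(coeff_qd, 2))
```

Other quartile conventions (for example interpolation between neighbouring items) can give slightly different values, so this sketch should be read as one instance of the method rather than the only one.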

6. Mean deviation
Range, the simplest measure of dispersion of all, depends only on the two extreme items of the distribution, and the quartile deviation covers only the middle 50% of the data once the series is arranged in ascending or descending order. The measures discussed so far are therefore unsatisfactory in that they lack most of the requirements of a good measure. Mean deviation is a better measure of dispersion than range and quartile deviation: the latter two are not average deviations because they are not based on all observations, whereas the mean deviation shows the variation of every item from an average.

Definition: The mean deviation of a series is the arithmetic average of the absolute deviations of the items from a measure of central tendency (mean, median, or mode). Consider a set of n observations x1, x2, ..., xn. The mean deviation, denoted by M.D., is given by the formula

$$\text{M.D.} = \frac{1}{n} \sum |x - A|$$

where A is the central value, i.e. the mean, median or the mode. For data given as a frequency distribution the formula becomes

$$\text{M.D.} = \frac{1}{N} \sum f\,|x - A|$$

where $N = \sum f$ is the total frequency. For grouped data, the midpoint of each class interval is treated as x and the same formula is used.

Mean deviations taken from the mean, median and mode are called the mean deviation from mean, the mean deviation from median and the mean deviation from mode respectively. In general, deviations are taken only from the mean or the median. Of the two, the median is considered the better choice, because the sum of the absolute deviations from the median is never greater than the sum of the absolute deviations from the mean. Since this sum is the least possible when signs are ignored, it is advantageous to find the mean deviation from the median. In general practice, however, the mean deviation from the mean is used. The main drawback of the mean deviation is that algebraic signs are ignored when taking the deviations of the items. The different formulas for mean deviation are as follows.

$$\text{M.D. from mean} = \frac{\sum |x - \bar{x}|}{n}$$

$$\text{M.D. from median} = \frac{\sum |x - M_d|}{n}$$

$$\text{M.D. from mode} = \frac{\sum |x - M_o|}{n}$$

6.2. Properties of mean deviation


Mean deviation removes one main objection to the earlier measures, in that it involves every value of the set. It is not much affected by extreme values. It has no simple general relationship with the other measures of dispersion. Its main drawback is that the algebraic signs of the deviations are ignored, which is mathematically unsound. Mean deviation is a minimum when taken from the median, a notable characteristic known as its minimal property. Since deviations are taken from a central value, the shapes of different distributions can easily be compared.

6.3. Limitations
The greatest drawback of this method is that algebraic signs are ignored while taking the deviations of the item. This method may not give very accurate results. It is not capable of further algebraic treatment. It is rarely used in sociological studies.

Because of these limitations, its use is limited and it is overshadowed as a measure of variation by the superior standard deviation.

6.4. Uses
It is especially effective in reports presented to the general public or to groups not familiar with statistical methods. This measure is useful for small samples with no elaborate analysis required. Incidentally, it may be mentioned that the National Bureau of Economic Research has found in its work on forecasting business cycles, that the average deviation is the most practical measure of dispersion to use for this purpose.

Problem: Q. Calculate the mean deviation from the median and its coefficient from the following data:

X   10  11  12  13  14
f    3  12  18  12   3

Solution: Calculation of mean deviation and its coefficient

X    f    |D| = |X - 12|   f|D|   c.f.
10    3   2                 6      3
11   12   1                12     15
12   18   0                 0     33
13   12   1                12     45
14    3   2                 6     48

N = 48 and the sum of f|D| is 36.

Median = size of the (N + 1)/2 th item = size of the 24.5th item. Since the 24.5th item is 12, the median is 12.

$$\text{M.D.} = \frac{\sum f|D|}{N} = \frac{36}{48} = 0.75$$

$$\text{Coefficient of M.D.} = \frac{\text{M.D.}}{\text{median}} = \frac{0.75}{12} = 0.0625$$
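A minimal Python sketch of the same calculation is given here, assuming the frequencies of the solution table (N = 48). The function names `discrete_median` and `mean_deviation` are illustrative only.

```python
from itertools import accumulate

# Values and frequencies from the solution table above (N = 48).
x = [10, 11, 12, 13, 14]
f = [3, 12, 18, 12, 3]


def discrete_median(values, frequencies):
    """Median of a discrete series: the value of the (N + 1)/2-th item."""
    position = (sum(frequencies) + 1) / 2
    for value, cumulative in zip(values, accumulate(frequencies)):
        if position <= cumulative:
            return value
    raise ValueError("empty frequency distribution")


def mean_deviation(values, frequencies, centre):
    """M.D. = (1/N) * sum of f * |x - A| for a chosen centre A."""
    total = sum(frequencies)
    weighted = sum(fi * abs(xi - centre) for xi, fi in zip(values, frequencies))
    return weighted / total


median = discrete_median(x, f)
md = mean_deviation(x, f, median)
coeff_md = md / median

print(median, md, coeff_md)
```

Passing the mean or the mode as `centre` instead of the median gives the other two variants of the mean deviation listed earlier in this section.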

6.5. Mean squared deviation


To overcome the drawback of ignoring algebraic signs, the average is sometimes taken of the squared deviations. This is known as the mean squared deviation, given by the formula

$$\text{Mean squared deviation} = \frac{1}{n} \sum (x - A)^2$$

The mean squared deviation is smallest when the deviations are taken from the arithmetic mean, i.e. when A is replaced by $\bar{x}$. This minimum value is called the variance, a popular measure of dispersion.
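The minimum property stated above follows from a standard decomposition of the sum of squares. The short derivation below is a sketch added for illustration; it uses only the fact that the deviations from the mean sum to zero.

```latex
\begin{aligned}
\sum (x - A)^2
  &= \sum \bigl[(x - \bar{x}) + (\bar{x} - A)\bigr]^2 \\
  &= \sum (x - \bar{x})^2 + 2(\bar{x} - A)\sum (x - \bar{x}) + n(\bar{x} - A)^2 \\
  &= \sum (x - \bar{x})^2 + n(\bar{x} - A)^2
  \qquad \text{since } \sum (x - \bar{x}) = 0 .
\end{aligned}
```

The extra term $n(\bar{x} - A)^2$ is never negative and vanishes only when $A = \bar{x}$, so the mean squared deviation is smallest when taken from the arithmetic mean.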

7. Variance
The minimum value of the mean squared deviation is called the variance, given by the formula

$$\sigma^2 = \frac{1}{n} \sum (x - \bar{x})^2$$

So, variance is the arithmetic mean of the squares of the deviations taken from the arithmetic mean (A.M.). A sample variance differs from the population variance, as the computational formula makes clear. The sample variance is given by

$$s^2 = \frac{1}{n - 1} \sum (x - \bar{x})^2$$

Subtracting the mean from each data value to obtain the deviations results in the loss of one piece of information from the original n numbers. Therefore, when computing the sample variance, the sum of the squared deviations is divided by one fewer than the number of terms added up.

Remarks: If a problem does not state whether the data are a sample or a population, the variance should be understood as the population variance and calculated accordingly.
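The difference between the two divisors can be seen in a small Python sketch. It reuses the variety A yields from the soybean example in section 8.2 and compares hand-written functions (illustrative names) with `statistics.pvariance` and `statistics.variance` from the Python standard library.

```python
import statistics

# Variety A yields (kg/plot) from the soybean example in section 8.2.
yields = [10, 11, 9, 10, 9, 10, 4]


def sum_squared_deviations(values):
    """Sum of squared deviations from the arithmetic mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def population_variance(values):
    """Divide by n: the data are treated as the whole population."""
    return sum_squared_deviations(values) / len(values)


def sample_variance(values):
    """Divide by n - 1: the data are treated as a sample."""
    return sum_squared_deviations(values) / (len(values) - 1)


print(population_variance(yields), statistics.pvariance(yields))
print(sample_variance(yields), statistics.variance(yields))
```

The sum of squared deviations is 32 in both cases; only the divisor changes (7 for the population form, 6 for the sample form), so the sample variance is always the larger of the two.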

7.1. Properties of variance


1. Variance removes most of the drawbacks present in the measures of dispersion described above.

2. The main demerit of variance is that its unit is the square of the unit in which the variable is measured. For example, if the variable X is measured in cm, the unit of variance is cm2. The value is generally large, which makes it difficult to judge the magnitude of variation.

3. The variance gives more weight to extreme values than to values near the mean, because the deviations are squared.

4. Variance is not affected by a shift in origin, which changes only the mean, but it is affected by a change of scale. Thus the variance of temperature changes if the temperature is measured in Centigrade rather than in Fahrenheit, whereas the variance of the percentage of plants affected by disease is the same as that of the percentage of plants free from infection. Likewise, if we subtract a fixed number from every observation and compute the variance of the resulting deviations, it equals the variance of the original data. But if we divide every observation by a fixed number and compute the variance, the variance of the original data equals the product of the square of that number and the variance so computed.
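Property 4 can be illustrated with a short Python sketch. The temperature readings are hypothetical values chosen for the example, not data from the notes.

```python
import statistics

# Hypothetical temperature readings in degrees Centigrade (assumed data).
celsius = [20.0, 22.5, 25.0, 27.5, 30.0]

# Shift of origin: subtract a fixed number from every observation.
shifted = [c - 20.0 for c in celsius]

# Change of scale (plus a shift): convert to Fahrenheit, F = 9/5 * C + 32.
fahrenheit = [c * 9 / 5 + 32 for c in celsius]

var_c = statistics.pvariance(celsius)
var_shifted = statistics.pvariance(shifted)
var_f = statistics.pvariance(fahrenheit)

print(var_c, var_shifted)         # the shift leaves the variance unchanged
print(var_f, (9 / 5) ** 2 * var_c)  # the scale factor enters squared
```

The additive 32 in the Fahrenheit conversion has no effect on the variance; only the multiplicative factor 9/5 matters, and it appears squared.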

7.2. Combined variance


By the combined variance of two groups, we mean the variance of the observations of the two groups taken together. In general, the variance of a whole series can be obtained if the sizes, means and variances of its components are known. This is sometimes loosely called the pooled variance. For two groups of n1 and n2 observations with means $\bar{x}_1$ and $\bar{x}_2$ and variances $\sigma_1^2$ and $\sigma_2^2$ respectively, the combined variance is

$$\sigma_{12}^2 = \frac{1}{n_1 + n_2}\left[\, n_1(\sigma_1^2 + d_1^2) + n_2(\sigma_2^2 + d_2^2) \,\right]$$

where $\bar{x}_{12}$ is the combined mean of the two series, $d_1 = \bar{x}_1 - \bar{x}_{12}$ and $d_2 = \bar{x}_2 - \bar{x}_{12}$.
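The combined-variance formula can be checked numerically against the variance of the merged data. The two groups below are hypothetical, and `combined_variance` is an illustrative name; the sketch uses population variances (divisor n) throughout, as the formula does.

```python
import statistics

# Two hypothetical groups of observations (assumed data for illustration).
group1 = [2, 4, 6]
group2 = [10, 12, 14, 16]


def combined_variance(n1, mean1, var1, n2, mean2, var2):
    """Combined population variance from group sizes, means and variances."""
    combined_mean = (n1 * mean1 + n2 * mean2) / (n1 + n2)
    d1 = mean1 - combined_mean
    d2 = mean2 - combined_mean
    return (n1 * (var1 + d1 ** 2) + n2 * (var2 + d2 ** 2)) / (n1 + n2)


from_summaries = combined_variance(
    len(group1), statistics.mean(group1), statistics.pvariance(group1),
    len(group2), statistics.mean(group2), statistics.pvariance(group2),
)
from_raw_data = statistics.pvariance(group1 + group2)

print(from_summaries, from_raw_data)
```

The two printed values agree up to floating-point rounding, which shows that the raw observations are not needed once the size, mean and variance of each group are known.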

8. Standard deviation
A measure of dispersion that overcomes the drawbacks of variance is the standard deviation. The standard deviation, denoted by S.D., is defined as the positive square root of the variance. The formula for the population standard deviation is

$$\text{S.D.} = \sqrt{\sigma^2} = \sigma$$

For a sample we have the following formula

$$\text{S.D.} = \sqrt{s^2} = s$$

If nothing is stated about the type of standard deviation, it should be understood as the population standard deviation and the formula used accordingly. In practice, however, we will mostly be dealing with the sample standard deviation. The standard deviation is the most widely accepted and most widely used of all measures of variability. It is sensitive to all of the data, because the deviations of all data values from the mean enter equally into the computation. Standard deviation is used in computing other statistical quantities such as regression coefficients and correlation coefficients, and in testing the reliability of certain statistical measures. Note: literally, S.D. describes the average amount of variation on either side of the mean. Written out in full, S.D. is the positive square root of the arithmetic mean of the squared deviations taken from the arithmetic mean, i.e.

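$$\sigma = \sqrt{\frac{1}{n} \sum (x - \bar{x})^2}$$

For computational purposes we use the following simplified form of the formula

$$\sigma = \sqrt{\frac{\sum x^2}{n} - \left(\frac{\sum x}{n}\right)^2}$$

and in the case of a frequency distribution the formula is modified to

$$\sigma = \sqrt{\frac{\sum f x^2}{N} - \left(\frac{\sum f x}{N}\right)^2}$$

where $N = \sum f$.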
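The definitional and the simplified formulas give the same result, as the following Python sketch suggests. It reuses the frequency distribution from the mean deviation problem in section 6 (N = 48); the function names are illustrative.

```python
import math

# Frequency distribution from the mean deviation problem in section 6.
x = [10, 11, 12, 13, 14]
f = [3, 12, 18, 12, 3]


def sd_by_definition(values, frequencies):
    """Square root of the frequency-weighted mean squared deviation."""
    total = sum(frequencies)
    mean = sum(fi * xi for xi, fi in zip(values, frequencies)) / total
    squared = sum(fi * (xi - mean) ** 2 for xi, fi in zip(values, frequencies))
    return math.sqrt(squared / total)


def sd_by_shortcut(values, frequencies):
    """Square root of (sum f x^2 / N) - (sum f x / N)^2."""
    total = sum(frequencies)
    mean_of_squares = sum(fi * xi ** 2 for xi, fi in zip(values, frequencies)) / total
    mean = sum(fi * xi for xi, fi in zip(values, frequencies)) / total
    return math.sqrt(mean_of_squares - mean ** 2)


print(sd_by_definition(x, f), sd_by_shortcut(x, f))
```

For this distribution the sum of fx is 576 and the sum of fx squared is 6960, so the shortcut gives the square root of 145 - 144, i.e. a standard deviation of 1. With floating-point data the shortcut form can lose precision when the mean is large relative to the spread, which is one reason the definitional form is often preferred in software.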

8.1. Characteristics of standard deviation


Standard deviation is considered to be the best measure of dispersion and is widely used. If all variate values are the same, S.D. = 0. Standard deviation is the measure least affected by fluctuations of sampling. It is affected by a change of scale but not by a change of origin. Its limitations are that it cannot be computed for open-ended classes, and that if the units of measurement of two series are not the same, their variability cannot be compared by comparing their standard deviations.

Standard deviation is an absolute measure of dispersion. Its relative measure is called the coefficient of standard deviation defined as

$$\text{Coeff. (S.D.)} = \frac{\sigma}{\bar{x}} \times 100$$

8.2. Difference between mean deviation and standard deviation


Both these measures of dispersion are based on each and every item of the distribution, but they differ in the following respects. Algebraic signs are ignored when calculating the mean deviation, whereas in the calculation of the standard deviation signs are taken into account. Mean deviation can be computed from either the median or the mean. The standard deviation, on the other hand, is always computed from the arithmetic mean, because the sum of the squares of the deviations of items from the arithmetic mean is the least.

Q. The yields (kg/plot) of two soybean varieties are: variety A 10, 11, 9, 10, 9, 10, 4 and variety B 10, 10, 10, 9, 10, 7, 7. Which one do you prefer and why?

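Variety A

x       Deviation from mean   Squared deviation
10       1                     1
11       2                     4
 9       0                     0
10       1                     1
 9       0                     0
10       1                     1
 4      -5                    25
Total 63                      32

Variety B

y       Deviation from mean   Squared deviation
10       1                     1
10       1                     1
10       1                     1
 9       0                     0
10       1                     1
 7      -2                     4
 7      -2                     4
Total 63                      12

Now,

$$\bar{x} = \frac{\sum x_i}{n} = \frac{63}{7} = 9, \qquad \bar{y} = \frac{\sum y_i}{n} = \frac{63}{7} = 9$$

$$\text{S.D. of variety A} = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}} = \sqrt{\frac{32}{7 - 1}} = \sqrt{5.33} \approx 2.31$$

$$\text{S.D. of variety B} = \sqrt{\frac{\sum (y_i - \bar{y})^2}{n - 1}} = \sqrt{\frac{12}{7 - 1}} = \sqrt{2} \approx 1.41$$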

Interpretation: The S.D. of variety B is less than the S.D. of variety A. Since their means are equal, variety B is more consistent (homogeneous) than variety A, and we can therefore safely prefer variety B.
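The comparison of the two varieties can be reproduced with Python's standard library. This is a sketch that assumes the sample form (divisor n - 1), which is what `statistics.stdev` computes.

```python
import statistics

# Yields (kg/plot) of the two soybean varieties from the question above.
variety_a = [10, 11, 9, 10, 9, 10, 4]
variety_b = [10, 10, 10, 9, 10, 7, 7]

mean_a = statistics.mean(variety_a)
mean_b = statistics.mean(variety_b)

# Sample standard deviation: square root of sum of squares / (n - 1).
sd_a = statistics.stdev(variety_a)
sd_b = statistics.stdev(variety_b)

print(mean_a, round(sd_a, 2))
print(mean_b, round(sd_b, 2))

# With equal means, the smaller S.D. marks the more consistent variety.
more_consistent = "B" if sd_b < sd_a else "A"
print("More consistent variety:", more_consistent)
```

Comparing standard deviations directly is only fair here because the two means are equal; when means differ, the coefficient of variation in section 9 is the appropriate yardstick.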

8.3. Mathematical properties of standard deviation

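1. The standard deviation is independent of change of origin but not of scale.

2. The sum of the squared deviations of the items in a distribution from their arithmetic mean is a minimum, i.e. for any value A other than the mean,

$$\sum (X - \bar{X})^2 < \sum (X - A)^2$$

3. Standard deviation of n natural numbers: The standard deviation of the first n natural numbers can be obtained by the following formula

$$\sigma = \sqrt{\frac{1}{12}(n^2 - 1)}$$

Thus the standard deviation of the natural numbers from 1 to 6 will be

$$\sigma = \sqrt{\frac{6^2 - 1}{12}} = \sqrt{\frac{35}{12}} \approx 1.71$$

4. Combined standard deviation: If two groups contain n1 and n2 observations with means $\bar{X}_1$ and $\bar{X}_2$ and standard deviations $\sigma_1$ and $\sigma_2$ respectively, then the S.D. ($\sigma_{12}$) of the combined group is given by the following formula

$$\sigma_{12} = \sqrt{\frac{n_1(\sigma_1^2 + d_1^2) + n_2(\sigma_2^2 + d_2^2)}{n_1 + n_2}}$$

where $d_1 = \bar{X}_1 - \bar{X}_{12}$, $d_2 = \bar{X}_2 - \bar{X}_{12}$ and $\bar{X}_{12}$ is the combined mean.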
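Property 3 can be verified directly. The sketch below compares the closed-form expression with the population standard deviation computed by `statistics.pstdev`; the function name `sd_first_n_naturals` is illustrative.

```python
import math
import statistics


def sd_first_n_naturals(n):
    """Population S.D. of 1, 2, ..., n from the formula sqrt((n^2 - 1) / 12)."""
    return math.sqrt((n * n - 1) / 12)


n = 6
by_formula = sd_first_n_naturals(n)
by_definition = statistics.pstdev(list(range(1, n + 1)))

print(round(by_formula, 2), round(by_definition, 2))
```

The formula applies to the population form of the standard deviation (divisor n), so it should be compared with `pstdev`, not with the sample version `stdev`.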

9. Coefficient of variation
All the measures of variation described above have units. If series differ in their units of measurement, their variability cannot be compared by any of the measures of dispersion given so far. The size of a measure of dispersion also depends on the size of the values. Hence, where two series have different units of measurement, or their means differ considerably in size, the coefficient of variation should be used as the measure of dispersion. It is a relative measure of dispersion and therefore unitless, and it takes the size of the means of the two series into account. Using the coefficient of variation (C.V.), two or more sets of data can be compared more fairly for their variability. A series with a smaller coefficient of variation is considered more consistent or stable.

9.1. Definition
The coefficient of variation of a series of variate values is the ratio of the standard deviation to the arithmetic mean, multiplied by one hundred. If $\sigma$ is the standard deviation and $\bar{x}$ is the mean of the set of values, the coefficient of variation is

$$\text{C.V.} = \frac{\sigma}{\bar{x}} \times 100$$

This measure of dispersion was proposed by Professor Karl Pearson.

9.2. Properties
1. It is one of the most widely used measures of dispersion because of its virtues. 2. In field experiments, a low C.V. indicates greater reliability of the experimental findings. 3. The C.V. lets us comment on the variability of a distribution: the smaller the C.V., the more uniform, consistent or stable the distribution; the larger the C.V., the more variable or scattered the distribution.

Q. Compare the two yields (variety A: mean 60 kg and standard deviation 10 kg; variety B: mean 50 kg and standard deviation 9 kg). Which one is subject to more variation?

In the case of variety A: mean = 60.0 kg and S.D. (s) = 10.0 kg. Now,

$$\text{C.V.} = \frac{\text{S.D.}}{\text{Mean}} \times 100 = \frac{10}{60} \times 100 \approx 16.67\%$$

In the case of variety B: mean = 50.0 kg and S.D. (s) = 9.0 kg. Now,

$$\text{C.V.} = \frac{\text{S.D.}}{\text{Mean}} \times 100 = \frac{9}{50} \times 100 = 18\%$$

Interpretation: The C.V. of variety A is lower than that of variety B. Variety A is therefore better on the basis of yield distribution, because it is more consistent (homogeneous) than variety B. Variety B is subject to more variation than variety A.
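The two coefficients of variation can be computed with a few lines of Python. The function name `coefficient_of_variation` is an illustrative choice.

```python
def coefficient_of_variation(sd, mean):
    """C.V. = (standard deviation / mean) * 100, expressed in percent."""
    return 100 * sd / mean


# Summary figures (kg) for the two varieties from the question above.
cv_a = coefficient_of_variation(sd=10.0, mean=60.0)
cv_b = coefficient_of_variation(sd=9.0, mean=50.0)

print(round(cv_a, 2), round(cv_b, 2))

# The variety with the larger C.V. is subject to more variation.
more_variable = "A" if cv_a > cv_b else "B"
print("More variable variety:", more_variable)
```

Note that variety A has the larger standard deviation (10 kg against 9 kg) yet the smaller C.V., which is why the relative measure is needed when the means differ.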

10. Relation between measures of dispersion


In a normal distribution there is a fixed relationship between the three most commonly used measures of dispersion. The quartile deviation is the smallest, the mean deviation next and the standard deviation the largest, in approximately the following proportions

$$\text{Q.D.} \approx \tfrac{2}{3}\,\text{S.D.} \quad \text{and} \quad \text{M.D.} \approx \tfrac{4}{5}\,\text{S.D.}$$
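These proportions can be checked for the standard normal distribution (S.D. = 1) with Python's standard library. The sketch relies on two standard facts that are not stated in the notes: the third quartile of a standard normal variable is its 0.75 quantile, and its mean absolute deviation from the mean equals the square root of 2/pi.

```python
import math
from statistics import NormalDist

standard_normal = NormalDist(mu=0.0, sigma=1.0)

# For a symmetric distribution Q1 = -Q3, so Q.D. = (Q3 - Q1) / 2 = Q3.
q3 = standard_normal.inv_cdf(0.75)
qd_ratio = q3                        # Q.D. as a fraction of S.D. (= 1)

# Mean absolute deviation of a standard normal variable: sqrt(2 / pi).
md_ratio = math.sqrt(2 / math.pi)    # M.D. as a fraction of S.D. (= 1)

print(round(qd_ratio, 4), round(md_ratio, 4))
```

The exact ratios are about 0.6745 and 0.7979, so the fractions 2/3 and 4/5 are convenient approximations rather than exact identities, and they hold only when the data are at least approximately normal.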

11.Conclusion
Range takes only the maximum and minimum values into account and not all the values. Hence it is a very unstable and unreliable indicator of the amount of deviation, and it is affected by extreme values. The quartile deviation is more stable than the range, as it depends on two intermediate values. It is not affected by extreme values, since the extreme values are already removed. However, quartile deviation also fails to take all the values into account. The mean deviation is a measure of dispersion based on all items in a distribution. It is the arithmetic mean of the deviations of a series computed from any measure of central tendency, i.e. the mean, median or mode, with all the deviations taken as positive, i.e. signs are ignored. S.D. is the most widely accepted and most widely used of all variability measures.

