
MBA SEMESTER 1

MB0040 – STATISTICS FOR MANAGEMENT- 4 Credits


(Book ID: B1129)
Assignment Set- 1 (60 Marks)
Note: Each question carries 10 Marks. Answer all the questions

1. Why is it necessary to summarise data? Explain the approaches available to summarise data
distributions.

Graphical representation is a good way to present summarised data. However, graphs provide only an
overview and cannot readily be used for further analysis. Hence, we use summary statistics, such as
averages, to analyse the data. Mass data, which is collected, classified, tabulated and presented systematically,
is analysed further to reduce it to a single representative figure. This single figure lies in the central part
of the range of all values and represents the entire data set. Hence, it is called the measure of central
tendency.

In other words, the tendency of data to cluster around a figure which is in central location is known as central
tendency. Measure of central tendency or average of first order describes the concentration of large numbers
around a particular value. It is a single value which represents all units.

Statistical Averages: The commonly used statistical averages are the arithmetic mean, the geometric mean and
the harmonic mean.

Arithmetic mean is defined as the sum of all values divided by the number of values and is represented by X̄.

Before we study how to compute arithmetic mean, we have to be familiar with the terms such as discrete data,
frequency and frequency distribution, which are used in this unit.

If a variable can take only a finite (or countable) number of distinct values, the data are said to be discrete.
The number of occurrences of each value in the data set is called the frequency of that value. A systematic
presentation of the values taken by a variable together with their corresponding frequencies is called a
frequency distribution of the variable.
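
The definitions above can be illustrated with a minimal Python sketch. The observations are hypothetical and chosen only for illustration; they do not come from the assignment. The sketch builds a frequency distribution and computes the arithmetic mean both from the raw values and from the frequencies.

```python
from collections import Counter

# Hypothetical discrete observations (illustrative only).
values = [2, 3, 3, 4, 4, 4, 5, 5, 6]

# Frequency distribution: each distinct value mapped to its frequency.
frequency = Counter(values)

# Arithmetic mean from raw data: sum of all values / number of values.
mean_raw = sum(values) / len(values)

# Arithmetic mean from the frequency distribution: sum(f * x) / sum(f).
mean_freq = sum(x * f for x, f in frequency.items()) / sum(frequency.values())

print(sorted(frequency.items()))
print(mean_raw, mean_freq)
```

Both routes give the same mean, because the frequency distribution only groups identical values together.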

Median: Median of a set of values is the value which is the middle most value when they are arranged in the
ascending order of magnitude. Median is denoted by ‘M’.

Mode: Mode is the value which has the highest frequency and is denoted by Z.

The modal value is particularly useful in business. For example, shoe and readymade garment manufacturers
would like to know the modal size of their customers in order to plan their operations. For discrete data, with
or without frequencies, the mode is the value corresponding to the highest frequency.
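
As a small sketch of the median and the mode, the following Python snippet uses the standard `statistics` module on a hypothetical list of shoe sizes (the data are an assumption made for illustration).

```python
import statistics

# Hypothetical shoe sizes sold in one day (illustrative only).
shoe_sizes = [6, 7, 7, 8, 8, 8, 9, 9, 10]

# Median: the middlemost value once the data are sorted in ascending order.
median_size = statistics.median(shoe_sizes)

# Mode: the value with the highest frequency.
modal_size = statistics.mode(shoe_sizes)

print("Median (M):", median_size)
print("Mode (Z):", modal_size)
```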

Appropriate Situations for the use of Various Averages

1. Arithmetic mean is used when:

a. In depth study of the variable is needed

b. The variable is continuous and additive in nature

c. The data are in the interval or ratio scale

d. The distribution is symmetrical

2. Median is used when:

a. The variable is discrete

b. There exist abnormal values

c. The distribution is skewed

d. The extreme values are missing

e. The characteristics studied are qualitative

f. The data are on the ordinal scale

3. Mode is used when:

a. The variable is discrete

b. There exist abnormal values

c. The distribution is skewed

d. The extreme values are missing

e. The characteristics studied are qualitative

4. Geometric mean is used when:

a. The rate of growth, ratios and percentages are to be studied

b. The variable is of multiplicative nature

5. Harmonic mean is used when:

a. The study is related to speed, time

b. Average of rates which produce equal effects has to be found
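
The situations listed for the geometric and harmonic means can be sketched in Python as follows. The growth factors and speeds are hypothetical figures chosen for illustration, and `statistics.geometric_mean` requires Python 3.8 or later.

```python
import statistics

# Hypothetical yearly growth factors (10%, 20% and 30% growth).
growth_factors = [1.10, 1.20, 1.30]

# Geometric mean: suitable for multiplicative quantities such as growth rates.
avg_growth = statistics.geometric_mean(growth_factors)

# Hypothetical speeds (km/h) over two legs of equal distance.
speeds = [40, 60]

# Harmonic mean: suitable for averaging rates such as speed over equal distances.
avg_speed = statistics.harmonic_mean(speeds)

print("Average growth factor:", round(avg_growth, 4))
print("Average speed:", avg_speed)
```

Note that the harmonic mean of 40 and 60 is 48, not the arithmetic mean of 50, because more time is spent travelling at the lower speed.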

Positional Averages

Median is the mid-value of a series of data. It divides the distribution into two equal portions. Similarly, we can
divide a given distribution into four (quartiles), ten (deciles), a hundred (percentiles) or any other number of
equal portions.
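
A brief sketch of positional averages, assuming a hypothetical data set of the integers 1 to 11: `statistics.quantiles` returns the cut points that divide the data into `n` equal portions. Different interpolation conventions exist, so the values may differ slightly from a textbook hand calculation.

```python
import statistics

# Hypothetical ordered data (illustrative only).
data = list(range(1, 12))

# n=4 gives the three quartiles, n=10 the nine deciles,
# and n=100 the ninety-nine percentiles.
quartiles = statistics.quantiles(data, n=4)
deciles = statistics.quantiles(data, n=10)
percentiles = statistics.quantiles(data, n=100)

print("Quartiles:", quartiles)
print("Median:", statistics.median(data))
print("Number of deciles:", len(deciles))
print("Number of percentiles:", len(percentiles))
```

The second quartile coincides with the median, which is why the median is treated as a positional average.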

2. Explain the purpose of tabular presentation of statistical data. Draft a form of tabulation to
show the distribution of population according to i) community by age, ii) literacy, iii) sex,
and iv) marital status.

The objectives of tabulation are to:

i. Simplify complex data

ii. Highlight important characteristics

iii. Present data in minimum space

iv. Facilitate comparison

v. Bring out trends and tendencies

vi. Facilitate further analysis

                              Educated                             Non-Educated
Marital Status   Sex          Below 20 yrs   20-40   Above 40     Below 20 yrs   20-40   Above 40
Married          Male
                 Female
Unmarried        Male
                 Female

3. Give a brief note of the measures of central tendency together with their merits & Demerits. Which
is the best measure of central tendency and why?

Graphical representation is a good way to present summarised data. However, graphs provide only an
overview and cannot readily be used for further analysis. Hence, we use summary statistics, such as
averages, to analyse the data. Mass data, which is collected, classified, tabulated and presented systematically,
is analysed further to reduce it to a single representative figure. This single figure lies in the central part
of the range of all values and represents the entire data set. Hence, it is called the measure of central
tendency.

In other words, the tendency of data to cluster around a figure which is in central location is known as central
tendency. Measure of central tendency or average of first order describes the concentration of large numbers
around a particular value. It is a single value which represents all units.

Arithmetic mean: Arithmetic mean is defined as the sum of all values divided by the number of values and is
represented by X̄.

Merits and demerits of arithmetic mean

Merits                                               Demerits
It is simple to calculate and easy to understand.    It is affected by extreme values.
It is based on all values.                           It cannot be determined for distributions with open-end class intervals.
It is rigidly defined.                               It cannot be graphically located.
It is more stable.                                   Sometimes it is a value which is not in the series.
It is capable of further algebraic treatment.

Median: Median of a set of values is the middlemost value when they are arranged in ascending order of
magnitude. Median is denoted by ‘M’.

Merits and demerits of median

Merits                                                            Demerits
It can be easily understood and computed.                         It is not based on all values.
It is not affected by extreme values.                             It is not capable of further algebraic treatment.
It can be determined graphically (Ogives).
It can be used for qualitative data.
It can be calculated for distributions with open-end classes.

Mode: Mode is the value which has the highest frequency and is denoted by Z.

The modal value is particularly useful in business. For example, shoe and readymade garment manufacturers
would like to know the modal size of their customers in order to plan their operations. For discrete data, with
or without frequencies, the mode is the value corresponding to the highest frequency.

Merits and demerits of mode

Merits                                                            Demerits
In many cases it can be found by inspection.                      It is not based on all values.
It is not affected by extreme values.                             It is not capable of further mathematical treatment.
It can be calculated for distributions with open-end classes.     It is much affected by sampling fluctuations.
It can be located graphically.
It can be used for qualitative data.

The best measure of central tendency is generally the arithmetic mean. It is defined as the value obtained by
dividing the sum of all the observations by their number, that is,
mean = [sum of all the observations] / [number of observations].
The arithmetic mean is preferred because it is simple to understand and easy to interpret, it is quickly and easily
calculated, it is amenable to mathematical treatment, and it is relatively stable in repeated sampling experiments.
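
The demerit that the arithmetic mean is affected by extreme values, while the median is not, can be seen in a minimal Python sketch. The figures are hypothetical and serve only as an illustration.

```python
import statistics

# Hypothetical monthly salaries in thousands (illustrative only).
salaries = [30, 32, 35, 36, 40]
# The same data with one extreme value added.
salaries_with_outlier = salaries + [300]

mean_before = statistics.mean(salaries)
mean_after = statistics.mean(salaries_with_outlier)
median_before = statistics.median(salaries)
median_after = statistics.median(salaries_with_outlier)

# The mean shifts substantially; the median barely moves.
print("Mean:", mean_before, "->", round(mean_after, 2))
print("Median:", median_before, "->", median_after)
```

This is why the median is often reported alongside, or instead of, the mean when a distribution is skewed.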

4. Machines are used to pack sugar into packets supposedly containing 1.20 kg each. On testing a
large number of packets over a long period of time, it was found that the mean weight of the packets
was 1.24 kg and the standard deviation was 0.04 kg. A particular machine is selected to check the
total weight of each of the 25 packets filled consecutively by the machine. Calculate the limits within
which the weight of the packets should lie, assuming that the machine is not to be classified as faulty.

Since the sample size is 25, which is less than 30, this is treated as a small-sample case, and the
t distribution is used to calculate the confidence limits.

Given:

Sample size, n = 25
Standard deviation, S = 0.04 kg
Degrees of freedom, df = n - 1 = 25 - 1 = 24
Sample mean weight, x̄ = 1.24 kg
Population mean weight = µ
α = 5% = 0.05

tα/2 = t0.05/2 = t0.025 = 2.064 at 95% confidence and degrees of freedom df = 24

The limits are:

x̄ ± tα/2 · S/√n

= 1.24 ± 2.064 (0.04 / √25)

= 1.24 ± [2.064 (0.04 / 5)]

= 1.24 ± 0.016512

That is,

x̄ - tα/2 · S/√n ≤ µ ≤ x̄ + tα/2 · S/√n

1.24 - 0.016512 ≤ µ ≤ 1.24 + 0.016512

1.223488 ≤ µ ≤ 1.256512
==========
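
The arithmetic of this answer can be checked with a short Python sketch. It assumes the same inputs as the working above; the critical value 2.064 is taken from a t table, as in the text, because the Python standard library does not provide the t distribution.

```python
import math

n = 25                # sample size
sample_mean = 1.24    # mean weight in kg
s = 0.04              # standard deviation in kg
t_critical = 2.064    # t table value for df = 24 at 95% confidence (from the text)

# Margin of error: t * S / sqrt(n)
margin = t_critical * s / math.sqrt(n)

lower = sample_mean - margin
upper = sample_mean + margin

print("Margin:", round(margin, 6))
print("Limits:", round(lower, 6), "to", round(upper, 6))
```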

5. A packaging device is set to fill detergent powder packets with a mean weight of 5 kg. The standard
deviation is known to be 0.01 kg. The weights are known to drift upwards over a period of time due to
machine fault, which is not tolerable. A random sample of 100 packets is taken and weighed. This
sample has a mean weight of 5.03 kg and a standard deviation of 0.21 kg. Can we conclude that the
mean weight produced by the machine has increased? Use 5% level of significance.

Since the sample size is 100, this is a case of a large sample, so the Z-test statistic is used for
hypothesis testing.

Let the null hypothesis, H0, be that the mean weight has not increased, and let the alternative
hypothesis, H1 (also written HA), be that the mean weight has increased.

H0 : µ = 5
H1 : µ > 5 ( Right Tailed test )

Given:

Sample size, n = 100
Sample mean weight, x̄ = 5.03 kg
Standard deviation, S = 0.21 kg
Level of significance, α = 5%

Z = (x̄ - µ) / (S / √n)

= (5.03 - 5) / (0.21 / √100)

= 0.03 / 0.021

Z calculated = 1.429

From the standard normal table at the 5% level, Z critical = Zα = Z0.05 = 1.645 (for a one-tailed test).

Since the calculated value, Z calculated = 1.429, is less than the critical value Zα = 1.645,
H0 is not rejected.

Hence we cannot conclude that the mean weight produced by the machine has increased.
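
A minimal Python sketch of this right-tailed Z test, using the same figures as the working above. The variable names are illustrative, and `statistics.NormalDist` (Python 3.8 or later) supplies the standard normal critical value in place of the table lookup.

```python
import math
from statistics import NormalDist

mu0 = 5.0            # hypothesised mean weight under H0 (kg)
n = 100              # sample size
sample_mean = 5.03   # sample mean weight (kg)
s = 0.21             # sample standard deviation (kg)
alpha = 0.05         # level of significance

# Test statistic: Z = (x_bar - mu0) / (S / sqrt(n))
z = (sample_mean - mu0) / (s / math.sqrt(n))

# One-tailed critical value and p-value from the standard normal distribution.
z_critical = NormalDist().inv_cdf(1 - alpha)
p_value = 1 - NormalDist().cdf(z)

# H0 is rejected only when the statistic exceeds the critical value.
reject_h0 = z > z_critical

print("Z calculated:", round(z, 3))
print("Z critical:", round(z_critical, 3))
print("p-value:", round(p_value, 4))
print("Reject H0:", reject_h0)
```

Because the statistic falls below the critical value, the sample does not provide sufficient evidence at the 5% level that the mean weight has increased.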

6. Find the probability that at most 5 defective bolts will be found in a box of 200 bolts if it is known
that 2 per cent of such bolts are expected to be defective. (You may take the distribution to be
Poisson; e^-4 = 0.0183.)

Given, total number of bolts, n = 200

P (defective bolt) = 2% = 0.02


Therefore, m = np = 200 * 0.02 = 4

P(X = 0) = P(zero defective bolts)

= (e^-m · m^0) / 0!

= (e^-4 · 4^0) / 1

= (0.0183)(1) / 1

= 0.0183

=========

P ( at most 5 defective bolts )

= P (X≤5)

= P (X=0) + P(X=1) + P(X=2) + P(X=3) + P(X=4) + P(X=5)

= (e^-m · m^0) / 0! + (e^-m · m^1) / 1! + (e^-m · m^2) / 2! + (e^-m · m^3) / 3! + (e^-m · m^4) / 4! + (e^-m · m^5) / 5!

= e^-m [1 + m^1/1! + m^2/2! + m^3/3! + m^4/4! + m^5/5!]

= e^-4 [1 + 4/1 + 16/2 + 64/6 + 256/24 + 1024/120]

= 0.0183 [ 1 + 4 + 8 + 10.67 + 10.67 + 8.53 ]

= 0.0183 * 42.87

= 0.784521
=======
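
The Poisson sum can be checked with a short Python sketch. The helper `poisson_pmf` is a hypothetical name introduced here for illustration. The sketch uses the exact value of e^-4 rather than the rounded 0.0183 given in the question, so its result differs slightly in the third decimal place from the hand calculation.

```python
import math

n = 200      # bolts in a box
p = 0.02     # probability that a bolt is defective
m = n * p    # Poisson mean, m = np = 4


def poisson_pmf(k, mean):
    """P(X = k) for a Poisson distribution: e^-mean * mean^k / k!"""
    return math.exp(-mean) * mean ** k / math.factorial(k)


# P(X <= 5) = P(X=0) + P(X=1) + ... + P(X=5)
p_at_most_5 = sum(poisson_pmf(k, m) for k in range(6))

print("P(X = 0):", round(poisson_pmf(0, m), 4))
print("P(X <= 5):", round(p_at_most_5, 4))
```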
