
SYMBIOSIS INSTITUTE OF INTERNATIONAL BUSINESS [SIIB]


Constituent of Symbiosis International University
Accredited by NAAC with A Grade

IB-11-13-Pre-Induction Basics of Business Statistics

INDEX
MODULE 2 BASICS OF BUSINESS STATISTICS
Sr. No.   Content
1         Business Statistics
            Introduction
            History
            Applications of Business Statistics
            Levels of Measurement
            Variables
2         Presentation of Data
            Types of Data
            Frequency Distribution
            Diagrammatic Presentation
            Cumulative Frequency Distribution
            Statistical graphs
            Exercise
3         Measures of Central Tendency and Dispersion
            Types of Data Presentation
            Measures of Central Tendency
            Measure of Dispersion
            Shape
            Exercise
4         Probability
            Counting Principles
            Permutations
            Combinations
            Exercise
            Probability
            Axiomatic Approach to Probability
            Addition Rule of Probability
            Exercise


BUSINESS STATISTICS
Introduction
Statistics is a mathematical science pertaining to the collection, presentation, analysis and
interpretation or explanation of data. It also provides tools for prediction and forecasting based
on data. It is applicable to a wide variety of academic disciplines, from the natural and social
sciences to the humanities, government and business.
Statistical methods can be used to summarize or describe a collection of data; this is called
descriptive statistics. In addition, patterns in the data may be modeled in a way that accounts for
randomness and uncertainty in the observations, and are then used to draw inferences about the
process or population being studied; this is called inferential statistics. Descriptive, predictive,
and inferential statistics comprise applied statistics. Business statistics is the science of good
decision making in the face of uncertainty and is used in many disciplines such as financial
analysis, econometrics, auditing, production and operations including services improvement, and
marketing research.
History
The scope of the discipline of statistics broadened in the early 19th century to include the
collection and analysis of data in general. Today, statistics is widely employed in government,
business, and the natural and social sciences.
Because of its empirical roots and its applications, statistics is generally considered not to be a
subfield of pure mathematics, but rather a distinct branch of applied mathematics. Its
mathematical foundations were laid in the 17th century with the development of probability
theory by Pascal and Fermat. Probability theory arose from the study of games of chance. The
method of least squares was first described by Carl Friedrich Gauss around 1794. The use of
modern computers has expedited large-scale statistical computation, and has also made possible
new methods that are impractical to perform manually.
In applying statistics to a scientific, industrial, or societal problem, it is necessary to begin with a
process or population to be studied. A population is an aggregate of objects, animate or inanimate.
There might be a population of people in a country, of crystal grains in a rock, or of goods
manufactured by a particular factory during a given period. It may instead be a process observed
at various times; data collected about this kind of "population" constitute what is called a time
series.
For practical reasons, rather than compiling data about an entire population, a chosen subset of
the population, called a sample, is studied. Data are collected about the sample in an
observational or experimental setting. The data are then subjected to statistical analysis, which
serves two related purposes: description and inference.


Descriptive statistics can be used to summarize the data, either numerically or
graphically, to describe the sample. Basic examples of numerical descriptors include the
mean and standard deviation. Graphical summarizations include various kinds of charts
and graphs.
Inferential statistics is used to model patterns in the data, accounting for randomness
and drawing inferences about the larger population. These inferences may take the form
of answers to yes/no questions (hypothesis testing), estimates of numerical characteristics
(estimation), descriptions of association (correlation), or modeling of relationships
(regression). Other modeling techniques include ANOVA, time series, and data mining.

If the sample is representative of the population, then inferences and conclusions made from the
sample can be extended to the population as a whole. A major problem lies in determining the
extent to which the chosen sample is representative. Statistics offers methods to estimate and
correct for randomness in the sample and in the data collection procedure, as well as methods for
designing robust experiments in the first place. (See experimental design.)
The fundamental mathematical concept employed in understanding such randomness is
probability. Mathematical statistics (also called statistical theory) is the branch of applied
mathematics that uses probability theory and analysis to examine the theoretical basis of
statistics.
The use of any statistical method is valid only when the system or population under
consideration satisfies the basic mathematical assumptions of the method. Misuse of statistics
can produce subtle but serious errors in description and interpretation: subtle in the sense that
even experienced professionals sometimes make such errors, serious in the sense that they may
affect, for instance, social policy, medical practice and the reliability of structures such as
bridges. Even when statistics is correctly applied, the results can be difficult for the non-expert to
interpret. For example, the statistical significance of a trend in the data, which measures the
extent to which the trend could be caused by random variation in the sample, may not agree with
one's intuitive sense of its significance. The set of basic statistical skills (and skepticism) needed
by people to deal with information in their everyday lives is referred to as statistical literacy.
Applications of Business Statistics
Accounting
Public accounting firms use statistical
sampling procedures when conducting
audits for their clients.


Economics
Economists use statistical information in
making forecasts about the future of the
economy or some aspect of it.

Marketing
Electronic point-of-sale scanners at retail
checkout counters are used to
collect data for a variety of marketing
research applications.

Production
A variety of statistical quality
control charts are used to monitor
the output of a production process.

Finance

Financial advisors use price-earnings ratios and dividend yields to guide their investment
recommendations.


Levels of measurement
There are four types of measurements or levels of measurement or measurement scales used in
statistics: nominal, ordinal, interval, and ratio. They have different degrees of usefulness in
statistical research.
Nominal
When the data for a variable consists of labels or names used to identify an attribute of the
element, the scale of measurement is considered to be nominal.
Example:
Students of a university are classified by the school in which they are enrolled using a nonnumeric label such as Business, Humanities, Education, and so on. Alternatively, a numeric code
could be used for the school variable (e.g. 1 denotes Business, 2 denotes Humanities, 3 denotes
Education, and so on).
Ordinal
The data have the properties of nominal data and the order or rank of the data is meaningful. A
nonnumeric label or numeric code may be used.
Example:
Students of a university are classified by their class standing using a nonnumeric label such as
Freshman, Sophomore, Junior, or Senior. Alternatively, a numeric code could be used for the
class standing variable (e.g. 1 denotes Freshman, 2 denotes Sophomore, and so on).
Interval
The data have the properties of ordinal data, and the interval between observations is expressed
in terms of a fixed unit of measure. Interval data are always numeric.
Example:
Melissa has an SAT score of 1205, while Kevin has an SAT score of 1090. Melissa scored 115
points more than Kevin.
Ratio
The data have all the properties of interval data and the ratio of two values is meaningful.
Variables such as distance, height, weight, and time use the ratio scale. This scale must contain a
zero value that indicates that nothing exists for the variable at the zero point.


Example:
Melissa's college record shows 36 credit hours earned, while Kevin's record shows 72 credit
hours earned. Kevin has twice as many credit hours earned as Melissa.
Qualitative and Quantitative Data
Data can be further classified as being qualitative or quantitative. Qualitative data include labels
or names used to identify an attribute of each element. Qualitative data use either the nominal or
ordinal scale of measurement and may be nonnumeric or numeric. Quantitative data are obtained
using either the interval or ratio scale of measurement. The statistical analysis appropriate for a
particular variable depends upon whether the variable is qualitative or quantitative.
If the variable is qualitative, the statistical analysis is rather limited. We can summarize the
qualitative data by counting the number of observations in each category or by computing the
proportion of the observations in each qualitative category.
However, if the characteristic is quantitative, arithmetic operations often provide meaningful
results.

Discrete variable
A variable taking isolated values is called a discrete variable. The graphical representation of a
discrete variable is a step function. Examples of a discrete variable are the number of people in a
group, the number of accidents occurring on a particular day, etc.
Continuous variable
A variable which can take any value within a given interval is referred to as a continuous variable,
e.g. the weight of a person, the temperature on a given day, the rainfall on a given day, etc. The
graphical presentation of a continuous variable is a curve.


PRESENTATION OF DATA
Types of Data
Primary data
Primary data are data collected for the first time by the investigator, using methods such as
surveys (census), telephone interviews, e-mail and so on.
Such data are generally referred to as raw data because they are unprocessed.
Secondary data
In some cases, data needed for a particular application already exist. Companies maintain a
variety of records or databases about their employees, and business operations.
Data are also available from a variety of industry associations and special interest organizations.
The internet continues to grow as an important source of information and statistical data. Almost
all companies maintain Web sites that provide general information about the company as well as
data on sales, number of employees, number of products etc.
Government agencies are another important source of secondary data. Information on vital
events (birth, death etc.) is available with the governmental agencies.
Sometimes information can also be collected from published journals.
Presentation of Data
After collection of data, the next stage the statistician has to go through is presentation of data.
Usually, the volume of information collected is huge, so it becomes necessary to present it in a more
systematic and concise way in order to bring out the important features or characteristics of the data.
Basically, there are two ways to represent data.
1. Tabular
2. Graphical
Let us start the discussion with the introduction of frequency distribution and various
components of the frequency distribution.
Frequency Distribution
A frequency distribution is a tabular summary of data showing the number (frequency) of items
in each of several non-overlapping classes.
When raw data are converted into a frequency distribution, the distribution provides a
summary which offers more insight than the original data.
Three steps necessary to define classes for a frequency distribution with quantitative data are:
1. Determine the number of non-overlapping classes.
2. Determine the width of each class.
3. Determine the class limits.
Number of classes
There is no specific rule for choosing the number of classes. As a general guideline, between
5 and 20 classes are chosen.


Width of the class
Generally, we choose the same width for all classes. Width is denoted by h.
\[ h = \text{Upper class boundary} - \text{Lower class boundary} \]
Equivalently, h is the difference between two consecutive upper limits, or between two consecutive
lower limits.
Mid-point of the class (class mark)
It is the mid-point of the class interval. It is denoted by x. It is obtained as
\[ \text{Class mark} = \frac{\text{Upper class limit} + \text{Lower class limit}}{2} = \frac{\text{Upper class boundary} + \text{Lower class boundary}}{2} \]
Relative Frequency
\[ \text{Relative frequency of class} = \frac{\text{Frequency of the class}}{n} \]
where n is the total number of observations.
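The definitions of class width, class mark and relative frequency can be tried out with a short Python sketch. The raw marks and the class design (5 classes of width 10 starting at 0) are hypothetical values chosen only for illustration; they do not come from the text.

```python
# Hypothetical raw data (not from the text): marks of 20 students.
marks = [12, 7, 25, 31, 18, 44, 9, 27, 36, 22,
         15, 29, 41, 33, 5, 38, 24, 19, 47, 28]

# Assumed class design: 0-10, 10-20, ..., 40-50 (upper limit excluded).
lower, width, num_classes = 0, 10, 5

# Count the observations falling in each non-overlapping class.
frequencies = [0] * num_classes
for value in marks:
    frequencies[(value - lower) // width] += 1

n = len(marks)
class_marks = [lower + i * width + width / 2 for i in range(num_classes)]
relative_frequencies = [f / n for f in frequencies]

for i in range(num_classes):
    lo = lower + i * width
    print(f"{lo}-{lo + width}: class mark = {class_marks[i]}, "
          f"f = {frequencies[i]}, relative f = {relative_frequencies[i]:.2f}")
```

The relative frequencies always add up to 1, which is a quick check that no observation was dropped or counted twice.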
Tabulation
While presenting the data one can make use of tabulation. It is the most concise way of
presentation of data.
There can be one-way, two-way or multifold tables depending on number of columns and rows
we choose.
Diagrammatic representation
One of the graphical ways of representing data is diagrammatic representation, in which one can
use bar graphs or pie-charts. A bar graph can be of the simple, multiple, sub-divided or percentage
type. This is a pictorial presentation of data in which points are not plotted according to scale.
Diagrams are more attractive and colourful than the various graphs available in statistics.


[Figure: Simple bar diagram]

[Figure: Pie-chart]

Graphical Representation
A common graphical presentation of quantitative data is a histogram. It is a series of adjacent
rectangles erected on the X-axis. It is constructed by placing the variable of interest on the horizontal
axis and the frequency, relative frequency, or percent frequency on the vertical axis.
[Figure: Histogram]


Before we study the ogive curve, let us look at the cumulative frequency distribution.


Cumulative Frequency Distribution
Cumulative frequency is obtained by adding the frequency of a class interval to the frequencies
of all the preceding intervals up to that class interval. This is explained by the example below.
The following frequency distribution table gives the marks obtained by 40 students:
Class (marks)    Frequency    Cumulative frequency
0-10             4            4
10-20            5            (4) + 5 = 9
20-30            12           (9) + 12 = 21
30-40            11           (21) + 11 = 32
40-50            8            (32) + 8 = 40

In the above table it can be observed that frequencies are added from top to bottom and also 4
students got marks 'less than 10', 9 students got marks 'less than 20' and so on.
Therefore, the above distribution is called 'less than' cumulative frequency distribution.
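The above table can be re-written as follows:
Marks           Number of students
Less than 10    4
Less than 20    9
Less than 30    21
Less than 40    32
Less than 50    40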

In the same way, a 'more than' cumulative frequency distribution can be obtained by adding the
frequencies in the reverse order, from the last class upwards. It is explained in the following table.
Class (marks)    Frequency    Cumulative frequency
0-10             4            (36) + 4 = 40
10-20            5            (31) + 5 = 36
20-30            12           (19) + 12 = 31
30-40            11           (8) + 11 = 19
40-50            8            8
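Both cumulative columns can be reproduced with a running total. The sketch below uses the class frequencies of the 40 students from the tables above; the variable names are illustrative only.

```python
from itertools import accumulate

classes = ["0-10", "10-20", "20-30", "30-40", "40-50"]
frequencies = [4, 5, 12, 11, 8]

# 'Less than' type: running total from the first class downwards.
less_than = list(accumulate(frequencies))

# 'More than' type: running total from the last class upwards.
more_than = list(accumulate(reversed(frequencies)))[::-1]

for c, f, lt, mt in zip(classes, frequencies, less_than, more_than):
    print(f"{c:>6}  f={f:>2}  less-than c.f.={lt:>2}  more-than c.f.={mt:>2}")
```

The last 'less than' value and the first 'more than' value both equal the total number of observations (40), which is a useful check on the arithmetic.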


The above table can be re-written as follows:
Marks          Number of students
0 or more      40
10 or more     36
20 or more     31
30 or more     19
40 or more     8

Ogive curve
It is a cumulative frequency curve. There are two types of ogive curve: the 'less than' ogive and
the 'more than' ogive. An ogive is drawn by taking data values on the horizontal axis and
cumulative frequencies on the vertical axis.
Example
Draw a 'less than' ogive curve for the following data

To Plot an Ogive:
(i) Plot the points whose abscissae are the actual upper class limits and whose ordinates are the
cumulative frequencies. Here the points are (10, 2), (20, 10), (30, 22), (40, 40), (50, 68), (60, 90),
(70, 96) and (80, 100).
(ii) Join the points plotted by a smooth curve.
(iii) An Ogive is connected to a point on the X-axis representing the actual lower limit of the first
class.
Scale:
X-axis 1 cm = 10 marks, Y-axis 1 cm = 10 c.f.


Example
Using the data given below, construct a 'more than' cumulative frequency table and draw the
Ogive.

To Plot an Ogive
(i) Plot the points whose abscissae are the actual lower class limits and whose ordinates are the
cumulative frequencies. (70.5, 2), (60.5, 7), (50.5, 13), (40.5, 23), (30.5, 37), (20.5, 49),
(10.5, 57), (0.5, 60) are the coordinates of the points.
(ii) Join the points by a smooth curve.
(iii) An Ogive is connected to a point on the X-axis representing the actual upper limit of the last
class [in this case, the point (80.5, 0)].
Scale:
X-axis 1 cm = 10 marks
Y-axis 2 cm = 10 c.f.


Frequency Polygon
In a frequency distribution, the mid-value of each class is obtained. Then, on graph paper, the
frequency is plotted against the corresponding mid-value. These points are joined by straight
lines. These straight lines may be extended in both directions to meet the X-axis to form a
polygon.
Example
The weights of 50 students are recorded below. Draw a frequency polygon for this data.

Answer

If the points of the above graph are joined by a smooth curve instead of straight lines, the result is known as a frequency curve.

Exercise
The raw data displayed below are the electric and gas utility charges (in Rs.) during the month of
July 1990 for a random sample of 50 one-bedroom apartments in Mumbai:
 96  171  202  178  147  102  153  197  127   82
157  185   90  116  172  111  148  213  130  165
141  149  206  175  123  128  144  168  109  167
 95  163  150  154  130  143  187  166  139  149
108  119  183  151  114  135  191  137  129  158

a. Form a frequency distribution having 7 class intervals with the following class
boundaries: Rs. 80 but less than Rs. 100, Rs. 100 but less than Rs. 120, and so on.
b. Form the percentage distribution from the frequency distribution developed in a.
c. From the percentage distribution developed in b:
   i. Plot the percentage histogram.
   ii. Plot the percentage polygon.
d. From the frequency distribution developed in a:
   i. Approximate the mean, mode, range, midrange, standard deviation and coefficient
      of variation.
   ii. Based on Chebyshev's rule, between what two values would we estimate that
       at least 75% of the data are contained?
   iii. What percentage of the data are actually contained within 2 S.D. of the mean?
   iv. Compare the above results with those in part ii.
e. From the frequency distribution developed in a:
   i. Form the cumulative frequency distribution.
   ii. Form the cumulative percentage distribution.
   iii. Plot the ogive.
   iv. Approximate the median, Q1, Q3, the midhinge and the interquartile range.


MEASURES OF CENTRAL TENDENCY AND DISPERSION
Types of Data Presentation
Generally, data can be arranged in one of the following three ways.
Series of individual observations
x1, x2, ..., xn
Ungrouped Frequency Distribution
(xi, fi); i = 1, 2, ..., n
xi : i-th observation in the series
fi : frequency of the i-th observation in the series
Grouped Frequency Distribution
(xi, fi); i = 1, 2, ..., k
xi : midpoint of the i-th class
fi : frequency of the i-th class

Describing and Summarizing Data


Three major properties which describe a batch of numerical data are
Central Tendency
Dispersion
Shape
Summary measures computed from a sample of data are called statistics.
Descriptive summary measures computed from an entire population are called parameters.

Measure of central tendency/Location


Most batches of data show a distinct tendency to group or cluster about a certain central value.
Hence, it is generally possible to select some typical value, called an average, to describe the
entire batch. Such a typical value is a measure of central tendency or location.
Different measures of central tendency are
Arithmetic Mean
Median
Mode
Midrange
Midhinge


Arithmetic Mean
It is obtained by adding the raw scores and dividing the sum by the number of items.
Properties
Based on each and every observation in the series.
Capable of further mathematical treatment.
Gives a distorted representation of the data under study if the data contain outliers, i.e. it is
greatly affected by extreme observations.
To find the mean of raw data
Suppose the raw scores are
x1, x2, x3, ..., xN

then, the mean is
\[ M = \frac{\sum x}{N} \]
where, M = mean
x = each score or item
N = number of items
\(\sum\) = sigma, which means 'summation of'
Example: Find the mean of 6, 10, 4, 12, 8.
\[ M = \frac{6 + 10 + 4 + 12 + 8}{5} = \frac{40}{5} = 8 \]
M = 8
To find mean for grouped data
\[ M = \frac{\sum f x}{\sum f} \]
where, x is the mid-interval (class mark)
M is the mean
f is the frequency
Example: Find the mean for the following table by the 'Direct Method'

Example: Calculate the mean marks in the distribution given below.


= 29.75

Median
Median is defined as the middle value in an ordered sequence of data.
It is not affected by the magnitude of extreme observations, but it is affected by the number of observations.
Example: Find the median of 83, 37, 70, 29, 45, 63, 41, 70, 30, 54
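Data in ascending order: 29, 30, 37, 41, 45, 54, 63, 70, 70, 83
n = 10 (even), so the median is the mean of the two middle-most scores (the 5th and the 6th):
\[ \text{Median} = \frac{45 + 54}{2} = 49.5 \]
Median = 49.5
Example: Find the median of 15, 8, 14, 20, 13, 12, 16.
Series in order is 8, 12, 13, 14, 15, 16, 20
n = 7 (odd), so the median is the \(\frac{n+1}{2}\)-th score, i.e. the 4th score.
Median = 14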


Mode
Mode is defined as the value in a batch of data which occurs most frequently.
It does not get affected by extreme observations.
It is used only for descriptive purposes because it varies more from sample to sample than
other measures of central tendency.
Example: Find the mode of 43, 42, 44, 40, 48, 45, 40, 40
The given series is 40, 40, 40, 42, 43, 44, 45, 48
Since 40 is the most repeated score,
Mode = 40
Midrange
It is defined as the average of the two extremes of the data.
Let xmax and xmin be the two extremes of the data; then the midrange is defined as
\[ \text{Midrange} = \frac{x_{\max} + x_{\min}}{2} \]
The main drawback of this is that it becomes distorted as a summary measure of central tendency
if an outlier is present.
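The worked examples in this section can be checked with Python's standard `statistics` module. The sketch below reuses the numbers from the examples above; for the grouped mean it borrows the marks table of 40 students from the cumulative frequency section, a choice made here for illustration (the text does not compute that mean).

```python
import statistics

# Arithmetic mean of raw data (example from the text).
mean_value = statistics.mean([6, 10, 4, 12, 8])

# Median with an even and an odd number of observations (examples from the text).
median_even = statistics.median([83, 37, 70, 29, 45, 63, 41, 70, 30, 54])
median_odd = statistics.median([15, 8, 14, 20, 13, 12, 16])

# Mode and midrange (example data from the text).
scores = [43, 42, 44, 40, 48, 45, 40, 40]
mode_value = statistics.mode(scores)
midrange_value = (max(scores) + min(scores)) / 2

# Mean of grouped data: M = sum(f * x) / sum(f), with x the class marks.
class_marks = [5, 15, 25, 35, 45]      # classes 0-10, 10-20, ..., 40-50
frequencies = [4, 5, 12, 11, 8]
grouped_mean = (sum(f * x for f, x in zip(frequencies, class_marks))
                / sum(frequencies))

print(mean_value, median_even, median_odd, mode_value, midrange_value, grouped_mean)
```

The grouped mean is only an approximation of the mean of the raw marks, because every observation in a class is treated as if it sat at the class mark.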
Measures of Dispersion
A measure of location alone cannot reveal all the characteristics of the data under study. For
example, two series with the same measure of central tendency may have different patterns of
variation, so comparing them by their averages alone would be misleading. A measure that
captures this variation is called a measure of dispersion.
The following measures of dispersion are most frequently used.
Range
Variance
Standard Deviation
Coefficient of Variation
Range
It is a crude measure of dispersion. It measures the total spread in the batch of data.
It is given by
\[ \text{Range} = x_{\max} - x_{\min} \]
It fails to take into account how the data are distributed between the smallest and the largest
values.


Variance
It is based on each and every observation in the series. It is defined as the mean of the squared
deviations of the observations about the mean.
Standard Deviation
It is the most commonly used measure of dispersion. It is defined as positive square root of the
variance.
Variance and standard deviation reflect how data are varying. They measure the average scatter
around the mean- that is, these measures evaluate how the values fluctuate about the mean.
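Standard deviation is calculated using the following formulae, where M is the mean.
For an individual series,
\[ \text{S.D.} = \sqrt{\frac{\sum (x - M)^2}{N}} \]
For a frequency distribution,
\[ \text{S.D.} = \sqrt{\frac{\sum f (x - M)^2}{\sum f}} \]
The square of the standard deviation is known as the variance.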


Coefficient of Variation
It is a relative measure of dispersion. It is particularly used when comparing the variability of
two or more batches of data that are expressed in different units of measurement. C.V. is also
used in a situation where we want to compare two or more sets of data which are measured in the
same units but differ to such an extent that the direct comparison of the respective standard
deviation is not very useful.
\[ \text{Coefficient of variation (C.V.)} = \frac{\text{S.D.}}{\text{A.M.}} \times 100\% \]

Example: Calculate the standard deviation and the variance for the following data
7, 8, 11, 6, 13, 8, 10.

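Answer
\[ M = \frac{7 + 8 + 11 + 6 + 13 + 8 + 10}{7} = \frac{63}{7} = 9 \]
\[ \sum (x - M)^2 = 4 + 1 + 4 + 9 + 16 + 1 + 1 = 36 \]
\[ \text{Variance} = \frac{\sum (x - M)^2}{N} = \frac{36}{7} = 5.14 \]
\[ \text{S.D.} = \sqrt{\frac{36}{7}} = 2.27 \]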
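The same calculation can be followed step by step in Python. This sketch uses the example data 7, 8, 11, 6, 13, 8, 10 and divides by N, matching the answer above; the coefficient of variation at the end is an extra step added here for illustration.

```python
import math

data = [7, 8, 11, 6, 13, 8, 10]
n = len(data)

mean = sum(data) / n                                # M = 63 / 7
squared_deviations = [(x - mean) ** 2 for x in data]
variance = sum(squared_deviations) / n              # divide by N, as in the text
std_dev = math.sqrt(variance)                       # positive square root of the variance
cv_percent = std_dev / mean * 100                   # C.V. = (S.D. / A.M.) x 100

print(f"mean = {mean}, variance = {variance:.2f}, "
      f"S.D. = {std_dev:.2f}, C.V. = {cv_percent:.1f}%")
```

Many software packages divide by N - 1 by default when the data are treated as a sample, so their results will differ slightly from the N-divisor formula used here.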

Shape


For Symmetric Distribution,


Mean = Median = Mode
For Right Skewed (Positively Skewed) Distribution, the mean is pulled upward by extremely large
observations. In this case,
mode < median < mean < midrange
For Left Skewed (Negatively skewed) Distribution,
midrange < mean < median < mode
Quartiles
Quartiles are partition values that divide an ordered batch of data into four equal parts. A quartile
is a useful measure of non-central location. It is often employed when one wants to summarize or
describe the properties of large batches of quantitative data.
There are three quartiles, Q1, Q2 and Q3; Q2 is the median.
Midhinge
The midhinge is the mean of the first and third quartiles in a batch of data, i.e. (Q1 + Q3)/2. It is
used to overcome potential problems introduced by extreme values in the data. It is a measure of
central tendency.
Interquartile Range
It is a measure of dispersion which measures the spread of the middle 50% of the observations,
i.e. Q3 - Q1. Hence, it is not affected by extreme observations.
For Symmetric distribution
median =midhinge = midrange = mean=mode
For Positively Skewed distribution
mode < median < midhinge < mean < midrange
For Negatively Skewed distribution
midrange < mean < midhinge < median < mode
The Five Number Summary

Median, midhinge and interquartile range are called resistant statistics because they are
relatively insensitive to extreme values.
In order to get a better idea about the shape of the distribution, we use the five number summary.
These five numbers are:
Xmin, Q1, Q2, Q3 and Xmax
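As an illustration, the quartiles, midhinge, interquartile range and five number summary can be computed with Python's `statistics.quantiles`. The data are the ten scores from the median example earlier. Textbooks and software use several slightly different rules for locating quartiles; the "inclusive" interpolation rule used below is one such convention and is an assumption of this sketch, not necessarily the rule intended by the text.

```python
import statistics

data = sorted([83, 37, 70, 29, 45, 63, 41, 70, 30, 54])

# Three cut points that split the ordered data into four parts.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")

midhinge = (q1 + q3) / 2        # mean of the first and third quartiles
iqr = q3 - q1                   # spread of the middle 50% of the observations

five_number_summary = (min(data), q1, q2, q3, max(data))
print(five_number_summary, midhinge, iqr)
```

Here Q2 agrees with the median of 49.5 found earlier, whichever quartile convention is used.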


Exercise

1. In a class of 50 students, 10 have failed and their average of marks is 2.5. The total marks
secured by the entire class were 281. Find the average marks of students who have
passed.
2. What will be the mean and the median of 7 consecutive integers, the least of which is x?
3. Mean and median of 51 items are 100 and 95 respectively. At the time of calculations
two items 180 and 90 were wrongly taken as 100 and 10. What are the correct values of
mean and median?
4. The mean of a group of 10 observations is 15. Fifteen more observations are added to this
group and the mean of these 25 observations is found to be 12. Find the mean of the
additional 15 observations.
5. The mean of a group of 20 items is 30. Find the mean if each value is doubled and
increased by 5.
6. Calculate population variance from the following information;
n = 15, Σx = 480, Σx² = 15735
7. Means and variances of two series are given below:
Mean

Variance

Series A

54

Series B

100

Which series is more stable?


8. Two samples of size 40 and 45 respectively have the same mean 53, but different
standard deviations 19 and 8. Find the standard deviation of the combined group.
9. Find population variance of observations 1, 2, 3, 4, 5, 6, 7, 8, 9, 10. Compare its variance
with population variance of 11, 12, 13, 14, 15, 16, 17, 18, 19 and 20.
10. The mean and the standard deviation of population of 100 items were found to be 50 and
5 respectively. If at the time of calculations, two items were wrongly taken as 40 and 50
instead of 60 and 30, find the correct standard deviation.
----------


PROBABILITY
Counting Principles
Addition
If two different operations can be performed in m and n different ways respectively, then the
number of ways in which either operation 1 or operation 2 can be performed is (m + n).
Multiplication
If two different operations can be performed in m and n different ways respectively, then the
number of ways in which both operation 1 and operation 2 can be performed is (m × n).
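A small enumeration makes the two principles concrete. The routes, shirts and ties below are hypothetical examples invented for this sketch; they are not from the text.

```python
from itertools import product

# Addition principle: travel by ONE of 3 bus routes OR ONE of 2 train routes.
bus_routes = ["B1", "B2", "B3"]
train_routes = ["T1", "T2"]
either_or = bus_routes + train_routes        # 3 + 2 possible choices

# Multiplication principle: choose one of 3 shirts AND one of 2 ties.
shirts = ["white", "blue", "grey"]
ties = ["red", "black"]
both = list(product(shirts, ties))           # 3 * 2 possible outfits

print(len(either_or), len(both))
```

The addition principle applies when exactly one of the two operations is carried out; the multiplication principle applies when both are carried out, one after the other.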
Permutations

A permutation is an arrangement of things in which the order of arrangement is important.
Broadly speaking, problems on permutations fall into two cases.
Case I

The number of arrangements of n distinct things taken r at a time is denoted by nPr.


Examples:
1) 2 and 3 are two digits and with these digits, the numbers 32 and 23 are formed. Although,
numbers viz., 32 and 23 consist of the digits 2 and 3, the order of digits is different. Each of the
above arrangements is called a 'permutation'. Thus, the number of arrangements or permutations
of two distinct digits 2 and 3 is 2.

2) The permutations of the three letters a, b, c taken two at a time are ab, ba, ac, ca, bc, cb.

The number of permutations of n dissimilar things taken r at a time without repetition is
denoted by nPr and is given by
\[ {}^{n}P_{r} = \frac{n!}{(n-r)!} \]

The number of permutations of n different things taken r at a time is the same as the number
of ways of filling r positions, arranged in a straight line, with n letters.
Each position is accommodating only one letter. We may fill the first position with any one of
the n letters. Having filled the first position in any one of these n ways, we have (n-1) letters with
which to fill the next position.
Having filled the first two positions, we have (n-2) letters with which to fill the third position.
Proceeding in this way, one can see that filling r positions is like performing r different
operations in n, (n-1), (n-2), ..., (n-r+1) different ways respectively. Since we have to fill all r
positions, we multiply the respective numbers of ways. Therefore, the total number of
ways in which r positions can be filled with n letters without repetition is
n(n-1)(n-2)...(n-r+1).
Thus, the number of r-permutations of n different things, denoted by nPr = P(n, r), is given by
\[ {}^{n}P_{r} = n(n-1)(n-2)\cdots(n-r+1) \]

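If we put r = n in the above formula, then
\[ {}^{n}P_{n} = n(n-1)(n-2)\cdots 1 = n! \]
For this to agree with the factorial form nPr = n!/(n-r)! when r = n, we take 0! = 1.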
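The position-filling argument can be sketched in Python and compared with a direct enumeration. The helper name `n_p_r` is illustrative; `math.perm` and `itertools.permutations` are standard library functions (Python 3.8 or later for `math.perm`).

```python
import math
from itertools import permutations

def n_p_r(n, r):
    """Fill r positions one by one: n ways, then n-1, ..., then n-r+1."""
    ways = 1
    for choices in range(n, n - r, -1):
        ways *= choices
    return ways

# All arrangements of the letters a, b, c taken two at a time.
arrangements = ["".join(p) for p in permutations("abc", 2)]
print(arrangements)

print(n_p_r(3, 2), math.perm(3, 2))          # product form vs library function
print(n_p_r(5, 5), math.factorial(5))        # r = n gives n!
```

The enumeration lists each ordered pair once, so its length matches the value given by the product n(n-1)...(n-r+1).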


Properties

Case II
Circular Permutations
When things are arranged in places along a line with first and last place, they form a linear
permutation. So far we have dealt only with linear permutations. When things are arranged in
places along a closed curve or a circle, in which any place may be regarded as the first or last
place, they form a circular permutation.
Thus, the number of permutations of 4 objects in a row = 4!, whereas the number of circular
permutations of 4 objects is (4-1)! = 3!.
The permutation in a row or along a line has a beginning and an end, but there is nothing like
beginning or end or first and last in a circular permutation. In circular permutations, we consider
one of the objects as fixed and the remaining objects are arranged as in linear permutation.
The following arrangements of 4 objects O1, O2, O3, O4 in a circle will be considered as one or
same arrangement


Observe carefully that when arranged in a row, O1O2O3O4, O2O3O4O1, O3O4O1O2 and O4O1O2O3
are different permutations. When arranged in a circle, these 4 permutations are considered as one
permutation.
Theorem:
The number of circular permutations of n different objects is (n-1)!.
Proof:
Each circular permutation corresponds to n linear permutations depending on where we start.

Since there are exactly n! linear permutations, there are exactly
\[ \frac{n!}{n} = (n-1)! \]
circular permutations. Hence, the number of circular permutations is (n-1)!.

Example

Suppose n guests are to be seated around a circular table. We fix the position
of one of the guests (which can be done in only one way) and then arrange the remaining (n-1) guests
in (n-1) positions, just as in the linear case. Thus, the total number of ways in which n guests can be
arranged in a circular manner is (n-1)!.
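The (n-1)! result can also be checked by brute force for small n. The sketch below is an illustration only: it lists every linear permutation, rotates each one so that a fixed reference object comes first (the same idea as fixing one guest), and counts the distinct results. The function name circular_arrangements is hypothetical.

```python
from itertools import permutations
from math import factorial

def circular_arrangements(objects):
    """Count arrangements around a circle, treating rotations as the same."""
    reference = objects[0]
    seen = set()
    for p in permutations(objects):
        i = p.index(reference)
        canonical = p[i:] + p[:i]   # rotate so the reference object is first
        seen.add(canonical)
    return len(seen)

four = ["O1", "O2", "O3", "O4"]
print(circular_arrangements(four))   # number of circular permutations of 4 objects
print(factorial(len(four) - 1))      # (4-1)! for comparison
```

Each group of n linear permutations that differ only by rotation collapses to a single canonical form, which is why the count is n!/n.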
Combinations
Selections of a number of things, taking some or all of them at a time, are called combinations.
The number of ways of selecting r things out of n dissimilar things is denoted by C(n, r) or nCr,
and is given by nCr = n!/(r!(n-r)!).
Example:
From a class of 32 students, 4 are to be chosen for a competition. In how many ways can this be
done?
We are to select 4 students from 32. This selection can be done in
32C4 = 32!/(4! 28!) = (32 x 31 x 30 x 29)/(4 x 3 x 2 x 1) = 35960 ways.

Note that there is a relationship between permutations and combinations: nPr = r! x nCr, since
each selection of r things can be arranged in r! ways. For a given set of n dissimilar things, the
number of permutations is therefore never less than the corresponding number of combinations.
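As a quick check of the class example and of the link between permutations and combinations, the following Python sketch uses the standard library functions math.comb and math.perm. The helper name choose is only illustrative.

```python
import math

def choose(n: int, r: int) -> int:
    """Number of ways of selecting r things out of n: n!/(r!(n-r)!)."""
    return math.factorial(n) // (math.factorial(r) * math.factorial(n - r))

n, r = 32, 4
print(choose(n, r))                      # ways to choose 4 students from 32
print(choose(n, r) == math.comb(n, r))   # agrees with the standard library

# Each selection of r things can be arranged in r! ways
print(math.perm(n, r) == math.factorial(r) * math.comb(n, r))
```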


Properties

C(n,0) = C(n,n) = 1

Difference between a Permutation and a Combination


In a combination, only a selection is made. In a permutation, a selection is made and the
selected things are also arranged in a definite order.
The order of selection does not matter in a combination. In a permutation, order is essential.
Usually (i.e., except in special or trivial cases), the number of permutations exceeds
the number of combinations.


Exercise

1. A gentleman has 6 friends to invite. In how many ways can he send invitation cards to
them if he has 3 servants to carry the cards?
2. How many numbers, each lying between 100 and 1000, can be formed with digits 2, 3, 4,
0, 8, 9 (if repetitions of digits are not allowed)?
3. How many three digit numbers divisible by 5 can be formed using any numerals from 0
to 9 without repetition?
4. There are 10 points in a plane, of which 3 are collinear. Find the number of triangles
formed by joining these points.
5. From 7 engineers and 4 doctors a committee of 5 members is to be formed. In how many
ways can this be done
i. To include exactly one doctor?
ii. To include at least one doctor?

6. There are 2 books each of 3 volumes and 2 books each of 2 volumes. In how many ways
can these be arranged on a shelf so that the volumes of the same book remain together?
7. A company has 11 computer engineers and 7 mechanical engineers. In how many ways
can they be seated in a row so that no 2 of the mechanical engineers may sit together?
8. A company has 11 computer engineers and 7 mechanical engineers. In how many ways
can they be seated in a row so that all the mechanical engineers do not sit together?
9. How many words can be formed using the letters of the word MATHEMATICS if
i. there is no restriction
ii. all the vowels are together
iii. vowels are together and consonants are together
10. A person has 12 friends and he wants to invite 8 of them to a birthday party. Find
i. in how many ways this can be done if 3 particular friends always attend the party
ii. in how many ways this can be done if 3 particular friends never attend the party

--------


Probability
In our day-to-day life, we come across many uncertain events.
We wake up in the morning and check the weather report. The statement could be 'there is a 60%
chance of rain today'. This statement implies that rain is more likely than dry
weather. We decide upon our breakfast from a statement that "corn flakes might reduce
cholesterol". What is the chance of getting a flat tyre on the way to an important appointment?
And so on.
How probable is an event? We generally infer this by repeated observation of such events and
their long-term patterns.
Probability is the branch of mathematics devoted to the study of such events.
People have always been interested in games of chance and gambling. Games
such as dice are known to have existed since 3000 BC. But such games were not treated
mathematically till the fifteenth century. During this period, the calculation and theory of
probability originated in Italy.
Later, in the seventeenth century, the French mathematicians Pascal and Fermat contributed to this
field of study. The foundation of modern probability theory is credited to the Russian
mathematician Kolmogorov. He proposed the axioms on which the present subject of
probability is based.
Random Experiment and Sample Space
An experiment repeated under essentially homogeneous and similar conditions results in an
outcome, which is either unique or one of several possible outcomes. When
the result is unique, the experiment is called a 'deterministic' experiment.
Example:
While measuring the inner radius of an open tube using slide calipers, we get the same result each
time we repeat the experiment. Many scientific and engineering experiments are
deterministic.
If the outcome is one of several possible outcomes, then the experiment is called a
'random' or 'nondeterministic' experiment.
In other words, any experiment whose outcome cannot be predicted in advance, but is one of a
set of possible outcomes, is called a random experiment.
If we think of an experiment as being performed repeatedly, then each repetition is called a trial.
We observe an outcome for each trial.
Example:
An experiment consists of 'tossing a die and observing the number on the uppermost face'.
In such cases, we talk of chance or probability, which numerically measures the degree of chance
of the occurrence of events.
Sample Space (S)
The set of all possible outcomes of a random experiment is called the sample space associated
with the random experiment.


Note:
Each element of S denotes a possible outcome. Each element of S is known as a sample point.
Any trial results in an outcome and corresponds to one and only one element of the
set S.
e.g.,
1. In the experiment of tossing a coin,
S = {H, T}
2. In the experiment of tossing two coins simultaneously,
S = {HH, HT, TH, TT}
3. In the experiment of throwing a pair of dice,
S = {(1,1), (1,2), (1,3), (1,4), (1,5), (1,6), (2,1), (2,2), ..., (6,1), (6,2), (6,3), (6,4), (6,5),
(6,6)}
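Sample spaces such as these can be listed mechanically. The following is a minimal Python sketch using itertools.product from the standard library; the variable names are only illustrative.

```python
from itertools import product

# Tossing two coins simultaneously: every ordered pair of H and T
coin = ["H", "T"]
two_coins = ["".join(pair) for pair in product(coin, repeat=2)]
print(two_coins)

# Throwing a pair of dice: every ordered pair (first die, second die)
two_dice = list(product(range(1, 7), repeat=2))
print(len(two_dice))                  # number of sample points
print(two_dice[0], two_dice[-1])      # first and last sample points
```

By the multiplication principle, the pair of dice has 6 x 6 sample points.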
Events
An event is the outcome or a combination of outcomes of an experiment. In other words, an
event is a subset of the sample space.

Consider a random experiment of rolling of a six faced die.


The sample space of this experiment is S = {1,2,3,4,5,6}.
Let A be the event that the number on the uppermost face is odd. Then the corresponding set of
favourable outcomes is {1,3,5}, i.e. A = {1,3,5}.
Let B be the event that the number on the uppermost face is even. Then B = {2,4,6}.
Let C be the event that the number on the uppermost face is above 7. This set is certainly a
null set or an empty set because there is no favourable outcome. Thus, C = ∅.
Let D be the event that the number on the uppermost face is an integer between 1 and 6, both
inclusive. Then D = {1,2,3,4,5,6} = S.
Let E be the event that the outcome is less than 2. Then E = {1}.
Types of Events
As we have different types of sets, we have different types of events.
We illustrate different types of events using above example.
Simple Event
If an event has exactly one element of the sample space, it is called a simple or elementary event.
In the above example, E = {1} is a simple event.
Compound Event
If an event has more than one sample point, the event is called a compound event.
In the above example, A = {1,3,5} is a compound event.


Null Event (∅)
As the null set is a subset of S, it is also an event, called the null event or impossible event. In the
above example, C is a null event.

Sure Event
In the above experiment, the sample space is S = {1, 2, 3, 4, 5, 6}. The event represented by D
occurs whenever the experiment is performed. Therefore, the event D is called a sure event or
certain event.
Complement of an Event
The complement of an event A with respect to S is the set of all the elements of S which are not
in A. The complement of A is denoted by A' or A^c.
Note:
In an experiment if A has not occurred then A' has occurred.

Algebra of Events
In a random experiment, considering S (the sample space) as the universal set, let A, B and C be
events of S. We can define the union, intersection and complement of events and their properties
on S, which are similar to those in set theory.

ii) A-B is an event, which is the same as "A but not B"

vii)
Union of two events
If A and B are two events defined on the sample space S, then "A or B", written (A ∪ B), denotes the
event of the occurrence of at least one of the events A or B.
Intersection of two events
Intersection of two events A and B is the joint occurrence of these two events. It is denoted by
(A ∩ B).


Mutually Exclusive Events


Two events associated with a random experiment are said to be mutually exclusive if both
cannot occur together in the same trial or, in other words, if the occurrence of one prevents the
occurrence of the other.
In the above experiment, the events A = {1,3,5} and B = {2,4,6} are mutually exclusive.
Symbolically,
(A ∩ B) = ∅
where (A ∩ B) is the event that both A and B occur.
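Since events are subsets of the sample space, the operations on events can be tried out directly with Python sets. This is a small sketch using the die-rolling events defined earlier; the helper name mutually_exclusive is only illustrative.

```python
# Sample space and events for one roll of a six-faced die
S = {1, 2, 3, 4, 5, 6}
A = {1, 3, 5}   # odd number
B = {2, 4, 6}   # even number
E = {1}         # number less than 2

union_AB = A | B          # union: at least one of A, B occurs
intersection_AB = A & B   # intersection: both A and B occur
complement_A = S - A      # complement: A does not occur

def mutually_exclusive(X, Y):
    """True when the two events share no sample point."""
    return X.isdisjoint(Y)

print(union_AB == S)               # A or B always occurs
print(intersection_AB == set())    # A and B never occur together
print(complement_A == B)           # "not odd" is the same event as "even"
print(mutually_exclusive(A, B))    # True
print(mutually_exclusive(A, E))    # False, since outcome 1 lies in both
```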

Events E1, E2, ..., En associated with a random experiment are said to be pair-wise mutually
exclusive if Ei ∩ Ej = ∅ for every pair i ≠ j.

Exhaustive Event
For a random experiment, let E1, E2, E3, ..., En be subsets of the sample space S.

E1, E2, E3, ..., En form a set of exhaustive events if E1 ∪ E2 ∪ E3 ∪ ... ∪ En = S.

Independent Events

Events are said to be independent if the occurrence of one event does not affect the probability
of occurrence of the others.
Let A and B be two events defined on sample space S. Events A and B are said to be
independent if P(A ∩ B) = P(A) P(B).
Note:
If A and B are independent, then
i) A^c and B^c are independent
ii) A^c and B are independent
iii) A and B^c are independent
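The usual product criterion for independence, P(A ∩ B) = P(A) P(B), can be checked on the pair-of-dice sample space. The sketch below is illustrative only: the events A, B and C are chosen here for the demonstration and are not part of this material, and probabilities are computed as favourable outcomes divided by total outcomes.

```python
from fractions import Fraction
from itertools import product

S = set(product(range(1, 7), repeat=2))   # throwing a pair of fair dice

def prob(event):
    """Favourable outcomes divided by total outcomes (equally likely case)."""
    return Fraction(len(event), len(S))

def independent(X, Y):
    """Product criterion: P(X and Y) equals P(X) times P(Y)."""
    return prob(X & Y) == prob(X) * prob(Y)

A = {s for s in S if s[0] % 2 == 0}       # first die shows an even number
B = {s for s in S if s[1] == 6}           # second die shows 6
C = {s for s in S if s[0] + s[1] >= 11}   # total is 11 or 12
Ac, Bc = S - A, S - B                     # complements of A and B

print(independent(A, B))                  # the two dice do not influence each other
print(independent(Ac, Bc), independent(Ac, B), independent(A, Bc))
print(independent(A, C))                  # a high total depends on the first die
```

Exact fractions are used instead of floating-point numbers so that the equality test is reliable.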

Partition of the sample space


A set of events E1, E2, E3, ..., En on S is said to form a partition of the sample space S
if the events are collectively exhaustive and mutually exclusive,
i.e. if Ei ∩ Ej = ∅ for every pair i ≠ j and E1 ∪ E2 ∪ E3 ∪ ... ∪ En = S.


Equally Likely Outcomes


The outcomes of a random experiment are said to be equally likely, if each one of them has equal
chance of occurrence.

Example: The outcomes of an unbiased coin are equally likely.


Probability of an Event
So far, we have introduced the sample space of an experiment and used it to describe events. In this
section, we introduce probabilities associated with the events.
Let S be the sample space associated with the random experiment. Further, let S be finite and
equally likely, i.e. let there be n (finite) sample points in S and let each one of them be
equally likely. Let A be an event defined on S. Then the probability of occurrence of event A is
denoted by P(A) and is given by
P(A) = m/n
where m is the number of outcomes favourable for the occurrence of the event A.
Note 1: 0 ≤ P(A) ≤ 1, as 0 ≤ m ≤ n.
Note 2: If P(A) = 0 then A is called a null event, or impossible event.
Note 3: If P(A) = 1 then A is called a sure event.
Note 4: If m is the number of cases favourable to A, then n-m is the number of cases favourable to
the "non-occurrence of A", so that P(A') = (n-m)/n = 1 - P(A).
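The rule "favourable outcomes divided by total outcomes" translates directly into code. The following is a minimal sketch for a finite, equally likely sample space, using the die example; the function name classical_probability is only illustrative.

```python
from fractions import Fraction

def classical_probability(event, sample_space):
    """P(A) = m/n for a finite, equally likely sample space."""
    favourable = event & sample_space    # the m favourable sample points
    return Fraction(len(favourable), len(sample_space))

S = {1, 2, 3, 4, 5, 6}   # rolling a six-faced die
A = {1, 3, 5}            # odd number on the uppermost face

p_A = classical_probability(A, S)
p_not_A = classical_probability(S - A, S)

print(p_A)                                 # probability of an odd number
print(p_not_A == 1 - p_A)                  # non-occurrence of A
print(classical_probability(set(), S))     # null event
print(classical_probability(S, S))         # sure event
```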

Axiomatic Approach to Probability


The axiomatic approach to probability closely relates the theory of probability to set theory.
Let S be the sample space of an experiment. Probability is a function which associates a
non-negative real number, denoted by P(A), with every event A of the sample space and satisfies the
following axioms:
For every event A in S, P(A) ≥ 0.
P(S) = 1
If A1, A2, A3, ..., An are mutually exclusive events in S, then
P(A1 ∪ A2 ∪ A3 ∪ ... ∪ An) = P(A1) + P(A2) + P(A3) + ... + P(An)
The following results follow from these axioms:
P(A^c) = 1 - P(A)
P(∅) = 0


Addition Rule of Probability


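If A and B are any two events, then
P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
If A and B are mutually exclusive events, then
P(A ∪ B) = P(A) + P(B)
If A, B, C are any three events, then
P(A ∪ B ∪ C) = P(A) + P(B) + P(C) - P(A ∩ B) - P(B ∩ C) - P(A ∩ C) + P(A ∩ B ∩ C)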
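The standard addition rule, P(A ∪ B) = P(A) + P(B) - P(A ∩ B), and its three-event form (add the single probabilities, subtract the pairwise intersections, add back the triple intersection) can be verified numerically on the die example. In this sketch the events F and G are hypothetical, introduced only for the check.

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}   # one roll of a fair die

def prob(event):
    """Favourable outcomes divided by total outcomes on S."""
    return Fraction(len(event & S), len(S))

A = {1, 3, 5}   # odd number
F = {1, 2, 3}   # number less than 4 (hypothetical event)
G = {3, 6}      # multiple of 3 (hypothetical event)

# Two events: subtract the overlap so it is not counted twice
two_event_lhs = prob(A | F)
two_event_rhs = prob(A) + prob(F) - prob(A & F)

# Three events: subtract pairwise overlaps, add back the triple overlap
three_event_lhs = prob(A | F | G)
three_event_rhs = (prob(A) + prob(F) + prob(G)
                   - prob(A & F) - prob(F & G) - prob(A & G)
                   + prob(A & F & G))

print(two_event_lhs, two_event_rhs)
print(three_event_lhs, three_event_rhs)
```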


Exercise

1. A sample of 500 respondents was selected in a large metropolitan area in order to
determine various information concerning consumer behavior. Among the questions
asked was "Do you enjoy shopping for clothing?" Of 240 males, 136 answered yes. Of
260 females, 224 answered yes. What is the probability that a respondent chosen at
random
i. Is a male?
ii. Enjoys shopping for clothing?
iii. Is a female?
iv. Does not enjoy shopping for clothing?
2. A five-digit number is to be formed from the digits 1, 2, 3, 4 and 5 without repetition. What is the
probability that the number is divisible by 4?
3. What is the probability that a leap year will have 52 Tuesdays?
4. Two friends A and B apply for two vacancies at the same post. The chances of their
selection are 0.25 and 0.20 respectively. What is the chance that
i. One of them will be selected?
ii. Both will be selected?
iii. None of them will be selected?
5. The probability that a man will be alive 25 years hence is 0.3 and the probability that his wife
will be alive 25 years hence is 0.4. Find the probability that 25 years hence
i. Both will be alive?
ii. Only the man will be alive?
iii. Only the woman will be alive?
iv. At least one of them will be alive?
6. One bag contains 5 red and 7 black balls and the other 3 red and 12 black balls. A ball is
drawn at random from either of the bags. What is the chance that the selected ball is
black?


7. According to a survey, the probability that a family owns two cars if their annual income
is greater than Rs. 8 lakh is 0.75. Of the households surveyed, 60 per cent had income
over Rs. 8 lakh and 52 per cent had two cars. What is the probability that a family has
two cars and an income over Rs. 8 lakh a year?
8. The chance that a person stopping at a petrol pump will get his vehicle's tyres checked is
0.12, the chance that he will get the oil checked is 0.29 and the chance that he will get
both checked is 0.07.
i. What is the chance that a person will have neither his tyres nor oil checked?
ii. What is the probability that a person who has his oil checked will also have
tyres checked?
9. It is known that 15 per cent of the males and 10 per cent of the females in a town with
equal numbers of males and females are unemployed. A person is selected at random from the town.
What is the probability that
i. A person is employed?
ii. A person is male given that he is employed?
10. A certain company encourages its employees to participate in cricket and hockey. A
survey indicates that 40% play cricket, 50% play hockey and 25% play both cricket and
hockey. Find the probability that
i. An employee plays only hockey?
ii. An employee plays only cricket?
iii. An employee takes part in at least one of the games, cricket and hockey?
Note:
Four chapters together with four exercises have been given in this material for the purpose of
self-study. Make sure that you go through the entire material. An evaluation on
this part will be conducted immediately after you join the course.
Wish you all the best!
