
Probability & Probability Distributions

Probability (Meaning): It is the chance of occurrence of an event.
Probability Theory: Measure of uncertainty.
Experiment: A process which results in some well-defined
outcome is known as an experiment.
e.g. When a coin is tossed, we get either a Head or a Tail,
i.e., its outcome is a Head or a Tail, which is well defined.
Random experiment: All the possible outcomes of the experiment are
known in advance, but the specific outcome of any one trial is
not known in advance.
e.g. Tossing a coin is a random experiment, since the possible
outcomes are known in advance (a Head or a Tail), but which of
the two will come up is not known in advance.
Sample Space: It is the set of all possible outcomes of an
experiment.
e.g. When 2 coins are tossed together, the random experiment
may result in any of the following:
Head (H) on the first coin and Head (H) on the second coin
Head (H) on the first coin and Tail (T) on the second coin
Tail (T) on the first coin and Head (H) on the second coin
Tail (T) on the first coin and Tail (T) on the second coin
Thus the corresponding sample space (for this example) is denoted
as S = {(H, H), (H, T), (T, H), (T, T)}
Equally Likely Outcomes: In the case of tossing a coin,
it is known in advance that the coin will land with its Head or
Tail up, and
it is reasonably assumed that each outcome, a Head or a Tail, is
as likely to occur as the other.
In other words, we say that there are equal chances for the coin
to land with its Head or Tail up. Hence we say that the outcomes
Head and Tail are equally likely.

Event: An outcome of a random experiment, or more generally any
set of outcomes, is called an event.
In other words, an event is something that happens.
e.g. Consider tossing 2 similar coins. The possible outcomes
are:

First coin   Second coin   Outcome
H            H             HH
H            T             HT
T            H             TH
T            T             TT

Since the coins are similar, HT and TH are counted as the same
event. The events of the above experiment are therefore HH, HT
and TT, whose frequencies are 1, 2 and 1 respectively.
Mutually exclusive events: 2 events are said to be mutually
exclusive if both cannot occur simultaneously.
e.g. If a single coin is tossed, the Head and the Tail cannot
occur in the same trial. Hence the event Head and the event
Tail are mutually exclusive. If 2 coins are tossed, then the events
(H, H), (H, T), (T, H) and (T, T) are mutually exclusive.
Exhaustive Events: The set of all possible outcomes of a random
experiment constitutes an exhaustive set of events.
e.g. In tossing a coin, there are 2 exhaustive events, Head
and Tail, and in throwing a die the exhaustive events
are 1, 2, 3, 4, 5 and 6. In drawing a single card from a pack of
52 playing cards, the events "card is red" and "card is black" are
collectively exhaustive.
Classical approach to Probability (Mathematical / A priori
approach)
Probability of an event =
(No. of outcomes favorable to the event) / (Total no. of all possible outcomes)
In this approach, the probability is known in advance, before
conducting the experiment.
e.g. If a die is rolled once, and an even number is required on
the upper face of it, then in this experiment, the
Total number of all possible outcomes = 6 (since any of 1, 2, 3, 4,
5, 6 can come on the upper face)
Number of favorable outcomes = 3 (since the even number can
be any one of 2, 4, 6)
Hence probability of getting an even number on the upper face =
3/6 = 1/2
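The die example can be checked by counting outcomes directly. The following is a minimal sketch, assuming all outcomes are equally likely; the function name classical_probability is a hypothetical helper, not something defined in these notes.

```python
from fractions import Fraction


def classical_probability(sample_space, is_favorable):
    """Favorable outcomes / all outcomes, assuming equally likely outcomes."""
    favorable = [outcome for outcome in sample_space if is_favorable(outcome)]
    return Fraction(len(favorable), len(sample_space))


# Sample space of one roll of a die.
die_faces = [1, 2, 3, 4, 5, 6]

# Event: an even number on the upper face.
p_even = classical_probability(die_faces, lambda face: face % 2 == 0)
print(p_even)  # 1/2
```

Using Fraction keeps the result exact (1/2) instead of a rounded decimal.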
Relative frequency approach to probability (Statistical /
A posteriori approach)
Probability of an event =
(No. of occurrences of the event) / (Total no. of trials)
e.g. If a coin is tossed 100 times and the outcomes of this
experiment are 57 heads and 43 tails, then the
Probability of a Head = 57 / 100 and
Probability of a Tail = 43 / 100
Subjective approach to probability
Probability of an event =
(No. of successes) / (Total no. of trials)
In this approach, probabilities are assigned to the events based
on experience or past records. This type of approach is suitable
for a sample size 10
Rules of Probability
Rule of Addition
If $A_1, A_2, \dots, A_m$ are m mutually exclusive events, then
$P(A_1 \cup A_2 \cup \dots \cup A_m) = P(A_1) + P(A_2) + \dots + P(A_m)$
If the events $A_1, A_2, \dots, A_m$ are mutually exclusive
and are also exhaustive, then
$P(A_1 \cup A_2 \cup \dots \cup A_m) = P(S) = 1 = P(A_1) + P(A_2) + \dots + P(A_m)$
The event A and its complement $A^c$ are mutually exclusive, and
hence $P(A \cup A^c) = P(A) + P(A^c)$. Since $A \cup A^c = S$, it follows
that $P(A \cup A^c) = P(S) = 1$. Therefore $P(A^c) = 1 - P(A)$.
Probability of the event A or B or C (A, B, C are any
events in the sample space S, not necessarily mutually exclusive):
$P(A \cup B) = P(A) + P(B) - P(A \cap B)$
$P(A \cup B \cup C) = P(A) + P(B) + P(C) - P(A \cap B) - P(A \cap C) - P(B \cap C) + P(A \cap B \cap C)$
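The addition rule for two events that are not mutually exclusive can be verified on a small sample space. This is a sketch under assumed events: A (even face) and B (face greater than 3) on one roll of a die are chosen here only for illustration.

```python
from fractions import Fraction

# One roll of a fair die; A and B are illustrative events, not from the notes.
S = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}  # even face
B = {4, 5, 6}  # face greater than 3


def prob(event):
    """Classical probability of an event within the sample space S."""
    return Fraction(len(event), len(S))


# P(A U B) computed directly, and through the addition rule.
p_union_direct = prob(A | B)
p_union_rule = prob(A) + prob(B) - prob(A & B)
print(p_union_direct, p_union_rule)  # 2/3 2/3

# Complement rule: P(not A) = 1 - P(A).
p_complement = prob(S - A)
print(p_complement, 1 - prob(A))  # 1/2 1/2
```

Subtracting P(A and B) removes the outcomes 4 and 6, which would otherwise be counted twice.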
Conditional Probability: Conditional probability is the
probability of occurrence of an event given that another event has
already occurred.
e.g. Consider the experiment of tossing 3 coins. The sample
space of the experiment is
S = {HHH, HTH, THH, TTH, HHT, HTT, THT, TTT}
Since the coins are fair, we can assign the probability 1/8 to each
sample point.
Let E be the event "at least 2 Heads appear" and
let F be the event "the first coin shows a Tail". Then
E = {HHH, HTH, THH, HHT},
F = {THH, TTH, THT, TTT} and
$E \cap F$ = {THH}
So P(E) = P(HHH) + P(HTH) + P(THH) + P(HHT) = 1/8 + 1/8 + 1/8
+ 1/8 = 4/8 = 1/2;
P(F) = P(THH) + P(TTH) + P(THT) + P(TTT) = 1/8 + 1/8 + 1/8 +
1/8 = 4/8 = 1/2 and
$P(E \cap F)$ = 1/8
Now, suppose we are given that the event F occurs (i.e., the first
coin shows a Tail). What then is the probability of occurrence of E?
This information reduces our sample space from the set S to its
subset F for the event E.
Thus, probability of E considering F as the sample space = 1/4
This probability of the event E is called the conditional probability
of E given that F has already occurred, and is denoted by P(E|F) =
1/4

So $P(E|F) = \frac{\text{No. of elementary events favourable to } E \cap F}{\text{No. of elementary events favourable to } F} = \frac{n(E \cap F)}{n(F)}$
Now dividing the numerator and denominator by n(S), we get
$P(E|F) = \frac{n(E \cap F)/n(S)}{n(F)/n(S)} = \frac{P(E \cap F)}{P(F)}$, where $P(F) \neq 0$, i.e., $F \neq \emptyset$
(empty set)
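The 3-coin example can be reproduced by enumerating the sample space. This is a minimal sketch, assuming fair coins so that every outcome has probability 1/8; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

# All 8 outcomes of tossing 3 coins, e.g. "HHT".
S = ["".join(toss) for toss in product("HT", repeat=3)]

E = {s for s in S if s.count("H") >= 2}  # at least 2 Heads
F = {s for s in S if s[0] == "T"}        # first coin shows a Tail


def prob(event):
    """Classical probability within S (equally likely outcomes assumed)."""
    return Fraction(len(event), len(S))


# P(E|F) = P(E and F) / P(F), valid because P(F) is not 0.
p_e_given_f = prob(E & F) / prob(F)
print(p_e_given_f)  # 1/4
```

The same value follows from counting inside the reduced sample space F: only THH out of the 4 outcomes in F has at least 2 Heads.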
Problem:
In an organization, out of 200 employees, 40 have a
monthly salary of more than Rs.15000 & 120 are regular
takers of Alpha Brand Tea. Out of the 40 who have a
monthly salary of more than Rs.15000, 20 are regular takers of
Alpha Brand Tea. If an employee is selected at random, what is the
probability that he has a monthly salary of more than
Rs.15000, given that he is a regular taker of Alpha Brand Tea?
Rule of Multiplication
$P(A \cap B) = P(A)\,P(B|A) = P(B)\,P(A|B)$
If A, B & C are 3 events of a sample space, then we have
$P(A \cap B \cap C) = P(A)\,P(B|A)\,P(C|A \cap B)$, also written P(A) P(B|A) P(C|AB)
Independent events:
2 events are said to be independent if the occurrence of one
event does not influence the occurrence of the other event.
e.g. Successive tosses of a fair coin are independent. If a fair
coin is tossed twice, the event "Head in the first toss" (assume this
as event A) and the event "Head in the 2nd toss" (assume this as
event B) are independent, since the occurrence of a Head in either
toss does not influence the occurrence of a Head in the other toss:
the probability of getting a Head in the second toss,
which is 1/2, does not change whether we get a Head or a Tail
in the 1st toss. Hence here P(B|A) = P(B), since the occurrence of
event A does not alter the probability of event B.
So when the events A and B are independent, $P(A \cap B) = P(A)\,P(B)$
In general, when a finite number of events $A_1, A_2, \dots, A_m$
are independent, we have
$P(A_1 \cap A_2 \cap \dots \cap A_m) = P(A_1)\,P(A_2) \cdots P(A_m)$
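The two-toss example can be checked by enumeration. This is a small sketch, assuming a fair coin so that the four outcomes are equally likely.

```python
from fractions import Fraction
from itertools import product

# Sample space of two tosses of a fair coin: (first toss, second toss).
S = list(product("HT", repeat=2))

A = {s for s in S if s[0] == "H"}  # Head in the first toss
B = {s for s in S if s[1] == "H"}  # Head in the second toss


def prob(event):
    """Classical probability within S (equally likely outcomes assumed)."""
    return Fraction(len(event), len(S))


p_joint = prob(A & B)           # P(A and B)
p_product = prob(A) * prob(B)   # P(A) P(B)
p_b_given_a = p_joint / prob(A)  # P(B|A)

print(p_joint, p_product)      # 1/4 1/4
print(p_b_given_a, prob(B))    # 1/2 1/2
```

Both checks agree: the joint probability equals the product, and conditioning on A leaves the probability of B unchanged.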
Problems:
1. What is the probability of getting exactly 2 Heads when 3 coins
are tossed?
2. What is the probability of getting at least 1 Head when 3 coins
are tossed?
3. What is the probability of getting a sum of 9 when two dice are
thrown?
4. What is the probability of getting a sum of at least 9 when two
dice are thrown?
5. A number is selected at random from the numbers 1 to 30.
What is the probability that (i) it is divisible by either 3 or 7 (ii)
it is divisible by 5 or 13
6. The board of directors of a company wants to form a quality
management committee to monitor quality of their products.
The company has 5 scientists, 4 engineers & 6 accountants.
Find the probability that the committee will contain 2 scientists,
1 engineer & 2 accountants?
7. A box contains 5 red & 4 blue similar shaped balls. 2 balls are
drawn at random from the box. Find the probability that both
of them are red if (i) the balls are drawn together (ii) the balls
are drawn one after the other, with replacement (iii) the balls
are drawn one after the other without replacement.
8. The probabilities that A & B will tell the truth are 2/3 and 4/5
respectively. What is the probability that (i) they agree with
each other (ii) they contradict each other
9. The probabilities that component A & component B of a
machine will fail are 0.09 & 0.06 respectively. The machine will
fail if any one of them fails. Find the probability that it will fail?
10. What is the probability of getting 53 Mondays in a leap year?

11. Find the probability of selecting 2 y's from the letters x, x, x,
x, y, y, y?
12. Find the probability of selecting a King and Queen from a
pack of playing cards, when 2 cards are drawn at a time?
13. The probability of Mr.Sunil solving a problem is 1/4. The
probability of Mr.Anish solving it is 3/4. What is the probability
that a given problem will be solved?


14. The probability that a company A will survive for 20 years is
0.6. The probability that its sister concern will survive for 20
years is 0.8. What is the probability that at least one of them
will survive for 20 years?
15. The probabilities that drivers A, B, C will drive home safely
after consuming liquor are 2/5, 3/7 and 3/4 respectively. What is the
probability that they will drive home safely after consuming
liquor?
Random Variable: A random variable (r.v.) is a real valued
function whose domain is the sample space of a random
experiment.
e.g. Let us consider the experiment of tossing a fair coin 2 times
in succession. The sample space of the experiment is S = {(H, H),
(H, T), (T, H), (T, T)}
If X denotes the number of Heads obtained, then X is a r.v. and
for each outcome, its value is as given below:
X(HH) = 2, X(HT) = 1, X(TH) = 1, X(TT) = 0
More than one r.v. can be defined on the same sample space.
e.g. Let Y denote the no. of heads minus the no. of tails for
each outcome of the above sample space S, then Y is also a r.v.
and for each outcome, its value is as given below:
Y(HH) = 2, Y(HT) = 0, Y(TH) = 0, Y(TT) = -2
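Because a random variable is just a function on the sample space, the two examples above can be written as ordinary functions. A minimal sketch follows; the names X and Y mirror the notes, while representing outcomes as strings such as "HT" is an assumption of this sketch.

```python
from itertools import product

# Sample space of two successive tosses, written as strings.
S = ["".join(toss) for toss in product("HT", repeat=2)]


def X(outcome):
    """Number of Heads in the outcome."""
    return outcome.count("H")


def Y(outcome):
    """Number of Heads minus number of Tails in the outcome."""
    return outcome.count("H") - outcome.count("T")


x_values = {s: X(s) for s in S}
y_values = {s: Y(s) for s in S}
print(x_values)  # {'HH': 2, 'HT': 1, 'TH': 1, 'TT': 0}
print(y_values)  # {'HH': 2, 'HT': 0, 'TH': 0, 'TT': -2}
```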
Discrete Random Variable: This r.v. takes a countable no. of
values (as in the above examples).
Continuous Random Variable: Let us take the example of
measuring the exact amount of rain, in inches, tomorrow. Here we cannot
say it is exactly 2, because then it could not be even
2.000001 or 1.99999. But we can describe this r.v. by an interval,
such as 1.9 < X < 2.1, where X denotes the amount of rain.
Probability Distributions
In research, after data collection, the next step is often to
present the data in the form of a probability distribution, which
facilitates further analysis of the data in more meaningful ways.
Probability distributions can be classified into discrete
probability distributions and continuous probability distributions.
Some examples of discrete probability distributions are the Binomial
and Poisson distributions.
Some examples of continuous probability distributions are the
Exponential distribution, Uniform distribution, Normal distribution
and t-distribution.
Discrete Probability Distribution
In an experiment, events can be represented in the form of
frequencies, which can be easily converted into the respective
probabilities by dividing them by the total no. of outcomes.
Consider the case of tossing 3 coins simultaneously. The no. of
outcomes of this experiment is 8 and the outcomes are {HHH,
HHT, HTH, HTT, THH, THT, TTH, TTT}, i.e.,
HHH: 3 Heads
HHT: 2 Heads and 1 Tail
HTH: 2 Heads and 1 Tail
HTT: 1 Head and 2 Tails
THH: 2 Heads and 1 Tail
THT: 1 Head and 2 Tails
TTH: 1 Head and 2 Tails
TTT: 3 Tails
Therefore, the events of this experiment are "3 Heads", "2 Heads
and 1 Tail", "1 Head and 2 Tails" and "3 Tails", and the frequencies
are 1, 3, 3, 1 respectively.

Probability distribution of the experiment

Event   No. of outcomes of the event   Probability of occurrence of the event
HHH     1                              1/8
HHT     3                              3/8
HTT     3                              3/8
TTT     1                              1/8
Total                                  1
The discrete probability distribution of tossing 3 coins is
calculated as follows:
The events of this experiment are HHH, HHT, HTT & TTT. Let X be
a r.v. defined as the no. of Heads in the event occurrence, which is
probabilistic.
The events HHH, HHT, HTT, TTT can be defined as follows:
X = 0, when the event is TTT (i.e., no. of Heads is 0)
X = 1, when the event is HTT (i.e., no. of Heads is 1)
X = 2, when the event is HHT (i.e., no. of Heads is 2)
X = 3, when the event is HHH (i.e., no. of Heads is 3)
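Based on this, the probability distribution is
$P(X = x) = \begin{cases} 1/8, & \text{if } x = 0 \text{ or } 3 \\ 3/8, & \text{if } x = 1 \text{ or } 2 \\ 0, & \text{otherwise} \end{cases}$
This is called the probability mass function (p.m.f.), and $\sum_x P(X = x) = 1$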
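The p.m.f. of the number of Heads in 3 tosses can be built by counting outcomes. This is a sketch assuming fair coins (all 8 outcomes equally likely); the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 8 equally likely outcomes of tossing 3 coins.
outcomes = ["".join(toss) for toss in product("HT", repeat=3)]

# Frequency of each value of X = number of Heads.
heads_count = Counter(outcome.count("H") for outcome in outcomes)

# Convert frequencies into probabilities by dividing by the total.
pmf = {x: Fraction(freq, len(outcomes)) for x, freq in sorted(heads_count.items())}

for x, p in pmf.items():
    print(x, p)

# The probabilities of a p.m.f. must add up to 1.
total_probability = sum(pmf.values())
```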


Continuous Probability Distribution
Let us consider the following function to demonstrate the concept
of the continuous distribution:
$f(x) = \begin{cases} 3x^2 - 3x + 1.5, & \text{if } 0 \le x \le 1 \\ 0, & \text{otherwise} \end{cases}$
For any probability distribution, the total probability over
the specified range should be 1.
This value can be obtained by
integrating f(x) as shown below:
$\int_0^1 f(x)\,dx = \int_0^1 (3x^2 - 3x + 1.5)\,dx = \left[x^3 - 1.5x^2 + 1.5x\right]_0^1 = 1$
Since the integral of f(x) over its range is 1 (and f(x) is never
negative), f(x) is a probability distribution (also called a
probability density function, p.d.f.).
In this distribution, the variable X is a continuous r.v. because its
value is continuous in the range from 0 to 1. Hence the
probability distribution is a continuous probability distribution.
The cumulative function of a probability density function is called
the cumulative distribution function (c.d.f.).
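The claim that this density integrates to 1 can be checked numerically. The sketch below uses a plain midpoint rule; the helper integrate and the step count are assumptions of this example, not part of the notes.

```python
def f(x):
    """The density from the notes: 3x^2 - 3x + 1.5 on [0, 1], else 0."""
    return 3 * x**2 - 3 * x + 1.5 if 0 <= x <= 1 else 0.0


def integrate(func, a, b, steps=100_000):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    width = (b - a) / steps
    return sum(func(a + (i + 0.5) * width) for i in range(steps)) * width


total = integrate(f, 0, 1)
print(round(total, 6))  # 1.0

# A density must also be non-negative; the minimum of f on [0, 1] is at x = 0.5.
lowest_value = f(0.5)
print(lowest_value)  # 0.75
```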
Problem:
Let X denote the no. of hours you study during a randomly
selected school day. The probability that X can take the values x,
has the following form, where k is some unknown constant.
{0.1, if x = 0
P(X = x) = {kx, if x = 1 or 2
{k(5 - x), if x = 3 or 4
{0, otherwise
a. Find the value of k?
b. What is the probability that you study at least 2 hours?
Exactly 2 hours? At most 2 hours?
Problem:
Find the probability distribution of no. of doublets in 3 throws of a
pair of dice.
Mean of a r.v. X = Expected value of X = Expectation of X =
$E(X) = \sum_i x_i p_i$
Variance of a r.v. X = $\mathrm{Var}(X) = E(X^2) - [E(X)]^2$, where $E(X^2) = \sum_i x_i^2 p_i$
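As a worked instance of these formulas, the sketch below applies them to the p.m.f. of the number of Heads in 3 coin tosses given earlier in these notes (1/8, 3/8, 3/8, 1/8). The variable names are illustrative.

```python
from fractions import Fraction

# p.m.f. of X = number of Heads in 3 tosses of a fair coin.
pmf = {0: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(3, 8), 3: Fraction(1, 8)}

# E(X) = sum of x_i * p_i
mean = sum(x * p for x, p in pmf.items())

# E(X^2) = sum of x_i^2 * p_i
mean_of_squares = sum(x**2 * p for x, p in pmf.items())

# Var(X) = E(X^2) - [E(X)]^2
variance = mean_of_squares - mean**2

print(mean, mean_of_squares, variance)  # 3/2 3 3/4
```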
Problem:
2 cards are drawn simultaneously (or successively without
replacements) from a well shuffled pack of 52 cards. Find the
mean, variance and standard deviation of the no. of kings.
Theoretical Probability Distributions
Binomial Distribution
It is a discrete probability distribution based on the Bernoulli process.
In a game between 2 persons using a coin-tossing experiment, let
the occurrence of a Head in a trial be a success for one person & the
occurrence of a Tail be a failure for the same person.
So, in a trial of tossing a coin, the probability of a success (p) for
the person is 0.5 & the probability of failure (q = 1 - p) for the
same person is 0.5
If n repeated trials are performed, then the objective of the
game may be to find the probability of having x successes for
the person. This experiment is termed a Bernoulli process, in
which n repeated trials are performed with the following
assumptions:
1. In the experiment, there are only 2 mutually exclusive &
collectively exhaustive events.
2. The probabilities of occurrence of the events of the experiment are
the same in all the trials.
3. In all the n trials, the observations are independent of one
another.
Based on these fundamentals, the binomial probability
distribution is represented as follows:
P(x successes in n trials, given p is the probability of success)
= $\binom{n}{x} p^x q^{n-x}$, where x = 0, 1, 2, ..., n; n = no. of trials; p =
probability of success in a trial; q = probability of failure in a trial
(= 1 - p); $\binom{n}{x}$ is also written nCx. Here n & p are the parameters.
In short, the probability distribution is represented as
$P(X = x) = \binom{n}{x} p^x q^{n-x}$
Summed over all values of x, the probabilities total 1:
$\sum_{x=0}^{n} P(X = x) = \sum_{x=0}^{n} \binom{n}{x} p^x q^{n-x} = 1$
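The binomial formula translates directly into code. This is a minimal sketch; the function name binomial_pmf is an assumption of this example, and the values n = 3, p = 0.5 are chosen so the result can be compared with the 3-coin distribution earlier in these notes.

```python
from math import comb


def binomial_pmf(x, n, p):
    """P(X = x) = nCx * p^x * q^(n - x), with q = 1 - p."""
    q = 1 - p
    return comb(n, x) * p**x * q ** (n - x)


# Illustrative values: 3 tosses of a fair coin, success = Head.
n, p = 3, 0.5
probabilities = [binomial_pmf(x, n, p) for x in range(n + 1)]
print(probabilities)  # [0.125, 0.375, 0.375, 0.125]

# The probabilities over x = 0, 1, ..., n add up to 1.
total_probability = sum(probabilities)
```

The output matches the p.m.f. 1/8, 3/8, 3/8, 1/8 obtained by listing the outcomes of 3 coins.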
Problem: Based on past experience, the quality control engineer
of Heavy Electrical Limited has estimated that the probability of
commissioning each project in time at a client site is 0.9; The
Company is planning to commission 5 such projects in the
forthcoming year. Find the probability of commissioning (a) no
project in time (b) 2 projects in time (c) at most one project in
time & (d) at least 2 projects in time.
Mean & Variance of binomial distribution
Mean = np & Variance = npq
Problem: If the probability of a defective bolt is 0.1, find the
mean & the s.d. of the number of defective bolts in a total of 900.
Probability of a defective bolt = p = 0.1, so q = 1 - 0.1 = 0.9
Mean = np = 900 * 0.1 = 90
Variance = npq = (np)q = 90(0.9) = 81
Hence Standard Deviation = sqrt(81) = 9
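The arithmetic of the bolt example can be reproduced in a few lines. This sketch uses exact fractions for p so that no rounding enters the mean and variance; the variable names are illustrative.

```python
from fractions import Fraction
from math import sqrt

n = 900
p = Fraction(1, 10)  # probability of a defective bolt
q = 1 - p            # probability of a non-defective bolt

mean = n * p             # np
variance = n * p * q     # npq
standard_deviation = sqrt(variance)

print(mean, variance, standard_deviation)  # 90 81 9.0
```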
Problem:
a. Eight coins are tossed at a time, 256 times. Find the expected
frequencies of successes (getting a head) and tabulate the
results obtained.
b. Also obtain the values of the mean & SD of the theoretical
(fitted) distribution.
Problem:
Fit a binomial distribution to the following data:
x    0    1    2    3    4
f    28   62   46   10   4

Poisson distribution
It is a discrete probability distribution. It is usually used to
represent the no. of occurrences of an event in one unit of time.
The Poisson distribution may be obtained as a limiting case of the
Binomial probability distribution under the following conditions:
1. n, the number of trials, is indefinitely large, i.e., $n \to \infty$
2. p, the probability of success for each trial, is indefinitely small,
i.e., $p \to 0$
3. $np = \lambda$ (say) is finite
Under the above 3 conditions the binomial probability function
tends to the probability function of the Poisson distribution given
below:
$P(X = x) = \frac{e^{-\lambda} \lambda^x}{x!}$, x = 0, 1, 2, ...
In this probability mass function, $\lambda$ is the only parameter.
Mean & Variance of Poisson distribution
Mean = Variance = $\lambda$
Poisson approximation to the binomial distribution
The binomial distribution can be approximated by the Poisson
distribution under either of the following conditions:
$n \ge 20$ & $p \le 0.05$
$n \ge 100$ & $np \le 10$
If either of the above 2 conditions is satisfied, then the mean of
the Poisson distribution is $\lambda = np$
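To see how close the approximation is, the two probability functions can be compared side by side. The sketch below uses hypothetical values n = 100 and p = 0.02 (large n, small p, np = 2), which are not taken from the notes; the function names are also assumptions of this example.

```python
from math import comb, exp, factorial


def binomial_pmf(x, n, p):
    """Binomial probability nCx * p^x * (1 - p)^(n - x)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def poisson_pmf(x, lam):
    """Poisson probability e^(-lam) * lam^x / x!."""
    return exp(-lam) * lam**x / factorial(x)


# Hypothetical values chosen for illustration only.
n, p = 100, 0.02
lam = n * p  # mean of the approximating Poisson distribution

# Largest difference between the two probability functions over x = 0..n.
max_gap = max(abs(binomial_pmf(x, n, p) - poisson_pmf(x, lam)) for x in range(n + 1))

for x in range(4):
    print(x, round(binomial_pmf(x, n, p), 4), round(poisson_pmf(x, lam), 4))
```

For these values the two columns differ only in the third decimal place.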
Problem:
The arrival of customers at a bank counter follows a
Poisson distribution with a mean arrival rate of 4 per 10-minute
interval. Find the probability that
1. exactly 0 customers will arrive in a 10-minute interval
2. exactly 2 customers will arrive in a 10-minute interval
3. at most 2 customers will arrive in a 10-minute interval
4. at least 3 customers will arrive in a 10-minute interval
Given $e^{-4} = 0.0183$
Problem:
The QC assistant takes a sample of 25 units of a product at a
particular work station of a production line & inspects them one
by one. Based on past experience, he has estimated that the
probability that a unit is defective is 0.04. Find the
probability that
1. no piece in the sample is defective
2. 3 pieces in the sample will be defective
3. At most 2 pieces will be defective
4. At least 3 pieces will be defective
Given $e^{-1} = 0.3678$
Problem:
It is known from past experience that in a certain plant there
are on the average 4 industrial accidents per month. Find the
probability that in a given year, there will be less than 4
accidents. Assume Poisson distribution (Given $e^{-4} = 0.0183$)
Problem:
Suppose on an average 1 house in 1000 in a certain district has a
fire during a year. If there are 2000 houses in that district, what is
the probability that exactly 5 houses will have a fire during the
year?
Given $e^{-2} = 0.1353$
Problem:
The following table gives the number of days in a 100-day period
during which automobile accidents occurred in a city. Fit a
Poisson distribution to the data.
No. of accidents   0    1    2    3    4
No. of days        40   35   15   6    4
Normal Distribution
It is a continuous probability distribution. The behavior of many
real-life situations can be modeled by a normal distribution.
Some examples which follow the normal distribution are as follows:
Monthly salary of employees in a locality
Internal diameter of bearings produced in a company
Marks of students in an entrance test
Height of employees in a company
Weight of employees in a company

A continuous r.v. X is defined to follow the normal distribution with
parameters $\mu$ & $\sigma^2$, denoted as $X \sim N(\mu, \sigma^2)$, if the p.d.f. of the r.v. X
is given by
$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{(x - \mu)^2}{2\sigma^2}\right]$, $-\infty < x < \infty$
If the observations of a real-life problem follow the normal
distribution with mean ($\mu$) and variance ($\sigma^2$), then its r.v. can be
converted into a standard normal r.v. using the following
transformation:
$Z = \frac{X - \mu}{\sigma}$, where Z is a standard normal variable.
The corresponding
distribution is called the standard normal distribution, whose formula
is as given below:
$f(z) = \frac{1}{\sqrt{2\pi}} \exp\left[-\frac{z^2}{2}\right]$, $-\infty < z < \infty$
The mean and variance of this standard normal distribution are 0
and 1, respectively.
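In practice, normal probabilities are read from a standard normal table after converting X to Z. The sketch below does the same thing in code, computing the standard normal cumulative probability from the error function in Python's math module. The function names and the values mu = 50, sigma = 10 are hypothetical and chosen only for illustration.

```python
from math import erf, sqrt


def standard_normal_cdf(z):
    """P(Z <= z) for a standard normal variable, via the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))


def normal_cdf(x, mu, sigma):
    """P(X <= x) for X ~ N(mu, sigma^2), using Z = (X - mu) / sigma."""
    z = (x - mu) / sigma
    return standard_normal_cdf(z)


# Hypothetical parameters, not taken from the problems in these notes.
mu, sigma = 50.0, 10.0

p_below_60 = normal_cdf(60.0, mu, sigma)  # here z = 1
p_above_60 = 1 - p_below_60

print(round(p_below_60, 4))  # 0.8413
print(round(p_above_60, 4))  # 0.1587
```

Multiplying such a probability by the number of observations gives the expected count in that range.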
Problem: In a survey with a sample of 300 respondents, the
monthly income of the respondents follows normal distribution
with its mean and s.d. as Rs.15000 and Rs.3000 respectively.
(a) What is the probability that the monthly income is less than
Rs.12000? Also, find the no. of respondents having income less
than Rs.12000?
(b) What is the probability that the monthly income is more than
Rs.16000? Also, find the no. of respondents having income less
than Rs.16000?
(c) What is the probability that the monthly income is in
between Rs.10000 & Rs.17000? Also, find the no. of
respondents having income in between Rs.10000 & Rs.17000?

Problem: The marks obtained by 300 students in an examination
are estimated to be normally distributed with a mean of 60 & a
standard deviation of 8. How many students are expected to
score (a) more than 70 marks (b) between 50 & 75 marks? (c) If the
top 5% of the students are to be given scholarships, what is the
eligible mark for the scholarships?
Problem: Steel rods are manufactured to be 3 inches in diameter
but they are acceptable if they are inside the limits 2.99 inches &
3.01 inches. It is observed that 5% are rejected as oversize & 5%
are rejected as undersize. Assuming that the diameters are
normally distributed, find the standard deviation of the
distribution. Hence find, what the proportion of rejects would be,
if the permissible limits were widened to 2.985 inches and 3.015
inches.
Problem: In a certain exam, 31% of the students got less than 45
marks & 8% of the students got more than 64 marks. Assuming
the distribution to be normal, find the mean & SD of the marks
Problem: The frequency distribution of a national survey on cars
is shown below:
No. of cars   0.00-0.49   0.50-0.99   1.00-1.49   1.50-1.99   2.00-2.49   2.50-2.99
Frequency     2           14          23          7           4           2
a) Calculate the variance and SD
b) How many of the observations should theoretically fall
between 0.7 and 1.8, if the distribution is bell-shaped?
Problem: The manager of a small postal substation is trying to
quantify the variation in the weekly demand for mailing bags.
She has decided to assume that this demand is normally
distributed. She knows that on average 100 bags are
purchased weekly and that, 90% of the time, weekly demand is
below 115.
a) What is the standard deviation of this distribution?
b) The manager wants to stock enough mailing bags each week
so that the probability of running out of stock of bags is no
higher than 0.05. How many bags should she stock?
Uniform Distribution
This is a continuous probability distribution which has wide
practical applications. More specifically, it is often used in
simulation. Let us assume that the daily
demand of a product is uniformly distributed over 10 +/- 2 units.
The minimum daily demand is 8 and the maximum daily demand
is 12.
Now in general let the minimum daily demand & maximum daily
demand be a & b respectively.
Let X be a continuous r.v. whose probability density is constant for
values in the range a to b & is 0 for all
values outside the interval a to b. The formula for the
corresponding uniform distribution is
$f(x) = \begin{cases} \frac{1}{b - a}, & a \le x \le b \\ 0, & \text{otherwise} \end{cases}$
The total probability over the range is
$\int_a^b f(x)\,dx = \int_a^b \frac{dx}{b - a} = \frac{b - a}{b - a} = 1$
The mean & variance of the uniform distribution are
$\frac{a + b}{2}$ & $\frac{(b - a)^2}{12}$
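For the daily-demand example with a = 8 and b = 12, the mean, variance and cumulative probability can be computed as follows. This is a minimal sketch; the function names are assumptions of this example.

```python
from fractions import Fraction


def uniform_mean(a, b):
    """Mean of the uniform distribution on [a, b]: (a + b) / 2."""
    return Fraction(a + b, 2)


def uniform_variance(a, b):
    """Variance of the uniform distribution on [a, b]: (b - a)^2 / 12."""
    return Fraction((b - a) ** 2, 12)


def uniform_cdf(x, a, b):
    """P(X <= x): the area under the constant density 1 / (b - a) up to x."""
    if x <= a:
        return Fraction(0)
    if x >= b:
        return Fraction(1)
    return Fraction(x - a, b - a)


# Daily demand uniformly distributed between 8 and 12 units.
a, b = 8, 12
print(uniform_mean(a, b), uniform_variance(a, b), uniform_cdf(11, a, b))  # 10 4/3 3/4
```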

Problem:
In a private canteen, the daily demand for packed meals follows a
uniform distribution as below:
$f(x) = \frac{1}{450 - 230}$, $230 \le x \le 450$
$f(x) = 0$, otherwise
If the service level of satisfying the demand of the canteen is 0.8,
find the highest possible demand which can be satisfied w.r.t. the
given service level (cumulative probability)
Problem:
Let the continuous r.v. X denote the current measured in a thin
copper wire in milliamperes. Assume that the range of X is [0,
20 mA], and assume that the p.d.f. of X is $f(x) = 0.05$, $0 \le x \le 20$.
What is the probability that a measurement of current is between
5 and 10 mA?
Exponential Distribution
It is a continuous probability distribution. This distribution is used
to represent the time interval between consecutive occurrences of
an event, like the inter-arrival time of customers, the service time for
customers, the mean time between failures in maintenance activity,
etc.
Consider the example of the service time of customers in a
queuing system. Generally the service time in a queuing example
is a random variable which follows the exponential distribution. Let
the no. of customers served per unit time (service rate) be $\lambda$.
Therefore, the mean service time is $1/\lambda$.
If the r.v. X follows the exponential distribution, then the probability
density function is $f(x) = \lambda e^{-\lambda x}$, $x \ge 0$
The r.v. X that equals the distance between successive events of
a Poisson process with mean $\lambda > 0$ is an exponential random
variable with parameter $\lambda$. The p.d.f. of X is
$f(x) = \lambda e^{-\lambda x}$ for $0 \le x < \infty$
$\mu = E(X) = \frac{1}{\lambda}$ & $\sigma^2 = V(X) = \frac{1}{\lambda^2}$
The cumulative probabilities are
$P(X \le x) = 1 - e^{-\lambda x}$, for $x \ge 0$
$P(X > x) = e^{-\lambda x}$, for $x \ge 0$
$P(x_1 \le X \le x_2) = e^{-\lambda x_1} - e^{-\lambda x_2}$, for $0 \le x_1 < x_2$
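The three cumulative probabilities of the exponential distribution can be written as small functions. This is a sketch only: the function names are assumptions, and the rate of 2 events per unit time is a hypothetical value, not one of the problems in these notes.

```python
from math import exp


def exponential_cdf(x, rate):
    """P(X <= x) = 1 - e^(-rate * x) for x >= 0."""
    return 1 - exp(-rate * x) if x >= 0 else 0.0


def exponential_survival(x, rate):
    """P(X > x) = e^(-rate * x) for x >= 0."""
    return exp(-rate * x) if x >= 0 else 1.0


def exponential_between(x1, x2, rate):
    """P(x1 <= X <= x2) = e^(-rate * x1) - e^(-rate * x2), for 0 <= x1 < x2."""
    return exp(-rate * x1) - exp(-rate * x2)


# Hypothetical rate: 2 events per unit time, so the mean time is 1/2.
rate = 2.0
mean_time = 1 / rate

p_within_one = exponential_cdf(1.0, rate)
print(round(p_within_one, 4))  # 0.8647
print(round(exponential_between(0.5, 1.5, rate), 4))
```

When a problem gives an average time instead of a rate, the rate is the reciprocal of that average, and the time unit of x must match the time unit of the rate.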

Problem:
In an international airport, the service time for servicing flights by
a terminal follows exponential distribution. The service rate of a
terminal servicing the flights is 20 per day. Find the probability
that the service time of the terminal in clearing a flight is less
than 0.45 hour.
Problem:
In a mainframe computer centre, execution time of programs
follows exponential distribution. The average execution time of
the programs is 5 minutes. Find the probability that the execution
time of programs is
(a) Less than 4 minutes
(b) More than 6 minutes
(Hint: The average execution time is 5 minutes, which means the
execution rate is $\lambda = 1/5$ per minute.)