
In the Name of ALLAH, the beneficent, the Merciful,

“O Allah, send your salutations upon Muhammad (PBUH) & on the Family
of Muhammad (PBUH) as you sent your salutations upon Ibrahim & on the
Family of Ibrahim verily you are Most Praiseworthy & Glorious…”

Quantitative Methods for Decision Making Techniques
A Practical and Philosophical Approach
By,
Yaseen Ahmed
Faculty, FCS-IBA
What is Statistics (A science or an art?)
• Statistics is the activity of obtaining data and then
• compiling, summarizing, presenting, analyzing and interpreting it, and
• drawing conclusions from it.
In short:
Data → Process → Information/Conclusions
• Statistics is a mixture of science and art: everything up to and including the processing is a SCIENCE, while drawing conclusions is an individual’s ART.
What is DATA (A word or a Keyword?)
• DATA is a group of raw facts and figures which may VARY from
• person to person, object to object, distance to distance and time to time.
• Only the absence of VARIATION can produce a CONSTANT, and it does not exist in our physical world. Only spiritualism can define a CONSTANT.
Data v/s Variable
• A variable is the storage of data; it is represented by letters such as X, Y, Z.
There are two types of variables:

• Qualitative Variable: It deals with data which vary by kind and which provide labels, or names, for categories of like items, i.e. a set of observations where any single observation is a word or code that represents a class or category.
• Gender, Complexion, Weather and Type are some examples of qualitative variables.

• Quantitative Variable: It deals with numeric data which measure either how much or how many of something, i.e. a set of observations where any single observation is a number that represents an amount or a count.
• Age, Height, Number and Price are some examples of quantitative variables.

Source: http://www.microbiologybytes.com/maths/1011-17.html
Inactivity breaker …
Object: Take a blank page from your writing material and divide it into two columns in the following manner:

Qualitative Variables    Quantitative Variables
1- Gender                1- Age
2- Complexion            2- Height
3- Qualification         3- Weight
4- Weather               4- Price
…                        …
20.                      20.

Try to write at least 20 variables in each column by observing several fields like management, agriculture, medicine, engineering, geology etc. Submit the same sheet after writing your full name at the top.
Data Sources
There are three major sources of data:
1. Survey/Census:
A census is an official, usually periodic, enumeration of a population, often including the collection of related demographic information. To survey means to inspect and determine the conditions of interest (see www.surveymonkey.com).
2. Experiment:
Any activity which is usually conducted within an isolated environment and produces results is called an experiment.
3. Simulation:
An artificial way of collecting data.
Question of the Day….
What do you think about the quality of the following at IBA?
1- Teaching          1, 2, 3, 4, 5
2- Administration    1, 2, 3, 4, 5
3- Structure         1, 2, 3, 4, 5
Where 1 = Very Poor and 5 = Excellent
Data Collection/compilation
• Teaching Ranks where 1-Very Poor, 5-Excellent
4.5 3.7 4.3 3.3 2.7 4.7
3.8 4.5 3.4 4.0 3.8 2.7
4.3 3.4 3.2 3.7 3.9 3.8
3.8 3.7 3.6 5.0 4.2 4.1
4.2 4.1 3.9 4.5 5.0 3.7
4.8 3.2 4.2 4.5 4.2 5.0
2.9
• Data collection/compilation is needed to reveal the actual behavior of the variable.
Note: The above data are a simulated version of the actual data.
Data Tabulation (Grouping Exercise)
Step # 01: Finding the range
Range = Max. – Min = 5 – 2.7 = 2.3
Step # 02: Finding the number of classes
No. of classes = 1 + 3.3 log(n) = 1 + 3.3 × log(37) = 6.175
Step # 03: Finding the width (h)
h = Range/No. of classes ≈ 0.4

Class Intervals        Frequency
Min to Min+h
Min+h to Min+2h
Min+2h to …
… to Max
Data Tabulation (Grouping Exercise)
Step # 01: Finding the range
Range = Max. – Min = 5.0 – 2.7 = 2.3

Step # 02: Finding the number of classes
No. of classes = 1 + 3.3 log(n) = 1 + 3.3 log(37) = 6.175 (log taken to base 10)

Step # 03: Finding the width or height (h)
h = Range/No. of classes = 2.3/6.175 = 0.372 ≈ 0.4
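The three grouping steps can be checked with a short script. This is only a sketch: the list `teaching` is typed in from the Teaching Ranks slide, the function name `grouping_plan` is made up for illustration, and the logarithm is assumed to be base 10.

```python
import math

# Teaching ranks copied from the data-compilation slide (37 values).
teaching = [
    4.5, 3.7, 4.3, 3.3, 2.7, 4.7,
    3.8, 4.5, 3.4, 4.0, 3.8, 2.7,
    4.3, 3.4, 3.2, 3.7, 3.9, 3.8,
    3.8, 3.7, 3.6, 5.0, 4.2, 4.1,
    4.2, 4.1, 3.9, 4.5, 5.0, 3.7,
    4.8, 3.2, 4.2, 4.5, 4.2, 5.0,
    2.9,
]

def grouping_plan(values):
    """Return (range, number of classes, class width) for the three steps."""
    data_range = max(values) - min(values)          # Step 1: Max - Min
    n_classes = 1 + 3.3 * math.log10(len(values))   # Step 2: 1 + 3.3 log(n)
    width = data_range / n_classes                  # Step 3: h
    return data_range, n_classes, width

data_range, n_classes, width = grouping_plan(teaching)
print(f"Range = {data_range:.1f}")
print(f"No. of classes = {n_classes:.3f}")
print(f"h = {width:.3f}")
```

In practice the width is rounded to a convenient value (0.4 here) before the class intervals are written down.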

Class Interval: One of the intervals into which the range of a variable
of a distribution is divided, esp. one of the divisions
of the base line of a bar chart or histogram.
After forming the structure of Class-Intervals and frequencies by using
methods of tally-marks, we can observe the actual behavior.
Data → Process → Information

Ranks    Frequency
2.7      3
3.1      5
3.4      10
3.8      9
4.2      5
4.6      5

[Figure: Histogram of Frequency (y-axis, 0 to 12) against Ranks (x-axis, 2.7 to 4.6)]

The frequency distribution table and the histogram above reveal the shape of the thoughts generated in the minds of the students. If we subsequently discover a mathematical model for this shape, it will be called a probability distribution.
Data → Process → Information

Ranks    Frequency
2.7      7
3.1      11
3.4      9
3.8      6
4.2      3
4.6      1

[Figure: Histogram of Frequency (y-axis, 0 to 12) against Ranks (x-axis, 2.7 to 4.6)]

The frequency distribution table and the histogram above reveal the shape of the thoughts generated in the minds of the students. If we subsequently discover a mathematical model for this shape, it will be called a probability distribution.
Grouping the data (MSEXCEL)

The Data Analysis option is located in the “Data” menu; if it is not present there, we can activate it through the Add-Ins available in “Excel Options”.
Grouping the data (MSEXCEL) cont…
After providing the “data-range” and selecting the Labels and Chart-output options, we can find the histogram either in a new worksheet or in a specified place in the existing sheet.

Bin numbers: These numbers represent the intervals that you want the Histogram tool to use for measuring the input data in the data analysis.
Statistical Measures (An introduction)
• The phrase “descriptive statistics” is used generically in place of statistical measures.
• These statistics describe or summarize the qualities of data.
• Another name is “summary statistics”, which we mostly use to ornament our reports/cases/research.
• They are beneficial when a graphical summary is not sufficient for the final conclusions.

Data → Processing by Graph → Processing by Measure → Conclusions
Statistical Measures (An Example)
Consider the following grouped data:

Class Intervals    Frequency    Relative Frequency (R.F.)    Cumulative Relative Frequency (C.R.F.)
2–4                2            2/25 = 0.08                  0.08
4–6                5            5/25 = 0.20                  0.28
6–8                9            9/25 = 0.36                  0.64
8–10               7            7/25 = 0.28                  0.92
10–12              2            2/25 = 0.08                  1.00
Total              Σf = 25      ΣR.F. = 1

The above data show the income, in thousands of rupees, of some individuals in the late 1980s.
Statistical Measures (Quartiles)
• These are 3 values, represented by Q1, Q2 and Q3 respectively, which divide the data into 4 equal parts.
• Each part contains 25% of the observations.
• Quartiles usually highlight 4 different classes, i.e. Lower class, Lower Middle, Upper Middle and Upper class.

Min | 25% Lower Class | Q1 | 25% Lower Middle | Q2 | 25% Upper Middle | Q3 | 25% Upper Class | Max
Computing Quartiles
In order to compute the quartile values, we need to consider the same frequency distribution together with a column of Cumulative Frequency.

Class Intervals    Frequency    Cumulative Frequency (C.F.)
2–4                2            2
4–6                5            7
6–8                9            16
8–10               7            23
10–12              2            25
Total              Σf = 25
Computing Quartiles (Procedure)
For any grouped data, quartiles can be computed by following two simple steps:
Step-1: Finding the location of the ith quartile (where i = 1, 2 and 3):
$$\frac{i \times \sum f}{4}$$
Step-2: Finding the value of the ith quartile:
$$Q_i = l + \frac{h}{f}\left(\frac{i \times \sum f}{4} - C.F.\right)$$
Where l = lower limit of the captured class, h = class width, f = frequency of the captured class and C.F. = cumulative frequency of the previous class.
Computing Quartiles (Demo)
Class Intervals    Frequency    Cumulative Frequency (C.F.)
2–4                2            2
4–6                5            7            ← 1st Quartile Class
6–8                9            16
8–10               7            23
10–12              2            25
Total              Σf = 25

Step-1 (For Q1): (1 × 25)/4 = 6.25
Step-2: Q1 = 4 + (2/5)(6.25 – 2) = 5.7
Note: Class width = h = 2
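The two-step procedure can be sketched in code as follows. The function name `grouped_quantile` and its `parts` argument are illustrative choices, not part of the course material; with `parts=10` or `parts=100` the same logic would give deciles or percentiles.

```python
def grouped_quantile(classes, freqs, i, parts=4):
    """i-th quantile of grouped data: l + (h / f) * (i * sum(f) / parts - C.F.)."""
    location = i * sum(freqs) / parts          # Step 1: location of the quantile
    cum_before = 0                             # C.F. of the previous class
    for (lower, upper), f in zip(classes, freqs):
        if cum_before + f >= location:         # this is the captured class
            h = upper - lower                  # class width
            return lower + h / f * (location - cum_before)   # Step 2
        cum_before += f
    raise ValueError("location lies beyond the last class")

# Income data (in thousands of rupees) from the frequency table.
classes = [(2, 4), (4, 6), (6, 8), (8, 10), (10, 12)]
freqs = [2, 5, 9, 7, 2]

q1, q2, q3 = (grouped_quantile(classes, freqs, i) for i in (1, 2, 3))
print(round(q1, 3), round(q2, 3), round(q3, 3))
```

Multiplying the results by 1000 gives the income limits quoted for the four income classes.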
Quartiles (Income Classes)
Min = 2000 | 25% Lower Class | Q1 = 5700 | 25% Lower Middle | Q2 = 7222 | 25% Upper Middle | Q3 = 8786 | 25% Upper Class | Max = 12000

Quartiles can also be computed using MS Excel, which needs the data in ungrouped form. The syntax is given below:
=QUARTILE(Data Range, i) where i = 1, 2, 3 is the quartile number.
Computing Quartiles from Ungrouped Data
• We must sort the data before proceeding, e.g.
Value:       2      2.5    3      5
Position:    1st    2nd    3rd    4th
We can then obtain the quartile values by following two simple steps:
Step-1: For Q1, the location is 1(4+1)/4 = 1.25th value.
Step-2: Q1 = 1st value + fraction × (2nd value – 1st value)
Q1 = 2 + 0.25 (2.5 – 2) = 2.125
Computing Quartiles (Contd)
• Consider the same sorted data:
Value:       2      2.5    3      5
Position:    1st    2nd    3rd    4th
Step-1: For Q2, the location is 2(4+1)/4 = 2.5th value.
Step-2: Q2 = 2nd value + 0.5 (3rd value – 2nd value)
Q2 = 2.5 + 0.5 (3 – 2.5) = 2.75
Finally, for Q3, the location is 3(4+1)/4 = 3.75th value.
Q3 = 3 + 0.75 (5 – 3) = 4.5
where Min = 2 and Max = 5
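A minimal sketch of the same location-and-interpolation rule for ungrouped data is given below. The function name `quartile` is an illustrative choice, and the handling of locations that fall outside the data (returning the smallest or largest value) is an assumption added only to keep the function safe for very small samples.

```python
def quartile(values, i):
    """i-th quartile using the i(n+1)/4 location rule with linear interpolation."""
    data = sorted(values)                      # the data must be sorted first
    location = i * (len(data) + 1) / 4         # Step 1: 1-based location
    k = int(location)                          # whole part: position of the lower value
    fraction = location - k                    # fractional part
    if k < 1:                                  # assumption: clamp to the minimum
        return data[0]
    if k >= len(data):                         # assumption: clamp to the maximum
        return data[-1]
    # Step 2: k-th value + fraction * ((k+1)-th value - k-th value)
    return data[k - 1] + fraction * (data[k] - data[k - 1])

treatment_a = [2, 2.5, 3, 5]
print([quartile(treatment_a, i) for i in (1, 2, 3)])   # [2.125, 2.75, 4.5]
```

Spreadsheet functions may use a different interpolation rule, so their quartiles need not match these values exactly.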
Computing Quartiles (Contd)
• Following is the Box-Plot of Treatment A:

[Figure: Box-Plot of Treatment A, marking Min = 2, Q1 = 2.125, Q2 = 2.75, Q3 = 4.5 and Max = 5]
Quartiles, Deciles and Percentiles
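Quartiles: divide the data into 4 equal parts. The quartiles are three values: Q1, Q2 and Q3.
Step-1: $\frac{i \times \sum f}{4}$, i = 1, 2, 3
Step-2: $Q_i = l + \frac{h}{f}\left(\frac{i \times \sum f}{4} - C.F.\right)$

Deciles: divide the data into 10 equal parts. The deciles are nine values: D1, D2, D3, …, D9.
Step-1: $\frac{i \times \sum f}{10}$, i = 1, 2, 3, …, 9
Step-2: $D_i = l + \frac{h}{f}\left(\frac{i \times \sum f}{10} - C.F.\right)$

Percentiles: divide the data into 100 equal parts. The percentiles are ninety-nine values: P1, P2, …, P99.
Step-1: $\frac{i \times \sum f}{100}$, i = 1, 2, 3, …, 99
Step-2: $P_i = l + \frac{h}{f}\left(\frac{i \times \sum f}{100} - C.F.\right)$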
Practice Questions
Q. What should be the interval of income which covers the middle 50% of individuals?
Ans. 5700 to 8786 (Q1 to Q3)

Q. What should be the interval of income which covers the middle 40% of individuals?

Min | 30% | D3 | 40% | D7 | 30% | Max (100% in total)

Q. What should be the interval of income which covers the middle 30% of individuals?
Exploratory Data Analysis (EDA) by John Wilder Tukey
There are two types of studies:
• Hypothetical Study
• Exploratory Study
In an exploratory study, we can perform our analysis without following conventional methodologies. In EDA, we observe the trend of the data by applying different processes to it.
• The Box-plot is a very useful part of EDA.
The Box-Plot
[Figure: Boxplot of Teaching Ranks (axis from 3 to 5), marking Min, Q1, Q2, Q3 and Max]
Inter-quartile Range = Q3 – Q1
Processing Data using Box-Plots
[Figure: Boxplots of Female Ages and Male Ages (means are indicated by solid circles); age axis from 25 to 45]
• Female ages: more variable, less consistent, heterogeneous, more diverse.
• Male ages: less variable, more consistent, homogeneous, less diverse.
• Males are younger than females.
Exploratory Analysis for Quality ranks
from Aventis Field Managers
[Figure: Boxplots of Teaching, Administration & Structure (means are indicated by solid circles)]
Statistical Measures (Central Tendency)
(Mean, Median and Mode)
• The main problem associated with the mean value of some data is that it is sensitive to outliers.
• The median is simply the middle value among the sorted scores of a variable. It is the 2nd Quartile (Q2) of any data.
• The mode is the most frequent response or value of a variable. Multiple modes are possible: bimodal or multimodal.
Mean, Median and Mode
[Figure: Frequency curve with measurements on the x-axis and frequencies on the y-axis]

The Mode is based on the principle of democracy, while the median (Q2) follows the rule of moderation. The Mean takes its place after being influenced by the higher values of the measurements. The distribution shown above is positively skewed.
Mean and Mode (Computations)
Class Intervals       Frequency fi    Mid-Points xi      fi × xi
2–4                   2               (2+4)/2 = 3        2 × 3 = 6
4–6                   f1 = 5          (4+6)/2 = 5        5 × 5 = 25
6–8 (Modal Class)     fm = 9          (6+8)/2 = 7        9 × 7 = 63
8–10                  f2 = 7          (8+10)/2 = 9       7 × 9 = 63
10–12                 2               (10+12)/2 = 11     2 × 11 = 22
Total                 Σfi = 25                           Σfi × xi = 179

Mode = 7.333, i.e. Rs. 7333/- is the majority’s income.
Mean = 7.160, i.e. Rs. 7160/- is the average income.
Empirical relationship b/w
Mean, Median and Mode
• Following are the values for Mean, Median and Mode obtained from the Income data:
$$\text{Mean} = \frac{\sum f_i x_i}{\sum f_i} = \frac{179}{25} = 7.160$$
$$\text{Median} = Q_2 = 7.222$$
$$\text{Mode} = l + \left(\frac{f_m - f_1}{2 f_m - f_1 - f_2}\right) \times h = 7.333$$
Mean < Median < Mode (thus the data is slightly negatively skewed)
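The grouped mean and mode formulas can be sketched as below. The function names are illustrative, and the treatment of a modal class that sits at either end of the table (taking the missing neighbouring frequency as 0) is an assumption not covered by the slides.

```python
def grouped_mean(classes, freqs):
    """Mean = sum(f_i * x_i) / sum(f_i), with x_i the class mid-points."""
    midpoints = [(lower + upper) / 2 for lower, upper in classes]
    return sum(f * x for f, x in zip(freqs, midpoints)) / sum(freqs)

def grouped_mode(classes, freqs):
    """Mode = l + (f_m - f_1) / (2 f_m - f_1 - f_2) * h for the modal class."""
    m = freqs.index(max(freqs))                          # index of the modal class
    f_m = freqs[m]
    f_1 = freqs[m - 1] if m > 0 else 0                   # preceding class (assumed 0 at the edge)
    f_2 = freqs[m + 1] if m < len(freqs) - 1 else 0      # following class (assumed 0 at the edge)
    lower, upper = classes[m]
    return lower + (f_m - f_1) / (2 * f_m - f_1 - f_2) * (upper - lower)

classes = [(2, 4), (4, 6), (6, 8), (8, 10), (10, 12)]
freqs = [2, 5, 9, 7, 2]

mean = grouped_mean(classes, freqs)
mode = grouped_mode(classes, freqs)
print(round(mean, 3), round(mode, 3))
```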


Arithmetic Mean, Geometric Mean
and Harmonic Mean
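• For any ungrouped data, the Arithmetic Mean is:
$$A.M. = \frac{\sum x_i}{n}$$
where xi are the observations and n is the sample size.
• For any ungrouped data, the Geometric Mean is:
$$G.M. = \left(\prod x_i\right)^{1/n}$$
• For any ungrouped data, the Harmonic Mean is:
$$H.M. = \frac{n}{\sum (1/x_i)}$$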

Arithmetic Mean, Geometric Mean
and Harmonic Mean
• Consider the following ungrouped data and compute the A.M., G.M. and H.M.:
xi: 1, 2, 3, 4, 5 (n = 5)

• A.M. = (1+2+3+4+5)/5 = 15/5 = 3.0
• G.M. = (1×2×3×4×5)^(1/5) = (120)^(1/5) = 2.6052
• H.M. = 5 / (1/1 + 1/2 + 1/3 + 1/4 + 1/5) = 5/2.28333 = 2.1898
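The three means can be reproduced with a few lines of code. This is a sketch with illustrative function names; it assumes all observations are positive, which the geometric and harmonic means require.

```python
import math

def arithmetic_mean(xs):
    return sum(xs) / len(xs)

def geometric_mean(xs):
    # n-th root of the product of the observations (positive values assumed)
    return math.prod(xs) ** (1 / len(xs))

def harmonic_mean(xs):
    # n divided by the sum of reciprocals (non-zero values assumed)
    return len(xs) / sum(1 / x for x in xs)

data = [1, 2, 3, 4, 5]
am, gm, hm = arithmetic_mean(data), geometric_mean(data), harmonic_mean(data)
print(f"A.M. = {am:.4f}, G.M. = {gm:.4f}, H.M. = {hm:.4f}")
print(f"A.M. x H.M. = {am * hm:.4f}, G.M. squared = {gm ** 2:.4f}")
```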
Theorems related to AM, GM & HM
Empirically verify the following theorems:

Theorem No. 1:
AM > GM > HM
3.0 > 2.6052 > 2.1898
Theorem No. 2:
AM × HM ≈ GM² (the relation is exact when there are only two observations)
3.0 × 2.1898 ≈ 2.6052²
6.569 ≈ 6.787 (diff. = 0.22)
Arithmetic Mean, Geometric Mean
and Harmonic Mean for Group Data
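• For any grouped data, the Arithmetic Mean is:
$$A.M. = \frac{\sum f_i x_i}{\sum f_i}$$
where xi are the mid-points and fi are the class frequencies.
• For any grouped data, the Geometric Mean is:
$$G.M. = \left(\prod x_i^{f_i}\right)^{1/\sum f_i}$$
• For any grouped data, the Harmonic Mean is:
$$H.M. = \frac{\sum f_i}{\sum (f_i / x_i)}$$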

AM, GM & HM (Computations)
Class Intervals    Frequency fi    Mid-Points xi    For A.M.: fi × xi    For G.M.: xi^fi    For H.M.: fi / xi
2–4                2               3                2 × 3                3^2                2/3
4–6                5               5                5 × 5                5^5                5/5
6–8                9               7                9 × 7                7^9                9/7
8–10               7               9                7 × 9                9^7                7/9
10–12              2               11               2 × 11               11^2               2/11
Total              Σfi = 25                         Σfi × xi = 179       Π xi^fi            Σ fi / xi
Mean, Median and Mode
• The MS Excel syntaxes for finding the three measures of central tendency are:
• =AVERAGE(Data Range) for the Mean
• =QUARTILE(Data Range, 2) for the Median
• =MODE(Data Range) for the Mode
Statistical Measures (Dispersion)
What is DISPERSION?
A dart game can help us with this…
Based on visual observation, we can declare Player A the winner because:
Player A is
• more consistent / less variable / homogeneous / less dispersed,
and Player B is
• less consistent / more variable / heterogeneous / more dispersed.
Measures of Dispersion
Some Important Measures of Dispersion are:
• Range=Max-Min
• Variance
• Standard Deviation
• Mean Deviation
• Inter-quartile Range
• Coefficient of Variation (C.V.)
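Dispersion Measures (Cont…)
$$\text{Variance} = V(X) = \frac{\sum (x_i - \bar{x})^2}{n}$$
Variance of the following ungrouped data:
X: 1, 2, 3, 4, 5
Mean = 3
V(X) = [(1–3)² + (2–3)² + (3–3)² + (4–3)² + (5–3)²]/5 = 10/5 = 2
$$\text{Standard Deviation} = \sigma = \sqrt{V(X)} = \sqrt{2} = 1.414$$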
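As a quick check, the variance (with divisor n) and the standard deviation of 1, 2, 3, 4, 5 can be computed as follows. The function name `variance` is an illustrative choice.

```python
import math

def variance(xs):
    """Population variance: sum of squared deviations from the mean, divided by n."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / len(xs)

data = [1, 2, 3, 4, 5]
v = variance(data)
sd = math.sqrt(v)
print(f"V(X) = {v}, standard deviation = {sd:.3f}")
```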
Coefficient of Variation (Consistency Check)

• In order to check whether a variable is consistent or not, we need to compute the coefficient of variation:
$$C.V. = \frac{\sqrt{V(X)}}{\bar{X}} \times 100 = \frac{\sigma}{\mu} \times 100$$
• For any consistent variable, C.V. < 100%
• C.V. is a unit-less measure of dispersion.
Variance & Standard deviation (group-data)
Class Intervals    Frequency fi    Mid-Points xi      fi × xi    fi (xi – mean)²
2–4                2               (2+4)/2 = 3        2 × 3      2(3 – 7.16)² = 34.61
4–6                5               (4+6)/2 = 5        5 × 5      5(5 – 7.16)² = 23.33
6–8                9               (6+8)/2 = 7        9 × 7      9(7 – 7.16)² = 0.230
8–10               7               (8+10)/2 = 9       7 × 9      7(9 – 7.16)² = 23.70
10–12              2               (10+12)/2 = 11     2 × 11     2(11 – 7.16)² = 29.49
Total              Σfi = 25                           179        111.36

$$\text{Variance} = V(X) = \frac{\sum f_i (x_i - \bar{x})^2}{\sum f_i} = \frac{111.36}{25} = 4.45$$
Variable Comparison (Property of C.V.)

• The coefficient of variation for 1, 2, 3, 4, 5 (n = 5) is
$$C.V. = \frac{\sqrt{V(X)}}{\bar{X}} \times 100 = \frac{1.414}{3} \times 100 = 47.1\%$$
• And for the income data (Σfi = 25) it is
$$C.V. = \frac{\sqrt{V(X)}}{\bar{X}} \times 100 = \frac{2.111}{7.16} \times 100 = 29.48\%$$
• So technically, the income data is more consistent than the first five natural numbers.
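The comparison can be reproduced with the sketch below, which treats ungrouped data as grouped data with every frequency equal to 1. The function name `coefficient_of_variation` is an illustrative choice.

```python
import math

def coefficient_of_variation(values, freqs=None):
    """C.V. = (standard deviation / mean) * 100, with optional class frequencies."""
    if freqs is None:
        freqs = [1] * len(values)              # ungrouped data: each value counts once
    n = sum(freqs)
    mean = sum(f * x for f, x in zip(freqs, values)) / n
    var = sum(f * (x - mean) ** 2 for f, x in zip(freqs, values)) / n
    return math.sqrt(var) / mean * 100

cv_numbers = coefficient_of_variation([1, 2, 3, 4, 5])
# Income data: class mid-points and frequencies from the grouped table.
cv_income = coefficient_of_variation([3, 5, 7, 9, 11], [2, 5, 9, 7, 2])
print(f"C.V. of 1..5 = {cv_numbers:.2f}%, C.V. of income = {cv_income:.2f}%")
```

The smaller C.V. marks the more consistent variable, which here is the income data.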
Hand-Profile Analysis
(An exploratory approach)
[Figure: Outline of a hand marking the measurements X1 (thumb), X2, X3, X4, X5, X6 (span) and X7 (length), all in cms]

S.No.    Measurements (X)
1        X1
2        X2
3        X3
4        X4
5        X5
6        X6
7        X7

Determine the Mean, Standard deviation and Coefficient of Variation.
Computing Mean and Standard Deviation
Using Scientific Calculators
New Models (ES Series):
• Press MODE
• Select STAT
• Select 1-Var
• Enter the data in the data column that appears
• For finding the Mean and Standard Deviation: press Shift and then press 1
• Select VAR
• Select x̄ for the mean
• Select xσn for the Standard Deviation

Prev. Models (MS Series):
• Press MODE
• Select SD
• Entering the data: Obs1 M+, Obs2 M+, Obs3 M+, and so on for all remaining observations
• For finding the Mean and Standard Deviation: press Shift and then press 2
• Select x̄ for the mean
• Select xσn for the Standard Deviation
Approximate Confidence Interval
For any bell-shaped symmetrical distribution, the following can be shown:

1) μ ± σ will cover approximately 68% of observations
2) μ ± 2σ will cover approximately 95% of observations
3) μ ± 3σ will cover approximately 99.73% of observations

where μ and σ are the mean and standard deviation respectively.
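For the normal distribution, which is the usual model for a bell-shaped symmetrical curve, the coverage of each interval can be computed from the cumulative distribution function. The sketch below assumes that model and uses Python's standard `statistics.NormalDist`; the function name `coverage` is illustrative.

```python
from statistics import NormalDist

def coverage(k, mu=0.0, sigma=1.0):
    """Proportion of a normal distribution lying within mu +/- k * sigma."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)

for k in (1, 2, 3):
    print(f"mu +/- {k} sigma covers {coverage(k):.2%}")
```

The result does not depend on the particular values of the mean and standard deviation, only on k.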
Why Bell-Shaped Symmetrical Distribution?
• There are several symmetrical distributions.
• In a bell-shaped distribution, extreme values occur with less frequency.
• The majority falls within one standard deviation of the mean.
• It’s Nature’s Distribution. God created almost all natural measures with a bell-shaped distribution.
Empirical Proof for the Approx.
Confidence Intervals

• Bring one Neem leaf and measure its length in cms.
• Obtain the Mean and Standard Deviation.
• Empirically verify the following theorems:
1) μ ± σ will cover approximately 68% of observations
2) μ ± 2σ will cover approximately 95% of observations
3) μ ± 3σ will cover approximately 99.73% of observations
(Group the data and verify that it is bell-shaped and symmetric in nature.)
Statistical Process Control (SPC)
• The concept is based on Approximate Confidence Intervals.
• It is usually used to monitor a manufacturing process or to observe an individual’s performance.
• For this purpose, we set up a graph which is called a Control Chart.
• A Control Chart is bounded by two Control Limits.
A Control Chart
[Figure: A Control Chart showing a realization of observations 1 to 7 around the theoretical/claimed value μ, bounded by the Upper Control Limit (μ + 3σ) and the Lower Control Limit (μ – 3σ)]

By observing a realization we can monitor any process; the chart alerts us under two conditions:
1- when any observation crosses or even touches a pre-alarm control limit, or
2- when the motion of the realization becomes rhythmic.
Statistical Process Control (An activity)
• Consider the following manufacturing process: X = 2 × Ran#
• Simulate 7 observations using this simulator.
• Obtain a Control Chart using these parameter values: μ = 1, σ = 0.3.
• Deduce whether your process is under control or not. Comment on your realization.

S.No.    X = 2 × Ran#
1        2 × Ran#
2        2 × Ran#
3        2 × Ran#
4        2 × Ran#
5        2 × Ran#
6        2 × Ran#
7        2 × Ran#
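The activity can be sketched in code as follows. It is assumed here that Ran# is the calculator's uniform random number between 0 and 1, so `random.random()` stands in for it; the seed and the function names are arbitrary illustrative choices, and only the first alert condition (crossing or touching a limit) is checked.

```python
import random

def control_limits(mu, sigma, k=3):
    """Lower and upper control limits: mu - k*sigma and mu + k*sigma."""
    return mu - k * sigma, mu + k * sigma

def out_of_control(observations, lcl, ucl):
    """Return (serial number, value) for every observation on or beyond a limit."""
    return [(i, x) for i, x in enumerate(observations, start=1)
            if x <= lcl or x >= ucl]

rng = random.Random(0)                         # arbitrary seed, for repeatability only
observations = [2 * rng.random() for _ in range(7)]   # X = 2 x Ran#

lcl, ucl = control_limits(mu=1, sigma=0.3)
flagged = out_of_control(observations, lcl, ucl)

print(f"LCL = {lcl:.1f}, UCL = {ucl:.1f}")
print("Observations:", [round(x, 3) for x in observations])
print("Under control" if not flagged else f"Out of control at: {flagged}")
```

Detecting a rhythmic pattern in the realization would need a separate rule, which is left to visual inspection of the chart.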
A Class Activity
• Write your name at the top of today’s class work.
• Keep your class work open on your desk.
• Leave your seat, check the work of at least one of your classmates and write your remarks about him/her on a chit.
• Submit your remarks chit to me after writing the name of that classmate on it.
Introduction to Probability
• It is the science in which we either study a random experiment or observe a random phenomenon.
• In the study of probability, a sample space is needed, which is the set of all possible outcomes of a random experiment.
• It is the link between Descriptive and Inferential Statistics.
Probability Topics Tree
[Figure: Probability topics tree]
• Random Experiment → Outcomes → Sample Space (Counting Rules)
• Sample Space → Criteria → Random Variable (Numeric) → Probability Distribution → Expectation
• Events: Mutually Exclusive (Non-Overlapping) or Non-Mutually Exclusive (Overlapping)
• Probability: Independent, P(A∩B) = P(A) P(B); Dependent, Conditional Probability
What is the Distribution?
• Gives us a picture of the variability and central tendency.
• Can also show the amount of skewness and kurtosis.
Bell-Shaped Symmetrical Distribution
[Figure: Bell-shaped curve with μ marking the Central Tendency and the intervals μ ± σ, μ ± 2σ and μ ± 3σ marking the Dispersion]
Probability Distributions

• For any frequency distribution we need a variable, while for any probability distribution we need a random variable.
• A random variable is the data obtained by converting the outcomes of a sample space into numeric codes after defining a particular criterion, so
• a random experiment is necessary for a probability distribution.
• Any experiment with uncertain results (outcomes) is called a random experiment.
• For example, mixing an acid and a base will produce salt and water (it’s an experiment, but its result is certain), whereas
• tossing a dice or a coin, or drawing a card from a well-shuffled deck, will produce a random result (these are examples of random experiments). So in each random experiment we collect all possibilities (outcomes) and make a sample space.
Object:
• Toss a coin 5 times, 10 times, 15 times, … 50 times.
• Determine the proportion of heads P(H) in each experiment.
• Plot P(H) v/s Experiment and obtain the realization.
• Comment on the coin by observing the realization.

S.No.    Outcome    P(H)
1        H
2        H
3        T
4        T
5        T          2/5 = 0.4
6        H
7        H
8        H
9        H
10       T          6/10 = 0.6
…        …
15
…        …
50       T          ?/50 = …

[Figure: Realization of P(H) (y-axis, 0.0 to 1.0) against experiment number (x-axis, 1 to 10)]
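The coin-tossing activity can be simulated as a sketch before doing it by hand. The simulated coin, the seed and the function name `running_proportions` are illustrative assumptions; a real coin will of course give its own sequence.

```python
import random

def running_proportions(tosses, step=5):
    """P(H) after the first 5, 10, 15, ... tosses."""
    return [tosses[:n].count("H") / n
            for n in range(step, len(tosses) + 1, step)]

# The ten outcomes recorded on the slide give 2/5 and 6/10.
print(running_proportions(list("HHTTTHHHHT")))        # [0.4, 0.6]

rng = random.Random(42)                                # arbitrary seed
tosses = [rng.choice("HT") for _ in range(50)]         # 50 simulated tosses
proportions = running_proportions(tosses)
for experiment, p in enumerate(proportions, start=1):
    print(f"Experiment {experiment}: {experiment * 5} tosses, P(H) = {p:.2f}")
```

For a fair coin the realization should settle around 0.5 as the number of tosses grows.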
Formation of Sample Spaces
Random Experiments Related to a Fair Coin:
Random Experiment # 1: Tossing a fair coin once
S = {H, T}    2^1 = 2 outcomes

Random Experiment # 2: Tossing a fair coin twice, or tossing 2 fair coins once
S = {HH, HT, TH, TT}    2^2 = 4 outcomes

Random Experiment # 3: Tossing a fair coin thrice, or tossing 3 fair coins once
S = {HHH, HHT, HTH, THH, THT, TTH, HTT, TTT}    2^3 = 8 outcomes
In general, 2^n is the number of outcomes when a two-sided coin is tossed n times.
Formation of Dichotomous SS
• A truth table can help us form the sample space, e.g. the sample space of Rand. Exp. # 3.
• The formation rule is simple: the run length of each next column should be double that of the preceding column (H and T alternate one at a time in the 1st column, in pairs in the 2nd and in fours in the 3rd).
• Outcomes can be read horizontally.

S. No.    1st    2nd    3rd
1         H      H      H
2         T      H      H
3         H      T      H
4         T      T      H
5         H      H      T
6         T      H      T
7         H      T      T
8         T      T      T
Random Experiments with Dice
Random Experiment #4: Tossing a fair dice once
S = {1, 2, 3, 4, 5, 6}    6^1 = 6 outcomes

Random Experiment #5: Tossing a fair dice twice, or tossing two fair dice once
S = {11, 12, 13, 14, 15, 16,
     21, 22, 23, 24, 25, 26,
     31, 32, 33, 34, 35, 36,
     41, 42, 43, 44, 45, 46,
     51, 52, 53, 54, 55, 56,
     61, 62, 63, 64, 65, 66}    6^2 = 36 outcomes
Random Experiments Contd..
Random Experiment #6: Tossing a fair coin and a fair dice once
S = {H1, H2, H3, H4, H5, H6, T1, T2, T3, T4, T5, T6}    2^1 × 6^1 = 12 outcomes

Random Experiment #7: Tossing 2 fair coins and a fair dice once
S = {HH1, HH2, HH3, HH4, HH5, HH6,
     HT1, HT2, HT3, HT4, HT5, HT6,
     TH1, TH2, TH3, TH4, TH5, TH6,
     TT1, TT2, TT3, TT4, TT5, TT6}    2^2 × 6^1 = 24 outcomes
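The sample spaces of the coin and dice experiments can be generated mechanically as a Cartesian product. This sketch uses `itertools.product` from the Python standard library; the helper name `sample_space` is illustrative.

```python
from itertools import product

def sample_space(*devices):
    """All outcomes when each device (coin, dice, ...) is used once, in order."""
    return ["".join(outcome) for outcome in product(*devices)]

coin = "HT"
dice = "123456"

print(sample_space(coin, coin))                  # Random Experiment # 2
print(len(sample_space(coin, coin, coin)))       # Random Experiment # 3
print(len(sample_space(dice, dice)))             # Random Experiment # 5
print(sample_space(coin, dice))                  # Random Experiment # 6
print(len(sample_space(coin, coin, dice)))       # Random Experiment # 7
```

The number of outcomes is always the product of the number of faces of each device, e.g. 2 × 2 × 6 = 24.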
Random Experiments A Deck of Cards
Random Experiment #8: Drawing a card from a well-shuffled deck of playing cards.
S = {Hearts:      King, Queen, Jack, Ace, 2, 3, …, 10    (13 cards)
     Diamonds:    King, Queen, Jack, Ace, 2, 3, …, 10    (13 cards)
     Clubs:       King, Queen, Jack, Ace, 2, 3, …, 10    (13 cards)
     Spades:      King, Queen, Jack, Ace, 2, 3, …, 10}   (13 cards)
Total = 52
Formation of Events
What is an Event?
• It’s a logical statement which should be followed strictly.
• We always collect the matching outcomes from the sample space after viewing the event statement.

For e.g. if we consider Random Exp. # 2:
Object: Tossing a fair coin twice, S = {HH, HT, TH, TT}
Event(s):
A = {First toss should be a Head}: A = {HH, HT}
B = {Exactly one Tail in the outcome}: B = {HT, TH}
Thus we formed two non-mutually exclusive events.

[Figure: Venn diagram with HH in A only, HT in both A and B, TH in B only and TT outside both]

Replicate the same work for Random Experiment #3.
Computing Probability
Probability of an Event
• P(A) stands for the probability of an event A, such that
P(A) = n(A)/n(S)
where
• n(A) is the number of outcomes present in event A, and
• n(S) is the number of outcomes present in the sample space.
• Probability is the proportion of an event in a sample space.
• For any event A, 0 ≤ P(A) ≤ 1, where A ⊆ S.
Computing Probabilities (Example)
• Random Experiment # 2: Tossing a fair coin twice or
tossing two fair coins, once.
Sample Space S={HH,HT,TH,TT},
Event(s)
A={First toss should be a Head}, A={HH, HT}
B={Exactly one Tail in the outcome}: B={HT, TH}
Therefore the probabilities will be:
P(A) = 2/4 = 0.5    (50% chance)
P(B) = 2/4 = 0.5    (50% chance)
Interpreting Probability
A probability is attached to every event and should be interpreted using 3 components:
1) Object of the Random Experiment
2) Value of the Probability
3) Event Statement
For e.g., the interpretation of P(A) = 0.5 can be written as:
If we toss a fair coin twice, we have a 50% chance of getting a head in the first toss.
Similarly, P(B) = 0.5 would be:
If we toss a fair coin twice, we have a 50% chance of getting exactly one tail in both tosses.
Union, Intersection and Complement
For the same Random Experiment # 2, the following operations show the results and the relevant interpretations (where ∪ = OR, ∩ = AND, A’ = not A):
Since S = {HH, HT, TH, TT}, A = {HH, HT} and B = {HT, TH},
therefore,
A∪B = {HH, HT, TH}    P(A∪B) = 3/4 = 0.75    75%
If we toss a fair coin twice, we have a 75% chance of getting a head in the first toss OR exactly one tail in both tosses.
A∩B = {HT}    P(A∩B) = 1/4 = 0.25    25%
A’ = S – A = {TH, TT}    P(A’) = 2/4 = 0.50    50%
P(A’) = 1 – P(A)
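These event operations map directly onto Python sets, as the sketch below shows. The helper name `prob` is illustrative, and `Fraction` is used only to keep the probabilities exact.

```python
from fractions import Fraction

S = {"HH", "HT", "TH", "TT"}                       # Random Experiment # 2

A = {o for o in S if o[0] == "H"}                  # first toss is a Head
B = {o for o in S if o.count("T") == 1}            # exactly one Tail

def prob(event, space=S):
    """P(event) = n(event) / n(S) for equally likely outcomes."""
    return Fraction(len(event), len(space))

print("P(A)       =", prob(A))
print("P(B)       =", prob(B))
print("P(A or B)  =", prob(A | B))                 # union
print("P(A and B) =", prob(A & B))                 # intersection
print("P(not A)   =", prob(S - A))                 # complement
```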
Practice Questions
Q1) If we toss a fair coin three times, determine the
following probabilities:
a) P(A)=Probability of getting exactly one Head in all tosses?
b) P(B)=Probability of getting Tail in the first toss?
c) P(C)=Probability of getting exactly one head AND one tail? P(One Head ∩ One Tail)
d) P(D)=Probability of NOT getting exactly one head in all
tosses? P(A’)
e) P(F)=Probability of Either getting exactly one head in all
tosses OR tail in the first toss? P(AUB)
Practice Questions (Contd..)
Q2) If we toss a fair dice twice, determine the following
Probabilities: (Ref. Random Experiment #5)
a) P(A)=Probability of getting same number on both Dice?
b) P(B)=Probability of getting odd number in both Dice?
c) P(C)=Probability of getting sum of both numbers equals
to 5?
d) P(D)=Probability of getting an odd number AND an even
number on two Dice respectively.
e) P(F)=Probability of NOT getting the same number on
both Dice?
Practice Questions (Contd..)
Q3) If we toss a fair COIN and a fair DICE once, determine the following probabilities: (Ref. Random Experiment #6)

a) P(A)=Probability of getting exactly one Head on the coin?
b) P(B)=Probability of getting an odd number on the Dice?
c) P(C)=Probability of getting exactly one Head with an odd number on the Dice? P(A∩B)
d) P(D)=Probability of getting a number less than 4 on the Dice?
e) P(F)=Probability of NOT getting exactly one Head on the coin? P(A’)=1-P(A)
Practice Questions (Contd..)
Q4) If we toss two fair COINS and a fair DICE once, determine the following probabilities: (Ref. Random Experiment #7)

a) P(A)=Probability of getting exactly one Head on the coins?
b) P(B)=Probability of getting an odd number on the Dice?
c) P(C)=Probability of getting exactly one Head with an odd number on the Dice? P(A∩B)
d) P(D)=Probability of getting a number less than 4 on the Dice?
e) P(F)=Probability of NOT getting exactly one Head on the coins? P(A’)=1-P(A)
Independence/Dependence
Conditional Probability
• A contingency table can help us to understand the concept of independent or dependent events.
• A contingency table is a bivariate frequency table showing the joint distribution of two variables.
• Usually two qualitative variables are used to form a contingency table.
Contingency Table (An Example)
Consider the following table, which represents Gender (Male/Female) and Eyesight Status (Glasses/No Glasses):

Gender/Eyesight     Male (M)       Female (F)     Row Total
Glasses (G)         05 (M∩G)       12 (F∩G)       G = 17
No Glasses (NG)     09 (M∩NG)      19 (F∩NG)      NG = 28
Column Total        M = 14         F = 31         S = 45
Conditional Probability (Example Exercise)

Q: If we select a person at random from this community, determine the probability that the selected person will be:
a) A Male?
Ans. P(M) = 14/45 = 0.31    31%
b) A Male with Glasses?
Ans. P(M∩G) = 05/45 = 0.11    11%
c) A Male, given that he wears Glasses?
Ans. P(M/G) = 05/17 = 0.294    29.4%
P(M/G) = P(M∩G)/P(G) = (05/45)/(17/45) = 0.294
Conditional Probability
(Independence Check)
If Gender is independent of Eyesight, then the following should hold:
P(M/G) ≈ P(M)
We checked this empirically and got this result:
0.294 ≈ 0.31
Therefore we can say that Gender is independent of Eyesight, which is quite obvious.
We might get different answers for the simple and conditional probabilities if we consider the case of Gender v/s Heart Disease.
Conditional Probability
(Independence Check)
We might get a different result if we make a minor change in the table (the Female counts become 19 with Glasses and 12 with No Glasses):

Gender/Eyesight     Male (M)    Female (F)    Row Total
Glasses (G)         05          19            G = 24
No Glasses (NG)     09          12            NG = 21
Column Total        M = 14      F = 31        S = 45

Now we can observe the dependence from the following result:
P(M/G) = 05/24 = 0.208 is not equal to P(M) = 14/45 = 0.31
Hence Gender and Eyesight are dependent.
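The independence check on both versions of the table can be sketched as below. The nested-dictionary layout and the function name `conditional_and_marginal` are illustrative choices; how close the two probabilities must be to call the variables independent is a judgment the code leaves to the reader.

```python
from fractions import Fraction

def conditional_and_marginal(table, column, row):
    """Return (P(column | row), P(column)) from a table of counts table[row][column]."""
    grand_total = sum(sum(cells.values()) for cells in table.values())
    row_total = sum(table[row].values())
    column_total = sum(cells[column] for cells in table.values())
    return (Fraction(table[row][column], row_total),
            Fraction(column_total, grand_total))

original = {"G": {"M": 5, "F": 12}, "NG": {"M": 9, "F": 19}}
modified = {"G": {"M": 5, "F": 19}, "NG": {"M": 9, "F": 12}}

for name, table in (("original", original), ("modified", modified)):
    p_m_given_g, p_m = conditional_and_marginal(table, "M", "G")
    print(f"{name}: P(M/G) = {float(p_m_given_g):.3f}, P(M) = {float(p_m):.3f}")
```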
Conditional Probability
(Class Activity)
Generate the following bivariate data by recalling your memories:

S.No.    Gender (M/F)    Like/Dislike
1        M               L
2        M               D
…
50       F               L

These people are from:
1- Your family and relatives
2- Your friends and colleagues
3- Your Muhalla people (living in the vicinity)
Conditional Probability
(Class Activity)
The compiled bivariate data are then converted into a contingency table:

            Male    Female    Total
Like        a       b         a+b
Dislike     c       d         c+d
Total       a+c     b+d       N = 50

P(M/L) = a/(a+b)
P(M) = (a+c)/N
Form a bivariate frequency table and test whether Gender is independent of your views or not, i.e.:
P(M/L) ≈ P(M) or P(M/L) ≠ P(M)
An independent result shows no gender discrimination in the MIND, and vice versa for dependence.
Introductory Statistics 9th Ed.
Selected Probability Solved Examples and Exercises:
(Introducing Contingency Tables)
• Exercises 4.4 Page 171
Attempt Selected Exercises
Random Variables (Example)
• If we toss a fair coin twice {Random Exp. # 2}, the sample space will contain all possibilities and it will be:
S = {HH, HT, TH, TT}
• Now if we define the following criterion, i.e. X = {number of heads in each outcome}, then X will be a random variable having these values:
X = {2, 1, 1, 0}
• Finally we get the probability distribution, so that P(X=0) = 0.25 is the chance of having no heads in both tosses, and so on.

A Probability Distribution
X    P(X=x)
0    1/4
1    2/4
2    1/4

Replicate the same for Random Exp. # 3.
Game Theory (Gambler’s Example)
• Suppose a gambler offers you a game in which a fair coin and a fair dice are each tossed once.
• He agrees to pay you Rs. 100/- for a Head on the coin and Rs. 300/- for a number greater than 4 on the dice.
• On the other hand, you will pay him Rs. 50/- for a Tail on the coin and Rs. 250/- for a number less than 5 on the dice.
• Determine whether the game gives you an expected loss or an expected gain.
Gambler’s Example: Working
• This is Random Experiment # 6, in which we toss a fair coin and a fair dice once.
S = {H1, H2, H3, H4, H5, H6, T1, T2, T3, T4, T5, T6}
H = +100, T = –50, {1, 2, 3, 4} = –250, {5, 6} = +300
X = {–150, –150, –150, –150, +400, +400, –300, …, +250}

X         –300    –150    +250    +400
P(X=x)    4/12    4/12    2/12    2/12
Mathematical Expectation E(X)
• In order to get the total effect of the probabilities on the values of X, we need to form two more columns:

X        P(X=x)    X × P(X=x)       X² × P(X=x)
–300     0.333     –300 × 0.333     (–300)² × 0.333
–150     0.333     –150 × 0.333     (–150)² × 0.333
+250     0.167     +250 × 0.167     (+250)² × 0.167
+400     0.167     +400 × 0.167     (+400)² × 0.167
Total              E(X) = –41.67    E(X²) = 74583.33

The totals use the exact probabilities 4/12 and 2/12; with the rounded values 0.333 and 0.167 the last column adds up to 74620.


Mathematical Expectation E(X)
• In any probability distribution, the Mean is:
E(X) = Σ X P(X=x)
• And the Variance V(X) is:
V(X) = E(X²) – [E(X)]²
where
E(X²) = Σ X² P(X=x) and E(X) = Σ X P(X=x)
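The gambler's example can be worked through in code as a sketch: build the sample space of Random Experiment # 6, convert each outcome into a payoff, and apply the two formulas. The variable and function names are illustrative, and exact fractions are used to avoid rounding the probabilities.

```python
from fractions import Fraction
from itertools import product

coin_payoff = {"H": 100, "T": -50}
dice_payoff = {face: (300 if face > 4 else -250) for face in range(1, 7)}

# Every (coin, dice) outcome is equally likely: 12 outcomes in total.
outcomes = list(product(coin_payoff, dice_payoff))
p_each = Fraction(1, len(outcomes))

distribution = {}                                   # maps x to P(X = x)
for coin, face in outcomes:
    x = coin_payoff[coin] + dice_payoff[face]
    distribution[x] = distribution.get(x, 0) + p_each

def expectation(dist, power=1):
    """E(X^power) = sum of x^power * P(X = x)."""
    return sum(x ** power * p for x, p in dist.items())

e_x = expectation(distribution)
e_x2 = expectation(distribution, power=2)
v_x = e_x2 - e_x ** 2                               # V(X) = E(X^2) - [E(X)]^2

print("E(X)   =", round(float(e_x), 2))
print("E(X^2) =", round(float(e_x2), 2))
print("V(X)   =", round(float(v_x), 2))
```

A negative E(X) means the game is an expected loss for the player.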
Quiz # 2

Odd Rows
• Suppose a gambler offers you a game in which two fair coins and a fair dice are tossed once.
• He agrees to pay you Rs. 100/- for each Head on the coins and Rs. 300/- for a number greater than 4 on the dice.
• On the other hand, you will pay him Rs. 50/- for each Tail on the coins and Rs. 250/- for a number less than 5 on the dice.
• Determine whether the game gives you an expected loss or an expected gain.

Even Rows
• Suppose a gambler offers you a game in which two fair coins and a fair dice are tossed once.
• He agrees to pay you Rs. 150/- for each Head on the coins and Rs. 400/- for a number greater than 4 on the dice.
• On the other hand, you will pay him Rs. 100/- for each Tail on the coins and Rs. 250/- for a number less than 5 on the dice.
• Determine whether the game gives you an expected loss or an expected gain.
Combinatorial Problems
What are Counting Techniques?
• The concept is usually used when the sample space of a random experiment is too large to list.
• It is used when we try to explore the different ways to arrange/rearrange objects.
• It is used when we want to know how huge the domain of possibilities is, even in assigning simple tasks to different individuals…
Counting Rules
In order to understand the concept, we can consider the following case:
• We have 3 objects A, B, C and we want to choose 2 objects out of the 3.
• Then we have 2 questions before we proceed…
Q1: Is duplication allowed? (Y/N)
Q2: Is order important (AB ≠ BA)? (Y/N)
Counting Rules (Power Principle)
If the answer to both questions is YES, i.e.
Q1: Is duplication allowed? (Y)
Q2: Is order important (AB ≠ BA)? (Y)
then the arrangements are:
AA AB AC
BA BB BC
CA CB CC
The total number of ways is N^r, where N=3 and r=2, therefore 3^2 = 9 ways. This is known as the 'POWER RULE'.
Counting Rules (Permutations)
If the answers are as below:
Q1: Is duplication allowed? (N)
Q2: Is order important (AB ≠ BA)? (Y)
then the arrangements are:
AB AC
BA BC
CA CB
The total number of ways is NPr = N!/(N-r)!, where N=3 and r=2, therefore 3P2 = 6 ways. This is known as 'PERMUTATIONS'.
Counting Rules (Combinations)
If the answers are as below:
Q1: Is duplication allowed? (N)
Q2: Is order important (AB ≠ BA)? (N)
then the arrangements are:
AB AC
BC
The total number of ways is NCr = N!/[r!(N-r)!], where N=3 and r=2, therefore 3C2 = 3 ways. This is known as 'COMBINATIONS'.
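The three counting rules can be listed out by machine for the same objects A, B, C with r = 2. This is a minimal sketch using Python's standard `itertools` module; the variable names are illustrative only.

```python
from itertools import combinations, permutations, product

objects = ["A", "B", "C"]
r = 2

# Duplication allowed, order matters: N^r arrangements (power rule).
power_rule = ["".join(t) for t in product(objects, repeat=r)]

# No duplication, order matters: NPr arrangements (permutations).
perms = ["".join(t) for t in permutations(objects, r)]

# No duplication, order ignored: NCr selections (combinations).
combs = ["".join(t) for t in combinations(objects, r)]

print(len(power_rule), power_rule)
print(len(perms), perms)
print(len(combs), combs)
```

The list lengths should agree with the formulas N^r, NPr and NCr for N=3 and r=2.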
Counting Rules (A Class Activity)
• Once your CLASS TEACHER says 'START', you all have to change your seats.
• Do it as quickly as you can.
• Compute the TOTAL TIME required for all possible arrangements:
• NPr x time in seconds = Total seconds
• (Total Seconds)/60 = Total minutes
• (Total Minutes)/60 = Total Hours
• (Total Hours)/24 = Total days
• (Total days)/365 = Total Years required
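The unit conversions in this activity can be sketched as a small function. The class size and the time per change of seats are not given in the slides, so the values 30 students and 20 seconds below are purely hypothetical, as is the function name; every student is assumed to take a seat, so r = N.

```python
import math

def years_for_all_arrangements(n_students, seconds_per_change):
    """Convert NPn seat arrangements into years, following the steps above."""
    total_seconds = math.perm(n_students, n_students) * seconds_per_change
    total_minutes = total_seconds / 60
    total_hours = total_minutes / 60
    total_days = total_hours / 24
    return total_days / 365

# Hypothetical values: 30 students, 20 seconds per change of seats.
print(f"{years_for_all_arrangements(30, 20):.3e} years")
```

Even for a modest class the result is astronomically large, which is the point of the activity.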
Counting Rules (Cases)
Solve the following cases with a suitable counting rule:
1- In how many ways can we decide the batting order of a cricket team?
The answer is permutations, because duplication is not allowed but order matters, therefore:
10P10 = 10! / (10-10)! = 3628800 ways
Counting Rules (Cases)
Determine whether each of the following situations requires calculating a permutation or a combination:
a) Selecting three students to attend a conference in Washington, D.C.
Answer: combination
b) Selecting a lead and an understudy for a school play.
Answer: permutation
c) Assigning students to their seats on the first day of school.
Answer: permutation
Counting Rules (Cases)
• Evaluate:
Answer=720
• A coach must choose five starters from a team
of 12 players. How many different ways can the
coach choose the starters?
Answer=12C5=792
• Which of the following is NOT equivalent to ?
Counting Rules (Cases)
• The local Family Restaurant has a daily breakfast special in which the customer may choose one item from each of the following groups:

Breakfast Sandwich    Accompaniments        Juice
egg and ham           breakfast potatoes    orange
egg and bacon         apple slices          cranberry
egg and cheese        fresh fruit cup       tomato
                      pastry                apple
                                            grape

a) How many different breakfast specials are possible?
Answer: 3C1 x 4C1 x 5C1 = 60 breakfast choices
b) How many different breakfast specials without meat are possible?
Answer: 1C1 x 4C1 x 5C1 = 20 meatless breakfast choices
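The breakfast counts can be confirmed by listing every special. This is a minimal sketch, assuming the three menu groups shown in the table; treating ham and bacon as the only meat items is an assumption read off the menu, and the variable names are illustrative.

```python
from itertools import product

sandwiches = ["egg and ham", "egg and bacon", "egg and cheese"]
accompaniments = ["breakfast potatoes", "apple slices", "fresh fruit cup", "pastry"]
juices = ["orange", "cranberry", "tomato", "apple", "grape"]

# Every special is one (sandwich, accompaniment, juice) triple.
specials = list(product(sandwiches, accompaniments, juices))

# Meatless specials: only the sandwich group contains meat (ham, bacon).
meat_words = ("ham", "bacon")
meatless = [s for s in specials if not any(w in s[0] for w in meat_words)]

print(len(specials), len(meatless))
```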
Counting Rules (Cases)
• In how many ways can we design a car's number plate if it comprises 3 letters followed by 3 digits?
Answer: 26^3 x 10^3 = 17576 x 1000 = 17576000 ways
• What if duplication is not allowed in the same case?
Answer: 26P3 x 10P3 = 15600 x 720 = 11232000 ways
Counting Rules (Cases)
• In how many ways can we set a password for our email address if it comprises 6 letters?
Answer: 26^6 =
• What if it contains 6 letters or numbers or both?
Answer: (26+10)^6 =
• What if it contains 6 letters and/or numbers and duplication is not allowed?
Answer: 36P6 =
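The number plate and password expressions can be evaluated directly. This is a minimal sketch, assuming a 26-letter alphabet with no distinction between upper and lower case, and 10 digits; the variable names are illustrative.

```python
import math

# Number plate: 3 letters followed by 3 digits.
plates_with_repeats = 26 ** 3 * 10 ** 3
plates_no_repeats = math.perm(26, 3) * math.perm(10, 3)

# 6-character password (power rule, duplication allowed).
letters_only = 26 ** 6
letters_or_digits = (26 + 10) ** 6

# 6-character password from 36 symbols, duplication not allowed.
no_repeats = math.perm(36, 6)

print(plates_with_repeats, plates_no_repeats)
print(letters_only, letters_or_digits, no_repeats)
```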
A Probability Density Function (PDF)
• It is a mathematical function which can generate the probabilities in a probability distribution.
• With reference to the previous random variable examples, we can generate the same probabilities using a mathematical function, i.e. P(X=x) = nCx / 2^n.
• If we put n=2 and x=0,1,2, we can observe the same table.
• For X=0, we can compute P(X=0) = 2C0 / 2^2 = 1/4, and so on.

A Probability Distribution
X    P(X=x) = 2Cx / 2^2
0    1/4
1    2/4
2    1/4
Sequence of Bernoulli Trials (by James Bernoulli) results in a Binomial Random Experiment
• In dichotomous random experiments, we always encounter Bernoulli trials (trials having two possible outcomes, i.e. Success or Failure).
• Consider a sequence of 'n' Bernoulli trials containing 'x' successful trials, e.g. S,S,F,F,F,S,S,F,S,…,F.
• It must contain 'x' successes and 'n-x' failures. Therefore, for the probability of 'x' successes in 'n' trials, we get the following PDF:
P(X=x) = nCx p^x (1-p)^(n-x), where x = 0,1,2,…,n
where 'n' is the number of independent trials and 'p' is the probability of success in each trial.
Binomial Random Experiment
(An Example)
[Figure: bar chart of P(X=x) for x = 0, 1, 2, 3; vertical axis from 0 to 0.5]

• Suppose that in a particular open-heart surgery the chance of survival of a patient is 70% (p=0.7). If 3 patients undergo the same operation, the probabilities for the number of survivors X are:
P(X=0)=0.027, P(X=1)=0.189, P(X=2)=0.441 and P(X=3)=0.343. These results indicate that the most likely outcome is that exactly two of the three patients survive.
MS-EXCEL syntax is =BINOMDIST(x,n,p,cumul)
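The same four probabilities can be reproduced outside Excel. This is a minimal sketch of the Binomial PDF using only Python's standard library; the function name `binomial_pmf` is illustrative.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X=x) = nCx * p^x * (1-p)^(n-x)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

n, p = 3, 0.7  # 3 patients, 70% chance of survival each
for x in range(n + 1):
    print(x, round(binomial_pmf(x, n, p), 3))
```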
Binomial Probability Distribution
(A Case)
• I need to obtain a sample proportion based on a hand poll of the class:
• Tell me, how many of you are pro-Imran Khan as the leader of PTI?
• Based on the hand-poll result, we can obtain a sample proportion p̂.
• Now, determine the probability of finding 7 pro-Imran Khan students if we select 10 Business school students at random, i.e. P(X=7) where n=10 and p = p̂:
P(X=7) = 10C7 x p̂^7 x (1 - p̂)^3
Binomial Probability Distribution
(A Case)
• Probability of finding at most 3 pro-Imran Khan students?
P(X ≤ 3) = P(X=0)+P(X=1)+P(X=2)+P(X=3)
which is the same as P(X<4)
• Probability of finding at least 7 pro-Imran Khan students?
P(X ≥ 7) = P(X=7)+P(X=8)+P(X=9)+P(X=10)
which is the same as P(X>6)
• Probability of finding 6 to 8 pro-Imran Khan students?
(If nothing is stated, the default is inclusive.)
P(6 ≤ X ≤ 8) = P(X=6)+P(X=7)+P(X=8)
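The hand-poll probabilities can be sketched in code once a sample proportion is available. The value 0.6 below is a purely hypothetical poll result, not a figure from the slides, and the helper names are illustrative.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X=x) = nCx * p^x * (1-p)^(n-x)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

def prob_between(lo, hi, n, p):
    """P(lo <= X <= hi), with both ends inclusive."""
    return sum(binomial_pmf(x, n, p) for x in range(lo, hi + 1))

n = 10
p_hat = 0.6  # hypothetical hand-poll proportion, replace with the class result

exactly_7 = binomial_pmf(7, n, p_hat)
at_most_3 = prob_between(0, 3, n, p_hat)      # P(X <= 3), same as P(X < 4)
at_least_7 = prob_between(7, n, n, p_hat)     # P(X >= 7), same as P(X > 6)
six_to_eight = prob_between(6, 8, n, p_hat)   # P(6 <= X <= 8)

print(exactly_7, at_most_3, at_least_7, six_to_eight)
```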
Mathematical Expectation
(A Binomial Distribution case)
• As we know, the mean and variance of any probability distribution are:
E(X) = Σ X P(X=x) and V(X) = E(X²) - [E(X)]²
• But using the Binomial PDF, we can compute both measures in terms of the parameters:
E(X) = np and V(X) = np(1-p)
• Determine the average number of pro-Imran Khan students in the group of 10.
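The shortcut formulas np and np(1-p) can be compared with the general definitions of mean and variance. This is a minimal numerical check, assuming n=10 and a hypothetical proportion p=0.6 that does not come from the slides.

```python
from math import comb

n, p = 10, 0.6  # p is a hypothetical proportion chosen for illustration

# Binomial probabilities P(X=x) for x = 0..n.
pmf = [comb(n, x) * p ** x * (1 - p) ** (n - x) for x in range(n + 1)]

# General definitions: E(X) = sum of x*P(X=x), V(X) = E(X^2) - [E(X)]^2.
mean = sum(x * px for x, px in enumerate(pmf))
mean_sq = sum(x ** 2 * px for x, px in enumerate(pmf))
variance = mean_sq - mean ** 2

print(mean, n * p)
print(variance, n * p * (1 - p))
```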
Binomial Distribution (MCQs)
1. If n=5 and p=0.2, then P(X=2) is:
a) 0.409 b) 0.2048 c) 0.0512 d) None of these
2. If n=5 and p=0.2, then P(X>3) is:
a) 0.0643 b) 0.576 c) 0.0067 d) None of these
3. If the mean of a binomial distribution is 1 and q=0.98, then n=?
a) 50 b) 1 c) 10 d) None of these
Binomial Distribution (MCQs)
4. If the mean of a binomial distribution is 1 and the variance is 0.98, then p=?
a) 0.98 b) 0.02 c) 0.0512 d) None of these
5. If the mean of a binomial distribution is 1 and the variance is 0.98, then q=?
a) 0.98 b) 0.02 c) 0.0512 d) None of these
6. If n=10 and p=0.9, then the mean and variance are:
a) 9 and 0.9 b) 0.9 and 9 c) 9 and 100 d) None
The Poisson Distribution
• Siméon Denis Poisson was a French mathematician.
• He worked on the Binomial PDF and obtained its LIMITING FORM by letting n → ∞ and p → 0, while np stays fixed at λ.
• The Poisson probability density function can be written as:
P(X=x) = e^(-λ) λ^x / x!
• The domain of the Poisson PDF is 0 ≤ X < ∞.
• Here λ is the parameter.
• λ shows the rate of occurrence, or the average.
Euler's Number or Napier's Constant
• This is 'e': http://en.wikipedia.org/wiki/E_(mathematical_constant)
• e is said to be life's function.
• Whenever we have a lifetime distribution or natural growth data, the model should contain 'e'.
• For example, the well-known Normal Distribution is based on this constant. Gauss, the German mathematician, used it.
• The following limit defines e:
e = lim (x → 0) (1 + x)^(1/x)
If we put x=0 directly, the expression is undefined, because 1/x becomes 1/0.
So we have to 'poke' the function by putting in a value close to zero, e.g. x=0.001:
(1 + 0.001) ^ (1/0.001) = 2.7169
As x gets closer to zero, the result approaches e = 2.71828.
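The idea of 'poking' the function with smaller and smaller values of x can be tried numerically. This is a minimal sketch; the function name `e_approx` and the chosen values of x are illustrative.

```python
import math

def e_approx(x):
    """Evaluate (1 + x)^(1/x) for a small non-zero x."""
    return (1 + x) ** (1 / x)

# The closer x is to zero, the closer the value is to e.
for x in (0.1, 0.01, 0.001, 0.0001, 0.00001):
    print(f"x = {x:<8} (1 + x)^(1/x) = {e_approx(x):.5f}")

print(f"math.e = {math.e:.5f}")
```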
The Poisson Distribution
(Working with PDF)
• If the value of the parameter is given, i.e. λ=2, then find the probability distribution of X.

X         0       1       2       3       4       5       6       7       8
P(X=x)    0.1353  0.2707  0.2707  0.1804  0.0902  0.0361  0.0120  0.0034  0.0009

• Determine the following probabilities.
i) P(X=2) = 0.2707
ii) P(X<2) = 0.1353 + 0.2707 = 0.406
iii) P(X ≥ 2) = 1 - P(X<2) = 1 - 0.406 = 0.594
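The table and the three probabilities can be reproduced with the standard Poisson formula P(X=x) = e^(-λ) λ^x / x!. This is a minimal sketch using only Python's standard library; the function name `poisson_pmf` is illustrative.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(X=x) = e^(-lam) * lam^x / x!"""
    return exp(-lam) * lam ** x / factorial(x)

lam = 2
table = [round(poisson_pmf(x, lam), 4) for x in range(9)]

p_equal_2 = poisson_pmf(2, lam)
p_less_than_2 = poisson_pmf(0, lam) + poisson_pmf(1, lam)
p_at_least_2 = 1 - p_less_than_2   # complement rule

print(table)
print(round(p_equal_2, 4), round(p_less_than_2, 3), round(p_at_least_2, 3))
```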
Binomial v/s Poisson Distribution
• If a proportion or a percentage is given along with the number of trials or the sample size, it is a Binomial random variable.
e.g.: it is given that 2% of the items in a lot are defective; determine the number of defective items in 10 lots.
• If a rate or an average of an event within a time, distance or space is provided, then it is a Poisson random variable.
e.g.: 2 items per lot are defective; determine the number of defective items in the next lot.
The Poisson Distribution
(Working with PDF)
• If the value of the parameter is given, i.e. λ=2, then find the probability distribution of X.

X         0       1       2       3       4       5       6       7       8
P(X=x)    0.1353  0.2707  0.2707  0.1804  0.0902  0.0361  0.0120  0.0034  0.0009

• Determine the probabilities of X using the Binomial PDF with n=50 and p=0.04.
• Did you find any similarity between the Binomial and Poisson probabilities?
• It is due to the asymptotic nature of the Binomial distribution, as E(X) = np = 50 x 0.04 = 2, which equals the given λ.
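The comparison asked for here can be sketched side by side. This is a minimal illustration, assuming the standard Binomial and Poisson formulas with n=50, p=0.04 and λ=np; the list names are illustrative.

```python
from math import comb, exp, factorial

n, p = 50, 0.04
lam = n * p  # the matching Poisson parameter, equal to np

binomial = [comb(n, x) * p ** x * (1 - p) ** (n - x) for x in range(9)]
poisson = [exp(-lam) * lam ** x / factorial(x) for x in range(9)]

print(" x  Binomial  Poisson")
for x, (b, q) in enumerate(zip(binomial, poisson)):
    print(f"{x:2d}  {b:.4f}    {q:.4f}")
```

The two columns are close but not identical, since n=50 is large but finite.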
The Poisson Distribution
(Cases)
• If a secretary makes 2 mistakes per page on average, determine the probability that she will make no mistake on the next page.
P(X=0) with λ=2.
• According to a survey, there are approximately 4 field mice per acre of land. Determine the probability that at most 2 field mice will be found on the next acre of land.
P(X ≤ 2) with λ=4.
P(X ≤ 2) = P(X=0) + P(X=1) + P(X=2)
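Both cases can be evaluated with the standard Poisson formula. This is a minimal sketch; the function and variable names are illustrative and not part of the slides.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(X=x) = e^(-lam) * lam^x / x!"""
    return exp(-lam) * lam ** x / factorial(x)

# Secretary: no mistakes on the next page, with lam = 2 mistakes per page.
no_mistake = poisson_pmf(0, 2)

# Field mice: at most 2 on the next acre, with lam = 4 mice per acre.
at_most_2_mice = sum(poisson_pmf(x, 4) for x in range(3))

print(round(no_mistake, 4), round(at_most_2_mice, 4))
```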
The Poisson Distribution
(Obtaining Poisson Data)
• Poisson data can be obtained by observing the arrivals or departures of humans, vehicles, objects, etc.
• For example, one can count the vehicles entering a particular parking area per minute, or the number of persons entering a mall per minute, for 15 or 20 minutes.
• Those 15 to 20 values will be the Poisson data, and we can further infer about the arrivals using the Poisson PDF.
The Poisson Distribution
(Mean and Variance)
• The mean of a Poisson distribution is:
E(X) = λ
• The variance of a Poisson distribution is:
V(X) = λ
So technically, for a Poisson random variable,
Mean = Variance = λ
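The equality of mean and variance can be checked numerically from the general definitions. This is a minimal sketch, assuming the standard Poisson formula with λ=2 and truncating the infinite sum at 60 terms, beyond which the probabilities are negligible.

```python
from math import exp, factorial

lam = 2
terms = 60  # truncation point for the infinite sum (an approximation)

pmf = [exp(-lam) * lam ** x / factorial(x) for x in range(terms)]

# E(X) = sum of x*P(X=x), V(X) = E(X^2) - [E(X)]^2.
mean = sum(x * px for x, px in enumerate(pmf))
mean_sq = sum(x ** 2 * px for x, px in enumerate(pmf))
variance = mean_sq - mean ** 2

print(mean, variance)
```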
