“O Allah, send your salutations upon Muhammad (PBUH) & on the Family
of Muhammad (PBUH) as you sent your salutations upon Ibrahim & on the
Family of Ibrahim verily you are Most Praiseworthy & Glorious…”
• Qualitative Variable: A variable whose values differ in kind rather than in
amount. It provides labels, or names, for categories of like items, i.e. a set of
observations where any single observation is a word or code that
represents a class or category.
• Examples: Gender, Complexion, Weather, Type.
Source: http://www.microbiologybytes.com/maths/1011-17.html
Inactivity breaker …
Object: Allocate a blank page from your writing material and divide that page into
two columns in the following manner:
20. 20.
Try to write at least 20 variables in each column by observing several fields like
management, agriculture, medical, engineering, geology etc. Submit the
sheet with your full name written at the top.
Data Sources
There are three major sources of data:
1. Survey/Census:
An official, usually periodic enumeration of a population,
often including the collection of related demographic
information, is called a census. To survey means to inspect and
determine the conditions of interest.
www.surveymonkey.com
2. Experiment:
Any activity that is usually conducted in an isolated
(controlled) environment and produces results is called an
experiment.
3. Simulation:
An artificial way of collecting data.
Question of the Day….
What do you think about Quality of
the following in IBA??
1- Teaching 1,2,3,4,5
2- Administration 1,2,3,4,5
3- Structure 1,2,3,4,5
Where 1-Very Poor 5-Excellent
Data Collection/compilation
• Teaching Ranks where 1-Very Poor, 5-Excellent
4.5 3.7 4.3 3.3 2.7 4.7
3.8 4.5 3.4 4.0 3.8 2.7
4.3 3.4 3.2 3.7 3.9 3.8
3.8 3.7 3.6 5.0 4.2 4.1
4.2 4.1 3.9 4.5 5.0 3.7
4.8 3.2 4.2 4.5 4.2 5.0
2.9
• Data collection/compilation is needed to capture the
actual behavior of the variable.
Note: The above data is a simulated version of the actual data.
Data Tabulation (Grouping Exercise)
Step # 01: Finding the range
Class Intervals    Frequency
Class Interval: One of the intervals into which the range of a variable
of a distribution is divided, esp. one of the divisions
of the base line of a bar chart or histogram.
After forming the class intervals and counting their frequencies with
tally marks, we can observe the actual behavior of the data.
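The grouping exercise can be sketched in Python as follows. This is a minimal illustration, not the method used on the slides: it assumes six equal-width classes starting at the minimum, and the function name `frequency_table` is made up for this example. The class boundaries used for the slides are not stated, so the counts produced here may differ slightly from the tabulated ones.

```python
# Teaching ranks copied from the data-collection slide (37 observations).
ranks = [
    4.5, 3.7, 4.3, 3.3, 2.7, 4.7,
    3.8, 4.5, 3.4, 4.0, 3.8, 2.7,
    4.3, 3.4, 3.2, 3.7, 3.9, 3.8,
    3.8, 3.7, 3.6, 5.0, 4.2, 4.1,
    4.2, 4.1, 3.9, 4.5, 5.0, 3.7,
    4.8, 3.2, 4.2, 4.5, 4.2, 5.0,
    2.9,
]


def frequency_table(data, n_classes=6):
    """Group data into n_classes equal-width class intervals.

    Returns a list of (lower, upper, frequency) tuples.
    Equal-width classes starting at the minimum are an assumption.
    """
    low, high = min(data), max(data)
    width = (high - low) / n_classes  # Step 1: range divided by number of classes
    counts = [0] * n_classes
    for x in data:
        # Index of the class containing x; the maximum goes in the last class.
        k = min(int((x - low) / width), n_classes - 1)
        counts[k] += 1  # one tally mark
    return [
        (low + i * width, low + (i + 1) * width, counts[i])
        for i in range(n_classes)
    ]


table = frequency_table(ranks)
for lower, upper, freq in table:
    # A row of bars gives a quick text histogram.
    print(f"{lower:.2f} - {upper:.2f}  {freq:2d}  {'|' * freq}")
```

Changing `n_classes` shows how the apparent shape of the distribution depends on the choice of class intervals.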
Data → Process → Information
Ranks    Frequency
2.7      3
3.1      5
3.4      10
3.8      9
4.2      5
4.6      5
[Histogram: Frequency (y-axis, 0 to 12) against Ranks (x-axis, 2.7 to 4.6)]
The frequency distribution table and the histogram above
reveal the shape of the opinions held by the students. If we
then find a mathematical model that describes this shape, it is called a
probability distribution.
Data → Process → Information
Ranks    Frequency
2.7      7
3.1      11
3.4      9
3.8      6
4.2      3
4.6      1
[Histogram: Frequency (y-axis, 0 to 12) against Ranks (x-axis, 2.7 to 4.6)]
The frequency distribution table and the histogram above
reveal the shape of the opinions held by the students. If we
then find a mathematical model that describes this shape, it is called a
probability distribution.
Grouping the data (MSEXCEL)
Bin numbers: These numbers represent the intervals that you want the
Histogram tool to use for measuring the input data in the data analysis.
Statistical Measures (An introduction)
• The phrase “descriptive statistics” is used generically in place
of statistical measures.
• These statistics describe or summarize the qualities of
data.
• Another name is “summary statistics”, which we mostly use
to support our reports/cases/research.
• They are beneficial when a graphical summary alone is not
sufficient for the final conclusions.
Data → Processing By Graph / Processing By Measure → Conclusions
Statistical Measures (An Example)
Consider the following group data:
[Figure: number line marking Min, Q1, Q2, Q3 and Max]
Computing Quartiles
In order to compute the quartile values, we need to
consider the same frequency distribution with an added
column of Cumulative Frequency.
Box-Plot of Treatment A
Min = 2, Q1 = 2.125, Q2 = 2.75, Q3 = 4.5, Max = 5
Quartiles, Deciles and Percentiles
Quartiles: To divide the data into 4 equal parts. Quartiles are three values Q1, Q2 and Q3.
Deciles: To divide the data into 10 equal parts. Deciles are nine values D1, D2, D3 … D9.
Percentiles: To divide the data into 100 equal parts. Percentiles are ninety-nine values P1, P2, … P99.
Inter-quartile Range=Q3-Q1
[Figure: box plot of Teaching Ranks marking Min, Q1, Q2, Q3 and Max]
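As a rough illustration, the five-number summary and the inter-quartile range of the teaching ranks can be computed with Python's standard `statistics` module. Several conventions exist for locating quartiles, and the slides do not say which one they use; the `inclusive` method chosen here is an assumption, so hand-computed values from a cumulative-frequency table may differ slightly.

```python
import statistics

# Teaching ranks copied from the data-collection slide (37 observations).
ranks = [
    4.5, 3.7, 4.3, 3.3, 2.7, 4.7,
    3.8, 4.5, 3.4, 4.0, 3.8, 2.7,
    4.3, 3.4, 3.2, 3.7, 3.9, 3.8,
    3.8, 3.7, 3.6, 5.0, 4.2, 4.1,
    4.2, 4.1, 3.9, 4.5, 5.0, 3.7,
    4.8, 3.2, 4.2, 4.5, 4.2, 5.0,
    2.9,
]

# n=4 asks for the three cut points that split the data into four parts.
q1, q2, q3 = statistics.quantiles(ranks, n=4, method="inclusive")
iqr = q3 - q1  # Inter-quartile Range = Q3 - Q1

print("Min:", min(ranks))
print("Q1 :", q1)
print("Q2 :", q2, "(median)")
print("Q3 :", q3)
print("Max:", max(ranks))
print("IQR:", round(iqr, 2))
```

Using `n=10` or `n=100` in the same call returns the deciles or percentiles instead.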
Processing Data using Box-Plots
[Figure: Boxplots of Female Ages and Male Ages (means are indicated by solid circles). The more spread-out group is labelled Heterogeneous (more diverse) and the more compact group Homogeneous (less diverse).]
Exploratory Analysis for Quality ranks
from Aventis Field Managers
[Figure: Boxplots of Teaching, Administration & Structure (means are indicated by solid circles)]
Statistical Measures (Central Tendency)
(Mean, Median and Mode)
• Mean: The main problem associated with the mean
value of some data is that it is sensitive to
outliers.
• Median: The median is simply the middle value
among the ordered scores of a variable. It is the 2nd
Quartile (Q2) of any data.
• Mode: The mode is the most frequent response or value for a
variable. Multiple modes are possible:
bimodal or multimodal.
Mean, Median and Mode
Measurements are on x-axis and frequencies are on y-axis
• A.M. $= (1+2+3+4+5)/5 = 15/5 = 3.0$
• G.M. $= (1 \times 2 \times 3 \times 4 \times 5)^{1/5} = (120)^{1/5} = 2.6052$
• H.M. $= 5 / (1/1 + 1/2 + 1/3 + \dots + 1/5) = 5/2.28333 = 2.1898$
Theorems related to AM, GM & HM
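Empirically verify the following Theorems (for positive values):
Theorem No. 1:
$AM \ge GM \ge HM$ (equality holds only when all values are equal)
$3.0 > 2.6052 > 2.1898$
Theorem No. 2:
$AM \times HM = GM^2$ (exact for two values; in general only approximate for more)
$3.0 \times 2.1898$ vs. $2.6052^2$
$6.569$ vs. $6.7870$, diff. $= 0.22$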
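A short Python check of both theorems, offered as a sketch rather than a proof. It uses the standard `statistics` module on the data 1, 2, 3, 4, 5 from the slide; the pair (2, 8) is an extra, made-up example added to show the two-value case of Theorem No. 2.

```python
import statistics

x = [1, 2, 3, 4, 5]

am = statistics.mean(x)            # arithmetic mean
gm = statistics.geometric_mean(x)  # (1*2*3*4*5) ** (1/5)
hm = statistics.harmonic_mean(x)   # 5 / (1/1 + 1/2 + ... + 1/5)

print(f"AM = {am:.4f}, GM = {gm:.4f}, HM = {hm:.4f}")

# Theorem No. 1: AM >= GM >= HM for positive data.
print("AM >= GM >= HM:", am >= gm >= hm)

# Theorem No. 2: AM x HM compared with GM squared (five values: close, not equal).
print(f"AM x HM = {am * hm:.4f}, GM^2 = {gm ** 2:.4f}")

# With exactly two values (made-up pair) the relation is exact up to rounding.
pair = [2, 8]
am2 = statistics.mean(pair)
gm2 = statistics.geometric_mean(pair)
hm2 = statistics.harmonic_mean(pair)
print(f"pair: AM x HM = {am2 * hm2:.4f}, GM^2 = {gm2 ** 2:.4f}")
```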
Arithmetic Mean, Geometric Mean
and Harmonic Mean for Group Data
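• For any Group data, the Arithmetic Mean is:
$\bar{X} = \dfrac{\sum f_i x_i}{\sum f_i}$
where $x_i$ are the Mid-Points and $f_i$ are the
class frequencies.
• For any Group data, the Geometric Mean is:
$GM = \exp\left( \dfrac{\sum f_i \ln x_i}{\sum f_i} \right)$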
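A small sketch of the grouped-data means, using the standard textbook forms: the arithmetic mean as the sum of f_i x_i divided by the sum of f_i, and the geometric mean as the exponential of the frequency-weighted mean of ln x_i. The mid-points and frequencies below are hypothetical values chosen only for illustration; they do not come from the slides.

```python
import math

# Hypothetical grouped data: class mid-points x_i and class frequencies f_i.
midpoints = [5, 15, 25, 35]
freqs = [2, 5, 10, 3]

n = sum(freqs)  # total frequency

# Arithmetic mean: sum(f_i * x_i) / sum(f_i)
grouped_am = sum(f * x for f, x in zip(freqs, midpoints)) / n

# Geometric mean: exp( sum(f_i * ln x_i) / sum(f_i) ); needs positive mid-points.
grouped_gm = math.exp(sum(f * math.log(x) for f, x in zip(freqs, midpoints)) / n)

print(f"Grouped A.M. = {grouped_am:.2f}")
print(f"Grouped G.M. = {grouped_gm:.2f}")
```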
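Variance: $V(X) = \dfrac{\sum (x - \bar{X})^2}{n}$
Variance of the following
ungrouped data:
X: 1, 2, 3, 4, 5
Mean = 3
$V(X) = \dfrac{(1-3)^2 + (2-3)^2 + (3-3)^2 + (4-3)^2 + (5-3)^2}{5} = \dfrac{10}{5} = 2$
Standard Deviation $= \sqrt{V(X)} = \sqrt{2}$
= 1.414 ???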
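The same calculation can be checked in Python. This sketch uses the population forms (`pvariance`, `pstdev`), which divide by n as in the example above; the sample forms that divide by n - 1 are shown only for contrast and are not part of the slide.

```python
import statistics

x = [1, 2, 3, 4, 5]
mean = statistics.mean(x)

# Population variance: average squared deviation from the mean (divide by n).
variance = sum((v - mean) ** 2 for v in x) / len(x)
sd = variance ** 0.5

print("Mean     =", mean)
print("Variance =", variance)
print("Std Dev  =", round(sd, 3))

# The library functions give the same population values.
lib_variance = statistics.pvariance(x)
lib_sd = statistics.pstdev(x)

# For contrast only: the sample versions divide by n - 1 instead of n.
sample_variance = statistics.variance(x)
print("Sample variance (n - 1) =", sample_variance)
```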
Coefficient of Variation (Consistency Check)
Variance: $V(X) = \dfrac{\sum f_i (x_i - \bar{X})^2}{\sum f_i} = \dfrac{111.34}{25} = 4.45$
Variable Comparison (Property of C.V.)
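$C.V. = \dfrac{\sqrt{V(X)}}{\bar{X}} \times 100 = \dfrac{2.111}{7.16} \times 100 = 29.48\%$
• So technically, the Income data (C.V. = 29.48%) is more consistent
than the first five natural numbers (C.V. = 1.414/3 × 100 ≈ 47.1%),
because a smaller C.V. means less variation relative to the mean.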
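A minimal sketch of this comparison in Python. The mean 7.16 and standard deviation 2.111 of the Income data are taken from the slide; the helper name `coefficient_of_variation` is made up for this example.

```python
import statistics


def coefficient_of_variation(mean, sd):
    """C.V. = (standard deviation / mean) x 100, expressed in percent."""
    return sd / mean * 100


# Income data: summary values quoted on the slide.
cv_income = coefficient_of_variation(7.16, 2.111)

# First five natural numbers: computed from the raw data (population SD).
x = [1, 2, 3, 4, 5]
cv_natural = coefficient_of_variation(statistics.mean(x), statistics.pstdev(x))

print(f"C.V. of Income data            = {cv_income:.2f}%")
print(f"C.V. of first five naturals    = {cv_natural:.2f}%")
print("Income data is more consistent:", cv_income < cv_natural)
```

Because the C.V. has no units, it allows variables measured on different scales to be compared.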
Hand-Profile Analysis
(An exploratory approach)
[Figure: outline of a hand with X1 (Thumb), X2, X3, X4, X5, Span (X6) and Length (X7) marked]
S.No.    Measurements (X) in cms
1        X1
2        X2
3        X3
4        X4
5        X5
6        X6
7        X7
Determine the Mean, Standard deviation and Coefficient of Variation.
Computing Mean and Standard Deviation
Using Scientific Calculators
New Models (ES Series):
Press MODE
Select STAT
Select 1-Var
Enter the data in the data column that appears…
For finding Mean and Standard Deviation:
Press Shift and then press 1
Select VAR
Select x̄ for mean
Select xσn for Standard Deviation

Prev. Models (MS Series):
Press MODE
Select SD
Entering the data:
Obs1 M+
Obs2 M+
Obs3 M+
Do it for all remaining data observations.
For finding Mean and Stand. Dev.:
Press Shift and press 2
Select x̄ for mean
Select xσn for Standard Deviation
Approximate Confidence Interval
For any Bell-shaped symmetrical distribution;
the following will be proved:
[Figure: control chart of Value against observation number (1 to 7), with the Lower control Limit drawn 3 standard deviations below the mean]
By observing any realization, we can monitor a process and
be alerted on two conditions:
1- Any observation crosses or even touches a pre-alarm
control limit, or
2- The motion of the realization becomes rhythmic.
Statistical Process Control (An activity)
• Consider the following Manufacturing Process:
X = 2 x Ran#
• Simulate 7 Observations using this simulator.
S.No.    X = 2 x Ran#
1        2 x Ran#
2        2 x Ran#
3        2 x Ran#
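The activity can be imitated in Python. This is a sketch under stated assumptions: `random.random()` stands in for the calculator's Ran# key (a pseudo-random number between 0 and 1), the seed is fixed only to make the run repeatable, and the control limits are taken as the mean plus or minus 3 standard deviations of the simulated observations.

```python
import random
import statistics

rng = random.Random(2024)  # fixed seed (assumption) so the run is repeatable

# Simulate 7 observations of the process X = 2 x Ran#.
observations = [2 * rng.random() for _ in range(7)]

centre = statistics.mean(observations)
sd = statistics.pstdev(observations)
lcl = centre - 3 * sd  # lower control limit
ucl = centre + 3 * sd  # upper control limit

for i, x in enumerate(observations, start=1):
    # Alert condition 1: an observation touches or crosses a control limit.
    flag = "ALERT" if (x <= lcl or x >= ucl) else ""
    print(f"{i}  X = {x:.3f}  {flag}")

print(f"LCL = {lcl:.3f}, centre = {centre:.3f}, UCL = {ucl:.3f}")
```

Plotting the seven values against their serial numbers gives the realization that is inspected for limit crossings and rhythmic patterns.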
Concept map: Sample Space, Outcomes, Criteria, Numeric, Probability
Independent: P(A∩B) = P(A) P(B)
Dependent: Conditional Probability
What is the Distribution?
• Gives us a picture of the variability and central tendency.
A={HH, HT}
B={Exactly one Tail in the outcome}: B={HT,TH}
Thus we formed two Non-Mutually Exclusive Events
Computing Probability
Probability of an Event
• P(A) stands for the probability of an Event A such that:
P(A) = n(A)/n(S)
Where,
• n(A) is the number of outcomes present in Event A.
• n(S) is the number of outcomes present in the
Sample Space.
• Probability is the proportion of the Sample Space taken up by the Event.
• For any Event A: 0 ≤ P(A) ≤ 1, where A ⊆ S
Computing Probabilities (Example)
• Random Experiment # 2: Tossing a fair coin twice or
tossing two fair coins, once.
Sample Space S={HH,HT,TH,TT},
Event(s)
A={First toss should be a Head}, A={HH, HT}
B={Exactly one Tail in the outcome}: B={HT, TH}
Therefore the probabilities will be:
P(A) = 2/4 = 0.5, i.e. a 50% chance
P(B) = 2/4 = 0.5, i.e. a 50% chance
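The counting rule P(A) = n(A)/n(S) can be sketched in Python by listing the sample space explicitly. The helper name `prob` is made up for this example, and `Fraction` is used so the results stay exact.

```python
from fractions import Fraction
from itertools import product

# Sample space for tossing a fair coin twice: HH, HT, TH, TT.
sample_space = {"".join(outcome) for outcome in product("HT", repeat=2)}

# Event A: the first toss is a Head.
A = {s for s in sample_space if s[0] == "H"}
# Event B: exactly one Tail in the outcome.
B = {s for s in sample_space if s.count("T") == 1}


def prob(event):
    """P(event) = n(event) / n(S) for equally likely outcomes."""
    return Fraction(len(event), len(sample_space))


print("S =", sorted(sample_space))
print("A =", sorted(A), " P(A) =", prob(A))
print("B =", sorted(B), " P(B) =", prob(B))
```

Changing `repeat=2` to `repeat=3` builds the sample space for three tosses.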
Interpreting Probability
A probability is attached to every Event and should be interpreted
in 3 components:
1) Object of the Random Experiment
2) Value of the Probability
3) Event Statement
E.g., the interpretation of P(A)=0.5 can be written as:
If we toss a fair coin twice, we have a 50% chance
of getting a head in the first toss.
Similarly, P(B)=0.5 would be:
If we toss a fair coin twice, we have a 50% chance
of getting exactly one tail in the two tosses.
Union, Intersection and Complement
For the same Random Experiment # 2, the following
operations show the results and the relevant interpretations
needed (where ∪ = OR, ∩ = AND, A’ = not A):
Since S={HH,HT,TH,TT} A={HH,HT} B={HT,TH}
Therefore,
A∪B={HH,HT,TH} P(A∪B)=3/4=0.75 75%
If we toss a fair coin twice, we have a 75% chance of getting a
head in the first toss OR exactly one Tail in the two tosses.
A∩B={HT} P(A∩B)=1/4=0.25 25%
A’=S-A={TH,TT} P(A’)=2/4=0.50 50%
P(A’)=1-P(A)
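Python's built-in set operators map directly onto these operations (`|` for union, `&` for intersection, `-` for the complement relative to S). The sketch below repeats Random Experiment # 2; the helper name `prob` is made up for this example.

```python
from fractions import Fraction

S = {"HH", "HT", "TH", "TT"}  # sample space
A = {"HH", "HT"}              # Head in the first toss
B = {"HT", "TH"}              # exactly one Tail


def prob(event):
    """P(event) = n(event) / n(S) for equally likely outcomes."""
    return Fraction(len(event), len(S))


union = A | B          # A OR B
intersection = A & B   # A AND B
complement = S - A     # not A

print("A union B        =", sorted(union), prob(union))
print("A intersection B =", sorted(intersection), prob(intersection))
print("A complement     =", sorted(complement), prob(complement))

# Complement rule: P(A') = 1 - P(A)
print("P(A') == 1 - P(A):", prob(complement) == 1 - prob(A))
```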
Practice Questions
Q1) If we toss a fair coin three times, determine the
following probabilities:
a) P(A)=Probability of getting exactly one Head in all tosses?
b) P(B)=Probability of getting a Tail in the first toss?
c) P(C)=Probability of getting exactly one head AND one
tail? P(One head ∩ One Tail)
d) P(D)=Probability of NOT getting exactly one head in all
tosses? P(A’)
e) P(F)=Probability of either getting exactly one head in all
tosses OR a tail in the first toss? P(A∪B)
Practice Questions (Contd..)
Q2) If we roll a fair die twice, determine the following
Probabilities: (Ref. Random Experiment #4)
a) P(A)=Probability of getting the same number on both rolls?
b) P(B)=Probability of getting an odd number on both rolls?
c) P(C)=Probability of getting a sum of both numbers equal
to 5?
d) P(D)=Probability of getting an odd number AND an even
number on the two rolls respectively.
e) P(F)=Probability of NOT getting the same number on
both rolls?
Practice Questions (Contd..)
Q3) If we toss a fair COIN and roll a fair DIE once, determine
the following Probabilities: (Ref. Random Experiment #6)
                   Male (M)    Female (F)    Row Total
Glasses (G)        05          19            G = 24
No Glasses (NG)    09          12            NG = 21
Column Total       M = 14      F = 31        S = 45
            Male (M)    Female (F)    Total
Like (L)    a           b             a+b
Dislike     c           d             c+d
Total       a+c         b+d           N=50
P(M/L) = a/(a+b)
P(M) = (a+c)/N
Form a bivariate frequency table and test whether Gender is
independent of your views or not, i.e. whether:
P(M/L) = P(M) or P(M/L) ≠ P(M)
An independent result (P(M/L) = P(M)) shows no gender discrimination in
MIND; a dependent result (P(M/L) ≠ P(M)) shows the opposite.
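The independence check can be sketched in Python as below. The cell counts are hypothetical (chosen so that N = 50) and the function name `gender_independence` is made up for this example; rows are Like/Dislike and columns are Male/Female.

```python
from fractions import Fraction


def gender_independence(a, b, c, d):
    """Compare P(M/L) with P(M) for a 2x2 table.

    a = Male & Like, b = Female & Like,
    c = Male & Dislike, d = Female & Dislike.
    Returns (is_independent, P(M/L), P(M)).
    """
    n = a + b + c + d
    p_m_given_l = Fraction(a, a + b)  # P(M/L) = a / (a + b)
    p_m = Fraction(a + c, n)          # P(M)   = (a + c) / N
    return p_m_given_l == p_m, p_m_given_l, p_m


# Hypothetical counts, N = 50: Like row (15, 10), Dislike row (15, 10).
independent_case = gender_independence(15, 10, 15, 10)
print("Case 1:", independent_case)

# Hypothetical counts, N = 50: Like row (20, 5), Dislike row (10, 15).
dependent_case = gender_independence(20, 5, 10, 15)
print("Case 2:", dependent_case)
```

With real survey counts the two probabilities rarely match exactly, so in practice they are judged on how close they are rather than on strict equality.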
Introductory Statistics, 9th Ed.
Selected Probability Solved Example and Exercises: