
1. Write a detailed note on Normal Distribution, Standard Normal Variate and the Central Limit Theorem. Illustrate the concepts with the help of
diagrams.
2. From the following data compute quartile deviation and the coefficient of skewness:
Size        4.5-7.5   7.5-10.5   10.5-13.5   13.5-16.5   16.5-19.5
Frequency   14        24         38          20          4
3. A bank has a test designed to establish the credit rating of a loan application. Of the persons who default (D), 90% fail the test (F). Of the persons
who will repay the bank (ND), 5% fail the test. Furthermore, it is given that 4% of the population is not worthy of credit (i.e. defaulters). Given that
someone failed the test, what is the probability that he will actually default (when given the loan)?
4. Two laboratories A and B carry out independent estimates of fat content in ice cream made by a firm. A sample taken from each batch gives the
following fat content:
Batch No.   1   2   3   4   5   6   7   8   9   10
Lab A       7   8   7   3   8   6   9   4   7   8
Lab B       9   8   8   4   7   7   9   6   6   6
Is there a significant difference between the mean fat-content obtained by the two laboratories A and B?
5. Given the bivariate data:
X 1 5 3 2 1 1 7 3
Y 6 1 0 0 1 2 1 5

a. Fit a regression line of Y on X and hence predict Y if X = 10
b. Fit a regression line of X on Y and hence predict X if Y = 2.5
Solutions:
Ans. 1
Normal Distribution:
In probability theory and statistics, the normal distribution or Gaussian distribution is a continuous probability distribution that describes data that
clusters around a mean or average. The graph of the associated probability density function is bell-shaped, with a peak at the mean, and is
known as the Gaussian function or bell curve.
The normal distribution can be used to describe, at least approximately, any variable that tends to cluster around the mean. For example, the
heights of adult males in the United States are roughly normally distributed, with a mean of about 70 inches. Most men have a height close to
the mean, though a small number of outliers have a height significantly above or below the mean. A histogram of male heights will appear
similar to a bell curve, with the correspondence becoming closer if more data is used.

Central Limit Theorem (CLT)


The central limit theorem (CLT) states conditions under which the sum of a sufficiently large number of independent random variables, each
with finite mean and variance, will be approximately normally distributed (Rice 1995). The central limit theorem also requires the random
variables to be identically distributed, unless certain conditions are met. Since real-world quantities are often the balanced sum of many
unobserved random events, this theorem provides a partial explanation for the prevalence of the normal probability distribution. The CLT also
justifies the approximation of large-sample statistics to the normal distribution in controlled experiments.
In more general probability theory, a central limit theorem is any of a set of weak-convergence theorems. They all express the fact that a sum
of many independent random variables will tend to be distributed according to one of a small set of "attractor" (i.e. stable) distributions. Other
generalizations for finite variance do not require identical distribution (for example, those based on the Lindeberg or Lyapunov conditions).
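The statement of the CLT can be illustrated with a small simulation. The sketch below is only an illustration under assumed settings: the Uniform(0, 1) source distribution, the 30 terms per mean, the 10,000 repetitions and the seed are all arbitrary choices, not part of the original note.

```python
import random
import statistics

def sample_means(n_terms, n_samples, rng):
    """Return n_samples means, each taken over n_terms Uniform(0, 1) draws."""
    return [
        sum(rng.random() for _ in range(n_terms)) / n_terms
        for _ in range(n_samples)
    ]

rng = random.Random(0)  # fixed seed (arbitrary) so the run is repeatable
means = sample_means(n_terms=30, n_samples=10_000, rng=rng)

# Uniform(0, 1) has mean 1/2 and variance 1/12, so the CLT predicts that
# the mean of 30 draws is roughly N(1/2, 1/(12 * 30)).
predicted_sd = (1 / (12 * 30)) ** 0.5
observed_mean = statistics.fmean(means)
observed_sd = statistics.pstdev(means)

# Share of sample means within one predicted s.d. of 1/2;
# a normal distribution would give about 0.68.
within_one_sd = sum(abs(m - 0.5) <= predicted_sd for m in means) / len(means)
print(observed_mean, observed_sd, predicted_sd, within_one_sd)
```

Although each individual draw is uniform rather than bell-shaped, the sample means should cluster around 0.5 with a spread close to the predicted value.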
Standard Normal Variate
It is possible to relate all normal random variables to the standard normal.
If X ~ N(μ, σ²), then
$$Z = \frac{X - \mu}{\sigma}$$
is a standard normal random variable: Z ~ N(0,1). An important consequence is that the cdf of a general normal distribution is therefore
$$F(x) = \Phi\left(\frac{x - \mu}{\sigma}\right)$$
where Φ is the cdf of the standard normal distribution. Conversely, if Z is a standard normal variable, Z ~ N(0,1), then
$$X = \sigma Z + \mu$$
is a normal random variable with mean μ and variance σ².
The standard normal distribution has been tabulated (usually in the form of value of the cumulative distribution function Φ), and the other
normal distributions are the simple transformations, as described above, of the standard one. Therefore, one can use tabulated values of the
cdf of the standard normal distribution to find values of the cdf of a general normal distribution.
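As a rough illustration of using the standard normal cdf for a general normal variable, the sketch below standardizes a value and evaluates Φ through Python's `math.erf` instead of a printed table. The mean of 70 inches comes from the height example above; the standard deviation of 3 inches and the function names are assumptions made only for this example.

```python
import math

def std_normal_cdf(z):
    """Phi(z), the cdf of N(0, 1), written via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def normal_cdf(x, mu, sigma):
    """cdf of N(mu, sigma^2), obtained by standardizing x first."""
    z = (x - mu) / sigma
    return std_normal_cdf(z)

# mu = 70 in. is from the text; sigma = 3 in. is an assumed value.
mu, sigma = 70.0, 3.0
p_below_73 = normal_cdf(73.0, mu, sigma)  # equals Phi(1)
p_within_1sd = normal_cdf(73.0, mu, sigma) - normal_cdf(67.0, mu, sigma)
print(round(p_below_73, 4), round(p_within_1sd, 4))
```

With these assumed parameters, 73 inches is exactly one standard deviation above the mean, so the first value is Φ(1) and the second is the familiar share of a normal population lying within one standard deviation of the mean.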
Ans. 2

Size          Frequency (f)   C.F.   Mid-value (x)   fx
4.5 – 7.5     14              14     6               84
7.5 – 10.5    24              38     9               216
10.5 – 13.5   38              76     12              456
13.5 – 16.5   20              96     15              300
16.5 – 19.5   4               100    18              72
Total         100                    60              1128

N = 100
$$\text{Quartile deviation} = \frac{Q_3 - Q_1}{2}$$
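For Q3
$$\frac{3N}{4} = \frac{3}{4} \times 100 = 75$$
The 75th item lies in the class whose cumulative frequency is 76, i.e. the class 10.5 – 13.5. Here l = 10.5 is the lower limit of that class,
cf = 38 is the cumulative frequency of the preceding class, f = 38 is the class frequency and i = 3 is the class width.
$$Q_3 = l + \frac{\frac{3N}{4} - cf}{f} \times i$$
$$Q_3 = 10.5 + \frac{75 - 38}{38} \times 3 = 10.5 + 2.92 = 13.42 \text{ (approx)}$$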

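For Q1
$$\frac{N}{4} = \frac{100}{4} = 25$$
The 25th item lies in the class whose cumulative frequency is 38, i.e. the class 7.5 – 10.5. Here l = 7.5, cf = 14, f = 24 and i = 3.
$$Q_1 = l + \frac{\frac{N}{4} - cf}{f} \times i$$
$$Q_1 = 7.5 + \frac{25 - 14}{24} \times 3 = 7.5 + 1.375 = 8.875$$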
$$\text{Quartile deviation} = \frac{Q_3 - Q_1}{2} = \frac{13.42 - 8.875}{2} = 2.2725$$
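The interpolation formula l + (position − cf)/f × i used for the quartiles can be checked with a short script. This is a minimal sketch; the helper name `grouped_quantile` and its argument layout are illustrative choices, not a standard library function.

```python
def grouped_quantile(lower_bounds, freqs, width, position):
    """Interpolated value at cumulative `position` in a grouped frequency table.

    Uses l + (position - cf) / f * i, where cf is the cumulative frequency
    of the classes before the one that contains `position`.
    """
    cf = 0
    for lower, f in zip(lower_bounds, freqs):
        if cf + f >= position:
            return lower + (position - cf) / f * width
        cf += f
    raise ValueError("position exceeds total frequency")

lower_bounds = [4.5, 7.5, 10.5, 13.5, 16.5]
freqs = [14, 24, 38, 20, 4]
n = sum(freqs)

q1 = grouped_quantile(lower_bounds, freqs, 3, n / 4)      # position 25
q3 = grouped_quantile(lower_bounds, freqs, 3, 3 * n / 4)  # position 75
quartile_deviation = (q3 - q1) / 2
print(round(q1, 3), round(q3, 3), round(quartile_deviation, 3))
```

Because the script keeps full precision for Q3, its quartile deviation can differ from the hand calculation in the third or fourth decimal place.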

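Coefficient of skewness
Karl Pearson's coefficient of skewness is given by
$$Sk = \frac{3(\text{mean} - \text{median})}{\text{s.d.}}$$
where
$$\text{mean} = \bar{x} = \frac{\sum fx}{\sum f}, \qquad \text{s.d.} = \sqrt{\frac{\sum f(x - \bar{x})^2}{\sum f}}$$

Mean
$$\bar{x} = \frac{1128}{100} = 11.28$$

Median
N/2 = 50, and the 50th item lies in the class 10.5 – 13.5 (l = 10.5, cf = 38, f = 38, i = 3).
$$\text{Median} = l + \frac{\frac{N}{2} - cf}{f} \times i = 10.5 + \frac{50 - 38}{38} \times 3 = 10.5 + 0.95 = 11.45$$

Standard deviation
x       f     x − x̄    (x − x̄)²   f(x − x̄)²
6       14    -5.28    27.8784    390.2976
9       24    -2.28    5.1984     124.7616
12      38    0.72     0.5184     19.6992
15      20    3.72     13.8384    276.7680
18      4     6.72     45.1584    180.6336
Total   100                       992.1600

$$\text{s.d.} = \sqrt{\frac{992.16}{100}} = \sqrt{9.9216} = 3.15$$

Coefficient of skewness
$$Sk = \frac{3(11.28 - 11.45)}{3.15} = \frac{-3 \times 0.17}{3.15} = -0.16$$
The distribution is therefore slightly negatively skewed.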
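The mean, median, standard deviation and skewness coefficient 3(mean − median)/s.d. can be recomputed directly from the class mid-values and frequencies. The sketch below is one possible check, with the median class (10.5 to 13.5, preceding cumulative frequency 38) hard-coded from the table; it keeps full precision, so its figures may differ slightly from hand-rounded working.

```python
import math

mids = [6, 9, 12, 15, 18]       # class mid-values
freqs = [14, 24, 38, 20, 4]     # class frequencies
n = sum(freqs)

mean = sum(f * x for x, f in zip(mids, freqs)) / n
variance = sum(f * (x - mean) ** 2 for x, f in zip(mids, freqs)) / n
sd = math.sqrt(variance)

# Median by interpolation inside the median class 10.5-13.5
# (lower limit 10.5, preceding cumulative frequency 38, frequency 38, width 3).
median = 10.5 + (n / 2 - 38) / 38 * 3

skewness = 3 * (mean - median) / sd
print(round(mean, 2), round(median, 3), round(sd, 3), round(skewness, 3))
```

At full precision the coefficient comes out at roughly −0.16, a mild negative skew.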

Ans. 3

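From the question:
$$P(F \mid D) = \frac{90}{100} = 0.90, \qquad P(F \mid ND) = \frac{5}{100} = 0.05, \qquad P(D) = \frac{4}{100} = 0.04$$
so that P(ND) = 1 − 0.04 = 0.96.

The required probability is P(D | F), the probability of default given that the test was failed. By Bayes' theorem,
$$P(D \mid F) = \frac{P(F \text{ and } D)}{P(F)} = \frac{P(F \mid D)\,P(D)}{P(F \mid D)\,P(D) + P(F \mid ND)\,P(ND)} \qquad (1)$$

Using Eq. (1):
$$P(D \mid F) = \frac{0.90 \times 0.04}{0.90 \times 0.04 + 0.05 \times 0.96} = \frac{0.036}{0.036 + 0.048} = \frac{0.036}{0.084} = 0.43 \text{ (approx)}$$
So a person who fails the test has only about a 43% chance of actually defaulting.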
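Bayes' theorem for this problem can be written as a small function, which makes it easy to see how the answer depends on the three given percentages. This is a minimal sketch; the function and parameter names are illustrative.

```python
def posterior_default(p_fail_given_default, p_fail_given_repay, p_default):
    """P(default | failed test) by Bayes' theorem."""
    p_repay = 1.0 - p_default
    # Total probability of failing the test, over defaulters and repayers.
    p_fail = p_fail_given_default * p_default + p_fail_given_repay * p_repay
    return p_fail_given_default * p_default / p_fail

# Figures from the question: 90% of defaulters fail, 5% of repayers fail,
# and 4% of the population are defaulters.
p = posterior_default(0.90, 0.05, 0.04)
print(round(p, 4))
```

With the figures from the question this evaluates to 0.036/0.084, about 0.43: because defaulters are rare, most people who fail the test are still non-defaulters.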

Ans. 4

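Lab A (x_A)   Lab B (x_B)   x_A − x̄_A   (x_A − x̄_A)²   x_B − x̄_B   (x_B − x̄_B)²
7             9             0.3         0.09           2           4
8             8             1.3         1.69           1           1
7             8             0.3         0.09           1           1
3             4             -3.7        13.69          -3          9
8             7             1.3         1.69           0           0
6             7             -0.7        0.49           0           0
9             9             2.3         5.29           2           4
4             6             -2.7        7.29           -1          1
7             6             0.3         0.09           -1          1
8             6             1.3         1.69           -1          1
Total                                   32.10                      22

$$\text{Coefficient of variation} = \frac{\sigma}{\bar{x}} \times 100$$
where
$$\bar{x} = \frac{\sum x}{N}, \qquad \sigma = \sqrt{\frac{\sum (x - \bar{x})^2}{N}}$$

For Lab A
$$\bar{x}_A = \frac{67}{10} = 6.7, \qquad \sigma_A = \sqrt{\frac{32.1}{10}} = \sqrt{3.21} = 1.792$$

For Lab B
$$\bar{x}_B = \frac{70}{10} = 7, \qquad \sigma_B = \sqrt{\frac{22}{10}} = \sqrt{2.2} = 1.483$$

Coefficient of variation of A
$$= \frac{1.792}{6.7} \times 100 = 26.7$$

Coefficient of variation of B
$$= \frac{1.483}{7} \times 100 = 21.2$$

Lab B's results are therefore slightly more consistent than Lab A's.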
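The question asks whether the mean fat content differs significantly between the two laboratories. One common way to check this, assuming the two readings on each batch are treated as a matched pair, is a paired t-test on the differences. The sketch below is an illustration under that assumption rather than the method used in the working above; the critical value 2.262 (two-sided, 5% level, 9 degrees of freedom) is taken from standard t-tables.

```python
import math

lab_a = [7, 8, 7, 3, 8, 6, 9, 4, 7, 8]
lab_b = [9, 8, 8, 4, 7, 7, 9, 6, 6, 6]

# Batch-by-batch differences (A minus B)
diffs = [a - b for a, b in zip(lab_a, lab_b)]
n = len(diffs)
mean_d = sum(diffs) / n

# Sample variance of the differences (divisor n - 1)
var_d = sum((d - mean_d) ** 2 for d in diffs) / (n - 1)
t_stat = mean_d / math.sqrt(var_d / n)

# Two-sided 5% critical value of Student's t with 9 degrees of freedom,
# from standard t-tables (an assumed significance level).
T_CRIT_9DF_5PCT = 2.262
significant = abs(t_stat) > T_CRIT_9DF_5PCT
print(round(mean_d, 2), round(t_stat, 3), significant)
```

Under these assumptions the statistic is about −0.71, well inside ±2.262, so the difference of 0.3 between the two means would not be judged significant at the 5% level.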

Ans. 5

        x     y
        1     6
        5     1
        3     0
        2     0
        1     1
        1     2
        7     1
        3     5
Total   23    16

$$\bar{x} = \frac{23}{8} = 2.875, \qquad \bar{y} = \frac{16}{8} = 2$$
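Line of y on x
$$y - \bar{y} = b_{yx}(x - \bar{x}), \qquad b_{yx} = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - \left(\sum x\right)^2}$$
Calculate y at x = 10.

Line of x on y
$$x - \bar{x} = b_{xy}(y - \bar{y}), \qquad b_{xy} = \frac{n\sum xy - \sum x \sum y}{n\sum y^2 - \left(\sum y\right)^2}$$
Calculate x at y = 2.5.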

        x     y     xy    x²    y²
        1     6     6     1     36
        5     1     5     25    1
        3     0     0     9     0
        2     0     0     4     0
        1     1     1     1     1
        1     2     2     1     4
        7     1     7     49    1
        3     5     15    9     25
Total   23    16    36    99    68

$$b_{yx} = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - \left(\sum x\right)^2} = \frac{8 \times 36 - 23 \times 16}{8 \times 99 - (23)^2} = \frac{288 - 368}{792 - 529} = \frac{-80}{263} = -0.304$$

$$b_{xy} = \frac{n\sum xy - \sum x \sum y}{n\sum y^2 - \left(\sum y\right)^2} = \frac{8 \times 36 - 23 \times 16}{8 \times 68 - (16)^2} = \frac{-80}{544 - 256} = \frac{-80}{288} = -0.278$$

Equation of y on x

y – 2 = -0.304(x – 2.875)

Calculate y if x = 10

y = -0.304(10 – 2.875) + 2
= -2.166 + 2
y = -0.17 (approx)

Equation of x on y

x – 2.875 = -0.278(y – 2)

at y = 2.5

x = -0.278(2.5 – 2) + 2.875
= -0.278(0.5) + 2.875
= -0.139 + 2.875
= 2.736
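Both regression lines can be recomputed from the raw data with the same sums-of-products formulas, which is a useful check on the column totals. The sketch below is a plain-Python illustration; the helper names `predict_y` and `predict_x` are hypothetical.

```python
xs = [1, 5, 3, 2, 1, 1, 7, 3]
ys = [6, 1, 0, 0, 1, 2, 1, 5]
n = len(xs)

sum_x, sum_y = sum(xs), sum(ys)
sum_xy = sum(x * y for x, y in zip(xs, ys))
sum_x2 = sum(x * x for x in xs)
sum_y2 = sum(y * y for y in ys)

mean_x, mean_y = sum_x / n, sum_y / n

# Both slopes share the numerator n*Sxy - Sx*Sy.
numerator = n * sum_xy - sum_x * sum_y
b_yx = numerator / (n * sum_x2 - sum_x ** 2)  # slope of y on x
b_xy = numerator / (n * sum_y2 - sum_y ** 2)  # slope of x on y

def predict_y(x):
    """Prediction from the line of y on x."""
    return mean_y + b_yx * (x - mean_x)

def predict_x(y):
    """Prediction from the line of x on y."""
    return mean_x + b_xy * (y - mean_y)

print(sum_xy, round(b_yx, 3), round(b_xy, 3))
print(round(predict_y(10), 3), round(predict_x(2.5), 3))
```

Note that x = 10 lies outside the observed range of x (1 to 7), so the prediction from the line of y on x is an extrapolation and should be read with caution.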
