
Dr. Suren Phansalker

ADM2303 Discrete Mathematical Probability Models or Distribution Functions

When Random Variables (RVs) are discrete, they have either Empirical Distributions/Models or Mathematically Defined Distributions/Models.

Empirical Distributions: These are distributions based on observations. We have already seen them in the last two lectures, when we studied Expectation, Variance, Standard Deviation, Covariance and Linear Combinations of RVs.

Mathematical Distributions: RVs are often governed by distributions that can be defined quite precisely by mathematical functions/relations. When specific conditions are met, different distributions are appropriate. We will study some of these discrete mathematical distributions for some time.

1. Bernoulli Trials or Distribution: (Enrichment)

When the RV X has only two outcomes, it is called a Dichotomous RV. In such a case, for convenience, the first outcome is given the numerical value of 1 and the second outcome is given the numerical value of 0, or vice versa. X can represent Success (1) or Failure (0); it can represent Boy (1) or Girl (0); it could be Correct (1) or Incorrect (0). Thus, if we assume, for example, the probability of Success to be p and the probability of Failure to be q, the Distribution Function will be:

X = x    P[X = x]
0        q
1        p

Obviously, p + q = 1, or q = 1 − p. Thus, if RV X ~ Bernoulli(p), i.e. if RV X is distributed (~) as a Bernoulli distribution with probability p of success, then:

E(X) = Σ x · P(X = x) = 0(q) + 1(p) = p

Var(X) = Σ x² · P(X = x) − μ² = 0²(q) + 1²(p) − p² = p − p² = p(1 − p), so

σ²(X) = p q, and

{STD(X) = σ(X)} = (p q)^(1/2) = √(pq)
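As a quick numerical check (an addition to these notes, not in the original), the short Python sketch below computes E(X), Var(X) and STD(X) straight from the two-row distribution table; the value p = 0.3 is an arbitrary illustrative choice.

```python
import math

p = 0.3                          # arbitrary illustrative choice of P[Success]
q = 1 - p                        # P[Failure]
pmf = {0: q, 1: p}               # the two-row Bernoulli distribution table

mean = sum(x * pr for x, pr in pmf.items())               # E(X) = p
var = sum(x**2 * pr for x, pr in pmf.items()) - mean**2   # Var(X) = p - p^2 = pq
std = math.sqrt(var)                                      # STD(X) = sqrt(pq)

print(mean, var, std)            # 0.3  ~0.21  ~0.4583
```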

2. Binomial Distribution:

If each trial is a Bernoulli trial with the same or constant p as the probability of success, and if there are n such trials, each trial independent of the others, then RV X ~ B(n, p): the RV X, denoting the number of successes, is distributed binomially with the characteristic parameters of n trials and p as the probability of success. It can be shown that:

P(X = x) = C(n, x) · p^x · q^(n − x)

where

C(n, x) = n! / [(n − x)! x!]

and

μ = n p,  σ² = n p q,  and  σ = √(npq)

Simple Proof of μ and σ in the Binomial Distribution: (Enrichment)

If X ~ B(n, p), then X = X₁ + X₂ + … + Xₙ, where X₁, X₂, …, Xₙ are all Identically Bernoulli-Distributed Independent RVs with the same p. Thus,

{E[X] = μ} = E[X₁] + E[X₂] + … + E[Xₙ] = p + p + … + p (taken n times) = n p

Also,

{Var[X] = σ²(X)} = Var[X₁] + Var[X₂] + … + Var[Xₙ] = p q + p q + … + p q (taken n times) = n p q

{σ = σ(X) = STD(X)} = √(npq)
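The proof can also be checked by simulation. The following Python sketch (added here for illustration; the seed and trial count are arbitrary) builds each binomial observation as a sum of n independent Bernoulli(p) indicators and compares the sample mean and variance with np and npq.

```python
import random

random.seed(1)                       # arbitrary seed, for reproducibility
n, p, trials = 10, 0.2, 100_000      # trial count is an arbitrary choice

# Each simulated X is the sum of n independent Bernoulli(p) indicators.
samples = [sum(1 for _ in range(n) if random.random() < p)
           for _ in range(trials)]

mean = sum(samples) / trials
var = sum((x - mean) ** 2 for x in samples) / trials

print(mean, n * p)                   # sample mean vs  np  = 2.0
print(var, n * p * (1 - p))          # sample var  vs  npq = 1.6
```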

Ex.#1: A Simple Example of Binomial Distribution:

A student is writing a multiple-choice quiz. There are 5 choices for each question. This student is not at all prepared for the quiz; he simply makes a choice based on pure and simple guessing. Each question is worth 10% of the mark and the passing mark in the quiz is 50%. RV X denotes the # of correct responses the student chooses. If he attempts a test with 10 multiple-choice questions, find:

a. The appropriate distribution function for X.
b. The actual distribution of this RV X.
c. The Expectation, Variance and Standard Deviation of X.
d. P[The student fails]
e. P[The student passes]
f. P[The student obtains 80% or more marks]
g. P[The student gets a perfect score]
h. P[X = 0] or P[X = 5]

Solution to the Example Problem:

a. Since the response to a multiple-choice question based on guessing alone is a dichotomous RV with a probability of 1/5 that is constant and does not change from one question to the next, X is a RV such that: X ~ B(n = 10, p = 0.2).

b. P[X = x] = C(10, x) (0.2)^x (1 − 0.2)^(10 − x) = C(10, x) (0.2)^x (0.8)^(10 − x)

where

C(10, x) = 10! / [(10 − x)! x!]

Thus:

X = x    P[X = x]
0        0.10737
1        0.26844
2        0.30199
3        0.20133
4        0.08808
5        0.02642
6        0.00551
7        7.8645 × 10⁻⁴
8        7.3728 × 10⁻⁵
9        4.096 × 10⁻⁶
10       1.024 × 10⁻⁷
Total:   Σ P[X = x] = 1.0000

This is a proper distribution. Here, for example:

P[X = 3] = C(10, 3) (0.2)³ (0.8)^(10 − 3) = [10!/((10 − 3)! 3!)] (0.2)³ (0.8)⁷ = [10(9)(8)/(3(2)(1))] (0.2)³ (0.8)⁷ = 120(0.008)(0.20972) ≈ 0.20133
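As an aside (not part of the original notes), the whole table can be reproduced with a few lines of Python using the standard-library math.comb:

```python
from math import comb

n, p = 10, 0.2

# P[X = x] = C(n, x) p^x (1 - p)^(n - x), for x = 0, 1, ..., n
table = {x: comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1)}

for x, prob in table.items():
    print(f"P[X = {x:2d}] = {prob:.5g}")

print("Total:", sum(table.values()))   # ~1.0 -- a proper distribution
```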

c. {E[X] = μ} = n p = 10(0.2) = 2.0 (not what you expect!)

{Var[X] = σ²(X)} = n p q = 10(0.2)(0.8) = 1.60

{STD[X] = σ(X) = σ} = √1.60 ≈ 1.26491

d. P[Student fails] = P[X < 5] = P[X=0] + P[X=1] + P[X=2] + P[X=3] + P[X=4] (from the table in part b)
≈ 0.10737 + 0.26844 + 0.30199 + 0.20133 + 0.08808 ≈ 0.96721 (96.721%!)

e. P[Student passes] = P[X ≥ 5] = P[X=5] + P[X=6] + P[X=7] + P[X=8] + P[X=9] + P[X=10]
= 1 − P[X < 5] = 1 − P[Student fails] = 1 − 0.96721 = 0.03279 (3.279%!)

f. P[Student obtains 80% or more] = P[X ≥ 8] = P[X = 8] + P[X = 9] + P[X = 10]
≈ 7.3728 × 10⁻⁵ + 4.096 × 10⁻⁶ + 1.024 × 10⁻⁷ ≈ 7.7926 × 10⁻⁵ (about 7.8 in 100,000!)

g. P[Student gets a perfect score] = P[X = 10] ≈ 1.024 × 10⁻⁷ (about 1 in 10 million!)

Please note that getting a perfect score is highly unlikely; the student would have to be extremely lucky, whatever "lucky" means. Even just passing has only a small probability of 3.279%. It is a common belief, erroneously based on a myth, that multiple-choice exams are easy!

Simple question: Interpret the meaning of the answer in part g above.
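Parts d through g can likewise be checked numerically. A minimal Python sketch (an illustrative addition) that accumulates the relevant tail probabilities from the same PMF:

```python
from math import comb

n, p = 10, 0.2
pmf = [comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1)]

p_fail = sum(pmf[0:5])       # d. P[X < 5] = P[X=0] + ... + P[X=4]
p_pass = 1 - p_fail          # e. P[X >= 5]
p_80_plus = sum(pmf[8:])     # f. P[X >= 8]
p_perfect = pmf[10]          # g. P[X = 10]

print(p_fail, p_pass, p_80_plus, p_perfect)
# ~0.96721  ~0.03279  ~7.79e-05  ~1.02e-07
```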

3. Poisson Distribution, Poisson[λ]:

If the RV X is distributed as a Poisson RV, then:

P[X = x] = e^(−λ) λ^x / x!

and

{E[X] = μ} = λ,  {Var[X] = σ²[X] = σ²} = λ,  and  {STD[X] = σ[X] = σ} = √λ

Although a Poisson RV X is a Discrete RV, assuming discrete values such as 0, 1, 2, etc., it can take values which are very large, or in the limit go right up to ∞. The Poisson Distribution is a Proper Distribution because:

Σ_{x=0}^{∞} P[X = x] = Σ_{x=0}^{∞} e^(−λ) λ^x / x! = e^(−λ) Σ_{x=0}^{∞} λ^x / x! = e^(−λ) · e^λ = 1
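A small Python sketch of the Poisson PMF (added here for illustration; λ = 2 is an arbitrary choice). Since the terms vanish quickly, a partial sum over x = 0..50 already shows the distribution is proper to machine precision:

```python
from math import exp, factorial

def poisson_pmf(x: int, lam: float) -> float:
    """P[X = x] = e^(-lambda) lambda^x / x!"""
    return exp(-lam) * lam**x / factorial(x)

lam = 2.0   # arbitrary illustrative rate
# The infinite sum equals 1; truncating at x = 50 loses a negligible tail.
print(sum(poisson_pmf(x, lam) for x in range(51)))   # ~1.0
```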

The Poisson Distribution Function was developed by S. D. Poisson, a French mathematician, long ago, for finding probabilities of somewhat rare events. It can be used to evaluate the probabilities of accidents, imperfections, flaws, mishaps, etc. It can even be used to find the probabilities of catching Fish/Poisson!

4. Solved Problems:

The arrival of customers at a teller's counter at a local bank can be modeled by a Poisson distribution with a mean of 2 customers per hour. The teller wants to take a break for the next ten minutes.

a. What is the probability that no one will arrive in the next ten minutes?
b. What is the probability that 2 or more people will arrive in the next ten minutes?
c. The teller has just served 2 customers who came in one after the other. Is this a better time for the teller to take the ten-minute break?

Solution: Let the RV X represent the # of customers arriving in the next 10 minutes as the event horizon. Then:

λ = 2 (customers)/60 (minutes) = (1/3) (customer)/10 (minutes), so λ = 1/3

a. P[X = 0] = e^(−λ) λ⁰/0! = e^(−1/3) (1/3)⁰/0! = e^(−1/3) ≈ 0.71653

b. P[X ≥ 2] = 1 − P[X < 2] = 1 − {P[X = 0] + P[X = 1]} = 1 − e^(−1/3) [(1/3)⁰/0! + (1/3)¹/1!] = 1 − 0.95538 = 0.04462
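For reference (an addition to the notes), parts a and b reduce to a few lines of Python:

```python
from math import exp

lam = 1 / 3                     # 2 per hour -> 1/3 per 10 minutes

p0 = exp(-lam)                  # a. P[X = 0] = e^(-1/3)
p1 = exp(-lam) * lam            # P[X = 1] = e^(-1/3) (1/3)^1 / 1!
p_two_or_more = 1 - (p0 + p1)   # b. P[X >= 2]

print(p0, p_two_or_more)        # ~0.71653  ~0.04462
```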

c. The Poisson Distribution is "memory-less": in other words, the probabilities don't change based on what has already happened! Even though two customers just came in and filled up the average quota for the hour, we can still find the probability that at least one more customer arrives:

P[X > 0] = 1 − P[X = 0] = 1 − 0.71653 = 0.28347

This means that in the next 10 minutes, at least one (meaning one or more) customer will show up with a probability of 0.28347 (28.347%). Hence, 'No'; it is neither a better nor a worse time than otherwise (if two customers had not just shown up).

Useful Aside: Find the probability that no more than 3 customers will show up in the next two hours. Here:

λ = 2 (customers)/60 (minutes) = 4 (customers)/120 (minutes), so λ = 4

Hence,

P[X ≤ 3] = P[X = 0] + P[X = 1] + P[X = 2] + P[X = 3]
= Σ_{x=0}^{3} e^(−4) 4^x / x!
= e^(−4) [4⁰/0! + 4¹/1! + 4²/2! + 4³/3!]
≈ 0.43347
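The aside can be checked the same way (again a sketch added here, not in the original):

```python
from math import exp, factorial

lam = 4.0   # 2 per hour -> 4 per 120 minutes

# P[X <= 3] = e^(-4) (4^0/0! + 4^1/1! + 4^2/2! + 4^3/3!)
p_at_most_3 = sum(exp(-lam) * lam**x / factorial(x) for x in range(4))
print(p_at_most_3)   # ~0.43347
```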

5. Geometric Distribution: (Enrichment)

If Bernoulli Trials are undertaken repeatedly, the question that arises is: "How many trials must be undertaken to have a winning trial?" Obviously, the probability of winning at every trial is constant and is p. The distribution function is given below:

X = x       1    2     3      4      …
P[X = x]    p    qp    q²p    q³p    …

RV X represents the # of trials until the first success occurs. The distribution function of the RV X is:

P[X = x] = q^(x − 1) p

This relation is intuitively appealing because, if you win on the x-th trial, you must not have won on the previous (x − 1) trials, each of which has probability q. Since all the trials are independent Bernoulli trials, it stands to reason that the probabilities of all the trials are simply multiplied. Further:

E[X] = Σ x P[X = x] = 1p + 2qp + 3q²p + 4q³p + …
= 1(1 − q) + 2q(1 − q) + 3q²(1 − q) + 4q³(1 − q) + …
= 1 − q + 2q − 2q² + 3q² − 3q³ + 4q³ − 4q⁴ + …
= 1 + q + q² + q³ + …
= 1/(1 − q) = 1/p,

based on an infinite geometric series with a common ratio of q and first term 1.

With some careful derivation and patience, the Variance can be shown to be:

Var[X] = q/p²

{σ[X] = STD[X] = σ} = √(q/p²) = √q / p

Ex.#2: Another Simple Example: (Enrichment)

A lottery advertises that your chance of winning a prize is 1 in 10. Find the expected # of tickets, and the standard deviation of the # of tickets, you should buy to win some prize.

Solution: Here p = 1/10 = 0.10.

E[X] = 1/p = 1/0.1 = 10 (tickets)

{σ[X] = σ} = √q / p = √0.9 / 0.1 ≈ 9.48683, rather a large value!
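A short Python sketch of the geometric PMF, applied to the Ex.#2 numbers (an illustrative addition; the truncation point of the series is an arbitrary choice):

```python
from math import sqrt

def geom_pmf(x: int, p: float) -> float:
    """P[X = x] = q^(x - 1) p, for x = 1, 2, 3, ..."""
    return (1 - p) ** (x - 1) * p

p = 0.10                                  # Ex.#2: a 1-in-10 chance of winning
print(1 / p)                              # E[X]   = 10 tickets
print(sqrt((1 - p) / p**2))               # STD[X] = ~9.48683

# E[X] recovered numerically from the PMF (series truncated at x = 500):
print(sum(x * geom_pmf(x, p) for x in range(1, 501)))   # ~10.0
```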

Ex.#3: A Solved Example of the Geometric Distribution: (Enrichment)

A computer chip manufacturer rejects 2% of the chips produced because they do not meet the specified requirements.

a. What is the probability that the 5th chip you test is the first bad one you find?
b. What is the probability that you find one bad chip within the first 10 you test?

Here, the selection of chips may be considered as Bernoulli trials. There are only two possible outcomes: fail testing and pass testing. If the chips selected are a representative sample of all chips, the probability that a chip fails testing is a constant 2% (0.02). Since the population of chips is finite, the trials are not strictly independent (the probabilities do change a little!), but if less than 5% of the chips are tested, the probabilities can be considered reasonably constant.

Let RV X be the # of chips tested until the first bad chip is found. The appropriate model/distribution is then Geom(p = 0.02).

a. P[X = 5] = (0.98)⁴(0.02) ≈ 0.01845

b. P[1 ≤ X ≤ 10] = Σ_{i=1}^{10} P[X = i] = P[X = 1] + … + P[X = 10]
= 0.02 + 0.02(0.98) + 0.02(0.98)² + … + 0.02(0.98)⁹
= 0.02 {1 + (0.98) + (0.98)² + … + (0.98)⁹}
= 0.02 · (1 − 0.98¹⁰)/(1 − 0.98) ≈ 0.18293
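Both parts can be verified numerically; the sketch below (an addition to the notes) also confirms that the geometric-series sum collapses to 1 − 0.98¹⁰:

```python
p = 0.02                                         # P[a chip fails testing]
q = 1 - p

p_a = q**4 * p                                   # a. P[X = 5] = (0.98)^4 (0.02)
p_b = sum(q**(i - 1) * p for i in range(1, 11))  # b. P[1 <= X <= 10]

print(p_a)             # ~0.01845
print(p_b, 1 - q**10)  # ~0.18293 -- both forms agree
```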

6. Hyper-geometric Distribution: (Enrichment)

This distribution is easily appreciated by starting with a problem.

Problem: Canadians play Lotto 649 quite regularly. Out of the 49 numbers, if all the 6 numbers chosen by the player match the 6 numbers randomly selected by Lotto 649, the player wins the jackpot. What is the probability that the player:

a. has all 6 numbers matching the selected numbers?
b. has only 4 numbers matching the selected numbers?

Solution:

a. The # of possible 6-number combinations is:

C(49, 6) = 49!/[(49 − 6)! 6!] = 49(48)(47)(46)(45)(44)(43)!/[(43)! 6!] = 13,983,816

The number of ways in which 6 matching numbers can be chosen from the 6 selected numbers is:

C(6, 6) = 6!/[(6 − 6)! 6!] = 1


P[All 6 numbers match] = 1/13,983,816 (approximately 1 in 14 million!)

b. The number of ways in which 4 matching numbers can be chosen from the 6 selected numbers is:

C(6, 4) = 6!/[(6 − 4)! 4!] = 15

The number of ways in which the 2 non-matching numbers can be chosen from the remaining 43 (= 49 − 6) numbers is:

C(43, 2) = 43!/[(43 − 2)! 2!] = 43(42)(41)!/[(41)! 2!] = 903

The number of ways in which exactly 4 matching numbers can occur among the 6 selected numbers is therefore 15(903) = 13,545.

P[4 numbers will match] = 13,545/13,983,816 ≈ 9.68 × 10⁻⁴ (approximately 1 in 1000!)

Formal Definition of the Hyper-geometric Distribution: X ~ Hypergeom(N, n, p)

Here N is the total # of items of two kinds, n is the # of items selected, N₁ is the # of items of the first kind and N₂ is the # of items of the second kind. Obviously, N₁ + N₂ = N. With:

p = N₁/N and q = N₂/N,

then:

P[X = x] = C(N₁, x) · C(N − N₁, n − x) / C(N, n)

and

{μ = E[X]} = n p = n(N₁)/N

and

Var[X] = n p q [1 − (n − 1)/(N − 1)] = n p q (N − n)/(N − 1)

{σ[X] = STD[X] = σ} = {Var[X]}^(1/2). Thus,

{σ[X] = STD[X] = σ} = √(npq) · √[(N − n)/(N − 1)]

This expression can also be written in forms with N₁ and N₂ shown above.
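A Python sketch of the hyper-geometric PMF and moment formulas (an addition to the notes), using the Lotto 649 parameters as a test case:

```python
from math import comb, sqrt

def hypergeom_pmf(x: int, N: int, N1: int, n: int) -> float:
    """P[X = x] = C(N1, x) C(N - N1, n - x) / C(N, n)"""
    return comb(N1, x) * comb(N - N1, n - x) / comb(N, n)

N, N1, n = 49, 6, 6                      # Lotto 649 parameters
p, q = N1 / N, (N - N1) / N

mean = n * p                             # mu = np
var = n * p * q * (N - n) / (N - 1)      # npq (N - n)/(N - 1)
print(mean, var, sqrt(var))
print(hypergeom_pmf(6, N, N1, n))        # ~7.1511e-08 = 1/13,983,816
```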

General Solution to the Lotto-649 Problem with the Formula:

Here: N = 49, N₁ = 6, p = N₁/N = 6/49, q = 43/49 and n = 6. Thus, if the RV X indicates the # of matching or winning numbers, then:

P[X = x] = C(N₁, x) · C(N − N₁, n − x) / C(N, n) = C(6, x) · C(43, 6 − x) / C(49, 6) = C(6, x) · C(43, 6 − x) / 13,983,816

By using the above formula repeatedly, the following Distribution Function can be readily obtained. It is a Proper Distribution Function.


X = x    P[X = x] = C(6, x) · C(43, 6 − x)/L, where L = 13,983,816
0        6,096,454/L
1        5,775,588/L
2        1,851,150/L
3        246,820/L
4        13,545/L
5        258/L
6        1/L
Total:   13,983,816/L = 1.000

Obviously, p = P[Winning the Jackpot] = P[X = 6] = 1/13,983,816.
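The table itself can be generated by applying the formula repeatedly, as the notes suggest; a minimal Python sketch (an addition):

```python
from math import comb

L = comb(49, 6)        # 13,983,816 possible six-number tickets

numerators = [comb(6, x) * comb(43, 6 - x) for x in range(7)]
for x, num in enumerate(numerators):
    print(f"P[X = {x}] = {num:,}/{L:,}")

print(sum(numerators) == L)   # True -- a proper distribution
```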

If this is the probability p of a Geom(p) Distribution, it can readily be appreciated that:

E[X] = 1/p = 1/(1/13,983,816) = 13,983,816

This means that if you purchased 13,983,816 tickets, at a cost of $27,967,632 at $2/ticket, you would likely win a jackpot of a few million dollars; then again, you might not. It depends on how many tickets were sold!

