Leong YK & Wong WY, Introduction to Statistical Decisions

Chapter 4
Some Basic Concepts on Probability
4.1 Introduction
The term random experiment is used to denote a process whose outcome cannot be determined in advance. When the outcome of a random experiment is, or can be identified with, a number, this numerical outcome is referred to as a random variable. A random variable will be denoted by a capital letter, such as $X$, and the possible values of $X$ by the corresponding lower-case letter $x$.

For a given random variable $X$, the distribution function of $X$ is defined to be
$$F(x) = P(X \le x).$$
The dependence on $X$ is shown explicitly by writing $F_X(x)$ when it is necessary to keep that dependence on display, as is the case when more than one random variable is under discussion.

A distribution function not only describes a distribution in terms of the amounts accumulated from the left, but also serves as a basis for computing the probabilities of other events of interest on the real line.
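As a small illustration of the last point, the probability of a half-open interval can be read off a distribution function as P(a < X <= b) = F(b) - F(a). The sketch below assumes a fair six-sided die as the random variable; the die and the names `die_cdf` and `interval_probability` are illustrative choices, not taken from the text.

```python
from fractions import Fraction

def die_cdf(x):
    """Distribution function F(x) = P(X <= x) of a fair six-sided die (assumed example)."""
    # Count the faces 1..6 that do not exceed x; each face has probability 1/6.
    faces_at_or_below = sum(1 for face in range(1, 7) if face <= x)
    return Fraction(faces_at_or_below, 6)

def interval_probability(cdf, a, b):
    """P(a < X <= b), computed from a distribution function as F(b) - F(a)."""
    return cdf(b) - cdf(a)

# P(2 < X <= 5) = P(X in {3, 4, 5})
print(interval_probability(die_cdf, 2, 5))  # 1/2
```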

4.2 Discrete Random Variables


If the random variable $X$ takes only countably many values, the density function, denoted by $f_X(x)$ and defined by
$$f_X(x) = P(X = x),$$
is used to describe the pattern of possible values $x_1, \ldots, x_n$ and the corresponding probabilities $p_1, \ldots, p_n$, where the probabilities $p_i$ satisfy the following conditions:
(a) $0 \le p_i \le 1$, $i = 1, \ldots, n$;
(b) $p_1 + \cdots + p_n = 1$.

We call $f_X(x)$ the (probability) density function of $X$. Conditions (a) and (b) above specify the properties of a probability function, i.e.,
(a*) $0 \le f_X(x) \le 1$ for all $x$, and
(b*) $\sum_x f_X(x) = 1$.

4.3 Continuous Random Variables


A distribution function that is continuous, having no jumps, defines a distribution on the value space of $X$ that assigns probability 0 to any single value. However, the class of continuous functions includes many that are not useful as models, namely those that fail to be differentiable over a large set of points.

Distribution functions that provide useful models are those that are differentiable everywhere, with the possible exception of at most a finite number of points in any finite interval. In such a case $X$ is said to be a continuous random variable.

The distribution function of a continuous random variable will then have a derivative except at isolated points, and this derivative is denoted by $f_X(\cdot)$:
$$f_X(x) = \frac{d}{dx} F_X(x).$$
The derivative function $f_X(x)$ is called the density function of the distribution, or of the random variable $X$ defining the distribution function.
Note
Unless clarification is needed, the probability function $P(X = x)$ of a discrete random variable will also be denoted by $f_X(x)$ and called the density function of $X$.

4.4 Expectation

Expected Value
The expected value of a function $g(X)$ of a random variable $X$ with possible values $x_1, \ldots, x_n, \ldots$ and probability function $f_X$ is defined to be
$$E(g(X)) = g(x_1) f_X(x_1) + \cdots + g(x_n) f_X(x_n) + \cdots$$

We call $E(X)$ the mean or the expected value of $X$. The expectation $E[(X - E(X))^2]$ is called the variance of $X$ and is denoted by $\mathrm{Var}(X)$ or $\sigma_X^2$. The positive square root of $\sigma_X^2$ is called the standard deviation of $X$.
Example 4.4.1
A coin with probability $\theta$ of heads is tossed $n$ times. Let $X$ denote the number of heads observed in this experiment. The probability function of $X$ is given by
$$f_X(x \mid \theta) = \binom{n}{x} \theta^x (1 - \theta)^{n - x}, \quad x = 0, 1, \ldots, n; \quad 0 \le \theta \le 1.$$
One can verify that
$$E(X) = n\theta \quad \text{and} \quad \mathrm{Var}(X) = n\theta(1 - \theta).$$
The random variable considered in this example is known as a binomial random variable with parameters $n$ and $\theta$.
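The claim about the mean and variance of a binomial random variable can be checked numerically by summing over the probability function directly, as the definition of expectation in this section prescribes. The sketch below is only a sanity check, not a proof; the values `n = 10` and `theta = 0.3` and the function names are assumptions made for illustration.

```python
from math import comb

def binomial_pmf(x, n, theta):
    """Probability of x heads in n tosses of a coin with head probability theta."""
    return comb(n, x) * theta**x * (1 - theta) ** (n - x)

def mean_and_variance(n, theta):
    """E(X) and Var(X) computed from the definition as sums over x = 0, ..., n."""
    xs = range(n + 1)
    mean = sum(x * binomial_pmf(x, n, theta) for x in xs)
    variance = sum((x - mean) ** 2 * binomial_pmf(x, n, theta) for x in xs)
    return mean, variance

n, theta = 10, 0.3  # assumed example values
mean, variance = mean_and_variance(n, theta)
print(mean, n * theta)                     # the two numbers agree up to rounding
print(variance, n * theta * (1 - theta))   # the two numbers agree up to rounding
```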
Some common density functions of discrete random variables, together with their means and variances, are listed below:
Discrete Uniform Random Variable on the set $\{x_1, \ldots, x_n\}$
$$f_X(x) = \frac{1}{n}, \quad x \in \{x_1, \ldots, x_n\}.$$
$$E(X) = \frac{x_1 + \cdots + x_n}{n} = \bar{x}, \quad \mathrm{Var}(X) = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2.$$

Poisson Random Variable
$$f_X(x \mid \lambda) = \frac{\lambda^x e^{-\lambda}}{x!}, \quad x = 0, 1, 2, \ldots; \quad \lambda > 0.$$
$$E(X) = \lambda = \mathrm{Var}(X).$$
Geometric Random Variable
$$f_X(x \mid \theta) = \theta (1 - \theta)^{x - 1}, \quad x = 1, 2, \ldots; \quad 0 < \theta < 1.$$
$$E(X) = \theta^{-1}, \quad \mathrm{Var}(X) = \frac{1 - \theta}{\theta^2}.$$
Note:
A random variable with density function defined by
$$f_X(x \mid \theta) = \theta (1 - \theta)^x, \quad x = 0, 1, \ldots; \quad 0 < \theta < 1$$
is also called a geometric random variable. Thus, whenever a geometric distribution is referred to, the density function must be explicitly specified to avoid ambiguity.

4.5 Continuous Distributions


Continuous random variables were introduced in Section 4.3. For the sake of convenience, a continuous random variable will be defined more restrictively as follows:

If a random variable $X$ is associated with a nonnegative function $f$ such that
$$P(a \le X \le b) = \int_a^b f(x)\,dx$$
for all $a \le b$, then we call $f$ the density function of $X$ and we say that $X$ has a continuous distribution. The mean and variance of $X$ are defined respectively as
$$E(X) = \int_{-\infty}^{\infty} x f(x)\,dx \quad \text{and} \quad \mathrm{Var}(X) = \int_{-\infty}^{\infty} (x - E(X))^2 f(x)\,dx.$$
Of course, the expectations are well defined only if the above integrals converge absolutely.
Example 4.5.1
Let $X$ have density function given by
$$f(x) = \lambda e^{-\lambda x}, \quad x > 0 \text{ and } \lambda > 0.$$
It is easy to verify that $f$ is a genuine density function. Moreover,
$$E(X) = \frac{1}{\lambda} \quad \text{and} \quad \mathrm{Var}(X) = \frac{1}{\lambda^2}.$$
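The three facts in this example (the density integrates to 1, the mean is the reciprocal of the rate parameter, and the variance is the reciprocal of its square) can be checked with a crude numerical integration. The sketch below uses a midpoint rule on a truncated range; the rate value `lam = 2.0`, the truncation point and the helper `midpoint_integral` are assumptions chosen for illustration, and the results are approximations only.

```python
import math

def midpoint_integral(func, lower, upper, steps=200_000):
    """Approximate the integral of func over [lower, upper] with the midpoint rule."""
    width = (upper - lower) / steps
    return width * sum(func(lower + (i + 0.5) * width) for i in range(steps))

lam = 2.0            # assumed rate parameter
upper = 40.0 / lam   # truncation point; the tail beyond it is of order e^(-40)

def density(x):
    """Exponential density with rate lam, for x > 0."""
    return lam * math.exp(-lam * x)

total = midpoint_integral(density, 0.0, upper)
mean = midpoint_integral(lambda x: x * density(x), 0.0, upper)
variance = midpoint_integral(lambda x: (x - mean) ** 2 * density(x), 0.0, upper)

print(total)                  # close to 1
print(mean, 1 / lam)          # close to each other
print(variance, 1 / lam**2)   # close to each other
```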
Other random variables with continuous distributions that we encounter in this course, together with their means and variances, are listed below:
Uniform random variable defined on the interval $[a, b]$
$$f(x) = \frac{1}{b - a}, \quad a \le x \le b.$$
$$E(X) = \frac{a + b}{2}, \quad \mathrm{Var}(X) = \frac{(b - a)^2}{12}.$$
Gamma random variable with parameters $\alpha$ and $\beta$.
$$f(x) = \frac{\beta}{\Gamma(\alpha)} (\beta x)^{\alpha - 1} e^{-\beta x}, \quad x > 0; \quad \alpha > 0, \ \beta > 0.$$
$$E(X) = \frac{\alpha}{\beta}, \quad \mathrm{Var}(X) = \frac{\alpha}{\beta^2}.$$

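Here the gamma function $\Gamma(\alpha)$ is defined as
$$\Gamma(\alpha) = \int_0^{\infty} x^{\alpha - 1} e^{-x}\,dx.$$
Some properties of the gamma function are listed as follows:
$\Gamma(\alpha + 1) = \alpha\,\Gamma(\alpha)$;
$\Gamma(n + 1) = n!$, where $n$ is a nonnegative integer.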
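Both properties of the gamma function can be spot-checked with Python's standard library, which provides `math.gamma` and `math.factorial`. The sketch below checks them at a handful of points only; the helper names and the test value 2.5 are illustrative assumptions.

```python
import math

def gamma_matches_factorial(n):
    """Check Gamma(n + 1) = n! for a nonnegative integer n, up to rounding."""
    return math.isclose(math.gamma(n + 1), math.factorial(n))

def gamma_recurrence_holds(alpha):
    """Check Gamma(alpha + 1) = alpha * Gamma(alpha) for alpha > 0, up to rounding."""
    return math.isclose(math.gamma(alpha + 1), alpha * math.gamma(alpha))

print(all(gamma_matches_factorial(n) for n in range(10)))  # True
print(gamma_recurrence_holds(2.5))                         # True
```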
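Beta random variable with parameters $\alpha$ and $\beta$.
$$f(x) = \frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\Gamma(\beta)} x^{\alpha - 1} (1 - x)^{\beta - 1}, \quad 0 < x < 1,$$
where $\alpha$ and $\beta$ are positive numbers.
$$E(X) = \frac{\alpha}{\alpha + \beta}, \quad \mathrm{Var}(X) = \frac{\alpha \beta}{(\alpha + \beta)^2 (\alpha + \beta + 1)}.$$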
Normal random variable with mean $\mu$ and variance $\sigma^2$.
$$f(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left\{ -\frac{1}{2\sigma^2} (x - \mu)^2 \right\},$$
$$-\infty < x < \infty, \quad -\infty < \mu < \infty, \quad \sigma > 0.$$

4.6 Median

Median
Let $X$ be a random variable. A number $m$ is said to be a median of the distribution of $X$ if
$$P(X \le m) \ge 1/2 \quad \text{and} \quad P(X \ge m) \ge 1/2.$$

Let $X$ be a random variable such that $E(X^2) < \infty$, and let $m$ be a median of the distribution of $X$. Then
(i) $E[(X - E(X))^2] = \min_a E[(X - a)^2]$;
(ii) $E(|X - m|) = \min_a E(|X - a|)$.
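Properties (i) and (ii) say that the mean is the best constant predictor under squared error and a median is a best constant predictor under absolute error. The sketch below illustrates this on a small discrete distribution by searching a grid of candidate values; the distribution, the grid and the function names are assumptions made for illustration, and exact fractions are used to avoid rounding ties.

```python
from fractions import Fraction

# Assumed example: values and probabilities of a discrete random variable X.
values = [0, 1, 2, 10]
probs = [Fraction(1, 10), Fraction(5, 10), Fraction(2, 10), Fraction(2, 10)]

def expected_squared_error(a):
    """E[(X - a)^2] for the example distribution."""
    return sum(p * (x - a) ** 2 for x, p in zip(values, probs))

def expected_absolute_error(a):
    """E(|X - a|) for the example distribution."""
    return sum(p * abs(x - a) for x, p in zip(values, probs))

# Candidate values a = 0, 0.1, 0.2, ..., 10 as exact fractions.
grid = [Fraction(k, 10) for k in range(101)]

mean = sum(p * x for x, p in zip(values, probs))
best_squared = min(grid, key=expected_squared_error)
best_absolute = min(grid, key=expected_absolute_error)

print(mean, best_squared)  # 29/10 29/10
print(best_absolute)       # 1
```

Here 1 is the unique median, since P(X <= 1) = 6/10 and P(X >= 1) = 9/10, and the grid search returns it as the minimiser of the expected absolute error.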
Example 4.6.1
Let $X$ be a Bernoulli random variable with parameter $\theta$. Find a median of the distribution of $X$.
4.7 Conditional Distributions
After we observe a random variable, we want to adjust the probabilities associated with the ones that have not yet been observed. In many situations the parameter $\Theta$ is treated as a random variable. In this case, the conditional distribution of the random variable $X$ given $\Theta$ is the distribution that we would use for $X$ after we learn the value of $\Theta$.

Discrete Conditional Distribution
Suppose $X$ and $\Theta$ are two discrete random variables. The conditional probability of $X = x$, given that $\Theta = \theta$, is defined as
$$P(X = x \mid \Theta = \theta) = \frac{P(X = x, \Theta = \theta)}{P(\Theta = \theta)} \qquad (4.7.1)$$
provided that $P(\Theta = \theta) > 0$.
We write $P(X = x \mid \Theta = \theta) = f(x \mid \theta)$. Here,
$$f(x, \theta) = P(X = x, \Theta = \theta)$$
is called the joint probability function of $X$ and $\Theta$. Interchanging the roles of $X$ and $\Theta$, the conditional probability of $\Theta = \theta$ given that $X = x$ is given by
$$P(\Theta = \theta \mid X = x) = \frac{P(X = x, \Theta = \theta)}{P(X = x)}. \qquad (4.7.2)$$
We call $P(X = x) = f_X(x)$ the marginal probability function of $X$. It follows from (4.7.1) and (4.7.2) that
$$P(\Theta = \theta \mid X = x)\, P(X = x) = P(X = x \mid \Theta = \theta)\, P(\Theta = \theta).$$
If we write
$$\pi(\theta) = P(\Theta = \theta), \quad \pi(\theta \mid x) = P(\Theta = \theta \mid X = x),$$
then, as a function of $\theta$,
$$\pi(\theta \mid x) = \frac{f(x \mid \theta)\, \pi(\theta)}{f_X(x)} \propto f(x \mid \theta)\, \pi(\theta). \qquad (*)$$
Here the reciprocal of $f_X(x)$ is regarded as a proportionality constant. Formula (*) is also adopted as the definition of the conditional density function for continuous random variables $X$ and $\Theta$.
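For a parameter with finitely many possible values, formula (*) amounts to multiplying each prior probability by the likelihood and dividing by the sum of these products, which is the marginal probability of the observed value. The sketch below is one possible way to code this; the function `posterior`, the dictionary representation of the prior and the two-coin example are assumptions for illustration, not part of the text.

```python
from fractions import Fraction

def posterior(prior, likelihood, x):
    """Posterior probabilities of theta given X = x, following formula (*).

    prior:      dict mapping each value theta to its prior probability
    likelihood: function (x, theta) -> conditional probability of X = x given theta
    """
    # Numerators: likelihood times prior, one per value of theta.
    unnormalised = {theta: likelihood(x, theta) * p for theta, p in prior.items()}
    # Marginal probability of X = x: the sum of the numerators.
    marginal = sum(unnormalised.values())
    return {theta: u / marginal for theta, u in unnormalised.items()}

# Assumed example: a coin whose head probability is 1/4 or 3/4 with equal
# prior probability, tossed once; X = 1 denotes a head.
prior = {Fraction(1, 4): Fraction(1, 2), Fraction(3, 4): Fraction(1, 2)}

def toss_likelihood(x, theta):
    return theta if x == 1 else 1 - theta

post = posterior(prior, toss_likelihood, 1)
print(post[Fraction(1, 4)], post[Fraction(3, 4)])  # 1/4 3/4
```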
Example 4.7.1
Suppose that a certain machine produces defective and non-defective parts, but we do not know what proportion of defectives we would find among all parts that could be produced by this machine. Let $\Theta$ stand for the unknown proportion of defective parts among all possible parts produced by the machine. If we were to learn that $\Theta = \theta$, then the machine produces a defective item with probability $\theta$. Suppose that we examine $n$ parts and let $X$ stand for the number of defectives among the $n$ examined parts. Then, conditionally on $\Theta = \theta$, $X$ is binomially distributed with parameters $n$ and $\theta$. In other words, the conditional probability function of $X$ given that $\Theta = \theta$ is given by
$$f(x \mid \theta) = \binom{n}{x} \theta^x (1 - \theta)^{n - x}, \quad x = 0, 1, \ldots, n.$$
Suppose we believe that $\Theta$ is uniformly distributed on the interval $[0, 1]$. Then the conditional density function of $\Theta$, given that $X = x$, is found to be
$$\pi(\theta \mid x) \propto f(x \mid \theta)\, \pi(\theta) \propto \theta^x (1 - \theta)^{n - x}.$$
This implies that if $X = x$ is observed, $\Theta$ has the beta distribution with parameters $\alpha = x + 1$ and $\beta = n - x + 1$. In particular, if $n = 2$ and $x = 1$, then the conditional density function of $\Theta$ given that $X = 1$ is
$$\pi(\theta \mid 1) = 6\theta(1 - \theta), \quad 0 \le \theta \le 1.$$
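The constant 6 in the last display is the normalising constant of the beta density with parameters x + 1 and n - x + 1, which for integer parameters reduces to (n + 1)! / (x! (n - x)!) by the factorial property of the gamma function. The sketch below computes that constant and checks numerically that the resulting density integrates to 1; the function names and the midpoint-rule check are illustrative assumptions.

```python
from math import factorial

def beta_posterior_constant(n, x):
    """Normalising constant of the beta density with parameters x + 1 and n - x + 1.

    Uses Gamma(k + 1) = k!, so the constant equals (n + 1)! / (x! (n - x)!).
    """
    return factorial(n + 1) // (factorial(x) * factorial(n - x))

def posterior_density(theta, n, x):
    """Posterior density of the proportion under a uniform prior, given x defectives in n."""
    return beta_posterior_constant(n, x) * theta**x * (1 - theta) ** (n - x)

print(beta_posterior_constant(2, 1))  # 6

# Midpoint-rule check that the density for n = 2, x = 1 integrates to about 1.
steps = 10_000
total = sum(posterior_density((i + 0.5) / steps, 2, 1) for i in range(steps)) / steps
print(total)  # close to 1
```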
Example 4.7.2
Suppose that the proportion $\Theta$ of defective items in a large manufactured lot is known to be either 0.1 or 0.2, and the prior probability function of $\Theta$ is as follows:
$$\pi: \quad P(\Theta = 0.1) = w, \quad P(\Theta = 0.2) = 1 - w, \quad 0 \le w \le 1.$$
Suppose that four items are selected at random and it is found that exactly one of them is defective. Let $X$ denote the number of defective items found in the sample. Then the posterior probabilities of $\Theta$ are computed as follows:
$$P(\Theta = 0.1 \mid X = 1) \propto P(X = 1 \mid \Theta = 0.1)\, P(\Theta = 0.1) \propto (0.1)^1 (0.9)^3\, w,$$
$$P(\Theta = 0.2 \mid X = 1) \propto P(X = 1 \mid \Theta = 0.2)\, P(\Theta = 0.2) \propto (0.2)^1 (0.8)^3\, (1 - w).$$
Therefore
$$P(\Theta = 0.1 \mid X = 1) = \frac{(0.1)(0.9)^3\, w}{(0.1)(0.9)^3\, w + (0.2)(0.8)^3\, (1 - w)} = \frac{729w}{729w + 1024(1 - w)} = \frac{729w}{1024 - 295w},$$
$$P(\Theta = 0.2 \mid X = 1) = 1 - P(\Theta = 0.1 \mid X = 1).$$
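The closed form 729w / (1024 - 295w) can be cross-checked against a direct computation from the binomial probabilities, in which the common factor "4 choose 1" cancels. The sketch below does this with exact fractions; the prior weight `w = 1/2` and the function name are assumptions chosen only to give a concrete number.

```python
from fractions import Fraction
from math import comb

def posterior_low_proportion(w, n=4, x=1):
    """Posterior probability that the defective proportion is 0.1, given x defectives in n items."""
    prior = {Fraction(1, 10): w, Fraction(2, 10): 1 - w}
    # Binomial likelihood times prior for each candidate proportion.
    weights = {t: comb(n, x) * t**x * (1 - t) ** (n - x) * p for t, p in prior.items()}
    return weights[Fraction(1, 10)] / sum(weights.values())

w = Fraction(1, 2)  # assumed prior weight
print(posterior_low_proportion(w))   # 729/1753
print(729 * w / (1024 - 295 * w))    # 729/1753
```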

Example 4.7.3
Suppose that the conditional density function of $X$ given that $\Theta = \theta$ is given by
$$f(x \mid \theta) = \theta x^{\theta - 1}, \quad 0 < x < 1; \quad \theta > 0.$$
If the prior distribution of $\Theta$ is the gamma distribution with parameters $\alpha$ and $\beta$, determine the mean of the posterior distribution of $\Theta$.
Answer: $\dfrac{\alpha + 1}{\beta - \log x}$
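One way to arrive at this answer is to apply formula (*) and recognise the kernel of a gamma density. The derivation below is a sketch that assumes the gamma prior is written in the form used in Section 4.5, so that its density is proportional to $\theta^{\alpha - 1} e^{-\beta\theta}$.

```latex
\begin{align*}
\pi(\theta \mid x)
  &\propto f(x \mid \theta)\,\pi(\theta) \\
  &\propto \theta\, x^{\theta - 1} \cdot \theta^{\alpha - 1} e^{-\beta\theta} \\
  &\propto \theta^{(\alpha + 1) - 1}\, e^{-(\beta - \log x)\theta},
  \qquad \text{since } x^{\theta - 1} = x^{-1} e^{\theta \log x}.
\end{align*}
% This is the kernel of a gamma density with parameters
% \alpha + 1 and \beta - \log x, where \beta - \log x > 0 because 0 < x < 1.
% The mean of a gamma distribution is the ratio of its two parameters, so
\[
E(\Theta \mid X = x) = \frac{\alpha + 1}{\beta - \log x}.
\]
```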
