INDEX

! factorial, 3, 24
1st fundamental theorem of probability, 16
2nd fundamental theorem of probability, 17
absorbing matrices, 21
absorption
  probability, 21
  time, 21
approximation theorem, 18
area
  sphere, 25
Bayes' formula, 11
Bayes' inverse problem, 11
Bayes' theorem, 11
Bernoulli trials
  central limit theorem, 17
Bernoulli trials process, 6, 7
beta density function, 12
binomial coefficient, 5
binomial distribution
  central limit theorem, 17
  negative, 7
binomial distribution function, 6
binomial expansion, 25
binomial theorem, 25
birthday problem, 23
B-matrix, 21
calculus, 24
canonical form, 21
card problem, 23
central limit theorem, 17
  general form, 18
  sum of discrete variables, 18
central limit theorem for Bernoulli trials, 17
central limit theorem for binomial distributions, 17
Chebyshev inequality, 16
coefficient
  binomial, 5
  multinomial, 5
coefficients of ordinary generating function, 20
combinations, 3
conditional density function
  continuous, 11
conditional probability, 10, 11
continuous conditional density function, 11
continuous uniform density, 9
convolution, 15
  example, 16
correlation, 15
covariance, 15
cumulative distribution function, 6
cumulative normal distribution function, 6
cyclic permutation, 3
D(X) standard deviation, 14
De Morgan's laws, 10
density function, 8
  beta, 12
  exponential, 9
  joint, 9
  normal, 8
  standard normal, 8
dependent variable, 25
derangement, 25
derivatives, 24
deviation, 14
disjoint, 3
distribution
  hypergeometric, 7
  uniform, 25
distribution function, 5, 6, 25
  binomial, 6
  cumulative, 6
  discrete uniform, 5
  exponential, 8
  geometric, 7
  joint, 7
  joint cumulative, 6
  multinomial, 7
  negative binomial, 7
  Poisson, 7
double coin toss, 2
e natural number, 24
E(X) expected value, 12
envelope problem, 23
ergodic chain, 22
Euler's equation, 24
events
  independent, 4
e^x infinite sum, 24
expectation
  properties, 13
expectation of a function, 13
expectation of a product, 13
expected value, 6, 8, 12
exponential density function, 9
exponential distribution, 8
f(t) exponential density function, 9
f(x) density function, 8
F(x,y) joint cumulative distribution function, 6
f(x,y) joint density function, 9
f(ω) continuous uniform density function, 9
factorial, 3, 24
failure rate, 9
fixed probability matrix, 22
fixed probability vector, 22
fundamental matrix, 21, 22
FX(x) cumulative normal distribution function, 6
fX(x) normal density function, 8
g(t) generating function, 18, 19
general math, 24
generating function, 18, 19
  ordinary, 19
  properties, 20
geometric distribution, 7
glossary, 25
graphing terminology, 25
h(z) ordinary generating function, 19
hat check problem, 23
hypergeometric distribution, 7
hypotheses, 11
inclusion-exclusion principle, 3
independent events, 4
independent trials, 25
independent variable, 25
inequality
  Chebyshev, 16
  Markov, 13
infinite sample space, 3
infinite sum, 24
integration, 24
intersection, 3
joint cumulative distribution function, 6
joint density function, 9
joint distribution, 7
L'Hôpital's rule, 24
law of averages, 16
law of large numbers, 16
linearizing an equation, 25
logarithms, 24
M mean first passage matrix, 22
m(ω) distribution function, 5
Markov chains, 20
Markov inequality, 13
mean, 12
median, 25
medical probabilities, 23
memoryless property, 12
moment generating function, 18, 19
moments, 19
multinomial coefficient, 5
multinomial distribution, 7
multiple toss, 2
mutually exclusive, 3
NA(0,z) table of values, 17
natural number e, 24
Nc-matrix, 21
negative binomial distribution, 7
N-matrix, 21
normal density function, 8
normal random variable, 4
normalization, 8
ordering, 3
ordinary generating function, 19
  coefficients, 20
p(j) coefficients of the ordinary generating function, 20
permutation, 3
P-matrix, 21
Poisson distribution, 7
poker probabilities, 10
posterior probabilities, 11
prior probabilities, 11
probabilities
  posterior, 11
  prior, 11
probability, 2
  uneven, 4
probability of absorption, 21
probability space, 3
probability tree, 11
problems, 23
properties of expectation, 13
Q-matrix, 21
quadratic equation, 25
random variable, 25
  normal, 4
  standard normal, 4
regular chain, 22
  properties, 22
reliability, 10
reverse tree, 11
R-matrix, 21
round table, 3
sample problems, 23
sample space, 3, 5
series, 25
set properties, 3
single coin toss, 2
Sn* standardized sum, 17
specific generating functions, 19
sphere, 25
standard deviation, 6, 8, 14
standard normal density function, 8
standard normal random variable, 4
standardized sum, 17
states, 20
step, 20
Stirling approximation, 24
stochastic, 25
sum
  standardized, 17
sum of continuous variables, 15
sum of discrete variables, 15
table of values for NA(0,z), 17
time to absorption, 21
t-matrix, 21
transient state, 21
transition matrix, 20
tree
  probability, 11
SINGLE COIN TOSS
There is an equal probability that the outcome will be heads or tails.
$P(H) = P(T) = \frac{1}{2}$

MULTIPLE TOSS
E = an event, in this case getting exactly two heads in n tosses; the order of occurrence is not a factor. Each particular sequence of outcomes has probability $1/2^n$, the inverse of the number of possible outcomes.
$P(E) = \frac{n(n-1)}{2} \cdot \frac{1}{2^n}$

INCLUSION-EXCLUSION PRINCIPLE
$P(E_1 \cup E_2 \cup \cdots \cup E_n) = \sum_i P(E_i) - \sum_{1 \le i < j \le n} P(E_i \cap E_j) + \sum_{1 \le i < j < k \le n} P(E_i \cap E_j \cap E_k) - \cdots$

ORDERING/COMBINATIONS
In how many ways can n identical objects be arranged in m containers?
$\binom{n+m-1}{m-1} = \frac{(n+m-1)!}{n!\,(m-1)!}$
A problem that appeared in the textbook was, how many ways can 6 identical letters be put in 3 mail boxes? The answer is $\binom{8}{2} = 28$.

Ω SAMPLE SPACE
The set of all possible outcomes. (Means the same as probability space, I think.) For example, if a coin is tossed twice, the sample space is
$\Omega = \{(H,H), (H,T), (T,H), (T,T)\}$
Note that the sample space is not a number; it is a collection or set of results. This is frequently a source of confusion: the size of the sample space is a number, but the sample space itself is not. See Probability, p2.
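As a sanity check of the ordering/combinations formula, here is a short Python brute-force count of the mailbox example. This snippet is my own illustration, not from the sheet; it only assumes the standard library.

# Count the ways to put 6 identical letters into 3 mailboxes,
# by brute force and by the stars-and-bars formula.
from itertools import product
from math import comb

n, m = 6, 3  # n identical objects, m containers
# Enumerate all (a, b, c) with a + b + c = n.
brute = sum(1 for counts in product(range(n + 1), repeat=m) if sum(counts) == n)
formula = comb(n + m - 1, m - 1)  # stars-and-bars count
print(brute, formula)  # both print 28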
NORMAL RANDOM VARIABLE
A normal random variable has a Gaussian density function centered at the expectation µ. The figure below shows plots of the density functions of two normal random variables, centered at the common expectation of 0. The plot having the sharper peak is for the special case of a standard normal random variable, determined by an expectation of 0 and a deviation of 1. A normal random variable does not necessarily have an expectation of zero.

INDEPENDENT EVENTS
Two events A and B are called independent if the outcome of one does not affect the outcome of the other. Mathematically (for a particular probability assignment/distribution), two events are independent if the probability of both events occurring is equal to the product of their probabilities.
$P(A \cap B) = P(A) \cdot P(B)$
For example, the outcome of the first roll of a die does not affect the second roll. The independence of two events can be lost if the probabilities are not even, e.g. an unfair coin or die.
Independence can also be expressed in terms of a conditional probability. The probability of A given B is still the probability of A.
$P(A \mid B) = P(A)$
And if this is true, then it is also true that
$P(B \mid A) = P(B)$
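A quick numerical check of the independence condition (mine, not the sheet's), using exact fractions over the 36 outcomes of two fair dice:

from fractions import Fraction
from itertools import product

outcomes = list(product(range(1, 7), repeat=2))  # 36 equally likely outcomes
A = {o for o in outcomes if o[0] % 2 == 0}       # first roll is even
B = {o for o in outcomes if o[1] == 5}           # second roll is 5
P = lambda E: Fraction(len(E), len(outcomes))
print(P(A & B) == P(A) * P(B))  # True: the two events are independent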
Example: the probability that a hand of 10 cards dealt from a deck of 52 contains all 4 aces:
$\binom{48}{6}\binom{4}{4} = \frac{48!}{42!\,6!} \times \frac{4!}{0!\,4!}$
$\binom{52}{10} = \frac{52!}{(52-10)!\,10!} = \frac{52!}{42!\,10!}$
What this is saying is, "From 48 non-aces choose 6, then from 4 aces choose 4." The product is the number of possible combinations of 10 cards that can contain 4 aces. "Now divide this amount by all of the possible combinations of 10 cards out of 52." Note that 0! = 1, so we have
$\frac{48!/(42!\,6!)}{52!/(42!\,10!)} = \frac{48!\,42!\,10!}{42!\,6!\,52!} = 0.000776$
m(ω) DISCRETE UNIFORM DISTRIBUTION FUNCTION p.19, 367
The function assigning probabilities to a finite set of equally likely outcomes. For example, the distribution function of a fair double coin toss is
$m(H,H) = m(H,T) = m(T,H) = m(T,T) = \frac{1}{4}$
The distribution function for the roll of a die is
  ω:     1    2    3    4    5    6
  m(ω):  1/6  1/6  1/6  1/6  1/6  1/6
In general, the discrete uniform distribution function is
$m(\omega) = \frac{1}{l - k + 1}$
The sum of the probabilities is equal to one. See also Specific Generating Functions p19.
k = the lowest value in the sample space
l = the highest value in the sample space
g(t) = generating function

FX(x) CUMULATIVE NORMAL DISTRIBUTION FUNCTION
$F_X(x) = \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{x} e^{-(u-\mu)^2 / 2\sigma^2}\,du$
This requires numerical integration; there are tables of values for this function in Appendix A of the textbook.
x = a particular outcome
µ = center of the density, average value, expected value
σ = a positive value measuring the spread of the density, the standard deviation

F(x,y) JOINT CUMULATIVE DISTRIBUTION FUNCTION p.165
The example below is for two random variables and may be extended for additional variables.
$F(x,y) = P(X \le x,\, Y \le y)$

b(n,p,k) BINOMIAL DISTRIBUTION FUNCTION
$b(n,p,k) = \binom{n}{k} p^k q^{n-k}$, where $\binom{n}{k}$ accounts for the ways the result can be ordered.
$\phi(x) = (q + pe^x)^n \qquad \mu = np \qquad \sigma^2 = npq$
In the case of equal probabilities (p = 0.5), the function reduces to
$b(n, 0.5, k) = \binom{n}{k} 0.5^n$
In the case of no successful outcomes (k = 0), the function reduces to
$b(n, p, 0) = q^n$
n = number of trials or selections
p = probability of success
q = probability of failure (1 − p)
k = number of successful outcomes
µ = center of the density, average value, expected value
σ² = variance
φ(x) = moment generating function
For example, if I can guess a person's age with a 70% success rate, what is the probability that out of ten people, I will guess the ages of exactly 8 people correctly?
$b(10, .7, 8) = \binom{10}{8}\, .7^8\, .3^{10-8} = 0.233$
See also Specific Generating Functions p19.
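A one-function Python version of the age-guessing example (my illustration, not the sheet's):

from math import comb

def b(n, p, k):
    # binomial distribution function b(n, p, k)
    return comb(n, k) * p**k * (1 - p)**(n - k)

print(round(b(10, 0.7, 8), 3))  # 0.233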
$P(\text{selecting exactly } m \text{ of the } M \text{ balls when drawing } k \text{ from } M+N) = \frac{\binom{M}{m}\binom{N}{k-m}}{\binom{M+N}{k}}$
This example was also used for hypergeometric distribution. See section 5.1.

JOINT DISTRIBUTION FUNCTION p.141
A joint distribution function describes the probabilities of outcomes involving multiple random variables. If the random variables are mutually independent, then the joint distribution function is the product of the individual distribution functions of the random variables.
$F_{X,Y}(x,y) = F_X(x)\,F_Y(y)$

f(t) EXPONENTIAL DENSITY FUNCTION
Expectation: $\mu = \frac{1}{\lambda}$    Variance: $\sigma^2 = \frac{1}{\lambda^2}$
Deviation: $\sigma = \frac{1}{\lambda}$    Generating Fnct.: $g(t) = \frac{\lambda}{\lambda - t}$
λ = rate of occurrence, a parameter
t = time, units to be specified

POISSON DISTRIBUTION p.187
X = random variable for the number of occurrences in a given interval in time, area, length, etc. See Specific Generating Functions p19 for the Poisson distribution function, mean, and variance.

u(x,k,p) NEGATIVE BINOMIAL DISTRIBUTION
The probability of the k-th successful outcome occurring on the x-th attempt, i.e. of k successful outcomes in x attempts with the last attempt a success. For the geometric distribution, k = 1.
$u(x,k,p) = P(X = x) = \binom{x-1}{k-1} p^k q^{x-k}$
This seems to describe the Bernoulli trials process, which is a sequence of x chance experiments such that 1) each experiment has 2 possible outcomes and 2) the probability p of success is the same for each experiment and is not affected by knowledge of previous outcomes.
X = random variable: the observation of an experimental outcome
x = the number of attempts
k = the number of successful outcomes
p = the probability that any one event is successful
q = the probability that an event is not successful, 1 − p
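A quick consistency check of u(x, k, p), written by me and not taken from the sheet: the probabilities over all attempt counts should sum to 1 (here, very nearly so with a finite cutoff).

from math import comb

def u(x, k, p):
    # P(the k-th success occurs on attempt x)
    return comb(x - 1, k - 1) * p**k * (1 - p)**(x - k)

print(round(sum(u(x, 2, 0.5) for x in range(2, 200)), 6))  # 1.0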
DENSITY FUNCTIONS

f(x) DENSITY FUNCTION p.59
The density function is the derivative of the distribution function F(x) (see p5). The integral of a density function over its entire interval is equal to one. So by integrating a density function over a particular interval we determine the probability of an outcome falling within that interval.
$P(a \le X \le b) = \int_a^b f(x)\,dx, \qquad f(x) = F'(x)$

fX(x) NORMAL DENSITY FUNCTION
[Plot of the normal density function for µ = 0.]
$f_X(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-\mu)^2 / 2\sigma^2}$
X = random variable: the observation of an experimental outcome
x = a particular outcome
µ = center of the density, average value, expected value
σ = the standard deviation, a positive value measuring the spread of the density

f(x,y) JOINT DENSITY FUNCTION
$F(x,y) = \int_{-\infty}^{x}\int_{-\infty}^{y} f(t,u)\,du\,dt$
The joint density function can involve more than two variables and looks like
$f(x_1, x_2, \ldots, x_n) = \frac{\partial^n F(x_1, x_2, \ldots, x_n)}{\partial x_1\,\partial x_2 \cdots \partial x_n}$

For the exponential density, e.g. for a random variable T,
$P(T > x) = e^{-\lambda x} \qquad P(T \le x) = 1 - e^{-\lambda x}$
[Two exponential density functions f(t) were plotted here.]
DE MORGAN'S LAWS
$P(\overline{A \cup B}) = P(\bar{A} \cap \bar{B}) \qquad P(\overline{A \cap B}) = P(\bar{A} \cup \bar{B})$
From this we can get
$A - (B \cap C) = A \cap \overline{B \cap C} = A \cap (\bar{B} \cup \bar{C}) = (A \cap \bar{B}) \cup (A \cap \bar{C}) = (A - B) \cup (A - C)$

POKER PROBABILITIES: 3 OF A KIND
The number of ways to choose three cards of one value plus two more cards is 4×48×44 = 8448. Multiply that by the 13 different numerical values in a deck of cards to get the total number of 5-card hands that contain 3 of a kind (109,824). So the probability of getting 3 of a kind is
$\frac{4 \times 48 \times 44 \times 13}{\binom{52}{5}} = 0.04226$
(Note: 4×48×44 counts the last two cards in order, so this double-counts each hand; dividing by 2! gives 54,912 hands and a probability of 0.0211.)
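To back up the correction note above, here is the standard unordered count in Python (my own check, not part of the sheet):

from math import comb

hands = comb(52, 5)
# 13 values for the triple, C(4,3) suit choices, then two cards of two
# different other values: C(12,2) value pairs times 4 suits each.
three_kind = 13 * comb(4, 3) * comb(12, 2) * 4 * 4
print(three_kind, round(three_kind / hands, 4))  # 54912 0.0211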
P(A|B) CONDITIONAL PROBABILITY
P(A|B) reads, "the probability that event A will occur given that event B has occurred." Since we know that B has occurred, the sample space now consists of only those outcomes in which B has occurred.
$P(A \mid B) = \frac{P(A \cap B)}{P(B)}$

f(x|E) CONTINUOUS CONDITIONAL DENSITY
For example, if we know that a spinner has stopped with its pointer in the upper half of a circle, 0 ≤ x ≤ ½, then the conditional density is
$f(x \mid E) = \begin{cases} 1/(1/2) = 2, & \text{if } 0 \le x \le 1/2 \\ 0, & \text{if } 1/2 < x < 1 \end{cases}$

For Bayes' inverse problem, the probability that the success probability lies between a and b, given m successes in n trials, is the ratio
$\frac{\int_a^b x^m (1-x)^{n-m}\,dx}{\int_0^1 x^m (1-x)^{n-m}\,dx}$
The computation of the integrals is too difficult for exact solution except for small values of m and n.

BAYES' FORMULA
$P(H_j \mid E) = \frac{P(H_j)\,P(E \mid H_j)}{\sum_{k=1}^{m} P(H_k)\,P(E \mid H_k)}$
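A small posterior-probability example in Python (the scenario and numbers are mine, not the sheet's): a test is 95% accurate and 1% of the population has the condition.

P_H = [0.01, 0.99]          # prior: P(H1) = sick, P(H2) = well
P_E_given_H = [0.95, 0.05]  # P(positive test | H)
total = sum(ph * pe for ph, pe in zip(P_H, P_E_given_H))  # Bayes denominator
posterior = P_H[0] * P_E_given_H[0] / total
print(round(posterior, 3))  # 0.161, the posterior probability of being sick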
BETA DENSITY FUNCTION
A density function having positive parameters α and β. When both parameters are equal to one, the beta density is the uniform density. When they are both greater than one, the function is bell-shaped; when they are both less than one, the function is U-shaped. A beta density function can be used to fit data that does not fit the Gaussian curve of a normal density function (p8).
[A figure titled "Beta Density Functions" appeared here.]
$B(\alpha, \beta) = \frac{(\alpha - 1)!\,(\beta - 1)!}{(\alpha + \beta - 1)!}$

MEMORYLESS PROPERTY
The memoryless property applies to the exponential density function and the geometric distribution function.
$P(T > r + s \mid T > r) = P(T > s)$

EXPECTATION

E(X), µ EXPECTED VALUE OF DISCRETE RANDOM VARIABLES p.225
The expected value, also known as the mean and sometimes identified as µ, is the sum of the product of each possible outcome and its probability. It is the center of mass in a distribution function. If a large number of experiments are undertaken and the results are averaged, the value obtained should be close to the expected value. The formula for the expected value is
$\mu = E(X) = \sum_{x \in \Omega} x\,m(x)$
For the binomial distribution, $\mu = E(X) = np$.
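The die example worked in Python with exact fractions (my own illustration):

from fractions import Fraction

# µ = E(X) = Σ x·m(x) for one fair die
m = {x: Fraction(1, 6) for x in range(1, 7)}
mu = sum(x * p for x, p in m.items())
print(mu)  # 7/2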
E(X), µ EXPECTED VALUE OF CONTINUOUS RANDOM VARIABLES
The expected value, also known as the mean and sometimes identified as µ, is the center of mass in a distribution function. If a large number of experiments are undertaken and the results are averaged, the value obtained should be close to the expected value. The formula for the expected value is
$\mu = E(X) = \int_{-\infty}^{+\infty} x\,f(x)\,dx$
provided that $\int_{-\infty}^{+\infty} |x|\,f(x)\,dx$ is finite. Otherwise the expected value does not exist. Note that the limits of integration may be reduced provided they include the sample space.
For an exponential density $f(t) = \lambda e^{-\lambda t}$ (p.9), the expected value is $\mu = \frac{1}{\lambda}$.
X = random variable: the observation of an experimental outcome
f(x) = the density function for random variable X

PROPERTIES OF EXPECTATION
If X and Y are two random variables with finite expected values, then
$E(X + Y) = E(X) + E(Y)$
If X is a random variable and c is a constant,
$E(cX) = cE(X)$
If X and Y are independent,
$E(X \cdot Y) = E(X)\,E(Y)$
Also:
$E(X^2) = V(X) + \mu^2 \qquad E(X^n) = \int_{-\infty}^{+\infty} x^n f_X(x)\,dx$

MARKOV INEQUALITY
The probability that an outcome will be greater than or equal to some constant k is less than or equal to the expected value divided by that constant.
$P(X \ge k) \le \frac{E(X)}{k}$
For example, if the expected height of a person is 5.5 feet, then the Markov inequality states that the probability that a person is more than 11 feet tall is no more than ½. This example demonstrates the looseness of the Markov inequality. A more meaningful inequality is the Chebyshev inequality, which is a special case of Markov's inequality (p.16).
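A simulation check of the Markov inequality in Python (my own sketch; the exponential distribution and constants are arbitrary choices):

import random

random.seed(1)
# Exponential samples with mean E(X) = 2; Markov bound: P(X >= k) <= E(X)/k
xs = [random.expovariate(0.5) for _ in range(100_000)]
k = 6.0
freq = sum(x >= k for x in xs) / len(xs)
print(round(freq, 3), round(2 / k, 3))
# The observed frequency (near e^-3, about 0.05) is well under the bound 0.333.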
E(φ(X)) EXPECTATION OF A FUNCTION p.229
If X and Y are two random variables and Y can be written as a function of X, then the expected value of Y can be computed using the distribution of X.
$E(\phi(X)) = \sum_{x \in \Omega} \phi(x)\,m(x)$
V(X), σ² VARIANCE
For a discrete random variable, the variance is found by subtracting the expected value from each possible outcome, squaring that difference, multiplying that square by the probability of the outcome, and then summing these for each possible outcome. The expected value is more useful as a prediction when the outcome is not likely to deviate too much from the expected value.
$\sigma^2 = V(X) = E\big((X - \mu)^2\big) = \sum_x (x - \mu)^2\,m(x)$
For continuous random variables,
$\sigma^2 = V(X) = E\big((X - \mu)^2\big) = \int_{-\infty}^{+\infty} (x - \mu)^2 f(x)\,dx$
Note that the limits of integration may be adjusted so long as they continue to include the sample space. If the integral fails to converge, the variance does not exist.
The variance of a uniform distribution on [0,1] is 1/12. The variance of an exponential distribution is 1/λ².
For discrete random variables, the variance can be found by a couple of methods:
Method 1: $\sum_x (x - \mu)^2\,m(x)$. Find the expected value µ. Subtract µ from each possible outcome. Square each of these results. Multiply each result by its probability and then sum all of these. For example, the variance of the roll of a die is
$[(1-3.5)^2 + (2-3.5)^2 + (3-3.5)^2 + (4-3.5)^2 + (5-3.5)^2 + (6-3.5)^2](1/6) = 35/12$
Method 2: $E(X^2) - \mu^2$. Multiply the probability of each outcome by the square of the outcome and sum these to get E(X²); then subtract the square of the expected value.
X = random variable: the observation of an experimental outcome
µ = the expected value of X, E(X)

V(X), σ² PROPERTIES OF VARIANCE p.259
If X and Y are independent,
$V(X + Y) = V(X) + V(Y)$
$V(cX) = c^2 V(X) \qquad V(X + c) = V(X)$
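Both methods checked in Python for the die example (my own snippet, exact fractions):

from fractions import Fraction

faces = range(1, 7)
m = Fraction(1, 6)
mu = sum(x * m for x in faces)                 # 7/2
v1 = sum((x - mu) ** 2 * m for x in faces)     # Method 1: E((X - µ)²)
v2 = sum(x ** 2 * m for x in faces) - mu ** 2  # Method 2: E(X²) - µ²
print(v1, v2)  # 35/12 35/12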
cov(X,Y) COVARIANCE
The book doesn't go into detail about this. Covariance applies to both discrete and continuous random variables.
$\mathrm{cov}(X,Y) = E\big((X - \mu(X))(Y - \mu(Y))\big) = E(XY) - \mu(X)\,\mu(Y)$
Property of covariance:
$V(X + Y) = V(X) + V(Y) + 2\,\mathrm{cov}(X,Y)$
X = random variable: the observation of an experimental outcome
V(X) = the variance of X
µ = the expected value of X, E(X)

ρ(X,Y) CORRELATION p.281
The book doesn't go into detail about this either. Correlation applies to continuous random variables. Another text calls this the correlation coefficient and has a separate function for discrete random variables, which it calls correlation.
$\rho(X,Y) = \frac{\mathrm{cov}(X,Y)}{\sqrt{V(X)\,V(Y)}}$
X = random variable: the observation of an experimental outcome
V(X) = the variance of X
µ = the expected value of X, E(X)

SUM OF RANDOM VARIABLES p.285, 291
Discrete: Given Z = X + Y, where X and Y are independent discrete random variables with distribution functions m1(x) and m2(y), we can find the distribution function m3(z) of Z using convolution.
$m_3 = m_1 * m_2$
$m_3(z) = \sum_k m_1(k) \cdot m_2(z - k)$
Continuous: Given Z = X + Y, where X and Y are independent continuous random variables with density functions f(x) and g(y), we can find the density function h(z) of Z using convolution. Note that we are talking about density functions here, where it was distribution functions where discrete random variables were concerned. Also note that the limits of integration may be adjusted for density functions that do not extend to infinity.
$f(x) * g(y) = h(z) = \int_{-\infty}^{+\infty} f(z - y)\,g(y)\,dy = \int_{-\infty}^{+\infty} g(z - x)\,f(x)\,dx$
k = represents all of the integers for which the probabilities m1(k) and m2(z−k) exist. (In cases where the probability doesn't exist, the probability is zero.)
For more about the sum of random variables, see Properties of Generating Functions p20.
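The discrete convolution worked for the sum of two fair dice (my own example in Python, exact fractions):

from fractions import Fraction

m1 = {x: Fraction(1, 6) for x in range(1, 7)}
m2 = dict(m1)
m3 = {}  # distribution of Z = X + Y via m3(z) = Σ m1(k)·m2(z - k)
for x, px in m1.items():
    for y, py in m2.items():
        m3[x + y] = m3.get(x + y, Fraction(0)) + px * py
print(m3[7])  # 1/6, the most likely sum of two dice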
CENTRAL LIMIT THEOREM FOR BERNOULLI TRIALS
$\lim_{n \to \infty} P\left(a^* < S_n^* < b^*\right) = \int_{a^*}^{b^*} \phi(x)\,dx \;\;*$
where $\phi(x) = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}$
σ = the deviation
σ² = the variance
µ = the expected value of X, E(X)
*This integral has no closed-form solution. The table of values in the next box gives areas under the curve of φ(x) and may be used as a close approximation instead of performing the integration. For example, for the integration from a* = −.2 to b* = .3, find the values of NA(z) for z = .2 and z = .3 in the table and add them together to get .1972. Note that in this case the values were added because they represent areas on each side of the mean (center). In the case where both values are on the same side of the mean (both have the same sign), a subtraction would have to take place to find the desired area. That is because NA(z) is the area bounded by z and the mean. [A figure illustrating the NA(z) area appeared here.]

Sn* STANDARDIZED SUM OF Sn p.326
The standardized sum always has the expected value 0 and variance 1. A sum of variables is standardized by subtracting the expected number of successes and dividing by its standard deviation.
$S_n^* = \frac{S_n - np}{\sqrt{npq}} \quad \text{or} \quad S_n^* = \frac{S_n - n\mu}{\sqrt{n\sigma^2}}$

APPROXIMATION THEOREM
$\phi(x) = \lim_{n \to \infty} \sqrt{npq}\;\, b\big(n,\, p,\, \langle np + x\sqrt{npq} \rangle\big)$
where ⟨·⟩ denotes the nearest integer.
a = lower bound
b = upper bound
φ(x) = standard normal density function
n = number of trials or selections
p = probability of success
q = probability of failure (1 − p)
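A numerical illustration of the theorem (mine, not the sheet's): compare an exact binomial probability with the normal approximation, using a continuity correction.

from math import comb, erf, sqrt

n, p = 100, 0.5
q = 1 - p
# Exact P(45 <= Sn <= 55) for Bernoulli trials with n = 100, p = 0.5
exact = sum(comb(n, k) * p**k * q**(n - k) for k in range(45, 56))
Phi = lambda z: 0.5 * (1 + erf(z / sqrt(2)))  # standard normal CDF
approx = Phi((55.5 - n*p) / sqrt(n*p*q)) - Phi((44.5 - n*p) / sqrt(n*p*q))
print(round(exact, 3), round(approx, 3))  # both print about 0.729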
CENTRAL LIMIT THEOREM – GENERAL FORM p.343
Where Sn is the sum of n discrete random variables, and we assume that the deviation of this sum approaches infinity, sn → ∞:
$\lim_{n \to \infty} P\left(a < \frac{S_n - m_n}{s_n} < b\right) = \frac{1}{\sqrt{2\pi}} \int_a^b e^{-x^2/2}\,dx$
mn = the mean of Sn
sn = the deviation of Sn (square root of the variance)
a = lower bound
b = upper bound
n = number of trials or selections
For n large:
$P(S_n = j) \approx \frac{\phi(x_j)}{\sqrt{n\sigma^2}} = \frac{1}{\sqrt{2\pi n \sigma^2}}\, e^{-(j - n\mu)^2 / 2n\sigma^2} \quad \text{where } x_j = \frac{j - n\mu}{\sqrt{n\sigma^2}}$
φ(x) = standard normal density
n = number of trials or selections
p = probability of success
µ = the expected value of X, E(X)
σ² = the variance

g(t) MOMENT GENERATING FUNCTION
Continuous: $g(t) = E(e^{tX}) = \int_{-\infty}^{+\infty} e^{tx} f_X(x)\,dx$
Uniform density: $g(t) = E(e^{tX}) = \frac{1}{b - a} \int_a^b e^{tx}\,dx$
Note that the limits of integration are the range of the random variable X and are not necessarily infinite. Moments may also be calculated directly; see the next box.
t = just some variable we need in order to have a generating function
j = a counting variable (integer) for the dummy variable x
x = dummy variable, I think
fX(x) = density function for the random variable X
µn MOMENTS
Discrete: $\mu_n = \sum_{j=1}^{\infty} (x_j)^n\, P(X = x_j)$
Continuous: $\mu_n = E(X^n) = \int_{-\infty}^{+\infty} x^n f_X(x)\,dx$
The moments are also the derivatives of the generating function evaluated at t = 0:
Mean: $\mu = \mu_1$ for t = 0
Variance: $\sigma^2 = \mu_2 - \mu_1^2$ for t = 0
Sanity check: $\mu_0 = 1$ for t = 0
t = just some variable we need in order to have a generating function
n = a counting variable (integer) for the moments, where n = 1 for the 1st moment, n = 2 for the 2nd moment, etc.
j = a counting variable (integer) for the dummy variable x
x = dummy variable, I think
X = random variable: the observation of an experimental outcome

SPECIFIC GENERATING FUNCTIONS
Binomial distribution for all j:
$p_X(j) = \binom{n}{j} p^j q^{n-j} \qquad g(t) = (pe^t + q)^n \qquad \mu = np \qquad \sigma^2 = np(1 - p)$
Geometric distribution for all j:
$p_X(j) = q^{j-1} p \qquad g(t) = \frac{pe^t}{1 - qe^t} \qquad \mu = 1/p \qquad \sigma^2 = q/p^2$
Poisson distribution with mean λ for all j:
$p_X(j) = \frac{e^{-\lambda} \lambda^j}{j!} \qquad g(t) = e^{\lambda(e^t - 1)} \qquad \mu = \lambda \qquad \sigma^2 = \lambda$
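A numerical check (mine, not the sheet's) that the first derivative of the binomial generating function at t = 0 gives the mean np, using a central-difference approximation:

from math import exp

n, p = 10, 0.3
q = 1 - p
g = lambda t: (p * exp(t) + q) ** n  # binomial generating function
h = 1e-6
mu1 = (g(h) - g(-h)) / (2 * h)       # numerical g'(0)
print(round(mu1, 4))  # 3.0 = np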
PROPERTIES OF GENERATING FUNCTIONS
For Y = X + a:  $g_Y(t) = E(e^{tY}) = E(e^{t(X+a)}) = e^{ta} E(e^{tX}) = e^{ta} g_X(t)$
For Y = bX:  $g_Y(t) = E(e^{tY}) = E(e^{tbX}) = g_X(bt)$
For Y = bX + a:  $g_Y(t) = E(e^{tY}) = E(e^{t(bX+a)}) = e^{ta} E(e^{tbX}) = e^{ta} g_X(bt)$
For $X^* = \frac{X - \mu}{\sigma}$:  $g_{X^*}(t) = e^{-\mu t/\sigma}\, g_X\!\left(\frac{t}{\sigma}\right)$
For Z = X + Y, with X and Y independent:  $g_Z(t) = E(e^{tZ}) = E(e^{t(X+Y)}) = E(e^{tX})\,E(e^{tY}) = g_X(t)\,g_Y(t)$

MARKOV CHAINS
A Markov chain is composed of various states with defined paths of movement between states and associated probabilities of movement along these paths. Permissible paths from one state to another are called steps.
For example, let's say in the Land of Oz, there are never 2 nice days in a row. When they have a nice day, the following day will be rain or snow with equal probability. When they have snow or rain, there is a 50% chance that the following day will be the same and an equal chance of the other two possibilities.
[A state diagram appeared here: RAIN, NICE, and SNOW, with the transition probabilities listed in the transition matrix below.]
TRANSITION MATRIX p.406
The transition matrix P gives the probability of moving from each state to each other state in one step. For the Land of Oz example, with states ordered rain (1), nice (2), snow (3):

        R     N     S
  R [ 1/2   1/4   1/4 ]
  N [ 1/2    0    1/2 ]
  S [ 1/4   1/4   1/2 ]

The first row gives the probabilities for the weather following a rainy day, etc. Notice that the rows each sum to 1 but the columns do not. We can use the terminology p12 to mean the probability of having a nice day (2) after a rainy day (1). We can read the result from element p12 of the matrix.

p(j) COEFFICIENTS OF THE ORDINARY GENERATING FUNCTION
$p(j) = \frac{h^{(j)}(0)}{j!}$
For example, if $h(z) = \frac{1}{4} + \frac{1}{2}z + \frac{1}{4}z^2$, then p has values of {1/4, 1/2, 1/4}.
z = just some variable we need in order to have a generating function
j = a counting variable (integer) for the dummy variable z
h(z) = ordinary generating function

MATRIX POWERS p.406
The above P-matrix raised to the second power gives us 2nd-day probabilities, raised to a power of 3 gives us 3rd-day probabilities, etc.
$P^2 = \begin{pmatrix} .438 & .188 & .375 \\ .375 & .250 & .375 \\ .375 & .188 & .438 \end{pmatrix}$
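The same matrix power computed in Python with numpy (my own check, not from the sheet):

import numpy as np

# Land of Oz transition matrix (rows/columns ordered rain, nice, snow)
P = np.array([[0.5, 0.25, 0.25],
              [0.5, 0.0, 0.5],
              [0.25, 0.25, 0.5]])
print(np.linalg.matrix_power(P, 2).round(3))
# The first row rounds to [0.438 0.188 0.375], the 2nd-day probabilities above.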
ABSORBING MARKOV CHAINS
A state that is not absorbing (states 2 and 3 in this example) is called a transient state. The P-matrix for the example above is

       1    2    3    4
  1 [  1    0    0    0 ]
  2 [ .6    0   .4    0 ]
  3 [  0   .6    0   .4 ]
  4 [  0    0    0    1 ]

From our example P-matrix we have the fundamental matrix
$N = (I - Q)^{-1} = \left( \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} - \begin{pmatrix} 0 & .4 \\ .6 & 0 \end{pmatrix} \right)^{-1} = \begin{pmatrix} 1.32 & .526 \\ .789 & 1.32 \end{pmatrix}$
(rows and columns indexed by the transient states 2 and 3)
Q = a submatrix extracted from the P-matrix canonical form and used to obtain the fundamental matrix
I = an identity matrix

TIME TO ABSORPTION p.419
t = Nc, where c is a column vector whose entries are all 1; entry ti is the expected number of steps before absorption when the chain starts in transient state i.
For an ergodic chain, the fundamental matrix is
$Z = (I - P + W)^{-1}$
where W is the fixed probability matrix, whose rows are each the fixed probability vector w.
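Both quantities computed for the example chain in Python with numpy (my own check):

import numpy as np

# Transient part Q of the example P-matrix (states 2 and 3)
Q = np.array([[0.0, 0.4],
              [0.6, 0.0]])
N = np.linalg.inv(np.eye(2) - Q)  # fundamental matrix
t = N @ np.ones(2)                # expected steps to absorption
print(N.round(3))  # [[1.316 0.526] [0.789 1.316]]
print(t.round(3))  # [1.842 2.105]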
BIRTHDAY PROBLEM
$P(\text{some share a birthday}) = 1 - \frac{365 \cdot 364 \cdots (365 - r + 1)}{365^r}$
r = the number of people in the group
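The formula evaluated in Python for the classic case of 23 people (my own check):

from math import prod

def p_shared(r):
    # P(at least two of r people share a birthday)
    return 1 - prod(365 - i for i in range(r)) / 365**r

print(round(p_shared(23), 3))  # 0.507, already better than even odds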
INTEGRATION
$\int u'\,e^u\,dx = e^u + C \qquad \int x e^x\,dx = (x - 1)e^x + C$
$\int x e^{ax}\,dx = \frac{e^{ax}}{a^2}(ax - 1) + C$
$\int_0^\infty x^n e^{-ax^2}\,dx = \frac{\left(\frac{n-1}{2}\right)!}{2a^{(n+1)/2}} \quad \text{for odd } n$
$\int_0^\infty x^n e^{-ax^2}\,dx = \frac{1 \cdot 3 \cdot 5 \cdots (n-1)}{2^{(n/2)+1}\, a^{n/2}} \sqrt{\frac{\pi}{a}} \quad \text{for even } n$
$\int \frac{1}{x}\,dx = \ln x + C \qquad \int a^x\,dx = \frac{a^x}{\ln a} + C$

TRIGONOMETRIC IDENTITIES
$e^{+j\theta} + e^{-j\theta} = 2\cos\theta$
$e^{+j\theta} - e^{-j\theta} = j2\sin\theta$
$e^{\pm j\theta} = \cos\theta \pm j\sin\theta$

LOGARITHMS
$\ln x = b \leftrightarrow e^b = x \qquad \ln x^y = y \ln x$
$\ln e^x = x \qquad e^{a \ln b} = b^a$
$\log_a x = y \leftrightarrow a^y = x$

e^x INFINITE SUM
$e^x = \sum_{k=0}^{\infty} \frac{x^k}{k!}$
CALCULUS - DERIVATIVES
$\frac{d}{dx}\frac{u}{v} = \frac{v \cdot u' - u \cdot v'}{v^2}$
$\frac{d}{dx} a^x = a^x \ln a \qquad \frac{d}{dx} a^u = u' \cdot a^u \ln a$
$\frac{d}{dx} e^u = u' \cdot e^u$
$\frac{d}{dx} \ln x = \frac{1}{x} \qquad \frac{d}{dx} \ln u = \frac{u'}{u}$

e^−x ANOTHER e THING
As $n \to \infty$, $\left(1 - \frac{x}{n}\right)^n \to e^{-x}$
SERIES
$1 + 2 + 3 + \cdots + n = \frac{n(n+1)}{2}$
$1^2 + 2^2 + 3^2 + \cdots + n^2 = \frac{n(n+1)(2n+1)}{6}$

GLOSSARY
median: A value of a random variable for which all greater values make the distribution function greater than one half and all lesser values make it less than one half. Or, a value in an ordered set of values below and above which there is an equal number of values.
random variable: A variable representing the outcome of a particular experiment. For example, the random variable X1 might represent the outcome of two coin tosses. Its value could be HT or HH, etc.
stochastic: Random; involving a random variable; involving chance or probability.
uniform distribution: The probabilities of all outcomes are equal. If the sample space contains n discrete outcomes numbered 1 through n, then the uniform distribution function is m(ω) = 1/n.

BINOMIAL THEOREM
Also called binomial expansion. When m is a positive integer, this is a finite series of m+1 terms. When m is not a positive integer, the series converges for −1 < x < 1.
$(1 + x)^m = 1 + mx + \frac{m(m-1)}{2!}x^2 + \cdots + \frac{m(m-1)(m-2)\cdots(m-n+1)}{n!}x^n + \cdots$
QUADRATIC EQUATION
Given the equation $ax^2 + bx + c = 0$,
$x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$
LINEARIZING AN EQUATION
Small nonlinear terms are removed. Nonlinear terms
include:
• variables raised to a power
• variables multiplied by other variables
∆ values are considered variables, e.g. ∆t.