
ST102: Text for the gaps in lecture slides

Version: December 13, 2012


This document contains the text that has been omitted from the lecture slides and filled in
during the lectures. The slides are referred to by page number.
This document is updated after each section of material. The date of the latest update is under
the title above.
Slide 34: Statistical analysis may have two broad aims:
1. Descriptive statistics: Summarise the data that were collected, in order to make it
more understandable.
2. Statistical inference: Use the observed data to draw conclusions about some broader
population.
Slide 52:
Such a distribution is positively skewed (right-skewed).
A distribution with a longer left tail (i.e. toward small values) is negatively skewed (left-skewed).
Slide 55: Next we consider descriptive statistics which summarise one feature of the sample
distribution in a single number: summary statistics.
Slide 57: The sum of the numbers is written as
\[ \sum_{i=1}^{n} X_i = X_1 + X_2 + \cdots + X_n \]
Slide 58: Here a denotes a constant, i.e. a number with the same value for all i.
Proof:
\[ \sum_i aX_i = (aX_1 + \cdots + aX_n) = a(X_1 + \cdots + X_n) = a \sum_i X_i \]
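Slide 61: Analogous notation for the product of a set of numbers is
\[ \prod_{i=1}^{n} X_i = X_1 \cdot X_2 \cdots X_n \]
1. $\prod_{i=1}^{n} aX_i = a^n \left( \prod_{i=1}^{n} X_i \right)$
2. $\prod_{i=1}^{n} a = a^n$
3. $\prod_{i=1}^{n} X_i Y_i = \left( \prod_{i=1}^{n} X_i \right) \left( \prod_{i=1}^{n} Y_i \right)$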
Slide 62:
\[ \bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} \]
Slide 66:
Deviations from the mean ($\bar{X} = 4$) and from the median ($= 3$):

  i     X_i    X_i - Xbar   (X_i - Xbar)^2    X_i - 3   (X_i - 3)^2
  1      1        -3              9             -2           4
  2      2        -2              4             -1           1
  3      3        -1              1              0           0
  4      5        +1              1             +2           4
  5      9        +5             25             +6          36
  Sum   20         0             40             +5          45

$\bar{X} = 4$
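Slide 67:
\[ \sum_{i=1}^{n} (X_i - \bar{X}) = \sum_{i=1}^{n} X_i - \sum_{i=1}^{n} \bar{X} = \sum_{i=1}^{n} X_i - n\bar{X} = \sum_{i=1}^{n} X_i - n \, \frac{\sum_{i=1}^{n} X_i}{n} = \sum_{i=1}^{n} X_i - \sum_{i=1}^{n} X_i = 0 \]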
Slide 73: In general the mean is affected much more than the median by outliers, i.e. unusually
large or small observations.
Slide 78: ...but they are clearly not the same. In one (red) the values have more dispersion
(variation) than in the other.
Slide 79: The first measures of dispersion, sample variance and its square root, sample
standard deviation, are based on $(X_i - \bar{X})^2$, the squared deviations from the mean.
Slide 80:
\[ S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1} \]
Slide 81:
about 2/3 of the observations are between $\bar{X} - S$ and $\bar{X} + S$
about 95% of the observations are between $\bar{X} - 2S$ and $\bar{X} + 2S$.
Slide 82: The sum of squares in $S^2$ can also be expressed as
\[ \sum_{i=1}^{n} (X_i - \bar{X})^2 = \sum_{i=1}^{n} X_i^2 - n\bar{X}^2 \]
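The summary statistics on slides 62 to 82 can be checked numerically. The following is a minimal Python sketch that uses the five observations from the deviations table on slide 66 (1, 2, 3, 5, 9); the variable names are illustrative only and do not come from the slides.

```python
# Observations from the deviations table on slide 66.
x = [1, 2, 3, 5, 9]
n = len(x)

mean = sum(x) / n                        # sample mean
deviations = [xi - mean for xi in x]     # deviations from the mean
sum_dev = sum(deviations)                # deviations from the mean sum to zero
ss = sum(d ** 2 for d in deviations)     # sum of squared deviations
ss_shortcut = sum(xi ** 2 for xi in x) - n * mean ** 2  # alternative expression for the same sum
variance = ss / (n - 1)                  # sample variance S^2
sd = variance ** 0.5                     # sample standard deviation S

print(mean, sum_dev, ss, ss_shortcut, variance, sd)
```

Both ways of computing the sum of squares give 40, which matches the "Sum" row of the table.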
Slide 109: In the population, statistical inference will allow us to say things like...
Slide 110:
Each response $X_i$ is a realisation of a random variable from a Bernoulli distribution
with probability parameter $\pi$
The responses $X_1, X_2, \ldots, X_n$ are independent of each other
The sampling distribution of the sample mean (proportion) $\bar{X}$ has expected value $\pi$
and variance $\pi(1 - \pi)/n$
Thanks to the Central Limit Theorem, the sampling distribution is approximately a
normal distribution
Slide 111: Probability theory is the branch of mathematics that deals with randomness.
So we need to learn it first.
Slide 113: This uses the language and concepts of set theory. So we need to learn the basics
of that first.
Slide 114: A set is a collection of elements (members of the set)
Slide 120:
\[ \bigcup_{i=1}^{n} A_i = A_1 \cup A_2 \cup \cdots \cup A_n \]
\[ \bigcap_{i=1}^{n} A_i = A_1 \cap A_2 \cap \cdots \cap A_n \]
Slide 124: Two sets $A$ and $B$ are disjoint or mutually exclusive if $A \cap B = \emptyset$.
Slide 125: Proof:
\[ A \cap (B \cap A^c) = (A \cap A^c) \cap B = \emptyset \cap B = \emptyset \]
and
\[ A \cup (B \cap A^c) = (A \cup B) \cap (A \cup A^c) = B \cap S = B \]
Slide 128: Such a function is a probability function if it satisfies the following axioms
(self-evident truths)
If events $A_1, A_2, \ldots$ are pairwise disjoint (i.e. $A_i \cap A_j = \emptyset$ for all $i \ne j$), then
\[ P\left( \bigcup_{i=1}^{\infty} A_i \right) = \sum_{i=1}^{\infty} P(A_i) \]
Slide 131: ...is simply the sum of the probabilities of the individual events:
\[ P(A) = P(A_1) + P(A_2) + P(A_3) \]
Slide 134: Thus
\[ P(B) = P(A \cup (B \cap A^c)) = P(A) + P(B \cap A^c) \ge P(A) \]
since $P(B \cap A^c) \ge 0$.
Slide 151: The number of possible ordered sequences of $k$ objects selected with replacement
from $n$ objects is thus
\[ \underbrace{n \times n \times \cdots \times n}_{k \text{ times}} = n^k \]
Slide 153: Using factorials, the number on the last page is written as
\[ n \times (n - 1) \times \cdots \times (n - k + 1) = \frac{n!}{(n - k)!} \]
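The two counting results on slides 151 and 153 can be confirmed by brute-force enumeration. This is a small sketch using Python's `itertools`; the object labels and the choice n = 4, k = 2 are hypothetical and serve only as an illustration.

```python
from itertools import permutations, product
from math import factorial

objects = ["a", "b", "c", "d"]   # hypothetical set of n = 4 objects
n, k = len(objects), 2

# Ordered sequences of k objects selected WITH replacement: n^k of them.
with_replacement = list(product(objects, repeat=k))

# Ordered sequences of k objects selected WITHOUT replacement: n!/(n-k)! of them.
without_replacement = list(permutations(objects, k))

print(len(with_replacement), n ** k)
print(len(without_replacement), factorial(n) // factorial(n - k))
```

Enumeration is only practical for small n and k, which is exactly why the closed-form counts are useful.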
Slide 162: ...the total number of possible selections is
\[ m_1 + m_2 + \cdots + m_K \]
...the total number of possible sequences is
\[ m_1 \times m_2 \times \cdots \times m_K \]
Slide 169: Two events $A$ and $B$ are (statistically) independent if
\[ P(A \cap B) = P(A) \, P(B) \]
Slide 172: This implies the important result that if events $A_1, A_2, \ldots, A_n$ are independent,
then
\[ P(A_1 \cap A_2 \cap \cdots \cap A_n) = P(A_1) \, P(A_2) \cdots P(A_n) \]
Slide 176: The answer is given by the conditional probability of $A$ given that $B$ has occurred,
or the conditional probability of $A$ given $B$ for short:
\[ P(A|B) = \frac{P(A \cap B)}{P(B)} \]
Slide 187: Add up the probabilities of the paths:
\[ P(A) = \sum_{i=1}^{K} P(A \cap B_i) = \sum_{i=1}^{K} P(A|B_i) \, P(B_i) \]
Slide 191: So we can write
\[ P(B_j|A) = \frac{P(A|B_j) \, P(B_j)}{\sum_{i=1}^{K} P(A|B_i) \, P(B_i)} \]
Slide 195: Suppose that you first choose Box 1, and then Monty opens Box 3. Bayes' theorem
tells us that
\[ P(B_2|M_3) = \frac{P(M_3|B_2) \, P(B_2)}{P(M_3|B_1) \, P(B_1) + P(M_3|B_2) \, P(B_2) + P(M_3|B_3) \, P(B_3)} \]
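The Bayes' theorem calculation for the Monty Hall problem can be evaluated exactly with fractions. The sketch below is not taken from the slides: it assumes the usual version of the game, in which the prize is equally likely to be in each box, Monty never opens the box you chose or the box with the prize, and he picks at random when he has a choice. Under those assumptions $P(M_3|B_1) = 1/2$, $P(M_3|B_2) = 1$ and $P(M_3|B_3) = 0$.

```python
from fractions import Fraction

# Prior probabilities P(B_i): prize equally likely to be in each box (assumption).
prior = {1: Fraction(1, 3), 2: Fraction(1, 3), 3: Fraction(1, 3)}

# Likelihoods P(M_3 | B_i) when you chose Box 1 (assumed rules, see the text above).
likelihood = {1: Fraction(1, 2), 2: Fraction(1), 3: Fraction(0)}

# Total probability of Monty opening Box 3 (denominator of Bayes' theorem).
total = sum(likelihood[b] * prior[b] for b in prior)

# Posterior probabilities P(B_i | M_3).
posterior = {b: likelihood[b] * prior[b] / total for b in prior}

print(posterior[1], posterior[2], posterior[3])
```

Under these assumptions the posterior probability that the prize is in Box 2 is 2/3, so switching doubles the chance of winning compared with staying.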
Slide 199: This kind of probability is known as the prior probability of an event A.
Slide 207: Previously, we considered descriptive statistics for a sample of observations of a
variable $X$:
\[ X_1, X_2, \ldots, X_n \]
Slide 208: Outcomes are numbers in this sample space. Instead of outcomes, we often call
them the values of the r.v..
Slide 209:
An r.v. is continuous if $S$ is all of $\mathbb{R}$ or some interval(s) of it, e.g. $[0, 1]$ or $(0, \infty)$.
An r.v. is discrete if it is not continuous (examples below).
Slide 211: ...and quantities for an observed sample are then used as estimators of the
analogous quantities for the population (r.v.).
Slide 213: The probability distribution (or just distribution) of a discrete random variable
$X$ is specified by...
Slide 215: Alternative terminology: the pf of a discrete r.v. is also often called the probability
mass function (pmf).
Slide 216:
1. $p(x) \ge 0$ for all real numbers $x$
2. $\sum_{x_i \in S} p(x_i) = 1$, i.e. the sum of the probabilities of the possible values of $X$ is 1
Slide 217: These are clearly all non-negative, and their sum is $\sum_{x=1}^{8} p(x) = 1$.
Slide 220: Then the probability that the first success occurs after $x$ failures is the probability
of a sequence of $x$ failures followed by a success, i.e.
\[ (1 - \pi)^x \, \pi \]
So the pf of random variable $X$ (number of failures before first success) is
\[ p(x) = \begin{cases} (1 - \pi)^x \, \pi & \text{for } x = 0, 1, 2, \ldots \\ 0 & \text{otherwise} \end{cases} \]
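Slide 221: Using the sum of the geometric series, we get
\[ \sum_{x=0}^{\infty} p(x) = \sum_{x=0}^{\infty} (1 - \pi)^x \, \pi = \pi \sum_{x=0}^{\infty} (1 - \pi)^x = \pi \, \frac{1}{1 - (1 - \pi)} = \frac{\pi}{\pi} = 1 \]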
Slide 222: The expression of the pf on the previous page involves a parameter $\pi$ (probability
of a successful throw), a number for which we can choose different values.
Slide 224: The cdf is denoted $F(x)$ (or $F_X(x)$) and defined as
\[ F(x) = P(X \le x) \quad \text{for all real numbers } x. \]
For a discrete r.v. it is simply
\[ F(x) = \sum_{x_i \in S, \; x_i \le x} p(x_i) \]
Slide 227: we can write
\[ F(x) = \begin{cases} 0 & \text{when } x < 0 \\ 1 - (1 - \pi)^{y+1} & \text{when } y \le x < y + 1, \text{ for } y = 0, 1, 2, \ldots \end{cases} \]
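As a numerical check of slides 220 to 227, the sketch below codes the pf of the number of failures before the first success and compares a partial sum of it with the closed-form cdf. The value 0.25 for the success probability and the function names are hypothetical choices made for this illustration.

```python
pi = 0.25  # hypothetical success probability

def pf(x):
    """P(X = x): x failures followed by one success."""
    return (1 - pi) ** x * pi if x >= 0 else 0.0

def cdf(y):
    """Closed-form P(X <= y) for a non-negative integer y."""
    return 1 - (1 - pi) ** (y + 1)

# Summing the pf up to 5 should reproduce the closed-form cdf at 5.
partial_sum = sum(pf(x) for x in range(6))

# Summing far enough out should give (numerically) a total probability of 1.
total = sum(pf(x) for x in range(1000))

print(partial_sum, cdf(5), total)
```

The agreement of `partial_sum` with `cdf(5)` is the geometric series result applied to a finite number of terms.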
Slide 232: The expected value (or mean) of $X$ is denoted $E(X)$, and defined as
\[ E(X) = \sum_{x_i \in S} x_i \, p(x_i) \]
Slide 235: from which we can solve
\[ E(X) = \frac{1 - \pi}{1 - (1 - \pi)} = \frac{1 - \pi}{\pi} \]
Slide 238: The expected value is thus
\[ E(X) = \left[ -1 \times \frac{18}{37} \right] + \left[ +1 \times \frac{19}{37} \right] = +0.027 \]
Slide 240: This is also an r.v., and its expected value is
\[ E[g(X)] = \sum_{x} g(x) \, p_X(x) \]
In general,
\[ E[g(X)] \ne g[E(X)] \]
Slide 241: Then
\[ E(aX + b) = a \, E(X) + b \]
Slide 242: is obtained when a = 0:
E(b) = b
the expected value of a constant is the constant itself.
Slide 243: The variance of a discrete r.v. $X$ is defined as
\[ \mathrm{Var}(X) = E\{[X - E(X)]^2\} = \sum_{x} [x - E(X)]^2 \, p(x) \]
An alternative formula: The variance can also be calculated as
\[ \mathrm{Var}(X) = E(X^2) - [E(X)]^2 \]
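The definitions of expected value and variance can be applied directly to a small distribution. The sketch below uses the two-valued variable from slide 238 (value −1 with probability 18/37 and +1 with probability 19/37) and computes the variance with both formulas from slide 243; exact fractions are used so that the two results can be compared without rounding error. The variable names are illustrative only.

```python
from fractions import Fraction

values = [-1, 1]
probs = [Fraction(18, 37), Fraction(19, 37)]

# E(X) = sum of x * p(x)
mean = sum(x * p for x, p in zip(values, probs))

# Var(X) from the definition: sum of [x - E(X)]^2 * p(x)
var_definition = sum((x - mean) ** 2 * p for x, p in zip(values, probs))

# Var(X) from the alternative formula: E(X^2) - [E(X)]^2
mean_of_square = sum(x ** 2 * p for x, p in zip(values, probs))
var_alternative = mean_of_square - mean ** 2

print(mean, round(float(mean), 3), var_definition, var_alternative)
```

The mean is 1/37, which rounds to the +0.027 quoted on slide 238, and the two variance formulas agree exactly.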
Slide 246: If a = 0, this gives
Var(b) = 0
the variance of a constant is 0. The converse also holds: if a random variable has
variance 0, it is actually a constant.
Slide 249: ...and the probability function
\[ p(x) = \begin{cases} \binom{n}{x} \pi^x (1 - \pi)^{n-x} & \text{for } x = 0, 1, 2, \ldots, n \\ 0 & \text{otherwise} \end{cases} \]
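Slide 253:
\[
\begin{aligned}
E(X) &= \sum_{x=0}^{n} x \binom{n}{x} \pi^x (1 - \pi)^{n-x} \\
     &= \sum_{x=1}^{n} x \binom{n}{x} \pi^x (1 - \pi)^{n-x} \\
     &= \sum_{x=1}^{n} \frac{n (n-1)!}{(x-1)! \, [(n-1) - (x-1)]!} \, \pi \, \pi^{x-1} (1 - \pi)^{n-x} \\
     &= n\pi \sum_{x=1}^{n} \binom{n-1}{x-1} \pi^{x-1} (1 - \pi)^{n-x} \\
     &= n\pi \sum_{y=0}^{n-1} \binom{n-1}{y} \pi^{y} (1 - \pi)^{(n-1)-y} \\
     &= n\pi \times 1 = n\pi
\end{aligned}
\]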
Slide 254: The moment generating function (mgf) of a discrete random variable $X$ is
defined as
\[ M_X(t) = E(e^{tX}) = \sum_{x} e^{tx} \, p(x) \]
Slide 255: ... using the following results:
\[ M_X'(0) = E(X) \quad \text{and} \quad M_X''(0) = E(X^2) \]
(Other moments about 0 are obtained from the mgf similarly:
$M_X^{(r)}(0) = E(X^r)$, for $r = 1, 2, 3, \ldots$)
Slide 259: and thus
\[ M_X'(0) = \lambda = E(X) \]
\[ M_X''(0) = \lambda(1 + \lambda) = E(X^2) \]
and
\[ \mathrm{Var}(X) = E(X^2) - [E(X)]^2 = \lambda(1 + \lambda) - \lambda^2 = \lambda \]
Slide 263: For a continuous r.v. $X$, the probability function is replaced by the probability
density function (pdf), denoted $f(x)$ [or $f_X(x)$].
Slide 265: In fact, for a continuous distribution
P(X = x) = 0 for all x
Slide 266: Integrals of the pdf give probabilities of intervals of values:
\[ P(a < X \le b) = \int_a^b f(x) \, dx \]
for any two numbers $a < b$.
Slide 268: The pdf $f(x)$ of any continuous r.v. must satisfy these conditions:
1. $f(x) \ge 0$ for all $x$
2. $\int_{-\infty}^{\infty} f(x) \, dx = 1$
Slide 270: The cumulative distribution function (cdf) of a continuous r.v. $X$ is defined
exactly as for discrete random variables, i.e.
\[ F(x) = P(X \le x) \quad \text{for all real numbers } x \]
Slide 271: The cdf is obtained from the pdf through integration:
\[ P(X \le x) = F(x) = \int_{-\infty}^{x} f(t) \, dt \quad \text{for all } x \]
...and the pdf from the cdf through differentiation:
\[ f(x) = F'(x) \]
Slide 274: In general, for any two numbers $a < b$,
\[ P(a < X \le b) = \int_a^b f(x) \, dx = F(b) - F(a) \]
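Slide 279:
\[ E(X) = \int_{-\infty}^{\infty} x \, f(x) \, dx \]
\[ E[g(X)] = \int_{-\infty}^{\infty} g(x) \, f(x) \, dx \]
\[ \mathrm{Var}(X) = E\{[X - E(X)]^2\} = \int_{-\infty}^{\infty} [x - E(X)]^2 f(x) \, dx = E(X^2) - [E(X)]^2 \]
\[ \mathrm{sd}(X) = \sqrt{\mathrm{Var}(X)} \]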
Slide 285: Then
\[ \mathrm{Var}(X) = E(X^2) - [E(X)]^2 = \frac{2}{\lambda^2} - \frac{1}{\lambda^2} = \frac{1}{\lambda^2}. \]
Slide 286: The moment generating function (mgf) of a continuous r.v. $X$ is defined as for
discrete variables, with summation replaced by integration:
\[ M_X(t) = E(e^{tX}) = \int_{-\infty}^{\infty} e^{tx} f(x) \, dx \]
Slide 288: The median of a continuous random variable $X$ is the value $m$ which satisfies
\[ F(m) = 0.5 \]
Slide 304: Recall (see p. 207) that in statistical inference we will treat observations
\[ X_1, X_2, \ldots, X_n \]
Slide 306: In the narrower sense, individual distributions within a family differ in having
different values of the parameters of the distribution.
...use observed data to choose (estimate) values for the parameters of that distribution, and
perform statistical inference on them.
Slide 310: $X$ has a discrete uniform distribution if all these values have the same probability,
i.e.
\[ p(x) = P(X = x) = \frac{1}{K} \quad \text{for all } x = 1, 2, \ldots, K. \]
Slide 311: so
\[ \mathrm{Var}(X) = E(X^2) - [E(X)]^2 = \frac{K^2 - 1}{12} \]
Slide 313: This is the distribution of a random variable $X$ with the probability function
\[ p(x) = \begin{cases} \pi & \text{for } x = 1 \\ 1 - \pi & \text{for } x = 0 \\ 0 & \text{otherwise} \end{cases} \]
Such a random variable $X$ has a Bernoulli distribution with (probability) parameter $\pi$. This
is often written as
\[ X \sim \mathrm{Bernoulli}(\pi) \]
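Slide 314:
\[ E(X) = \sum_{x=0}^{1} x \, p(x) = 0 \times (1 - \pi) + 1 \times \pi = \pi \]
\[ E(X^2) = \sum_{x=0}^{1} x^2 \, p(x) = 0^2 \times (1 - \pi) + 1^2 \times \pi = \pi \]
\[ \mathrm{Var}(X) = E(X^2) - [E(X)]^2 = \pi - \pi^2 = \pi(1 - \pi) \]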
Slide 315: This is often written as
\[ X \sim \mathrm{Bin}(n, \pi) \]
Slide 317: Thus the probability of obtaining three 1s is
\[ \binom{4}{3} \pi^3 (1 - \pi)^1 = 4 \times 0.25^3 \times 0.75^1 \approx 0.047 \]
Slide 319:
\[ E(X) = n\pi \]
\[ \mathrm{Var}(X) = n\pi(1 - \pi) \]
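The binomial results can be reproduced with a few lines of Python. The sketch below codes the binomial probability function with `math.comb` and uses the values from slide 317 (n = 4 and success probability 0.25); the function name `binom_pf` is an illustrative choice, not notation from the slides.

```python
from math import comb

def binom_pf(x, n, pi):
    """Binomial probability function p(x) for n trials with success probability pi."""
    if 0 <= x <= n:
        return comb(n, x) * pi ** x * (1 - pi) ** (n - x)
    return 0.0

n, pi = 4, 0.25
probs = [binom_pf(x, n, pi) for x in range(n + 1)]

p_three = probs[3]                                    # probability of three successes
mean = sum(x * p for x, p in enumerate(probs))        # E(X)
variance = sum(x ** 2 * p for x, p in enumerate(probs)) - mean ** 2  # E(X^2) - [E(X)]^2

print(p_three, sum(probs), mean, variance)
```

The probability of three successes is 0.046875, and the mean and variance computed from the pf equal 4 × 0.25 = 1 and 4 × 0.25 × 0.75 = 0.75.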
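Slide 323: The probability function of the Poisson distribution is
\[ p(x) = \begin{cases} \dfrac{e^{-\lambda} \lambda^x}{x!} & \text{for } x = 0, 1, 2, \ldots \\ 0 & \text{otherwise} \end{cases} \]
If a random variable $X$ has a Poisson distribution with parameter $\lambda$, this is often denoted by
\[ X \sim \mathrm{Poisson}(\lambda) \]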
Slide 324:
\[ E(X) = \lambda \]
\[ \mathrm{Var}(X) = \lambda \]
Slide 329: For $X \sim \mathrm{Poisson}(1.6)$, the probability $P(X = 0)$ is
\[ p_X(0) = \frac{e^{-\lambda} \lambda^0}{0!} = \frac{e^{-1.6} \, 1.6^0}{0!} = e^{-1.6} = 0.20 \]
Slide 331: For $Y \sim \mathrm{Poisson}(8)$, the probability $P(Y \le 1)$ is
\[ p_Y(0) + p_Y(1) = \frac{e^{-8} \, 8^0}{0!} + \frac{e^{-8} \, 8^1}{1!} = 0.000335 + 0.002684 = 0.003 \]
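The two Poisson calculations on slides 329 and 331 are easy to verify. The following sketch evaluates the Poisson probability function with the standard library only; `poisson_pf` is an illustrative name.

```python
from math import exp, factorial

def poisson_pf(x, lam):
    """Poisson probability function p(x) with parameter lam, for x = 0, 1, 2, ..."""
    return exp(-lam) * lam ** x / factorial(x)

# Slide 329: P(X = 0) for X ~ Poisson(1.6)
p_x0 = poisson_pf(0, 1.6)

# Slide 331: P(Y <= 1) = p(0) + p(1) for Y ~ Poisson(8)
p_y_le_1 = poisson_pf(0, 8) + poisson_pf(1, 8)

print(round(p_x0, 2), round(p_y_le_1, 3))
```

Rounded as on the slides, the two probabilities are 0.20 and 0.003.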
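Slide 342: The probability density function is
\[ f(x) = \begin{cases} \dfrac{1}{b - a} & \text{for } a \le x \le b \\ 0 & \text{otherwise.} \end{cases} \]
The pdf is flat, as shown below. Clearly $f(x) \ge 0$ for all $x$, and
\[ \int_{-\infty}^{\infty} f(x) \, dx = \int_a^b \frac{1}{b - a} \, dx = \frac{1}{b - a} (b - a) = 1. \]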
Slide 343: The probability of an interval $[x_1, x_2]$, where $a \le x_1 < x_2 \le b$, is thus
\[ P(x_1 \le X \le x_2) = F(x_2) - F(x_1) = \frac{x_2 - x_1}{b - a} \]
Slide 345:
\[ E(X) = \frac{b + a}{2} = \text{median of } X \]
\[ \mathrm{Var}(X) = \frac{(b - a)^2}{12} \]
Slide 346: ...if its probability density function is
\[ f(x) = \begin{cases} \lambda e^{-\lambda x} & \text{for } x > 0 \\ 0 & \text{otherwise} \end{cases} \]
Slide 348: As shown on p. 276, the cdf of the Exponential($\lambda$) distribution is
\[ F(x) = \begin{cases} 0 & \text{for } x \le 0 \\ 1 - e^{-\lambda x} & \text{for } x > 0 \end{cases} \]
Slide 349:
\[ E(X) = 1/\lambda \]
\[ \mathrm{Var}(X) = 1/\lambda^2 \]
Slide 352:
\[ P(X \le 1) = F(1) = 1 - e^{-1.6 \times 1} = 0.798 \]
Probability is about 0.8 that two arrivals are at most a minute apart.
\[ P(X > 3) = 1 - F(3) = e^{-1.6 \times 3} = 0.008 \]
The probability of a gap of 3 minutes or more between arrivals is very small.
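The two exponential probabilities on slide 352 follow directly from the cdf. The sketch below codes that cdf with a rate of 1.6 per minute, as used on the slide; the function name `exp_cdf` is illustrative.

```python
from math import exp

def exp_cdf(x, lam):
    """cdf of the exponential distribution with rate lam: 0 for x <= 0, else 1 - exp(-lam * x)."""
    return 1 - exp(-lam * x) if x > 0 else 0.0

lam = 1.6  # rate per minute, as on slide 352

p_at_most_1 = exp_cdf(1, lam)        # P(X <= 1)
p_more_than_3 = 1 - exp_cdf(3, lam)  # P(X > 3)

print(round(p_at_most_1, 3), round(p_more_than_3, 3))
```

Rounded to three decimals the results are 0.798 and 0.008, in agreement with the slide.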
Slide 358: The probability density function of the normal distribution is
\[ f(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left[ -\frac{1}{2\sigma^2} (x - \mu)^2 \right] \quad \text{for } -\infty < x < \infty \]
Slide 359:
\[ E(X) = \mu \]
\[ \mathrm{Var}(X) = \sigma^2 \]
Slide 362: Furthermore, if $X$ is normally distributed, then so is $Y$. In other words, if
$X \sim N(\mu, \sigma^2)$, then
\[ Y = aX + b \sim N(a\mu + b, \; a^2\sigma^2) \]
Slide 363: The transformed variable $Z = (X - \mu)/\sigma$ is known as a standardised variable
or a z-score.
Slide 364: In the special case of the standard normal distribution, the cdf is
\[ F(x) = \Phi(x) = \int_{-\infty}^{x} \frac{1}{\sqrt{2\pi}} \exp\left( -\frac{1}{2} t^2 \right) dt \]
This is often denoted $\Phi(x)$.
Slide 369: If $Z \sim N(0, 1)$, for any two numbers $z_1 < z_2$,
\[ P(z_1 < Z \le z_2) = \Phi(z_2) - \Phi(z_1) \]
Reality check: Remember that
\[ \Phi(0) = P(Z \le 0) = 0.5 = P(Z > 0) = 1 - \Phi(0) \]
Slide 370: Consider the 0.2005 marked in the table on p. 366. This is in the 0.8 row and
0.04 column of the table, so it shows that
\[ 1 - \Phi(0.84) = P(Z > 0.84) = 0.2005 \]
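Slide 371: What if we want to calculate, for any $a < b$,
\[ P(a < X \le b) = F_X(b) - F_X(a) \]
\[ P(a < X \le b) = P\left( \frac{a - \mu}{\sigma} < \frac{X - \mu}{\sigma} \le \frac{b - \mu}{\sigma} \right) = P\left( \frac{a - \mu}{\sigma} < Z \le \frac{b - \mu}{\sigma} \right) = \Phi\left( \frac{b - \mu}{\sigma} \right) - \Phi\left( \frac{a - \mu}{\sigma} \right) \]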
Slide 372: So here
\[ \frac{X - 74.2}{11.31} = Z \sim N(0, 1) \]
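Instead of a printed table, the standard normal cdf can be evaluated with the error function in Python's standard library, using the identity that the standard normal cdf at z equals 0.5 × [1 + erf(z/√2)]. The sketch below checks the table entry quoted on slide 370 and standardises an interval using the mean 74.2 and standard deviation 11.31 from slide 372. The function names and the interval endpoints (one standard deviation either side of the mean) are hypothetical choices for illustration.

```python
from math import erf, sqrt

def std_normal_cdf(z):
    """cdf of N(0, 1), computed from the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

def normal_interval_prob(a, b, mu, sigma):
    """P(a < X <= b) for X ~ N(mu, sigma^2), obtained by standardising both endpoints."""
    return std_normal_cdf((b - mu) / sigma) - std_normal_cdf((a - mu) / sigma)

# Table lookup from slide 370: P(Z > 0.84)
upper_tail = 1 - std_normal_cdf(0.84)

# Hypothetical interval: within one standard deviation of the mean
mu, sigma = 74.2, 11.31
p_within_one_sd = normal_interval_prob(mu - sigma, mu + sigma, mu, sigma)

print(round(upper_tail, 4), round(p_within_one_sd, 4))
```

The second probability is about 0.68, consistent with the rule of thumb on slide 81 that about 2/3 of observations lie within one standard deviation of the mean.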
Slide 377: For a given $0 < \pi < 1$, the binomial distribution $\mathrm{Bin}(n, \pi)$ tends to the normal
distribution $N(n\pi, \; n\pi(1 - \pi))$ as $n \to \infty$.
Slide 379: For the best results, use the continuity correction:
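Slide 382: We then need to solve the following, where the relation symbol between the two
sides is missing in this copy:
\[ \frac{x + 0.5 - 361}{\sqrt{230.68}} \quad [\,\cdot\,] \quad 2.33 \]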
Slide 389: The joint probability distribution of a multivariate random variable $X$ is
defined by the possible values $x$, and their probabilities.
Slide 391: ...defined as
\[ p(x_1, x_2, \ldots, x_n) = P(X_1 = x_1, X_2 = x_2, \ldots, X_n = x_n) \]
In the bivariate case, this is
\[ p(x, y) = P(X = x, Y = y) \]
Slide 393: Note that this satisfies the conditions for a probability function:
1. $p(x, y) \ge 0$ for all $(x, y)$
2. $\sum_{x=0}^{3} \sum_{y=0}^{3} p(x, y) = 0.100 + 0.031 + \cdots + 0.006 = 1.000$
Slide 395: The marginal pf of $(X_1, X_2)$ is
\[ p_{12}(x_1, x_2) = P(X_1 = x_1, X_2 = x_2) = \sum_{x_3} \sum_{x_4} p(x_1, x_2, x_3, x_4) \]
Slide 396: Their marginal pfs are
\[ p_X(x) = \sum_{y} p(x, y) \quad \text{and} \quad p_Y(y) = \sum_{x} p(x, y) \]
Slide 399: ...the joint distribution of $X$ is specified by its joint probability density function
$f(x_1, x_2, \ldots, x_n)$.
Slide 404: ...is the discrete probability distribution with the pf
\[ p_{Y|X}(y|x) = P(Y = y \,|\, X = x) = \frac{P(X = x \text{ and } Y = y)}{P(X = x)} = \frac{p_{XY}(x, y)}{p_X(x)} \]
Slide 407: Clearly $p_{Y|X}(y|x) \ge 0$ for all $y$, and
\[ \sum_{y} p_{Y|X}(y|x) = \frac{\sum_{y} p_{XY}(x, y)}{p_X(x)} = \frac{p_X(x)}{p_X(x)} = 1 \]
Slide 408: The conditional distribution and pf of $X$ given $Y = y$ (for any $y$ such that
$p_Y(y) > 0$) is defined similarly, with roles of $X$ and $Y$ reversed:
\[ p_{X|Y}(x|y) = \frac{p_{XY}(x, y)}{p_Y(y)} \]
Slide 409: These are known as the conditional mean and conditional variance, and are
denoted
\[ E_{Y|X}(Y|x) \quad \text{and} \quad \mathrm{Var}_{Y|X}(Y|x) \]
Slide 412: The conditional distribution of $Y$ given that $X = x$ is the continuous probability
distribution with the pdf
\[ f_{Y|X}(y|x) = \frac{f_{XY}(x, y)}{f_X(x)} \]
Slide 416: The covariance of two random variables $X$ and $Y$ is defined as
\[ \mathrm{Cov}(X, Y) = \mathrm{Cov}(Y, X) = E\{[X - E(X)] \, [Y - E(Y)]\} \]
This can also be expressed as the more convenient formula
\[ \mathrm{Cov}(X, Y) = E(XY) - E(X) \, E(Y) \]
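Slide 418: The correlation of two random variables $X$ and $Y$ is defined as
\[ \mathrm{Corr}(X, Y) = \mathrm{Corr}(Y, X) = \frac{\mathrm{Cov}(X, Y)}{\sqrt{\mathrm{Var}(X) \, \mathrm{Var}(Y)}} = \frac{\mathrm{Cov}(X, Y)}{\mathrm{sd}(X) \, \mathrm{sd}(Y)} \]
When $\mathrm{Cov}(X, Y) = 0$, also $\mathrm{Corr}(X, Y) = 0$. When this is the case, we say that $X$ and $Y$ are
uncorrelated.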
Slide 419: If Corr(X, Y ) > 0, we say that X and Y are positively correlated.
If Corr(X, Y ) < 0, we say that X and Y are negatively correlated.
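Slide 423: Sample covariance of variables $X$ and $Y$ is calculated as
\[ \widehat{\mathrm{Cov}}(X, Y) = \frac{1}{n - 1} \sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y}) \]
Sample correlation of variables $X$ and $Y$ is calculated as
\[ r = \frac{\widehat{\mathrm{Cov}}(X, Y)}{s_X \, s_Y} = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^2 \; \sum_{i=1}^{n} (Y_i - \bar{Y})^2}} \]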
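The sample covariance and the two equivalent expressions for the sample correlation on slide 423 can be computed as follows. This is a sketch only: the x values are the five observations from slide 66, while the y values are made up for the illustration and do not come from the slides.

```python
from math import sqrt

x = [1, 2, 3, 5, 9]   # observations from slide 66
y = [2, 1, 4, 6, 7]   # hypothetical second variable
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n

# Sums of cross-products and of squared deviations
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxx = sum((xi - x_bar) ** 2 for xi in x)
syy = sum((yi - y_bar) ** 2 for yi in y)

sample_cov = sxy / (n - 1)         # sample covariance
s_x = sqrt(sxx / (n - 1))          # sample standard deviation of X
s_y = sqrt(syy / (n - 1))          # sample standard deviation of Y

r_from_cov = sample_cov / (s_x * s_y)  # first form of r
r_from_sums = sxy / sqrt(sxx * syy)    # second form of r: the (n - 1) factors cancel

print(sample_cov, r_from_cov, r_from_sums)
```

The two forms of r agree because the factor 1/(n − 1) appears once in the numerator and once (under the square root, squared) in the denominator.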
Slide 426: This implies that
\[ p_{XY}(x, y) = p_X(x) \, p_Y(y) \quad \text{for all } x, y. \]
Slide 427: They are independent if and only if their joint pf is
\[ p(x_1, x_2, \ldots, x_n) = p_1(x_1) \, p_2(x_2) \cdots p_n(x_n) \]
Similarly, continuous r.v.s $X_1, X_2, \ldots, X_n$ are independent if and only if their joint pdf is
\[ f(x_1, x_2, \ldots, x_n) = f_1(x_1) \, f_2(x_2) \cdots f_n(x_n) \]
Slide 428: If two random variables are independent, they are also uncorrelated, i.e. then
Cov(X, Y ) = 0 and Corr(X, Y ) = 0
Slide 429: and the joint pf of the variables is
\[ p(x_1, x_2, \ldots, x_n) = p(x_1) \, p(x_2) \cdots p(x_n) = \prod_{i=1}^{n} p(x_i) = \prod_{i=1}^{n} \frac{e^{-\lambda} \lambda^{x_i}}{x_i!} = \frac{e^{-n\lambda} \, \lambda^{\sum_i x_i}}{\prod_i x_i!} \]
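Slide 430: and the joint pdf of the variables is
\[
\begin{aligned}
f(x_1, x_2, \ldots, x_n) &= f(x_1) \, f(x_2) \cdots f(x_n) = \prod_{i=1}^{n} f(x_i) \\
 &= \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left[ -\frac{1}{2\sigma^2} (x_i - \mu)^2 \right] \\
 &= \frac{1}{\left( \sqrt{2\pi\sigma^2} \right)^n} \exp\left[ -\frac{1}{2\sigma^2} \sum_{i=1}^{n} (x_i - \mu)^2 \right]
\end{aligned}
\]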
Slide 436: In particular,
\[ \mathrm{Var}(X + Y) = \mathrm{Var}(X) + \mathrm{Var}(Y) + 2\,\mathrm{Cov}(X, Y) \]
\[ \mathrm{Var}(X - Y) = \mathrm{Var}(X) + \mathrm{Var}(Y) - 2\,\mathrm{Cov}(X, Y) \]
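Slide 437: ...and the result on the previous page simplifies to
\[ \mathrm{Var}\left( \sum_{i=1}^{n} a_i X_i \right) = \sum_{i=1}^{n} a_i^2 \, \mathrm{Var}(X_i) \]
In particular, when $X$ and $Y$ are independent,
\[ \mathrm{Var}(X + Y) = \mathrm{Var}(X) + \mathrm{Var}(Y) \]
\[ \mathrm{Var}(X - Y) = \mathrm{Var}(X) + \mathrm{Var}(Y) \]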
Slide 438: In particular, when X and Y are independent,
E(XY ) = E(X)E(Y )
Slide 439: From p. 243:
\[ \mathrm{Var}(X) = E(X^2) - [E(X)]^2 \]
Slide 440: From p. 416:
\[ \mathrm{Cov}(X, Y) = E(XY) - E(X) \, E(Y) \]
Slide 441: If X and Y are independent, Cov(X, Y ) = Corr(X, Y ) = 0.
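Slide 443:
If $X_i \sim \mathrm{Poisson}(\lambda_i)$, then $\sum_i X_i \sim \mathrm{Poisson}\left( \sum_i \lambda_i \right)$.
If $X_i \sim \mathrm{Binomial}(n_i, \pi)$, then $\sum_i X_i \sim \mathrm{Binomial}\left( \sum_i n_i, \pi \right)$.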
Slide 444: Thus, using the results on pp. 434 and 437,
\[ E(X) = \sum_{i=1}^{n} E(Z_i) = n\pi \quad \text{and} \quad \mathrm{Var}(X) = \sum_{i=1}^{n} \mathrm{Var}(Z_i) = n\pi(1 - \pi) \]
Slide 445: Then
\[ \sum_{i=1}^{n} a_i X_i + b \sim N(\mu, \sigma^2) \]
Slide 452: Suppose we have a sample of $n$ observations of a variable $X$:
\[ X_1, X_2, \ldots, X_n \]
Slide 454: The random variables $X_1, X_2, \ldots, X_n$ are then called
independent and identically distributed (IID) random variables from distribution
(population) $f(x; \theta)$, or
a random sample of size $n$ from distribution (population) $f(x; \theta)$.
Slide 455: The joint pf/pdf of a random sample is
\[ f(x_1, x_2, \ldots, x_n) = f(x_1; \theta) \, f(x_2; \theta) \cdots f(x_n; \theta) = \prod_{i=1}^{n} f(x_i; \theta) \]
Slide 458: A statistic is a known function of the variables $X_1, X_2, \ldots, X_n$ in a sample.
Slide 459: The probability distribution of a statistic is known as the sampling distribution
of the statistic.
Slide 464: For a random sample of size $n$ from $N(\mu, \sigma^2)$,
(a) $\bar{X} \sim N(\mu, \sigma^2/n)$
(b) $(n - 1)S^2/\sigma^2 \sim \chi^2_{n-1}$
Slide 468: We know (see pp. 434 and 437) that for independent $X_1, \ldots, X_n$ from any distribution
\[ E\left( \sum_{i=1}^{n} a_i X_i \right) = \sum_{i=1}^{n} a_i \, E(X_i) \]
\[ \mathrm{Var}\left( \sum_{i=1}^{n} a_i X_i \right) = \sum_{i=1}^{n} a_i^2 \, \mathrm{Var}(X_i) \]
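Slide 469: The formulae on the previous page then give
\[ E(\bar{X}) = \sum_{i=1}^{n} \frac{1}{n} E(X) = n \, \frac{1}{n} \, E(X) = E(X) \]
\[ \mathrm{Var}(\bar{X}) = \sum_{i=1}^{n} \frac{1}{n^2} \mathrm{Var}(X) = n \, \frac{1}{n^2} \, \mathrm{Var}(X) = \frac{\mathrm{Var}(X)}{n} \]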
Slide 470: Suppose that $X_1, \ldots, X_n$ are a random sample from a normal distribution with
mean $\mu$ and variance $\sigma^2$. Then
\[ \bar{X} \sim N\left( \mu, \frac{\sigma^2}{n} \right) \]
Slide 471: Variation of values of $\bar{X}$ in different samples (the sampling variance) is large
when the population variance of $X$ is large.
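The results about the expected value and the sampling variance of the sample mean (slides 469 to 471) can be illustrated by simulation. The sketch below is not from the slides: it draws many samples from an exponential population with a hypothetical rate of 1.6, and the sample size, number of repetitions and seed are arbitrary choices. It then compares the mean and variance of the simulated sample means with the population mean and with the population variance divided by the sample size.

```python
import random

rng = random.Random(0)   # fixed seed so that the run is repeatable
lam = 1.6                # hypothetical exponential rate: E(X) = 1/lam, Var(X) = 1/lam**2
n, reps = 25, 20000      # sample size and number of simulated samples (arbitrary choices)

# Draw `reps` independent samples of size n and record each sample mean.
sample_means = []
for _ in range(reps):
    sample = [rng.expovariate(lam) for _ in range(n)]
    sample_means.append(sum(sample) / n)

# Mean and variance of the simulated sample means.
grand_mean = sum(sample_means) / reps
var_of_means = sum((m - grand_mean) ** 2 for m in sample_means) / (reps - 1)

print("mean of sample means:", grand_mean, "theory:", 1 / lam)
print("variance of sample means:", var_of_means, "theory:", (1 / lam ** 2) / n)
```

Because this is a simulation, the printed values only approximate the theoretical ones; increasing `reps` brings them closer.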
Slide 476: Then
\[ \lim_{n \to \infty} P\left( \frac{\bar{X}_n - \mu}{\sigma/\sqrt{n}} \le x \right) = \Phi(x) \]
The $\lim_{n \to \infty}$ indicates that this is an asymptotic result...
Slide 477: In less formal language, the CLT says that for a random sample from (nearly) any
distribution with mean $\mu$ and variance $\sigma^2$, approximately
\[ \bar{X} \sim N\left( \mu, \frac{\sigma^2}{n} \right) \]
Slide 486: Let $X_1, X_2, \ldots, X_k$ be independent $N(0, 1)$ random variables. If
\[ Z = X_1^2 + X_2^2 + \cdots + X_k^2 = \sum_{i=1}^{k} X_i^2, \]
Slide 489: If $Z_1, Z_2, \ldots, Z_m$ are independent random variables and $Z_i \sim \chi^2(k_i)$, then their
sum is also $\chi^2$-distributed:
\[ Z_1 + Z_2 + \cdots + Z_m \sim \chi^2(k_1 + k_2 + \cdots + k_m) \]
One example: if $X_1, X_2, \ldots, X_n$ are a random sample from population $N(\mu, \sigma^2)$, and $S^2$ is their
sample variance, then
\[ \frac{(n - 1) S^2}{\sigma^2} \sim \chi^2_{n-1} \]
Slide 493: Then the distribution of the random variable
\[ T = \frac{X}{\sqrt{Z/k}} \]
is the t-distribution with $k$ degrees of freedom.
Slide 499: Then the distribution of
\[ Z = \frac{U/p}{V/k} \]
is the F-distribution with degrees of freedom $(p, k)$, denoted $Z \sim F_{p,k}$ or $Z \sim F(p, k)$.
Slide 503: We are willing to assume that these observations are a realised value of a random
sample $X_1, X_2, \ldots, X_n$ from population distribution $N(\mu, \sigma^2)$, for some unknown values of
$\mu$ and $\sigma^2$.