
STATISTICS (ST102) LENT TERM MATERIAL

one. Point Estimation

Estimator (formula): a function of the data (X, Y) used to give estimates. Methods: Method of Moments, Least Squares, Maximum Likelihood.

Note: The estimator is a random variable, as the value of the estimate changes as different samples are drawn. Thus, the estimator has a probability distribution. The true parameter is regarded as a constant.

Measuring the Performance of an Estimator: Mean Square Error (MSE) measures the trade-off between bias and efficiency.

$\mathrm{MSE}(\hat{\theta}) = E[(\hat{\theta}-\theta)^2] = \mathrm{Var}(\hat{\theta}) + [\mathrm{Bias}(\hat{\theta})]^2$

Bias: the amount by which, on average, the estimator deviates from the true value.

$\mathrm{Bias}(\hat{\theta}) = E(\hat{\theta}) - \theta$

Therefore, if an estimator is unbiased:

$\mathrm{Bias}(\hat{\theta}) = 0$, i.e. $E(\hat{\theta}) = \theta$
Disadvantages of MSE: squaring does not weight errors equally; it magnifies errors greater than 1 and shrinks errors less than 1. An alternative is the Mean Absolute Deviation (MAD):

$\mathrm{MAD}(\hat{\theta}) = E(|\hat{\theta} - \theta|)$
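A minimal simulation sketch of the decomposition above (not from the notes; assumes numpy is available, with the sample size, true variance and replication count made up for illustration), comparing the biased and unbiased variance estimators:

```python
# Sketch: estimate Bias, Variance and MSE of two variance estimators by
# simulation, illustrating MSE = Var + Bias^2.
import numpy as np

rng = np.random.default_rng(0)
sigma2, n, reps = 4.0, 10, 100_000           # illustrative values

samples = rng.normal(0.0, np.sqrt(sigma2), size=(reps, n))
s2_unbiased = samples.var(axis=1, ddof=1)    # divides by n - 1
s2_biased = samples.var(axis=1, ddof=0)      # divides by n

for name, est in [("unbiased (n-1)", s2_unbiased), ("biased (n)", s2_biased)]:
    bias = est.mean() - sigma2
    var = est.var()
    mse = np.mean((est - sigma2) ** 2)
    print(f"{name}: bias={bias:.3f} var={var:.3f} "
          f"var+bias^2={var + bias**2:.3f} mse={mse:.3f}")
```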
Method of Moments Estimator (MME):
Equate the k-th sample moments (non-centred, computable from data) to the k-th population moments (not computable, as they depend on unknown parameters), and solve for the parameters:

k = 1: $\frac{1}{n}\sum_{i=1}^{n} X_i = \bar{X}$ and $E(X)$
k = 2: $\frac{1}{n}\sum_{i=1}^{n} X_i^2$ and $E(X^2) = \mathrm{Var}(X) + [E(X)]^2$
...
k-th: $\frac{1}{n}\sum_{i=1}^{n} X_i^k$ (computable from data) and $E(X^k)$ (not computable; depends on unknown parameters)
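A short sketch of the MME in practice (assumes numpy; the true parameters and sample size are illustrative). For $N(\mu, \sigma^2)$, equating the first two sample moments to $E(X)$ and $E(X^2)$ gives $\hat{\mu} = \bar{X}$ and $\hat{\sigma}^2 = \frac{1}{n}\sum X_i^2 - \bar{X}^2$:

```python
# Sketch: method-of-moments estimates for N(mu, sigma^2), obtained by
# equating the first two sample moments to E(X) and E(X^2).
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(5.0, 2.0, size=500)   # true mu = 5, sigma^2 = 4

m1 = x.mean()                        # (1/n) sum X_i   -> E(X) = mu
m2 = np.mean(x ** 2)                 # (1/n) sum X_i^2 -> E(X^2) = sigma^2 + mu^2

mu_mme = m1
sigma2_mme = m2 - m1 ** 2
print(mu_mme, sigma2_mme)
```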

Maximum Likelihood Estimator (MLE):

Requires independent and identically distributed (i.i.d.) samples.

Step 1: Construct the Likelihood Function

$L(\theta) = f(X_1, X_2, \ldots, X_n; \theta) = f(X_1;\theta)\, f(X_2;\theta) \cdots f(X_n;\theta) = \prod_{i=1}^{n} f(X_i;\theta)$

Often easier to work with the log-likelihood function:

$\ell(\theta) = \ln L(\theta) = \ln \prod_{i=1}^{n} f(X_i;\theta) = \sum_{i=1}^{n} \ln f(X_i;\theta)$
Step 2: Maximise the Likelihood Function. Maximise the log-likelihood function by differentiation or by observation:

$\left.\frac{d\ell(\theta)}{d\theta}\right|_{\theta=\hat{\theta}} = 0$

Note: in the likelihood function, the Xs (sample data) are treated as constants and θ (the parameter) is the variable. First-order conditions are only applicable if the function is continuously differentiable.

Properties: under suitable conditions, MLE and MME have nice large-sample properties.

Consistent: as $n \to \infty$, $\mathrm{MSE} \to 0$.
Asymptotically Normal: as n approaches infinity, under some regularity conditions,

$\sqrt{n}\,(\hat{\theta} - \theta) \to N\!\left(0, \frac{1}{I(\theta)}\right)$, i.e. approximately $\hat{\theta} - \theta \sim N\!\left(0, \frac{1}{nI(\theta)}\right)$, so $\hat{\theta} \sim N\!\left(\theta, \frac{1}{nI(\theta)}\right)$

$I(\theta)$ is the Fisher Information, defined as:

$I(\theta) = -E\!\left[\frac{\partial^2}{\partial\theta^2}\ln f(X;\theta)\right]$, computed as $-\sum_x f(x;\theta)\,\frac{\partial^2}{\partial\theta^2}\ln f(x;\theta)$ in the discrete case and $-\int f(x;\theta)\,\frac{\partial^2}{\partial\theta^2}\ln f(x;\theta)\,dx$ in the continuous case.
On top of this, the MLE is also:

Invariant: if $\hat{\theta}$ is the MLE of $\theta$, then $g(\hat{\theta})$ is the MLE of $g(\theta)$.

More efficient than the MME: $\mathrm{Var}(\hat{\theta}_{MLE}) \le \mathrm{Var}(\hat{\theta}_{MME})$

Conceptually, the MLE uses more information in its calculation than the MME. Hence, we always use the MLE when possible.
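A sketch of the MLE in practice for an Exponential(λ) sample (assumes numpy and scipy; the true rate and sample size are illustrative). The closed-form MLE $\hat{\lambda} = 1/\bar{X}$ is checked against a numerical maximiser of the log-likelihood:

```python
# Sketch: MLE for the rate of an Exponential(lambda) sample. The
# log-likelihood is l(lambda) = n*ln(lambda) - lambda*sum(x); setting
# dl/dlambda = 0 gives lambda_hat = 1/xbar, which we verify numerically.
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(2)
x = rng.exponential(scale=1 / 3.0, size=1000)   # true lambda = 3

def neg_loglik(lam):
    # negative log-likelihood, since we minimise
    return -(len(x) * np.log(lam) - lam * x.sum())

res = minimize_scalar(neg_loglik, bounds=(1e-6, 50), method="bounded")
print("closed form:", 1 / x.mean(), "numeric:", res.x)
```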

two. Confidence Intervals

Confidence interval for a parameter at 90/95/99% confidence:

$\hat{\theta} \pm C_{\alpha} \times SE(\hat{\theta})$

Length of confidence interval $= 2 \times C_{\alpha} \times SE(\hat{\theta})$

x% confidence interval: if one repeats the interval estimation a large number of times, about x% of the time the interval estimator covers the true θ.

Confidence Intervals for the Population Mean
Population normally distributed, variance known: $\frac{\bar{X}-\mu}{\sigma/\sqrt{n}} \sim N(0,1)$; use Z-table.

Population normally distributed, variance unknown: $\frac{\bar{X}-\mu}{s/\sqrt{n}} \sim t_{n-1}$; use t-table.

Population not normal: if n is large, by the Central Limit Theorem $\frac{\bar{X}-\mu}{\sigma/\sqrt{n}} \sim N(0,1)$ approximately; use Z-table.

For a proportion (simplification after CLT): $\frac{\hat{p}-\pi}{\sqrt{\pi(1-\pi)/n}} \sim N(0,1)$ approximately; use Z-table.
Chi-Squared Distribution

Chi-squared distribution with k degrees of freedom:

If $X_i \sim N(0,1)$ independently, then $Z = X_1^2 + X_2^2 + \cdots + X_k^2 = \sum_{i=1}^{k} X_i^2 \sim \chi^2_k$

Support $[0, \infty)$; $E(Z) = k$; $\mathrm{Var}(Z) = 2k$

Test for Population Variance:

Let $X_i \sim N(\mu, \sigma^2)$. Then

$\frac{(n-1)s^2}{\sigma^2} \sim \chi^2_{n-1}$

$(1-\alpha)$ confidence interval for $\sigma^2$ is $\left(\frac{(n-1)s^2}{\chi^2_{\alpha/2,\,n-1}},\; \frac{(n-1)s^2}{\chi^2_{1-\alpha/2,\,n-1}}\right)$
Proof

Let $X_i \sim N(\mu, \sigma^2)$, $i = 1, 2, \ldots, n$. Then

$\sum_{i=1}^{n}\left(\frac{X_i-\mu}{\sigma}\right)^2 = \frac{1}{\sigma^2}\sum_{i=1}^{n}(X_i-\mu)^2 \sim \chi^2_n$

$\sum_{i=1}^{n}(X_i-\mu)^2 = \sum_{i=1}^{n}(X_i-\bar{X})^2 + n(\bar{X}-\mu)^2 + 2(\bar{X}-\mu)\sum_{i=1}^{n}(X_i-\bar{X})$, where the cross term is zero.

$\frac{n(\bar{X}-\mu)^2}{\sigma^2} = \left(\frac{\bar{X}-\mu}{\sigma/\sqrt{n}}\right)^2 \sim \chi^2_1$

Hence $\frac{1}{\sigma^2}\sum_{i=1}^{n}(X_i-\bar{X})^2 = \frac{(n-1)s^2}{\sigma^2} \sim \chi^2_{n-1}$
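A sketch of the variance interval (assumes numpy and scipy; data simulated for illustration):

```python
# Sketch: (1 - alpha) confidence interval for sigma^2 using the
# chi-squared pivot (n-1)s^2 / sigma^2 ~ chi^2_{n-1}.
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
x = rng.normal(0.0, 3.0, size=30)            # true sigma^2 = 9
n, s2, alpha = len(x), x.var(ddof=1), 0.05

lower = (n - 1) * s2 / stats.chi2.ppf(1 - alpha / 2, df=n - 1)
upper = (n - 1) * s2 / stats.chi2.ppf(alpha / 2, df=n - 1)
print(lower, upper)   # covers the true sigma^2 about 95% of the time
```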

t-Distribution

Student's t-distribution with k degrees of freedom:

Let $Z \sim N(0,1)$ and $X \sim \chi^2_k$, independent. Then

$T = \frac{Z}{\sqrt{X/k}} \sim t_k$

t is a continuous and symmetric distribution on $(-\infty, \infty)$, with heavier tails than the normal distribution. As k approaches infinity, the t distribution converges to the standard normal distribution.

Test for Population Mean:

$\frac{\bar{X}-\mu}{s/\sqrt{n}} \sim t_{n-1}$

$(1-\alpha)$ confidence interval for $\mu$ is $\bar{X} \pm t_{\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}}$
Proof

$\frac{\bar{X}-\mu}{\sigma/\sqrt{n}} \sim N(0,1)$ and $\frac{(n-1)s^2}{\sigma^2} \sim \chi^2_{n-1}$, independently. Hence

$T = \frac{(\bar{X}-\mu)\big/(\sigma/\sqrt{n})}{\sqrt{\frac{(n-1)s^2}{\sigma^2}\Big/(n-1)}} = \frac{\bar{X}-\mu}{s/\sqrt{n}} \sim t_{n-1}$

F-Distribution

F-distribution with degrees of freedom p, k:

Let $U \sim \chi^2_p$ and $V \sim \chi^2_k$ be independent r.v.s. Then

$W = \frac{U/p}{V/k} \sim F_{p,k}$

Support $[0, \infty)$; $E(W) = \frac{k}{k-2}$ for $k > 2$; $\mathrm{Var}(W) = \frac{2k^2(p+k-2)}{p(k-2)^2(k-4)}$ for $k > 4$

If $W \sim F_{p,k}$, then $W^{-1} \sim F_{k,p}$. If $T \sim t_k$, then $T^2 \sim F_{1,k}$.

Test for the Ratio of Two Normal Variances

$H_0: \sigma_Y^2/\sigma_X^2 = r$ vs $H_1: \sigma_Y^2/\sigma_X^2 \ne r$

$T = \frac{\left[(n-1)s_X^2/\sigma_X^2\right]/(n-1)}{\left[(m-1)s_Y^2/\sigma_Y^2\right]/(m-1)} = \frac{s_X^2/\sigma_X^2}{s_Y^2/\sigma_Y^2} = r\,\frac{s_X^2}{s_Y^2} \sim F_{n-1,\,m-1}$

$(1-\alpha)$ confidence interval for $\frac{\sigma_Y^2}{\sigma_X^2}$ is $\left(F_{1-\alpha/2,\,n-1,\,m-1}\,\frac{s_Y^2}{s_X^2},\; F_{\alpha/2,\,n-1,\,m-1}\,\frac{s_Y^2}{s_X^2}\right)$
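A sketch of the variance-ratio test with r = 1 (assumes numpy and scipy; data simulated for illustration):

```python
# Sketch: two-sided test of H0: sigma_Y^2 / sigma_X^2 = 1 using
# T = s_X^2 / s_Y^2 ~ F_{n-1, m-1} under H0.
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
x = rng.normal(0, 2.0, size=40)   # n = 40
y = rng.normal(0, 2.0, size=30)   # m = 30

T = x.var(ddof=1) / y.var(ddof=1)
n, m = len(x), len(y)
p_one_sided = stats.f.sf(T, n - 1, m - 1)
p_value = 2 * min(p_one_sided, 1 - p_one_sided)   # two-sided
print(T, p_value)
```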

three. Hypothesis Testing


one. State the Null and Alternative Hypotheses

Two-tail test: $H_0: \theta = k$ vs $H_1: \theta \ne k$
One-tail test: $H_0: \theta = k$ vs $H_1: \theta > k$, or $H_0: \theta = k$ vs $H_1: \theta < k$

two. Compute the Test Statistic (T)

three. Either look up the critical values at the level of significance α (1%, 5%, 10%), or compute the p-value. The p-value is the smallest level of significance at which $H_0$ can be rejected. Let t be the calculated test statistic.

Critical-value approach:
One-tail test: $P(T > C_\alpha) = \alpha$; if $T > C_\alpha$, reject at the α level of significance.
Two-tail test: $P(|T| > C_{\alpha/2}) = \alpha$; if $|T| > C_{\alpha/2}$, reject at the α level of significance.

p-value approach: p-value $= P(T > t)$ or $P(T < t)$, depending on the direction of the test.
One-tail test: if p-value $\le \alpha$, reject at the α level of significance.
Two-tail test: if p-value $\le \alpha/2$ (with the p-value computed in one tail), reject at the α level of significance.

four. Conclude: "Reject even at the 1% level of significance" / "Do not reject even at the 10% level of significance".
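A sketch of the four steps for a two-tail z-test (assumes numpy and scipy; $\mu_0$, σ and the data are illustrative):

```python
# Sketch: two-tail z-test of H0: mu = 10 vs H1: mu != 10, sigma known.
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)
x = rng.normal(10.5, 2.0, size=50)
sigma, mu0, alpha = 2.0, 10.0, 0.05

T = (x.mean() - mu0) / (sigma / np.sqrt(len(x)))   # step two: test statistic
C = stats.norm.ppf(1 - alpha / 2)                  # step three: critical value
p_value = 2 * stats.norm.sf(abs(T))                # two-tail p-value
print(f"T={T:.3f}, C={C:.3f}, reject={abs(T) > C}, p={p_value:.4f}")
```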

Errors in Hypothesis Testing

Type 1 Error: rejecting the null hypothesis when $H_0$ is true.
Type 2 Error: not rejecting $H_0$ when $H_1$ is true.
Power: $P(\text{rejecting } H_0 \text{ when } H_1 \text{ is true})$.
Note: $P(\text{Type 2 Error}) + \text{Power} = 1$
For example:

$H_0: \mu = \mu_0$ vs $H_1: \mu \ne \mu_0$, with $T = \frac{\bar{X}-\mu_0}{\sigma/\sqrt{n}}$. Under $H_0$, $T \sim N(0,1)$.

Type 1 Error: $P(\text{Type 1 Error}) = P(\text{observed } T \text{ lies in the critical region}) = $ level of significance.

Type 2 Error: the true mean is not $\mu_0$; instead $\mu = \mu_1$. T is no longer standard normal:

$T \sim N\!\left(\frac{\mu_1-\mu_0}{\sigma/\sqrt{n}},\; 1\right)$

$P(\text{Type 2 Error}) = P(T, \text{ under } H_1, \text{ lies within the critical values})$
Power $= P(T, \text{ under } H_1, \text{ lies outside the critical values})$

Properties

There is a trade-off between Type 1 and Type 2 errors: as α falls, $P(\text{Type 2 Error})$ increases.
If the variance increases, $P(\text{Type 2 Error})$ increases.
As the distance $|\mu_0 - \mu_1|$ increases, $P(\text{Type 2 Error})$ falls.
If the number of samples increases, the variance of $\bar{X}$ falls, so $P(\text{Type 2 Error})$ falls.
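A sketch computing the power of the two-tail z-test directly from the shifted distribution of T under $H_1$ (assumes scipy; the parameter values are illustrative):

```python
# Sketch: power of the two-tail z-test, using
# T ~ N((mu1 - mu0) / (sigma / sqrt(n)), 1) under H1.
import numpy as np
from scipy import stats

mu0, mu1, sigma, n, alpha = 0.0, 0.5, 1.0, 25, 0.05
shift = (mu1 - mu0) / (sigma / np.sqrt(n))
C = stats.norm.ppf(1 - alpha / 2)

# Power = P(T > C) + P(T < -C) when T is centred at `shift`
power = stats.norm.sf(C - shift) + stats.norm.cdf(-C - shift)
print("power:", power, "P(Type 2):", 1 - power)
```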

Testing the Difference of Two Population Means

Assume the data are normally distributed, or n is large:

$X \sim N(\mu_X, \sigma_X^2)$, $Y \sim N(\mu_Y, \sigma_Y^2)$, so $\bar{X} \sim N\!\left(\mu_X, \frac{\sigma_X^2}{n_X}\right)$ and $\bar{Y} \sim N\!\left(\mu_Y, \frac{\sigma_Y^2}{n_Y}\right)$

Test $H_0: \mu_X - \mu_Y = a$.

Matched Pairs: the two distributions can be logically linked; same sample size, $n_X = n_Y = n$. Work with the differences $Z_i = X_i - Y_i$.

If the variance is known: $T = \frac{\bar{Z}-a}{\sigma_Z/\sqrt{n}} \sim N(0,1)$

If the variance is unknown: $T = \frac{\bar{Z}-a}{s_Z/\sqrt{n}} \sim t_{n-1}$, where $s_Z^2 = \frac{1}{n-1}\left(\sum Z_i^2 - n\bar{Z}^2\right)$

Independent Samples: the two distributions cannot be linked; different sample sizes $n_X$, $n_Y$.

$\bar{X} - \bar{Y} \sim N\!\left(\mu_X - \mu_Y,\; \frac{\sigma_X^2}{n_X} + \frac{\sigma_Y^2}{n_Y}\right)$

$T = \frac{(\bar{X}-\bar{Y}) - a}{SE(\bar{X}-\bar{Y})}$

If the variances are known: $SE(\bar{X}-\bar{Y}) = \sqrt{\frac{\sigma_X^2}{n_X} + \frac{\sigma_Y^2}{n_Y}}$ and, under $H_0$, $T = \frac{(\bar{X}-\bar{Y}) - a}{\sqrt{\sigma_X^2/n_X + \sigma_Y^2/n_Y}} \sim N(0,1)$

If the variances are unknown, use the sample variances (n large): $T = \frac{(\bar{X}-\bar{Y}) - a}{\sqrt{s_X^2/n_X + s_Y^2/n_Y}} \sim N(0,1)$ approximately

$(1-\alpha)$ confidence interval for $\mu_X - \mu_Y$: $(\bar{X}-\bar{Y}) \pm z_{\alpha/2}\sqrt{\frac{\sigma_X^2}{n_X} + \frac{\sigma_Y^2}{n_Y}}$
If the variances are unknown but assumed equal ($\sigma_X^2 = \sigma_Y^2 = \sigma^2$), use the pooled variance:

$s_P^2 = \frac{(n_X-1)s_X^2 + (n_Y-1)s_Y^2}{(n_X-1) + (n_Y-1)}$

The pooled variance is the weighted average of the sample variances.

$SE(\bar{X}-\bar{Y}) = \sqrt{\frac{s_P^2}{n_X} + \frac{s_P^2}{n_Y}} = \sqrt{\frac{(n_X-1)s_X^2 + (n_Y-1)s_Y^2}{(n_X-1)+(n_Y-1)}\left(\frac{1}{n_X} + \frac{1}{n_Y}\right)}$

$T = \frac{(\bar{X}-\bar{Y}) - a}{\sqrt{s_P^2/n_X + s_P^2/n_Y}} \sim t_{n+m-2}$

$(1-\alpha)$ confidence interval for $\mu_X - \mu_Y$: $(\bar{X}-\bar{Y}) \pm t_{\alpha/2,\,n+m-2}\sqrt{\frac{(n_X-1)s_X^2 + (n_Y-1)s_Y^2}{(n_X-1)+(n_Y-1)}\left(\frac{1}{n_X} + \frac{1}{n_Y}\right)}$

Test for Correlation

$\mathrm{Corr}(X,Y) = \frac{\mathrm{Cov}(X,Y)}{\sqrt{\mathrm{Var}(X)\,\mathrm{Var}(Y)}} = \frac{E[(X-EX)(Y-EY)]}{\sqrt{E(X-EX)^2\,E(Y-EY)^2}}$

Correlation measures the linear relationship between X and Y. When $\rho = 0$, X and Y are linearly independent.

$H_0: \rho = 0$ vs $H_1: \rho > 0$ / $\rho < 0$ / $\rho \ne 0$

Sample correlation coefficient:

$\hat{\rho} = \frac{\sum (X_i-\bar{X})(Y_i-\bar{Y})}{\sqrt{\sum (X_i-\bar{X})^2 \sum (Y_i-\bar{Y})^2}} = \frac{\sum X_i Y_i - n\bar{X}\bar{Y}}{(n-1)s_X s_Y}$

$T = \hat{\rho}\sqrt{\frac{n-2}{1-\hat{\rho}^2}} \sim t_{n-2}$
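A sketch of the correlation test (assumes numpy and scipy; data simulated for illustration):

```python
# Sketch: test H0: rho = 0 using T = rho_hat * sqrt((n-2)/(1-rho_hat^2)),
# which follows t_{n-2} under H0.
import numpy as np
from scipy import stats

rng = np.random.default_rng(8)
x = rng.normal(size=50)
y = 0.4 * x + rng.normal(size=50)

n = len(x)
r = np.corrcoef(x, y)[0, 1]              # sample correlation coefficient
T = r * np.sqrt((n - 2) / (1 - r ** 2))
p_value = 2 * stats.t.sf(abs(T), df=n - 2)
print(r, T, p_value)
```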

Goodness of Fit Test: used to assess whether a given distribution fits the data well.

1. $H_0$: the r.v. X follows a certain distribution vs $H_1$: the r.v. X does not follow that distribution.

Note: in cases where the parameters of the distribution are not given, use the MLE/MME to obtain point estimates.

2. Construct the table and calculate the test statistic:

Category:                   1      2      ...  | Total
Observed frequency Z_i:     Z_1    Z_2    ...  | n
Probability p_i:            p_1    p_2    ...  | 1
Expected frequency E_i:     np_1   np_2   ...  | n
Difference Z_i - E_i:       ...    ...    ...  | 0
(Z_i - E_i)^2 / E_i:        ...    ...    ...  | T

Under the null hypothesis,

$T = \sum_{i} \frac{(Z_i - E_i)^2}{E_i} \sim \chi^2_{k-1-d}$

where k is the number of categories and d is the number of parameters estimated. Computational shortcut (using $\sum_i Z_i = \sum_i E_i = n$):

$T = \sum_{i} \frac{(Z_i - E_i)^2}{E_i} = \sum_{i} \frac{Z_i^2}{E_i} - 2\sum_{i} Z_i + \sum_{i} E_i = \sum_{i} \frac{Z_i^2}{E_i} - n$

Note: if any category has an expected cell count < 5, merge groups so that all groups have expected counts greater than or equal to 5. In some cases, intervals or groups can be created by hand; for example, in a test for normality, one can divide the real line into 10 intervals, each with probability 10%.
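A sketch of the goodness-of-fit test for a fair six-sided die (assumes numpy and scipy; the observed counts are made up for illustration):

```python
# Sketch: chi-squared goodness-of-fit test of a fair six-sided die.
# Degrees of freedom = 6 categories - 1 (no parameters estimated).
import numpy as np
from scipy import stats

observed = np.array([18, 25, 16, 21, 22, 18])   # Z_i, hypothetical counts
expected = np.full(6, observed.sum() / 6)       # E_i = n * p_i with p_i = 1/6

T = ((observed - expected) ** 2 / expected).sum()
p_value = stats.chi2.sf(T, df=6 - 1)
print(T, p_value)
# scipy.stats.chisquare(observed) gives the same result.
```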
Contingency Tables / Tests of Association

A special application of the goodness-of-fit test. Arrange the observed counts $Z_{ij}$ in an r × c table (X in rows, Y in columns), compute the table of expected counts $E_{ij}$, and form $(Z_{ij}-E_{ij})^2/E_{ij}$ cell by cell.

$T = \sum_{i=1}^{r}\sum_{j=1}^{c} \frac{(Z_{ij} - E_{ij})^2}{E_{ij}} \sim \chi^2_{p-d}$

where p = number of free counts among the $Z_{ij}$ and d = number of estimated free parameters; for most cases $p - d = (r-1)(c-1)$.

Test of independence: $H_0: p_{ij} = p_{i\cdot}\,p_{\cdot j}$, with $\hat{p}_{i\cdot} = \frac{Z_{i\cdot}}{n}$, $\hat{p}_{\cdot j} = \frac{Z_{\cdot j}}{n}$, $\hat{p}_{ij} = \hat{p}_{i\cdot}\hat{p}_{\cdot j}$ and $E_{ij} = n\hat{p}_{ij}$.

Test for several binomial distributions: $H_0: p_{11} = p_{12} = \cdots = p_{1n} = p$, with $\hat{p} = \frac{Z_{11} + Z_{12} + \cdots + Z_{1n}}{Z_{\cdot 1} + Z_{\cdot 2} + \cdots + Z_{\cdot n}}$
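A sketch of a test of association on a hypothetical 2 × 3 table (assumes numpy and scipy):

```python
# Sketch: chi2_contingency computes E_ij = (row total * column total) / n
# and the statistic above, with (r-1)(c-1) degrees of freedom.
import numpy as np
from scipy.stats import chi2_contingency

Z = np.array([[30, 20, 10],
              [20, 25, 25]])          # hypothetical observed counts
T, p_value, dof, E = chi2_contingency(Z, correction=False)
print(T, p_value, dof)
```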

four. Simple Linear Regression Model

ALL FORMULAS BELOW ARE FOUND IN THE FORMULA SHEET

Theoretical model: $y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$
Estimated / fitted model: $\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i$

$\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}, \qquad \hat{\beta}_1 = \frac{\sum (x_i-\bar{x})(y_i-\bar{y})}{\sum (x_i-\bar{x})^2} = \frac{\sum x_i y_i - n\bar{x}\bar{y}}{\sum x_i^2 - n\bar{x}^2}$
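A sketch computing these estimates, $\hat{\sigma}^2$ and the t statistic for $H_0: \beta_1 = 0$ from the formulas in this section (assumes numpy and scipy; data simulated for illustration):

```python
# Sketch: least-squares estimates and the t test for beta_1, computed
# directly from the formulas above.
import numpy as np
from scipy import stats

rng = np.random.default_rng(9)
x = np.linspace(0, 10, 30)
y = 2.0 + 0.5 * x + rng.normal(0, 1.0, size=30)   # true beta_0=2, beta_1=0.5
n = len(x)

b1 = ((x - x.mean()) * (y - y.mean())).sum() / ((x - x.mean()) ** 2).sum()
b0 = y.mean() - b1 * x.mean()

resid = y - (b0 + b1 * x)
sigma2_hat = (resid ** 2).sum() / (n - 2)          # estimate of sigma^2
se_b1 = np.sqrt(sigma2_hat / ((x - x.mean()) ** 2).sum())

T = b1 / se_b1
p_value = 2 * stats.t.sf(abs(T), df=n - 2)
print(b0, b1, T, p_value)
```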


Note: these are unbiased estimators; their variances can be found in the formula sheet. The Gauss-Markov theorem states that these estimates are the best linear unbiased estimates: any other linear unbiased estimate will have a larger variance, and hence is less efficient.

See the formula sheet for:
Confidence interval for Ey (a constant): repeating the experiment produces intervals that include Ey x% of the time.
Predictive interval for y (a r.v.): the interval contains the unobserved random variable y x% of the time.

Outliers: unusually small or large $y_i$ that lie outside the majority of observations (more than 2σ away from Ey).
Influential observations: an $x_i$ that is far away from the other xs has a large influence on the fitted line.
Analysis of residuals: residuals should exhibit no systematic patterns, with constant variance.

Testing $\beta_0$ and $\beta_1$

$\varepsilon_i \overset{i.i.d.}{\sim} N(0, \sigma^2) \implies y_i \sim N(\beta_0 + \beta_1 x_i,\; \sigma^2)$

$\hat{\beta}_0 \sim N\!\left(\beta_0,\; \frac{\sigma^2 \sum_{i=1}^{n} x_i^2}{n \sum_{i=1}^{n} (x_i-\bar{x})^2}\right), \qquad \hat{\beta}_1 \sim N\!\left(\beta_1,\; \frac{\sigma^2}{\sum_{i=1}^{n} (x_i-\bar{x})^2}\right)$

Replace the variance with the estimate below:

$\hat{\sigma}^2 = \frac{1}{n-2}\sum_{i=1}^{n}\left[y_i - (\hat{\beta}_0 + \hat{\beta}_1 x_i)\right]^2$

$SE(\hat{\beta}_0) = \sqrt{\frac{\hat{\sigma}^2 \sum x_i^2}{n \sum (x_i-\bar{x})^2}}, \qquad SE(\hat{\beta}_1) = \sqrt{\frac{\hat{\sigma}^2}{\sum (x_i-\bar{x})^2}}$

Common Tests: Test whether B0 is non-zero

= 0 T= 0 0 = H0 : 0 SE 0

( )

0 0 2 n 2 n x / (x x)2 n i=1 i i=1 i

~ t n2

2 n 2 n t (1 ) Confidence Interval for 0 is 0 /2,n2 x i / (x i x)2 n i=1 i=1


Test whether $\beta_1$ is significantly non-zero

$H_0: \beta_1 = 0, \qquad T = \frac{\hat{\beta}_1 - \beta_1}{SE(\hat{\beta}_1)} = \frac{\hat{\beta}_1 - \beta_1}{\sqrt{\hat{\sigma}^2 \big/ \sum (x_i-\bar{x})^2}} \sim t_{n-2}$

$(1-\alpha)$ confidence interval for $\beta_1$: $\hat{\beta}_1 \pm t_{\alpha/2,\,n-2}\sqrt{\frac{\hat{\sigma}^2}{\sum (x_i-\bar{x})^2}}$

Testing the variance of the residuals

$\frac{(n-2)\hat{\sigma}^2}{\sigma^2} = \frac{\sum_{i=1}^{n}\left[y_i - (\hat{\beta}_0 + \hat{\beta}_1 x_i)\right]^2}{\sigma^2} \sim \chi^2_{n-2}$

ANOVA

Total Sum of Squares (SS) = Regression SS + Residual SS:

$\sum_{i=1}^{n}(y_i-\bar{y})^2 = \hat{\beta}_1^2 \sum_{i=1}^{n}(x_i-\bar{x})^2 + \sum_{i=1}^{n}\left[y_i - (\hat{\beta}_0 + \hat{\beta}_1 x_i)\right]^2$

Total SS $= \sum (y_i-\bar{y})^2 = \sum y_i^2 - n\bar{y}^2$

Regression SS $= \hat{\beta}_1^2 \sum (x_i-\bar{x})^2 = \hat{\beta}_1^2 \left(\sum x_i^2 - n\bar{x}^2\right)$

Another test for whether $\beta_1$ is significantly non-zero:

$F = \frac{\text{Regression SS}}{\text{Residual SS}/(n-2)} = \left(\frac{\hat{\beta}_1}{SE(\hat{\beta}_1)}\right)^2 \sim F_{1,\,n-2}$

Regression Correlation Coefficient: the percentage of total variation explained by x

$R^2 = \frac{\text{Regression SS}}{\text{Total SS}} = 1 - \frac{\text{Residual SS}}{\text{Total SS}}$

$R^2_{adj} = 1 - \frac{\text{Residual SS}/(n-2)}{\text{Total SS}/(n-1)}$

Analysis of Minitab Results

Fitted Line Plot

[Figure: fitted line plot of Velocity against Stopping Distance, with regression line, 95% CI and 95% PI bands. Fitted equation: Velocity = 18.06 + 0.2818 Stopping Distance; S = 2.14805, R-Sq = 98.4%, R-Sq(adj) = 98.0%.]

STATISTICS (ST102) MICHAELMAS TERM MATERIAL

Notepad for Probability: basic axioms, independence, mutual exclusion, pairwise disjoint/partitions, total probability, Bayes' theorem, permutations and combinations.

Notepad for Discrete and Continuous Random Variables: for both the discrete and continuous cases, record the p.d.f., c.d.f., E(X), Var(X), m.g.f. and miscellaneous notes.

Discrete Random Variables 1: Discrete Uniform Distribution (pdf, cdf, mgf, mean, variance)
Discrete Random Variables 2: Bernoulli Distribution (pdf, cdf, mgf, mean, variance)
Discrete Random Variables 3: Binomial Distribution (pdf, cdf, mgf, mean, variance)
Discrete Random Variables 4: Poisson Distribution (pdf, cdf, mgf, mean, variance)
Continuous Random Variables 1: Uniform Distribution (pdf, cdf, mgf, mean, variance)
Continuous Random Variables 2: Exponential Distribution (pdf, cdf, mgf, mean, variance)
Continuous Random Variables 3: Normal Distribution (pdf, cdf, mgf, mean, variance)

Notepad for Multivariate Random Variables: joint distributions, marginal distributions, conditional distributions, covariance and correlation.
