
INSTITUTE OF MANAGEMENT TECHNOLOGY, GHAZIABAD

IMT

Financial Econometrics Term IV June-Aug 2011

Course Instructor: Dr. Kakali Kanjilal


Contents
1. Course Overview
2. Data & Basic Statistics
3. Data Classification
4. Analysis of Cross-section Data
5. Analysis of Qualitative Cross-section Data
6. Analysis of Time Series Data
7. Analysis of Cross-section and Time Series Data

Steps involved in econometric analysis


Business Problems
  1. Translate into a few hypotheses
  2. Identify parameters/attributes
  3. Inquire on the information/data
  4. Identify most suitable model/analytical tool

Estimate the model
  1. Is the estimation correct? (Specification tests & diagnostic checks)
  2. Is the model adequate?
  No: revisit the model and re-estimate.
  Yes: continue.

Tests of the hypotheses

Leverage the model for predictions & strategic advice

Source: Based on Maddala 2001

Data Handling by SAS


1. How to start SAS?
2. How do we get access to/bring in the data? (Import data file)
3. How do we know the contents of the data? (Proc Contents)
4. How can we see the data layout? (Proc Print)
5. How do we make sure the data is in the desired form?
   1. (Proc Means, Proc Univariate, Proc Freq)
6. Do we have all the information that is required?
   1. Drop/Keep
   2. Add/Create Variables
   3. Bring/Invoke additional Data
   4. Append?
   5. Merge/Join


Regression Analysis

What is Regression?
1. In everyday English it means drop, falling off, deterioration, etc.
2. F. Galton introduced the word "regression": a tendency to move (regress) towards the average.
3. In today's world, it means the statistical relationship of a dependent variable with independent variable(s) having fixed values. It predicts/estimates the average value of the dependent variable based on the known fixed values of the independent variables.
4. It does not necessarily mean causation. For example, the movements in domestic stock returns could be dependent on the movements in the global crude oil price; however, crude price movements do not necessarily cause the changes in stock returns.

A Regression Situation
Weekly consumption expenditure & income for 60 families, where for fixed values of X we get a different set of values for Y.

X = Weekly Family Income ($), Y = Weekly Family Consumption Expenditure ($)

X     Y values                          Total   Conditional Mean E(Y|X)
80    55   60   65   70   75              325     65
100   65   70   74   80   85   88         462     77
120   79   84   90   94   98              445     89
140   80   93   95   103  108  113  115   707     101
160   102  107  110  116  118  125        678     113
180   110  115  120  130  135  140        750     125
200   120  136  140  144  145             685     137
220   135  137  140  152  157  160  162   1043    149
240   137  145  155  165  175  189        966     161
260   150  152  175  178  180  185  191   1211    173

Can we estimate/predict a statistical relationship of Y (dependent) for given X (independent)?

Some other Examples
1. The demand for a product with respect to its advertising expenditure. This relationship makes it possible to analyze the % change in demand for a 1% change in advertising expenditure.
2. The stock return of a firm with respect to its sales and number of employees.
3. Price responsiveness/elasticity of the demand for a product.
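The conditional means in the table can be checked directly. The following is a minimal Python sketch, assuming the 60 observations are typed in exactly as tabulated; the names `expenditure_by_income` and `conditional_means` are illustrative and not part of the course material.

```python
# Weekly consumption expenditure Y ($) grouped by weekly family income X ($),
# copied from the 60-family table.
expenditure_by_income = {
    80: [55, 60, 65, 70, 75],
    100: [65, 70, 74, 80, 85, 88],
    120: [79, 84, 90, 94, 98],
    140: [80, 93, 95, 103, 108, 113, 115],
    160: [102, 107, 110, 116, 118, 125],
    180: [110, 115, 120, 130, 135, 140],
    200: [120, 136, 140, 144, 145],
    220: [135, 137, 140, 152, 157, 160, 162],
    240: [137, 145, 155, 165, 175, 189],
    260: [150, 152, 175, 178, 180, 185, 191],
}


def conditional_means(groups):
    """Return {X: E(Y|X)}: the plain average of Y within each income level."""
    return {x: sum(ys) / len(ys) for x, ys in groups.items()}


means = conditional_means(expenditure_by_income)
for income, mean_y in means.items():
    print(f"X = {income:3d}   E(Y|X) = {mean_y:6.1f}")
```

The printed means rise by 12 for every 20 of extra income, which is why they can be joined by a straight line.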

Population Regression

The line $E(Y \mid X_i) = f(X_i) = \beta_1 + \beta_2 X_i$ is a population regression function.
$\beta_1$ & $\beta_2$ are parameters; they represent the behaviour of the 60 families (the population) and are not observable.
It assumes a linear relationship.
On average, as income increases, expenditure also goes up, although individual family expenditure deviates (WHY?), with a tendency to cluster around the mean.
Deviation: $u_i = Y_i - E(Y \mid X_i)$, so that $Y_i = E(Y \mid X_i) + u_i$, with
1) Systematic/deterministic component = $E(Y \mid X_i)$
2) Error/unsystematic/random component = $u_i$

[Figure: Conditional distribution of expenditure for different levels of family income. The line $E(Y \mid X) = f(X_i)$ joins the conditional means; $\beta_1$ = intercept, $\beta_2$ = slope; the distribution of Y is marked for income $240.]

Sample Regression
Sample regression line: $Y_i = b_1 + b_2 X_i + e_i$, with fitted value $\hat{Y}_i = b_1 + b_2 X_i$.
$b_1$ & $b_2$ are statistics: sample characteristics that estimate the population parameters. Regression estimates vary from sample to sample.

How can we make sure sample estimates are a true representation of the population?

Samples from the population of 60 families

X     Y (1st)   Y (2nd)   Mean
80    55        70        65
100   88        65        77
120   90        90        89
140   80        95        101
160   118       110       113
180   120       115       125
200   145       120       137
220   135       140       149
240   145       155       161
260   175       150       173

[Figure: Sample regression lines for Y (1st), Y (2nd) and the conditional means, plotted against X.]
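To see how much the estimates move between samples, the sketch below fits a straight line to each of the two samples and to the conditional means. It assumes the least-squares formulas that the deck derives on its later slides; `fit_line` is a hypothetical helper written for this illustration.

```python
def fit_line(x, y):
    """Least-squares intercept and slope for a two-variable regression."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


income = [80, 100, 120, 140, 160, 180, 200, 220, 240, 260]
samples = {
    "Y (1st)": [55, 88, 90, 80, 118, 120, 145, 135, 145, 175],
    "Y (2nd)": [70, 65, 90, 95, 110, 115, 120, 140, 155, 150],
    "Mean": [65, 77, 89, 101, 113, 125, 137, 149, 161, 173],
}

# One (intercept, slope) pair per sample; the pairs differ across samples.
fits = {name: fit_line(income, y) for name, y in samples.items()}
for name, (b1, b2) in fits.items():
    print(f"{name:8s} b1 = {b1:6.2f}   b2 = {b2:.4f}")
```

Each sample gives a different intercept and slope, and neither coincides with the line through the conditional means.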

Two-Variable Regression Estimation


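[Figure: Scatter of Y against X with the sample regression line (SRL) $\hat{Y}_i = b_1 + b_2 X_i$; the residuals $e_1, e_2, e_3, e_4$ are the vertical distances of individual points from the line.]

PRL: $Y_i = \beta_1 + \beta_2 X_i + u_i$; latent, can be estimated from the SRL.
SRL: $Y_i = \hat{Y}_i + e_i = b_1 + b_2 X_i + e_i$, $i = 1, 2, 3, \ldots, n$
i.e. $e_i = Y_i - \hat{Y}_i$
i.e. $\sum e_i^2 = \sum (Y_i - \hat{Y}_i)^2$
Estimate $b_1$ and $b_2$ so that $\sum e_i^2$ (Residual Sum of Squares/RSS) is minimum.
$\partial \sum e_i^2 / \partial b_1 = 0$ & $\partial \sum e_i^2 / \partial b_2 = 0$ give the Ordinary Least Squares (OLS) estimates of $\beta_1$ & $\beta_2$.
$\beta_1$ & $\beta_2$ can also be estimated by the Maximum Likelihood Estimator (MLE). Under normally distributed errors, MLE and OLS give the same estimates of $\beta_1$ & $\beta_2$.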
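Setting the two partial derivatives of RSS to zero gives two linear equations in $b_1$ and $b_2$ (the normal equations). The sketch below solves them for the income and expenditure data used in the deck's worked example and then checks that moving away from the solution raises RSS. Variable names are illustrative only.

```python
income = [80, 100, 120, 140, 160, 180, 200, 220, 240, 260]
expenditure = [70, 65, 90, 95, 110, 115, 120, 140, 155, 150]

n = len(income)
sum_x = sum(income)
sum_y = sum(expenditure)
sum_xx = sum(x * x for x in income)
sum_xy = sum(x * y for x, y in zip(income, expenditure))

# Normal equations obtained from d(RSS)/d(b1) = 0 and d(RSS)/d(b2) = 0:
#   sum_y  = n * b1     + b2 * sum_x
#   sum_xy = b1 * sum_x + b2 * sum_xx
# Solved here with Cramer's rule.
det = n * sum_xx - sum_x ** 2
b2 = (n * sum_xy - sum_x * sum_y) / det
b1 = (sum_y * sum_xx - sum_x * sum_xy) / det


def rss(intercept, slope):
    """Residual sum of squares for a candidate line."""
    return sum((y - intercept - slope * x) ** 2
               for x, y in zip(income, expenditure))


print(f"b1 = {b1:.4f}, b2 = {b2:.4f}, RSS = {rss(b1, b2):.4f}")
print("RSS if b1 is shifted by 0.5:", round(rss(b1 + 0.5, b2), 4))
```

Any other choice of intercept or slope gives a larger RSS, which is what "least squares" means.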

Assumptions of OLS estimation


1. The relationship between the variables X & Y is linear; this implies linearity in parameters, not in variables. For example, $Y = \beta_1 + \beta_2 X$, $Y = \beta_1 + \beta_2 X^2$, $Y = \beta_1 + \beta_2 \ln(X)$ and $\ln(Y) = \beta_1 + \beta_2 \ln(X)$ are linear in the parameters, but NOT $Y = \beta_1 + \beta_2^2 X$.
2. The X's are non-stochastic variables whose values are fixed.
3. The error has zero expected value: $E(u) = 0$.
4. The error term has constant variance: $E(u^2) = \sigma^2$; homoscedastic.
5. Errors are statistically independent. Thus, $E(u_i u_j) = 0$ for all $i \neq j$; no autocorrelation.
6. The error term is normally distributed: $u \sim N(0, \sigma^2)$ & $Y \sim N(\beta_1 + \beta_2 X, \sigma^2)$.
7. $\sum u_i X_i = 0$; u & X are uncorrelated.
8. The X's are uncorrelated with each other; applicable in case of multiple regression (no multicollinearity).

OLS Estimates & its properties


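OLS Estimates
$b_2 = \sum (X_i - \bar{X})(Y_i - \bar{Y}) / \sum (X_i - \bar{X})^2 = \mathrm{Cov}(X, Y) / \mathrm{Var}(X) = \sum x_i y_i / \sum x_i^2$, where $x_i = X_i - \bar{X}$ and $y_i = Y_i - \bar{Y}$
$b_1 = \bar{Y} - b_2 \bar{X}$

Variance of $b_1$, $b_2$ & $e_i$
$\mathrm{Var}(b_2) = \sigma^2 / \sum (X_i - \bar{X})^2$
$\mathrm{Var}(b_1) = \sigma^2 \sum X_i^2 / \left( n \sum (X_i - \bar{X})^2 \right)$
$\hat{\sigma}^2 = \sum e_i^2 / (n - 2)$; $(n - 2)$ is called the degrees of freedom (dof), the number of independent observations, as we lose 2 dof to compute $b_1$ & $b_2$ in estimating $Y_i$.
$\mathrm{Cov}(b_1, b_2) = -\bar{X} \, \mathrm{Var}(b_2) = -\bar{X} \left( \sigma^2 / \sum (X_i - \bar{X})^2 \right)$

By the assumption of normality in Y and u, $b_1$ and $b_2$ also follow normal distributions:
$b_1 \sim N(\beta_1, \mathrm{Var}(b_1))$ with $E(b_1) = \beta_1$; $Z = (b_1 - \beta_1) / \sqrt{\mathrm{Var}(b_1)} \sim N(0, 1)$
$b_2 \sim N(\beta_2, \mathrm{Var}(b_2))$ with $E(b_2) = \beta_2$; $Z = (b_2 - \beta_2) / \sqrt{\mathrm{Var}(b_2)} \sim N(0, 1)$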
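The variance formulas can be evaluated once $\sigma^2$ is replaced by its estimate $\hat{\sigma}^2 = \sum e_i^2 / (n - 2)$. The following sketch does this for the income and expenditure data of the deck's worked example; it is an illustration under that substitution, and the variable names are not from the slides.

```python
import math

income = [80, 100, 120, 140, 160, 180, 200, 220, 240, 260]
expenditure = [70, 65, 90, 95, 110, 115, 120, 140, 155, 150]

n = len(income)
x_bar = sum(income) / n
y_bar = sum(expenditure) / n
s_xx = sum((x - x_bar) ** 2 for x in income)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(income, expenditure))

# OLS estimates in deviation form.
b2 = s_xy / s_xx
b1 = y_bar - b2 * x_bar

# Estimated error variance: RSS divided by the degrees of freedom (n - 2).
residuals = [y - (b1 + b2 * x) for x, y in zip(income, expenditure)]
sigma2_hat = sum(e * e for e in residuals) / (n - 2)

# Estimated variances, covariance and standard errors of the estimates.
var_b2 = sigma2_hat / s_xx
var_b1 = sigma2_hat * sum(x * x for x in income) / (n * s_xx)
cov_b1_b2 = -x_bar * var_b2
se_b1 = math.sqrt(var_b1)
se_b2 = math.sqrt(var_b2)

print(f"sigma^2 (est) = {sigma2_hat:.4f}")
print(f"Var(b1) = {var_b1:.4f}, SE(b1) = {se_b1:.4f}")
print(f"Var(b2) = {var_b2:.6f}, SE(b2) = {se_b2:.6f}")
print(f"Cov(b1, b2) = {cov_b1_b2:.5f}")
```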

Goodness of Fit of estimated model


$r^2$ (coefficient of determination, in case of two-variable/bivariate regression) and $R^2$ (in case of more than two variables/multiple regression) measure how closely the sample regression line fits the data. $r^2$ is the overlapping portion of the variation in Y and the variation in X.

[Figure: Overlapping circles for Y and X showing four cases: $r^2 = 0$ (no overlap), $r^2$ between 0 and 1, $r^2$ close to 1, and $r^2 = 1$ (Y = X, complete overlap).]

$r^2$ lies between 0 and 1. The higher the $r^2$, the better the estimated model $\hat{Y}$ fits the data.

Goodness of Fit: Decompose Variation


Total Variation = Estimated/Explained part + Error part

[Figure: For an observation $Y_i$ at $X_i$: Total = $(Y_i - \bar{Y})$; due to regression = $(\hat{Y}_i - \bar{Y})$; due to error = $e_i = Y_i - \hat{Y}_i$.]

or, TSS = ESS + RSS
or, $\sum (Y_i - \bar{Y})^2 = \sum (\hat{Y}_i - \bar{Y})^2 + \sum (Y_i - \hat{Y}_i)^2$
or, $r^2 = \sum (\hat{Y}_i - \bar{Y})^2 / \sum (Y_i - \bar{Y})^2 = 1 - \sum e_i^2 / \sum (Y_i - \bar{Y})^2$
or, $r^2$ = % of total variation in Y explained by the estimated regression model.
or, $r^2$ can be written as $\left( \sum (X_i - \bar{X})(Y_i - \bar{Y}) \right)^2 / \left( \sum (X_i - \bar{X})^2 \sum (Y_i - \bar{Y})^2 \right) = \mathrm{Cov}(X, Y)^2 / \left( \mathrm{Var}(X) \, \mathrm{Var}(Y) \right)$

If the regression equation explains none of the variation of $Y_i$ (i.e. no relationship between X & Y), $r^2$ will be zero.
If the equation explains all the variation, $r^2$ will be one.
A low $r^2$ would be indicative of a rather poor fit.
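The decomposition can be verified numerically. The sketch below computes TSS, ESS and RSS for the income and expenditure data of the deck's worked example and evaluates $r^2$ in the three equivalent ways listed above; the variable names are illustrative.

```python
income = [80, 100, 120, 140, 160, 180, 200, 220, 240, 260]
expenditure = [70, 65, 90, 95, 110, 115, 120, 140, 155, 150]

n = len(income)
x_bar = sum(income) / n
y_bar = sum(expenditure) / n
s_xx = sum((x - x_bar) ** 2 for x in income)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(income, expenditure))

b2 = s_xy / s_xx
b1 = y_bar - b2 * x_bar
fitted = [b1 + b2 * x for x in income]

tss = sum((y - y_bar) ** 2 for y in expenditure)              # total
ess = sum((f - y_bar) ** 2 for f in fitted)                   # explained
rss = sum((y - f) ** 2 for y, f in zip(expenditure, fitted))  # residual

r2_explained = ess / tss
r2_residual = 1 - rss / tss
r2_covariance = s_xy ** 2 / (s_xx * tss)

print(f"TSS = {tss:.4f}, ESS = {ess:.4f}, RSS = {rss:.4f}")
print(f"r^2 = {r2_explained:.6f}, r = {r2_explained ** 0.5:.4f}")
```

The three expressions agree up to floating-point rounding, and the square root of $r^2$ matches the correlation between X and Y.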

Example: Actual Plots


Data on Weekly Family Consumption Expenditure and Weekly Family Income
Y = Weekly Family Consumption Expenditure, $
X = Weekly Family Income, $

Y     X
70    80
65    100
90    120
95    140
110   160
115   180
120   200
140   220
155   240
150   260

Corr = 0.98

[Figure: Scatter plot of Family Consumption Expenditure (vertical axis) against Family Income (horizontal axis).]

The plot shows a positive relationship, so the coefficient of family income is expected to have a positive sign.

Regression Results
Y = Weekly Family Consumption Expenditure, $
X = Weekly Family Income, $

OLS Estimates
$b_1$                      24.45
$b_2$                      0.51
$\mathrm{Var}(b_1)$        41.14
$SE(b_1)$                  6.41
$\mathrm{Var}(b_2)$        0.0013
$SE(b_2)$                  0.04
$\hat{\sigma}^2$           42.16

Goodness of fit and test of hypothesis $b_2 = 0$
$r^2$                      0.9621
$r$                        0.9808
$t = b_2 / SE(b_2)$        14.243
$t = b_1 / SE(b_1)$        3.813
dof $(n - 2)$              8
Significance level         0.05 (5%)
Critical t                 2.306
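The test of $b_2 = 0$ compares the t-statistic with the critical value. The sketch below is a hedged illustration: it assumes the unrounded coefficients and standard errors that the E-views output reports for this data set, and it takes the critical value 2.306 (8 dof, 5% level) from the results above instead of computing it. The helper `t_statistic` is hypothetical.

```python
# Unrounded estimates and standard errors as reported in the E-views output
# for this data set; the critical value 2.306 (8 dof, 5% level) is taken
# from the regression results rather than computed here.
estimates = {
    "b1 (intercept)": (24.45455, 6.413817),
    "b2 (income)": (0.509091, 0.035743),
}
T_CRITICAL = 2.306


def t_statistic(coefficient, std_error, hypothesised=0.0):
    """t = (estimate - hypothesised value) / standard error."""
    return (coefficient - hypothesised) / std_error


results = {}
for name, (coef, se) in estimates.items():
    t = t_statistic(coef, se)
    results[name] = (t, abs(t) > T_CRITICAL)  # True means "reject H0"
    print(f"{name:15s} t = {t:7.3f}   reject H0 at 5%: {abs(t) > T_CRITICAL}")
```

Both t-statistics exceed 2.306 in absolute value, so both hypotheses of a zero coefficient are rejected at the 5% level.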

Actual vs Predicted Expenditure


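[Figure: Actual $Y_i$ and predicted $\hat{Y}_i$ expenditure plotted against income X, with residuals $e_i$ marked; intercept 24.5, slope 0.51, and the line passes through the point of means ($\bar{X} = 170$, $\bar{Y} = 111$).]

$\hat{Y}_i = 24.45 + 0.51 X_i$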
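The fitted line can be used to tabulate actual against predicted expenditure. This is a minimal sketch that assumes the rounded coefficients 24.45 and 0.51 shown on the slide, so the residuals differ slightly from those of the unrounded fit; `predict` is an illustrative name.

```python
B1, B2 = 24.45, 0.51  # rounded estimates from the fitted line on the slide


def predict(income):
    """Predicted weekly consumption expenditure for a given weekly income."""
    return B1 + B2 * income


income = [80, 100, 120, 140, 160, 180, 200, 220, 240, 260]
actual = [70, 65, 90, 95, 110, 115, 120, 140, 155, 150]

# (X, actual Y, predicted Y, residual) for every observation.
rows = [(x, y, predict(x), y - predict(x)) for x, y in zip(income, actual)]

print("  X   actual  predicted  residual")
for x, y, y_hat, e in rows:
    print(f"{x:3d}  {y:6d}  {y_hat:9.2f}  {e:8.2f}")
```

For example, `predict(170)` returns about 111.15, close to the sample mean of Y, because the unrounded line passes exactly through the point of means.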

Example: Software application: E-views


Dependent Variable: Y: Consumption Expenditure
Method: Least Squares
Date: 05/20/11   Time: 15:01
Sample: 1 10
Included observations: 10

Variable     Coefficient   Std. Error   t-Statistic   Prob.
C            24.45455      6.413817     3.812791      0.0051
X/Income     0.509091      0.035743     14.24317      0.0000

R-squared            0.962062     Mean dependent var      111.0000
Adjusted R-squared   0.957319     S.D. dependent var      31.42893
S.E. of regression   6.493003     Akaike info criterion   6.756184
Sum squared resid    337.2727     Schwarz criterion       6.816701
Log likelihood       -31.78092    Hannan-Quinn criter.    6.689797
F-statistic          202.8679     Durbin-Watson stat      2.680127
Prob(F-statistic)    0.000001
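Several of the summary statistics in this output follow from one another. The sketch below recomputes four of them from the reported R-squared, sum of squared residuals and slope t-statistic, using standard textbook relationships for a regression with an intercept and one regressor; it is a consistency check, not a description of how E-views computes them internally.

```python
import math

n, k = 10, 2                  # observations, estimated coefficients
r_squared = 0.962062          # reported R-squared
sum_squared_resid = 337.2727  # reported sum of squared residuals
t_slope = 14.24317            # reported t-statistic of X/Income

# S.E. of regression: square root of RSS / (n - k).
se_regression = math.sqrt(sum_squared_resid / (n - k))

# Adjusted R-squared penalises R-squared for the number of coefficients.
adj_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - k)

# With a single regressor the F-statistic equals the squared slope t-statistic.
f_statistic = t_slope ** 2

# S.D. of the dependent variable, from TSS = RSS / (1 - R-squared).
tss = sum_squared_resid / (1 - r_squared)
sd_dependent = math.sqrt(tss / (n - 1))

print(f"S.E. of regression = {se_regression:.6f}")
print(f"Adjusted R-squared = {adj_r_squared:.6f}")
print(f"F-statistic        = {f_statistic:.4f}")
print(f"S.D. dependent var = {sd_dependent:.4f}")
```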

Example: Software application: SAS


Business Problem:
1. Understand the factors driving the performance of pharma companies
2. Predict their performance in future
