Module 2
Regression Concepts: Two-variable Models
(Ch 2 Brooks)

Learning Objectives

- Recognise the concepts and basics of OLS regression analysis
- Understand the meaning of regression outcomes and their limitations
- Appreciate the CLRM assumptions and the BLUE properties of OLS estimators
- Understand hypothesis testing and the significance of OLS parameters
- Fully understand the return series used in financial research
- Apply and interpret the two-variable regression model on the classical market model (a CAPM application) using either the SAS or EViews programs
- Describe R² in the context of traditional regression


Regression

Regression is probably the single most important tool at the econometrician's disposal.

But what is regression analysis?

It is concerned with describing and evaluating the relationship between a given variable (usually called the dependent variable) and one or more other variables (usually known as the independent variable(s)).


Some Notation

Denote the dependent variable by y and the independent variable(s) by x1, x2, ..., xk, where there are k independent variables.

Some alternative names for the y and x variables:

    y                      x
    dependent variable     independent variables
    regressand             regressors
    effect variable        causal variables
    explained variable     explanatory variables

Note that there can be many x variables, but we will limit ourselves to the case where there is only one x variable to start with. In our set-up, there is only one y variable.


Regression is Different from Correlation

If we say y and x are correlated, it means that we are treating y and x in a completely symmetrical way.

In regression, we treat the dependent variable (y) and the independent variable(s) (the x's) very differently. The y variable is assumed to be random or stochastic in some way, i.e. to have a probability distribution. The x variables are, however, assumed to have fixed (non-stochastic) values in repeated samples.


Simple Regression

For simplicity, say k = 1. This is the situation where y depends on only one x variable.

CAPM: an example of the most classical two-variable relationship in finance: how asset returns vary with their level of market risk.

Recall the CAPM:

    r_{i,t} = r_{f,t} + \beta_i (r_{m,t} - r_{f,t})

(And don't forget the expectations terms, for political correctness.) If we assume that the risk-free rate is quite stable over time, then this reduces to the widely used market model:

    r_{i,t} = \alpha_i + \beta_i r_{m,t} + u_{i,t}

If the CAPM works, we expect the alpha term (the intercept) to be insignificantly different from zero (no excess return beyond what is explained by risk) and the beta term to be significantly positive (the higher the risk, the higher the expected return). The market model is also widely used (and accepted) for obtaining the market beta of specific stocks.

Nevertheless, keep in mind Roll's critique (e.g., it is next to impossible to find a proxy for the market portfolio which truly represents the universe of assets). That is, one cannot really test the CAPM!

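As an aside (standard CAPM background, not spelled out on the slide), the CAPM is really a statement about expected returns, with beta defined as the asset's scaled covariance with the market:

    E[r_i] = r_f + \beta_i \left( E[r_m] - r_f \right), \qquad \beta_i = \frac{\mathrm{Cov}(r_i, r_m)}{\mathrm{Var}(r_m)}

Under the stable-risk-free-rate assumption above, the OLS slope in the market model is exactly this covariance-over-variance ratio, computed in-sample.
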
Security Characteristic Line

[Figure: the security characteristic line, i.e. the fitted market-model regression line]

Simple Regression: An Example

Suppose that we have the following data on the excess returns on a fund manager's portfolio (fund XXX) together with the excess returns on a market index:

    Year, t    Excess return          Excess return on market index
               = r_XXX,t - r_f,t      = r_m,t - r_f,t
    1          17.8                   13.7
    2          39.0                   23.2
    3          12.8                    6.9
    4          24.2                   16.8
    5          17.2                   12.3

We have some intuition that the beta on this fund is positive, and we therefore want to find whether there appears to be a relationship between x and y given the data that we have. The first stage would be to form a scatter plot of the two variables.

Graph (Scatter Diagram)

[Scatter plot: excess return on fund XXX (y-axis, 0 to 45) against excess return on the market portfolio (x-axis, 0 to 25)]

Finding a Line of Best Fit

We can use the general equation for a straight line,

    y = a + bx

to get the line that best fits the data.
- a is the intercept (i.e. the value of y when x = 0)
- b is the slope of the relationship (i.e. how much y increases per unit increase in x)

However, this equation (y = a + bx) is completely deterministic. Is this realistic? No. So what we do is add a random disturbance term, u, into the equation:

    y_t = \alpha + \beta x_t + u_t,  where t = 1, 2, 3, 4, 5

Why Do We Include a Disturbance Term?

The disturbance term can capture a number of features:

- We always leave out some determinants of y_t
- There may be errors in the measurement of y_t that cannot be modelled
- There are random outside influences on y_t which we cannot model

Determining the Regression Coefficients

So how do we determine what α and β are? Choose α and β so that the (vertical) distances from the data points to the fitted line are minimised (so that the line fits the data as closely as possible):

[Figure: scatter of data points around a fitted straight line in the (x, y) plane]

Ordinary Least Squares

The most common method used to fit a line to the data is known as OLS (ordinary least squares).

What we actually do is take each distance and square it (i.e. take the area of each of the squares in the diagram) and minimise the total sum of the squares (hence "least squares").

Tightening up the notation, let
    y_t  denote the actual data point t,
    \hat{y}_t  denote the fitted value from the regression line, and
    \hat{u}_t  denote the residual, y_t - \hat{y}_t.

Actual and Fitted Value

[Figure: for a given x_i, the actual value y_i, the fitted value \hat{y}_i on the regression line, and the residual \hat{u}_i = y_i - \hat{y}_i]

How OLS Works

So minimise \hat{u}_1^2 + \hat{u}_2^2 + \hat{u}_3^2 + \hat{u}_4^2 + \hat{u}_5^2, or, more compactly, minimise

    \sum_{t=1}^{5} \hat{u}_t^2

This is known as the residual sum of squares (RSS).

But what was \hat{u}_t? It was the difference between the actual point and the line, y_t - \hat{y}_t.

So minimising \sum_t \hat{u}_t^2 is equivalent to minimising

    \sum_t (y_t - \hat{y}_t)^2

with respect to \hat{\alpha} and \hat{\beta}.

Deriving the OLS Estimator

But \hat{y}_t = \hat{\alpha} + \hat{\beta} x_t, so let

    L = \sum_t (y_t - \hat{y}_t)^2 = \sum_t (y_t - \hat{\alpha} - \hat{\beta} x_t)^2

We want to minimise L with respect to (w.r.t.) \hat{\alpha} and \hat{\beta}, so differentiate L w.r.t. \hat{\alpha} and \hat{\beta}:

    \frac{\partial L}{\partial \hat{\alpha}} = -2 \sum_t (y_t - \hat{\alpha} - \hat{\beta} x_t) = 0    (1)

    \frac{\partial L}{\partial \hat{\beta}} = -2 \sum_t x_t (y_t - \hat{\alpha} - \hat{\beta} x_t) = 0    (2)

From (1),

    \sum_t (y_t - \hat{\alpha} - \hat{\beta} x_t) = 0, \quad \text{i.e.} \quad \sum_t y_t - T\hat{\alpha} - \hat{\beta} \sum_t x_t = 0

But \sum_t y_t = T\bar{y} and \sum_t x_t = T\bar{x}.

Deriving the OLS Estimator (cont'd)

So we can write

    T\bar{y} - T\hat{\alpha} - T\hat{\beta}\bar{x} = 0, \quad \text{or} \quad \bar{y} - \hat{\alpha} - \hat{\beta}\bar{x} = 0    (3)

From (2),

    \sum_t x_t y_t - \hat{\alpha} \sum_t x_t - \hat{\beta} \sum_t x_t^2 = 0    (4)

From (3),

    \hat{\alpha} = \bar{y} - \hat{\beta}\bar{x}    (5)

Substituting into (4) for \hat{\alpha} from (5):

    \sum_t x_t y_t - (\bar{y} - \hat{\beta}\bar{x}) T\bar{x} - \hat{\beta} \sum_t x_t^2 = 0

    \sum_t x_t y_t - T\bar{x}\bar{y} + \hat{\beta} T\bar{x}^2 - \hat{\beta} \sum_t x_t^2 = 0

Deriving the OLS Estimator (cont'd)

Rearranging for \hat{\beta},

    \hat{\beta} \left( T\bar{x}^2 - \sum_t x_t^2 \right) = T\bar{x}\bar{y} - \sum_t x_t y_t

So overall we have

    \hat{\beta} = \frac{\sum_t x_t y_t - T\bar{x}\bar{y}}{\sum_t x_t^2 - T\bar{x}^2} \qquad \text{and} \qquad \hat{\alpha} = \bar{y} - \hat{\beta}\bar{x}

This method of finding the optimum is known as ordinary least squares.

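A useful equivalent form of the slope estimator (a standard identity, added here because it reappears in the standard error formulae later) writes everything in deviations from the sample means:

    \hat{\beta} = \frac{\sum_t (x_t - \bar{x})(y_t - \bar{y})}{\sum_t (x_t - \bar{x})^2}

i.e. the sample covariance between x and y divided by the sample variance of x.
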

What Do We Use α̂ and β̂ For?

In the CAPM example used above, plugging the 5 observations into the formulae given above leads to the estimates α̂ = -1.74 and β̂ = 1.64. We would write the fitted line as:

    \hat{y}_t = -1.74 + 1.64 x_t

Question: If an analyst tells you that she expects the market to yield a return 20% higher than the risk-free rate next year, what would you expect the return on fund XXX to be?

Solution: We can say that the expected value of y = -1.74 + 1.64 × value of x, so plug x = 20 into the equation to get the expected value of y:

    \hat{y}_i = -1.74 + 1.64 \times 20 = 31.06

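Those estimates are easy to reproduce in SAS. Below is a minimal sketch (not part of the original slides; the dataset and variable names are made up for illustration) that enters the five observations and fits the line with PROC REG:

/* Sketch only: reproduce alpha-hat = -1.74 and beta-hat = 1.64
   from the five fund XXX observations (hypothetical dataset name). */
data fundxxx;
   input x y;           /* x = excess market return, y = excess fund return */
   datalines;
13.7 17.8
23.2 39.0
 6.9 12.8
16.8 24.2
12.3 17.2
;
run;

proc reg data=fundxxx;
   model y = x;         /* fits y = alpha + beta*x by OLS */
run;

The Parameter Estimates table should show an intercept of about -1.74 and a slope of about 1.64, matching the hand calculation above.
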
Accuracy of the Intercept Estimate

Care needs to be exercised when considering the intercept estimate, particularly if there are no or few observations close to the y-axis:

[Figure: data clustered far from the y-axis, so the fitted line's value at x = 0 is poorly pinned down]


The Population and the Sample

The population is the total collection of all objects or people to be studied. For example:

    Interested in                    Population of interest
    predicting the outcome           the entire electorate
    of an election

A sample is a selection of just some items from the population.

A random sample is a sample in which each individual item in the population is equally likely to be drawn.


The DGP and the PRF

The population regression function (PRF) is a description of the model that is thought to be generating the actual data and of the true relationship between the variables (i.e. the true values of α and β).

The PRF is

    y_t = \alpha + \beta x_t + u_t

The SRF is

    \hat{y}_t = \hat{\alpha} + \hat{\beta} x_t

and we also know that \hat{u}_t = y_t - \hat{y}_t.

We use the SRF to infer likely values of the PRF.

We also want to know how good our estimates of α and β are.

The Assumptions Underlying the Classical Linear Regression Model (CLRM)

The model which we have used is known as the classical linear regression model. We observe data for x_t, but since y_t also depends on u_t, we must be specific about how the u_t are generated.

We usually make the following set of assumptions about the u_t's (the unobservable error terms):

    Technical notation           Interpretation
    1. E(u_t) = 0                The errors have zero mean
    2. Var(u_t) = σ²             The variance of the errors is constant and finite
                                 over all values of x_t
    3. Cov(u_i, u_j) = 0         The errors are statistically independent of one another
    4. Cov(u_t, x_t) = 0         There is no relationship between the error and the
                                 corresponding x variate

The Assumptions Underlying the CLRM Again

An alternative assumption to 4., which is slightly stronger, is that the x_t's are non-stochastic or fixed in repeated samples.

A fifth assumption is required if we want to make inferences about the population parameters (the actual α and β) from the sample parameters (α̂ and β̂).

Additional assumption:
5. u_t is normally distributed


Properties of the OLS Estimator

If assumptions 1. through 4. hold, then the estimators α̂ and β̂ determined by OLS are known as Best Linear Unbiased Estimators (BLUE). What does the acronym stand for?

- Estimator: β̂ is an estimator of the true value of β.
- Linear: β̂ is a linear estimator.
- Unbiased: on average, the actual values of α̂ and β̂ will be equal to the true values.
- Best: the OLS estimator β̂ has minimum variance among the class of linear unbiased estimators. The Gauss-Markov theorem proves that the OLS estimator is best.


Precision and Standard Errors

Any set of regression estimates α̂ and β̂ is specific to the sample used in its estimation.

Recall that the estimators of α and β from the sample are given by

    \hat{\beta} = \frac{\sum_t x_t y_t - T\bar{x}\bar{y}}{\sum_t x_t^2 - T\bar{x}^2} \qquad \text{and} \qquad \hat{\alpha} = \bar{y} - \hat{\beta}\bar{x}

What we need is some measure of the reliability or precision of the estimators α̂ and β̂. The precision of an estimate is given by its standard error. Given assumptions 1-4 above, the standard errors can be shown to be given by

    SE(\hat{\alpha}) = s \sqrt{\frac{\sum_t x_t^2}{T \sum_t (x_t - \bar{x})^2}} = s \sqrt{\frac{\sum_t x_t^2}{T \left( \sum_t x_t^2 - T\bar{x}^2 \right)}}

    SE(\hat{\beta}) = s \sqrt{\frac{1}{\sum_t (x_t - \bar{x})^2}} = s \sqrt{\frac{1}{\sum_t x_t^2 - T\bar{x}^2}}

where s is the estimated standard deviation of the residuals.

Estimating the Variance of the Disturbance Term

The variance of the random variable u_t is given by

    Var(u_t) = E[(u_t - E(u_t))^2]

which, since E(u_t) = 0, reduces to

    Var(u_t) = E(u_t^2)

We could estimate this using the average of the squared u_t:

    s^2 = \frac{1}{T} \sum_t u_t^2

Unfortunately this is not workable, since u_t is not observable. We can use the sample counterpart to u_t, which is \hat{u}_t:

    s^2 = \frac{1}{T} \sum_t \hat{u}_t^2

But this estimator is a biased estimator of σ².

Estimating the Variance of the Disturbance Term (cont'd)

An unbiased estimator of σ is given by

    s = \sqrt{\frac{\sum_t \hat{u}_t^2}{T - 2}}

where \sum_t \hat{u}_t^2 is the residual sum of squares and T is the sample size.

Some comments on the standard error estimators:

1. Both SE(α̂) and SE(β̂) depend on s² (or s). The greater the variance s², the more dispersed the errors are about their mean value, and therefore the more dispersed y will be about its mean value.

2. The sum of the squares of x about their mean appears in both formulae. The larger the sum of squares, the smaller the coefficient variances.

Some Comments on the Standard Error Estimators (cont'd)

Consider what happens if \sum_t (x_t - \bar{x})^2 is small or large:

[Figure: two scatter plots; x values tightly clustered around \bar{x} (small \sum_t (x_t - \bar{x})^2) pin the fitted line down poorly, while widely spread x values pin it down precisely]

Some Comments on the Standard Error Estimators (cont'd)

3. The larger the sample size, T, the smaller will be the coefficient variances. T appears explicitly in SE(α̂) and implicitly in SE(β̂). T appears implicitly since the sum \sum_t (x_t - \bar{x})^2 runs from t = 1 to T.

4. The term \sum_t x_t^2 appears in SE(α̂). The reason is that \sum_t x_t^2 measures how far the points are away from the y-axis.

Example: How to Calculate the Parameters and Standard Errors

Assume we have the following data calculated from a regression of y on a single variable x and a constant over 22 observations.

Data:

    T = 22,  \sum x_t y_t = 830102,  \bar{x} = 416.5,  \bar{y} = 86.65,
    \sum x_t^2 = 3919654,  RSS = 130.6

Calculations:

    \hat{\beta} = \frac{830102 - 22 \times 416.5 \times 86.65}{3919654 - 22 \times (416.5)^2} = 0.35

    \hat{\alpha} = 86.65 - 0.35 \times 416.5 = -59.12

We write \hat{y}_t = \hat{\alpha} + \hat{\beta} x_t, i.e.

    \hat{y}_t = -59.12 + 0.35 x_t

Example (cont'd)

SE(regression):

    s = \sqrt{\frac{\sum_t \hat{u}_t^2}{T - 2}} = \sqrt{\frac{130.6}{20}} = 2.55

    SE(\hat{\alpha}) = 2.55 \times \sqrt{\frac{3919654}{22 \left( 3919654 - 22 \times 416.5^2 \right)}} = 3.35

    SE(\hat{\beta}) = 2.55 \times \sqrt{\frac{1}{3919654 - 22 \times 416.5^2}} = 0.0079

We now write the results as

    \hat{y}_t = -59.12 + 0.35 x_t
                (3.35)   (0.0079)


An Introduction to Statistical Inference

We want to make inferences about the likely population values from the regression parameters.

Example: Suppose we have the following regression results:

    \hat{y}_t = 20.3 + 0.5091 x_t
                (14.38)  (0.2561)

β̂ = 0.5091 is a single (point) estimate of the unknown population parameter β. How reliable is this estimate?

The reliability of the point estimate is measured by the coefficient's standard error.


Hypothesis Testing: Some Concepts

We can use the information in the sample to make inferences about the population. We will always have two hypotheses that go together: the null hypothesis (denoted H0) and the alternative hypothesis (denoted H1).

The null hypothesis is the statement or statistical hypothesis that is actually being tested. The alternative hypothesis represents the remaining outcomes of interest.

For example, suppose that, given the regression results above, we are interested in the hypothesis that the true value of β is in fact 0.5. We would use the notation

    H0: β = 0.5
    H1: β ≠ 0.5

This would be known as a two-sided test.

In most contexts, though, we deal with the hypothesis of whether our regression coefficients are statistically different from 0 (i.e. whether there is any true relationship between the tested variables):

    H0: β = 0
    H1: β ≠ 0

One-Sided Hypothesis Tests

Sometimes we may have some prior information that, for example, we would expect β > 0.5 rather than β < 0.5. In this case, we would do a one-sided test:

    H0: β = 0.5
    H1: β > 0.5

or we could have had

    H0: β = 0.5
    H1: β < 0.5

There are two ways to conduct a hypothesis test: via the test of significance approach or via the confidence interval approach.


The Probability Distribution of the Least Squares Estimators

We assume that u_t ~ N(0, σ²).

The least squares estimators are linear combinations of the random variables, i.e.

    \hat{\beta} = \sum_t w_t y_t

The weighted sum of normal random variables is also normally distributed, so

    \hat{\alpha} \sim N(\alpha, Var(\hat{\alpha}))
    \hat{\beta} \sim N(\beta, Var(\hat{\beta}))

What if the errors are not normally distributed? Will the parameter estimates still be normally distributed? Yes, if the other assumptions of the CLRM hold and the sample size is sufficiently large.

The Probability Distribution of the Least Squares Estimators (cont'd)

Standard normal variates can be constructed from α̂ and β̂:

    \frac{\hat{\alpha} - \alpha}{\sqrt{Var(\alpha)}} \sim N(0, 1) \qquad \text{and} \qquad \frac{\hat{\beta} - \beta}{\sqrt{Var(\beta)}} \sim N(0, 1)

But Var(α) and Var(β) are unknown, so the t-statistics are

    \frac{\hat{\alpha} - \alpha}{SE(\hat{\alpha})} \sim t_{T-2} \qquad \text{and} \qquad \frac{\hat{\beta} - \beta}{SE(\hat{\beta})} \sim t_{T-2}

Note: In most cases we deal with H0: β = 0 vs H1: β ≠ 0 (and likewise for the alphas). The t-statistic then reduces to the ratio of the estimated coefficient to its standard error.


Testing Hypotheses: The Test of Significance Approach

Assume the regression equation is given by

    y_t = \alpha + \beta x_t + u_t, \quad t = 1, 2, \ldots, T

The steps involved in doing a test of significance are:

1. Estimate α̂, β̂ and SE(α̂), SE(β̂) in the usual way.

2. Calculate the test statistic. This is given by the formula

    \text{test statistic} = \frac{\hat{\beta} - \beta^*}{SE(\hat{\beta})}

where β* is the value of β under the null hypothesis.

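As a quick worked instance (arithmetic added here for illustration, using the regression quoted earlier), testing H0: β = 0.5 against H1: β ≠ 0.5 with β̂ = 0.5091 and SE(β̂) = 0.2561 gives

    \text{test statistic} = \frac{0.5091 - 0.5}{0.2561} \approx 0.036

which is far below any conventional critical value, so this null would not be rejected.
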
The Test of Significance Approach (cont'd)

3. We need some tabulated distribution with which to compare the estimated test statistic. Test statistics derived in this way can be shown to follow a t-distribution with T - 2 degrees of freedom. (The degrees of freedom, df, measure the number of pieces of information on which the precision of a parameter estimate is based; T is the number of observations, and 2 is for the two parameter estimates, alpha and beta.) As the number of degrees of freedom increases, we need to be less cautious in our approach, since we can be more sure that our results are robust.

4. We need to choose a significance level, often denoted α. This is also sometimes called the size of the test, and it determines the region where we will reject or not reject the null hypothesis that we are testing. It is conventional to use a significance level of 5%. The intuitive explanation is that we would only expect a result as extreme as this, or more extreme, 5% of the time as a consequence of chance alone. A 5% size of test is conventional, but 10% and 1% are also commonly used.

Determining the Rejection Region for a Test of Significance

5. Given a significance level, we can determine a rejection region and a non-rejection region. For a 2-sided test:

[Figure: density f(x) with a 95% non-rejection region in the centre and 2.5% rejection regions in each tail]

The Rejection Region for a 1-Sided Test (Upper Tail)

[Figure: density f(x) with a 95% non-rejection region and a 5% rejection region in the upper tail]

The Rejection Region for a 1-Sided Test (Lower Tail)

[Figure: density f(x) with a 95% non-rejection region and a 5% rejection region in the lower tail]

The Test of Significance Approach: Drawing Conclusions

6. Use the t-tables to obtain a critical value or values with which to compare the test statistic.

7. Finally, perform the test. If the test statistic lies in the rejection region, then reject the null hypothesis (H0); otherwise, do not reject H0.


A Note on the t and the Normal Distributions

You should all be familiar with the normal distribution and its characteristic bell shape.

We can scale a normal variate to have zero mean and unit variance by subtracting its mean and dividing by its standard deviation.

There is, however, a specific relationship between the t-distribution and the standard normal distribution. Both are symmetrical and centred on zero. The t-distribution has another parameter, its degrees of freedom. We will always know this (for the time being, it is the number of observations minus 2).

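In symbols, the scaling described above is (a standard identity, written out here for completeness): if x ~ N(μ, σ²), then

    z = \frac{x - \mu}{\sigma} \sim N(0, 1)
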

What Does the t-Distribution Look Like?

[Figure: the t-distribution has the same bell shape as the normal distribution but with fatter tails]

Comparing the t and the Normal Distributions

In the limit, a t-distribution with an infinite number of degrees of freedom is a standard normal, i.e.

    t(\infty) = N(0, 1)

Examples of critical values from statistical tables:

    Significance level    N(0,1)    t(40)    t(4)
    50%                   0         0        0
    5%                    1.64      1.68     2.13
    2.5%                  1.96      2.02     2.78
    0.5%                  2.57      2.70     4.60

The reason for using the t-distribution rather than the standard normal is that we had to estimate σ², the variance of the disturbances.

Changing the Size of the Test

But note that we looked only at a 5% size of test. In marginal cases (e.g. H0: β = 1), we may get a completely different answer if we use a different size of test. This is where the test of significance approach is better than a confidence interval.

For example, say we wanted to use a 10% size of test. Using the test of significance approach,

    \text{test statistic} = \frac{\hat{\beta} - \beta^*}{SE(\hat{\beta})} = \frac{0.5091 - 1}{0.2561} = -1.917

as above. The only thing that changes is the critical t-value.

Changing the Size of the Test: The New Rejection Regions

[Figure: density f(x) with 5% rejection regions in each tail, cut off at -1.725 and +1.725]

Changing the Size of the Test: The Conclusion

t(20; 10%) = 1.725. So now, since the test statistic (-1.917) lies in the rejection region, we would reject H0.

Caution should therefore be used when placing emphasis on, or making decisions in, marginal cases (i.e. cases where we only just reject or do not reject).

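The whole test is easy to script. Below is a minimal SAS sketch (not part of the original slides; the dataset and variable names are made up) that reproduces this marginal-case arithmetic, using the TINV function for the critical values:

/* Sketch only: test H0: beta = 1 against a two-sided alternative using the
   quoted estimates beta-hat = 0.5091, SE = 0.2561, and T - 2 = 20 df. */
data beta_test;
   beta_hat = 0.5091;  se = 0.2561;  beta0 = 1;  df = 20;
   tstat    = (beta_hat - beta0) / se;     /* = -1.917 */
   tcrit5   = tinv(0.975, df);             /* 5% two-sided critical value (2.086) */
   tcrit10  = tinv(0.95,  df);             /* 10% two-sided critical value (1.725) */
   reject5  = (abs(tstat) > tcrit5);       /* 0: do not reject at 5% */
   reject10 = (abs(tstat) > tcrit10);      /* 1: reject at 10% */
   put tstat= tcrit5= reject5= tcrit10= reject10=;
run;

The log should confirm the point of this slide: the same statistic is not rejected at the 5% size but is rejected at 10%.
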
Some More Terminology

If we reject the null hypothesis at the 5% level, we say that the result of the test is statistically significant.

Note that a statistically significant result may be of no practical significance. E.g. if a shipment of cans of beans is expected to weigh 450g per tin, but the actual mean weight of some tins is 449g, the result may be highly statistically significant, but presumably nobody would care about 1g of beans.

The Errors That We Can Make Using Hypothesis Tests

We usually reject H0 if the test statistic is statistically significant at a chosen significance level.

There are two possible errors we could make:
1. Rejecting H0 when it was really true. This is called a type I error.
2. Not rejecting H0 when it was in fact false. This is called a type II error.

                                        Reality
                              H0 is true         H0 is false
    Result of   Significant   Type I error       (correct)
    test        (reject H0)   = α
                Insignificant (correct)          Type II error
                (do not                          = β
                reject H0)

The Trade-off Between Type I and Type II Errors

The probability of a type I error is just α, the significance level or size of test we chose. To see this, recall what we said significance at the 5% level meant: it is only 5% likely that a result as extreme as this, or more extreme, could have occurred purely by chance.

Note that there is no free lunch here! What happens if we reduce the size of the test (e.g. from a 5% test to a 1% test)? We reduce the chances of making a type I error, but we also reduce the probability that we will reject the null hypothesis at all, so we increase the probability of a type II error:

    Reduce size of test  →  stricter criterion for rejection  →  reject null hypothesis less often
    →  less likely to falsely reject, but more likely to incorrectly not reject

So there is always a trade-off between type I and type II errors when choosing a significance level. The only way we can reduce the chances of both is to increase the sample size.

The Exact Significance Level or p-value

This is equivalent to choosing an infinite number of critical t-values from tables. It gives us the marginal significance level at which we would be indifferent between rejecting and not rejecting the null hypothesis.

If the test statistic is large in absolute value, the p-value will be small, and vice versa. The p-value gives the plausibility of the null hypothesis.

E.g. a test statistic is distributed as t(62) and equals 1.47, with a p-value of 0.12.
    Do we reject at the 5% level? ............. No
    Do we reject at the 10% level? ............ No
    Do we reject at the 20% level? ............ Yes

Sample Size and Asymptotic Theory

Question: What is the appropriate sample size for a model estimation?

Asymptotic theory: the econometric results hold exactly (e.g. the testing procedure is valid) only as the number of observations tends to infinity. So the answer is: the larger the sample size, the better. A larger sample will also reduce sampling error (the drawing of samples that are not representative of the population).

Important note: as one increases the sample size, there is a higher chance that the coefficient estimates will become significant simply by construction. As a result, for a study with a large number of observations (e.g. microstructure studies with millions of observations), it is conservative to tighten the size of the test (e.g. a 1% significance level instead of the conventional 5%).



R²: Goodness of Fit Statistics

We would like some measure of how well our regression model actually fits the data. We have goodness of fit statistics to test this, i.e. how well the sample regression function (SRF) fits the data.

The most common goodness of fit statistic is known as R². One way to define R² is to say that it is the square of the correlation coefficient between y and \hat{y}.

For another explanation, recall that what we are interested in doing is explaining the variability of y about its mean value, \bar{y}, i.e. the total sum of squares, TSS:

    TSS = \sum_t (y_t - \bar{y})^2

We can split the TSS into two parts: the part which we have explained (known as the explained sum of squares, ESS) and the part which we did not explain using the model (the RSS).

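The "squared correlation" definition is easy to verify in SAS. Here is a minimal sketch (not in the original slides; it assumes the capm_bp dataset built in the SAS application later in this module):

proc corr data=capm_bp;   /* prints corr(rbp, rm) */
   var rbp rm;
run;

For the BP example, the regression R² turns out to be about 0.085, so the correlation printed here should be roughly 0.29 (since 0.29² ≈ 0.085).
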

Defining R²

That is,

    TSS = ESS + RSS

    \sum_t (y_t - \bar{y})^2 = \sum_t (\hat{y}_t - \bar{y})^2 + \sum_t \hat{u}_t^2

Our goodness of fit statistic is

    R^2 = \frac{ESS}{TSS}

But since TSS = ESS + RSS, we can also write

    R^2 = \frac{ESS}{TSS} = \frac{TSS - RSS}{TSS} = 1 - \frac{RSS}{TSS}

R² must always lie between zero and one. To understand this, consider two extremes:

    RSS = TSS, i.e. ESS = 0, so R² = ESS/TSS = 0
    ESS = TSS, i.e. RSS = 0, so R² = ESS/TSS = 1


The Limit Cases: R² = 0 and R² = 1

[Figure: two scatter plots; in the R² = 0 case the fitted line is flat at \bar{y} and explains none of the variation in y_t, while in the R² = 1 case every data point lies exactly on the fitted line]


Problems with R² as a Goodness of Fit Measure

There are a number of them:

1. R² is defined in terms of variation about the mean of y, so that if a model is reparameterised (rearranged) and the dependent variable changes, R² will change.

2. R² never falls if more regressors are added to the regression. For example, consider:

    Regression 1: y_t = \beta_1 + \beta_2 x_{2t} + \beta_3 x_{3t} + u_t
    Regression 2: y_t = \beta_1 + \beta_2 x_{2t} + \beta_3 x_{3t} + \beta_4 x_{4t} + u_t

R² will always be at least as high for regression 2 relative to regression 1.

3. R² quite often takes on values of 0.9 or higher for time series regressions.

SAS and EViews Application

- BP-CAPM example.
- In this example, we estimate how well the market model (a two-variable model) explains British Petroleum (BP) daily stock returns from March 15, 1999 to August 12, 2000.
- Dataset:
    For SAS: BPS107_for_SAS.xls (downloadable from our Stream)
    For EViews: BPS107.xls (downloadable from our Stream)
- Regression model:

    R_{BP,t} = \alpha + \beta R_{M,t} + u_{BP,t}

- Notice that we do not include the risk-free term in our model. Many researchers do this, since the daily risk-free rate is quite stable! In general, it does not make much difference.

SAS Application

Data file: BPS107_for_SAS.xls
(Save this in the directory where you work.)

First-timers on SAS

- You can use the SAS application at home by making a request to the Massey ITS helpdesk. You will get SAS version 9.2. SAS is also available on most computers on Massey campuses (Manawatu, Albany, Wellington).
- SAS is very powerful but relies on computer programming skills. To become good at SAS, one needs to practice extensively. However, you will be required to do only basic tasks in the 125.785 paper.
- The document prepared by JG will introduce you to the basic SAS environment. It will be posted on Stream in the first week of the semester.

Open up SAS 9.2. It will look like this:

[Screenshot: the SAS 9.2 workspace]

Now import the dataset (BPS107_for_SAS.xls):
- Click File, then Import Data... You'll get the screen below; then click Next >.

Then specify the directory where you have saved the BPS107_for_SAS.xls file, and click OK.

You'll then be asked for the sheet name in the Excel file. Since the name of the worksheet is just "sheet1", we can go with what SAS suggests. Then click Next >.

SAS will then ask for the destination where you would like the imported file to be. On the left panel there is a little cabinet-like icon, "Libraries". This is the library system that allows you to arrange your folders and working panels. If you double-click to see what's in Libraries, there will be a drawer-looking icon called "Work". By default, this is where the SAS program works. So, let's put the dataset in Work and call it BPS107. Then click Finish.

You'll have this screen.

Let's check the dataset BPS107 we imported. Double-click on the Work folder, and you'll see it.

Now, double-click on BPS107 to view the dataset we have just imported.
- Notice that the first row automatically becomes the heading. This is SAS's default.
- Also note that there are other ways to import data (e.g. within a program, using the INFILE statement or PROC IMPORT, as sketched below).

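For completeness, here is a minimal code-based alternative to the import wizard (a sketch only, not from the original slides; the file path is a placeholder and should be replaced with your own):

/* Sketch: import the Excel file in code rather than via File > Import Data.
   The path below is hypothetical. */
proc import datafile="C:\mywork\BPS107_for_SAS.xls"
            out=work.bps107
            dbms=xls
            replace;
   sheet="sheet1";
   getnames=yes;   /* first row becomes the variable names, as in the wizard */
run;
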
Now let's start some programming.

- First of all, close down the ViewTable you opened in the previous step. SAS cannot work on a dataset that is open for viewing!
- Notice that at the bottom of your screen you have Output, Log, and Editor panels.
- Output is the panel where SAS will show the program's output after a run.
- Log is the panel where you can see what the program has done, step by step. Importantly, it shows you possible programming errors (in red).
- Editor is the panel in which to work on the program (write, save, and execute).

Activate the Editor panel (by clicking on its panel at the bottom) and enter the statements below. (Note that anything between /* and */ is a comment (an explanation, in this case), not part of the program.) At the end, save this SAS program as BP_CAPM.sas.

options ls=78;

/* creating the dataset to work on called CAPM_BP */
/* turning price level into log return for both BP stocks and the UK stock market */

data capm_bp; /* create new dataset called capm_bp */
set bps107; /* start by grabbing the dataset in 'work' folder called 'BPS107' to work with */

closing_price_bp_lag = lag(closing_price_bp); /* creating lag variable for BP closing price */
uk_index_lag = lag(uk_index); /* creating lag variable for UK stock market */


rbp=log(closing_price_bp/closing_price_bp_lag); /* calculating daily log return for BP stock */
rm=log(uk_index/uk_index_lag); /* calculating daily log return for UK stock market */

run; /* needed to execute the above commands (starting from the 'data' line) */

You'll have this:

[Screenshot: the BP_CAPM program in the Editor panel]

Now run the program by highlighting all the commands in the Editor panel and clicking the small running-man icon at the top.

As a result of the execution, you will have a new dataset called capm_bp, visible in the left panel. Double-click to open it and check rbp and rm.

Now, we are ready to perform a regression!

- Close down the ViewTable.
- Add these statements to the BP_CAPM SAS program:

proc reg data=capm_bp;
   model rbp = rm;
run;

Then highlight the new statements and click on the small running-man icon. (Also save the program for later use.)

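As an optional extra (a sketch only, not in the original slides), you can also capture the coefficient table as a SAS dataset with ODS, instead of reading it off the Output window:

/* Sketch: route the coefficient table to a dataset named capm_est
   (hypothetical name) for later use. */
ods output ParameterEstimates=capm_est;
proc reg data=capm_bp;
   model rbp = rm;
run;
quit;   /* end PROC REG (it is an interactive procedure) */
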
And here is the final result (in the Output panel, which you can also save as a .lis file).

[Screenshot: PROC REG output for the market-model regression]

EViews Application
(an alternative program for Assignment 1, Part A)

First-timers on EViews

- You can use the EViews application only on some of the Massey computers on campus.
- Compared to SAS, EViews is a bit more user-friendly and does not require programming. But it is quite limited in data manipulation.
- Note that what you see in the lecture notes is based on an older version of EViews. As a result, some (minor) adjustments may be needed for the use of EViews 6 on Massey's computers.

Data file: BPS107_for_eviews.xls

Here is how the data file BPS107_for_eviews.xls looks. You need to save it in a directory of your convenience.

[Screenshot: the Excel data file]

- Next, create a workfile by clicking File, New, and then Workfile.
- A screen comes up. Choose "Daily [5 day weeks]". Input the Start Date as 03/15/1999 and the End Date as 08/12/2000, and click OK.

- You will now have the following screen:

[Screenshot: the new EViews workfile window]

- Now you are ready to import your dataset into EViews by clicking File, Import, Read Text-Lotus-Excel, specifying the directory of the BPS107.xls file you have saved, then filling in the dialog as below and clicking OK.

- Now you have more series in your workfile. You can double-click on each series to check it for accuracy.

- Then rename closing_p01 to bp by right-clicking on the series, selecting Rename, and entering the new name bp. Do the same for price_index01, changing the name to ftas.

- Note that we need the daily return series of BP stock and the UK market to run the market-model regression. We therefore need to transform the daily prices or indices into returns using the Generate (Genr) function in EViews.
- To generate the daily return on BP stock, click on the Genr button, type rbp=log(bp/bp(-1)) in the box, and then click OK. (EViews is case-insensitive.)

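For reference (a standard definition, added here since the slides use it implicitly), the Genr expression computes the continuously compounded (log) return

    r_t = \ln\!\left(\frac{P_t}{P_{t-1}}\right) = \ln P_t - \ln P_{t-1}

where P_t is the closing price (or index level) on day t; this is the same transformation the SAS program applied with rbp=log(closing_price_bp/closing_price_bp_lag).
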
- Double-click on the rbp series to check. The first available daily return should be 0.004117.

- Now create the daily market return in the same way we did for the BP stock return. Give it the name rftas, from the formula rftas=log(ftas/ftas(-1)). The first daily market return in your series should be 0.000423.

- Now we have the two data series ready for the analyses. We can first visualise the relationship between BP daily returns and UK market daily returns by clicking Quick, Graph, Scatter, then entering the series rftas rbp in the box and clicking OK.

- The graph will be ready. You can name it and save it if you wish.

[Screenshot: scatter plot of rbp against rftas]

- What do you see in general?

- Finally, we are ready to estimate the market-model regression. Click Objects, New Object, Equation, and name it CAPM. Then click OK.

When you get into the next box, specify the equation as rbp c rftas (c means the constant), and click OK. (Choose the LS method, since this is basic OLS.)

Here come the results we are looking for!

[Screenshot: EViews estimation output for the CAPM equation]

- Save the file for your future reference with the routine File, Save As procedure. Let's name it BP CAPM.

So, what do you make of the results?

- The RFTAS coefficient (RM in SAS) represents the beta of BP stock. In this case, it is highly significantly different from zero and is reported as 0.62. In this sense, the market premium has explanatory power for the variability of BP stock returns (somewhat satisfying the CAPM).
- The intercept term is insignificantly different from zero! This means there was no significant abnormal return on BP stock during the study period in this simple market-model framework (somewhat satisfying the CAPM).
- Nevertheless, the R-square is quite small (only 0.085), which means that a large proportion of the variability of BP stock returns is NOT captured by the CAPM framework.

Discussion:

- Jensen (1968) study
- Return series in financial research

- To be discussed in the class (internal) or block courses (block)