
Ordinary Least Squares

Ordinary Least Squares (OLS): a regression estimation
technique that calculates the beta-hats (the estimated
parameters, or coefficients, of the model) so as to
minimize the sum of the squared residuals.
$\sum e_i^2 = \sum (Y_i - \hat{Y}_i)^2$
Where
$e_i$ = the residual
$\hat{Y}_i$ is the predicted value of $Y_i$
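The quantity OLS minimizes can be computed directly for any candidate line. The sketch below is only an illustration: the function name and the five data points are hypothetical, and the coefficients 2.2 and 0.6 are the OLS estimates for these made-up data.

```python
def sum_squared_residuals(xs, ys, b0, b1):
    """Sum of squared residuals for the candidate line Yhat = b0 + b1 * X."""
    total = 0.0
    for x, y in zip(xs, ys):
        y_hat = b0 + b1 * x   # predicted value of Y
        e = y - y_hat         # residual
        total += e ** 2
    return total

# Hypothetical data, chosen only for illustration.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]

ssr_ols = sum_squared_residuals(xs, ys, 2.2, 0.6)    # OLS line for these data
ssr_other = sum_squared_residuals(xs, ys, 2.0, 0.7)  # a nearby alternative line
print(ssr_ols, ssr_other)
```

Any other intercept and slope gives a larger sum of squared residuals than the OLS line on the same data.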
Why OLS?
1. OLS is relatively easy to use. For a model
with one or two independent variables,
one can run OLS with a simple spreadsheet
(without using the regression function).
2. The goal of minimizing the sum of squared
residuals is appropriate from a theoretical
point of view.
3. OLS estimates have a number of useful
characteristics.
Why not minimize residuals?
Residuals can be positive and negative. Just
minimizing the sum of the residuals can produce an
estimator with large positive and negative
errors that cancel each other out.
Minimizing the sum of the absolute values of the
residuals poses mathematical problems. Plus,
we wish to minimize the possibility of very
large errors, and squaring penalizes large
residuals most heavily.
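The cancellation problem can be seen with a small made-up data set. In this sketch (hypothetical data and helper name), a badly wrong downward-sloping line has residuals that sum to zero, yet its sum of squared residuals is far larger than that of the OLS line for the same data.

```python
# Hypothetical data, chosen only for illustration.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]

def residuals(xs, ys, b0, b1):
    """Residuals Y - Yhat for the line Yhat = b0 + b1 * X."""
    return [y - (b0 + b1 * x) for x, y in zip(xs, ys)]

bad = residuals(xs, ys, 19.0, -5.0)  # a clearly wrong line
good = residuals(xs, ys, 2.2, 0.6)   # the OLS line for these data

sum_bad = sum(bad)                   # positives and negatives cancel
ssr_bad = sum(e ** 2 for e in bad)   # squaring exposes the large errors
ssr_good = sum(e ** 2 for e in good)
print(sum_bad, ssr_bad, ssr_good)
```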
Useful Characteristics of OLS
1. Estimated regression line goes through the
means of Y and X. In equation form:
$\bar{Y} = \hat{\beta}_0 + \hat{\beta}_1 \bar{X}$
(where $\bar{Y}$ and $\bar{X}$ are the means of Y and X)
2. The sum of residuals is exactly zero.
3. OLS estimators, under a certain set of
assumptions (which we discuss later), are
BLUE (Best Linear Unbiased Estimators).
Note: OLS is the estimator; the estimated coefficients
or parameters are the estimates.
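The first two characteristics can be checked numerically. The following is a minimal sketch on hypothetical data; `fit_simple_ols` is an illustrative helper, not a library function, and it assumes a model with a constant term and one independent variable.

```python
def fit_simple_ols(xs, ys):
    """Return (b0, b1) for Yhat = b0 + b1 * X using the deviation formulas."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1

# Hypothetical data, chosen only for illustration.
xs = [10.0, 20.0, 30.0, 40.0]
ys = [12.0, 25.0, 31.0, 48.0]

b0, b1 = fit_simple_ols(xs, ys)
x_bar = sum(xs) / len(xs)
y_bar = sum(ys) / len(ys)

# Characteristic 1: the fitted line passes through (mean of X, mean of Y).
gap_at_means = (b0 + b1 * x_bar) - y_bar

# Characteristic 2: the residuals sum to zero (up to floating-point rounding).
residual_sum = sum(y - (b0 + b1 * x) for x, y in zip(xs, ys))
print(gap_at_means, residual_sum)
```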
Classical Assumption 1
The error term has a zero population mean.
We impose this assumption via the constant
term.
The constant term equals the fixed portion of
the Y that cannot be explained by the
independent variables.
The error term equals the stochastic portion
of the unexplained value of Y.
Classical Assumption 2
The error term has a constant variance
(homoskedasticity).
A violation of this assumption is called
heteroskedasticity.
Where does this most often occur? Cross-sectional
data
Why does this occur in cross-sectional data?
Classical Assumption 3
Observations of the error term are
uncorrelated with each other.
A violation of this assumption is called
serial correlation or autocorrelation.
Where does this most often occur? Time-series
data
Why does this occur in time-series data?
Classical Assumption 4-5
The data for the dependent variable and
independent variable(s) do not have
significant measurement errors.
The regression model is linear in the
coefficients, is correctly specified, and has an
additive error term.
Classical Assumption 6
The error term is normally distributed
This is an optional assumption, but a good
idea. Why?
The t-statistic and F-statistic (explained later)
follow their exact distributions in small samples
only if this holds.

Five More Assumptions
1. All explanatory variables are uncorrelated with the error term.
When would this not be the case? When X and Y are jointly determined;
then a system of equations is needed (e.g., supply and demand).
What are the consequences? The estimated slope coefficient of any X
that is correlated with the error term is biased.
2. No explanatory variable is a perfect linear function of any other
explanatory variable.
Perfect collinearity or multicollinearity
Consequence: OLS cannot distinguish the impact of each X on Y.
3. X values are fixed in repeated sampling.
4. The number of observations n must be greater than the number of
parameters to be estimated.
5. There must be variability in X and Y values.
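Assumption 2 can be illustrated with the denominator of the two-regressor OLS slope formulas, $\sum x_1^2 \sum x_2^2 - (\sum x_1 x_2)^2$, where lowercase letters denote deviations from means. The sketch below uses hypothetical data and a hypothetical helper name. When one regressor is an exact linear function of the other, the denominator is zero, so the slopes cannot be computed.

```python
def two_regressor_denominator(x1s, x2s):
    """Sum(x1^2) * Sum(x2^2) - (Sum(x1*x2))^2, using deviations from means."""
    n = len(x1s)
    m1 = sum(x1s) / n
    m2 = sum(x2s) / n
    d1 = [a - m1 for a in x1s]
    d2 = [b - m2 for b in x2s]
    s11 = sum(a * a for a in d1)
    s22 = sum(b * b for b in d2)
    s12 = sum(a * b for a, b in zip(d1, d2))
    return s11 * s22 - s12 ** 2

# Hypothetical data, chosen only for illustration.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2_collinear = [2.0 * v + 1.0 for v in x1]  # perfect linear function of x1
x2_distinct = [2.0, 1.0, 4.0, 3.0, 6.0]     # correlated with x1, but not perfectly

denom_collinear = two_regressor_denominator(x1, x2_collinear)
denom_distinct = two_regressor_denominator(x1, x2_distinct)
print(denom_collinear, denom_distinct)
```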

The Gauss-Markov Theorem
Given the Classical Assumptions, the OLS
estimator $\hat{\beta}_k$ is the minimum variance
estimator from among the set of all linear
unbiased estimators of $\beta_k$.
In other words, OLS is BLUE
Best Linear Unbiased Estimator
Where Best = Minimum Variance
Given the Classical Assumptions
The OLS coefficient estimators
are unbiased
have minimum variance
are consistent
are normally distributed.
The last characteristic is important if we wish
to conduct statistical tests of these
estimators, the topic of the next chapter.
Unbiased Estimator and Small Variance
Unbiased estimator: an estimator whose
sampling distribution has as its expected
value the true value of $\beta$.
In other words... the mean value of the
distribution of estimates equals the true
value of the item being estimated.
In addition to an unbiased estimate, we also
prefer a small variance.
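Unbiasedness is a statement about the sampling distribution, so it can be illustrated by simulation. The sketch below is a small Monte Carlo experiment under assumed values: the true coefficients (1.0 and 2.0), the error standard deviation (3.0), the sample size, and the number of replications are all hypothetical choices. X is held fixed across samples and the errors are drawn as independent normal values with mean zero.

```python
import random

def ols_slope(xs, ys):
    """Slope estimate: sum of cross-deviations over sum of squared X deviations."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx

random.seed(0)                         # fixed seed so the run is repeatable
TRUE_B0, TRUE_B1 = 1.0, 2.0            # assumed "true" parameters
xs = [float(i) for i in range(1, 21)]  # X values fixed in repeated sampling

estimates = []
for _ in range(2000):
    # Draw a fresh sample of Y with zero-mean, constant-variance errors.
    ys = [TRUE_B0 + TRUE_B1 * x + random.gauss(0.0, 3.0) for x in xs]
    estimates.append(ols_slope(xs, ys))

mean_estimate = sum(estimates) / len(estimates)
spread = max(estimates) - min(estimates)
print(mean_estimate, spread)
```

Individual estimates vary from sample to sample, but their average should sit close to the assumed true slope. The spread of the estimates is what "small variance" refers to.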
How does OLS Work?
The Univariate Model
$Y = \beta_0 + \beta_1 X + \epsilon$
For Example: $\text{Wins} = \beta_0 + \beta_1 \text{Payroll} + \epsilon$
How do we calculate $\hat{\beta}_1$?
Intuition: $\hat{\beta}_1$ equals the joint variation of X and Y
(around their means) divided by the variation of X
around its mean. Thus it measures the portion of the
variation in Y that is associated with variation in X.
In other words, the formula for the slope is:
Slope = COV(X,Y)/V(X)
or the covariance of the two variables divided by
the variance of X.
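Writing out the sample covariance and variance shows why this ratio reduces to a ratio of sums of deviations. The derivation below assumes the usual $n - 1$ divisor for both quantities (the source does not say which divisor it uses); any common divisor cancels in the same way.

```latex
\hat{\beta}_1
  = \frac{\mathrm{COV}(X,Y)}{V(X)}
  = \frac{\frac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}
         {\frac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar{X})^2}
  = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}
         {\sum_{i=1}^{n} (X_i - \bar{X})^2}
```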
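How do we calculate $\hat{\beta}_1$?
Some Simple Math
$\hat{\beta}_1 = \sum [(X_i - \bar{X})(Y_i - \bar{Y})] / \sum (X_i - \bar{X})^2$
(where $\bar{X}$ and $\bar{Y}$ are the means of X and Y)
If
$x_i = X_i - \bar{X}$ and
$y_i = Y_i - \bar{Y}$
then
$\hat{\beta}_1 = \sum (x_i y_i) / \sum x_i^2$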
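The deviation form of the slope translates almost line for line into code. This is a minimal sketch; the payroll and wins figures are hypothetical and `ols_slope` is an illustrative name.

```python
def ols_slope(xs, ys):
    """Slope = sum(xi * yi) / sum(xi^2), with xi and yi as deviations from means."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    dx = [x - x_bar for x in xs]   # xi = Xi - mean of X
    dy = [y - y_bar for y in ys]   # yi = Yi - mean of Y
    numerator = sum(a * b for a, b in zip(dx, dy))
    denominator = sum(a * a for a in dx)
    return numerator / denominator

# Hypothetical data, chosen only for illustration.
payroll = [1.0, 2.0, 3.0, 4.0, 5.0]
wins = [2.0, 4.0, 5.0, 4.0, 5.0]

slope = ols_slope(payroll, wins)
print(slope)
```

For these numbers the sum of cross-deviations is 6 and the sum of squared X deviations is 10, so the slope is 0.6.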
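How do we calculate $\hat{\beta}_0$?
Some Simple Math
$\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X}$
(where $\bar{Y}$ and $\bar{X}$ are the means of Y and X)
$\hat{\beta}_0$ is defined to ensure that the
regression equation does indeed pass
through the means of Y and X.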
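A short worked example, using hypothetical values (mean of X = 3, mean of Y = 4, estimated slope = 0.6), shows the intercept calculation and confirms that the resulting line passes through the means:

```latex
\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X} = 4 - 0.6 \times 3 = 2.2
\qquad\text{and}\qquad
\hat{\beta}_0 + \hat{\beta}_1 \bar{X} = 2.2 + 0.6 \times 3 = 4 = \bar{Y}
```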
Multivariate Regression
Multivariate regression: an equation with more than
one independent variable.
Multivariate regression is necessary if one wishes to
impose ceteris paribus.
Specifically, a multivariate regression coefficient
indicates the change in the dependent variable
associated with a one-unit increase in the
independent variable in question holding constant the
other independent variables in the equation.
Omitted Variables, again
If you do not include a variable in your model, then
your coefficient estimate is not calculated with the
omitted variable held constant.
In other words, if the variable is not in the model it
was not held constant.
Then again.... there is the Principle of Parsimony or
Occam's razor (that descriptions be kept as simple as
possible until proven inadequate).
So we don't typically estimate regressions with
hundreds of independent variables.
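The effect of leaving out a relevant variable can be sketched with made-up numbers. In the example below, Y is constructed exactly as 1 + 2*X1 + 3*X2 with no error term (a deliberate simplification), and X2 is correlated with X1. Regressing Y on X1 alone gives a slope that absorbs part of the effect of X2 instead of the constructed value of 2. All data and names are hypothetical.

```python
def ols_slope(xs, ys):
    """Simple-regression slope from deviations around the means."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx

# Hypothetical data: x2 tends to rise with x1.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0]

# Y built with known coefficients (1, 2, 3) and no error term.
y = [1.0 + 2.0 * a + 3.0 * b for a, b in zip(x1, x2)]

# Regression that omits x2: its influence is not held constant.
short_slope = ols_slope(x1, y)
print(short_slope)
```

Here the short regression reports a slope of 5 rather than 2, because X2 moves with X1 and was not held constant.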
The Multivariate Model
$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_n X_n + \epsilon$
For Example, a model where n=2:
$\text{Wins} = \beta_0 + \beta_1 \text{PTS} + \beta_2 \text{DPTS} + \epsilon$
Where
PTS = Points scored in a season
DPTS = Points surrendered in a season
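How do we calculate $\hat{\beta}_1$ and $\hat{\beta}_2$?
Some Less Simple Math
Remember (bars denote means):
$x_{1i} = X_{1i} - \bar{X}_1$, $x_{2i} = X_{2i} - \bar{X}_2$ and
$y_i = Y_i - \bar{Y}$

$\hat{\beta}_1 = [\sum (x_{1i} y_i) \sum x_{2i}^2 - \sum (x_{2i} y_i) \sum (x_{1i} x_{2i})] / [\sum x_{1i}^2 \sum x_{2i}^2 - (\sum x_{1i} x_{2i})^2]$

$\hat{\beta}_2 = [\sum (x_{2i} y_i) \sum x_{1i}^2 - \sum (x_{1i} y_i) \sum (x_{1i} x_{2i})] / [\sum x_{1i}^2 \sum x_{2i}^2 - (\sum x_{1i} x_{2i})^2]$

$\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X}_1 - \hat{\beta}_2 \bar{X}_2$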
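The two-regressor formulas can be coded directly from sums of deviations. The sketch below is an illustration only: `fit_two_regressor_ols` is a hypothetical helper, and the data are constructed so that Y = 1 + 2*X1 + 3*X2 exactly, which lets the result be checked by eye.

```python
def fit_two_regressor_ols(x1s, x2s, ys):
    """Return (b0, b1, b2) for Yhat = b0 + b1*X1 + b2*X2 via deviation sums."""
    n = len(ys)
    m1 = sum(x1s) / n
    m2 = sum(x2s) / n
    my = sum(ys) / n
    d1 = [a - m1 for a in x1s]
    d2 = [b - m2 for b in x2s]
    dy = [c - my for c in ys]

    s1y = sum(a * c for a, c in zip(d1, dy))
    s2y = sum(b * c for b, c in zip(d2, dy))
    s11 = sum(a * a for a in d1)
    s22 = sum(b * b for b in d2)
    s12 = sum(a * b for a, b in zip(d1, d2))

    denom = s11 * s22 - s12 ** 2
    if denom == 0:
        # Perfect collinearity: the two slopes cannot be separated.
        raise ValueError("X1 and X2 are perfectly collinear")

    b1 = (s1y * s22 - s2y * s12) / denom
    b2 = (s2y * s11 - s1y * s12) / denom
    b0 = my - b1 * m1 - b2 * m2
    return b0, b1, b2

# Hypothetical data, chosen only for illustration.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0]
y = [1.0 + 2.0 * a + 3.0 * b for a, b in zip(x1, x2)]

b0, b1, b2 = fit_two_regressor_ols(x1, x2, y)
print(b0, b1, b2)
```

Because Y here has no error term, the estimates recover the constructed coefficients (1, 2, 3) up to floating-point rounding. With real data they would only approximate the underlying parameters.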
Issues to Consider when reviewing
regression results
1. Is the equation supported by sound theory?
2. How well does the estimated regression fit the data?
3. Is the data set reasonably large and accurate?
4. Is OLS the best estimator for this equation?
5. How well do the estimated coefficients correspond to our
expectations?
6. Are all the obviously important variables included in the
equation?
7. Has the correct functional form been used?
8. Does the regression appear to be free of major econometric
problems?
NOTE: This is just a sample of questions one can ask.