
Regression Analysis

Meaning

Regression is the measure of the average relationship between two or more variables, such as the effect of a price increase upon demand or the effect of changes in the money supply upon the inflation rate.

Regression analysis predicts the value of dependent variables from the values of independent variables.

EXAMPLE: Advertising can be considered the independent variable and sales the dependent variable.

Types of regression
Simple Linear Regression
Multiple Regression
Simple Linear Regression

Simple linear regression analysis estimates the relationship between two variables using a single explanatory variable.

Suppose we wish to identify and quantify the factors that determine earnings in the labor market. Factors include occupation, age, experience, education, race, gender, etc. Suppose we restrict our attention to education. This is called simple regression.

Simple linear regression estimates relationships of the form $y = bx + a$

Where,
y is the dependent variable
a is the constant (intercept)
b is the regression coefficient (slope)
x is the independent variable (or covariate)
Least square method and regression equation

The least square method assumes that the best-fit curve of a given type is the curve that has the minimal sum of squared deviations (least square error) from a given set of data.

The relation $Y = a + bX$ indicates an exact relation, but in the real world relations are not exact.

The least square method helps to determine the best regression line drawn through the scatter points, i.e. the line for which the differences between the actual and the estimated values are smallest overall.
In fitting the line, the least-squares procedure minimizes the sum of squared errors, $\sum e_j^2$.

The estimated regression line of y on x by the method of least square is given by

$$\hat{y} = a + bx$$
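To make the minimization concrete, the following is a short sketch of the standard closed-form result. It assumes n observations $(x_j, y_j)$ with means $\bar{x}$ and $\bar{y}$; the index j on x and y and the bar notation are introduced here for illustration.

```latex
\begin{align*}
e_j &= y_j - (a + b x_j), \qquad S(a, b) = \sum_{j=1}^{n} e_j^2 \\
\frac{\partial S}{\partial a} = 0 &\;\Rightarrow\; \sum y_j = n a + b \sum x_j \\
\frac{\partial S}{\partial b} = 0 &\;\Rightarrow\; \sum x_j y_j = a \sum x_j + b \sum x_j^2 \\
b &= \frac{\sum (x_j - \bar{x})(y_j - \bar{y})}{\sum (x_j - \bar{x})^2}, \qquad a = \bar{y} - b\,\bar{x}
\end{align*}
```

The two middle lines are the normal equations; solving them simultaneously gives the slope b and intercept a in the last line.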
Example

After investigation it has been found that the demand for automobiles in a city depends mainly, if not entirely, upon the number of families residing in that city. Given below are figures for the sales of automobiles in five cities for the year 2003 and the number of families residing in those cities.

Fit a linear regression of Y on X by the least square method and estimate the sales for the year 2006 for city A, which is estimated to have 100 lakh families, assuming that the same relationship holds true.
City                           A      B      C      D      E
No. of families in lakhs (X)   70     75     80     60     90
Sales in 000s (Y)              25.2   28.6   30.2   22.3   35.4
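One way to work this example is sketched below in Python. It treats the number of families as X and sales as Y, as the question implies; the helper name `fit_line` is hypothetical, and the code is a minimal illustration of the least square method rather than a prescribed solution.

```python
# Data from the table: families in lakhs (X) and sales in thousands (Y).
families = [70, 75, 80, 60, 90]
sales = [25.2, 28.6, 30.2, 22.3, 35.4]


def fit_line(xs, ys):
    """Return (a, b) for the least-squares line y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products and sum of squares of deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


a, b = fit_line(families, sales)
predicted_sales = a + b * 100  # city A with an estimated 100 lakh families

print(f"Y = {a:.3f} + {b:.3f} X")
print(f"Estimated sales at X = 100: {predicted_sales:.3f} thousand")
```

With these figures the fitted line is roughly Y = -4.885 + 0.443 X, which gives an estimate of about 39.4 thousand automobiles for 100 lakh families.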
Regression equation and regression coefficients

When Y is the independent variable and X is the dependent variable, we have the regression equation of X on Y:

$$X - \bar{X} = b_{xy}\,(Y - \bar{Y})$$

where:
$\bar{X}$ and $\bar{Y}$: means of X and Y
$b_{xy}$ = regression coefficient of X on Y
$= r\,(\sigma_x / \sigma_y)$
$= \sum xy / \sum y^2$
Regression equation and regression coefficients

When X is the independent variable and Y is the dependent variable, we have the regression equation of Y on X:

$$Y - \bar{Y} = b_{yx}\,(X - \bar{X})$$

where:
$\bar{X}$ and $\bar{Y}$: means of X and Y
$b_{yx}$ = regression coefficient of Y on X
$= r\,(\sigma_y / \sigma_x)$
$= \sum xy / \sum x^2$
and $x = X - \bar{X}$ and $y = Y - \bar{Y}$
Properties of regression coefficients

The signs of both regression coefficients are always the same.

Both regression coefficients cannot simultaneously exceed one in magnitude.

Correlation coefficient, $r = \pm\sqrt{b_{yx}\, b_{xy}}$

If r = 0, the regression lines are perpendicular to each other.

If $r = \pm 1$ (perfect correlation), the two regression lines coincide.
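The first three properties follow from the definitions of the two coefficients in terms of r and the standard deviations. A brief derivation, assuming $\sigma_x > 0$ and $\sigma_y > 0$:

```latex
\begin{align*}
b_{yx}\, b_{xy} &= \left( r\,\frac{\sigma_y}{\sigma_x} \right) \left( r\,\frac{\sigma_x}{\sigma_y} \right) = r^2 \le 1 \\
r &= \pm\sqrt{b_{yx}\, b_{xy}}
\end{align*}
```

Because the standard deviations are positive, both coefficients take the sign of r, and that common sign is the one attached to the square root. Since their product cannot exceed one, at most one of them can be larger than one in magnitude.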
Example

Calculate the regression lines from the following data and estimate x when y is 26 and y when x is 35.

X     Y
10    5
12    6
13    7
17    9
18    13
20    15
24    20
30    21
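A possible way to carry out this calculation is sketched below. It uses the deviation-from-mean formulas for the two regression coefficients; the function names `estimate_y` and `estimate_x` are hypothetical helpers introduced for this illustration.

```python
# Data from the example.
xs = [10, 12, 13, 17, 18, 20, 24, 30]
ys = [5, 6, 7, 9, 13, 15, 20, 21]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Sums of squares and products of deviations from the means.
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
sxx = sum((x - mean_x) ** 2 for x in xs)
syy = sum((y - mean_y) ** 2 for y in ys)

b_yx = sxy / sxx  # regression coefficient of Y on X
b_xy = sxy / syy  # regression coefficient of X on Y


def estimate_y(x):
    """Regression line of Y on X: Y - mean_y = b_yx * (X - mean_x)."""
    return mean_y + b_yx * (x - mean_x)


def estimate_x(y):
    """Regression line of X on Y: X - mean_x = b_xy * (Y - mean_y)."""
    return mean_x + b_xy * (y - mean_y)


print(f"b_yx = {b_yx:.4f}, b_xy = {b_xy:.4f}")
print(f"x when y = 26: {estimate_x(26):.2f}")
print(f"y when x = 35: {estimate_y(35):.2f}")
```

Note that each estimate uses its own line: x is estimated from the line of X on Y, and y from the line of Y on X.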
Example

We are given the following information about advertising expenditure and sales.

        Advertising expenditure (X) (lakhs)    Sales (Y) (lakhs)
Mean    10                                     90
S.D.    3                                      12

Correlation coefficient = 0.8

What should be the advertising budget if the company wants to attain a sales target of Rs. 120 lakhs?
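Since the quantity to be estimated is advertising expenditure (X) for a given sales figure (Y), the relevant line is the regression of X on Y. A minimal sketch of the arithmetic, with variable names chosen here for illustration:

```python
# Summary statistics from the example (all amounts in lakhs).
mean_x, mean_y = 10.0, 90.0  # advertising expenditure, sales
sd_x, sd_y = 3.0, 12.0
r = 0.8

# Regression coefficient of X on Y: b_xy = r * (sd_x / sd_y).
b_xy = r * sd_x / sd_y

# Regression line of X on Y: X - mean_x = b_xy * (Y - mean_y).
target_sales = 120.0
required_advertising = mean_x + b_xy * (target_sales - mean_y)

print(f"b_xy = {b_xy:.2f}")
print(f"Required advertising budget: about {required_advertising:.1f} lakhs")
```

Here b_xy works out to 0.2, so the budget needed for sales of 120 lakhs is about 10 + 0.2 × 30 = 16 lakhs.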
Example

The lines of regression for a bivariate distribution are given by

$X + 9Y = 7$ and $Y + 4X = 49/3$

Calculate the correlation coefficient

$\bar{X}$ and $\bar{Y}$
Example

A manufacturing company is interested in evaluating its annual sales, in lakhs of Rs, over the past 11 years. For this, it has compiled the annual sales data for those 11 years. Relate annual sales to the years and predict the sales for 2006.

Year (x)    Annual sales (lakhs)
1995        1
1996        5
1997        4
1998        7
1999        10
2000        8
2001        9
2002        13
2003        14
2004        13
2005        18
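A linear trend can be fitted to this series with the same least square method. The sketch below codes the years as deviations from 2000, the middle year; that coding is a convenience assumed here, not something the example prescribes, and it does not change the forecast.

```python
# Data from the example: annual sales in lakhs of Rs.
years = list(range(1995, 2006))
sales = [1, 5, 4, 7, 10, 8, 9, 13, 14, 13, 18]

origin = 2000  # assumed coding origin (middle year), chosen so the coded years sum to zero
ts = [year - origin for year in years]

n = len(ts)
mean_t = sum(ts) / n
mean_s = sum(sales) / n

# Least-squares slope and intercept for sales = a + b * t.
b = sum((t - mean_t) * (s - mean_s) for t, s in zip(ts, sales)) / sum(
    (t - mean_t) ** 2 for t in ts
)
a = mean_s - b * mean_t

forecast_2006 = a + b * (2006 - origin)

print(f"sales = {a:.3f} + {b:.3f} * (year - {origin})")
print(f"Forecast for 2006: {forecast_2006:.2f} lakhs")
```

The fitted slope is about 1.44 lakhs per year, which gives a forecast of roughly 17.9 lakhs for 2006.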
Example

The following data give the age and blood pressure of 10 women:

Age (X)              56    42    36    47    49    42    60    72    63    55
Blood pressure (Y)   147   125   118   128   145   140   155   160   149   150

Find the correlation coefficient between X and Y.

Determine the least square regression equation of Y on X.

Estimate the blood pressure of a woman whose age is 45 years.

Determine the coefficient of determination.
Example

The equations of the regression lines between two variables are expressed as:

$3x + 2y = 26$ and $6x + y = 31$.

Find the mean values, the regression coefficients and the correlation coefficient between x and y.
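Problems of this kind rely on two facts: both regression lines pass through the point of means, and the product of the two regression coefficients equals r², so it cannot exceed one. The sketch below applies both; the function name `analyse_regression_lines` and the `(p, q, c)` representation of a line p·x + q·y = c are assumptions made for this illustration, and the code assumes all coefficients are non-zero.

```python
import math


def analyse_regression_lines(line1, line2):
    """Each line is a tuple (p, q, c) meaning p*x + q*y = c.

    Returns (mean_x, mean_y, b_yx, b_xy, r).
    """
    p1, q1, c1 = line1
    p2, q2, c2 = line2

    # Both lines pass through (mean_x, mean_y): solve the 2x2 system by Cramer's rule.
    det = p1 * q2 - p2 * q1
    mean_x = (c1 * q2 - c2 * q1) / det
    mean_y = (p1 * c2 - p2 * c1) / det

    # First guess: line1 is Y on X, line2 is X on Y.
    b_yx = -p1 / q1
    b_xy = -q2 / p2
    # The product must equal r^2 <= 1; if it does not, the roles are swapped.
    if b_yx * b_xy > 1:
        b_yx = -p2 / q2
        b_xy = -q1 / p1

    # r takes the common sign of the two regression coefficients.
    r = math.copysign(math.sqrt(b_yx * b_xy), b_yx)
    return mean_x, mean_y, b_yx, b_xy, r


mean_x, mean_y, b_yx, b_xy, r = analyse_regression_lines((3, 2, 26), (6, 1, 31))
print(f"means: x = {mean_x}, y = {mean_y}")
print(f"b_yx = {b_yx:.4f}, b_xy = {b_xy:.4f}, r = {r:.2f}")
```

For the two lines in this example, the means come out as 4 and 7, with b_yx = -1.5, b_xy = -1/6 and r = -0.5.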
Coefficient of determination, $r^2$

The ratio of the unexplained variation to the total variation represents the proportion of variation in Y that is not explained by regression on X.

Subtracting it from one gives the proportion of variation in Y that is explained by regression on X: the coefficient of determination.

$$r^2 = \frac{\text{explained variation}}{\text{total variation}} = 1 - \frac{\text{unexplained variation}}{\text{total variation}}$$

INTERPRETATION: The value of $r^2$ is the proportion of variation in the dependent variable Y explained by regression on the independent variable X.
Standard error of estimate

The standard error of the estimated regression line measures the variability of the scatter around the regression line.

A large S.E. indicates a large amount of variation or scatter around the regression line.

A small S.E. indicates a small amount of variation or scatter around the regression line.

A zero S.E. indicates that all the observed data points fall exactly on the regression line.

$$S_e = \sqrt{\frac{\sum (y - \hat{y})^2}{n - 2}}$$
Explained and unexplained variation

Total variation = Explained variation + Unexplained variation

Total variation $= \sum (Y - \bar{Y})^2 = \sum Y^2 - \frac{(\sum Y)^2}{n}$

Explained variation $= \sum (\hat{Y} - \bar{Y})^2 = a\sum Y + b\sum XY - \frac{(\sum Y)^2}{n}$

Unexplained variation $= \sum (Y - \hat{Y})^2 = \sum Y^2 - a\sum Y - b\sum XY$
Example

A financial analyst obtained the following information relating to the return on security A and that of market portfolio M for the past 8 years.

Year    Return on security (A)    Market portfolio (M)
1       10                        12
2       15                        14
3       18                        13
4       14                        10
5       16                        9
6       16                        13
7       18                        14
8       4                         7

Develop an estimating equation that best describes these data. Find the standard error of estimate.

Find the coefficient of determination.

Determine the % of total variation in security return being explained by the return on the market portfolio.
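The sketch below works through this example with the raw-sum formulas for the regression line and for total, explained and unexplained variation. It assumes the market return M is the independent variable X and the security return A is the dependent variable Y, which is how the last question frames the problem; variable names are illustrative.

```python
import math

# Data from the table (assumed roles: X = market portfolio M, Y = security A).
market = [12, 14, 13, 10, 9, 13, 14, 7]
security = [10, 15, 18, 14, 16, 16, 18, 4]

n = len(market)
sum_x = sum(market)
sum_y = sum(security)
sum_xy = sum(x * y for x, y in zip(market, security))
sum_x2 = sum(x * x for x in market)
sum_y2 = sum(y * y for y in security)

# Least-squares estimating equation Y = a + b X.
b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
a = (sum_y - b * sum_x) / n

# Variation decomposition using the raw-sum shortcut formulas.
total_variation = sum_y2 - sum_y ** 2 / n
explained_variation = a * sum_y + b * sum_xy - sum_y ** 2 / n
unexplained_variation = sum_y2 - a * sum_y - b * sum_xy

r_squared = explained_variation / total_variation
std_error = math.sqrt(unexplained_variation / (n - 2))

print(f"A = {a:.3f} + {b:.3f} M")
print(f"Standard error of estimate: {std_error:.3f}")
print(f"Coefficient of determination: {r_squared:.3f}")
```

On these figures the estimating equation is approximately A = -1.0 + 1.293 M, the standard error of estimate is about 3.65, and r² is about 0.49, so roughly 49% of the variation in the security return is explained by the market return.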
