Cigarettes Case, Sahil Nayar, Cluster Y

In order to answer the questions below, I first ran a linear regression of cigarette
sales per capita against all of the explanatory variables except state. However,
as can be seen in section 1 of the appendix, the explanatory power of this model
is very low, with an R^2 of just 0.32.
Therefore, I ran a second regression of ln (cigarette sales per capita) against all
of the explanatory variables except state. The R^2 improves to 0.39, suggesting
that this model has better explanatory power. Therefore, I refer to this model
(reproduced in section 2 of the appendix) when answering the questions below.
In the first model, Income is significant at the 10% level, price is significant at
any conventional level, and the remaining variables are not significant. This is
counterintuitive, since we would expect age to be significant (there is a legal age
for smoking). It is also reasonable to expect education to be significant, since
educated people are more aware of the dangers of smoking. The fact that Income is
not very significant is consistent with the fact that cigarettes are habit-forming
products, and hence relatively income inelastic. The signs of the significant coefficients are
in line with expectations.
In the second model, Age is significant at the 10% level, Income is significant at
the 2% level, and price is significant at any level. The signs of the significant
coefficients are in line with expectations. This appears to be a better model than
the first.
Multicollinearity could affect both models. We would expect, for example,
Income and age to be positively correlated, income and education to be
positively correlated, and income and black to be negatively correlated.
(a) Ho: B(Female) = 0
H1: B(Female) is not equal to zero
We use a two-tail t-test. Since the p-value is 0.85 for the Female
explanatory variable, we fail to reject the null hypothesis at any
conventional significance level. We conclude that Female is not needed
in the regression equation.
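The test in (a) can be checked by hand from the appendix figures. The following is a minimal sketch, assuming the Female coefficient and standard error from the linear model in section 1 (the model whose p-value of 0.85 is quoted above). The variable names are illustrative, and the critical value of 2.015 is an approximate figure taken from standard t tables, not from the appendix.

```python
# Two-tailed t-test for the null hypothesis B(Female) = 0 (appendix section 1).
coef_female = -1.052858856   # estimated coefficient (negative, matching the t Stat)
se_female = 5.561007986      # standard error of the coefficient
df_resid = 44                # residual degrees of freedom: n - p - 1 = 51 - 6 - 1

t_stat = coef_female / se_female

# Approximate two-tailed 5% critical value for 44 degrees of freedom
# (assumption: read from standard t tables).
t_crit_5pct = 2.015

reject_null = abs(t_stat) > t_crit_5pct
print(f"t = {t_stat:.4f}, reject the null at 5%: {reject_null}")
```

The t statistic is far inside the critical value, which agrees with the large p-value.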
(b) To answer this question, it is necessary to run a regression of sales
against all explanatory variables except state, female and high. The
results of this regression are reported in section 7 of the appendix.
Conducting an F-test between the full regression (section 1) and the
reduced regression (section 7) will help test the following hypothesis:
Ho: Beta (Female) = Beta (High) = 0
H1: Either one or both coefficients are not equal to 0
F-stat = [(SSErm - SSEfm)/(p+1-k)] / [SSEfm/(n-p-1)]
where rm and fm denote the reduced and full models respectively.
F-stat = 0.014
Since the F-stat is so low, we fail to reject the null hypothesis.
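The F statistic in (b) can be recomputed from the residual sums of squares in sections 1 and 7 of the appendix. The sketch below is an illustration under stated assumptions: `q` (the number of restrictions, here 2) plays the role of p+1-k, and the 5% critical value of about 3.21 for F(2, 44) is an approximate table value rather than something reported in the appendix. With two restrictions the statistic comes out at roughly 0.021. The 0.014 reported above appears to correspond to a divisor of 3 instead of 2. The conclusion is the same either way.

```python
# F-test for the null hypothesis Beta(Female) = Beta(High) = 0 in the linear model.
sse_reduced = 34959.76741   # residual SS, section 7 (Female and HS excluded)
sse_full = 34925.96885      # residual SS, section 1 (all variables except state)
n = 51                      # observations
p = 6                       # explanatory variables in the full model
q = 2                       # restrictions tested (Female and HS dropped)

numerator = (sse_reduced - sse_full) / q
denominator = sse_full / (n - p - 1)
f_stat = numerator / denominator

# Approximate 5% critical value of F(2, 44) (assumption: standard F tables).
f_crit_5pct = 3.21

reject_null = f_stat > f_crit_5pct
print(f"F = {f_stat:.4f}, reject the null at 5%: {reject_null}")
```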

From section 2 of the appendix, we can see that the lower 95% and upper
95% limits for the income variable are 3.13239E-05 and 0.000309126
respectively. Therefore, the 95% confidence interval is 0.000170225 +/- 0.0001389.
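As a quick arithmetic check, the half-width of the interval can be recovered from the two limits and compared with the critical value times the standard error. This is a minimal sketch using the Income row of section 2. The critical value of 2.015 for 44 degrees of freedom is an approximate table value and is an assumption, as are the variable names.

```python
# 95% confidence interval for the Income coefficient (appendix section 2).
coef_income = 0.000170225
lower_95 = 3.13239e-05
upper_95 = 0.000309126

half_width = (upper_95 - lower_95) / 2
midpoint = (upper_95 + lower_95) / 2   # should coincide with the coefficient

# Cross-check: the half-width should be close to t_crit * standard error.
se_income = 6.89209e-05
t_crit_5pct = 2.015  # approximate two-tailed 5% value for 44 df (assumption)

print(f"{midpoint:.9f} +/- {half_width:.7f}")
print(f"t_crit * SE = {t_crit_5pct * se_income:.7f}")
```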
(c) From section 4 of the appendix, we can see that the R^2 when Income is
removed from the regression equation is 0.3, implying that 30% of the
variation in sales can be explained when income is removed from the
model.
(d) From section 5 of the appendix, we can see that the R^2 when we
include only Price, Income and Age in the regression equation is 0.37.
This implies that Price, Income and Age can explain 37% of the variation
in sales. Note that R^2 has fallen by only 0.02 compared to regression
model 2 that included all of the variables except state. This suggests
that these three variables have the bulk of the explanatory power out of
all the ones that we have data for.
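The same point shows up in the adjusted R^2 values reported in the appendix, which penalise additional explanatory variables. The sketch below recomputes them from R^2 using the usual adjustment formula. The function name `adjusted_r2` is illustrative and not part of the original analysis.

```python
def adjusted_r2(r2: float, n: int, p: int) -> float:
    """Adjusted R^2 for a model with p explanatory variables and n observations."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

n = 51
full = adjusted_r2(0.391434352, n, p=6)      # section 2: all variables except state
reduced = adjusted_r2(0.365809595, n, p=3)   # section 5: Price, Income and Age only
print(f"full: {full:.4f}, reduced: {reduced:.4f}")
```

The adjusted R^2 is slightly higher for the three-variable model (about 0.325 against 0.308), which is consistent with the other variables adding little explanatory power.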
(e) From section 6 of the appendix, we can see that the R^2 when only
Income is included in the regression model is 0.14, implying that Income
can explain 14% of the variation in sales. At first sight, this seems to
contradict the earlier answer in which we removed income and R^2
fell by only 0.09, and not by 0.14. However, the reason for this is that
income is correlated with the other variables that are still included in the
regression equation. When income is removed, these act as partial
proxies for income and pick up part of its effect on sales.
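The comparison can be reproduced from the ANOVA tables in sections 2, 4 and 6, using R^2 = 1 - SSE/SST. This is a minimal sketch with illustrative dictionary keys. With unrounded figures the drop in R^2 when income is removed is about 0.08 (0.09 when the rounded values 0.39 and 0.30 are used), against a standalone R^2 of about 0.14.

```python
sst = 2.612054897  # total sum of squares of ln(Sales), identical in every log model

# Residual sums of squares from the appendix.
sse = {
    "all": 1.589606882,             # section 2: all variables except state
    "income_removed": 1.809991339,  # section 4
    "income_only": 2.252090509,     # section 6
}
r2 = {name: 1 - value / sst for name, value in sse.items()}

drop_when_removed = r2["all"] - r2["income_removed"]
standalone = r2["income_only"]
print(f"drop: {drop_when_removed:.3f}, standalone: {standalone:.3f}")
```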
Appendix
1. Linear Regression Model
SUMMARY OUTPUT- Linear Model

Regression Statistics
Multiple R           0.566429724
R Square             0.320842632
Adjusted R Square    0.228230264
Standard Error       28.17395995
Observations         51

ANOVA
             df    SS               MS               F                Significance F
Regression   6     16499.47468      2749.912446      3.46436052       0.006856991
Residual     44    34925.96885      793.7720194
Total        50    51425.44353

             Coefficients     Standard Error   t Stat           P-value          Lower 95%        Upper 95%
Intercept    103.3448457      245.6071851      0.420772893      0.675969084      -391.6439043     598.3335957
Age          4.520452423      3.219768497      1.403968151      0.167347577      -1.968564514     11.00946936
HS           -0.061586053     0.814684412      -0.075594981     0.940084002      -1.703474577     1.580302471
Income       0.018946453      0.010215988      1.854588473      0.070364211      -0.001642517     0.039535423
Black        0.357535168      0.487219338      0.73382795       0.466946041      -0.624390874     1.339461211
Female       -1.052858856     5.561007986      -0.18932878      0.850705811      -12.26033388     10.15461617
Price        -3.254918434     1.031407044      -3.155803959     0.002886409      -5.333582719     -1.176254149

[Figure: Sales vs Predicted Sales; x-axis: Predicted Sales, y-axis: Sales]

2. Non-Linear Regression Model
SUMMARY OUTPUT- Non-Linear Regression

Regression Statistics
Multiple R           0.625647146
R Square             0.391434352
Adjusted R Square    0.308448127
Standard Error       0.190072168
Observations         51

ANOVA
             df    SS               MS               F                Significance F
Regression   6     1.022448015      0.170408002      4.716859365      0.000869756
Residual     44    1.589606882      0.036127429
Total        50    2.612054897

             Coefficients     Standard Error   t Stat           P-value          Lower 95%        Upper 95%
Intercept    4.821796221      1.656958776      2.910027872      0.005650188      1.482415278      8.161177165
Age          0.038833553      0.021721774      1.787770803      0.08070137       -0.004943805     0.08261091
HS           -0.003483929     0.005496169      -0.633883217     0.529438762      -0.014560729     0.007592871
Income       0.000170225      6.89209E-05      2.46985781       0.017466212      3.13239E-05      0.000309126
Black        0.001831089      0.003286966      0.557075841      0.580298502      -0.004793355     0.008455533
Female       -0.012980646     0.037516659      -0.345996861     0.730994073      -0.088590503     0.062629211
Price        -0.024380487     0.006958261      -3.503818895     0.001066606      -0.038403941     -0.010357033

[Figure: ln(Sales) vs Predicted ln(Sales); x-axis: Predicted ln(Sales), y-axis: ln(Sales)]

3. Non-Linear Regression Model excluding Female and High
SUMMARY OUTPUT- Excluding Female and HS

Regression Statistics
Multiple R           0.619682888
R Square             0.384006881
Adjusted R Square    0.330442262
Standard Error       0.187025216
Observations         51

ANOVA
             df    SS               MS               F                Significance F
Regression   4     1.003047054      0.250761764      7.169039713      0.000142198
Residual     46    1.609007843      0.034978431
Total        50    2.612054897

             Coefficients     Standard Error   t Stat           P-value          Lower 95%        Upper 95%
Intercept    4.061253068      0.423298415      9.594302564      1.49396E-12      3.209197565      4.913308571
Income       0.000146469      4.66896E-05      3.137080193      0.002974138      5.24877E-05      0.00024045
Age          0.037786995      0.014894817      2.53692238       0.014639292      0.007805284      0.067768706
Black        0.002468201      0.002117318      1.16572014       0.249736561      -0.00179374      0.006730141
Price        -0.023703173     0.006775848      -3.498185462     0.001050873      -0.037342248     -0.010064099

4. Non-Linear Regression Model Excluding Income
SUMMARY OUTPUT- Excluding Income

Regression Statistics
Multiple R           0.554132015
R Square             0.30706229
Adjusted R Square    0.230069211
Standard Error       0.200554306
Observations         51

ANOVA
             df    SS               MS               F                Significance F
Regression   5     0.802063558      0.160412712      3.988180424      0.004437236
Residual     45    1.809991339      0.04022203
Total        50    2.612054897

             Coefficients     Standard Error   t Stat           P-value          Lower 95%        Upper 95%
Intercept    5.351700624      1.733618867      3.087011065      0.003455756      1.860013041      8.843388207
Age          0.063872321      0.020270431      3.15100961       0.002891868      0.023045579      0.104699064
HS           0.005799373      0.004231198      1.370622086      0.177291209      -0.002722697     0.014321443
Black        0.006208014      0.002921003      2.125302071      0.039085419      0.000324812      0.012091216
Female       -0.037496172     0.03817503       -0.982217229     0.331244423      -0.114384628     0.039392284
Price        -0.020834885     0.007184048      -2.900159344     0.005750394      -0.0353043       -0.006365469

5. Non-Linear Model with only Price, Age and Income
SUMMARY OUTPUT- Income, Age, Price

Regression Statistics
Multiple R           0.604821953
R Square             0.365809595
Adjusted R Square    0.325329356
Standard Error       0.187737943
Observations         51

ANOVA
             df    SS               MS               F                Significance F
Regression   3     0.955514743      0.318504914      9.036745018      7.82079E-05
Residual     47    1.656540154      0.035245535
Total        50    2.612054897

             Coefficients     Standard Error   t Stat           P-value          Lower 95%        Upper 95%
Intercept    4.127128249      0.42110809       9.800638707      6.1093E-13       3.279968057      4.97428844
Income       0.000149342      4.68022E-05      3.190921651      0.002527788      5.51883E-05      0.000243496
Age          0.037523826      0.014949862      2.509978155      0.015575595      0.007448584      0.067599068
Price        -0.02487975      0.006725788      -3.699157815     0.00056573       -0.03841029      -0.011349211

6. Non-Linear Model with only Income
SUMMARY OUTPUT- ln(Sales) vs Income

Regression Statistics
Multiple R           0.371226199
R Square             0.137808891
Adjusted R Square    0.120213154
Standard Error       0.214385239
Observations         51

ANOVA
             df    SS               MS               F                Significance F
Regression   1     0.359964388      0.359964388      7.831947665      0.007319681
Residual     49    2.252090509      0.045961031
Total        50    2.612054897

             Coefficients     Standard Error   t Stat           P-value          Lower 95%        Upper 95%
Intercept    4.235606774      0.194208239      21.80961429      7.04454E-27      3.845330714      4.625882833
Income       0.000142671      5.09801E-05      2.798561714      0.007319681      4.02226E-05      0.000245119

7. Linear Model excluding Female and HS
SUMMARY OUTPUT

Regression Statistics
Multiple R           0.565849272
R Square             0.320185398
Adjusted R Square    0.261071085
Standard Error       27.5680058
Observations         51

ANOVA
             df    SS               MS               F                Significance F
Regression   4     16465.67612      4116.419029      5.416376863      0.001167909
Residual     46    34959.76741      759.9949437
Total        50    51425.44353

             Coefficients     Standard Error   t Stat           P-value          Lower 95%        Upper 95%
Intercept    55.32958014      62.39529309      0.886758879      0.379821522      -70.26562873     180.924789
Income       0.018892061      0.006882169      2.745073749      0.008601136      0.005038974      0.032745148
Age          4.191538246      2.195534978      1.909119321      0.062496178      -0.227844379     8.61092087
Price        -3.239940647     0.998777726      -3.243905587     0.002198981      -5.250375905     -1.229505389
Black        0.334162426      0.312098265      1.070696198      0.289891915      -0.294058789     0.962383641