Professional Documents
Culture Documents
455
STATISTICAL
METHODS
I
Jan
Hannig
11/18/10
STOR455 Lecture 23
STOR455 Lecture 23
Project
I
have
decided
to
assign
a
project.
It
will
be
part
of
the
midterm
exam
score
(10%
of
the
nal
grade).
The
two
in-class
midterm
score
make
40%
of
the
grade.
Due
December
2.
There
are
three
les
(info,
more
info
and
data)
11/18/10
STOR455 Lecture 23
Remedial Measures
Discard outlier
Transforma<on data to
11/18/10
STOR455 Lecture 23
Box-Cox Transforma<ons
11/18/10
STOR455 Lecture 23
=
1,
Y
=
Y1,
no
transforma<on
=
.5,
Y
=
Y1/2,
square
root
=
-.5,
Y
=
Y-1/2,
one
over
square
root
=
-1,
Y
=
Y-1
=
1/Y,
inverse
=
0,
(Y
=
(Y
-
1)/),
log
is
the
limit
11/18/10
STOR455 Lecture 23
Box-Cox
Details
We
can
es<mate
by
including
it
as
a
parameter
in
a
non
linear
model
Y
=
0
+
1X
+
Choose
that
give
the
best
t
SAS
code
is
in
proc
transreg
11/18/10
STOR455 Lecture 23
Plutonium Example
11/18/10
STOR455 Lecture 23
Do
it
in
SAS
proc
transreg
data=plu;
model
boxcox(y)=iden<ty(x);
run;
11/18/10
STOR455 Lecture 23
Do
it
in
SAS
*Data
transforma<on;
data
plu1;
set
plu;
where
NOT
(
x
EQ
0
AND
y
GE
0.09
);
sqy=sqrt(y);
sqx=sqrt(x);
run;
proc
print
data=plu1;
run;
*
transform
y
and
x;
proc
reg
data=Plu1;
model
sqy
=
sqx;
plot
sqy*sqx
rstudent.*p.
r.*nqq.;
run;
N
23
Rs q
0. 9450
Adj Rs q
0. 9424
RM
SE
0. 0248
- 1
- 2
0. 050
0. 075
0. 100
0. 125
0. 150
0. 175
0. 200
0. 225
0. 250
0. 275
0. 300
0. 325
0. 350
Pr edi c t ed Val ue
N
23
Rs q
0. 9450
0. 04
Adj Rs q
0. 9424
RM
SE
0. 0248
0. 02
0. 00
- 0. 02
- 0. 04
- 0. 06
- 3
- 2
- 1
0
Nor m
al
Quant i l e
11/18/10
STOR455 Lecture 23
Do
it
in
SAS
Analysis
of
Variance
Sum
of
Mean
Source
DF
Squares
Square
F
Value
Pr
>
F
Model
1
0.22142
0.22142
360.92
<.0001
Error
21
0.01288
0.00061348
Corrected
Total
22
0.23430
Root
MSE
0.02477
R-Square
0.9450
Dependent
Mean
0.18483
Adj
R-Sq
0.9424
Coe
Var
13.40098
Parameter
EsSmates
Parameter
Standard
Variable
DF
EsSmate
Error
t
Value
Pr
>
|t|
Intercept
1
0.07301
0.00783
9.32
<.0001
sqx
1
0.05731
0.00302
19.00
<.0001
11/18/10
STOR455 Lecture 23
Do
it
in
SAS
The
nal
model
we
t
is
sqrt(y)=0.07301+0.05731*sqrt(x)+
Be
careful
when
interpre<ng
the
predicted
values
are
sqrt(y),
to
get
a
predicted
value
for
y
need
to
square!
11/18/10
STOR455 Lecture 23
Do
a
model
selec<on
proc
reg,
%allsubsreg.sas
11/18/10
STOR455 Lecture 23
11/18/10
STOR455 Lecture 23
SENIC
Data
Info
about
113
hospitals
between
1975
and
1976
Variables:
id,
length
of
stay,
age,
infec<on
risk,
rou<ne
culturing
ra<o,
rou<ne
chest
X-ray
ra<o,
number
of
beds,
medical
school
alia<on,
region
(1=NE,
2=NC,
3=S,
4=W),
average
daily
census,
number
of
nurses,
available
facili<es
File
AppendixC01.txt
in
the
extra
data
sets
11/18/10
STOR455 Lecture 23
STOR455 Lecture 23
11/18/10
STOR455 Lecture 23
STOR455 Lecture 23