UKSUG10 Royston

DIY fractional polynomials
Patrick Royston
MRC Clinical Trials Unit, London
10 September 2010
Overview
Introduction to fractional polynomials

Going off-piste: DIY fractional polynomials
Examples
Fractional polynomial models
A fractional polynomial of degree 1 with power p1 is defined as FP1 = $\beta_1 X^{p_1}$
A fractional polynomial of degree 2 with powers (p1, p2) is defined as FP2 = $\beta_1 X^{p_1} + \beta_2 X^{p_2}$
Powers (p1, p2) are taken from a predefined set
S = {-2, -1, -0.5, 0, 0.5, 1, 2, 3}
where 0 means log X
There are also "repeated powers" FP2 models, in which p1 = p2
Example: FP1 [power 0.5] = $\beta_1 X^{0.5}$
Example: FP2 [powers (0.5, 3)] = $\beta_1 X^{0.5} + \beta_2 X^{3}$
Example: FP2 [powers (3, 3)] = $\beta_1 X^{3} + \beta_2 X^{3} \ln X$
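The definitions above can be collected into one general statement. This is a sketch that assumes the usual convention of Royston & Altman (1994), in which a repeated power multiplies the second term by ln X and the power 0 is read as ln X; the intercept is omitted, as on the slide. The model counts follow directly from the eight powers in S.

```latex
\begin{align*}
\text{FP1:}\quad & \beta_1 X^{p_1}, \qquad X^{0} \equiv \ln X \\
\text{FP2:}\quad & \beta_1 X^{p_1} + \beta_2 X^{p_2} \qquad (p_1 \neq p_2) \\
\text{FP2, repeated power:}\quad & \beta_1 X^{p} + \beta_2 X^{p} \ln X \qquad (p_1 = p_2 = p) \\
\text{Number of FP1 models:}\quad & 8 \\
\text{Number of FP2 models:}\quad & \binom{8}{2} + 8 = 28 + 8 = 36
\end{align*}
```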
Some examples of fractional polynomial (FP2) curves
[Figure: four FP2 curves, with powers (-2, 1), (-2, 2), (-2, -2) and (-2, -1)]
Royston P, Altman DG (1994) Applied Statistics 43: 429-467.
FP analysis for the prognostic effect of age in breast cancer
FP function selection procedure
Simple functions are preferred. More complicated functions are accepted only if the fit is much better
Effect of age significant at 5% level?

Question                     Comparison                 χ²      df   P-value
Any effect?                  Best FP2 versus null       17.61   4    0.0015
Linear function suitable?    Best FP2 versus linear     17.03   3    0.0007
FP1 sufficient?              Best FP2 vs. best FP1      11.20   2    0.0037
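The three P-values can be checked with Stata's chi2tail() function. This is a minimal sketch that takes the degrees of freedom as 4, 3 and 2, following the usual FP convention (4 d.f. for an FP2 term, 2 for FP1, 1 for linear); treat those d.f. as an assumption.

```stata
// Upper-tail chi-squared probabilities for the three deviance differences.
// The degrees of freedom (4, 3, 2) are assumed from the usual FP convention.
display %6.4f chi2tail(4, 17.61)   // best FP2 versus null
display %6.4f chi2tail(3, 17.03)   // best FP2 versus linear
display %6.4f chi2tail(2, 11.20)   // best FP2 versus best FP1
```

Under that assumption the three results round to the reported 0.0015, 0.0007 and 0.0037.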
Fractional polynomials in Stata
fracpoly command
Basic syntax:
. fracpoly [, fp_options]: regn_cmd [yvar] xvar1 [xvars]
xvar1 is a continuous predictor which may have a curved relationship with yvar
xvars are other predictors, all modelled as linear
Can use the fp_option compare to compare the fit of different FP models
(uses the FP function selection procedure)
Example (auto data)
. fracpoly, compare: regress mpg displacement

Fractional polynomial model comparisons:
---------------------------------------------------------------------------
displacement      df    Deviance    Res. SD   Dev. dif.   P (*)   Powers
---------------------------------------------------------------------------
Not in model       0     468.789     5.7855      70.818   0.000
Linear             1     417.801    4.12779      19.830   0.000   1
m = 1              2     400.592    3.67467       2.621   0.284   -2
m = 2              4     397.971     3.6355         ---           -2 3
---------------------------------------------------------------------------
(*) P-value from deviance difference comparing reported model with m = 2 model
Show FP1 and FP2 models in Stata (+ fracplot)
But what if fracpoly can't fit my model?
fracpoly supports only some of Stata's rich set of regression-type commands
Provided we know what the command we want to fit looks like with a transformed covariate, we can fit an FP model to the data
We just create the necessary transformed covariate values, fit the model using them, and assess the fit
A new, simple command fracpoly_powers helps by generating strings (local macros) with the required powers:
. fracpoly_powers [, degree(#) s(list_of_powers) ]
Fitting an FP2 model in the auto example
// Store FP2 powers in local macros
fracpoly_powers, degree(2)
local np = r(np)
forvalues j = 1 / `np' {
    local p`j' `r(p`j')'
}
// Compute deviance for each model with covariate displacement
local x displacement
local y mpg
local devmin 1e30
quietly forvalues j = 1 / `np' {
    fracgen `x' `p`j'', replace
    regress `y' `r(names)'
    local dev = -2 * e(ll)
    if `dev' < `devmin' {
        local pbest `p`j''
        local devmin `dev'
    }
}
di "Best model has powers `pbest', deviance = " `devmin'
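Once the search has picked a power pair, a single model can be refitted by hand in the same way. The sketch below assumes the auto data and uses the powers (-2, 3) reported for m = 2 in the fracpoly output; the local macro names devlin and devfp2 are illustrative only.

```stata
sysuse auto, clear

// Deviance (-2 x log likelihood) of the straight-line model
quietly regress mpg displacement
local devlin = -2 * e(ll)

// Deviance of the FP2 model with powers (-2, 3)
fracgen displacement -2 3, replace
quietly regress mpg `r(names)'
local devfp2 = -2 * e(ll)

di "Deviance difference, FP2 vs linear = " %7.3f `devlin' - `devfp2'
```

If the deviances are computed as in the loop above, this difference should correspond to the Dev. dif. entry in the Linear row of the fracpoly comparison table.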
A real example: modelling fetal growth
Prospective longitudinal study of n = 50 pregnant women
There are about 6 repeated measurements on each fetus at
different gestational ages (gawks)
gawks = gestational age in weeks
Wish to model how y = log fetal abdominal circumference
changes with gestational age
There is considerable curvature!
The raw data
[Figure: log AC plotted against gestational age (wk)]
A mixed model for fetal growth
Multilevel (mixed) model to fit this relationship:
. xtmixed y FP(gawks) || id: FP(gawks), covariance(unstructured)
But how do we implement FP(gawks) here?
We want the best-fitting FP function of gawks, with random effects for the parameters (βs) of the FP model
Fitting an FP2 mixed model to the fetal AC data
[First run fracpoly_powers to create local macros with powers]
// Compute deviance for each FP model with covariate gawks
gen x = gawks
gen y = ln(ac)
local devmin 1e30
forvalues j = 1 / `np' {
    // Create the FP-transformed covariates for the j-th set of powers
    qui fracgen x `p`j'', replace adjust(mean)
    qui xtmixed y `r(names)' || id: `r(names)', ///
        nostderr covariance(unstructured)
    local dev = -2 * e(ll)
    // Keep track of the best-fitting (lowest-deviance) model so far
    if `dev' < `devmin' {
        local p `p`j''
        local devmin `dev'
    }
    di "powers = `p`j''" _col(20) " deviance = " %9.3f `dev'
}
di _n "Best model has powers `p', deviance = " `devmin'
Plots of some results
[Figure: fitted curves at the individual level, log AC against gestational age (wk)]
[Figure: residuals at the individual level, against gestational age (wk)]
[Figure: residuals and fitted residuals, against gestational age (wk)]
An ignorant example!
I know almost nothing about seemingly unrelated regression (Stata's sureg command)
It fits a set of linear regression models which have correlated error terms
The syntax therefore has a set of equations
. sureg (depvar1 varlist1) (depvar2 varlist2) ... (depvarN varlistN)
There may be non-linearities lurking in these equations
How can we fit FP models to varlist1, varlist2, ...?
Example: modelling learning scores
Stata FAQ from UCLA (http://www.ats.ucla.edu/stat/stata/faq/sureg.htm):
"What is seemingly unrelated regression and how can I perform it in Stata?"
Example: High School and Beyond study
Example: modelling learning scores
Contains data from hsb2.dta
  obs:           200                          highschool and beyond (200 cases)
 vars:            11                          5 Jul 2010 13:23
 size:         9,600 (99.9% of memory free)
-------------------------------------------------------------------------------
               storage  display    value
variable name  type     format     label     variable label
-------------------------------------------------------------------------------
id             float    %9.0g
female         float    %9.0g      fl
race           float    %12.0g     rl
ses            float    %9.0g      sl
schtyp         float    %9.0g      scl       type of school
prog           float    %9.0g      sel       type of program
read           float    %9.0g                reading score
write          float    %9.0g                writing score
math           float    %9.0g                math score
science        float    %9.0g                science score
socst          float    %9.0g                social studies score
-------------------------------------------------------------------------------
[It is unclear to me what ses (low, middle, high) is]
Example (ctd.)
As an example, suppose we wish to model 2 outcomes (read, math) as predicted by socst female ses and science female ses, using sureg as follows:
. sureg (read socst female ses) (math science female ses)
Are there non-linearities in read as a function of socst? In math as a function of science?
For simplicity here, we restrict ourselves to FP1 functions of socst and science
(not necessary in principle)
We fit the 8 × 8 = 64 FP1 models and look for the best-fitting combination
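Stata
// Assumes local macros p1, ..., p`np' already hold the FP1 powers
gen x1 = socst
gen x2 = science
gen y1 = read
gen y2 = math
local devmin 1e30
forvalues j = 1 / `np' {
    // FP1 transformation of socst with the j-th power
    qui fracgen x1 `p`j'', replace adjust(mean)
    local x1vars `r(names)'
    forvalues k = 1 / `np' {
        // FP1 transformation of science with the k-th power
        qui fracgen x2 `p`k'', replace adjust(mean)
        local x2vars `r(names)'
        qui sureg (y1 `x1vars' female ses) (y2 `x2vars' female ses)
        local dev = -2 * e(ll)
        // Keep the lowest-deviance combination of powers
        if `dev' < `devmin' {
            local px1 `p`j''
            local px2 `p`k''
            local devmin `dev'
        }
    }
}
[Run fpexample3.do in Stata]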
Comments
The results suggest that there is indeed curvature in both relationships
Can reject the null hypothesis of linearity at the 1% significance level
FP1 vs linear: χ² = 10.08 (2 d.f.), P = 0.0065
Shows the importance of considering non-linearity
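The reported P-value can be reproduced by hand, because the upper-tail probability of a chi-squared variable with 2 d.f. is exp(-x/2). Reading the 2 d.f. as one extra parameter (the power) for each of the two FP1 terms is an interpretation, not something stated on the slide.

```latex
\[
P = \Pr\left(\chi^2_2 > 10.08\right)
  = \exp\left(-\tfrac{10.08}{2}\right)
  = e^{-5.04}
  \approx 0.0065
\]
```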
read as a function of socst (adjusted female ses)
[Figure: partial predictor + residual of read against social studies score; fractional polynomial (3), adjusted for covariates]
math as a function of science (adjusted female ses)
[Figure: partial predictor + residual of math against science score; fractional polynomial (2), adjusted for covariates]
Conclusions
Fractional polynomial models are a simple yet very useful extension of linear functions and ordinary polynomials
If you are willing to do some straightforward do-file programming, you can apply them in a bespoke manner to a wide range of Stata regression-type commands and get useful results
For (much) more, see the book by Royston & Sauerbrei (2008)
