This article was downloaded by: [Indian Council of Medical Res], [ramesh athe]

On: 02 May 2013, At: 03:56


Publisher: Taylor & Francis
Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House, 37-41
Mortimer Street, London W1T 3JH, UK
The American Statistician
Publication details, including instructions for authors and subscription information:
http://www.tandfonline.com/loi/utas20
Comparing Treatments Using Quality-Adjusted Survival: The
Q-TWiST Method
Richard D. Gelber (a), Bernard F. Cole (b), Shari Gelber (c) & Aron Goldhirsch (d)
(a) Harvard Medical School, Harvard School of Public Health, Dana-Farber Cancer Institute, Boston, MA, 02115, USA
(b) Department of Community Health and Division of Applied Mathematics, Brown University, Providence, RI, 02912, USA
(c) Frontier Science and Technology Research Foundation, Brookline, MA, 02146, USA
(d) University of Bern, Scientific Director of the International Breast Cancer Study Group, Ospedale Civico, Servizio Oncologico, 6900, Lugano, Switzerland
Published online: 27 Feb 2012.
To cite this article: Richard D. Gelber, Bernard F. Cole, Shari Gelber & Aron Goldhirsch (1995): Comparing Treatments Using
Quality-Adjusted Survival: The Q-TWiST Method, The American Statistician, 49:2, 161-169
To link to this article: http://dx.doi.org/10.1080/00031305.1995.10476135
Comparing Treatments Using Quality-Adjusted Survival:
The Q-TWiST Method
Richard D. GELBER, Bernard F. COLE, Shari GELBER, and Aron GOLDHIRSCH
The quality of life of patients is an important component of
evaluation of therapies. We present an overview of a statistical
method called Q-TWiST (Quality-Adjusted Time
Without Symptoms and Toxicity) which incorporates
quality-of-life considerations into treatment comparisons.
Multivariate censored survival data are used to partition the
overall survival time into periods of time spent in a set of
progressive clinical health states which may differ in qual-
ity of life. Mean health state durations, restricted to the
follow-up limits of the clinical trial, are derived from the
data and combined with value weights to estimate quality-
adjusted survival. The methodology emphasizes treatment
comparisons based on threshold utility analyses that high-
light trade-offs between different health state durations; it
is not intended to provide a unique result combining quality
and quantity of life. We also describe three recent extensions
of the methodology: covariates can be included
using proportional hazards and accelerated failure time
regression models, restricted estimates can be projected
beyond follow-up limits using parametric models, and
meta-analyses can be performed incorporating quality-of-
life dimensions. The basic methods are demonstrated in
an analysis of data from a clinical trial comparing long
versus short duration adjuvant chemotherapy regimens for
the treatment of breast cancer. The clinical health states
are defined by the following three outcomes: (1) end of
treatment toxicity, (2) disease recurrence, and (3) death.
The results allow one to evaluate the trade-off between the
increased toxic effects and the increased recurrence-free
interval associated with the long duration treatment.
KEY WORDS: Clinical trials; Quality of life; Restricted
means; Survival analysis; Utility.
Richard D. Gelber is Professor of Pediatrics (Biostatistics), Harvard
Medical School, Harvard School of Public Health, and Dana-Farber
Cancer Institute, Boston, MA 02115. Bernard F. Cole is Assistant
Professor, Department of Community Health and Division of Applied
Mathematics, Brown University, Providence, RI 02912. Shari Gelber
is Biostatistician, Frontier Science and Technology Research Founda-
tion, Brookline, MA 02146. Aron Goldhirsch is Professor of Oncology,
University of Bern, Scientific Director of the International Breast Can-
cer Study Group, Ospedale Civico, Servizio Oncologico, 6900 Lugano,
Switzerland. Support for the clinical trial was provided by the Swiss
Cancer League, the Cancer League of Ticino, the Ludwig Institute for
Cancer Research, the Swedish Cancer Society, the Frontier Science and
Technology Research Foundation, and the Swiss Group for Clinical and
Epidemiological Cancer Research. Support for the methodological
development was provided by Grant PBR-53 from the American Cancer
Society and Grant CA-06516 from the National Cancer Institute. The
authors thank the patients, physicians, nurses, and data managers of the
International Breast Cancer Study Group who contributed to the clini-
cal trial described in this article. This paper was presented at the 1993
Spring Meetings of the Biometric Society (Eastern North America Re-
gion), Institute of Mathematical Statistics, and the American Statistical
Association, Philadelphia, PA, March 21-24, 1993. Portions of this
article from an earlier paper by the authors are reprinted with permission
from Cancer Treatment Reviews (Gelber, Goldhirsch, and Cole 1993a).
1. INTRODUCTION
The evaluation of treatments in terms of quality of life
is becoming increasingly important in clinical research
(Schumacher, Olschewski, and Schulgen 1991; Cox et al.
1992). In particular, there is a need to develop methods
for comparing the palliative effects of treatment options
within randomized clinical trials. Such methods are especially
useful in situations where a new treatment is not
shown to significantly prolong life but may improve
or maintain the quality of life of the patient.
For example, an experimental treatment may significantly
increase time to disease progression or recurrence as com-
pared with a standard treatment, but have only a modest
effect on overall survival. Thus the experimental treatment
represents an improvement in quality of life. On the other
hand, the treatment may have adverse side effects that
diminish quality of life. In this case there is a trade-off
between improved response and treatment toxicity. For
an individual patient the treatment selection depends not
only on the magnitude of these trade-offs, but also on his or
her preferences concerning the trade-offs. The purpose of
this article is to present an overview of a statistical method
called Q-TWiST that can be used to make treatment com-
parisons in terms of both quality and quantity of life, while
incorporating individual patient preferences.
First attempts at assessing the impact of treatments on
quality of life were made by identifying and grading the
side effects of treatments. Subsequent efforts have been
made to measure patients' perceptions of the influence of
such side effects and perceptions of symptoms of disease
(Priestman and Baum 1976). This has led to the develop-
ment of several instruments for assessing quality of life,
which have been reviewed for their attributes and value
for eliciting patient perceptions (see Maguire and Selby
(1989); Donovan, Sanson-Fisher, and Redman (1989);
and Moinpour et al. (1989) for examples). Further ef-
forts focused on the integration of both quality and quan-
tity of life into a single end point that may be used to
make treatment comparisons. This led to the development
of the Q-TWiST method. Q-TWiST stands for
Quality-Adjusted Time Without Symptoms of disease and Toxicity
of treatment, and was originally designed to incorporate
aspects of quality of life into adjuvant chemotherapy and
endocrine therapy comparisons for the treatment of breast
cancer. The methodology is an extension of the TWiST
method of Gelber and Goldhirsch (1986), which makes
treatment comparisons in terms of survival time with-
out symptoms of disease and toxicity of treatment (i.e.,
the survival time that remains after subtracting periods
of time with symptoms or toxicity from the overall sur-
vival time). The Q-TWiST method, which was first pro-
posed by Goldhirsch, Gelber, Simes, Glasziou, and Coates
(1989), allows for a portion of the time spent with symp-
toms or toxicity to be included in the comparison. This
© 1995 American Statistical Association. The American Statistician, May 1995, Vol. 49, No. 2, 161
is accomplished by placing value-weights on these peri-
ods according to their quality of life. The methodology
has been successfully applied in a number of analyses of
clinical trials. Gelber, Goldhirsch, and Cavalli (1991) and
Gelber, Goldhirsch, Hurny, Bernhard, and Simes (1992a)
present analyses of adjuvant therapies for operable breast
cancer, and Gelber et al. (1992b) and Lenderking et al.
(1994) present analyses of zidovudine therapy for HIV
infection.
In Section 2 we present a review of the Q-TWiST
methodology. Section 3 describes recent extensions which
(1) allow covariates to be included by proportional haz-
ards and accelerated failure time regression models, (2)
use parametric models to extrapolate beyond the follow-
up limits of the available data, and (3) provide a means for
performing meta-analyses. In Section 4 the basic proce-
dures are illustrated in an analysis of a clinical trial com-
paring long versus short duration chemotherapy for the
treatment of node-positive breast cancer. In this exam-
ple there is a trade-off between the toxic side effects of the
treatment and delayed disease recurrence. Practical issues
related to performing a Q-TWiST analysis are discussed
in Section 5.
2. A REVIEW OF THE Q-TWiST METHOD
The Q-TWiST method makes treatment comparisons
in terms of quality and quantity of life by penalizing
treatments which have negative quality-of-life effects and
rewarding those which increase survival and have other
positive quality-of-life effects. As in an ordinary survival
analysis, the focus of the method is on time, but rather
than look at a single end point such as overall survival or
disease-free survival, multiple outcomes corresponding to
changes in quality of life are considered. Periods of time
with the negative side effects of treatment are weighted
according to the severity of the side effects. A weight of
zero indicates the period of time is as bad as death, and a
weight of unity indicates perfect health. Weights between
zero and unity indicate degrees between these extremes.
These weights are called utility coefficients. A composite
measure of quality and quantity of life (i.e., quality-
adjusted survival) is obtained by summing the weighted
periods of time. This utility model makes two main as-
sumptions: (1) the quality-adjusted time spent in a health
state is directly proportional to the actual time spent in the
health state, where the proportionality is given by the util-
ity coefficient, and (2) the utility coefficient for a health
state is independent of the time the health state is entered,
as well as past and future quality of life.
A general, technical description of the Q-TWiST
method is given by Glasziou, Simes, and Gelber (1990).
The three steps involved in applying the method are
reviewed briefly below. Where appropriate, we have illust-
rated the concepts by referencing the figures for the spe-
cific example described in Section 4.
2.1 Step 1: Define Quality-of-Life Outcomes
The first step is to define quality-of-life oriented sur-
vival outcomes that are relevant for the disease setting
under study. These should highlight specific treatment
differences in terms of time and quality of life. These
survival outcomes are then used to define a series of pro-
gressive clinical health states which may differ in terms
of quality of life. These states are progressive because
a patient must proceed through them in order; however,
any of the states may be skipped, for example, due to
early death. In the case of adjuvant chemotherapy for re-
sectable breast cancer, the survival outcomes are defined
as follows: the time with toxicity (TOX), represented by
the period in which the patient is exposed to subjective
side effects of therapy; disease-free survival (DFS), the
time until disease recurrence or death, whichever occurs
first; and overall survival (OS), the time to death from
any cause. The resulting progressive clinical health states
are: time spent with treatment toxicity (TOX); time without
either symptoms of the disease or toxicity of treatment
(TWiST = DFS - TOX); and time following the
diagnosis of systemic spread of the disease or relapse
(REL = OS - DFS). The definitions of TOX and REL
reflect the fact that these periods of time have a negative
impact on the overall quality of life of the patient. Further-
more, their definitions are designed to emphasize the con-
trasting properties of the different treatments under study.
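The state definitions above can be illustrated for a single patient. The following is a minimal sketch, assuming all three transition times are fully observed (censoring is not handled here); the names `HealthStates` and `partition_patient` are hypothetical and introduced only for illustration.

```python
from typing import NamedTuple


class HealthStates(NamedTuple):
    tox: float    # time with treatment toxicity
    twist: float  # time without symptoms or toxicity (DFS - TOX)
    rel: float    # time following relapse (OS - DFS)


def partition_patient(tox: float, dfs: float, os_time: float) -> HealthStates:
    """Split one patient's overall survival into TOX, TWiST, and REL."""
    # Progressive states require the transition times to be ordered.
    if not 0 <= tox <= dfs <= os_time:
        raise ValueError("expected 0 <= TOX <= DFS <= OS for progressive states")
    return HealthStates(tox=tox, twist=dfs - tox, rel=os_time - dfs)


# Hypothetical patient: 6 months of toxicity, relapse at 30 months, death at 42.
states = partition_patient(tox=6, dfs=30, os_time=42)
print(states)  # HealthStates(tox=6, twist=24, rel=12)
```

A patient who dies without relapse has DFS equal to OS, so the REL state is skipped and its duration is zero.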
The defined survival outcomes (e.g., TOX, DFS, and
OS) indicate transitions between the progressive states
of health (e.g., TOX, TWiST, and REL). The transition
times may be subject to right censoring due to follow-up
loss or patients surviving beyond the follow-up interval.
As in standard survival analysis, this is acceptable if the
censoring mechanism does not provide information about
the failure mechanism (i.e., is noninformative). When
a transition time is censored, all subsequent transition
times in the progressive health state model are similarly
censored.
Each clinical health state is assigned a utility coeffi-
cient, which may be unknown. In our example the utility
coefficient for TWiST is assumed to be unity because it
characterizes a period of relatively perfect health. On the
other hand, the periods TOX and REL are associated with
diminished quality of life, but the exact values for their
utility coefficients are unknown. Therefore, we let $u_{\mathrm{TOX}}$
and $u_{\mathrm{REL}}$ denote the respective utility coefficients. These
express the value of time in TOX and REL relative to
TWiST. Figure 1 displays the different time periods in
this example according to assumed utility coefficients of
1.0 for TWiST and .5 for both TOX and REL. This represents
a scenario in which one month spent in TOX or REL is
equivalent in value to one-half month spent with the better
quality of life that characterizes TWiST.
[Figure 1 plot: utility (vertical axis) versus time (horizontal axis), showing the successive periods TOX, TWiST, and REL, ending at DEATH.]
Figure 1. Components of Quality-Adjusted Time Without Symptoms
and Toxicity (Q-TWiST). Illustrates the division of overall survival
into TOX (subjective toxic effects), TWiST, and REL (relapse),
and the weighting of these time periods using utility coefficients $u_{\mathrm{TOX}}$
and $u_{\mathrm{REL}}$.
The Q-TWiST outcome is calculated as the sum of the
clinical health state durations, each weighted by its utility
coefficient. For the breast cancer example,
$$\text{Q-TWiST} = u_{\mathrm{TOX}} \times \mathrm{TOX} + \mathrm{TWiST} + u_{\mathrm{REL}} \times \mathrm{REL}. \tag{1}$$
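Equation (1) is a plain weighted sum, which the following minimal sketch makes concrete. The function name `q_twist` and the durations in the usage lines are hypothetical, chosen only to illustrate the arithmetic.

```python
def q_twist(tox: float, twist: float, rel: float,
            u_tox: float, u_rel: float) -> float:
    """Quality-adjusted survival per equation (1); TWiST has utility 1."""
    for name, u in (("u_tox", u_tox), ("u_rel", u_rel)):
        if not 0.0 <= u <= 1.0:
            raise ValueError(f"{name} must lie between 0 and 1")
    return u_tox * tox + twist + u_rel * rel


# Hypothetical durations in months: TOX = 4, TWiST = 20, REL = 10.
print(q_twist(4, 20, 10, u_tox=0.5, u_rel=0.5))  # 27.0
print(q_twist(4, 20, 10, u_tox=1.0, u_rel=1.0))  # 34.0
print(q_twist(4, 20, 10, u_tox=0.0, u_rel=0.0))  # 20.0
```

With both coefficients equal to one the sum reduces to overall survival, and with both equal to zero it reduces to TWiST, which brackets the range of possible quality-adjusted values.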
2.2 Step 2: Partition Overall Survival
The second step is to consider each treatment separately
and to partition the overall survival time into the defined
clinical health states. This is done using the Kaplan-Meier
product limit method (Kaplan and Meier 1958) to graph
the transitional survival curves (e.g., the survival curves
for TOX, DFS, and OS). The areas between the curves
are estimates of the mean health state durations. For ex-
ample, the area beneath the TOX curve is an estimate of
the mean duration of TOX, the area between the DFS and
TOX survival curves is an estimate of the mean duration
of TWiST, and the area between the OS and DFS survival
curves is an estimate of the mean duration of REL. These
estimates have a multivariate normal limiting distribution
(Glasziou et al. 1990; Breslow and Crowley 1974), and
do not suffer from bias due to the induced dependency be-
tween the censoring mechanism and health state duration
distributions (Gelber, Gelman, and Goldhirsch 1989).
In practice, censoring often precludes one from estimat-
ing the entire survival curve. In this case the average health
state durations (i.e., the areas between the survival curves)
are calculated within the follow-up interval of the study co-
hort. The resulting estimates are called restricted means
(Kaplan and Meier 1958). Covariation among these re-
stricted means can be estimated using a resampling proce-
dure such as the bootstrap method (Glasziou et al. 1990).
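The partitioning step can be sketched in a few lines. The code below is a minimal illustration, not the authors' software: it computes a product limit curve, integrates it up to a restriction time, and takes differences of the resulting restricted means. The function names and the toy data in the usage lines are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Curve = List[Tuple[float, float]]  # (event time, survival just after it)


def kaplan_meier(times: Sequence[float], events: Sequence[bool]) -> Curve:
    """Product limit estimate; events[i] is False for a censored time."""
    data = sorted(zip(times, events))
    n_at_risk, surv, curve, i = len(data), 1.0, [], 0
    while i < len(data):
        t, deaths, removed = data[i][0], 0, 0
        while i < len(data) and data[i][0] == t:  # group tied times
            deaths += 1 if data[i][1] else 0
            removed += 1
            i += 1
        if deaths:
            surv *= 1.0 - deaths / n_at_risk
            curve.append((t, surv))
        n_at_risk -= removed
    return curve


def restricted_mean(curve: Curve, limit: float) -> float:
    """Area under the survival step function from 0 to the restriction time."""
    area, prev_t, prev_s = 0.0, 0.0, 1.0
    for t, s in curve:
        if t >= limit:
            break
        area += prev_s * (t - prev_t)
        prev_t, prev_s = t, s
    return area + prev_s * (limit - prev_t)


def state_durations(tox: Curve, dfs: Curve, os_: Curve,
                    limit: float) -> Dict[str, float]:
    """Areas between the transitional curves, restricted to `limit`."""
    m_tox, m_dfs, m_os = (restricted_mean(c, limit) for c in (tox, dfs, os_))
    return {"TOX": m_tox, "TWiST": m_dfs - m_tox, "REL": m_os - m_dfs}


# Toy data (months); the third subject is censored at 4.
curve = kaplan_meier([2, 4, 4, 6, 8], [True, True, False, True, False])
print(round(restricted_mean(curve, limit=8), 3))  # 5.4
```

Because each mean is the area under its own curve, the three state durations always add up to the restricted mean of overall survival.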
As a useful visual display, the transitional survival cur-
ves corresponding to the multiple outcomes for one treat-
ment can be plotted on the same graph. Separate graphs
can be produced for each treatment group. These are called
partitioned survival plots (see Figure 3, Section 4).
2.3 Step 3: Compare the Treatments
The third step is to compare the treatment regimens in
terms of quality-adjusted survival (Q-TWiST). This com-
posite measure is obtained by the linear combination of
the estimated restricted mean health state durations cal-
culated in Step 2 and the utility coefficients. For exam-
ple, estimates of TOX, TWiST, and REL are substituted
into equation (1). This is done separately for each treat-
ment group, and the treatment effects are estimated by
computing the differences in Q-TWiST (e.g., treatment
group minus control group Q-TWiST) for specific values
of the utility coefficients. Standard error estimates can
be obtained using the bootstrap method. Statistical infer-
ences on the treatment effects can be conducted using the
large sample theory for restricted means estimated from
the Kaplan-Meier survival curves.
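A bootstrap standard error for the Q-TWiST difference can be sketched as follows. To keep the example short it assumes, unlike most real trials, that every patient's state durations are fully observed within the restriction time; with censored data each bootstrap replicate would instead recompute the restricted means from the product limit curves. The function names and the patient records are hypothetical.

```python
import random
import statistics
from typing import List, Sequence, Tuple

Patient = Tuple[float, float, float]  # (TOX, TWiST, REL) durations in months


def mean_q_twist(arm: Sequence[Patient], u_tox: float, u_rel: float) -> float:
    """Average quality-adjusted survival in one treatment arm."""
    return statistics.mean([u_tox * t + w + u_rel * r for t, w, r in arm])


def bootstrap_se(arm_a: Sequence[Patient], arm_b: Sequence[Patient],
                 u_tox: float, u_rel: float,
                 n_boot: int = 2000, seed: int = 0) -> float:
    """Standard error of the difference (arm_a minus arm_b) in mean Q-TWiST."""
    rng = random.Random(seed)
    diffs: List[float] = []
    for _ in range(n_boot):
        # Resample patients with replacement, separately within each arm.
        a = rng.choices(arm_a, k=len(arm_a))
        b = rng.choices(arm_b, k=len(arm_b))
        diffs.append(mean_q_twist(a, u_tox, u_rel) - mean_q_twist(b, u_tox, u_rel))
    return statistics.stdev(diffs)


# Hypothetical arms of four patients each.
long_arm = [(6, 50, 8), (6, 30, 12), (5, 70, 0), (6, 44, 10)]
short_arm = [(1, 40, 15), (1, 20, 20), (1, 66, 3), (1, 35, 18)]
se = bootstrap_se(long_arm, short_arm, u_tox=0.5, u_rel=0.5)
print(f"bootstrap SE of the Q-TWiST difference: {se:.2f} months")
```

Resampling whole patients keeps the TOX, TWiST, and REL durations of each patient together, so the covariation among the health state durations is carried into the standard error.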
The influence of patient preferences on treatment choice
can be examined by a sensitivity analysis, called a thresh-
old utility analysis, which displays the treatment compari-
son for varying values of the utility coefficients (Glasziou
et al. 1990). When two treatments are being compared
and there are two utility coefficients, the sensitivity anal-
ysis can be presented as a two-dimensional plot with a
straight line, called a threshold line, indicating pairs of
utility coefficients for which the two treatments have equal
Q-TWiST (see Figure 5, Section 4). The threshold line is
obtained by setting the treatment effect equal to zero and
solving for the unknown utility coefficients, producing a
linear equation. A confidence region for the threshold
line can also be obtained by finding the pairs of utility
coefficient values for which the confidence interval for
the treatment effect captures zero. The plot shows which
treatment is preferred in terms of Q-TWiST for each pair
of coefficient values.
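For the three-state example of equation (1), the threshold line can be written out explicitly. In the sketch below, $\Delta_{\mathrm{TOX}}$, $\Delta_{\mathrm{TWiST}}$, and $\Delta_{\mathrm{REL}}$ denote the between-treatment differences in the restricted mean durations; this notation is introduced here for illustration and is not the authors'.

```latex
\begin{align*}
\Delta_{Q}(u_{\mathrm{TOX}}, u_{\mathrm{REL}})
  &= u_{\mathrm{TOX}}\,\Delta_{\mathrm{TOX}} + \Delta_{\mathrm{TWiST}}
     + u_{\mathrm{REL}}\,\Delta_{\mathrm{REL}}, \\
0 &= u_{\mathrm{TOX}}\,\Delta_{\mathrm{TOX}} + \Delta_{\mathrm{TWiST}}
     + u_{\mathrm{REL}}\,\Delta_{\mathrm{REL}}, \\
u_{\mathrm{REL}}
  &= -\,\frac{\Delta_{\mathrm{TWiST}} + u_{\mathrm{TOX}}\,\Delta_{\mathrm{TOX}}}
            {\Delta_{\mathrm{REL}}},
  \qquad \Delta_{\mathrm{REL}} \neq 0 .
\end{align*}
```

The last line is linear in $u_{\mathrm{TOX}}$, which is why the threshold appears as a straight line in the unit square of utility coefficients.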
It is also possible to investigate how the Q-TWiST treat-
ment effect unfolds over the course of follow-up. This
is accomplished by performing the analysis at an evenly
spaced sequence of times (restriction times) leading up to
the follow-up limit. For example, if there are ten years of
follow-up, then the analysis could be restricted to yearly
intervals beginning at zero and ending at ten. The results
can be plotted on a time axis for particular values of the
utility coefficients or as a region indicating the range of
the treatment effect as the utility coefficients vary between
zero and one. This is called the Q-TWiST gain function
(see Figure 4, Section 4).
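The gain function can be sketched by repeating the comparison at a sequence of restriction times. The example below assumes fully observed transition times for brevity, so that restricting to a limit simply truncates each time at that limit; with censoring, restricted means from the product limit curves would be used instead. The names and the two single-patient arms are hypothetical.

```python
from typing import List, Sequence, Tuple

Patient = Tuple[float, float, float]  # (end of TOX, DFS, OS) in months


def restricted_q_twist(p: Patient, limit: float,
                       u_tox: float, u_rel: float) -> float:
    """Q-TWiST for one patient, counting time only up to the restriction time."""
    tox, dfs, os_time = (min(x, limit) for x in p)
    return u_tox * tox + (dfs - tox) + u_rel * (os_time - dfs)


def gain_function(arm_a: Sequence[Patient], arm_b: Sequence[Patient],
                  limits: Sequence[float],
                  u_tox: float, u_rel: float) -> List[Tuple[float, float]]:
    """Difference in mean Q-TWiST (arm_a minus arm_b) at each restriction time."""
    def arm_mean(arm: Sequence[Patient], limit: float) -> float:
        return sum(restricted_q_twist(p, limit, u_tox, u_rel) for p in arm) / len(arm)

    return [(limit, arm_mean(arm_a, limit) - arm_mean(arm_b, limit))
            for limit in limits]


# One hypothetical patient per arm: longer toxicity but later relapse in arm A.
arm_a = [(6, 60, 70)]
arm_b = [(1, 40, 60)]
for limit, gain in gain_function(arm_a, arm_b, [12, 84], u_tox=0.5, u_rel=0.5):
    print(limit, gain)
# 12 -2.5
# 84 12.5
```

In this toy case the gain is negative at one year, when only the extra toxicity has been observed, and positive at seven years, once the delayed relapse has accumulated.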
3. RECENT EXTENSIONS
3.1 Regression Models
Covariates and prognostic factors can be easily incor-
porated into a Q-TWiST analysis with standard regression
methods for survival analysis, allowing the inclusion of
continuous covariates as well as discrete stratifying vari-
ables. In most cases the entire sample of patients can
be used to estimate one model for each survival out-
come, avoiding the problem of decreased sample sizes
due to stratification. This has been done with proportional
hazards models (Cole, Gelber, and Goldhirsch 1993)
and accelerated failure time models (Cole, Gelber, and
Anderson 1994).
Proportional hazards regression can be used instead
of the product limit method in Step 2 of the Q-TWiST
methodology to estimate survival curves for the health
state transitions according to various predetermined pa-
tient profiles. Specifically, a proportional hazards model
is fit to each of the progressive survival outcomes, and the
resulting estimates are used to predict survival curves for
various covariate values. Threshold utility analyses, based
on the predicted survival curves, are performed for each
of the patient profiles, allowing one to evaluate treatment
effectiveness under a variety of prognostic situations. If
the proportional hazards assumption is not appropriate for
a particular covariate, then a stratified analysis can be used.
Accelerated failure time regression can be used in a
similar fashion, or it may be used in a more complicated
approach that involves the conditional modeling of health
state transitions given previous transitions and health state
durations (Cole et al. 1994). This represents a more di-
rect modeling of the health state transitions as a semi-
Markov stochastic process. The intensity function for
each transition is assumed to have a certain functional
form (e.g., Weibull, log-normal, etc.), and is conditional
on the previous health state transitions. Covariates are in-
cluded by assuming that the location parameter for each
model is a linear combination of covariate values and un-
known parameters. The parameters are estimated by max-
imum likelihood. The expected health state durations and
quality-adjusted survival are approximated by simulat-
ing data from the estimated regression models, and infer-
ence is conducted using the bootstrap method or the delta
method. This procedure also allows for models that do
not involve progressive health states; however, a sufficient
number of observations making each type of transition is
required.
3.2 Extrapolation of Survival Curves
The procedures described thus far use restricted sur-
vival means to produce a composite measure of quality
and quantity of life. The estimated mean treatment effect
in terms of quality-adjusted survival is restricted to the
follow-up limits of the data, and therefore does not ad-
dress the possible long-term treatment effect. In situations
where there are a sufficient number of events, long-term
effects may be extrapolated from the available data to pro-
vide an indication of what may occur in the future.
The extrapolation methodology is introduced by Gelber,
Goldhirsch, and Cole (1993b), and consists of using para-
metric models to model the tail of a survival curve and
project the product limit estimate beyond the follow-up
interval. Cut points are used to define where the tail be-
gins, and where observations may contribute to the likeli-
hood for estimating the model. Probability plots are used
to determine appropriate parametric models and values for
the cut points. The procedure is especially useful when it
is difficult to fit a parametric model to the entire survival
curve, but one is easily fit to the tail portion. For exam-
ple, early failures in clinical trials may be influenced by
the healthy entrant phenomenon that suggests that patients
entering a clinical trial, being initially healthy enough to
undergo the treatment, are at decreased risk for disease
recurrence and death soon after enrollment. Such a phe-
nomenon may be difficult to model and is not central to the
extrapolation problem. In this case a composite estimator
based on the product limit method and a parametric model
is convenient, useful, and appropriate.
To produce extrapolated estimates of quality-adjusted
survival, the extrapolation methodology is applied to the
survival curves corresponding to the progressive health
state transitions. Mean health state durations may then be
estimated using the projected survival curves, allowing
estimates to be restricted to some limit greater than the
follow-up interval. The bootstrap method may then be used
for statistical inference.
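A composite estimator of this kind can be sketched as follows. The source does not fix a particular tail model, so the exponential (constant hazard) tail used here is an assumption made purely for illustration, as are the function names and the numbers in the usage lines.

```python
import math
from typing import List, Sequence, Tuple

Curve = List[Tuple[float, float]]  # (event time, survival just after it), sorted


def exponential_tail_rate(times: Sequence[float], events: Sequence[bool],
                          cut: float) -> float:
    """Maximum likelihood hazard using only follow-up after the cut point."""
    n_events = sum(1 for t, e in zip(times, events) if e and t > cut)
    exposure = sum(t - cut for t in times if t > cut)
    if n_events == 0 or exposure <= 0:
        raise ValueError("no events after the cut point; cannot fit the tail")
    return n_events / exposure


def extrapolated_restricted_mean(curve: Curve, cut: float,
                                 rate: float, limit: float) -> float:
    """Area under the product limit curve to `cut`, plus an exponential tail to `limit`."""
    if limit < cut:
        raise ValueError("limit must not precede the cut point")
    area, prev_t, prev_s = 0.0, 0.0, 1.0
    for t, s in curve:
        if t > cut:
            break
        area += prev_s * (t - prev_t)
        prev_t, prev_s = t, s
    area += prev_s * (cut - prev_t)  # prev_s now equals S(cut)
    # Integral of S(cut) * exp(-rate * (t - cut)) from cut to limit.
    tail = prev_s * (1.0 - math.exp(-rate * (limit - cut))) / rate
    return area + tail


# Hypothetical follow-up (months) with one censored time, cut point at 4.
rate = exponential_tail_rate([2, 4, 6, 9, 10], [True, True, True, False, True], cut=4)
curve = [(2, 0.8), (4, 0.6)]  # product limit curve up to the cut point
print(extrapolated_restricted_mean(curve, cut=4, rate=rate, limit=24))
```

Applying the same function to the TOX, DFS, and OS curves and differencing the results gives projected state durations, with the caveats about sparse tails discussed in the next paragraph.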
For the extrapolation methodology to be successful, it is
necessary to have a sufficient follow-up period and a suffi-
cient number of events for evaluating the fit of the model;
otherwise, the projected estimates could be misleading.
Although it is not possible to fully evaluate the accuracy
of statistical inferences based on projected estimates with-
out continued follow-up, a reasonable range of projections
based on careful modeling of a large data set can provide
estimates which supplement the more traditional measures
such as relative risk reduction, and represent a more com-
plete use of the clinical trial data.
3.3 Meta-Analysis
Cole, Gelber, and Goldhirsch (1994) present an exten-
sion of the Q-TWiST method to perform meta-analysis.
The method was applied to evaluate results from eight
clinical trials comparing chemotherapy versus control in
the treatment of breast cancer in women under 50 years
of age. The median follow-up intervals for these trials
range from a minimum of three years to a maximum of
ten years. The meta-analysis procedure uses multivariate
multiple regression models (one for the treatment group
and one for the control group) to combine individual trial
analyses in a manner that accounts for varying follow-
up intervals between the trials, and provides a summary in
terms of quality-adjusted survival. The method consists of
four steps and is a modification of the three-step standard
Q-TWiST procedure outlined in Section 2. The first step
in the analysis is exactly the same as Step 1 in Section 2,
and the remaining steps are as follows:
Step 2: Restricted mean health state transition times
are estimated separately for the treatment group and the
control group for each of the clinical trials under consid-
eration. The restriction time is the follow-up limit, which
may differ for each trial. Each trial also contributes an esti-
mated covariance matrix corresponding to the mean health
state transition times.
Step 3: A multivariate multiple regression model is fit to
the resulting estimates separately for the treatment group
and the control group. The dependent variables are the
restricted means for the progressive survival outcomes,
and the independent variable is the follow-up limit. Pow-
ers of the follow-up limit may also be included as inde-
pendent variables. Each trial contributes one multivariate
data point to the estimation for the two models. Regres-
sion parameters are estimated by generalized least squares
in order to accommodate the covariance estimates for the
health state durations among the clinical trials.
Step 4: The regression models estimated in Step 3 are
used to predict the health state durations for a particular
follow-up limit, which is generally some number smaller
than the largest follow-up interval observed for the trials
under consideration. This is done for each of the treatment
groups. For example, the regression models could be used
to predict the mean durations of TOX, TWiST, and REL
restricted to ten years. These estimates are then used to
predict mean quality-adjusted survival. The resulting esti-
mates take into account the data from all trials under con-
sideration. Statistical inference is carried out using the
covariance matrix of the regression parameter estimates.
This procedure assumes that the estimated mean health
state durations for each trial are normally distributed,
which is appropriate if the large sample properties of the
Kaplan-Meier product limit estimator apply.
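The regression idea in Steps 3 and 4 can be illustrated in a deliberately simplified form. The sketch below handles one health state at a time and weights each trial by the inverse of its estimated variance, which is a univariate stand-in for the multivariate generalized least squares fit described above, not a reproduction of it. The function names and all trial values are hypothetical.

```python
from typing import Sequence, Tuple


def weighted_line_fit(x: Sequence[float], y: Sequence[float],
                      var: Sequence[float]) -> Tuple[float, float]:
    """Inverse-variance weighted fit of y = intercept + slope * x."""
    w = [1.0 / v for v in var]
    sw = sum(w)
    sx = sum(wi * xi for wi, xi in zip(w, x))
    sy = sum(wi * yi for wi, yi in zip(w, y))
    sxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    sxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    denom = sw * sxx - sx * sx
    if denom == 0:
        raise ValueError("need at least two distinct follow-up limits")
    slope = (sw * sxy - sx * sy) / denom
    intercept = (sy - slope * sx) / sw
    return intercept, slope


def predict(fit: Tuple[float, float], limit: float) -> float:
    """Predicted restricted mean duration at a chosen follow-up limit."""
    intercept, slope = fit
    return intercept + slope * limit


# Hypothetical inputs: one point per trial for a single treatment group.
follow_up_years = [3.0, 5.0, 7.0, 10.0]   # restriction time of each trial
mean_twist_months = [30.0, 46.0, 60.0, 80.0]  # restricted mean TWiST per trial
variances = [4.0, 2.5, 3.0, 6.0]          # estimated variance of each mean
fit = weighted_line_fit(follow_up_years, mean_twist_months, variances)
print(predict(fit, limit=8.0))
```

Trials with more precise estimates pull the fitted line harder, which is the role the covariance estimates play in the generalized least squares fit.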
4. EXAMPLE
To illustrate the evaluation of treatment effectiveness
using Q-TWiST, we applied the standard methodology as
described in Section 2 to a randomized clinical trial of ad-
juvant chemotherapy for resectable breast cancer. Trial V
[Figure 2 plots: panel (a) DFS and panel (b) OS; vertical axis 0 to 100, horizontal axis "Months from Randomization", 0 to 84.]
Figure 2. Disease-Free Survival (a) and Overall Survival (b)
Comparing Long Duration Chemotherapy (Solid Line) Versus Short
Duration Chemotherapy (Dashed Line) for 1,229 Patients with Node-
Positive Breast Cancer in International Breast Cancer Study Group
(IBCSG) Trial V at Seven Years of Median Follow-Up.
of the International Breast Cancer Study Group (IBCSG)
investigated, in patients with node-positive breast cancer,
the effectiveness of short duration (one month) peri-
operative systemic treatment compared with long dura-
tion adjuvant therapy (six or seven months) [see Ludwig
Breast Cancer Study Group (1988); Gelber et al. (1992a)].
The short duration therapy consisted of perioperative
chemotherapy given on days 1 and 8 after surgery. The
long duration treatment regimen consisted of chemother-
apy for six months either following the perioperative
course or initiated three to five weeks after surgery (without
the perioperative course). A total of 1,229 patients
were randomized to the two treatments. Four hundred
thirteen patients were randomized to the short duration
treatment, and 816 patients were randomized to the long
duration treatment. The median follow-up for this analysis
was seven years.
Figure 2 shows the DFS and OS comparisons of the long
duration group versus the short duration group. Table 1
gives the seven-year percentages for DFS and OS according
to treatment group.
4.1 Partitioning Overall Survival
Figure 3 shows the partitioned survival plots according
to treatment group. The areas between the curves give the
average amount of time spent in TOX, TWiST, and REL
as indicated. The larger area of TOX and the smaller area
of REL are characteristics of the long duration treatment
in terms of time with reduced quality of life.
Table 2 gives the average amounts of time in TOX,
TWiST, and REL up to seven years from randomization
Table 1. Seven-Year Disease-Free Survival (DFS) and Overall
Survival (OS) Percentages According to Treatment for 1,229
Patients With Node-Positive Breast Cancer in International Breast
Cancer Study Group Trial V

Chemotherapy treatment    7-year DFS % (S.E.)    7-year OS % (S.E.)
Long duration             51 (1.8)               63 (1.8)
Short duration            33 (2.5)               50 (2.7)
Log-rank test,
2-sided P-value           <.0001                 .0002
derived from the partitioned survival plots. The two
right-hand columns of the table refer to the treatment
differences (long duration minus short duration) for the
average amount of time patients spend in the various
states. The Q-TWiST calculation was made as an example,
attributing utility coefficients of .5 to both TOX
and REL. These values were arbitrarily selected to il-
lustrate the method, and do not represent specific val-
ues actually derived from individual patient preferences.
Within seven years, the amount of Q-TWiST gained by
the long duration treatment compared with the short du-
ration treatment was five months, an amount of time
gained even after quality-of-life adjustments for toxic ef-
fects and disease relapse. The 95% confidence interval
[Figure 3: panel (a), long duration treatment; panel (b), short duration treatment. Horizontal axis: Months from Randomization (0 to 84). Vertical axis: 0 to 100.]
Figure 3. Partitioned Survival Plots. Partitioned survival for the
long duration treatment (a) and for the short duration treatment (b)
for IBCSG Trial V at seven years of median follow-up. In each graph
the area under the overall survival curve (OS) is partitioned by the
survival curves for disease-free survival (DFS) and time with treatment
toxicity (TOX). The areas between the survival curves give the
average months spent in TOX, TWiST, and REL as indicated.
The American Statistician, May 1995, Vol. 49, No. 2 165
Table 2. Average Months of Time According to Quality of Life End Point for 1,229 Patients in International Breast Cancer Study Group Trial V

                                  Chemotherapy treatment
End point                      Long duration   Short duration   Difference   95% C.I.
TOX                                  6               1               5       4.9 to 5.1
TWiST                               54              47               6       3 to 10
REL                                  9              16              -7       -9 to -5
Q-TWiST (uTOX = uREL = 0.5)         61              56               5       3 to 8
OS                                  69              64               5       2 to 8
DFS                                 59              48              11       8 to 15
for the gain in Q-TWiST, with utility coefficients equal
to .5 for both TOX and REL, was between three and eight
months, suggesting an advantage for the long duration
regimen.
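As an illustration of the calculation, the Q-TWiST row of Table 2 can be approximately reproduced from the TOX, TWiST, and REL rows. This is a sketch only: the function name `q_twist` is hypothetical, and the inputs are the rounded whole-month values printed in the table, so the results differ slightly from the published figures, which were presumably computed from unrounded means.

```python
def q_twist(tox, twist, rel, u_tox, u_rel):
    """Quality-adjusted months: TWiST counts fully, TOX and REL are down-weighted."""
    return u_tox * tox + twist + u_rel * rel

# Rounded seven-year means in months, copied from Table 2.
long_arm = {"tox": 6, "twist": 54, "rel": 9}
short_arm = {"tox": 1, "twist": 47, "rel": 16}

q_long = q_twist(**long_arm, u_tox=0.5, u_rel=0.5)
q_short = q_twist(**short_arm, u_tox=0.5, u_rel=0.5)
gain = q_long - q_short
print(q_long, q_short, gain)  # prints: 61.5 55.5 6.0
```

The recomputed values (61.5, 55.5, and a gain of 6.0 months) sit within rounding error of the reported 61, 56, and 5 months.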
4.2 Q-TWiST Gain Function
By restricting the Q-TWiST analysis to yearly inter-
vals leading up to the seven year analysis, we see how
Q-TWiST gains for the long duration treatment are accu-
mulated over time. This is described by the Q-TWiST
gain function shown in Figure 4. The solid line within
the shaded region reflects the result for utility coefficients
of .5 for both TOX and REL. Early in the course of the
follow-up, the toxic effects of the long duration treatment
result in a loss in Q-TWiST compared with the short duration
treatment. This is because the advantages of the long
duration treatment (i.e., increased DFS and OS) do not appear
until later. As the benefits are realized with
follow-up, the Q-TWiST gain function begins to increase,
and will continue to increase provided the DFS curves
for the two treatments remain separated. The shaded region
in Figure 4 illustrates the range of results for the
Q-TWiST gain function as the coefficient values for TOX
and REL range between 0 and 1. The lower edge of the
shaded region corresponds to utility coefficient values of
uTOX = 0 and uREL = 1, while the upper edge corresponds
to uTOX = 1 and uREL = 0.
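The edges of the shaded region fall at those corner values because the gain is linear in the utility coefficients. As a worked sketch, write \Delta for the long-minus-short difference in mean time at a given follow-up, and assume \Delta_{TOX} > 0 and \Delta_{REL} < 0, as in this trial. The seven-year numbers are the rounded differences from Table 2, so the results are approximate.

```latex
\begin{align*}
G(u_{TOX}, u_{REL}) &= \Delta_{TWiST} + u_{TOX}\,\Delta_{TOX} + u_{REL}\,\Delta_{REL},
  \qquad 0 \le u_{TOX},\, u_{REL} \le 1, \\
\min G &= G(0, 1) = \Delta_{TWiST} + \Delta_{REL} \approx 6 - 7 = -1 \text{ month}, \\
\max G &= G(1, 0) = \Delta_{TWiST} + \Delta_{TOX} \approx 6 + 5 = 11 \text{ months}.
\end{align*}
```

The upper value agrees with the DFS difference of 11 months in Table 2, as it should, because TOX and TWiST together make up DFS.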
[Figure 4: horizontal axis, Years (0 to 7).]
Figure 4. Q-TWiST Gain Function. The solid dark curve gives the
average months of Q-TWiST (for uTOX = uREL = .5) gained for the
long duration treatment compared with the short duration treatment in
IBCSG Trial V as a function of years from randomization. The shaded
region surrounding the solid curve shows the ranges for the Q-TWiST
gain function as the utility coefficients vary between 0 and 1.
4.3 Threshold Utility Analysis
Clearly, the results of a Q-TWiST analysis depend on
the values of the utility coefficients. A threshold utility
analysis illustrates the treatment comparison results for
all combinations of utility coefficient values, allowing the
interpretation of clinical trial results based on individual
patient preference. Figure 5 shows the threshold utility
analysis for the IBCSG Trial V data at seven years. The
solid threshold line in the lower right corner of the graph
indicates values of uTOX and uREL for which the treatments
have equal Q-TWiST. The long duration treatment has
greater Q-TWiST for pairs of utility coefficients that fall
above the threshold line, while the short duration treat-
ment has greater Q-TWiST for pairs of values that fall
below the threshold line. The dashed line gives an upper
95% confidence band for the threshold line. The lower
confidence band is outside the range of possible utility
coefficients. The results show that at seven years, a sig-
nificant Q-TWiST gain was achieved for a large range of
choices for the utility coefficients.
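The position of the threshold line can be sketched from the seven-year differences in Table 2 by setting the Q-TWiST gain to zero and solving for uTOX. This is an approximation under stated assumptions: the inputs are the rounded table values, the helper names `q_twist_gain` and `threshold_u_tox` are hypothetical, and the code gives only the point estimate of the line, not the confidence band, which would require standard errors not reported in the table.

```python
# Rounded seven-year differences (long minus short) from Table 2, in months.
D_TOX, D_TWIST, D_REL = 5, 6, -7

def q_twist_gain(u_tox, u_rel):
    """Q-TWiST gain of long over short duration for a pair of utility coefficients."""
    return D_TWIST + u_tox * D_TOX + u_rel * D_REL

def threshold_u_tox(u_rel):
    """Value of uTOX at which the gain is zero for a given uREL."""
    # Solve D_TWIST + u_tox * D_TOX + u_rel * D_REL = 0 for u_tox.
    return -(D_TWIST + u_rel * D_REL) / D_TOX

u_rel_intercept = -D_TWIST / D_REL   # uREL where the line meets uTOX = 0
print(u_rel_intercept, threshold_u_tox(1.0))
```

With these inputs the line runs from about uREL = 0.86 on the horizontal axis to uTOX = 0.2 at uREL = 1, that is, a small triangle in the lower right corner, which agrees with the description of Figure 5.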
Figure 5 allows one to determine the treatment preference
given a pair of utility coefficient values. For example,
[Figure 5: vertical axis, uTOX (0.0 to 1.0); horizontal axis, uREL (0.0 to 1.0); region labeled "Longer Duration Sig. Better".]
Figure 5. Threshold Utility Analysis for IBCSG Trial V. Both uTOX
(vertical axis) and uREL (horizontal axis) range between 0 and 1,
where the value 1 indicates that the time is worth the same as TWiST,
while the value 0 indicates that the time is worth nothing. The solid
line is the threshold (based on values of uTOX and uREL) for which
the treatments have equal Q-TWiST. The dashed line shows the 95%
confidence band for the threshold. The region denoted by "Longer
Duration Sig. Better" indicates the values of utility coefficients for
which average Q-TWiST at seven years after randomization was statistically
significantly greater for the long duration chemotherapy treatment
compared with the short duration chemotherapy treatment.
for a patient with utility coefficient values of uTOX = uREL =
.5, the long duration treatment is significantly better, and
thus is the preferred treatment in terms of Q-TWiST. On
the other hand, for a patient with utility coefficient values
of uTOX = .1 and uREL = .9, for whom the disutility of toxic
effects is great while the disutility of relapse is minimal,
the gain in Q-TWiST at seven years is not significant. In
this case the treatment preference in terms of Q-TWiST
is not conclusive. It is important to note that the threshold
utility analysis does not indicate the distribution of
the utility coefficients for the population. In other words,
the threshold line does not tell us how many patients prefer
one treatment over the other. This question must be
addressed with additional research.
4.4 Incorporating Prognostic Factors
To illustrate one of the recent extensions of the Q-TWiST
methodology, we performed a proportional hazards
Q-TWiST analysis of the IBCSG Trial V data using
the following prognostic factors: tumor size, age, tumor
grade, and the number of lymph nodes involved. Treatment
group was included as an added covariate. Models
were fit to each of the survival outcomes, DFS and OS, and all
covariates were statistically significant (p < .05). Other
factors, such as estrogen receptor status, were not included
because they were not statistically significant. The product
[Figure 6: panels (a) and (b). Vertical axis, uTOX (0.0 to 1.0); horizontal axis, uREL (0.0 to 1.0); regions labeled "Longer Duration Sig. Better".]
Figure 6. Threshold Utility Analyses for Two Patient Profiles
Based on the Proportional Hazards Model for 1,229 Patients in IBCSG
Trial V. Threshold diagrams are shown for 45-year-old patients in a
good prognostic situation (a) and in a poor prognostic situation (b).
The regions denoted by "Longer Duration Sig. Better" indicate the values
of utility coefficients for which average Q-TWiST at seven years
after randomization was statistically significantly greater for the long
duration chemotherapy treatment compared with the short duration
chemotherapy treatment.
limit method was used to estimate the survival curves for
TOX according to treatment group. A proportional haz-
ards model was not used for TOX because none of the
prognostic factors was significant in the model, and the
proportional hazards assumption did not appear appropri-
ate for the treatment group covariate. For DFS and OS,
goodness-of-fit tests did not suggest that the proportional
hazards assumption was violated.
Threshold utility analyses at seven years based on the
model are presented in Figure 6 for two patient profiles.
These profiles represent a good prognosis and a poor
prognosis for a 45-year-old patient. The range of utility
coefficients favoring the more toxicity-intensive long
duration chemotherapy is larger for the poor prognosis situation
than for the good prognosis situation. This
is the case even though relative effectiveness is similar
for both patient profiles; that is, the same percentage reduction
in the risk of an event is achieved for good and
poor prognosis. Figure 6 illustrates how, from a patient's
point of view, a poor prognosis scenario has the potential
to gain more Q-TWiST in the short term (within seven
years), thus increasing the rationale to use long duration
chemotherapy.
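The following toy calculation may help show why the same relative risk reduction can yield a larger absolute gain for a poor prognosis within seven years. It is only a sketch under strong assumptions that are not part of the original analysis: a single event-free endpoint rather than full Q-TWiST, a constant (exponential) hazard in place of the fitted proportional hazards models, and hypothetical hazard values and hazard ratio.

```python
import math

def restricted_mean_exponential(hazard, horizon):
    """Mean event-free time within the horizon under a constant hazard."""
    return (1.0 - math.exp(-hazard * horizon)) / hazard

def gain_from_treatment(baseline_hazard, hazard_ratio, horizon):
    """Extra event-free months when treatment multiplies the hazard by hazard_ratio."""
    treated = restricted_mean_exponential(baseline_hazard * hazard_ratio, horizon)
    control = restricted_mean_exponential(baseline_hazard, horizon)
    return treated - control

HORIZON = 84.0       # months (seven years)
HAZARD_RATIO = 0.7   # hypothetical; identical relative effect in both profiles

gain_good = gain_from_treatment(0.003, HAZARD_RATIO, HORIZON)  # hypothetical low monthly hazard
gain_poor = gain_from_treatment(0.015, HAZARD_RATIO, HORIZON)  # hypothetical high monthly hazard
print(round(gain_good, 1), round(gain_poor, 1))
```

With these made-up inputs the poor prognosis profile gains roughly 8 months against roughly 3 months for the good prognosis profile, although the hazard ratio is the same in both.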
5. APPLYING THE Q-TWiST METHOD
As illustrated by the IBCSG example, a Q-TWiST anal-
ysis may be performed retrospectively after the completion
of a clinical trial. In this case data must have been col-
lected in the trial to enable the partition of overall survival
into the clinically relevant health states. These are often
broadly defined, for example, using the entire treatment
period to represent TOX.
Alternatively, a Q-TWiST analysis can be planned
prospectively and specified as part of the protocol doc-
ument. Each clinical health state should be defined with
the assistance of a clinical colleague. This will ensure that
the appropriate data are collected for evaluating the clinical
health states. Patient-derived utilities could also be collected
during the trial (Weeks et al. 1994). Methods for deriving
utility scores such as standard gamble, time trade-off,
and multiattribute techniques are discussed by Torrance
(1986). In addition to a Q-TWiST analysis incorporating
the patient-derived utilities, we recommend performing a
threshold utility analysis to allow individual patients to
determine the treatment choice for their particular prefer-
ence scores.
In practice, the most challenging component to define
is toxicity. Typically, it is preferable to use criteria that
focus on symptomatic rather than on laboratory events, as
the former most directly influence patients' quality of life.
It may be difficult to precisely accommodate intermittent
toxicities because the clinical health states are progres-
sive. It is possible, however, to define toxicity as the time
period from initial treatment until all toxicity has disap-
peared. If there are long periods of time captured in this
definition that are actually free of toxicity, this will be re-
flected by having a higher value for the average toxicity
utility coefficient.
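To make the last point concrete, the average toxicity utility can be viewed as a time-weighted mean over the sub-periods of the TOX state. The sketch below uses hypothetical durations and utilities chosen only for illustration, and the helper name `average_utility` is an assumption.

```python
def average_utility(segments):
    """Time-weighted mean utility over (duration, utility) segments."""
    total_time = sum(duration for duration, _ in segments)
    return sum(duration * utility for duration, utility in segments) / total_time

# Hypothetical six-month TOX period: two symptomatic months valued at 0.3
# and four symptom-free months valued the same as TWiST (1.0).
u_tox = average_utility([(2, 0.3), (4, 1.0)])
print(round(u_tox, 2))
```

Here the symptom-free months pull the average coefficient for the whole TOX period up to about 0.77, well above the 0.3 assigned to the symptomatic months alone.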
By defining the clinical health states to reflect specific
trade-offs of concern to health professionals and patients,
Q-TWiST provides a framework for treatment decision-
making. For example, to evaluate the role of zidovudine
therapy for asymptomatic patients with HIV infection,
progressive health states of TWiST, adverse events (AE:
symptomatic sequelae associated with treatment or dis-
ease), and progression (Prog: clinical definition of HIV
progression) were defined (Lenderking et al. 1994). In
this case Q-TWiST = TWiST + uAE × AE + uProg × Prog,
focusing attention on the trade-off between increased adverse
events and delayed disease progression associated
with zidovudine therapy.
Q-TWiST can also be used to evaluate chemotherapy
for small cell lung cancer, which can prolong survival by
a month or two and may also relieve symptoms of the
disease, but at a cost of severe side effects of treatment.
By defining appropriate health states for treatment toxicity
and palliation of disease symptoms, the Q-TWiST method
can highlight the benefits and costs of chemotherapy in
this setting. Furthermore, different therapies for small cell
lung cancer may be less successful in returning patients
to states of relatively good health, and in this case TWiST
may be assigned a utility value less than one.
A Q-TWiST analysis of the efficacy of treatments designed
to prolong event-free survival can highlight the influence
of late sequelae by defining clinical health states to
capture the occurrence of late events. This approach is currently
being applied to evaluate treatments for childhood
acute lymphoblastic leukemia and Hodgkin's disease.
6. CONCLUSIONS
The evaluation of treatment effectiveness in terms of
quality of life will become increasingly important in clinical
trials. For chronic illnesses with no cure, new treatments
will need to be evaluated not only for a survival
effect but also for possible palliative advantages. In this
article we have presented a review of the Q-TWiST method,
which is directly applicable and well suited for this purpose
because treatments are evaluated simultaneously in terms
of quantity and quality of life. Other quality-of-life measures,
which do not account for time, only indirectly reflect
benefits of delayed disease recurrence. Another advantage
of Q-TWiST is that the method does not aggregate quality-of-life
results for an entire population; instead, it allows
individual patients and physicians to determine the recommended
treatment according to individual preferences.
This advantage is obtained from a threshold utility analysis,
which gives the preferred treatment according to all
combinations of the utility coefficients.
We have also described in this article various extensions
of the Q-TWiST methodology. The extension to regres-
sion models allows the evaluation of treatment effects, in
terms of quality of life, to be made according to differ-
ent prognostic situations. The extrapolation methodology
provides a means for investigating long-term treatment ef-
fects when there are sufficient data for modeling the tails of
the survival curves. The final extension to meta-analysis
allows clinical trials, having different length follow-up
intervals, to be combined in such a way that aggregate
Q-TWiST analyses are possible.
The Q-TWiST method provides a quality-adjusted sur-
vival analysis for clinical trial data that is most useful for
treatment decision-making. The results can be used for
treatment recommendations for individual patients, as well
as for clinical trial evaluations of therapeutic regimens.
[Received June 1993. Revised November 1994.]
REFERENCES
Breslow, N. E., and Crowley, J. (1974), "A Large Sample Study of the Life Table and Product Limit Estimates Under Random Censorship," Annals of Statistics, 2, 437-453.
Cole, B. F., Gelber, R. D., and Anderson, K. M., for the International Breast Cancer Study Group (1994), "Parametric Approaches to Quality-Adjusted Survival Analysis," Biometrics, 50, 621-631.
Cole, B. F., Gelber, R. D., and Goldhirsch, A., for the International Breast Cancer Study Group (1993), "Cox Regression Models for Quality-Adjusted Survival Analysis," Statistics in Medicine, 12, 975-987.
Cole, B. F., Gelber, R. D., and Goldhirsch, A., for the International Breast Cancer Study Group (1995), "A Quality Adjusted Survival Meta-Analysis of Adjuvant Chemotherapy for Premenopausal Breast Cancer," Statistics in Medicine, 14, 1771-1784.
Cox, D. R., Fitzpatrick, R., Fletcher, A. E., Gore, S. M., Spiegelhalter, D. J., and Jones, D. R. (1992), "Quality of Life Assessment: Can We Keep It Simple?," Journal of the Royal Statistical Society, Part A, 155, 353-393.
Donovan, K., Sanson-Fisher, R. W., and Redman, S. (1989), "Measuring Quality of Life in Cancer Patients," Journal of Clinical Oncology, 7, 959-968.
Gelber, R. D., Gelman, R. S., and Goldhirsch, A. (1989), "A Quality-of-Life-Oriented Endpoint for Comparing Therapies," Biometrics, 45, 781-795.
Gelber, R. D., and Goldhirsch, A. (1986), "A New Endpoint for the Assessment of Adjuvant Therapy in Postmenopausal Women with Operable Breast Cancer," Journal of Clinical Oncology, 4, 1772-1779.
Gelber, R. D., Goldhirsch, A., and Cavalli, F., for the International Breast Cancer Study Group (1991), "Quality-of-Life-Adjusted Evaluation of a Randomized Trial Comparing Adjuvant Therapies for Operable Breast Cancer," Annals of Internal Medicine, 114, 621-628.
Gelber, R. D., Goldhirsch, A., and Cole, B. F., for the International Breast Cancer Study Group (1993a), "Evaluation of Effectiveness: Q-TWiST," Cancer Treatment Reviews, 19, 73-84.
Gelber, R. D., Goldhirsch, A., and Cole, B. F. (1993b), "Parametric Extrapolation of Survival Estimates to Quality-of-Life Evaluation of Treatments," Controlled Clinical Trials, 14, 485-489.
Gelber, R. D., Goldhirsch, A., Hurny, C., Bernhard, J., and Simes, R. J., for the International Breast Cancer Study Group (1992a), "Quality of Life in Clinical Trials of Adjuvant Therapies," Journal of the National Cancer Institute Monographs, 11, 127-135.
Gelber, R. D., Lenderking, W. R., Cotton, D. J., Cole, B. F., Fischl, M. A., Goldhirsch, A., and Testa, M. A., for the AIDS Clinical Trials Group (1992b), "Quality-of-Life Evaluation in a Clinical Trial of Zidovudine Therapy in Patients with Mildly Symptomatic HIV Infection," Annals of Internal Medicine, 116, 961-966.
Glasziou, P. P., Simes, R. J., and Gelber, R. D. (1990), "Quality Adjusted Survival Analysis," Statistics in Medicine, 9, 1259-1276.
Goldhirsch, A., Gelber, R. D., Simes, R. J., Glasziou, P., and Coates, A., for the Ludwig Breast Cancer Study Group (1989), "Costs and Benefits of Adjuvant Therapy in Breast Cancer: A Quality Adjusted Survival Analysis," Journal of Clinical Oncology, 7, 36-44.
Kaplan, E. L., and Meier, P. (1958), "Nonparametric Estimation from Incomplete Observations," Journal of the American Statistical Association, 53, 457-481.
Lenderking, W. R., Gelber, R. D., Cotton, D. J., Cole, B. F., Goldhirsch, A., Volberding, P. A., and Testa, M. A. (1994), "Evaluation of the Quality-of-Life Assessment in Asymptomatic Human Immunodeficiency Virus Infection," New England Journal of Medicine, 330, 738-743.
Maguire, P., and Selby, P., on behalf of the Medical Research Council's Cancer Therapy Committee Working Party on Quality of Life (1989), "Assessing Quality of Life in Cancer Patients," British Journal of Cancer, 60, 437-440.
Moinpour, C. M., Feigl, P., Metch, B., Hayden, K. A., Meyskens, Jr., F. L., and Crowley, J. (1989), "Quality of Life End Points in Cancer Clinical Trials: Review and Recommendations," Journal of the National Cancer Institute, 81, 485-495.
Priestman, T. J., and Baum, M. (1976), "Evaluation of Quality of Life in Patients Receiving Treatment for Advanced Breast Cancer," Lancet, 1, 899-900.
Schumacher, M., Olschewski, M., and Schulgen, G. (1991), "Assessment of Quality of Life in Clinical Trials," Statistics in Medicine, 10, 1915-1930.
The Ludwig Breast Cancer Study Group (1988), "Combination Adjuvant Chemotherapy for Node-Positive Breast Cancer: Inadequacy of a Single Perioperative Cycle," New England Journal of Medicine, 319, 677-683.
Torrance, G. W. (1986), "Measurement of Health State Utilities for Economic Appraisal: A Review," Journal of Health Economics, 5, 1-30.
Weeks, J. C., O'Leary, J., Fairclough, D., Paltiel, D., and Weinstein, M. (1994), "The Q-tility Index: A New Tool for Assessing Health-Related Quality of Life and Utilities in Clinical Trials and Clinical Practice," in Proceedings of ASCO 1994, 13, p. 436.