CHAPTER 7
Incremental Response Modeling
A standard part of marketing campaigns in many industries is to offer
coupons to encourage adoption of a new good or service. This
enticement is often essential for success because it gives customers who
are currently satisfied with the product they already use an additional
incentive to switch. For example, if I am a longtime buyer of laundry
detergent XYZ, then to get me to try a competitor's product, detergent
ABC, I will need an incentive great enough to move me outside my comfort
zone of buying detergent XYZ on my trips to the store. This incentive or
inducement could be superior performance, but I will never know that the
performance is superior unless I try the product. Another inducement
could be value: detergent ABC cleans just as well as detergent XYZ, and I
get a larger quantity of detergent ABC for the same price as detergent XYZ.
This strategy, like superior quality, is also predicated on my trying the
product. All strategies that will successfully convert me from detergent
XYZ to detergent ABC require me to try the product, and the most common
142 BIG DATA, DATA MINING, AND MACHINE LEARNING
The people reached by a marketing campaign fall into one of three groups:
people who will not purchase the product regardless of the incentive, those
who switched products because of the campaign, and those who were going
to switch already and can now purchase the product at a reduced rate.
Of these three groups, we would like to target with our marketing only
those who switched because of the coupon. There is a fourth category
that I will discuss here and not refer to again. This is the group referred
to as sleeping dogs; they are mostly encountered in political campaigns,
not marketing campaigns. Sleeping dogs are people who already purchase
your product or subscribe to your political view but who respond
negatively to being included in a campaign and leave your brand.
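The grouping above can be read as a function of two hypothetical outcomes per customer: whether they would buy without the coupon and whether they would buy with it. The sketch below is only an illustration of that reading; the function name, the group labels, and the sample customers are assumptions, and in practice only one of the two outcomes is ever observed for a given person.

```python
from collections import Counter

def campaign_group(buys_without_coupon: bool, buys_with_coupon: bool) -> str:
    """Map a customer's two hypothetical outcomes to one of the four groups."""
    if buys_without_coupon and buys_with_coupon:
        return "would switch anyway"      # coupon only lowers the price paid
    if not buys_without_coupon and buys_with_coupon:
        return "incremental responder"    # the group worth targeting
    if buys_without_coupon and not buys_with_coupon:
        return "sleeping dog"             # being contacted drives them away
    return "will not purchase"            # no incentive changes the outcome

# Hypothetical customers: (buys without coupon, buys with coupon)
customers = [(False, False), (False, True), (True, True), (True, False), (False, True)]
group_counts = Counter(campaign_group(without, with_) for without, with_ in customers)
```

Only the "incremental responder" group changes its behavior in the desired direction, which is why it is the only group worth the cost of a coupon.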
Public radio is another example that demonstrates incremental response.
Public radio, supported by listeners, has fund drives several times
throughout the year to raise money to support the programming. Many
listeners donate as soon as they hear the fund drive begin because they
want to support the programming and feel a duty to do so. Another group
probably would not donate to the station except for the appeal of a coffee
mug, tote, video collection, or some other gift that rewards their
generosity and sways them to pick up the phone or go to the website and
contribute. The problem for public broadcasting is that it cannot
distinguish between the two groups. If it could, it could save the costs
associated with those giveaway items and therefore reduce overall cost.
[Figure 7.1: Control and Treatment groups, each divided into Purchase = Yes
and Purchase = No; the Purchase = Yes section of the treatment group
includes the Incremental Response portion.]
The method behind incremental response modeling is this: Begin
with a treatment group and a control group. Assignment to these groups
should follow the methodology of the clinical trial literature. Here the
treatment is a coupon. Note that you can use multiple treatments, but
here we will discuss only the binary case.
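Clinical trials typically rely on random assignment to form comparable groups. The following is a minimal sketch of that idea for the binary case, assuming a simple random split; the function name, the 50/50 fraction, and the customer identifiers are illustrative choices, not part of the source.

```python
import random

def assign_groups(customer_ids, treatment_fraction=0.5, seed=0):
    """Randomly split customers into a treatment group and a control group."""
    rng = random.Random(seed)          # fixed seed so the split is reproducible
    shuffled = list(customer_ids)
    rng.shuffle(shuffled)
    n_treatment = round(len(shuffled) * treatment_fraction)
    return shuffled[:n_treatment], shuffled[n_treatment:]

# Hypothetical customer identifiers 0..999
treatment_ids, control_ids = assign_groups(range(1000))
```

Customers in `treatment_ids` would receive the coupon and those in `control_ids` would not; because the split is random, differences in purchase behavior can be attributed to the coupon rather than to how the groups were chosen.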
Once you have divided your sample into the two assigned groups,
administered either the treatment or the control, and gathered the
results, you can begin to apply this methodology. Figure 7.1 shows the
difference between the control group and the treatment group. The
treatment group received the coupon for detergent ABC while the control
group received no coupon. You can see that the coupon was effective in
increasing sales (based on the size of the box labeled Purchase=Yes),
but the top section in the treatment group represents all people who
purchased the detergent. Those in the box labeled Incremental Response
purchased because of the coupon; the rest would have purchased
regardless. Therefore, the treatment group did not maximize profit,
because the detergent was sold to that second set of buyers at a lower
price than they would otherwise have paid. It is rare but possible that
the treatment group, those who received the coupon, could actually
generate less profit than the control group, where nothing was done.
Then sort the resulting PD from largest to smallest, and the top
deciles are the incremental responders. Any predictive model can be
employed in the differencing technique, such as the regression-based
differencing model and the tree-based differencing model.
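The sketch below illustrates the differencing idea under a stated assumption: that PD is the difference between the predicted purchase probability with the treatment and without it (this excerpt does not show the definition). Per-segment purchase rates stand in for the predictive model, and the segment names and data are hypothetical.

```python
from collections import defaultdict

def purchase_rate_by_segment(records):
    """Estimate P(purchase) per segment from (segment, purchased) records."""
    totals, buyers = defaultdict(int), defaultdict(int)
    for segment, purchased in records:
        totals[segment] += 1
        buyers[segment] += int(purchased)
    return {segment: buyers[segment] / totals[segment] for segment in totals}

def rank_by_difference(treatment_records, control_records):
    """Score each segment by treatment rate minus control rate, largest first."""
    treated = purchase_rate_by_segment(treatment_records)
    control = purchase_rate_by_segment(control_records)
    shared = treated.keys() & control.keys()
    scores = {segment: treated[segment] - control[segment] for segment in shared}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Hypothetical outcomes: (segment, purchased)
treatment = ([("young", True)] * 6 + [("young", False)] * 4
             + [("loyal", True)] * 8 + [("loyal", False)] * 2)
control = ([("young", True)] * 2 + [("young", False)] * 8
           + [("loyal", True)] * 8 + [("loyal", False)] * 2)
ranking = rank_by_difference(treatment, control)
```

In this toy data the "young" segment ranks first because its purchase rate rises from 20% to 60% with the coupon, while the "loyal" segment buys at the same rate either way. A regression or tree model would replace the per-segment rates with per-customer predictions.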
An improved method is to look only at the control group, the people
who did not get the coupon, and classify each person as an outlier
or not. Several techniques can be used for classifying outliers when
you have only one group. A method that has good results classifying
outliers is the one-class support vector machine (SVM).
Recently a new method was suggested that uses an outlier detection
technique, particularly the one-class SVM. The suggested method uses
the control group data to train the model and uses the treatment group
as a validation set. The detected outliers are considered incremental
responses. This new method shows much better results than the
differencing technique. The technique is illustrated with plots below, but
more details can be found in the paper by Lee listed in the references.
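As a rough sketch of the train-on-control, score-on-treatment idea, the example below uses scikit-learn's `OneClassSVM`. It is not the implementation from the cited paper: the two synthetic features, the group sizes, the shifted cluster, and the `nu` value are all assumptions made for illustration.

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)

# Hypothetical two-feature data for purchasers in each group.
control = rng.normal(loc=0.0, scale=1.0, size=(500, 2))
treatment_like_control = rng.normal(loc=0.0, scale=1.0, size=(200, 2))
treatment_shifted = rng.normal(loc=4.0, scale=0.5, size=(50, 2))
treatment = np.vstack([treatment_like_control, treatment_shifted])

# Train on the control group only; nu is an upper bound on the fraction of
# training points treated as outliers (0.1 is an arbitrary choice here).
model = OneClassSVM(kernel="rbf", gamma="scale", nu=0.1).fit(control)

# predict returns +1 for inliers and -1 for outliers.
labels = model.predict(treatment)
potential_incremental = treatment[labels == -1]
```

Treatment-group purchasers who resemble the control-group purchasers are labeled inliers, while those who look unlike anyone who bought without a coupon are flagged as candidates for incremental response.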
In Figure 7.2, we see a graphical representation of the points from
the control group that have been identified as outliers. The dots closer
to the origin than the dashed line are classified as part of the negative
class, and the dots up and to the right of the dashed line are classified
as part of the positive class. The points in the negative class (those
between the origin and the dashed line) are considered outliers. They
receive this designation because particular features, or a combination
of many features, identify them as different from the overall group.
[Figures: Control and Treatment panels, each with the origin marked.]
are the potential incremental responders. Those are the people who
purchased because of the treatment; in this specific example, the
coupon for detergent ABC. I used the word potential because
there is no definitive way in real life to objectively measure
which people responded only as a result of the treatment. The method
can, however, be tested empirically using simulation, and that work has
illustrated its effectiveness. Figure 7.5 is an example of
a simulation study.
Figure 7.5 shows 1,300 responders in the treatment group. This
includes 300 true incremental responders. The method described above
identified 296 observations as incremental responders, and 280 of
those identified were true positives. This is all the more impressive
because, as you can see, there is no simple way to use straight lines
to separate the gray true incremental responders from the black
nonincremental responders. This leaves 20 true incremental responders
who were not identified and 16 observations that were incorrectly
identified. See Table 7.1 for a tabular view.
[Figure 7.5: simulation scatter plot of x2 against x1; the legend
distinguishes Respondents from Nonresponders.]
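The counts in this simulation can be checked with a few lines of arithmetic. The sketch below derives the remaining cells of the two-by-two tabulation from the four figures given in the text; the variable names and the choice of summary rates are illustrative.

```python
# Counts taken from the simulation described above.
total_responders = 1300
true_incremental = 300
flagged = 296
true_positives = 280

false_positives = flagged - true_positives            # flagged but not incremental
false_negatives = true_incremental - true_positives   # incremental but missed
true_negatives = total_responders - true_incremental - false_positives

precision = true_positives / flagged            # share of flagged that are correct
recall = true_positives / true_incremental      # share of true incrementals found
false_discovery_rate = false_positives / flagged
```

With these inputs the false discovery rate, 16 of 296, is about 5.4%, and the method recovers 280 of the 300 true incremental responders.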
This simulation yields a 5.4% error rate (16 of the 296 observations
flagged as incremental responders were incorrect), which is a significant
improvement over the differencing method explained at the beginning
of the chapter. Incremental response modeling holds much promise
in the areas of targeted advertising and microsegmentation. The ability
to select only those people who will respond only when they receive
the treatment is very powerful and can contribute significantly
to increased revenue. Consider a typical coupon that sells the good or
service at 90% of the regular price (a 10% discount). Every correctly
identified true incremental responder will raise revenue by 0.9, and every
correctly identified nonincremental responder (those whose decision to
purchase or not is not influenced by the treatment) will raise revenue
by 0.1 because those items will not be sold at a discount needlessly.
Then account for the nominal cost of the treatment: ad campaign, postage,
printing costs, channel management. We have the following revenue
adjustments:
$$\text{Incremental Revenue} = 0.9r + 0.1n - \text{campaign costs}$$
where
r = incremental responders who will purchase the product if they
receive the coupon but otherwise will not
n = nonincremental responders who purchase the product whether or not
they receive the coupon
Taking the simulation example but nearly doubling the error rate to
10%, you can see the advantage of using incremental response
modeling:
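$$\text{Incremental response revenue} = 0.9(270) + 0.1(900) - \text{campaign costs} = 333 \text{ units} - \text{campaign costs}$$
compared to:
$$\text{Control only} = 0.9(0) + 0.1(1000) - \text{campaign costs} = 100 \text{ units} - \text{campaign costs}$$
$$\text{Treatment only} = 0.9(300) + 0.1(0) - \text{campaign costs} = 270 \text{ units} - \text{campaign costs}$$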
costs of each scenario, the treatment-only scenario is affected most
because a coupon is sent to each of the 1,300 people. This is
followed by the incremental response scenario, where coupons are
sent only to those predicted to respond, and finally the control scenario,
where there are no campaign costs because there is no campaign.
$$\text{Treatment campaign costs} > \text{Incremental response campaign costs} > \text{Control campaign costs} = 0$$
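The comparison of the three scenarios can be reproduced with a short calculation. In the sketch below the gross figures use the r and n values from the worked example; the cost per coupon and the number of coupons sent in the incremental response scenario are hypothetical assumptions added only to show how campaign costs enter, since the text gives no cost figures.

```python
def incremental_revenue(r, n, campaign_costs=0.0, discount=0.1):
    """Revenue adjustment: (1 - discount) * r + discount * n - campaign costs."""
    return (1 - discount) * r + discount * n - campaign_costs

# Gross figures from the worked example (campaign costs not yet subtracted).
gross = {
    "incremental response": incremental_revenue(r=270, n=900),
    "control only": incremental_revenue(r=0, n=1000),
    "treatment only": incremental_revenue(r=300, n=0),
}

# Hypothetical cost assumptions, not from the source.
COST_PER_COUPON = 0.05
coupons_sent = {
    "incremental response": 270 + 100,  # assumed: 270 correct + 100 misflagged
    "control only": 0,                  # no campaign, so no coupons
    "treatment only": 1300,             # a coupon for every person
}

net = {
    name: gross[name] - COST_PER_COUPON * coupons_sent[name]
    for name in gross
}
best = max(net, key=net.get)
```

Under these assumed costs the ordering of campaign costs matches the inequality above, and the incremental response scenario keeps the highest net figure; a much higher cost per coupon would narrow its lead over the control-only scenario.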
When the campaign costs are added to the calculations, incremental
response compares even more favorably with treatment only, and it
remains a better option than control only as long as its campaign costs
stay below the 233-unit difference in revenue. The increasing amount of
information that is made available will over time reduce the error rates,
yielding even larger revenues for those organizations that leverage this
powerful technique.