
Open data: bringing small businesses into the big leagues


John Murray
Fusion Data Science
Customer Profiling
Customer profiling is not new.
Big companies have been doing it for years.
In the early days, it was the preserve of the mainframe.
Software tools and data products have
evolved and become easier to use.
External data sources, such as Census and
commercial data, are commonly used to
augment existing data.

3 October 2014 CC BY-SA 2.0 UK 2
Applications of Customer Profiling
Help to inform advertising purchase decisions.
Identify new retail location sites and
rationalise existing networks.
Target new prospects through direct
marketing.
Identify factors leading to customer churn to
improve customer loyalty and retention.
Reduce fraud and defaults.

Analytical Software
Specialist commercial analytical software:
SAS
SPSS
MATLAB
Open source analytical software
R Project
Octave
Commercial Data Products
Geo-demographic segmentation systems:
Acxiom Personicx
CACI Acorn
Callcredit Cameo
Experian Mosaic
Lifestyle and transactional data providers.
Public registers:
Shareholders
Court judgements
What is Customer Profiling?
A description of a customer or set of customers
that includes demographic, geographic, and
psychographic characteristics, as well as buying
patterns, creditworthiness, and purchase history.
(Business Dictionary)
Customer analytics is a process by which data
from customer behaviour is used to help make
key business decisions via market segmentation
and predictive analytics. (Wikipedia)
Customer Profiling Basics
Typically customer profiling is presented
statistically as a set of percentages or
likelihood scores against behavioural and
demographic attributes.
In most cases the profile of a target group is
compared with the profile of a base group.
Key differences in these two profiles are
identified and used to inform business
decisions.
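
As a minimal sketch of the comparison described above, the following computes a percentage profile for a target group and a base group and expresses the difference as an index (target % divided by base %, times 100, so 100 means no difference). The tenure values and the function names are hypothetical and chosen only for illustration.

```python
from collections import Counter


def profile(values):
    """Return the percentage of records falling in each category."""
    counts = Counter(values)
    total = sum(counts.values())
    return {category: 100.0 * n / total for category, n in counts.items()}


def profile_index(target, base):
    """Index = target % / base % * 100 for every category in the base group."""
    target_pct = profile(target)
    base_pct = profile(base)
    return {
        category: 100.0 * target_pct.get(category, 0.0) / pct
        for category, pct in base_pct.items()
    }


# Hypothetical tenure values for a target group and a base group.
target_group = ["owned", "owned", "owned", "rented"]
base_group = ["owned", "owned", "owned", "owned",
              "rented", "rented", "rented", "rented"]

index = profile_index(target_group, base_group)
for category, value in sorted(index.items()):
    print(f"{category}: {value:.0f}")
```

An index well above 100 marks an attribute that is over-represented in the target group; one well below 100 marks an under-represented attribute.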
Target Group
This may be all customers, or a selected group
of customers identified by a characteristic.
For example
Respondents to direct marketing.
High value customers.
Lapsed customers.
Fraudsters.
Defaulters.
Base Group
The base group of people to compare against.
For example
Non-respondents to direct marketing.
Low value customers.
Active customers.
Trustworthy customers.
Creditworthy customers.
Case Study
Delicatessen in suburb of Chester.
Offered newsletter and loyalty incentive scheme
via in-store capture.
230 customer records in database.
Wanted to launch home delivery service.
3 mile radius from store, south of river area only.
10,000 households in base area, wanted to target
best 2,000.
Delivery Zone
Data Sources
Customer Database
ONS Postcode Directory (ONSPD)
2011 Census Datasets
ONS Postcode Estimates (Headcounts)
OS Open Data mapping products
Methodology
Delivery zone postcodes matched to ONSPD to
append Census Output Area identifiers.
Customer database postcodes matched to
above.
Using ONSPD, customer profiles produced
from Census variables expressed as
percentages.
Delivery zone profile weighted using ONS
headcounts at postcode level.
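
The matching and weighting steps above might look roughly like the sketch below. It assumes three small lookups standing in for the real files: a postcode to Output Area mapping (from ONSPD), one Census percentage per Output Area, and a headcount per postcode. All postcodes, Output Area codes, and figures here are invented for illustration, and the real files carry many more columns.

```python
# Hypothetical extracts standing in for ONSPD, a Census table and ONS headcounts.
onspd = {"CH4 7AA": "E00000001", "CH4 7AB": "E00000001", "CH4 8AA": "E00000002"}
census_pct = {"E00000001": 40.0, "E00000002": 10.0}  # e.g. % in one social grade
headcounts = {"CH4 7AA": 30, "CH4 7AB": 10, "CH4 8AA": 60}


def normalise(postcode):
    """Upper-case and re-space a postcode so it matches the lookup keys."""
    compact = "".join(postcode.upper().split())
    return compact[:-3] + " " + compact[-3:]


def customer_profile(postcodes):
    """Mean Census percentage across customers whose postcode matches."""
    matched = [onspd[p] for p in map(normalise, postcodes) if p in onspd]
    return sum(census_pct[oa] for oa in matched) / len(matched)


def zone_profile(zone_postcodes):
    """Mean Census percentage across the zone, weighted by postcode headcount."""
    total = sum(headcounts[p] for p in zone_postcodes)
    weighted = sum(census_pct[onspd[p]] * headcounts[p] for p in zone_postcodes)
    return weighted / total


# The last customer postcode is not in the delivery zone and is dropped.
customers = ["ch4 7aa", "CH47AB", "CH4 8AA", "ZZ1 1ZZ"]
print(customer_profile(customers))   # customer (target) profile
print(zone_profile(list(onspd)))     # delivery zone (base) profile
```

Weighting by headcount keeps a postcode with few residents from counting as much as a densely populated one when the base profile is built.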



Census Variables Used
Age
Household composition
Age of children in household
Tenure
Occupation Type
Social Grade
Deprivation (from Census)
Length of Residence
Profile of Social Grade
Profile of Occupation Type
Result
Optimise Call Centre Queues
Call centre resources are expensive.
Demographic data can be used to prioritise
queues for resource optimisation:
At peaks select priority customers
Utilise slack periods more effectively
Minimise no answer/unavailable calls

Call Time Preference v Employment
[Chart: preferred call time by employment status (Employed, Self-emp, Retired, Not working); percentage axis from 0% to 35%]
Methodology
Enquiries from prospective customers received
via the website.
Census data used to estimate likelihood of
preferred call time based on postcode built into a
set of models.
Call centre queues organised according to
likelihood and availability of call centre resources
(constrained optimisation)
Feedback loop created from the dialler to
improve performance of predictive models.
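
The slides do not say how the constrained optimisation was solved. As one possible illustration, the sketch below uses a simple greedy heuristic: it takes modelled likelihoods of answering in each time slot and fills each slot up to its capacity, best likelihood first. The enquiry labels, slot names, likelihoods, and capacities are all hypothetical, and a greedy pass is not guaranteed to find the best overall allocation.

```python
def allocate(likelihoods, capacity):
    """Greedily assign each enquiry to one slot, highest likelihood first."""
    remaining = dict(capacity)
    pairs = sorted(
        ((p, enquiry, slot)
         for enquiry, by_slot in likelihoods.items()
         for slot, p in by_slot.items()),
        key=lambda item: item[0],
        reverse=True,
    )
    assignment = {}
    for p, enquiry, slot in pairs:
        # Skip enquiries already placed and slots with no agents left.
        if enquiry not in assignment and remaining.get(slot, 0) > 0:
            assignment[enquiry] = slot
            remaining[slot] -= 1
    return assignment


# Hypothetical modelled likelihood that each enquirer answers in each slot.
likelihoods = {
    "A": {"morning": 0.2, "evening": 0.7},
    "B": {"morning": 0.3, "evening": 0.6},
    "C": {"morning": 0.5, "evening": 0.4},
}
capacity = {"morning": 2, "evening": 1}  # calls each slot can handle

queues = allocate(likelihoods, capacity)
print(queues)
```

Here enquiry B prefers the evening but is moved to the morning because the single evening slot has already gone to A, which is the kind of trade-off a capacity constraint forces.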

Result
Reduction in unsuccessful calls
39%
Equates to productivity improvement of
14%
Retail Location Planning
Measure demographic profile of your existing
stores.
Use this to find other areas with similar
profiles.
Caution: other factors are involved!
Footfall, e.g. how much passing trade?
Presence of competitor outlets

Specialist Golf Outlet
Methodology
Extract all customers acquired in preceding 3
years within 45 minutes drive time of a shop.
Convert these to proportions of population in
each postcode sector (e.g. CH1 2).
Build a predictive model using these as
dependent variable, and census proportions
as predictor.
Apply the model to other areas with no
present store presence.
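
The steps above can be sketched as follows, assuming the simplest possible model: ordinary least squares with a single Census proportion as predictor. The slides do not state which model type was used, so this is an illustration only, and the customer counts, populations, and proportions are invented.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Hypothetical sectors: (customers, population, Census proportion).
sectors = {
    "CH1 2": (20, 1000, 0.10),
    "CH1 3": (40, 1000, 0.20),
    "CH2 1": (60, 1000, 0.30),
}

# Dependent variable: customers as a proportion of sector population.
penetration = [customers / population for customers, population, _ in sectors.values()]
census_prop = [prop for _, _, prop in sectors.values()]

intercept, slope = fit_line(census_prop, penetration)


def predict(prop):
    """Expected customer penetration for a sector with this Census proportion."""
    return intercept + slope * prop


# Score a sector with no store nearby (hypothetical proportion).
print(round(predict(0.25), 4))
```

In practice several Census proportions would be used as predictors, and the predicted penetration would be multiplied by sector population to estimate customer numbers.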

Customer Loyalty
Target group: lapsed customers (haven't
transacted in a period).
Base group: current active customers.
Append census variables.
Identify customers most likely to lapse.
Set up early warning system.
Trigger reactivation event, such as offers.
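
A minimal sketch of the lapsed/active split and the early warning trigger, based only on time since the last transaction. The 270-day warning and 365-day lapse thresholds, the customer identifiers, and the dates are assumptions for illustration; the Census-based likelihood of lapsing described above is not modelled here.

```python
from datetime import date, timedelta


def classify(last_seen, today,
             warn_after=timedelta(days=270),
             lapse_after=timedelta(days=365)):
    """Label a customer by how long ago they last transacted."""
    gap = today - last_seen
    if gap > lapse_after:
        return "lapsed"       # target group
    if gap > warn_after:
        return "at risk"      # early warning: trigger a reactivation offer
    return "active"           # base group


# Hypothetical last transaction dates.
last_transaction = {
    "cust-1": date(2013, 9, 1),
    "cust-2": date(2013, 12, 15),
    "cust-3": date(2014, 8, 1),
}
today = date(2014, 10, 3)

status = {cust: classify(seen, today) for cust, seen in last_transaction.items()}
print(status)
```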
Direct Marketing
Target set: existing customers within 30 mins
drive time.
Base set: all adults in same area.
Profile comparison.
Strongest variables identified.
Logistic Regression model built in R.
Postcode level model used to drive door to
door leaflet campaign.
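
The model in this case study was built in R. Purely as an illustration of the technique, the sketch below fits a one-variable logistic regression in plain Python by batch gradient descent. The predictor values, outcomes, learning rate, and step count are all hypothetical; a real model would use several Census variables and a statistics package.

```python
import math


def sigmoid(z):
    """Map a linear score to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, rate=0.5, steps=2000):
    """Fit p = sigmoid(b0 + b1 * x) by batch gradient descent on log-loss."""
    b0 = b1 = 0.0
    n = len(xs)
    for _ in range(steps):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(b0 + b1 * x) - y
            g0 += err
            g1 += err * x
        b0 -= rate * g0 / n
        b1 -= rate * g1 / n
    return b0, b1


# Hypothetical data: one Census proportion per record, 1 = customer (target set),
# 0 = other adult in the same area (base set).
xs = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
ys = [0, 0, 1, 0, 1, 0, 1, 1]

b0, b1 = fit_logistic(xs, ys)


def score(x):
    """Modelled likelihood of being a customer for a postcode with proportion x."""
    return sigmoid(b0 + b1 * x)


print(score(0.1) < score(0.8))
```

Postcodes can then be ranked by score so that leaflets go to the highest-scoring areas first.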
Gains Table
        Incremental Analysis        Cumulative Analysis
Rank    Base    Target   Index      Base     Target   Index
 1      5.0%    24.0%    479.9       5.0%    24.0%    479.9
 2      5.0%    18.3%    365.7      10.0%    42.3%    422.8
 3      5.0%    11.2%    224.3      15.0%    53.5%    356.7
 4      5.0%    10.0%    199.9      20.0%    63.5%    317.5
 5      5.0%     5.7%    114.2      25.0%    69.2%    276.8
 6      5.0%     4.3%     85.7      30.0%    73.5%    245.0
 7      5.0%     4.1%     81.6      35.0%    77.6%    221.6
 8      5.0%     3.1%     62.5      40.0%    80.7%    201.7
 9      5.0%     4.6%     92.5      45.0%    85.3%    189.6
10      5.0%     2.7%     53.0      50.0%    88.0%    175.9
11      5.0%     2.3%     46.2      55.0%    90.3%    164.1
12      5.0%     2.9%     57.1      60.0%    93.1%    155.2
13      5.0%     1.8%     35.4      65.0%    94.9%    146.0
14      5.0%     1.1%     21.8      70.0%    96.0%    137.1
15      5.0%     0.9%     17.7      75.0%    96.9%    129.2
16      5.0%     1.6%     32.6      80.0%    98.5%    123.1
17      5.0%     1.1%     21.8      85.0%    99.6%    117.2
18      5.0%     0.2%      4.1      90.0%    99.8%    110.9
19      5.0%     0.1%      2.1      95.0%    99.9%    105.2
20      5.0%     0.1%      2.0     100.0%   100.0%    100.0
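
A gains table of this shape can be produced along the following lines: rank records by model score, split them into equal bands, and compare each band's share of the target group with its share of the base. The index is 100 times Target % divided by Base %. The scores and flags below are hypothetical, and five bands are used instead of twenty to keep the example short.

```python
def gains_table(scores, is_target, bands=20):
    """Rank by score, split into equal bands, index target share against base share."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    n = len(order)
    total_target = sum(is_target)
    rows = []
    cum_base = cum_target = 0.0
    for band in range(bands):
        members = order[band * n // bands:(band + 1) * n // bands]
        base = 100.0 * len(members) / n
        target = 100.0 * sum(is_target[i] for i in members) / total_target
        cum_base += base
        cum_target += target
        rows.append({
            "rank": band + 1,
            "base": base,
            "target": target,
            "index": 100.0 * target / base,
            "cum_base": cum_base,
            "cum_target": cum_target,
            "cum_index": 100.0 * cum_target / cum_base,
        })
    return rows


# Hypothetical model scores and target flags (1 = in the target group).
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
is_target = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

rows = gains_table(scores, is_target, bands=5)
for row in rows:
    print(row["rank"], row["base"], row["target"], row["index"], row["cum_index"])
```

Read the cumulative columns to pick a cut-off: in the table above, the top four ranks cover 20.0% of the base but 63.5% of the target group.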
Gains Chart
[Chart: Base % and Target % series; both axes run from 0% to 100% in steps of 10%]
Heat Map
Advertising Purchase
Produce customer profile.
Compare with profiles provided by media
outlets.
With radio and TV consider time of day.
Use unique phone numbers/URLs to track
responses by media.
Measure intelligence.
Software Tools
Microsoft Office
Excel
Access
Mapping Software
ArcGIS
MapInfo
Microsoft MapPoint
Open source software e.g. Quantum GIS

Conclusion
Open data can help you:
Improve business process efficiency.
Reduce fraud and default.
Retain your customers.
React to market changes.
Find new customers.
Plan retail branch networks.
Purchase advertising more effectively.



Thank You


Questions?
