Risk-Sensitive Prescriptive Analytics: Real Estate Case Study

Disclaimer: This case study is written solely for educational purposes and is not intended to represent
successful or unsuccessful managerial decision making.
Ahmed Youssef, PhD
LD-Research, 78/79 Pappelallee, 10437 Berlin
youssef@ld-research.com, www.ld-research.com
LD-Research case study, 2017
Abstract
Based on actual data and Monte Carlo simulations, we demonstrate that the use of advanced analytics can
lead to substantial performance improvements in real estate investments. Moreover, we show that risk-sensitive
proprietary algorithms developed at LD-Research are able to increase the risk-adjusted return, as measured by
the Sortino ratio, by a remarkable 180% over state-of-the-art prescriptive analytics.
Keywords
Prescriptive Analytics, Quantitative Risk Management, Real Estate
Far better an approximate answer to the right question, which is often vague, than the exact answer
to the wrong question, which can always be made precise.
John Tukey

1. Introduction
The use of advanced analytics has the potential to give real estate investors a competitive edge. We
quantify these performance gains in this case study, which has several interesting features:

Prescriptive analytics: Recent advances in the field of machine learning make it possible to build
algorithms that learn from data and lead to powerful predictive models. Applying them to real estate,
we can use historical housing price data to predict with higher accuracy the actual selling price of a
house, based on many of its features. This process is referred to as predictive analytics. But real
estate investors are more interested in making the right investment decisions than in pure price
predictions. With prescriptive analytics, the subject of this study, real estate investors directly get
good buy/sell recommendations instead of house price predictions alone.

Actual data: We use publicly available house sale prices for King County, WA, which includes Seattle.
The data covers more than 20,000 homes sold between May 2014 and May 2015, along with their selling
price and 18 features such as location, surface area, condition, year built, and whether they have a
view. A few examples are shown in table (1).

Monte Carlo simulations: In order to quantify the actual effect of prescriptive analytics, we use
Monte Carlo simulations (repeated random sampling) to generate thousands of instances of a simple
investment scenario.

Three methods: We compare the performance of investors using three different strategies: i) an analyst
using traditional statistical tools, ii) an expert using cutting-edge machine learning and optimization
tools, iii) LD-Research proprietary algorithms.

The procedure: First, the data is divided into two equal-sized parts: a training set that the system
uses to learn a pricing model, and a test set that serves to simulate an investment process and
evaluate the different algorithms in a business-relevant context. It is critical to use the test data
only in the very last step; otherwise the reported returns will be over-optimistic.

2. Three algorithms
We will compare the performance of three analytics algorithms.

Simple algorithm
The first valuation algorithm is a multiple linear regression model (not so simple after all). The
accuracy of the algorithm is measured by its ability to predict the prices in the test set, that is,
on houses it has never seen before. This algorithm has low predictive power and makes an average error
of 25% (mean absolute percentage error).

Expert algorithm
Our second model is a state-of-the-art machine learning algorithm, known as a gradient boosting
machine [1]. This type of algorithm has won many machine learning competitions and is typically among
the most accurate available. The
model's average error is roughly 12% (footnote 1), a very significant improvement on the previous
linear model. Its use as the basis of any investment strategy will obviously lead to important
performance gains. Such predictive power is the reason behind the surge of modern machine learning
techniques in many fields.

Date        Zipcode  Lat      Lon       Lot (sq ft)  Bedrooms  Waterfront  Price ($)
2014-07-22  98023    47.3277  -122.341  9480         3         No          86 500
2014-05-15  98106    47.5341  -122.358  6345         2         No          312 500
2014-06-11  98004    47.6500  -122.214  37325        5         Yes         7 062 500
Table 1. A few examples of the King County housing data.

Figure 1. Location of a few sold houses in the dataset.

LD-Research proprietary algorithm
At LD-Research we develop proprietary machine learning algorithms particularly well suited for
risk-sensitive decision making. In the next section we will show that our algorithm remarkably
outperforms the state-of-the-art expert model described above.

3. Investment simulation
We now test these three algorithms using a very simple investment scenario. We are offered the
opportunity to buy, within a capital limit of $10M, any combination of 100 properties chosen randomly
from the test set. We are seeing these properties for the first time and don't know their true price,
i.e., the price at which we will be able to resell them. Each property is offered at a price that
fluctuates around its true price with a standard deviation of 5% (footnote 2). For instance, a $200K
house will typically be offered for between $180K and $220K, that is, within two standard deviations.
After the properties are chosen by the algorithm, we assume that we can sell all of them at their true
price, in our example $200K. This procedure requires two steps:

Prediction: First we use a predictive model to compare the offered and predicted price and determine
whether or not the property is worth investing in.

Allocation: Second we use an algorithm to optimally allocate our limited capital between these 100
properties. There is a huge number of possible combinations, precisely 2^100 ≈ 10^30, and they cannot
be exhaustively searched.

The simple strategy will use an equally simple heuristic: we compute the expected return of each
decision and select the properties with the highest return first. If a property is too expensive for
our remaining capital, we check for the next ones.

The expert strategy uses a sophisticated integer programming algorithm to make the buying decisions
[2]. This algorithm solves the capital allocation problem to optimality, meaning that, given our
predictions, it will find the best solution.

We use Monte Carlo simulations to generate 10^5 instances of this investment scenario for each one of
the three tested strategies and compute the distribution of returns, shown in figure (2). For each
distribution we also report three important measures, shown in figure (3):

Average return: The LD-Research strategy outperforms the expert strategy by a relative 7% (from 2.36%
to 2.53%), a substantial difference.

Risk: We define risk as the probability and amplitude of losses in the worst-case scenarios. We
measure risk by the expected shortfall at the 5% level: that is, the expected return if we are among
the 5% least lucky investors. This is much lower than the average return and is usually negative: in
the worst case we lose money. This is also where the LD-Research algorithm adds substantial value, as
it is specifically designed for risk-sensitive decision making. The algorithm improves the expected
shortfall by an impressive 220% relative to the expert strategy (from -0.25% to 0.31%), transforming
the worst case into a gain instead of a loss.

Risk-adjusted return: This is a single metric that quickly captures the trade-off between risk and
average return. We use the Sortino ratio, considered more meaningful than the celebrated Sharpe ratio
because it penalizes a strategy only for downside volatility [3]. The Sortino ratio is calculated as
the average return divided by the downside volatility, and the higher its value the better. The
LD-Research algorithm improves the Sortino ratio by 180% relative to the expert strategy (from 19 to
54).

Figure 2. Return distributions of the three strategies. Each dot represents the return realized by an
investor. This unconventional representation of the distributions has the advantage of making the
problem's inherent uncertainty utterly clear: once you choose a strategy, you have an equal chance of
being any one of these dots!

Figure 3. Average return, expected shortfall, and Sortino ratio of the three strategies. Higher values
are better.

Footnote 1. Technical note: in order to level the playing field and make performance comparison
straightforward, no feature engineering was used for any of the algorithms.
Footnote 2. The offered price is drawn from a Gaussian distribution centered on the true price, with a
standard deviation equal to 5% of this mean value.

4. Robustness
In the previous simulation we showed substantial improvements over state-of-the-art prescriptive
analytics. But how robust are these performance gains? Extensive Monte Carlo simulations indicate the
superior performance of LD-Research algorithms under very different conditions. The simple investment
game we have been studying has two tunable parameters:

Investor size: the investor's capital divided by the average price of the full offered portfolio
(capital / $54M in our case).

Market fluctuations: how much quoted prices fluctuate around the true price.

We examined the relative difference in the Sortino ratio of the LD-Research algorithm compared to the
expert one. We simulated 10^4 runs for each of the 100 combinations of investor size and market
fluctuations shown in figure (4). The LD-Research strategy dominates the expert one under all
considered conditions, by an average of 90%.
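
The two risk measures used in this study, the expected shortfall at the 5% level and the Sortino ratio, can be sketched in a few lines of Python. This is only a minimal illustration and not the LD-Research implementation: the function names, the target return of zero, and the convention of averaging squared shortfalls over all simulated investors are assumptions, since the text only states that the Sortino ratio is the average return divided by the downside volatility.

```python
import math


def expected_shortfall(returns, level=0.05):
    """Average return of the worst `level` fraction of simulated investors."""
    ordered = sorted(returns)
    # Keep at least one outcome in the tail (assumption for tiny samples).
    k = max(1, int(math.floor(level * len(ordered))))
    tail = ordered[:k]
    return sum(tail) / len(tail)


def sortino_ratio(returns, target=0.0):
    """Average return divided by the downside volatility.

    Assumption: downside volatility is the root mean square of the
    shortfalls below `target`, averaged over all outcomes.
    """
    mean = sum(returns) / len(returns)
    downside_sq = [min(0.0, r - target) ** 2 for r in returns]
    downside_vol = math.sqrt(sum(downside_sq) / len(returns))
    if downside_vol == 0.0:
        return math.inf  # no outcome fell below the target
    return (mean - target) / downside_vol


# Toy returns in percent, one value per simulated investor (made-up numbers).
toy_returns = [-2.0, 1.0, 3.0, 6.0]
print(expected_shortfall(toy_returns, level=0.5))  # mean of the two worst
print(sortino_ratio(toy_returns))
```

Only returns below the target contribute to the denominator, which is why a strategy is penalized for losses but not for unusually large gains.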
Figure 4. We see a consistent improvement (from 3% to 215%, with an average of 90%) of the Sortino
ratio relative to the expert strategy, in a wide range of exogenous market conditions.
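
The difference between the simple and the expert allocation step of Section 3 can be illustrated on a toy example. The sketch below is hedged in several ways: the function names and the three made-up properties are hypothetical, prices are assumed to be integers (here in $K), and a dynamic-programming knapsack stands in for the integer programming solver of [2]. For integer prices it reaches the same optimum, but it is not the method used in the study.

```python
def greedy_allocation(offered, predicted, capital):
    """Simple heuristic: buy in order of decreasing expected return."""
    order = sorted(
        range(len(offered)),
        key=lambda i: (predicted[i] - offered[i]) / offered[i],
        reverse=True,
    )
    chosen, remaining = [], capital
    for i in order:
        if predicted[i] <= offered[i]:
            break  # all remaining properties have no expected gain
        if offered[i] <= remaining:  # too expensive: check the next ones
            chosen.append(i)
            remaining -= offered[i]
    return sorted(chosen)


def optimal_allocation(offered, predicted, capital):
    """Exact 0/1 knapsack maximizing total expected profit (integer prices)."""
    # best[c] = (best expected profit with budget c, indices bought)
    best = [(0, ())] * (capital + 1)
    for i, cost in enumerate(offered):
        gain = predicted[i] - cost
        if gain <= 0:
            continue  # never buy a property with no expected gain
        # Iterate budgets downwards so each property is bought at most once.
        for c in range(capital, cost - 1, -1):
            candidate = best[c - cost][0] + gain
            if candidate > best[c][0]:
                best[c] = (candidate, best[c - cost][1] + (i,))
    return sorted(best[capital][1])


# Three hypothetical properties, prices in $K, with a budget of 100.
offered = [60, 50, 50]
predicted = [69, 57, 57]
print(greedy_allocation(offered, predicted, 100))   # [0]: expected profit 9
print(optimal_allocation(offered, predicted, 100))  # [1, 2]: expected profit 14
```

The greedy rule buys the property with the highest expected return (15%) and then cannot afford anything else, while the exact search spends the whole budget on the two 14% properties and earns a higher total expected profit. Both results are optimal only with respect to the predicted prices, which is the sense in which the expert strategy finds the best solution given our predictions.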
References
[1] R.E. Schapire and Y. Freund. Boosting: Foundations and
Algorithms. The MIT Press, 2012.
[2] D. Bertsimas and J.N. Tsitsiklis. Introduction to Linear
Optimization. Dynamic Ideas and Athena Scientific, 2008.
[3] F.A. Sortino and L.N. Price. Performance measurement in
a downside risk framework. Journal of Investing, Vol. 3 p.
593, 1994.
