
REM: Resource-Efficient Mining for Blockchains

Fan Zhang (fanz@cs.cornell.edu), Ittay Eyal (ittay.eyal@cornell.edu), Robert Escriva (escriva@cs.cornell.edu),
Ari Juels (juels@cornell.edu), Robbert van Renesse (rvr@cs.cornell.edu)

Cornell University; Cornell Tech, Jacobs Institute; Initiative for CryptoCurrencies & Contracts

Abstract

Blockchains show promise as potential infrastructure for financial transaction systems. The security of blockchains today, however, relies critically on Proof-of-Work (PoW), which forces participants to waste computational resources.

We present REM (Resource-Efficient Mining), a new blockchain mining framework that uses trusted hardware (Intel SGX). REM achieves security guarantees similar to PoW, but leverages the partially decentralized trust model inherent in SGX to achieve a fraction of the waste of PoW. Its key idea, Proof-of-Useful-Work (PoUW), involves miners providing trustworthy reporting on CPU cycles they devote to inherently useful workloads. REM flexibly allows any entity to create a useful workload. REM ensures the trustworthiness of these workloads by means of a novel scheme of hierarchical attestations that may be of independent interest.

To address the risk of compromised SGX CPUs, we develop a statistics-based formal security framework, also relevant to other trusted-hardware-based approaches such as Intel's Proof of Elapsed Time (PoET). We show through economic analysis that REM achieves less waste than PoET and variant schemes.

We implement REM and, as an example application, swap it into the consensus layer of Bitcoin Core. The result is the first full implementation of an SGX-based blockchain. We experiment with four example applications as useful workloads for our implementation of REM, and report a computational overhead of 5-15%.

1 Introduction

Despite their imperfections [26, 35, 37, 65, 71], blockchains [38, 64, 67] have attracted the interest of the financial and technology industries [16, 25, 34, 47, 69, 74] as a way to build transaction systems with distributed trust. Among the impediments to widespread adoption of decentralized or permissionless blockchains, however, one is fundamental. Proofs-of-Work (PoWs) in blockchains are wasteful.

PoWs are nonetheless the most robust solution today to two fundamental problems in decentralized cryptocurrency design: how to select consensus leaders and how to apportion rewards fairly among participants. A participant in a PoW system, known as a miner, can only lead consensus rounds in proportion to the amount of computation she invests in the system. This prevents an attacker from gaining majority power by cheaply masquerading as multiple machines. The cost, however, is the above-mentioned waste. PoWs serve no useful purpose beyond consensus and incur huge monetary and environmental costs. Today the Bitcoin network uses more electricity than produced by a nuclear reactor, and is projected to consume as much as Denmark by 2020 [30].

We propose a solution to the problem of such waste in a novel block-mining system called REM. Nodes using REM replace PoW's wasted effort with useful effort of a form that we call Proof of Useful Work (PoUW). In a PoUW system, users can utilize their CPUs for any desired workload, and can simultaneously contribute their work towards securing a blockchain.

There have been several attempts to construct cryptocurrencies that recycle PoW by creating a resource useful for an external goal, but they have serious limitations. Existing schemes rely on esoteric resources [55], have low recycling rates [63], or are centralized [41]. Other consensus approaches, e.g., BFT or Proof of Stake, are in principle waste-free, but restrict consensus participation or have notable security limitations.

Intel [47] recently introduced a new approach to eliminating waste in distributed consensus protocols that relies instead on trusted hardware, specifically a new instruction set architecture extension in Intel CPUs called Software Guard Extensions (SGX). SGX permits the execution of trustworthy code in an isolated, tamper-free environment, and can prove remotely that outputs represent the result of such execution. Leveraging this capability, Intel's proposed Proof of Elapsed Time (PoET) is an innovative system with an elegant and simple underlying idea. A miner runs a trustworthy piece of code that idles for a randomly determined interval of time. The miner with the first code to awake leads the consensus round and receives a reward. PoET thus promises energy-waste-free decentralized consensus with security predicated on the tamper-proof features of SGX. PoET operates in a partially decentralized model, involving limited involvement of an authority (Intel), as we explain below.

Unfortunately, despite its promise, as we show in this paper, PoET presents two notable technical challenges. First, in the basic version of PoET, an attacker that can corrupt a single SGX-enabled node can win every consensus round and break the system completely. We call this the broken chip problem. Second, miners in PoET have a financial incentive to power mining rigs with cheap, outmoded SGX-enabled CPUs used solely for mining. The result is exactly the waste that PoET seeks to avoid. We call this the stale chip problem.

REM addresses both the stale and broken chip problems. Like PoET, REM operates in a partially decentralized model: it relies on SGX to prove that miners are generating valid PoUWs. REM, however, avoids PoET's stale chip problem by substituting PoUWs for idle CPU time, disincentivizing the use of outmoded chips for mining. Miners in a PoUW system are thus entities that use or outsource SGX CPUs for computationally intensive workloads, such as scientific experiments, pharmaceutical discovery, etc. All miners can concurrently mine for a blockchain while REM gives them the flexibility to use their CPUs for any desired workload.

We present a detailed financial analysis to show that PoUW successfully addresses the stale chip problem. We provide a taxonomy of different schemes, including PoW, PoET, novel PoET variants, and PoUW. We analyze these schemes in a model where agents choose how to invest capital and operational funds in mining and how much of such investment to make. We show that the PoUW in REM not only avoids the stale chip problem, but yields the smallest overall amount of mining waste. Moreover, we describe how small changes to the SGX feature set could enable even more efficient solutions.

Unlike PoET, REM addresses the broken chip problem. Otherwise, compromised SGX-enabled CPUs would allow an attacker to generate PoUWs at will, and both unfairly accrete revenue and disrupt the security of the blockchain [29, 75, 79]. Intel has sought to address the broken chip problem in PoET using a statistical-testing approach, but published details are lacking, and a rigorous analytic framework appears to be lacking as well. For REM, we set forth a rigorous statistical testing framework for mitigating the damage of broken chips, provide analytic security bounds, and empirically assess its performance given the volatility of mining populations in real-world cryptocurrencies. Our results also apply to PoET.

A further challenge arises in REM due to the feature that miners may choose their own PoUW workloads. It is necessary to ensure that miner-specified mining applications running in SGX accurately report their computational effort. Unfortunately, SGX lacks secure access to performance counters. REM thus includes a hierarchical attestation mechanism that uses SGX to attest to compilation of workloads with valid instrumentation. Our techniques, which combine static and dynamic program analysis, are of independent interest.

We have implemented a complete version of REM, encompassing the toolchain that instruments tasks to produce PoUWs, compliance checking code, and a REM blockchain client. As an example use, we swap REM in for the PoW in Bitcoin Core. As far as we are aware, ours is the first full implementation of an SGX-backed blockchain. (Intel's Sawtooth Lake, which includes PoET, is implemented only as a simulation.) Our implementation supports trustworthy compilation of any desired workload. As examples, we experiment with four REM workloads, including a commonly used protein-folding application and a machine learning application. The resulting overhead is about 5-15%, confirming the practicality of REM's methodology and implementation.

Paper organization

The paper is organized as follows: Section 2 provides background on proof-of-work and Intel SGX. We then proceed to describe the contributions of this work:

• PoUW and REM, a low-waste alternative to PoW that maintains PoW's security properties (§3).
• A broken-chip countermeasure consisting of a rigorous statistical testing framework that mitigates the impact of broken chips (§4 and Appendix A).
• A methodology for trustworthy performance instrumentation of SGX applications using a combination of static and dynamic program analysis and SGX-backed trusted compilation (§5).
• Design and full implementation of REM as a resource-efficient PoUW mining system with automatic tools for compiling arbitrary code to a PoUW-compliant module. Ours is the first full implementation of an SGX-backed blockchain protocol (§5).
• A model of consensus-algorithm resource consumption that we use to compare the waste associated with various mining schemes. We overview the model and issues with previous schemes (§6) and defer the details to the appendix (§B and §C).

We discuss related work in §7 and conclude in §8.
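The PoET mechanism and the broken chip problem described in the introduction can be illustrated with a small simulation. This is only a sketch under stated assumptions: the exponential wait-time distribution, the function names `poet_round` and `win_counts`, and the modeling of a broken chip as one that always reports a zero wait are illustrative choices, not details taken from PoET or REM.

```python
import random

MINERS = [f"cpu{i}" for i in range(10)]  # hypothetical miner identities


def poet_round(miners, broken, rng, mean_wait=1.0):
    """One PoET-style round: every enclave draws a wait; the shortest wait leads."""
    waits = {}
    for m in miners:
        # Assumption: honest enclaves draw exponential waits,
        # while a broken chip simply claims a wait of zero.
        waits[m] = 0.0 if m in broken else rng.expovariate(1.0 / mean_wait)
    return min(waits, key=waits.get)


def win_counts(miners, broken, rounds, seed=0):
    """Count how many rounds each miner leads."""
    rng = random.Random(seed)
    counts = {m: 0 for m in miners}
    for _ in range(rounds):
        counts[poet_round(miners, broken, rng)] += 1
    return counts


honest_run = win_counts(MINERS, broken=set(), rounds=2000)
broken_run = win_counts(MINERS, broken={"cpu0"}, rounds=2000)
```

With no broken chips, leadership is spread across the miners; once a single chip is broken, that chip leads every round, which is the failure mode the broken chip problem names.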

2 Background

2.1 Blockchains

Blockchain protocols allow a distributed set of participants, called miners, to reach a form of consensus called Nakamoto consensus. Such consensus yields an ordered list of transactions. Roughly speaking, the process is as follows. Miners collect cryptographically signed transactions from system users. They validate the transactions' signatures and generate blocks that contain these transactions plus a pointer to a parent block. The result is a chain of blocks called (imaginatively) a blockchain. Each miner, as it generates a block, gets to choose the block's contents, specifically which transactions will be included and in what order. System participants are connected by a peer-to-peer network that propagates transactions and blocks. Occasionally, two or more miners might nearly simultaneously generate blocks that have the same parent, forming two branches in the blockchain and breaking its single-chain structure. Thus a mechanism is used to choose which branch to extend, most simply, the longest chain available [64].1

An attacker could naturally seek to generate blocks faster than everyone else, forming the longest chain and unilaterally choosing block contents. To prevent such an attack, a block is regarded as valid only if it contains proof that its creator has performed a certain amount of work, a proof known as a Proof of Work (PoW).

A PoW takes the form of a cryptopuzzle: in most cryptocurrencies, a miner must change an input (nonce) in the block until a cryptographic hash of the block is smaller than a predetermined threshold. The security properties of hash functions force a miner to test nonces by brute force until a satisfying block is found. Such a block constitutes a solution to the cryptopuzzle and is itself a PoW. Various hash functions are used in practice. Each type puts different load on the processor and memory of a miner's computing device [64, 63, 78].

The process of mining determines an exponentially distributed interval of time between the blocks of an individual miner, and, by extension, between blocks in the blockchain. The expected amount of work to solve a cryptopuzzle, known as its difficulty, is set according to a deterministic algorithm that seeks to enforce a static expected rate of block production by miners (e.g., 10-minute block intervals in Bitcoin). An individual miner thus generates blocks at a rate that is proportional to its mining power, its hashrate as a fraction of that in the entire population of miners. Compensation to miners is granted per block generated, leading to an expected miner revenue that is proportional to the miner's hashrate.

As the mining power invested in a cryptocurrency grows, the cryptocurrency's cryptopuzzle difficulty rises to keep the block generation rate stable. When compensation is sufficiently high, it is worthwhile for a large number of participants to mine, leading to a high difficulty requirement. This, in turn, makes it difficult for an attacker to mine a large enough fraction of blocks to perform a significant attack.

PoW properties. The necessary properties for PoW to support consensus in a blockchain, i.e., resist adversarial control, are as follows. First, a PoW must be tied to a unique block, and be valid only for that block. Otherwise, a miner can generate conflicting blocks, allowing for a variety of attacks. A PoW should be moderately hard [15], and its difficulty should be accurately tunable so that the blockchain protocol can automatically tune the expected block intervals. Validation of PoWs, on the other hand, should be as efficient as possible, given that it is performed by the whole network. (In most cryptocurrencies today, it requires just a single hash.) Validation should also be possible for any entity with access to the blockchain. If the proofs or data needed for validation are made selectively available by a single entity, for instance, that entity becomes a central point of control and failure.2

1 There are alternatives to this protocol [37, 58, 73, 78]; however, the differences are immaterial to our exploration here.

2 The Bitcoin protocol is expected to soon allow for the so-called segregated witness architecture [22, 60]. Then, transaction signatures (witnesses) are kept in a data structure that is technically separate (segregated) from the blockchain data structure. Despite this separation of the data structures, the data in both must be propagated to allow for distributed validation.

2.2 SGX

Intel Software Guard Extensions (SGX) [45, 46, 48, 49, 13, 43, 62] is a set of new instructions available on recent-model Intel CPUs that confers hardware protections on user-level code. SGX enables process execution in a Trusted Execution Environment (TEE), and specifically in SGX in a protected address space known as an enclave. An enclave protects the confidentiality and the integrity of the process from certain forms of hardware attack and other processes on the same host, including privileged processes like operating systems.

An enclave can read and write memory outside the enclave region as a form of inter-process communication, but no other process can access enclave memory. Thus the isolated execution in SGX may be viewed in terms of an ideal model in which a process is guaranteed to execute correctly and with perfect confidentiality, but relies on a (potentially malicious) operating system for supporting services such as I/O, etc. This model is a simplification: SGX is known to expose some internal enclave state to the OS [79]. Our basic security model assumes ideal isolated execution, but as we detail in Section 4, we have baked a defense against compromised SGX CPUs into REM.

Attestation. SGX allows a remote system to verify the software running in an enclave and communicate securely with it. When an enclave is created, the CPU produces a hash of its initial state known as a measurement. The software in the enclave may, at a later time, request a report which includes a measurement and supplementary data provided by the process. The report is digitally signed using a hardware-protected key to produce a proof that the measured software is running in an SGX-protected enclave. This proof, known as a quote, is part of an attestation that can be verified by a remote system. SGX signs quotes in attestations using a group signature scheme called Enhanced Privacy ID or EPID [72]. This choice of primitive is significant in our design of REM, as Intel made the design choice that attestations can only be verified by accessing Intel's Attestation Service (IAS) [50], a public Web service maintained by Intel whose primary responsibility is to verify attestations upon request.

REM uses attestations as proofs for new blocks, so miners need to access IAS to verify blocks. The current way in which IAS works forces miners to access IAS on every single verification, adding an undesirable round-trip time to and from Intel's server to the block verification time. This overhead, however, is not inherent, and is due only to a particular design choice by Intel. As we suggest in Section 5.4, a simple modification to the IAS protocol, which Intel is currently testing, can eliminate this overhead entirely.

Randomness. As operating systems sit outside of the trusted computing base (TCB) of SGX, OS-served random functions such as srand and rand are not accessible to enclaves. SGX instead provides a hardware-protected random number generator (RNG) using the rdrand instruction. REM relies on the SGX RNG.

3 Overview of PoUW and REM

The basic idea of PoUW, and thus REM, is to replace the wasteful computation of PoW with arbitrary useful computation. A miner proves that a certain amount of useful work has been dedicated to a specific branch of the blockchain. Intuitively, due to the value of the useful work outside of the context of the blockchain supported by REM, the hardware and power are well spent, and there is no waste. A comprehensive analysis of the waste is deferred to Appendix B. Here we describe the security model of REM and then give an overview of its system mechanics.

3.1 Security Model

A PoW solution embodies a statistical proof of an effort spent by the miner. With PoUW, however, a miner herself reports her own effort. The rational miner's incentive is to lie, report more work than she actually performed, and monopolize the blockchain. In PoUW / REM, use of a TEE (Intel SGX in particular) prevents such attacks and enforces correct reporting of work. The resulting trust model is starkly different from that in traditional PoW.

PoET introduced, and we similarly use in REM, a partially decentralized blockchain model. The blockchain is permissionless, i.e., any entity can participate as a miner, as in a fully decentralized blockchain such as Bitcoin. It is only partially decentralized, though, in that it relies for security on two key assumptions about the hardware manufacturer's behavior.

First, we must assume that Intel correctly manages identities, specifically that it assigns a signing key (used for attestations) only to a valid CPU. It follows that Intel does not forge attestations and thus mining work. Such forgery, if detected in any context, would undermine the company's reputation and the perceived utility of SGX, costing far more than potential blockchain revenue. Second, we assume that Intel does not blacklist valid nodes in the network, rendering their attestations invalid when the IAS is queried. Such misbehavior would be publicly visible and similarly damaging to Intel if unjustified.

Even assuming trustworthy manufacturer behavior, though, a limited number of individual CPUs might be physically or otherwise compromised by a highly resourced adversary (or adversaries). Our trust model assumes the possibility of such an adversary and makes the strong assumption that she can learn the attestation (EPID signing) key for compromised machines and thus can issue arbitrary attestations for those machines. In particular, as we shall see, she can falsify random number generation and lie about work performed in REM.

Even this strong adversary, though, does have a key limitation: as signing keys are issued by the manufacturer, and given our first assumption above, it is not possible for an adversary to forge identities. We further assume that the signatures are linkable. In SGX, the EPID signature scheme for attestations has a linkable (pseudonymous) mode [50, 13, 72], which permits anyone to determine whether two signatures were generated by the same CPU. As a result, even a compromised node cannot masquerade as multiple nodes.

Outside the REM security model. It is important to note that REM is a consensus framework, i.e., a means to generate blocks, not a cryptocurrency. REM can be integrated into a cryptocurrency, as we show by swapping it into the Bitcoin consensus layer. As REM has roughly the same exponentially distributed block-production interval, such integration need not change security properties above the consensus layer. For example, fork resolution, transaction validation, block propagation, etc., remain the same in a REM-backed blockchain as in a PoW-based one. Thus we do not expand the discussion of the security issues relevant to those elements in the REM security model.

3.2 REM overview

Figure 1 presents an architectural overview of REM. There are three types of entities in the ecosystem of REM: a blockchain agent, one or more REM miners, and one or more useful work clients.

[Figure 1: Architecture overview of REM. Labeled components: useful work client, miner (TEE running the PoUW enclave), blockchain agent, blockchain P2P network, and verifiers. Labeled flows: useful tasks, block template, useful results, PoUW, new block.]

The useful work clients supply useful workloads to REM miners in the form of PoUW tasks, each of which encompasses a PoUW enclave and some input. Any SGX-compliant program can be transformed into a PoUW enclave using the toolchain we developed. Note that a PoUW enclave has to conform to certain security requirements. The most important is that it meters effort correctly, something that can be efficiently verified by a compliance checker and a novel technique we introduce called hierarchical attestation. We refer readers to §5.2 and §5.3 for details.

The blockchain agent collects transactions and generates a block template, a block lacking the proof of useful work (PoUW). As detailed later, a REM miner will attach the required PoUW and return it to the agent. The agent then publishes the full block to the P2P network, making it part of the blockchain and receiving the corresponding reward.

A miner takes as input a block template and a PoUW task to produce PoUWs. It launches the PoUW enclave in SGX with the prescribed input and block template. Once the PoUW task halts, its results are returned to the useful work client. The PoUW enclave meters work performed by the miner and declares whether the mining effort is successful and results in a block. Effort is metered on a per-instruction basis. The PoUW enclave randomly determines whether the work results in a block by treating each instruction as a Bernoulli trial. Thus mining times are distributed in much the same manner as in proof-of-work systems. While in, e.g., Bitcoin, effort is measured in terms of executed hashes, in REM, it is the number of executed useful-work instructions. Intuitively, REM may be viewed as simulating the distribution of block-mining intervals associated with PoW, but REM does so with PoUW, and thus eliminates wasted CPU effort.

When a PoUW enclave determines that a block has been successfully mined, it produces a PoUW, which consists of two parts: an SGX-generated attestation demonstrating the PoUW enclave's compliance with REM and another attestation that a block was successfully mined by the PoUW enclave at a given difficulty parameter. The blockchain agent concatenates the PoUW to the block template, forming a full block, and publishes it to the network.

When a blockchain participant verifies a fresh block received on the blockchain network, in addition to verifying higher-layer properties (e.g., in a cryptocurrency such as Bitcoin, that transactions, previous block references, etc., are valid), the participant verifies the attestations in the associated PoUW.

Intel's PoET scheme looks similar to REM in that its enclave randomly determines block intervals and attests to block production. PoET, however, lacks the production of useful work, an essential ingredient, as we explain later in the paper. We now discuss our strategy in REM for handling compromised nodes.

4 Tolerating Compromised SGX Nodes

SGX does not achieve perfect enclave isolation. While no real practical attack is known, researchers have demonstrated potentially dangerous side-channel attacks against applications [79] and even expressed concerns about whether an attestation key might be extracted [29]. Therefore, even if we assume SGX chips are manufactured in a secure fashion, some number of individual instances could be broken by well-resourced adversaries. A single compromised node could be catastrophic to an SGX-based cryptocurrency, allowing an adversary to create blocks at will and perform majority attacks on the blockchain. While she could not spend other people's money, which would require access to their private keys, she could perform denial-of-service attacks, selectively drop transactions, or charge excessive transaction fees.

In principle, a broken attestation key can be revoked through the Intel Attestation Service (IAS), but this can only happen if the break is detected to begin with. Consequently, Intel has explored ways of detecting SGX compromise in PoET [9] by statistically testing for implausibly frequent mining by a given node (using a z-test). Details are lacking in published materials, however, and a rigorous analytic framework seems to be needed.

For REM, we explore compromise detection within a rigorous definitional and analytic framework. The centerpiece is what we call a block-acceptance policy, a flexibly defined rule that determines whether a proposed block in a blockchain is legitimate. As we show, defining and analyzing policies rigorously is challenging, but we provide strong analytical and empirical evidence that a relatively simple statistical-testing policy (which we denote $P_{\mathrm{stat}}$) can achieve good results. $P_{\mathrm{stat}}$ both limits an adversary's ability to harvest blocks unfairly and minimizes erroneous rejection of honestly mined blocks.

4.1 Threat Model and Definitions

4.1.1 Basic notation

To model block-acceptance policies, let $M = \{m_1, \ldots, m_n\}$ be the set of all miners, which we assume to be static. (Miners can join and leave the system; $M$ includes all potential miners.) An adversary $A$ controls a static subset $M_A \subset M$, where $|M_A| = k$. $\mathrm{rate}(m_i)$ specifies the mining rate of $m_i$, the number of mining operations per unit time it performs.

We define a candidate block to be a tuple $B = (t, m, d)$, where $t$ is a timestamp, $m \in M$ the identity of the CPU that mines the block, and $d$ is the block difficulty. Difficulty $d$ is defined as the win probability per mining operation in the underlying consensus protocol (e.g., a hash in Bitcoin, a unit time of sleep in PoET, an instruction in PoUW). $\mathcal{B}$ denotes the set of possible blocks $B$.

A blockchain is an ordered sequence of blocks. At time $\tau$, blockchain $C(\tau)$ is a sequence of accepted blocks $C(\tau) = \{B_1, B_2, \ldots, B_n\}$ for some $n$. We drop $\tau$ where it is clear from context. We let $r(\tau)$ denote the number of rejected blocks of honest miners, i.e., miners in $M - M_A$, in the history of $C(\tau)$. (Of course, this is not and indeed cannot be recorded in a real blockchain system.) Let $\mathcal{C}$ be the space of all possible blockchains $C$. Let $C_m$ denote blockchain $C$ restricted to blocks mined by miner $m \in M$.

In REM, a blockchain-acceptance policy is used to determine whether a block appears to come from a legitimate miner (a CPU that has not been compromised).

Definition 1. (Blockchain-Acceptance Policy) A blockchain-acceptance policy (or simply policy) $P : \mathcal{C} \times \mathcal{B} \to \{\mathsf{reject}, \mathsf{accept}\}$ is a function that takes as input a blockchain and a proposed block, and outputs whether the proposed block is legitimate.

4.1.2 Security and efficiency definitions

We model the consensus algorithm for the blockchain, the adversary $A$, and honest miners respectively as (ideal) programs $\mathrm{prog}_{\mathrm{chain}}$, $\mathrm{prog}_A$, and $\mathrm{prog}_m$. Together, they define what we call a security game $S(P)$ for a particular policy $P$.

We define security games and their constituent programs formally in Appendix A.2. Where clear from context in what follows, we use the notation $S$, rather than $S(P)$, i.e., omit $P$.

A security game $S$ may itself be viewed as a probabilistic algorithm. Thus we may treat the blockchain resulting from execution of $S$ for interval of time $\tau$ as a random variable $C_S(\tau)$.

Normalizing the revenue from mining a block to 1, we define the payoff for a miner $m$ for a given blockchain $C$ as $\pi_m(C) = |C_m|$.

An adversary $A$ seeks to maximize payoffs for its miners, as reflected in the following definition:

Definition 2. (Advantage of $A$). For a given security game $S$, the advantage of $A$ for time $\tau$ is:

$$\mathrm{Adv}_A^S(\tau) = \frac{E[\pi_m(C_S(\tau))]}{\max_{m_j \in M - M_A} E[\pi_{m_j}(C_S(\tau))]},$$

for any $m \in M_A$. Note that $E[\pi_m(C_S(\tau))]$ is equal for all such $m$, as they all follow the strategy of $A$ and can emit blocks as frequently as desired (ignoring $\mathrm{rate}(m)$).

A policy that keeps $\mathrm{Adv}_A^S(\tau)$ low is desirable, but there is a trade-off. A policy that rejects too many blocks incurs high waste, meaning that it rejects many blocks from honest miners. We define waste as follows.

Definition 3. (Waste of a policy). For a given blockchain $C(\tau) = \{B_1, B_2, \ldots, B_n\}$, the waste is defined as

$$\mathrm{Waste}(C(\tau)) = \frac{r(\tau)}{n + r(\tau)}.$$

For security game $S$, the waste at time $\tau$ is defined as

$$\mathrm{Waste}_S(\tau) = E[\mathrm{Waste}(C_S(\tau))].$$

$P_{\mathrm{stat}}^{\alpha,\mathrm{rate}_{\mathrm{best}}}(C, B)$:
    parse $B \to (\tau, m, d)$
    if $|C_m| > F^{-1}(1 - \alpha,\ d\tau(\mathrm{rate}_{\mathrm{best}}))$:
        output reject
    else
        output accept

Figure 2: $P_{\mathrm{stat}}^{\alpha,\mathrm{rate}_{\mathrm{best}}}$. $F^{-1}(\cdot, \lambda)$ is the quantile function for the Poisson distribution with rate $\lambda$.

Our exploration of policies focuses critically on the trade-offs between low $\mathrm{Adv}_A^S(\tau)$ and low $\mathrm{Waste}_S(\tau)$. To illustrate the issue, we give a simple example in Appendix A.3 of a policy that allows any CPU to mine only one block over its lifetime. As $\tau \to \infty$, it achieves the optimal $\mathrm{Adv}_A^S(\tau) = 1$, but at the cost of $\mathrm{Waste}_S(\tau) = 1$, i.e., 100% waste.

4.2 The REM policy: $P_{\mathrm{stat}}$

REM makes use of a statistical-testing-based policy that we denote by $P_{\mathrm{stat}}$. $P_{\mathrm{stat}}$ is compatible not just with PoUW, but also with PoET and potentially other SGX-based mining variants.

There are two parts to $P_{\mathrm{stat}}$. First, $P_{\mathrm{stat}}$ estimates the rate of the fastest honest miner(s) (fastest CPU type), denoted by $\mathrm{rate}_{\mathrm{best}} = \max_{m \in M - M_A} \mathrm{rate}(m)$. There are various ways to accomplish this; a simple one would be to have an authority (e.g., Intel) publish specs on its fastest CPUs' performance. (In PoET, mining times are uniform, so $\mathrm{rate}_{\mathrm{best}}$ is just a system parameter.) We describe an empirical approach to estimating $\mathrm{rate}_{\mathrm{best}}$ in REM in Appendix A.1.

Given an estimate of $\mathrm{rate}_{\mathrm{best}}$, $P_{\mathrm{stat}}$ tests submitted blocks statistically to determine whether a miner is mining blocks too quickly and may thus be compromised. The basic principle is simple: on receiving a block $B$ from miner $m$, $P_{\mathrm{stat}}$ tests the null hypothesis

$$H_0 = \{\mathrm{rate}(m) \le \mathrm{rate}_{\mathrm{best}}\}.$$

We use $|C_m(\tau)|$, the number of blocks mined by $m$ at time $\tau$, as the test statistic. Under $H_0$, $|C_m|$ should obey a Poisson distribution with rate $d\tau(\mathrm{rate}_{\mathrm{best}})$, denoted as $\mathrm{Pois}[d\tau(\mathrm{rate}_{\mathrm{best}})]$. $P_{\mathrm{stat}}$ rejects $H_0$ if $|C_m|$ is greater than the $(1-\alpha)$-quantile of the Poisson distribution. The false rejection rate for a single test is therefore at most $\alpha$. We specify $P_{\mathrm{stat}}$ (for a given $\mathrm{rate}_{\mathrm{best}}$) in Figure 2.

4.3 Analysis of $P_{\mathrm{stat}}$

We now analyze the average-case and worst-case waste and adversarial advantage of $P_{\mathrm{stat}}$. We assume for simplicity that $\mathrm{rate}_{\mathrm{best}}$ is accurately estimated. We remove this assumption in the worst-case analysis below. We also assume that the difficulty $d(t)$ is stationary over the period of observation.

Waste. Under $P_{\mathrm{stat}}$, a miner generates blocks according to a Poisson process; whether a block is accepted or rejected depends on whether the miner has generated more blocks than a time-dependent threshold. This process is obviously not memoryless and thus not directly representable as a Markov process. We can, however, achieve a close approximation using a discrete-time Markov chain. Indeed, as we show, we can represent waste in $P_{\mathrm{stat}}$ using a discrete-time Markov chain that is periodically identical to the process it models, meaning that its expected waste is identical at any time $n\tau$, for $n \in \mathbb{Z}^+$ and $\tau$ a model parameter specified below. This Markov chain has a stationary distribution that yields an expression upper-bounding waste in $P_{\mathrm{stat}}$. (We believe, and the periodic identical property suggests, that this bound is very tight.)

To construct the Markov chain, we partition time into intervals of length $\tau$; we regard each such interval as a discrete timestep. Assuming that all honest miners mine at rate $\mathrm{rate}$, let $\lambda = d\tau(\mathrm{rate})$. Thus the number of blocks an honest miner generates in a given timestep $i$ is distributed as $\mathrm{Pois}[\lambda]$, which we may represent as a random variable $Y_i$. Without loss of generality, we may set $\tau = 1/(d \cdot \mathrm{rate})$ and thus $\lambda = 1$ and $E[\mathrm{Pois}[\lambda]] = 1$.

We represent the state of an honest miner at timestep $n$ by a random variable $X_n = \sum_{i=1}^{n} (Y_i - E[Y_i]) = \left(\sum_{i=1}^{n} Y_i\right) - n$. Thus $X_n \in \mathbb{Z}$ is simply the difference between the miner's actually mined blocks and the expected number.

Our Markov chain consists of a set of states $C = \mathbb{Z}$ representing possible values of $X_n$ (we use the notation $C$ here, as states represent $|C_m|$ for an honest miner $m$). Figure 3 gives a simple example of such a chain (truncated to only five states).

Our statistical testing regime may be viewed as rejecting blocks when a transition is made to a state whose value is above a certain threshold $\mathrm{thresh}$. We denote the set of such states $C_{\mathrm{rej}} = \{ j \mid j \ge \mathrm{thresh}\} \subset C$ and depict corresponding nodes visually in our example in Figure 3 as red. $P_{\mathrm{stat}}$ sets $\mathrm{thresh}$ according to the statistical-testing
An important property that differentiates Pstat from regime we describe above and a desired false-rejection
canonical statistical tests is that Pstat repeatedly applies (Type-I) parameter . Specifically,
a given statistical test to an accumulating history of sam-
ples. The statistical dependency between samples makes
the analysis non-trivial, as we shall show. Crej [] = { j Z | j F 1 (1 , rate)}. (1)
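The single-test rule of Pstat described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the helper names `poisson_quantile` and `p_stat_accepts` are hypothetical, the Poisson rate under the null hypothesis is passed in directly as an expected block count, and the miner's block count is assumed to include the block being tested.

```python
import math


def poisson_quantile(p: float, lam: float) -> int:
    """Return the smallest k such that P[Pois(lam) <= k] >= p."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    if not 0.0 <= lam <= 700.0:
        # exp(-lam) underflows for much larger rates; a log-space or
        # library implementation would be needed there.
        raise ValueError("lam must be in [0, 700] for this simple sketch")
    k = 0
    pmf = math.exp(-lam)  # P[X = 0]
    cdf = pmf
    # Stop as well if the pmf underflows, so the loop always terminates.
    while cdf < p and pmf > 0.0:
        k += 1
        pmf *= lam / k    # P[X = k] computed from P[X = k - 1]
        cdf += pmf
    return k


def p_stat_accepts(blocks_mined: int, expected_blocks: float, alpha: float) -> bool:
    """Accept unless the block count exceeds the (1 - alpha)-quantile.

    blocks_mined    -- the test statistic |C_m| (assumed to include the new block)
    expected_blocks -- Poisson rate under H0 (hypothetical parameter name)
    alpha           -- single-test false-rejection (Type-I) parameter, in (0, 1]
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    return blocks_mined <= poisson_quantile(1.0 - alpha, expected_blocks)
```

For example, with an expected count of 1 and alpha = 0.4, the 0.6-quantile of Pois[1] is 1, so a first block is accepted and a second one in the same period is rejected. Because Pstat reapplies this test to an accumulating history, the per-test bound alpha does not translate directly into the long-run rejection rate, which is why the paper analyzes waste separately.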

[Figure 3 diagram: a chain of five states with transitions labeled P(0) through P(4).]
Figure 3: Markov chain with states C representing Pstat. Red nodes show the rejection set $C_{rej} = \mathbb{Z}^+$, i.e., thresh = 1. Outgoing edges from 0 are omitted for clarity.

[Figure 4 plots: x-axis t [days], from 0 to 60.]
(a) Left y-axis: adversarial advantage of Pstat. Right y-axis: the number of blocks mined by a compromised CPU versus an honest CPU.
(b) Left y-axis: the waste of Pstat. Right y-axis: the number of rejected blocks.
Figure 4: 60-day simulation of Pstat. The fastest honest CPU mines one block per hour. The Markov chain analysis yields a long-term advantage upper bound of 1.006 and waste of 0.006.

The transition probabilities in our Markov chain, where $P(k)$ is the probability that the miner produces $k$ blocks in a timestep, are:

$$P[i \to j \mid i \in C \setminus C_{rej}[\alpha]] = \begin{cases} P(j - i + 1) & \text{if } j \ge i - 1 \\ 0 & \text{otherwise} \end{cases} \tag{2}$$

$$P[i \to j \mid i \in C_{rej}[\alpha]] = \begin{cases} P(j + 1) & \text{if } j \ge -1 \\ 0 & \text{otherwise.} \end{cases} \tag{3}$$

An example of transitions is given in Figure 3. For instance, from state $-1$, the next state can be $-2$ if the miner doesn't produce any block in this step, with probability $P(0)$, or state $-1 + i$ if the miner produces $i + 1$ blocks in this step, thus with probability $P(i + 1)$.
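As a sanity check on the transition rule, the following Python sketch evaluates Eqs. (2) and (3) directly. It assumes, as in the normalized model above, that the number of blocks per timestep is Poisson with mean 1, and it takes the threshold thresh as an explicit input rather than deriving it from a Type-I parameter. The function names are hypothetical.

```python
import math


def poisson_pmf(k: int, lam: float = 1.0) -> float:
    """P(k): probability of producing k blocks in one timestep."""
    if k < 0:
        return 0.0
    return math.exp(-lam) * lam ** k / math.factorial(k)


def transition_prob(i: int, j: int, thresh: int, lam: float = 1.0) -> float:
    """Probability of moving from state i to state j in one timestep.

    States at or above `thresh` form the rejection set C_rej.
    """
    if i >= thresh:
        # Eq. (3): after a rejection the walk restarts as if from state 0,
        # so producing k blocks leads to state k - 1.
        return poisson_pmf(j + 1, lam)
    # Eq. (2): producing k blocks moves the state from i to i + k - 1.
    return poisson_pmf(j - i + 1, lam)
```

Returning zero for a negative block count covers the "otherwise" branches of both equations, and summing `transition_prob(i, j, thresh)` over all reachable `j` gives 1 for any starting state.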
Finally, an upper bound on the false rejection rate can be derived from the stationary probabilities of the Markov chain. Letting $q(s)$ denote the stationary probability of state $s$,

$$\mathrm{Waste}(P_{stat}) = \sum_{s \in C_{rej}[\alpha]} s \, q(s). \tag{4}$$

We compare our analytic bounds with simulation results below.

Adversarial Advantage. We denote by $\Sigma_{stat}$ the strategy of an adversary that publishes blocks as soon as they will be accepted by Pstat. In Appendix A.4, we show the following:

Theorem 1. In a (non-degenerate) security game S where A uses strategy $\Sigma_{stat}$,

$$\mathrm{Adv}_A^{S(P_{stat})} = \frac{1}{1 - \mathrm{Waste}(P_{stat})}.$$

Simulation. We simulate Pstat to explore its efficacy in both the average case and the worst case. Figure 4 shows the result of 1000 runs of a 60-day mining period simulation under Pstat. We set $\alpha = 0.4$. We present statistics with respect to the fastest (honest) CPU in the system, which for simplicity we assume mines one block per hour in expectation and refer to simply as the honest miner. The adversary uses attack strategy $\Sigma_{stat}$.

In Figure 4a, the solid blue line shows the average aggregate number of blocks mined by the adversary, and the dashed one those of the honest miner. The attacker's advantage is, of course, the ratio of the two values. Initially, the adversary achieves a relatively high advantage (about 127%), but this drops below 110% within 55 blocks, and continues to drop. Our asymptotic analytic bound on waste (given below) implies an advantage of 100.6%.

Figure 4b shows the average waste of Pstat and the absolute number of rejected blocks. The waste quickly drops below 10%. As blocks accumulate, the statistical power of Pstat grows, and the waste drops further. Analytically, we obtain $\mathrm{Waste}(P_{stat}) = 0.006$, or 0.6%, from Eqn. 4.

Setting $\alpha$. Setting the parameter $\alpha$ imposes a trade-off on system implementers. As noted, $\alpha$ corresponds to the Type-I error for a single test in Pstat. As Pstat performs continuous testing, however, a more meaningful security measure is $\mathrm{Waste}(P_{stat})$, the rate of falsely rejected blocks. Similarly, there is no particular notion of Type-II error, as our setting is adversarial. $\mathrm{Adv}_A^{S(P_{stat})}$ captures the corresponding notion in REM. As shown in Figure 5, raising $\alpha$ results in a lower $\mathrm{Adv}_A^{S(P_{stat})}$, but higher $\mathrm{Waste}(P_{stat})$, and vice versa.

[Figure 5 plots: x-axis t [days], from 0 to 60; curves for $\alpha$ = 0.1, 0.2, 0.4, and 0.6.]
(a) The adversarial advantage of Pstat under different $\alpha$.
(b) The waste of Pstat under different $\alpha$.
Figure 5: 60-day simulation of Pstat, under various $\alpha$. The fastest honest CPU mines an expected one block per hour.

5 Implementation Details

We have implemented a full REM prototype using SGX (§5.1), and as an example application swapped REM into the consensus layer of Bitcoin-core [23]. We explain how we implemented secure instruction counting (§5.2), and our hierarchical attestation framework (§5.3) that allows for arbitrary tasks to be used for work. We explain how to reduce the overhead of attestation due to SGX-specific requirements (§5.4). Finally (§5.5) we present four examples of PoUW and evaluate the overhead of REM.

5.1 Architecture

Figure 1 shows the architecture of REM. As discussed in §3.2, the core of REM is a miner program that does useful work and produces PoUWs. Each CPU instruction executed in the PoUW is analogous to one hash function computation in PoW schemes. That is, each instruction has some probability of successfully mining a block, and if the enclave determines this is the case, it produces a proof, the PoUW.

Pseudocode of the miner's iterative algorithm is given in Algorithm 1. In a given iteration, it first takes a block template from the agent and calculates the previous block's hash and difficulty. Then it reads the task to perform as useful work. Note that the enclave code has no network stack; therefore it receives its inputs from the miner's untrusted code and returns its outputs to the miner's untrusted code. The miner calls the TEE (SGX enclave) with the useful task and parameters for mining, and stores the result of the useful task. It also checks whether the enclave returned a successful PoUW; if so, it combines the agent-furnished template and PoUW into a legal block and sends it to the agent for publication. In REM, the miner's untrusted layer is implemented as a Python script using RPC to access the agent.

Algorithm 1: Miner Loop. The call to TEE (highlighted in green in the original) is executed in a TEE (e.g., an SGX enclave).

    while True do
        template ← read from blockchain agent
        hash, difficulty ← process(template)
        task ← get from useful work client
        outcome, PoUW ← TEE(task, hash, difficulty)    /* executed in the TEE */
        send outcome to useful work client
        if PoUW ≠ ⊥ then
            block ← formBlock(template, PoUW)
            send block to blockchain agent

To securely decide whether an instruction was a winning one, the PoUW enclave does the equivalent of generating a random number and checking whether it is smaller than a value target that represents the desired system-wide block rate, i.e., difficulty. For this purpose, it uses SGX's random number generator (SRNG). However, calling the SRNG and checking for a win after every single instruction would impose prohibitive overhead. Instead, we batch instructions by dividing useful work into subtasks of short duration compared to the inter-block interval (e.g., 10-second tasks for 10-minute average block intervals). We let each such subtask run to completion, and count its instructions. The PoUW enclave then calls the SRNG to determine whether at least one of the instructions has won, i.e., it checks for a result less than target, weighted by the total number of executed instructions. If so, the enclave produces an attestation that includes the input block hash and difficulty.

Why Count Instructions. While instructions are reasonable estimates of the CPU effort, CPU cycles would have been a more accurate metric. However, although cycles are counted, and the counts can be accessed through the CPU's performance counters, they are vulnerable to manipulation. The operating system may set their values arbitrarily, allowing a rational operator, who controls her own OS, to improve her chances of finding a block by faking a high cycle count. Moreover, counters are incremented even if an enclave is swapped out, allowing an OS scheduler to run multiple SGX instances and have them double-count cycles. Therefore, while instruction counting is not perfect, we find it is the best method for securely evaluating effort with the existing tools available in SGX.
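The batched win check described above, one random draw per subtask instead of one per instruction, can be sketched as follows. If each of n instructions wins independently with probability p, at least one wins with probability 1 - (1 - p)^n, so a single uniform draw compared against that value is equivalent to n separate Bernoulli tests. This is a hedged illustration only: the function names are hypothetical, and Python's `random.SystemRandom` merely stands in for the SGX random number generator.

```python
import math
import random


def batch_win_probability(n_instructions: int, per_instruction_p: float) -> float:
    """Probability that at least one of n independent trials succeeds."""
    if n_instructions < 0:
        raise ValueError("n_instructions must be non-negative")
    if not 0.0 <= per_instruction_p < 1.0:
        raise ValueError("per_instruction_p must be in [0, 1)")
    # 1 - (1 - p)^n, computed via log1p/expm1 so that tiny p and huge n
    # (the realistic regime for mining) do not lose precision.
    return -math.expm1(n_instructions * math.log1p(-per_instruction_p))


def batch_wins(n_instructions: int, per_instruction_p: float, rng=None) -> bool:
    """Draw one uniform value and decide whether the whole batch won."""
    rng = rng or random.SystemRandom()  # stand-in for the SGX RNG (assumption)
    return rng.random() < batch_win_probability(n_instructions, per_instruction_p)
```

The `rng` parameter exists only so the decision can be exercised deterministically in tests; inside an enclave the draw would come from the trusted random source.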
[Figure 6 diagram. Labels: Userwork.cpp, PoUW Toolchain, PoUWruntime.so, GNU ld, SGX sign tool, PoUWEnclave.so, Compliance Checker, REM loader. Outputs: useful work outcome (to Useful Work Client); Proof of Useful Work and Proof of Compliance (to P2P Network). The left part is done by useful work clients, the right part by REM miners.]
Figure 6: REM toolchain to transform a useful work program into a PoUW-ready program. Everything in the diagram has been implemented besides existing tools such as ld and the SGX signing tool.

Algorithm 2: PoUW Runtime

    Function TEE(task, hash, diff)
        outcome, n := task.run()
        win := 0
        PoUW := ⊥
        /* simulating n Bernoulli tests */
        l ← U[0, 1]    /* query SGX RNG */
        if l ≤ 1 − (1 − diff)^n then
            PoUW := intel[ hash | diff | 1 ]    /* attestation over the hash and difficulty */
        return outcome, PoUW

5.2 Secure Instruction Counting

As we want to allow arbitrary useful work programs, it is critical to ensure that instructions are counted correctly even in the presence of malicious useful work programs. To this end, we adopt a hybrid method combining static and dynamic program analysis. We employ a customized toolchain that can instrument any SGX-compliant code with dynamic runtime checks implementing secure instruction counting.

Figure 6 shows the workflow of the PoUW toolchain. First, the useful work code (usefulwork.cpp), written in C / C++, is compiled to assembly while reserving a register as the instruction counter. Next, the assembly code is rewritten by the toolchain such that the counter is incremented at the beginning of each basic block (a linear code sequence with no branches) by the number of instructions in that basic block. In particular, we use the LEA instruction to perform incrementing for two reasons. First, it completes in a single cycle, and second, it doesn't change flags and therefore does not affect conditional jumps. The count is performed at the beginning of a block rather than its end to prevent a cheater from jumping to the middle of a block and gaining an excessive count.

Another challenge is how to ensure the result of instruction counting is used properly: we cannot rely on the useful work programs themselves. The solution is to wrap the useful work with a predefined, trusted PoUW runtime, and make sure that the enclave can only be entered through the PoUW runtime. The logic of the PoUW runtime is summarized in Algorithm 2, and it is denoted as PoUWruntime.so in Figure 6. The PoUW runtime serves as an in-enclave loader that launches the useful work program with proper input and collects the result of instruction counting. It takes the block hash and difficulty and starts mining by running the mining program. Once the mining program returns, the PoUW runtime extracts the instruction counter from the reserved register. Then it draws a random value from SRNG and determines whether a new block should be generated, based on the instruction counter and the current difficulty. If a block should be generated, the PoUW runtime produces an attestation recording the template hash that it is called with and the difficulty.

The last step of the toolchain is to compile the resultant assembly and link it (using linker GNU ld) with the PoUW runtime (PoUWruntime.so), to produce the PoUW enclave. Figure 21 in the Appendix shows a snippet of instrumented assembly code. This PoUW enclave is finally signed by an Intel SGX signing tool, creating an application PoUWEnclave.so that is validated for loading into an enclave.

The security of instruction counting relies on the assumption that once instrumented, the code cannot alter its behavior. To realize this assumption in SGX, we need to require two invariants. First, code pages must be non-writable; second, the useful work program must be single threaded.

Enforcing Non-Writable Code Pages. Writable code pages allow a program to rewrite itself at runtime. Although necessary in some cases (e.g., JIT), writable code opens up potential security vulnerabilities. In particular, writable code pages are not acceptable in REM because they would allow a malicious useful work program
to easily bypass the instrumentation. A general memory protection policy would be to require code pages to have W⊕X permission, namely to be either writable or executable, but not both. However, W⊕X permissions are not enforced by the hardware. Intel has in fact acknowledged this issue [7] and recommended that enclave code contain no relocation to enable the W⊕X feature.

REM thus explicitly requires code pages in the enclave code (usefulwork.so) to have W⊕X permission. This is straightforward to verify, as with the current implementation of the SGX loader, code page permissions are taken directly from the ELF program headers [6].

Enforcing Single Threading. Another limitation of SGX is that the memory layout is largely predefined and known to the untrusted application. For example, the State Save Area (SSA) frames are a portion of stack memory that stores the execution context when handling interrupts in SGX. This also implies that the SSA pages have to be writable. The address of SSA frames for an enclave is determined at the time of initialization, as the Thread Control Structure (TCS) is loaded by the untrusted application through an EADD instruction. In other words, the address of SSA is always known to the untrusted application. This could lead to attacks on the instruction counting if a malicious program has multiple threads that interact via manipulation of the execution context in SSA. For example, as we will detail later, REM stores the counter in one of the registers. When one thread is swapped out, the register value stored in an SSA is subject to manipulation by another thread.

While more complicated techniques such as Address Space Layout Randomization (ASLR) for SGX could provide a general answer to this problem, for our purposes it suffices to enforce the condition that an enclave can be launched by at most one thread. As an SGX enclave has only one entry point, we can instrument the code with a spinlock to allow only the first thread to pass, as shown in Figure 7.

        .section .data
    ENCLAVE_MTX:
        .long 0                          # 0 = no thread has entered yet
        .section .text
        ...
    enclave_entry:
        mov   $1, %eax                   # value to store: lock taken
        xchgl %eax, ENCLAVE_MTX(%rip)    # atomically swap with the lock word
        cmp   $0, %eax                   # old value 0: this is the first thread
        jnz   enclave_entry              # otherwise spin forever

Figure 7: Code snippet: a spinlock to allow only the first thread to enter enclave_entry

5.3 Hierarchical Attestation

A blockchain participant that verifies a block has to check whether the useful work program that produced the block's PoUW followed the protocol and correctly counted its instructions. SGX attestations require such a verifier to obtain a fingerprint of the attesting enclave. As we allow arbitrary work, a naïve implementation would store all programs on the blockchain. Then a verifier that considers a certain block would read the program from the blockchain, verify it correctly counts instructions, calculate its fingerprint, and check the attestation. Beyond the computational effort, just placing all programs on the blockchain for verification would incur prohibitive overhead and enable DoS attacks via spamming the chain with overly large programs. The alternative of having an entity that verifies program compliance is also unacceptable, as it puts absolute blockchain control in the hands of this entity: it can authorize programs that deterministically win every execution.

To resolve this predicament, we form PoUW attestations with what we call two-layer hierarchical attestations. We hard-code only a single program's fingerprint into the blockchain, a static-analysis tool called compliance checker. The compliance checker runs in a trusted environment and takes a user-supplied program as input. It validates that it conforms with the requirements defined above. First, it confirms the text section is non-writable. Then it validates the work program's compliance by disassembling it and confirming that the dedicated register is reserved for instruction counting and that counts are correct and appear where they should. Next, it verifies that the PoUW runtime is correctly linked and identical to the expected PoUW runtime code. It then verifies that the only entry point is the PoUW runtime and that this is protected by a spinlock as shown in Figure 7. Finally, it calculates the program's fingerprint and outputs an attestation including this fingerprint.

Every PoUW then includes two parts: the useful work program's attestation on the mining success, and an attestation from the compliance checker of the program's compliance (Figure 8). Note that the compliance attestation and the program's attestation must be signed by the same CPU. Otherwise an attacker that compromises a single CPU could create fake compliance attestations for invalid tasks. Such an attacker could then create blocks at will from different uncompromised CPUs, circumventing the detection policy of Section 4.

In summary, the compliance enclave is verified through the hard-coded measurement in the blockchain agent. Its output is a measurement that should be identical to the measurement of the PoUW enclave PoUWEnclave.so. The PoUW enclave's output should match the block template (namely the hash of the block
prefix, up to the proof) and the prescribed difficulty.

[Figure 8 diagram. Header: prev. block hash, difficulty, transactions hash, timestamp. PoUW: from the compliance checker, the PoUW enclave measurement; from the PoUW enclave, the prefix hash and difficulty. Content: transactions.]
Figure 8: Block structure with a proof comprising the quotes from the compliance enclave and a work enclave.

Generalized Hierarchical Attestation. The hierarchical attestation approach can be useful for other scenarios where participants need to obtain attestations to code they do not know in advance. As a general approach, one hard-codes the fingerprint of a root compliance checker that verifies its children's compliance. Each of them, in turn, checks the compliance of its children, and so on, forming a tree. The leaves of the tree are the programs that produce the actual output to be verified. A hierarchical attestation therefore comprises a leaf attestation and a path to the root compliance checker. Each node attests the compliance of its child.

5.4 IAS access overhead

Verifying blocks doesn't require trusted hardware. However, due to a design choice by Intel, miners must contact the IAS to verify attestations. Currently there is no way to verify attestations locally. This requirement, however, does not change the basic security assumptions. Moreover, a simple modification to the IAS protocol, which is being tested by Intel [5], could get rid of the reliance on IAS completely on the verifiers' side.

Recall that the IAS is a public web service that receives SGX attestations and responds with verification results. Requests are submitted to the IAS over HTTPS; a response is a signed report indicating the validation status of the queried platform [50]. In the current version of IAS, a report is not cryptographically linked with its corresponding request, which makes the report only trustworthy for the client initiating the HTTPS session. Therefore an IAS access is required for every block verification by every blockchain participant.

However, the following modification can eliminate this overhead: simply echoing the request in the body of the report. Since the report is signed by Intel using a published public key [50, 51], only one access to IAS would be needed globally for every new block. Other miners could use the resulting signed report. Such a change is under testing by Intel for future versions of the IAS [5].

5.5 Experiments

We evaluate the overhead of REM with four examples of useful work benchmarks in REM as mining programs: a protein folding algorithm [1], a Support Vector Machine (SVM) classifier [27], the zlib compression algorithm (iterated) [2], and the SHA3-256 hash algorithm (iterated) [10]. We evaluate each benchmark in three modes:

Native  We compile with the standard toolchain.

SGX  We port to SGX by removing system calls and replacing system libraries with SGX-compliant ones. Then we compile in SGX-prerelease mode and run with the SGX driver v1.7 [49].

REM  After porting to SGX, we instrument the code using our REM toolchain. We then proceed as in the SGX mode.

We use the same optimization level (-O2) in all modes. The experiments are done on a Dell Precision Workstation with an Intel 6700K CPU and 32GB of memory. For more details on our experimental setup, see Appendix D.1.

We compared the running time in three modes and the results are shown in Figure 9. The running time of the native mode is normalized to one as a baseline. For all four useful workloads, we observe a total overhead of 5.8% to 14.4% in REM relative to the native mode. Because the code is instrumented at control flow transfers, workloads with more jumps will incur more counting overhead. For example, SHA3-256 is highly iterative compared with the other workloads, so it incurs the most counting overhead.

[Figure 9 bar chart: running time normalized to native (y-axis from 0 to 1.2) for Protein Folding, SVM, zlib, and SHA3 in the Native, SGX, and REM modes; the annotations on the bars read 14.4%, 10.8%, 6.5%, and 5.8%.]
Figure 9: REM Overhead

We note that overhead for running in SGX is not uniform. For computation-bound workloads such as protein folding, zlib, and SHA3, SGX introduces little overhead (< 1%) because the cost of switching to SGX and obtaining attestations is amortized by the longer in-enclave execution time of the workload. In the shorter SVM benchmark, the cost of entering SGX is more significant.

In summary, we observe an overhead of roughly 5% to 15% for converting useful-work benchmarks into REM PoUW enclaves.
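To recap how a verifier would use the two-layer attestation of Section 5.3, the checks listed in its summary can be sketched in Python. This is an illustration under loud assumptions, not REM's code: `Quote`, `make_quote`, and `verify_pouw` are hypothetical names, an HMAC keyed per CPU stands in for a real SGX quote signature (which in practice is validated through the IAS, not with keys held by the verifier), and the 8-byte big-endian difficulty encoding is invented for the example.

```python
import hashlib
import hmac
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Quote:
    """Hypothetical stand-in for an SGX quote."""
    measurement: bytes   # fingerprint of the enclave that produced the quote
    cpu_id: str          # identifies the CPU that signed the quote
    report_data: bytes   # payload chosen by the enclave
    tag: bytes           # HMAC stand-in, NOT a real SGX signature


def _tag(cpu_key: bytes, measurement: bytes, cpu_id: str, report_data: bytes) -> bytes:
    # Hash each field first so that field boundaries are unambiguous.
    msg = b"".join(hashlib.sha256(part).digest()
                   for part in (measurement, cpu_id.encode(), report_data))
    return hmac.new(cpu_key, msg, hashlib.sha256).digest()


def make_quote(cpu_key: bytes, measurement: bytes, cpu_id: str, report_data: bytes) -> Quote:
    return Quote(measurement, cpu_id, report_data,
                 _tag(cpu_key, measurement, cpu_id, report_data))


def _quote_is_valid(q: Quote, cpu_keys: Mapping[str, bytes]) -> bool:
    key = cpu_keys.get(q.cpu_id)
    if key is None:
        return False
    expected = _tag(key, q.measurement, q.cpu_id, q.report_data)
    return hmac.compare_digest(q.tag, expected)


def verify_pouw(compliance: Quote, work: Quote, checker_measurement: bytes,
                prefix_hash: bytes, difficulty: int,
                cpu_keys: Mapping[str, bytes]) -> bool:
    """Check both layers of a PoUW against the block being verified."""
    expected_report = prefix_hash + difficulty.to_bytes(8, "big")
    return (
        _quote_is_valid(compliance, cpu_keys)
        and _quote_is_valid(work, cpu_keys)
        # Layer 1: the quote comes from the hard-coded compliance checker.
        and compliance.measurement == checker_measurement
        # The checker's output is the measurement of the PoUW enclave.
        and compliance.report_data == work.measurement
        # Both attestations must be signed by the same CPU.
        and compliance.cpu_id == work.cpu_id
        # Layer 2: the enclave's output matches the block prefix and difficulty.
        and work.report_data == expected_report
    )
```

The same-CPU comparison is the check that keeps one compromised chip from vouching for non-compliant programs that are then run on uncompromised CPUs.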
6 Waste Analysis

To compare PoUW against PoET and alternative schemes, we explore a common game-theoretic model (with details deferred to the appendix). We consider a set of operators / agents that can either work locally on their own useful workloads or utilize their resources for mining. Based on the revenue from useful work and mining, and the capital and operational costs, we compute the equilibrium point of the system. We calculate the waste in this context as the ratio of the total resource cost (in U.S. dollars) spent per unit of useful work on a mining node compared with the cost when mining is not possible and all operators do useful work. We plug in concrete numbers for the parameters based on statistics we collected from public data sources.

Initial study of PoET identified a subtle pitfall involving miners' ability to mine simultaneously on multiple blockchains, a problem solved by He et al. [42] in a scheme we call Lazy-PoET. Our analysis, however, reveals that even Lazy-PoET suffers from what we call the stale-chip problem. Miners are better off maintaining farms of cheap, outdated CPUs just for mining than using new CPUs for otherwise useful goals.

We consider instead an approach in which operators utilize their CPUs while mining, making newer CPUs more attractive due to the added revenue from the useful work done. We call this scheme Busy-PoET. We find that it improves on Lazy-PoET, but remains highly wasteful.

This observation leads to another approach, Proof of Potential Work (PoPW). PoPW is similar to Busy-PoET, but reduces mining time according to the speed of the CPU (its potential to do work), and thus rewards use of newer CPUs. Although PoPW would greatly reduce waste, SGX does not allow an enclave to securely retrieve its CPU model, making PoPW theoretical only.

We conclude that PoUW incurs the smallest amount of waste among the options under study. For full details on our model, parameter choices, and analyses of the various mining schemes, we refer the reader to Appendices B and C.

7 Related Work

Cryptocurrencies and Consensus. Modern decentralized cryptocurrencies have stimulated strong interest in Proof-of-Work (PoW) systems [17, 33, 52] as well as techniques to reduce their associated waste.³

³ Permissioned systems, as supported in, e.g., Hyperledger [25] and Stellar [61], avoid waste by using traditional consensus protocols at the cost of giving up decentralization.

An approach similar to PoET [47], possibly originating with Dryja [31], is to limit power waste by so-called Proof-of-Idle. Miners buy mining equipment and get paid for proving that their equipment remains idle. Beyond the technical challenges, as in PoET, an operator with a set budget could redirect savings from power to purchase more idle machines, producing capital waste.

Alternative approaches, like PoUW, aim at PoW producing work useful for a secondary goal. Permacoin [63] repurposes mining resources as a distributed storage network, but recycles only a small fraction of mining resources. Primecoin [55] is an active cryptocurrency whose useful outputs are Cunningham and Bi-twin chains of prime numbers, which have no known utility. Gridcoin [41, 40], an active cryptocurrency whose miners work for the BOINC [14] grid-computing network, relies on a central entity. FoldingCoin [70] rewards participants for work on a protein folding problem, but as a layer atop, not integrated with, Bitcoin.

Proof-of-Stake [77, 19, 54, 21] is a distinct approach in which miners gain the right to generate blocks by committing cryptocurrency funds. It is used in experimental systems such as Peercoin [56] and Nxt [28]. Unlike PoW, however, in PoS, an attacker that gains majority control of mining resources for a bounded time can control the system forever. PoS protocols also require that funds, used as stake, remain frozen (and unusable) for some time. To remove this assumption, Bentov et al. [20] and Duong et al. [32] propose hybrid PoW / PoS systems. These works, and the line of hybrid blockchain systems starting with Bitcoin-NG [36, 57, 68], can all utilize PoUW as a low-waste alternative to PoW.

Another line of work on PoW for cryptocurrencies aims at PoWs that resist mining on dedicated hardware and prevent concentration of mining power, e.g., via memory-intensive hashing as in Scrypt [59] and Ethereum [24]. Although democratization of mining power is not our focus here, PoUW in fact achieves this goal by restricting mining to general-use CPUs.

SGX. Due to the complexity of the x86-64 architecture, several works [29, 75, 79] have exposed security problems in SGX, such as side-channel attacks [79]. Tramer et al. [75] consider the utility of SGX if its confidentiality guarantees are broken. Similar practical concerns motivate REM's mechanism for tolerating compromised SGX chips.

Ryoan [44] is a framework that allows a server to run code on private client data and return the output to the client. The (trusted) Ryoan service instruments the server operator's code to prevent leakage of client data. In contrast, in REM, the useful-workload code is instrumented in an untrusted environment, and an attestation of its validity is produced within a trusted environment.

Haven [18] runs non-SGX applications by incorporating a library OS into the enclave. REM, in contrast, takes code amenable to SGX compilation and enforces cor-
rect instrumentation. In principle, Haven could allow for non-SGX code to be adapted for PoUW.

Zhang et al. [80] and Juels et al. [53] are the first works we are aware of to pair SGX with cryptocurrencies. Their aim is to augment the functionality of smart contracts, however, and is unrelated to the underlying blockchain layer in which REM operates.

8 Conclusion

We presented REM, which supports permissionless blockchain consensus based on a novel mechanism called Proof of Useful Work (PoUW). PoUW leverages Intel SGX to significantly reduce the waste associated with Proof of Work (PoW), and builds on and remedies shortcomings in Intel's innovative PoET scheme. PoUW and REM are thus a promising basis for partially-decentralized blockchains, reducing waste given certain trust assumptions in a hardware vendor such as Intel.

Using a rigorous analytic framework, we have shown how REM can achieve resilience against compromised nodes with minimal waste (rejected honest blocks). This framework extends to PoET and potentially other SGX-based mining approaches.

Our implementation of REM introduces powerful new techniques for SGX applications, namely instruction-counting instrumentation and hierarchical attestation, of potential interest beyond REM itself. They allow REM to accommodate essentially any desired workloads, permitting flexible adaptation in a variety of settings.

Our framework for economic analysis offers a general means for assessing the true utility of mining schemes, including PoW and partially-decentralized alternatives. Beyond illustrating the benefits of PoUW and REM, it allowed us to expose risks of approaches such as PoET in the use of stale chips, and propose improved variants, including Proof of Potential Work (PoPW). We found that small changes to the TEE framework would be significant for reduced-waste blockchain mining. In particular, allowing for secure instruction (or cycle) counting would reduce PoUW overhead, and a secure chip-model reading instruction would allow for PoPW implementation.

We reported on a complete implementation of REM, swapped in for the consensus layer in Bitcoin core in a prototype system. Our experiments showed minimal performance impact (5-15%) on example benchmarks.

Acknowledgements

This work is funded in part by NSF grants CNS-1330599, CNS-1514163, and CNS-1564102, ARO grant W911NF-16-1-0145, and IC3 sponsorship from Chain, IBM, and Intel.

References

[1] A Genetic Algorithm for Predicting Protein Folding in the 2D HP Model. https://github.com/alican/GeneticAlgorithm. Accessed: 2016-11-11.
[2] A Lossless, High Performance Implementation of the Zlib (RFC 1950) and Deflate (RFC 1951) Algorithm. https://code.google.com/archive/p/miniz/. Accessed: 2017-2-16.
[3] Amazon EC2 instance pricing. https://aws.amazon.com/ec2/pricing/on-demand/. Accessed: 2016-10-29.
[4] Amazon EC2 instance types. https://aws.amazon.com/ec2/instance-types/. Accessed: 2016-10-29.
[5] Attestation Service for Intel Software Guard Extensions (Intel SGX): API Documentation. Revision 2.0. Section 4.2.2. https://software.intel.com/sites/default/files/managed/7e/3b/ias-api-spec.pdf. Accessed: 2017-2-21.
[6] Intel(R) Software Guard Extensions for Linux OS. https://github.com/01org/linux-sgx. Accessed: 2017-2-16.
[7] Intel Software Guard Extensions Enclave Writer's Guide. https://software.intel.com/sites/default/files/managed/ae/48/Software-Guard-Extensions-Enclave-Writers-Guide.pdf. Accessed: 2017-2-16.
[8] Passmark software. https://www.cpubenchmark.net/. Accessed: 2016-10-29.
[9] Sawtooth-core source code (validator). https://github.com/hyperledger/sawtooth-core/tree/0-7/validator/sawtooth_validator/consensus/poet1. Accessed: 2017-2-21.
[10] Single-file C implementation of the SHA-3 implementation with Init/Update/Finalize hashing (NIST FIPS 202). https://github.com/brainhub/SHA3IUF. Accessed: 2017-2-16.
[11] Stale CPU dealers on Alibaba. https://wefound.en.alibaba.com/. Accessed: 2016-10-29.
[12] UCI Machine Learning Repository. http://archive.ics.uci.edu/ml/datasets.html.
[13] ANATI, I., GUERON, S., JOHNSON, S., AND SCARLATA, V. Innovative technology for CPU based attestation and sealing. In Proceedings of the 2nd International Workshop on Hardware and Architectural Support for Security and Privacy (2013), vol. 13.
[14] ANDERSON, D. P. Boinc: A system for public-resource computing and storage. In Grid Computing, 2004. Proceedings. Fifth IEEE/ACM International Workshop on (2004), IEEE, pp. 4-10.
[15] ASPNES, J., JACKSON, C., AND KRISHNAMURTHY, A. Exposing computationally-challenged Byzantine impostors. Department of Computer Science, Yale University, New Haven, CT, Tech. Rep (2005).

In summary, our results show that REM is practi-
[16] A ZURE , M. Blockchain as a service. https:
cally deployable and promising path to fair and environ- //web.archive.org/web/20161027013817/https://
mentally friendly blockchains in partially-decentralized azure.microsoft.com/en-us/solutions/blockchain/,
blockchains. 2016.

[17] Back, A. Hashcash - a denial of service counter-measure. http://www.cypherspace.org/hashcash/hashcash.pdf, 2002.
[18] Baumann, A., Peinado, M., and Hunt, G. Shielding applications from an untrusted cloud with Haven. ACM Trans. Comput. Syst. 33, 3 (Aug. 2015), 8:1–8:26.
[19] Bentov, I., Gabizon, A., and Mizrahi, A. Cryptocurrencies without proof of work. CoRR abs/1406.5694 (2014).
[20] Bentov, I., Lee, C., Mizrahi, A., and Rosenfeld, M. Proof of activity: Extending bitcoin's proof of work via proof of stake. Cryptology ePrint Archive, Report 2014/452, 2014. http://eprint.iacr.org/2014/452.
[21] Bentov, I., Pass, R., and Shi, E. Snow white: Provably secure proofs of stake. Cryptology ePrint Archive, Report 2016/919, 2016. http://eprint.iacr.org/2016/919.
[22] Bitcoin community. Bitcoin source. https://github.com/bitcoin/bitcoin, retrieved Nov. 2016.
[23] Bitcoin community. Bitcoin source. https://github.com/bitcoin/bitcoin, retrieved Mar. 2015.
[24] Buterin, V. A next generation smart contract & decentralized application platform. https://www.ethereum.org/pdfs/EthereumWhitePaper.pdf/, retrieved Feb. 2015, 2013.
[25] Cachin, C. Architecture of the Hyperledger blockchain fabric. In Workshop on Distributed Cryptocurrencies and Consensus Ledgers (2016).
[26] Carlsten, M., Kalodner, H., Weinberg, S. M., and Narayanan, A. On the instability of bitcoin without the block reward. In ACM CCS (2016).
[27] Chang, C.-C., and Lin, C.-J. LIBSVM: A library for support vector machines. ACM Transactions on Intelligent Systems and Technology 2 (2011), 27:1–27:27. Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm.
[28] Community, T. N. Nxt whitepaper, revision 4. https://web.archive.org/web/20160207083400/https://www.dropbox.com/s/cbuwrorf672c0yy/NxtWhitepaper_v122_rev4.pdf, 2014.
[29] Costan, V., and Devadas, S. Intel SGX Explained. Cryptology ePrint Archive (2016).
[30] Deetman, S. Bitcoin could consume as much electricity as Denmark by 2020. https://web.archive.org/web/20160828092858/http://motherboard.vice.com/read/bitcoin-could-consume-as-much-electricity-as-denmark-by-2020, March 2016.
[31] Dryja, T. Optimal mining strategies. SF Bitcoin-Devs presentation. https://www.youtube.com/watch?v=QN2TPeQ9mnA, 2014.
[32] Duong, T., Fan, L., Veale, T., and Zhou, H.-S. Securing bitcoin-like backbone protocols against a malicious majority of computing power. Cryptology ePrint Archive, Report 2016/716, 2016. http://eprint.iacr.org/2016/716.
[33] Dwork, C., and Naor, M. Pricing via processing or combatting junk mail. In Annual International Cryptology Conference (1992), Springer, pp. 139–147.
[34] Dwyer, J. P., and Hines, P. Beyond the byzz: Exploring distributed ledger technology use cases in capital markets and corporate banking. Tech. rep., Celent and MISYS, 2016.
[35] Eyal, I. The miner's dilemma. In IEEE Symposium on Security and Privacy (2015), pp. 89–103.
[36] Eyal, I., Gencer, A. E., Sirer, E. G., and van Renesse, R. Bitcoin-ng: A scalable blockchain protocol. In 13th USENIX Symposium on Networked Systems Design and Implementation (NSDI 16) (2016), pp. 45–59.
[37] Eyal, I., and Sirer, E. G. Majority is not enough: Bitcoin mining is vulnerable. In Financial Cryptography and Data Security (2014).
[38] Garay, J. A., Kiayias, A., and Leonardos, N. The Bitcoin backbone protocol: Analysis and applications. In Advances in Cryptology - EUROCRYPT 2015 - 34th Annual International Conference on the Theory and Applications of Cryptographic Techniques (2015), pp. 281–310.
[39] Garwood, F. (i) Fiducial limits for the Poisson distribution. Biometrika 28, 3-4 (1936), 437.
[40] Gridcoin. Gridcoin. https://web.archive.org/web/20161013081149/http://www.gridcoin.us/, 2016.
[41] Gridcoin. Gridcoin (grc) first coin utilizing boinc official thread. https://web.archive.org/web/20160909032618/https://bitcointalk.org/index.php?topic=324118.0, 2016.
[42] He, W., Song, D., and Milutinovic, M. SGX and smart contracts. Initiative for Cryptocurrencies and Contracts Retreat (presentation), 2016.
[43] Hoekstra, M., Lal, R., Pappachan, P., Phegade, V., and Del Cuvillo, J. Using innovative instructions to create trustworthy software solutions. In Proceedings of the 2nd International Workshop on Hardware and Architectural Support for Security and Privacy (New York, NY, USA, 2013), HASP '13, ACM, pp. 11:1–11:1.
[44] Hunt, T., Zhu, Z., Xu, Y., Peter, S., and Witchel, E. Ryoan: A distributed sandbox for untrusted computation on secret data. In 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16) (GA, Nov. 2016), USENIX Association, pp. 533–549.
[45] Intel. Intel Software Guard Extensions Programming Reference, 2014.
[46] Intel. Intel 64 and IA-32 Architectures Software Developer's Manual: Volume 3 (3A, 3B, 3C & 3D): System Programming Guide, 325384-059us ed., June 2016.
[47] Intel. Sawtooth lake introduction. https://web.archive.org/web/20161025232205/https://intelledger.github.io/introduction.html, 2016.
[48] Intel Corporation. Intel Software Guard Extensions SDK. https://software.intel.com/en-us/sgx-sdk, 2015.
[49] Intel Corporation. Intel SGX for Linux. https://01.org/intel-softwareguard-extensions, 2016.
[50] Intel Corporation. Intel Software Guard Extensions: Intel Attestation Service API. https://software.intel.com/sites/default/files/managed/3d/c8/IAS_1_0_API_spec_1_1_Final.pdf, 2016.
[51] Intel Corporation. Public Key for Intel Attestation Service. https://software.intel.com/en-us/sgx/resource-library, 2016.
[52] Jakobsson, M., and Juels, A. Proofs of work and bread pudding protocols. In Secure Information Networks. Springer, 1999, pp. 258–272.
[53] Juels, A., Kosba, A., and Shi, E. The Ring of Gyges: Investigating the future of criminal smart contracts. In ACM CCS (2016), pp. 283–295.
[54] Kiayias, A., Konstantinou, I., Russell, A., David, B., and Oliynykov, R. A provably secure proof-of-stake blockchain protocol. Cryptology ePrint Archive, Report 2016/889, 2016. http://eprint.iacr.org/2016/889.
[55] King, S. Primecoin: Cryptocurrency with prime number proof-of-work. https://web.archive.org/web/20160307052339/http://primecoin.org/static/primecoin-paper.pdf, 2013.

[56] King, S., and Nadal, S. PPcoin: Peer-to-peer crypto-currency with proof-of-stake. https://web.archive.org/web/20161025145347/https://peercoin.net/assets/paper/peercoin-paper.pdf, 2012.
[57] Kogias, E. K., Jovanovic, P., Gailly, N., Khoffi, I., Gasser, L., and Ford, B. Enhancing bitcoin security and performance with strong consistency via collective signing. In 25th USENIX Security Symposium (USENIX Security 16) (2016), pp. 279–296.
[58] Lewenberg, Y., Sompolinsky, Y., and Zohar, A. Inclusive block chain protocols. In Financial Cryptography (Puerto Rico, 2015).
[59] Litecoin Project. Litecoin, open source P2P digital currency. https://litecoin.org, retrieved Nov. 2014.
[60] Lombrozo, E., Lau, J., and Wuille, P. BIP141: Segregated witness (consensus layer). https://web.archive.org/web/20160521104121/https://github.com/bitcoin/bips/blob/master/bip-0141.mediawiki, 2015.
[61] Mazieres, D. The Stellar consensus protocol: A federated model for Internet-level consensus. https://web.archive.org/web/20161025142145/https://www.stellar.org/papers/stellar-consensus-protocol.pdf, 2015.
[62] McKeen, F., Alexandrovich, I., Berenzon, A., Rozas, C. V., Shafi, H., Shanbhogue, V., and Savagaonkar, U. R. Innovative instructions and software model for isolated execution. In Proceedings of the 2nd International Workshop on Hardware and Architectural Support for Security and Privacy (2013), p. 10.
[63] Miller, A., Shi, E., Juels, A., Parno, B., and Katz, J. Permacoin: Repurposing Bitcoin work for data preservation. In Proceedings of the IEEE Symposium on Security and Privacy (San Jose, CA, USA, 2014), IEEE.
[64] Nakamoto, S. Bitcoin: A peer-to-peer electronic cash system. http://www.bitcoin.org/bitcoin.pdf, 2008.
[65] Nayak, K., Kumar, S., Miller, A., and Shi, E. Stubborn mining: Generalizing selfish mining and combining with an eclipse attack. IACR Cryptology ePrint Archive 2015 (2015), 796.
[66] Pande, V. Protein folding: The science. https://web.archive.org/web/20161016120350/https://folding.stanford.edu/home/the-science/.
[67] Pass, R., Seeman, L., and Shelat, A. Analysis of the blockchain protocol in asynchronous networks. Tech. rep., Cryptology ePrint Archive, Report 2016/454, 2016.
[68] Pass, R., and Shi, E. Hybrid consensus: Efficient consensus in the permissionless model. Cryptology ePrint Archive, Report 2016/917, 2016. http://eprint.iacr.org/2016/917.
[69] Popper, N. Central banks consider bitcoin's technology, if not bitcoin. New York Times, Oct 11 2016.
[70] Ross, R., and Sewell, J. Foldingcoin white paper. https://web.archive.org/web/20161022232226/http://foldingcoin.net/the-coin/white-paper/, 2015.
[71] Sapirshtein, A., Sompolinsky, Y., and Zohar, A. Optimal selfish mining strategies in Bitcoin. CoRR abs/1507.06183 (2015).
[72] Simon Johnson, Vinnie Scarlata, Carlos Rozas, Ernie Brickell, and Frank Mckeen. Intel Software Guard Extensions: EPID Provisioning and Attestation Services, 2015.
[73] Sompolinsky, Y., and Zohar, A. Accelerating Bitcoin's transaction processing. Fast money grows on trees, not chains. In Financial Cryptography (Puerto Rico, 2015).
[74] SWIFT, and Accenture. Swift on distributed ledger technologies. Tech. rep., SWIFT and Accenture, 2016.
[75] Tramer, F., Zhang, F., Lin, H., Hubaux, J.-P., Juels, A., and Shi, E. Sealed-glass proofs: Using transparent enclaves to prove and sell knowledge. Cryptology ePrint Archive, Report 2016/635, 2016. http://eprint.iacr.org/2016/635.
[76] Unger, R., and Moult, J. Genetic algorithms for protein folding simulations. Journal of Molecular Biology 231, 1 (1993), 75–81.
[77] User QuantumMechanic. Proof of stake instead of proof of work. https://web.archive.org/web/20160320104715/https://bitcointalk.org/index.php?topic=27787.0.
[78] Wood, G. Ethereum: A secure decentralised generalised transaction ledger (EIP-150 revision). https://web.archive.org/web/20161019105532/http://gavwood.com/Paper.pdf, 2016.
[79] Xu, Y., Cui, W., and Peinado, M. Controlled-channel attacks: Deterministic side channels for untrusted operating systems. In Proc. IEEE Symp. Security and Privacy (May 2015), pp. 640–656.
[80] Zhang, F., Cecchetti, E., Croman, K., Juels, A., and Shi, E. Town crier: An authenticated data feed for smart contracts. In Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security (New York, NY, USA, 2016), CCS '16, ACM, pp. 270–282.

A Tolerating Compromised SGX Nodes: Details

A.1 Mining Rate Estimation

We start by discussing how to statistically infer the power of a CPU from its blocks in the blockchain. Reading the difficulty of each block in the main chain and the rate of blocks from a specific CPU, we can estimate a lower bound of that CPU's power; it follows directly from the rate of its blocks. It is a lower bound since the CPU might not be working continuously, and the estimate's accuracy increases with the number of available blocks.

Recall that C_{m_i} is the set of blocks mined by miner m_i so far. C_{m_i} may contain multiple blocks, perhaps with varying difficulties. Without loss of generality, we write the difficulty as a function of time, d(t). The difficulty is the probability for a single instruction to yield a win. Denote the power of the miner, i.e., its mining rate, by rate_i. Therefore in a given time interval of length T, the number of blocks mined by a specific CPU follows a Poisson distribution with rate rate_i · T · d(t) (since CPU rates are high and the win probability is small, it is appropriate to approximate a Binomial distribution by a Poisson distribution). Further, under an independence assumption, the mining process of a specific CPU is specified by a Poisson process with rate λ_i(t) = rate_i · d(t), the product of the win probability and the miner's rate rate_i.

There are many methods to estimate the mean of a Poisson distribution. For example, by [39] an interval estimate of λ_i can be derived from n = |C_{m_i}|, with 1 − α

confidence, as

\[ \lambda_i \in \left[ \frac{1}{2T}\,\chi^2_{\alpha/2,\,2n},\ \ \frac{1}{2T}\,\chi^2_{1-\alpha/2,\,2(n+1)} \right]. \tag{5} \]

Therefore, letting D = \int_{t_0}^{t_f} d(t)\,dt, the 1 − α confidence interval of rate_i is

\[ \left[ \frac{1}{2D}\,\chi^2_{\alpha/2,\,2n},\ \ \frac{1}{2D}\,\chi^2_{1-\alpha/2,\,2(n+1)} \right]. \tag{6} \]

Knowing rates for all miners, the rate of the strongest CPU (rate_best) can be estimated. The challenge here is to limit the influence of adversarial nodes. To this end, instead of finding the strongest CPU directly, we approximate rate_best based on rate^ρ (e.g., rate^{90%}), namely the rate of the ρ-percentile fastest miner.

    prog_chain[P]

    State:
        C: the chain
    On receive "init":
        C := ∅
        d := d_0
        Send (C, P, d) to all miners
    On receive "submit" B from m:
        if P(C, B) = accept:
            C ← C ∪ {B}
            d ← adjust(C, d)
            Send (C, P, d) to all miners

Figure 10: The program for a blockchain. We omit details here on how difficulty d is set, i.e., how d_0 and adjust are chosen.
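The interval estimate of Equations (5) and (6) can be sketched in a few lines of Python. This is an illustrative sketch only, not code from REM: the function names are hypothetical, and instead of chi-square quantiles it solves the equivalent Poisson tail equations by bisection (the lower limit solves P(X ≥ n) = α/2 and the upper limit solves P(X ≤ n) = α/2), which needs nothing beyond the standard library. The argument `D` stands for the integrated difficulty over the observation window.

```python
import math

def poisson_cdf(n, mu):
    """P(X <= n) for X ~ Poisson(mu), summed in log space for stability."""
    if n < 0:
        return 0.0
    if mu <= 0:
        return 1.0
    log_mu = math.log(mu)
    total = sum(math.exp(-mu + k * log_mu - math.lgamma(k + 1))
                for k in range(n + 1))
    return min(total, 1.0)

def _root_of_decreasing(f, iters=200):
    """Bisection for the root of a decreasing function with f(0) > 0."""
    lo, hi = 0.0, 1.0
    while f(hi) > 0:          # grow the bracket until the sign changes
        hi *= 2.0
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def garwood_interval(n, alpha):
    """Exact 1 - alpha interval for the mean count, given n observed blocks."""
    lower = 0.0 if n == 0 else _root_of_decreasing(
        lambda mu: poisson_cdf(n - 1, mu) - (1 - alpha / 2))
    upper = _root_of_decreasing(lambda mu: poisson_cdf(n, mu) - alpha / 2)
    return lower, upper

def rate_interval(n, alpha, D):
    """Interval for the mining rate: the mean-count interval divided by D."""
    lower, upper = garwood_interval(n, alpha)
    return lower / D, upper / D
```

For example, `rate_interval(10, 0.05, D)` gives the 95% interval for a miner with ten blocks in the chain. The interval narrows relative to n as more blocks are observed, which is why the estimate becomes more accurate over time.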

Claim 1. Suppose there are n nodes and k out of them are adversarial. Let r_h = {rate_1, ..., rate_{n−k}} be the rates of the n − k honest nodes, in monotonically non-decreasing order, and let r = {rate^{1/n}, ..., rate^{100%}} be the rates of all nodes, in monotonically non-decreasing order. Then

\[ \mathrm{rate}_{k+1} \;\ge\; \mathrm{rate}^{\frac{k+1}{n}} \;\ge\; \mathrm{rate}_1 . \]

By Claim 1, we can obtain rate_best that is at least as high as rate^ρ.

Bootstrapping. During the launch of a cryptocurrency, it could be challenging to estimate the mining power of the population accurately, potentially leading to poisoning attacks by an adversary. At this early stage, it makes sense to hardwire a system estimate of the maximum mining power of honest miners into the system and set conditions (e.g., a particular mining rate or target date) to estimate rate_best as we propose above. If the cryptocurrency launches with a large number of miners, an even simpler approach is possible before switching to rate_best estimation: We can cap the total number of blocks that any one node can mine, a policy we illustrate below. (See P_simple.)

A.2 Security game definition

We model REM as an interaction among three entities: a blockchain consensus algorithm, an adversary, and a set of honest miners. Their behavior together defines a security game, which we define formally below. We characterize the three entities respectively as (ideal) programs prog_chain, prog_A, and prog_m, which we now define.

Blockchain consensus algorithm (prog_chain). A consensus algorithm determines which valid blocks are added to a blockchain C. We assume that underlying consensus and fork resolution are instantaneous; loosening this assumption does not materially affect our analyses. We also assume that block timestamping is accurate. Timestamps can technically be forged at block generation, but in practice miners reject blocks with large skews [23], limiting the impact of timestamp forgery. Informally, prog_chain maintains and broadcasts an authoritative blockchain C. In addition to verifying that block contents are correct, prog_chain appends to C only blocks that are valid under a policy P. We model the blockchain consensus algorithm as the (ideal) stateful program specified in Figure 10.

Adversary A (prog_A). In our model, an adversary A executes a strategy Σ_A that coordinates the k miners M_A under her control to generate blocks. Specifically:

Definition 4. (Adversarial Strategy). An adversarial strategy is a probabilistic algorithm Σ_A that takes in a set of identities, the current blockchain and the policy, and outputs a time-stamp and identity for block submission. Specifically, (M_A, C, t, P) → (t, m) ∈ ℝ⁺ × M_A.

In principle, Σ_A can have dependencies among individual node behaviors. In our setting, this would not benefit A, however. As we don't know M_A a priori, though, the only policies we consider operate on individual miner block-generation history.

As a wrapper expressing implementation by A of Σ_A, we model A as a program prog_A, specified in Figure 11.

Honest miners (prog_m). Every honest miner m ∈ M ∖ M_A follows an identical strategy, a probabilistic algorithm denoted Σ_h. In REM, Σ_h may be modeled as a simple algorithm that samples from a probability distribution on block mining times determined by rate(m) (specifically in our setting, an exponential distribution with rate

rate(m)). We express implementation by honest miner m of Σ_h as a program prog_m[Σ_h] (Figure 12).

    prog_A[Σ_A]

    On receive (C, P, d) from prog_chain:
        t, m ← Σ_A(M_A, C, P, d)
        if t is not ⊥:
            wait until t
            send "submit" (t, m, d) to prog_chain

Figure 11: The program for an adversary A that controls k nodes M_A = {m_A1, ..., m_Ak}.

    prog_m[Σ_h]

    On receive (C, P, d) from prog_chain:
        t ← Σ_h(C, d)
        Send "submit" (t, m, d) to prog_chain

Figure 12: The program for an honest miner. Σ_h is the protocol defined by prog_chain (e.g. PoET or PoUW).

To understand the security of REM, we consider a security game that defines how an adversary A interacts with honest miners, a blockchain consensus protocol, and a policy given the above three ideal programs. Formally:

Definition 5. (Security Game) For a given triple of ideal programs (prog_chain[P], prog_A[Σ_A], prog_m[Σ_h]), and policy P, a security game S(P) is a tuple S(P) = ((M, M_A, rate(·)); (Σ_A, Σ_h)).

We define the execution of S(P) as an interactive execution of programs (prog_chain[P], prog_A[Σ_A], prog_m[Σ_h]) using the parameters of S(P). As P, Σ_A and Σ_h are randomized algorithms, such execution is itself probabilistic. Thus we may view the blockchain resulting from execution of S for an interval of time τ as a random variable C_S(τ).

A non-degenerate security game S is one in which there exists at least one honest miner m with rate(m) > 0.

A.3 Warmup policy

As a warmup, we give a simple example of a potential block-acceptance policy. This policy just allows one block throughout the life of a CPU, as shown in Figure 13.

    P_simple(C, B):
        parse B → (τ, m, d)
        if |C_m| > 0:
            output reject
        else
            output accept

Figure 13: A simple policy that allows one block per CPU over its lifetime.

Clearly, an adversary cannot do better than mining one block. Denote this simple strategy Σ_simple. For any non-degenerate security game S, therefore, the advantage Adv_A^{S(P_simple)}(τ) = 1 as τ → ∞. This policy is optimal in that an adversary cannot do better than an honest miner unconditionally. However, the asymptotic waste of this policy is 100%.

Another disadvantage of this policy is that it discourages miners from participating. Arguably, miners would stay if the revenue from mining is high enough to cover the cost of replacing a CPU. But though a CPU is still valuable in other contexts even if it is blacklisted forever in this particular system, repurposing it incurs operational cost. Therefore chances are this policy would cause a loss of mining power, especially when the initial miner population is small, rendering the system more vulnerable to attacks.

A.4 Adversarial advantage

A block-acceptance policy depends only on the number of blocks by the adversary since its first one. Therefore an adversary's best strategy is simply to publish its blocks as soon as they won't be rejected. Denote this strategy as Σ_stat.

Clearly, an adversary will submit F^{-1}(1 − α, t·d·rate_best) blocks within [0, t]. On the other hand, the strongest honest CPU with rate rate_best mines t·d·rate_best blocks in expectation. Recall that according to our Markov chain analysis, P_stat incurs false rejections for honest miners with probability w_h(α), which further reduces the payoff for honest miners. For a (non-degenerate) security game S, in which A uses strategy Σ_stat, the advantage is therefore:

\[ \mathrm{Adv}_A^{S(P_{\mathrm{stat}}^{\alpha})} = \lim_{t\to\infty} \frac{F^{-1}(1-\alpha,\ t\,d\,\mathrm{rate}_{\mathrm{best}})}{(1-w_h(\alpha))\ t\,d\,\mathrm{rate}_{\mathrm{best}}} \tag{7} \]

Theorem 1. In a (non-degenerate) security game S where A uses strategy Σ_stat,

\[ \mathrm{Adv}_A^{S(P_{\mathrm{stat}}^{\alpha})} = \frac{1}{1-\mathrm{Waste}(P_{\mathrm{stat}}^{\alpha})}. \]

Proof. Let λ = t·d·rate_best. It is known that as λ for a Poisson distribution goes to infinity, it converges in the limit to a normal distribution with mean λ and variance λ.

Therefore,

\[ \lim_{\lambda\to\infty} \frac{F^{-1}(1-\alpha,\ \lambda)}{(1-w_h(\alpha))\,\lambda} = \lim_{\lambda\to\infty} \frac{\lambda + \sqrt{\lambda}\,z_p}{(1-w_h(\alpha))\,\lambda} = \frac{1}{1-w_h(\alpha)}. \]

Early in a blockchain's evolution, the potential advantage of an adversary is relatively high. The confidence interval is wide at this point, allowing the adversary to perform frequent generation without triggering detection. As the adversary publishes more blocks, the confidence interval tightens, forcing the adversary to reduce her mining rate. This is illustrated by our numerical simulation in Section 4.3.

B Resource Consumption Model

In this appendix, we present a general economic / game-theoretic model for modeling resource consumption of consensus schemes. This model guides us toward an understanding of optimal mining strategies for various consensus schemes and thus a basis for comparison among them. We detail the model in Section B.1. We estimate real-world parameter choices for this model in Section B.2 and in Section B.3 present cost per unit of useful work, our key metric of waste / resource consumption. We use this model in Appendix C to compare various SGX-based mining schemes and show that REM results in less resource waste than alternatives such as PoET.

B.1 Model

In our model, we consider a set of N^op operators that choose to commit their CPUs either to mining or unrelated useful work. This reflects the fact that there are certain barriers to entering the mining industry and therefore not everyone will participate. For completeness we also discuss the implication of removing the limit on the number of operators in C.3.

Each operator has an annual budget of budget for purchasing hardware and for paying operating expenses. In our context, both expenses are a function of the CPUs the operator chooses to use. Denote by age the age in years of the CPUs maintained by the operator, which we assume for simplicity to be uniform. If an operator chooses to maintain new CPUs, she enjoys better performance and efficiency (computation per power unit), but the cost is higher. The latest CPU has an age of 0, and we arbitrarily set the oldest CPU available to 10, denoted age_max = 10. Choosing a higher age_max value or removing this limit strengthens our results. In some situations, the operator may choose to have the CPUs but not utilize them. Denote by u ∈ [0, 1] the utilization level of a CPU, where 0 means idle and 1, fully utilized.

The annual capital cost of maintaining a CPU of age age is denoted by the function C(age). The function decreases with CPU age, as older CPUs are significantly cheaper on recycling marketplaces. The annual energy cost of a single CPU is denoted by the function E(u), which increases with u. The performance slowdown of a CPU, a function of age taking values in [0, 1], is normalized to that of the latest model in the same family. The slowdown increases with CPU age, as newer generations (of similar MSRP) tend to be more powerful than their antecedents. Beyond the CPU itself, there is an overhead for the platform on which it runs. Denote by O_std the annual cost overhead for running a CPU, including server, racking, cooling etc. In some schemes (e.g., PoW), an operator can reduce costs by placing the CPU in a farm, a dedicated platform stripped down to essential resources and thus usable only for mining. Denote by O_farm (naturally O_farm < O_std) the annual cost overhead for running a CPU in a dedicated mining farm.

Due to performance improvements, the annual income for useful work from a CPU is a function R_w(u, age) of both its age and utilization (or simply R_w when the parameters are clear from the context). The function increases with utilization and, because older CPUs are slower, decreases with age. We assume operators have unbounded useful work, enough to populate any and all hardware they can afford.

We let R_annual denote the annual total mining revenue of the system, which we assume is independent of any other variable and which we vary in our analyses. This total revenue is divided among the participants in a manner that depends on the specific scheme used, and the details are given below.

Operators strive to maximize their net revenue by choosing from a space of three strategies that dictate hardware use. These are:

1. No mining: The operator chooses not to mine, using her CPUs solely for useful work. This is profitable as long as the income R_w offsets costs.

2. Standard mining: The operator uses standard servers and mines on them (if the scheme allows).

3. Farming: The operator uses farm machines for mining, reducing the per-CPU overhead, but losing the ability to perform useful work.

We adopt a population-based representation of the strategy choices made by operators. Specifically, for operators collectively implementing a hybrid strategy, we express their choice in terms of the aggregate fraction of resources devoted to each strategy by the full population.

To compare the resource waste associated with different consensus systems, we examine the optimal operation point for each. We model operators' utility, and

identify the best strategy for a rational operator. We develop expressions for the operators' revenues, and instantiate them using the values obtained as explained in Section B.2. We then calculate the equilibrium point of each scheme and the optimal operation point for miners in each scheme. We measure the waste of each scheme in terms of the cost for each unit of useful work, referred to as the useful price.

B.2 Parameter Values

To estimate the price, performance, and energy efficiency of a CPU as a function of its age, we investigate historical CPU data for a family of 7 Intel CPUs spanning ten years. From each generation of Intel CPUs since 2006, we picked the fastest desktop chip. For each CPU, we estimate its price as of 2016 according to several marketplaces worldwide [11] and its performance according to standard benchmarks [8]. We present CPU specifications as published by Intel. Long discontinued models can still be purchased from many suppliers, generally at very low per-unit costs for high-volume orders. All amounts are given in 2016 USD, denoted by $.

Table 1: Sample CPUs and specifications. Benchmark scores are normalized by the score of i7-6700K.

    Generation     Sample CPU          Launch Date   Q3'16 Price   Speed (normed)   Power at u = 1
    Cedarmill      Pentium 4 661       Q1'06         $5            -4               85W
    Wolfdale       Core 2 Duo E8500    Q3'08         $10           0.21             65W
    Lynnfield      Core i7-880         Q2'10         $35           0.52             95W
    Sandy Bridge   Core i7-2700K       Q4'11         $50           0.80             95W
    Ivy Bridge     Core i7-3770K       Q2'12         $75           0.87             77W
    Haswell        Core i7-4770K       Q2'13         $150          0.89             84W
    Skylake        Core i7-6700K       Q3'15         $290          1.00             91W

Price and performance. Table 1 summarizes CPU specs. From the data in Table 1 we can see clearly that the CPU price C drops exponentially as it ages. CPU performance manifests a segmented, exponential trend: It doubled each year between 2006 and 2011 (following Moore's Law), but then improvements slowed to around 10% per year. Our model uses a logit function to approximate this trend, as is shown in Figure 14. As a result of the performance slowdown of older CPUs, R_w decreases with age in a similar way.

[Plot: price in Q3'16 USD (left axis) and normalized speed (right axis) against CPU age in years, each shown with its fitted curve.]

Figure 14: CPU price and speed as a function of age.

Useful work revenue. We used the cost model of Amazon Web Services to approximate the useful-work revenue R_w. In particular, we assume a miner's mining computer has performance similar to an m4.large EC2 instance, which has 2 CPU cores and 8 GiB of memory [4]. As of the time of writing, an m4.large instance costs $0.12 per hour [3].

B.3 Cost per unit of useful work

To compare the resource costs of the different schemes in Section C, we will compute for each its cost per unit of useful work.

As a baseline, we first compute the cost per unit of useful work on an optimal hardware setup devoted exclusively to useful work. The net revenue of a single CPU for useful work is the income over a year from useful work (R_w) minus the expenses for hardware purchase (C), overhead (O_std), and power (E):

\[ R_w(u, \mathrm{age}_w) - \bigl( C(\mathrm{age}_w) + O_{\mathrm{std}} + E(u, \mathrm{age}_w) \bigr). \tag{8} \]

The number of CPUs an operator can purchase is

\[ N^{\mathrm{cpu}}(\mathrm{age}_w) = \frac{\mathrm{budget}}{C(\mathrm{age}_w) + O_{\mathrm{std}} + E(u, \mathrm{age}_w)}. \tag{9} \]

Therefore the total profit from useful work is

\[ P_{\mathrm{useful}}(u, \mathrm{age}_w) = N^{\mathrm{cpu}}(\mathrm{age}_w) \cdot \Bigl( R_w(u, \mathrm{age}_w) - \bigl( C(\mathrm{age}_w) + O_{\mathrm{std}} + E(u, \mathrm{age}_w) \bigr) \Bigr). \tag{10} \]

For simplicity we assume power usage grows linearly with utilization, and so utilization of 1 is optimal to reduce the waste of the independent elements C and O_std. Plugging in values from Section B.2, we can optimize Equation 10, resulting in an optimum CPU age of age_w = 3.48 years and a corresponding baseline cost per unit of work of $391.25.

For a population of miners, the overall cost per unit of useful work is equal to the total investment in the mining ecosystem over a year, divided by the number of units of useful work. Consequently, it is a function of the fraction of mining CPUs doing useful work. If 100% of CPU time is devoted to useful work, the cost per unit of useful work is $391.25; 0% corresponds to infinite cost per unit of useful work. Smaller percentages of CPU investment in useful work correspond to higher costs.
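The baseline computation of Equations (8) to (10) can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the function names are hypothetical, the cost terms are passed in as already-evaluated annual amounts rather than as functions of age and utilization, and the last helper assumes, purely for illustration, that every CPU costs the same and yields the same useful work, so the population-wide cost scales inversely with the fraction of CPUs doing useful work.

```python
import math

def net_revenue_per_cpu(income, capital, overhead, energy):
    """Equation (8): annual useful-work income minus per-CPU expenses."""
    return income - (capital + overhead + energy)

def cpus_affordable(budget, capital, overhead, energy):
    """Equation (9): number of CPUs the annual budget can sustain."""
    return budget / (capital + overhead + energy)

def useful_profit(budget, income, capital, overhead, energy):
    """Equation (10): CPU count times per-CPU net revenue."""
    return (cpus_affordable(budget, capital, overhead, energy)
            * net_revenue_per_cpu(income, capital, overhead, energy))

def cost_per_useful_unit(baseline_cost, useful_fraction):
    """Population-wide cost per unit of useful work.

    Simplifying assumption (not from the paper): uniform per-CPU cost and
    output, so the cost is the baseline divided by the useful fraction.
    """
    if useful_fraction <= 0:
        return math.inf  # no useful work is done at all
    return baseline_cost / useful_fraction
```

With the baseline of 391.25 from the text, `cost_per_useful_unit(391.25, 1.0)` returns the baseline itself, and the value grows without bound as the useful fraction approaches zero, matching the qualitative behavior described above.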

Figure 15: Summary of revenue analysis result with parameters according to B.2. Notation: U is useful work, S is standard mining, F is farming. f_2nd is the ratio of participants choosing the second option, which is the more wasteful one. Waste is the cost for one unit of useful work normalized to the baseline (the cost in a system with no mining): $391.25 (age_w = 3.48).

    Schemes       Choices   age_S   age_F     f_2nd   Waste
    Useful work   U         -       -         -       1.0
    PoW           U, F      -       4.68      76%     4.2
    Lazy-PoET     U, F      -       age_max   76%     4.2
    Busy-PoET     S, F      4.51    age_max   42%     2.5
    PoPW          S, F      4.39    5.61      26%     1.4
    PoUW          U, S      4.39    -         100%    1.1

[Plot, top panel: waste (normalized $/useful work, logarithmic scale) against annual cryptocurrency revenue in 10^6 USD, for PoW and Lazy-PoET, Busy-PoET, PoPW, and PoUW. Bottom panel: miner ratio against annual cryptocurrency revenue in 10^6 USD, for the same schemes.]

Figure 16: Revenue analysis: The x axis is the annual cryptocurrency revenue (R_annual) and the left y-axis is the waste factor, i.e. useful price normalized by the useful price without mining. The right y-axis is the ratio of participants choosing the more-wasteful option in their scheme.

C Comparative Analysis of Mining Schemes

Using our model from Appendix B, we now present an analysis showing that PoUW incentivizes minimally wasteful mining behavior and / or achieves secure consensus more effectively than alternatives. An important part of this section is our presentation of a spectrum
of five consensus schemes, including PoW, PoUW, and
three other SGX-based schemes (two of them newly pre-
sented in this work). These schemes illustrate a range of C.1.1 Proof of Work
technical approaches against which we compare PoUW
as a means of validating our design choices. We start with PoW, as used in Bitcoin, Ethereum, and
We first present and analyze our four consensus other common cryptocurrencies. For the purpose of this
schemes other than PoUWwhat we call strawman analysis, we consider only CPU mining. See below for a
schemesin Section C.1. We then offer detailed revenue discussion of dedicated mining hardware.
analysis of PoUW in Section C.2. Figure 15 compares Each operator therefore has two options: either buy-
the results for the different schemes at the parameter val- ing CPUs for useful work, or buying CPUs for farmed
ues calculated in Section B.2. We consider a conserva- mining (i.e. farming). For both scenarios, operators
tive total cryptocurrency revenue of $20m. At higher val- can choose freely to use CPUs of any age. Denote
ues the significance of PoUW would only increase. Fig- by f m [0, 1] the ratio of operators that choose to mine.
ure 16 compares the schemes at varying total cryptocur- Denote by age w and age F the optimal age for doing use-
rency revenue values. ful work and farming, respectively.
Due to symmetry, the optimal age of CPUs at all min-
ers is the same, and so their expended mining power is
C.1 Strawman Schemes the same. The mining revenue is therefore distributed
uniformly among all miners, and a single miners income
Here we present our spectrum of strawman consensus
is simply Rannual divided by the total number of mining
schemes that serve as points of comparison with and ex-
CPUs, leaving a total mining revenue of
planations of design choices in PoUW. We start with the
popular Proof of Work (PoW), and then present PoET
and a proposed minor variant on PoET, as well as two Pm (u, age, f m ) = N cpu (age)
 
strawman solutions of our own that demonstrate the dif- Rannual
C(age) Ofarm E(u, age) .
ficulty of achieving waste-limited SGX-based consen- f m N op N cpu (age) u
sus. We analyze PoUW in Section C.2. The reader may (11)
choose to skip directly to Section C.2 and then flip back
here to understand the alternatives and rationale for our Again, utilization is optimized at 1. The stable oper-
design choices. ation point is when the revenue from mining equals that
of honest work (12), where operators are not motivated

to switch sides,

\[ P_{\mathrm{useful}}(1, \mathrm{age}_w) = P_m(1, \mathrm{age}_w, f_m). \tag{12} \]

Solving (12) we obtain \(f_m(\mathrm{age}_m)\) as a function of \(\mathrm{age}_m\).

We still need to find the optimal operation parameters for mining. To this end, we calculate the symmetric Nash equilibrium: if all operators who choose to mine do so at age \(\mathrm{age}\), a single operator cannot increase her revenue by operating at a different age. In fact, finding such an age amounts to finding an age such that, if all but one miner operate at \(\mathrm{age}\), then the other miner's optimal operation age must also be \(\mathrm{age}\), that is,

\[ \mathrm{age} = \operatorname*{arg\,max}_{\mathrm{age}'} P_m\bigl(u, \mathrm{age}', f_m(\mathrm{age})\bigr). \tag{13} \]

With the numbers from B.2, we find the optimal age for mining is \(\mathrm{age}_F = 4.68\) years. At the equilibrium, 76.0% of the operators would be mining. The cost for one unit of useful work is $1643.25. The results are summarized in Figure 15.

Figure 16 shows how the annual cryptocurrency revenue (\(R_{\mathrm{annual}}\)) affects the miner ratio at the equilibrium, \(f_m\), and therefore the cost of useful work. As the value of the currency and hence the mining revenue increases, the ratio of operators choosing to mine increases linearly, and the cost of useful work increases exponentially. As the miner ratio reaches 100%, the useful price goes to infinity as nobody is doing useful work.

Dedicated Hardware. Our analysis above assumes use of a CPU for mining, of course. But PoW functions that have been used for cryptocurrencies for extended periods, namely double-SHA256 (for Bitcoin) and Scrypt (for Litecoin), led to the development of dedicated hardware, particularly Application-Specific Integrated Circuits (ASIC). By design, these are far more efficient than any generic hardware for the purpose of the given PoW, but cannot be used for anything else. Arguably, for any PoW function, dedicated hardware can outperform generic hardware.

The equilibrium point in our analysis, however, is not actually affected by the performance or cost of dedicated hardware. The reason is that no matter what mining devices are used, a fixed amount of mining revenue is evenly distributed among the mining operators at the equilibrium, leaving the mining revenue unchanged. Therefore, advances in dedicated hardware do not affect the conclusions of our analysis.

C.1.2 Proof of Elapsed Time (PoET / Lazy-PoET)

As explained above, Intel proposed PoET as a waste-free PoW replacement for a permissionless blockchain. In PoET, each CPU draws a random number \(r\) and sleeps for \(r\) time. Whichever gets the smallest number wakes up first and becomes leader for the next consensus epoch. Building on top of Intel's trusted hardware SGX, PoET makes use of the trusted random source protected by hardware, prohibiting selfish actors from increasing the frequency of their blocks, but with minimal computation. But the vanilla PoET proposed by Intel cannot be directly employed. The most critical issue is that it costs nothing for a miner to mine on multiple branches of a blockchain. As has been studied in the context of Proof of Stake, being able to work on multiple branches forces a strong assumption, namely that a majority of the miners blindly follow the protocol, even if each of the members of this majority is not individually motivated to do so. He, Song, and Milutinovic [42] proposed to fix this issue by using SGX's monotonic counters. They argue that depleting all 256 SGX counters ensures that the CPU is tied to a single branch.

We call their patched PoET scheme, which keeps the CPU idle while mining, Lazy-PoET. The point of Lazy-PoET is to allow for mining without any energy waste. In the Lazy-PoET scheme, an operator can use her hardware for either useful work, with fully-utilized CPUs, or for mining, with idle CPUs. Idle CPUs are cheaper to operate in farming mode, and the mining revenue of an old CPU is the same as that of a new and expensive CPU. Hence, the operator choices are reduced to either farming or useful work.

The revenue expression for Lazy-PoET is the same as that with Proof of Work (11), but here the optimal age is \(\mathrm{age}_{\max}\), since there is no benefit in using newer CPUs, and old CPUs are cheaper. Therefore, the revenue is only a function of the miner ratio,

\[ P_m^{\mathrm{Lazy}}(f_m) = P_m(0, \mathrm{age}_{\max}, f_m). \tag{14} \]

Equating the revenues of mining (14) and useful work (10), we obtain again a miner ratio of 76% and the cost per unit of useful work is $1643.4, i.e. a waste of 4.2. The results are summarized in Figure 15. Note that the numbers are identical to those of PoW, as the formulas are the same modulo \(\mathrm{age}_{\max}\). The waste here is due to what we call the stale chip problem: farming with extremely old chips, insufficient for any useful work, yields high revenue. This optimum is robust to mining income changes. Mining is always optimal on stale chips. The annual mining revenue determines the ratio of miners; if it is too small, useful work simply becomes preferable to any sort of mining.
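The waste figures quoted for PoW and Lazy-PoET can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are ours, and `implied_miner_ratio` rests on the assumption that every operator invests the same amount per year and that only the non-mining fraction produces useful work, so that cost per unit equals baseline divided by (1 - miner ratio). The two dollar amounts are the ones stated in the text.

```python
# Values taken from the text; everything else here is an illustrative assumption.
BASELINE_COST = 391.25    # $ per unit of useful work with no mining
LAZY_POET_COST = 1643.4   # $ per unit of useful work reported for Lazy-PoET


def waste_factor(cost: float, baseline: float = BASELINE_COST) -> float:
    """Waste is the cost per unit of useful work normalized to the baseline."""
    return cost / baseline


def implied_miner_ratio(waste: float) -> float:
    """Miner ratio implied by a waste factor.

    Assumption: equal investment per operator, and only the (1 - f_m)
    fraction of operators produces useful work, so cost = baseline / (1 - f_m).
    """
    return 1.0 - 1.0 / waste


waste = waste_factor(LAZY_POET_COST)
ratio = implied_miner_ratio(waste)
print(round(waste, 1), round(ratio, 2))  # prints: 4.2 0.76
```

Under that assumption the reported cost, waste factor and miner ratio are mutually consistent; the same check does not carry over directly to schemes in which the two operator groups run CPUs of different ages.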

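The PoET leader election described in Section C.1.2 (each CPU draws a random wait, the shortest wait wins) can be simulated to see why old and new chips earn the same mining revenue. This is a minimal sketch, not the Intel protocol: the exponential wait distribution, the function names and the parameters are assumptions made for illustration, and no SGX attestation is modelled.

```python
import random


def poet_round(num_cpus: int, rng: random.Random, mean_wait: float = 1.0) -> int:
    """Run one epoch: every CPU draws a wait time; the smallest draw leads.

    Assumption: waits are exponentially distributed with the same mean for
    every CPU, regardless of its age or speed.
    """
    waits = [rng.expovariate(1.0 / mean_wait) for _ in range(num_cpus)]
    return min(range(num_cpus), key=waits.__getitem__)


def win_shares(num_cpus: int, rounds: int, seed: int = 0) -> list:
    """Fraction of epochs led by each CPU over many simulated rounds."""
    rng = random.Random(seed)
    wins = [0] * num_cpus
    for _ in range(rounds):
        wins[poet_round(num_cpus, rng)] += 1
    return [w / rounds for w in wins]


shares = win_shares(num_cpus=4, rounds=20000)
print([round(s, 2) for s in shares])
```

Because the draw ignores hardware quality, every CPU ends up with roughly the same share of epochs, which is the mechanism behind the stale chip problem: the cheapest CPU that can run the enclave is the most profitable one to farm with.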
C.1.3 Busy Proof of Elapsed Time (Busy-PoET)

With Lazy-PoET, mining CPUs are kept idle. However, we note that the depleted SGX counters are allocated per-enclave, and not for the CPU itself: Different enclave code accesses different SGX counters. This is sufficient to ensure that the SGX is tied to a specific blockchain branch, since the mining enclave for a blockchain is unchangeable, preventing two from running on the same SGX and depleting the same counters. However, the implication is that the SGX and the CPU itself remain available for any other purpose, while mining and providing proofs of elapsed time as necessary. We call a system with proof-of-elapsed time where miners can concurrently use their hardware for useful work Busy-PoET.

Since the mining revenue adds no overhead in Busy-PoET, useful work with mining (henceforth offhand mining) is preferable to useful work without it. Nonetheless, farming remains a viable option because stale chips are significantly cheaper than recent ones.

As in previous schemes, the annual revenue \(R_{\mathrm{annual}}\) is split among all mining CPUs. Denote by \(f_s\) the ratio of operators that are working in standard mode. Unlike the previous schemes, here the CPU count includes the \(f_s N^{\mathrm{op}}\) operators in standard mode and the \((1 - f_s) N^{\mathrm{op}}\) operators in farming mode. It is therefore a function of the CPU ages of both standard mining, \(\mathrm{age}_S\), and farming, \(\mathrm{age}_F\). We defer the development of this expression until after the discussion of both operator modes.

Standard mining Distinguished from PoW, the revenue from standard mining in Busy-PoET has an additional item of useful work revenue \(R_w\):

\[ R_w(u, \mathrm{age}_S) + R_m - \bigl(C(\mathrm{age}_S) + O_{\mathrm{std}} + E(u, \mathrm{age}_S)\bigr). \tag{15} \]

The total number of CPUs a standard operator can afford, \(N^{\mathrm{cpu}}_{\mathrm{std}}\), has the same expression as in useful work (9), yielding a total annual revenue of (the dependency in \(f_s\) is through \(R_m\))

\[ P_S(u, \mathrm{age}_S, f_s) = N^{\mathrm{cpu}}_{\mathrm{std}} \Bigl( R_w(u, \mathrm{age}_S) + R_m - \bigl(C(\mathrm{age}_S) + O_{\mathrm{std}} + E(u, \mathrm{age}_S)\bigr) \Bigr). \tag{16} \]

Farming The expression for revenue from farmed mining is similar to that of previous schemes (the dependency in \(f_s\) is through \(R_m\)),

\[ P_F(u, \mathrm{age}_F, f_s) = N^{\mathrm{cpu}}_{\mathrm{farm}} \Bigl( R_m - \bigl(C(\mathrm{age}_F) + O_{\mathrm{farm}} + E(u, \mathrm{age}_F)\bigr) \Bigr). \tag{17} \]

The equilibrium analysis for Busy-PoET is not as simple now, as (16) and (17) are interdependent through \(f_s\). An equilibrium is a pair \((\mathrm{age}_S, \mathrm{age}_F)\) where an operator cannot improve her revenue by changing her strategy. This amounts to two conditions.

First, she cannot improve her revenue by changing her CPU age (Equation 13) where all other operators maintain their strategy. This is expressed for the two operation modes as the equation system

\[ \begin{aligned} \mathrm{age}_S &= \operatorname*{arg\,max}_{\mathrm{age}'_S} P_S\bigl(1, \mathrm{age}'_S, f_s(\mathrm{age}_S, \mathrm{age}_F)\bigr) \\ \mathrm{age}_F &= \operatorname*{arg\,max}_{\mathrm{age}'_F} P_F\bigl(0, \mathrm{age}'_F, f_s(\mathrm{age}_S, \mathrm{age}_F)\bigr). \end{aligned} \tag{18} \]

Second, she cannot improve her revenue by changing her operation mode. We take this condition,

\[ P_{\mathrm{std}}(u, \mathrm{age}_S, \mathrm{age}_F) = P_{\mathrm{farming}}(u, \mathrm{age}_S, \mathrm{age}_F), \]

to obtain an expression of \(f_s\) as a function of \(\mathrm{age}_S\) and \(\mathrm{age}_F\). The resulting expression is too complex to be placed here.

We numerically solve the equation system and observe that, as with previous schemes, the cost of useful work grows quickly with the annual cryptocurrency revenue. Figure 16 shows that the ratio of standard miners decreases, as miners prefer farming with many cheap CPUs over performing useful work: another instance of the stale chip problem.

C.1.4 Proof of Potential Work

In order to mitigate the stale chip problem of the PoET variants, we propose a direct solution: Grant more block rewards to CPUs with better performance. Since with this approach a miner proves her CPU power, though she might not be utilizing it, we call it Proof of Potential Work (PoPW).

Technically, an SGX program can determine the CPU model by calling the cpuid instruction. Based on the CPU model, one can determine the value of a CPU by looking it up in a public table, hard-coded in the blockchain protocol. The time for the mining enclave to return is therefore chosen with an exponential distribution parameter that is proportional to the CPU's power. The mining revenue is therefore linear in the slow-down factor of the CPU.

Note that PoPW makes stronger security assumptions than any of the other schemes we discuss. The principal that determines the values of the ever-changing CPU value table has significant power over the blockchain; for example, he can attribute high values to CPUs to which he has better access than other operators.

Standard mining As in Busy-PoET, standard operation dominates useful work, as the revenue for mining

comes without any overhead. Recalling that the slow-down of a CPU is (age), the expression for the CPU revenue in standard mode is

\[ R_w(u, \mathrm{age}_S) + R_m(\mathrm{age}_S) - \bigl(C(\mathrm{age}_S) + O_{\mathrm{std}} + E(u, \mathrm{age}_S)\bigr). \tag{19} \]

The total number of CPUs an operator can afford has the same expression as in PoW, yielding a total annual revenue of

\[ N^{\mathrm{cpu}}_{\mathrm{std}} \Bigl( R_w(u, \mathrm{age}_S) + R_m(\mathrm{age}_S) - \bigl(C(\mathrm{age}_S) + O_{\mathrm{std}} + E(u, \mathrm{age}_S)\bigr) \Bigr). \tag{20} \]

Farming The CPU revenue for farming is

\[ R_m(\mathrm{age}_F) - \bigl(C(\mathrm{age}_F) + O_{\mathrm{std}} + E(u, \mathrm{age}_F)\bigr). \tag{21} \]

The expression for the number of CPUs is the same as before.

As before, the annual mining revenue is distributed among all mining CPUs, though now proportionally to their slow-down. The equilibrium analysis is similar to that of Busy-PoET, resulting in a quick increase of waste as the annual cryptocurrency revenue grows, as shown in Figure 16.

As shown in Figure 15, we observe that indeed the stale-chip problem is resolved: now the optimal farming age is 5.61, as opposed to \(\mathrm{age}_{\max}\) in Lazy-PoET and Busy-PoET, but the farming problem remains. Starting at some point, when the annual revenue is high enough, it becomes more profitable to farm, keeping idle CPUs, than to spend that amount on power for useful work. Also as discussed before, the trust model of PoPW is arguably too strong for a decentralized system.

C.2 Proof of Useful Work

Our solution, Proof of Useful Work (PoUW), avoids the issues with previous schemes by directly counting work done towards mining effort. Since mining revenue is only granted when work is done, farming without useful work means the CPUs must be processing useless work in this mode. Therefore, standard operation dominates farming, as the revenue from useful work comes at no additional cost.

The useful work analysis and optimal age calculation remain unchanged, resulting in a revenue as expressed in (10).

For standard operation, the mining revenue now depends on the utilization of the CPU as well as on its age. The useful work utilization suffers a decrease due to the overhead of online effort monitoring, denoted \(O_{\mathrm{counting}}\). The mining revenue does not suffer such a reduction, as the overhead is taken into account when calculating the mining effort.

[Figure 17 plots. Top panel, "Miner Ratio": miner ratio [%] versus overhead of counting (1 to 2). Bottom panel, "Waste": waste [normalized USD / useful work] versus overhead of counting (1 to 2). One curve per M = 5e4, 1e5, 5e5, 1e6.]

Figure 17: \(O_{\mathrm{counting}}\) affects the miner ratio and the waste in PoUW.

Denote the optimal age for standard operation by \(\mathrm{age}_S\). The expression for the number of CPUs is as usual (9), at \(\mathrm{age} = \mathrm{age}_S\). The standard mining revenue is the product of the CPU count and (22). This expression is a function of \(R_m\) and \(\mathrm{age}_S\),

\[ R_w\!\left(\frac{u}{O_{\mathrm{counting}}}, \mathrm{age}_S\right) + R_m(\mathrm{age}_S)\, u - \bigl(C(\mathrm{age}_S) + O_{\mathrm{std}} + E(u, \mathrm{age}_S)\bigr). \tag{22} \]

As with PoW, the value of \(R_m\) is chosen such that the total revenue distributed is \(R_{\mathrm{annual}}\),

\[ R_m = \frac{R_{\mathrm{annual}}}{f_s \, N^{\mathrm{op}} \, N^{\mathrm{cpu}}_{\mathrm{std}} \, u}. \tag{23} \]

The equilibrium point is where the standard and useful-work revenues are the same. Solving the equation we obtain an expression for the ratio of standard miners at equilibrium, which is a function of \(\mathrm{age}_S\). We proceed to find a symmetric Nash equilibrium as is done for PoW.

With the parameter values of Section B.2, we find that all operators choose to work in standard mode. The optimal CPU age is 4.39 years, rather than the 3.48 optimal age for useful-work operation. The average cost for useful work is $430.1, i.e. a waste of 1.1. The results are summarized in Figure 15. As shown in Figure 16, PoUW has the lowest waste among the four schemes we

compared. Unlike other schemes, the waste in PoUW is not affected by the annual cryptocurrency revenue. The reason is that PoUW effectively encourages all of the participants to mine as long as the mining revenue isn't critically low. With any reasonable annual cryptocurrency revenue, all PoUW participants end up mining, yielding a fixed waste per useful work unit.

However, PoUW does introduce the overhead of secure instrumentation (\(O_{\mathrm{counting}}\)) that is not present in other schemes. The impact of \(O_{\mathrm{counting}}\) with different annual cryptocurrency revenue is shown in Figure 17. The top graph shows how \(O_{\mathrm{counting}}\) impacts the miner ratio at equilibrium. Given an annual cryptocurrency revenue, higher \(O_{\mathrm{counting}}\) discourages participants from mining as doing useful work would be more profitable. So the ratio of miners at equilibrium decreases with \(O_{\mathrm{counting}}\). The bottom graph shows how \(O_{\mathrm{counting}}\) impacts the waste. Because waste is only incurred by mining, higher annual cryptocurrency revenue increases miner ratio, leading to more waste. Meanwhile, given an annual cryptocurrency revenue, increasing \(O_{\mathrm{counting}}\) will first increase the waste until all miners are forced to do useful work, where no waste is present.

The conclusion is that a high \(O_{\mathrm{counting}}\) could diminish the security of PoUW because of a loss of miner power and extra waste. As our implementation suggests, REM only incurs an \(O_{\mathrm{counting}}\) of about 5% to 15%, i.e. 1.05 to 1.15 in Figure 17, allowing for high miner participation and low waste.

C.3 Unbounded analysis

In the previous analysis, we assumed a bounded number of operators in the system. This assumption is grounded in the fact that there are certain barriers to entering the mining business. For completeness, we now propose an alternative model where the number of operators is unbounded. In this model there will be infinitely many CPUs doing useful work, and any CPU can switch to that at any point. Weakening this assumption only strengthens the results.

Admittedly, it is tricky to propose a perfect model for such a dynamic and pluralistic system. So we provide a first-order approximation based on realistic but simplifying assumptions. For example, the low electricity cost in China is quite important to the dynamics of Bitcoin mining, but the details of price distribution are unknown. Therefore we assume the cost of CPUs and electricity is the same for all of the operators.

Just as before, a participant has two options besides working on useful work: mining by herself or joining a farm. The age of the CPU still plays an important role as the price and performance vary significantly with the age. An equilibrium is defined similarly as before, satisfying two conditions: 1) revenue for participants is the same as if they were useful workers; and 2) no single operator can earn more by working at a different CPU age.

The analysis for PoW and Lazy-PoET remains basically unchanged because in both schemes operators have in fact only one option besides useful work. We refer readers to the Appendix for more details.

Busy-PoET We note that the incremental overhead of Busy-PoET over useful work and the cost of farming are key to the analysis. Taking it to an extreme where PoET incurs zero additional overhead, every CPU doing useful work will mine along with the useful work, earning some mining revenue at no incremental cost. On the other hand, if the cost incurred by PoET is high, fewer participants will perform PoET at equilibrium, hence leaving some profit margin for farming. We define the inefficiency of PoET as the incremental overhead over doing useful work normalized by useful work revenue. The efficiency of PoET is one minus that.

Fig. 18 shows the number of standard mining CPUs at equilibrium, with PoET efficiency ranging from 60% to 100%, assuming no farming. We argue that in the case where PoET is very efficient, bounded analysis is actually more realistic, as the equilibrium point derived from the unbounded model requires an impractically large number of CPUs.

Approximating the situation in Bitcoin, if we assume there are farmers (or attackers) with access to cheap stale chips and electricity, their presence could skew the equilibrium significantly. At each point in Fig. 18, standard miners operate at the margin where adding one single CPU renders standard mining a worse option than not participating. Therefore, a single farmer with cheap resources would expel a large number of standard mining CPUs from the mining pool. We argue that this is a major drawback because it allows attacks at a relatively low cost.

Another factor that would change the equilibrium is the farming cost. In the extreme case where PoET incurs zero overhead, farming is no longer a viable option, unless it's free, because most of the mining revenue would have been harvested by standard miners, leaving farmers too little to cover the farming cost. But on the other hand, if PoET incurs non-negligible overhead, farming becomes possible if the farming cost is low enough. Fig. 19 shows the thresholds for farming cost as functions of PoET efficiency. For any given PoET efficiency, if the farming cost is below the lower threshold, all operators will end up farming at equilibrium; above the higher threshold, all operators will end up performing standard mining. If the farming cost is between the two, then the equilibrium will involve a mixture of both.

PoUW As discussed previously, PoUW renders farming irrelevant, eliminating a major source of waste. However, PoUW does introduce an overhead \(O_{\mathrm{counting}}\) from instruction counting. Fig. 20 shows the number of CPUs in PoUW at equilibrium under different PoUW efficiency. Note that the equilibrium won't be skewed by farmers as farming in PoUW is strictly inferior, irrespective of farming cost.

The conclusion here is that if PoET and PoUW are sufficiently efficient, PoET can get rid of the farming issue unless farmers can operate at a very low cost. In this case, however, the bounded model is more realistic because equilibrium in the unbounded model requires an impractically large number of participating CPUs. On the other hand, if PoET incurs non-negligible overhead, unbounded analysis draws a similar conclusion as the bounded one does: farming remains a viable option and because of that, an attacker can accrue old CPUs to attack at a lower cost.

[Figure 18 plot: number of CPUs versus efficiency of PoET (0.6 to 1). One curve per R_annual = 3 x 10^6, 3 x 10^7, 3 x 10^8, 3 x 10^9.]

Figure 18: The number of standard mining CPUs at equilibrium, as a function of PoET efficiency. \(R_{\mathrm{annual}}\) denotes the total annual cryptocurrency revenue. As a reference, Bitcoin yielded a total annual revenue of approximately 330 million in 2015. As PoET efficiency goes to 1, the number of mining CPUs tends to infinity.

[Figure 19 plot: farming cost (USD) versus efficiency of PoET (0.6 to 1). Curves: threshold high, threshold low. Regions: 100% std. mining above threshold high, 100% farming below threshold low.]

Figure 19: Farming cost affects the number of farmers at equilibrium. If the farming cost is below threshold low, all operators will join farms at equilibrium; above threshold high, all operators will do standard mining. Useful work cost is $391.25.

[Figure 20 plot: number of CPUs versus efficiency of PoUW (0.6 to 1). One curve per R_annual = 3 x 10^6, 3 x 10^7, 3 x 10^8, 3 x 10^9.]

Figure 20: The number of standard mining CPUs at equilibrium, as a function of PoUW efficiency (i.e. \(1 - O_{\mathrm{counting}}\)). \(R_{\mathrm{annual}}\) denotes the total annual cryptocurrency revenue. As a reference, Bitcoin yielded a total annual revenue of approximately 330 million in 2015. As PoUW efficiency goes to 1, the number of mining CPUs tends to infinity.

D Implementation Details

D.1 Experiments Setup Details

Computational protein folding enables prediction of the physical structure of chains of amino acids in protein molecules. It has various uses in the study of diseases such as Alzheimer's and cystic fibrosis as well as various types of cancer [66]. In our first experiment, we used an implementation [1] of the protein folding algorithm proposed by [76]; this software was a C++ program of ~800 LoC. To achieve SGX compliance, we removed all system calls. We instrumented the code automatically with our compilation toolchain. Finally we compared the execution of the instrumented and non-instrumented code to evaluate the overhead. The results are given in Figure 9.

...
.LEHB0:
    leaq 1(%r15), %r15 # added by PoUW
    call _ZN11stlpmtx_std12basic_stringIcNS...
.LEHE0:
    .loc 7 70 0 is_stmt 0 discriminator 2
    leaq 3(%r15), %r15 # added by PoUW
    leaq -80(%rbp), %rax #, tmp94
    movq %rax, %rsi # tmp94,
    movq %rbx, %rdi # _4,
.LEHB1:
    leaq 1(%r15), %r15 # added by PoUW
    call _ZN11stlpmtx_std12out_of_rangeC1ER...
.LEHE1:
...

Figure 21: Code snippet: instrumented assembly code of a protein folding program.

In a second experiment, we ported a widely used SVM implementation, libsvm [27] (with ~4300 LoC), again manually extracting system calls, and also replacing its RNG calls with those for the SGX RNG. Then we trained an SVM classifier on the popular Mushroom dataset from the UCI Machine Learning Repository [12] and deployed it in the enclave. The useful work in this case is to classify inputs. We queried the classifier on a test set of ~8k points and measured the execution time.

In the third experiment, we ported an implementation [2] of the zlib algorithm (the DEFLATE algorithm) to SGX and to REM. The program consists of ~5000 lines of C code and we removed its reliance on standard libraries manually. The useful workload in this experiment is to compress a random string of 50 MB. The experiment is repeated 1000 times.

In the last experiment, we ported a C implementation [10] of the SHA-3 message digesting algorithm. The useful workload is to calculate SHA3 digests for a 100MB buffer of random bytes. The experiment is repeated 1000 times.
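The compression and hashing workloads of Section D.1 can be approximated outside an enclave to get a feel for their shape. The sketch below is an assumption-laden stand-in, not the ported C code: it uses Python's standard `zlib` and `hashlib` modules, picks SHA3-256 because the digest size is not stated in the text, and the helper names are ours. Buffer sizes are parameters so the 50 MB and 100 MB figures can be scaled down for a quick run.

```python
import hashlib
import os
import time
import zlib


def compress_workload(num_bytes: int):
    """Compress a buffer of random bytes with DEFLATE (via zlib)."""
    data = os.urandom(num_bytes)
    return data, zlib.compress(data)


def sha3_workload(num_bytes: int) -> str:
    """Digest a buffer of random bytes. SHA3-256 is an assumed variant."""
    return hashlib.sha3_256(os.urandom(num_bytes)).hexdigest()


def timed(fn, *args):
    """Return (result, elapsed seconds) for one run of a workload."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


# Scaled-down run; the experiments in the text use 50 MB and 100 MB buffers.
(_, packed), t_zip = timed(compress_workload, 1_000_000)
digest, t_sha = timed(sha3_workload, 1_000_000)
print(len(packed), len(digest), t_zip > 0, t_sha > 0)
```

Running each workload once with and once without instrumentation, and dividing the two timings, would give an overhead ratio in the sense used in this appendix; the instrumentation itself is not reproduced here.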

D.2 Code Snippet


Figure 21 shows a snippet of assembly code instrumented with the REM toolchain. Register r15 is the reserved instruction counter; it is incremented at the beginning of each basic block in the lines commented "added by PoUW".
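The per-block counting visible in Figure 21 can be mimicked with a toy instrumenter. This is a sketch under stated assumptions rather than the REM toolchain: it assumes the increment equals the number of original instructions in the basic block (consistent with the constants 1 and 3 in Figure 21), it represents blocks as plain lists of strings, and the call targets `make_string` and `throw_out_of_range` are hypothetical placeholders for the truncated symbols in the figure.

```python
from typing import List

COUNTER = "%r15"  # reserved instruction-counter register named in Section D.2


def instrument(blocks: List[List[str]]) -> List[List[str]]:
    """Prepend one counter update to every basic block.

    Assumption: the increment is the number of original instructions in
    the block, so the counter tracks instructions executed.
    """
    instrumented = []
    for block in blocks:
        bump = f"leaq {len(block)}({COUNTER}), {COUNTER}  # added by PoUW"
        instrumented.append([bump] + list(block))
    return instrumented


def counted_instructions(blocks: List[List[str]], trace: List[int]) -> int:
    """Counter value after executing the blocks listed in `trace` (by index)."""
    return sum(len(blocks[i]) for i in trace)


# Hypothetical blocks shaped like Figure 21; call targets are placeholders.
blocks = [
    ["call make_string"],
    ["leaq -80(%rbp), %rax", "movq %rax, %rsi", "movq %rbx, %rdi"],
    ["call throw_out_of_range"],
]

for block in instrument(blocks):
    print("\n".join(block))
print(counted_instructions(blocks, [0, 1, 2]))  # prints: 5
```

Updating the counter once per basic block, instead of once per instruction, keeps the added work to a single `leaq` per block, which is one plausible reason the measured overhead stays small.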
