WeRLman: To Tackle Whale (Transactions), Go Deep (RL)

Roi Bar-Zur; Ameer Abu-Hanna; Ittay Eyal; Aviv Tamar

Paper 2022/175

Roi Bar-Zur, Technion – Israel Institute of Technology

Ameer Abu-Hanna, Technion – Israel Institute of Technology

Ittay Eyal, Technion – Israel Institute of Technology

Aviv Tamar, Technion – Israel Institute of Technology

Abstract

The security of proof-of-work blockchain protocols critically relies on incentives. Their operators, called miners, receive rewards for creating blocks containing user-generated transactions. Each block rewards its creator with newly minted tokens and with transaction fees paid by the users. The protocol stability is violated if any of the miners surpasses a threshold ratio of the computational power; she is then motivated to deviate with selfish mining and increase her rewards. Previous analyses of selfish mining strategies assumed constant rewards. But with statistics from operational systems, we show that there are occasional whales: blocks with exceptional rewards. Modeling this behavior implies a state space that grows exponentially with the parameters, becoming prohibitively large for existing analysis tools. We present the WeRLman framework to analyze such models. WeRLman uses deep Reinforcement Learning (RL), inspired by the state-of-the-art AlphaGo Zero algorithm. Directly extending AlphaGo Zero to a stochastic model leads to high sampling noise, which is detrimental to the learning process. Therefore, WeRLman employs novel variance reduction techniques by exploiting the recurrent nature of the system and prior knowledge of transition probabilities. Evaluating WeRLman against models we can accurately solve demonstrates it achieves unprecedented accuracy in deep RL for blockchain. We use WeRLman to analyze the incentives of a rational miner in various settings and upper-bound the security threshold of Bitcoin-like blockchains. We show, for the first time, a negative relationship between fee variability and the security threshold. The previously known bound, with constant rewards, stands at 0.25. We show that considering whale transactions reduces this threshold considerably. In particular, with Bitcoin historical fees and its future minting policy, its threshold for deviation will drop to 0.2 in 10 years, 0.17 in 20 years, and 0.12 in 30 years. With recent fees from the Ethereum smart-contract platform, the threshold drops to 0.17. These are below the common sizes of large miners.

Metadata

Available format(s): PDF
Category: Applications
Publication info: Preprint.
Keywords: Blockchain, Selfish Mining, Bitcoin, Ethereum, Transaction Fees, Miner Extractable Value, Deep Reinforcement Learning
Contact author(s): roi bar-zur @ campus technion ac il
History: 2022-07-27: revised; 2022-02-20: received
Short URL: https://ia.cr/2022/175
License: CC BY

BibTeX

@misc{cryptoeprint:2022/175,
      author = {Roi Bar-Zur and Ameer Abu-Hanna and Ittay Eyal and Aviv Tamar},
      title = {{WeRLman}: To Tackle Whale (Transactions), Go Deep ({RL})},
      howpublished = {Cryptology {ePrint} Archive, Paper 2022/175},
      year = {2022},
      url = {https://eprint.iacr.org/2022/175}
}