Paper 2022/175

WeRLman: To Tackle Whale (Transactions), Go Deep (RL)

Roi Bar-Zur, Technion – Israel Institute of Technology
Ameer Abu-Hanna, Technion – Israel Institute of Technology
Ittay Eyal, Technion – Israel Institute of Technology
Aviv Tamar, Technion – Israel Institute of Technology
Abstract

The security of proof-of-work blockchain protocols critically relies on incentives. Their operators, called miners, receive rewards for creating blocks containing user-generated transactions. Each block rewards its creator with newly minted tokens and with transaction fees paid by the users. The protocol stability is violated if any of the miners surpasses a threshold ratio of the computational power; she is then motivated to deviate with selfish mining and increase her rewards. Previous analyses of selfish mining strategies assumed constant rewards. But with statistics from operational systems, we show that there are occasional whales: blocks with exceptional rewards. Modeling this behavior implies a state space that grows exponentially with the parameters, becoming prohibitively large for existing analysis tools. We present the WeRLman framework to analyze such models. WeRLman uses deep Reinforcement Learning (RL), inspired by the state-of-the-art AlphaGo Zero algorithm. Directly extending AlphaGo Zero to a stochastic model leads to high sampling noise, which is detrimental to the learning process. Therefore, WeRLman employs novel variance reduction techniques by exploiting the recurrent nature of the system and prior knowledge of transition probabilities. Evaluating WeRLman against models we can accurately solve demonstrates it achieves unprecedented accuracy in deep RL for blockchain. We use WeRLman to analyze the incentives of a rational miner in various settings and upper-bound the security threshold of Bitcoin-like blockchains. We show, for the first time, a negative relationship between fee variability and the security threshold. The previously known bound, with constant rewards, stands at 0.25. We show that considering whale transactions reduces this threshold considerably. In particular, with Bitcoin historical fees and its future minting policy, its threshold for deviation will drop to 0.2 in 10 years, to 0.17 in 20 years, and to 0.12 in 30 years. With recent fees from the Ethereum smart-contract platform, the threshold drops to 0.17. These are below the common sizes of large miners.

Metadata
Available format(s)
PDF
Category
Applications
Publication info
Preprint.
Keywords
Blockchain, Selfish Mining, Bitcoin, Ethereum, Transaction Fees, Miner Extractable Value, Deep Reinforcement Learning
Contact author(s)
roi bar-zur @ campus technion ac il
History
2022-07-27: revised
2022-02-20: received
Short URL
https://ia.cr/2022/175
License
Creative Commons Attribution
CC BY

BibTeX

@misc{cryptoeprint:2022/175,
      author = {Roi Bar-Zur and Ameer Abu-Hanna and Ittay Eyal and Aviv Tamar},
      title = {WeRLman: To Tackle Whale (Transactions), Go Deep (RL)},
      howpublished = {Cryptology ePrint Archive, Paper 2022/175},
      year = {2022},
      note = {\url{https://eprint.iacr.org/2022/175}},
      url = {https://eprint.iacr.org/2022/175}
}