Paper 2022/175

WeRLman: To Tackle Whale (Transactions), Go Deep (RL)

Roi Bar-Zur, Ameer Abu-Hanna, Ittay Eyal, and Aviv Tamar


The security of proof-of-work blockchain protocols critically relies on incentives. Their operators, called miners, receive rewards for creating blocks containing user-generated transactions. Each block rewards its creator with newly minted tokens and with transaction fees paid by the users. The protocol stability is violated if any of the miners surpasses a threshold ratio of the computational power; she is then motivated to deviate with selfish mining and increase her rewards. Previous analyses of selfish mining strategies assumed constant rewards. But with statistics from operational systems, we show that there are occasional whales: blocks with exceptional rewards. Modeling this behavior implies a state-space that grows exponentially with the parameters, becoming prohibitively large for existing analysis tools. We present the WeRLman framework to analyze such models. WeRLman uses deep Reinforcement Learning (RL), inspired by the state-of-the-art AlphaGo Zero algorithm. Directly extending AlphaGo Zero to a stochastic model leads to high sampling noise, which is detrimental to the learning process. Therefore, WeRLman employs novel variance reduction techniques by exploiting the recurrent nature of the system and prior knowledge of transition probabilities. Evaluating WeRLman against models we can accurately solve demonstrates it achieves unprecedented accuracy in deep RL for blockchain. We use WeRLman to analyze the incentives of a rational miner in various settings and upper-bound the security threshold of Bitcoin-like blockchains. The previously known bound, with constant rewards, stands at 0.25. We show that considering whale transactions reduces this threshold considerably. In particular, with Bitcoin historical fees and its future minting policy, its threshold for deviation will drop to 0.2 in 10 years, 0.17 in 20 years, and to 0.12 in 30 years. With recent fees from the Ethereum smart-contract platform, the threshold drops to 0.17. These are below the common sizes of large miners.
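
For orientation, the constant-reward figure of 0.25 quoted in the abstract can be reproduced with the closed-form relative revenue from the classic selfish-mining analysis of Eyal and Sirer. This is a sketch of that earlier constant-reward model only, not of WeRLman or its whale-transaction model. The tie-breaking parameter `gamma` (the fraction of honest power that mines on the selfish miner's block during a tie) and the choice `gamma = 0.5` are assumptions, not stated in the abstract; the function names are illustrative.

```python
def selfish_revenue(alpha: float, gamma: float) -> float:
    """Relative revenue of a selfish miner holding a fraction alpha of the
    mining power, under constant block rewards (closed-form expression)."""
    numerator = (
        alpha * (1 - alpha) ** 2 * (4 * alpha + gamma * (1 - 2 * alpha))
        - alpha ** 3
    )
    denominator = 1 - alpha * (1 + (2 - alpha) * alpha)
    return numerator / denominator


def threshold(gamma: float) -> float:
    """Power share above which selfish mining out-earns honest mining
    in this constant-reward model."""
    return (1 - gamma) / (3 - 2 * gamma)


if __name__ == "__main__":
    gamma = 0.5  # assumed tie-breaking parameter, not taken from the abstract
    print(f"break-even share: {threshold(gamma):.3f}")
    for alpha in (0.20, 0.25, 0.30):
        # Honest mining earns a relative revenue equal to alpha itself,
        # so a positive gain means deviation is profitable.
        gain = selfish_revenue(alpha, gamma) - alpha
        print(f"alpha={alpha:.2f}  gain over honest mining={gain:+.4f}")
```

Under the assumed `gamma`, the gain is negative below the break-even share and positive above it, which is the sense in which a miner who surpasses the threshold is motivated to deviate.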

Publication info
Preprint. Minor revision.
Keywords
Blockchain, Security, Selfish Mining, Bitcoin, Ethereum, Fees, Transaction Fees, Whale Transactions, Miner Extractable Value, MEV, Deep Reinforcement Learning, Monte Carlo Tree Search, Deep Q Networks
Contact author(s)
roi bar-zur @ campus technion ac il
2022-02-20: received
Creative Commons Attribution


      author = {Roi Bar-Zur and Ameer Abu-Hanna and Ittay Eyal and Aviv Tamar},
      title = {WeRLman: To Tackle Whale (Transactions), Go Deep (RL)},
      howpublished = {Cryptology ePrint Archive, Paper 2022/175},
      year = {2022},
      note = {\url{}},
      url = {}