Cryptology ePrint Archive: Report 2017/588

Renyi Entropy Estimation Revisited

Maciej Obremski and Maciej Skorski

Abstract: We revisit the problem of estimating the entropy of discrete distributions from independent samples, studied recently by Acharya, Orlitsky, Suresh, and Tyagi (SODA 2015), and improve their upper and lower bounds on the necessary sample size $n$.

For estimating Renyi entropy of order $\alpha$, up to constant accuracy and error probability, we show the following:

(a) Upper bounds $n = O(1) \cdot 2^{\left(1-\frac{1}{\alpha}\right)H_{\alpha}}$ for integer $\alpha > 1$, as the worst case over distributions with Renyi entropy equal to $H_{\alpha}$.

(b) Lower bounds $n = \Omega(1) \cdot K^{1-\frac{1}{\alpha}}$ for any real $\alpha > 1$, as the worst case over all distributions on $K$ elements.
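
For illustration, instantiating bound (a) at $\alpha = 2$ (collision entropy), the exponent $1 - \frac{1}{\alpha}$ equals $\frac{1}{2}$, so the bound reads

\[ n = O(1) \cdot 2^{H_2/2}, \]

e.g., a source with collision entropy $H_2 = 30$ bits requires only $n = O(2^{15}) \approx 3.3 \cdot 10^{4}$ samples, regardless of the alphabet size.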

Our upper bounds essentially replace the alphabet size by a factor exponential in the entropy, which offers improvements especially in low- or medium-entropy regimes (of interest, for example, in anomaly detection). As for the lower bounds, they were previously known only conditionally, for sufficiently small accuracy, whereas our result applies even to constant accuracy. Moreover, we prove that small accuracy $\delta$ contributes to the complexity independently of the alphabet size, by an extra factor $\delta^{-\frac{1}{\alpha}}$.
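
To see the effect of the accuracy term in isolation: for $\alpha = 2$ the extra factor is $\delta^{-1/2}$, so

\[ \delta = 10^{-2} \quad\Longrightarrow\quad \delta^{-\frac{1}{\alpha}} = 10, \]

i.e., each hundredfold tightening of the accuracy costs another tenfold factor in samples, independently of the alphabet size.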

The argument for the upper bounds derives a clean identity for the variance of the falling-power sum of a multinomial distribution. Our approach to the lower bounds uses convex optimization to find a distribution on which estimation is provably harder, a technique that may be of independent interest.
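
As a concrete reference point, below is a minimal sketch (illustrative, not taken from the paper) of the standard falling-power estimator for integer $\alpha$. It relies on the textbook identity $\mathbb{E}\left[\sum_i (N_i)_{\alpha}\right] = (n)_{\alpha} \sum_i p_i^{\alpha}$ for multinomial counts $N_i$, where $(x)_{\alpha} = x(x-1)\cdots(x-\alpha+1)$; these are the same falling-power sums whose variance the paper analyzes. All function and variable names are illustrative.

import numpy as np
from collections import Counter
from math import log2

def falling_power(x, a):
    # Falling power (x)_a = x (x-1) ... (x-a+1).
    out = 1
    for j in range(a):
        out *= x - j
    return out

def renyi_entropy_estimate(samples, alpha):
    # Estimate H_alpha (in bits) for integer alpha > 1 from i.i.d. samples.
    # Uses the unbiased falling-power estimator of the power sum
    # P_alpha = sum_i p_i^alpha: for multinomial counts N_i,
    # E[ sum_i (N_i)_alpha ] = (n)_alpha * P_alpha.
    n = len(samples)
    counts = Counter(samples)
    p_hat = sum(falling_power(c, alpha) for c in counts.values()) / falling_power(n, alpha)
    # Plug in: H_alpha = log2(P_alpha) / (1 - alpha).
    return log2(p_hat) / (1 - alpha)

# Example: uniform distribution on 64 symbols, alpha = 2 (collision entropy).
rng = np.random.default_rng(0)
samples = rng.integers(0, 64, size=5000).tolist()
print(renyi_entropy_estimate(samples, alpha=2))  # close to log2(64) = 6 bits

For $\alpha = 2$, the power-sum estimate is simply the fraction of ordered sample pairs that collide.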

Category / Keywords: Renyi Entropy Estimation, Renyi Entropy, Sample Complexity, Extreme Points

Date: received 16 Jun 2017, last revised 19 Jun 2017

Contact author: obremski at cs au dk

Available format(s): PDF | BibTeX Citation

Version: 20170620:153731

Short URL: ia.cr/2017/588
