On the Design of LIL Tests for (Pseudo) Random Generators and Some Experimental Results

You are looking at a specific version 20140112:132546 of this paper. See the latest version.

Paper 2014/031

On the Design of LIL Tests for (Pseudo) Random Generators and Some Experimental Results

Yongge Wang

Abstract

Random numbers have been one of the most useful objects in statistics, computer science, cryptography, modeling, simulation, and other applications though it is very difficult to construct true randomness. Many solutions (e.g., cryptographic pseudorandom generators) have been proposed to harness or simulate randomness and many statistical testing techniques have been proposed to determine whether a pseudorandom generator produces high quality randomness. NIST SP800-22 (2010) proposes the state of art testing suite for (pseudo) random generators to detect deviations of a binary sequence from randomness. On the one hand, as a counter example to NIST SP800-22 test suite, it is easy to construct functions that are considered as GOOD pseudorandom generators by NIST SP800-22 test suite though the output of these functions are easily distinguishable from the uniform distribution. Thus these functions are not pseudorandom generators by definition. On the other hand, NIST SP800-22 does not cover some of the important laws for randomness. Two fundamental limit theorems about random binary strings are the central limit theorem and the law of the iterated logarithm (LIL). Several frequency related tests in NIST SP800-22 cover the central limit theorem while no NIST SP800-22 test covers LIL. This paper proposes techniques to address the above challenges that NIST SP800-22 testing suite faces. Firstly, we propose statistical distance based testing techniques for (pseudo) random generators to reduce the above mentioned Type II errors in NIST SP800-22 test suite. Secondly, we propose LIL based statistical testing techniques, calculate the probabilities, and carry out experimental tests on widely used pseudorandom generators by generating around 30TB of pseudorandom sequences. The experimental results show that for a sample size of 1000 sequences (2TB), the statistical distance between the generated sequences and the uniform distribution is around 0.07 (with 0 for statistically indistinguishable and 1 for completely distinguishable) and the root-mean-square deviation is around 0.005. Though the statistical distance 0.07 and RMSD 0.005 are acceptable for some applications, for a cryptographic "random oracle", the preferred statistical distance should be smaller than 0.03 and RMSD be smaller than 0.001 at the sample size 1000. These results justify the importance of LIL testing techniques designed in this paper. The experimental results in this paper are reproducible and the raw experimental data are available at author's website.

Metadata

Available format(s): PDF
Category: Foundations
Publication info: Preprint.
Keywords: pseudorandomness
Contact author(s): yonwang @ uncc edu
History: 2014-01-12: received
Short URL: https://ia.cr/2014/031
License: CC BY