Decisional second-preimage resistance: When does SPR imply PRE?

There is a well-known gap between second-preimage resistance and preimage resistance for length-preserving hash functions. This paper introduces a simple concept that fills this gap. One consequence of this concept is that tight reductions can remove interactivity for multi-target length-preserving preimage problems, such as the problems that appear in analyzing hash-based signature systems. Previous reduction techniques applied to only a negligible fraction of all length-preserving hash functions, presumably excluding all off-the-shelf hash functions.

The classic Rogaway-Shrimpton paper "Cryptographic hash-function basics" [15] shows that second-preimage resistance tightly implies preimage resistance for an efficient hash function that maps fixed-length inputs to much shorter outputs. The idea of the proof is that one can find a second preimage of a random input x with high probability by finding a preimage of the hash of x. But this probability depends on the difference in lengths, and the proof breaks down for length-preserving hash functions such as SHA-256.
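The compressing case is easy to see in simulation. The sketch below is a toy model with parameters of our choosing, not the paper's construction: for a random strongly compressing hash, a preimage of H(x) is almost never x itself, so a perfect preimage finder yields second preimages for almost all inputs.

```python
import random
from collections import defaultdict

random.seed(3)
M, N = 2**12, 2**4            # strongly compressing toy hash: 4096 inputs, 16 outputs
h = [random.randrange(N) for _ in range(M)]

pre = defaultdict(list)       # preimage table: output -> list of preimages
for x in range(M):
    pre[h[x]].append(x)

A = lambda y: pre[y][0]       # a preimage finder (always returns some preimage)
# A(h[x]) is a second preimage of x whenever it differs from x;
# this can fail for at most N inputs (one per output value).
rate = sum(1 for x in range(M) if A(h[x]) != x) / M
```

With these sizes the failure rate is at most N/M = 1/256, illustrating why the reduction is tight when the output is much shorter than the input.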
The same paper also argues that second-preimage resistance cannot imply preimage resistance for length-preserving hash functions. The argument, in a nutshell, is that the identity function from {0,1}^256 to {0,1}^256 provides unconditional second-preimage resistance (second preimages do not exist) even though preimages are trivial to find.
A counterargument is that this identity-function example says nothing about real hash functions such as SHA-256. The identity-function example shows that there cannot be a theorem proving preimage resistance from second-preimage resistance for all length-preserving hash functions; but this is only the beginning of the analysis. The example does not rule out the possibility that second-preimage resistance, together with a mild additional assumption, implies preimage resistance.

Contributions of this paper
We show that preimage resistance (PRE) follows tightly from the conjunction of second-preimage resistance (SPR) and decisional second-preimage resistance (DSPR). Decisional second-preimage resistance is a simple concept that we have not found in the literature: it means that the attacker has negligible advantage in deciding, given a random input x, whether x has a second preimage.
There is a subtlety in the definition of advantage here. For almost all length-preserving hash functions, always guessing that x does have a second preimage succeeds with probability approximately 63%. (See Section 3.) We define DSPR advantage as an increase in probability compared to this trivial attack.
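The 63% baseline can be checked empirically. The following sketch (toy parameters of our choosing) samples random length-preserving functions on a small domain and estimates the probability that a random input has a second preimage; the average lands near 1 - 1/e ≈ 0.632.

```python
import random
from collections import Counter

def spprob(h):
    """Fraction of inputs x whose output h[x] has at least two preimages."""
    counts = Counter(h)
    return sum(1 for y in h if counts[y] >= 2) / len(h)

random.seed(1)
M = 1024                                   # |X| = |Y| = 1024: length-preserving
trials = [spprob([random.randrange(M) for _ in range(M)]) for _ in range(50)]
avg = sum(trials) / len(trials)            # expected to be close to 1 - 1/e
```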
We provide three forms of evidence that DSPR is a reasonable assumption. First, we show that DSPR holds for random functions even against quantum adversaries that get quantum access to the function. Specifically, a q-query quantum adversary has DSPR advantage at most 32q^2/2^n against an oracle for a uniform random hash function from {0,1}^n to {0,1}^n. In [9] the same bound was shown for PRE and SPR, together with matching attacks demonstrating that the bounds are tight. This means that DSPR is at least as hard to break as PRE or SPR for uniform random hash functions from {0,1}^n to {0,1}^n. Second, the subtlety mentioned above means that DSPR, when generalized in the most natural way to m-bit-to-n-bit hash functions, becomes unconditionally provable when m is much larger than n. This gives a new proof of PRE from SPR, factoring the original proof by Rogaway and Shrimpton into two steps: first, prove DSPR when m is much larger than n; second, prove PRE from SPR and DSPR.
Third, we have considered ways to attack DSPR for real hash functions such as SHA-256, and have found nothing better than taking the time necessary to reliably compute preimages. A curious feature of DSPR is that there is no obvious way for a fast attack to achieve any advantage. A fast attack that occasionally finds a preimage of H(x) will occasionally find a second preimage, but the baseline is already guessing that x has a second preimage; to do better than the baseline, one needs enough evidence to be reasonably confident that x does not have a second preimage. Formally, there exists a fast attack (in the non-uniform model) that achieves a nonzero advantage (by returning 0 if the input matches some no-second-preimage values built into the attack, and returning 1 otherwise), but we do not have a fast way to recognize this attack.

Multi-target attacks
We see DSPR as showing how little needs to be assumed beyond SPR to obtain PRE. However, skeptics might object that SPR and DSPR are still two separate assumptions for cryptanalysts to study, that DSPR has received less study than PRE, and that DSPR could be easier to break than PRE, even assuming SPR. Why is assuming both SPR and DSPR, and deducing PRE, better than assuming both SPR and PRE, and ignoring DSPR? We give the following answer.
Consider the following simple interactive game T-openPRE. The attacker is given T targets H(1, x_1), ..., H(T, x_T), where x_1, ..., x_T are chosen independently and uniformly at random. The attacker is also given access to an "opening" oracle that, given i, returns x_i. The attacker's goal is to output (i, x′) where H(i, x′) = H(i, x_i) and i was not an oracle query. Games of this type appear in, e.g., analyzing the security of hash-based signatures: legitimate signatures reveal preimages of some hash outputs, and attackers try to find preimages of other hash outputs.
One can try to use an attack against this game to break PRE as follows. Take the PRE challenge, insert it at a random position into a list of T − 1 randomly generated targets, and run the attack. Abort if there is an oracle query for the position of the PRE challenge; there is no difficulty answering oracle queries for other positions. The problem here is that a successful attack could query as many as T − 1 out of T positions, and then the PRE attack succeeds with probability only 1/T. What happens if T is large and one wants a tight proof?
If T-openPRE were modified to use targets H(x_i) instead of H(i, x_i), then the attacker could try many guesses for x′, checking each H(x′) against all of the targets. This generic attack is T times more likely to succeed than a generic attack against PRE using the same number of guesses. However, the inclusion of the prefix i (as in [9]) seems to force attackers to focus on single targets, and opens up the possibility of a security proof that does not quantitatively degrade with T.
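The T-fold speedup of the unprefixed generic attack shows up directly in simulation. The following sketch (toy sizes of our choosing, in a simple random-oracle-style model) compares how often q random guesses hit one target versus any of T targets.

```python
import random

random.seed(2)
N = 2**14          # toy output-space size
T = 64             # number of targets
q = 100            # hash guesses per experiment
experiments = 2000
single_hits = multi_hits = 0
for _ in range(experiments):
    target = random.randrange(N)                       # one PRE target
    targets = {random.randrange(N) for _ in range(T)}  # T unprefixed targets
    outs = [random.randrange(N) for _ in range(q)]     # q random-looking hash outputs
    single_hits += any(o == target for o in outs)
    multi_hits += any(o in targets for o in outs)
# multi-target hits are roughly T times more frequent (up to saturation effects)
```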
One might try to tightly prove security of T-openPRE assuming security of a simpler non-interactive game T-PRE in which the opening oracle is removed: the attacker's goal is simply to find some (i, x′) with H(i, x′) = H(i, x_i), given T targets H(1, x_1), ..., H(T, x_T). This game T-PRE is simple enough that cryptanalysts can reasonably be asked to study it (and have already studied it without the i prefixes). However, the difficulty of answering the oracle queries in T-openPRE seems to be an insurmountable obstacle to a proof of this type.
We show that the security of T-openPRE follows tightly from the conjunction of two simple non-interactive assumptions, T-SPR and T-DSPR. This shows an important advantage of introducing DSPR: it allows a reduction to remove the interactivity of T-openPRE.
The advantage of SPR (and T-SPR) over PRE (and T-PRE) in answering oracle queries inside reductions was already pointed out in [9]. The remaining issue, the reason that merely assuming T-SPR is not enough, is that there might be an attack breaking PRE (and T-PRE and T-openPRE) only for hash outputs that have unique preimages. Such an attack would never break SPR.
To address this issue, [9] assumes that each hash-function output has at least two preimages. This is a restrictive assumption: it is not satisfied by most length-preserving functions, and presumably it is not satisfied by (e.g.) SHA-256 for 256-bit inputs. Building a hash function that can be reasonably conjectured to satisfy the assumption is not hard: for example, apply SHA-256, truncate the result to 248 bits (see Theorem 11), and apply SHA-256 again to obtain a random-looking 256-bit string. But the intermediate truncation here produces a noticeably smaller security level, and having to do twice as many SHA-256 computations is not attractive.
We instead observe that an attack of this type must somehow be able to recognize hash outputs with unique preimages, and, consequently, must be able to recognize hash inputs without second preimages, breaking DSPR. Instead of assuming that there are always two preimages, we make the weaker assumption that breaking DSPR is difficult. This assumption is reasonable for a much wider range of hash functions.

The strength of SPR
There are some hash functions H where SPR is easy to break, or at least seems easier to break than PRE (and T-PRE and T-openPRE):
- Define H(x) = 4^x mod p, where p is prime, 4 has order (p − 1)/2 modulo p, and x is in the range {0, 1, ..., p − 2}. Breaking PRE is then solving the discrete-logarithm problem, which seems difficult when p is large, but breaking SPR is a simple matter of adding (p − 1)/2 modulo p − 1. (Quantum computers break PRE in this example, but are not known to break PRE for analogous examples based on isogenies.)
- Define H, mapping long inputs (say 2^k message blocks) to {0,1}^n, by Merkle-Damgård iteration of an n-bit compression function. Then, under reasonable assumptions, breaking SPR for H takes only 2^{n−k} simple operations. See [10]. See also [1] for attacks covering somewhat more general iterated hash functions.
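The first example can be demonstrated concretely. In the sketch below we pick a deliberately tiny prime of our own choosing: 4 has order (p − 1)/2 modulo p, so shifting any exponent by (p − 1)/2 gives a distinct input with the same hash value.

```python
p = 11                          # tiny prime; 4 has order (p-1)/2 = 5 modulo 11
H = lambda x: pow(4, x, p)      # H maps {0, ..., p-2} to powers of 4 modulo p

def second_preimage(x):
    """SPR break: add (p-1)/2 mod p-1; since 4^((p-1)/2) = 1 mod p, H is unchanged."""
    return (x + (p - 1) // 2) % (p - 1)

ok = all(second_preimage(x) != x and H(second_preimage(x)) == H(x)
         for x in range(p - 1))
```

Finding preimages, by contrast, is the discrete-logarithm problem, for which no comparably cheap shortcut is known at cryptographic sizes.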
In the first example, proving PRE from SPR+DSPR is useless. In the second example, proving PRE from SPR+DSPR is unsatisfactory, since it seems to underestimate the quantitative security of PRE. This type of underestimate raises the same difficulties as a loose proof: users have to choose larger and slower parameters for the proof to guarantee the desired level of security, or have to take the risk of the "nightmare scenario" that there is a faster attack.
Fortunately, modern "wide-pipe" hash functions and "sponge" hash functions such as SHA-3 are designed to eliminate the internal collisions exploited in attacks such as [10]. Furthermore, input lengths are restricted in applications to hash-based signatures, and this restriction seems to strengthen SPR even for older hash functions such as SHA-256. The bottom line is that one can easily select hash functions for which SPR and T-SPR and T-DSPR seem to be as difficult to break as PRE, such as SHA3-256 and SHA-256 restricted to 256-bit inputs.

Organization of the paper
In Section 2 we define DSPR and show how it can be used to relate SPR and PRE. A consequence of our definition is that a function does not provide DSPR if noticeably more than half the domain elements have no colliding value. In Section 3 we show that the overwhelming majority of length-preserving hash functions have the property that more than half of the domain elements have a colliding value. In Section 4 we extend the analysis to keyed hash functions. We show in Section 5 that DSPR is hard in the QROM. We define T-DSPR in Section 6. We show in Section 7 how to use T-DSPR to eliminate the interactivity of T-openPRE. We close our work with a discussion of the implications for hash-based signatures in Section 8.

Decisional second-preimage resistance
In this section we give a formal definition of decisional second-preimage resistance (DSPR) for cryptographic hash functions. We start by defining some notation and recalling some standard notions for completeness before we move on to the actual definition.

Notation
Fix nonempty finite sets X and Y of finite-length bit strings.In this paper, a hash function means a function from X to Y.
As shorthands we write M = |X| and N = |Y|. We focus on bit strings so that it is clear what it means for elements of X or Y to be algorithm inputs or outputs. Inputs and outputs are required to be bit strings in the most common formal definitions of algorithms. These bit strings are often encodings of more abstract objects, and one could generalize all the definitions in this paper to work with more abstract concepts of algorithms.

Definitions
We now give several definitions of security concepts for a hash function H. We have not found decisional second-preimage resistance (DSPR) in the literature.
We also define a second-preimage-exists predicate (SPexists) and a second-preimage-exists probability (SPprob) as tools to help understand DSPR. The definitions of preimage resistance (PRE) and second-preimage resistance (SPR) are standard, but we repeat them here for completeness.
Definition 1 (PRE). The success probability of an algorithm A against the preimage resistance of a hash function H is

Succ^pre_H(A) = Pr[x ←_R X; x′ ← A(H(x)) : H(x′) = H(x)].

Definition 2 (SPR). The success probability of an algorithm A against the second-preimage resistance of a hash function H is

Succ^spr_H(A) = Pr[x ←_R X; x′ ← A(x) : x′ ≠ x and H(x′) = H(x)].

Definition 3 (SPexists). The second-preimage-exists predicate SPexists(H) for a hash function H is the function P : X → {0, 1} defined as follows: P(x) = 1 if |H^{−1}(H(x))| ≥ 2, and P(x) = 0 otherwise.

If P(x) = 0 then x has no second preimages under H: any x′ ≠ x has H(x′) ≠ H(x). The only possible successes of an SPR attack are for inputs x where P(x) = 1.

Definition 4 (SPprob). The second-preimage-exists probability SPprob(H) for a hash function H is Pr[x ←_R X : P(x) = 1], where P = SPexists(H).
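These definitions are easy to make concrete for a toy hash function given as a lookup table (an illustrative sketch of ours, not any real hash):

```python
from collections import Counter

def spexists(h):
    """SPexists(h): P[x] = 1 iff the output h[x] has at least two preimages."""
    counts = Counter(h)
    return [1 if counts[y] >= 2 else 0 for y in h]

def spprob(h):
    """SPprob(h): probability that a uniform random input has a second preimage."""
    P = spexists(h)
    return sum(P) / len(P)

h = [3, 1, 1, 6, 6, 6, 0, 5]    # toy 3-bit-to-3-bit hash, written as a table
P = spexists(h)                 # inputs 1,2 collide; inputs 3,4,5 collide
```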
In other words, p = SPprob(H) is the maximum of Succ^spr_H(A) over all algorithms A, without any limits on the cost of A. Later we will see that almost all length-preserving hash functions H have p > 1/2; more precisely, p ≈ 1 − e^{−1} ≈ 0.63. For comparison, p = 0 for an injective function H, such as the n-bit-to-n-bit identity function; and p = 1 for a function where every output has multiple preimages.

Definition 5 (DSPR). Let A be an algorithm that always outputs 0 or 1. The advantage of A against the decisional second-preimage resistance of a hash function H is

Adv^dspr_H(A) = max{0, Pr[x ←_R X; b ← A(x) : P(x) = b] − p},

where P = SPexists(H) and p = SPprob(H).

Examples of DSPR advantages
Here are some examples of computing DSPR advantages. As above, write P = SPexists(H) and p = SPprob(H).
If A(x) = 1 for all x, then Pr[x ←_R X; b ← A(x) : P(x) = b] = p by definition, so Adv^dspr_H(A) = 0. If A(x) = 0 for all x, then Adv^dspr_H(A) = max{0, 1 − 2p}. In particular, Adv^dspr_H(A) = 0 if p ≥ 1/2, while Adv^dspr_H(A) = 1 for an injective function H. More generally, say A(x) flips a biased coin and returns the result, where the probability of 1 is c, independently of x. Then A(x) = P(x) with probability cp + (1 − c)(1 − p), which is between min{1 − p, p} and max{1 − p, p}, so again Adv^dspr_H(A) = 0 if p ≥ 1/2. As a more expensive example, say A(x) searches through all x′ ∈ X to see whether some x′ is a second preimage for x, and returns 1 if any second preimage is found, otherwise 0. Then A(x) = P(x) with probability 1, so Adv^dspr_H(A) = 1 − p. This is the maximum possible DSPR advantage.
More generally, say A(x) runs a second-preimage attack B against H, and returns 1 if B is successful (i.e., the output x′ from B satisfies x′ ≠ x and H(x′) = H(x)), otherwise 0. By definition A(x) = 1 with probability Succ^spr_H(B), and if A(x) = 1 then also P(x) = 1, so A(x) = 1 = P(x) with probability Succ^spr_H(B). Also P(x) = 0 with probability 1 − p, and if P(x) = 0 then also A(x) = 0, as there simply does not exist any second preimage for B to find. Hence A(x) = 0 = P(x) with probability 1 − p. Overall A(x) = P(x) with probability 1 − p + Succ^spr_H(B), so

Adv^dspr_H(A) = max{0, 1 − 2p + Succ^spr_H(B)}.

This advantage is 0 whenever 0 ≤ Succ^spr_H(B) ≤ 2p − 1: even if B breaks second-preimage resistance with probability as high as 2p − 1 (which is approximately 26% for almost all length-preserving H), A breaks DSPR with advantage 0. If B breaks second-preimage resistance with probability p, the maximum possible, then Adv^dspr_H(A) = 1 − p, the maximum possible advantage. As a final example, say x_1 ∈ X has no second preimage, and say A(x) returns 0 if x = x_1, otherwise 1. Then A(x) = P(x) with probability p + 1/2^m, so Adv^dspr_H(A) = 1/2^m. This example shows that an efficient algorithm can achieve a (very small) nonzero DSPR advantage. We can efficiently generate an algorithm A of this type with probability 1 − p by choosing x_1 ∈ X at random (in the normal case that X = {0, 1}^m), but for typical hash functions H we do not have an efficient way to recognize whether A is in fact of this type, i.e., whether x_1 in fact has no second preimage: recognizing this is exactly the problem of breaking DSPR!
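These example advantages can be computed exactly for a toy table-based hash (an illustrative sketch; names such as dspr_adv are ours):

```python
from collections import Counter

def dspr_adv(h, A):
    """Exact DSPR advantage max{0, Pr[A(x) = P(x)] - p} for a table-based hash h."""
    counts = Counter(h)
    P = [1 if counts[y] >= 2 else 0 for y in h]
    p = sum(P) / len(P)
    succ = sum(1 for x in range(len(h)) if A(x) == P[x]) / len(h)
    return max(0.0, succ - p)

h = [3, 1, 1, 6, 6, 6, 0, 5]          # toy hash with SPprob = 5/8
always1 = lambda x: 1                  # trivial attack: advantage 0
always0 = lambda x: 0                  # advantage max{0, 1 - 2p} = 0 here
counts = Counter(h)
omniscient = lambda x: 1 if counts[h[x]] >= 2 else 0   # advantage 1 - p = 3/8
```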

Alternatives to the DSPR definition
Many security definitions require the attacker to distinguish two possibilities, each of which naturally occurs with probability 1/2. Any sort of blind guess is correct with probability 1/2. Define a as the probability of a correct output minus 1/2; a value of a noticeably larger than 0 means that the algorithm is noticeably more likely than a blind guess to be correct.
If an algorithm is noticeably less likely than a blind guess to be correct then one can do better by (1) replacing it with a blind guess or (2) inverting its output.
The first option replaces a with max{0, a}; the second option replaces a with |a|; both options have the virtue of eliminating negative values of a. Advantage is most commonly defined as |a|, or alternatively as 2|a|, the distance between the probability of a correct output and the probability of an incorrect output.These formulas are simpler than max{0, a}.
For DSPR, the two possibilities are not naturally balanced. A second preimage exists with probability p, and almost all length-preserving (or compressing) hash functions have p > 1/2. Guessing 1 is correct with probability p; guessing 0 is correct with probability 1 − p; random guesses can trivially achieve any desired intermediate probability. What is interesting, and what is naturally considered in our proofs, is an algorithm A that guesses correctly with probability larger than p. We thus define the advantage as max{0, Succ(A) − p}, where Succ(A) is the probability of A generating a correct output.
An algorithm A that guesses correctly with probability smaller than 1 − p is also useful. We could define advantage as max{0, Succ(A) − p, (1 − Succ(A)) − p} to take this into account, rather than leaving it to the attack developer to invert the output. However, this formula is more complicated than max{0, Succ(A) − p}.
If p < 1/2 then, with our definitions, guessing 0 has advantage 1 − 2p > 0. In particular, if p = 0 then guessing 0 has advantage 1: our definitions state that injective functions are trivially vulnerable to DSPR attacks. It might seem intuitive to define DSPR advantage as beating the best blind guess, i.e., as probability minus max{p, 1 − p} rather than probability minus p. This, however, would break the proof that SPR ∧ DSPR implies PRE: the identity function would have both SPR and DSPR but not PRE. We could add an assumption that p ≥ 1/2, but the approach we have taken is simpler.

DSPR plus SPR implies PRE
We now present the main application of DSPR in the simplest case: we show that a second-preimage-resistant and decisional-second-preimage-resistant hash function is preimage resistant.
We first define the two reductions we use, SPfromP and DSPfromP, and then give a theorem statement analyzing success probabilities. The algorithm SPfromP(H, A) is the standard algorithm that tries to break SPR using an algorithm A that tries to break PRE. The algorithm DSPfromP(H, A) is a variant that tries to break DSPR. Each algorithm uses one computation of H, one call to A, and (for DSPfromP) one string comparison, so each algorithm has essentially the same cost as A if H is efficient.

Definition 6 (SPfromP). Let H be a hash function. Let A be an algorithm. Then SPfromP(H, A) is the algorithm that, given x ∈ X, outputs A(H(x)).

Definition 7 (DSPfromP). Let H be a hash function. Let A be an algorithm. Then DSPfromP(H, A) is the algorithm that, given x ∈ X, outputs 0 if A(H(x)) = x, and 1 otherwise.

This output is 0 if A(H(x)) returns the preimage x that was already known for H(x), and 1 otherwise. Note that the 0 case provides some reason to believe that there is only one preimage. If there are i > 1 preimages then x, which is not known to A except via H(x), is information-theoretically hidden in a set of size i, so A cannot return x with probability larger than 1/i.

Theorem 8 (DSPR ∧ SPR ⇒ PRE). Let H be a hash function. Let A be an algorithm. Then

Succ^pre_H(A) ≤ Adv^dspr_H(B) + 3 · Succ^spr_H(C),

where B = DSPfromP(H, A) and C = SPfromP(H, A).

Proof. This is a special case of Theorem 25 below, modulo a change of syntax. The special case is that K in Theorem 25 is {()}, where () is the empty string. The change of syntax views a keyed hash function with an empty key as an unkeyed hash function.
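The two reductions are tiny; here is a direct transcription for table-based toy hashes (our own illustration, not the paper's formal syntax):

```python
def spfromp(h, A):
    """SPR adversary from PRE adversary A: on input x, run A on h[x]."""
    return lambda x: A(h[x])

def dspfromp(h, A):
    """DSPR adversary: output 0 iff A returns the already-known preimage x."""
    return lambda x: 0 if A(h[x]) == x else 1

h = [3, 1, 1, 6, 6, 6, 0, 5]                 # toy hash table
A = lambda y: h.index(y)                     # a perfect preimage finder
B = dspfromp(h, A)
# x = 0 has a unique preimage: A must return 0, so B(0) = 0 (a correct guess).
# x = 2 collides with 1: A(h[2]) = 1 != 2, so B(2) = 1 (a correct guess),
# and spfromp(h, A)(2) = 1 is a valid second preimage of 2.
```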

The second-preimage-exists probability
This section mathematically analyzes SPprob(H), the probability that a uniform random input to H has a second preimage.The DSPR advantage of any attacker is information-theoretically bounded by 1 − SPprob(H).

Simple cases
In retrospect, the heart of the Rogaway-Shrimpton SPR-PRE reduction [15, Theorem 7] is the observation that SPprob(H) is very close to 1 for all highly compressing hash functions H; see Theorem 9. We show that SPprob(H) is actually equal to 1 for almost all hash functions H that compress more than a few bits; see Theorem 11.
Theorem 9 (lower bound on SPprob). Let H be a hash function with M > N. Then SPprob(H) ≥ 1 − (N − 1)/M.

Proof. Define I as the set of elements of X that have no second preimages; i.e., the set of x ∈ X such that |H^{−1}(H(x))| = 1. The restriction of H to I is injective, so |H(I)| = |I|. Since M > N, some output has at least two preimages, and that output is not in H(I); hence |I| ≤ N − 1. By definition SPprob(H) is the probability that |H^{−1}(H(x))| ≥ 2 where x is a uniform random element of X, i.e., the probability that x is not in I. This is at least 1 − (N − 1)/M.

Theorem 10 (average of SPprob). The average of SPprob(H) over all hash functions H is 1 − (1 − 1/N)^{M−1}.

For example, the average is approximately 1 − 1/e ≈ 0.63 if M = 2^256 and N = 2^256; see also Theorem 12. The average converges rapidly to 1 as N/M drops: for example, the average is approximately 1 − 2^{−369.33} if M = 2^256 and N = 2^248, and is approximately 1 − 2^{−94548} if M = 2^256 and N = 2^240, while the lower bounds from Theorem 9 are approximately 1 − 2^{−8} and approximately 1 − 2^{−16} respectively.
The average converges to 0 as N/M increases.The average crosses below 1/2, making DSPR trivially breakable for the average function, as N/M increases past about 1/ log 2 ≈ 1.4427.
Proof. For each x ∈ X, there are exactly N(N − 1)^{M−1} hash functions H for which x has no second preimages. Indeed, there are N choices of H(x), and then for each i ∈ X − {x} there are N − 1 choices of H(i) different from H(x). Hence there are exactly M(N^M − N(N − 1)^{M−1}) pairs (H, x) where x has a second preimage under H; i.e., the total of SPprob(H) over all N^M hash functions is N^M − N(N − 1)^{M−1}, so the average is 1 − (1 − 1/N)^{M−1}.

Theorem 11 (SPprob is usually 1 for compressing hash functions). Let a be the average of SPprob(H) over all hash functions H. Then a uniform random hash function H has SPprob(H) = 1 with probability at least 1 − M(1 − a).

This is content-free in the length-preserving case but becomes more useful as N/M drops. For example, if M = 2^256 and N = 2^248, then the chance of SPprob(H) < 1 is at most 2^{−113.33}. Hence almost all 256-bit-to-248-bit hash functions have second preimages for all inputs, and therefore have perfect DSPR (DSPR advantage 0) against all attacks.
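Theorem 10's formula can be verified by brute force for tiny parameters (a sanity-check sketch of ours, not part of the proof):

```python
from itertools import product
from collections import Counter

def spprob(h):
    """Fraction of inputs whose output has at least two preimages."""
    counts = Counter(h)
    return sum(1 for y in h if counts[y] >= 2) / len(h)

M, N = 4, 3                                  # enumerate all N^M = 81 hash functions
avg = sum(spprob(h) for h in product(range(N), repeat=M)) / N**M
predicted = 1 - (1 - 1/N) ** (M - 1)         # Theorem 10: 1 - (1 - 1/N)^(M-1)
```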
Proof. Write q for the probability that SPprob(H) = 1. Then SPprob(H) ≤ 1 − 1/M with probability 1 − q; the point here is that SPprob(H) is a probability over M inputs, and is thus a multiple of 1/M. Hence a ≤ q + (1 − q)(1 − 1/M), i.e., q ≥ 1 − M(1 − a).
Theorem 12 (average of SPprob vs. 1 − 1/e in the length-preserving case). If M = N > 1 then the average a of SPprob(H) over all hash functions H, namely 1 − (1 − 1/N)^{N−1}, is within 1/N of 1 − 1/e.

The big picture is that almost all length-preserving hash functions H have SPprob(H) close to 1 − 1/e. This theorem states part of the picture: the average of SPprob(H) is extremely close to 1 − 1/e if N is large. Subsequent theorems fill in the rest of the picture.

How SPprob varies
This subsection analyzes the distribution of SPprob(H) as H varies. Theorem 14 amounts to an algorithm that computes the probability of each possible value of SPprob(H) in time polynomial in M + N. Theorem 16, used in Section 3.3, gives a simple upper bound on each term in the probability.
Theorem 13 (counting functions with all outputs colliding). Let a and b be nonnegative integers. Define c(a, b) as b!/a! times the coefficient of x^b in the power series (e^x − 1 − x)^a. Then there are exactly a! c(a, b) functions f from {1, ..., b} to {1, ..., a} for which every output has at least two preimages.

Proof. Choose integers i_1, ..., i_a ≥ 2 with i_1 + ⋯ + i_a = b, and consider any function f built as follows. Let π be a permutation of {1, ..., b}. Define f(π(1)) = ⋯ = f(π(i_1)) = 1, f(π(i_1 + 1)) = ⋯ = f(π(i_1 + i_2)) = 2, etc. There are exactly b! choices of π, producing exactly b!/(i_1! ⋯ i_a!) choices of f. This covers all functions f for which 1 has exactly i_1 preimages, 2 has exactly i_2 preimages, etc.
The total number of functions being counted is thus the sum of b!/(i_1! ⋯ i_a!) over all choices of i_1, ..., i_a ≥ 2 with i_1 + ⋯ + i_a = b. For comparison, the power series (e^x − 1 − x)^a = (x^2/2! + x^3/3! + ⋯)^a has, as its coefficient of x^b, the sum of 1/(i_1! ⋯ i_a!) over the same choices; multiplying by b! gives the count a! c(a, b).

Theorem 14 (exact distribution of SPprob). There are exactly Σ_k (M choose j) c(k − j, M − j) N!/(N − k)! hash functions H with SPprob(H) = 1 − j/M, where k ranges over integers with j ≤ k ≤ N. Each summand is nonnegative. In particular, if j > N then SPprob(H) = 1 − j/M with probability 0; and if j = N < M then SPprob(H) = 1 − j/M with probability 0, since c(0, M − N) = 0 for M > N. This calculation shows that Theorem 14 includes Theorem 9.
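The count in Theorem 13 can be cross-checked in code: the sketch below computes b! [x^b](e^x − 1 − x)^a with exact rational arithmetic and compares it against brute-force enumeration for tiny a, b (our own sanity check, not part of the paper).

```python
from itertools import product
from math import factorial
from fractions import Fraction

def egf_count(a, b):
    """b! times [x^b](e^x - 1 - x)^a: functions onto a labeled outputs,
    each output having at least two preimages."""
    series = [Fraction(0), Fraction(0)] + [Fraction(1, factorial(k))
                                           for k in range(2, b + 1)]
    poly = [Fraction(1)] + [Fraction(0)] * b
    for _ in range(a):                       # multiply by the series a times
        new = [Fraction(0)] * (b + 1)
        for i in range(b + 1):
            if poly[i]:
                for j in range(2, b + 1 - i):
                    new[i + j] += poly[i] * series[j]
        poly = new
    return int(poly[b] * factorial(b))

def brute_count(a, b):
    """Enumerate all a^b functions; keep those where every output has >= 2 preimages."""
    return sum(1 for f in product(range(a), repeat=b)
               if all(f.count(y) >= 2 for y in range(a)))
```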
The distribution of M − j here, for a uniform random hash function H, is equal to the distribution of "K_1" in [3, formula (2.21)], but the formulas are different. The sum in [3, formula (2.21)] is an alternating sum with cancellation between large terms. The sum in Theorem 14 is a sum of nonnegative terms; this is important for our asymptotic analysis.
Proof.We count the hash functions that (1) have exactly k ≥ j outputs and (2) have exactly j inputs with no second preimages.
Choose the j inputs. There are (M choose j) ways to do this. Choose a partition of the N outputs into
• j outputs that will be used (without second preimages) by the j inputs;
• k − j outputs that will be used (with second preimages) by the other M − j inputs; and
• N − k outputs that will not be used.
There are N!/(j!(k − j)!(N − k)!) ways to do this.
Choose an injective function from the j inputs to the j outputs.There are j! ways to do this.
Choose a function from the other M − j inputs to the other k − j outputs for which each of these k − j outputs has at least two preimages. By Theorem 13, there are (k − j)! c(k − j, M − j) ways to do this. This produces a hash function that, as desired, has exactly k outputs and has exactly j inputs with no second preimages. Each such function is produced exactly once. Hence there are (M choose j) c(k − j, M − j) N!/(N − k)! such functions.
Finally, sum over k to see that there are Σ_k (M choose j) c(k − j, M − j) N!/(N − k)! hash functions H that have exactly j inputs with no second preimages, i.e., hash functions H that have SPprob(H) = 1 − j/M.
Our proof applies [5, Proposition VIII.7], which is an example of the "saddle-point method" in analytic combinatorics. With more work one can use the saddle-point method to improve bounds by a polynomial factor, but our main concern here is exponential factors.
Check the hypotheses of [5, Proposition VIII.7]: A and B are analytic functions of the complex variable z, with all coefficients nonnegative.

Theorem 16 (exponential convergence of SPprob). Let j be an integer with 0 < j < M. Let k be an integer with j < k < N. Define µ = M/N, α = j/N, and κ = k/N. Let ζ be a positive real number.

The proof combines Theorem 15 with the weak Stirling bound N! ≥ (N/e)^N. See [14].

Maximization
This subsection formalizes and proves our claim that SPprob(H) is close to 1 − 1/e for almost all length-preserving hash functions H: as N increases (with M = N), the distributions plotted in Figure 1 converge to a vertical line.
The basic idea here is that τ in Theorem 16 is noticeably below e when j/N is noticeably below or above 1/e. One can quickly see this by numerically plotting τ as a function of α and ζ: note that any choice of α and ζ (along with µ = 1) determines κ and thus τ. The plot suggests that ζ = 1 maximizes τ for each α, and that moving α towards 1/e from either side increases τ up to its maximum value e. One could use interval arithmetic to show, e.g., that τ/e < 0.998 for j/N > 0.4, but the required number of subintervals would rapidly grow as j/N approaches 1/e. Our proof also handles some corner cases that are not visible in the plot.
In particular, there is a unique Z > 0 such that ϕ_1(Z) = (µ − α)/(1 − α). This is the first conclusion of the theorem. We return later to this particular value of Z. Define ϕ_2. By Lemma 46, ϕ_2(Z) achieves each real number >2 as a value exactly once. Define ϕ_3. Then ϕ_3(Z) > α, and ϕ_3 is decreasing since ϕ_2 is increasing.
In particular, again take the unique real number Z as above; then K < 1. This is the second conclusion of the theorem. We return later to these particular values of Z and K.
This is the fourth conclusion of the theorem.
Theorem 18. Let α, κ, ζ, A be positive real numbers. Assume that α < κ < 1.

Proof. The first inequality follows from the case µ = 1 in Theorem 17. The second inequality follows from the increasing part of Lemma 47 if α ≤ A ≤ 1/e, and from the decreasing part of Lemma 47 if 1/e ≤ A ≤ α.
Theorem 19. Assume that M = N. Let A be a real number with 0 < A < 1. Let H be a uniform random hash function. If A > 1/e, define E as the event that SPprob(H) ≤ 1 − A. If A ≤ 1/e, define E as the event that SPprob(H) ≥ 1 − A. Then E occurs with probability at most (T/e)^N · 2πN^2(N + 1) · e^{1/(6N)}, where T depends only on A. Any A ≠ 1/e has T/e < 1, and then the important factor in the probability for large N is (T/e)^N. For example, if A = 0.4 then T/e < 0.99780899, so (T/e)^N is below 1/2^{2^247} for N = 2^256. As another example, if A = 0.37 then T/e < 0.99999034, so (T/e)^N is below 1/2^{2^239} for N = 2^256.
Proof. By Theorem 14, the number of hash functions H for which E occurs is a sum of terms (N choose j) c(k − j, N − j) N!/(N − k)! over the relevant pairs (j, k). We then divide by N^N, the total number of hash functions, to obtain the probability of E.
Our strategy is to show that each term (N choose j) c(k − j, N − j) N!/(N − k)! is at most N!^2 e^N T^N / N^N. There are at most N(N + 1) pairs (j, k), so the total is at most N!^2 e^N T^N N(N + 1)/N^N, and the probability of E is at most (N!/N^N)^2 e^N T^N N(N + 1). This is at most (T/e)^N · 2πN^2(N + 1) · e^{1/(6N)} since N! ≤ (N/e)^N √(2πN) e^{1/(12N)}. The rest of the proof splits into various possibilities for (j, k).
as claimed; here we are using the weak Stirling bound N! ≥ (N/e)^N. Assume from now on that j > 0. Then 0 < j < k < N, so 0 < α < κ < 1. Also 2(κ − α) < 1 − α, so there is a positive real number ζ with the required properties. Each term is thus bounded by N!^2 e^N T^N / N^N as claimed.

DSPR for keyed hash functions
In this section we lift the discussion to the setting of keyed hash functions. We model keyed hash functions as functions H : K × X → Y that take a dedicated key as an additional input argument. One might also view a keyed hash function as a family of hash functions, where elements of the family are obtained by fixing the first input argument, which we call the function key. We write H_k for the function that is obtained from H by fixing the first input as k ∈ K.
We assume that K, like X and Y, is a nonempty finite set of finite-length bit strings.We define the compressing, expanding, and length-preserving cases as the cases |X | > |Y|, |X | < |Y|, and |X | = |Y| respectively, ignoring the size of K.
We recall the definitions of preimage and second-preimage resistance for keyed hash functions for completeness.

Definition 20 (PRE for keyed hash functions). The success probability of an adversary A against the preimage resistance of a keyed hash function H is

Succ^pre_H(A) = Pr[k ←_R K; x ←_R X; x′ ← A(H_k(x), k) : H_k(x′) = H_k(x)].

Definition 21 (SPR for keyed hash functions). The success probability of an adversary A against the second-preimage resistance of a keyed hash function H is

Succ^spr_H(A) = Pr[k ←_R K; x ←_R X; x′ ← A(x, k) : x′ ≠ x and H_k(x′) = H_k(x)].

Our definition of DSPR for a keyed hash function H relies on the second-preimage-exists predicate SPexists and the second-preimage-exists probability SPprob for the functions H_k. If H is chosen uniformly at random then, for large N and any reasonable size of K, it is very likely that all of the functions H_k have SPprob(H_k) close to 1 − 1/e; see Theorem 19.
Definition 22 (DSPR for keyed hash functions). Let A be an algorithm that always outputs 0 or 1. The advantage of A against the decisional second-preimage resistance of a keyed hash function H is

Adv^dspr_H(A) = max{0, Pr[k ←_R K; x ←_R X; b ← A(x, k) : P_k(x) = b] − p},

where P_k = SPexists(H_k) and p is the average of SPprob(H_k) over all k.
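A toy keyed example (tables of our own choosing) makes the averaged baseline p concrete:

```python
from collections import Counter

H = {0: [2, 1, 1, 3],      # H_0: SPprob = 1/2 (inputs 1 and 2 collide)
     1: [0, 0, 2, 2]}      # H_1: SPprob = 1 (every input collides)

def spprob(hk):
    counts = Counter(hk)
    return sum(1 for y in hk if counts[y] >= 2) / len(hk)

p = sum(spprob(hk) for hk in H.values()) / len(H)    # averaged baseline p = 3/4

def succ(A):
    """Probability over uniform (k, x) that A(x, k) equals P_k(x)."""
    correct = 0
    for k, hk in H.items():
        counts = Counter(hk)
        for x in range(len(hk)):
            correct += A(x, k) == (1 if counts[hk[x]] >= 2 else 0)
    return correct / sum(len(hk) for hk in H.values())

adv_always1 = max(0.0, succ(lambda x, k: 1) - p)     # trivial attack: advantage 0
```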
It might seem natural to define SPprob(H) as the average mentioned in the theorem.However, we will see later in the multi-target context that p is naturally replaced by a more complicated quantity influenced by the algorithm.

DSPR plus SPR implies PRE
Before we show that DSPR is hard in the QROM (see Section 5), we give a generalization of Theorem 8 to keyed hash functions. This theorem states that second-preimage resistance and decisional second-preimage resistance together imply preimage resistance.
As in Theorem 8, we first define the two reductions we use, and then give a theorem statement analyzing success probabilities. The special case that K = {()}, where () means the empty string, is the same as Theorem 8, modulo syntactic replacements such as replacing the pair ((), x) with x.
Definition 23 (SPfromP for keyed hash functions). Let H be a keyed hash function. Let A be an algorithm. Then SPfromP(H, A) is the algorithm that, given (k, x) ∈ K × X, outputs A(H_k(x), k).
Definition 24 (DSPfromP for keyed hash functions). Let H be a keyed hash function. Let A be an algorithm. Then DSPfromP(H, A) is the algorithm that, given (x, k) ∈ X × K, outputs 0 if A(H_k(x), k) = x, and 1 otherwise.

Theorem 25 (DSPR ∧ SPR ⇒ PRE for keyed hash functions). Let H be a keyed hash function. Let A be an algorithm. Then

Succ^pre_H(A) ≤ Adv^dspr_H(B) + 3 · Succ^spr_H(C),

where B = DSPfromP(H, A) and C = SPfromP(H, A).
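As a concrete illustration, the two reductions can be written as small Python wrappers (a toy sketch of our own; `h` is the function table for a single fixed key and `adv` plays the preimage adversary A):

```python
def sp_from_p(h, adv):
    """SPfromP: turn a preimage finder into a second-preimage finder.
    Given challenge x, run adv on h(x); the output is a second preimage
    whenever it is a preimage different from x."""
    def reduction(x):
        return adv(h[x])
    return reduction

def dsp_from_p(h, adv):
    """DSPfromP: turn a preimage finder into a second-preimage decider.
    Output 0 ("no second preimage") exactly when adv returns x itself."""
    def reduction(x):
        return 0 if adv(h[x]) == x else 1
    return reduction

# Toy example: h on a 4-element domain; adv inverts by table lookup,
# always returning the smallest preimage.
h = {0: 2, 1: 2, 2: 0, 3: 1}
adv = lambda y: min(x for x in h if h[x] == y)

c = sp_from_p(h, adv)
b = dsp_from_p(h, adv)
print(c(1))  # adv(h[1]) = adv(2) = 0, a valid second preimage of 1
print(b(2))  # adv(h[2]) = 2 = x, so b outputs 0: 2 has a unique preimage
```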

Proof.
To analyze the success probabilities, we split the universe of possible events into mutually exclusive events across two dimensions: the number of preimages of H_k(x), and whether A succeeds or fails in finding a preimage. Specifically, define S_i as the event that there are exactly i preimages and that A succeeds, and define F_i as the event that there are exactly i preimages and that A fails. Note that there are only finitely many i for which the events S_i and F_i can occur, namely i ∈ {1, 2, . . . , M}. All sums below are thus finite sums.
Define s_i and f_i as the probabilities of S_i and F_i respectively. The probability space here includes the random choices of x and k, and any random choices made inside A. The conditional probabilities mentioned below are conditional probabilities given S_i.

PRE success probability. By definition, Succ^pre_H(A) is the probability of the event that H_k(A(H_k(x), k)) = H_k(x). This event is the union of the events S_i, so Succ^pre_H(A) = Σ_i s_i.

DSPR success probability. Define P_k = SPexists(H_k). For the i = 1 cases, we have P_k(x) = 0 by definition of SPexists, so B is correct if and only if A succeeds. For the i > 1 cases, we have P_k(x) = 1, so B is correct as long as A does not output x. There are two disjoint ways for this to occur:
- A succeeds (case S_i). Then A outputs x with conditional probability exactly 1/i, since x is information-theoretically hidden in a set of size i; so there is conditional probability exactly (i − 1)/i that A does not output x.
- A fails (case F_i). Then A does not output x.

Together we get

Pr[B(x, k) = P_k(x)] = s_1 + Σ_{i>1} ((i − 1)/i) s_i + Σ_{i>1} f_i.
DSPR advantage. By definition Adv^dspr_H(B) = max{0, Pr[B(x, k) = P_k(x)] − p}, where p is the average of SPprob(H_k) over all k.
By definition SPprob(H_k) is the probability, over all choices of x, that x has a second preimage under H_k. Hence p is the same probability over all choices of x and k; i.e., p = Σ_{i>1} s_i + Σ_{i>1} f_i. Now subtract:

Adv^dspr_H(B) ≥ Pr[B(x, k) = P_k(x)] − p = s_1 − Σ_{i>1} (1/i) s_i.

SPR success probability. For the i = 1 cases, C never succeeds. For the i > 1 cases, C succeeds if and only if A succeeds and returns a value different from x. This happens with conditional probability (i − 1)/i for the same reason as above. Hence

Succ^spr_H(C) = Σ_{i>1} ((i − 1)/i) s_i.

Combining the probabilities. Since 3(i − 1)/i − 1/i ≥ 1 for all i ≥ 2, we have

Adv^dspr_H(B) + 3 · Succ^spr_H(C) ≥ s_1 − Σ_{i>1} (1/i) s_i + 3 Σ_{i>1} ((i − 1)/i) s_i ≥ s_1 + Σ_{i>1} s_i = Succ^pre_H(A),

as claimed.
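The case analysis above can be checked mechanically on small examples. The following Python sketch (our own toy framing; `exact_probs` and the table-lookup adversary are illustrative names) plays all three games exhaustively over a uniform challenge x for one fixed key and verifies the PRE ≤ DSPR-advantage + 3·SPR bound:

```python
import random
from collections import Counter

def exact_probs(h, adv):
    """Exhaustively play the PRE, DSPR, and SPR games over a uniform x."""
    N = len(h)
    counts = Counter(h.values())
    succ_pre = succ_spr = b_correct = p = 0
    for x in h:
        i = counts[h[x]]            # number of preimages of h(x)
        xp = adv(h[x])              # A's guess
        ok = h.get(xp) == h[x]      # A finds some preimage
        succ_pre += ok
        succ_spr += ok and xp != x  # C = SPfromP succeeds
        P = 1 if i > 1 else 0       # SPexists(x)
        p += P
        b = 0 if xp == x else 1     # B = DSPfromP's decision
        b_correct += (b == P)
    succ_pre /= N; succ_spr /= N; p /= N
    adv_dspr = max(0, b_correct / N - p)
    return succ_pre, adv_dspr, succ_spr

rng = random.Random(7)
h = {x: rng.randrange(64) for x in range(64)}
adv = lambda y: min(x for x in h if h[x] == y)  # perfect preimage finder
succ_pre, adv_dspr, succ_spr = exact_probs(h, adv)
assert succ_pre <= adv_dspr + 3 * succ_spr  # the inequality of the theorem
print(succ_pre, adv_dspr, succ_spr)
```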
The formal structure of the proof is concluded at this point, but we close with some informal comments on how to interpret this proof. What happens is the following. The cases where the plain reduction from SPR (C in the above) fails are the S_1 cases, i.e., the cases where A succeeds when there is only one preimage. If the probability s_1 that they occur gets close to A's total success probability, the success probability of C goes towards zero. However, s_1 translates almost directly into the DSPR advantage of B. This is also intuitively what we want. For a brute-force attack, one would expect s_1 to be less than a 1 − p fraction of A's success probability; if it is higher, this allows B to distinguish. On the one extreme, if s_1 = s, then B's DSPR advantage is exactly A's success probability and the reduction is tight. If s_1 = 0, then B has no advantage over guessing, but C wins with at least half the success probability of A (in this case our generic 1/3 bound can be tightened). As mentioned above, in general one would expect s_1 to be a recognizable fraction of s but clearly smaller than s. In these cases, both reductions succeed.

DSPR is hard in the QROM
So far we have highlighted relations between DSPR and other hash-function properties. However, all this is useful only if DSPR is a hard problem for the hash functions we are interested in. In the following we show that DSPR is hard for a quantum adversary as long as the hash function behaves like a random function. We do this by presenting a lower bound on the quantum query complexity of DSPR.
To make previous results reusable, we first need a result that relates the success probability of an adversary in a biased distinguishing game like the DSPR game to its success probability in the balanced version of the game.
Theorem 26. Let B_λ denote the Bernoulli distribution that assigns probability λ to 1, let X_b for b ∈ {0, 1} be a non-empty set, and consider the distinguishing games in which b is drawn from B_λ (the biased game) or from B_{1/2} (the balanced game).

More specifically, define the advantage Adv_p(A) in the biased game and the advantage Adv_{1/2}(A) in the balanced game accordingly. Then the first sub-claim follows, where we used p ≥ 1/2. Now, for a zero advantage in the biased game the second sub-claim is trivially true; for a non-zero advantage Adv_p(A) we obtain the second sub-claim as well. The last sub-claim follows directly, and the main statement follows from plugging the last two sub-claims together.
Our approach to showing that DSPR is hard is to give a reduction from an average-case distinguishing problem that was used in the full version of [9]. The problem makes use of the following distribution D_λ over boolean functions.

Definition 27. For λ ∈ [0, 1], let D_λ be the distribution over boolean functions f : X → {0, 1} in which, for each x ∈ X independently, f(x) = 1 with probability λ and f(x) = 0 with probability 1 − λ.
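The distribution D_λ over boolean functions can be sampled lazily, fixing each output bit on first query. A minimal Python sketch (the class name `DLambda` is ours):

```python
import random

class DLambda:
    """Lazily sampled f from D_lambda: each f(x) is an independent
    Bernoulli(lambda) bit, fixed on first query."""
    def __init__(self, lam, rng):
        self.lam, self.rng, self.table = lam, rng, {}
    def __call__(self, x):
        if x not in self.table:
            self.table[x] = 1 if self.rng.random() < self.lam else 0
        return self.table[x]

rng = random.Random(0)
f0 = DLambda(0.0, rng)          # D_0: the all-zero function
f = DLambda(1 / 256, rng)       # D_{1/N} for N = 256
assert all(f0(x) == 0 for x in range(256))
ones = sum(f(x) for x in range(4096))
print(ones)  # around 4096/256 = 16 ones in expectation
```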
In [9] the following bound on the distinguishing advantage of any q-query quantum adversary was shown.
Theorem 28 [9]. Let D_λ be defined as in Definition 27, and let A be any quantum algorithm making at most q quantum queries to its oracle. Then the advantage of A in distinguishing D_λ from D_0 is at most 8λq².

We still have to briefly discuss how DSPR is defined in the (quantum-accessible) random oracle model. Instead of giving a description of the hash function H, as implicitly done in Definition 5, we provide A with an oracle O that implements a function F : X → Y. As for most other notions that can be defined for unkeyed hash functions, DSPR in the (Q)ROM becomes the same for keyed and unkeyed hash functions: for keyed functions, instead of giving a description of the keyed hash function H and a key k to the adversary A, we provide A with an oracle that implements a function F : X → Y which now models H for a fixed key k. Hence, the following result applies to both cases; this can be seen by noting that the key space might contain just a single key.
We now have all the tools we need to show that DSPR is a hard problem.
Theorem 29. Let n ∈ N, N = 2^n, and let H : K × {0, 1}^n → {0, 1}^n as defined above be a random, length-preserving keyed hash function. Any quantum adversary A that solves DSPR making q quantum queries to H can be used to construct a quantum adversary B that makes 2q queries to its oracle and distinguishes D_0 from D_{1/N} with advantage at least Adv^dspr_H(A).

Proof. By construction. The algorithm B generates a DSPR instance as in Figure 2 and runs A on it. It outputs whatever A outputs. To answer an H query, B needs two f queries, as it also has to uncompute the result of the f query after it was used. The random function g can be efficiently simulated using 2q-wise independent hash functions, as discussed in [9].
1. Sample x′ ← X and y′ ← Y independently and uniformly at random.
2. Let g : X → Y \ {y′} be a random function. We construct H : X → Y as follows: H(x) = y′ if f(x) = 1 or x = x′, and H(x) = g(x) otherwise, for any x ∈ X.

Output: DSPR instance (H, x′). Namely, an adversary is given x′ and oracle access to H, and the goal is to decide if x′ has a second preimage under H.
If f ←_R D_0, then (H, x′) is a random DSPR challenge from the set of all DSPR challenges with P_H(x′) = 0 (slightly abusing notation, as we do not know a key for our random function). Similarly, if f ←_R D_{1/N}, then (H, x′) is a random DSPR challenge from the set of all DSPR challenges.
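The challenge generation of Figure 2 can be rendered as a toy Python experiment (our own sketch; for f drawn from D_0, the planted point x* is the unique preimage of y*, so it has no second preimage):

```python
import random

def make_dspr_challenge(n, f, rng):
    """Build (H, x*) as in Figure 2 (toy rendering): H(x) = y* when
    f(x) = 1 or x = x*, and otherwise H(x) = g(x) for a random
    g mapping into Y \ {y*}."""
    N = 2 ** n
    x_star, y_star = rng.randrange(N), rng.randrange(N)
    g = {}  # lazily sampled random function avoiding y*
    def H(x):
        if f(x) == 1 or x == x_star:
            return y_star
        if x not in g:
            g[x] = rng.choice([y for y in range(N) if y != y_star])
        return g[x]
    return H, x_star

rng = random.Random(3)
f0 = lambda x: 0                       # f drawn from D_0
H, x_star = make_dspr_challenge(6, f0, rng)
# With f from D_0, x* is the unique preimage of y*: no second preimage.
assert all(H(x) != H(x_star) for x in range(64) if x != x_star)
print("x* has no second preimage when f comes from D_0")
```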
where the last inequality follows from Theorem 26.
Theorem 30. Let n ∈ N, N = 2^n, and let H : K × {0, 1}^n → {0, 1}^n as defined above be a random, length-preserving keyed hash function. Any quantum adversary A that makes no more than q quantum queries to its oracle can only solve the decisional second-preimage problem with advantage

Adv^dspr_H(A) ≤ 32q²/N.

Proof. Use Theorem 29 to construct an adversary B that makes 2q queries and that has advantage at least Adv^dspr_H(A) in distinguishing D_0 from D_{1/N}. This advantage is at most 8(1/N)(2q)² = 32q²/N by Theorem 28.
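To get a feeling for the 32q²/N bound from the proof, a quick exact computation (our own sketch) for n = 256 and a 2^64-query attacker:

```python
from fractions import Fraction

def dspr_bound(n, q):
    """The advantage bound 32 * q^2 / N with N = 2^n."""
    return Fraction(32 * q * q, 2 ** n)

b = dspr_bound(256, 2 ** 64)
assert b == Fraction(1, 2 ** 123)
print(b)  # 2^-123: a 2^64-query quantum attacker gains essentially nothing
```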

DSPR for multiple targets
Multi-target security considers an adversary that is given T independent targets and is asked to solve a problem for one out of the T targets.This section defines T -DSPR, a multi-target version of DSPR.
We draw attention to an unusual feature of this definition: the advantage of an adversary A is defined as the improvement from p to q, where p and q are two probabilities that can both be influenced by A. The second probability q is A's chance of correctly predicting whether the input selected by A has a second preimage.The first probability p is the chance that the input selected by A does have a second preimage.
This deviates from the usual view of advantage as how much A improves upon the success probability of some trivial baseline attack. What we are doing, for multi-target attacks, is asking how much A improves upon the success probability of the baseline attack against the same target that A selected. In most of the contexts considered in the literature, the success probability of the baseline attack is independent of the target, so this matches the usual view. DSPR is different, because the success probability of the baseline attack depends on the target.
One can object that this allows the baseline attack to be affected (positively or negatively) by A's competence in target selection. We give two responses to this objection. First, our definition enables a proof (Theorem 33) that T-DSPR is at most T times easier to break than DSPR. Second, our definition enables an interactive multi-target generalization (Theorem 38) of our proof that DSPR and SPR together imply PRE.
Definition 31 (T-DSPR). Let T be a positive integer. Let A be an algorithm with output in {1, . . . , T} × {0, 1}. The advantage of A against the T-target decisional second-preimage resistance of a keyed hash function H is

Adv^{T-dspr}_H(A) = max{0, q − p},

where, for T independent uniform random targets (x_1, k_1, . . . , x_T, k_T) and (j, b) = A(x_1, k_1, . . . , x_T, k_T), q is the probability that P_{k_j}(x_j) = b and p is the probability that P_{k_j}(x_j) = 1.

The only difference between the formulas for q and p is that q compares P_{k_j}(x_j) to b while p compares it to 1. If T > 1 then an algorithm might be able to influence p up or down, compared to any particular SPprob(H_{k_i}), through the choice of j. Obtaining a significant T-DSPR advantage then means obtaining q significantly larger than p, i.e., making a prediction of P_{k_j}(x_j) significantly better than always predicting that it is 1.
As an extreme case, consider the following slow algorithm: compute each P_{k_j}(x_j) by brute force; choose j where P_{k_j}(x_j) = 0 if such a j exists, else j = 1; and output P_{k_j}(x_j). This algorithm has q = 1 and thus T-DSPR advantage 1 − p. The probability p for this algorithm is the probability that all of x_1, . . . , x_T have second preimages. For most length-preserving functions, this probability is approximately (1 − 1/e)^T, which rapidly converges to 0 as T increases, so the T-DSPR advantage rapidly converges to 1.
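The slow algorithm just described is easy to simulate for toy parameters; the following Python sketch (function names are ours) estimates p, the probability that all T targets have second preimages, which should be near (1 − 1/e)^T:

```python
import random
from collections import Counter

def brute_force_tdspr(hs, targets):
    """The slow algorithm from the text: compute each P_{k_j}(x_j) by
    brute force, pick a j with P = 0 if one exists (else j = 1, here
    index 0), and output the correct bit. hs[j] is the table of H_{k_j}."""
    P = []
    for h, x in zip(hs, targets):
        counts = Counter(h.values())
        P.append(1 if counts[h[x]] > 1 else 0)
    for j, pj in enumerate(P):
        if pj == 0:
            return j, 0
    return 0, 1

# advantage = q - p with q = 1 (the guess is always right) and
# p = Pr[all T targets have second preimages] ≈ (1 - 1/e)^T
rng = random.Random(5)
T, trials, all_have_sp = 8, 200, 0
for _ in range(trials):
    hs = [{x: rng.randrange(64) for x in range(64)} for _ in range(T)]
    targets = [rng.randrange(64) for _ in range(T)]
    j, b = brute_force_tdspr(hs, targets)
    all_have_sp += b          # b = 1 only when every P_{k_j}(x_j) = 1
print(all_have_sp / trials)   # near (1 - 1/e)^8 ≈ 0.026
```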
Definition 32. Let A be an algorithm, and let T be a positive integer. Then Plant_T(A) is the following algorithm: given (x, k) ∈ X × K, choose i ∈ {1, . . . , T} uniformly at random; choose (x_m, k_m) ∈ X × K uniformly at random for each m ≠ i; set (x_i, k_i) = (x, k); compute (j, b) = A(x_1, k_1, . . . , x_T, k_T); output b if j = i, and 1 otherwise.

This uses the standard technique of planting a single-target challenge at a random position in a multi-target challenge. With probability 1/T, the multi-target attack chooses the challenge position; in the other cases, this reduction outputs 1. The point of Theorem 33 is that this reduction interacts nicely with the subtraction of probabilities in the DSPR and T-DSPR definitions.
The cost of Plant_T(A) is the cost of generating a random number i between 1 and T, generating T − 1 elements of X × K, running A, and comparing j to i. The algorithm has essentially the same cost as A if X and K can be efficiently sampled.
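Plant_T itself is a thin wrapper; a Python sketch under our toy conventions (targets as opaque pairs, the adversary as a callable returning (j, b)):

```python
import random

def plant(adv, T, rng, sample_target):
    """Plant_T: embed a single DSPR challenge (x, k) at a random slot i
    among T targets; output A's bit if A picks slot i, else 1."""
    def single_target(x_k):
        i = rng.randrange(T)
        targets = [sample_target() for _ in range(T)]
        targets[i] = x_k            # plant the real challenge
        j, b = adv(targets)
        return b if j == i else 1
    return single_target

# toy run: an adversary that always picks slot 0 and guesses 1
adv = lambda targets: (0, 1)
B = plant(adv, 4, random.Random(2), lambda: ("x", "k"))
print(B(("challenge", "key")))  # 1 either way for this trivial adversary
```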
Theorem 33 (T-loose implication DSPR ⇒ T-DSPR). Let H be a keyed hash function. Let T be a positive integer. Let A be an algorithm with output in {1, . . . , T} × {0, 1}. Then

Adv^{T-dspr}_H(A) ≤ T · Adv^dspr_H(B),

where B = Plant_T(A).

Proof. By definition, the T-DSPR experiment runs A with T independent uniform random targets (x_1, k_1, . . . , x_T, k_T). Write (j, b) for the output of A(x_1, k_1, . . . , x_T, k_T). Then Adv^{T-dspr}_H(A) = max{0, q − p}, where q is the probability that P_{k_j}(x_j) = b, and p is the probability that P_{k_j}(x_j) = 1.
To analyze q and p, we split the universe of possible events into four mutually exclusive events E_bc, where E_bc is the event that A's output bit is b and P_{k_j}(x_j) = c. Then q = Pr E_00 + Pr E_11 and p = Pr E_01 + Pr E_11, so q − p = Pr E_00 − Pr E_01.
For comparison, the DSPR experiment for Adv^dspr_H(B) runs B, which in turn runs A with T independent uniform random targets (x_1, k_1, . . . , x_T, k_T). One of these targets (x_i, k_i) is the uniform random target (x, k) provided to B as a challenge; B randomly selects i and the remaining targets. The output b′ of B(x, k) is b if j = i, and 1 otherwise. The choice of i is not visible to A, so the event that i = j has probability 1/T. Furthermore, this event is independent of E_00, E_01, E_10, E_11: i.e., i = j has conditional probability 1/T given E_00, conditional probability 1/T given E_01, etc.
Write q′ for the chance that P_k(x) = b′, and p′ for the chance that P_k(x) = 1. Then Adv^dspr_H(B) = max{0, q′ − p′}. To analyze q′ and p′, we split into mutually exclusive events as follows:
- E_00 occurs and i = j. This has probability (Pr E_00)/T. Then (x_j, k_j) = (x_i, k_i) = (x, k), so P_k(x) = P_{k_j}(x_j) = 0 = b = b′. This contributes to q′ and not to p′.
- E_01 occurs and i = j. This has probability (Pr E_01)/T. Then (x_j, k_j) = (x, k), so P_k(x) = 1, while b′ = b = 0. This contributes to p′ and not to q′.
- All other cases: b′ = 1 (since b′ = 0 can happen only if b = 0 and i = j). We further split this into two cases:
  • P_k(x) = 1. This contributes to q′ and to p′.
  • P_k(x) = 0. This contributes to neither q′ nor p′.

Hence q′ − p′ = (Pr E_00 − Pr E_01)/T = (q − p)/T, so Adv^{T-dspr}_H(A) = max{0, q − p} ≤ T · max{0, q′ − p′} = T · Adv^dspr_H(B).

Removing interactivity
The real importance of DSPR for security proofs is that it allows interactive versions of preimage resistance to be replaced by non-interactive assumptions without penalty. Interactive versions of preimage resistance naturally arise in, e.g., the context of hash-based signatures; see Section 8. The example discussed in this section is the T-openPRE notion already informally introduced in Section 1.2. We first review T-SPR, a multi-target version of second-preimage resistance. Then we formally define the interactive notion T-openPRE and show that its security tightly relates to T-SPR and T-DSPR.
T-SPR is what is called multi-function, multi-target second-preimage resistance in [9]. It was shown in [9] that a generic attack against T-SPR has the same complexity as a generic attack against SPR.
Definition 34 (T-SPR). The success probability of an algorithm A against the T-target second-preimage resistance of a keyed hash function H is

Succ^{T-spr}_H(A) = Pr[(x_1, k_1, . . . , x_T, k_T) ← (X × K)^T; (j, x′) ← A(x_1, k_1, . . . , x_T, k_T) : H_{k_j}(x′) = H_{k_j}(x_j) ∧ x′ ≠ x_j].

T-openPRE is essentially what would be T-PRE (which we did not define), but with the additional tweak that the adversary gets access to an opening oracle. The adversary is allowed to query the oracle for the preimages of all but one of the targets and has to output a preimage for the remaining one.

Definition 35 (T-openPRE). Let H be a keyed hash function. The success probability of an algorithm A against the T-target opening-preimage resistance of H is defined as

Succ^{T-openpre}_H(A) = Pr[(x_1, k_1, . . . , x_T, k_T) ← (X × K)^T; (j, x′) ← A^Open(H_{k_1}(x_1), k_1, . . . , H_{k_T}(x_T), k_T) : H_{k_j}(x′) = H_{k_j}(x_j) ∧ j was not queried to Open],

where Open(i) = x_i.

Now, it is of course possible to reduce PRE to T-openPRE. However, such a reduction has to correctly guess the index j for which A will output a preimage (and hence does not make a query). Otherwise, if the reduction embeds its challenge image in any of the other positions, it cannot answer A's query for that index. As A loses nothing by querying all indices but j, we can assume that it actually does so. Hence, such a reduction from PRE must incur a loss in tightness of a factor T. For some applications discussed below, T can reach the order of N^{1/4}; this means losing a quarter of the security level. Theorem 38 shows that T-openPRE is tightly related to the non-interactive assumptions T-DSPR and T-SPR: if H is T-target decisional-second-preimage resistant and T-target second-preimage resistant, then it is T-target opening-preimage resistant. As before, we first define the reductions and then state a theorem regarding probabilities.
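The T-openPRE experiment, including the Open oracle and the non-cheating check, can be sketched in Python for a single fixed key (all names are ours; the brute-force adversary merely illustrates that the notion is only meaningful against bounded attackers):

```python
import random

def t_openpre_game(h, T, adv, rng):
    """Play the T-openPRE game for one fixed key: adv gets the T images
    and an Open oracle, and wins by returning a preimage for an index
    it never opened."""
    xs = [rng.randrange(len(h)) for _ in range(T)]
    opened = set()
    def open_oracle(i):
        opened.add(i)
        return xs[i]
    j, xp = adv([h[x] for x in xs], open_oracle)
    return j not in opened and h.get(xp) == h[xs[j]]

# An adversary that opens every index but the last and inverts the
# remaining image by exhaustive search over the (tiny) domain.
def brute_adv(images, open_oracle):
    for i in range(len(images) - 1):
        open_oracle(i)
    j = len(images) - 1
    xp = next(x for x in h if h[x] == images[j])
    return j, xp

rng = random.Random(9)
h = {x: rng.randrange(64) for x in range(64)}
print(t_openpre_game(h, 4, brute_adv, rng))  # True: brute force wins
```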
Definition 36 (T-target SPfromP). Let H be a keyed hash function. Let A be an algorithm using an oracle. Let T be a positive integer. Then SPfromP_T(H, A) is the following algorithm: given (x_1, k_1, . . . , x_T, k_T), run A^Open(H_{k_1}(x_1), k_1, . . . , H_{k_T}(x_T), k_T), answering each oracle query Open(i) with x_i, and output A's output (j, x′).

This generalizes the standard SPfromP reduction: it handles multiple targets in the obvious way, and it easily answers oracle queries with no failures since it knows all the x_i inputs. The algorithm SPfromP_T(H, A) uses T calls to H (which can be deferred until their outputs are used) and one call to A.
Definition 37 (T-target DSPfromP). Let H be a keyed hash function. Let A be an algorithm using an oracle. Then DSPfromP_T(H, A) is the following algorithm: given (x_1, k_1, . . . , x_T, k_T), run A^Open(H_{k_1}(x_1), k_1, . . . , H_{k_T}(x_T), k_T), answering each oracle query Open(i) with x_i; when A outputs (j, x′), output (j, b), where b = 0 if x′ = x_j and j was not queried, and b = 1 otherwise.

This is an analogous adaptation of our DSPfromP reduction to the interactive multi-target context. Again oracle queries are trivial to answer. Note that the case that A cheats, returning an index j that it used for an Open query, is a failure case for A by definition; the algorithm DSPfromP_T(H, A) outputs 1 in this case, exactly as if A had failed to find a preimage. In other words, this algorithm returns 0 whenever A returns a solution that contains the preimage that was already known by the reduction (but not given to A via Open), and 1 otherwise.

The core proof idea is the following. As noted above, the reductions attacking T-SPR and T-DSPR can perfectly answer all of A's oracle queries, as they know the preimages. However, for the index for which A outputs a preimage (without cheating), A did not learn the preimage known to the reduction. Hence, from there on we can apply a similar argument as in the proof of Theorem 25. We include a complete proof below to aid verification.

Theorem 38 (T-DSPR ∧ T-SPR ⇒ T-openPRE). Let H be a keyed hash function. Let A be an algorithm using an oracle. Then

Succ^{T-openpre}_H(A) ≤ Adv^{T-dspr}_H(B) + 3 · Succ^{T-spr}_H(C),

where B = DSPfromP_T(H, A) and C = SPfromP_T(H, A).
Proof. Write (j, x′) for the output of A^Open(H_{k_1}(x_1), k_1, . . . , H_{k_T}(x_T), k_T). As in the proof of Theorem 25, we split the universe of possible events into mutually exclusive events across two dimensions: the number of preimages of H_{k_j}(x_j), and whether A succeeds or fails in finding a preimage. Specifically, define S_i as the event that there are exactly i preimages and that A succeeds, and define F_i as the event that there are exactly i preimages and that A fails. Note that there are only finitely many i for which the events S_i and F_i can occur. Define s_i and f_i as the probabilities of S_i and F_i respectively. The probability space here includes the random choices of (x_1, k_1, . . . , x_T, k_T), and any random choices made inside A.
T-openPRE success probability. By definition, Succ^{T-openpre}_H(A) is the probability that x′ is a non-cheating preimage of H_{k_j}(x_j); i.e., that H_{k_j}(x′) = H_{k_j}(x_j) and j was not a query to the oracle. This event is the union of the events S_i, so Succ^{T-openpre}_H(A) = Σ_i s_i.

T-DSPR success probability. By definition, B outputs the pair (j, b), where b = ((x′ ≠ x_j) ∨ j was a query of A).
Define P_{k_j} = SPexists(H_{k_j}), and define q as in the definition of Adv^{T-dspr}_H(B). Then q is the probability that B is correct, i.e., that b = P_{k_j}(x_j). There are four cases:
- If the event S_1 occurs, then there is exactly 1 preimage of H_{k_j}(x_j), so P_{k_j}(x_j) = 0 by definition of SPexists. Also, A succeeds: i.e., j was not a query, and x′ is a preimage of H_{k_j}(x_j), forcing x′ = x_j. Hence b = 0 = P_{k_j}(x_j).
- If the event F_1 occurs, then again P_{k_j}(x_j) = 0, but now A fails: i.e., j was a query, or x′ is not a preimage of H_{k_j}(x_j). Either way b = 1 ≠ P_{k_j}(x_j).
(We could skip this case in the proof, since we need only a lower bound on q rather than an exact formula for q.)
- If the event S_i occurs for i > 1, then P_{k_j}(x_j) = 1 and A succeeds. Hence j was not a query, and x′ is a preimage of H_{k_j}(x_j), so x′ = x_j with conditional probability exactly 1/i. Hence b = 1 = P_{k_j}(x_j) with conditional probability exactly (i − 1)/i.
- If the event F_i occurs for i > 1, then P_{k_j}(x_j) = 1 and A fails. Failure means that x′ is not a preimage (so in particular x′ ≠ x_j), or that j was a query. Either way b = 1 = P_{k_j}(x_j).
To summarize, q = s_1 + Σ_{i>1} ((i − 1)/i) s_i + Σ_{i>1} f_i.

T-DSPR advantage. Define p as in the definition of Adv^{T-dspr}_H(B). Then Adv^{T-dspr}_H(B) = max{0, q − p}. The analysis of p is the same as the analysis of q above, except that we compare P_{k_j}(x_j) to 1 instead of comparing it to b. We have 1 = P_{k_j}(x_j) exactly for the events S_i and F_i with i > 1. Hence p = Σ_{i>1} s_i + Σ_{i>1} f_i. Subtract to see that

Adv^{T-dspr}_H(B) ≥ q − p = s_1 − Σ_{i>1} (1/i) s_i.

T-SPR success probability. By definition, C outputs (j, x′). The T-SPR success probability Succ^{T-spr}_H(C) is the probability that x′ is a second preimage of x_j under H_{k_j}, i.e., that H_{k_j}(x′) = H_{k_j}(x_j) while x′ ≠ x_j.
It is possible for C to succeed while A fails: perhaps A learns x_j = Open(j) and then computes a second preimage for x_j, which does not qualify as a T-openPRE success for A but does qualify as a T-SPR success for C. We ignore these cases, so we obtain only a lower bound on Succ^{T-spr}_H(C); this is adequate for the proof.
Assume that event S_i occurs with i > 1. Then x′ is a preimage of H_{k_j}(x_j). Furthermore, A did not query j, so x_j is not known to A except via H_{k_j}(x_j). There are i preimages, so x′ = x_j with conditional probability exactly 1/i. Hence C succeeds with conditional probability (i − 1)/i. To summarize, Succ^{T-spr}_H(C) ≥ Σ_{i>1} ((i − 1)/i) s_i.

Combining the probabilities. We conclude as in the proof of Theorem 25:

Adv^{T-dspr}_H(B) + 3 · Succ^{T-spr}_H(C) ≥ s_1 − Σ_{i>1} (1/i) s_i + 3 Σ_{i>1} ((i − 1)/i) s_i ≥ s_1 + Σ_{i>1} s_i = Succ^{T-openpre}_H(A).

Applications to hash-based signatures
The interactive notion of T-openPRE with a huge number of targets naturally arises in the context of hash-based signatures. This was already observed and extensively discussed in [9]. One conclusion of the discussion there is to use keyed hash functions with new (pseudo)random keys for each hash-function call made in a hash-based signature scheme. When applying this idea to Lamport one-time signatures (L-OTS) [11], the standard security notion for OTS, existential unforgeability under one chosen-message attack (EU-CMA), becomes T-openPRE where A is allowed to make T/2 queries. Using L-OTS in a many-time signature scheme such as the Merkle Signature Scheme [13] and variants like [12, 2, 8] can easily amplify the difference in tightness between a reduction that uses (T-)PRE and a reduction from T-SPR and T-DSPR to 2^70. Indeed, the general idea of using T-SPR instead of (T-)PRE in security reductions for hash-based signatures already occurs in [9]. However, there the authors make use of the assumption that for the hash function used, every input has a colliding value for all keys, i.e., SPprob(H) = 1 in our notation. This is unlikely to hold for common length-preserving keyed hash functions, as Section 3 shows that SPprob(H) ≈ 1 − 1/e for random H. However, as shown above, it is also not necessary to require SPprob(H) = 1; instead, it suffices to require (T-)DSPR.
For modern hash-based signatures like XMSS [7], L-OTS is replaced by variants [6] of the Winternitz OTS (W-OTS) [13]. For W-OTS the notion of EU-CMA security does not directly translate to T-openPRE. Indeed, the security reduction becomes far more involved, as W-OTS uses hash chains. However, as shown in [9], one can replace (T-)PRE in this context by T-SPR and the assumption that SPprob(H) = 1. Along the lines of the above approach, we can then replace the assumption that SPprob(H) = 1 by T-DSPR.

Lemma 49. Define ϕ_5(x) = xe^x − e^x + 1. Then ϕ_5 decreases for x < 0, has minimum value 0 at x = 0, and increases for x > 0.
By Lemma 49, xe^x − e^x + 1 increases with x for x > 0. Its value at x = 1 is 1. Hence log(xe^x − e^x + 1) is negative for x < 1 and positive for x > 1.
To summarize, γ is positive for x < 1 and negative for x > 1. Hence γ has its maximum value at x = 1.
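A quick numerical sanity check of Lemma 49's claims about ϕ_5 (our own sketch):

```python
import math

def phi5(x):
    """phi_5(x) = x*e^x - e^x + 1 from Lemma 49."""
    return x * math.exp(x) - math.exp(x) + 1

# minimum value 0 at x = 0; decreasing for x < 0, increasing for x > 0
assert phi5(0) == 0
assert phi5(-2) > phi5(-1) > 0 and 0 < phi5(1) < phi5(2)
assert abs(phi5(1) - 1) < 1e-12   # value 1 at x = 1, as used in the text
print("Lemma 49 sanity checks pass")
```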
We conclude by assuming