2010 Reports : Cryptology ePrint Archive Forum

**Re: 2010/384 is theoretical work - not practical**


Discussion forum for Cryptology ePrint Archive reports posted in 2010.
Please put the report number in the subject.

2010/384 is theoretical work - not practical

Posted by: **DutchShtull** (IP Logged)

Date: 08 July 2010 08:24

Three remarks on this paper:

1. It is a nice discovery to show that narrow-pipe hash functions can never replace random oracles. From this point of view, wide-pipe hash designs have an obvious advantage in the SHA-3 contest.

2. The term "practical" in the title of the paper is not well chosen. The observed decrease in entropy is of *theoretical* importance, not of practical importance.

3. My impression is that wide-pipe hash designs will not manifest the observed loss of entropy (as narrow-pipe designs do), but this is not clearly stated and explained in the paper.
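The basic effect under discussion can be seen in a toy experiment: a random function on N points misses roughly a 1/e fraction of its range (points with no preimage), so even a single application to a uniformly random chaining value discards some states. A minimal Monte Carlo sketch, where the domain size and trial count are my own arbitrary choices:

```python
import random

def missing_fraction(N, rng):
    """Fraction of {0..N-1} with no preimage under a random function."""
    f = [rng.randrange(N) for _ in range(N)]   # random function as a lookup table
    return 1 - len(set(f)) / N

rng = random.Random(1)
avg = sum(missing_fraction(4096, rng) for _ in range(20)) / 20
print(avg)   # close to 1/e ≈ 0.368
```

This is only the one-step picture; the paper's claims concern what happens when such a step is iterated many times.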

Posted by: **Orr** (IP Logged)

Date: 09 July 2010 16:05

The latest version I've read contains mistakes which are currently under discussion with the authors (offline).

For example, the claim concerning the loss of entropy when you iterate the compression function 2^20 times is wrong if the message blocks contain a bit of entropy.

So, I would hold your horses concerning the statement "narrow-pipe hash functions can never replace random oracles" (for what it's worth, no practical hash function can replace a random oracle).

Posted by: **gligoroski** (IP Logged)

Date: 09 July 2010 20:47

I am receiving very good comments, suggestions, and corrections from Jean-Philippe and Orr, so I will soon post a corrected version of the paper.

But DutchShtull is right: this is more a theoretical work, of theoretical importance.

Posted by: **marshray** (IP Logged)

Date: 15 July 2010 01:12

The model

h_(i+1) = compress(h_i, m_i)

may be a bit of an oversimplification. Notice that in actual SHA-256 the result of compress(h_i, m_i) is added into the h_i input chaining value to produce h_(i+1), instead of outright replacing it. I suspect M-D foresaw this issue and built in resistance to it.

The paper seems to imply that the worst case would be a message block input which mapped every chaining input X to the same Y, destroying all entropy from previous blocks. However, in the actual SHA-256 this would result in perfect transmission of entropy past that block.

In order for a block to destroy entropy it needs to actively negate the input chaining value. I'm thinking this probability is more like 1/2^256 rather than 1/e?

It is also possible I am misunderstanding something basic here. I admit to being a bit out of my league. :-)

Posted by: **marshray** (IP Logged)

Date: 15 July 2010 04:47

So the

h_(i+1) = h_i + F(h_i, m_i)

construction used here seems to be credited to Davies-Meyer. It's not used in MD2, but it is used in MD4 (RFC 1186, October 1990), MD5, SHA-1, and SHA-2.

However, descriptions of Davies-Meyer seem to specifically expect a block cipher for F. As a block cipher is a permutation (for each fixed message block acting as the key), it would have no missing preimages, so it would not seem that this effect would need compensating.

Perhaps it was an invisible influence, designer's intuition, or we just got lucky?
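One way to probe this question on a toy domain: if F(·, m) is a permutation for a fixed message block, the bare map h → F(h, m) misses no outputs, but the feed-forward h → h + F(h, m) no longer inherits that property and misses roughly a 1/e fraction of its range, like a random function. A sketch over a small additive group (the domain size is an arbitrary choice of mine):

```python
import random

N = 4096
rng = random.Random(2)
perm = list(range(N))
rng.shuffle(perm)                               # stand-in for the block cipher E_m(.)

plain = {perm[h] for h in range(N)}             # h -> E_m(h): the bare permutation
fedfwd = {(h + perm[h]) % N for h in range(N)}  # h -> h + E_m(h): with feed-forward

print(len(plain) / N)    # 1.0: a permutation misses nothing
print(len(fedfwd) / N)   # roughly 1 - 1/e ≈ 0.63 of outputs are reachable
```

So, at least in this toy model, the feed-forward is what introduces missing preimages, not the block cipher itself.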

Posted by: **gligoroski** (IP Logged)

Date: 17 July 2010 06:37

marshray Wrote:

-------------------------------------------------------

> So the

>

> h_(i+1) = h_i + F(h_i, m_i)

>

> construction used by seems to be credited to

> Davies-Meyer. It's not used in MD2 but is used in

> MD4 (RFC 1186 October 1990), MD5, SHA-1, and

> SHA-2.

Hmm,

the function

h_(i+1) = h_i + F(h_i, m_i)

still has the form:

h_(i+1) = compress(h_i, m_i)


Posted by: **marshray** (IP Logged)

Date: 19 July 2010 19:34

Yes, but some of the possible compression functions approximate a random function more closely than others. If we have information that a given construction does not, then we may be obligated not to approximate it as such.

I did some Monte Carlo testing and learned some things :-). Results and code are posted here:

[www.ietf.org]

In short:

* Mixing the input chaining value back to the output (D-M) causes the result to more closely approximate a random function (for better or for worse). So MD4-SHA2 do, in fact, experience the entropy reduction effect.

* D-M does convert the worst-case of a completely degenerate compression function (e.g., one which returns the same image every time) into a best case of no entropy loss.

* A pure M-D construction using a random permutation (say, AES) for the CF is made worse by mixing the chaining value back into the output.

- Marsh

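The three bullets can be reproduced on a toy domain. The sketch below (my own small parameters, not the code linked above) starts from every possible chaining value, iterates a fixed-message step, and counts how many distinct states survive:

```python
import random

N = 4096
STEPS = 64
rng = random.Random(3)
P = list(range(N))
rng.shuffle(P)                                # random permutation: block-cipher-like CF
R = [rng.randrange(N) for _ in range(N)]      # random function CF

def survivors(step):
    """Fraction of distinct chaining values left after STEPS iterations."""
    states = set(range(N))
    for _ in range(STEPS):
        states = {step(h) for h in states}
    return len(states) / N

rf   = survivors(lambda h: R[h])              # plain step, random-function CF
dm   = survivors(lambda h: (h + P[h]) % N)    # permutation CF with D-M feed-forward
perm = survivors(lambda h: P[h])              # permutation CF alone
deg  = survivors(lambda h: (h + 7) % N)       # degenerate constant CF plus feed-forward
print(rf, dm, perm, deg)
```

The random-function step and the permutation-with-feed-forward step both shrink the state set substantially, while the bare permutation and the degenerate-CF-plus-feed-forward cases preserve all of it, matching the bullets above.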

Posted by: **gligoroski** (IP Logged)

Date: 19 July 2010 20:17

Dear Marsh,

From the link you have sent I quote:

"I did some Monte Carlo testing and found that my intuition was mostly wrong. The entropy loss effect is also observable with the Davies-Meyer construction where the chaining value from the previous block is added back into the result from the current block. In fact, under some circumstances, it makes it worse! "

Although your experiments are just Monte Carlo testing, they are interesting, and they confirm our initial intuition and feeling (mine and Klima's) that something is not completely OK and not ideally random with narrow-pipe designs.

But I think the research in this area has a long way to go until (if ever) any concrete practical results occur.

Posted by: **marshray** (IP Logged)

Date: 20 July 2010 19:22

When evaluating the effect of this phenomenon on actual hash designs, it's probably important to look inside the block structure as well. For example, SHA-1:

```
for i from 0 to 79                // thanks Wikipedia
    if 0 ≤ i ≤ 19 then            // "round A"
        f = (b and c) or ((not b) and d)
        k = 0x5A827999
    else if 20 ≤ i ≤ 39           // "round B"
        f = b xor c xor d
        k = 0x6ED9EBA1
    else if 40 ≤ i ≤ 59           // "round C"
        f = (b and c) or (b and d) or (c and d)
        k = 0x8F1BBCDC
    else if 60 ≤ i ≤ 79           // "round D"
        f = b xor c xor d
        k = 0xCA62C1D6
    ...
```

Notice that the two separate rounds A and C are composed of 'and', 'or', and 'not' operations (four times on each state variable). Does this on its own amount to a random function? Rounds B and D are simply 'xor' so we would expect no loss of entropy.

So does SHA-1 experience entropy loss due to missing preimages one time per block, or eight, or somewhere in between? To what degree can this be controlled by an attacker who can extend the input with chosen text?

SHA-256 appears to do some significantly nonlinear operations on two of the eight state variables in each of 64 steps. Does this amount to 16 consecutive random function applications?

This looks like it could knock a few additional bits out of the net effective entropy.

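One small, checkable piece of this question: the per-bit f-functions quoted above are all balanced Boolean functions, so none of them biases an individual output bit on its own; any missing-preimage effect would have to come from the full step (additions, rotations, message schedule) rather than from f in isolation. A quick truth-table check (my own sketch):

```python
from itertools import product

# The three per-bit f-functions from the SHA-1 rounds above.
def ch(b, c, d):     return ((b & c) | ((~b) & d)) & 1   # rounds "A" (choice)
def parity(b, c, d): return b ^ c ^ d                    # rounds "B" and "D"
def maj(b, c, d):    return (b & c) | (b & d) | (c & d)  # rounds "C" (majority)

# Count how often each outputs 1 over the 8 possible (b, c, d) inputs.
counts = {f.__name__: sum(f(*bits) for bits in product((0, 1), repeat=3))
          for f in (ch, parity, maj)}
print(counts)   # {'ch': 4, 'parity': 4, 'maj': 4} -- all balanced
```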

Posted by: **Orr** (IP Logged)

Date: 21 July 2010 09:37

marshray:

If you enter a random chaining value and a random message into SHA-1's compression function, you put in 512+160 bits of entropy. You would get, with very high probability, 160 bits of entropy coming out.

If you fix the message to a constant, then we expect to lose only ~0.5 bits of entropy, as the underlying block cipher (called SHACAL-1), which is used in a Davies-Meyer mode, is invertible (it's a block cipher!).

In other words, you will not be able to measure any entropy loss.

Posted by: **marshray** (IP Logged)

Date: 21 July 2010 16:41

In many real-world situations an attacker is able to extend the length of messages with chosen text in an attempt to engineer a collision. In these cases, there may be 160 bits of entropy coming in through the chaining values, but the entire message data could be an attacker-supplied constant value for many blocks.

Seems to me that as long as the ratio of (bits of entropy lost to missing preimages) to (log2 of appended constant blocks) remains greater than 2, an attacker seeking to collide messages would find it better to append constant blocks to each candidate message prefix rather than process those same blocks as different messages.

SHA-1 stands today at 2^63 for collision resistance. If this effect "stacks" on top of the other attacks (I don't know that it does) and reduces the collision resistance by a few bits with less than that much extra work, then it could make a real difference as to what year we see the first actual SHA-1 collision.


Posted by: **Vlastimil Klima** (IP Logged)

Date: 04 August 2010 23:05

Please excuse our delay. We will post an updated and more accurate version of the paper very soon.

Vlastimil Klima

Posted by: **hugo** (IP Logged)

Date: 05 August 2010 22:02

A few weeks ago, on 7-15-2010, in a discussion on the TLS WG mailing list regarding the applicability of this paper's results to TLS (and to practical crypto in general), I posted the following comments:

[www.ietf.org]

If I am right in my comments (I might have misunderstood the paper's claims), then the paper needs to be revised so as not to give the impression that its results have any bearing on practical cryptography. (As a theoretical issue, in particular in the setting of statistical extractors, this behavior of M-D functions under narrow pipes was already pointed out in the DGGKR paper of Crypto 2004.)

Note: I only refer to the general claims of the paper. I have not studied the paper's contents beyond that.
