Cryptographic Hash Function Problem


This is related to my school work. I've basically created an example problem that I need to solve in order to better understand some things.

Say we are given a 128 bit message, m

We have a secure hash function H(x) that produces 256 bit output

We want to find a 128 bit value, t, such that H(m || t) gives us a result that has 50 leading zero bits. Note: m || t is m concatenated with t.

The question is: how many times do we need to call the hash function before we can expect to find at least one value of t that satisfies our requirement?
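To make it concrete, here is roughly the search loop I have in mind (just a sketch of my own, using SHA-256 as a stand-in for H and a much smaller zero target so it actually finishes):

```python
import hashlib
import os

def leading_zero_bits(digest: bytes) -> int:
    # Count leading zero bits by treating the digest as a big integer.
    value = int.from_bytes(digest, "big")
    return len(digest) * 8 - value.bit_length()

def find_t(m: bytes, target_zeros: int) -> int:
    # Try t = 0, 1, 2, ... until H(m || t) has enough leading zero bits.
    t = 0
    while True:
        candidate = m + t.to_bytes(16, "big")        # t encoded as a 128-bit value
        digest = hashlib.sha256(candidate).digest()  # SHA-256 standing in for H
        if leading_zero_bits(digest) >= target_zeros:
            return t
        t += 1

m = os.urandom(16)   # a random 128-bit message
t = find_t(m, 20)    # 20 zero bits instead of 50, so this runs in seconds
print(t)
```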

All help is appreciated.

The question is not precise enough. Examples of related questions that are precise enough are:
* What is the expected number of attempts before finding a value that satisfies the requirement?
* How many attempts would guarantee a probability of at least 90% of finding a value that satisfies the requirement?

This is the question I'm trying to ask:

* What is the expected number of attempts before finding a value that satisfies the requirement?

The question is plenty precise enough, assuming the OP's definition of "cryptographic hash function" is the accepted one.

As for the question, you need to look at the input/output properties of a cryptographic hash function, specifically the bit independence criterion. What does this criterion tell you about what different values of t will give you?

I'll give you a hint: it is very slightly better to select t = 0, 1, 2, ... until you find a match than to select t at random, because the latter runs the risk of selecting the same t twice, thus wasting a hash function invocation (by the birthday bound, the probability of this happening over roughly 2^50 random draws from a 128-bit space is only on the order of 2^-29). Other than that, there is no difference between the two approaches. How come?

Another hint: the value of m is irrelevant to your expected probability of success - can you see why?

The rest should be straightforward, but please don't hesitate to ask.

Note it may be easier to approach this in the random oracle model, if you've studied this.

Another, even easier approach is to show that if H(X) is a secure n-bit hash function, and you remove any subset of bits from its output to form a "reduced" hash function H'(X), then H'(X) is still a secure m-bit hash function (where n - m bits were removed from the output of H(X)). Then, your problem boils down to a simple preimage argument.

Thanks for the post. Here's my thinking (please correct me if I'm off track):

The output bits all change seemingly randomly with no visible tie to any of the input bits.

So we are basically looking for the probability that a 256-bit output begins with 50 zeros, in a setting where all possible outputs occur with equal probability.

P(50 leading zeros) = (1 / 2^50)

So if we run the hash 2^50 times, we can expect to find a value of t that satisfies our requirement.
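As a quick sanity check of my own (not rigorous either), a simulation with a much smaller target should agree: with 12 leading zero bits, the average number of hash calls should come out near 2^12 = 4096 (again using SHA-256 as a stand-in for H):

```python
import hashlib
import os

def attempts_until_match(target_zeros: int) -> int:
    # Count hash calls until an output with enough leading zero bits appears.
    m = os.urandom(16)                # a fresh random 128-bit message
    t = 0
    while True:
        digest = hashlib.sha256(m + t.to_bytes(16, "big")).digest()
        value = int.from_bytes(digest, "big")
        if 256 - value.bit_length() >= target_zeros:
            return t + 1              # number of hash invocations used
        t += 1

trials = 200
average = sum(attempts_until_match(12) for _ in range(trials)) / trials
print(average)   # should land near 2^12 = 4096
```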

Not very rigorous, but yes, that is correct. The easiest way to show it (IMO) is to consider H as a random oracle, so that for distinct inputs X and Y, the bits of H(X) and H(Y) are all independent. Then the probability of any one bit being zero is 1/2, and hence the probability of the first 50 bits all being zero is 1/2^50. Since each attempt succeeds independently with probability 1/2^50, the number of attempts is geometrically distributed, and its expectation is 2^50.

On my last paragraph, you can also argue as follows: assume H is a secure 256-bit hash function, and define a new hash function H' which is simply H truncated to its first 50 bits. By the bit independence criterion, H' is a "secure" 50-bit hash function (in quotes because, while it technically has the required mathematical properties on 50 bits, it is far too short to be secure in practice). Then your problem is to find a preimage of {0}^50 under H', that is, to find X such that:

H'(X) = {0}^50

which by definition has complexity 2^50 (about 2^50 hash evaluations in expectation).

Note this is a bit out of reach for a single consumer-grade computer, but can be achieved in a few hours/days on a moderately sized cluster (say, 4 to 8 graphics cards).
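In code terms, the truncated hash is just something like this (a rough sketch, again with SHA-256 standing in for H; the name h_prime is mine):

```python
import hashlib

def h_prime(x: bytes) -> int:
    # H'(x): the first 50 bits of SHA-256(x), as an integer in [0, 2^50).
    digest = hashlib.sha256(x).digest()
    return int.from_bytes(digest, "big") >> (256 - 50)

# The search is then: find x with h_prime(x) == 0, which for an ideal
# 50-bit hash takes about 2^50 evaluations on average.
```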

Thanks a lot. Very helpful :)
