Is it possible to create a forged file which has the same checksums using two different algorithms?

A:

In theory you could do this. In practice, if you started from the two checksums provided by MD5 and SHA1 and tried to create a file that produced the same two checksums - it would be very difficult (many times more difficult than creating a file that produced the same MD5 checksum, or SHA1 checksum in isolation).

Sohnee 2009-07-24 12:31:47

+6 A:

"Would it be possible at all?" - yes, if the total size of the checksums is smaller than the total size of the file, it is impossible to avoid collisions.

"would it be possible in reality, with current hardware/software?" - if it is feasible to construct a text to match a given checksum for each of the checksums in use, then yes.

See wikipedia on concatenation of cryptographic hash functions, which is also a useful term to google for.

From that page:

"However, for Merkle-Damgård hash functions, the concatenated function is only as strong as the best component, not stronger. Joux noted that 2-collisions lead to n-collisions: if it is feasible to find two messages with the same MD5 hash, it is effectively no more difficult to find as many messages as the attacker desires with identical MD5 hashes. Among the n messages with the same MD5 hash, there is likely to be a collision in SHA-1. The additional work needed to find the SHA-1 collision (beyond the exponential birthday search) is polynomial. This argument is summarized by Finney."

moonshadow 2009-07-24 12:34:42

I wouldn't have searched for the term concatenation in this context, but I guess that is the answer to my question.That leaves only the question for the probability to do something like this in reality.

Mauli 2009-07-24 12:59:57

That latter question is answered by moonshadow too, in the quote: Using the two together is not substantially harder than using the best of the two on its own.

Nick Johnson 2009-07-24 21:36:20

A:

So I asked myself if it would be possible to create a file A' which has the same size, the same md5 sum, and the same sha1 sum as the original file A.

Yes, make a copy of the file.

Other than that, not without large amounts of computing resources to check tons of permutations (assuming the file size is non-trivial).

You can think of it like this:

If file size increases by n, the likelihood of a possible fake increases, but the computing costs necessary to test the combinations increases exponentially by 2^n.

So the bigger your file is, the more likely there is a dupe out there, but the less likely you are at finding it.

samoz 2009-07-24 12:39:51

+1 for the copy. ;)

Bombe 2009-07-24 13:04:02

-1 for sounding authoritative without actually being a cryptographer (see moonshadow's answer for the _correct_ assessment of the security of the two concatenated)

Nick Johnson 2009-07-24 21:37:24

@Nick: I fully agree. In crypto there are quite a number of surprising and unintuitive results. Concatenation of hashes is one of them and just plain guessing doesn't work here.

Accipitridae 2009-07-25 15:58:32

A:

In theory yes you can have it, in practice it's hell of a collusion. In practice no one even able to create a SHA1 collusion let alone MD5 + SHA1 + Size at the same time. This combination is simply impossible right now without having the whole computer power in the world and run it for a while.

Although in the close future we might see more vulnerabilities in SHA1 and MD5. And with the support of better hardware (especially GPU) why not.

dr. evil 2009-07-24 12:51:55

A:

For a naive answer, we'd have make some (incorrect) assumptions:

Both the SHA1 and MD5 hashing algorithms result in an even distribution of hash values for a set of random inputs
Algorithm details aside--a random input string has an equally likely chance of producing any hash value

(Basically, no clumping and nicely distributed domains.)

If the probability of discovering a string that collides with another's SHA1 hash is p1, and similarly p2 for MD5, the naive answer is the probability of finding one that collides with both is p1*p2.

However, the hashes are both broken, so we know our assumptions are wrong.

The hashes have clumping, are more sensitive to changes with some data than others, and in other words, aren't perfect. On the other hand, a perfect, non-broken hashing algorithm will have the above properties, and that's exactly what makes it hard to find collisions. They're random.

The probability intrinsically depends on the properties of the algorithm--basically, since our assumptions aren't valid, we can't "easily" determine how hard it is. In fact, the difficultly of finding input that collides likely depends very strongly on the characteristics of the input string itself. Some may be relatively easy (but still probably impractical on today's hardware), and due to the different nature of the two algorithms, some may actually be impossible.

2009-07-24 13:16:01

A:

MD5 is not useful for any cryptographic purpose.

http://www.mscs.dal.ca/~selinger/md5collision/

2009-07-24 15:56:10

This doesn't answer the question...

Adam Byrtek 2009-07-25 00:07:40

ansaurus

tags:

views:

answers:

Is it possible to create a forged file which has the same checksums using two different algorithms?

related questions