A Real MD5 Collision
A paper by Xiaoyun Wang, Dengguo Feng, Xuejia Lai and Hongbo Yu, "Collisions for Hash Functions MD4, MD5, HAVAL-128 and RIPEMD", was posted on Aug 17, 2004. It shows that suitably chosen input vectors produce collisions for the MD5 hash, i.e. two different inputs with the same hash value. Example:
Input vector 1:
d1 31 dd 02 c5 e6 ee c4 69 3d 9a 06 98 af f9 5c
2f ca b5 87 12 46 7e ab 40 04 58 3e b8 fb 7f 89
55 ad 34 06 09 f4 b3 02 83 e4 88 83 25 71 41 5a
08 51 25 e8 f7 cd c9 9f d9 1d bd f2 80 37 3c 5b
d8 82 3e 31 56 34 8f 5b ae 6d ac d4 36 c9 19 c6
dd 53 e2 b4 87 da 03 fd 02 39 63 06 d2 48 cd a0
e9 9f 33 42 0f 57 7e e8 ce 54 b6 70 80 a8 0d 1e
c6 98 21 bc b6 a8 83 93 96 f9 65 2b 6f f7 2a 70
Input vector 2:
d1 31 dd 02 c5 e6 ee c4 69 3d 9a 06 98 af f9 5c
2f ca b5 07 12 46 7e ab 40 04 58 3e b8 fb 7f 89
55 ad 34 06 09 f4 b3 02 83 e4 88 83 25 f1 41 5a
08 51 25 e8 f7 cd c9 9f d9 1d bd 72 80 37 3c 5b
d8 82 3e 31 56 34 8f 5b ae 6d ac d4 36 c9 19 c6
dd 53 e2 34 87 da 03 fd 02 39 63 06 d2 48 cd a0
e9 9f 33 42 0f 57 7e e8 ce 54 b6 70 80 28 0d 1e
c6 98 21 bc b6 a8 83 93 96 f9 65 ab 6f f7 2a 70
Identical MD5 value, verified with WinHex: 79054025255fb1a26e4bc422aef54eb4
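The collision can also be reproduced without WinHex. The following is a minimal sketch in Python, assuming only the standard hashlib module; the variable names are illustrative and not taken from the paper. It hashes the two input vectors exactly as printed above, lists the byte offsets at which they differ, and compares the digests.

```python
import hashlib

# The two 128-byte input vectors, copied from the listing above.
# bytes.fromhex() ignores the spaces and line breaks (Python 3.7+).
VECTOR_1 = bytes.fromhex("""
d1 31 dd 02 c5 e6 ee c4 69 3d 9a 06 98 af f9 5c
2f ca b5 87 12 46 7e ab 40 04 58 3e b8 fb 7f 89
55 ad 34 06 09 f4 b3 02 83 e4 88 83 25 71 41 5a
08 51 25 e8 f7 cd c9 9f d9 1d bd f2 80 37 3c 5b
d8 82 3e 31 56 34 8f 5b ae 6d ac d4 36 c9 19 c6
dd 53 e2 b4 87 da 03 fd 02 39 63 06 d2 48 cd a0
e9 9f 33 42 0f 57 7e e8 ce 54 b6 70 80 a8 0d 1e
c6 98 21 bc b6 a8 83 93 96 f9 65 2b 6f f7 2a 70
""")

VECTOR_2 = bytes.fromhex("""
d1 31 dd 02 c5 e6 ee c4 69 3d 9a 06 98 af f9 5c
2f ca b5 07 12 46 7e ab 40 04 58 3e b8 fb 7f 89
55 ad 34 06 09 f4 b3 02 83 e4 88 83 25 f1 41 5a
08 51 25 e8 f7 cd c9 9f d9 1d bd 72 80 37 3c 5b
d8 82 3e 31 56 34 8f 5b ae 6d ac d4 36 c9 19 c6
dd 53 e2 34 87 da 03 fd 02 39 63 06 d2 48 cd a0
e9 9f 33 42 0f 57 7e e8 ce 54 b6 70 80 28 0d 1e
c6 98 21 bc b6 a8 83 93 96 f9 65 ab 6f f7 2a 70
""")

# Zero-based byte offsets at which the two vectors differ.
diff_offsets = [i for i, (a, b) in enumerate(zip(VECTOR_1, VECTOR_2)) if a != b]

md5_1 = hashlib.md5(VECTOR_1).hexdigest()
md5_2 = hashlib.md5(VECTOR_2).hexdigest()
sha256_1 = hashlib.sha256(VECTOR_1).hexdigest()
sha256_2 = hashlib.sha256(VECTOR_2).hexdigest()

print("differing offsets:", diff_offsets)
print("MD5 vector 1:     ", md5_1)
print("MD5 vector 2:     ", md5_2)
print("MD5 identical:    ", md5_1 == md5_2)
print("SHA-256 identical:", sha256_1 == sha256_2)
```

The vectors differ in only six bytes, and in each of them only the highest bit is changed. The MD5 digests nevertheless coincide, while the SHA-256 digests of the same two inputs do not.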
Another example consists of two different HTML files that give the same MD5 hash (available for download; view with Firefox).
Conclusion:
MD5 is not as safe as one would expect of a 128-bit hash. X-Ways Forensics and WinHex allow the SHA-256 hash to be used instead of MD5; based on today's knowledge, SHA-256 is not known to suffer from such weaknesses. From X-Ways' point of view, the main implication of the weakness of MD5 for computer forensics is that sufficient computing power might make it possible to manipulate files (e.g. by appending data) in such a way that their MD5 values match those of known irrelevant files, so that they go unnoticed in a forensic examination.
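Outside of X-Ways Forensics and WinHex, the same precaution can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a description of how either product computes hashes: the function name file_digests and the chunk size are hypothetical choices, and only the standard hashlib module is used. The file is read once and both digests are computed in the same pass, so that an MD5 value kept for compatibility with existing hash sets is always accompanied by a SHA-256 value.

```python
import hashlib

def file_digests(path, chunk_size=1024 * 1024):
    """Return (md5_hex, sha256_hex) for the file at `path`.

    The file is read in chunks so that large evidence files do not
    have to fit into memory. `chunk_size` is an arbitrary choice.
    """
    md5 = hashlib.md5()
    sha256 = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel stops at the empty bytes object at EOF.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            md5.update(chunk)
            sha256.update(chunk)
    return md5.hexdigest(), sha256.hexdigest()
```

A file would then be treated as a known irrelevant file only if its SHA-256 value matches the reference value, not on the strength of an MD5 match alone.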