[arch-general] Packages Verified with MD5

Никола Вукосављевић hauzer at gmx.com
Sun Jan 12 14:02:00 EST 2014

Hash: SHA1

On 12.1.2014 19:29, Taylor Hornby wrote:
> On 01/12/2014 10:11 AM, Mark Lee wrote:
>> Perhaps I'm not strong enough in mathematics but I'd like to know
>> how possible md5 collisions can be weaponized. From what I see,
>> the idea would be to modify a binary such that it contains
>> malicious code (without changing the md5sum). Since most security
>> packages contain a number of compilation tests and md5 hashes
>> vary significantly with slight modifications, I'd like to know
>> how these collisions can be used to hijack a system. If one has
>> to build a binary that doesn't even encompass the functionality
>> of the binary it's trying to mimic, wouldn't that severely
>> decrease the effectiveness of a hash collision?
> The problem with MD5 is that it's possible to find two different 
> messages A and B such that MD5(A) = MD5(B). The birthday attack
> [1] makes it possible to find collisions for 128-bit hashes (like
> MD5) in about 2^64 operations, but there are MD5-specific attacks
> that make it even easier for MD5. So... it takes a bit of time to
> find a collision, but not too much time. It can be done in about a
> day with PS3's [2].
> For my explanation, I'll ignore the MD5-specific attacks and just
> use the birthday attack since it's easier to understand.
> The attacker starts out with two files:
> 1. The official copy of TrueCrypt.
> 2. A malicious copy of TrueCrypt.
> The malicious change would be something simple. For example, they
> could find where the master encryption key is copied into a
> buffer:
> ... MOV	[master_key + ECX], EAX     // EAX holds master key bits 
> ...
> And change it so that it's always zero:
> ... MOV	[master_key + ECX], 0       // Now the master key is set to
> zero ...
> Only one instruction is different, but now TrueCrypt encrypts
> everything with a null key.
> From these two copies copies of TrueCrypt, they generate 2^64
> variants of each. They can do this by re-ordering instructions,
> changing which registers are used for which variables, re-ordering
> ELF segments, or just appending random strings to the file (you can
> append stuff to an ELF binary and it will still run fine).
> Now the attacker has:
> 1. 2^64 files that are different, but behave exactly like the 
> non-malicious official copy of TrueCrypt.
> 2. 2^64 files that are different, but behave exactly like the
> malicious copy of TrueCrypt (that uses zero keys or whatever).
> If MD5 was a good hash function, each of the files would hash to a 
> random 128-bit value. You should now be able to see that at least
> one file from the "non-malicious" set of files probably has the
> same MD5 hash as one of the files from the "malicious" set of
> files.
> There are 2^64 * 2^64 = 2^128 ways of pairing a file from the 
> non-malicious pile with one from the malicious pile. The chance of
> any one pair colliding is 1 in 2^128, so there's a good chance that
> at least one of the pairs collide (it's not exactly 100% since the
> pairs are not mutually independent).
> To find the collision quickly, the attacker can just sort one of
> the sets of files by hash value, then for each file in the other
> set, use a binary search to see if any file has the same hash.
> Ignoring log(n) factors, that would take on the order of 2^64
> operations.
> So... Now the attacker has two files with the same MD5 hash. One
> that behaves like the official copy of TrueCrypt, and another that
> uses null keys.
> So, they give the *good* copy to the package maintainer, who
> laboriously checks that it's not backdoored (as if that ever
> happens). The package maintainer thinks everything is fine, since
> it behaves exactly like TrueCrypt and there's nothing fishy about
> the file (they probably won't notice some instructions being
> reordered). The file might even be the same as the one from the
> TrueCrypt website, if the attacker is collaborating with the
> TrueCrypt developers.
> The package maintainer thinks everything is good, so they hard-code
> the MD5 hash of their good copy into the PKGBUILD.
> Now, when most users install TrueCrypt, they will download the
> *good* copy, check the MD5 hash, it will match, and everything will
> be good. However, when the adversary sees that his target (e.g. a
> terrorist, or activist/journalist living in a non-free country) is
> downloading TrueCrypt, they will silently replace the file with
> with the *bad* copy. When the MD5 gets checked, it will match, and
> the user will be encrypting their files with null keys.
> In total, it took the attacker on the order of 2^66 operations to
> find the colliding files. You'd need a good budget (like the NSA's,
> or a large company's) to do that, but if you take advantage of the 
> MD5-specific attacks too, you can do it for much cheaper.
> Of course, this doesn't just apply to TrueCrypt, it could be
> applied to any other software that's checked only with an MD5 hash.
> To backdoor a web server, for example, you could change a constant
> used in array index bounds checking to introduce a buffer overflow
> vulnerability.
> [1] https://en.wikipedia.org/wiki/Birthday_attack
> [2] http://www.win.tue.nl/hashclash/rogue-ca/

Good stuff. Thanks for your time.
Version: GnuPG v2.0.19 (MingW32)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/


More information about the arch-general mailing list