On 1/22/20 8:25 PM, Artur Juraszek wrote:
Hi all,
While poking through Arch's package system, I noticed that, despite its bad reputation, MD5 remains the default hashing algorithm for file integrity verification, and even some kind of a "recommendation", due to its presence in the example PKGBUILDs.
Is there a reason not to change it to a more future-proof one? I mean, at least for now, it seems good enough to protect against a so-called "2nd preimage attack", which is the primary concern in the classic file verification scenario, BUT:
a) given the huge size of the AUR and its rather chaotic nature, it is not that hard to imagine _a_ malicious upstream which could try to sneak some nasty changes into its own files, with the AUR maintainer not noticing anything - leveraging flaws which do exist and are quite well-explored even today.
b) it has already shown its weaknesses and it is not going to get any better - the only research direction is to find more (practical) attacks against MD5, so the faster the change, the fewer people possibly affected in the future
Attaching a patch which, I think, replaces MD5 with SHA256 as a default completely - it's my first change in ABS-related code, though, so please do not hesitate to criticize if something's wrong ;]
This point has been raised a number of times. Here's the standard answer: checksums are not intended to prove that the download is trusted. It is an "integrity checksum", not an "authenticity checksum". Its purpose is to validate that you didn't have an interrupted download, or have a cached version of an unversioned filename, or that gamma rays didn't modify your disk and corrupt everything, etc.

If you want authenticity, that is what PGP is for. PGP actually proves that the source code was created or authorized by a given *identity*.

The default checksums are not used by security-conscious people, and aren't used in PKGBUILDs that define a different checksum anyway. Security-conscious people also do out-of-band checks, or inspect the sources (maybe diffing them against the known previous version).

The opinion of Allan, the lead developer, is that defaulting to md5sums is a good thing -- it makes it easy to spot people who just use the defaults and run updpkgsums without checking anything, whereas defaulting to sha256sums would result in him ironically having *less* trust in the package. "Securely verifying the unknown is still the unknown."

...

I happen to think that there is merit to using sha256sums by default, on one specific rationale: it provides TOFU (Trust On First Use). For people who are totally careless about their packaging, you cannot trust the sources they list, but you can at least make sure that everyone is using the same sources, and that if an attacker attacked the source *after* the PKGBUILD is uploaded, they cannot maliciously change the sources which *some* people use. (But attackers can still successfully attack the source before it is first used, and then you're screwed.)

...

So ultimately that is what this discussion will always devolve to:

- Do we want to ensure TOFU?
- Do we want to give PKGBUILDs the default black mark "uses md5sums because maintainer doesn't care about researching sources"?

-- 
Eli Schwartz
Bug Wrangler and Trusted User
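
To make the TOFU argument above concrete, here is a minimal Python sketch of the idea. It is not how makepkg or updpkgsums work internally; the function names `sha256_of` and `tofu_check` and the JSON pin file are hypothetical, chosen only for illustration. The digest is recorded the first time a source file is seen, and every later copy is compared against that record.

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def tofu_check(source: Path, pin_file: Path) -> str:
    """Pin the digest on first use; compare against the pin afterwards.

    Returns "pinned" (first use), "match", or "mismatch".
    The pin file is a hypothetical JSON map of file name -> digest.
    """
    pins = json.loads(pin_file.read_text()) if pin_file.exists() else {}
    actual = sha256_of(source)
    pinned = pins.get(source.name)
    if pinned is None:
        # First use: nothing to compare against, so the digest is trusted as-is.
        pins[source.name] = actual
        pin_file.write_text(json.dumps(pins, indent=2))
        return "pinned"
    return "match" if pinned == actual else "mismatch"
```

In this analogy the checksum array committed in a PKGBUILD plays the role of the pin file: a source swapped out after the pin was recorded shows up as "mismatch", while a source that was already malicious at first use is pinned without complaint, which is exactly the limitation noted above.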