[pacman-dev] [PATCH] Replace MD5 with SHA-256 as a default file integrity check in PKGBUILDs
Hi all,

While poking through Arch's package system, I noticed that, despite its bad reputation, MD5 remains the default hashing algorithm for file integrity verification, and even some kind of a "recommendation", due to its presence in the example PKGBUILDs.

Is there a reason not to change it to a more future-proof one? I mean, at least for now, it seems good enough to protect against a so-called "2nd preimage attack", which is the primary concern in the classic file verification scenario, BUT:

a) given the huge size of the AUR and its rather chaotic nature, it is not that hard to imagine _a_ malicious upstream which could try to sneak some nasty changes into its own files, with the AUR maintainer not noticing anything - leveraging flaws which do exist and are quite well-explored even today.

b) it has already shown its weaknesses and it is not going to get any better - the only research direction is to find more (practical) attacks against MD5, so the faster the change, the fewer people possibly affected in the future.

Attaching a patch which, I think, replaces MD5 with SHA256 as a default completely - it's my first change in ABS-related code, though, so please do not hesitate to criticize if something's wrong ;]

--
Artur Juraszek
On 23/1/20 11:25 am, Artur Juraszek wrote:
Hi all,
While poking through Arch's package system, I noticed that, despite its bad reputation, MD5 remains the default hashing algorithm for file integrity verification, and even some kind of a "recommendation", due to its presence in the example PKGBUILDs.
Is there a reason not to change it to a more future-proof one? I mean, at least for now, it seems good enough to protect against a so-called "2nd preimage attack", which is the primary concern in the classic file verification scenario, BUT:
a) given the huge size of the AUR and its rather chaotic nature, it is not that hard to imagine _a_ malicious upstream which could try to sneak some nasty changes into its own files, with the AUR maintainer not noticing anything - leveraging flaws which do exist and are quite well-explored even today.
b) it has already shown its weaknesses and it is not going to get any better - the only research direction is to find more (practical) attacks against MD5, so the faster the change, the fewer people possibly affected in the future.
Attaching a patch which, I think, replaces MD5 with SHA256 as a default completely - it's my first change in ABS-related code, though, so please do not hesitate to criticize if something's wrong ;]
This change is not happening.

Any checksum is insecure when added to a PKGBUILD using "makepkg -g", which is all the default value does. The person writing a PKGBUILD needs to use what is provided upstream (or even a PGP signature), in which case the default in makepkg does not make a difference.

Allan
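To make the distinction concrete, here is a minimal sketch of a PKGBUILD fragment in which the checksum is copied from upstream and a detached PGP signature is verified as well. The package name, URLs, checksum and key fingerprint are all placeholders invented for illustration; they do not come from this thread or from any real package.

```shell
# Hypothetical PKGBUILD fragment; every value below is a placeholder.
pkgname=example
pkgver=1.0

# The tarball plus its detached signature, both fetched from upstream.
source=("https://example.org/${pkgname}-${pkgver}.tar.gz"
        "https://example.org/${pkgname}-${pkgver}.tar.gz.sig")

# Checksum copied from upstream's release notes rather than generated
# locally with "makepkg -g". The signature file itself is skipped.
sha256sums=('0000000000000000000000000000000000000000000000000000000000000000'
            'SKIP')

# Full fingerprint of the upstream release key, obtained out of band.
validpgpkeys=('0000000000000000000000000000000000000000')
```

With entries like these, the algorithm that makepkg would pick by default never comes into play, which is the point made above.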
On Thu, 2020-01-23 at 02:25 +0100, Artur Juraszek wrote:
Hi all,
While poking through Arch's package system, I noticed that, despite its bad reputation, MD5 remains the default hashing algorithm for file integrity verification, and even some kind of a "recommendation", due to its presence in the example PKGBUILDs.
Is there a reason not to change it to a more future-proof one? I mean, at least for now, it seems good enough to protect against a so-called "2nd preimage attack", which is the primary concern in the classic file verification scenario, BUT:
a) given the huge size of the AUR and its rather chaotic nature, it is not that hard to imagine _a_ malicious upstream which could try to sneak some nasty changes into its own files, with the AUR maintainer not noticing anything - leveraging flaws which do exist and are quite well-explored even today.
b) it has already shown its weaknesses and it is not going to get any better - the only research direction is to find more (practical) attacks against MD5, so the faster the change, the fewer people possibly affected in the future.
Attaching a patch which, I think, replaces MD5 with SHA256 as a default completely - it's my first change in ABS-related code, though, so please do not hesitate to criticize if something's wrong ;]
-- Artur Juraszek
I think we should change it to sha512 instead. sha256 and sha512 are pretty similar, but sha512 is faster on 64-bit machines. Since 64-bit is the new standard for high-power computing, and the only architecture we support, it would be more beneficial to choose sha512.

A quick benchmark on my machine confirms this:

$ dd if=/dev/zero of=example.img bs=4096 count=512000
512000+0 records in
512000+0 records out
2097152000 bytes (2.1 GB, 2.0 GiB) copied, 2.77283 s, 756 MB/s

$ time sha256sum example.img
274fbb979251bcaceab594dd89d5adfec310e8851e320b5b5f90fd5f18d76149  example.img
real 4.79
user 4.47
sys 0.30

$ time sha512sum example.img
241497cb61e24fcdaf33a13f5635951ff7c21cb27904e6f3de7b221031b0216800cbce1a667a66aafbdb7ffbfe2a39564b4cb48efea1d3721093fa7663e7a8c9  example.img
real 3.33
user 3.09
sys 0.21

sha512 is ~1.5s than sha256 when calculating the checksum of a 2GiB zero-ed file.

Thank you,
Filipe Laíns
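For anyone who wants to repeat the comparison, the benchmark above can be wrapped in a small script. This is only a sketch: the helper name time_hash and the 64 MiB file size are arbitrary choices made here, and it assumes GNU coreutils (sha256sum, sha512sum, and a date that supports %N). Results depend on the CPU and on how the tools were built, so the ordering may differ on other machines.

```shell
#!/bin/sh
# Sketch of a repeatable sha256sum vs sha512sum timing comparison.
# Assumes GNU coreutils; time_hash is a helper name made up for this example.

# time_hash TOOL FILE: print how many milliseconds TOOL takes to hash FILE.
time_hash() {
    start=$(date +%s%N)              # nanoseconds since the epoch (GNU date)
    "$1" "$2" >/dev/null
    end=$(date +%s%N)
    echo $(( (end - start) / 1000000 ))
}

img=$(mktemp)

# 64 MiB of zeros; the size is an arbitrary choice for illustration.
dd if=/dev/zero of="$img" bs=1048576 count=64 2>/dev/null

for tool in sha256sum sha512sum; do
    echo "$tool: $(time_hash "$tool" "$img") ms"
done

rm -f "$img"
```

Running it a few times and comparing the numbers gives a better picture than a single run, since the first pass may include the cost of reading the file into the page cache.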
On Thu, 2020-01-23 at 01:36 +0000, Filipe Laíns wrote:
On Thu, 2020-01-23 at 02:25 +0100, Artur Juraszek wrote:
Hi all,
While poking through Arch's package system, I noticed that, despite its bad reputation, MD5 remains the default hashing algorithm for file integrity verification, and even some kind of a "recommendation", due to its presence in the example PKGBUILDs.
Is there a reason not to change it to a more future-proof one? I mean, at least for now, it seems good enough to protect against a so-called "2nd preimage attack", which is the primary concern in the classic file verification scenario, BUT:
a) given the huge size of the AUR and its rather chaotic nature, it is not that hard to imagine _a_ malicious upstream which could try to sneak some nasty changes into its own files, with the AUR maintainer not noticing anything - leveraging flaws which do exist and are quite well-explored even today.
b) it has already shown its weaknesses and it is not going to get any better - the only research direction is to find more (practical) attacks against MD5, so the faster the change, the fewer people possibly affected in the future.
Attaching a patch which, I think, replaces MD5 with SHA256 as a default completely - it's my first change in ABS-related code, though, so please do not hesitate to criticize if something's wrong ;]
-- Artur Juraszek
I think we should change it to sha512 instead. sha256 and sha512 are pretty similar, but sha512 is faster on 64-bit machines. Since 64-bit is the new standard for high-power computing, and the only architecture we support, it would be more beneficial to choose sha512.
A quick benchmark on my machine confirms this:
$ dd if=/dev/zero of=example.img bs=4096 count=512000
512000+0 records in
512000+0 records out
2097152000 bytes (2.1 GB, 2.0 GiB) copied, 2.77283 s, 756 MB/s

$ time sha256sum example.img
274fbb979251bcaceab594dd89d5adfec310e8851e320b5b5f90fd5f18d76149  example.img
real 4.79
user 4.47
sys 0.30

$ time sha512sum example.img
241497cb61e24fcdaf33a13f5635951ff7c21cb27904e6f3de7b221031b0216800cbce1a667a66aafbdb7ffbfe2a39564b4cb48efea1d3721093fa7663e7a8c9  example.img
real 3.33
user 3.09
sys 0.21

sha512 is ~1.5s than sha256 when calculating the checksum of a 2GiB
                ^ *faster
zero-ed file.
Thank you,
Filipe Laíns
On 1/22/20 8:25 PM, Artur Juraszek wrote:
Hi all,
While poking through Arch's package system, I noticed that, despite its bad reputation, MD5 remains the default hashing algorithm for file integrity verification, and even some kind of a "recommendation", due to its presence in the example PKGBUILDs.
Is there a reason not to change it to a more future-proof one? I mean, at least for now, it seems good enough to protect against a so-called "2nd preimage attack", which is the primary concern in the classic file verification scenario, BUT:
a) given the huge size of the AUR and its rather chaotic nature, it is not that hard to imagine _a_ malicious upstream which could try to sneak some nasty changes into its own files, with the AUR maintainer not noticing anything - leveraging flaws which do exist and are quite well-explored even today.
b) it has already shown its weaknesses and it is not going to get any better - the only research direction is to find more (practical) attacks against MD5, so the faster the change, the fewer people possibly affected in the future.
Attaching a patch which, I think, replaces MD5 with SHA256 as a default completely - it's my first change in ABS-related code, though, so please do not hesitate to criticize if something's wrong ;]
This point has been raised a number of times. Here's the standard answer: checksums are not intended to prove that the download is trusted. A checksum is an "integrity checksum", not an "authenticity checksum". Its purpose is to validate that you didn't have an interrupted download, or have a cached version of an unversioned filename, or that gamma rays didn't modify your disk and corrupt everything, etc.

If you want authenticity, that is what PGP is for. PGP actually proves that the source code was created or authorized by a given *identity*.

The default checksums are not used by security-conscious people, and aren't used in PKGBUILDs that define a different checksum anyway. Security-conscious people also do out-of-band checks, or inspect the sources (maybe diffing them against the known previous version).

The opinion of Allan, the lead developer, is that defaulting to md5sums is a good thing -- it makes it easy to spot people who just use the defaults and run updpkgsums without checking anything, whereas defaulting to sha256sums would result in him ironically having *less* trust in the package. "Securely verifying the unknown is still the unknown."

...

I happen to think that there is merit to using sha256sums by default, on one specific rationale: it provides TOFU (Trust On First Use). For people who are totally careless about their packaging, you cannot trust the sources they list, but you can at least make sure that everyone is using the same sources, and that if an attacker attacked the source *after* the PKGBUILD is uploaded, they cannot maliciously change the sources which *some* people use. (But attackers can still successfully attack the source before it is first used, and then you're screwed).

...

So ultimately that is what this discussion will always devolve to:

- Do we want to ensure TOFU?
- Do we want to give PKGBUILDs the default black mark "uses md5sums because maintainer doesn't care about researching sources"?

--
Eli Schwartz
Bug Wrangler and Trusted User
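The TOFU property described above can be sketched with the check mode of sha256sum. The file names and contents below are made up for illustration, and the script assumes GNU coreutils. It shows only that a hash recorded at first use catches a later change; it says nothing about whether the first copy was trustworthy.

```shell
#!/bin/sh
# Sketch of Trust On First Use with a pinned checksum.
# All file names and contents are placeholders for illustration.

workdir=$(mktemp -d)
src="$workdir/source.tar.gz"      # stand-in for a downloaded source file
pin="$workdir/pinned.sha256"      # stand-in for the checksum in a PKGBUILD

printf 'original release\n' > "$src"
sha256sum "$src" > "$pin"         # "first use": record whatever we got

# Same bytes as at pin time: verification succeeds.
first_check=mismatch
if sha256sum -c "$pin" >/dev/null 2>&1; then
    first_check=ok
fi

# The source is replaced after the checksum was pinned.
printf 'tampered release\n' > "$src"
second_check=ok
if ! sha256sum -c "$pin" >/dev/null 2>&1; then
    second_check=mismatch
fi

echo "before change: $first_check, after change: $second_check"

rm -rf "$workdir"
```

If the file had already been tampered with before the hash was recorded, both checks would pass, which is the limitation noted in the parenthetical above.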
On January 22, 2020 23:30, Eli Schwartz wrote:
So ultimately that is what this discussion will always devolve to:
- Do we want to ensure TOFU?
Yes.
- Do we want to give PKGBUILDs the default black mark "uses md5sums because maintainer doesn't care about researching sources"?
No. Encouraging best packaging practices can and should be done right from the start.

This discussion is pointless though. Let's continue to use md5sums until it's completely broken, then we can switch to something else.

Regards,
Giancarlo Razzolini
On 1/23/20 8:32 AM, Giancarlo Razzolini wrote:
On January 22, 2020 23:30, Eli Schwartz wrote:
So ultimately that is what this discussion will always devolve to:
- Do we want to ensure TOFU?
Yes.
- Do we want to give PKGBUILDs the default black mark "uses md5sums because maintainer doesn't care about researching sources"?
No. Encouraging best packaging practices can and should be done right from the start.
This discussion is pointless though. Let's continue to use md5sums until it's completely broken, then we can switch to something else.
Then I'm sure you'll be delighted to know that the last time this discussion was brought up (a couple years ago?) Allan said he wanted to add "cksum" support and switch to that for a default.

Rationale: both md5sum and cksum are already completely broken, but no one deludes themselves when they see "cksum" into thinking that it is anything but deliberate, and no one deludes themselves into thinking that there is any possibility it is secure.

(The same thing is true of md5sum, both that its presence in makepkg is deliberate, and that it's not even intended to be secure. The difference is that with md5sum, people can lie to themselves about both.)

And, sure enough, someone brought up the discussion again, and, sure enough, Allan has fulfilled his promise with the patch submission which is a response to this thread:

"makepkg: add CRC checksums and set these to be the default"

--
Eli Schwartz
Bug Wrangler and Trusted User
On January 23, 2020 11:59, Eli Schwartz wrote:
Then I'm sure you'll be delighted to know that the last time this discussion was brought up (a couple years ago?) Allan said he wanted to add "cksum" support and switch to that for a default. Rationale: both md5sum and cksum are already completely broken, but no one deludes themselves when they see "cksum" into thinking that it is anything but deliberate, and no one deludes themselves into thinking that there is any possibility it is secure.
That's the opposite of encouraging best practices, but this horse is long dead, and there's nothing else to beat.
"makepkg: add CRC checksums and set these to be the default"
No comment on this one.

Regards,
Giancarlo Razzolini
participants (5)
- Allan McRae
- Artur Juraszek
- Eli Schwartz
- Filipe Laíns
- Giancarlo Razzolini