Re: [arch-general] Stronger Hashes for PKGBUILDs
Hi,

I'm intentionally using the title from Nov/Dec 2016 [0] to ease googling. I decided to check the status of this, and there are still 325 packages with only md5sums in [core] and [extra] (I didn't check [community]). The results below were generated by the attached script...

Is there anything I can do (like sending reports to the Flyspray) to help convert those PKGBUILDs to SHA hashes?

Thanks,
L.

-----
Stats
-----
Total: 325
[core]: 28
[extra]: 283
removed: 14

--------------
Repo breakdown
--------------
[core]
b43-fwcutter cracklib dmraid fakeroot filesystem flex hdparm ipw2100-fw ipw2200-fw libaio libgssglue libnsl librpcsecgss libsasl libusb licenses linux-atm nfsidmap nilfs-utils pam pcmciautils pptpclient reiserfsprogs sysfsutils usbutils which xinetd zd1211-firmware

[extra]
a52dec accounts-qml-module aiksaurus alsa-firmware alsa-lib antlr2 apricots archlinux-menus archlinux-themes-slim arptables aspell-de aspell-es aspell-fr aspell-nl assimp attica-qt4 autoconf2.13 automoc4 bigloo bird bluez-firmware bogofilter bootchart capi4hylafax celt celt0.5.1 chemtool chmlib claws-mail-themes compface convertlit convmv cpio cscope ctags cups-pdf cvsps cyrus-sasl dbus-sharp dbus-sharp-glib ddrescue dmapi docbook-mathml docbook-xml dotconf ebtables enscript exiv2 facile fakechroot festival ffcall freealut frozen-bubble fsarchiver gamin gconf-sharp gd gdome2 giblib giflib gnome-mime-data gnu-efi-libs gnu-netcat gob2 gtkglext gtkmathview guile1.8 gwaterfall habak hunspell-it hunspell-ro hwdetect hylafax hyphen-de hyphen-it hyphen-nl hyphen-ro icon-naming-utils ifplugd ijs ilmbase imap iniparser ipset jack java-commons-net1 java-gnumail java-inetlib java-jdepend java-jline java-resolver lablgtk2 ladspa lame libaccounts-glib libatasmart libcaca libcdaudio libdbusmenu-qt libdiscid libdmtx libfbclient libgadu libglade libglademm libid3tag libiec61883 libieee1284 libifp libirman liblangtag liblqr libmcrypt libmp3splt libmp4v2 libmpeg2 libmusicbrainz5 libnet libnss_nis libomxil-bellagio libshout libsignon-glib libstdc++5 libtommath libusb-compat libutempter libxkbui libxmi libyaml licq lockdown-ms lua51 lua52 luajson lzop metalog mjpegtools mkisolinux mkpxelinux mksyslinux mod_dnssd mozilla-common mp3splt mp3wrap mtdev munin musepack mysql-python mythes-de mythes-en mythes-fr mythes-it mythes-ro nawk ncftp npapi-sdk nss_ldap nss-mdns nuget openbabel openconnect openexr oxygen-gtk2 pam_ldap perl-appconfig perl-bit-vector perl-business-isbn-data perl-carp-clan perl-common-sense perl-config-simple perl-convert-binhex perl-crypt-openssl-rsa perl-crypt-passwdmd5 perl-crypt-ssleay perl-date-calc perl-digest-hmac perl-digest-nilsimsa perl-digest-sha1 perl-email-date-format perl-ev perl-event-execflow perl-extutils-depends perl-extutils-pkgconfig perl-fcgi perl-file-sharedir perl-file-which perl-gtk2-ex-formfactory perl-guard perl-html-parser perl-html-tagset perl-image-exiftool perl-ipc-system-simple perl-locale-gettext perl-log-log4perl perl-mail-spf perl-math-round perl-mime-lite perl-netaddr-ip perl-net-cidr-lite perl-net-ip perl-sdl perl-string-shellquote perl-sys-hostname-long perl-template-toolkit perl-term-readkey perl-text-iconv perl-tie-simple perl-timedate perl-xml-namespacesupport perl-xml-sax-base php-apcu php-apcu-bc pkgstats progsreiserfs protobuf psiconv psutils pulseaudio-alsa pyopengl pyqt4 pyrex pysmbc python2-configparser python-appdirs python-defusedxml python-fpconst python-geoip python-mccabe python-mpd python-nose python-notify python-soappy qrencode qt-assistant-compat raptor rasqal razor refind-efi rpcsvc-proto rpmextract sane sane-frontends sbc schroedinger scons sg3_utils signon-plugin-oauth2 signon-ui slim-themes snappy snarf sni-qt sonata sound-theme-freedesktop source-highlight spandsp speedtouch spice-protocol taglib telepathy-idle telepathy-salut testdisk tevent tftp-hpa thinkfinger tidy tree ttf-bitstream-vera unixodbc wicd xalan-java xerces2-java xerces-c xorg-fonts-100dpi xorg-fonts-75dpi xorg-fonts-alias xorg-fonts-cyrillic xorg-fonts-type1 xsane yajl zita-alsa-pcmi zvbi

[0] https://lists.archlinux.org/pipermail/arch-general/2016-December/042

-- Leonid Isaev
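The script attached to the original post is not reproduced in this archive. As a rough illustration of the kind of check described above, here is a minimal Python sketch that flags PKGBUILDs declaring only md5sums. The function names, the directory layout, and the list of checksum array names are assumptions made for illustration; this is not the author's script. It matches text only and does not evaluate the PKGBUILD as bash.

```python
import re
from pathlib import Path

# Assumed set of checksum array names; architecture-specific variants such as
# md5sums_x86_64 are matched through the optional suffix.
_SUMS_RE = re.compile(
    r"^\s*(md5|sha1|sha224|sha256|sha384|sha512)sums(?:_\w+)?\+?=",
    re.MULTILINE,
)


def checksum_algorithms(pkgbuild_text):
    """Return the set of checksum algorithms a PKGBUILD declares."""
    return set(_SUMS_RE.findall(pkgbuild_text))


def only_md5(pkgbuild_text):
    """True if the PKGBUILD declares checksums and all of them are MD5."""
    return checksum_algorithms(pkgbuild_text) == {"md5"}


def scan(root):
    """Yield PKGBUILD paths under root (a hypothetical checkout) that use only md5sums."""
    for path in sorted(Path(root).rglob("PKGBUILD")):
        if only_md5(path.read_text(errors="replace")):
            yield path
```

Calling `list(scan("some/checkout"))` on a local tree of package directories would give the candidates for conversion; commented-out arrays are ignored because the pattern only matches at the start of a line.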
On Tue, May 08, 2018 at 08:08:31PM -0600, Leonid Isaev wrote:
[0] https://lists.archlinux.org/pipermail/arch-general/2016-December/042
Oops, this link should have been https://lists.archlinux.org/pipermail/arch-general/2016-December/042700.html -- Leonid Isaev
On Tue, May 08, 2018 at 08:08:31PM -0600, Leonid Isaev wrote:
[extra] ...
This list should also include "python-retrying". I should have grepped more carefully, sigh... -- Leonid Isaev
On 05/08/2018 10:08 PM, Leonid Isaev via arch-general wrote:
Hi,
I'm intentionally using the title from Nov/Dec 2016 [0] to ease googling. I decided to check the status of this, and there are still 325 packages with only md5sums in [core] and [extra] (I didn't check [community]). The results below were generated by the attached script... Is there anything I can do (like sending reports to the Flyspray) to help convert those PKGBUILDs to SHA hashes?
When you say "still", that implies that there was any sort of effort to change that in the first place...

It will be closed as WONTFIX. That's a maintainer choice, and there are differing opinions about whether stronger checksums are:

- not any sort of security check at all, they're there for CRC purposes, and using strong CRC is security theater because the maintainer probably just blindly ran updpkgsums without checking anything at all so they generated very strong fake hashes -- come back when you have PGP[1] which is actually security
- actively dangerous as people think strong checksums equals security, which makes them trust the sources even when they shouldn't; like security theater except used as a justification for the other extreme
- better than nothing, and therefore very useful since it ensures that you at least rebuilt the same thing the maintainer did
- very much security, because obviously the maintainer verifies sources out of band, and checksums are their way of telling us what the canonical sources are

FWIW I agree with point #3, but I estimate there's zero chance of universal consensus, and would prefer not to see a failed crusade rile people up. Again.

As extensively discussed in several mailing list and forum threads, the best way to get security which everyone agrees on is to encourage upstream developers to PGP-sign their sources. I've done quite a bit of work on the existing TODO[1] which we have for implementing better PGP checks (and HTTPS for both privacy and TLS endpoint verification), in addition to providing the patchset[2] for makepkg (available in git master and awaiting the 5.1 release) which allows verifying git(1) signed commits/tags.

This is honestly a much better use of everyone's time.

[1] https://www.archlinux.org/todo/use-gpg-signatures-and-https-sources/
[2] https://git.archlinux.org/pacman.git/log/?id=37a89e2fac704babbe3badf0d9df0d41ec622f6f&showmsg=1

-- Eli Schwartz
Bug Wrangler and Trusted User
On Tue, May 08, 2018 at 11:38:01PM -0400, Eli Schwartz via arch-general wrote:
When you say "still", that implies that there was any sort of effort to change that in the first place...
Fair enough :) I thought it's a slow natural process...
- not any sort of security check at all, they're there for CRC purposes, and using strong CRC is security theater because the maintainer probably just blindly ran updpkgsums without checking anything at all so they generated very strong fake hashes -- come back when you have PGP[1] which is actually security
In this case, even using gpg keys won't guarantee security because verifying a key via a side channel is not much easier than the hash.
- actively dangerous as people think strong checksums equals security, which makes them trust the sources even when they shouldn't; like security theater except used as a justification for the other extreme
Yes, but see [1] and [2]. At least with SHA hashes we are not so vulnerable. [1] http://cryptography.hyperlink.cz/2004/otherformats.html [2] https://www.mathstat.dal.ca/~selinger/md5collision
- better than nothing, and therefore very useful since it ensures that you at least rebuilt the same thing the maintainer did
Not really, see just above. That is an old link; forging .tar.gz files has probably become much easier since then.
- very much security, because obviously the maintainer verifies sources out of band, and checksums are their way of telling us what the canonical sources are
If (s)he does, then there will be multiple hashes, from different sources, no?
As extensively discussed in several mailing list and forum threads, the best way to get security which everyone agrees on is to encourage upstream developers to PGP-sign their sources. I've done quite a bit of work on the existing TODO[1] which we have for implementing better PGP checks (and HTTPS for both privacy and TLS endpoint verification), in addition to providing the patchset[2] for makepkg (available in git master and awaiting the 5.1 release) which allows verifying git(1) signed commits/tags.
Thanks for your work! I didn't know about those links, will check them out. But ok, I see your point... Thanks, L. -- Leonid Isaev
On 05/08/2018 11:53 PM, Leonid Isaev via arch-general wrote:
- not any sort of security check at all, they're there for CRC purposes, and using strong CRC is security theater because the maintainer probably just blindly ran updpkgsums without checking anything at all so they generated very strong fake hashes -- come back when you have PGP[1] which is actually security
In this case, even using gpg keys won't guarantee security because verifying a key via a side channel is not much easier than the hash.
I'm not sure what you mean. PGP is by its very nature very secure, you establish an ongoing relationship with the key holder and can verify many, many objects, like the entire release history instead of independently bootstrapping the TOFU (Trust On First Use) model with every new release. PGP keys are also far more likely to appear in multiple independently verifiable locations, you can embed them in your DNS records, post them on your blog, github profile, keybase.io proofs utilizing DNS as well as social media linkages, email footer (and signed email history) to establish a difficult-to-falsify history, or simply follow the PGP web of trust. -- Eli Schwartz Bug Wrangler and Trusted User
On Wed, May 09, 2018 at 12:31:39AM -0400, Eli Schwartz via arch-general wrote:
PGP keys are also far more likely to appear in multiple independently verifiable locations, you can embed them in your DNS records, post them on your blog, github profile, keybase.io proofs utilizing DNS as well as social media linkages, email footer (and signed email history) to establish a difficult-to-falsify history, or simply follow the PGP web of trust.
It is all true. But... if I care to only do "makepkg -g >> PKGBUILD", then I'm unlikely to follow the web of trust, and if I'm going to scout mailing lists for email footers, I will also scout debian, gentoo, alpine and fedora repos for different hashes. That was my only point, but we are mixing policy and technical issues. If hashes are supposed to mean that I'm building the same source as the maintainer, then using only md5sums negates this because the source can be silently swapped using existing libraries, and attackers don't even need to know the mathematics behind md5 collisions... I agree that using strong hashes alone does not address security of source distribution, but neither does HTTPS for instance. At least, with sha-2 hashes, point #3 of your previous email makes sense. Thanks, -- Leonid Isaev
I would just like to note that SHA-2 hashes are inferior to Keccak and to BLAKE2. So better not to spend effort migrating to SHA-2.
On Wed, May 09, 2018 at 09:30:51PM +0200, Neven Sajko wrote:
I would just like to note that SHA-2 hashes are inferior to Keccak and to BLAKE2. So better not to spend effort migrating to SHA-2.
Strength of various SHA hashes is a different topic. My only point was that relying on md5 these days is like having no hashes at all or using the source filename as a hash... And there should be no migration -- when a new version of a package is released or a rebuild happens, just update the *sums array. Cheers, -- Leonid Isaev
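To make "just update the *sums array" concrete, here is a minimal Python sketch of what such an array contains: one hex digest per source, in source order. In practice the tools already named in this thread (updpkgsums, or "makepkg -g") generate it; the `sums_array` helper and the in-memory stand-in sources below are hypothetical and only show the shape of the result.

```python
import hashlib


def sums_array(algorithm, payloads):
    """Format digests of the given byte strings as a PKGBUILD-style array."""
    prefix = f"{algorithm}sums=("
    digests = [hashlib.new(algorithm, data).hexdigest() for data in payloads]
    # Continuation lines are aligned under the first entry for readability.
    separator = "\n" + " " * len(prefix)
    return prefix + separator.join(f"'{d}'" for d in digests) + ")"


# Stand-ins for downloaded source files; real sources would be read from disk.
sources = [b"", b"abc"]
print(sums_array("sha256", sources))
```

Switching a package from MD5 to SHA-256 in this sketch is only a change of the `algorithm` argument, which mirrors the point that no separate migration step is needed.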
On Thu, 10 May 2018 01:26, Leonid Isaev via arch-general <arch-general@archlinux.org> wrote:
On Wed, May 09, 2018 at 09:30:51PM +0200, Neven Sajko wrote:
I would just like to note that SHA-2 hashes are inferior to Keccak and to BLAKE2. So better not to spend effort migrating to SHA-2.
Strength of various SHA hashes is a different topic. My only point was that relying on md5 these days is like having no hashes at all or using the source filename as a hash...
Which is (still) pretty much the point of these hashes. With pacman these are *not* very important. The hashes are there for a quick check of whether the download is complete. Integrity is a job for PGP. Kind regards, Guus Snijders
On 05/10/2018 01:25 AM, Leonid Isaev via arch-general wrote:
On Wed, May 09, 2018 at 09:30:51PM +0200, Neven Sajko wrote:
I would just like to note that SHA-2 hashes are inferior to Keccak and to BLAKE2. So better not to spend effort migrating to SHA-2.
Strength of various SHA hashes is a different topic. My only point was that relying on md5 these days is like having no hashes at all or using the source filename as a hash...
And there should be no migration -- when a new version of a package is released or a rebuild happens, just update the *sums array.
Cheers,
Hello Leonid Isaev, I really like your effort on stronger hashes. I totally agree with you that we need those if we can't have GPG signatures by the maintainers. Hashes help in fewer use cases than GPG signatures, of course, but they do help. Unfortunately, my experience has been that this discussion is useless here and that you would do better to start helping with GPG signatures for every package. If you want to put effort into this topic, which I really appreciate, please go directly for GPG signatures; otherwise it will just be a frustrating discussion for you, sadly. What I can recommend is to write to upstream projects that don't use GPG signatures yet. Explain to them why it's important and help them improve their software release security. In my experience, quite a lot of projects did not know about the importance of GPG or just never looked into it. Just a few refuse to use GPG; leave those for now. As additional support you can use the GPGit guides as well as the automated GPGit tool of the same name: https://github.com/NicoHood/gpgit It will help new users understand GPG and gives them an easy-to-use tool to get started with GPG within a few minutes. Feedback on this is appreciated. I wish you all good luck; don't hesitate to contact me further if you have any great ideas regarding GPG etc. ~Nico
On Thu, May 10, 2018 at 10:06:08AM +0200, NicoHood wrote:
I really like your effort on stronger hashes. I totally agree with you that we need those if we can't have GPG signatures by the maintainers. Hashes help in fewer use cases than GPG signatures, of course, but they do help.
Currently, about 55% of [core] and 31% of [extra] packages make use of validpgpkeys. In [community] it should be even less. So, there is still a long way to go until all PKGBUILDs use GPG-verified sources... I agree with others that using a single sha256sum instead of md5sum offers questionable security benefit, but at least it protects against future tampering with the src by an attacker who knows about MD5 collisions.
Unfortunately, my experience has been that this discussion is useless here and that you would do better to start helping with GPG signatures for every package. If you want to put effort into this topic, which I really appreciate, please go directly for GPG signatures; otherwise it will just be a frustrating discussion for you, sadly.
There are only about 13% of packages in both [core] and [extra] that use MD5 -- a relatively small percentage. Yes, replacing those with a stronger hash is a stop-gap measure, but it involves no maintenance overhead. When you brought up this point last December, I didn't know that it is possible to have concurrent CRC and MD5 collisions (or at least they are difficult to find). But since then, I did some homework and it indeed seems quite easy these days. Therefore, using MD5 is no better than having SKIP. In this regard, I don't understand why we need checksums at all? If upstream: (1) signs the source with GPG, it will take care of both integrity and authenticity, so no need for hashes; (2) doesn't provide signatures, rely on gzip/bzip2/xz CRC. It is not cryptographically secure, but we don't need that anyway. Hence, we can substantially simplify makepkg code...
What I can recommend is to write to upstream projects that don't use GPG signatures yet. Explain to them why it's important and help them improve their software release security. In my experience, quite a lot of projects did not know about the importance of GPG or just never looked into it. Just a few refuse to use GPG; leave those for now.
What about upstreams, like PAM, who stopped signing their releases? From a developer point of view, it makes sense to not have a GPG key because it implies an additional responsibility of keeping it safe. Therefore, I understand people who don't sign their src archives.
As additional support you can use the GPGit guides as well as the automated GPGit tool of the same name: https://github.com/NicoHood/gpgit It will help new users understand GPG and gives them an easy-to-use tool to get started with GPG within a few minutes. Feedback on this is appreciated.
I don't think it's needed. GPG is not complicated at all. The difficulty that prevents its widespread use lies with maintaining the key, and with that no guide can help...
I wish you all good luck; don't hesitate to contact me further if you have any great ideas regarding GPG etc.
Thanks, L. -- Leonid Isaev
On Thu, 10 May 2018 03:46:34 -0600, Leonid Isaev via arch-general wrote:
GPG is not complicated at all.
https://aur.archlinux.org/pkgbase/linux-rt/#comment-645504 SICR -- pacman -Q linux{,-rt{,-pussytoes,-securityink,-cornflower}}|cut -d\ -f2 4.16.7-1 4.16.7_rt1-1 4.14.34_rt27-1 4.14.29_rt25-1 4.14.28_rt23-1
On 05/10/2018 05:46 AM, Leonid Isaev via arch-general wrote:
In this regard, I don't understand why we need checksums at all? If upstream: (1) signs the source with GPG, it will take care of both integrity and authenticity, so no need for hashes; (2) doesn't provide signatures, rely on gzip/bzip2/xz CRC. It is not cryptographically secure, but we don't need that anyway. Hence, we can substantially simplify makepkg code...
makepkg --skippgpcheck: without checksum integrity, this would potentially result in corrupted, malformed downloads that aren't caught. Also, a check which tells you the file has a bad signature *because the download is malformed* is sort of a weird user experience, and it might not be obvious you should try redownloading. -- Eli Schwartz Bug Wrangler and Trusted User
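The user-experience point above can be sketched as follows: checking the digest before the signature lets a tool tell "the download is damaged" apart from "the file is complete but not trusted". This is only an illustration in Python, not how makepkg is implemented; `check_source` and its `verify_signature` callback are hypothetical names.

```python
import hashlib


def check_source(data, expected_sha256, verify_signature=None):
    """Check download integrity first, then (optionally) the signature.

    verify_signature is a hypothetical callback returning True or False;
    it stands in for whatever PGP verification the caller uses.
    """
    if hashlib.sha256(data).hexdigest() != expected_sha256:
        return "checksum mismatch: download may be corrupt, try fetching it again"
    if verify_signature is not None and not verify_signature(data):
        return "signature check failed: file matches the recorded checksum but is not trusted"
    return "ok"
```

Passing `verify_signature=None` corresponds roughly to skipping the PGP check: a truncated download is still caught by the digest comparison.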
The single most beneficial change would be adoption of The Update Framework, since it is resilient against all known issues with remote package management, regardless of pkg signers coming/going and whether HTTPS is used or not. It also has a nice protocol for handling key revocation.
I do agree that using md5 is absurd, but putting effort into using sha-2 seems like a waste when Keccak and BLAKE2 are both faster and more secure than the old hashes. Regards, Neven
On 13 May 2018 at 20:11, Neven Sajko <nsajko@gmail.com> wrote:
I do agree that using md5 is absurd, ...
To clarify, md5 *is* insecure and is even slower or not significantly faster than hashes from the Keccak and BLAKE2 families; using signatures would be a plus, but signatures are not an argument for md5.
On Sun, May 13, 2018 at 08:19:19PM +0200, Neven Sajko via arch-general wrote:
On 13 May 2018 at 20:11, Neven Sajko <nsajko@gmail.com> wrote:
I do agree that using md5 is absurd, ...
To clarify, md5 *is* insecure and is even slower or not significantly faster than hashes from the Keccak and BLAKE2 families; using signatures would be a plus, but signatures are not an argument for md5.
It is trivial to enable blake2 support in makepkg using b2sum(1) from the coreutils package. So far, I have only seen Gentoo using it, but I haven't done proper research on this... Yes, md5 is almost as good these days as crc32... It is ok if the sources are gpg-signed, but not on its own. Cheers, -- Leonid Isaev
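For readers who want to see what a BLAKE2 checksum entry would hold, here is a small Python sketch using the standard library. It assumes that b2sum(1) computes BLAKE2b with a 512-bit digest by default, which is what `hashlib.blake2b` produces with `digest_size=64`; the helper name `b2_digest` and the sample payload are made up for illustration.

```python
import hashlib


def b2_digest(data):
    """Hex digest of BLAKE2b with a 64-byte (512-bit) output."""
    return hashlib.blake2b(data, digest_size=64).hexdigest()


# Stand-in for the bytes of a downloaded source archive.
payload = b"example source tarball contents"
print(len(b2_digest(payload)), "hex characters for blake2b")
print(len(hashlib.sha256(payload).hexdigest()), "hex characters for sha256")
```

The digest is twice as long as a SHA-256 one, so a checksum array built from it would be wider, but it would be computed and compared in the same way.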
On 05/13/2018 08:11 PM, Leonid Isaev via arch-general wrote:
On Sun, May 13, 2018 at 08:19:19PM +0200, Neven Sajko via arch-general wrote:
On 13 May 2018 at 20:11, Neven Sajko <nsajko@gmail.com> wrote:
I do agree that using md5 is absurd, ...
To clarify, md5 *is* insecure and is even slower or not significantly faster than hashes from the Keccak and BLAKE2 families; using signatures would be a plus, but signatures are not an argument for md5.
It is trivial to enable blake2 support in makepkg using b2sum(1) from the coreutils package. So far, I have only seen Gentoo using it, but I haven't done proper research on this...
Maybe you could ask the coreutils developers whatever happened to implementing Keccak checksumming tools. -- Eli Schwartz Bug Wrangler and Trusted User
Hi Eli,
Maybe you could ask the coreutils developers whatever happened to implementing Keccak checksumming tools.
SHA-3? Have you seen https://www.imperialviolet.org/2017/05/31/skipsha3.html I've also seen suggestions that the Keccak team push Kangaroo Twelve these days over SHA-3 due to SHA-3's comparative slowness. -- Cheers, Ralph. https://plus.google.com/+RalphCorderoy
On Mon, May 14, 2018 at 11:23:39AM +0100, Ralph Corderoy wrote:
Hi Eli,
Maybe you could ask the coreutils developers whatever happened to implementing Keccak checksumming tools.
SHA-3? Have you seen https://www.imperialviolet.org/2017/05/31/skipsha3.html I've also seen suggestions that the Keccak team push Kangaroo Twelve these days over SHA-3 due to SHA-3's comparative slowness.
Of course, none of this is relevant for the present thread... Cheers, -- Leonid Isaev
On 05/14/2018 10:48 AM, Leonid Isaev via arch-general wrote:
On Mon, May 14, 2018 at 11:23:39AM +0100, Ralph Corderoy wrote:
Hi Eli,
Maybe you could ask the coreutils developers whatever happened to implementing Keccak checksumming tools.
SHA-3? Have you seen https://www.imperialviolet.org/2017/05/31/skipsha3.html I've also seen suggestions that the Keccak team push Kangaroo Twelve these days over SHA-3 due to SHA-3's comparative slowness.
Of course, none of this is relevant for the present thread...
We're currently in feature freeze for pacman 5.1. Anyone who hopes to have b2sum support in *future* versions of pacman would be well advised to come across as a person seeking to extend support for the current crop of common hashing algorithms, not someone pushing b2sum because "secure all PKGBUILDs". For this reason, it would probably be useful to see coreutils support more than one cherry-picked modern hashing algorithm. I'm not really caring which ones those are, but then I'm also perfectly happy with sha256/sha512 (which are both of them great algorithms which work perfectly fine). So I'm uninterested in the bikeshed on general principle, and only vaguely interested inasmuch as having more tools and more diversity in the future would probably be interesting and/or useful. But I can find lots of arguments for and against all the SHA3 candidates, some of them rather bitter, so I see no reason to take sides. -- Eli Schwartz Bug Wrangler and Trusted User
On Mon, May 14, 2018 at 11:01:57AM -0400, Eli Schwartz via arch-general wrote:
We're currently in feature freeze for pacman 5.1
Anyone who hopes to have b2sum support in *future* versions of pacman, would be well advised to come across as a person seeking to extend support for the current crop of common hashing algorithms, not someone pushing b2sum because "secure all PKGBUILDs".
For this reason, it would probably be useful to see coreutils support more than one cherry-picked modern hashing algorithm. I'm not really caring which ones those are, but then I'm also perfectly happy with sha256/sha512 (which are both of them great algorithms which work perfectly fine).
So I'm uninterested in the bikeshed on general principle, and only vaguely interested inasmuch as having more tools and more diversity in the future would probably be interesting and/or useful. But I can find lots of arguments for and against all the SHA3 candidates, some of them rather bitter, so I see no reason to take sides.
I agree... But I think that trying to identify the best algorithm is a waste of time because the only important feature is whether a given hash algorithm has been broken (in the sense of generating collisions). Everything else (performance, hash size, etc) is completely irrelevant for makepkg use... It would make sense to include B2B/SHA3 support in makepkg when we start seeing upstreams provide these hashes. Currently, AFAIK the only "upstream" doing that is Gentoo in their Manifests. Cheers, -- Leonid Isaev
On 05/08/2018 10:38 PM, Eli Schwartz via arch-general wrote:
This is honestly a much better use of everyone's time.
It is indeed a rare occurrence to see the uncommon common sense rear its lonely head from time to time... but comforting. -- David C. Rankin, J.D.,P.E.
participants (9)
- Carsten Mattner
- David C. Rankin
- Eli Schwartz
- Guus Snijders
- Leonid Isaev
- Neven Sajko
- NicoHood
- Ralf Mardorf
- Ralph Corderoy