[pacman-dev] [PATCH 0/2] Deprecate md5sums, show sha256sums as an example-by-default.
Both the MD5 and SHA-1 hash functions have known collision attacks, providing an attack vector for malicious hosts and MITMs to provide tampered code without being detected by md5 or sha1 hashing. We should move to sha256 by default, and encourage its use by changing the documentation and example files to follow suit.

The SHA-2 family of hashes is currently secure against practical attacks (even at the scale of Facebook's or Google's datacenters). In the future, pacman should gain SHA-3 support though, because SHA-2 itself has some theoretical preimage attacks and possible collision attacks.

Mike Swanson (2):
  proto: Encourage the use of sha256sums by example.
  doc, makepkg.conf: Deprecate md5sums, show examples using sha256sums.

 doc/PKGBUILD-example.txt   |  4 ++--
 doc/PKGBUILD.5.txt         | 31 +++++++++++++++++++------------
 doc/makepkg-template.1.txt |  2 +-
 etc/makepkg.conf.in        |  2 +-
 proto/PKGBUILD-split.proto |  2 +-
 proto/PKGBUILD-vcs.proto   |  2 +-
 proto/PKGBUILD.proto       |  2 +-
 7 files changed, 26 insertions(+), 19 deletions(-)

--
2.11.1
MD5 has had known and easy-to-carry-out collision attacks for years now; the SHA-2 (256, 384, 512) functions are presently safe.
---
 proto/PKGBUILD-split.proto | 2 +-
 proto/PKGBUILD-vcs.proto   | 2 +-
 proto/PKGBUILD.proto       | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/proto/PKGBUILD-split.proto b/proto/PKGBUILD-split.proto
index 9898ef81..eea97e56 100644
--- a/proto/PKGBUILD-split.proto
+++ b/proto/PKGBUILD-split.proto
@@ -28,7 +28,7 @@ changelog=
 source=("$pkgbase-$pkgver.tar.gz"
         "$pkgname-$pkgver.patch")
 noextract=()
-md5sums=()
+sha256sums=()
 validpgpkeys=()
 
 prepare() {
diff --git a/proto/PKGBUILD-vcs.proto b/proto/PKGBUILD-vcs.proto
index ae9956a9..49c6759f 100644
--- a/proto/PKGBUILD-vcs.proto
+++ b/proto/PKGBUILD-vcs.proto
@@ -25,7 +25,7 @@ options=()
 install=
 source=('FOLDER::VCS+URL#FRAGMENT')
 noextract=()
-md5sums=('SKIP')
+sha256sums=('SKIP')
 
 # Please refer to the 'USING VCS SOURCES' section of the PKGBUILD man page for
 # a description of each element in the source array.
diff --git a/proto/PKGBUILD.proto b/proto/PKGBUILD.proto
index a2c600d5..9aff797c 100644
--- a/proto/PKGBUILD.proto
+++ b/proto/PKGBUILD.proto
@@ -27,7 +27,7 @@ changelog=
 source=("$pkgname-$pkgver.tar.gz"
         "$pkgname-$pkgver.patch")
 noextract=()
-md5sums=()
+sha256sums=()
 validpgpkeys=()
 
 prepare() {
--
2.11.1
---
 doc/PKGBUILD-example.txt   |  4 ++--
 doc/PKGBUILD.5.txt         | 31 +++++++++++++++++++------------
 doc/makepkg-template.1.txt |  2 +-
 etc/makepkg.conf.in        |  2 +-
 4 files changed, 23 insertions(+), 16 deletions(-)

diff --git a/doc/PKGBUILD-example.txt b/doc/PKGBUILD-example.txt
index 910fd068..d4e1c9c1 100644
--- a/doc/PKGBUILD-example.txt
+++ b/doc/PKGBUILD-example.txt
@@ -12,8 +12,8 @@ depends=('glibc')
 makedepends=('ed')
 optdepends=('ed: for "patch -e" functionality')
 source=("ftp://ftp.gnu.org/gnu/$pkgname/$pkgname-$pkgver.tar.xz"{,.sig})
-md5sums=('e9ae5393426d3ad783a300a338c09b72'
-         'SKIP')
+sha256sums=('9124ba46db0abd873d0995c2ca880e81252676bb6c03e0a37dfc5f608a9b0ceb'
+            'SKIP')
 
 build() {
   cd "$srcdir/$pkgname-$pkgver"
diff --git a/doc/PKGBUILD.5.txt b/doc/PKGBUILD.5.txt
index 18bc2a19..edf469fe 100644
--- a/doc/PKGBUILD.5.txt
+++ b/doc/PKGBUILD.5.txt
@@ -118,7 +118,7 @@ systems (see below).
 +
 Additional architecture-specific sources can be added by appending an
 underscore and the architecture name e.g., 'source_x86_64=()'. There must be a
-corresponding integrity array with checksums, e.g. 'md5sums_x86_64=()'.
+corresponding integrity array with checksums, e.g. 'sha256sums_x86_64=()'.
 +
 It is also possible to change the name of the downloaded file, which is
 helpful with weird URLs and for handling multiple source files with the same
@@ -146,19 +146,26 @@ contain whitespace characters.
 	listed here will not be extracted with the rest of the source files. This
 	is useful for packages that use compressed data directly.
 
-*md5sums (array)*::
-	This array contains an MD5 hash for every source file specified in the
-	source array (in the same order). makepkg will use this to verify source
-	file integrity during subsequent builds. If 'SKIP' is put in the array
-	in place of a normal hash, the integrity check for that source file will
-	be skipped. To easily generate md5sums, run ``makepkg -g >> PKGBUILD''.
-	If desired, move the md5sums line to an appropriate location.
+*sha256sums (array)*::
+	This array contains a SHA256 hash for every source file specified in the
+	source array (in the same order). makepkg will use this to verify
+	source file integrity during subsequent builds. If 'SKIP' is put in the
+	array in place of a normal hash, the integrity check for that source
+	file will be skipped. To easily generate sha256sums, run ``makepkg -g
+	>> PKGBUILD''. If desired, move the sha256sums line to an appropriate
+	location.
 
-*sha1sums, sha256sums, sha384sums, sha512sums (arrays)*::
+*sha384sums, sha512sums (arrays)*::
 	Alternative integrity checks that makepkg supports; these all behave
-	similar to the md5sums option described above. To enable use and generation
-	of these checksums, be sure to set up the `INTEGRITY_CHECK` option in
-	linkman:makepkg.conf[5].
+	similar to the sha256sums option described above. To enable use and
+	generation of these checksums, be sure to set up the `INTEGRITY_CHECK`
+	option in linkman:makepkg.conf[5].
+
+*md5sums, sha1sums (arrays)*::
+	Alternative legacy integrity checks that makepkg supports. These are
+	supported for compatibility, but should not be used in current PKGBUILD
+	files due to known collision attacks on the algorithms, allowing
+	malicious files to pose as legitimate ones.
 
 *groups (array)*::
 	An array of symbolic names that represent groups of packages, allowing
diff --git a/doc/makepkg-template.1.txt b/doc/makepkg-template.1.txt
index 99637d43..53cb4997 100644
--- a/doc/makepkg-template.1.txt
+++ b/doc/makepkg-template.1.txt
@@ -88,7 +88,7 @@ Example PKGBUILD
 license=('PerlArtistic' 'GPL')
 depends=('perl')
 source=("http://search.cpan.org/CPAN/authors/id/S/SH/SHERZODR/Config-Simple-${pkgver}.tar.gz")
-md5sums=('f014aec54f0a1e2e880d317180fce502')
+sha256sums=('dd9995706f0f9384a15ccffe116c3b6e22f42ba2e58d8f24ed03c4a0e386edb4')
 
 _distname="Config-Simple"
 # template start; name=perl-module; version=1.0;
diff --git a/etc/makepkg.conf.in b/etc/makepkg.conf.in
index 71293970..24b83d18 100644
--- a/etc/makepkg.conf.in
+++ b/etc/makepkg.conf.in
@@ -86,7 +86,7 @@ BUILDENV=(!distcc color !ccache check !sign)
 OPTIONS=(strip docs libtool staticlibs emptydirs zipman purge !debug)
 
 #-- File integrity checks to use. Valid: md5, sha1, sha256, sha384, sha512
-INTEGRITY_CHECK=(md5)
+INTEGRITY_CHECK=(sha256)
 #-- Options to be used when stripping binaries. See `man strip' for details.
 STRIP_BINARIES="@STRIP_BINARIES@"
 #-- Options to be used when stripping shared libraries. See `man strip' for details.
--
2.11.1
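As a side note for readers following along: the checksum line that `makepkg -g` appends can also be reproduced by hand with coreutils. This is a hedged sketch, not part of the patch; the file name is a stand-in for a real source tarball.

```shell
# Hypothetical stand-in for a downloaded source file; in a real build,
# `makepkg -g >> PKGBUILD` computes one entry like this per source=() item.
printf 'example tarball contents' > demo-1.0.tar.gz
sum=$(sha256sum demo-1.0.tar.gz | cut -d' ' -f1)
echo "sha256sums=('$sum')"
```

With `INTEGRITY_CHECK=(sha256)` in makepkg.conf, `makepkg -g` emits exactly this array form instead of md5sums.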
On 02/23/2017 04:31 PM, Mike Swanson wrote:
Both the MD5 and SHA-1 hash functions have known collision attacks, providing an attack vector for malicious hosts and MITMs to provide tampered code without being detected by md5, or sha1, hashing.
We should move to sha256-by-default, and encourage its use by changing the documentation and example files to follow suit. The SHA-2 family of hashes is currently secure against practical attacks (even at the scale of Facebook's or Google's datacenters). In the future, pacman should gain SHA-3 support though, because SHA-2 itself has some theoretical preimage attacks and possible collision attacks.
I like the idea. ;) But this has come up multiple times already, and Allan has strongly resisted it. From the thread "[arch-general] Stronger Hashes for PKGBUILDs" (Dec. 2016)
I advocate keeping md5sum as the default because it is broken. If I see someone purely verifying their sources using md5sum in a PKGBUILD (and not pgp signature), I know that they have done nothing to actually verify the source themselves.
If sha2sums become default, I now know nothing. Did the maintainer of the PKGBUILD get that checksum from a securely distributed source from upstream? Had the source already been compromised upstream before the PKGBUILD was made? Now I am securely verifying the unknown.
But we don't care about that... we just want to feel warm and fuzzy with a false sense of security.
Also, there was a thread in the forums somewhere...

Essentially, his arguments boil down to "strong checksums don't prove anything except that the AUR maintainer bumped the pkgver and ran `updpkgsums` to blindly insert unverified hashes into the PKGBUILD", and therefore md5sums are perfectly okay for the one thing they are meant to do, which is prove that the download wasn't corrupted in a freak accident. He did imply he'd be okay replacing the whole *sums thing with "crcsums", just to make things clearer for everyone. ;)

It is of course very true that anyone who *really* cares about the security of a package should lean on upstream to provide proper GPG signatures for their release artifacts, as that will be immeasurably more secure than any anonymous checksums, no matter how strong, or how much you trust the maintainer. :)

...

Good luck convincing Allan (you'll *need* it...).

--
Eli Schwartz
On Thu, 23 Feb 2017 at 16:31 Mike Swanson <mikeonthecomputer@gmail.com> wrote:
Both the MD5 and SHA-1 hash functions have known collision attacks, providing an attack vector for malicious hosts and MITMs to provide tampered code without being detected by md5, or sha1, hashing.
We should move to sha256-by-default, and encourage its use by changing the documentation and example files to follow suit. The SHA-2 family of hashes is currently secure against practical attacks (even at the scale of Facebook's or Google's datacenters). In the future, pacman should gain SHA-3 support though, because SHA-2 itself has some theoretical preimage attacks and possible collision attacks.
<https://crypto.stackexchange.com/questions/26336/sha512-faster-than-sha256> points out that using sha512 is faster than sha256 so I'd rather not waste my time calculating hashes without a good reason
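The claim is easy to check locally. This is a rough, hedged sketch (coreutils assumed; the file size and path are arbitrary); on many 64-bit CPUs without SHA hardware extensions, sha512sum does come out faster than sha256sum on large inputs.

```shell
# Time sha256sum vs sha512sum over a 32 MiB dummy file.
dd if=/dev/zero of=/tmp/hash-speed-test bs=1M count=32 2>/dev/null

t0=$(date +%s%N)
d256=$(sha256sum /tmp/hash-speed-test | cut -d' ' -f1)
t256=$((($(date +%s%N) - t0) / 1000000))

t0=$(date +%s%N)
d512=$(sha512sum /tmp/hash-speed-test | cut -d' ' -f1)
t512=$((($(date +%s%N) - t0) / 1000000))

echo "sha256: ${t256} ms, sha512: ${t512} ms"
rm -f /tmp/hash-speed-test
```

Results vary by CPU, so the timings are only indicative; the security argument for SHA-2 holds either way.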
Mike Swanson (2): proto: Encourage the use of sha256sums by example. doc, makepkg.conf: Deprecate md5sums, show examples using sha256sums.
 doc/PKGBUILD-example.txt   |  4 ++--
 doc/PKGBUILD.5.txt         | 31 +++++++++++++++++++------------
 doc/makepkg-template.1.txt |  2 +-
 etc/makepkg.conf.in        |  2 +-
 proto/PKGBUILD-split.proto |  2 +-
 proto/PKGBUILD-vcs.proto   |  2 +-
 proto/PKGBUILD.proto       |  2 +-
 7 files changed, 26 insertions(+), 19 deletions(-)
-- 2.11.1
-- Signed, Kieran Colford
On 24/02/17 07:58, Eli Schwartz wrote:
Good luck convincing Allan (you'll *need* it...).
Not going to happen...
Em fevereiro 23, 2017 19:22 Allan McRae escreveu:
On 24/02/17 07:58, Eli Schwartz wrote:
Good luck convincing Allan (you'll *need* it...).
Not going to happen...
Allan,

I want to pitch you another line of thought. I followed that discussion last year, and I've been following closely the fallout of today's Google announcement on the "practical" SHA-1 attack.

Anyone who actually read the paper, and got past the sensationalism and the hype of those vulnerability sites (why does everything need a site now?), knows that it doesn't change much for our usage of sha1, or md5 for that matter.

You argued in last year's discussion that using stronger hashes would give a "false sense of security". I don't disagree with that. But I want to add that using weaker ones (if only in keyspace or cryptographic strength) also creates a false sense of *insecurity*.

And these people who have this false sense of insecurity will be the same people who will have the false sense of security, regardless of what we do. They don't use GPG, nor ever will. They don't care if upstream signs things. All they see is: md5, and now sha1, are "broken" and Arch should stop using them.

With that in mind, using stronger algorithms would be very easy for us (that patch is trivial), wouldn't have any drawbacks (just that stupid people would feel "safer"), and would make those same people stop complaining that we don't use strong hashes.

I don't see the issue of upstream never signing things changing in the near future. So we should either do a bigger change, perhaps even that crc proposal of yours, or do this smaller change and use stronger hashes by default.

Cheers,
Giancarlo Razzolini
On Thu, 2017-02-23 at 22:04 +0000, Kieran Colford wrote:
<https://crypto.stackexchange.com/questions/26336/sha512-faster-than-sha256> points out that using sha512 is faster than sha256, so I'd rather not waste my time calculating hashes without a good reason
I wasn't aware of this, thanks :)
This is an interesting discussion, I don't exactly mind the points for remaining with md5sums as the example, but I do have some issue with it: I believe the documentation and sample PKGBUILDs should show best practices, rather than purposely use a poor practice with the hope that a PKGBUILD author fixes it themself. _I_ know to replace the checksum function with something better, to use GPG keys where possible, but a brand new author would not, and a long-time author may not even realize the best practices do not match what they are used to: the example PKGBUILDs aren't being changed to show the contrary.

On false senses of security: Yes, there is some blind faith that an AUR maintainer just so happened to provide the correct checksums, but there's even a faith that the correct GPG keys are used and correct source host. Thankfully, it is usually plainly obvious if the latter is the case. :-)

On upstream always providing GPG signature files against tarballs: Beyond the fact that not all upstreams even do this (and you can make a fair argument that the AUR maintainer has no firm reason to believe _their_ download was the correct one), I'm not actually entirely convinced that they should always be expected to do as such.

This is a difficult position to defend, and it may come down to laziness, but hosting sites such as GitHub and GitLab provide automated tarball generation (by just using `git archive` on the backend -- it's easy to independently verify the archives). Speaking from my experience, it has become natural for me to stick with GPG-signing the tag in Git itself and ignoring output files such as these. It largely comes to “If you need to verify the integrity of the source code, I expect you to clone the repository and check that tag, and use `git archive | $HASH` to verify the archive GitHub/GitLab provide.”

In this case, the GPG signature exists, but not in any sort of form makepkg is expecting nor can verify.
Given that AUR PKGBUILDs can be retrieved over https or ssh, and likewise GitHub and GitLab repositories and files can be retrieved over https or ssh, there is already a high confidence that a malicious actor in the middle is not tampering with any of the sources. If somebody _really_ wants to verify that the PKGBUILD is using the unaltered source, no trickery, there is no way around examining it themself, making sure the correct checksums or GPG keys are being used, making sure no patches are changing the source maliciously (either as files or part of the prepare/build functions), and frankly, using the upstream project's VCS to verify tags and generated archives.

tl;dr: I don't think PKGBUILDs can ever be automatically trusted, not even if they use GPG signature checking. Nor do I think they should ever be fully distrusted if they do not.
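The `git archive | $HASH` check described above can be sketched end to end. This is a self-contained illustration standing in for a forge with a throwaway local repository (all names are hypothetical); plain tar is used rather than tar.gz so the bytes are deterministic.

```shell
# Demonstrate that regenerating an archive from the same tag reproduces
# the same digest as the "upstream" tarball a forge would autogenerate.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q demo && cd demo
git config user.email dev@example.com
git config user.name 'Demo Dev'
echo 'hello' > file.txt
git add file.txt && git commit -qm 'release' && git tag v1.0

# "Upstream" tarball, as GitHub/GitLab would generate it via git archive:
git archive --format=tar --prefix=demo-1.0/ v1.0 > ../demo-1.0.tar

# Maintainer-side check: regenerate the archive and compare digests.
upstream=$(sha256sum ../demo-1.0.tar | cut -d' ' -f1)
local_sum=$(git archive --format=tar --prefix=demo-1.0/ v1.0 | sha256sum | cut -d' ' -f1)
[ "$upstream" = "$local_sum" ] && echo MATCH
```

In a real PKGBUILD workflow the first archive would come from the forge's download URL, and `git verify-tag v1.0` against the cloned repository would establish who signed the release.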
On 24/02/17 08:59, Giancarlo Razzolini wrote:
Em fevereiro 23, 2017 19:22 Allan McRae escreveu:
On 24/02/17 07:58, Eli Schwartz wrote:
Good luck convincing Allan (you'll *need* it...).
Not going to happen...
Allan,
I want to pitch you another line of thought. I followed that discussion last year, and I've been following closely the fallout of today's google announcement on the "practical" sha1 attack.
Anyone who actually read the paper, and got past the sensationalism and the hype of those vulnerability sites (why does everything need a site now?), knows that it doesn't change much for our usage of sha1, or md5 for that matter.
You argued in last year's discussion that using stronger hashes would give a "false sense of security". I don't disagree with that. But I want to add that using weaker ones (if only in keyspace or cryptographic strength) also creates a false sense of *insecurity*.
And these people who have this false sense of insecurity will be the same people who will have the false sense of security, regardless of what we do. They don't use GPG, nor ever will. They don't care if upstream signs things. All they see is: md5, and now sha1, are "broken" and Arch should stop using them.
With that in mind, using stronger algorithms would be very easy for us (that patch is trivial), wouldn't have any drawbacks (just that stupid people would feel "safer"), and would make those same people stop complaining that we don't use strong hashes.
I don't see the issue of upstream never signing things changing in the near future. So we should either do a bigger change, perhaps even that crc proposal of yours, or do this smaller change and use stronger hashes by default.
I find that a terrible argument.

A
On 02/23/2017 10:16 PM, Mike Swanson wrote:
This is an interesting discussion, I don't exactly mind the points for remaining with md5sums as the example, but I do have some issue with it:
I believe the documentation and sample PKGBUILDs should show best practices, rather than purposely use a poor practice with the hope that a PKGBUILD author fixes it themself. _I_ know to replace the checksum function with something better, to use GPG keys where possible, but a brand new author would not, and a long-time author may not even realize the best practices do not match what they are used to: the example PKGBUILDs aren't being changed to show the contrary.
On false senses of security: Yes, there is some blind faith that an AUR maintainer just so happened to provide the correct checksums, but there's even a faith that the correct GPG keys are used and correct source host. Thankfully, it is usually plainly obvious if the latter is the case. :-)
On upstream always providing GPG signature files against tarballs: Beyond the fact that not all upstreams even do this (and you can make a fair argument that the AUR maintainer has no firm reason to believe _their_ download was the correct one), I'm not actually entirely convinced that they should always be expected to do as such.
This is a difficult position to defend, and it may come down to laziness, but hosting sites such as GitHub and GitLab provide automated tarball generation (by just using `git archive` on the backend -- it's easy to independently verify the archives). Speaking from my experience, it has become natural for me to stick with GPG-signing the tag in Git itself and ignoring output files such as these. It largely comes to “If you need to verify the integrity of the source code, I expect you to clone the repository and check that tag, and use `git archive | $HASH` to verify the archive GitHub/GitLab provide.”
I encourage you to run `git archive` on your local master repo, and generate a GPG signature against that... it will be reproducible in the autogenerated version. Then upload the GPG signature as a release artifact.

Because you're right, it is sheer laziness. Downloading a potentially *huge* git repo just to verify signatures is madness. Try applying that logic to the linux kernel... *cloning* it begins to approach the length of time required to *build* it. Knowing beforehand how many commits to specify with --depth is not a reasonable answer. :)
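The workflow suggested here can be sketched concretely. This is a self-contained, hedged illustration using a throwaway key and repository (every name is hypothetical); a real release would use the maintainer's actual signing key and tag, and `gzip -n` is used so the regenerated tarball stays byte-stable.

```shell
# Sign a reproducible git-archive tarball with a detached GPG signature.
set -e
work=$(mktemp -d)
export GNUPGHOME="$work/gnupg"
mkdir -m 700 "$GNUPGHOME"
gpg -q --batch --pinentry-mode loopback --passphrase '' \
    --quick-gen-key 'Demo Dev <dev@example.com>' default default never

cd "$work"
git init -q demo && cd demo
git config user.email dev@example.com
git config user.name 'Demo Dev'
echo hello > file.txt
git add file.txt && git commit -qm 'release 1.0' && git tag v1.0

# Reproduce the forge's autogenerated tarball (gzip -n omits the timestamp)
git archive --format=tar --prefix=demo-1.0/ v1.0 | gzip -n > ../demo-1.0.tar.gz

# Detached, ASCII-armored signature to upload as a release artifact
gpg -q --batch --pinentry-mode loopback --passphrase '' \
    --detach-sign --armor ../demo-1.0.tar.gz

# What a downstream user (or makepkg with a .sig/.asc source) then checks:
gpg --verify ../demo-1.0.tar.gz.asc ../demo-1.0.tar.gz
```

Because the forge also produces its tarball with `git archive`, the locally signed archive generally matches the autogenerated download byte for byte, which is what makes the detached signature useful as a release artifact.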
In this case, the GPG signature exists, but not in any sort of form makepkg is expecting nor can verify. Given that AUR PKGBUILDs can be retrieved over https or ssh, and likewise GitHub and GitLab repositories and files can be retrieved over https or ssh, there is already a high confidence that a malicious actor in the middle is not tampering with any of the sources. If somebody _really_ wants to verify that the PKGBUILD is using the unaltered source, no trickery, there is no way around examining it themself, making sure the correct checksums or GPG keys are being used, making sure no patches are changing the source maliciously (either as files or part of the prepare/build functions), and frankly, using the upstream project's VCS to verify tags and generated archives.
Fortunately, in pacman-git, makepkg now knows how to validate git signed commits/tags, same as tarballs. :)

See https://lists.archlinux.org/pipermail/pacman-dev/2017-January/thread.html#21...

--
Eli Schwartz
On 24/02/2017 at 04:37, Eli Schwartz wrote:
On 02/23/2017 10:16 PM, Mike Swanson wrote:
This is an interesting discussion, I don't exactly mind the points for remaining with md5sums as the example, but I do have some issue with it:
I believe the documentation and sample PKGBUILDs should show best practices, rather than purposely use a poor practice with the hope that a PKGBUILD author fixes it themself. _I_ know to replace the checksum function with something better, to use GPG keys where possible, but a brand new author would not, and a long-time author may not even realize the best practices do not match what they are used to: the example PKGBUILDs aren't being changed to show the contrary.
On false senses of security: Yes, there is some blind faith that an AUR maintainer just so happened to provide the correct checksums, but there's even a faith that the correct GPG keys are used and correct source host. Thankfully, it is usually plainly obvious if the latter is the case. :-)
On upstream always providing GPG signature files against tarballs: Beyond the fact that not all upstreams even do this (and you can make a fair argument that the AUR maintainer has no firm reason to believe _their_ download was the correct one), I'm not actually entirely convinced that they should always be expected to do as such.
This is a difficult position to defend, and it may come down to laziness, but hosting sites such as GitHub and GitLab provide automated tarball generation (by just using `git archive` on the backend -- it's easy to independently verify the archives). Speaking from my experience, it has become natural for me to stick with GPG-signing the tag in Git itself and ignoring output files such as these. It largely comes to “If you need to verify the integrity of the source code, I expect you to clone the repository and check that tag, and use `git archive | $HASH` to verify the archive GitHub/GitLab provide.”

I encourage you to run `git archive` on your local master repo, and generate a GPG signature against that... it will be reproducible in the autogenerated version. Then upload the GPG signature as a release artifact.
Because you're right, it is sheer laziness. Downloading a potentially *huge* git repo just to verify signatures, is madness. Try applying that logic to the linux kernel... *cloning* it begins to approach the length of time required to *build* it. Knowing beforehand, how many commits to specify with --depth, is not a reasonable answer. :)
Debian wrote a nice page about this: https://wiki.debian.org/Creating%20signed%20GitHub%20releases

Especially the alternative local workflow at the end, which is mostly what Eli proposes above. ;)

One example of a package doing this is https://github.com/vector-im/riot-web; they included this easily in their release process at https://github.com/matrix-org/matrix-js-sdk/pull/351.

If you’re already signing tags used for releases, signing the tarball should really be easy and, as underlined by Eli, quite a good idea.

So, yes, PGP everywhere please.

Cheers,
Bruno
On Fri, Feb 24, 2017, 8:52 AM Bruno Pagani <bruno.n.pagani@gmail.com> wrote:

On 24/02/2017 at 04:37, Eli Schwartz wrote:
On 02/23/2017 10:16 PM, Mike Swanson wrote:
This is an interesting discussion, I don't exactly mind the points for remaining with md5sums as the example, but I do have some issue with it:
I believe the documentation and sample PKGBUILDs should show best practices, rather than purposely use a poor practice with the hope that a PKGBUILD author fixes it themself. _I_ know to replace the checksum function with something better, to use GPG keys where possible, but a brand new author would not, and a long-time author may not even realize the best practices do not match what they are used to: the example PKGBUILDs aren't being changed to show the contrary.
On false senses of security: Yes, there is some blind faith that an AUR maintainer just so happened to provide the correct checksums, but there's even a faith that the correct GPG keys are used and correct source host. Thankfully, it is usually plainly obvious if the latter is the case. :-)
On upstream always providing GPG signature files against tarballs: Beyond the fact that not all upstreams even do this (and you can make a fair argument that the AUR maintainer has no firm reason to believe _their_ download was the correct one), I'm not actually entirely convinced that they should always be expected to do as such.
This is a difficult position to defend, and it may come down to laziness, but hosting sites such as GitHub and GitLab provide automated tarball generation (by just using `git archive` on the backend -- it's easy to independently verify the archives). Speaking from my experience, it has become natural for me to stick with GPG-signing the tag in Git itself and ignoring output files such as these. It largely comes to “If you need to verify the integrity of the source code, I expect you to clone the repository and check that tag, and use `git archive | $HASH` to verify the archive GitHub/GitLab provide.”

I encourage you to run `git archive` on your local master repo, and generate a GPG signature against that... it will be reproducible in the autogenerated version. Then upload the GPG signature as a release artifact.
Because you're right, it is sheer laziness. Downloading a potentially *huge* git repo just to verify signatures, is madness. Try applying that logic to the linux kernel... *cloning* it begins to approach the length of time required to *build* it. Knowing beforehand, how many commits to specify with --depth, is not a reasonable answer. :)
Debian wrote a nice page about this: https://wiki.debian.org/Creating%20signed%20GitHub%20releases

Especially the alternative local workflow at the end, which is mostly what Eli proposes above. ;)

One example of a package doing this is https://github.com/vector-im/riot-web; they included this easily in their release process at https://github.com/matrix-org/matrix-js-sdk/pull/351.

If you’re already signing tags used for releases, signing the tarball should really be easy and, as underlined by Eli, quite a good idea.

So, yes, PGP everywhere please.

Cheers,
Bruno

I agree that PGP everywhere is absolutely something to push for. On the other hand, not every developer is in the web-of-trust strong set, and if you're downloading the package sources from GitHub then that's probably where you got the PGP key id from as well. An attacker who can hijack your TLS-secured source download when you bump the package version could also have fed you a forged PGP key id when you first made the package. Upgrading to stronger checksums is only marginally less secure than using PGP.

I think we ought to settle on what to do about these checksums. MD5 and SHA-1 are not strong enough to provide security, but they're also too bloated for mere error correction. A change is definitely needed, and we should decide on which direction to take.

--
Signed,
Kieran Colford
On 24/02/2017 at 16:41, Kieran Colford wrote:
On Fri, Feb 24, 2017, 8:52 AM Bruno Pagani, <bruno.n.pagani@gmail.com> wrote:
On 24/02/2017 at 04:37, Eli Schwartz wrote:
On 02/23/2017 10:16 PM, Mike Swanson wrote:
This is an interesting discussion, I don't exactly mind the points for remaining with md5sums as the example, but I do have some issue with it:
I believe the documentation and sample PKGBUILDs should show best practices, rather than purposely use a poor practice with the hope that a PKGBUILD author fixes it themself. _I_ know to replace the checksum function with something better, to use GPG keys where possible, but a brand new author would not, and a long-time author may not even realize the best practices do not match what they are used to: the example PKGBUILDs aren't being changed to show the contrary.
On false senses of security: Yes, there is some blind faith that an AUR maintainer just so happened to provide the correct checksums, but there's even a faith that the correct GPG keys are used and correct source host. Thankfully, it is usually plainly obvious if the latter is the case. :-)
On upstream always providing GPG signature files against tarballs: Beyond the fact that not all upstreams even do this (and you can make a fair argument that the AUR maintainer has no firm reason to believe _their_ download was the correct one), I'm not actually entirely convinced that they should always be expected to do as such.
This is a difficult position to defend, and it may come down to laziness, but hosting sites such as GitHub and GitLab provide automated tarball generation (by just using `git archive` on the backend -- it's easy to independently verify the archives). Speaking from my experience, it has become natural for me to stick with GPG-signing the tag in Git itself and ignoring output files such as these. It largely comes to “If you need to verify the integrity of the source code, I expect you to clone the repository and check that tag, and use `git archive | $HASH` to verify the archive GitHub/GitLab provide.”

I encourage you to run `git archive` on your local master repo, and generate a GPG signature against that... it will be reproducible in the autogenerated version. Then upload the GPG signature as a release artifact. Because you're right, it is sheer laziness. Downloading a potentially *huge* git repo just to verify signatures is madness. Try applying that logic to the linux kernel... *cloning* it begins to approach the length of time required to *build* it. Knowing beforehand how many commits to specify with --depth is not a reasonable answer. :)

Debian wrote a nice page about this: https://wiki.debian.org/Creating%20signed%20GitHub%20releases
Especially the alternative local workflow at the end, that is mostly what Eli proposes above. ;)
One example of package doing this is https://github.com/vector-im/riot-web, they included this easily in their release process at https://github.com/matrix-org/matrix-js-sdk/pull/351.
If you’re already signing tags used for releases, signing the tarball should really be easy and as underlined by Eli, quite a good idea.
So, yes, PGP everywhere please.
Cheers, Bruno
I agree that PGP everywhere is absolutely something to push for. On the other hand, not every developer is in the web-of-trust strong set, and if you're downloading the package sources from GitHub then that's probably where you got the PGP key id from as well. An attacker who can hijack your TLS-secured source download when you bump the package version could also have fed you a forged PGP key id when you first made the package. Upgrading to stronger checksums is only marginally less secure than using PGP.
That was part of my answer to NicoHood on aur-general in my TU application: PGP everywhere for sure, but not anyhow. PGP done without care is worth nothing, and same goes for the verification. Yes, this requires time. But any security system requires trust, and thus time. ;)
I think we ought to settle on what to do about these checksums. MD5 and SHA-1 are not strong enough to provide security but they're also too bloated for mere error correction. A change is definitely needed and we should decide on which direction to take.
I think that *sums should only be provided for integrity, not authenticity. I’m not an Arch Dev, nor a pacman/makepkg one though… Regards, Bruno
On 02/24/2017 10:41 AM, Kieran Colford wrote:
I agree that PGP everywhere is absolutely something to push for. On the other hand, not every developer is in the web of trust strong set
Which is why if you pedantically worship the web of trust strong set, PGP is kind of useless altogether, since you can never really trust it in practice. Or use TOFU.
and if you're downloading the package sources from Github then that's probably where you got the PGP key id from as well.
Or from any of the dozen other places you can find the developer's key. Particularly, their independent website (which is not GitHub). The fact that some users are stupid, is not an indictment against PGP.
An attacker who can hijack your TLS-secured source download when you bump the package version could also have fed you a forged PGP key ID when you first made the package. Upgrading to stronger checksums is only marginally less secure than using PGP.
What? The fingerprint is in the PKGBUILD, which is downloaded via HTTPS from a second website; forging it requires either breaking the HTTPS security model or violating multiple (presumably) secure channels, and it is also easily cross-verified against multiple independent sources.

PGP operates on a completely different conceptual landscape than checksums, and is *always*, no matter what, more "secure" than checksums. Once again, the existence of stupid users is not an indictment against PGP, and the fact that in cherry-picked situations PGP fails to live up to its end-user promise is not an indictment either.

PGP tells us a lot of things. It tells us the source is authorized by the same person who authorized multiple previous releases. It tells us the source is the one the AUR maintainer used. It tells us that someone who can be *absolutely* identified is the same person who did X and said X, on various mailing lists, websites, and partial PGP trust models. It tells us that we are still getting our sources from the same person we got them from last week/month/year.

The fact that its mere existence is not a magic talisman saying everything is wonderful, fine and safe... is not news, and is not a problem either, since no one ever said that is what it was supposed to do.

tl;dr The sky is not falling.

-- Eli Schwartz
On Fri, 2017-02-24 at 14:52 +0100, Bruno Pagani wrote:
Debian wrote a nice page about this: https://wiki.debian.org/Creating%20signed%20GitHub%20releases
This wiki offers bad advice. It trusts that GitHub itself is not compromised and will provide a good download based on the repository alone. Thankfully, because GitHub normally just uses `git archive` and those releases are deterministic, it can be solved by using your local repository alone, for example:

    $ git archive --format=tar.gz --prefix=mysoftware-0.4/ mysoftware-0.4 \
      | gpg -a -b -o mysoftware-0.4.tar.gz.asc
On 02/24/2017 03:27 PM, Mike Swanson wrote:
On Fri, 2017-02-24 at 14:52 +0100, Bruno Pagani wrote:
Debian wrote a nice page about this: https://wiki.debian.org/Creating%20signed%20GitHub%20releases
This wiki offers bad advice. It trusts that GitHub itself is not compromised and will provide a good download based on the repository alone.
Thankfully, because GitHub normally just uses `git archive` and those releases are deterministic, it can be solved by using your local repository alone, for example:
    $ git archive --format=tar.gz --prefix=mysoftware-0.4/ mysoftware-0.4 \
      | gpg -a -b -o mysoftware-0.4.tar.gz.asc
Congratulations, you have just won today's FUD award! For everyone else on this thread, what that Wiki *really* said, is:
4. Go back to your "Releases" section and download the tarball mysoftware-0.4.tar.gz automatically generated by GitHub. Verify that the tarball contains exactly the same data as the git repository.
Also, that Wiki page actually gave the original source for Mike's plagiarized local example. But someone should probably fix that Wiki, and Mike's untested plagiarism... because I, having actually tested it myself, can confirm those commands don't work, on account of someone being really confused about what a "tag" is.

The following git alias does work for the latest tag on $currentbranch, assuming the repo is cloned to a directory named the same as the remote repo name (but for more robustness, parse the output of `git config --get remote.$(git config --get branch.master.remote).url`). Which I believe is a reasonable assumption to make.

```
[eschwartz@arch ~]$ git config --get alias.github-archive
!sh -c 'repo=$(basename $(pwd)) && tag=$(git describe --abbrev=0 --tags) && git archive --prefix=${repo}-${tag#v}/ -o ${repo}-${tag#v}.tar.gz ${tag}'
```

-- Eli Schwartz
On Fri, 2017-02-24 at 16:01 -0500, Eli Schwartz wrote:
Congratulations, you have just won today's FUD award!
The goal, as I understood it, is to promote the practice of upstream developers (project maintainers, release managers, whomever) signing their code so that downstream users and packagers can verify that the source they receive is identical to what upstream wants to put out. For me, trusting the "generate an archive" step to a third party is in opposition to promoting good practice.

I don't care if GitHub is good today; they may not be good tomorrow, and if an upstream gets cozy with the idea of "just download the GitHub archive" to sign off a release, they open themselves up to a world of hurt when GitHub (or anyone successfully pulling off a MITM attack -- unlikely with HTTPS, but not entirely impossible) starts messing with those archives, inserting/changing things not supposed to be there.

I do believe there is a healthy amount of uncertainty and doubt to take here. It's great that GitHub generates archives today that are identical to git-archive's own files. It may not always be the case.
For everyone else on this thread, what that Wiki *really* said, is:
4. Go back to your "Releases" section and download the tarball mysoftware-0.4.tar.gz automatically generated by GitHub. Verify that the tarball contains exactly the same data as the git repository.
The wiki also skimmed over exactly how to do this. "diff -r", comparing checksums from git-archive, diffoscope?
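One way to do the comparison is to hash the *uncompressed* tar streams, since gzip headers (timestamp, compressor version) can differ even when the contents are identical. A minimal sketch, using a throwaway demo repository; the names "mysoftware" and "mysoftware-0.4" are hypothetical, and the locally generated tarball stands in for the GitHub download:

```shell
#!/bin/sh
set -e
# Throwaway demo repo standing in for a real project checkout.
tmp=$(mktemp -d)
cd "$tmp"
git init -q mysoftware
cd mysoftware
echo 'hello' > README
git add README
git -c user.name=demo -c user.email=demo@example.org commit -q -m 'release 0.4'
git tag mysoftware-0.4

# Stand-in for the tarball GitHub would autogenerate for this tag:
git archive --format=tar.gz --prefix=mysoftware-0.4/ \
    -o github-download.tar.gz mysoftware-0.4

# The actual check: compare checksums of the uncompressed tar streams.
local_sum=$(git archive --prefix=mysoftware-0.4/ mysoftware-0.4 | sha256sum)
remote_sum=$(gzip -dc github-download.tar.gz | sha256sum)
[ "$local_sum" = "$remote_sum" ] && echo 'tarball matches the repository'
```

`diff -r` against an extracted tree or diffoscope would work too; the stream comparison is just the cheapest option when `git archive` output is deterministic.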
Also, that Wiki page actually gave the original source for Mike's plagiarized local example. But someone should probably fix that Wiki, and Mike's untested plagiarism... because I, having actually tested it myself, can confirm those commands don't work on account of someone being really confused what a "tag" is.
I stopped reading after the prior point, but thanks for accusing me of plagiarism when their example doesn't even take the same route I did. Or accusing me of having it untested. I use the command all the time. It works. (And if you're saying any upstream developer doesn't understand what a tag is, I'm sorry. It's irresponsible to not know how to use your own tooling. Learn git and get good at it.)
On 02/26/2017 12:49 AM, Mike Swanson wrote:
On Fri, 2017-02-24 at 16:01 -0500, Eli Schwartz wrote:
Congratulations, you have just won today's FUD award!
The goal, as I understood it, is to promote the practice of upstream developers (project maintainers, release managers, whomever) signing their code so that downstream users and packagers can verify that the source they receive is identical to what upstream wants to put out. For me, trusting the "generate an archive" to a third party is in opposition of promoting good practice. I don't care if GitHub is good today, they may not be good tomorrow, and if an upstream gets cozy to the idea of "just download the GitHub archive" to sign off a release, they open themselves up to a world of hurt when GitHub (or anyone successfully pulling off a MITM attack -- unlikely with HTTPS, but not entirely impossible) starts messing with those archives, inserting/changing things not supposed to be there.
I do believe there is a healthy amount of uncertainty and doubt to take here. It's great that GitHub generates archives today that are identical to git-archive's own files. It may not always be the case.
For everyone else on this thread, what that Wiki *really* said, is:
4. Go back to your "Releases" section and download the tarball mysoftware-0.4.tar.gz automatically generated by GitHub. Verify that the tarball contains exactly the same data as the git repository.
The wiki also skimmed over exactly how to do this. "diff -r", comparing checksums from git-archive, diffoscope?
Which presumably implies that your complete and utter FUD regarding 'trusting the "generate an archive" to a third party' is, well, FUD. You just admitted that they (Debian Wiki) didn't actually say that, so why did you just try confusing the issue *again*? As for "exactly how to do this", I didn't realize there were so many people out there who both don't know how to compare a set of files, and cannot figure out how to google it. But those people should definitely not be trusted to write software if they fail at *both*, and anyway feel free to improve the Debian wiki if you really think this kind of extra-redundant redundancy is vital. I still maintain that accusing that page of offering "bad advice" is unmitigated FUD, but whatever...
I stopped reading after the prior point, but thanks for accusing me of plagiarism when their example doesn't even take the same route I did.
The fact that you copied their first command right down to the specific example software name/tag, is not negated by your further modification into a pipeline. Either way, you very clearly read the second half of that page, refused to acknowledge its existence, and then attempted to give the impression that your command-line-fu will save us all from the "bad advice" on that page. (And then you made the same *exact* implementation mistake they did, which would be, again, because you *copied* their implementation, mistake and all.)
Or accusing me of having it untested. I use the command all the time. It works.
(And if you're saying any upstream developer doesn't understand what a tag is, I'm sorry. It's irresponsible to not know how to use your own tooling. Learn git and get good at it.)
Well, you clearly aped the Debian Wiki's confusion over what a tag is, and I don't much care whether you know how the command works in general; blind copypasta is still untested.

But I'll humor you... the reason it doesn't work is that while it will certainly create *a* tar.gz archive (and sign it), it certainly won't create the *right* tar.gz archive/sig. If your tag is "mysoftware-0.4",[1] and the repo name is presumably "mysoftware", then GitHub will use a combined prefix of "mysoftware-mysoftware-0.4", which will generate a totally different tarball. And if not (because your tag is named "0.4"), then `git archive` won't accept invalid object names like "mysoftware-0.4". Again, you clearly didn't test *this* command, however much you may have tested `git archive` in the general sense... or you would have gotten "invalid signature" errors when attempting to use the results.

To recap: if you want to generate the same archive as GitHub will, for reproducibility/signing/whatever reasons... the correct way to do it is by generating a tar.gz archive and using `--prefix=${reponame}-${tag#v}/`. (The #v part is because yes, GitHub will strip out the "v" if you tag your releases as "v0.4".)

-- Eli Schwartz

[1] It is a terrible minority practice to embed the software name in the release tag version name. So the Debian Wiki is actually offering bad advice, just not the bad advice you thought they were.
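The recap above can be sketched end to end. The demo repo, tag "v0.4", and name "mysoftware" are hypothetical; in a real checkout you would run only the recipe part after the setup comment:

```shell
#!/bin/sh
set -e
# Setup: throwaway demo repo (replace with your real checkout in practice).
tmp=$(mktemp -d)
cd "$tmp"
git init -q mysoftware
cd mysoftware
echo 'hello' > README
git add README
git -c user.name=demo -c user.email=demo@example.org commit -q -m 'release'
git tag v0.4

# Recipe: repo name from the directory, latest tag, and a prefix with the
# leading "v" stripped, matching GitHub's "reponame-version/" naming.
repo=$(basename "$PWD")                 # "mysoftware"
tag=$(git describe --abbrev=0 --tags)   # "v0.4"
ver=${tag#v}                            # "0.4", as GitHub names it
git archive --format=tar.gz --prefix="${repo}-${ver}/" \
    -o "${repo}-${ver}.tar.gz" "$tag"
tar tzf "${repo}-${ver}.tar.gz"   # listing includes mysoftware-0.4/README
```

Piping the archive into `gpg -a -b` instead of `-o` then yields a detached signature that also verifies against GitHub's autogenerated tarball, provided the stream really is identical.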
Any further off-topic posting will result in permanent additions to the moderation queue (which is never checked). A
participants (6):
- Allan McRae
- Bruno Pagani
- Eli Schwartz
- Giancarlo Razzolini
- Kieran Colford
- Mike Swanson