[arch-dev-public] Use detached package signatures by default

Anatol Pomozov anatol.pomozov at gmail.com
Thu Jul 9 03:05:28 UTC 2020


TL;DR: let’s start using detached package signatures to make system
updates faster.

Hi folks,

Some time ago there was a discussion on IRC where someone (Allan
maybe?) proposed to stop using embedded PGP signatures in favor of
detached signature files. I would like to bring this idea here and
quantify it with some numbers.

Here are some technical details on this topic. Pacman can verify the
authenticity of package files with PGP signatures. The signatures
protect against undesired package modifications by a third party and
so improve the security of package management. This feature can be
configured per repository, and the official Arch Linux repos have it
enabled. Package signatures have been used by Arch Linux successfully
for a couple of years now.

Package signatures are stored as part of the pacman database file
(these are called “embedded signatures”). One issue with embedded
signatures is that they make up quite a large chunk of the database
file. What is worse, a PGP signature is high-entropy data and does not
compress well. I was mildly shocked to learn how much of the *.db
files the signatures consume.

I ran experiments and repackaged the extra and community databases
without PGP data. The uncompressed “extra” database drops to 83% of
its original size (though the uncompressed size is not that
interesting). Arch uses gzip-compressed databases, and in this case
removing the signatures reduces the “extra” database to 36.8% of its
original size. To emphasize it one more time: removing PGP signatures
leaves this repo at only about 1/3 of its original size. The change is
even more dramatic with “zstd -19” compression, where the final
database file is only 31% of its original size.

For community.db the numbers are: the uncompressed file shrinks to
79.8% of its original size, “gzip -9” to 33.4%, and “zstd -19” to
27.51%.
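
Put the other way around, the share of each compressed database that
signatures account for is simply 100% minus the figures above, which
works out to between roughly 63% and 72%. A minimal sketch of that
arithmetic follows; the function names are made up for illustration
and are not part of pacman.

```c
#include <stdio.h>

/* Share of a database (in percent) taken up by signature data, given
 * the size of the signature-free database as a percentage of the
 * original. Hypothetical helper, not part of pacman. */
double signature_share(double stripped_pct)
{
    return 100.0 - stripped_pct;
}

/* Print the signature share for the compressed-size figures quoted
 * in this mail. */
void print_signature_shares(void)
{
    printf("extra     gzip:     %.1f%%\n", signature_share(36.8));
    printf("extra     zstd -19: %.1f%%\n", signature_share(31.0));
    printf("community gzip -9:  %.1f%%\n", signature_share(33.4));
    printf("community zstd -19: %.1f%%\n", signature_share(27.51));
}
```

Calling print_signature_shares() from a main() prints the four shares.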

A database is modified with every package update. Users then need to
re-download databases in which about 2/3 of the data is package
signatures, each of which is used only when that specific package is
installed.

An alternative to embedded signatures is detached signatures. These
are signatures stored in a separate file next to the package itself
(in a <pkg>.sig file to be specific). Instead of downloading *all*
signatures every time a database is updated, detached signatures are
downloaded only when a specific package is installed/updated. If Arch
switched to this model, the database files would become about 3 times
smaller, which saves users bandwidth and system update time.
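
The trade-off can be written down as a rough model: with embedded
signatures every database refresh pays for all signatures, while with
detached signatures each refresh is smaller but every installed or
updated package costs one extra .sig download. The sketch below is
only that model; all names and parameters are assumptions chosen for
illustration, and it ignores HTTP overhead and the packages themselves.

```c
/* Metadata bytes downloaded with embedded signatures: every refresh
 * of a changed database fetches the full file, signatures included.
 * Hypothetical model, not pacman code. */
double embedded_bytes(double db_bytes, unsigned refreshes)
{
    return db_bytes * refreshes;
}

/* Metadata bytes downloaded with detached signatures: each refresh
 * fetches the stripped database (stripped_ratio is its size relative
 * to the original, e.g. 0.368), plus one signature file per package
 * that is actually installed or updated. */
double detached_bytes(double db_bytes, double stripped_ratio,
                      unsigned refreshes, double sig_bytes,
                      unsigned pkgs_installed)
{
    return db_bytes * stripped_ratio * refreshes
         + sig_bytes * pkgs_installed;
}
```

Detached signatures win in this model whenever the per-package .sig
downloads add up to less than the signature data saved across all
database refreshes.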

I looked through the pacman code and most components already support
detached signatures. Most places have logic like this:

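   /* prefer the signature embedded in the database, if any */
   if(pkg->embedded_sig) {
       use(pkg->embedded_sig);
   } else {
       /* otherwise fall back to the detached <pkg>.sig file */
       sig = load_detached_sig(pkg);
       use(sig);
   }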
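
For illustration, here is a self-contained sketch of that fallback
pattern. The struct, the loader callback and the function names are
hypothetical stand-ins; pacman’s real data structures and functions
differ.

```c
#include <stddef.h>

/* Hypothetical, simplified package record (not pacman's real type). */
struct pkg {
    const char *filename;     /* package file name */
    const char *embedded_sig; /* signature from the database, or NULL */
};

/* Hypothetical callback that returns the contents of the detached
 * <pkg>.sig file, or NULL if no such file is available. */
typedef const char *(*sig_loader)(const struct pkg *pkg);

/* Stand-in loader for the example: pretends a .sig file was read. */
const char *fake_detached_loader(const struct pkg *pkg)
{
    (void)pkg; /* a real loader would open "<filename>.sig" */
    return "detached-signature-bytes";
}

/* Prefer the embedded signature; fall back to the detached one. */
const char *select_signature(const struct pkg *pkg, sig_loader load_detached)
{
    if (pkg->embedded_sig != NULL) {
        return pkg->embedded_sig;
    }
    if (load_detached != NULL) {
        return load_detached(pkg);
    }
    return NULL; /* no signature of either kind */
}
```

With this shape, dropping embedded signatures from the database only
changes which branch is taken; the verification step that consumes
the returned signature stays the same.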

I found only 2 places where pacman does not fall back to a detached signature:

1) Keyring key check. Pacman was using embedded signatures only here.
This has been fixed in pacman’s commit b01bcc7d3d680 and the fix will
be available in pacman version 6.x.

2) dump_pkg_full(), which dumps package information. If a package uses
detached signatures only, it prints “None”. I think this is fine, as
this function displays database entries and does not affect the
package verification process.

I disabled the embedded signatures on my test machine to use
detached signatures only and things look great so far. ‘pacman
--debug’ confirms that detached signatures are correctly downloaded
and used to verify the package content.

Given this information, I would like to propose that we stop using
embedded signatures and move to detached signatures by default. This
will require pacman 6.x or, as an alternative, backporting the fix(es)
to the 5.x branch. It will help make system updates even faster,
something that I and many other Arch users really love.

