Hi Jelle

On Thu, Jul 9, 2020 at 2:00 AM Jelle van der Waa <jelle@vdwaa.nl> wrote:
On 09/07/2020 05:05, Anatol Pomozov via arch-dev-public wrote:
TLDR; let’s start using detached package signatures to make system updates faster.
Hi folks,
Some time ago there was a discussion on IRC where someone (Allan maybe?) proposed to stop using embedded PGP signatures in favor of detached signature files. I would like to bring this idea here and quantify it with some numbers.
The downside of not having the package signatures in the database is that consumers cannot easily obtain this information. For archweb, that means showing who signed the package on the package details page.
How would I implement an efficient alternative without fetching package files or all the sig files? A separate sig database? :P
The best option is to download and parse the signature file directly. Its filename is going to be <pkgfilename>.sig, where <pkgfilename> is available in the package description as the %FILENAME% entry.
For now I'll have to adjust the code so that it does not break because of a missing PGPSIG entry.
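The lookup described above could be sketched roughly as follows. This is only an illustration, not archweb or pacman code: it assumes the usual "desc" layout of a %FIELD% header line followed by value lines and a blank line, and the function names and the sample package are made up for the example.

```python
def parse_desc(text):
    """Parse a desc-style entry into {field: [values]}."""
    fields = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            current = None  # a blank line ends the current field
        elif len(line) > 2 and line.startswith("%") and line.endswith("%"):
            current = line[1:-1]  # "%FILENAME%" -> "FILENAME"
            fields[current] = []
        elif current is not None:
            fields[current].append(line)
    return fields


def signature_filename(desc):
    """Return '<pkgfilename>.sig' derived from the %FILENAME% entry."""
    return desc["FILENAME"][0] + ".sig"


def embedded_signature(desc):
    """Return the embedded signature, or None if %PGPSIG% is missing."""
    values = desc.get("PGPSIG")
    return values[0] if values else None


# Hypothetical entry without a %PGPSIG% field.
SAMPLE = """%FILENAME%
example-1.0-1-x86_64.pkg.tar.zst

%NAME%
example

%VERSION%
1.0-1
"""

desc = parse_desc(SAMPLE)
print(signature_filename(desc))
print(embedded_signature(desc))
```

Using desc.get() for PGPSIG means the same code keeps working for databases with and without embedded signatures.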
Here are some technical details on this topic. Pacman can verify the authenticity of package files using PGP signatures. PGP signatures protect against undesired package modifications by a third party and so improve the security of package management. This feature can be configured per repository, and the official Arch Linux repos have it enabled. Package signatures have been used by Arch Linux successfully for a couple of years now.
<snip>
An alternative to embedded signatures is detached signatures. These are signatures stored in a separate file next to the package itself (in a <pkg>.sig file to be specific). Instead of downloading *all* signatures every time a database is updated, detached signatures are downloaded only when a specific package is installed/updated. If Arch could switch to this model, the database files would become 3 times smaller, which saves users bandwidth and system update time.
It would be insightful to provide the database numbers, because one could argue that 30% of 1MB is nothing, whereas 30% of 100M is a nice improvement.
Our biggest database should be community (5M atm), and with all the savings that would now be ~ 2 MB? Would be nice to have an overview of the real life numbers :)
For the compressed "community" database the savings are going to be 5.2M -> 1.73M (gzip) or 1.26M (zstd -19). Including the other dbs, I would say that an average user is looking at 7M -> 2.2M total savings in database size. Keep in mind that database downloading/parsing is on the critical path. Every user downloads these db files pretty much every time "pacman -Sy" is run. Detached signatures make this step faster by reducing the workload and downloading signatures on demand later.
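To put the sizes above in relative terms, here is a small sketch of the arithmetic. It assumes the 5.2M baseline applies to both the gzip and the zstd comparison; the helper name is made up for the example.

```python
def reduction(before_mb, after_mb):
    """Return (shrink factor, percent saved) for a size change."""
    factor = before_mb / after_mb
    saved_pct = 100 * (before_mb - after_mb) / before_mb
    return factor, saved_pct


# Sizes in MB, taken from the figures quoted in this thread.
CASES = [
    ("community (gzip)", 5.2, 1.73),
    ("community (zstd -19)", 5.2, 1.26),
    ("all dbs, average user", 7.0, 2.2),
]

for label, before, after in CASES:
    factor, saved = reduction(before, after)
    print(f"{label}: {before}M -> {after}M "
          f"({factor:.1f}x smaller, {saved:.0f}% saved)")
```

With these inputs the gzip case works out to roughly a 3x reduction, in line with the "3 times smaller" estimate.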