[pacman-dev] [PATCH] Increase maximum database size

Eli Schwartz eschwartz at archlinux.org
Sun Jan 19 00:36:24 UTC 2020


On 1/18/20 6:42 PM, Allan McRae wrote:
> We previously had the maximum database size as 25MB.  This was set in the days
> before repos had as many packages as they do now, and before we started
> distributing files databases.  Increase this limit to 128MB.

Whatever happened to that long-ago idea to make .sig files be
downloaded on-demand rather than embedding them into the .db? This
would have the added bonus of making downloads actually not be so big by
default...

Another potential optimization ISTR us discussing is making
community.files not include the content from community.db, and providing
e.g. community.alldb for anyone who needs the combined form.

Aside from that... we currently use .gz for databases on our official
infrastructure. We could get much better compression than that, I'm
sure. e.g. why not use xz there? We could even use xz -9: the databases
tend to be fairly small, so optimizing decompression speed by switching
to zstd is not really so important IMO, and xz with level -9 compression
can beat zstd -20 in both size and compression speed.

community.files.gz when recompressed with either xz or zstd drops from
20MB to 15MB; exact numbers look like this:

$ du -b /var/lib/pacman/sync/community.files /tmp/community.files.*
20769830	/var/lib/pacman/sync/community.files
14969268	/tmp/community.files.xz
15090081	/tmp/community.files.zst
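
For anyone who wants to try this kind of recompression without touching
a real sync database, here is a minimal sketch in Python. It only covers
gzip versus xz, and it runs on a synthetic, path-heavy blob rather than
the real community.files, so the sizes it prints say nothing about the
numbers above. The helper names (recompress_gzip_as_xz,
make_fake_files_listing) and the fake paths are made up for
illustration.

```python
import gzip
import lzma
import random


def recompress_gzip_as_xz(gz_bytes: bytes, preset: int = 9) -> bytes:
    """Decompress a gzip payload and recompress it as xz at the given preset."""
    raw = gzip.decompress(gz_bytes)
    return lzma.compress(raw, format=lzma.FORMAT_XZ, preset=preset)


def make_fake_files_listing(n_packages: int = 200, seed: int = 0) -> bytes:
    """Build a synthetic list of file paths (NOT real repository data)."""
    rng = random.Random(seed)
    lines = []
    for pkg in range(n_packages):
        lines.append("%FILES%")
        for i in range(50):
            # Hypothetical paths, loosely imitating a package with many small files.
            lines.append(
                f"usr/lib/node_modules/pkg{pkg}/lib/mod{rng.randrange(1000)}/index{i}.js"
            )
    return "\n".join(lines).encode()


raw = make_fake_files_listing()
gz = gzip.compress(raw, compresslevel=9)
xz = recompress_gzip_as_xz(gz)

# Report the three sizes in bytes, similar in spirit to `du -b`.
print(f"raw={len(raw)} gz={len(gz)} xz={len(xz)}")
```

To run it against a real database, one would read the file's bytes and
pass them to recompress_gzip_as_xz() instead of the synthetic payload.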

...

I'm not really a fan of just bumping the size forever, because it seems
to me people who are running into this issue are indeed doing something
they shouldn't. A 128MB repository that consumes 128MB of bandwidth on
every pacman -Syu just because a single package has been updated is
really not nice... I feel like the proper solution is more aggressive
compression, figuring out why these .files databases are actually so
huge (nodejs packages are probably a really annoying problem because
that completely ridiculous language will ship an application composed of
several hundred thousand micro-files, and the files database needs to
record every single path, so I'd quite like nodejs packaging to die a
horrible death), and, if databases are still running into size limits,
shipping packages in split repositories.

Splitting out more repositories is not just about "fooling" pacman into
splitting the limits up among multiple repos. It's about making a
single-package update only trigger an update to one of the split
repos. This strikes me as exactly the purpose of instituting a size
limit to begin with!
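
To put rough numbers on the bandwidth argument, here is a small sketch
in Python. It assumes, as described above, that a database is
re-downloaded in full whenever any package in it changes. The repo names
and the even four-way split are hypothetical, chosen only to make the
arithmetic easy to follow.

```python
MB = 1024 * 1024


def bytes_downloaded(repo_sizes: dict[str, int], touched: set[str]) -> int:
    """Sum the sizes of every repo database that changed and must be re-fetched."""
    return sum(size for name, size in repo_sizes.items() if name in touched)


# One 128MB database versus the same content split into four 32MB databases.
# Both layouts are illustrative assumptions, not real repositories.
monolithic = {"big": 128 * MB}
split = {"big-a": 32 * MB, "big-b": 32 * MB, "big-c": 32 * MB, "big-d": 32 * MB}

# A single package update touches exactly one database in either layout.
one_update_mono = bytes_downloaded(monolithic, {"big"})
one_update_split = bytes_downloaded(split, {"big-b"})

print(one_update_mono // MB, one_update_split // MB)  # prints: 128 32
```

The saving shrinks as updates spread across more of the split repos; if
all four are touched, the total is back to 128MB.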

-- 
Eli Schwartz
Bug Wrangler and Trusted User
