[arch-dev-public] arch-repo-management walkthrough 2022-02-02 19:00 CET (UTC+01:00)

David Runge dave at sleepmap.de
Mon Jan 31 14:36:02 UTC 2022


On 2022-01-31 23:55:07 (+1000), Allan McRae via arch-dev-public wrote:
> Any chance this can be recorded?  It will be at 4am in my timezone?

I think that can certainly be arranged!

> I am interested in mainly what problem this is solving.  From what I
> can tell, our current workflow is package->db, and this goes
> package->json->db. What is the advantage of the extra step?  Will this
> be covered by your talk?

Without going into too much detail:
It allows us to import the current package repository databases, retain
their entire state in a decomposed directory structure (e.g. in a git
repository), and reproduce the package repository databases from that
state.
This is somewhat similar to our current "package sources and binary
package location" state approach in svn. The difference is that
arch-repo-management would allow the *entire state* of a binary package
repository (default database and files database) to be described in a
unified, decomposed directory structure, and would provide transparent,
validated builds or rebuilds of binary package databases from that
state.
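
To make the package->json->db round trip a bit more concrete, here is a
minimal sketch in Python. It is not the arch-repo-management code: the
function names and the example entry are made up for illustration, and
it only assumes the "%KEY%" section layout that pacman uses in the desc
files of a sync database.

```python
import json


def parse_desc(text: str) -> dict[str, list[str]]:
    """Parse '%KEY%' sections of a desc file into a dict of value lists."""
    entry: dict[str, list[str]] = {}
    key = None
    for line in text.splitlines():
        if len(line) > 2 and line.startswith("%") and line.endswith("%"):
            key = line.strip("%")
            entry[key] = []
        elif line and key is not None:
            entry[key].append(line)
    return entry


def render_desc(entry: dict[str, list[str]]) -> str:
    """Render the dict back into the '%KEY%' section layout."""
    sections = []
    for key, values in entry.items():
        sections.append(f"%{key}%\n" + "".join(f"{v}\n" for v in values) + "\n")
    return "".join(sections)


# Hypothetical database entry, used only for this illustration.
DESC = "%NAME%\nexample\n\n%VERSION%\n1.0-1\n\n%DEPENDS%\nglibc\nzlib\n\n"

# Decompose into JSON (the state kept in the management repository) ...
as_json = json.dumps(parse_desc(DESC), indent=2)
# ... and reproduce the database entry from that state.
rebuilt = render_desc(json.loads(as_json))
```

In this sketch the JSON document is the tracked state, and the database
entry is a pure function of it, which is what makes rebuilds checkable.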

When comparing the svn and git approaches, the fundamental difference is
this: with svn we track both the package sources *and* their "location"
state in the repositories, while repo-add/repo-remove is used to
add/remove things on the fly in the package repository databases.
With a future git-based setup we would have a package source repository
per pkgbase and a management repository for arch-repo-management, which
tracks the state of the repositories transparently and should allow for
atomic operations on the package repository databases (dbscripts, for
example, may fail halfway through and leave repositories in a somewhat
undefined state when "moving" package files from a to b).
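
As an illustration of what "atomic" can mean at the file level, here is
a minimal Python sketch that builds the new database content in a
temporary file and then swaps it into place with a single rename. This
is only an assumed building block, not how arch-repo-management or
dbscripts actually work; write_atomically and the paths are
hypothetical.

```python
import os
import tempfile
from pathlib import Path


def write_atomically(target: Path, data: bytes) -> None:
    """Write data to target so readers see either the old or the new file."""
    target.parent.mkdir(parents=True, exist_ok=True)
    # The temporary file lives in the same directory, so the final rename
    # stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=target.parent, prefix=target.name + ".")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())
        os.replace(tmp_name, target)  # single rename: the swap itself
    except BaseException:
        # On any failure, drop the partial file and leave target untouched.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise


with tempfile.TemporaryDirectory() as workdir:
    db = Path(workdir) / "repo" / "example.db"
    write_atomically(db, b"old state")
    write_atomically(db, b"new state")
    final_content = db.read_bytes()
    leftovers = sorted(p.name for p in db.parent.iterdir())
```

A real repository update touches several files (default database, files
database, package files), so a single rename like this covers only one
piece of such an operation.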

> Also a couple of quick comments:
> 
> 1) might as well drop putting the signature into the package database
> - pacman will not add these by default from next release as the
> signatures are downloaded alongside the package.  This reduced db size
> substantially.

Yes, that is an open topic in the implementation (this was decided after
I implemented it, or at least I only learned of the change after I had
implemented this attribute).

For me this removal raises the following question, which has been
bothering me a bit, and maybe you have an idea how to solve it:
How would you filter the packages in a repository by a particular PGP
key? We have had quite a few rebuilds due to invalid packager keys or
resigning packager keys. It would be great to keep this in mind, as I
believe that querying all PGP signature files of a repository to do so
is not very feasible. Maybe this information can still live on in the
proposed management repository as otherwise unused "metadata" (e.g. PGP
key ID) of a given pkgbase, populated upon import of a given package or
set of packages.
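
A rough Python sketch of that idea: if each pkgbase entry carried the
key ID as metadata, finding everything signed by one key would be a
simple lookup instead of a scan over all signature files. The field
name pgp_key_id, the package names and the key IDs below are all
hypothetical placeholders.

```python
from collections import defaultdict

# Hypothetical per-pkgbase metadata as it might be populated on import.
METADATA = {
    "example-a": {"version": "1.0-1", "pgp_key_id": "0000000000000001"},
    "example-b": {"version": "2.3-2", "pgp_key_id": "0000000000000002"},
    "example-c": {"version": "0.9-4", "pgp_key_id": "0000000000000001"},
}


def index_by_key(metadata: dict[str, dict[str, str]]) -> dict[str, list[str]]:
    """Group pkgbase names by the key ID recorded in their metadata."""
    index: dict[str, list[str]] = defaultdict(list)
    for pkgbase in sorted(metadata):
        index[metadata[pkgbase]["pgp_key_id"]].append(pkgbase)
    return dict(index)


KEY_INDEX = index_by_key(METADATA)
# Everything that would need a rebuild if this key became invalid.
needs_rebuild = KEY_INDEX.get("0000000000000001", [])
```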

> 2) I see databases hard coded as gz.  I think we should investigate
> switching to zstd - we did not switch to xz due to performance
> compared to gz, but I think zstd does not have that issue.

That is an implementation detail and can be changed or extended (it is
just not exposed to the outside currently). At the time of writing we
are using .gz, which is why I used it: it lets me test against live
databases.
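
To show why this is only an implementation detail, here is a small
Python sketch in which the compression is a parameter of the database
writer rather than a hard-coded choice. The function names and the
entry are hypothetical; zstd is left out only because it is not
available in every Python standard library version.

```python
import io
import tarfile

# Supported suffixes mapped to tarfile compression names.
TAR_MODES = {".gz": "gz", ".xz": "xz", ".bz2": "bz2"}


def build_db(entries: dict[str, bytes], suffix: str) -> bytes:
    """Pack entries into a compressed tar archive chosen by suffix."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode=f"w:{TAR_MODES[suffix]}") as archive:
        for name, payload in sorted(entries.items()):
            info = tarfile.TarInfo(name)
            info.size = len(payload)
            archive.addfile(info, io.BytesIO(payload))
    return buffer.getvalue()


def read_db(blob: bytes) -> dict[str, bytes]:
    """Read entries back, letting tarfile detect the compression."""
    with tarfile.open(fileobj=io.BytesIO(blob), mode="r:*") as archive:
        return {m.name: archive.extractfile(m).read() for m in archive.getmembers()}


# Hypothetical single-entry database content.
ENTRIES = {"example-1.0-1/desc": b"%NAME%\nexample\n\n"}
gz_db = build_db(ENTRIES, ".gz")
xz_db = build_db(ENTRIES, ".xz")
```

Both archives decode to the same entries, so callers do not need to
care which compression was picked.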

Best,
David

-- 
https://sleepmap.de