[pacman-dev] Repository management

Mark Weiman mark.weiman at markzz.com
Sun Jul 30 02:36:01 UTC 2017


On Tue, 2017-05-09 at 22:54 +1000, Allan McRae wrote:
> Hi all,
> 
> Every time I attempt to work on repo-add, I find it to be a very
> difficult endeavour.  Even though it is half the size of makepkg
> (without even including any of libmakepkg), it is much more convoluted
> to work on.
> 
> We also have a weird repository database system.  We have:
> - .db dbs with package information, signatures and delta information
> - .files dbs that are the same as .db dbs but additionally include
>   filelists
> 
> There are two reasons the .files dbs replicate all information in the
> .db dbs:
>  - .db and .files dbs getting out of sync could cause issues
>  - a complete database is useful for things like archweb, mostly to
>    avoid the above
> 
> I would also like to include information on source packages in these
> databases.  The files information is separate due to wanting our
> primary database to be small.  Likewise, source package information
> needs to be separate (the signatures take most of the size in the .db
> dbs, so adding source package signatures effectively doubles the size).
> 
> So two points up for discussion:
> 
> 
> 1) Sync repository layout?  I don't see any point in leaving the tar
> based format, as reading of sync databases is not a bottleneck.  (The
> local db format can be a bottleneck, but that is a separate
> discussion...)
> 
> Do we split the information in .db out of .files and add a .full db
> with complete information?  Then any .src db could follow suit and
> just have source package information.  How do we get around the out
> of sync issue (e.g., a package is removed from .db, but we have an old
> .files database with it)?  Do we add timestamps, and print a warning
> on -F operations when the two are out of sync?
> 

Perhaps instead of timestamps, we could add a .DBINFO file containing a
hash that is shared between the .db and .files databases (and perhaps
the source db as well). That way, anything reading the .files db can
tell when it doesn't match the .db (in my opinion the .db is the more
important of the two, so that's what I would compare everything
against).
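
To make the idea concrete, here is a rough sketch of the check, written
in Python purely for illustration (a real version would presumably live
in libalpm or repo-add). Only the .DBINFO name comes from the proposal
above. Its contents (a single line holding the shared hash) and all of
the function names are assumptions made up for this example.

```python
import io
import tarfile

# Hypothetical member name, taken from the proposal in this mail.
DBINFO_NAME = ".DBINFO"


def write_db(path, sync_id, entries=()):
    """Write a toy gzip'd tar database: a .DBINFO member plus empty entries.

    Assumption: .DBINFO holds nothing but the shared hash on one line.
    """
    members = [(DBINFO_NAME, sync_id + "\n")] + [(name, "") for name in entries]
    with tarfile.open(path, "w:gz") as tar:
        for name, text in members:
            payload = text.encode("utf-8")
            info = tarfile.TarInfo(name=name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))


def read_sync_id(path):
    """Return the hash stored in the database's .DBINFO member, or None."""
    with tarfile.open(path, "r:*") as tar:
        try:
            member = tar.getmember(DBINFO_NAME)
        except KeyError:
            return None  # database predates .DBINFO
        fobj = tar.extractfile(member)
        if fobj is None:
            return None
        return fobj.read().decode("utf-8").strip()


def files_db_matches(db_path, files_path):
    """True only if both databases carry the same, non-missing hash.

    The .db is treated as the reference, as suggested above.
    """
    db_id = read_sync_id(db_path)
    return db_id is not None and db_id == read_sync_id(files_path)
```

A -F operation could call something like files_db_matches() first and
print a warning when it returns False, instead of comparing timestamps.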

I'm not really sure what good a .full db would do for us, though. It
just seems to me like extra stuff to download.

> 
> 2) Do we need a better (read "more easily maintainable") tool for
> handling database generation and updates?  libalpm already can read in
> information from package files, so we could add libalpm/db_write.c
> with the database creation functions.  Should we unify our repo format
> with our local database format, which we already write?
> 

I think this would be great, especially implementing it in libalpm.
That would allow projects like pyalpm or my own php-alpm to be used to
create repos as well.

> 
> I am looking for ideas here.  Please brainstorm to your heart's
> content.

I know this is two months after the fact, but here's my take on it.

Mark

