On 05/09/17 at 10:54pm, Allan McRae wrote:
Hi all,
Every time I attempt to work on repo-add, I find it to be a very difficult endeavour. Even though it is half the size of makepkg (without even including any of libmakepkg), it is much more convoluted to work on.
We also have a weird repository database system. We have:
 - .db dbs with package information, signatures and delta information
 - .files dbs that are the same as .db dbs but additionally include filelists
There are two reasons the .files dbs replicate all information in the .db dbs:
 - .db and .files dbs getting out of sync could cause issues
 - a complete database is useful for things like archweb, mostly to avoid the above
I would also like to include information on source packages in these databases. The files information is separate due to wanting our primary database to be small. Likewise, source package information needs to be separate (the signatures take most of the size in the .db dbs, so adding source package signatures effectively doubles the size).
So two points up for discussion:
1) Sync repository layout? I don't see any point in leaving the tar based format, as reading of sync databases is not a bottleneck. (The local db format can be a bottleneck, but that is a separate discussion...)
Do we split the information in .db out of .files and add a .full db with complete information? Then any .src db could follow suit and just have source package information. How do we get around the out of sync issue (e.g., a package is removed from .db, but we have an old .files database with it)? Do we add timestamps, and print a warning on -F operations when the two are out of sync?
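To make the timestamp idea concrete, here is a minimal sketch of the check a -F operation might run. It assumes each database carries a generation timestamp that repo-add stamps identically on both files; no such field exists today, and the function name and warning text are hypothetical.

```python
from typing import Optional


def files_db_sync_warning(repo: str, db_timestamp: int,
                          files_timestamp: int) -> Optional[str]:
    """Return a warning when the .db and .files timestamps differ, else None.

    Both timestamps are hypothetical: they stand for a value written into
    each database when it was generated.
    """
    if db_timestamp == files_timestamp:
        return None
    # Name the older database so the user knows which one to refresh.
    stale = "files" if files_timestamp < db_timestamp else "db"
    return (f"warning: {repo}.db and {repo}.files are out of sync "
            f"({repo}.{stale} is older)")
```

A caller would print the returned string, if any, before showing -F results.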
What about just not including the signature in the database? Make the inclusion of the signature optional and have pacman (or whatever downloads the source package) also look for a corresponding .sig file if it's not in the db. pacman -U already looks for a .sig file when downloading a package and you have a feature request to download .sig files even with -S, so code-wise this seems like a pretty clean solution. Then you can include the source information right in the primary DB and Arch's devtools can opt to omit the signature from the db.
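Roughly, the lookup order I have in mind would be something like the sketch below. This is Python for illustration only, not pacman code: the function name is made up, the db entry is modelled as a plain dict of desc fields, and I am assuming the signature is stored base64-encoded under PGPSIG as the sync dbs do now.

```python
import base64
from pathlib import Path
from typing import Optional


def find_signature(db_entry: dict, pkg_path: Path) -> Optional[bytes]:
    """Return the detached signature for a package, or None if there is none.

    db_entry is a hypothetical dict of the package's db fields.
    """
    # 1. Prefer the signature embedded in the database, when present.
    sig_b64 = db_entry.get("PGPSIG")
    if sig_b64:
        return base64.b64decode(sig_b64)
    # 2. Otherwise fall back to a .sig file next to the package file.
    sig_path = pkg_path.with_name(pkg_path.name + ".sig")
    if sig_path.is_file():
        return sig_path.read_bytes()
    return None
```

A repo that omits signatures from its db would just hit the second branch every time.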
2) Do we need a better (read "more easily maintainable") tool for handling database generation and updates? libalpm can already read in information from package files, so we could add libalpm/db_write.c with the database creation functions. Should we unify our repo format with our local database format, which we already write?
I would love to see us drop the ini-style .PKGINFO format, if that's what you mean. Even without adding a database writer to libalpm, having two formats for the exact same data is unnecessary and leads to inconsistencies between the two.

apg