[arch-dev-public] New /arch/db-* scripts

Paul Mattal paul at mattal.com
Sun Jul 8 08:20:34 EDT 2007


Thomas Bächler wrote:
> I started writing new database scripts to replace the old
> db-*/updatesync-many/pkgdb2 scripts. The reasons are support for the new
> pacman 3 naming scheme and fixes for some design issues.
> 
> The old scripts use PKGBUILDs a lot for their operation and don't check
> the package file for consistency with the PKGBUILD (only the filename).
> They also move every file from the staging dir, regardless of whether
> the package will be added in any repository. And they recreate much of
> the functionality from updatesync, sort of reinventing the wheel.
> 
> My new draft has a cleaner design, but some new problems appear. It
> performs the following steps:
> 
> 1) Check every file in the staging/add dir using a small libalpm-based
> tool and obtain the pkgname, pkgver, pkgrel and architecture. Compare
> the arch specified in the commandline with the arch from the package
> (this step is missing in the old scripts). Find the PKGBUILD and compare
> pkgname, pkgver and pkgrel with the values from the package. If all
> checks are okay, add the package to a list. If, in addition, a force
> flag is set in the PKGBUILD, add it to a "force-list".
> 
> 2) Check every file in the staging/del dir and obtain its pkgname. Add
> this package to a delete list.
> 
> 3) Lock the database
> 
> 4) Move all packages and force-packages to the ftp dir, add them with
> repo-add.
> 
> 5) Pass the package files to a pkgdb2-like tool to add them to the web
> interface.
> 
> 6) Delete all packages on the delete-list with repo-remove.
> 
> 7) Pass the package names to a pkgdb2-like tool to remove them from the
> web interface.
> 
> 8) Release the database lock.
> 
> The problems I run into are these:
> 
> a) In step 4) I can't determine the filename of the old package and
> remove it from the ftp. I could scan for certain filename schemes, but
> take into account that the package filename could be anything now; the
> script doesn't care. We could, however, rely on our current filename
> scheme to find/remove the dupes (who would go to the trouble of renaming
> his package to something like wrongname-notapackage.zip.bz8 anyway?).
> 
> b) The same applies to step 6). But we mv our packages to
> staging/del anyway, so the script won't have to remove them from the ftp.
> 
> c) This is the biggest problem currently. In step 5), due to the new
> script design, I don't have any data from the PKGBUILD any more and
> don't want to go find it again. That means I only use data from the
> package file itself to add it to the mysql db. The package file lacks
> the "package category" and "source" which are in the web interface. I
> could only solve this by either removing this data from the web
> interface or adding it to the package file.
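
For reference, the ordering of steps 3) through 8) above can be sketched
as follows. This is only an illustration, not the draft script itself: the
four callables stand in for repo-add, repo-remove and the pkgdb2-like
tools, and the lock file path, function names and the use of flock are
all assumptions.

```python
import fcntl
from contextlib import contextmanager


@contextmanager
def db_lock(lock_path):
    """Hold an exclusive advisory lock on lock_path (steps 3 and 8)."""
    with open(lock_path, "w") as fh:
        fcntl.flock(fh, fcntl.LOCK_EX)
        try:
            yield
        finally:
            fcntl.flock(fh, fcntl.LOCK_UN)


def run_update(add_files, del_names, lock_path,
               repo_add, web_add, repo_remove, web_remove):
    """Run steps 4-7 in order while the lock is held.

    The four callables are hypothetical hooks; a real script would
    invoke repo-add, repo-remove and the web interface tool there.
    """
    with db_lock(lock_path):
        for path in add_files:
            repo_add(path)        # step 4: add package to the repo db
        for path in add_files:
            web_add(path)         # step 5: add package to the web interface
        for name in del_names:
            repo_remove(name)     # step 6: remove package from the repo db
        for name in del_names:
            web_remove(name)      # step 7: remove it from the web interface
```

Because the lock is released in a finally clause, a failure in any of
steps 4-7 does not leave the database locked.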

My recommendation is that while you're in the PKGBUILD, grab these
things, and keep that data around associated with that pkgname. At least
that's how I do it for tupkgupdate for the TUs. Then you'll have it when
you need to add it to the mysql db.
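
A minimal sketch of that idea, assuming the values are collected while
the PKGBUILD is already open in step 1) and kept in a dict keyed by
pkgname. This is not how tupkgupdate is actually written; PkgMeta,
read_simple_vars and remember are hypothetical names, and the parser
only handles plain one-line assignments (a real PKGBUILD is a bash
script and would need to be sourced to expand variables).

```python
import re
from dataclasses import dataclass, field


@dataclass
class PkgMeta:
    pkgname: str
    pkgver: str
    pkgrel: str
    category: str = ""                      # supplied by the caller
    sources: list = field(default_factory=list)


_ASSIGN = re.compile(r"^(\w+)=(.*)$")


def read_simple_vars(pkgbuild_text):
    """Naively collect one-line `name=value` and `name=(a b)` assignments.

    No variable expansion is done, so `$pkgname` stays literal.
    """
    out = {}
    for line in pkgbuild_text.splitlines():
        m = _ASSIGN.match(line.strip())
        if not m:
            continue
        key, raw = m.group(1), m.group(2).strip()
        if raw.startswith("(") and raw.endswith(")"):
            out[key] = [v.strip("'\"") for v in raw[1:-1].split()]
        else:
            out[key] = raw.strip("'\"")
    return out


def remember(cache, pkgbuild_text, category):
    """Store the PKGBUILD-only data under its pkgname for later steps."""
    v = read_simple_vars(pkgbuild_text)
    src = v.get("source", [])
    if isinstance(src, str):
        src = [src]
    meta = PkgMeta(v["pkgname"], v["pkgver"], v["pkgrel"], category, src)
    cache[meta.pkgname] = meta
    return meta
```

When the package is later added to the mysql db, the category and source
can be looked up with cache[pkgname] instead of reopening the PKGBUILD.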

I also match up every package in the repo directory with every PKGBUILD
in the source tree, so I know which binary packages are missing. At that
time, I note the full filename of that package so I can delete it later.
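
The matching could look roughly like the sketch below. It assumes
filenames of the form pkgname-pkgver-pkgrel[-arch].pkg.tar.gz, that
pkgver and pkgrel contain no hyphens, and a fixed set of architecture
names; the function names and the ARCHES set are illustrative, not
taken from any existing script.

```python
SUFFIX = ".pkg.tar.gz"
ARCHES = {"i686", "x86_64"}   # assumed set of architecture suffixes


def parse_filename(filename):
    """Split a package filename into (pkgname, pkgver, pkgrel, arch).

    Returns None if the name does not follow the assumed scheme.
    """
    if not filename.endswith(SUFFIX):
        return None
    parts = filename[:-len(SUFFIX)].split("-")
    arch = parts.pop() if parts[-1] in ARCHES else None
    if len(parts) < 3:
        return None
    pkgrel = parts.pop()
    pkgver = parts.pop()
    return "-".join(parts), pkgver, pkgrel, arch


def match_repo(expected, filenames):
    """Match (pkgname, pkgver, pkgrel) tuples against repo filenames.

    Returns (found, missing): found maps pkgname to the full filename,
    missing lists pkgnames with no matching binary package.
    """
    by_key = {}
    for fn in filenames:
        parsed = parse_filename(fn)
        if parsed:
            by_key[parsed[:3]] = fn
    found, missing = {}, []
    for key in expected:
        if key in by_key:
            found[key[0]] = by_key[key]   # remember the filename for deletion
        else:
            missing.append(key[0])
    return found, missing
```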

Of course, I'm changing most of these tactics in repoman. There'll be a
lightweight upload server that takes the file and figures out what to do
with it. It will have the full sql db and repo at its disposal, and will
know which files were committed by developers but haven't yet been
uploaded as well as their sha1sum signatures.
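
As a rough illustration of the checksum part only: an upload server
could compare the sha1 of a received file with the value recorded when
the developer committed. This is not repoman code; accept_upload and
the expected_sums mapping (filename to hex digest) are hypothetical.

```python
import hashlib


def sha1_of(path, chunk_size=65536):
    """Return the hex sha1 digest of a file, read in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def accept_upload(path, pkg_filename, expected_sums):
    """Accept the file only if a checksum is on record and it matches."""
    expected = expected_sums.get(pkg_filename)
    return expected is not None and sha1_of(path) == expected
```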

- P



