Thomas Bächler wrote:
I started writing new database scripts to replace the old db-*/updatesync-many/pkgdb2 scripts. The reasons are support for the new pacman 3 naming scheme and fixing some design issues.
The old scripts rely heavily on PKGBUILDs for their operation and don't check the package file for consistency with the PKGBUILD (only the filename). They also move every file from the staging dir, regardless of whether the package will be added to any repository. And they recreate much of the functionality from updatesync, sort of reinventing the wheel.
My new draft has a cleaner design, but some new problems appear. It performs the following steps:
1) Check every file in the staging/add dir using a small libalpm-based tool and obtain the pkgname, pkgver, pkgrel and architecture. Compare the arch specified on the command line with the arch from the package (this step is missing in the old scripts). Find the PKGBUILD and compare pkgname, pkgver and pkgrel with the values from the package. If all checks are okay, add the package to a list. If, in addition, a force flag is set in the PKGBUILD, add it to a "force-list".
2) Check every file in the staging/del dir and obtain its pkgname. Add this package to a delete list.
3) Lock the database
4) Move all packages and force-packages to the ftp dir, add them with repo-add.
5) Pass the package files to a pkgdb2-like tool to add them to the web interface.
6) Delete all packages from the delete-list with repo-remove.
7) Pass the package names to a pkgdb2-like tool to remove them from the web interface.
8) Release the database lock.
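The checks in step 1) could be sketched roughly like this. This is only an illustration in Python, not the actual draft: the dicts stand in for whatever the libalpm-based tool and the PKGBUILD lookup return, and the names check_package and sort_staged are made up. It also assumes a forced package goes on the force-list instead of the normal list, which is just one reading of the description above.

```python
def check_package(pkg, pkgbuild, expected_arch):
    """Return a list of problems; an empty list means all checks passed."""
    problems = []
    # Arch given on the command line vs. arch recorded in the package file.
    if pkg["arch"] != expected_arch:
        problems.append("arch %s != %s" % (pkg["arch"], expected_arch))
    # Values from the package file vs. values from the PKGBUILD.
    for key in ("pkgname", "pkgver", "pkgrel"):
        if pkg[key] != pkgbuild[key]:
            problems.append("%s %s != %s" % (key, pkg[key], pkgbuild[key]))
    return problems


def sort_staged(staged, expected_arch):
    """Split (pkg, pkgbuild) pairs into add list, force list and rejects."""
    add_list, force_list, rejected = [], [], []
    for pkg, pkgbuild in staged:
        problems = check_package(pkg, pkgbuild, expected_arch)
        if problems:
            rejected.append((pkg["filename"], problems))
        elif pkgbuild.get("force"):
            # Assumption: forced packages are tracked separately.
            force_list.append(pkg["filename"])
        else:
            add_list.append(pkg["filename"])
    return add_list, force_list, rejected
```

Anything that ends up in rejected would simply stay in the staging dir, so nothing is moved unless it will really be added.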
The problems I run into are these:
a) In step 4) I can't determine the filename of the old package and remove it from the ftp. I could scan for certain filename schemes, but take into account that the package filename could be anything now; the script doesn't care. We could, however, rely on our current filename scheme to find and remove the dupes (who would go through the trouble of renaming his package to something like wrongname-notapackage.zip.bz8 anyway?).
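Relying on the filename scheme for a) might look something like the sketch below. It assumes names of the form pkgname-pkgver-pkgrel.pkg.tar.gz (old) or pkgname-pkgver-pkgrel-arch.pkg.tar.gz (new), and a fixed set of known arches; both are assumptions for illustration, as are the function names. Files that don't fit the scheme are simply ignored.

```python
KNOWN_ARCHES = ("i686", "x86_64")   # assumed set of architectures
SUFFIX = ".pkg.tar.gz"              # assumed package file suffix


def parse_filename(filename):
    """Return (pkgname, pkgver, pkgrel, arch) or None if the name doesn't fit."""
    if not filename.endswith(SUFFIX):
        return None
    parts = filename[:-len(SUFFIX)].split("-")
    arch = None
    # New scheme: the last component is the architecture.
    if len(parts) >= 4 and parts[-1] in KNOWN_ARCHES:
        arch = parts.pop()
    if len(parts) < 3:
        return None
    pkgrel = parts.pop()
    pkgver = parts.pop()
    # pkgname itself may contain hyphens, so rejoin what is left.
    return "-".join(parts), pkgver, pkgrel, arch


def find_stale_files(pkgname, new_filename, filenames):
    """List files in the ftp dir that look like older builds of pkgname."""
    stale = []
    for name in filenames:
        if name == new_filename:
            continue
        parsed = parse_filename(name)
        if parsed is not None and parsed[0] == pkgname:
            stale.append(name)
    return stale
```

Parsing from the right keeps packages like foo-bar from being mistaken for old versions of foo.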
b) The same applies to step 6). But we mv our packages to the staging/del anyway so the script won't have to remove them from the ftp.
c) This is the biggest problem currently. In step 5), due to the new script design, I don't have any data from the PKGBUILD any more and don't want to go find it again. That means I only use data from the package file itself to add it to the mysql db. The package file lacks the "package category" and "source" which are in the web interface. I could only solve this by either removing this data from the web interface or adding it to the package file.
My recommendation is that while you're in the PKGBUILD, grab these things, and keep that data around associated with that pkgname. At least that's how I do it for tupkgupdate for the TUs. Then you'll have it when you need to add it to the mysql db. I also match up every package in the repo directory with every PKGBUILD in the source tree, so I know which binary packages are missing. At that time, I note the full filename of that package so I can delete it later.
Of course, I'm changing most of these tactics in repoman. There'll be a lightweight upload server that takes the file and figures out what to do with it. It will have the full sql db and repo at its disposal, and will know which files were committed by developers but haven't yet been uploaded as well as their sha1sum signatures.
- P
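For what it's worth, the "keep the data around per pkgname" idea could be sketched as below. This is not how tupkgupdate is actually written; the PkgRecord class, the build_index function and the dict keys are hypothetical, and the PKGBUILD data is assumed to have been parsed already.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PkgRecord:
    pkgname: str
    pkgver: str
    pkgrel: str
    category: str
    sources: List[str]
    repo_filename: Optional[str] = None   # noted so it can be deleted later


def build_index(pkgbuilds, repo_files):
    """Collect PKGBUILD data per pkgname and match it against repo files.

    Returns (index, missing): index maps pkgname to PkgRecord, missing lists
    the pkgnames that have no matching binary package in the repo directory.
    """
    index = {}
    for pb in pkgbuilds:
        index[pb["pkgname"]] = PkgRecord(
            pb["pkgname"], pb["pkgver"], pb["pkgrel"],
            pb["category"], list(pb["source"]),
        )
    for filename in repo_files:
        for rec in index.values():
            prefix = "%s-%s-%s" % (rec.pkgname, rec.pkgver, rec.pkgrel)
            # The prefix must be followed by "-arch" or the file suffix.
            if filename.startswith(prefix + "-") or filename.startswith(prefix + "."):
                rec.repo_filename = filename
    missing = [name for name, rec in index.items() if rec.repo_filename is None]
    return index, missing
```

With the index in hand, category and source are still available when the package is added to the db, even though the package file itself doesn't carry them.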