On Sun, 08 Jul 2018 22:38:06 -0400, Eli Schwartz wrote:
On 07/08/2018 09:14 PM, Luke Shumaker wrote:
From: Luke Shumaker <lukeshu@parabola.nu>
In a patchset that I recently submitted, Eli was concerned that I was parsing .db files with bsdtar+awk, when the format of .db files isn't "public"; the only guarantees made about it are that libalpm can parse it.
https://lists.archlinux.org/pipermail/arch-projects/2018-June/004932.html
I wasn't too concerned, because `ftpdir-cleanup` and `sourceballs` already parse the .db files in the same way. Nonetheless, I think Eli is right: we shouldn't be parsing these files ourselves.
So, add a `dbquery` function that uses pyalpm to parse the .db files:
What's wrong with expac?
expac --config ${dbscripts_root}/pacman-community.conf -S '%f'
Not only is expac super elegant, there are pending patches to provide it in pacman 6 as part of the core project. This is what I'm waiting for, actually.
I see no reason to add an external dependency on both python and pyalpm, in order to run a small python program which evals its arguments in order to inject database queries, when a tool with a simple API can do the same and will eventually be guaranteed to be everywhere pacman itself is.
With the "True" filter that ftpdir-cleanup and sourceballs both use, you're right; this could be done with expac. But, with the context that this patch exists to enable me to address the concern you had with the other patchset: AFAICT, with expac there's no way to do a query like:

    dbquery core x86_64 \
        "(pkg.base or pkg.name) == '$pkgbase'" \
        ...

Which is what most (all?) of the queries in the other patchset would become. (Drat, it seems that discussing this separately from the other patchset won't work after all.)
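For illustration only, here is a minimal sketch of how a filter expression like the one above could be evaluated once per package. It is not the dbquery from the patch and it does not use pyalpm: `Pkg`, `query`, and the sample package names are hypothetical stand-ins, and the only thing taken from the thread is the idea of an expression that refers to `pkg.base` and `pkg.name`.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Optional


@dataclass
class Pkg:
    """Hypothetical stand-in for a package record read from a .db file."""
    name: str
    base: Optional[str] = None  # pkgbase; assumed unset for non-split packages


def query(pkgs: Iterable[Pkg], expr: str) -> Iterator[Pkg]:
    """Yield the packages for which the Python expression `expr` is truthy.

    `expr` is evaluated with `pkg` bound to each package in turn.
    eval() can run arbitrary code, so this only suits trusted callers.
    """
    code = compile(expr, "<filter>", "eval")
    for pkg in pkgs:
        if eval(code, {"__builtins__": {}}, {"pkg": pkg}):
            yield pkg


# Made-up sample data: one split package ("foo") and one plain package.
pkgs = [
    Pkg(name="foo", base="foo"),
    Pkg(name="foo-docs", base="foo"),
    Pkg(name="bar"),
]

matches = [p.name for p in query(pkgs, "(pkg.base or pkg.name) == 'foo'")]
print(matches)
```

The `or` fallback is what lets one expression match both split packages (by base) and plain packages (by name), while the "True" filter simply matches everything.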
(Let's ignore for a moment the defunct integrity checks service, which is written in python but does not use pyalpm. pyalpm is not currently installed on the dbscripts server.)
Good call; check_packages.py is python2, so the pyalpm dep does add a new dependency on python3.
As a final note, when re-writing the bit of sourceballs to use dbquery instead of AWK, I realized that it does not correctly handle licenses that have a space in them (as of 2018-07-07 there are 67 packages in the Arch repos that have a license containing a space). I did not fix this bug; I merely translated it from AWK to Python, as the program would also need to be adjusted elsewhere.

Keeping in mind the ones we're looking for are a whitelist of strictly-defined license types... I think those are all ad-hoc custom licenses, none of which we're interested in in the primary sourceballs deployment.
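A small sketch of the kind of breakage being described, assuming the cause is that license names get joined with spaces and later split on whitespace again (the thread does not spell out the exact mechanism). The whitelist and the license strings are made up for the example.

```python
# Illustrative whitelist only; the real one is whatever sourceballs is configured with.
ALLOWED = {"GPL", "GPL2", "GPL3", "LGPL"}

# Made-up license list for one package; the second entry contains a space.
licenses = ["GPL2", "custom:Example License"]

# Lossy round trip: join with spaces, then split on whitespace.
lossy = " ".join(licenses).split()


def wanted(lics):
    """Return True if any license is on the whitelist."""
    return any(lic in ALLOWED for lic in lics)


print(lossy)
print(wanted(licenses), wanted(lossy))
```

The round trip breaks the second license into two bogus entries, but as long as no whitelisted name contains a space, the whitelist check gives the same answer either way, which matches the point that the bug is harmless for the primary deployment.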
Indeed; if I thought it were a serious problem, I would have written a patch for it :)

--
Happy hacking,
~ Luke Shumaker