On Tue, Jan 14, 2014 at 11:12:17PM +0100, Lukas Fleischer wrote:
On Tue, 14 Jan 2014 at 19:59:28, Dave Reisner wrote:
[...] The library includes an output format which I've created based on the last discussion from pacman-dev; in particular a post from Allan [1]. This can easily be changed if we foresee undesirable shortcomings. I should note that the format emphasizes split packages as a first-class citizen. My hope is that this can be leveraged to introduce proper support for split packages in the AUR eventually. I realize that this probably means work on the AUR side (which I likely won't contribute to) in order to integrate my solution, but I firmly believe it's worthwhile if support for split packages is a desirable goal for the AUR (please tell me it is).
Along with this code, I've created two other utilities:
- parse_aurinfo.py: A parser implementation for the proposed .AURINFO format written in python.
- mkaurball: A shell script which creates a source tarball and appends the generated .AURINFO file to the tarball.
There's also a debugging utility which simply imports the lib and dumps an .AURINFO file from a PKGBUILD you point it at.
Awesome! I will give it a try soon. Split packages are definitely a desirable goal for the AUR but that feature requires a lot of work on the AUR side indeed. I won't be able to do much AUR coding until April or May, so what I suggest is the following strategy:
* Test your utility. Do some manual tests and the automated tests you described below. Fix common use cases.
So this went pretty well. I chose the automated route since I'm a little short on time (traveling tomorrow) and wrote some more python. Basically, it uses my tools to convert PKGBUILD -> .AURINFO -> python, and compares that against the parsed output of 'pacman -Si $repo/$pkg'.
PKGBUILDs all come from ABS. As is the nature of ABS, there are bound to be differences between the archived PKGBUILD and the actual PKGBUILD that produced the package currently in the repos, so there are bound to be false positives, and these may vary from day to day. Still, this gives a test bed of ~5000 PKGBUILDs to play with. Out of those, I'm able to fully match the repo 99.1% of the time (including false positives). I can actually boast a 100% strike rate on [extra], as the only differences were caused by out of sync PKGBUILDs which were otherwise correctly parsed.
Legitimate problems fell into the expected categories:
1) architecture specifics (examples: core/grub, multilib/dev86)
   This is obvious, and I knew it would be here. Fixing this requires changes in makepkg (something like depends_x86_64=(..) or what have you). This cannot reasonably be fixed in any parser.
2) external commands (examples: community/haskell-*, core/perl)
   I'd suggest that we consider some of these to be false positives or unfixable. The haskell packages all rely on the output of 'pacman -Q ghc' to lock the package to a specific GHC version (I'm not a haskell person). perl just jumps the shark and requires you to be on one of a few Arch servers (it unzips a tarball in a very specific location).
3) core/linux
   This is an interesting case and gets special mention. The goal of this hackery is to make it easier for folks maintaining custom versions of this PKGBUILD to track and merge changes. I suspect that a legitimate solution to this problem could be handled in the PKGBUILD by introducing a new variable, pkgsuffix=, similar to the --with-program-suffix flag that automake offers.
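For anyone who wants a feel for the comparison step without digging through the pushed code, here is a minimal sketch of the idea. It is not the actual test script: the function names, the `FIELD_MAP` table, and the normalized key names are all made up for illustration. The only thing taken as given is that `pacman -Si` prints `Label : value` lines with labels such as `Depends On` and `Conflicts With`. The dict passed in as `parsed` stands in for whatever the .AURINFO parser produces.

```python
# Hypothetical mapping from `pacman -Si` labels to PKGBUILD-style keys.
# The right-hand names are an assumption, not the real parser's schema.
FIELD_MAP = {
    "Name": ("pkgname", False),
    "Version": ("version", False),
    "Description": ("pkgdesc", False),
    "URL": ("url", False),
    "Depends On": ("depends", True),
    "Provides": ("provides", True),
    "Conflicts With": ("conflicts", True),
    "Replaces": ("replaces", True),
}


def parse_pacman_si(text):
    """Turn 'Label : value' lines into a dict of raw strings."""
    info = {}
    last = None
    for line in text.splitlines():
        if not line.strip():
            continue
        if line[0].isspace() and last is not None:
            # Continuation of a wrapped value from the previous label.
            info[last] += " " + line.strip()
            continue
        label, sep, value = line.partition(":")
        if not sep:
            continue  # not a 'Label : value' line
        last = label.strip()
        info[last] = value.strip()
    return info


def normalize(info):
    """Rename labels and split list-valued fields; 'None' means empty."""
    out = {}
    for label, (key, is_list) in FIELD_MAP.items():
        if label not in info:
            continue
        value = info[label]
        if is_list:
            out[key] = [] if value == "None" else sorted(value.split())
        else:
            out[key] = value
    return out


def mismatches(parsed, repo):
    """Return the keys whose values differ, ignoring list order."""
    differing = []
    for key in sorted(set(parsed) | set(repo)):
        a, b = parsed.get(key), repo.get(key)
        if isinstance(a, list) or isinstance(b, list):
            a, b = sorted(a or []), sorted(b or [])
        if a != b:
            differing.append(key)
    return differing
```

A package counts as a full match when `mismatches()` returns an empty list. Optional dependencies and anything architecture-specific are left out here on purpose, since those are exactly the cases where a plain field-by-field comparison stops being meaningful.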
The code that I used for all of the above has been pushed, and I'm attaching a tarball of my results from the 4 repos I parsed in case anyone is curious about the gory details.
Cheers,
Dave