On Tue, Aug 26, 2008 at 9:21 AM, Xavier <shiningxc@gmail.com> wrote:
Hello,
I rewrote stonecrest's python script which generated the Integrity Checks:
http://projects.archlinux.org/?p=dbscripts.git;a=blob_plain;f=cron-jobs/chec...

Last results:
http://www.archlinux.org/pipermail/arch-dev-public/2008-August/007655.html
Main changes include:
* better and safer parsing of PKGBUILDs (it now relies on a separate bash script which sources PKGBUILDs and generates python code, rather than trying to parse bash directly in python)
* much better performance (thanks to a small python C extension which makes it possible to call alpm_vercmp natively rather than forking the vercmp binary thousands of times)
* more accurate dependency and provision handling (having worked on this part of the pacman code for a while probably helped me a bit here)
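
To illustrate the dependency and provision handling mentioned above, here is a minimal sketch. It is not taken from check_packages.py: the names split_dep and find_candidates and the dict layout are hypothetical. It only splits a dependency string into name, operator and version and looks up candidates by name or provision. The version comparison itself (what alpm_vercmp does) is deliberately left out.

```python
import re

# Hypothetical helpers for illustration, not the code from check_packages.py.
# A dependency string is a package name, optionally followed by a comparison
# operator and a version, e.g. 'kernel26<2.6.25'.
_DEP_RE = re.compile(r"^(?P<name>[^<>=]+)(?P<op>>=|<=|=|<|>)?(?P<ver>.*)$")


def split_dep(dep):
    """Split 'name>=version' into (name, operator, version).

    Operator and version are None for an unversioned dependency.
    """
    match = _DEP_RE.match(dep)
    if match is None:
        raise ValueError("not a dependency string: %r" % dep)
    if match.group("op") is None:
        return match.group("name"), None, None
    return match.group("name"), match.group("op"), match.group("ver")


def find_candidates(dep, packages):
    """Return names of packages that could satisfy dep, ignoring versions.

    packages is assumed to map a package name to the list of its 'provides'
    entries. A real check would also compare versions.
    """
    wanted = split_dep(dep)[0]
    candidates = []
    for name, provides in packages.items():
        provided_names = [split_dep(p)[0] for p in provides]
        if name == wanted or wanted in provided_names:
            candidates.append(name)
    return sorted(candidates)
```

With this, a dependency with no candidates at all is what a "Missing Dependencies" style report would list, while a versioned one would still need the version test on each candidate.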
I made one design change. In the end I found that generating python code was a bit hackish and less flexible, so I reverted to a DESC-like format, which is very easy to parse in python. This is also the namcap way: the python script just calls the bash parsing script, then reads and parses its output.

I finished implementing all the features the original script had, mainly the repo hierarchy check and the circular deps check. The old circular deps algorithm was a bit messy and hard to understand, so I chose to implement an existing one:
http://en.wikipedia.org/wiki/Tarjan%27s_strongly_connected_components_algori...
This gives better and complete results, and also much better performance (the old one took ~20 seconds in cygwin, the new one is instant).

The only feature that I refused to keep is the "Missing Repo Packages" one, which checked that each package exists on ftp.archlinux.org. It was the only one using the network, and in my opinion it does not belong in this script. Instead, someone who knows the layout of the arch server could write another very simple script checking that every package in the abs tree has a corresponding package in the package dir, if both happen to be on the same server.
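
For readers who do not know Tarjan's algorithm, here is a minimal sketch of how it finds circular deps. It is an illustration only, not the code from check_packages.py: the function names and the dict-based graph (package name mapped to the names it depends on) are assumptions.

```python
def strongly_connected_components(graph):
    """Tarjan's algorithm on a dict mapping node -> iterable of dependencies.

    Returns a list of components, each a list of node names.
    """
    index = {}        # visit order of each node
    lowlink = {}      # lowest index reachable from each node
    stack = []        # nodes of the components still being built
    on_stack = set()
    result = []
    counter = [0]

    def visit(node):
        index[node] = lowlink[node] = counter[0]
        counter[0] += 1
        stack.append(node)
        on_stack.add(node)
        for dep in graph.get(node, ()):
            if dep not in index:
                visit(dep)
                lowlink[node] = min(lowlink[node], lowlink[dep])
            elif dep in on_stack:
                lowlink[node] = min(lowlink[node], index[dep])
        # node is the root of a component: pop the component off the stack
        if lowlink[node] == index[node]:
            component = []
            while True:
                member = stack.pop()
                on_stack.discard(member)
                component.append(member)
                if member == node:
                    break
            result.append(component)

    for node in graph:
        if node not in index:
            visit(node)
    return result


def circular_deps(graph):
    """Keep only components that form a cycle (several nodes, or a self-dep)."""
    return [c for c in strongly_connected_components(graph)
            if len(c) > 1 or c[0] in graph.get(c[0], ())]


if __name__ == "__main__":
    deps = {
        "glibc": ["bash"],
        "bash": ["readline"],
        "readline": ["ncurses"],
        "ncurses": ["glibc"],
        "xmms": ["glibc"],
    }
    for cycle in circular_deps(deps):
        print(">".join(cycle))
```

Each package is visited once and each dependency edge is followed once, which is why the check is fast. The sketch is recursive, so a very long dependency chain could hit python's recursion limit; an iterative variant avoids that.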
Enough with the technical details, here are the stripped results:

1) Results for core and extra

$ ./check_packages.py --abs-tree=/var/abs --repos=core,extra

Missing Dependencies
----------------------
archboot --> 'bcm43xx-fwcutter>=006-2'

Missing Makedepends
---------------------
wpa_supplicant --> 'kernel26<2.6.25'
xosd --> 'xmms'
flac --> 'xmms'

Repo Hierarchy for Makedepends
--------------------------------
core/iputils depends on extra/jade
core/madwifi-utils depends on extra/sharutils
core/e2fsprogs depends on extra/bc
core/ca-certificates depends on extra/ruby
core/madwifi depends on extra/sharutils

Circular Dependencies
-----------------------
glibc>bash>readline>ncurses>glibc
db>coreutils>shadow>pam>db
eclipse-ecj>gcc-gcj>eclipse-ecj

2) Results for community

$ ./check_packages.py --abs-tree=/var/abs --repos=core,extra,community --exclude=core,extra

Missing PKGBUILDs
-------------------
community/kde/kdestyle-lipstik

Missing Dependencies
----------------------
flumotion --> 'twisted-web'
qc-usb --> 'kernel26<2.6.26'
gg2 --> 'arts'
eclipse-ve --> 'eclipse<3.3'
man-pages-cs --> 'groff-utf8'

Missing Makedepends
---------------------
gensplash --> 'klibc-beyond'
pygoocanvas --> 'pygobject-doc'