[arch-general] Integrity Check

Xavier shiningxc at gmail.com
Thu Aug 28 12:30:10 EDT 2008


On Tue, Aug 26, 2008 at 9:21 AM, Xavier <shiningxc at gmail.com> wrote:
> Hello,
>
> I rewrote stonecrest's python script which generated the Integrity
> Checks : http://projects.archlinux.org/?p=dbscripts.git;a=blob_plain;f=cron-jobs/check_archlinux.py;hb=HEAD
> last results : http://www.archlinux.org/pipermail/arch-dev-public/2008-August/007655.html
>
> Main changes include :
> * better and safer parsing of PKGBUILDs
> (it now relies on a separate bash script which source PKGBUILDs and
> generate python code, rather than trying to parse bash directly in
> python)
> * much better performance
> (thanks to a small python C extension which allows to call alpm_vercmp
> natively rather than having to fork the vercmp binary thousands of
> times)
> * more accurate dependency and provision handling
> (having worked on this part of the pacman code for a while probably
> helped me a bit here)
>

I made one design change. In the end I found that generating python code
was a bit hackish and not very flexible, so I went back to a DESC-like
format, which is very easy to parse in python. This is also the namcap
way: the python script just calls the bash parsing script, then reads
and parses its output.
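
For illustration only, here is a minimal sketch of what parsing such a
DESC-like format could look like in python. It is not the code from the
attached tarball. It assumes the layout used by pacman's desc files
(a %FIELD% header line, one value per line, a blank line closing the
field); the function name parse_desc and the sample package "foo" are
made up for the example.

```python
def parse_desc(text):
    """Parse DESC-like text into a dict mapping field name -> list of values.

    Assumed layout (as in pacman's desc files): a '%FIELD%' header line,
    then one value per line, then a blank line that closes the field.
    """
    fields = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            current = None                      # blank line closes the field
        elif len(line) > 2 and line.startswith("%") and line.endswith("%"):
            current = line[1:-1].lower()        # '%DEPENDS%' -> 'depends'
            fields.setdefault(current, [])
        elif current is not None:
            fields[current].append(line)        # a value of the open field
    return fields


# Hypothetical output of the bash parsing script for a made-up package.
sample = """%NAME%
foo

%VERSION%
1.0-1

%DEPENDS%
glibc
bash>=3.2
"""

pkg = parse_desc(sample)
print(pkg["name"], pkg["depends"])
```

Keeping every field as a list means single-valued fields (name, version)
and multi-valued ones (depends, makedepends) go through the same code path.
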
I finished implementing all the features the original script had, mainly
the repo hierarchy check and the circular deps check. The old circular
deps algorithm was a bit messy and hard to understand, so I simply
implemented an existing one :
http://en.wikipedia.org/wiki/Tarjan%27s_strongly_connected_components_algorithm
This gives better and more complete results, with much better
performance (the old one took ~20 seconds in cygwin, the new one is
instant).
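
As a rough sketch of the idea (again, not the code from the attached
tarball), Tarjan's algorithm on a dependency graph could look like the
following. The graph is assumed to be a dict mapping a package to the
list of packages it depends on; the function names are made up, the
cycle edges are taken from the results below, and "some-app" is a
hypothetical package added to show a node that is not part of a cycle.

```python
def strongly_connected_components(graph):
    """Tarjan's algorithm. graph maps a node to a list of its successors."""
    index = {}          # discovery order of each visited node
    lowlink = {}        # smallest index reachable from the node's subtree
    stack = []
    on_stack = set()
    components = []
    counter = [0]

    def visit(node):
        index[node] = lowlink[node] = counter[0]
        counter[0] += 1
        stack.append(node)
        on_stack.add(node)
        for succ in graph.get(node, ()):
            if succ not in index:
                visit(succ)
                lowlink[node] = min(lowlink[node], lowlink[succ])
            elif succ in on_stack:
                lowlink[node] = min(lowlink[node], index[succ])
        if lowlink[node] == index[node]:
            # node is the root of a component: pop the stack down to it
            component = []
            while True:
                member = stack.pop()
                on_stack.discard(member)
                component.append(member)
                if member == node:
                    break
            components.append(component)

    for node in graph:
        if node not in index:
            visit(node)
    return components


def circular_groups(graph):
    """Keep only the components that actually contain a cycle."""
    return [c for c in strongly_connected_components(graph)
            if len(c) > 1 or c[0] in graph.get(c[0], ())]


deps = {
    "glibc": ["bash"], "bash": ["readline"],
    "readline": ["ncurses"], "ncurses": ["glibc"],
    "eclipse-ecj": ["gcc-gcj"], "gcc-gcj": ["eclipse-ecj"],
    "some-app": ["glibc"],      # hypothetical package, not in any cycle
}
cycles = circular_groups(deps)
print(cycles)
```

The recursive form is fine for a sketch; on a very deep dependency chain
it could hit python's recursion limit, in which case an iterative
variant would be needed.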

The only feature that I refused to keep is the "Missing Repo Packages"
one, which checked that each package exists on ftp.archlinux.org .
It was the only one using the network and, in my opinion, it does not
have its place in this script.
Instead, someone who knows the layout of the arch server could write
another very simple script checking that every package in the abs tree
has a corresponding package in the package dir, if both happen to be on
the same server.
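
A sketch of what such a separate script might look like, under
assumptions that would have to be checked against the real server
layout: every directory in the abs tree that contains a PKGBUILD is
named after its package, package files are named
name-pkgver-pkgrel-arch.pkg.tar.gz, and only the name is compared (not
the version). The function names, the demo directories and the version
numbers are all made up.

```python
import os
import tempfile


def package_name(filename, suffix=".pkg.tar.gz"):
    """Return the package name, assuming name-pkgver-pkgrel-arch + suffix."""
    if not filename.endswith(suffix):
        return None
    parts = filename[:-len(suffix)].rsplit("-", 3)
    return parts[0] if len(parts) == 4 else None


def missing_packages(abs_root, pkg_dir):
    """List abs tree packages that have no built package file in pkg_dir."""
    built = {package_name(f) for f in os.listdir(pkg_dir)}
    missing = []
    for dirpath, _dirnames, filenames in os.walk(abs_root):
        # Assumption: the directory holding a PKGBUILD is named after the package.
        if "PKGBUILD" in filenames:
            name = os.path.basename(dirpath)
            if name not in built:
                missing.append(name)
    return sorted(missing)


# Small self-contained demo on a throwaway directory tree.
with tempfile.TemporaryDirectory() as root:
    abs_root = os.path.join(root, "abs")
    pkg_dir = os.path.join(root, "pkgs")
    for name in ("bash", "xosd"):
        pkg_path = os.path.join(abs_root, "core", "base", name)
        os.makedirs(pkg_path)
        open(os.path.join(pkg_path, "PKGBUILD"), "w").close()
    os.makedirs(pkg_dir)
    open(os.path.join(pkg_dir, "bash-3.2.039-3-i686.pkg.tar.gz"), "w").close()
    demo_missing = missing_packages(abs_root, pkg_dir)

print(demo_missing)
```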

Enough with the technical details, here are the stripped results :

1) Results for core and extra

$ ./check_packages.py --abs-tree=/var/abs --repos=core,extra

Missing Dependencies
----------------------
archboot --> 'bcm43xx-fwcutter>=006-2'

Missing Makedepends
---------------------
wpa_supplicant --> 'kernel26<2.6.25'
xosd --> 'xmms'
flac --> 'xmms'

Repo Hierarchy for Makedepends
--------------------------------
core/iputils depends on extra/jade
core/madwifi-utils depends on extra/sharutils
core/e2fsprogs depends on extra/bc
core/ca-certificates depends on extra/ruby
core/madwifi depends on extra/sharutils

Circular Dependencies
-----------------------
glibc>bash>readline>ncurses>glibc
db>coreutils>shadow>pam>db
eclipse-ecj>gcc-gcj>eclipse-ecj

2) Results for community

$ ./check_packages.py --abs-tree=/var/abs --repos=core,extra,community \
    --exclude=core,extra

Missing PKGBUILDs
-------------------
community/kde/kdestyle-lipstik

Missing Dependencies
----------------------
flumotion --> 'twisted-web'
qc-usb --> 'kernel26<2.6.26'
gg2 --> 'arts'
eclipse-ve --> 'eclipse<3.3'
man-pages-cs --> 'groff-utf8'

Missing Makedepends
---------------------
gensplash --> 'klibc-beyond'
pygoocanvas --> 'pygobject-doc'
-------------- next part --------------
A non-text attachment was scrubbed...
Name: check_archlinux-v4.tar.gz
Type: application/x-gzip
Size: 4678 bytes
Desc: not available
URL: <http://archlinux.org/pipermail/arch-general/attachments/20080828/ea79f6b2/attachment.bin>

