[arch-projects] [dbscripts] Integrity check false-positives when overriding package information from package() (was: Perl dependencies)

Mon Jan 9 04:44:46 EST 2012

Short summary:
The last integrity check reported some missing dependencies that are
actually not missing, but provided by perl. The problem is that these
provisions are generated at build time and not included in the PKGBUILD,
so the check can't deal with them correctly.

On 09.01.2012 04:43, Qadri wrote:
>>
>> After having a look at the integrity check source, it looks like it
>> should deal with provisions already, but that code is obviously broken.
>> If you could come up with a fix that would be very nice.
>>
>>
> :-/ It does look like it handles it fine. I _was_ going to ask if I could
> test things out on my local machine without mirroring a whole repo when it
> occurred to me that maybe the script is not the bug. The issue is that the
> provides array in the perl PKGBUILD is dynamically generated via a perl
> script.

My fault, I totally ignored the fact that it parses the PKGBUILDs when I
looked at it.

> So new question: Is it okay if the dbscript ran an external perl script,
> grabbed the output, and parsed that? For just the provides field? For any
> field? How many packages use this sort of script (answer - something on the
> order of tens)? Make an exception for perl? 

parse_pkgbuilds.sh already sources the PKGBUILD so whatever magic you do
in there will work as long as it's not inside build()/package(). The
problem is that I probably can't create the array for perl at that point.

> Thoughts?

Parsing PKGBUILDs will also create false positives when/if we use
libdepends/libprovides since the version part will be added by makepkg
and will be missing from the PKGBUILD.

As a solution, you can change the check to load the package data from
the actual repo database, but you still have to get the makedepends from
the PKGBUILD because these are not in the db. That should be fine though
because makepkg checks for those before running build/package.
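
To illustrate reading package data from the database instead of the
PKGBUILD, here is a rough sketch. It assumes the usual pacman db entry
layout, where each field sits under a %FIELD% header and ends at a blank
line. The function name db_field is made up for this example and is not
part of dbscripts.

```shell
# Print the entries listed under one %FIELD% header of a pacman db entry
# file (hypothetical helper, not taken from dbscripts).
# Usage: db_field FIELD FILE
db_field() {
    awk -v header="%$1%" '
        # Any %HEADER% line starts a new section; remember if it is ours.
        /^%[A-Z0-9]+%$/ { in_field = ($0 == header); next }
        # A blank line ends the current section.
        /^$/            { in_field = 0; next }
        # Print only the lines inside the requested section.
        in_field        { print }
    ' "$2"
}
```

Something like `db_field PROVIDES path/to/entry` would then print the
provides as makepkg wrote them, including anything generated at build
time. The makedepends would still have to come from the PKGBUILD.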

One potential problem when doing this is that the database may not be in
sync with svn. Currently the packages that are in the tree, but not in
the db will be checked so you still notice problems in the tree when the
check runs.

When you use the database as the primary source, problems in the tree
would go unnoticed for quite a while because the cronjob currently runs
only once a week.

Maybe it's because most of the output is the same all the time? At least
that's why I skim over the mails. This could be solved by caching the
last output for each section and only sending different lines. To make
sure we don't miss anything you could clean the cache every now and
then (monthly?) so we get the full output.
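
A minimal sketch of the caching idea, assuming the output of each section
is written to a file first. The name report_changes and the cache layout
are hypothetical, nothing like this exists in dbscripts yet.

```shell
# Print only the lines of the new report that were not in the cached
# report, then replace the cache with the new report.
# Usage: report_changes CACHE_FILE NEW_FILE
report_changes() {
    cache=$1
    new=$2
    if [ -f "$cache" ]; then
        # -F: fixed strings, -x: match whole lines, -v: invert the match,
        # -f: read the patterns (the old lines) from the cache file.
        # grep exits 1 when nothing changed, which is not an error here.
        grep -Fxv -f "$cache" "$new" || true
    else
        # No cache yet (or it was cleaned): send the full output.
        cat "$new"
    fi
    cp "$new" "$cache"
}
```

Removing the cache file from a monthly cronjob would then be enough to
get the full output again on the next run.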

So my solution is to use the database, send only changed output and
send it more often. Does that sound like a good idea?

PS: This discussion should continue on the arch-projects mailing list.

-- 
Florian Pritz
