[pacman-dev] [PATCH 0/4] Package list find performance improvements

Dan McGee dpmcgee at gmail.com
Tue Dec 14 18:02:47 EST 2010


On Tue, Dec 14, 2010 at 5:00 PM, Allan McRae <allan at archlinux.org> wrote:
> On 15/12/10 04:46, Dan McGee wrote:
>>
>> This series of patches makes finding a package in our linked list
>> implementation a whole lot faster, if that search is using the standard
>> _alpm_pkg_find, which nearly all are (after the first patch).
>>
>> It does this by adding a hash function to util.c which is nothing too
>> complicated and named after a publicly available algorithm. When packages
>> are created, we fill in this hash value as soon as the pkgname is read.
>> Finally, the _alpm_pkg_find function is rewritten to take advantage of
>> this field, avoiding repeated strcmp() calls and only falling back to that
>> if a hash is not available and to verify the hash value was not some sort
>> of collision.
>>
>> Performance figures and numbers are available in the last patch. This
>> actually speeds up operations by nearly 33%, so this is not a total waste
>> of time to consider. :) Review and questions/comments/concerns welcome!
>>
>
> My only comment is more of a question: would it be better to have an
> _alpm_pkg_set_name(pmpkg_t *) function that automatically updates the hash?
> It is a tradeoff between having to always remember to update the hash after
> adjusting pmpkg_t->name (seems likely to get missed at some stage) and
> the added complexity, and I am undecided on it.

I thought about that as well, realized I only had to update two
places, and "forgot" about it. If we did this we would want to do
something like we did for db and path: rename the field to _pkgname so
people realize it shouldn't be mucked with directly, and all of a
sudden the patch is huge.
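
For illustration, a rough sketch of what such a setter could look like
next to a hash-first find. The struct, the function names, and the
sdbm-style hash below are stand-ins I am assuming for the example, not
the code from the patches:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the package struct; not the libalpm definition. */
typedef struct pkg {
	char *name;
	unsigned long name_hash; /* 0 is treated as "no hash available" */
	struct pkg *next;
} pkg_t;

/* sdbm-style string hash, assumed here purely as an example. */
static unsigned long hash_name(const char *str)
{
	unsigned long hash = 0;
	int c;

	if(!str) {
		return 0;
	}
	while((c = (unsigned char)*str++)) {
		hash = c + (hash << 6) + (hash << 16) - hash;
	}
	return hash;
}

/* Set the name and refresh the hash in one place.
 * Returns 0 on success, -1 on bad arguments or allocation failure. */
static int pkg_set_name(pkg_t *pkg, const char *name)
{
	size_t len;
	char *copy;

	if(!pkg || !name) {
		return -1;
	}
	len = strlen(name) + 1;
	copy = malloc(len);
	if(!copy) {
		return -1;
	}
	memcpy(copy, name, len);
	free(pkg->name);
	pkg->name = copy;
	pkg->name_hash = hash_name(name);
	return 0;
}

/* Walk the list comparing hashes first; strcmp() runs only when the
 * hashes match (to rule out a collision) or when no hash is stored. */
static pkg_t *pkg_find(pkg_t *head, const char *needle)
{
	unsigned long needle_hash;
	pkg_t *p;

	if(!needle) {
		return NULL;
	}
	needle_hash = hash_name(needle);
	for(p = head; p; p = p->next) {
		if(!p->name) {
			continue;
		}
		if(p->name_hash != 0 && p->name_hash != needle_hash) {
			continue;
		}
		if(strcmp(p->name, needle) == 0) {
			return p;
		}
	}
	return NULL;
}
```

In this sketch, writing to pkg->name directly without refreshing
name_hash would typically make pkg_find() skip that entry, which is the
kind of miss a setter is meant to prevent.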

So I think it can be missed either way, but here's to automated
testing hopefully catching the fallout when someone forgets it.

-Dan

