[pacman-dev] pacman cold caches performance, too much stat()ing

Sebastian Nowicki sebnow at gmail.com
Sun Dec 13 07:42:21 EST 2009


On 13/12/2009, at 2:31 AM, Nagy Gabor wrote:

>> On Sat, 12 Dec 2009, Dimitrios Apostolou wrote:
>>> Hello list,
>>>
>>> I have been investigating the slow performance of pacman regarding
>>> the cold caches scenario and I'm trying to write some proof of
>>> concept code that improves things a lot for some cases. However I
>>> need your help regarding some facts I might have misunderstood, and
>>> any pointers to the source code you also give me would also help a
>>> lot. I wouldn't like to lose time changing stuff that would break
>>> current functionality. So here are some first questions that come
>>> to mind, just by using strace:
>>>
>>> When doing "pacman -Q blah" I can see that besides the getdents()
>>> syscalls in /var/lib/pacman/local (probably caused by readdir()),
>>> there are also stat() and access() calls for every single
>>> subdirectory. Why are the last ones necessary? Isn't readdir enough?
>>>
>>> The same goes when doing "pacman -S blah". But in that case it
>>> stat()'s both 'local' and 'sync' directories, so worst case is
>>> really bad, it will stat() all contents of local, core, extra and
>>> community...
>>
>> Regarding the stat() and access() operations I finally found out why
>> they happen exactly:
>>
>> In case of corrupted db the sync, for example, directory might
>> contain files, not subdirectories. So in that case
>> _alpm_db_populate() just makes sure it's a directory. However
>> stat()ing thousands of files is too much of a price to pay.
>> Similarly, access() checks it is accessible by the user.
>>
>> In the attached patch I have just removed the relevant lines, with
>> the following rationale: In the rare case of corrupted db, even if we
>> do open("sync/not_a_dir/depends") it will still fail and we'll catch
>> the failure there, no need to investigate the cause further, just
>> write a message like "couldn't access sync/not_a_dir/depends".
>>
>> By dropping caches ("echo 3 > /proc/sys/vm/drop_caches") before
>> running, I measure a nice performance boost on my old laptop: "pacman
>> -Q gdb" time is reduced from about 7s to 2.5s.
>
> Hm. This is a nice time boost... Did you test this with other
> operations, too?
>
>> What do you think? Is it possible to remove those checks?
>> Dimitris
>
> The best solution would be to rewrite our whole database crap as Dan
> said. I am pretty sure that this patch would not cause any harm irl,  
> but
> our code would become a little bit more dangerous: As I see,
> db_read(INFRQ_BASE) would become a ~NOP function and db_populate would
> become a simple "ls" function (the only remaining sanity check is
> splitname).

It occurs to me time and time again, that it would be a good idea to  
try and abstract the database functions, to allow different database  
backends to be "plugged in." This would make experimentation with  
backends a lot easier, since you just compile a different file in (the  
interface remains the same). A library called libpkg[1] does something  
along these lines, by leveraging function pointers. Unfortunately I  
don't have a lot of time to look into it further, but it's an  
interesting idea.

[1]: http://libpkg.berlios.de/doc/trunk/html/pkg__db_8h_source.html



More information about the pacman-dev mailing list