[pacman-dev] [PATCH] Avoid stat call to determine is_directory if possible

Dan McGee dpmcgee at gmail.com
Tue Oct 5 18:13:25 EDT 2010


On Tue, Oct 5, 2010 at 5:11 PM, Dan McGee <dpmcgee at gmail.com> wrote:
> On Tue, Oct 5, 2010 at 4:53 PM, Allan McRae <allan at archlinux.org> wrote:
>> On 06/10/10 02:49, Dan McGee wrote:
>>>
>>> On Linux and OS X, we can determine if an entry obtained through a
>>> readdir()
>>> call is a directory without also having to stat it. This can save a
>>> significant number of syscalls; it does make the getdents() call more
>>> expensive but cuts out a lot of stat() calls.
>>>
>>> Before:
>>> $ strace -c pacman -Ss pacman
>>> % time     seconds  usecs/call     calls    errors syscall
>>> ------ ----------- ----------- --------- --------- ----------------
>>>  82.95    0.016153           1     21733           read
>>>   4.21    0.000819           0     11056        23 open
>>>   3.96    0.000771           0     17602         1 access
>>>   1.84    0.000358           0     11026           fstat
>>>   1.72    0.000334           0     11033           close
>>>   1.54    0.000299           0      6608         3 stat
>>>   0.00    0.000000           0        20           getdents
>>> ------ ----------- ----------- --------- --------- ----------------
>>> 100.00    0.019473                101271        28 total
>>>
>>> After:
>>> $ strace -c ./src/pacman/.libs/lt-pacman -Ss pacman
>>> % time     seconds  usecs/call     calls    errors syscall
>>> ------ ----------- ----------- --------- --------- ----------------
>>>  30.11    0.001503          75        20           getdents
>>>  14.81    0.000739           0     21733           read
>>>  13.91    0.000694           0     17602         1 access
>>>  13.32    0.000665           0     11072        39 open
>>>   6.61    0.000330           0     11026           fstat
>>>   6.03    0.000301           0     11033           close
>>>   0.00    0.000000           0         9         6 stat
>>> ------ ----------- ----------- --------- --------- ----------------
>>> 100.00    0.004991                 94686        47 total
>>>
>>> Obviously the numbers will show some variation, but it never seems to be
>>> slower so this should be a win overall.
>>>
>>> Signed-off-by: Dan McGee<dan at archlinux.org>
>>> ---
>>>
>>> List,
>>>
>>> My take on a rather stale patch in my inbox. I added the is_dir() static
>>> function so we can still run on platforms not supporting this shortcut,
>>> and I
>>> also wasn't sure why the access() call was removed in the orignal patch
>>> (commit
>>> messages explaining changes, anyone?). Let me know what you think.
>>>
>>
>> This has been sitting in my flagged emails folder for a while too...
>>
>> I always figured the access part was being over heavy-handed in removal of
>> checks at the time.  However, we now have check that fopen on the "desc" etc
>> files is successful (see right below it) so perhaps it is indeed now
>> excessive to check the directory first?
>
> I'm partial to the fail-early, fail-often mantra, but someone else is
> welcome to submit a patch with sufficient explanation as to why it
> isn't necessary if we still feel it is overkill. The dir scanning
> stuff was definitely the bulk of the reduction in time here; I'm not
> so certain the extra 7ms are going to be noticed for the access()
> overhead.

And looking closer at my numbers, did we not lose time here? If you
through the read() calls out (which I left in only because that was
the top time user), I should do some new timings or someone else
should as well...

-Dan


More information about the pacman-dev mailing list