[pacman-dev] [PATCH] Avoid stat call to determine is_directory if possible
Dan McGee
dpmcgee at gmail.com
Tue Oct 5 18:11:38 EDT 2010
On Tue, Oct 5, 2010 at 4:53 PM, Allan McRae <allan at archlinux.org> wrote:
> On 06/10/10 02:49, Dan McGee wrote:
>>
>> On Linux and OS X, we can determine if an entry obtained through a
>> readdir()
>> call is a directory without also having to stat it. This can save a
>> significant number of syscalls; it does make the getdents() call more
>> expensive but cuts out a lot of stat() calls.
>>
>> Before:
>> $ strace -c pacman -Ss pacman
>> % time seconds usecs/call calls errors syscall
>> ------ ----------- ----------- --------- --------- ----------------
>> 82.95 0.016153 1 21733 read
>> 4.21 0.000819 0 11056 23 open
>> 3.96 0.000771 0 17602 1 access
>> 1.84 0.000358 0 11026 fstat
>> 1.72 0.000334 0 11033 close
>> 1.54 0.000299 0 6608 3 stat
>> 0.00 0.000000 0 20 getdents
>> ------ ----------- ----------- --------- --------- ----------------
>> 100.00 0.019473 101271 28 total
>>
>> After:
>> $ strace -c ./src/pacman/.libs/lt-pacman -Ss pacman
>> % time seconds usecs/call calls errors syscall
>> ------ ----------- ----------- --------- --------- ----------------
>> 30.11 0.001503 75 20 getdents
>> 14.81 0.000739 0 21733 read
>> 13.91 0.000694 0 17602 1 access
>> 13.32 0.000665 0 11072 39 open
>> 6.61 0.000330 0 11026 fstat
>> 6.03 0.000301 0 11033 close
>> 0.00 0.000000 0 9 6 stat
>> ------ ----------- ----------- --------- --------- ----------------
>> 100.00 0.004991 94686 47 total
>>
>> Obviously the numbers will show some variation, but it never seems to be
>> slower so this should be a win overall.
>>
>> Signed-off-by: Dan McGee<dan at archlinux.org>
>> ---
>>
>> List,
>>
>> My take on a rather stale patch in my inbox. I added the is_dir() static
>> function so we can still run on platforms not supporting this shortcut,
>> and I
>> also wasn't sure why the access() call was removed in the orignal patch
>> (commit
>> messages explaining changes, anyone?). Let me know what you think.
>>
>
> This has been sitting in my flagged emails folder for a while too...
>
> I always figured the access part was being over heavy-handed in removal of
> checks at the time. However, we now have check that fopen on the "desc" etc
> files is successful (see right below it) so perhaps it is indeed now
> excessive to check the directory first?
I'm partial to the fail-early, fail-often mantra, but someone else is
welcome to submit a patch with sufficient explanation as to why it
isn't necessary if we still feel it is overkill. The dir scanning
stuff was definitely the bulk of the reduction in time here; I'm not
so certain the extra 7ms are going to be noticed for the access()
overhead.
-Dan
More information about the pacman-dev
mailing list