[pacman-dev] [PATCH] Drastically reduce the number of syscalls.
--- lib/libalpm/be_files.c | 18 ++---------------- 1 files changed, 2 insertions(+), 16 deletions(-) diff --git a/lib/libalpm/be_files.c b/lib/libalpm/be_files.c index adf41aa..ddef742 100644 --- a/lib/libalpm/be_files.c +++ b/lib/libalpm/be_files.c @@ -113,8 +113,6 @@ static int dirlist_from_fs(const char *syncdbpath, alpm_list_t **dirlist) { DIR *dbdir; struct dirent *ent = NULL; - struct stat sbuf; - char path[PATH_MAX]; dbdir = opendir(syncdbpath); if (dbdir != NULL) { @@ -128,8 +126,7 @@ static int dirlist_from_fs(const char *syncdbpath, alpm_list_t **dirlist) } /* stat the entry, make sure it's a directory */ - snprintf(path, PATH_MAX, "%s%s", syncdbpath, name); - if(stat(path, &sbuf) != 0 || !S_ISDIR(sbuf.st_mode)) { + if(ent->d_type != DT_DIR) { continue; } @@ -353,8 +350,6 @@ int _alpm_db_populate(pmdb_t *db) { int count = 0; struct dirent *ent = NULL; - struct stat sbuf; - char path[PATH_MAX]; const char *dbpath; DIR *dbdir; @@ -374,9 +369,7 @@ int _alpm_db_populate(pmdb_t *db) if(strcmp(name, ".") == 0 || strcmp(name, "..") == 0) { continue; } - /* stat the entry, make sure it's a directory */ - snprintf(path, PATH_MAX, "%s%s", dbpath, name); - if(stat(path, &sbuf) != 0 || !S_ISDIR(sbuf.st_mode)) { + if(ent->d_type != DT_DIR) { continue; } @@ -475,13 +468,6 @@ int _alpm_db_read(pmdb_t *db, pmpkg_t *info, pmdbinfrq_t inforeq) pkgpath = get_pkgpath(db, info); - if(access(pkgpath, F_OK)) { - /* directory doesn't exist or can't be opened */ - _alpm_log(PM_LOG_DEBUG, "cannot find '%s-%s' in db '%s'\n", - info->name, info->version, db->treename); - goto error; - } - /* DESC */ if(inforeq & INFRQ_DESC) { snprintf(path, PATH_MAX, "%sdesc", pkgpath); -- 1.7.2
On 27/07/10 00:17, Tim Nieradzik wrote:
--- lib/libalpm/be_files.c | 18 ++---------------- 1 files changed, 2 insertions(+), 16 deletions(-)
diff --git a/lib/libalpm/be_files.c b/lib/libalpm/be_files.c index adf41aa..ddef742 100644 --- a/lib/libalpm/be_files.c +++ b/lib/libalpm/be_files.c @@ -113,8 +113,6 @@ static int dirlist_from_fs(const char *syncdbpath, alpm_list_t **dirlist) { DIR *dbdir; struct dirent *ent = NULL; - struct stat sbuf; - char path[PATH_MAX];
dbdir = opendir(syncdbpath); if (dbdir != NULL) { @@ -128,8 +126,7 @@ static int dirlist_from_fs(const char *syncdbpath, alpm_list_t **dirlist) }
/* stat the entry, make sure it's a directory */ - snprintf(path, PATH_MAX, "%s%s", syncdbpath, name); - if(stat(path,&sbuf) != 0 || !S_ISDIR(sbuf.st_mode)) { + if(ent->d_type != DT_DIR) {
That is a platform specific construct (not defined by POSIX). It works with glibc (not sure about uclibc) and on BSD libc, but I believe this broken on OSX and potentially on broken cygwin. And those are only the platforms I know people are using pacman on... Any ideas if we can autoconf around that? Allan
On Wed, 04 Aug 2010 14:45, Allan McRae wrote:
That is a platform specific construct (not defined by POSIX). It works with glibc (not sure about uclibc) and on BSD libc, but I believe this broken on OSX and potentially on broken cygwin. And those are only the platforms I know people are using pacman on...
Any ideas if we can autoconf around that?
Looks like DT_DIR is defined in Mac OS X, too (cf. [1]). It's also supported in Cygwin (since 1.7.0, see [2]). I think it'd be best to go with a simple "#ifdef DT_DIR". If you'd rather do it the autoconf-way, you could try a similar approach as in this [3] patch. [1] http://developer.apple.com/mac/library/documentation/Darwin/Reference/ManPag... [2] http://lists.zerezo.com/cygwin/msg42255.html [3] http://www.mail-archive.com/notmuch@notmuchmail.org/msg02242.html --Tim
On 05/08/10 07:36, Tim Nieradzik wrote:
On Wed, 04 Aug 2010 14:45, Allan McRae wrote:
That is a platform specific construct (not defined by POSIX). It works with glibc (not sure about uclibc) and on BSD libc, but I believe this broken on OSX and potentially on broken cygwin. And those are only the platforms I know people are using pacman on...
Any ideas if we can autoconf around that?
Looks like DT_DIR is defined in Mac OS X, too (cf. [1]).
It's also supported in Cygwin (since 1.7.0, see [2]).
I think it'd be best to go with a simple "#ifdef DT_DIR". If you'd rather do it the autoconf-way, you could try a similar approach as in this [3] patch.
[1] http://developer.apple.com/mac/library/documentation/Darwin/Reference/ManPag... [2] http://lists.zerezo.com/cygwin/msg42255.html [3] http://www.mail-archive.com/notmuch@notmuchmail.org/msg02242.html
Well, it looks like updates to Cygwin and OSX have added support. So I think it is fine to use this approach. @Dan: given most platforms support DT_DIR, would you prefer to just assume DT_DIR support or do something like creating a function _is_dir() with an "#ifdef DT_DIR" block? Allan
On Linux and OS X, we can determine if an entry obtained through a readdir() call is a directory without also having to stat it. This can save a significant number of syscalls; it does make the getdents() call more expensive but cuts out a lot of stat() calls. Before: $ strace -c pacman -Ss pacman % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 82.95 0.016153 1 21733 read 4.21 0.000819 0 11056 23 open 3.96 0.000771 0 17602 1 access 1.84 0.000358 0 11026 fstat 1.72 0.000334 0 11033 close 1.54 0.000299 0 6608 3 stat 0.00 0.000000 0 20 getdents ------ ----------- ----------- --------- --------- ---------------- 100.00 0.019473 101271 28 total After: $ strace -c ./src/pacman/.libs/lt-pacman -Ss pacman % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 30.11 0.001503 75 20 getdents 14.81 0.000739 0 21733 read 13.91 0.000694 0 17602 1 access 13.32 0.000665 0 11072 39 open 6.61 0.000330 0 11026 fstat 6.03 0.000301 0 11033 close 0.00 0.000000 0 9 6 stat ------ ----------- ----------- --------- --------- ---------------- 100.00 0.004991 94686 47 total Obviously the numbers will show some variation, but it never seems to be slower so this should be a win overall. Signed-off-by: Dan McGee <dan@archlinux.org> --- List, My take on a rather stale patch in my inbox. I added the is_dir() static function so we can still run on platforms not supporting this shortcut, and I also wasn't sure why the access() call was removed in the orignal patch (commit messages explaining changes, anyone?). Let me know what you think. -Dan -Dan lib/libalpm/be_files.c | 29 +++++++++++++++++++---------- 1 files changed, 19 insertions(+), 10 deletions(-) diff --git a/lib/libalpm/be_files.c b/lib/libalpm/be_files.c index 4432171..0f055e0 100644 --- a/lib/libalpm/be_files.c +++ b/lib/libalpm/be_files.c @@ -108,13 +108,28 @@ static int dirlist_from_tar(const char *archive, alpm_list_t **dirlist) return(0); } +static int is_dir(const char *path, struct dirent *entry) +{ +#ifdef DT_DIR + return(entry->d_type == DT_DIR); +#else + char buffer[PATH_MAX]; + snprintf(buffer, PATH_MAX, "%s/%s", path, entry->d_name); + + struct stat sbuf; + if (!stat(buffer, &sbuf)) { + return(S_ISDIR(sbuf.st_mode)); + } + + return(0); +#endif +} + /* create list of directories in db */ static int dirlist_from_fs(const char *syncdbpath, alpm_list_t **dirlist) { DIR *dbdir; struct dirent *ent = NULL; - struct stat sbuf; - char path[PATH_MAX]; dbdir = opendir(syncdbpath); if (dbdir != NULL) { @@ -127,9 +142,7 @@ static int dirlist_from_fs(const char *syncdbpath, alpm_list_t **dirlist) continue; } - /* stat the entry, make sure it's a directory */ - snprintf(path, PATH_MAX, "%s%s", syncdbpath, name); - if(stat(path, &sbuf) != 0 || !S_ISDIR(sbuf.st_mode)) { + if(!is_dir(syncdbpath, ent)) { continue; } @@ -353,8 +366,6 @@ int _alpm_db_populate(pmdb_t *db) { int count = 0; struct dirent *ent = NULL; - struct stat sbuf; - char path[PATH_MAX]; const char *dbpath; DIR *dbdir; @@ -374,9 +385,7 @@ int _alpm_db_populate(pmdb_t *db) if(strcmp(name, ".") == 0 || strcmp(name, "..") == 0) { continue; } - /* stat the entry, make sure it's a directory */ - snprintf(path, PATH_MAX, "%s%s", dbpath, name); - if(stat(path, &sbuf) != 0 || !S_ISDIR(sbuf.st_mode)) { + if(!is_dir(dbpath, ent)) { continue; } -- 1.7.3.1
On 06/10/10 02:49, Dan McGee wrote:
On Linux and OS X, we can determine if an entry obtained through a readdir() call is a directory without also having to stat it. This can save a significant number of syscalls; it does make the getdents() call more expensive but cuts out a lot of stat() calls.
Before: $ strace -c pacman -Ss pacman % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 82.95 0.016153 1 21733 read 4.21 0.000819 0 11056 23 open 3.96 0.000771 0 17602 1 access 1.84 0.000358 0 11026 fstat 1.72 0.000334 0 11033 close 1.54 0.000299 0 6608 3 stat 0.00 0.000000 0 20 getdents ------ ----------- ----------- --------- --------- ---------------- 100.00 0.019473 101271 28 total
After: $ strace -c ./src/pacman/.libs/lt-pacman -Ss pacman % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 30.11 0.001503 75 20 getdents 14.81 0.000739 0 21733 read 13.91 0.000694 0 17602 1 access 13.32 0.000665 0 11072 39 open 6.61 0.000330 0 11026 fstat 6.03 0.000301 0 11033 close 0.00 0.000000 0 9 6 stat ------ ----------- ----------- --------- --------- ---------------- 100.00 0.004991 94686 47 total
Obviously the numbers will show some variation, but it never seems to be slower so this should be a win overall.
Signed-off-by: Dan McGee<dan@archlinux.org> ---
List,
My take on a rather stale patch in my inbox. I added the is_dir() static function so we can still run on platforms not supporting this shortcut, and I also wasn't sure why the access() call was removed in the orignal patch (commit messages explaining changes, anyone?). Let me know what you think.
This has been sitting in my flagged emails folder for a while too... I always figured the access part was being over heavy-handed in removal of checks at the time. However, we now have check that fopen on the "desc" etc files is successful (see right below it) so perhaps it is indeed now excessive to check the directory first? Allan
On Tue, Oct 5, 2010 at 4:53 PM, Allan McRae <allan@archlinux.org> wrote:
On 06/10/10 02:49, Dan McGee wrote:
On Linux and OS X, we can determine if an entry obtained through a readdir() call is a directory without also having to stat it. This can save a significant number of syscalls; it does make the getdents() call more expensive but cuts out a lot of stat() calls.
Before: $ strace -c pacman -Ss pacman % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 82.95 0.016153 1 21733 read 4.21 0.000819 0 11056 23 open 3.96 0.000771 0 17602 1 access 1.84 0.000358 0 11026 fstat 1.72 0.000334 0 11033 close 1.54 0.000299 0 6608 3 stat 0.00 0.000000 0 20 getdents ------ ----------- ----------- --------- --------- ---------------- 100.00 0.019473 101271 28 total
After: $ strace -c ./src/pacman/.libs/lt-pacman -Ss pacman % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 30.11 0.001503 75 20 getdents 14.81 0.000739 0 21733 read 13.91 0.000694 0 17602 1 access 13.32 0.000665 0 11072 39 open 6.61 0.000330 0 11026 fstat 6.03 0.000301 0 11033 close 0.00 0.000000 0 9 6 stat ------ ----------- ----------- --------- --------- ---------------- 100.00 0.004991 94686 47 total
Obviously the numbers will show some variation, but it never seems to be slower so this should be a win overall.
Signed-off-by: Dan McGee<dan@archlinux.org> ---
List,
My take on a rather stale patch in my inbox. I added the is_dir() static function so we can still run on platforms not supporting this shortcut, and I also wasn't sure why the access() call was removed in the orignal patch (commit messages explaining changes, anyone?). Let me know what you think.
This has been sitting in my flagged emails folder for a while too...
I always figured the access part was being over heavy-handed in removal of checks at the time. However, we now have check that fopen on the "desc" etc files is successful (see right below it) so perhaps it is indeed now excessive to check the directory first?
I'm partial to the fail-early, fail-often mantra, but someone else is welcome to submit a patch with sufficient explanation as to why it isn't necessary if we still feel it is overkill. The dir scanning stuff was definitely the bulk of the reduction in time here; I'm not so certain the extra 7ms are going to be noticed for the access() overhead. -Dan
On Tue, Oct 5, 2010 at 5:11 PM, Dan McGee <dpmcgee@gmail.com> wrote:
On Tue, Oct 5, 2010 at 4:53 PM, Allan McRae <allan@archlinux.org> wrote:
On 06/10/10 02:49, Dan McGee wrote:
On Linux and OS X, we can determine if an entry obtained through a readdir() call is a directory without also having to stat it. This can save a significant number of syscalls; it does make the getdents() call more expensive but cuts out a lot of stat() calls.
Before: $ strace -c pacman -Ss pacman % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 82.95 0.016153 1 21733 read 4.21 0.000819 0 11056 23 open 3.96 0.000771 0 17602 1 access 1.84 0.000358 0 11026 fstat 1.72 0.000334 0 11033 close 1.54 0.000299 0 6608 3 stat 0.00 0.000000 0 20 getdents ------ ----------- ----------- --------- --------- ---------------- 100.00 0.019473 101271 28 total
After: $ strace -c ./src/pacman/.libs/lt-pacman -Ss pacman % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 30.11 0.001503 75 20 getdents 14.81 0.000739 0 21733 read 13.91 0.000694 0 17602 1 access 13.32 0.000665 0 11072 39 open 6.61 0.000330 0 11026 fstat 6.03 0.000301 0 11033 close 0.00 0.000000 0 9 6 stat ------ ----------- ----------- --------- --------- ---------------- 100.00 0.004991 94686 47 total
Obviously the numbers will show some variation, but it never seems to be slower so this should be a win overall.
Signed-off-by: Dan McGee<dan@archlinux.org> ---
List,
My take on a rather stale patch in my inbox. I added the is_dir() static function so we can still run on platforms not supporting this shortcut, and I also wasn't sure why the access() call was removed in the orignal patch (commit messages explaining changes, anyone?). Let me know what you think.
This has been sitting in my flagged emails folder for a while too...
I always figured the access part was being over heavy-handed in removal of checks at the time. However, we now have check that fopen on the "desc" etc files is successful (see right below it) so perhaps it is indeed now excessive to check the directory first?
I'm partial to the fail-early, fail-often mantra, but someone else is welcome to submit a patch with sufficient explanation as to why it isn't necessary if we still feel it is overkill. The dir scanning stuff was definitely the bulk of the reduction in time here; I'm not so certain the extra 7ms are going to be noticed for the access() overhead.
And looking closer at my numbers, did we not lose time here? If you through the read() calls out (which I left in only because that was the top time user), I should do some new timings or someone else should as well... -Dan
On Tue, Oct 5, 2010 at 5:13 PM, Dan McGee <dpmcgee@gmail.com> wrote:
And looking closer at my numbers, did we not lose time here? If you through the read() calls out (which I left in only because that was the top time user), I should do some new timings or someone else should as well...
Slower it is! Non-truncated test output below after two warm up runs of each. $ strace -c pacman -Ss pacman >/dev/null % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 21.41 0.000764 0 11056 23 open 20.15 0.000719 0 21733 read 19.53 0.000697 0 17602 1 access 11.27 0.000402 0 11054 mmap 9.84 0.000351 0 11009 munmap 7.17 0.000256 0 6608 3 stat 5.89 0.000210 0 11026 fstat 4.74 0.000169 0 11033 close 0.00 0.000000 0 1 write 0.00 0.000000 0 19 mprotect 0.00 0.000000 0 55 brk 0.00 0.000000 0 8 rt_sigaction 0.00 0.000000 0 1 rt_sigprocmask 0.00 0.000000 0 13 13 ioctl 0.00 0.000000 0 1 execve 0.00 0.000000 0 2 uname 0.00 0.000000 0 1 fcntl 0.00 0.000000 0 20 getdents 0.00 0.000000 0 1 getrlimit 0.00 0.000000 0 1 geteuid 0.00 0.000000 0 1 arch_prctl 0.00 0.000000 0 3 1 futex 0.00 0.000000 0 1 set_tid_address 0.00 0.000000 0 1 set_robust_list ------ ----------- ----------- --------- --------- ---------------- 100.00 0.003568 101250 41 total dmcgee@galway ~/projects/pacman (master) $ strace -c ./src/pacman/.libs/lt-pacman -Ss pacman >/dev/null % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 19.29 0.000790 0 21733 read 17.80 0.000729 0 17602 1 access 15.85 0.000649 32 20 getdents 15.14 0.000620 0 11009 munmap 11.33 0.000464 0 11072 39 open 7.20 0.000295 0 11033 close 6.74 0.000276 0 11026 fstat 6.64 0.000272 0 11054 mmap 0.00 0.000000 0 1 write 0.00 0.000000 0 9 6 stat 0.00 0.000000 0 19 mprotect 0.00 0.000000 0 53 brk 0.00 0.000000 0 8 rt_sigaction 0.00 0.000000 0 1 rt_sigprocmask 0.00 0.000000 0 13 13 ioctl 0.00 0.000000 0 1 execve 0.00 0.000000 0 2 uname 0.00 0.000000 0 1 fcntl 0.00 0.000000 0 1 getrlimit 0.00 0.000000 0 1 geteuid 0.00 0.000000 0 1 arch_prctl 0.00 0.000000 0 3 1 futex 0.00 0.000000 0 1 set_tid_address 0.00 0.000000 0 1 set_robust_list ------ ----------- ----------- --------- --------- ---------------- 100.00 0.004095 94665 60 total
On 06/10/10 08:37, Dan McGee wrote:
On Tue, Oct 5, 2010 at 5:13 PM, Dan McGee<dpmcgee@gmail.com> wrote:
And looking closer at my numbers, did we not lose time here? If you through the read() calls out (which I left in only because that was the top time user), I should do some new timings or someone else should as well...
Slower it is! Non-truncated test output below after two warm up runs of each.
I just replicated the same result. It is even worse if you drop all caches before running the strace. Allan
On Tue, Oct 5, 2010 at 5:43 PM, Allan McRae <allan@archlinux.org> wrote:
On 06/10/10 08:37, Dan McGee wrote:
On Tue, Oct 5, 2010 at 5:13 PM, Dan McGee<dpmcgee@gmail.com> wrote:
And looking closer at my numbers, did we not lose time here? If you through the read() calls out (which I left in only because that was the top time user), I should do some new timings or someone else should as well...
Slower it is! Non-truncated test output below after two warm up runs of each.
I just replicated the same result. It is even worse if you drop all caches before running the strace.
Off to the trash bin for this patch then? Performance improvements that don't really improve performance are kind of a downer, unless someone shows me results on OS X or Cygwin that this is a huge win and worth keeping around. -Dan
On 06/10/10 08:52, Dan McGee wrote:
On Tue, Oct 5, 2010 at 5:43 PM, Allan McRae<allan@archlinux.org> wrote:
On 06/10/10 08:37, Dan McGee wrote:
On Tue, Oct 5, 2010 at 5:13 PM, Dan McGee<dpmcgee@gmail.com> wrote:
And looking closer at my numbers, did we not lose time here? If you through the read() calls out (which I left in only because that was the top time user), I should do some new timings or someone else should as well...
Slower it is! Non-truncated test output below after two warm up runs of each.
I just replicated the same result. It is even worse if you drop all caches before running the strace.
Off to the trash bin for this patch then? Performance improvements that don't really improve performance are kind of a downer, unless someone shows me results on OS X or Cygwin that this is a huge win and worth keeping around.
Agreed.
participants (4)
-
Allan McRae
-
Dan McGee
-
Dan McGee
-
Tim Nieradzik