On Sun, Nov 20, 2011 at 3:03 PM, <andrew.gregory.8@gmail.com> wrote:
From: Andrew Gregory <andrew.gregory.8@gmail.com>
Not allowing fileowner queries for directories was an unnecessary limitation. Queries for directories have poor performance due to having to call real_path on every directory listed in every package's file list. Because more than one package will likely own a given directory, all packages are checked when querying a directory instead of bailing out after the first owner is found.
Signed-off-by: Andrew Gregory <andrew.gregory.8@gmail.com> --- src/pacman/query.c | 21 ++++++++++++--------- 1 files changed, 12 insertions(+), 9 deletions(-) Sorry I didn't get a chance to look at this earlier.
First, I should say I've tried to do this 5 months ago, but didn't make my patch public at the time. It is now here: http://code.toofishes.net/cgit/dan/pacman.git/commit/?h=dir-ownership I'll end up calling out things in there I had to do, as unfortunately there are a bazillion edge cases I encountered when writing the patch.
diff --git a/src/pacman/query.c b/src/pacman/query.c index 4c2ea81..c64f0d2 100644 --- a/src/pacman/query.c +++ b/src/pacman/query.c @@ -127,7 +127,7 @@ static int query_fileowner(alpm_list_t *targets) * iteration. */ root = alpm_option_get_root(config->handle); rootlen = strlen(root); - if(rootlen + 1 > PATH_MAX) { + if(rootlen >= PATH_MAX) { /* we are in trouble here */ pm_printf(ALPM_LOG_ERROR, _("path too long: %s%s\n"), root, ""); return 1; @@ -142,6 +142,7 @@ static int query_fileowner(alpm_list_t *targets) struct stat buf; alpm_list_t *i; int found = 0; + int searchall = 0;
filename = strdup(t->data);
@@ -164,12 +165,14 @@ static int query_fileowner(alpm_list_t *targets) } }
+ /* make sure directories end with '/' */ if(S_ISDIR(buf.st_mode)) { - pm_printf(ALPM_LOG_ERROR, - _("cannot determine ownership of directory '%s'\n"), filename); - ret++; - free(filename); - continue; + searchall = 1; + size_t fnlen = strlen(filename); + if(filename[fnlen-1] != '/'){ + filename = realloc(filename, sizeof(char) * (fnlen+2)); sizeof(char) is always 1 guaranteed, no need for that. + strcat(filename, "/"); strcat is rather inefficient when you've already got yourself the total length of the string; you can simply set filename[len - 1] = '/', filename[len] = '\0' in this case or whatever it needs to be. + } } I took the opposite approach here: ensured it did NOT end with a slash. This vastly simplifies the memory hoops you had to jump through here.
bname = mbasename(filename); @@ -192,7 +195,7 @@ static int query_fileowner(alpm_list_t *targets) } free(dname);
- for(i = alpm_db_get_pkgcache(db_local); i && !found; i = alpm_list_next(i)) { + for(i = alpm_db_get_pkgcache(db_local); i && (searchall || !found); i = alpm_list_next(i)) {
This looks so eerily similar...
alpm_pkg_t *info = i->data; alpm_filelist_t *filelist = alpm_pkg_get_files(info); size_t j; @@ -211,10 +214,10 @@ static int query_fileowner(alpm_list_t *targets) And now this is where we differ.
Things I came across that occur in the latter half of my patch: * We can break once we've seen a directory in a package; however there are two print_query_fileowner() calls that need to be handled; you only handled one of them. * I added code to (cheaply) bail early on a given file if we knew we were looking for a directory; this saves a lot of expense calling mdirname/resolve_path/etc. unnecessarily. * I want to say the reason I stripped the slash had to do with all sorts of edge cases involving symlinks. I don't know this for sure though. I'd check your code on something like the following: $ ln -s /usr/share /tmp/foobar $ ./src/pacman/pacman -Qo /usr/share/ | wc -l 739 $ ./src/pacman/pacman -Qo /usr/share | wc -l 739 $ ./src/pacman/pacman -Qo /tmp/foobar/ | wc -l error: No package owns /tmp/foobar 0 $ ./src/pacman/pacman -Qo /tmp/foobar | wc -l error: No package owns /tmp/foobar 0 $ ./src/pacman/pacman -Qo /var/mail /var/mail/ /var/spool/mail /var/spool/mail/ /var/mail is owned by filesystem 2011.10-1 /var/mail is owned by filesystem 2011.10-1 /var/spool/mail is owned by filesystem 2011.10-1 /var/spool/mail is owned by filesystem 2011.10-1 Sorry this was hidden away; wish I could have saved you some duplicate work. I encourage you to combine the work though and resubmit after thorough testing.
if(!rpath) { print_query_fileowner(filename, info); found = 1; - continue; + break; }
- if(rootlen + 1 + strlen(pkgfile) > PATH_MAX) { + if(rootlen + strlen(pkgfile) >= PATH_MAX) { pm_printf(ALPM_LOG_ERROR, _("path too long: %s%s\n"), root, pkgfile); } /* concatenate our file and the root path */ -- 1.7.7.3