[pacman-dev] [PATCH] allow querying fileowner for directories

Dan McGee dpmcgee at gmail.com
Tue Nov 22 02:14:19 EST 2011


On Sun, Nov 20, 2011 at 3:03 PM,  <andrew.gregory.8 at gmail.com> wrote:
> From: Andrew Gregory <andrew.gregory.8 at gmail.com>
>
> Not allowing fileowner queries for directories was an unnecessary
> limitation.  Queries for directories have poor performance due to
> having to call real_path on every directory listed in every package's
> file list.  Because more than one package will likely own a given
> directory, all packages are checked when querying a directory instead
> of bailing out after the first owner is found.
>
> Signed-off-by: Andrew Gregory <andrew.gregory.8 at gmail.com>
> ---
>  src/pacman/query.c |   21 ++++++++++++---------
>  1 files changed, 12 insertions(+), 9 deletions(-)
Sorry I didn't get a chance to look at this earlier.

First, I should say I've tried to do this 5 months ago, but didn't
make my patch public at the time. It is now here:
http://code.toofishes.net/cgit/dan/pacman.git/commit/?h=dir-ownership

I'll end up calling out things in there I had to do, as unfortunately
there are a bazillion edge cases I encountered when writing the patch.

> diff --git a/src/pacman/query.c b/src/pacman/query.c
> index 4c2ea81..c64f0d2 100644
> --- a/src/pacman/query.c
> +++ b/src/pacman/query.c
> @@ -127,7 +127,7 @@ static int query_fileowner(alpm_list_t *targets)
>         * iteration. */
>        root = alpm_option_get_root(config->handle);
>        rootlen = strlen(root);
> -       if(rootlen + 1 > PATH_MAX) {
> +       if(rootlen >= PATH_MAX) {
>                /* we are in trouble here */
>                pm_printf(ALPM_LOG_ERROR, _("path too long: %s%s\n"), root, "");
>                return 1;
> @@ -142,6 +142,7 @@ static int query_fileowner(alpm_list_t *targets)
>                struct stat buf;
>                alpm_list_t *i;
>                int found = 0;
> +               int searchall = 0;
>
>                filename = strdup(t->data);
>
> @@ -164,12 +165,14 @@ static int query_fileowner(alpm_list_t *targets)
>                        }
>                }
>
> +               /* make sure directories end with '/' */
>                if(S_ISDIR(buf.st_mode)) {
> -                       pm_printf(ALPM_LOG_ERROR,
> -                               _("cannot determine ownership of directory '%s'\n"), filename);
> -                       ret++;
> -                       free(filename);
> -                       continue;
> +                       searchall = 1;
> +                       size_t fnlen = strlen(filename);
> +                       if(filename[fnlen-1] != '/'){
> +                               filename = realloc(filename, sizeof(char) * (fnlen+2));
sizeof(char) is always 1 guaranteed, no need for that.
> +                               strcat(filename, "/");
strcat is rather inefficient when you've already got yourself the
total length of the string; you can simply set filename[len - 1] =
'/', filename[len] = '\0' in this case or whatever it needs to be.
> +                       }
>                }
I took the opposite approach here: ensured it did NOT end with a
slash. This vastly simplifies the memory hoops you had to jump through
here.

>
>                bname = mbasename(filename);
> @@ -192,7 +195,7 @@ static int query_fileowner(alpm_list_t *targets)
>                }
>                free(dname);
>
> -               for(i = alpm_db_get_pkgcache(db_local); i && !found; i = alpm_list_next(i)) {
> +               for(i = alpm_db_get_pkgcache(db_local); i && (searchall || !found); i = alpm_list_next(i)) {
This looks so eerily similar...

>                        alpm_pkg_t *info = i->data;
>                        alpm_filelist_t *filelist = alpm_pkg_get_files(info);
>                        size_t j;
> @@ -211,10 +214,10 @@ static int query_fileowner(alpm_list_t *targets)
And now this is where we differ.

Things I came across that occur in the latter half of my patch:
* We can break once we've seen a directory in a package; however there
are two print_query_fileowner() calls that need to be handled; you
only handled one of them.
* I added code to (cheaply) bail early on a given file if we knew we
were looking for a directory; this saves a lot of expense calling
mdirname/resolve_path/etc. unnecessarily.
* I want to say the reason I stripped the slash had to do with all
sorts of edge cases involving symlinks. I don't know this for sure
though.

I'd check your code on something like the following:

$ ln -s /usr/share /tmp/foobar
$ ./src/pacman/pacman -Qo /usr/share/ | wc -l
739
$ ./src/pacman/pacman -Qo /usr/share | wc -l
739
$ ./src/pacman/pacman -Qo /tmp/foobar/ | wc -l
error: No package owns /tmp/foobar
0
$ ./src/pacman/pacman -Qo /tmp/foobar | wc -l
error: No package owns /tmp/foobar
0

$ ./src/pacman/pacman -Qo /var/mail /var/mail/ /var/spool/mail /var/spool/mail/
/var/mail is owned by filesystem 2011.10-1
/var/mail is owned by filesystem 2011.10-1
/var/spool/mail is owned by filesystem 2011.10-1
/var/spool/mail is owned by filesystem 2011.10-1

Sorry this was hidden away; wish I could have saved you some duplicate
work. I encourage you to combine the work though and resubmit after
thorough testing.

>                                if(!rpath) {
>                                        print_query_fileowner(filename, info);
>                                        found = 1;
> -                                       continue;
> +                                       break;
>                                }
>
> -                               if(rootlen + 1 + strlen(pkgfile) > PATH_MAX) {
> +                               if(rootlen + strlen(pkgfile) >= PATH_MAX) {
>                                        pm_printf(ALPM_LOG_ERROR, _("path too long: %s%s\n"), root, pkgfile);
>                                }
>                                /* concatenate our file and the root path */
> --
> 1.7.7.3


More information about the pacman-dev mailing list