[pacman-dev] [PATCH] Allow querying directory ownership

Allan McRae allan at archlinux.org
Fri Sep 7 07:56:48 EDT 2012


On 07/09/12 21:06, Menachem Moystoviz wrote:
> I'm not a dev, nor the original author of the patch, but...
> 
> On Fri, Sep 7, 2012 at 10:29 AM, Allan McRae <allan at archlinux.org> wrote:
>> On 06/09/12 13:34, Andrew Gregory wrote:
>>> The restriction of not checking the ownership of a directory is
>>> unnecessary given that all the package filelists contain this
>>> information. Remove this restriction, with the expectation that you
>>> might get multiple packages returned for a given directory.
>>> Additionally attempt to minimise the number of files getting through
>>> to the slow realpath call.
>>>
>>> Original-work-by: Dan McGee <dan at archlinux.org>
>>> Original-work-by: Allan McRae <allan at archlinux.org>
>>> Signed-off-by: Andrew Gregory <andrew.gregory.8 at gmail.com>
>>> ---
>>>
>>> Applies on top of my other -Qo patches queued in Allan's
>>> working-split/query-owner branch.
>>>
>>>  src/pacman/query.c | 61 +++++++++++++++++++++++++++++++++++++++++-------------
>>>  1 file changed, 47 insertions(+), 14 deletions(-)
>>>
>>> diff --git a/src/pacman/query.c b/src/pacman/query.c
>>> index 92219e8..06505eb 100644
>>> --- a/src/pacman/query.c
>>> +++ b/src/pacman/query.c
>>> @@ -97,6 +97,7 @@ static int query_fileowner(alpm_list_t *targets)
>>>       size_t rootlen;
>>>       alpm_list_t *t;
>>>       alpm_db_t *db_local;
>>> +     alpm_list_t *packages;
>>>
>>>       /* This code is here for safety only */
>>>       if(targets == NULL) {
>>> @@ -129,13 +130,14 @@ static int query_fileowner(alpm_list_t *targets)
>>>       }
>>>
>>>       db_local = alpm_get_localdb(config->handle);
>>> +     packages = alpm_db_get_pkgcache(db_local);
>>>
>>>       for(t = targets; t; t = alpm_list_next(t)) {
>>>               char *filename = NULL, *dname = NULL, *rpath = NULL;
>>>               const char *bname;
>>>               struct stat buf;
>>>               alpm_list_t *i;
>>> -             size_t len;
>>> +             size_t len, is_dir, bname_len, pbname_len;
>>>               int found = 0;
>>>
>>>               if((filename = strdup(t->data)) == NULL) {
>>> @@ -163,15 +165,20 @@ static int query_fileowner(alpm_list_t *targets)
>>>                       }
>>>               }
>>>
>>> -             if(S_ISDIR(buf.st_mode)) {
>>> -                     pm_printf(ALPM_LOG_ERROR,
>>> -                             _("cannot determine ownership of directory '%s'\n"), filename);
>>> -                     goto targcleanup;
>>> +             if((is_dir = S_ISDIR(buf.st_mode))) {
>>> +                     /* the entire filename is safe to resolve if we know it points to a dir,
>>> +                      * and it's necessary in case it ends in . or .. */
>>> +                     dname = realpath(filename, NULL);
>>> +                     bname = mbasename(dname);
>>> +                     rpath = mdirname(dname);
>>> +             } else {
>>> +                     bname = mbasename(filename);
>>> +                     dname = mdirname(filename);
>>> +                     rpath = realpath(dname, NULL);
>>>               }
>>>
>>> -             bname = mbasename(filename);
>>> -             dname = mdirname(filename);
>>> -             rpath = realpath(dname, NULL);
>>> +             bname_len = strlen(bname);
>>> +             pbname_len = bname_len + (is_dir ? 1 : 0);
>>>
>>>               if(!dname || !rpath) {
>>>                       pm_printf(ALPM_LOG_ERROR, _("cannot determine real path for '%s': %s\n"),
>>> @@ -179,7 +186,7 @@ static int query_fileowner(alpm_list_t *targets)
>>>                       goto targcleanup;
>>>               }
>>>
>>> -             for(i = alpm_db_get_pkgcache(db_local); i && !found; i = alpm_list_next(i)) {
>>> +             for(i = packages; i && (!found || is_dir); i = alpm_list_next(i)) {
>>>                       alpm_pkg_t *info = i->data;
>>>                       alpm_filelist_t *filelist = alpm_pkg_get_files(info);
>>>                       size_t j;
>>> @@ -188,22 +195,48 @@ static int query_fileowner(alpm_list_t *targets)
>>>                               const alpm_file_t *file = filelist->files + j;
>>>                               char *ppath, *pdname;
>>>                               const char *pkgfile = file->name;
>>> +                             size_t pkgfile_len = strlen(pkgfile);
>>> +
>>> +                             /* make sure pkgfile and target are of the same type */
>>> +                             if(!is_dir != !(pkgfile[pkgfile_len - 1] == '/')) {
>>
>> !!! - literally....   How about:
>>
>> if (is_dir && (pkgfile[pkgfile_len - 1] != '/'))
> This isn't the same check - Andrew's check verifies that is_dir XOR
> (pkgfile's last character is not '/' (i.e. pkgfile names a file
> path)),
> is true, while your test is for AND. This way of defining XOR is quite
> confusing, though - isn't there a clearer way? (i.e. macros, inline
> functions?)

Ah - that is confusing...  obviously!

So a clearer version of the test would be:

if ((!is_dir && pkgfile[pkgfile_len - 1] == '/') ||
	(is_dir && pkgfile[pkgfile_len - 1] != '/'))

>>
>>> +                                     continue;
>>> +                             }
>>> +
>>> +                             /* make sure pkgfile is long enough */
>>> +                             if(pkgfile_len < pbname_len) {
>>> +                                     continue;
>>> +                             }
>>
>> If I understand this correctly, we are comparing the length of the final
>> component of the passed in parameter to the entire filename length from
>> the package.  I think that it would be quite uncommon to hit the
>> continue here, so that can be removed.
>>
> I think you mean to say that we replace this if(){continue;} with
> if(!){...} wrapping the rest of the loop?
> That would make sense if this branch is costly.

No.  I was saying that I think this test is going to result in a very
small number of cases being skipped and so it would be more efficient to
just not perform it.

>>>
>>> -                             /* avoid the costly realpath usage if the basenames don't match */
>>> -                             if(strcmp(mbasename(pkgfile), bname) != 0) {
>>> +                             /* make sure pbname_len takes us to the start of a path component */
>>> +                             if(pbname_len != pkgfile_len && pkgfile[pkgfile_len - pbname_len - 1] != '/') {
>>
>> With the above removal:
>> if (pbname_len > pkgfile_len)
>>
>>> +                                     continue;
>>> +                             }
>>> +
>>> +                             /* compare basename with bname */
>>> +                             if(strncmp(pkgfile + pkgfile_len - pbname_len, bname, bname_len) != 0) {
>>
>> Why an strncmp here?  strcmp will do the same comparison...
>>
> In order to make sure that we don't run into buffer overflow issues -
> http://stackoverflow.com/questions/448563/am-i-correct-that-strcmp-is-equivalent-and-safe-for-literals

Both strings are null terminated.

>>>                                       continue;
>>>                               }
>>>
>>>                               /* concatenate our file and the root path */
>>> -                             if(rootlen + 1 + strlen(pkgfile) > PATH_MAX) {
>>> +                             if(rootlen + 1 + pkgfile_len > PATH_MAX) {
>>>                                       path[rootlen] = '\0'; /* reset path for error message */
>>>                                       pm_printf(ALPM_LOG_ERROR, _("path too long: %s%s\n"), path, pkgfile);
>>>                                       continue;
>>>                               }
>>>                               strcpy(path + rootlen, pkgfile);
>>>
>>> -                             pdname = mdirname(path);
>>> -                             ppath = realpath(pdname, NULL);
>>> +                             /* check that the file exists on disk and is the correct type */
>>> +                             if(lstat(path, &buf) == -1 || !is_dir != !S_ISDIR(buf.st_mode)) {
>>
>> !!! again.  How about:
>>
>> if(is_dir && lstat(path, &buf) == 0 && !S_ISDIR(buf.st_mode))
>>
>> which is what I think is being tested here.
> Again, your fix doesn't match the original test - Andrew's test
> verifies that the file either doesn't exist
> or that (the file is a directory) XOR is_dir is true.
>>
>> But I wonder if we really need this...  I believe this is to avoid
>> "strange" results when people replace directories in a package with
>> symlinks to directories in another package.  The more I think about
>> this, the more I want to just remove our special case handling of this
>> in general.  A directory in a package and a symlink on the filesystem
>> should just be a conflict.  So unless I am missing another reason for
>> this, get rid of it.
>>
> No comment, as I do not know the codebase well enough to respond.
>>> +                                     continue;
>>> +                             }
>>> +
>>> +                             if(is_dir) {
>>> +                                     pdname = realpath(path, NULL);
>>> +                                     ppath = mdirname(pdname);
>>> +                             } else {
>>> +                                     pdname = mdirname(path);
>>> +                                     ppath = realpath(pdname, NULL);
>>> +                             }
>>>                               free(pdname);
>>
>> Umm...   why calculate pdname just to free it immediately?
> Since we need to use it to calculate ppath?

Oops...  I blame doing quick review before rushing to the train!

>>
>>>                               if(!ppath) {
>>>
>>
>>
> 
> Overall, I can't really judge the quality of Andrew's patch, but I definitely
> do not agree with the comments given by Allan.
> 
> HTH in my own way,
> 
> Gesh
> 
> 
> 



More information about the pacman-dev mailing list