[pacman-dev] [PATCH 0/5] Use resolved paths for filelist operations
Properly handling symlinks requires using resolved paths. I'm not sure if returning strings instead of file_t structs is the best solution, but it seemed like the least drastic one as we prepare for 4.1 and we weren't using the extra information anyway. Andrew Gregory (5): use alpm_list_free on filelist intersection add fileconflict tests for cases with symlinks return resolved paths from filelist_intersection return resolved paths from filelist_difference use resolved_path for filelist_contains lib/libalpm/alpm.h | 2 +- lib/libalpm/conflict.c | 47 +++++++++++++++++------------------- lib/libalpm/filelist.c | 18 ++++++-------- lib/libalpm/remove.c | 3 +++ test/pacman/tests/fileconflict022.py | 19 +++++++++++++++ test/pacman/tests/fileconflict023.py | 18 ++++++++++++++ test/pacman/tests/fileconflict024.py | 17 +++++++++++++ test/pacman/tests/fileconflict025.py | 18 ++++++++++++++ 8 files changed, 106 insertions(+), 36 deletions(-) create mode 100644 test/pacman/tests/fileconflict022.py create mode 100644 test/pacman/tests/fileconflict023.py create mode 100644 test/pacman/tests/fileconflict024.py create mode 100644 test/pacman/tests/fileconflict025.py -- 1.8.1.3
alpm_filelist_intersection returns a list of pointers to internal file_t struct's, so only the list itself should be freed. Signed-off-by: Andrew Gregory <andrew.gregory.8@gmail.com> --- lib/libalpm/conflict.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/lib/libalpm/conflict.c b/lib/libalpm/conflict.c index 96f2109..1aa653f 100644 --- a/lib/libalpm/conflict.c +++ b/lib/libalpm/conflict.c @@ -453,7 +453,7 @@ alpm_list_t *_alpm_db_find_fileconflicts(alpm_handle_t *handle, conflicts = add_fileconflict(handle, conflicts, path, p1, p2); if(handle->pm_errno == ALPM_ERR_MEMORY) { FREELIST(conflicts); - FREELIST(common_files); + alpm_list_free(common_files); return NULL; } } -- 1.8.1.3
Signed-off-by: Andrew Gregory <andrew.gregory.8@gmail.com> --- test/pacman/tests/fileconflict022.py | 21 +++++++++++++++++++++ test/pacman/tests/fileconflict023.py | 20 ++++++++++++++++++++ test/pacman/tests/fileconflict024.py | 19 +++++++++++++++++++ test/pacman/tests/fileconflict025.py | 20 ++++++++++++++++++++ 4 files changed, 80 insertions(+) create mode 100644 test/pacman/tests/fileconflict022.py create mode 100644 test/pacman/tests/fileconflict023.py create mode 100644 test/pacman/tests/fileconflict024.py create mode 100644 test/pacman/tests/fileconflict025.py diff --git a/test/pacman/tests/fileconflict022.py b/test/pacman/tests/fileconflict022.py new file mode 100644 index 0000000..6f9aec9 --- /dev/null +++ b/test/pacman/tests/fileconflict022.py @@ -0,0 +1,21 @@ +self.description = "File conflict between package with symlink and package with real path" + +self.filesystem = ["usr/lib/", "lib -> usr/lib/"] + +sp1 = pmpkg("foo") +# share/ causes the entries to be reordered after path resolution +sp1.files = ["lib/", "lib/file", "share/"] +self.addpkg2db("sync", sp1) + +sp2 = pmpkg("bar") +sp2.files = ["usr/", "usr/lib/", "usr/lib/file"] +self.addpkg2db("sync", sp2) + +self.args = "-S %s %s" % (sp1.name, sp2.name) + +self.addrule("PACMAN_RETCODE=1") +self.addrule("PACMAN_OUTPUT=.*/usr/lib/file exists in both 'foo' and 'bar'") +self.addrule("!PKG_EXIST=foo") +self.addrule("!PKG_EXIST=bar") + +self.expectfailure = True diff --git a/test/pacman/tests/fileconflict023.py b/test/pacman/tests/fileconflict023.py new file mode 100644 index 0000000..1310b68 --- /dev/null +++ b/test/pacman/tests/fileconflict023.py @@ -0,0 +1,20 @@ +self.description = "File conflict between package with symlink and package with real path resolved by removal" + +self.filesystem = ["usr/", "usr/lib/", "lib -> usr/lib/"] + +lp1 = pmpkg("foo") +lp1.files = ["lib/", "lib/file"] +self.addpkg2db("local", lp1) + +sp1 = pmpkg("bar") +sp1.conflicts = ["foo"] +sp1.files = ["usr/", "usr/lib/", "usr/lib/file"] +self.addpkg2db("sync", sp1) + +self.args = "-S %s --ask=4" % sp1.name + +self.addrule("PACMAN_RETCODE=0") +self.addrule("!PKG_EXIST=foo") +self.addrule("PKG_EXIST=bar") + +self.expectfailure = True diff --git a/test/pacman/tests/fileconflict024.py b/test/pacman/tests/fileconflict024.py new file mode 100644 index 0000000..1dcb5ad --- /dev/null +++ b/test/pacman/tests/fileconflict024.py @@ -0,0 +1,19 @@ +self.description = "Filesystem conflict on upgrade with symlinks" + +self.filesystem = ["share", "usr/lib/", "lib -> usr/lib/"] + +lp1 = pmpkg("foo", "1-1") +lp1.files = ["lib/"] +self.addpkg2db("local", lp1) + +sp1 = pmpkg("foo", "1-2") +# share/ causes the file order to change upon path resolution +sp1.files = ["lib/", "share"] +self.addpkg2db("sync", sp1) + +self.args = "-S %s" % sp1.name + +self.addrule("PACMAN_RETCODE=1") +self.addrule("PKG_EXIST=foo|1-1") + +self.expectfailure = True diff --git a/test/pacman/tests/fileconflict025.py b/test/pacman/tests/fileconflict025.py new file mode 100644 index 0000000..efd027e --- /dev/null +++ b/test/pacman/tests/fileconflict025.py @@ -0,0 +1,20 @@ +self.description = "File conflict between package with symlink and package with real path resolved by removal (reversed)" + +self.filesystem = ["usr/lib/", "lib -> usr/lib/"] + +lp1 = pmpkg("foo") +lp1.files = ["usr/", "usr/lib/", "usr/lib/file"] +self.addpkg2db("local", lp1) + +sp1 = pmpkg("bar") +sp1.conflicts = ["foo"] +sp1.files = ["lib/", "lib/file"] +self.addpkg2db("sync", sp1) + +self.args = "-S %s --ask=4" % sp1.name + +self.addrule("PACMAN_RETCODE=0") +self.addrule("!PKG_EXIST=foo") +self.addrule("PKG_EXIST=bar") + +self.expectfailure = True -- 1.8.1.3
We were comparing files based on resolved paths but returning the original file_t structures, which were not necessarily in the same order. The additional file_t information was never used, so just return the resolved path. Signed-off-by: Andrew Gregory <andrew.gregory.8@gmail.com> --- lib/libalpm/conflict.c | 4 ++-- lib/libalpm/filelist.c | 2 +- test/pacman/tests/fileconflict022.py | 2 -- 3 files changed, 3 insertions(+), 5 deletions(-) diff --git a/lib/libalpm/conflict.c b/lib/libalpm/conflict.c index 1aa653f..54e79fe 100644 --- a/lib/libalpm/conflict.c +++ b/lib/libalpm/conflict.c @@ -448,8 +448,8 @@ alpm_list_t *_alpm_db_find_fileconflicts(alpm_handle_t *handle, alpm_list_t *k; char path[PATH_MAX]; for(k = common_files; k; k = k->next) { - alpm_file_t *file = k->data; - snprintf(path, PATH_MAX, "%s%s", handle->root, file->name); + char *filename = k->data; + snprintf(path, PATH_MAX, "%s%s", handle->root, filename); conflicts = add_fileconflict(handle, conflicts, path, p1, p2); if(handle->pm_errno == ALPM_ERR_MEMORY) { FREELIST(conflicts); diff --git a/lib/libalpm/filelist.c b/lib/libalpm/filelist.c index c4814a4..5c5f017 100644 --- a/lib/libalpm/filelist.c +++ b/lib/libalpm/filelist.c @@ -299,7 +299,7 @@ alpm_list_t *_alpm_filelist_intersection(alpm_filelist_t *filesA, /* when not directories, item in both qualifies as an intersect */ if(! (isdirA && isdirB)) { - ret = alpm_list_add(ret, filesA->files + ctrA); + ret = alpm_list_add(ret, filesA->resolved_path[ctrA]); } ctrA++; ctrB++; diff --git a/test/pacman/tests/fileconflict022.py b/test/pacman/tests/fileconflict022.py index 6f9aec9..5759d4c 100644 --- a/test/pacman/tests/fileconflict022.py +++ b/test/pacman/tests/fileconflict022.py @@ -17,5 +17,3 @@ self.addrule("PACMAN_OUTPUT=.*/usr/lib/file exists in both 'foo' and 'bar'") self.addrule("!PKG_EXIST=foo") self.addrule("!PKG_EXIST=bar") - -self.expectfailure = True -- 1.8.1.3
We were comparing files based on resolved paths but returning the original file_t structures, which were not necessarily in the same order. The extra file_t information was only being used to determine if the file was a directory which can be accomplished by testing for a trailing slash, so just return the resolved path. Signed-off-by: Andrew Gregory <andrew.gregory.8@gmail.com> --- lib/libalpm/conflict.c | 36 ++++++++++++++---------------------- lib/libalpm/filelist.c | 10 ++++------ test/pacman/tests/fileconflict024.py | 2 -- test/pacman/tests/fileconflict025.py | 2 -- 4 files changed, 18 insertions(+), 32 deletions(-) diff --git a/lib/libalpm/conflict.c b/lib/libalpm/conflict.c index 54e79fe..ba642dd 100644 --- a/lib/libalpm/conflict.c +++ b/lib/libalpm/conflict.c @@ -426,9 +426,8 @@ alpm_list_t *_alpm_db_find_fileconflicts(alpm_handle_t *handle, for(current = 0, i = upgrade; i; i = i->next, current++) { alpm_pkg_t *p1 = i->data; alpm_list_t *j; - alpm_filelist_t tmpfiles; + alpm_list_t *tmpfiles = NULL; alpm_pkg_t *dbpkg; - size_t filenum; int percent = (current * 100) / numtargs; PROGRESS(handle, ALPM_PROGRESS_CONFLICTS_START, "", percent, @@ -472,23 +471,21 @@ alpm_list_t *_alpm_db_find_fileconflicts(alpm_handle_t *handle, * that the former list needs to be freed while the latter list should NOT * be freed. */ if(dbpkg) { - _alpm_filelist_resolve(handle, alpm_pkg_get_files(dbpkg)); - alpm_list_t *difference; /* older ver of package currently installed */ - difference = _alpm_filelist_difference(alpm_pkg_get_files(p1), + _alpm_filelist_resolve(handle, alpm_pkg_get_files(dbpkg)); + tmpfiles = _alpm_filelist_difference(alpm_pkg_get_files(p1), alpm_pkg_get_files(dbpkg)); - tmpfiles.count = alpm_list_count(difference); - tmpfiles.files = alpm_list_to_array(difference, tmpfiles.count, - sizeof(alpm_file_t)); - alpm_list_free(difference); } else { /* no version of package currently installed */ - tmpfiles = *alpm_pkg_get_files(p1); + alpm_filelist_t *fl = alpm_pkg_get_files(p1); + size_t filenum; + for(filenum = 0; filenum < fl->count; filenum++) { + tmpfiles = alpm_list_add(tmpfiles, fl->resolved_path[filenum]); + } } - for(filenum = 0; filenum < tmpfiles.count; filenum++) { - alpm_file_t *file = tmpfiles.files + filenum; - const char *filestr = file->name; + for(j = tmpfiles; j; j = j->next) { + const char *filestr = j->data; const char *relative_path; alpm_list_t *k; /* have we acted on this conflict? */ @@ -506,7 +503,7 @@ alpm_list_t *_alpm_db_find_fileconflicts(alpm_handle_t *handle, _alpm_log(handle, ALPM_LOG_DEBUG, "checking possible conflict: %s\n", path); - if(S_ISDIR(file->mode)) { + if(filestr[strlen(filestr) - 1] == '/') { struct stat sbuf; if(S_ISDIR(lsbuf.st_mode)) { _alpm_log(handle, ALPM_LOG_DEBUG, "file is a directory, not a conflict\n"); @@ -529,6 +526,7 @@ alpm_list_t *_alpm_db_find_fileconflicts(alpm_handle_t *handle, /* Check remove list (will we remove the conflicting local file?) */ for(k = rem; k && !resolved_conflict; k = k->next) { alpm_pkg_t *rempkg = k->data; + _alpm_filelist_resolve(handle, alpm_pkg_get_files(rempkg)); if(rempkg && alpm_filelist_contains(alpm_pkg_get_files(rempkg), relative_path)) { _alpm_log(handle, ALPM_LOG_DEBUG, @@ -607,18 +605,12 @@ alpm_list_t *_alpm_db_find_fileconflicts(alpm_handle_t *handle, conflicts = add_fileconflict(handle, conflicts, path, p1, NULL); if(handle->pm_errno == ALPM_ERR_MEMORY) { FREELIST(conflicts); - if(dbpkg) { - /* only freed if it was generated from _alpm_filelist_difference() */ - free(tmpfiles.files); - } + alpm_list_free(tmpfiles); return NULL; } } } - if(dbpkg) { - /* only freed if it was generated from _alpm_filelist_difference() */ - free(tmpfiles.files); - } + alpm_list_free(tmpfiles); } PROGRESS(handle, ALPM_PROGRESS_CONFLICTS_START, "", 100, numtargs, current); diff --git a/lib/libalpm/filelist.c b/lib/libalpm/filelist.c index 5c5f017..b0dcee4 100644 --- a/lib/libalpm/filelist.c +++ b/lib/libalpm/filelist.c @@ -229,14 +229,13 @@ alpm_list_t *_alpm_filelist_difference(alpm_filelist_t *filesA, size_t ctrA = 0, ctrB = 0; while(ctrA < filesA->count && ctrB < filesB->count) { - alpm_file_t *fileA = filesA->files + ctrA; - const char *strA = filesA->resolved_path[ctrA]; - const char *strB = filesB->resolved_path[ctrB]; + char *strA = filesA->resolved_path[ctrA]; + char *strB = filesB->resolved_path[ctrB]; int cmp = strcmp(strA, strB); if(cmp < 0) { /* item only in filesA, qualifies as a difference */ - ret = alpm_list_add(ret, fileA); + ret = alpm_list_add(ret, strA); ctrA++; } else if(cmp > 0) { ctrB++; @@ -248,8 +247,7 @@ alpm_list_t *_alpm_filelist_difference(alpm_filelist_t *filesA, /* ensure we have completely emptied pA */ while(ctrA < filesA->count) { - alpm_file_t *fileA = filesA->files + ctrA; - ret = alpm_list_add(ret, fileA); + ret = alpm_list_add(ret, filesA->resolved_path[ctrA]); ctrA++; } diff --git a/test/pacman/tests/fileconflict024.py b/test/pacman/tests/fileconflict024.py index 1dcb5ad..17394db 100644 --- a/test/pacman/tests/fileconflict024.py +++ b/test/pacman/tests/fileconflict024.py @@ -15,5 +15,3 @@ self.addrule("PACMAN_RETCODE=1") self.addrule("PKG_EXIST=foo|1-1") - -self.expectfailure = True diff --git a/test/pacman/tests/fileconflict025.py b/test/pacman/tests/fileconflict025.py index efd027e..3c6f063 100644 --- a/test/pacman/tests/fileconflict025.py +++ b/test/pacman/tests/fileconflict025.py @@ -16,5 +16,3 @@ self.addrule("PACMAN_RETCODE=0") self.addrule("!PKG_EXIST=foo") self.addrule("PKG_EXIST=bar") - -self.expectfailure = True -- 1.8.1.3
alpm_filelist_contains was being used to search for resolved paths, but searching in the unresolved paths, causing it to miss matches. Signed-off-by: Andrew Gregory <andrew.gregory.8@gmail.com> --- lib/libalpm/alpm.h | 2 +- lib/libalpm/conflict.c | 5 +++++ lib/libalpm/filelist.c | 6 +++--- lib/libalpm/remove.c | 3 +++ test/pacman/tests/fileconflict023.py | 2 -- 5 files changed, 12 insertions(+), 6 deletions(-) diff --git a/lib/libalpm/alpm.h b/lib/libalpm/alpm.h index 5037859..afa5cd7 100644 --- a/lib/libalpm/alpm.h +++ b/lib/libalpm/alpm.h @@ -1023,7 +1023,7 @@ int alpm_pkg_set_reason(alpm_pkg_t *pkg, alpm_pkgreason_t reason); * @param path the path to search for in the package * @return a pointer to the matching file or NULL if not found */ -alpm_file_t *alpm_filelist_contains(alpm_filelist_t *filelist, const char *path); +char *alpm_filelist_contains(alpm_filelist_t *filelist, const char *path); /* * Signatures diff --git a/lib/libalpm/conflict.c b/lib/libalpm/conflict.c index ba642dd..afed895 100644 --- a/lib/libalpm/conflict.c +++ b/lib/libalpm/conflict.c @@ -321,6 +321,7 @@ static int dir_belongsto_pkg(alpm_handle_t *handle, const char *dirpath, const char *root = handle->root; /* check directory is actually in package - used for subdirectory checks */ + _alpm_filelist_resolve(handle, alpm_pkg_get_files(pkg)); if(!alpm_filelist_contains(alpm_pkg_get_files(pkg), dirpath)) { _alpm_log(handle, ALPM_LOG_DEBUG, "directory %s not in package %s\n", dirpath, pkg->name); @@ -339,6 +340,7 @@ static int dir_belongsto_pkg(alpm_handle_t *handle, const char *dirpath, continue; } + _alpm_filelist_resolve(handle, alpm_pkg_get_files(i->data)); if(alpm_filelist_contains(alpm_pkg_get_files(i->data), dirpath)) { _alpm_log(handle, ALPM_LOG_DEBUG, "file %s also in package %s\n", dirpath, @@ -373,6 +375,7 @@ static int dir_belongsto_pkg(alpm_handle_t *handle, const char *dirpath, return 0; } } else { + _alpm_filelist_resolve(handle, alpm_pkg_get_files(pkg)); if(alpm_filelist_contains(alpm_pkg_get_files(pkg), path)) { continue; } else { @@ -544,6 +547,7 @@ alpm_list_t *_alpm_db_find_fileconflicts(alpm_handle_t *handle, alpm_pkg_t *localp2 = _alpm_db_get_pkgfromcache(handle->db_local, p2->name); /* localp2->files will be removed (target conflicts are handled by CHECK 1) */ + _alpm_filelist_resolve(handle, alpm_pkg_get_files(localp2)); if(localp2 && alpm_filelist_contains(alpm_pkg_get_files(localp2), filestr)) { /* skip removal of file, but not add. this will prevent a second * package from removing the file when it was already installed @@ -590,6 +594,7 @@ alpm_list_t *_alpm_db_find_fileconflicts(alpm_handle_t *handle, alpm_list_t *local_pkgs = _alpm_db_get_pkgcache(handle->db_local); int found = 0; for(k = local_pkgs; k && !found; k = k->next) { + _alpm_filelist_resolve(handle, alpm_pkg_get_files(k->data)); if(alpm_filelist_contains(alpm_pkg_get_files(k->data), filestr)) { found = 1; } diff --git a/lib/libalpm/filelist.c b/lib/libalpm/filelist.c index b0dcee4..f9092c1 100644 --- a/lib/libalpm/filelist.c +++ b/lib/libalpm/filelist.c @@ -324,7 +324,7 @@ int _alpm_files_cmp(const void *f1, const void *f2) } -alpm_file_t *alpm_filelist_contains(alpm_filelist_t *filelist, +char *alpm_filelist_contains(alpm_filelist_t *filelist, const char *path) { alpm_file_t key; @@ -335,8 +335,8 @@ alpm_file_t *alpm_filelist_contains(alpm_filelist_t *filelist, key.name = (char *)path; - return bsearch(&key, filelist->files, filelist->count, - sizeof(alpm_file_t), _alpm_files_cmp); + return bsearch(&key, filelist->resolved_path, filelist->count, + sizeof(char *), _alpm_filelist_strcmp); } /* vim: set ts=2 sw=2 noet: */ diff --git a/lib/libalpm/remove.c b/lib/libalpm/remove.c index 64888c5..60ea8de 100644 --- a/lib/libalpm/remove.c +++ b/lib/libalpm/remove.c @@ -462,6 +462,7 @@ static int unlink_file(alpm_handle_t *handle, alpm_pkg_t *oldpkg, "keeping directory %s (could not count files)\n", file); } else if(newpkg && alpm_filelist_contains(alpm_pkg_get_files(newpkg), fileobj->name)) { + /* newpkg's filelist should have already been resolved by the caller */ _alpm_log(handle, ALPM_LOG_DEBUG, "keeping directory %s (in new package)\n", file); } else if(dir_is_mountpoint(handle, file, &buf)) { @@ -483,6 +484,7 @@ static int unlink_file(alpm_handle_t *handle, alpm_pkg_t *oldpkg, continue; } filelist = alpm_pkg_get_files(local_pkg); + _alpm_filelist_resolve(handle, filelist); if(alpm_filelist_contains(filelist, fileobj->name)) { _alpm_log(handle, ALPM_LOG_DEBUG, "keeping directory %s (owned by %s)\n", file, local_pkg->name); @@ -580,6 +582,7 @@ static int remove_package_files(alpm_handle_t *handle, * so this removal operation doesn't kill them */ /* old package backup list */ newfiles = alpm_pkg_get_files(newpkg); + _alpm_filelist_resolve(handle, newfiles); for(b = alpm_pkg_get_backup(newpkg); b; b = b->next) { const alpm_backup_t *backup = b->data; /* safety check (fix the upgrade026 pactest) */ diff --git a/test/pacman/tests/fileconflict023.py b/test/pacman/tests/fileconflict023.py index 1310b68..ce72087 100644 --- a/test/pacman/tests/fileconflict023.py +++ b/test/pacman/tests/fileconflict023.py @@ -16,5 +16,3 @@ self.addrule("PACMAN_RETCODE=0") self.addrule("!PKG_EXIST=foo") self.addrule("PKG_EXIST=bar") - -self.expectfailure = True -- 1.8.1.3
On 16/02/13 12:02, Andrew Gregory wrote:
alpm_filelist_contains was being used to search for resolved paths, but searching in the unresolved paths, causing it to miss matches.
Signed-off-by: Andrew Gregory <andrew.gregory.8@gmail.com> ---
Instead of requiring _alpm_filelist_resolve() to be called before calling alpm_filelist_contains(), just add the call in alpm_filelist_contains() itself. If the filelist has already been resolved, it returns immediately anyway. As an aside - this whole situation really sucks! We now are resolving every path...
lib/libalpm/alpm.h | 2 +- lib/libalpm/conflict.c | 5 +++++ lib/libalpm/filelist.c | 6 +++--- lib/libalpm/remove.c | 3 +++ test/pacman/tests/fileconflict023.py | 2 -- 5 files changed, 12 insertions(+), 6 deletions(-)
diff --git a/lib/libalpm/alpm.h b/lib/libalpm/alpm.h index 5037859..afa5cd7 100644 --- a/lib/libalpm/alpm.h +++ b/lib/libalpm/alpm.h @@ -1023,7 +1023,7 @@ int alpm_pkg_set_reason(alpm_pkg_t *pkg, alpm_pkgreason_t reason); * @param path the path to search for in the package * @return a pointer to the matching file or NULL if not found */ -alpm_file_t *alpm_filelist_contains(alpm_filelist_t *filelist, const char *path); +char *alpm_filelist_contains(alpm_filelist_t *filelist, const char *path);
/* * Signatures diff --git a/lib/libalpm/conflict.c b/lib/libalpm/conflict.c index ba642dd..afed895 100644 --- a/lib/libalpm/conflict.c +++ b/lib/libalpm/conflict.c @@ -321,6 +321,7 @@ static int dir_belongsto_pkg(alpm_handle_t *handle, const char *dirpath, const char *root = handle->root;
/* check directory is actually in package - used for subdirectory checks */ + _alpm_filelist_resolve(handle, alpm_pkg_get_files(pkg)); if(!alpm_filelist_contains(alpm_pkg_get_files(pkg), dirpath)) { _alpm_log(handle, ALPM_LOG_DEBUG, "directory %s not in package %s\n", dirpath, pkg->name); @@ -339,6 +340,7 @@ static int dir_belongsto_pkg(alpm_handle_t *handle, const char *dirpath, continue; }
+ _alpm_filelist_resolve(handle, alpm_pkg_get_files(i->data)); if(alpm_filelist_contains(alpm_pkg_get_files(i->data), dirpath)) { _alpm_log(handle, ALPM_LOG_DEBUG, "file %s also in package %s\n", dirpath, @@ -373,6 +375,7 @@ static int dir_belongsto_pkg(alpm_handle_t *handle, const char *dirpath, return 0; } } else { + _alpm_filelist_resolve(handle, alpm_pkg_get_files(pkg)); if(alpm_filelist_contains(alpm_pkg_get_files(pkg), path)) { continue; } else { @@ -544,6 +547,7 @@ alpm_list_t *_alpm_db_find_fileconflicts(alpm_handle_t *handle, alpm_pkg_t *localp2 = _alpm_db_get_pkgfromcache(handle->db_local, p2->name);
/* localp2->files will be removed (target conflicts are handled by CHECK 1) */ + _alpm_filelist_resolve(handle, alpm_pkg_get_files(localp2)); if(localp2 && alpm_filelist_contains(alpm_pkg_get_files(localp2), filestr)) { /* skip removal of file, but not add. this will prevent a second * package from removing the file when it was already installed @@ -590,6 +594,7 @@ alpm_list_t *_alpm_db_find_fileconflicts(alpm_handle_t *handle, alpm_list_t *local_pkgs = _alpm_db_get_pkgcache(handle->db_local); int found = 0; for(k = local_pkgs; k && !found; k = k->next) { + _alpm_filelist_resolve(handle, alpm_pkg_get_files(k->data)); if(alpm_filelist_contains(alpm_pkg_get_files(k->data), filestr)) { found = 1; } diff --git a/lib/libalpm/filelist.c b/lib/libalpm/filelist.c index b0dcee4..f9092c1 100644 --- a/lib/libalpm/filelist.c +++ b/lib/libalpm/filelist.c @@ -324,7 +324,7 @@ int _alpm_files_cmp(const void *f1, const void *f2) }
-alpm_file_t *alpm_filelist_contains(alpm_filelist_t *filelist, +char *alpm_filelist_contains(alpm_filelist_t *filelist, const char *path) { alpm_file_t key; @@ -335,8 +335,8 @@ alpm_file_t *alpm_filelist_contains(alpm_filelist_t *filelist,
key.name = (char *)path;
- return bsearch(&key, filelist->files, filelist->count, - sizeof(alpm_file_t), _alpm_files_cmp); + return bsearch(&key, filelist->resolved_path, filelist->count, + sizeof(char *), _alpm_filelist_strcmp); }
/* vim: set ts=2 sw=2 noet: */ diff --git a/lib/libalpm/remove.c b/lib/libalpm/remove.c index 64888c5..60ea8de 100644 --- a/lib/libalpm/remove.c +++ b/lib/libalpm/remove.c @@ -462,6 +462,7 @@ static int unlink_file(alpm_handle_t *handle, alpm_pkg_t *oldpkg, "keeping directory %s (could not count files)\n", file); } else if(newpkg && alpm_filelist_contains(alpm_pkg_get_files(newpkg), fileobj->name)) { + /* newpkg's filelist should have already been resolved by the caller */ _alpm_log(handle, ALPM_LOG_DEBUG, "keeping directory %s (in new package)\n", file); } else if(dir_is_mountpoint(handle, file, &buf)) { @@ -483,6 +484,7 @@ static int unlink_file(alpm_handle_t *handle, alpm_pkg_t *oldpkg, continue; } filelist = alpm_pkg_get_files(local_pkg); + _alpm_filelist_resolve(handle, filelist); if(alpm_filelist_contains(filelist, fileobj->name)) { _alpm_log(handle, ALPM_LOG_DEBUG, "keeping directory %s (owned by %s)\n", file, local_pkg->name); @@ -580,6 +582,7 @@ static int remove_package_files(alpm_handle_t *handle, * so this removal operation doesn't kill them */ /* old package backup list */ newfiles = alpm_pkg_get_files(newpkg); + _alpm_filelist_resolve(handle, newfiles); for(b = alpm_pkg_get_backup(newpkg); b; b = b->next) { const alpm_backup_t *backup = b->data; /* safety check (fix the upgrade026 pactest) */ diff --git a/test/pacman/tests/fileconflict023.py b/test/pacman/tests/fileconflict023.py index 1310b68..ce72087 100644 --- a/test/pacman/tests/fileconflict023.py +++ b/test/pacman/tests/fileconflict023.py @@ -16,5 +16,3 @@ self.addrule("PACMAN_RETCODE=0") self.addrule("!PKG_EXIST=foo") self.addrule("PKG_EXIST=bar") - -self.expectfailure = True
On Sun, 17 Feb 2013 10:32:48 +1000 Allan McRae <allan@archlinux.org> wrote:
On 16/02/13 12:02, Andrew Gregory wrote:
alpm_filelist_contains was being used to search for resolved paths, but searching in the unresolved paths, causing it to miss matches.
Signed-off-by: Andrew Gregory <andrew.gregory.8@gmail.com> ---
Instead of requiring _alpm_filelist_resolve() to be called before calling alpm_filelist_contains(), just add the call in alpm_filelist_contains() itself. If the filelist has already been resolved, it returns immediately anyway.
As an aside - this whole situation really sucks! We now are resolving every path...
I considered doing that. The reason I decided not to was that alpm_filelist_contains wouldn't be able to easily notify the caller if there's an error resolving the files. We aren't checking for those errors at the moment, but it would be good to add that at some point.
lib/libalpm/alpm.h | 2 +- lib/libalpm/conflict.c | 5 +++++ lib/libalpm/filelist.c | 6 +++--- lib/libalpm/remove.c | 3 +++ test/pacman/tests/fileconflict023.py | 2 -- 5 files changed, 12 insertions(+), 6 deletions(-)
diff --git a/lib/libalpm/alpm.h b/lib/libalpm/alpm.h index 5037859..afa5cd7 100644 --- a/lib/libalpm/alpm.h +++ b/lib/libalpm/alpm.h @@ -1023,7 +1023,7 @@ int alpm_pkg_set_reason(alpm_pkg_t *pkg, alpm_pkgreason_t reason); * @param path the path to search for in the package * @return a pointer to the matching file or NULL if not found */ -alpm_file_t *alpm_filelist_contains(alpm_filelist_t *filelist, const char *path); +char *alpm_filelist_contains(alpm_filelist_t *filelist, const char *path);
/* * Signatures diff --git a/lib/libalpm/conflict.c b/lib/libalpm/conflict.c index ba642dd..afed895 100644 --- a/lib/libalpm/conflict.c +++ b/lib/libalpm/conflict.c @@ -321,6 +321,7 @@ static int dir_belongsto_pkg(alpm_handle_t *handle, const char *dirpath, const char *root = handle->root;
/* check directory is actually in package - used for subdirectory checks */ + _alpm_filelist_resolve(handle, alpm_pkg_get_files(pkg)); if(!alpm_filelist_contains(alpm_pkg_get_files(pkg), dirpath)) { _alpm_log(handle, ALPM_LOG_DEBUG, "directory %s not in package %s\n", dirpath, pkg->name); @@ -339,6 +340,7 @@ static int dir_belongsto_pkg(alpm_handle_t *handle, const char *dirpath, continue; }
+ _alpm_filelist_resolve(handle, alpm_pkg_get_files(i->data)); if(alpm_filelist_contains(alpm_pkg_get_files(i->data), dirpath)) { _alpm_log(handle, ALPM_LOG_DEBUG, "file %s also in package %s\n", dirpath, @@ -373,6 +375,7 @@ static int dir_belongsto_pkg(alpm_handle_t *handle, const char *dirpath, return 0; } } else { + _alpm_filelist_resolve(handle, alpm_pkg_get_files(pkg)); if(alpm_filelist_contains(alpm_pkg_get_files(pkg), path)) { continue; } else { @@ -544,6 +547,7 @@ alpm_list_t *_alpm_db_find_fileconflicts(alpm_handle_t *handle, alpm_pkg_t *localp2 = _alpm_db_get_pkgfromcache(handle->db_local, p2->name);
/* localp2->files will be removed (target conflicts are handled by CHECK 1) */ + _alpm_filelist_resolve(handle, alpm_pkg_get_files(localp2)); if(localp2 && alpm_filelist_contains(alpm_pkg_get_files(localp2), filestr)) { /* skip removal of file, but not add. this will prevent a second * package from removing the file when it was already installed @@ -590,6 +594,7 @@ alpm_list_t *_alpm_db_find_fileconflicts(alpm_handle_t *handle, alpm_list_t *local_pkgs = _alpm_db_get_pkgcache(handle->db_local); int found = 0; for(k = local_pkgs; k && !found; k = k->next) { + _alpm_filelist_resolve(handle, alpm_pkg_get_files(k->data)); if(alpm_filelist_contains(alpm_pkg_get_files(k->data), filestr)) { found = 1; } diff --git a/lib/libalpm/filelist.c b/lib/libalpm/filelist.c index b0dcee4..f9092c1 100644 --- a/lib/libalpm/filelist.c +++ b/lib/libalpm/filelist.c @@ -324,7 +324,7 @@ int _alpm_files_cmp(const void *f1, const void *f2) }
-alpm_file_t *alpm_filelist_contains(alpm_filelist_t *filelist, +char *alpm_filelist_contains(alpm_filelist_t *filelist, const char *path) { alpm_file_t key; @@ -335,8 +335,8 @@ alpm_file_t *alpm_filelist_contains(alpm_filelist_t *filelist,
key.name = (char *)path;
- return bsearch(&key, filelist->files, filelist->count, - sizeof(alpm_file_t), _alpm_files_cmp); + return bsearch(&key, filelist->resolved_path, filelist->count, + sizeof(char *), _alpm_filelist_strcmp); }
/* vim: set ts=2 sw=2 noet: */ diff --git a/lib/libalpm/remove.c b/lib/libalpm/remove.c index 64888c5..60ea8de 100644 --- a/lib/libalpm/remove.c +++ b/lib/libalpm/remove.c @@ -462,6 +462,7 @@ static int unlink_file(alpm_handle_t *handle, alpm_pkg_t *oldpkg, "keeping directory %s (could not count files)\n", file); } else if(newpkg && alpm_filelist_contains(alpm_pkg_get_files(newpkg), fileobj->name)) { + /* newpkg's filelist should have already been resolved by the caller */ _alpm_log(handle, ALPM_LOG_DEBUG, "keeping directory %s (in new package)\n", file); } else if(dir_is_mountpoint(handle, file, &buf)) { @@ -483,6 +484,7 @@ static int unlink_file(alpm_handle_t *handle, alpm_pkg_t *oldpkg, continue; } filelist = alpm_pkg_get_files(local_pkg); + _alpm_filelist_resolve(handle, filelist); if(alpm_filelist_contains(filelist, fileobj->name)) { _alpm_log(handle, ALPM_LOG_DEBUG, "keeping directory %s (owned by %s)\n", file, local_pkg->name); @@ -580,6 +582,7 @@ static int remove_package_files(alpm_handle_t *handle, * so this removal operation doesn't kill them */ /* old package backup list */ newfiles = alpm_pkg_get_files(newpkg); + _alpm_filelist_resolve(handle, newfiles); for(b = alpm_pkg_get_backup(newpkg); b; b = b->next) { const alpm_backup_t *backup = b->data; /* safety check (fix the upgrade026 pactest) */ diff --git a/test/pacman/tests/fileconflict023.py b/test/pacman/tests/fileconflict023.py index 1310b68..ce72087 100644 --- a/test/pacman/tests/fileconflict023.py +++ b/test/pacman/tests/fileconflict023.py @@ -16,5 +16,3 @@ self.addrule("PACMAN_RETCODE=0") self.addrule("!PKG_EXIST=foo") self.addrule("PKG_EXIST=bar") - -self.expectfailure = True
On 17/02/13 11:08, Andrew Gregory wrote:
On Sun, 17 Feb 2013 10:32:48 +1000 Allan McRae <allan@archlinux.org> wrote:
On 16/02/13 12:02, Andrew Gregory wrote:
alpm_filelist_contains was being used to search for resolved paths, but searching in the unresolved paths, causing it to miss matches.
Signed-off-by: Andrew Gregory <andrew.gregory.8@gmail.com> ---
Instead of requiring _alpm_filelist_resolve() to be called before calling alpm_filelist_contains(), just add the call in alpm_filelist_contains() itself. If the filelist has already been resolved, it returns immediately anyway.
As an aside - this whole situation really sucks! We now are resolving every path...
I considered doing that. The reason I decided not to was that alpm_filelist_contains wouldn't be able to easily notify the caller if there's an error resolving the files. We aren't checking for those errors at the moment, but it would be good to add that at some point.
But there is no indication that the resolve function needs to be called first for the filelist_contains function to work. Also, see the email I just sent about killing resolving altogether... Allan
On 17/02/13 11:17, Allan McRae wrote:
On 17/02/13 11:08, Andrew Gregory wrote:
On Sun, 17 Feb 2013 10:32:48 +1000 Allan McRae <allan@archlinux.org> wrote:
On 16/02/13 12:02, Andrew Gregory wrote:
alpm_filelist_contains was being used to search for resolved paths, but searching in the unresolved paths, causing it to miss matches.
Signed-off-by: Andrew Gregory <andrew.gregory.8@gmail.com> ---
Instead of requiring _alpm_filelist_resolve() to be called before calling alpm_filelist_contains(), just add the call in alpm_filelist_contains() itself. If the filelist has already been resolved, it returns immediately anyway.
As an aside - this whole situation really sucks! We now are resolving every path...
I considered doing that. The reason I decided not to was that alpm_filelist_contains wouldn't be able to easily notify the caller if there's an error resolving the files. We aren't checking for those errors at the moment, but it would be good to add that at some point.
But there is no indication that the resolve function needs to be called first for the filelist_contains function to work.
Also, I just noticed that alpm_filelist_contains() is public, _alpm_filelist_resolve() is not... This is a bit of a mess. Allan
alpm_filelist_contains was being used to search for resolved paths, but searching in the unresolved paths, causing it to miss matches. We always search unresolved paths and search the resolved paths if available because _alpm_filelist_resolve is not public and requires a context handle, so it can't be called from alpm_filelist_contains. Signed-off-by: Andrew Gregory <andrew.gregory.8@gmail.com> --- lib/libalpm/alpm.h | 2 +- lib/libalpm/conflict.c | 5 +++++ lib/libalpm/filelist.c | 15 ++++++++++++--- lib/libalpm/remove.c | 3 +++ test/pacman/tests/fileconflict023.py | 2 -- 5 files changed, 21 insertions(+), 6 deletions(-) diff --git a/lib/libalpm/alpm.h b/lib/libalpm/alpm.h index 5037859..afa5cd7 100644 --- a/lib/libalpm/alpm.h +++ b/lib/libalpm/alpm.h @@ -1023,7 +1023,7 @@ int alpm_pkg_set_reason(alpm_pkg_t *pkg, alpm_pkgreason_t reason); * @param path the path to search for in the package * @return a pointer to the matching file or NULL if not found */ -alpm_file_t *alpm_filelist_contains(alpm_filelist_t *filelist, const char *path); +char *alpm_filelist_contains(alpm_filelist_t *filelist, const char *path); /* * Signatures diff --git a/lib/libalpm/conflict.c b/lib/libalpm/conflict.c index ba642dd..afed895 100644 --- a/lib/libalpm/conflict.c +++ b/lib/libalpm/conflict.c @@ -321,6 +321,7 @@ static int dir_belongsto_pkg(alpm_handle_t *handle, const char *dirpath, const char *root = handle->root; /* check directory is actually in package - used for subdirectory checks */ + _alpm_filelist_resolve(handle, alpm_pkg_get_files(pkg)); if(!alpm_filelist_contains(alpm_pkg_get_files(pkg), dirpath)) { _alpm_log(handle, ALPM_LOG_DEBUG, "directory %s not in package %s\n", dirpath, pkg->name); @@ -339,6 +340,7 @@ static int dir_belongsto_pkg(alpm_handle_t *handle, const char *dirpath, continue; } + _alpm_filelist_resolve(handle, alpm_pkg_get_files(i->data)); if(alpm_filelist_contains(alpm_pkg_get_files(i->data), dirpath)) { _alpm_log(handle, ALPM_LOG_DEBUG, "file %s also in package %s\n", dirpath, @@ -373,6 +375,7 @@ static int dir_belongsto_pkg(alpm_handle_t *handle, const char *dirpath, return 0; } } else { + _alpm_filelist_resolve(handle, alpm_pkg_get_files(pkg)); if(alpm_filelist_contains(alpm_pkg_get_files(pkg), path)) { continue; } else { @@ -544,6 +547,7 @@ alpm_list_t *_alpm_db_find_fileconflicts(alpm_handle_t *handle, alpm_pkg_t *localp2 = _alpm_db_get_pkgfromcache(handle->db_local, p2->name); /* localp2->files will be removed (target conflicts are handled by CHECK 1) */ + _alpm_filelist_resolve(handle, alpm_pkg_get_files(localp2)); if(localp2 && alpm_filelist_contains(alpm_pkg_get_files(localp2), filestr)) { /* skip removal of file, but not add. this will prevent a second * package from removing the file when it was already installed @@ -590,6 +594,7 @@ alpm_list_t *_alpm_db_find_fileconflicts(alpm_handle_t *handle, alpm_list_t *local_pkgs = _alpm_db_get_pkgcache(handle->db_local); int found = 0; for(k = local_pkgs; k && !found; k = k->next) { + _alpm_filelist_resolve(handle, alpm_pkg_get_files(k->data)); if(alpm_filelist_contains(alpm_pkg_get_files(k->data), filestr)) { found = 1; } diff --git a/lib/libalpm/filelist.c b/lib/libalpm/filelist.c index b0dcee4..6a24903 100644 --- a/lib/libalpm/filelist.c +++ b/lib/libalpm/filelist.c @@ -324,10 +324,10 @@ int _alpm_files_cmp(const void *f1, const void *f2) } -alpm_file_t *alpm_filelist_contains(alpm_filelist_t *filelist, +char *alpm_filelist_contains(alpm_filelist_t *filelist, const char *path) { - alpm_file_t key; + alpm_file_t key, *match; if(!filelist) { return NULL; @@ -335,8 +335,17 @@ alpm_file_t *alpm_filelist_contains(alpm_filelist_t *filelist, key.name = (char *)path; - return bsearch(&key, filelist->files, filelist->count, + match = bsearch(&key, filelist->files, filelist->count, sizeof(alpm_file_t), _alpm_files_cmp); + + if(match) { + return match->name; + } else if(filelist->resolved_path) { + return bsearch(&path, filelist->resolved_path, filelist->count, + sizeof(char *), _alpm_filelist_strcmp); + } else { + return NULL; + } } /* vim: set ts=2 sw=2 noet: */ diff --git a/lib/libalpm/remove.c b/lib/libalpm/remove.c index 64888c5..60ea8de 100644 --- a/lib/libalpm/remove.c +++ b/lib/libalpm/remove.c @@ -462,6 +462,7 @@ static int unlink_file(alpm_handle_t *handle, alpm_pkg_t *oldpkg, "keeping directory %s (could not count files)\n", file); } else if(newpkg && alpm_filelist_contains(alpm_pkg_get_files(newpkg), fileobj->name)) { + /* newpkg's filelist should have already been resolved by the caller */ _alpm_log(handle, ALPM_LOG_DEBUG, "keeping directory %s (in new package)\n", file); } else if(dir_is_mountpoint(handle, file, &buf)) { @@ -483,6 +484,7 @@ static int unlink_file(alpm_handle_t *handle, alpm_pkg_t *oldpkg, continue; } filelist = alpm_pkg_get_files(local_pkg); + _alpm_filelist_resolve(handle, filelist); if(alpm_filelist_contains(filelist, fileobj->name)) { _alpm_log(handle, ALPM_LOG_DEBUG, "keeping directory %s (owned by %s)\n", file, local_pkg->name); @@ -580,6 +582,7 @@ static int remove_package_files(alpm_handle_t *handle, * so this removal operation doesn't kill them */ /* old package backup list */ newfiles = alpm_pkg_get_files(newpkg); + _alpm_filelist_resolve(handle, newfiles); for(b = alpm_pkg_get_backup(newpkg); b; b = b->next) { const alpm_backup_t *backup = b->data; /* safety check (fix the upgrade026 pactest) */ diff --git a/test/pacman/tests/fileconflict023.py b/test/pacman/tests/fileconflict023.py index 1310b68..ce72087 100644 --- a/test/pacman/tests/fileconflict023.py +++ b/test/pacman/tests/fileconflict023.py @@ -16,5 +16,3 @@ self.addrule("PACMAN_RETCODE=0") self.addrule("!PKG_EXIST=foo") self.addrule("PKG_EXIST=bar") - -self.expectfailure = True -- 1.8.1.3
On 16/02/13 12:02, Andrew Gregory wrote:
Properly handling symlinks requires using resolved paths. I'm not sure if returning strings instead of file_t structs is the best solution, but it seemed like the least drastic one as we prepare for 4.1 and we weren't using the extra information anyway.
Thanks for catching this and the patches - I made one small comment. This whole situation is beginning to really annoy me... I know we (especially you) have put a lot of work to get symlinks to directories working so that when /lib -> usr/lib we now can deal with /lib/foo and /usr/lib/foo conflicting etc, but having to stat every directory to determine if it is a symlink and the resolve it is just wrong. So here is what I am proposing. Get pacman-4.1 out the door with these patches included. Then the sole focus of pacman-4.2 should be stripping support of symlinks to directories out of pacman. If /lib is a symlink on your system, no packages should put files in /lib. Just make that a conflict. If a package MUST put files there, the package() function in the PKGBUILD should look like: package() { ln -s usr/lib ${pkgdir}/lib make install rm ${pkgdir}/lib } Done... The only other reason these are used (that I have seen...), is poor man's mount points. For example, your /usr partition is getting full and you decide to put it on your /home partition in /home/share (yes - I have seen that done...). The are more correct options that should be used here - preferably resize your partitions or create a new partition. On Linux, a bind mount could be used. Overall, I see no need to keep this mis-feature. So, I am proposing, get this patchset merged and get pacman-4.1 out the door. The just concentrate removing all directory symlink support from pacman. The only thing that would need to be done here is to check all current packages do not have paths in them that require resolving symlinks to directories. E.g. on machines with /lib -> usr/lib, no package should have a file in /lib. I propose that in a 4.1.x point release, we should add a tool (or perhaps extend testdb) to detect these packages and print a warning. So a bit of planning is needed, but I think it can be done. Comments? Allan
On Sat, Feb 16, 2013 at 8:14 PM, Allan McRae <allan@archlinux.org> wrote:
On 16/02/13 12:02, Andrew Gregory wrote:
Properly handling symlinks requires using resolved paths. I'm not sure if returning strings instead of file_t structs is the best solution, but it seemed like the least drastic one as we prepare for 4.1 and we weren't using the extra information anyway.
Thanks for catching this and the patches - I made one small comment.
This whole situation is beginning to really annoy me... I know we (especially you) have put a lot of work to get symlinks to directories working so that when /lib -> usr/lib we now can deal with /lib/foo and /usr/lib/foo conflicting etc, but having to stat every directory to determine if it is a symlink and the resolve it is just wrong.
So here is what I am proposing. Get pacman-4.1 out the door with these patches included. Then the sole focus of pacman-4.2 should be stripping support of symlinks to directories out of pacman. If /lib is a symlink on your system, no packages should put files in /lib. Just make that a conflict. If a package MUST put files there, the package() function in the PKGBUILD should look like:
package() { ln -s usr/lib ${pkgdir}/lib make install rm ${pkgdir}/lib }
Done...
The only other reason these are used (that I have seen...), is poor man's mount points. For example, your /usr partition is getting full and you decide to put it on your /home partition in /home/share (yes - I have seen that done...). The are more correct options that should be used here - preferably resize your partitions or create a new partition. On Linux, a bind mount could be used.
Overall, I see no need to keep this mis-feature.
So, I am proposing, get this patchset merged and get pacman-4.1 out the door. The just concentrate removing all directory symlink support from pacman.
The only thing that would need to be done here is to check all current packages do not have paths in them that require resolving symlinks to directories. E.g. on machines with /lib -> usr/lib, no package should have a file in /lib. I propose that in a 4.1.x point release, we should add a tool (or perhaps extend testdb) to detect these packages and print a warning. So a bit of planning is needed, but I think it can be done.
Comments?
Allan
Does namcap currently warn about "official symlinks", such as /lib -> usr/lib? If not, it would be good to have namcap warn about these so problematic PKGBUILDs can be cleaned up sooner rather than later. This wouldn't help for custom symlinks, such as the /home/share kludge you mentioned, but I still see some value in it. Jason
On Sun, 17 Feb 2013 11:14:29 +1000 Allan McRae <allan@archlinux.org> wrote:
On 16/02/13 12:02, Andrew Gregory wrote:
Properly handling symlinks requires using resolved paths. I'm not sure if returning strings instead of file_t structs is the best solution, but it seemed like the least drastic one as we prepare for 4.1 and we weren't using the extra information anyway.
Thanks for catching this and the patches - I made one small comment.
This whole situation is beginning to really annoy me... I know we (especially you) have put a lot of work to get symlinks to directories working so that when /lib -> usr/lib we now can deal with /lib/foo and /usr/lib/foo conflicting etc, but having to stat every directory to determine if it is a symlink and the resolve it is just wrong.
So here is what I am proposing. Get pacman-4.1 out the door with these patches included. Then the sole focus of pacman-4.2 should be stripping support of symlinks to directories out of pacman. If /lib is a symlink on your system, no packages should put files in /lib. Just make that a conflict. If a package MUST put files there, the package() function in the PKGBUILD should look like:
package() { ln -s usr/lib ${pkgdir}/lib make install rm ${pkgdir}/lib }
Done...
The only other reason these are used (that I have seen...), is poor man's mount points. For example, your /usr partition is getting full and you decide to put it on your /home partition in /home/share (yes - I have seen that done...). The are more correct options that should be used here - preferably resize your partitions or create a new partition. On Linux, a bind mount could be used.
Overall, I see no need to keep this mis-feature.
So, I am proposing, get this patchset merged and get pacman-4.1 out the door. The just concentrate removing all directory symlink support from pacman.
The only thing that would need to be done here is to check all current packages do not have paths in them that require resolving symlinks to directories. E.g. on machines with /lib -> usr/lib, no package should have a file in /lib. I propose that in a 4.1.x point release, we should add a tool (or perhaps extend testdb) to detect these packages and print a warning. So a bit of planning is needed, but I think it can be done.
Comments?
Allan
I found the following places where we currently use resolved filelists (or will after my patchset): Removal: - check if an upgraded package owns a directory before removing it - check if any installed packages own a directory before removing it - check if backup files are in new packages Conflicts: - check for inter-target conflicts - remove currently installed package's files from the conflict check for upgrades - check if all files in a directory are owned by a package for a directory -> symlink transition - check if a package being removed solves a conflict - check if file ownership changed from one install target to another - check if a file conflict is due to the new package using a symlink - check if a conflicted file is unowned and in the backup array Having reviewed the related code far more than I never wanted to, I believe I can get the number of resolved paths down to a manageable level while keeping full symlink support. We should be able to take advantage of the fact that symlinks in paths are not the norm and file basename collisions are actually quite rare to selectively resolve individual paths as necessary instead of resolving whole filelists. I haven't worked out all of the specifics for doing this efficiently, though, so it may not be worth it in the end. If we do remove support, we should probably at least still use resolved paths to check for inter-target conflicts. If one target has a symlink in its paths I don't see any other way to catch a conflict prior to actually extracting the files. As long as we only resolve files that are new in the package and packages don't constantly move files around, this should have only a small performance hit most of the time.
On 18/02/13 12:52, Andrew Gregory wrote:
Having reviewed the related code far more than I never wanted to, I believe I can get the number of resolved paths down to a manageable level while keeping full symlink support. We should be able to take advantage of the fact that symlinks in paths are not the norm and file basename collisions are actually quite rare to selectively resolve individual paths as necessary instead of resolving whole filelists. I haven't worked out all of the specifics for doing this efficiently, though, so it may not be worth it in the end.
It would be great if the amount of resolving could be reduced greatly. But what is a real use case for this feature?
If we do remove support, we should probably at least still use resolved paths to check for inter-target conflicts. If one target has a symlink in its paths I don't see any other way to catch a conflict prior to actually extracting the files. As long as we only resolve files that are new in the package and packages don't constantly move files around, this should have only a small performance hit most of the time.
I don't understand this. A package can not have a symlink to a directory and files in the directory too. E.g. a package can not have /lib -> usr/lib and /lib/foo. So there is no need to resolve there. Or, a symlink one package and a directory is in the other would generate a conflict (as happens currently). What would change is that we currently allow these to be installed in separate transactions (symlink containing package first). Allan
On Mon, 18 Feb 2013 13:47:54 +1000 Allan McRae <allan@archlinux.org> wrote:
On 18/02/13 12:52, Andrew Gregory wrote:
Having reviewed the related code far more than I never wanted to, I believe I can get the number of resolved paths down to a manageable level while keeping full symlink support. We should be able to take advantage of the fact that symlinks in paths are not the norm and file basename collisions are actually quite rare to selectively resolve individual paths as necessary instead of resolving whole filelists. I haven't worked out all of the specifics for doing this efficiently, though, so it may not be worth it in the end.
It would be great if the amount of resolving could be reduced greatly. But what is a real use case for this feature?
I was mostly just concerned about not removing functionality if we don't really need to. If there's really no need for it, I'm fine with dropping it.
If we do remove support, we should probably at least still use resolved paths to check for inter-target conflicts. If one target has a symlink in its paths I don't see any other way to catch a conflict prior to actually extracting the files. As long as we only resolve files that are new in the package and packages don't constantly move files around, this should have only a small performance hit most of the time.
I don't understand this.
A package can not have a symlink to a directory and files in the directory too. E.g. a package can not have /lib -> usr/lib and /lib/foo. So there is no need to resolve there.
Or, a symlink one package and a directory is in the other would generate a conflict (as happens currently). What would change is that we currently allow these to be installed in separate transactions (symlink containing package first).
Sorry, I meant that one of the packages has a directory that exists on the filesystem as a symlink, for example: filesystem: /lib -> /usr/lib pkg1: /lib/foo pkg2: /usr/lib/foo Without resolving /lib/foo we wouldn't be able to detect the conflict until we started committing the transaction and actually tried to extract the files.
On 18/02/13 14:30, Andrew Gregory wrote:
On Mon, 18 Feb 2013 13:47:54 +1000 Allan McRae <allan@archlinux.org> wrote:
On 18/02/13 12:52, Andrew Gregory wrote:
Having reviewed the related code far more than I never wanted to, I believe I can get the number of resolved paths down to a manageable level while keeping full symlink support. We should be able to take advantage of the fact that symlinks in paths are not the norm and file basename collisions are actually quite rare to selectively resolve individual paths as necessary instead of resolving whole filelists. I haven't worked out all of the specifics for doing this efficiently, though, so it may not be worth it in the end.
It would be great if the amount of resolving could be reduced greatly. But what is a real use case for this feature?
I was mostly just concerned about not removing functionality if we don't really need to. If there's really no need for it, I'm fine with dropping it.
Sure. And I know a lot of work has gone into getting this feature as fixed as it is now. But there were masses of issues discovered with this feature over the last year and there is a large increase in complexity in the conflict checking required to keep it. So I am not sure this feature is worth keeping - especially if no real use case can be made for it. Removing it simplifies conflict checking (see below) but also keeps the local pacman database correct as now people will not be able to replace a directory with a symlink. Anybody else want to give an opinion on this?
If we do remove support, we should probably at least still use resolved paths to check for inter-target conflicts. If one target has a symlink in its paths I don't see any other way to catch a conflict prior to actually extracting the files. As long as we only resolve files that are new in the package and packages don't constantly move files around, this should have only a small performance hit most of the time.
I don't understand this.
A package can not have a symlink to a directory and files in the directory too. E.g. a package can not have /lib -> usr/lib and /lib/foo. So there is no need to resolve there.
Or, a symlink one package and a directory is in the other would generate a conflict (as happens currently). What would change is that we currently allow these to be installed in separate transactions (symlink containing package first).
Sorry, I meant that one of the packages has a directory that exists on the filesystem as a symlink, for example:
filesystem: /lib -> /usr/lib
pkg1: /lib/foo
pkg2: /usr/lib/foo
Without resolving /lib/foo we wouldn't be able to detect the conflict until we started committing the transaction and actually tried to extract the files.
If we removed the handling of symlinks for directories that would be detected as a conflict (/lib symlink on filesystem vs /lib directory in pkg1).
On Mon, 18 Feb 2013 14:46:14 +1000 Allan McRae <allan@archlinux.org> wrote:
Sorry, I meant that one of the packages has a directory that exists on the filesystem as a symlink, for example:
filesystem: /lib -> /usr/lib
pkg1: /lib/foo
pkg2: /usr/lib/foo
Without resolving /lib/foo we wouldn't be able to detect the conflict until we started committing the transaction and actually tried to extract the files.
If we removed the handling of symlinks for directories that would be detected as a conflict (/lib symlink on filesystem vs /lib directory in pkg1).
Ah. I guess I didn't fully understand the extent of what you were suggesting. So pacman would not allow any deviation from what the packages provide? That does sound much easier. The only use case I can think of would be people wanting to try merging directories (/bin and /usr/bin for example) without having to rebuild all the necessary packages first.
participants (3)
-
Allan McRae
-
Andrew Gregory
-
Jason St. John