[pacman-dev] Pacman doesn't handle properly the multiple satisfiers, localdb corruption
Hi! Well, you probably will ask me to send patch, but I'm waiting for your decision. First, here is the problem: ===requiredby005.py== self.description = "Remove lp1, requiredby should move from lp1 to lp2" lp1 = pmpkg("foo") lp1.requiredby = ["pkg3"] self.addpkg2db("local", lp1) lp2 = pmpkg("pkg2") lp2.provides = ["foo"] self.addpkg2db("local", lp2) lp3 = pmpkg("pkg3") lp3.depends = ["foo"] self.addpkg2db("local", lp3) self.args = "-R %s" % lp1.name self.addrule("PACMAN_RETCODE=0") self.addrule("!PKG_EXIST=foo") self.addrule("PKG_EXIST=pkg2") self.addrule("PKG_REQUIREDBY=pkg2|pkg3") ------------------------- This test fails and hence the database becomes corrupt: alpm_checkdeps is clever enough to let pacman remove foo, however update of pkg2's requiredby is missing after the remove. However, this is not a pactest to commit: as you see, pkg2's requiredby field is missing: this can happen, if pkg3 was the last installed package, because _alpm_trans_update_depends works in a "funny" way: it fills in the "literal" satisfier's requiredby only or _all_ "provider" satisfiers' requiredby. So, my question, how I fix this? How should we store multiple satisfiers in localdb? case 1: We find all satisfier package and fill in _all_ requiredby fields. case 2: We flag only one package as a satisfier with requiredby, and we handle properly these requiredby "moves". case 3: any other idea? We must choose one, because this is not defined in pacman now. I prefer case 1. Pros: -easy to implement (only _alpm_trans_update_depends needs patching) -by doing this, we clean up _alpm_trans_update_depends using alpm_depcmp -_alpm_pkg_update_requiredby prefers this way now -this won't break prepare/commit hierarchy (because alpm_checkdeps should somehow inform remove_commit in case 2) -when using -d switch, implement case 2 is difficult Contras: -slower!! (we need to check for providers always) -some users may have corrupted(?) localdb as shown in the pactest file (this is rare) -one satisfier is enough Bye, ngaba ---------------------------------------------------------------- This message was sent using IMP, the Internet Messaging Program.
On Mon, Jul 16, 2007 at 11:43:38AM +0200, ngaba@petra.hos.u-szeged.hu wrote:
Hi!
Well, you probably will ask me to send patch, but I'm waiting for your decision. First, here is the problem: ===requiredby005.py== self.description = "Remove lp1, requiredby should move from lp1 to lp2"
lp1 = pmpkg("foo") lp1.requiredby = ["pkg3"] self.addpkg2db("local", lp1)
lp2 = pmpkg("pkg2") lp2.provides = ["foo"] self.addpkg2db("local", lp2)
lp3 = pmpkg("pkg3") lp3.depends = ["foo"] self.addpkg2db("local", lp3)
self.args = "-R %s" % lp1.name
self.addrule("PACMAN_RETCODE=0") self.addrule("!PKG_EXIST=foo") self.addrule("PKG_EXIST=pkg2") self.addrule("PKG_REQUIREDBY=pkg2|pkg3") ------------------------- This test fails and hence the database becomes corrupt: alpm_checkdeps is clever enough to let pacman remove foo, however update of pkg2's requiredby is missing after the remove. However, this is not a pactest to commit: as you see, pkg2's requiredby field is missing: this can happen, if pkg3 was the last installed package, because _alpm_trans_update_depends works in a "funny" way: it fills in the "literal" satisfier's requiredby only or _all_ "provider" satisfiers' requiredby. So, my question, how I fix this? How should we store multiple satisfiers in localdb? case 1: We find all satisfier package and fill in _all_ requiredby fields. case 2: We flag only one package as a satisfier with requiredby, and we handle properly these requiredby "moves". case 3: any other idea? We must choose one, because this is not defined in pacman now. I prefer case 1. Pros: -easy to implement (only _alpm_trans_update_depends needs patching) -by doing this, we clean up _alpm_trans_update_depends using alpm_depcmp -_alpm_pkg_update_requiredby prefers this way now -this won't break prepare/commit hierarchy (because alpm_checkdeps should somehow inform remove_commit in case 2) -when using -d switch, implement case 2 is difficult Contras: -slower!! (we need to check for providers always) -some users may have corrupted(?) localdb as shown in the pactest file (this is rare) -one satisfier is enough
Oh yes, please, clean that update_depends function, trying to read it almost killed me. (so go with case 1 I guess). About corrupted db, most db are already corrupted, but in a less important way : extra required by field (fixed by your upgraderm patch), instead of missing required by fields as it's the case here. But so, couldn't we have an option in pacman for repairing requiredby fields or something ?
About corrupted db, most db are already corrupted, but in a less important way : extra required by field (fixed by your upgraderm patch), instead of missing required by fields as it's the case here. But so, couldn't we have an option in pacman for repairing requiredby fields or something ? Well, in the future alpm_db_test should do a deeper db check (and try to fix errors: for example requiredby fields can be generated from depends easily); now it only does very basic "syntax" checking, which is a bit confusing for me (consistency check means more IMHO). Bye, ngaba
---------------------------------------------------------------- This message was sent using IMP, the Internet Messaging Program.
On Mon, Jul 16, 2007 at 04:31:32PM +0200, ngaba@petra.hos.u-szeged.hu wrote:
About corrupted db, most db are already corrupted, but in a less important way : extra required by field (fixed by your upgraderm patch), instead of missing required by fields as it's the case here. But so, couldn't we have an option in pacman for repairing requiredby fields or something ? Well, in the future alpm_db_test should do a deeper db check (and try to fix errors: for example requiredby fields can be generated from depends easily); now it only does very basic "syntax" checking, which is a bit confusing for me (consistency check means more IMHO).
Yes indeed, that's what I was thinking about. Maybe it could also look for broken dependencies also (mostly breakage caused by --nodeps). The problem is I'm not sure it fits perfectly in libalpm. Maybe it should rather be an external tool, using libalpm (in src/util/). That should be enough at least for checking, but for writing to the database in order to correct inconsistencies, I'm not sure..
On Mon, Jul 16, 2007 at 04:47:12PM +0200, Xavier wrote:
Yes indeed, that's what I was thinking about. Maybe it could also look for broken dependencies also (mostly breakage caused by --nodeps). The problem is I'm not sure it fits perfectly in libalpm. Maybe it should rather be an external tool, using libalpm (in src/util/). That should be enough at least for checking, but for writing to the database in order to correct inconsistencies, I'm not sure..
Finally, I think that isn't needed, errors can be corrected by just reinstalling the packages. So just displaying the missing dependencies, and extra / missing requiredby fields should be enough. I tried putting the code in an external tool (starting from the existing testpkg.c one), but that's stupid, because that code is already in libalpm. The dependencies / requiredby handling functions should be refactored a bit so that it could be used in this case, and then either they should be exported so that it could be used in an external tool, or there should be a little function in libalpm for checking for inconsistencies. I'll still attach the prog as a reference, because I think it might be interesting. But I really hate the code : copy / pasting is the lazy way, and imo that's one of the reason the current code base is so ugly and messy. /* * testdb.c : Test a pacman local database for validity * * Copyright (c) 2007 by Aaron Griffin <aaronmgriffin@gmail.com> * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation; either version 2 of the License, or * (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with this program; if not, write to the Free Software * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, * USA. */ #include <unistd.h> #include <stdio.h> #include <stdlib.h> #include <errno.h> #include <string.h> #include <sys/stat.h> #include <dirent.h> #include <libgen.h> #include <alpm.h> #include <alpm_list.h> #define PATH_MAX 1024 static void cleanup(int signum) { if(alpm_release() == -1) { fprintf(stderr, "error releasing alpm: %s\n", alpm_strerror(pm_errno)); } exit(signum); } void output_cb(pmloglevel_t level, char *fmt, va_list args) { if(strlen(fmt)) { switch(level) { case PM_LOG_ERROR: printf("error: "); break; case PM_LOG_WARNING: printf("warning: "); break; default: return; } vprintf(fmt, args); printf("\n"); } } int db_test(char *dbpath) { struct dirent *ent; char path[PATH_MAX]; struct stat buf; int ret = 0; DIR *dir = opendir(dbpath); while ((ent = readdir(dir)) != NULL) { if(!strcmp(ent->d_name, ".") || !strcmp(ent->d_name, "..")) { continue; } /* check for desc, depends, and files */ snprintf(path, PATH_MAX, "%s/%s/desc", dbpath, ent->d_name); if(stat(path, &buf)) { printf("%s: description file is missing\n", ent->d_name); ret++; } snprintf(path, PATH_MAX, "%s/%s/depends", dbpath, ent->d_name); if(stat(path, &buf)) { printf("%s: dependency file is missing\n", ent->d_name); ret++; } snprintf(path, PATH_MAX, "%s/%s/files", dbpath, ent->d_name); if(stat(path, &buf)) { printf("%s: file list is missing\n", ent->d_name); ret++; } } return(ret); } int check_requiredby(pmpkg_t *pkg) { int retval = 0; alpm_list_t *reqs, *deps; const char *pkgname = alpm_pkg_get_name(pkg); for(reqs = alpm_pkg_get_requiredby(pkg); reqs; reqs = reqs->next) { int found = 0; char *reqname = reqs->data; pmpkg_t *reqpkg = alpm_db_get_pkg(alpm_option_get_localdb(), reqname); for(deps = alpm_pkg_get_depends(reqpkg); deps && !found; deps = deps->next) { pmdepend_t *dep = alpm_splitdep(deps->data); found = alpm_depcmp(pkg, dep); } if(!found) { printf("%s : wrong requiredby field %s\n", pkgname, reqname); retval++; } } return(retval); } int check_depends(pmpkg_t *pkg) { int retval = 0; alpm_list_t *i, *j; pmdb_t *db = alpm_option_get_localdb(); const char *pkgname = alpm_pkg_get_name(pkg); alpm_list_t *depends = alpm_pkg_get_depends(pkg); for(i = depends; i; i = i->next) { if(!i->data) { continue; } char *depname = i->data; pmdepend_t* dep = alpm_splitdep(depname); if(dep == NULL) { continue; } int found = 0; for(j = alpm_db_getpkgcache(db); j; j = j->next) { pmpkg_t *deppkg = j->data; if(deppkg && alpm_depcmp(deppkg, dep)) { found = 1; alpm_list_t *rqdby = alpm_pkg_get_requiredby(deppkg); const char *deppkgname = alpm_pkg_get_name(deppkg); if(!alpm_list_find_str(rqdby, pkgname)) { printf("%s : missing requiredby field %s\n", deppkgname, pkgname); retval++; } } } if(!found) { printf("%s : missing dependency %s\n", pkgname, depname); retval++; } } return(retval); } int main(int argc, char **argv) { int retval = 0; /* default = false */ pmdb_t *db = NULL; char dbpath[PATH_MAX]; alpm_list_t *i; if(argc != 2) { fprintf(stderr, "usage: %s <pacman db>\n", basename(argv[0])); return(1); } snprintf(dbpath, PATH_MAX, "%s/local", argv[1]); retval = db_test(dbpath); if(retval) { exit(retval); } if(alpm_initialize() == -1) { fprintf(stderr, "cannot initialize alpm: %s\n", alpm_strerror(pm_errno)); return(1); } /* let us get log messages from libalpm */ alpm_option_set_logcb(output_cb); alpm_option_set_dbpath(argv[1]); db = alpm_db_register("local"); if(db == NULL) { fprintf(stderr, "error: could not register 'local' database (%s)\n", alpm_strerror(pm_errno)); cleanup(EXIT_FAILURE); } for(i = alpm_db_getpkgcache(db); i; i = alpm_list_next(i)) { pmpkg_t *pkg = alpm_list_getdata(i); retval += check_requiredby(pkg); retval += check_depends(pkg); } cleanup(retval); }
Well, I haven't read your code yet (but I will;-); but you could use pseudo packages (created from the db) and use libalpm functions, for example, alpm_checkdeps for dependency checking and _alpm_pkg_update_requiredby (and then a comparison with the db) for requiredby testing; and this would be independent from libalpm version. Bye, ngaba ---------------------------------------------------------------- This message was sent using IMP, the Internet Messaging Program.
On Tue, Jul 17, 2007 at 09:40:02PM +0200, ngaba@petra.hos.u-szeged.hu wrote:
Well, I haven't read your code yet (but I will;-); but you could use pseudo packages (created from the db) and use libalpm functions, for example, alpm_checkdeps for dependency checking and _alpm_pkg_update_requiredby (and then a comparison with the db) for requiredby testing; and this would be independent from libalpm version.
Do you mean the same pseudo target code that was removed : http://www.archlinux.org/pipermail/pacman-dev/2007-July/008911.html ? Or the the same idea but implemented differently ?
Do you mean the same pseudo target code that was removed : http://www.archlinux.org/pipermail/pacman-dev/2007-July/008911.html ? Qoute from that e-mail: "But there is at least one way to do that, it's creating a dummy package, with the same list of dependencies as in the PKGBUILD, and work with that in libalpm." Yes, this is what I said, except we use the db/depends instead of PKGBUILD. Bye, ngaba
---------------------------------------------------------------- This message was sent using IMP, the Internet Messaging Program.
On Wed, Jul 18, 2007 at 02:57:09PM +0200, ngaba@petra.hos.u-szeged.hu wrote:
Do you mean the same pseudo target code that was removed : http://www.archlinux.org/pipermail/pacman-dev/2007-July/008911.html ? Qoute from that e-mail: "But there is at least one way to do that, it's creating a dummy package, with the same list of dependencies as in the PKGBUILD, and work with that in libalpm." Yes, this is what I said, except we use the db/depends instead of PKGBUILD.
Yes, but I was mostly asking about the implementation from Aurelien that was removed. If finally it was ok and it should be added back, or if it should be implemented in another better way.
Yes, but I was mostly asking about the implementation from Aurelien that was removed. If finally it was ok and it should be added back, or if it should be implemented in another better way.
Well, Aurelien's implementation is quite complicated (but maybe fine), I think easier to use alpm_db_get_pkg for dummy package "creation". Bye, ngaba PS: After reading your code, it looks independent from libalpm version (it using alpm_depcmp), so it seems to be perfect ;-) ---------------------------------------------------------------- This message was sent using IMP, the Internet Messaging Program.
On Wed, Jul 18, 2007 at 03:33:07PM +0200, ngaba@petra.hos.u-szeged.hu wrote:
Yes, but I was mostly asking about the implementation from Aurelien that was removed. If finally it was ok and it should be added back, or if it should be implemented in another better way.
Well, Aurelien's implementation is quite complicated (but maybe fine), I think easier to use alpm_db_get_pkg for dummy package "creation".
Bye, ngaba PS: After reading your code, it looks independent from libalpm version (it using alpm_depcmp), so it seems to be perfect ;-)
Hm finally it doesn't seem easy (on the code side) to detect invalid duplicated requiredby entries (after your new update_trans_depends function). For example if pkg1 has pkg2 twice in its requiredby field, while pkg2 satisfy only one dependency of pkg1. We can't just detect for duplicate in requiredby fields anymore, because they might be valid (if pkg2 provided two deps of pkg1). I would need to compute requiredby fields fully for each pkg, and compare to the old one. By calling check_depends(pkg) on all package from the local database, I have enough information for computing all requiredby fields. But the problem is I can't store these. In this part : 139 if(deppkg && alpm_depcmp(deppkg, dep)) { 140 found = 1; I would need to do something like : alpm_list_t *newrqdby = alpm_pkg_get_newrequiredby(deppkg); newrqdby = alpm_list_add(newrqdby, pkgname). And when everything is done, compare old and new requiredby lists for each package. But I don't see any way to do that. (that would also make my check_requiredby() function useless). Or maybe I could achieve the same (in a less efficient way) with a more complicated check_requiredby function. But on the algorithm side, this check_requiredby function is stupid because everything is done by check_depends..
Hm finally it doesn't seem easy (on the code side) to detect invalid duplicated requiredby entries (after your new update_trans_depends function). For example if pkg1 has pkg2 twice in its requiredby field, while pkg2 satisfy only one dependency of pkg1. My patch hasn't been accepted yet ;-) By calling check_depends(pkg) on all package from the local database, I have enough information for computing all requiredby fields. But the problem is I can't store these. So you can use _alpm_trans_update_depends, then compare the result with the db;-) Bye, ngaba
---------------------------------------------------------------- This message was sent using IMP, the Internet Messaging Program.
On Wed, Jul 18, 2007 at 08:20:40PM +0200, ngaba@petra.hos.u-szeged.hu wrote:
Hm finally it doesn't seem easy (on the code side) to detect invalid duplicated requiredby entries (after your new update_trans_depends function). For example if pkg1 has pkg2 twice in its requiredby field, while pkg2 satisfy only one dependency of pkg1. My patch hasn't been accepted yet ;-)
I forgot to answer to this part. Yes indeed, it hasn't been accepted, so it's just in case :)
On Wed, Jul 18, 2007 at 05:23:50PM +0200, Xavier wrote:
By calling check_depends(pkg) on all package from the local database, I have enough information for computing all requiredby fields. But the problem is I can't store these.
In this part : 139 if(deppkg && alpm_depcmp(deppkg, dep)) { 140 found = 1;
I would need to do something like : alpm_list_t *newrqdby = alpm_pkg_get_newrequiredby(deppkg); newrqdby = alpm_list_add(newrqdby, pkgname).
And when everything is done, compare old and new requiredby lists for each package. But I don't see any way to do that. (that would also make my check_requiredby() function useless).
Or maybe I could achieve the same (in a less efficient way) with a more complicated check_requiredby function. But on the algorithm side, this check_requiredby function is stupid because everything is done by check_depends..
Well, here is the original testdb file I already posted, plus the above suggestion implemented (which allows more efficient code + the detection of wrongly duplicated requiredby). But that really sucks of course :( I hate C.. There is really no clean way to do what I want? I guess I could create a new pkg structure type only for that testdb file, for storing the new requiredby field, or having a second local pkgcache (duplicating every package and storing them in a list, and modifying the requiredy field there). But the code is already huge compared to the simple things it does. It would make it even more huge :p I think I'll start learning C# so I can hack on Maelstorm when its released :) Or maybe the best thing here would be to use libalpm bindings with an higher level language.
On Fri, Jul 20, 2007 at 11:09:55PM +0200, Xavier wrote:
But that really sucks of course :( I hate C.. There is really no clean way to do what I want? I guess I could create a new pkg structure type only for that testdb file, for storing the new requiredby field, or having a second local pkgcache (duplicating every package and storing them in a list, and modifying the requiredy field there). But the code is already huge compared to the simple things it does. It would make it even more huge :p
What I would have done in Java is probably using a HashMap. So I did the same in C, I took the implementation of hash tables from there : http://www.cl.cam.ac.uk/~cwc22/hashtable/ and used that instead of my previous newrequiredby hack for storing the new requiredby fields :) The problem is that the hashtable implementation is bigger than the testdb.c file itself.. But couldn't hash tables also be useful for the rest of the code? Having O(1) access to elements instead of the O(n) _alpm_pkg_find() for example could be interesting in some cases in my opinion.
On Sat, Jul 21, 2007 at 11:31:35AM +0200, Xavier wrote:
On Fri, Jul 20, 2007 at 11:09:55PM +0200, Xavier wrote:
But that really sucks of course :( I hate C.. There is really no clean way to do what I want? I guess I could create a new pkg structure type only for that testdb file, for storing the new requiredby field, or having a second local pkgcache (duplicating every package and storing them in a list, and modifying the requiredy field there). But the code is already huge compared to the simple things it does. It would make it even more huge :p
What I would have done in Java is probably using a HashMap. So I did the same in C, I took the implementation of hash tables from there : http://www.cl.cam.ac.uk/~cwc22/hashtable/ and used that instead of my previous newrequiredby hack for storing the new requiredby fields :) The problem is that the hashtable implementation is bigger than the testdb.c file itself..
But couldn't hash tables also be useful for the rest of the code? Having O(1) access to elements instead of the O(n) _alpm_pkg_find() for example could be interesting in some cases in my opinion.
Also found out there was a hash table implementation based on kernel list : http://isis.poly.edu/kulesh/stuff/src/ This does the job as well.
Well, it looks fine; however free(dep) is missing. ---------------------------------------------------------------- This message was sent using IMP, the Internet Messaging Program.
On Wed, Jul 18, 2007 at 03:15:53PM +0200, ngaba@petra.hos.u-szeged.hu wrote:
Well, it looks fine; however free(dep) is missing.
Indeed, I corrected that.
participants (2)
-
ngaba@petra.hos.u-szeged.hu
-
Xavier