[pacman-dev] undocumented features, bugs and why pacman's speed cannot be tuned...
Hi! I have been looking into the source of pacman because I want to speed it up. I haven't finished yet, but I'd like to share some of my impressions:

1. There are some undocumented features: -Syy (force sync of remote repos); -A and -U can install from http:// and ftp:// URLs; -Sgg lists groups with their contents.

2. The main speed problem is that pacman always reads the repos. I thought that for sync repos this is not needed in most cases (an "ls" is enough), and when all the information is needed (e.g. when -Ss is invoked) we could use the ungzipped db. But of course, for the local repo we cannot use a db file. I am disappointed, because in almost all cases (-S, -A, -U) the whole local db must be cached. Why? Because of conflict checking and provides checking. (This problem can be solved, as I wrote in my previous e-mail, but it requires converting the existing local repo by adding some new symlinks to it.) The main problem is that when you install a new package, its requiredby field has to be filled in. You may say that this is needless, but it's not, because you may have broken dependencies with -d earlier. :-(

3. When two packages provide the same thing and are not mutual (do not conflict with each other), bugs can probably happen. As far as I can see from the source (someone should test this), if foo1 and foo2 both provide foo, and you install foo3, which needs foo, then requiredby foo3 is set for both foo1 and foo2. You then cannot remove either foo1 or foo2 until you remove foo3, even though foo3's foo dependency is satisfied as long as at least one of the foo providers is installed.

Bye, Nagy Gabor
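The removal problem in item 3 can be illustrated with a small sketch. This is not libalpm code: struct pkg and both functions are hypothetical, each package has at most one provides entry and one dependency, and versions are ignored. The first function models the behaviour described in the message (any provider listed in requiredby blocks removal); the second models the alternative of blocking removal only when no other installed package still satisfies the dependency.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified package record; NOT libalpm's package type. */
struct pkg {
	const char *name;
	const char *provides; /* one provided name, or NULL */
	const char *depends;  /* one dependency name, or NULL */
};

/* A package satisfies a dependency by its own name or by what it provides. */
static int satisfies(const struct pkg *p, const char *dep)
{
	assert(p != NULL && dep != NULL);
	if(strcmp(p->name, dep) == 0) {
		return 1;
	}
	return p->provides != NULL && strcmp(p->provides, dep) == 0;
}

/* Behaviour described in the message: every provider counts as "required by"
 * the dependent package, so removing any provider is refused. */
static int removable_by_requiredby(const struct pkg *target,
		const struct pkg *installed, size_t n)
{
	size_t i;
	for(i = 0; i < n; i++) {
		const struct pkg *q = &installed[i];
		if(q == target || q->depends == NULL) {
			continue;
		}
		if(satisfies(target, q->depends)) {
			return 0; /* target appears in q's dependency chain */
		}
	}
	return 1;
}

/* Alternative: refuse removal only if no other installed package would
 * still satisfy the dependency once target is gone. */
static int removable_if_still_satisfied(const struct pkg *target,
		const struct pkg *installed, size_t n)
{
	size_t i, j;
	for(i = 0; i < n; i++) {
		const struct pkg *q = &installed[i];
		int still_satisfied = 0;
		if(q == target || q->depends == NULL || !satisfies(target, q->depends)) {
			continue;
		}
		for(j = 0; j < n; j++) {
			const struct pkg *o = &installed[j];
			if(o != target && o != q && satisfies(o, q->depends)) {
				still_satisfied = 1;
				break;
			}
		}
		if(!still_satisfied) {
			return 0;
		}
	}
	return 1;
}
```

With foo1 and foo2 both providing foo and foo3 depending on foo, the first function refuses to remove foo1, while the second allows it as long as foo2 stays installed.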
On 2/24/07, Nagy Gabor <ngaba@petra.hos.u-szeged.hu> wrote:
Hi!
I have been looking into the source of pacman because I want to speed it up. I haven't finished yet, but I'd like to share some of my impressions: 1. There are some undocumented features: -Syy (force sync of remote repos); -A and -U can install from http:// and ftp:// URLs; -Sgg lists groups with their contents.
I added these options to the pacman manpage.
2. The main speed problem is that pacman always reads the repos. I thought that for sync repos this is not needed in most cases (an "ls" is enough), and when all the information is needed (e.g. when -Ss is invoked) we could use the ungzipped db. But of course, for the local repo we cannot use a db file. I am disappointed, because in almost all cases (-S, -A, -U) the whole local db must be cached. Why? Because of conflict checking and provides checking. (This problem can be solved, as I wrote in my previous e-mail, but it requires converting the existing local repo by adding some new symlinks to it.) The main problem is that when you install a new package, its requiredby field has to be filled in. You may say that this is needless, but it's not, because you may have broken dependencies with -d earlier. :-(

3. When two packages provide the same thing and are not mutual (do not conflict with each other), bugs can probably happen. As far as I can see from the source (someone should test this), if foo1 and foo2 both provide foo, and you install foo3, which needs foo, then requiredby foo3 is set for both foo1 and foo2. You then cannot remove either foo1 or foo2 until you remove foo3, even though foo3's foo dependency is satisfied as long as at least one of the foo providers is installed.
I think it is good to attempt to speed pacman up, but at the same time, how much of an issue is it once your db is loaded into cache? I have very few complaints about speed besides the time of that first run, and even that isn't that bad. I do run pacman-optimize as a cron job once a week, which may be a helpful suggestion for a lot of people. I guess it comes down to this- I don't want speedups to be a hack and have it fail on me- I think stability of your database is more important than speed. I've never broken dependencies with -d, so I really enjoy being able to look at the required-by field and have it tell me the right things. Getting rid of this would be a mistake. -Dan
Hi! Pacman's speed may be tunable after all! I think I have found solutions to all of my problems. As I wrote earlier, we should extend the local repo with additional information, so that it stays compatible with earlier pacman versions. To do this, pacman needs a "create additional information for the local repo" function, which is called automatically on the "first" use of a tuned pacman. In practice, the additional information is what gets used. But if you think the additional information is corrupt, you can recreate it by calling this function manually.

That was the safety option, but here is the key: we should store broken dependencies too! (So the "safety" function above can also be thought of as an integrity check, because we store broken dependencies too, and we could offer the user an option to fix them.) When must we store these broken dependencies? 1. The first time, of course. 2. Only when the -d switch is used!

So my last problem disappears. Say you want to install a new package. My problem was that to fill in the requiredby field, you had to check the dependencies of all local packages. But in the normal case no installed package depends on our newly installed package 'foo' (note: -A and -U are different here!). So you only have to check whether there are any broken dependencies (stored somehow, in local/.brokendeps for example) where the missing dependency is 'foo'. From those you can fill in the requiredby field of 'foo'. This is fast, because in most cases very few broken deps exist (on my machine probably 0). Bye, Nagy Gabor
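A minimal sketch of the lookup proposed above, assuming the broken dependencies have already been loaded into memory. The struct brokendep record and the function name are hypothetical; the message only suggests local/.brokendeps as one possible place to store the data and does not define a format. Versioned dependencies and provides are ignored here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record: package `pkgname` is installed although its
 * dependency `missing` is not (e.g. because -d was used). */
struct brokendep {
	const char *pkgname;
	const char *missing;
};

/* Collect the names of packages whose missing dependency is `newpkg`.
 * These are the entries for the requiredby field of the newly installed
 * package. Only the (usually tiny) broken-deps list is scanned, not the
 * dependencies of every installed package. Returns the number of names
 * written to `out`, at most `outcap`. */
static size_t requiredby_from_brokendeps(const char *newpkg,
		const struct brokendep *broken, size_t nbroken,
		const char **out, size_t outcap)
{
	size_t i, count = 0;
	assert(newpkg != NULL && out != NULL);
	assert(broken != NULL || nbroken == 0);
	for(i = 0; i < nbroken && count < outcap; i++) {
		if(strcmp(broken[i].missing, newpkg) == 0) {
			out[count++] = broken[i].pkgname;
		}
	}
	return count;
}
```

If the broken-deps list is empty, which the message expects to be the common case, the loop body never runs and the requiredby field of the new package stays empty.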
Hi! The function _alpm_depcheck is buggy: lines 246-255 are terrible... I think the whole provides concept is not really tested or thought through. I think packages that provide the same 'foo' should be mutual (I mean, they should conflict with each other, as I wrote earlier) by definition (this is the usual practice anyway). Also, if you use -S and a package needs a provided 'foo' package, it is not defined which provider package will be installed, etc.

Back to the current problem: during -U foo2, the requiredby field of the old foo2 is checked, to test whether the new version of foo2 satisfies these dependencies too. If you look into the source, you'll see that pacman checks whether each package found in requiredby "accepts" the new foo2 or not. However, the dependency can go through a provide. This should be handled in the lines I mentioned, but it isn't. That part is absolutely wrong! (To be precise: it checks whether foo2 is among the dependencies; if not, it checks whether the _last_ dependency is provided by another package; if yes, you get a dependency error ... <- I tried it :-S) Bye
Hi! Sorry for the "style" of my previous e-mail. The solution: let foo be the package being updated (we update it to foo2), and let r_1, r_2, r_3, ... be the packages in its requiredby field. We need to ensure, for all i: "if r_i depends on foo through 'dep', then foo2 satisfies the 'dep' dependency too". This can be done easily with alpm_depcmp. (We don't need the slow alpm_db_whatprovides at all.) Bye
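The rule above can be sketched as follows. This is an illustration under stated assumptions, not the libalpm implementation: the structs are hypothetical, depcmp_sketch is only a stand-in for alpm_depcmp, each package has at most one provides entry, versions are plain integers, and a provided name is assumed to satisfy only unversioned dependencies.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified types; NOT libalpm's. */
struct dep {
	const char *name;
	int min_version; /* 0 means "any version" */
};

struct pkg {
	const char *name;
	int version;
	const char *provides; /* one provided name, or NULL */
};

/* Stand-in for alpm_depcmp: does package p satisfy dependency d?
 * Assumption: a provided name satisfies only unversioned dependencies. */
static int depcmp_sketch(const struct pkg *p, const struct dep *d)
{
	assert(p != NULL && d != NULL);
	if(strcmp(p->name, d->name) == 0) {
		return d->min_version == 0 || p->version >= d->min_version;
	}
	return p->provides != NULL && d->min_version == 0
		&& strcmp(p->provides, d->name) == 0;
}

/* Check one requiredby package r_i, given its dependency list.
 * The upgrade breaks r_i if some dependency was satisfied by the old
 * package but is not satisfied by the new one. Dependencies the old
 * package never satisfied are irrelevant, so no "what provides" lookup
 * over the whole database is needed. */
static int upgrade_breaks(const struct pkg *oldpkg, const struct pkg *newpkg,
		const struct dep *deps, size_t ndeps)
{
	size_t i;
	for(i = 0; i < ndeps; i++) {
		if(depcmp_sketch(oldpkg, &deps[i]) && !depcmp_sketch(newpkg, &deps[i])) {
			return 1;
		}
	}
	return 0;
}
```

For example, if the old foo provides bar and the new foo no longer does, a package that depends on bar is reported as broken, while a package that depends on foo itself is not.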
On 2/25/07, Nagy Gabor <ngaba@petra.hos.u-szeged.hu> wrote:
The solution: let foo be the package being updated (we update it to foo2), and let r_1, r_2, r_3, ... be the packages in its requiredby field. We need to ensure, for all i: "if r_i depends on foo through 'dep', then foo2 satisfies the 'dep' dependency too". This can be done easily with alpm_depcmp. (We don't need the slow alpm_db_whatprovides at all.)
Not to be a nag or anything, but I've said this a lot recently. Everyone has ideas; good ideas, bad ideas, ideas in general. Your ideas are actually good. The problem is that I barely have enough time to implement all *my* ideas, so I know I don't have time to implement yours. While these are decent ideas, the best thing you can do is actually provide a patch to this list. Please read the "submitting-patches" file in the cvs root of pacman-lib. Thanks, Aaron
Hi! Here is my patch.

--- deps.c.bak	2007-02-21 09:34:36.000000000 +0100
+++ deps.c	2007-02-26 23:21:47.000000000 +0100
@@ -236,32 +236,17 @@ alpm_list_t *_alpm_checkdeps(pmtrans_t *
 					continue;
 				}
 				_alpm_db_read(db, p, INFRQ_DEPENDS);
-				for(k = p->depends; k && !found; k = k->next) {
-					/* find the dependency info in p->depends */
-					_alpm_splitdep(k->data, &depend);
-					if(!strcmp(depend.name, oldpkg->name)) {
-						found = 1;
-					}
-				}
-				if(found == 0) {
-					/* look for packages that list depend.name as a "provide" */
-					alpm_list_t *provides = _alpm_db_whatprovides(db, depend.name);
-					if(provides == NULL) {
-						/* not found */
-						continue;
-					}
-					/* we found an installed package that provides depend.name */
-					FREELISTPTR(provides);
-				}
-				if(!_alpm_depcmp(tp, &depend)) {
-					_alpm_log(PM_LOG_DEBUG, _("checkdeps: found %s as required by %s"),
-							depend.name, p->name);
-					miss = _alpm_depmiss_new(p->name, PM_DEP_TYPE_REQUIRED, depend.mod,
-							depend.name, depend.version);
-					if(!_alpm_depmiss_isin(miss, baddeps)) {
-						baddeps = alpm_list_add(baddeps, miss);
-					} else {
-						FREE(miss);
+				for(k = p->depends; k; k = k->next) {
+					/* we won't break the old dependencies */
+					_alpm_splitdep(k->data, &depend);
+					if(_alpm_depcmp(oldpkg, &depend) && !_alpm_depcmp(tp, &depend)) {
+						_alpm_log(PM_LOG_DEBUG, _("checkdeps: the updated '%s' wouldn't satisfy a dependency of '%s'"),
+								oldpkg->name, p->name);
+						miss = _alpm_depmiss_new(p->name, PM_DEP_TYPE_REQUIRED, depend.mod,
+								depend.name, depend.version);
+						if(!_alpm_depmiss_isin(miss, baddeps)) {
+							baddeps = alpm_list_add(baddeps, miss);
+						} else FREE(miss);
 					}
 				}
 			}
2007/2/26, Nagy Gabor <ngaba@petra.hos.u-szeged.hu>:
Hi! Here is my patch.
I guess it would be better to send the patch as an attachment.
--
Giovanni Scafora
Arch Linux Trusted User (voidnull)
http://www.archlinux.org
linuxmania@gmail.com
Sorry, I have attached the file. I cannot make my mail client send this file correctly as text :-(
On 2/25/07, Nagy Gabor <ngaba@petra.hos.u-szeged.hu> wrote:
Hi!
The function _alpm_depcheck is buggy: lines 246-255 are terrible... I think the whole provides concept is not really tested or thought through. I think packages that provide the same 'foo' should be mutual (I mean, they should conflict with each other, as I wrote earlier) by definition (this is the usual practice anyway). Also, if you use -S and a package needs a provided 'foo' package, it is not defined which provider package will be installed, etc.
Back to the current problem: during -U foo2, the requiredby field of the old foo2 is checked, to test whether the new version of foo2 satisfies these dependencies too. If you look into the source, you'll see that pacman checks whether each package found in requiredby "accepts" the new foo2 or not. However, the dependency can go through a provide. This should be handled in the lines I mentioned, but it isn't. That part is absolutely wrong! (To be precise: it checks whether foo2 is among the dependencies; if not, it checks whether the _last_ dependency is provided by another package; if yes, you get a dependency error ... <- I tried it :-S)
Hey Nagy, I committed some new tests for pactest tonight, and I think the one at pactest/tests/upgrade052.py is what you describe as the problem above. However, I'm not completely sure. Can you look it over and tell me if it is what you describe? If it is not, can you write a new one that does describe what you are trying to point out? In addition, please try to run it against your patch if you can. Currently both the code patched with your patch and the unpatched CVS version fail on this test.

To run tests (this is for Nagy and anyone else interested), do the following:

All tests:
    ./configure --disable-fakeroot
    make clean
    make check

One test:
    ./configure --disable-fakeroot
    make clean
    make
    python ./pactest/pactest.py -t /pactest/tests/<testname.py> -p ./src/pacman/pacman

-Dan
Hi! I attached my upgrade052.py file. Nagy Gabor
On 2/27/07, Nagy Gabor <ngaba@petra.hos.u-szeged.hu> wrote:
Hi! I attached my upgrade052.py file.
I committed this as upgrade055.py, with some small changes (description and added rules). Pactest currently passes this test with your patch applied, and fails without. Thanks for finding something. I am holding off on committing your patch because that function is quite ugly to begin with- we may want to do some more serious overhauling of it. Thanks for making the test case though, these are really helpful. And for anyone else out there that finds a bug, a test case using pactest to duplicate it is wonderful, as it makes the description of the problem a lot more concrete. -Dan
participants (4)
- Aaron Griffin
- Dan McGee
- Giovanni Scafora
- Nagy Gabor