[pacman-dev] lstat issue / fileconflict001 pactest
Following the discussion started in the comments of bug 7484, because I think it's more appropriate : http://bugs.archlinux.org/task/7484#comment20837 I recently discovered that a trailing / is very important when using lstat, for directory symlinks. In the following situation : symdir -> realdir/ lstat(symdir) stats the link itself, but lstat(symdir/) stats realdir. So lstat(symdir/) seems equivalent to stat(symdir). The archive extraction code in add.c that Dan rewrote didn't take this into consideration, so he added a new _alpm_lstat that removes the trailing / when there is one : http://projects.archlinux.org/git/?p=pacman.git;a=commit;h=b55abdce7aebb142c... A little typo in _alpm_lstat (path instead of newpath) made this change not active yet. As Dan noticed, fixing this makes fileconflict001 pass. The only purpose of this mail is explaining why. So in this pactest, we have the following situation : dir/realdir/ dir/symdir -> realdir Then pkg1 installs dir/realdir/file , and pkg2 installs dir/symdir/file . Since in this case, symdir points to realdir, pkg1 and pkg2 have a fileconflict on "file". But if fileconflict001 pass after the bugfix mentioned above, it's not because pacman detects there is a package<->package conflict between dir/realdir/file and dir/symdir/file. Instead, it detects a filesystem <-> package conflict between dir/symdir/ from pkg2 (which is a real directory there), and the dir/symdir symlink on the filesystem. So that's plain wrong. The fileconflict code currently does a lstat(dir/symdir/), which doesn't stat the symdir symlink, but the real directory realdir. So it treated dir/symdir/ as a directory, and ignored it : 336 /* stat the file - if it exists, do some checks / 337 if(_alpm_lstat(path, &buf) != 0) { 338 continue; 339 } 340 if(S_ISDIR(buf.st_mode)) { 341 _alpm_log(PM_LOG_DEBUG, "%s is a directory, not a conflict\n", path); Now, with the new behavior of _alpm_lstat, the code does a lstat(dir/symdir), which stats the symlink. And the following code will treat this as a conflict. The case of fileconflict001 is probably another situation easier to detect during the transaction (coupled with a rollback system), rather than beforehand. Otherwise, maybe the file conflict code should keep using the normal lstat, instead of _alpm_lstat which removes the trailing /.
Hi! To be honest, I don't like the rollback solution; the pre-transaction checking is much more elegant (for me). As I mentioned earlier my preferred fix to fileconflict001.py is: converting the filelists with the help of realpath(); so we get the "after-install" state of our filesystem. Note: realpath() is quite an "intelligent" function: realpath("/usr/var/blabla", ...) will result /var/blabla [blabla doesn't exist of course]. Some contras: 1. speed?! (needs testing; but during targets<->HDD check, all entries of filelist is alpm_lstat-ed, so this hopefully won't cause terrible slowdown neither, but I dunno.) 2. fileconflict002.py ?! (a possible solution: create temporarily the newly installed symlinks <- can we get this info from libarchive? then using realpath -- I'm suggesting a "1. create new symlinks first" (if possible), _then_ "2. extract tar.gz-s" ugly hack here) Bye, ngaba PS: One more thing: What about fileconflict resolution? (using query-fileowner)
On Sat, Nov 10, 2007 at 07:14:11PM +0100, Nagy Gabor wrote:
Hi!
To be honest, I don't like the rollback solution; the pre-transaction checking is much more elegant (for me). As I mentioned earlier my preferred fix to fileconflict001.py is: converting the filelists with the help of realpath(); so we get the "after-install" state of our filesystem. Note: realpath() is quite an "intelligent" function: realpath("/usr/var/blabla", ...) will result /var/blabla [blabla doesn't exist of course]. Some contras: 1. speed?! (needs testing; but during targets<->HDD check, all entries of filelist is alpm_lstat-ed, so this hopefully won't cause terrible slowdown neither, but I dunno.) 2. fileconflict002.py ?! (a possible solution: create temporarily the newly installed symlinks <- can we get this info from libarchive? then using realpath -- I'm suggesting a "1. create new symlinks first" (if possible), _then_ "2. extract tar.gz-s" ugly hack here)
I think your are contradicting yourself in this mail :) Sure, pre-transaction checking seems more elegant at first sight, but it's not practical. Either you try to keep it simple, but then it's not safe or correct. Or you try to check all possible cases, and then it's a nightmare. You said yourself solving fileconflict002 would require an ugly hack. It would be half pre-transaction checking / half rollback way. So it would be probably better to do fully the rollback way. Besides, you are only considering the case of file conflicts here. For removing files, it's the same mess. See bug 7652. You can also look at bugs 3564, 7692 and 8585. However, it's hard to tell how difficult it would be to go the rollback way before actually doing it. Probably a poor implementation of it would result in a worse situation than the current one. There are many corner cases where pacman currently breaks, but it still does a good job most of the times, so that might be good enough.
PS: One more thing: What about fileconflict resolution? (using query-fileowner)
What do you mean here?
I think your are contradicting yourself in this mail :) Sure, pre-transaction checking seems more elegant at first sight, but it's not practical. Either you try to keep it simple, but then it's not safe or correct. Or you try to check all possible cases, and then it's a nightmare. You said yourself solving fileconflict002 would require an ugly hack. It would be half pre-transaction checking / half rollback way. So it would be probably better to do fully the rollback way.
Besides, you are only considering the case of file conflicts here. For removing files, it's the same mess. See bug 7652. You can also look at bugs 3564, 7692 and 8585.
However, it's hard to tell how difficult it would be to go the rollback way before actually doing it. Probably a poor implementation of it would result in a worse situation than the current one. There are many corner cases where pacman currently breaks, but it still does a good job most of the times, so that might be good enough.
PS: One more thing: What about fileconflict resolution? (using query-fileowner)
What do you mean here? This function can be disabled by default: I mean, that targetlist<->hdd conflicts can be converted (in most cases) to target_x<->local_package_y conflicts (query_owner); and then
OK. My main problems with rollback: 1. this doesn't look easy to implement at all 2. in case of powerbreak (or any small hidden bugs) I can get MBs of "spam" on my HDD. But I must admit, that I cannot work out any better solution :-( these can be resolved like "dependency" conflicts. You will say, that this will be extremely slow. I'm saying: not necesserily: 1. ngaba@durci:~# time pacman -Qo `ls -d /usr/bin/* | head -n 1000` > /dev/null real 0m43.998s user 0m39.668s sys 0m0.247s 2. Obviously a radical speed-up can be done here: our fileconflict_query_owner would check first the "already found" conflicting localpkgs' filelists <- in real life examples this causes a _big_ speedup. 3. Usually targetpkg<->localpkg fileconflict is also a dependency conflict (resolved _before_ fileconflict). WOW, a bit off here: we just ran into an other reason for universal transaction [or rollback?;-)]!!: ------ self.description = "fileconflict stops the remove-upgrade transaction after removal" sp = pmpkg("pkg1") sp.replaces = ["pkg3"] sp.files = ["dir/file"] self.addpkg2db("sync", sp) lp1 = pmpkg("pkg3", "1.0-1") self.addpkg2db("local", lp1) lp2 = pmpkg("pkg2", "1.0-1") lp2.files = ["dir/file"] self.addpkg2db("local", lp2) self.args = "-Su" self.addrule("PACMAN_RETCODE=1") self.addrule("!PKG_EXIST=pkg1") self.addrule("PKG_EXIST=pkg2") self.addrule("PKG_EXIST=pkg3") self.addrule("PKG_VERSION=pkg2|1.0-1") self.addrule("PKG_VERSION=pkg3|1.0-1") ---------- Bye, ngaba
On Sat, Nov 10, 2007 at 09:53:59PM +0100, Nagy Gabor <ngaba@bibl.u-szeged.hu> wrote:
ngaba@durci:~# time pacman -Qo `ls -d /usr/bin/* | head -n 1000` > /dev/null real 0m43.998s user 0m39.668s sys 0m0.247s
did you free the kernel cache before doing this? or is this the 10th start? ;) - VMiklos
On Sat, Nov 10, 2007 at 09:53:59PM +0100, Nagy Gabor <ngaba@bibl.u-szeged.hu> wrote:
ngaba@durci:~# time pacman -Qo `ls -d /usr/bin/* | head -n 1000` > /dev/null real 0m43.998s user 0m39.668s sys 0m0.247s
did you free the kernel cache before doing this? or is this the 10th start? ;) In the imaginary fileconflict_query_owner example the pkgcache is
Sun, 11 Nov 2007 00:37:04 +0100 -n Miklos Vajna <vmiklos@frugalware.org> írta: probably loaded.
> OK. My main problems with rollback: > 1. this doesn't look easy to implement at all > 2. in case of powerbreak (or any small hidden bugs) I can get MBs of > "spam" on my HDD. I insert 3rd and 4th points here: 3. User does a pacman -Su, which will start extracting about ~1GB data. After 98% of the process pacman detects a fileconflict; and it rollbacks the transaction... User won't be happy. He repeat -Su, and he get an _other_ fileconflict error at about 99%. User is quite angry now ... a.) So, you should probably keep the current negligent fileconflict checking to collect (negligently) _all_ the fileconflicts. Which is not really nice (and slow). b.) Or you should continue(?!) the broken transaction to detect as many bugs as possibly. c.) "continue" option; grr 4. See the previous attached pactest file: correct rollback needs some kind of universal transaction: currently our transactions often create new pseudo transactions and this can mess things up... Bye, ngaba
On Sun, Nov 11, 2007 at 12:59:36AM +0100, Nagy Gabor <ngaba@bibl.u-szeged.hu> wrote:
3. User does a pacman -Su, which will start extracting about ~1GB data. After 98% of the process pacman detects a fileconflict; and it rollbacks the transaction... User won't be happy. He repeat -Su, and he get an _other_ fileconflict error at about 99%. User is quite angry now ...
no, i think the first goal would be to solve the following situation: - pacman -Su - replace util-linux with utils-linux-ng? yes - removing util-linux.. - checking for file conflicts.. bang if you have file conflicts then pacman will just exit and it will not re-install util-linux nor will install util-linux-ng on next -Su, so the next time the user does a reboot the system won't boot up due the missing /bin/mount ;) - VMiklos
On Sun, Nov 11, 2007 at 12:59:36AM +0100, Nagy Gabor <ngaba@bibl.u-szeged.hu> wrote:
3. User does a pacman -Su, which will start extracting about ~1GB data. After 98% of the process pacman detects a fileconflict; and it rollbacks the transaction... User won't be happy. He repeat -Su, Oops, I forgot to mention, that user certainly fix something before repeat. and he get an _other_ fileconflict error at about 99%. User is quite angry now ...
no, i think the first goal would be to solve the following situation:
- pacman -Su - replace util-linux with utils-linux-ng? yes - removing util-linux.. - checking for file conflicts.. bang
if you have file conflicts then pacman will just exit and it will not re-install util-linux nor will install util-linux-ng on next -Su, so the next time the user does a reboot the system won't boot up due the missing /bin/mount ;)
- VMiklos This is exactly my previously posted pactest file... But you gave a good reasoning for its importance now. Bye, ngaba
On Nov 10, 2007 12:42 PM, Xavier <shiningxc@gmail.com> wrote:
On Sat, Nov 10, 2007 at 07:14:11PM +0100, Nagy Gabor wrote:
Hi!
To be honest, I don't like the rollback solution; the pre-transaction checking is much more elegant (for me).
However, it's hard to tell how difficult it would be to go the rollback way before actually doing it. Probably a poor implementation of it would result in a worse situation than the current one.
This was my first instinct - we don't actually _have_ a "rollback solution" - we have talk of such a solution, and lots of speculation about how inelegant it is, with no actual implementation. I don't think either Dan or myself were thinking of getting rid of pre-checking for existing files. That's not complex. There's a difference between an expected error, and an unexpected one. Files existing - that's expected. The disk filling up WHILE installing (after the check is done), or hitting immutable files (ext FS only), these we can't account for. There will always be something that would cause an install to break, and codifying all cases is impossible. Recovering sanely, however, that is possible.
participants (4)
-
Aaron Griffin
-
Miklos Vajna
-
Nagy Gabor
-
Xavier