[pacman-dev] backup handling ugly mess
Following the recent bash breakage, Pierre was wondering why the /etc/profile just stayed there on the filesystem : http://www.archlinux.org/pipermail/arch-dev-public/2007-December/003556.html I didn't understand this at all. As stonecrest noted earlier, when the "current" mirrorlist was removed from pacman package, that file totally disappeared during the upgrade, even though it was in the old backup array : http://www.archlinux.org/pipermail/pacman-dev/2007-September/009376.html This caused the following commit : http://projects.archlinux.org/git/?p=pacman.git;a=commitdiff;h=843d368ef60a7... The first part of this commit is supposed to temporarily add all old backup entries (besides the new) to the NoUpgrade array. Because of a bug, it didn't have any effect. The second part was an idea of me to handle the pacsave during an removeupgrade just like we already did for a remove, but I finally think this is a really stupid / misleading / confusing idea. This code shouldn't even be reached during a removeupgrade if the first part was fixed anyway. So that was just a little sum up, now description of the bug: alpm_pkg_get_backup(newpkg) : newpkg is the one that will be installed, so this only return a list of elements like "file" alpm_pkg_get_backup(oldpkg) : oldpkg is a locally installed package, so this return a list of elements like "file<tab>md5sum" Since the md5sum wasn't removed there, adding these files had 0 effect. Now, this can easily be fixed. But then all files in either old or new backup array that were in oldpkg but not in newpkg will just stay there on the filesystem. They won't be renamed to .pacsave . Why the bash case was different to the pacman one is that, even if /etc/profile was removed from the newpkg, it was still in the backup array of the newpkg (which is the only one pacman looks at). So the file wasn't removed and was just left on the filesystem. Even though the bash case is a packaging error, it's a real case that happened, so it might be worth adding it to the pactests.
On Tue, Dec 04, 2007 at 09:18:08PM +0100, Xavier wrote:
The first part of this commit is supposed to temporarily add all old backup entries (besides the new) to the NoUpgrade array. Because of a bug, it didn't have any effect.
The second part was an idea of me to handle the pacsave during an removeupgrade just like we already did for a remove, but I finally think this is a really stupid / misleading / confusing idea. This code shouldn't even be reached during a removeupgrade if the first part was fixed anyway.
Here is a patch which fix the first part, and reverts the second.
From e3aa6ba2d7315bda9330b0f00014416540681881 Mon Sep 17 00:00:00 2001 From: Chantry Xavier <shiningxc@gmail.com> Date: Wed, 5 Dec 2007 00:47:53 +0100 Subject: [PATCH] libalpm/add.c : bugfix in backup handling.
commit 843d368ef60a74719dfc74a27de3fe3ef441951f was supposed to use the backup array of the old package (besides the new one), but the backup line wasn't correctly parsed, so this change didn't have any effect. Now that this is fixed, the second part of the commit doesn't make much sense. In case of a REMOVEUPGRADE transaction, the backup files are in the NoUpgrade array, and thus the unlink_file function will exit early. The .pacsave handling in unlink_file only concerns REMOVE transaction. Note: this breaks upgrade024, upgrade025 and the newly added upgrade026 pactests, because .pacsave file are never created on an upgrade operation. Ref: http://www.archlinux.org/pipermail/pacman-dev/2007-December/010416.html Signed-off-by: Chantry Xavier <shiningxc@gmail.com> --- lib/libalpm/add.c | 4 ++-- lib/libalpm/remove.c | 20 ++++++++++++-------- pactest/tests/upgrade026.py | 17 +++++++++++++++++ 3 files changed, 31 insertions(+), 10 deletions(-) create mode 100644 pactest/tests/upgrade026.py diff --git a/lib/libalpm/add.c b/lib/libalpm/add.c index 24f2b51..8c01985 100644 --- a/lib/libalpm/add.c +++ b/lib/libalpm/add.c @@ -244,7 +244,7 @@ static int upgrade_remove(pmpkg_t *oldpkg, pmpkg_t *newpkg, pmtrans_t *trans, pm const char *backup = b->data; _alpm_log(PM_LOG_DEBUG, "adding %s to the NoUpgrade array temporarily\n", backup); - handle->noupgrade = alpm_list_add(handle->noupgrade, strdup(backup)); + handle->noupgrade = alpm_list_add(handle->noupgrade, _alpm_backup_file(backup)); } /* new package backup list */ for(b = alpm_pkg_get_backup(newpkg); b; b = b->next) { @@ -253,7 +253,7 @@ static int upgrade_remove(pmpkg_t *oldpkg, pmpkg_t *newpkg, pmtrans_t *trans, pm if(!alpm_list_find_ptr(handle->noupgrade, backup)) { _alpm_log(PM_LOG_DEBUG, "adding %s to the NoUpgrade array temporarily\n", backup); - handle->noupgrade = alpm_list_add(handle->noupgrade, strdup(backup)); + handle->noupgrade = alpm_list_add(handle->noupgrade, _alpm_backup_file(backup)); } } diff --git a/lib/libalpm/remove.c b/lib/libalpm/remove.c index 349ff10..5ab8001 100644 --- a/lib/libalpm/remove.c +++ b/lib/libalpm/remove.c @@ -227,14 +227,18 @@ static void unlink_file(pmpkg_t *info, alpm_list_t *lp, pmtrans_t *trans) return; } else if(needbackup) { /* if the file is flagged, back it up to .pacsave */ - if(!(trans->flags & PM_TRANS_FLAG_NOSAVE)) { - char newpath[PATH_MAX]; - snprintf(newpath, PATH_MAX, "%s.pacsave", file); - rename(file, newpath); - _alpm_log(PM_LOG_WARNING, _("%s saved as %s\n"), file, newpath); - return; - } else { - _alpm_log(PM_LOG_DEBUG, "transaction is set to NOSAVE, not backing up '%s'\n", file); + if(!(trans->type == PM_TRANS_TYPE_REMOVEUPGRADE)) { + /* if it was an upgrade, the file would be left alone because + * pacman_add() would handle it */ + if(!(trans->flags & PM_TRANS_FLAG_NOSAVE)) { + char newpath[PATH_MAX]; + snprintf(newpath, PATH_MAX, "%s.pacsave", file); + rename(file, newpath); + _alpm_log(PM_LOG_WARNING, _("%s saved as %s\n"), file, newpath); + return; + } else { + _alpm_log(PM_LOG_DEBUG, "transaction is set to NOSAVE, not backing up '%s'\n", file); + } } } _alpm_log(PM_LOG_DEBUG, "unlinking %s\n", file); diff --git a/pactest/tests/upgrade026.py b/pactest/tests/upgrade026.py new file mode 100644 index 0000000..8ad87e3 --- /dev/null +++ b/pactest/tests/upgrade026.py @@ -0,0 +1,17 @@ +self.description = "Upgrade a package, with a file leaving the pkg but staying in 'backup'" + +lp = pmpkg("dummy") +lp.files = ["etc/dummy.conf*"] +lp.backup = ["etc/dummy.conf"] +self.addpkg2db("local", lp) + +p = pmpkg("dummy", "1.0-2") +p.backup = ["etc/dummy.conf"] +self.addpkg(p) + +self.args = "-U %s" % p.filename() + +self.addrule("PKG_VERSION=dummy|1.0-2") +self.addrule("FILE_PACSAVE=etc/dummy.conf") +self.addrule("!FILE_PACNEW=etc/dummy.conf") +self.addrule("FILE_EXIST=etc/dummy.conf") -- 1.5.3.7
On Dec 5, 2007 12:46 AM, Xavier <shiningxc@gmail.com> wrote:
On Tue, Dec 04, 2007 at 09:18:08PM +0100, Xavier wrote:
The first part of this commit is supposed to temporarily add all old backup entries (besides the new) to the NoUpgrade array. Because of a bug, it didn't have any effect.
The second part was an idea of me to handle the pacsave during an removeupgrade just like we already did for a remove, but I finally think this is a really stupid / misleading / confusing idea. This code shouldn't even be reached during a removeupgrade if the first part was fixed anyway.
Here is a patch which fix the first part, and reverts the second.
OK, so I think I understand the problem here, and it makes sense. It seems like manipulating backup arrays and merging them is probably a bad idea. Let me get this straight- your patch doesn't actually solve these problems, just makes the existing code work as we thought it would? Let's start by laying out a set of rules that pacman should follow, then seeing if our pactests represent these cases. I'm only going to cover upgrades, because removals are straightforward (always create pacsave files of those in the backup array). 1. If the file is in the old backup array, and is then removed from the backup array AND package, we should move it to pacsave. (upgrade024?) 2. If the file is in the old backup array, and is then removed from the backup array BUT stays in the package, we have a gray area. We need to choose- either install the new file as pacnew and leave the old file (which would then be overwritten on the next upgrade because it is no longer in the backup array, so probably not the best solution), or do as upgrade025 does- move the existing file to pacsave and install the new file. 3. If the file is in the old backup array, and stays in the backup array BUT is removed from the package (bash), we should probably move (or should we copy and leave the original?) the file to pacsave. This is a gray area. Note that we haven't even looked at the new backup array yet. I think it may have been a bit naive to introduce that without much testing... 4. If the file exists in the old package BUT is not in the backup array, but is in the backup array of the new package, we should either install the new file as pacnew OR move existing file to pacsave and install new file in original location. Not sure here. Comments please, especially if I missed hard cases. Once we get this hammered out this needs to be documented somewhere. -Dan
On Wed, Dec 05, 2007 at 08:25:54PM -0600, Dan McGee wrote:
OK, so I think I understand the problem here, and it makes sense. It seems like manipulating backup arrays and merging them is probably a bad idea.
Let me get this straight- your patch doesn't actually solve these problems, just makes the existing code work as we thought it would?
Exactly.
Let's start by laying out a set of rules that pacman should follow, then seeing if our pactests represent these cases. I'm only going to cover upgrades, because removals are straightforward (always create pacsave files of those in the backup array). 1. If the file is in the old backup array, and is then removed from the backup array AND package, we should move it to pacsave. (upgrade024?)
ok.
2. If the file is in the old backup array, and is then removed from the backup array BUT stays in the package, we have a gray area. We need to choose- either install the new file as pacnew and leave the old file (which would then be overwritten on the next upgrade because it is no longer in the backup array, so probably not the best solution), or do as upgrade025 does- move the existing file to pacsave and install the new file.
yes, updrade025 looks alright.
3. If the file is in the old backup array, and stays in the backup array BUT is removed from the package (bash), we should probably move (or should we copy and leave the original?) the file to pacsave. This is a gray area.
This is a really odd case, but well, I would say just move it to pacsave, because otherwise, it just let a normal file on the filesystem, without a .pac* suffix, not owned by any packages. So when you install a package that adds this file again, it'll conflict. That is upgrade026.
Note that we haven't even looked at the new backup array yet. I think it may have been a bit naive to introduce that without much testing... 4. If the file exists in the old package BUT is not in the backup array, but is in the backup array of the new package, we should either install the new file as pacnew OR move existing file to pacsave and install new file in original location. Not sure here.
(upgrade023) I think this should be handled just like the common case : 5. If the file exists in both old package and new package, and both backup arrays, then we let the md5sum logic in add.c decide if a .pacnew needs to be extracted or not.
Comments please, especially if I missed hard cases. Once we get this hammered out this needs to be documented somewhere.
There are probably other weird cases, but if all the above are handled correctly, it should be good.
On Dec 6, 2007 1:56 AM, Xavier <shiningxc@gmail.com> wrote:
On Wed, Dec 05, 2007 at 08:25:54PM -0600, Dan McGee wrote:
OK, so I think I understand the problem here, and it makes sense. It seems like manipulating backup arrays and merging them is probably a bad idea.
Let me get this straight- your patch doesn't actually solve these problems, just makes the existing code work as we thought it would?
Exactly.
Let's start by laying out a set of rules that pacman should follow, then seeing if our pactests represent these cases. I'm only going to cover upgrades, because removals are straightforward (always create pacsave files of those in the backup array). 1. If the file is in the old backup array, and is then removed from the backup array AND package, we should move it to pacsave. (upgrade024?)
ok.
2. If the file is in the old backup array, and is then removed from the backup array BUT stays in the package, we have a gray area. We need to choose- either install the new file as pacnew and leave the old file (which would then be overwritten on the next upgrade because it is no longer in the backup array, so probably not the best solution), or do as upgrade025 does- move the existing file to pacsave and install the new file.
yes, updrade025 looks alright.
3. If the file is in the old backup array, and stays in the backup array BUT is removed from the package (bash), we should probably move (or should we copy and leave the original?) the file to pacsave. This is a gray area.
This is a really odd case, but well, I would say just move it to pacsave, because otherwise, it just let a normal file on the filesystem, without a .pac* suffix, not owned by any packages. So when you install a package that adds this file again, it'll conflict.
That is upgrade026.
Note that we haven't even looked at the new backup array yet. I think it may have been a bit naive to introduce that without much testing... 4. If the file exists in the old package BUT is not in the backup array, but is in the backup array of the new package, we should either install the new file as pacnew OR move existing file to pacsave and install new file in original location. Not sure here.
(upgrade023) I think this should be handled just like the common case :
5. If the file exists in both old package and new package, and both backup arrays, then we let the md5sum logic in add.c decide if a .pacnew needs to be extracted or not.
Comments please, especially if I missed hard cases. Once we get this hammered out this needs to be documented somewhere.
There are probably other weird cases, but if all the above are handled correctly, it should be good.
So given all this- do you want to go ahead and try to patch it so we get the expected behavior? I think your patch is a step in the right direction (although we may end up changing the way old & new backup arrays are combined because of some of the above). The biggest weirdness is our backup logic has to be split in two places- the remove and add code- and they can't really talk to each other easier unless we make much bigger changes. I'd prefer to have a small patch now and maybe overhaul this later once we get a release out the door. -Dan
Hi! First of all, as I see, you simply forgot about the fact that we have clear definition about modified ["important"] and not modified ["can be reinstalled easily, user did not work on this file at all, so there is nothing to lose"] backup files in our manual (using md5sum). This is the reason for most of your "grey" areas. [EDIT: I noticed Xavier's 5th point now.] I also prefer the configfile terminology here, because that is more suggestive.
On Dec 6, 2007 1:56 AM, Xavier <shiningxc@gmail.com> wrote:
On Wed, Dec 05, 2007 at 08:25:54PM -0600, Dan McGee wrote:
OK, so I think I understand the problem here, and it makes sense. It seems like manipulating backup arrays and merging them is probably a bad idea.
Let me get this straight- your patch doesn't actually solve these problems, just makes the existing code work as we thought it would?
Exactly.
Let's start by laying out a set of rules that pacman should follow, then seeing if our pactests represent these cases. I'm only going to cover upgrades, because removals are straightforward (always create pacsave files of those in the backup array).
I'm not sure. IMHO unmodified files can be removed.
1. If the file is in the old backup array, and is then removed from the backup array AND package, we should move it to pacsave. (upgrade024?)
This means that the new package renamed its config file, the solution also depends on whether the old file was modified (.pacsave) or not ("overwrite" is remove here).
ok.
2. If the file is in the old backup array, and is then removed from the backup array BUT stays in the package, we have a gray area. We need to choose- either install the new file as pacnew and leave the old file (which would then be overwritten on the next upgrade because it is no longer in the backup array, so probably not the best solution), or do as upgrade025 does- move the existing file to pacsave and install the new file.
This is probably a packaging error: the old package said that this is a config file, the new package says that this is not a config file... strange. Since even the simple removal of the file would induce a .pacsave, I vote a .pacsave here too, if the file was modified.
yes, updrade025 looks alright.
3. If the file is in the old backup array, and stays in the backup array BUT is removed from the package (bash), we should probably move (or should we copy and leave the original?) the file to pacsave. This is a gray area.
This is a really odd case, but well, I would say just move it to pacsave, because otherwise, it just let a normal file on the filesystem, without a .pac* suffix, not owned by any packages. So when you install a package that adds this file again, it'll conflict.
This is not necessarily an odd case. Imagine a package, where the config file is born after package install (by a graphical utility, or by post_update script...) <- we can confuse pacman easily here [we _must_ assume (and DEFINE) that the post_install script doesn't modify backup files, otherwise .pacnew would be useless] If the old configfile was modified (we can determine this from the old package), we can also keep that file or we can remove.
That is upgrade026.
Note that we haven't even looked at the new backup array yet. I think it may have been a bit naive to introduce that without much testing... 4. If the file exists in the old package BUT is not in the backup array, but is in the backup array of the new package, we should either install the new file as pacnew OR move existing file to pacsave and install new file in original location. Not sure here.
Imho we can simply remove the old file here.
(upgrade023) I think this should be handled just like the common case :
5. If the file exists in both old package and new package, and both backup arrays, then we let the md5sum logic in add.c decide if a .pacnew needs to be extracted or not.
Hmm. This is not implemented yet?;-) [see EDIT;-]
Comments please, especially if I missed hard cases. Once we get this hammered out this needs to be documented somewhere.
There are probably other weird cases, but if all the above are handled correctly, it should be good.
So given all this- do you want to go ahead and try to patch it so we get the expected behavior? I think your patch is a step in the right direction (although we may end up changing the way old & new backup arrays are combined because of some of the above). The biggest weirdness is our backup logic has to be split in two places- the remove and add code- and they can't really talk to each other easier unless we make much bigger changes. I'd prefer to have a small patch now and maybe overhaul this later once we get a release out the door.
-Dan
Poor Xavier... What about you and Aaron? Patches are welcome. Bye ---------------------------------------------------- SZTE Egyetemi Könyvtár - http://www.bibl.u-szeged.hu This mail sent through IMP: http://horde.org/imp/
3. If the file is in the old backup array, and stays in the backup array BUT is removed from the package (bash), we should probably move (or should we copy and leave the original?) the file to pacsave. This is a gray area.
This is a really odd case, but well, I would say just move it to
pacsave,
because otherwise, it just let a normal file on the filesystem, without a .pac* suffix, not owned by any packages. So when you install a package that adds this file again, it'll conflict.
This is not necessarily an odd case. Imagine a package, where the config file is born after package install (by a graphical utility, or by post_update script...) <- we can confuse pacman easily here [we _must_ assume (and DEFINE) that the post_install script doesn't modify backup files, otherwise .pacnew would be useless] If the old configfile was modified (we can determine this from the old package), we can also keep that file or we can remove.
Well, I was totally wrong here. Indeed, this is a pointless case (if I want to be consistant with my preferred 4. solution) Bye ---------------------------------------------------- SZTE Egyetemi Könyvtár - http://www.bibl.u-szeged.hu This mail sent through IMP: http://horde.org/imp/
On Thu, Dec 06, 2007 at 05:54:17PM +0100, Nagy Gabor wrote:
Hi!
First of all, as I see, you simply forgot about the fact that we have clear definition about modified ["important"] and not modified ["can be reinstalled easily, user did not work on this file at all, so there is nothing to lose"] backup files in our manual (using md5sum). This is the reason for most of your "grey" areas. [EDIT: I noticed Xavier's 5th point now.] I also prefer the configfile terminology here, because that is more suggestive.
The 5th point is just the common case, which is the only one that is really covered by pacman, and described in the manual. I only added it because I thought the 4th point could be handled just like the normal one. Dan only mentioned the unclear cases that need to be covered by pacman. What is wrong with that?
On Dec 6, 2007 1:56 AM, Xavier <shiningxc@gmail.com> wrote:
On Wed, Dec 05, 2007 at 08:25:54PM -0600, Dan McGee wrote:
OK, so I think I understand the problem here, and it makes sense. It seems like manipulating backup arrays and merging them is probably a bad idea.
Let me get this straight- your patch doesn't actually solve these problems, just makes the existing code work as we thought it would?
Exactly.
Let's start by laying out a set of rules that pacman should follow, then seeing if our pactests represent these cases. I'm only going to cover upgrades, because removals are straightforward (always create pacsave files of those in the backup array).
I'm not sure. IMHO unmodified files can be removed.
Ok right, pacsave renaming really needs to be done only for modified files, but this is an improvment that could be made later. What's more important in my opinion is to make sure that potential modifications are not lost, and that the file doesn't stay in the way. Moving to .pacsave takes care of both problems.
3. If the file is in the old backup array, and stays in the backup array BUT is removed from the package (bash), we should probably move (or should we copy and leave the original?) the file to pacsave. This is a gray area.
This is a really odd case, but well, I would say just move it to pacsave, because otherwise, it just let a normal file on the filesystem, without a .pac* suffix, not owned by any packages. So when you install a package that adds this file again, it'll conflict.
This is not necessarily an odd case. Imagine a package, where the config file is born after package install (by a graphical utility, or by post_update script...) <- we can confuse pacman easily here [we _must_ assume (and DEFINE) that the post_install script doesn't modify backup files, otherwise .pacnew would be useless]
I think that's an odd case which should be avoided.
4. If the file exists in the old package BUT is not in the backup array, but is in the backup array of the new package, we should either install the new file as pacnew OR move existing file to pacsave and install new file in original location. Not sure here.
Imho we can simply remove the old file here.
Did you really understand what proactive backup was about?
5. If the file exists in both old package and new package, and both backup arrays, then we let the md5sum logic in add.c decide if a .pacnew needs to be extracted or not.
Hmm. This is not implemented yet?;-) [see EDIT;-]
Not sure what you mean here. That case is already implemented correctly, I was just saying the 4th case could be handled the same way.
participants (3)
-
Dan McGee
-
Nagy Gabor
-
Xavier