[pacman-dev] Huge cleanup of add.c and my attempt at fixing FS#7484

Dan McGee dpmcgee at gmail.com
Sun Jul 15 12:09:37 EDT 2007


On 7/15/07, Xavier <shiningxc at gmail.com> wrote:
> On Sat, Jul 14, 2007 at 02:54:50PM -0400, Dan McGee wrote:
> > Ok, I've been slaving away at this for a while and libalpm/add.c is
> > finally getting a bit more manageable. Showing a patch is worthless
> > because it is huge and is much harder to follow due to the refactoring
> > stuff. Instead I've attached a drop-in add.c that compiles cleanly
> > with no other changes needed. I'm looking for some feedback on this.
> >
> > Some highlights:
> > 1. All pactests still pass, so no regressions on that front.
> > 2. *Much* cleaner layout. add_commit has now been split across four
> > functions (upgrade_remove, extract_single_file, commit_single_pkg, and
> > _alpm_add_commit).
> > 3. No more super functions. The heaviest function
> > (extract_single_file) now weighs in at 350 lines, and a lot of that is
> > comments.
> > Comments/concerns/questions appreciated. I still want to cleanup the
> > huge if(needbackup) section a bit, that is still quite long but hard
> > to refactor.
> >
>
> Very nice job here, but yes, there is still that part to clean up.
> Maybe the suggestion from Aaron here could give some ideas :
> http://www.archlinux.org/pipermail/pacman-dev/2006-December/006286.html
>
> > 4. Most of what Andrew started to do in his 7484 patch has been
> > brought into this. However, I rewrote it a bit for clarity (at least I
> > thought so). We may need a bit more meat added to this section.
> >
>
> Yes I find it more clear that way, but there are still several things to take
> care of in my opinion.
>
> +       /* Check for file existence. This is one of the more crucial parts
> +        * to get 'right'. Here are the possibilities, with the filesystem
> +        * on the left and the package on the top:
> +        * (F=file, N=node, S=symlink, D=dir)
> +        *               |  F/N  |  S  |  D
> +        *  non-existent |   1   |  2  |  3
> +        *  F/N          |   4   |  5  |  6
> +        *  S            |   7   |  8  |  9
> +        *  D            |   10  |  11 |  12
> +        *
> +        *  1,2,3- just extract, no magic necessary. lstat will fail here.
> +        *  4,5,6,7,8,10- conflict checks should have caught this. overwrite/backup.
> +        *  9- follow the symlink, hopefully it is a directory
> +        *  11,12- skip extraction, dir already exists
> +        */
>
> About cases 6 and 10, I agree that conflict checks should catch this,
> but do they ? (they don't from my understanding of the code)

Then we should fix that in the conflict check code, probably. I
haven't looked at it enough.
By the way- is there a reason we even use the filelist provided with
the package? Any package can lie about it, and since we go though the
archive once anyway we might as well just use our own generated
filelist. We can also gather information about each of the files at
this time (dir, file, symlink, etc.).

> About cases 9 and 11, I think it depends on the nature of the symlink :
> * if the symlink points to a directory, then skip extraction
> * if not, then conflict

If you look at the case 9 code, then you'll see that this is what
happens, or at least tries to. I'm not sure how effective returning 1
is by extract_single_file, however it at least spits a warning/error
here.

For case 11, we don't care about the nature of the symlink- we want to
skip extracting the symlink. See the explanation in dpkg code for
details, which I have put below:
   *   = If we are trying to install a symlink we do nothing - ie,
   *     dpkg will never replace a directory tree with a symlink.  This
   *     is to avoid embarrassing effects such as replacing a directory
   *     tree with a link to a link to the original directory tree.

I think this is valid. Perhaps I should put a note like this in the code.

In addition, I just realized 10 needs a bit more code to get it to
work properly, because we would have to unlink an entire directory
tree. However, this is quite dangerous as well in that it could
interfere with other packages files. Perhaps we should never allow a
file to overwrite a directory, even in a force situation? Comments
welcome.

New code comments:
diff --git a/lib/libalpm/add.c b/lib/libalpm/add.c
index 2c1aa2c..9019c29 100644
--- a/lib/libalpm/add.c
+++ b/lib/libalpm/add.c
@@ -392,9 +392,14 @@ static int extract_single_file(struct archive *archive,
         *  D            |   10  |  11 |  12
         *
         *  1,2,3- just extract, no magic necessary. lstat will fail here.
-        *  4,5,6,7,8,10- conflict checks should have caught this.
overwrite/backup.
-        *  9- follow the symlink, hopefully it is a directory
-        *  11,12- skip extraction, dir already exists
+        *  4,5,6,7,8- conflict checks should have caught this. either overwrite
+        *      or backup the file.
+        *  9- follow the symlink, hopefully it is a directory, check it.
+        *  10- file replacing directory. a simple unlink probably
won't work here.
+        *      TODO handle this case, it is a *very* dangerous one.
+        *  11- don't extract symlink- a dir exists here. we don't want links to
+        *      links, etc.
+        *  12- skip extraction, dir already exists.
         */
        struct stat lsbuf;
        if(lstat(filename, &lsbuf) != 0) {

-Dan




More information about the pacman-dev mailing list