[pacman-dev] multiple extraction from package
The last commit reminded me of a little issue I wanted to bring up a few hours ago, thanks Dan :)
grep "\.INSTALL" lib/libalpm/*.c
lib/libalpm/add.c: } else if(strcmp(entryname, ".INSTALL") == 0) {
lib/libalpm/package.c: } else if(strcmp(entry_name, ".INSTALL") == 0) {
lib/libalpm/trans.c: _alpm_unpack(installfn,tmpdir, ".INSTALL");
lib/libalpm/trans.c: snprintf(scriptfn, PATH_MAX, "%s/.INSTALL", tmpdir);
The archive is read 3 times, in 3 different places. In particular, the .INSTALL file is checked for existence in package.c, to know whether the package has a scriptlet. In trans.c, the .INSTALL file is extracted to /tmp/ for running the preinstall scriptlet. And in add.c, the .INSTALL file is extracted to $dbpath/local, which is by the way causing some issues: http://www.archlinux.org/pipermail/pacman-dev/2007-July/008693.html
I wonder if it wouldn't be possible to read the archive in just one place, and extract files just once. Or well, I don't know what the best way is: extracting files to /tmp/ and then moving them around (like for backup files), or extracting them several times from the archive (like .INSTALL), or maybe keeping the special dot files from the archive in memory or something.
Well, it's not that important, but it's related to the issue in the link above, which I would like to fix, so I just wanted to know if everyone was happy with the current way pacman reads/extracts files from the archive.
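As a rough illustration of the "read the archive in just one place" idea, here is a minimal sketch, not libalpm code. It assumes the entry names are already available as an array (a stand-in for iterating over the real archive), and the helper scan_entries, the pkg_meta flags, and the .PKGINFO and .CHANGELOG names are hypothetical choices made for the example. One pass records which special dot files are present, so later stages would not need to reopen the package just to ask whether a scriptlet exists.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Bit flags recording which special dot files were seen in a package. */
enum pkg_meta {
    META_PKGINFO   = 1 << 0,  /* assumed name, not quoted in the thread */
    META_INSTALL   = 1 << 1,  /* the .INSTALL scriptlet discussed above */
    META_CHANGELOG = 1 << 2   /* assumed name, not quoted in the thread */
};

/* Hypothetical helper: visit every entry name exactly once and return a
 * bitmask of the metadata files present. The names array stands in for
 * iterating over a real archive. If payload_count is not NULL, it receives
 * the number of ordinary (non-metadata) entries. */
static unsigned scan_entries(const char **names, size_t n,
                             size_t *payload_count)
{
    unsigned seen = 0;
    size_t payload = 0;
    size_t i;

    assert(names != NULL || n == 0);

    for (i = 0; i < n; i++) {
        if (strcmp(names[i], ".PKGINFO") == 0) {
            seen |= META_PKGINFO;
        } else if (strcmp(names[i], ".INSTALL") == 0) {
            seen |= META_INSTALL;
        } else if (strcmp(names[i], ".CHANGELOG") == 0) {
            seen |= META_CHANGELOG;
        } else {
            payload++;
        }
    }
    if (payload_count != NULL) {
        *payload_count = payload;
    }
    return seen;
}
```

A caller would test the result with something like (seen & META_INSTALL) instead of scanning the archive again in package.c, trans.c and add.c.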
On 7/9/07, Xavier <shiningxc@gmail.com> wrote:
The last commit reminded me of a little issue I wanted to bring up a few hours ago, thanks Dan :)
grep "\.INSTALL" lib/libalpm/*.c
lib/libalpm/add.c: } else if(strcmp(entryname, ".INSTALL") == 0) {
lib/libalpm/package.c: } else if(strcmp(entry_name, ".INSTALL") == 0) {
lib/libalpm/trans.c: _alpm_unpack(installfn,tmpdir, ".INSTALL");
lib/libalpm/trans.c: snprintf(scriptfn, PATH_MAX, "%s/.INSTALL", tmpdir);
The archive is read 3 times, in 3 different places. In particular, the .INSTALL file is checked for existence in package.c, to know whether the package has a scriptlet. In trans.c, the .INSTALL file is extracted to /tmp/ for running the preinstall scriptlet. And in add.c, the .INSTALL file is extracted to $dbpath/local, which is by the way causing some issues: http://www.archlinux.org/pipermail/pacman-dev/2007-July/008693.html
I wonder if it wouldn't be possible to read the archive in just one place, and extract files just once. Or well, I don't know what the best way is: extracting files to /tmp/ and then moving them around (like for backup files), or extracting them several times from the archive (like .INSTALL), or maybe keeping the special dot files from the archive in memory or something.
Well, it's not that important, but it's related to the issue in the link above, which I would like to fix, so I just wanted to know if everyone was happy with the current way pacman reads/extracts files from the archive.
The biggest problem: this isn't exactly easy to clean up.
If we could knock this down to two times, that would be much better, that is for sure. However, I'm not sure how to do this easily without making a lot of changes to libalpm. Of course, these changes are probably needed big time.
-Dan
Dan McGee wrote:
The biggest problem: this isn't exactly easy to clean up.
If we could knock this down to two times, that would be much better, that is for sure. However, I'm not sure how to do this easily without making a lot of changes to libalpm. Of course, these changes are probably needed big time.
-Dan
I would leave this for post-3.1.
But back to the point, I think we need a temporary directory (mkdtemp(/tmp/XXXXXXX)) where we put install and changelog before we commit a package to the database. Looking at this from the point of view of a normal db, I would never enter the install file before I've actually created a valid entry for the package (the desc file).
Andrew
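The staging idea above could be sketched roughly as follows. This is only an illustration under stated assumptions: the helpers stage_dir_create, stage_write and stage_commit and the alpm_stage_ prefix are hypothetical names, not part of libalpm. Metadata files are written into a private mkdtemp() directory first and only moved to their final location once the database entry exists.

```c
#define _XOPEN_SOURCE 700  /* for mkdtemp() */
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

/* Create a private staging directory under base (for example "/tmp").
 * Writes the new path into out and returns 0, or returns -1 on error. */
static int stage_dir_create(const char *base, char *out, size_t outlen)
{
    int n;

    assert(base != NULL && out != NULL);
    n = snprintf(out, outlen, "%s/alpm_stage_XXXXXX", base);
    if (n < 0 || (size_t)n >= outlen) {
        return -1;
    }
    return mkdtemp(out) != NULL ? 0 : -1;
}

/* Write data to <dir>/<name>. Returns 0 on success, -1 on error. */
static int stage_write(const char *dir, const char *name, const char *data)
{
    char path[PATH_MAX];
    FILE *fp;
    int n = snprintf(path, sizeof(path), "%s/%s", dir, name);

    if (n < 0 || (size_t)n >= sizeof(path)) {
        return -1;
    }
    fp = fopen(path, "w");
    if (fp == NULL) {
        return -1;
    }
    if (fputs(data, fp) == EOF) {
        fclose(fp);
        return -1;
    }
    return fclose(fp) == 0 ? 0 : -1;
}

/* Move <dir>/<name> to <destdir>/<name>. Meant to be called only after the
 * package entry (the desc file) has been created. Note that rename() fails
 * with EXDEV when the two directories are on different filesystems, so a
 * real implementation would need a copy-and-delete fallback. */
static int stage_commit(const char *dir, const char *name, const char *destdir)
{
    char from[PATH_MAX];
    char to[PATH_MAX];
    int a = snprintf(from, sizeof(from), "%s/%s", dir, name);
    int b = snprintf(to, sizeof(to), "%s/%s", destdir, name);

    if (a < 0 || (size_t)a >= sizeof(from) ||
        b < 0 || (size_t)b >= sizeof(to)) {
        return -1;
    }
    return rename(from, to) == 0 ? 0 : -1;
}
```

The EXDEV caveat matters here because /tmp and the database path may well live on different filesystems.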
On Wed, Jul 11, 2007 at 09:53:49PM +0100, Andrew Fyfe wrote:
But back to the point, I think we need a temporary directory (mkdtemp(/tmp/XXXXXXX)) where we put install and changelog before we commit a package to the database. Looking at this from the point of view of a normal db, I would never enter the install file before I've actually created a valid entry for the package (the desc file).
Initially, I wanted to try avoiding extracting stuff to /tmp/, influenced by the following comment in package.c:
/* TODO there is no reason to make temp files to read
 * from a libarchive archive, it can be done by reading
 * directly from the archive
 * See: archive_read_data_into_buffer
 * requires changes 'parse_descfile' as well
 * */
But since these files are used again later, I think I would prefer what you just said, keeping files like install and changelog somewhere in /tmp/, and then moving them only when the entry is created. I agree totally :)
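For comparison, the in-memory route hinted at by that TODO would mean the desc parser works on a buffer rather than on a temp file. The following is a minimal sketch of that half only, assuming the metadata is a series of "key = value" lines already read into memory; the function meta_lookup and the buffer layout are assumptions made for illustration and say nothing about how parse_descfile is actually written.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: find "<key> = <value>" at the start of a line in an
 * in-memory buffer and copy the value into out as a NUL-terminated string.
 * Returns 0 if the key was found and the value fits, -1 otherwise. */
static int meta_lookup(const char *buf, size_t len, const char *key,
                       char *out, size_t outlen)
{
    size_t klen;
    const char *p;
    const char *end;

    assert(buf != NULL && key != NULL && out != NULL);
    klen = strlen(key);
    p = buf;
    end = buf + len;

    while (p < end) {
        const char *nl = memchr(p, '\n', (size_t)(end - p));
        const char *eol = nl != NULL ? nl : end;
        size_t linelen = (size_t)(eol - p);

        /* The line must be at least as long as the key plus " = ". */
        if (linelen >= klen + 3 &&
            memcmp(p, key, klen) == 0 &&
            memcmp(p + klen, " = ", 3) == 0) {
            size_t vlen = linelen - klen - 3;
            if (vlen >= outlen) {
                return -1;  /* value does not fit in out */
            }
            memcpy(out, p + klen + 3, vlen);
            out[vlen] = '\0';
            return 0;
        }
        p = nl != NULL ? nl + 1 : end;
    }
    return -1;
}
```

This avoids the temp file for parsing, but as noted above it does not help with files such as install and changelog that must end up on disk anyway.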
On 7/11/07, Xavier <shiningxc@gmail.com> wrote:
Initially, I wanted to try avoiding extracting stuff to /tmp/, influenced by the following comment in package.c:
/* TODO there is no reason to make temp files to read
 * from a libarchive archive, it can be done by reading
 * directly from the archive
 * See: archive_read_data_into_buffer
 * requires changes 'parse_descfile' as well
 * */
But since these files are used again later, I think I would prefer what you just said, keeping files like install and changelog somewhere in /tmp/, and then moving them only when the entry is created. I agree totally :)
The comment above is not taking into account that it's done multiple times. I think extracting to a temp dir might be subtly problematic, but may open up more opportunities for us later.
For instance, if we extract to /tmp/foo/ first, we get the benefit of not having to run entirely through the archive twice to check tarball integrity. In addition, it allows us to stay slightly more transactional in the sense that we can extract to /tmp, do all the fs-to-fs comparisons and then simply dump the temp extraction to the root dir.
Actually, this could (hah) clean up some of the directory-symlink issues too.
Just some food for thought.
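The fs-to-fs comparison step mentioned above might look something like this. It is a sketch under assumptions, not pacman's conflict logic: count_conflicts is a hypothetical helper, and it simply counts how many of the staged relative paths already exist under the target root before anything is moved into place.

```c
#define _XOPEN_SOURCE 700  /* for lstat() */
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <sys/stat.h>

/* Hypothetical helper: return how many of the n relative paths in rel
 * already exist under root, or -1 if a combined path is too long.
 * lstat() is used so that a symlink counts as existing even when its
 * target does not. */
static int count_conflicts(const char *root, const char **rel, size_t n)
{
    int conflicts = 0;
    size_t i;

    assert(root != NULL && (rel != NULL || n == 0));

    for (i = 0; i < n; i++) {
        char path[PATH_MAX];
        struct stat st;
        int len = snprintf(path, sizeof(path), "%s/%s", root, rel[i]);

        if (len < 0 || (size_t)len >= sizeof(path)) {
            return -1;
        }
        if (lstat(path, &st) == 0) {
            conflicts++;
        }
    }
    return conflicts;
}
```

A real check would also need to decide which existing paths are acceptable, for example directories shared between packages or files owned by the package being upgraded; the sketch deliberately leaves that out.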
Aaron Griffin wrote:
The comment above is not taking into account that it's done multiple times. I think extracting to a temp dir might be subtly problematic, but may open up more opportunities for us later.
For instance, if we extract to /tmp/foo/ first, we get the benefit of not having to run entirely through the archive twice to check tarball integrity. In addition, it allows us to stay slightly more transactional in the sense that we can extract to /tmp, do all the fs-to-fs comparisons and then simply dump the temp extraction to the root dir.
Actually, this could (hah) clean up some of the directory-symlink issues too.
Just some food for thought.
My original idea for the temp dir was to only use it for the metadata files. But thinking about it, extracting the whole package to a tmp dir means we wouldn't need to extend .FILELIST (pacman could generate a more detailed filelist at install time). If I remember correctly, this is how dpkg installs a package: it first extracts the package to a temp dir, checks conflicts, then copies the new files into place.
I've started writing up a detailed list/description of how pacman should install a package. I'll try to get it finished this weekend and post it here.
Andrew
participants (4)
- Aaron Griffin
- Andrew Fyfe
- Dan McGee
- Xavier