[arch-dev-public] LZMA - in or out? ([signoff] libarchive 2.6.0)
On Sat, Jan 3, 2009 at 6:00 AM, Andreas Radke <a.radke@arcor.de> wrote:
I'd like to keep unneeded packages out of core. I see no need to move lzma into core. We only support tar.gz for our repos. Whoever wants to use a different format can rebuild libarchive easily.
I also wonder if our new tar package now supports lzma and lzop compression (no tests so far). Both packages were not present when building the tar package in the chroot. But even when it's now possible to use these formats at runtime they should stay in extra until we may use them for our repos.
I don't want to go back and forth too much on this in the signoff thread, so I'll move it here. I was thinking much more in the context of archives in general rather than just pacman using libarchive. libarchive ships with bsdtar, which I have found to be quicker than GNU tar at extracting (so I use it nearly exclusively). It would be a shame to tell people "we don't support .tar.lzma for anything" - that seems rather arbitrary, doesn't it? I too think [core] should stay as slim as possible, but when that requires we compile our packages with a less than ideal set of features, I want to at least give it some thought. -Dan
On Sat, Jan 3, 2009 at 8:31 AM, Dan McGee <dpmcgee@gmail.com> wrote:
On Sat, Jan 3, 2009 at 6:00 AM, Andreas Radke <a.radke@arcor.de> wrote:
I'd like to keep unneeded packages out of core. I see no need to move lzma into core. We only support tar.gz for our repos. Whoever wants to use a different format can rebuild libarchive easily.
I also wonder if our new tar package now supports lzma and lzop compression (no tests so far). Both packages were not present when building the tar package in the chroot. But even when it's now possible to use these formats at runtime they should stay in extra until we may use them for our repos.
I don't want to go back and forth too much on this in the signoff thread, so I'll move it here.
I was thinking much more in the context of archives in general rather than just pacman using libarchive. libarchive ships with bsdtar, which I have found to be quicker than GNU tar at extracting (so I use it nearly exclusively). It would be a shame to tell people "we don't support .tar.lzma for anything"- that seems rather arbitrary, doesn't it?
I too think [core] should stay as slim as possible, but when that requires we compile our packages with a less than ideal set of features, I want to at least give it some thought.
Where did this go? Do we have any additional opinions regarding LZMA? Personally, I think LZMA is great, and the new licensing (LGPL?) opens a lot of doors. I can't wait to see squashfs-lzma pick up speed. I agree with Dan though - libarchive isn't just a dep of pacman. We also ship bsdtar, which I use quite often.
On Mon, 2009-01-12 at 15:15 -0600, Aaron Griffin wrote:
Where did this go?
Do we have any additional opinions regarding LZMA? Personally, I think LZMA is great, and the new licensing (LGPL?) opens a lot of doors. I can't wait to see squashfs-lzma pick up speed.
I agree with Dan though - libarchive isn't just a dep of pacman. We also ship bsdtar, which I use quite often.
I think having an LZMA-enabled version of libarchive/bsdtar would fix this bug also: http://bugs.archlinux.org/task/12712 At this moment, rpmextract can't extract RPM files that use LZMA compression. To solve the above bug partially, I'm thinking about rewriting file-roller to use bsdtar for extracting compressed cpio files. Without LZMA support in bsdtar, this will still fail for SuSE's RPM files though.
I may be wrong, but I think we should have lzma in [core]. We could use .pkg.tar.lzma with pacman in the future. BTW, lzma + delta would be great for those with slow connections, and it would save a lot of bandwidth for the Arch Linux servers. IMHO, saving bandwidth is more interesting than saving processing time. -- Hugo
On Mon, Jan 12, 2009 at 5:22 PM, Jan de Groot <jan@jgc.homeip.net> wrote:
On Mon, 2009-01-12 at 15:15 -0600, Aaron Griffin wrote:
Where did this go?
Do we have any additional opinions regarding LZMA? Personally, I think LZMA is great, and the new licensing (LGPL?) opens a lot of doors. I can't wait to see squashfs-lzma pick up speed.
I agree with Dan though - libarchive isn't just a dep of pacman. We also ship bsdtar, which I use quite often.
I think having an LZMA-enabled version of libarchive/bsdtar would fix this bug also: http://bugs.archlinux.org/task/12712
At this moment, rpmextract can't extract RPM files that use LZMA compression. To solve the above bug partially, I'm thinking about rewriting file-roller to use bsdtar for extracting compressed cpio files. Without LZMA support in bsdtar, this will still fail for SuSE's RPM files though.
These are all good reasons for lzma support in libarchive. No objections from me.
On Mon, Jan 12, 2009 at 7:42 PM, Eric Bélanger <snowmaniscool@gmail.com> wrote:
On Mon, Jan 12, 2009 at 5:22 PM, Jan de Groot <jan@jgc.homeip.net> wrote:
On Mon, 2009-01-12 at 15:15 -0600, Aaron Griffin wrote:
Where did this go?
Do we have any additional opinions regarding LZMA? Personally, I think LZMA is great, and the new licensing (LGPL?) opens a lot of doors. I can't wait to see squashfs-lzma pick up speed.
I agree with Dan though - libarchive isn't just a dep of pacman. We also ship bsdtar, which I use quite often.
I think having an LZMA-enabled version of libarchive/bsdtar would fix this bug also: http://bugs.archlinux.org/task/12712
At this moment, rpmextract can't extract RPM files that use LZMA compression. To solve the above bug partially, I'm thinking about rewriting file-roller to use bsdtar for extracting compressed cpio files. Without LZMA support in bsdtar, this will still fail for SuSE's RPM files though.
These are all good reasons for lzma support in libarchive. No objections from me.
Some non-benefits of LZMA for those that brought up compression of packages:

$ ./testzip.sh openoffice-base-3.0.0-4-x86_64.pkg.tar
gzip: zip openoffice-base-3.0.0-4-x86_64.pkg.tar
21.50user 0.19system 0:21.71elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+235minor)pagefaults 0swaps
gzip: unzip openoffice-base-3.0.0-4-x86_64.pkg.tar
2.85user 0.21system 0:03.07elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+628minor)pagefaults 0swaps
bzip2: zip openoffice-base-3.0.0-4-x86_64.pkg.tar
57.28user 0.27system 0:57.58elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+2025minor)pagefaults 0swaps
bzip2: unzip openoffice-base-3.0.0-4-x86_64.pkg.tar
17.50user 0.43system 0:17.95elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+1079minor)pagefaults 0swaps
lzma: zip openoffice-base-3.0.0-4-x86_64.pkg.tar
288.53user 0.42system 4:52.06elapsed 98%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+21388minor)pagefaults 0swaps
lzma: unzip openoffice-base-3.0.0-4-x86_64.pkg.tar
10.18user 0.36system 0:10.57elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+2642minor)pagefaults 0swaps

Yes, LZMA took nearly 5 minutes to compress this package, which gzip accomplished in 22 seconds.
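The testzip.sh script itself was not posted, so the following is only a rough sketch of a comparable measurement, written in Python with the standard-library gzip, bz2 and lzma modules rather than the command-line tools used above. The function name `benchmark` and the sample payload are made up for illustration, and the absolute numbers will not match the timings in this thread.

```python
import bz2
import gzip
import lzma
import time

# Codec name -> (compress, decompress), all from the Python standard library.
# FORMAT_ALONE selects the legacy .lzma container; lzma.decompress()
# auto-detects the container when reading it back.
CODECS = {
    "gzip": (gzip.compress, gzip.decompress),
    "bzip2": (bz2.compress, bz2.decompress),
    "lzma": (lambda d: lzma.compress(d, format=lzma.FORMAT_ALONE), lzma.decompress),
}


def benchmark(data: bytes) -> dict:
    """Time one compress/decompress round trip per codec on `data`."""
    results = {}
    for name, (compress, decompress) in CODECS.items():
        start = time.perf_counter()
        packed = compress(data)
        compress_s = time.perf_counter() - start

        start = time.perf_counter()
        restored = decompress(packed)
        decompress_s = time.perf_counter() - start

        if restored != data:
            raise ValueError(f"{name} round trip corrupted the data")
        results[name] = {
            "size": len(packed),
            "compress_s": compress_s,
            "decompress_s": decompress_s,
        }
    return results


if __name__ == "__main__":
    # Hypothetical sample payload; substitute the bytes of a real package tarball.
    sample = b"Arch Linux package payload\n" * 50_000
    for name, stats in benchmark(sample).items():
        print(f"{name:6s} {stats['size']:>9d} bytes  "
              f"zip {stats['compress_s']:.3f}s  unzip {stats['decompress_s']:.3f}s")
```

Repetitive sample data like the payload above compresses far better than a real package, so only the relative ordering on real tarballs is worth reading into.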
Yep, compression really takes a little longer, but IIRC decompression takes almost the same time as gzip. -- Hugo
On Tue, Jan 13, 2009 at 11:56 AM, Hugo Doria <hugodoria@gmail.com> wrote:
Yep,
compression really takes a little longer, but IIRC decompression takes almost the same time as gzip.
Also, if I am not mistaken, a given package is compressed only once, by the packager, but decompressed many, many times (by all users). But of course the compression time needs to stay reasonable enough for the packager, and the opinion of active packagers or packagers with huge packages is important. And even if the download time benefit is lower than the decompression time slowdown for many users with high bandwidth, we should still think about all the poor users with low bandwidth :) In any case, there is a bandwidth win on both sides (user / server), so that is always good.
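The download-versus-decompression trade-off above can be put into numbers with a very simple model: per-user time is archive size divided by bandwidth, plus unpack time. The sketch below is only an illustration of that model. The function name is invented, the decompression times are the gzip and lzma figures Dan posted, and the two archive sizes are hypothetical because no sizes were given in this thread.

```python
def break_even_bandwidth(size_fast, size_small, unpack_fast_s, unpack_small_s):
    """Return the bandwidth in bytes/s at which both formats cost a user the same time.

    Per-user time is modelled as size / bandwidth + unpack time. Below the
    returned bandwidth the smaller archive wins; above it the faster one wins.
    """
    saved_bytes = size_fast - size_small
    extra_seconds = unpack_small_s - unpack_fast_s
    if saved_bytes <= 0:
        return 0.0           # nothing saved on the wire: the smaller format never wins
    if extra_seconds <= 0:
        return float("inf")  # smaller and no slower to unpack: it always wins
    return saved_bytes / extra_seconds


if __name__ == "__main__":
    # Hypothetical archive sizes in bytes (not taken from the thread).
    gzip_size, lzma_size = 150_000_000, 110_000_000
    # Elapsed decompression times from the benchmark posted earlier.
    gzip_unpack_s, lzma_unpack_s = 3.07, 10.57
    limit = break_even_bandwidth(gzip_size, lzma_size, gzip_unpack_s, lzma_unpack_s)
    print(f"lzma is faster overall below about {limit / 1e6:.1f} MB/s")
```

The model ignores mirror load and disk speed, so it only captures the user-side part of the argument.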
On Tue, 2009-01-13 at 16:45 +0100, Xavier wrote:
Also, if I am not mistaken, a given package is compressed only once, by the packager, but decompressed many, many times (by all users). But of course the compression time needs to stay reasonable enough for the packager, and the opinion of active packagers or packagers with huge packages is important. And even if the download time benefit is lower than the decompression time slowdown for many users with high bandwidth, we should still think about all the poor users with low bandwidth :) In any case, there is a bandwidth win on both sides (user / server), so that is always good.
Looking at OpenOffice.org, I think the gain in upload time is much larger than the loss of compression time.
On Tue, 13 Jan 2009 18:48:11 +0100, Jan de Groot <jan@jgc.homeip.net> wrote:
On Tue, 2009-01-13 at 16:45 +0100, Xavier wrote:
Also, if I am not mistaken, a given package is compressed only once, by the packager, but decompressed many, many times (by all users). But of course the compression time needs to stay reasonable enough for the packager, and the opinion of active packagers or packagers with huge packages is important. And even if the download time benefit is lower than the decompression time slowdown for many users with high bandwidth, we should still think about all the poor users with low bandwidth :) In any case, there is a bandwidth win on both sides (user / server), so that is always good.
Looking at OpenOffice.org, I think the gain in upload time is much larger than the loss of compression time.
1M upload here... so I doubt that a bit, but I don't much care which compression we end up using. -Andy
I am sorry for resurrecting this thread, but I wish to know what was decided about lzma. I still think that lzma could be of good use to us. BTW, Slackware will use lzma compression: ftp://ftp.slackware.com/pub/slackware/slackware-current/ChangeLog.txt -- Hugo
On Monday, 11 May 2009 20:58:50, Hugo Doria wrote:
I am sorry for resurrecting this thread, but I wish to know what was decided about lzma. I still think that lzma could be of good use to us.
BTW, Slackware will use the lzma compression: ftp://ftp.slackware.com/pub/slackware/slackware-current/ChangeLog.txt
-- Hugo
I think we just need to wait until xz is stable and there is a libarchive version supporting it. According to http://tukaani.org/xz/ the file format itself seems stable but the tools not(?). -- Pierre Schmitz Clemens-August-Straße 76 53115 Bonn Telefon 0228 9716608 Mobil 0160 95269831 Jabber pierre@jabber.archlinux.de WWW http://www.archlinux.de
On Mon, May 11, 2009 at 6:19 PM, Pierre Schmitz <pierre@archlinux.de> wrote:
On Monday, 11 May 2009 20:58:50, Hugo Doria wrote:
I am sorry for resurrecting this thread, but I wish to know what was decided about lzma. I still think that lzma could be of good use to us.
BTW, Slackware will use the lzma compression: ftp://ftp.slackware.com/pub/slackware/slackware-current/ChangeLog.txt
-- Hugo
I think we just need to wait until xz is stable and there is a libarchive version supporting it. According to http://tukaani.org/xz/ the file format itself seems stable but the tools not(?).
From the libarchive 2.7.0 release notes (whoops, this is still sitting in testing but is not currently built with lzma/xz support):
* First-class support for lzma and xz reading and writing, using the newly-released liblzma libraries. For libarchive to support lzma/xz natively, we would need liblzma in core as well. Thoughts? -Dan
On Tuesday, 12 May 2009 01:23:16, Dan McGee wrote:
For libarchive to support lzma/xz natively, we would need liblzma in core as well. Thoughts?
Sure, it's only a few KB. Even the complete lzma-utils package would be only 75KB. Anyway: would it be better to use its successor, xz, instead? Or is the file format somehow compatible? However, I think it's a good idea to build libarchive with support for lzma/xz. Other tools (afaik KDE 4.3) will then make use of this, too. -- Pierre Schmitz Clemens-August-Straße 76 53115 Bonn Telefon 0228 9716608 Mobil 0160 95269831 Jabber pierre@jabber.archlinux.de WWW http://www.archlinux.de
On Mon, May 11, 2009 at 6:33 PM, Pierre Schmitz <pierre@archlinux.de> wrote:
On Tuesday, 12 May 2009 01:23:16, Dan McGee wrote:
For libarchive to support lzma/xz natively, we would need liblzma in core as well. Thoughts?
Sure, it's only a few KB. Even the complete lzma-utils package would be only 75KB.
Anyway: would it be better to use its successor, xz, instead? Or is the file format somehow compatible?
You snipped out what the libarchive release notes said- liblzma supports xz AND lzma, apparently. It is all here: http://tukaani.org/xz/ -Dan
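As a small illustration of one library handling both containers: Python's standard lzma module is a binding to liblzma, and it can write the legacy .lzma container (FORMAT_ALONE) as well as .xz (FORMAT_XZ), and auto-detect either when decompressing. This sketch says nothing about how libarchive itself calls liblzma. The helper names `make_blobs` and `container_format` are made up for the example.

```python
import lzma

# The six magic bytes that open every .xz stream.
XZ_MAGIC = bytes([0xFD, 0x37, 0x7A, 0x58, 0x5A, 0x00])


def make_blobs(payload: bytes) -> dict:
    """Compress `payload` into both containers that liblzma understands."""
    return {
        "xz": lzma.compress(payload, format=lzma.FORMAT_XZ),
        "lzma": lzma.compress(payload, format=lzma.FORMAT_ALONE),
    }


def container_format(blob: bytes) -> str:
    """Report 'xz' when the xz magic is present, otherwise 'not xz'.

    The legacy .lzma container has no magic number, so it cannot be
    positively identified from its first bytes alone.
    """
    return "xz" if blob.startswith(XZ_MAGIC) else "not xz"


if __name__ == "__main__":
    payload = b"example data " * 1000
    for name, blob in make_blobs(payload).items():
        # lzma.decompress() defaults to FORMAT_AUTO and accepts both containers.
        ok = lzma.decompress(blob) == payload
        print(f"{name}: {len(blob)} bytes, detected as {container_format(blob)}, round trip ok: {ok}")
```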
On Tuesday, 12 May 2009 01:45:47, Dan McGee wrote:
You snipped out what the libarchive release notes said- liblzma supports xz AND lzma, apparently.
It is all here: http://tukaani.org/xz/
Yes, but if I get it right you need http://tukaani.org/xz/xz-4.999.8beta.tar.gz which should support lzma AND xz format. The lzma-utils in extra will only support lzma. -- Pierre Schmitz Clemens-August-Straße 76 53115 Bonn Telefon 0228 9716608 Mobil 0160 95269831 Jabber pierre@jabber.archlinux.de WWW http://www.archlinux.de
On Mon, May 11, 2009 at 6:57 PM, Pierre Schmitz <pierre@archlinux.de> wrote:
On Tuesday, 12 May 2009 01:45:47, Dan McGee wrote:
You snipped out what the libarchive release notes said- liblzma supports xz AND lzma, apparently.
It is all here: http://tukaani.org/xz/
Yes, but if I get it right you need http://tukaani.org/xz/xz-4.999.8beta.tar.gz which should support lzma AND xz format. The lzma-utils in extra will only support lzma.
Yes, of course. I think we can take some time to let it bake, as there is not an immediate need, and when 5.0 comes out we can move it to core and then rebuild libarchive with support for both. I had done some tests a while back with lzma compression (as it relates to bzip2 and gzip), and the results were not as good as I had hoped. Although the files were smaller, the savings were not nearly as significant as the increased compression time and slightly increased decompression time. However, lzma/xz will definitely clock in better than bzip2 when it comes to speed. -Dan
On Tuesday, 12 May 2009 02:06:31, Dan McGee wrote:
Yes, of course. I think we can take some time to let it bake, as there is not an immediate need, and when 5.0 comes out we can move it to core and then rebuild libarchive with support for both.
Sure, no need to hurry. Slackware has achieved a decrease of size from 1,9GB to 1,4GB of their main repo. Our ratio should be lower as we don't have source packages in our repos. But it should be easy to test once the tools are ready. (du -h;gunzip *.gz;xz *.tar;du -h) -- Pierre Schmitz Clemens-August-Straße 76 53115 Bonn Telefon 0228 9716608 Mobil 0160 95269831 Jabber pierre@jabber.archlinux.de WWW http://www.archlinux.de
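A non-destructive variant of the du/gunzip/xz test suggested above could look roughly like the sketch below. It is an assumption-laden illustration, not a tested tool: it uses Python's gzip and lzma modules instead of the gunzip and xz binaries, recompresses everything in memory (so huge packages would need streaming instead), and the name `recompression_totals` is invented here.

```python
import gzip
import lzma
import tempfile
from pathlib import Path


def recompression_totals(directory):
    """Return (gzip_bytes, xz_bytes) summed over every *.gz file in `directory`.

    Each file is decompressed and recompressed as xz in memory only;
    nothing on disk is modified.
    """
    gzip_bytes = 0
    xz_bytes = 0
    for path in sorted(Path(directory).glob("*.gz")):
        packed = path.read_bytes()
        gzip_bytes += len(packed)
        xz_bytes += len(lzma.compress(gzip.decompress(packed)))
    return gzip_bytes, xz_bytes


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical stand-in for a real package file.
        demo = Path(tmp) / "demo.pkg.tar.gz"
        demo.write_bytes(gzip.compress(b"demo file contents\n" * 20_000))
        before, after = recompression_totals(tmp)
        print(f"gzip total: {before} bytes, xz total: {after} bytes")
```

Pointing the function at a directory of real packages, rather than the demo file, is what would give a ratio comparable to the Slackware figure.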
Pierre Schmitz wrote:
On Tuesday, 12 May 2009 01:45:47, Dan McGee wrote:
You snipped out what the libarchive release notes said- liblzma supports xz AND lzma, apparently.
It is all here: http://tukaani.org/xz/
Yes, but if I get it right you need http://tukaani.org/xz/xz-4.999.8beta.tar.gz which should support lzma AND xz format. The lzma-utils in extra will only support lzma.
FYI my PKGBUILDs for xz-utils are in AUR: http://aur.archlinux.org/packages.php?ID=25140 (beta version) http://aur.archlinux.org/packages.php?ID=25141 (git version) Currently lzma-utils is a makedepends of all my texlive-* packages, which is why I brought lzma-utils to extra. However, starting with the 2009 release of TeXLive, they will depend on xz-utils instead. F
Pierre Schmitz wrote:
On Tuesday, 12 May 2009 01:23:16, Dan McGee wrote:
For libarchive to support lzma/xz natively, we would need liblzma in core as well. Thoughts?
Sure, it's only a few KB. Even the complete lzma-utils package would be only 75KB.
Anyway: would it be better to use its successor, xz, instead? Or is the file format somehow compatible?
However, I think it's a good idea to build libarchive with support for lzma/xz. Other tools (afaik KDE 4.3) will then make use of this, too.
Does lzma-utils need to be a dep or is an opt-dep fine? That would make the whole discussion moot... Anyway, I am fine with bringing this to [core]. It would be bad to disable features just because of the arbitrary boundary between [core] and [extra]. Allan
On Mon, May 11, 2009 at 7:09 PM, Allan McRae <allan@archlinux.org> wrote:
Pierre Schmitz wrote:
On Tuesday, 12 May 2009 01:23:16, Dan McGee wrote:
For libarchive to support lzma/xz natively, we would need liblzma in core as well. Thoughts?
Sure, it's only a few KB. Even the complete lzma-utils package would be only 75KB.
Anyway: would it be better to use its successor, xz, instead? Or is the file format somehow compatible?
However, I think it's a good idea to build libarchive with support for lzma/xz. Other tools (afaik KDE 4.3) will then make use of this, too.
Does lzma-utils need to be a dep or is an opt-dep fine? That would make the whole discussion moot... Anyway, I am fine with bringing this to [core]. It would be bad to disable features just because of the arbitrary boundary between [core] and [extra].
If we do it right, it can't be an optdep as libarchive will have a linking dep on liblzma, just as it currently does on gzip and bzip2. Unless there is some lazy dynamic loading going on (not really sure on this one), I would feel safer bringing it into core. -Dan
On Mon, 12 Jan 2009 19:59:33 -0600, "Dan McGee" <dpmcgee@gmail.com> wrote:
Yes, LZMA took nearly 5 minutes to compress this package which gzip accomplished in 22 seconds.
I guess it doesn't support parallel threaded compression. I couldn't find any good information about smp support. If only pbzip would integrate pipe support.... I'm fine with moving lzma into core. But then we should also think about using the best current compression format for our repos. -Andy
I'll throw in my (non-dev) hat and say that I like this change, for what it's worth. In my eyes, the costs are very low and the benefits are real. --Daenyth
participants (11)
- Aaron Griffin
- Allan McRae
- Andreas Radke
- Daenyth Blank
- Dan McGee
- Eric Bélanger
- Firmicus
- Hugo Doria
- Jan de Groot
- Pierre Schmitz
- Xavier