Re: [arch-general] [arch-dev-public] LZMA - in or out? ([signoff] libarchive 2.6.0)
On Tue, May 12, 2009 at 08:03:43AM +0200, Pierre Schmitz wrote:
On Tuesday, 12 May 2009 02:06:31, Dan McGee wrote:
Yes, of course. I think we can take some time to let it bake, as there is not an immediate need, and when 5.0 comes out we can move it to core and then rebuild libarchive with support for both.
Sure, no need to hurry. Slackware achieved a decrease in size from 1.9GB to 1.4GB for their main repo. Our ratio should be lower as we don't have source packages in our repos. But it should be easy to test once the tools are ready. (du -h; gunzip *.gz; xz *.tar; du -h)
Just for the record, Slackware didn't move the sources to xz, just the compiled packages. The decrease in size you are referring to is more or less correct. My local rsynced tree is 1.6GB with .xz while it was around 2.0GB with .tgz. The gain should be 20-25% from what I can estimate. -- Greg
On Tuesday, 12 May 2009 08:37:58, Grigorios Bouzakis wrote:
On Tue, May 12, 2009 at 08:03:43AM +0200, Pierre Schmitz wrote:
On Tuesday, 12 May 2009 02:06:31, Dan McGee wrote:
Yes, of course. I think we can take some time to let it bake, as there is not an immediate need, and when 5.0 comes out we can move it to core and then rebuild libarchive with support for both.
Sure, no need to hurry. Slackware achieved a decrease in size from 1.9GB to 1.4GB for their main repo. Our ratio should be lower as we don't have source packages in our repos. But it should be easy to test once the tools are ready. (du -h; gunzip *.gz; xz *.tar; du -h)
Just for the record, Slackware didn't move the sources to xz, just the compiled packages. The decrease in size you are referring to is more or less correct. My local rsynced tree is 1.6GB with .xz while it was around 2.0GB with .tgz. The gain should be 20-25% from what I can estimate.
I am just doing a very simple test right now (default compression preset).

core (x86_64), with decompression time:
  none  552M
  gzip  186M  12s
  xz    121M  17s

I will add a test for extra later. Even though this might not be a really valid benchmark, it shows that it's definitely worth it. Most people will benefit from a smaller download size, which should also compensate for the slightly increased decompression time. (I don't think that a lot of people download 65MB within 5s.) Note: xz does not seem to use SMP, but I have read that this might be possible some day.

--
Pierre Schmitz
Clemens-August-Straße 76, 53115 Bonn
Telefon 0228 9716608, Mobil 0160 95269831
Jabber pierre@jabber.archlinux.de
WWW http://www.archlinux.de
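A comparison like the one above can be sketched as a small self-contained script. This uses a synthetic, highly compressible payload as a stand-in for a real package tree, so the ratios will differ from a real repo:

```shell
#!/bin/sh
# Rough gzip-vs-xz size comparison on synthetic data
# (a stand-in for a real package tree; sizes are illustrative only).
set -e
dir=$(mktemp -d)
# Create a compressible sample "package" payload (~1.4MB of repeated lines).
yes "sample package payload line" | head -n 50000 > "$dir/pkg.tar"
gzip -c "$dir/pkg.tar" > "$dir/pkg.tar.gz"
xz   -c "$dir/pkg.tar" > "$dir/pkg.tar.xz"   # default preset (-6)
gz=$(wc -c < "$dir/pkg.tar.gz")
xzs=$(wc -c < "$dir/pkg.tar.xz")
echo "gzip: $gz bytes, xz: $xzs bytes"
rm -rf "$dir"
```

On redundant data like this, xz comes out well ahead of gzip; on real packages the gap is smaller, closer to the 20-35% seen in the repo numbers quoted in this thread.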
Pierre Schmitz wrote:
I am just doing a very simple test right now (default compression preset).
core (x86_64), with decompression time: none 552M, gzip 186M (12s), xz 121M (17s)
I will add a test for extra later.
Even though this might not be a really valid benchmark, it shows that it's definitely worth it. Most people will benefit from a smaller download size, which should also compensate for the slightly increased decompression time. (I don't think that a lot of people download 65MB within 5s.)
Agreed. This is not even a hard transition: it should be no problem to have mixed gzip and lzma packages in the repos, so this will be a smooth transition (only new packages will be rebuilt, old ones will stay as they are). pacman doesn't care how a package is compressed, as long as libarchive supports it, so the user shouldn't even notice (we should only ensure that pacman and libarchive stay gzip for a while).

Does repo-add/dbscripts/devtools do anything gzip-specific? If so, it's probably easy to get rid of.

Anyone else in favor of moving to lzma? Related: an lzma-compressed kernel (supported with 2.6.30 and newer), maybe lzma-compressed squashfs on the live CDs (2.6.30 has lzma support; no idea if squashfs can use it already).
On Tuesday, 12 May 2009 12:02:22, Thomas Bächler wrote:
Anyone else in favor of moving to lzma? Related: an lzma-compressed kernel (supported with 2.6.30 and newer), maybe lzma-compressed squashfs on the live CDs (2.6.30 has lzma support; no idea if squashfs can use it already).
I think we could just put xz-utils into testing (replacing and providing lzma-utils) and rebuild libarchive with xz support. If it's stable enough, we could move it to core, and a new version of pacman could have xz enabled by default in makepkg.conf. New packages will then be compressed with xz; there is no need to rebuild the whole repo. As Thomas suggested, we just need to make sure that a gz version of pacman, libarchive, xz-utils and the db files remains available for some time.
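For reference, the makepkg.conf side of this would roughly be a one-line change (a sketch; it assumes a makepkg build that already understands the .xz extension, since makepkg picks the compressor from PKGEXT):

```shell
# /etc/makepkg.conf (excerpt) -- makepkg chooses the compressor
# based on the PKGEXT extension.
#PKGEXT='.pkg.tar.gz'   # old gzip default
PKGEXT='.pkg.tar.xz'    # new default once xz support is in place
```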
Thomas Bächler wrote:
Pierre Schmitz wrote:
I am just doing a very simple test right now (default compression preset).
core (x86_64), with decompression time: none 552M, gzip 186M (12s), xz 121M (17s)
I will add a test for extra later.
Even though this might not be a really valid benchmark, it shows that it's definitely worth it. Most people will benefit from a smaller download size, which should also compensate for the slightly increased decompression time. (I don't think that a lot of people download 65MB within 5s.)
Agreed. This is not even a hard transition: It should be no problem to have mixed gzip and lzma packages in the repos, so this will be a smooth transition (only new packages will be rebuilt, old ones will stay as they are). pacman doesn't care how it is compressed, as long as libarchive supports it, so the user shouldn't even notice (we should only ensure that pacman and libarchive stay gzip for a while).
Does repo-add/dbscripts/devtools do anything gzip-specific? If so, it's probably easy to get rid of.
Pacman needs no changes as far as I am aware. makepkg can only generate gzip and bzip2 packages, so someone has to patch that. The FTP clean-up script uses PKGEXT from makepkg.conf, so it would probably need an ugly fix to look at multiple extensions. db-update and db-move both use PKGEXT, so there will need to be dual db-scripts if pacman/libarchive/libfetch etc. need to be in a different compression. Overall, I'd prefer to spend my time getting pkg deltas working, which I think is the better bandwagon to jump on... Allan
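A script that currently compares against a single $PKGEXT could instead accept either extension. A minimal sketch of the idea (is_package and the file names here are hypothetical, not the real dbscripts code):

```shell
#!/bin/sh
# Accept packages with either the old gzip or the new xz extension,
# instead of matching a single hardcoded $PKGEXT value.
is_package() {
    case "$1" in
        *.pkg.tar.gz|*.pkg.tar.xz) return 0 ;;
        *) return 1 ;;
    esac
}

# Demonstrate on a few illustrative file names.
for f in foo-1.0-1-x86_64.pkg.tar.gz bar-2.0-1-x86_64.pkg.tar.xz notes.txt; do
    if is_package "$f"; then
        echo "package: $f"
    else
        echo "skipping: $f"
    fi
done
```

During a transition period both extensions would match, so old gzip packages and newly built xz packages can coexist in the same repo.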
On Tuesday, 12 May 2009 12:31:19, Allan McRae wrote:
Overall, I'd prefer to spend my time getting pkg deltas working which I think is the better bandwagon to jump on...
Sure, deltas are quite useful, too. OT: the recent approach is to automatically generate them within repo-add, right? So the uploader doesn't need to care about it. I still think it's worth it. Patching makepkg/devtools shouldn't be that hard, and we don't need to change it tomorrow. Also: if we have deltas we will need more disk space and traffic; using a better compression should help a little.
participants (4)
-
Allan McRae
-
Grigorios Bouzakis
-
Pierre Schmitz
-
Thomas Bächler