[arch-general] lzma compression tests
FWIW, possibly worth considering adding lzma to pacman...
http://www.kdedevelopers.org/node/3326
--markc
On Tue, 2008-03-25 at 05:24 +1000, Mark Constable wrote:
FWIW, possibly worth considering adding lzma to pacman...
http://www.kdedevelopers.org/node/3326
--markc
Which will only happen if libarchive gets LZMA support.
On Mon, Mar 24, 2008 at 2:24 PM, Mark Constable <markc@renta.net> wrote:
FWIW, possibly worth considering adding lzma to pacman...
http://www.kdedevelopers.org/node/3326
--markc
Interesting. You may want to take this to the pacman-dev ML. A good "real-world" look might be grabbing the openoffice package, uncompressing it to a folder, and then running the following tests:
bsdtar czf package (gzip)
bsdtar cjf package (bzip2)
<lzma creation of package>
and then the similar unzip costs, along with the size of the original folder structure (unzipped package) vs. their zipped counterparts.
-Dan
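The test procedure outlined above could be scripted roughly as follows. This is only a sketch: the function name bench, the SRC variable, the oo output prefix, and the lzma level -2 are assumptions chosen for illustration, and timings are taken with date +%s, so they are whole wall-clock seconds rather than the finer output of time. Compressors that are not installed are skipped.

```shell
#!/bin/sh
# Sketch only: bench() and its arguments are hypothetical names, not part of pacman.
# Usage: bench SRC_DIR OUT_PREFIX
bench() {
    src=$1
    out=$2
    # Each entry is "extension:compressor command"; the lzma level is an assumption.
    for spec in "gz:gzip -c" "bz2:bzip2 -c" "lzma:lzma -2 -c"; do
        ext=${spec%%:*}
        cmd=${spec#*:}
        tool=${cmd%% *}
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$ext: $tool not installed, skipped"
            continue
        fi
        archive=$out.tar.$ext

        # Compression: stream the tarball through the compressor.
        start=$(date +%s)
        tar -cf - -C "$(dirname "$src")" "$(basename "$src")" | $cmd > "$archive"
        mid=$(date +%s)

        # Decompression: unpack into a scratch directory, then discard it.
        scratch=$(mktemp -d)
        $tool -dc "$archive" | tar -xf - -C "$scratch"
        end=$(date +%s)
        rm -rf "$scratch"

        size=$(( $(wc -c < "$archive") ))
        echo "$ext: compress $((mid - start))s, extract $((end - mid))s, $size bytes"
    done
    # Size of the uncompressed tree in KiB, for comparison.
    du -sk "$src"
}

# Run only if the (hypothetical) unpacked package directory exists.
if [ -d "${SRC:-oo}" ]; then
    bench "${SRC:-oo}" oo
fi
```

Streaming tar output through each compressor keeps the three cases comparable, at the cost of not exercising bsdtar's built-in z and j code paths.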
On Mon, 2008-03-24 at 14:43 -0500, Dan McGee wrote:
On Mon, Mar 24, 2008 at 2:24 PM, Mark Constable <markc@renta.net> wrote:
FWIW, possibly worth considering adding lzma to pacman...
http://www.kdedevelopers.org/node/3326
--markc
Interesting. You may want to take this to the pacman-dev ML. A good "real-world" look might be grabbing the openoffice package and uncompressing it to a folder, and then run the following tests:
bsdtar czf package (gzip)
bsdtar cjf package (bzip2)
<lzma creation of package>
and then the similar unzip costs, along with the size of the original folder structure (unzipped package) vs. their zipped counterparts.
-Dan
I tested with OpenOffice.org; here are the results. Note that I had to use a weird construction to create the lzma files, as tar has no native support yet (1.19.90 has official lzma support, I read on some Debian mailing list).
Compression takes much longer, but in the case of openoffice I think 30 seconds of extra compression time will save much more upload time, as the resulting package would be 20MB smaller. For the end user, it depends on their connection. Most users will have <10Mb/s connections and will see a positive change in installation speed: 10 seconds extra for extraction, 20 seconds less for downloading the package.

[jan@server ~]$ time bsdtar czf oo.tar.gz oo/
real 0m24.244s
user 0m23.882s
sys 0m0.360s

[jan@server ~]$ time bsdtar cjf oo.tar.bz2 oo/
real 0m55.812s
user 0m55.310s
sys 0m0.377s

[jan@server ~]$ time tar c oo | lzma -2 > oo.tar.lzma
real 0m55.070s
user 0m54.583s
sys 0m0.740s

[jan@server ~]$ time tar c oo | lzma -1 > oo.tar.lzma1
real 0m38.995s
user 0m38.737s
sys 0m0.503s

[jan@server ~]$ du -h oo.tar.*
108M oo.tar.bz2
115M oo.tar.gz
95M oo.tar.lzma
101M oo.tar.lzma1

[jan@server u]$ time tar zxf ../oo.tar.gz
real 0m3.570s
user 0m3.383s
sys 0m0.967s

[jan@server u]$ time tar jxf ../oo.tar.bz2
real 0m19.431s
user 0m17.826s
sys 0m1.367s

[jan@server u]$ time tar --use-compress-program=lzma -xf ../oo.tar.lzma
real 0m13.211s
user 0m12.306s
sys 0m1.117s

(the lzma1 archive gives equal timings)
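The download-versus-extraction trade-off described above can be checked with a little arithmetic. The sketch below uses the figures from this message (115M gzip vs. 95M lzma, and 13.211s vs. 3.570s extraction). The helper name net_saving is hypothetical, and treating 1 MB as 8 Mbit (ignoring protocol overhead and the MiB/MB distinction in du -h) is a simplifying assumption.

```shell
#!/bin/sh
# Sketch only: net_saving is a hypothetical helper.
# Usage: net_saving SIZE_SAVED_MB EXTRA_EXTRACT_SECONDS LINK_MBIT_PER_S
net_saving() {
    awk -v mb="$1" -v extra="$2" -v link="$3" 'BEGIN {
        dl = mb * 8 / link            # seconds of download time saved
        even = mb * 8 / extra         # link speed at which the two effects cancel
        printf "link %s Mb/s: download -%.1fs, extraction +%.1fs, net %.1fs\n", link, dl, extra, dl - extra
        printf "break-even link speed: %.1f Mb/s\n", even
    }'
}

# 115M - 95M = 20 MB saved; 13.211s - 3.570s = 9.641s extra extraction.
net_saving 20 9.641 10
```

With these inputs the download saving at exactly 10 Mb/s comes to about 16 seconds, so the "20 seconds less" estimate corresponds to a link somewhat slower than 10 Mb/s. On faster links the advantage shrinks and eventually reverses.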
CCing to pacman-dev.
On Mon, Mar 24, 2008 at 3:25 PM, Jan de Groot <jan@jgc.homeip.net> wrote:
On Mon, 2008-03-24 at 14:43 -0500, Dan McGee wrote:
On Mon, Mar 24, 2008 at 2:24 PM, Mark Constable <markc@renta.net> wrote:
FWIW, possibly worth considering adding lzma to pacman...
http://www.kdedevelopers.org/node/3326
--markc
Interesting. You may want to take this to the pacman-dev ML. A good "real-world" look might be grabbing the openoffice package and uncompressing it to a folder, and then run the following tests:
bsdtar czf package (gzip)
bsdtar cjf package (bzip2)
<lzma creation of package>
and then the similar unzip costs, along with the size of the original folder structure (unzipped package) vs. their zipped counterparts.
-Dan
I tested with OpenOffice.org, here are the results. Note that I had to use a weird construction to compress lzma files, as tar has no native support yet (1.19.90 has official lzma support I read on some debian mailinglist).
Compression times take a huge increase in time, but in case of openoffice, I think 30 seconds extra compression time will save on much more upload time, as the resulting package would be 20MB smaller. For the end user, it depends on their connection. Most users will have <10Mb/s connections and will see a positive change in installation speed: 10 seconds extra for extraction, 20 seconds less for downloading the package.
[jan@server ~]$ time bsdtar czf oo.tar.gz oo/ real 0m24.244s user 0m23.882s sys 0m0.360s
[jan@server ~]$ time bsdtar cjf oo.tar.bz2 oo/ real 0m55.812s user 0m55.310s sys 0m0.377s
[jan@server ~]$ time tar c oo | lzma -2 > oo.tar.lzma real 0m55.070s user 0m54.583s sys 0m0.740s
[jan@server ~]$ time tar c oo | lzma -1 > oo.tar.lzma1 real 0m38.995s user 0m38.737s sys 0m0.503s
[jan@server ~]$ du -h oo.tar.* 108M oo.tar.bz2 115M oo.tar.gz 95M oo.tar.lzma 101M oo.tar.lzma1
[jan@server u]$ time tar zxf ../oo.tar.gz real 0m3.570s user 0m3.383s sys 0m0.967s
[jan@server u]$ time tar jxf ../oo.tar.bz2 real 0m19.431s user 0m17.826s sys 0m1.367s
[jan@server u]$ time tar --use-compress-program=lzma -xf ../oo.tar.lzma real 0m13.211s user 0m12.306s sys 0m1.117s (the lzma1 archive gives equal timings)
Did you try with bsdtar on the extraction as well? That may make a difference over (gnu)tar. Should be:
time bsdtar xf ../oo.tar.gz
time bsdtar xf ../oo.tar.bz2
time bsdtar --use-compress-program=lzma -xf ../oo.tar.lzma
-Dan
Great improvements on my Athlon XP: lzma is faster than bz2 and gives better compression ratios. For the end user, decompression is only a few seconds slower than gzip, but the gzip file is nearly 20 MB bigger than the lzma one!
I used the contents of the current openoffice-base-2.4.0-0.4-i686.pkg.tar.gz package for the test.

time bsdtar cfz oo.tar.gz openoffice/
real 0m47.403s
user 0m42.761s
sys 0m1.190s

time bsdtar cjf oo.tar.bz2 openoffice/
real 3m9.821s
user 2m45.149s
sys 0m1.657s

time tar c openoffice | lzma -2 > oo.tar.lzma
real 2m26.341s
user 2m14.491s
sys 0m1.683s

du -h oo.tar.*
106M oo.tar.bz2
112M oo.tar.gz
94M oo.tar.lzma

time tar -xzf oo.tar.gz
real 0m37.378s
user 0m4.036s
sys 0m1.713s

time tar -xjf oo.tar.bz2
real 1m40.354s
user 0m39.711s
sys 0m3.946s

time tar --use-compress-program=lzma -xf oo.tar.lzma
real 0m43.464s
user 0m13.986s
sys 0m2.007s

Rodland
participants (4)
- Dan McGee
- Jan de Groot
- Mark Constable
- rodland