[arch-dev-public] RFC: (devtools) Changing default compression method to zstd

Robin Broda arch-ml at coderobe.net
Mon Mar 25 04:13:13 UTC 2019


Hello again,

After archange and Baptiste mentioned that the numbers looked a little odd, I took some more time and re-ran the tests with additional parameters.
Most notably, this adds -T2 runs to show the behavior on lower-spec machines, and it fixes the higher compression levels (20 and above) by adding --ultra.

Here are the new results:

Compressor                 Package Name   Size (MiB)  Comp. Size (MiB)  Ratio   Time (mm:ss)  Max. RSS in MiB  Decomp. Time (mm:ss)  Decomp. RSS in MiB
xz -c -z -                 cuda           3038,58     1316,93           43,34%  19:03.44      95,32            1:19.74               10,18
zstd -c -T2 -18 -          cuda           3038,58     1375,41           45,26%  7:10.76       373,53           0:04.49               10,70
zstd -c -T0 -18 -          cuda           3038,58     1375,41           45,26%  1:17.23       2646,19          0:04.41               10,76
zstd -c -T2 -19 -          cuda           3038,58     1371,94           45,15%  9:08.09       420,43           0:04.43               10,68
zstd -c -T0 -19 -          cuda           3038,58     1371,94           45,15%  1:34.74       3415,77          0:04.51               10,75
zstd -c -T2 --ultra -20 -  cuda           3038,58     1286,91           42,35%  10:05.19      1255,64          0:04.46               34,78
zstd -c -T0 --ultra -20 -  cuda           3038,58     1286,91           42,35%  1:57.94       8192,42          0:04.43               34,76
zstd -c -T2 --ultra -21 -  cuda           3038,58     1141,94           37,58%  10:37.84      2404,56          0:04.11               66,73
zstd -c -T0 --ultra -21 -  cuda           3038,58     1141,94           37,58%  2:58.45       8035,52          0:04.08               66,77
xz -c -z -                 gcc            135,54      33,11             24,43%  0:54.54       95,34            0:02.59               10,11
zstd -c -T2 -18 -          gcc            135,54      35,87             26,47%  0:23.39       255,13           0:00.27               10,76
zstd -c -T0 -18 -          gcc            135,54      35,87             26,47%  0:12.42       419,35           0:00.24               10,81
zstd -c -T2 -19 -          gcc            135,54      35,66             26,31%  0:30.34       319,03           0:00.24               10,45
zstd -c -T0 -19 -          gcc            135,54      35,66             26,31%  0:16.07       579,00           0:00.24               10,73
zstd -c -T2 --ultra -20 -  gcc            135,54      24,38             17,99%  0:51.69       484,32           0:00.20               34,63
zstd -c -T0 --ultra -20 -  gcc            135,54      24,38             17,99%  0:51.79       484,66           0:00.20               34,73
zstd -c -T2 --ultra -21 -  gcc            135,54      22,89             16,89%  1:10.22       481,77           0:00.22               66,71
zstd -c -T0 --ultra -21 -  gcc            135,54      22,89             16,89%  1:10.39       482,17           0:00.21               66,65
xz -c -z -                 go             484,10      122,10            25,22%  3:19.11       95,35            0:08.78               10,16
zstd -c -T2 -18 -          go             484,10      132,69            27,41%  1:20.42       292,36           0:00.78               10,75
zstd -c -T0 -18 -          go             484,10      132,69            27,41%  0:15.42       1402,87          0:00.78               10,79
zstd -c -T2 -19 -          go             484,10      131,84            27,23%  1:46.85       352,77           0:00.79               10,75
zstd -c -T0 -19 -          go             484,10      131,84            27,23%  0:20.13       1914,13          0:00.80               10,75
zstd -c -T2 --ultra -20 -  go             484,10      121,87            25,17%  1:58.00       879,29           0:00.83               34,68
zstd -c -T0 --ultra -20 -  go             484,10      121,87            25,17%  1:07.37       1252,75          0:00.84               34,71
zstd -c -T2 --ultra -21 -  go             484,10      112,18            23,17%  2:09.79       1240,84          0:00.82               66,73
zstd -c -T0 --ultra -21 -  go             484,10      112,18            23,17%  2:09.70       1241,08          0:00.81               66,80
xz -c -z -                 intellij-*     772,46      384,37            49,76%  4:53.01       95,31            0:28.69               10,18
zstd -c -T2 -18 -          intellij-*     772,46      392,44            50,80%  1:51.91       342,91           0:00.94               10,71
zstd -c -T0 -18 -          intellij-*     772,46      392,44            50,80%  0:20.50       2341,05          0:00.93               10,70
zstd -c -T2 -19 -          intellij-*     772,46      391,04            50,62%  2:40.44       407,06           0:00.94               10,82
zstd -c -T0 -19 -          intellij-*     772,46      391,04            50,62%  0:28.37       3107,88          0:00.95               10,76
zstd -c -T2 --ultra -20 -  intellij-*     772,46      380,38            49,24%  3:19.46       1182,29          0:01.04               34,73
zstd -c -T0 --ultra -20 -  intellij-*     772,46      380,38            49,24%  1:28.10       2282,25          0:01.03               34,72
zstd -c -T2 --ultra -21 -  intellij-*     772,46      374,19            48,44%  4:07.18       1788,19          0:01.06               66,81
zstd -c -T0 --ultra -21 -  intellij-*     772,46      374,19            48,44%  2:31.77       2433,28          0:01.04               66,73
xz -c -z -                 linux          80,15       70,66             88,17%  0:31.27       95,35            0:03.85               10,11
zstd -c -T2 -18 -          linux          80,15       70,22             87,62%  0:09.94       250,32           0:00.06               10,56
zstd -c -T0 -18 -          linux          80,15       70,22             87,62%  0:07.52       299,25           0:00.06               10,72
zstd -c -T2 -19 -          linux          80,15       70,18             87,56%  0:12.90       314,25           0:00.05               10,57
zstd -c -T0 -19 -          linux          80,15       70,18             87,56%  0:09.32       395,17           0:00.06               10,71
zstd -c -T2 --ultra -20 -  linux          80,15       70,18             87,56%  0:22.64       313,47           0:00.08               34,61
zstd -c -T0 --ultra -20 -  linux          80,15       70,18             87,56%  0:22.69       313,82           0:00.08               34,68
zstd -c -T2 --ultra -21 -  linux          80,15       70,17             87,55%  0:27.01       473,56           0:00.09               66,64
zstd -c -T0 --ultra -21 -  linux          80,15       70,17             87,55%  0:26.97       473,89           0:00.09               66,71
xz -c -z -                 linux-headers  103,85      17,02             16,39%  0:42.24       95,35            0:01.45               10,15
zstd -c -T2 -18 -          linux-headers  103,85      18,92             18,22%  0:19.51       218,35           0:00.17               10,48
zstd -c -T0 -18 -          linux-headers  103,85      18,92             18,22%  0:12.74       320,92           0:00.17               10,47
zstd -c -T2 -19 -          linux-headers  103,85      18,88             18,18%  0:24.42       282,43           0:00.16               10,67
zstd -c -T0 -19 -          linux-headers  103,85      18,88             18,18%  0:16.28       448,92           0:00.17               10,59
zstd -c -T2 --ultra -20 -  linux-headers  103,85      18,77             18,08%  0:43.86       286,13           0:00.19               34,74
zstd -c -T0 --ultra -20 -  linux-headers  103,85      18,77             18,08%  0:44.00       286,32           0:00.19               34,79
zstd -c -T2 --ultra -21 -  linux-headers  103,85      18,70             18,00%  1:03.41       445,96           0:00.20               66,61
zstd -c -T0 --ultra -21 -  linux-headers  103,85      18,70             18,00%  1:03.29       446,32           0:00.20               66,66
xz -c -z -                 tensorflow     303,10      55,58             18,34%  1:59.56       95,40            0:04.78               10,27
zstd -c -T2 -18 -          tensorflow     303,10      61,83             20,40%  0:54.04       277,06           0:00.48               10,72
zstd -c -T0 -18 -          tensorflow     303,10      61,83             20,40%  0:15.64       856,86           0:00.47               10,61
zstd -c -T2 -19 -          tensorflow     303,10      61,49             20,29%  1:15.56       340,99           0:00.48               10,75
zstd -c -T0 -19 -          tensorflow     303,10      61,49             20,29%  0:20.82       1176,75          0:00.49               10,68
zstd -c -T2 --ultra -20 -  tensorflow     303,10      60,63             20,00%  1:30.19       678,34           0:00.53               34,67
zstd -c -T0 --ultra -20 -  tensorflow     303,10      60,63             20,00%  1:11.32       849,60           0:00.54               34,68
zstd -c -T2 --ultra -21 -  tensorflow     303,10      59,98             19,79%  2:42.81       1007,56          0:00.54               66,47
zstd -c -T0 --ultra -21 -  tensorflow     303,10      59,98             19,79%  2:43.03       1007,95          0:00.55               66,70


The new results show that -20 is actually more beneficial for our goals, as it:
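- Actually reduces the size compared to xz more often (5 of the 7 packages tested, versus 1 of 7 at -18 and -19)
- While not as fast as -18 or -19, still beats xz in compression time for every package except linux-headers
- Only increases the decompressor memory usage negligibly (about 35 MiB, up from about 11 MiB)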
- Maintains similar decompression speed to the other levels
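
As a rough cross-check of the first point, the compressed sizes from the table can be compared with a few lines of shell. This is only a sketch: the sizes are copied from the xz and --ultra -20 rows above (with decimal commas written as points), and the function name compare_sizes is a made-up helper, not part of any tool.

```bash
# Compressed sizes in MiB, copied from the table above.
# Columns: package, xz size, zstd --ultra -20 size.
sizes='cuda 1316.93 1286.91
gcc 33.11 24.38
go 122.10 121.87
intellij 384.37 380.38
linux 70.66 70.18
linux-headers 17.02 18.77
tensorflow 55.58 60.63'

# Hypothetical helper: prints the relative size change of zstd -20
# versus xz per package, then a count of packages where zstd is smaller.
compare_sizes() {
    printf '%s\n' "$sizes" | awk '
        {
            diff = ($3 - $2) / $2 * 100   # negative means zstd is smaller
            if ($3 < $2) smaller++
            printf "%-14s %+6.2f%%\n", $1, diff
        }
        END { printf "smaller than xz: %d of %d\n", smaller, NR }'
}

compare_sizes
```

The last line of output reads "smaller than xz: 5 of 7"; linux-headers and tensorflow are the two packages where xz still wins on size.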

TL;DR:
Benefits:
- Faster than xz, for compression in most cases and for decompression in all of them
- Often smaller than or similar to xz in size, an improvement over -18 either way
- Still reproducible :)

Trade-offs:
- Minimal increase in decompressor memory usage, but we're talking about 35 MiB in total at -20 here.
- Increase in memory usage during compression; however, the important part is that memory usage scales with the number of threads used.

Given that low-end systems can simply change the thread count to 1 or 2 to slash the compressor memory usage at the cost of speed,
I don't think that is a problem.
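
To illustrate, a local override on such a machine might look something like the sketch below. It reuses the -T2 invocation from the table; treating this as a per-machine edit to /etc/makepkg.conf is an assumption on my part, not something the tests covered.

```bash
# Hypothetical local override in /etc/makepkg.conf on a low-memory builder.
PKGEXT='.pkg.tar.zst'
# -T2 limits zstd to two worker threads, trading speed for lower memory use.
# --ultra is needed for compression levels 20 and above.
COMPRESSZST=(zstd -c -T2 --ultra -20 -)
```

For cuda, the table shows this dropping the compressor's peak RSS from about 8192 MiB (-T0) to about 1256 MiB (-T2), at the cost of roughly ten minutes instead of two.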

New changeset:
PKGEXT='.pkg.tar.zst'
COMPRESSZST=(zstd -c -T0 --ultra -20 -)


This would hopefully address the concerns over the file size increase, while still maintaining most of the benefits.


Regards,
Rob (coderobe)
