Hello all,

In the past few weeks, some TUs and Developers have compared different compression
algorithms to potentially replace the default compression method used in devtools.
The current method is `xz -c -z -`, which is single-threaded and rather slow, so we
are looking to replace it with something faster.

Multithreaded xz has come up in the past and was quickly dismissed due to edge cases
that would leave packages unreproducible across different machines - namely,
`xz -T0` (the mode that determines the number of threads automatically) produces
different output when only a single core is available:

$ taskset -c 1 xz -c -z - -T0 < test > test.xz && sha256sum test.xz
fe95a1af78304ae4be508e071f6697296e52b625fba95fca5622757779633d90 test.xz
$ taskset -c 1,2 xz -c -z - -T0 < test > test.xz && sha256sum test.xz
3b2c520eda654de19c5fc02ea1d850e142ae24e1246edcce82e90bd690d18f99 test.xz
$ taskset -c 1,2,3 xz -c -z - -T0 < test > test.xz && sha256sum test.xz
3b2c520eda654de19c5fc02ea1d850e142ae24e1246edcce82e90bd690d18f99 test.xz

With this mail, I propose to switch to `zstd` instead (https://github.com/facebook/zstd).
zstd does *not* exhibit this issue, and anthraxx has asked upstream for clarification,
just in case (https://github.com/facebook/zstd/issues/999#issuecomment-474114799).
The response is that zstd is generally friendly to reproducible builds.

After some testing with heftig, I ran additional benchmarks on our new build host
'dragon' to determine the appropriate compression level. Here are the results
(sorry for the wide mail :b):

Compressor        | Package Name                    | Size (MiB) | Comp. Size (MiB) | Ratio  | Comp. Time (mm:ss) | Comp. Max. RSS (MiB) | Decomp. Time (mm:ss) | Decomp. Max. RSS (MiB)
xz -c -z -        | cuda                            | 3038,58    | 1316,93          | 43,34% | 19:03.44           | 95,32                | 1:19.74              | 10,18
zstd -c -T0 -18 - | cuda                            | 3038,58    | 1375,41          | 45,26% | 01:12.50           | 2648,93              | 0:04.46              | 10,70
zstd -c -T0 -19 - | cuda                            | 3038,58    | 1371,94          | 45,15% | 01:34.13           | 3401,67              | 0:04.47              | 10,73
zstd -c -T0 -20 - | cuda                            | 3038,58    | 1371,94          | 45,15% | 01:34.34           | 3416,90              | 0:04.46              | 10,79
zstd -c -T0 -21 - | cuda                            | 3038,58    | 1371,94          | 45,15% | 01:31.60           | 3414,14              | 0:04.46              | 10,79
xz -c -z -        | gcc                             | 135,54     | 33,11            | 24,43% | 00:54.54           | 95,34                | 0:02.59              | 10,11
zstd -c -T0 -18 - | gcc                             | 135,54     | 35,87            | 26,47% | 00:12.37           | 419,23               | 0:00.23              | 10,77
zstd -c -T0 -19 - | gcc                             | 135,54     | 35,66            | 26,31% | 00:15.76           | 578,99               | 0:00.24              | 10,66
zstd -c -T0 -20 - | gcc                             | 135,54     | 35,66            | 26,31% | 00:16.36           | 579,11               | 0:00.25              | 10,75
zstd -c -T0 -21 - | gcc                             | 135,54     | 35,66            | 26,31% | 00:16.18           | 579,01               | 0:00.25              | 10,46
xz -c -z -        | go                              | 484,10     | 122,10           | 25,22% | 03:19.11           | 95,35                | 0:08.78              | 10,16
zstd -c -T0 -18 - | go                              | 484,10     | 132,69           | 27,41% | 00:15.40           | 1402,99              | 0:00.80              | 10,80
zstd -c -T0 -19 - | go                              | 484,10     | 131,84           | 27,23% | 00:19.74           | 1914,07              | 0:00.79              | 10,78
zstd -c -T0 -20 - | go                              | 484,10     | 131,84           | 27,23% | 00:20.19           | 1914,11              | 0:00.77              | 10,72
zstd -c -T0 -21 - | go                              | 484,10     | 131,84           | 27,23% | 00:20.08           | 1914,09              | 0:00.79              | 10,78
xz -c -z -        | intellij-idea-community-edition | 772,46     | 384,37           | 49,76% | 04:53.01           | 95,31                | 0:28.69              | 10,18
zstd -c -T0 -18 - | intellij-idea-community-edition | 772,46     | 392,44           | 50,80% | 00:27.10           | 2341,02              | 0:00.91              | 10,63
zstd -c -T0 -19 - | intellij-idea-community-edition | 772,46     | 391,04           | 50,62% | 00:37.09           | 3107,97              | 0:00.93              | 10,47
zstd -c -T0 -20 - | intellij-idea-community-edition | 772,46     | 391,04           | 50,62% | 00:34.43           | 3107,87              | 0:00.93              | 10,70
zstd -c -T0 -21 - | intellij-idea-community-edition | 772,46     | 391,04           | 50,62% | 00:35.45           | 3104,94              | 0:00.94              | 10,64
xz -c -z -        | linux                           | 80,15      | 70,66            | 88,17% | 00:31.27           | 95,35                | 0:03.85              | 10,11
zstd -c -T0 -18 - | linux                           | 80,15      | 70,22            | 87,62% | 00:07.48           | 299,30               | 0:00.05              | 10,64
zstd -c -T0 -19 - | linux                           | 80,15      | 70,18            | 87,56% | 00:09.32           | 395,32               | 0:00.05              | 10,72
zstd -c -T0 -20 - | linux                           | 80,15      | 70,18            | 87,56% | 00:08.88           | 395,23               | 0:00.06              | 10,57
zstd -c -T0 -21 - | linux                           | 80,15      | 70,18            | 87,56% | 00:08.91           | 395,28               | 0:00.05              | 10,71
xz -c -z -        | linux-headers                   | 103,85     | 17,02            | 16,39% | 00:42.24           | 95,35                | 0:01.45              | 10,15
zstd -c -T0 -18 - | linux-headers                   | 103,85     | 18,92            | 18,22% | 00:12.68           | 320,98               | 0:00.16              | 10,74
zstd -c -T0 -19 - | linux-headers                   | 103,85     | 18,88            | 18,18% | 00:16.36           | 448,98               | 0:00.17              | 10,63
zstd -c -T0 -20 - | linux-headers                   | 103,85     | 18,88            | 18,18% | 00:16.26           | 448,99               | 0:00.16              | 10,77
zstd -c -T0 -21 - | linux-headers                   | 103,85     | 18,88            | 18,18% | 00:16.39           | 448,97               | 0:00.16              | 10,72
xz -c -z -        | tensorflow                      | 303,10     | 55,58            | 18,34% | 01:59.56           | 95,40                | 0:04.78              | 10,27
zstd -c -T0 -18 - | tensorflow                      | 303,10     | 61,83            | 20,40% | 00:15.99           | 856,98               | 0:00.47              | 10,64
zstd -c -T0 -19 - | tensorflow                      | 303,10     | 61,49            | 20,29% | 00:21.01           | 1176,74              | 0:00.50              | 10,68
zstd -c -T0 -20 - | tensorflow                      | 303,10     | 61,49            | 20,29% | 00:21.11           | 1176,88              | 0:00.49              | 10,64
zstd -c -T0 -21 - | tensorflow                      | 303,10     | 61,49            | 20,29% | 00:21.16           | 1176,89              | 0:00.50              | 10,67

These results suggest that the ideal zstd level is `-18`: anything higher shows a
steep increase in memory usage during compression for negligible gains in
compression ratio. We are, however, looking at a small increase in compressed
package size in most cases. I consider that a minimal increase, and a tradeoff we
can make given the incredibly fast decompression.

So, TL;DR, the benefits of `zstd -c -T0 -18 -` over `xz -c -z -` are:

- Massive speed gain in compression
- Massive speed gain in decompression
- Stable, reproducible multithreading

The speed gain in decompression substantially increases pacman's package
installation speed.

The trade-offs are:

- Minimal increase in compressed package size
- Increased memory usage during compression

The required changeset is, I think:

PKGEXT='.pkg.tar.zst'
COMPRESSZST=(zstd -c -T0 -18 -)

This change requires a new pacman release: as of writing, zstd support is in
pacman's master branch but hasn't landed in a release yet.

Judging by recent IRC chats in -tu and -devops, I think many TUs and Devs already
consider this a good move. This mail is a general proposal to gather opinions and
hopefully clear up any misunderstandings or questions regarding this change before
the actual patch is sent.

Regards,
Rob (coderobe)
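
P.S.: For anyone who wants to verify the reproducibility claim locally, here is a
rough sketch of the same taskset/sha256sum test from above, applied to the proposed
zstd invocation. `test` is just a placeholder input file as in the xz example, and
`test.zst` is an arbitrary output name; no checksums are shown because they depend
on the input.

$ taskset -c 1 zstd -c -T0 -18 - < test > test.zst && sha256sum test.zst
$ taskset -c 1,2 zstd -c -T0 -18 - < test > test.zst && sha256sum test.zst
$ taskset -c 1,2,3 zstd -c -T0 -18 - < test > test.zst && sha256sum test.zst

If zstd's multithreading is reproducible as claimed, all three invocations should
print the same checksum regardless of how many cores taskset exposes.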