Thanks for the detailed reply, Eli :) Before I go further, I should mention that this thread has sparked some interest among our teammates, and folks on IRC are now researching the possibility of switching from xz to zstd entirely. So I think we will return to this topic soon with some updates :)
And it is not a pixz issue either. For example xz -T $num will compress with up to $num cores, as opposed to the default 1, but basic testing shows that it produces the same output with 2, 3, 4 etc. cores but different output from what you get when using 1 core. You cannot even hardcode -T0, since (in addition to requiring packagers to use more cores than they want to) that may still get interpreted as 1, which is then unreproducible. You can verify this using taskset -c 1...
(This has been discussed already in #archlinux-pacman.)
Turns out it is possible to make xz use more threads and still produce reproducible results; check out this message:

https://lists.debian.org/debian-dpkg/2016/10/msg00011.html

I've confirmed (with real VMs) that -T2, -T3 and -T8 all produce the same result on both single-core and multi-core CPUs. -T0 produces the same result on a multi-core CPU, but a different result on a single-core CPU.

Therefore, the simplest workaround that produces reproducible results on all types of CPU is this:

    -T$([ $(nproc) -lt 2 ] && echo '2' || echo '0')

Or maybe even something like this:

    -T$(( $(nproc) + 1 ))
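To make the first workaround easier to try out, here is a rough sketch of it as a shell helper, plus a small check that compares the output of two thread settings. The function names xz_threads and same_output are made up for illustration, and the sketch assumes nproc, xz and sha256sum are available on the machine.

```shell
#!/bin/sh
# Hypothetical helper: pick an xz -T value that never ends up on the
# single-threaded code path. The optional first argument is the core
# count; it defaults to whatever nproc reports.
xz_threads() {
    cores="${1:-$(nproc)}"
    if [ "$cores" -lt 2 ]; then
        echo 2   # single core: ask for 2 threads explicitly
    else
        echo 0   # several cores: let xz choose the thread count
    fi
}

# Hypothetical check: compress one file with two different -T values and
# succeed only if both outputs have the same checksum.
same_output() {
    file="$1"
    sum_a=$(xz -c -T"$2" "$file" | sha256sum)
    sum_b=$(xz -c -T"$3" "$file" | sha256sum)
    [ "$sum_a" = "$sum_b" ]
}
```

For example, xz -c -T"$(xz_threads)" some.tar would use the computed value, and same_output some.tar 2 8 && echo identical would repeat the kind of comparison described above (some.tar being any test file you have at hand).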
For the topic of configurable implementations, there is an additional hurdle. Since compression happens inside the chroot, even if we did support it, it would need some way to look up the required binary and install it in the chroot, but we have no tooling for this and in fact makepkg does not even know how to check that the binary exists, so you will probably just get:
  -> Compressing package...
/usr/share/makepkg/util/compress.sh: line 39: pixz: command not found
bsdtar: Write error
==> ERROR: Failed to create package file.
The solution would have to require either a known, embedded list of "good" ones (if pixz is not reproducible when running the same command line twice then we'd need to be able to blacklist that), or else assume the user installed it when adding it to their makepkg.conf and using pacman -Qo. It is probably overkill to use pkgfile and support compressors that aren't installed on the host either...
That is correct. One final remark: pixz does produce the same result when it is run twice; I made a mistake when I was testing it. But it doesn't matter, since we'll probably settle for zstd or xz -T.

-- Maxim Baz