On 3/18/19 5:56 AM, Maxim Baz via arch-projects wrote:
> I did see an earlier patch proposing the same, which was rejected because it might break delta packages [1]. However, someone mentioned that pacman is dropping support for delta packages altogether [2], referring to a removal commit in the pacman repo [3], so I wanted to bring this up again :)
>
> I'm creating a new thread because according to my tests the earlier patch [1] is incomplete.
>
> [1] https://lists.archlinux.org/pipermail/arch-projects/2017-May/004574.html
> [2] https://github.com/AladW/aurutils/issues/547
> [3] https://git.archlinux.org/pacman.git/commit/?id=9adb0d5b37df7ca668e23877e854...

Well, I would ideally like to add delta packages back into pacman, whether that is via xdelta3 or something else, and it is also worth noting it got removed at least in large part because archlinux.org itself has never really used it, and therefore:

- it doesn't exactly get well tested
- for our use cases we don't ultimately notice the difference

(I would like to one day add delta updates to dbscripts. :( But before that happens, I would need to convince Allan with a tested, safe delta implementation that someone actually understands and is willing to maintain.)

Supporting packages in the official repos which are not byte-identical to the default COMPRESSXZ makes it quite difficult to ever add delta support back.

...

The fundamental problem behind delta updates remains the same, though, which is that you cannot actually generate the same package twice using the same data (even from a byte-identical compressed tar, which is basically what xdelta3 did).

As Levente pointed out, this completely nukes reproducible builds. And it is not a pixz issue either. For example xz -T $num will compress with up to $num cores, as opposed to the default 1, but basic testing shows that it produces the same output with 2, 3, 4 etc. cores but different output from what you get when using 1 core. You cannot even hardcode -T0, since (in addition to requiring packagers to use more cores than they want to) that may still get interpreted as 1, which is then unreproducible. You can verify this using taskset -c 1... (This has been discussed already in #archlinux-pacman.)

Without a guarantee that a given compressor will always be reproducible no matter how many cores are used, we cannot take advantage of multithreaded compression on any level.

More generally, without a variation on https://patchwork.archlinux.org/patch/692/ we cannot take advantage of any sort of configurability. But my patch was rejected.
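
As a rough illustration of why unrecorded compressor settings matter, here is a minimal sketch using Python's standard lzma module as a stand-in for the xz command line (that substitution is an assumption made for illustration; makepkg itself calls the compressor binary). The payload and the xz_digest helper are hypothetical. Python's lzma module exposes no threading option, so this does not reproduce the -T behaviour described above; it only shows that the same input compressed at a different preset yields different bytes, even though both results decompress to identical data.

```python
import hashlib
import lzma


def xz_digest(data: bytes, preset: int) -> str:
    """Compress data into an .xz container at the given preset and
    return the SHA-256 hex digest of the compressed bytes."""
    compressed = lzma.compress(data, format=lzma.FORMAT_XZ, preset=preset)
    # Both settings must round-trip to the same content; only the
    # compressed representation is allowed to differ.
    assert lzma.decompress(compressed) == data
    return hashlib.sha256(compressed).hexdigest()


# Hypothetical stand-in for an uncompressed package tarball.
payload = b"hypothetical package contents\n" * 50_000

same_a = xz_digest(payload, 6)  # preset 6 is the xz default
same_b = xz_digest(payload, 6)  # same settings again
other = xz_digest(payload, 1)   # a user-chosen faster preset

print("preset 6 twice, identical bytes:", same_a == same_b)
print("preset 6 vs preset 1, identical bytes:", same_a == other)
```

If the second comparison reports differing digests, then anyone rebuilding the package without knowing which preset was used cannot expect a byte-identical file, which is the reason the setting would have to be recorded somewhere such as .BUILDINFO.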

Specific sub-issues with unknown configurability that isn't recorded in the .BUILDINFO:

- User-configurable implementations. Without a guarantee that any implementation of a format retains byte-level compatibility with the standard format (in this case xz), we cannot support this as we do not know what was being used.
- User-configurable options. These are inherently unreproducible as long as they include the compression level.

...

For the topic of configurable implementations, there is an additional hurdle. Since compression happens inside the chroot, even if we did support it, it would need some way to look up the required binary and install it in the chroot, but we have no tooling for this and in fact makepkg does not even know how to check that the binary exists, so you will probably just get:

      -> Compressing package...
    /usr/share/makepkg/util/compress.sh: line 39: pixz: command not found
    bsdtar: Write error
    ==> ERROR: Failed to create package file.

The solution would have to require either a known, embedded list of "good" ones (if pixz is not reproducible when running the same command line twice then we'd need to be able to blacklist that), or else assume the user installed it when adding it to their makepkg.conf and using pacman -Qo. It is probably overkill to use pkgfile and support compressors that aren't installed on the host either...

-- 
Eli Schwartz
Bug Wrangler and Trusted User