[pacman-dev] [PATCH] makepkg: Pass --stream to `hg clone` when creating the working copy
From: Luke Shumaker <lukeshu@parabola.nu>

Without --stream, `hg clone` reencodes+recompresses the entire
repository, to the storage settings of the host. But download_hg()
already did that on the initial network clone, and it is 100%
pointless duplicated work for the local clone.

The work that this saves is CPU-bound (not disk-bound), and is
restricted to a single core.

The --stream flag has only existed since Mercurial 4.4 (2017-11-01).
Prior to that, it was named --uncompressed. --uncompressed still
exists as a compatibility alias for --stream, and marked deprecated,
though there is currently no schedule for its removal.
---
 scripts/libmakepkg/source/hg.sh.in | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/scripts/libmakepkg/source/hg.sh.in b/scripts/libmakepkg/source/hg.sh.in
index ae9aed3b..7346e1e3 100644
--- a/scripts/libmakepkg/source/hg.sh.in
+++ b/scripts/libmakepkg/source/hg.sh.in
@@ -94,7 +94,7 @@ extract_hg() {
 			plain "$(gettext "Aborting...")"
 			exit 1
 		fi
-	elif ! hg clone -u "$ref" "$dir" "${dir##*/}"; then
+	elif ! hg clone -u "$ref" --stream "$dir" "${dir##*/}"; then
 		error "$(gettext "Failure while creating working copy of %s %s repo")" "${repo}" "hg"
 		plain "$(gettext "Aborting...")"
 		exit 1
--
2.18.0
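
The flag history described in the commit message (--stream since Mercurial 4.4, --uncompressed before that) could be handled with a small version check if older Mercurial releases had to be supported. The following is only a sketch and not part of the patch: hg_stream_flag is a hypothetical helper name, and it assumes the caller already has a well-formed version string such as "4.7" or "4.3.2".

```shell
# Hypothetical helper (not in makepkg): print the clone flag that matches
# a given Mercurial version string, e.g. "4.7", "4.3.2" or "4.4rc0".
# Assumes the string starts with a numeric major version.
hg_stream_flag() {
	local version=$1 major rest minor

	major=${version%%.*}        # text before the first dot
	rest=${version#"$major"}    # remainder, e.g. ".3.2" (empty if no dot)
	rest=${rest#.}
	minor=${rest%%[!0-9]*}      # leading digits of the second component

	# --stream exists since 4.4; older releases only know --uncompressed
	if (( 10#$major > 4 || (10#$major == 4 && 10#${minor:-0} >= 4) )); then
		printf '%s\n' --stream
	else
		printf '%s\n' --uncompressed
	fi
}
```

A caller would substitute the result into the clone command, for example `hg clone -u "$ref" "$(hg_stream_flag "$ver")" "$dir" "${dir##*/}"`, where `$ver` is an assumed variable holding the installed version. Since --uncompressed is still accepted as a deprecated alias, a check like this only matters if the alias is ever removed.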
On Sat, 08 Sep 2018 16:31:16 -0400, Luke Shumaker wrote:
> From: Luke Shumaker <lukeshu@parabola.nu>
>
> Without --stream, `hg clone` reencodes+recompresses the entire repository, to the storage settings of the host. But download_hg() already did that on the initial network clone, and it is 100% pointless duplicated work for the local clone.
>
> The work that this saves is CPU-bound (not disk-bound), and is restricted to a single core.
After more testing, this didn't have the speed-up that I expected.
Consider the patch withdrawn.

--
Happy hacking,
~ Luke Shumaker
On 9/19/18 3:15 PM, Luke Shumaker wrote:
> On Sat, 08 Sep 2018 16:31:16 -0400, Luke Shumaker wrote:
>> From: Luke Shumaker <lukeshu@parabola.nu>
>>
>> Without --stream, `hg clone` reencodes+recompresses the entire repository, to the storage settings of the host. But download_hg() already did that on the initial network clone, and it is 100% pointless duplicated work for the local clone.
>>
>> The work that this saves is CPU-bound (not disk-bound), and is restricted to a single core.
>
> After more testing, this didn't have the speed-up that I expected. Consider the patch withdrawn.
As a matter of curiosity, was this just "not much savings" or "not actually saving"? What kind of practical effect does it have, ultimately?

...

I'm wondering if it's worth doing something similar elsewhere, specifically for git clone --shared

It would save cp'ing possibly bloaty files from SRCDEST to BUILDDIR, in the event that the two directories are on different partitions. Normally git would optimize this away by creating hardlinks.

Downsides are:

- If SRCDEST is rm'ed then the BUILDDIR clone breaks -- but I consider that reasonable, plus if you re-clone to SRCDEST it magically works again...

- If the upstream source does a force push and SRCDEST prunes some commits in our ephemeral clone via git gc --auto, *and* users treat BUILDDIR as a place to commit changes they want to keep, they may get a broken repo and missing commits. Do we care about this? Worth noting is they will already have makepkg trying to force-reset the default "makepkg" branch.

--
Eli Schwartz
Bug Wrangler and Trusted User
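
To make the `git clone --shared` idea above concrete, here is a minimal sketch of what such a clone leaves on disk. It is an illustration only, not a proposed makepkg change: shared_clone_demo is a hypothetical function name, and the srcdest/builddir paths are throwaway stand-ins for SRCDEST and BUILDDIR inside a temporary directory.

```shell
# Sketch: a --shared clone copies no objects; it records a pointer to the
# source repository's object store in .git/objects/info/alternates.
shared_clone_demo() {
	local tmp srcdest builddir
	tmp=$(mktemp -d) || return 1
	srcdest=$tmp/srcdest/repo      # stand-in for the SRCDEST clone
	builddir=$tmp/builddir/repo    # stand-in for the BUILDDIR working copy
	mkdir -p "$srcdest" "$tmp/builddir"

	# Create a tiny source repository with a single commit
	git init -q "$srcdest"
	echo hello > "$srcdest/file"
	git -C "$srcdest" add file
	git -C "$srcdest" -c user.name=demo -c user.email=demo@example.invalid \
		commit -q -m initial

	# Clone without copying or hardlinking any objects
	git clone -q --shared "$srcdest" "$builddir"

	# Print the path the clone borrows its objects from
	cat "$builddir/.git/objects/info/alternates"

	rm -rf "$tmp"
}
```

Calling `shared_clone_demo` prints the path of the source repository's `.git/objects` directory. That pointer is why the first downside holds: once the source objects are removed or pruned, the clone can no longer resolve the commits it borrowed.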
participants (2)
- Eli Schwartz
- Luke Shumaker