[pacman-dev] [PATCH] makepkg: use "shared" git clones when checking out sources

Allan McRae allan at archlinux.org
Tue Mar 19 04:09:46 UTC 2019


On 8/3/19 2:18 pm, Eli Schwartz wrote:
> In order to cache sources offline, makpekg creates *two* copies of every

Fixed this typo.

> git repo. This is a useful tradeoff for network time, but comes at the
> cost of increased disk space.
> 
> Normally, git can smooth this over automagically. Whenever possible, git
> objects are hardlinked to save space, but this does not work when
> SRCDEST and BUILDDIR are on separate filesystems.
> 
> When the repo in question is both very large (linux.git for example is
> 2.2 GB) and crosses filesystem boundaries, this results in a lot of
> extra disk space being used; the most likely scenario is where BUILDDIR
> is a tmpfs for bonus ouch.
> 
> git(1) has a builtin feature which serves this case handily: the
> --shared flag will create the info/alternates file instructing git to
> not copy or hardlink or create objects/packs at all, but merely look for
> them in an external location (that being the source of the clone).
> 
> The downside of using shared clones is that if you modify and drop
> commits from the original repo, or simply delete the whole repo
> altogether, you break the copy. But we don't care about that here,
> because
> 
> 1) the BUILDDIR copy is meant to be a temporary copy strictly derived
>    via PKGBUILD syntax from the SRCDEST, and must be able to be
>    recreated at any time,
> 2) if the SRCDEST disappears, makepkg will redownload it, thus restoring
>    the objects needed by the BUILDDIR clone,
> 3) if the user does non-default things like hacking on the BUILDDIR copy
>    then deleting and re-cloning the SRCDEST may result in momentary
>    breakage, but ultimately should be fine -- the unique objects they
>    created will be stored in the BUILDDIR copy.
> 
> While it's theoretically possible that upstream will force-push to
> overwrite the base tree from which makepkg is building (which they
> should not do), *and* the user deleted their SRCDEST which they should
> not do, *and* they saved work in makepkg's working directory which they
> should not do either...
> ... this is an unlikely chain of events for which we should not care.
> 
> Using --shared is therefore helpful in immediately useful ways and IMHO
> has no actual downsides; we should use it.
> 
> An alternative implementation would be to use worktrees. I've rejected
> this since it is essentially the same as shared clones, except adding
> additional restrictions on the branch namespace, and could potentially
> break existing use cases such as manually handling the SRCDEST in order
> to share repositories with normal working copies.
> 
> Signed-off-by: Eli Schwartz <eschwartz at archlinux.org>

I don't think the commit message is long enough relative to the size of
the change.  But I will accept anyway.

A

> ---
>  scripts/libmakepkg/source/git.sh.in | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/scripts/libmakepkg/source/git.sh.in b/scripts/libmakepkg/source/git.sh.in
> index 497a668c..96d79623 100644
> --- a/scripts/libmakepkg/source/git.sh.in
> +++ b/scripts/libmakepkg/source/git.sh.in
> @@ -91,7 +91,7 @@ extract_git() {
>  			exit 1
>  		fi
>  		cd_safe "$srcdir"
> -	elif ! git clone "$dir" "${dir##*/}"; then
> +	elif ! git clone -s "$dir" "${dir##*/}"; then
>  		error "$(gettext "Failure while creating working copy of %s %s repo")" "${repo}" "git"
>  		plain "$(gettext "Aborting...")"
>  		exit 1
> 
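
For readers who want to see the mechanism the commit message describes, here is a minimal sketch of what `git clone -s` does. It is not part of the patch. The directory names are hypothetical stand-ins for SRCDEST and BUILDDIR, and the source repo is a plain one-commit repository created only for the demonstration.

```shell
#!/bin/bash
# Sketch: show that `git clone -s` records the source's object store in
# objects/info/alternates instead of copying or hardlinking objects.
# All paths below are hypothetical stand-ins, not makepkg's real layout.
set -e

workdir=$(mktemp -d)
srcdest="$workdir/srcdest/repo"   # stands in for the cached repo in SRCDEST
builddir="$workdir/build"         # stands in for the BUILDDIR working area

# Create a small source repository with a single (empty) commit.
git init -q "$srcdest"
git -C "$srcdest" -c user.name=demo -c user.email=demo@example.invalid \
	commit -q --allow-empty -m "initial commit"

# Create the working copy as a shared clone, as the patched line does.
mkdir -p "$builddir"
git clone -q -s "$srcdest" "$builddir/repo"

# The clone borrows its objects from the path recorded in this file.
alternates=$(cat "$builddir/repo/.git/objects/info/alternates")
printf 'alternates: %s\n' "$alternates"

# History is still fully readable through the alternate object store.
git -C "$builddir/repo" log --oneline
```

Deleting the source repo in this sketch would leave the clone unable to find its objects, which is the breakage the commit message argues is acceptable because makepkg can always recreate SRCDEST.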

