[arch-projects] [ABS] [PATCH v3 3/7] git prototype: on initial clones, perform a shallow clone

Wed Nov 9 22:46:09 EST 2011

On Wed, Nov 09, 2011 at 12:28:23AM -0600, C Anthony Risinger wrote:
> On Tue, Nov 8, 2011 at 11:56 PM, Linus Arver <linusarver at gmail.com> wrote:
> >
> > Shallow git clones are just like regular clones, but do not contain any
> > of the past commit history. It is virtually the same thing as doing a
> > regular clone, then doing a rebase to squash all commits into a single
> > commit. Many people who do not understand git dismiss shallow clones
> > because they wrongly believe that shallow clones are incapable of
> > pulling in changes going forward from the remote. This is not the case!
> > You can still do pulls from the master remote repo in the future to
> > update the shallow clone, just like a regular clone!
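
For illustration, the shallow-clone-then-pull workflow described above might
look roughly like this in a PKGBUILD-style helper. This is only a sketch: the
function name fetch_sources and the example URL are hypothetical, and the
_gitroot/_gitname values are placeholders rather than anything taken from the
patch itself.

```shell
# Placeholder values; a real PKGBUILD would define its own.
_gitroot="git://example.org/project.git"
_gitname="project"

# Hypothetical helper: shallow clone on first run, plain pull afterwards.
fetch_sources() {
  if [ -d "$_gitname/.git" ]; then
    # A clone already exists (shallow or not): update it from the remote.
    (cd "$_gitname" && git pull)
  else
    # First run: fetch only the most recent commit of the default branch.
    git clone --depth 1 "$_gitroot" "$_gitname"
  fi
}
```

Calling fetch_sources twice in the same directory would clone on the first
call and pull on the second.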
> 
> i had a few other improvements that may be of interest, outlined here:
> 
> http://mailman.archlinux.org/pipermail/arch-general/2011-July/021078.html
> 
> ... [sort of] condensed:
> 
> ) `_gitname` is not used consistently ... though i now forget the
> various uses i've seen :-( will have to follow up on that
> 
> ) allow _git* variables to be set by the environment
> 
> ) introduce `_gitspec` variable which supersedes `_gitname` at
> *checkout* stage (fallback to `_gitname`)
> 
> ) use a targeted fetch command instead of a clone -- this can achieve
> even greater savings than shallow clone, even though the fetch is
> "deep".  the idea is to pull only $_gitname and _nothing_ else.  this
> can however be combined with shallow if done correctly for even
> greater savings (this method results in 50%+ reduction to kernel pull
> [don't know shallow variant offhand], and i've seen savings as high as
> 90%+)
> 
> ) store repositories in a known list of locations ... people WILL blow
> the repo away if it's in the build dir.  my PKGBUILD searches these
> (in order of precedence):
> 
> /var/abs/PKGBUILD.devel/${pkgname}.git
> /var/abs/local/PKGBUILD.devel/${pkgname}.git
> ${SRCDEST}/PKGBUILD.devel/${pkgname}.git
> ${startdir}/PKGBUILD.devel/${pkgname}.git
> ~/PKGBUILD.devel/${pkgname}.git
> 
> ) use a proxy mechanism in the event a repository is found but it is
> read-only.  this lets you read-only bind mount a repo (think
> mkchrootpkg), and it will simply create a new repository, copy the
> refs, and set up the object directory as an alternate for the proxy
> repo.  thus the proxy has the ability to download new objects as
> needed, but starts from the same spot as the bind-mounted repo.
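
The "targeted fetch" item in the list above could be sketched as follows. The
function name fetch_single_ref and its arguments are hypothetical, and treating
the wanted ref as a branch name is an assumption; the linked PKGBUILD may do
this differently. The optional fourth argument combines the targeted fetch
with a shallow depth.

```shell
# Hypothetical helper: fetch exactly one branch into a bare repository,
# instead of cloning every ref the remote advertises.
#   $1 = remote URL, $2 = branch name, $3 = local bare repo, $4 = optional depth
fetch_single_ref() {
  local url="$1" ref="$2" dir="$3" depth="${4:-}"

  # Create the local repository on first use.
  [ -d "$dir" ] || git init --bare "$dir" || return 1

  if [ -n "$depth" ]; then
    # Targeted and shallow: only the last $depth commits of one branch.
    git --git-dir="$dir" fetch --depth "$depth" "$url" \
      "+refs/heads/$ref:refs/heads/$ref"
  else
    # Targeted but "deep": full history of one branch, nothing else.
    git --git-dir="$dir" fetch "$url" "+refs/heads/$ref:refs/heads/$ref"
  fi
}
```

For example, fetch_single_ref "$_gitroot" master "$pkgname.git" 1 would pull
only the tip of master.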
> 
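
The repository search order and the read-only proxy from the list above could
be sketched like this. The function names find_repo and usable_repo are
hypothetical, the $PWD fallbacks for SRCDEST and startdir are assumptions so
the sketch runs outside makepkg, and "git clone --shared" is swapped in here
for the manual steps described (new repository, copied refs, object directory
registered as an alternate); it is not necessarily what the linked PKGBUILD
does.

```shell
# Print the first existing repository for a package, searching the
# locations listed above in order of precedence. Returns 1 if none exists.
find_repo() {
  local name="$1" dir
  for dir in \
    "/var/abs/PKGBUILD.devel/${name}.git" \
    "/var/abs/local/PKGBUILD.devel/${name}.git" \
    "${SRCDEST:-$PWD}/PKGBUILD.devel/${name}.git" \
    "${startdir:-$PWD}/PKGBUILD.devel/${name}.git" \
    "${HOME}/PKGBUILD.devel/${name}.git"
  do
    if [ -d "$dir" ]; then
      printf '%s\n' "$dir"
      return 0
    fi
  done
  return 1
}

# Print a repository path that can be written to. If the found repository
# is read-only (e.g. a read-only bind mount), create a proxy that borrows
# its objects through objects/info/alternates and print the proxy instead.
usable_repo() {
  local repo="$1" proxy="$2"
  if [ -w "$repo" ]; then
    printf '%s\n' "$repo"
  else
    git clone --bare --shared "$repo" "$proxy" >&2 && printf '%s\n' "$proxy"
  fi
}
```

A caller might then run repo=$(find_repo "$pkgname") followed by
repo=$(usable_repo "$repo" "$srcdir/$pkgname-proxy.git"), where the proxy
path is again only an example.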
> ... these techniques are all in use, and primarily derived from
> experience developing this PKGBUILD:
> 
> http://aur.archlinux.org/packages/pyjamas-engine-pythonwebkit/PKGBUILD
> 
> ... which is a massive 1GB+ download and lengthy compile.  these
> modifications also make it very simple to rapidly build git packages
> within a chroot (one of the primary motivations) *without* any
> copying/etc.
> 
> probably a little out of scope from what you've done here, and
> possibly in need of further discussion, but your message sparked my
> memory and i still believe they are all good changes -- they saved me
> oodles of time and prevent constant removal of humongous repos (esp.
> when in chroot).
> 
> -- 
> 
> C Anthony

The problem I have with your sample PKGBUILD is that it is extremely
complicated. Anything extremely complicated goes entirely against the
KISS philosophy that we Arch devs/contributors cherish. See
https://wiki.archlinux.org/index.php/The_Arch_Way

But of course, you are free to write up a separate patch series for git.
At this time, however, I am unwilling to delay this patch series to
incorporate such extensive changes.

-Linus