[arch-dev-public] Cronjob for regular git garbage collection

Dan McGee dpmcgee at gmail.com
Tue Nov 3 07:59:52 EST 2009


On Tue, Nov 3, 2009 at 3:49 AM, Thomas Bächler <thomas at archlinux.org> wrote:
> When I broke our projects.archlinux.org vhost, I noticed that cloning git
> via http:// takes ages. This could be vastly improved by running a regular
> cronjob to 'git gc' all /srv/projects/git repositories. It would also speed
> up cloning/pulling via git://, as the "remote: compressing objects" stage
> will be much less work on the server. Are there any objections against
> setting this up?

I used to do this fairly often on the pacman.git repo; I did a few of
the others as well. No objections here, just make sure running the
cronjob doesn't make the repository unwritable for the people that
need it.

Be aware that 'git gc' has a drawback: it rewrites everything into one
new pack, so anyone who is fetching (not cloning) over HTTP will have
to download the whole pack again rather than just the incremental
changes. You may want something more like the script below, which
gives you the benefits of compressing objects without creating one
huge pack.

-Dan

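$ cat bin/prunerepos
#!/bin/sh
# Prune unreachable objects and incrementally repack every entry in the
# current directory whose name contains '.git'.

cwd=$(pwd)

for dir in "$cwd"/*.git*; do
	# Skip an unmatched glob or anything that is not a directory.
	[ -d "$dir" ] || continue
	echo "pruning and packing $dir..."
	# Use a subshell so a failed cd cannot affect later iterations.
	(
		cd "$dir" || exit 1
		git prune
		git repack -d
	)
done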
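
For running this from cron, a variant along the following lines might
be more convenient. This is only a sketch: the function names
(list_repos, pack_repo, prune_all) are made up for illustration, it
assumes the repositories are directories named *.git directly under the
base directory you pass in, and it says nothing about which user the
job should run as.

```shell
#!/bin/sh
# Sketch only: all function names here are hypothetical.

# Print every directory directly under $1 whose name ends in .git.
list_repos() {
	for dir in "$1"/*.git; do
		[ -d "$dir" ] && printf '%s\n' "$dir"
	done
	return 0
}

# Prune one repository and pack its loose objects into a new pack.
# Without -a, existing packs are left alone, so this stays incremental.
pack_repo() {
	(
		cd "$1" || exit 1
		git prune && git repack -d -q
	)
}

# Process every repository under $1; report failures but keep going.
prune_all() {
	list_repos "$1" | while IFS= read -r repo; do
		echo "pruning and packing $repo..."
		pack_repo "$repo" || echo "failed: $repo" >&2
	done
}
```

Calling "prune_all /srv/projects/git" at the end of the script would
cover the repositories Thomas mentioned. Run it as a user who already
owns the repositories so the new pack files stay writable for everyone
who needs them.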

