[arch-general] Syncing the mirrors

keenerd keenerd at gmail.com
Tue Mar 9 02:09:08 CET 2010


A lot of people have been getting bitten by mirrors being out of sync.
Fundamentally, this comes down to a mirror's database tarball being
ahead of or behind the packages which actually exist on the mirror.
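
To make the "database ahead of the packages" case concrete, here is a
rough sketch of a check a mirror admin could run by hand. It assumes
each desc file inside the database tarball carries a %FILENAME% line
followed by the package file name, and that all packages for the repo
sit in one directory. The function name check_db is made up for
illustration.

```shell
# check_db DB_TARBALL PKG_DIR
# Hypothetical helper: prints every package file that the database
# names but that is absent from PKG_DIR. Returns 1 if any is missing.
check_db() {
    db=$1
    pkgdir=$2
    status=0
    # Dump every file in the tarball to stdout and pick the line that
    # follows each %FILENAME% marker (assumed desc layout).
    # Package file names contain no whitespace, so word splitting is safe.
    for name in $(tar -xzOf "$db" | awk '/^%FILENAME%$/ { getline; print }'); do
        if [ ! -e "$pkgdir/$name" ]; then
            echo "missing: $name"
            status=1
        fi
    done
    return $status
}
```

This only catches a database that is ahead of the packages. A database
that is behind shows up as packages on disk that no desc file names.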

Rsync is made for bulk updating, but it is not atomic, and bad things
happen if you interact with a mirror while the rsync is running.
Generally, you get "package not found" errors.  This can be fixed, but
it is a pain to deploy, because every mirror would have to change how
it syncs.

Here's what mirrors (probably) should be doing:

wget ...../core.db.tar.gz -O core.db.tar.gz.new  # and the others
rsync --delete-after --exclude="*.db.tar.gz" rsync://......
mv core.db.tar.gz.new core.db.tar.gz  # and the others

The key differences here: put off deleting files as long as possible
(with --delete, rsync 3.0.0 and later deletes during the transfer by
default, and older versions delete before transferring anything, both
of which cause all sorts of trouble), and do not release the new DB
until all the new files have finished transferring.  This is hardly
the same type of guaranteed transaction you get with an ACID database,
but it should eliminate almost all of the current errors.
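
The three steps can be wrapped in a small function. This is only a
sketch under assumptions: the name sync_mirror, its arguments and the
relative database paths are invented for illustration, and rsync -a is
a guess at what a mirror would want. One detail it adds: the .new
files are excluded too, since otherwise --delete-after would remove
them as extraneous before they can be renamed.

```shell
# sync_mirror HTTP_BASE RSYNC_BASE DEST DB_PATH...
# Hypothetical wrapper. DB_PATH is each database's path relative to
# the mirror root (the layout depends on the upstream mirror).
sync_mirror() {
    http_base=$1
    rsync_base=$2
    dest=$3
    shift 3

    # 1. Fetch the new databases next to the live ones, unpublished.
    for db in "$@"; do
        mkdir -p "$dest/$(dirname "$db")" || return 1
        wget -q "$http_base/$db" -O "$dest/$db.new" || return 1
    done

    # 2. Sync everything except the databases. Deletions wait until
    #    the transfer is done; excluded files are left untouched.
    rsync -a --delete-after \
        --exclude='*.db.tar.gz' --exclude='*.db.tar.gz.new' \
        "$rsync_base/" "$dest/" || return 1

    # 3. Publish the databases. A rename within one filesystem is
    #    atomic, so clients see either the old or the new database.
    for db in "$@"; do
        mv -f "$dest/$db.new" "$dest/$db" || return 1
    done
}
```

If any step fails, the function returns early and the old databases
stay in place, so the mirror keeps serving its previous state (minus
whatever rsync managed to change before failing).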

There does not seem to be a documented "standard rsync" command for
the mirrors to use, so I'm making all sorts of wild assumptions about
what a mirror's rsync is doing.

-Kyle

