[arch-general] Repo mirrors
I've changed the mirror script from the wiki so mirror admins can run it every minute without much additional load. This will help a lot to reduce update delays now, and even more once we have multiple tiers (I don't know how much of that already works).

I've already described it somewhere, but for those who don't know it yet: the script fetches an md5sum of the checksums of the repos (core, extra, testing, community). If that first checksum differs from the one generated locally on the mirror, the script checks the checksum of every repo. For each repo whose checksum also differs, it runs rsync as needed. HTTP requests are used to fetch the checksums because most mirrors already have HTTP servers running and because such requests are cheap.

I've also created a patch for db-scripts to generate the md5sums on the master server. As I don't have a repo set up, this patch is untested (it's the same code as in the mirror script, though).

Once this is merged on the master, we can start telling mirror admins to use the new mirror script, which can be found at http://karif.server-speed.net/~flo/tmp/mirrorsync.sh.txt

C&C welcome :)
Signed-off-by: Florian Pritz <bluewind@xssn.at>
---
 db-update |    6 ++++++
 1 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/db-update b/db-update
index 418c680..6ee6232 100755
--- a/db-update
+++ b/db-update
@@ -238,6 +238,12 @@ for current_arch in ${ARCHES[@]}; do
             fi
         done
     fi
+    # generate syncsums for mirrors
+    md5sum <<< $(md5sum $(find "$FTP_BASE/$reponame/" -type f -name "$reponame$DBEXT") \
+        | cut -d\  -f1) | cut -d\  -f1 > "$FTP_BASE/$reponame/syncsum"
+    md5sum <<< $(cat "$FTP_BASE/"*"/syncsum") \
+        | cut -d\  -f1 > "$FTP_BASE/syncsum"
+
     if ! /bin/cp "$WORKDIR/build/$reponame$DBEXT" "$ftppath/"; then
         die "failed to move repository $reponame-$current_arch".
     fi
-- 
1.7.0.4
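
To make the two-level comparison described in the mail easier to follow, here is a minimal sketch. It is not the mirrorsync.sh script linked above. The function names (repo_syncsum, global_syncsum, changed_repos) and the .db.tar.gz default extension are assumptions made for illustration, and the "remote" side is a second local directory tree rather than syncsum files fetched over HTTP. The sums it computes are not guaranteed to match those written by the patch byte for byte.

```shell
#!/bin/sh
# Illustrative sketch only; all function names here are hypothetical.

# Checksum over the checksums of every database file of one repo
# (for example core.db.tar.gz for each architecture below $base/core/).
# Uses GNU sort -z and xargs -0 -r so the result is stable and an
# empty repo does not make md5sum wait on stdin.
repo_syncsum() {
    local base="$1" repo="$2" dbext="${3:-.db.tar.gz}"
    find "$base/$repo/" -type f -name "$repo$dbext" -print0 \
        | sort -z \
        | xargs -0 -r md5sum \
        | cut -d' ' -f1 \
        | md5sum \
        | cut -d' ' -f1
}

# Top-level checksum over the per-repo checksums.
global_syncsum() {
    local base="$1" repo
    shift
    for repo in "$@"; do
        repo_syncsum "$base" "$repo"
    done | md5sum | cut -d' ' -f1
}

# Print the name of every repo whose checksum differs between two trees.
# A real mirror would obtain the second set of sums from the master
# instead of computing them from a local copy.
changed_repos() {
    local local_base="$1" remote_base="$2" repo
    shift 2
    # Cheap first check: equal top-level sums mean there is nothing to do.
    if [ "$(global_syncsum "$local_base" "$@")" = "$(global_syncsum "$remote_base" "$@")" ]; then
        return 0
    fi
    for repo in "$@"; do
        if [ "$(repo_syncsum "$local_base" "$repo")" != "$(repo_syncsum "$remote_base" "$repo")" ]; then
            printf '%s\n' "$repo"
        fi
    done
}
```

A mirror following this scheme would run rsync only for the repos printed by changed_repos, and not at all when the output is empty.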
On Tue, 13 Apr 2010 20:36:49 +0200, Florian Pritz <bluewind@xssn.at> wrote:
> http://karif.server-speed.net/~flo/tmp/mirrorsync.sh.txt
> C&C welcome :)
Some comments here:

* There is no public HTTP master.
* Mirrors might only have incoming rsync access (firewall).
* The script tends to be complicated and relies on local data; things might go wild here and we have no control. Pure rsync is a lot more fail-proof.
* The repo might change without the db files being modified; e.g. ISOs, sources, old packages get removed etc.
* If mirrors queried us every minute, there is a high chance that our 12 rsync slots would always be in use.
* Mirrors should never sync single repos but use our rsync modules instead. (This simplifies your script a lot.)

My idea was to just sync all db files first, as I do in https://git.archlinux.de/repo-tools.git/tree/syncrepo

This ensures that repos are almost always consistent. This way all the problems we regularly have when core and extra are not in sync are gone. On the client side pacman will either try the next mirror if a file cannot be found or just refuse to update at all.

And if we somehow manage to check whether the db sync has changed anything, we might even implement your approach here without an HTTP master and without the need to store data locally.

In short: we should keep it simple.

-- 
Pierre Schmitz, https://users.archlinux.de/~pierre
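
The "db files first" idea can be sketched as two rsync passes. This is an illustration under stated assumptions, not a copy of the syncrepo script linked above: the function names, the .db.tar.gz pattern and the choice of rsync options are guesses, and the RSYNC variable exists only so the commands can be printed instead of executed.

```shell
#!/bin/sh
# Illustrative two-pass sync; function names and options are assumptions.
# Override RSYNC (e.g. RSYNC=echo) to print the commands instead of running them.
RSYNC="${RSYNC:-rsync}"

# Pass 1: descend into every directory but transfer only database files,
# so all repos switch to their new package lists at nearly the same time.
sync_databases() {
    "$RSYNC" -rtl --include='*/' --include='*.db.tar.gz' --exclude='*' "$1" "$2"
}

# Pass 2: full sync of the whole module; files that vanished on the
# source are deleted only after the transfer has finished.
sync_everything() {
    "$RSYNC" -rtlH --delete-after --delay-updates "$1" "$2"
}

# Run both passes in order; skip the full sync if the first pass fails.
sync_repo() {
    sync_databases "$1" "$2" && sync_everything "$1" "$2"
}
```

It would be called with a source and a target, for example sync_repo rsync://mirror.example/ftp/ /srv/ftp/, where both values are placeholders. Because the whole module is synced in each pass, no per-repo logic or locally stored state is needed.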