[arch-mirrors] CDN based/caching mirror?

Kristian Klausen kristian at klausen.dk
Thu Feb 27 20:34:34 UTC 2020


On 26.01.2020 17.19, Kristian Klausen via arch-mirrors wrote:
> Hi
>
> I'm considering setting up an Arch Linux mirror, and I'm considering a 
> different design.
>

Hi

I just got time to implement this and the setup looks like this:
Cloudflare -> Cloudflare Workers -> Backblaze B2 bucket <- Tier1 mirror

The files are synced from mirror.ams1.nl.leaseweb.net to the Backblaze B2 
bucket every hour, and they are fetched from the bucket with the help of 
a Cloudflare Workers script.
Cloudflare is configured to cache everything (size <=2GB*): database 
files are cached for 5 minutes, and everything else is cached for 24 hours.
* CF is sponsoring a plan with a higher limit than the 512MB default
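
To illustrate the caching rule, here is a minimal TypeScript sketch. It is 
not the actual Workers script. The function name cacheTtlSeconds and the 
pattern used to recognise "database files" (core.db, core.files, their 
compressed variants and .sig files) are assumptions made for the example.

```typescript
// Sketch of the caching rule described above; not the deployed Worker.
const DB_TTL_SECONDS = 5 * 60;            // database files: 5 minutes
const DEFAULT_TTL_SECONDS = 24 * 60 * 60; // everything else: 24 hours

// Assumption: "database files" means the repo databases and their
// signatures, e.g. core.db, core.files, core.db.tar.gz, core.db.sig.
const DB_FILE = /\.(db|files)(\.tar\.(gz|xz|zst))?(\.sig)?$/;

// Hypothetical helper: pick the cache lifetime for a requested path.
function cacheTtlSeconds(path: string): number {
  return DB_FILE.test(path) ? DB_TTL_SECONDS : DEFAULT_TTL_SECONDS;
}
```

In a Worker, a value like this could probably be handed to the subrequest 
through the cf.cacheTtl fetch option, but that wiring is not shown here.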

I have done some quick testing from Europe. Time to first byte isn't 
impressive (at least not when downloading from Europe), but the speed is 
acceptable: 80-100MB/s is achievable if the file is cached, and 8-12MB/s 
if it is not.

To make it easier to implement, I took some shortcuts:
* Directory listing isn't implemented
* "latest" files aren't synced
* Only packages in "pool/" are synced. The package files in the individual 
repos aren't synced, but if you request a package 
(\.pkg\.tar\.(xz|zst)(|.sig)$) it is automatically retrieved from the pool/ 
directory. This means that you can download e.g. Firefox from both:
https://archlinux.amirror.xyz/extra/os/x86_64/firefox-73.0.1-1-x86_64.pkg.tar.zst
https://archlinux.amirror.xyz/community/os/x86_64/firefox-73.0.1-1-x86_64.pkg.tar.zst
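
The pool/ shortcut can be sketched roughly as follows in TypeScript. This 
is only an illustration, not the deployed script. The package pattern is 
the one quoted above; the function name bucketKeys and the two pool 
subdirectories (pool/packages/ and pool/community/) are assumptions about 
how the bucket is laid out.

```typescript
// Package pattern as quoted in the list above.
const PACKAGE_FILE = /\.pkg\.tar\.(xz|zst)(|.sig)$/;

// Assumption: the bucket follows the usual mirror layout with these two
// pool directories. The real script may resolve the pool differently.
const POOL_DIRS = ["pool/packages/", "pool/community/"];

// Hypothetical helper: map a request path to the bucket keys to try, in order.
function bucketKeys(path: string): string[] {
  const key = path.replace(/^\/+/, ""); // drop leading slashes
  if (!PACKAGE_FILE.test(key)) {
    return [key]; // not a package: fetch the path as requested
  }
  // Packages: ignore the repo part of the path and keep only the file name.
  const fileName = key.slice(key.lastIndexOf("/") + 1);
  return POOL_DIRS.map((dir) => dir + fileName);
}
```

Because the repo part of the path is dropped, the extra/ and community/ 
URLs above resolve to the same candidate keys, which is why both work.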

I'm not sure if the shortcuts are acceptable, but they can be fixed if 
they are an issue.

Also, please note that archive, other and sources aren't synced.

Feel free to try it out: https://archlinux.amirror.xyz/

Best regards
Kristian Klausen

> So instead of mirroring the whole thing, the idea is to mirror only 
> the database files (core.db etc) and download the packages on demand 
> from a Tier 1 mirror (and let nginx cache them). By doing it that way, 
> I only download requested packages from the Tier 1 mirrors, instead of 
> downloading the whole thing (saving Tier 1 bandwidth).
>
> To provide even better performance, a CDN (e.g. Cloudflare) could be 
> used to provide more caching. So we end up with a setup like this:
> Cloudflare -> Nginx cache -> Tier1 mirrors (nginx with multiple upstream)
>
> Am I missing something? Is this a bad idea?
> If I do set up a mirror like that, is there any chance it could be 
> added as an official mirror?
>
> Best regards
> Kristian Klausen

