On 26.01.2020 21.52, David Precious wrote:
On Sun, 26 Jan 2020 17:19:10 +0100 Kristian Klausen via arch-mirrors firstname.lastname@example.org wrote:
So instead of mirroring the whole thing, the idea is to mirror only the database files (core.db etc) and download the packages on demand from a Tier 1 mirror (and let nginx cache them). By doing it that way, I only download requested packages from the Tier 1 mirrors, instead of downloading the whole thing (saving Tier 1 bandwidth).
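As a rough illustration of the proposal above (in practice nginx's proxy cache would do this job), here is a minimal Python sketch of the serving logic. The names `serve`, `is_database_file`, and `fetch_upstream` are hypothetical, and treating only `.db` and `.files` as database files is an assumption made for the example.

```python
import tempfile
from pathlib import Path
from typing import Callable


def is_database_file(path: str) -> bool:
    """Assumption: repo databases are the files ending in .db or .files."""
    name = path.rsplit("/", 1)[-1]
    return name.endswith((".db", ".files"))


def serve(path: str, root: Path, fetch_upstream: Callable[[str], bytes]) -> bytes:
    """Return the file for `path`, fetching packages from upstream on a cache miss."""
    relative = Path(path.lstrip("/"))
    if ".." in relative.parts:
        raise ValueError("path must not escape the mirror root")
    local = root / relative
    if local.is_file():
        # Synced database file or a previously cached package.
        return local.read_bytes()
    if is_database_file(path):
        # Databases are mirrored by the regular sync, never fetched on demand.
        raise FileNotFoundError(path)
    data = fetch_upstream(path)  # first request: pull from the Tier 1 mirror
    local.parent.mkdir(parents=True, exist_ok=True)
    local.write_bytes(data)      # later requests are served from disk
    return data


if __name__ == "__main__":
    calls = []

    def fake_tier1(path: str) -> bytes:
        calls.append(path)
        return b"package bytes"

    with tempfile.TemporaryDirectory() as tmp:
        serve("/core/os/x86_64/example.pkg.tar.zst", Path(tmp), fake_tier1)
        serve("/core/os/x86_64/example.pkg.tar.zst", Path(tmp), fake_tier1)
    print(len(calls))  # 1: the second request never reaches upstream
```

The sketch ignores cache expiry and concurrent requests for the same file, both of which a real setup would have to handle.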
I'm not quite sure what problem you're trying to solve - tier 1 servers have plenty of bandwidth, otherwise they shouldn't be running such a mirror, and I'd wager that downstream mirrors syncing occasionally pales in comparison to end user traffic, so I don't think you need to really worry about the upstream.
If your concern is *your* bandwidth or disk space, then you probably shouldn't be setting up a public mirror at all - assuming, of course, that it is a public mirror you're talking about here, and not just an internal network cache to point your boxes at so that you only download each package once, not once for every machine.
To provide even better performance, a CDN (e.g. Cloudflare) could be used to provide more caching.
Others have already pointed out that this may break Cloudflare's terms, as the service is designed to optimise websites by hosting HTML/JS.
Am I missing something? Is this a bad idea?
Immediate thought is that the first request for each package could seem unacceptably slow, as your mirror would have to fetch it before it could serve it to the client. For larger packages the delay would be noticeable (especially if also doing that for ISOs, etc).
Valid point, that could in theory be fixed by downloading from multiple servers in parallel. It would require a more complex setup, but in theory it could be done.
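One way the parallel download could work is to split a file into byte ranges and request each range from a different mirror with an HTTP `Range` header. The sketch below covers only the splitting step. `split_ranges` and `range_header` are hypothetical helpers, and it assumes the file size is known in advance and that every mirror honours range requests.

```python
from typing import List, Tuple


def split_ranges(size: int, parts: int) -> List[Tuple[int, int]]:
    """Split `size` bytes into up to `parts` contiguous inclusive (start, end) ranges."""
    if size <= 0 or parts <= 0:
        raise ValueError("size and parts must be positive")
    parts = min(parts, size)  # never create empty ranges
    base, extra = divmod(size, parts)
    ranges = []
    start = 0
    for i in range(parts):
        # The first `extra` ranges take one additional byte each.
        length = base + (1 if i < extra else 0)
        ranges.append((start, start + length - 1))
        start += length
    return ranges


def range_header(start: int, end: int) -> str:
    """Format an inclusive byte range as an HTTP Range header value."""
    return f"bytes={start}-{end}"


if __name__ == "__main__":
    for start, end in split_ranges(10, 3):
        print(range_header(start, end))
    # bytes=0-3
    # bytes=4-6
    # bytes=7-9
```

Each range would then be fetched from a different upstream and the pieces concatenated in order, which is where the extra complexity mentioned above comes in.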
It also means that if your upstream is temporarily down, you have an incomplete mirror which appears reachable but fails to serve some files, which is probably not ideal.
The idea was to fall back to another mirror on errors/404.
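A minimal sketch of that fallback, assuming an ordered list of upstream mirrors: try each in turn and move on when one errors out. `fetch_with_fallback`, `UpstreamError`, and the example.org hosts are hypothetical. The default fetcher uses `urllib`, whose `HTTPError` (including 404) and `URLError` are both subclasses of `OSError`.

```python
import urllib.request
from typing import Callable, Sequence


class UpstreamError(Exception):
    """Raised when no upstream mirror could serve the file."""


def http_get(url: str) -> bytes:
    """Default fetcher: plain HTTP GET with a timeout."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return response.read()


def fetch_with_fallback(
    path: str,
    mirrors: Sequence[str],
    fetch: Callable[[str], bytes] = http_get,
) -> bytes:
    """Try each mirror in order; return the first successful response body."""
    errors = []
    for base in mirrors:
        url = base.rstrip("/") + "/" + path.lstrip("/")
        try:
            return fetch(url)
        except OSError as exc:  # covers HTTP errors such as 404, timeouts, refused connections
            errors.append(f"{base}: {exc}")
    raise UpstreamError(f"all mirrors failed for {path}: {errors}")


if __name__ == "__main__":
    # Hypothetical mirror URLs, for illustration only.
    mirrors = ["https://mirror-a.example.org/archlinux", "https://mirror-b.example.org/archlinux"]
    try:
        fetch_with_fallback("/core/os/x86_64/core.db", mirrors)
    except UpstreamError as exc:
        print(exc)
```

Note that this only helps when the fallback mirror actually has the file. If all upstreams are unreachable, the mirror is still incomplete for anything not yet cached.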
To me, it feels rather like you're trying to solve a problem which doesn't really exist.
Roger that, it was just a "crazy" idea to run a mirror without mirroring everything (requiring less storage) and with a CDN in front (like deb.debian.org), but as the Arch project seems to have more than enough mirrors, the idea doesn't make sense. Thanks for your time everyone!