On Sun, 26 Jan 2020 17:19:10 +0100 Kristian Klausen via arch-mirrors <arch-mirrors@archlinux.org> wrote:
So instead of mirroring the whole thing, the idea is to mirror only the database files (core.db etc) and download the packages on demand from a Tier 1 mirror (and let nginx cache them). By doing it that way, I only download requested packages from the Tier 1 mirrors, instead of downloading the whole thing (saving Tier 1 bandwidth).
I'm not quite sure what problem you're trying to solve. Tier 1 servers have plenty of bandwidth (otherwise they shouldn't be running such a mirror), and I'd wager that the occasional sync from downstream mirrors pales in comparison to end-user traffic, so I don't think you really need to worry about the upstream. If your concern is *your* bandwidth or disk space, then you probably shouldn't be setting up a public mirror at all. That assumes, of course, that it is a public mirror you're talking about here, and not just an internal network cache to point your boxes at so that you download each package once rather than once per machine.
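To make the internal-cache case concrete, here is a rough sketch of the "download each package once" behaviour. It is only a toy model in Python, not an nginx configuration: the `PullThroughCache` class, the `fake_upstream` function and the package path are all made up for illustration, and a real setup would fetch over HTTP and store files on disk.

```python
from typing import Callable, Dict


class PullThroughCache:
    """Toy model of a cache that fetches each file from upstream at most once."""

    def __init__(self, fetch_upstream: Callable[[str], bytes]) -> None:
        # fetch_upstream is a hypothetical callable standing in for an
        # HTTP GET against the upstream mirror.
        self._fetch_upstream = fetch_upstream
        self._store: Dict[str, bytes] = {}
        self.upstream_fetches = 0

    def get(self, path: str) -> bytes:
        if path not in self._store:
            # Cache miss: the requesting client waits for the upstream download.
            data = self._fetch_upstream(path)
            self._store[path] = data
            self.upstream_fetches += 1
        # Cache hit (or freshly filled entry): served locally.
        return self._store[path]


def fake_upstream(path: str) -> bytes:
    # Stand-in for the real upstream mirror; returns dummy content.
    return f"contents of {path}".encode()


cache = PullThroughCache(fake_upstream)

# Five machines request the same (hypothetical) package.
for _machine in range(5):
    cache.get("core/os/x86_64/example-1.0-1-x86_64.pkg.tar.zst")

print(cache.upstream_fetches)
```

Only the first of the five requests reaches the upstream; the other four are answered from the local store, which is the whole benefit for a group of machines on one network.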
To provide even better performance a CDN (ex: Cloudflare) could be used to provide more caching.
Others have already pointed out that this may break Cloudflare's terms, as the service is designed to optimise websites by hosting HTML/JS.
Do I miss something? Is this a bad idea?
My immediate thought is that the first request for each package could be unacceptably slow, as your mirror would have to fetch it from upstream before it could serve it to the client. For larger packages that delay becomes noticeable (especially if you also do this for ISOs, etc). It also means that if your upstream is temporarily down, you have an incomplete mirror which appears reachable but fails to serve some files, which is probably not ideal.

To me, it feels rather like you're trying to solve a problem which doesn't really exist.

Cheers
Dave P