Hey!
The logs from my mirror (https://arch.jensgutermuth.de/) show
that at least Google and Ahrefs are crawling it, which is
a waste of resources on both sides. I'm thinking
about blocking them (and all other crawlers) using a robots.txt
file like so (nginx config snippet):
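A minimal sketch of such a snippet, assuming the idea is to answer requests for /robots.txt directly from the nginx configuration instead of from a file in the mirror tree (the exact directives here are an assumption and may differ from what ends up in the real config):

```nginx
# Hypothetical sketch: serve robots.txt from the config itself,
# so no file has to exist inside the synced mirror directory.
location = /robots.txt {
    # Sent as plain text so crawlers parse it as a robots file.
    default_type text/plain;
    # Ask all well-behaved crawlers to stay away from the whole site.
    return 200 "User-agent: *\nDisallow: /\n";
}
```

The block would go inside the `server { ... }` section that serves the mirror. Because it is an exact-match location, it should not affect any other path.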
Doing it this way keeps robots.txt out of the
directory listings and avoids any issues with the sync
script, since no file is added to the mirror tree.
I know modifying mirror contents is a very touchy subject, and
rightfully so. I therefore wanted to ask whether there is a
policy on this and, if so, whether this would be allowed or could
be granted as an exception.
Best regards
Jens Gutermuth