Hello everyone, I'm getting a LOT of traffic from AI scrapers (to the point that my mirror is becoming unusable). I heard about a project called Anubis. Would this work on a mirror? Kind regards, Arnold Dechamps
On 5/11/25 11:14 PM, Arnold DECHAMPS wrote:
Hello everyone,
I'm getting a LOT of traffic from AI scrapers (to the point that my mirror is becoming unusable). I heard about a project called Anubis. Would this work on a mirror?
Kind regards,
Arnold Dechamps
Hi, I didn't realize AI scrapers could affect mirrors as well. Do you happen to know which specific locations they hammer, or is it just the sheer number of requests? So far I'm not aware of anyone trying Anubis on a mirror, or of how pacman would react to it. Regards, Arun
Hey, Anubis requires a relatively modern browser, and pacman won't quite do. You could exclude pacman's user agent, I guess? What makes you think you're getting AI scrapers, by the way? I doubt they'd be interested in binary files (like large compressed packages). Best, ave
Hey, It should work fine actually. The funny thing about how Anubis works is that it only triggers challenges for User-Agents containing the string "Mozilla", which the AI scrapers' requests do because they try to pass as regular web browser traffic.
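The rule described above can be sketched in a few lines of Python. This is only an illustration of the substring check as stated in this message, not Anubis's actual code or configuration; the function name `would_challenge` and the sample User-Agent strings are assumptions made up for the example.

```python
def would_challenge(user_agent: str) -> bool:
    """Return True if a request with this User-Agent would be challenged.

    Assumption: the rule is a plain substring match on "Mozilla", as
    described in the message above. Real Anubis policies are configurable.
    """
    return "Mozilla" in user_agent


# Illustrative User-Agent strings (assumed, not taken from real mirror logs).
samples = {
    "browser-like scraper": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36",
    "pacman": "pacman/7.0.0 (Linux x86_64) libalpm/15.0.0",
    "curl": "curl/8.7.1",
}

for name, ua in samples.items():
    action = "challenge" if would_challenge(ua) else "pass through"
    print(f"{name}: {action}")
```

Under that assumption, a client such as pacman that does not claim to be a browser never sees a challenge, while anything sending a browser-style User-Agent does.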
Hey, Great to hear that this is a solution. Thank you very much. For now, my mirror has been running IPv6-only for a few minutes, since their scrapers only work over legacy IP (IPv4). Most of the legacy traffic in my region is spam anyway. But Anubis might then become the permanent solution (I guess the scrapers will start using the real internet at some point). Kind regards, Arnold Dechamps On 5/11/25 11:29 PM, Johannes Löthberg wrote:
Hey,
It should work fine actually.
The funny thing about how Anubis works is that it only triggers challenges for User-Agents containing the string "Mozilla" which the AI scraper requests do because they try to pretend to be regular web browser traffic.
Oops, I meant to send the email to arch-mirrors. Basically, I was saying that the user-agent blocking solution works well for my mirror. I would be really interested, though, to see what kind of performance impact Anubis would have on the mirrors, since the user-agent solution might not work forever.
Hello everyone, To whom it may concern: Anubis is now running on my mirror. It works like a charm with pacman! Good to know in case this comes up again. Kind regards, Arnold Dechamps
On 5/12/25 12:14 AM, Arnold DECHAMPS wrote:
Hello Everyone,
To whom it may concern: Anubis is now running on my mirror. It works like a charm with pacman! Good to know in case this comes up again.
Kind regards,
Arnold Dechamps
Which mirror is this exactly? So I may test it as well. Regards, Arun
Hello, My mirror is https://mirror.tiguinet.net/arch Kind regards, Arnold Dechamps On 5/12/25 12:17 AM, pitastrudl wrote:
Which mirror is this exactly? So I may test it as well.
Regards, Arun
participants (5)
- almirror@ave.zone
- Arnold DECHAMPS
- Franscobec
- Johannes Löthberg
- pitastrudl