About the wiki's CAPTCHA system
Hi all,
(Apologies if this is not the correct place to post this; I didn't see anywhere more appropriate.)
I recently tried to sign up to the Arch wiki, but I was met by the following CAPTCHA:
What is the output of: pacman -V|base32|head -1
I had some issues with this that I'd like to share. While I get the reasons this captcha system was put in place (to prevent spam and to ensure the user is currently using an Arch system), I don't believe the current captcha is a good solution.
First of all, my problem is with the requirement to verify that the user is currently using Arch. The Arch Wiki is a very popular resource in the greater Linux community, as a lot of its content applies to software that is commonly seen on other distributions. For example, I wanted to make a change (prompted by something I encountered while configuring an Arch system) that was useful to everyone who uses systemd, which any Ubuntu, Debian, Fedora, etc. user may have found on the Arch wiki. I think (I'm not sure, anyone is welcome to prove me wrong) that the small number of users trying to post good-faith but off-topic, non-Arch-related content should be stopped by having their changes undone. Meanwhile, people who are Arch Linux users may still fail the challenge (e.g. by using a non-Arch system at the time, or an Arch system with a different version of pacman).
However, my second, larger problem is that it doesn't seem to be a very good spam prevention mechanism. The CAPTCHA seems to be the same for all users, and changes very infrequently. pacman versions 6.0.1 (released in September 2021, according to https://archlinux.org/pacman/#_releases) and 6.0.2 (released in November 2022, according to https://archlinux.org/packages/core/x86_64/pacman/) were released over a year apart, so any spammer could hard-code the answer to the captcha challenge for the Arch Wiki and post spam for many months.
Determined spammers could even write a system to run the command inside an Arch Linux container and cache the result until the challenge stops working.
Ideally, CAPTCHA systems should be:
- hard for computers
- easy for humans
- accessible
- unique for every request
The current CAPTCHA system fails every one of those requirements. I'd suggest one of the following: using a more common captcha type (such as Google reCAPTCHA, hCaptcha or a generic maths challenge); switching to more human-friendly, easily Googleable, but still domain-specific questions (e.g. "in which year was the Arch Linux project founded"); or, less ideally, using commands that always produce the same output (e.g. "uname | base64" should always produce "TGludXgK" on Linux).
Thank you.
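For readers who want to see what the pipelines in this message compute, here is a minimal Python sketch. It assumes the GNU coreutils behaviour of wrapping encoded output at 76 columns, so `head -1` keeps at most the first 76 characters. The helper names are made up for illustration, and the pacman input is a placeholder, not real `pacman -V` output.

```python
import base64

# GNU coreutils `base32` and `base64` wrap their output at 76 columns by default.
WRAP_COLUMNS = 76


def base32_head1(data: bytes) -> str:
    """Approximate `<command> | base32 | head -1` for the given command output."""
    encoded = base64.b32encode(data).decode("ascii")
    return encoded[:WRAP_COLUMNS]


def base64_head1(data: bytes) -> str:
    """Approximate `<command> | base64 | head -1` for the given command output."""
    encoded = base64.b64encode(data).decode("ascii")
    return encoded[:WRAP_COLUMNS]


# `uname` prints "Linux" followed by a newline on Linux systems.
print(base64_head1(b"Linux\n"))  # TGludXgK

# Placeholder input only: this is NOT real `pacman -V` output.
print(base32_head1(b"hypothetical pacman -V output\n"))
```

Because the result is a pure function of the command's output, every user with the same pacman version gets the same answer, which is the property the message above objects to.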
On Sat, 21 Jan 2023 at 15:17, me@foxt.dev <me@foxt.dev> wrote:
The current CAPTCHA system fails every one of those requirements, I'd suggest either using a more common captcha type (such as Google reCAPTCHA, hCaptcha or a generic maths challenge), switching to more human friendly, easily Googleable, but still domain specific questions
On the one hand, you say to make it more difficult, on the other hand, you say to make it easier. Obviously, this captcha is just to weed out the barest minimum of script kiddies.
Like, changing your SSH port isn't the same as securing your system, but it gets rid of 99% of the automated attacks. This current captcha also works great for people with visual impairments as the commands they need to run are all text based. FWIW: I'm not in charge of any decisions here, just giving my opinion
Hello,
On the one hand, you say to make it more difficult, on the other hand, you say to make it easier. Obviously, this captcha is just to weed out the barest minimum of script kiddies.
99% of botters are just script kiddies, so this is very effective xD
Like, changing your SSH port isn't the same as securing your system, but it gets rid of 99% of the automated attacks.
Yup, I used to argue against this until I realised how many script kiddies it got rid of, instantly changed my mind about the idea. But I would like to highlight it does not improve security, it just gets rid of annoying spammers (well some of them).
This current captcha also works great for people with visual impairments as the commands they need to run are all text based.
The Arch Linux community includes many neurodivergent and disabled people, who are often overlooked. It would be a disservice to them if we made this harder for them.
FWIW: I'm not in charge of any decisions here, just giving my opinion
Only the Admins are, so I guess keep them on your good side (I already failed this so it's all over for me).
--
Polarian
GPG signature: 0770E5312238C760
Website: https://polarian.dev
JID/XMPP: polarian@polarian.dev
Hello,
Please do not promote proprietary captchas. The command used for verification is far from perfect, but it is better than nothing.
and I had some issues with this that I'd like to share, while I get the reasons this captcha system was put in place (to prevent spam & to ensure the user is currently using an Arch system), however I don't believe the current captcha is a good solution.
It is not to ensure a user is on an Arch-based system; it uses the pacman version and converts it to base32, and you can do this on any system (Linux-based, of course). And why would you sign up for Arch Linux if you were not using the distribution? That seems counterproductive in my eyes.
First of all, my problem is with the requirement to verify the user is currently using Arch, as the Arch Wiki is a very popular resource in the greater Linux community as a lot of resources apply to software that is commonly seen on other distributions, for example, I wanted to make a change (which I encountered configuring an Arch system) that was useful to everyone who used systemd, anyone, be it an Ubuntu, Debian, Fedora, etc user may have found on the Arch wiki. While I think (I'm not sure, anyone is welcome to prove me wrong), the little number of users trying to post good faith, however, off topic, non-Arch related content, should be stopped by having their changes undone, and people who are Arch Linux users may still fail the challenge (i.e. using a non-arch system at the time, using an Arch system with an different version of Pacman.)
Again, you do not need to use Arch Linux to run that command; the pacman package manager is shipped on other distributions. You do not need an account in order to read the ArchWiki, but accounts are specifically intended for the Arch community, so this is not a problem.
However, my second larger problem is that it doesn't seem that it'd be a very good spam prevention mechanism. The CAPTCHA seems to be the same for all users, and changes very infrequently. (pacman version 6.0.1 (according to https://archlinux.org/pacman/#_releases, released in September 2021) and 6.0.2 (according to https://archlinux.org/packages/core/x86_64/pacman/, released in November 2022) were released over a year apart, so any spammer could define the captcha challenge for the Arch Wiki and post spam for many months.
No captcha is perfect; reCAPTCHAs can be broken by a bot within 2-3 seconds, and this has been the case for a while. The AUR has been hit multiple times by bots, and the TUs are very good at sorting this out, so you don't need to worry!
Determined spammers could even write a system to run the command inside a Arch Linux container and cache it until the challenge does not work.
This is a lot of work; the idea is to make it harder for someone to use a bot to hit the ArchWiki. Bear in mind the ArchWiki supports rollbacks, so any pages which are modified or damaged can be rolled back immediately and the account can be suspended.
Ideally, CAPTCHA systems should be - hard for computers - easy for humans - accessible - unique for every request.
No captcha does this currently. I always fail captchas because of human error or because of stupid things. It's hard to distinguish what a computer does from what a human does; the best way is to track cursor movements, as these are very random for human beings, but the issue is that this is privacy invasive. Arch Linux has tried to put basic protection against bots in place without breaking your privacy. Please do not complain about this, because I and many other people here value our privacy a lot, and the last thing we want is for a proprietary, data-hogging captcha company to be integrated into the ArchWiki.
The current CAPTCHA system fails every one of those requirements, I'd suggest either using a more common captcha type (such as Google reCAPTCHA, hCaptcha or a generic maths challenge), switching to more human friendly, easily Googleable, but still domain specific questions (i.e. "in which year was the Arch Linux project founded"), less ideal, but you could use commands that always produce the same output (i.e. "uname | base64" should always produce "TGludXgK" on Linux)
As I said above, this is privacy invasive, and if this ever happened to the ArchWiki, I would lose all support for Arch Linux and jump distribution, and I know a lot of others who would too. Your suggestion will divide and destroy the community. I understand you are concerned about the security against bots, but privacy comes first, and this is not the solution!
Thanks,
--
Polarian
GPG signature: 0770E5312238C760
Website: https://polarian.dev
JID/XMPP: polarian@polarian.dev
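As a point of reference for the "generic maths challenge" mentioned in the quoted suggestion, such a challenge can be generated and checked locally, with no third-party service involved. The following is a minimal sketch, not anything the ArchWiki is known to use. The function names, the signed-token scheme and `SECRET_KEY` are all assumptions made for illustration.

```python
import hashlib
import hmac
import secrets

# Hypothetical server-side secret; a real deployment would load it from configuration.
SECRET_KEY = secrets.token_bytes(32)


def _sign(nonce: str, answer: int) -> str:
    """Bind the expected answer to a per-request nonce with an HMAC."""
    message = f"{nonce}:{answer}".encode()
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def new_challenge() -> tuple[str, str, str]:
    """Return (question, nonce, signature) for a fresh random addition problem."""
    a = secrets.randbelow(90) + 10  # two-digit operands, 10..99
    b = secrets.randbelow(90) + 10
    nonce = secrets.token_hex(8)
    return f"What is {a} + {b}?", nonce, _sign(nonce, a + b)


def check_answer(nonce: str, signature: str, submitted: str) -> bool:
    """Verify a submitted answer against the signature issued with the question."""
    try:
        value = int(submitted.strip())
    except ValueError:
        return False
    return hmac.compare_digest(_sign(nonce, value), signature)
```

This only makes the question different for every request. A script can still solve simple arithmetic, and the sketch does not stop a solved challenge from being replayed, so it illustrates the "unique for every request" point rather than a strong defence.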
First of all, my problem is with the requirement to verify the user is currently using Arch, as the Arch Wiki is a very popular resource in the greater Linux community as a lot of resources apply to software that is commonly seen on other distributions, for example, I wanted to make a change (which I encountered configuring an Arch system) that was useful to everyone who used systemd, anyone, be it an Ubuntu, Debian, Fedora, etc user may have found on the Arch wiki. While I think (I'm not sure, anyone is welcome to prove me wrong), the little number of users trying to post good faith, however, off topic, non-Arch related content, should be stopped by having their changes undone, and people who are Arch Linux users may still fail the challenge (i.e. using a non-arch system at the time, using an Arch system with an different version of Pacman.)
Arch wiki is widely used by the entire community, but it is addressed to Arch users and is meant to contain information relevant to Arch.⁽¹⁾ Therefore changes must be tested on actual Arch, which implies that the editor must have one at hand to solve the puzzle.
I truly appreciate the will to help and I am certain other Archers do too. But be aware that providing information that doesn’t work on Arch is counterproductive and causes trouble to people seeking help. There are some minor exceptions, like housekeeping activities. If one wishes to do that and is open about the situation, help in solving the captcha was always given in #archlinux@Libera. But please note the previous paragraph.
However, my second larger problem is that it doesn't seem that it'd be a very good spam prevention mechanism. The CAPTCHA seems to be the same for all users, and changes very infrequently. (pacman version 6.0.1 (according to https://archlinux.org/pacman/#_releases, released in September 2021) and 6.0.2 (according to https://archlinux.org/packages/core/x86_64/pacman/, released in November 2022) were released over a year apart, so any spammer could define the captcha challenge for the Arch Wiki and post spam for many months.
They are never perfect. It is not their purpose to be. They should only *limit* untargeted attacks. Empirical data shows that even the simplest solutions, like “Put number 42 in the next field”, have so far been effective.
Determined spammers could even write a system to run the command inside a Arch Linux container and cache it until the challenge does not work.
Determined attackers can do anything, short of breaking the laws of physics. The most persistent cases are known to relentlessly continue disruptive activity manually, even constantly obtaining new IP ranges, for over a decade.
While desired and, whenever possible, appreciated, being absolutely perfect is never the goal in security. The goal is balancing risks and costs: minimizing the adversary’s success rate by addressing issues that actually occur.
____
⁽¹⁾ https://wiki.archlinux.org/title/ArchWiki:About#Goals
participants (4)
- Andy Pieters
- me@foxt.dev
- mpan
- Polarian