About the wiki's CAPTCHA system

21 Jan 2023

Hi all,

(Apologies if this is not the correct place to post this; I didn't
see anywhere more appropriate.)

I recently tried to sign up to the Arch wiki, but I was met by the
following CAPTCHA.

    What is the output of: pacman -V|base32|head -1

I had some issues with this that I'd like to share. While I get the
reasons this CAPTCHA system was put in place (to prevent spam and to
ensure the user is currently using an Arch system), I don't believe
the current CAPTCHA is a good solution.

First of all, my problem is with the requirement to verify that the
user is currently using Arch. The Arch Wiki is a very popular resource
in the greater Linux community, as a lot of its content applies to
software that is commonly seen on other distributions. For example, I
wanted to make a change (based on something I encountered while
configuring an Arch system) that would be useful to everyone who uses
systemd; any Ubuntu, Debian, Fedora, etc. user might have found it on
the Arch Wiki. I think (I'm not sure, anyone is welcome to prove me
wrong) that the small number of users trying to post good-faith but
off-topic, non-Arch-related content could be handled by having their
changes undone. Meanwhile, people who are Arch Linux users may still
fail the challenge (e.g. when using a non-Arch system at the time, or
an Arch system with a different version of pacman).

However, my second, larger problem is that it doesn't seem to be a
very good spam prevention mechanism. The CAPTCHA seems to be the same
for all users, and it changes very infrequently: pacman 6.0.1
(released in September 2021 according to
https://archlinux.org/pacman/#_releases) and 6.0.2 (released in
November 2022 according to
https://archlinux.org/packages/core/x86_64/pacman/) were released over
a year apart. Any spammer could therefore hard-code the answer to the
Arch Wiki's CAPTCHA challenge and post spam for many months.

Determined spammers could even write a system that runs the command
inside an Arch Linux container and caches the output until the answer
stops working.
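
To illustrate why the answer is effectively static, here is a rough
Python sketch of what "| base32 | head -1" computes. It assumes the
base32 in question is the GNU coreutils one, which wraps its output at
76 columns by default. The function name and the banner bytes are
placeholders I made up; the real "pacman -V" output is different.

```python
import base64


def first_base32_line(data: bytes, wrap: int = 76) -> str:
    """Approximate `... | base32 | head -1`: encode, keep the first wrapped line."""
    encoded = base64.b32encode(data).decode("ascii")
    return encoded[:wrap]


# Hypothetical stand-in for the bytes printed by `pacman -V`.
banner = b"\n .--.   Pacman vX.Y.Z - libalpm vA.B.C\n"

# The result depends only on the input bytes, not on the user or the request.
answer_today = first_base32_line(banner)
answer_next_month = first_base32_line(banner)
print(answer_today == answer_next_month)
```

As long as the installed pacman prints the same bytes, the expected
answer never changes between users or requests.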

Ideally, CAPTCHA systems should be
 - hard for computers
 - easy for humans
 - accessible
 - unique for every request.
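
As a minimal sketch of the "unique for every request" point only,
something like the following would hand out a fresh question each
time and accept each one once. The names (new_challenge, check_answer,
_pending) are hypothetical and not taken from any wiki software, and a
plain sum is obviously not "hard for computers"; it only shows the
per-request part.

```python
import secrets

# In-memory store of outstanding challenges: id -> expected answer.
_pending: dict[str, int] = {}


def new_challenge() -> tuple[str, str]:
    """Create a fresh question and return (challenge_id, question)."""
    a = secrets.randbelow(90) + 10  # two-digit operands
    b = secrets.randbelow(90) + 10
    challenge_id = secrets.token_hex(16)
    _pending[challenge_id] = a + b
    return challenge_id, f"What is {a} + {b}?"


def check_answer(challenge_id: str, answer: str) -> bool:
    """Check an answer; each challenge can be used only once."""
    expected = _pending.pop(challenge_id, None)
    if expected is None:
        return False  # unknown or already used challenge
    try:
        return int(answer.strip()) == expected
    except ValueError:
        return False  # not a number
```

Because the challenge is removed on the first attempt, a cached
answer from one request is useless for the next one.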

The current CAPTCHA system fails every one of those requirements. I'd
suggest either using a more common CAPTCHA type (such as Google
reCAPTCHA, hCaptcha or a generic maths challenge) or switching to more
human-friendly, easily Googleable, but still domain-specific questions
(e.g. "In which year was the Arch Linux project founded?"). Less
ideally, you could use commands that always produce the same output
(e.g. "uname | base64" should always produce "TGludXgK" on Linux).
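
For anyone who wants to check the "TGludXgK" value without a Linux
shell at hand, here is a small Python sketch. It assumes uname prints
"Linux" followed by a single newline, which is what ends up being
encoded.

```python
import base64

# `uname` prints the kernel name plus a trailing newline.
uname_output = b"Linux\n"

encoded = base64.b64encode(uname_output).decode("ascii")
print(encoded)  # TGludXgK
```

Note that the trailing newline matters: encoding "Linux" without it
gives "TGludXg=" instead.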

Thank you.

me＠foxt.dev
