[aur-general] Fighting spam on the AUR

Wed Mar 13 07:31:58 EDT 2013

On Wed, Mar 13, 2013 at 11:55:26AM +0100, Markus Unterwaditzer wrote:
> Lukas Fleischer <archlinux at cryptocrack.de> wrote:
> 
> >Status quo:
> >
> >    06:54 < gtmanfred> ok, it really is time for something else
> >    06:54 < gtmanfred> the spammer is now creating a new account for
> >    every comment and flag out of date
> >
> >The account suspension feature does not help here.
> >
> >Options:
> >
> >* Allow package maintainers to block the "Flag package out-of-date"
> >  feature for a certain amount of time. Note that this might eventually
> >  cripple the "out-of-date" function. Also, this does not work for
> >  comments.
> >
> >* Use CAPTCHAs during account registration. We could either use
> >  MAPTCHAs ("What is 1 + 1?") or something like reCAPTCHA [1].
> >
> >* Moderate new accounts. Might be a lot of work. We need some TUs that
> >  review and unlock accounts. Also, it might be hard to distinguish a
> >  spam bot from a regular user. If we require a short application text,
> >  this might result in fewer users joining the AUR.
> >
> >* Block IP addresses. Bye-bye, Tor users!
> >
> >Comments and suggestions welcome! We need to find a proper solution as
> >soon as possible!
> >
> >[1] http://www.google.com/recaptcha
> 
> Other options:
> 
> * Rate-limit repeated actions, e.g. you may not flag more than ten
>   packages within ten minutes. Also block comments with identical
>   content.
> 
> * Ability to report users (not sure whether this is already possible),
>   with an automatic ban after enough reports.
> 
> * "Buffering actions", aka a shadowban: once a user is reported, hold
>   their actions back until a moderator has reviewed the report.

None of these address our current issue. We already have an account
suspension feature, but it does not help if a new account is created
for every request.
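
To illustrate the point, here is a minimal sketch of the suggested
"ten flags within ten minutes" limit as a sliding-window counter. It is
not AUR code; the class name, the key, and the injectable clock are all
assumptions made for the example.

```python
import time
from collections import defaultdict, deque


class ActionRateLimiter:
    """Hypothetical sliding-window limiter: at most max_actions per
    window_seconds for each key (e.g. an account name)."""

    def __init__(self, max_actions=10, window_seconds=600, clock=time.monotonic):
        self.max_actions = max_actions
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested
        self._events = defaultdict(deque)  # key -> timestamps of recent actions

    def allow(self, key):
        """Record an action for key and return True, or return False
        (recording nothing) if the key is already at its limit."""
        now = self.clock()
        events = self._events[key]
        # Drop timestamps that have fallen out of the window.
        while events and now - events[0] >= self.window_seconds:
            events.popleft()
        if len(events) >= self.max_actions:
            return False
        events.append(now)
        return True
```

Keyed by account, every freshly registered account starts with an empty
window, so a spammer who signs up anew for each flag or comment never
reaches the limit.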

> 
> * Do whatever Reddit does, they seem to deal very well with spam.

I think they use a Bayesian filter and reports. Not sure if it is
worthwhile adding that to the AUR. Also, Bayes classifiers will not
prevent spammers from flagging packages out-of-date.
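
For reference, a Bayesian filter in its simplest form could look like
the sketch below: a naive Bayes classifier over comment words with
add-one smoothing. I do not know what Reddit actually runs; the class
and method names are made up for illustration.

```python
import math
from collections import Counter


class NaiveBayesSpamFilter:
    """Hypothetical two-class ("spam"/"ham") naive Bayes text filter."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.doc_counts = {"spam": 0, "ham": 0}

    @staticmethod
    def tokenize(text):
        # Deliberately crude: lowercase and split on whitespace.
        return text.lower().split()

    def train(self, text, label):
        """Add one labelled comment to the training data."""
        self.word_counts[label].update(self.tokenize(text))
        self.doc_counts[label] += 1

    def spam_probability(self, text):
        """Return the estimated probability that text is spam."""
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_docs = sum(self.doc_counts.values())
        log_scores = {}
        for label in ("spam", "ham"):
            # Smoothed class prior.
            score = math.log((self.doc_counts[label] + 1) / (total_docs + 2))
            total_words = sum(self.word_counts[label].values())
            for word in self.tokenize(text):
                count = self.word_counts[label][word]
                # Add-one smoothing; the extra +1 covers unseen words.
                score += math.log((count + 1) / (total_words + len(vocab) + 1))
            log_scores[label] = score
        # Normalise the log scores into a probability.
        top = max(log_scores.values())
        weights = {k: math.exp(v - top) for k, v in log_scores.items()}
        return weights["spam"] / (weights["spam"] + weights["ham"])
```

Such a filter needs text to score. An out-of-date flag is just an
action, so there is nothing for the classifier to look at.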

Please correct me if I am wrong in my assumptions.

> 
> -- Markus (from phone)