Mon, 18 Jun 2012
Antispam recommendations for MediaWiki (that are simple, and actually work, and permit anonymous editing)
I've had the honor of working with Will Kahn-Greene at the Participatory Culture Foundation recently. He works on the Miro desktop video player.
He also maintains the PCF's wiki. It runs MediaWiki. It was being spammed to smithereens. In this screenshot, you see WillKahnGreene's account deleting files and spam pages created by bots.
Before
After
As of about a week ago, all the bots can do is create user accounts. That means Will doesn't have to go blocking them and deleting the content they uploaded.
The exceedingly simple antispam strategy
You might be wondering how he did it. Here's what it took, in terms of policy:
- Permit anonymous edits to the wiki. (This is essential for WikiNature.)
- If you are uploading a file, you must be logged in with an account that has confirmed its email address.
- If you are adding a new URL to a page, you must be logged in with an account that has confirmed its email address.
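To make the policy concrete, here is a minimal Python sketch of the decision it describes. This is not Will's code and not MediaWiki's API; the `Editor` class and the `may_save` function are hypothetical names made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Editor:
    """Hypothetical stand-in for whoever is making the edit."""
    logged_in: bool = False
    email_confirmed: bool = False


def is_trusted(editor: Editor) -> bool:
    # "Trusted" here means: logged in with a confirmed email address.
    return editor.logged_in and editor.email_confirmed


def may_save(editor: Editor, uploads_file: bool = False,
             adds_new_url: bool = False) -> bool:
    # Uploads and new URLs require a confirmed account;
    # every other edit is open to everyone, including anonymous users.
    if uploads_file or adds_new_url:
        return is_trusted(editor)
    return True


anonymous = Editor()
confirmed = Editor(logged_in=True, email_confirmed=True)
```

Under this sketch, `may_save(anonymous)` is allowed, while `may_save(anonymous, adds_new_url=True)` is refused and `may_save(confirmed, adds_new_url=True)` is allowed.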
Will had the patience to listen to me and try a few of my ideas, most of which still let some spam through.
Then we came up with the idea of hacking some changes into the CAPTCHA plugin to enforce the above policy. MediaWiki has a permissions system (they call it "user rights"). We use the user-rights system to restrict file uploads, but crucially there is no built-in way to restrict who can add URLs.
So Will had to write some very simple code that, effectively, adds an addurl permission to MediaWiki. He did it by extending the ConfirmEdit extension. It's easy to install; you can find instructions on his project page. He wrote a blog post about it.
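The core of an "addurl" check is noticing when an edit introduces a URL that wasn't already on the page. The Python sketch below illustrates that idea only; it is not Will's implementation, and the regular expression and the names `added_urls` and `edit_allowed` are assumptions chosen for this example.

```python
import re

# Simplified URL pattern (an assumption): http(s) links up to the next
# whitespace, bracket, or angle bracket.
URL_RE = re.compile(r"https?://[^\s\[\]<>]+", re.IGNORECASE)


def extract_urls(wikitext: str) -> set:
    """Return the set of URLs that appear in a page's text."""
    return set(URL_RE.findall(wikitext))


def added_urls(old_text: str, new_text: str) -> set:
    """URLs present after the edit that were not there before it."""
    return extract_urls(new_text) - extract_urls(old_text)


def edit_allowed(old_text: str, new_text: str, has_addurl_right: bool) -> bool:
    # An edit goes through if it adds no new URLs,
    # or if the editor holds the (hypothetical) addurl right.
    return has_addurl_right or not added_urls(old_text, new_text)
```

Comparing against the old text matters: an anonymous user who fixes a typo on a page that already contains links isn't adding any URL, so the edit still goes through.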
As far as I can tell, anyone who runs a public MediaWiki install should configure the wiki this way.
Seriously.
If you look at the long, sprawling MediaWiki documentation page about restricting spam, it contains all sorts of nonsense. Ignore it. Just do what Will did.
Sat, 03 May 2008
reCAPTCHA
CAPTCHA is the name for programs designed to test whether they are being used by another computer (a "bot") or by a human. They do this by asking the user to do a task that presumably can't be done by a computer; for example, reading obscured words.
reCAPTCHA is a well-known CAPTCHA service that takes images from the Internet Archive's book scanning project. Some words are hard
But as for spam in MediaWiki, it seems that simply using the blacklists mentioned earlier is not enough; the Reed Free Culture wiki (for example) has been spammed beyond recognition with link spam. So I am deploying reCAPTCHA to show a CAPTCHA to users when they register and to anonymous users who try to add links.
P.S. Attentive people may consider a personal link I have to the Internet Archive's book scanning project. That has nothing to do with my liking reCAPTCHA. (-:
Tue, 22 Apr 2008
MediaWiki antispam: SpamBlacklist
I end up maintaining a bunch of MediaWiki wikis. So far, here is what I do to keep them low in spam, high in ham.
Note that I am biased toward accepting anonymous edits.
Use SpamBlacklist
Wikimedia maintains a list of bad domains that spammers link to. The famous chongqed.org maintains a similar list. The SpamBlacklist extension prevents saves that contain URLs matching patterns listed in a blacklist. Blocking this way is important even if anonymous edits are disallowed, because many bots seem to register for accounts. It is also important even if CAPTCHAs are enabled, because there seem to be spammers who sit at their computers and spam (or alternatively who solve CAPTCHAs and then let their bots run (not that I've ever done that....)).
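The matching idea can be sketched in a few lines of Python. This is a simplified illustration, not the extension's actual code: it assumes each non-comment line of a blacklist is a regular-expression fragment, and the sample entries and function names are made up.

```python
import re


def parse_blacklist(text: str) -> list:
    """Compile one pattern per line, ignoring blank lines and # comments."""
    patterns = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if line:
            patterns.append(re.compile(line, re.IGNORECASE))
    return patterns


def blocked_urls(urls, patterns) -> list:
    """Return the URLs that match any blacklist pattern."""
    return [u for u in urls if any(p.search(u) for p in patterns)]


# Made-up entries for illustration only.
SAMPLE = r"""
# made-up entries for illustration
cheap-pills\.example
casino-[0-9]+\.example
"""

patterns = parse_blacklist(SAMPLE)
bad = blocked_urls(
    ["http://cheap-pills.example/buy", "http://en.wikipedia.org/wiki/Ham"],
    patterns,
)
# A save would be refused whenever `bad` is non-empty.
```

Because the check looks only at the URLs in the edit, it applies equally to anonymous users, registered bots, and humans who have solved a CAPTCHA.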
To use it, just:
- Check it out of their svn
- Configure a cron job to get the Chongqed and MW blacklists locally, and configure $wgSpamBlacklistFiles as appropriate.
- Don't forget to read the official docs.
- Caution: The Chongqed list blocks lots of .edu domains. I "grep -v" them out.
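The "grep -v" step from the last bullet could also live in the script that the cron job runs. Here is a rough Python sketch of that filter, assuming the downloaded list is plain text with one pattern per line and `#` comments; the function name is made up, and fetching the lists is left out.

```python
def drop_edu_entries(blacklist_text: str) -> str:
    """Remove blacklist lines whose pattern mentions a .edu domain.

    Crude on purpose, like `grep -v`: any entry containing ".edu"
    (with regex backslashes ignored) is dropped.
    """
    kept = []
    for line in blacklist_text.splitlines():
        entry = line.split("#", 1)[0]          # ignore trailing comments
        if ".edu" in entry.replace("\\", ""):  # treat "\.edu" like ".edu"
            continue
        kept.append(line)
    return "\n".join(kept) + "\n"
```

The filtered text would then be written to one of the local files that $wgSpamBlacklistFiles points at.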