I have written many posts on my battles with WordPress comment spam, but all that appears to be coming to a very satisfactory conclusion. I am no longer using any comment spam plugins and I have stopped moderating comments on this blog.
How did I get to this enviable position? Well, it has been a long road and I have learned loads about WordPress along the way.
I started down this road by trying various comment spam plugins with different degrees of success. However, none were really satisfactory. The best one was WP-Hashcash – best in that it was most transparent to the user – but it requires commenters to have JavaScript turned on in their browser. So I kept looking for another strategy to eradicate this scourge from my blog.
I upgraded from WordPress 1.2 to WordPress 1.5 (the current version) – WordPress 1.5 has a number of anti-comment-spam features built in.
Of these, I have set the number of links allowed in comments to 3 – any more than that, and the comment is auto-moderated.
I have populated the blacklist with a short list of words (just over 40) – any comments containing these words are automatically deleted – boom! No notification to me, no notification to the commenter.
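The two built-in settings above can be pictured with a minimal sketch. This is not WordPress's own code, just an illustration of the rules as described: a blacklisted term marks the comment as spam, and more than three links sends it to moderation. The terms in `BLACKLIST` and the function name `classify_comment` are hypothetical.

```python
import re

MAX_LINKS = 3  # assumption: mirrors the "3 links" setting described above
BLACKLIST = ["poker", "casino"]  # hypothetical terms, not the real list

LINK_PATTERN = re.compile(r"https?://", re.IGNORECASE)


def classify_comment(text, blacklist=BLACKLIST, max_links=MAX_LINKS):
    """Return 'spam', 'moderate', or 'publish' for a comment body."""
    lowered = text.lower()
    # A blacklisted term anywhere in the comment wins over everything else.
    if any(term.lower() in lowered for term in blacklist):
        return "spam"
    # More links than the limit: hold the comment for moderation.
    if len(LINK_PATTERN.findall(text)) > max_links:
        return "moderate"
    return "publish"
```

Checking the blacklist first means a comment with both a banned term and many links is treated as spam rather than queued for review.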
I have written a custom .htaccess file which blocks a lot of potential spam commenters at the gates. Instructions on how and why I set it up are here.
And finally, I have installed Dr. Dave’s plugin Referrer Karma. I know, I know, I said I didn’t have any comment plugins, but I don’t. Referrer Karma is a referrer spam plugin which just happens to work like my .htaccess file (but much more elegantly) to block the bad guys at the gates.
The combination of these measures has allowed me to turn off moderation on the comments on my blog – and so far (one week later) no comment spam has made it through my defences. I’m not saying the war is over but, so far, I seem to have won this round.
You know that those comments that are moderated by WordPress are not deleted, right?
They are flagged as “spam” and left in your DB.
Hi Chris,
thanks for your comment. I’m afraid I’m not sure which comments you are referring to, though.
If you mean the comments which have more than three links – yes, I knew that.
If you mean the comments which contain a term in my blacklist – no I wasn’t aware of that. In the Codex section on blacklists, it says comments matching these terms are deleted. Are you saying this is incorrect?
I knew there would be confusion when I wrote that in the Codex, but I don’t know how to word it exactly. Basically, they are deleted from your blog, but left in your database and marked as “spam”. This is in preparation for a future feature (possibly an intelligent/learning scanner). For now, Chris’ Spam Nuker will take care of them: http://www.chrisjdavis.org/2005/03/05/spam-nuker-151/
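To make the "hidden from the blog, but still in the database" point concrete, here is a small sketch using an in-memory SQLite database. It is not Spam Nuker's code and not the WordPress schema: the `comments` table, its `status` column, and the function names are all made up for illustration.

```python
import sqlite3


def make_demo_db():
    """Build a throwaway database with a hypothetical comments table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE comments (id INTEGER PRIMARY KEY, body TEXT, status TEXT)"
    )
    conn.executemany(
        "INSERT INTO comments (body, status) VALUES (?, ?)",
        [
            ("Great post!", "approved"),
            ("Cheap pills here", "spam"),
            ("Online poker bonus", "spam"),
        ],
    )
    conn.commit()
    return conn


def purge_spam(conn):
    """Delete rows flagged as spam and return how many were removed."""
    cursor = conn.execute("DELETE FROM comments WHERE status = 'spam'")
    conn.commit()
    return cursor.rowcount
```

Until something like `purge_spam` runs, the flagged rows simply sit in the table; they are filtered out of what visitors see, not removed.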
Ah!
I think I understand now – just to clarify, Chris’ plugin deletes comments marked as spam from the database – does it have any other functionality? i.e. is it an anti-spam plugin as well?
Ok guys,
I clicked on the link to Chris’ plugin and followed the link on that to the explanation of the Spam Nuker plugin and it is, as you said James, a plugin to delete comments marked as spam from the WordPress database.
Thanks for the tip-off.
Note that if you simply want to see if you have any spam comments you can use my plugin.
That was blatant and shameless self-promotion ColdForged!
Must check it out,
🙂
Tom
Tom
I may have to look at some of those plugins you suggested, though since upgrading to 1.5 I haven’t had as many issues as before. The only anomaly I’ve had with the comment spam plugins is that Kitten’s spaminator kept on “killing” my blog completely, so I had to deactivate it.
True, but at least it was topical :). Honestly, though, it seemed to follow the course of the conversation since it veered off into how they’re stored in the database and how certain plugins nuke them from there. I’m not a big self horn-tooter, but if I see a place where it might be useful, I’ll comment.
(might want to include a CSS rule to remove those borders from smileys… something along the lines of “.wp-smiley { border: none; padding: 0; }”)
Thanks for that ColdForged,
I’ll try it out later on this evening if I get a chance.
The self-promotion dig was just that, a dig – don’t worry about it – hopefully it will be helpful to someone.
That CSS snippet worked a charm ColdForged.
Thanks,
🙂
Tom
To get back onto the topic of the post… 🙂 How is this change working for you so far?
I have tried various plugins myself, and have settled on Spam Karma. It has done a faultless job of nailing the comment spam. I don’t really get that much so far, just a couple a month.
Now as for referrer spam, I get thousands a month, and Referrer Karma has been doing a fantastic job of dealing with those. Both require no extra work on my part to do their job. In fact, I believe that there are some tricks that can be done to use both with the .htaccess file so that these people can’t even get into the site at all. I haven’t done this yet for fear of losing access myself.
Prior to WP and Spam Karma, I used ExpressionEngine and was editing the .htaccess file. At one point, my .htaccess file was over 1000 lines of items to be blocked. I was adding 3-5 new items every day. I just got sick and tired of doing that, so I just gave up and let the referrer spammers have at it. No one saw the referrals except myself anyway. They were eating bandwidth, but not enough to make me worry about my limits.
Dave,
I’m still doing great – not a single spam comment through!
Now, I have just moved domain, so anything from now on is hardly a fair test as this is a new domain with no blogging profile and no Google Page Rank so it isn’t a target yet.
Still, I was without spam, and without comment spam plugins, for almost three weeks before I switched and I fully expect to be spam free here for the foreseeable future.
Also, I tried both Spam Karma and Spaminator and dropped both of them because they were nuking non-spam comments.
When I first looked at Spam Karma (posting comments on other sites using it) I was being flagged as a spammer when I hadn’t even posted a comment there before! Turns out that the sites were changing the settings in SK to be more strict.
Dr. Dave has either made some minor changes to the plugin or made sure that users of the plugin know that the default setting is more than enough to nail spammers. I haven’t been flagged as a spammer since. I haven’t seen a false-positive yet on my site. Mind you, I don’t get that many comments, so it’s not much of a fair test either. 🙂
By “referral spam” I assume you mean trackback spam? Because I don’t get any comment spam these days, but tons of trackback spam.
Bryan,
referral spam is different to trackback spam. Referral spam is where you look at your site’s logs (either raw or using a program like Webalizer or AWStats) and you see loads of referrer entries for sites which plainly are not linking to your site.
HTH,
Tom
More to the point, the referrals are to porn sites, poker sites, etc… If you run a site that doesn’t display the “referrals” to users, referral spam isn’t a huge problem. Just a bandwidth drain.
The bandwidth drain arises because the referral spammer simply sends a GET request to your web server with the spam address in the Referer header of the request. The server sends the requested page, but the “bot” couldn’t care less about it. On top of that, these spammers typically send thousands of requests per hour, sometimes actually slowing a site to a crawl like a DDoS attack would.
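For anyone who wants to see how much of this is hitting their own logs, here is a rough sketch that tallies external referrer hosts. It assumes the common Apache "combined" log layout, where the quoted request line is followed by the status, the size, and then the quoted referrer; the function name and the `example` hosts are placeholders.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Assumes the "combined" layout: "REQUEST" STATUS SIZE "REFERER" "USER-AGENT"
LOG_LINE = re.compile(r'"[A-Z]+ [^"]*" \d{3} \S+ "(?P<referer>[^"]*)"')


def count_external_referrers(lines, own_host):
    """Count requests per referring host, ignoring blanks and our own site."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if not match:
            continue  # line is not in the expected format
        host = urlparse(match.group("referer")).hostname
        if host and host != own_host:
            counts[host] += 1
    return counts


SAMPLE = [
    '10.0.0.1 - - [01/Apr/2005:10:00:00 +0000] "GET / HTTP/1.1" 200 5120 "http://poker.example/" "Mozilla/4.0"',
    '10.0.0.1 - - [01/Apr/2005:10:00:01 +0000] "GET /about HTTP/1.1" 200 2048 "http://poker.example/win" "Mozilla/4.0"',
    '10.0.0.2 - - [01/Apr/2005:10:00:02 +0000] "GET / HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
    '10.0.0.3 - - [01/Apr/2005:10:00:03 +0000] "GET /feed HTTP/1.1" 200 900 "http://blog.example/archive" "Mozilla/5.0"',
]
```

Calling `count_external_referrers(SAMPLE, "blog.example").most_common()` lists the busiest referring hosts first. A host that sends many hits but plainly never links to you is a likely referral spammer.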
Well, I could sure use something decent to get rid of the trackback spam. It’s all poker and pr0n too, and comes in waves. I have the blog set up so that all of them are sent to a moderation queue, but it’s still a pain in the butt.
thanks for the clarification.
Spam Karma (by the same person) deals with comment spam, but it also has a checkbox that checks Trackbacks for spam too. I don’t currently have it active. However, Spam Karma seems to deal with Trackback spam pretty well even with that checkbox off. My test site was getting hammered pretty badly with trackback spam too, yet none of them made it to the moderation queue due to Spam Karma killing them.
You may want to check it out. The plugin is located at: http://www.unknowngenius.com/blog/wordpress/spam-karma
Dave and Bryan,
Referrer Karma, the referrer spam plugin I use, is written by Dr. Dave, the same guy who wrote Spam Karma. It works against referrer spam, comment spam and trackback spam (so well that I need no other plugin and have moderation turned off on this site).
I don’t see any need (for me) to install Spam Karma as it would only add overhead to the server for no extra benefit and the very real possibility of losing legitimate comments (as happened me when I used it previously) through false positives.
Tom, Referrer Karma only blocks referrals from being sent to your site if they don’t have your site address somewhere on their site (or some other magic that I’m not really sure about). It doesn’t block comment spam or trackback spam, unless you are using it to modify your .htaccess file. Even then, I wouldn’t say it blocks comment/trackback spam. If it did, then Dr. Dave wouldn’t have a need for Spam Karma anymore.
I use both Spam Karma and Referrer Karma on my site. However, I don’t have them modify my .htaccess file. I would rather deal with moderated comments/trackbacks than send a false-positive 403 error to an innocent user.
For comment/trackback spam, I thought you were using WP’s built-in anti-spam features. They will work quite well if you don’t mind maintaining the lists by hand. Spam Karma has a similar list that it downloads once a week from a central location. That central location is updated by sites using SK. So if someone experiences a new type of comment spam, the rest of the SK users will have a block for it within a week, less if they update manually.
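The "does the referring site actually link to you" idea mentioned above can be sketched as follows. This is only a guess at the general technique, not Referrer Karma's actual code: it takes the referring page's HTML as a string (fetching it is left out) and checks whether any link points at your host. All names here are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class _HrefCollector(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def referrer_links_back(referrer_html, own_host):
    """Return True if the referring page contains a link to own_host."""
    collector = _HrefCollector()
    collector.feed(referrer_html)
    return any(urlparse(href).hostname == own_host for href in collector.hrefs)
```

A referrer whose page passes this check is plausibly genuine; one that fails it is a candidate for blocking, though a real tool would also need to handle pages that cannot be fetched.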
Dave,
I am using WP’s built-in anti-spam features – I have a small blacklist (you can see it at http://www.tomrafteryit.net/blacklist.txt) and I have a custom .htaccess file (you can see it at http://www.tomrafteryit.net/htaccess.txt). I have written several articles on how I generated these – check the relevant categories on this site.
I haven’t updated the blacklist in several weeks and there are only around 40 terms in it. I update the .htaccess by adding a new line maybe once a week now. No great overhead there.
As for Referrer Karma – I don’t allow it to write to my .htaccess – all I know is that when I installed it, I saw an immediate drop off in referrer, comment and trackback spam – I haven’t received any in weeks now!
I can’t attribute it all to RK, but the combination of RK, .htaccess and the blacklist is working for me.
I fully understand what you did minus the RK install. The fact that when you installed RK, your comment/trackback spam all but stopped is pure coincidence. In fact, I would say that if you disabled RK, you would still see the same comment/trackback free days that you are experiencing now.
The blacklist words you use are more than enough to stop the comment/trackback spam that you were getting.
No argument that the combination of RK, blacklist, and .htaccess file is working for you. In fact, really, just the .htaccess changes you made will cover most of the referral spam that one sees.
I personally have been down the road of modifying the .htaccess file and found it to be way too time-consuming myself. I was adding multiple entries a day. With the method I use now, I can pretty much let it do its thing and it will prevent 99.9% of all the crap I used to get. I check the logs about every three days or so these days to make sure I’m not getting false positives. I would go longer, but Spam Karma will actually block some comments, so I want to make sure it’s behaving. So far it has.
My previous comment was made to make sure folks don’t get confused. RK does not prevent comment/trackback spam. It just prevents referral spam from reaching any statistic scripts you might be running. Your server logs will still show the offending referrals since they are getting to the script. To prevent those, you would have to let RK modify the .htaccess file, and I am not keen on that.
Hi Dave,
sorry for taking so long to reply but I have been rather unwell today and this is my first time near my computer – forgive me if this post is even less comprehensible than usual 🙂
First off, I’m not a programmer, so whatever you say about RK is probably far more accurate than anything I’m saying. However when you said
I’m not 100% sure you are correct.
The reason I’m not sure is because on Dr. Dave’s RK page, he himself says
Also, I know you said
but I’m not the only one who has had this experience. MacManX also reported a similar experience after installing RK.
Cheers,
Tom
I see what you are seeing. The problem is Dr. Dave is using 403 to describe a state, not a true server error.
He does say that the user gets a 403: Access Forbidden error. However, when I tested the script by using the “RefSpoof” Firefox extension, I simply got the error message he mentions telling me that because of referral spam, you need to click this link to go to the page you asked for, blah blah, whatever the thing said. It has since been changed to automatically redirect to the correct page without passing the referrer data to the site. I never got a true 403 server error. I probably would have if I had the script update the .htaccess file.
From what I can see in my server logs, a comment post doesn’t send a referrer. So it’s unlikely that RK would even look at it.
What just hit me was that it’s possible that the scripts/bots/whatever that spammers use first attempt to GET a page from the site to verify what kind of CMS the target is using, and then, depending on the results, send their spam comment/trackback. If this is the case, then yes, RK will definitely prevent some comment/trackback spam. But not intentionally. 🙂
I wonder if anyone knows how comment/trackback spammers do their thing. It would be really interesting to see if that is the case.
Thanks SOSOSOSO much Tom!!!! I’m going to have to implement some of these methods you talk about!!! Matt hasn’t loaded the new version of WP for me yet – you think he would do that for his SISTER!!! – but he’s fixing it as soon as he gets back from Europe; I’m going to show him this post to help with the process. I think that I actually get more comment spam than he does for some weird reason, and 99.9999% of it is obscene. It really upsets my mother. I also really like the way your site is set up. I may have to show that to him as well…you know, as a hint… 🙂
Am I missing something? Since upgrading to WordPress 1.5, I’ve been singing its praises when it comes to dealing with spam. Right now, anything that looks like spam (or from a new commenter) gets routed to my moderation queue. I check it a few times a day, and can usually manage everything (even if I have 100 or more spam comments in moderation) in two or three clicks. No need for a plugin, and I’m not even using the blacklist feature. I get emails about every comment, and I haven’t seen any spam get through.
So am I just lucky, or am I missing something?
No Terrance,
you are not missing anything. It used to really annoy me that I had to spend the time going through the moderation queue several times a day emptying it. I was constantly afraid that I would, at some point, be overzealous and delete a legit comment.
So I decided to only let legitimate comments through using the measures outlined in this post. Now, only legitimate comments are posted, spam never makes it to the site and it is all transparent to the user. Their comments just appear as soon as they are made.
And they don’t have to have JavaScript turned on, nor do they have to enter CAPTCHA text in a box – this is annoying for many commenters as it places an extra burden on them.
Hope this clarifies the issue for you.
Tom.
Just a quick comment to correct what Dave M. says above: RK does indeed serve a real 403 server error (with HTTP headers etc). It just does so with its own 403 message (which also attempts to redirect) and not with Apache’s. Which is completely valid and will still ensure your logs stay clean.
Good luck with that, and if you still get tired of having to clean through the few remaining comment spams, consider using SK2: it’s a major leap from SK1 and shouldn’t give you or your commenters any trouble whatsoever.
Dr. Dave,
thanks for clarifying this.
I’ll keep that in mind if I get comment spam in the future. Since implementing these measures I have had only one comment spam and I believe it got through these defences because it was manually submitted.
I guess if the author says it, it must be true. 🙂
I have all but given up on RK. I watched the RK logs daily, yet I noticed that my referral list was filling up with a ton of spam. I double checked the RK logs and didn’t see any entries for the spam. I don’t know how they were bypassing RK, but they must have been.
So I took out the referral page on my site, and removed RK. I really feel that a blacklist is necessary with RK. Sure, it’s a pain to put in the BL entries, but it was just as much of a pain to change the entries in the logs that were wrong, so I would rather “educate” the tool, than have to “remind” it every so often.
Ah well, referral lists are not that useful a part of a website anyway.
I don’t see any need (for me) to install Spam Karma as it would only add overhead to the server for no extra benefit and the very real possibility of losing legitimate comments (as happened me when I used it previously) through false positives.
I wish I had the time to edit .htaccess files and blacklist tables to make sure I don’t see comment spam too. However, I am very pleased that Spam Karma does this for me, and does an incredible job of it at that. Out of over 3400 comment spams in 3 months, I have only had *1* false positive and a couple of false negatives.
If using WP’s built-in precautions works for you, wonderful. For me, I need an automated way of dealing with them.
mp3seeker,
Couldn’t agree more – I had Spam Karma on my site very briefly but I found it gave too many false positives so I uninstalled it.
I haven’t had any anti-comment spam plugins on this site since I wrote this article and I haven’t had any problems with comment spam either.
Hope this helps,
Tom
In an ideal world, spammers could be shot on sight. This would make it more difficult (or dangerous, rather) to send spam, but I doubt the authorities will go for it.
Oh well.
I had a serious problem with comment spam. I was getting hundreds or even thousands of them daily. I used to run Spam Karma, Spam Karma 2, Spaminator, Spam this, Spam that, you name it. I had problems with them all. Mostly false positives, though on occasion my site would be completely unavailable. So I wrote my own anti-spam plugin, wp-spamassassin (if you REALLY want it, search for it). That was five months ago. I don’t use it anymore.
Every approach to combating spam has its advantages and disadvantages. So I wrote something else, not to combat spam, but to combat the spammers’ software. Thus Bad Behavior was born.
I take a completely different approach. I don’t screen comment/trackback content at all. Instead, I screen the user agents themselves. This is similar to .htaccess rules, but FAR more sophisticated. I can analyze far more about the user agent from PHP than from .htaccess, so I have been able to eliminate false positives and block out virtually every spambot out there. Out of over 40,000 attempts at my site, exactly eight have gotten as far as the moderation queue. Not too bad for a plugin that doesn’t even look at the comments. (And I’ve since fixed that.)
This is very important; it’s quite easy to block legitimate users using .htaccess. For instance, this person inadvertently blocked out many people using Internet Explorer. You have to go deeper than User-Agent and Referer in order to distinguish accurately between the spambots and the people.
Thanks for your nice tips. I followed you and it is true. WP is powerful.
There are many men descussed in blog.Why? I think men must to work.
we have an interesting sample of spam just above this message.
Looks like what you did’s not quite enough…
Hi,
we are testing a new free form-protection service (www.cerospam.com.ar), for blogs and for any kind of web site. It is easy to setup each form with this system, and it is very useful for protecting comment forms from spammers.
It is based on captcha method. Until now it seems to work fine. No matter what kind of blog software you are using, this is not a plugin.
Please, test it and do not hesitate to send us your comments!
Thank you.
Cero,
CAPTCHAs are a very poor way of dealing with spam – the American Foundation for the Blind has written many times about how difficult CAPTCHAs make browsing for blind or partially sighted people and the W3C in a report on CAPTCHAs said:
Another point Cero – your blog doesn’t allow comments so I can’t give you feedback there – blogs without comments and CAPTCHAs – not my kind of company.
Captcha is shit.
Readers don’t post comments if you have captcha 🙁
On my WP blogs that got a lot of spam, I installed the Bad Behavior plugin and suddenly I’m not getting anywhere near the levels of spam I was getting.
Have you had any issue with WP blocking comments that were not spam? Or not allowing moderation of comments that it deemed “bad”? My account is showing a huge number of so-called spam comments and I don’t believe that all of them are spam, but it will not allow me to check them.