I have been using my .htaccess file to stop comment and referrer spam on this site and it has been surprisingly successful (so far!). How do I create a .htaccess file capable of greatly reducing comment and referrer spam?
Firstly, I use Awstats to analyse visits to my site daily and I use Spam Karma to help control comment spam. Both applications give me information on spammers visiting my site.
Awstats gives me a list of the referer sites – this list contains those sites which are trying to spam my referrer logs. I monitor those sites and as new ones appear I add them to my .htaccess list in the form:
RewriteCond %{HTTP_REFERER} \.domain\.tld [NC]
where .domain is the domain trying to spam my site (psxtreme, freakycheats, terashells, and so on) and the .tld is the top level domain the site is registered to (.com, .net, .org, .info, etc.).
So, for instance, in the case of the spammer coming from the smsportali.net domain, I have added the following line to my .htaccess code:
RewriteCond %{HTTP_REFERER} \.smsportali\.net [NC]
This will stop accesses from all subdomains of smsportali.net (spamterm.smsportali.net) to the site and the NC ensures that this rule is case insensitive.
In the case of comment spam, I have configured Spam Karma to email me every time it deletes a spam comment – this is becoming rarer and rarer as the .htaccess file becomes more and more effective. I have configured Spam Karma to include the server variables and request headers of a comment that is not approved in the email – this is one of the configuration options of this plugin.
Scanning these emails, I can see the User Agents being employed by these spammers – armed with this information, I added the following lines to my .htaccess file:
RewriteCond %{HTTP_USER_AGENT} Indy.Library [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Crazy\ Browser [NC]
RewriteRule .* – [F]
and this has greatly reduced the amount of comment spam coming through.
Also, Cindy alerted me to the fact that adding:
RewriteCond %{HTTP:VIA} ^.+pinappleproxy [NC]
RewriteRule .* – [F]
Will also catch a lot of the spammers.
I have a copy of my .htaccess file available for review (it is in .txt format).
NOTE:
For each set of rules in your .htaccess file, you need to finish with a RewriteRule – RewriteRule .* – [F] will give a 403 (page forbidden) to the spammers. Your last set of rules should end with RewriteRule .* – [F,L] – the L telling the RewriteEngine that this is the last line and to stop processing the rules here.
IMPORTANT WARNING:
the .htaccess file is a very unforgiving file. It has the power to make your entire site unavailable to anyone. It is strongly advised to read up on Regular Expressions and Mod_Rewrite (the Apache module which processes these commands in a .htaccess file) before creating a .htaccess file or modifying an existing one.