I use AwStats to monitor traffic on the tomandpilar.net site. I check the reports regularly and recently noticed that my site was apparently being linked to by some very strange-sounding sites – Online Poker sites and Online Pharmacies!
A quick bit of investigation (and a quick word of explanation from FrankP) told me that I was the victim of Log File Spam. The idea behind Log File Spam is that Log File analysers, like AwStats, often create HTML-based reports that include hyperlinks to referrers. Therefore, if someone appears to come to my site from genericlogfilespammer.com, a link to that domain is automatically created in my AwStats report.
If the report is not password protected, search engines can find it and follow those links, which increases the page ranking of the spammer's site.
How do we combat this?
Luckily, there are a few simple steps we can take. The first, and most basic, is to password-protect the Log File analyser folder.
As added protection, a line can be added to the robots.txt file instructing search engines not to look in the log file analyser folder. Add the following line under the relevant User-agent group (for example, User-agent: * to cover all crawlers):
Disallow: /Insert Logfile Analyser folder path here/
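To check that a rule like this does what is intended, the robots.txt content can be tested locally before uploading it. The following is a minimal sketch using Python's standard urllib.robotparser module. The folder path /awstats/, the user agent name ExampleBot and the helper is_allowed are assumptions made for illustration; substitute your own analyser folder path.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; "/awstats/" stands in for the
# real log file analyser folder path, which will differ per site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /awstats/
"""


def is_allowed(robots_txt: str, url_path: str, user_agent: str = "ExampleBot") -> bool:
    """Return True if a compliant crawler may fetch url_path."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url_path)


if __name__ == "__main__":
    print(is_allowed(ROBOTS_TXT, "/awstats/awstats.pl"))  # report folder
    print(is_allowed(ROBOTS_TXT, "/blog/"))               # rest of the site
```

Bear in mind that robots.txt is only a request that well-behaved crawlers honour, which is why it is an addition to password protection rather than a replacement for it.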
After a little further digging I found an article on how to modify my .htaccess file to exclude the majority of offenders. I modified my .htaccess file following the tips on this site and using some of Joe Maller's sample .htaccess file data.
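As a rough illustration of the idea behind this kind of filter, the sketch below assumes the rules work by matching the Referer header of each request against a list of spam keywords and refusing the request on a match. That is an assumption about the approach, not a copy of the actual .htaccess rules, which are Apache directives rather than Python. The keyword list and the function name is_spam_referrer are hypothetical.

```python
import re
from typing import Optional

# Hypothetical keyword list, based on the kinds of referrers described
# above; a real blocklist would be longer and maintained over time.
SPAM_PATTERN = re.compile(r"poker|pharmacy", re.IGNORECASE)


def is_spam_referrer(referrer: Optional[str]) -> bool:
    """Return True if the Referer value contains a blocked keyword."""
    if not referrer:
        # Direct visits send no Referer header; do not block them.
        return False
    return SPAM_PATTERN.search(referrer) is not None
```

A request whose referrer is something like http://genericlogfilespammer.com/online-poker would be flagged, while a visit with no referrer, or one from an ordinary site, would pass through.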
This was my first time modifying an .htaccess file by hand so I am interested to see how it will work out for me. If you would like to check out a copy of the .htaccess file I created – click here
I am also plagued by Blog comment spam. I have always moderated comments on my blogs, but it is still a pain to receive emails about spam comments daily – which then have to be deleted. Hopefully the .htaccess modifications will eliminate a lot of this too.
UPDATE – The link to Joe Maller's .htaccess file above appears to be redirecting to microsoft.com. I have emailed Joe to ask if this is expected behaviour. In the meantime, if you find yourself unable to access it, feel free to browse my own effort – a lightly edited version of Joe's file.