Referer spam mirror

Today, I mostly block referer spammers, but I am considering letting them taste their own medicine, by way of a little @mod_rewrite@ magic.


The trick is deceptively simple:
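bc.. # Flag any request whose Referer contains the spammer's host name
SetEnvIfNoCase Referer "refererspam\.example\.com" RefererSpam
# mod_rewrite must be switched on for the rules below to run
RewriteEngine On
RewriteCond %{ENV:RefererSpam} ^1$
# Capture the full Referer so that %1 can reuse it as the redirect target
RewriteCond %{HTTP_REFERER} ^(.*)$
RewriteRule ^(.*)$ %1 [R=301,L]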
p. Here is what happens:
# Use "SetEnvIfNoCase":http://httpd.apache.org/docs/mod/mod_setenvif.html to identify the referer spammer and set the environment variable @RefererSpam@. In the example, any referer string that contains refererspam.example.com will have this variable set.
# Then, use "mod_rewrite":http://httpd.apache.org/docs/mod/mod_rewrite.html to send the referer spammer back to wherever he came from, using the @%1@ backreference. In the example, the referer spammer, attempting to spam with http://refererspam.example.com/, will be sent back to that exact address.
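
The two steps above extend naturally to more than one spammer. The following is only a sketch under assumptions: the second host name, another-spammer.example.net, is a made-up placeholder, and the rules are assumed to live somewhere @mod_rewrite@ and @mod_setenvif@ directives are allowed.

bc.. # Each line flags one spammer host; the second host name is a hypothetical placeholder
SetEnvIfNoCase Referer "refererspam\.example\.com" RefererSpam
SetEnvIfNoCase Referer "another-spammer\.example\.net" RefererSpam
RewriteEngine On
# Act only on requests that one of the lines above has flagged
RewriteCond %{ENV:RefererSpam} ^1$
# Capture the whole Referer header so %1 can serve as the redirect target
RewriteCond %{HTTP_REFERER} ^(.*)$
RewriteRule ^ %1 [R=301,L]

p. Replacing the final @RewriteCond@ and @RewriteRule@ with @RewriteRule ^ - [F]@ should answer flagged requests with a 403 instead of mirroring them, which is closer to plain blocking.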

7 Comments

  1. I’ve thought about similar tactics. However, I really wonder if they follow redirects, and if they do, do you think they would even mind the hits on their own site?
    I guess it can clog up their logs just as it clogs up ours….

  2. Most of these bots do indeed follow 301s, since they can’t deliver their payload except on status code 200 or 304.
    Not only that, I suspect that some of these bots actually treat these redirects as HTTP 1.1 instructs them to: by retrying directly at the redirected URL.

  3. ghola

     /  2005-06-18

    I find this technique very interesting but I have a question:
    – If they do the same, couldn’t it result in an infinite loop of sorts? (I really don’t know much about the subject, this is a genuine question)

  4. Ghola: This is the brilliant part: There is no chance of an infinite loop, as the referer sent will be the referer of the spammer him/herself.

  5. I can confirm that several of these bots actually do follow 301s. My experience is the same as Arve’s: many retry directly, spamming the specified/redirected URI.
    I’ve also been wondering a bit about where they’d best be redirected. I’m currently considering sending all bad-bots (spammers, viruses, hack attempts etc) to one of IANA’s black holes (Ex: Prisoner / Black Hole 1) or possibly back to the loopback IP. The reasoning being that they’d be severely slowed down by the timeouts from the black hole or “attacking” themselves on the loopback…

  6. Nick

     /  2005-09-29

    That’s a very nice technique.
    I think it is better to deny access, because the spam bot probably will not follow the redirect header, and it seems that this method produces more overhead for the server.
    There is a script that I made for this job, which connects to a spam database (which other ‘contributors’ and I update) and then parses the .htaccess files. Feel free to use it and give some feedback. I feel that this phenomenon is about to grow, so we should work together to find the solution.
    If you want to use my script visit http://www.thetopsites.net/referer_spam/

  7. Sending a Status 301 or 302 with a location header and some placeholder content is no more expensive to the receiving server than sending a 403.