I have just watched the digital equivalent of paint dry, staring at half an hour's worth of tail -f access.log. I did this to understand the behavior of the referer spammers.
It turned out to be quite useful. Not so much for spotting patterns in referer spammer behavior, but for other reasons.
I simply don’t understand this bot:
* It visits every document twice
* On the second visit, it still requests the original URL, even if my web server has already told it that the document has a new permanent address by sending 301 Moved Permanently
* It doesn’t care if a document is 404 Not Found, or even 410 Gone. It will still visit it twice.
* It visits in bursts, and will fetch robots.txt at the start of each burst.
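
For anyone who would rather not stare at tail -f for half an hour, here is a rough Python sketch of how the repeat visits could be counted. It assumes the log is in Apache's Combined Log Format, and the names LOG_LINE and repeat_visits are made up for this example. It identifies a client by its User-Agent string only, which is a simplification.

```python
import re
from collections import defaultdict

# Assumed layout (Combined Log Format):
# host ident user [time] "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+'
    r'(?: "(?P<referer>[^"]*)" "(?P<agent>[^"]*)")?'
)


def repeat_visits(lines, statuses=(301, 404, 410)):
    """Count requests a client repeats after it was already told 301, 404 or 410.

    Returns a dict mapping (user agent, path) to the number of repeat requests.
    """
    already_told = set()        # (agent, path) pairs that received one of the statuses
    repeats = defaultdict(int)
    for line in lines:
        m = LOG_LINE.match(line)
        if not m:
            continue            # skip lines that do not look like log entries
        key = (m.group("agent") or "-", m.group("path"))
        if key in already_told:
            repeats[key] += 1
        elif int(m.group("status")) in statuses:
            already_told.add(key)
    return dict(repeats)
```

Feeding it the open log file, as in repeat_visits(open("access.log")), should list every agent and path that came back for more after being told the document had moved or was gone.
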
h3. Internet Explorer
Of all the requests from browsers claiming to be Internet Explorer, only about half actually fetch additional content, such as stylesheets and the images referenced from those stylesheets. I guess there is a huge number of malicious bots out there, lying about their identity in an effort to avoid detection. Only God (and the spammers) knows what their real purpose is.
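
A rough way to put a number on this is sketched below in Python. It is only an approximation under several assumptions: the log is in Combined Log Format, one IP address equals one client, "MSIE" in the User-Agent means the client claims to be Internet Explorer, and an asset is anything whose path ends in a common stylesheet, script or image extension. The names LOG_LINE, ASSET and msie_asset_ratio are invented for this example. Real browsers with a warm cache will also skip assets, so the result overstates the number of bots.

```python
import re

# Assumed layout: Combined Log Format, with referer and user agent present.
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[[^\]]+\] "\S+ (?P<path>\S+)[^"]*" \d{3} \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)
# Paths that look like embedded content rather than documents.
ASSET = re.compile(r'\.(css|js|gif|jpe?g|png|ico)(\?.*)?$', re.IGNORECASE)


def msie_asset_ratio(lines):
    """Fraction of hosts claiming to be MSIE that fetched at least one asset.

    Returns None when no such host appears in the log.
    """
    fetched_assets = {}  # host -> True once the host has requested any asset
    for line in lines:
        m = LOG_LINE.match(line)
        if not m or "MSIE" not in m.group("agent"):
            continue
        host = m.group("host")
        is_asset = bool(ASSET.search(m.group("path")))
        fetched_assets[host] = fetched_assets.get(host, False) or is_asset
    if not fetched_assets:
        return None
    return sum(fetched_assets.values()) / len(fetched_assets)
```
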
h3. Hall of Shame
If you’re trying to make people aware of your (genuine) online search service, doing so by sending referer spam once in a while is _not_ a good idea. The following “services” can consider themselves banned:
* world-of-newave dot info
* dailyorbit dot com
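
How the ban is enforced is a separate matter. As a minimal sketch, assuming the check happens in a Python script that sees the Referer header, the list above could be matched like this. The names BANNED, BANNED_DOMAINS and is_banned_referer are hypothetical, and a referer without a scheme such as http:// is simply treated as not banned.

```python
from urllib.parse import urlsplit

# The list as written above; " dot " is turned back into "." only inside the code.
BANNED = ["world-of-newave dot info", "dailyorbit dot com"]
BANNED_DOMAINS = {name.replace(" dot ", ".") for name in BANNED}


def is_banned_referer(referer):
    """Return True if the referer's host is a banned domain or one of its subdomains."""
    host = (urlsplit(referer).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in BANNED_DOMAINS)
```
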