Like everybody else, I get swamped by referrer spam. In sheer numbers, referrer spam is much, much worse than any comment spam attack I’ve ever seen.
@rel="nofollow"@, the search engines’ intended fix, does not work. It does not work against weblog spam, and it most certainly does not work against referrer spamming.
h3. Stat tools
There are a number of excellent tools for generating visitor statistics.
The problem is: people start using these tools. And that is all they do: They don’t keep these tools up to date, they don’t change the config, and they don’t limit access to them.
Which is why spammers love these tools even more than the actual users do: They’re excellent vectors for referrer spam attacks. Guess what: None of these tools use @rel="nofollow"@. And they all display referrers. And the installed base is not very likely to be patched either.
Searching for "AWStats installations":http://www.google.com/search?q=allinurl:awstats.pl and "Webalizer installations":http://www.google.com/search?q=%22Generated+by+Webalizer+Version%22 reveals that hundreds of thousands of these stat pages have made it into Google. Which means they’re free-for-all link farms for Blackhat SEOs.
Since spammers are write-only, they can’t be bothered actually searching for attack vectors: Instead they brute-force their way through the Internet, setting up simple shell scripts that collect URLs in documents, spam them and move on. In these days of relatively fast connections, this is much faster than actually searching for vulnerable installations.
Which is why this should not be the responsibility of users. Recognizing an AWStats or Webalizer installation programmatically is trivial.
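To give a rough idea of how little it takes, here is a minimal Python sketch. It only uses the two signatures from the search queries above: the @awstats.pl@ script name in the URL and the "Generated by Webalizer Version" footer text. The function name @looks_like_stats_page@ and the tag-stripping step are my own assumptions for illustration, not how any search engine actually does it, and renamed or themed installations would slip past it.

```python
import re
from typing import Optional
from urllib.parse import urlparse

# Signatures borrowed from the two search queries above. They are
# heuristics: a renamed script or a customized footer will not match.
AWSTATS_SCRIPT = "awstats.pl"
WEBALIZER_FOOTER = re.compile(r"Generated\s+by\s+Webalizer\s+Version", re.IGNORECASE)
HTML_TAG = re.compile(r"<[^>]+>")


def looks_like_stats_page(url: str, html: str = "") -> Optional[str]:
    """Hypothetical helper: return "awstats", "webalizer", or None."""
    # AWStats: the last path segment of the URL is the CGI script itself.
    last_segment = urlparse(url).path.rsplit("/", 1)[-1].lower()
    if last_segment == AWSTATS_SCRIPT:
        return "awstats"

    # Webalizer: look for the footer in the visible text. Tags are
    # crudely replaced by spaces in case the footer is wrapped in markup.
    visible_text = HTML_TAG.sub(" ", html)
    if WEBALIZER_FOOTER.search(visible_text):
        return "webalizer"

    return None
```

A crawler could call this on every fetched page and then skip the page, or ignore its outgoing links, whenever the result is not @None@.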
At the very least, search engine vendors should treat all links on these pages as if they all had @rel="nofollow"@ set. Ideally though, these search engine vendors should simply _drop known statistics tools from their indexes._ Apart from a small group of spam and security researchers, crackers and referrer spammers, these pages aren’t useful on the public web.
If Google and others prevented the latter two groups from getting their kicks, the “researchers” would hardly need to research.
So, Google, MSN Search, Yahoo! and other search engine vendors: Could you please drop useless stats pages from your indexes, and be as vocal about it as you were about the useless @rel="nofollow"@? While it might drop that @Searching 8,058,044,651 web pages@ number by a few hundred thousand to a million, it will leave search engine results more useful, and you will spare site owners a lot of agony.