Fake blogs, fake forums, fake directories. Spammers have resorted to all sorts of interesting means for fooling the search engines.
I’ve been wondering if it’s possible to create a fake Web site. You can certainly create a spammy Web site, but can you fake one?
What sparked my curiosity on the topic was the post Matt Cutts made about so-called undetectable Web spam. I don’t believe there is any undetectable Web spam, although I am sure there is plenty of spam that hasn’t yet been detected.
The distinction between “undetectable” and “not yet detected” is hardly subtle, in my book, but some people may not be able to tell the difference. After all, if the spam has gone undetected for any length of time, doesn’t that mean it’s undetectable? Hm. Not in my book. And I seriously doubt the search engineers at Google, Yahoo!, Live, Ask, and other services would agree with such a conclusion, either.
The Web spam wars are an arms race. You have a sort of trickle-down effect where the two superpowers are constantly challenging each other on multiple fronts. If we say for the sake of discussion that the search engines are the Soviet Bloc and the Web spammers are the United States (don’t assume there are any intentional compliments in that comparison, by the way), then the search engines tend to follow a very secretive but structured approach to detecting and filtering spam. The spammers, on the other hand, throw their arsenals into every corner of the Web and trample as many would-be allies as possible.
Having learned a fair amount about the mechanics of relevance from spammers, I still don’t appreciate it when I find a black hat is using my content (and/or name) to make money. If you guys would give me a cut, I might consider things a little differently. But it’s been a LONG time since anyone simply dropped a check in the mail and said, “Dear Michael — Thanks for all your help!”
Speaking as an outsider, I see the spam community being structured very much like a multi-level marketing organization. You have your star performers at the top and the second-tier performers right under them. Then you have the mangy crowd of wannabe millionaires below them milling around in general confusion mixed with anticipated ecstasy and growing frustration.
The problem with multi-level marketing is that not everyone in the organization can become a millionaire. That may be the promise but it’s not the reality. It’s never happened, and the reason no multi-level marketing organization has ever turned out more millionaires than non-millionaires is that there is only so much money for people to throw into their dreams.
Spammers have the same problem. The elite guys — the gurus — seem to fall into two groups. There are the silent aces who don’t spend their time sharing ideas with anyone outside their inner circles. And then there are the gurus who have built devoted followings. The gurus are merchants selling picks, pans, and shovels to eager gold miners. And as we know, in every gold rush the people who make the most money are the guys selling the picks, pans, and shovels.
If I had a spam software application that really could make me a millionaire, I would have absolutely no reason to sell it, share it, tell anyone about it, etc. Maybe there is someone who actually does generate millions of dollars through spam and just shares the secret out of the goodness of his heart, but I haven’t met the dude. If there is money to be made, you keep your competitive advantage by NOT sharing your secret sauce.
Still, tomorrow’s elite spammers are probably milling about in the dazed and confused crowd, slowly figuring things out for themselves. They’ll eventually join the ranks of the bloodthirsty second-tier spammers and when they’ve sated their competitive thirst they’ll either slink off into the shadows never to be heard from again or they’ll start selling software to other wannabe spammers.
It’s a vicious, go-for-the-throat cycle.
Or maybe not. Maybe everyone who joins the right secret, paid-membership spam forum finds out quickly how to become a millionaire and then their entry fees are raised to $1,000 a month just so they can prove they are making big bucks with spam.
Which brings me back to the question of: is it possible to fake a Web site? If it is, I think someone has probably figured it out by now. In his post, Matt Cutts wrote: “For cloaking to be completely ‘undetectable,’ it would have to be like that Steven Wright joke: ‘Last night somebody broke into my apartment and replaced everything with exact duplicates.’ And a cloaking script that gave users and Googlebot exactly duplicate pages would be a bit pointless.”
That got me to wondering about whether you can create a Web site that is not a Web site. What would it look like? Would it exist only for search engines? Could it only exist when the spiders show up so that it builds out links to other sites? Think of all the neat things you could do without having to worry about who else sees your site. It could glisten like gold, be so over-the-top that Bill Gates himself would kneel down and cry for the sheer beauty of the design.
I mean, if I were going to fake a Web site, I would make it look like the best danged Web site that ever existed. Why? Because it would pass algorithmic love like nothing else in the spammers’ arsenal. No one else would see it so they wouldn’t be able to report it. No one else would be able to find your links. No one else would be able to complain that your site violates search engine guidelines. They see it in the search results but clicking through brings up…nothing.
Dead site. Not even a park page.
You can’t say that’s cloaked, can you? After all, you’re cloaking nothing. If a planet’s atmosphere vents into space, does space remain virtually empty? What if the emptiness over there is less substantial than the emptiness over here?
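To make that question concrete, here is a minimal sketch in Python of how a comparison might treat the empty-page case. It is not how any search engine actually detects cloaking; the function names `similarity` and `looks_cloaked`, the 0.9 threshold, and the sample markup are all assumptions made up for illustration. It simply measures how alike the response a spider received is to the response a visitor received, and flags the pair when they differ too much.

```python
from difflib import SequenceMatcher


def similarity(crawler_html: str, visitor_html: str) -> float:
    """Return a 0.0-1.0 ratio describing how alike the two responses are."""
    # Two empty responses are identical, so treat them as a perfect match.
    if not crawler_html and not visitor_html:
        return 1.0
    return SequenceMatcher(None, crawler_html, visitor_html).ratio()


def looks_cloaked(crawler_html: str, visitor_html: str,
                  threshold: float = 0.9) -> bool:
    """Flag the pair when the two views fall below the (assumed) threshold."""
    return similarity(crawler_html, visitor_html) < threshold


# Hypothetical responses: a polished page for the spider, nothing for people.
spider_view = (
    "<html><body><h1>Best site ever</h1>"
    "<a href='http://example.com/'>a link</a></body></html>"
)
visitor_view = ""

print(looks_cloaked(spider_view, visitor_view))  # spider page vs. dead site
print(looks_cloaked(spider_view, spider_view))   # same page served to both
```

Under these assumptions, serving nothing to visitors is just an extreme difference between the two views, so a comparison like this would treat it the same way it treats any other mismatch.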
If you put content on an HTML page and serve that page from a Web server, you have a Web site. It’s real. It’s legitimate. It’s something (technically, it doesn’t have any physical shape or form) but it’s not a fake something. The problem is that some Web sites do things that search engines don’t like and we call those sites “spam sites”.
Search engines themselves violate their own guidelines in several ways: they scrape other sites for content, they duplicate content, they sell links, they link to spam sites and pornographic sites, they implement redirection (cloaking), they don’t handle 404 errors consistently, they change their content often so as to make it “fresh”, and they create hardly any original content themselves.
So if you’re wondering why I would want to learn anything about relevance from spammers, ask yourself: just exactly who ARE the spammers? And are their pages real? Can you touch them, feel them, smell them?
At the end of the day, Larry Page, Sergey Brin, Jerry Yang, Steve Ballmer, and Apostolos Gerasoulis all have to look themselves in the mirror and say, “We have met the enemy and he is us.”
‘Nuff said.
2 comments
dink 11.29.07 at 8:56 pm
Another outstanding idea. May prove to be a little difficult to make an invisible site tho.
Say, I thought I was a little on the dark side until you put those names in that last paragraph.
David LaFerney 11.30.07 at 6:42 pm
“The problem is that some Web sites do things that search engines don’t like and we call those sites ‘spam sites’.”
Maybe, but personally I don’t care what the search engines like or don’t like. I consider it spam if it has no value to me, the user.
Search engines do all of those things as you say, but they also have utility for the users – that’s what separates them from spam. I remember the Internet before effective search engines, and I thought it was fantastic at the time, but it’s a lot more useful now.
BTW – good post. Good stuff, digestible size.