There is an old study from 1961 that looked at how people determine the relevance of documents. The researcher selected technical documents and gave them to qualified individuals to sort on the basis of relevance. Some of the documents were sorted on the basis of title and some of them were sorted on the basis of abstract. The researcher found no statistically significant difference between the two methods.
In that study, none of the documents would have been deceptive. That is, no one was trying to assert relevance (either real or artificial) through keyword injection, a very common practice on today’s Web. Best practices SEO advocates the inclusion of specific keywords into Web copy (titles, meta data, visible copy, and page URLs) so that search engines (and visitors) will “know” a document is relevant to those keywords.
Since we can use links to tell search engines documents are relevant to specific keywords, the inclusion of keywords in on-page factors matters more for people than for search engines. That is, there is no benefit to humans in seeing link anchor text that points to another page because the humans don’t know what lies at the other end of the linking relationship.
Today we find ourselves playing a game of Liar’s Trust with the search engines. Every link is a potential lie intended to deceive the search engines, and the search engines have to determine for themselves (and us) which links (potential liars) can be trusted enough that we have to accept the links’ anchor text (or not).
In Liar’s Trust, you have two players: the person doing the lying and the person being lied to. The Liar wins if he tells either a lie or a truth and the other person believes him. The other person wins if he only believes the truth or distrusts the lie. You cannot have a deadlock in Liar’s Trust. That is, there is no scenario in which no one loses. Either player can win or both players can win.
The liar’s objective is to influence the other person to believe whatever the liar says. The other player’s objective is to influence the liar to tell the truth. The best outcome is where both players get what they want (the liar tells the truth and the other player believes him).
Our links are liars. They may tell the truth or they may be lying. The search engines are the other player. They may believe our lies (we win), or they may disbelieve our truths (we both lose), or they may believe our truths (we both win). In this special case of Liar’s Trust, there is no scenario in which the other player can win without the liar winning. That is, Link Liar’s Trust requires that if there are winners, at least one of them must be the Liar.
That’s not a very fun game to play if you’re not the liar, as the odds of winning are stacked against you.
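The outcomes of Link Liar’s Trust described above can be written down as a small truth table. This is only a sketch: the names `Outcome` and `link_liars_trust` are made up for illustration, and the case of a disbelieved lie is an assumption (the text says the search engine cannot win without the liar winning, so that case is treated here as having no winner).

```python
from enum import Enum


class Outcome(Enum):
    NO_WINNER = "no winner"
    LIAR_WINS = "liar wins"
    BOTH_WIN = "both win"


def link_liars_trust(link_is_truthful: bool, engine_believes: bool) -> Outcome:
    """Return who wins one round of Link Liar's Trust (illustrative only)."""
    if engine_believes:
        # Believed truth: both win. Believed lie: only the liar wins.
        return Outcome.BOTH_WIN if link_is_truthful else Outcome.LIAR_WINS
    # Disbelieved truth: both lose (stated in the text).
    # Disbelieved lie: ASSUMED to have no winner, because in this variant
    # the search engine never wins unless the liar also wins.
    return Outcome.NO_WINNER


if __name__ == "__main__":
    for truthful in (True, False):
        for believes in (True, False):
            result = link_liars_trust(truthful, believes)
            print(f"truthful={truthful!s:5} believed={believes!s:5} -> {result.value}")
```

Of the four combinations, the search engine comes out ahead in only one (a believed truth), which is the lopsidedness the game is meant to illustrate.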
But that is one way search engines today determine relevance: they play Link Liar’s Trust with Web documents and their only means of leveling the playing field (for themselves) is to try to disqualify the most dishonest liars before letting them play in the game. In other words, search engines can only hope for a win-win scenario where they trust the right liars (the ones who happen to be telling the truth).
With such a lop-sided methodology, any third player who has to wager on which outcome is most likely (no winner, liar wins, both win) needs to know something about both players in order to make something other than a random guess.
If the third player knows that the liar is always dishonest then he can say the liar has a 1-in-3 chance of winning and the search engine has no chance of winning.
If the third player knows that the search engine always trusts the liar, then he can say the liar wins 3 times but the search engine only wins 1 time.
If the third player knows that the liar tells the truth 2 out of 3 times, he knows the liar will only win 1 time if the search engine can tell the difference between a lie and the truth. The search engine can win at most only 1 time out of 3 regardless of whether it can tell the difference between a lie and a truth.
If we take links out of the picture and go back to using only titles and abstracts to describe documents, we find we may still be playing a game of Liar’s Trust. Without access to the primary document, we have to blindly trust that titles and abstracts tell us something useful and relevant about a document’s content.
So if the titles or abstracts are deceptive, we once again find that the liars have the advantage. That is why search engines have to keep liars from playing the game because once the liar sits down at the table the advantages are all his.
The only way to avoid playing (Search) Liar’s Trust is to tie the link anchor text (or the titles and abstracts) to the document’s primary text. That is, we need to require that at least one word found in the link anchor text, title, or abstract is also found in the document’s primary text.
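The “at least one shared word” requirement can be sketched in a few lines. This is a rough illustration, not how any search engine is known to work: the function names, the tokenizer, and the small stop-word list are all assumptions added here (the stop-word filter keeps words like “the” from counting as a match).

```python
import re

# Hypothetical stop-word list; a real one would be much longer.
STOP_WORDS = {"a", "an", "and", "at", "for", "in", "of", "on", "the", "to"}


def content_words(text: str) -> set[str]:
    """Lowercase the text and return its words, minus stop words."""
    return set(re.findall(r"[a-z0-9]+", text.lower())) - STOP_WORDS


def is_tied_to_body(descriptor: str, body: str) -> bool:
    """True if the anchor text, title, or abstract shares a word with the body."""
    return bool(content_words(descriptor) & content_words(body))


if __name__ == "__main__":
    body = "How to groom your dog at home"
    print(is_tied_to_body("Dog grooming tips", body))
    print(is_tied_to_body("Cheap watches", body))
```

An exact-word match like this is deliberately crude: “grooming” and “groom” do not match each other, so a real check would probably need stemming or some other normalization.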
In today’s search optimization practices there is nothing really equivalent to a document abstract. Search engines don’t trust the meta description tag enough to include it in their relevance computations (at least, not so that anyone can measure the impact of meta descriptions on relevance). If we assume for the sake of discussion, however, that the meta description is most likely going to be seen by anyone who finds the document listed in search results, then the meta description plays the role of a limited abstract.
You can score a document for “trust” in three ways:
- Documents you trust tell you about the document
- Documents you don’t trust tell you something truthful about the document
- The document’s meta data tells you something truthful about the document
Most people assume that trusted links are all you need to optimize a Web site for search, but you really need to optimize the site for searchers.
That is, you’re not just looking to earn algorithmic trust; you want to earn visitor trust as well. Some people call this “preselling” or “prequalifying” your visitors. You discourage irrelevant visitors who are least likely to convert and encourage highly relevant visitors who are most likely to convert. Your titles, meta descriptions, and inbound link anchor text (as well as the source documents you link from) all need to be relevant to your primary document copy in order for people to relax and feel like they have found the right thing.
One would think such a natural means of determining relevance would be simple enough to score algorithmically. But then we have to take context into consideration. A query for “dogs”, for example, may be relevant to four-legged furry creatures or it may be relevant to human feet. Still, it would be nice to know that search engines look at link anchor text and see if it really has any relevance to the destination document.
You need not worry about whether search engines do that if you actively build links using keywords in your anchor text that match one or more of the keywords in your titles, meta descriptions, and on-page copy. The more matches you have between anchor text and titles, the better, at least for human visitors, because they’ll see that you tried to tell them what the documents were called.
People may trust documents before search engines trust them. If people express their trust through links the search engines may decide to trust the documents. But what if every person who knows about and trusts a document is a known liar whom the search engines no longer trust? If Ralph Nader launches a new Web site and only 1,000 spammers whose sites have been penalized link to his site with truthful, accurate link anchor text, should a search engine trust the new site?
It’s not in a search engine’s best interest to ignore 1,000 liars who are telling the truth any more than it is in the search engine’s best interest to pay attention to 1,000 usually honest people who decide to lie. In either scenario the search engine’s results prove to be less useful and relevant because the search engine is really blind to the truth and deception.
A search engine’s evaluation of the trustworthiness of a page is only as good as the information made available to the trust-assessing algorithm. Where does that data come from? Is the search engine watching Web pages, waiting to see how many spammy links they hold and how many links to non-spammy sites? If that is the case, then why do so many pages that don’t link out to spammy sites fail to earn trust?
Web directory design theory owes something to that 1961 study, even if the connection was only accidental and intuitive. If human-reviewed titles and abstracts are relevant to their associated documents then it follows that both the titles and the abstracts will be equally useful in determining relevance for queries. In practice, that is usually what Web directories do: evaluate listing titles, listing descriptions, listing URLs, and listing category names.
We can apply this principle (of being honest about good content) to our Web page design if we think in terms of being accurately informative. Now we (the Web site operators) are the liar and our visitors (both search engines and human visitors) are the other player. We lose if we convey inaccurate information and the other player doesn’t agree with us (the search engine fails to show our page or the visitor leaves immediately).
We win by accident if we convey inaccurate information or only partially inaccurate information and the other player shows our content (search engine) or clicks through on our call to action (visitor).
Everyone wins if we convey completely accurate information that is shown in search results and the visitor takes the call to action.
The distinction between an accidental win and an earned/community win is that our credibility as a resource is enhanced — everyone is likely to agree that we’re worth showing off and using again — in the third scenario. That is, the third scenario offers a stronger win because it provides a more satisfying experience for everyone.
If we score this game on a basis of 0, 1, or 2 we find we rack up points more quickly every time we earn 2 points rather than 1 point. How many 1-point shots do you want to take if your goal is to earn 500 points? How many 0-point shots do you want to take?
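The arithmetic behind that question is easy to check. The sketch below assumes the three scenarios above map to 0 points (we lose), 1 point (accidental win), and 2 points (everyone wins); the names `POINTS` and `shots_needed` are invented for this example.

```python
import math

# Assumed mapping of the three scenarios to the 0/1/2 scoring.
POINTS = {"lose": 0, "accidental win": 1, "everyone wins": 2}


def shots_needed(goal: int, points_per_shot: int):
    """Number of shots needed to reach the goal, or None if it is unreachable."""
    if points_per_shot <= 0:
        return None  # 0-point shots never add up to anything
    return math.ceil(goal / points_per_shot)


if __name__ == "__main__":
    for name, points in POINTS.items():
        print(f"{name}: {shots_needed(500, points)} shots to reach 500 points")
```

At 2 points per shot the goal takes 250 shots, at 1 point it takes 500, and at 0 points it is never reached.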
Quantifying what you do with your search engine optimization this way is a quick, easy way to determine how effective your efforts are proving to be. You don’t have to wait a year to know if someone sees your page in the search results or if you get any conversions.
Nor do you have to wait very long to see that whatever you did with a specific page will or will not work. Some SEOs beat their heads against the wall for anywhere from 6 months to 2 years. I once saw a message in a forum posted by someone who said their site had been banned for about 3 years and they still didn’t know what to do.
If your site doesn’t perform after 1-2 months, it’s time to try something different. You may not have to change the site itself, but you need to change your strategy because after 2 months either the search engine thinks you’re a liar or everyone else does.
Either way, you lose. And if you really do have useful content, everyone else loses, too.