The Bias Factor of Search Optimization

by Michael Martinez on January 18, 2008

The Theory of Search Engine Optimization tells us that people use or apply “algorithms to influence the predictable content and quality of search engine results according to the chosen criteria of the optimizer”. Search engines optimize, searchers optimize, and Web content providers optimize.

You cannot NOT optimize search, although some people do it better than others.

Search engine indexing and ranking algorithms optimize results primarily through filtration, repetition, emphasis, and reference. Regardless of what a search engine chooses to do or not to do, it shapes its search results according to pre-determined criteria that (ideally) are as objective as possible.

Searchers optimize results either through their choice of engine or by the construction of their queries. Most people are completely unaware that they are optimizing their results through the choices they make. One person searches for “footwear” and another person searches for “shoes”, both seeking approximately the same thing, but each looking at their needs in a unique way.

Web content providers optimize results through repetition, emphasis, and reference. Sound familiar? We feed the search engines what we think they want, but we’re also feeding the searchers what we think they want.

Search engines dilute the quality of their search results by weighting reference. Links cannot express opinion (good, bad, indifferent). Nor can links evaluate the quality of destinations (authoritative, bunk, undecided). Most Web sites provide at least one list of links. Those links are usually chosen for personal reasons (these are my other sites, these are the sites of my key employees, these are sites I like, these are sites I found in a search engine that I feel may make this page of links interesting, etc.) without any regard for quality or authority.
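To make the point concrete, here is a minimal Python sketch of naive inbound-link counting. It is not any search engine's actual algorithm. The site names, the "reason" field, and the `inbound_link_counts` function are all hypothetical, invented for illustration. The sketch only shows that a count of links treats a friendly link, a random link, and a critical link identically.

```python
from collections import Counter

# Hypothetical link records: (source page, target page, why the link exists).
# The "reason" field is an assumption for illustration only; a crawler
# that sees just the link itself has no such field to read.
links = [
    ("mikes-blog.example", "marks-racing.example", "I know Mark"),
    ("racing-news.example", "marks-racing.example", "found it in a search engine"),
    ("critic.example", "marks-racing.example", "this site is bunk"),
    ("fan-page.example", "joes-racing.example", "best racing site ever"),
]


def inbound_link_counts(link_records):
    """Count inbound links per target, ignoring the reason for each link."""
    return Counter(target for _source, target, _reason in link_records)


counts = inbound_link_counts(links)
for site, n in counts.most_common():
    print(site, n)
```

In this toy data the site with three inbound links comes out ahead, even though one of those links was placed by a critic and none of them was placed as a judgment of quality.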

Link lists are content, and in the early days of the Web people built link lists just for the sake of binding the Web together. No one really cared if the content was good, so long as they could help people find some content. That philosophy has morphed into several areas of social media development.

People share links to news stories, ostensibly to help spread news, but the reward systems that social media sites implement lead some people to share links solely for the reward of earning recognition and building influence. The incentive to share links for rewards thus weakens the objective qualification process that news sharing might otherwise be expected to produce.

People share links to similar-themed sites. That is, if I am a car racing buff I’ll create a car racing site and suggest a few other car racing sites. However, personal connections usually influence such choices. I’ll suggest Mark’s car racing site because I know that Mark has a car racing site. I won’t suggest Joe’s car racing site because I have never heard of Joe, and the fact that Joe’s site is the most awesome car racing site ever constructed won’t help him get a link from me because I have never heard of him.

People create links to fluff out their apparent authority. Journalists do this all the time. They write a story on a topic and collect a few links they find in search engines to “popular” Web sites. The journalists, being unqualified in the topics about which they write, often provide links to inappropriate content. Some news organizations are better at vetting link destinations than others, but one way such organizations have “improved” the quality of their outbound links is to limit the types of sites they’ll link to. Fortunately (or unfortunately) the growth in blogging’s influence in social commentary has forced news organizations to expand their boundaries on linking references.

People also create links as smokescreens for self-promotion. Long before search engines cared about links people would share lists of “recommended” sites which just happened to include their own sites. Some people were transparent in admitting they had included their own sites. Many people never bothered to disclose that one or more of their links were self-promotional.

Archival links occur in great abundance. Mailing lists, news groups, and Web forums have been archived through hundreds — perhaps thousands — of scripts and specialty archive services. So have press releases, syndicated stories, and other distributed content. Nearly every archive provides links associated with the archived content.

The unobtrusive replication of links across the Web has been in place for almost as long as we have had the Web. These replicated links have occurred as natural parts of discussions: in the old days, many people included one or more links in their email and news group signatures, and their posts found their way into numerous gateway and archive sites. There are also heavily replicated F.A.Q. documents (often compiled in good faith on the basis of significant research, but sometimes compiled quickly or for self-promotional reasons) that include many, many links. F.A.Q. documents may be regularly republished to news groups, Web forums, and email discussion lists.

Long before search engine spammers began flooding the Web with worthless links, the Web was already flooded with worthless links (many links in archived documents, for example, lead to content that no longer exists). Archivists typically do not update the links they have archived. Through the years I have tried to update some of the links in my own Web forum archives, but the task is so incomprehensibly complex that I’ll never be able to update more than a fraction of all the dead links in my own archives.

Today’s SEO community focuses primarily on links for their optimization. In recent months there has been a rising trend favoring a return to basic on-site optimization principles but the link mentality will never fully go away. If search engines did away with link-based weighting in their algorithms the quality of their search results would improve drastically in “optimized” queries and not so much in natural, unoptimized queries.

The Theorem of Search Engine Optimization tells us that “achieving optimal performance from search engine results diminishes the naturality of search results.” A corollary to that theorem (which I don’t have the time to offer support for in this post) is that natural search results minimize repetition, emphasis, and reference because there is no incentive other than to persuade the reader to accept or evaluate a point of view.

That is not to say you won’t find repetition, emphasis, and reference in natural search results. You’ll find plenty of it, but it’s there for non-search related reasons. But even in natural search results you’ll find that reference is the weakest branch of the tree because it is still incentivized by non-search marketing and promotion strategies.

Web sites have been selling links as advertisements at least since the mid-1990s (before Inktomi and Google popularized the idea that links can be used to evaluate Web site content). But Web sites have also been exchanging links as advertisements and placing links as part of affiliate promotions. There are other incentives for linking that have been in place (contests, collaborative commemorations, protests, use of free content and applications in mashups, etc.) since before search engines started using links to evaluate Web content.

Another time, I’ll come back and take a look at how any algorithm that evaluates content through Web-based criteria throws an unintentional bias into search results. Both search engines and searchers introduce these dark biases into search results every day. Web content providers struggle to respond to dark bias by manipulating search engine data and/or by educating searchers to change their biases.

3 comments

Carlos 01.18.08 at 12:18 pm

So is that why you don’t link out? You don’t feel that you need any more authority? ;)

I disagree with part of your analysis of links. Yes, on the small scale of the individual there are heavy biases, but the mass of links as a whole has implications. Qualitative values can be inferred from the commonalities in linkage.

Example: If one person says you are wrong, they are probably biased. If many people say you are wrong you should inspect your decision. Right?

Certainly references (links) are fallible, but I think it is a stretch to say that individual bias continues to be an issue as data points increase. At this point search engines have captured billions of links and contexts for linkage, so individual bias is greatly devalued.

Michael Martinez 01.19.08 at 4:51 pm

It is statistically and technologically impossible to derive a reliable estimate of quality from linking data on the Web.

Since links do not convey a sense of quality or value to begin with, and since the majority of links have always been created for purposes OTHER than to convey a sense of quality or value, there is no way that any analysis of links can reliably produce an accurate estimate of quality.

David LaFerney 01.19.08 at 8:26 pm

I remember seeing a comedy sketch once about a game show called “Common Knowledge” (or something like that) where the correct answer to any question was whatever most people thought it was:
Q – What is the capital of New York State?
A – New York City.

Even without the bias caused by commercial interests, the current link-based search ranking is an awful lot like a game of Common Knowledge, where the best result is whatever the masses point to.

Just an observation. I have no ideas about a system that would work better – not until someone invents artificial intelligence anyway.