The object-oriented approach to search engine optimization
Posted by Michael Martinez on January 2, 2008 in Advanced SEO
If you have ever worked with JavaScript or style sheets you may be familiar with the concept of the Document Object Model, a formal specification that allows Web browsers and browser simulators (or other applications) to execute scripts and interact with documents as if they were responding to specific commands.
The Document Object Model is heavily influenced by object-oriented programming, which in turn is heavily influenced by Set Theory and Lambda Calculus. All of these disciplines treat things as objects, specialized constructs which are responsive to commands and/or inquiries.
I like to describe objects as “bounded data spaces”, in that the data encapsulated by an object’s defined structure observes certain boundaries. Those boundaries are the properties of the object definition (called a class).
For example, you can create an integer object in a variety of programming languages. That integer object has certain properties: it can only contain a whole number (no fractional values); it can only modify its data in such a way that the resulting data is only a whole number (no fractional values); it can disclose what data it possesses; it can wipe out its data; etc.
You might use an integer object as a simple counter, where each time a specific logical action occurs you tell the integer object to increment itself by 1. You might also use the integer object as a repository for simple on/off metaphors. That is, the integer object might record (temporarily) that a certain state is true or false. The state I refer to could be any condition resulting from a test, such as asking whether A = B.
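Here is a minimal Python sketch of such an integer object. The class name `IntegerObject` and its method names are my own inventions for illustration; they do not come from any particular language or library.

```python
class IntegerObject:
    """A bounded data space: it only ever holds a whole number."""

    def __init__(self, value=0):
        self._value = self._check(value)

    @staticmethod
    def _check(value):
        # The boundary: refuse anything that is not a whole number.
        if not isinstance(value, int):
            raise TypeError("an integer object can only contain a whole number")
        return value

    def get(self):
        # Disclose what data the object possesses.
        return self._value

    def set(self, value):
        self._value = self._check(value)

    def increment(self):
        # Use as a simple counter.
        self._value += 1

    def clear(self):
        # Wipe out the data.
        self._value = 0


counter = IntegerObject()
counter.increment()
counter.increment()

# Use as an on/off flag: record the result of a test such as A = B.
a, b = 3, 3
flag = IntegerObject()
flag.set(1 if a == b else 0)
```

The object never exposes its data directly; you can only ask it to report or change its value, and it refuses anything that would break its boundaries.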
Web Documents are more complex than integers, at least as far as you and I are concerned. But a Web document can act very much like an integer object in that we can tell the Web document (through the Document Object Model interface) to modify itself, or ask it to report specific information about itself.
The Web Document Object doesn’t actually exist, except inside your browser. That is, the Web page file(s) that the browser loads to construct the Web Document Object simply tell the browser how to construct the object — they define its “class”. Your Web page thus defines a unique class of Web document objects.
It’s all very virtual but it’s neato stuff for people who like to think in terms of sets, objects, and self-managing things. The Document Object Model makes it possible for one Web page to be many things to many people. In search engine optimization terms, you can say that the Web Document Object is a natural form of page cloaking, although I don’t have space here to justify that comparison.
An object has properties. The object also possesses functions that either tell you what those properties are or that allow you to alter those properties. The Document Object Model defines the properties of a Web page and the functions that your browser can invoke to either report or manage those properties (of course, these functions are expressed through the scripts we embed in our HTML code).
You can extend the object-oriented concept to search engine optimization. There are no formal specifications or applications that implement the object-oriented concept in SEO but that doesn’t mean there won’t ever be. It just means that we’re still at the stage of drawing pictures on white boards.
You can define a class of objects that possess the following properties: they collect data, they organize data, and they report data in response to queries. What are these objects? They could be search engines. They could also be interactive Web-page forms. The class definition I provided is too generic to be very useful. But with a little refinement you can define your own search engine object.
A search engine object is a useful theoretical tool that helps you answer questions. However, to use it properly you have to discipline yourself to ignore the assumptions you are accustomed to working with. And in SEO, most people rely extensively on assumptions.
To illustrate my point, I’ll propose a simple search engine object with the following properties: it possesses the three properties I described above (which, by the way, is called inheritance, where one class is a subset of another class because the smaller class possesses all the properties of the larger class); it possesses an interface that allows people to refine their queries; it possesses an algorithm that sorts data in response to queries; it possesses an algorithm that separates data into useful data and irrelevant data.
The last property is important because a useful search engine has to make some sort of judgement about which data is relevant to a user’s query. The specific criteria a search engine may use are not important to this exercise.
That is, we need only think in terms of “User asks search engine for the most relevant results to query X”, “Search engine collects Y documents that it deems to be relevant to query X”, “Search engine sorts Y documents, ordering them most relevant first”, “Search engine shows user the first 10 documents”.
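The inheritance relationship and the four steps above can be sketched in Python. Everything in this sketch is hypothetical: the class names `DataService` and `SearchEngine`, the word-matching relevance test, and the sample documents are assumptions made for illustration, not a description of how any real search engine works.

```python
class DataService:
    """Generic class: collects, organizes, and reports data."""

    def __init__(self):
        self.documents = []

    def collect(self, document):
        self.documents.append(document)

    def organize(self):
        self.documents.sort()

    def report(self, query):
        return [d for d in self.documents if query.lower() in d.lower()]


class SearchEngine(DataService):
    """Inherits the three generic properties and adds relevance and sorting."""

    def is_relevant(self, document, query):
        # Placeholder criterion (an assumption): any query word appears in the text.
        text = document.lower()
        return any(word in text for word in query.lower().split())

    def score(self, document, query):
        # Placeholder score (an assumption): how many query words appear in the text.
        text = document.lower()
        return sum(1 for word in query.lower().split() if word in text)

    def search(self, query, limit=10):
        # Separate useful data from irrelevant data, then sort most relevant first.
        relevant = [d for d in self.documents if self.is_relevant(d, query)]
        relevant.sort(key=lambda d: self.score(d, query), reverse=True)
        # Show the user only the first `limit` documents.
        return relevant[:limit]


engine = SearchEngine()
for doc in ["red widgets", "blue widgets for sale", "gardening tips"]:
    engine.collect(doc)

results = engine.search("blue widgets")
```

Note that the sketch never says how much any criterion counts. You can swap the placeholder relevance test for anything else without changing the shape of the object.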
The algorithms themselves can be treated as objects; hence, we can define different algorithm objects, some more complex than others. An example of an algorithm object is one that sorts whatever data it is provided alphabetically (like a classical directory search tool might sort data). Another example of an algorithm object is one that filters data so that only pages from a specific domain are searched (that is called a site search).
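The two algorithm objects just described might look like this in Python. This is only a sketch: the class names `AlphabeticalSort` and `SiteFilter`, the dictionary layout of a page record, and the example.com URLs are all assumptions chosen for illustration.

```python
from urllib.parse import urlparse


class AlphabeticalSort:
    """Algorithm object: orders page records by title, ignoring case."""

    def __call__(self, pages):
        return sorted(pages, key=lambda page: page["title"].lower())


class SiteFilter:
    """Algorithm object: keeps only pages hosted on one domain (a site search)."""

    def __init__(self, domain):
        self.domain = domain.lower()

    def __call__(self, pages):
        return [p for p in pages if urlparse(p["url"]).hostname == self.domain]


pages = [
    {"title": "Zebra facts", "url": "https://example.com/zebra"},
    {"title": "apple pie", "url": "https://example.org/pie"},
    {"title": "Mango guide", "url": "https://example.com/mango"},
]

site_search = SiteFilter("example.com")
directory_order = AlphabeticalSort()

# Algorithm objects compose: filter to one site, then sort what is left.
results = directory_order(site_search(pages))
```

Because each algorithm is its own object, a search engine object can hold any combination of them without caring what goes on inside.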
Now, you might wonder what benefit you would gain from describing search engines through object specifications, but that’s actually a pretty useful approach for analyzing search algorithms. Instead of guessing what the search engine does, you just list out the properties you can ascertain (I prefer not to use the standard seat-of-the-pants approach that most SEOs use).
We can say with some certainty that the major search engines award some undisclosed amount of relevance scoring to the title and page URL for a page. Hence, two properties of search relevance algorithms are “keywords in title” and “keywords in page URL”.
Does that sound familiar? Surely you’ve looked at one or more attempts to list all the “ranking factors” a given search engine might be considering. But whereas most people attempt to weight ranking factors, the object-oriented approach dispenses with subjective weighting. It doesn’t matter which factor is given the most or least consideration. All that matters is that you assemble as large (and verifiable) a list of relevance algorithm properties (ranking factors) as you can.
By moving away from the flawed weighting surveys and schemes that people construct, you free yourself of the assumptions that hold you back. You may like those assumptions but all they do is get in your way. Simply knowing that a page title or page URL may impact a document’s relevance score is enough.
On the other side of the equation you can apply the object-oriented model to your page optimization technique. An optimized page object has all the properties you care to give it. For example, suppose you only put keywords into your page titles. Then your optimized page object only has one property: keywords-in-title.
Page optimization, when viewed as the specification for an optimized page object, combines all possible factors that you can control: on-page stuff like repetition, emphasis, and page elements as well as off-page stuff like inbound link anchor text, repetition of inbound link anchor text, emphasis of inbound link anchor text, and text or page elements associated with inbound link anchor text.
Your optimized page object then becomes a fairly simple checklist for optimization. Wouldn’t it be cool, though, if you could load a Web page into a tool that tells you which of these properties have been set for the page? Instead of telling you something useless like “Yahoo! backlinks” the tool would tell you, “X pages in Google link to this page with the anchor text ‘you cannot believe this!’”.
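As a rough sketch of the checklist idea, an optimized page object could be written in Python like this. The class name `OptimizedPage`, its three attributes, and the sample URL are hypothetical; a real page object would carry whatever properties you decide to track.

```python
from dataclasses import dataclass, field


@dataclass
class OptimizedPage:
    """One optimized page object; each attribute is a property you control."""

    url: str
    keywords_in_title: bool = False
    keywords_in_url: bool = False
    # Maps anchor text to the number of known inbound links that use it.
    inbound_anchor_text: dict = field(default_factory=dict)

    def checklist(self):
        """Report which properties have been set for this page."""
        return {
            "keywords-in-title": self.keywords_in_title,
            "keywords-in-url": self.keywords_in_url,
            "inbound-anchor-text": bool(self.inbound_anchor_text),
        }

    def unset_properties(self):
        """List the properties that have not been set yet."""
        return [name for name, is_set in self.checklist().items() if not is_set]


# A page where only the title has been optimized.
page = OptimizedPage(url="https://example.com/widgets", keywords_in_title=True)
todo = page.unset_properties()
```

Here `todo` names the two properties that are still unset for the page. The checklist deliberately carries no weights: it only records which properties are set.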
Now, there is no search engine that tells you how many pages link to a page with specific anchor text, but there are ways to find out that kind of information. And there are ways to document that information. If you actively build links you should be tracking those links in a spreadsheet or something.
That spreadsheet could be a source of information for your Really Cool Tool that acts as an interface between you and your optimized page object. Imagine building a database of OPOs for every Web page you manage in the search results.
Your database would be large and complex but if you build an interface that treats each page as an optimized object you can just specify a page URL and look for the properties you feel are most important. Each link (inbound or outbound) is a separate, distinct property of the optimized page object.
And why would you want to do this? Because if you could ask your Web pages how they are optimized, you would be surprised by the large number of pages that are weakly optimized. If your Web pages could talk they would tell you which keywords were most important to them, which other pages were most important to them, and which pages they were drawing upon for support.
If you could ask a Web page to separate its inbound links into strong links (pass PageRank, relevant anchor text, and come from highly visible pages) and weak links (located on pages with poor visibility that don’t pass PageRank or relevant anchor text), would you not want to know how many strong links and how many weak links it had?
What an incredible metric that would be — you would finally know which of your link building efforts really pay off and which are just a waste of your time. There are, in fact, ways to capture this data but they are tedious so most SEOs don’t even bother to watch where the power comes from.
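If you already track your links somewhere, the strong/weak split could be sketched in Python as follows. The `InboundLink` record and its three flags are hypothetical, and the values are your own judgments, since no search engine reports them. The sketch also assumes that any link which is not strong counts as weak, which is a simplification of the two definitions given above.

```python
from dataclasses import dataclass


@dataclass
class InboundLink:
    source_url: str
    passes_pagerank: bool       # your own judgment; not reported by any search engine
    relevant_anchor_text: bool
    source_is_visible: bool     # the linking page is itself highly visible


def is_strong(link):
    # Strong: passes PageRank, has relevant anchor text, and comes from a visible page.
    return link.passes_pagerank and link.relevant_anchor_text and link.source_is_visible


def split_links(links):
    """Separate links into (strong, weak); anything not strong is treated as weak."""
    strong = [link for link in links if is_strong(link)]
    weak = [link for link in links if not is_strong(link)]
    return strong, weak


links = [
    InboundLink("https://example.com/review", True, True, True),
    InboundLink("https://example.org/footer", False, False, False),
    InboundLink("https://example.net/blogroll", True, False, True),
]

strong, weak = split_links(links)
print(f"{len(strong)} strong, {len(weak)} weak")
```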
Most SEOs just guess blindly about which links help and they babble about “link juice”, “link love”, and PageRank. When you have little detailed knowledge about the effectiveness of your own resources, the best thing you can do is shotgun your way to success.
That is, you load your shotgun (your SEO strategy) with as many cartridges as you can (all the linking resources you feel you can exploit) and you start pumping away, roughly pointing the shotgun toward your target. If you have enough cartridges, eventually something will hit the broad side of a barn.
That’s how search engine optimization works today. It’s crude, inefficient, and mostly ineffective. Effective search engine optimization relies upon a sound knowledge and understanding of how Web pages work together. Just telling people to “get more links” isn’t sufficient.
You need to define methods for evaluating the effectiveness of your optimization efforts. Don’t take my word for it. You’re in a better position to learn just how ineffective most of your SEO efforts really are. If you know you could have done one more thing to get a page to rank well, and it’s NOT ranking well, then you know your SEO effort is ineffective.
If you know you could have done a hundred more things to get a page to rank well but the page rocks its way to the first position, then you know you don’t need real SEO. You just happen to be working a poorly optimized SERP but you’ll need to keep an eye on it because optimization tends to increase in search results that have been optimized.
Of course, that last assertion is a theorem wanting a proof, but that will have to wait for another day.
2 Comments on The object-oriented approach to search engine optimization
By chrisg on January 3, 2008 at 3:49 pm
Hi Michael,
In this and other posts you’ve alluded to the fact that there are ways to determine if a link ‘passes PageRank’ …. could you elaborate on this?
I’ll start off by suggesting if a target page (i.e. a page being linked to from another web page) shows up in a SERP for a keyword phrase which is contained in the anchor text of a link, but that keyword phrase is not contained in the on-page text, etc. of the target page, then this indicates that the link passes PageRank.
Can you comment on this, and perhaps suggest other techniques that would indicate a link is passing PageRank/link juice.
Thanks.
By Michael Martinez on January 3, 2008 at 9:43 pm
Chris, no one outside the search engines can be 100% certain of which links pass value or of what value the links pass. However, it may be safe to assume that if a link passes anchor text it probably also passes PageRank.
In a world where you cannot really know anything for sure, you make assumptions. I assume that a search engine can allow a link to pass anchor text but not PageRank, but I also assume that — until I see otherwise — a link will pass both PageRank and anchor text if it passes anything at all.
Given those constraints, the test you describe is as valid as any other test I have seen shared on a blog or forum. And it’s more reliable than most of the tests I have seen.