Metrics matter most when you don’t have them.
There are many ways to measure success, failure, and stallout. No one really has the right solution for measuring SEO campaign activity because we have no industry standards.
I occasionally point out that my research over the past couple of years indicates that somewhere between 60% and 80% of all Web documents appear to be in Google’s Supplemental Web Index. People ask me for citations and proof. Dudes, no one else is talking about this stuff. I’m not basing these remarks on stuff I find on blogs. I’m evaluating Web sites.
You can do your own research. In fact, I encourage people to do their own research. It would be nice if 15-20 people randomly chose 100 large content Web sites (without telling each other which sites they picked) and started evaluating those sites’ coverage in Google. Then those people could share their findings and we’d start to build a community baseline.
But I’m not holding my breath and waiting for that to happen. Until this industry finds a real voice for research, you’re stuck wondering whether I’m just making up numbers or have actually identified something.
But let’s look at how you could do some analysis to test my findings. This is a good exercise for practicing what people loosely refer to as “SEO metrics”. One aspect of search engine marketing metrics is to determine how much coverage you get for a site in search indexes.
Where Google is concerned you need to pay attention to both the Main Web Index and the Supplemental Results Index. Technically, links in both indexes pass at least one type of value: crawling. The Supplemental Googlebot is following links from someplace and I’m willing to stick my neck out on this issue and suggest that it probably collects its own links for crawling.
So if you get 1 page into the Supplemental Results Index, chances are pretty good that any pages you link to from that one page will also be crawled by Supplemental Googlebot.
Pages graduate from the Supplemental Index (so we are told) once they accumulate enough (internal, not Toolbar) PageRank. How much is that? I don’t know. I have no way of measuring Google’s PageRank. Neither do you.
If we assume for the sake of discussion that PageRank is the only reason why a page is Supplemental, we have to concede that we have no reliable indicator of whether a page is in the Supplemental Index. However, pages in the Supplemental Results Index do not rank above pages in the Main Web Index, so there is one test you can use to identify some probable Supplemental Index pages.
Regardless of how relevant a Supplemental Results Page may be to a query, if there is a Main Web Index page that has been pulled for the query, that page will rank above the Supplemental Results page. Hence, when you see less relevant content ranking above more relevant content, you may be seeing Main Web Index content outranking Supplemental content.
Of course, a lot of people use links to boost the apparent relevance of their documents in Google’s index, but if 60-80% of all Web documents are in the Supplemental Index, do they pass value to pages in the Main Web Index?
One way to test that question is to place links with unique anchor text on new pages and wait for them to be indexed. Don’t point any links to the pages. Just submit an XML sitemap to Google and let it crawl the pages.
If the pages don’t appear in search results for their unique title tags, they are most likely Supplemental.
If the pages don’t appear in search results for the unique anchor text in their outbound links, they are most likely Supplemental.
If the pages appear above the destination pages in the search results for their unique anchor text, they are most likely Supplemental.
In my opinion, of course. A group of Googlers could be reading this post and snickering at my naivete but the point is that if you don’t pass sufficient value from page A to page B where page B ranks above page A for the unique anchor text, you’re not helping your page B with links.
Technically, you cannot measure links or linking power. A lot of SEOs think they do this but they’re just spinning their wheels doing feel-good work.
What you can measure, however, is the effectiveness of your linking resources. If you build links, make links, place links, trade links, capture links, etc. you have a working inventory of linking resources. Some of you undoubtedly have thousands of linking resource pages you work with. You probably don’t own most of them. Every profile page you drop a link on, every forum and blog comment you drop a link on, every bookmark page you add a link to, etc. is in your linking resources inventory.
Most of your inventory is worthless, useless, electronic junk that doesn’t pass value within Google’s index. Your linking creates visibility for your pages, so there is some value in all those links, but they are not really helping with your search engine rankings. Don’t believe me? Just add a unique expression to your link anchor text and see how long it takes for those pages to appear for it.
Supplemental Index Pages are not fully indexed. You can fool yourself by putting relatively unique words in your anchor text, but Googlers have indicated that the more rare a word is, the more likely that word will be indexed in supplemental pages. Rare word tagging is useful for quickly scanning search engine indexes to see who has indexed new content but it doesn’t tell you how much value a page has accrued.
If you benchmark 1,000 pages that have at least 1 outbound unique link anchor, you’ll find that in your first snapshot analysis some percentage of those pages are clearly passing anchor text, some percentage of them don’t appear in search results for that anchor text, and some percentage of them appear in search results but may or may not be passing much value.
You have four states for anchor text passing:
- Linking page indexed but does not appear for unique anchor text
- Linking and destination pages indexed but only linking page appears for unique anchor text
- Linking and destination pages indexed but linking page appears above destination page for unique anchor text
- Linking and destination pages indexed and destination page appears above linking page for unique anchor text
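If you want to keep score while you work through a benchmark like this, the four states can be sketched in a few lines of Python. This is only an illustration: the names (`AnchorState`, `classify`, `state_shares`) are made up for this example, the ranks are whatever you record by hand when you search for each unique anchor text, and the sample numbers are invented.

```python
from collections import Counter
from enum import Enum
from typing import Dict, Iterable, Optional, Tuple


class AnchorState(Enum):
    NOT_FOUND = 1          # linking page indexed, but absent for the anchor text
    LINKER_ONLY = 2        # only the linking page appears
    LINKER_ABOVE = 3       # both appear, linking page ranks higher
    DESTINATION_ABOVE = 4  # both appear, destination ranks higher (the goal)


def classify(linker_rank: Optional[int], destination_rank: Optional[int]) -> AnchorState:
    """Map hand-recorded ranks (1 = top, None = not found) to one of the four states.

    Assumption: the linking page is already known to be indexed. A destination
    that appears without its linking page is not a separate state in the list
    above, so it is counted under NOT_FOUND here.
    """
    if linker_rank is None:
        return AnchorState.NOT_FOUND
    if destination_rank is None:
        return AnchorState.LINKER_ONLY
    if linker_rank < destination_rank:
        return AnchorState.LINKER_ABOVE
    return AnchorState.DESTINATION_ABOVE


def state_shares(
    observations: Iterable[Tuple[Optional[int], Optional[int]]]
) -> Dict[AnchorState, float]:
    """Return the fraction of observations that fall into each state."""
    counts = Counter(classify(linker, dest) for linker, dest in observations)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {state: counts[state] / total for state in AnchorState}


# Invented sample: (linking page rank, destination page rank) per unique anchor.
snapshot = [(None, None), (3, None), (2, 5), (4, 1)]
for state, share in state_shares(snapshot).items():
    print(state.name, f"{share:.0%}")
```

Run the same tally on each snapshot and you can watch whether links drift from the first state toward the fourth over time.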
Ideally, you always want to reach the fourth state. Most links never accomplish this. Now, some people would guess (perhaps quite reasonably) that if you point enough low-value links toward a destination, eventually the destination should rank well. While I tend to agree with the intuitive aspect of that argument, there is no way to test it. There is no “proof in the pudding”, so to speak, because if you create 1,000 links to test that hypothesis, you have no way of knowing whether you stumbled across a really good linking page (unless you built all 1,000 low-value linking pages yourself).
All it seems to take to get into the Main Web Index is one really great link. So one really great link can spoil a linking test.
Some people do indeed have the resources to build, say, 1,000 new pages and use them to bootstrap content into the Main Web Index. Most people would not have the patience to wait out an experiment like that. After all, how long does it take? 1 week? 1 month? 1 year? I might have an opinion but is it any more valid than yours?
Now, I’ve been saying “1,000 pages” but that is a statistically minimal number of pages. If you want a more statistically valid test, you need to create a lot of domains (each with multiple pages of content) located on various servers around the world, etc. Again, most people don’t have either the resources or the inclination to do that kind of test.
Still, you can look at other people’s Web sites and apply some query tests to them to see what turns out. A lot of pages that should appear in search results for unique expressions don’t come up. Sometimes you can shorten the expressions but some pages appear in search results for very large expressions. I’ve pulled up pages for as many as 20 words and pages for as few as 5 words.
If a page appears in search results for 5 words but doesn’t appear for the 6th word following word number 5, what does that mean? Why can’t Google show us a page for a 6-word expression when the page appears for a 5-word expression?
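One way to organize this kind of probing is to build progressively longer expressions from a page’s own text and note where the page drops out. The sketch below is a rough illustration, not a tool: `growing_expressions` and `longest_match` are hypothetical names, and `page_appears` stands in for however you check a query (by hand, most likely). The 5 and 20 word bounds are simply the range I mentioned above.

```python
from typing import Callable, List, Optional


def growing_expressions(text: str, min_words: int = 5, max_words: int = 20) -> List[str]:
    """Build expressions from the first min_words..max_words words of the text."""
    words = text.split()
    upper = min(max_words, len(words))
    return [" ".join(words[:n]) for n in range(min_words, upper + 1)]


def longest_match(
    expressions: List[str], page_appears: Callable[[str], bool]
) -> Optional[str]:
    """Return the longest expression the page still appears for.

    Walks the expressions in order and stops at the first miss, so it
    answers "how many words in a row can I add before the page vanishes?"
    Returns None if even the shortest expression misses.
    """
    longest = None
    for expression in expressions:
        if not page_appears(expression):
            break
        longest = expression
    return longest


# Invented example: pretend the page disappears once the query passes 6 words.
sentence = "metrics matter most when you do not have them at all"
queries = growing_expressions(sentence)
best = longest_match(queries, lambda query: len(query.split()) <= 6)
print(best)
```

Recording the cutoff length for a batch of pages gives you something to compare, even if it cannot tell you why Google stops at that word.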
I seriously doubt Google has indexed many 20-word “phrases” that are nothing more than random sentences. The phrase-indexing patent applications and papers I’ve read indicate that the science favors indexing popular expressions that appear in many places. Hence, a page that comes up for any random combination of words must be found through word-indexing rather than phrase-indexing.
An SEO campaign’s objective should be to get as many pages as possible to rank through word-indexing rather than phrase-indexing. If you can only rank for phrases and rarely used words, you have a problem. You’re not chasing the long tail of search; you’re falling off the fringe.
Search engine marketing metrics tend to focus on rankings for specific expressions, but a more comprehensive set of SEO metrics should look at how much value and visibility a page accrues. Can links on the page help other pages? Can the page rank for a variety of word combinations? Does the page depend entirely on inbound link anchor text for rankings?
Some people (wrongly) believe that “the first link counts the most”. I have no idea what kind of “SEO test” would produce results to support such findings, but I actually have many linking pages where the unique anchor text is passed from links lower down on the page but not from the first link. Part of the process (in Google, and perhaps in other search engines) appears to depend on what the search engine makes of the destination page.
That is, it is easier to pass anchor text to a page that is already receiving anchor text than to a page that is not. Sound weird? It sounds to me like it’s simply easier to pass anchor text to a page that is already in the Main Web Index than to a page that is in the Supplemental Results Index.
So maybe I have it all wrong but the fundamental problem is that Google does not return the most relevant results. If you’re optimizing for Google you need to start measuring relevance (according to your own personal metrics) in order to reach a conclusion about whether PageRank is influencing a query space. If less relevant content is showing up first, you’re probably looking at Main Web Index content being promoted above Supplemental Index content.
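If you do score relevance yourself, spotting the suspicious spots in a result list is simple bookkeeping. Here is a minimal sketch, assuming you have written down the results in the order Google returned them along with your own relevance score for each (higher means more relevant). The function name and the sample data are invented for illustration, and a flagged pair is only a hint, not proof of anything about either index.

```python
from typing import List, Tuple


def relevance_inversions(ranked: List[Tuple[str, float]]) -> List[Tuple[str, str]]:
    """Find pairs where a less relevant page ranks above a more relevant one.

    ranked: (url, your_relevance_score) tuples in search-result order.
    Returns (higher_ranked_url, lower_ranked_url) pairs in which the
    higher-ranked page has the lower score.
    """
    inversions = []
    for i, (url_above, score_above) in enumerate(ranked):
        for url_below, score_below in ranked[i + 1:]:
            if score_above < score_below:
                inversions.append((url_above, url_below))
    return inversions


# Invented example: the top result scores lower than both pages beneath it.
results = [
    ("example.com/a", 2.0),
    ("example.com/b", 5.0),
    ("example.com/c", 3.0),
]
for above, below in relevance_inversions(results):
    print(f"{above} outranks the more relevant {below}")
```

A query space with many flagged pairs is one where, by your own metrics, something other than relevance is deciding the order.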
The pages in the Supplemental Index are not fully indexed, don’t pass anchor text very well, don’t receive anchor text very well, and will be shown AFTER the pages in the Main Web Index. If your evaluation of 1,000 randomly chosen pages that you don’t control shows something different, you should be sharing your findings with the rest of us. If enough people get together and compare notes on methodologies and findings, we’ll eventually come to a reasonable consensus that has nothing to do with who likes me and who likes someone else.
3 comments
Demerzel 02.14.08 at 8:23 pm
I’m in on helping get some real research done for SEO.
Tyler 02.15.08 at 12:30 pm
I would be willing to join an SEO research group as well.
Chas 02.17.08 at 1:27 am
I’m really baffled by your stand on this.
1. Create a database.
2. Do a search and copy the top 10 results into your database noting the position 1 to 10.
3. Repeat these steps with a different search term 10 to 100,000 times.
4. Now you query the database and check for any factors you choose, links near the top, links near the bottom, keywords found 10 times, 100 times…whatever you want to check. If you find a statistical correlation then that factor may be legitimate.
There is no need to start any experiments. There are an infinite number of results sitting and waiting for you to analyze.
“Just add a unique expression to your link anchor text and see how long it takes for those pages to appear for it.”
I add a unique expression to my blog and I can usually search for it and find it at #1 in about an hour. The length of time needed to get any page to #1 by adding 4 text links a day is 2-6 months for competitive keyphrases. One of my clients took 6 months and 250 links from non-related site link pages. Google does return the most relevant pages (of all the search engines). And is always getting better.