Advanced SEO Metrics for Beginners

by Michael Martinez on June 24, 2008

There are no reliable Web metrics tools. People have their favorite tools but they all pretty much suck alike. Nonetheless, some information is better than no information, even if there is a good chance the information will be wrong. In competitive SEO the chief advantage lies with the optimizer who possesses the most information, or whose information is least unreliable.

It’s virtually impossible to know whether you have more reliable information than anyone else. Information may be unreliable because it’s incomplete, because it’s outdated, because it’s been distorted, or because it’s irrelevant. Distortion occurs mostly due to misinterpretation, misreporting, and deliberate omission — that is, distorted information is almost always provided by people.

Understanding where your information comes from makes it more useful because you can provide a framework or context around it that sets boundaries. If you don’t know the source of information you’re using, you cannot set boundaries and the information is therefore completely unreliable.

Let’s take a look at search engine market share. This metric is the most widely reported, most heavily distorted, and least well understood metric in the industry. Ask any random SEO which search engine has the largest market share and nearly all will tell you without hesitation in mid-2008 that Google has the largest search market share.

But people base that assessment on distorted information provided by the four major metrics services: comScore, Compete, Hitwise, and Nielsen Netratings. These are all professional, well-respected companies but their collective published data do not support the conclusion that Google controls most of the search market.

For example, both comScore and Nielsen Netratings report search market share on the basis of number of estimated queries performed. Their numbers don’t even come close to matching each other, however.

comScore’s May 2008 Search Market Share report assesses Google’s market share at 61.8% based on an estimated 6,664,000,000 Google queries versus a total of 10,777,000,000 queries performed across all search engines.

Nielsen’s May 2008 Search Market Share report assesses Google’s market share at 59.3% based on an estimated 4,654,624,000 Google queries versus a total of 7,849,553,000 queries performed across all search engines.

Nielsen’s reported query base is nearly 3 billion queries smaller than comScore’s, and Nielsen’s Google share is also smaller than comScore’s Google share.
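As a quick check on the arithmetic, the two query-based shares can be recomputed from the estimates quoted above. This is only a sketch: the dictionary layout and the function name are illustrative, and the figures are simply the ones cited from the two reports.

```python
# Estimated May 2008 query counts as quoted in the text.
# The structure and names here are illustrative, not from either report.
reports = {
    "comScore": {"google": 6_664_000_000, "total": 10_777_000_000},
    "Nielsen": {"google": 4_654_624_000, "total": 7_849_553_000},
}


def query_share(google_queries, total_queries):
    """Return Google's share of all estimated queries, as a percentage."""
    return 100.0 * google_queries / total_queries


shares = {name: query_share(r["google"], r["total"]) for name, r in reports.items()}
for name, share in shares.items():
    print(f"{name}: Google query share {share:.1f}%")

# How far apart the two services are on the size of the whole query base.
base_gap = reports["comScore"]["total"] - reports["Nielsen"]["total"]
print(f"Difference in estimated total queries: {base_gap:,}")
```

The percentages land within a few points of each other even though the underlying query totals differ by billions, which is part of why the headline share figure hides so much disagreement.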

The Hitwise May 2008 Search Market Share report assesses Google a 68.29% market share, the highest of any estimate.

Compete revised its May 2008 Search Market Share report based on reader feedback from SearchEngineLand. Their query-based estimate gives Google a 68.7% search market share.

The problem with these metrics is that they don’t accurately define what the search market is (in fact, they make no attempt to explain what they think constitutes the search market). Number of estimated queries performed doesn’t tell you anything about:

  1. Number of referrals that sent traffic to other sites
  2. Number of people who used search engines to search
  3. Number of sessions each person initiated on search engines
  4. Number of queries per session
  5. Number of queries per person
  6. Amount of time per session
  7. Amount of time per person

There are different types of queries. Many people run queries they have no intention of clicking out of. SEOs, for example, run ranking reports. Many other people search the results for their Web sites, their friends’ Web sites, their enemies’ Web sites, etc. Many people also use search as a site navigation tool.

Queries can be divided into several categories:

  1. Informational queries, where the search results themselves provide the information sought (such as ranking queries, indexing queries, and similar SEO-style queries)
  2. Navigational queries, where people are either looking for sites they know about or are searching specific sites
  3. Discovery queries, where people are looking for information from unknown sources
  4. Spectator queries, where people have been inspired by online discussions, offline discussions, or news stories to perform specific queries (such as “miserable failure” or “sexiest man in SEO”)
  5. Research queries, where queries are being performed in volume and depth for the purpose of collecting information for further evaluation
  6. Transactional queries, where people are looking for product listings, phone book and address information, etc.

All of these factors taken together define the whole search market but that market is not reflected in the metrics by which the search engines are judged. For a competitive SEO, knowing that the search metrics are so flawed provides you with an advantage because you can develop a more comprehensive strategy. That is, even though we don’t have sufficient information to filter out non-converting queries, we do know there are other metrics by which to judge search engine performance. Since most SEOs don’t take those metrics into consideration, their strategies will be less than comprehensive.

Let’s take a look at one of the most under-rated search engines: Ask. Ask by most accounts has a superior ranking technology, but that superiority comes at a cost. Ask discards the vast majority of Web content because of poor link relationships. You cannot use Ask for effective site search because it won’t index the majority of pages on your site.

But does anyone actually use Ask? The people at Ask know if they are capturing server log data. Regrettably, they don’t publish that data or make it available to third-party review (none of the search engines do, although I should think this is important for publicly traded companies whose business models are so heavily dependent upon that data).

Lacking complete, accurate data we can still look at some estimates for Ask’s traffic. For example, Quantcast estimates 35,000,000 people visit Ask each month (down from about 44,000,000 a few months ago). Compete estimates 27,643,652 people visited Ask in May 2008.

Again, we have no agreement but we at least have some traffic estimates that show Ask receives a respectable amount of traffic.

Quantcast estimates that Live.com receives 87,000,000 visitors each month. (NOTE: I don’t know if Quantcast in any way includes estimated overlap from MSN, for which they estimate 88,000,000 visitors per month.)

Compete estimates 76,890,429 people visited Live.com in May, 2008 — a very impressive increase from last year. Compete also reports somewhat higher but similar numbers for MSN. The MSN/Live situation is odd because MSN serves content as well as search (just like Yahoo!).

For a search property that is supposedly struggling, Microsoft’s network doesn’t seem to be doing so badly. Don’t believe me? Let’s take a look at traffic estimates for Yahoo! and Google.

Quantcast estimates 124,000,000 visitors for Yahoo! and 136,000,000 visitors for Google. Hm. Google’s not looking so impressive all of a sudden.

By comparison, Compete estimates 131,844,268 people visited Yahoo! and 135,291,588 people visited Google in May 2008.

Now, let’s give Google some credit where credit is due. It’s only this year that they surpassed all other Web sites in estimated number of visitors so they ARE increasing their market share. Nonetheless, both Microsoft and Yahoo! have continued to increase their number of visitors. Only Ask has lost actual traffic (perhaps because of its well-publicized decision not to compete as a major search service any longer).

Like the query-based metrics, we can only produce crude estimates of search market share based on traffic estimates. Neither metric is necessarily more accurate than the other, but I think you’ll soon see why the visitor-based metric is a more reliable source of information on actual search market share.

If we sum up the Quantcast traffic estimates for May 2008, we have about 382,000,000 combined monthly visitors for the four major search engines. Now, this number in no way reflects how many people actually used search engines. It’s just a raw total for reported search visitors. And unfortunately only Compete provides estimates of actual visitors versus actual visits.

In other words, the numbers I reported for Compete above are estimated total number of unique visitors, whereas the numbers Quantcast reports appear to be estimated total unique visits. But I could be wrong because Compete’s monthly visit counts are substantially higher than their monthly visitor counts:

  1. Google - 1,655,000,000 visits (sessions)
  2. Yahoo! - 1,997,326,445 visits (sessions)
  3. Live - 863,090,425 visits (sessions)
  4. Ask - 88,513,068 visits (sessions)

Now, the search metric services appear to disregard club.live.com wherever possible but it doesn’t look like Microsoft’s search club is having a significant impact on their overall market performance. For that matter, none of the metrics services indicate whether they try to filter out Yahoo!’s content network. It’s highly unusual to see a sub-domain singled out for metrical filtering (and quite unreasonable, in my opinion).

If we estimate market share based on Quantcast’s visitor estimates, Google has a 35.6% share, Yahoo! has a 32.5% share, Live has a 22.7% share, and Ask has a 9.1% share. Of course, I did not include AOL or other search engines in these tallies, so please understand I am only providing them for illustrative purposes. Also, keep in mind that you can do this kind of analysis for virtually any other Web sites (but there are some drawbacks I’ll talk about below).
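The visitor-based shares are nothing more than each engine's estimate divided by the four-engine total. A minimal sketch, assuming the Quantcast figures quoted earlier (the function and variable names are hypothetical):

```python
# Quantcast monthly visitor estimates as quoted in the text.
quantcast_visitors = {
    "Google": 136_000_000,
    "Yahoo!": 124_000_000,
    "Live": 87_000_000,
    "Ask": 35_000_000,
}


def shares_of_total(counts):
    """Return each entry's percentage of the combined total."""
    total = sum(counts.values())
    return {name: 100.0 * count / total for name, count in counts.items()}


combined = sum(quantcast_visitors.values())
visitor_shares = shares_of_total(quantcast_visitors)

print(f"Combined visitors: {combined:,}")
for engine, share in visitor_shares.items():
    print(f"{engine}: {share:.1f}%")
```

The same function works for any set of sites you want to compare, as long as you remember that the total only covers the sites you chose to include.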

Looking at Compete’s numbers, I’m going to round down to the nearest million for a total of 4,603,000,000 estimated visits to the major search services in May 2008. Interestingly, Ask’s share is inverted for a 1.9% share of estimated monthly visits (versus 9.1% of estimated monthly visitors). Live received 18.7% of estimated monthly visits. Yahoo! received 43.3% of estimated monthly visits. And Google received 35.9% of estimated monthly visits.
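Because Compete reports both unique visitors and visits, the two can be combined to estimate how often each visitor comes back, which goes some way toward explaining why Ask's share of visits is so much smaller than its share of visitors. The sketch below uses only the Compete figures quoted above; the data layout and names are illustrative.

```python
# Compete estimates for May 2008 as quoted in the text:
# engine -> (unique visitors, visits/sessions)
compete_may_2008 = {
    "Google": (135_291_588, 1_655_000_000),
    "Yahoo!": (131_844_268, 1_997_326_445),
    "Live": (76_890_429, 863_090_425),
    "Ask": (27_643_652, 88_513_068),
}

total_visits = sum(visits for _, visits in compete_may_2008.values())

results = {}
for engine, (visitors, visits) in compete_may_2008.items():
    results[engine] = {
        # Share of all visits across the four engines, as a percentage.
        "visit_share": 100.0 * visits / total_visits,
        # Average number of sessions per unique visitor in the month.
        "visits_per_visitor": visits / visitors,
    }

print(f"Total visits: {total_visits:,}")
for engine, r in results.items():
    print(
        f"{engine}: {r['visit_share']:.1f}% of visits, "
        f"{r['visits_per_visitor']:.1f} visits per visitor"
    )
```

On these figures Ask's visitors return only a few times a month, while visitors to the other three return more than ten times, so a visitor count alone overstates Ask's share of actual search activity.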

Crude though these estimates may be, they show that Google probably has only a little more than 1/3 of the real search market — if you’re talking about people who use search engines versus how much query activity the search engines experience.

Query activity can be grossly misleading for a variety of reasons. For example, Hewlett-Packard’s Web site (hp.com) receives 60 million queries per month (note: I can no longer find my original source for that statement). Most of the HP queries will be of a transactional or informational nature — people using HP’s search engine most likely want information about HP products or support for their HP products.

Still, if you think about other major etailers like Amazon, eBay, Wal-Mart, et al. and all their on-site searches, can you be so certain that Google really controls 35% of today’s search traffic? What about government Web sites with their own built-in search tools? They serve tens of millions of queries every month. Google’s ironclad grasp on search dominance feels more like weak fingertips struggling to grasp the ledge when you look at what is really going on in search.

So where does that leave us? First, we can show conclusively that query-based measurements of the search market are completely bogus, so you should celebrate every time you see another SEO say something like, “Google is the only search engine that matters”. If all they’re concerned with is Google, that means you can pay attention to the other 64% of searchers your competitor doesn’t think exist.

Secondly, for any SEO who is managing or advising on optimization for a truly large site with millions of pages of content, the importance of on-site search cannot be exaggerated. You have to make site searches easy for people from as many search engines as possible because most people DON’T search your site from Google (unless you have forced them to do so).

In other words, if you bring people in from all possible search sources, the better indexed your sites are in all those search sources, the more likely that navigational queries on those non-Google sources will provide you with traffic. I doubled my personal network’s Live.com referrals without any significant loss from Google referrals by replacing Google-driven site search with Live-driven site search.

That was a gamble because Live doesn’t index as much of the Web as Google, but Google’s main Web site index excludes most of the Web. I have not had time to implement a Google Custom Search Engine site search tool since they expanded the service. It may be that Google now offers a better solution than Microsoft.

Still, using all this competitive data requires some care. For example, where do Quantcast and Compete get their data? Where do the other metrics services get their data? Hitwise says its estimates are based on the activities of about 10 million Internet users, which sounds suspiciously like the old Alexa toolbar database.

Compete has admitted to purchasing click-data from undisclosed Internet Service Providers, and they allege that other metrics services buy this data, too. But Compete — like Alexa — actually offers people a toolbar that they can install. The Alexa toolbar was reportedly cracked and replicated by black hat SEOs years ago. I don’t know if anyone is trying to spoof Compete’s toolbar or anyone else’s toolbar in 2008.

Quantcast, of course, asks Website operators to quantify their sites by installing Quantcast metrics. If you sign up with their service they’ll show you more accurate data (so they say). They now also offer cookie-adjusted audience estimates, although I don’t trust cookie-based systems since many people disable and/or periodically clear out their cookies.

Google Analytics now offers a beta benchmarking tool that allows you to compare your site’s performance to selected categories. SEO Theory supposedly does very well against SEO blogs and sites but Google doesn’t disclose how it determines which sites will be placed in that category (or any category).

Google Analytics, like all Javascript-based analytics packages (such as Sitemeter), suffers from data blockage. This cannot be helped. Javascript misses up to 1/5 of Web site visitors because many people (estimates vary from 12% to 20%) disable Javascript in their browsers. The problem is compounded by placement of Javascript code (you’ll see more hits, generally, the closer you put the code to your Web document’s HEAD section), by server outages, by internet congestion, and by user security settings. I generally assume about 40-50% higher numbers than whatever traffic estimates are reported by Javascript-based metrics.
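To make the adjustment concrete, here is a rough sketch of both corrections mentioned above: the 40-50% rule of thumb, and the correction implied by a given miss rate taken on its own. The function names and the sample figure of 10,000 reported visits are hypothetical, and the miss-rate formula assumes the only loss is untracked visitors.

```python
def rule_of_thumb_range(reported_visits, low_uplift=0.40, high_uplift=0.50):
    """Apply the 40-50% uplift rule of thumb to a Javascript-reported count."""
    return (
        reported_visits * (1 + low_uplift),
        reported_visits * (1 + high_uplift),
    )


def correct_for_miss_rate(reported_visits, miss_rate):
    """Estimate actual visits if the tracker misses `miss_rate` of all visitors.

    Assumption: reported = actual * (1 - miss_rate), and nothing else is lost.
    """
    if not 0 <= miss_rate < 1:
        raise ValueError("miss_rate must be in the range [0, 1)")
    return reported_visits / (1 - miss_rate)


reported = 10_000  # hypothetical monthly visits reported by the tracker
low, high = rule_of_thumb_range(reported)
print(f"Rule of thumb: {low:,.0f} to {high:,.0f} visits")

for rate in (0.12, 0.20):
    estimate = correct_for_miss_rate(reported, rate)
    print(f"Miss rate {rate:.0%}: about {estimate:,.0f} visits")
```

A 12% to 20% miss rate by itself accounts for roughly a 14% to 25% undercount; the larger 40-50% figure also allows for code placement, outages, congestion, and security settings.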

Both Quantcast and Compete offer estimates on demographics, engagement, and other factors that you can use to evaluate a site’s reach into the Internet marketplaces. The demographic data is incomplete because it’s based on limited data sources. Nonetheless, I have found that the incomplete demographics reports tend to be an accurate reflection of significant portions of site visitors. The larger and more popular a site is, the more likely the demographic reports are to be fairly close to reality.

Some people still like to use Alexa. To their credit, Alexa redesigned their system a few months ago and they have been collecting new data. I have not heard that Alexa’s new system can be gamed, or that anyone is trying to do so. Perhaps people have gotten over that nonsense, but I’m in no position to say. The last time I checked Alexa’s system they had not yet integrated their older data into their new database. I hope they never do, as that old data was (in my opinion) pretty snarky.

Of course, none of the metrics companies offer very good data. If you’re interested in developing your competitive analytics skills you need to look at as many sources of information as possible, and you need to question the assertions and assumptions these companies make. Don’t ever make the mistake of dismissing or devaluing an admitted data exclusion — many SEOs do that because they don’t want to deal with the inconvenience of facts.

Google dominates search in a metaphorical way, but it has a long way to go before it becomes the monopoly many people think it is. Still, if you don’t like the numbers I rounded up today you could go looking for other numbers that fit the Google-dominates-search model better.

For example, Google reported a quarterly gross profit of $3,000,000,000 in March 2008. Yahoo! reported a quarterly gross profit of $1,000,000,000 in March 2008. Unfortunately, you cannot determine how much of Microsoft’s 1st Qtr 2008 gross profits were derived from search-related revenues (which may include more than PPC for all these services). Microsoft earned $11.94 billion that quarter, mostly (I would guess) from software sales unrelated to search.

IAC’s whole network reported less than $1 billion in quarterly earnings.

However, gross earnings are not as useful as, say, PPC revenues. Unfortunately, finding reliable data on PPC revenues is not that easy. For example, in February 2008 Microsoft released a statement alleging that Google “has now amassed about 75 percent of the paid search market”. Now, where did that figure come from?

Maybe from SearchIgnite. I don’t subscribe to their service but in April 2008 The New York Times cited a SearchIgnite study (and other sources mention it as well) that indicates Google’s paid search market share declined to about 70.4% (Yahoo!’s paid search market share rose to an estimated 24.2%).

Since I have not read the original study I cannot share an opinion on the quality of the information. However, Rimm-Kaufman Group has published paid search market share estimates on its blog that are somewhat close to the SearchIgnite numbers. For example, Rimm-Kaufman reports that their May 2008 PPC spend allocated about 81% to Google.

The paid search market, however, is unquestionably driven by a faulty metric - number of queries performed. That is, people are more likely to invest their PPC dollars in Google’s search advertising because Google serves more queries than other search engines. Now, in this case the query-driven metric may actually be more reliable than in measuring overall search market share. Our PPC guys have told me on a few occasions that most of their clients see better performance on Google’s network. However, although we manage a substantial portfolio in PPC campaigns I don’t think we have a significant enough client base for a scientifically valid sampling.

The correlation between query performance and where to place your query advertising dollars is intuitively obvious, but there is more to setting up and managing a PPC campaign than asking which search engine serves the most queries. The more competitive queries may be too expensive for many smaller competitors, who have to pursue different strategies.

And a sale driven by Yahoo! or Microsoft PPC may be as profitable or more profitable than a sale driven by Google PPC (or less profitable). You cannot easily predict where your conversion costs will be most efficient until you jump in and do the research. But Google clearly dominates paid search in part because of its AdSense program. That is, AdSense revenues are part of the overall picture even though AdSense is technically not paid search.

I mean, AdSense is not paid search if the ads are shown on a non-search site. Of course, they would be paid search if they are shown on a non-Google search site (like Yahoo!). Confused? Hey, I don’t do PPC any more. I leave that to the experts.

Still, that’s my introduction to advanced SEO metrics for beginners. There is indeed much, much more that can be said on the topic. When you’re analyzing markets you really cannot afford to be dazzled by the whizbang tools and gadgets people have scattered across the Web. But neither should you accept any market analysis at face value. There may be no intentional bias behind the analysis but there is certainly a bias of perspective behind it.

The bias of perspective diminishes the value of all analytics. You cannot escape being biased, but once you identify your biases you have the option of setting them aside and exploring other perspectives. That is, you exchange one set of biases for another and then you can make comparative analyses that were not possible with only one set of biases.

Advanced SEO metrics help you understand the markets you’re dealing with. People search for the same content for different reasons, at different times of the year, from different resources. The sooner you identify the strengths and weaknesses of a particular market (usually what SEOs call a “vertical”), the sooner you can customize a strategy to participate in that market. You really cannot own or control a market — not unless you own the technology that drives the market.

Google does not control or dominate the search market nearly as much as people think it does. The huge gap between reality and the metrics reports provides even beginner SEOs with opportunities to explore and capitalize upon query spaces that have only matured on Google.

Hundreds of millions of people use search engines OTHER than Google every month. Most SEOs ignore those people.

When you look at other verticals, understand that just because there are dominant players in the market doesn’t mean that everyone focuses on those players. Use all the metrics tools at your disposal to find out who is visiting which Web sites, why, and what you can do to attract those people’s attention. Don’t ever throw in the towel and conclude that you cannot knock the big guy off his pedestal.

That pedestal may be nothing more than statistical gibberish someone cooked up out of a different set of biases from your own.

3 comments

chrisg 06.27.08 at 11:19 pm

Michael,

I agree with the spirit of your post — check as many data points as possible and employ healthy, independent critical thought. Definitely.

However … “Javascript misses up to 1/5 of Web site visitors because many people (estimates vary from 12% to 20%) disable Javascript in their browsers.” …. in the spirit of questioning the numbers, I find this hard to believe. I doubt 20% of web users would know what javascript is and know how to turn it off in a web browser, never mind actually do it. I have no data to back that assertion up, though I’d also point out you didn’t cite any authorities for the numbers you presented.

Cheers.

Michael Martinez 06.29.08 at 9:28 am

I know of no authorities on the topic of browser statistics, although there are some well-known sites that are frequently referred to. These sites publish statistics from their own server logs (which is why they are not authorities on the subject).

One source that is often cited is the W3Schools Browser Statistics page. There is also a Wikipedia article that is updated monthly (a tedious task that requires a lot of dedication from a volunteer or group of volunteers). Of course, none of the Wikipedia sources (which include the W3Schools page) is authoritative.

If you scan these resources you’ll see that reported Javascript usage is up to about 95% for mid-2008. However, these statistics don’t reflect the number of people who are reading content through mobile search, offline feedreaders, and other software that may or may not execute Javascript.

To understand just how limited the W3Schools data is, look at their screen resolutions data. How many of those people are using mobile search?

Mobile search is extremely popular in Asia, and I know from personal experience that Asian surfers are interested in multilingual and international content. Both my personal network and this blog receive significant traffic from Asia.

So that is why I say “up to 1/5 of Web site visitors” may not be tracked by Javascript-analytics software. Google Analytics, for example, does not break out RSS feed subscriptions for me. It reports a lot of Direct traffic but what is considered to be “Direct”?

Also, I don’t know if the detection of “Java Support” in a browser indicates that Java and Javascript are turned on. My guess is the statistics programmers only mean that your identified browser is known to support Java and Javascript but they are not trying to determine how many people actually have Java and Javascript enabled.

I suppose I could test that and I may at some point when I have more time on my hands.

chrisg 06.30.08 at 9:13 am

Good answer Michael. The point re: browsers other than “web browsers” is fair.