**What happened so far**

This post is the fourth and deciding part in a series of posts to reveal the secret of AdWords quality score. Here is a recap of the previous posts.

The first post was about the lack of consensus about what quality score actually achieves. It’s a little bit as if Google had an abstract formula for something they call quality and advertisers just had to live with it. The second post contradicted the notion that Google was all about quality. With the help of some examples I showed that Google is in fact more interested in making money, even if it comes at the cost of quality. So the idea of quality score being something that sacrifices profit in favor of quality seems more like clever marketing on Google’s part.

The third post pursued the idea of Google being primarily interested in maximizing its own profits. I took a closer look at the ad auction – the procedure that determines the order in which ads are shown on Google. The question was how the ad auction had to work in order for Google to earn as much money as possible. The only parameter Google can control is quality score. I found that quality score has to match click-through rate (CTR) in order for Google to maximize profits. So last post’s conclusion was:

Quality Score = Click-Through Rate

**A Contradiction?**

First of all it’s important to make one thing clear: there is no doubt that quality score has to equal click-through rate in order to maximize profits for Google. The proof isn’t just in the last post: before Google introduced quality score in 2005, CTR was used in precisely the same way. Back then, ad rank was calculated as *Bid x CTR*; today the formula is *Bid x Quality Score*.
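To make the formula concrete, here is a minimal sketch of ranking ads by *Bid x Quality Score*. The ad names, bids and scores are made up for illustration, and expressing quality score as a fraction is an assumption of this sketch, not the way Google reports it.

```python
from dataclasses import dataclass


@dataclass
class Ad:
    name: str             # hypothetical ad label
    bid: float            # maximum cost-per-click bid
    quality_score: float  # assumed here to be a fraction between 0 and 1


def ad_rank(ad: Ad) -> float:
    """Ad rank as described in the post: bid times quality score."""
    return ad.bid * ad.quality_score


def rank_ads(ads):
    """Return the ads ordered from highest to lowest ad rank."""
    return sorted(ads, key=ad_rank, reverse=True)


# Made-up example data: the highest bidder does not end up on top.
ads = [Ad("A", 2.00, 0.03), Ad("B", 1.00, 0.08), Ad("C", 1.50, 0.05)]
ranking = [ad.name for ad in rank_ads(ads)]
print(ranking)
```

In this made-up auction, ad A bids the most but has the lowest score, so it ends up behind B and C.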

I know, that sounds absurd and contradictory: if the formula was already optimal and was subsequently changed, wouldn’t that mean that Google is pursuing another goal in addition to maximizing profits? Or did Google not change anything at all?

**The nature of Click-Through Rate**

It’s actually a little more complicated. A closer look at the trivial-looking CTR can shed some light on the issue.

When Google determines the ad order, there are two things to be considered: how much does Google make per click on an ad and how many clicks on that ad are there? In the last post I used two terms to describe these factors: bid and click-through rate.

But click-through rate is actually an inadequate metric here, because it refers to the past. Click-through rate is simply the number of clicks so far divided by the number of impressions at this point in time.

But whenever Google handles a search query and has to determine the order of the ads, it’s not about the past: it’s about whether the ads will be clicked here and now. It’s about the immediate future. When the ad order is determined, it is just about this one impression for just the current search query. Will a click follow or not? Click-through rate has little to do with that – rather, this is about click-through probability.

**What’s the difference?**

To illustrate the difference between click-through rate and click-through probability we can look at a coin toss. Heads or tails, both have the same probability of 50%. These probabilities are set and will never change.

By counting the coin tosses that result in heads and dividing these by the total number of tosses, we can determine the “heads rate”. So if we get heads once out of two tosses, we get a heads rate of 50%. That would be consistent with the “heads probability”, which is also 50%.

However, two coin tosses might lead to heads twice. That would be a heads rate of 100%. Still, heads probability remains unchanged at 50%.

It’s the same with click-through rate and click-through probability: if an ad had two impressions and one click it would have a click-through rate of 50%. Still, that doesn’t mean it will have a click-through rate of 50% for future impressions as well. If Google wants to make predictions, the real click-through probability is needed.

As hinted in the example, determining click-through probability isn’t as easy as one might think. Of course, there is a connection between click-through rate and click-through probability. However, just looking at click-through rate and concluding that click-through probability should be the same doesn’t work.

Let’s go back to the example of the coin toss where we had heads twice and a rate of 100%. If we assume that the probability for heads is 100% as well, we would be very wrong.

At first glance one might think that rate and probability would converge quickly. Naturally, with ten coin tosses it’s easier to make assumptions concerning probability than it is after just two tosses. But after ten coin tosses it is even more unlikely that exactly 50% will come out heads. Reverse that, and it follows that even with five out of ten being heads you cannot conclude that the underlying probability is 50%.
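The claim that hitting exactly 50% becomes less likely with more tosses can be checked with a few lines of Python. This is a small sketch for a fair coin; the function name is mine and only serves the illustration.

```python
from math import comb


def prob_exactly_half_heads(tosses: int) -> float:
    """Probability that a fair coin shows heads in exactly half of the tosses."""
    if tosses % 2 != 0:
        raise ValueError("tosses must be even")
    # Number of ways to pick which half of the tosses are heads,
    # divided by the number of all equally likely outcomes.
    return comb(tosses, tosses // 2) / 2 ** tosses


for n in (2, 10, 100, 1000):
    print(f"{n:>5} tosses: {prob_exactly_half_heads(n):.4f}")
```

With two tosses the chance of a heads rate of exactly 50% is 0.5; with ten tosses it is already down to about 0.25, and it keeps shrinking from there.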

Before this gets too complicated, let me demonstrate the dilemma with a few numbers. The following table shows the number of coin tosses, the number of times they resulted in heads and the heads rate (always at 50%). The last two columns show the range in which the probability is likely to fall. An example for how to read the table: If there are ten coin tosses which result in heads five times, the heads rate is at 50%. The probability for heads is likely to be somewhere between 19% and 81%.

(In case someone wants to check the math: these are Clopper-Pearson intervals with a confidence level of 95%.)

The numbers show: even if the rate is always at 50%, you need a lot of tosses in order to narrow down the probabilities. (By the way: even the ranges which are used in this example are only correct 95% of the time. That means, 5% of the time the real probabilities lie outside of this range.)
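For anyone who does want to check the math, here is a sketch that computes Clopper-Pearson intervals using only the Python standard library. It finds the bounds by bisection on the binomial distribution, which is one possible way to do it; the function names are my own and the toss counts in the loop are just examples.

```python
import math


def binom_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n trials (computed in log space)."""
    if p <= 0.0:
        return 1.0 if k == 0 else 0.0
    if p >= 1.0:
        return 1.0 if k == n else 0.0
    log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
               + k * math.log(p) + (n - k) * math.log1p(-p))
    return math.exp(log_pmf)


def binom_cdf(k: int, n: int, p: float) -> float:
    """Probability of at most k successes in n trials."""
    if k < 0:
        return 0.0
    return min(1.0, sum(binom_pmf(i, n, p) for i in range(k + 1)))


def bisect(func, target: float, increasing: bool, steps: int = 60) -> float:
    """Find p in [0, 1] with func(p) == target for a monotonic func."""
    lo, hi = 0.0, 1.0
    for _ in range(steps):
        mid = (lo + hi) / 2
        if (func(mid) < target) == increasing:
            lo = mid  # the solution lies to the right of mid
        else:
            hi = mid  # the solution lies to the left of mid
    return (lo + hi) / 2


def clopper_pearson(successes: int, trials: int, confidence: float = 0.95):
    """Exact (Clopper-Pearson) confidence interval for a binomial probability."""
    alpha = 1.0 - confidence
    if successes == 0:
        lower = 0.0
    else:
        # P(X >= successes) rises with p; the lower bound is where it hits alpha/2.
        lower = bisect(lambda p: 1.0 - binom_cdf(successes - 1, trials, p),
                       alpha / 2, increasing=True)
    if successes == trials:
        upper = 1.0
    else:
        # P(X <= successes) falls with p; the upper bound is where it hits alpha/2.
        upper = bisect(lambda p: binom_cdf(successes, trials, p),
                       alpha / 2, increasing=False)
    return lower, upper


for tosses in (2, 10, 100, 1000):
    low, high = clopper_pearson(tosses // 2, tosses)
    print(f"{tosses:>5} tosses, 50% heads: {low:.1%} to {high:.1%}")
```

For five heads out of ten tosses this gives the range of roughly 19% to 81% mentioned above. The same function works for clicks and impressions: pass the number of clicks as `successes` and the number of impressions as `trials`.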

Google finds itself faced with the same dilemma when trying to deduce click-through probability from click-through rate. The following table shows the same situation as above, but this time it’s about an ad with a click-through rate of 5% – a very common situation in AdWords.

As you can see, a lot of impressions are needed until reliable numbers are available. Furthermore, Google needs numbers to be as exact as possible, because an ad auction is often about multiple ads with click-through probabilities very close to each other.

This exercise may appear very theoretical – who cares about how accurately one can calculate click-through probabilities? Worst case, Google miscalculates and messes up the ad order a bit, losing a penny in revenue now and then. However, if you consider that Google is delivering millions upon millions of ads each day, these subtleties add up to a big difference.

**Estimating Click-Through Probability**

In order to rank ads optimally and achieve the highest possible profit from them, Google needs the best estimates possible for click-through probabilities. The tables demonstrated that for a good estimate much data is required. Until this data is collected, Google would have to fly blind – or at least would have to rely on bad estimates. So, during the time of data collection, suboptimal results would have to suffice – Google would have to settle for lower earnings.

Google faces an even bigger dilemma. The tables suggest that there will be reliable data at some point. Reality is oftentimes different, because there usually isn’t the one click-through rate to rely on. For example, let’s look at the keyword “running shoes”, which has a click-through rate of 5% after 10,000 impressions. According to the second table that narrows down real click-through probability fairly well.

Nonetheless, these 10,000 impressions come from a large number of different search queries like “running shoes” or “discount running shoes”, etc. Naturally, every search query has its own click-through probability. Maybe our ad has the headline “Discount Running Shoes”, so the query “discount running shoes” will probably have a higher click-through probability. So each query has to be looked upon separately. Thus the 10,000 impressions are split between the actual search queries.
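To get a feeling for how much splitting the data hurts, here is a rough sketch. It assumes, purely for illustration, that the 10,000 impressions are spread evenly across the queries and that every query has a 5% click-through rate. It also uses the simple normal approximation (Wald interval) instead of the Clopper-Pearson intervals from the tables, so treat the numbers as ballpark figures only.

```python
from statistics import NormalDist


def wald_half_width(ctr: float, impressions: int, confidence: float = 0.95) -> float:
    """Half-width of the normal-approximation confidence interval for a rate."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    return z * (ctr * (1 - ctr) / impressions) ** 0.5


total_impressions = 10_000  # figure from the running shoes example
ctr = 0.05                  # click-through rate from the same example

for queries in (1, 10, 50, 100):  # hypothetical numbers of distinct queries
    per_query = total_impressions // queries
    margin = wald_half_width(ctr, per_query)
    print(f"{queries:>3} queries, {per_query:>6} impressions each: "
          f"5% plus or minus {margin * 100:.2f} percentage points")
```

With all impressions in one bucket the margin is below half a percentage point. Split across 100 queries, each with 100 impressions, the margin grows tenfold, which is far too coarse to tell apart ads with similar click-through probabilities.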

But it doesn’t end there. Ad rotation makes things even worse. Every query works differently in combination with different ads. For example, the ad for discount running shoes will work better for the query “discount running shoes” than an ad with the headline “Best Running Shoes”. And of course there is the situation that a query or ad is simply new and there is no data available yet.

Bottom line: Google really needs enormous amounts of data to estimate click-through probabilities if those estimates are based on clicks and impressions. Many times, that data is simply not available. So what else could Google use in order to estimate click-through probabilities? Lucky for us, we don’t have to look very far. The answer lies in the well-known components of quality score, which can easily be looked up under the AdWords help pages.

There’s click-through rate itself – well, we know why that one is important. Other factors are centered upon relevancy. That makes sense: the more relevant an ad is, the higher the probability for a click. Aside from click-through rate and relevancy factors, the landing page is often mentioned. But the landing page isn’t considered within the ad auction. In the second post I called that a contradiction to Google’s promise of quality. Now we know why the landing page is irrelevant for the ad auction: it has nothing to do with click-through probability, because the user sees it only after the click. The only thing that gives any indication about the landing page is the display URL – and its click-through rate is in fact listed as a separate factor.

It is reasonable to assume that there are many more signals Google utilizes to refine its estimate. How exactly all of this is used in estimating click-through probability is anything but trivial and cannot be determined from the outside looking in. However, we can assume that a quality score for newly added keywords or ads is calculated based on factors like relevancy and more or less related click-through rates (whole account, similar keywords, etc.). When more and more data is collected, those factors slowly fade into the background.
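One simple way to picture this fading is a weighted blend of an initial guess and the observed data. To be clear, this is not Google’s method, which is unknown; it is just a common smoothing technique, and the function name, the prior guess of 3% and the weight of 200 pseudo-impressions are all assumptions made for the sketch.

```python
def estimated_click_probability(clicks: int, impressions: int,
                                prior_ctr: float, prior_weight: float = 200.0) -> float:
    """Blend a prior guess with observed clicks and impressions.

    The prior guess (for example derived from relevancy or related
    click-through rates) counts as `prior_weight` pseudo-impressions,
    so its influence shrinks as real impressions accumulate.
    """
    return (prior_ctr * prior_weight + clicks) / (prior_weight + impressions)


prior_guess = 0.03  # hypothetical estimate for a brand-new keyword

# Hypothetical history of an ad whose observed click-through rate is 5%.
for impressions in (0, 200, 2_000, 100_000):
    clicks = impressions // 20
    estimate = estimated_click_probability(clicks, impressions, prior_guess)
    print(f"{impressions:>7} impressions: estimate {estimate:.2%}")
```

With no data the estimate equals the initial guess of 3%. After 200 impressions it sits halfway at 4%, and with enough impressions it is practically identical to the observed 5%.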

**The Secret of Quality Score**

In the first post I promised answers for three questions. Now we have these answers:

*What is quality score good for?*

To maximize profits for Google.

*What is quality?*

Click-through probability.

*What is quality score?*

One might think this is, again, click-through probability. However, as outlined above, click-through probability is an intangible variable which can only be estimated as well as possible. That estimate is quality score.

So, again in clear bold letters:

**Quality score is Google’s estimate for click-through probability.**

And there we have revealed the secret of quality score.

**Is that really true?**

As I mentioned in the first post: I didn’t break into the Googleplex to steal the secret. Neither has Google in any way confirmed that this post is correct. So I honestly have to call it a theory.

On the other hand one has to consider that this theory is simply following up on the idea of Google trying to make as much money as possible. So if this theory is wrong, this would mean one of two things: either Google doesn’t really want to maximize its earnings or they simply haven’t come up with the idea I just presented. Both seem unlikely to me.

But what about actual quality, you might ask. Wouldn’t this mean that Google corrupts its search results in order to make a quick buck? No, because there is no contradiction here. If quality is click-through probability, this basically says that a good ad is an ad that gets clicked often. This is a definition of quality that works for the users, too, as long as Google makes sure that shady advertisers cannot abuse the system. That is accomplished by high minimum bids and advertising policies that exclude such advertisers.

By using click-through probability as a measure of quality, Google makes as much money as possible and users get good search results – a win-win situation. Of course, Google only highlights the part where they deliver high-quality search results and downplays the fact that it makes them a ton of money. But that’s not a conspiracy, it’s just marketing.

**So what now?**

The big secret has been revealed, but that doesn’t mean we’re done. Using this knowledge, we can dig a little deeper into how quality score works. We can use it to correct myths and misunderstandings, as well as answer some open questions about quality score. We can even speculate about previously unknown factors. So, in the coming days and weeks, several posts based upon this one will follow.

Until then, thanks for reading.

{ 13 comments }

Hey Martin, very interesting.

It’s hard for me to believe that Google only cares about maximizing profit. I thought it was all about the searcher having the best experience?

So, does this mean that we need to focus on CTR in order to have the best CTR probability?

Thanks,

Chad

Hi Chad,

Thanks for your comment.

I think for the most part, there is no contradiction between quality for Google and quality for the searcher: click-through probability works for both. But when Google has to choose either quality or profits, it’s usually profits. I wrote about this in the second post of the series. For example, Google is doing a poor job of educating people about match types. People understanding match types would increase quality, but advertisers using broad match without knowing about the pitfalls is more profitable…

In the end, the concept of click-through probability does a better job of explaining how things work than the vague information Google gives us. It just fits too beautifully…

“So, does this mean that we need to focus on CTR in order to have the best CTR probability?”

Yeah, but I have to admit, this isn’t really new. Google tells us that CTR is by far the most important factor in Quality Score. This post explains why.

I think it’s important to understand that Google is really clueless about click-through probability / quality score at the start of a new account, ad, or keyword. All they can do is guess from other signals: is the keyword in the ad? (relevancy) How did the advertiser do in the past? How did similar ads do in the past? (related performance data)

The more Google knows about actual CTR, the less it has to use substitutes like all sorts of relevancy and related performance data.

Hello Martin,

thanks for the interesting article – I really enjoyed reading it. But I have to add one thing I noticed about quality score:

1. Landing Page is only relevant to minimum bid and has no effect on quality score:

Keywords with low quality score (=3) can be uplifted by putting the keyword in the click url – this means that the click url resp. the landing page has an impact on the quality score.

This leads to my second addition:

Google is only interested in the click (=short term profit):

I do believe that Google is more interested in user experience to protect long-term profits than in just getting the next click… this might lead back to my first point: landing page influences quality score. Keywords with high minimum bids always have a low quality score, so there must be a connection between QS and landing page.

Cheers,

Thorsten

Thorsten,

Thanks for your comment. You are right about there being a connection between landing page quality and minimum bids, but in terms of ad position, landing pages just don’t matter. (see the AdWords help page linked in the article: “For calculating a keyword-targeted ad’s position, landing page quality is not a factor.”) The problem is, there is more than one thing that’s called “quality score”. My article is about the quality score used in the ad auction – the one that determines your ad position and the price you pay per click.

I believe the lack of influence of landing page quality on ad position is a crucial fact. If Google were in fact primarily interested in user experience, landing pages would have to be a factor when ad position is calculated. After all, landing pages are a vital part of user experience. But Google admits that landing pages don’t matter for ad positions.

To be clear, Google won’t corrupt the search results just for the quick buck. But shady advertisers and advertisers with slow or otherwise bad landing pages are filtered out with high minimum bids. When those minimum bids are determined, landing page quality is a factor. When the ad position is determined, landing page quality is irrelevant.

Good post. I’ve subscribed to your blog now.

Have you seen “Web Scale Bayesian Click-Through Rate Prediction for Sponsored Search Advertising in Microsoft’s Bing Search Engine”? http://research.microsoft.com/apps/pubs/default.aspx?id=122779

The paper describes how Bing estimates click-through rate and how this affects their search auction.

Oh thanks. I didn’t know the paper… already printing

Very strong article. I just got through reading Andrew Goodman’s newest book on AdWords, and some of the concepts he emphasized when describing how Quality Score is calculated (the importance of account history, why new advertisers should start small and then gradually add new groups/campaigns, industry-wide behavior on advertiser’s side) made sense in light of my own experience, but not in terms of reasoning.

Until today, that is.

I think a discussion of CTR probability should be an essential aspect of any intermediate level understanding of AdWords. Thank you very much for this.

One question: What is your estimate of the importance of history when predicting a quality score for a new keyword? It seems like history might be the best indicator…but that would fly in the face of the advice we’re given about matching search text to ad text.

Thanks for your kind words, Jason

You raise a good question about the importance of account history. I planned to write an article about that because I believe many AdWords managers heavily consider this factor when starting new accounts. It seems to be considered best practice to always start with your brand, to build up a good account history.

There is no doubt that account history is a factor, but we can’t say how important it is. For new keywords, there will always be other factors to be considered, like relevancy. The performance of similar keyword-ad combinations is probably considered, too… I’d think that Google probably even uses the performance of similar keyword-ad combinations from competitors. The bottom line is, overall account history is just one thing to be considered.

Of course, you can’t have too many things speaking for your quality score. The numbers you see in your account go from 1 to 10. Click-through probability, on the other hand, certainly won’t reach 100%, so there’s always room for improvement.

In terms of a practical strategy, I would say this: if there is an easy way to make account history work for you, do it. If you set up a new account, you might as well start with your brand terms right away. But I wouldn’t delay the start of the rest of the account just to build up a strong history. There is no way to know for sure if that has much of an impact, so I’d always put business goals first.

Martin,

Does the CTR probability stop there or do you think they calculate relative CTR probability? The relative portion being with respect to other ads that could be in the specific auction. In particular, do you think they group or pair different ads to see how they would do against each other specifically or as a set? For instance if ad A & B have pretty much the same content but ad C is very different will that impact the probability of a click for any of the ads? And in which way for A/B or for C?

Dave

Dave,

That’s a good thought… I think interdependencies between ads are likely to have an impact on click-through probability. So in order to improve the estimates, Google should probably consider these. But I have my doubts that it can be done…

You would need an awful lot of data to arrive at any conclusions. One would have to eliminate so many other influences, basically everything else that drives click-through probability. Then you would have to look at all possible combinations of things that might influence each other’s CTR – there could be up to eleven ads that influence each other.

Position would be an important factor, too: the ad in the top position probably has a bigger influence on all the others; two ads right next to each other might influence each other more. This would turn the ad auction upside down, because quality score would determine position and position would determine quality score – you would have to iterate until there’s a stable ad order.

Another way of looking at it is this: quality score was introduced in 2005. It took until 2008 for Google to announce that they were now able to eliminate the influence of ad position when evaluating CTR. This seems like a much simpler problem compared to estimating ad interdependencies. So I’d say they probably don’t do that.

Hi Martin,

Great job explaining why even Google has to estimate QS on sparse data. It is a nontrivial problem. I wrote a series of articles on QS estimation and on the difference between the theoretical auction marketplace of the AdWords help pages and the real one. You might find the real-life examples interesting.

http://searchengineland.com/the-subtle-science-of-bidding-part-3-second-order-effects-46983

I also wrote a piece on SEL a while ago which showed that the R^2 of QS vs CTR was 90%. Essentially CTR is a proxy for QS. Again, this echoes your findings.

Sid

Hi Sid,

Thank you. Yes, I know your articles and your findings on QS-CTR correlation – I saw your presentations at SMX Seattle last year and have been a fan ever since.

Actually, I had your series about the subtleties of bidding in mind for an article I am currently working on. I believe I can explain the effects you describe in parts 1 and 2. It will probably be posted here in the next few days or so.

Martin

— COMMENTS ARE NO LONGER WORKING —

This site is dead and the blog is no longer working properly. In case someone tries/tried to comment it seems no notifications are sent. I managed to log in, but the admin panel is broken and I can’t get to the comments. If there are any comments awaiting moderation, I can’t get there and I am truly sorry…

In any case, you can always reach me over Twitter (@bloomarty) or at http://www.PPC-Epiphany.com.
