Making Google richer: optimal algorithms for the AdWords auction

Instead of my usual focus on AdSense, today I'm going to talk about a related topic, AdWords. On September 29, I attended a talk by Professor Umesh Vazirani, a well-known computer scientist from UC Berkeley. He was the opening speaker for this year's Distinguished Lecture Series sponsored by the University of Waterloo's School of Computer Science. The talk was titled Making Google Richer: Optimal Algorithms for the AdWords Auction. It wasn't specifically about AdWords per se, but about advertisement auctions in general.

As most of you know, AdWords uses a competitive bidding system that lets advertisers choose how much they're willing to pay to have their ads displayed in conjunction with specific keywords. I wrote about this briefly in Make Easy Money with Google, but I didn't spend a lot of time on it other than to mention that Google had to license some of the technology from Yahoo!, which acquired it when it bought Overture. Prof. Vazirani briefly went over some of this history and also explained in general how AdWords works for the benefit of the audience members (mostly computer science students and professors) who aren't as familiar with these concepts as those of you reading this article.

The problem faced by Google and other companies that implement bid-for-placement systems is to devise a bidding algorithm that generates the maximum revenue for the auctioneer. (Prof. Vazirani mentioned that Yahoo! was looking to redesign its algorithms because they felt they were leaving money on the table.) As it turns out, the “obvious” answer of ranking the ads by bid price (or some variation of bid price, such as Google's early attempts at using bid price multiplied by the clickthrough ratio) is not optimal. In many cases, this so-called “greedy” algorithm can in fact prove quite detrimental.

The key to all of this is the advertiser's budget, the total amount of money they're willing to spend on their bids within a certain time period. (For AdWords advertisers, the budget is calculated on a daily basis.) Using some complicated mathematical analysis, Prof. Vazirani and his colleagues were able to come up with an optimal algorithm for ranking bids by factoring each advertiser's remaining budget into the equation. In other words, the more of its budget an advertiser has already spent, the less likely its ads are to be chosen. In the greedy algorithm, the advertisers who bid the highest end up dominating the auction until their budgets are spent. In the optimal algorithm, that domination doesn't happen and other advertisers get to spend their budgets too. The end result is more revenue for the auctioneer, because most of the advertisers are spending all or most of their budgets.
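To make the difference concrete, here is a minimal Python sketch, not anyone's production code. It assumes a very simplified model: one ad slot per query, the winner pays exactly what it bid, and the budget-aware rule scales each bid by 1 - e^(x - 1), where x is the fraction of the budget already spent (the tradeoff function used in the published version of this research, as far as I can tell). The advertisers, keywords, bids and budgets are all invented for illustration, and amounts are in cents to keep the arithmetic exact.

```python
import math


def run_auction(queries, bids, budgets, budget_aware):
    """Assign each query to one advertiser and return total revenue (cents).

    budget_aware=False ranks by raw bid (the "greedy" rule).
    budget_aware=True scales each bid by 1 - e^(x - 1), where x is the
    fraction of that advertiser's budget already spent (assumed tradeoff).
    """
    spent = {adv: 0 for adv in budgets}
    revenue = 0
    for query in queries:
        best, best_score = None, 0.0
        for adv, budget in budgets.items():
            bid = bids[adv].get(query, 0)
            # Skip advertisers with no bid or not enough budget left.
            if bid == 0 or spent[adv] + bid > budget:
                continue
            if budget_aware:
                fraction_spent = spent[adv] / budget
                score = bid * (1 - math.exp(fraction_spent - 1))
            else:
                score = bid
            if score > best_score:
                best, best_score = adv, score
        if best is not None:
            spent[best] += bids[best][query]
            revenue += bids[best][query]
    return revenue


# Hypothetical scenario: A slightly outbids B on "shoes", and only A
# bids on "boots". Each advertiser has a daily budget of $100.
bids = {
    "A": {"shoes": 100, "boots": 100},
    "B": {"shoes": 99},
}
budgets = {"A": 10_000, "B": 10_000}
queries = ["shoes"] * 100 + ["boots"] * 100

greedy_revenue = run_auction(queries, bids, budgets, budget_aware=False)
aware_revenue = run_auction(queries, bids, budgets, budget_aware=True)
print("greedy:", greedy_revenue, "budget-aware:", aware_revenue)
```

In this made-up scenario the greedy rule lets advertiser A burn its whole budget on the first keyword, leaving nobody to pay for the second. The budget-aware rule shares the first keyword between A and B, so A still has money left when the second keyword shows up.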

It's certainly an interesting topic and a neat mathematical problem. If you want more details, see the article Computer Scientists Optimize Innovative Ad Auction (PDF), which goes over the problem and the solution in more detail. (As an amusing note, the article uses Vioxx as an example of a high-paying keyword. Readers of my book know how well Vioxx did for me as an AdSense publisher!)

Of course, we may never know if companies like Google or Yahoo! actually incorporate this work into their own algorithms. They're definitely following the research with some interest, though, as Prof. Vazirani has presented it to both companies.

P.S.: Today's the last day this month to join my mailing list and get a chance at winning a signed copy of my book.

Knowing when the Alexa Toolbar visits

Since my last posting about the relevancy of Alexa rankings I've had some questions about detecting when the Alexa Toolbar is being used. It's really quite simple, actually, if your web server is logging the right information.

You see, most browsers send what is called a user agent string to the web server whenever they request a page. The user agent string is meant to identify the “user agent” (the browser, which is acting as an “agent” for the user) to the web server. The web server can use this information in different ways, perhaps by serving up different content formatted for different browsers.

[Note: There are actually better ways of doing that than relying on the user agent string, but let's not get too technical here. If you really want to know the details, see my articles Masquerading Your Browser and How to Detect Internet Explorer.]

Browsers are not required to send a user agent string, and for privacy reasons some people turn it off. Some browsers even send different user agent strings to masquerade as a certain kind of browser. This often happens when you want to visit sites with Firefox/Mozilla/Opera that have been “designed for Internet Explorer only” and are obnoxious about it. Non-browsers often send user agent strings too: web crawlers like the Googlebot or the Mediapartners crawler identify themselves with a user agent string. In fact, using the user agent information to figure out when Google is crawling your site is infinitely more useful than figuring out when the Alexa Toolbar is being used.

Anyhow, browsers with the Alexa Toolbar installed include the phrase “Alexa Toolbar” in their user agent string. So all you need to do is configure your web server to include user agent information in the log files it generates. What's a log file, you ask? A log file is basically just a text file to which the web server writes (“logs”) important information, such as which pages are being accessed and when. If you're lucky, your log files are already tracking the user agent. If not, you'll have to consult your web server documentation (or ask your web hosting service) for instructions. (If you're running a blog using a free service and you don't have access to those logs, then there's not much I can do to help you, sorry!)
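Once the user agent is in your logs, picking out the Alexa Toolbar visits takes only a few lines of code. Here's a rough Python sketch that assumes your server writes the common Apache-style “combined” log format, where the user agent is the last quoted field on each line. Your format may well differ, so treat the pattern as a starting point. The two sample log lines (addresses, dates, pages) are made up for illustration.

```python
import re

# Assumed layout (Apache "combined" format):
# host ident user [time] "request" status bytes "referer" "user agent"
COMBINED_LOG = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)


def alexa_hits(lines):
    """Return (time, request) pairs for requests sent with the Alexa Toolbar."""
    hits = []
    for line in lines:
        match = COMBINED_LOG.match(line.strip())
        # Lines that don't fit the assumed format are simply skipped.
        if match and "alexa toolbar" in match.group("agent").lower():
            hits.append((match.group("time"), match.group("request")))
    return hits


# Made-up sample lines; in practice you would pass an open log file instead.
sample_log = [
    '203.0.113.5 - - [30/Sep/2005:10:15:02 -0400] "GET /index.html HTTP/1.1" '
    '200 5120 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; Alexa Toolbar)"',
    '198.51.100.7 - - [30/Sep/2005:10:16:40 -0400] "GET /about.html HTTP/1.1" '
    '200 2048 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
]

hits = alexa_hits(sample_log)
for when, request in hits:
    print(when, request)
```

Swapping the search phrase for “googlebot” would give you the crawler visits instead, which is arguably the more useful report.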

And if you're curious about the user agent string your own browser is sending, visit my web browser header viewer page to see what exactly my web server knows about your web browser.

Are Alexa rankings relevant?

Advertisers, search engines and other interested parties are always looking for ways to rank sites against each other. Everyone knows about Google's PageRank, for example. But what about Alexa rankings?

For those who don't know, Alexa is an Amazon subsidiary that provides tools for navigating, classifying and searching the Web. Some of these features are provided in partnership with other companies; basic search technology is provided by Google, for example. Alexa's main tool is the Alexa Toolbar (that version is for Internet Explorer; Firefox/Mozilla/Netscape users can install the A9 Toolbar instead). Besides providing easy access to Alexa information, the Toolbar actually collects statistics on what sites users are visiting and sends them back to Alexa for compilation. This sampling allows Alexa to assign popularity rankings for a given site. For example, you can see that the Alexa ranking for MakeEasyMoneyWithGoogle.com shows this site gaining readership over the past few months, finally breaking into the “top 100,000” sites.

The problem with Alexa's rankings is that their accuracy depends on the number of browsers running the Toolbar. I think these days many more people run the Google Toolbar or simply use similar functionality already built into modern browsers like Firefox. And the Alexa Toolbar was initially only available on Internet Explorer anyhow, so it was ignored by other browser users.

Of course, blogs and other syndicated content read through news aggregators don't get counted in those statistics. (Mind you, blog stats are another problem entirely.)

So take the Alexa rankings with a grain of salt. Use them more for relative comparisons. If you're concerned about your own site's ranking, then load up the Toolbar and visit your own site once or twice a week, just to get some activity going. But spend your time worrying about other things like writing good content and following good search engine optimization techniques — they'll benefit you much more in the long run.

Eric Giguere is the author of the introductory AdSense book Make Easy Money with Google. Be sure to download and read the free sample chapter.
