Are AdSense publishers being favored with more frequent indexing?

Today I was going to address some of the comments that Stu Drew left about managing to get a high ranking for his private-label rights articles blog entry, but I'm going to defer that to a later time. If you're interested in that topic, let me point you to an article I've written about the so-called “Google Sandbox” that should address some of the questions: Redcowl Bluesingksy: Why the Google Sandbox Doesn't Exist.

I want to talk some more about Google's indexing of AdSense pages. In case you hadn't heard, Googler Matt Cutts confirmed that the AdSense crawler is feeding pages into Google's new “BigDaddy” search indexes. This confirms what others had noticed about what the AdSense crawler (usually referred to as the “mediabot”) is doing. Or does it?

As always, there are different ways to look at what's happening. We know that pages crawled by the mediabot are now making their way into the Google search index. What we don't know, however, is whether those pages are being pushed or pulled into the index. Let me explain.

Let's think of the innards of the Google search engine as a bunch of black boxes. (Disclaimer: I have no special knowledge of how things actually work internally.) For our purposes, we're only concerned with three of those boxes:

  1. The manager maintains a list of URLs and decides when each needs to be indexed
  2. The crawler (this is the Googlebot) goes out and fetches pages for indexing
  3. The indexer takes crawled pages and indexes and ranks them using proprietary algorithms

At some point, the manager decides that a given URL needs to be recrawled. It decides this based on age, Google Sitemaps, PageRank, whatever. No one disputes that different sites get crawled with different frequencies, and the manager is the one making those decisions. So it tells the crawler to fetch the page. The fetch may not happen right away, but once it's done the crawler tells the manager that the page has been fetched, and the manager then passes the page to the indexer for processing.
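To make the three boxes concrete, here is a minimal Python sketch of that flow. It is purely illustrative: the class and method names (`Manager`, `Crawler`, `Indexer`, `schedule`, `run`) and the `web` dictionary standing in for the live web are all invented for this example, and none of it reflects how Google actually works internally.

```python
from dataclasses import dataclass, field


@dataclass
class Crawler:
    """Stands in for the Googlebot: fetches the contents of a URL."""
    web: dict  # hypothetical stand-in for the live web: URL -> page contents

    def fetch(self, url):
        return self.web[url]


@dataclass
class Indexer:
    """Stores whatever pages it is handed (ranking is omitted here)."""
    index: dict = field(default_factory=dict)

    def process(self, url, page):
        self.index[url] = page


@dataclass
class Manager:
    """Decides which URLs are due for a recrawl and drives the other boxes."""
    crawler: Crawler
    indexer: Indexer
    due: list = field(default_factory=list)  # URLs waiting to be recrawled

    def schedule(self, url):
        self.due.append(url)

    def run(self):
        # Only the manager triggers fetches, and only the manager
        # hands pages to the indexer.
        while self.due:
            url = self.due.pop(0)
            page = self.crawler.fetch(url)
            self.indexer.process(url, page)


web = {"http://example.com/": "version 1"}
indexer = Indexer()
manager = Manager(Crawler(web), indexer)

manager.schedule("http://example.com/")
manager.run()

# The page changes after the crawl; the index will not see the change
# until the manager schedules the URL again.
web["http://example.com/"] = "version 2"
```

The point of the sketch is that the index only ever changes when the manager says so: the edit to "version 2" stays invisible until the URL is scheduled again.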

Now throw the AdSense crawler into the mix and see what happens. The case that concerns the SEO community is the one where the mediabot pushes its pages directly to the indexer, bypassing the manager's controls. In this scenario, changes to AdSense pages could be noticed much more quickly than they would be through the normal crawling process, giving those pages an unfair advantage. In this “push” model, the AdSense crawler effectively acts as a secondary manager.

The “pull” model, on the other hand, only affects the crawler. When the manager asks the crawler to get the contents of a given URL, the crawler first checks with the mediabot to see if the latter has crawled the page recently, where “recently” can be any reasonable length of time, say 24 hours. If it has, the crawler just returns a copy of what the mediabot saw instead of going out to fetch the page contents again. The manager is still in control in this scenario: it alone decides when a page is to be crawled.
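Here is a rough Python sketch of the pull model as described above. Everything in it is an assumption made for illustration: the `Mediabot` and `Crawler` classes, the `recent_copy` method, and the `fetch_from_web` callback are hypothetical names, and the 24-hour window is just the example figure from the paragraph, not a known Google setting.

```python
RECENT_SECONDS = 24 * 60 * 60  # "recently" = 24 hours, the example figure above


class Mediabot:
    """Stand-in for the AdSense crawler; remembers what it fetched and when."""

    def __init__(self):
        self.seen = {}  # url -> (timestamp in seconds, page contents)

    def record(self, url, page, now):
        self.seen[url] = (now, page)

    def recent_copy(self, url, now):
        """Return the stored page if it is recent enough, otherwise None."""
        entry = self.seen.get(url)
        if entry is not None and now - entry[0] <= RECENT_SECONDS:
            return entry[1]
        return None


class Crawler:
    """Stand-in for the Googlebot, called only when the manager asks."""

    def __init__(self, mediabot, fetch_from_web):
        self.mediabot = mediabot
        self.fetch_from_web = fetch_from_web  # hypothetical real-fetch callback
        self.web_fetches = 0

    def fetch(self, url, now):
        # Pull model: reuse the mediabot's copy if it is recent enough.
        cached = self.mediabot.recent_copy(url, now)
        if cached is not None:
            return cached
        # Otherwise go out and fetch the page as usual.
        self.web_fetches += 1
        return self.fetch_from_web(url)


mediabot = Mediabot()
crawler = Crawler(mediabot, fetch_from_web=lambda url: "fresh copy")
mediabot.record("http://example.com/", "mediabot copy", now=0)

# One hour later the mediabot's copy is still recent, so it is reused.
first = crawler.fetch("http://example.com/", now=3600)
# Two days later the copy is stale, so the crawler fetches the page itself.
second = crawler.fetch("http://example.com/", now=2 * RECENT_SECONDS)
```

Notice that nothing in the sketch lets the mediabot start a crawl or touch the index. It only saves the crawler a trip when the manager has already asked for the page.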

What I've been assuming is that Google is using the pull model, not the push model. Others are assuming the reverse (and the worst), hence the controversy. We need someone from Google to clarify this issue for us…

Eric Giguere is the contextual advertising expert who wrote Make Easy Money with Google and Uncommon AdSense. You can read this blog by email if that's more convenient for you: just send a blank email to memwg-blog@aweber.com to subscribe.

Comments

5 Responses to “Are AdSense publishers being favored with more frequent indexing?”

  1. Rajput Jitendra on February 18th, 2008 6:47 am

    Hi,

    Nice article. If I understand correctly, this means that if I put AdSense on my newly launched website, it will be indexed more quickly than through normal indexing. Am I right? Because I would get the advantage of the AdSense bot caching my pages more frequently.

    Jitendra
    http://www.tatvasoft.com

  2. Eric Giguere on February 18th, 2008 9:54 am

    No, not necessarily. This is just a way for Google to share its view of a page between the different bots Google sends out. It doesn’t mean that your page will be crawled more quickly or more often.

  3. Rajput Jitendra on February 18th, 2008 11:08 am

    So with AdSense, the mediabot only crawls for Google ads and has no effect on my ranking or indexing frequency. But if it takes a cached view of my pages, what does it do with it? It must definitely be stored somewhere to return quick results for the website and save bandwidth. So whenever a search query matches it, the website shows up in the results, because AdSense ads are generated automatically by Google and depend on the page content, and if the page content is closely related to the Google query then the page will come up.

  4. Eric Giguere on February 18th, 2008 6:49 pm

    No, it doesn’t work that way. The cache we’re talking about here is separate from the cache used to store search engine pages for querying… it’s just that Google has so many bots going out and querying the same sites that it made sense to get them to share the input at least…

  5. Rajput Jitendra on February 18th, 2008 11:17 pm

    OK, I understand now. Thank you for the nice chat. Keep me updated and give me new information about Google.

    Jitendra
    http://www.tatvasoft.com
