Usage Specific robots.txt Directives - Dare Obasanjo's weblog

February 12, 2007

@ 08:13 PM

A couple of weeks ago I read a blog post by Matt Cutts entitled "What did I miss last week?" where he wrote:

- Hitwise offered a market share comparison between Bloglines, Google Reader, Rojo, and other feed readers that claimed Bloglines was about 10x more popular than Google Reader. My hunch is that both AJAX and frames may be muddying the water here; I’ve mentioned that AJAX can heavily skew pageview metrics before. If the Google Reader team gets a chance to add subscriber numbers to the Feedfetcher user-agent (which may not be a trivial undertaking, since they probably share code with other groups at Google that fetch using the same bot mechanism), that would allow an apples-to-apples comparison.

I was thinking about the fact that Google Reader can't make changes to the Feedfetcher user agent without tightly coupling their own service to a general platform component that likely serves Google Reader, Google Homepage, Google Blog Search and other services. Then I realized that using one user agent for all of these services pretty much makes it impossible for webmasters to exclude themselves from some of Google's crawlers but not others.

Exactly how would one go about creating a robots.txt file that keeps your feed from showing up in Google Blog Search results but doesn't end up excluding you from Google Reader and Google Homepage as well? I can't think of a way to do this but maybe it's because my kung fu is weak. Any suggestions?
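
To make the problem concrete, here is a minimal sketch using Python's standard urllib.robotparser. The Feedfetcher-Google token, the ExampleBot crawler, the feed URL and the service-to-agent mapping are all assumptions for illustration, not a description of how Google's fetchers are actually wired up. The only point is that robots.txt rules are keyed by user agent, so every service that shares one agent gets the same answer.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: the user-agent token is the only handle we have.
ROBOTS_TXT = """\
User-agent: Feedfetcher-Google
Disallow: /feed.xml
"""

# Assumed mapping of services to the user agent they fetch with.
# The first three share one agent, as the post supposes.
SERVICES = {
    "Google Reader": "Feedfetcher-Google",
    "Google Homepage": "Feedfetcher-Google",
    "Google Blog Search": "Feedfetcher-Google",
    "Some other crawler": "ExampleBot",
}

FEED_URL = "http://example.com/feed.xml"  # placeholder URL


def can_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# One decision per service, driven purely by the agent it uses.
results = {
    name: can_fetch(ROBOTS_TXT, agent, FEED_URL)
    for name, agent in SERVICES.items()
}

for name, allowed in results.items():
    print(f"{name}: {'allowed' if allowed else 'blocked'}")
```

Under these assumptions the three Google services are either all blocked or all allowed. Nothing in the file can tell one of them apart from the others.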

PS: This isn't work related.

Categories: Web Development

