The Google Webmaster Central blog is where the people at Google try to explain their systems in simple terms that we can all understand. Worth reading are a couple of posts about exactly how the Googlebot crawls websites, written as 'dates' between the bot and a website.
The first post (date) covers how the Googlebot identifies itself, which file types it will accept, and a little tip about what not to put in the robots.txt file: First date with the Googlebot: Headers and compression
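The compression tip boils down to standard HTTP content negotiation: a crawler advertises gzip support in its Accept-Encoding request header, and the server may then compress the body and say so in Content-Encoding. As a rough illustration (not Google's actual server logic; the function name and header dict are made up for the sketch):

```python
import gzip

def build_response(body: bytes, request_headers: dict) -> tuple[bytes, dict]:
    """Return (body, extra_headers), gzipping only when the client allows it.

    A minimal sketch of gzip content negotiation; real servers also weigh
    q-values and minimum-size thresholds before compressing.
    """
    accept = request_headers.get("Accept-Encoding", "")
    if "gzip" in accept.lower():
        return gzip.compress(body), {"Content-Encoding": "gzip"}
    # Client did not advertise gzip support: send the body uncompressed.
    return body, {}
```

A crawler that sent `Accept-Encoding: gzip` would get the compressed variant back and decompress it itself; any other client gets the plain body.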
In the second post the site and the bot exchange a series of letters discussing the best ways to handle different types of redirect, plus a good tip about flagging unmodified content: Date with Googlebot, Part II: HTTP status codes and If-Modified-Since
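The unmodified-content tip refers to conditional requests: when a crawler sends an If-Modified-Since header, a server that knows the page hasn't changed since that date can answer 304 Not Modified with an empty body instead of resending the page. A minimal sketch of that check (the `respond` function and its arguments are invented for illustration):

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

def respond(last_modified: datetime, request_headers: dict) -> tuple[int, bytes]:
    """Return (status, body): 304 with no body if the client's copy is fresh."""
    ims = request_headers.get("If-Modified-Since")
    if ims:
        try:
            since = parsedate_to_datetime(ims)  # parses RFC 2822 HTTP dates
        except (TypeError, ValueError):
            since = None  # malformed date: ignore the header, serve normally
        if since and last_modified <= since:
            return 304, b""  # nothing changed: save the bandwidth
    return 200, b"<html>page body</html>"
```

Answering 304 saves bandwidth on both ends, which is exactly why the post encourages supporting it.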