Introduction

Web pages are constantly crawled by bots that gather data and add it to their indexes. This is a small test to find out more about how the bots make those requests.

The idea is simple. I gather all the data in the request and place the pertinent information into the title tag, meta description and content of the page. Each bot then indexes and uses this information. Since the bot owner displays that information, I can see it. For example, Google may use it in its search result snippets or in the cached version of the page.
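As a rough illustration of that idea, here is a minimal Python sketch that echoes request details into the title tag, meta description and page content. It is not the code behind this page. The function name `render_header_page` and its parameters are hypothetical, and the headers are assumed to arrive as a plain dictionary from whatever web framework is in use. Every value is HTML-escaped, since request headers are untrusted input.

```python
from html import escape


def render_header_page(method, url, headers):
    """Build an HTML page that echoes request details in the title,
    meta description and body, so an indexing bot stores them.

    Hypothetical helper: `headers` is assumed to be a dict of
    header name -> header value supplied by the web framework.
    """
    user_agent = headers.get("User-Agent", "unknown")
    summary = f"{method} {url} requested by {user_agent}"

    # One list item per header for the visible page content.
    rows = "\n".join(
        f"<li>{escape(name)}: {escape(value)}</li>"
        for name, value in headers.items()
    )

    # escape() also escapes quotes by default, so the summary is safe
    # inside the content="..." attribute as well as in the title.
    return (
        "<!DOCTYPE html>\n"
        "<html>\n<head>\n"
        f"<title>{escape(summary)}</title>\n"
        f'<meta name="description" content="{escape(summary)}">\n'
        "</head>\n<body>\n"
        f"<h1>Request Headers</h1>\n<ul>\n{rows}\n</ul>\n"
        "</body>\n</html>"
    )
```

Calling `render_header_page("GET", "/Request-HTTP-Header-Info", headers)` with a crawler's headers yields a page whose title and description carry that crawler's User-Agent, which is what may later surface in a search snippet or cached copy.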

Note that the request header information below reflects the request made by your own browser.

Request Headers

Url /Request-HTTP-Header-Info
Method GET
UserHostAddress 54.167.155.163
UserHostName 54.167.155.163
IsSecureConnection True
IsAuthenticated False
Accept text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Encoding x-gzip, gzip, deflate
Host websiteadvantage.com.au
User-Agent CCBot/2.0 (http://commoncrawl.org/faq/)

Googlebot HTTP Headers

I performed a Fetch as Google in Search Console to test the idea. As expected, the User-Agent was:
Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
Google also includes a header value for From set to googlebot(at)googlebot.com. I wonder if the bot answers emails ;-)

Googlebot's Request Headers

Google Mobile: Smartphone User-Agent:
Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.96 Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)

Google Mobile: XHTML/WML User-Agent:
SAMSUNG-SGH-E250/1.0 Profile/MIDP-2.0 Configuration/CLDC-1.1 UP.Browser/6.2.3.3.c.1.101 (GUI) MMP/2.0 (compatible; Googlebot-Mobile/2.1; +http://www.google.com/bot.html)

Google Mobile: cHTML User-Agent:
DoCoMo/2.0 N905i(c100;TB;W24H16) (compatible; Googlebot-Mobile/2.1; +http://www.google.com/bot.html)
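
The User-Agent strings above can be told apart with simple substring checks. The sketch below is one hedged way to do that: the function name `classify_googlebot` and the labels it returns are my own, not anything Google defines. It only inspects the string, so it cannot prove that a request really came from Google, because any client can send these values.

```python
def classify_googlebot(user_agent):
    """Return a rough label for a Googlebot User-Agent string, or None.

    Hypothetical helper: the labels are illustrative only. The check is
    plain string matching, and a User-Agent can be spoofed.
    """
    ua = user_agent.lower()

    # Feature-phone crawlers (XHTML/WML and cHTML) use "Googlebot-Mobile".
    if "googlebot-mobile/" in ua:
        return "feature-phone"

    # The smartphone crawler uses the "Googlebot" token inside an
    # otherwise ordinary mobile browser User-Agent.
    if "googlebot/" in ua:
        return "smartphone" if "mobile" in ua else "desktop"

    return None
```

For example, the Nexus 5X string is labelled "smartphone", the SAMSUNG and DoCoMo strings "feature-phone", and the CCBot string from the earlier header dump gives `None`.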