Web pages are constantly crawled by bots, gathering data and adding that to their indexed. This is a little test to find out a bit more about how the bots make those requests.
The idea is simple. I gather all the data in the request and place pertinent information from it into the title tag, meta description and content of the page. This information will get indexed and used by each bot. Once the information is shown by the bot owner, I can see that information. e.g. Google may use the information in their search result snippets or the cached version of the page. Check the cached version of this page to see how the last Googlebot crawl looked like.
Note that you will see the request header information below in relation to your browser making the request.
|Accept-Encoding||x-gzip, gzip, deflate|
|If-Modified-Since||Sun, 18 Mar 2018 11:44:11 GMT|
I performed a Fetch As Google in the search console to test the idea. As expected, the UserAgent was
Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
Google also includes a header value for From set to googlebot(at)googlebot.com. I wonder if the bot answers emails ;-)
Google Mobile: Smartphone User-Agent:
Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.96 Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
Google Mobile: XHTML/WML User-Agent:
SAMSUNG-SGH-E250/1.0 Profile/MIDP-2.0 Configuration/CLDC-1.1 UP.Browser/18.104.22.168.c.1.101 (GUI) MMP/2.0 (compatible; Googlebot-Mobile/2.1; +http://www.google.com/bot.html)
Google Mobile: cHTML User-Agent:
DoCoMo/2.0 N905i(c100;TB;W24H16) (compatible; Googlebot-Mobile/2.1; +http://www.google.com/bot.html)