Over the past couple of years, Google has spent a significant amount of time incorporating feedback and making improvements to its Webmaster Tools. From insight into how Google crawls and indexes your site to notifications about an outdated WordPress installation, this article highlights some of the more useful (and free) functionality Google offers once the domain verification process is complete.
Crawling and Indexation
As Googlebot crawls a site, it records the errors it encounters along with data on how pages are added to its index. That information is surfaced in the following areas of Google Webmaster Tools.
Google separates errors at the site and URL level. Site errors occur when Google can’t access your site at all, while URL errors occur when there is a problem accessing specific URLs.
The top 1,000 URL errors are listed and can be downloaded. In most cases, clicking a URL in the report shows the pages that link to it. Google’s crawl error definitions clarify what is considered an error.
This report appears to be accurate, and it has been used to find correlations between spikes in server errors and drops in traffic.
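As a rough illustration of that kind of check, the sketch below computes a Pearson correlation between daily server error counts and daily organic visits. The two series are made-up numbers for illustration only, and the variable names are assumptions; in practice the values would come from the downloaded crawl error report and your analytics package.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length (at least 2 points)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov / sqrt(var_x * var_y)

# Made-up illustrative numbers, not real report data.
daily_server_errors = [2, 3, 2, 40, 55, 4, 3]
daily_organic_visits = [980, 1010, 995, 610, 540, 960, 1000]

r = pearson(daily_server_errors, daily_organic_visits)
print(f"correlation between server errors and visits: {r:.2f}")
```

A value close to -1 would suggest that traffic falls on the days server errors spike. It does not prove the errors caused the drop.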
In the Crawl Stats area, graphs show the past 90 days of Googlebot activity: pages crawled, kilobytes downloaded, and time spent downloading a page.
On July 24, Google announced the Index Status tool. This provides a window into the indexation of the site.
This can be extremely useful for catching and correcting severe drops in indexation, or simply for confirming the expected effect of recommended changes once they have been implemented on a site.
Especially applicable to ecommerce stores, Google’s Maile Ohye recently put out some much-needed documentation and explanation on exactly how the Parameter Handling Tool works. In it, she stresses that the tool is powerful and should be used with caution.
The Parameter Handling Tool provides a way to tell Google what effect a particular parameter has on a page. By identifying which URLs Google doesn't need to crawl, you let Googlebot focus on the canonical pages and save resources.
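To get a feel for which parameters only create duplicate URLs, it can help to group a list of crawled URLs by what they look like once those parameters are removed. The sketch below is one way to do that offline. The parameter names in `IGNORABLE_PARAMS` and the example URLs are hypothetical; which parameters leave the content unchanged depends entirely on your own site.

```python
from collections import defaultdict
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical parameters assumed NOT to change what the page shows.
IGNORABLE_PARAMS = {"sessionid", "sort", "utm_source"}

def canonical_url(url, ignorable=IGNORABLE_PARAMS):
    """Return the URL with ignorable query parameters (and any fragment) removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in ignorable
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))

def group_by_canonical(urls):
    """Map each canonical URL to the list of crawled variants that collapse into it."""
    groups = defaultdict(list)
    for url in urls:
        groups[canonical_url(url)].append(url)
    return dict(groups)

# Example URLs are invented for illustration.
crawled = [
    "http://example.com/shoes?color=red&sort=price",
    "http://example.com/shoes?color=red&sessionid=abc123",
    "http://example.com/shoes?color=blue",
]

for canonical, variants in group_by_canonical(crawled).items():
    print(canonical, "<-", len(variants), "variant(s)")
```

Groups with many variants point at parameters worth reviewing in the tool. Parameters that do change the content, like `color` in this invented example, are deliberately left in place.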
Under Configuration, then Settings, you can configure how fast Google crawls your site.
Typically, leaving this alone is a good idea unless there is a significant reason to limit Google’s crawling.
Last November, Google announced the availability of search query data in Google Webmaster Tools.
Linking Webmaster Tools with Google Analytics will give you even more ways to slice and dice the data. When looking at search query data, it is important to note that (after January 25th) the position reported reflects the highest position at which your site appeared for each search. Google provides a good example:
“Let’s say Nick searched for [bacon] and URLs from your site appeared in positions 3, 6, and 12. Jane also searched for [bacon] and URLs from your site appeared in positions 5 and 9.”
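The arithmetic behind that example can be sketched as follows, assuming the reported figure is the average of the best (lowest-numbered) position from each search. The function names are illustrative, and the averaging over every listed position is included only for comparison.

```python
# Positions at which the site's URLs appeared, per search, from Google's example.
searches = {
    "Nick": [3, 6, 12],
    "Jane": [5, 9],
}

def average_top_position(searches):
    """Average of the best (numerically lowest) position in each search."""
    tops = [min(positions) for positions in searches.values()]
    return sum(tops) / len(tops)

def average_all_positions(searches):
    """Average over every position in every search, shown for comparison."""
    all_positions = [p for positions in searches.values() for p in positions]
    return sum(all_positions) / len(all_positions)

print(average_top_position(searches))   # 4.0
print(average_all_positions(searches))  # 7.0
```

Counting only the top position per search gives 4, while averaging every listed position gives 7, which is why the distinction matters when reading the report.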
Because the data in this report comes from Google, keywords that would typically show up as “not provided” appear in this report. Clicks are roughly equivalent to visits. An experiment on not provided data showed that, when the data is segmented properly, these numbers are accurate, despite the common belief that they are not.
This article has only touched the tip of the iceberg. Here’s a list of other functionality not covered:
- Forward messages to an email address
- Set a geographic target (like the United States)
- Set a preferred domain (www vs. non-www)
- Remove particular sitelinks from search result pages
- Instructions for moving to a new domain
- Add and remove users who can access the account
- See URLs blocked by robots.txt and test new directives against specified URLs
- Fetch as Googlebot to check whether all your content can be seen
- Check for malware on the site
- External linking data
- Internal linking data
- Google+ data
- Manage sitemaps
- Submit site/folder/URL removal requests
- Suggested HTML improvements
- On-page content keyword significance
- Rich snippet structured data
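Of the items above, testing robots.txt directives against specific URLs is easy to approximate locally before changing anything in Webmaster Tools. The sketch below uses Python's standard `urllib.robotparser`. The draft rules and URLs are hypothetical, and this parser does simple prefix matching rather than Google's wildcard handling, so treat the result as a first check, not a substitute for Google's own tester.

```python
from urllib.robotparser import RobotFileParser

def blocked_urls(robots_txt, urls, user_agent="Googlebot"):
    """Return the URLs that the given robots.txt text disallows for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]

# Hypothetical draft directives and URLs, for illustration only.
draft_robots_txt = """\
User-agent: *
Disallow: /checkout/
Disallow: /search
"""

urls_to_test = [
    "http://example.com/checkout/cart",
    "http://example.com/products/shoes",
    "http://example.com/search/results",
]

for url in blocked_urls(draft_robots_txt, urls_to_test):
    print("blocked:", url)
```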