Google Webmaster Tools Crawl Errors: How To Get Detailed Data From the API

Earlier this week, I wrote about my disappointment that granular data (the number of URLs reported, the specifics of the errors) was removed from Google Webmaster Tools. However, as I've been talking with Google, I've discovered that much of this detail is still available via the GData API. That this detail was available through the API wasn't at all obvious to me from reading their blog post about the changes. The post included the following:

That wording led me to believe that the current API would only provide access to the same data available from the downloads in the UI. But in any case, up to 100,000 URLs for each error type, along with the details of most of what went missing, are in fact available through the API now, so rejoice!

The data is a little tricky to get to, and the specifics of what's available vary based on how you retrieve it. Two different types of files are available that provide detail about crawl errors:

(Thanks to Ryan Jones and Ryan Smith for help in tracking these details down.)

What this means is that different slices of data are available in four ways:

What you're able to see about each error differs based on how you access it.

Eight CSV files are available through the API (you can download them all for a single site or for all sites in your account at once, as well as a specific CSV for a specific date range), but this support is not built into most of the available client libraries. You'll need to build it in yourself or use the PHP client library (which seems to be the only one that has support built in). The CSV files are:
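To illustrate, here's a minimal sketch of how a client might construct the CSV download requests. The endpoint path and parameter names (`downloads-list`, `siteUrl`, `prop`, `db`, `de`, `more`) are assumptions modeled on what the PHP client library appears to do; they are not documented guarantees, and an authenticated session is still required to actually fetch anything.

```python
from urllib.parse import quote, urlencode

# Hypothetical base path; assumed from the PHP client library's behavior.
BASE = "https://www.google.com/webmasters/tools"

def downloads_list_url(site_url):
    """URL of the per-site list that maps each CSV type to a download path."""
    return f"{BASE}/downloads-list?hl=en&siteUrl={quote(site_url, safe='')}"

def csv_download_url(path, date_begin=None, date_end=None):
    """Append an optional YYYYMMDD date range to a download path taken from
    the downloads list (the path is assumed to already contain a query string,
    so extra parameters are joined with '&')."""
    params = {"prop": "ALL"}
    if date_begin and date_end:
        params.update({"db": date_begin, "de": date_end, "more": "true"})
    return f"{BASE}{path}&{urlencode(params)}"
```

The idea is that one request lists the available CSVs for a site, and each listed path is then fetched (optionally scoped to a date range) to get the file itself.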

For the topic at hand, let's dive into the crawl errors CSV. It contains the following data:

This file does not include details on crawl error sources (but that is available through the crawl errors feed, described below).

It appears that the crawl errors feed request code is built into the Java and Objective-C client libraries, but you'll have to write your own code to request this if you're using a different library. You can fetch 25 errors at a time and loop through them all programmatically. The information returned is in the following format:
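The 25-at-a-time looping can be sketched as follows. The feed URL pattern is an assumption based on the standard GData `start-index`/`max-results` paging convention, and `fetch` is a stand-in for an authenticated GData request that returns the parsed entries of one page.

```python
PAGE_SIZE = 25  # the feed returns at most 25 errors per request

def iter_crawl_issues(fetch, site_id):
    """Yield every crawl-error entry by requesting successive 25-entry pages.

    `fetch(url)` is assumed to perform an authenticated request and return a
    list of entries; `site_id` is the (already URL-encoded) site identifier.
    """
    start_index = 1  # GData's start-index parameter is 1-based
    while True:
        url = (f"https://www.google.com/webmasters/tools/feeds/"
               f"{site_id}/crawlissues/?start-index={start_index}"
               f"&max-results={PAGE_SIZE}")
        entries = fetch(url)
        yield from entries
        if len(entries) < PAGE_SIZE:
            break  # a short page means we've reached the end of the feed
        start_index += PAGE_SIZE
```

A short page (fewer than 25 entries) signals the end of the feed, so the loop needs no separate total count.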
