Welcome to part 4 of my 7 1/2 part series on Google Webmaster Tools. In this post I am going to discuss the next section of Google Webmaster Tools, Google Index.
This section of Google Webmaster Tools lets us:
- See how many pages of our website are indexed
- See the most commonly used keywords on our website and how often they’re used
- Remove a URL or set of URLs from Google’s index
Let’s take a closer look at each section and how it can be used.
The first link under Google Index is Index Status. On this page you are able to see how many pages of your website are indexed. Right away we are presented with two options: Basic and Advanced. The Basic chart shows a linear graph of how many pages were indexed over the past year. Rolling over the line will present you with a pop-up that has a date and how many pages were indexed on that day.
Clicking the Advanced tab gives us more options to review our historical index status. The extra options are: Total Indexed, Ever Crawled, Blocked by robots and Removed. Let’s look a little closer at this:
Here we see the extra options above the graph. Checking them off and clicking Update adds the corresponding lines to the chart. The red line represents all pages that were ever crawled. Remember, just because a page is crawled does not mean that it will be indexed. Google may encounter development pages, noindex, nofollow pages, pages blocked by robots and so forth, but those pages will not show up in a search.
The next line is the blue line, which represents total indexed. This line is what the Basic chart shows when we first click on Index Status. The third line is purple and represents URLs that have been removed. These are URLs that are no longer indexed and may have been manually removed by you (which is the case here). We will cover that later in this post. A fourth line, not shown here, represents URLs blocked by the robots.txt file. Since I’m not blocking any pages on my site, no data is available for it, but this reinforces what I mentioned earlier: Google can crawl URLs without indexing all of them.
Content Keywords allows us to see how often a keyword is used throughout our website. The list shows our keywords in descending order, based on how often they’re used on our site.
The blue bar indicates how often that keyword is used; scrolling over it will give us the percentage. Clicking on a word will bring us to a new page that shows us specifics such as variants, significance, occurrence as well as a list of the URLs that contain that particular word.
On this screen we are provided a list of URLs that each contain an instance of this particular keyword. Hovering over a URL will show us a preview of that page in an on-page pop-up; clicking the URL will bring us to that page in a new tab. If there is a URL that you no longer wish to use as part of your website, you have the option to remove it via the Remove URLs section.
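To get a feel for what the Content Keywords report is summarizing, here is a rough Python sketch that counts words across a few pages and lists the URLs containing a given word. The page texts and the helper names (`keyword_counts`, `urls_containing`) are made up for illustration, and this is only a naive word count; it does not claim to reproduce how Google calculates significance or variants.

```python
import re
from collections import Counter

# Hypothetical page texts; in practice these would come from your own site.
pages = {
    "/index.html": "Webmaster tools help webmasters monitor index status.",
    "/blog.html": "Index status shows how many pages Google has indexed.",
}

def tokenize(text):
    """Split text into lowercase words (a deliberately simple tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())

def keyword_counts(texts):
    """Count how often each word appears across all page texts."""
    counts = Counter()
    for text in texts:
        counts.update(tokenize(text))
    return counts

def urls_containing(word, pages):
    """Return the sorted URLs whose text contains the exact word."""
    return sorted(url for url, text in pages.items() if word in tokenize(text))

counts = keyword_counts(pages.values())

# Print the most frequent words first, like the report's descending list.
for word, n in counts.most_common(5):
    print(word, n)

print(urls_containing("google", pages))
```

A real report would also filter out common filler words and group variants of a word together; this sketch skips both to stay short.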
Search Console added a new section to Google Index called Blocked Resources.
Here we are able to see any sections of our website(s) that were not able to be crawled by Google.
Fortunately, my site doesn’t have anything that Google can’t see, so this area is not returning any errors. If there were issues, it would list them in a chart below the graph, arranged by Host and Pages Affected.
Clicking in deeper would allow us to see the specific blocked resource and the date it was detected.
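If you want to approximate this kind of report yourself, the sketch below groups blocked resources by host and counts the pages affected. It assumes the resources are blocked by robots.txt rules, and everything in it (the hosts, the rules, the page-to-resource mapping and the function names) is hypothetical sample data rather than anything exported from Search Console.

```python
from collections import defaultdict
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents, keyed by host.
ROBOTS_BY_HOST = {
    "www.example.com": "User-agent: *\nDisallow: /assets/js/\n",
    "cdn.example.com": "User-agent: *\nDisallow:\n",
}

# Hypothetical pages and the resources each one loads.
PAGE_RESOURCES = {
    "http://www.example.com/": [
        "http://www.example.com/assets/js/app.js",
        "http://cdn.example.com/style.css",
    ],
    "http://www.example.com/about.html": [
        "http://www.example.com/assets/js/app.js",
    ],
}

def build_parsers(robots_by_host):
    """Parse each host's robots.txt text into a RobotFileParser."""
    parsers = {}
    for host, text in robots_by_host.items():
        parser = RobotFileParser()
        parser.parse(text.splitlines())
        parsers[host] = parser
    return parsers

def blocked_by_host(page_resources, parsers, user_agent="Googlebot"):
    """Map each host to the number of pages loading a blocked resource from it."""
    affected = defaultdict(set)
    for page, resources in page_resources.items():
        for resource in resources:
            host = urlsplit(resource).netloc
            parser = parsers.get(host)
            # Hosts without known rules are skipped rather than guessed at.
            if parser is not None and not parser.can_fetch(user_agent, resource):
                affected[host].add(page)
    return {host: len(pages) for host, pages in affected.items()}

report = blocked_by_host(PAGE_RESOURCES, build_parsers(ROBOTS_BY_HOST))
print(report)
```

With this sample data, only the script under `/assets/js/` is blocked, so the one affected host is listed with both pages that load it.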
Please use this section with extreme caution, as you can potentially harm your website by selecting the wrong option or deleting the wrong URL.
The first thing we see when navigating to this page is a message from Google. The message states that we should use our robots.txt to instruct them on how to crawl our website. This means if there are any directories or URLs we do not wish to be crawled or indexed, we should disallow them. If you have already done that and a URL or directory is still indexed, or if you want to speed up the process, you can use this tool. Please make sure you read Google’s requirements for removing content before you proceed.
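As a quick illustration of the disallow rules mentioned above, here is a small Python sketch that parses a sample robots.txt and checks whether a URL may be crawled. The rules, the example.com URLs and the `is_crawlable` helper are assumptions made up for this example; only `urllib.robotparser` is part of Python’s standard library.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block one directory and one single page.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Disallow: /drafts/old-page.html
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

def is_crawlable(url, user_agent="Googlebot"):
    """Return True if the sample rules allow this user agent to fetch the URL."""
    return parser.can_fetch(user_agent, url)

for url in (
    "http://www.example.com/index.html",
    "http://www.example.com/private/report.html",
    "http://www.example.com/drafts/old-page.html",
):
    print(url, is_crawlable(url))
```

Testing rules this way before publishing them is a cheap way to catch a disallow line that blocks more than you intended.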
By clicking Create a new removal request we get a pop-up asking us to enter the URL that we would like to remove. Please note that URLs entered here are case sensitive. This means that www.example.com/Sample.html and www.example.com/sample.html are viewed as two separate URLs. The URL we enter can be just a file name, such as /sample.html, or a directory, such as /sample/. Clicking Continue brings us to the following screen:
On this final page we get to choose from three different options. The first option is Remove page from search results and cache. Selecting this option will remove a URL from Google’s search results, which means that it will no longer show up when a search is performed for a particular keyword on that page. It will also remove cached versions of the page.
The second option is Remove from cache only. This option is used to remove an older version or versions of the page. You would use this if you have made major changes to a page or layout and no longer want the older version to be seen.
The third option is Remove directory. This option is used if you need to remove an entire section of your website. This can be useful if your site was hacked, if you deleted a lot of content, if you redirected content to another section, or if you are simply no longer using this section of your website. BE CAREFUL when selecting this option, since you do not want to accidentally remove an entire section of your website from Google.
The Google Index section is a very powerful part of Webmaster Tools and can be very beneficial to your marketing strategy. In my next post I will cover the Crawl section and all of the tools it has to offer.
The Google Index section in Webmaster Tools provides a lot of information about how your website is indexed, as well as how often specific keywords are used within your site. This section also provides you with ways to remove pages, cached versions of pages and directories from Google’s index.