Proper SEO and the Robots.txt File

When it comes to SEO, most people understand that a website must have content, "search engine friendly" site architecture/HTML, and metadata (title tags and meta descriptions).

Another element that can trip up a website if implemented incorrectly is robots.txt. I was recently reminded of this while reviewing the website of a large company that had spent considerable money building a mobile version of its site in a sub-directory. That's fine, but a disallow statement in their robots.txt file (Disallow: /mobile/) meant that the mobile site wasn't accessible to search engines.

Let's review how to properly implement robots.txt to avoid search ranking problems and damage to your business, as well as how to correctly disallow search engine crawling.

Simply put, if you go to domain.com/robots.txt, you should see a list of the website's directories that the site owner is asking search engines to "skip" (or "disallow"). However, if you aren't careful when editing a robots.txt file, you could add rules that really hurt your business.

There is plenty of information about the robots.txt file available at the Web Robots Pages, including the proper usage of the disallow feature and how to block "bad bots" from crawling your website.

The general rule of thumb is to make sure a robots.txt file exists at the root of your domain (e.g., domain.com/robots.txt). To exclude all robots from crawling part of your website, your robots.txt file would look something like this:

User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/
Disallow: /junk/

The above syntax would tell all robots not to crawl the /cgi-bin/, /tmp/, and /junk/ directories on your website.
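If you want to double-check what a set of rules actually blocks before publishing it, one option is to run the rules through a parser. The following is a minimal sketch using Python's standard urllib.robotparser module. The bot name "ExampleBot" and the sample paths are made up for illustration, and real search engine crawlers may interpret edge cases differently than this parser does.

```python
from urllib.robotparser import RobotFileParser

# The rules from the example above, one directive per line.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/
Disallow: /junk/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text held in memory (no network request is made)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# Hypothetical paths, chosen only to illustrate the check.
sample_paths = [
    "/cgi-bin/script.pl",
    "/tmp/cache.html",
    "/junk/old.html",
    "/products/widget.html",
]

for path in sample_paths:
    allowed = parser.can_fetch("ExampleBot", "http://domain.com" + path)
    print(path, "allowed" if allowed else "blocked")
```

With these rules, only the three listed directories should come back as blocked; everything else on the site stays open to crawlers.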

In the past, I reviewed a website that had a good amount of content and several high-quality backlinks. However, the website had virtually no presence in the search engine results pages (SERPs).

What happened? Penalty? Well, no. The site's owner had included a disallow to "/". They were telling the search engine robots not to crawl any part of the website.
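Both this mistake and the /mobile/ one mentioned earlier can be caught with a quick automated check. Here is a rough sketch, again using Python's standard urllib.robotparser module. The blocked_paths helper, the "ExampleBot" name, and the list of important pages are all assumptions for illustration; you would substitute the pages that matter to your own site.

```python
from urllib.robotparser import RobotFileParser


def blocked_paths(robots_txt, paths, agent="ExampleBot"):
    """Return the paths that the given robots.txt text blocks for an agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [
        path for path in paths
        if not parser.can_fetch(agent, "http://domain.com" + path)
    ]


# A disallow to "/" covers every URL on the site.
SITE_WIDE_BLOCK = "User-agent: *\nDisallow: /\n"

# A disallow to "/mobile/" covers only the mobile sub-directory.
MOBILE_BLOCK = "User-agent: *\nDisallow: /mobile/\n"

# Hypothetical pages you would want search engines to reach.
IMPORTANT_PATHS = ["/", "/about.html", "/mobile/index.html"]

print("Disallow: /        blocks", blocked_paths(SITE_WIDE_BLOCK, IMPORTANT_PATHS))
print("Disallow: /mobile/ blocks", blocked_paths(MOBILE_BLOCK, IMPORTANT_PATHS))
```

If a page you care about shows up in the blocked list, the robots.txt file is worth a second look before it goes live.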
