Telling Search Engines Where to Go

By default, Mambo does not give directions to search engines. Some templates add meta data telling search engines to index your pages and to follow the links within your site. This is done with the robots meta tag. Most templates don't include it, and search engines don't need it: if the content is there, they will index it and follow any links along the way.

Let's take a look at the various ways you can tell a search engine where to go.
There are four possible directions you can give to search engine bots:

  • crawl, index and follow everything;
  • disallow, i.e. do not crawl;
  • noindex, i.e. do not index;
  • nofollow, i.e. do not follow links.

Let's look now at what each one means and how it is used.

If there are no search engine robots directives on a site then all content will be crawled (sometimes called "spidered") and all available links will be followed. Although it is not necessary, a directive can be given within the page itself to tell search engine bots to do the same thing.
e.g.
<meta name="robots" content="index,follow" />

To prevent robots from crawling certain parts of your site, Mambo uses a robots.txt file. You will find this in your site root. It is a plain text file containing directives that tell compliant search engine robots what to do; robots that ignore the file are not stopped by it. The directives use the following format:

User-agent: *
Disallow: /directoryName/
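
To see how a compliant robot would read such a file, here is a minimal sketch using Python's standard urllib.robotparser module. The /administrator/ directory, the "ExampleBot" user agent and the helper name is_crawl_allowed are assumptions made for illustration; they are not taken from Mambo's own robots.txt.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the directory name is illustrative only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /administrator/
"""


def is_crawl_allowed(robots_txt: str, path: str, user_agent: str = "ExampleBot") -> bool:
    """Return True if a compliant robot may crawl the given path."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    for path in ("/index.php", "/administrator/index.php"):
        print(path, is_crawl_allowed(ROBOTS_TXT, path))
```

This only tells you what a well-behaved robot should do with the file; it says nothing about whether the content already appears in a search index.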

The "noindex" directive is used within the page itself.
e.g.
<meta name="robots" content="noindex" />
This tells search engines that they can crawl the content, but must not include it in their search results. Page Rank cannot be assigned to content that is not indexed.
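
If you want to check which robots directives a page actually sends, a small script can read them out of the HTML. The following is a rough sketch using Python's standard html.parser module; the class name RobotsMetaParser and the helper robots_directives are hypothetical names chosen for this example.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots" ...> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <meta ... /> are also routed here.
        if tag != "meta":
            return
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() != "robots":
            return
        for item in attr_map.get("content", "").split(","):
            item = item.strip().lower()
            if item:
                self.directives.add(item)


def robots_directives(html: str) -> set:
    """Return the set of page-level robots directives in an HTML document."""
    parser = RobotsMetaParser()
    parser.feed(html)
    parser.close()
    return parser.directives


if __name__ == "__main__":
    page = '<head><meta name="robots" content="noindex" /></head>'
    print(robots_directives(page))
```

An empty result means the page gives no robots meta directive, in which case search engines fall back to their default behaviour of indexing and following.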

The "nofollow" directive is also given within the page.
Tip: Take care not to use "nofollow" in your page meta data. Adding it to the robots meta tag tells search bots not to follow any of the links on that page, so content that can only be reached through those links may be ignored. Remember, your menu items and multiple page content are all accessed via links.
To use "nofollow", add the directive to individual links instead, e.g. rel="nofollow"
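
To audit which links on a page carry the link-level directive, something along these lines could be used. It is only a sketch built on Python's standard html.parser module; the class name LinkRelParser, the helper split_links and the sample URLs are made up for illustration.

```python
from html.parser import HTMLParser


class LinkRelParser(HTMLParser):
    """Sorts a page's links into followed and nofollowed lists."""

    def __init__(self):
        super().__init__()
        self.followed = []
        self.nofollowed = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = {name: (value or "") for name, value in attrs}
        href = attr_map.get("href")
        if href is None:
            return  # named anchors without href are not links
        # rel can hold several space-separated tokens, e.g. "external nofollow".
        rel_tokens = attr_map.get("rel", "").lower().split()
        if "nofollow" in rel_tokens:
            self.nofollowed.append(href)
        else:
            self.followed.append(href)


def split_links(html: str):
    """Return (followed, nofollowed) lists of href values."""
    parser = LinkRelParser()
    parser.feed(html)
    parser.close()
    return parser.followed, parser.nofollowed


if __name__ == "__main__":
    sample = (
        '<a href="/news.html">News</a> '
        '<a href="/login.html" rel="nofollow">Login</a>'
    )
    print(split_links(sample))
```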

Points to be aware of

The "Disallow" directive in robots.txt does not prevent the disallowed content from showing up in search results. If there are any links to that content, it may still be discovered by search engines and indexed. If this happens and you want your content removed from Google, Yahoo! or MSN, each of them has tools available for requesting removal of content from its index.

If you use "disallow" in your robots.txt AND "noindex" on the page, your content may still show up in search engine results. Because you have told the search bots not to crawl the page, they cannot see that you have given a "noindex" directive. Use one or the other, but not both.
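
A quick way to spot this conflict is to compare your robots.txt rules against the pages you have marked "noindex". The sketch below assumes you already have a mapping of page paths to their robots meta directives; that mapping, the /private/ directory, the "ExampleBot" user agent and the function name unseen_noindex_paths are all hypothetical.

```python
from urllib.robotparser import RobotFileParser


def unseen_noindex_paths(robots_txt, page_directives, user_agent="ExampleBot"):
    """Return paths whose "noindex" can never be read because crawling is disallowed.

    page_directives maps a URL path to the set of robots meta directives on that page.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return sorted(
        path
        for path, directives in page_directives.items()
        if "noindex" in directives and not parser.can_fetch(user_agent, path)
    )


if __name__ == "__main__":
    robots_txt = "User-agent: *\nDisallow: /private/\n"
    pages = {
        "/private/report.html": {"noindex"},  # conflict: blocked and noindex
        "/drafts/post.html": {"noindex"},     # fine: crawlable, so noindex is seen
        "/private/other.html": {"index"},     # blocked only
    }
    print(unseen_noindex_paths(robots_txt, pages))
```

Any path this reports is one where you should pick a single approach: either lift the "Disallow" so the "noindex" can be read, or drop the "noindex" and rely on robots.txt.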

The "nofollow" applies only to the link you have added it to. Therefore, if you have any other internal links to that content, or any other site has links to it, the content may still be indexed and attract a Page Rank.

Security Tip for Mambo

Periodically run searches for content that you have blocked using robots directives. If you find it, contact the search engines involved and get it removed. I recently analysed one site that was failing to attract Page Rank only to find that the highest ranked content was the admin login! While this was a very unusual situation it does illustrate the importance of checking to find out just what comes up in search engine results.
