How to Avoid Duplicate Content Issues With Mambo

What is Duplicate Content?

In web site terms, "duplicate content" is content that is the same, or where large parts of it are the same, as content published elsewhere. This can be content that is duplicated on the same site, or content that is duplicated by being published on other sites. Most duplicate content occurs by accident or through people not being aware of the implications of having it. Some content duplication is deliberate and is used by spammers to try to influence search engine results.

Duplicate Content in Mambo

Mambo can produce duplicate content. The key culprits are as follows:

  • "print this page" link
  • PDF link
  • Mambo's URL structure

We have already looked at the anatomy of the Mambo URL and at why you should use SEF URLs with Mambo, so let's consider the other two issues.

If you don't need the "print this page", PDF and "tell a friend" features, disable them from within your global configuration options.

The greatest contributor to duplicate content in Mambo, outside of the URL structure, is the PDF generator. Mambo's PDF generator is fairly basic: it does not include images and its character set support is limited. If you want to provide PDFs of your content, you would be better off creating your own PDF files and making them available through links on your site. To prevent searchbots from indexing the PDF downloads, add rel="noindex, nofollow" to the link. However, if you wish to use Mambo's built-in PDF generator, you can still tell searchbots not to crawl the generated output.
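For the hand-made PDF route described above, the link might look something like the sketch below. The path and file name are hypothetical; the rel value is the one suggested in this article. Only nofollow is a widely documented link hint, so how a given search engine treats noindex in a rel attribute is not guaranteed.

```html
<!-- Hypothetical path and file name: point this at the PDF you created and uploaded yourself. -->
<a href="/downloads/my-article.pdf" rel="noindex, nofollow" title="PDF">Download this article as a PDF</a>
```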

How to Tell Search Engines Not to Index Mambo PDFs

In Mambo versions earlier than 4.7, the PDF generator link uses JavaScript. It builds a dynamic link through the PDF icon that appears on each page of your Mambo site. Note: the icon only appears if you have enabled PDFs through the options in your site's global configuration content screen. Because of the way the link is constructed, blocking it through a robots.txt directive is not practical. However, the following small core hack tells search engines not to index or follow the link.

In Mambo 4.6.4 and earlier, open /components/com_content/content.html.php

Look for the following:

	/**
	* Writes PDF icon
	*/

At around line 620 (depending on your version of Mambo), look for the following code:

<a href="javascript:void window.open('<?php echo $link; ?>', 'win2', '<?php echo $status; ?>');" title="<?php echo T_('PDF');?>">

Let's add rel="noindex, nofollow" to this code to tell search robots not to index the output of that link and not to follow it. The result will look like this:

<a href="javascript:void window.open('<?php echo $link; ?>', 'win2', '<?php echo $status; ?>');" rel="noindex, nofollow" title="<?php echo T_('PDF');?>">

You can do the same for the "print this page" function or, alternatively, disable it altogether. Most visitors will know how to use their browser's built-in print function, and providing a "print.css" for your template will give you a much better printed page than the default print feature produces.
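As a rough illustration of the print stylesheet idea, the fragment below shows how a template might load a print.css and the kind of rules it could contain. The template path and the element IDs are assumptions made for this example, not names taken from Mambo; substitute whatever your own template uses.

```html
<!-- Goes in the <head> of your template. The path and file name are assumptions. -->
<link rel="stylesheet" type="text/css" media="print" href="templates/your_template/css/print.css" />

<!-- Example rules that print.css might contain, shown inline here for brevity. -->
<style type="text/css" media="print">
  /* Hypothetical IDs: hide navigation and decoration that is useless on paper. */
  #header, #leftcolumn, #rightcolumn, #footer { display: none; }
  /* Let the article body use the full page width. */
  #content { width: 100%; margin: 0; }
  /* Plain black text on white, sized in points for print. */
  body { font: 12pt Georgia, serif; color: #000; background: #fff; }
</style>
```

Because the rules are scoped to media="print", they only take effect when the visitor prints the page from the browser, so the on-screen layout is unchanged and no second, duplicate "print" URL is needed.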

In the next tutorial we will look at the common causes of duplicate content. Till next time…
