Common Causes of Duplicate Content

The most common causes of duplicate content apply to all web sites, not just sites using the Mambo content management system. In this tutorial, I will itemise the top culprits and tell you what steps to take to ensure they do not have a negative impact on your search engine rankings.

www and no-www

Most web hosts these days point both the www and no-www versions of a site at the same place, which means that anyone who comes to your site through either http://www.example.com or http://example.com gets the same content. Hello!! Duplicate content!

However, technically, www is a subdomain of the no-www site. It is just the same as, for example, http://forum.example.com or any other sub.example.com, although it exists for a very different reason.

Because www is a subdomain, search engines treat its content as if it were a completely different entity to the no-www version of your site. This means they see duplicate content. In some cases, search engines may penalise a site for duplicate content. Page rankings can also be split across the two versions of your content, because some inbound links point to the www version and others to the no-www version.

So, why don't we get our hosts to just set us up as one or the other? We can, but that also leads to problems. If you use the no-www for your site, you will still find that some other sites will put up links to you using www. The reverse also happens. If you are already listed in links or search engines as being both www and no-www, you need to make changes that will standardise your search engine listings while not harming your page rank or making pages inaccessible through either URL.

This can be done in several ways.

If you want to redirect all traffic to www, add this to your .htaccess file:

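# Rewrite rules in .htaccess need FollowSymLinks and the rewrite engine switched on
Options +FollowSymLinks
RewriteEngine on
# Match requests for the bare domain (no www), ignoring case
RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
# Send a permanent (301) redirect to the same path on the www host
RewriteRule ^(.*)$ http://www.example.com/$1 [R=permanent,L]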

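Remember to change "example.com" to your own domain! If you are already using SEF (search engine friendly) URLs, your .htaccess file should already contain the following lines, so you won't have to add them again: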

Options +FollowSymLinks
RewriteEngine on

To redirect traffic from www to no-www, use this:

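Options +FollowSymLinks
RewriteEngine on
# Match any host name other than the bare domain, ignoring case
RewriteCond %{HTTP_HOST} !^example\.com$ [NC]
# Send a permanent (301) redirect to the same path on the bare domain
RewriteRule ^(.*)$ http://example.com/$1 [L,R=301]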
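
If you want to see what the www to no-www rule above will do before you upload it, here is a minimal Python sketch that mimics its logic. It is only an illustration and not part of Apache or Mambo: the function name redirect_target and its parameters are my own invention, and it assumes example.com is the host you want to keep.

```python
import re
from typing import Optional


def redirect_target(host: str, path: str,
                    canonical_host: str = "example.com") -> Optional[str]:
    """Return the URL a request would be redirected to, or None if it is left alone.

    Hypothetical helper that imitates the www to no-www rewrite rule.
    """
    # Same test as: RewriteCond %{HTTP_HOST} !^example\.com$ [NC]
    # re.escape() makes the dot literal; IGNORECASE plays the part of [NC].
    canonical = re.compile("^" + re.escape(canonical_host) + "$", re.IGNORECASE)
    if canonical.match(host):
        return None  # already on the canonical host, so no redirect

    # Same substitution as: RewriteRule ^(.*)$ http://example.com/$1
    return "http://" + canonical_host + "/" + path.lstrip("/")


if __name__ == "__main__":
    for host in ("www.example.com", "example.com"):
        print(host, "->", redirect_target(host, "/index.php"))
```

Any host name other than the canonical one, including www, is sent to the same path on the canonical host, which is exactly why the rule removes the duplicate.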

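If you can get your host to make a small change through BIND/local DNS, you can also add a CNAME record with the non-www version pointing to the www version (or vice versa, depending on which configuration you prefer). Bear in mind that a CNAME only makes one name resolve to the same server as the other. It does not send visitors or searchbots a redirect, so on its own it will not standardise your listings, and a CNAME is normally not permitted on the bare domain itself. The redirect still has to happen on the web server. If your host can place the rules in the main server configuration rather than in .htaccess, that is the better solution, because every directive you add to .htaccess increases server processing.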

Keep Sites Under Development Hidden from Search Engines

A common mistake is to redevelop an existing web site under a different domain or on a new server without hiding it from search engines. As long as there is a link to it anywhere on the web, search engines can find it. It almost seems like one of life's cruel jokes that it can take months before searchbots visit a brand new site, but the one you don't want them to find is sure to be crawled.

If your new site is being developed in a directory or subdomain, make sure you have either password protected it or used a robots.txt directive to tell good searchbots not to crawl it. It is not uncommon to find sites that are under development showing in Google search results.
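
As a rough sketch of the robots.txt approach: a file containing the two lines held in ROBOTS_TXT below, placed at the root of the development site, asks all well-behaved searchbots to stay out. The Python snippet uses the standard library's urllib.robotparser to confirm that the rules block what you expect. The helper name is_blocked and the dev.example.com address are assumptions for illustration only.

```python
from urllib.robotparser import RobotFileParser

# Contents of a robots.txt that asks every searchbot to stay out of the whole site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_blocked(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Return True if these robots.txt rules tell the given bot not to fetch the URL.

    Hypothetical helper; it only interprets the rules and makes no network request.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    print(is_blocked(ROBOTS_TXT, "http://dev.example.com/index.php"))
```

Keep in mind that robots.txt is only a request, and badly behaved bots ignore it. Password protection is the safer choice if the content really must stay hidden.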

If you have ever posted a link to your site under development, for example on the Mambo forums, then you have given searchbots the means to find, and index, that site. And if your new site is using the same content as the "live" site, then you have duplicated your content!

Free Content, Affiliate Sites, Book Stores and Other Duplicate Content

The web is full of offers for free content, affiliate sites, book and other product stores, and aggregated content of all kinds. On the surface, this looks like a good way of adding content to your site and many owners of new web sites make the mistake of populating their site with this easy content. DON'T! Not only will you be adding someone else's original content, you will also be adding content that thousands of other sites have also included. All content that you receive from other sites is duplicate content. Your site will not be penalised by search engines for including it, but it can dilute the value of your own content.

If you are serious about promoting an affiliate or products that come from a third party then you need to write your own review and ensure that the first content on that page is original. Whenever you consider adding this kind of content ask yourself, "Does this content really add value to my visitors?" If the answer is "no" or you have not had your site online long enough to clearly identify just what your visitors want, then don't add it.

RSS Feeds

Take care with RSS feeds. If you must publish RSS feeds from other sites on your own site, try to use feeds that provide either an introduction or a summary of each item. If you use complete feeds, you will be duplicating their content on your site. Again, ask yourself if the feeds really add value to your visitors' experience when they visit your site. Do they add value to your site in terms of search engine optimisation? Usually, the answer is "no".
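
If the only feed available is a complete one, one possible workaround is to show just a short excerpt of each item with a link back to the source. The Python sketch below shows the idea using only the standard library. It is a minimal illustration, not a Mambo feature: the function name summarise_feed and the max_words setting are my own assumptions, and it expects a plain RSS 2.0 feed with plain-text descriptions.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List


def summarise_feed(rss_xml: str, max_words: int = 30) -> List[Dict[str, str]]:
    """Return title, link and a short excerpt for each item in an RSS 2.0 feed.

    Hypothetical helper: it trims each description to max_words words so the
    page shows an introduction rather than the complete text.
    """
    root = ET.fromstring(rss_xml)
    summaries = []
    for item in root.iter("item"):
        words = item.findtext("description", default="").split()
        excerpt = " ".join(words[:max_words])
        if len(words) > max_words:
            excerpt += "..."  # mark that the text was cut short
        summaries.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "excerpt": excerpt,
        })
    return summaries
```

Each entry keeps the link, so visitors who want the full article are sent to the original site instead of reading a copy on yours.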

I will be doing a tutorial covering RSS feeds from your site, and their impact on search engine optimisation. Watch out for it.
