Google announced on its official blog the other day that it now gives webmasters a way to avoid duplicate content issues: you can tell it which version of a page you want it to treat as the preferred one. It is worth noting that Yahoo and Live have implemented this as well.
Now you might be wondering how this would actually benefit you from an SEO standpoint. The following is a simple example:
index.php?color=blue
index.php?color=red
With the above two URLs, all we are changing is the color scheme of the website using PHP's $_GET variables. The problem is that both URLs will have the same title tag and the same content. To Google, this is a textbook example of duplicate content.
Using this handy new option from Google, we could simply add the following to the head of our HTML document to avoid the duplicate content:
<link rel="canonical" href="index.php" />
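Now Google will know that index.php is the preferred version of the page, and the color variations should no longer count against you as duplicate content. Keep in mind that Google treats the tag as a strong hint rather than a strict directive.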
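If your pages are generated dynamically, the canonical URL can be worked out from the requested URL by dropping the parameters that only change presentation. The following is a minimal sketch in Python rather than PHP, and it is not from the Google announcement: the function names, the PRESENTATION_PARAMS set, and the www.example.com address are all assumptions made for illustration.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical list of query parameters that change only the look of the
# page (such as the color scheme), not its content.
PRESENTATION_PARAMS = {"color"}


def canonical_url(url, ignored=PRESENTATION_PARAMS):
    """Return the URL with the ignored query parameters and any fragment removed."""
    parts = urlsplit(url)
    # Keep every parameter that is not purely presentational.
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in ignored
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def canonical_link_tag(url):
    """Build the link element to place in the head of the HTML document."""
    href = escape(canonical_url(url), quote=True)
    return '<link rel="canonical" href="%s" />' % href


if __name__ == "__main__":
    print(canonical_link_tag("http://www.example.com/index.php?color=blue"))
    print(canonical_link_tag("http://www.example.com/index.php?color=red"))
```

Both calls produce the same tag, which is the point: every color variation names one preferred URL. The sketch uses an absolute URL because that leaves less room for a relative path to be resolved in a way you did not intend.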
This tag is relatively new, so how it gets used will likely change as people try to get the upper hand. Some people are even suggesting that it may eventually become as useless as the meta keywords tag is today.
Personally, I still think avoiding duplicate content without relying on this tag is preferable, but if you have no choice, take a look at it.
February 14th, 2009 at 10:37 pm
Great dig up on this! There are so many Web sites suffering from diluted page ranks due to this. I usually handle this kind of stuff with .htaccess hacks
February 15th, 2009 at 5:58 pm
Thanks for the info Brenelz. I agree with you on the fact it’s best to avoid duplicate content altogether. With the example you used (different color schemes) it would be quite easy to get the GET data, set the cookie, and then redirect back to index.php – this way no duplicate content is created…
I’m always a bit skeptical about this though – Google and other search engines seem quite sensitive. I mean, if I created a link to this site but added a random query string like ?blah=22938 then Google would see it as an entirely different page; duplicate content. There should really be a better system in place for this type of scenario.
February 17th, 2009 at 4:23 pm
This is a new standard that has been agreed on by the three major search engines (Google, Yahoo!, and Live) and was announced recently at SMX West. Live’s announcement: http://blogs.msdn.com/webmaster/archive/2009/02/12/partnering-to-help-solve-duplicate-content-issues.aspx
Yes, it is better to avoid duplicate content and create unique content for the site. It will take time and money, but it is still worth doing.