Dealing with Duplicate Content

It seems that every month or so more and more webmasters are getting hit by Google’s attempt to sort out the issue of duplicate content. Duplicate content, of course, is content or blocks of content that appear across multiple sites.

And in Google’s mind, and rightfully so in most cases, why should they waste their resources (i.e., their money!) indexing stuff that’s already out there? “If it’s already on the web in one place, why do we need to index it again?” Makes sense, right?

But what’s the definition of duplicate content? Well, that’s tough to answer and only Google knows for sure, assuming they do at all.

But be forewarned that if you run chunks of text, and especially full content, that’s already on the web, then you may get hit. If you do run some duplication, and many of us do, especially news sites like dailyindia.com, make sure you have plenty of original content around it. Don’t expect to pull a Wikipedia page and immediately have it rank on the first page. It’s just not going to work in the long run.
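Nobody outside Google knows how it actually measures duplication, so take the following only as a rough sketch of one common way to estimate overlap between two pieces of text: break each into overlapping runs of words ("shingles") and compare the sets with Jaccard similarity. The function names and the three-word shingle size are my own assumptions for illustration, not anything Google has published.

```python
import re

def shingles(text, size=3):
    """Return the set of overlapping word n-grams ("shingles") in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        # Text shorter than one shingle: treat the whole thing as one unit.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}

def similarity(a, b, size=3):
    """Jaccard similarity: shared shingles divided by all distinct shingles."""
    sa, sb = shingles(a, size), shingles(b, size)
    if not sa and not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)

if __name__ == "__main__":
    mine = "the quick brown fox jumps"
    theirs = "the quick brown fox sleeps"
    print(similarity(mine, theirs))
```

A score near 1.0 means the two texts are close to identical, and a score near 0.0 means they share almost no phrasing. Where the danger zone starts is anybody's guess, but it can help you spot pages that are mostly borrowed text.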

And also be wary of copying the HTML code. My guess is Google can also spot similarities in coding across sites and make unfavorable judgments based on that.

For most webmasters, though, problems come when sites have varying degrees of duplicate text. And when you add in RSS feeds, syndication, social bookmarking and scraping, it’s hard to keep all your content 100% unique to your site.

If you do find your content stolen or used without permission, you can try to get it removed by contacting the webmaster, or take it a step further by filing a DMCA takedown notice.

But for Google it’s still tough to decide. For example, what if this post or pieces of this post show up on multiple websites? That’s when things get tricky.

Google will try to give credit to the original poster, but if your site is slow to get indexed and large portions of your content show up on, say, an authoritative site, then you are probably going to lose out.

Google says that it will simply ignore duplicate content, but do they always follow that policy? If your site contains a high amount of duplicate content, even your original stuff may get clipped and taken behind the woodshed (i.e., -950 land!), and that’s not where you want to be.

So if you want to avoid the wrath of Google when it comes to duplicate content, make sure your site is as unique as possible and cut down on your syndication. If you do use RSS feeds, like in WordPress, make sure you are not syndicating the full posts but rather an excerpt or summary. And if you can make that summary unique, then even better.
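If your platform doesn’t offer a summary-only feed option, trimming posts yourself before they go into the feed is not hard. Here is a minimal sketch, assuming you have each post’s HTML as a string. The `make_excerpt` name and the 55-word cutoff are hypothetical choices of mine, and the tag stripping is deliberately crude (a regular expression rather than a real HTML parser).

```python
import html
import re

def make_excerpt(post_html, max_words=55):
    """Strip tags from a post and return at most max_words words of plain text."""
    # Crude tag removal; fine for simple post markup, not for arbitrary HTML.
    text = re.sub(r"<[^>]+>", " ", post_html)
    # Turn entities such as &amp; back into plain characters.
    text = html.unescape(text)
    words = text.split()
    if len(words) <= max_words:
        return " ".join(words)
    # Mark the cut so readers know there is more on the site.
    return " ".join(words[:max_words]) + " ..."

if __name__ == "__main__":
    post = "<p>Duplicate content is content that appears across <b>multiple</b> sites.</p>"
    print(make_excerpt(post, max_words=5))
```

An automatic cut like this still repeats your opening lines word for word, so a hand-written summary remains the better option when you have the time.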

Also, make sure your site doesn’t contain large chunks of your original content across multiple pages.

There’s a lot more to it of course, but those are just some quick guidelines to help you with the issue of duplicate content.


