I was reading Shawn Collins's recap of day one of his trip to SMX, and a comment he made in it got my attention:
Some interesting revelations in the duplicate content session, such as Google's Joachim Kupke saying there is no duplicate content penalty and that Google is “working on something to help ID a site as the original provider of content.”
Now, this isn't a new revelation. Google has been explaining for years that duplicate content doesn't really cause a penalty. When that was first stated, webmasters were up in arms, and those with clients knowledgeable enough to attend conferences and read official Google blogs were receiving phone calls in droves. And it left those of us who'd known this all along with “some splaining to do.”
The problem is that, to an extent, it really boils down to semantics. Let's take a look at some hypothetical examples.
Example 1: Domain A writes a kick ass page on “rainbow widgets.” So kick ass that domains B and C copy the content and reprint it.
Now, someone does a search for “rainbow widgets” and this kick ass article ranks at number three for the term. However, it is domain B that is considered the “original provider” of the content because their domain is older and more linked to, so it is domain B's version of the article that shows at number three. The true original, on domain A, never shows for any search terms the article is relevant for because anytime that article is deemed the “best” result, it is domain B's version that appears.
I highly doubt domain A, the creator of the content, cares if it is called a “penalty” or “filtering to show the original.” Either way, they're getting screwed due to duplicate content issues.
Example 2: An e-commerce site uses a datafeed that the manufacturer provides to all of its retailers, much like those I described when I talked about affiliate datafeeds recently. The site publishes the information without adding anything unique to it. The problem is, it is a newer merchant, and the competitors using the very same datafeed have been around for years and have strong backlink profiles. Someone does a search for “red widgets for blue walls,” and even though our new e-commerce merchant has a page for it, that page never shows for the term because the stronger e-commerce sites are considered the “original.” The new merchant also experiences a crawl slowdown on its site because Google deems a lot of its internal pages “unimportant” because they are duplicates.
They don't have a “penalty,” but they can't get traffic on long tail keywords directly to their product pages because they are always “filtered” for being duplicates of others by Google. As with example 1, they're getting screwed due to duplicate content.
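To make the “filtering” idea concrete, here is a toy sketch in Python. It is not how Google actually works; the `Page` fields, the `strength` ordering (backlinks first, then domain age) and the example domains are all assumptions I made up to mirror the two examples above. The point is only that when identical copies compete, one version is shown and the rest quietly disappear.

```python
from dataclasses import dataclass
from hashlib import sha256


@dataclass(frozen=True)
class Page:
    domain: str
    text: str
    domain_age_years: int  # hypothetical strength signal
    backlinks: int         # hypothetical strength signal


def fingerprint(text: str) -> str:
    """Collapse case and whitespace so trivially reformatted copies match."""
    normalized = " ".join(text.lower().split())
    return sha256(normalized.encode("utf-8")).hexdigest()


def strength(page: Page) -> tuple[int, int]:
    # Made-up ordering: more backlinks wins, then the older domain.
    return (page.backlinks, page.domain_age_years)


def filter_duplicates(pages: list[Page]) -> list[Page]:
    """Keep only the strongest page from each group of identical content."""
    best: dict[str, Page] = {}
    for page in pages:
        key = fingerprint(page.text)
        if key not in best or strength(page) > strength(best[key]):
            best[key] = page
    return list(best.values())


article = "The definitive guide to rainbow widgets."
pages = [
    # Domain A wrote the article but is new and has few links.
    Page("domain-a.example", article, domain_age_years=1, backlinks=40),
    # Domains B and C copied it; B is the oldest and most linked to.
    Page("domain-b.example", article.upper(), domain_age_years=9, backlinks=5000),
    Page("domain-c.example", article, domain_age_years=3, backlinks=300),
]

shown = [page.domain for page in filter_duplicates(pages)]
print(shown)
```

In this sketch nobody gets a “penalty.” Domain A's page simply never makes it into `shown`, which is exactly the outcome the true original cares about.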
Issues, penalties and filtering – oh my!
Whether you want to call them duplicate content “issues,” duplicate content “penalties,” or duplicate content “filtering,” the adverse effects on your site and your rankings are still the same.
I typically say “issues” when referring to duplicate content, but “penalty” is something most clients understand right away. A penalty is bad. We understand a penalty is a problem. We're willing to do anything to avoid a penalty. And I'd guess that is why a lot of people still call the issues duplicate content can cause a penalty. Either way, don't take Google saying “there is no penalty for duplicate content” as a sign that duplicate content “can't hurt you.” It can.