Duplicate vs. Original Content: Quality Content Series Part 6
While it’s well known that plagiarism is harmful and has legal consequences, there’s a common misunderstanding about Google’s perception of duplicate content and how it affects SEO.
In this post, we look at what’s OK to duplicate, when it’s a problem and why original content is so important.
What is duplicate content?
This is Google’s official definition of duplicate content:
“Duplicate content generally refers to substantive blocks of content within or across domains that either completely match other content or are appreciably similar. Mostly, this is not deceptive in origin.”
It makes sense for marketers to worry about duplicate content. You want the same, original page to continue showing up in search results so your website can benefit from the organic traffic. Thumbs up, marketer!
But SEO experts want you to know that you should put more trust in Google’s algorithms to choose the correct, original page.
According to Search Engine Land, “the duplicates are just being filtered in search results. You can see this for yourself by adding &filter=0 to the end of the URL and removing the filtering.”
That same article lists a few causes of duplicate content. It could be something as simple as having both HTTP and HTTPS versions of a page or having versions in different languages.
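If you want to try the filter tip quoted above, here is a minimal Python sketch that appends filter=0 to a search results URL. The function name and the sample URL are our own illustrations, and whether a search engine still honors the parameter is an assumption based only on the Search Engine Land quote.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_filter_disabled(results_url: str) -> str:
    """Return the results URL with filter=0 set in its query string."""
    parts = urlsplit(results_url)
    # Drop any existing "filter" parameter so it is not duplicated.
    params = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "filter"
    ]
    params.append(("filter", "0"))
    return urlunsplit(parts._replace(query=urlencode(params)))


# Illustrative search URL, not taken from the article.
print(with_filter_disabled("https://www.google.com/search?q=duplicate+content"))
# prints https://www.google.com/search?q=duplicate+content&filter=0
```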
Is it OK to repost content?
Another common worry among marketers (and a worry we once had at Brandpoint) is whether reposting our blog content on Medium would harm search engine results for our original page.
We wondered if Google would penalize us if it thought we were trying to trick the search engines.
Or what about guest posts? Could we start taking the posts we publish on other blogs and reposting them on ours?
The answer is that all of these practices are safe and will not harm the SEO value of the original content. Why? Because Google can generally tell the difference and identify the original, most useful source for searchers.
Search Engine Land explains that “Googlers know that users want diversity in the search results and not the same article over and over, so they choose to consolidate and show only one version.”
So, you don’t need to worry. However, if you’d like even more peace of mind, there are a few actions you can take to identify the original piece of content for Google.
Say you want to repost a guest blog on your website. You can add the canonical tag (a link element with rel="canonical") to that post, pointing to the original URL, which gives search engines a direct signal about where the original version appeared.
If your content gets syndicated on another website (HuffPost often republishes popular blog posts), then these sites usually include a link back to the original post, and you can ask the website to include that canonical tag to ensure your original post gets the search traffic.
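For anyone who has not seen a canonical tag before, here is a small Python sketch that builds one for the head section of a reposted page. The helper name and the example.com URL are hypothetical; your CMS or SEO plugin may well generate this tag for you.

```python
from html import escape


def canonical_link_tag(original_url: str) -> str:
    """Build the canonical link element that points to the original post."""
    # Escape the URL so characters such as & and " stay valid inside the attribute.
    return f'<link rel="canonical" href="{escape(original_url, quote=True)}">'


# Hypothetical URL of the original article.
print(canonical_link_tag("https://example.com/blog/original-post"))
# prints <link rel="canonical" href="https://example.com/blog/original-post">
```

The tag belongs in the head of the reposted copy, not on the original page.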
When is duplicate content a problem?
A Google Webmaster Blog post asserts, “Only when there are signals pointing to deliberate and malicious intent, occurrences of duplicate content might be considered a violation of the webmaster guidelines.”
This includes cases of plagiarism, when someone else tries to steal your online identity or pass off your content as their own. It also includes cases in which duplicate content is used to manipulate users, such as pages that engage in malicious behavior.
Otherwise, you won’t need to worry about other instances of duplicate content, including scraped content.
Scraping involves “sites that copy and republish content from other sites without adding any original content or value.” Scraper sites usually offer no benefit to users. Mostly, their main purpose is to increase page count and trick the search engines.
In these cases, a Kissmetrics blog post assures readers that “scrapers don’t help or hurt you. Do you think that a little blog in Asia with no original writing and no visitors confuses Google? No. It just isn’t relevant.”
Google’s algorithm can usually identify the original, correct page and feature it in search results. As confirmed by Google: “In most cases the original content can be correctly identified, resulting in no negative effects for the site that originated the content.”
Scraped content could appear in search results, but it’s very rare.
What are solutions to duplicate content?
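As mentioned above, the canonical HTML tag identifies which piece of content is the original. If your site has duplicate content, add that tag to the duplicate pages to direct search engines to the correct version. Or, add a “noindex” robots meta tag to a duplicate page, which asks search engines to leave that page out of their index. (A “nofollow” directive is different: it tells crawlers not to follow the links on a page and does not, by itself, remove the page from the index.)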
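To check which of these signals a page actually carries, you could parse its HTML. The sketch below uses Python's built-in html.parser; the class name, function name and sample pages are all made up for illustration, and it only reads the tags without telling you how any search engine will act on them.

```python
from html.parser import HTMLParser


class IndexingSignals(HTMLParser):
    """Collect the canonical URL and robots directives found in a page."""

    def __init__(self):
        super().__init__()
        self.canonical = None
        self.robots = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "link" and "canonical" in (attr.get("rel") or "").lower().split():
            self.canonical = attr.get("href")
        elif tag == "meta" and (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            self.robots = [d.strip().lower() for d in content.split(",") if d.strip()]


def indexing_signals(html_text):
    """Return (canonical_url_or_None, list_of_robots_directives)."""
    parser = IndexingSignals()
    parser.feed(html_text)
    return parser.canonical, parser.robots


# Two hypothetical duplicate pages, each using one of the approaches.
repost = '<head><link rel="canonical" href="https://example.com/blog/original-post"></head>'
print_version = '<head><meta name="robots" content="noindex, follow"></head>'

print(indexing_signals(repost))
print(indexing_signals(print_version))
```

The two samples are kept separate on purpose: each duplicate page here uses one signal, either a canonical tag or a noindex directive.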
You can also set up a 301 (permanent) redirect from the duplicate page to the original content, as recommended by Moz. Setting up these redirects can strengthen the ranking power of the correct page because the duplicate pages will no longer compete with it. Inbound links to the duplicates will be directed to the correct version, which helps its SERP position.
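In practice, 301 redirects are usually configured in your web server or CMS. Purely to show what happens under the hood, here is a rough Python sketch using the standard library's http.server. The paths in REDIRECTS and the helper names are hypothetical, and every path that is not a known duplicate simply gets a placeholder response.

```python
from http.server import BaseHTTPRequestHandler

# Hypothetical mapping of duplicate paths to the original page.
REDIRECTS = {
    "/blog/original-post-copy": "/blog/original-post",
    "/print/original-post": "/blog/original-post",
}


def resolve_redirect(path):
    """Return (301, target) for a known duplicate path, else (200, None)."""
    target = REDIRECTS.get(path.split("?", 1)[0])  # ignore any query string
    return (301, target) if target else (200, None)


class RedirectHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, location = resolve_redirect(self.path)
        self.send_response(status)
        if location:
            # A 301 tells browsers and crawlers the move is permanent.
            self.send_header("Location", location)
            self.end_headers()
            return
        # Placeholder body for every non-duplicate path in this sketch.
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.end_headers()
        self.wfile.write(b"Original content\n")


print(resolve_redirect("/print/original-post"))
# prints (301, '/blog/original-post')
```

To try it locally, you could serve the handler with http.server.HTTPServer(("localhost", 8000), RedirectHandler).serve_forever().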
It’s also good practice to use consistent internal links throughout a website. For example, pick either “http://www” or just “http://” and use that form for every internal link. You can view all of Google’s recommended best practices for avoiding duplicate content in its webmaster guidelines.
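One way to keep internal links consistent is to run them through a small normalizer before publishing. The Python sketch below assumes a site at example.com that prefers the https and www form; the constants and the function name are ours, so swap in whichever form your site has chosen.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumed preferences for a hypothetical site; adjust to your own.
SITE_DOMAIN = "example.com"
PREFERRED_SCHEME = "https"
PREFERRED_HOST = "www.example.com"


def normalize_internal_link(url):
    """Rewrite an internal absolute link to the preferred scheme and host."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host not in (SITE_DOMAIN, "www." + SITE_DOMAIN):
        return url  # external or relative link: leave it untouched
    # Note: any explicit port in the original URL is dropped in this sketch.
    return urlunsplit(
        (PREFERRED_SCHEME, PREFERRED_HOST, parts.path or "/", parts.query, parts.fragment)
    )


print(normalize_internal_link("http://example.com/blog/post"))
# prints https://www.example.com/blog/post
```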
The worst-case scenario is that someone completely plagiarizes your content and takes credit for creating it. In this case, more drastic measures will need to be taken: we’re talking about sending a “take down” letter, involving lawyers and major headaches. No one wants to deal with that. It’s why you include that copyright symbol on your site, right? You can also use a plagiarism checking tool every so often to make sure no one has stolen your hard work.
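Plagiarism checkers are far more sophisticated than anything we can show here, but the basic idea of comparing two texts for overlap can be sketched in a few lines of Python. This example is our own simplification, not how any particular tool works: it splits each text into overlapping three-word sequences and measures how many the two texts share.

```python
import re


def shingles(text, size=3):
    """Return the set of overlapping word sequences of the given size."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def similarity(text_a, text_b, size=3):
    """Jaccard similarity of the two texts' shingle sets, from 0.0 to 1.0."""
    a, b = shingles(text_a, size), shingles(text_b, size)
    if not a or not b:
        return 0.0  # too short to compare
    return len(a & b) / len(a | b)


# Illustrative sentences; only case and punctuation differ.
ours = "Duplicate content is filtered in search results"
theirs = "duplicate content is filtered in search results!"
print(similarity(ours, theirs))
# prints 1.0
```

A score near 1.0 suggests a near-verbatim copy, while a score near 0.0 means the texts share almost no phrasing. Any threshold you pick for “too similar” is a judgment call.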
What is unique, original content?
We’ve learned that duplicate content will not harm your SEO, but it also won’t help it.
To rank in search, you need unique and original content. However, it takes even more than that to get your content to appear at the top of the results page. The content needs to be what Moz’s Rand Fishkin calls “10x content,” a term he coined.
This type of content is what will beat out your competitors in search. Rand explains that it isn’t enough to be as good as the best in the SERP; you need to “create something ten times better than the best result out there.”
This idea of “10x content” is a result of users’ increasing expectations for better search results. More and more companies have responded by creating this type of content to keep providing better answers to users’ questions.
What does 10x content look like?
10x content keeps users on a page with detailed, comprehensive information that is easy to read and navigate. There should be a good use of visuals, videos and other interactive elements.
It also takes research to determine opportunities for content that’s missing in your industry or for a targeted keyword. Use tools such as Buzzsumo to find trends on social media, and conduct keyword research to identify opportunities for a specific keyword or topic.
Once you’ve landed on a topic, study all of the posts at the top of the results page. That’ll show you what you need to do to create something ten times better. For more helpful tips on creating 10x content, check out Rand’s Whiteboard Friday on the topic.
Duplicate vs. original
If you want your content to appear in search results and get your business in front of more eyes, it needs to be original. No argument. But will a repost of your blog somewhere else harm your SEO? Not likely. Especially if you follow proper practices, such as using a canonical tag or 301 redirect, duplicate content is not a pressing SEO issue.
Keep your focus on developing high-quality, original pieces that fit the characteristics of 10x content. You’ll have a better chance at ranking higher in search results and your brand reputation will benefit from producing expertly crafted content.
This post is part of the Brandpoint Quality Content Series, which analyzes how Google assesses quality content and how you can get your pages to appear higher in search results.
Part 1: What is High-Quality Content?
Part 2: Google Search Quality Guidelines: What is E-A-T?
Part 3: Are High-Quality Links Important for SEO?
Part 4: How to Create Readable Content
Part 5: How to Create Comprehensive Content
Part 6: Duplicate vs. Original Content
Part 7: Latent Semantic Indexing and Long-tail Keywords
Part 8: How to Optimize Images and Visuals for SEO
Part 9: Content Freshness and Generating New Topics
Part 10: SEO Success Stories