What Is Duplicate Content On a Website?
Duplicate content is, without a doubt, something that can weigh down the SEO positioning of our website or eCommerce. Despite this, it is one of the errors that most of us optimize least or take least into account.
Today, this problem is reason enough for many pages on your site, perhaps those that were already reasonably well positioned, to disappear over time from the top positions in searches.
Why? Simple: Google constantly refines the SERP in order to show results of greater quality and relevance to its users, so we should not be surprised that it frowns upon duplicate content.
Ask yourself: in one of your searches, would you like to find literally the same text and information in almost every result on the first pages of Google?
I guess not. You wouldn’t like that situation, nor do I think anyone would. But luckily, whether they were voluntary or involuntary, these mistakes have a solution.
In this article we are going to learn how to detect duplicate content (thanks to some of the SEO tools that I use myself) and I will give you the most effective solutions for each case. But first, as I usually do in all my guides and tutorials, I would like to set the scene and give you a definition of the topic that concerns us today.
What Is Duplicate Content On a Website?
Duplicate content in SEO occurs when text is replicated, partially or completely, across different URLs, whether on pages within the same domain (internal) or on pages of other websites (external). This problem can also arise when 2 or more URLs lead to the same page within your web domain.
In most cases, external duplication is due to third-party copying or plagiarism. Internal duplication, on the other hand, usually comes from errors in the structure of our site that cause multiple URLs to lead to the same page, or from reusing much of the same text in the descriptions of 2 or more pages.
As a rule of thumb, content is often considered duplicate when more than 30% of it is already published literally at other URLs. Conversely, it could be considered original when at least 70% of the text on a page is not literally identical to text on other pages.
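To make the 30% rule of thumb concrete, here is a minimal Python sketch. It assumes that "literally published" can be approximated by counting shared runs of five consecutive words (so-called shingles); the shingle size and the function names are my own illustrative choices, not how any search engine actually measures duplication.

```python
import re


def shingles(text, size=5):
    """Return the set of runs of `size` consecutive words found in the text."""
    words = re.findall(r"\w+", text.lower())
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def duplicated_share(page, other, size=5):
    """Fraction of the page's shingles that also appear literally in the other text."""
    page_shingles = shingles(page, size)
    if not page_shingles:
        return 0.0
    return len(page_shingles & shingles(other, size)) / len(page_shingles)


def looks_duplicate(page, other, threshold=0.30):
    """Apply the rule of thumb above: more than 30% literal overlap (assumed threshold)."""
    return duplicated_share(page, other) > threshold
```

You could run `looks_duplicate(my_text, suspicious_text)` on two product descriptions to get a rough first signal before rewriting one of them.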
Can I Be Penalized For Duplicate Content?
In my experience, in most cases neither content duplication nor keyword cannibalization brings a pure and simple penalty. What they do is reduce the quality of our pages in Google’s eyes and, as a result, cause a great loss of positions in the SERP.
Of course, if you have a website that continually abuses these practices, you will most likely be penalized by Panda (Google’s algorithm for controlling these issues).
Anyway, penalty or not, you should be clear that search engines do not welcome these things. In addition, the great progress in their algorithms allows them to detect these copied texts more easily (especially within the same website).
Why Is Duplicate Content So Negative For My Website?
Now that you know what I mean by this common SEO problem, you should know that some of the consequences it may cause are the following:
► Low Quality Content
This problem can lower the quality of your pages for users and for Google. Google has to select one of the duplicate versions, and the one it selects may not be the one you want. As a consequence, a lower-quality “copy” can be shown to users, and it would rank worse.
► Decreased Organic Visibility
In short, if you lose quality, your pages also lose positions. And this drop in the SERP leads to a decrease in your online visibility and in your traffic from search engines.
► Decrease in Conversions
If you have different pages with very similar text, the search engine has to select the most suitable page for that search intent. The one it chooses may not be the most convenient for your business strategy.
► Wrong Authorship Attribution
When it detects two similar URLs on different domains, the search engine chooses the original version based on the indexing date and/or the popularity of the site.
In other words, Google could wrongly decide which is the original and, especially if you have little authority on the Internet, punish the wrong website. If you always act professionally and write your own texts, I understand that this will cause outrage, to say the least.
That is why it is so important to check the Internet regularly for copies of your original content, since the person unfairly harmed may be you.
► Loss Of Authority
As with cannibalization, duplicate pages can make your website less powerful. In addition, the links you receive can point to different URLs for the same topic, so instead of joining forces to strengthen one page’s positioning, their value is divided.
► Problems in Indexing
The indexing of your pages can be affected, because the search engine only spends a certain amount of time crawling them. The time wasted on an excess of low-quality or duplicate pages will make the search engine leave part of your site unvisited.
10 Tools To Know If I Have Duplicate Content On My Website
To analyze duplications, the most sensible thing to do is to start with titles, headings, descriptions and similar sections. The most effective way to identify them is through the use of tools.
And when I say tools, I’m not only talking about the different platforms or software created for this purpose, but also about search methods such as the “site:” command, which I’ll talk about later:
1. Google Search Console
It is one of the best starting points. To analyze this and other questions related to your own domain, sign up for Google Webmaster Tools and go to “Search Appearance” and then “HTML Improvements”.
Next, look at the duplicate title tags and meta descriptions. You will find the existing duplicates and the pages they appear on, so that you can correct them.
Without a doubt, Google Search Console is a good option for detecting duplication within your website.
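If you already have a list of your URLs with their title tags (from an export or a crawl), the same check for duplicate titles can be sketched in a few lines of Python. The URLs and the function name below are hypothetical, and this is only an illustration of the idea, not what Search Console does internally.

```python
from collections import defaultdict


def find_duplicate_titles(pages):
    """Group URLs by normalized title and keep only titles used by more than one URL."""
    by_title = defaultdict(list)
    for url, title in pages.items():
        # Normalize case and whitespace so trivial differences do not hide a duplicate.
        by_title[" ".join(title.lower().split())].append(url)
    return {title: urls for title, urls in by_title.items() if len(urls) > 1}


# Hypothetical sample data: URL -> content of its <title> tag.
PAGES = {
    "https://example.com/shoes": "Running Shoes | My Shop",
    "https://example.com/shoes?page=2": "Running  shoes | My Shop",
    "https://example.com/contact": "Contact | My Shop",
}
```

The same function works for meta descriptions or H1 headings if you pass those instead of titles.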
2. SEMrush
As you know, SEMrush, in addition to being one of my favorite tools, is also one of the most complete and, as such, includes a way to detect whether you have any problem of this type.
It has a very complete “SEO Audit” tool for websites, where duplicate content can be easily identified.
3. Screaming Frog
Thanks to Screaming Frog, you can crawl a site in search of duplications, among the other functionalities that this powerful SEO tool offers. To do this, use the “duplicate” filter on the Page, URL, H1 and Meta Description tabs.
I must warn you that it is not a free tool; however, its great features may make you consider paying for it.
4. Google Analytics
If you open the “Behavior” report, then “Site Content” and “Landing Pages”, you can also find duplications. Here, look in Google Analytics for pages and URLs that receive less organic traffic than they should.
Thanks to the Online Plagiarism tool, you can check whether a text is original or matches one already published on the web, simply by pasting it into the space provided.
In addition, you can easily upload your PDF file from Google Drive if you save your posts in the cloud before publishing them on your blog.
Personally, I have a predilection for it, given its speed and simplicity in telling you whether the chosen content is copied from something that already exists on the Internet. In fact, it is one of the verification tools that I use together with my team at JF-Digital. You can also download and install it on your hard drive, if you prefer.
6. Quetext
It is a simple and intuitive online platform that, once you paste the piece of text in question into the space provided, gives you all the information you need to know whether it is copied or original.
With Quetext you can see exactly which other websites have already published text identical to yours. It marks the exact fragments that you should therefore not publish on your page if you do not want to be penalized.
There is a wealth of web analytics tools to identify broken links, unindexed pages and duplications, plus other issues that are more difficult to detect.
Siteliner and SEMrush, both covered in this list, are among them.
7. CopyScape
In addition to being one of my favorites, it is widely used by a large number of professionals in Digital Marketing and the Internet in general.
With CopyScape you can enter the URL of your site and check whether there is any text on the web identical to yours. That way, you can contact the person responsible and ask for an explanation.
8. Command “site:” + “Keyword” in Google
This command searches Google for indexed pages of your website that contain a certain phrase or specific keyword (or product, if we are talking about an online store in PrestaShop or similar).
For example, among the results you can check whether there are pages indexed in Google with duplicate titles or descriptions, and whether some of them have been moved to the secondary index. This is also a great method for finding SEO cannibalizations.
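As a small illustration, this Python sketch builds the text of such a query and the matching Google search URL. The domain and phrase are placeholders, and the helper names are my own; putting the phrase in quotes to ask for an exact match is an assumption about how you want to search.

```python
from urllib.parse import quote_plus


def site_query(domain, phrase):
    """Build the text you would type into Google: site:<domain> "<phrase>"."""
    return f'site:{domain} "{phrase}"'


def google_search_url(domain, phrase):
    """Build a Google search URL for the query, with the query text URL-encoded."""
    return "https://www.google.com/search?q=" + quote_plus(site_query(domain, phrase))
```

For example, `site_query("example.com", "blue running shoes")` gives `site:example.com "blue running shoes"`, which you can paste directly into the search box.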
9. Virante Tools
It is very effective at checking the basic requirements that a blog must meet in order to avoid duplication.
The ideal is to have all the “checks” in green; if you find any in red, that is where you should work to correct the error.
If the first check in Virante Tools is red, it means that the URL is not canonical and that the preferred URL format has not been set correctly. This is the most serious mistake behind the problem that concerns us today.
10. SiteLiner
With this online tool you will be able to detect duplicate content on a small website. With its free version, a maximum of 250 pages can be analyzed.
Therefore, at least to get started, SiteLiner is enough.
How Can I Remove Duplicate Content On My Blog or Website?
It has already become clear that search engines do not like duplication, because it impoverishes the user experience. Therefore, if you detect it, you must do “the impossible” to eliminate it.
If you have duplicated it yourself on your own site, there are several ways to fix it or to make sure search engines know which version you want them to treat as “primary”.
The problem is that you need to know some programming, and not everyone is in a position to write code in the right places on a website.
If you have not mastered HTML, my advice is to seek the help of a specialist or hire their services.
At this point in the article, you should already be aware of how important it is to avoid duplicate content if you do not want your website to be relegated to the lowest positions (or pages) of the Google SERP.
The most common ways of dealing with duplication on our website are:
1. Change The Text
This is one of the simplest ways, but at the same time one of the least used. If you have two very similar pages and you want to position both in the search engines, rewrite the content of one of those URLs to make it as original as possible.
2. Make a “Canonical”
The “rel=canonical” tag was created to deal precisely with this situation. For example, it is widely used in eCommerce when we have products with very similar descriptions.
The “rel=canonical” tag is a line of code that is inserted into the <head> of your page and tells search engines which is the original version. That way, it helps prevent them from cataloging these contents as duplicates.
Here, we must bear in mind that, with this attribute, it is the search engines that make the final decision about what to do with those pages; that is, they decide whether to index all of them or only the main (canonical) one.
This is a solution that anyone can implement. But it is also true that you need some knowledge of HTML to put the tag in the right place, or a plugin/module to help you with that job.
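To show what the tag looks like and how you might verify it, here is a minimal Python sketch that reads the canonical URL declared in a page. The sample page, its example.com URL and the helper names are hypothetical; only the standard library HTML parser is used.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Remember the href of the first <link rel="canonical"> tag found."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        rel_values = (attributes.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rel_values and self.canonical is None:
            self.canonical = attributes.get("href")


def canonical_of(html):
    """Return the canonical URL declared in the HTML, or None if there is none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


# Hypothetical product variant page that points to the main product URL.
PAGE = """<html><head>
<title>Blue T-shirt, size M</title>
<link rel="canonical" href="https://example.com/products/blue-t-shirt">
</head><body>...</body></html>"""
```

Calling `canonical_of(PAGE)` returns the URL in the `href`, so you can confirm that every variant of a product points to the version you want indexed.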
3. 301 Redirect
It is the best option when it is not feasible to use the previous tag or when you have two indexed URLs that lead to the same place.
With a 301 redirect, visitors are automatically sent from one page to another page of our choosing.
You can use it mainly in two situations:
First, if you have two identical or extremely similar pages and, for whatever reason, you cannot use a canonical, you should redirect one to the other (taking into account their relevance or importance, since one of them will disappear for your visitors).
Second, if visitors to your website can reach the same landing page from different URLs. By 301-redirecting all those URLs to the main or correct one, you direct your visitors and search engines to a single URL, wherever they come from.
And, incidentally, we tell the search engines which URL is the “correct” one that they should index.
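The logic behind a 301 redirect can be sketched in Python as a simple lookup table. The paths below are invented for illustration, and in practice the rule usually lives in your server configuration or in a plugin rather than in code like this.

```python
from http import HTTPStatus

# Hypothetical map of duplicate paths to the single URL we want visitors to reach.
REDIRECTS = {
    "/index.html": "/",
    "/home": "/",
    "/blue-shirt-old": "/products/blue-shirt",
}


def resolve(path):
    """Return (status, location): a 301 to the main URL, or 200 if the path is already the main one."""
    target = REDIRECTS.get(path)
    if target is None:
        return HTTPStatus.OK, path
    return HTTPStatus.MOVED_PERMANENTLY, target
```

Here `resolve("/home")` answers with status 301 and the location `/`, while `resolve("/")` is served normally, so all the alternative URLs end up at one address.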
4. Through URL Parameters
If the duplicate content is produced by certain URL parameters, you can tell Google which ones to ignore from the “Crawl” and “URL Parameters” section of Search Console (Webmaster Tools).
The procedure is almost the same as with Robots.txt: search engines are told which URLs to index and which ones to ignore.
This method is especially useful for eCommerce sites that offer the same product in different sizes and colors. The page will be the same for all the size and color variants, but the webmaster will only be interested in highlighting one of them, with the general description of the product.
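To see which of your URLs collapse into the same page once those parameters are ignored, you can normalize them with a short Python sketch. The parameter names `color` and `size` are assumptions taken from the example above; your store may use different ones.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed names of the parameters that only select a product variant.
IGNORED_PARAMETERS = {"color", "size"}


def normalize(url):
    """Drop the ignored parameters (and any fragment), keeping the rest of the URL unchanged."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in IGNORED_PARAMETERS
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))
```

Any two URLs that normalize to the same string are candidates for a canonical tag or a parameter rule.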
5. Robots.txt
It is another of the actions to avoid duplication on your pages.
If, for any reason, you cannot redirect or delete the page with duplicate content, this is the best option to avoid the dreaded sanctions.
With Robots.txt files, we tell search engines which pages or files to ignore or block, so that they do not spend a single millisecond on them.
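As an illustration, the following Python sketch checks which paths a sample Robots.txt file blocks, using the robots parser from the standard library. The rules and the example.com paths are hypothetical; adapt them to the duplicate sections of your own site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: keep crawlers out of printable copies and internal search results.
ROBOTS_TXT = """\
User-agent: *
Disallow: /print/
Disallow: /search
"""


def is_crawlable(path, user_agent="*"):
    """Return True if the sample rules allow the given crawler to fetch the path."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, "https://example.com" + path)
```

With these rules, `is_crawlable("/print/my-post")` is False and `is_crawlable("/blog/my-post")` is True, which is a quick way to test a rule before uploading the file.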
6. Sole Editor
If you are the only editor of your blog, you are generating duplicate information without knowing it, through the author pages. In WordPress, they are usually of the type:
The solution is very simple: mark the author pages as “noindex, follow” and you will be telling the search engine not to index those URLs.
This only needs to be done when there is a single author. Still, in blogs or digital magazines with a variety of authors, it wouldn’t hurt to do it as well.
7. Do Not Abuse Tags or Categories in Blogs
Misused categories and tags can be very dangerous for your SEO positioning.
For example, in an ordinary blog, contrary to what you may be thinking, indexing these pages often only serves to generate duplicate content or cannibalizations.
Still, if you want to index the categories and/or tags of your blog, do it strategically and with great care, and do not generate industrial quantities of them without any coherence or meaning.
I insist: if you are in doubt, you can set the meta tag “noindex, follow” for them in the options of your SEO plugin (if you use WordPress), and that way you will not create duplicate content with them.
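Once the plugin is configured, you can verify that a page really carries the “noindex, follow” directive with a sketch like this one in Python. The sample page and the helper names are hypothetical, and the check only covers the robots meta tag in the HTML.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collect the directives declared in <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "meta" and (attributes.get("name") or "").lower() == "robots":
            content = attributes.get("content") or ""
            self.directives.update(
                part.strip().lower() for part in content.split(",") if part.strip()
            )


def robots_directives(html):
    """Return the set of robots directives found in the HTML."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return finder.directives


# Hypothetical tag or author page as it should look after the change.
ARCHIVE_PAGE = (
    '<html><head><meta name="robots" content="noindex, follow"></head>'
    "<body>...</body></html>"
)
```

If `"noindex"` is in `robots_directives(ARCHIVE_PAGE)`, the page asks search engines not to index it while still letting them follow its links.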
How To Fix Duplicate Content When It’s Off Your Site?
We have reviewed all the ways to detect it on our own website and the different ways to solve the problem. Let’s see now what we can do when it is on someone else’s domain.
Option 1: request removal
In this case, you can “kindly” request that the content be removed, via email, social media or a contact form.
Perhaps the person who plagiarized you does not know how bad that can be for both of you.
If this first contact does not bear fruit, something that unfortunately happens with some frequency, we have to take a second step.
Option 2: a canonical link
Although I find it difficult to achieve, if they do not want to delete it, you can ask them to link their copy to your text with a “canonical”.
That way, the search engine will find the original content and neither of you will run the risk of being penalized.
Option 3: formal request to Google
But this second option may not work either, which also happens frequently.
So we move on to “stronger measures”: asking Google to de-index the URL that copies your content.
To do this, you must file a request under US copyright law.
In all these cases it is advisable to keep a record of every effort you previously made to solve the problem on your own.
Keep a copy of the emails or messages you have exchanged with the webmaster of the site that hosts your duplicated content.
Option 4: ordinary justice
The last option, if everything else fails, is to resort to the ordinary courts of your country so that they apply the legislation in force.
You must file a complaint for plagiarism, since publishing texts online is no different from publishing texts on paper.
Everything you publish digitally is automatically protected by copyright, and therefore you can go to court to have the defendant ordered to erase the plagiarized content.
And, by the way, to compensate you financially for the damages caused, if applicable.
This last method may seem too extreme, but you have to keep in mind that there are many people who make a living from their website or blog (as in my case).
The visitors who come to them hire their services, buy their products or simply click on the ads.
Duplicate content on these types of sites can lower their income considerably.