-> The term Canonicalization can be tough to understand. Let me try to explain this in simple terms.
Let’s say there are two URLs of a website:
http://thewebpage.org
http://www.thewebpage.org
-> Both of these URLs show the same content, and neither one redirects to the other. This can result in a duplicate content issue on Google, and you can face penalties.
Let us look at one more example. Two URLs on a website resolve to the same page:
http://thewebpage.org
http://thewebpage.org/index.php
If both of these URLs show the same result, this can cause an issue as well!
You might not pay much attention to this issue, but it can result in serious duplicate content penalties. The problem is that search engine bots can't decide which version of the URL they should add to their index. If two URLs return the same content, the bots will simply assume that one is a copy of the other, and your website can get penalized.
If your site opens on two URLs that show the same content, you must fix it. Configure your server so that, whether a user opens the site with www or without www, it always ends up on one version, for example by redirecting the other version to it. In this way, you can fix the canonicalization.
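To make the www versus non-www fix concrete, here is a minimal sketch of the decision a server has to make for each request. It assumes the non-www host is the version you picked. The names PREFERRED_HOST, ALTERNATE_HOSTS and redirect_target are made up for this illustration and do not come from any particular server's configuration.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption for this sketch: the non-www host is the one version to keep.
PREFERRED_HOST = "thewebpage.org"
# Other hosts that serve the same site and should be redirected.
ALTERNATE_HOSTS = {"www.thewebpage.org"}


def redirect_target(url):
    """Return the URL to redirect to, or None if no redirect is needed."""
    parts = urlsplit(url)
    host = parts.netloc.lower()  # host names are case-insensitive
    if host not in ALTERNATE_HOSTS:
        return None  # already on the preferred host (or not our site)
    # Keep scheme, path, query and fragment; swap only the host.
    return urlunsplit(
        (parts.scheme, PREFERRED_HOST, parts.path, parts.query, parts.fragment)
    )


print(redirect_target("http://www.thewebpage.org/about?x=1"))
print(redirect_target("http://thewebpage.org/about?x=1"))
```

In a real setup this logic usually lives in the web server's redirect rules, and the redirect is sent as a permanent (301) redirect so that search engines keep only the preferred version.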
At times, though, you may want to share the same content on two URLs. In that case, you can use the rel="canonical" tag to let search engines know which URL is the original and which one is a copy of it. This can save you from duplicate content penalties.
-> How to correctly apply URL Canonicalization?
Take an example: two URLs on the website return the same content when they resolve. These two URLs are:
http://thewebpage.org
http://thewebpage.org/index.php
-> HTML Canonicalization
The second URL returns the same content as the first one. Since both display the same page, you can apply the rel="canonical" tag in this case to indicate that the URL with index.php is the canonical URL for the first one.
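This is how it is applied. The tag goes inside the <head> of the page, and the attribute values must be wrapped in plain straight quotes:
<link rel="canonical" href="http://thewebpage.org/index.php">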
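If you want to double-check that a page really declares the canonical URL you expect, a small script can read the tag back out of the HTML. The sketch below uses only Python's standard html.parser module. The CanonicalFinder class and the find_canonical helper are names invented for this example, and the sample page is made up.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Remembers the href of the first <link rel="canonical"> tag it sees."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        # rel may hold several space-separated values, so split it first.
        rel_values = (attributes.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rel_values and self.canonical is None:
            self.canonical = attributes.get("href")


def find_canonical(html_text):
    """Return the canonical URL declared in html_text, or None if absent."""
    finder = CanonicalFinder()
    finder.feed(html_text)
    return finder.canonical


# Hypothetical page used only for this example.
page = """<html><head>
<title>The Web Page</title>
<link rel="canonical" href="http://thewebpage.org/index.php">
</head><body>Hello</body></html>"""

print(find_canonical(page))
```

A check like this is also a quick way to catch a tag that was pasted with typographic (curly) quotes from a word processor. This sketch does not recognize such a tag as a canonical link, and you should not count on search engines recognizing it either.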