We keep hearing about internal duplicate content issues, yet many of us are genuinely unaware that our own websites suffer from them. Why?
most websites are built with third-party site builders that create several URLs for the same content without our knowing about it;
sometimes webmasters simply lack SEO knowledge. For example, they might be unaware that the path part of a URL is case sensitive, so
www.yoursite.com/page1 and
www.yoursite.com/Page1 are treated as two different pages with the same content.
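To make the case-sensitivity point concrete, here is a minimal sketch of how we might scan a list of crawled URLs for pairs that differ only in letter case. The function name `find_case_variants` and the sample list `crawled` are made up for illustration; this is one simple way to check, not how any search engine works internally.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def find_case_variants(urls):
    """Group URLs that are identical except for letter case in host or path."""
    groups = defaultdict(set)
    for url in urls:
        # urlsplit only fills in the host when a scheme or "//" is present.
        parts = urlsplit(url if "//" in url else "//" + url)
        key = (parts.netloc.lower(), parts.path.lower(), parts.query)
        groups[key].add(url)
    # Keep only the groups that contain more than one spelling.
    return [sorted(group) for group in groups.values() if len(group) > 1]


# Hypothetical crawl output, reusing the example URLs from the text.
crawled = [
    "www.yoursite.com/page1",
    "www.yoursite.com/Page1",
    "www.yoursite.com/about",
]
print(find_case_variants(crawled))
```

Any group this returns is a candidate for a redirect to a single spelling.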
Here is what can create duplicate content:
canonical issues (www and non-www versions of the same URL);
pagination, when different pages have identical titles and meta descriptions;
various versions of the home page (e.g.
www.site.com and
www.site.com/index.php);
incorrect internal navigation creating several URLs for one and the same page (e.g.
www.site.com/page.php?id=567 and
www.site.com/category/page.php?id=567); etc.
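Some of the causes listed above can be spotted from the URLs alone. The sketch below reduces each URL to a comparison key, so that www and non-www hosts and the different home page versions collapse into one. The `normalize` helper and the `INDEX_FILES` list are assumptions made for this example; the right rules depend on how your own server is configured.

```python
from urllib.parse import urlsplit

# Assumption: these file names all serve the directory's default page.
INDEX_FILES = ("index.php", "index.html", "index.htm")


def normalize(url):
    """Reduce a URL to a comparison key; illustrative rules only."""
    # urlsplit only fills in the host when a scheme or "//" is present.
    parts = urlsplit(url if "//" in url else "//" + url)
    host = parts.netloc.lower()
    # Treat www and non-www hosts as the same site.
    if host.startswith("www."):
        host = host[4:]
    path = parts.path or "/"
    # Treat /index.php and similar files as the directory itself.
    head, _, last = path.rpartition("/")
    if last in INDEX_FILES:
        path = head + "/"
    query = "?" + parts.query if parts.query else ""
    return host + path + query


# Both home page versions end up with the same key.
print(normalize("www.site.com") == normalize("www.site.com/index.php"))
```

URL rules like these will not catch the navigation case, where two different paths serve the same page. For that, the page content itself has to be compared.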
Why is it important to get rid of duplicate content issues?
Google has mostly figured out how to sort this out: it drops one version and indexes and ranks another. Still, internal content duplication may cause a few problems:
decreased crawl rate, as Googlebot is kept busy crawling unnecessary identical pages;
the wrong version of a page being ranked, which results in a bad user experience (e.g. page 2 is ranked instead of page 1);
delayed ranking of newly launched sites.