
Canonical URLs and 301 Redirects Explained

Canonicalization is the process of picking the best URL when there are several choices, according to Google’s Matt Cutts. It comes up most often with home pages, as in the examples below:

* www.example.com
* example.com/
* www.example.com/index.html
* example.com/home.asp

All of the above URLs appear to be the same, but technically they are all different (the www version is a different hostname from the non-www version). A web server can be set up to return completely distinct content for each of them. When Google canonicalizes a URL, it tries to pick the one that best represents the set.

However, for most sites, all of the above URLs return the same page, so what Google sees is identical content on different URLs. If you are fortunate enough not to run into duplicate content issues, Google picks one of the URLs and stores the others in the supplemental index.
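As a rough illustration of what "picking one URL" means, the sketch below maps the four home page variants listed earlier to a single form. The preferred host and the set of default-document names are assumptions made for this example; this is not how Google itself decides.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumptions for illustration only: which host is preferred and which
# file names count as the "default document" of the home page.
PREFERRED_HOST = "www.example.com"
BARE_HOST = "example.com"
DEFAULT_DOCUMENTS = {"/index.html", "/home.asp"}


def canonical_home_url(url: str) -> str:
    """Map a home page URL variant to one preferred URL."""
    # urlsplit only recognises the host when a scheme is present.
    if "://" not in url:
        url = "http://" + url
    parts = urlsplit(url)

    # Treat the bare domain and the www host as the same site.
    host = parts.netloc.lower()
    if host == BARE_HOST:
        host = PREFERRED_HOST

    # An empty path and the default documents all mean "the home page".
    path = parts.path or "/"
    if path.lower() in DEFAULT_DOCUMENTS:
        path = "/"

    return urlunsplit((parts.scheme, host, path, parts.query, ""))


variants = [
    "www.example.com",
    "example.com/",
    "www.example.com/index.html",
    "example.com/home.asp",
]
canonical = {canonical_home_url(v) for v in variants}
```

With these assumptions, all four variants collapse to the same URL, so `canonical` holds a single entry.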

Also, the PageRank your home page earns from external (inbound) links and from your own site’s internal links is split across the URL variants instead of being consolidated on a single URL.

To resolve the problems, you should consider the following:

Use Google’s webmaster tools to set the preferred URL of your home page to one of the versions (either www or non-www). This helps Google identify the canonical URL. Over time, Google will crawl only the preferred version of your home page URL.

URL consistency is important. When creating internal links, use one URL format across your site. Take my blog for example: I only use www.gordonchoi.com when pointing to the home page, never the non-www version (gordonchoi.com).
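One way to audit internal links for consistency is to scan a page's HTML for anchors that point at the non-preferred host. The sketch below uses only the Python standard library; the host name and the helper names are hypothetical and chosen for this example.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit

# Assumption for illustration: the non-www host is the one to avoid.
BARE_HOST = "example.com"


class InconsistentLinkFinder(HTMLParser):
    """Collect href values of <a> tags that use the non-preferred host."""

    def __init__(self):
        super().__init__()
        self.inconsistent = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        # Relative links such as "/contact" have an empty host and are fine.
        if href and urlsplit(href).netloc.lower() == BARE_HOST:
            self.inconsistent.append(href)


def find_inconsistent_links(html: str) -> list:
    finder = InconsistentLinkFinder()
    finder.feed(html)
    return finder.inconsistent


sample = (
    '<a href="http://example.com/">Home</a> '
    '<a href="http://www.example.com/about">About</a> '
    '<a href="/contact">Contact</a>'
)
bad_links = find_inconsistent_links(sample)
```

Here only the first link is flagged, because it is the only one that names the bare host explicitly.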

If you want your home page’s default URL to be the www version, set up a 301 permanent redirect on the non-www version. This way, when the non-www version is requested, the server sends users and crawlers on to the www version.
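In practice the redirect is often configured in the web server itself, but the logic can be sketched in a few lines of Python as a WSGI wrapper. The host names and the function name `redirect_to_www` are assumptions for this example, and the sketch ignores details such as ports in the Host header.

```python
# Assumptions for illustration: the www host is preferred and the bare
# domain should redirect to it.
PREFERRED_HOST = "www.example.com"
BARE_HOST = "example.com"


def redirect_to_www(app):
    """Wrap a WSGI app so requests for the bare host get a 301 to the www host."""

    def wrapper(environ, start_response):
        host = environ.get("HTTP_HOST", "").lower()
        if host == BARE_HOST:
            scheme = environ.get("wsgi.url_scheme", "http")
            path = environ.get("PATH_INFO", "") or "/"
            location = f"{scheme}://{PREFERRED_HOST}{path}"
            query = environ.get("QUERY_STRING", "")
            if query:
                location += "?" + query
            # 301 tells browsers and crawlers that the move is permanent.
            start_response(
                "301 Moved Permanently",
                [("Location", location), ("Content-Length", "0")],
            )
            return [b""]
        # Requests for the preferred host pass through unchanged.
        return app(environ, start_response)

    return wrapper


def site(environ, start_response):
    """Stand-in for the real site."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"home"]


application = redirect_to_www(site)
```

A request with `Host: example.com` receives a 301 whose `Location` header names the same path on `www.example.com`, while a request for the www host is served normally.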

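Do not use directives in your robots.txt file to block the non-www format of the URL. This is a common mistake made by amateur webmasters. Rules in robots.txt match URL paths, not hostnames, and both hostnames usually serve the same file, so a rule meant for the non-www version applies to the www version as well. If you make this mistake, Google will not only stop crawling your non-www URL but can also start removing your entire site from its index.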
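The effect can be checked with Python's standard `urllib.robotparser`. The sketch below assumes a hypothetical robots.txt that tries to block the home page and is served identically from both hostnames; the user agent string is only an example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt, assumed to be served from both
# example.com and www.example.com.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# The rule matches on the path only, so both hosts are blocked alike.
non_www_allowed = parser.can_fetch("Googlebot", "http://example.com/")
www_allowed = parser.can_fetch("Googlebot", "http://www.example.com/")
```

Both values come out `False`: the rule cannot single out the non-www host, which is why a 301 redirect is the safer tool for this job.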

Posted in: Google, Link Building, SEO
