Canonical URL

What is a Canonical URL?

When multiple URLs lead to the exact same webpage, the canonical URL is the one designated as the preferred URL for that page.

Many web servers ignore query parameters they don't recognize, so in most cases you can add as many query parameters as you wish to a URL without changing the resulting webpage, e.g.:

https://www.example.com/
https://www.example.com/?foo=bar
https://www.example.com/?foo=bar&fish=fries

Search engines, reverse proxies, and other Internet-related programs treat URLs with different query parameters as distinct URLs, because they are. The web server could return a different webpage for each set of parameters. Search engines themselves do exactly this: the results page they return depends on the search terms in the query string. So a program can't assume that any of these URLs refers to exactly the same webpage as another without first fetching each webpage and inspecting its contents.
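As a rough illustration of why programs treat these as separate URLs, the sketch below parses the three example URLs with Python's standard urllib.parse module. The helper name describe and the variable names are made up for this example. It only compares the URLs themselves and fetches nothing, so it cannot say whether the server returns the same page for each.

```python
from urllib.parse import urlsplit, parse_qsl

urls = [
    "https://www.example.com/",
    "https://www.example.com/?foo=bar",
    "https://www.example.com/?foo=bar&fish=fries",
]

def describe(url):
    """Split a URL into (scheme, host, path, query parameters)."""
    parts = urlsplit(url)
    return parts.scheme, parts.netloc, parts.path, tuple(parse_qsl(parts.query))

descriptions = [describe(u) for u in urls]

# As strings, all three URLs are different from one another.
all_distinct = len(set(urls)) == len(urls)

# They share scheme, host, and path; only the query parameters differ.
same_location = len({d[:3] for d in descriptions}) == 1

for url, d in zip(urls, descriptions):
    print(url, "->", d[3])
```

A cache or crawler that keys on the full URL would therefore store three separate entries here, even if the server happens to return identical content for all of them.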

Assuming that a program could heuristically figure out that two webpages are in fact identical, the program would still have to figure out which URL is the preferred one. Should the URL ending in /, /?foo=bar, or /?foo=bar&fish=fries be used? There is no general way to solve this problem programmatically.

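Instead, it's up to the webmaster to tell programs that consume the webpage which URL is the canonical one, through metadata. The standard way to do this is the rel="canonical" link relation (specified in RFC 6596), which is added to the page's <head> with the following HTML code: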

<link rel="canonical" href="https://www.example.com/">

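When a search engine indexes a webpage with this code, it will generally treat the specified URL as the canonical one. Major search engines describe the tag as a strong hint rather than a binding directive, so the outcome is not guaranteed.

This means that when the webpage appears in the search results, the link will normally go to this URL, and not to any of the other possible URLs.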
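To show how a program consuming the webpage might read this metadata, here is a minimal sketch using Python's standard html.parser module. The names CanonicalFinder and find_canonical are hypothetical, and this is not how any particular search engine implements it. It simply returns the href of the first link element whose rel attribute includes "canonical".

```python
from html.parser import HTMLParser

class CanonicalFinder(HTMLParser):
    """Remembers the href of the first <link rel="canonical"> it sees."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attributes = dict(attrs)
        # rel may hold several space-separated values; compare case-insensitively.
        rel_values = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attributes.get("href")

def find_canonical(html):
    """Return the canonical URL declared in the HTML, or None if there is none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical

page = (
    '<html><head>'
    '<link rel="canonical" href="https://www.example.com/">'
    '</head><body>Hello</body></html>'
)
print(find_canonical(page))
```

A real consumer would also need to resolve relative href values against the page's own URL and decide what to do when a page declares several conflicting canonical links. This sketch leaves both out.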
