If a site has a page available at multiple URLs, or pages with identical or similar content, the Yandex robot may count them as duplicates. In this case, it will combine the pages in a group of duplicates and choose one of them, the most informative and relevant to the search query, to be displayed in the search results. This is called a canonical page.
- How do I specify the canonical URL of a page?
- How do I change a URL using the canonical address?
- Cases where the canonical address isn't taken into account
How do I specify the canonical URL of a page?
Add the canonical URL in the rel="canonical" attribute using one of the following methods:
In the HTML code of the page. Let's say a page can be accessed at two URLs: www.example.com/pages?id=2 and www.example.com/blog.
If the preferred address is /blog, add the following link element to the HTML code of the /pages?id=2 page:
<link rel="canonical" href="http://www.example.com/blog"/>
In the HTTP header of the response. For example:
Link: <http://www.example.com/offer/file.pdf>; rel="canonical"
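To check what a page actually declares, the two methods above can be read back with a short script. The following is a minimal Python sketch using only the standard library. The class and function names are hypothetical, the Link header parsing is deliberately simplified, and it does not reproduce how the Yandex robot reads these values.

```python
import re
from html.parser import HTMLParser


class CanonicalLinkParser(HTMLParser):
    """Collects href values of <link rel="canonical"> elements (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <link ... />.
        if tag != "link":
            return
        attr = dict(attrs)
        rel_tokens = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_tokens and attr.get("href"):
            self.canonicals.append(attr["href"])


def canonicals_from_html(html):
    """Returns every canonical URL declared in the given HTML text."""
    parser = CanonicalLinkParser()
    parser.feed(html)
    return parser.canonicals


def canonicals_from_link_header(value):
    """Returns canonical URLs from a Link header value.

    Simplified: assumes URLs contain no '<' or '>' and that parameters
    follow each <url> entry directly.
    """
    found = []
    for match in re.finditer(r"<([^>]+)>([^<]*)", value):
        url, params = match.group(1), match.group(2)
        rel = re.search(r'\brel\s*=\s*"?([^";,]+)', params, re.IGNORECASE)
        if rel and "canonical" in rel.group(1).lower().split():
            found.append(url)
    return found
```

If either function returns more than one distinct URL for the same page, the declarations conflict and are worth reviewing.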
The robot learns about the changes when it crawls the site. If the canonical URL is entered correctly and the robot doesn't ignore the instructions, the non-canonical page disappears from the search results. To make sure that the page is removed from the search results, check in Yandex.Webmaster (the Excluded pages block).
The robot ignores instructions if the contents of the canonical and non-canonical page are significantly different. In this case, a non-canonical page may be included in the search. To check this, go to.
To exclude a non-canonical page that contains GET parameters or tags (UTM, from, and so on) in the URL, add the Clean-param directive to the robots.txt file. Otherwise, use the Disallow directive.
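Before listing parameters in robots.txt, it can help to confirm which URLs collapse to the same address once tracking parameters are ignored. The sketch below is an illustration in Python only. The function name and the parameter names passed to it are assumptions, and it does not model how the robot applies the Clean-param directive.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def strip_params(url, ignored):
    """Returns the URL without the query parameters named in `ignored`."""
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name not in ignored
    ]
    # An empty query is dropped together with the "?" separator.
    return urlunsplit(parts._replace(query=urlencode(kept)))
```

For instance, with `ignored={"utm_source", "from"}`, the address http://www.example.com/pages?id=2&utm_source=mail reduces to http://www.example.com/pages?id=2, because the id parameter is not in the ignored set.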
How do I change a URL using the canonical address?
You can enter the canonical address to change the URL of a site:
- To a domain with or without the www prefix.
- To use HTTPS or HTTP protocol.
The robot will interpret the canonical address as a redirect to the new main mirror and group the two site versions. To do this, add a link to the corresponding page on the new site with the rel="canonical" attribute in the HTML or in the HTTP header of every page on the old site. For example, suppose you change http://example.com to https://example.com. On the http://example.com/main/ page, include:
<link rel="canonical" href="https://example.com/main/"/>
If the attribute points to a different page, the robot might consider this a difference in the site structure. In this case, the site can't be moved.
If you change the URL, make sure that the contents match on the old site and new site. For more information, see relocation instructions.
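Because every page on the old site has to point to the same page on the new site, the link element is usually generated rather than written by hand. Here is a minimal Python sketch under that assumption. The function name is hypothetical, and it changes only the protocol and host while keeping the path and query string as they are.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit


def canonical_link_for_move(old_url, new_scheme, new_host):
    """Builds a link element pointing to the same page on the new site."""
    parts = urlsplit(old_url)
    # Keep path and query unchanged so the site structure stays the same.
    new_url = urlunsplit((new_scheme, new_host, parts.path, parts.query, ""))
    return '<link rel="canonical" href="{}"/>'.format(escape(new_url, quote=True))
```

Calling `canonical_link_for_move("http://example.com/main/", "https", "example.com")` returns `<link rel="canonical" href="https://example.com/main/"/>`.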
Cases where the canonical address isn't taken into account
The Yandex robot doesn't consider a URL canonical if:
- At the time of crawling, non-canonical pages respond more fully to the user's request, and their content differs significantly from the canonical ones. If you are sure that such pages won't be useful in search, prohibit indexing in the robots.txt file.
- The canonical URL is not accessible to the robot — it redirects to another page or is closed from indexing. This means it can't be included in the search. In this case, a non-canonical URL can be included in the search instead of the canonical URL, provided the robot can access it.
- The canonical URL points to another domain or subdomain.
- Several canonical URLs are specified.
- A chain of canonical URLs is specified. For example, for example.ru/1, the canonical URL is example.ru/2. At the same time, example.ru/2 has the canonical URL example.ru/3.
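Some of the structural cases in this list (several canonical URLs, another domain or subdomain, a chain) can be checked offline once the declared canonical URLs have been collected for each page. The following Python sketch assumes a hypothetical mapping from absolute page URLs to the list of canonical URLs found on each page. It is a rough check, not the robot's logic, and it also flags a change of the www prefix, which is a deliberate step when moving a site.

```python
from urllib.parse import urlsplit


def canonical_issues(declared):
    """Maps each problematic page URL to a list of detected issues.

    `declared` maps an absolute page URL to the canonical URLs found on it.
    """
    issues = {}
    for page, targets in declared.items():
        problems = []
        distinct = sorted(set(targets))
        if len(distinct) > 1:
            problems.append("several canonical URLs")
        for target in distinct:
            if target == page:
                continue  # a page pointing to itself is not a problem
            if urlsplit(target).netloc.lower() != urlsplit(page).netloc.lower():
                if "another domain or subdomain" not in problems:
                    problems.append("another domain or subdomain")
            # A chain exists when the target itself points somewhere else.
            onward = set(declared.get(target, [])) - {target}
            if onward and "chain of canonical URLs" not in problems:
                problems.append("chain of canonical URLs")
        if problems:
            issues[page] = problems
    return issues
```

With the chain from the example above written as absolute URLs (http://example.ru/1 pointing to http://example.ru/2, which points to http://example.ru/3), only http://example.ru/1 is reported, with the issue "chain of canonical URLs".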
If the rel="canonical" attribute refers to the page it's on, the robot considers this page canonical.
If a page was excluded from search results for being non-canonical, it means that the robot found the rel="canonical" attribute with the canonical URL in its HTML code or HTTP header. Delete this reference and check that the page you want to include back in the search is not closed to indexing.