The Clean-param directive
Note
Sometimes the Disallow directive is used to close such pages (pages whose addresses differ only in parameters that don't affect their content). We recommend using Clean-param instead, because this directive allows some accumulated metrics to be transferred to the main page URL or to the website.
How to use the Clean-param directive
Specify the Clean-param directive as fully as possible and keep it up to date. A new parameter that doesn't affect the page content may result in duplicate pages that shouldn't be included in the search. Because of the large number of such pages, the robot crawls the site more slowly, so it takes longer for important changes to show up in the search results.
The Yandex robot uses this directive to avoid reloading duplicate information. This improves the robot's efficiency and reduces the server load.
For example, your site contains the following pages:
www.example.com/some_dir/get_book.pl?ref=site_1&book_id=123
www.example.com/some_dir/get_book.pl?ref=site_2&book_id=123
www.example.com/some_dir/get_book.pl?ref=site_3&book_id=123
The ref parameter is only used to track which resource the request was sent from. It doesn't change the page content: all three URLs display the same page with the book_id=123 book. If you specify the directive in the following way:
User-agent: Yandex
Clean-param: ref /some_dir/get_book.pl
The Yandex robot will consolidate all the page addresses into one:
www.example.com/some_dir/get_book.pl?book_id=123
If a page with this address is available on the site, it is included in the search results.
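For illustration only, here is a short Python sketch of the effect described above; it isn't the robot's actual implementation, just a model of how removing the ref parameter collapses all three addresses into the same canonical URL (a scheme is added so the URLs can be parsed):

from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

def drop_params(url, params_to_drop):
    # Remove the listed query parameters and rebuild the URL.
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query) if k not in params_to_drop]
    return urlunsplit(parts._replace(query=urlencode(kept)))

urls = [
    "http://www.example.com/some_dir/get_book.pl?ref=site_1&book_id=123",
    "http://www.example.com/some_dir/get_book.pl?ref=site_2&book_id=123",
    "http://www.example.com/some_dir/get_book.pl?ref=site_3&book_id=123",
]

# All three collapse to .../get_book.pl?book_id=123
print({drop_params(u, {"ref"}) for u in urls})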
To apply the directive to parameters on pages at any address, do not specify the address:
User-agent: Yandex
Clean-param: utm
Tip
The Clean-param directive is intersectional, so it can be specified anywhere within the file. If you define other directives specifically for the Yandex robot, list all the rules intended for it in a single section. In that case, the User-agent: * string will be ignored.
Directive syntax
Clean-param: p0[&p1&p2&..&pn] [path]
In the first field, list the parameters that the robot should disregard, separated by the & character. In the second field, specify the path prefix for the pages the rule should apply to.
The prefix can contain a regular expression in a format similar to the one used in the robots.txt file, but with some restrictions: you can only use the characters A-Za-z0-9.-/*_. However, the * character is treated the same way as in the robots.txt file: a * character is always implicitly appended to the end of the prefix. For example:
Clean-param: s /forum/showthread.php
means that the s parameter is disregarded for all URLs that begin with /forum/showthread.php. The second field is optional; if it is omitted, the rule applies to all pages on the site.
The directive is case-sensitive. The maximum length of a rule is 500 characters. For example:
Clean-param: abc /forum/showthread.php
Clean-param: sid&sort /forum/*.php
Clean-param: someTrash&otherTrash
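The following Python sketch is for illustration only; it isn't part of the robots.txt standard or the robot's real code. It models the matching rules described above: the path prefix is case-sensitive, * matches any sequence of characters, a trailing * is always implied, and an omitted second field makes the rule apply to every page. The rules below reuse the examples above.

import re
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

def rule_applies(prefix, path):
    # True if the URL path matches the Clean-param prefix (case-sensitive,
    # "*" matches any characters, and a trailing "*" is implied).
    if not prefix:
        return True  # second field omitted: the rule applies to all pages
    pattern = ".*".join(re.escape(part) for part in prefix.split("*")) + ".*"
    return re.match(pattern, path) is not None

def clean(url, rules):
    # Apply a list of (parameters, prefix) Clean-param rules to one URL.
    parts = urlsplit(url)
    drop = set()
    for params, prefix in rules:
        if rule_applies(prefix, parts.path):
            drop |= set(params.split("&"))
    kept = [(k, v) for k, v in parse_qsl(parts.query) if k not in drop]
    return urlunsplit(parts._replace(query=urlencode(kept)))

rules = [
    ("abc", "/forum/showthread.php"),
    ("sid&sort", "/forum/*.php"),
    ("someTrash&otherTrash", ""),
]

print(clean("http://example.com/forum/index.php?sid=1&sort=asc&t=8243", rules))
# Prints http://example.com/forum/index.php?t=8243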
Additional examples
#for addresses like:
www.example1.com/forum/showthread.php?s=681498b9648949605&t=8243
www.example1.com/forum/showthread.php?s=1e71c4427317a117a&t=8243
#robots.txt will contain:
User-agent: Yandex
Clean-param: s /forum/showthread.php
#for addresses like:
www.example2.com/index.php?page=1&sid=2564126ebdec301c607e5df
www.example2.com/index.php?page=1&sid=974017dcd170d6c4a5d76ae
#robots.txt will contain:
User-agent: Yandex
Clean-param: sid /index.php
#if there are several such parameters:
www.example1.com/forum_old/showthread.php?s=681498605&t=8243&ref=1311
www.example1.com/forum_new/showthread.php?s=1e71c417a&t=8243&ref=9896
#robots.txt will contain:
User-agent: Yandex
Clean-param: s&ref /forum*/showthread.php
#if the parameter is used in several scripts:
www.example1.com/forum/showthread.php?s=681498b9648949605&t=8243
www.example1.com/forum/index.php?s=1e71c4427317a117a&t=8243
#robots.txt will contain:
User-agent: Yandex
Clean-param: s /forum/index.php
Clean-param: s /forum/showthread.php
Disallow and Clean-param
The Clean-param directive doesn't have to be combined with the Disallow directive.
User-agent: Yandex
Disallow:
Clean-param: s&ref /forum*/showthread.php
#is identical to:
User-agent: Yandex
Clean-param: s&ref /forum*/showthread.php
Since the Clean-param directive is intersectional, it can be specified anywhere in the file, regardless of where the Disallow and Allow directives are located. Disallow takes priority: if a page address is disallowed for indexing by Disallow and is also covered by Clean-param, the page will not be indexed.
User-agent: Yandex
Disallow:/forum
Clean-param: s&ref /forum*/showthread.php
In this case, the page https://example.com/forum?ref=page will be considered disallowed. Don't use the Disallow directive for pages if you only want to remove URL variants with GET parameters from the search.
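For illustration only, a short Python sketch of the priority described above (it isn't the robot's actual logic): a URL matched by a Disallow prefix is excluded from indexing even though a Clean-param rule also covers it.

from urllib.parse import urlsplit

def is_disallowed(url, disallow_prefixes):
    # robots.txt rules are matched against the path plus the query string.
    parts = urlsplit(url)
    target = parts.path + ("?" + parts.query if parts.query else "")
    return any(prefix and target.startswith(prefix) for prefix in disallow_prefixes)

# Disallow: /forum  together with  Clean-param: s&ref /forum*/showthread.php
print(is_disallowed("https://example.com/forum?ref=page", ["/forum"]))  # True: excluded from indexing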