Using robots.txt

robots.txt is a text file that contains site indexing parameters for search engine robots. In robots.txt, you can restrict bots from crawling pages of the site, which can reduce the load on the site and speed it up.

Note

Pages restricted in robots.txt can still appear in Yandex search results. To remove pages from search, specify the noindex directive in the HTML code of the page or configure the HTTP header. Do not restrict such pages in robots.txt: otherwise, the Yandex bot can't crawl them and won't detect your instructions. See details in the How to exclude pages from search section.

Yandex supports the Robots Exclusion Protocol with advanced features.

Requirements for the robots.txt file

Yandex robots process robots.txt correctly if:

  • The file size doesn't exceed 500 KB.

  • It is a plain text file named robots.txt.

  • The file is located in the root directory of the site.

  • The file is available to robots: the server that hosts the site responds with the HTTP status 200 OK. Check the server response.

If the file doesn't meet the requirements, the site is considered open for indexing.
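
The requirements above can be sketched as a small offline check. This is a minimal illustration, not Yandex's actual validation logic: the function name robots_txt_problems, the reading of "500 KB" as 500 × 1024 bytes, and the example URLs are all assumptions.

```python
from urllib.parse import urlsplit

# Assumption: "500 KB" is treated as 500 * 1024 bytes.
MAX_SIZE_BYTES = 500 * 1024


def robots_txt_problems(url, status, body):
    """Return a list of requirement violations for an already-fetched robots.txt.

    url    -- the URL the file was requested from
    status -- the HTTP status code the server returned
    body   -- the raw response body as bytes
    """
    problems = []
    # The file must be named robots.txt and sit in the site's root directory.
    if urlsplit(url).path != "/robots.txt":
        problems.append("file is not /robots.txt in the site root")
    # The server must answer with 200 OK.
    if status != 200:
        problems.append("server returned HTTP %d instead of 200" % status)
    # The file must not exceed the size limit.
    if len(body) > MAX_SIZE_BYTES:
        problems.append("file is larger than 500 KB")
    return problems


print(robots_txt_problems("https://example.com/robots.txt", 200, b"User-agent: *\n"))
print(robots_txt_problems("https://example.com/files/robots.txt", 404, b""))
```

An empty list means the file passes the listed checks. Any reported problem corresponds to the case where the site would be considered open for indexing.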

Yandex supports redirection from the robots.txt file located on one site to the file located on another site. In this case, the directives in the target file are taken into account. This redirect can be useful when moving the site.

Recommendations on the content of the file

Yandex supports the following directives:

Directive | What it does
User-agent * | Indicates the robot to which the rules listed in robots.txt apply.
Disallow | Prohibits crawling of sections or individual pages of the site.
Sitemap | Specifies the path to the Sitemap file that is posted on the site.
Clean-param | Indicates to the robot that the page URL contains parameters (like UTM tags) that should be ignored when indexing it.
Allow | Allows indexing site sections or individual pages.
Crawl-delay | Specifies the minimum interval (in seconds) for the search robot to wait after loading one page, before starting to load another. We recommend using the crawl speed setting in Yandex Webmaster instead of the directive.

* Mandatory directive.
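
To illustrate what "ignoring a parameter" means for Clean-param, here is a minimal Python sketch. It is not how the Yandex robot is implemented. The function name drop_ignored_params, the ref parameter, and the example URLs are hypothetical. The sketch only shows that URLs differing in an ignored parameter collapse to one address.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def drop_ignored_params(url, ignored, path_prefix="/"):
    """Remove the query parameters listed in `ignored` from `url`.

    Only URLs whose path starts with `path_prefix` are changed.
    """
    parts = urlsplit(url)
    if not parts.path.startswith(path_prefix):
        return url
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name not in ignored
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


# Hypothetical URLs that differ only in the `ref` tracking parameter.
urls = [
    "http://example.com/some_dir/get_book.pl?ref=site_1&book_id=123",
    "http://example.com/some_dir/get_book.pl?ref=site_2&book_id=123",
]
unique = {drop_ignored_params(u, {"ref"}, "/some_dir/get_book.pl") for u in urls}
print(unique)
```

Both addresses reduce to the same URL, so a robot that honors the rule would only need to load the page once.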

You'll most often need the Disallow, Sitemap, and Clean-param directives. For example:

User-agent: * # the rules below apply to all bots
Disallow: /bin/ # blocks crawling of shopping cart pages
Disallow: /search/ # blocks crawling of the site's built-in search pages
Disallow: /admin/ # blocks crawling of the admin panel
Sitemap: http://example.com/sitemap # points the bot to the site's Sitemap file
Clean-param: ref /some_dir/get_book.pl

Robots from other search engines and services may interpret the directives in a different way.
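
As one illustration of a different interpreter, the example file can be fed to Python's standard urllib.robotparser module. This is a sketch only: the module implements the generic exclusion protocol and silently skips directives it doesn't know, such as Clean-param, so its verdicts are not guaranteed to match those of the Yandex robot. The page URLs are hypothetical.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: *
Disallow: /bin/
Disallow: /search/
Disallow: /admin/
Sitemap: http://example.com/sitemap
Clean-param: ref /some_dir/get_book.pl
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Hypothetical page URLs used only for illustration.
for url in ("http://example.com/bin/order", "http://example.com/catalog/"):
    verdict = "allowed" if parser.can_fetch("*", url) else "disallowed"
    print(url, verdict)

# Sitemap lines are collected separately (available in Python 3.8+).
print(parser.site_maps())
```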

Note

The robot treats values (file names or paths, robot names) as case-sensitive and directive names as case-insensitive.

Using Cyrillic characters

Cyrillic characters are not allowed in the robots.txt file or in server HTTP headers.

For domain names, use Punycode. For page addresses, use the same encoding as that of the current site structure.

Example of the robots.txt file:

#Incorrect:
User-agent: Yandex
Disallow: /корзина
Sitemap: сайт.рф/sitemap.xml

#Correct:
User-agent: Yandex
Disallow: /%D0%BA%D0%BE%D1%80%D0%B7%D0%B8%D0%BD%D0%B0
Sitemap: http://xn--80aswg.xn--p1ai/sitemap.xml
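
The encoded values in the correct example can be produced with Python's standard library. This is a minimal sketch that assumes the site's page addresses use UTF-8. The Cyrillic path and domain below are illustrative inputs chosen because they encode to the values shown in the correct example.

```python
from urllib.parse import quote

# Illustrative Cyrillic inputs; UTF-8 is assumed for the path encoding.
path = "/корзина"
domain = "сайт.рф"

# Percent-encode the path as UTF-8 bytes ("/" is left as is).
encoded_path = quote(path)
# Convert each domain label to Punycode with the built-in "idna" codec.
encoded_domain = domain.encode("idna").decode("ascii")

print("Disallow: " + encoded_path)
print("Sitemap: http://" + encoded_domain + "/sitemap.xml")
```

If the site uses an encoding other than UTF-8 for its addresses, pass it to quote through the encoding argument.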

How to set up robots.txt

  1. In a text editor, create a file named robots.txt and add the directives you need to it.
  2. Check the file in Yandex.Webmaster.
  3. Place the file in your site's root directory.

Sample file. This file allows indexing of the entire site for all search engines.
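
The linked sample file isn't reproduced here. As a hedged sketch, one common way to write a file that allows everything is a User-agent: * group with Allow: /. The snippet below writes such a file to a temporary directory, which stands in for the site root, and checks it with Python's generic urllib.robotparser. Both the file content and the checked URL are assumptions, and the parser is not the Yandex robot.

```python
import tempfile
from pathlib import Path
from urllib.robotparser import RobotFileParser

# Assumption: one common way to say "everything is allowed for every robot".
ALLOW_ALL = "User-agent: *\nAllow: /\n"

with tempfile.TemporaryDirectory() as tmp:
    # The temporary directory is a stand-in for the site's root directory.
    robots_path = Path(tmp) / "robots.txt"
    robots_path.write_text(ALLOW_ALL, encoding="ascii")
    lines = robots_path.read_text(encoding="ascii").splitlines()

parser = RobotFileParser()
parser.parse(lines)

# Hypothetical page URL; any path should be reported as allowed.
print(parser.can_fetch("Yandex", "http://example.com/any/page.html"))
```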

Questions and answers

The “Server responds with redirect to /robots.txt request” error occurs on the “Site diagnostics” page in Yandex.Webmaster

For the robot to take the robots.txt file into account, the file must be located in the root directory of the site, and the server must respond with the HTTP 200 status code. The indexing robot doesn't support the use of files hosted on other sites.

To check the availability of the robots.txt file for the bot, check the server response.

If your robots.txt redirects to another robots.txt file (for example, when moving a site), Yandex takes into account the target robots.txt. Make sure that the correct directives are specified in this file. To check the file, add the target site in Yandex.Webmaster and verify your site management rights.
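
To see whether a server answers the /robots.txt request directly or with a redirect, a request can be made without following redirects. The sketch below uses Python's http.client for that. It is not the Yandex.Webmaster server response checker, and the function names, host names, and status grouping are assumptions made for illustration.

```python
import http.client
from typing import Optional, Tuple


def fetch_robots_status(host: str, use_https: bool = True) -> Tuple[int, Optional[str]]:
    """Request /robots.txt once, without following redirects."""
    connection_class = (
        http.client.HTTPSConnection if use_https else http.client.HTTPConnection
    )
    connection = connection_class(host, timeout=10)
    try:
        connection.request("GET", "/robots.txt")
        response = connection.getresponse()
        return response.status, response.getheader("Location")
    finally:
        connection.close()


def describe(status: int, location: Optional[str]) -> str:
    """Summarize what the status code means for robots.txt availability."""
    if status == 200:
        return "OK: robots.txt is served directly"
    if status in (301, 302, 303, 307, 308):
        return "redirect to %s: check the directives in the target file" % location
    return "unavailable: HTTP %d" % status


# Example with hypothetical values (no network request is made here).
print(describe(301, "https://new.example.com/robots.txt"))
# To query a real host: describe(*fetch_robots_status("example.com"))
```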
