Analyzer for robots.txt

How is robots.txt checked?

The analyzer automatically loads the robots.txt file from your site's root directory into the “robots.txt” field.

Once you click the Check button, the analyzer parses the content of the robots.txt text field line by line and checks the directives. You can also find out whether the robot is going to crawl the pages specified in the URL list field.
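
As a local illustration only (not the Yandex analyzer itself), the same kind of per-URL check can be sketched with Python's standard urllib.robotparser module; the site address and URLs below are hypothetical placeholders, and this module supports only the core robots.txt standard, not Yandex-specific directives.

    from urllib.robotparser import RobotFileParser

    # Fetch and parse the robots.txt file from the site's root directory.
    rp = RobotFileParser()
    rp.set_url("https://www.example.com/robots.txt")
    rp.read()

    # Check whether each listed page may be crawled by a given robot.
    for url in ["https://www.example.com/", "https://www.example.com/search/results"]:
        allowed = rp.can_fetch("Yandex", url)
        print(url, "is allowed" if allowed else "is disallowed")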

You can edit the rules to compose a robots.txt file that suits your site. Keep in mind that editing the rules in the analyzer does not change the file on the site itself. To apply the changes, upload the updated file to your site.

In the sections intended for the Yandex robot (User-agent: Yandex or User-agent: *), the analyzer checks the directives against the robots.txt terms of use supported by Yandex. Other sections are checked against the robots.txt standard. As the analyzer parses the file, it reports errors, warns you about inaccuracies in the rules, and lists the parts of the file intended for the Yandex robot.
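
For example, in a hypothetical robots.txt like the one below (the paths are placeholders), the Yandex robot follows the User-agent: Yandex section, including the Yandex-specific Clean-param directive, while other robots fall back to the User-agent: * section; the analyzer checks each section accordingly.

    User-agent: *
    Disallow: /search/

    User-agent: Yandex
    Disallow: /search/
    Clean-param: ref /catalog/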

The robots.txt analysis found some errors. How do I find out what caused them?

There are two types of messages from the analyzer: errors and warnings. They are described in the robots.txt analysis error reference.

An error means that the analyzer can't process a certain line, section or the entire file because of syntax errors in the directives.

A warning means there is a deviation from the rules that the analyzer cannot fix itself, or a potential problem (possible, but not necessarily real) caused by a typo or an inaccuracy in the rules.
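
As a hypothetical illustration (the actual messages are listed in the error reference):

    Disallow /admin

is an error: the colon after the directive name is missing, so the line cannot be parsed. By contrast, a line such as

    Disallow: admin

is parseable, but the path does not begin with a / or * character, which usually indicates a typo and may be reported as a warning.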

Why does the analyzer return the “This URL does not belong to your domain” error message?

Most likely, you included a mirror of your site in the URL list. For example, http://example.com instead of http://www.example.com (technically, these are two different URLs). The URLs in the list must belong to the site whose robots.txt is being checked.
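
Below is a minimal sketch of the check behind this message, assuming hypothetical hosts: every URL in the list must have the same host as the site whose robots.txt is analyzed.

    from urllib.parse import urlparse

    # Host of the site whose robots.txt is being checked (placeholder).
    site_host = urlparse("http://www.example.com").netloc

    for url in ["http://www.example.com/catalog", "http://example.com/catalog"]:
        if urlparse(url).netloc == site_host:
            print(url, "belongs to the site")
        else:
            print(url, "does not belong to your domain (for example, a mirror)")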