Robots.txt parsing errors


List of errors when parsing the robots.txt file.

| Error | Yandex extension | Description |
| --- | --- | --- |
| Rule doesn't start with / or * | Yes | A rule can only start with the / or * character. |
| Multiple 'User-agent: *' rules found | No | Only one rule of this type is allowed. |
| Several Host directives found | Yes | Only one Host directive is allowed. |
| Robots.txt file size limit exceeded | Yes | The number of rules in the file exceeds 2048. |
| No User-agent directive in front of rule | No | A rule must always follow a User-agent directive. Perhaps the file contains an empty line after User-agent. |
| Rule is too long | Yes | The rule exceeds the length limit (1024 characters). |
| Invalid main mirror name | Yes | The name of the site's main mirror in the Host directive contains a syntax error. |
| Invalid Sitemap file URL | Yes | The sitemap file URL must be specified in full, including the protocol (for example, https://example.com/sitemap.xml). |
| Invalid Crawl-delay directive format | Yes | The time in the Crawl-delay directive is specified incorrectly. |
| Multiple Crawl-delay directives found | Yes | Only one Crawl-delay directive is allowed. |
| Invalid Clean-param directive format | Yes | The Clean-param directive must contain one or more parameters that the robot should ignore, followed by a path prefix. Parameters are separated from each other by the & character and from the path prefix by a space. |
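The rules above can be illustrated with a hypothetical robots.txt file that passes all of these checks (the domain example.com and the paths are placeholders, not values from the analyzer):

```
User-agent: *                 # a single User-agent: * block
Disallow: /admin/             # rules start with / or *
Allow: /*.css$

# Clean-param: parameters joined by &, then a space, then the path prefix
Clean-param: ref&utm_source /catalog/
Crawl-delay: 2                # one Crawl-delay with a numeric value

Host: www.example.com         # exactly one Host directive
Sitemap: https://www.example.com/sitemap.xml  # full URL including the protocol
```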


Robots.txt parsing warnings

List of warnings when parsing the robots.txt file.

| Warning | Yandex extension | Description |
| --- | --- | --- |
| It's possible that an illegal character was used | Yes | The file contains a special character other than * and $. |
| Unknown directive found | Yes | The file contains a directive that isn't described in the rules for using robots.txt. It may be used by the robots of other search engines. |
| Syntax error | Yes | The string cannot be interpreted as a robots.txt directive. |
| Unknown error | Yes | An unknown error occurred while analyzing the file. Contact the support service. |
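In line with the first warning, the only special characters with a defined meaning in rules are * (matches any sequence of characters) and $ (anchors the rule at the end of the URL). A hypothetical sketch:

```
User-agent: *
Disallow: /search/*/results   # * matches any character sequence
Disallow: /*.pdf$             # $ matches only at the end of the URL
```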

URL validation errors

The list of URL validation errors in the robots.txt analyzer.

| Error | Description |
| --- | --- |
| Syntax error | The URL contains a syntax error. |
| This URL does not belong to your domain | The specified URL does not belong to the site for which the file is parsed. Perhaps you entered the address of a site mirror or misspelled the domain name. |
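As an illustration of these two checks, here is a minimal sketch using Python's standard urllib.robotparser. The domain example.com, the rules, and the check_url helper are assumptions for the example, not part of the analyzer:

```python
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules for the example.
RULES = """\
User-agent: *
Disallow: /private/
Allow: /
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())


def check_url(url: str, site_host: str = "example.com") -> bool:
    """Return True if the URL may be crawled; reject foreign domains."""
    host = urlsplit(url).hostname or ""
    if host != site_host:
        # Mirrors the "This URL does not belong to your domain" error.
        raise ValueError(f"{url} does not belong to {site_host}")
    return parser.can_fetch("*", url)


print(check_url("https://example.com/private/page"))  # blocked by Disallow
print(check_url("https://example.com/index.html"))    # allowed
```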