Determining a site's theme using the YAN robot

The YAN robot regularly crawls sites on the Advertising Network and, based on the content of each site, determines its theme for contextual ad display.

If your site is not accessible to the ad robot, contextual ads become less relevant to the site's theme, which in turn reduces your revenue.

About the Advertising Network robot

The name of the Advertising Network robot is YandexDirect. When it indexes pages of sites participating in the Yandex Advertising Network, it identifies itself with the following User-Agent string:

Mozilla/5.0 (compatible; YandexDirect/3.0)

Attention. Blocking the User-agent: Yandex robot in the robots.txt file may result in all Yandex robots being blocked, including the YAN robot.
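
To confirm that the robot actually reaches your server, one option is to look for its name in the User-Agent header of incoming requests or in your access logs. The sketch below is only an illustration: the function name is_yan_robot is hypothetical, the pattern assumes the robot keeps the YandexDirect/<version> form shown above, and a User-Agent header can be forged, so a match is a hint rather than proof.

```python
import re

# Pattern based on the User-Agent string shown above.
# Assumption: the version number after the slash may change over time.
_YAN_ROBOT = re.compile(r"\bYandexDirect/\d+(\.\d+)*", re.IGNORECASE)


def is_yan_robot(user_agent: str) -> bool:
    """Return True if the User-Agent header names the YandexDirect robot."""
    return bool(_YAN_ROBOT.search(user_agent or ""))


if __name__ == "__main__":
    print(is_yan_robot("Mozilla/5.0 (compatible; YandexDirect/3.0)"))
```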

To make sure that the YAN robot can crawl your site, place the following entry at the beginning of the robots.txt file in the site's root directory:

User-Agent: YandexDirect
Disallow:

Check your site's accessibility to the YAN robot

Using the “Check a site's accessibility to the YAN robot” tool, YAN partners can find out whether their site pages are accessible for indexing by the YandexDirect robot. The check is based on the rules written in the robots.txt file.

This tool lets you find out whether pages of a site were needlessly closed to indexing due to errors in the robots.txt file (for example, if the site was meant to be blocked from the search robot and left accessible only to the ad robot, but the rule was written incorrectly).

The tool is simple to use: in the interface, enter the address or list of addresses to be checked. If any of them turn out to be closed to indexing by the ad robot, the system displays a message saying so and, in some cases, suggests ways to resolve the problem.
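
Before using the tool, a rough local check of the same kind can be done with Python's standard urllib.robotparser module. This is a sketch under assumptions, not a description of how the tool or the robot works: Python's parser may interpret rules differently from the YAN robot, the helper name blocked_for and the example.com addresses are hypothetical, and the sample robots.txt uses a catch-all rule to stand in for "every other robot".

```python
from urllib.robotparser import RobotFileParser


def blocked_for(robot_name, robots_txt, urls):
    """Return the URLs that the given robots.txt text closes to robot_name."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(robot_name, url)]


# Hypothetical robots.txt: open to the ad robot, closed to all other robots.
ROBOTS_TXT = """\
User-agent: YandexDirect
Disallow:

User-agent: *
Disallow: /
"""

# Hypothetical addresses to check.
URLS = [
    "https://example.com/",
    "https://example.com/news/today.html",
]

if __name__ == "__main__":
    print("Closed to YandexDirect:", blocked_for("YandexDirect", ROBOTS_TXT, URLS))
    print("Closed to other robots:", blocked_for("SomeOtherBot", ROBOTS_TXT, URLS))
```

An empty list for YandexDirect suggests that the rules, as Python reads them, leave the listed pages open to the ad robot. The tool's verdict is the one that counts.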

Speed at which the YAN robot crawls sites

You can manage the speed at which the YAN robot will crawl your site by using the Crawl-delay directive in the robots.txt file.

The Crawl-delay directive determines how long the robot will pause before it loads each successive page of a site. If the robots.txt file or the directive in it is absent, the minimum pause duration is 2 seconds. This pause duration provides optimal indexing speed for most sites without creating excessive loads on their servers or hosting services. For example, it lets the YAN robot fully index a site consisting of several thousand pages within one day.
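
The "several thousand pages within one day" figure can be checked with simple arithmetic. The sketch below assumes the pause between pages is the only cost and ignores download and processing time, so it gives an upper bound on pages per day rather than a measured rate. The function names are hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def pages_per_day(crawl_delay_seconds: float) -> int:
    """Upper bound on pages fetched in a day when only the pause is counted."""
    if crawl_delay_seconds <= 0:
        raise ValueError("crawl delay must be positive")
    return int(SECONDS_PER_DAY // crawl_delay_seconds)


def crawl_time_hours(page_count: int, crawl_delay_seconds: float) -> float:
    """Lower bound on the hours needed to fetch page_count pages."""
    return page_count * crawl_delay_seconds / 3600


if __name__ == "__main__":
    print(pages_per_day(2))                      # pages per day at a 2-second pause
    print(round(crawl_time_hours(5000, 2), 2))   # hours for a 5000-page site
```

With a 2-second pause, the bound is 43,200 pages per day, and a 5,000-page site needs a little under three hours of pauses, which is consistent with the estimate above.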

Tip. For large sites, we recommend that you set the Crawl-delay directive to less than two seconds. Setting Crawl-delay to more than two seconds makes sense only if the YAN robot creates a noticeable load on the site and interferes with its normal operation.

Please keep in mind that a Crawl-delay value that is too high may reduce the quality of ads and, consequently, reduce your site's revenue.
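
As an illustration of the directive, the sketch below parses a hypothetical robots.txt that sets a one-second pause for the ad robot and reads the value back with Python's standard urllib.robotparser module. The value 1 is only an example, not a recommendation for any particular site. Python's parser reads only whole-number delays, which may not match how the YAN robot treats the directive.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: the ad robot may crawl everything,
# pausing one second between pages.
ROBOTS_TXT_WITH_DELAY = """\
User-agent: YandexDirect
Disallow:
Crawl-delay: 1
"""


def read_crawl_delay(robot_name, robots_txt):
    """Return the Crawl-delay that applies to robot_name, or None if unset."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.crawl_delay(robot_name)


if __name__ == "__main__":
    print(read_crawl_delay("YandexDirect", ROBOTS_TXT_WITH_DELAY))
```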

Tragic content

Yandex deems it unethical to display ads on pages with tragic content. We use a special filter to search pages' text for phrases that indicate tragic content, and such pages may be flagged as tragic content. However, for news feeds of “mass media” sites, the tragic content indication may be ignored.