Using the YAN robot to detect the theme of site content
The YAN robot regularly crawls sites on the Advertising Network and, based on the content of each site, determines its theme for contextual ad display.
When a site is not accessible to the ad robot, contextual ads become less relevant to your site theme, which in turn reduces your revenue.
About the Yandex Advertising Network robot
The name of the Yandex Advertising Network robot is YandexDirect. In User-Agent format, the robot that indexes pages of sites participating in the Yandex Advertising Network is represented as follows:
Mozilla/5.0 (compatible; YandexDirect/3.0)
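To spot YAN robot visits in your own server logs, you can match the User-Agent string shown above. The following is a minimal Python sketch; the function name parse_yan_robot and the second sample string are illustrative assumptions. Because any client can set its own User-Agent header, a match alone does not prove that a request came from Yandex.

```python
import re

# Matches the product token from the User-Agent shown above,
# e.g. "Mozilla/5.0 (compatible; YandexDirect/3.0)".
_YAN_TOKEN = re.compile(r"\bYandexDirect/(\d+(?:\.\d+)*)")


def parse_yan_robot(user_agent):
    """Return the YandexDirect version string, or None if the token is absent."""
    match = _YAN_TOKEN.search(user_agent)
    return match.group(1) if match else None


print(parse_yan_robot("Mozilla/5.0 (compatible; YandexDirect/3.0)"))  # 3.0
print(parse_yan_robot("Mozilla/5.0 (compatible; YandexBot/3.0)"))     # None
```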
Blocking the robot with User-agent: Yandex in the robots.txt file may result in all Yandex robots being blocked, including the YAN robot.
To make sure that the YAN robot crawls your site, the beginning of the robots.txt file in the root directory must have the following entry:
User-Agent: YandexDirect
Disallow:
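The effect of where the YandexDirect entry sits can be tried out locally. The sketch below uses Python's standard urllib.robotparser as a stand-in: it applies the first group whose name is a substring of the robot name, which is an assumption about matching and may differ from how Yandex robots actually resolve rules. The robots.txt contents, the helper name allowed, and the URL https://example.com/page.html are hypothetical.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_txt, agent, url="https://example.com/page.html"):
    """Parse robots.txt text and report whether `agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# YandexDirect entry at the beginning, then a block for other Yandex robots.
ad_robot_first = """\
User-Agent: YandexDirect
Disallow:

User-agent: Yandex
Disallow: /
"""

# Only the general Yandex block: with this parser, the ad robot is caught by it too.
yandex_only = """\
User-agent: Yandex
Disallow: /
"""

print(allowed(ad_robot_first, "YandexDirect"))  # True
print(allowed(ad_robot_first, "YandexBot"))     # False
print(allowed(yandex_only, "YandexDirect"))     # False
```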
Check your site's accessibility to the YAN robot
YAN partners can use Yandex.Webmaster to check whether their site pages can be indexed by the YandexDirect robot. The check is based on the rules written in the robots.txt file.
This tool lets you find out whether site pages were unintentionally closed to indexing because of errors in the robots.txt file (for example, when a site was meant to be blocked from the search robot but remain accessible to the ad robot, and the rule was written incorrectly).
The tool is simple to use: paste the contents of the robots.txt file or select a site to check. If pages turn out to be blocked from indexing by the ad robot, the system displays a corresponding message and, in some cases, suggests ways to resolve the problem.
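A rough local approximation of this kind of check can be written with Python's standard urllib.robotparser. This is only a sketch: it is not the Yandex.Webmaster tool, its rule matching may differ from that of Yandex, and the helper blocked_for_ad_robot, the sample rules, and the example.com URLs are all hypothetical.

```python
from urllib.robotparser import RobotFileParser


def blocked_for_ad_robot(robots_txt, urls, agent="YandexDirect"):
    """Return the URLs from `urls` that `agent` may not fetch under `robots_txt`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(agent, url)]


pages = ["https://example.com/", "https://example.com/news/1.html"]

# Intended: close the site to everyone except the ad robot.
# Mistake: "Disallow: /" in the YandexDirect group closes it to the ad robot too.
broken = """\
User-agent: *
Disallow: /

User-agent: YandexDirect
Disallow: /
"""

# Corrected: an empty Disallow value leaves every page open to the ad robot.
fixed = """\
User-agent: *
Disallow: /

User-agent: YandexDirect
Disallow:
"""

print(blocked_for_ad_robot(broken, pages))  # both pages are reported as blocked
print(blocked_for_ad_robot(fixed, pages))   # []
```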
Crawling speed of the YAN robot
You can manage the speed at which the YAN robot crawls your site by using the Crawl-delay directive in the robots.txt file.
The Crawl-delay directive determines how long the robot pauses before it loads each successive page of a site. If the robots.txt file or the directive in it is absent, the minimum pause duration is 2 seconds. This pause duration provides optimal indexing speed for most sites without creating excessive loads on their servers or hosting services. For example, it lets the YAN robot fully index a site consisting of several thousand pages within one day.
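The "several thousand pages within one day" estimate follows from simple arithmetic, sketched below in Python. The function name is illustrative, and the calculation assumes that page download time is negligible compared with the pause, which will not hold for every site.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def pages_per_day(pause_seconds):
    """Upper bound on pages crawled in a day with a fixed pause between page loads."""
    return SECONDS_PER_DAY // pause_seconds


print(pages_per_day(2))   # 43200
print(pages_per_day(30))  # 2880
```

Under these assumptions, a 2-second pause allows up to 43,200 page loads per day, well above several thousand, while a 30-second pause lowers the bound to 2,880.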
There is no point in setting the Crawl-delay variable to less than two seconds. Setting Crawl-delay to more than two seconds makes sense if the YAN robot creates a noticeable load on the site and interferes with its normal operation.
Keep in mind that a Crawl-delay value that is too high may reduce the quality of ads and, consequently, reduce your site's revenue.
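To see which Crawl-delay value a robots.txt file sets for the ad robot, Python's urllib.robotparser offers crawl_delay() (available since Python 3.6). The sketch below is an illustration, not a description of how the YAN robot reads the file: the standard-library parser only recognizes whole-number values, and the helper name ad_robot_crawl_delay and the sample file contents are hypothetical.

```python
from urllib.robotparser import RobotFileParser


def ad_robot_crawl_delay(robots_txt, agent="YandexDirect"):
    """Return the Crawl-delay (in seconds) that applies to `agent`, or None if unset."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.crawl_delay(agent)


with_delay = """\
User-Agent: YandexDirect
Disallow:
Crawl-delay: 5
"""

without_delay = """\
User-Agent: YandexDirect
Disallow:
"""

print(ad_robot_crawl_delay(with_delay))     # 5
print(ad_robot_crawl_delay(without_delay))  # None
```

A result of None means the file sets no delay for the robot; a value far above 2 is worth reviewing in light of the revenue warning above.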
Yandex considers it unethical to display ads on pages with tragic content. A special filter searches page text for phrases that indicate tragic content, and pages that match may be flagged accordingly. However, for news feeds of “mass media” sites, the tragic content flag may be ignored.