How to check that a robot belongs to Yandex

Some robots disguise themselves as Yandex robots by sending a Yandex User-agent string. You can verify the authenticity of a robot with a reverse DNS lookup.

Just follow these steps:

  1. Determine the IP address of the user agent in question using your server logs.
  2. Use a reverse DNS lookup of the IP address to determine the host domain name.
  3. Check whether the host belongs to Yandex. All Yandex robot host names end in yandex.ru, yandex.net or yandex.com. If the host name has a different ending, the robot does not belong to Yandex.
  4. Make sure that the name is correct. Use a forward DNS lookup to get the IP address corresponding to the host name. It should match the IP address used in the reverse DNS lookup. If the IP addresses do not match, the host name is fake.
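The steps above can be sketched in Python as follows. This is a minimal illustration, not an official Yandex tool: the function names are hypothetical, the suffix check assumes that "ends in yandex.ru" means the host name ends in ".yandex.ru" (with a leading dot, so that a name such as "notyandex.ru" is rejected), and the forward lookup used here covers IPv4 addresses only.

```python
import socket

# Assumption: the documented endings are matched with a leading dot.
YANDEX_SUFFIXES = (".yandex.ru", ".yandex.net", ".yandex.com")


def host_belongs_to_yandex(hostname):
    """Step 3: check the ending of the host name."""
    host = hostname.rstrip(".").lower()
    return host.endswith(YANDEX_SUFFIXES)


def is_yandex_robot(ip, reverse=socket.gethostbyaddr,
                    forward=socket.gethostbyname_ex):
    """Return True if the IP passes the reverse and forward DNS checks.

    The resolver functions are parameters so they can be replaced in tests.
    """
    try:
        # Step 2: reverse DNS lookup, IP address -> host name.
        hostname = reverse(ip)[0]
    except OSError:
        return False
    if not host_belongs_to_yandex(hostname):
        return False
    try:
        # Step 4: forward DNS lookup, host name -> list of IPv4 addresses.
        addresses = forward(hostname)[2]
    except OSError:
        return False
    # The original IP must be among the addresses of the host name.
    return ip in addresses
```

Calling is_yandex_robot() with an IP address taken from the server logs (step 1) performs real DNS queries, so in practice the result is usually cached per IP address.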

Yandex robots in server logs

A number of Yandex robots download web documents for purposes other than indexing. To avoid unintentional blocking by site owners, they may ignore restrictive robots.txt directives intended for arbitrary robots (User-agent: *).

In addition, robots may ignore some robots.txt restrictions for certain sites if there is an agreement between Yandex and the owners of those sites.

Note. If such a robot downloads a document that the main Yandex robot can't access, this document will never be indexed and won't be found in search results.

To restrict such robots' access to the site, use directives addressed specifically to them, for example:

User-agent: YandexCalendar
Disallow: /

User-agent: YandexMobileBot
Disallow: /private/*.txt$

Robots use a variety of IP addresses that change frequently. Therefore, their list is not disclosed.

The table below lists, for each robot: the robot's full name including the User-agent, the purpose of the robot, and whether it takes into account the general rules specified in robots.txt (Yes or No).
Mozilla/5.0 (compatible; YandexAccessibilityBot/3.0; +http://yandex.com/bots) YandexAccessibilityBot downloads pages to check their accessibility for users. It sends up to 3 requests to the site per second. The robot ignores the setting in Yandex.Webmaster. No
Mozilla/5.0 (compatible; YandexAdNet/1.0; +http://yandex.com/bots) The Yandex advertising network robot. Yes
Mozilla/5.0 (compatible; YandexBlogs/0.99; robot; +http://yandex.com/bots) The blog search robot that indexes post comments. Yes
Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots) The main indexing robot. Yes
Mozilla/5.0 (compatible; YandexBot/3.0; MirrorDetector; +http://yandex.com/bots) Detecting site mirrors. Yes
Mozilla/5.0 (compatible; YandexCalendar/1.0; +http://yandex.com/bots) The Yandex.Calendar robot. Downloads calendar files at users' request. These files are often located in directories prohibited from indexing. No
Mozilla/5.0 (compatible; YandexDirect/3.0; +http://yandex.com/bots) Downloads information about the content of Yandex Advertising network partner sites to identify their topic categories to match relevant advertising. No
Mozilla/5.0 (compatible; YandexDirectDyn/1.0; +http://yandex.com/bots) Generates dynamic banners. No
Mozilla/5.0 (compatible; YandexFavicons/1.0; +http://yandex.com/bots) Downloads the site's favicon file to display in search results. No
Mozilla/5.0 (compatible; YaDirectFetcher/1.0; Dyatel; +http://yandex.com/bots) Downloads target pages of ads to check their availability and topic. This is necessary for ad placement in the search results and on the partner sites. No. The robot doesn't use the robots.txt file and ignores the directives set for it.
Mozilla/5.0 (compatible; YandexForDomain/1.0; +http://yandex.com/bots) The Yandex.Mail for domain robot used to verify domain ownership rights. Yes
Mozilla/5.0 (compatible; YandexImages/3.0; +http://yandex.com/bots) Indexes images to display them in Yandex.Images. Yes
Mozilla/5.0 (compatible; YandexImageResizer/2.0; +http://yandex.com/bots) Mobile devices robot. Yes
Mozilla/5.0 (iPhone; CPU iPhone OS 8_1 like Mac OS X) AppleWebKit/600.1.4 (KHTML, like Gecko) Version/8.0 Mobile/12B411 Safari/600.1.4 (compatible; YandexBot/3.0; +http://yandex.com/bots) Indexing robot. Yes
Mozilla/5.0 (iPhone; CPU iPhone OS 8_1 like Mac OS X) AppleWebKit/600.1.4 (KHTML, like Gecko) Version/8.0 Mobile/12B411 Safari/600.1.4 (compatible; YandexMobileBot/3.0; +http://yandex.com/bots) Defines pages with layout suitable for mobile devices. No
Mozilla/5.0 (compatible; YandexMarket/1.0; +http://yandex.com/bots) The Yandex.Market robot. Yes
Mozilla/5.0 (compatible; YandexMarket/2.0; +http://yandex.com/bots) No
Mozilla/5.0 (compatible; YandexMedia/3.0; +http://yandex.com/bots) Indexes multimedia data. Yes
Mozilla/5.0 (compatible; YandexMetrika/2.0; +http://yandex.com/bots yabs01) Downloads site pages to check their availability, including landing pages of the Yandex.Direct ads. No. The robot doesn't use the robots.txt file and ignores the directives set for it.
Mozilla/5.0 (compatible; YandexMetrika/2.0; +http://yandex.com/bots) The Yandex.Metrica robot. No
Mozilla/5.0 (compatible; YandexMetrika/3.0; +http://yandex.com/bots) No
Mozilla/5.0 (compatible; YandexMetrika/4.0; +http://yandex.com/bots) The Yandex.Metrica robot. Downloads and caches the CSS styles to render site pages in Webvisor. No. The robot doesn't use the robots.txt file and ignores the directives set for it.
Mozilla/5.0 (compatible; YandexMobileScreenShotBot/1.0; +http://yandex.com/bots) Takes a screenshot of the mobile page. No
Mozilla/5.0 (compatible; YandexNews/4.0; +http://yandex.com/bots) The Yandex.News robot. Yes
Mozilla/5.0 (compatible; YandexOntoDB/1.0; +http://yandex.com/bots) The object response robot. Yes
Mozilla/5.0 (compatible; YandexOntoDBAPI/1.0; +http://yandex.com/bots) The object response robot that downloads dynamic data. No
Mozilla/5.0 (compatible; YandexPagechecker/1.0; +http://yandex.com/bots) Accesses the page for validating micro-markup via the Structured data validator. Yes
Mozilla/5.0 (compatible; YandexPartner/3.0; +http://yandex.com/bots) Downloads information about the content of Yandex partner sites. No
Mozilla/5.0 (compatible; YandexRCA/1.0; +http://yandex.com/bots) Collects data for generating previews. For example, wizard preview. No
Mozilla/5.0 (compatible; YandexSearchShop/1.0; +http://yandex.com/bots) Downloads product catalogs in YML files at users' request. These files are often placed in directories prohibited from indexing. No
Mozilla/5.0 (compatible; YandexSitelinks; Dyatel; +http://yandex.com/bots) Checks the availability of pages used as sitelinks. Yes
Mozilla/5.0 (compatible; YandexSpravBot/1.0; +http://yandex.com/bots) The Yandex.Directory robot. Yes
Mozilla/5.0 (compatible; YandexTracker/1.0; +http://yandex.com/bots) The Yandex.Tracker robot. No
Mozilla/5.0 (compatible; YandexTurbo/1.0; +http://yandex.com/bots) Crawls the RSS feed created to generate Turbo pages. It sends up to 3 requests to the site per second. The robot ignores the settings in Yandex.Webmaster and the Crawl-delay directive. Yes
Mozilla/5.0 (compatible; YandexVertis/3.0; +http://yandex.com/bots) Search verticals robot. Yes
Mozilla/5.0 (compatible; YandexVerticals/1.0; +http://yandex.com/bots) The Yandex.Verticals robot: Auto.ru, Yandex.Realty, Yandex.Rabota, Yandex.Reviews. Yes
Mozilla/5.0 (compatible; YandexVideo/3.0; +http://yandex.com/bots) Indexes video clips to display in Yandex.Video. Yes
Mozilla/5.0 (compatible; YandexVideoParser/1.0; +http://yandex.com/bots) Indexes video clips to display in Yandex.Video. No
Mozilla/5.0 (compatible; YandexWebmaster/2.0; +http://yandex.com/bots) The Yandex.Webmaster robot. Yes
Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2228.0 Safari/537.36 (compatible; YandexScreenshotBot/3.0; +http://yandex.com/bots) Takes a screenshot of the page. No
Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2228.0 Safari/537.36 (compatible; YandexMedianaBot/1.0; +http://yandex.com/bots) The Yandex.Mediana service robot. No
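
When reading server logs, it can help to pull the robot name out of the User-agent string so that it can be matched against the table above. The sketch below is one possible way to do this; the function name and the regular expression are assumptions based on the strings listed here, where the robot name follows "compatible;" and starts with "Ya". A matching name only labels the request: the User-agent can be forged, so authenticity still has to be checked with the DNS lookups described earlier.

```python
import re

# Assumption: the robot name follows "compatible;" and starts with "Ya",
# as in every User-agent string listed in the table above.
ROBOT_NAME = re.compile(r"compatible;\s*(Ya[A-Za-z]+)")


def yandex_robot_name(user_agent):
    """Return the robot name claimed by the User-agent, or None."""
    match = ROBOT_NAME.search(user_agent)
    return match.group(1) if match else None


# Example with a string taken from the table above.
name = yandex_robot_name(
    "Mozilla/5.0 (compatible; YandexImages/3.0; +http://yandex.com/bots)"
)
```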

FAQ

How do I protect myself from fake robots that pretend to be Yandex robots?

To protect yourself against fake robots, filter requests using a reverse DNS lookup. This method is preferable to managing access by IP addresses, as it is more resistant to changes in the Yandex internal networks.

There is too much traffic going back and forth between my web server and your robot. Does Yandex support downloading of compressed pages?

Yes, it does. Each time the Yandex robot requests a page, it sends the header "Accept-Encoding: gzip,deflate". This means you can set up your web server to reduce the traffic between the server and our robot. However, note that sending compressed content increases CPU usage on your server. If the server is overloaded, this can cause problems. For gzip and deflate downloads, the robot follows RFC 2616, section 3.5.
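
Most web servers enable compression through their own configuration. Purely as an illustration of the mechanism, the sketch below shows how an application could compress a response body when the request header allows gzip. The function names are hypothetical, and the header parsing is simplified: it ignores quality values such as "gzip;q=0".

```python
import gzip


def accepts_gzip(accept_encoding):
    """Check whether "gzip" is listed in an Accept-Encoding header value.

    Simplification: quality values (for example ";q=0") are ignored.
    """
    codings = [item.split(";")[0].strip().lower()
               for item in accept_encoding.split(",")]
    return "gzip" in codings


def encode_body(body, accept_encoding):
    """Return (body, extra_headers), compressed if the client allows gzip."""
    if accepts_gzip(accept_encoding):
        headers = {"Content-Encoding": "gzip", "Vary": "Accept-Encoding"}
        return gzip.compress(body), headers
    return body, {}


# The header value sent by the robot, as quoted above.
page = b"<html>" + b"x" * 1000 + b"</html>"
compressed, extra_headers = encode_body(page, "gzip,deflate")
```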

The robot creates an excessive load on the site or server

The indexing robot plans site page visits by itself, adjusting the load on the site or server automatically depending on how many new or already indexed pages of the site need to be crawled.

Sometimes the number of robot requests can increase dramatically, for example, if the robot finds out about a new site section, about changes in the site structure or about new page URLs. To reduce the server load, you can:
  • Check the server logs and disallow indexing of technical pages using the Disallow directive in the robots.txt file.
  • Add the Crawl-delay directive in the robots.txt file.
  • Change the site crawl rate in Yandex.Webmaster. If you use this method, the robot doesn't take into account the Crawl-delay directive.
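
As a rough illustration of the first two options, the sketch below parses a sample robots.txt with Python's standard urllib.robotparser module and checks how the rules would be read. The "User-agent: Yandex" group, the /cgi-bin/ path, and the delay of 2 seconds are assumptions chosen for the example. The standard-library parser is not the Yandex parser, so this only checks the file for obvious mistakes.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: a technical directory is closed
# and a delay between requests is requested.
ROBOTS_TXT = """\
User-agent: Yandex
Disallow: /cgi-bin/
Crawl-delay: 2
"""


def load_rules(text):
    """Parse robots.txt content given as a string."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


rules = load_rules(ROBOTS_TXT)

# Delay (in seconds) that applies to a robot named YandexBot.
delay = rules.crawl_delay("YandexBot")

# Whether the technical page and a regular page may be fetched.
technical_allowed = rules.can_fetch("YandexBot", "/cgi-bin/stats.pl")
article_allowed = rules.can_fetch("YandexBot", "/articles/1.html")
```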

If you contact Yandex support about this problem, make sure to include the server logs in your message. This will help us solve the problem faster.