How to check that a robot belongs to Yandex

Some robots disguise themselves as Yandex robots by sending a matching User-Agent string. You can check whether a robot is authentic with a reverse DNS lookup.

Just follow these steps:

  • Find the IP address of the client in question in your server logs. All Yandex robots identify themselves in the User-Agent header.

  • Run a reverse DNS lookup on that IP address to get the host's domain name.

  • After determining the host name, check whether it belongs to Yandex. All Yandex robots have host names ending in 'yandex.ru', 'yandex.net' or 'yandex.com'. If the host name has a different ending, the robot does not belong to Yandex.

  • Finally, make sure that the host name is genuine. Use a forward DNS lookup to get the IP address corresponding to the host name. It should match the IP address used in the reverse DNS lookup. If the IP addresses do not match, the host name is fake.
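
The steps above can be sketched in Python as follows. This is a minimal illustration, not an official Yandex tool: the names is_yandex_hostname, reverse_lookup, forward_lookup and is_yandex_robot are hypothetical, and the sketch assumes that a host name "ending in" a Yandex domain should be matched on a full domain label (so that a name such as fakeyandex.ru is rejected). The two lookup functions are passed in as parameters so that the logic can be exercised without network access.

```python
import ipaddress
import socket

# Domain endings named in the steps above. The leading dot is an assumption
# of this sketch: it forces the match to fall on a label boundary.
YANDEX_SUFFIXES = (".yandex.ru", ".yandex.net", ".yandex.com")


def is_yandex_hostname(hostname):
    """Return True if the host name ends in one of the Yandex domains."""
    name = hostname.rstrip(".").lower()
    return name.endswith(YANDEX_SUFFIXES)


def reverse_lookup(ip):
    """Reverse DNS: IP address -> host name, or None if there is no record."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        return None


def forward_lookup(hostname):
    """Forward DNS: host name -> set of IP address strings (IPv4 and IPv6)."""
    try:
        infos = socket.getaddrinfo(hostname, None)
    except OSError:
        return set()
    return {info[4][0] for info in infos}


def is_yandex_robot(ip, reverse=reverse_lookup, forward=forward_lookup):
    """Apply the reverse lookup, domain check and forward lookup to one IP."""
    hostname = reverse(ip)
    if not hostname or not is_yandex_hostname(hostname):
        return False
    # The forward lookup must lead back to the address taken from the logs.
    target = ipaddress.ip_address(ip)
    return any(ipaddress.ip_address(addr) == target for addr in forward(hostname))
```

With the default resolvers, a call such as is_yandex_robot("203.0.113.5") (a placeholder address from the documentation range) performs real DNS queries, so results for busy sites are worth caching.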

Yandex robots in server logs

Yandex has many robots, which identify themselves with the following User-Agent strings:

  • Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots) — The main indexing robot.

  • Mozilla/5.0 (iPhone; CPU iPhone OS 8_1 like Mac OS X) AppleWebKit/600.1.4 (KHTML, like Gecko) Version/8.0 Mobile/12B411 Safari/600.1.4 (compatible; YandexBot/3.0; +http://yandex.com/bots) — Indexing robot.
  • Mozilla/5.0 (compatible; YandexAccessibilityBot/3.0; +http://yandex.com/bots) — Downloads pages to check user accessibility. Interprets robots.txt in a special way.
  • Mozilla/5.0 (iPhone; CPU iPhone OS 8_1 like Mac OS X) AppleWebKit/600.1.4 (KHTML, like Gecko) Version/8.0 Mobile/12B411 Safari/600.1.4 (compatible; YandexMobileBot/3.0; +http://yandex.com/bots) — Determines if the page layout is suitable for mobile devices. Interprets robots.txt in a special way.
  • Mozilla/5.0 (compatible; YandexDirectDyn/1.0; +http://yandex.com/bots) — Generates dynamic banners. Interprets robots.txt in a special way.
  • Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2228.0 Safari/537.36 (compatible; YandexScreenshotBot/3.0; +http://yandex.com/bots) — Makes a snapshot of a page. Interprets robots.txt in a special way.
  • Mozilla/5.0 (compatible; YandexImages/3.0; +http://yandex.com/bots) — The Yandex.Images indexing robot.

  • Mozilla/5.0 (compatible; YandexVideo/3.0; +http://yandex.com/bots) — The Yandex.Video indexing robot.

  • Mozilla/5.0 (compatible; YandexVideoParser/1.0; +http://yandex.com/bots) — The Yandex.Video indexing robot. Interprets robots.txt in a special way.
  • Mozilla/5.0 (compatible; YandexMedia/3.0; +http://yandex.com/bots) — Multimedia data indexer.

  • Mozilla/5.0 (compatible; YandexBlogs/0.99; robot; +http://yandex.com/bots) — The blog search robot. Indexes comments to posts.

  • Mozilla/5.0 (compatible; YandexFavicons/1.0; +http://yandex.com/bots) — The favicons indexer.

  • Mozilla/5.0 (compatible; YandexWebmaster/2.0; +http://yandex.com/bots) — The Yandex.Webmaster indexing robot.

  • Mozilla/5.0 (compatible; YandexPagechecker/1.0; +http://yandex.com/bots) — The robot that validates markup submitted through the Structured data validator form.

  • Mozilla/5.0 (compatible; YandexImageResizer/2.0; +http://yandex.com/bots) — The mobile services robot.

  • Mozilla/5.0 (compatible; YaDirectFetcher/1.0; Dyatel; +http://yandex.com/bots) — Downloads the ads' landing pages to check their availability and topic. This is necessary for ad placement in the search results and on the partner sites. When crawling a site, the robot does not use the robots.txt file and ignores the directives set for it.

  • Mozilla/5.0 (compatible; YandexCalendar/1.0; +http://yandex.com/bots) — The Yandex.Calendar robot used for syncing with other calendars. Interprets robots.txt in a special way.

  • Mozilla/5.0 (compatible; YandexSitelinks; Dyatel; +http://yandex.com/bots) — The sitelinks “fetcher” used for checking the availability of the pages detected as sitelinks.

  • Mozilla/5.0 (compatible; YandexMetrika/2.0; +http://yandex.com/bots) — The Yandex.Metrica robot. Interprets robots.txt in a special way.

  • Mozilla/5.0 (compatible; YandexNews/4.0; +http://yandex.com/bots) — The Yandex.News robot.

  • Mozilla/5.0 (compatible; YandexVertis/3.0; +http://yandex.com/bots) — Vertical search robot.

  • Mozilla/5.0 (compatible; YandexBot/3.0; MirrorDetector; +http://yandex.com/bots) — The robot detecting site mirrors.
  • Mozilla/5.0 (compatible; YandexSearchShop/1.0; +http://yandex.com/bots) — The robot that regularly downloads product catalogues in YML files at users' request. These files are often placed in directories prohibited from indexing. Interprets robots.txt in a special way.
  • Mozilla/5.0 (compatible; YandexVerticals/1.0; +http://yandex.com/bots) — The robot of Yandex.Verticals: Auto.ru, Yandex.Realty, Yandex.Job, Yandex.Reviews.
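
To pick candidate requests out of a server log, the robot name can be extracted from the User-Agent string. The sketch below is one possible approach, not a Yandex-provided pattern: the regular expression is an assumption derived only from the strings listed above, and the name yandex_robot_name is hypothetical. Because the User-Agent can be forged, a match only marks a request as a candidate for the DNS check described earlier.

```python
import re

# Assumed pattern: every string listed above contains either a token that
# starts with "Yandex" or the token "YaDirectFetcher". The lowercase
# "yandex.com" in the URL is deliberately not matched.
ROBOT_TOKEN = re.compile(r"\b(Yandex[A-Za-z]+|YaDirectFetcher)\b")


def yandex_robot_name(user_agent):
    """Return the claimed Yandex robot name, or None if there is none."""
    match = ROBOT_TOKEN.search(user_agent)
    return match.group(1) if match else None


if __name__ == "__main__":
    sample = "Mozilla/5.0 (compatible; YandexImages/3.0; +http://yandex.com/bots)"
    print(yandex_robot_name(sample))
```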

There are many IP addresses that Yandex robots can “originate” from, and these addresses change frequently. We are therefore unable to offer a list of IP addresses and we do not recommend using a filter based on IP addresses.