Site indexing

Add your site to Yandex Webmaster.
Sitemap. A Sitemap is a file format that lets webmasters describe the structure of a site to search engines. It is a list of links to the site's internal pages in XML format. Yandex supports this format. You can upload a Sitemap for your website on a dedicated page in Yandex Webmaster and use it to indicate the crawl priority of particular pages. For example, if some pages are updated more often than others, state this in the Sitemap so the robot can plan its crawling accordingly.
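As an illustration of the format described above, here is a minimal sketch that builds a Sitemap with Python's standard library. The tag names (loc, lastmod, changefreq, priority) come from the public Sitemap protocol. The example.com pages and the build_sitemap helper are assumptions made up for this example.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Return a Sitemap XML string for an iterable of page dicts.

    Each dict needs "loc"; "lastmod", "changefreq" and "priority" are optional.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page in pages:
        url = ET.SubElement(urlset, "url")
        # Emit child tags in the order used by the Sitemap protocol.
        for tag in ("loc", "lastmod", "changefreq", "priority"):
            if tag in page:
                ET.SubElement(url, tag).text = str(page[tag])
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical pages, used only for illustration.
pages = [
    {"loc": "https://example.com/", "changefreq": "daily", "priority": 1.0},
    {"loc": "https://example.com/about", "changefreq": "yearly", "priority": 0.3},
]
sitemap_xml = build_sitemap(pages)
print(sitemap_xml)
```

The string returned here has no XML declaration. When writing a real file, ET.ElementTree(...).write(path, encoding="utf-8", xml_declaration=True) is one way to add it.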
Robots.txt is a file for search engine robots. In this file, the webmaster can specify indexing parameters for all robots or for each search engine separately. Here are the most important parameters you can specify in this file:

Disallow

This directive prohibits indexing of certain site sections. Use it to keep technical pages, and pages that aren't important to users or search engines, out of the index.

For more information, see Using robots.txt.
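To check what a set of Disallow rules blocks before publishing them, one option is Python's built-in urllib.robotparser. The robots.txt content and paths below are hypothetical. The standard-library parser implements only the basic prefix-matching rules, so treat it as a rough approximation of how the Yandex robot reads the file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: the paths are illustrative, not taken from the article.
ROBOTS_TXT = """\
User-agent: Yandex
Disallow: /admin/
Disallow: /search/

User-agent: *
Disallow: /admin/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_allowed(user_agent, url):
    """Return True if the parsed rules let user_agent fetch url."""
    return parser.can_fetch(user_agent, url)


print(is_allowed("Yandex", "https://example.com/search/?q=shoes"))    # False
print(is_allowed("Yandex", "https://example.com/catalog/shoes"))      # True
print(is_allowed("OtherBot", "https://example.com/search/?q=shoes"))  # True
```

The first group applies only to the Yandex robot, and the second group applies to every other robot. This matches the idea of setting parameters for all robots or for each search engine separately.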

Clean-param

Use this directive to tell the robot which CGI parameters in the page URL are insignificant. Sometimes page URLs contain session identifiers. Formally, pages with different IDs are different pages, but their content is the same. If a site has many pages of this kind, the robot may spend its time indexing these duplicates instead of downloading useful content. For more information, see Using robots.txt.
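The effect of declaring a parameter insignificant can be modeled with a short sketch: URLs that differ only in that parameter collapse to one address. The sid parameter, the forum path, and the strip_params helper are assumptions for illustration. This is not how the Yandex robot is implemented.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# A robots.txt line with this intent might look like (hypothetical values):
#   Clean-param: sid /forum/showthread.php


def strip_params(url, insignificant, path_prefix="/"):
    """Drop the listed query parameters if the URL path starts with path_prefix."""
    parts = urlsplit(url)
    if not parts.path.startswith(path_prefix):
        return url
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in insignificant
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


# Two URLs that differ only in the session identifier.
urls = [
    "https://example.com/forum/showthread.php?sid=aaa111&t=8243",
    "https://example.com/forum/showthread.php?sid=bbb222&t=8243",
]
canonical = {strip_params(u, {"sid"}, "/forum/showthread.php") for u in urls}
print(canonical)  # {'https://example.com/forum/showthread.php?t=8243'}
```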
Yandex indexes the main types of documents distributed online. But there are limitations that affect how the document is indexed and whether it is indexed at all:
- A large number of CGI parameters in a URL, a large number of nested directories, and overly long URLs may interfere with document indexing.
- The size of the document matters for indexing. Documents larger than 10 MB aren't indexed.
- Indexing Flash:
  1. The robot indexes *.swf files if there is a direct link to them or they are embedded in the HTML with the "object" or "embed" tags.
  2. If a Flash file contains useful content, the original HTML document can be found by the content indexed in the SWF file.
- In PDF documents, only text content is indexed. Text represented as images is not indexed.
- Yandex indexes documents in the Office Open XML and OpenDocument formats (including Microsoft Office and OpenOffice documents). Support for new formats can take some time to appear.
- You can use the <frameset> and <frame> tags. The Yandex robot indexes the content loaded in them and finds the source document based on the contents of the frames.
If you configure custom server behavior for non-existent URLs (for example, a custom error page), make sure that the server still returns the 404 status code. Once the search engine receives the 404 code, it removes the document from the index. Also make sure that all pages that should be in the index respond with the 200 OK code.
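A minimal sketch of this behavior as a WSGI application: known pages answer with 200 OK, and any other URL gets a custom error page together with a real 404 status. The page set and the status_for helper are hypothetical and exist only for this example.

```python
# Hypothetical set of pages that exist on the site.
EXISTING_PAGES = {"/": b"<h1>Home</h1>", "/about": b"<h1>About</h1>"}


def app(environ, start_response):
    """Serve known paths with 200 and everything else with a custom 404 page."""
    path = environ.get("PATH_INFO", "/")
    body = EXISTING_PAGES.get(path)
    if body is None:
        # Custom error page, but the status code is still 404 rather than 200.
        status = "404 Not Found"
        body = b'<h1>Page not found</h1><p><a href="/">Home</a></p>'
    else:
        status = "200 OK"
    headers = [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response(status, headers)
    return [body]


def status_for(path):
    """Call the app directly (no network) and return the status line it sends."""
    captured = []
    app({"PATH_INFO": path}, lambda status, headers: captured.append(status))
    return captured[0]


print(status_for("/about"))    # 200 OK
print(status_for("/no-such"))  # 404 Not Found
```

To try it over HTTP, the app can be served locally with wsgiref.simple_server.make_server("", 8000, app).serve_forever().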
Make sure that the HTTP headers are correct. In particular, the server must respond properly to requests that carry the If-Modified-Since header, and the Last-Modified header must contain the document's actual last modification date.
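The sketch below shows one way a server could handle such conditional requests, following standard HTTP semantics: it answers 304 Not Modified when the document has not changed since the date the client sent, and 200 with a Last-Modified header otherwise. The conditional_response function and the dates are assumptions for illustration. Production web servers and frameworks usually provide this logic already.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def conditional_response(last_modified, if_modified_since=None):
    """Return (status, headers) for a document changed at last_modified (UTC)."""
    # HTTP dates have one-second resolution, so drop microseconds first.
    last_modified = last_modified.replace(microsecond=0)
    headers = {"Last-Modified": format_datetime(last_modified, usegmt=True)}
    if if_modified_since:
        try:
            since = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            since = None  # Unparseable header: ignore it and send the document.
        # Only compare when the parsed date carries a time zone.
        if since is not None and since.tzinfo is not None and last_modified <= since:
            return 304, headers
    return 200, headers


# Hypothetical modification time of a page.
modified = datetime(2024, 5, 1, 12, 0, tzinfo=timezone.utc)

print(conditional_response(modified)[0])                                   # 200
print(conditional_response(modified, "Wed, 01 May 2024 12:00:00 GMT")[0])  # 304
print(conditional_response(modified, "Tue, 30 Apr 2024 12:00:00 GMT")[0])  # 200
```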

Note

Manage the Yandex robot and prohibit indexing for pages that are not intended for users.

Contact support

If pages are accessible to the robot and have been sent for reindexing, but do not appear in search results for more than two weeks, contact Yandex support.
