How do I exclude pages from the search?
Sometimes you need to exclude a site page from search results, for example, if it contains confidential information, is a duplicate of another page, or was deleted from the site.
- Step 1. Prohibit the page or directory indexing
- Step 2. Speed up the page removal
- How do I return a page to the search results?
- FAQ
Step 1. Prohibit the page or directory indexing
There are several ways to do this:
- If the page is removed from the site:
  - Add the Disallow directive in the robots.txt file.
  - Configure the server so that when the robot accesses the page URL, it responds with the HTTP status 404 Not Found, 403 Forbidden, or 410 Gone. For the convenience of users, we recommend setting up a redirect with the HTTP 301 code.
- If the page should not be displayed in the search:
  - Add the Disallow directive in the robots.txt file.
  - Specify the robots meta tag with the noindex directive.
To check whether the instructions in the robots.txt file are correct, use the Robots.txt analysis tool.
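As a quick local sanity check before using the analysis tool, the sketch below parses a sample robots.txt with Python's standard `urllib.robotparser` and reports whether a URL is blocked. The rules and URLs are hypothetical, and Python's parser is not the Yandex robot, so treat the result as a rough approximation only.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: block one page and one directory for all robots.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/page.html
Disallow: /catalogue/
"""


def is_disallowed(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Return True if the robots.txt text forbids user_agent from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "http://example.com/private/page.html",
        "http://example.com/catalogue/item1",
        "http://example.com/index.html",
    ):
        verdict = "disallowed" if is_disallowed(ROBOTS_TXT, url) else "allowed"
        print(url, "->", verdict)
```

With these sample rules, the first two URLs are reported as disallowed and `http://example.com/index.html` as allowed, because no rule matches its path.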
Exclusion method | The robot's behavior |
---|---|
Prohibition in the robots.txt file | The robot stops accessing the page within 24 hours. |
HTTP status with the 404, 403 or 410 code | The robot continues to visit the page for some time to make sure that its status doesn't change. If the page remains unavailable, the robot stops crawling it. |
The robots meta tag with the noindex directive |
When the robot visits the site and finds that indexing of the page is prohibited, the page disappears from the search results within a week. The URL of the deleted page then appears in the list of excluded pages in Yandex.Webmaster.
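One of the prohibitions the robot can find is the robots meta tag from Step 1. Below is a minimal sketch of checking a page's HTML for that tag with Python's standard `html.parser`. The sample markup and the names `RobotsMetaParser` and `has_noindex` are illustrative assumptions, not part of any Yandex tool.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values may be None (e.g. <meta name>), so normalise them.
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            for part in attr.get("content", "").split(","):
                directive = part.strip().lower()
                if directive:
                    self.directives.add(directive)


def has_noindex(html: str) -> bool:
    """Return True if the HTML contains a robots meta tag with noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


# Hypothetical page that should stay out of the search results.
PAGE = """<html><head>
<meta name="robots" content="noindex">
<title>Hidden page</title>
</head><body>...</body></html>"""
```

Calling `has_noindex(PAGE)` returns `True` for this sample, while a page without the tag yields `False`.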
Excluding pages that violate copyright from the search is not a priority task for the robot. To exclude such a page from the search, use the methods described in this section.
Pages excluded from search results can be displayed in Yandex.Webmaster until the next site crawl.
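For the HTTP status method from this step, here is a minimal sketch of a server that answers 410 Gone for removed pages and 301 for pages that have moved, built on Python's standard `http.server`. The paths, the port, and the names `status_for` and `run` are hypothetical. A production site would normally configure these responses in its web server instead.

```python
from http import HTTPStatus
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical paths, for illustration only.
REMOVED_PATHS = {"/old-page.html"}          # pages deleted from the site
MOVED_PATHS = {"/moved.html": "/new.html"}  # pages that have a new address


def status_for(path: str) -> HTTPStatus:
    """Pick the HTTP status the server should send for a request path."""
    if path in REMOVED_PATHS:
        return HTTPStatus.GONE
    if path in MOVED_PATHS:
        return HTTPStatus.MOVED_PERMANENTLY
    return HTTPStatus.OK


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        status = status_for(self.path)
        self.send_response(status)
        if status is HTTPStatus.MOVED_PERMANENTLY:
            # A 301 response tells the client where the page lives now.
            self.send_header("Location", MOVED_PATHS[self.path])
        self.end_headers()
        if status is HTTPStatus.OK:
            self.wfile.write(b"page content")


def run(port: int = 8000) -> None:
    """Start the demo server on localhost; stop it with Ctrl+C."""
    HTTPServer(("127.0.0.1", port), Handler).serve_forever()
```

Call `run()` to try it locally. A request for `/old-page.html` then receives 410 Gone, and `/moved.html` is redirected to `/new.html`.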
Step 2. Speed up the page removal
To speed up the removal of a page from the search, ask Yandex to remove it without waiting for the scheduled robot crawl.
If your site isn't added or isn't verified in Yandex.Webmaster:
- Go to the Remove pages from search results page in Yandex.Webmaster.
- Enter the URL of the page to exclude in the field, for example http://example.com/page.html.
- Click the Remove button.
To exclude multiple pages from the search, remove them one by one.
If your site is added to Yandex.Webmaster and you confirmed your site management rights:
- Go to the page.
- Set the radio button to By URL.
- Enter the page URL in the field, for example http://example.com/page.html.
- Click the Remove button.
You can specify up to 500 URLs per site per day.
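Because of the daily limit, a long list of URLs has to be spread over several days. Below is a minimal sketch of splitting such a list into daily batches. The helper name and sample URLs are hypothetical, and the requests themselves are still submitted through the Yandex.Webmaster interface.

```python
from typing import Iterator, List

DAILY_URL_LIMIT = 500  # limit stated above: up to 500 URLs per site per day


def daily_batches(urls: List[str], limit: int = DAILY_URL_LIMIT) -> Iterator[List[str]]:
    """Yield consecutive slices of urls, each no longer than limit."""
    for start in range(0, len(urls), limit):
        yield urls[start:start + limit]


if __name__ == "__main__":
    # Hypothetical list of 1200 pages to remove.
    pages = [f"http://example.com/page{i}.html" for i in range(1200)]
    for day, batch in enumerate(daily_batches(pages), start=1):
        print(f"Day {day}: {len(batch)} URLs")
```

For 1200 URLs this produces three batches of 500, 500, and 200 URLs.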
You can delete all site pages, individual directories, or pages with the specified parameters in the URL, if your site is added to Yandex.Webmaster and you verified your site management rights.
- In Yandex.Webmaster, go to the page.
- Set the radio button to By prefix.
- Specify the prefix:

What to delete | Example |
---|---|
Site directory | http://example.com/catalogue/ |
All site pages | http://example.com/ |
URL with parameters | http://example.com/page? |

You can send up to 20 prefixes per site per day.
- Click the Remove button.
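To preview what a prefix would cover before you submit it, you can filter your own URL list. The sketch below assumes plain string-prefix matching, which the examples above suggest but the text does not spell out. The URL list and the helper name are hypothetical.

```python
from typing import List


def urls_under_prefix(prefix: str, urls: List[str]) -> List[str]:
    """Return the URLs that start with prefix (assumed matching rule)."""
    return [url for url in urls if url.startswith(prefix)]


# Hypothetical list of pages on the site.
SITE_URLS = [
    "http://example.com/",
    "http://example.com/catalogue/",
    "http://example.com/catalogue/item?id=1",
    "http://example.com/page?sort=asc",
    "http://example.com/page.html",
]

if __name__ == "__main__":
    for prefix in (
        "http://example.com/catalogue/",
        "http://example.com/page?",
        "http://example.com/",
    ):
        print(prefix, "->", urls_under_prefix(prefix, SITE_URLS))
```

Under this assumption, the `http://example.com/page?` prefix selects only the URL with parameters and leaves `http://example.com/page.html` alone, while `http://example.com/` selects every page in the list.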
After the URL is sent to Yandex.Webmaster, you can track changes in its status:

Status | Description |
---|---|
“In the delete queue” | The robot checks the server response and whether the page is prohibited from indexing. The check can take several minutes. |
“In progress” | The robot checked the page. The page will be removed from search results within 24 hours. |
“Deleted” | The page was removed from the search results. |
“Rejected” | The page is allowed for indexing, or the server response to the robot's request for the page URL is something other than 404 Not Found, 403 Forbidden, or 410 Gone. |
How do I return a page to the search results?
Remove the indexing prohibition: the Disallow directive in the robots.txt file or the noindex meta tag. The pages return to the search results when the robot crawls the site and finds out about the changes. This may take up to three weeks.
FAQ
If you use a redirect, the robot will gradually track redirects and the old pages will disappear from the search results as it crawls the site. For the robot to learn about the changes faster, send the pages for reindexing.
If the page URLs change because you changed the site's domain name, the search data update may take more than a month. Check if the mirrors are configured correctly.