Viewing examples of pages that appeared in search or were removed from search

Returns sample URLs of pages that appeared in search results or were excluded from search. At most 50,000 samples are available.

Request format

GET https://api.webmaster.yandex.net/v4/user/{user-id}/hosts/{host-id}/search-urls/events/samples
  ? [offset=<int32>]
  & [limit=<int32>]

user-id

Type: int64. User ID. Required when calling all Yandex.Webmaster API resources. To get it, use the GET /v4/user method.

host-id

Type: host id (string). The site ID. To get it, use the GET /v4/user/{user-id}/hosts method.

offset

The list offset. The minimum value is 0.

limit

Page size (1-100). Default value: 50.
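As a minimal sketch, the request URL described above could be assembled like this. The function name build_samples_url is hypothetical, and the sketch assumes that the colons in a host ID such as http:ya.ru:80 can stay unescaped in the path. Sending the request and passing the OAuth token are left out.

```python
from urllib.parse import quote, urlencode

API_ROOT = "https://api.webmaster.yandex.net/v4"


def build_samples_url(user_id, host_id, offset=0, limit=50):
    """Build the URL for the search-urls/events/samples resource."""
    if offset < 0:
        raise ValueError("offset must be at least 0")
    if not 1 <= limit <= 100:
        raise ValueError("limit must be between 1 and 100")
    # Assumption: colons in the host ID are kept as-is in the path.
    host_part = quote(host_id, safe=":")
    path = f"/user/{int(user_id)}/hosts/{host_part}/search-urls/events/samples"
    query = urlencode({"offset": offset, "limit": limit})
    return f"{API_ROOT}{path}?{query}"


print(build_samples_url(1, "http:ya.ru:80", offset=100, limit=100))
```

The range checks mirror the documented limits (offset of at least 0, page size from 1 to 100), so an invalid value fails locally before any request is made.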

Response format

Examples

{
  "count": 1,
  "samples": [
    {
      "url": "http://example.com/some/path?a=b",
      "title": "some string",
      "event_date": "2016-01-01T00:00:00,000+0300",
      "last_access": "2016-01-01T00:00:00,000+0300",
      "event": "APPEARED_IN_SEARCH",
      "excluded_url_status": "NOTHING_FOUND",
      "bad_http_status": 500,
      "target_url": "http://example.com/some/path?a=b"
    }
  ]
}

<Data>
  <count>1</count>
  <sample>
    <url>http://example.com/some/path?a=b</url>
    <title>some string</title>
    <event_date>2016-01-01T00:00:00,000+0300</event_date>
    <last_access>2016-01-01T00:00:00,000+0300</last_access>
    <event>APPEARED_IN_SEARCH</event>
    <excluded_url_status>NOTHING_FOUND</excluded_url_status>
    <bad_http_status>500</bad_http_status>
    <target_url>http://example.com/some/path?a=b</target_url>
  </sample>
</Data>
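The JSON variant of the response could be decoded as in the following sketch. It assumes that timestamps always use the layout shown in the example (a comma before the milliseconds and a UTC offset without a colon). The names parse_samples and TIMESTAMP_FORMAT are hypothetical, and the body is a shortened copy of the example.

```python
import json
from datetime import datetime

# Layout of the timestamps in the example response (assumed to be stable).
TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%S,%f%z"

RESPONSE_BODY = """
{
  "count": 1,
  "samples": [
    {
      "url": "http://example.com/some/path?a=b",
      "title": "some string",
      "event_date": "2016-01-01T00:00:00,000+0300",
      "last_access": "2016-01-01T00:00:00,000+0300",
      "event": "APPEARED_IN_SEARCH"
    }
  ]
}
"""


def parse_samples(body):
    """Return (count, samples) with the datetime fields converted."""
    data = json.loads(body)
    samples = []
    for raw in data["samples"]:
        sample = dict(raw)
        for field in ("event_date", "last_access"):
            sample[field] = datetime.strptime(raw[field], TIMESTAMP_FORMAT)
        samples.append(sample)
    return data["count"], samples


count, samples = parse_samples(RESPONSE_BODY)
print(count, samples[0]["url"], samples[0]["event_date"].isoformat())
```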

count

Type: int32. Required. Total number of available examples.

sample (XML) | samples (JSON)

Required. Sample pages.

url

Type: url. Required. Page address.

title

Type: string. Required. Page heading.

event_date

Type: datetime. Required. Date when a page appeared or was excluded.

last_access

Type: datetime. Required. The date when the page was last crawled before it appeared or was excluded.

event

Type: string (ApiSearchEventEnum). Required. The appearance or removal of the page.

excluded_url_status

Type: string (ApiExcludedUrlStatus). Optional. The reason the page was excluded.

bad_http_status

Type: int32. Optional. The page's HTTP response code for the HTTP_ERROR status.

target_url

Type: url. Optional. Another address of the page that the robot is aware of. This could be a redirect target, canonical address or a duplicate page.
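Because a single request returns at most 100 samples, collecting all of them means advancing offset until the total in count is reached. The sketch below illustrates that loop. fetch_page is a hypothetical callable standing in for the real HTTP request, and fake_fetch_page simulates it with made-up data.

```python
def iter_samples(fetch_page, limit=100):
    """Yield every sample by advancing offset until count is reached.

    fetch_page(offset, limit) is a hypothetical callable that performs the
    request and returns the decoded JSON response as a dict.
    """
    offset = 0
    while True:
        page = fetch_page(offset, limit)
        samples = page.get("samples", [])
        yield from samples
        offset += len(samples)
        # Stop when the reported total is reached or a page comes back empty.
        if not samples or offset >= page["count"]:
            break


# Stand-in for the real API call, serving 250 made-up samples.
FAKE_URLS = [f"http://example.com/page/{i}" for i in range(250)]


def fake_fetch_page(offset, limit):
    chunk = FAKE_URLS[offset:offset + limit]
    return {"count": len(FAKE_URLS), "samples": [{"url": u} for u in chunk]}


print(sum(1 for _ in iter_samples(fake_fetch_page)))
```

Passing the fetch function in as a parameter keeps the paging logic independent of any particular HTTP client.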

Site page status in search results (ApiSearchEventEnum)

Indicator

Description

APPEARED_IN_SEARCH

The page appeared in search results.

REMOVED_FROM_SEARCH

The page was removed from search results.

Reasons for excluding the page from search results (ApiExcludedUrlStatus)

Indicator

Description

NOTHING_FOUND

The robot doesn't know about the page, or it was unavailable for a long time. Submit the page for reindexing.

HOST_ERROR

The robot could not connect to the server when trying to access the site. Check the server response and make sure that the Yandex robot isn't blocked by the hosting provider. The site is indexed automatically when it becomes available to the robot. For information about the robots' user agents, see the help section.

REDIRECT_NOTSEARCHABLE

The page redirects to another page, so the redirect target is indexed instead. Check the indexing of the target page.

HTTP_ERROR

An HTTP error occurred when the robot accessed the page. Check the server response. If the problem persists, contact your site administrator or the server administrator. If the page is already available, submit it for reindexing.

NOT_CANONICAL

The page is indexed under the canonical URL specified in the rel="canonical" attribute in its source code. Correct or delete the attribute if it is specified incorrectly. The robot will track the changes automatically.

NOT_MAIN_MIRROR

The page belongs to a secondary site mirror, so it was excluded from the search.

PARSER_ERROR

When trying to access the page, the robot couldn't get its content. Check the server response or the presence of prohibiting HTML elements. If the problem persists, contact your site administrator or the server administrator. If the page is already available, send it for reindexing.

ROBOTS_HOST_ERROR

Site indexing is prohibited in the robots.txt file. The robot will automatically start crawling the page when the site becomes available for indexing.

ROBOTS_URL_ERROR

Page indexing is prohibited in the robots.txt file. The robot will automatically crawl the page when it becomes available for indexing.

DUPLICATE

The page duplicates a site page that is already in the search. For more information, see the help section.

LOW_QUALITY

The page has been removed from search results due to low quality as determined by a special algorithm. If the algorithm finds the page relevant to users' search queries, it will appear in the search automatically.

CLEAN_PARAMS

The page was excluded from the search after the robot processed the Clean-param directive. To get the page indexed, edit the robots.txt file.

NO_INDEX

The page is excluded because the robots meta tag has the noindex value.

OTHER

The robot does not have updated data for the page.

Check the server response or the presence of prohibiting HTML elements.

If the page can't be accessed by the robot, contact the administrator of your site or server. If the page is already available, send it for reindexing.
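One way to act on these statuses is to count the exclusion reasons in a batch of samples and pick out the ones whose descriptions suggest resubmitting the page. The following is a sketch only. summarize_exclusions, REINDEX_HINTED and the UNSPECIFIED placeholder are hypothetical names, and the sample data is made up.

```python
from collections import Counter

# Statuses whose descriptions above mention submitting the page for
# reindexing (in some cases only once the page is available again).
REINDEX_HINTED = {"NOTHING_FOUND", "HTTP_ERROR", "PARSER_ERROR", "OTHER"}


def summarize_exclusions(samples):
    """Count exclusion reasons among REMOVED_FROM_SEARCH samples."""
    reasons = Counter()
    for sample in samples:
        if sample.get("event") != "REMOVED_FROM_SEARCH":
            continue
        # excluded_url_status is optional, so fall back to a placeholder.
        reasons[sample.get("excluded_url_status", "UNSPECIFIED")] += 1
    return reasons


samples = [
    {"event": "REMOVED_FROM_SEARCH", "excluded_url_status": "HTTP_ERROR", "bad_http_status": 500},
    {"event": "REMOVED_FROM_SEARCH", "excluded_url_status": "DUPLICATE"},
    {"event": "REMOVED_FROM_SEARCH", "excluded_url_status": "HTTP_ERROR", "bad_http_status": 404},
    {"event": "APPEARED_IN_SEARCH"},
]

reasons = summarize_exclusions(samples)
to_resubmit = sorted(reason for reason in reasons if reason in REINDEX_HINTED)
print(dict(reasons), to_resubmit)
```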

Response codes

200 OK

403 INVALID_USER_ID

The ID of the user who issued the token differs from the one specified in the request. In the examples below, {user_id} shows the correct uid of the OAuth token owner.

{
    "error_code": "INVALID_USER_ID",
    "available_user_id": 1,
    "error_message": "Invalid user id. {user_id} should be used."
}

<Data>
    <error_code>INVALID_USER_ID</error_code>
    <available_user_id>1</available_user_id>
    <error_message>Invalid user id. {user_id} should be used.</error_message>
</Data>

404 HOST_NOT_VERIFIED

Site management rights are not verified.

{
    "error_code": "HOST_NOT_VERIFIED",
    "host_id": "http:ya.ru:80",
    "error_message": "some string"
}

<Data>
    <error_code>HOST_NOT_VERIFIED</error_code>
    <host_id>http:ya.ru:80</host_id>
    <error_message>some string</error_message>
</Data>
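A client could turn these error bodies into exceptions roughly as follows. This is a sketch under stated assumptions: WebmasterApiError and check_response are hypothetical names, and the caller is assumed to supply the HTTP status code and the JSON body as a string.

```python
import json


class WebmasterApiError(Exception):
    """Hypothetical exception type carrying the fields of an error body."""

    def __init__(self, status, payload):
        self.status = status
        self.error_code = payload.get("error_code")
        self.payload = payload
        message = payload.get("error_message")
        super().__init__(f"{status} {self.error_code}: {message}")


def check_response(status, body):
    """Return the decoded body for 200, raise WebmasterApiError otherwise."""
    payload = json.loads(body)
    if status == 200:
        return payload
    raise WebmasterApiError(status, payload)


ERROR_BODY = """
{
    "error_code": "INVALID_USER_ID",
    "available_user_id": 1,
    "error_message": "Invalid user id. {user_id} should be used."
}
"""

try:
    check_response(403, ERROR_BODY)
except WebmasterApiError as err:
    if err.error_code == "INVALID_USER_ID":
        # The body reports which user ID the token is actually valid for.
        print("retry with user-id", err.payload["available_user_id"])
```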

error_code

Error code.

available_user_id

ID of the user who allowed access.

host_id

ID of the requested site.

error_message

Error message.