Indexing office documents and Flash files

Yandex indexes more than just HTML documents. It also indexes the following document types: PDF and Flash (Adobe Systems); DOC/DOCX, XLS/XLSX, PPT/PPTX (MS Office); ODS, ODP, ODT, ODG (OpenOffice); RTF; TXT.

Restrictions on the indexed data:

  • In PDF documents, only text content is indexed. Text represented as images is not indexed.

  • In Flash documents, the text from the following blocks is indexed:

    • DefineText.

    • DefineText2.

    • DefineEditText.

    • Metadata.

    Links are indexed if they are in these blocks:

    • DoAction.

    • DefineButton.

    • DefineButton2.

  • When new versions of this software are released, it may take some time before the new formats are supported.

  • Documents larger than 10 MB aren't indexed.
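
The restrictions above can be checked locally before publishing a file. The following is a minimal sketch in Python, not a description of how Yandex itself processes documents. The function names is_candidate and swf_tag_names are hypothetical. It assumes that Flash documents use the .swf extension, that "10 MB" means 10 * 1024 * 1024 bytes, and that only top-level SWF tags matter (tags nested inside sprites are not inspected). The numeric tag codes come from the SWF file format specification. LZMA-compressed files (ZWS signature) are not handled.

```python
import struct
import zlib
from pathlib import Path

# Extensions derived from the supported types listed above.
# Assumption: Flash documents are published as .swf files.
SUPPORTED_EXTENSIONS = {
    ".pdf", ".swf",
    ".doc", ".docx", ".xls", ".xlsx", ".ppt", ".pptx",
    ".ods", ".odp", ".odt", ".odg",
    ".rtf", ".txt",
}

# Assumption: "10 MB" is read as 10 * 1024 * 1024 bytes.
MAX_SIZE_BYTES = 10 * 1024 * 1024

# Tag codes as defined in the SWF file format specification.
TEXT_TAGS = {11: "DefineText", 33: "DefineText2", 37: "DefineEditText", 77: "Metadata"}
LINK_TAGS = {12: "DoAction", 7: "DefineButton", 34: "DefineButton2"}


def is_candidate(name: str, size_bytes: int) -> bool:
    """Return True if the file type is listed above and the size is within the limit."""
    return (
        Path(name).suffix.lower() in SUPPORTED_EXTENSIONS
        and size_bytes <= MAX_SIZE_BYTES
    )


def swf_tag_names(data: bytes) -> set:
    """Return the names of top-level tags that carry indexable text or links."""
    signature = data[:3]
    if signature == b"CWS":        # zlib-compressed body after the 8-byte header
        body = zlib.decompress(data[8:])
    elif signature == b"FWS":      # uncompressed body
        body = data[8:]
    else:
        raise ValueError("not an FWS/CWS Flash file")

    # Skip the frame-size RECT: the first 5 bits give the width of each of 4 fields.
    nbits = body[0] >> 3
    rect_len = (5 + 4 * nbits + 7) // 8
    pos = rect_len + 4             # plus frame rate (2 bytes) and frame count (2 bytes)

    found = set()
    while pos + 2 <= len(body):
        (header,) = struct.unpack_from("<H", body, pos)
        pos += 2
        code, length = header >> 6, header & 0x3F
        if length == 0x3F:         # long tag header: a 32-bit length follows
            if pos + 4 > len(body):
                break
            (length,) = struct.unpack_from("<I", body, pos)
            pos += 4
        name = TEXT_TAGS.get(code) or LINK_TAGS.get(code)
        if name:
            found.add(name)
        if code == 0:              # End tag
            break
        pos += length              # skip the tag payload
    return found
```

For example, is_candidate("report.PDF", 1024) is true, while a .docx file one byte over the limit is rejected. If swf_tag_names returns an empty set for a Flash file, the file contains none of the top-level blocks listed above, so its text and links are unlikely to be indexed.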