Audio transcription

  1. Create a project
  2. Add a task pool
  3. Upload tasks
  4. Set up quality control
  5. Start the pool and get the results
  6. Let performers check the responses

Run the project in the Sandbox first. This helps you avoid making mistakes and spending money on a task that isn't working right.

You can publish tasks for transcribing short audio recordings. We recommend that all the recordings in a pool are the same length.

You may need additional projects for your task, such as dataset pre-check or checking performers' responses. Learn more about this in Designing the solution architecture.

Let's say you need to transcribe an audio recording. To do this, create a task that provides an audio recording in the built-in player. The performer has to type the text they hear on the recording.

To run tasks and get responses:

Create a project

The project defines what the task will look like for a performer.

  1. Click + Create project and select the Transcribing audio recordings template.

  2. Provide general information:

    1. Enter a clear name and a short description for the project. Performers will see this in the task list.

    2. Optionally add a Private comment.
    3. Click Save.
  3. Edit the task interface:

    1. The task interface describes how the elements should be arranged in the task.

      Note. This tutorial shows how to create a task interface in the HTML/JS/CSS editor. You can also try creating a task interface in the Template Builder.

      You can use standard HTML tags and special expressions in double curly brackets for input and output data fields.

      Leave the JavaScript unchanged. It is configured to check whether the entire audio has been played. The performer won't be able to submit the response without listening to all audio recordings in the task.

    2. The template includes input and output data fields:

      • Input data field: audio, which contains a link to an audio file.

        Change the data type to string to add links to your files or to use audio files stored on Yandex.Disk.

      • Output data fields:
        • The speech string for recording the value of the "Is anyone speaking on the audio recording?" field.
        • The user_text string for recording the text entered by the performer.
        • The clean_text string for recording the processed text (used for checking responses using control tasks).
      What are input and output data?

      Input data consists of the objects that are passed to the performer for completing the task. For example, this could be a text, an image, or geographic coordinates.

      Output data consists of the objects that you receive after the task is completed. For example, this could be one of several response options, typed text, or an uploaded file.

      Learn more about input and output data fields.

      You can use this list of fields or customize it for your tasks. If you add interface elements to the task template, create fields for them in the Data specification block.

    3. Open the preview to see the performer's view of the task.

      Note. The project preview shows one task with standard data. You can define the number of tasks to show on the page later.
    4. Click Save.
  4. Write instructions for performers:

    1. Write short and clear guidelines (see the recommendations). Describe what needs to be done and give examples.

      You can prepare the instructions in HTML format, then copy and paste them into the editor. Click <> to switch to HTML mode.

    2. Click Finish.
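
The clean_text output field stores a processed version of the performer's text so that it can be compared with control responses. The template does this processing in its own JavaScript, which isn't shown here. The sketch below only illustrates the idea in Python: the function name normalize_transcript and the exact normalization steps (lowercasing, dropping ASCII punctuation, collapsing whitespace) are assumptions for illustration, not the template's actual logic.

```python
import re
import string

# Map every ASCII punctuation character to a space.
# Assumption: punctuation never carries meaning for the comparison.
_PUNCT_TO_SPACE = str.maketrans({ch: " " for ch in string.punctuation})


def normalize_transcript(text: str) -> str:
    """Return a comparison-friendly form of a transcript (hypothetical helper)."""
    lowered = text.lower()
    # Replace punctuation with spaces so "hello,world" does not become "helloworld".
    without_punct = lowered.translate(_PUNCT_TO_SPACE)
    # Collapse runs of whitespace and trim both ends.
    return re.sub(r"\s+", " ", without_punct).strip()


if __name__ == "__main__":
    print(normalize_transcript("  Hello,   World! "))
```

With this kind of normalization, "Hello, world." and "hello world" compare as equal. Note that apostrophes are also treated as punctuation here, so "don't" becomes "don t"; adjust the rules if that matters for your recordings.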

Add a task pool

A pool is a set of paid tasks sent out for completion at the same time.

  1. Open the project and click Add pool.
  2. Give the pool any convenient name. Only you can see it. Performers see only the name of the project.
  3. Set the price per task suite (for instance, $0.05). The price depends on the length of the audio recordings.
    What is a task suite?

    A page can contain one or several tasks. If the tasks are simple, you can add 10-20 tasks per page. Don't make pages too long because it slows down loading speed for performers.

    Performers get paid for completing the whole page.

    The number of tasks on the page is set when uploading tasks.

    What is the fair price for a task suite?

    The general rule of pricing: the more time a performer needs to complete the task, the higher the price should be.

    You can register in Toloka as a performer and find out how much other requesters pay for tasks, or see examples of cost for different types of tasks.

  4. Add Filters to choose performers.

    It is best to launch transcription tasks in the Toloka web version so that performers can use the keyboard for typing. Click Add filter, select the “Device category” filter in the “Calculated data” section and set its value to “Personal computer”.

  5. Turn on the Non-automatic acceptance option and enter the number of days for checking the task in the Deadline field (for example, 7).
    What is non-automatic acceptance (assignment review)?

    The non-automatic acceptance option allows you to review completed task suites before accepting them and paying for them. If the performer didn't follow instructions, you can reject the task. The maximum allowed period for the review is set in the Deadline field.

  6. Set the Overlap, which is the number of performers who complete the same task. For speech transcription, it is usually 1.
  7. Set the Time allowed for completing a task suite (for example, 1200 seconds). This time should be enough to read the instructions, load the task, listen to the audio recordings, and type the text.
  8. Save the pool.
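
Because performers are paid per task suite, a rough budget follows from the number of tasks, the number of tasks per page, the price, and the overlap. The sketch below is a back-of-the-envelope estimate only. The function name estimate_pool_cost is hypothetical, it assumes that a partly filled last page is paid at the full price, and it ignores any platform fee or re-completion of rejected assignments.

```python
import math
from decimal import Decimal


def estimate_pool_cost(task_count: int, tasks_per_suite: int,
                       price_per_suite: str, overlap: int = 1) -> Decimal:
    """Estimate the total paid to performers for one pool (hypothetical helper).

    price_per_suite is passed as a string (for example "0.05") so that
    Decimal keeps the amount exact.
    """
    if task_count < 0 or tasks_per_suite <= 0 or overlap <= 0:
        raise ValueError("counts must be positive")
    # Assumption: an incomplete last page still counts as one paid task suite.
    suites = math.ceil(task_count / tasks_per_suite)
    return Decimal(price_per_suite) * suites * overlap


if __name__ == "__main__":
    # 100 recordings, 4 per page, $0.05 per page, overlap 1 -> 25 paid pages.
    print(estimate_pool_cost(100, 4, "0.05", overlap=1))
```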

Upload tasks

Prepare your own task file. Check out the example in a demo TSV file. You can find it on the pool page. At the top-left of the page, there are links to TSV files with regular, control, and training tasks.

  1. Click Upload. In the window that opens, you can also download a sample TSV file by clicking Sample file for uploading tasks.
    What is TSV?
    A TSV file presents a table as a text file in which columns are separated by tabs.

    You can work with it both in a table editor and a text editor, and then save it to the desired format. More about working with a TSV file. There is a CSV format that is similar to TSV, but you should use a TSV file for uploading.

  2. Add input data to the file. The header of the input data column contains the word INPUT. Use URL links to your files as values. If you don't have links, upload your files to Yandex.Cloud or Yandex.Disk.

    To use files on Yandex.Disk, you will need to slightly change the project and specification. Set the string data type for the input data field in which you will pass the file link. In the HTML block, add proxy to the audio player before the name of the audio input field: src="{{proxy audio}}". The link format when using Yandex.Disk is <unique name>/audio1.mp3, where the unique name is the name of your proxy.

  3. Upload the tasks: choose Set manually and set the number of tasks (for example, 4 tasks per page). This means that there will be 4 audio recordings per page, each recording with a text field for transcription.
  4. Click Add to upload your tasks to the pool.
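
If you have many recordings, you can generate the task file with a script instead of editing it by hand. The sketch below writes one link per row under a single input column. It assumes that the column header has the form INPUT:<field name> (check the sample file from the pool page to confirm) and that the input field is called audio, as in the template. The function name build_tasks_tsv and the example links are hypothetical.

```python
import csv
import io


def build_tasks_tsv(audio_links, field_name="audio"):
    """Return TSV text with one task per audio link (hypothetical helper)."""
    buffer = io.StringIO()
    # Tabs separate columns; "\n" keeps line endings the same on every OS.
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    # Assumption: the header is "INPUT:" followed by the input field name.
    writer.writerow([f"INPUT:{field_name}"])
    for link in audio_links:
        writer.writerow([link])
    return buffer.getvalue()


if __name__ == "__main__":
    links = [
        "https://example.com/audio/rec1.mp3",  # absolute URL
        "yadisk/audio1.mp3",                   # relative Yandex.Disk link
    ]
    with open("tasks.tsv", "w", encoding="utf-8", newline="") as tsv_file:
        tsv_file.write(build_tasks_tsv(links))
```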

Set up quality control

Quality control rules allow you to filter out inattentive performers. You can configure quality control both in the project and in the pool.

Quality control settings configured in the project are applied to all project pools, so you can't change them in just one of the pools.

    To configure quality control in a pool, go to pool editing (the Edit button in the upper-right corner of the page) and click Add Quality Control Rule.

    You can copy quality control settings from another pool. To do this, click Copy settings from in the Users filter section.

  1. Add a restriction for Fast responses.

    The Minimum time per page value depends on two characteristics: the number of tasks on the page and the length of the audio recordings. In this example, there are four tasks per page and the audio length is unknown, so the threshold is an estimate.

    Make allowances for technical errors, such as recordings that fail to load or play. A performer will submit responses to such tasks quickly, and this shouldn't count as a violation. For this reason, add two rules.

    • One is to catch bots. Set 10-15 seconds per response. Ban performers after two fast responses.

      This means that a user who completes two or more task suites in less than 10 seconds will be blocked for 10 days and won't be able to complete your tasks.

    • The second rule is for detecting performers who don't take tasks seriously, retype texts inattentively, make mistakes, or skip words. In this case, the Minimum time per task suite value depends on the length of recordings and their amount on the page, as well as on how difficult it is to type text (it's hard to hear, there is jargon, problems with transcribing, and so on). Ban performers after three fast responses.

      This means that a user who gives a minimum of 3 responses in less than 30 seconds will be blocked for 5 days and won't be able to complete your tasks.

  2. Add the Control tasks block to filter out performers who often make mistakes.


    Add control responses when two conditions are met:

    How to create a TSV file with control tasks

    1. To create control tasks, mark up the tasks in the interface.

    2. When marking, put checkmarks next to the clean_text and speech fields and skip the user_text field. The clean_text field contains the processed text, so the comparison with the control response ignores differences such as extra spaces, capitalization, and commas.

    1. Click Add Quality Control Rule.

    2. Find the Rules block in the list and choose Control tasks.

    3. Set a rule for control tasks: if the number of responses to the control questions is ≥ 3 and correct responses (%) to the control questions is < 60, then ban the performer on the project for 10 days. Specify Control task as a reason.

      This means that if a performer completed at least three control tasks and answered fewer than 60% of them correctly, they will be blocked and won't be able to complete tasks on this project for 10 days.

  3. Add the Review results quality control rule and enter the following values:

    This means that if 35% or more of a performer's responses are rejected, the performer is banned and can't access your tasks for 15 days. The rule takes effect after 3 responses of the performer are reviewed.

  4. Add Processing rejected and accepted assignments. When the overlap value is "1", you should resend assignments to the pool for other performers to redo them.

    This means that if you reject assignments during the review, they'll be sent for re-completion, but to another performer.

  5. Create a skill. This is useful if you plan to create a separate project for reviewing performers' responses. To do this, go to the Skills page, click +Add skill and enter the skill name, for example, "Transcriber".
    What is a skill?
    A skill is an assessment of some aspect of the performer's work (a number from 0 to 100). A skill can be assigned to the performer for correct responses in control tasks. It can also be assigned arbitrarily.

    You can use the skill value when choosing performers.

  6. Add the Submitted answers section and enter the following values:

    This means that the skill is assigned to the performer if they completed at least one task.
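
The thresholds in the rules above are easier to check against a few sample numbers. Toloka applies the rules itself, so the sketch below is only a local illustration of the conditions described in this section. All function names are hypothetical, and the default values are the example values from the steps above.

```python
def should_ban_for_fast_responses(durations_sec, min_seconds, max_fast):
    """True if the number of too-fast task suites reaches the ban threshold."""
    fast = sum(1 for duration in durations_sec if duration < min_seconds)
    return fast >= max_fast


def should_ban_for_control_tasks(total_answers, correct_answers,
                                 min_answers=3, min_correct_percent=60):
    """True if there are enough control answers and too few are correct."""
    if total_answers < min_answers:
        return False  # the rule is not applied yet
    correct_percent = 100 * correct_answers / total_answers
    return correct_percent < min_correct_percent


def should_ban_for_rejections(reviewed, rejected,
                              min_reviewed=3, max_rejected_percent=35):
    """True if enough responses were reviewed and too many were rejected."""
    if reviewed < min_reviewed:
        return False  # the rule is not applied yet
    rejected_percent = 100 * rejected / reviewed
    return rejected_percent >= max_rejected_percent


if __name__ == "__main__":
    # Two task suites submitted in under 10 seconds trigger the bot rule.
    print(should_ban_for_fast_responses([8, 45, 9], min_seconds=10, max_fast=2))
    # 3 of 5 control tasks correct is exactly 60%, so the performer is not banned.
    print(should_ban_for_control_tasks(total_answers=5, correct_answers=3))
    # 1 of 3 reviewed responses rejected is about 33%, below the 35% threshold.
    print(should_ban_for_rejections(reviewed=3, rejected=1))
```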

Start the pool and get the results

  1. Start the pool.
  2. Track the completion of tasks in the Pool statistics section.
  3. When the first results are received, you can start the review. After the period specified in the Deadline field, all responses are automatically accepted, regardless of their quality.

    To review assignments, go to the pool and click Review assignments.

Let performers check the responses

Send the results to other performers for review as tasks. To make these tasks available only to performers who didn't transcribe the audio recordings, set a filter.

  1. Go to the pool and click Download results.
  2. Create a project with the classification type.
  3. Create a task interface that shows:
    • An audio recording in the audio player.
    • The transcript of the recording.
    • Response options.
      • The text fully matches the audio recording.
      • Minor mistakes were made in the text.
      • The recording is not fully transcribed.
      • The text doesn't match the audio recording.

    Add the assignment_id field to the input data where you will pass the ID of the response to be checked.

  4. Add a pool and set Overlap to 3 in it.
  5. Add a filter to choose performers who don't have the transcription skill.
  6. Upload tasks to the pool and start it.
  7. When the pool is fully completed, start aggregation of results.
  8. Accept transcription tasks without errors. Reject the rest, specifying the reason.
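
With an overlap of 3, each transcript gets three verdicts that have to be combined into one decision. Toloka has its own aggregation, which this tutorial relies on. The sketch below is a simplified majority vote written only to illustrate the idea. The function name aggregate_reviews and the short verdict labels (OK, MINOR, INCOMPLETE, MISMATCH) are hypothetical stand-ins for the four response options listed above.

```python
from collections import Counter, defaultdict


def aggregate_reviews(rows):
    """Map each assignment_id to "accept" or "reject" (hypothetical helper).

    rows is an iterable of (assignment_id, verdict) pairs, one per reviewer.
    An assignment is accepted only if a strict majority of reviewers chose
    "OK", that is, the text fully matches the audio recording.
    """
    verdicts_by_assignment = defaultdict(list)
    for assignment_id, verdict in rows:
        verdicts_by_assignment[assignment_id].append(verdict)

    decisions = {}
    for assignment_id, verdicts in verdicts_by_assignment.items():
        ok_votes = Counter(verdicts)["OK"]
        is_majority = ok_votes * 2 > len(verdicts)
        decisions[assignment_id] = "accept" if is_majority else "reject"
    return decisions


if __name__ == "__main__":
    reviews = [
        ("a1", "OK"), ("a1", "OK"), ("a1", "MINOR"),
        ("a2", "MISMATCH"), ("a2", "OK"), ("a2", "INCOMPLETE"),
    ]
    print(aggregate_reviews(reviews))
```

A strict majority is a conservative choice: when the three reviewers all disagree, the assignment is rejected rather than accepted by chance. You may prefer to review such cases manually.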


I can't upload files from Yandex.Disk

If images, audio or video from Yandex.Disk don't appear in the instructions or on the task suite, make sure you connected Yandex.Disk correctly and uploaded the files.

How to create a task where the performer has to view a video from Yandex.Disk

To create such a task, take the video markup template as a basis.

To host your videos on Yandex.Disk, connect Yandex.Disk and set up the project.

Why can't my task for selecting objects in an image display images from Yandex.Disk?
The problem is in your task template. Make sure that:
  • In the project, the input field where you pass the file link has the “string” type.
  • The component in the task template uses the "proxy" expression.
  • The format of relative links in the TSV file with tasks is correct: <unique name>/<file path and name>.
For detailed instructions and videos, see the page Using files from Yandex.Disk.
Frequent mistakes when connecting to Yandex.Disk and uploading files
  • The Input data field in the project settings has the link type. You should choose the string type.
  • The TSV file contains absolute references to the task files. Use relative links in the format <unique name>/<path and file name> instead. For example: yadisk/image1.jpg or yadisk/photos/image1.png.
  • Photos from Yandex.Disk are used in the task instructions in the mobile app. To display the photos in the instructions, use only direct links.
  • Files are deleted or aren't located in the Yandex.Disk folder that the link leads to.
  • The OAuth token isn't active. Update the token on the External Services Integration page.
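
Some of the mistakes above can be caught before uploading by checking the links in the task file. The sketch below tests whether a link has the relative form <unique name>/<path and file name>. The function name is_relative_proxy_link is hypothetical, and the proxy name yadisk is taken from the examples above; use the unique name of your own proxy.

```python
from urllib.parse import urlparse


def is_relative_proxy_link(link: str, proxy_name: str) -> bool:
    """True if link looks like "<proxy_name>/<path and file name>"."""
    # Absolute references such as "https://..." have a URL scheme.
    if urlparse(link).scheme:
        return False
    prefix, _, path = link.partition("/")
    # The part before the first slash must be the proxy name,
    # and something must follow it.
    return prefix == proxy_name and bool(path)


if __name__ == "__main__":
    for link in ("yadisk/photos/image1.png", "https://example.com/image1.jpg"):
        print(link, is_relative_proxy_link(link, "yadisk"))
```
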
To display files from Yandex.Disk (images, audio files, videos) to the performer:
  1. Link Yandex.Disk in your profile.
  2. Set the string type for the input data field.
  3. Insert a file link using the proxy component.

Detailed instructions

Files load too slowly from Yandex.Disk. How do I speed up the loading process?

Try the recommendations on this page or contact Yandex.Disk support.