Determine what is in the photo (classification)

Tip. If you encounter any difficulties, contact Yandex.Toloka Laboratory and we'll set up a “turn-key” task. You can also contact our partners. For more information, see Get help.
  1. Create a project
  2. Add a task pool
  3. Upload tasks
  4. Set up quality control
  5. Add training
  6. Start the pool and get the results

Projects of the classification type are intended for tasks with multiple choice. Examples are moderating content or grouping images by category.

You may need additional projects for your task, such as dataset pre-check or checking performers' responses. Learn more about this in the Designing the solution architecture section.

Suppose you have a set of cat photos and want to split them into several groups according to the cat's mood. You should create a task where a performer sees a photo and has to choose one of three responses. The performer can also mark whether they like the photo.
Tip. Run the project in the Sandbox first. This helps you avoid mistakes and spending money on a task that isn't working right.

Example of a prepared task

To run tasks and get responses:

Create a project

The project defines what the task will look like for a performer.

  1. Click the + Create project button and choose the Image categorization template.

  2. Enter a clear name and a short description for the project. Performers will see this in the task list.

  3. Write short and clear guidelines (see the recommendations).
  4. Define which objects you are going to pass to the performers and receive from them in response. To do this, add input and output fields in the Specifications block.
    Note. This tutorial shows how to create a task interface in Yandex.Toloka. You can also try creating a task interface in the Template Builder.
    What are input and output data?

    Input data are the types of objects passed to the performer for completing the task. For example, this could be text, an image, or geographic coordinates.

    Output data are the types of objects you receive after the task is completed. For example, this could be one of several response options, typed text, or an uploaded file.

    Learn more about input and output data fields.

    In this case they are:

    • The image input data field for a link to an image.
    • Output data fields:
      • Boolean like for a checkbox answer.
      • The result string with a radio button response.
  5. Create the task interface in the HTML block. It describes how the task elements should be arranged in the task.

    You can use standard HTML tags and special expressions in double curly brackets for input and output data fields.

    {{img src=image width="100%" height="400px"}}
    
    {{field type="radio" name="result" value="OK" label="Good" hotkey="1"}}
    {{field type="radio" name="result" value="BAD" label="Bad" hotkey="2"}}
    {{field type="radio" name="result" value="404" label="Loading error" hotkey="3"}}
    <br>
    {{field type="checkbox" name="like" label="Liked the photo" hotkey="q"}}
    This notation describes the following task design:
    • A picture at the image link.
    • Three radio buttons, and the chosen option is output to the result field.
    • A checkbox, with the value (true or false) output to the like field.

    Leave the CSS and JavaScript blocks unchanged.

  6. Click the Preview button to see the performer's view of the task.
    Note. The project preview shows one task with standard data. You can define the number of tasks to show on the page later.
  7. Save the project.
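
The input and output fields from step 4 can be sketched as plain data, together with a check that a performer's response fits them. This is a minimal illustration only: the dictionary layout and the `validate_response` helper are assumptions made for this example, not the exact format used by Yandex.Toloka. The field names and allowed values come from the template above.

```python
# Hypothetical description of the task fields used in this tutorial.
# The layout of these dictionaries is an assumption for illustration.
INPUT_SPEC = {"image": {"type": "url", "required": True}}
OUTPUT_SPEC = {
    "result": {
        "type": "string",
        "required": True,
        "allowed_values": ["OK", "BAD", "404"],  # values of the radio buttons
    },
    "like": {"type": "boolean", "required": False},  # the checkbox
}

# Python types that correspond to the field types named above.
PYTHON_TYPES = {"string": str, "boolean": bool, "url": str}


def validate_response(response, spec=OUTPUT_SPEC):
    """Return a list of problems found in one performer response."""
    problems = []
    for name, rules in spec.items():
        if name not in response:
            if rules.get("required"):
                problems.append(f"missing field: {name}")
            continue
        value = response[name]
        if not isinstance(value, PYTHON_TYPES[rules["type"]]):
            problems.append(f"wrong type for field: {name}")
        elif "allowed_values" in rules and value not in rules["allowed_values"]:
            problems.append(f"unexpected value for field: {name}")
    for name in response:
        if name not in spec:
            problems.append(f"unknown field: {name}")
    return problems


print(validate_response({"result": "OK", "like": True}))
print(validate_response({"result": "MAYBE"}))
```

An empty list means the response matches the specification. Listing the allowed values explicitly makes a typo in a radio button value visible before any tasks are paid for.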

Add a task pool

A pool is a set of paid tasks sent out for completion at the same time.

  1. Open the project and click Add pool.
  2. Give the pool any convenient name and description. The pool info is only available to you. Performers can view only the project name and description.
  3. Set the price per task page (for instance, $0.02).
    What is a task page?

    A page can contain one or several tasks. If the tasks are simple, you can add 10-20 tasks per page. Don't make pages too long, because long pages load slowly for performers.

    Performers get paid for completing the whole page.

    The number of tasks on the page is set when uploading tasks.

    What is the fair price for a task page?

    The general pricing rule: the more time a performer needs to complete the task, the higher the price should be.

    You can register in Yandex.Toloka as a performer and find out how much other requesters pay for tasks, or see examples of cost for different types of tasks.

  4. Set the Time allowed for completing a task page. It should be long enough to read the guidelines and wait for task data to download (for example, 600 seconds).
  5. Set Overlap, which is the number of performers who complete the same task. For classification tasks, an overlap of 3 is usually enough.
  6. Add Filters to select performers. To make your task available only to English-speaking users, set filters by language and country detected by the phone number.
  7. Save the pool.
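
To see how the price per page, the number of tasks per page, and the overlap combine, you can estimate the pool budget before launching it. The sketch below is an assumption-based estimate: `estimate_pool_cost` is a hypothetical helper, it counts only payments to performers, and it ignores any platform fees as well as control and training tasks.

```python
from decimal import Decimal


def estimate_pool_cost(n_tasks, tasks_per_page, overlap, price_per_page):
    """Estimate performer payments for a pool.

    Performers are paid per completed page, and every page is
    completed by `overlap` different performers.
    """
    # Round the number of pages up: a partly filled page is still a page.
    pages = (n_tasks + tasks_per_page - 1) // tasks_per_page
    # Decimal avoids floating-point rounding noise in money amounts.
    return pages * overlap * Decimal(price_per_page)


# 900 photos, 9 main tasks per page, overlap 3, $0.02 per page.
print(estimate_pool_cost(900, 9, 3, "0.02"))
```

Pass the price as a string so that `Decimal` stores it exactly.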

Upload tasks

Prepare your own task file. Check out the example in a demo TSV file. You can find it on the pool page. At the top-left of the page, there are links to TSV files with regular, control, and training tasks.

  1. Click Upload. In the window that opens, you can also download a sample TSV file by clicking Sample file for uploading tasks.

    What is TSV?
    A TSV file presents a table as a text file in which columns are separated by tabs.
    You can work with it both in a table editor and a text editor, and then save it to the desired format. More about working with a TSV file. CSV is a similar format, but you should use a TSV file for uploading.
    Note. Before uploading the file, make sure it is saved in UTF-8 encoding.
  2. Add your input data to the file. The header of the input data column contains the word INPUT. Leave the other columns empty.
  3. Upload the tasks using Smart mixing and enter the number of tasks per page. For example: 9 main tasks and 1 control task.
    What is smart mixing?
    Smart mixing randomly generates pages with tasks so that a performer doesn't get the same task twice.
  4. Add control tasks. To do this, click the Edit button and give the correct responses for several tasks.
    Note. If you selected something else instead of smart mixing, click Edit. If this button is missing, delete the file and upload it again.

    What are control tasks?

    Control tasks are tasks with the correct response known in advance. They are used to track the performer's quality of responses. The response you provided is compared to the performer's response. If they match, it means the performer answered correctly.

    Control tasks should make up at least 1% of the total number of tasks. This means that for 1000 tasks you should add at least 10 control tasks.

    More about control tasks.
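
If your image links are already in a list, the task file can be generated with a short script instead of being edited by hand. The following is a minimal sketch. The header `INPUT:image` is an assumption based on the `image` input field and the rule that the column header contains the word INPUT, so compare it with the sample file from the pool page. The function names and the example URLs are hypothetical.

```python
import csv
import io


def build_tasks_tsv(image_urls, column="INPUT:image"):
    """Return the text of a one-column TSV file with image links."""
    buffer = io.StringIO()
    # Tab-separated columns, one task per line.
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow([column])
    for url in image_urls:
        writer.writerow([url])
    return buffer.getvalue()


def save_tasks_tsv(path, image_urls):
    """Write the task file in UTF-8, as required for uploading."""
    with open(path, "w", encoding="utf-8", newline="") as handle:
        handle.write(build_tasks_tsv(image_urls))


def min_control_tasks(n_tasks, share_percent=1):
    """Smallest whole number of control tasks that reaches the share."""
    # Ceiling division on integers, so 950 tasks still need 10 control tasks.
    return -(-n_tasks * share_percent // 100)


urls = ["https://example.com/cat1.jpg", "https://example.com/cat2.jpg"]
print(build_tasks_tsv(urls))
```

Building the text separately from saving it makes it easy to inspect the content before anything is written to disk.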

Set up quality control

Quality control rules allow you to filter out inattentive performers. You can configure quality control both in the project and in the pool.

Attention.

Quality control settings are applied to all project pools, so you can't change them in just one of the pools.

When you clone a project, its quality control settings aren't transferred.

    Go to pool editing (the Edit button in the upper-right corner of the page) and click Add Quality Control Rule.

    You can copy quality control settings from another pool. To do this, click Copy settings from in the Users filter section.

  1. Add the Control tasks section and specify the following values:

    This means that a performer who gives more than 40% of incorrect responses will be blocked for five days and won't be able to complete tasks in this project.

  2. Add a restriction for Fast responses.

    The Minimum time per page value depends on the number of tasks on this page. It takes 2-4 seconds to identify the cat's mood. This means that a page with 10 tasks may take 20-40 seconds to complete.

    A performer can make an accidental mistake once in a while, but after 2-3 repeated mistakes you can ban the performer for a while.

    Specify the following values:

    This means that a user who completes two task pages in less than 20 seconds will be blocked for 10 days and won't be able to complete your tasks.
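
The logic behind the two rules can be written out as simple checks. This sketch only mirrors the thresholds described above (more than 40% incorrect control responses, two pages completed in less than 20 seconds). The function names are hypothetical, and the real rules have more settings than are shown here.

```python
def should_block_for_control_tasks(correct, total, max_incorrect_percent=40):
    """True if the share of incorrect control responses exceeds the limit."""
    if total == 0:
        return False  # no control tasks answered yet, nothing to judge
    incorrect_percent = 100 * (total - correct) / total
    return incorrect_percent > max_incorrect_percent


def should_block_for_fast_responses(page_seconds, min_seconds=20, max_fast_pages=2):
    """True if enough pages were completed faster than the minimum time."""
    fast_pages = sum(1 for seconds in page_seconds if seconds < min_seconds)
    return fast_pages >= max_fast_pages


# 5 of 10 control tasks answered correctly: 50% incorrect, so block.
print(should_block_for_control_tasks(5, 10))
# Two pages completed in 15 and 12 seconds: block.
print(should_block_for_fast_responses([15, 12, 40]))
```

Note that exactly 40% incorrect responses does not trigger the first check, because the rule applies to more than 40%.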

Add training

Create a training pool:

  1. Open the project page.

  2. Go to the Training tab.

  3. Click the Add training button.

  4. Fill in the training settings fields.

    You can use the Retry after field to set up repeated training.
  5. Choose the pool type.
  6. Click Create training.
After you create a training pool:
  1. Get the task template (TSV) or edit the one you used for uploading the main pool tasks.
    Note. TSV files for all project pools have the same structure.
  2. Add links to images for the training tasks in the TSV file.
  3. Upload the file and specify the number of tasks on the page. For example, 10. This number must not exceed the number of tasks per page in the main pool.
  4. Click Upload and enter the number of training tasks on the page.
  5. Click Add.
  6. Click Mark up and then Create training tasks. Next, add correct answers and hints for all the uploaded tasks.
  7. After the file is uploaded, open the Preview and check that the tasks are displayed correctly.
  8. Open the main pool with tasks, link Training to it and set the Level required to 55. This means that the main pool will be available only to users who answered at least 55% of the training tasks correctly (no more than 45% of mistakes).

    To link the training pool, go to the main pool editing mode and select your training pool in the Training parameter drop-down list.

Learn more about creating a pool with training.
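
As a rough illustration of the Level required setting, the admission check can be expressed as a percentage of correct training answers compared with the threshold. The helper names are hypothetical, and the assumption here is that the level is simply the share of correct responses in the training pool.

```python
def training_level(correct, total):
    """Percentage of training tasks answered correctly."""
    return 100 * correct / total if total else 0.0


def is_admitted(correct, total, level_required=55):
    """True if the performer reaches the level required by the main pool."""
    return training_level(correct, total) >= level_required


# 6 correct answers out of 10 give a level of 60, which is enough.
print(is_admitted(6, 10))
# 5 correct answers out of 10 give a level of 50, which is not.
print(is_admitted(5, 10))
```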

Start the pool and get the results

  1. Start the pool.
  2. Track the completion of tasks in the Pool statistics section.
  3. When the pool is completed, launch aggregation of results. To do this, find the Download results button and select Dawid-Skene aggregation model in the menu next to it.

    Aggregation of responses is necessary to get a complete picture of all results. Learn more about aggregation.

  4. Track the aggregation progress on the Operations page. When the process is completed, click Download.
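
To get a feel for what aggregation does with an overlap of 3, here is a minimal sketch that uses plain majority vote. This is a deliberately simpler technique than the Dawid-Skene model offered in the interface: it treats all performers as equally reliable. The function name and the sample data are made up for illustration.

```python
from collections import Counter, defaultdict


def majority_vote(responses):
    """Aggregate (task_id, label) pairs into one label per task.

    Returns {task_id: (label, share_of_votes)}. On a tie, the label
    that was seen first among the tied ones is chosen.
    """
    votes = defaultdict(Counter)
    for task_id, label in responses:
        votes[task_id][label] += 1

    aggregated = {}
    for task_id, counter in votes.items():
        label, count = counter.most_common(1)[0]
        aggregated[task_id] = (label, count / sum(counter.values()))
    return aggregated


responses = [
    ("cat1.jpg", "OK"), ("cat1.jpg", "OK"), ("cat1.jpg", "BAD"),
    ("cat2.jpg", "BAD"), ("cat2.jpg", "BAD"), ("cat2.jpg", "BAD"),
]
for task_id, (label, share) in majority_vote(responses).items():
    print(task_id, label, round(share, 2))
```

The share of votes gives a rough confidence value: tasks where performers disagree are good candidates for a manual check or for a higher overlap.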