Uploading tasks to a pool

Restriction. You can add up to one million tasks to the pool. To upload more tasks, create another pool.
To upload a TSV file with tasks to a pool:
  1. On the pool page, click Upload.
  2. Choose how tasks are placed on the page. Currently, there are three methods to place tasks: By empty row, Set manually, and Smart mixing.
    How to distribute tasks by page
    Characteristics/upload type | By empty row and Set manually | By empty row and Set manually (keep task order) | Smart mixing | Smart mixing (keep task order)
    To generate pages, tasks are taken in the order of rows (from top to bottom) in an uploaded file | Yes | Yes | No | Yes
    Tasks are mixed within a page | No | No | Yes | Yes
    Pages are distributed to performers in the same order | No | Yes | Yes | Yes
    Within identical pages, control tasks are the same for all performers | Yes | Yes | No | Yes

    For more information about how to distribute tasks by page, see below.

    Ways to distribute tasks by page
    By empty row

    Divide the tasks into pages yourself in the TSV file. To do this, add an empty line after each task page in the file.
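
A minimal sketch of preparing such a file in Python. The `INPUT:image` header and the example URLs are assumptions made for illustration; use the input field names defined in your own project.

```python
import csv
import io


def write_pages_tsv(pages, header):
    """Serialize pages of task rows to TSV text, adding an empty row after each page."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(header)
    for page in pages:
        for row in page:
            writer.writerow(row)
        buffer.write("\n")  # the empty row marks the end of a task page
    return buffer.getvalue()


# Two pages: the first with two tasks, the second with one (hypothetical data).
pages = [
    [["https://example.com/1.jpg"], ["https://example.com/2.jpg"]],
    [["https://example.com/3.jpg"]],
]
tsv_text = write_pages_tsv(pages, header=["INPUT:image"])
```

Writing through the `csv` module rather than joining strings by hand means that values containing tabs or quotation marks are quoted automatically.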

    Set manually
    Enter the number of tasks per page. Task pages are formed from the tasks in the order they are placed in the TSV file.
    Smart mixing

    Specify how many tasks of each type should be on the page. For example, 8 main tasks, 1 training task, and 1 control task. If necessary, specify the minimum number of tasks for each type in additional settings. If there aren't enough main tasks and the Assign partial page option is set, the performer is given an incomplete page. Even in this case, the page must contain the full number of control and training tasks.

    Attention. If you upload a file via “Smart mixing”, you won't be able to use other ways of task distribution on the pages in this pool.

    This method is useful if the created pool:

    Smart mixing and keeping the task order
    • Tasks are divided into lists by task type: regular, control, or training.
    • Pages are generated using these lists. The number of tasks of the given type that you specified in the settings is added from each list. By default, tasks are randomly selected.

      If the Keep task order option is enabled, tasks are added in the same order as they were listed in the source TSV file. This takes into account the overlap: the task that goes first will be assigned until it reaches the desired overlap.

    • Tasks on pages are mixed when a page is shown to the performer.
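
The three steps above can be sketched as a simplified model. This is not Toloka's actual implementation: the function name, the task representation, and the type labels `main`, `control`, and `training` are hypothetical, and the sketch ignores overlap and any reuse of control and training tasks across pages.

```python
import random
from collections import defaultdict


def build_pages(tasks, per_page, keep_order=False, seed=None):
    """Build complete pages from (task_type, payload) pairs.

    per_page maps each task type to the number of tasks of that type per page.
    """
    if not any(count > 0 for count in per_page.values()):
        raise ValueError("per_page must request at least one task")
    rng = random.Random(seed)

    # Step 1: divide the tasks into lists by task type.
    by_type = defaultdict(list)
    for task_type, payload in tasks:
        by_type[task_type].append(payload)

    # By default tasks are picked randomly; with keep_order the file order is kept.
    if not keep_order:
        for items in by_type.values():
            rng.shuffle(items)

    # Step 2: take the configured number of tasks from each list for every page.
    pages = []
    while all(len(by_type[t]) >= n for t, n in per_page.items()):
        page = []
        for task_type, count in per_page.items():
            for _ in range(count):
                page.append((task_type, by_type[task_type].pop(0)))
        # Step 3: tasks are mixed within the page (modeled here at build time).
        rng.shuffle(page)
        pages.append(page)
    return pages


tasks = (
    [("main", i) for i in range(16)]
    + [("control", "c0"), ("control", "c1")]
    + [("training", "t0"), ("training", "t1")]
)
pages = build_pages(tasks, {"main": 8, "control": 1, "training": 1}, keep_order=True, seed=1)
```

With `keep_order=True`, the first page contains the first eight main tasks from the file, although their positions within the page are still mixed.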
    [Example: Smart mixing without "Keep task order"]
    [Example: Smart mixing + "Keep task order"]

    After uploading the tasks with smart mixing you will be able to mark up tasks and set selective majority vote checking.

    Setting overlap

    If you upload tasks from the Yandex.Toloka interface, infinite overlap is set automatically for control and training tasks, so that there is enough to mark up all main tasks.

    You can set the overlap via the Yandex.Toloka API.

    If you used Set manually, you can find the number of tasks per page in the pool settings, although some pages may be incomplete. If you uploaded tasks in a different way, you can check how they're distributed by page in the Yandex.Toloka interface for requesters: on the pool page, click Download all tasks. You can also check task distribution by page using the Yandex.Toloka API.

    Note. Set the number of tasks on the page depending on the complexity and time allocated for a task. We recommend that you distribute them so that each task page takes no more than five minutes to complete. The performers are paid for completing a full task page. The amount they get is specified in the pool settings.
  3. Click the Upload file button and choose the file. To put different types of tasks in a pool, you can upload them in separate files. You can also add tasks to a pool that already contains tasks by uploading another file. Note that tasks from separate files are combined on the same pages only if Smart mixing is selected. For example, if you selected Set manually and uploaded a file with main tasks and then a file with control tasks, you'll get separate pages for each type of task.
  4. Wait for the result. If you get a processing error, it means that the data file is not formatted correctly. For example, there are unnecessary tabs in the file or some lines, headers, or quotes are missing. In this case, click Cancel, correct the mistakes, and then upload the file again.
  5. Click Add.

  6. View the result by clicking the Preview button.

To delete all the tasks in the pool, click Delete.

How do I save the task order?

Keeping the task order while ignoring overlap

If you need the performers to receive task pages in the same order as they are in the uploaded TSV file, turn on the Keep task order option. The option works differently depending on the method for distributing tasks on pages:

  • By empty row and Set manually. Performers get task pages one after another: page 1 first, then pages 2 and 3, and so on. Tasks within pages will also go one after another and all performers will see the same sequence.
  • Smart mixing. The algorithm generates pages so that performers get tasks in the order they are listed in the TSV file. Note that only task pages will be distributed in order, while the tasks within the pages will be mixed.

To use this option in your project, turn on the Keep task order option in the Parameters settings when creating a new pool.

Note. Keeping the order of tasks is useful if you need to quickly reach the overlap to monitor the majority vote or maintain the sequence of questions in a survey or training.
  • By default, this option is disabled (set to No). In this case, both the task pages and the tasks inside the pages will be given to the performers in random order.

    For example, if you upload 20 tasks in the TSV file to the pool (in order from the 1st to the 20th) and set four tasks per page, the tasks will be distributed to the performers in the following way:

    Performers | Task page number | Order of tasks on the page
    1 | 1 | 3, 2, 4, 1
    2 | 5 | 17, 20, 18, 19
    1 | 3 | 12, 9, 11, 10
    3 | 2 | 7, 8, 6, 5
    2 | 4 | 16, 13, 15, 14
    3 | 3 | 11, 12, 10, 9
    ... | ... | ...


  • If the option is enabled (set to Yes), the tasks are given to the performer page by page in the same order as they are in the TSV file. The tasks within the page are shuffled.

    For example, like in the previous case, tasks are loaded in the pool in order (from the 1st to the 20th), four tasks per page. But in this case, the performers will get pages in the same order as in the upload file, with tasks shuffled inside each page:

    Performers | Task page number | Order of tasks on the page
    1 | 1 | 1, 4, 3, 2
    2 | 1 | 3, 4, 1, 2
    1 | 2 | 6, 5, 7, 8
    3 | 1 | 2, 1, 4, 3
    2 | 2 | 8, 5, 7, 6
    3 | 2 | 5, 8, 6, 7
    ... | ... | ...
Note. In the pool preview, pages and tasks are shuffled because the task order isn't preserved in the preview. However, when you start the pool, task pages will be issued to each performer in the specified order.
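
The difference between the two settings can be illustrated with a short simulation of what a single performer receives. This is a sketch of the behavior described above, not Toloka's implementation; the function names are hypothetical.

```python
import random


def split_into_pages(tasks, tasks_per_page):
    """Split tasks into consecutive pages, as with the Set manually method."""
    return [tasks[i:i + tasks_per_page] for i in range(0, len(tasks), tasks_per_page)]


def issue_pages(tasks, tasks_per_page, keep_task_order, rng):
    """Return the pages in the order one performer would receive them."""
    pages = split_into_pages(tasks, tasks_per_page)
    if not keep_task_order:
        rng.shuffle(pages)  # option disabled: pages are issued in random order
    # In both cases the tasks inside each page are shuffled.
    return [rng.sample(page, len(page)) for page in pages]


rng = random.Random(0)
tasks = list(range(1, 21))  # 20 tasks, numbered in file order
ordered = issue_pages(tasks, 4, keep_task_order=True, rng=rng)
mixed = issue_pages(tasks, 4, keep_task_order=False, rng=rng)
```

In `ordered`, the first page always holds tasks 1 to 4 in some order, the second holds tasks 5 to 8, and so on. In `mixed`, the same five pages appear, but their sequence is random.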
Task order accounting for overlap

If you set an overlap to more than one and turn on the Keep task order option, each subsequent page is distributed to interested users only after there are enough users who submitted the page that was already assigned (in other words, after it reaches full overlap).

In this case, if the user already completed one pool page or there is a new interested user, they will get the next page that isn't in progress yet, even if the previous one didn't reach full overlap.

If a user refuses the issued task page, it will be given to another user — either someone else who is interested in the pool, or an available user who accepts the task.

For example, if overlap is set to 3:

Performers | Task page number | Overlap value achieved | Note
1 | 1 | 1 | Interested users received page 1
2 | 1 | 2 |
1 | 2 | 1 | A performer completed page 1 and got page 2, although page 1 hadn't reached full overlap yet
3 | 1 | 3 | Full overlap of page 1
3 | 2 | 1 | The user who took the task refused to complete page 2
4 | 2 | 2 | The interested user received page 2 straight away, since page 1 already has full overlap and the user who took page 2 refused to complete it
1 | 3 | 1 | A performer completed page 2 and got page 3, although page 2 hadn't reached full overlap yet
2 | 2 | 3 | Full overlap of page 2
5 | 3 | 1 | Interested user refused to complete page 3
2 | 3 | 2 | The user who submitted to the pool before received page 3, since the interested user refused to complete it
3 | 3 | 3 | Full overlap of page 3
... | ... | ... | ...
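
The first three columns of this table can be reproduced with a simplified model: each performer receives the lowest-numbered page that they haven't been offered yet and that hasn't reached full overlap. This rule is an assumption inferred from the example, not a description of Toloka's actual algorithm, and the sketch counts a page toward overlap as soon as it is accepted.

```python
from collections import defaultdict


def next_page(performer, accepted, offered, overlap, page_count):
    """Pick the lowest-numbered page that is new to the performer and below full overlap."""
    for page in range(1, page_count + 1):
        if page not in offered[performer] and len(accepted[page]) < overlap:
            return page
    return None


def replay(events, overlap, page_count):
    """Replay (performer, accepts) events and log (performer, page, overlap achieved)."""
    accepted = defaultdict(set)  # page -> performers who accepted it
    offered = defaultdict(set)   # performer -> pages already offered to them
    log = []
    for performer, accepts in events:
        page = next_page(performer, accepted, offered, overlap, page_count)
        if page is None:
            continue  # nothing left for this performer
        offered[performer].add(page)
        if accepts:
            accepted[page].add(performer)
        log.append((performer, page, len(accepted[page])))
    return log


# The sequence of requests from the table; False means the performer refused the page.
events = [
    (1, True), (2, True), (1, True), (3, True), (3, False), (4, True),
    (1, True), (2, True), (5, False), (2, True), (3, True),
]
log = replay(events, overlap=3, page_count=3)
```

Each tuple in `log` corresponds to one table row. A refused page leaves the achieved overlap unchanged, which is why performer 3 on page 2 and performer 5 on page 3 are logged with an overlap value of 1.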

You can also set the order of tasks in the Yandex.Toloka API. To do this, use the shuffle_tasks_in_task_suite parameter: if true, the task order within a page is random; if false, the order in which tasks were uploaded is kept. The default is true, meaning that tasks are shuffled within the page.
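
As an illustration, the fragment of a pool configuration that carries this parameter might be built as follows. Only the `shuffle_tasks_in_task_suite` name comes from the text above; nesting it under a `mixer_config` key and the helper function are assumptions to check against the API reference.

```python
import json


def mixer_settings(shuffle_within_page=True):
    """Build a pool configuration fragment (the "mixer_config" nesting is an assumption)."""
    return {
        "mixer_config": {
            # False keeps the order in which tasks were uploaded.
            "shuffle_tasks_in_task_suite": bool(shuffle_within_page),
        }
    }


fragment = mixer_settings(shuffle_within_page=False)
print(json.dumps(fragment, indent=2))
```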

Skill

If you added the majority vote quality control rule, once all completed pages have reached full overlap, a performer will be assigned a skill by majority vote. For example, if overlap 3 is set in the pool settings, the skill is calculated after each of these pages reaches overlap 3, not after the performer completes 3 pages.

Troubleshooting

How many tasks should be on the page?

The number of tasks depends on how difficult and time-consuming the tasks are. Don't make task pages too large. They are unpopular, partly because they are inconvenient for performers (for example, if the internet connection is unstable).

Processing errors
How do I view the processing log?
To view the processing log, click More on uploading errors. The processing log is written in JSON format. Each object inside result corresponds to a line of the uploaded file, identified by its line number. Lines that were processed with an error have the status "success": false.
Tip. To work with a large log conveniently, copy it to a text editor.
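
For a large log, a short script can list only the failed lines. This is a sketch under assumptions: it expects a top-level `result` field that is either an object keyed by line number or an array, which is inferred from the description above rather than taken from a schema. The sample log is invented for illustration.

```python
import json


def failed_lines(log_text):
    """Return {line_number: entry} for entries whose "success" field is false."""
    result = json.loads(log_text)["result"]
    # Accept either an object keyed by line number or a plain array (assumed shapes).
    items = result.items() if isinstance(result, dict) else enumerate(result)
    return {int(line): entry for line, entry in items if entry.get("success") is False}


# Hypothetical log with one failed line.
sample_log = """
{
  "result": {
    "0": {"success": true},
    "1": {"success": true},
    "2": {"success": false, "code": "VALUE_REQUIRED",
          "message": "Value must be present and not equal to null"}
  }
}
"""
for line, entry in sorted(failed_lines(sample_log).items()):
    print(line, entry.get("code") or entry.get("exception_msg"))
```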
Errors in column headers

If the column headings are incorrect, the whole file is rejected. Otherwise, Toloka specifies the number of tasks with processing errors.

Processing errors table
Description How to fix
"parsing_error_of": "https://tlk.s3.yandex.net/wsdm2020/photos/2d5f63a3184919ce7e3e7068cf93da4b.jpg\t\t",
"exception_msg": "the nameMapping array and the sourceList should be the same size (nameMapping length = 1, sourceList size = 3)"

Extra tabs

If the TSV file contains more \t column separators after the data or the link than the number of columns set in the input data, you will get an error message.

For example, if 1 column is defined in the input, and two more \t\t tabs are added in the TSV file after the link, you get 3 columns, 2 of which are extra.

Remove extra column separators in the above example — both \t\t characters.

"exception_msg": "the nameMapping array and the sourceList should be the same size (nameMapping length = 4, sourceList size = 6)"

The number of fields in the header and in the row doesn't match.

Make sure that:

  • The number of tabs in the file structure is correct.
  • String values with tab characters are enclosed in quotation marks " ".
"code": "VALUE_REQUIRED", "message": "Value must be present and not equal to null"
The value for a required input field is not specified.

Make sure that columns with required input data fields are filled.

"code": "INVALID_URL_SYNTAX", "message": "Value must be in valid url format"
Invalid data in the URL field.
Make sure that:
"exception_msg": "unexpected end of file while reading quoted column beginning on line 2 and ending on line 4"

The string includes an unpaired quotation mark.

Check that all quotation marks are paired or escaped.
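
Column-count problems such as extra tabs can be caught before uploading. The sketch below uses Python's standard `csv` module, whose parsing rules may differ in detail from Toloka's own parser, so treat it as a rough pre-check. The function name and the sample data are hypothetical.

```python
import csv
import io


def find_column_mismatches(tsv_text):
    """Return (line_number, expected_columns, actual_columns) for each bad row."""
    reader = csv.reader(io.StringIO(tsv_text, newline=""), delimiter="\t")
    header = next(reader)
    problems = []
    for row in reader:
        if not row:
            continue  # an empty row separates task pages, so it is not an error
        if len(row) != len(header):
            problems.append((reader.line_num, len(header), len(row)))
    return problems


# The third line ends with two extra tabs, giving 3 columns instead of 1.
sample = (
    "INPUT:image\n"
    "https://example.com/a.jpg\n"
    "https://example.com/b.jpg\t\t\n"
)
problems = find_column_mismatches(sample)
```

Here `problems` reports line 3 with 1 expected column and 3 actual columns, the same mismatch as in the nameMapping error for extra tabs. A quoted value that contains a tab is read as a single column and isn't reported.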

The same task appeared on different pages

The same task may appear on different pages if:

  • The project uses incremental relabeling. As an example, let's say there were 5 tasks on a page. For 4 of them, responses coincided and the common response was counted as correct. The fifth task was mixed into another set because it didn't get into the final response and it needs to be “reassessed”.
  • Different tasks have different overlap. Tasks with higher overlap will be additionally shown in sets with the other remaining tasks in the pool.
  • A quality control rule changes a task's overlap. The task then appears in a different set.